146 Cards in this Set

  • Front
  • Back
Realism
The external world exists, or there is a real world to sense.
Positivism
All we really have to go on is our senses, so the world could be nothing more than an elaborate hallucination.
Euclidean
Real-world geometry
non-Euclidean
Non-real-world geometry, such as that within retinal projections
Why have two eyes?
Fundamentally: You can lose one and still be able to see.
Also:
-See more of the world
-binocular vision (depth)
Binocular summation
An advantage in detecting a stimulus that is afforded by having two eyes.
Binocular disparity
The differences between the two retinal images of the same scene.
Stereopsis
The ability to use binocular disparity as a cue for depth, and the impression of three-dimensionality--of objects popping out in depth.
Occlusion
A cue to relative depth order when, for example, one object obstructs the view of another object.
Nonmetrical depth cue
A depth cue that provides information about the depth order (relative depth) but not the depth magnitude (e.g., his nose is in front of his face)
Metrical depth cue
A depth cue that provides quantitative information about distance in the third dimension
Relative size
A comparison of size between items without knowing the absolute size of either one
Texture gradient
A depth cue based on the geometric fact that items of the same size form smaller images when they are farther away.
Relative height
A depth cue based on the observation that objects at different distances from the viewer on the ground plane will form images at different heights in the retinal image.

Objects farther away will be seen as higher in the image
horopter
The horopter refers to sets of points in the world having identical binocular disparities.

Objects that fall on the semicircular set of horopter points project images to corresponding retinal points
Crossed disparity
Crossed disparity indicates that a point is nearer to the observer than the point being fixated.
Uncrossed disparity
Uncrossed disparity indicates that a point is farther from the observer than the point being fixated.
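
As a rough illustration of crossed versus uncrossed disparity, the sketch below compares the vergence angle of a point with the vergence angle of the fixation point. It assumes both points lie straight ahead on the midline, and the 0.063 m interocular distance and the function names are illustrative assumptions, not values from these cards.

```python
import math

def vergence_angle(distance_m, interocular_m=0.063):
    """Angle (radians) between the two eyes' lines of sight to a midline point.

    interocular_m is an assumed, typical eye separation.
    """
    return 2.0 * math.atan(interocular_m / (2.0 * distance_m))

def classify_disparity(point_m, fixation_m, interocular_m=0.063):
    """Label a midline point relative to the fixated point."""
    disparity = (vergence_angle(point_m, interocular_m)
                 - vergence_angle(fixation_m, interocular_m))
    if disparity > 0:
        return "crossed"    # point is nearer than the fixation point
    if disparity < 0:
        return "uncrossed"  # point is farther than the fixation point
    return "zero"           # same distance as the fixation point
```

For example, with fixation at 1 m, a point at 0.5 m is classified as crossed and a point at 2 m as uncrossed.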
Familiar size
A depth cue based on knowledge of the typical size of objects (e.g., humans)
Relative metrical depth cue
A depth cue that could specify, for example, that object A was twice as far away as object B without providing information about the absolute distance to either A or B
Absolute metrical depth cue
A cue to depth that provides absolute information about the distance in the third dimension (e.g., his nose sticks out 4cm in front of his face)
Aerial perspective
Also called haze, it is a depth cue that is based on the implicit understanding that light is scattered by the atmosphere
Gestalt
From the German word referring to the "whole."

In perception, the name of the school of thought that stressed that the perceptual whole is greater than the apparent sum of its parts.
Gestalt grouping rules
A set of rules describing which elements in an image will appear to group together.
Good Continuation
Rule stating that two elements will tend to group together if they seem to lie on the same contour.
Illusory contour
A contour that is perceived even though nothing changes from one side of the contour to the other in the image.
Structuralism
A school of thought that held that complex objects or perceptions could be understood by analysis of the components
Texture segmentation
Carving an image into areas of common texture properties
Similarity (gestalt)
The tendency of two features to group together will increase as the similarity between them increases.
Proximity
The tendency of two features to group together will increase as the distance between them decreases.
Symmetry
A rule for figure-ground assignment stating that symmetrical regions are more likely to be seen as figure.
Parallelism
A rule for figure-ground assignment stating that parallel contours are more likely to belong to the same figure.
Common region
Two features will tend to group together if they appear to be part of the same larger region.
Connectedness
Two items will tend to group together if they are connected.
Common fate
Two features tend to group together if they are doing the same thing (e.g., moving together)
Synchrony
Items that change at the same time tend to group together, even if they change in different ways.
Accidental viewpoint
A viewing position that produces some regularity in the visual image that is not present in the world.
Figure-ground assignment
The process of determining that some regions of an image belong to a foreground object and that other regions are part of the background.
Surroundness
A rule for figure-ground assignment stating that if one region is completely surrounded by another, it is likely that the surrounded region is the figure.
Relatability
The degree to which two line segments appear to be part of the same contour.
Heuristic
A mental shortcut
Nonaccidental feature
A feature of an object that is not dependent on the exact viewing position of the observer.

Such as the Y, T, or arrow junctions present when boxes overlap.
What is an object?
Objects are the basic units in our
representations of the world.

Object perception tells where the physical
world breaks apart.

Different from other representations of a scene

Example: Pixel representations in computer graphics and image processing do not "know about" objects.

Object perception tells us how the physical world
breaks apart, how it is organized, and
how it will function.

Vision tells us these things at a distance.
Efficiency of Object Perception
1) It accurately tells how the physical world
breaks apart.

2) It works at a distance.

3) It works fast.

In experiments with rapid serial visual presentation, we can see many objects per second (even at 50 - 100 msec each).
Obstacles / mysteries in object perception
1) Segmentation: How is the visible world broken up into
separate objects?

2) Grouping / Unit Formation: How do separate visible
regions get connected into objects?
Problem of occlusion

3) How do we recover 3D shape from particular views?

4) How do we describe and represent shape?
Distinguishing perception and recognition
Object perception involves obtaining a description of
shape, size, material composition, etc. from light
information.

Object recognition is the matching of some
description obtained by perception with
something previously stored in memory.
Recognition
The previously stored information can be a category,
such as "chair"
…or an instance, such as "my favorite lounge chair
with the big cushions."

Most research on object recognition involves categorization
(so-called "basic level" recognition).

All recognition research presupposes some description
obtained from perception.
Object Perception Tasks
A. General: perception vs. recognition.
B. Edge detection
C. Edge classification
D. Junction detection and classification
E. Boundary assignment
F. Unit formation
G. Shape perception
Edge Detection
Models of edge detection use operators that are applied across large regions of the visual field.

Important edges are given by differences in:
• luminance
• color
• motion
• depth
• texture
Edge Classification
Edges come in different flavors.

The visual system wants to know about objects.

Illumination:
Shadows

vs. reflectance edges:
Occluding edges
Surface markings
Junction detection and classification
Occluding edges usually have “T” junctions

Transparency signalled by “X” junctions

Object corners have “L” junctions

“Y” Junctions indicate 3D object corners.

Presence or absence of junctions determines
important aspects of visual processing.

Segmentation and grouping-- what gets separated or connected.

"Rounded" junction valleys indicate connectedness
Boundary Assignment
Figure ground tasks such as the vase-faces illusion
“transposition”
Changing constituent
elements but retaining
same form
Unit Formation / Object Formation
Also called: Segmentation and Grouping
Max Wertheimer
Wertheimer criticized the current educational emphasis on traditional logic and association, arguing that such problem-solving processes as grouping and reorganization, which dealt with problems as structural wholes, were not recognized in logic but were important techniques in human thinking. Related to this argument was Wertheimer's concept of Prägnanz (“precision”) in organization; when things are grasped as wholes, the minimal amount of energy is exerted in thinking. To Wertheimer, truth was determined by the entire structure of experience rather than by individual sensations or perceptions.

Early 20th century theorists, such as Kurt Koffka, Max Wertheimer, and Wolfgang Köhler (students of Carl Stumpf) saw objects as perceived within an environment according to all of their elements taken together as a global construct. This 'gestalt' or 'whole form' approach sought to define principles of perception -- seemingly innate mental laws which determined the way in which objects were perceived.
Problems in Object Perception: Unit Formation
The visual system connects spatially separated visible
areas using two processes:

contour interpolation
-Illusory Contours
-Occluded Contours
-Transparency
-Self-splitting objects

surface interpolation
Contour Interpolation
The process of contour interpolation follows a definite
geometry, involving particular image features and relations

1) The process begins with the locating of contour junctions.

2) Interpolated edges begin and end at these junctions.

3) Contour interpolation follows a smoothness constraint, known as contour relatability.

4) Relatability is related to the Gestalt idea of good continuation.
contour interpolation phenomena
A number of contour interpolation phenomena depend on a common process

-The Petter Effect
-Hybrid occluded/illusory contours
Surface Interpolation
Contour Interpolation Operates Despite Some Differences in Surface Color

Surface Interpolation Depends on
Surface Color Relations

1) The contour interpolation process depends on
oriented edges leading into junctions.

2) The surface process complements contour
interpolation.

3) Surface properties “spread” under occlusion within
real and interpolated boundaries.

4) This process depends crucially on matches of
color, lightness, and texture.
Shape
Shape is a relational notion
-relations between visible or imagined parts

Shape cannot be gotten from the “sum of the parts”

Shape shows scale invariance and orientational invariance

Shape representations are mysterious

--cannot be simply the collection of local oriented units

Contour shapes switch boundary assignment all
at once
Shape representations: decomposition into
smaller units
Volumetric primitives -- “geons”

--works best for artifacts (human-made objects)
Modal and Amodal completion
Amodal completion:
A form of visual completion that occurs when portions
of an object are hidden behind another object—but the former object
is nevertheless perceived to be a single continuous entity.
This is known as amodal completion because, despite the vivid percept
of object unity, observers do not actually see a contour (i.e., a
contrast border) in image regions where the completion occurs

Modal Completion:
A second form of completion that occurs when portions of an object are camouflaged by an underlying
surface—because this underlying surface happens to project the same
luminance and color as the nearer object. This form of
completion is known as modal completion because observers perceive
a contrast border—an illusory contour—in image regions that contain
no contrast (thus, an observer’s percept has the same ‘‘mode’’ as if a
contour were actually present).
Importance of Spatial Perception
Mobile organisms need to:
-- know where things are.
-- know whether locomotion is safe.
-- guide action appropriately in their spatial environments.

These tasks depend on comprehending and representing three-dimensional (3D) space.

Different dimensions present different challenges perceptually.
SIZE CONSTANCY
As a person walks away, their image size on our retinas
shrinks. Yet we do not see the person as getting smaller.
Depth and Distance
Depth: Relative position from observer (nearer/farther)

Distance: Absolute position given using some kind of metric
or scale.
-Not necessarily a scale like feet or inches
-Perception may be "body-scaled"
(arm's length, step size, etc.)
Sources of Information for Depth and Distance
Kinematic Information [motion]
Stereoscopic Information
Oculomotor Information
Pictorial Information [monocular]
Kinematic Information
Definition: Kinematic means relating to motion.

Motion perspective: When the observer moves, displacement of an object's image on the eye depends on its distance.

When the whole visual field is considered, this information is
often called “optic flow.”

Optical expansion/contraction: When an object approaches, its image expands. If it is on a "hit" path, the expansion is symmetric.

Accretion/deletion of texture: When a surface moves relative to
another, the nearer surface progressively occludes background
texture on the further surface.
Simple algorithm for Kinematic edge detection
Get new value at each
point by subtracting
value there from average
of its neighbors.

This computation produces a map of significant object and
surface edges in visual field. Edges are marked by
non-zero values.
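
The rule above can be sketched in a few lines. This is a minimal illustration that assumes the input is a small 2-D grid of numbers (for example luminance or local velocity) and that "neighbors" means the up, down, left, and right cells. The function name and the border handling are assumptions for the example.

```python
def edge_map(image):
    """Return average-of-neighbors minus the value at each point.

    image: list of equal-length rows of numbers.
    Uniform regions give 0; edges give non-zero values.
    """
    rows, cols = len(image), len(image[0])
    result = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            # Collect the 4-connected neighbors that fall inside the grid.
            neighbors = [
                image[nr][nc]
                for nr, nc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1))
                if 0 <= nr < rows and 0 <= nc < cols
            ]
            if neighbors:  # a 1x1 image has no neighbors; leave it at 0
                result[r][c] = sum(neighbors) / len(neighbors) - image[r][c]
    return result
```

On a grid whose left half is 0 and right half is 1, only the two columns on either side of the boundary come out non-zero, which marks the edge.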
Stereoscopic Information
Definition: Stereoscopic means using the two eyes together.

Binocular disparity refers to differences in the two eyes' views of an object.

The direction and amount of binocular disparity depend on the distance of an object from the observer.

The two retinal images of a three-dimensional world are not the same

The horopter refers to sets of points in the world having identical binocular disparities.
Crossed disparity indicates that a point is nearer to the observer than the point being fixated.
Uncrossed disparity indicates that a point is farther from the observer than the point being fixated.
Foveating
Focusing on an object
Stereoscopes and stereograms
Use binocular disparity to create a perception of depth (stereopsis)
Oculomotor Information
Definition: Oculomotor means having to do with eye muscles

Accommodation refers to changes in the shape of the lens to achieve focused images at varying distances.

Accommodation may provide distance information via unconscious sensing of the muscular movements (in the ciliary muscles) that produce the lens changes.

Convergence refers to the turning of the two eyes to get a particular point in the center of fixation (fovea) of each eye.

Convergence provides depth information via unconscious sensing of the muscular movements used to turn the eyes.

Accommodation and convergence are not very useful beyond about 2 meters of distance. Sometimes this is called "near space."

These cues potentially provide absolute distance information.
Pictorial Information
Definition: Pictorial refers to depth cues that can operate in flat
pictures. They are all also monocular cues, in that they can
operate (usually better) when you view with only one eye.

Some pictorial cues were discovered by artists.

Most pictorial cues relate to rules of optics and geometry that govern the projection of the world onto the retina.

Use of pictorial cues for depth perception involves using the rules of projection in reverse.

Laws of Optics: Scene -> Retina

Inverse Optics: Retina -> Scene
Pictorial Information

ASSUMED PHYSICAL EQUALITY
Many pictorial cues have a common theoretical basis.

The visual system operates as if it assumes that things whose
projections to the retina are different are actually
similar in the world.

The differences in the retinal projections are taken to be
caused by differences in depth position in the world.
Parallel lines
(a) Parallel lines that lie in the image plane remain parallel in the image; (b) parallel lines in other planes converge.
Monocular depth cues
Parallel Lines
Texture Gradients
-texture patterns that shrink into the distance
Familiar size
Relative size (in relation to objects of similar/known size)
-Relative size is more effective when size changes systematically
Relative height
Aerial perspective
Occlusion makes it easy to infer relative position in depth
Multiple Sources of Information:
Why?
"God must have loved depth cues, for she made so many of them." -- A. Yonas

Some provide information about absolute position, whereas others provide information about the relations of objects and surfaces. (Distance vs. Depth)

Different sources of information have different operating conditions.

Some evidence suggests that the system relies on the cues that provide the best evidence in general or under specific conditions.
Ecological Validity
Ecological validity refers to how accurately a cue specifies some situation in the environment.

Roughly speaking, one can get at ecological validity of
depth cues by considering how hard it would be to
arrange a situation that depicts depth according to the cue, but does not really have depth in the world.

Example: A TV show depicts 3-D environments, but the screen is actually flat.
Ecological Validity - Highest and Lowest
Of the 4 categories of depth / distance information, stereoscopic and kinematic have highest ecological validity and pictorial has the weakest.
Correspondence Problem
For stereo vision, our brain must match points on one retina to points on the other retina. This is known as the correspondence problem.
Correspondence Problem-Potential Solutions
align low-frequency information first
simplifies the problem
now we match hundreds of big blurry dots
instead of thousands of small sharp dots

uniqueness constraint
a feature in the world appears exactly once on each retina
continuity constraint
neighboring points lie at similar distances (except at object boundaries)
Prior knowledge and assumptions
We assume that the pennies are the same size
We assume that both pennies are circular
We assume that occlusion is more likely to produce the image than an accidental alignment

These assumptions allow us to exclude unlikely interpretations of the world
Bayesian Approach
Bayes' theorem provides a rigorous mathematical method to integrate prior knowledge with the input to make an inference about the world

P(scene|image) ∝ P(image|scene) x P(scene)

P(scene|image) = probability of the scene given the image as input (the posterior)
P(image|scene) = likelihood that the scene would produce the image
P(scene) = prior probability of the scene
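
To make the relation above concrete, here is a minimal sketch that multiplies each candidate scene's likelihood by its prior and then normalizes so the results sum to 1. The scene labels and every number in the usage example are made up for illustration; they are not measured values.

```python
def posterior(priors, likelihoods):
    """Combine P(scene) and P(image|scene) into P(scene|image).

    priors, likelihoods: dicts keyed by the same scene labels.
    """
    unnormalized = {scene: likelihoods[scene] * priors[scene] for scene in priors}
    total = sum(unnormalized.values())
    # Dividing by the total turns the proportionality into probabilities.
    return {scene: value / total for scene, value in unnormalized.items()}

# Hypothetical numbers for the two-pennies image:
beliefs = posterior(
    priors={"occlusion": 0.5, "accidental alignment": 0.5},
    likelihoods={"occlusion": 0.9, "accidental alignment": 0.1},
)
```

With equal priors, the scene that is more likely to produce the image (here, occlusion) ends up with the higher posterior.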
Size Perception: The Task and the Information
Size of the retinal projection is an unreliable
indicator of real size.

There is a lawful relation between:

Real size (S), viewing distance (D), and projective size (s):

S / D = s / d

where d is a constant representing the depth of the eyeball.

The visual system can essentially solve
for S, the real size, by:

S = D s / d
In this equation, s is given on the retina, and d is a constant
assumed to be known by the system.

D is gotten through distance perception, as we have discussed.

From these three inputs, perceived size S can be computed.
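
A small worked sketch of S = D s / d follows. The eye depth of 0.017 m is an assumed, approximate value chosen only for illustration, and the function name is hypothetical. All lengths are in meters.

```python
def perceived_size(retinal_size, distance, eye_depth=0.017):
    """Solve S / D = s / d for the real size S.

    retinal_size: projective size s on the retina
    distance:     viewing distance D
    eye_depth:    the constant d (assumed value, in meters)
    """
    return distance * retinal_size / eye_depth

# A 1.8 m person seen at 10 m and then at 20 m: the retinal image
# halves, but the recovered size stays the same.
size_near = perceived_size(retinal_size=0.00306, distance=10.0)
size_far = perceived_size(retinal_size=0.00153, distance=20.0)
```

Both calls return about 1.8, which is one way to picture size constancy: the shrinking retinal image is offset by the larger perceived distance.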
Size Perception: Brain Location
No good current understanding of where the equation gets solved --

Some evidence for dorsal stream
Holway & Boring (1941)
Question - does depth information contribute to the perception of size?

Idea – remove depth information. Will the size perception change?

task: adjust the size of the comparison circle to match the size of the test circle
the test circle always covered the same visual angle, but was presented at different sizes and distances.
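
The idea of a constant visual angle at different sizes and distances can be checked with the standard visual-angle formula. The sizes and distances below are hypothetical and are not the values used in the original study.

```python
import math

def visual_angle_deg(size, distance):
    """Visual angle (degrees) subtended by an object of a given size
    viewed straight on from a given distance (same length units)."""
    return math.degrees(2.0 * math.atan(size / (2.0 * distance)))

# Doubling both the size and the distance leaves the visual angle unchanged.
near_circle = visual_angle_deg(size=0.1, distance=3.0)
far_circle = visual_angle_deg(size=0.2, distance=6.0)
```

Because the two circles subtend the same angle, any difference in their perceived size has to come from depth information rather than from the retinal image.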
Holway & Boring (1941): Depth manipulations
removed depth cues in some conditions
full cue
monocular – eliminates Stereoscopic Information
viewed through a hole – eliminates Binocular Vision (and maybe kinematic)
cover surfaces with draperies – eliminates ?
Holway & Boring (1941): Conclusions
As we remove depth information, the perception of size changes! Therefore, size perception depends on depth perception

Without depth information, the observers increasingly relied on visual angle
direct vs. indirect perception
Motion perception: Intro
Motion perception is incredibly sensitive and accurate

Example: Returning a 120 mph tennis serve

Ball moves 176 ft. / sec

Almost 2 feet in 1/100 sec

Need to contact ball in right place with about 3 sq. in. area of racket
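
The arithmetic behind these figures can be checked with a short sketch (the function name is just for illustration):

```python
FEET_PER_MILE = 5280
SECONDS_PER_HOUR = 3600

def mph_to_feet_per_second(mph):
    """Convert miles per hour to feet per second."""
    return mph * FEET_PER_MILE / SECONDS_PER_HOUR

serve_speed = mph_to_feet_per_second(120)   # 176.0 ft/sec
travel_in_10_ms = serve_speed * 0.01         # about 1.76 ft, "almost 2 feet"
```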
Motion is closely related to perception…
Complex perceptual systems are found only in organisms that guide their own motion in space

Perception and self-motion probably co-evolved

Two equally important aspects of visual motion perception are:

Seeing moving objects

Seeing and guiding one's own motion
Sensitivity to Motion
We can characterize motion sensitivity in various ways:

Slowest perceptible motion

Fastest perceptible motion

Sensitivity in central and peripheral viewing
Important in driving
Characterizing slowest perceptible motion
Find a velocity threshold

Could define threshold as the velocity at which motion is detected 50% of the time

Example

T-rex wants to eat those annoying kids from Jurassic Park (I would too)

If they remain still, he won’t see them, but if they move faster than his velocity threshold… kiddie toast

How exactly do we quantify T-Rex’s velocity threshold?

Retinal velocity

Change in visual angle on the retina per unit time

Confusing terminology: visual angle is typically measured in “minutes” (minutes of arc, 1/60 of a degree), not minutes of time
Sensitivity to Motion: Scenarios
Scenario 1: T-Rex searches for the kid brother in an “empty field” situation


Empty field: no background references

Subject-relative motion: motion of a single visible object with no background references

Kid brother would have to move at ~10-20 min/sec

Scenario 2: T-Rex searches for the older sister in a crowded kitchen


Object-relative motion: motion of a visible object relative to some other object or visible background

Older sister would have to move at ~1-2 min/sec

Velocity thresholds: subject vs object relative motion


Subject-relative motion thresholds are about 10 times as high as object-relative motion thresholds

Bottom line: motion is best seen against a background
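
A rough sketch of how the two scenarios could be quantified: convert an object's speed and distance into retinal velocity (minutes of arc per second) and compare it with a threshold. It assumes the object moves perpendicular to the line of sight, and it uses the low ends of the ranges above (10 and 1 min/sec) as illustrative thresholds. The function names are hypothetical.

```python
import math

def retinal_velocity_arcmin_per_sec(speed_m_per_s, distance_m):
    """Approximate angular velocity of an object moving across the line of sight."""
    radians_per_sec = speed_m_per_s / distance_m
    return math.degrees(radians_per_sec) * 60.0  # 60 minutes of arc per degree

def is_motion_visible(speed_m_per_s, distance_m, has_background):
    """Compare retinal velocity with an assumed threshold (min/sec)."""
    threshold = 1.0 if has_background else 10.0
    return retinal_velocity_arcmin_per_sec(speed_m_per_s, distance_m) > threshold
```

For instance, an object creeping at 1 cm/sec at a distance of 10 m produces roughly 3.4 min/sec on the retina: above the object-relative threshold, below the subject-relative one.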
Information for Motion Perception
There are multiple sources of information or multiple situations that lead to perceived motion


RETINAL DISPLACEMENT is the changing position of an object's image on your eye


OPTICAL PURSUIT occurs when you track a moving object with your eye; the image stays on the fovea, yet you perceive it as moving
Real Motion
Refers to situations in the world in which an object is actually moving

Real motion can produce either retinal displacement OR optical pursuit
Apparent Motion
Occurs when images flash on and off in separate locations with certain timing relations

Although nothing really moves between flash locations, motion is seen

Also called stroboscopic motion

Wertheimer performed a famous experiment investigating the timing relations in apparent motion

Flashed a spot of light at location X, waited, then flashed another spot of light at location Y

Varied the time interval between the two flashes of light to see what effect this had on our perception of motion
Apparent Motion: Timing
Interstimulus Interval (ISI): time between the end of one flash and the start of another

When ISI < 60 ms: SIMULTANEITY, not movement, is seen

When ISI is 60-200 ms: OPTIMAL MOVEMENT is seen
Movement appears smooth and continuous

When ISI > 200 ms: SUCCESSION, not movement, is seen
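
The timing rules on this card amount to a simple lookup, sketched below. The boundaries are the approximate values from the card, and treating exactly 60 ms and 200 ms as "optimal movement" is an assumption of the sketch.

```python
def apparent_motion_percept(isi_ms):
    """Map an interstimulus interval (milliseconds) to the reported percept."""
    if isi_ms < 60:
        return "simultaneity"
    if isi_ms <= 200:
        return "optimal movement"
    return "succession"
```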
Induced Motion
Involves an object and a surrounding reference frame

When the surround or frame moves, the object appears to move

May also make the observer feel like he or she is moving
General Theories of Motion Perception
Indirect Perception Theory

Motion is not a basic perceptual quality; it is derived from other things


Direct Perception Theory

Motion is a basic perceptual quality; your system is wired to perceive it

Exner and Wertheimer each did experiments offering support for the direct view
-DIRECT Perception Theory is correct
Exner’s Experiment
Looked at threshold for perceived succession

Below 45 ms, can’t judge two events in succession

BUT: Can see one flash moving as low as 14 ms
Wertheimer’s Experiment: “Phi” motion
At ISIs around 60 ms, one sees simultaneous lights but also sees something moving between them

Two simultaneous percepts
Off and on in fixed location (Simultaneity) + Motion
"Objectless" motion

Indicates that motion perception mechanism is triggered independently of the perception of a single object from the two flashes
What neural mechanisms allow us to perceive motion?
Reichardt Detectors

Basic model for motion circuits

What are basic requirements for velocity detection?

What algorithm could work?

Register: CHANGE AT LOCATION 1 (this signal is delayed on its way to the motion neuron)

…. Elapsed time …

Register: CHANGE AT LOCATION 2

If the delayed signal from location 1 and the signal from location 2 arrive at the same time, the motion neuron fires

-Direction-selective and tuned to a specific speed
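
The delay-and-compare idea can be sketched in discrete time. This is a simplified, one-directional unit: real models of this kind pair it with a mirror-image unit for the opposite direction. The signals are lists of 0/1 "change detected" events of equal length, the delay is measured in time steps, and all names are illustrative.

```python
def reichardt_response(loc1, loc2, delay):
    """Motion-neuron output over time.

    The neuron fires (1) when the delayed signal from location 1
    coincides with the current signal from location 2.
    """
    return [
        loc1[t - delay] * loc2[t] if t >= delay else 0
        for t in range(len(loc2))
    ]

# A change at location 1 at t=0, then at location 2 at t=2.
preferred = reichardt_response([1, 0, 0, 0, 0], [0, 0, 1, 0, 0], delay=2)
# Same events in the opposite order (motion in the other direction).
opposite = reichardt_response([0, 0, 1, 0, 0], [1, 0, 0, 0, 0], delay=2)
# Preferred direction, but the detector's delay does not match the speed.
wrong_speed = reichardt_response([1, 0, 0, 0, 0], [0, 0, 1, 0, 0], delay=1)
```

Only the first case produces a response, which is why such a unit is both direction-selective and speed-tuned.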
Reichardt Detectors
Arranged to detect in all different directions

Evidence suggests opponent process arrangement to determine net motion

Opponent process… think color theory (i.e., blue vs yellow)

With respect to motion, you could have, for example:

Up vs down

Left vs right

Can explain movement after-effects

Resulting motion perception is the vector sum
of opposite direction detectors
1) Perceiving motion in one direction fatigues
detectors for that direction
2) Afterwards, looking at a stationary scene, there is a reduced response from fatigued detectors
The waterfall illusion
For a stationary scene, there is a base level of
response from each of the opposing detectors
---> <---

After exposure to a certain direction of motion, this base level of response is reduced for a fatigued detector
->

Combining their responses shows a net motion signal
in the direction opposite to the original adapting motion
-> + <--- = <--
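
The arrow diagram can be expressed numerically. In this sketch the net signal is the rightward response minus the leftward response, and adaptation scales a detector's baseline response down by a "fatigue" fraction. The baseline of 1.0 and fatigue of 0.5 are arbitrary illustrative values.

```python
def adapt(response, fatigue):
    """Reduce a detector's response by a fatigue fraction between 0 and 1."""
    return response * (1.0 - fatigue)

def net_motion(left_response, right_response):
    """Positive = net rightward signal, negative = net leftward, 0 = no motion."""
    return right_response - left_response

baseline = 1.0
before = net_motion(left_response=baseline, right_response=baseline)
# After prolonged rightward motion, the rightward detector is fatigued.
after = net_motion(left_response=baseline, right_response=adapt(baseline, 0.5))
```

Before adaptation the opposing responses cancel; afterwards the stationary scene yields a net signal in the direction opposite to the adapting motion.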
Perception of a Stable World
From the standpoint of basic motion detectors (i.e., Reichardt detectors) object motion and observer motion can have the same effects

Two main theories:

The COROLLARY DISCHARGE theory

The RELATIONAL VISUAL INFORMATION theory
Corollary Discharge Theory
Movement perception depends on three types of signals:

Motor Signal (MS)
Sent to the eye muscles when observer moves eyes
Similar to optical pursuit

Corollary Discharge Signal (CDS)
Copy of the motor signal sent to the comparator

Image Movement Signal (IMS)
Occurs when an image stimulates receptors as it moves across the retina
Similar to retinal displacement
Signal sent to the comparator.

Perception of movement is determined by whether the CDS, IMS, or both reach a structure called the comparator

Movement is perceived when the comparator receives either the CDS or IMS individually

No movement is perceived when the comparator receives both signals at once

When these signals reach the comparator simultaneously, they cancel each other out
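
The comparator rule described on this card behaves like an exclusive-or of the two signals. A minimal sketch, with illustrative names:

```python
def movement_perceived(cds, ims):
    """Comparator rule: movement is perceived when exactly one signal arrives.

    cds: True if a corollary discharge signal reaches the comparator
    ims: True if an image movement signal reaches the comparator
    """
    return cds != ims

# Eye tracks a moving object: CDS only, so movement is perceived.
tracking = movement_perceived(cds=True, ims=False)
# Eye is still, object moves across the retina: IMS only, so movement is perceived.
object_moves = movement_perceived(cds=False, ims=True)
# Eye scans a stationary scene: both signals cancel, so the world looks stable.
scanning = movement_perceived(cds=True, ims=True)
```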
Relational Visual Information Theory
System uses information about relations between objects
-Not concerned with eye movement

Optic array: structure created by the surfaces, textures, and contours of the environment

Non-technical definition…
Whatever scene you happen to be looking at

Local disturbances in the optic array indicate object movement

Global disturbances in the optic array indicate observer movement, and thus a stable world
Local disturbances in the optic array
Occur when one object moves relative to the environment, covering and uncovering the background
Remember accretion/deletion?

Happen whether we are tracking a moving object, or whether our eyes are stationary
Global disturbances in the optic array
Occur when all the elements of the optic array move
Event perception
Perception of what
things are moving, how they are
interacting, and where they are

Example of “higher-order” perception:
Perception is not just about basic
sensory dimensions, such as color,
loudness, etc.
Perception is about higher-order
relationships, such as causality.
Dual Specification
Refers to the fact that
events provide information for both changes
in the world AND about persisting properties
of the world

--phrase coined by J.J. Gibson

Examples: object form
spatial layout
Motion Information for Form
A. Motion alone can indicate the 3-D form of objects.

B. Originally, this was called the kinetic depth effect.

C. More generally, these abilities are called structure-from-motion (SFM)

-- referring to the extraction of object structure
from information in moving displays
Motion Information for Depth
Recall some kinematic depth cues:
motion perspective
accretion and deletion of texture
optical expansion
Structure From Motion
Moving dots are often used to study SFM.

-- When stationary, the dots do not reveal the contours or surface information of objects.

-- Thus, they can be used to study the motion effects independent of surface properties.
Rigid motion
Rigid motion is the geometric term describing motion of an object in space during which there are no changes in the distances between any two points on the object.

In other words, the object does not bend or deform during motion.
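
The definition can be turned directly into a check: a motion is rigid if every pairwise distance between points is the same before and after. The sketch below assumes points are given as coordinate tuples in matching order; the tolerance and names are illustrative.

```python
import math
from itertools import combinations

def is_rigid(before, after, tol=1e-9):
    """True if all pairwise distances are preserved between the two point sets."""
    for i, j in combinations(range(len(before)), 2):
        d_before = math.dist(before[i], before[j])
        d_after = math.dist(after[i], after[j])
        if abs(d_before - d_after) > tol:
            return False
    return True

triangle = [(0, 0, 0), (1, 0, 0), (0, 2, 0)]
rotated = [(0, 0, 0), (0, 1, 0), (-2, 0, 0)]    # 90-degree turn about the z axis
stretched = [(0, 0, 0), (2, 0, 0), (0, 2, 0)]   # one side has been lengthened
```

Here `is_rigid(triangle, rotated)` holds, while `is_rigid(triangle, stretched)` does not.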
Mathematical Analyses of SFM
Focus on Rigid Motion
-In other words, the object does not bend or deform during motion.

The Problem of Determining SFM
The problem of SFM in rigid motion is for the observer to recover 3-D form from the changing 2-D projection of an object.

A number of theorems have been proven showing that from a small number of points (3-5) moving in the 2-D image, the 3-D structure of a rigid object can be recovered.

Some have argued that the visual system will fit a rigid solution to motion displays whenever this is possible.
Non-Rigid Motion
Examples:

Your hand (jointed motion)
Jellyfish (elastic motion)

Much research has looked at point-light walker
displays, in which perceivers see a person
walking from motion of only a few points of light.
(Johansson)

Sometimes this is called biological motion.
Non-rigid Motion
(Mathematical analysis of)
What constraints are there?

What distinguishes unified, non-rigid
motion from disconnected motion
of dots?

These are unsolved problems!
Sound as Information
Sound provides many different kinds of information
about our environment.

SPATIAL LOCATION INFORMATION

HEARING IS ALSO SPECIALIZED FOR
PROCESSING OF SPEECH

MUSIC: Another gift from the sense of hearing
Sound as Information: SPATIAL LOCATION
It tells us directions of events.

It can tell us distance to a source.

-Like vision, hearing is a “distance sense”

Monitoring the environment:
Vision is better for spatial detail, but…
Hearing is special in allowing us to monitor
our surrounds without special orienting
Sound as Information: INFORMATION ABOUT SUBSTANCE AND EVENTS
Sound carries rich information about what things are
made of and about events occurring

Much of the information about substance and events
(for example, what distinguishes metallic
sounds from wooden sounds) remains
poorly understood.
Puzzling questions:
What are the biological functions of musical perception
and the aesthetics of music?
The enjoyment of music, and our abilities to process it, are
among a number of aspects of human psychology and motivation that are difficult to connect to standard biological accounts of the functions of behavior and cognition (e.g., survival, reproduction).
The Physical Stimulus for Sound
Physically, sound is the compression and spreading apart of air molecules.

1) This compression and rarefaction (spreading) is caused by any movement or vibrational event in a medium, such as air or water.

2) The disturbance in the medium spreads as a wave outward from the disturbance.
The Physical Stimulus for Sound
What moves?
Individual molecules do not move very far.

It is the wave that propagates.
The Physical Stimulus for Sound
The Speed of Sound …
is specific to a medium.

In air, it is about 330 m/sec (or 1100 feet/sec)
The Mathematical Description of Sound
All sounds can be described as combinations of simple sinusoidal waves.

This is true because of a theorem proved by the mathematician Fourier. Put simply, any complex waveform can be decomposed into a combination of simple sine waves, each having a specific frequency and amplitude.

The decomposition is unique (there is only one way to do it).

For a periodic waveform, the decomposition has the interesting property that all components have frequencies that are integer multiples of the lowest one (the fundamental).
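The decomposition described above can be sketched numerically. This is only an illustration, assuming NumPy is available; the 100 Hz fundamental, the harmonic amplitudes, and the helper names `synthesize` and `decompose` are made up for the example and are not part of the notes.

```python
import numpy as np

# Assumed illustration values: a 100 Hz fundamental plus two harmonics,
# sampled for exactly one second so each component lands on one FFT bin.
SAMPLE_RATE = 8000  # samples per second
COMPONENTS = {100.0: 1.0, 200.0: 0.5, 300.0: 0.25}  # frequency (Hz) -> amplitude


def synthesize(components, sample_rate=SAMPLE_RATE, duration=1.0):
    """Build a complex waveform by adding simple sine waves."""
    t = np.arange(int(sample_rate * duration)) / sample_rate
    wave = sum(a * np.sin(2 * np.pi * f * t) for f, a in components.items())
    return t, wave


def decompose(wave, sample_rate=SAMPLE_RATE, threshold=0.01):
    """Recover {frequency: amplitude} for every component above threshold."""
    n = len(wave)
    spectrum = np.fft.rfft(wave)
    freqs = np.fft.rfftfreq(n, d=1.0 / sample_rate)
    amplitudes = 2.0 * np.abs(spectrum) / n  # rescale bins to sine amplitudes
    return {
        round(float(f), 3): float(a)
        for f, a in zip(freqs, amplitudes)
        if a > threshold
    }


_, complex_wave = synthesize(COMPONENTS)
recovered = decompose(complex_wave)
for freq, amp in sorted(recovered.items()):
    print(f"{freq:6.1f} Hz  amplitude {amp:.3f}")
```

The recovered frequencies are the same three that went into the waveform, and each is an integer multiple of the 100 Hz fundamental.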
The Mathematical Description of Sound
Basic Unit
Basic Unit: the Sine Wave

-Frequency
-Amplitude
-Phase
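As a small sketch of how these three parameters define a sine wave, the function below evaluates one at a given time. The name `sine_wave` and the example values are assumptions made for illustration.

```python
import math


def sine_wave(t, frequency, amplitude=1.0, phase=0.0):
    """Value of a sine wave at time t (seconds).

    frequency: cycles per second (Hz)
    amplitude: peak height of the wave
    phase: starting offset within the cycle, in radians
    """
    return amplitude * math.sin(2 * math.pi * frequency * t + phase)


# A 1 Hz wave with amplitude 2 reaches its peak a quarter of a cycle in.
quarter_cycle = sine_wave(t=0.25, frequency=1.0, amplitude=2.0)

# Shifting the phase by a quarter cycle (pi/2) puts that peak at t = 0.
shifted_start = sine_wave(t=0.0, frequency=1.0, amplitude=2.0, phase=math.pi / 2)
```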
Simple and complex sounds
Sine waves: Not common everyday sounds because not many vibrations in the world are so pure

Most sounds in the world: Complex sounds (e.g., human voices, birds, cars)

All sound waves can be described as some combination of sine waves

Complex sounds can be described by Fourier analysis
A mathematical theorem by which any sound can be divided into a set of sine waves. Combining these sine waves will reproduce the original sound
ossicles
Amplification provided by ossicles is essential to ability to hear faint sounds
Inner ear is made up of collection of fluid-filled chambers
Middle ear muscles
Middle ear: Two muscles, the tensor tympani and the stapedius
Purpose: To tense when sounds are very loud, muffling pressure changes
However, the acoustic reflex follows the onset of loud sounds by about one-fifth of a second, so it cannot protect against abrupt sounds (e.g., a gunshot)
Inner ear
Inner ear: Fine changes in sound pressure are translated into neural signals
Function is roughly analogous to that of retina
Cochlear canals and membranes
Cochlea: Spiral structure of the inner ear containing the organ of Corti
Cochlea is filled with watery fluids in three parallel canals
Organ of Corti
Movements of cochlear partition are translated into neural signals by structures in the organ of Corti; extends along top of basilar membrane
Made up of specialized neurons called hair cells, dendrites of auditory nerve fibers that terminate at base of hair cells, and scaffold of supporting cells
translating sound waves
Firing of auditory nerve fibers finally completes the process of translating sound waves into patterns of neural activity
Coding of amplitude and frequency in the cochlea
Place code: Tuning of different parts of cochlea to different frequencies, in which information about the particular frequency of incoming sound wave is coded by place along cochlear partition with greatest mechanical displacement
Inner and outer hair cells
Inner hair cells: Convey almost all information about sound waves to brain
Outer hair cells: Receive information from the brain (via efferent fibers). They are involved in an elaborate feedback system
Psychological Dimensions of Hearing
The psychological experience of loudness is related to the physical variable of sound pressure

Pitch (psychological) is related to sound frequency (physical).

Human sensitivity to frequency ranges from about
20 - 20,000 Hz (where one Hz = one cycle per second of vibration)

-- Children have greatest range;
there is some loss at the very high end with age.

-- There is substantial loss in the high frequencies
for people who are exposed to loud noises.
Auditory Space Perception: General
The auditory system helps us to perceive spatial locations of events, especially their direction in space from us.

We can also perceive the distances of sounds to
some degree.

There are multiple ways in which direction is computed by auditory processes.

-- These relate to certain differences among sounds and situations, allowing us to perceive direction well through the combination of processes.
Interaural time difference (ITD)
The difference in time between a sound arriving at one ear versus the other

-Differences in the arrival at the two ears of an ONSET of a sound can be used to localize it.

The head's "sound shadow" blocks energy at frequencies above about 1000 Hz. This mainly produces differences in level between the ears (ILD) rather than differences in arrival time.
Auditory Space Perception: dependency
Most auditory space information depends on the fact we have two ears in different positions.

As in stereoscopic depth in vision, the two ears work
as a system.
Azimuth
Used to describe locations on imaginary circle that extends around us, in a horizontal plane
Can analyze ITD:
Where would a sound source need to be located to produce maximum possible ITD? Directly to the side

What location would lead to minimum possible ITD? Directly in front or directly behind

What would happen at intermediate locations?
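The questions above can be explored with a rough sketch. It assumes a simplified model in which the ITD is the extra straight-line path to the far ear divided by the speed of sound, ignoring the curvature of the head. The 0.2 m ear separation is an assumed illustrative value; the 330 m/sec speed of sound is the figure used in these notes.

```python
import math

SPEED_OF_SOUND = 330.0  # m/s in air, the value used in these notes
EAR_SEPARATION = 0.2    # m, assumed illustrative value


def itd_seconds(azimuth_deg, ear_separation=EAR_SEPARATION, speed=SPEED_OF_SOUND):
    """Simplified ITD for a source at the given azimuth.

    Azimuth 0 = straight ahead, 90 = directly to the right, 180 = behind.
    """
    return ear_separation * math.sin(math.radians(azimuth_deg)) / speed


for az in (0, 30, 60, 90, 120, 150, 180):
    print(f"{az:3d} deg -> {itd_seconds(az) * 1e6:6.0f} microseconds")
```

Under this model the ITD grows smoothly from zero straight ahead to its maximum directly to the side, then shrinks back to zero directly behind.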
Auditory Space Perception: Information from differences
Differences in the arrival at the two ears of an ONSET of a sound can be used to localize it.

Differences in the PHASE of soundwaves arriving at the two ears can be used for localization.

Differences in the INTENSITY of soundwaves arriving at the two ears can be used for localization.
Interaural level difference (ILD)
The difference in level (intensity) between a sound
arriving at one ear versus the other

Sounds are more intense at the ear closer to sound source
ILD is largest at 90 degrees and –90 degrees, nonexistent for 0 degrees and 180 degrees
ILD generally correlates with angle of sound source, but correlation is not quite as great as it is with ITDs
PHASE and INTENSITY differences
PHASE and INTENSITY differences play a complementary role.

1) Phase differences work for low frequency sounds.

2) Intensity differences work for high frequency sounds.

3) The combination allows us to localize most sounds.
CONE OF CONFUSION
The time and intensity differences we have described so far are subject to the
CONE OF CONFUSION

The cone of confusion refers to a set of points in space that produce identical onset, phase, or intensity differences, due to symmetries of being in front of / behind the head, or
above / below the head (e.g., 45 degrees to the right in front and 45 degrees to the right behind).

HEAD MOVEMENTS can be used to resolve the cone of confusion.
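A rough sketch of the front/back ambiguity and of how a head turn resolves it, using the same simplified straight-line ITD model. The 0.2 m ear separation, the 10-degree head turn, and the function name `itd_seconds` are assumptions made for illustration.

```python
import math

SPEED_OF_SOUND = 330.0  # m/s in air, the value used in these notes
EAR_SEPARATION = 0.2    # m, assumed illustrative value


def itd_seconds(source_azimuth_deg, head_turn_deg=0.0):
    """Simplified ITD for a source, measured relative to where the head points.

    Azimuth 0 = straight ahead, 90 = right, 180 = behind.
    A positive head_turn_deg means the head has turned to the right.
    """
    relative = math.radians(source_azimuth_deg - head_turn_deg)
    return EAR_SEPARATION * math.sin(relative) / SPEED_OF_SOUND


front, back = 45.0, 135.0  # right/front and right/back

# With the head still, the two locations give the same ITD.
same_itd_when_still = math.isclose(itd_seconds(front), itd_seconds(back))

# After turning the head 10 degrees to the right, the ITDs diverge:
# the front source moves toward the midline, the back source toward the side.
itd_front_turned = itd_seconds(front, head_turn_deg=10.0)
itd_back_turned = itd_seconds(back, head_turn_deg=10.0)
```

The direction in which the ITD changes during the turn tells the listener whether the source was in front or behind.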
CONE OF CONFUSION: Pinna
The shape of the PINNAE also helps resolve ambiguities in sound location, primarily for high-frequency sounds.

The pinna is the outer ear.
Auditory Distance Perception
How do listeners know how far away a sound is?

Simplest cue: Relative intensity of sound

Inverse-square law: As distance from a source increases, intensity decreases in proportion to the square of the distance (intensity is proportional to 1/distance squared)

Spectral composition of sounds: Higher frequencies decrease in energy more than lower frequencies as sound waves travel from the source to the ear

Relative amounts of direct vs. reverberant (reflected sound) energy
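The inverse-square law listed above can be sketched as a short calculation. The function names and the 1 m reference distance are assumptions made for illustration. The change is also expressed in decibels using the standard conversion of 10 times the base-10 logarithm of the intensity ratio.

```python
import math


def relative_intensity(distance, reference_distance=1.0):
    """Intensity at `distance` as a fraction of intensity at the reference."""
    return (reference_distance / distance) ** 2


def level_change_db(distance, reference_distance=1.0):
    """The same change expressed in decibels (negative means quieter)."""
    return 10.0 * math.log10(relative_intensity(distance, reference_distance))


for d in (1, 2, 4, 10):
    print(f"{d:2d} m: intensity x{relative_intensity(d):.4f}, "
          f"{level_change_db(d):+.1f} dB")
```

Each doubling of distance cuts intensity to one quarter, a drop of about 6 dB.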