34 Cards in this Set
- Front
- Back
What is speech perception?
|
Assigning meaning to sound
|
|
We perceive ________ events; we hear __________ events
|
We perceive LINGUISTIC events; we hear ACOUSTIC events
|
|
What are some of the challenges in discriminating between and identifying the acoustic-phonetic features of speech sounds?
|
- lack of segmentation
- lack of phonemic invariance
- lack of linearity |
|
What perception cues do we rely on in addition to the acoustic information?
|
body language
facial expression
content-topic
familiarity w/speaker & speech patterns
movement of articulators as speech cues
knowledge of language -- what do you expect in terms of linguistic rules |
|
What are the acoustic cues for MANNER of articulation?
|
I. Harmonics - no noise; relatively low frequency energy
   A) High-intensity formants:
      i. formant glides
         a) semivowels - faster transitions
         b) diphthongs - slower transitions
      ii. vowels -- relatively steady state formants
   B) Low-intensity formants:
      i. nasals -- relatively steady state formants; low frequency resonance
II. Non-periodic component; relatively high frequency energy
   A) Duration of noise
      i. Stops: transient burst
      ii. Affricates: sharp onset of noise
      iii. Fricatives: longer noise |
|
What are the cues for PLACE of articulation?
|
I. Formant Spacing
   A) Vowels
      i. F1 by F2
      ii. F2 high - front V; F2 low - back V
      iii. F1 high - low V; F1 low - high V
   B) Semivowels
      i. F2: high - /j/; mid - /r, l/; low - /w/
      ii. F3: low - /r/; flat - /l/
II. F2 Transition:
   A) Stops & Nasals:
      i. Low - labial
      ii. High - alveolar
      iii. Varied - palatal-velar
   B) Fricatives:
      i. Especially /f/ and theta: low - /f/; high - theta
   C) Frequency of Noise:
      i. Friction: Fricatives: High - /s/; Lower - /sh/; Wide band - /f, theta, h/
      ii. Burst: Stops & Affricates: High - alveolar; Low - labial; Varied - palatal-velar |
|
What are the cues for VOICING distinctions?
|
I. Voice bar:
   A) stops - during closure
   B) affricates - during closure
   C) fricatives - with friction
II. Timing:
   A) Noise duration
      i. longer aspiration - voiceless stops
      ii. longer friction - voiceless affricates; voiceless fricatives
   B) VOT
      i. stops
         a) longer - voiceless
         b) shorter - voiced
   C) F1
      i. stops
         a) rises from base - voiced
         b) cutback - voiceless
   D) Closure duration:
      i. Stops & Affricates
         a) longer - voiceless
         b) shorter - voiced
   E) Duration of preceding vowel
      i. longer before voiced
      ii. shorter before voiceless |
|
What are the perceptual cues for VOWELS?
|
long duration
higher intensity
strong formant patterns
In running speech: transitions to/away from vowels; changing information - phonetic context (esp. because vowels are neutralized) |
|
What are the perceptual cues for DIPHTHONGS?
|
gradual transition from one vowel to another -- how fast formants change is very important for perception
intense formants |
|
What are the general perceptual cues for CONSONANTS, to differentiate from vowels?
|
shorter in duration
less intense
constriction
voiced/voiceless |
|
In consonant perception, there is lots of ___________ to help us perceive
|
REDUNDANCY -- many speech cues to help us perceive
|
|
What is categorical perception?
|
the ability to discriminate only as well as one can identify
|
|
Categorical perception:
Sounds differing in some small way are perceived as the same ________ until a _______ is reached |
Sounds differing in some small way are perceived as the same PHONEME until a BOUNDARY is reached
i.e., there is a whole category of what /k/ would be -- lots of different phones, with different physical data, are all perceived as /k/ and cannot be discriminated |
|
T/F: Categorical perception exists only for place cues.
|
FALSE: CP exists for place, manner, and voicing cues.
|
|
Describe the ba/da/ga place study and discrimination task.
|
The vowel /a/ was synthesized. Its formants were kept the same in every presentation; the only thing that changed was the F2 transition (which gives info about which stop-plosive the /a/ follows). The frequency of the F2 transition varied in equal increments. When the stimuli were presented to listeners, distinct boundaries were found: some presentations were clearly ID'd as ba, some as da, some as ga, even though no two sounds were the same.
Evidence for categorical perception: our brains like to put info into categories. Discrimination task: two-step pattern (present 1&3, 2&4; the difference between 1&3 and between 2&4 is equal). Listeners heard A, B, and X (which could be either A or B) and had to decide whether X was more like A or more like B. If you can't discriminate A vs. B, performance on X will be at chance (you're guessing). If you can, performance on X will be close to 100%. I.e., when A and B were within the same perceptual category (both /ba/), the chance of getting X correct was 50%. When A and B were from different perceptual categories (e.g. one /ba/ and one /da/), people were close to 100% correct. |
|
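The ABX logic above can be sketched in a few lines of Python. This is only an illustration under a strong assumption: the listener compares phoneme labels and nothing else. The function name `predict_abx_accuracy` and the 8-step set of labels are hypothetical, not data from the study.

```python
def predict_abx_accuracy(labels, step=2):
    """Expected ABX proportion correct for each pair (i, i + step),
    assuming the listener can only compare phoneme labels."""
    results = []
    for i in range(len(labels) - step):
        a, b = labels[i], labels[i + step]
        # Same label: X cannot be told apart from A or B, so the listener guesses (50%).
        # Different labels: X's label matches exactly one of A or B (100%).
        accuracy = 0.5 if a == b else 1.0
        # Report stimulus numbers starting at 1, as in "present 1&3; 2&4".
        results.append(((i + 1, i + 1 + step), accuracy))
    return results


# Hypothetical identification labels along an F2-transition continuum.
continuum_labels = ["ba", "ba", "ba", "da", "da", "da", "ga", "ga"]

for pair, accuracy in predict_abx_accuracy(continuum_labels):
    print(pair, accuracy)
```

Pairs that fall inside one category come out at chance, and pairs that straddle a boundary come out at 100%, which is the pattern the card describes. Real listeners are noisier than this all-or-nothing sketch.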
What are some of the considerations for categorical perception?
|
- Do people perceive speech differently than they perceive nonspeech? (do we have CP for other sounds)
- Does learning of a language alter speech perception?
- Is CP innate or learned? (nature vs nurture)
- Auditory vs. linguistic contributions |
|
What did Within-language and Cross-language Studies of CP in Adults find?
|
- categorical perception for place: b,d; r,l
- CP for manner (transition duration: b/g to w/j to ua/ia) and voicing (VOT from -150 to +150 ms)
- perception is more categorical for consonants than for vowels (dynamics/speed of consonant is perceived more readily than for vowels)
- influence of linguistic knowledge (learning a 2nd language; age; boundaries shift depending on which language the listener expects to hear - e.g. Puerto Rican) |
|
T/F: There are multiple acoustic cues in consonant perception.
|
TRUE. Example of VOT in voicing contrasts for plosives.
|
|
What are the 7 acoustical parameters that are important cues for IDing stop consonants in VCV environments:
|
1) Rate, degree, extent of vocalic transition from V to C
2) Duration of closed phase (stop phase)
3) Presence or absence of voicing during closed phase
4) Voice onset time (VOT)
5) Duration of noise burst
6) Spectral characteristics of noise burst
7) Rate, degree and extent of vocalic transition from C to V
All of this is evidence that there are multiple acoustic cues in consonant perception |
|
What does it mean when we say that "trading relations exist among perceptual cues"?
|
Many cues contribute to our perception -- but in certain conditions, certain cues will be more salient. There are trade-offs in terms of which cues will prevail.
|
|
Coarticulation/Rate:
How do coartic/rate affect speech perception? |
Perceptually we integrate information that is spread out over several phonetic segments to make a decision about phoneme identity
|
|
Coartic/rate:
What is an example of how context influences perception? |
"sh" vs "s" -- when the fricative noise is in the midrange of the bandwidth (slushy "sl"), the percept varies according to vowel context. It will be perceived as /s/ if it precedes a back vowel, b/c the difference is relatively big, and as /sh/ if it precedes a front vowel, b/c the difference is relatively smaller
|
|
Coartic/rate:
What is an example of how rate influences consonant ID? |
an intermediate VOT will be perceived as voiceless in fast speech; voiced in slow speech
|
|
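A minimal sketch of this rate effect, assuming the voiced/voiceless boundary sits at a shorter VOT in fast speech than in slow speech. The boundary values and the 30 ms example are made up for illustration and are not measurements; `classify_stop_voicing` is a hypothetical helper.

```python
def classify_stop_voicing(vot_ms, boundary_ms):
    """Label a stop as voiceless if its VOT reaches the category boundary."""
    return "voiceless" if vot_ms >= boundary_ms else "voiced"


# Illustrative (not measured) boundaries: the listener is assumed to
# expect shorter VOTs overall when speech is fast.
FAST_SPEECH_BOUNDARY_MS = 20.0
SLOW_SPEECH_BOUNDARY_MS = 40.0

intermediate_vot_ms = 30.0  # hypothetical ambiguous token

print("fast:", classify_stop_voicing(intermediate_vot_ms, FAST_SPEECH_BOUNDARY_MS))
print("slow:", classify_stop_voicing(intermediate_vot_ms, SLOW_SPEECH_BOUNDARY_MS))
```

The same physical VOT lands on different sides of the boundary depending on rate, so the token is heard as voiceless in fast speech and voiced in slow speech.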
What are the acoustic cues for liquids?
|
formant transitions; F3 lower for /r/ than for /l/
|
|
What are the acoustic cues for glides?
|
shorter transition duration (40-60ms to </110-150 ms) than for diphthongs (>100ms)
|
|
What are the acoustic cues for nasals?
|
internal formant structure; vowel formant transitions.
Manner cue -- weak formants; nasal formant
Place cue -- F2 transition frequency and duration |
|
STOPS -- (cues are intertwined with cues for surrounding vowels & consonants)
What are the manner cues for stops? |
preceding silence or sound attenuation; transient noise burst; rapid formant transitions
|
|
STOPS --
What are the place cues for stops? |
noise burst frequency; formant transition to vowels; VOT (shortest for bilabials; longest for velars)
|
|
STOPS --
What are the voicing cues for stops? |
duration of silent interval; initial position stops -- VOT, F1 cutback, F0 of vowel (higher for voiceless stops); final position stops -- vowel duration (longer before voiced stops)
|
|
FRICS --
What are the manner cues for fricatives? |
friction noise; longer in duration than for stops (130ms or greater)
|
|
FRICS --
What are the place cues for fricatives? |
sibilants are more intense than non-sibilants and have high-frequency spectral peaks; non-sibilants have flat spectra
/s, z/ prominent peak @ 4000 Hz; /sh, zh/ @ 2500 Hz; F2 transitions |
|
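The two peak frequencies on this card can be turned into a toy place classifier for sibilants. The nearest-peak rule and the name `nearest_sibilant_place` are assumptions made for illustration; the card only gives the two reference peaks.

```python
# Reference spectral peaks taken from the card above.
REFERENCE_PEAKS_HZ = {"/s, z/": 4000.0, "/sh, zh/": 2500.0}


def nearest_sibilant_place(peak_hz):
    """Return the sibilant pair whose reference peak is closest to peak_hz.

    Nearest-peak matching is an assumed rule, not a perceptual model.
    """
    return min(REFERENCE_PEAKS_HZ, key=lambda name: abs(REFERENCE_PEAKS_HZ[name] - peak_hz))


print(nearest_sibilant_place(3800.0))
print(nearest_sibilant_place(2700.0))
```

A peak near 4000 Hz is assigned to /s, z/ and a peak near 2500 Hz to /sh, zh/. Non-sibilants, with their flat spectra, have no prominent peak and are outside what this sketch covers.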
FRICS --
What are the voicing cues for fricatives? |
friction noise (longer duration and more intense for voiceless fricatives)
|
|
What are the three roles of context in speech perception?
|
ACOUSTIC context -- info regarding several phonemes converge on a specific sound
LINGUISTIC context -- language specific sound sequences (e.g. ng and nd at end of syllable; never at beginning); other language based expectations re: syntax and semantics (we are biased by what we expect to hear)
TOPIC/situational context -- familiarity with subject matter |
|