58 Cards in this Set

  • Front
  • Back
  • 3rd side (hint)
what is coarticulation
When the phonemes in a word influence one another. AKA: Coproduction. Coarticulation happens all the time in normal, mature, connected speech.
2 types of coarticulation
* 1. FORWARD/ANTICIPATORY: an earlier sound is influenced by a later sound. For example, in the word "spoon" you may notice lip rounding on the /s/, which occurs before you even have bilabial closure for the /p/.

* 2. BACKWARD/RETENTIVE: an earlier sound influences a later one. Ex: The nasalization of the /o/ in the word ‘no’.
when does reduced coarticulation occur and what is it
Children and people with certain disorders (ex: apraxia of speech) will have reduced coarticulation. It can result in difficulty linking syllables together. The syllables will not be as cleanly coordinated in their production. There is a reduction in coarticulation and the speech will not sound as natural; it doesn’t flow as well.
How long does a consonant last?
* depends on the position -- in theory it is:

* Longest for single consonant + vowel

* Shorter for 2 consonant blend

* Shortest for 3 consonant blend

* Shortest duration in clusters

* So CV is long, CCV shorter, CCCV shortest (pie, spy, spry)
explain difference between relaxed and clear speech
* relaxed speech will be shorter in duration (smaller articulatory movements)

* clear speech longer in duration (larger/exaggerated articulatory movements)

* With clear speech we release stops
what hypothesis did Lindblom create?
* Lindblom: HYPO and HYPER Hypothesis: we adjust how much articulatory effort we use depending upon the circumstances.

* Lindblom: how much we adjust our articulatory effort depends on the situation. You articulate with less force in a quiet environment and with more force in a louder environment.
explain what happens in rapid speech production
An analogy would be someone who is skiing down a run. If you made wide loops around each pole in the snow, so you went well to the left and well to the right so that you never touch the pole, it would take you much longer to reach the goal in this event. But if you want to go very quickly down the ski slope (like the Olympics in downhill), you cut the corners as much as you possibly can so each pole is clipped by the skis as you go by. Cutting corners as best you can to gain speed. This is what happens in rapid speech production.
what is the lombard effect?
Some people call the Lombard effect a 'reflex' because it's the natural tendency to speak up when the environment becomes noisy or you cannot hear yourself speak for some other reason. If you have spoken to someone who is listening to loud music on their iPod, you may have found that they shout back at you, not because they're rude, but because they can't tell how loud they are being. This is the Lombard effect - it's essentially a spontaneous human response to not hearing your own speech.
What is the H & H Hypothesis
The H&H hypothesis (hyper and hypo) is a theory suggesting that we speak as clearly as we need to, but no clearer than that. It is metabolically economical to use as little speech effort as you can get away with. So if you are casually chatting with friends in a quiet environment, you won't be putting a lot of effort into speaking super clearly. If, however, you are talking to someone with a hearing loss, talking over a poor telephone connection, or for some other reason trying to overcome a barrier to comfortable communication, you'll tend to put a lot more effort into speaking clearly, which may or may not involve getting louder.
3 key elements of Prosody aka Intonation
2 main classes of prosody
2 main classes are linguistic and affective prosody
What is Linguistic prosody
* Linguistic prosody helps the listener understand the grammar of what is being said (stress on certain words, question vs. statement)
o It’s in the mail.
o It’s in the mail?
What is Affective prosody
* Affective prosody AKA Emotional prosody. Has to do with the expression of anger, surprise etc.
o Don’t talk to me in that tone of voice.
Explain Respiration Role w/ Prosody and how might influence a disorder
* Respiratory system is primary driver of prosody (modification of lung pressure/force to vocal folds which affects pitch, loudness and length of utterance)

* Some disorders have prosody issues (due to insufficient air supply or air supply modulation problems - laryngeal problems)
How do kids learn prosody?
Children do not have prosody; it is learned via Child Directed Speech
Who is the best at recognizing emotional tone?
Which overlapping group performed the best overall? Women are better at recognizing emotional tone.

* Dromey's study found that the female EMT (English mother tongue) polyglot group outperformed the other groups in recognizing emotional tone (identifying neutral vs. angry speech). This could be because they knew the prosodic patterns of English as well as other languages.
name 3 basic theories of speech perception
Template model, Feature theory, Motor theory
Describe Template Model Theory
• ideal phoneme example stored in memory
• incoming sounds evaluated, compared with template
• best match assigned to that phoneme
• perceptual magnet effect
i.e. if you have a sound that is close to the ideal target, it tends to be drawn towards that target, and ‘close enough’ counts as a match. It doesn’t have to be a perfect match. The Perceptual Magnet doesn’t actually change the nature of the sound itself, but it does influence our perceptual system so that we can assign a category to a sound more easily, as long as it is approximately where we would expect that sound to be.
• similar acoustics linked with category
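The template-matching idea can be sketched as a nearest-template lookup. This is only an illustration under stated assumptions: the vowel labels, the approximate (F1, F2) formant targets, and the function name `best_match` are hypothetical choices made for the example and are not part of the card.

```python
import math

# Hypothetical templates: approximate (F1, F2) formant targets in Hz.
# Illustrative assumptions only, not values taken from the flashcards.
TEMPLATES = {
    "i": (270.0, 2290.0),
    "a": (730.0, 1090.0),
    "u": (300.0, 870.0),
}


def best_match(f1_hz, f2_hz, templates=TEMPLATES):
    """Return the vowel whose stored template is closest to the incoming sound."""
    incoming = (f1_hz, f2_hz)
    return min(templates, key=lambda vowel: math.dist(templates[vowel], incoming))


# An imperfect token is still assigned to the nearest category,
# which mirrors the "close enough counts as a match" idea.
print(best_match(300.0, 2200.0))
print(best_match(700.0, 1150.0))
```

The lookup never alters the incoming values; it only decides which category they are assigned to, which is the point the perceptual magnet note makes.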
Describe Feature Theory
• We look for FEATURES in sounds that we hear in order to identify what sounds they are.
• An English consonant can be identified by place of articulation, manner of production, and voicing. Ex. what makes a /t/?

• place = alveolar

• manner = stop

• voicing = voiceless
• Via the auditory cortex, we automatically assign a phoneme to a category based on the features that the auditory system detects.
• Change a feature, change the phoneme. If you change a feature, you change the sound. For example if you change a /t/ from a stop to a fricative, you would have an /s/.
• Problem with the theory: it does not take timing into account. Voice onset times will sometimes change depending on the rate of speech, the dialect, or the manner of speech that a person is producing.
• Evidence in favor: we often misperceive sounds by being off by just one feature. You might perceive a ‘d’ instead of a ‘t’, or you might perceive a ‘p’ instead of a ‘t’.
• Misperceptions often differ by one feature; you are not usually off by several features all at once. It’s unlikely that you would confuse a ‘t’ for an ‘m’, because an ‘m’ would have bilabial production, a nasal manner, and voicing.
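A small sketch of the "off by one feature" idea: each consonant is stored as place, manner, and voicing, and the number of mismatched features is counted. The table only covers the consonants mentioned in these cards, and the names `FEATURES` and `feature_distance` are hypothetical.

```python
# Place, manner, voicing for the consonants discussed in the cards.
FEATURES = {
    "t": ("alveolar", "stop", "voiceless"),
    "d": ("alveolar", "stop", "voiced"),
    "p": ("bilabial", "stop", "voiceless"),
    "s": ("alveolar", "fricative", "voiceless"),
    "m": ("bilabial", "nasal", "voiced"),
}


def feature_distance(a, b):
    """Count how many of the three features differ between two consonants."""
    return sum(x != y for x, y in zip(FEATURES[a], FEATURES[b]))


# Likely confusions differ by one feature; /t/ vs /m/ differs by all three.
for other in ("d", "p", "s", "m"):
    print("t vs", other, "->", feature_distance("t", other))
```

Under this count, /t/ is one feature away from /d/, /p/, and /s/, but three features away from /m/, which matches the claim that /t/ and /m/ are rarely confused.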
Describe Motor Theory
The way we understand speech is by figuring out the way it was produced.

•We hear speech
•We decide how it was articulated
•We now ‘know’ the neuromotor commands that moved the muscles
•Thus, we know what was said
What is an example of Categorical perception of sound?
You can change the acoustic features of sounds, say ‘p’ vs. ‘b’, across a continuum so that there’s a gradual change of Voice Onset Time. At some point, you decide it’s no longer a ‘p’ and now it’s a ‘b’.

* Similar to looking at a rainbow: when you look straight on you see the separate colors, but if you look at the physical components of the light it is one continuous spectrum. The same applies to sounds and their acoustic features.

* i.e. when you produce a "p" and a "b" they can be confused for one another depending on production
* but if you produce a "b" and an "m" you most likely will not confuse them
Categories are assigned to ________ in our perception
* The _________ is the place where a sound is no longer perceived as that sound but has become closer to a neighboring sound
Crossover Point
bbbbbbbbbbbbbbbb--right here---ppppppppppp
VOT Continuum diagram illustrates....
this illustrates the VOT continuum between ‘p’ and ‘b’, and the phonetic boundary in the middle where your perception might change from ‘p’ to ‘b’. Our auditory processing system really likes to assign incoming sounds to categories.
Categorical Perception Within category changes are...
Within category changes are within a phoneme (everyone says /b/ differently and if it is within the category we perceive as /b/ we will hear /b/)
Categorical Perception: sound changes WITHIN A CATEGORY sound the same to us. We do NOT discriminate well within phoneme boundaries
Categorical Perception across categories is ....
between /p/ and /b/. Categorical Perception: sound changes ACROSS CATEGORIES are heard as different sounds. We can discriminate easily across.
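The within-category vs. across-category contrast can be sketched with a single crossover point on the VOT continuum. The 25 ms boundary and the function names are assumptions made for illustration; the cards do not give a boundary value.

```python
# Assumed crossover point on the VOT continuum (illustrative only).
BOUNDARY_MS = 25.0


def label(vot_ms):
    """Assign a VOT value to a category: short lag -> 'b', long lag -> 'p'."""
    return "b" if vot_ms < BOUNDARY_MS else "p"


def heard_as_different(vot_a_ms, vot_b_ms):
    """Two tokens are told apart only if they fall in different categories."""
    return label(vot_a_ms) != label(vot_b_ms)


# The same 20 ms step is ignored within a category but heard across the boundary.
print(heard_as_different(0.0, 20.0))   # both 'b'
print(heard_as_different(20.0, 40.0))  # 'b' vs 'p'
print(heard_as_different(40.0, 60.0))  # both 'p'
```

This is a deliberately all-or-nothing model; real listeners show a steep but not perfectly sharp crossover.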
Infant response to new VOT
* An infant's heartbeat will dip when it perceives something new. Studies show that infants’ heart rate does change when a new voice onset time is presented. VOT is meaningful even for infants who have not yet acquired a language.
What can an infant of 4-6 months do that you cannot?
Four- to six-month-old infants can respond to VOT categories that we are not used to separating. Ex. they can detect the 3 VOT categories of Thai, which we (as English speakers) cannot.
What is the language hardening that happens at 1 year?
* As infants approach the one year mark they have a "hardening" of the traditional language that surrounds them. They lose those extra categories that they used to be able to distinguish.
* Infants surrounded by English would retain 2 VOT categories after they turn one, and infants surrounded by Thai would retain 3.
segmenting sounds from connected speech with TOP DOWN KNOWLEDGE
- if we have linguistic knowledge of which words are potential candidates to be segmented then we can apply that knowledge to the sounds that we hear.
How are PHONOTACTICS useful when segmenting words
PHONOTACTICS can be useful when segmenting words. You short-list the possibilities and select the most logical in the context. We do this automatically.
Describe speech rate changes
• not a linear stretching / compression. Not all sounds are affected equally. You cannot really prolong a stop although you can prolong a vowel.
What is clinical rate manipulation
• slowing speech may help intelligibility
• what helps you in parsing foreign languages? (pauses)
• prolong speech segments
• add pauses between words
• pacing boards • point to first letter (Alphabet Board) • metronome or other tech
2 Speech Perception Processes:
Bottom Up & Top Down. Both the Top-Down approach and the Bottom-Up approach play an important role in speech perception.
Bottom-Up Approach:
This speech perception approach is where the signal comes in and then our brain picks it apart. Reliance on signal acoustics; it looks at the different acoustic features of each sound and then pieces them together into a message.
Top-Down Approach:
Reliance on linguistic knowledge • recognizing established patterns • influenced by context, syntax, etc. This speech perception approach is where you have knowledge of the language that is being spoken, and then you use your knowledge to impose some structure on the incoming sound stream so that you can make sense of what is being said.
Explain multimodal speech perception
• In some environments (phone) we understand audio only
• Visual cues: speech-reading is helpful in noisy settings
• thalamus integrates sensory input; THALAMUS is the part of the brain that integrates all of our sensations from all the different sensory organs in the body except for the sense of smell. The thalamus is important to allow us to connect the auditory signal with the visual signal so that we can get the most possible meaning out of a message that is being given to us.
• what about mismatched input? THE MCGURK EFFECT – when people say one thing visually but the sound track doesn’t match it.
what are the advantages of semitones vs. Hertz in measuring frequency (IN GENERAL)
A semitone scale is easier to use because it's a LOGARITHMIC SCALE: it puts the final result into a series of numbers that are much easier to manage, without lots of trailing or leading zeros. Also, the human perceptual system is much more aligned with a semitone (non-linear) scale than with a linear scale in Hz. Pitch perception is not linear; our ears are more sensitive to proportional changes than to absolute changes in Hz.
what is an octave, how do you go up/down, and what does it have to do with semitones?
* 12 semitones in 1 octave.
* To go up one octave you double the frequency in Hz. (100 to 200 Hz is an octave; 200 to 400 or from 400 to 800 is also one octave)
* Going down one octave means the frequency has been halved.
what is semitone scaling?
* SEMITONE SCALING: The higher up the scale you go, the bigger the change in Hz, even though there are always 12 semitones in an octave. One semitone is never a fixed number of Hz.
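A short sketch of semitone scaling, assuming the standard equal-tempered relation in which each semitone multiplies the frequency by 2^(1/12), so that 12 semitones double it. The function names are hypothetical.

```python
def semitones_up(f_hz, n_semitones):
    """Frequency reached by moving n semitones up (negative n moves down)."""
    return f_hz * 2 ** (n_semitones / 12)


def semitone_step_hz(f_hz):
    """Size in Hz of one semitone step starting at f_hz."""
    return semitones_up(f_hz, 1) - f_hz


# Twelve semitones up is one octave (double); twelve down is half.
print(semitones_up(100.0, 12))
print(semitones_up(100.0, -12))

# One semitone covers more Hz the higher up the scale you start.
for start in (100.0, 200.0, 400.0):
    print(start, "Hz: one semitone =", round(semitone_step_hz(start), 2), "Hz")
```

Each doubling of the starting frequency also doubles the Hz width of a semitone, which is why a semitone is never a fixed number of Hz.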
what are the advantages of semitones vs. Hertz in measuring FUNDAMENTAL frequency
Expressing a person’s F0 variability in semitones does correspond more closely with the way our pitch perception works because it’s not linear. The benefit of this semitone conversion is that it allows you to compare across speakers who differ in their average F0.
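To see why semitones help when comparing speakers, here is a minimal sketch with two hypothetical speakers whose F0 ranges are invented for the example (they are not data from any study). The function name `range_in_semitones` is also an assumption.

```python
import math


def range_in_semitones(f_low_hz, f_high_hz):
    """F0 range expressed in semitones: 12 * log2(high / low)."""
    return 12 * math.log2(f_high_hz / f_low_hz)


# Hypothetical speakers (illustrative values only).
low_voice = (100.0, 150.0)   # 50 Hz range
high_voice = (200.0, 300.0)  # 100 Hz range

print(round(range_in_semitones(*low_voice), 2))
print(round(range_in_semitones(*high_voice), 2))
```

In Hz the second speaker appears to vary twice as much, but in semitones both ranges come out the same, because each speaker's top F0 is 1.5 times the bottom one.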
What is fundamental frequency?
Fundamental Frequency: Lowest frequency and also the highest amplitude harmonic component of all the individual waves that add up together to create the voice source spectrum. It reflects the rate at which the vocal folds are vibrating. It is a physical measurement (Hz or cycles per second)
our PERCEPTION of pitch corresponds closely to the actual _____ ______ in a periodic or nearly periodic signal. Our perception of pitch can be influenced by a number of things. Whether a sound is _____ or ____ may influence very slightly the pitch that we perceive.
fundamental frequency
Pitch and frequency are similar BUT:

PITCH is a _________characteristic.
FREQUENCY is a ___________characteristic.
The period of a signal and its frequency are ________related – as a frequency of a signal ______, the duration of each cycle, or the period, will ________.
* Frequency=cycles per second
* 1 Hz = 1 cycle per second.
* Period = 1/f0.
Example: F0 = 200 Hz, then the period is 1/200 of a second (5 milliseconds)
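The period formula above can be checked with a couple of lines. The function name `period_ms` is hypothetical; the relation itself is just Period = 1/F0, converted to milliseconds.

```python
def period_ms(f0_hz):
    """Duration of one cycle in milliseconds for a given frequency in Hz."""
    return 1000.0 / f0_hz


# As frequency goes up, the period gets shorter.
for f0 in (100.0, 200.0, 400.0):
    print(f0, "Hz ->", period_ms(f0), "ms")
```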
Give examples of periodic, nearly periodic and aperiodic sounds
The human voice is nearly periodic. A sine wave is periodic.
A sawtooth wave is periodic. Noise is aperiodic; it lacks pattern repetition.
What is a vocal register
Vocal Register

* Not a region of fundamental frequency
* Pattern of physiological activity of the vocal folds, in terms of the way the vocal folds oscillate.
Pitch changes are far less detectable to us in the ________ frequency range
Far less detectable to us in the higher frequency range.
500 Hz -> 600 Hz (100 Hz increase) = easily hear that pitch has changed
3500 Hz -> 3600 Hz (100 Hz increase) = barely hear that pitch has changed
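One way to see why the same 100 Hz step is heard so differently is to express both steps in semitones. A minimal sketch, assuming the usual conversion of 12 semitones per doubling of frequency; the function name `interval_in_semitones` is hypothetical.

```python
import math


def interval_in_semitones(f_start_hz, f_end_hz):
    """Size of a pitch step in semitones: 12 * log2(end / start)."""
    return 12 * math.log2(f_end_hz / f_start_hz)


# Same 100 Hz increase, very different proportional change.
print(round(interval_in_semitones(500.0, 600.0), 2))
print(round(interval_in_semitones(3500.0, 3600.0), 2))
```

The 500 to 600 Hz step is a little over three semitones, while the 3500 to 3600 Hz step is about half a semitone.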
Describe how a pulse/vocal fry voice sounds and when you might use this register
* Pulsatile quality: you can perceive the individual glottal pulses. Low, gravelly-sounding voice.
* We will often slip down from modal phonation into pulse register as we get towards the end of an utterance when making a statement. If you were singing in pulse register and went too far up the scale, you would suddenly switch to modal register.
Describe pulse/vocal fry voice physiology
* Physiology: Vocal folds relatively slack. Low sub-glottic driving pressure from lungs.
Describe pulse/vocal fry voice waveform & frequency/pitch/loudness characteristics
* Frequency/Pitch/Loudness: Very low F0 range. Very limited pitch range. Limited loudness range.
* "Pulse or fry is the lowest and has a double-pulse waveform in many instances. "
Describe how a MODAL REGISTER/ CHEST VOICE sounds and when you might use this register
Typical speaking / mid-range singing voice. Widely employed in singing.
Describe MODAL REGISTER/ CHEST VOICE waveform & frequency/pitch/loudness characteristics
The Modal (moving) register is widely employed in singing because it has a wide dynamic range; can go from a very soft voice to a very loud voice in this pattern of oscillation

* The pitch can go up and down, the singing voice quality can be dramatically different from a speaking voice, but the PATTERN OF VOCAL FOLD VIBRATION can still be such that it is within the modal register.
Describe MODAL REGISTER/ CHEST VOICE voice physiology
o the full vocal fold (with all of its layers) vibrates.
o A good, strong modal voice has a lot of vocal fold tissue in contact from one side to another.
o During modal register, the whole mass of the vocal fold vibrates, including the 3 sections of the vocal fold: thyroarytenoid muscle, layers of lamina propria, mucosal cover
Describe FALSETTO /LOFT/HEAD waveform & frequency/pitch/loudness characteristics
o At high end of F0 range.
o When you get to the very high notes in Falsetto, the wave form is almost like a sine wave because you don’t have as many harmonic components, so it can sound a bit like a pure TONE.
Describe how a FALSETTO /LOFT/HEAD sounds
o High pitched, thinner quality of phonation than MODAL, a falsetto voice doesn’t have the harmonic richness of the Modal register; it’s somewhat simpler, thinner and more akin to a pure tone
Describe FALSETTO /LOFT/HEAD voice physiology
o Vocal folds are stretched tightly - Cricothyroid muscle has stretched them.
o Cover of vocal folds oscillates medially. Folds are stretched tightly and only the medial edge oscillates.
o Little / no involvement of thyroarytenoids in vibration. Thyroarytenoid muscle underneath is stretched so much that it's too stiff to vibrate.
o In falsetto/ loft register, the cover of the vocal folds oscillates, but you don’t get the contribution of the thyroarytenoid muscles, so the surface of the vocal folds is stretched very tightly.
o Oscillation almost sinusoidal.
o In the falsetto register, you have rather minimal vocal fold contact between the right and left sides.
Moving between registers
Goal is to have a seamless transition.
Hard to transition for non-singers because of the physiological differences between registers (the coordinated movements of each register are vastly different, like the difference between a horse’s walk and gallop).