462 Cards in this Set

  • Front
  • Back
The Waveform
A plot of changes in amplitude over time for a particle, object, or system

The specific display is called a time-domain waveform

The waveform is actually a representation of a particle motion or system motion in time
Features of Periodic waveforms:
PERIOD- the time one cycle takes; duration of one cycle of vibration

period is the inverse of frequency

t = 1 / f

200 Hz means a cycle of compression and rarefaction occurs 200 times per second

FREQUENCY = number of vibrations per second

f = 1 / t (where t is the period)
PERIOD
- the time one cycle takes; duration of one cycle of vibration

- period is the inverse of frequency

- t = 1 / f

- 200 Hz means a cycle of compression and rarefaction occurs 200 times per second
FREQUENCY =
number of vibrations per second

f = 1 / t (where t is the period)
Period is the inverse of...
frequency
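
A minimal Python sketch of the inverse relation between period and frequency (t = 1 / f). The function names are illustrative and not part of the card set.

```python
def period_from_frequency(frequency_hz: float) -> float:
    """Return the period in seconds for a frequency in hertz (t = 1 / f)."""
    if frequency_hz <= 0:
        raise ValueError("frequency must be positive")
    return 1.0 / frequency_hz


def frequency_from_period(period_s: float) -> float:
    """Return the frequency in hertz for a period in seconds (f = 1 / t)."""
    if period_s <= 0:
        raise ValueError("period must be positive")
    return 1.0 / period_s


# A 200 Hz tone completes one cycle every 0.005 s (5 ms).
period_of_200_hz = period_from_frequency(200.0)
```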
Wavelength =
- distance in space that one cycle occupies, from one peak to the next, measured in meters (m)

λ = c/ f
c = ___________ in the case of c = fλ

What must you know to derive "c"?
c= fλ

speed of sound
(to know this you need to know the medium)
SOUND =
systematic changes in pressure generated by an energy source

- The object or source must be vibrating or oscillating
Must have a source and a medium that transmits energy...
generated by the source

- The object or source must be vibrating or oscillating
What must be disturbed to produce a sound?
- Atmospheric pressure (Patm)
So as sound is put into vibration it will:
- Cause air pressure changes around the vibrating source part
Sound is conducted through a process of...
compression (decrease in distance) and rarefaction (increase in distance) of the medium (air)
Compression
Decrease in distance
Rarefaction definition
Increase in distance
How is a “chain reaction” set up with neighboring air molecules?
Remember that air is springy. It has inherent elasticity and is very suitable for compression and expansion.

Like a pendulum swinging, once air molecules are displaced, they will oscillate due to their mass, elasticity and inertia.

With so many neighboring air molecules nearby…You set up a “chain reaction” with neighboring air molecules.
Once air molecules are displaced, they will oscillate due to...
... their mass, elasticity and inertia.
What is passed along or exchanged with neighbors or other air molecules?
Energy
The Traveling Wave –
Energy is transferred and propagated.
Density of air molecules INCREASES with __________.
COMPRESSION
DENSITY-
Mass of air molecules per unit volume of space
Density of air DECREASES with ___________.
RAREFACTION
Alternating regions of __________________ move through medium
Compression and rarefaction

- low pressure (rarefaction) to high pressure (compression) to low pressure (rarefaction)
TIME- DOMAIN WAVEFORM
A plot of changes in amplitude over time for a particle, object, or system
This specific display (below) is called the time-domain waveform
The waveform is actually a representation of particle motion or system motion in time
Amplitude of a sine wave represents air pressure over time
Wave motions are disturbances moving through a medium, that...
Transfer energy in a radiant manner from a source, with ….

The speed of wave propagation (travel) depending on the physical characteristics of the material through which the wave travels.
Wave motions are...
Disturbances moving through a medium that transfer energy in a radiant manner from a source
Wave motions transfer energy in a radiant manner from a source, with ….
The speed of wave propagation (travel) depending on the physical characteristics of the material through which the wave travels.
3 types of waves are defined
transverse

longitudinal

surface
-# of vibrations per second =
Frequency, measured in hertz (Hz)

- frequency is the inverse of period
Speed of sound in air:
331 m/s at 32 degrees F (0 degrees C, temperature of medium) or 343 m/s at 68 degrees F (20 degrees C)
Speed of sound in water:
1461 m/s
Speed of sound in steel rod:
5000 m/s
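
Because λ = c / f, the same frequency has a different wavelength in each medium. A small Python sketch, assuming the speeds listed on these cards (343 m/s for air at 68 degrees F); the dictionary and function names are made up for illustration.

```python
# Speeds of sound taken from the cards above, in meters per second.
SPEED_OF_SOUND_M_PER_S = {
    "air": 343.0,
    "water": 1461.0,
    "steel rod": 5000.0,
}


def wavelength_m(frequency_hz: float, medium: str) -> float:
    """Return the wavelength in meters using lambda = c / f."""
    if frequency_hz <= 0:
        raise ValueError("frequency must be positive")
    return SPEED_OF_SOUND_M_PER_S[medium] / frequency_hz


# The same 200 Hz tone is much longer in steel than in air.
wavelengths_at_200_hz = {
    medium: wavelength_m(200.0, medium) for medium in SPEED_OF_SOUND_M_PER_S
}
```

The faster the medium carries sound, the longer one cycle is in space.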
Incident wave
The wave that is projected from the source toward a surface; some portion of it will be bounced off the surface, and that portion is the reflected wave
Damping:
-decreases amplitude of incident wave

-soft, porous, rough surface material: very absorbent, (that’s why we scream into a pillow)

-reflection wave: some part of incident wave is reflected back towards source

-hard and smooth surface: reflects sound waves more
Boundary
Sudden change in media density that will reflect a sound wave in a different direction.

Ex: something traveling through air and then suddenly going into water
In the real world, what happens to sound waves after they are produced?
In the real world, the sound waves hit boundaries after their production.


Waves of pressure bounce or “reflect” off of boundary locations
When sounds “reflect” off of surfaces, a complex situation arises. Describe it:

What two wave types are formed?
A complex situation arises where one sound wave is now traveling in one direction, while its reflected wave is traveling back from where it came from and overlaps with the first wave.

- Reflected wave
- Incident wave
Reflection is a type of...
Boundary condition
What is the difference between echo, reverberation, and resonance...
It depends on DISTANCE between the sound source and the reflective surface.
What exactly is a sound?
You need a source and a medium that transmits the energy generated by the source.

Patm must be disturbed to produce a sound
Sound=
Systematic changes in pressure generated by an energy source
The object or source generating the sound must be free to...
vibrate or oscillate
So as a sound source is put into vibration, it will...
Cause air pressure changes around the vibrating source part
Sound is conducted through a process of...
Compression and rarefaction of the medium (air)
Remember that air is springy...this means it has ________ and is suitable for ___________?
It has inherent elasticity and is very suitable for compression and expansion.
Like a pendulum swinging, once air molecules are displaced, what happens?
They will oscillate due to their mass, elasticity and inertia
Air molecules, once displaced, will oscillate due to:
Their mass, elasticity and inertia
A "chain reaction” with neighboring air molecules means that...
Energy is passed along or exchanged with neighbors.

The Traveling Wave – energy is transferred and propagated.
Time Domain Waveform
Plot of changes in amplitude over time for a particle, object, or system


The waveform is actually a representation of a particle motion or system motion in time
Wave motions are
disturbances moving through a medium, that…

Transfer energy in a radiant manner from a source, with...

Moving out in all directions from source
Wave motions transfer...
Transfer energy in a radiant manner from a source, with...

Moving out in all directions from source
The speed of wave propagation is...
how quickly it is traveling or radiating
The speed of wave propagation depends on...
the physical characteristics of the material through which the wave travels
Three types of waves are defined
Transverse Waves (stadium waves)

Longitudinal Waves

Surface Waves
Transverse Waves (stadium waves)
Cause the medium to move perpendicular to the direction of the wave's travel.

Displacement of particles is at right angles to the direction of wave propagation

Ex: slinky motion
Longitudinal Waves
Cause the medium to move parallel to the direction of the wave's travel.

Displacement of particles is in the same direction of wave propagation

Ex: Sound waves
Surface Waves
Are both transverse waves and longitudinal waves mixed in one medium

Displacement is circular

Ex: Water motion
Reflection is a type of
boundary condition.
Three different types of Reflected waves:
Echo, Reverberation, and Resonance (interaction of incident waves w/ reflected waves)
Difference between echo, reverberation, and resonance depends on
DISTANCE between the sound source and the reflective surface (or the time it takes for that sound to get to its reflective surface and bounce back)
Echo
Occurs when sound is reflected off a distant surface

The reflecting sound is heard AFTER you hear your own production
Reverberation
Think of it as a more refined form of echo.

Occurs within smaller & enclosed spaces where the sound bounces from wall to wall DURING the time the sound you’re producing is heard.

As sound is reflected from wall to wall within a small room, the reflected sound can enhance or degrade the original sound source somewhat.
INTERFERENCE
Special situation arises from the idea of two waves crossing each other in time
In-phase means
Waves have amplitudes that are changing or rising in the same direction
When two (or more) waves are in-phase, they interfere...
Constructively, creating a resultant wave that has a greater amplitude than either of the first two individual waves by themselves.

BASICALLY: Waves and their amplitudes are adding up
Opposite-phases means
Trough and crest happen at the same time
When the two waves have opposite-phases they interfere how?
Destructively and the amplitude of the resultant wave is attenuated (lowered).


This means: Wave amplitudes take away from each other
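
Constructive and destructive interference can be checked numerically by adding two sine waves sample by sample. This is only a sketch, assuming two waves of equal amplitude (1.0) and equal frequency; the helper names are illustrative.

```python
import math


def sine(amplitude: float, frequency_hz: float, phase_rad: float, t: float) -> float:
    """Instantaneous value of a sine wave at time t (seconds)."""
    return amplitude * math.sin(2 * math.pi * frequency_hz * t + phase_rad)


def peak_of_sum(phase_offset_rad: float, frequency_hz: float = 100.0,
                samples: int = 1000) -> float:
    """Peak absolute amplitude of two unit sine waves added over one period."""
    period = 1.0 / frequency_hz
    peak = 0.0
    for n in range(samples):
        t = n * period / samples
        total = (sine(1.0, frequency_hz, 0.0, t)
                 + sine(1.0, frequency_hz, phase_offset_rad, t))
        peak = max(peak, abs(total))
    return peak


in_phase_peak = peak_of_sum(0.0)            # amplitudes add up (about 2.0)
opposite_phase_peak = peak_of_sum(math.pi)  # amplitudes cancel (about 0.0)
```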
Standing Waves arise from interference patterns that are...
Synchronized in time and space
STANDING WAVE
When energy (in our case, sound pressures) is placed into a system that is enclosed or small and confined, a dynamic condition arises that produces this unique form or type of a wave
STANDING WAVES are created...
Standing waves are created whenever two waves with the same wavelength are passing in opposite directions to each other in the same medium.

In essence, the wavelength and period of the reflected wave is synchronized to the wavelength and period of the incident wave, giving us the illusion that the resultant wave is not moving.
What is giving us the illusion that the resultant wave is not moving in the standing waveform?
In essence, the wavelength and period of the reflected wave is synchronized to the wavelength and period of the incident wave, so it looks like it is not moving.
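
A rough numerical sketch of why the resultant looks like it is not moving: adding an incident and a reflected wave with the same wavelength gives points (nodes) that stay at zero at every instant. It assumes equal amplitudes and no phase change at the boundary, which real reflections may not satisfy.

```python
import math


def standing_wave(x_m: float, t_s: float, wavelength_m: float = 1.0,
                  frequency_hz: float = 100.0) -> float:
    """Sum of two equal waves traveling in opposite directions."""
    k = 2 * math.pi / wavelength_m   # spatial (wave number) term
    w = 2 * math.pi * frequency_hz   # temporal (angular frequency) term
    incident = math.sin(k * x_m - w * t_s)    # traveling in +x
    reflected = math.sin(k * x_m + w * t_s)   # traveling in -x
    return incident + reflected


# At half a wavelength (a node) the sum is zero at every sampled instant.
node_values = [standing_wave(0.5, t) for t in (0.0, 0.001, 0.0025, 0.004)]

# At a quarter wavelength (an antinode) the sum swings up to twice the amplitude.
antinode_peak = standing_wave(0.25, 0.0)
```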
ANY complex periodic waveform…
…Is simply the linear summation of a series of sinusoids
Lowest frequency is ...
Lowest frequency is the “Base Freq” or Fo

For a complex periodic waveform (one made up of several individual pure tone sinusoids), the F0 is the “base frequency”
Harmonics are...
Harmonics are whole number multiples of the Fo

All other sine waves that exist in the complex periodic wave must be related to the base frequency and are referred to as harmonics
Fo & Harmonics
For a complex periodic waveform (one made up of several individual pure tone sinusoids), the F0 is the “base frequency”

All other sine waves that exist in the complex periodic wave must be related to the base frequency and are referred to as harmonics

If F0 = 100 Hz, then harmonics are 200, 300, 400
If F0 = 300 Hz, then harmonics are 600, 900, 1200
If F0 = 500 Hz, then harmonics are 1000, 1500, 2000
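
The whole-number-multiple rule can be written as a one-line function. A sketch, following the convention used on these cards where the listed harmonics start at 2 x F0 (some texts count F0 itself as the first harmonic); the function name is illustrative.

```python
def harmonics(f0_hz: float, count: int = 3) -> list[float]:
    """Return the first `count` harmonics above F0 (2*F0, 3*F0, ...)."""
    if f0_hz <= 0 or count < 0:
        raise ValueError("f0 must be positive and count non-negative")
    return [f0_hz * n for n in range(2, count + 2)]


harmonics_of_100 = harmonics(100)   # [200, 300, 400]
harmonics_of_300 = harmonics(300)   # [600, 900, 1200]
```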
If during reflection, the incident wave and the reflected wave have the same wavelength…
...Then reflection sets up a constant process of constructive and destructive interference, producing the resultant – standing wave (be able to draw and define)
Draw and define the standing wave
Standing waves are created whenever two waves with the same wavelength are passing in opposite directions to each other in the same medium.

In essence, the wavelength and period of the reflected wave is synchronized to the wavelength and period of the incident wave, giving us the illusion that the resultant wave is not moving.
When the phases are the same, the resultant wave is amplified…
It is greater in amplitude than the component waves by themselves (can be positive or negative amplification)
“Modes of vibration”
“Modes of vibration” are motion patterns that a system can create or support based on its
1) boundary conditions,
2) the amount of energy put into the system (respiratory drive and glottal spectra), and
3) its physical characteristics (of system – in this case, vocal tract)
“Modes of vibration” are motion patterns that a system can create or support based on its
1) boundary conditions

2) the amount of energy put into the system (respiratory drive and glottal spectra)

3) its physical characteristics (of system – in this case, vocal tract)
The vocal tract is a flexible tube or enclosure, with a sound source at one end… that sound source is?
The vibrating vocal fold
If you place a frequency energy into the vocal tract tube or enclosure that MATCHES the natural vibrating mode of the tube itself or the air inside the tube...
...then the amplitude of the resulting wave is increased

This gets us to our last form of reflection…RESONANCE
Natural frequency-
The frequency at which the tube "likes" to sound; this brings us to the idea of resonance
RESONANCE
All objects and systems can be made to vibrate AND we know that they have a certain ‘natural’ frequency at which they will vibrate MOST STRONGLY (with maximum amplitude) based on the physical properties of the system

These are the vibrating modes of the system

(the slinky example/ base frequency)
The exact frequency at which something naturally vibrates is determined by its...
Its size, length and overall shape.

And also, whatever it is composed of, or made of
Resonant Frequency of the object
The frequency (Hz) that an object naturally vibrates maximally
Two Types of Resonators:
Mechanical versus Acoustical Resonators
Resonators are...
Any material objects that can “respond” to an outside energy source.
Mechanical Resonators:
The object itself is put into vibration (a swing, the slinky)

Energy source is physically linked to the ‘responding’ system
Acoustical Resonators:
Volume of air in tube (or an enclosure) is actually placed into vibration.

This type of resonator is enormously important for speech and vocalization.

RF of an acoustical resonator depends on its volume

Small volumes resonate high frequencies and large volumes resonate lower frequencies.
Acoustical Resonators:

RF of an acoustical resonator depends on its volume..
Small volumes resonate high frequencies and large volumes resonate lower frequencies.
Acoustical Resonators:

Small volumes resonate
high frequencies
Acoustical Resonators:

Large volumes resonate...
lower frequencies
Think of Resonant Frequencies as
“Natural” points of energy amplification
All objects favor their natural modes of...
Vibration or RF’s (resonance frequencies)

A mode represents patterns of vibration, which require the least amount of energy to achieve.
Thus, objects are most easily forced into resonant vibration when...
... disturbed at or with energy whose frequencies match with the object’s natural modes.
Think of Resonant Frequencies as “Natural” points of energy amplification
All objects favor their natural modes of vibration or RF’s (resonance Frequencies)

A mode represents patterns of vibration which require the least amount of energy to achieve.

Thus, objects are most easily forced into resonant vibration when disturbed at or with energy whose frequencies match with the object’s natural modes.

Outside “driving forces” are needed to force the vibration of the object or system.

When the driving freq matches the RF, because of interference patterns, the object becomes a “natural” amplifier for a specific frequency of the driving force’s energy. (matches frequency of the tube= bigger amplitude)

What does that mean?

- Signal with 200, 400, and 600 Hz components

- Tube has a resonant frequency of 400 Hz, so 400 Hz will be amplified

- Others, if they pass through, will be dampened or attenuated

- Again, recall that the EXACT FREQUENCY at which the container resonates is determined by its SIZE AND OVERALL SHAPE

- Pharyngeal and oral vocal tract cavities
Outside “driving forces” are needed to...
force the vibration of the object or system
When the driving freq matches the RF, because of interference patterns...
...the object becomes a “natural” amplifier for a specific frequency of the driving force’s energy.
When the driving freq matches the RF, because of interference patterns, the object becomes a “natural” amplifier for a specific frequency of the driving force’s energy.

What does that mean?
Signal with 200, 400, and 600 components

Tube has resonant freq of 400 Hz – 400 Hz will be amplified

Others, if they pass through, will be dampened or attenuated
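
The 200 / 400 / 600 Hz example can be sketched with a simple resonance curve. This is an assumption for illustration only: it uses the normalized response of a generic second-order resonator with a made-up sharpness value (Q = 5), not a model taken from the cards, and its peak gain is 1, so "amplified" here means favored relative to the other components.

```python
import math


def resonator_gain(frequency_hz: float, resonant_hz: float = 400.0,
                   q: float = 5.0) -> float:
    """Relative gain (0..1) of a generic second-order resonator.

    Gain is 1.0 at the resonant frequency and falls off on both sides.
    The value of q (sharpness of tuning) is a hypothetical choice.
    """
    detune = frequency_hz / resonant_hz - resonant_hz / frequency_hz
    return 1.0 / math.sqrt(1.0 + (q * detune) ** 2)


# Relative gain for each component of the 200 / 400 / 600 Hz signal.
gains = {f: resonator_gain(f) for f in (200.0, 400.0, 600.0)}
favored = max(gains, key=gains.get)   # the component nearest the RF
```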
Again, recall that the exact frequency at which the container resonates is determined by its...
...size and overall shape

And pharyngeal and oral vocal tract cavities
Summarizing Resonance
Vibration from one object can set other objects into motion.

In the vocal tract, the folds are setting the air within the tube of the vocal tract into motion.

The VT's length supports the generation of modes that are in the same range as the vibration patterns being created by the vocal folds.

When the mode of the air within the tube matches the frequency energy created by the vocal folds, we get a resonance.

The better and closer the match, the greater the amplitude increase for that frequency – means that frequency will get louder than all other frequencies being put into the system
Vibration from one object can....

In the vocal tract, this means...
...set other objects into motion.


In the vocal tract, the folds are setting the air within the tube of the vocal tract into motion.
The VT's length supports the generation of modes that are...
...in the same range as the vibration patterns being created by the vocal folds.
We get a resonance when...
When the mode of the air within the tube matches the frequency energy created by the vocal folds
The better and closer the match to the frequency energy created by the vocal folds,
the greater the amplitude increase for that frequency – means that frequency will get louder than all other frequencies being put into the system
What happens to complex waves in a resonator?
Recall that complex waves have a Fo and some x number of harmonics.

Resonators are capable of highlighting or accentuating those frequencies in the complete complex signal that match the resonator’s natural modes (they can also dampen or attenuate, which is the opposite)

The resonator is acting like a FILTER (amplifying some, dampening/attenuating others)

Highlighting some frequencies of the complex waves components and suppressing others

Resonator STILL has its own natural RF at which it will resonate maximally (achieve max amp)

Often we are talking about a single tube – vocal tract has 2, sometimes 3 tubes – multiple resonant frequencies
Complex waves have...
...a Fo and some x number of harmonics.
Because resonators are capable of highlighting or accentuating those frequencies in the complete complex signal that match the resonator’s natural modes (and can also dampen or attenuate, which is the opposite), the resonator is acting like...
The resonator is acting like a FILTER (amplifying some, dampening/attenuating others)

Highlighting some frequencies of the complex waves components and suppressing others

Resonator STILL has its own natural RF at which it will resonate maximally (achieve max amp)

Often we are talking about a single tube – vocal tract has 2, sometimes 3 tubes – multiple resonant frequencies
For frequencies in the complete complex signal that match the resonator’s natural modes, resonators are capable of...
...highlighting or accentuating those frequencies; they can also dampen or attenuate, which is the opposite
The resonator is acting like a FILTER meaning....
IN COMPLEX WAVES:

Highlighting some frequencies of the complex waves components and suppressing others

Resonator STILL has its own natural RF at which it will resonate maximally (achieve max amp)
Systems that change the relationship among components of a complex waveform...
Filters
A filter is basically a...
system that performs an action on inputs fed into it

It passes some frequencies and rejects (attenuates) others, based on its physical or mathematical characteristics.
Low-pass filters:
Passing low frequencies
Rejecting high frequencies
High-pass filters:
Passing high frequencies
Rejecting low frequencies
Band-Pass filters:
Passing a range of frequencies
Rejecting frequencies outside of range
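
The three filter types can be sketched as operations on a list of frequency components. This assumes idealized "brick wall" filters that pass a component unchanged or reject it completely; real filters roll off gradually. The names and the example signal are illustrative.

```python
# A complex signal described as {frequency in Hz: amplitude}.
Components = dict[float, float]


def low_pass(components: Components, cutoff_hz: float) -> Components:
    """Pass components at or below the cutoff, reject the rest."""
    return {f: a for f, a in components.items() if f <= cutoff_hz}


def high_pass(components: Components, cutoff_hz: float) -> Components:
    """Pass components at or above the cutoff, reject the rest."""
    return {f: a for f, a in components.items() if f >= cutoff_hz}


def band_pass(components: Components, low_hz: float, high_hz: float) -> Components:
    """Pass components inside the range, reject those outside it."""
    return {f: a for f, a in components.items() if low_hz <= f <= high_hz}


signal = {200.0: 1.0, 400.0: 0.8, 600.0: 0.6, 800.0: 0.4}
lows = low_pass(signal, 400.0)          # keeps 200 and 400 Hz
highs = high_pass(signal, 600.0)        # keeps 600 and 800 Hz
band = band_pass(signal, 300.0, 700.0)  # keeps 400 and 600 Hz
```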
Bandwidth –
A range of frequencies that a resonator will transmit or pass with minimal reduction in amplitude

Freq Hi – Freq Lo
Cut-off Frequency (frequencies) –
Discrete point on the curve where the intensity (amp) value of a range of frequencies has decreased by 3 dB from the value of the peak intensity.

3 dB down defines the boundaries of our bandwidth.
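
A sketch of reading the bandwidth off a response curve using the 3 dB down rule (Freq Hi minus Freq Lo). The response values below are invented for illustration, and the sampling is coarse, so the cut-off points are only as precise as the spacing between samples.

```python
import math


def bandwidth_hz(response_db: dict[float, float]) -> tuple[float, float, float]:
    """Return (low cut-off, high cut-off, bandwidth) from a sampled response.

    The passband is every sampled frequency whose level is within 3 dB
    of the peak level.
    """
    peak_db = max(response_db.values())
    passband = [f for f, level in response_db.items() if level >= peak_db - 3.0]
    low, high = min(passband), max(passband)
    return low, high, high - low


# Hypothetical resonator response: frequency in Hz -> level in dB re: peak.
response = {
    300.0: -12.0, 350.0: -6.0, 400.0: -3.0, 450.0: -1.0, 500.0: 0.0,
    550.0: -1.0, 600.0: -3.0, 650.0: -6.0, 700.0: -12.0,
}
low_cut, high_cut, width = bandwidth_hz(response)

# 3 dB down corresponds to roughly half the power of the peak.
half_power_db = 10 * math.log10(0.5)
```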
Resonators are in essence, selective to...
frequency
Resonator systems are _______________ when presented with a complex wave; in other words, ______________________
Resonator systems are frequency selective when presented with a complex wave; in other words, they filter complex signals
Resonator systems selectively pass ___________________ and attenuate ________________________________
Resonator systems selectively pass frequencies at their resonant frequencies and attenuate other frequencies found in the driving energy source
It IS possible to push a resonator into highlighting a non-preferred frequency if ...
you put enough energy into the system
Selection of the RF is based on...
the inherent properties of the system (mucus, thickness, lubrication, etc.)
Frequencies remote from the peak of the system response are said to be...
effectively filtered out

In effect, the contributions of some frequencies are emphasized and others diminished in the sound output you hear
Narrowly tuned
Closer to a pure tone signal


Ex: only accepting 500 Hz; some other frequencies will get through, but the 500 Hz frequency will be amplified above all others
Broadly tuned
Ex: broadly tuned from 500 – 900 Hz; those are the frequencies it will amplify

More frequencies evenly and equally represented
Describing the filtering process of a resonator mathematically requires the use of a...
Transfer Function
A transfer function defines...
The relationship between the inputs into a system and the outputs produced by the system
In other words, a transfer function describes the...
Shape of the filter
Ex of transfer function: Making cookies
Sheet of dough = input

Cookie cutter = system (operating on the dough)

Trims away excess dough to produce the shape you want

Cookies made up of selected frequencies from input = output
The curve of a TRANSFER FUNCTION DOES NOT represent...
a sound wave
The curve of a TRANSFER FUNCTION shows you...
...How the physical features of the resonator affect the amplitudes of the various frequency components of the energy passing through it.
The TRANSFER FUNCTION curve shows you how the features of...
A sound input into a resonator are changed by the physical characteristics of the resonator itself.
The Basic Idea
When you use the musculature of the vocal tract to change the size of different regions of the vocal tract you create different resonance patterns that also interact with one another

It is these patterns and these interactions that dictate what sound you will perceive when it is spoken

Ex: plastic models of vocal tract for vowels in English

Note the different sizes, volumes, shapes of the different regions of these plastic models
Each volume maps onto a different segment of the vocal tract
When you use the musculature of the vocal tract to change the size of different regions of the vocal tract you create ...
...different resonance patterns that also interact with one another


It is these patterns and these interactions that dictate what sound you will perceive when it is spoken
What dictates what sound you will perceive when it is spoken?
When you use the musculature of the vocal tract to change the size of different regions of the vocal tract you create different resonance patterns that also interact with one another

It is these patterns and these interactions that dictate what sound you will perceive when it is spoken
We can model the VT as series of...
Interlinked tubes that can change their shape and volume dynamically
Quarter-wave resonator =
Tube closed at one end (glottis) and open at the other end (lips)
VT =
Series of containers hooked up to each other

Each container has its own different filtering properties

Therefore each container has its own resonating frequency (RF)

And the vocal tract resonates at numerous RFs (these are called formants)

But, each container can change its shape dynamically

Thus, resonant frequencies (RF’s) will constantly be changing also
And the vocal tract resonates at numerous RFs called...
Formants
But, each container of the vocal tract can change its shape dynamically, thus...
...resonant frequencies (RF’s) will constantly be changing also
VT will resonate or absorb energy best...
at a frequency with a wavelength 4 times the length of the tube

All of this is only for ONE SOUND!
VT will resonate or absorb energy best at a frequency with a wavelength 4 times the length of the tube

All of this is only for ONE SOUND!
All of this is only for ONE SOUND!
Vocal tract lengths:
Adult males – 17.5 cm, so the wavelength of the first resonant frequency is 70 cm
Adult females – 14.7 cm, so the wavelength is 58.8 cm
Small children – 8.75 cm, so the wavelength is 35 cm

Calculating…
Males = 340 / 0.7 = 485.7 Hz = first resonant frequency

Females = 340 / 0.588 = 578.23 Hz

Children = 340 / 0.35 = 971.43 Hz
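
The quarter-wave calculation can be sketched in Python: the wavelength of the first resonance is 4 times the tube length, and f = v / wavelength. This assumes v = 340 m/s and treats the vocal tract as a single uniform tube closed at one end; the function name is illustrative.

```python
SPEED_OF_SOUND_M_PER_S = 340.0  # value used on these cards


def first_resonance_hz(tube_length_cm: float,
                       speed_m_per_s: float = SPEED_OF_SOUND_M_PER_S) -> float:
    """First resonant frequency of a quarter-wave resonator."""
    if tube_length_cm <= 0:
        raise ValueError("tube length must be positive")
    wavelength_m = 4 * tube_length_cm / 100.0   # 4 x length, converted cm -> m
    return speed_m_per_s / wavelength_m


male_f1 = first_resonance_hz(17.5)     # about 485.7 Hz
female_f1 = first_resonance_hz(14.7)   # about 578.2 Hz
child_f1 = first_resonance_hz(8.75)    # about 971.4 Hz
```

Shorter tubes give higher first resonances, which matches the small volume / high frequency rule for acoustical resonators.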
Fundamental wave equation:
f = v / wavelength

v = velocity (speed) of sound in air

Where v = 340 m/sec at room temperature and normal altitudes

Temperature and altitude change how quickly sound travels, and therefore affect frequency

Convert cm to m: 70 cm = 0.7 m
To determine other resonant frequencies (aka formants), find the...
....odd-numbered multiples of the lowest frequency
Ex: 485.7 Hz x3, x5, x7... etc….
Ex: 578.23 Hz x3, x5, x7…
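
A short sketch of the odd-multiple rule for a quarter-wave tube. It assumes an idealized uniform tube, so the numbers are rough estimates rather than measured formants; the function name is illustrative.

```python
def quarter_wave_resonances(first_resonance_hz: float, count: int = 3) -> list[float]:
    """Return the first `count` resonances: x1, x3, x5, ... of the lowest one."""
    return [first_resonance_hz * (2 * n - 1) for n in range(1, count + 1)]


# Starting from the adult male value of 485.7 Hz:
male_resonances = [round(f, 1) for f in quarter_wave_resonances(485.7)]
```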
Formalize all these ideas into the Source-Filter Theory of Speech Production....
Energy from the sound source (vocal fold vibration) is modified (filtered) by the resonance features of the vocal tract

The vocal tract’s shape operates to filter the frequency energy produced by the vocal folds.

The shape of the output spectrum (the spectrum you hear and recognize as a phoneme) represents the transfer function of the vocal tract on the sound source.

This implies that the output spectrum registered at the lips is modified dramatically from that produced at the glottis.
Energy from the sound source (vocal fold vibration) is modified (filtered) by...
the resonance features of the vocal tract
The vocal tract’s shape operates to...
Filter the frequency energy produced by the vocal folds.
output spectrum
the spectrum you hear and recognize as a phoneme
The shape of the output spectrum represents....
The transfer function of the vocal tract on the sound source.
Source-Filter Theory of Speech Production implies....
This implies that the output spectrum registered at the lips is modified dramatically from that produced at the glottis.
A few concrete examples of Source-Filter Theory of Speech Production
The glottal spectrum is a stereotyped input

It always has the same gradually falling slope to it

Depending on how the vocal tract is shaped, this input is altered to highlight different frequencies as seen in the output spectra of the sound recorded at the lips

For the following figure keep in mind these relations:
Glottal spectra = input
VT shape = transfer function
Vowel spectra = output
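
The input / transfer function / output relation can be sketched numerically. Everything specific below is an assumption for illustration: F0 = 100 Hz, a glottal spectrum falling 12 dB per octave, and an invented transfer function with a single peak at 500 Hz. In dB terms, output = input + transfer function.

```python
import math

F0_HZ = 100.0  # hypothetical fundamental frequency


def glottal_source_db(harmonic_number: int,
                      slope_db_per_octave: float = -12.0) -> float:
    """Level of a harmonic in the gradually falling glottal spectrum."""
    return slope_db_per_octave * math.log2(harmonic_number)


# Invented vocal tract transfer function: gain in dB at each harmonic.
TRANSFER_FUNCTION_DB = {
    100.0: 0.0, 200.0: 2.0, 300.0: 6.0, 400.0: 12.0,
    500.0: 20.0, 600.0: 12.0, 700.0: 6.0, 800.0: 2.0,
}


def output_spectrum_db() -> dict[float, float]:
    """Vowel spectrum = glottal spectrum shaped by the transfer function."""
    output = {}
    for n in range(1, 9):
        freq = F0_HZ * n
        output[freq] = glottal_source_db(n) + TRANSFER_FUNCTION_DB[freq]
    return output


source = {F0_HZ * n: glottal_source_db(n) for n in range(1, 9)}
vowel = output_spectrum_db()
```

The source only falls as frequency rises, while the output shows a local peak at 500 Hz where the transfer function boosts that harmonic over its neighbors.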
The glottal spectrum is a...
...stereotyped input
The glottal spectra always has the same...
...gradually falling slope to it
Depending on how the vocal tract is shaped, the stereotyped input of the glottal spectra, is...
...altered to highlight different frequencies as seen in the output spectra of the sound recorded at the lips
For the following figure keep in mind these relations:

Glottal spectra =
VT shape =
Vowel spectra =
Glottal spectra = input
VT shape = transfer function
Vowel spectra = output
The peak amplitude frequencies of the vowel spectra are called ______ in the speech world
FORMANTS
FORMANTS
The peak amplitude frequencies of the vowel spectra

They are numbered from left to right as F1, F2, F3… etc…in the speech world
Formants in the speech world are numbered...
They are numbered from left to right as F1, F2, F3… etc…
Formants ARE the “vocal tract’s” ..
...resonances (resonant frequencies)
Formants ARE the...
...“natural” modes of vibration for different parts or chambers within the whole VT tube.
If formants ARE the “natural” modes of vibration for different parts or chambers within the whole VT tube....
Formants are thus the ...
harmonics from the glottal source that lie within the resonator's bandwidth
Boosted over all other harmonics
What determines which vowel you will perceive?
The lowest three formants
What formants are of interest in most typical speech science applications?
Only the lowest three formants are of interest in most typical speech science applications

These three determine which vowel you will perceive
The first 3 formant peaks determine...
...which vowel you will perceive
Connected together, the formants make up...
...the transfer function of the VT
Each peak in the transfer function =
A formant peak
A formant peak is...
Each peak in the transfer function
Acoustic Analysis of Speech: The Problem
Acoustic signal is constantly changing and is complex.

Acoustic (simple audio) signal displays amplitude changes in time, not frequency changes in time.

The speech waveform can be viewed as an acoustical time domain signal, capturing the intensity dynamics of speech very well in time.

But it does not easily describe the frequency relations and changes to the VT’s transfer function during running speech.
Acoustic (simple audio) signal displays _______________, not_________________
Acoustic (simple audio) signal displays amplitude changes in time, not frequency changes in time.
The speech waveform can be viewed as....
...an acoustical time domain signal, capturing the intensity dynamics of speech very well in time.
Acoustic signal does not easily describe...
... the frequency relations and changes to the VT’s transfer function during running speech.
Frequency “spectrum” plots are useful for...
...seeing the state of a complex waveform at ONLY an instant in time
Spectrums alone tell you nothing about...
...how the speech signal is changing THROUGH time
Spectrums are...
...snapshots of a sound’s harmonic structure at a single moment in time.
Spectrums tell you nothing about how…
…These harmonic structures are changing through time during running speech.
How do we “see” the harmonic changes that are occurring as a consequence of changes in the vocal tract’s transfer function* through time?
The Solution is a Spectrogram
Spectrograms are a convenient way to...
...diagram the changes in a sound's spectrum as it runs through time.
“Running spectrum”.
The spectrogram is a “running spectrum”.
It displays continuously changing spectral patterns IN TIME
The spectrogram allows you to directly visualize the changes in...
Formant frequencies AND amplitudes during running speech.
You can think of a spectrogram as a continuous...
Fourier analysis for the word or phrase you are analyzing
Speech is constantly being decomposed into...
...its frequency and amplitude components along the time domain.
A spectrogram is a 3 dimensional plot...
The horizontal dimension (X axis) represents time.

The vertical dimension (Y axis) represents frequency.

The Z-axis is intensity represented by variations in darkness.
The horizontal dimension (X axis) on a spectrogram represents...
TIME
The vertical dimension (Y axis) on a spectrogram represents...
FREQUENCY
The Z-axis on a spectrogram represents...
Intensity represented by variations in darkness
Components of Spectrograms
Each thin vertical slice of the spectrogram shows ONE spectrum plot during a short period of time, using “darkness” (grey scale level) to indicate amplitude changes in a 2-D image.

Formant bands on a spectrogram are created when you pancake individual spectra side to side.

Darker areas are those frequencies having higher amplitudes.
Each thin vertical slice of the spectrogram shows-
ONE spectrum plot during a short period of time

Uses “darkness” (grey scale level) to indicate amplitude changes in a 2-D image.
Formant bands on a spectrogram are created when...
...you pancake individual spectra side to side
Darker areas on a spectrogram are...
Darker areas are those frequencies having higher amplitudes
Formant is not just ________.
one frequency
All the frequencies that are in the passband make up ____________.
the formant
Important features of a formant include:
Center frequency (aka, Formant Freq)

bandwidth

amplitude of the passband

3 dB down or cut-off
Numbering the formant bands...
F1, F2, F3
Occasionally, the Voice Bar is seen.
Represents F0
Formant Transitions reflect changes in the volume of vocal tract spaces in time.
Formant Transitions reflect ....
...changes in the volume of vocal tract spaces in time.
Fo is not related to ____________!!!!
resonance
Resonances are due to the size and shape of the container/cavity, not the __________________.
nature of the sound source into that container
A speaker’s fundamental frequency does not affect __________________.
the formant frequencies
A speaker’s pitch is determined by the ___________________.
rate of vocal fold vibration (i.e., the fundamental frequency)
Changing the size of the vocal tract, by moving the articulators, affects ____________________, but not ________________________.
Changing the size of the vocal tract, by moving the articulators, affects the formant frequencies, but not the speaker’s fundamental pitch (fo)....
When using the labels F1, F2, F3, etc when you refer to formants, remember that:
F1 DOES NOT EQUAL or imply the fundamental frequency

F1 is actually the harmonic located immediately above the Fo

Fo is a function of the glottal source, not the resonance of VT
When using the labels F1, F2, F3, etc when you refer to formants, remember that:
F1 DOES NOT ______________
EQUAL or imply the fundamental frequency
When using the labels F1, F2, F3, etc when you refer to formants, remember that:

F1 is actually ________
The harmonic located immediately above the Fo
When using the labels F1, F2, F3, etc when you refer to formants, remember that:

Fo is a ______________, not _______
Function of the glottal source, not the resonance of VT
Male Example
Fund freq = 100 Hz
First resonant freq =
485 Hz (F1); F2 = F1 x 3; F3 = F1 x 5
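One way to see where a value near 485 Hz and the x3 / x5 multipliers could come from is to treat the vocal tract as a uniform tube closed at the glottis and open at the lips, whose resonances fall at odd multiples of c / 4L. This is a minimal sketch, not the card author's derivation: the 17.5 cm tube length and the 340 m/s speed of sound are assumptions chosen for illustration, and `tube_formants` is a hypothetical helper name.

```python
def tube_formants(length_m, speed_of_sound=340.0, count=3):
    """Resonances of a uniform tube closed at one end and open at the other.

    F_n = (2n - 1) * c / (4 * L), so the resonances are odd multiples of F1.
    """
    return [(2 * n - 1) * speed_of_sound / (4 * length_m)
            for n in range(1, count + 1)]

# Assumed adult male vocal tract length of 17.5 cm (not stated on the card)
f1, f2, f3 = tube_formants(0.175)
print(round(f1, 1), round(f2, 1), round(f3, 1))
```

Note that the 100 Hz fundamental does not appear anywhere in the calculation: the resonances depend only on the tube, which matches the cards' point that Fo is a property of the source, not the filter.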
It is not the actual values of formants that allow you to hear an /a/ vs. an /i/.
Rather it is __________________________.
Rather it is how the formant frequencies compare to each other
When a speaker produces the vowel /i/ we hear that because of _____________
...How F1 compares to F2 and F3
It is the pattern of formants in the output
Physiologically, formant frequencies are related to ___________________
Oral and pharyngeal volumes
The frequency of F1 is generally related to ___________________.
The volume & length of the pharyngeal cavity
The frequency of F2 is generally related to _______________________.
The volume & length of the oral cavity
The frequency of F1 is generally related to _____________ and
the frequency of F2 is generally related ___________________.
The frequency of F1 is generally related to the volume & length of the pharyngeal cavity.

The frequency of F2 is generally related to the volume & length of the oral cavity
Always remember that the range of the frequencies that will resonate within any tube depends on the .....
VOLUME, SIZE, AND LENGTH OF THE TUBE
If the tongue is positioned to create a smaller oral cavity relative to that of the pharyngeal cavity you get:
Lower F1 frequency (relating to pharyngeal cavity volume)

Higher F2 frequency (relating to oral cavity volume)
If the tongue is positioned to create a smaller oral cavity relative to that of the pharyngeal cavity you get:

Why?
Lower F1 frequency (relating to pharyngeal cavity volume)

Why? – the larger volume of pharynx will resonate lower frequencies of the glottal source

Higher F2 frequency (relating to oral cavity volume)

Why? – the reduced length of the oral cavity boosts the high frequency of the glottal source
If the tongue is positioned to create a larger oral cavity relative to the pharyngeal cavity you get:
Higher F1’s- relative to where they “should be”

Lower F2’s – relative to where they “should be” (remember to get this formant we multiply F1 by 3!)

Why?- the opposite reasons
Traditionally, vowels are classified primarily according to ...
...tongue body location in the oral cavity.
Tongue height and advancement are...
...two dimensions that can be manipulated by the speaker to change the resonance features of the sound.
Schematically, we can plot Tongue height and advancement dimensions and the sounds that emerge from them on ______________
A graph called a vowel quadrilateral
Vowel quadrilateral
Tongue height and advancement dimensions and the sounds that emerge from them plotted on a graph called a vowel quadrilateral


Horizontal plane represents tongue advancement

Vertical plane represents tongue height
Horizontal plane on the vowel quadrilateral represents....
Horizontal plane represents tongue advancement
Vertical plane on the vowel quadrilateral represents....
Vertical plane represents tongue height
Tongue is a movable wall/barrier that influences...
...volumes in the oral and pharyngeal area.
Vowel Quadrilateral physiologically represents __________.
tongue placement
A listener can literally "hear" the position of the speaker's tongue body if you understand ______________________.
The trade-off in regional volumes with respect to tongue position.
There are some orderly generalities with regard to the formants that we can extract:
F1 value is mostly influenced by tongue body height (high or low-ness) position.

F2 value is mostly influenced by tongue body Anterior-Posterior position (front-ness or back-ness), advancement
Tongue body height relates to:
the high or low-ness position
Tongue body Anterior-Posterior position relates to:
The front-ness or back-ness advancement
F1 value is mostly influenced by __________ in relation to the tongue body.
tongue body height position

(high or low-ness)
F2 value is mostly influenced by __________ in relation to the tongue body.
Tongue body Anterior-Posterior position advancement

(front-ness or back-ness)
REMEMBER THIS…
F1 is influenced by =

F2 is influenced by =
F1 = pharyngeal cavity and tongue height

F2 = oral cavity and tongue advancement (movement in anterior-posterior direction)
F1 is influenced by what two aspects?
F1 = pharyngeal cavity and tongue height
F2 is influenced by what two aspects?
F2 = oral cavity and tongue advancement (movement in anterior-posterior direction)
High vowels (/i/ & /u/) have _____________F1’s, than Low vowels ( _______ cavity is ______)
High vowels (/i/ & /u/) have LOWER value F1’s, than Low vowels (PHARYNGEAL cavity is LARGER)
Two high vowels:
(/i/ & /u/)
Low vowels (/ae/ & /a/) have __________________ F1’s than High vowels (______ cavity now _______)
Low vowels (/ae/ & /a/) have HIGHER value F1’s than High vowels (PHARYNGEAL cavity is now SMALLER)
Two low vowels:
(/ae/ & /a/)

HIGHER value F1’s than High vowels

(pharyngeal cavity now smaller)
Back vowels (/u/ & /a/) have ____________ F2’s than Front vowels (_____ cavity is ______)
Back vowels (/u/ & /a/)

LOWER value F2’s than Front vowels

(ORAL cavity is LARGER)
Two back vowels:
Back vowels (/u/ & /a/)

LOWER value F2’s than Front vowels

(ORAL cavity is LARGER)
Front vowels (/I/ & /ae/) have ____________ F2’s than Back vowels (_____ cavity is ______)
Front vowels (/I/ & /ae/)

HIGHER value F2’s than back vowels

(ORAL cavity now SMALLER)
Two front vowels:
Front vowels (/I/ & /ae/)

HIGHER value F2’s than back vowels

(ORAL cavity now SMALLER)
The relative value of F1 is related to ______________.
The relative value of F2 is related to __________________.
The relative value of F1 is related to tongue height.

The relative value of F2 is related to the anterior-posterior position of the tongue (tongue advancement).
The relative value of F1 is related to ______________.
The relative value of F1 is related to tongue height.
The relative value of F2 is related to __________________.
The relative value of F2 is related to the anterior-posterior position of the tongue (tongue advancement).
Formant patterns of Vowels (refer to formants of schwa)

Front Vowel Patterns
Large separation between F1 and F2

Small separation between F2 and F3
Formant patterns of Vowels (refer to formants of schwa)

Back Vowel Patterns
Small separation between F1 and F2

Large separation between F2 and F3
Small separation between F1 and F2

Large separation between F2 and F3
Back Vowel Patterns
Large separation between F1 and F2

Small separation between F2 and F3
Front Vowel Patterns
Formant patterns of Vowels (refer to formants of schwa)

Central Vowel Patterns
Uniform separation between F1, F2, and F3 (formants are equally spaced)

However, the r-colored vowels (-er) have a low F3, giving them a relatively small F2 – F3 difference

The value of F1 is approximately 500 Hz for an adult male speaker

All of this info has had to do with adult males and relative to the formants using schwa
Uniform separation between F1, F2, and F3 (formants are equally spaced)
Central Vowel Patterns
For [i]
(the vowel in the English word feed)

Described as High front unrounded:

The tongue is at the top of the mouth

The tongue is pushed towards the front of the mouth

The lips are spread (i.e. not rounded)
For [u]
(the vowel in food)

High back rounded:

The tongue is at the top of the mouth (high)

The tongue is bunched towards the back of the mouth

The lips are rounded
For [ɑ]
(the vowel in far)

Low back unrounded:

* The tongue is lowered

* The tongue is at the back of the mouth

* The lips are not rounded
For [ɜ]
(the vowel in fur)

Mid central neutral:

* The tongue is neither raised nor lowered; it is in a mid position

* The tongue is bunched neither at the front nor the back of the mouth; it is central

* The lips are in a neutral position
Lip Rounding Patterns
In English,

back vowels tend to be produced with lip rounding

front vowels are produced with little or no lip rounding.

The acoustic correlate of lip rounding is lowering of all of the formant frequencies

Lengthens tube = greater volume =passes lower frequencies

Lip rounding effectively lengthens the vocal tract
In English, back vowels tend to be produced _____________ and front vowels are produced __________.
In English, back vowels tend to be produced with lip rounding and front vowels are produced with little or no lip rounding.
Lip rounding effectively_________.
lengthens the vocal tract
The acoustic correlate of lip rounding is ____________________.
lowering of all of the formant frequencies
In English, back vowels tend to be produced _____ lip rounding.
WITH
In English, front vowels are produced _______ lip rounding.
WITH LITTLE OR NO
The acoustic correlate of lip rounding is lowering of ____________. This means the tube______and volume__________.
all of the formant frequencies

Lengthens tube = greater volume THUS passes lower frequencies
Remember that vowels….
Vowels are produced with a relatively open vocal tract and laminar flow throughout the tract.

Vibration of the vocal folds is the sound source for vowel production
- Vowels are then all voiced

The shape of the vocal tract determines the resonance pattern (formants) for a particular vowel.

Vowels will have many discrete regions of energy (formants) and are the longest duration of all phonemes.
Vowels are produced with a _______ vocal tract and __________ throughout the tract.
Vowels are produced with a relatively open vocal tract and laminar flow throughout the tract.
Vibration _______________________ vowel production

Vowels are ALL _______
Vibration of the vocal folds is the sound source for vowel production

Vowels are then all voiced
The shape of the vocal tract determines ___________________.
The resonance pattern (formants) for a particular vowel.
Vowels will have many discrete regions of ____________ and are the ___________ of all phonemes.
Vowels will have many discrete regions of energy (formants)

and are the longest duration of all phonemes.
Consonants are formed by ______________
Regions of varying degrees of constriction produced by articulator motion and posturing
Three places to look for narrowing of VT (for English)
Lips

Alveolar ridge

Velum
Turbulence is generated at----
The three places of narrowing of VT

Lips
Alveolar ridge
Velum
Sound source is __________ noise
aperiodic
Source created through:
Pressure buildup

Forced airflow through articulation points or regions of constrictions
Two general “flavors” of consonants
Voiced consonants

Unvoiced consonants
Voiced consonants
Will have formants present to some degree

Higher freq diffuse energy is also visible in the background.
Unvoiced consonants
Unvoiced consonants will NOT have any formants.

Still a noise and still has frequency spectrum = can highlight particular frequencies which are resonant frequencies

They will also possess broad regions of diffuse energy in the output spectra
Voiced consonants will have...
Will have formants present to some degree
Unvoiced consonants will NOT have ______.
Unvoiced consonants will NOT have any formants.
Can the source-filter theory we have developed for vowels also account for consonant production?
YES

Even though the sound source is not vocal fold vibration, there is still a SOURCE of energy being created and a FILTER present in front of the sound source (resonating chamber)
Even though the sound source is not vocal fold vibration, there is still ________________ being created and a __________________.
Even though the sound source is not vocal fold vibration, there is still a SOURCE of energy being created and a FILTER present in front of the sound source (resonating chamber)
Our basic problem in studying the articulators of the vocal tract is that…
Besides the lips, the rest of the vocal tract is very difficult to visualize and access for detailed investigation.

No other species besides humans has verbal speech.

This limits our use of animal research strategies.

So we have to be creative in our approaches to studying the vocal tract’s neuromuscular subsystems.
Coarticulation effects
Wrong to assume that each sound is produced in isolation and then strung together to form a word (like bricks in a row).

Due to the highly sequential and rapid nature of speech, overlap in gesture performance is extensive (more like roofing tiles)
Coarticulation effects:
Wrong to assume that each sound is produced ____________
In isolation and then strung together to form a word (like bricks in a row).
Coarticulation effects:
Due to the highly sequential and rapid nature of speech, ______________
overlap in gesture performance is extensive (more like roofing tiles)
What do you think are the consequences of coarticulation effect on what we hear?
o The features of one sound influence the features of the next, or the one in front of it too.

o Acoustics of one sound becomes blended into the acoustics of the previous sound and the one coming up in time.
The features of one sound influence _______.
....the features of the next, or the one in front of it too.
Acoustics of one sound becomes blended into _________________
... into the acoustics of the previous sound and the one coming up in time.
Articulatory movements___________.
Articulatory movements overlap each other in time.
Coarticulation refers to __________________ .
How two or more articulators move at virtually the same time to produce two or more phoneme gestures
Coarticulation refers to how two or more articulators move at virtually the same time to produce two or more phoneme gestures which are _____________________.
Makes the whole issue of reading a spectrogram more complicated

Makes the whole issue of understanding the acoustic features of sounds more complicated.
We have to take into account the ____________ in which every other sound exists in order to understand __________________ of our language.
take into account the phonetic (or sound) context

in order to understand our ability to discriminate the phonemes of our language
Scaling Method
Listeners subjectively rate a person’s overall speech intelligibility.
Identification Method
Listener transcribes the speaker’s speech and identifies sounds phonetically.
Traditional assessments are perceptually based on:
Scaling Method

Identification Method
While widely used, Scaling Method and Identification Method are ____________ .
inadequate to understand the specific underlying biomechanical features of speech that are contributing to a decrease in intelligibility.
Two patients with the same intelligibility rating will likely have _________.
Two completely different underlying etiologies for the decreased intelligibility.

Patients may sound the same but for two totally separate reasons.

The general nature of the rating also makes your vector to intervention more difficult to determine.
What makes your vector to intervention more difficult to determine?
The general nature of the rating also makes your vector to intervention more difficult to determine.
The more specific you can be, the better and more efficient your efforts will be.
• Acoustical analysis is another important means to specify the speaker’s articulatory behavior.

• Like all other analyses we’ve described….

o Treatment can be targeted and geared to the individual’s patterns, to increase the probability that therapy is as effective and efficient as possible.
Acoustical analysis is another important means to ....
...specify the speaker’s articulatory behavior
Like all other analyses we’ve described… Treatment can be targeted and geared to the...
.... individual’s patterns, to increase the probability that therapy is as effective and efficient as possible.
Consonants are acoustically more _________________.
Consonants are acoustically more complex to understand than vowels
Consonants are acoustically more complex to understand than vowels

Hard to describe them with one set of measures:
Some have significant noise energy (/f/), while others don’t (/r/).

Some are produced by complete obstruction (/b/) others only need a narrowing (/s/) of the system.

Some are strictly oral in their sound transmission, while others involve the nasal transmission of energy.
Let’s study consonants based on acoustic and articulatory properties.
Stops,
Fricatives,
Affricates,
Glides,
Nasals,
Liquids.
STOPS
Formed by totally blocking airflow out of the vocal tract.

Block lasts from 50 to > 100 msec.

Requires closure at the:
Glottis (closure of vocal folds)
Velum (closure between back of tongue and roof of mouth) – k, g
Alveolar ridge (closure with tongue tip) – t, d
Lips – p, b

Stops (Plosives) are thus known as Obstruents.

Burst and release of built-up high pressure IS the sound energy or sound source

Stops can also be voiced or voiceless.

Voiced would have two energies (burst + voicing energy of VFs)

Depending on the position in the word, a stop may be followed by a brief burst of air (stop release).

Aspiration (turbulence of air)
Stops can be aspirated (h-like noise) or unaspirated (/pat/ vs. /spat/)
Stops are formed by....
Formed by totally blocking airflow out of the vocal tract.
Stops last...
Block lasts from 50 to > 100 msec.
Stops requires closure at the:
Glottis (closure of vocal folds)

Velum (closure between back of tongue and roof of mouth) – k, g

Alveolar ridge (closure with tongue tip) – t, d

Lips – p, b
Stops are also called ________ and are thus known as ______________.
Plosives

Obstruents.
Stops

Burst and release of built-up high pressure IS _____________________
IS the sound energy or sound source



Voiced would have two energies (burst + voicing energy of VFs)

Depending on the position in the word, a stop may be followed by a brief burst of air (stop release).
Stops can be ________ or ____________.
Stops can also be voiced or voiceless.
Stops can be ___________ or __________, depending on the position in the word.
Stops can be aspirated (h-like noise) or unaspirated (/pat/ vs. /spat/)

Depending on the position in the word, a stop may be followed by a brief burst of air (stop release).
Aspiration
turbulence of air

Stop may be followed by a brief burst of air (stop release).
Acoustic features of stop consonants used to cue_________.
Differences in stop articulation
STOP GAP or Silent gap
time of airflow blockage
Syllable initial and pre-vocalic
Closure (stop gap), followed by impaction.

Release (burst)

Transition to the vowel (formant transition)
Syllable-final and post-vocalic
Transition from vowel

Closure

Release (aspirated noise burst) or No release (unaspirated)
STOP GAP
STOP GAP or Silent gap – time of airflow blockage

Syllable initial and pre-vocalic
-Closure (stop gap), followed by impaction.
-Release (burst)
-Transition to the vowel (formant transition)

Syllable-final and post-vocalic
-Transition from vowel
-Closure
-Release (aspirated noise burst) or No release (unaspirated)
BURST –
explosive burst of air rushed through the opening, involving energy in most or all of the audible spectrum…

Aperiodic noise
Voiced Stops produce recognizable patterns of
FORMANT TRANSITIONS to the vowel

o Seen in diphthongs and transition from voiced stop to a vowel
__________________ cause the medium to move perpendicular to the direction of the wave's travel.
Transverse waves

Displacement of particles is at right angles to the direction of wave propagation (e.g., slinky motion)
Longitudinal waves cause
the medium to move parallel to the direction of the wave's travel.

Displacement of particles is in the same direction as wave propagation (e.g., sound waves)
Surface waves are....

Displacement is...
....both transverse waves and longitudinal waves mixed in one medium.

Displacement is circular (e.g., water motion)
Explain...
Red – hasn’t hit boundary until gray thing
- first incident
Hits boundary – bounces back
- reflected wave is reflected
- will go back to direction from which it came
Involved some part of sound wave bouncing back
- some surface examples:
mirror
hard, smooth surfaces will serve to reflect
Explain constructive interference.
Crest meets crests and doubles

Red (1) + blue (1) = green (2)
Explain destructive interference.
The two cancel each other out
Explain
- When two (or more) waves are in-phase (have amplitudes that are changing in the same direction) they interfere constructively, creating a resultant wave that has a greater amplitude than either of the first two individual waves by themselves.
-Waves and their amplitudes are adding up

- When the two waves have opposite-phases, they interfere destructively and the amplitude of the resultant wave is attenuated (lowered).
-Wave amplitudes take away from each other
When two (or more) waves are in-phase (have amplitudes that are changing in the same direction) they interfere constructively, creating a resultant wave that has a greater amplitude than either of the first two individual waves by themselves.
-Waves and their amplitudes are adding up

When the two waves have opposite-phases, they interfere destructively and the amplitude of the resultant wave is attenuated (lowered).
-Wave amplitudes take away from each other
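The in-phase and opposite-phase cases can be checked numerically by adding two sampled sine waves point by point. This is a rough sketch only: the 10 Hz frequency, unit amplitudes, and helper names are hypothetical choices for illustration.

```python
import math

def wave(amplitude, freq_hz, phase_rad, n=100, sample_rate=1000):
    """Sampled sine wave; the defaults give exactly one cycle at 10 Hz."""
    return [amplitude * math.sin(2 * math.pi * freq_hz * i / sample_rate + phase_rad)
            for i in range(n)]

def peak(samples):
    """Largest absolute displacement in the sampled wave."""
    return max(abs(s) for s in samples)

red = wave(1.0, 10, 0.0)
blue_in_phase = wave(1.0, 10, 0.0)       # same phase as red
blue_opposite = wave(1.0, 10, math.pi)   # shifted by half a cycle

# Superposition: the resultant is the point-by-point sum of the two waves
constructive = [a + b for a, b in zip(red, blue_in_phase)]
destructive = [a + b for a, b in zip(red, blue_opposite)]

print(peak(constructive))  # close to 2: amplitudes add up
print(peak(destructive))   # close to 0: amplitudes cancel
```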
Explain

ANY complex periodic waveform…
….is simply the linear summation of a series of sinusoids.

Lowest frequency is the “Base Freq” or Fo

Harmonics are:
Whole number multiples of the Fo
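A small sketch of the summation idea: build a complex wave by adding sinusoids at whole-number multiples of the Fo, then confirm that the result repeats once per fundamental period. The 100 Hz fundamental, the three harmonic amplitudes, and the sample rate are assumed values for illustration, not figures from the cards.

```python
import math

F0_HZ = 100          # assumed fundamental frequency
SAMPLE_RATE = 8000   # assumed sample rate: 80 samples per 10 ms cycle

# Harmonic number -> amplitude (hypothetical values)
harmonic_amplitudes = {1: 1.0, 2: 0.5, 3: 0.25}

def complex_wave(n_samples):
    """Linear sum of sinusoids at whole-number multiples of the fundamental."""
    return [sum(amp * math.sin(2 * math.pi * h * F0_HZ * i / SAMPLE_RATE)
                for h, amp in harmonic_amplitudes.items())
            for i in range(n_samples)]

samples = complex_wave(160)                 # two cycles of the fundamental
period_samples = SAMPLE_RATE // F0_HZ       # samples in one fundamental period
harmonic_freqs = [h * F0_HZ for h in harmonic_amplitudes]

# The complex wave repeats once per fundamental period, even though it
# contains faster components
repeats = all(abs(samples[i] - samples[i + period_samples]) < 1e-9
              for i in range(period_samples))
print(harmonic_freqs, repeats)
```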



Lower frequency - base frequency = fundamental frequency
Little more frequency
Highest frequency
Resultant sound = combination of all three – complex waveform
Spectrum – how we’re viewing (this simple) complex signal
1 line – fundamental frequency
Spectra Before LP filtering
Spectra After LP filtering
Bandpass filter: Draw
Illustrate broadly and narrowly tuned...
Draw a high pass filter
Cut-off Frequency (ies) – discrete point on the curve where the intensity (amp) value of a range of frequencies has decreased by 3 dB from the value of the peak intensity.
3 dB down defines the boundaries of our bandwidth.
Cut-off frequencies are also known as 3 dB down points
Describing the filtering process of a resonator mathematically requires the use of a Transfer Function
Input: glottal spectra (sound from our VF’s)
A transfer function defines the relationship between the inputs into a system and the outputs produced by the system.
In other words, it describes the shape of the filter.
First, Input is the sound source.
What happens next?
NEXT:
A system’s transfer function changes the amplitudes of the input signal’s energies
Application of transfer function:
Transfer function will “trim away” everything on outside of it
lowering the amplitudes
lower frequencies will have the highest amplitudes
most amplified
could relate to female’s voice (amplifying 200 – 300 Hz)
Application of Transfer Function
(works like a cookie cutter on a large sheet of dough)
What happens after this stage?
5) Output Sound Waveform: Transform energy into something with very different features from your input
Reminder: We are still working with the idea of a simple transfer function. A filter and a resonator ARE described by a transfer function

Simple filtering properties and transfer functions
NEVER just apply one transfer function all the time
VT (vocal tract) constantly changes shape
MANY transfer functions being applied to input
We can model the VT as series of interlinked tubes, that can change their shape and volume dynamically

VT straightened out can be modeled as series….

each ridge represents a new tube

each one has a different resonant frequency
VT model "ah"
Draw a spectrogram for /i/

"eeee" as in "beat" or "Easter"
High front
Draw a spectrogram for /I/

"iiii" as "pill" or "bill"
What is this formant?
ae
front low

ahhh as in "bat"
Draw a spectrogram for /E/
Front mid

"bet"
All formants are evenly spaced
This is a spectrogram for "uh"
or schwa
"a" as in pot

This is a low back vowel
Dog
"awe"
awkward

back
Draw a spectrogram for /u/
high back

"ooo" as in food
Draw a spectrogram for "U" or omega
uuuu as in good
au and ai diphthongs
au - "cow"

ai- "find"
A spectrogram is a 3 dimensional plot.
The horizontal dimension (X axis) represents time.
The vertical dimension (Y axis) represents frequency.
The Z-axis is intensity represented by variations in darkness.
Your “typical” spectrogram for /i/ and /u/
Dark – z axis – intensity (formants)
(Eee vs. ooo)
“uh” – schwa

Connect the formants (the colored lines)
Flip them over to see through time

Add time, frequency, and intensity (on z-axis)

Just because fundamental frequency changes does not mean formants will change
formants/resonant frequencies are dependent on shape of vocal tract, not the energy being put into it (glottal spectra → base/fundamental frequency)
pitch change → change in fundamental frequency, but NOT formants
Just because formants change does not mean fundamental frequencies change

Shape change → formant change
Pitch change → fundamental frequency change
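The independence of pitch and formants can be illustrated by passing two sources with different fundamentals through the same filter and checking where the output is strongest. This is a simplified sketch under loud assumptions: the single-resonance gain curve, the 500 Hz / 100 Hz filter settings, and the equal-amplitude harmonics are all hypothetical simplifications, not a model taken from the cards.

```python
import math

def filter_gain_db(freq_hz, formant_hz=500.0, bandwidth_hz=100.0):
    """Hypothetical single-resonance transfer function (gain in dB)."""
    detune = 2.0 * (freq_hz - formant_hz) / bandwidth_hz
    return -10.0 * math.log10(1.0 + detune ** 2)

def strongest_output_harmonic(f0_hz, max_hz=4000):
    """Pass equal-amplitude harmonics of f0 through the filter.

    Because every input harmonic is assumed to have the same level,
    the loudest output harmonic is the one with the highest filter gain.
    """
    harmonics = range(f0_hz, max_hz + 1, f0_hz)
    return max(harmonics, key=filter_gain_db)

low_pitch_peak = strongest_output_harmonic(100)    # harmonics at 100, 200, 300, ...
high_pitch_peak = strongest_output_harmonic(125)   # harmonics at 125, 250, 375, ...
print(low_pitch_peak, high_pitch_peak)
```

The harmonic spacing changes with pitch, but the energy peak stays where the filter puts it; only a change to the filter settings (the "shape") would move it.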

First formant at 500 Hz – resonant frequency
Explain
Just to look at
Uh vs ee
Eeee vs ooo vocal tract and glottal spectrum
Summary of source filter theory

Picture
formants
Spectrogram
Formant is not just one frequency.
All the freq’s that are in the passband make up the formant
Important features of a formant include:
Center frequency (aka, Formant Freq)
bandwidth
amplitude of the passband
3 dB down or cut-off
Formant name IS the center freq of pass-band, regardless of how broad the tuning is.


Spectrum (a spectrogram shows time; this does not)
y axis → intensity; x axis → frequency
Identify formant as 500 Hz, although bandwidth = 100 Hz
Amp = 30 dB
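The 500 Hz / 100 Hz / 30 dB example can be tied back to the 3 dB down definition with a short calculation. The sketch below assumes a simple single-peak resonance curve (a hypothetical shape, not given on the card) and scans it for the frequencies that lie within 3 dB of the peak. "3 dB" is treated as the usual rounding of the half-power point, 10·log10(2) ≈ 3.01 dB.

```python
import math

HALF_POWER_DB = 10.0 * math.log10(2.0)   # about 3.01 dB, the "3 dB down" point

def resonance_db(freq_hz, center_hz=500.0, bandwidth_hz=100.0, peak_db=30.0):
    """Hypothetical single-peak resonance curve (level in dB)."""
    detune = 2.0 * (freq_hz - center_hz) / bandwidth_hz
    return peak_db - 10.0 * math.log10(1.0 + detune ** 2)

freqs = range(0, 1001)                   # scan 0..1000 Hz in 1 Hz steps
levels = [resonance_db(f) for f in freqs]
peak_level = max(levels)

# Passband: every frequency whose level is within 3 dB of the peak
# (small tolerance so the exact half-power points are included)
passband = [f for f, level in zip(freqs, levels)
            if level >= peak_level - HALF_POWER_DB - 1e-9]
lower_cutoff, upper_cutoff = passband[0], passband[-1]
bandwidth = upper_cutoff - lower_cutoff
center = freqs[levels.index(peak_level)]
print(center, lower_cutoff, upper_cutoff, bandwidth)
```

The formant is still named by its center frequency (500 Hz) even though the whole 450 to 550 Hz passband contributes to it.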
Numbering Formants
F1, F2, F3
You are looking for distinct and separate bands of grayscale

Occasionally, the Voice Bar is seen.
Represents F0
Very hard to pick out of the spectrogram

Formants won't be seen during voiceless sounds
All vowels are voiced
True or false:

The EGG reflects the contact area of the vocal folds.
TRUE
True or false:

If the frequency of a vibration is 10 cycles per second, the period of that vibration is half a second.
FALSE
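A quick check of why the answer is FALSE, using the relation that period is the inverse of frequency (t = 1 / f). The function name is a hypothetical helper.

```python
def period_seconds(frequency_hz):
    """Period is the inverse of frequency: t = 1 / f."""
    return 1.0 / frequency_hz

period = period_seconds(10)        # 10 cycles per second
claim_is_true = (period == 0.5)    # the card's claim: half a second
print(period, claim_is_true)
```

At 10 cycles per second each cycle takes a tenth of a second, so the half-second claim fails.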
What is the major difference between STOPS and VOWELS?
Sound source for vowels is the nearly periodic vibration of the vocal folds. For STOPS the sound source is pressurized air forcefully exiting the oral cavity
What are the acoustic features of STOPS?
The SILENT GAP
The RELEASE BURST
FORMANT TRANSITIONS
and VOICE ONSET TIME
SILENT GAP
This is where the articulators are forming a blockage of air and building up pressure in the oral cavity
What important thing happens in VOICED STOPS?
In some voiced stops, a VOICE BAR, or a band of low frequency, is sometimes apparent.

This voice bar is an indication that vocal fold vibration is occurring during the articulatory closure and pressure build up.
VOICE BAR
Happens occasionally in VOICED STOPS

It is a band of low frequency on the spectrogram

An indication that vocal fold vibration is occurring during the articulatory closure and pressure build up.
RELEASE BURST
In stops, this is the second characteristic, and it follows the SILENT GAP.

A burst of APERIODIC sound follows the silent gap. On the Spectrogram, we see a vertical line extending into the high frequencies, representing a broad range of frequencies characteristic of aperiodic sound.

Lasts about 10- 30ms
Voiced Stops produce recognizable patterns of ....
FORMANT TRANSITIONS to the vowel


Seen in diphthongs and transition from voiced stop to a vowel
The BURST is typically evident for stops in which position?
.... word initial and medial position, but it doesn’t always occur in word final position
Bilabial Stops
VOICELESS p and VOICED b

Whole space in front of mouth is the “filter” so they have the biggest filter, see the spike but low frequency emphasis
Alveolar STOPS-
t and d

In front have a TINY little filter

High frequency
Velar STOPS-
(k and g)
medium frequency emphasis
(again think about shape of filter)
The burst (release) of the stop consonant may signal information about _____________ for voiceless stops
The place of articulation
Size of filter in Stop Bursts indicates what?
Smallest will pass highest frequency
Biggest will pass lowest frequency
If the filter is SMALLER, which kind of frequencies will it pass?
HIGHER
If the filter is LARGER, which kind of frequencies will it pass?
LOWER
Voiceless stops have a greater degree of _________ than voiced stops
ASPIRATION
ASPIRATION
NOISE GENERATED BY TURBULENT AIRFLOW

Aspiration noise is generated at the larynx as the vocal folds move from abducted to adducted
Seen as a visible bend in the formant pattern (dog-tail) during a voiced stop to vowel transition...
During the movement from closure for the voiced stop to the open vocal tract posture of the following vowel (or vice versa).

Formant transitions can show rising, falling, or relatively flat patterns.
Formant transition patterns in STOPS depend on....
Place of articulation for the voiced stop

Shape of the vocal tract of the adjacent vowel
Summarizing Formant Transitions

Bilabial Stops: /b/
F1 increases from nearly zero to the F1 frequency value of the vowel.

F2 increases from ~800 Hz to the F2 of the vowel.
Summarizing Formant Transitions

Alveolar Stops: /d/
F1 increases from nearly zero to the F1 frequency value of the vowel.

F2 starts at ~1800 Hz and rises or falls to match the F2 of the vowel.
Summarizing Formant Transitions

Velar Stops: /g/
F1 increases from nearly zero to the F1 frequency value of the vowel.

F2 & F3 start close together (~1300-2300 Hz.) but separate during the transition.
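A minimal sketch of how the F2 starting values on these cards predict the direction of the transition into a vowel. The function name, the 50 Hz "flat" tolerance, and the vowel F2 values used in the example are hypothetical. /g/ is left out because its card describes F2 and F3 separating rather than a single F2 starting value.

```python
# Approximate F2 starting frequencies (Hz) taken from the cards above.
F2_ONSET_HZ = {"b": 800.0, "d": 1800.0}


def f2_transition(stop: str, vowel_f2_hz: float, tolerance_hz: float = 50.0) -> str:
    """Describe the F2 movement from the stop's starting value to the vowel's F2."""
    onset_hz = F2_ONSET_HZ[stop]
    change_hz = vowel_f2_hz - onset_hz
    if abs(change_hz) <= tolerance_hz:  # assumed threshold for "relatively flat"
        return "flat"
    return "rising" if change_hz > 0 else "falling"


# Hypothetical vowel F2 targets: one high (front-like), one low (back-like).
for stop in ("b", "d"):
    for vowel_f2_hz in (2300.0, 900.0):
        print(stop, vowel_f2_hz, f2_transition(stop, vowel_f2_hz))
```

With these assumed targets, /d/ rises into the high-F2 vowel and falls into the low-F2 vowel, while /b/ rises into both, matching the pattern described on the cards.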
Voice Onset Time
Applies to stops only

Voice Onset Time is used to distinguish voiced vs. unvoiced contrast

VOT: refers to the time between the beginning of the stop burst and the onset of vocal fold vibration for the following vowel
VOT values:
Prevoicing (negative VOT):
- Voice onset precedes the burst release
Simultaneous voicing:
- Voice onset is simultaneous with the burst
Short Lag:
- Voice onset slightly follows the burst release (e.g., “puppy”: there is a brief period of time before the voice comes on)
Prevoicing
(negative VOT):
- Voice onset precedes the burst release
Simultaneous voicing:
- Voice onset is simultaneous with the burst
Short Lag:
- Voice onset slightly follows the burst release (e.g., “puppy”: there is a brief period of time before the voice comes on)
Length of VOT lag:
Short VOT lag
- Indicates voiced stops (0-30 msec); this holds for stops produced in isolation and differs in running speech

Long VOT lag:
- Indicates (aspirated) voiceless stops (greater than 30 msec)
Short VOT lag
- Indicates voiced stops (0-30 msec); this holds for stops produced in isolation and differs in running speech
Long VOT lag:
- Indicates (aspirated) voiceless stops (greater than 30 msec)
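The VOT categories above can be sketched as a small classifier. The 30 msec boundary comes from these cards and applies to stops produced in isolation; the function name and the example VOT values are hypothetical.

```python
def classify_vot(vot_ms: float, long_lag_threshold_ms: float = 30.0) -> str:
    """Label a voice onset time (msec, measured relative to the burst release)."""
    if vot_ms < 0:
        return "prevoicing (negative VOT)"      # voicing starts before the burst
    if vot_ms == 0:
        return "simultaneous voicing"           # voicing starts with the burst
    if vot_ms <= long_lag_threshold_ms:
        return "short lag (voiced stop)"        # 0-30 msec on the cards
    return "long lag (aspirated voiceless stop)"  # greater than 30 msec


# Hypothetical measurements, for illustration only.
for vot_ms in (-20.0, 0.0, 15.0, 60.0):
    print(vot_ms, "->", classify_vot(vot_ms))
```

In running speech the boundary shifts, so the threshold is left as a parameter rather than fixed.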
Stops – Summary
Acoustic cue signaling VOICING:
- Voice Onset Time (VOT)

Acoustic cues signaling PLACE of articulation:
- Spectrum of Burst – voiceless stops
- Formant Transitions – voiced stops

Acoustic cues signaling MANNER of articulation:
- Silence (Stop Gap)
- Plosion (Burst)
- Aspiration (for voiceless stops)
- Stereotypic rising F1 pattern for all voiced stops
Acoustic cue signaling VOICING in STOPS:
- Voice Onset Time (VOT)
(Vowels are longer before voiced stops, and shorter before voiceless stops)
Acoustic cues signaling PLACE of articulation IN STOPS:
- Spectrum of Burst – voiceless stops

- Formant Transitions – voiced stops
Acoustic cues signaling MANNER of articulation IN STOPS:
- Silence (Stop Gap)
- Plosion (Burst)
- Aspiration (for voiceless stops)
- Stereotypic rising F1 pattern for all voiced stops
Fricatives
Turbulence develops from forced airflow through constriction, or narrow passage-way

Obstacles (teeth, lips) present in front of constriction and source
Typically increase the amplitude of the noise
Sound source in Fricatives:
Turbulent noise

Fricative energy is much longer duration than stops

Turbulent noise is inharmonic (aperiodic) and has a broad (flat) spectrum: equal energy at many frequencies

This is noise, more specifically white noise, which has a flat spectrum
How does fricative energy compare in duration to stops?
Fricative energy is much longer duration than stops
Turbulent noise
Turbulent noise is inharmonic (aperiodic) and has a broad (flat) spectrum: equal energy at many frequencies
How do obstacles such as teeth and lips in front of the air passage and sound source affect the noise? (FRICATIVES)
Obstacles (teeth, lips) present in front of constriction and source
Typically increase the amplitude of the noise
How is noise filtered in fricatives?

Hint- In the same way as _________ are
Noise is filtered by the resonant characteristics of the oral cavity in front of the constriction in the same way that the glottal source is filtered for vowel production by supraglottal areas.
The place of articulation for fricatives depends on WHAT?
Frequency range of noise depends on place of articulation
Frequency range is a cue for place of articulation
Summary of FRICATIVE VOICING
Fricatives may be voiced or voiceless

Presence of a voice bar is a dead giveaway
In fricatives, noise is shaped by what?
Noise is shaped by the area resonances of the cavity in front of the narrowing
Name the fricatives... (9)
Interdental / θ /, / ð/
Labiodental / f/, / v/
Alveolar / s /, / z/
Palatal / ʃ /, / ʒ/
Glottal / h /
Affricates
Affricates – formed by combining the beginning of a stop and the release of a fricative

A sequence of a stop and a fricative

Palatal / tʃ /, / dʒ/…. “ch” and “j”
Affricates place of articulation....

What kind of linguistic unit are they?
Acting as a single linguistic unit (phoneme)

Produced at the same place of articulation
Affricates Characteristics:
They have a __________ followed by __________.
Affricates have a stop gap (silence) followed by intense frication.

Stop gap is very brief, so brief you may not even hear it.
In affricates, how is turbulence generated?
What is the frequency range of noise?
Turbulence is generated in similar way to fricatives

Frequency range of noise is the same as for a fricative

Whatever is highlighted at the “sh” is also highlighted at the “ch”
In affricates, transition & voicing cues _______________
Transition & voicing cues are similar to stops & fricatives
Duration of frication in affricates compares to fricatives in what way?
Duration of frication is shorter in affricates than in fricatives
Glides-
Formed when the tongue rapidly shifts from the position for a front vowel to the position for a back vowel and vice versa

Always voiced

Made with this changing tongue motion

Similar to vowels and diphthongs

Bilabial- /w/
Palatal - /j/ “y”
Are glides voiced or voiceless?
Always voiced

Similar to vowels and diphthongs
Glides are dependent on....
Dependent on coarticulation, placement in word, etc.
Glides move from where to where?
Gliding from one place of articulation to another (front vowel position to back vowel position)
Airflow production in GLIDES is....
Airflow production is intermediate

It is between laminar and turbulent.

Laminar – layered; airflow for vowels travels smoothly through the vocal tract

Turbulent – airflow hits obstructions, such as tongue

Slight turbulent component seen on spectrogram
Laminar
layered; airflow for vowels travels smoothly through the vocal tract
Turbulent
airflow hits obstructions, such as tongue
Do glides have formant transitions?
YES!
Glides (j w) have clearly visible formant transitions.

Formants will be moving, unlike the steady formants of vowels
Formant transitions in glides lack what?
Formant transitions lack the steady state portion of the signal you’d see in the diphthongs.
Glides have what kind of duration?
How does it compare to diphthongs?
Glides are of very short duration

Shorter than diphthongs

Look like really brief duration diphthongs
Formant transition characteristics in glides
Formant transitions lack the steady state portion of the signal you’d see in the diphthongs.

Glides are of very short duration

Shorter than diphthongs

Look like really brief duration diphthongs

Looks only like a formant transition b/w two other sounds

(Do not have steady state component)
What is another name for glides?
Also called semi vowels because of their properties
Liquids
Occur as the tongue forms a loose blockage within the oral cavity and air flows around the sides of the tongue

Palatal- /r/
Alveolar - / l /
Liquids differ from glides in what way?
Different from glides because the tongue is not really moving

/r/ does move, especially in other languages

Still more steady state than glides
What is the most difficult manner of articulation to visualize on a spectrogram?
Liquid
How are liquids produced?
They are always:
Liquids are always voiced.
Made without tongue motion, more steady state.
/ l /: lateral resonance (“love” instead of “bull”)
Tongue splits the oral cavity into two halves.

Air flowing on two sides of tongue
/ r /: retroflex (“red”)
Tongue tip pointing backward toward palate

Dependent upon which vowel is being produced
Steady-state formant values for / l / (adult male values) are __________________________
Steady-state formant values for / l / (adult male values) are stereotypical among different speakers

For adult males (but we do not need to know these numbers):

350 Hz for F1,
1300 Hz for F2,
2700 Hz for F3.
Liquid / r / Steady-state formant values in (adult male values) are __________________________
Liquid / r / has similar steady-state frequencies for F1 & F2…but F3 is much lower (at ~1600 Hz).

350 Hz for F1,
1300 Hz for F2

The formant frequencies begin at these values and then change to match the frequencies of the sound that follows. – moving from a liquid or glide to a vowel….(something)
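Since the steady-state F1 and F2 values on these cards are the same for / l / and / r /, F3 is what separates the two. A minimal sketch of that idea, picking whichever liquid has the nearer typical F3; the function name and the measured values in the example are hypothetical, and the reference values are the adult male figures from the cards.

```python
# Typical steady-state F3 values (Hz, adult male) from the cards above.
LIQUID_F3_HZ = {"l": 2700.0, "r": 1600.0}


def closest_liquid(measured_f3_hz: float) -> str:
    """Return the liquid whose typical F3 is nearest to the measured F3."""
    return min(LIQUID_F3_HZ,
               key=lambda phoneme: abs(LIQUID_F3_HZ[phoneme] - measured_f3_hz))


# Hypothetical F3 measurements, for illustration only.
for measured_f3_hz in (1700.0, 2600.0):
    print(measured_f3_hz, "->", closest_liquid(measured_f3_hz))
```

This only compares steady-state values; it does not model the formant transitions into the following sound.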
_____________are the most reliable auditory and visual cue for discrimination in liquids.
Formant transitions are the most reliable auditory and visual cue for discrimination.
There are three nasal phonemes in English. What are they?
Bilabial /m/
Alveolar /n/
Velar /ŋ/
Features of Nasals
Nasals are always voiced.

Produced by lowering the velum (opening VP (velopharyngeal) port), thus adding the nasal cavity to the rest of the vocal tract – nasal cavity cannot change shape!
-Coupling of oral and nasal cavities has interesting results- creates antiformant

Produced by completely obstructing the oral cavity and lowering the velum
-Allows for the radiation of sound energy through the nasal cavity.

Resonance features of nasals are all very similar,
-because the nasal region does not change shape (formed at lips and velar area)
Nasals are always _________
Nasals are always voiced.
Nasals are produced by _____________
Produced by lowering the velum (opening VP (velopharyngeal) port), thus adding the nasal cavity to the rest of the vocal tract – nasal cavity cannot change shape!

Produced by completely obstructing the oral cavity and lowering the velum

Coupling of oral and nasal cavities has interesting results- creates antiformant
Coupling of oral and nasal cavities does what?
Creates antiformant
Feature of nasals
Nasals allow what to pass through the nasal cavity?
Allows for the radiation of sound energy through the nasal cavity.
Nasal resonance features are __________.
Resonance features of nasals are all very similar, because the nasal region does not change shape.

– formed at lips and velar area
Source-Filter: Antiformants
If we think of “formants” as band-pass-shaped transfer functions, then antiformants are band-reject transfer functions.

Antiformants attenuate: they decrease the amplitude of frequency components of the source spectrum.
Coupling the nasal region to the rest of the vocal tract has the effect of ___________.
producing Antiformants
(or anti-resonances)

Because of mucus, hair cells, actual tissues (which are very sound absorbent)

Damps frequencies out instead of amplifying them
Antiformants are considered _________ functions.

What do they do to the amplitude of frequency components of the source spectrum?
Band-Reject transfer functions

Antiformants attenuate

They decrease amplitude of frequency components of the source spectrum.
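A small sketch contrasting a band-pass (formant-like) and a band-reject (antiformant-like) transfer function. It assumes idealized second-order filters, which is a simplification not taken from these cards; the 1000 Hz center frequency and 100 Hz bandwidth are hypothetical values.

```python
import math


def band_pass_gain(f_hz: float, center_hz: float, bandwidth_hz: float) -> float:
    """Gain of an idealized second-order band-pass filter (1.0 at the center)."""
    detune = center_hz ** 2 - f_hz ** 2
    return (f_hz * bandwidth_hz) / math.hypot(detune, f_hz * bandwidth_hz)


def band_reject_gain(f_hz: float, center_hz: float, bandwidth_hz: float) -> float:
    """Gain of an idealized second-order band-reject filter (0.0 at the center)."""
    detune = center_hz ** 2 - f_hz ** 2
    return abs(detune) / math.hypot(detune, f_hz * bandwidth_hz)


# Hypothetical center frequency and bandwidth, for illustration only.
CENTER_HZ, BANDWIDTH_HZ = 1000.0, 100.0
for f_hz in (250.0, 1000.0, 4000.0):
    print(f"{f_hz:6.0f} Hz  "
          f"band-pass {band_pass_gain(f_hz, CENTER_HZ, BANDWIDTH_HZ):.3f}  "
          f"band-reject {band_reject_gain(f_hz, CENTER_HZ, BANDWIDTH_HZ):.3f}")
```

At the center frequency the band-pass filter passes the source component unchanged while the band-reject filter removes it; far from the center the roles reverse.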
Nasals anti-resonances
Nasals in general have weak intensity because of sound absorption in the nasal area and the greater distance the energy must travel (after adding the nasal cavity)
Antiformants, seen as very weak areas, arise because:
Sound absorption quality of inner nasal region

Sound energy is heavily damped.

Winding nature of area around the turbinates

Frequencies within the bandwidth (BW) are attenuated, while frequencies outside the BW pass through.
Nasals have one strong _______

Why does this happen?
Nasals have one strong formant below 500 Hz
- Nasal formant (or nasal murmur)

Why does this happen?
- (extremely SLIGHT differences happen to separate)


Opening the pharyngeal cavity to the nasal cavity opens another branch (other than oral cavity) WHICH increases size of pharyngeal cavity, which would then highlight lower frequencies

Slight differences between nasal formants help to distinguish between the three nasal phonemes.
Opening the pharyngeal cavity to the nasal cavity opens another branch (other than oral cavity) WHICH DOES WHAT TO THE pharyngeal cavity? This would then highlight __________ frequencies.
Opening the pharyngeal cavity to the nasal cavity opens another branch (other than oral cavity) which increases size of pharyngeal cavity, which would then highlight lower frequencies
Slight differences between nasal formants help to distinguish between
three nasal phonemes
Suprasegmental effects
Three primary features:
Intonation:

Stress:

Duration:
Intonation:
Suprasegmental effects

Intonation:
“today is the last day of classes???”
Rapid variations in F0 (fundamental frequency)
Signify type of utterance: declarative, question, etc.
Stress:
Suprasegmental effects

Stress:
“Today IS the last day of classes”
Vary the intensity, frequency, and duration of a syllable.
Duration:
Suprasegmental effects
Duration:
“Today is the laaaaaaaaasssssst day of classes!”
Prolonging or shortening the duration of a sound
Coarticulation Effects
Wrong to assume that each sound is produced in isolation and then strung together to form a word (like bricks in a row)
Coarticulation Effects
Due to the highly sequential and rapid nature of speech,
...overlap in gesture performance is extensive (more like roofing tiles)

For example, when saying the word “chat”, you do not finish the “ch” completely, then move on to the /æ/, followed lastly by the /t/

Rather, the tongue tip is touching the alveolar ridge for “ch”, while at the same time your tongue is moving downward for the vowel

Before you even finish the vowel, the tongue is already on the move forward and upward for the /t/
What do you think the consequences of the coarticulation effect are on what we hear?
The features of one sound influence the features of the next sound, and of the one before it too
Acoustics of one sound become blended into the acoustics of the previous sound and the one coming up in time
Simply put, articulator movements overlap each other in time
Coarticulation refers to how two or more articulators move at virtually the same time to produce two or more phoneme gestures
Makes the whole issue of reading a spectrogram more complicated
Makes the whole issue of understanding the acoustic features of sounds more complex
We now have to take into account the phonetic (or sound) context in which every other sound exists in order to understand our ability to discriminate the phonemes of our language
Major factor in the limitation of computers to recognize speech
Brings up critical issues in terms of how infants even learn to recognize and produce individual words during speech-language development
Coarticulation makes the whole issue of understanding the acoustic features of sounds more complex
We now have to take into account the phonetic (or sound) context in which every other sound exists in order to understand our ability to discriminate the phonemes of our language

Major factor in the limitation of computers to recognize speech

Brings up critical issues in terms of how infants even learn to recognize and produce individual words during speech-language development
Coarticulation
The features of one sound influence the features of the next sound, and of the one before it too

Acoustics of one sound become blended into the acoustics of the previous sound and the one coming up in time
Articulator movements do what in coarticulation?
Simply put, articulator movements overlap each other in time
Coarticulation refers to how ______________ to produce two or more phoneme gestures
... two or more articulators move at virtually the same time to produce two or more phoneme gestures
How does coarticulation affect reading a spectrogram?
Makes the whole issue of reading a spectrogram more complicated
Coarticulation brings up critical issues in terms of _______________
how infants even learn to recognize and produce individual words during speech-language development