Chapter 7.2 Speech Synthesis

by Katja-Malkova, Feb. 2015

Subjects: Speech Communication

12 Cards in this Set

Front
Back

	What is "canned speech" and when can it be used?	Speech signals are recorded and stored as they are, then played back unchanged; segment boundaries can be manipulated slightly. Used when the vocabulary is small.
	What are the classes of speech output systems? Describe them. What do they convert to speech? What steps do they include?	1. Announcement machine: phonetics to speech 2. Statement machine: concept-to-speech (CTS). Avoids some of the problems of the reading out machine, so more natural speech is achieved. 3. Reading out machine: text-to-speech (TTS). The simplest case. Steps: symbolic processing, prosody generation, speech signal generation.
	What steps does symbolic processing have?	1. Preprocessing: write out abbreviations, distinguish main clauses and subordinate clauses 2. Exception handling: names and foreign-language words get a different pronunciation ("multilingual synthesis") 3. Morphological analysis: for better sentence stress 4. Assigning word stress: rule-based or lexicon-based stress finding 5. Orthography-to-phonetic mapping 6. Word classes are identified 7. Syntactic and prosodic structure is analysed. Prosody is applied based on punctuation, word stress, and syntactic objects.
	What are the two approaches for speech prosody generation?	1. Based on rules - Fujisaki: Word and phrase components superimposed on a declination line, processed by a 2nd-order system - Adriaens: Copy contours - Mersdorf: LPC parametrization 2. Based on data - Neural networks - Classification trees
	Explain how formant synthesis works. What are the control parameters?	Idealized excitation signals: voiced impulse comb and unvoiced noise. Two formant filter lines: longer (3-5 filters/resonators) for vowels, shorter (1-2 filters/resonators) for fricatives. Aspirated sounds use both. Control parameters: fundamental frequency, frequencies and bandwidth of formant filters, amplitudes of excitation signals.
	How does an LPC synthesizer work? What are its control parameters?
	What is articulatory synthesis?	Parametric synthesis that models the exact movement of the articulators in the vocal tract. Computationally heavy. Source-filter model picture (diameters of different parts of the vocal tract).
	What are the advantages and disadvantages of parametric synthesis? Name another type of synthesis.	1. Formant synthesis: parameter values are difficult to determine 2. LPC synthesis: only 2 types of excitation -> small variability, but simple generation 3. Articulatory synthesis: detailed modelling of the control and the movements of the human vocal tract. Another type: concatenative synthesis, the concatenation of individual elements (phones, diphones, demisyllables, syllables, etc.)
	What is concatenation synthesis? How does it work?	A large amount of pre-recorded material is used; short pieces are chosen and concatenated so that the transitions are smooth.
	How does PSOLA work? What are the benefits of PSOLA in speech synthesis?	The recorded signal is cut into elementary components (at fundamental period markers), which are overlapped and added to make a new signal with a different fundamental frequency. Fundamental frequency, and thus prosody, can be adjusted by changing the distances between elementary components. Smooth changes of f0 make the speech more natural. However, the manipulation causes artefacts.
	What is unit-selection synthesis? How long are the units? How does it decide what units to select? What kind of labelling is needed? When does it work correctly and when not?	Choose speech units that are as long as possible and use them without manipulation. Requires lots of recordings (high effort). A cost function selects units that 1) match the text to be synthesized and 2) can be joined with minimal discontinuities in the signal. The two costs are called 1) unit costs and 2) concatenation costs. Units need to be labelled both phonemically and prosodically. The text to be synthesized needs labels too, produced in real time. Unit selection achieves quite natural speech if the text is found in the inventory.
	How can HMM be used for speech synthesis?	1. Best-fitting connection of elements/units -> representation as an HMM -> elements = states of the HMM, each linked to a parametric signal representation (e.g. LPC) 2. Synthesis = finding the optimum path through all states (elements)
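
The unit-selection and optimum-path cards above can be illustrated with a small dynamic-programming search. This is only a sketch under stated assumptions: the function name `select_units`, the unit labels such as `h@rec1`, and both toy cost functions are hypothetical and not taken from the cards. It shows how a unit cost and a concatenation cost can be combined to pick the cheapest sequence of candidates.

```python
from typing import Callable, List, Sequence, Tuple


def select_units(
    candidates: Sequence[Sequence[str]],
    target_cost: Callable[[int, str], float],
    concat_cost: Callable[[str, str], float],
) -> Tuple[List[str], float]:
    """Return the cheapest unit sequence and its total cost (Viterbi-style search)."""
    if not candidates or any(len(c) == 0 for c in candidates):
        raise ValueError("every target position needs at least one candidate")
    # best holds, per candidate at the current position,
    # (cost of the cheapest path ending in that candidate, the path itself).
    best = [(target_cost(0, u), [u]) for u in candidates[0]]
    for i in range(1, len(candidates)):
        new_best = []
        for u in candidates[i]:
            # Cheapest predecessor, including the cost of joining it to u.
            cost, path = min(
                ((c + concat_cost(p[-1], u), p) for c, p in best),
                key=lambda item: item[0],
            )
            new_best.append((cost + target_cost(i, u), path + [u]))
        best = new_best
    cost, path = min(best, key=lambda item: item[0])
    return path, cost


# Toy inventory (hypothetical): "phone@recording".
inventory = [["h@rec1", "h@rec2"], ["e@rec1"], ["l@rec2", "l@rec1"]]


def toy_target_cost(position: int, unit: str) -> float:
    return 0.0  # assume every candidate matches its target equally well


def toy_concat_cost(left: str, right: str) -> float:
    # Assume units cut from the same recording join without discontinuity.
    return 0.0 if left.split("@")[1] == right.split("@")[1] else 1.0


path, total = select_units(inventory, toy_target_cost, toy_concat_cost)
```

With these toy costs the search prefers units from a single recording, which mirrors the idea of choosing units that are as long as possible.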

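The formant synthesis card can be sketched in the same way: an idealized voiced excitation (impulse comb) is passed through a chain of formant resonators. This is a minimal sketch, not the method of any particular synthesizer. It assumes a standard second-order digital resonator with unit gain at 0 Hz, and the function names, the formant frequencies and bandwidths, and the sampling rate are illustrative values, not taken from the cards.

```python
import math
from typing import List, Sequence, Tuple


def impulse_train(f0: float, duration: float, fs: int) -> List[float]:
    """Idealized voiced excitation: one unit impulse per fundamental period."""
    period = max(1, round(fs / f0))
    n = round(duration * fs)
    return [1.0 if i % period == 0 else 0.0 for i in range(n)]


def resonator(x: Sequence[float], freq: float, bandwidth: float, fs: int) -> List[float]:
    """Second-order resonator: y[n] = a*x[n] + b*y[n-1] + c*y[n-2]."""
    t = 1.0 / fs
    c = -math.exp(-2.0 * math.pi * bandwidth * t)
    b = 2.0 * math.exp(-math.pi * bandwidth * t) * math.cos(2.0 * math.pi * freq * t)
    a = 1.0 - b - c  # normalizes the gain at 0 Hz to 1
    y1 = y2 = 0.0
    out = []
    for sample in x:
        y = a * sample + b * y1 + c * y2
        out.append(y)
        y2, y1 = y1, y
    return out


def synthesize_vowel(
    f0: float,
    formants: Sequence[Tuple[float, float]],
    duration: float = 0.05,
    fs: int = 16000,
) -> List[float]:
    """Pass the impulse comb through one resonator per (frequency, bandwidth) pair."""
    signal = impulse_train(f0, duration, fs)
    for freq, bw in formants:
        signal = resonator(signal, freq, bw, fs)
    return signal


# Illustrative control parameters: f0 in Hz and three (formant, bandwidth) pairs in Hz.
samples = synthesize_vowel(100.0, [(700.0, 80.0), (1200.0, 90.0), (2600.0, 120.0)])
```

The control parameters named on the card appear directly as arguments: the fundamental frequency sets the impulse spacing, and each resonator takes a formant frequency and bandwidth. Excitation amplitude is fixed at 1 here for brevity.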