Speech Recognition Research Paper

Speech recognition is used for two main purposes. The first is dictation, which in the context of speech recognition means translating spoken words into text. The second is controlling the computer, that is, developing software that lets a user operate different applications by voice [1]. Some of its uses are described below:
1. People with disabilities can benefit from speech recognition programs. Speech recognition is especially useful for people who have difficulty using their hands; in such cases it can be used to operate a computer. It is also used in telephony for deaf users, for example to convert voicemail to text.
2. Individuals with learning disabilities who …
Saying something in a different tone can mean the exact opposite, as in the case of sarcasm. Although the sounds are the same, the tone is different. Speech recognizers are good at recognizing sounds but still have a hard time distinguishing tone. Much research is therefore being done on detecting the tone of an utterance and the emotion behind it, such as anger, happiness, or sadness. Tone and pitch are also important in certain languages, particularly Asian ones such as Chinese (lexical tone) and Japanese (pitch accent). For example, in Japanese, a word with two syllables can have different meanings depending on whether the accent falls on the first or second syllable, or no …
o Speaker-Independent SR System
This is a system that can recognize the speech of any speaker. It is the most difficult type of SR system to build, and it requires a large speech database for training.
Telephone services are open to anyone with a phone who is willing to dial a number, so a wide variety of speakers can be expected to use them. In contrast to dictation systems, where the speaker is known beforehand and performance can be optimized for that one speaker, recognition here must work as well as possible for as large a group as possible. Factors that affect the voice, such as race, dialect, non-nativeness, gender, age, and health, are therefore all variables for which the recognition system should cover as wide a range as possible. This makes the problem much harder, and performance is lower than for a system that knows the speaker beforehand. More training data is also needed, because the training set has to cover a lot of speaker variability.
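The effect of speaker variability on training can be illustrated with a small toy sketch. Everything in it is an assumption made for illustration: each utterance is reduced to one made-up numeric feature, each speaker shifts that feature by a fixed offset, and the recognizer is a simple 1-nearest-neighbour classifier. Real systems use far richer features and models; the sketch only shows why a training set that covers several speakers can generalize better to an unseen speaker.

```python
# Toy illustration only: all numbers and names below are hypothetical.
WORD_BASE = {"yes": 1.0, "no": 3.0}  # made-up feature value for each word
SPEAKER_OFFSET = {"A": 0.0, "B": 1.5, "C": -1.5, "D": 1.2}  # made-up per-speaker shift
JITTER = (-0.1, 0.0, 0.1)  # small variation between repetitions of a word


def utterances(speaker):
    """Return (feature, word) pairs for every word and repetition of one speaker."""
    return [
        (WORD_BASE[word] + SPEAKER_OFFSET[speaker] + j, word)
        for word in WORD_BASE
        for j in JITTER
    ]


def classify(feature, training):
    """1-nearest-neighbour: return the word of the closest training utterance."""
    return min(training, key=lambda pair: abs(pair[0] - feature))[1]


def accuracy(train_speakers, test_speaker):
    """Fraction of the test speaker's utterances that are recognized correctly."""
    training = [u for s in train_speakers for u in utterances(s)]
    test = utterances(test_speaker)
    correct = sum(classify(f, training) == word for f, word in test)
    return correct / len(test)


# Speaker D never appears in the training data.
single = accuracy(["A"], "D")            # trained on one speaker only
pooled = accuracy(["A", "B", "C"], "D")  # trained on several speakers
print("single-speaker training:", single)
print("multi-speaker training:", pooled)
```

In this toy setup the recognizer trained on speaker A alone confuses speaker D's "yes" with "no", while the one trained on speakers A, B, and C recognizes all of D's utterances, because one of the training speakers happens to sound similar to D.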
• Noise and
