In this seminar, titled "Implementation of Speech Recognition Technology Translating Business Process into Intelligence Process", the author discusses an advanced technology: speech recognition. Speech recognition is a technology that lets a computer receive input from, and return output to, the user via speech. In other words, the computer can understand what the user says and respond accordingly. Today, several types of speech recognition software are available off the shelf, offering a new way of interacting with the computer. Features of speech recognition technology include a command system, dictation, correction and more. Speech recognition has become a common technology that is widely implemented in many industries. Different industries gain different benefits from it, which is driving a trend towards its adoption. However, speech recognition technology has certain limitations, which are discussed in this seminar. Fortunately, solutions that address these limitations are now available in the market.
Fundamental knowledge of speech recognition technology
What is Speech Recognition Technology?
Speech recognition, also called voice recognition, is the process of converting an acoustic signal into a set of words that can be understood by a computer. The signal can be captured using a microphone or even a telephone. This process has evolved into an advanced technology that allows the user to control the computer by speaking. The technology works closely with speech recognition software to perform its function. Speech recognition (or voice recognition) software captures the words spoken by the user and converts them into a form the computer can understand. The converted words can then be displayed on the screen as the final result, issued as a command to an application, used for data entry and more.
Speech recognition applications include voice user interfaces such as voice dialling and call routing, which are built into most high-end smartphones nowadays. Voice dialling is a smartphone function that dials the person whose name the user speaks. Domestic appliance control and simple data entry are also speech recognition applications used by certain organizations. A good example is the Microsoft call centre, which uses speech recognition to serve its customers. Furthermore, speech recognition applications cover the preparation of structured documents such as radiology reports, speech-to-text processing integrated in word processors, and use in aircraft.
Types of Speech Recognizer
Speech recognizers can be grouped into several classes according to the kinds of utterances they are able to recognize. (Stephen C. Cook, 2002)
The isolated word recognizer is a lower class of speech recognition system. The name does not mean that it can only accept a single word; rather, the user can only pronounce one word at a time. The user must pause between words because the system processes input word by word. The system usually has two states, "Listen" and "Not-Listen". While it is processing a word, it switches to the "Not-Listen" state because it cannot accept another word until processing is complete.
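The two states described above can be pictured as a very small state machine. The following is only a sketch under stated assumptions: the class name, method names and the idea of an explicit "finish processing" step are hypothetical and are not taken from any real recognizer.

```python
from enum import Enum


class State(Enum):
    LISTEN = "Listen"
    NOT_LISTEN = "Not-Listen"


class IsolatedWordRecognizer:
    """Toy model: accepts one word at a time and ignores input while busy."""

    def __init__(self):
        self.state = State.LISTEN
        self.pending = None      # word currently being processed
        self.recognized = []     # words processed so far

    def hear(self, word):
        """Offer a spoken word; return True if accepted, False if ignored."""
        if self.state is State.NOT_LISTEN:
            return False         # still busy with the previous word
        self.pending = word
        self.state = State.NOT_LISTEN
        return True

    def finish_processing(self):
        """Complete processing of the pending word and resume listening."""
        if self.pending is not None:
            self.recognized.append(self.pending)
            self.pending = None
        self.state = State.LISTEN
```

A word spoken before `finish_processing()` is called is simply dropped, which is why the user has to pause between words.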
The continuous speech recognizer is an enhancement of the isolated word recognizer. It is more complex and requires a longer development cycle because of the components needed to recognize continuous speech. The user can speak to the computer naturally, at the speed of normal conversation.
The spontaneous speech recognizer is similar to the continuous speech recognizer but slightly more advanced. It accepts speech that is natural sounding and not rehearsed. Speech recognition usually requires the user to pronounce a word in a specific way, but spontaneous speech recognition does not. It can handle a variety of pronunciations.
A speaker-dependent speech recognizer requires the user to train the system by providing samples of his or her speech before the system can recognize words. This process is also known as user enrolment.
A speaker-independent speech recognizer has a large set of words pre-coded in the system, so the user does not need to train the system to understand them.
How Speech Recognition Technology Works
The ability of speech recognition technology to accept and process spoken language, speech-to-text, involves the following steps:
Audio received from the speaker through a microphone is turned into a waveform, which is a mathematical representation of sound. The waveform is then passed to a speech engine, which processes it and converts it into basic language units (phonemes) that the computer can work with. Words are then constructed from the phonemes. The speech engine analyzes the words to avoid misinterpreting words that sound alike. Finally, the finalized words are displayed on the screen or issued as a command.
Diagram 2.2.1 Process flow on how audio/speech is converted into commands/text
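The step in which words are constructed from phonemes can be illustrated with a toy lookup. This is only a sketch: the pronunciation table and phoneme symbols below are hypothetical, and a real speech engine uses statistical models rather than an exact dictionary match.

```python
# Hypothetical pronunciation table: phoneme sequence -> word.
PRONUNCIATIONS = {
    ("HH", "AH", "L", "OW"): "hello",
    ("W", "ER", "L", "D"): "world",
    ("OW", "P", "AH", "N"): "open",
}


def phonemes_to_words(phonemes):
    """Greedily match the longest known phoneme sequence at each position."""
    words = []
    max_len = max(len(key) for key in PRONUNCIATIONS)
    i = 0
    while i < len(phonemes):
        # Try the longest candidate first, then shorter ones.
        for length in range(min(max_len, len(phonemes) - i), 0, -1):
            chunk = tuple(phonemes[i:i + length])
            if chunk in PRONUNCIATIONS:
                words.append(PRONUNCIATIONS[chunk])
                i += length
                break
        else:
            # No known word starts here: mark it and skip one phoneme.
            words.append("<unk>")
            i += 1
    return words
```

For example, `phonemes_to_words(["HH", "AH", "L", "OW", "W", "ER", "L", "D"])` yields the two words "hello" and "world", which would then be displayed or interpreted as a command.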
The speech recognition engine relies on two main tools to operate with an acceptable level of speed and accuracy. The first is a grammar bank. It is the most important tool in a speech engine because the grammar defines the words that can be recognized. Therefore, a grammar bank must include nearly all the words in the dictionary. The second is a speaker profile, which plays a very important role in both speaker-independent and speaker-dependent speech recognition systems. With a speaker profile, the speech recognition software can identify and accommodate a user's unique speech patterns and accent. This tool helps ensure the accuracy, consistency and reliability of a speech recognition engine.
Types and features of speech recognition
Different Types of Speech Recognition Software
Windows Speech Recognition
Windows Speech Recognition is built-in software that ships with Windows Vista; its purpose is to enable Windows users to interact with their computers by voice. The software is designed for people who wish to limit or change their use of the mouse and keyboard while maintaining or increasing their productivity. With Windows Speech Recognition, users can perform actions such as starting an application, switching between applications and controlling the operating system.
Dragon Naturally Speaking
Dragon Naturally Speaking is a product developed by Nuance, whose stated purpose is to make voice work. It relies on advanced technology intended to improve the way people live and work. Dragon Naturally Speaking is a leading speech recognizer, and Nuance's offerings range from healthcare solutions that reduce doctors' paperwork so that they can focus on patient care, to solutions for companies that offer customer service or support, to imaging technologies that convert documents into digital files that can be searched easily. (www.nuance.com, 2010)
According to Speech Technology magazine, IBM ViaVoice was the market leader in 2007. It provides text-to-speech capabilities for computers as well as small mobile devices. Furthermore, it can shorten the development time and reduce the skill required to build advanced voice applications. It is a user-friendly developer toolkit powered by Eclipse technology. (www.01.ibm.com, 2010)
Features of speech recognition technology
Today, there are many speech recognition products on the market. However, almost all of them share some common features.
Every speech recognition product has a command system. The command system makes it easier to navigate by activating buttons and menus on the computer. The user simply says the label of the button he or she would like to press, and the command system presses it. This feature is especially useful for browsing the web: the software can discern the link that the user names and click it accurately. One command is particularly useful when the program has trouble recognizing where the user wants to click. This is the "show numbers" command, which displays a numbered box over every menu, link and button on the screen. The user then just says the number to select the item. With a proper command system, speech recognition software is very efficient.
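The idea behind the "show numbers" command can be sketched as a simple mapping from numbers to on-screen elements. The function names and the element labels below are hypothetical illustrations, not the API of any actual speech recognition product.

```python
def show_numbers(elements):
    """Assign a 1-based number to every clickable element on the screen."""
    return {number: element for number, element in enumerate(elements, start=1)}


def select_by_number(overlay, spoken):
    """Return the element for a spoken number, or None if it is not valid."""
    try:
        number = int(spoken)
    except ValueError:
        return None              # the utterance was not a number
    return overlay.get(number)
```

Because the user only has to say a short number, the recognizer chooses among a small, fixed set of possibilities instead of an arbitrary link text.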
Speech recognition software performs even better in dictation mode than in the command system. Dictation, which draws on the grammar bank, is the heart of speech recognition software. The most impressive aspects of dictation mode are its formatting, correction and punctuation capabilities. Accurate recognition of punctuation and correction commands is key to the speech engine. The user can easily move the cursor around the screen or select a word by saying it. Dictation mode also makes it easier for the user to format the selected text.
Speech recognition software predicts and fixes some incorrectly recognized words. If the user pronounces a word that the software cannot recognize, it selects the most likely word from a list of similar-sounding alternatives. For example, if the user appears to say "I want to go bag home", the software automatically corrects "bag" to "back". (http://www.microsoft.com/enable/products/windowsvista/speech.aspx, 2010)
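One plausible way to choose among similar-sounding alternatives is to look at the neighbouring words. The sketch below assumes a tiny table of word-pair counts; the counts and function name are invented for illustration, and the source does not state which method any particular product actually uses.

```python
# Hypothetical counts of how often two words appear next to each other.
BIGRAM_COUNTS = {
    ("go", "back"): 50,
    ("back", "home"): 40,
    ("go", "bag"): 1,
    ("bag", "home"): 1,
}


def pick_best(previous_word, alternatives, next_word):
    """Pick the alternative that fits best between its two neighbours."""
    def score(word):
        return (BIGRAM_COUNTS.get((previous_word, word), 0)
                + BIGRAM_COUNTS.get((word, next_word), 0))
    # max() keeps the first alternative when scores are tied.
    return max(alternatives, key=score)
```

With these assumed counts, `pick_best("go", ["bag", "back"], "home")` prefers "back", matching the correction in the example sentence.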
Speech recognition software includes an interactive tutorial that shortens the end user's learning curve. It effectively improves the user's understanding of how to control and command the software. At the same time, the software learns to recognize the user's voice while the tutorial is being carried out.
Users can personalize the speech recognition software by adding their own voice commands, each of which triggers a certain action.
The more frequently a user uses the speech recognition software, the more accurate it becomes, because the software adapts to both the user's speaking style and accent.
Methods of Handling Recognition Errors
In practice, a speech recognition system cannot be 100% accurate; it will make errors in recognizing the user's spoken input. It is therefore very important for a speech recognizer to handle such errors without breaking down. There are four main ways to handle recognition errors. (Biing-Hwang Juang, 1993)
Fail Soft Method
A speech recognition system that applies this method uses known task constraints to detect and correct recognition errors automatically. For example, when the user spells a name, the system can search for the appropriate name in a finite list, because the recognized name is constrained to the set of names in that list. Similarly, a speech recognition application that uses digit strings to refer to items chosen from a catalogue has the advantage of knowing when a recognition error has occurred, because the input required from the user is restricted to the entries of a given list. Thus, digit recognition errors can be detected and corrected, within the limits of the error correction capability.
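The catalogue example can be sketched as a nearest-match search over the list of valid codes. This is an illustration under stated assumptions: the function names, the use of a digit-by-digit (Hamming) distance and the limit of one wrong digit are all hypothetical choices, not the method described in the cited source.

```python
def hamming(a, b):
    """Count differing positions; strings of unequal length count as far apart."""
    if len(a) != len(b):
        return max(len(a), len(b))
    return sum(x != y for x, y in zip(a, b))


def correct_code(recognized, catalogue, max_errors=1):
    """Return the catalogue code the user most likely meant, or None."""
    if not catalogue:
        return None
    if recognized in catalogue:
        return recognized                    # valid code: no error detected
    ranked = sorted(catalogue, key=lambda code: hamming(recognized, code))
    best = ranked[0]
    best_distance = hamming(recognized, best)
    if best_distance > max_errors:
        return None                          # error detected but not correctable
    if len(ranked) > 1 and hamming(recognized, ranked[1]) == best_distance:
        return None                          # ambiguous: two codes equally close
    return best
```

An error is detected whenever the recognized string is not in the catalogue, and it is corrected only when exactly one valid code is close enough.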
In this method, the system asks the user for help when a recognition error happens. The user is asked to verify the first decision made by the system. If that decision is wrong, the system asks the user to verify the second decision, and the process repeats until the system arrives at the correct decision. In other words, the system lists the possible matches for a recognition error and lets the user choose the correct string from the list by saying its index number. An index number is used to prevent confusion and to make recognition easier.
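The verification loop and the selection by index number might look like the following sketch. The function names and the `user_confirms` callback are hypothetical; in a real system the confirmation would itself come from spoken input.

```python
def confirm_in_turn(candidates, user_confirms):
    """Offer each candidate in ranked order until the user accepts one."""
    for candidate in candidates:
        if user_confirms(candidate):
            return candidate
    return None                      # the user rejected every candidate


def choose_by_index(candidates, spoken_index):
    """Pick a candidate by its 1-based position in the displayed list."""
    if 1 <= spoken_index <= len(candidates):
        return candidates[spoken_index - 1]
    return None
```

For a list such as `["Austin", "Boston", "Houston"]`, saying "2" selects "Boston" without the user having to repeat a name that the recognizer has already confused once.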
A speech recognition system that uses the rejection method does not make a decision when a recognition error occurs. Instead, the system records all of the user's spoken input in a digital format. It rejects a small, limited percentage of the spoken input in order to reduce the error rate, and then passes the digital recording to a human operator, who corrects the recognition errors by listening to the spoken input.
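A minimal sketch of the rejection idea follows. It assumes that the recognizer attaches a confidence score between 0 and 1 to each utterance and that low-scoring utterances are the ones rejected; both the score and the 0.8 threshold are assumptions made for illustration and are not stated in the source.

```python
def triage(utterances, threshold=0.8):
    """Split (text, confidence) pairs into accepted and operator queues."""
    accepted, for_operator = [], []
    for text, confidence in utterances:
        if confidence >= threshold:
            accepted.append(text)
        else:
            for_operator.append(text)   # rejected: a human will listen to it
    return accepted, for_operator
```

Raising the threshold rejects more input and lowers the error rate of what is accepted, at the cost of more work for the human operator.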
Application of Speech Recognition Technology
Criteria for Speech-Recognition Deployment
The features and functions offered by speech recognition look convincing and worth implementing. However, several criteria must be considered when deciding whether a speech recognition system is suitable for deployment in an organization. (Lawrence Rabiner, 1993)
Tangible and Intangible Benefits
First of all, the organization must make sure that the proposed speech recognition system provides tangible or intangible benefits, for instance increased productivity or higher profit. Quite a number of proposed speech recognition systems fail to deliver their advantages and benefits to the organization. Therefore, deploying a proposed speech recognition system without first determining the benefits it will bring to the organization will lead to failure over time.
Secondly, the user friendliness of the proposed speech recognition system must be determined. A good speech recognition system makes users feel comfortable with the way it works and fits into an organization's culture easily. Therefore, the proposed system must be easy to use and provide an effective means of communication. For example, the speech recognizer should enter a correction mode if it fails to understand the user's spoken commands. User friendliness is essential to a successful deployment.
Thirdly, the accuracy of the proposed recognition system must be high. The system must achieve at least a specified level of performance on the task it is used for. For example, a speech recognition system with 95% word accuracy makes an error, on average, once in 20 tries, whereas a system with 99% word accuracy makes an error once in 100 tries. Therefore, an organization must make sure that the proposed recognition system is accurate enough to be reliable.
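The arithmetic behind these figures can be checked with a short calculation. The function names are hypothetical, and the second function adds an assumption not made in the text: that errors on different words are independent of each other.

```python
def tries_per_error(accuracy_percent):
    """Average number of tries per error for a given word accuracy."""
    if not 0 <= accuracy_percent < 100:
        raise ValueError("accuracy must be at least 0 and below 100 percent")
    return 100 / (100 - accuracy_percent)


def sentence_accuracy(accuracy_percent, words):
    """Chance that every word of a sentence is right (independence assumed)."""
    return (accuracy_percent / 100) ** words
```

`tries_per_error(95)` gives 20 and `tries_per_error(99)` gives 100, as stated above. Under the independence assumption, a 10-word sentence is recognized without any error only about 60% of the time at 95% word accuracy, compared with about 90% at 99%, which suggests why a few percentage points matter so much in practice.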
Real-Time Response
Lastly, the proposed speech recognition system must respond to the user in real time. The user should receive a response within 250 milliseconds of the end of the spoken input. If the system takes more than 3 seconds to respond, the deployment is considered a failure.
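The two thresholds above can be expressed as a small check. The function name and the label "delayed" for the band between 250 milliseconds and 3 seconds are assumptions, since the text only names the two extremes.

```python
def classify_response(latency_ms):
    """Classify a delay measured from the end of the spoken input."""
    if latency_ms < 0:
        raise ValueError("latency cannot be negative")
    if latency_ms <= 250:
        return "real-time"
    if latency_ms <= 3000:
        return "delayed"     # assumption: the text does not name this band
    return "failure"
```

A deployment team could log the latency of each recognition and count how many responses fall into each class during a trial period.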
Applications of Speech Recognition Technology in Different Industries
Speech recognition has been widely applied in different industries globally because of its usefulness and the benefits it can bring to an organization. The following industries have benefited from the implementation of speech recognition technology.
In 1994, speech recognition technology was introduced into the health care industry to help complete documentation. The early systems required the user to speak slowly and insert a short pause between words so that the recognizer could capture them. Furthermore, users had to enrol and train the system before it could recognize their voices. However, the technology evolved, and advanced speech recognition technology has served the health care industry since around the year 2000.
The main purposes of implementing speech recognition technology in the health care industry fall into a few categories. First of all, it can be used to record a patient's information (Mary-Marshall Teel, 1998). Speech recognition technology converts spoken language into digital text for medical reports, usually in connection with electronic patient records. A large-vocabulary, speaker-dependent speech recognizer is generally used for this purpose in clinical data entry, pathology, dental offices, radiology, hospital operating rooms and more. Nurses are able to enter data about a patient via speech from anywhere within a specific area by using a wireless headset. Doctors and nurses take half the time to complete documentation tasks that they would need using the keyboard and mouse. This improves the service quality and efficiency of a hospital or clinic by shortening documentation time and giving doctors more time to take care of their patients.
Secondly, it can be used in a speech-based interactive voice response (IVR) system that interacts with doctors and patients. The IVR system discussed here is specially designed to monitor a patient's pain levels. It has an integrated question-and-answer module to gather information about the location and level of pain. Such a system is a successful and effective way of monitoring patients in the ward or even at home. In addition, it allows nurses and doctors to provide better care because the system clearly records the time, location and level of pain for a particular patient. (Scott Durling, 2008)
Lastly, it can serve as a multi-language interpretation system. Because many potential patients have limited proficiency in English, speech recognition in the health care industry has been combined with other input modalities to enhance the usability of the system. Microsoft Speech SDK 5.1 was used to implement a system for nurses in an emergency department that recognizes Chinese speech in a command-and-control fashion. (Jo Lumsden, 2008) Furthermore, the system can translate a sentence into different languages for the patient. Patients can enter data or navigate through a list of menus in different languages. This enhances the usability and user friendliness of speech recognition in the health care industry.
Overall, speech recognition technology in the education industry has been shown to improve students' reading and writing skills. Speaking or reading aloud to the recognizer can increase oral reading efficiency and improve word recognition, pronunciation and reading fluency. Speech recognition technology is an effective way to maintain students' oral reading skills. (www.nuance.com, 2010)
Students with disabilities, especially those who have difficulty using a keyboard, benefit from speech recognition technology for accessibility purposes. It provides a great way for these students to keep pace with their non-disabled peers.
First and foremost, speech recognition helps students carry out daily tasks such as writing, taking notes and preparing presentations. Students find it easier to convert their ideas into written text via speech because they do not have to worry about spelling errors. In addition, students no longer have to cramp their fingers taking notes by hand.
"A student in our summer program was struggling with his writing; he was rushing to get out his thoughts and skipping words that he could not spell. However, after training with Dragon Naturally Speaking for about an hour and 15 minutes, this student was able to complete a high-quality, six-page paper on Helen Keller in a single evening." (Kathy Burris, 2010)
Second, students can use simple, easily understood commands to find information on the internet as well as files on a computer. For example, when a student says "Search Wikipedia for Speech Recognition", the information is located easily. Students can also use voice commands to send email to their teachers or friends to discuss their assignments.
Teachers can prepare their presentation slides via speech. They can spend more time on their research because speech recognition technology allows them to "type" three times faster than typing on a keyboard. Besides that, teachers can access students' work and provide feedback directly via speech through their computers. Most importantly, teachers often deal with a large number of emails each day. Using Dragon Naturally Speaking, they can reply to or forward email and control their email applications by voice. This simplifies their work so that they can be more productive.
Benefits of speech recognition technology
Speech recognition technology has spread rapidly across multiple industries globally. Companies in different industries have implemented robust speech recognition systems to support their business processes. The author strongly believes that speech recognition technology can bring significant value to certain industries. The following are the benefits of speech recognition technology.
Many businesses can use speech technology to reduce support costs. The best example is routine customer service inquiries. Speech recognition technology can replace some of the manpower needed in a customer service centre. Most importantly, a customer service centre that implements a speech recognition system can operate 24/7. This can reduce costs and improve customer satisfaction at the same time.
Speech recognition technology can reduce the number of live calls. Callers are offered options that let them gather the information they need without speaking to an agent. This reduces the amount of time agents need to remain on the line, so more agents are available to handle calls that cannot be resolved with self-service, and each agent can handle more calls. This shows that the technology can improve the efficiency and productivity of a company.
Speech recognition technology improves computer accessibility for visually impaired people and people who are unable to use a keyboard and mouse. It is extremely useful for people with vision problems, who can control the computer as if they were using conventional input devices. Furthermore, it enables students or patients with physical disabilities to enter text or control the computer verbally.
Ease of use
Accessing and controlling applications with voice commands is easy. Teachers and lecturers can record their speech and let students download the audio file for self-revision at home, so students who are unable to attend classes can "attend" virtually. Furthermore, speech recognition technology has been used in several computer games for teaching children below 5 years old. This shows that speech recognition technology is easy enough for a child below 5 years old to control.
Critical Evaluation on Speech Recognition Technology
Current issues for speech recognition technology
Over the years, computer users have often formed a bad impression of speech recognition technology because of poor user experiences. Users are typically frustrated by slow response and by inaccurate recognition. These problems are closely related to the following issues.
Noise can come from anywhere around us: a dog barking, a radio playing somewhere down the street, a car passing by, another person speaking in the background and more. Another kind of noise is the echo effect, which is produced naturally when a person speaks. The sound wave may bounce off an object near the speaker and arrive back at the microphone a few milliseconds later. This is unwanted information for the speech engine and leads to inaccurate recognition. (Markus Forsberg, 2003)
A speech engine depends heavily on processor speed for its conversion process. A faster processor lets the speech engine capture and convert speech more quickly. Conversely, a slower processor introduces delay in converting the sound wave into a form the computer can process. This is why users experience slow response from speech recognition technology.
Speed of speech
Humans speak naturally without pauses between words; pauses normally appear only after a phrase or a sentence. Furthermore, people speak at different speeds at different times. A tired person tends to speak more slowly and softly, whereas an angry or stressed person speaks louder and faster. This poses a tough problem for speech recognition.
Homophones are words that sound exactly the same but differ in spelling and meaning, for example the "tale" of a dog and the "tail" of a dog. Both words have exactly the same pronunciation. Even humans sometimes find it difficult to distinguish between homophones. A speech engine faces the same difficulty and may produce an inaccurate result.
Quality of microphone
The quality of the microphone affects the quality of the audio produced, and the audio quality in turn has a significant effect on the speech engine's recognition process. Poor-quality audio may deliver the wrong message or unreadable waveforms to the speech engine, leading to inaccurate recognition.
The automotive industry is becoming more and more competitive. Strong companies such as BMW, Honda and Toyota stand out in the industry and create a barrier for other companies trying to compete in such a market. Automotive companies compete in every aspect, including pricing, safety, design, style and technology. Audi is one of the automotive companies aiming to survive and gain a competitive advantage in this market. Audi aims to improve the technology used in each of its car models to provide drivers and passengers with a comfortable and unforgettable experience. At the same time, Audi wants to improve overall safety in an interactive automotive environment.
Audi developed a partnership with Nuance to bring speech recognition technology to the Audi A8's In-Car Multimedia Interface System. The Audi A8 is the newest model released by Audi in 2010. It now features music search, one-shot destination entry and more via speech technology. The following features are supported by speech technology in Audi A8 models:
One-Shot Destination Entry
A Global Positioning System (GPS) navigation tool is built into Audi A8 models. Unlike a traditional GPS unit, which requires the user to enter input on a touch screen, the Audi A8 allows drivers to enter a destination via a simple spoken command. For example, the driver can say "Kuala Lumpur, Jalan Genting Klang", and the GPS will start navigating to the destination. The Audi A8 model in the United States is slightly more advanced: the system removes the need to state the city and state. Instead, the user just needs to say "Street in Vicinity" to begin route navigation.
The speech-enabled system in Audi A8 models makes trips safer and more enjoyable for drivers and passengers. Drivers can browse through albums and access their favorite songs by speaking the artist, song title, genre or album title. For example, a driver who wishes to listen to a Michael Jackson song just has to say "Play artist Michael Jackson" or "Play title Thriller". Furthermore, drivers can select radio stations by frequency or name and play a CD or DVD via speech commands. Most importantly, the speech recognition system can support several languages in parallel.
Address Book and Phone
Speech recognition technology in Audi A8 models supports voice dialing. It allows drivers to store up to 2000 contact entries and 50 name tags for the contacts they access most often, for example "Dad" and "Wife". This function is fully customizable. It is very useful because the driver can concentrate on driving while calling family and friends.
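The relationship between contact entries and name tags can be sketched as two small lookup tables. Only the limits of 2000 contacts and 50 name tags come from the text; the class, its methods and the sample data are hypothetical and do not describe Audi's or Nuance's actual implementation.

```python
class AddressBook:
    MAX_CONTACTS = 2000   # limit stated in the text
    MAX_NAME_TAGS = 50    # limit stated in the text

    def __init__(self):
        self.contacts = {}    # full name -> phone number
        self.name_tags = {}   # lower-cased spoken tag -> full name

    def add_contact(self, name, number):
        if name not in self.contacts and len(self.contacts) >= self.MAX_CONTACTS:
            raise ValueError("contact list is full")
        self.contacts[name] = number

    def add_name_tag(self, tag, name):
        if name not in self.contacts:
            raise KeyError(name)
        key = tag.lower()
        if key not in self.name_tags and len(self.name_tags) >= self.MAX_NAME_TAGS:
            raise ValueError("name tag list is full")
        self.name_tags[key] = name

    def resolve(self, spoken):
        """Return the number for a spoken tag or full name, or None."""
        name = self.name_tags.get(spoken.lower(), spoken)
        return self.contacts.get(name)
```

A short tag such as "Dad" gives the recognizer a small, distinctive vocabulary for the most frequently called contacts, while full names remain available for everyone else.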
"Audi's MMI is an example of just how amazingly sophisticated in-car infotainment systems have become, while providing drivers with incredibly easy access to navigation, music and more via sample voice commands." (Arnd Weil, 2010)
According to recent research and surveys conducted by Nuance, a speech-enabled interactive system is one of the most important features that a car should have. It is useful for reducing driver distraction and improving the user experience of navigation and entertainment systems.
Evaluation of the Case Study
The author strongly agrees with the statement by Arnd Weil, the general manager of Nuance Automotive, that the speech-enabled multimedia interface in the Audi A8 can improve user experience and make navigation easier to access. In addition, the author agrees that applying speech recognition technology in the automotive industry can increase the safety of the driver and passengers in an interactive automotive environment.
First of all, one-shot destination entry makes the in-car Global Positioning System (GPS) easier to use. Drivers do not need to activate the system and find the route themselves using a touch screen. A driver's concentration is extremely important because it ensures safety on the road, and inattention is one of the key factors that cause accidents. Therefore, with the Multimedia Interface System built into Audi A8 models, drivers can concentrate on driving to ensure their own safety and that of their passengers. Nuance's Driver Distraction Study shows that short commands with few confirmation steps reduce overall driver distraction and reaction time by 47 percent compared to manual operation. (www.nuance.com/news/pressreleases, 2009)
Second, music search via speech enriches the user experience and gives drivers an enjoyable interaction with the Multimedia Interface System. Instead of browsing through albums one by one, drivers can now use a simple voice command to play a song, an album or even a CD. This is very convenient and user friendly. Drivers can have a more enjoyable trip because the speech-enabled Multimedia Interface System offers a new form of entertainment. In the author's view, drivers are also less likely to feel sleepy with this feature built into the car.
Thirdly, the speech-enabled Multimedia Interface System allows drivers to use voice dialing and stores up to 2000 contact entries. Making a phone call by hand while driving is prohibited because it can cause accidents. Voice dialing shortens the time needed to find a contact in a phone book that contains up to 2000 entries. Besides that, 2000 contact entries are enough for drivers to keep in touch with their friends and family. However, the author suggests that Bluetooth should be built into the speech-enabled Multimedia Interface System so that drivers can use a Bluetooth headset to talk on the phone instead of the loudspeaker. The reason is that the audio quality of the loudspeaker is lower than that of a headset. Furthermore, a driver might not want passengers to listen to a private conversation. To ensure privacy and safety, the author strongly suggests building Bluetooth into the Multimedia Interface System.
On top of that, the author suggests that speech recognition technology should be extended to the lighting and window systems of a car. For example, the driver could say "Light On" to switch on the interior light rather than pressing a button. Furthermore, it would be more convenient if the driver could lower and raise the windows by saying "Window Down" and "Window Up".
Linkage of Speech Recognition Technology with Final Year Project
Through the research conducted, author gained knowledge and a better understanding of speech recognition technology. Author realizes that speech recognition technology can benefit the users of Green Reconnect.
Therefore, author decided to implement speech recognition technology in the final year project, Green Reconnect. Green Reconnect is developed using Microsoft Blend and WPF technology. Author has implemented Windows Speech and Narrator in several modules. The first is the Biological Method Module, which consists of a series of biological methods for eliminating pests. In this module, once a method is selected, the system reads the words on the screen aloud to the user. The purpose of implementing speech technology in this module is to help farmers understand the content better. They can listen to the content instead of reading it, which is very helpful for farmers who are visually impaired. Farmers who prefer reading the content can choose to turn off the feature. Besides that, farmers can use simple commands such as "Next", "Back" and "Mute" to control the system. "Mute" stops the system from reading the instructions aloud, while "Next" and "Back" navigate through the system.
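The command handling described above can be illustrated with a minimal sketch. It is written in Python for brevity and is not the actual Green Reconnect code, which is built with WPF. The class name MethodReader, the page texts and the returned messages are all hypothetical, and the sketch assumes that the recognized phrase has already been delivered as text by a speech recognition engine.

```python
from dataclasses import dataclass


@dataclass
class MethodReader:
    """Hypothetical stand-in for a module screen that reads pages aloud."""
    pages: list          # text of each screen, in display order
    index: int = 0       # position of the page currently shown
    muted: bool = False  # True once the user has said "Mute"

    def handle(self, phrase: str) -> str:
        """Apply one recognized phrase and report what the system does next."""
        command = phrase.strip().lower()
        if command == "next":
            # Move forward, but never past the last page.
            self.index = min(self.index + 1, len(self.pages) - 1)
        elif command == "back":
            # Move backward, but never before the first page.
            self.index = max(self.index - 1, 0)
        elif command == "mute":
            self.muted = True
        else:
            # Anything else is ignored so that stray speech changes nothing.
            return "Unrecognized command"
        prefix = "Showing silently: " if self.muted else "Reading aloud: "
        return prefix + self.pages[self.index]


if __name__ == "__main__":
    reader = MethodReader(["Method 1 text", "Method 2 text"])
    for spoken in ["Next", "Back", "Mute", "Open"]:
        print(spoken, "->", reader.handle(spoken))
```

Matching the phrase against a small fixed vocabulary keeps the interaction predictable: an unrecognized phrase leaves the current page and the mute setting untouched.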
In author's opinion, speech recognition technology has benefited visually impaired farmers and improved the usability of the system. The research done for this seminar definitely helped author in implementing speech recognition technology in Green Reconnect.
In conclusion, speech recognition technology is a successful product that benefits different types of users across the world, including doctors, nurses, patients, students, teachers, drivers, business operators and more. Implementations of speech recognition technology can be seen in many places around the world, which indicates that the technology is significantly changing the way people work. In addition, speech recognition technology has changed the traditional way of doing business and converted the business process flow into a more intelligent process. For instance, it has changed the traditional way of recording a patient's information in a hospital and the way of jotting down notes in a school. Speech recognition technology gives individuals an opportunity to perform their tasks hands-free and three times faster than usual. The benefits it can bring to an organization cover monetary as well as intangible benefits. Author concludes that speech recognition technology is an advanced technology that is worth deploying for an organization that wants to move to the next level and compete with others in the market.
Appendix and References
Rabiner, Lawrence and Juang, Biing-Hwang, 1993. Fundamentals of Speech Recognition. New Jersey: PTR Prentice-Hall, Inc.
Cook, Stephen, 2000. Speech Recognition HOWTO. Vol 5. Pp 4-20.
Durling, Scott and Lumsden, Jo, 2008. Speech Recognition Use in Healthcare Applications. Pp 1-6.
Accessibility and Productivity Tools for Students and Teachers, 2010. Available at: <http://www.nuance.com/for-business/by-industry/education/dragon-education-solutions/index.htm> [Accessed 10 October 2010].
Forsberg, Markus, 2003. Why is Speech Recognition Difficult?
Bertuca, David J., 2000. Voice recognition software and OCLC: technology that works. OCLC Systems and Services. Vol 16. Pp 69-75.
Windows Speech Recognition. Available at: <http://www.microsoft.com/enable/products/windowsvista/speech.aspx> [Accessed 1 November 2010].
Embedded ViaVoice delivers IBM speech technology to mobile devices and automobile components. Available at: <http://www-01.ibm.com/software/pervasive/embedded_viavoice/> [Accessed 10 October 2010].
Desktop Speech Recognition, Speak Your Mind. Available at: <http://www.nuance.com/for-business/by-product/dragon/index.htm> [Accessed 10 October 2010].
Junqua, J.-C. and Haton, J.-P., 1995. Robustness in Automatic Speech Recognition: Fundamentals and Applications. Kluwer Academic Publishers.