Integrating Eye Tracking And Speech Technology Computer Science Essay

This report is divided into sections: defining the terms and their uses, how speech technology could be advanced through eye movement, how usability could be improved, and what impact this integration would have. As there has been little study of this topic, we draw on studies of eye tracking tools, eye movements in speech, eye activity and speech in human-computer interaction, and eye movements in speech technology.

1 Introduction

When people give spoken instructions to move or manipulate objects, they tend to gaze at the object they are referring to. More recently, a growing number of researchers in speech technology and human-computer interaction have experimented with this behaviour and are now using eye tracking to address diverse issues such as dialog system design, synthesized speech evaluation and automatic speech recognition.

However, there has been no forum for research on the ways in which eye movement may inform speech technologies, and few papers or journals concentrate on this problem. Hence, this paper consists of general studies on eye tracking and speech technology, with the aim of enhancing usability.

1.1 Usage of Speech Technology

Speech technology refers to technologies designed to replicate and respond to the human voice. It is widely used in the daily lives of people with impairments, as an aid to the voice-disabled, the hearing-disabled and the blind, and also to interact with computers without using a keyboard or mouse.

Speech technology could also broaden the senses used in computing: numerous computer-based reading tutors that rely on speech recognition and aim to promote reading in children have been developed (Gruenenberg, Katriel, Lai, & Feng, 2008; Williams, 2002).

Speech technology could help to solve challenges such as the development of literacy, especially reading ability.

1.2 Usage of Eye-Tracking

Eye-tracking is based on the idea that the eyes can be used as a window to the brain. When one is looking at images, the eyes do not stay in one place; they move from one location to another.

In fact, the gaze position leaps between examined locations, as shown in Figure 1.0.

Figure 1.0: Gaze Plot
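
The leaping gaze pattern described above is commonly analysed by grouping raw gaze samples into fixations (periods where the gaze stays close to one location) separated by jumps. The following is a minimal sketch of a dispersion-threshold approach, written in Python purely for illustration. The sample format, the function name detect_fixations and the two threshold values are assumptions made for this example and are not taken from any particular eye tracker or from the studies cited in this essay.

```python
from dataclasses import dataclass
from typing import List, Tuple

# Assumed sample format: (time in ms, x in pixels, y in pixels).
Sample = Tuple[float, float, float]


@dataclass
class Fixation:
    start_ms: float
    end_ms: float
    x: float  # mean horizontal gaze position during the fixation
    y: float  # mean vertical gaze position during the fixation


def _dispersion(window: List[Sample]) -> float:
    """Spread of a window: (max x - min x) + (max y - min y)."""
    xs = [s[1] for s in window]
    ys = [s[2] for s in window]
    return (max(xs) - min(xs)) + (max(ys) - min(ys))


def detect_fixations(
    samples: List[Sample],
    max_dispersion_px: float = 30.0,   # illustrative threshold, an assumption
    min_duration_ms: float = 100.0,    # illustrative threshold, an assumption
) -> List[Fixation]:
    """Group time-ordered gaze samples into fixations by dispersion."""
    fixations: List[Fixation] = []
    i, n = 0, len(samples)
    while i < n:
        # Find the first sample that makes the window span the minimum duration.
        j = i
        while j < n and samples[j][0] - samples[i][0] < min_duration_ms:
            j += 1
        if j >= n:
            break  # remaining samples are too short to form a fixation
        if _dispersion(samples[i:j + 1]) <= max_dispersion_px:
            # Extend the window while the gaze stays within the threshold.
            while j + 1 < n and _dispersion(samples[i:j + 2]) <= max_dispersion_px:
                j += 1
            window = samples[i:j + 1]
            fixations.append(Fixation(
                start_ms=window[0][0],
                end_ms=window[-1][0],
                x=sum(s[1] for s in window) / len(window),
                y=sum(s[2] for s in window) / len(window),
            ))
            i = j + 1  # continue after the fixation; the gap is the jump
        else:
            i += 1  # gaze is moving, so slide the window forward
    return fixations
```

Plotting the resulting fixation centres in order, connected by lines, would give a gaze plot of the kind referred to in Figure 1.0.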

For example, eye tracking information has been integrated into a simulated version of a personal satellite assistant (PSA), a robot developed at NASA (National Aeronautics and Space Administration).

1.3 Speech Technology & Eye-Tracking

Technology can increase efficiency, and therefore optimize results for a given level of resources without necessarily compromising educational objectives.

For example, with reference to Figure 1.2, when listening to an instruction such as "touch the starred yellow square" in a display containing only one starred object, subjects made an eye movement to the target object 250 ms after the end of the word "starred".

Figure 1.2: Program visualization tool used in the experiment with a representative scan-path superimposed.
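
A timing measure like the 250 ms figure above can be computed by aligning the end of a spoken word with the first subsequent fixation on the target object. The sketch below is one hedged way to do this in Python. The tuple layout, the function name fixation_latency_ms and the numbers in the usage example are assumptions chosen for illustration (the example values are invented to mirror the 250 ms case and are not data from the experiment).

```python
from typing import List, Optional, Tuple

# Assumed fixation format: (start in ms, end in ms, label of the object looked at).
LabeledFixation = Tuple[float, float, str]


def fixation_latency_ms(
    word_offset_ms: float,
    fixations: List[LabeledFixation],
    target: str,
) -> Optional[float]:
    """Time from the end of a spoken word to the first fixation on the target.

    Only fixations that start at or after the word offset are counted.
    Returns None when the target is never fixated after the word ends.
    """
    for start_ms, _end_ms, label in sorted(fixations):
        if label == target and start_ms >= word_offset_ms:
            return start_ms - word_offset_ms
    return None


# Illustrative values only: the word "starred" is assumed to end at 1200 ms.
example_fixations = [
    (900.0, 1150.0, "plain yellow square"),
    (1450.0, 1900.0, "starred yellow square"),
]
latency = fixation_latency_ms(1200.0, example_fixations, "starred yellow square")
print(latency)  # 250.0
```

Note that this sketch ignores cases where the listener is already looking at the target before the word ends; a fuller analysis would have to decide how to treat such anticipatory looks.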

1.4 Problem Statement

"Can usability be improved by integrating eye tracking with speech technology, for precise & accurate results?"

1.4.1 First sub-problem

The first sub-problem is what are eye tracking, usability & speech technology?

1.4.2 Second sub-problem

The second sub-problem is what are the tools & techniques used for eye-tracking, speech technology & usability?

1.4.3 Third sub-problem

The third sub-problem is how usability could be improved through this integration and what impact it would have.

1.5 Limitations

Research on this topic has been limited, so it will be hard to find authoritative studies on it. Also, the scope will narrow as the research is concluded.

1.6 Definition of terms

Speech technology relates to the technologies designed to duplicate and respond to the human voice. They have many uses, including to aid the voice-disabled, the hearing-disabled, the blind, and to communicate with computers without a keyboard, to market goods or services by telephone and to enhance game software.

Eye-tracking is based on the idea that the eyes can be used as "windows to the brain". When we are looking at pictures, our eye-movements are never performed continuously. In fact, the gaze position "jumps" between inspected locations.

Usability is the ease of use and learnability of a human-made object.

Human-computer interaction (HCI) is the study, planning and design of the interaction between people (users) and computers. It is often regarded as the intersection of computer science, behavioral sciences, design and several other fields of study. Interaction between users and computers occurs at the user interface (or simply interface), which includes both software and hardware.

1.7 Assumption

The respondent should have basic knowledge about human computer interaction, speech recognition & eye movement.

2.0 Related Work

Roy and Pentland (2002) used the link between speech and gaze to associate a spoken utterance with the visual appearance of the corresponding object.

At present, eye tracking appears to be a promising research method for evaluating synthetic speech. The results give insight into how similar the processing of synthetic speech is to the processing of human speech at the segmental and suprasegmental levels. It would be interesting to create a test bed environment in which it would be easy to compare the processing of a new speech synthesis system with reference fixation patterns.

Speech technology deals with voice recognition and speech synthesis. Speech synthesis enables the simultaneous use of both the auditory and visual sensory channels. This speeds up the user's learning as they interact with the software, since dual modality demands fewer cognitive resources. Voice recognition, which converts captured voice into commands, can be used to drive computer software in real time. A voice-driven interface can simplify complex tasks, in particular those that require a lot of eye co-ordination between the display and the keyboard. The accuracy of voice recognition has improved with the use of a feature called "Profiler" that is now available in many voice recognition software packages (Ravesi, 1999).

3.0 Senses Enhancing Usability

Researchers are studying how instructions can be given to a system to run an application through the human eye, voice, head, gestures and so on.

This research asks whether speech technology could enhance usability if eye-tracking is integrated into it. Speech technology is a branch of human-computer interaction that deals with voice recognition and speech synthesis. Voice recognition converts spoken instructions into commands that drive applications in real time. Usability improves when features such as speech technology are added to tasks that demand eye co-ordination on the display.

Much research is being done on enhancing speech technology and eye tracking, but attempts at their integration are still at an early stage.

The reason for choosing this topic is to understand how these technologies help users interact efficiently by removing the barrier of clicking with a mouse. Moreover, the reason to focus on this integration is to enhance human-computer interaction so that it provides accurate results and efficient processing.
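
To make the idea of replacing a mouse click concrete, the following Python sketch shows one possible way a spoken command could be combined with the current gaze position. It is an illustration under stated assumptions, not an established design: the transcript is assumed to arrive as plain text from some speech recognizer, the gaze point is assumed to arrive as screen coordinates from some eye tracker, and the names ScreenObject, object_under_gaze and resolve_command are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class ScreenObject:
    """A rectangular on-screen item (hypothetical structure for this sketch)."""
    name: str
    x: float
    y: float
    width: float
    height: float

    def contains(self, gx: float, gy: float) -> bool:
        return (self.x <= gx <= self.x + self.width
                and self.y <= gy <= self.y + self.height)


def object_under_gaze(
    objects: List[ScreenObject], gaze_xy: Tuple[float, float]
) -> Optional[ScreenObject]:
    """Return the first object whose rectangle contains the gaze point."""
    gx, gy = gaze_xy
    for obj in objects:
        if obj.contains(gx, gy):
            return obj
    return None


# Words that point at something without naming it (assumed list).
DEICTIC_WORDS = {"this", "that", "it"}


def resolve_command(
    transcript: str,
    gaze_xy: Tuple[float, float],
    objects: List[ScreenObject],
) -> Optional[Tuple[str, str]]:
    """Turn recognized speech plus gaze into an (action, object name) pair.

    The first word is treated as the action. If the rest of the utterance
    names an object, that object is used; if it only says "this", "that" or
    "it", the object currently under the gaze is used instead of a click.
    """
    words = transcript.lower().split()
    if not words:
        return None
    action, rest = words[0], words[1:]
    spoken = " ".join(rest)
    for obj in objects:
        if obj.name in spoken:
            return (action, obj.name)
    if DEICTIC_WORDS & set(rest):
        target = object_under_gaze(objects, gaze_xy)
        if target is not None:
            return (action, target.name)
    return None
```

With two buttons on screen, the utterance "open that" spoken while looking at the second button would resolve to that button without any pointing device, whereas the same utterance spoken while looking at empty space would return None, leaving the application free to ask the user to repeat the request.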

Currently, speech recognition and eye tracking are treated as two separate topics, on each of which researchers have made various suggestions and developed tools.