Artificial Intelligent Browser



BANNARI AMMAN INSTITUTE OF TECHNOLOGY

Abstract

The Artificial Intelligent Browser is system software that makes web-based technologies accessible to visually challenged and travelling users. In this project, the contents of HTML and XML web pages are converted into voice using a text-to-speech (TTS) synthesis technique. The end user requests a URL, and the corresponding page is fetched from the web server as an HTML or XML document.

The file format for a web page is usually HTML (Hypertext Markup Language), identified in the HTTP protocol using a MIME content type. Most browsers natively support a variety of formats in addition to HTML, such as the JPEG, PNG and GIF image formats, and can be extended to support more through the use of plug-ins.

The combination of HTTP content types and URL protocol specifications allows web page designers to embed images, animations, video, sound and streaming media into a web page, or to make them accessible through it. The fetched page is processed by the content extractor algorithm, which extracts the necessary data blocks separately.

After parsing, the extracted text is synthesized and the audio output is played to the end user through a speaker. The benefit of this project is that it provides a free way to voice-enable websites, which is very useful for visually challenged users who want to listen to study tutorials available on the Internet. The project is also efficient, taking little time to synthesize text into audio.

OVERVIEW:

  • The Artificial Intelligent Browser is system software that gives voice access to web-based technologies.
  • In this project, the contents of HTML and XML web pages are converted into voice using a text-to-speech synthesis technique.
  • The end user requests a URL, and the corresponding page is fetched as an HTML or XML document.
  • The page is processed by the content extractor algorithm, which extracts the necessary data blocks separately.
  • After parsing, the extracted text is synthesized and the audio output is played to the end user through a speaker.

OVERALL ARCHITECTURE

METHODS

The project is divided into three modules, each with a specific process.

Browser design:

The Java browser is an application that enables a user to display and interact with text, images, videos, music and other information, typically located on a web page at a website on the World Wide Web.
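As a minimal sketch of the fetch step (my own illustration, not the project's code; the class and method names are hypothetical), the browser requests a URL and reads the response body back as a string:

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.net.URL;
import java.nio.charset.StandardCharsets;

public class PageFetcher {
    // Reads an entire stream into a String (UTF-8 assumed for simplicity;
    // a real browser would honour the charset in the Content-Type header).
    static String readAll(InputStream in) {
        try {
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buf = new byte[4096];
            int n;
            while ((n = in.read(buf)) != -1) out.write(buf, 0, n);
            return new String(out.toByteArray(), StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Fetches the page at the requested URL, as the browser front end would.
    static String fetch(String url) {
        try (InputStream in = new URL(url).openConnection().getInputStream()) {
            return readAll(in);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

The fetched HTML string is then handed to the content extractor module.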

Content Extractor:

The algorithm takes a web page as input and extracts the text from it.
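A rough illustration of this step, assuming regex-based tag stripping is acceptable for a sketch (the project's actual extractor works block by block, as described in the CONTENT EXTRACTOR section):

```java
public class TextExtractor {
    // Strips script/style blocks first, then all remaining tags, leaving the
    // visible text. A deliberately simple sketch, not the project's algorithm.
    static String extract(String html) {
        String noScripts = html.replaceAll("(?is)<(script|style)[^>]*>.*?</\\1>", " ");
        String noTags = noScripts.replaceAll("<[^>]+>", " ");
        return noTags.replaceAll("\\s+", " ").trim();
    }
}
```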

Text-to-speech converter:

The text-to-speech (TTS) algorithm converts the text into audio form, and the output is played to the end user through a speaker.

CONTENT EXTRACTOR

  • The input to the algorithm is a web page belonging to a given class of web pages.
  • The output of the algorithm is the set of primary content blocks in that class of web pages.
  • The first step of the algorithm is to use the GetBlockSet routine to partition each page into blocks.

GETBLOCKSET

The GetBlockSet routine takes an HTML page and an ordered tag set as input. It takes tags from the tag set one by one and calls the GetBlocks routine on each block in the set of blocks generated so far.

GETBLOCKS

GetBlocks takes a full document or a part of a document, written in HTML, and a tag as its input. It partitions the document into blocks according to the input tag.

For example, if the <TABLE> tag is given as input, it produces the DOM tree with all of the table blocks.

If the input tag is <TABLE> and there is no table structure in the HTML page, it does not partition the page; in that case, the whole input page comes back as a single block.
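Under simplifying assumptions (flat string scanning rather than a DOM tree, and no handling of nested same-name tags), the two routines can be sketched as follows; the method signatures are my own, not the project's:

```java
import java.util.ArrayList;
import java.util.List;

public class BlockPartitioner {
    // GetBlocks: partitions an HTML fragment into blocks delimited by the given
    // tag. If the tag does not occur, the whole input comes back as one block,
    // matching the fallback behaviour described above.
    static List<String> getBlocks(String html, String tag) {
        List<String> blocks = new ArrayList<>();
        String lower = html.toLowerCase();
        String open = "<" + tag.toLowerCase();
        String close = "</" + tag.toLowerCase() + ">";
        int from = 0;
        while (true) {
            int start = lower.indexOf(open, from);
            if (start < 0) break;
            int end = lower.indexOf(close, start);
            if (end < 0) break;
            blocks.add(html.substring(start, end + close.length()));
            from = end + close.length();
        }
        if (blocks.isEmpty()) blocks.add(html); // no such structure: single block
        return blocks;
    }

    // GetBlockSet: applies getBlocks for each tag of the ordered tag set in
    // turn, re-partitioning the blocks produced by the previous tag.
    static List<String> getBlockSet(String html, String[] orderedTags) {
        List<String> current = new ArrayList<>();
        current.add(html);
        for (String tag : orderedTags) {
            List<String> next = new ArrayList<>();
            for (String block : current) next.addAll(getBlocks(block, tag));
            current = next;
        }
        return current;
    }
}
```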

IDENTIFY PRIMARY CONTENT

After the blocks have been identified, the second step of the process involves identifying the primary content blocks and separating them from the non-content blocks.
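The separation rule itself is not spelled out here, so as a stand-in heuristic (my assumption, not the project's exact rule) one might pick the block with the most visible text, on the grounds that navigation and advertisement blocks tend to be short:

```java
import java.util.Comparator;
import java.util.List;

public class PrimaryContentPicker {
    // Heuristic stand-in: the primary content block is the one with the most
    // visible text once tags are stripped.
    static String primaryBlock(List<String> blocks) {
        return blocks.stream()
                .max(Comparator.comparingInt(
                        (String b) -> b.replaceAll("<[^>]+>", "").trim().length()))
                .orElse("");
    }
}
```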

FLOW DIAGRAM FOR CONTENT EXTRACTOR

TEXT-TO-SPEECH

  • FreeTTSSpeakable is an interface; one implementation of it is FreeTTSSpeakableImpl.
  • This implementation wraps the most common input forms, such as a String, an InputStream, or a JSML or XML document, as a FreeTTSSpeakable.
  • A FreeTTSSpeakable is given to a Voice to be spoken.
  • The Voice translates the text associated with the FreeTTSSpeakable into speech and generates the corresponding audio output.
  • To do so, the Voice converts the FreeTTSSpeakable into a series of Utterances.
  • The Utterances are processed by a chain of utterance processors.
  • The final output of the processors is the audio.
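As a concrete illustration of how these pieces fit together in code, speaking a string through FreeTTS typically looks like the sketch below. It assumes freetts.jar and its bundled "kevin16" voice are available on the classpath; the voice name is an assumption about the local installation, not part of the project description.

```java
import com.sun.speech.freetts.Voice;
import com.sun.speech.freetts.VoiceManager;

public class SpeakPage {
    public static void main(String[] args) {
        // "kevin16" is one of the voices bundled with FreeTTS; availability
        // depends on the local installation.
        Voice voice = VoiceManager.getInstance().getVoice("kevin16");
        if (voice == null) {
            System.err.println("Voice not found; check that freetts.jar is on the classpath");
            return;
        }
        voice.allocate();                       // load the voice data
        voice.speak("Hello from the browser");  // wraps the String in a FreeTTSSpeakable internally
        voice.deallocate();                     // release audio resources
    }
}
```

voice.speak(String) wraps its argument in a FreeTTSSpeakable internally; a FreeTTSSpeakableImpl built from an InputStream or document can be passed to the speak overload that accepts a FreeTTSSpeakable.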

PROCESS STEPS

1. Tokenization

The given input stream of text is broken into a series of Tokens. A token represents a single word in the input stream.
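This step can be sketched with a simple whitespace split (FreeTTS's own tokenizer also handles punctuation and whitespace classes; the class name here is my own):

```java
import java.util.Arrays;
import java.util.List;

public class Tokenizer {
    // Breaks the input text into a list of tokens, one per whitespace-separated word.
    static List<String> tokenize(String text) {
        String trimmed = text.trim();
        if (trimmed.isEmpty()) return Arrays.asList();
        return Arrays.asList(trimmed.split("\\s+"));
    }
}
```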

2. Token to words

The main role of TokenToWords is to look for various forms of numbers and convert them into the corresponding English words, for example 2002 into "two thousand two".
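A small sketch of this expansion for numbers up to 9999 (FreeTTS handles far more cases, such as ordinals, years and decimals; this class is my own illustration):

```java
public class NumberToWords {
    private static final String[] ONES = {"zero", "one", "two", "three", "four",
        "five", "six", "seven", "eight", "nine", "ten", "eleven", "twelve",
        "thirteen", "fourteen", "fifteen", "sixteen", "seventeen", "eighteen",
        "nineteen"};
    private static final String[] TENS = {"", "", "twenty", "thirty", "forty",
        "fifty", "sixty", "seventy", "eighty", "ninety"};

    // Converts 0..9999 into English words, e.g. 2002 -> "two thousand two".
    static String toWords(int n) {
        if (n < 20) return ONES[n];
        if (n < 100) return TENS[n / 10] + (n % 10 != 0 ? " " + ONES[n % 10] : "");
        if (n < 1000) return ONES[n / 100] + " hundred" + (n % 100 != 0 ? " " + toWords(n % 100) : "");
        return ONES[n / 1000] + " thousand" + (n % 1000 != 0 ? " " + toWords(n % 1000) : "");
    }
}
```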

3. Phraser

The phraser processor creates a phrase relation in the Utterance, i.e. how the Utterance is broken into phrases when spoken.

4. Segmenter


The segmenter determines where the syllable breaks occur in the Utterance, producing the syllable structure.

5. Pause Generator

It inserts a pause before the first segment of each phrase.

6. Intonator

It sets the accent and end-tone features of each item.

7. Durator

It determines the ending time for each unit in the segment list. Each unit is tagged with an end attribute that indicates the time, in seconds, at which the unit should be completed.
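The end-attribute idea can be illustrated with a deliberately simplified model that assigns a fixed duration per unit (my assumption; FreeTTS derives per-phone durations from statistical models):

```java
public class DuratorSketch {
    // Tags each unit with a cumulative end time in seconds, assuming every
    // unit lasts unitSeconds (a simplification; real durations vary per phone).
    static double[] endTimes(String[] units, double unitSeconds) {
        double[] ends = new double[units.length];
        double t = 0.0;
        for (int i = 0; i < units.length; i++) {
            t += unitSeconds;
            ends[i] = t;
        }
        return ends;
    }
}
```

For example, four units at 0.08 s each end at 0.08, 0.16, 0.24 and 0.32 seconds.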

8. Contour Generator

It calculates the frequency curve for the Utterance, which contains the starting point, midpoint and end point of each term.

9. Pitch mark Generator

It calculates the pitch marks for the Utterance.

10. Unit concatenator

It gathers all of the diphone data and joins it together. It extracts the unit sample data from each unit based on the target times stored in the result object that is added to the Utterance.

AFTER UNITCONCATENATOR PROCESSING

CONCLUSION:

  • The benefit of this project is that it provides a free way to create voice-enabled websites.
  • This is very useful for visually challenged users, who can listen to study tutorials available on the Internet.
  • The project is efficient, taking little time to synthesize text into audio.