Abstract- With advances in web search engine techniques, delivering the correct answer to a user query has become an important task in query answering (QA) processing. The objective of QA processing is to deliver an exact answer to the user query rather than a set of documents containing diverse answers. Semantic-based query reformulation techniques can be used to extract the exact answer from the large number of documents retrieved by a search engine. In this paper we use the TREC-8, TREC-9, and TREC-10 collections as the training set, from which different types of questions and their corresponding answers are drawn. The QA system automatically extracts the answer from the documents retrieved by the search engine. The semantic relation and the syntactic tags between the query and answer pair are checked with the help of WordNet. Finally, a weight is assigned to each candidate answer according to its length, the distance between keywords, and the level of semantic similarity between the extracted query and answer pair. The query and answer system we propose differs from most other semantic reformulation learning systems.
Keywords- query and answer, search engine, semantic reformulation, WordNet.
Semantic reformulation plays a vital role in QA processing: it is used to extract semantically relevant information from the documents retrieved by a search engine. At the Text Retrieval Conference (TREC), all participants use the same corpus, supplied by the organizers, to evaluate their systems. We use the TREC corpus to evaluate our system. The question set used to evaluate a QA system is built mainly from factual questions whose answer is a named entity (NE). Natural language is the most habitual means of communication for human beings, and query answering is one of the handiest ways for people to exchange information.
As Brill, Dumais, and Banko observe, the rapid development and popularization of network technologies are breaking the limits of space and time, and the integration of semantic reformulation techniques in particular gives people a channel to ask and answer questions through human-computer interaction. A QA system uses semantic-based query reformulation to find the answer in a huge document collection. For example, suppose the search engine is asked the question "Who is the president of Stanford University?" A reformulation-based QA system searches the document collection for formulations such as "<NP>, the president of Stanford University" or "the Stanford University president <NP>" and instantiates <NP> with the matching noun phrase.
The QA system first parses the given question and identifies its answer type. The question reformulation module then uses the parsed version of the reformulation patterns to extract the answer from the sentences returned by the search engine. Writing reformulations by hand is a tedious task that must be repeated for each type of question and, in a multilingual system, for each language. This is why many researchers have attempted to acquire reformulations automatically.
The rest of the paper is organized as follows. Section 2 presents related work on QA systems that use semantic-based query reformulation. Section 3 describes the proposed architecture of the semantic-based reformulation system. Section 4 presents the evaluation of the system. Section 5 gives the conclusion and future work.
Mean reciprocal rank (MRR) is the standard measure used in TREC to evaluate the performance of a QA system. The QA system returns a ranked list of candidate answers for each question posed by the user. The reciprocal rank (RR) is the score for a single question x: if a correct answer is present in the candidate list, the score is the reciprocal of the rank of that answer; otherwise the score is zero. MRR is the mean of these scores over all questions. Ravichandran et al. used machine learning techniques in a QA system to automatically learn question and answer patterns along with a confidence score.
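The MRR computation described above can be sketched as follows. This is a minimal illustration, not the evaluation script used in the paper: the function names are hypothetical, and it assumes that an answer counts as correct when it matches a gold answer exactly after lowercasing and that the first correct answer in the list determines the rank.

```python
from typing import Iterable, List, Sequence, Set, Tuple


def reciprocal_rank(candidates: Sequence[str], gold: Set[str]) -> float:
    """Return 1/rank of the first correct candidate, or 0.0 if none is correct."""
    normalized_gold = {g.lower() for g in gold}
    for rank, answer in enumerate(candidates, start=1):
        # Assumption: exact match after lowercasing counts as correct.
        if answer.lower() in normalized_gold:
            return 1.0 / rank
    return 0.0


def mean_reciprocal_rank(runs: Iterable[Tuple[Sequence[str], Set[str]]]) -> float:
    """Average the reciprocal rank over all (ranked candidates, gold answers) pairs."""
    scores: List[float] = [reciprocal_rank(c, g) for c, g in runs]
    return sum(scores) / len(scores) if scores else 0.0


# Toy data, invented for illustration only.
example_runs = [
    (["Paris", "Lyon"], {"lyon"}),  # correct answer at rank 2 -> 0.5
    (["1895"], {"1895"}),           # correct answer at rank 1 -> 1.0
    (["x"], {"y"}),                 # no correct answer        -> 0.0
]
example_mrr = mean_reciprocal_rank(example_runs)
```

A question with no correct candidate still counts in the denominator, so a system is penalized for questions it cannot answer at all.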
Kwok et al. performed syntactic modification of query and answer pairs with the help of transformational methods. Soubbotin et al. first used reformulation patterns as the core of their QA system [3-5]. Our QA system uses a training corpus of 1343 question-answer pairs taken from the TREC-8, TREC-9, and TREC-10 collections.
The web can be used as a linguistic resource for QA reformulation. The techniques are based on identifying the various ways of expressing the answer context for a given natural language question. Soubbotin et al. used reformulation patterns as the nucleus of their QA system, manually generating patterns for each question in the TREC-10 QA track. Brill et al. generated patterns automatically, using simple word permutation to produce paraphrases. By permuting the words of the question, they produced a large set of reformulations.
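To make the idea of permutation-based reformulation concrete, the sketch below generates rewrites by dropping the question word and moving the verb to every position among the remaining words. It is an assumption about one simple form such permutation can take, not a reproduction of any cited system; the function name is hypothetical and the input is assumed to be a question of the form "wh-word verb rest".

```python
from typing import List


def verb_movement_rewrites(question: str) -> List[str]:
    """Drop the wh-word and place the verb at each position in the remainder."""
    tokens = question.strip().rstrip("?").split()
    if len(tokens) < 3:
        return []  # too short to contain a wh-word, a verb, and an argument
    # Assumption: the first token is the wh-word and the second is the verb.
    verb, rest = tokens[1], tokens[2:]
    rewrites = []
    for position in range(len(rest) + 1):
        candidate = rest[:position] + [verb] + rest[position:]
        rewrites.append(" ".join(candidate))
    return rewrites


rewrites = verb_movement_rewrites("Where is Belize located?")
# rewrites == ["is Belize located", "Belize is located", "Belize located is"]
```

Most of the generated strings are ungrammatical, which is acceptable in this setting: a rewrite that never occurs in the document collection simply retrieves nothing.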
Factoid queries from the TREC QA track (TREC-8 to TREC-11)
Who was the first Taiwanese President?
Who was the lead actress in the movie "Sleepless in Seattle"?
Where is Belize located?
Who was Galileo?
Where is John Wayne airport?
What year was Alaska purchased?
When did the story of Romeo and Juliet take place?
Kwok et al. performed syntactic modification of questions using transformational grammar operations such as Subject-Aux and Subject-Verb movement. Radev et al. learned the best query reformulations for their QA system. Mollá's QA system translates question and answer sentences into a graph-based logical form representation. Stevenson et al. used a vector space model to learn answer patterns and rank candidate answers.
The answer pattern consists of a predicate-argument structure that is mapped to the subject, verb, and object (SVO) of a clause. Yang and Chua note that query answering is an important research topic in many fields, such as linguistics, computer science, and psychology; it is also a prominent subject at international venues such as the Text Retrieval Conference (TREC) and the Text Analysis Conference (TAC), and an important R&D project for large companies. Rindflesch and Aronson discuss syntactic analysis of the structure of the input query. Horii et al. implemented a method based on word concepts, where a concept represents the synonyms that share one meaning of the word. [18,19,20] Deerwester, Landauer, Foltz, and Laham discuss latent semantic analysis (LSA), a corpus-based semantic similarity measure that relies on co-occurrence counts of words within documents and applies singular value decomposition to derive semantic similarity.
SEMANTIC BASED REFORMULATION FOR QA SYSTEM
The web search engine can be used as a linguistic resource to learn reformulations for a user query. Barzilay and McKeown distinguish three methods for collecting paraphrases: manual collection, corpus-based extraction, and the use of linguistic resources. Of these, manual collection of paraphrases for a given question and answer pair is certainly the easiest to carry out. Semantic networks and WordNet (a lexical database) also prove useful for collecting paraphrases of a given question. Riloff describes an information extraction approach that can be adapted to the problem of reformulation learning.
Fig. 1. Semantic based reformulation architecture for Query and answer
The user query and answer pair is analyzed to extract the arguments and the semantic relation that holds between the query and the answer. The query is then formulated from the extracted arguments. The web search engine is run on the formulated query and returns the most relevant documents. The retrieved documents are then filtered to keep only the sentences that contain the semantic relation. These sentences are passed to NLP tools such as a part-of-speech tagger, a named entity recognizer, and a noun phrase chunker, so that they can be generalized into answer patterns using syntactic and semantic tags. Finally, a confidence weight is assigned to each generated pattern, based on its semantic distance from the question and on the frequency of the pattern.
Web as a linguistic resource
In this section we discuss the use of reformulation in QA processing. Habert et al. and John Sinclair define a corpus as a "collection of language data which are selected and organized according to explicit linguistic criteria, in order to be used as a language sample". This part focuses on the linguistic criteria that our corpus must meet in order to learn reformulations. Despite some drawbacks (discussed at the end of the present part), the Web is considered to be the most adequate corpus for reformulation learning under these criteria.
A user query: for example, the question pattern "who vb person" matches the question "Who was Galileo?"
An answer to the query: once a question pattern has matched the input question, a set of associated answer patterns is searched for in the document collection. Each answer pattern specifies the form of a sentence that may hold a candidate answer. For the question "Who was Galileo?", for instance, the QA system tries to find sentences that match any one of these answer patterns.
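The two-step matching described above can be sketched with regular expressions. The question pattern below encodes "who vb person"; the two answer templates are hypothetical examples invented for illustration, since the actual answer patterns of the system are not listed here, and all names in the code are assumptions.

```python
import re
from typing import Dict, List, Optional

# "who vb person": a wh-word, a form of "to be", then the person argument.
QUESTION_PATTERN = re.compile(
    r"^who\s+(?P<vb>is|was|are|were)\s+(?P<person>.+?)\s*\?*$", re.IGNORECASE
)

# Hypothetical answer templates; {person} and {vb} are filled from the question.
ANSWER_TEMPLATES = [
    r"{person} {vb} (?P<answer>[^.,;]+)",   # "Galileo was <answer>"
    r"{person}, (?P<answer>[^.,;]+),",      # "Galileo, <answer>, ..."
]


def match_question(question: str) -> Optional[Dict[str, str]]:
    """Return the arguments of the question if it fits the pattern, else None."""
    match = QUESTION_PATTERN.match(question.strip())
    return match.groupdict() if match else None


def find_candidates(question: str, sentences: List[str]) -> List[str]:
    """Instantiate each answer template and collect the matched answer spans."""
    args = match_question(question)
    if args is None:
        return []
    candidates = []
    for template in ANSWER_TEMPLATES:
        regex = re.compile(
            template.format(person=re.escape(args["person"]), vb=re.escape(args["vb"])),
            re.IGNORECASE,
        )
        for sentence in sentences:
            found = regex.search(sentence)
            if found:
                candidates.append(found.group("answer").strip())
    return candidates


sentences = [
    "Galileo was an Italian astronomer and physicist.",
    "Galileo, the great Italian scientist, was born in Pisa.",
]
candidates = find_candidates("Who was Galileo?", sentences)
```

In a full system the `<answer>` slot would be constrained by the expected answer type (for example a noun phrase or named entity) rather than by punctuation, as it is in this sketch.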
Semantic based Reformulation system
The QA reformulation system analyzes how people would naturally form queries to find the answer to a particular question. Our reformulation learning used 200 questions from TREC-8, 693 questions from TREC-9, and 500 questions from TREC-10. We randomly selected questions from the TREC-9 collection. For each question, we formed the simplest queries that yield the most relevant web pages containing the answer. Some of the questions and the corresponding web queries are given below:
1. Who invented the paper clip?
"the paper clip invented by"
2. How many people live in Chile?
"the people" AND "quantity"
3. When was Babe Ruth born?
"the babe ruth born in"
The semantic-based reformulation process consists of an acquisition stage and a validation stage. The acquisition stage mines the web for linguistic information; the following example shows its working principle. In this reformulation system, the keyword extraction phase can be avoided entirely: the arguments themselves serve as keywords, together with very general information extraction patterns derived directly from the arguments being processed.
Fig. 2. Reformulation learning system
For instance, if new formulations are being searched for based on the argument tuple [General Electric scientist, Silly Putty], then these arguments are used as keywords, and two answer patterns are searched for in the retrieved documents. In this example, a verb is required to occur between the two keywords. This verb describes a new possible formulation of the original semantic relation.
Next, the validation stage makes a binary decision about each formulation received from the previous step, discriminating between suitable and unsuitable expressions of the semantic relation.
Candidate answer generation to user query
Once we have found semantically equivalent sentences in the retrieved documents, we attempt to generalize them into patterns using both syntactic and semantic features. Each sentence is part-of-speech (POS) tagged and syntactically chunked to identify its noun phrases. To construct the general form of an answer pattern, we replace the noun phrases that correspond to the arguments with argument tags. Prepositions are also removed from the retrieved sentence to obtain the general answer pattern.
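The substitution step can be illustrated with a small sketch. It covers only the replacement of the argument strings with tags; POS tagging, chunking, and preposition removal are deliberately left out, and the tag names `<QARG>` and `<ANSWER>` as well as the function name are assumptions made for this example.

```python
import re


def generalize(sentence: str, question_arg: str, answer_arg: str) -> str:
    """Replace the question argument and the answer in a sentence with tags."""
    # Case-insensitive literal replacement of the question argument.
    pattern = re.sub(re.escape(question_arg), "<QARG>", sentence, flags=re.IGNORECASE)
    # Then replace the answer string in the partially generalized sentence.
    pattern = re.sub(re.escape(answer_arg), "<ANSWER>", pattern, flags=re.IGNORECASE)
    return pattern


answer_pattern = generalize(
    "Babe Ruth was born in 1895 in Baltimore.", "Babe Ruth", "1895"
)
# answer_pattern == "<QARG> was born in <ANSWER> in Baltimore."
```

Plain string replacement is a simplification: matching whole noun phrases produced by a chunker, as the text describes, would avoid replacing an argument that only appears as part of a longer phrase.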
Assigning the confidence weight to candidate answer
Assigning a weight to each candidate pattern is a challenging task, because some answer patterns are more dependable than others. The weight helps us rank the list of answer patterns by quality and precision. In our experiments we found the relevant factors to be the answer sub-phrase score, the level of semantic similarity between the main verb of the pattern and the question verb, the frequency of the pattern, and its length. To generate a weight for each pattern we use a function that takes all of these factors into account; the value of the weight lies between 0 and 1. Let Ai be the ith pattern of the set A extracted for a question and answer pair. We calculate each factor as follows:
Count (Ai): the number of times the pattern Ai was extracted for a given question pattern.
Distance: the distance between the answer and the nearest term from the question arguments in the pattern.
Length (Ai): the length of the pattern, measured in words.
Sub_phrase_score: the score of a candidate sub-phrase, which depends on its similarity to the full candidate answer.
Semantic similarity (Aq, Bxi): the similarity between the candidate answer pattern and the question. We want to estimate the likelihood that the words in the two sentences actually refer to the same fact or event. The weight given to the answer pattern is based on the semantic relation between the terms as specified in WordNet:
Original verb in the question
Synonyms of the question verb
Hyponyms and hypernyms of the question verb
The factors above are used to calculate the final weight of the pattern.
Weight (xi) =
EVALUATION OF THE PROPOSED SYSTEM
The QA system was implemented in the Perl scripting language, with slight changes made to the code for efficiency. The main purpose of the evaluation is to assess the quality of the results.
Table 2. Result of each query class with candidate answer (columns: No. of queries; Queries with at least one candidate answer; Top 5 correct answers for the given query)
Table 3. Result of each question class with generated pattern (columns: No. of questions; Questions with at least one candidate answer; Top 5 correct answers for the given question)
We used 493 question-answer pairs from the TREC collection. The TREC query set was used only for training and evaluation. We submitted these questions to our QA system. The system was evaluated with the semantic reformulation patterns and with the learned ones, and the candidate answers obtained were then compared. The results are reported in Tables 2 and 3 above, which are compared on precision and on the number of questions with at least one candidate answer. Table 4 shows the mean reciprocal rank for each class of question. The results in the tables show only a slight improvement in precision. Our system's results are limited by the syntax of the patterns.
CONCLUSION AND FUTURE WORK
We have discussed a technique for acquiring reformulation patterns based on the semantic features of sentences obtained from a search engine. The experimental work shows that semantic-based reformulation helps to improve the performance of a QA system. The present system concentrates only on semantic relations that hold between two or three arguments. The work could easily be extended to relations of variable size that hold between the arguments. Future work will focus on improving the quality of the queries through a systematic evaluation and adjustment of the parameters that take part in weighting the patterns.