A Semantic Query Interpreter Framework


Abstract - The ubiquity of digital media (broadcast news, documentary videos, meetings, movies, etc.), together with advances in technology and the falling cost of storage media, has led to a sharp increase in data production. This explosive proliferation of digital media without appropriate management limits its exploitation. Multimedia search and retrieval is currently an active research problem in both academia and industry. Online data repositories such as Google, YouTube, and Flickr provide an enormous amount of information, but finding and accessing the data of interest has become difficult. There is therefore a strong need for a system that can efficiently and effectively interpret the user's demand when searching for and retrieving relevant information. To address these problems, we propose a novel technique for automatic query interpretation known as the Semantic Query Interpreter (SQI). SQI interprets the user query both lexically and semantically by using open source knowledge bases, i.e. WordNet and ConceptNet. The effectiveness of the proposed method is explored on the open benchmark image data set LabelMe. Experimental results show that SQI offers a substantial improvement over traditional approaches.

Keywords: Knowledge-based approach, Automatic Query Expansion, Retrieval Performance.


The fast evolution of digital technologies has led to a stunning amount of image data. The unprecedentedly high production of multimedia data raises the expectation that it can be managed as easily as text. The research community is continuously exploring techniques for managing these data effectively and efficiently. However, the biggest challenge so far is taking the user's demand and interpreting it accurately in order to find the data of interest to the user. Several attempts have been made at retrieving relevant images, but the results are still frustrating.

Conventional Content Based Image Retrieval (CBIR) techniques still describe images in terms of low-level features such as colour, shape, and texture, and they do not achieve noteworthy efficiency. Such techniques interpret an image the way a computer does: as a composite of pixels characterized by colour, shape, and texture. For the user, however, an image is a combination of objects rather than pixels, and those objects depict concepts. What matters to the user is not only the content that appears in the image but also the semantic idea it expresses. Moreover, the same image can be interpreted differently by different people. The flexible nature of human interpretation and the hard-coded nature of the computer give rise to a problem known as the semantic gap, i.e. the difference between the user's interpretation and the machine's understanding. Bridging the semantic gap has been regarded as a key problem of Information Retrieval (IR) systems for a decade. The efficiency of a retrieval system relies on its ability to comprehend high-level features, or semantics.

The success of a retrieval system depends on the number of relevant documents it retrieves: the more of the retrieved documents are relevant, the higher its precision and efficiency. The main step in retrieving relevant information is to understand the user's requirement, which is expressed as a query. Retrieval begins when the user enters the query into the system. Queries are formal statements of the required information, but user queries are not always precise enough to state completely what is required. Sometimes a word in the query does not match the words in the corpus, which take the form of metadata attached to the image, known as annotations. This mismatch may be due to a vocabulary difference between the annotation and the query, e.g. rock and stone may share the same semantic idea. Sometimes a word has more than one meaning, for example "Apple" is the name of a fruit as well as the name of a company. All these problems lead to poor retrieval performance.

In an attempt to remedy these problems, query expansion has gained more and more significance in recent years. Query expansion is a technique for enlarging the query by supplementing it with additional terms that are closely related to the original query terms. Researchers have been exploring various query expansion techniques for decades, but some of the issues remain in their infancy.

In this paper, we present a new query expansion technique for image retrieval, known as the Semantic Query Interpreter (SQI), that uses open source lexical and common sense knowledge bases. Initially, the query is split into atomic terms and pre-processed using Natural Language Processing (NLP) functions such as tokenization, lemmatization, and Part of Speech (POS) tagging. Because not every word in the query contributes much, only some of the terms are selected from the pre-processed query: the nouns, verbs, and adjectives. The selected terms are then passed on to the knowledge bases for expansion. Lexical expansion is done using the open source lexical knowledge base WordNet [1], and semantic expansion is done using the open source common sense knowledge base ConceptNet [2]. The weights of the expanded terms are then calculated to find the most relevant terms, and some concepts are pruned from the list of expanded terms based on these weights in order to maintain the precision of the system. Finally, the selected expanded terms, along with the original query terms, are used to retrieve the relevant images. One of the well-known traditional retrieval models, the Vector Space Model (VSM) [3], is used to retrieve and rank the results. The SQI technique is evaluated on the open source image data set LabelMe [4], and its effectiveness is assessed using precision and recall. Experiments reveal a substantial improvement over state-of-the-art methods.

In the remainder of this paper, section 2 reviews the state of the art in query expansion. Section 3 presents the details of the proposed framework for semantic query expansion. The experimental results are presented in section 4. Finally, we conclude in section 5.


The efficiency of a retrieval system usually relies on its ability to infer the user's requirements and then locate the relevant data according to the query specification. Query expansion is a promising approach to improving retrieval performance by appending to the query supplemental terms that are closely related to it. The concept of query expansion has been explored for decades, but it is still worth probing. Query expansion can be done with manual or computer-generated thesauri, relevance feedback, automatic (statistical) query expansion, or interactive query expansion.

Existing query expansion techniques make use of NLP-based query expansion, statistical models [5], ontologies [6], semantic knowledge [7], knowledge bases [8], the semantic web in the case of web queries [9], user relevance [10], conceptual query expansion [11], user logs [12,13,14], co-occurrence [15], and viewpoint-oriented manipulation [16]. User relevance is based on fuzzy rules [17] and statistical models. NLP-based query expansion has not gained much favour because of its limited use: it only uncovers the role of the terms in the query and does not elaborate their meaning. User logs represent the interests of the user and can be constructed automatically and updated during user interaction. After expansion, a query yields a set of terms, some of which are relevant and some of which are irrelevant, i.e. noise. This noise significantly degrades the precision of the overall system. The relevance between the expanded terms and the documents in the corpus can be calculated using term co-occurrence [15] or the Google similarity distance [18].

Query expansion is suited to coping with problems such as vocabulary mismatch, or the vocabulary gap [19], and Word Sense Disambiguation (WSD) [20]. The vocabulary gap is the vocabulary difference between the annotated concept and the user query, while WSD is the ability of the system to find the meaning of a word in its context [21]. The research community has focused on expanding user queries with lexical resources such as lexical semantic relations [22], co-occurrence frequencies [23], and WordNet [1]. The WordNet hierarchy and synsets have also been used to improve Wikipedia classification [24], and WordNet has been used for conceptual query expansion [25]. Word sense disambiguation of user queries can also be performed using WordNet. Dictionary-like resources such as WordNet have attracted substantial attention from researchers and produce genuinely good results for simple queries. However, although WordNet-based expansion produces excellent results for short and simple queries, it falls short for complex or semantic queries [26], because it relies merely on the lexical meaning rather than the conceptual meaning of the query. Semantic concept based queries remain an open issue. Such queries combine objects, scenes, and events as well as semantic concepts, as in "Burning of wood in the street", where wood is an object, burning is an event, and street is a scene. The query describes what is actually happening in the image.

Semantic query expansion is still a challenging issue. Early work relied solely on text matching techniques; subsequently the trend moved towards semantic expansion of user queries. Those systems rely heavily on lexical analysis, which is why they fail on complex queries: they do not find semantic relatedness and have no capacity for common sense reasoning. Although lexical analysis plays an important role in extracting the meaning of a user request, common sense reasoning also plays a central role. Common sense knowledge includes knowledge about the spatial, physical, social, temporal, and psychological aspects of everyday life. WordNet has commonly been used for query expansion; it has brought some improvement, but the improvement was limited. Several studies reveal the importance of common sense reasoning in information retrieval, data mining, data filtering, etc. [27].

To go beyond this limited improvement, the user query needs to be expanded semantically. Our proposed technique expands the user query lexically as well as semantically. The lexical expansion is done using WordNet, while the semantic expansion is done using ConceptNet. ConceptNet is a contextual common sense reasoning system for common sense knowledge representation and processing. It was developed by the MIT Media Laboratory and is presently the largest common sense knowledge base [28]. ConceptNet is the semantic network representation of the Open Mind Common Sense (OMCS) knowledge base. It has rarely been used by the research community for query expansion. The Annotation and Retrieval Integration Agent (ARIA) project also used common sense reasoning to bridge the semantic gap and increase retrieval efficiency [29]. WordNet and ConceptNet have already been compared using the TREC-6, TREC-7, and TREC-8 data sets; the results reveal that WordNet has higher discriminative ability while ConceptNet has higher concept diversity [30].


To enhance the effectiveness of query expansion, we propose an automatic semantic query expansion technique, known as the Semantic Query Interpreter (SQI), that interprets the user query using the WordNet and ConceptNet knowledge bases. The framework of the proposed SQI is outlined in this section. The proposed system is divided into four major phases.

• Core Lexical Analysis

• Common Sense Reasoning

• Candidate Concept Selection

• Retrieval and Ranking of Results

The overall structure of the Semantic Query Interpreter is shown in figure 1. After the user submits a query, the system analyses the query as well as the data in the corpus to separate relevant from irrelevant data according to the user's specification.

The user enters a query Q into the system. The query may consist of a single keyword or a combination of keywords, each represented by Ki.

Q = (K1, K2, K3, K4, ..., KT)


where T is the number of keywords in the given user query.

Initially, the query is passed to the core lexical analysis phase, which converts the user query into a finite set of tokens by tokenization.

Figure 1. Proposed Model for Semantic Query Interpreter

Tokenization is a technique that splits a sentence into a finite number of words. The words in a query may occur in many morphological forms, such as plurals, gerunds, or forms with past or future suffixes. The query tokens are therefore converted to their base forms for further analysis. Lemmatization is a technique that converts the different morphological forms of a word into its base form, known as a lexeme; e.g. bake occurs in forms such as baking, baked, and bakes, and lemmatization converts all of them to bake. Each lexeme is then assigned a Part of Speech (POS) tag, which is used to select the appropriate lexemes from the list. The lexemes are tagged using the MontyLingua tagger [31], which uses the Penn Treebank tagset for part of speech tagging [32]. From the tagged lexemes, only some are selected as candidate terms, because not every word in the query contributes much: common words, unusual words, and stop words are removed, and only nouns, verbs, and adjectives are kept. Nouns represent entities, verbs represent events, and adjectives represent the properties of entities. The selected candidate terms, or lexemes, form a finite set denoted CT.

CT = {CT1, CT2, ..., CTn}
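
The core lexical analysis steps described above can be sketched roughly as follows. This is only an illustration under stated assumptions: the stop-word list, the lemma table, and the POS lexicon are small hypothetical stand-ins, whereas the paper uses the MontyLingua tagger, and the function names are not taken from the original system.

```python
import re

# Hypothetical toy resources; a real system would use a trained lemmatizer
# and tagger (the paper uses MontyLingua with the Penn Treebank tagset).
STOP_WORDS = {"the", "of", "in", "a", "an", "is", "on", "at"}
TOY_LEMMAS = {"burning": "burn", "baking": "bake", "baked": "bake", "bakes": "bake"}
TOY_POS = {"burn": "VB", "bake": "VB", "wood": "NN", "street": "NN", "red": "JJ"}
CONTENT_TAG_PREFIXES = ("NN", "VB", "JJ")  # nouns, verbs, adjectives


def tokenize(query):
    """Split a query into lowercase word tokens."""
    return re.findall(r"[a-z]+", query.lower())


def lemmatize(token):
    """Look up the base form of a token; unknown tokens are returned as-is."""
    return TOY_LEMMAS.get(token, token)


def select_candidate_terms(query):
    """Return (lexeme, tag) pairs for the nouns, verbs, and adjectives."""
    candidates = []
    for token in tokenize(query):
        if token in STOP_WORDS:
            continue
        lexeme = lemmatize(token)
        tag = TOY_POS.get(lexeme)
        if tag is not None and tag.startswith(CONTENT_TAG_PREFIXES):
            candidates.append((lexeme, tag))
    return candidates


print(select_candidate_terms("Burning of wood in the street"))
# prints [('burn', 'VB'), ('wood', 'NN'), ('street', 'NN')]
```

Words missing from the toy POS lexicon are simply dropped here; a real tagger would assign a tag to every token.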


The selected lexemes, along with their part of speech tags, are then passed to the WordNet knowledge base, which expands the query by attaching synonyms, known as synsets. The WordNet hierarchy is used to extract the synonyms.

The WordNet engine expands the selected candidate terms lexically. The selected terms are then passed to the common sense reasoning module, which attaches semantic context, rather than similar words, to the query words. WordNet reduces the vocabulary gap, while ConceptNet reduces the semantic gap. The ConceptNet knowledge base is used to attach a concept set to each term in CT by performing common sense reasoning.

SS denotes the synonym set attached to the lexemes and CS denotes the concept set attached to the lexemes. The expanded query is the set of selected lexemes together with their concept sets and synonym sets.
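
As a rough illustration of the two expansion steps, the sketch below attaches a synonym set (SS) and a concept set (CS) to each lexeme. The two dictionaries are hypothetical toy data standing in for WordNet and ConceptNet lookups, and the function names are assumptions made for this example, not the original implementation.

```python
# Hypothetical toy data standing in for WordNet synsets and ConceptNet
# concept sets; real lookups would query the knowledge bases themselves.
TOY_SYNSETS = {
    "rock": {"stone"},
    "street": {"road"},
}
TOY_CONCEPTS = {
    "wood": {"tree", "fire", "forest"},
    "street": {"car", "city"},
}


def expand_query(candidate_terms):
    """Map each lexeme to its synonym set (SS) and concept set (CS)."""
    expanded = {}
    for term in candidate_terms:
        expanded[term] = {
            "SS": set(TOY_SYNSETS.get(term, set())),
            "CS": set(TOY_CONCEPTS.get(term, set())),
        }
    return expanded


def flatten(expanded):
    """Collect the original lexemes and every attached term in one set."""
    terms = set(expanded)
    for attached in expanded.values():
        terms |= attached["SS"] | attached["CS"]
    return terms


expanded = expand_query(["wood", "street"])
print(sorted(flatten(expanded)))
```

Keeping SS and CS separate per lexeme makes it possible to weight lexical and common sense expansions differently at a later stage.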

In our model, we perform common sense reasoning using the Knowledge-Lines, also called K-Lines, from ConceptNet. K-Lines are conceptual correlations. ConceptNet groups its twenty relationship types into eight categories, one of which is K-Lines, and these relationships support conceptual reasoning.

The lexical and semantic expansion of the user query produces a large number of attached concepts. This significantly increases recall but at the same time decreases the precision of the system. To maintain precision, the noise must be removed. The expanded terms are filtered using a semantic similarity function: the semantic similarity of each expanded term is measured against the selected query lexemes, and a threshold derived from the calculated similarity values determines which terms are kept. The resulting expanded query, which contains the selected query lexemes along with the selected expanded concepts, is passed to the retrieval model, which retrieves the appropriate images and ranks them. The candidate concept (CC) selection module thus aims to increase the precision of the system.


The final expanded query consists of the selected lexemes and the selected candidate concepts.
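
The pruning step can be sketched as a simple threshold filter. The text does not say which similarity function or threshold rule is used, so the example below assumes that similarity scores in [0, 1] have already been computed, and it falls back to the mean score as the threshold. Both choices, like the function name, are assumptions made for illustration.

```python
def select_candidate_concepts(similarities, threshold=None):
    """Keep expanded terms whose similarity reaches the threshold.

    `similarities` maps each expanded term to a score measured against the
    query lexemes. When no threshold is given, the mean score is used
    (an assumption; the source does not state its threshold rule).
    """
    if not similarities:
        return []
    if threshold is None:
        threshold = sum(similarities.values()) / len(similarities)
    kept = [term for term, score in similarities.items() if score >= threshold]
    # Most similar terms first.
    return sorted(kept, key=lambda term: similarities[term], reverse=True)


# Hypothetical scores for terms attached to the query "wood street".
scores = {"tree": 0.8, "fire": 0.7, "forest": 0.6, "car": 0.2, "city": 0.1}
print(select_candidate_concepts(scores))
# prints ['tree', 'fire', 'forest']
```

A higher threshold favours precision and a lower one favours recall, which is the trade-off the paragraph above describes.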

The selected candidate concepts serve as input to one of the well-known traditional bag-of-words Information Retrieval models, the Vector Space Model (VSM). The candidate terms are compared against the metadata attached to each image. The VSM retrieves images on the basis of the frequency of the selected concepts tagged in the image. The similarity between an image and the candidate concepts, together with the original query terms, is calculated using the cosine measure.


An image with a higher frequency of the selected concepts is ranked before an image with a lower frequency.
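
A minimal sketch of cosine-based ranking in a vector space model is given below. It assumes raw term frequencies as vector weights, since the text does not state a weighting scheme, and the image identifiers and annotations are hypothetical.

```python
import math
from collections import Counter


def cosine(query_terms, annotation_terms):
    """Cosine similarity between two bags of words (raw term frequencies)."""
    q, d = Counter(query_terms), Counter(annotation_terms)
    dot = sum(q[t] * d[t] for t in q)
    norm = math.sqrt(sum(v * v for v in q.values())) * math.sqrt(
        sum(v * v for v in d.values())
    )
    return dot / norm if norm else 0.0


def rank_images(query_terms, annotations):
    """Return (image_id, score) pairs sorted by descending similarity."""
    scored = [(img, cosine(query_terms, terms)) for img, terms in annotations.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical annotations attached to three images.
annotations = {
    "img1": ["street", "car", "car"],
    "img2": ["tree", "sky"],
    "img3": ["street", "tree", "wood"],
}
for image_id, score in rank_images(["wood", "street", "tree"], annotations):
    print(image_id, round(score, 3))
```

In this toy corpus img3 matches all three query terms and is ranked first, followed by img2 and then img1.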


The efficiency of the proposed method has been investigated on the LabelMe test collection, which is freely available for academic and research purposes. LabelMe is a project created by the MIT Computer Science and Artificial Intelligence Laboratory (CSAIL) that provides a data set of digital images with annotations [33]. The corpus consists of 8,983 images, 56,943 annotated images, and 25,040 images that are not yet annotated. Two of the most widely used and well accepted evaluation criteria, precision and recall, are used to measure efficiency. Precision is related to specificity, while recall is related to exhaustivity.

Figure 2. Precision-Recall curve for Semantic Query Interpreter

Precision and recall are used to evaluate the effectiveness of SQI. For simplicity, we selected some of the categories from the LabelMe data set. We calculate the precision and recall for the top 10 retrieved results, i.e. P@10 and R@10. Our experiments use a set of different types of queries: keyword based queries (single-word single-concept and single-word multi-concept) and sentence based queries (multi-word multi-concept). All the expanded queries are passed to the VSM for retrieval and ranking of the results. The results are shown in figure 2. Though promising, the results still show some fluctuation for some of the complex queries, since 100% efficiency is hard to achieve. Apart from that, SQI demonstrates a significant improvement over the traditional approaches and is not domain specific.
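
For reference, P@k and R@k can be computed as in the sketch below. It follows the usual convention of dividing precision by k even when fewer than k results are returned. The function names and the example data are assumptions made for illustration, not the evaluation code used in the experiments.

```python
def precision_at_k(ranked_ids, relevant_ids, k=10):
    """Fraction of the top-k positions occupied by relevant items."""
    if k <= 0:
        raise ValueError("k must be positive")
    hits = sum(1 for item in ranked_ids[:k] if item in relevant_ids)
    return hits / k


def recall_at_k(ranked_ids, relevant_ids, k=10):
    """Fraction of all relevant items that appear in the top-k results."""
    if k <= 0:
        raise ValueError("k must be positive")
    if not relevant_ids:
        return 0.0
    hits = sum(1 for item in ranked_ids[:k] if item in relevant_ids)
    return hits / len(relevant_ids)


# Hypothetical ranking and relevance judgements for one query.
ranked = ["a", "b", "c", "d"]
relevant = {"a", "c", "z"}
print(precision_at_k(ranked, relevant, k=4))  # prints 0.5
print(recall_at_k(ranked, relevant, k=4))
```

Here two of the four retrieved items are relevant, so P@4 is 0.5, and two of the three relevant items were found, so R@4 is two thirds.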


This paper presented a new semantic query expansion technique, known as the Semantic Query Interpreter, that combines lexical and common sense knowledge to expand user queries. In preliminary experiments, the Semantic Query Interpreter outperforms traditional query expansion, and the precision of the system is significantly improved. Based on our observations, we conclude that both lexical and semantic knowledge play a vital role in making the system efficient. Future work includes investigating SQI on other data sets such as Corel and ImageCLEF, and applying the same techniques to video data sets such as LabelMe videos in order to test the efficiency of SQI for video retrieval.