Semantic Query Expansion By Using Knowledge Bases Computer Science Essay



With the spread of digital media and online services, and with the falling cost of storage devices, it has become difficult for users to manage, store and access relevant information. This explosive growth of digital media without appropriate management limits its use. Multimedia search and retrieval is currently an active research problem in both academia and industry. Large amounts of data are available online and offline. Online repositories such as Google, YouTube and Flickr provide a huge dump of information, but finding and accessing the data of interest is becoming steadily harder. Textual information retrieval is a mature area, but image retrieval is still worth investigating. Because of this growth, there is a strong need for a system that can efficiently and effectively interpret the user's demand when searching for and retrieving relevant information.


Main goal:

To design and evaluate a query model for multimedia search and retrieval.

Speci¬c goals:

To design strategies that can convert a user demand into a set of discrete concepts.

To develop a semantic query system that automatically interprets the query according to the user's requirements.

To propose and implement semantic query expansion using knowledge bases.

To evaluate the performance of the system using standard information retrieval measures.
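The last goal refers to standard information retrieval measures. As a minimal sketch, assuming these include precision, recall and average precision (the essay does not name the measures), they could be computed for a single query as follows. The image identifiers and relevance judgments are hypothetical.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for one query."""
    retrieved_set = set(retrieved)
    relevant_set = set(relevant)
    hits = len(retrieved_set & relevant_set)
    precision = hits / len(retrieved_set) if retrieved_set else 0.0
    recall = hits / len(relevant_set) if relevant_set else 0.0
    return precision, recall


def average_precision(ranked, relevant):
    """Mean of the precision values at each rank where a relevant item appears."""
    relevant_set = set(relevant)
    hits = 0
    total = 0.0
    for rank, item in enumerate(ranked, start=1):
        if item in relevant_set:
            hits += 1
            total += hits / rank
    return total / len(relevant_set) if relevant_set else 0.0


# Hypothetical ranked output of a retrieval system and its ground truth.
ranked_results = ["img3", "img7", "img1", "img9"]
ground_truth = {"img3", "img1", "img5"}

p, r = precision_recall(ranked_results, ground_truth)
ap = average_precision(ranked_results, ground_truth)
print(f"precision={p:.2f} recall={r:.2f} average precision={ap:.2f}")
```

Precision rewards a system for returning few irrelevant items, recall for missing few relevant ones, and average precision additionally rewards placing relevant items near the top of the ranking.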


Because storage is cheap and multimedia data is easy to obtain, users hold large amounts of data on their personal computers. This explosive growth of digital media (online image/video data, personal media, broadcast news videos, etc.) raises the expectation that it will be as easy to manage and search as text. The growth has led to the need for a system that can search and retrieve data efficiently and on demand. Currently available repositories such as YouTube, Flickr and Google's image/video search, and other multimedia management systems, use different approaches for annotating this data for future use. Even when these repositories are annotated with keywords, it can still be difficult to find the relevant information. Taking the user's demand and interpreting it appropriately, so that the required data is retrieved effectively, therefore remains an open challenge.

Statement of the Research Problem

The main challenge in image retrieval systems is that images are not as easy to manage as text. As the saying goes:

"A picture is worth a thousand words"

An image is ultimately a group of objects that depicts some concepts. For a computer, an image is just a combination of pixels characterized by low-level features such as colour, shape and texture, while for a human it is more than that. For a human, an image is a combination of one or more semantic ideas: it refers not to the content that appears in the image but to the semantic idea that it represents. It is worth noting that different people may extract different meanings from the same image. The mismatch between the flexible nature of human interpretation and the hard-coded nature of the computer gives rise to a problem known as the semantic gap.

Figure 1. Semantic gap, where the system focuses on data extraction while the human focuses on intelligence.


Research Questions:

My research aims to contribute to general research questions such as:

How can we interpret the user requirements that are given to the system in the form of a query?

How can the semantic gap be reduced?

How can the system be made efficient while preserving semantic accuracy?

Proposed framework

Retrieval begins when the user enters a query into the system. Queries are formal statements of the required information [1]. Querying consists of three interactive steps: query formation, query processing and query result presentation. These involve finding expressive methods for conveying what is desired, the capability to match what is expressed with what is there, and ways to evaluate the outcome of the search.

Many words have more than one meaning. For example, given the query "Apple", apple is the name of a fruit as well as the name of a company. It is important for a retrieval system to determine the intended meaning and the correct context of each word. Several objects may match the query, with different degrees of relevance.
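One simple way to illustrate choosing between the two meanings of "Apple" is to compare the other words in the query against keywords associated with each sense. The sketch below is only an illustration of that idea, in the spirit of gloss-overlap disambiguation. It is not the method proposed in this essay, and the sense inventory `SENSES` is a hypothetical, hand-written stand-in for a real knowledge base.

```python
# Hypothetical sense inventory: word -> sense label -> keywords for that sense.
SENSES = {
    "apple": {
        "fruit": {"fruit", "tree", "red", "green", "sweet", "eat", "juice"},
        "company": {"company", "technology", "phone", "computer", "laptop", "software"},
    }
}


def disambiguate(word, context_words):
    """Return the sense whose keywords overlap most with the context.

    Returns None if the word is unknown or no sense shares a word
    with the context.
    """
    senses = SENSES.get(word.lower(), {})
    context = {w.lower() for w in context_words}
    best_sense, best_overlap = None, 0
    for sense, keywords in senses.items():
        overlap = len(context & keywords)
        if overlap > best_overlap:
            best_sense, best_overlap = sense, overlap
    return best_sense


print(disambiguate("Apple", "apple laptop on a desk".split()))  # company
print(disambiguate("Apple", "red apple on a tree".split()))     # fruit
print(disambiguate("Apple", ["apple"]))                         # None
```

When the query contains no helpful context, as in the last call, the sketch declines to choose, which is the case where a retrieval system would have to return results for several senses.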

The query plays a vital role in the performance of an information retrieval system. Sometimes the user's query does not clearly define the user's needs, and occasionally the vocabulary in the query is inconsistent with that in the relevant documents.

Earlier content-based image retrieval (CBIR) systems use low-level features such as colour, texture and shape as the basis for search and retrieval, which increases the uncertainty in the output. This results in a gap, known as the semantic gap, caused by the difference between the user's interpretation and the machine's understanding. The semantic gap has long been regarded as a key problem in multimedia information retrieval. The gap arises because well-understood methods for extracting low-level features from an image are used to infer its semantics, yet these low-level features cannot depict the entire semantics of the image. The efficiency of a retrieval system depends on its ability to understand the high-level features, or semantics, within the image.
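To make the limitation of low-level features concrete, the sketch below compares images by colour histogram intersection, one common low-level similarity measure. It is an illustration under simplifying assumptions rather than a description of any particular CBIR system: images are represented as plain lists of (r, g, b) tuples, and the three tiny "images" are made up.

```python
from collections import Counter


def color_histogram(pixels, bins_per_channel=4):
    """Normalized histogram over quantized (r, g, b) values in 0..255."""
    step = 256 // bins_per_channel
    counts = Counter((r // step, g // step, b // step) for r, g, b in pixels)
    total = len(pixels)
    return {bin_: n / total for bin_, n in counts.items()}


def histogram_intersection(h1, h2):
    """Similarity in [0, 1]: sum of the per-bin minimums."""
    return sum(min(v, h2.get(k, 0.0)) for k, v in h1.items())


# Made-up pixel data: two semantically different images with similar colours,
# and one image with entirely different colours.
red_apple = [(200, 30, 30)] * 6 + [(30, 120, 30)] * 2
red_car = [(210, 20, 25)] * 6 + [(40, 110, 35)] * 2
blue_sky = [(30, 60, 220)] * 8

sim_apple_car = histogram_intersection(color_histogram(red_apple), color_histogram(red_car))
sim_apple_sky = histogram_intersection(color_histogram(red_apple), color_histogram(blue_sky))
print(sim_apple_car, sim_apple_sky)
```

In this toy data the apple and the car fall into the same colour bins, so the measure treats them as near-identical even though they depict different concepts, which is exactly the kind of mismatch the semantic gap describes.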

In light of the problems stated above, we propose a framework for semantic query expansion using knowledge bases for image search and retrieval. Query expansion is a technique that expands a query by adding terms closely related to the original query terms. Several open-source knowledge bases are available for research and academic purposes, such as WordNet [2], ConceptNet [3] and CYC [4]. The overall structure of the proposed model is shown in Figure 2.
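The basic mechanics of query expansion can be sketched in a few lines. The example below is a simplified illustration, not the proposed framework itself: the dictionary `KNOWLEDGE_BASE` is a hypothetical in-memory stand-in for a real knowledge base such as WordNet or ConceptNet, and the function name and the `max_terms_per_word` parameter are assumptions made for this sketch.

```python
# Hypothetical toy knowledge base: term -> closely related terms.
KNOWLEDGE_BASE = {
    "car": ["automobile", "vehicle"],
    "beach": ["shore", "coast", "sand"],
}


def expand_query(query, kb=KNOWLEDGE_BASE, max_terms_per_word=2):
    """Return the original query terms followed by related terms from kb."""
    terms = query.lower().split()
    expanded = list(terms)
    for term in terms:
        # Limit how many related terms each word may contribute,
        # so the expanded query does not drift too far from the original.
        for related in kb.get(term, [])[:max_terms_per_word]:
            if related not in expanded:
                expanded.append(related)
    return expanded


print(expand_query("car on beach"))
# ['car', 'on', 'beach', 'automobile', 'vehicle', 'shore', 'coast']
```

Capping the number of added terms per word is one simple way to trade recall against precision: more added terms match more images, but also admit more loosely related results.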

Figure 2. Overall flow of the proposed model.

Application of this Research

It helps in managing multimedia data effectively and efficiently.

It helps in searching for and retrieving a particular piece of information from a large dump of information, making media search and retrieval easy.

It helps in managing security data such as CCTV footage.


Content Owners -- Production companies like BBC, CNN, Geo…

TV Service Providers -- Satellite & Cable companies

Electronics Manufacturers -- Mobile, DVRs, Digital media players

Internet Protocol TV software developers like Microsoft and Virage.

Content-Service Providers

Content monitoring companies which provide push and pull services.

Web-Content aggregators

Companies that aggregate digital media like Google, Yahoo and YouTube…

Content-repackaging companies

Companies that acquire content like sports videos and TV programs and repackage it according to user needs.


In a nutshell, our system uses query expansion techniques to extract the semantics from the user query. We aim to provide an effective way of interpreting the user's demand, keeping in view the flexible nature of humans as well as the hard-coded nature of computers. By interpreting the user's demand semantically, we aim to reduce the semantic gap and to achieve both semantic accuracy and efficiency in retrieval.