Search Using Different Query Expansion Computer Science Essay

Traditional search engines, when used to search for images, return the common results that most users expect. Compared with ordinary text queries, image search engines often have to work with unclear, inefficient and ambiguous data. In such cases, query expansion techniques can help retrieve the correct images for the different senses a user may have in mind. In this paper, we analyze prior work on image search using different query expansion techniques. We also propose a new hybrid method in which an ontology based image search is performed first and its results are passed to a second phase, a query expansion based search. This hybrid method is intended to give finer results than CBIR and other methods because it considers both the structural and the lexical similarity of images.

Keywords-Image search engines; Query expansion


With the introduction of the internet and the digital world, multimedia data has gained great importance in people's lives. Over time, the amount of data handled through digital media keeps increasing.

Users want to search for images as well as for text. There are traditional techniques for searching both, but as the quantity of data increases, the performance and quality of these methods no longer satisfy users. Image search is slightly different from text search: both text and a sample image can sometimes be given as the query. When researchers looked beyond the traditional techniques, a method called content based retrieval came into existence. In this method, a sample image is given as input and searched for in a pool of images. Only the pixel values and contents of the images are matched against the image pool.

Later, a few studies were carried out on query expansion in image search. Query expansion is the process of adding further terms (query tails) to the original query so that more semantically similar results are returned. This can be done in various ways. The major difference between query expansion techniques lies in the choice of knowledge base. This paper focuses on the evolution from traditional image search to newer techniques, concentrating on query expansion methods. We mainly discuss approaches to query expansion with different knowledge bases in image search, and how and where these methods can be applied. We also propose a new method of image search that uses both query expansion and an ontology to give finer results.


In content based image retrieval systems [7,10], images are retrieved from a pool or database of images based on the content of a particular image. First, the user provides a query image. The pixel values and contents of that picture are taken into consideration, and the similarity of the query image to each image in the pool is computed. Extracting the features of the image is the major step in this process. Colour, shape and texture are the features considered in this step. The feature vector of the query is matched against those of the images in the database, and the similar images are grouped.

Features can be of two types: global features and local features. Global features include the colour layout and the colour and texture histograms of the whole image. The colour, shape and texture of parts of an image fall into the second category.
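The matching step described above can be sketched as a nearest-neighbour search over feature vectors. This is only a minimal illustration: the four-element feature layout, the image names and the use of Euclidean distance are assumptions made for the example, not details given in the cited systems.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def rank_by_similarity(query_vec, database, top_k=3):
    """Return the top_k (name, distance) pairs, closest first."""
    scored = [(name, euclidean(query_vec, vec)) for name, vec in database.items()]
    scored.sort(key=lambda pair: pair[1])
    return scored[:top_k]

# Hypothetical global features: [mean red, mean green, mean blue, texture contrast]
database = {
    "sunset.jpg": [0.9, 0.4, 0.1, 0.2],
    "forest.jpg": [0.1, 0.8, 0.2, 0.6],
    "beach.jpg": [0.8, 0.7, 0.5, 0.1],
}
query = [0.85, 0.45, 0.15, 0.25]

results = rank_by_similarity(query, database, top_k=2)
for name, distance in results:
    print(f"{name}: {distance:.3f}")
```

A real system would build the vectors from the extracted colour, shape and texture features; only the ranking logic is shown here.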

CBIR model architecture: the model has three levels. The first is the pre-processing tier, in which the images given as queries and the images obtained as results are preprocessed.

In this tier, the image given as a query is first identified, and then the category to which it belongs is identified. These details are passed to the web server, which passes them on to the second level.

The second level is the application level, where images are checked for a match. The matching module has three sections. The query handler receives the details from the web server and produces an SQL query to select a closely similar image from the category list.

This query is then given to a database that holds all the images collected from the web. The details of the similar images from the category are returned to the query handler, and the degree of match is computed by a calculator. The image with the highest degree of match is given to the result generator.
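A rough sketch of the application level follows, using an in-memory SQLite database. The table layout, the three feature columns and the scoring formula are all hypothetical; the text only says that an SQL query selects candidates by category and that a calculator computes a degree of match.

```python
import sqlite3

def degree_of_match(a, b):
    """Toy 'calculator': 1.0 for identical vectors, lower as they differ."""
    distance = sum(abs(x - y) for x, y in zip(a, b))
    return 1.0 / (1.0 + distance)

def best_match(conn, category, query_features):
    """Query handler: select candidates in the category, then score each one."""
    rows = conn.execute(
        "SELECT name, f1, f2, f3 FROM images WHERE category = ?", (category,)
    ).fetchall()
    scored = [(name, degree_of_match(query_features, feats)) for name, *feats in rows]
    # Result generator input: the candidate with the highest degree of match.
    return max(scored, key=lambda pair: pair[1]) if scored else None

# Hypothetical image store with three numeric features per image.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE images (name TEXT, category TEXT, f1 REAL, f2 REAL, f3 REAL)")
conn.executemany(
    "INSERT INTO images VALUES (?, ?, ?, ?, ?)",
    [
        ("rose.jpg", "flower", 0.9, 0.1, 0.2),
        ("tulip.jpg", "flower", 0.7, 0.3, 0.1),
        ("oak.jpg", "tree", 0.2, 0.8, 0.3),
    ],
)

match_name, match_score = best_match(conn, "flower", (0.75, 0.25, 0.1))
print(match_name, round(match_score, 3))
```

Filtering by category in SQL keeps the more expensive scoring step limited to plausible candidates, which appears to be the point of identifying the category in the first tier.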

The retrieved results are presented in a neat format and sent to the user.

The last level stores all the images and provides them as needed.

In a content based system, there is no particular method for finding the semantic similarity between pictures; only the similarity of contents is considered. Content based image retrieval can therefore be improved by adding a relevance feedback system to it.


Using WordNet and ConceptNet Knowledge Bases

In 2011, Nidan Aslam, Irfanullah et al. published a paper on image search using two knowledge bases, WordNet and ConceptNet. The whole technique of using WordNet and ConceptNet in image search is described in Figure 1.

When a user types in a query to search for an image, the query may contain one or more subqueries:

$Q = \{q_1, q_2, \ldots, q_n\}$

Here, the $q_i$ are the subqueries of the query Q that the user enters.

Figure 1. Stages recoverable from the figure: semantic expansion; selection of concepts; ranking of images and retrieval.

In the first phase, the query is converted into a sequence of subqueries. This phase is known as tokenization. After this phase, inflected word forms are reduced to their base forms in the lemmatization phase.

The selected lexemes are then passed to WordNet, which expands the query. Expansion of terms in WordNet is typically lexical. The semantic expansion of terms is carried out in the ConceptNet phase, in which ConceptNet adds semantically related concepts to the candidate words.

The words obtained from both WordNet and ConceptNet are then appended to the original query Q. The number of concepts obtained from these two steps can be huge. To increase the precision of the results, the similarity of each word to the original query is measured, and words with low relatedness are removed. The resulting query is then passed to the information retrieval system, which returns the images that match the candidate key set most frequently.
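The phases above (tokenization, lemmatization, lexical expansion, semantic expansion and pruning) can be sketched as a small pipeline. The dictionaries below are hypothetical stand-ins for WordNet and ConceptNet, and the relatedness scores and threshold are invented for illustration; the real knowledge bases are not queried here.

```python
import re

# Hypothetical stand-ins for the real resources; entries are illustrative only.
LEMMAS = {"dogs": "dog", "running": "run", "parks": "park"}
LEXICAL_KB = {"dog": ["canine", "hound"], "park": ["garden", "common"]}  # WordNet-like
SEMANTIC_KB = {"dog": ["leash", "bark"], "park": ["bench", "tree"]}      # ConceptNet-like
RELATEDNESS = {"canine": 0.9, "hound": 0.7, "garden": 0.6, "common": 0.2,
               "leash": 0.8, "bark": 0.5, "bench": 0.4, "tree": 0.3}

def tokenize(query):
    """Split the query into lowercase word tokens (the subqueries)."""
    return re.findall(r"[a-z]+", query.lower())

def lemmatize(tokens):
    """Map each token to its base form when one is known."""
    return [LEMMAS.get(token, token) for token in tokens]

def expand(query, threshold=0.5):
    """Append lexical and semantic expansions, dropping weakly related terms."""
    lemmas = lemmatize(tokenize(query))
    candidates = []
    for lemma in lemmas:
        candidates += LEXICAL_KB.get(lemma, [])
        candidates += SEMANTIC_KB.get(lemma, [])
    kept = [term for term in candidates if RELATEDNESS.get(term, 0.0) >= threshold]
    return lemmas + kept

expanded = expand("Dogs running in parks")
print(expanded)
```

The threshold plays the role of the pruning step: raising it trades recall for precision by discarding more of the candidate terms.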

In VHR Image search

With the advent of VHR (Very High Resolution) [3] satellite images, detecting objects became much easier for researchers. There are different techniques for finding objects in VHR images, but with the traditional search method the precision of the results is very low. Zeng Huaxin, Zang Huigang et al. used query expansion techniques for VHR image detection and reported outstanding results.

In VHR object detection, a single query image or a group of query images is provided. These images are searched for in a VHR test image. In traditional object detection, the test image is first segmented with respect to the size of the query images.

In the second phase, called feature extraction, a set of local regions is found and quantized using a clustering algorithm. These quantized regions are later used for indexing. In the final phase, the query vector and the patch vectors are compared to produce the results.

To obtain more polished results, query expansion techniques are used. The query expansion baseline is a simple method of query expansion. As in text retrieval, the top m results of the original query are taken, and frequency vectors are computed from all of these patches to form a new query. The results of the new query are added to the results of the original query.
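The query expansion baseline can be sketched as follows. The sketch assumes that each patch is described by a frequency vector over quantized regions, that similarity is measured with the cosine, and that the new query is the mean of the original query vector and the top m patch vectors; these choices, and the tiny data set, are illustrative assumptions rather than details taken from the cited work.

```python
import math

def cosine(a, b):
    """Cosine similarity between two frequency vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def search(query_vec, patches, top_k):
    """Rank patch names by similarity to the query and keep the best top_k."""
    ranked = sorted(patches, key=lambda name: cosine(query_vec, patches[name]), reverse=True)
    return ranked[:top_k]

def expand_and_search(query_vec, patches, top_m=2, top_k=3):
    original = search(query_vec, patches, top_k)
    # Build the new query from the original vector and the top-m patch vectors.
    pool = [query_vec] + [patches[name] for name in original[:top_m]]
    expanded = [sum(column) / len(pool) for column in zip(*pool)]
    new = search(expanded, patches, top_k)
    # Add the new results to the original ones, without duplicates.
    return original + [name for name in new if name not in original]

# Hypothetical frequency vectors over four quantized regions.
patches = {
    "p1": [3, 0, 3, 0],
    "p2": [2, 0, 3, 0],
    "p3": [1, 0, 4, 0],
    "p4": [1, 2, 0, 0],
}
query = [2, 0, 0, 0]

merged = expand_and_search(query, patches)
print(merged)
```

In this toy data, patch p3 is ranked last by the original query but enters the top three for the expanded query, because the top-ranked patches share its third region.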

Using CYC knowledge base

The CYC KB [5] formally represents human knowledge. It consists of a large number of facts and rules of thumb from day-to-day life. To represent this knowledge formally, a language called CycL is used.

The CYC KB consists of terms and assertions related to those terms. C.H.C. Leung and Yuanxi Li proposed using this knowledge base in image search to improve the correctness of the result set of images. They created a framework model for image retrieval based on the CYC knowledge base. The model has three main modules:

An image search interface

A knowledge base for query expansion (Cycorp)

An image database (Flickr)

Figure 2 shows the steps of the image search process using CYC.

When a user enters a query into the image search interface, the query is passed to CYC and, at the same time, Flickr returns the normal results for the original query. CYC then returns the expanded query terms, and the results of the expanded queries are fetched from Flickr.

After all the results have been returned, the user can choose among the expanded queries to get more detailed results. Using the images the user selects, a content based image search is carried out.

Figure 2. Image search using CYC: a flowchart. Steps shown: enter image search query; more senses?; pass query to Flickr, get back results; CYC expansion; pass expanded query to Flickr and interface, get back results; user selects images; is user satisfied?; content based retrieval; refine query and search.

Using Random Sampling SVM

Content Based Image Retrieval (CBIR) is the most common image retrieval process. In it, images are retrieved by giving queries in the form of words or expressions, which is a natural method. CBIR systems face many performance problems because there is a gap between the visual features and the query text provided: visual features are low level, whereas words are high level. Relevance feedback (RF) was introduced to improve the performance of CBIR. With it, the user marks which retrieved results are positive and which are negative, and a classifier is trained from both sets according to the user's preference.

Traditional relevance feedback learning strategies favour positive examples over negative ones: they select the features that discriminate between positive and negative results, and ranking is done according to this capability. RF based on classification has become very popular, especially the SVM (support vector machine) based RF approach, which is very good at classification. Usually the classifier is given a set of samples in order to learn the user's preference. If too few samples are provided, however, it becomes difficult to distinguish between positive and negative images. This is a major drawback of SVM based RF and results in poor performance.

SVM based RF is not suitable when only a small set of samples is provided. Random sampling SVM (Rand SVM) based query expansion was introduced to solve this problem. Since users do not label all of the images retrieved by the system, the number of negative images exceeds the number of positive images. To deal with this asymmetry, Rand SVM uses the advantages of bootstrapping and aggregation. Because negative samples are more numerous, bootstrapping is applied to the whole negative feedback space according to the number of positive feedback samples, so that a small set of negative samples of the same size as the positive set is drawn. A hard SVM classifier is trained on each such collection of samples. Aggregating the classifiers improves on the quality of any individual SVM classifier. Their output is the degree of match between the retrieved images and the query images.
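The bootstrapping and aggregation idea can be sketched in pure Python. As an assumption for this sketch, the SVM is replaced by a small linear classifier trained by sub-gradient descent on the hinge loss, standing in for the hard SVM mentioned above; the two-dimensional feature vectors, the number of classifiers and the learning parameters are likewise invented for illustration.

```python
import random

def train_linear_svm(samples, labels, epochs=200, lr=0.1, lam=0.01):
    """Tiny linear SVM stand-in: sub-gradient descent on the hinge loss."""
    w, b = [0.0] * len(samples[0]), 0.0
    for _ in range(epochs):
        for x, y in zip(samples, labels):
            margin = y * (sum(wi * xi for wi, xi in zip(w, x)) + b)
            w = [wi * (1 - lr * lam) for wi in w]  # regularisation shrinks w
            if margin < 1:  # low-margin or misclassified sample: push it outward
                w = [wi + lr * y * xi for wi, xi in zip(w, x)]
                b += lr * y
    return w, b

def decision(model, x):
    """Signed score of x under one trained model."""
    w, b = model
    return sum(wi * xi for wi, xi in zip(w, x)) + b

def rand_svm(positives, negatives, n_classifiers=5, seed=0):
    """Train one classifier per bootstrap sample of the negative feedback."""
    rng = random.Random(seed)
    models = []
    for _ in range(n_classifiers):
        # Draw as many negatives (with replacement) as there are positives.
        sampled = [rng.choice(negatives) for _ in positives]
        data = positives + sampled
        labels = [1] * len(positives) + [-1] * len(sampled)
        models.append(train_linear_svm(data, labels))
    return models

def aggregate_score(models, x):
    """Aggregate by averaging the decision values of all classifiers."""
    return sum(decision(model, x) for model in models) / len(models)

# Hypothetical feedback: few positive samples, many negative ones.
positives = [[2.0, 2.0], [2.5, 1.5], [1.8, 2.2]]
negatives = [[-1.5 - 0.1 * i, -2.0 + 0.1 * i] for i in range(12)]

models = rand_svm(positives, negatives)
print(aggregate_score(models, [3.0, 3.0]), aggregate_score(models, [-3.0, -3.0]))
```

Each classifier sees a balanced training set, so no single model is swamped by the negative class, and averaging the decision values gives the degree of match used for ranking.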


Using an image ontology [1] for image extraction is a good option for the image search process. In this paper, we therefore first create an image ontology and use it to carry out the image search. To sharpen the search, query expansion is added to the process. After the ontology based search, we pass the results to the query expansion based search and perform the expansion on the newly created database.

Figure (stages of the proposed method): query image; ontology based search; image search with query expansion.

The steps of ontology based query expansion are: extraction of colour, texture and shape features; classification; creation of the ontology; and image retrieval.

Extracting colour

The colour extraction process mainly includes three steps. First, the colour space of the image I is divided into a number of bins (10). Next, the number of pixels in each bin is counted, and histogram linking is done.
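The binning and counting steps can be sketched for a single colour channel as below. The 8-bit value range and the sample pixel values are assumptions for the example. The histogram linking step is not specified in enough detail here to be reproduced, so it is left out.

```python
def colour_histogram(values, bins=10, max_value=256):
    """Count how many channel values fall into each of `bins` equal-width bins."""
    counts = [0] * bins
    for value in values:
        index = min(value * bins // max_value, bins - 1)
        counts[index] += 1
    return counts

# Hypothetical 8-bit values of one colour channel, flattened from a small image.
red_channel = [0, 12, 25, 26, 200, 255, 255, 130]

histogram = colour_histogram(red_channel)
print(histogram)
```

With 10 bins over 256 levels, each bin spans 25.6 levels, so the values 0, 12 and 25 share the first bin while 26 falls into the second.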

Extracting Texture

The proposed system uses the gray level co-occurrence matrix (GLCM) to extract texture features from the image. The matrix values are calculated by counting how many times a pixel with gray level value i is spatially related to a pixel with gray level value j. The texture information is obtained from statistics derived from this matrix. Texture measures include contrast, homogeneity and correlation.
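A minimal sketch of the co-occurrence computation follows, assuming that "spatially related" means the pixel immediately to the right and that the image has already been quantized to four gray levels. Homogeneity is computed here with the common 1 / (1 + (i - j)^2) weighting; other definitions exist, so treat that choice as an assumption.

```python
def glcm(image, levels, dx=1, dy=0):
    """Normalised co-occurrence counts of gray levels (i, j) at offset (dx, dy)."""
    matrix = [[0] * levels for _ in range(levels)]
    rows, cols = len(image), len(image[0])
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            if 0 <= r2 < rows and 0 <= c2 < cols:
                matrix[image[r][c]][image[r2][c2]] += 1
    total = sum(map(sum, matrix))
    return [[value / total for value in row] for row in matrix]

def contrast(p):
    """Weights each co-occurrence by the squared gray level difference."""
    n = len(p)
    return sum(p[i][j] * (i - j) ** 2 for i in range(n) for j in range(n))

def homogeneity(p):
    """Weights each co-occurrence by closeness to the matrix diagonal."""
    n = len(p)
    return sum(p[i][j] / (1 + (i - j) ** 2) for i in range(n) for j in range(n))

# Small example image with four gray levels (0 to 3).
image = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 2, 2, 2],
    [2, 2, 3, 3],
]

p = glcm(image, levels=4)
print(round(contrast(p), 4), round(homogeneity(p), 4))
```

The matrix here is not made symmetric; a symmetric variant would also count each pair in the reverse direction.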

Extracting Shape

In shape extraction, smoothing is first applied to remove noise. Gradients are then computed, and double thresholding is used to find strong and weak edges. Finally, edges are tracked using hysteresis.
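These steps match the usual description of the Canny edge detector. The sketch below covers only the last two, double thresholding and hysteresis tracking, and assumes that a gradient magnitude grid has already been computed from the smoothed image; the grid values and the two thresholds are invented for illustration.

```python
def hysteresis(magnitude, low, high):
    """Keep strong edges, plus weak edges connected to them (8-neighbourhood)."""
    rows, cols = len(magnitude), len(magnitude[0])
    strong = [(r, c) for r in range(rows) for c in range(cols)
              if magnitude[r][c] >= high]
    edges = set(strong)
    stack = list(strong)
    while stack:
        r, c = stack.pop()
        for dr in (-1, 0, 1):
            for dc in (-1, 0, 1):
                nr, nc = r + dr, c + dc
                if (0 <= nr < rows and 0 <= nc < cols
                        and (nr, nc) not in edges
                        and magnitude[nr][nc] >= low):
                    # A weak pixel touching an accepted edge becomes an edge.
                    edges.add((nr, nc))
                    stack.append((nr, nc))
    return [[1 if (r, c) in edges else 0 for c in range(cols)] for r in range(rows)]

# Hypothetical gradient magnitudes: one strong pixel, a weak chain, one stray weak pixel.
magnitude = [
    [0, 0, 0, 0, 0],
    [9, 5, 4, 0, 0],
    [0, 0, 0, 0, 5],
    [0, 0, 0, 0, 0],
]

edge_map = hysteresis(magnitude, low=3, high=8)
for row in edge_map:
    print(row)
```

The weak pixels next to the strong one are kept because they are connected to it, while the isolated weak pixel in the third row is discarded as probable noise.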


Images are classified using certain criteria. Classification is the process of forming groups of entities, and it requires domain knowledge.

Construction of ontology

In constructing the ontology, the terminology of the domain is collected and concepts are identified manually. A concept hierarchy is then created and extended with the classified information. Next, the relationships between concepts are defined, and the ontology is populated with feature values and instances. In this way, the ontology is created.

Image retrieval using ontology

In normal content based systems, the features of the query image are matched with the features of the images in the database, and similar images are retrieved. In the proposed system, which uses an ontology, a normal CBIR based search is done first and the most similar image is found. Using this image, an image ontology is constructed by the process described above. By referring to this ontology, all the related and relevant pictures are retrieved from the given database.
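The retrieval step can be sketched with a toy concept hierarchy. Everything here is hypothetical: the concepts, the two-element feature vectors and, in particular, the rule that "related" means every instance under the parent concept of the best match. For simplicity the ontology is built in advance rather than from the best-matching image.

```python
# Hypothetical concept hierarchy: child concept -> parent concept.
PARENT = {"rose": "flower", "tulip": "flower", "oak": "tree",
          "flower": "plant", "tree": "plant"}

# Hypothetical instances: image name -> (concept, feature vector).
INSTANCES = {
    "rose1.jpg": ("rose", [0.9, 0.1]),
    "rose2.jpg": ("rose", [0.8, 0.2]),
    "tulip1.jpg": ("tulip", [0.6, 0.3]),
    "oak1.jpg": ("oak", [0.1, 0.9]),
}

def closest_image(query_vec):
    """CBIR step: the stored image whose features are nearest to the query."""
    def squared_distance(name):
        return sum((a - b) ** 2 for a, b in zip(query_vec, INSTANCES[name][1]))
    return min(INSTANCES, key=squared_distance)

def concepts_under(root):
    """Collect the root concept and all of its descendants."""
    found = {root}
    changed = True
    while changed:
        changed = False
        for child, parent in PARENT.items():
            if parent in found and child not in found:
                found.add(child)
                changed = True
    return found

def retrieve(query_vec):
    """Find the best match, then return every image under its parent concept."""
    best = closest_image(query_vec)
    concept = INSTANCES[best][0]
    scope = concepts_under(PARENT.get(concept, concept))
    related = sorted(name for name, (c, _) in INSTANCES.items() if c in scope)
    return best, related

best, related = retrieve([0.88, 0.12])
print(best, related)
```

The tulip image is returned even though its features are further from the query, because the ontology places it under the same parent concept as the best match.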


The processing that takes place when a user enters a query or a sample image may differ between search engines. A good deal of research has been carried out in the area of image search.

Content based search was a promising technique that came after the traditional database search method, but even this method has limitations. Studies show that query expansion can be useful in improving image search results. Researchers have worked on different combinations of searches that include query expansion in different ways. The normal content based method can also be improved by adding a relevance feedback module.

In this paper, we have proposed a new hybrid method that combines ontology based and query expansion based image search. After searching with the contents and features of images, query expansion helps the image search return semantically related images as well. Altogether, query expansion can be regarded as the most promising technique for image search.