Precision And Recall Method Computer Science Essay

The prototype, implemented according to the proposed design, is tested and evaluated in this chapter against a realistic web scenario. In any information retrieval (IR) system, the main goal is high precision, that is, a high proportion of relevant documents among the results produced.

Retrieval in an IR system begins when a user enters a query, which is a natural-language statement of an information need. A query can match several sets of results, with different degrees of relevance. Most IR systems compute a numeric score that measures how well each object matches the query and rank the objects by that score. In this system, we used the precision-recall method to evaluate the relevance of the results.

Precision - Recall Method

The precision and recall method, a statistical classification measure, is the most widely used method for evaluating IR models. It is applied to a specific document collection and a given set of queries. Precision is a measure of exactness, and recall can be seen as a measure of completeness. Precision-recall curves are calculated by averaging, over all queries posted to the system, the precision obtained at each standard recall level.

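Precision is the fraction of retrieved documents that are relevant to the user's search requirements. All the retrieved documents are taken into account in precision. It is obtained from the following formula:

$\text{Precision} = \frac{|\{\text{relevant documents}\} \cap \{\text{retrieved documents}\}|}{|\{\text{retrieved documents}\}|}$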

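Recall is the fraction of the documents relevant to the query that are successfully retrieved. It is obtained from the following formula:

$\text{Recall} = \frac{|\{\text{relevant documents}\} \cap \{\text{retrieved documents}\}|}{|\{\text{relevant documents}\}|}$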
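The two definitions can be illustrated with a minimal Python sketch. The function names and the document identifiers are hypothetical and are not taken from OntoWS; the example only assumes that retrieved and relevant documents are available as collections of identifiers.

```python
def precision(retrieved, relevant):
    """Fraction of the retrieved documents that are relevant."""
    retrieved, relevant = set(retrieved), set(relevant)
    if not retrieved:
        return 0.0
    return len(retrieved & relevant) / len(retrieved)


def recall(retrieved, relevant):
    """Fraction of the relevant documents that were retrieved."""
    retrieved, relevant = set(retrieved), set(relevant)
    if not relevant:
        return 0.0
    return len(retrieved & relevant) / len(relevant)


# Hypothetical data: 10 relevant documents, 6 retrieved, 5 of them relevant.
relevant_docs = {f"d{i}" for i in range(1, 11)}
retrieved_docs = ["d1", "d2", "d3", "d4", "d5", "x1"]

p = precision(retrieved_docs, relevant_docs)
r = recall(retrieved_docs, relevant_docs)
print(f"precision={p:.4f} recall={r:.2f}")
```

With these made-up inputs, five of the six retrieved documents are relevant and five of the ten relevant documents are found, so precision is 5/6 and recall is 0.5.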

Information retrieval models are compared using precision-recall diagrams, as illustrated in Figure 5.1.

Figure 5.1. Recall and precision over queries entered to the IR system (axes: Precision and Recall, each from 0 to 100%)

Evaluation Objectives

The main objective of this experiment was to verify that the proposed web searching mechanism produces the most suitable results for user queries. The main approach to evaluating the retrieval precision of the ontology-based web search system is to compare it against a keyword-based web search system using precision-recall values.

Experimental Design

Sampling

The conventional way to evaluate an IR system is to select a collection of more than a thousand documents that contain the keywords in the user queries. This was not feasible within the project timeline. Therefore I selected a collection of about 150 HTML documents, which includes 10 relevant documents for each user query. The selection was based on the occurrence of keywords in the documents. A total of 10 queries were selected for the evaluation. The same document collection is used for both the ontology-based web search mechanism and the keyword-based web search.

Testing procedure

The selected document collection is indexed in the ontology-based web searching mechanism according to the OntoWS indexing approach, and in the conventional keyword-based web search according to its own indexing approach.

Each of the 10 selected queries is submitted to both OntoWS and the keyword-based web search.

The search results are obtained from OntoWS and the conventional keyword search, and the precision and recall values are computed.

The precision-recall curves are drawn from the results.
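The third step can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the procedure actually used in the experiment: the function name is hypothetical, and the ranking is an invented one, chosen only so that relevant documents sit at ranks 1, 2, 3, 4, 6, 8, 10, 12, 16 and 24. Treating relevant documents that are never retrieved as precision 0 is also an assumption.

```python
def precision_at_recall_points(ranking, relevant):
    """Walk down a ranked result list and record (recall, precision)
    each time another relevant document is found."""
    relevant = set(relevant)
    points = []
    hits = 0
    for rank, doc in enumerate(ranking, start=1):
        if doc in relevant:
            hits += 1
            points.append((hits / len(relevant), hits / rank))
    # Assumption: relevant documents that were never retrieved contribute
    # precision 0 at the remaining recall levels.
    for missing in range(hits + 1, len(relevant) + 1):
        points.append((missing / len(relevant), 0.0))
    return points


# Hypothetical ranking of 24 results with 10 relevant documents.
relevant_ranks = {1, 2, 3, 4, 6, 8, 10, 12, 16, 24}
ranking = [f"rel{r}" if r in relevant_ranks else f"non{r}" for r in range(1, 25)]
relevant = {f"rel{r}" for r in relevant_ranks}

curve = precision_at_recall_points(ranking, relevant)
for rec, prec in curve:
    print(f"recall={rec:.1f} precision={prec:.2%}")
```

With 10 relevant documents per query, each newly found relevant document moves recall up by 0.1, which gives the ten recall levels used in the result tables.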

Results Analysis

OntoWS

As set out in the experimental design, 10 different queries were submitted to OntoWS, which retrieved a set of documents as the search result for each query. The precision and recall values of the retrieved documents for Query 1 are given in Table 5.1 and plotted in Figure 2.

Table 5.1. Precision-Recall Values for Query 1 in OntoWS

Recall      0.1      0.2      0.3      0.4      0.5      0.6      0.7      0.8      0.9      1
Precision   100%     100%     100%     100%     83.33%   75.00%   70%      66.67%   56.25%   41.67%

Figure 2. Precision-recall curve for Query 1 in OntoWS

The precision and recall values of the retrieved documents for Query 2 are given in Table 5.2 and plotted in Figure 3.

Table 5.2. Precision-Recall Values for Query 2 in OntoWS

Recall      0.1      0.2      0.3      0.4      0.5      0.6      0.7      0.8      0.9      1
Precision   100%     100%     100%     100%     71.43%   60.00%   53.85%   47.06%   39.13%   35.71%

Figure 3. Precision-recall curve for Query 2 in OntoWS

The precision and recall values of the documents retrieved for Queries 3-10 were obtained in the same way. Finally, the average precision at each recall level was computed for OntoWS; it is given in Table 5.3 and plotted in Figure 4.

Table 5.3. Average Precision-Recall Values for Query 1-10 in OntoWS

Recall              0.1      0.2      0.3      0.4      0.5      0.6      0.7      0.8      0.9      1
Average Precision   100%     100%     100%     98%      85.24%   71.64%   61.85%   51.53%   43.72%   36.42%

Figure 4. Average precision-recall curve for Query 1-10 in OntoWS
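The averaging step behind Table 5.3 can be sketched as below. This is a minimal illustration, assuming each query's curve is a list of precision values at the same ten recall levels; the function name is hypothetical. Only the Query 1 and Query 2 values from Tables 5.1 and 5.2 are used, because the per-query values for Queries 3-10 are not reported, so the output does not reproduce Table 5.3.

```python
RECALL_LEVELS = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0]


def average_curves(curves):
    """Average the precision at each recall level over several queries."""
    n = len(curves)
    return [sum(values) / n for values in zip(*curves)]


# Precision (%) per recall level, copied from Tables 5.1 and 5.2.
q1 = [100, 100, 100, 100, 83.33, 75.00, 70, 66.67, 56.25, 41.67]
q2 = [100, 100, 100, 100, 71.43, 60.00, 53.85, 47.06, 39.13, 35.71]

avg = average_curves([q1, q2])
for level, value in zip(RECALL_LEVELS, avg):
    print(f"recall={level:.1f} average precision={value:.2f}%")
```

Passing all ten per-query curves to the same function would give the averaged curve in the form reported in the table.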

Keyword Search

As set out in the experimental design, the same 10 queries were submitted to the conventional keyword search, which retrieved a set of documents as the search result for each query. The precision and recall values of the retrieved documents for Query 1 are given in Table 5.4 and plotted in Figure 5.

Table 5.4. Precision-Recall Values for Query 1 in Conventional Keyword Search

Recall      0.1      0.2      0.3      0.4      0.5      0.6      0.7      0.8      0.9      1
Precision   100%     100%     60.00%   33.33%   38.46%   30%      41.18%   26.67%   0.00%    0.00%

Figure 5. Precision-recall curve for Query 1 in Conventional Keyword Search

The precision and recall values of the retrieved documents for Query 2 are given in Table 5.5 and plotted in Figure 6.

Table 5.5. Precision-Recall Values for Query 2 in Conventional Keyword Search

Recall      0.1      0.2      0.3      0.4      0.5      0.6      0.7      0.8      0.9      1
Precision   100%     22.22%   30.00%   26.67%   27.78%   30%      23.33%   0.00%    0.00%    0.00%

Figure 6. Precision-recall curve for Query 2 in Conventional Keyword Search

The precision and recall values of the documents retrieved for Queries 3-10 were obtained in the same way. Finally, the average precision at each recall level was computed for the conventional keyword search; it is given in Table 5.6 and plotted in Figure 7.

Table 5.6. Average Precision-Recall Values for Query 1-10 in Conventional Keyword Search

Recall      0.1      0.2      0.3      0.4      0.5      0.6      0.7      0.8      0.9      1
Precision   100%     61.93%   52.39%   37.52%   44.38%   36.34%   25.45%   8.63%    0.00%    0.00%

Figure 7. Average precision-recall curve for Query 1-10 in Conventional Keyword Search

OntoWS and the conventional keyword search are compared using the average precision of each approach at every recall level. The comparison data are given in Table 5.7 and plotted in Figure 8.

Table 5.7. Average Precision-Recall Values of OntoWS and Keyword Search

Recall                        0.1       0.2       0.3       0.4       0.5       0.6       0.7       0.8       0.9       1
Precision of OntoWS           100.00%   100.00%   100.00%   98.00%    85.24%    71.64%    61.85%    51.53%    43.72%    36.42%
Precision of Keyword Search   100.00%   61.93%    52.39%    32.57%    44.38%    36.34%    25.45%    8.63%     0.00%     0.00%

Figure 8. Comparative precision curves, OntoWS vs Keyword Search

The precision-recall curves of OntoWS and the keyword search are compared in Figure 8. Considering each query separately, OntoWS gives a higher precision than the conventional keyword search for a given query. At the first recall level, both approaches retrieved relevant documents first, as is apparent from the high precision values that both produce. At the second recall level, precision drops sharply in the keyword search compared with OntoWS for every query except queries 1, 4, 6 and 7. At the third recall level, precision drops sharply in the keyword search compared with OntoWS for every query except queries 2, 4, 6, 9 and 10. On average, however, the precision of the keyword search is lower than that of OntoWS. Some exceptional cases occurred in the keyword search precision values that contradict the main conclusion. A larger sample is required to obtain better results when evaluating IR systems, so the reason for these exceptional cases is the small size of the sample.
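The comparison of the averaged curves can be checked with a short Python sketch that uses the values from Table 5.7. The mean over the ten recall levels is a simple summary added here for illustration; it is an assumption of this sketch and is not a figure reported in the evaluation.

```python
RECALL_LEVELS = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0]

# Average precision (%) per recall level, copied from Table 5.7.
ontows = [100.00, 100.00, 100.00, 98.00, 85.24, 71.64, 61.85, 51.53, 43.72, 36.42]
keyword = [100.00, 61.93, 52.39, 32.57, 44.38, 36.34, 25.45, 8.63, 0.00, 0.00]

# Difference in percentage points at each recall level.
gaps = [o - k for o, k in zip(ontows, keyword)]

# Illustrative summary: mean precision over the ten recall levels.
mean_ontows = sum(ontows) / len(ontows)
mean_keyword = sum(keyword) / len(keyword)

for level, gap in zip(RECALL_LEVELS, gaps):
    print(f"recall={level:.1f} OntoWS - keyword = {gap:.2f} points")
print(f"mean precision: OntoWS={mean_ontows:.2f}% keyword={mean_keyword:.2f}%")
```

On these averaged values the two approaches are equal at the first recall level, and OntoWS is higher at every other level.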

Conclusion and Future Works

Conclusion

This research focused on a problem of conventional keyword search, which tends to reduce the interaction between people and the web, and presented an effective mechanism for representing the results of a meaningful web search. The proposed solution is based on a conceptual space, namely an ontology. It depends entirely on a domain ontology, currently a breast cancer ontology.

During the implementation phase there were many issues to solve. Finding a good domain ontology was a major challenge, and the requirement for a conceptual space was eventually fulfilled by a standard ontology. The ontology selected at the beginning of the research caused many problems during implementation, so another domain ontology had to be selected, which greatly affected the progress of this research. Another problem was selecting suitable APIs for the implementation of OntoWS. There are various APIs for ontology processing, such as the Jena Ontology API, the OWL API and SOFA (Simple Ontology Framework API). The OWL API was selected because it supports OWL 2 and has an efficient in-memory reference implementation, which is highly useful in a web searching approach. However, additional effort and time had to be spent on understanding the API.

The proposed ontology-based web searching mechanism brings a new approach, embedded semantic information retrieval, to the information retrieval area. The efficiency of the ontology-based web searching approach is higher than that of ordinary keyword searching. The approach also relies heavily on concepts, properties, hierarchy, classes and relationships. The evaluation phase was therefore conducted to measure the performance of OntoWS against the conventional keyword search. In that evaluation, OntoWS produced higher precision-recall values for the retrieved documents than the conventional keyword searching approach.

Future Works

The ontology-based web search mechanism explored in this research is currently suited to a single domain only. We hope to extend this research in future so that it can be applied to a wide range of domains. Since an ontology can import other ontologies, experimentation and evaluation can be extended to integrate several areas.

Nowadays, personalized web search is one of the most prominent requirements in the web searching area. The current solution does not support web personalization. We therefore hope to extend this research in future into a solution that applies web personalization.

Acknowledgement

Foremost, I would like to express my sincere gratitude to my supervisor, Dr. K.L. Jayaratne, senior lecturer, University of Colombo School of Computing. His wide knowledge, encouragement and personal guidance have been of great value and provided a good basis for the present thesis. I am deeply grateful for his detailed and constructive comments, and for his important support throughout this work.

I warmly thank Mr Dulan Wathugala, coordinator of the final year project, for his valuable and friendly help and his important support throughout this research. His broad discussions of my work and motivating explorations in operations have also been very supportive for this research.

I warmly thank Miss L.N.C. DeSilva, coordinator of the final year project, for her assistance throughout this project.

I wish to thank all the colleagues who helped me a great deal in carrying out my work. I also warmly thank my parents for giving birth to me and supporting me spiritually throughout my life.

Abstract

The World Wide Web has grown like a tree that has spread its branches into all areas. It can thus be identified as the largest data repository in the world and a key driving force for large-scale information technology. As the amount of content increases, it has become difficult to build an interactive web search with traditional keyword search. The idea presented here is to improve the searching process with information extracted from a semantic model of the domain. Ontology is the backbone of semantic web technologies.

One of the greatest problems of traditional search engines is that they are typically based on keyword processing. Because of the amount of information and the variety of tools used for searching, finding information on the web is becoming more difficult. In addition, traditional web search engines often deliver results that include a number of mismatched items. This causes low precision in the search results, which can reduce people's tendency to retrieve information through web searching. This motivated the development of an ontology-based web searching mechanism for a particular domain.

The aim of this project is to solve the problem of reduced interaction between people and the web that arises when people get mismatched information from an ordinary keyword search. The main goal of this project is to present an effective mechanism for representing the results of a meaningful web search in a particular domain. Currently, for simplicity, the cancer care domain is used to implement this ontology-based web searching system.

This is a novel web searching mechanism based on an ontology, which means the proposed system is a semantically enabled web search mechanism. Its major components are the keyword extraction component, the domain ontology, the semantic mapping component, the pre-processing component, the indexing component and the concept-to-concept matching component.

The domain reference terms are represented through the domain ontology. The user interface allows users to describe their search needs in natural language. The keyword extraction component extracts the optimal keywords from the user's query. The semantic mapping component then maps those keywords and produces, as output, the best concept for each user keyword.
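The flow from query to concept can be sketched in Python as follows. Everything in this sketch is an assumption made for illustration: the stop-word list, the concept names and their labels are invented, and a plain label lookup stands in for the semantic mapping that OntoWS performs against the domain ontology.

```python
import re

# Hypothetical stop-word list used to keep only content-bearing keywords.
STOP_WORDS = {"what", "are", "the", "of", "and", "is", "a", "in", "for", "to"}

# Hypothetical concepts and labels; a real system would read these
# from the domain ontology instead of a hard-coded dictionary.
CONCEPT_LABELS = {
    "Mammography": {"mammogram", "mammography", "screening"},
    "Chemotherapy": {"chemotherapy", "chemo"},
    "Symptom": {"symptom", "symptoms", "lump", "pain"},
}


def extract_keywords(query):
    """Lower-case the query, split it into words and drop stop words."""
    tokens = re.findall(r"[a-z]+", query.lower())
    return [t for t in tokens if t not in STOP_WORDS]


def concept_for(keyword):
    """Return the first concept whose labels contain the keyword, else None."""
    for concept, labels in CONCEPT_LABELS.items():
        if keyword in labels:
            return concept
    return None


def map_keywords(keywords):
    """Map each keyword to a concept, skipping keywords with no match."""
    mapping = {}
    for keyword in keywords:
        concept = concept_for(keyword)
        if concept is not None:
            mapping[keyword] = concept
    return mapping


query = "What are the symptoms of breast cancer and is chemotherapy needed?"
keywords = extract_keywords(query)
print(keywords)
print(map_keywords(keywords))
```

A label lookup cannot rank competing concepts, so choosing the "best" concept in the sense described above would need additional scoring that this sketch leaves out.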

We take a multi-concept approach, meaning that each document can have more than one concept. This makes it necessary to find the best concept of each document before the whole process starts, so a pre-processing mechanism is introduced into the system. Pre-processing is applied to the data collection by performing keyword extraction (page title, text content), indexing, and gathering the best concept of each document. Document indexing applies markup and format removal, tokenization, filtering, stemming and weighting. Finally, the concept-to-concept mapping component maps the pre-processed data collection, categorized by concept, to the best concept that represents the keyword given by the user. The key contribution of this research is this idea of concept-to-concept mapping, which provides more accurate results for the user.
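The document indexing steps named above can be sketched with the Python standard library. This is a minimal sketch under stated assumptions, not the OntoWS indexer: the stop-word list is invented, the suffix-stripping function is a crude stand-in for a real stemmer such as Porter's, and relative term frequency is assumed as the weighting scheme because the text does not specify one.

```python
import re
from collections import Counter
from html.parser import HTMLParser

# Hypothetical stop-word list for the filtering step.
STOP_WORDS = {"a", "an", "and", "the", "of", "in", "is", "to", "for", "are"}


class TextExtractor(HTMLParser):
    """Collects text content and skips the bodies of script and style tags."""

    def __init__(self):
        super().__init__()
        self.parts = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth:
            self.parts.append(data)


def strip_markup(html):
    """Markup and format removal."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return " ".join(parser.parts)


def tokenize(text):
    """Tokenization into lower-case alphabetic words."""
    return re.findall(r"[a-z]+", text.lower())


def naive_stem(token):
    """Crude suffix stripping; a placeholder for a real stemming algorithm."""
    for suffix in ("ing", "es", "s"):
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token


def index_document(html):
    """Return a term -> weight map using relative term frequency (assumed)."""
    tokens = [t for t in tokenize(strip_markup(html)) if t not in STOP_WORDS]
    counts = Counter(naive_stem(t) for t in tokens)
    total = sum(counts.values())
    return {term: n / total for term, n in counts.items()} if total else {}


# Hypothetical HTML document.
doc = ("<html><head><title>Breast cancer screening</title></head>"
       "<body><p>Screening detects tumours early.</p></body></html>")
weights = index_document(doc)
print(weights)
```

Because the page title is part of the extracted text, a term that appears in both the title and the body, such as "screening" here, receives a higher weight than terms that appear once.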

The proposed ontology-based web searching model provides an efficient web searching mechanism for a particular domain. The search results of this approach are more accurate and efficient than those of keyword-based searching. The approach depends heavily on the domain ontology, which is built from relationships, hierarchy, properties, classes and defined rules.

Aims and objectives

This research focused on a problem of conventional keyword search, which tends to reduce the interaction between people and the web, and on presenting an effective mechanism for representing the results of a meaningful web search. The aim of this project is therefore to solve the problem of reduced interaction between people and the web that arises when people get mismatched information from a conventional keyword search. The main goal of this project is to present an effective mechanism for representing the results of a meaningful web search in the cancer care domain. Currently, the searching environment is the breast cancer domain. The main focus of the proposed solution is a web searching mechanism for the breast cancer environment, which lets users rediscover and refresh the way they surf the web, and allows data and information about breast cancer to be shared and reused. This model will avoid the knowledge acquisition bottleneck \cite{prop:4}. The main objective of this project is to develop an ontology-based web searching mechanism using semantic web concepts \cite{prop:5}. It can be applied in the health care domain and is usable by all kinds of people, especially doctors, patients, researchers and organizations related to health care \cite{prop:6}.

Users will interact with their personal agent using natural language.

Scope

The proposed solution depends entirely on one domain ontology. For simplicity, the project is restricted throughout to a breast cancer domain ontology; applying the approach to other domains is left as future work. The breast cancer domain was selected to simplify the project and because breast cancer is one of the most prominent areas in the health care domain.

Chapter 2 consists of the literature review and background, which discuss related techniques and existing approaches where available.

Chapter 3 covers the design and methodology, which elaborate the architecture of the proposed solution.

Chapter 4 covers the implementation of the proposed solution, the Ontology based Web searching Mechanism for Information Retrieval (OntoWS).

Chapter 5 covers the experimental design for the evaluation of the proposed solution, the experimental results and their analysis.

Chapter 6 consists of the conclusion of the research and the work planned to extend it.

\begin{table}[h]

\caption{Precision Recall Values for Query 1 in OntoWS}

\centering

\begin{tabular}{|l|l|l|l|l|l|l|l|l|l|l|} \hline

Recall & 0.1 & 0.2 & 0.3 & 0.4 & 0.5 & 0.6 & 0.7 & 0.8 & 0.9 & 1 \\ \hline

Precision & 100\% & 100\% & 100\% & 100\% & 83.33\% & 75\% & 70\% & 66.67\% & 56.25\% & 41.67\% \\ \hline

\end{tabular}

\label{table:two}

\end{table}
