Efficient Database Driven Reverse Mapping Dictionary

By Matt Swarbrick

✅ Paper Type: Free Essay	✅ Subject: Computer Science
✅ Wordcount: 2061 words	✅ Published: 04 Apr 2018


Building an Efficient Database Driven Reverse Mapping Dictionary

ABSTRACT

With the enormous number of words in use, finding the right meaning is always a challenge. Even a versatile speaker may struggle to find the meaning of an unfamiliar word. In such cases a reference source such as a dictionary is needed. The traditional dictionary model implements the forward concept: a lookup returns a set of definitions, which may be lengthy phrases. This can confuse the user with several different senses, or the user may not understand the detailed definition. To overcome this, we provide a reverse dictionary in which, for any phrase or word, the appropriate single-word meaning is returned. The system also returns a relevant word even if the input word is not available in the database, and it produces instant output for the user's input.

1. INTRODUCTION AND RELATED WORKS

1.1 ABOUT THE PROJECT:-

Reverse Dictionary:-

A reverse dictionary is a dictionary organized in a non-standard order that provides the user with information that would be difficult to obtain from a traditionally alphabetized dictionary. For example, A Reverse Dictionary of the Spanish Language and Walker’s Rhyming Dictionary are reverse dictionaries in which each entry word is sorted by its last letter, followed by the preceding letters in order toward the beginning of the word. Consequently, in these reverse dictionaries all words that have the same suffix appear together. Such a reverse dictionary is useful for linguists and poets looking for words ending with a particular suffix, or for an anthropologist or forensics specialist examining a damaged text (e.g. a stone inscription or a burned document) in which only the final portion of a particular word is preserved.
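The suffix ordering described above can be sketched in a few lines of Python. This is only an illustration: the word list and the helper names `reverse_sort` and `words_with_suffix` are assumptions made for the example and are not taken from any published reverse dictionary.

```python
from bisect import bisect_left


def reverse_sort(words):
    """Order words by their spelling read from the last letter to the first."""
    return sorted(words, key=lambda w: w[::-1])


def words_with_suffix(ordered, suffix):
    """Return the contiguous run of words ending in `suffix`.

    `ordered` must already be sorted with reverse_sort, so that words
    sharing a suffix sit next to each other.
    """
    keys = [w[::-1] for w in ordered]      # reversed spellings, in sorted order
    prefix = suffix[::-1]                  # a suffix becomes a prefix when reversed
    start = bisect_left(keys, prefix)      # first key that could match
    matches = []
    for word, key in zip(ordered[start:], keys[start:]):
        if not key.startswith(prefix):
            break                          # left the run of matching words
        matches.append(word)
    return matches


# Hypothetical sample vocabulary.
words = ["nation", "running", "station", "sing", "ration", "bring"]
ordered = reverse_sort(words)
print(ordered)
print(words_with_suffix(ordered, "tion"))
```

Because words with a common ending are adjacent after sorting, a suffix query becomes a binary search followed by a short scan, which is the same property a printed reverse dictionary offers to a reader.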

Reverse dictionaries of this type have been published for most major alphabetical languages. By way of contrast, in a standard dictionary words with the same prefix appear in order, since sorting starts with the first letter of the entry word and proceeds toward the end of the word. Reverse dictionaries of this type were difficult to produce before the advent of the electronic computer and have become more common since the first computer-sorted one appeared in 1974.

Another use of the term “reverse dictionary” is for a reference work that is organized by concepts, phrases, or the definitions of words. This is in contrast to a standard dictionary, in which words are indexed by headword, but similar in function to a thesaurus, where one can look up a concept by some common, general word and then find a list of near-synonyms of that word. (For example, in a thesaurus one could look up “doctor” and be presented with such words as healer, physician, surgeon, M.D., medical man, medicine man, academician, professor, scholar, sage, master, expert.) In theory, a reverse dictionary might go further than this, allowing you to find a word by its definition alone. Such dictionaries have become more practical with the advent of computerized information storage and retrieval systems.

Online Dictionary:

An online reverse dictionary lets you describe a concept and get back a list of words and phrases related to that concept. The description can be a few words, a sentence, a question, or even a single word, typed into a search box. Short descriptions give the best results. In most cases you get back a list of related terms with the best matches shown first.

How does it work?

The online reverse dictionary indexes hundreds of online dictionaries, encyclopedias, and other reference sites. Its standard search shows a list of definition links for any word typed in. The reverse search does the opposite: it searches the indexed references for words whose definitions are conceptually similar to the words in the query, using an assortment of statistical language-processing techniques.

Online reverse dictionary (RD): As opposed to a regular (forward) dictionary, which maps words to their definitions, an RD performs the converse mapping: given a phrase describing the desired concept, it provides words whose definitions match the entered phrase. For example, suppose a forward dictionary informs the user that the meaning of the word “spelunking” is “exploring caves.” A reverse dictionary, on the other hand, lets the user enter the phrase “check out natural caves” as input and expect to receive the word “spelunking” (and possibly other words with similar meanings) as output. Effectively, the RD addresses the “word is on the tip of my tongue, but I can’t quite remember it” problem. Writers are particularly affected by this problem, including students, professional writers, scientists, marketing and advertising professionals, and teachers. In fact, for most people with a certain level of education, the problem is often not a lack of knowledge of the meaning of a word but, rather, an inability to recall the appropriate word on demand. The RD addresses this widespread problem.
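The difference between the forward and the converse mapping can be illustrated with a minimal sketch. It assumes a tiny hand-written dictionary and scores candidates by the number of words shared with each definition. The names `build_reverse_index` and `reverse_lookup` are hypothetical, and plain word overlap is a deliberately naive stand-in for the conceptual matching a real RD needs.

```python
# Forward dictionary: word -> definition (sample entries for illustration only).
forward = {
    "spelunking": "exploring caves",
    "angling": "fishing with a rod and line",
    "philately": "collecting postage stamps",
}


def build_reverse_index(forward):
    """Map each definition term to the set of words whose definition uses it."""
    index = {}
    for word, definition in forward.items():
        for term in definition.lower().split():
            index.setdefault(term, set()).add(word)
    return index


def reverse_lookup(index, phrase):
    """Return candidate words, best first, by count of shared definition terms."""
    scores = {}
    for term in phrase.lower().split():
        for word in index.get(term, ()):
            scores[word] = scores.get(word, 0) + 1
    # Highest score first; ties are broken alphabetically.
    return sorted(scores, key=lambda w: (-scores[w], w))


index = build_reverse_index(forward)
print(forward["spelunking"])                              # forward lookup
print(reverse_lookup(index, "check out natural caves"))   # reverse lookup
```

Here the query succeeds only because it happens to share the word “caves” with the stored definition. A phrase with the same meaning but no shared word would return nothing.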

2. EXISTING SYSTEM:-

Because users need a reference for unfamiliar words, they turn to a source such as a dictionary for better understanding. A reverse dictionary must perform well enough to allow online interaction with users, but current semantic similarity measurement schemes are highly computationally intensive. In these schemes, concepts are represented as vectors in a feature (or keyword) space. The two most common methods of achieving this, latent semantic indexing (LSI) and principal component analysis (PCA), both analyze the keywords of documents in a corpus to identify the dominant concepts in each document. These dominant concepts are then represented as vectors in the keyword space and used as the basis of similarity comparison for classification. In most implementations of Concept Similarity Problem (CSP) solutions, vectorization is done a priori, and at runtime only vector distances are computed.
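The split between a priori vectorization and runtime distance computation can be sketched as follows. This is a simplified illustration under stated assumptions: it uses raw term counts as the keyword-space vectors rather than LSI or PCA, and the sample definitions and the function names `vectorize`, `cosine`, and `rank` are invented for the example.

```python
import math
from collections import Counter


def vectorize(text):
    """Represent a text as a term-count vector in keyword space."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * \
        math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


# Sample definitions (assumed for illustration).
definitions = {
    "spelunking": "exploring caves",
    "garrulous": "full of trivial conversation",
}

# Done once, ahead of time: every definition is vectorized a priori.
vectors = {word: vectorize(text) for word, text in definitions.items()}


def rank(query):
    """At runtime only the query is vectorized and distances are computed."""
    q = vectorize(query)
    return sorted(vectors, key=lambda w: cosine(q, vectors[w]), reverse=True)


print(rank("exploring natural caves"))
```

Even with vectors precomputed, each query is compared against every stored vector, so the runtime cost grows with the size of the dictionary.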

Drawbacks

It requires the user’s input phrase to contain words that exactly match a dictionary definition;
It does not scale well—for a dictionary containing more than 100,000 defined words, where each word may have multiple definitions, it would require potentially hundreds of thousands of queries to return a result.

3. PROPOSED SYSTEM:-

We report the creation of the Wordster Reverse Dictionary (WRD), a database-driven RD system that attempts to address the core issues identified above. The WRD not only fulfils the functional objectives outlined above, it does so with an order-of-magnitude improvement in performance and scale over the best concept similarity measurement schemes available, without impacting solution quality. We also demonstrate that the WRD is far better in solution quality than the two commercial RDs available.

Our reverse dictionary system is based on the notion that a phrase that conceptually describes a word should resemble the word’s actual definition: if it does not match the exact words, it should at least be conceptually similar. Consider, for example, the following concept phrase: “talks a lot, but without much substance.” Based on such a phrase, a reverse dictionary should return words such as “gabby,” “chatty,” and “garrulous.” However, a definition of “garrulous” in a dictionary might actually be “full of trivial conversation,” which is obviously close in concept but contains no exactly matching words. In our RD, a user inputs a phrase describing an unknown term of interest. Since an input phrase might satisfy the definition of multiple words, an RD should return a set of possible matches from which the user may select a term. This is complex, however, because the user is unlikely to enter a definition that exactly matches one found in a dictionary.

The meaning of the phrase the user entered should be conceptually similar enough to an actual dictionary definition to generate a set of possible matches, e.g., returning to the “talks a lot, but without much substance” example, our reverse dictionary should return words like “garrulous.”
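The “garrulous” example can be made concrete with a small sketch. It assumes a hand-written table, `CONCEPTS`, that maps related words to a shared concept label. That table, the stop-word list, and the functions `concepts` and `candidates` are all hypothetical and serve only to show why conceptual matching succeeds where exact word matching fails; the source does not specify how the WRD computes similarity.

```python
import re

# Hypothetical concept table: related words share one concept label.
CONCEPTS = {
    "talks": "speech", "conversation": "speech", "chatter": "speech",
    "lot": "abundance", "full": "abundance",
}

# Small assumed stop-word list.
STOP_WORDS = {"a", "of", "but", "or", "the"}


def concepts(text):
    """Reduce a phrase to a set of concept labels (unknown words map to themselves)."""
    terms = re.findall(r"[a-z]+", text.lower())
    return {CONCEPTS.get(t, t) for t in terms if t not in STOP_WORDS}


def candidates(phrase, definitions, min_shared=1):
    """Return words whose definitions share concepts with the phrase, best first."""
    query = concepts(phrase)
    scored = [(len(query & concepts(text)), word)
              for word, text in definitions.items()]
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [word for shared, word in scored if shared >= min_shared]


# Sample definitions (assumed for illustration).
definitions = {
    "garrulous": "full of trivial conversation",
    "taciturn": "habitually silent or reserved",
    "spelunking": "exploring caves",
}

phrase = "talks a lot, but without much substance"
print(candidates(phrase, definitions))
```

The phrase and the definition of “garrulous” have no word in common, yet both reduce to the concepts `speech` and `abundance`, so the word is returned as a candidate.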

Advantages

It delivers an order-of-magnitude improvement in performance and scale over the best concept similarity measurement schemes available.
It achieves this without impacting solution quality.

The system architecture diagram graphically models the applications of a system, the externals that they interface with, and the data stores that they use or provide information to.

The following information describes the symbols used on the diagram:

Application

The Application symbol represents an entire application and shows graphically how it is related to externals and data stores. Within the application definition, you can specify overall information about the application: the process threads in the organization that it enables, the type of team effort being used to build it, and so on. To specify more details of the implementation, you can create child Data Flow diagrams or UML diagrams, depending on the nature of the application.

Data Flow

The Data Flow line models the flow of data as it moves from one point in the system to another. The flow might be between externals and applications, or between applications and data stores. Within the data flow you can model the data elements and data structures used. Data flows can split into two or more flows, or two or more flows can join into one.

Material Flow

The Material Flow line models the direction in which physical items and materials flow through the system. The flow might be between externals and applications, or between applications and data stores.

Data Store

A Data Store symbol represents the place where data “rests” when it is neither flowing nor being operated on. A data store can be a database, hard disk, floppy disk, or a file on a disk.

Multi-Data Store

A Multi-Data Store symbol is used to denote that multiple instances of the data store exist. This convention is used to avoid drawing a copy of a schema for each equivalent data store when you build a data model.

External

An External symbol represents an object that sends information or data to the system, or takes information from the system, but is not itself part of the system.

Multi-External

A Multi-External symbol is used to denote that multiple instances of the external exist.

4. CONCLUSION:-

In this paper, we describe the significant challenges inherent in building a reverse dictionary, and map the problem to the well-known conceptual similarity problem. We propose a set of methods for building and querying a reverse dictionary, and describe a set of experiments that show the quality of our results, as well as the runtime performance under load. Our experimental results show that our approach can provide significant improvements in performance and scale without sacrificing solution quality. Our experiments comparing the quality of our approach to that of the Dictionary.com and OneLook.com reverse dictionaries show that the Wordster approach can provide significantly higher quality than either of the other currently available implementations.


Matt Swarbrick

Matt holds a BA and MA certificate from Cambridge, and is a subject-matter expert in Business and Management. Matt also writes about subjects like Finance, Economics and Computing/ICT.
