Computer Searching And Search Engines Computer Science Essay

The main aim of building a search engine for a computer is to find files quickly without exposing the user's private files to others. Microsoft and Google desktop search tools have been searching and indexing files for many years, but they still do little to protect the privacy of people's files. I have therefore started this work to make the indexing system secure.

1.1 Purpose of document

We will build a system that makes searching easier: the user can specify which directories to search and can exclude folders or drives that should not be searched, which improves the results. Documents fall into two classes, personal and non-personal. I will build the system so that it can easily distinguish between them and help the user keep personal information from being displayed on the screen.

1.2 Background knowledge

Indexing Service is a Microsoft Windows service that extracts content from files and builds an indexed catalog to help the user search for files. It extracts both kinds of information from files, i.e. text and properties. It extracts content using filter components that understand a file's format, so it can read many kinds of files, e.g. text, HTML and MIME. The Indexing Service then merges the information into catalogs of indexes for fast searching. Indexing is the process of filtering a file, creating index entries, and merging them into catalogs. The final and most important step is building a catalog that stores the file name, location and keywords of each indexed document. We can then search for a file by giving words in a query, and the required result is returned very quickly. [1]
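The filter, index entry and catalog steps described above can be illustrated with a small inverted index. This is only a sketch under stated assumptions: the function names (extract_words, build_catalog, search) and the sample paths are hypothetical, the "filter" handles plain text only, and none of this is the Indexing Service's own API.

```python
import re
from collections import defaultdict


def extract_words(text):
    """Stand-in for a filter component: pull lowercase words out of plain text."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_catalog(documents):
    """Map each word to the set of file paths that contain it.

    `documents` is a dict of {path: text}. A real indexer would obtain the
    text from a format-specific filter instead of receiving it directly.
    """
    catalog = defaultdict(set)
    for path, text in documents.items():
        for word in extract_words(text):
            catalog[word].add(path)
    return catalog


def search(catalog, query):
    """Return the paths that contain every word of the query."""
    words = extract_words(query)
    if not words:
        return set()
    results = set(catalog.get(words[0], set()))
    for word in words[1:]:
        results &= catalog.get(word, set())
    return results


# Hypothetical sample documents.
docs = {
    "C:/docs/report.txt": "Quarterly sales report for the northern region",
    "C:/docs/notes.txt": "Meeting notes about the sales forecast",
}
catalog = build_catalog(docs)
print(sorted(search(catalog, "sales report")))
```

Because the catalog is built once and queries only look up words, a search does not need to reopen the files themselves.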

1.3 Motivation:

There are several aims behind this work. First, I want to find out how many indexing systems are in use nowadays. Second, I want to find out how Windows performs indexing. Third, I want to understand why indexing is important in our lives and what consequences we face if indexing is not done. Fourth, Windows indexes very well, but hiding a file does not keep it confidential: if the system is set to show hidden files, all personal files are shown too. So I want to build a system that provides confidentiality and privacy by letting the user decide which files are shown in the search window and which are not.

1.4 Research statement:

Many times we store an item on the computer and then cannot locate it. I shall develop a system that indexes all the information (documents) on the computer. Google has already provided an indexing system, but it has two snags. It indexes the whole computer, whereas we would like finer control over indexing: we should be able to index only one drive or a group of folders. We shall also use fuzzy logic so that a search string that is not a perfect match can still find the data. The indexing system will be very useful in locating the desired information.
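As a rough illustration of finding data when the search string is not a perfect match, the sketch below uses approximate string matching from Python's standard difflib module. This is a simplification chosen for the example, not fuzzy logic in the formal sense, and the function name fuzzy_search, the cutoff value and the sample file names are all assumptions.

```python
import difflib


def fuzzy_search(query, indexed_names, cutoff=0.6, limit=5):
    """Return indexed names that approximately match the query.

    Matching is case-insensitive; `cutoff` (0 to 1) sets how similar a
    name must be to count as a match.
    """
    lowered = {name.lower(): name for name in indexed_names}
    matches = difflib.get_close_matches(
        query.lower(), list(lowered), n=limit, cutoff=cutoff
    )
    return [lowered[match] for match in matches]


# Hypothetical indexed file names; the query below has a typo.
names = ["budget_2010.xls", "thesis_draft.doc", "holiday_photos.zip"]
print(fuzzy_search("thesis_draf.doc", names))
```

A lower cutoff tolerates more typing errors but also returns more unrelated names, so the value would need tuning against real file names.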

1.5 Research area and its importance

The research area is search engines and databases. The search engine indexes the files, and database queries locate them. The main importance of this thesis is to give people an easy, high-quality way of finding any document accurately, so that the user gets the desired result in a short period of time. A further purpose is to provide security in searching: we provide a user login, like the one used when starting Windows, and a security check that decides whether or not a document is included (shown) in the search window.

1.6 Objectives

To make searching faster

To return the desired result accurately

To keep private files out of the search window unless the authorized party logs in

1.7 Scope of Document


The first chapter gives the basic introduction: the research area, the problem statement and the proposed solution. The second chapter covers the background knowledge and the terminology used in the document. The third chapter is the literature review of the work on which my research depends. The fourth chapter presents the proposed model and the fifth chapter the analysis and design. Finally, chapter 6 contains the references and citations.

Chapter 2

Background Knowledge


2.1.1 Indexing (searching):

The process of building an index is called indexing.

2.1.2 Index:

An index is a list of items together with their attributes.


It is a reference table that contains the keys (numbers) or references needed to address data items to be searched.

2.1.3 Indexer:

The person or program that performs indexing is called an indexer.

2.2 Types of searches:

2.2.1 Dictionary search

2.2.2 Binary tree search

2.2.3 Sequential search
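The three search types listed above can be compared on a small list of file names. This is an illustrative sketch only: binary search over a sorted list stands in for a binary tree search (both halve the candidates at each step), a Python dict stands in for a dictionary search, and the file names are made up.

```python
from bisect import bisect_left


def sequential_search(names, target):
    """Check every entry in turn; works on unsorted data but is slowest."""
    for position, name in enumerate(names):
        if name == target:
            return position
    return -1


def binary_search(sorted_names, target):
    """Halve the search range at each step; requires sorted data."""
    position = bisect_left(sorted_names, target)
    if position < len(sorted_names) and sorted_names[position] == target:
        return position
    return -1


def dictionary_search(positions_by_name, target):
    """Look the key up directly in a hash table."""
    return positions_by_name.get(target, -1)


names = sorted(["notes.txt", "budget.xls", "thesis.doc", "photo.jpg"])
positions_by_name = {name: i for i, name in enumerate(names)}

for find in (sequential_search, binary_search):
    print(find.__name__, find(names, "photo.jpg"))
print("dictionary_search", dictionary_search(positions_by_name, "photo.jpg"))
```

All three return the same position; they differ in how much of the data they must examine to find it.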

2.3 Types of search engine:

2.3.1 Web search engine

2.3.2 Photo search engine

2.3.3 Videos search engine

2.4 Feature:

2.4.1 Advantages of indexing

It is easy to use.

It gives quick access to information.

We can easily find what we really need.

Less time is required.

2.4.2 Disadvantages

People may become lazy by relying on it, and their memory may weaken.

People may get irrelevant results that they do not need while searching.


Chapter 3

Literature Review

3 Techniques

3.1 Windows search system

3.1.1 How the System Index works

Windows Search uses a system-level index to track the documents and applications on your PC and the content found within them. It supports indexing of over 200 common file types out of the box and lets users find documents and messages, including email, calendar entries, contacts and media files stored on the PC. All you need to remember is a file name, a keyword, a label, or even some text found in the file or email. The system index needs to complete an initial scan of your PC; depending on the number of files and email messages, the content is indexed while the PC is idle and becomes searchable shortly afterwards. After the first scan, the software keeps the index up to date by monitoring system changes. Indexing new files or email messages then requires only a fraction of the time and computer resources. By default, Windows Search indexes only the content of each user's "Documents" and "Favorites" folders, the "Public" folder, and the default mail store on the PC. With administrative privileges, users can configure the system index to include other folders or volumes on the same PC, including specific network folders and email accounts, or to include or exclude certain file formats. IT administrators can also change these and other settings through Group Policy. For more information, see the Windows Search 4.0 Administrator's Guide.... With respect to security, access to every item in the index is protected by ACL permissions that match those of the indexed data. Windows Search indexes are also obfuscated, so they are not easily readable if someone tries to open the index file. Moreover, at no time are the contents of the system index or user information sent to Microsoft or other third parties. User information is kept to the highest level of security and privacy.
Finally, although Windows Search 4.0 supports indexing of encrypted files that users have permission to access, encrypted files do not appear in the results of remote (PC-to-PC) searches.
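The idea of one full initial scan followed by cheaper updates can be sketched by remembering each file's modification time and re-indexing only the files whose time has changed. This is an assumption made for illustration, not a description of how Windows Search itself detects changes, and the function name scan_changes is hypothetical.

```python
import os


def scan_changes(root, known_mtimes):
    """Return (changed_paths, removed_paths) and update known_mtimes in place.

    `known_mtimes` maps path -> last seen modification time. On the first
    call it is empty, so every file is reported as changed (the initial scan).
    """
    seen = set()
    changed = []
    for folder, _subfolders, filenames in os.walk(root):
        for filename in filenames:
            path = os.path.join(folder, filename)
            try:
                mtime = os.stat(path).st_mtime_ns
            except OSError:
                continue  # file vanished or is unreadable; skip it
            seen.add(path)
            if known_mtimes.get(path) != mtime:
                known_mtimes[path] = mtime
                changed.append(path)
    removed = [path for path in known_mtimes if path not in seen]
    for path in removed:
        del known_mtimes[path]
    return sorted(changed), sorted(removed)
```

Only the paths in the first list need to be filtered and indexed again, and the paths in the second list can be dropped from the index, which is why later passes cost a fraction of the first one.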

3.1.2 Overview of the files

As with the rest of the index, access to each indexed item is protected by matching ACL permissions. Windows Search searches the full content of the most commonly used file types, including text files, Office Word documents, Office Excel spreadsheets, Office Outlook, Windows Mail and Outlook Express items, and web pages. PDF files can also be searched after installing a PDF IFilter. For more information about supported file types, see the File Types page.

Note: To search Outlook email, appointments and contacts, you need Outlook 2003 or later installed. The system indexes Outlook email while Outlook is running. [2]

3.2 Google desktop search system

Google Desktop makes searching your computer as easy as searching the web with Google. It is a desktop search application that provides full-text search of your email, files, music, photos, chats, Gmail, web pages you have visited, and more. By making your computer searchable, Google Desktop puts your information easily within reach and frees you from having to organize your files, emails and bookmarks manually.

Google Desktop not only helps you search your computer, but also makes it easier to gather information from the Internet and keep track of it with Gadgets and the Sidebar. Google Gadgets can be placed anywhere on your desktop to show you new email, weather, photos, news and more. The Sidebar is a vertical bar on your desktop that helps you keep your gadgets organized.

3.2.1 Google Desktop

Google Desktop is desktop search software from Google for Mac OS X, Linux and Microsoft Windows. The program allows text searches of a user's email, computer files, music, photos, chats and viewed web pages, and also provides "Google Gadgets"....

3.2.2 File indexing

After the initial installation of Google Desktop, the software indexes all files on your computer. After the initial indexing is done, the software continues to index files as needed. Users can start searching for files immediately after installing the program. Search results are returned in an Internet browser on the Google Desktop Home Page, much like Google's search results for the Web.

Google Desktop can index several different types of data, including email, web browsing history from Internet Explorer and Mozilla Firefox, office documents in OpenDocument and Microsoft Office formats, instant messaging transcripts from AOL, Google, MSN, Skype and Tencent QQ, and various kinds of multimedia files. Other file types can be indexed by using plug-ins. Google Desktop allows users to control what types of data are indexed by the program. Google Desktop indexes only 100,000 files per drive during the initial indexing period. If the user has more than 100,000 files on a particular drive, Google Desktop does not index all of them during this first pass. However, Google Desktop adds files to the index during real-time indexing when users move or open them.

3.2.3 Side Bar

A prominent feature of Google Desktop is the Sidebar, which holds several common Gadgets and resides off to one side of the desktop. The Sidebar is available with the Microsoft Windows and Linux versions of Google Desktop. The Sidebar comes pre-installed with the following gadgets:

Email - a panel which lets one view one's Gmail messages.

Scratch Pad - here one can store notes; they are saved automatically

Photos - displays a slideshow of photos from the "My Pictures" folder (address can be changed)

News - shows the latest headlines from Google News, and how long ago they were written. The News panel is personalized depending on the type of news you read.

Weather - shows the current weather for a location specified by the user.

Web Clips - shows updated content from RSS and Atom web feeds.

Google Talk - if Google Talk is installed, double-clicking the window title will dock it to one's sidebar.

Like the Windows Taskbar, the Google Desktop sidebar can be set to Auto-Hide mode, where it will only appear once the user moves the mouse cursor towards the side where it resides. If not on auto-hide, by default the sidebar will always take up about 1/6 - 1/9 of the screen (depending on the screen resolution), and other windows are forced to resize. However, the sidebar can be resized to take less space, and users can disable the "always on top" feature in the options. With the auto-hide feature on, the sidebar temporarily overlaps maximized windows. Another feature that comes with the Sidebar is alerts. When the Sidebar is minimized, new e-mail and news can be displayed on a pop-up window above the Windows Taskbar.

3.2.4 Quick Find

When searching in the sidebar, desk bar or floating desk bar, Google Desktop displays a "Quick Find" window. This window shows the 6 (by default) most relevant results from the user's computer. The results update as the user types and can be used without opening another browser window.
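A results-as-you-type window of this kind can be approximated by re-running a small, capped search on every keystroke. The sketch below is not Google Desktop's algorithm: the ranking rule (earlier match position first, then shorter names), the function name quick_find and the sample file names are assumptions for illustration; only the default of 6 results comes from the text.

```python
def quick_find(indexed_names, typed_text, limit=6):
    """Return up to `limit` names containing the typed text, best match first."""
    needle = typed_text.lower()
    if not needle:
        return []
    scored = []
    for name in indexed_names:
        position = name.lower().find(needle)
        if position != -1:
            # An earlier match position ranks higher; shorter names break ties.
            scored.append((position, len(name), name))
    scored.sort()
    return [name for _position, _length, name in scored[:limit]]


# Hypothetical indexed file names.
names = [
    "report.doc", "budget_report.xls", "readme.txt", "photo.jpg",
    "resume.doc", "reply.txt", "recipes.txt", "remote.cfg", "result.csv",
]

# Simulate the user typing "repo" one character at a time.
for typed in ("r", "re", "rep", "repo"):
    print(typed, quick_find(names, typed))
```

Capping the list keeps each refresh cheap, since only a handful of results has to be drawn after every keystroke.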

3.2.5 Desk bars

Desk bars are boxes which enable searching directly from the desktop. Web results will open in a browser window and selected computer results will be displayed in the "Quick Find" box (see above). A Desk bar can either be a fixed desk bar, which sits in the Windows Taskbar, or a Floating Desk bar, which may be positioned anywhere on the desktop.

3.2.6 Email Indexing

Google Desktop includes plug-ins that allow indexing and searching the contents of local Microsoft Outlook, IBM Lotus Notes, and Mozilla Thunderbird email databases, outside of the client applications' built-in search functions. For Lotus Notes, only local databases are indexed for searching. Google Desktop's email indexing feature is also integrated with Google's web-based email service, Gmail; it can index and search the email messages in Gmail accounts.

3.2.7 Gadget and Plug-ins

Desktop gadgets are interactive mini-applications that can be placed anywhere on the user's desktop - or docked in the Sidebar - to show new email, weather, photos, and personalized news. Google offers a gallery of pre-built gadgets for download on the official website. For developers, Google offers an SDK and an official blog for anyone who wants to write gadgets or plug-ins for Google Desktop. An automated system creates a developer hierarchy called the "Google Desktop Hall of Fame", where programmers can advance based on their gadgets' number and popularity.

The SDK also allows third-party applications to make use of the search facilities provided by Google Desktop Search. For example, the file manager Directory Opus offers integrated Google Desktop Search support. [3]

3.3 AimAtFile Fast File Search: Indexing Service Document Search Tool

3.3.1 Overview

AimAtFile is an effective way to find document files by their contents. It lets you search for files of many types: Microsoft Office documents, HTML, PDF, Plain text, Media files and others on a desktop computer or in a network. The content search functionality is based on Microsoft Indexing Service included in Windows 2000/XP/2003. AimAtFile provides a document preview. You can see the contents of found documents without opening them in their editing programs.

Microsoft Indexing Service scans files in the background as a low-priority process, and AimAtFile queries the index to return search results instantly. Unlike most third-party file search utilities, AimAtFile does not scan all files each time you search. Some third-party search tools use their own indexers. Windows already includes the powerful native Indexing Service, so it is not necessary to install additional indexers on the system. The only thing you need is the right tool to query the built-in Indexing Service. AimAtFile Fast File Search is that tool. [4]

3.4 Research Paper

3.4.1 Problem statement:

An Efficient Indexing Technique for Full-Text Database Systems

3.4.2 Main Idea

Full-text database systems require an index to allow fast access to documents based on their content. We propose an inverted file indexing scheme based on compression. This scheme allows users to retrieve documents using words occurring in the documents, sequences of adjacent words, and statistical ranking techniques. The compression methods chosen ensure that the storage requirements are small and that dynamic update is straightforward. The only assumption that we make is that sufficient main memory is available to support an in-memory vocabulary; given this assumption, the method we describe requires at most one disc access per query term to identify answers to queries.

3.4.3 Conclusion

In this paper the authors describe a number of techniques; the basic purpose is to discuss the possible methods of carrying out indexing. They then propose an indexing method based on compression for use in full-text retrieval databases. The assumptions they make are that they are working only with text and that the vocabulary is sufficiently restricted that it can be stored in main memory. Given these assumptions, the indexing scheme provides fast response to Boolean queries, and can be extended to support word sequence queries and ranking techniques. These indexes can be dynamically maintained, and it is not necessary to periodically rebuild them. At most one disc access per query term is required to identify answers, and the authors show that this need never be more than is required by a bit-sliced signature file. [5]
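To give a feel for why a compressed inverted file stays small, the sketch below stores each word's list of document numbers as gaps between consecutive numbers and packs each gap into a variable number of bytes. Variable-byte coding is swapped in here as a simple stand-in; the specific compression methods chosen in the paper are not given in this text, and the function names are hypothetical.

```python
def encode_postings(doc_ids):
    """Compress an ascending list of document numbers into bytes.

    Each number is stored as the gap from the previous one, and each gap
    uses 7 data bits per byte; the high bit marks the last byte of a gap.
    """
    encoded = bytearray()
    previous = 0
    for doc_id in doc_ids:
        gap = doc_id - previous
        previous = doc_id
        while gap >= 128:
            encoded.append(gap & 0x7F)  # low 7 bits, more bytes follow
            gap >>= 7
        encoded.append(gap | 0x80)      # final byte of this gap
    return bytes(encoded)


def decode_postings(data):
    """Rebuild the list of document numbers from the compressed bytes."""
    doc_ids = []
    current = 0
    gap = 0
    shift = 0
    for byte in data:
        if byte & 0x80:
            gap |= (byte & 0x7F) << shift
            current += gap
            doc_ids.append(current)
            gap = 0
            shift = 0
        else:
            gap |= byte << shift
            shift += 7
    return doc_ids


# In-memory vocabulary: each word maps to its compressed document list.
vocabulary = {"index": encode_postings([3, 7, 150, 1000])}
print(len(vocabulary["index"]), decode_postings(vocabulary["index"]))
```

Small gaps fit in a single byte, so words that occur in many documents, and therefore have closely spaced document numbers, compress best.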

3.5 Problem Statement

To build an indexing system for a computer that also provides privacy for the user's files.

3.6 Proposed Solution

We will build a system that makes searching easier: the user can specify which directories to search and can exclude folders or drives that should not be searched, which improves the results. Documents fall into two classes, personal and non-personal. I will build the system so that it can easily distinguish between them and help the user keep personal information from being displayed on the screen. We can modify the system from time to time to make it better and easier for the user.
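Choosing which directories to index and which folders to leave out could look roughly like the following. It is a minimal sketch using Python's standard os.walk; the function name collect_files and its parameters are hypothetical, and the proposed system may restrict its scope in a different way.

```python
import os


def collect_files(include_roots, excluded_folders=()):
    """List every file under the included roots, skipping excluded folders."""
    excluded = {
        os.path.normcase(os.path.abspath(path)) for path in excluded_folders
    }
    found = []
    for root in include_roots:
        for folder, subfolders, filenames in os.walk(root):
            # Prune excluded folders in place so os.walk never enters them.
            subfolders[:] = [
                name for name in subfolders
                if os.path.normcase(
                    os.path.abspath(os.path.join(folder, name))
                ) not in excluded
            ]
            for filename in filenames:
                found.append(os.path.join(folder, filename))
    return sorted(found)
```

For example, collect_files(["D:/work"], ["D:/work/private"]) would return the files under D:/work except those inside the private folder (both paths are made up for the example).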

Chapter 4

Proposed model

4.1 Main idea

Many programs help in searching for files, but the main ones are the Windows search system and Google Desktop. They perform well but do not provide privacy for files. In Windows, if we mark a file as hidden through the folder settings, we cannot see the file, but anyone who changes that setting can see it again, so we cannot protect the privacy of our files unless we resort to cmd programming in Windows. Google Desktop also searches for files, but it merges results from the computer and the Internet, which can confuse a person about whether a result belongs to the computer or the web; it also scans the whole PC and shows all files, even private ones. So, to provide privacy for a specific set of files, I am trying to build a system that protects them with a password.

4.2 Feature/Characteristic

It can search for a file.

It includes in the search only those files that are not password protected and whose security setting is off.

When the user logs in, it shows all search results.

4.3 Graphical model

4.4 Algorithm/Pseudo code

1st step

Make an indexing system.

2nd step

Store files in the database.

3rd step

Assign keywords and security settings to the files.

4th step

Search for the files by name and key words.

5th step

User gets required result.
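The five steps above can be sketched with Python's built-in sqlite3 module. This is one possible reading of the proposed model, not its final design: the table layout, the function names and the is_private flag are assumptions, and the login check is reduced to a simple logged_in argument.

```python
import sqlite3


def create_index(conn):
    """Steps 1 and 2: create the table that stores the indexed files."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS files ("
        " path TEXT PRIMARY KEY, name TEXT NOT NULL,"
        " keywords TEXT NOT NULL, is_private INTEGER NOT NULL DEFAULT 0)"
    )


def add_file(conn, path, name, keywords, is_private=False):
    """Step 3: store a file with its keywords and security setting."""
    conn.execute(
        "INSERT OR REPLACE INTO files VALUES (?, ?, ?, ?)",
        (path, name, " ".join(keywords), int(is_private)),
    )


def search(conn, term, logged_in=False):
    """Steps 4 and 5: search by name or keyword; hide private files
    unless the user is logged in. Wildcards in `term` are not escaped."""
    pattern = f"%{term}%"
    sql = "SELECT path FROM files WHERE (name LIKE ? OR keywords LIKE ?)"
    if not logged_in:
        sql += " AND is_private = 0"
    sql += " ORDER BY path"
    return [row[0] for row in conn.execute(sql, (pattern, pattern))]


# Hypothetical sample data held in an in-memory database.
conn = sqlite3.connect(":memory:")
create_index(conn)
add_file(conn, "C:/docs/budget.xls", "budget.xls", ["finance", "2010"])
add_file(conn, "C:/private/salary.xls", "salary.xls",
         ["finance", "personal"], is_private=True)
print(search(conn, "finance"))
print(search(conn, "finance", logged_in=True))
```

Filtering inside the query means private files never reach the search window for a user who has not logged in, rather than being hidden after the fact.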

4.5 Advantages

It provides privacy: if the user applies the privacy setting to a specific folder or file, that file or folder is not included in the search.

4.6 Limitations

The system is still changing: new changes will be applied over time to add user facilities, so that users can use the system easily and with privacy.

Chapter 5

Analysis and design

5.1 Flow chart of indexing and searching system

5.2 Use Cases

5.3 Activity diagram:

5.3.1 Searching:

5.4 Sequence Diagram

References and citation:

[1] (VS.85).aspx




[5] An Efficient Indexing Technique for Full-Text Database Systems (PDF)