Storing And Exchanging Data Computer Science Essay

Published:

This essay has been submitted by a student. This is not an example of the work written by our professional essay writers.

Introduction

Storing and exchanging data are playing a huge role in our life nowadays. As volume of information goes up, the need for more effective ways to operate it increases. One of the most effective ways of storing information is known as "Tree structure". Tree structures arrange information in a hierarchical order. This structure easily allows you to create, edit, enumerate, and move data. The most popular way of operating tree-like structured data is Extensible Markup Language (XML). It is now widely used to store and exchange big amounts of information on the Web.

With global increasing of amounts of data the use of XML is undertaking a huge leap. Its low requirements and flexibility determine its prevalence. However, complicated structure of a document and inflating of the information it stores cause the need for refining the ways of retrieving data. Indexing of XML documents can provide significant performance boost for querying information. In this research I briefly describe two approaches of indexing of XML documents and analyze the results of their work.

Ctree

Intro

XML Document consists of hierarchical ordered elements, which can be displayed as a labeled tree. Each element can be reached through a certain path. If we attach to each element its appropriate label we will be able to build label paths through the tree. If the label paths of two elements are the same then we can consider these elements equivalent. Path summary is a tree on which every node represents one label path. Paths summary helps evaluate single path queries, but it does not keep the information about hierarchical relationships of individual elements. This was a reason for proposing an index structure which also provides detailed element-level connections, which is called a Compact Tree Structure (Ctree).

Description

Ctree is a tree on which every node contains an array of elements describing relationships of the elements. It provides a summarized view of hierarchical structures at the group level and detailed information at the element level.

Ctree index data can be modeled in relational tables and be implemented in relational database. It is also possible to transform it into XML database or store it in disk files for fast access.

For storing Ctree structure index can be used into four tables: Elements, Groups, CtreeDB, and ElmPosLen.The first table stores the mappings from elements to their parents. The next one stores the group information and the label path. The Ctree table describes the main features of Ctree. As for the ElmPosLen table, it has the information about the position and length of each element.

Since XML has different data types, Ctree proposes five types of value indexes. The number of data types can be extended, if required. All five indexes support a search operation search(value, gid?) where value and gid represent spicific value and a certain group. This operator returns a list of absolute elements, if the gid is not specified, or relative elements, if it is. For Ctree-based query processing, in order to eliminate irrelevant groups by group-level structure mapping, gid is determined and then value predicates are evaluated. Therefore, the returned results and the I/O cost are reduced.

Query Processing

These are the advantages of Ctree-based query processing algorithm:

- Locating frames at the group level cuts a big number of irrelevant groups at an early stage;

- Evaluating a value predicate based on a group significantly reduces the possible matches and improves the efficiency of combining the matches for value predicates and structure constraints.

- Using an array for fast mapping elements to their parents helps evaluate constraints of element-level structure. Fast access to elements' parents is extremely important for efficient XML query processing because of the tree nature of XML data and queries are rooted on sharing common parents.

XISS

Intro

XISS/R system is an implementation of the XML Indexing and Storage System (XISS) on top of a relational database. The system includes a web-based user interface, which enables stored documents to be queried via XPath. The user interface utilizes the Xpath Query Engine, which automatically translates XPath queries into efficient SQL statements.

Description

The XISS/R system consists of three components:

- A mapping of XML data to relational schema;

- An XPath Query Engine;

- A web-based user interface;

The mapping of XML data to relational schema is done by using the extended preorder numbering scheme. During the research, two relational schemas were generated to make best use of this numbering scheme. The XPath Query Engine enables XPath queries to be issued on the relational implementation of the mapping of XML data. Universal access to the system is provided via a web-based interface which allows users to visually interact with the query engine and result set.

Implementation

The current implementation of XISS/R uses MySQL but also can use any commercial database system. Documents are parsed and loaded into the database by a program that uses MySQL's C interface. This loader uses the LibXML library to access XML documents such that their structural information can be encoded with the extended preorder numbering scheme.

Several schemas were created to determine whether to store elements in a large node table or store them in separate node tables divided by node name. The overall trend in performance was that schemas using horizontal partitioning were faster. As for whether to store tag names in the node tables or store them in a separate tag name table, It was found that since the number of distinct elements and attributes is usually small, the join time between a tag name table and node tables is inconsequential compared to the common total query time. In addition to the tables in the schemas described before, was also utilized database indexes to accelerate query processing.

Performance evaluation

In order to compare the efficiency of both reviewed methods of XML indexing, two databases DBLP and XMARK were tested and the results of the work of both Ctree and XISS approaches were compared. According to the results of this test, Ctree requires much less space for building an index file than XISS (78MB to 118MB on DBLP and 28MB to 54MB on XMARK database). Then, 6 queries on each database were conducted. Results of this test show that Ctree significantly excels XISS in every query made on both database systems.

Conclusion

XML has become an important standard for data representation and exchange in the Internet. The need for efficient storage and manipulation is an important issue nowadays. Methods that I reviewed within this paper were designed to refine operating the data. It is notable from the research that in spite of good results of the XISS indexing system it is unable to compete with the Ctree Indexing method in terms of space- and time-consuming. The Ctree approach outperforms the XISS method by operating hierarchical connections and element descriptions as main sources for indexing XML documents. Conducted research opts for this approach as for an excellent choice for operating XML files.

Writing Services

Essay Writing
Service

Find out how the very best essay writing service can help you accomplish more and achieve higher marks today.

Assignment Writing Service

From complicated assignments to tricky tasks, our experts can tackle virtually any question thrown at them.

Dissertation Writing Service

A dissertation (also known as a thesis or research project) is probably the most important piece of work for any student! From full dissertations to individual chapters, we’re on hand to support you.

Coursework Writing Service

Our expert qualified writers can help you get your coursework right first time, every time.

Dissertation Proposal Service

The first step to completing a dissertation is to create a proposal that talks about what you wish to do. Our experts can design suitable methodologies - perfect to help you get started with a dissertation.

Report Writing
Service

Reports for any audience. Perfectly structured, professionally written, and tailored to suit your exact requirements.

Essay Skeleton Answer Service

If you’re just looking for some help to get started on an essay, our outline service provides you with a perfect essay plan.

Marking & Proofreading Service

Not sure if your work is hitting the mark? Struggling to get feedback from your lecturer? Our premium marking service was created just for you - get the feedback you deserve now.

Exam Revision
Service

Exams can be one of the most stressful experiences you’ll ever have! Revision is key, and we’re here to help. With custom created revision notes and exam answers, you’ll never feel underprepared again.