Column Store Database The Right Man Computer Science Essay

Published: Last Edited:

This essay has been submitted by a student. This is not an example of the work written by our professional essay writers.

Computing the aggregate values of required attributes over large amount of data is often a requirement of decision support system. Most existing query processing algorithms in such systems relies heavily on the tuple-scan based approach, results in heavy processing overhead. Column stores have shown to perform particularly well, relative to row store for query workloads found in data warehouse and decision support systems. The paper presented, appraises the database architecture and design considerations for column store database, including data compression and query processing technique.

The need for read optimize database system has grown significantly with the high demands of performance improvement in DSS. Bulk loads are made periodically with queries, for such systems. Column store database are proven to be efficient for factors namely; Data access, compression ratio, Data operations, and buffering techniques.

Performance of the database is attributed in terms of processor speed, compression rate, query characteristics and memory hits, Section 2 provides techniques to improve performance in column store. Architecture of a database is important to understand the working methodology, presented in Section 3. Summary and future research scope in related area is covered by Section 4. For detailed analysis we studied system based on DSM (MonetDB), which represents methodology of column store.

2. Techniques for High Performance

This section analyses the techniques for high performance and effects of these techniques on column store database (MonetDB).

2.1 The impact of modern processor architectures

Cache hits and cache misses for database are proven to be performance measures by research community. Due to the sophisticated techniques used for hiding I/O latency and the complexity of modern database applications, DBMSs are becoming compute and memory bound [1]. The optimization criteria differs in main memory database than I/O dominated database [10]. The query evaluation in column store should be compute and memory bound rather than I/O bound. MonetDB accounts the impact of modern computer architecture, with multi-level cache memories to alleviate the continually widening gap between DRAM and CPU speeds. Sequential access of data in main memory may improve performance significantly with faster processors [37]. With increase in load, a simple sequential scan on a table may spend 95% of its cycles waiting for memory to be accessed, resulting lower performance ratio for complex database operation (Figure 1). Based on the detailed analysis of cache, vertical fragmentation leads optimal memory cache usage.

Table Size 6GB , Processor: AMD Athlon II X2 245 2.90 GHz


Tables Used:




Query execution Time in miliseconds


Number of Tuples returned

Minimal sequential scan(Random scan)

Sequential Scan

select count(*) from ontime;




select AirlineID, count(*) from ontime group by AirlineID




select l_airline_id.description, count(*) from ontime, l_airline_id where ontime.AirlineID = l_airline_id.Code group by l_airline_id.description;




select l_ontime_delay_groups.description, count(*) from ontime, l_ontime_delay_groups where ontime.DepartureDelayGroups = l_ontime_delay_groups.Code group by l_ontime_delay_groups.description;




select TailNum, FlightNum from ontime where Flightdate between cast('2011/01/01' as date) and cast('2011/01/10' as date);




Figure 1: Query Execution Time for Sequential and Random Access

The performance bottleneck of partitioned hash Join is improved by perfect-hashing on MonetDB runs on modern processors. Research includes performance characteristics of cache memory extracted from operating system in cost model. MonetDB is main memory database is (a)less significance on secondary storage (b)Performance improvement significantly by multi-level hierarchical memory(Cache, Main memory ,Swap). Database algorithms and Data structures should be designed and optimized for efficient multi-level memory access from the outset.

2.2 Data compression: Way to save the I/O bandwidth

Data may be coded into more compact form by storing N values in K*N bits, where k is the smallest byte size, holds any values in column [43]. Data compression techniques are equally applicable to both row store and column store. Query processing with column store requires less I/O bandwidth, for relevant columns. Storing columns in multiple sort orders can maximize the query performance. Recent research has enhanced the column storage for querying compressed data [3]. Run length encoding (RLE), is presented as an attractive approach for compressing sorted data in a column-wise store. Compression ratio is higher in RLE for column store than Dictionary, bit packing and FOR-delta [28]. In depth analysis of performance improvements due to compression and data characteristics (skew, correlation) including join operations is presented in [3]. To combine compression alternatives ,branch and bound algorithm was developed mainly focused on cardinality, byte size. The algorithm solved this problem in time O(n).

2.3 Buffering techniques for accessing metadata

Metadata plays vital role for database contents and algorithm selection for performance optimization. The access pattern of metadata lookup queries is horizontally clustered, with random access pattern, not benefited from column store. Accessing the vertically structured attribute from the same buffer is approximately random hit of attribute, resulting in lower cache hit ratio. Design of separate cache memory for meta data tables with LRU replacement and NSM architecture for cache may improve the hit ratio.

2.4 Querying compressed, fully transposed files

Search algorithm designed for column store, outperforms inverted files (indexes) in large proportion, with direct operation on compressed data [44, 7]. Theoretical and empirical results were presented, for efficiency of column store with RLE compression [50]. Refinement to existing approach is done by integrating interpolation search, sequential search, and binary search into a poly-algorithm [5]. RLE -compressed vertical storage architecture allows direct operations on compressed data, join and set operations on relations. The design space for algorithms work with small number of attributes and sequential reads.

2.5 Compression-aware optimizations for the equi-join operator

Nested-Loop (NL) Join is capable to operate directly on compressed data with no sort operation [3]. The search [scanning] phase of the equi-join operation is similar to conjunctive query search [52]. Sorting will dominate the equi-join process for non-key attributes.

2.6 Scalability

Performance of processor or disk is restricted with growing data. A high performance DBMS must take advantage of multiple disks and multiple processors.

Three approaches to achieve required scalability:




Data are “horizontally partitionedâ€Â across nodes, each node has a subset of the rows (and in vertical databases, maybe also a subset of the columns) from each big table in the database. Shared-nothing is generally regarded as the best-scaling architecture [19].

3. Systems based on DSM(Column Store): MonetDB

The design strategies of Column store, a read optimized database system MonetDB with improvements have explored.

3.1 Design

MonetDB design is based on DSM [39, 11]. Time complexity for queries, retrieving more columns with DSM is extensive. Execution engine must be aware of join to reduce significant effort on finding matching tuples. MonetDB stores fragmentation information as metadata on each binary association table and propagates across operations. The decision of algorithm is taken at run time. For query processing MonetDB has shown performance improvement for hash join by using radix-cluster algorithm than standard bucket chained alternatives [38]. BAT(Binary Association table) is created as a result of hash-join, contains combination of matching tuples, i.e. a join index. In the MonetDB system two space optimizations have been applied for reduction of the per-tuple memory in BATs: Tuple identifier and Byte encoding.

To achieve higher CPU efficiency, MonetDB architecture follows Volcano approach, including data storage layout and query algebra. Calculations are typically expressed in tight loops over array which are beneficial in extracting maximum performance by increasing cache locality and better operation.

3.2 Binary Association Table Algebra

The BAT concept is inherited from DSM refers to two-column<surrogate,value> table. The BAT algebra get BAT as a parameters and produce BAT as result. Intermediate data is always stored in BATs, and result of a query is a collection of BATs. BAT storage is designed in the form of two simple memory arrays(one with the offsets, and other with all concatenated data), accessed by BAT algebra. Memory mapped files are used as storage for large relations. MonetDB exploits the memory management unit in CPU to offer lookup by position. MonetDB follows client/server architecture, where client is responsible for storing data in relational tables or XML trees or objects and server serves the queries only through BATs. Clients translates the queries in BAT algebra, and use the resulting BATs to present results. A sample BAT algebra is presented below:

join(bat[t1,t2] L, bat[t2,t3] R) : bat[t1,t3] = [ <L[i].head,R[j].tail> | i < |L|, j < |R}, L[i].tail = R[j].head ] “inner joinâ€Â

3.3 Efficiency of the BAT Algebra

Unlike relational algebra Boolean expression determines joining and selection of tuples, such expressions do not occur in BAT algebra. JOIN is a simple equality between the inner columns of the left BAT and right BAT, for select the equality is on the tail column. The head column of a BAT contains densely ascending TIDs starting with (0,1,2...) know as dense property. Dense TID columns are not stored in implementation, as it is same as array index in the column. MonetDB keeps the statistics and properties of columns i.e. selectivity, sortedness. An example of translating SQL query into BAT algebra:

SELECT DISTINCT P.firstname,P.lastname, SUM(I.price)

FROM Person P, Item I

WHERE = I.buyer and I.year = 2007

GROUP BY P.Firstname,P.lastname

translates into BAT algebra:

s := reverse(mark(uselect(Item_year, 2007)))

b := join(s,Item_buyer)

p := join(b,reverse(Person_id))

r := reverse(mark(reverse(p)))

g := group(join(r,Person_firstname), join(r,Person_lastname))

a := {sum}(join(join(reverse(g),r),Item_price)

[print](join(g,Person_firstname), join(g,Person_lastname), a)

As we can see from above example that only a single join operator is needed; all other joins are required to fetched the row required by query, which is the main concern in DSM architecture. MonetDB derives all such join result with low CPU cost, in which for each left input, fetches a single tail result from right input using the positional array lookup. As a result, MonetDB perform no expensive addition joins relative to NSM Model. BAT Algebra process arrays directly, resulting in compromise with ACID properties. Provision is made for separate module with write ahead log and explicit locking primitives for transaction management. The SQL and XQuery clients both offer full acid properties. Enhancements in MonetDB is relatively easy with new modules that introduce new BAT Algebra operators, since data is directly store into array like structures; implies that no API is needed to access data.

3.4 Issues and improvements

Relying on virtual memory for disk storage means buffer manager is removed, thus performance is enhanced. But the practical approach is, the virtual memory layout strongly depends on the OS (version and flavour). Virtual memory prefetching is configured at the OS kernel level, and tuned for different access pattern than MonetDB. BAT algebra is a design of full materialization, an operator fully consumes its input BATs, producing the BATs result. Large results of BATs produce swapping and deterioting the performance, improved by introducing pipelined model and buffer manager for efficient asynchronous I/O in MonetDB/X100. Architecture conscious query processing algorithms like radix cluster was possible to develop because of vertical storage layout. Random data access ,even if data fits into RAM, is difficult to make efficient, as it does not exploits all the RAM bandwidth . As a result main memory algorithms with sequential access perform well than random-access algorithm, even if they do more CPU work. Also, sequentially processed densely packed data allows compilers to generate Single Instruction, hence enhance the performance.

BAT working methodology

SQL. Each relational table decomposed into columns, in BAT with dense TID head, and tail column with values. For each table, a BAT with deleted position and inserted values are kept. BATs are designed to delay updates to main column. MonetDB/SQL maintains BATs for join indices.

XQuery. XML tree structure is relational tables in MonetDB as a collection of BATs. BAT algebra is slightly modified to include region joins, which accelerates XPath predicates.

Arrays. The Sparse Relational, map array data sets into MonetDB BATs, useful in scientific applications.

MonetDB provides run time optimization and efficient cache -conscious algorithms for high performance in analytical applications. MonetDB uses index only for primary and foreign keys. The indexes are generated when the columns are touched for the first time, so storage need decreased by approximately 30%.

3.5 Dynamic in Nature

Row store relatively give static behaviours with respect to workload changes. Modern DBMS comes with the fact of advanced database tuning representations for changing workloads. Due to its complexity, the workload analysis is generally performed off-line. With the changing workload, queries are not supported by the existing indices. MonetDB completely avoids by transferring into memory only new columns of interest. This is simply based on architecture and design principles of column-wise storage systems.

4. Summary and Future research Scope

For analytical disk-bound processing MonetDB confirms the advantages of column-storage systems. Database Cracking and adaptive segmentation and replication techniques have developed to improve performance [30, 31]. Compression has shown more improve performance in column store than row store. MonetDB execution works on fully materialization of intermediate results in query plan. Future research involves materializing result for the common queries which runs in batches includes buffer management and compression. New index techniques are required to be generated as:

Column wise storage mechanism in number of situations is slower than row wise storage layout i.e. point and range queries is efficiently supported in row store as available indices enables quick retrieval of data. For same kind of query MonetDB offers sequential scan which might be slower than searching with B-Tree .

Tuple reconstruction joins cost may be reduce by compressed bitmap index. Row wise storage layouts are suitable for situations where few rows are selected and more columns are involved .