An Overview Of Database Standard Query Language Computer Science Essay

Published: Last Edited:

This essay has been submitted by a student. This is not an example of the work written by our professional essay writers.

Standard Query Language is today the standard language for relational and object-relational databases. Application programs typically contain a relatively large number of SQL queries and updates, which are sent to the Database Management System (DBMS) for execution. The most widely used database management systems, such as Oracle, Access, mySQL, SQLserver, Paradox, Ingres, and others, are all claimed to be relational. Certainly they all use SQL which itself is often assumed to be an indicator of a relational database system. The purpose of this paper is to discuss the historical perspective of the development of SQL and its continuing development. This article also highlighted the benefits and risks of adopting a standard query language.

According to Kuhlemann, et al. (2008) SQL is a database query language used for formulating statements that are processed by a database management system for create and maintain a database. The SELECT statement is the most commonly used by the SQL query which can retrieve data from one or more tables in the database. It can limit the retrieved data using conditional statements in the WHERE clause; the GROUP BY clause can use for group related data and it can restrict the grouped data with the HAVING clause; for order or sort data which based on different columns using the ORDER BY clause.

SQL consists of many statements to create and manipulate database objects. Since its first standardization in 1986, more and more functionality is being included in SQL in each subsequent standard covering various aspects of user interaction. The newest edition of the SQL standard, referred to as SQL:2003, supports diverse functionality such as call level interfacing, foreign-data wrappers, embedding SQL statements in Java, business intelligence and data warehousing functions, support for XML, new data types, etc.

The next standard, called SQL 20071, will presumably add features like regular expression support, binary and oating decimal data types, materialized views, streaming data support, XQuery support and support for the Resource Description Framework (RDF) and the semantic web. The vast scope of SQL's functionality has led many researchers to advocate the usage of a `scaled down' version of SQL, especially for embedded systems. Embedded systems have many hardware limitations such as small RAM, small stable storage, and large data read/write ratio. Also the applications where embedded systems are used, e.g., healthcare and bank cash cards, need only a small set of queries like select, project, views, and aggregations.

A standard called Structured Card Query Language (SCQL) by ISO considers inter-industry commands for use in smart cards with restricted functionality of SQL. Some database systems and SQL engines have been proposed to address this issue. They are distinguished as `tiny', e.g., the TinyDB2 database system, for extracting information from a sensor network and tinySQL3 SQL engine, which is a Java SQL engine that supports only a few SQL statements like select, update, insert, delete. While the standardization process shows how SQL has increased in size and complexity in terms of features provided, efforts for `scaled down' versions indicate a need to control and manipulate features of SQL.

Hagenbuch and gardner (1983) stated that SQL is a data language designed for use with the relational data model. The executable unit of SQL is the statement -- there are no SQL "programs". SQL statements execute in the context of a single enrolled user of the database. The context in which a statement executes determines what privileges it may exercise on objects in the database. An application program is likely to be interested in only one or two contexts. Many SQL statements may run within each context. Each statement is parsed by the DBMS, i.e., prepared for execution.

Catrambone and Yuasa (2006) cited from (Smelcer, 1989).SQL, the Structured Query Language for databases (sometimes referred to as the ''Standard Query Language''), is a command language for relational databases. It was chosen here as the test domain because writing a query with SQL is a relatively complex task and because the knowledge required to write queries can be fully specified.

Moore (1992) stated that "SQL" was once an acronym for the "Structured Query Language" which was associated with a propriety implementation. When SQL is used to refer to the ANSI standard, it is no longer an acronym, merely a short form of "Database Language-SQL".

Evolution and current situation of the SQL

Calero, (2006) described that the relational model came about as a result of E. Codd's research2 at IBM during the sixties. The SQL, originally named SEQUEL (Structured English QUEry Language), was implemented in an IBM prototype (SEQUEL-XRM), during the mid-seventies. Some years later, a subset of this language was implemented in IBM's System-R.

In 1979, ORACLE emerged as the first commercial DBMS based on SQL, followed by several other products (e.g., SQL/DS, DB2, DG/SQL, SYBASE, INTERBASE, INFORMIX, UNIFY). Even those which had not originally implemented SQL as their base query language, offered SQL interfaces (e.g., INGRES, ADABAS, SUPRA, IDMS/R). As a result of this process, SQL became a de facto standard. In late 1982, ANSI H23 began to standardize a version of the relational data model through the IBM donated language, SEQUEL. Renamed SQL by H2, basic SQL was completed and became an American National Standard in 1986 and soon an ISO standard.

In 1989, the first version of the SQL standard was revised and an addendum, which included main improvements on referential integrity issues, was published. Meanwhile, ANSI brought out a standard for embedded SQL.

In the early nineties, a new version, known as SQL2 or SQL-92, was published by ISO. Both the semantic capabilities of the language and error management were then considerably improved. That standard was complemented a few years later, with the approval of SQL/CLI (Call-Level Interface) and SQL/PSM (Persistent Stored Modules). SQL became a complete computational language, with features such as control structures and exception handling.

During the last half of the nineties, SQL was extended by the inclusion of object-oriented capabilities. The resulting standard was divided into several parts. This version, formerly known as SQL3 and then finally called SQL:1999, incorporated features such as new basic data types (e.g., very large objects), userdefined data types, recursive query operators, sensitive cursors, tables generalization and user roles.

The latest version of the standard, the SQL:2003, is the result of major revisions and extensions to most parts of the SQL:1999 standard. This version includes SQL/XML (XML related specifications), new basic data types (bigint, multiset and XML), enhancements to SQL-invoked routines, extensions to the CREATE TABLE statement, a new MERGE statement, a new schema object (the sequence generator) and two new sorts of columns (identity and generated). Table 1 summarizes the evolution of SQL.

Table 1

Evolution of SQL




Relational model

DBMS prototypes (SEQUEL XRM)

First relational DBMS


ANSI SQL-86 standard

ISO SQL-87 standard

SQL-89 addendum

ANSI embedded SQL


SQL 92






The SQL:2003 standard is composed of nine parts, which are briefly described in Table 2. The numeration of parts is not contiguous due to historical reasons: some parts have disappeared (e.g., SQL:1999's part 5 - SQL/Bindings - was included in part 2 of SQL:2003) and other parts are new. The latter resulted either from further partitioning of previous parts (e.g., part 11 was previously included in SQL:1999 part 2) or from the implementation of new requirements, such as parts 13 and 14, dealing with Java methods and XML data, respectively. Since the SQL:1999, the SQL standard has evolved, to support the object-relational paradigm. This paradigm proposes a good compromise between relational and object-oriented databases. The former have a robust data model (the relational one) and powerful query optimization, recovery, security and concurrency mechanisms. The latter incorporate objectoriented mechanisms (e.g., encapsulation, generalization, aggregation and polymorphism), and allow to represent more complex elements which are required in several domains, such as CAD, CAM or GIS.

Object-relational databases offer the possibility of defining classes or abstract data types, as well as tables, primary and foreign keys and constraints, as relational databases also do. Furthermore, generalization hierarchies can be defined among classes or tables. Table attributes can be defined in a simple domain (e.g., CHAR(25)) or in a user-defined class, as a complex number or image.

Table 2

Structure and summary of the SQL:2003 standard







Overview of the standard. It describes the conceptual framework used in other parts to specify

the grammar of SQL and the result of processing statements in that language by an

SQL-implementation. It also defines terms and notation used in the other parts.




This part defines the data structures and basic operations on SQL-data. It provides functional

capabilities for creating, accessing, maintaining, controlling, and protecting SQL-data. This

part also specifies the syntax and semantics of a database language. It deals with the portability

of data definitions and compilation units between SQL-implementations and the interconnection

of SQL-implementations.


Call-Level Interface


It defines the structures and procedures that may be used to execute SQL statements from

within an application written in a standard programming language, such that used procedures

are independent of the SQL statements to be executed.


Persistent Stored Modules


This part specifies the syntax and semantics of a database language for declaring and

maintaining persistent database language routines in SQL-server modules.


Management of External Data


Extensions to Database Language SQL are defined, in order to support management of

external data, through the use of foreign-data wrappers and datalink types.


Object Language Bindings


It defines extensions to support embedding of SQL statements into programs written in the

Java programming language, commonly known as bSQLJQ. This part specifies the syntax and

semantics of SQLJ, as well as mechanisms to ensure binary portability of resulting SQLJ

applications. In addition, it specifies a number of Java packages and their classes.


Information and Definition Schema


This part specifies an Information Schema and a Definition Schema that describes the SQL

object identifier, the structure and integrity constraints of SQL-data, the security and authorization

specifications related to SQL-data, the features, sub-features and packages of this

standard, and the support that each of these has in an SQL implementation. It also includes

SQL-implementation information and sizing items.


Routines and Types Using the Java Programming Language


It specifies the ability of invoking static methods written in the Java programming language as

SQL-invoked routines and of using classes defined in the Java programming language as SQL

structured user-defined types.


XML-Related Specifications


This part defines ways in which SQL can be used in conjunction with XML

Benefits of adopting SQL

Donaho and davis listed that several features make SQL at least as good as any other query language currently in use :

The basic concepts and syntax of SQL are quickly learned . This short initial learning period decreases the amount o f training required and increases productivity .

SQL is a reasonably high-level language . The programmer can write queries without knowing all of the intimate details of the DBMS implementation . For example, a SELECT clause allows the user to identify the needed data without indicating how to access it .

SQL unifies the data definition and data manipulation languages . Unlike other query languages, SQL uses the same syntactic constructs for definition functions and manipulation functions . This uniformity makes the language easier to learn and use .

SQL provides the functionality needed for most database applications . That is, the language is powerful enough to do most of the things required in a database application.

Risks of adopting SQL

Maciol (2008) stated that SQL has a row of limitations coming from its foundations such as:

lack of possibility for defining terms and lists,

limitation of atomic data,

lack of recurrence and iteration,

limited possibilities of data processing control,

lack of deduction possibility.

Chan, Lu and Wei (2003) listed the problem while using SQL :

comprehension difficulty

- complex queries are difficult to analyse, especially by another person

- "nested maze" iis quite confusing. This confirms one of the theoretical flaws of SQL not well defined semantics for nesting (Codd 1990).

- multiple joins of many tables can lead to uncertainty of the query accuracy

- logical errors are harder to detect, as compared to 3GLs

formulation problem

- joins are difficult for end-users

- too many aggregate functions in a single query have led to problems

- use of wrong field and name definition

- unable to format the output as desired

- variables used with wrong variable types, especially for embedded SQL


- response is slow when system does not select the best path to access tables.

- database contention occurs by simultaneous accesses

- a query may need to be broken into smaller queries to speed up processing time. This

requires more temporary space.

unclear error message sometimes give wrong impressions.

When users encounter problems with SQL, the majority (68%) refer to manual. This also confirms the finding that manuals form a substantial secondary source of SQL knowledge. A substantial 24% prefer to seek the assistance of colleagues or superiors. Only a minority, 2%, attempt querying with other languages, while 6% will try other means, one of which was to try till I get it right, to SQL manuals

Brass and Goldberg highlighted that errors in SQL queries can be classified into syntactic errors and semantic errors. A syntactic error means that the entered character string is not valid SQL. Then any DBMS will print an error message because it cannot execute the query. Thus, the error is certainly detected and usually easy to correct. A semantic error means that a legal SQL query was entered, but the query does not or not always produce the intended results, and is therefore incorrect for the given task. Semantic errors can be further classified into cases where the task must be known in order to detect that the query is incorrect, and cases where there is sufficient evidence that the query is incorrect no matter what the task is.

Nicola and Kiefer (2009) observed that the adoption of SQL/XML faces several challenges. When relational legacy applications require access to new XML data, it is often too expensive or risky to convert them from SQL to SQL/XML. Another frequent challenge is to actually write queries and updates with SQL/XML and XQuery. We see that their use poses a number of problems:

Users need to learn these new languages, which are often perceived as difficult to master. This stems from the differences between the XML data model and the relational data model.

SQL/XML involves path expressions that navigate the tree structure of XML documents. To write path expressions, users must know the structure of the XML data in detail. It is not enough to know which data items exist, it is also necessary to know their exact case-sensitive name, namespace, and location within the document structure. But, this structure is often complex, difficult to understand, or even unknown to the user.

As more XML documents are accumulated in a database, newer documents may have a different XML Schema than older ones. This requires queries and updates to work across documents for different schemas, which compounds the complexity of writing SQL/XML statements. Also, existing XML queries may need to be changed when the XML Schema evolves.

In a hybrid database, where some data is stored in relational format and some in XML format, users need to know which data is in which format before they can write correct queries.