History of electronic theses and dissertations



This chapter explains the traditional method of thesis submission and the history of ETDs. It also reviews existing ETD systems, for example DSpace and Virginia Tech's ETD-db. DSpace and ETD-db are then compared to identify the benefits each offers, so that the most suitable system can be developed.


Theses and dissertations are essential pillars for the application and advancement of knowledge. They present tangible evidence of the scholarly development of students and of their ability to learn and communicate research findings. The conventional method of presenting theses and dissertations has a very limited scope. Currently in our country, theses and dissertations are submitted, distributed and stored on paper. Graduate students are obliged to hand in multiple hard copies of their theses or dissertations for preservation purposes. These printed publications are of little practical use and are not easily accessed by potential scholars and interested researchers. Furthermore, handling and storing paper theses in the library and the graduate office is costly in both time and money. University libraries and offices pay for the space and maintenance costs, and they must catalog the theses manually. Students pay large amounts of money for printing, photocopying and binding their theses. Numerous novel research projects, to which authors have devoted long hours, are now languishing in library basements, with no efficient way for students, researchers or publishers to find the information that may be in them. Submitting, distributing and storing theses and dissertations electronically addresses these problems. Many universities all over the world have now begun to acknowledge and benefit from the electronic submission of theses and dissertations (Fox, Edward A., 1997).


There has been extensive research in the digital library field in the past two decades. One of the first open discussions on the concept of electronic theses and dissertations (ETDs) took place at a 1987 meeting in Ann Arbor, Michigan, organized by University Microfilms International (UMI, currently known as ProQuest) and attended by Edward Fox and Susan Bright (Fox, Edward A., 1997). Both were representatives of Virginia Polytechnic Institute (Virginia Tech). Other participants came from the University of Michigan, SoftQuad, and ArborText. Soon after these early discussions, work began on converting existing theses and dissertations from diskette to the Internet. As for the origin of the acronym “ETD”, Professor Edward Fox explained that it contains an implicit Boolean ‘OR’: ‘ETs’ OR ‘EDs’ equals ‘ETDs’ (MacColl, 2002).

Since the usage of the terms ‘thesis’ and ‘dissertation’ varies from country to country and from one institution to another, this acronym simplifies matters: a digital work, be it an electronic thesis or an electronic dissertation, may be referred to as ‘an ETD’ (The 7th International Symposium on Electronic Theses and Dissertations, 2004).

During the founding of the Monticello Electronic Library Project in 1993, which was supported by SURA and SOLINET, Professor Edward Fox of Virginia Tech, also known as the father of the ETD movement, became Co-Chair of its Working Group on Theses, Technical Reports and Dissertations. In 1994, a workshop funded by and held at Virginia Tech developed plans for ETDs (Fox, Edward A., 1996). Adobe's Portable Document Format (PDF) and the Standard Generalized Markup Language (SGML) were selected for representation and archiving.

To make the system more convenient, so that ETDs can be accessed by all participating institutions, Virginia Tech coordinated the development and implementation of a distributed digital library system (Fox, Edward A., 1997). The system would support browsing and searching (by institution, date, author, title, keywords, and full text), as well as downloading for local reading or (selective) printing.


The fundamental aspects of an ETD system can be reduced to three main attributes and their relationships: the people, the content and the technology. The Report of the DELOS-NSF Working Group on Digital Imagery for Significant Cultural and Historical Materials provides a conceptual framework for digital libraries, shown in Figure 2.1 (Crane, Gregory & Bontcheva, Kalina).




[Figure 2.1 labels: Applications and Use; Presentation and Usability; Creation and Preservation]

Figure 2.1: Interdisciplinary Digital Library Research Model (Crane, Gregory & Bontcheva, Kalina)

This conceptual model illustrates the relationships among people, content, and technology in digital library research. In other words, interdisciplinary digital library research develops technologies that improve the way people create and access content (Ching-chih Chen & Kevin Kiernan). People comprise all users, from librarians to scholars, lecturers and students in all fields of study. Content is the vast array of significant materials throughout the world. Technology covers research and development in all related technical fields, such as information retrieval, artificial intelligence, image processing and data mining.

2.3.1 Benefits of ETDs.

ETDs have many known benefits, several of which have been studied by scholars. They are listed below.

The inclusion of new media, such as multimedia, allows graduate students to express their findings and ideas better and helps readers to improve their understanding.

Since ETDs are in electronic formats, they can easily be backed up. This minimizes the risk of losing information compared with paper theses and dissertations (TDs), because ETDs have no physical form that can decay with age or be damaged in accidents. This benefit is reinforced by rapidly advancing technologies, such as larger hard drives that increase electronic storage space and improvements in file compression.

The draft submission process for ETDs is more efficient because feedback is more timely. It also offers alternatives for sharing drafts between authors and committee members that are not geographically constrained, as both parties can be away from the university during this period.

ETD technology enables readers and the public to access current research, which will ideally be available all day, every day.

The move towards an ETD library also helps graduate students in all areas of study to learn how to become electronic publishers and how to use digital libraries in their own research areas, which prepares them for future work.

Graduate students' research is also aided, since all the works of their peers can be found in a single repository.

ETDs also help the library's management of TDs, as there will be fewer physical copies to handle and less shelf space will be needed for storage. Best of all, no theses will be checked out, so there will be no overdue items and no penalties.

Authors will also be able to have their works read by a global audience, which allows them to gain wider exposure and recognition for their work.


Some established frameworks and protocols should be used as guidelines when developing an ETD system. To create a system that can be easily integrated for cross-organizational access, the system should comply with internationally established protocols and standards.

Open Archives Initiative Protocol for Metadata Harvesting (OAI-PMH)

The Open Archives Initiative Protocol for Metadata Harvesting (OAI-PMH) is a protocol created by the Open Archives Initiative. Its purpose is to gather the metadata descriptions of the records in archives so that services can be built using the metadata collected from various archives. An implementation of OAI-PMH must support metadata represented in Dublin Core, and may optionally support additional metadata formats (Open Archives Initiative Protocol for Metadata Harvesting, Wikipedia).

OAI-PMH is built on a client-server architecture in which "harvesters" request information on updated records from "repositories" (Open Archives Initiative, 2008). Requests can be restricted to a datestamp range and can also be limited to named sets defined by the provider. Data providers are required to supply XML metadata in Dublin Core format, but they may also supply it in other XML formats.
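The selective harvesting request described above can be sketched in Python as follows. This is a minimal illustration, not part of any system reviewed here: the repository address, the set name "theses" and the helper function name are assumptions made for the example. The request parameters themselves (verb, metadataPrefix, from, until and set) are those defined by OAI-PMH, and oai_dc is the prefix for the mandatory Dublin Core format.

```python
from urllib.parse import urlencode

# Hypothetical repository endpoint, used only for illustration.
BASE_URL = "https://repository.example.edu/oai/request"


def build_list_records_url(base_url, metadata_prefix="oai_dc",
                           from_date=None, until_date=None, set_spec=None):
    """Build an OAI-PMH ListRecords request URL for selective harvesting.

    This helper is an assumption of the example; it only assembles the URL
    and does not contact any server.
    """
    # 'verb' and 'metadataPrefix' are required for a ListRecords request.
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    # Optional datestamp range: harvest only records changed in this window.
    if from_date:
        params["from"] = from_date
    if until_date:
        params["until"] = until_date
    # Optional named set defined by the data provider.
    if set_spec:
        params["set"] = set_spec
    return base_url + "?" + urlencode(params)


# Example: request Dublin Core records from a hypothetical "theses" set
# that were added or changed during 2008.
url = build_list_records_url(BASE_URL, from_date="2008-01-01",
                             until_date="2008-12-31", set_spec="theses")
print(url)
```

A harvester would send this URL as an HTTP GET request and parse the XML response returned by the repository.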

Numerous digital library and ETD systems support OAI-PMH, including Fedora, GNU EPrints from the University of Southampton, Open Journal Systems from the Public Knowledge Project, Desire2Learn, DSpace from MIT, HyperJournal from the University of Pisa, Primo, DigiTool, Rosetta and MetaLib from Ex Libris, DOOR from the eLab in Lugano, Switzerland, panFMP from PANGAEA (a data library), and jOAI (Open Archives Initiative Protocol for Metadata Harvesting, Wikipedia).

Dublin Core

The Dublin Core metadata element set is a standard for cross-domain information resource description (Lagoze.C, 2001). It defines conventions for describing online resources in ways that make them easy to find. Dublin Core is extensively used to describe digital resources such as audio, video, images, text, and composite media like web pages (Dublin Core, Wikipedia). Implementations of Dublin Core typically use XML and are based on the Resource Description Framework (RDF). Dublin Core is standardized as ISO Standard 15836 and NISO Standard Z39.85-2007 (Dublin Core, Wikipedia).


The semantics of Dublin Core were established and are maintained by an international, multi-disciplinary group of individuals and organizations, with professionals drawn from library studies, computer science, text encoding, archiving, data mining and other related fields of research and development (Weibel.S, 1997).

The Dublin Core Metadata Initiative (DCMI) is an organization that provides an open forum for the development of interoperable online metadata standards supporting a variety of purposes and business models (Dublin-Core-Community, 1999). DCMI's activities include working groups, global conferences and workshops, standards liaison, and educational efforts to promote metadata standards and practices (Jenkins.C, 2000).

Simple Dublin Core

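The Simple Dublin Core Metadata Element Set (DCMES) consists of 15 metadata elements (Hillmann.D, 2005): Title, Creator, Subject, Description, Publisher, Contributor, Date, Type, Format, Identifier, Source, Language, Relation, Coverage and Rights.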

Each Dublin Core element is optional and repeatable. The DCMI has established standard ways to refine elements and encourages the use of encoding and vocabulary schemes (Weibel, S., 1998). Dublin Core prescribes no order for presenting or using the elements (Dublin Core, Encyclopedia).
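The optional and repeatable nature of the elements can be illustrated with a short Python sketch that serializes a thesis description as an unqualified Dublin Core record in the oai_dc XML format used by OAI-PMH. The helper function and the sample values (title, author, subjects) are hypothetical and chosen only for illustration; the two namespace URIs are the standard ones for oai_dc and for the Dublin Core element set.

```python
import xml.etree.ElementTree as ET

# Standard namespaces for the oai_dc container and Dublin Core elements.
OAI_DC_NS = "http://www.openarchives.org/OAI/2.0/oai_dc/"
DC_NS = "http://purl.org/dc/elements/1.1/"
ET.register_namespace("oai_dc", OAI_DC_NS)
ET.register_namespace("dc", DC_NS)

# The 15 elements of the Simple Dublin Core Metadata Element Set.
DC_ELEMENTS = (
    "title", "creator", "subject", "description", "publisher",
    "contributor", "date", "type", "format", "identifier",
    "source", "language", "relation", "coverage", "rights",
)


def build_dc_record(fields):
    """Serialize (element, value) pairs as an oai_dc XML string.

    Hypothetical helper for this example. Any element may be omitted,
    any element may appear more than once, and the order is free.
    """
    root = ET.Element(f"{{{OAI_DC_NS}}}dc")
    for name, value in fields:
        if name not in DC_ELEMENTS:
            raise ValueError(f"not a Simple Dublin Core element: {name}")
        child = ET.SubElement(root, f"{{{DC_NS}}}{name}")
        child.text = value
    return ET.tostring(root, encoding="unicode")


# Sample values are invented; 'subject' is repeated and most of the
# 15 elements are simply left out.
record = build_dc_record([
    ("title", "A Configurable ETD System"),
    ("creator", "Example, Student"),
    ("subject", "Digital libraries"),
    ("subject", "Electronic theses"),
    ("type", "Thesis"),
])
print(record)
```

Because no element is mandatory, a repository can start with a small set of fields and add more later without invalidating existing records.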


For long-term use and preservation, we need an economical way to preserve digital content for coming generations, using open source archiving methods. We ought to consider existing systems that have already been tried, tested and implemented by respected institutions.

Many systems and software packages are available, from which we can choose the one that best suits our needs. They include DSpace, Virginia Tech's ETD-db, ProQuest and EPrints, to name a few. Several open source packages resemble one another to varying degrees, but the major criteria in selection are portability, functionality, interoperability and sustainability. Copeland and Penman propose the following criteria for deciding on software for ETD systems.

Portability: The software should be easy to install on a range of hardware and operating systems, and should be available at no cost, preferably as open source software. Ease of customization and ease of upgrading should also be among the main concerns.

Functionality: The system should have an intuitive and engaging user interface for system administrators as well as supervisors, and it should encourage students to submit their theses. The system should allow simple as well as advanced metadata searching, while full-text searching would be ideal. The use of metadata that is in line with national or institutional schemes should be encouraged. The software should be able to support any file format or size.

Interoperability: The system must conform to the current version of the Open Archives Initiative Protocol for Metadata Harvesting (OAI-PMH), in addition to accommodating individual institutional policies for combining the ETDs with information in other electronic repositories. It is important to guarantee that information can be transferred from one system to another (Goh D.H, 2006).

Sustainability: Digital repositories are meant for long-term preservation and use. Institutions should therefore be assured that the software will receive sustained support and maintenance.

There are various options to evaluate when selecting or developing ETD software. Richard D. Jones evaluated two open source solutions for presenting theses through a web-based interface: ETD-db by Virginia Tech and DSpace from HP and MIT. ETD-db is designed specifically for ETDs and is certified by the NDLTD, but it was not practical for universities that needed to build a general repository and host ETDs as a part of it (Jones, 2004). Jones found that DSpace has a more complete and accommodating system for using and storing metadata. DSpace allows for future changes to its metadata schemas, as the Dublin Core registry in DSpace is customizable (DSpace Federation, 2002).


DSpace is open source digital repository software developed by MIT (Massachusetts Institute of Technology) and HP (Hewlett-Packard) Labs (Smith MK., 2003). It is a Java-based system that uses Tomcat as the web server and PostgreSQL as the back-end database. The DSpace digital repository provides access control, rights management, versioning, retrieval and community feedback (R.Crow, 2002).

Installation

The DSpace software is developed in Java and requires the Apache Tomcat server and the PostgreSQL database server (DSpace Documentation, 2002). Its installation needs access to the Tomcat applications directory and administrative privileges for the PostgreSQL server. Some knowledge of system administration is required to configure Tomcat, PostgreSQL, and Apache. Installation is not very difficult if the directions given on the DSpace website are followed closely.

Document Submission

Document submission operates in a similar way to ETD-db: users are asked to supply metadata relating to their submission and to upload the submission as a whole or in parts (Hemminger.B, 2004). Students are required to sign a copyright agreement granting the institution the non-exclusive right to publish the document while reserving all other rights for the student. There is currently no option for students to disallow the distribution of their documents, although this feature could be added by modifying the DSpace software.

Administration

The administrative options in DSpace are more customizable than those in ETD-db (Yiotis.K, 2008). Administrators can perform user management, manage communities, and handle collections, metadata, documents, and workflow (Benjelloun.R, 2005). Each community can have several collections, each with its own workflow and administrative authorization.

ETD-db

ETD-db was developed by Virginia Tech as part of its work on the NDLTD. The software is a set of CGI programs written in Perl that use the open source MySQL database. ETD-db was specifically designed as a solution for managing a collection of ETDs (Hemminger.B., Jackson Fox, Mao Ni, 2004). Besides the standard Perl installation, extra Perl modules must also be installed to extend the functionality of the language. An experienced systems administrator is needed to set up these prerequisites.

Installation

The ETD-db website offers steps for installing the software using the Apache web server. The installation was found to be slightly complex when following the available documentation. Several bugs relating to the creation of the database tables were encountered, though they were easily fixed.

Document Submission

Document submission is performed through a sequence of forms that allow users to enter metadata describing their submission and then upload their thesis or dissertation as a single document or as a series of documents. Submissions have to be evaluated by an administrator before they are published.

Administration

ETD-db offers an administrative interface that allows administrators to assess submitted documents, to modify, delete and add documents, to set availability options on the documents, and to approve documents for publication (Hemminger.B., Jackson Fox, Mao Ni, 2004).

ProQuest

ProQuest has developed a commercial ETD solution to complement its existing services (Fineman, Yale, 2003). A beta version of this system was reviewed. The web interface for this digital dissertation service was developed by Berkeley Electronic Press (bepress). A demonstration system is available on the web that allows one to play the roles of student submitter and administrator.

Installation

No installation is required at the client site; only access to the paid site is needed.

Document Submission

The submission process is quite straightforward: the author is asked to provide contact information and document metadata, and to upload the full text of the thesis. This process is not substantially different from that of the other packages reviewed.

Administration

The administration module lets the collection manager view current submissions and assign them to their respective reviewers. Reviewers can then revise the submission, accept it, or reject it (with remarks or notes). A checklist function is available for reviewers to use when reviewing submissions.

Document Access

Since ProQuest is a commercial scheme, access is in some instances fee-based (Hemminger.B., Jackson Fox, Mao Ni, 2004). Member institutions have access to their own institutions' materials, and fee-based access to the full text of documents published by other institutions. ProQuest also supports sending an electronic copy to the institution, so that the institution can archive the item and make it freely available if it chooses.

Overall review of existing ETD systems.

DSpace is a very flexible digital archiving system. The software is under active development, and many features that would be advantageous to an ETD repository are being considered.

The ETD-db system is designed exclusively to manage ETDs and as such is not as well suited for other digital sources (Copeland, Susan & Penman, Andrew, 2004). Also, it is designed to support only the ETD-MS standard for ETD metadata. Nonetheless, it is an open system and is modifiable. Given the small size of the program code, modifications to the software should not be particularly difficult.

The ProQuest ETD service provides a simple means for institutions to begin building their ETD collections. Since many institutions already work with ProQuest, it may be the answer to their needs. The ProQuest system is not an open source solution, so there is less scope to customize it to fit individual requirements, although bepress does allow some minor interface changes, such as customization of the web forms. Also, access to materials is much more constrained than with ETD-db or DSpace, both of which support OAI and hence permit open metadata searching and full-text access to all materials.

2.5.3 Comparison of customizable features between the three existing ETD systems discussed above.

| Feature | DSpace | ETD-db | ProQuest |
| --- | --- | --- | --- |
| Compliance with OAI-PMH | Fully compliant. | Not fully compliant, as it follows the NDLTD, which complies with the ETD-MS standard for metadata. | Not compliant, as it uses a set of unqualified Dublin Core elements. |
| Customizable options | | | |
| - Workflow | GUI to a certain extent; based on community and collection, not individual persons. | Through program code. | Through program code. |
| - User interface | Through GUI. | Through GUI. | Through program code. |
| - Ability to customize the metadata | Through program code. | | |
| - Ability to configure browse and search | Through program code. | Through program code. | Through program code. |
| - Configurable database | Through program code. | Through program code. | |
| Ease of customizability | Quite easy for advanced users with some knowledge of programming. | Not easy for a person without programming knowledge. | Not easy for a person without programming knowledge. |


Many of the weaknesses of the existing systems stem from a lack of flexibility in selecting a submission flow. The proposed configurable ETD system will help lecturers by giving them a flexible way to examine and submit their students' theses. In the same way, the configurable tool will benefit students, who will be able to get direct feedback and liaise online with their lecturers at their own convenience.

With the general aim of promoting ETDs, this project sets out to develop a configurable Open Archives Initiative (OAI)-compliant thesis archive that enables institutions to publish ETDs on the web for use in all participating universities.

In addition, this project will help other universities by producing a 'checklist approach' for them to use as they develop e-theses capability.

