The new functions of DL



The primary purpose of a literature review is to 're-view' the area of investigation. It is an essential and ongoing phase of any research study design, as it can provide new ideas and approaches that might not otherwise have occurred to the researcher. This chapter describes the topics considered most important and relevant to this study, including the concepts, usability and roles of the Digital Library (DL). The aim of this literature review is to investigate the new functions of DLs that can benefit the education environment.

Digital Library

Digital Library: History and Definition

Libraries slowly started to appear in the middle of the 15th century. In the 17th and 18th centuries, these libraries started developing from private collections into college and university libraries, as well as libraries of scientific societies. A library was not just a place to collect and protect books and manuscripts, but also a place that supplied library patrons with relevant materials. More than one million titles were produced in the 17th century. The number multiplied in the following centuries and the titles became more specialized. As a result, the flood of information kept growing and seemed impossible to handle. The scientific societies of the time tried to handle the problem by constructing bibliographies, exchanging abstracts and publishing scholarly journals. Nowadays, the problem is approximately the same. However, today we have the latest tools and technologies that can be used to tackle it. With the invention of the computer in the mid-20th century, the information overload problem was given serious attention by scientists and researchers in the computer science area. Many great ideas have been proposed to solve this problem.

In July 1945, Dr. Vannevar Bush, who was the Director of the Office of Scientific Research and Development, proposed a theoretical proto-hypertext computer system, namely 'The Memex', in his article 'As We May Think[1]'. The Memex (as shown in Figure 2.1) was a machine intended as a private file system for personal use. Using the device, an individual could store all records, articles, books and communications. Most of this content would be available on microfilm and hence could be inserted into the machine. A simple numerical code could be given to each document to allow users to access it. All of these documents were then mechanized by The Memex so that they could be consulted to deliver responses with considerable flexibility and speed. The impressive feature of The Memex was its capability of binding two items together by associative indexing. The Memex is one of the most significant guides from the past in the direction of DLs and was one of the best early attempts to solve the problem of information overload.

As an information collection centre, a library's function is not limited to providing information that can be accessed only within the library building. In fact, today's borderless world has opened up vast scope for accessing information globally. Society's appreciation of the importance of information is rising, and information has become an important element in its daily activities. The growth of research and development has also encouraged the optimum utilization of information, which indirectly requires efficient information management systems built on the latest information technology. Therefore, the DL is seen as one of the important elements in the information technology environment and should be given serious consideration.

Digital Libraries (DLs) are being created today for diverse communities and in different fields such as education, science, culture, development, health, government and so on. They are considered by many to be a key application of internet and web technologies, as their collections are typically accessed over the internet and web from virtually anywhere and at any time. Historically, the shift from traditional to digital libraries was not a one-step process. Many different blends of the two are in practice nowadays, providing various sets of services to best accommodate the needs of their patrons. The boundaries between the following five types are not strict or exactly defined. The following is a categorization from the LISWiki[3] project:

  1. Traditional Library: The collections of traditional libraries are mostly print media, manuscripts, etc., and are not well organized. The documents deteriorate at a rapid rate, and collection information is not easy to locate and so does not easily reach the user. In addition, traditional libraries are confined within a physical boundary.
  2. Automated Library: A library with a machine-readable catalog, computerized acquisition, circulation and an OPAC is called an automated library. The holdings of this type of library are the same as those of traditional libraries.
  3. Electronic Library: When an automated library adopts Local Area Networking (LAN) and CD-ROM networking and starts procuring e-journals and other similar kinds of publication, it is known as an electronic library. The resources of electronic libraries are in both print and electronic form. Electronic media are used for the storage, retrieval and delivery of information.
  4. Digital Library: This is a later stage of the electronic library. In a DL, high-speed optical fiber is used for the LAN, access is over a Wide Area Network (WAN), and a wide range of internet-based services is provided, e.g. audio and video conferencing. The majority of the holdings of a DL are in computer-readable form, and the DL also acts as a point of access to other online sources.
  5. Hybrid Library: Libraries that work in both the electronic or digital and the print environment are known as hybrid libraries. This is in fact a transitional state between the print and digital environments. It is expected that in the near future libraries will be of a hybrid nature. Some of the strongest points in favor of this view are the centuries-old habit of reading on paper, the greater convenience of handling and reading a paper document compared with a digitized one (which requires equipment to read), the incompatible standards of electronic products, and the different display standards of digital products and their associated problems.

Digital repository is a recently popularized term. Being 'collections of digital objects', digital repositories are essentially one of the core units of a DL. What makes them different from other digital collections is the following:

  • Content is deposited in a repository, whether by the content creator, owner or third party on their behalf.
  • The repository architecture manages content as well as metadata.
  • The repository offers a minimum set of basic services e.g. put, get, search, and access control.
  • The repository must be sustainable and trusted, well-supported and well-managed (Heery and Anderson, 2005).
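
The minimum set of repository services listed above (put, get, search and access control) can be illustrated with a short sketch. This is only a toy in-memory model written for this review, not the architecture of any particular repository system; the class names, the per-item `readers` set and the sample records are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class StoredItem:
    content: bytes
    metadata: dict
    readers: set  # hypothetical access model: user names allowed to read


class Repository:
    """Toy in-memory repository offering put, get, search and access control."""

    def __init__(self):
        self._items = {}

    def put(self, item_id, content, metadata, readers):
        # Content and its metadata are deposited and managed together.
        self._items[item_id] = StoredItem(content, dict(metadata), set(readers))

    def get(self, item_id, user):
        item = self._items[item_id]
        if user not in item.readers:
            raise PermissionError(f"{user} may not read {item_id}")
        return item.content

    def search(self, user, **criteria):
        # Return ids of readable items whose metadata matches every criterion.
        return sorted(
            item_id
            for item_id, item in self._items.items()
            if user in item.readers
            and all(item.metadata.get(k) == v for k, v in criteria.items())
        )


# Sample data (invented for illustration).
repo = Repository()
repo.put("thesis-1", b"thesis text", {"type": "thesis"}, {"alice"})
repo.put("report-7", b"report text", {"type": "report"}, {"alice", "bob"})
```

In this sketch a search only returns items the user is allowed to read, so access control applies to discovery as well as to retrieval; a real repository may separate the two.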

The term 'digital library' is the most recent in a long series of names for a concept that has been discussed since the development of the first computer. The definition of DL has changed over time along with the development of new technologies. Even after years, professionals have not agreed upon a single definition of what a DL is. This is arguably a good thing, because entities as complex as DLs can be approached from many points of view; a global, universal library is rather a utopia. "It was a challenge to find words to express what DL really should be, and it took courage to argue that resource that called itself a DL really was not one" (Seadle and Greifeneder, 2007). A very general but broadly accepted description of DL was provided by Arms (2000):

"Digital Library is a managed collection of information with associated services where the information is stored in digital format and accessible over a network" (Arms, 2000).

In the US, the Digital Library Federation[4] (DLF), an organization of research libraries and various national institutions formed in 1995, agreed after considerable deliberation on a 'working definition of digital library', representing the definition of the practice community:

"Digital Libraries are organizations that provide the resources, including the specialized staff, to select, structure, offer intellectual access to, interpret, distribute, preserve the integrity of, and ensure the persistence over time of collections of digital works so that they are readily and economically available for use by a defined community or set of communities", (DLF, 1998).

The goal of the DLF, as stated by Saracevic and Dalbello (2001), is 'to establish the conditions necessary for the creation, maintenance, expansion and preservation of a distributed collection of digital materials accessible to scholars and the wider public'. Other definitions of the term 'digital library' found in this literature review are given below:

"Digital Libraries are organized collections of digital information. They combine the structuring and gathering of information, which libraries and archives have always done, with the digital representation that computers have made possible", (Lesk, 1997).

"The digital library is the collection of services and the collection of information objects that support users in dealing with information objects available directly or indirectly via electronic/digital means", (Leiner, 1998).

"Digital Libraries basically store materials in electronic format and manipulate large collections of those materials effectively", (NSF, 1999).

"A Digital Library is a library in which collections are stored in digital formats (as opposed to print, microfilm, or other media) and accessible by computers. The digital content may be stored locally or accessed remotely via computer network", (Wikipedia, 2009a).

Each of these definitions of DL attempts to model a different facet of existing systems. Some definitions attempt to place emphasis on the human aspects, whereas others try to fit DLs into formal frameworks. In any case, "DLs are in fact probably too young to define in any permanent way, but how we think about them will have a great deal to do with how future generations of librarians conceptualize their mission in the digital world" (Seadle and Greifeneder, 2007).

Benefits of Digital Library

A DL combines the structuring and gathering of information with digital representation. It can be accessed rapidly from around the world. It also provides the principles governing what is included and how the collection is organized. Chowdhury and Chowdhury (2003), in their book, explain the great impact that DLs could have on society:

"DLs have the potential to make a tremendous impact on our every-day life. They will bring a paradigm shift in the ways we create, distribute, seek and use information, and thus will make significant impacts on the way we do our day-to-day work - study, research, jobs, problem solving, decision making, and so on. DLs will also have a tremendous impact on the information industry, effecting the information generators, publisher and distributors, and information service providers", (Chowdhury and Chowdhury, 2003).

DLs consist of a set of electronic resources and associated technical capabilities for creating, searching and using information. These capabilities are natural benefits of a DL. Rajashekar (2006a) lists the benefits of DLs as follows:

1. DL brings the library to the user

  • DL brings information to the user, at work or at home.
  • With a DL on the desktop, the user never needs to visit a library building.
  • There is a library wherever there is a PC and a network connection.

2. Improved access - searching and browsing

  • Support full text searching - finding information in paper-based material is very difficult.
  • Search systems are improving.
  • Hypertext linking.

3. Information can be shared more easily

  • Placing digital information on a network makes it available to everybody - mirror sites improve access further - duplication of paper material is very expensive.

4. Easier to keep information current

  • Information can be updated continuously much more easily.

5. Information is always available

  • Not limited by time and geography (3 A's - anytime, anywhere, any format).
  • Materials are never checked out, mis-shelved, or stolen.

6. New forms of information become possible

  • Digital representation can support features and manipulations not possible in print form (e.g. chemical structures, mathematical equations, multi-media).

7. Wider access

  • A DL can meet simultaneous access requests for the same electronic document by easily creating multiple instances (or copies) of the requested document. A DL can thus meet the requirements of a much larger population of users.

8. Allow collaboration and exchange of ideas

  • Technology of DL is closely related to e-mail and teleconferencing.
  • Potential for convergence.
  • Integration with Knowledge Management.

9. DLs may save money

  • Hard data is not yet available.
  • Conventional libraries are expensive - building, professional staff, maintenance.
  • Today's DLs are also expensive - but as technology costs decline and improved tools become available, DLs may eventually prove to be less expensive.

10. Improved preservation

  • It is easier to copy digital information, without errors - no fear of maintaining one physical object permanently. So rare publications and artifacts may be preserved better by providing access to their digital versions (Rajashekar, 2006a).

A DL is responsible for developing and providing integrated access to its collections. It offers a wider and easier-to-use information technology system, which enables the library to achieve its operational and strategic goals in terms of collecting digital material and preparing access to that material. Other benefits of using a DL are listed below:

  1. Enable wider and easier use of the library collection, as it is a systematic way to collect, store, preserve and manage information and knowledge.
  2. Provide access to national digital publications.
  3. Encourage economical and efficient information dissemination.
  4. Encourage cooperative efforts to invest in research, computerization and communication networks.
  5. Strengthen the relationship and cooperation between the research, education, government and business communities.
  6. Play an important role in producing and expanding knowledge in strategic fields.
  7. Contribute to lifelong learning opportunities.
  8. Store and manage digital materials obtained by a library.
  9. Create and maintain metadata.
  10. Allow users to identify and retrieve materials.
  11. Interface effectively with current and planned library information technology.

A global computer network, or the internet, is assumed to be the primary delivery mechanism for digital information. For a DL to provide equitable access to information, it is imperative that it share the universal availability that is a characteristic of the network. In the future, however, complex multimedia resources and services may have specialized software and hardware requirements such that only a limited number of workstations can actually access the information.

Digital Library Services and Functional Components

With the mass spread of the internet after the arrival of web technologies, some people proclaimed that the whole internet is one huge DL. The Library and Information Science (LIS) community, however, disagrees: the most important difference between the internet and a DL lies in the associated services. This difference was described by Lagoze, Shaw, Davis and Crafft (1995) as:

"Although the internet provides access to an enormous amount of information, the current state-of-the-art falls far short of what is commonly viewed as a library service - that is, relatively easy navigation of and access to a set of documents that are part of collection. The notion of a collection is important in that it implies that the set of documents was not selected haphazardly, but by some trusted intermediary. Current users of the internet confront an information space where the quality of documents is far from reliable, facilities for locating documents are primitive, and access to a specific document frequently means wading through a Tower of Babel of architecture dependencies and file formats."

Even with the term 'Digital Library' explained, we can still find many different ways in which the idea of a DL can be executed in practice, and thus many ways in which properties and features can be set up and made available. A description of the potential properties of a DL and the level to which they can be adopted is shown in Table 2.1 below.

A DL service is the fusion of computing, storage, and communication machinery coupled with the software needed to reprise, emulate, and extend the services provided by conventional libraries based on paper and other material means of collecting, storing, cataloging, finding and disseminating information. The DL service exposes a specific functionality to its users to fulfill their information needs. The services exposed by a DL include services to support the management of collections, services to provide replicated and reliable storage, services to aid in query formulation and execution, services to assist in name resolution and location, and services for accessing library items, processing the information they contain and communicating information about them. Basic DL services include services for indexing, searching, browsing and cataloging digital resources (Kelapure, 2003). A few common functional components are shared by most DLs. "A basic understanding of the key functional components will help in preparing better for DL development efforts" (Rajashekar, 2006b). The key components of DLs are described below, followed by an illustration in Figure 2.2.

  1. Selection and acquisition: Typical processes covered in this component include: the selection of documents to be added, the digitization and/or conversion of these documents to appropriate digital form.
  2. Organization: A key process involved in this component is the assignment of metadata (e.g. bibliographic information) to each document being added to the collection.
  3. Indexing and storage: This component carries out the indexing and storage of documents and metadata, for efficient search and retrieval.
  4. Repository: This is the core component of the DL, consisting of document objects, metadata and indexes created for purposes of search and retrieval.
  5. Search and retrieval: This is the DL front-end used by the end-users to browse, search, retrieve and view the contents of the DL. This is typically presented to the users as an HTML page.
  6. Digital library website: This is the server computer that hosts the DL collection, and presents the collection to the user in the form of a website home page. The user selects a suitable link on this page to go to the search and retrieval front-end mentioned above. The DL delivers the content based on search and retrieval operations. The DL home page itself may be integrated with the library website through an appropriate hypertext link.
  7. Network connectivity: For online access, the DL website computer should have a dedicated connection to the intranet and/or Internet. Depending on the target user community, access may be restricted to the intranet (organizational LAN) or extended to external users through the Internet (Rajashekar, 2006b).
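
To make the 'indexing and storage' and 'search and retrieval' components more concrete, the following minimal sketch builds an inverted index over a few documents and answers a query with it. It illustrates one common indexing technique under simplifying assumptions (plain-text documents, lower-cased word matching, boolean AND queries); it is not the method described by Rajashekar (2006b), and the function names and sample documents are invented for illustration.

```python
import re
from collections import defaultdict


def tokenize(text):
    # Lower-case the text and split it into alphanumeric terms.
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(documents):
    """Map each term to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in documents.items():
        for term in tokenize(text):
            index[term].add(doc_id)
    return index


def search(index, query):
    """Return ids of documents containing every query term (boolean AND)."""
    terms = tokenize(query)
    if not terms:
        return set()
    result = set(index.get(terms[0], set()))
    for term in terms[1:]:
        result &= index.get(term, set())
    return result


# Sample documents (invented for illustration).
documents = {
    "doc1": "Digital libraries store collections in digital form.",
    "doc2": "Metadata describe digital objects.",
    "doc3": "Traditional libraries hold print collections.",
}
index = build_index(documents)
matches = search(index, "digital collections")
```

Building the index once at storage time is what allows the search front-end to answer queries without scanning every document.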

The schema in Figure 2.3 describes the key elements of a fully developed DL environment and their relations. These elements were introduced by Sun Microsystems.

  1. Initial conversion of content from physical to digital form.
  2. The extraction or creation of metadata or indexing information describing the content to facilitate searching and discovery, as well as administrative and structural metadata to assist in object viewing, management and preservation.
  3. Storage of digital content and metadata in an appropriate multimedia repository. The repository will include rights management capabilities to enforce intellectual property rights, if required. E-commerce functionality may also be present if needed to handle accounting and billing.
  4. Client services for the browser, including repository querying and workflow.
  5. Content delivery via file transfer or streaming media.
  6. Patron access through a browser or dedicated client.
  7. A private or public network. (Sun Microsystems, 2002)

Metadata standards in Digital Library

It was the information technology revolution that popularized metadata both as a term and as a tool. By the most general definitions, metadata are 'data about data' or, alternatively, 'information about information'. To put the wording in a more concrete frame, we can say that:

"Metadata is information about a thing, apart from the thing itself." (Batchelder, 2003)

"Metadata is structured, encoded data that describe characteristics of information-bearing entities to aid in the identification, discovery, assessment and management of the described entities." (Durrel, 1985)

In the world of traditional libraries, metadata are represented by the bibliographic record, which usually tries to include all the available information about a particular item. This is important for cataloguing, retrieval, circulation, searching and other library and archival services. Although mankind has been storing and harvesting such information for well over a millennium, the term 'metadata' is relatively new and probably surfaced only after 1969, when it was coined by Jack E. Myers[5]. Interestingly, he registered 'Metadata' as a trademark for his company.

The definition of 'metadata' given by Durrel (1985) offers more insight into what metadata are like in the digital world. The important terms are 'structured' and 'encoded'. While machine processability and readability is an important core feature of metadata, it is not what essentially demarcates metadata from what is not metadata. Current trends in metadata are towards creating structures that are both machine and human readable and processable.

Unlike bibliographic records, which tend to be general, subject to a few authorities and in a single format, metadata in a networked digital environment are usually highly specialized to cover a specific aspect or feature, and they come in various structures and encodings. Metadata can be classified in various ways. However, in the world of DLs we mostly deal with three types of metadata:

  • Descriptive metadata
  • Structural metadata
  • Administrative metadata

Descriptive metadata are the most common type of metadata. They are used to describe properties of the content, whether it is a book, picture, journal article, webpage or even the whole repository or database. These sets of information are used for traditional operations like identification, searching, selection, retrieval or evaluation. Descriptive metadata are very similar to a bibliographic record, which is where libraries usually store them.

Structural metadata focus on describing format, structure and/or relations of an object or compound set of objects. The purpose is to secure correct interpretation and storage of objects or object sets. Structural metadata can also be carriers of the semantic information inside the text. Unlike other types of metadata, structural metadata are often written directly inside the document or inside the object.

Administrative metadata are frequently used in a DL environment for management of the sources, controlled access and archival purposes. An important part of administrative metadata consists of transaction logs. The significance of administrative metadata is presently rising, as they are used for many advanced applications such as personalization and adaptive DL models. In contrast with descriptive and structural metadata, administrative metadata lack a widely adopted standard and are often kept in various custom formats. However, with the rising importance of the ability to share these kinds of data, this situation may change rapidly.
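
The distinction between the three types can be shown with a small sketch of a single record that keeps them apart. This is only an illustrative data structure, assuming that each type is held as a simple dictionary; the class name, field names and sample values are hypothetical and do not follow any metadata standard.

```python
from dataclasses import dataclass, field


@dataclass
class DigitalObjectRecord:
    # Descriptive metadata: properties of the content, used for discovery.
    descriptive: dict
    # Structural metadata: format and relations between parts of the object.
    structural: dict
    # Administrative metadata: management, access control, transaction logs.
    administrative: dict = field(default_factory=dict)

    def log_access(self, user):
        # Transaction logs are kept as part of the administrative metadata.
        self.administrative.setdefault("transaction_log", []).append(user)


# Sample record (invented for illustration).
record = DigitalObjectRecord(
    descriptive={"title": "Sample report", "creator": "A. Author"},
    structural={"format": "image/tiff", "page_order": ["p1.tif", "p2.tif"]},
)
record.log_access("alice")
```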

Metadata can be used for different purposes. One of the first and best known applications of metadata is the description of bibliographic entities such as books. As information resources began to be made available digitally, it soon became clear that complex metadata standards were inefficient for dealing with the explosion of digital information, so less complex metadata standards started to be developed. The number of subjects represented by digital information greatly increased, and metadata standards specific to particular subjects were developed, such as those for geospatial data and educational resources. The growing need for a low-barrier interoperability solution for access across fairly heterogeneous digital repositories can be met through the use of metadata.

Various formats and standards are used to store and share descriptive metadata of objects. Among purely digital formats we can mention the Dublin Core Metadata Initiative[6] (DCMI) format or the more advanced MODS[7], which is regarded as a compromise between the simplicity of Dublin Core and the complexity of MARC[8]. MARC formats are also undergoing a transition to the online world through MARC XML[9], a highly complex framework of various standards, schemas and converters that brings the MARC21 format into the XML environment.

Gorman (1999) argues that the non-cataloguing world perceives metadata as being different from traditional cataloguing, which, according to him, has complex formats and expensive and stringent quality requirements. He believes that four approaches are available for the bibliographic control of electronic resources, with varying degrees of control: i) full MARC cataloguing for high-quality resources that are likely to have continuing value; ii) enriched Dublin Core for the next level; iii) minimal Dublin Core for the level after that; iv) unstructured full-text keyword searching for the remainder.

The Dublin Core metadata standard was born out of the DCMI, an organization dedicated to the promotion and adoption of interoperable metadata standards. Dublin Core helps with the searching, browsing and indexing of web-based information by supplying metadata for it. The standard deals mostly with descriptive metadata, but also has some elements that can have other applications. Dublin Core includes fifteen properties that can be used for description of digital objects and other resources. "The fifteen element Dublin Core described in this standard is part of a larger set of metadata vocabularies and technical specifications maintained by the DCMI" (Dublin Core Metadata Initiative, 2008). Table 2.2 below shows the fifteen Dublin Core elements along with the official Dublin Core definition of each element.
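
As an illustration of how the fifteen elements can be used in practice, the sketch below encodes a few of them as XML with Python's standard library. The element names and the namespace URI are those of the Dublin Core element set; the `record` wrapper element, the function name and the sample values are assumptions made for this example only.

```python
import xml.etree.ElementTree as ET

DC_NS = "http://purl.org/dc/elements/1.1/"
DC_ELEMENTS = (
    "title", "creator", "subject", "description", "publisher",
    "contributor", "date", "type", "format", "identifier",
    "source", "language", "relation", "coverage", "rights",
)


def dublin_core_record(fields):
    """Build an XML element holding simple Dublin Core properties."""
    ET.register_namespace("dc", DC_NS)
    root = ET.Element("record")  # wrapper element name is an assumption
    for name, value in fields.items():
        if name not in DC_ELEMENTS:
            raise ValueError(f"not a Dublin Core element: {name}")
        child = ET.SubElement(root, f"{{{DC_NS}}}{name}")
        child.text = value
    return root


# Sample values (invented for illustration).
record = dublin_core_record({
    "title": "An example thesis",
    "creator": "A. Author",
    "date": "2009",
    "language": "en",
})
xml_text = ET.tostring(record, encoding="unicode")
```

Because every element is optional and repeatable in simple Dublin Core, a record may carry only the handful of properties that are known for a resource.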

MODS is probably the most important contemporary standard that is now being widely adopted for object description metadata. The need to create MODS was raised by DL operators, and it was specifically designed to meet their needs: to be able to manage complex sources based on MARC records in a digital environment and to interoperate with systems currently using other metadata schemas. Originally planned as a MARC subset, it was later developed into an independent schema that does not contain all the MARC fields but includes some new features previously impossible in MARC.

The METS[10] schema is a standard for encoding descriptive, administrative, and structural metadata regarding objects within a digital library, expressed using the XML schema language of the World Wide Web Consortium. The standard is maintained in the Network Development and MARC Standards Office of the Library of Congress[11], and is being developed as an initiative of the DLF. The main purpose of METS is "creating XML document instances that express the hierarchical structure of DL objects" (Wikipedia, 2009b).
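
The idea of expressing the hierarchical structure of an object can be sketched as follows. The example builds only a nested structural map using the METS namespace and its `structMap` and `div` elements; a complete METS document contains further sections, and this fragment has not been validated against the METS schema. The helper function and the sample book structure are hypothetical.

```python
import xml.etree.ElementTree as ET

METS_NS = "http://www.loc.gov/METS/"
ET.register_namespace("mets", METS_NS)


def add_div(parent, div_type, label, children=()):
    """Append a nested <mets:div> for one node of the object hierarchy."""
    div = ET.SubElement(
        parent, f"{{{METS_NS}}}div", {"TYPE": div_type, "LABEL": label}
    )
    for child_type, child_label, grandchildren in children:
        add_div(div, child_type, child_label, grandchildren)
    return div


# Sample hierarchy (invented): a book with two chapters and three pages.
mets = ET.Element(f"{{{METS_NS}}}mets")
struct_map = ET.SubElement(mets, f"{{{METS_NS}}}structMap", {"TYPE": "logical"})
book = add_div(struct_map, "book", "Sample book", [
    ("chapter", "Chapter 1", [("page", "Page 1", ()), ("page", "Page 2", ())]),
    ("chapter", "Chapter 2", [("page", "Page 3", ())]),
])
xml_text = ET.tostring(mets, encoding="unicode")
```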

PREMIS[12] (Preservation Metadata: Implementation Strategies) is an initiative jointly sponsored by OCLC[13] and RLG (now part of OCLC). A working group was established with the following two objectives (OCLC, 2005):

  • Develop a core preservation metadata set, supported by a data dictionary, with broad applicability across the digital preservation community.
  • Identify and evaluate alternative strategies for encoding, storing, and managing preservation metadata in digital preservation systems.

The result of the initiative is a well-structured metadata set and data dictionary. Implementation strategies and recommendations were also introduced, along with METS-compatible metadata schemas describing PREMIS structures. The advantage of PREMIS is the simplicity of its data model, which consists of five core entities: Objects, Events, Agents, Rights and the Intellectual Entity itself (McCallum, 2005).
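
The five core entities can be pictured as a handful of linked record types. The sketch below is a simplified illustration only: the attribute names and the way the entities are linked are assumptions made for this example and are not taken from the PREMIS data dictionary.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Agent:
    name: str  # person, organization or software involved in an event


@dataclass
class Event:
    event_type: str  # an action that affects an object, e.g. ingestion
    agent: Agent


@dataclass
class Rights:
    statement: str  # permission or restriction that applies to an object


@dataclass
class DigitalObject:
    identifier: str
    events: List[Event] = field(default_factory=list)
    rights: List[Rights] = field(default_factory=list)


@dataclass
class IntellectualEntity:
    title: str  # the coherent piece of content the objects represent
    objects: List[DigitalObject] = field(default_factory=list)


# Sample data (invented for illustration).
archivist = Agent(name="Archivist")
scan = DigitalObject(identifier="obj-001")
scan.events.append(Event(event_type="ingestion", agent=archivist))
scan.rights.append(Rights(statement="Open access"))
manuscript = IntellectualEntity(title="Sample manuscript", objects=[scan])
```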

Digital Library Education

DLs are emerging as an important area of research and education for information science, computer science and a number of other related disciplines. The creation of DLs marks a fundamental shift in the way we ought to envision library and information science education (Spink, 1999). Ma, Clegg and O'Brien (2006) have defined Digital Library Education (DLE) as "the programmes or course specific to the training and educating of students who will be able to build and manage DLs after graduation". It is an area of increasing importance nowadays. It is therefore clear that social trends and technology create a pressing need for educational developments in this fast-moving area.

DL is a concept that can be defined from different perspectives, which are mirrored in most DLE programs in Library and Information Science (LIS). According to Chowdhury and Chowdhury (2003), computer scientists refer to a DL as a tool for accessing and retrieving digital content, whilst library and information professionals describe a DL in terms of its collection, organization and service aspects. Saracevic and Dalbello (2001), on the other hand, define DL as organizing and accessing human knowledge records in a digital and networked environment, with the assumption that digital technology and networks will affect knowledge handling. Taking a general definition of DL based on the above points of view, it can be seen that DLE is a complex curriculum, because it involves various layers of evolving technology, content, knowledge representation, organization, access and use, including social, legal, and cultural issues, as suggested by Saracevic and Dalbello (2003).

There are various definitions of DL in the literature that have been taken as the fundamental notion when designing most DLE programs. Hence several definitions of the construct are elaborated here. Lesk (1997), Harter (1997), Arms (2000) and Witten and Bainbridge (2003) define DL as an organized collection of digital information together with methods for access and retrieval, and for selection, organization, and maintenance of the collection. Some researchers view DL as a 'hybrid library' where digital and printed information resources co-exist and are brought together in an integrated information service accessible locally as well as remotely (HyLife, 2002). The hybrid library is on the continuum between the conventional library and the DL, where electronic and paper-based information sources are used alongside each other (Pinfield et al, 1998). Marchionini and Fox (1999) alternatively shape DL by four dimensions:

  1. Community, to reflect social, political, legal and cultural issues.
  2. Technology, to serve as the engine moving the DL field, including technical progress in computing, networking, information storage and retrieval, multimedia, interface design and so on.
  3. Services, which provide digital reference services, real-time question answering, on-demand help, information literacy and user involvement mechanisms.
  4. Content, which represents all possible kinds of form and genre of information, printed as well as digital.

DLE is a program conducted to educate and equip new librarians and information professionals to be competent in building and managing digital collections, comfortable working in a digital environment and able to maintain the DL. According to the American Library Association, which accredits programs in LIS, "over half (52%) of the accredited LIS master's programs offered courses on DLs between 2003 and 2006, a total of 40 courses" (Pomerantz et al, 2007).

Given the different definitions of DL from the above viewpoints, it is reasonable to conclude that DLE is a complex curriculum because it involves various layers of evolving technology, content, knowledge representation, organization, access and use, as well as social, legal, and cultural issues. A wide range of studies have been conducted in the DLE field, mainly to improve DLE curricula. Among them are studies that highlight the rationale for teaching DL, the content that should be included and the approaches used to teach DL (Saracevic and Dalbello, 2001). In addition, an investigation of the state of the DL curriculum and a review of the adequacy of DLE have been conducted in Hungary and North America (Koltay and Boda 2008; Liu, 2004). In general, these studies have demonstrated that DLE focuses on the tools or technologies used for building DLs, and that the majority of DLE programs contain a hands-on or practical element that requires students to interact with a DL. The rationale behind this approach is to prepare students for the modern DL world, on the assumption that students who have received practical experience in DL are best served for future practice in the field of librarianship.

Educational Applications of Digital Library

A number of studies have utilized DL in educational practice, although across varied fields of study (Marshall et al, 2006; Abdullah and Zainab, 2008; and Yaron et al, 2008). This is mainly because the DL platform can accommodate higher-level learning and thinking skills, such as in Project-Based Learning (PBL), problem solving, decision making, and creative thinking, taking advantage of Internet technology. Marshall et al (2006) promote utilizing DL in a Computer Science education context using a system called GetSmart. The GetSmart system integrates course management, DL and knowledge representation (using concept mapping) components to support the information search process. In the study, Marshall et al (2006) state that "the search and concept mapping functions can be combined to support exploration, formulation and collection", which leads to learning progress functions that also support exploration and formulation as part of the educational process.

Abdullah and Zainab (2008), on the other hand, used an integrated information literacy approach based on Eisenberg and Berkowitz's Big 6 model to employ a Collaborative DL (CDL) in a teaching and learning environment. The study focused on conducting students' projects using the PBL approach for secondary students in Malaysia. It promoted the CDL as a way for users to document and contribute their knowledge collaboratively, in order to show how authoring is possible in a digital learning environment. It also showed that a CDL can be used not only for information storage but also as a tool for information production.

Meanwhile, Yaron et al (2008) used DL to facilitate cross-disciplinary education in molecular science for undergraduate students. The study is discussed in terms of the three model perspectives proposed by Sumner and Marlino in 2004. The collection is housed within the Materials Digital Library (MatDL) and is designed to help address educational challenges. The purpose of the study was to develop educational resources and promote learning of recurring patterns in molecular science, and it encouraged flexible design of cross-disciplinary resources at three levels: curriculum level, technical level and delivery level.

Over the past few years, interest in embedding DL into learning and teaching environments has increased with the progression of e-learning initiatives. This is demonstrated by several studies, such as the one conducted by Sumner and Marlino (2004). They proposed a model for designing and evaluating DL for educational practice in which a DL can be utilized as a cognitive tool, a component repository and a knowledge network in an e-learning environment. The cognitive tool model views the DL as a tool that helps learners use its resources to construct their own knowledge representations. This view draws on constructivist learning and is particularly useful in PBL. The main idea behind this model is that a DL can provide interfaces and tools that help learners engage actively in the learning process. The component repository model, on the other hand, supports DL users (ranging from educators and curriculum developers to learners) in constructing new educational resources collaboratively. The knowledge network model, in addition, accommodates interactions within the DL community in order to share knowledge and experiences and to keep current activities up to date, fostering knowledge building and community development.

This model has been adopted by the Digital Library for Earth System Education[14] (DLESE) and the National Science Digital Library[15] (NSDL). DLESE, which was funded by the National Science Foundation (NSF), consists of a variety of resources to support teaching and learning about the Earth system at all education levels. DLESE contains lesson plans, scientific data, visualizations, interactive computer models, and virtual field trips for the Earth system. It is a community-owned and community-governed system that engages users as contributors by providing a forum for sharing content, ideas, enthusiasm and support. The DLESE project started in 1998 and the prototype of the DLESE website was implemented in 2001. By 2002 DLESE provided infrastructure for a Distributed Community Library. DLESE, which facilitates sharing, collaboration and excellence in Earth System Education, also offers access to peer-reviewed teaching and learning resources, and interfaces and tools that allow exploration of Earth data. Five operational service areas were established to ensure the efficiency of the system: collection services, community services, data services, evaluation services and the DLESE program center (DPC). NSDL, alternatively, provides digital educational resources covering science, technology, engineering, and mathematics from pre-K to postgraduate levels. It was established in 2002 as an online library to provide access to its services and tools. As of February 2009, NSDL contained 154 unique collections of resources and over two million records. NSDL's data repository is based on Fedora, which has made it possible to create an online environment of context and resource contribution. NSDL has built an educational tool that models scientific inquiry and exposes the processes of scientific research. It also promotes and facilitates collaboration between research and education communities by bringing content expertise into the classroom. NSDL uses blogging technology to enhance discovery, selection and use of its resources by creating context for the resources. Community members are able to become contributors of resources, questions, reviews, annotations and metadata.

In 2001, Ekmekcioglu and Brown conducted the INSPIRAL (Investigating Portals for Information Resources and Learning) project to "investigate, identify and critically analyze the issues that surround the linking of online learning environments and DLs", focusing on the higher education learner. From this project, they found that "the integration of online learning environments into DL resources and services is a worthwhile aim to pursue" (Ekmekcioglu and Brown, 2001).

Collaborative Knowledge Management

The accumulation and use of knowledge has been the foundation of human evolution and growth since its very beginning; however, the systematic study of managing knowledge as a strategic organizational resource, or more precisely Knowledge Management (KM), did not begin and proliferate until recently. The term 'Knowledge Management' was first introduced in 1986 by the American Productivity and Quality Center (Baker, 2002), and it has been a much-discussed topic in the years since (Nonaka and Takeuchi, 1995; Davenport and Prusak, 1998; and Alberto, 2000). However, the lack of theoretical understanding of knowledge and of practically proven methods for efficient KM is surprising (Holsapple, 2003).

Wiig (1999) defines KM as "the systematic and explicit management of knowledge-related activities, practices, programs and policies within the enterprise", and identifies multiple KM processes: goal definition, identification, acquisition, development, distribution, application, maintenance and assessment of knowledge. Skyrme (1997) views KM as the purposeful and systematic management of vital knowledge along with its associated processes of creating, gathering, organizing, diffusing, using and exploiting that knowledge. Davenport and Prusak (1998) claim that KM is the process of capturing, distributing and effectively using knowledge.

Drucker (1998) mentioned in his book, Managing in a Time of Great Change, that "knowledge has become the key economic resource and the dominant - and perhaps even the only - source of comparative advantage", because knowledge is difficult to create and imitate (Peteraf, 1997; and Teece, 1998), and it has to be nurtured and managed (Maria and Marti, 2001). Senge (1990) has warned that many organizations are unable to transform and function as knowledge organizations because of learning disabilities. With rapid changes in technology, the ways in which information is created, stored, used and shared have made it more accessible and have made national borders nearly meaningless in defining an organization's operating boundaries.

Explicit knowledge is easily formalized and documented (Hippel, 1994; and Duffy, 2000), and can be captured or shared through information technology. Explicit knowledge is usually expressed in the form of data and numbers, and can be shared formally and systematically in the form of data, specifications, manuals, drawings, audio and video tapes, computer programs, patents, and the like. In contrast, tacit knowledge is difficult to express and formalize, and is thus difficult to share, as it includes an individual's insights, intuitions and hunches. Tacit knowledge resides in people, evolves from their interactions, and requires skill and practice (Riggins and Rhee, 1999).

KM is a complicated and multifaceted discipline. Scholars and researchers may take different perspectives and levels of depth in analyzing the subject. Similarly, KM practitioners may take various approaches to tackling the KM problem. Therefore, the concepts of knowledge and knowledge management are best defined by the people who use them in their respective areas. In a survey study on KM by Davenport, De Long and Beers (1998), four categories of KM processes were named by the participants:

  • Creation of knowledge repositories.
  • Improvement of knowledge access.
  • Enhancement of knowledge environment.
  • Management of knowledge as an asset.

These categories of processes can be further divided into sub-tasks. There are various KM frameworks and models, and the KM processes vary slightly between them. However, the ultimate goal of KM is to provide a systematic management framework and methodology for managing knowledge resources effectively and sustaining competitive advantage.

Nonaka and Konno (1998) articulated a well-known model of the knowledge creation process, the SECI (Socialization, Externalization, Combination, Internalization) model, which describes the ways knowledge is generated, transferred and re-created within an organization. In summary, the SECI model, as shown in Figure 2.4, identifies the following:

  • Two forms of knowledge (tacit and explicit).
  • A dynamic and interaction space (transfer).
  • Three levels of aggregation (individual, group, context).
  • Four knowledge-creating processes: socialization, externalization, combination and internalization.

These four knowledge-creation processes are considered as the basic processes by which knowledge is created.

  1. Socialization: Individuals get together and share their experience of specific tasks, projects or processes in a free and open environment, and in this way the tacit knowledge of individuals is transformed into the tacit knowledge of groups.
  2. Externalization: Individuals talk about their experience of a particular area or subject and, as a consequence of collective reflection, members arrive at new knowledge about that area; the tacit knowledge is thus articulated and expressed in an explicit form.
  3. Combination: Many people work together, each contributing to a particular area of knowledge, to make the whole set of knowledge complete and comprehensive through collaboration and sharing. In this case, the existing explicit knowledge of individuals or teams is transformed into systematic knowledge, such as a set of specifications.
  4. Internalization: Explicit knowledge is transformed into tacit knowledge that is operational in nature. The individual acquires a specific skill and becomes proficient in it through repeated learning and doing.
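
The four processes can be read as a lookup from a source form of knowledge to a target form. The following is a minimal sketch of that reading, not part of the SECI literature itself; the names `Knowledge`, `SECI` and `conversion` are hypothetical and chosen only for illustration.

```python
from enum import Enum


class Knowledge(Enum):
    """The two forms of knowledge named by the SECI model."""
    TACIT = "tacit"
    EXPLICIT = "explicit"


# Each SECI process converts knowledge from one form (source) to another (target).
SECI = {
    (Knowledge.TACIT, Knowledge.TACIT): "socialization",
    (Knowledge.TACIT, Knowledge.EXPLICIT): "externalization",
    (Knowledge.EXPLICIT, Knowledge.EXPLICIT): "combination",
    (Knowledge.EXPLICIT, Knowledge.TACIT): "internalization",
}


def conversion(source: Knowledge, target: Knowledge) -> str:
    """Return the name of the SECI process that turns `source` into `target`."""
    return SECI[(source, target)]


if __name__ == "__main__":
    # Writing down what an expert knows is a tacit-to-explicit conversion.
    print(conversion(Knowledge.TACIT, Knowledge.EXPLICIT))
```

Because there are two forms of knowledge and every ordered pair is covered, the table has exactly four entries, one per process.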

The importance of a shared or interactive space for knowledge creation is suggested by Alavi and Leidner (2001). They proposed that a shared knowledge space is needed to facilitate knowledge creation, and that the applicability of IT to knowledge exchange is questionable without such a space. Many IT applications, particularly groupware and portal applications, aim to facilitate the knowledge processes (creation, application, distribution, storage and so on) by providing virtual collaboration spaces, chat rooms, bulletin boards and similar tools that support communication amongst team members.

There are various KM models or frameworks that guide practitioners in implementing KM solutions or conducting KM research. These frameworks identify the key processes of KM as well as the key influential factors, or enablers, for KM within the organization. The key processes and the critical influential factors interact dynamically within the framework, and practitioners have to address these parameters and processes when designing KM systems to ensure effectiveness. Commonly identified enablers in KM models or frameworks include management, structure, culture, competence, motivation and reward, and information technology. Davenport and Prusak (1998) describe KM as involving organizational, human and technical issues, with technology always acting as an enabler for KM. Technology facilitates the various knowledge processes for KM purposes, such as the application, creation, distribution and storage of knowledge.

There are three fundamental elements within any KM framework, namely people, process, and technology. The KM problem is tackled from the perspective of a process organization, and IT is considered one of the prime enablers for realizing KM processes and related KM activities. IT supports communication, cooperation and coordination, and allows timely access to information and the sources of knowledge. The knowledge process should consist of the essential processes illustrated in Figure 2.5: create, capture, organize, access and use. Almost every IT element implements these functions at the operating system level or through resource management utilities or applications. Similarly, the human interactive processes are collaborate, find, mediate, facilitate, share and so on, which allow users to manipulate the information.

Knowledge Management Concept in Digital Library Environment

Knowledge management is still a nascent organizational practice, so there is as yet no agreed-upon definition for it. Therefore, it is generally described as broadly as possible, as in the following definition by Prusak: KM is "any process or practice of creating, acquiring, capturing, sharing and using knowledge, wherever it resides, to enhance learning and performance in organizations" (Prusak, 1997). Knowledge does not simply "exist" - it begins as raw facts and numbers. When put into context, this data becomes information, such as the content of documents or records in a database. This information becomes knowledge only after it is combined with experience and knowledge (Kidwell et al, 2000).

The goal of KM is to allow businesses to improve how knowledge within an organization is used and shared. Learning institutions are in the business of knowledge, so it seems that they would benefit immensely from participating in KM activities. At the Knowledge Management in Education Summit in 2002, the participants agreed that KM practices provide important benefits for educators, including better work processes, improved curriculum, and above all else, positive student outcomes (Petrides, 2003). In an educational environment, part of understanding work practices involves understanding the social landscape. An effective KM tool designed for educators will attempt to address problems (where appropriate) within the social structure. It is easy to overlook that students are the true beneficiaries of KM in education, since implementing effective KM tools in learning institutions relies heavily on positive teacher outcomes. It is important to stress to teachers that by participating in KM activities and using KM tools, they have the potential to improve both the curriculum and their effectiveness as educators, which ultimately benefits the students.

Training and education activities are informed by various theories of learning. Constructivists view the learner as actively constructing new knowledge drawing upon pre-existing information and past experiences. As experience is gained and knowledge is built, learning opportunities produce new concepts or ideas (Maughan and Anderson, 2005). Traditional industry and learning institution curricula tend to treat content in an abstract or formal epistemological fashion independent of applications or work settings. KM in the support of task performance must be derived from the activity and involves identifying and capturing knowledge, indexing knowledge, and making knowledge available to users in flexible and useful ways (Siemens, 2004).

Emerging KM practices are based partly on recent cognitive science understandings of human capabilities, such as conceptual blending and concepts for learning. KM should enhance individual, group and organizational learning, improve information circulation and even support innovation. It aims to capture and represent an organization's knowledge assets to facilitate knowledge access, sharing and reuse. The management of knowledge requires the ability to describe, organize and apply relationships.

At the core of KM is the desire to identify and share knowledge that may not otherwise be found and shared, such as tacit knowledge residing in a single individual or an organization's grey literature, which is usually accessible to only a few of its members. The theory behind knowledge management practice is that knowledge is not an end in itself. When information and knowledge flows can be captured, organized and made accessible for reuse, there exists the potential for subsequent creation of new knowledge (Williams, 2004). The most commonly used knowledge manipulation processes are capture, storage and distribution. People use different types of repositories, and specialists implement different technologies for organizing the collection, storage and on-demand delivery of knowledge. The purpose of the process is to improve qualifications and to achieve better results.

Sumner and Marlino (2004) have introduced the knowledge network model that can benefit educational DL in how libraries and library communities:

  • Accommodate and support different types of participant interactions, both human and technology-mediated.
  • Foster knowledge building and community development through specific forms of interactions.
  • Enable participants to choose varying thresholds of entry and ongoing participation.
  • Support participants to make use of captured interactions to inform their current activities.
  • Affect participants' views of themselves, their knowledge and skills, and their changing role in the community.
  • Grow and sustain themselves.

There is a strong relationship between knowledge and libraries. Material stored in libraries contains knowledge, and making this material available is the primary aim of a DL. KM and DL could be aligned because they share a similar focus, namely enhancing human knowledge. Both are also looking for ways to categorize and store knowledge. DL has the potential to facilitate KM functions by enabling barrier-free access to materials and incorporating structured and unstructured information in a way that precipitates knowledge discovery (Rydberg-Cox, 2000).

When evaluating research about DL for learning purposes and knowledge sharing across organizations, it is clear that KM and DL for learning purposes could be more aligned. Reasons for this integration include:

  • DL for learning purposes and KM share a similar focus: how to enhance human knowledge and its use within organizations. Both DL for learning purposes and KM are looking for ways to categorize and share knowledge.
  • There is a growing awareness of the fact that knowledge in an organization is distributed among its people's minds and a variety of knowledge artefacts.

Both content management and learning management systems are designed to store knowledge or learning/course components, often at an object level. Because of this, KM is not the only field that may fuse with learning management: in the vendor market, there is increasing demand for content management systems to grow closer to learning management systems.

The APQC[16] defines content management as follows: "a system to provide meaningful and timely information to end users by creating processes that identify, collect, categorize and refresh content using a common taxonomy across the organization. A content management system includes people, process, technology and the content itself".

The increasing demand to compress the time needed to develop content for DL initiatives, and for more targeted or personalized learning through the use and repurposing of standards-based learning objects, leads to a quicker unification of concepts and systems. Key issues are:

  • Setting priorities for the investments in KM, learning management and content management, resulting in a holistic approach of intellectual capital management.
  • Developing and managing individuals, competencies and communities.
  • Describing, classifying and managing unstructured content.
  • Creating and managing activities aimed at transferring knowledge to individuals (communities within an organization) and putting knowledge to work.

When learning management systems are designed to store course components at the object level in a central repository, they grow closer to the available content management systems. This opens the door to single-sourcing solutions that manage content throughout an organization.

Predicting the future of a huge and fast-changing area like DLs is a difficult task. However, DLs will no doubt play a key role in creating a perfect information management environment or, as the new terminology has it, a KM environment. KM is the new buzzword in the corporate as well as the government sector. In KM terms, organizational knowledge may be divided into tacit knowledge, explicit knowledge and cultural knowledge (Choo, 1998a, 1998b, 2000). Implicit in this suggestion is the important idea that knowledge is not just an object or artefact, but also the outcome of people working together, sharing experiences and constructing meaning out of what they do. DLs can play a significant role in achieving this goal.

Keeping these broad objectives in view, Rowley (1999) comments that KM is concerned with the exploitation and development of the knowledge assets of an organization with a view to furthering the organization's objectives - the knowledge assets to be managed include explicit, documented knowledge and tacit, subjective knowledge. DLs, with the major objective of making digital information - on local as well as remote and distributed servers - accessible to every user in the community, can play a key role in KM in any organization. In future, all organizations will need mechanisms for gaining easy access to local as well as global information. In order to create a knowledge-based environment, organizations should also build mechanisms for capturing information on local expertise.

DSpace - An Open Source Digital Library

The definitions and descriptions of DLs capture the essence of these software systems, which provide storage of and access to digital content. As with any type of software, there are different ways to write a DL product, and different aspects receive greater attention depending on the developers and the goals of the system. For example, image editing programs are similar in what they allow users to do, but each looks slightly different and highlights certain features more than others, and some packages provide special functions that others do not. No doubt these various, similar products are developed and coded differently.

The DSpace™ system[17] is a digital research repository system built to address a very real need among academic institutions: the increasing amount of scholarly work generated by faculty and students that was sparsely viewed and suffered from occasional preservation issues. The system, a joint venture between the Massachusetts Institute of Technology (MIT) and Hewlett-Packard (HP) Labs, aims to provide an online, electronic repository that stores, organizes and preserves scholarly work, giving such work broader exposure and a longer, if not infinite, lifespan (Smith et al, 2003).

DSpace is a digital repository system that allows users to submit and store information that may have broad appeal, and allows others to read and use it. With a specific focus on the preservation of stored data, DSpace employs digital preservation functions such as storing checksums along with digital objects in order to track and verify a file's conformance with the original. DSpace is an open source software project written entirely in Java. DSpace was created breadth first, so that most of the functionality required by organizations seeking to use such a product was covered in a simple and basic way (Smith et al, 2003). DSpace has a well-developed underlying model that drives the way users use the system and submit and use content, and the way administrators organize and configure the system. The software's underlying code base provides APIs that administrators and third-party applications can use to interact with the DSpace system. To be more usable for different types of users, the software provides a configurable submission and workflow process that can be fitted to any organization's policies and practices (Smith et al, 2003).
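
The idea of storing a checksum alongside a digital object, and later recomputing it to confirm that the file still matches the original, can be illustrated with a short sketch. This is not DSpace's actual code (DSpace is written in Java); the function names, the in-memory `registry` dictionary and the choice of SHA-256 are all assumptions made for the example.

```python
import hashlib
import tempfile
from pathlib import Path


def compute_checksum(path: Path, algorithm: str = "sha256") -> str:
    """Return the hex digest of a file, reading it in chunks."""
    digest = hashlib.new(algorithm)
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(8192), b""):
            digest.update(chunk)
    return digest.hexdigest()


def deposit(path: Path, registry: dict) -> None:
    """Record the checksum of a file at submission time (hypothetical registry)."""
    registry[path.name] = compute_checksum(path)


def is_intact(path: Path, registry: dict) -> bool:
    """Recompute the checksum and compare it with the one stored at deposit."""
    expected = registry.get(path.name)
    return expected is not None and expected == compute_checksum(path)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        article = Path(tmp) / "article.txt"
        article.write_text("original scholarly work")
        registry = {}
        deposit(article, registry)
        print(is_intact(article, registry))   # file unchanged since deposit
        article.write_text("silently corrupted copy")
        print(is_intact(article, registry))   # content no longer matches
```

A real repository would keep the checksum in persistent metadata rather than in a dictionary, but the comparison step is the same.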

As detailed in Figure 2.6, the DSpace software is divided into a relatively common three-tiered architecture. The three layers are the application layer, the business logic layer and the storage layer. On the lowest layer of this architecture, the storage layer, all bitstreams held in the repository are stored as files in the system's file structure. References to these files, along with most other metadata, settings and other information that drives the behavior of the system, are stored in a relational database system, usually PostgreSQL[18]. This marriage of techniques allows quick, relationally oriented access strategies for metadata and runtime data, while keeping stored documents in a regular file system. Together, the storage layer aspects of DSpace make up the storage API. The business logic layer is made up of a set of classes or modules that embody the inner workings of many DSpace object types, including user-related functions, browsing and searching, content management and others. Business logic classes make up DSpace's public API, which allows third-party code to interact with DSpace in the same way that typical interaction within the software occurs. Lastly, the application layer is the highest layer of functionality in DSpace and brings together the backend functionality to provide the services that users see when they use the system. Included in the application layer are the import/export functionality, statistics tools and the web-based user interface. Given DSpace's open source nature, the source code for all of these aspects is available to organizations using the system and can be tweaked and customized to meet their needs more adequately.
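
The division of responsibilities between the three layers can be sketched in a few lines. The example below is an illustration only and does not reproduce DSpace's Java API: the class and function names are hypothetical, and SQLite (via Python's standard `sqlite3` module) stands in for PostgreSQL so that the sketch runs without a database server.

```python
import sqlite3
import tempfile
import uuid
from pathlib import Path


class StorageLayer:
    """Storage layer: bitstreams as files, metadata in a relational table."""

    def __init__(self, asset_dir: Path) -> None:
        self.asset_dir = asset_dir
        self.db = sqlite3.connect(":memory:")  # stand-in for PostgreSQL
        self.db.execute(
            "CREATE TABLE item (id INTEGER PRIMARY KEY, title TEXT, file_name TEXT)"
        )

    def store(self, title: str, file_name: str, content: bytes) -> int:
        # The document itself goes to the file system ...
        (self.asset_dir / file_name).write_bytes(content)
        # ... and a reference to it, plus metadata, goes to the database.
        cursor = self.db.execute(
            "INSERT INTO item (title, file_name) VALUES (?, ?)", (title, file_name)
        )
        self.db.commit()
        return cursor.lastrowid

    def find_by_title(self, term: str) -> list:
        rows = self.db.execute(
            "SELECT id, title, file_name FROM item WHERE title LIKE ? ORDER BY id",
            (f"%{term}%",),
        )
        return rows.fetchall()


class Repository:
    """Business logic layer: rules for submitting and searching items."""

    def __init__(self, storage: StorageLayer) -> None:
        self.storage = storage

    def submit(self, title: str, content: bytes) -> int:
        if not title.strip():
            raise ValueError("an item needs a title")
        return self.storage.store(title, f"{uuid.uuid4().hex}.bin", content)

    def search(self, term: str) -> list:
        return [title for _, title, _ in self.storage.find_by_title(term)]


def render_search_page(repo: Repository, term: str) -> str:
    """Application layer: turns business-logic results into what a user sees."""
    titles = repo.search(term)
    return "\n".join(f"- {title}" for title in titles) or "No items found."


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        repo = Repository(StorageLayer(Path(tmp)))
        repo.submit("Thesis on digital libraries", b"document bytes")
        print(render_search_page(repo, "digital"))
```

The application-level function never touches files or SQL directly; it only calls the business logic class, which mirrors the way third-party code is described as interacting with DSpace through its public API.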


This chapter has been dedicated to providing an overview of the area under investigation, namely the digital library as a knowledge management tool for digital library education, with the main goal of giving a holistic view of the current state of research in this domain to assist in positioning this study.

In conclusion, it is important to understand that DLs were not originally designed as standalone entities. Many users perceive DLs more as digital collections that extend and transform the traditional understanding of libraries. Thus, the term DL covers not just the collection of digital works but also the services through which these collections are created, stored and provided to readers.

DL research has focused on automating the activities carried out by users, such as automatic indexing and classification and expert systems for the reference desk. A digital catalog can support long keywords along with differential weights, long user queries, ranked retrieval and so on. Information search via hypertext illustrates that indices can be implicit, giving users a seamless blend of primary and secondary works. Further, some current library activities may become irrelevant. For example, circulation problems originating in a fixed number of copies of each work simply disappear. We might redefine and redesign library services to achieve the basic aims more effectively than is possible now. Thus DLs not only involve the automation of each traditional library activity and service, but also call for the redefinition of services, new groupings of services or the replacement of groups of services with other solutions.
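
To make the idea of weighted keywords and ranked retrieval concrete, the following is a minimal sketch under simple assumptions: each catalog record carries hypothetical keyword weights, a record's score is the sum of the weights of the query terms it contains, and results are returned from highest to lowest score. The function name `rank` and the sample catalog are invented for illustration and do not describe any particular DL system.

```python
def rank(catalog: dict, query: str) -> list:
    """Return titles ordered by the summed weights of matching query terms."""
    terms = query.lower().split()
    scores = {}
    for title, weights in catalog.items():
        score = sum(weights.get(term, 0.0) for term in terms)
        if score > 0:  # records with no matching keyword are left out
            scores[title] = score
    # Highest score first; ties are broken alphabetically by title.
    return sorted(scores, key=lambda title: (-scores[title], title))


if __name__ == "__main__":
    # Hypothetical catalog: title -> {keyword: weight}
    catalog = {
        "Building Digital Libraries": {"digital": 0.9, "library": 0.8},
        "Library Management Basics": {"library": 0.5},
        "Introduction to Geology": {"geology": 1.0},
    }
    for title in rank(catalog, "digital library"):
        print(title)
```

Unlike a fixed card index, a scheme like this accepts queries of any length and orders the results rather than merely listing exact matches.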

  • Available from: Last accessed 9/6/2009.
  • Available from: htt