Resolving Database Through Ontology Mapping For Archaeology Computer Science Essay



The amount of data and the number of heterogeneous data resources relating to Archaeology are increasing tremendously every day. With this much data, interoperability and data integration are drawing researchers' interest towards the database heterogeneity problem. A single database is not enough to serve users' needs on the semantic web, so we need to map more than one database. One solution is to map the archaeological databases to one common language for domain experts in archaeology, such as the CIDOC CRM ontology. My project, "Resolving database heterogeneity through Ontology mapping for Archaeology", is an MSc CO7201 Individual Project. In this project I need to map two archaeological datasets to the CIDOC CRM ontology and provide an interface for querying the databases. I need to develop a tool which loads the CIDOC CRM ontology and two other ontologies and displays them in a user interface; I then perform mapping between the entities of the two archaeological ontologies and the CIDOC CRM ontology and save all these mappings in an owl file. When the user queries this owl file, the tool returns the results. The main challenge in this project is to perform the mapping between these ontologies and to save all the mappings in an owl file. I have developed a tool which can map both the schema and the instances of the ontologies and save all those mappings in an owl file. The mapping process is done interactively with the user. All the mappings I made between the two archaeological ontologies and the CIDOC CRM ontology are presented in this report.


1. Motivation & Introduction

2. Project Aim and Objectives

3. Literature Review

3.1. Basic Schema Mapping Techniques

3.1.1. Elementary level Techniques

3.1.2. Structure level techniques

3.2. Matching Systems

3.3. Example mapping using CIDOC CRM

3.4. The CIDOC CRM Ontology

3.5. Protégé-owl API programmer's guide

4. Technical Specifications

4.1. Project Tools

4.1.1. NetBeans

4.1.2. Protégé

4.2. Programming Technologies

4.2.1. Programming Language - Java 6.0

4.2.2. Sparql Query Language

4.2.3. OWL 2.0

5. Design Overview

5.1. User Interface

5.2. Loading the Ontologies

5.3. Mapping between the ontologies

5.4. Saving all the mappings

6. Implementation

6.1 Configuring the NetBeans

6.2. Graphical User Interface

6.3. Loading Ontologies

6.4. Mapping Algorithm

6.5. Saving the Mappings:

7. Further research and Limitations:

8. Conclusion

9. Project Work Plan

10. Bibliography


1. Motivation and Introduction

In the World Wide Web, the number of data providers and amount of available data is increasing tremendously [3]. This has spurred an increasing interest from professionals and the general public to make publicly available the tremendous wealth of information kept in museums, archives and libraries - the so-called "memory organisations". Quite naturally, their development has focused on presentation, such as websites and interfaces to their local databases. Now with more and more information becoming available, there is an increasing demand for targeted global search, comparative studies, data transfer and data migration between heterogeneous sources of cultural contents [4]. Therefore, integration issues are attracting ever more attention. Data integration refers to combining data in such a way that a homogeneous and uniform view is presented to users. One more thing to consider is that these days a single database is no longer enough to serve people's needs and requirements in archaeology; we need to use two or more data resources to satisfy them. This requires resolving database heterogeneity, and here semantic heterogeneity comes into the picture. This term refers to differences or similarities in the meaning of local data. For example, two schema elements in two local data sources can have the same intended meaning, but different names. Thus, during integration, it should be recognised that these two elements actually refer to the same concept. Alternatively, two schema elements in two data sources might be named identically, while their intended meanings are incompatible. Hence, these elements should be treated as different things during integration [3].

This problem motivated me to undertake this project on resolving database heterogeneity in the archaeology domain.

The solution to this problem is to map the archaeology datasets using the CIDOC CRM ontology. In my project I have to remove the database heterogeneity of two archaeological datasets by mapping both of them to the CIDOC CRM ontology. For this I use ontology mapping techniques on the database schemas, i.e. both schema mapping and instance mapping, for heterogeneous databases in the archaeology domain. The CIDOC CRM ontology serves as a common understanding between all the databases in cultural heritage. The CIDOC Conceptual Reference Model ("CRM") is a formal ontology intended to facilitate the integration, mediation and interchange of heterogeneous cultural heritage information [1]. The primary role of the CRM is to enable information exchange and integration between heterogeneous sources of cultural heritage information [1]. The ontology has around 80 classes and 140 properties. By analysing these classes, properties and the relations between them, we can add further classes or properties if needed; otherwise we can use the ontology for mapping directly. I need to map the two archaeological ontologies to the CIDOC CRM ontology. This project could be developed with any Java IDE; I used NetBeans because it makes designing the user interface for my tool easy. I need to develop code which loads the two archaeological ontologies as well as the CIDOC CRM ontology and provides the opportunity to map all the entities of these ontologies to the CIDOC CRM ontology with human intervention (interactive mapping). I need to map classes as well as properties and the instances of those classes. Mapping two entities means selecting the relation between them, such as equivalent entity or subclass of the entity; an entity may be a class, a property or an instance.
After all the mappings are done, I need to save them in an owl file. We can then query this owl file using the Protégé-OWL API from Java. When we query this mapped owl file, it returns results from both archaeological datasets. In this way we can reduce the database heterogeneity of archaeological datasets.

So far we have seen the motivation for and introduction to this project. The remainder of this report is organised as follows. Section 2 presents the main aim of the project, its objectives and its challenges. Section 3 reviews some schema mapping techniques, matching systems and other background material. Section 4 describes the tools and programming language used. Section 5 covers the design of the tool, and Section 6 its implementation. Section 7 discusses the limitations of the project and further research, and Section 8 concludes.

2. Project Aim and Objectives

The main aim of this project is to resolve the heterogeneity of databases in the Archaeology domain by using the CIDOC CRM ontology, and to query these databases by using all the mappings.

The main task of this project is to implement a Java application which can load the ontologies, perform semi-automatic or manual mapping between them, and then use those mappings to answer users' queries. The tool has to have an easy-to-use interface so that users can perform the mapping and get results without difficulty. The main challenge in this project is to understand the CIDOC CRM ontology, i.e. its classes and properties, because we need to map the other two ontologies to it. We need to load all three ontologies, the two archaeological ontologies and the CIDOC CRM ontology, into the tool. Displaying all these ontologies in tree form is a big challenge; we need to display all the entities, including classes, properties and instances. The next step is to map these ontologies to the CIDOC CRM ontology. This step is the crucial one: if we do not do the mapping properly, we do not get the required results for queries, so we must take care when mapping the entities. For this we need to study some basic schema mapping techniques and some matching-system algorithms. We might need to map large databases, so the algorithm must be efficient and map quickly even when the databases are very large.

The requirements for this project are that the tool has to display the three ontologies in tree form. The input for the tool is owl files only, so we need to generate an owl file from the schema if the file is in RDF format. The CIDOC CRM ontology is distributed in RDF format, so we build an ontology from it in Protégé to obtain an owl file of the CIDOC CRM ontology. The output must also be saved as an owl file so that we can query it.

The input for this project is two archaeological schema files and the CIDOC CRM ontology; the output of the tool is an owl file. A sample mapping of two ontologies is shown below in Figure: Sample results. My tool needs to do the same for three ontologies. This example shows only class mapping, but my tool has to map properties and instances as well.

Figure: Sample results [6]

3. Literature Review

This section describes some of the basic schema mapping techniques and some matching systems.

3.1. Basic Schema Mapping Techniques

The figure below, Figure: Classification of basic schema mapping techniques, shows how the basic schema mapping techniques are classified.

These techniques can be classified along different dimensions, i.e. elementary level vs. structure level, and syntactic vs. semantic vs. external. I will not discuss these classifications further here. Some of the basic schema mapping techniques are:

3.1.1. Elementary level Techniques:

String based techniques: These are often used in order to match names and name descriptions of schema/ontology entities. These techniques consider strings as sequences of letters in an alphabet. They are typically based on the following intuition: the more similar the strings, the more likely they denote the same concepts. Some examples of string-based techniques which are extensively used in matching systems are prefix, suffix, edit distance, and n-gram.
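To illustrate the edit-distance measure mentioned above, a minimal Levenshtein implementation might look like the sketch below; the class and method names are my own, not taken from any matching system.

```java
// Minimal Levenshtein edit distance, one of the string-based matching
// measures mentioned above. Class and method names are illustrative.
public class EditDistance {

    // Returns the number of single-character insertions, deletions,
    // and substitutions needed to turn s into t.
    public static int levenshtein(String s, String t) {
        int[][] d = new int[s.length() + 1][t.length() + 1];
        for (int i = 0; i <= s.length(); i++) d[i][0] = i;
        for (int j = 0; j <= t.length(); j++) d[0][j] = j;
        for (int i = 1; i <= s.length(); i++) {
            for (int j = 1; j <= t.length(); j++) {
                int cost = (s.charAt(i - 1) == t.charAt(j - 1)) ? 0 : 1;
                d[i][j] = Math.min(Math.min(d[i - 1][j] + 1, d[i][j - 1] + 1),
                                   d[i - 1][j - 1] + cost);
            }
        }
        return d[s.length()][t.length()];
    }

    public static void main(String[] args) {
        // Similar entity names get a small distance.
        System.out.println(levenshtein("Place", "Places")); // 1
    }
}
```

The intuition from the text applies directly: the smaller the distance between two entity names, the more likely they denote the same concept.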

Figure: Classification of basic schema mapping techniques [12]

Language-based techniques: These techniques consider names as words in some natural language. They are based on Natural Language Processing (NLP) techniques exploiting morphological properties of the input words. Some examples of these techniques are tokenization, lemmatization, and elimination.

Constraint-based techniques: These are algorithms which deal with the internal constraints applied to the definitions of entities, such as types, cardinality of attributes, and keys. Examples of these techniques are datatype comparison and multiplicity comparison [12].

Some of the other techniques are Linguistic resources, Alignment reuse and Upper level formal ontologies.

3.1.2. Structure level techniques:

Graph-based techniques: These are graph algorithms which consider the input as labelled graphs. The applications (e.g., database schemas, taxonomies, or ontologies) are viewed as graph-like structures containing terms and their inter-relationships. Usually, the similarity comparison between a pair of nodes from the two schemas/ontologies is based on the analysis of their positions within the graphs. Some examples are Graph matching, Children, Leaves and Relations.
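As a toy illustration of the Children technique named above, two classes can be compared by the overlap of their subclass labels. The Jaccard-based sketch below is my own and is not taken from any particular matching system.

```java
import java.util.HashSet;
import java.util.Set;

// Toy illustration of the "Children" structure-level technique:
// two classes are considered similar if the sets of their subclass
// labels overlap. Not taken from any real matcher.
public class ChildrenMatcher {

    // Jaccard similarity: |A ∩ B| / |A ∪ B|, in [0, 1].
    public static double childSimilarity(Set<String> childrenA, Set<String> childrenB) {
        if (childrenA.isEmpty() && childrenB.isEmpty()) return 0.0;
        Set<String> intersection = new HashSet<>(childrenA);
        intersection.retainAll(childrenB);
        Set<String> union = new HashSet<>(childrenA);
        union.addAll(childrenB);
        return (double) intersection.size() / union.size();
    }

    public static void main(String[] args) {
        Set<String> a = Set.of("Coin", "Vase", "LoomWeight");
        Set<String> b = Set.of("Coin", "Vase", "Statue");
        System.out.println(childSimilarity(a, b)); // 0.5 (2 shared of 4 total)
    }
}
```

In a real matcher this score would be combined with elementary-level scores (e.g. edit distance on the class names themselves) rather than used alone.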

Taxonomy-based techniques: These are also graph algorithms which consider only the specialization relation. The intuition behind taxonomic techniques is that is-a links connect terms that are already similar. Examples are Bounded path matching and Super (sub)-concepts rules [12]

Some of the other techniques are Repository of structures and Model based techniques.

3.2. Matching Systems

Some of the matching systems developed on the basis of the above techniques are Similarity Flooding, Artemis, Cupid, COMA, NOM, QOM, OLA, Anchor-PROMPT, and S-Match.

All these mapping systems are automatic or semi-automatic. Some of them support only schema mapping, i.e. classes and properties; some support instance mapping; and some support both schema and instances. My tool should be able to map classes, properties and instances. Automatic or semi-automatic mapping is attractive for archaeology database websites, but the algorithms behind these systems are rather difficult and time-consuming to implement. Their main benefits to users are very little human intervention and quick, on-the-fly mapping, but there is no guarantee that all the mappings they generate are good; there is sometimes a chance of false mappings, so we need to watch this aspect. A further difficulty is that the databases to be mapped can be very large, so the matching algorithm must be efficient, with low runtime complexity, to map the ontologies quickly. As an individual project it would not be possible to finish in time with a semi-automatic mapping tool, so I chose manual mapping instead. The disadvantages of manual mapping are that it takes much longer to perform, and that human intervention is necessary, so the maintenance cost is high because the mapping is done interactively with the tool and on-the-fly mapping is not possible. The benefits are that design time and complexity are much lower than for semi-automatic mapping systems, and it is far less expensive.

If we want to map manually, we first need to understand the basic schema mapping techniques; only then can we map the entities properly. False mappings lead to improper or wrong results, so we need to take care here.

3.3. Example mapping using CIDOC CRM

The figure below, Figure: Mapping DC record to the CIDOC CRM, shows how a partial DC record is mapped to the CIDOC CRM ontology:

Type: text

Title: Mapping of the Dublin Core Metadata Element Set to the CIDOC CRM

Creator: Martin Doerr

Publisher: ICS-FORTH

Identifier: FORTH-ICS / TR 274 July 2000

Language: English

Figure: Mapping DC record to the CIDOC CRM [13]

The example above is a partial record taken from Dublin Core. The fields of the record are mapped to CIDOC CRM classes by relating them through CIDOC CRM properties, and the instances in the record are then related to those classes. All the records in the Dublin Core set need to be mapped to the CIDOC CRM ontology in the same way. That completes the mapping for one database; the other database is mapped similarly, all these mappings are saved in an owl file, and we can then query that file to get the required results.


3.4. The CIDOC CRM Ontology

The CIDOC Conceptual Reference Model (CRM) provides definitions and a formal structure for describing the implicit and explicit concepts and relationships used in cultural heritage documentation. The CIDOC CRM is intended to promote a shared understanding of cultural heritage information by providing a common and extensible semantic framework that any cultural heritage information can be mapped to. It is intended to be a common language for domain experts and implementers to formulate requirements for information systems and to serve as a guide for good practice of conceptual modelling. In this way, it can provide the "semantic glue" needed to mediate between different sources of cultural heritage information, such as that published by museums, libraries and archives [7].

Objectives of the CIDOC CRM: The primary role of the CRM is to enable information exchange and integration between heterogeneous sources of cultural heritage information. It aims at providing the semantic definitions and clarifications needed to transform disparate, localised information sources into a coherent global resource, be it within a larger institution, in intranets or on the Internet [1].

A good understanding of the CIDOC CRM ontology is essential for mapping the ontologies efficiently and quickly when the mapping is done manually.

An example is shown below in Figure: reasoning about spatial information, a partial view of the CIDOC CRM ontology representing reasoning about spatial information. Five of the main hierarchies are included in this diagram: E39 Actor, E51 Contact Point, E41 Appellation, E53 Place, and E70 Thing. The arrows represent the relationships between the classes and their subclasses, and the green rectangles represent the classes and subclasses; properties are shown between the classes along the arrows. As can be seen, an instance of E53 Place is identified by an instance of E44 Place Appellation, which may be an instance of E45 Address, E47 Spatial Coordinates, E48 Place Name, or E46 Section Definition such as 'basement', 'prow', or 'lower left-hand corner' [1].

Figure: reasoning about spatial information [1]

Similarly, reasoning about temporal information is shown below in Figure: reasoning about temporal information. Four main hierarchies of the CIDOC CRM ontology are included in this diagram: E2 Temporal Entity, E52 Time-Span, E77 Persistent Item and E53 Place. Here E2 Temporal Entity is an abstract class, which means it has no instances of its own but serves to group the classes with a temporal component: E4 Period, E5 Event and E3 Condition State. The class E52 Time-Span is simply a temporal interval that makes no reference to cultural or geographical context. Instances of E52 Time-Span are sometimes identified by instances of E49 Time Appellation, often in the form of E50 Date. These examples show how reasoning with the classes of the CIDOC CRM ontology works.

Figure: reasoning about temporal information

Information about all the classes and properties, and a clear explanation of the relations between them, can be found in [1]. Here I explain only the class E22 Man-Made Object, because databases relating to archaeology generally contain subclasses of this class.

E22 Man-Made Object is a subclass of E19 Physical Object and E24 Physical Man-Made Thing, and a superclass of E84 Information Carrier. This class covers objects purposely created by human activity; no assumptions are made as to the extent of modification required to regard an object as man-made. Some example instances of this class are coins, loom weights and the Portland Vase. One of my two archaeological databases is about loom weights, which are made of clay and used for weaving cloth.

3.5. Protégé-owl API programmer's guide:

The Protégé-OWL API is used for programming against Protégé in Java, to query and traverse the OWL model.

Overview: The Protégé-OWL API is an open-source Java library for the Web Ontology Language (OWL) and RDF(S). The API provides classes and methods to load and save OWL files, to query and manipulate OWL data models, and to perform reasoning based on Description Logic engines. Furthermore, the API is optimized for the implementation of graphical user interfaces.

The API is designed to be used in two contexts:

For the development of components that are executed inside of the Protégé-OWL editor's user interface

For the development of stand-alone applications (e.g., Swing applications, Servlets, or Eclipse or any IDE plug-ins) [8].

The Protégé-OWL API programmer's guide provides many methods for loading ontologies, creating the Jena owl model, creating and editing classes, properties and instances, and saving an ontology to an owl file. In my project I need to load the ontologies and display them in a user interface; for that I create a Jena owl model, traverse it, and display all the entities in the user interface. Some of the methods I use in my project are shown below.

Using the code below, we can create the Jena owl model for an ontology from a URI.

String uri = "";

OWLModel owlModel = ProtegeOWL.createJenaOWLModelFromURI(uri);

If we want to load the ontology from a directory on our own system, we change the uri in the code above; then we can create the Jena owl model for that ontology.

String uri = "file:///c:/Work/Projects/travel.owl";

The predefined classes and methods allow us to program traversal of the ontologies. Some of these are createOWLNamedClass, getOWLNamedClass, OWLNamedClass, etc.

If we want to print all the classes and subclasses in the ontology, we can use the recursive method below:

printClassTree(personClass, "");

// Recursively prints a class and all of its subclasses,
// indenting each level of the hierarchy.
private static void printClassTree(RDFSClass cls, String indentation) {
    System.out.println(indentation + cls.getName());
    for (Iterator it = cls.getSubclasses(false).iterator(); it.hasNext();) {
        RDFSClass subclass = (RDFSClass) it.next();
        printClassTree(subclass, indentation + "    ");
    }
}



4. Technical Specifications:

4.1. Project Tools:

This section describes the tools and the Integrated Development Environment (IDE) used in this project. With their help it is easier and quicker to develop the code efficiently.

4.1.1. NetBeans: Using an Integrated Development Environment (IDE) for developing applications saves time by managing windows, settings, and data. The main benefits of using an IDE are that designing the program is easy even for users who are new to programming, and that it flags errors as they occur and provides features such as quick fixes. The main advantage of NetBeans here is its drag-and-drop support for creating the user interface and editing components. It also saves a lot of time by suggesting predefined methods. In addition, an IDE can store repetitive tasks through macros and abbreviations.

The NetBeans IDE is open source and is written in the Java programming language. It provides the services common to creating desktop applications -- such as window and menu management, settings storage -- and is also the first IDE to fully support JDK 6.0 features. The NetBeans platform and IDE are free for commercial and non-commercial use, and they are supported by Sun Microsystems [14].

4.1.2. Protégé 3.4.3: Protégé is a free, open source ontology editor and knowledge-base framework. It is a platform for modelling ontologies, which can be exported to formats including RDF(S), OWL, and XML Schema. Protégé is based on Java, is extensible, and provides a plug-and-play environment that makes it a flexible base for rapid prototyping and application development [5].

Protégé is used as a background framework for this project; we need to import some of its jar files into NetBeans to set up the environment. We can also view the ontologies in Protégé to understand the classes, properties and the relations between them, so that we can map them efficiently. Protégé must be installed on the machine to support some of the methods used in this project, so it is worth learning Protégé for a better knowledge of ontologies. After all the mappings have been saved in an owl file, we can open that file in Protégé to see how the ontology is mapped.

4.2. Programming Technologies:

4.2.1. Programming Language - Java 6.0: Java is a simple yet powerful object-oriented programming language, and it is platform independent. I develop the code in Java because of these features and because my code needs many predefined classes and methods for traversing the ontologies, all of which Java provides by importing the relevant packages.

Swing is the most useful part of Java for my code: good knowledge of Swing is essential for creating a good, easy-to-use user interface. I mainly use panels, buttons, radio buttons and option panes in my user interface.

4.2.2. Sparql Query Language: SPARQL is a query language and a protocol for accessing RDF designed by the W3C RDF Data Access Working Group. As a query language, SPARQL is "data-oriented" in that it only queries the information held in the models; there is no inference in the query language itself [6]. After performing all the mappings, the user can pose a query over them using the SPARQL query language, or through a Java program using the methods in the Protégé-OWL API programmer's guide. In this project it is more convenient to query from Java than via SPARQL, because by the time we query the ontology we have already built an owl model and can query it directly.
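As an illustration, a SPARQL query over the mapped owl file might look like the following. The crm: namespace URI here is a placeholder, not the official CIDOC CRM namespace, and the class name is just an example:

```sparql
# Illustrative query: find all local classes declared equivalent
# to the CIDOC CRM class E22 Man-Made Object in the mapped model.
PREFIX owl: <http://www.w3.org/2002/07/owl#>
PREFIX crm: <http://example.org/cidoc-crm#>

SELECT ?localClass
WHERE {
  ?localClass owl:equivalentClass crm:E22_Man-Made_Object .
}
```

A similar query with rdfs:subClassOf in place of owl:equivalentClass would retrieve the subclass mappings.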

4.2.3. Owl 2.0: OWL is a language designed to provide a common way to process the content of web information. It provides richer expressivity than RDF.

5. Design Overview

This section gives an overview of the design of the tool. In this project I need to remove the database heterogeneity of two archaeology databases by using the CIDOC CRM ontology, but the data format of the two databases differs from that of the CIDOC CRM ontology: both come as RDF schemas, so I first generate an ontology from each schema, calling them ontology 1 and ontology 2. I then map these two ontologies to the CIDOC CRM ontology: first ontology 1 is mapped and the mappings are saved in an owl file, then ontology 2 is mapped and its mappings are saved in the same file. The user can query this owl file using the methods in the Protégé-OWL API programmer's guide, which returns results from both databases accordingly.

The main parts of the design are the user interface of the tool, displaying the three ontologies, performing the mapping between the entities of the three ontologies, and saving all these mappings in an owl file.








Figure: User Interface for my Tool

5.1 User Interface:

The user interface must be easy to use, able to display the ontologies in tree form, and able to support interactive mapping and saving to an owl file. The user interface of my tool is shown above in Figure: User Interface for my Tool.

My tool has three main tabs: (1) Class Mapping, (2) Property Mapping and (3) Instance Mapping; the numbers are indicated in the diagram. The Class Mapping tab displays all the classes of the three ontologies in tree form. Similarly, the Property Mapping tab displays the properties of the selected classes, and the Instance Mapping tab displays the instances.

The pane indicated by (4) holds the buttons providing the tool's functions, including loading the ontologies, saving the mappings in an owl file, and exiting the tool. Pane (5), labelled Target Ontology1 Classes, displays all the classes of that ontology; pane (6), labelled CIDOC CRM Ontology Classes, loads all the classes of the CIDOC CRM ontology; and pane (7), labelled Target Ontology2 Classes, displays all the classes of the second target ontology. All the ontologies should be displayed in tree form, but at the moment I can only display them in their respective panes as flat lists; I am working on this now.

5.2. Loading the Ontologies:

Having seen the user interface in the section above, we now look at how the tool loads the ontologies; see the figure below, Figure: Loading Ontologies.

In this figure, the Load CIDOC CRM button (1) opens a file chooser (2), from which we select the file location on the system. Once the file is selected, the tool loads that ontology's classes into the CIDOC CRM Ontology Classes pane. The remaining two ontologies are loaded and displayed in their respective panes in the same way, as shown below in Figure: Classes of all three ontologies.



Figure: Loading Ontologies.



Figure: Classes of all three ontologies.

5.3. Mapping between the ontologies:

This process is done interactively through the tool, as shown in the figure: Mapping of Ontologies. When we select one class from the CIDOC CRM ontology and one class from target ontology 1 and click the Submit button (1), an option pane (2) pops up asking us to select the relation between the two selected classes, offering Equivalent class and Subclass. After we select one of these options, the two classes and the relation between them are saved to an array list. Clicking the Submit button also displays the properties and instances related to these two classes in the Property Mapping tab and the Instance Mapping tab respectively; we then map these properties and instances and save them to the array list as well. The CIDOC CRM ontology classes and the Target Ontology 2 classes are mapped in the same way. At the moment I have finished the mapping of classes; the mapping of properties and instances remains to be done after this report.
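The per-decision bookkeeping described above could be held in a small helper class before saving. The sketch below is illustrative only; the class and field names are my own, not the exact ones in the tool's code:

```java
import java.util.ArrayList;
import java.util.List;

// Sketch of how each interactive mapping decision could be kept
// in memory before the Done Mapping step. Names are illustrative.
public class MappingStore {

    // One mapping: a source entity, a CIDOC CRM entity, and the
    // relation chosen in the option pane.
    public static class Mapping {
        public final String sourceEntity;
        public final String crmEntity;
        public final String relation; // "equivalentClass" or "subClassOf"

        public Mapping(String sourceEntity, String crmEntity, String relation) {
            this.sourceEntity = sourceEntity;
            this.crmEntity = crmEntity;
            this.relation = relation;
        }
    }

    private final List<Mapping> mappings = new ArrayList<>();

    public void add(String sourceEntity, String crmEntity, String relation) {
        mappings.add(new Mapping(sourceEntity, crmEntity, relation));
    }

    public int size() {
        return mappings.size();
    }

    public static void main(String[] args) {
        MappingStore store = new MappingStore();
        store.add("LoomWeight", "E22_Man-Made_Object", "subClassOf");
        System.out.println(store.size()); // 1
    }
}
```

Keeping the relation as an explicit field makes it straightforward to turn each entry into the corresponding OWL axiom when the mappings are saved.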

5.4. Saving all the mappings:

After performing all the mappings between all the entities, we press the Done Mapping button and all the mappings are saved to an owl file. I have not yet finished this function of the tool, but I will complete it after this report. If we want to exit the tool at any time, we can click the Exit button (1); an option pane (2) then pops up asking "Do you want to quit this tool?" with options Yes and No. Clicking Yes closes the tool; clicking No does nothing, and we can simply continue working. This is shown below in the figure: Exit from tool.
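One way the Done Mapping step could serialise an equivalence mapping is to write the axiom as RDF/XML text by hand, as in the sketch below. This is my own illustration with made-up URIs; a real implementation would more likely create the axioms through the Protégé-OWL model and save the whole model instead:

```java
// Sketch: hand-building one owl:equivalentClass axiom as RDF/XML
// text. The URIs are placeholders for illustration only.
public class OwlAxiomWriter {

    public static String equivalentClassAxiom(String localClassUri, String crmClassUri) {
        return "<owl:Class rdf:about=\"" + localClassUri + "\">\n"
             + "  <owl:equivalentClass rdf:resource=\"" + crmClassUri + "\"/>\n"
             + "</owl:Class>";
    }

    public static void main(String[] args) {
        System.out.println(equivalentClassAxiom(
            "http://example.org/ont1#Site",
            "http://example.org/crm#E27_Site"));
    }
}
```

Subclass mappings would be written the same way with an rdfs:subClassOf element in place of owl:equivalentClass.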

6. Implementation:

This section gives an idea of how the tool is implemented, in five parts: configuring NetBeans, implementing the user interface, loading the ontologies, performing the mapping, and saving all the mappings in an owl file.

6.1 Configuring the NetBeans:

For programming with ontologies we need to add all the jar files in edu.stanford.smi.protegex.owl, as well as protege.jar; then NetBeans is ready for ontology-related programming. For this, Protégé must first be installed on the machine, after which the required jar files can be added to NetBeans.

6.2. Graphical User Interface:

The user interface has three main tabs, as described in the design section. All the components required for the tool can be dragged and dropped using NetBeans. The panels that display the ontologies are scroll panes, so when the list of classes exceeds the size of the pane we can scroll through the ontology. This section is not described in full detail because the user interface is built with the NetBeans GUI designer.

6.3. Loading Ontologies:

An ActionListener is attached to the Load CIDOC CRM button. Its method contains the code for selecting the file, along the following lines:

String wd = System.getProperty("user.dir");

JFileChooser fc = new JFileChooser(wd);

int rc = fc.showDialog(null, "Select Data File");

if (rc == JFileChooser.APPROVE_OPTION) {

    File file = fc.getSelectedFile();

    String filename = file.getAbsolutePath();

    // ... pass filename on to LoadOntology.LoadOnt()
}

The file path of the OWL file is then passed to the LoadOntology.LoadOnt() method. Using this path, the method creates a Jena OWL model, queries it to traverse the ontology, stores all the classes and subclasses in an ArrayList, and returns that list to the caller:

OWLModel owlModel = ProtegeOWL.createJenaOWLModelFromURI(newText);

Collection classes = owlModel.getUserDefinedOWLNamedClasses();

for (Iterator it = classes.iterator(); it.hasNext();) {

    RDFSNamedClass cls = (RDFSNamedClass) it.next();

    strArray.add(cls.getBrowserText());  // collect the class name

}

return strArray;

All these classes are displayed in their respective panes as radio buttons, one per class. The radio buttons are grouped so that only one class can be selected at a time. The other ontologies are loaded into their panes in the same way. Ideally the classes would be displayed in tree form, but this has not yet been achieved.
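The grouped radio buttons can be sketched as follows. This is a minimal illustration of Swing's ButtonGroup behaviour, not the tool's actual layout code; the class name ClassButtonPanelSketch is an assumption.

```java
import javax.swing.ButtonGroup;
import javax.swing.JRadioButton;
import java.util.List;

public class ClassButtonPanelSketch {
    // Build one radio button per class name and add them all to a single
    // ButtonGroup, which guarantees at most one button is selected at a time.
    public static ButtonGroup buildGroup(List<String> classNames,
                                         List<JRadioButton> out) {
        ButtonGroup group = new ButtonGroup();
        for (String name : classNames) {
            JRadioButton button = new JRadioButton(name);
            group.add(button);
            out.add(button);
        }
        return group;
    }
}
```

In the tool the returned buttons would then be added to a JPanel placed inside a JScrollPane, so long class lists remain scrollable.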

6.4. Mapping Algorithm:

After the classes are displayed, we select one class from each ontology and click the corresponding Submit button. As explained in the design overview section, this displays the properties and instances of those classes in their respective tabs and, at the same time, opens an option pane for selecting the relation between the two classes. Once a relation is chosen, the mapping is saved in an ArrayList. The code for displaying the option pane is:

Object[] options = {"Equivalent Class", "SubClass"};

int x = JOptionPane.showOptionDialog(jPanel1,
    "Select the Relation between " + SelectedRadioButton2
    + " and " + SelectedRadioButton1,
    "Setting Up Relation window", JOptionPane.DEFAULT_OPTION,
    JOptionPane.QUESTION_MESSAGE, null, options, null);

A line should then be drawn connecting the two classes. The properties and instances are mapped and saved in ArrayLists in the same way. The connecting line is not yet implemented, and the mapping of properties and instances is also unfinished; both will be done later.

6.5. Saving the Mappings:

After mapping all the entities, the mappings held in the ArrayLists need to be written to an OWL file. This is also not yet implemented and will be done after this report.
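As a sketch of what the saved file could contain, the code below writes owl:equivalentClass axioms as RDF/XML by hand (only the equivalence case is shown; subclass mappings would use rdfs:subClassOf in the same way). A real implementation would build these axioms through the Jena or Protégé-OWL API rather than string concatenation; the class name and URIs here are illustrative assumptions.

```java
import java.util.Map;

public class MappingWriterSketch {
    // Serialize class mappings (CIDOC class URI -> target class URI) as
    // owl:equivalentClass axioms in hand-written RDF/XML. A real tool
    // would create these axioms via the Jena or Protege-OWL API.
    public static String toRdfXml(Map<String, String> equivalences) {
        StringBuilder sb = new StringBuilder();
        sb.append("<rdf:RDF xmlns:rdf=\"http://www.w3.org/1999/02/22-rdf-syntax-ns#\"\n");
        sb.append("         xmlns:owl=\"http://www.w3.org/2002/07/owl#\">\n");
        for (Map.Entry<String, String> e : equivalences.entrySet()) {
            sb.append("  <owl:Class rdf:about=\"").append(e.getKey()).append("\">\n");
            sb.append("    <owl:equivalentClass rdf:resource=\"")
              .append(e.getValue()).append("\"/>\n");
            sb.append("  </owl:Class>\n");
        }
        sb.append("</rdf:RDF>\n");
        return sb.toString();
    }
}
```

The resulting string would then be written to the mapping .owl file with an ordinary FileWriter.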

7. Limitations and Further Research:

This section covers the limitations of the tool, the objectives that were not achieved or were modified, and directions for further research.

The tool still has several limitations. The main one in the current version is that it cannot load an OWL file if the ontology contains equivalent classes. Another is that the properties and instances of the selected classes must be mapped before the classes themselves: once the Submit button is clicked for a class mapping, the target ontology class is disabled, so there is no way to return to a previous mapping to edit it or to map its properties and instances. The user must therefore know how to use the tool properly; otherwise the required OWL file is not produced and exceptions occur, because the tool shows no warning messages when mappings are performed in the wrong order. The current version cannot handle many exceptions, but in future the tool can be made more user friendly and handle exceptions more robustly.

One objective that was not achieved is querying the mapping OWL file. The original plan was to add a query panel to the user interface so that when the user queries the mapping OWL file the results are returned. This was not achieved because designing the user interface took a long time at the start of the implementation phase. One objective was modified: when two entities are mapped, a line was supposed to be drawn between them, but this was not possible due to user interface limitations. Instead of a line, the target ontology class is disabled, which indicates that the mapping for that entity is done.

This project can be extended by adding mapping algorithms to support automatic or semi-automatic mapping. In the current version the mapping is done interactively, which is time-consuming and expensive because human intervention is needed. No fully automatic mapping tool has yet been implemented, but semi-automatic tools exist: they map some of the entities automatically and suggest candidates for the remaining mappings, which the user then checks and completes. Another extension is a query panel in the user interface, with an algorithm for querying the OWL file based on the Protégé-OWL API programmer's guide. The tool can also be improved in performance and run-time complexity by developing more efficient algorithms, and it should handle exceptions properly. Finally, the file chooser currently lists all files when loading an OWL file; restricting the selection to .owl files would be an improvement.
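The .owl restriction mentioned above can be done with Swing's built-in extension filter. A minimal sketch, assuming the chooser is created as in the loading code (the class name OwlFileChooserSketch is illustrative):

```java
import javax.swing.JFileChooser;
import javax.swing.filechooser.FileNameExtensionFilter;

public class OwlFileChooserSketch {
    // Restrict the chooser so only .owl files (and directories) are listed.
    public static JFileChooser owlOnlyChooser(String workingDir) {
        JFileChooser fc = new JFileChooser(workingDir);
        fc.setFileFilter(new FileNameExtensionFilter("OWL ontologies", "owl"));
        fc.setAcceptAllFileFilterUsed(false); // hide the "All Files" option
        return fc;
    }
}
```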

8. Conclusion:

In this project I have presented a method to resolve database heterogeneity by mapping two archaeology ontologies to the CIDOC CRM ontology. The tool can load all three ontologies and supports mapping between all their entities, i.e. classes, properties, and instances. All the mappings can be saved to an OWL file. The mapping is done interactively with human intervention, so the design cost is lower than that of automatic or semi-automatic mapping tools, but the maintenance cost is higher because every mapping requires human work. Although this takes a lot of time, the benefit is that no entities are mismatched. Interactive mapping works reasonably well, but automatic or semi-automatic mapping is preferable, and much research remains to be done to produce an efficient tool.

9. Project Work Plan:

The project work plan has changed slightly from the initial plan. I am developing a prototype using two small ontologies and will work with the actual databases later.







Task                 | Description                                                                       | Dates
---------------------|-----------------------------------------------------------------------------------|-------------------------------------
Start of the project | Learn about the project and the basic concepts                                    | 22nd Feb - 26th Feb
Preliminary report   | Write the report, plan the project, and design the work plan in Microsoft Project | 1st Mar - 5th Mar
Milestone 1          | Study mapping techniques and matching systems                                     | 7th Mar - 15th Mar
Ontology mapping     | Work with Eclipse and map the databases                                           | 16th Mar - 2nd Apr
Interim Report       | Write the interim report and study the QOM algorithm                              | 5th Apr - 9th Apr
Milestone 2          | Develop the QOM algorithm                                                         | 12th Apr - 16th Apr
Developing Algorithm | Work with the actual databases                                                    | 19th Apr - 30th Apr
Dissertation Draft   | Write the dissertation draft                                                      | 3rd May - 7th May
Final dissertation   | Make final conclusions and results; write the final dissertation                  | 10th May - 21st May
Presentation         | Write the PowerPoint presentation and prepare for the interview                   | Depends on the day of the interview