Formalization of UML Class Diagram


Abstract-The Unified Modeling Language (UML) is a standard object-oriented modeling notation that is widely accepted and used in the software development industry. However, most of the UML notation is defined informally, through natural language descriptions (English) and the Object Constraint Language (OCL), which makes UML models error-prone and difficult to analyze formally. In this paper, we present preliminary results of an approach that formally defines the UML class diagram using a logic-based representation formalism. We show how to define the UML class diagram using Description Logics (DLs).

Keywords-formalization; UML class diagram; logic-based; description logic

Introduction and Motivation

The emergence and development of new methodologies have had a positive impact on the software engineering community, as attempts to reduce the complexity and diversity of the software development process. Currently, the object-oriented (OO) approach is popular for software specification because its graphical presentation makes it relatively easy to use and understand. However, most OO approaches are informally defined, which brings restrictions: they do not support rigorous analysis, and their notation is limited in expressiveness. One example of an OO language is UML, a standard OO modeling notation that is widely accepted and used in industrial applications.

UML graphically covers requirements, analysis and design, up to models used as a programming specification. The UML specification is defined using a metamodeling approach. Normally, a language definition consists of an abstract syntax, a concrete syntax and semantics. The abstract syntax of UML can be described by UML class diagrams, while the concrete syntax and semantics of UML notations define a model's meaning through a combination of natural language (English) and the Object Constraint Language (OCL). These languages are not sufficient to state the semantics of UML notation, do not support the analysis of UML models, and must always be combined with other languages to state complex constraints [1]. In addition, most of the UML notation is informally defined, which makes it error-prone and difficult to analyze formally. The need to provide precise semantics, here for the UML class diagram, motivates this work.

This paper presents preliminary results of an approach to formally define the UML class diagram. Class diagrams are a well-established diagram type in UML and play an important role in the analysis and design of complex systems. A UML class diagram specifies the static structure of the software system under study. For a UML class diagram to be analyzed, a formal and sound foundation is required. We explore a logic-based representation formalism, namely Description Logics (DLs) [2], to carry out the formalization, since DLs can support the specification phase of software development. According to the literature, DLs are well suited to expressing design models of a declarative nature. This paper focuses on how to define the UML class diagram using DLs.

The rest of this paper is structured as follows. Section 2 discusses related work. Section 3 presents a brief review of Description Logics. Section 4 describes the proposed approach, including the basic conceptual framework for formalization and the resulting formalization of the UML class diagram. Finally, Section 5 presents the conclusion and future work.

Related Work

UML is the de facto standard for OO modeling [3-7] in industrial software development. UML is based on a graphical notation whose visual syntax and expressiveness make it easy to understand. However, it lacks precisely defined semantics, and it is hard to ensure that the final design is consistent, unambiguous and complete. Some work has been done to remedy this problem. For example, the Object Constraint Language (OCL) was introduced to reduce these problems [4, 8, 9], but OCL alone is not enough: OCL constraints cannot be executed and are difficult to check, which can cause problems in the development and maintenance phases [8]. In addition, although UML's syntax can be described using OCL, this does not give a complete and precise specification of its semantics, which can lead to confusion and complications when analyzing a model.

Researchers currently study formalization as a way to obtain higher expressiveness for UML. According to Zhihong and Mingtian [10], there are two objectives for formalizing UML constructs. First, a precise formalization can be used to explain the semantics of UML formally, which gives its users a shared understanding of UML. Second, formalization enables automatic reasoning procedures that help in checking UML models. To reach the first objective, many formalization techniques have been proposed. Formal Methods (FM) may be used to formalize UML's syntax, its semantics, or both. FM is an appropriate technique and a precise way to produce software.

The FMs that have been applied to this problem include Z notation [1, 11-13], B notation [14, 15], VDM++ [9], PVS [3, 16], FOL [17] and DLs [10, 18, 19]. However, most of these works do not provide a sufficient formal basis for the abstract syntax and semantics of the notation used. The literature reports that FM is difficult to understand and expensive to use [14, 15, 20], but integrating FM with OO can reduce this difficulty. OO modeling helps reduce the semantic gap between a given problem domain and the evolving structural models. Unfortunately, existing tools and current work are still limited and ineffective [9]. The UML diagram types that have been formalized include class, interaction, state chart, use case, object, collaboration and sequence diagrams. This work considers only one kind of structure diagram, the class diagram, and uses DLs to formalize it.

The UML class diagram was chosen for the following reasons:

1) Class Diagrams (CDs) play a prominent role in the analysis and design of complex systems. During analysis, classes refer to the people, places, events and things about which the system will capture information. Implementation-specific artifacts such as windows, forms and other objects used to build the system are addressed later, during design and implementation.

2) CDs are the most important and well-established structure diagram. They describe the information in the domain of interest in terms of objects organized into classes and the relationships between them. A CD is also a formal way of representing the objects that are used and created by business roles.

3) The CD is a core modeling element that is used in behavioral diagrams, and the abstract syntax of UML may be explained by CDs.

DLs were chosen for the following reasons:

1) DLs are suited to representing knowledge about the static structure of a software application.

2) DLs are based on formal semantics, i.e. descriptive semantics, which are well studied and understood.

3) Challenges remain in the formalization of the UML class diagram, since many of its elements have not yet been defined in DLs.

4) The DL community is well organized, and the field is still in its infancy.

A Brief Review of the Description Logics

As a family of logic-based knowledge representation formalisms, Description Logics (DLs) [2] were designed to represent and reason about the knowledge of an application domain in a structured and well-understood way. DLs represent the structural knowledge of an application domain through a knowledge base consisting of a terminology and a world description. The basic building blocks of DLs are concepts (unary predicates), roles (binary predicates) and individuals (constants) [2].

The basic formalism of a DL system consists of three components [21]:

1) Constructors, which are used to build concepts and roles,

2) A knowledge base (KB), which consists of the TBox (terminology) and the ABox (world description). The TBox introduces the vocabulary of an application domain, while the ABox contains assertions about named individuals in terms of this vocabulary,

3) Inferences, which are the reasoning mechanisms over the TBox and ABox.
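The three components listed above can be illustrated with a minimal sketch. It assumes a deliberately tiny fragment in which the TBox contains only inclusions between atomic concepts and the ABox contains only concept assertions; the class name `KnowledgeBase` and the concept names used below are illustrative, not part of any DL system.

```python
from dataclasses import dataclass, field


@dataclass
class KnowledgeBase:
    """Toy KB: TBox of atomic inclusions, ABox of concept assertions."""
    tbox: set = field(default_factory=set)  # pairs (sub, sup), read as sub ⊑ sup
    abox: set = field(default_factory=set)  # pairs (individual, concept)

    def subsumers(self, concept):
        """Concepts reachable from `concept` by following TBox inclusions."""
        seen, stack = {concept}, [concept]
        while stack:
            current = stack.pop()
            for sub, sup in self.tbox:
                if sub == current and sup not in seen:
                    seen.add(sup)
                    stack.append(sup)
        return seen

    def is_instance(self, individual, concept):
        """Instance checking: is concept(individual) entailed by the KB?"""
        return any(
            ind == individual and concept in self.subsumers(asserted)
            for ind, asserted in self.abox
        )


kb = KnowledgeBase()
kb.tbox.add(("Mother", "Female"))   # terminology
kb.tbox.add(("Female", "Human"))
kb.abox.add(("MARY", "Mother"))     # world description
print(kb.is_instance("MARY", "Human"))
```

The inference here is only the transitive closure of atomic inclusions; real DL reasoners handle the full set of constructors with tableau-style or other dedicated algorithms.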

Notational Conventions

The basic formalism of DLs rests on three components: a formalism for describing concepts (the description language), the terminological (TBox) and assertional (ABox) formalisms, and the reasoning services. The elementary descriptions of a description language are atomic concepts and atomic roles. Concept constructors are applied to these to build complex descriptions. In abstract notation, the letters A and B denote atomic concepts, C and D concept descriptions, R and S roles, and f and g functional roles (features, attributes).

In DLs, concept names start with an uppercase letter followed by lowercase letters (e.g., Human, Male), role names (including functional ones) start with a lowercase letter (e.g., hasChild, marriedTo), and individual names are all uppercase (e.g., CHARLES, MARY). One example of a description language is $\mathcal{AL}$, and other languages are extensions of it. The concept descriptions of $\mathcal{AL}$ are formed according to the following syntax rules:

$C, D \rightarrow A$ ; Atomic concept

$\mid\ \top$ ; Universal concept

$\mid\ \bot$ ; Bottom concept

$\mid\ \neg A$ ; Atomic negation

$\mid\ C \sqcap D$ ; Intersection

$\mid\ \forall R.C$ ; Value restriction

$\mid\ \exists R.\top$ ; Limited existential quantification

For example, suppose that $\mathit{Female}$ and $\mathit{Mother}$ are atomic concepts and $\mathit{hasChild}$ is an atomic role. Using the universal concept $\top$, we can form the concepts $\mathit{Female} \sqcap \exists \mathit{hasChild}.\top$ and $\mathit{Mother} \sqcap \forall \mathit{hasChild}.\mathit{Female}$. These concepts describe a female who has a child and a mother all of whose children are female. Using the bottom concept $\bot$, we can describe a mother without a child by $\mathit{Mother} \sqcap \forall \mathit{hasChild}.\bot$.

Besides the description language, the terminology is also part of the basic formalism of DLs. In the most general case, terminological axioms have the form $C \sqsubseteq D$ ($R \sqsubseteq S$), called inclusions, or $C \equiv D$ ($R \equiv S$), called equalities, where C, D are concepts (R, S are roles). The semantics of axioms is defined as follows: an interpretation $\mathcal{I}$ satisfies an inclusion $C \sqsubseteq D$ if $C^{\mathcal{I}} \subseteq D^{\mathcal{I}}$, and it satisfies an equality $C \equiv D$ if $C^{\mathcal{I}} = D^{\mathcal{I}}$. DL systems are able to perform specific kinds of reasoning, and the different kinds of reasoning they perform are defined as logical inferences.
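To make the semantics of concept descriptions and inclusions concrete, the following sketch evaluates the basic concept constructors (atomic concept, universal and bottom concept, atomic negation, intersection, value restriction and limited existential quantification) over a small finite interpretation. The tuple encoding of concepts, the function names, and the individuals `mary`, `anna`, `eve` and `tom` are assumptions made for illustration only.

```python
def extension(concept, domain, concepts, roles):
    """Return the set of domain elements that belong to `concept`."""
    kind = concept[0]
    if kind == "atom":
        return set(concepts.get(concept[1], set()))
    if kind == "top":
        return set(domain)
    if kind == "bottom":
        return set()
    if kind == "not":  # in this fragment, applied to atomic concepts only
        return set(domain) - extension(concept[1], domain, concepts, roles)
    if kind == "and":
        left = extension(concept[1], domain, concepts, roles)
        right = extension(concept[2], domain, concepts, roles)
        return left & right
    if kind == "all":  # value restriction: every R-successor is in the filler
        role = roles.get(concept[1], set())
        filler = extension(concept[2], domain, concepts, roles)
        return {a for a in domain
                if all(b in filler for (x, b) in role if x == a)}
    if kind == "some":  # limited existential: at least one R-successor
        return {a for (a, _b) in roles.get(concept[1], set())}
    raise ValueError(f"unknown constructor: {kind}")


def satisfies_inclusion(c, d, domain, concepts, roles):
    """An interpretation satisfies C ⊑ D if ext(C) is a subset of ext(D)."""
    return extension(c, domain, concepts, roles) <= extension(d, domain, concepts, roles)


# A hypothetical finite interpretation.
domain = {"mary", "anna", "eve", "tom"}
concepts = {"Female": {"mary", "anna", "eve"}, "Mother": {"mary"}}
roles = {"hasChild": {("mary", "anna")}}

female_with_child = ("and", ("atom", "Female"), ("some", "hasChild"))
mother_of_daughters = ("and", ("atom", "Mother"),
                       ("all", "hasChild", ("atom", "Female")))
childless_mother = ("and", ("atom", "Mother"),
                    ("all", "hasChild", ("bottom",)))

print(extension(female_with_child, domain, concepts, roles))
print(satisfies_inclusion(("atom", "Mother"), ("atom", "Female"),
                          domain, concepts, roles))
```

Checking an inclusion in one interpretation, as done here, is model checking; deciding whether an inclusion holds in every interpretation is the subsumption problem that DL reasoners solve.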

Concept, Roles and Knowledge Base

In this section, we briefly describe the syntax and semantics of DLs. Different combinations of constructors generate languages with different expressiveness. Concept descriptions are formed according to the following syntax:


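Table 1. Some Syntax of Concept Constructor

Constructor | Concrete Syntax | Abstract Syntax
Conjunction | (and C1…Cn) | $C_1 \sqcap \dots \sqcap C_n$
Disjunction | (or C1…Cn) | $C_1 \sqcup \dots \sqcup C_n$
Negation | (not C) | $\neg C$
Value restriction | (all R C) | $\forall R.C$
Limited existential quantification | (some R) | $\exists R.\top$
Existential quantification | (some R C) | $\exists R.C$
At-least number restriction | (at-least n R) | $\geq n\,R$
At-most number restriction | (at-most n R) | $\leq n\,R$
Exact number restriction | (exactly n R) | $= n\,R$
Qualified at-least restriction | (at-least n R C) | $\geq n\,R.C$
Qualified at-most restriction | (at-most n R C) | $\leq n\,R.C$
Qualified exact restriction | (exactly n R C) | $= n\,R.C$
Same-as, agreement | (same-as u1 u2) | $u_1 \doteq u_2$
Role-value-map | (subset R1 R2) | $R_1 \subseteq R_2$
Role fillers | (fillers R I1…In) | $\exists R.\{I_1\} \sqcap \dots \sqcap \exists R.\{I_n\}$
One-of | (one-of I1…In) | $\{I_1, \dots, I_n\}$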

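Table 2. Some DLs Concept and Role Constructor

Constructor | Syntax | Semantics
Value restriction | $\forall R.C$ | $\{a \mid \forall b.\ (a, b) \in R^{\mathcal{I}} \rightarrow b \in C^{\mathcal{I}}\}$
Existential quant. | $\exists R.C$ | $\{a \mid \exists b.\ (a, b) \in R^{\mathcal{I}} \land b \in C^{\mathcal{I}}\}$
Unqualified number restriction | $\geq n\,R$, $\leq n\,R$ | $\{a \mid \lvert\{b \mid (a, b) \in R^{\mathcal{I}}\}\rvert \geq n\}$, and likewise with $\leq n$
Qualified number restriction | $\geq n\,R.C$, $\leq n\,R.C$ | $\{a \mid \lvert\{b \mid (a, b) \in R^{\mathcal{I}} \land b \in C^{\mathcal{I}}\}\rvert \geq n\}$, and likewise with $\leq n$
Agreement and disagreement | $u_1 \doteq u_2$, $u_1 \neq u_2$ | $\{a \mid \exists b.\ u_1^{\mathcal{I}}(a) = b = u_2^{\mathcal{I}}(a)\}$, $\{a \mid \exists b_1, b_2.\ u_1^{\mathcal{I}}(a) = b_1 \neq b_2 = u_2^{\mathcal{I}}(a)\}$
Role name | $P$ | $P^{\mathcal{I}} \subseteq \Delta^{\mathcal{I}} \times \Delta^{\mathcal{I}}$
Inverse role | $R^{-}$ | $\{(b, a) \mid (a, b) \in R^{\mathcal{I}}\}$
Role Conj. | $R \sqcap S$ | $R^{\mathcal{I}} \cap S^{\mathcal{I}}$
Role Hierarchy | $R \sqsubseteq S$ | $R^{\mathcal{I}} \subseteq S^{\mathcal{I}}$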
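As a small illustration of the number restrictions and the inverse role constructor, the sketch below computes their extensions over a finite role given as a set of pairs. The function names, the role `enrolls` and the individuals are hypothetical and chosen only for this example.

```python
def role_fillers(role, a, filler=None):
    """Successors of `a` under `role`, optionally limited to a filler concept."""
    found = {b for (x, b) in role if x == a}
    return found if filler is None else found & filler


def at_least(n, role, domain, filler=None):
    """Extension of (>= n R), or of (>= n R.C) when `filler` is given."""
    return {a for a in domain if len(role_fillers(role, a, filler)) >= n}


def at_most(n, role, domain, filler=None):
    """Extension of (<= n R), or of (<= n R.C) when `filler` is given."""
    return {a for a in domain if len(role_fillers(role, a, filler)) <= n}


def inverse(role):
    """Inverse role: swap the two components of every pair."""
    return {(b, a) for (a, b) in role}


# Hypothetical interpretation: two students and two courses.
domain = {"s1", "s2", "c1", "c2"}
enrolls = {("s1", "c1"), ("s1", "c2"), ("s2", "c1")}

print(at_least(2, enrolls, domain))           # individuals with two or more fillers
print(at_least(2, inverse(enrolls), domain))  # same restriction on the inverse role
```

An exact number restriction can be obtained as the intersection of the corresponding at-least and at-most extensions.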


Proposed Approach

This section presents the proposed approach. Section A presents the formalization framework, which is the foundation for how the formalization is carried out. Section B gives a brief introduction to the UML class diagram. Section C presents the elements of the UML class diagram and their formalization. Finally, the application of the proposed approach is illustrated by a simple example.

The Formalization Framework

The proposed approach consists of a conceptual framework and its supporting theoretical foundation. Generally speaking, a framework is a real or conceptual structure intended to serve as a support or guide for building something that expands the structure into something useful.

Figure 1. A Conceptual Framework for Formalization

Figure 1 shows the conceptual framework that serves as the foundation of this work for formalizing the UML class diagram using one family of logic-based approaches, namely Description Logics (DLs). Through formalization, DLs formally capture the syntax and semantics (see Tables 1 and 2) of the elements of the UML class diagram, with sound and complete algorithms. The formalization result helps to establish a shared, consistent understanding of the elements of the UML class diagram.

UML Class Diagram

This work concentrates on the UML class diagram from the conceptual perspective. The formalization is based on a formal approach that takes advantage of both a conceptual modeling diagram (UML) and DLs. Using logic for formalization is motivated by the following reasons: it gives precise semantics, it enables formal verification, and it offers virtually unlimited expressiveness.

A Class Diagram (CD) is a static diagram, or model, that shows the classes and the relationships among classes that remain constant in the system over time. A CD depicts classes with their common features, both structural (i.e. attributes and association ends) and behavioral (i.e. operations), together with the relationships between classes. The following sections present the elements of the UML class diagram shown in Figure 2.

Figure 2. UML Class Diagram metamodel

Element of UML Class Diagram Formalization

A UML class diagram graphically depicts the structural aspect of a system model: it shows what conceptual 'things' exist in a system and what relationships exist among them. This section presents the formalization of the structural syntax of the Class Diagram (CD):

Classes in a UML model represent sets of objects with common features (see Figure 2). The objects that belong to a class are called instances of the class, and together they form the instantiation, or extension, of the class. Properties in UML 2.0 come in two distinct forms: attributes and association ends. A property owned by a class is an attribute, whereas a property owned by an association is an association end.

A class is the main building block of a UML class diagram and is used to store and manage information. Each class is depicted graphically with three compartments: the class name at the top, the attributes in the middle, and the operations at the bottom (see Figure 3). A UML class is represented by an atomic concept. As described above, a class is composed of a set of attributes and a set of operations, whose components are denoted as follows:

$C$ is the class name; the class name has to be unique in the whole diagram.

$a_i$ is the name of an attribute of the class, for i = 1..n.

$T_j$ is the type of an attribute of the class, for j = 1..n.

$f_i$ is an operation of the class, for i = 1..n.

We state a class as a DL assertion:

where $C$ is the name of the class, $a$ the name of an attribute, $T$ the type of an attribute, and $f$ the name of an operation of the class.

Figure 3. Common Properties of a Class in UML Class Diagram


Attributes are represented in the UML 2.0 metamodel by the property 'structural feature', which describes the state of an object. Let $a$ be an attribute of a class $C$ with type $C'$ and an optional multiplicity $[i..j]$. An attribute is thus denoted by a name, the associated type of the attribute and, optionally, a multiplicity, as follows:

Visibility name: type multiplicity

Visibility indicates whether an attribute is public or private.

Name denotes the name of the attribute.

Type specifies which kinds of values or objects the attribute can contain.

Multiplicity indicates how many values or objects may fill the attribute.

The attributes of a class are stated by the following DL assertion (without the definition of visibility):

An attribute has a name and a type. The attribute name must be unique only within the class it belongs to. An attribute may also have a multiplicity, which is either implicit or explicit. An implicit multiplicity is assumed to be 1..1 (i.e. the attribute is mandatory and single-valued), while an explicit multiplicity gives a minimal and maximal number of values, e.g. 1..*.

Multiplicity is a constraint on the number of instances of one class that can be related to one instance of the other class. A multiplicity assertion states that an attribute $a$ associates each instance of $C$ with at least $i$ and at most $j$ instances of $C'$. The multiplicity of $a$ is stated as:

Note that if $j$ is * the second conjunct can be omitted, and if the multiplicity is [0..*] the whole assertion can be omitted, while if the multiplicity is [1..1] the first conjunct can be omitted. For example, with reference to Figure 3, it is possible to write:
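As a hedged sketch of the multiplicity rules described above, the function below generates DL assertions, as plain strings, for one attribute. It assumes a commonly used encoding in which the attribute type is expressed by a value restriction and the multiplicity by a conjunction of an at-least and an at-most restriction; this encoding, the function name `attribute_axioms`, and the class and attribute names are assumptions of this example and are not taken from Figure 3. The sketch drops the at-most conjunct when the upper bound is * and drops the whole multiplicity assertion for [0..*]; for [1..1] it conservatively keeps both conjuncts.

```python
def attribute_axioms(cls, attr, typ, lower=1, upper=1):
    """Return DL assertions (as strings) for one attribute.

    `upper=None` stands for the unbounded multiplicity '*'.
    The default bounds model the implicit multiplicity 1..1.
    """
    axioms = [f"{cls} ⊑ ∀{attr}.{typ}"]  # typing of the attribute (assumed encoding)
    conjuncts = []
    if lower > 0:                         # (≥ 0 a) holds trivially, so it is dropped
        conjuncts.append(f"(≥ {lower} {attr})")
    if upper is not None:                 # '*' imposes no upper bound
        conjuncts.append(f"(≤ {upper} {attr})")
    if conjuncts:                         # for [0..*] nothing is emitted
        axioms.append(f"{cls} ⊑ " + " ⊓ ".join(conjuncts))
    return axioms


# Hypothetical attributes of a hypothetical Student class.
for axiom in attribute_axioms("Student", "name", "String"):
    print(axiom)
for axiom in attribute_axioms("Student", "email", "String", lower=1, upper=None):
    print(axiom)
```

Generating the assertions as strings keeps the sketch independent of any particular reasoner; feeding them to a DL system would require that system's own input syntax.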


An operation is an action or function that can be applied to the objects of the class with which the operation is associated. The full UML syntax for an operation is:

Visibility name (parameter-list): return-type

The name is a string that denotes the operation. The parameter-list is the list of the operation's parameters. The return-type is a comma-separated list of return types. If an operation has no parameters, the parentheses are still shown but are empty. In general, an operation has a name and parameters, each of which has a name and a type. An operation of a class can be stated in DLs as:

:) for and j=,


is return values (or the type of result) belonging to ..; is the operation's name; is parameter belonging to the classes ...

For example (based on Figure 3):

Association and Aggregation

A relationship between two or more classes (or between a class and itself) in a UML class diagram is represented by an association. An association is labeled with a verb phrase or a role name that describes the properties of the association (e.g. attributes, operations and others). In UML, an association is a relation between instances of two or more classes, as shown in Figure 4(a). An association between two classes is a property of both classes.

Figure 4. Binary association (a) and aggregation (b) in UML

Figure 5 presents the multiplicities of a binary association, which are stated in DLs as:


This states that each instance of $C_1$ is connected by $A$ to at least $n_l$ and at most $n_u$ instances of $C_2$, and that each instance of $C_2$ is connected by $A^{-}$ to at least $m_l$ and at most $m_u$ instances of $C_1$.
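One encoding of a binary association with multiplicities that is common in the DL literature is sketched below. It is offered as an illustration under stated assumptions, not necessarily as the exact assertion intended above: $A$ is a role standing for the association, $A^{-}$ is its inverse, $C_1$ and $C_2$ are the associated classes, $n_l$ and $n_u$ are the lower and upper bounds on the number of $C_2$ instances linked to one instance of $C_1$, and $m_l$ and $m_u$ are the bounds in the opposite direction. All of these symbol names are chosen for this sketch.

```latex
\begin{align*}
\top &\sqsubseteq \forall A.C_2 \sqcap \forall A^{-}.C_1
  && \text{(typing of the two association ends)} \\
C_1 &\sqsubseteq (\geq n_l\, A) \sqcap (\leq n_u\, A)
  && \text{(multiplicity seen from } C_1\text{)} \\
C_2 &\sqsubseteq (\geq m_l\, A^{-}) \sqcap (\leq m_u\, A^{-})
  && \text{(multiplicity seen from } C_2\text{)}
\end{align*}
```

As with attributes, an at-most conjunct is simply left out when the corresponding upper bound is *, and an at-least conjunct is left out when the lower bound is 0.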


Example of association:

Figure 5. An example of binary association

In DLs this can be written as follows:


An n-ary association among classes is an n-ary predicate in FOL. We need to state that the components of the predicate must belong to the correct classes:

Based on Figure 4(b), an aggregation A without multiplicities is formalized in DLs as follows:

Generalization and Inheritance

In UML, a generalization identifies one class as the superclass and the others as its subclasses. This means that the properties and operations of the superclass are also valid for objects of the subclass. Note that every instance of each subclass is also an instance of the superclass. If a UML class $C_2$ generalizes a class $C_1$, we can express this by the DL assertion $C_1 \sqsubseteq C_2$, which means that every property of $C_2$ is also a property of $C_1$, but not necessarily vice versa. Generalizations in UML can be grouped into a class hierarchy, as shown in Figure 6.

Figure 6. A class hierarchy in UML

For a superclass $C$ with subclasses $C_1, \dots, C_n$, a class hierarchy in UML can be expressed by the FOL assertion:

$\forall x.\ C_i(x) \rightarrow C(x)$, for $i = 1, \dots, n$

Constraints (i.e. disjointness and completeness) can be added to a UML class hierarchy as additional properties relating the child classes and the parent class. Disjointness is the condition that different subclasses have no common instances, which is stated in DLs as follows:

$C_i \sqsubseteq \neg C_j$, for $i \neq j$

Completeness means that every instance of the superclass is also an instance of at least one of the subclasses, which is defined by the FOL assertion:

$\forall x.\ C(x) \rightarrow C_1(x) \lor \dots \lor C_n(x)$
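The three hierarchy conditions (generalization, disjointness and completeness) can be checked over a finite set of instances with a few lines of code. The sketch below treats each class as a Python set of instance identifiers; the function names and the instance data are hypothetical.

```python
from itertools import combinations


def is_generalization(subclass, superclass):
    """Every instance of the subclass is also an instance of the superclass."""
    return subclass <= superclass


def are_disjoint(subclasses):
    """No two different subclasses share an instance."""
    return all(a.isdisjoint(b) for a, b in combinations(subclasses, 2))


def is_complete(superclass, subclasses):
    """Every instance of the superclass belongs to at least one subclass."""
    return superclass <= set().union(*subclasses)


# Hypothetical extensions of three classes.
student = {"s1", "s2", "s3"}
undergrade = {"s1", "s2"}
graduate = {"s3"}

print(is_generalization(undergrade, student))
print(are_disjoint([undergrade, graduate]))
print(is_complete(student, [undergrade, graduate]))
```

These checks verify the conditions in one given population of instances; a DL reasoner instead decides whether they follow from the assertions in every possible population.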



Composition is a strong form of aggregation in which an instance of one class is part of an instance of another class; that is, an object is made up of other objects. It is depicted with a filled diamond. The DL assertion is stated as follows:


A Simple Example

This section gives a simple example (see Figure 7) to illustrate the whole formalization. The class diagram contains six classes (Student, Undergrade, Graduate, Class, Lecturer and Course) and one generalization, which can be stated in three assertions as follows:

Figure 7. A simple example of UML Class Diagram

The properties of the Student class, i.e. its attributes and operations, are defined as follows:


The attributes and operations of every other class are defined in the same manner. The associations between the Student class and the classes of the hierarchy are stated as follows (and similarly for the other classes):

Conclusion and Future Work

This paper has shown the formalization of the elements of the UML class diagram in terms of a specific formal logic, namely Description Logics (DLs). Through this formalization, the deductive capabilities of DLs can be exploited. The work maps the constructs of class diagram elements into DLs. We recognize that this formalization is not yet satisfactory, because some properties, such as the dependency relation, are not yet defined and because the formalization is produced manually.

Formalizing the elements of the UML class diagram remains a challenging task. In the future, we aim to extend this formalization to capture other properties of the UML 2.0 class diagram. The formalization will be produced automatically using existing tools and then evaluated in a real case study on a large UML model.

Acknowledgment

This research is partially funded by the Fundamental Research Grant Scheme (FRGS) and the Indonesian government's higher education department. The authors would like to convey their utmost appreciation to the anonymous reviewers and the respective individuals for their involvement and invaluable feedback throughout this research.