# The Mathematical Function Centrality Computer Science Essay

Published:

This essay has been submitted by a student. This is not an example of the work written by our professional essay writers.

Centrality is a mathematical function that given a network, maps numerical values to each node of the network. Centrality values indicate the importance of the node in the network. The highest the centrality value of a node the higher its importance may be. Below different centrality measures are being described

## 2. Degree Centrality

The most trivial eccentricity is the degree of a node. In an undirected network degree of the node is the number of edges connected to this node. In the case of a directed network the degree distribution is split in two definitions the in-degree and out-degree denoting the number of incoming and outgoing edges respectively.

## 3. Eccentricity Centrality

The eccentricity centrality allows the most central node of a network to be found out. Central node is defined as the node which is the closest to every node in the network. Therefore if a node is central the maximum length of the shortest path between this node and the others should be the minimum one.

Let d(s,t) be the length of the shortest path between the nodes s, t.

Eccentricity centrality can be defined as

The above formula uses the shortest path function between the nodes. In the case a node is not connected with the other there is not a shortest path and the eccentricity centrality cannot be computed. Therefore the eccentricity centrality formula can be applied only in strongly connected networks (strongly connected network is a network in which for each pair of nodes contained in it there is at least on path connecting them).

## 4. Closeness Centrality

Closeness centrality gives us an indication of how close is a node among the others. For example let's say we have a network consisted of buildings and roads. We would like the most important building to be as close as possible to the other buildings. In order to calculate this , the closeness centrality is been introduced. For each node we define the quantity S as the sum of the lengths of all the shortest paths between the current node and the others.

Where σ(v, t) is the shortest path between the nodes v, t.

We can see the furthest the node the bigger the quantity S is.

Now we can define closeness centrality as

Closeness centrality uses the length of the shortest path function. Therefore as in eccentricity centrality, the network has to be strongly connected.

## 5. Shortest Path Betweenness Centrality

The shortest path betweenness centrality (CSPB) gives an indication of how often this node is used when nodes in the network wants to communicate to each other hence how important is this node for the communication of the network. The more important the node is in the communication between the nodes of the network the more central can be considered.

Let a network to be defined of towns as nodes and roads connecting the towns as edges. Travellers between towns would choose to transport from one town to another using the shortest way. It would be liked to be known which town appears more often in the travelling path of the passengers (hence expecting more traffic in that particular town).

Let the node s has to communicate with the node t. It is been assumed that in order two nodes to communicate with each other, the shortest path between them will be used(the path with the less intermediate nodes possible).

Let's say that the nodes s, t have σs, t number of shortest paths. We define σs, t(u) as the number of shortest paths between the nodes s, t that contains the node u, where u is an interior node in the path (u ≠ s , u ≠ t).The ration of the communication between the nodes s, t that the node u is taking place is denoted as δs, t(u).

In case there is not a path between the nodes s, t we denote δs, t = 0.

The sum of the rates of each pair of nodes s, t in the networks gives us the CSPB centrality.

Where V is the set of the nodes in the network.

In contrast with the previous centralities the CCPB centrality can be used in non-strongly connected networks. If there is unconnected node then the quantity δs, t get zero value and it doesn't affect the calculations.

## 6. Other centralities

The above centralities are being considered as the most trivial ones. More advanced centralities have been introduced. Such centralities are

Katz's status index

Bonacich's Eigenvector Centrality

PageRank

## B. Global Properties

## Distance between two nodes is the length of the shortest path between these two nodes.

In case there is not shortest path between the nodes, the distance is denoted as infinitive.

## Average path length of network is the average distance between all the node pairs in the network.

In case the network is disconnected the average path length cannot be computed. Instead the average inverse path is used.

## Diameter of the network is the maximum distance occurs between all the pairs of nodes in the network.

## Degree Distribution p(k) v is defined as the probability a randomly chosen node of the network to have a degree equal to k.

## Cumulative degree distribution pc(k) is the probability a randomly chosen node of the network to have a degree equal or greater to k.

## Clustering coefficient of a network is the average clustering coefficient of each node of the network. The clustering coefficient of a node is denoted as the probability the neighbors of this node to be connected to each other.

Let ni be a vertex of the network and ki the number of its neighbors. The maximum possible edges between the neighbors of the ni node can be computed

The clustering coefficient of the ni node is

where Ei is the actual number of edges between the neighbors of the node.

In the above computation an undirected graph is being considered.

## Matching index is the ratio of common neighbors between two nodes.

## More in detail the matching index Mi,j of two nodes i. j is computed by the formula

Where the common neighbors of the nodes are counted once.

To compute the Mi,j the adjacency matrix of the network is used. Thus the above formula can be written as

## C. Network's Models

## 1. Introduction

The global properties and centralities of a network are not enough by their own. The centralities attributes correspond to a specific network and the values cannot be compared between nodes of different networks. Furthermore other global properties may indicate a significant structure or not. In order to be known when a property denotes a trivial network structure or a significant one the centrality values and the global properties of the network must be compared with the corresponding value of a default , trivial , network structure. These prototyped networks are called null networks. Null networks are being constructed with certain methods

Below certain prototype networks and the methods to construct them are being described.

## 2. The Erdos-Renyil Model

The Erdos-Renyil model (ER) is a network where the nodes can be connected to each other with a fixed probability. This network is constructed as following.Let NV be the number of vertices. The possible connections between nodes (hence the maximum number edges the network can have) is . From the set of all the possible edges, a NE number of edges are randomly chosen and assigned to the network. Thus the propability of two nodes to be connected to each other is

The ER network model has the following attributes

The degree distribution follows the Binomial distribution.

In very large networks the distribution tends to follow the Poisson distribution.

The ER model has the small-world attribute.

Depending on the probability p (hence the number of edges chosen) the network can from largely disconnected to completely connected.

More in detail for the Watts-Strogatz model has many disconnected sets of nodes while for the network is strongly connected.

This model has a disadvantage: The way it is constructed the degree distribution is not the same as in real biological networks.

There is not local structure in the network as the edges are randomly chosen

## 3. The Watts-Strogatz Model

The Watz-Strogatz (WS) networks come to cover the problem the ER network has. The WS model has a local structure and also retains the small world property the ER has. To construct a WS network the edges are not chosen randomly. Let NV be the number of nodes of the network. Firstly all the nodes are arranged in a circle. Then each node is connected to the closest nodes. This way the network has a local structure and high clustering coefficient but it doesn't have the small world property. In order to acquire the small world property some modification in the connection of the node must be made. Let p be a real number 0<p<1. The number p denotes the probability an edge will be reallocated. For each link there is a probability p that it will be detached from the node it is connected and randomly connect to another node in the network. This way short paths between vertexes are being created thus the average path of the network is dramatically decreased. It has been discovered that for very small p the network can have the small world property. If the p = 1 then the WS approaches the ER model.

## 4. The Barabasi-Albert model

This network implements another approach in the degree distribution. In this case nodes are added by turns. Each turn a new node is added to the already existing network and gets connected to a fixed number of nodes. If m is the numbers of nodes the new node will be connected to; then m < No where No is the number of nodes in the first turn. The node added is not connected completely randomly with the existing nodes. More specifically the probability the new node to connect with another node of the network is analogous the number of neighbors the other node has.

## D. Motifs

Another approach in understanding the biological networks and their properties is the motifs. The basic idea is that a network is implemented by many smaller subnetworks which are connected to each other.

A motif can be considered as a small structural pattern of nodes and edges that can be found many times in a network. Strictly speaking a motif is a small connected graph.

A motif G' matches a subgraph of a given network when there two graphs are isomorphic ( There is a bijectional function which maps each vertex of the first graph to a node of the other and also two nodes of the first graph are adjacent if and only if the mapped nodes of the other network are adjacent too.)

Finally in order to find the number of matches of a motif in a given network a concept motif must first to be chosen. The purpose of concept motifs is to distinguish between the times a node or an edge can be considered to be in a motif pattern. Specifically the motif concepts are the F1, F2, F3.

The F1 does not apply any restriction on how many times a node or an edge can be used to form motif patters.

The F2 allows the nodes to be shared among motifs. The edges still can be counted only for one motif.

The F3 allows both edges and node to be counted only once (hence to exist only in one motif).