As an open source solution, honeyd does not offer any support or graphical user interface for installation or configuration. One way to install it is to download the source code on the honeyd host, compile it, and install the resulting binary and configuration files; the honeyd binary can then be run from the command line of the Linux system. A second, more efficient way is to install the honeyd package with root privileges using the command:
sudo apt-get install honeyd
To function correctly, honeyd also requires the following libraries to be installed:
libevent: a software library providing asynchronous notification for events
libnet: a portable framework/library used for network packet construction
libpcap: a framework used for capturing packets passing through a network
Honeyd comes with numerous scripts written in the scripting languages Python and Perl, by both Niels Provos and other contributors, which can be used to emulate services on the appropriate ports of the virtual honeypots.
Although installing and running honeyd might seem quite simple, it is a complex software tool with a range of command line parameters affecting its behavior. Once honeyd has been correctly installed on our system, the following command is used to start it:
honeyd -p /etc/honeyd/nmap.prints -d -f /etc/honeyd/honeyd_thesis.conf
The parameters used above are explained in detail in the following. The first word, honeyd, tells the shell to execute the honeyd binary file. The -p option gives the pathname of the Nmap fingerprint file (here: nmap.prints), which contains the Nmap signature database that honeyd uses to emulate different operating systems at the network stack level. This dictates how honeyd behaves towards attackers depending on the emulated operating system. The -d flag makes honeyd run in debug mode, with all messages printed on the current terminal. This mode is useful when testing honeyd and its functionality on the fly. Omitting this flag causes honeyd to run as a daemon process in the background. Another important command line parameter, not used above, is the -i flag. It is needed when the computer system hosting the virtual honeypots has more than one network interface. In this case, the -i flag denotes which interface or interfaces will receive network traffic for the virtual honeypots.
Finally, the -f command line parameter is probably the most important one, as it lies at the heart of the honeyd configuration. The -f flag gives honeyd the pathname of the configuration file (here: honeyd_thesis.conf), where all information about the virtual honeypots is kept, such as which operating systems are used and which services should be emulated on each honeypot. Honeyd's configuration file is a plain text file written in a context-free configuration language that can be described in Backus-Naur Form (BNF). Although straightforward, it offers a wide variety of options for configuring the virtual honeypots. Its main role is to specify the IP addresses on which the virtual hosts will run, waiting for the attackers' probes, and the services that should be emulated on each of them.
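The command line options described above can also be assembled programmatically, for example when honeyd is started from a wrapper script. The following is a minimal sketch in Python; the function name build_honeyd_command and the interface name eth0 are our own illustrative choices, and only the flags -p, -d, -i and -f discussed above are used.

```python
def build_honeyd_command(fingerprints, config, debug=True, interface=None):
    """Return the honeyd argument list for the options discussed above."""
    cmd = ["honeyd", "-p", fingerprints]
    if debug:
        cmd.append("-d")            # stay in the foreground and print messages
    if interface:
        cmd += ["-i", interface]    # only needed on hosts with several interfaces
    cmd += ["-f", config]
    return cmd


# Same invocation as the command shown earlier in this section.
default_cmd = build_honeyd_command(
    "/etc/honeyd/nmap.prints", "/etc/honeyd/honeyd_thesis.conf"
)
# Hypothetical multi-homed host: listen on eth0 and run as a daemon.
daemon_cmd = build_honeyd_command(
    "/etc/honeyd/nmap.prints",
    "/etc/honeyd/honeyd_thesis.conf",
    debug=False,
    interface="eth0",
)
print(" ".join(default_cmd))
print(" ".join(daemon_cmd))
```

The resulting list could be handed to subprocess.run on the honeyd host; the sketch itself only builds and prints the command.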
Templates constitute the core of the configuration files for virtual honeypots. Honeyd works via the creation of templates which describe and simulate specific computer systems configured in great detail. The first step in creating a virtual honeypot is to create a corresponding template which specifies the defining characteristics of the honeypot, such as the simulated operating system, the ports on which the honeypot will listen and the services being emulated. After that, an IP address is assigned to the template, which brings up the honeypot at that specific network address. The command to create a new template is create, and its parameter should be a name that reflects the system intended to be simulated. Each template must have a different name. Instead of a name, the keyword default can be used to create a default template. Honeyd falls back on it when it does not find any template matching the destination IP address of a packet, which is preferable when a group of IP addresses should share a common template rather than each address being bound to its own template.
Following the creation of a template, the configuration commands define how the virtual honeypot will behave. The set and add commands are used to shape the behavior of the configured honeypot. The first characteristic to be defined by the set command is the operating system, or personality, from the Nmap fingerprint database, which dictates how the computer system behaves at the IP network stack. The personality determines the form of the responses honeyd sends back, along with other details such as the TCP sequence numbers and the TCP timestamps. It can be chosen from a large variety of well-known operating systems such as Linux, FreeBSD, Mac OS, Microsoft Windows and Cisco IOS.
The set command can also determine the default behavior of the template regarding the supported network protocols (TCP, UDP, ICMP), i.e. how the template reacts to probes at ports which are unassigned. The action taken can include three options:
Open: this signifies that all ports for the particular network protocol are open by default. This setting applies only to TCP and UDP connections
Block: this indicates that the ports will ignore any incoming connections and packets directed to them will be dropped by default.
Reset: this means that all ports for the specified protocol are closed by default. For a TCP port, honeyd will reply to a SYN packet with a TCP RST packet, whereas for a UDP port it will reply with an ICMP port unreachable message.
Finally, honeyd gives the option to spoof the uptime of a host, referring to the duration of time since the system was first booted. The set uptime command does exactly that. If no uptime is defined, honeyd assumes an arbitrary value up to 20 days.
Following the set command is the add set of commands. The add commands constitute the center of the template, as they specify which applications will be running on each port and which services can be remotely accessed by the outside world. The syntax of the add command requires specifying the network protocol, the port number and an appropriate action. The options open, block and reset that are used for the default behavior of the template can also be used on a per-port basis. The important difference here is that, apart from just opening or closing ports, predefined scripts can be called to emulate different services on different ports. This possibility of integrating scripts written in programming languages within the honeyd configuration gives virtual honeypots a high degree of realism. A realistic service with which an adversary can interact can yield much more detailed information about an attacker. Evidently, the more scripts running on ports, the more opportunities there are for interacting with attackers.
The following example from our configuration file starts a telnet simulator service for TCP connections on port 23:
add Linux1 tcp port 23 "/usr/share/honeyd/scripts
When a remote host attempts to establish a connection with the above Linux1 template on port 23, honeyd will start a new process executing the shell script "./telnetd.sh". The script receives input data via its stdin and sends replies back to the sender via its stdout. Apart from TCP connections, scripts can also be used to interact with remote users through UDP connections. It is important to mention that whenever honeyd receives a new connection on one of its honeypots' ports, it forks (starts) a new process which executes the specified script. This can be risky at times, as it can lead to a performance bottleneck if the virtual honeypots get overwhelmed with network traffic, e.g. if deployed in a busy network.
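The stdin/stdout contract described above can be illustrated with a small service emulator. The following Python sketch is not one of the scripts shipped with honeyd; the prompt text, the canned reply and the function name serve are hypothetical. It only shows the pattern of reading the peer's input from one stream and writing replies to another.

```python
import io

PROMPT = "login: "   # hypothetical prompt, not taken from any honeyd script


def serve(inp, out):
    """Read the peer's lines from inp, write canned replies to out,
    and return the list of values the peer typed."""
    attempts = []
    out.write(PROMPT)
    out.flush()
    for line in inp:
        value = line.strip()
        if not value:
            break                       # empty line: treat as end of session
        attempts.append(value)
        out.write("Login incorrect\r\n" + PROMPT)
        out.flush()                     # replies must not linger in a buffer
    return attempts


# Demonstration with in-memory streams standing in for the network peer.
# A script started by honeyd would call serve(sys.stdin, sys.stdout) instead.
fake_peer = io.StringIO("root\nadmin\n")
reply_stream = io.StringIO()
typed = serve(fake_peer, reply_stream)
print(typed)
```

Because a new process is forked for every connection, a script of this kind should stay small and terminate promptly, in line with the performance concern raised above.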
The last command needed to configure the virtual honeypot successfully is the bind command, whose role is to bind the created template to the IP address on which it will operate.
The ethernet option of the set command can be used to explicitly assign a unique MAC address to each configured template. As mentioned earlier, physical addresses are essential for network communication, and via proxy ARP the honeyd host can reply with its own MAC address to the ARP requests regarding the honeypots. A disadvantage of this approach is that attackers can easily detect the existence of virtual machines, as all the IP addresses of the honeypots relate to one MAC address. With the set ethernet command this risk is avoided, and there is no need to configure proxy ARP, as honeyd takes care of all the ARP procedures. Care should be taken to avoid MAC address collisions when assigning addresses to the virtual hosts, as physical addresses should be unique for every system.
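The create, set, add and bind commands described in this section can be combined into a complete template. The following Python sketch renders such a template as configuration text. The personality string, IP address, MAC address, uptime value and script path are hypothetical placeholders (the personality must match an entry in the Nmap fingerprint file actually used), and the helper render_template is our own, not part of honeyd.

```python
KEYWORD_ACTIONS = {"open", "block", "reset"}


def _action(value):
    # Keyword actions are written bare; anything else is treated as a
    # script command line and quoted.
    return value if value in KEYWORD_ACTIONS else f'"{value}"'


def render_template(name, personality, ip, defaults, ports,
                    uptime=None, ethernet=None):
    """Render one honeyd template followed by its bind line."""
    lines = [f"create {name}",
             f'set {name} personality "{personality}"']
    for proto, action in defaults.items():
        lines.append(f"set {name} default {proto} action {action}")
    for (proto, port), action in ports.items():
        lines.append(f"add {name} {proto} port {port} {_action(action)}")
    if uptime is not None:
        lines.append(f"set {name} uptime {uptime}")
    if ethernet is not None:
        lines.append(f'set {name} ethernet "{ethernet}"')
    lines.append(f"bind {ip} {name}")
    return "\n".join(lines)


config_text = render_template(
    name="Linux1",
    personality="Linux 2.4.20",              # hypothetical fingerprint name
    ip="192.168.1.50",                       # hypothetical address
    defaults={"tcp": "reset", "udp": "reset"},
    ports={("tcp", 23): "/opt/example/telnet-sim.sh",   # hypothetical script
           ("tcp", 80): "open"},
    uptime=86400,                            # one day, in seconds
    ethernet="00:11:22:33:44:55",            # hypothetical MAC address
)
print(config_text)
```

Generating the file from a data structure makes it easier to keep template names and MAC addresses unique when many honeypots are defined.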
2.2.5. Honeyd Logging Information Gathering
The honeyd framework comes with a built-in mechanism for gathering information about the connection attempts launched by adversaries targeting the virtual honeypots. Honeyd can populate files with log information for both connection attempts by attackers and established connections, for all protocols. The command used to log the network activity concerning the virtual honeypots is the following:
administrator@ubuntu:/etc/honeypot$ honeyd -p /etc/honeyd/nmap.prints -d -f /etc/honeyd/honeyd_thesis.conf -l thesis.log
The -l command line option enables packet-level logging in honeyd. It takes a single parameter, the log file in which the connection logs are written; in this case, the log file is thesis.log. The directory in which the log file resides must be writable by the user who is running honeyd. The log file contains information about the time a connection was attempted, the source IP address and port of the attacker attempting to connect, the destination IP address and port of the virtual honeypot under attack and the protocol involved. If the attempt is successful and the connection is eventually established, it also records the start and end time of the connection along with other information such as the total number of bytes transmitted.
The packet-level log files can be extremely useful when used in combination with data mining tools. They offer a great deal of information about the connection attempts launched by adversaries. Scripts written in Perl, Python or other programming languages can extract useful information and statistics from the log files, such as the number of IP addresses probing our virtual honeypots on a daily basis, a list of the most commonly attacked ports and other data giving an insight into the scanning activity of potential attackers against the virtual hosts. Log files of this kind can grow extremely large over time, and care should be taken regarding the processing capabilities of the data mining tools used in each case.
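As an illustration of such a script, the following Python sketch counts the distinct source addresses per day and the most frequently probed destination ports. It assumes a whitespace-separated line layout of timestamp, protocol, flag, source IP, source port, destination IP and destination port, with S marking the start of a connection, E its end and - a packet outside any connection. This layout and the sample lines are assumptions made for the example and should be checked against the log actually produced.

```python
from collections import Counter

# Hypothetical sample lines in the assumed layout.
SAMPLE_LOG = """\
2013-05-01-10:15:02.0412 tcp(6) S 203.0.113.7 40112 192.168.1.50 23
2013-05-01-10:15:09.8830 tcp(6) E 203.0.113.7 40112 192.168.1.50 23: 34 120
2013-05-01-11:02:44.1005 tcp(6) - 198.51.100.9 51515 192.168.1.51 445: 48 S
2013-05-02-03:20:11.5521 tcp(6) S 198.51.100.9 51600 192.168.1.50 23
"""


def parse_line(line):
    """Return a small record for a TCP/UDP probe, or None for other lines."""
    fields = line.split()
    if len(fields) < 7:
        return None
    timestamp, proto, flag, src_ip, _sport, _dst_ip, dst_port = fields[:7]
    if proto.split("(")[0] not in ("tcp", "udp"):
        return None      # other protocols have no port columns in this layout
    if flag == "E":
        return None      # end-of-connection line: already counted at its start
    return {
        "day": timestamp[:10],
        "src_ip": src_ip,
        "dst_port": int(dst_port.rstrip(":")),
    }


def summarize(text):
    """Return (distinct source IPs per day, hit count per destination port)."""
    sources = {}
    port_hits = Counter()
    for line in text.splitlines():
        record = parse_line(line)
        if record is None:
            continue
        sources.setdefault(record["day"], set()).add(record["src_ip"])
        port_hits[record["dst_port"]] += 1
    per_day = {day: len(ips) for day, ips in sources.items()}
    return per_day, port_hits


per_day, port_hits = summarize(SAMPLE_LOG)
print(per_day)
print(port_hits.most_common(3))
```

For very large log files the same logic can be applied while iterating over the open file line by line, so that the whole log never has to be held in memory.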
Apart from packet-level logging, honeyd also provides the option of service-level logging. Whereas packet-level logging gives a general view of the overall network traffic handled by the virtual honeypots, service-level log files give a more detailed view of the ongoing sessions. When scripts emulating services on different ports are used in honeypots and these scripts have additional logging capabilities, a great deal of interesting information can be obtained about the attackers' activities and the methods they use to take control of a system.
2.2.6. Significance of Honeyd
As an open source low-interaction honeypot, honeyd offers the wide range of features described previously. Being an open source software tool means that it is distributed freely and that anyone can access the source code. Individuals and groups belonging to the network security community can therefore customize and contribute to its source code, adding more emulated services which improve the level of interaction between the attacker and the virtual honeypots and provide us with even more information about the methods attackers use to break into systems. Over the next years we can expect a considerable rise in our ability to capture malicious behavior via honeyd. On the other hand, being an open source solution, honeyd does not offer any support for maintenance or troubleshooting from an official source.
As a low-interaction honeypot, honeyd is basically deployed as a production honeypot used to detect and capture network attacks. No real, complete operating system is offered; adversaries are limited to the network services emulated by the scripts. As such, honeyd introduces a low risk to the overall security of the organizations that deploy it. A honeypot that has been recognized by the attacker becomes useless, so camouflaging honeyd is an issue that should be addressed. Xinwen Fu et al. have shown that an adversary can fingerprint honeyd by measuring the latency of the simulated links, and have proposed a camouflaged honeyd capable of behaving like its surrounding network environment.
Another issue to be addressed is the scripts emulating network services in honeyd. These need to be written by hand and, as a result, not many scripts exist. Corrado Leita et al. have proposed a method which can alleviate this issue by automatically creating new scripts. Finally, the lack of a built-in alerting mechanism and the fact that only a command-line interface is offered are two shortcomings of its design. Chao-Hsi Yeh et al. have proposed a Graphical User Interface (GUI) for honeyd offering a variety of interesting features.
Dionaea is an open source low-interaction honeypot that can be further categorized into the class of malware collector honeypots. The aim of these low-interaction honeypots is to expose vulnerabilities on specific services in order to attract malware exploiting the network and, if possible, capture and download a copy of the malware.
As the number of malware attacks is increasing, these copies can be very useful for monitoring and analyzing the behavior of malware in a safe environment and for finding defensive security solutions. In the literature, two ways of malware analysis have been proposed: static and dynamic analysis. As their names suggest, static analysis is the procedure of reading the code and trying to figure out the intention of the malware, while dynamic analysis involves executing the malware code in a secure manner. Usually, these two types of analysis are combined, and the output of the static analysis can be very useful for the dynamic one.
The collected malware copies are usually stored in the form of binary files. Malware collectors such as Dionaea can download a great number of binaries, although in many cases these files represent the same malware. A binary must have a different MD5 hash from the others in order to be characterized as unique. From the perspective of detection, two types have been proposed: detection of existing malware based on patterns or samples, and zero-day detection schemes. Zero-day malware is defined as "a malicious software which is not detected by anti-virus programs due to lack of existing virus signatures or other malware detection techniques".
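The uniqueness criterion above can be sketched in a few lines of Python using the standard hashlib module. The file names and byte strings below are made-up stand-ins for captured binaries, and the helper unique_by_md5 is our own.

```python
import hashlib


def unique_by_md5(samples):
    """Map each MD5 hex digest to the first sample name seen with it."""
    first_seen = {}
    for name, data in samples:
        digest = hashlib.md5(data).hexdigest()
        first_seen.setdefault(digest, name)   # later duplicates are ignored
    return first_seen


# Hypothetical captured files: the first two have identical content.
captured = [
    ("capture-001.bin", b"MZ\x90\x00payload-one"),
    ("capture-002.bin", b"MZ\x90\x00payload-one"),
    ("capture-003.bin", b"MZ\x90\x00payload-two"),
]
unique_samples = unique_by_md5(captured)
print(len(captured), "captured,", len(unique_samples), "unique")
```

In practice the data would be read from the files in the honeypot's binaries folder rather than from literals.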
Dionaea is usually referred to as the successor of Nepenthes. The main improvements of the new malware collector over Nepenthes include:
the implementation of the protocols in the Python scripting language
the use of the libemu library for shellcode detection instead of pattern matching; pattern matching requires a copy of the shellcode and therefore makes the detection of zero-day malware extremely hard
support for IPv6 addresses and TLS encryption
development of the VoIP module
2.3.1. Features of Dionaea
As mentioned above, the Dionaea developers used Python to implement the network protocols. This choice allows for an easier implementation compared to the C language, for instance. However, the main reason for this choice was to deal with the new generation of malware that utilizes APIs to access services.
SMB is the basic protocol supported by Dionaea. The SMB (Server Message Block) protocol works on port 445 and is used by Windows operating systems for file and printer sharing over TCP. Akamai's Internet report for the second and third quarter of 2012 (figure 3) shows that port 445 was the most targeted port in this period, as it attracted almost one third of the total network attack traffic.
Figure 3. Percentage of global Internet attack traffic during the 2nd and 3rd quarter of 2012, by targeted ports
The SMB protocol has known vulnerabilities and is a common target, especially for worms. This is why it was selected by the developers of Dionaea as the main protocol and, as will be shown in the following chapter, most of the captured copies of malware originated from that port.
Other important protocols that Dionaea supports are the following:
HTTP is supported on port 80, and secure HTTP (HTTPS) is supported as well
FTP on port 21, although the probability of an attack on an FTP service is rather low. Dionaea implements an FTP server which can create directories and also upload and download files
TFTP: a TFTP server is provided on port 69 and is implemented to check the UDP connection code
MSSQL: Dionaea also emulates a Microsoft SQL server on port 1433. Attackers are able to log in to the server but, as there is no real database provided by Dionaea, there is no further interaction
MySQL: Dionaea also implements the MySQL wire stream protocol on port 3306
SIP, as mentioned above a new module for supporting VOIP was added to Dionaea. The VoIP protocol implemented is SIP. The operation of this module consists in waiting for incoming SIP messages, logging all data and replying accordingly to the requests. Only when malicious messages are detected, Dionaea passes the code to the emulation engine.
2.3.2. Operation of Dionaea
The main function of Dionaea is to detect and analyze the payload offered by the attacker in order to obtain a copy of the malware. To achieve this, Dionaea offers different ways of interacting with the attacker. For example, it can provide a cmd.exe command prompt to the attacker and react to the input commands, or use the URLDownloadToFile API to get a file through HTTP. If the previous operation is successful, Dionaea knows the location of the file that the attacker is trying to send and attempts to download it. One very interesting feature of Dionaea is that it can send the downloaded file to a third party for further analysis rather than simply storing it on disk.
Dionaea is also a great monitoring tool. It records all the activity on the ports it listens on, but also keeps a record of connections to other ports. All these recorded data are kept in a log file in text format. Although we can choose the format of the log file, for instance filter the log messages or sort the events from the most recent to the least recent ones, it is still quite difficult to read and to extract useful information from. Therefore, Dionaea also creates an SQLite database with all the recorded activities, which makes it easier for the user to run queries and obtain useful information from the honeypot.
From the log file we can retrieve useful data to understand the operation of the honeypot. Dionaea records three types of connections: reject, accept and connect. Connection attempts to ports that Dionaea does not listen on are marked as 'reject'. On the other hand, attempts to monitored ports are marked as either 'connect' or 'accept'. In any case, Dionaea records in the log file, and additionally in the SQLite database, valuable information about these connections such as the timestamp of the connection, the IP addresses of the local and remote host and the corresponding ports and protocols.
Apart from the information about the connections, Dionaea also keeps other significant tables in the database, such as the downloads table, which contains the id of the connection, the URL from which the malware was downloaded and the MD5 hash of the downloaded file.
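The kind of query this database makes possible can be sketched with Python's built-in sqlite3 module. The example builds a small in-memory database so that it is self-contained; the table and column names (connections, downloads, connection_type, download_md5_hash and so on) are assumptions modeled on the description above and should be checked against the schema of the actual database file. All rows are made up.

```python
import sqlite3

db = sqlite3.connect(":memory:")
# Assumed, simplified schema; verify against the real Dionaea database.
db.executescript("""
CREATE TABLE connections (
    connection           INTEGER PRIMARY KEY,
    connection_type      TEXT,
    connection_protocol  TEXT,
    connection_timestamp REAL,
    local_port           INTEGER,
    remote_host          TEXT
);
CREATE TABLE downloads (
    download          INTEGER PRIMARY KEY,
    connection        INTEGER,
    download_url      TEXT,
    download_md5_hash TEXT
);
""")
db.executemany(
    "INSERT INTO connections VALUES (?, ?, ?, ?, ?, ?)",
    [
        (1, "accept", "smbd", 1357000000.0, 445, "203.0.113.7"),
        (2, "accept", "smbd", 1357000100.0, 445, "198.51.100.9"),
        (3, "reject", "pcap", 1357000200.0, 3389, "203.0.113.7"),
        (4, "accept", "httpd", 1357000300.0, 80, "203.0.113.7"),
    ],
)
db.execute(
    "INSERT INTO downloads VALUES (?, ?, ?, ?)",
    (1, 1, "http://203.0.113.7/x.exe", "0123456789abcdef0123456789abcdef"),
)

# How many connections of each type were recorded?
by_type = dict(db.execute(
    "SELECT connection_type, COUNT(*) FROM connections GROUP BY connection_type"
))

# Which remote host and local port did each downloaded sample arrive through?
captures = db.execute("""
    SELECT c.remote_host, c.local_port, d.download_url, d.download_md5_hash
    FROM downloads AS d
    JOIN connections AS c ON c.connection = d.connection
""").fetchall()

print(by_type)
print(captures)
```

Against a real database, the sqlite3.connect call would simply point at the database file written by the honeypot instead of ":memory:".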
2.3.3. Installation and Configuration of Dionaea
The installation of Dionaea requires some basic knowledge of Linux operating systems, as it is important to install all the required dependencies first; useful and detailed instructions are available on the official home page.
Dionaea is a flexible software tool and can be easily configured according to our needs by editing the configuration file. More specifically, in the configuration file we can edit the following:
We can change the directory of the log file and more importantly we can reduce the amount of the produced data. By default, Dionaea records every event in the log file. We can filter the output data by changing the levels value from 'all' to only 'warning, error' for example. Dionaea writes the last event at the end of the log file. Thus, it is really useful to rotate this behavior in logging section, so that the last event can be read directly at the first line of the log file.
Moreover, we can modify the path of the downloaded binaries and bi-streams folders. Bi-directional streams allow us to replay an attack that Dionaea captured on IP-level. As we mentioned above, with Dionaea we can submit directly the downloaded malware to third parties for further analysis. In the submit section of the configuration file, we can edit all these details. One more interesting feature is that we can manually configure the IP range that Dionaea can listen to and also add ipv6 addresses. By default, Dionaea listens to all the IP addresses it can find.
Finally, we can configure the modules section which is considered the most significant of the configuration files. The modules section includes a list of services which Dionaea supports and we can enable or disable some of them. For instance, we can enable and edit the pcap module if we want to keep information about rejected connection attempts or additionally, if we are interested in the operating system of the attackers, we can enable the p0f service.
Kippo is a medium-interaction honeypot which emulates an SSH server. It provides an interactive shell to the intruder while monitoring and recording all activities. Furthermore, it is designed to monitor brute force attacks.
Secure Shell (SSH) is a network protocol which provides encrypted communication between two devices. SSH allows users to gain access to remote devices through a shell or interactive command line in a secure manner. The port used by the SSH protocol by default is 22. In most cases, a client can access an SSH server by entering a valid username and password through an SSH client tool. From that perspective, SSH servers are vulnerable to password attacks.
SSH dictionary and brute force attacks in particular are very common and quite easy to launch, even for unskilled attackers. These types of attacks are based on the fact that many users choose their credentials from a small domain. Thus, the attacks try username and password combinations in an automated way until a correct one is found. This property can be very useful for SSH server honeypot implementations. In order to have as many successful logins as possible in our SSH honeypot, it is preferable to choose credentials that appear in the dictionaries used by automated attack tools.
Cisco's white paper about SSH login activity shows that, out of a total of approximately 1.56 million login attempts, the username 'root' was used in almost 35 percent of all cases. The following figure depicts the 10 most used usernames according to the results of the research conducted by Cisco.
Figure. Top 10 attempted usernames
In addition, other surveys give some interesting information about the most commonly used passwords in connection attempts. The top password combinations include variations of the username such as 'username' or 'username123' and passwords like '123456' or even 'password'. The results about the usernames used are almost the same as the ones in Cisco's research.
2.4.1. Features of Kippo
Kippo is implemented in Python. As mentioned above, Kippo basically emulates an SSH server on port 22 and logs all login attempts to that port. Whenever a login is successful, Kippo monitors all the input commands of the attacker and replies to these commands in order to convince the attacker that she is interacting with a real system. A list of the available commands can be found in Kippo's directory.
More specifically, the features of Kippo include:
a fake file system, in which the attacker can add or remove files with the appropriate commands
files downloaded by the attacker with the wget command are saved in a specific secured folder
fake file contents can be added, so that the attacker can view them using, for example, the cat command
fake output is provided for some specific commands such as vi, useradd, etc.
reactions to specific commands are designed to fool the attacker; for instance, the exit command does not really work, which means that the attacker believes she has disconnected but can still be monitored by Kippo
all sessions are recorded and can be easily replayed with the original timing
all records are kept in an SQL database
2.4.2. Operation of Kippo
Kippo records all the useful information in a log file and also in an SQL database. The main tables of the database include:
authentication table, containing information about the login attempt, the timestamp of the attempt and also the usernames and passwords that have been used
client table, which contains information about the SSH client version that has been used
input table, with information about the input commands that have been entered. Also, in that table we have information about the session id, the timestamp and additionally if the command was successful or not
sessions table, containing information about the id of the connection, the duration and timestamp of the connection and the IP address of the attacker
sensors table providing information about the ssh server and the IP address of the host
finally, the ttylog table which, as mentioned above, contains information about how to replay sessions with the corresponding timestamps
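A typical use of the authentication table is to rank the usernames tried by attackers and to list the credentials that succeeded. The following Python sketch uses an in-memory SQLite database as a stand-in for the SQL database that Kippo writes to; the table name auth and its columns are assumptions based on the description above, and all rows are invented.

```python
import sqlite3

kippo_db = sqlite3.connect(":memory:")   # stand-in for the real SQL database
# Assumed, simplified layout of the authentication table.
kippo_db.execute("""
CREATE TABLE auth (
    id       INTEGER PRIMARY KEY AUTOINCREMENT,
    session  TEXT,
    success  INTEGER,
    username TEXT,
    password TEXT
)
""")
kippo_db.executemany(
    "INSERT INTO auth (session, success, username, password) VALUES (?, ?, ?, ?)",
    [
        ("s1", 0, "root", "root"),
        ("s1", 0, "root", "password"),
        ("s1", 1, "root", "123456"),
        ("s2", 0, "admin", "admin"),
        ("s3", 0, "test", "test123"),
    ],
)

# Most frequently attempted usernames (ties broken alphabetically).
top_usernames = kippo_db.execute("""
    SELECT username, COUNT(*) AS attempts
    FROM auth
    GROUP BY username
    ORDER BY attempts DESC, username
    LIMIT 3
""").fetchall()

# Credential pairs that resulted in a successful login.
successful = kippo_db.execute(
    "SELECT username, password FROM auth WHERE success = 1"
).fetchall()

print(top_usernames)
print(successful)
```

The same two queries can be run unchanged from any SQL client connected to the honeypot's database, provided the table and column names match.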
2.4.3. Installation and Configuration of Kippo
The installation of Kippo is quite easy if one follows the instructions on the home page and installs the latest version of the software. In the configuration file of Kippo we can customize the honeypot according to our needs. We can edit the IP address of the host if we want to change the default, which is 0.0.0.0, and also the listening port, which is 2222 by default.
Port 2222 is an alternate port and may be quite useful for testing purposes but, since most SSH attacks target port 22, this choice would reduce the number of recorded attempts. Thus, the honeypot should be reachable on port 22. Binding directly to that port would require running Kippo with root privileges, which is not recommended for security reasons. Instead, port redirection can be used, as proposed on Kippo's home page, or other existing solutions such as authbind.
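One common way to implement such a redirection on Linux is a NAT rule that forwards traffic arriving on port 22 to the unprivileged port. The Python sketch below only builds the corresponding iptables command line and does not execute it; whether this exact rule suits a given host (firewall layout, interface restrictions) is an assumption to be verified, and the function name redirect_rule is our own.

```python
def redirect_rule(public_port=22, honeypot_port=2222):
    """Return an iptables argument list redirecting public_port to honeypot_port."""
    return [
        "iptables", "-t", "nat", "-A", "PREROUTING",
        "-p", "tcp", "--dport", str(public_port),
        "-j", "REDIRECT", "--to-ports", str(honeypot_port),
    ]


rule = redirect_rule()
print(" ".join(rule))
```

Applying the rule requires root privileges once, at setup time, while Kippo itself keeps running as an unprivileged user on port 2222.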
In addition, in the configuration file we can change the name of the user in the interactive shell. By default, it is "sales", which is quite attractive to attackers. Furthermore, we can set the desired password for our server. By default, it is '123456' which, as shown above, is included in dictionary attacks and can be expected to yield a large number of successful logins. Besides that, Kippo creates a dedicated password database, where we can add more valid passwords. Other configuration options include the directories of important folders, such as the download folder, the fake file system folder and the password and input data storage folders. Finally, we can edit the credentials used to connect Kippo's logging to the SQL database.
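Assuming the configuration file follows the common INI layout, its settings can be inspected with Python's standard configparser module. In the sketch below the section name honeypot and the keys ssh_addr, ssh_port and password are assumptions chosen to mirror the options discussed above; the sample text is not a copy of Kippo's shipped configuration.

```python
import configparser

# Hypothetical excerpt in INI style; key names are assumed.
SAMPLE_CFG = """\
[honeypot]
ssh_addr = 0.0.0.0
ssh_port = 2222
password = 123456
"""


def load_settings(text):
    """Parse the listening address, port and password from INI text."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    section = parser["honeypot"]
    return {
        "address": section.get("ssh_addr", fallback="0.0.0.0"),
        "port": section.getint("ssh_port", fallback=2222),
        "password": section.get("password", fallback=None),
    }


settings = load_settings(SAMPLE_CFG)
print(settings)
```

A check of this kind can be run before starting the honeypot, for example to confirm that the listening port matches the target of the port redirection.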