A Qualitative Survey of Active TCP/IP Fingerprinting Tools and Techniques for Operating Systems Identification (Network Security)

Abstract

TCP/IP fingerprinting is the process of identifying the Operating System (OS) of a remote machine through a TCP/IP based computer network. This process has applications closely related to network security, and both intrusion and defense procedures may use it to achieve their objectives. A large set of methods performs this process well in favorable scenarios, but nowadays many adversities reduce identification performance. This work compares the characteristics of four active fingerprinting tools (Nmap, Xprobe2, SinFP and Zion) and how they deal with test environments under adverse conditions. The results show that Zion outperforms the other tools in all test environments and is suitable even for use against sensitive systems.

Introduction

The remote identification of operating systems, also known as OS fingerprinting (Operating System fingerprinting), is a process that aims to discover the operating system of a remote machine. We consider a machine remote if it is accessible through a computer network. The identification is accomplished by using data from the remote machine. The process of OS fingerprinting is illustrated in Fig. 1.

Fig. 1. Representation of the OS fingerprinting process

As a whole, the process has four components: (1) the acquired network data, (2) a fingerprint made by refining the data, (3) a fingerprint database in which each fingerprint is labeled with the OS it represents, and (4) the results produced by a matching algorithm applied to the database and to the fingerprint made from the acquired data. These four components are distributed into two subprocesses, called fingerprinting and matching.
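As a rough illustration of how the four components fit together, the following minimal sketch refines acquired data into a fingerprint and ranks labeled database entries against it. The feature names (`ttl`, `window`, `df`), the OS labels and the similarity measure are hypothetical and chosen only for illustration; none of the surveyed tools is claimed to work this way.

```python
from dataclasses import dataclass

# A fingerprint is modelled here as a mapping from feature name to value.
# The feature names used below are hypothetical.
Fingerprint = dict


@dataclass(frozen=True)
class LabeledFingerprint:
    os_label: str          # label of the OS this fingerprint represents
    features: Fingerprint  # the stored fingerprint


def make_fingerprint(raw, keys):
    """Fingerprinting step: refine acquired data by keeping selected fields."""
    return {k: raw[k] for k in keys if k in raw}


def similarity(a, b):
    """Fraction of the features in `a` whose values agree with `b`."""
    if not a:
        return 0.0
    return sum(1 for k, v in a.items() if b.get(k) == v) / len(a)


def match(fp, database):
    """Matching step: score every database entry, best match first."""
    scored = [(similarity(fp, entry.features), entry.os_label) for entry in database]
    return sorted(scored, key=lambda s: (-s[0], s[1]))


# (3) a tiny labeled database with made-up values
DATABASE = [
    LabeledFingerprint("os-a", {"ttl": 64, "window": 5840, "df": 1}),
    LabeledFingerprint("os-b", {"ttl": 128, "window": 65535, "df": 1}),
]

# (1) acquired data, (2) refined fingerprint, (4) matching results
RAW = {"ttl": 64, "window": 5840, "df": 1, "payload_len": 0}
FP = make_fingerprint(RAW, ("ttl", "window", "df"))
RESULTS = match(FP, DATABASE)
```

Real tools use far richer features and more tolerant matching; the point here is only the separation between building a fingerprint and matching it against labeled entries.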

The techniques used for this purpose differ according to the data they use and how these data are acquired. The OS fingerprinting process can be divided into two consecutive tasks, which we call characterization and classification. In the characterization task a fingerprint is created for an OS, while in the classification task some procedure is applied to a database of these fingerprints to classify (match) the OS. According to how data are created and captured, the methods can be grouped into two classes:

— Active: the machine that performs the identification sends messages to the remote machine. The responses to these messages (or the lack of responses) are used in the identification process;

— Passive: the machine that performs the identification does not send messages through the network to perform identification. The remote machine data is captured when it communicates with a third machine. This implies that the identification machine must have access to the communication channel between the remote and the third machine.

The way these two categories of tools perform fingerprinting is very important because it is closely related to tool efficiency. Choosing the appropriate tool for OS fingerprinting is an important decision, since the tool will usually be applied in security tests. We show some of the most important characteristics of the best-known tools and why these characteristics matter to a security expert.

The remainder of this paper has four sections. The criteria used to select the tools and OSes, and the test bed used for the assessment, are presented in Section 2. The results are presented in Section 3. The results are explained in Section 4, and Section 5 concludes the paper.

Scenario

When the OS fingerprinting process uses TCP/IP network data, it is called TCP/IP stack fingerprinting, which takes advantage of details that differ from one implementation of TCP/IP to another [8]. The tools used in this survey were selected according to four criteria: (i) they are well accepted by the security community [1]; (ii) they are widely used [9]; (iii) the techniques they use are at least mentioned in papers [4,12]; and (iv) they use active OS fingerprinting. The last criterion was adopted because, unlike passive methods, active ones can produce the data they need without depending on third devices, so the techniques used to create fingerprints depend only on the data. For this reason, other well-known TCP/IP stack fingerprinting tools (such as p0f, PRADS, Ettercap, NetworkMiner, PacketFence and Satori) were not included in the tests. The chosen tools were: Nmap [9] (version 5.21), SinFP [4] (version 2.07), Xprobe2 [3] (version 0.3), and Zion [12] (version 0.1).

Fig. 2. First test environment

Each selected tool underwent a series of tests of its identification ability and of its robustness in the presence of network security devices (e.g. firewalls). Initially, tests were conducted in a controlled environment without security devices. In this case the tools are, theoretically, under ideal conditions, so the results of these tests express the best possible results for each tool. This initial test environment is shown in Fig. 2. The operating systems used are installed on the machines on the right side, and the fingerprinting tools were installed on the scanner machine.

The OSes were chosen according to three criteria: (i) they are widely used, so the fingerprint database of each tool most probably has their signatures; (ii) only one OS is taken from each group of OSes that have the same, or almost the same, TCP/IP stack implementation (e.g. QNX or NetBSD, MacOS or FreeBSD) [13,14]; and (iii) they are not so recent that they are probably missing from the fingerprint databases of older tools such as Xprobe2 (for example, there are no fingerprints of Windows 7 or Vista in the Xprobe2 database). The selected OSes are shown in Table 1.

We now introduce some security devices to create a more realistic picture of a machine on the Internet. Assuming an environment in which the firewall is intended to protect a given set of services (e.g. HTTP and SSH), all traffic not associated with these services could (or should) be blocked. This blocking may imply that all UDP and ICMP traffic is discarded. As a result, tools that use data from these protocols will not produce reasonable results.
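To make the effect of such a policy concrete, the sketch below filters a set of probes through a firewall that only admits TCP traffic to the protected services. The probe list and the port numbers other than 80 (HTTP) and 22 (SSH) are hypothetical and do not reproduce the probe set of any of the four tools.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Probe:
    name: str
    protocol: str            # "tcp", "udp" or "icmp"
    dst_port: Optional[int]  # None for ICMP, which has no ports


# Services the firewall is meant to protect (HTTP and SSH, as in the text).
ALLOWED_TCP_PORTS = frozenset({80, 22})


def passes_firewall(probe):
    """Only TCP traffic addressed to a protected service is let through."""
    return probe.protocol == "tcp" and probe.dst_port in ALLOWED_TCP_PORTS


# Hypothetical probes an active fingerprinting tool might send.
PROBES = [
    Probe("tcp-syn-open-port", "tcp", 80),
    Probe("tcp-syn-closed-port", "tcp", 31337),
    Probe("udp-closed-port", "udp", 31337),
    Probe("icmp-echo", "icmp", None),
]

SURVIVING = [p.name for p in PROBES if passes_firewall(p)]
BLOCKED = [p.name for p in PROBES if not passes_firewall(p)]
```

Under this assumed policy only the probe aimed at an open, protected TCP port reaches the target, which is why tools that rely on UDP, ICMP or closed-port responses lose most of their input.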

Regarding traffic normalization, almost all the peculiarities that TCP/IP stack fingerprinting methods exploit in the specially crafted packets sent to the remote machine are removed, or cause the packet to be dropped [10]. If traffic normalization is performed on all protocols (IP, TCP, UDP and ICMP), practically all fingerprinting tools will be affected.

Table 1. Used operating systems

| Operating system | Detailed version |
|---|---|
| Debian | Linux debian 2.6.26-1-686 |
| FreeBSD | 6.4-RELEASE i386 |
| NetBSD | 4.0.1 GENERIC i386 |
| OpenBSD | 4.4 GENERIC#1021 i386 |
| OpenSolaris | SunOS 5.11 snv_101b i86pc |
| Windows 2000 | 5.00.2195 Service Pack 4 |
| Windows XP | Version 2002 Service Pack 2 |

Fig. 3. Second test environment, using Honeyd and OpenBSD Packet Filter

Another problem that affects fingerprinting tools is created by the use of SYN proxies. SYN proxies are one of the widely used techniques to protect servers against DoS (Denial of Service) attacks [6]. The SYN+ACK synchronization messages sent in response to TCP SYN requests do not originate from the target machine but from the firewall itself. As a result, the use of SYN proxies directly affects any identification tool that uses TCP.

Another problem is related to the use of PAT (Port Address Translation) [7,20]. PAT is a technology in which a mapping between a port of a device on the internal network and a port of the device exposed to the Internet is made explicitly. The use of PAT complicates the identification process because the operating system identified depends on the port from which the tool collects the information. If the identification tool uses more than one open TCP port to create its fingerprint, this signature will not properly represent the TCP/IP stack of either machine (the remote one or the one that performs PAT).
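The mixing effect can be sketched with a toy model in which each exposed port is forwarded to a different internal host. The host names, port mapping and stack traits below are hypothetical, and the model deliberately ignores any changes the PAT device itself makes to the packets.

```python
# Hypothetical PAT mapping: exposed port -> internal host.
PAT_TABLE = {80: "web-host", 22: "ssh-host"}

# Hypothetical TCP/IP stack traits of each internal host.
STACK_TRAITS = {
    "web-host": {"ttl": 64, "window": 5840},
    "ssh-host": {"ttl": 128, "window": 65535},
}


def probe(port, field):
    """Return the trait observed when probing the given exposed port."""
    host = PAT_TABLE[port]
    return STACK_TRAITS[host][field]


# A tool that reads every trait from a single port sees one consistent stack.
SINGLE_PORT = {"ttl": probe(80, "ttl"), "window": probe(80, "window")}

# A tool that combines observations from two ports mixes two stacks.
MIXED = {"ttl": probe(80, "ttl"), "window": probe(22, "window")}

MATCHES_SOME_HOST = MIXED in STACK_TRAITS.values()
```

Here `SINGLE_PORT` equals the traits of `web-host`, while `MIXED` corresponds to no host at all, which is the situation described above for tools that build one fingerprint from several open ports.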

Beyond firewalls, there are tools that aim to fool OS fingerprinting [19]. Honeyd is a tool that aims to simulate machines, services and operating systems on the network [17,18]. To do so, it simulates different implementations of the TCP/IP stack. Considering the use of OpenBSD Packet Filter [2] and the presence of Honeyd, the architecture of Fig. 2 was modified and used as a second test environment, presented in Fig. 3.

This last test environment can reproduce all the security mechanisms introduced in this section. The next section presents the results for each tool under the following conditions: (i) in a clean environment, without the use of any security mechanism; (ii) using PAT; (iii) using packet normalization; (iv) using a SYN proxy; and (v) using fake machines made with Honeyd.

Experiments

Table 2 summarizes the test results for each analyzed tool, where a full black circle means correct, a half circle means imprecise and an empty circle means wrong.

Table 2. Classification results

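| Network setup | Nmap | SinFP | Xprobe2 | Zion |
|---|---|---|---|---|
| Clean environment | [image tmp18-161] | [image tmp18-162] | [image tmp18-163] | [image tmp18-164] |
| Using Port Address Translation | [image tmp18-165] | [image tmp18-166] | [image tmp18-167] | [image tmp18-168] |
| Using Packet Normalization | [image tmp18-169] | [image tmp18-170] | [image tmp18-171] | [image tmp18-172] |
| Using SYN proxy | [image tmp18-173] | [image tmp18-174] | [image tmp18-175] | [image tmp18-176] |
| Using Honeyd | [image tmp18-177] | [image tmp18-178] | [image tmp18-179] | [image tmp18-180] |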

The notes in Table 2 refer to the following facts and events:

(a) unable to distinguish between Windows 2000 and XP;

(b) the Debian GNU/Linux operating system was not precisely recognized: it was classified as Linux 2.6.X and OpenBSD 4.X with the same degree of certainty (85%);

(c) because Xprobe2 uses only network layer information, it cannot distinguish the operating system using information associated with the transport layer;

(d) the Debian GNU/Linux operating system was not precisely recognized: it was classified as Linux 2.6.X and OpenBSD 4.X with almost the same degree of certainty (approximately 86%);

(e) the use of Honeyd was not recognized, but Honeyd's imitation was not good enough to produce the wrong result;

(f) the Zion tool was able to recognize the use of the SYN proxy;

(g) the Zion tool was able to recognize the use of Honeyd.

Up to this point we have shown when each analyzed tool fails at the task of recognizing operating systems remotely. Although only active tools have been analyzed, the methods used by passive tools are also fragile, since the information they use is also affected by the security mechanisms used here. The next section examines what information can be used to perform OS fingerprinting even in the presence of PF and Honeyd.

Why Zion Outperforms the Other Tools

The TCP ISN (Initial Sequence Number) is responsible for maintaining consistency in TCP communications (i.e. avoiding duplicated segments that originate from the reuse of sequence numbers of previous connections [16]). The way the generation of these numbers is implemented can lead to security problems. After the discovery of these problems, a new recommendation was established in 1996 by RFC 1948 [5]. Michal Zalewski first showed that some operating systems have a distinct way of implementing the generation of these numbers [21,22].

Using TCP ISNs as the data from which a signature for OS fingerprinting is created means characterizing the PRNG (Pseudo Random Number Generator) of each operating system. The current recommendation for generating these numbers through a function Gisn(t) is expressed as [5]:

Fig. 4. Illustration of TCP ISN sample acquisition process

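$$G_{isn}(t) = M(t) + F(connection\_id) \qquad (1)$$

$$M(t) = M(t-1) + R(t) \qquad (2)$$

$$F(connection\_id) = f(connection\_id, secret\_key) \qquad (3)$$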

where Gisn(t) is the function that generates the initial sequence number at time t; M(t) is formed by adding the value of the function R(t) to its own previous value; and F(•) applies a function f(•) to the connection identifier, which comprises the source and destination addresses and ports, together with an optional secret key. To estimate the function R(t) using only samples of Gisn(t), it is important to note that F(•) can be assumed constant for a given connection identifier (connection_id). Thus, from Equations 1, 2 and 3 one can obtain an estimate, R̂(t), of the function R(t):

$$\hat{R}(t) = G_{isn}(t) - G_{isn}(t-1) \qquad (4)$$
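The relations described in the text can be checked numerically. The sketch below simulates an ISN generator of the form M(t) + F(•) with F(•) held constant for one connection identifier, then recovers the increments R(t) as differences of consecutive samples. The function names, the use of SHA-256 as f(•), the example addresses and the reduction modulo 2^32 (TCP sequence numbers are 32-bit) are assumptions made for illustration, not details taken from the paper or from any operating system.

```python
import hashlib
import random

MOD = 2 ** 32  # TCP sequence numbers are 32-bit values


def f_conn(connection_id, secret):
    """Assumed f(.): hash of connection identifier and secret, cut to 32 bits."""
    data = repr(connection_id).encode() + secret
    return int.from_bytes(hashlib.sha256(data).digest()[:4], "big")


def simulate_isns(increments, connection_id, secret, m0=0):
    """G_isn(t) = M(t) + F(.), with M(t) = M(t-1) + R(t), all modulo 2**32."""
    offset = f_conn(connection_id, secret)  # constant for this connection id
    m = m0
    samples = []
    for r in increments:
        m = (m + r) % MOD
        samples.append((m + offset) % MOD)
    return samples


def estimate_r(samples):
    """Estimate of R(t): difference of consecutive samples, modulo 2**32."""
    return [(cur - prev) % MOD for prev, cur in zip(samples, samples[1:])]


rng = random.Random(0)
INCREMENTS = [rng.randrange(1, 1_000_000) for _ in range(10)]
CONN_ID = ("10.0.0.1", 40000, "10.0.0.2", 80)  # hypothetical addresses and ports

# Start M close to the wrap-around point so the modulo step is exercised.
SAMPLES = simulate_isns(INCREMENTS, CONN_ID, b"secret", m0=MOD - 1000)
ESTIMATE = estimate_r(SAMPLES)
```

Because F(•) cancels in the difference, `ESTIMATE` equals `INCREMENTS[1:]`; the first increment is not recoverable because no earlier sample exists.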

The process of sample acquisition is illustrated in Fig. 4.

One point to consider is that sending SYN packets at sufficiently short intervals can be characterized as a SYN flooding attack, mainly because RST messages are not sent in response to the SYN+ACK messages from the target machine [6]. The acquisition of TCP ISN samples is performed according to Fig. 4, that is: (1) the scanner sends a synchronization message (SYN); (2) the target replies with a SYN+ACK message that confirms the synchronization and carries its TCP ISN, which the scanner records; (3) the scanner sends a RST message to cancel the synchronization and thus avoid SYN flooding (and its detection).
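The three steps above can be sketched at the level of TCP headers. The code below builds a bare 20-byte TCP header, reads the sequence number of a SYN+ACK reply as the ISN sample, and answers with a RST. The `exchange` and `send` callables stand in for whatever packet transport is used and are hypothetical, as is the in-memory `fake_target`; nothing here touches the network, and the checksum is left at zero, which a real sender would have to compute.

```python
import struct

# Fixed part of the TCP header: ports, sequence, acknowledgment,
# data offset plus flags, window, checksum, urgent pointer.
TCP_HEADER = struct.Struct("!HHIIHHHH")
SYN, RST, ACK = 0x02, 0x04, 0x10
MOD = 2 ** 32


def build_segment(src_port, dst_port, seq, ack, flags, window=1024):
    """Build a 20-byte TCP header with no options (checksum left at 0)."""
    offset_flags = (5 << 12) | flags  # data offset of 5 words, then flag bits
    return TCP_HEADER.pack(src_port, dst_port, seq, ack, offset_flags, window, 0, 0)


def parse_segment(data):
    """Extract the fields of interest from the first 20 header bytes."""
    src, dst, seq, ack, offset_flags, _win, _sum, _urg = TCP_HEADER.unpack(data[:20])
    return {"src_port": src, "dst_port": dst, "seq": seq, "ack": ack,
            "flags": offset_flags & 0x3F}


def acquire_isn(exchange, send, src_port, dst_port, scanner_seq):
    """Steps (1) to (3): send SYN, read the ISN from SYN+ACK, send RST."""
    reply = parse_segment(exchange(build_segment(src_port, dst_port, scanner_seq, 0, SYN)))
    if reply["flags"] & (SYN | ACK) != (SYN | ACK):
        return None  # no SYN+ACK, so no ISN sample
    # The RST carries the sequence number the target acknowledged.
    send(build_segment(src_port, dst_port, reply["ack"], 0, RST))
    return reply["seq"]


def fake_target(syn_bytes):
    """Hypothetical stand-in for a target host, with a fixed ISN."""
    syn = parse_segment(syn_bytes)
    return build_segment(syn["dst_port"], syn["src_port"], 0xDEADBEEF,
                         (syn["seq"] + 1) % MOD, SYN | ACK)


SENT = []
ISN = acquire_isn(fake_target, SENT.append, 40000, 80, 1000)
RST_SEGMENT = parse_segment(SENT[0])
```

Repeating `acquire_isn` over time yields the series of Gisn(t) samples from which the increments are estimated.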

In our experiments we found that the analyzed versions of the operating systems Debian, NetBSD, OpenSolaris, Windows 2000 and Windows XP adopt the recommendation proposed by RFC 1948. In cases where this recommendation is not adopted, the samples of Gisn(t) themselves are used in place of the estimate R̂(t). Fig. 5 presents sketches of the first 100 samples of R(t) for each operating system and for Honeyd. This graphical representation shows how each series differs from the others.

A SYN proxy tends to send the same TCP ISN for quite a long period of time (see the FreeBSD sketch). Also, the TCP ISN generator of Honeyd produces a deterministic signal. These facts imply that both SYN proxies and Honeyd can be easily detected. Zion uses computational intelligence methods to create a signature for each operating system and to classify them using R(t) samples. The theoretical foundation for this task is already presented in the literature [11,12,15].
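The two regularities mentioned above can be screened for with very simple checks on a series of increments. The sketch below is only a naive heuristic with a hypothetical threshold and hypothetical function names; it is not the method used by Zion, which relies on the techniques described in [11,12,15]. A long run of zero increments corresponds to the same ISN being repeated, and an exactly periodic series is taken here as one possible sign of a deterministic generator.

```python
def longest_run_of_zeros(r_hat):
    """Length of the longest run of zero increments (repeated ISN)."""
    best = run = 0
    for r in r_hat:
        run = run + 1 if r == 0 else 0
        best = max(best, run)
    return best


def smallest_period(r_hat):
    """Smallest p, up to half the series length, with r[i] == r[i+p] for all i."""
    n = len(r_hat)
    for p in range(1, n // 2 + 1):
        if all(r_hat[i] == r_hat[i + p] for i in range(n - p)):
            return p
    return None


def screen(r_hat, min_zero_run=10):
    """Naive screening of an increment series (threshold is an assumption)."""
    if longest_run_of_zeros(r_hat) >= min_zero_run:
        return "constant ISN (SYN-proxy-like)"
    if smallest_period(r_hat) is not None:
        return "periodic increments (deterministic generator)"
    return "no simple regularity"


CONSTANT_ISN = screen([0] * 20)
FIXED_STEP = screen([1000] * 20)
SHORT_CYCLE = screen([1, 2, 3] * 4)
IRREGULAR = screen([5, 17, 3, 99, 42, 8, 61, 23, 77, 14])
```

With these inputs the first series is flagged as proxy-like, the second and third as periodic, and the last as having no simple regularity.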

Fig. 5. Sketch of the time series compound of 100 samples of R(t)

Conclusion

This paper presented aspects related to the efficiency and reliability of tools for remote identification of operating systems through active OS fingerprinting, and confirms the benefits of using feature extraction and pattern matching in the analysis of TCP ISNs. The results demonstrate the feasibility of the computational intelligence methods developed for Zion for OS fingerprinting. Since Zion uses only well-formed packets in the identification process, it can be used against sensitive machines, such as SCADA devices. We also showed the reason for each wrong identification by the analyzed tools when a set of network security countermeasures is incorporated in the test bed. We exploited the TCP ISNs of several OSes to explain why Zion is not influenced by the use of packet normalization or PAT and can detect SYN proxies and Honeyd.
