Remote OS detection via TCP/IP Stack FingerPrinting

posted onJune 27, 2000

by hitbsecnews

ABSTRACT

This
paper discusses how to glean precious information about a host by querying
its TCP/IP stack. I first present some of the "classical" methods of determining
host OS which do not involve stack fingerprinting. Then I describe the
current "state of the art" in stack fingerprinting tools. Next comes a
description of many techniques for causing the remote host to leak information
about itself. Finally I detail my (nmap) implementation of this, followed
by a snapshot gained from nmap which discloses what OS is running on many
popular Internet sites.

REASONS

I think the usefulness of determining what OS a system is running is pretty
obvious, so I'll make this section short. One of the strongest examples
of this usefulness is that many security holes are dependent on OS version.
Let's say you are doing a penetration test and you find port 53 open.
If this is a vulnerable version of Bind, you only get one chance to exploit
it since a failed attempt will crash the daemon. With a good TCP/IP fingerprinter,
you will quickly find that this machine is running 'Solaris 2.51' or 'Linux
2.0.35' and you can adjust your shellcode accordingly.

A worse possibility is someone scanning 500,000 hosts in advance to see
what OS is running and what ports are open. Then when someone posts (say)
a root hole in Sun's comsat daemon, our little cracker could grep his
list for 'UDP/512' and 'Solaris 2.6' and he immediately has pages and
pages of rootable boxes. It should be noted that this is SCRIPT KIDDIE
behavior. You have demonstrated no skill and nobody is even remotely impressed
that you were able to find some vulnerable .edu that had not patched the
hole in time. Also, people will be even _less_ impressed if you use your
newfound access to deface the department's web site with a self-aggrandizing
rant about how damn good you are and how stupid the sysadmins must be.

Another
possible use is for social engineering. Lets say that you are scanning
your target company and nmap reports a 'Datavoice TxPORT PRISM 3000 T1
CSU/DSU 6.22/2.06'. The hacker might now call up as 'Datavoice support'
and discuss some issues about their PRISM 3000. "We are going to announce
a security hole soon, but first we want all our current customers to install
the patch -- I just mailed it to you..." Some naive administrators might
assume that only an authorized engineer from Datavoice would know so much
about their CSU/DSU.

Another
potential use of this capability is evaluation of companies you may want
to do business with. Before you choose a new ISP, scan them and see what
equipment is in use. Those "$99/year" deals don't sound nearly so good
when you find out they have crappy routers and offer PPP services off a bunch of Windows boxes.

CLASSICAL TECHNIQUES

Stack
fingerprinting solves the problem of OS identification in a unique way.
I think this technique holds the most promise, but there are currently
many other solutions. Sadly, this is still one the most effective of those
techniques:

playground~>
telnet hpux.u-aizu.ac.jp

Trying 163.143.103.12...

Connected
to hpux.u-aizu.ac.jp.

Escape
character is '^]'.

HP-UX hpux B.10.01 A 9000/715

(ttyp2)
login:

There
is no point going to all this trouble of fingerprinting if the machine
will blatantly announce to the world exactly what it is running! Sadly,
many vendors ship _current_ systems with these kind of banners and many
admins do not turn them off. Just because there are other ways to figure
out what OS is running (such as fingerprinting), does not mean we should
just announce our OS and architecture to every schmuck who tries to connect.

The problems with relying on this technique are that an increasing number
of people are turning banners off, many systems don't give much information,
and it is trivial for someone to "lie" in their banners. Nevertheless,
banner reading is all you get for OS and OS Version checking if you spend
thousands of dollars on the commercial ISS scanner. Download nmap or queso
instead and save your money :).

Even
if you turn off the banners, many applications will happily give away
this kind of information when asked. For example lets look at an FTP server:

payfonez>
telnet ftp.netscape.com 21

Trying
207.200.74.26...

Connected
to ftp.netscape.com.

Escape
character is '^]'.

220
ftp29 FTP server (UNIX(r) System V Release 4.0) ready.

SYST

215
UNIX Type: L8 Version: SUNOS

First
of all, it gives us system details in its default banner. Then if we give
the 'SYST' command it happily feeds back even more information.

If
anon FTP is supported, we can often download /bin/ls or other binaries and determine what architecture it was built for.

Many other applications are too free with information. Take web servers
for example:

playground>
echo 'GET / HTTP/1.0n' | nc hotbot.com 80 | egrep

'^Server:'

Server:
Microsoft-IIS/4.0

playground>

Hmmm ... I wonder what OS those lamers are running.

Other
classic techniques include DNS host info records (rarely effective) and
social engineering. If the machine is listening on 161/udp (snmp), you
are almost guaranteed a bunch of detailed info using 'snmpwalk' from the
CMU SNMP tools distribution and the 'public' community name.

CURRENT FINGERPRINTING PROGRAMS

Nmap
is not the first OS recognition program to use TCP/IP fingerprinting.
The common IRC spoofer sirc by Johan has included very rudimentary fingerprinting
techniques since version 3 (or earlier). It attempts to place a host in
the classes "Linux", "4.4BSD", "Win95", or "Unknown" using a few simple
TCP flag tests.

Another
such program is checkos, released publicly in January of this year by
Shok of Team CodeZero in Confidence Remains High Issue #7. The fingerprinting techniques since version 3 (or earlier). It attempts to place a host in
the classes "Linux", "4.4BSD", "Win95", or "Unknown" using a few simple
TCP flag tests.

Another
such program is checkos, released publicly in January of this year by
Shok of Team CodeZero in Confidence Remains High Issue #7. The fingerprinting
techniques are exactly the same as SIRC, and even the _code_ is identical
in many places. Checkos was privately available for a long time prior
to the public release, so I have no idea who swiped code from whom. But
neither seems to credit the other. One thing checkos does add is telnet
banner checking, which is useful but has the problems described earlier.

Su1d
also wrote an OS checking program. His is called SS and as of Version
3.11 it can identify 12 different OS types. I am somewhat partial to this
one since he credits my nmap program for some of the networking code :).

Then
there is queso. This program is the newest and it is a huge leap forward
from the other programs. Not only do they introduce a couple new tests,
but they were the first (that I have seen) to move the OS fingerprints
_out_ of the code. The other scanners included code like:

/*
from ss */

if ((flagsfour & TH_RST) && (flagsfour & TH_ACK) && (winfour == 0) &&

(flagsthree
& TH_ACK))

reportos(argv[2],argv[3],"Livingston
Portmaster ComOS");

Instead,
queso moves this into a configuration file which obviously scales much
better and makes adding an OS as easy as appending a few lines to a fingerprint
file.

file.

Queso
was written by Savage, one of the fine folks at Apostols.org.

One
problem with all the programs describe above is that they are very limited
in the number of fingerprinting tests which limits the granularity of
answers. I want to know more than just 'this machine is OpenBSD, FreeBSD,
or NetBSD', I wish to know exactly which of those it is as well as some
idea of the release version number. In the same way, I would rather see
'Solaris 2.6' than simply 'Solaris'. To achieve this response granularity,
I worked on a number of fingerprinting techniques which are described
in the next section.

FINGERPRINTING METHODOLOGY

There are many, many techniques which can be used to fingerprint networking
stacks. Basically, you just look for things that differ among operating
systems and write a probe for the difference. If you combine enough of
these, you can narrow down the OS very tightly. For example nmap can reliably
distinguish Solaris 2.4 vs. Solaris 2.5-2.51 vs Solaris 2.6. It can also
tell Linux kernel 2.0.30 from 2.0.31-34 or 2.0.35. Here are some techniques:

The
FIN probe -- Here we send a FIN packet (or any packet without an ACK or
SYN flag) to an open port and wait for a response. The correct RFC793
behavior is to NOT respond, but many broken implementations such as MS
Windows, BSDI, CISCO, HP/UX, MVS, and IRIX send a RESET back. Most current
tools utilize this technique.

The
BOGUS flag probe -- Queso is the first scanner I have seen to use this clever test. The idea is to set an undefined TCP "flag" ( 64 or 128) in
the TCP header of a SYN packet. Linux boxes prior to 2.0.35 keep the flag
set in their response. I have not found any other OS to have this bug.
However, some operating systems seem to reset the connection when they
get a SYN+BOGUS packet. This behavior could be useful in identifying them.

TCP ISN Sampling -- The idea here is to find patterns in the initial sequence
numbers chosen by TCP implementations when responding to a connection
request. These can be categorized in to many groups such as the traditional
64K (many old UNIX boxes), Random increments (newer versions of Solaris,
IRIX, FreeBSD, Digital UNIX, Cray, and many others), True "random" (Linux
2.0.*, OpenVMS, newer AIX, etc). Windows boxes (and a few others) use
a "time dependent" model where the ISN is incremented by a small fixed
amount each time period. Needless to say, this is almost as easily defeated
as the old 64K behavior. Of course my favorite technique is "constant".
The machines ALWAYS use the exact same ISN :). I've seen this on some
3Com hubs (uses 0x803) and Apple LaserWriter printers (uses 0xC7001).

You
can also subclass groups such as random incremental by computing variances,
greatest common divisors, and other functions on the set of sequence numbers
and the differences between the numbers.

It
should be noted that ISN generation has important security implications.
For more information on this, contact "security expert" Tsutomu "Shimmy"
Shimomura at SDSC and ask him how he was owned. Nmap is the first program
I have seen to use this for OS identification.

Don't
Fragment bit -- Many operating systems are starting to set the IP "Don't Fragment bit -- Many operating systems are starting to set the IP "Don't
Fragment" bit on some of the packets they send. This gives various performance
benefits (though it can also be annoying -- this is why nmap fragmentation
scans do not work from Solaris boxes). In any case, not all OS's do this
and some do it in different cases, so by paying attention to this bit
we can glean even more information about the target OS. I haven't seen
this one before either.

TCP
Initial Window -- This simply involves checking the window size on returned
packets. Older scanners simply used a non-zero window on a RST packet
to mean "BSD 4.4 derived". Newer scanners such as queso and nmap keep
track of the exact window since it is actually pretty constant by OS type.
This test actually gives us a lot of information, since some operating
systems can be uniquely identified by the window alone (for example, AIX
is the only OS I have seen which uses 0x3F25). In their "completely rewritten"
TCP stack for NT5, Microsoft uses 0x402E. Interestingly, that is exactly
the number used by OpenBSD and FreeBSD.

ACK
Value -- Although you would think this would be completely standard, implementations
differ in what value they use for the ACK field in some cases. For example,
lets say you send a FIN|PSH|URG to a closed TCP port. Most implementations
will set the ACK to be the same as your initial sequence number, though
Windows and some stupid printers will send your seq + 1. If you send a
SYN|FIN|URG|PSH to an open port, Windows is very inconsistent. Sometimes
it sends back your seq, other times it sends S++, and still other times
is sends back a seemingly random value. One has to wonder what kind of
code MS is writing that changes its mind like this.

ICMP Error Message Quenching -- Some (smart) operating systems follow
the RFC 1812 suggestion to limit the rate at which various error messages are sent. For example, the Linux kernel (in net/ipv4/icmp.h) limits destination
unreachable message generation to 80 per 4 seconds, with a 1/4 second
penalty if that is exceeded. One way to test this is to send a bunch of
packets to some random high UDP port and count the number of unreachables
received. I have not seen this used before, and in fact I have not added
this to nmap (except for use in UDP port scanning). This test would make
the OS detection take a bit longer since you need to send a bunch of packets
and wait for them to return. Also dealing with the possibility of packets
dropped on the network would be a pain.

ICMP
Message Quoting -- The RFCs specify that ICMP error messages quote some
small amount of an ICMP message that causes various errors. For a port
unreachable message, almost all implementations send only the required
IP header + 8 bytes back. However, Solaris sends back a bit more and Linux
sends back even more than that. The beauty with this is it allows nmap
to recognize Linux and Solaris hosts even if they don't have any ports
listening.

ICMP Error message echoing integrity -- I got this idea from something
Theo De Raadt (lead OpenBSD developer) posted to comp.security.unix. As
mentioned before, machines have to send back part of your original message
along with a port unreachable error. Yet some machines tend to use your
headers as 'scratch space' during initial processing and so they are a
bit warped by the time you get them back. For example, AIX and BSDI send
back an IP 'total length' field that is 20 bytes too high. Some BSDI,
FreeBSD, OpenBSD, ULTRIX, and VAXen fuck up the IP ID that you sent them.
While the checksum is going to change due to the changed TTL anyway, there
are some machines (AIX, FreeBSD, etc.) which send back an inconsistent
or 0 checksum. Same thing goes with the UDP checksum. All in all, nmap does nine different tests on the ICMP errors to sniff out subtle differences
like these.

Type
of Service -- For the ICMP port unreachable messages I look at the type
of service (TOS) value of the packet sent back. Almost all implementations
use 0 for this ICMP error although Linux uses 0xC0. This does not indicate
one of the standard TOS values, but instead is part of the unused (AFAIK)
precedence field. I do not know why this is set, but if they change to
0 we will be able to keep identifying the old versions _and_ we will be
able to identify between old and new.

Fragmentation
Handling -- This is a favorite technique of Thomas H. Ptacek of Secure
Networks, Inc (now owned by a bunch of Windows users at NAI). This takes
advantage of the fact that different implementations often handle overlapping
IP fragments differently. Some will overwrite the old portions with the
new, and in other cases the old stuff has precedence. There are many different
probes you can use to determine how the packet was reassembled. I did
not add this capability since I know of no portable way to send IP fragments
(in particular, it is a bitch on Solaris). For more information on overlapping
fragments, you can read their IDS paper (www.secnet.com).

TCP Options -- These are truly a gold mine in terms of leaking information.
The beauty of these options is that:

1)
They are generally optional (duh!) :) so not all hosts implement them.

2)
You know if a host implements them by sending a query with an option set.
The target generally show support of the option by setting it on the reply.

3)
You can stuff a whole bunch of options on one packet to test everything
at once.

Nmap sends these options along with almost every probe packet: Window
Scale=10; NOP; Max Segment Size = 265; Timestamp; End of Ops;

When
you get your response, you take a look at which options were returned
and thus are supported. Some operating systems such as recent FreeBSD
boxes support all of the above, while others, such as Linux 2.0.X support
very few. The latest Linux 2.1.x kernels do support all of the above.
On the other hand, they are more vulnerable to TCP sequence prediction.
Go figure.

Even
if several operating systems support the same set of options, you can
sometimes distinguish them by the _values_ of the options. For example,
if you send a small MSS value to a Linux box, it will generally echo that
MSS back to you. Other hosts will give you different values.

And
even if you get the same set of supported options AND the same values,
you can still differentiate via the _order_ that the options are given,
and where padding is applied. For example Solaris returns 'NNTNWME' which
means:

While
Linux 2.1.122 returns MENNTNW. Same options, same values, but different
order!

1.)
OsReview :
Red Hat 6.1 -
L33tdawg

2.)
Lockdown
: Securing your Linux box (part 1) -
L33tdawg

3.)
Remote OS
detection via TCP/IP Stack FingerPrinting -
Fyodor

4.)
Installing
Linux on a Laptop -
OB-1

5.)
Hacking
payphones - Telstra style -
OB-1

6.)
Testing modems
with a DoS attack -
L33tdawg

7.)
Avoiding detection
-
L33tdawg