Analysis of the scan of the month #21
-------------------------------------
By Javier Fernández-Sanguino Peña
jfs@computer.org
jfernandez@germinus.com


----------------------- TCPDUMP FILE ANALYSIS ---------------------------

(Note: all times shown in the GMT+1 timezone)

First off, I downloaded the tar.gz file from honeynet.org and checked
the md5sum:
$ md5sum scan21-log.tar.gz
58abd0cb0cbe4c31930225dd229352a5  scan21-log.tar.gz

Yep, it is the same as the one published on the web page. However,
for the paranoid out there: since the honeynet web server is not
SSL-secured, there is no way I can confirm that the file was really
provided by the project. In any case, since I'm not going to execute
the file (it is just data), I'm not all that worried.
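
The same check can be scripted. This is only a minimal sketch in Python:
the helper name md5_of_file is my own, and the expected digest is the one
printed by md5sum above.

```python
import hashlib

# Digest published with the challenge (copied from the md5sum output above).
EXPECTED = "58abd0cb0cbe4c31930225dd229352a5"

def md5_of_file(path, chunk_size=65536):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

# Usage (assumes the tarball is in the current directory):
#   md5_of_file("scan21-log.tar.gz") == EXPECTED
```

As with md5sum, this only proves the download matches the published value,
not that the published value itself is trustworthy.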

OK. Untar the file and get a tcpdump log file (output from snort) named
0215@000-snort.log. I renamed it to 0215-snort.log to avoid the ugly
'@' character.

First of all: tcpdump -nr 0215-snort.log

A quick browse shows that the capture runs from 2 am to 2 am
(a full day), with a lot of traffic coming and going throughout.
After browsing the capture I ran snort against the file:

$ snort -r ~/honeynet-contest/scan-of-the-month/0215-snort.log \
-A full -c /etc/snort/snort.conf

No alerts are shown, so it seems that there are no ordinary attacks in
the capture. No, oops, HOME_NET is not properly set up. Define it as
172.16.1.0/24 in snort.conf and re-run the same command.
The results now show 44 alerts, divided into:

- portscans (211.114.173.65, 213.20.0.92, 216.183.12.188,
208.187.189.25, 211.251.211.65, 61.129.106.171, 61.142.80.110, 217.0.80.154,
202.98.223.74)
- RPC info connections (208.187.189.25, 211.251.211.65)
- RPC overflow (208.187.189.25)
- Wingate (64.224.118.113)

Now we analyse the tcp sessions. We can use snort by adding a:

   log tcp any any <> any 21 (session: printable;)

rule to the snort.conf file. We can also use tcpflow:
$ tcpflow -r 0215-snort.log

Which are most active? A simple 'du -cs * |sort -n' in the output
directory tells us:
20      172.016.001.102.34560-064.224.118.115.06667
28      064.224.118.115.06667-172.016.001.102.34560

The difference between the snort output and tcpflow's is that
snort writes a full session to a single file, while tcpflow splits
each direction of the session into a separate file.
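
The tcpflow file names encode both endpoints with zero-padded fields, which
makes them awkward to compare against tcpdump output. A small sketch
(the function name parse_tcpflow_name is my own) that turns a name like the
ones listed above back into ordinary addresses and ports:

```python
def parse_tcpflow_name(name):
    """Split a tcpflow file name into ((src_ip, src_port), (dst_ip, dst_port))."""
    def endpoint(text):
        parts = text.split(".")
        if len(parts) != 5:
            raise ValueError("unexpected endpoint: %r" % text)
        # int() drops the zero padding that tcpflow adds to each field
        ip = ".".join(str(int(octet)) for octet in parts[:4])
        return ip, int(parts[4])

    src, dst = name.split("-")
    return endpoint(src), endpoint(dst)

src, dst = parse_tcpflow_name("172.016.001.102.34560-064.224.118.115.06667")
print(src, "->", dst)
```

The first endpoint in the name is the sender of the data stored in that file.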

The output is very interesting: the system 172.16.1.102 seems
to be compromised. The contents show an established IRC session that
lasts the full day. Most of the traffic is just PING-PONG
between the IRC client and the server 64.224.118.115
(jade.va.us.dal.net).

Since the IRC session is re-established at 23:16, it is useful to point
out some of the information sent by the client:
0x0020   5018 60f4 c196 0000 4e49 434b 2069 6e66        P.`.....NICK.inf
0x0030   5342 4956 534b 0a55 5345 5220 5653 4b20        SBIVSK.USER.VSK.
0x0040   6c6f 6361 6c68 6f73 7420 6c6f 6361 6c68        localhost.localh
0x0050   6f73 7420 3a56 534b 0a                         ost.:VSK.

So the bot is using a pre-defined nick and username. The whois list
shown by the server for the #solaris channel (very extensive)
includes quite a number of bots with the same format:
"inf" followed by 6 uppercase characters (577 users).
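
To double-check the nick format, the payload bytes from the dump above can be
decoded and matched against a pattern. This is a sketch only: the regular
expression is my reading of the "inf plus six uppercase letters" format, and
the helper names are made up for the example.

```python
import re

# Payload bytes copied from the hex dump above (TCP header bytes left out).
PAYLOAD_HEX = """
4e49 434b 2069 6e66
5342 4956 534b 0a55 5345 5220 5653 4b20
6c6f 6361 6c68 6f73 7420 6c6f 6361 6c68
6f73 7420 3a56 534b 0a
"""

# Assumed bot nick format: "inf" followed by exactly six uppercase letters.
BOT_NICK = re.compile(r"^inf[A-Z]{6}$")

def decode_hex(dump):
    """Turn a whitespace-separated hex dump into ASCII text."""
    return bytes.fromhex("".join(dump.split())).decode("ascii")

def nick_from(payload):
    """Return the argument of the first NICK command, or None."""
    for line in payload.splitlines():
        if line.startswith("NICK "):
            return line.split(" ", 1)[1]
    return None

payload = decode_hex(PAYLOAD_HEX)
nick = nick_from(payload)
print(repr(payload))
print(nick, bool(BOT_NICK.match(nick)))
```

The same pattern could be run over the nicks in the channel listing to count
how many users follow the bot naming scheme.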

Also interesting in the session is the fact that there is a
connection from the server to the client at 22:20:59 that looks 
like a file transfer of some sort. I have been, however, unable
to determine what it is exactly.

This might not look, at first, related to the questions posed in
this scan of the month, but I have tried to evaluate all possible
hypotheses.

Finally, the most relevant issue for the current scan is the UDP
scan that takes place almost at the end of the log file. There
are exactly 9 packets, one directed at each address in the
172.16.1.101-109 range. Note that there are no traces of other hosts
in any of the scans, so these seem to be all the hosts that the IDS
is looking at (so it's not really a /24 network after all).

The first packet is:

01:50:15.097431 213.68.213.135.5298 > 172.16.1.101.18030:  [no cksum] 
udp 5 (ttl 190, id 54038, len 33)
0x0000   4500 0021 d316 0000 be11 d173 d544 d587        E..!.......s.D..
0x0010   ac10 0165 14b2 466e 000d 0000 444f 4d02        ...e..Fn....DOM.
0x0020   00           

Some first thoughts:
- there is no UDP checksum being generated 
- the packet is really small

A search in google for similar packets being reported in mailing
lists does not retrieve any valuable information.

The other packets vary both in source IP address and destination
and source ports. It is worth noticing, however, that the source
IP is in the 213.68.213.130-144 range.  Full analysis of the UDP
packets has been left to the answers below.

A search for the IP range ("213.68.213") in google returns a pointer to
hackreport.magicnet.org/class3.php. It seems to be a report of
active bots (mirkforces) in any given subnet. Is this related?
Probably not, but it could be.

I also analysed the other connections to the
honeynet in order to determine whether one of them came from
the host that later sent the UDP scan. The UDP probe might be
a consequence of a previous probe not returning useful information.

Using a custom perl script (shown below) I analysed the tcpdump output in
order to determine the most active hosts. With this information I
was able to go through the tcpdump file manually and determine
that the following scans had taken place during the time of the log:

211.114.173.65 (03:29) - DNS (1 syn in burst, repeated save for R)
216.194.3.132 (05:57) - WWW (3 syn nmap?)
208.209.117.14 (08:22)  - WWW (only one host?)
211.251.211.65 (20:37) - sunrpc scan, then tries portmapper?
217.0.80.154 (00:22) - FTP scan
213.20.0.92 (11:47) - FTP scan
216.183.12.188 (12:07) - 27374 scan
202.98.223.74 (00:32) - sunrpc scan
208.187.189.25 (17:39) - sunrpc scan
211.114.173.65 (0:39) - dns scan
61.129.106.171 (22:26) - rpc scan
80.131.105 (21:02) - ping scan

I also analysed the MAC addresses, mostly in order to determine
if the UDP packets were being sent from the inside or the outside of the
honeynet network. I used yet another custom perl script to count the
number of MAC addresses. It was strange to see that there are only two
of them.

An external MAC in all packets from the outside: 8:0:20:f6:d3:58 
An internal MAC in all responses from the 172.16.1.0 subnet: 0:e0:1e:60:70:40

Checking the MAC address database at
www.dbs.ucdavis.edu/cgi-bin/mac_find we find that the external MAC
belongs to a Sun machine and the internal MAC to a Cisco machine.
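
The MAC addresses above are printed without leading zeros, so they need to be
normalised before comparing their vendor prefix (the first three octets) with
a table. A small sketch: the two table entries are just the results reported
above, not a real vendor database, and the function names are my own.

```python
# Vendor prefixes as reported by the lookup above; not a full database.
KNOWN_PREFIXES = {
    "08:00:20": "Sun",
    "00:e0:1e": "Cisco",
}

def normalize_mac(mac):
    """Zero-pad each octet, e.g. 8:0:20:f6:d3:58 -> 08:00:20:f6:d3:58."""
    octets = mac.lower().split(":")
    if len(octets) != 6:
        raise ValueError("not a MAC address: %r" % mac)
    return ":".join("%02x" % int(octet, 16) for octet in octets)

def vendor_of(mac):
    """Look up the vendor prefix (first three octets) of a MAC address."""
    return KNOWN_PREFIXES.get(normalize_mac(mac)[:8], "unknown")

for mac in ("8:0:20:f6:d3:58", "0:e0:1e:60:70:40"):
    print(normalize_mac(mac), vendor_of(mac))
```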

A possible layout is (wild guess):

   Internet --- Sun Solaris fw ---- Cisco router ---- 172.16.1.0/24 subnet
                                |
                            Snort probe

MAC addresses in this case are not really useful. It seems that the IDS
has probably been placed between two systems routing the packets.
In any case, the analysis confirms that the UDP probe packets are being sent
from the outside (they appear with the external MAC).

Now, after all this analysis I will try to answer the questions
posed for this scan.


-------------------------- ANSWERS TO QUESTIONS ----------------------------

QUESTION 1. What is the attacker attempting to achieve?

He is trying to determine which hosts are alive and probably also
to fingerprint the operating system that the probed hosts
run. There is exactly one UDP probe per system in the
172.16.1.101-172.16.1.109 range.

QUESTION 2. How does UDP work to achieve this purpose?

A host should send an ICMP Port Unreachable message when it receives a
UDP packet on a port that is not being served. If any of the hosts
probed were alive, they should send this message in answer to the
probe, and the attacker would be able to determine which systems are up.
However, these ICMP messages are usually blocked by the outgoing
rules of a firewall that protects the local network.
The fact that no ICMP messages are seen (neither for this scan
nor for the other ICMP scan in the log file) suggests that outgoing
ICMP packets are being blocked before they reach the IDS and are not being
sent to the outside.
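
The mechanism can be tried out with an ordinary socket. The following is a
sketch only, not the attacker's tool: it sends a single datagram from the
real source address (no spoofing) and relies on the operating system turning
a returned ICMP Port Unreachable into a "connection refused" error on a
connected UDP socket, which is the behaviour I would expect on Linux. The
function name and the return labels are made up for the example.

```python
import socket

def udp_probe(host, port, timeout=1.0):
    """Send one UDP datagram and classify what comes back.

    "closed"   - an ICMP Port Unreachable was reported by the OS
    "reply"    - the port answered with data
    "no-reply" - nothing came back (open but silent, filtered, or host down)
    """
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        sock.connect((host, port))
        try:
            sock.send(b"\x00")
            sock.recv(512)
            return "reply"
        except (ConnectionRefusedError, ConnectionResetError):
            return "closed"
        except socket.timeout:
            return "no-reply"

# Usage (only against hosts you are allowed to probe):
#   udp_probe("127.0.0.1", 18030)
```

Note that "no-reply" is ambiguous, which is exactly the situation seen in the
log: with outgoing ICMP filtered, a live host and a dead one look the same to
the prober.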

This fingerprinting technique is also discussed in Ofir Arkin's paper:
"Network scanning Techniques: Understanding how it is done"
(http://www.sys-security.com)


Open questions:
- Is the attacker using a custom tool to implement this?
 I have tested (briefly) xprobe
 (http://www.sourceforge.net/projects/xprobe/) and it doesn't seem to have
 the same footprint: it does not allow changes in ID and TTLs and packets
 do not share the same content.

QUESTION 3. Why is the attacker using random src and dst UDP ports and random
       IP addresses?

Obviously he is trying both to hide his real IP address and to avoid portscan
detectors (Snort, for example, is unable to determine that this is a real probe).
The fact that all ports (source and destination) are between 1348 and
19841 and that none seem to be well-known ports suggests that the dst
ports might not be all that random. Take into account that if the
probe is sent to a port on which the target system has a UDP server
listening, the attacker might not get the expected ICMP Port Unreachable
result.

Port analysis was done with:
$ tcpdump -nr 0215-snort.log |grep udp |grep ^01 | \
perl -pe 's/.*\.(\d+) > .*\.(\d+):.*/$1\n$2/' |sort -n > ports

I checked the resulting ports against a database of well known ports
(available at http://www.portsdb.org/, but used a local copy)

$ for i in `cat ports`; do grep $i /etc/services; done

Nothing.

$ for i in `cat ports`; do grep $i portsdb.dump; done

Found some interesting UDP ports:
1348 - multi media conferencing (bbn-mmx)
3327 - bbars (?)

That's 2 out of 18 so it doesn't look very significant. My hypothesis
is that the ports are being generated randomly from high port
ranges excluding well-known ports.
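
One caveat with the grep loops above is that a bare "grep 1348" also matches
longer numbers such as 13480. An exact lookup avoids that. The sketch below
parses text in /etc/services format into a table keyed on (port, protocol);
the sample text is illustrative only (the first two entries are the hits
reported above, the third is a made-up entry to show the substring problem),
and only the four port numbers that appear in this write-up are checked.

```python
def load_services(text):
    """Map (port, protocol) to service name from /etc/services-style text."""
    table = {}
    for line in text.splitlines():
        fields = line.split("#", 1)[0].split()
        if len(fields) < 2 or "/" not in fields[1]:
            continue
        port, proto = fields[1].split("/", 1)
        if port.isdigit():
            table.setdefault((int(port), proto), fields[0])
    return table

# Illustrative sample, not a copy of any real services file.
SAMPLE = """
bbn-mmx   1348/udp    # multi media conferencing
bbars     3327/udp
example   13480/udp   # hypothetical entry that "grep 1348" would also match
"""

services = load_services(SAMPLE)
for port in (1348, 3327, 5298, 18030):
    print(port, services.get((port, "udp"), "-"))
```

To run it against the real file, the text of /etc/services (or of the ports
database dump) would be passed to load_services instead of SAMPLE.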

It is worth noting, however, that all IP addresses are in the
213.68.213.130-213.68.213.140 range. This is significant for
a later question. RIPE tells us that the 213.68.212.0-213.68.213.255
range is managed by REPRO Computer Publishing GmbH (Germany).


QUESTION 4. Are all the packets originating from the same machine or different
       ones?

It is not possible to be totally sure about this since any information
available could have been forged by the attacker. If the packets are
analysed in-depth we can see that:
- they are using different identification numbers
- all packets have different TTL values
- all source addresses are different
- the 213.68.213 network seems to hold a lot of IRC bots ("mirkforces")
that could be used by an attacker to launch probes.

Is this conclusive? My hypothesis is that the IP packet information
can be forged. It is quite relevant that the IP addresses scanned
are consecutive (first 101, then 102, then 103...); this seems to
indicate that some kind of tool is being used to conduct the probe.
Is it running on a single system or is this a distributed probe?

In order to retrieve more information, we can check *when* the 
packets were detected by the probe:

$ tcpdump -nr 0215-snort.log |grep udp |grep ^01 | \
perl -pe 's/^.*?\.(\d+)\s.*$/$1/'
097431
173222
177698
178347
179336
180052
180834
181597
182350

They seem to be related. Checking further with a custom perl script
(source code below) we see that the last packets are roughly the same
(time) "distance" apart.

$ tcpdump -nr 0215-snort.log |grep udp |grep ^01 | perl time-analysis.pl
75791 (2: 097431 -> 173222)
4476 (3: 173222 -> 177698)
649 (4: 177698 -> 178347)
989 (5: 178347 -> 179336)
716 (6: 179336 -> 180052)
782 (7: 180052 -> 180834)
763 (8: 180834 -> 181597)
753 (9: 181597 -> 182350)

This similarity is suspicious. Coupled with the hypothesis discussed in
the answer to question 5, my opinion is that these packets are being
crafted by the same computer.
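
The gaps can be re-derived and summarised in a few lines. A sketch, assuming
(as the perl script does) that all nine packets fall within the same second,
so that only the microsecond fields listed above need to be compared:

```python
from statistics import mean, pstdev

# Microsecond fields of the nine probes, as listed above.
STAMPS = [97431, 173222, 177698, 178347, 179336, 180052, 180834, 181597, 182350]

def deltas(stamps):
    """Differences between consecutive timestamps."""
    return [later - earlier for earlier, later in zip(stamps, stamps[1:])]

gaps = deltas(STAMPS)
print(gaps)

# The first two gaps are much larger; summarise only the last six.
steady = gaps[2:]
print("mean %.1f us, spread %.1f us" % (mean(steady), pstdev(steady)))
```

A tight spread in the later gaps is what one would expect from a single
program sending packets in a loop, although it does not prove it.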

QUESTION 5. How can the attacker view the responses to his probes?

I worked with two hypotheses here:

a) he has compromised a host in the local network, 172.16.1.102, since the
IRC communications and the file transfer are suspicious and precede
the probe.

b) this is an external attack unrelated to any other activity in the
log file.

The MAC address analysis rules out a). I expected (before checking the MACs)
that the IDS was "listening" in the same broadcast domain as the
honeynet systems, which is in fact not the case. If this were the case,
an attacker might have used a compromised host to initiate the probe
inside the local network and would be able to "see" the responses without
revealing the compromised host to others.

Since b) seems to be the most probable hypothesis, my guess is that
the IP addresses are not all that random (they belong to the
same IP network) and that the attacker has some way to
"hear" the answers (if any) to the probe. There are
several ways to do this:

1.- the attacker has control of the element interconnecting the
213.68.213.0/24 network (or a subnet of it which includes the
hosts in the 213.68.213.130-213.68.213.140 range). Thus, he can
"see" the answers that flow back to those systems

2.- the attacker is in the same broadcast domain as the 
213.68.213.X systems and he is either in a non-switched network
or he has spoofed those addresses in the layer-2 elements.

QUESTION 6. Can the attacker fingerprint the OS of the victim systems?

Short answer: Not likely with this probe alone, but yes if the
attacker sends some more.

Long answer:
Yes, it might be possible to do so, since a UDP packet to a closed port should
trigger an ICMP Port Unreachable *but* the rate at which destination
unreachable messages are generated is only a suggestion that might
be implemented differently depending on the operating system
(RFC 1812, section 4.3.2.8). If the attacker were to send more than
a single probe per system it would be theoretically possible to fingerprint
which OS was answering.

This is discussed both in the nmap(1) manpage and in
Ofir Arkin's papers: "ICMP Usage in Scanning" and "X remote ICMP
based OS fingerprinting".

However, the ICMP Port Unreachable packets might provide some other
information, since Arkin's paper describes the following methods:
ICMP message quoting and analysis of the Type Of Service field value.


---------------------------- TOOLS USED -------------------------------------

Tools used:

For data analysis:
- tcpdump
- snort
- tcpflow
- file
- hexdump
- custom scripts (shown below)

For implementations of network scanning techniques in order to
determine the tool used to craft the UDP packets:
- nmap
- xprobe 

.................. most-active.pl ...............................

#!/usr/bin/perl
#
# Parse tcpdump file and say which hosts are more active
# Usage: tcpdump -nr file | perl most-active.pl |sort -n
while (<STDIN>) {
	if ( /^\d{2}:\d{2}:\d{2}\.\d+\s(\d+\.\d+\.\d+\.\d+)\.\d+\s>\s(\d+\.\d+\.\d+\.\d+)\.\d+:/ ) {
		$activity{$1}++;
		$activity{$2}++;
	}
}

foreach $host ( sort { $activity{$b} <=> $activity{$a} }  keys %activity) {
	print "$activity{$host} packets for $host\n";
}
exit 0;

.................. count-mac.pl ...............................
#!/usr/bin/perl
#
# Parse tcpdump file and say which MAC addresses are being used
# Usage: tcpdump -enr file | perl count-mac.pl |sort -n
while (<STDIN>) {
	if ( /^\d{2}:\d{2}:\d{2}\.\d+\s([\w:]+)\s([\w:]+)[\s\d]+:\s(\d+\.\d+\.\d+\.\d+)[\.\d]*\s>\s(\d+\.\d+\.\d+\.\d+)[\.\d]*:/ ) {
		$mac{$1}++;
		$mac{$2}++;
	}
}

foreach $addr ( sort { $mac{$b} <=> $mac{$a} }  keys %mac) {
	print "$mac{$addr} count for $addr\n";
}


exit 0;

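.................. time-analysis.pl ...............................
#!/usr/bin/perl
#
# Print the difference between the microsecond part of each timestamp
# and the previous one.
# Usage: tcpdump -nr file | perl time-analysis.pl
# Note: only the microsecond field is compared, so the result is only
# meaningful for packets captured within the same second.
$number=0;
while (<STDIN>) {
	if (/^.*?\.(\d+)\s.*$/) {
		$number++;
		$time=$1;
		# 'defined' rather than truth, so a 000000 timestamp is not skipped
		print $time-$prevtime." ($number: $prevtime -> $time)\n" if defined $prevtime;
		$prevtime=$time;
	}
}
exit 0;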