Analysis provided by Jeffery Stutzman, Cisco Corporate Information Security
The Honeynet Project
The Challenge:
In November 2000,
the Honeynet Project collected every probe, attack, and exploit launched against
the Honeynet. The challenge is to analyze this month's worth of data and determine
the blackhat's tools, tactics, and motives.
For this exercise, I chose to use the Snort logs to determine what early warning signs might tip us off to an impending attack. Instead of compiling all of the logs together, I chose to use a methodology known as Statistical Process Control (SPC). SPC is very commonly used in the manufacturing world to measure defects in products on the factory floor. Very simply, SPC looks at each process individually and compares it to the aggregate by watching for defects and setting control limits that determine how many defects are normal (or can be tolerated). By looking at each rule reported by Snort individually, and comparing the rules to each other, we can easily identify trends that might indicate a future attack. Here's how it works.
Initial Parsing
The first step in the IDS analysis was simply to
parse the data. I first parsed by fixed segments at the Month,
Day, Year, and after the Snort IDS rule. I then performed "text to
columns" (in Excel) to parse by ":", then "-", and removed all
miscellaneous characters. You can find the results as a tab-delimited
text file, called
Total reports by day
Once all of the Snort data was parsed, I counted the number of occurrences
of each Snort rule reported per day. By counting
only the number of Snort alerts, I was able to both maintain an apples
to apples count and maintain sanity. The first thing I like
to look at (at least with Honeynet) is the RPC activity. RPC attacks
have been very popular in the past, and the numbers have
reflected that activity in our intrusions. As you look at the table
below, you'll see some unfamiliar terms. Let me
explain. Before I go much further, though, I must point out that I'm
no statistician, so please spare me the flames.
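As a rough illustration of the per-day counting described above (the original work was done in Excel, not in code), here is a minimal Python sketch. It assumes a tab-delimited file whose first two columns are the day and the Snort rule name; that column layout, the function name count_alerts_per_day, and the sample rows are all hypothetical and are not taken from the Honeynet data.

```python
import csv
import io
from collections import Counter


def count_alerts_per_day(tsv_text):
    """Count alerts per (rule, day) from tab-delimited rows of: day, rule.

    The two-column layout is an assumption made for this sketch.
    """
    counts = Counter()
    reader = csv.reader(io.StringIO(tsv_text), delimiter="\t")
    for row in reader:
        if len(row) < 2:
            continue  # skip blank or malformed rows
        day, rule = row[0].strip(), row[1].strip()
        counts[(rule, day)] += 1
    return counts


# Made-up rows for illustration only; not real Honeynet alerts.
sample = "2000-11-06\tRULE-A\n2000-11-06\tRULE-A\n\n2000-11-07\tRULE-B\n"
daily = count_alerts_per_day(sample)
for (rule, day), n in sorted(daily.items()):
    print(rule, day, n)
```

Counting alerts rather than packets or bytes keeps every rule on the same scale, which is what makes the later rule-to-rule comparison possible.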
Here's how it works: Look first at the portion of the table labeled “RPC”. The Snort alert rules appear first, followed by something called 3DMA. 3DMA means 3-day moving average. I used the current day and the two preceding days to create a moving average to help identify trends. In the next portion of the analysis those trends will become readily apparent. Next, at the end of the table you’ll see a column labeled “UCL”. UCL stands for upper control limit. I derived the UCL by taking the standard deviation of the 30-day sample and multiplying it by two. The UCL gives us a marker, if you will. If the 3DMA goes above the UCL, we should begin paying attention to that indicator. Also, if the 3DMA increases for 3 or more days, we should also pay attention. In statistics, this is called a run. So, the results of the data analysis can be found in the file
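The 3DMA, UCL, and run rules described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the spreadsheet used for the analysis: the text does not say whether the sample or population standard deviation was used, so the sketch assumes the sample form (statistics.stdev); the function names are hypothetical; and the 30 daily counts are made up.

```python
from statistics import stdev


def moving_average_3(counts):
    """3DMA: mean of the current day and the two preceding days.

    The first two days have no full window, so they are None.
    """
    return [None if i < 2 else sum(counts[i - 2:i + 1]) / 3
            for i in range(len(counts))]


def upper_control_limit(counts):
    """UCL as described in the text: two times the standard deviation.

    Assumption: sample standard deviation (n - 1 in the denominator).
    """
    return 2 * stdev(counts)


def warning_days(counts, run_length=3):
    """Return (above, runs) as lists of 0-based day indexes.

    above: days where the 3DMA exceeds the UCL.
    runs:  days where the 3DMA has risen for run_length or more days in a row.
    """
    ma = moving_average_3(counts)
    ucl = upper_control_limit(counts)
    above = [i for i, v in enumerate(ma) if v is not None and v > ucl]
    runs = []
    rising = 0
    for i in range(1, len(ma)):
        if ma[i - 1] is not None and ma[i] > ma[i - 1]:
            rising += 1
        else:
            rising = 0
        if rising >= run_length:
            runs.append(i)
    return above, runs


# Made-up daily counts for one rule over 30 days (illustration only).
counts = [0] * 30
counts[3:8] = [1, 2, 4, 12, 3]
above, runs = warning_days(counts)
print("days above UCL:", [i + 1 for i in above])
print("days in a run: ", [i + 1 for i in runs])
```

Note that the sketch follows the text literally and sets the limit at two standard deviations, whereas a conventional Shewhart control chart places its limits at the mean plus or minus three standard deviations. In the made-up series the run rule fires a day before the 3DMA crosses the UCL, which is the kind of early signal the two rules are meant to give together.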
A picture speaks a thousand words.
The next step in the process is to put everything together in a graphic.
Why a graphic? A picture speaks a thousand words.
Graphics make it so simple even my boss can understand it. I
can take the pictures and show them without explaining
beyond the fact that this is a statistical process, and the numbers
showed us a warning. Here we go.
The first thing I like to look at is the port scanning activity. So,
I take the port scanning, calculate the 3DMA and UCL, and then
plot them out. Figure 1 is a graph of the port scanning activity. Notice
the two runs from days 3-7 and 27-30.
Next, I plotted the rest of the RPC-related activity individually, again with a 3DMA and UCL. Looking at the RPC data together with the port scans shown above, it becomes pretty clear that an RPC-related attack is coming. I can say this with relative certainty because I know that Red Hat 6.2 servers were placed in service on the 4th and 25th of November, and SPARC 2.6 boxes were placed in action on the 5th and 25th. Now, the blackhat knows there are systems online, and will likely know what type they are. Next he’ll try to identify open ports, someplace he knows how to hack. In this case, I’m looking at port 111, RPC. SYN-FIN scanning has become a popular means of identifying open ports. The graphic of SYN-FIN scanning to port 111 is shown in Figure 2.
Again, I find it curious that there is a large amount of activity on the 6th. This amount is well above the UCL, and should cause concern. So, let's continue the quest for other indicators. However, it's pretty safe to say at this point that an attack on port 111 is imminent. Looking at the scans and then at the activity on port 111, the trends match. The next Snort rule I looked at was Portmap status queries, shown in Figure 3. Again, activity was noted in the form of one Snort report on day 4 and another on the 7th. My guess is that someone is checking to see if the port is in fact alive. The UCL on this graphic was 0.4. So, now we have three charts with out-of-bounds activity from the 3rd through the 7th of November. So, I’m going to take the hard road and make the call that we will very likely see an RPC attack around the 7th or 8th, and will likely see something else around the 30th (however, this is still just a WAG; the guess is based only on the scanning noted in Figure 1, and no further information).
Putting it all together
Convinced? Not yet? OK, well, doing post-attack early warning analysis is like doing a crossword puzzle with the answers in the back of the book. Rest assured, the process works. Figure 4, a trace of all of the noted activity associated with RPC, brings it all together. For simplicity, I’ve left off the 3DMA and UCL lines. The one interesting thing about this graphic is that it shows very clearly that the activity across all rule sets reported by Snort correlates exactly on the 6th. Each of the RPC rules reported on the 6th peaked on the graph. On the 7th, the Red Hat 6.2 box was compromised using an rpc.statd vulnerability, and a backdoor was installed. Interestingly enough, on the 26th, another RH 6.2 box was placed in service. On the 27th, we noted an immediate increase in scanning on the Honeynet, followed again by a compromise on the 30th, again through rpc.statd, with a backdoor installed. On a side note, I found it interesting that a Windows 98 box was placed in service on October 30th, and was compromised on the first of November by a worm. The attacks from various worms battling each other for control over the box lasted four days.
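The "everything peaks on the same day" observation can also be checked mechanically. The sketch below is a hypothetical illustration, not part of the original analysis: the function names, the rule labels, and the daily counts are all made up, and it simply asks whether every rule's busiest day is the same day.

```python
def peak_day(counts):
    """Return the 1-based day with the highest count (first one if tied)."""
    return counts.index(max(counts)) + 1


def common_peak(series_by_rule):
    """Return (day, peaks): day is the shared peak day, or None if rules differ."""
    peaks = {rule: peak_day(counts) for rule, counts in series_by_rule.items()}
    days = set(peaks.values())
    shared = days.pop() if len(days) == 1 else None
    return shared, peaks


# Made-up daily counts for three placeholder rules (illustration only).
series = {
    "rule_a": [0, 1, 0, 2, 3, 9, 1, 0],
    "rule_b": [0, 0, 0, 1, 1, 4, 0, 0],
    "rule_c": [1, 0, 0, 0, 2, 6, 2, 0],
}
shared, peaks = common_peak(series)
print("shared peak day:", shared)
print("per-rule peaks: ", peaks)
```

A shared peak across otherwise independent rules is a stronger signal than a spike in any single rule, which is the point Figure 4 makes visually.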
In this exercise, I used only the Snort logs for the analysis. Firewall
logs can be analyzed the same way, but with less detail. The
one lesson learned about using Snort at a Honeynet is that there are
no false positives. As a result, we get a pretty clear picture
of what happened before and after each attack. In this case, we had
about 4 days notice that something was coming. As the
days progressed, the picture became clear that someone was going to
hack port 111. We had 4 days to ensure our patches
were up to date, and the box was as secure as we could make it. At
Honeynet we really wanted to find out what would
happen, so the box remained default load, but in real world applications,
wouldn't a couple of days notice be nice? As a
qualification, we realize the Honeynet is a VERY small sample (nano
sample?), and that in the real world we're talking about
gigabytes of information per day. Also, I only tested this month using
RPC data, but the
Know Your Enemy: Statistics paper pulls the top 10 Snort rules
reported at Honeynet and examines them as well. At a nano level, the
process works. However, bear in mind that at this point this
process is considered to be a proof of concept only, and is still under
testing by Honeynet members on enterprise-wide data.
Please feel free to add to the analysis; we would love to hear from you.
Comments are always welcome.
Take care,
Jeff Stutzman
The Honeynet Project