Sunday, January 24, 2010

Artificial Ignorance - Elementary my dear Watson

Marcus Ranum applied the term “artificial ignorance” to the process of monitoring log files.  You build a filter of events to ignore, and then look at everything else.  All the items that you consider normal get filtered out, so whatever is left deserves attention.  This reminds me of the Sherlock Holmes maxim written by Arthur Conan Doyle: “When you have eliminated the impossible, whatever remains, however improbable, must be the truth.”  It’s impossible to identify in advance all the potential attacks that might be launched against a web site.  New attacks surface every day.  While part of the strategy should be to look for specific types of attacks, an equally important part is to look for the unusual.
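As a rough illustration of the idea, here is a minimal Python sketch of an ignore filter.  It assumes the web server writes access logs in the common Apache-style format; the two patterns and the name unusual_lines are made up for this example and would need to be replaced with whatever counts as normal on your own site.

```python
import re

# Hypothetical patterns describing events we consider normal.
# Each one is matched against the raw log line.
IGNORE_PATTERNS = [
    re.compile(r'"GET /favicon\.ico HTTP/1\.[01]" 404 '),
    re.compile(r'"GET /index\.html HTTP/1\.[01]" 200 '),
]


def unusual_lines(lines, ignore_patterns=IGNORE_PATTERNS):
    """Yield only the log lines that match none of the ignore patterns."""
    for line in lines:
        if not any(pattern.search(line) for pattern in ignore_patterns):
            yield line
```

Running it over an open log file, for example `for line in unusual_lines(open("access.log")): print(line, end="")`, prints only what the filter does not already know about.  The list of patterns tends to grow over time as you decide that more events are normal.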

One of the web sites we support is frequently bookmarked by end users.  When a user bookmarks a site, the browser sends a request to retrieve the favicon.ico file from the web server.  This is the image that will be used for the bookmark.  We currently do not have a favicon.ico file on the web site, so a 404 Not Found error is returned to the browser.  The user never sees this error; the browser simply bookmarks the site using a generic image.  Web site reports always show favicon.ico as the top Not Found document.  In our situation, this is normal.  We considered modifying the response for this request to eliminate the error, but the 404 error actually serves a purpose.  When we review the web site reports, we always expect to see this file with the most 404 errors.  If this file is not at the top, then some other document is being requested and not found unusually often.  This is a clue to do more investigation.  It may mean that a popular document has been removed, either by mistake or in an attempt to deface the site.
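This check is easy to automate.  The sketch below is only an illustration, not the report we actually use: it assumes Apache-style access log lines, and the names top_not_found and favicon_still_on_top are invented for the example.

```python
import re
from collections import Counter

# Assumes the request, status and byte count appear as:
#   "METHOD PATH PROTOCOL" STATUS BYTES
REQUEST_RE = re.compile(
    r'"[A-Z]+ (?P<path>\S+) [^"]*" (?P<status>\d{3}) (?P<size>\d+|-)'
)


def top_not_found(lines, n=5):
    """Return the n most frequently requested paths that got a 404."""
    counts = Counter()
    for line in lines:
        match = REQUEST_RE.search(line)
        if match and match.group("status") == "404":
            counts[match.group("path")] += 1
    return counts.most_common(n)


def favicon_still_on_top(lines, expected="/favicon.ico"):
    """True if the expected path has the most 404s; False otherwise,
    including when the log contains no 404s at all."""
    top = top_not_found(lines, n=1)
    return bool(top) and top[0][0] == expected
```

A False result does not say what went wrong.  It only says that the usual pattern has changed and that the 404 list is worth reading by hand.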

We keep a close watch on HTTP 500 Internal Server Error responses because they always indicate some problem on the site.  Unfortunately, some of the products that we run trap errors and return a 200 OK status to the browser, so the logs do not indicate any error.  The message on the screen back to the user might say something like “fatal error”, but the logs don’t show this.  One way to look for these situations is to inspect the number of bytes sent.  The number of bytes for a good result frequently falls in one range, while the fatal errors tend to fall in a different range.  This has helped us spot problems when none of the other logs contained any errors.
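Here is a sketch of how the byte-count check might look in Python.  The path /search and the range of 4,000 to 60,000 bytes are invented for the example; the real ranges have to come from watching your own logs.  It again assumes Apache-style access log lines.

```python
import re

LOG_RE = re.compile(
    r'"[A-Z]+ (?P<path>\S+) [^"]*" (?P<status>\d{3}) (?P<size>\d+|-)'
)

# Hypothetical (low, high) byte ranges seen for healthy responses, per path.
EXPECTED_BYTES = {"/search": (4000, 60000)}


def suspicious_successes(lines, expected=EXPECTED_BYTES):
    """Yield 200 responses whose size falls outside the expected range."""
    for line in lines:
        match = LOG_RE.search(line)
        if not match or match.group("status") != "200":
            continue
        # Ignore the query string when looking up the path.
        path = match.group("path").split("?", 1)[0]
        bounds = expected.get(path)
        if bounds is None:
            continue  # no baseline recorded for this path
        size = 0 if match.group("size") == "-" else int(match.group("size"))
        low, high = bounds
        if not low <= size <= high:
            yield line
```

A response that is far smaller than usual is often an error page dressed up as a success.  This only works for pages whose normal size is reasonably stable, so it is a hint rather than proof.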

It’s important to look at other errors even if the volume is very low.  If you see requests for technology that you do not support or types of content that you do not serve, then these are red flags.  In our case, we do not use PHP on the site.  A request for a PHP file is a strong indication of some unusual condition.  We block these requests and return an error code.  Even though you may see only one or two requests that fall in this category, it is important to investigate them further because they may mean that someone is probing the site.  As Holmes said, “You know my method.  It is founded upon the observation of trifles.”  Use this method when setting up your log analysis.
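A small Python sketch of this kind of check follows.  It assumes Apache-style access log lines that begin with the client address.  The name probe_requests is invented for the example, and the set of extensions should list whatever your own site does not serve.

```python
import posixpath
import re
from urllib.parse import urlsplit

# Client address at the start of the line, then the quoted request.
REQ_RE = re.compile(r'(?P<host>\S+) .*?"[A-Z]+ (?P<target>\S+) [^"]*" \d{3}')

# Extensions for technology this site does not run (PHP in our case).
UNSUPPORTED_EXTENSIONS = {".php"}


def probe_requests(lines, unsupported=UNSUPPORTED_EXTENSIONS):
    """Return (client, path) pairs for requests to unsupported file types."""
    hits = []
    for line in lines:
        match = REQ_RE.match(line)
        if not match:
            continue
        path = urlsplit(match.group("target")).path  # drop any query string
        extension = posixpath.splitext(path)[1].lower()
        if extension in unsupported:
            hits.append((match.group("host"), path))
    return hits
```

Even a single hit is worth a look.  Grouping the results by client address shows whether one visitor is working through a list of well-known file names.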
