As the IT market grows, organisations are deploying more security solutions to guard against the ever-widening threat landscape. All those devices generate copious amounts of audit records and alerts, and many organisations are setting up repeatable log collection and analysis processes.
However, when planning and implementing log collection and analysis, these organisations often discover that they aren't realising the full promise of such a system. This happens due to some common log-analysis mistakes.
This article covers the typical mistakes organisations make when analysing audit logs and other security-related records produced by security infrastructure components.
1: Not looking at the logs
Let's start with an obvious but critical one. While collecting and storing logs is important, it's only a means to an end - knowing what's going on in your environment and responding to it. Thus, once the technology is in place and logs are being collected, there needs to be a process of ongoing monitoring and review that hooks into actions and possible escalation.
It's worthwhile to note that some organisations take a half-step in the right direction: They review logs only after a major incident. This gives them the reactive benefit of log analysis but fails to realise the proactive one - knowing when bad stuff is about to happen.
Looking at logs proactively helps organisations better realise the value of their security infrastructures. For example, many complain that their network intrusion detection systems (IDS) don't give them their money's worth. A big reason is that such systems often produce false alarms, which makes their output less reliable to act upon. However, correlating IDS logs with other records such as firewall logs and server audit trails, as well as vulnerability and network service information about the target, allows companies to "make IDS perform" and gain new detection capabilities.
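The correlation idea above can be sketched in a few lines. This is a minimal illustration, not a real IDS pipeline: the field names (`src_ip`, `sig`, `action`) and the one-minute window are assumptions for the example.

```python
from datetime import datetime, timedelta

def correlate(ids_alerts, fw_events, window_s=60):
    """Flag IDS alerts whose source IP also appears in a firewall
    event within a short time window - a crude corroboration step
    that raises confidence that an alert is worth acting on."""
    corroborated = []
    for alert in ids_alerts:
        for ev in fw_events:
            same_ip = alert["src_ip"] == ev["src_ip"]
            close = abs((alert["time"] - ev["time"]).total_seconds()) <= window_s
            if same_ip and close:
                corroborated.append((alert, ev))
                break
    return corroborated

# Illustrative records with assumed field names
t = datetime(2005, 1, 15, 12, 0, 0)
ids = [{"src_ip": "10.0.0.5", "sig": "SQL injection attempt", "time": t}]
fw = [{"src_ip": "10.0.0.5", "action": "accept", "time": t + timedelta(seconds=30)}]
hits = correlate(ids, fw)
```

A real deployment would also fold in vulnerability data about the target - an alert against a host that is actually vulnerable to the attack deserves far more attention than one against a host that isn't.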
Some organisations also have to look at log files and audit tracks due to regulatory pressure.
2: Storing logs for too short a time
A short retention period makes the security team think they have all the logs needed for monitoring and investigation (while saving money on storage hardware) - until the horrible realisation, after an incident, that all the relevant logs are gone, deleted under the retention policy. Incidents are often discovered a long time after the crime or abuse has been committed.
If cost is critical, the solution is to split the retention into two parts: short-term online storage and long-term offline storage. For example, archiving old logs on tape allows for cost-effective offline storage, while still enabling future analysis.
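The two-tier retention split can be as simple as a scheduled job that sweeps aged files off online storage. The sketch below moves files past an assumed 90-day cutoff to an archive directory (which stands in for the tape library); the policy value and file layout are illustrative.

```python
import os
import shutil
import tempfile
import time

ONLINE_DAYS = 90  # assumed retention split point, not a standard


def archive_old_logs(online_dir, archive_dir, max_age_days=ONLINE_DAYS):
    """Move log files older than max_age_days from online storage
    to a cheaper archive location, and return the file names moved."""
    cutoff = time.time() - max_age_days * 86400
    os.makedirs(archive_dir, exist_ok=True)
    moved = []
    for name in sorted(os.listdir(online_dir)):
        path = os.path.join(online_dir, name)
        if os.path.isfile(path) and os.path.getmtime(path) < cutoff:
            shutil.move(path, os.path.join(archive_dir, name))
            moved.append(name)
    return moved


# Demo: one stale file (120 days old) and one fresh file
online = tempfile.mkdtemp()
archive = tempfile.mkdtemp()
stale = os.path.join(online, "2004-10-01.log")
open(stale, "w").close()
os.utime(stale, (time.time() - 120 * 86400,) * 2)
open(os.path.join(online, "today.log"), "w").close()
moved = archive_old_logs(online, archive)
```

The key property is that nothing is deleted: aged logs merely move to slower, cheaper media, so they remain available for an investigation that starts months after the fact.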
3: Not normalising logs
Normalisation means converting logs into a universal format that contains all the details of the original message but also allows us to compare and correlate different data sources, such as UNIX and Windows logs. Across different applications and security solutions, log format confusion reigns: some prefer SNMP, others favour classic UNIX syslog. Proprietary formats are also common.
Lack of a standard logging format leads to companies needing different expertise to analyse the logs. Not all skilled UNIX administrators who understand syslog format will be able to make sense out of an obscure Windows event log record, and vice versa.
The situation is even worse with security systems, because people commonly have experience with a limited number of systems and thus will be lost in the log pile spewed out by other devices. As a result, a common format that can encompass all the possible messages from security-related devices is essential for analysis, correlation and, ultimately, for decision making.
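To make the idea concrete, here is a toy normaliser that maps a syslog line and a Windows-event-style record onto one common schema. The schema, the field names, and the Windows record keys are assumptions for illustration; a production system would cover far more formats and edge cases.

```python
import re
from datetime import datetime

# Assumed common schema: every event ends up with the same four keys
FIELDS = ("timestamp", "host", "source", "message")

SYSLOG_RE = re.compile(
    r"^(?P<ts>\w{3}\s+\d+\s[\d:]{8})\s(?P<host>\S+)\s(?P<msg>.*)$")


def normalise_syslog(line, year=2005):
    """Parse a classic BSD syslog line (which omits the year)."""
    m = SYSLOG_RE.match(line)
    if not m:
        return None
    ts = datetime.strptime(f"{year} {m.group('ts')}", "%Y %b %d %H:%M:%S")
    return {"timestamp": ts.isoformat(), "host": m.group("host"),
            "source": "syslog", "message": m.group("msg")}


def normalise_windows(event):
    """Map a parsed Windows event record (assumed keys) to the schema."""
    return {"timestamp": event["TimeGenerated"], "host": event["ComputerName"],
            "source": "windows", "message": event["Message"]}


line = "Jan 15 04:23:17 gateway sshd[421]: Failed password for root"
win = {"TimeGenerated": "2005-01-15T04:25:02", "ComputerName": "DC1",
       "Message": "Logon failure: unknown user name or bad password"}
events = [normalise_syslog(line), normalise_windows(win)]
```

Once both records share the same keys, correlation queries ("show all failed logons across every platform in the last hour") become a single operation instead of one per format.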
4: Not prioritising log records
Assuming that logs are collected, stored for a sufficiently long time and normalised, what else lurks in the muddy sea of log analysis? The fourth error is not prioritising log records. The logs are there, but where do we start? Should we go for a high-level summary, look at the most recent events, or something else? Some analysts get overwhelmed and give up after trying to chew a king-size chunk of log data without any real sense of priority.
Thus, effective prioritisation starts with defining a strategy. Answering questions such as "What do we care about most?" "Has this attack succeeded?" and "Has this ever happened before?" helps to formulate it. Use these questions to get you started on a prioritisation strategy that will ease the burden of gigabytes of log data, collected every day.
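Those questions translate naturally into a scoring function. The weights below are purely illustrative assumptions, not a standard - the point is that any explicit scoring beats reading logs in arrival order.

```python
# Hypothetical weights answering "What do we care about most?"
ASSET_WEIGHT = {"dmz-web": 5, "db-server": 10, "desktop": 1}
SEVERITY = {"info": 1, "warning": 3, "alert": 7, "critical": 10}


def priority(record):
    """Score a normalised record so analysts can sort the day's
    logs and start from the top instead of the beginning."""
    score = (SEVERITY.get(record["severity"], 1)
             * ASSET_WEIGHT.get(record["asset"], 1))
    if record.get("attack_succeeded"):  # "Has this attack succeeded?"
        score *= 2
    if record.get("first_seen"):        # "Has this ever happened before?"
        score += 5
    return score


records = [
    {"asset": "desktop", "severity": "warning"},
    {"asset": "db-server", "severity": "alert", "attack_succeeded": True},
]
records.sort(key=priority, reverse=True)
```

With this in place, a successful alert against the database server sorts far above routine desktop warnings, which is exactly where an analyst's first hour should go.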
5: Looking for only the bad stuff
Even the most advanced and security-conscious organisations can sometimes get tripped up by this pitfall. It's sneaky and insidious, and can severely reduce the value of a log-analysis project. It occurs when an organisation is only looking at what it knows is bad.
Indeed, a vast majority of open source tools and some commercial ones are set up to filter and look for bad log lines, attack signatures and critical events, among other things. For example, Swatch is a classic free log analysis tool that's powerful, but only at one thing - looking for defined bad things in log files.
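A Swatch-style "known bad" filter boils down to a list of patterns matched against each line. The patterns below are illustrative, not taken from Swatch's own configuration.

```python
import re

# Illustrative "known bad" patterns - the kind a Swatch config encodes
BAD_PATTERNS = [
    re.compile(r"Failed password"),
    re.compile(r"Invalid user"),
    re.compile(r"segfault"),
]


def known_bad(lines):
    """Return only the lines matching a predefined bad pattern -
    everything else, including novel attacks, is silently dropped."""
    return [ln for ln in lines if any(p.search(ln) for p in BAD_PATTERNS)]


log = [
    "sshd[421]: Failed password for root from 10.0.0.5",
    "sshd[422]: Accepted password for alice",
]
```

The limitation is visible in the code itself: anything not on the list never reaches the analyst, which is precisely the pitfall this section describes.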
However, to fully realise the value of log data, it needs to be taken to the next level - to log mining. In this step, you can discover things of interest in log files without having any preconceived notion of what you need to find. Some examples include compromised or infected systems, novel attacks, insider abuse and intellectual property theft.
The problem sounds obvious: how can we be sure we know all the possible malicious behaviour in advance? One option is to list all the known good things and then look at everything else. That sounds like a solution, but such a task is both onerous and thankless. It's usually even harder to enumerate all the good things that might happen on a system or network than to list the bad ones. So many different events occur that weeding out attack traces just by listing all the possibilities is ineffective.
A more intelligent approach is needed. Some of the data mining (also called "knowledge discovery in databases") and visualisation methods actually work on log data with great success. They allow organisations to look for real anomalies in log data, beyond "known bad" and "not known good."
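One of the simplest anomaly-oriented techniques is frequency analysis: count message types and surface the rare ones, with no pattern list at all. This is a toy stand-in for real data mining; the 5% threshold is an arbitrary assumption for the example.

```python
from collections import Counter


def rare_events(messages, threshold=0.05):
    """Flag message types whose relative frequency falls below the
    threshold - no list of 'known bad' or 'known good' is needed,
    we simply ask what is unusual in this log."""
    counts = Counter(messages)
    total = len(messages)
    return [msg for msg, n in counts.items() if n / total < threshold]


# 99 routine messages and one oddity the pattern lists would miss
log = (["connection accepted"] * 99
       + ["kernel: eth0 entered promiscuous mode"])
unusual = rare_events(log)
```

The promiscuous-mode message surfaces not because anyone wrote a rule for it, but because it is statistically out of place - the essence of looking beyond "known bad" and "not known good."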
Avoiding these mistakes will take your log-analysis program to the next level and enhance the value of your company's security and logging infrastructures.
Anton Chuvakin is a security strategist at security information management company netForensics. His areas of expertise include intrusion detection, UNIX security, forensics and honeypots, and in his spare time he maintains his own security portal.