The SIEM is dead

…Long live the SIEM

I accept that what I’m about to say may be controversial, particularly amongst SIEM vendors, but having designed and built one of the largest SIEMs in the world at Barclays I feel that I have a certain leeway.

So let’s start with a very quick overview of what a SIEM is. SIEM stands for Security Information and Event Management, which really doesn’t give much away. A SIEM is supposed to collect real-time events from security and other systems to enable a security operations team to react to those events and stop bad things happening. That’s the Event part of the SIEM. The Information part is about providing dashboards and analysis for management. Pretty simple, eh, and it all sounds useful, so what is the problem?

Why is the SIEM dead?

There are several problems with the traditional SIEM and I’ll address them one at a time.

Firstly, data normalisation. All SIEMs are designed in pretty much the same way. They have a collection tier which receives events from target systems and sends them on to the aggregation tier. This tier tends to require adapters to normalise the data, and SIEM vendors provide a very wide variety of standard adapters or data collectors to be able to understand the data; and herein lies the first problem. The great thing about standards is that there are so many of them. Even simple things like web logs are variable and may be changed by web server add-ins and modules. My experience is that these standard adapters almost always need to be adapted themselves, and that will cost time and money however you do it. The other problem with the normalisation process is that it generally means throwing away some information as you squeeze the data to fit your schema. My experience with these data adapters is also that they can have a disproportionate impact on overall SIEM performance if they are badly written. Poorly constructed regular expressions can bring your SIEM to its knees, and changes to data formats can leave you struggling to catch up and virtually blind for extended periods of time.
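To make the "throwing away information" point concrete, here is a minimal sketch in Python. It is not taken from any real SIEM product: the schema fields, the `normalise` function and the sample log line (including the `upstream_ms` field that a web server module might add) are all assumptions made up for illustration.

```python
import re

# Hypothetical fixed schema; the field names are illustrative only.
SCHEMA_FIELDS = ("src_ip", "timestamp", "method", "url", "status")

# Matches the start of a Common Log Format style line.
# Anything after the response size is simply ignored by the adapter.
CLF = re.compile(
    r'(?P<src_ip>\S+) \S+ \S+ \[(?P<timestamp>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<url>\S+) [^"]*" (?P<status>\d{3}) \S+'
)

def normalise(line):
    """Squeeze a raw web log line into the fixed schema, or return None."""
    match = CLF.match(line)
    if match is None:
        return None  # a format change leaves the adapter blind
    return {field: match.group(field) for field in SCHEMA_FIELDS}

# Sample line with extra fields (referrer, user agent, a made-up module field).
line = ('203.0.113.7 - - [10/Oct/2012:13:55:36 +0000] '
        '"GET /login HTTP/1.1" 200 2326 '
        '"http://example.com/" "Mozilla/5.0" upstream_ms=412')

event = normalise(line)
print(event)
```

The referrer, user agent and `upstream_ms` value never reach the schema, and a line in an unexpected format comes back as `None`, which is the "virtually blind" case.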

The next big problem we face is scalability. The data aggregation tier for these systems tends to be a large traditional RDBMS (a relational database, for those unfamiliar with the term). That sounds fine but it is the biggest weakness of the SIEM. When you are trying to do millions or even billions of inserts per day you soon find that even the fastest databases can’t cope. The architecture of traditional SIEMs means this is always a bottleneck, as you can’t simply create an additional database to halve the load. Most of your support and admin time is spent tending this gargantuan database, and as it fills up things tend to get worse. SIEMs try to lessen this problem by writing the event data to a flat file system and storing only metadata and summary data for reports in the RDBMS. More on this later.
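A quick back-of-the-envelope calculation shows what "billions of inserts per day" means for a single database. The figures below are illustrative arithmetic only, and the peak factor of 3 is my assumption rather than a measured number.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400

def inserts_per_second(events_per_day, peak_factor=1.0):
    """Average insert rate, scaled by an assumed peak-to-average ratio."""
    return events_per_day / SECONDS_PER_DAY * peak_factor

# One billion events per day, spread perfectly evenly.
average = inserts_per_second(1_000_000_000)

# Same volume, assuming busy periods run at three times the average.
busy = inserts_per_second(1_000_000_000, peak_factor=3.0)

print(f"average: {average:,.0f}/s, assumed peak: {busy:,.0f}/s")
```

That is a sustained rate of more than eleven thousand inserts every second, before you allow for indexes, summary jobs or anyone running a query.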

So the first two big problems are about getting data in. What about getting it out again? There tend to be two ways we want the data out: in alerts for the security operations team, and in reports and analytics for management and other teams. SIEMs should be really good at the first because that is what they were designed to do, but surprisingly they are pretty poor at this too. Alerts tend to be processed in real time, bypassing the database. That is a smart idea but it is limiting. To be a really effective security operations team you want two things: very good alerts with few or no false positives, and lots of context information. SIEMs really fall down on the first because their ability to analyse the incoming data is poor. Correlating events is rudimentary at best, as a long gap between related events causes the SIEM big problems, and its ability to interpret anything but simple log events is limited. Try correlating an IDS event with a web proxy log event to get good malware alerts and you will see what I mean. Getting the context you need is also a big problem. Mixing in data from configuration management databases, vulnerability systems, HR systems and other areas is always difficult, bordering on impossible, as SIEMs are just not built to handle this static data. Even when you can do this you tend to end up with lots of fragile extensions to core SIEM functionality, and small changes in the schema of this static data can have major impacts on your reporting and often go unnoticed until it is too late.
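The IDS and web proxy example can be sketched in a few lines of Python. This is a toy time-window correlation, assuming both sources have already been parsed into dictionaries with a shared `src_ip` and a `time` field; the function name, field names and sample events are all hypothetical.

```python
from datetime import datetime, timedelta

def correlate(ids_alerts, proxy_logs, window=timedelta(minutes=5)):
    """Pair each IDS alert with proxy requests from the same host within the window."""
    proxy_by_host = {}
    for entry in proxy_logs:
        proxy_by_host.setdefault(entry["src_ip"], []).append(entry)

    pairs = []
    for alert in ids_alerts:
        for entry in proxy_by_host.get(alert["src_ip"], []):
            if abs(entry["time"] - alert["time"]) <= window:
                pairs.append((alert, entry))
    return pairs

ids_alerts = [
    {"src_ip": "10.0.0.5", "time": datetime(2012, 1, 1, 9, 0),
     "signature": "suspicious-download"},
]
proxy_logs = [
    {"src_ip": "10.0.0.5", "time": datetime(2012, 1, 1, 9, 2),
     "url": "http://bad.example/payload.exe"},
    {"src_ip": "10.0.0.5", "time": datetime(2012, 1, 1, 12, 0),
     "url": "http://bad.example/beacon"},
    {"src_ip": "10.0.0.9", "time": datetime(2012, 1, 1, 9, 1),
     "url": "http://intranet.example/"},
]

pairs = correlate(ids_alerts, proxy_logs)
for alert, entry in pairs:
    print(alert["signature"], entry["url"])
```

With the five minute window only the download two minutes after the alert is paired up. The request three hours later from the same host is missed unless you widen the window, and a wider window means holding far more events in memory, which is exactly where a real-time engine starts to struggle.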

I mentioned that the other requirement for getting data out was reports and analytics, and this is normally the point at which you give up and throw your SIEM away. Why is that? Well, remember I mentioned that summary data and metadata were what tended to fill up that database? SIEMs report from predefined summaries. These are defined early on in the process and are built as data passes through. If you need to change what you are reporting on, you change your summarizer job and wait for it to fill up with new data. There is no way to report in a new way on the old data, as the SIEM has no workable way of accessing its retained data. No one ever asks a SIEM owner what their data retention policy is, because they can’t easily access that data. So inflexible reporting is a problem, but so is the report data itself. The summary reports are fine for telling you how many and who, but they are completely incapable of giving you the details of each individual occasion. And I’m afraid that is exactly what your auditor, compliance officer, HR person, police officer, regulator or whoever is going to ask you for. I’m afraid you are going to spend a lot of time running grep to answer those questions. Remind me why it was you bought a SIEM?
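Here is a small Python illustration of the difference between a predefined summary and the raw events behind it. The event fields and sample data are invented for the example and are not how any particular SIEM stores its summaries.

```python
from collections import Counter

# Raw events as they pass through the SIEM (hypothetical fields).
raw_events = [
    {"user": "alice", "url": "http://a.example/", "action": "blocked"},
    {"user": "alice", "url": "http://b.example/", "action": "blocked"},
    {"user": "bob", "url": "http://a.example/", "action": "blocked"},
]

# What a predefined summariser keeps: how many, and who.
blocked_per_user = Counter(
    e["user"] for e in raw_events if e["action"] == "blocked"
)

# The auditor's question: which URLs, exactly, was alice blocked on?
# Only the raw events can answer this; the counts above cannot.
alice_urls = [
    e["url"] for e in raw_events
    if e["user"] == "alice" and e["action"] == "blocked"
]

print(dict(blocked_per_user))
print(alice_urls)
```

Once only `blocked_per_user` is retained in the database, the second question has no answer there, and you are back to running grep over the flat files.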

So if the SIEM is dead what is the answer? Well I’m afraid that you are going to have to wait for part two of this blog to find out. Don’t worry though, there is an answer and it’s a good one. Stay tuned…

