In a world where network security is getting more and more complex, how do you validate that what you’re doing is accomplishing what you intended? Today we will discuss another IP Fabric Network Security Assurance use case, one that could not only make your network more secure but also save you a lot of time and money: assurance for SIEM (Security Information and Event Management).
On the surface, setting up log collection and SIEM services may seem simple, but any of us who have actually had to do it know it’s not. Not only do we have to make the right technical decisions about which devices to collect logs from, but we must also make economic decisions about how many devices we can afford to collect from and how long to store the data. By design, SIEMs are high-touch systems: they require continuous monitoring as well as full-time care and feeding. We have to remember to add new devices, ensure those devices are configured correctly to send their logs, and so on.
In the past - at least as it relates to network and security devices - SIEM sizing has been a combination of three things: guesswork, common sense, and some manual academic review. When I was involved in SIEM design, I would always suggest that we start with core switches, border routers, firewalls, and load balancers, and then look at other obviously central or important devices. Most of the time I think that was the correct advice, and at least a good starting point. You may have noticed that my suggestions were quite generic. Why so generic, you may ask? The answer: I didn’t have a single overarching view that would have let me see the network and its associated traffic flows. If I had, I would have been able to see all network choke points, key distribution points, etc. at a glance. That would have helped me be more specific and accomplish two things: 1) establish a priority list of devices that required log collection and correlation, and 2) quickly identify which devices would not offer enough value to spend our limited capital on logging. Note that when economics are a factor, deciding what should not be collected is almost as important as deciding what should be.
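To make the economics concrete, here is a back-of-the-envelope sizing sketch. All the numbers (device counts, events per second, bytes per event) are hypothetical placeholders - real rates vary widely by vendor, role, and logging level - but the arithmetic shows why a priority list matters: a handful of busy roles usually dominate the total volume.

```python
# Hypothetical per-role logging profiles; replace with measured values.
DEVICE_PROFILES = {
    "firewall":      {"count": 4,  "eps": 500, "bytes_per_event": 300},
    "border_router": {"count": 2,  "eps": 50,  "bytes_per_event": 250},
    "core_switch":   {"count": 6,  "eps": 30,  "bytes_per_event": 250},
    "access_switch": {"count": 80, "eps": 5,   "bytes_per_event": 250},
}

SECONDS_PER_DAY = 86_400

def daily_gb(profile: dict) -> float:
    """Raw log volume per day in GB for all devices of one role."""
    daily_bytes = (profile["count"] * profile["eps"]
                   * profile["bytes_per_event"] * SECONDS_PER_DAY)
    return daily_bytes / 1e9

# Rank roles by daily volume: the heavy hitters dominate licensing and
# storage cost, so they deserve the most scrutiny before onboarding.
for role, profile in sorted(DEVICE_PROFILES.items(),
                            key=lambda kv: -daily_gb(kv[1])):
    print(f"{role:14s} ~{daily_gb(profile):8.2f} GB/day")
```

Even with made-up numbers, a table like this turns "guesswork and common sense" into a ranked conversation about where logging budget actually goes.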
Most large companies that are hacked today do have SIEMs in place, yet many of them still don’t know they have been breached. Given this, we can debate whether most companies ever reach an acceptable SIEM design state. I think many do, at least for a point in time - but how often is that design properly reviewed, and how does that review take place?
It should be obvious to all of us that networks are not static. This has always been true, but the advent of cloud and DevOps has accelerated the speed at which networks change. Although cloud and DevOps have certainly increased companies’ agility and in many ways led to simplified network deployments, they have also led to management complexity, because there are now so many more “cooks in the kitchen,” so to speak. This poses the question of how anyone, or any organization, can ensure that a logging solution - be it Splunk, Elastic, IBM, Sumo Logic, etc. - remains effective in a modern networking environment past the initial installation.
Before I get to what I believe to be the answer, let us touch on the other problems we frequently see with SIEM deployments. All the above problems aside, let’s assume we have all the right devices and a way for the appropriate people to add the right devices at the right times. How do we ensure that all those devices are always configured properly to forward logs? Perhaps your answer is that you have a tool for rolling out your golden image. Okay, fair enough - but what happens when a device gets upgraded or changed, or when the manufacturer changes something? How can we see, without a manual review of the config, that the device is no longer acting as we intended, and how can we be alerted to the change?
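As a minimal illustration of what such a check looks like, here is a sketch that scans device configs for a syslog-forwarding statement pointing at the SIEM collector. The Cisco-style `logging host` syntax and the collector address are assumptions for the example; real platforms use different commands, and a production check would pull configs from a live snapshot rather than hard-coded strings.

```python
import re

SIEM_COLLECTOR = "10.0.0.50"  # hypothetical collector address

def forwards_to_siem(config_text: str, collector: str = SIEM_COLLECTOR) -> bool:
    """True if the config contains a logging statement aimed at the collector."""
    # Match lines like "logging host 10.0.0.50" or "logging 10.0.0.50".
    pattern = re.compile(r"^logging(?:\s+host)?\s+" + re.escape(collector) + r"\b",
                         re.MULTILINE)
    return bool(pattern.search(config_text))

# Illustrative configs: one compliant device, one pointed at the wrong collector.
configs = {
    "core-rtr-01": "hostname core-rtr-01\nlogging host 10.0.0.50\n",
    "edge-fw-02":  "hostname edge-fw-02\nlogging host 10.9.9.9\n",
}

non_compliant = [name for name, cfg in configs.items()
                 if not forwards_to_siem(cfg)]
print("Missing SIEM forwarding:", non_compliant)
```

The point is not the regex - it is that the check runs automatically on every snapshot, so a device that silently drops its logging statement after an upgrade gets flagged without anyone re-reading configs by hand.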
The answer to both questions is a tool that performs automated network discovery, can create updated snapshots multiple times a day without impacting the network, and then translates that discovery and snapshot information into a visual reference (a logical network map overlaid on a physical network map). This same tool then needs to be able to run assurance rules, such as: does this device acting as a core router (a device type in a location we have pre-determined needs to log) send its logs to the SIEM? If not, the device is flagged for review.
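An assurance rule of that kind can be sketched in a few lines. The snapshot structure below is invented for illustration - a real platform exposes its own inventory model and API - but the intent logic is the same: certain roles must log to the SIEM, and any device that violates that intent produces an alert.

```python
# Roles we have pre-determined must forward logs to the SIEM.
REQUIRED_LOGGING_ROLES = {"core_router", "border_router", "firewall"}

# Hypothetical discovery snapshot: hostname, inferred role, logging state.
snapshot = [
    {"hostname": "core-rtr-01", "role": "core_router",   "logs_to_siem": True},
    {"hostname": "bdr-rtr-01",  "role": "border_router", "logs_to_siem": False},
    {"hostname": "acc-sw-17",   "role": "access_switch", "logs_to_siem": False},
]

def run_logging_rule(devices: list[dict]) -> list[str]:
    """Alert on devices whose role requires SIEM logging but lack it."""
    return [
        f"ALERT: {d['hostname']} ({d['role']}) is not sending logs to the SIEM"
        for d in devices
        if d["role"] in REQUIRED_LOGGING_ROLES and not d["logs_to_siem"]
    ]

for alert in run_logging_rule(snapshot):
    print(alert)
```

Note that the access switch produces no alert: the rule encodes the earlier point that deciding what not to collect is part of the design, not an oversight.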
I have worked in several large environments where we spent millions of dollars on our SIEMs. One of the key things we were missing was an assurance platform to help with initial setup and continued management.
Well, that's all for this edition - I'll be back in a few weeks with my next installment on how to use a Network Security Assurance platform to solve real-world problems. If you want to see how this would work in your network, book a demo with our team!