Most common myths about Splunk - Damn it!!
After digging into the nitty-gritty of the Cisco security portfolio, it's time to make our information-gathering tools and SIEM solution more efficient for incident response use cases. Let's jump into the world of Splunk and some lies you might have heard from Splunk newbies.
So after bugging the entire IT department (through friends and peers) and interrogating as many business teams as possible to grant you (the security guy) access to their data, you are finally in the process of developing your dream use cases. Lucky you! I used to do many of the things I'll reference below. Many of them work, but they aren't exactly the most efficient approach. And because they do work, they've grown into fake news over the years!
I hope that this article will help you avoid some of the growing pains that seasoned Splunk admins have already suffered through.
If you do experience anything listed below, hopefully we can give you an idea of how to better approach your own environment. But if all else fails, please don't hesitate to reach out to us for Splunk professional services help.
Lie #1: Always Place Your Config Files in etc/system/local
Splunk’s ability to scale is what attracts so many people to the product; it scales better than virtually anything else in the market. You’re probably wondering what the system/local directory has to do with scalability, and believe it or not, it has a lot to do with it. In Splunk’s configuration file precedence, the system/local directory always wins. This hurts scalability because settings in system/local cannot be managed or overridden remotely.
For example, if you’ve ever manually installed a forwarder on a Windows server, you’ve probably noticed that the install instructions ask you to set the deployment client and forwarding server during the install.
These values are written into the etc/system/local directory during the install and cannot be remotely changed. Instead, do not enter any of these values when installing on a Windows server. Leave them blank and use the following approach to set the appropriate configurations.
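As a sketch of what "leave it blank and manage it in an app" looks like: instead of letting the installer write the deployment server into etc/system/local, drop a small app into etc/apps at install time. The app name and host name below are hypothetical; this one app is the only thing you lay down by hand, and everything else can then be pushed from the deployment server.

```ini
# etc/apps/org_all_deploymentclient/local/deploymentclient.conf
# (app name and deployment server address are placeholders)
[deployment-client]

[target-broker:deploymentServer]
# 8089 is Splunk's default management port
targetUri = deploy.example.com:8089
```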
The best approach is to create custom apps that contain your configuration files. If you manage your config files via apps, you can use your deployment server to remotely change configurations on hundreds or thousands of forwarders in a matter of minutes. Say you just stood up a new indexer and now need to update outputs.conf on all of your Universal Forwarders: with this approach, your week-long project just became a 15-minute task.
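For example (host names and app names here are made up for illustration), the forwarding settings can live in their own app on the deployment server:

```ini
# On the deployment server:
# etc/deployment-apps/org_all_forwarder_outputs/local/outputs.conf
[tcpout]
defaultGroup = primary_indexers

[tcpout:primary_indexers]
# Add the new indexer here and the deployment server
# pushes the change to every subscribed forwarder.
server = idx1.example.com:9997, idx2.example.com:9997
```

A matching serverclass.conf entry on the deployment server maps the app to your forwarders (e.g. a whitelist plus restartSplunkd = true), so adding an indexer becomes a one-line change and a reload of the deployment server.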
Lie #2: The More Indexes, The Better
More often than not, you will see Splunk administrators who don’t really plan out their indexes. In many cases, we will see indexes created on a per sourcetype basis. Not only is this an overkill approach that makes management much more cumbersome, but it can also cause performance degradation and in some extreme cases, even data loss. This is because in a clustered environment, there are limitations on how many buckets a cluster manager can manage.
You also have to consider that your data retention and role-based access is all index based. This is where the management of so many indexes becomes difficult and cumbersome. The ideal approach to planning your indexes should revolve around the two aforementioned aspects. You should also consider these items when planning your indexes:
Data that is commonly searched together can more than likely be grouped together.
This approach can also alleviate your role-based access requirements as your indexes will be grouped by the team who owns the data, for the most part.
Organize your indexes by ownership group, under a superseding general term, for example "index=lan" for the network team (note that Splunk index names must be lowercase).
From there, group logs from your firewalls, switches, routers, etc. by their corresponding sourcetype under your "lan" index.
The bottom line? Keep it simple. Take the time to logically plan out your indexes prior to your deployment. Once your data is indexed, there is no do-over so it is definitely worth the extra effort and attention.
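As a rough sketch of the "group by ownership, not by sourcetype" idea (the index name, retention value, and paths here are illustrative, not prescriptive):

```ini
# indexes.conf on the indexers: one index per owning team,
# not one index per sourcetype.
[lan]
homePath   = $SPLUNK_DB/lan/db
coldPath   = $SPLUNK_DB/lan/colddb
thawedPath = $SPLUNK_DB/lan/thaweddb
# Retention is set per index: roughly 90 days before data is frozen.
frozenTimePeriodInSecs = 7776000
```

Firewall, switch, and router logs all land in index=lan under their own sourcetypes, so a search like index=lan sourcetype=cisco:asa still isolates a single device class, while retention and role-based access are managed once per team instead of once per sourcetype.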
Lie #3: Sending TCP/UDP Syslog Data Directly to Indexers
Earlier, we talked about why you should avoid using an aggregate layer before indexing your data. Now we’re going to flip the script a bit. When it comes to syslog data, you want to use an aggregate layer. Often, we will see customers who send syslog data either directly to indexers, or via a third-party load balancer like Netscaler.
This is a risky approach because it can have a negative effect on your load balancing. The obvious implication of sending directly to indexers is that there is absolutely no load balancing going on. But when using a third-party load balancer, what can happen is the load balancer won’t switch often enough, or large streams of data can get stuck. Essentially, Splunk knows how to break the data into events; a third-party load balancer does not, and it could switch early before an entire event makes it to the indexer. In the Splunk world, you want to distribute your indexing as much as possible. Storage is expensive, so the more distribution, the better.
Consider this: if you send your data directly to indexers and have to restart one of them, the data sent to that box during the restart is lost. You also lose the ability to filter out noisy data before it is indexed. Instead, consider standing up a dedicated syslog server (like syslog-ng). I can’t stress enough how critical this is to Splunk. Then deploy a Universal Forwarder to your syslog server. The Universal Forwarder -> indexer flow of traffic is the most reliable path: the forwarder handles event breaking and automatically load balances across all of your indexers.
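A minimal sketch of that pattern (paths, index, and port choices are assumptions, adjust to your environment): have syslog-ng write each sending host to its own directory, then have the Universal Forwarder monitor those files.

```ini
# syslog-ng snippet on the syslog server, one file per source host:
#   destination d_remote { file("/var/log/remote/$HOST/messages"); };
#   log { source(s_net); destination(d_remote); };

# inputs.conf in an app on the Universal Forwarder:
[monitor:///var/log/remote/*/messages]
index = lan
sourcetype = syslog
# Take the host name from the 4th path segment: /var/log/remote/<host>/...
host_segment = 4
```

The forwarder then auto-load-balances those events across every indexer listed in outputs.conf, and you can drop noisy sources at the syslog layer before Splunk ever licenses or indexes them.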