Over the years, I have seen some of the very best and worst of SOC operations. Having built or managed SOCs, and supported major brands and public sector organizations across the world, it is fair to say I have seen some interesting things.
First up, many people confuse a SOC with a SIEM. A SIEM (Security Information and Event Management) is a piece of software that ingests data from a variety of sources and, using defined rules, provides insights (from a technical perspective only) into the security posture of an organization. A SOC (Security Operations Center), on the other hand, is a facility with people, processes, tools, and infrastructure that uses the SIEM, and other methods, to monitor alerts and act. Put simply, the SIEM is the software that the SOC operators use to do “eyes on glass” management.
You may not be surprised by the common mistakes people make.
Mistake 1 – Start building your SOC before you know your ‘why’.
There are very different types of SOCs that exist for a variety of purposes, so before you invest a couple of million dollars in your state-of-the-art SOC, ask yourself these questions.
- “Why do I want a SOC?” – If your reason is FOMO (fear of missing out), and you want a SOC because everyone else has one, then you may end up with a wasted investment. Think about the core function of the SOC: is its primary role to respond to an attack (reactive), or do you want it to actively hunt through the network looking for undetected adversaries (proactive)? Really test your ‘why’ by talking to key stakeholders across IT and the business, and especially consider what your SOC will not do, because without that you won’t know what to build.
- “What SOC model will I need?” – You can have an outsourced SOC, a co-managed SOC, a SOC/NOC model, a multi-tenant SOC, a dedicated SOC, or multiple SOCs (one for your data center, one for each outsourced provider, etc.).
- “When do I need my SOC mature and operational?” – The time element is important. A SOC takes time to build and tune. Typically, it will take 12 months or more to find the facility, ensure it meets the physical security requirements, purchase a vast array of equipment, hire the right team, and set up all the systems and processes. In my experience, once you have your baseline SOC built, it will take another 12 months to tune the SIEM, refine the operating procedures, and embed your team before it is operational. Do you need 24/7 from day one, or are extended business hours sufficient? The default answer is generally 24/7, but I have found this is often an exaggerated requirement: if there is an issue at 1am and all the decision makers are asleep, the value of 24/7 is diminished. So really explore what it means to be “operational” by defining the end-to-end engagement model between the SOC and the L1/2/3 support levels.
- “Where will my SOC reside?” – Think about the variety of legislation in place globally, which affects what data sources a SOC can access. A SOC is best placed in the same country as the primary data sources and organizations it is required to access and support. For example, a SOC team or infrastructure located outside Australia will be unable to provide services for any Australian Government organization, due to data sovereignty laws.
Mistake 2 – Gold plated tooling.
SOC solutions can get very expensive. The “free” SIEM that comes bundled with your EUC license can often be more expensive than you realize. Anyone in a SOC team will have stories of SIEM costs climbing steeply year on year because of the volume of data fed into the SIEM. Really understand your SIEM license model.
There are a good number of great open-source tools that you can use. These are “free”, they work, and they are powerful. For some ideas check out:
Naturally you will need to assess if these are right for you, but I wanted to share a few notable ones out there. Choose tools that are accepted and used across the industry, as you will find people you can hire with these skills.
Mistake 3 – Build your SOC then hire your manager.
Get your SOC Manager first and have them co-design the solution. They have a vested interest in the outcome and will want things designed in a specific way. They will have an opinion on the setup of the facility (layout of desks, screens), the tools and technologies needed, the skillsets and culture of the team needed, and the order in which they are needed.
I have seen organizations spend millions of dollars building a state-of-the-art facility with all the best tools and technology, only for the new manager to leave after a short period because they couldn’t deliver the services with the team and tools provided. Get your SOC Manager first, and use them as your SOC Architect, to design and build a fit-for-purpose solution.
Mistake 4 – Open the SOC without SOPs.
SOPs are the standard operating procedures, which tell the SOC operator what to do when an event is detected. I have seen too many SOC teams that have a functional SIEM and SOC operators but no SOPs. As a consequence, when alerts are raised by the SIEM, the operators don’t follow a consistent pattern of activity and rely purely on their personal preferences of what to do. This creates a major risk, as it places too much pressure on senior SOC analysts and leaves gaps in the approaches taken by junior analysts. SOPs provide safe guardrails for the SOC operators so that they know what actions to take, what information to record, when to escalate, who to escalate to, and when to move from “monitoring” to taking corrective action.
Mistake 5 – Too much of the wrong data and not enough of the right data.
There is a tendency to send the SIEM excessive data sets, just in case, creating a SIEM data dumping ground. SIEM companies charge a fortune based on volume of data, and hence the cost of the SIEM can escalate quickly. I have seen organizations spend three times more on their SIEM just because of the volume of data sent to it and, to compound the issue, the data is not monitored or acted on. A SIEM is a very expensive solution just for log retention.
Think about what events you want to be aware of, and then determine what data sets you need to detect those events. You will typically want logs from your perimeter IDS/IPS, firewalls, end user devices, applications, cloud environment, active directory, data center and threat intelligence feeds. Consider sending your data to a staging area first, then collating the critical data from there and sending only the curated data to the SIEM. Staging storage is cheaper than SIEM ingestion, so this reduces cost.
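The staging idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the event types, the `LogEvent` shape, and the `curate` function are hypothetical names invented for this example, not part of any SIEM product.

```python
from dataclasses import dataclass

# Hypothetical list of event types the SOC has decided it needs for detection.
# In practice this list comes out of the "what events do we care about" exercise.
CRITICAL_EVENT_TYPES = {"auth_failure", "privilege_change", "ips_alert", "firewall_deny"}


@dataclass
class LogEvent:
    source: str       # e.g. "firewall", "active_directory"
    event_type: str   # illustrative category assigned in the staging area
    message: str


def curate(events):
    """Split staged events into those forwarded to the SIEM and those kept in staging only."""
    forward, retain = [], []
    for event in events:
        if event.event_type in CRITICAL_EVENT_TYPES:
            forward.append(event)
        else:
            retain.append(event)
    return forward, retain


staged = [
    LogEvent("active_directory", "auth_failure", "failed logon for svc-backup"),
    LogEvent("application", "debug_trace", "cache warmed"),
    LogEvent("firewall", "firewall_deny", "blocked inbound connection"),
]
to_siem, staging_only = curate(staged)
print(f"forwarded to SIEM: {len(to_siem)}, kept in staging only: {len(staging_only)}")
```

Everything still lands in the cheaper staging store, so nothing is lost for retention, but only the events you intend to monitor count toward SIEM volume.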
Mistake 6 – Don’t tune your SIEM for false positives.
One of the biggest challenges in the SOC is the need to tune the SIEM. Early on, the SIEM will raise more alerts than it is possible to assess, and more than 80% of these are false positives, meaning that operators waste time investigating alerts that were falsely raised. These need to be “tuned out”. SOC teams are under pressure, and you want them to focus on the real alerts. Without tuning the SIEM you will get:
- Team burn-out and angst through wasting time
- Increased cost to replace and train staff
- The business-critical alerts get investigated late, or even worse, not at all.
SIEM engines often require 12-18 months to tune.
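One way to decide where to start tuning is to measure which detection rules produce the most false positives. The sketch below assumes you can export closed alerts as pairs of rule name and analyst verdict; the rule names and the `false_positive_rates` function are hypothetical, made up for illustration.

```python
from collections import Counter


def false_positive_rates(closed_alerts):
    """Return (rule, false-positive rate) pairs, noisiest rule first.

    closed_alerts: iterable of (rule_name, was_false_positive) tuples.
    """
    totals, false_positives = Counter(), Counter()
    for rule, was_false_positive in closed_alerts:
        totals[rule] += 1
        if was_false_positive:
            false_positives[rule] += 1
    rates = {rule: false_positives[rule] / totals[rule] for rule in totals}
    return sorted(rates.items(), key=lambda item: item[1], reverse=True)


# Illustrative sample only, not real alert data.
sample = [
    ("geo_impossible_travel", True),
    ("geo_impossible_travel", True),
    ("geo_impossible_travel", False),
    ("malware_hash_match", False),
    ("malware_hash_match", False),
]
ranked = false_positive_rates(sample)
for rule, rate in ranked:
    print(f"{rule}: {rate:.0%} false positives")
```

Rules at the top of the list are the first candidates for tuning. A real review would also weigh alert volume and the severity of what each rule is meant to catch.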
Mistake 7 – Unclear escalations.
When an alarm goes off at 2am, who does the operator call? Which alarms do they get you out of bed for, which can wait until the morning, and which do not need to be reported at all? Developing a communication escalation model is critical: it needs to cover who to call, when to call them, and what type of communication they receive (email, phone call, etc.). An escalation plan should be developed and signed off by the most senior executive committee for security, including the CISO and line of business leaders.
There are multiple levels of escalation that need to be planned: from an analyst to their team leader, from the team leader to the SOC Manager, and from the SOC Manager to the CIO, Business Application Owner, Technical Application Owner, CISO, etc. Each escalation requires a clear statement of what triggers it and who is accountable for which actions.
I have seen too many cases of excessive escalation, resulting in the “crying wolf” scenario. For example, unsuccessful attack attempts often do not require escalation and can be included in reporting, whereas successful attempts will often require escalation.
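An escalation model like this can be written down as a simple lookup table so that the operator never has to guess. The following is a minimal sketch only: the severity labels, roles, and channels in `ESCALATION_MATRIX` are hypothetical placeholders, and the real values should come from your signed-off escalation plan.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EscalationRule:
    notify: str     # role to contact
    channel: str    # "phone", "email", or "report"
    wake_up: bool   # whether to contact the person out of hours


# Hypothetical matrix keyed by (severity, attack_succeeded).
ESCALATION_MATRIX = {
    ("critical", True): EscalationRule("SOC Manager", "phone", True),
    ("high", True): EscalationRule("Team Leader", "email", False),
}

# Anything not listed, including unsuccessful attempts, goes into routine reporting.
DEFAULT_RULE = EscalationRule("Team Leader", "report", False)


def route(severity, attack_succeeded):
    """Look up who to notify, how, and whether to wake them up."""
    return ESCALATION_MATRIX.get((severity, attack_succeeded), DEFAULT_RULE)


print(route("critical", True))
print(route("critical", False))
```

Keeping the default as "include in reporting" is a deliberate choice in this sketch: escalation has to be justified by an explicit entry, which helps avoid the “crying wolf” pattern.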
Mistake 8 – Unclear handoffs.
Closely related to the issue with escalations, we see problems with handoffs. A handoff occurs when the SOC team needs to pass some of the analysis or ongoing activity to another team to continue. I see cases of duplication or gaps in these handoffs, where many parties are doing the same analysis or, even worse, each assumes the other is doing it. This often happens when actions are buried in the comments or notes section of a ticket, or in lengthy email trails that extend for pages and pages. Whatever the system or method used, it is better to have an agreed set of headings at the top of each document so that the critical information is seen first.
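As an illustration of agreed headings, here is a small Python sketch that renders a handoff note with the critical fields always at the top and always in the same order. The heading names and the `Handoff` class are assumptions made for this example; use whatever headings your teams agree on.

```python
from dataclasses import dataclass, field


@dataclass
class Handoff:
    summary: str
    owner_after_handoff: str
    actions_completed: list = field(default_factory=list)
    actions_outstanding: list = field(default_factory=list)

    def render(self):
        """Return the note as text, with the agreed headings first and in a fixed order."""
        lines = [
            f"SUMMARY: {self.summary}",
            f"OWNER AFTER HANDOFF: {self.owner_after_handoff}",
            "ACTIONS COMPLETED:",
            *[f"  - {action}" for action in self.actions_completed],
            "ACTIONS OUTSTANDING:",
            *[f"  - {action}" for action in self.actions_outstanding],
        ]
        return "\n".join(lines)


# Illustrative content only.
note = Handoff(
    summary="Suspicious logons on a finance file server",
    owner_after_handoff="Infrastructure team",
    actions_completed=["Disabled the affected account"],
    actions_outstanding=["Review file access logs for the last 7 days"],
)
print(note.render())
```

Because the owner and outstanding actions are explicit fields rather than free-text comments, it is harder for two teams to each assume the other is doing the work.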
As you mature your SOC over time, it will become a critical part of your defensive arsenal. With a mindset of continual improvement, you will discover better ways to deliver actionable insights for your business. Will you ever have a mature SOC? Probably not, and that’s OK, because of the continual need to pivot as the threats evolve. The trick is to have a growth mindset and open lines of communication between the SOC team and management, supported by an investment model that allows the SOC team to mature with your business needs.