7 Tips To Improve 'Signal-to-Noise' In The SOC

When security analysts are desensitized to alerts because of sheer volume, they miss the true positives that can prevent a large-scale data breach. Here's how to up your game.

Joshua Goldfarb, Field CISO

April 22, 2014

Reports on the recent Target and Neiman Marcus breaches have indicated that, in both cases, there were numerous alerts fired as a result of the intrusion activity. Yet, according to news accounts, the alerts were not properly handled, allowing system compromises to go undetected and giving attackers the crucial footholds needed to pull off large-scale breaches.

Though the exact reasons for these oversights remain unclear, it is likely that the security operations centers (SOCs) responsible for monitoring the retailers' networks had low "signal-to-noise ratios." By that, I mean that analysts in the SOC were adept at collecting vast amounts of security information, but they struggled to discern the most severe, imminent threats and to respond to them in an effective, timely manner.

A low signal-to-noise ratio means that, more than likely, these SOCs had daily work queues inundated with false positives, to the point where true positives were lost in the noise. Commonly in these scenarios, analysts attempt to review only the alerts of the highest priority. But because so many alerts are flagged as highest priority, analysts cannot review even all of those. When analysts become desensitized to alerts because of the sheer volume of false positives, they tend to miss the true positives.

What surprises me is not that organizations have a low signal-to-noise ratio, but rather, the fundamental assumption that it has to be this way. I know from experience that there is a better way. Here are seven tips that have worked well for me throughout my career:

Tip No. 1: Go for the "Money Shot." It's possible to detect attacks at multiple stages (e.g., exploit, payload delivery, command and control) using indicators of compromise (IoCs). Different stages of an attack will have different false positive rates, and a given attack may have multiple IoCs that can be used to identify it. Whenever possible, use the most reliable IoCs, those with the lowest false positive rates, to benefit the signal-to-noise ratio.
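
One way to put Tip No. 1 into practice is to compare candidate IoCs by how often their past alerts turned out to be benign. The following is a minimal sketch, not a definitive method. It assumes analysts record a disposition for every alert; the indicator names, counts, and the min_samples cutoff are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class Indicator:
    name: str
    stage: str
    true_positives: int   # alerts an analyst confirmed as malicious
    false_positives: int  # alerts an analyst closed as benign

    @property
    def fired(self) -> int:
        return self.true_positives + self.false_positives

    @property
    def false_positive_share(self) -> float:
        # Share of fired alerts that were benign; 1.0 if there is no history.
        return self.false_positives / self.fired if self.fired else 1.0


def most_reliable(indicators, min_samples=20):
    """Pick the indicator with the lowest share of false positives.

    Indicators with fewer than min_samples past alerts are skipped,
    because their track record is too short to judge (an assumption).
    """
    candidates = [i for i in indicators if i.fired >= min_samples]
    return min(candidates, key=lambda i: i.false_positive_share, default=None)


# Hypothetical history for three IoCs that all identify the same attack.
history = [
    Indicator("exploit_signature", "exploit", 5, 495),
    Indicator("payload_hash", "payload delivery", 48, 2),
    Indicator("c2_domain", "command and control", 30, 20),
]

best = most_reliable(history)
print(best.name, round(best.false_positive_share, 2))
```

In this made-up history, the payload hash is the indicator worth alerting on, because almost every alert it produced was a true positive.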

Tip No. 2: Take a scalpel approach. Alerting technologies have tremendous potential to identify suspicious or malicious activity when used like a scalpel. Unfortunately, many organizations use them less like a scalpel and more like a hatchet. It pays to assess risk, security needs, operational needs, and business needs at each deployment location and focus alerting technologies selectively. Singling out only the most relevant alerts lowers the number of false positives and improves the signal-to-noise ratio.

Tip No. 3: Use correlation. Sometimes, an individual alert is not particularly interesting until observed in conjunction with one or more other alerts or activity of interest. In those cases, the alert should be sent to the work queue only when all the correlation criteria are met. This increases the fidelity of the alerts and reduces the number of false positives, both of which help the signal-to-noise ratio.
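
The correlation idea in Tip No. 3 can be sketched as a rule that sends a host to the work queue only when every required event type has been seen for that host within a time window. This is an illustrative sketch under stated assumptions: the event fields (host, type, time), the event type names, and the 30-minute window are all hypothetical, and a real SIEM would express the same logic in its own rule language.

```python
from datetime import datetime, timedelta


def correlate(events, required_types, window):
    """Return hosts where all required event types occur within `window`."""
    by_host = {}
    for event in sorted(events, key=lambda e: e["time"]):
        by_host.setdefault(event["host"], []).append(event)

    hits = []
    for host, host_events in by_host.items():
        # Slide over the host's events; each one starts a candidate window.
        for i, first in enumerate(host_events):
            seen = {
                e["type"]
                for e in host_events[i:]
                if e["time"] - first["time"] <= window
            }
            if required_types <= seen:
                hits.append(host)
                break
    return hits


t0 = datetime(2014, 4, 22, 9, 0)
events = [
    {"host": "pos-01", "type": "suspicious_download", "time": t0},
    {"host": "pos-01", "type": "new_outbound_connection",
     "time": t0 + timedelta(minutes=5)},
    {"host": "hr-07", "type": "suspicious_download",
     "time": t0 + timedelta(minutes=1)},
    {"host": "web-02", "type": "suspicious_download", "time": t0},
    {"host": "web-02", "type": "new_outbound_connection",
     "time": t0 + timedelta(hours=3)},
]

required = {"suspicious_download", "new_outbound_connection"}
queued_hosts = correlate(events, required, timedelta(minutes=30))
print(queued_hosts)
```

Only pos-01 meets both criteria inside the window. The lone download on hr-07 and the widely spaced events on web-02 never reach the queue.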

Tip No. 4: Write intelligent alerting. As the saying goes, "Ask a stupid question, get a stupid answer." Today's threats are sophisticated and require intelligent, targeted, incisive alert logic to extract activity of concern while minimizing false positives. Working to tighten this logic goes a long way toward an optimal signal-to-noise ratio.

Tip No. 5: Be picky with intelligence. Different intelligence sources will bring different fidelity, relevance, and value to security operations. Blindly integrating intelligence feeds without evaluating their fidelity and false positive rates can have a detrimental effect on security operations by significantly lowering the signal-to-noise ratio.
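
Evaluating a feed before integrating it, as Tip No. 5 recommends, could look something like the sketch below. It assumes a trial period in which analysts label each feed-driven alert as a true or false positive. The feed names, the counts, and the 0.5 cutoff are hypothetical; an appropriate threshold depends on the organization.

```python
from collections import Counter


def feed_precision(dispositions):
    """Map each feed to the fraction of its alerts confirmed as true positives.

    dispositions: iterable of (feed_name, was_true_positive) pairs.
    """
    fired = Counter()
    confirmed = Counter()
    for feed, was_true_positive in dispositions:
        fired[feed] += 1
        if was_true_positive:
            confirmed[feed] += 1
    return {feed: confirmed[feed] / fired[feed] for feed in fired}


def feeds_to_keep(dispositions, min_precision=0.5):
    """Feeds whose measured precision meets the (assumed) cutoff."""
    scores = feed_precision(dispositions)
    return sorted(f for f, p in scores.items() if p >= min_precision)


# Hypothetical trial results for two intelligence feeds.
trial = (
    [("feed_a", True)] * 8 + [("feed_a", False)] * 2
    + [("feed_b", True)] * 1 + [("feed_b", False)] * 99
)

precision = feed_precision(trial)
keep = feeds_to_keep(trial)
print(precision, keep)
```

In this invented trial, feed_b would add 99 false positives for every true positive, so integrating it blindly would drag the signal-to-noise ratio down.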

Tip No. 6: Prioritize appropriately. Prioritization is one of the greatest tools a security team can utilize. Alerts that are of higher fidelity and detect higher-risk activity can be assigned a higher priority, while alerts that are of lower fidelity and detect lower-risk activity can be assigned a lower priority. Analysts work the queue from highest priority to lowest priority, ensuring that the most reliable alerts covering the activity of the greatest risk are addressed first.
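
The prioritization in Tip No. 6 can be sketched as a sort on a score that rises with both fidelity and risk. Multiplying the two is an assumption made here for simplicity, not a scheme the tip prescribes. The alert names, the 0-to-1 fidelity scale, and the 1-to-10 risk scale are likewise hypothetical.

```python
def priority(alert):
    # Assumed scoring: fidelity (0 to 1) times risk (1 to 10).
    return alert["fidelity"] * alert["risk"]


def work_order(alerts):
    """Return alert names from highest to lowest priority."""
    ranked = sorted(alerts, key=priority, reverse=True)
    return [alert["name"] for alert in ranked]


# Hypothetical alerts waiting in the queue.
queue = [
    {"name": "port_scan", "fidelity": 0.2, "risk": 2},
    {"name": "card_malware_hash", "fidelity": 0.95, "risk": 10},
    {"name": "c2_beacon", "fidelity": 0.7, "risk": 9},
]

order = work_order(queue)
print(order)
```

Analysts working this queue from the top would reach the high-fidelity, high-risk malware alert first and the noisy, low-risk port scan last.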

Tip No. 7: Review every alert. Why fill the queue with alerts that are never reviewed? The whole point is to draw the attention of an analyst for review. Using the techniques I've described, an organization can regain control of its queue with the ultimate goal that every alert will be reviewed. This ensures that nothing is overlooked.

To illustrate these concepts, consider the case of SOC A and SOC B. In SOC A, the daily work queue contains approximately 100 reliable, high-fidelity, usable alerts. Each one is reviewed by an analyst. If incident response is necessary for a given alert, it is performed.

In SOC B, the daily work queue contains approximately 100,000 alerts, almost all of which are false positives. Analysts attempt to review them according to priority. Because the volume is so large, analysts cannot review even all of the highest-priority alerts. Additionally, because of the large number of false positives, SOC B's analysts become desensitized to alerts and do not take them particularly seriously.

One day, 10 additional alerts relating to payment-card-stealing malware fire within a few minutes of one another.

In SOC A, where every alert is reviewed by an analyst, where the signal-to-noise ratio is high, and where 10 additional alerts seem like a lot, analysts successfully identify the breach in less than 24 hours. SOC A's team can perform analysis, containment, and remediation within the first 24 hours of the breach. The team can stop the bleeding before any payment card data is exfiltrated. Though there has been some damage, it can be controlled. The organization can assess the damage, respond appropriately, and return to normal business operations.

In SOC B, where an extremely small percentage of the alerts are reviewed by an analyst, where the signal-to-noise ratio is low, and where 10 additional alerts don't even raise an eyebrow, the breach remains undetected. Months later, SOC B will learn of the breach from a third party. The damage will be extensive, and full recovery will take months or years.
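
A quick calculation shows why the same 10 alerts stand out in one queue and vanish in the other. The sketch below uses only the queue sizes given in the scenario (about 100 and about 100,000 alerts per day); the function name is illustrative.

```python
def relative_increase(baseline_alerts, extra_alerts):
    """Extra alerts as a fraction of the normal daily queue."""
    return extra_alerts / baseline_alerts


soc_a = relative_increase(100, 10)
soc_b = relative_increase(100_000, 10)

print(f"SOC A: {soc_a:.2%} more alerts than usual")
print(f"SOC B: {soc_b:.2%} more alerts than usual")
```

For SOC A, the burst is a 10% jump in the day's queue. For SOC B, it is a 0.01% change, far too small to notice among the false positives.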

Which SOC would you rather have inside your organization?

About the Author

Joshua Goldfarb

Field CISO, F5

Josh Goldfarb is currently Field CISO at F5. Previously, Josh served as VP and CTO of Emerging Technologies at FireEye and as Chief Security Officer for nPulse Technologies until its acquisition by FireEye. Prior to joining nPulse, Josh worked as an independent consultant, applying his analytical methodology to help enterprises build and enhance their network traffic analysis, security operations, and incident response capabilities to improve their information security postures. Earlier in his career, Josh served as the Chief of Analysis for the United States Computer Emergency Readiness Team, where he built from the ground up and subsequently ran the network, endpoint, and malware analysis/forensics capabilities for US-CERT. In addition to Josh's blogging and public speaking appearances, he is also a regular contributor to Dark Reading and SecurityWeek.
