Correlation, the Key to True Network Visibility
August 30, 2021
Have you ever heard the term ‘single pane of glass’ when IT people speak about monitoring tools … of course you have! It’s an oft-used (and misused) term for a management console that presents data from multiple sources in a single display. I probably first heard this term bandied about as far back as the ’90s, and I have yet to see an effective solution that offers a single pane of glass dashboard.
Harsh? Maybe … but the reality is that even in single pane of glass solutions, data is typically still ‘siloed’. That is, each disparate data set exists in its own data store. The single pane of glass solutions present this disparate data on a single dashboard – but in essence we are seeing graphs generated from different data sets simply displayed on the same page.
What we really need is the ability to look at the relationships between these different data sets. The cyber security world has been at the forefront of data set correlation, as seen in some of the more advanced Security Information and Event Management (SIEM) tools. Most SIEM tools can take information from multiple sources, say in the form of logs, and search for patterns or ‘correlations’ between them, providing deep insight into existing and potential security issues.
The network performance monitoring space has been slower to adopt this approach – I suppose because we have been grappling with large data sets that seem discrete from other data sources. But … (and this is important) the key to providing ‘network visibility’ as opposed to simple ‘network monitoring’ is the ability to correlate data from multiple sources in order to present a whole-of-network picture that assists in problem diagnosis and resolution, as well as security incident reporting.
Effective correlation relies on the ability to ‘link’ data sets to glean deeper insight. From a networking perspective, the most obvious common link is the IP address. Most network-related data sets, be they performance monitoring or cyber security, are IP address based, which allows an easy and quick point of correlation.
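To make the idea concrete, here is a minimal Python sketch of IP address correlation. The record layouts, field names and the correlate_by_ip helper are all hypothetical, invented purely for illustration; real tools will export different fields.

```python
from collections import defaultdict

# Hypothetical sample records; real field names depend on your tools.
alerts = [
    {"alert_id": "A1", "src_ip": "10.0.0.5", "signature": "example alert"},
]
perf_records = [
    {"src_ip": "10.0.0.5", "dst_ip": "10.0.0.9", "bytes": 1200},
    {"src_ip": "10.0.0.7", "dst_ip": "10.0.0.5", "bytes": 300},
    {"src_ip": "10.0.0.7", "dst_ip": "10.0.0.9", "bytes": 50},
]


def correlate_by_ip(alerts, perf_records):
    """Map each alert ID to every performance record touching the alert's IP."""
    by_ip = defaultdict(list)
    for rec in perf_records:
        # Index each record under both endpoints (a set avoids double-adding
        # a record whose source and destination are the same address).
        for ip in {rec["src_ip"], rec["dst_ip"]}:
            by_ip[ip].append(rec)
    return {a["alert_id"]: by_ip.get(a["src_ip"], []) for a in alerts}


matches = correlate_by_ip(alerts, perf_records)
```

Note the limitation: the join returns every record that involves the address, not just the conversation that raised the alert.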
Certainly IP address correlation is a good start, but there is a more interesting, and arguably more effective, way. Corelight have developed a technique to pivot between datasets for more effective visibility called Community ID Flow Hashing (https://github.com/corelight/community-id-spec). Community ID Flow Hashing takes flow tuple information (source and destination IP addresses and ports, plus the transport protocol) and creates a hash value or ID that identifies the same flow across data sets.
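For the curious, the hashing itself can be sketched in a few lines of Python. This is my reading of version 1 of the published spec, simplified to port-based protocols such as TCP and UDP (the spec treats ICMP differently); the function name community_id_v1 is my own, and the spec linked above remains the authoritative reference.

```python
import base64
import hashlib
import ipaddress
import struct


def community_id_v1(src_ip, dst_ip, src_port, dst_port, proto, seed=0):
    """Simplified Community ID v1 sketch for port-based flows (e.g. TCP=6, UDP=17)."""
    src = ipaddress.ip_address(src_ip).packed
    dst = ipaddress.ip_address(dst_ip).packed

    # Order the endpoints so both directions of a flow produce the same ID:
    # the smaller (address, port) pair goes first.
    if (src, src_port) > (dst, dst_port):
        src, dst = dst, src
        src_port, dst_port = dst_port, src_port

    # seed (2 bytes) | addr | addr | protocol (1) | padding (1) | port (2) | port (2),
    # all in network byte order.
    data = (
        struct.pack("!H", seed)
        + src
        + dst
        + struct.pack("!BBHH", proto, 0, src_port, dst_port)
    )
    digest = hashlib.sha1(data).digest()
    return "1:" + base64.b64encode(digest).decode("ascii")


# Both directions of the same TCP conversation yield one ID.
forward = community_id_v1("10.0.0.5", "10.0.0.9", 51000, 443, 6)
reverse = community_id_v1("10.0.0.9", "10.0.0.5", 443, 51000, 6)
```

Because the endpoints are ordered before hashing, any tool that sees either direction of the conversation arrives at the same ID, which is what makes it usable as a join key between otherwise unrelated data sets.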
Corelight uses this to correlate between datasets from Suricata and Zeek to provide detailed cyber incident analysis. At Byte25 we have taken a similar approach by implementing the Community Flow ID within the Byte25 deep packet inspection engine. In this way, Byte25 can correlate data between the Suricata Threat Detection Engine and the network performance data collected from the DPI engine.
This provides deep network visibility well above the simple ‘single pane of glass’ approach. For example, if an incident is identified in the Byte25 Threat Detection engine, it is a simple process to use the Community Flow ID to pivot to the DPI data, set a filter and identify exactly when and how often the malicious device has communicated within the network and who else may have been affected. In other words, a simple correlation enables powerful diagnostics and forensics to identify and potentially remediate issues.
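As a rough illustration of that pivot, the Python sketch below filters a set of flow records by the ID carried on an alert, then widens the search to everything the offending device has talked to. The record layouts, ID strings and helper names are hypothetical and are not the Byte25 schema or API.

```python
# Hypothetical alert and DPI flow records sharing a community_id field.
alert = {"community_id": "1:hypothetical-flow-a", "src_ip": "10.0.0.5"}
dpi_records = [
    {"community_id": "1:hypothetical-flow-a", "timestamp": "2021-08-30T10:05:00Z",
     "src_ip": "10.0.0.5", "dst_ip": "10.0.0.9"},
    {"community_id": "1:hypothetical-flow-b", "timestamp": "2021-08-30T10:06:00Z",
     "src_ip": "10.0.0.12", "dst_ip": "10.0.0.5"},
    {"community_id": "1:hypothetical-flow-a", "timestamp": "2021-08-30T10:01:00Z",
     "src_ip": "10.0.0.5", "dst_ip": "10.0.0.9"},
]


def flows_for_alert(alert, dpi_records):
    """Return DPI records for the exact flow behind the alert, oldest first."""
    hits = [r for r in dpi_records if r["community_id"] == alert["community_id"]]
    return sorted(hits, key=lambda r: r["timestamp"])


def peers_of(ip, dpi_records):
    """Return every address that has exchanged traffic with the given IP."""
    peers = set()
    for r in dpi_records:
        if r["src_ip"] == ip:
            peers.add(r["dst_ip"])
        elif r["dst_ip"] == ip:
            peers.add(r["src_ip"])
    return peers


hits = flows_for_alert(alert, dpi_records)
affected = peers_of(alert["src_ip"], dpi_records)
```

The first helper answers when and how often the flagged flow occurred; the second answers who else may have been affected.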
Single pane of glass may be a great marketing term, but true network visibility relies on the power of correlation. Normalising disparate datasets through techniques such as Community ID Flow Hashing is a great step forward in allowing pivots between different data sources.
A great step forward for network visibility.