A Battle for the Ages, Packet Based v Flow Based Analysis Tools
August 30, 2021
Network performance monitoring tools use a wide range of techniques to collect data for analysis. Two of the most common techniques are packet based data collection and flow based data collection. In this blog post, we will take a look at the pros and cons of each, and propose a solution to maximise the value of both.
First, a quick discussion of each…
Flow based collection relies on ‘flow agents’ embedded in network devices such as switches or routers that collect and export network performance data to a central server for analysis and presentation. Flow agents usually collect meta-data per flow, that is, metrics pertaining to the communication between hosts in a network. Flow data presents statistical information from network conversations, including source and destination addresses, byte counts, packet counts and other information available in the IP and TCP/UDP headers.
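To make the idea of per-flow meta-data concrete, here is a minimal Python sketch of how packets could be rolled up into flow records. It assumes a flow is identified by the usual 5-tuple (source address, destination address, source port, destination port, protocol); the names FlowRecord and aggregate_flows are made up for illustration and do not come from any particular flow agent.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class FlowRecord:
    """Counters kept per flow (hypothetical structure, for illustration only)."""
    packets: int = 0
    bytes: int = 0


def aggregate_flows(packets):
    """Roll packets up into per-flow counters.

    `packets` is an iterable of tuples:
    (src_addr, dst_addr, src_port, dst_port, protocol, length_in_bytes)
    """
    flows = defaultdict(FlowRecord)
    for src, dst, sport, dport, proto, length in packets:
        record = flows[(src, dst, sport, dport, proto)]  # 5-tuple is the flow key
        record.packets += 1
        record.bytes += length
    return dict(flows)


if __name__ == "__main__":
    sample = [
        ("10.0.0.1", "10.0.0.2", 40000, 443, "tcp", 100),
        ("10.0.0.1", "10.0.0.2", 40000, 443, "tcp", 200),
        ("10.0.0.3", "10.0.0.2", 40001, 53, "udp", 60),
    ]
    for key, record in aggregate_flows(sample).items():
        print(key, record)
```

Note that the individual packets are discarded once counted, which is why flow data stays small.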
By contrast, packet based collection relies on the ability to examine every packet traversing a link in order to extract performance metrics. Packet based collection usually uses a dedicated appliance installed either inline or connected to a SPAN port or network tap. Packet based collection provides a far more granular and precise mechanism for performance monitoring than flow based collection by supporting more sophisticated data such as inter packet arrival time, latency and jitter. Additionally, packet based collection can support deep packet inspection, looking inside the packet payload to identify application specific metrics. It does however come at a cost: packet based collection can be difficult (and expensive) to do at high speed, and the resulting data sets can be large and difficult to manage.
So let’s take a quick high level look at the pros and cons of each …
Flow based analysis is usually the lower cost option. After all, your switches and routers probably already include flow agents that support Cisco NetFlow, sFlow or IPFIX. Leveraging this data is often as simple as installing a flow capable collector to correlate and report on usage.
If all you are looking at is capacity planning type data or a high level overview of throughput and usage across the network, then flow based monitoring is definitely the way to go. In this case, flow based monitoring provides a quick and easy way to achieve high level visibility of network traffic.
The other upside of flow based monitoring data is that it is typically lightweight. Because it relies on flow meta-data, the actual amount of data produced is relatively small, allowing for easy enterprise wide deployment.
However, for day to day diagnosis of complex network issues or for a better understanding of end user experience, flow based collection can have some limitations. Packet based collection provides a much richer source of data for detailed diagnostics and performance analysis.
Because packet based collection relies on every single packet being examined, detailed latency and packet inter arrival times can be collected to provide deep insight into the actual performance of packets across the network and, accordingly, the ability to assess potential impact on end user experience. Additionally, deep packet inspection allows for application specific information unavailable with flow based monitoring, especially for proprietary applications or those, such as VoIP, that may use dynamic ports.
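As a rough illustration of why per-packet timestamps matter, the Python sketch below derives inter arrival times and a simple jitter figure from the arrival times of the packets in one flow. The jitter definition used here (the mean absolute change between consecutive inter arrival gaps) is an assumption chosen for simplicity; real tools may use other estimators, such as the smoothed interarrival jitter defined for RTP. The function names are hypothetical.

```python
def inter_arrival_times(timestamps):
    """Gaps, in seconds, between consecutive packet arrival timestamps."""
    return [later - earlier for earlier, later in zip(timestamps, timestamps[1:])]


def mean_jitter(timestamps):
    """Mean absolute change between consecutive inter arrival gaps.

    This is a simplified jitter measure assumed for illustration.
    Returns 0.0 when there are fewer than three packets.
    """
    gaps = inter_arrival_times(timestamps)
    if len(gaps) < 2:
        return 0.0
    changes = [abs(b - a) for a, b in zip(gaps, gaps[1:])]
    return sum(changes) / len(changes)


if __name__ == "__main__":
    arrivals = [0.0, 0.5, 1.0, 2.0]  # seconds, made-up sample values
    print(inter_arrival_times(arrivals))
    print(mean_jitter(arrivals))
```

None of this can be computed from byte and packet counts alone, because the timing of each individual packet has already been lost by the time a flow record is exported.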
So on the face of it, packet based collection is the way to go, right? Well, not necessarily …
Packet based collection relies on dedicated hardware probes that can be expensive, especially on high speed links. It also generates huge amounts of data that can be difficult to manage and analyse. So whilst packet based collection provides a richer source of data, the potential cost to implement can make the solution unviable.
So which is better and which way should you go to get the best network performance monitoring solution?
Well, at the risk of being a total fence sitter, the simple answer is … it depends. Both have their pros and cons and both are applicable in different environments. I think a better question is, how do we leverage the best of both techniques?
This is exactly our approach at Byte25. We have taken the best pieces of flow and packet based collection to develop a technique we call ‘hybrid packet based flow analysis’. That is, we examine each packet in the same fashion as traditional packet based collection to gather deep insight into performance including latency and deep packet inspection, but then create meta-data pertaining to each flow for storing in the analysis database. We are creating enriched flow data if you like, data with the traditional metrics of flow agents like NetFlow but enhanced with the deep insight data normally only available via packet based collection.
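The general idea of an enriched flow record can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not a description of how Byte25 implements it: the EnrichedFlow class, its fields, the summarise function and the idea that an application label arrives from some deep packet inspection step are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class EnrichedFlow:
    """Flow counters plus metrics that need per-packet visibility (hypothetical)."""
    packets: int = 0
    bytes: int = 0
    first_seen: Optional[float] = None
    last_seen: Optional[float] = None
    max_gap: float = 0.0                # largest inter arrival gap, in seconds
    application: Optional[str] = None   # label assumed to come from a DPI step

    def update(self, timestamp, length, application=None):
        """Fold one observed packet into the summary, then forget the packet."""
        if self.last_seen is None:
            self.first_seen = timestamp
        else:
            self.max_gap = max(self.max_gap, timestamp - self.last_seen)
        self.last_seen = timestamp
        self.packets += 1
        self.bytes += length
        if application is not None:
            self.application = application


def summarise(packets):
    """`packets` is an iterable of (flow_key, timestamp, length, application)."""
    flows = {}
    for key, timestamp, length, application in packets:
        flows.setdefault(key, EnrichedFlow()).update(timestamp, length, application)
    return flows


if __name__ == "__main__":
    key = ("10.0.0.1", "10.0.0.2", 5060, 5060, "udp")
    observed = [(key, 0.0, 100, None), (key, 0.25, 200, "sip"), (key, 1.0, 50, None)]
    print(summarise(observed)[key])
```

Every packet is inspected, but only one small summary per flow is kept, so the stored data grows with the number of flows rather than the number of packets.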
This technique also allows us to keep relatively lightweight data sets, more akin to flow collection, without losing information granularity – that is, we achieve the best of both worlds. In addition, because we store data in a flow based format we have the added advantage of being able to also incorporate feeds from flow based agents. This allows enormous flexibility and cost savings – we can deploy more expensive dedicated packet based probes on important links like major Internet egress connections, but utilise low cost flow based agents for visibility of potentially less critical network connections. The structure of the hybrid packet based flow analysis database allows both sets of data to coexist for easy analysis and reporting.
So the question isn’t which technique is better, but rather how can I have the best of both worlds. This is the solution that Byte25 delivers to meet the needs of network performance monitoring in modern network topologies.