Flow-Based Traffic Analysis at SWITCH
Simon Leinen
SWITCH
For more than three years, SWITCH has been using a locally developed software package called "Fluxoscope" to perform volume-based billing and traffic analysis for our external (peering/transit) connections. The system processes microflow-based accounting data generated by our external border routers in Cisco NetFlow(TM) format. While the main goal of this effort was to generate bills based on member organizations' (universities') respective utilization of our transatlantic connection, we also hoped to gain valuable insight into the nature of the traffic that our network transports, so the system was designed to be easily extensible in order to allow for experimentation. It has since been applied successfully to other areas, such as capacity and interconnection planning using detailed long-term statistics, and the detection of anomalies such as unwanted routing asymmetries and various types of network abuse.
We outline the design of our software and point out some of its distinguishing features, which include: (1) versatile aggregation and analysis in real time; (2) multi-dimensional aggregated traffic matrices allowing "drill down" data exploration; (3) sensible handling of microflows' start and end times; (4) an integrated SNMP agent for monitoring the system; and (5) distribution of "raw" NetFlow accounting data to individual customers for local cross-checking and detailed analysis.
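To illustrate what handling microflows' start and end times can involve, here is a minimal Python sketch that spreads a flow's octets over the accounting intervals it overlaps, in proportion to the overlap. It is not taken from Fluxoscope: the function name `split_flow`, the five-minute interval, and the proportional rule are all assumptions made for illustration.

```python
from collections import defaultdict

BIN_SECONDS = 300  # assumed five-minute accounting interval


def split_flow(start, end, octets, bin_seconds=BIN_SECONDS):
    """Distribute a flow's octets over the time bins it overlaps.

    start and end are integer timestamps in seconds. Only integer
    arithmetic is used, so the shares add up to exactly `octets`.
    """
    if end <= start:
        # Zero-duration flow: credit everything to the bin containing start.
        return {start - start % bin_seconds: octets}
    duration = end - start
    shares = {}
    assigned = 0
    elapsed = 0
    bin_start = start - start % bin_seconds
    while bin_start < end:
        overlap = min(end, bin_start + bin_seconds) - max(start, bin_start)
        elapsed += overlap
        # Cumulative rounding: the running total tracks the exact share.
        target = octets * elapsed // duration
        shares[bin_start] = target - assigned
        assigned = target
        bin_start += bin_seconds
    return shares


# Accumulate per-interval totals for two made-up flows.
totals = defaultdict(int)
for start, end, octets in [(250, 650, 4000), (100, 100, 40)]:
    for bin_start, share in split_flow(start, end, octets).items():
        totals[bin_start] += share
```

Crediting a long-lived flow entirely to the interval in which it was exported would produce artificial spikes in time series; apportioning it as above is one plausible way to avoid that.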
The strategy and the tools used in the implementation of Fluxoscope are presented, in particular how the choice of a development and execution environment with garbage collection, arbitrary-precision integer arithmetic and late binding has helped the creation and evolution of the system.
We also look at the performance of the system under real-life conditions, and explain how scalability issues are addressed through parallel distributed operation. Some design decisions for Fluxoscope are compared with other systems such as cflowd [cflowd], FlowScan [flowscan], or JANET's Transatlantic Billing Service [janet].
The flow-processing and aggregation component requires some configuration describing the external and internal topology of the network under consideration. We show two configuration examples, one for a "stub" network with customer networks in the same AS as the backbone, and one for a network whose customer networks are EBGP neighbors.
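The two topology cases can be sketched roughly as follows. This is not Fluxoscope's actual configuration syntax: the customer names, the prefixes (reserved documentation ranges), the private-use AS numbers, and all function names are hypothetical. The sketch assumes that in the "stub" case customers are identified by address prefix, and in the EBGP case by the neighbor AS number.

```python
import ipaddress

# Hypothetical "stub" configuration: customers share the backbone's AS,
# so they are told apart by address prefix (documentation prefixes).
STUB_CUSTOMERS = {
    "uni-a": ["192.0.2.0/24"],
    "uni-b": ["198.51.100.0/25", "203.0.113.0/24"],
}

# Hypothetical EBGP configuration: each customer is a neighbor AS
# (private-use AS numbers).
EBGP_CUSTOMERS = {64512: "uni-c", 64513: "uni-d"}


def build_prefix_table(config):
    """Flatten the configuration into (network, customer) pairs."""
    table = []
    for customer, prefixes in config.items():
        for prefix in prefixes:
            table.append((ipaddress.ip_network(prefix), customer))
    # Longest prefix first, so the most specific entry wins.
    table.sort(key=lambda entry: entry[0].prefixlen, reverse=True)
    return table


def customer_by_address(table, address):
    """Return the customer owning `address`, or None if it is external."""
    addr = ipaddress.ip_address(address)
    for network, customer in table:
        if addr in network:
            return customer
    return None


def customer_by_as(as_number):
    """Return the customer for a neighbor AS, or None if unknown."""
    return EBGP_CUSTOMERS.get(as_number)
```

A linear scan over the prefix table is adequate for a handful of customers; a production system would more likely use a trie or similar structure for lookups.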
Several graphical and textual representations of the collected traffic statistics are presented, as well as an example of how the system can be used to get to the bottom of anomalies such as denial-of-service attacks launched from compromised systems within a university.
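One way such an investigation could proceed is by successively narrowing an aggregated traffic matrix: first by customer, then by protocol, then by port. The Python sketch below illustrates the idea only. The dimension names, the record layout, and the sample figures are assumptions, not Fluxoscope's data model or measured data.

```python
from collections import defaultdict

# Assumed aggregation dimensions; the real system may use others.
DIMENSIONS = ("customer", "peer", "protocol", "port")


def aggregate(records):
    """Sum octets into a matrix keyed by every dimension."""
    matrix = defaultdict(int)
    for rec in records:
        key = tuple(rec[d] for d in DIMENSIONS)
        matrix[key] += rec["octets"]
    return matrix


def drill_down(matrix, group_by, **fixed):
    """Total octets per value of `group_by`, restricted to cells
    whose other dimensions match the `fixed` keyword arguments."""
    idx = DIMENSIONS.index(group_by)
    result = defaultdict(int)
    for key, octets in matrix.items():
        if all(key[DIMENSIONS.index(d)] == v for d, v in fixed.items()):
            result[key[idx]] += octets
    return dict(result)


# Made-up records for illustration.
records = [
    {"customer": "uni-a", "peer": "peer-x", "protocol": "udp", "port": 53, "octets": 100},
    {"customer": "uni-a", "peer": "peer-x", "protocol": "udp", "port": 7, "octets": 9000},
    {"customer": "uni-a", "peer": "peer-y", "protocol": "tcp", "port": 80, "octets": 500},
    {"customer": "uni-b", "peer": "peer-x", "protocol": "tcp", "port": 80, "octets": 700},
]
matrix = aggregate(records)
by_customer = drill_down(matrix, "customer")
by_protocol = drill_down(matrix, "protocol", customer="uni-a")
by_port = drill_down(matrix, "port", customer="uni-a", protocol="udp")
```

In this toy data set, the per-customer view shows "uni-a" dominating, and two further steps attribute most of its traffic to a single UDP port.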
Router-based per-microflow traffic accounting seems to hit a "sweet spot" with respect to trade-offs of router and analysis performance, data reduction, and range of applications. For situations where full microflow-based accounting isn't deemed possible due to performance limitations, we evaluate alternative methods such as router-based aggregation, accounting based on traffic sampling, separate measurement devices, and policy-based accounting techniques, with respect to their scalability and their applicability to different tasks.
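As a rough illustration of accounting based on traffic sampling, the following Python sketch keeps every n-th packet and scales the sampled octet counts by n to estimate per-customer volume. Deterministic 1-in-n sampling, the function names, and the packet representation are assumptions chosen for simplicity; real routers may sample differently.

```python
def sample_packets(packets, n):
    """Keep every n-th packet (deterministic 1-in-n sampling)."""
    return packets[n - 1::n]


def estimate_octets(sampled, n):
    """Estimate per-customer octets by scaling each sampled packet by n."""
    totals = {}
    for customer, size in sampled:
        totals[customer] = totals.get(customer, 0) + size * n
    return totals


# Made-up traffic: 900 large packets for one customer, 100 small for another.
packets = [("uni-a", 1500)] * 900 + [("uni-b", 40)] * 100
estimate = estimate_octets(sample_packets(packets, 10), 10)
```

With this very regular input the estimate happens to match the true totals exactly. In general it is only an approximation, and the relative error tends to be larger for customers with little traffic, which matters when the figures are used for billing.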