ProtoSphere. Software platform for deep content inspection with the ability to parse an arbitrary network protocol stack
Network traffic analysis is becoming increasingly relevant, driven by the improvement and deployment of new network technologies (VoIP, P2P, streaming video) and by the emergence of numerous application-level protocols used by new network applications. Depending on the particular analysis system and the problem being solved, either offline or online analysis is employed. Online analysis systems include, among others:
- protection systems: firewalls, IDS/IPS, SIEM, DDoS protection systems;
- connection quality assurance (QoS, QoE) and WAN optimization systems;
- policy systems (PCEF, PCRF, NAC).
Problems solved offline, on saved network traces, include:
- development and debugging of protocol parsers for subsequent use in online systems;
- analysis of discrepancies between a protocol specification and a particular implementation of it;
- cybersecurity incident response.
The subproblems that arise in online and offline analysis are similar: protocol identification, parsing of header fields with value extraction, and recovery of high-level objects such as files, websites, and images from network packets. At the same time, the analysis conditions differ significantly:
- online analysis requires high-speed processing of a potentially infinite incoming data flow with limited memory resources;
- offline analysis places only minor constraints on processing speed and resources; the level of detail of the analysis, result visualization, and convenient navigation matter much more.
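The memory constraint of online analysis can be illustrated with a flow table that caps the number of tracked flows. The following is a minimal sketch, not ProtoSphere's actual implementation: the class name `FlowTable`, the `max_flows` parameter, and the least-recently-used eviction policy are assumptions chosen for illustration.

```python
from collections import OrderedDict


class FlowTable:
    """Hypothetical per-flow state store with a hard cap on tracked flows."""

    def __init__(self, max_flows):
        self.max_flows = max_flows
        self._flows = OrderedDict()  # flow key -> state, oldest use first
        self.evicted = 0             # number of flows dropped to stay in budget

    def update(self, key, payload_len):
        """Account one packet for the flow `key` and return the flow state."""
        state = self._flows.get(key)
        if state is None:
            if len(self._flows) >= self.max_flows:
                # Table is full: drop the least recently used flow.
                self._flows.popitem(last=False)
                self.evicted += 1
            state = {"packets": 0, "bytes": 0}
            self._flows[key] = state
        else:
            # Mark the flow as most recently used.
            self._flows.move_to_end(key)
        state["packets"] += 1
        state["bytes"] += payload_len
        return state

    def __len__(self):
        return len(self._flows)
```

With such a cap the memory footprint stays bounded regardless of how long the input stream runs; the price is that the state of an evicted flow is lost and has to be rebuilt if the flow becomes active again. An offline tool could use the same interface with a very large `max_flows`.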
Compatibility between the offline and online systems is an important requirement; a typical use case is checking a new protocol parser or a new signature that was developed in an offline environment against an online data flow. To satisfy this requirement, the ProtoSphere analysis system is structured as shown schematically in figure 1.
Fig. 1 – Summary diagram of the ProtoSphere system.
The core features of the system are:
- handling of packet corruption, loss, reordering, and duplication, as well as asymmetric traffic;
- resource management for maintaining the state of the flows being analyzed;
- support for analysis of encrypted and compressed data;
- support for arbitrary tunnel configurations;
- extraction of transmitted objects from traffic: video/audio streams, web pages, files.
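To make the first feature in the list above more concrete, the sketch below reassembles one direction of a byte stream from segments that may arrive out of order, duplicated, or overlapping. It is an illustration under simplifying assumptions rather than the system's algorithm: the class `StreamReassembler` is hypothetical, sequence numbers are treated as plain integers without wraparound, and a lost segment simply leaves a gap that stalls delivery.

```python
class StreamReassembler:
    """Hypothetical in-order reassembly of one direction of a byte stream."""

    def __init__(self, initial_seq):
        self.next_seq = initial_seq  # sequence number of the next expected byte
        self.pending = {}            # seq -> bytes received ahead of next_seq
        self.output = bytearray()    # contiguous data delivered so far

    def add(self, seq, data):
        """Accept one segment; deliver whatever has become contiguous."""
        if seq + len(data) <= self.next_seq:
            return  # pure duplicate: everything was already delivered
        if seq < self.next_seq:
            # Partial overlap with delivered data: keep only the new tail.
            data = data[self.next_seq - seq:]
            seq = self.next_seq
        # For segments starting at the same offset, keep the longer one.
        if len(data) > len(self.pending.get(seq, b"")):
            self.pending[seq] = data
        self._drain()

    def _drain(self):
        while self.pending:
            seq = min(self.pending)
            if seq > self.next_seq:
                break  # gap: wait for the missing segment
            data = self.pending.pop(seq)
            skip = self.next_seq - seq
            if skip < len(data):
                self.output += data[skip:]
                self.next_seq = seq + len(data)
```

For example, feeding the segment at offset 105 before the one at offset 100 produces no output until the earlier segment arrives, after which both are delivered in order. A production reassembler would additionally bound the size of `pending` and decide what to do when a gap is never filled.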
In offline analysis, a critical factor that affects speed and efficiency is the availability and usability of GUI components that visualize different aspects of network interactions. The core GUI components are:
- a network packet list, where each packet is represented as a parse tree with headers from different protocol levels;
- a flow tree that takes the nesting level into account;
- a data dump window showing the content of the selected object;
- a list of the elements that make up the selected object (e.g. IP fragments for an IP packet);
- a window of the objects that include the selected object as a part (e.g. the TCP flow for a TCP packet).
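The per-packet parse tree mentioned in the first item can be sketched as a chain of header parsers, each of which names the next protocol. The code below is a simplified illustration and not ProtoSphere's parser interface: the function names, the `DISPATCH` table, and the layer dictionary layout are assumptions, only Ethernet, IPv4, and UDP are covered, and truncated packets are not handled. The header layouts and the protocol numbers (EtherType 0x0800 for IPv4, IP protocol 17 for UDP, IP protocol 4 for IPv4-in-IPv4) follow the respective standards.

```python
import struct


def parse_ethernet(buf):
    dst, src, ethertype = struct.unpack_from("!6s6sH", buf, 0)
    fields = {
        "dst": ":".join(f"{b:02x}" for b in dst),
        "src": ":".join(f"{b:02x}" for b in src),
        "ethertype": ethertype,
    }
    return fields, 14, ("ethertype", ethertype)


def parse_ipv4(buf):
    (ver_ihl, _tos, total_len, _ident, _frag, ttl, proto, _csum,
     src, dst) = struct.unpack_from("!BBHHHBBH4s4s", buf, 0)
    header_len = (ver_ihl & 0x0F) * 4  # IHL is counted in 32-bit words
    fields = {
        "src": ".".join(map(str, src)),
        "dst": ".".join(map(str, dst)),
        "ttl": ttl,
        "protocol": proto,
        "total_length": total_len,
    }
    return fields, header_len, ("ip_proto", proto)


def parse_udp(buf):
    sport, dport, length, _csum = struct.unpack_from("!HHHH", buf, 0)
    return {"sport": sport, "dport": dport, "length": length}, 8, None


# Next-protocol dispatch; the IP-in-IP entry lets tunnels nest to any depth.
DISPATCH = {
    ("ethertype", 0x0800): ("ipv4", parse_ipv4),
    ("ip_proto", 4): ("ipv4", parse_ipv4),
    ("ip_proto", 17): ("udp", parse_udp),
}


def parse_packet(buf):
    """Return the list of layers (outermost first) found in a raw frame."""
    layers = []
    name, parser, offset = "ethernet", parse_ethernet, 0
    while parser is not None:
        fields, header_len, next_key = parser(buf[offset:])
        layers.append({"protocol": name, "offset": offset, "fields": fields})
        offset += header_len
        name, parser = DISPATCH.get(next_key, (None, None))
    # Whatever no parser claimed is kept as an opaque payload layer.
    layers.append({"protocol": "payload", "offset": offset,
                   "fields": {"data": buf[offset:]}})
    return layers
```

Because each parser only reports a key for the next header, supporting a new protocol or tunnel type amounts to adding a parser function and a `DISPATCH` entry; the stored `offset` of every layer is what a dump window would need to highlight the corresponding bytes.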
An important subproblem is localizing one or several network connections for subsequent detailed analysis. Two kinds of graphs are used for this task: the circular network graph and the network flow graph, which details the network interactions of a selected node. In both cases the graph nodes represent the interaction participants, and the edges show flows and their properties. The graphs are presented in figure 2.
Fig. 2 – (a) Circular network graph.
Fig. 2 – (b) Network flow graph.
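The data behind both graphs can be sketched as an aggregation of flow records into nodes and edges. This is a hypothetical illustration: the record format `(src, dst, protocol, byte_count)`, the function names, and the evenly spaced circular layout are assumptions and do not describe how ProtoSphere builds or draws its graphs.

```python
import math
from collections import defaultdict


def build_network_graph(flows):
    """Aggregate (src, dst, protocol, byte_count) records into a graph.

    Nodes are participants; each undirected edge summarizes all flows
    between a pair of participants.
    """
    edges = defaultdict(lambda: {"flows": 0, "bytes": 0, "protocols": set()})
    for src, dst, proto, nbytes in flows:
        edge = edges[tuple(sorted((src, dst)))]
        edge["flows"] += 1
        edge["bytes"] += nbytes
        edge["protocols"].add(proto)
    nodes = sorted({host for pair in edges for host in pair})
    return nodes, dict(edges)


def circular_layout(nodes, radius=1.0):
    """Place nodes evenly on a circle, as in a circular network graph."""
    n = len(nodes)
    return {
        node: (radius * math.cos(2 * math.pi * i / n),
               radius * math.sin(2 * math.pi * i / n))
        for i, node in enumerate(nodes)
    }


def node_view(edges, node):
    """Keep only the edges that touch `node` (detail view of one node)."""
    return {pair: edge for pair, edge in edges.items() if node in pair}
```

Calling `node_view` on the result of `build_network_graph` narrows the full graph down to the interactions of one participant, which corresponds to moving from the circular overview to the detailed view of a selected node.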
When examining a single network interaction or a group of related interactions, a timing diagram is a convenient representation; an example is shown in figure 3.
Fig. 3 – Timing diagram.
Fig. 4 – Event log.
All the described components are synchronized with each other and allow switching to the representation most convenient for the current analysis phase. For example, it is possible to navigate from an event in the log component to the particular network packet in the parse tree and to the field that triggered the event. This use case is most important in the course of interactive development of a parsing module for an undisclosed network protocol.
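One way such navigation could be supported is to have every log event carry a reference to the packet and the field that caused it. The sketch below is an assumption made for illustration, not the system's data model: the `Event` class, its `field_path` attribute, and the layer dictionaries with `protocol`, `offset`, and `fields` keys are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    """Hypothetical log entry that remembers what triggered it."""
    message: str
    packet_index: int   # position of the packet in the packet list
    field_path: tuple   # (protocol name, field name)


def resolve(event, packets):
    """Return (layer offset, field value) for the field behind an event.

    `packets` is a list of parse trees; each parse tree is a list of
    layer dictionaries ordered from the outermost header inward.
    """
    protocol, field = event.field_path
    for layer in packets[event.packet_index]:
        if layer["protocol"] == protocol:
            return layer["offset"], layer["fields"][field]
    raise KeyError(f"packet {event.packet_index} has no {protocol} layer")
```

A GUI built on this idea would use the returned offset to scroll the dump window and the field path to expand the matching node of the parse tree. If the same protocol occurs more than once in a packet, as with tunnels, the path would also need a layer index; this sketch returns the outermost match.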