The Xilinx Virtex II/Pro (V2P) architecture brought a number of interesting hardware advances to mainstream FPGAs. In addition to large amounts of reconfigurable logic, these chips provided one to four embedded PowerPC processor cores and up to twenty high-speed serdes transceivers called RocketIO. While people had been using FPGAs in network applications for years, the V2P made it (relatively) easy to build a fully functional network processor that could interface with commodity LAN hardware. In 2004 I began experimenting with the V2P's networking capabilities. After successfully interfacing a V2P7 to Gigabit Ethernet (GigE), I collaborated with Chris Clark on the development of a network intrusion detection system (NIDS) that could be hosted in an FPGA. This NIDS could filter multiple GigE network connections at full speed in both directions, and used SNORT's collection of pattern matching rules to decide whether a packet was malicious. Our prototype ran on a Xilinx ML300 reference board, which provided four optical GigE ports.
Network Features of the Xilinx Virtex II/Pro
The Xilinx Virtex II/Pro (V2P) was an appealing architecture for network researchers because it had (1) one or more embedded PowerPC cores that could run custom firmware and (2) up to 20 RocketIO transceivers that could speak a number of high-speed serial link standards. Both were hard cores that could run at clock speeds well beyond what the reconfigurable logic could handle. The PowerPC cores had internal instruction and data caches, and could be connected to other hardware devices through IBM's PLB and OPB buses. While building the necessary support hardware and software was a manual process at first (before the EDK matured), mixing hardware and software proved to be extremely useful. In addition to simplifying debug and development efforts, encoding a portion of the work in software improved build times, as software could easily be inserted into a hardware design without triggering a rebuild (FPGA builds took a half hour or more at the time).
The RocketIO modules were a true advancement for network applications because they were heavily parameterized and could be configured to run in many different ways. The RocketIO multi-gigabit transceiver (MGT) used a serializer/deserializer (serdes) unit to convert a stream of parallel bits into a high-speed serial stream (and vice versa) that could be transmitted long distances. The serdes unit had hardware support for a number of key operations: 8b/10b encoding, disparity handling, clock recovery, and CRC generation/validation. The MGT was designed to work with multiple standards (GigE, SATA, InfiniBand, and later PCIe) and used a 10x or 20x clock multiplier to run the PHY at high speeds (line rates up to 3.125 Gb/s for 2.5 Gb/s data links). The MGTs could even be synchronized together for multi-lane standards such as 4x InfiniBand.
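The relationship between line rate and usable data rate follows directly from the 8b/10b coding mentioned above: every 8 bits of data cost 10 bits on the wire. A small sketch (the function name is illustrative, not a Xilinx API):

```python
# Hypothetical helper relating 8b/10b line rate to payload rate.
def payload_rate_gbps(line_rate_gbps: float) -> float:
    """8b/10b sends 10 line bits per 8 data bits, a 25% overhead."""
    return line_rate_gbps * 8 / 10

# GigE: 1.25 Gb/s on the wire carries 1.0 Gb/s of Ethernet data.
assert payload_rate_gbps(1.25) == 1.0
# The MGT's 3.125 Gb/s maximum line rate yields 2.5 Gb/s of payload.
assert payload_rate_gbps(3.125) == 2.5
```

This overhead is why the MGT's 3.125 Gb/s ceiling is quoted alongside a 2.5 Gb/s data link.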
Building a Network Intrusion Detection System
After learning how to configure the MGTs to connect with GigE, I realized the V2P would make an excellent platform for performing low-level inspections and manipulations of LAN traffic. While other NIC chips let you move packets between the host and the wire, the V2P allowed you to precisely control all aspects of packet handling down to the symbols that make up an Ethernet frame. The V2P also offered extremely low latency, as packets were streamed off the wire into user-defined circuitry.
Network Intrusion Detection Systems (NIDS) were an obvious, useful application that could benefit from an FPGA with networking capabilities. In the late 1990s a number of researchers investigated developing customized pattern matching circuits in FPGAs to accelerate the rate at which network packets could be inspected. Most of these efforts took a large corpus of pattern matching rules from the well-known SNORT tool and generated custom FPGA hardware that could perform the work in real time or faster. Chris Clark at Georgia Tech had built a set of software tools to take a rule set and generate a hardware engine as a nondeterministic finite automaton (NFA). I worked with him on integrating his work into a design that could filter multiple GigE lanes at the same time.
Minimal GigE Network Interface
The first step in our NIDS was developing a minimal GigE network interface (NI) that could properly send and receive GigE frames. The system handled the clocking for the RocketIO module and used custom hardware for Ethernet framing, CRC handling, and byte-aligning incoming packets. Packet fifos were used to exchange data between the wire and other components in the NIDS. In order to decouple the analysis hardware's clock frequency from the MGT's, packets were stored in a dual-ported block RAM that used different clock domains for the ingress and egress portions of the fifo. The fifo's operations were coordinated through asynchronous handshaking logic.
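Dual-clock fifos like this commonly pass their read/write pointers across clock domains in Gray code, so that a pointer sampled mid-transition is still valid. The text above only says asynchronous handshaking was used, so treat this as a sketch of the general technique rather than our exact circuit:

```python
# Sketch of the Gray-code pointer trick often used in dual-clock fifos
# (an assumption; the original design is described only as using
# asynchronous handshaking between clock domains).

def bin_to_gray(n: int) -> int:
    """Convert a binary count to its Gray-code equivalent."""
    return n ^ (n >> 1)

# Successive Gray codes differ in exactly one bit, so a pointer
# sampled asynchronously in the other clock domain resolves to
# either the old or the new value -- never an invalid mix of bits.
for i in range(15):
    diff = bin_to_gray(i) ^ bin_to_gray(i + 1)
    assert diff != 0 and diff & (diff - 1) == 0  # exactly one bit set
```

The single-bit-change property is what makes the crossing safe: a metastable sample can only disagree with reality in the one bit that is changing.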
Malicious Packet Detector
The NFA engine that Chris designed divided an incoming packet into header and payload sections before inspecting the contents. The header unit tests for specific field values required by one or more rules (e.g., TCP port 80), while the payload unit searches the bulk data for known bit patterns that match one or more rules. Data is streamed into the system and analyzed one fixed-width word at a time. The input stream's word length can be changed at hardware generation time to adjust throughput: a wider word processes more bits per cycle, but the additional matching hardware can reduce the achievable clock speed. The results of both the header and payload sections feed into a decision engine, which ultimately generates a single "malicious/not malicious" verdict for the packet.
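The essence of the NFA approach is that all patterns are tracked in parallel, with every active match candidate advancing on each input symbol, much as the generated hardware advances every active state each clock. A software sketch of that payload scan (the pattern set and function are illustrative, not output from Chris's tools):

```python
# NFA-style payload scan: every byte may start a new match attempt
# for every pattern, and all partial matches advance in parallel.
def scan(payload: bytes, patterns: list[bytes]) -> set[bytes]:
    """Return the patterns found anywhere in the payload."""
    active: set[tuple[int, int]] = set()  # (pattern index, next position)
    hits: set[bytes] = set()
    for sym in payload:
        candidates = {(i, 0) for i in range(len(patterns))} | active
        active = set()
        for i, pos in candidates:
            if patterns[i][pos] == sym:
                if pos + 1 == len(patterns[i]):
                    hits.add(patterns[i])      # full pattern matched
                else:
                    active.add((i, pos + 1))   # keep this thread alive
    return hits

print(scan(b"GET /etc/passwd HTTP/1.0", [b"/etc/passwd", b"cmd.exe"]))
# -> {b'/etc/passwd'}
```

In hardware the "active set" is just a register per NFA state, so all candidates advance in a single cycle regardless of how many patterns are loaded; the software loop above pays that cost in time instead.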
Filtering Multiple Network Lines
To be cost effective, a NIDS design should allow a single FPGA to process multiple network links at the same time. This is challenging because each link runs at gigabit speeds and carries traffic in both directions that must be filtered. Fortunately, Chris's packet filter was designed to operate at multi-gigabit speeds, so one filter engine could service multiple GigE links. As illustrated below, a scheduler determines which packet to inspect next. Each NI buffers incoming packets and notifies the scheduler that there is new work. Once the filter is available, the scheduler directs the incoming NI to stream the packet into the filter and the outgoing NI at the same time. After the streaming is done and the filter completes its work, the filter tells the outgoing NI whether to transmit the packet or drop it.
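The scheduling described above can be sketched as a round-robin grant of the shared filter across per-link ingress queues. This is a minimal model under assumed names (the original arbitration policy is not specified in the text):

```python
from collections import deque

# Minimal model of the shared-filter scheduler: ingress NIs queue
# packets, and the single filter is granted to one link at a time
# in round-robin order. Names and policy are illustrative.
def schedule(queues: list[deque], filter_fn):
    """Yield (link, packet, verdict) as the filter is granted."""
    link = 0
    while any(queues):
        if queues[link]:
            pkt = queues[link].popleft()
            # The packet streams into the filter and the egress NI
            # together; the verdict arrives afterward and tells the
            # egress NI to transmit (True) or drop (False).
            yield link, pkt, filter_fn(pkt)
        link = (link + 1) % len(queues)

qs = [deque([b"ok"]), deque([b"evil payload"])]
verdict = lambda p: b"evil" not in p
print(list(schedule(qs, verdict)))
# -> [(0, b'ok', True), (1, b'evil payload', False)]
```

Because the filter streams a packet in a single pass, the scheduler only needs to serialize access to it; the per-link fifos absorb bursts while another link holds the grant.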
We made the design parameterized so we could change the number of supported links, and then targeted the ML300 reference board. The ML300 was a high-end developer kit geared towards demonstrating the new PowerPC processors found in the V2P architecture. The board had a small-but-fast V2P7-6 FPGA and many I/O accessories to help it run Linux (DDR memory, a CF slot, a serial port, audio, and a touchscreen). The main component of interest for us, though, was a Stratos Lightwave quad-port optical transceiver block. This unit connected directly to the V2P7's RocketIO modules and provided a convenient way to attach four GigE optical links to the card using standard LC fibers.
In order to test our hardware, we configured the design to route data between ports one and two, and between ports three and four. We then connected computers to ports one and four, and cabled ports two and three together. This arrangement ran all traffic between the two computers through the filter twice, ensuring the filter saw a large volume of traffic. We transferred a large amount of data between the computers, replayed traces of known-malicious packet sequences, and used tcpdump to verify that the malicious packets were filtered out properly.
IJE Paper Chris Clark, Craig Ulmer, and David Schimmel, "An FPGA-based Network Intrusion Detection System with On-Chip Network Interfaces", International Journal of Electronics, Vol. 93, Issue 6, 2006.
ARC Paper Chris Clark and Craig Ulmer, "Network Intrusion Detection Systems on FPGAs with On-Chip Network Interfaces", Applied Reconfigurable Computing, February 2005.
ARC Slides Presentation given at the Applied Reconfigurable Computing Conference.