ACM Home Page
Please provide us with feedback. Feedback
Algorithms to accelerate multiple regular expressions matching for deep packet inspection
Full text PdfPdf (411 KB)
Source Applications, Technologies, Architectures, and Protocols for Computer Communication archive
Proceedings of the 2006 conference on Applications, technologies, architectures, and protocols for computer communications table of contents
Pisa, Italy
SESSION: Hardware table of contents
Pages: 339 - 350  
Year of Publication: 2006
ISBN:1-59593-308-5
Also published in ...
Authors
Sailesh Kumar  Washington University
Sarang Dharmapurikar  Washington University
Fang Yu  University of California, Berkeley
Patrick Crowley  Washington University
Jonathan Turner  Washington University
Sponsors
SIGCOMM: ACM Special Interest Group on Data Communication
ACM: Association for Computing Machinery
Publisher
ACM  New York, NY, USA
Bibliometrics
Downloads (6 Weeks): 36,   Downloads (12 Months): 368,   Citation Count: 14
Additional Information:

abstract   references   cited by   index terms   collaborative colleagues  

Tools and Actions: Review this Article  
Save this Article to a Binder    Display Formats: BibTex  EndNote ACM Ref   
DOI Bookmark: Use this link to bookmark this Article: http://doi.acm.org/10.1145/1159913.1159952
What is a DOI?

ABSTRACT

There is a growing demand for network devices capable of examining the content of data packets in order to improve network security and provide application-specific services. Most high performance systems that perform deep packet inspection implement simple string matching algorithms to match packets against a large, but finite set of strings. owever, there is growing interest in the use of regular expression-based pattern matching, since regular expressions offer superior expressive power and flexibility. Deterministic finite automata (DFA) representations are typically used to implement regular expressions. However, DFA representations of regular expression sets arising in network applications require large amounts of memory, limiting their practical application.In this paper, we introduce a new representation for regular expressions, called the Delayed Input DFA (D2FA), which substantially reduces space equirements as compared to a DFA. A D2FA is constructed by transforming a DFA via incrementally replacing several transitions of the automaton with a single default transition. Our approach dramatically reduces the number of distinct transitions between states. For a collection of regular expressions drawn from current commercial and academic systems, a D2FA representation reduces transitions by more than 95%. Given the substantially reduced space equirements, we describe an efficient architecture that can perform deep packet inspection at multi-gigabit rates. Our architecture uses multiple on-chip memories in such a way that each remains uniformly occupied and accessed over a short duration, thus effectively distributing the load and enabling high throughput. Our architecture can provide ostffective packet content scanning at OC-192 rates with memory requirements that are consistent with current ASIC technology.


REFERENCES

Note: OCR errors may be found in this Reference List extracted from the full text article. ACM has opted to expose the complete List rather than only correct and linked references.

1
 
2
 
3
J. Hopcroft, "An nlogn algorithm for minimizing states in a finite automaton," in Theory of Machines and Computation, J. Kohavi, Ed. New York: Academic, 1971, pp. 189--196.
 
4
Bro: A System for Detecting Network Intruders in Real-Time. http://www.icir.org/vern/bro-info.html
 
5
6
7
 
8
 
9
S. Wu, U. Manber," A fast algorithm for multi-pattern searching," Tech. R. TR-94-17, Dept. of Comp. Science, Univ of Arizona, 1994.
 
10
Fang Yu, et al., "Fast and Memory-Efficient Regular Expression Matching for Deep Packet Inspection", UCB tech. report, EECS-2005-8.
 
11
N. Tuck, T. Sherwood, B. Calder, and G. Varghese, "Deterministic memory-efficient string matching algorithms for intrusion detection," IEEE Infocom 2004, pp. 333--340.
12
 
13
 
14
S. Yusuf and W. Luk, "Bitwise Optimised CAM for Network Intrusion Detection Systems," IEEE FPL 2005.
 
15
 
16
C. R. Clark and D. E. Schimmel, "Efficient reconfigurable logic circuit for matching complex network intrusion detection patterns," In Proceedings of 13th International Conference on Field Program.
 
17
18
 
19
Scott Tyler Shafer, Mark Jones, "Network edge courts apps," http://infoworld.com/article/02/05/27/020527newebdev_1.html
 
20
TippingPoint X505, www.tippingpoint.com/products_ips.html
 
21
Cisco IOS IPS Deployment Guide, www.cisco.com
 
22
Tarari RegEx, www. tarari.com/PDF/RegEx_FACT_SHEET.pdf
 
23
Cu-11 standard cell/gate array ASIC, IBM. www.ibm.com
 
24
Virtex-4 FPGA, Xilinx. www.xilinx.com
 
25
N.J. Larsson, "Structures of string matching and data compression," PhD thesis, Dept. of Computer Science, Lund University, 1999 .
 
26
S. Dharmapurikar, P. Krishnamurthy, T. Sproull, and J. Lockwood, "Deep Packet Inspection using Parallel Bloom Filters," IEEE Hot Interconnects 12, August 2003. IEEE Computer Society Press.
 
27
Z. K. Baker, V. K. Prasanna, "Automatic Synthesis of Efficient Intrusion Detection Systems on FPGAs," in Field Prog. Logic and Applications, Aug. 2004, pp. 311--321.
 
28
 
29
 
30
J. Levandoski, E. Sommer, and M. Strait, "Application Layer Packet Classifier for Linux". http://l7-filter.sourceforge.net/.
 
31
"MIT DARPA Intrusion Detection Data Sets," http://www.ll.mit.edu/IST/ideval/data/2000/2000_data_index.html.
 
32
Vern Paxson et al., "Flex: A fast scanner generator,"http://www.gnu.org/software/flex/
 
33
SafeXcel Content Inspection Engine, hardware regex acceleration IP.
 
34
Network Services Processor, OCTEON CN31XX, CN30XX Family.
 
35
R. Prim, "Shortest connection networks and some generalizations,"Bell System Technical Journal, 36:1389--1401, 1957.
 
36
J. B. Kruskal, "On the shortest spanning subtree of a graph and the traveling salesman problem," Proc. of the American Mathematical Society, 7:48--50, 1956.
 
37
F. M. Liang. A lower bound for on-line bin packing. In Information Processing letters, pages 76--79, 1980.
 
38
Will Eatherton, John Williams, "An encoded version of reg-ex database from cisco systems provided for research purposes".
 
39
Garey, M. R., and Johnson, D. S., "Bounded Component Spanning Forest", pp 208, Computers and Intractability: A Guide to the Theory of NP-Completeness, 1979.

CITED BY  14
 
 

Collaborative Colleagues:
Sailesh Kumar: colleagues
Sarang Dharmapurikar: colleagues
Fang Yu: colleagues
Patrick Crowley: colleagues
Jonathan Turner: colleagues