Information Technology Reference
In-Depth Information
An Ecient Update Mechanism for GPU-Based
IP Lookup Engine Using Threaded Segment Tree
Yanbiao Li
1
, Dafang Zhang
1
, Gaogang Xie
2
, Jintao Zheng
1
,andWeiZhao
1
1
College of Information Science and Engineering, Hunan University,
Changsha, 410082, China
2
Institute of Computing Technology, Chinese Academy of Sciences,
Beijing, 100190, China
Abstract.
Recently, the Graphics Processing Unit (GPU) has been used
to deploy high-speed software routers. On this platform, designing an ef-
ficient IP lookup engine is still a challenging task, especially when taking
into account the comprehensive performance under frequent updates. Ex-
isting solutions either fail in dealing with update overhead, or can not
provide stable throughput. In this paper, we first propose the Threaded
Segment Tree, a novel tree-like structure. On this basis, we then present
a fast IP lookup engine with an ecient parallel update mechanism.
According to our experiment results on real-world data, the proposed
mechanism reduces the memory accesses on the GPU and the overall
update overhead by at least 82
.
5% and 89
.
6% respectively. Moreover, it
also ensures the lookup engine provides stable throughput under highly
frequent updates, which only decreases by less than 1% even though
update frequency increases to 100
,
000
updates/s
.
Keywords:
GPU, IP lookup, parallel update, segment tree.
1 Introduction
IP address lookup, as a key function of modern routers for packets forward-
ing and classifying, aims to determine a proper next hop for each incoming
packet, through comparing its destination address against all prefixes stored in
the Forwarding Information Base (FIB). It is always modeled as a Longest Prefix
Matching (LPM) problem.
1.1 Summarize of Prior Arts
Classic solutions to LPM fall into two major categories. Hardware-based solu-
tions always provide very fast lookup [1,2], but their low flexibilities and high
consumptions on power and cost make them unadaptable to large tables, or to
growing requirements on scalability. By contrast, software-based solutions are
proved more flexible due to some tree-like data structures [3-5]. But processing
LPM on them requires multiple memory accesses for one lookup. Though op-
timized by many techniques [6, 7], their performance are still dicult to meet
today's link speed.