Information Technology Reference
In-Depth Information
Algorithm 3. Insert Segment
Input : curNode,insSeg
if curNode.seg = insSeg then
1
curNode.seg.value = DEFAULT ; curNode.isPrefixSeg = FALSE ;
2
LeafPushing ( curNode,nextHop );
3
end
4
else
5
if curNode.isLeaf then
6
BreakSegment ( curNode );
7
end
8
mid =( curNode.seg.low + curNode.seg.high ) / 2;
9
if mid < insSeg.low then
10
InsertSegment ( curNode.rightChild, insSeg );
11
end
12
if mid ≥ insSeg.high then
13
InsertSegment ( curNode.leftChild, insSeg );
14
end
15
end
16
448 Cores) on the basis of CUDA 5.0. We measure all concerned metrics through
the NVIDIA Visual Profiler [17].
Tabl e 1. Routing Data Sets (Collected at Jan. 1, 2013)
Data Set Location
Total Prefixes Total Updates
rrc11
New York (NY), USA 442,176
1,177,425
rrc12
Frankfurt, Germany 450,752
4,049,260
rrc13
Moscow, Russia
456,580
2,025,239
rrc14
Palo Alto, USA
446,160
1,388,217
4.1 Memory Accesses Required for On-line Updates
To evaluate the performance of route update, we replay a week's update traces
of rrc 12 and a whole day's update traces for all tables. After processing off-line
updates in TST, we get a list of threaded segments, each of which represents
several memory writes toward a range of units of the TBL24 on the GPU. While
for GALE, due to the length map checking, it requires both memory reads and
writes on the GPU to finish on-line updates. We evaluate the performance of
on-line updates by measuring the average produced global memory accesses per
update on the GPU.
As depicted in Fig. 5(a), the average produced memory accesses in TSTT are
far less than that in GALE in all cases. In fact, TSTT achieves a reduction by
92 . 5%
98 . 1%, which is still 82 . 5%
97 . 3% even ignoring memory reads in
 
Search WWH ::




Custom Search