Algorithm 1. Grouping Data Units

Input: a set of query terms T, a data section block B
Output: a set of data records R

 1: Set R, a set of leaf nodes N_l, a set of starting leaf nodes N_s, a set of data unit groups G, a set of leaf node groups G_l, and a set of horizontally expanded data unit groups G_all to {}
 2: Add every text node in B to N_l
 3: for every leaf node n_l ∈ N_l do
 4:     if n_l contains a query term t ∈ T then
 5:         Add n_l to N_s
 6:         Remove n_l from N_l
 7: for every starting leaf node n_s ∈ N_s do
 8:     Set a data unit group g to {n_s}
 9:     for every leaf node n_l ∈ N_l do
10:         if n_l is horizontally aligned with n_s then
11:             Add n_l to g
12:             Remove n_l from N_l
13:     Add g to G
14: repeat
15:     Remove a leaf node n_l from N_l
16:     Set a leaf node group g_l = {n_l}
17:     for each leaf node n_l' ∈ N_l do
18:         if n_l' is horizontally aligned with n_l then
19:             Add n_l' to g_l
20:             Remove n_l' from N_l
21:     Add g_l to G_l
22: until N_l = {}
23: repeat
24:     Remove a data unit group g from G
25:     for each data unit group g' ∈ G do
26:         if g' is horizontally aligned with g then
27:             Set g to g ∪ g'
28:             Remove g' from G
29:     Add g to G_all
30: until G = {}
31: repeat
32:     Remove a horizontally expanded data unit group g from G_all
33:     for each horizontally expanded data unit group g' ∈ G_all do
34:         if g' is vertically adjacent to g then
35:             Set g to g ∪ g'
36:             Remove g' from G_all
37:     for each leaf node group g_l ∈ G_l do
38:         if g_l is vertically adjacent to g then
39:             Set g to g ∪ g_l
40:             Remove g_l from G_l
41:     Add g to R
42: until G_all = {}
43: Return R
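
The four phases of Algorithm 1 can be sketched in Python as follows. This is an illustrative sketch, not the authors' implementation. The Leaf class, the helper names, and the max_gap parameter are hypothetical. The listing does not define "horizontally aligned" or "vertically adjacent", so the sketch assumes that two nodes are horizontally aligned when their vertical extents overlap, and that two groups are vertically adjacent when the vertical gap between them is at most max_gap. Query terms are matched case-insensitively, which is also an assumption. The repeat ... until loops are written as while loops so that an empty set does not cause an error.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Leaf:
    """A rendered text node with a hypothetical vertical extent (e.g. pixels)."""
    text: str
    top: float
    bottom: float


def h_aligned(a, b):
    # Assumption: two leaves share a row when their vertical extents overlap.
    return a.top < b.bottom and b.top < a.bottom


def span(group):
    # Vertical extent covered by all leaves of a group.
    return min(n.top for n in group), max(n.bottom for n in group)


def groups_h_aligned(g1, g2):
    (t1, b1), (t2, b2) = span(g1), span(g2)
    return t1 < b2 and t2 < b1


def v_adjacent(g1, g2, max_gap):
    # Assumption: adjacent when the vertical gap between groups is <= max_gap.
    (t1, b1), (t2, b2) = span(g1), span(g2)
    return max(t2 - b1, t1 - b2) <= max_gap


def merge_into(g, candidates, test):
    """Merge every candidate passing test(candidate, g) into g; return g and the rest."""
    rest = []
    for c in candidates:
        if test(c, g):
            g = g + c
        else:
            rest.append(c)
    return g, rest


def group_data_units(terms, leaves, max_gap=5.0):
    # Steps 2-6: separate starting nodes (those containing a query term).
    lowered = [t.lower() for t in terms]
    starts = [n for n in leaves if any(t in n.text.lower() for t in lowered)]
    rest = [n for n in leaves if n not in starts]

    # Steps 7-13: grow a data unit group along the row of each starting node.
    G = []
    for s in starts:
        G.append([s] + [n for n in rest if h_aligned(n, s)])
        rest = [n for n in rest if not h_aligned(n, s)]

    # Steps 14-22: group the remaining leaves row by row.
    G_l = []
    while rest:
        first = rest.pop(0)
        G_l.append([first] + [n for n in rest if h_aligned(n, first)])
        rest = [n for n in rest if not h_aligned(n, first)]

    # Steps 23-30: merge data unit groups lying on the same row.
    G_all = []
    while G:
        g, G = merge_into(G.pop(0), G, groups_h_aligned)
        G_all.append(g)

    # Steps 31-42: expand vertically, absorbing adjacent groups into a record.
    R = []
    while G_all:
        g = G_all.pop(0)
        adjacent = lambda c, cur: v_adjacent(c, cur, max_gap)
        g, G_all = merge_into(g, G_all, adjacent)
        g, G_l = merge_into(g, G_l, adjacent)
        R.append(g)
    return R
```

As in the listing, each group is compared with the others in a single pass, so a group that only becomes adjacent after a later merge is not revisited. Leaf node groups that are never adjacent to a data unit group (a page footer, for example) are left out of R.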