Database Reference
In-Depth Information
C
0
)
.
As an example, consider the following configurations over a database with a
single table
R
:
max
size
schedule using no more than
B
storage, where
B
≥
(
C
f
),
size
(
C
0
={
R
(
c
,
d
),
R
(
e
)
}
C
f
={
R
(
a
,
b
),
R
(
a
,
c
),
R
(
c
),
R
(
e
)
}
A simple schedule first removes all indexes in
(
C
0
−
C
f
)
and then creates the
indexes in
(
C
f
−
C
0
)
, that is:
[drop
R
(
c
,
d
)
, create
R
(
a
,
b
)
, create
R
(
a
,
c
)
, create
R
(
c
)
]
The following schedule, however, can be more ecient than the previous one
(assuming that the indexes at each intermediate step fit in the storage con-
straint):
[create
R
(
c
)
, drop
R
(
c
,
d
)
, create
R
(
a
,
b
)
, create
R
(
a
,
c
)
]
In this case, we use
R
(
c
,
d
)
to avoid sorting table
R
in
c
order to create
R
(
c
)
, thus saving time. However, in this case we need to be able to store
simultaneously. There might be more ecient schedules that
create
additional
intermediate indexes not in
R
(
c
)
and
R
(
c
,
d
)
(
C
f
−
C
0
)
. Consider the following
schedule:
[create
R
(
c
)
, drop
R
(
c
,
d
)
, create
R
(
a
,
b
,
c
)
,
(
,
)
(
,
)
(
,
,
)
create
R
a
b
, create
R
a
c
, drop
R
a
b
c
]
In this schedule, we create a temporary index
R
(
a
,
b
,
c
)
=
R
(
a
,
b
)
⊕
R
(
a
,
c
)
before creating
R
(
a
,
b
)
and
R
(
a
,
c
)
. Therefore, we need to perform a full sort
only once, and then
R
(
a
,
b
)
and
R
(
a
,
c
)
can be built by scanning
R
(
a
,
b
,
c
)
and doing a minor sort on column
c
for the case of
R
(
a
,
c
)
(if
a
isakey,no
minor sorting is required for
R
)
). The general physical design scheduling
problem can be defined as follows. Given configurations
C
0
and
C
f
and a
space constraint
B
, obtain a schedule
(
(
a
,
c
s
1
,
s
2
,... ,
s
n
)
that transforms
C
0
into
C
f
such that:
1. Each
s
i
drops an existing index or creates a new index in
closure(
C
f
)
,
as defined in Section 4.3.
2. The size of each intermediate configuration is bounded by
B
.
3. The cost of implementing
(
s
1
,
s
2
,... ,
s
n
)
is minimized.
The two main challenges of the physical design scheduling problem are the
explosion in the search space due to the ability to add elements in the closure
of
C
f
and the presence of the space constraint
B
, which invalidates more
obvious approaches based on topological orders. We now show an interesting
property that connects the scheduling problem with a shortest-path algorithm
in an induced graph.
Search WWH ::
Custom Search