Database Reference
In-Depth Information
C 0 ) .
As an example, consider the following configurations over a database with a
single table R :
max size
schedule using no more than B storage, where B
(
C f ),
size
(
C 0
={
R
(
c
,
d
),
R
(
e
) }
C f
={
R
(
a
,
b
),
R
(
a
,
c
),
R
(
c
),
R
(
e
) }
A simple schedule first removes all indexes in
(
C 0
C f )
and then creates the
indexes in
(
C f
C 0 )
, that is:
[drop R
(
c
,
d
)
, create R
(
a
,
b
)
, create R
(
a
,
c
)
, create R
(
c
)
]
The following schedule, however, can be more ecient than the previous one
(assuming that the indexes at each intermediate step fit in the storage con-
straint):
[create R
(
c
) , drop R
(
c
,
d
) , create R
(
a
,
b
) , create R
(
a
,
c
) ]
In this case, we use R
(
c
,
d
)
to avoid sorting table R in c order to create
R
(
c
)
, thus saving time. However, in this case we need to be able to store
simultaneously. There might be more ecient schedules that
create additional intermediate indexes not in
R
(
c
)
and R
(
c
,
d
)
(
C f
C 0 )
. Consider the following
schedule:
[create R
(
c
)
, drop R
(
c
,
d
)
, create R
(
a
,
b
,
c
)
,
(
,
)
(
,
)
(
,
,
)
create R
a
b
, create R
a
c
, drop R
a
b
c
]
In this schedule, we create a temporary index R
(
a
,
b
,
c
) =
R
(
a
,
b
)
R
(
a
,
c
)
before creating R
(
a
,
b
)
and R
(
a
,
c
)
. Therefore, we need to perform a full sort
only once, and then R
(
a
,
b
)
and R
(
a
,
c
)
can be built by scanning R
(
a
,
b
,
c
)
and doing a minor sort on column c for the case of R
(
a
,
c
) (if a isakey,no
minor sorting is required for R
) ). The general physical design scheduling
problem can be defined as follows. Given configurations C 0 and C f and a
space constraint B , obtain a schedule (
(
a
,
c
s 1 ,
s 2 ,... ,
s n ) that transforms C 0 into
C f such that:
1. Each s i drops an existing index or creates a new index in closure(
C f ) ,
as defined in Section 4.3.
2. The size of each intermediate configuration is bounded by B .
3. The cost of implementing
(
s 1 ,
s 2 ,... ,
s n )
is minimized.
The two main challenges of the physical design scheduling problem are the
explosion in the search space due to the ability to add elements in the closure
of C f and the presence of the space constraint B , which invalidates more
obvious approaches based on topological orders. We now show an interesting
property that connects the scheduling problem with a shortest-path algorithm
in an induced graph.
Search WWH ::




Custom Search