Chemistry Reference
In-Depth Information
of that first invariant. If we would stop at this stage, symmetry comparisons need
only be done within each of the P groups. Assuming roughly even distribution
of graphs into the P groups, each group contains on the order of N/P graphs.
The work of symmetry comparisons now scales like ( N/P ) 2 within each group.
After separation into P groups, the work of symmetry comparison scales like
P ( N/P ) 2
N 2 /P , an improvement by a factor P in efficiency.
Instead of using just one invariant to sort the graphs into smaller subsets, imagine
using m different invariants to sort the graphs m times. For simplicity, we will
assume that the graphs are sorted into P equal piles according to each new invariant.
After employing the first invariant, the graphs are divided into P groups. Then, the
next invariant divides each of those groups into P subgroups, making a total of P 2
groups. Finally, after m such sorts, the graphs are partitioned into P m
=
groups of
size
N
P m
n
=
(40)
The goal is to reduce the groups to a target size n , which is small enough to employ
a conventional symmetry comparison method. From Eq. (40), the number of sorts
required to reach a target size n is
ln( N/n )
ln P
m
=
(41)
Each time the graphs are sorted, each of the graphs is either labeled or moved to
another location in memory or on disk. Hence, the cost of sorting is proportional
to mN .
N ln( N/n )
ln P
mN
=
(42)
Associating a coefficient A with the computational cost of sorting the graphs into
groups of target size n , and another coefficient B associated with
( n 2 ) con-
ventional symmetry comparisons within each of N/n groups, the total work of
eliminating symmetry related graphs scales like
O
Bn
N
AN ln( N/n )
ln P
B N
A
ln P N ln N
A ln n
ln P
n n 2
+
=
+
(43)
The total work contains components that scale as N ln N and as N , far more
efficient than conventional symmetry comparison.
We arrived at N ln N scaling by assuming that each sort breaks the graphs into
P groups of equal size. Actual computations are more complicated. The number of
groups into which the graphs are sorted is the number of different values an invariant
takes over the set of graphs. This varies from invariant to invariant. Moreover, the
graphs are, in general, broken into groups of unequal size. Therefore, the parameter
 
Search WWH ::




Custom Search