Biology Reference
In-Depth Information
Fig. 12.17 The genotypic similarity vs. phenotypic distance (GSvPD) plots of various metabolic
pathways. The bottom two panels involve the cytoskeleton RNAs. The 56 RNA molecules in the
left bottom panel are the unfiltered original data. The 26 RNA molecules in the right bottom panel
were selected because their coefficients of variation, defined as (standard deviation/mean) × 100,
are less than 50%. Evidently the filtering had little effect on the distribution pattern. To find the
diagonal line objectively, five points with the greatest phenotypic distances (i.e., y coordinates)
and five points with the greatest genotypic similarity values (i.e., x coordinates) were selected.
From these two sets of points, 25 (= 5 × 5) candidate diagonal lines were generated by
connecting all possible pairs of x and y coordinates. Then the rest of the points were run through
a distance formula to find their distance from each of the 25 diagonals. The 10-15% of the points
closest to each diagonal were then selected, and a regression line was fitted through them.
The median of the resulting regression lines was chosen as the “true” candidate diagonal,
which has 80-90% of the points below it (I thank Mr. Kenneth So for developing this algorithm).
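
The diagonal-finding steps described in the caption can be sketched in Python as follows. This is only an illustrative reading of the caption, not the original implementation: the function name find_candidate_diagonal and its parameters are hypothetical, "connecting all possible pairs" is assumed to mean joining each of the five highest-y points to each of the five highest-x points, the "distance formula" is assumed to be the perpendicular point-to-line distance, and the "median" of the regression lines is assumed to be the median slope paired with the median intercept.

```python
import numpy as np


def find_candidate_diagonal(x, y, n_extreme=5, frac_closest=0.15):
    """Return (slope, intercept, fraction_below) for a GSvPD-style scatter.

    x: genotypic similarity values; y: phenotypic distances.
    All names and defaults here are assumptions made for illustration.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)

    # Step 1: the points with the greatest y values and the greatest x values.
    top_y = np.argsort(y)[-n_extreme:]
    top_x = np.argsort(x)[-n_extreme:]

    # "The rest of the points": everything outside the two extreme sets.
    rest = np.setdiff1d(np.arange(x.size), np.union1d(top_x, top_y))
    if rest.size < 2:
        raise ValueError("not enough points left to fit a regression line")
    n_keep = max(2, int(round(frac_closest * rest.size)))

    fits = []
    # Step 2: up to n_extreme * n_extreme candidate lines (25 by default).
    for i in top_y:
        for j in top_x:
            dx, dy = x[j] - x[i], y[j] - y[i]
            norm = np.hypot(dx, dy)
            if norm == 0.0:
                continue  # the same point appears in both extreme sets

            # Step 3: perpendicular distance of each remaining point
            # to the line through points i and j.
            dist = np.abs(dy * (x[rest] - x[i]) - dx * (y[rest] - y[i])) / norm

            # Step 4: least-squares line through the closest 10-15% of points.
            nearest = rest[np.argsort(dist)[:n_keep]]
            if x[nearest].max() == x[nearest].min():
                continue  # vertical cluster: y cannot be regressed on x
            slope, intercept = np.polyfit(x[nearest], y[nearest], 1)
            fits.append((slope, intercept))

    if not fits:
        raise ValueError("no usable candidate diagonal was found")

    # Step 5 (assumption): component-wise median of the fitted lines.
    slope, intercept = np.median(np.array(fits), axis=0)

    # Fraction of all points lying on or below the chosen diagonal.
    fraction_below = float(np.mean(y <= slope * x + intercept))
    return float(slope), float(intercept), fraction_below
```

Calling find_candidate_diagonal(x, y) on the similarity and distance values of one panel returns the slope and intercept of the chosen line together with the fraction of points on or below it, which can then be compared with the 80-90% range quoted in the caption.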