How do you find patterns in these large data sets? How do you understand changes
over time? An even bigger challenge is: how do you integrate these data types
together?
22.3.3 Visualization for Comparative Genomics
The visual convention for looking at gene expression data from microarrays is almost
universally the heatmap display, where the data is laid out in a matrix and each value
is encoded with a color [ 10 , 52 , 53 ]. Heatmaps are often augmented with clustering
algorithms to enhance the perception of trends in the data [ 9 ], as in Java TreeView [ 38 ]
and the Hierarchical Clustering Explorer [ 39 ]. These visualizations have a very high
data density, allowing many data points to be viewed at once. The use of
color, however, makes fine-scale analysis difficult due to perceptual limitations [ 7 ]
as well as the relativity of color [ 54 ].
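As a rough illustration of the two ideas above (color encoding of a matrix, and reordering rows by clustering), the following is a minimal sketch in plain Python. The function names, the toy expression values, the blue-white-red scale, and the naive average-linkage clustering are all our own assumptions for illustration; they are not taken from Java TreeView or the Hierarchical Clustering Explorer.

```python
# Minimal heatmap sketch. All names and data are illustrative assumptions.

def value_to_rgb(value, vmin, vmax):
    """Map a value onto a blue-white-red scale centered on the range midpoint."""
    if vmax == vmin:
        return (255, 255, 255)
    t = (value - vmin) / (vmax - vmin)   # normalize to [0, 1]
    t = min(1.0, max(0.0, t))
    if t < 0.5:
        c = int(round(255 * (t / 0.5)))  # blue -> white
        return (c, c, 255)
    c = int(round(255 * ((1.0 - t) / 0.5)))  # white -> red
    return (255, c, c)

def euclidean(a, b):
    """Euclidean distance between two equal-length rows."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

def cluster_order(rows):
    """Order row indices by naive average-linkage agglomerative clustering."""
    clusters = [[i] for i in range(len(rows))]
    while len(clusters) > 1:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                # Average pairwise distance between the two clusters.
                d = sum(euclidean(rows[a], rows[b])
                        for a in clusters[i] for b in clusters[j])
                d /= len(clusters[i]) * len(clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        merged = clusters[i] + clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return clusters[0] if clusters else []

def heatmap_colors(matrix, order=None):
    """Return a matrix of RGB tuples, with rows optionally reordered."""
    flat = [v for row in matrix for v in row]
    vmin, vmax = min(flat), max(flat)
    order = range(len(matrix)) if order is None else order
    return [[value_to_rgb(v, vmin, vmax) for v in matrix[i]] for i in order]

# Hypothetical expression values: rows are genes, columns are conditions.
expression = [
    [1.0, 2.0, 3.0],
    [-1.0, -2.0, -3.0],
    [1.1, 2.1, 2.9],
    [-0.9, -2.2, -3.1],
]
order = cluster_order(expression)
colors = heatmap_colors(expression, order)
```

Placing similar rows next to each other is what makes trends visible as contiguous blocks of color. A production tool would use a more efficient clustering algorithm and a perceptually motivated color scale.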
Tools designed to visualize molecular networks usually focus on showing the
topological structure of the graph. In these systems, networks are most often
visualized using a node-link graph, as in Cytoscape [ 40 ], MicrobesOnline [ 1 ], and
iPath [ 23 ]. These systems support the visualization of an additional dimension of
data by colormapping values on the nodes and edges of the graph. Other recently
developed tools support the visualization of an entire set of values, for example a full
time series, for each node and edge using techniques like small multiples [ 48 ],
animation, or glyphs. Example tools are Cerebral [ 2 ], Pathway Tools [ 18 ], VANTED [ 16 ],
PathwayExplorer [ 31 ], and GENeVis II [ 3 ].
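To make the distinction between the two approaches concrete, the sketch below colormaps a single value per node, and then builds small multiples by repeating the same graph once per time point. The toy network, the gene names, the values, and the gray-level colormap are hypothetical; none of this reflects the internal data model of the tools cited above.

```python
# Hypothetical toy network; node names, edges, and values are made up.
edges = [("geneA", "geneB"), ("geneB", "geneC"), ("geneA", "geneC")]
time_series = {
    "geneA": [0.1, 0.5, 0.9],
    "geneB": [0.8, 0.4, 0.2],
    "geneC": [0.5, 0.5, 0.5],
}

def gray_level(value, vmin=0.0, vmax=1.0):
    """Map a value to a 0-255 gray level (a simple sequential colormap)."""
    t = (value - vmin) / (vmax - vmin)
    t = min(1.0, max(0.0, t))
    return int(round(255 * t))

def node_colors_at(time_index):
    """One additional dimension: color each node by its value at one time point."""
    return {node: gray_level(values[time_index])
            for node, values in time_series.items()}

def small_multiples():
    """A full series: one colored copy of the same graph per time point."""
    n_steps = len(next(iter(time_series.values())))
    return [{"edges": edges, "node_colors": node_colors_at(t)}
            for t in range(n_steps)]

panels = small_multiples()
```

Each entry in `panels` shares the same topology and differs only in node color, which is the defining property of small multiples. Animation would show the same panels in sequence rather than side by side.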
22.3.4 Case Study: Pathline
We collaborated with a group of biologists who are pioneering the new field of
comparative functional genomics. This field extends the questions of functional
genomics to understand how gene interactions vary across species. Our collaborators
are interested in understanding how evolutionary mechanisms affect gene regulation
for metabolism in yeast. To probe these questions, the biologists collected data for
multiple genes, at multiple time points, and in multiple related species. They need
to integrate this data in order to find patterns in gene expression levels belonging to
multiple pathways over time and across multiple species. When we started working
with this group, the problem they faced was that existing visualization tools could
show only subsets of this data at a time.
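The shape of this data can be sketched as measurements indexed along three axes: species, gene, and time point. The snippet below is only an illustration of that structure under our own assumptions; the keys, the values, and the `slice_by` helper are hypothetical and do not describe the biologists' actual data or Pathline's implementation.

```python
# Hypothetical measurements keyed by (species, gene, time point).
measurements = {
    ("species_1", "gene_1", 0): 0.2,
    ("species_1", "gene_1", 1): 0.6,
    ("species_1", "gene_2", 0): 0.9,
    ("species_1", "gene_2", 1): 0.7,
    ("species_2", "gene_1", 0): 0.3,
    ("species_2", "gene_1", 1): 0.4,
    ("species_2", "gene_2", 0): 0.8,
    ("species_2", "gene_2", 1): 0.1,
}

def slice_by(data, species=None, gene=None, time=None):
    """Return the measurements matching the given keys; None matches everything."""
    return {
        key: value
        for key, value in data.items()
        if (species is None or key[0] == species)
        and (gene is None or key[1] == gene)
        and (time is None or key[2] == time)
    }

# A view of a single species fixes one axis and shows a gene-by-time matrix.
one_species = slice_by(measurements, species="species_1")

# Comparing one gene across species and time requires a different slice.
one_gene = slice_by(measurements, gene="gene_1")
```

Each call fixes one axis and returns only part of the data, which mirrors the limitation described above: a view of one slice cannot reveal patterns that span all three axes at once.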
To address the challenge of data integration, we designed a tool called Pathline [ 29 ],
shown in Fig. 22.2 . Pathline was designed in a user-centered process using iterative
refinement based on feedback from our biology collaborators. Each design decision
was motivated by the specific needs of the biologists.
 