categories; other complicated and combinatorial relationships may be
obscured.
During my fieldwork at the Burge lab, I observed firsthand the
development of a new visualization tool. During the lab's work on the
alternative splicing of mRNA, the group began to collect large amounts
of new data on alternative splicing events. 52 In trying to come to terms
with this new information, members of the lab realized that many of
the basic questions about alternative splicing remained unanswered:
What fraction of human genes can be alternatively spliced? How does
alternative splicing relate to cell differentiation? Do different sorts of
alternative splicing events play different biological roles? The new data
consisted of over 400 million 32-base-pair mRNA sequences from ten
different kinds of human tissues (skeletal muscle, brain, lung, heart,
liver, etc.) and five cancer cell lines. This overwhelming volume of data
came from the new Solexa-Illumina sequencing machines that had re-
cently been made available to the laboratory. 53
After mapping the sequences to the human genome, the group
quickly discovered that almost 90% of human genes seem to undergo
alternative splicing. 54 In order to answer the second and third questions
posed above, however, the lab needed a way to summarize and organize
the vast amounts of data in order to “see” how alternative splicing
varied from tissue to tissue. A significant part of the effort to understand
what role alternative splicing was playing was devoted to creating a
visualization tool in order to capture the essence of the data. This was
not just a process of finding a way to communicate information, nor
was it a problem of how to present data at a conference or in a scientific
journal article. Rather, it was a problem of organizing the data so as to
see what was going on inside the cell.
The problem the group confronted was that their data had become
multidimensional. To begin with, they needed something like a genome
browser that laid out the linear order of the exons. However, in order
to observe the different ways in which exons could be connected in
different tissues, they also needed to find a way to visually
differentiate those tissues and connections. Much discussion centered on the
appropriate use of shapes, colors, and lines to achieve the desired rep-
resentation. Some iterations of the tool brought spurious artifacts into
view: “This representation tends to emphasize the weird events,” Burge
worried at one point. Early versions of the tool generated images that
were difficult to understand because the data in the small exonic regions
became lost in the larger, blank intronic regions and the plots displayed