From Personalized Ads to Personalized Medicine
While ADAM is designed to analyze aligned reads rapidly and at scale, it does not align
the reads itself; instead, ADAM relies on standard short-read aligners. The Scalable Nucleotide
Alignment Program (SNAP) is a collaborative effort including participants from
Microsoft Research, UC San Francisco, and the AMPLab, as well as open source developers,
and is released under an Apache 2.0 license. The SNAP aligner is as accurate as the current
best-of-class aligners, like BWA-MEM, Bowtie2, and Novoalign, but runs between 3 and 20
times faster. This speed advantage is important when doctors are racing to identify a pathogen.
In 2013, a boy went to the University of Wisconsin Hospital and Clinics' Emergency Department
three times in four months with symptoms of encephalitis: fevers and headaches.
He was eventually hospitalized without a successful diagnosis after numerous blood tests,
brain scans, and biopsies. Five weeks later, he began having seizures that required he be
placed into a medically induced coma. In desperation, doctors sampled his spinal fluid and
sent it to an experimental program led by Charles Chiu at UC San Francisco, where it was
sequenced for analysis. The speed and accuracy of SNAP allowed UCSF to quickly filter
out all human DNA and, from the remaining 0.02% of the reads, identify a rare infectious
bacterium, Leptospira santarosai. They reported the discovery to the Wisconsin doctors just
two days after the sample was sent. The boy was treated with antibiotics for 10 days,
awoke from his coma, and was discharged from the hospital two weeks later. [167]
If you're interested in learning more about the system the Chiu lab used, called
Sequence-based Ultra-Rapid Pathogen Identification (SURPI), they have generously
shared their software under a permissive BSD license and provide an Amazon EC2 Machine
Image (AMI) with SURPI preinstalled. SURPI collects 348,922 unique bacterial sequences
and 1,193,607 unique virus sequences from numerous sources and saves them in 29 SNAP-indexed
databases, each approximately 27 GB in size, for fast search.
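To get a feel for the scale of these reference databases, the figures quoted above can be combined with a little arithmetic. The following is only a back-of-the-envelope sketch in Python: the constant and function names are made up for illustration and are not part of SURPI or SNAP, and the 27 GB per database is the approximate figure from the text, so the total is an estimate.

```python
# Figures quoted in the text; the names below are illustrative only
# and are not part of SURPI or SNAP.
NUM_DATABASES = 29
APPROX_GB_PER_DATABASE = 27  # "approximately 27 GB" each
BACTERIAL_SEQUENCES = 348_922
VIRAL_SEQUENCES = 1_193_607


def total_index_size_gb(num_databases: int, gb_per_database: float) -> float:
    """Approximate combined size of all SNAP-indexed databases, in GB."""
    return num_databases * gb_per_database


def total_reference_sequences(bacterial: int, viral: int) -> int:
    """Total number of unique reference sequences across both groups."""
    return bacterial + viral


if __name__ == "__main__":
    size_gb = total_index_size_gb(NUM_DATABASES, APPROX_GB_PER_DATABASE)
    total = total_reference_sequences(BACTERIAL_SEQUENCES, VIRAL_SEQUENCES)
    print(f"Approximate total index size: {size_gb:.0f} GB")
    print(f"Total reference sequences: {total:,}")
```

Under these assumptions, the indexes come to roughly 783 GB covering about 1.5 million reference sequences.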
Today, more data is analyzed for personalized advertising than personalized medicine, but
that will not be the case in the future. With personalized medicine, people receive customized
healthcare that takes into consideration their unique DNA profiles. As the price of sequencing
drops and more people have their genomes sequenced, the increase in statistical
power will allow researchers to understand the genetic mechanisms underlying diseases
and fold these discoveries into the personalized medical model, to improve treatment for
subsequent patients. While only 25 PB of genomic data were generated worldwide this
year, next year that number will likely be 100 PB.