G even though these high-quality reads—you can tell by the
color—are clearly correct, clearly calling a discrete number of
Gs, and so again this one is a C.
Where it seems that there are insufficient data to fill a gap, the finisher
may request further laboratory work on that sequence region; such
requests are passed through the LIMS. Making these requests also
requires careful consideration:
Finishers often find themselves in a catch-22: to select an
appropriate laboratory procedure, they must understand the
underlying sequence, but the sequence is missing. In practice, they
must make educated guesses about the underlying sequence and
the likelihood that various laboratory techniques will succeed.
Their decision is influenced by the condition of the DNA near
the gap; it is also influenced by the ability of their informatics
tools to highlight those conditions. Most importantly, finishers'
decisions are guided by their skill and experience: whereas some
experienced finishers may be able to close a gap based on the
information already present in an assembly, less experienced
finishers may feel they need laboratory work. 7
Under its contracts with the NHGRI and other funding bodies, the
Broad's standard for “finished” sequence requires no more than one
error in every 10,000 bases. A recent audit found error levels around
one in every 250,000 bases. Once a finisher is satisfied with the piece of
sequence on which he or she is working, it is resubmitted to the LIMS.
From there it must pass through a “final review” by a different finisher
before it can be submitted to GenBank as a “finished” sequence. 8
During this review the finisher runs a number of scripts on the sequence to
check for quality and consistency and to align the piece of sequence with
the entire genomic build. The submission process is also automated: a
sequence is passed through a script that automatically generates the
required metadata, such as the coordinate systems, the sequence length,
and the author names. This script also does a final check for gaps and
writes standard features into a GenBank submission file. All progress,
including the GenBank submission number, is recorded in the LIMS.
Finished submission files are automatically uploaded to GenBank via
FTP at 11:00 p.m. each day, where they are available to the world at
8:00 a.m. the following morning.
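
To make the quality figures above concrete, the following is a minimal sketch of the kind of check a final-review script might perform. It is not the Broad's actual code. The function names are hypothetical, and the representation of a gap as a run of the letter N is an assumption that the text does not confirm. The only figures taken from the text are the one-in-10,000 standard and the one-in-250,000 audit result. The Phred scale used to express them, Q = -10 log10(p), is the standard way of quoting per-base error probabilities.

```python
import math
import re

# Standard quoted in the text: at most one error per 10,000 bases.
FINISHED_MAX_ERROR_RATE = 1 / 10_000


def phred(error_rate: float) -> float:
    """Convert a per-base error probability to a Phred-scaled quality score."""
    return -10 * math.log10(error_rate)


def meets_finished_standard(errors: int, bases: int) -> bool:
    """Return True if the observed error rate is within the finished standard."""
    return errors / bases <= FINISHED_MAX_ERROR_RATE


def find_gaps(sequence: str) -> list[tuple[int, int]]:
    """Return half-open (start, end) spans of unresolved bases.

    Assumption: a gap appears in the sequence as a run of 'N' characters.
    """
    return [match.span() for match in re.finditer(r"[Nn]+", sequence)]


if __name__ == "__main__":
    print(round(phred(1 / 10_000), 1))    # the contractual standard
    print(round(phred(1 / 250_000), 1))   # the audited error level
    print(meets_finished_standard(errors=1, bases=250_000))
    print(find_gaps("ACGTNNNNACGTNAC"))
```

On this scale the contractual standard corresponds to a quality score of 40 and the audited level to roughly 54, which is one way of seeing how far the audited sequence exceeded the requirement.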
The integrity and fluidity of the sequencing pipeline are maintained