undergo the same validation and processing as non-bulk submissions. Once processing is
complete, the records are loaded into GenBank and are available in Entrez and other
retrieval systems.
14. FLIC Submissions
FLICs are processed via an automated FLIC processing system that is based on the HTG
automated processing system. Submitters use the program tbl2asn to generate their
submissions. As with HTG submissions, submissions to the automated FLIC processing
system must contain three identifiers: the genome center tag, the sequence name (SeqId),
and the Accession number. The genome center tag is assigned by NCBI and is generally the
FTP account login name. The sequence name is a unique identifier that is assigned by the
submitter to a particular clone or entry and must be unique within the group's FLIC
submissions. When a sequence is first submitted, it has only a sequence name and genome
center tag; the Accession number is assigned during processing. All updates to that entry
must include the center tag, sequence name, and Accession number, or processing will fail.
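The identifier rule above can be sketched as a small pre-submission check. This is only an illustration, not part of the NCBI processing system: the names FlicIdentifiers and check_identifiers are hypothetical, and the sketch assumes the submitter already knows whether an entry is new or an update.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class FlicIdentifiers:
    """Hypothetical container for the three identifiers described above."""
    center_tag: str                  # assigned by NCBI, generally the FTP login name
    sequence_name: str               # submitter-assigned SeqId, unique within the group
    accession: Optional[str] = None  # assigned during processing of the first submission


def check_identifiers(ids: FlicIdentifiers, is_update: bool) -> List[str]:
    """Return a list of problems; an empty list means the identifiers are complete."""
    problems = []
    if not ids.center_tag.strip():
        problems.append("missing genome center tag")
    if not ids.sequence_name.strip():
        problems.append("missing sequence name (SeqId)")
    # Only updates need the Accession number; a first submission does not have one yet.
    if is_update and not (ids.accession or "").strip():
        problems.append("update is missing the Accession number")
    return problems
```

A first submission with only a center tag and a sequence name passes this check, whereas the same record flagged as an update is reported as missing its Accession number.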
15. The FLIC Processing Pathway
The FLIC processing system is analogous to the HTG processing system. Submitters
deposit their submissions in the FLICSEQSUBMIT directory of their FTP account and
notify us that the submissions are there. We then run the scripts to pick up the files from the
FTP site and copy them to the processing pathway, as well as to an archive. If processing
completes and the submission contains no errors, the files are
automatically loaded into GenBank. As with HTG submissions, FLIC entries can fail for
three reasons: problems with the format, problems with the identification of the record (the
genome center, the SeqId, or the Accession number), or problems with the data itself. When
submissions fail FLIC processing, a GenBank annotator sends email to the sequencing
center, describing the problem and asking the center to submit a corrected entry. Annotators
do not fix incorrect submissions; this ensures that the staff of the submitting genome center
fixes the problems in their database as well. At the completion of processing, reports are
generated and deposited in the submitter's FTP account, as described for HTG submissions.
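The pick-up step at the start of this pathway can be sketched roughly as follows. The actual NCBI scripts are not described here, so this is an assumption-laden illustration: the function name pick_up_submissions and the processing and archive directory arguments are hypothetical, and only the FLICSEQSUBMIT directory name comes from the text.

```python
import shutil
from pathlib import Path
from typing import List


def pick_up_submissions(ftp_account_dir: Path,
                        processing_dir: Path,
                        archive_dir: Path) -> List[str]:
    """Copy each file in FLICSEQSUBMIT to a processing area and to an archive.

    Returns the names of the files that were picked up, in sorted order.
    """
    incoming = ftp_account_dir / "FLICSEQSUBMIT"
    picked: List[str] = []
    if not incoming.is_dir():
        return picked  # nothing deposited yet

    processing_dir.mkdir(parents=True, exist_ok=True)
    archive_dir.mkdir(parents=True, exist_ok=True)

    for path in sorted(incoming.iterdir()):
        if path.is_file():
            # Two independent copies: one to be processed, one kept unchanged.
            shutil.copy2(path, processing_dir / path.name)
            shutil.copy2(path, archive_dir / path.name)
            picked.append(path.name)
    return picked
```

Keeping an untouched archive copy alongside the processing copy means a failed submission can still be compared against exactly what the center deposited.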
16. Submission Tools
Direct submissions to GenBank are prepared using one of two submission tools, BankIt or
Sequin.
BankIt
BankIt [http://www.ncbi.nlm.nih.gov/BankIt/] is a Web-based form that is a
convenient and easy way to submit a small number of sequences with minimal annotation
to GenBank. To complete the form, a user is prompted to enter submitter information, the
nucleotide sequence, biological source information, and features and annotation pertinent to
the submission. BankIt has extensive Help [http://www.ncbi.nlm.nih.gov/BankIt/help.html]
documentation to guide the submitter. Included with the Help document is a set of
annotation examples that detail the types of information that are required for each type of
submission. After the information is entered into the form, BankIt transforms this
information into a GenBank flatfile for review. In addition, a number of quality assurance
and validation checks ensure that the sequence submitted to GenBank is of the highest
quality. The submitter is asked to include spans (sequence coordinates) for the coding
regions and other features and to include amino acid sequence for the proteins that derive