GenBank accession numbers are distinctly formatted identifiers that NCBI staff assign to individual sequence records submitted to GenBank by investigators or research groups.

  • The current format of a GenBank accession number is: [two-letter alphabetical prefix][six digits][.][version number]
  • The format for older GenBank records is: [one-letter alphabetical prefix][five digits][.][version number]
Some examples of GenBank accessions are AF071988.1, KT183498.1, JQ922422.1, and CP004440.2. GenBank uses this format for standard GenBank sequence records and for individual assembled chromosomes (or parts of assembled chromosomes) in submitted genomic assemblies. GenBank uses different formats for Transcriptome Shotgun Assembly (TSA) and Whole Genome Shotgun (WGS) records.
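
The two formats above can be checked with a pair of regular expressions. The following is a minimal Python sketch, not an NCBI tool: the function name classify_accession and the labels it returns are hypothetical, uppercase prefixes are assumed, and anything that does not fit the two formats described here (for example TSA or WGS accessions) is simply reported as "other".

```python
import re

# Patterns for the two formats described above. Uppercase prefixes are an
# assumption; TSA and WGS accessions use other formats and are not covered.
CURRENT_FORMAT = re.compile(r"[A-Z]{2}\d{6}\.\d+")  # two letters, six digits, version
OLDER_FORMAT = re.compile(r"[A-Z]\d{5}\.\d+")       # one letter, five digits, version


def classify_accession(accession: str) -> str:
    """Return 'current', 'older', or 'other' for an accession.version string."""
    if CURRENT_FORMAT.fullmatch(accession):
        return "current"
    if OLDER_FORMAT.fullmatch(accession):
        return "older"
    return "other"
```

For instance, classify_accession("AF071988.1") returns "current", while a string without a version suffix, such as "AF071988", returns "other" because both patterns require the version number.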

Unlike RefSeq accession prefixes, GenBank accession prefixes carry little information. Each prefix is allocated to one member of the International Nucleotide Sequence Database Collaboration (INSDC). For example, once the "KT" prefix has been allocated to GenBank, neither the DNA Data Bank of Japan (DDBJ) nor the European Nucleotide Archive (ENA) will use it. The digits that follow the prefix have no meaning. The version number indicates whether the sequence in the record remains as originally submitted (version 1) or whether the submitter has since updated it (version 2 or higher). Only the original submitters can update their sequences. Of the example accessions above, CP004440.2 is version 2 because the submitter updated its sequence once.
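
The version rule can be illustrated with a short Python sketch. The helper names split_version and was_updated are hypothetical, and the code assumes the input is an accession with a version suffix after the final period, as in the examples in this article.

```python
def split_version(accession: str) -> tuple[str, int]:
    """Split 'CP004440.2' into the base accession and its integer version."""
    base, sep, version = accession.rpartition(".")
    if not sep or not base or not version.isdecimal():
        raise ValueError(f"no version suffix found in {accession!r}")
    return base, int(version)


def was_updated(accession: str) -> bool:
    """True when the version is 2 or higher, i.e. the sequence was updated."""
    _, version = split_version(accession)
    return version > 1
```

With the examples from this article, was_updated("CP004440.2") is True and was_updated("AF071988.1") is False. The base accession stays the same across versions, so only the number after the period changes when a submitter updates a sequence.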