Revision as of 21:51, 2 July 2012

Introduction

Accession numbers are available on the NCBI website by searching with the taxon ID for Pancrustacea, which is 197562 [[1]]. As of July 1, 2012, there were 373 accessions available; note that some of these are subspecies. To download the list of accessions, click "Download" near the top right of the page.
The accessions downloaded in May, along with scripts written by THO to pull and parse the data directly, sit on macroevolution in the following directory: /labdata/nfs/lab/scripts/ATOLmt/
The accessions pulled at that time are in the file AccList.tx, which contains 365 accessions.
The next step is to use BioPerl to download all the GenBank files directly from GenBank. This is done using the script getGB.pl. The actual command is:

   ./getGB.pl AccList.tx > mtGenomes.gb

This pulls the accessions listed in AccList.tx from GenBank and writes the records to the file mtGenomes.gb.
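For readers without access to getGB.pl, the same step can be roughly sketched in Python. This is not the lab's script and does not use BioPerl. It is a minimal stand-in that assumes the accession file holds one accession per line and that the records are fetched through the NCBI E-utilities efetch endpoint. The function names, the batch size of 50, and the script name fetch_gb.py are hypothetical choices made for illustration.

```python
import sys
import urllib.parse
import urllib.request

# NCBI E-utilities endpoint for retrieving records by identifier.
EFETCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi"


def read_accessions(lines):
    """Return non-empty, de-duplicated accessions in their original order.

    Assumes one accession per line (an assumption about AccList.tx).
    """
    seen, accs = set(), []
    for line in lines:
        acc = line.strip()
        if acc and acc not in seen:
            seen.add(acc)
            accs.append(acc)
    return accs


def batches(items, size=50):
    """Yield consecutive slices of at most `size` items (size is arbitrary)."""
    for i in range(0, len(items), size):
        yield items[i:i + size]


def efetch_url(accs):
    """Build an efetch URL requesting GenBank flat files for the accessions."""
    query = urllib.parse.urlencode({
        "db": "nuccore",
        "id": ",".join(accs),
        "rettype": "gb",
        "retmode": "text",
    })
    return f"{EFETCH}?{query}"


def main(path):
    """Fetch every accession in `path` and write the records to stdout."""
    with open(path) as fh:
        accs = read_accessions(fh)
    for batch in batches(accs):
        with urllib.request.urlopen(efetch_url(batch)) as resp:
            sys.stdout.write(resp.read().decode("utf-8"))


# Only touch the network when called with exactly one file argument.
if __name__ == "__main__" and len(sys.argv) == 2:
    main(sys.argv[1])
```

Saved under the hypothetical name fetch_gb.py, it would be invoked the same way as the Perl script:

   python fetch_gb.py AccList.tx > mtGenomes.gb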

@@ Line 5: / Line 5: @@
 # Accession numbers are available on the NCBI website, by searching with the taxon ID for Pancrustacea, which is 197562 [[http://www.ncbi.nlm.nih.gov/genomes/OrganelleResource.cgi?taxid=197562]]. As of July 1, 2012, there were 373 accessions available. Note that some of these are sub-species. The list of accessions can be downloaded at the right near the top, clicking on "Download".
 # The accessions downloaded in May along with scripts to directly pull and parse the data, written by THO, sit on macroevolution in the following directory: /labdata/nfs/lab/scripts/ATOLmt/
-# The accessions pulled at that time are in the file AccList.tx
+# The accessions pulled at that time are in the file AccList.tx . There are 365 accessions in that list.
 # The next step is to use BioPerl to download all the GenBank files directly from GenBank.  This is done using the script getGB.pl. The actual command is:
      ./getGB.pl AccList.tx > mtGenomes.gb