how to download complete genome sequence in biopython entrez.esearch

I have to download only complete genome sequences from NCBI (GenBank(full) format). I am intrested in 'complete geneome' not 'whole genome'.

my script:

from Bio import Entrez

Entrez.email = "asiakXX@wp.pl"

gatunek='Escherichia[ORGN]'

handle = Entrez.esearch(db='nucleotide',

     term=gatunek, property='complete genome' )#title='complete genome[title]')

result = Entrez.read(handle)

As a results I get only small fragments of genomes, whith size about 484 bp:

LOCUS       NZ_KE350773              484 bp    DNA     linear   CON 23-AUG-2013

DEFINITION  Escherichia coli E1777 genomic scaffold scaffold9_G, whole genome

       shotgun sequence.

I know how to do it manually via NCBI web site but it is very time consuming, the query that I use there:

escherichia[orgn] AND complete genome[title]

and as result I get multiple genomes with sizes range about 5,154,862 bp and this is what I need to do via ENTREZ.esearch.

edited Aug 27 '13 at 9:42

asked Aug 27 '13 at 9:23

user2662581

1614

add a comment |

I have to download only complete genome sequences from NCBI (GenBank(full) format). I am intrested in 'complete geneome' not 'whole genome'.

my script:

from Bio import Entrez

Entrez.email = "asiakXX@wp.pl"

gatunek='Escherichia[ORGN]'

handle = Entrez.esearch(db='nucleotide',

     term=gatunek, property='complete genome' )#title='complete genome[title]')

result = Entrez.read(handle)

As a results I get only small fragments of genomes, whith size about 484 bp:

LOCUS       NZ_KE350773              484 bp    DNA     linear   CON 23-AUG-2013

DEFINITION  Escherichia coli E1777 genomic scaffold scaffold9_G, whole genome

       shotgun sequence.

I know how to do it manually via NCBI web site but it is very time consuming, the query that I use there:

escherichia[orgn] AND complete genome[title]

and as result I get multiple genomes with sizes range about 5,154,862 bp and this is what I need to do via ENTREZ.esearch.

edited Aug 27 '13 at 9:42

asked Aug 27 '13 at 9:23

user2662581

1614

add a comment |

I have to download only complete genome sequences from NCBI (GenBank(full) format). I am intrested in 'complete geneome' not 'whole genome'.

my script:

from Bio import Entrez

Entrez.email = "asiakXX@wp.pl"

gatunek='Escherichia[ORGN]'

handle = Entrez.esearch(db='nucleotide',

     term=gatunek, property='complete genome' )#title='complete genome[title]')

result = Entrez.read(handle)

As a results I get only small fragments of genomes, whith size about 484 bp:

LOCUS       NZ_KE350773              484 bp    DNA     linear   CON 23-AUG-2013

DEFINITION  Escherichia coli E1777 genomic scaffold scaffold9_G, whole genome

       shotgun sequence.

I know how to do it manually via NCBI web site but it is very time consuming, the query that I use there:

escherichia[orgn] AND complete genome[title]

and as result I get multiple genomes with sizes range about 5,154,862 bp and this is what I need to do via ENTREZ.esearch.

edited Aug 27 '13 at 9:42

asked Aug 27 '13 at 9:23

user2662581

1614

I have to download only complete genome sequences from NCBI (GenBank(full) format). I am intrested in 'complete geneome' not 'whole genome'.

my script:

from Bio import Entrez

Entrez.email = "asiakXX@wp.pl"

gatunek='Escherichia[ORGN]'

handle = Entrez.esearch(db='nucleotide',

     term=gatunek, property='complete genome' )#title='complete genome[title]')

result = Entrez.read(handle)

As a results I get only small fragments of genomes, whith size about 484 bp:

LOCUS       NZ_KE350773              484 bp    DNA     linear   CON 23-AUG-2013

DEFINITION  Escherichia coli E1777 genomic scaffold scaffold9_G, whole genome

       shotgun sequence.

I know how to do it manually via NCBI web site but it is very time consuming, the query that I use there:

escherichia[orgn] AND complete genome[title]

and as result I get multiple genomes with sizes range about 5,154,862 bp and this is what I need to do via ENTREZ.esearch.

python downloading biopython

edited Aug 27 '13 at 9:42

asked Aug 27 '13 at 9:23

user2662581

1614

edited Aug 27 '13 at 9:42

asked Aug 27 '13 at 9:23

user2662581

1614

edited Aug 27 '13 at 9:42

asked Aug 27 '13 at 9:23

user2662581

1614

asked Aug 27 '13 at 9:23

user2662581

1614

asked Aug 27 '13 at 9:23

user2662581

1614

add a comment |

3 Answers
3

active

oldest

votes

Your question is clear, but the full answer is long. The code I provide generates a .fasta file for each of your desired E.Coli genome sequences, yes only the "Complete Genomes" in NCBI.

You will see there are only six complete E.Coli reference genomes in NCBI (http://www.ncbi.nlm.nih.gov/genome/167):

enter image description here

To help you, here are the Genbank/Refseq links to their genomes:

http://www.ncbi.nlm.nih.gov/nuccore/NC_000913.3

http://www.ncbi.nlm.nih.gov/nuccore/NC_002695.1

http://www.ncbi.nlm.nih.gov/nuccore/NC_011750.1

http://www.ncbi.nlm.nih.gov/nuccore/NC_011751.1

http://www.ncbi.nlm.nih.gov/nuccore/NC_017634.1

http://www.ncbi.nlm.nih.gov/nuccore/NC_018658.1

Here is my code for Complete Genome Sequence Parsing into .FASTA files...

# Imports

from Bio import Entrez

from Bio import SeqIO



#############################

# Retrieve NCBI Data Online #

#############################



Entrez.email     = "asiak@wp.pl"             # Always tell NCBI who you are

genomeAccessions = ['NC_000913', 'NC_002695', 'NC_011750', 'NC_011751', 'NC_017634', 'NC_018658']

search           = " ".join(genomeAccessions)

handle           = Entrez.read(Entrez.esearch(db="nucleotide", term=search, retmode="xml"))

genomeIds        = handle['IdList']

records          = Entrez.efetch(db="nucleotide", id=genomeIds, rettype="gb", retmode="text")



###############################

# Generate Genome Fasta files #

###############################



sequences   =   # store your sequences in a list

headers     =   # store genome names in a list (db_xref ids)



for i,record in enumerate(records):



    file_out = open("genBankRecord_"+str(i)+".gb", "w")    # store each genomes .gb in separate files

    file_out.write(record.read())

    file_out.close()



    genomeGenbank   = SeqIO.read("genBankRecord"+str(i)+".gb", "genbank")  # parse in the genbank files

    header         = genome.features[0].qualifiers['db_xref'][0]          # name the genome using db_xfred ID

    sequence       = genome.seq.tostring()                                # obtain genome sequence



    headers.append('>'+header)  # store genome name in list                                     

    sequences.append(sequence)  # store sequence in list



    fasta_out = open("genome"+str(i)+".fasta","w")     # store each genomes .fasta in separate files

    fasta_out.write(header)    # >header ... followed by:

    fasta_out.write(sequence)  # sequence ... 

    fasta_out.close()          # close that .fasta file and move on to next genome

records.close()

Let me know how it goes!
Andy

answered Nov 23 '13 at 23:35

hello_there_andy

1,20121341

add a comment |

You've done the hard part and worked out the query,

escherichia[orgn] AND complete genome[title]

So use that as the search query via Biopython as well!

from Bio import Entrez

Entrez.email = "asiakXX@wp.pl"

search_term = "escherichia[orgn] AND complete genome[title]"

handle = Entrez.esearch(db='nucleotide', term=search_term)

result = Entrez.read(handle)

handle.close()

print(result['Count']) # added parenthesis

Currently that gives me 140 results, starting with 545778205, which is the same as the website:
http://www.ncbi.nlm.nih.gov/nuccore/?term=escherichia%5Borgn%5D+AND+complete+genome%5Btitle%5D

edited Jan 1 at 5:15

Shred

210112

answered Oct 18 '13 at 20:36

peterjc

1,189511

add a comment |

This works for me...

search_term = 'escherichia coli[orgn] AND complete genome[title]'

handle = Entrez.esearch(db='nucleotide', term=search_term)

genome_ids = Entrez.read(handle)['IdList']



for genome_id in genome_ids:

    record = Entrez.efetch(db="nucleotide", id=genome_id, rettype="gb", retmode="text")



    filename = 'generated/genBankRecord_{}.gb'.format(genome_id)

    print('Writing:{}'.format(filename))

    with open(filename, 'w') as f:

        f.write(record.read())

answered Oct 13 '14 at 19:49

schryer

7312

add a comment |

Your Answer

StackExchange.ifUsing("editor", function () {
StackExchange.using("externalEditor", function () {
StackExchange.using("snippets", function () {
StackExchange.snippets.init();
});
});
}, "code-snippets");

StackExchange.ready(function() {
var channelOptions = {
tags: "".split(" "),
id: "1"
};
initTagRenderer("".split(" "), "".split(" "), channelOptions);

StackExchange.using("externalEditor", function() {
// Have to fire editor after snippets, if snippets enabled
if (StackExchange.settings.snippets.snippetsEnabled) {
StackExchange.using("snippets", function() {
createEditor();
});
}
else {
createEditor();
}
});

function createEditor() {
StackExchange.prepareEditor({
heartbeatType: 'answer',
autoActivateHeartbeat: false,
convertImagesToLinks: true,
noModals: true,
showLowRepImageUploadWarning: true,
reputationToPostImages: 10,
bindNavPrevention: true,
postfix: "",
imageUploader: {
brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
allowUrls: true
},
onDemand: true,
discardSelector: ".discard-answer"
,immediatelyShowMarkdownHelp:true
});

}
});

draft saved

draft discarded

Sign up or log in

StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});

Post as a guest

Name

Required, but never shown

StackExchange.ready(
function () {
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f18461629%2fhow-to-download-complete-genome-sequence-in-biopython-entrez-esearch%23new-answer', 'question_page');
}
);

Post as a guest

Name

Required, but never shown

3 Answers
3

active

oldest

votes

3 Answers
3

active

oldest

votes

Your question is clear, but the full answer is long. The code I provide generates a .fasta file for each of your desired E.Coli genome sequences, yes only the "Complete Genomes" in NCBI.

You will see there are only six complete E.Coli reference genomes in NCBI (http://www.ncbi.nlm.nih.gov/genome/167):

enter image description here

To help you, here are the Genbank/Refseq links to their genomes:

http://www.ncbi.nlm.nih.gov/nuccore/NC_000913.3

http://www.ncbi.nlm.nih.gov/nuccore/NC_002695.1

http://www.ncbi.nlm.nih.gov/nuccore/NC_011750.1

http://www.ncbi.nlm.nih.gov/nuccore/NC_011751.1

http://www.ncbi.nlm.nih.gov/nuccore/NC_017634.1

http://www.ncbi.nlm.nih.gov/nuccore/NC_018658.1

Here is my code for Complete Genome Sequence Parsing into .FASTA files...

# Imports

from Bio import Entrez

from Bio import SeqIO



#############################

# Retrieve NCBI Data Online #

#############################



Entrez.email     = "asiak@wp.pl"             # Always tell NCBI who you are

genomeAccessions = ['NC_000913', 'NC_002695', 'NC_011750', 'NC_011751', 'NC_017634', 'NC_018658']

search           = " ".join(genomeAccessions)

handle           = Entrez.read(Entrez.esearch(db="nucleotide", term=search, retmode="xml"))

genomeIds        = handle['IdList']

records          = Entrez.efetch(db="nucleotide", id=genomeIds, rettype="gb", retmode="text")



###############################

# Generate Genome Fasta files #

###############################



sequences   =   # store your sequences in a list

headers     =   # store genome names in a list (db_xref ids)



for i,record in enumerate(records):



    file_out = open("genBankRecord_"+str(i)+".gb", "w")    # store each genomes .gb in separate files

    file_out.write(record.read())

    file_out.close()



    genomeGenbank   = SeqIO.read("genBankRecord"+str(i)+".gb", "genbank")  # parse in the genbank files

    header         = genome.features[0].qualifiers['db_xref'][0]          # name the genome using db_xfred ID

    sequence       = genome.seq.tostring()                                # obtain genome sequence



    headers.append('>'+header)  # store genome name in list                                     

    sequences.append(sequence)  # store sequence in list



    fasta_out = open("genome"+str(i)+".fasta","w")     # store each genomes .fasta in separate files

    fasta_out.write(header)    # >header ... followed by:

    fasta_out.write(sequence)  # sequence ... 

    fasta_out.close()          # close that .fasta file and move on to next genome

records.close()

Let me know how it goes!
Andy

answered Nov 23 '13 at 23:35

hello_there_andy

1,20121341

add a comment |

Your question is clear, but the full answer is long. The code I provide generates a .fasta file for each of your desired E.Coli genome sequences, yes only the "Complete Genomes" in NCBI.

You will see there are only six complete E.Coli reference genomes in NCBI (http://www.ncbi.nlm.nih.gov/genome/167):

enter image description here

To help you, here are the Genbank/Refseq links to their genomes:

http://www.ncbi.nlm.nih.gov/nuccore/NC_000913.3

http://www.ncbi.nlm.nih.gov/nuccore/NC_002695.1

http://www.ncbi.nlm.nih.gov/nuccore/NC_011750.1

http://www.ncbi.nlm.nih.gov/nuccore/NC_011751.1

http://www.ncbi.nlm.nih.gov/nuccore/NC_017634.1

http://www.ncbi.nlm.nih.gov/nuccore/NC_018658.1

Here is my code for Complete Genome Sequence Parsing into .FASTA files...

# Imports

from Bio import Entrez

from Bio import SeqIO



#############################

# Retrieve NCBI Data Online #

#############################



Entrez.email     = "asiak@wp.pl"             # Always tell NCBI who you are

genomeAccessions = ['NC_000913', 'NC_002695', 'NC_011750', 'NC_011751', 'NC_017634', 'NC_018658']

search           = " ".join(genomeAccessions)

handle           = Entrez.read(Entrez.esearch(db="nucleotide", term=search, retmode="xml"))

genomeIds        = handle['IdList']

records          = Entrez.efetch(db="nucleotide", id=genomeIds, rettype="gb", retmode="text")



###############################

# Generate Genome Fasta files #

###############################



sequences   =   # store your sequences in a list

headers     =   # store genome names in a list (db_xref ids)



for i,record in enumerate(records):



    file_out = open("genBankRecord_"+str(i)+".gb", "w")    # store each genomes .gb in separate files

    file_out.write(record.read())

    file_out.close()



    genomeGenbank   = SeqIO.read("genBankRecord"+str(i)+".gb", "genbank")  # parse in the genbank files

    header         = genome.features[0].qualifiers['db_xref'][0]          # name the genome using db_xfred ID

    sequence       = genome.seq.tostring()                                # obtain genome sequence



    headers.append('>'+header)  # store genome name in list                                     

    sequences.append(sequence)  # store sequence in list



    fasta_out = open("genome"+str(i)+".fasta","w")     # store each genomes .fasta in separate files

    fasta_out.write(header)    # >header ... followed by:

    fasta_out.write(sequence)  # sequence ... 

    fasta_out.close()          # close that .fasta file and move on to next genome

records.close()

Let me know how it goes!
Andy

answered Nov 23 '13 at 23:35

hello_there_andy

1,20121341

add a comment |

Your question is clear, but the full answer is long. The code I provide generates a .fasta file for each of your desired E.Coli genome sequences, yes only the "Complete Genomes" in NCBI.

You will see there are only six complete E.Coli reference genomes in NCBI (http://www.ncbi.nlm.nih.gov/genome/167):

enter image description here

To help you, here are the Genbank/Refseq links to their genomes:

http://www.ncbi.nlm.nih.gov/nuccore/NC_000913.3

http://www.ncbi.nlm.nih.gov/nuccore/NC_002695.1

http://www.ncbi.nlm.nih.gov/nuccore/NC_011750.1

http://www.ncbi.nlm.nih.gov/nuccore/NC_011751.1

http://www.ncbi.nlm.nih.gov/nuccore/NC_017634.1

http://www.ncbi.nlm.nih.gov/nuccore/NC_018658.1

Here is my code for Complete Genome Sequence Parsing into .FASTA files...

# Imports

from Bio import Entrez

from Bio import SeqIO



#############################

# Retrieve NCBI Data Online #

#############################



Entrez.email     = "asiak@wp.pl"             # Always tell NCBI who you are

genomeAccessions = ['NC_000913', 'NC_002695', 'NC_011750', 'NC_011751', 'NC_017634', 'NC_018658']

search           = " ".join(genomeAccessions)

handle           = Entrez.read(Entrez.esearch(db="nucleotide", term=search, retmode="xml"))

genomeIds        = handle['IdList']

records          = Entrez.efetch(db="nucleotide", id=genomeIds, rettype="gb", retmode="text")



###############################

# Generate Genome Fasta files #

###############################



sequences   =   # store your sequences in a list

headers     =   # store genome names in a list (db_xref ids)



for i,record in enumerate(records):



    file_out = open("genBankRecord_"+str(i)+".gb", "w")    # store each genomes .gb in separate files

    file_out.write(record.read())

    file_out.close()



    genomeGenbank   = SeqIO.read("genBankRecord"+str(i)+".gb", "genbank")  # parse in the genbank files

    header         = genome.features[0].qualifiers['db_xref'][0]          # name the genome using db_xfred ID

    sequence       = genome.seq.tostring()                                # obtain genome sequence



    headers.append('>'+header)  # store genome name in list                                     

    sequences.append(sequence)  # store sequence in list



    fasta_out = open("genome"+str(i)+".fasta","w")     # store each genomes .fasta in separate files

    fasta_out.write(header)    # >header ... followed by:

    fasta_out.write(sequence)  # sequence ... 

    fasta_out.close()          # close that .fasta file and move on to next genome

records.close()

Let me know how it goes!
Andy

answered Nov 23 '13 at 23:35

hello_there_andy

1,20121341

Your question is clear, but the full answer is long. The code I provide generates a .fasta file for each of your desired E.Coli genome sequences, yes only the "Complete Genomes" in NCBI.

You will see there are only six complete E.Coli reference genomes in NCBI (http://www.ncbi.nlm.nih.gov/genome/167):

enter image description here

To help you, here are the Genbank/Refseq links to their genomes:

http://www.ncbi.nlm.nih.gov/nuccore/NC_000913.3

http://www.ncbi.nlm.nih.gov/nuccore/NC_002695.1

http://www.ncbi.nlm.nih.gov/nuccore/NC_011750.1

http://www.ncbi.nlm.nih.gov/nuccore/NC_011751.1

http://www.ncbi.nlm.nih.gov/nuccore/NC_017634.1

http://www.ncbi.nlm.nih.gov/nuccore/NC_018658.1

Here is my code for Complete Genome Sequence Parsing into .FASTA files...

# Imports

from Bio import Entrez

from Bio import SeqIO



#############################

# Retrieve NCBI Data Online #

#############################



Entrez.email     = "asiak@wp.pl"             # Always tell NCBI who you are

genomeAccessions = ['NC_000913', 'NC_002695', 'NC_011750', 'NC_011751', 'NC_017634', 'NC_018658']

search           = " ".join(genomeAccessions)

handle           = Entrez.read(Entrez.esearch(db="nucleotide", term=search, retmode="xml"))

genomeIds        = handle['IdList']

records          = Entrez.efetch(db="nucleotide", id=genomeIds, rettype="gb", retmode="text")



###############################

# Generate Genome Fasta files #

###############################



sequences   =   # store your sequences in a list

headers     =   # store genome names in a list (db_xref ids)



for i,record in enumerate(records):



    file_out = open("genBankRecord_"+str(i)+".gb", "w")    # store each genomes .gb in separate files

    file_out.write(record.read())

    file_out.close()



    genomeGenbank   = SeqIO.read("genBankRecord"+str(i)+".gb", "genbank")  # parse in the genbank files

    header         = genome.features[0].qualifiers['db_xref'][0]          # name the genome using db_xfred ID

    sequence       = genome.seq.tostring()                                # obtain genome sequence



    headers.append('>'+header)  # store genome name in list                                     

    sequences.append(sequence)  # store sequence in list



    fasta_out = open("genome"+str(i)+".fasta","w")     # store each genomes .fasta in separate files

    fasta_out.write(header)    # >header ... followed by:

    fasta_out.write(sequence)  # sequence ... 

    fasta_out.close()          # close that .fasta file and move on to next genome

records.close()

Let me know how it goes!
Andy

answered Nov 23 '13 at 23:35

hello_there_andy

1,20121341

answered Nov 23 '13 at 23:35

hello_there_andy

1,20121341

answered Nov 23 '13 at 23:35

hello_there_andy

1,20121341

answered Nov 23 '13 at 23:35

hello_there_andy

1,20121341

add a comment |

You've done the hard part and worked out the query,

escherichia[orgn] AND complete genome[title]

So use that as the search query via Biopython as well!

from Bio import Entrez

Entrez.email = "asiakXX@wp.pl"

search_term = "escherichia[orgn] AND complete genome[title]"

handle = Entrez.esearch(db='nucleotide', term=search_term)

result = Entrez.read(handle)

handle.close()

print(result['Count']) # added parenthesis

Currently that gives me 140 results, starting with 545778205, which is the same as the website:
http://www.ncbi.nlm.nih.gov/nuccore/?term=escherichia%5Borgn%5D+AND+complete+genome%5Btitle%5D

edited Jan 1 at 5:15

Shred

210112

answered Oct 18 '13 at 20:36

peterjc

1,189511

add a comment |

You've done the hard part and worked out the query,

escherichia[orgn] AND complete genome[title]

So use that as the search query via Biopython as well!

from Bio import Entrez

Entrez.email = "asiakXX@wp.pl"

search_term = "escherichia[orgn] AND complete genome[title]"

handle = Entrez.esearch(db='nucleotide', term=search_term)

result = Entrez.read(handle)

handle.close()

print(result['Count']) # added parenthesis

Currently that gives me 140 results, starting with 545778205, which is the same as the website:
http://www.ncbi.nlm.nih.gov/nuccore/?term=escherichia%5Borgn%5D+AND+complete+genome%5Btitle%5D

edited Jan 1 at 5:15

Shred

210112

answered Oct 18 '13 at 20:36

peterjc

1,189511

add a comment |

You've done the hard part and worked out the query,

escherichia[orgn] AND complete genome[title]

So use that as the search query via Biopython as well!

from Bio import Entrez

Entrez.email = "asiakXX@wp.pl"

search_term = "escherichia[orgn] AND complete genome[title]"

handle = Entrez.esearch(db='nucleotide', term=search_term)

result = Entrez.read(handle)

handle.close()

print(result['Count']) # added parenthesis

Currently that gives me 140 results, starting with 545778205, which is the same as the website:
http://www.ncbi.nlm.nih.gov/nuccore/?term=escherichia%5Borgn%5D+AND+complete+genome%5Btitle%5D

edited Jan 1 at 5:15

Shred

210112

answered Oct 18 '13 at 20:36

peterjc

1,189511

You've done the hard part and worked out the query,

escherichia[orgn] AND complete genome[title]

So use that as the search query via Biopython as well!

from Bio import Entrez

Entrez.email = "asiakXX@wp.pl"

search_term = "escherichia[orgn] AND complete genome[title]"

handle = Entrez.esearch(db='nucleotide', term=search_term)

result = Entrez.read(handle)

handle.close()

print(result['Count']) # added parenthesis

Currently that gives me 140 results, starting with 545778205, which is the same as the website:
http://www.ncbi.nlm.nih.gov/nuccore/?term=escherichia%5Borgn%5D+AND+complete+genome%5Btitle%5D

edited Jan 1 at 5:15

Shred

210112

answered Oct 18 '13 at 20:36

peterjc

1,189511

edited Jan 1 at 5:15

Shred

210112

edited Jan 1 at 5:15

Shred

210112

edited Jan 1 at 5:15

Shred

210112

answered Oct 18 '13 at 20:36

peterjc

1,189511

answered Oct 18 '13 at 20:36

peterjc

1,189511

answered Oct 18 '13 at 20:36

peterjc

1,189511

add a comment |

This works for me...

search_term = 'escherichia coli[orgn] AND complete genome[title]'

handle = Entrez.esearch(db='nucleotide', term=search_term)

genome_ids = Entrez.read(handle)['IdList']



for genome_id in genome_ids:

    record = Entrez.efetch(db="nucleotide", id=genome_id, rettype="gb", retmode="text")



    filename = 'generated/genBankRecord_{}.gb'.format(genome_id)

    print('Writing:{}'.format(filename))

    with open(filename, 'w') as f:

        f.write(record.read())

answered Oct 13 '14 at 19:49

schryer

7312

add a comment |

This works for me...

search_term = 'escherichia coli[orgn] AND complete genome[title]'

handle = Entrez.esearch(db='nucleotide', term=search_term)

genome_ids = Entrez.read(handle)['IdList']



for genome_id in genome_ids:

    record = Entrez.efetch(db="nucleotide", id=genome_id, rettype="gb", retmode="text")



    filename = 'generated/genBankRecord_{}.gb'.format(genome_id)

    print('Writing:{}'.format(filename))

    with open(filename, 'w') as f:

        f.write(record.read())

answered Oct 13 '14 at 19:49

schryer

7312

add a comment |

This works for me...

search_term = 'escherichia coli[orgn] AND complete genome[title]'

handle = Entrez.esearch(db='nucleotide', term=search_term)

genome_ids = Entrez.read(handle)['IdList']



for genome_id in genome_ids:

    record = Entrez.efetch(db="nucleotide", id=genome_id, rettype="gb", retmode="text")



    filename = 'generated/genBankRecord_{}.gb'.format(genome_id)

    print('Writing:{}'.format(filename))

    with open(filename, 'w') as f:

        f.write(record.read())

answered Oct 13 '14 at 19:49

schryer

7312

This works for me...

search_term = 'escherichia coli[orgn] AND complete genome[title]'

handle = Entrez.esearch(db='nucleotide', term=search_term)

genome_ids = Entrez.read(handle)['IdList']



for genome_id in genome_ids:

    record = Entrez.efetch(db="nucleotide", id=genome_id, rettype="gb", retmode="text")



    filename = 'generated/genBankRecord_{}.gb'.format(genome_id)

    print('Writing:{}'.format(filename))

    with open(filename, 'w') as f:

        f.write(record.read())

answered Oct 13 '14 at 19:49

schryer

7312

answered Oct 13 '14 at 19:49

schryer

7312

answered Oct 13 '14 at 19:49

schryer

7312

answered Oct 13 '14 at 19:49

schryer

7312

add a comment |

draft saved

draft discarded

Thanks for contributing an answer to Stack Overflow!

Please be sure to answer the question. Provide details and share your research!

But avoid …

Asking for help, clarification, or responding to other answers.

Making statements based on opinion; back them up with references or personal experience.

To learn more, see our tips on writing great answers.

draft saved

draft discarded

Sign up or log in

StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});

Post as a guest

Name

Required, but never shown

Post as a guest

Name

Required, but never shown

Sign up or log in

StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});

Post as a guest

Name

Required, but never shown

Sign up or log in

StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});

Post as a guest

Name

Required, but never shown

Sign up or log in

StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});

Post as a guest

Name

Required, but never shown

Name

Required, but never shown

Name

Required, but never shown

This page is only for reference, If you need detailed information, please check here

Search This Blog

Ufyukyu