how to download complete genome sequence in biopython entrez.esearch












3















I have to download only complete genome sequences from NCBI (GenBank(full) format). I am intrested in 'complete geneome' not 'whole genome'.



my script:



from Bio import Entrez
Entrez.email = "asiakXX@wp.pl"
gatunek='Escherichia[ORGN]'
handle = Entrez.esearch(db='nucleotide',
term=gatunek, property='complete genome' )#title='complete genome[title]')
result = Entrez.read(handle)


As a results I get only small fragments of genomes, whith size about 484 bp:



LOCUS       NZ_KE350773              484 bp    DNA     linear   CON 23-AUG-2013
DEFINITION Escherichia coli E1777 genomic scaffold scaffold9_G, whole genome
shotgun sequence.


I know how to do it manually via NCBI web site but it is very time consuming, the query that I use there:



escherichia[orgn] AND complete genome[title]


and as result I get multiple genomes with sizes range about 5,154,862 bp and this is what I need to do via ENTREZ.esearch.










share|improve this question





























    3















    I have to download only complete genome sequences from NCBI (GenBank(full) format). I am intrested in 'complete geneome' not 'whole genome'.



    my script:



    from Bio import Entrez
    Entrez.email = "asiakXX@wp.pl"
    gatunek='Escherichia[ORGN]'
    handle = Entrez.esearch(db='nucleotide',
    term=gatunek, property='complete genome' )#title='complete genome[title]')
    result = Entrez.read(handle)


    As a results I get only small fragments of genomes, whith size about 484 bp:



    LOCUS       NZ_KE350773              484 bp    DNA     linear   CON 23-AUG-2013
    DEFINITION Escherichia coli E1777 genomic scaffold scaffold9_G, whole genome
    shotgun sequence.


    I know how to do it manually via NCBI web site but it is very time consuming, the query that I use there:



    escherichia[orgn] AND complete genome[title]


    and as result I get multiple genomes with sizes range about 5,154,862 bp and this is what I need to do via ENTREZ.esearch.










    share|improve this question



























      3












      3








      3


      1






      I have to download only complete genome sequences from NCBI (GenBank(full) format). I am intrested in 'complete geneome' not 'whole genome'.



      my script:



      from Bio import Entrez
      Entrez.email = "asiakXX@wp.pl"
      gatunek='Escherichia[ORGN]'
      handle = Entrez.esearch(db='nucleotide',
      term=gatunek, property='complete genome' )#title='complete genome[title]')
      result = Entrez.read(handle)


      As a results I get only small fragments of genomes, whith size about 484 bp:



      LOCUS       NZ_KE350773              484 bp    DNA     linear   CON 23-AUG-2013
      DEFINITION Escherichia coli E1777 genomic scaffold scaffold9_G, whole genome
      shotgun sequence.


      I know how to do it manually via NCBI web site but it is very time consuming, the query that I use there:



      escherichia[orgn] AND complete genome[title]


      and as result I get multiple genomes with sizes range about 5,154,862 bp and this is what I need to do via ENTREZ.esearch.










      share|improve this question
















      I have to download only complete genome sequences from NCBI (GenBank(full) format). I am intrested in 'complete geneome' not 'whole genome'.



      my script:



      from Bio import Entrez
      Entrez.email = "asiakXX@wp.pl"
      gatunek='Escherichia[ORGN]'
      handle = Entrez.esearch(db='nucleotide',
      term=gatunek, property='complete genome' )#title='complete genome[title]')
      result = Entrez.read(handle)


      As a results I get only small fragments of genomes, whith size about 484 bp:



      LOCUS       NZ_KE350773              484 bp    DNA     linear   CON 23-AUG-2013
      DEFINITION Escherichia coli E1777 genomic scaffold scaffold9_G, whole genome
      shotgun sequence.


      I know how to do it manually via NCBI web site but it is very time consuming, the query that I use there:



      escherichia[orgn] AND complete genome[title]


      and as result I get multiple genomes with sizes range about 5,154,862 bp and this is what I need to do via ENTREZ.esearch.







      python downloading biopython






      share|improve this question















      share|improve this question













      share|improve this question




      share|improve this question








      edited Aug 27 '13 at 9:42







      user2662581

















      asked Aug 27 '13 at 9:23









      user2662581user2662581

      1614




      1614
























          3 Answers
          3






          active

          oldest

          votes


















          1














          Your question is clear, but the full answer is long. The code I provide generates a .fasta file for each of your desired E.Coli genome sequences, yes only the "Complete Genomes" in NCBI.



          You will see there are only six complete E.Coli reference genomes in NCBI (http://www.ncbi.nlm.nih.gov/genome/167):



          enter image description here



          To help you, here are the Genbank/Refseq links to their genomes:




          1. http://www.ncbi.nlm.nih.gov/nuccore/NC_000913.3


          2. http://www.ncbi.nlm.nih.gov/nuccore/NC_002695.1


          3. http://www.ncbi.nlm.nih.gov/nuccore/NC_011750.1


          4. http://www.ncbi.nlm.nih.gov/nuccore/NC_011751.1


          5. http://www.ncbi.nlm.nih.gov/nuccore/NC_017634.1


          6. http://www.ncbi.nlm.nih.gov/nuccore/NC_018658.1



          Here is my code for Complete Genome Sequence Parsing into .FASTA files...



          # Imports
          from Bio import Entrez
          from Bio import SeqIO

          #############################
          # Retrieve NCBI Data Online #
          #############################

          Entrez.email = "asiak@wp.pl" # Always tell NCBI who you are
          genomeAccessions = ['NC_000913', 'NC_002695', 'NC_011750', 'NC_011751', 'NC_017634', 'NC_018658']
          search = " ".join(genomeAccessions)
          handle = Entrez.read(Entrez.esearch(db="nucleotide", term=search, retmode="xml"))
          genomeIds = handle['IdList']
          records = Entrez.efetch(db="nucleotide", id=genomeIds, rettype="gb", retmode="text")

          ###############################
          # Generate Genome Fasta files #
          ###############################

          sequences = # store your sequences in a list
          headers = # store genome names in a list (db_xref ids)

          for i,record in enumerate(records):

          file_out = open("genBankRecord_"+str(i)+".gb", "w") # store each genomes .gb in separate files
          file_out.write(record.read())
          file_out.close()

          genomeGenbank = SeqIO.read("genBankRecord"+str(i)+".gb", "genbank") # parse in the genbank files
          header = genome.features[0].qualifiers['db_xref'][0] # name the genome using db_xfred ID
          sequence = genome.seq.tostring() # obtain genome sequence

          headers.append('>'+header) # store genome name in list
          sequences.append(sequence) # store sequence in list

          fasta_out = open("genome"+str(i)+".fasta","w") # store each genomes .fasta in separate files
          fasta_out.write(header) # >header ... followed by:
          fasta_out.write(sequence) # sequence ...
          fasta_out.close() # close that .fasta file and move on to next genome
          records.close()


          Let me know how it goes!
          Andy






          share|improve this answer































            1














            You've done the hard part and worked out the query,



            escherichia[orgn] AND complete genome[title]


            So use that as the search query via Biopython as well!



            from Bio import Entrez
            Entrez.email = "asiakXX@wp.pl"
            search_term = "escherichia[orgn] AND complete genome[title]"
            handle = Entrez.esearch(db='nucleotide', term=search_term)
            result = Entrez.read(handle)
            handle.close()
            print(result['Count']) # added parenthesis


            Currently that gives me 140 results, starting with 545778205, which is the same as the website:
            http://www.ncbi.nlm.nih.gov/nuccore/?term=escherichia%5Borgn%5D+AND+complete+genome%5Btitle%5D






            share|improve this answer

































              0














              This works for me...



              search_term = 'escherichia coli[orgn] AND complete genome[title]'
              handle = Entrez.esearch(db='nucleotide', term=search_term)
              genome_ids = Entrez.read(handle)['IdList']

              for genome_id in genome_ids:
              record = Entrez.efetch(db="nucleotide", id=genome_id, rettype="gb", retmode="text")

              filename = 'generated/genBankRecord_{}.gb'.format(genome_id)
              print('Writing:{}'.format(filename))
              with open(filename, 'w') as f:
              f.write(record.read())





              share|improve this answer























                Your Answer






                StackExchange.ifUsing("editor", function () {
                StackExchange.using("externalEditor", function () {
                StackExchange.using("snippets", function () {
                StackExchange.snippets.init();
                });
                });
                }, "code-snippets");

                StackExchange.ready(function() {
                var channelOptions = {
                tags: "".split(" "),
                id: "1"
                };
                initTagRenderer("".split(" "), "".split(" "), channelOptions);

                StackExchange.using("externalEditor", function() {
                // Have to fire editor after snippets, if snippets enabled
                if (StackExchange.settings.snippets.snippetsEnabled) {
                StackExchange.using("snippets", function() {
                createEditor();
                });
                }
                else {
                createEditor();
                }
                });

                function createEditor() {
                StackExchange.prepareEditor({
                heartbeatType: 'answer',
                autoActivateHeartbeat: false,
                convertImagesToLinks: true,
                noModals: true,
                showLowRepImageUploadWarning: true,
                reputationToPostImages: 10,
                bindNavPrevention: true,
                postfix: "",
                imageUploader: {
                brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
                contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
                allowUrls: true
                },
                onDemand: true,
                discardSelector: ".discard-answer"
                ,immediatelyShowMarkdownHelp:true
                });


                }
                });














                draft saved

                draft discarded


















                StackExchange.ready(
                function () {
                StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f18461629%2fhow-to-download-complete-genome-sequence-in-biopython-entrez-esearch%23new-answer', 'question_page');
                }
                );

                Post as a guest















                Required, but never shown

























                3 Answers
                3






                active

                oldest

                votes








                3 Answers
                3






                active

                oldest

                votes









                active

                oldest

                votes






                active

                oldest

                votes









                1














                Your question is clear, but the full answer is long. The code I provide generates a .fasta file for each of your desired E.Coli genome sequences, yes only the "Complete Genomes" in NCBI.



                You will see there are only six complete E.Coli reference genomes in NCBI (http://www.ncbi.nlm.nih.gov/genome/167):



                enter image description here



                To help you, here are the Genbank/Refseq links to their genomes:




                1. http://www.ncbi.nlm.nih.gov/nuccore/NC_000913.3


                2. http://www.ncbi.nlm.nih.gov/nuccore/NC_002695.1


                3. http://www.ncbi.nlm.nih.gov/nuccore/NC_011750.1


                4. http://www.ncbi.nlm.nih.gov/nuccore/NC_011751.1


                5. http://www.ncbi.nlm.nih.gov/nuccore/NC_017634.1


                6. http://www.ncbi.nlm.nih.gov/nuccore/NC_018658.1



                Here is my code for Complete Genome Sequence Parsing into .FASTA files...



                # Imports
                from Bio import Entrez
                from Bio import SeqIO

                #############################
                # Retrieve NCBI Data Online #
                #############################

                Entrez.email = "asiak@wp.pl" # Always tell NCBI who you are
                genomeAccessions = ['NC_000913', 'NC_002695', 'NC_011750', 'NC_011751', 'NC_017634', 'NC_018658']
                search = " ".join(genomeAccessions)
                handle = Entrez.read(Entrez.esearch(db="nucleotide", term=search, retmode="xml"))
                genomeIds = handle['IdList']
                records = Entrez.efetch(db="nucleotide", id=genomeIds, rettype="gb", retmode="text")

                ###############################
                # Generate Genome Fasta files #
                ###############################

                sequences = # store your sequences in a list
                headers = # store genome names in a list (db_xref ids)

                for i,record in enumerate(records):

                file_out = open("genBankRecord_"+str(i)+".gb", "w") # store each genomes .gb in separate files
                file_out.write(record.read())
                file_out.close()

                genomeGenbank = SeqIO.read("genBankRecord"+str(i)+".gb", "genbank") # parse in the genbank files
                header = genome.features[0].qualifiers['db_xref'][0] # name the genome using db_xfred ID
                sequence = genome.seq.tostring() # obtain genome sequence

                headers.append('>'+header) # store genome name in list
                sequences.append(sequence) # store sequence in list

                fasta_out = open("genome"+str(i)+".fasta","w") # store each genomes .fasta in separate files
                fasta_out.write(header) # >header ... followed by:
                fasta_out.write(sequence) # sequence ...
                fasta_out.close() # close that .fasta file and move on to next genome
                records.close()


                Let me know how it goes!
                Andy






                share|improve this answer




























                  1














                  Your question is clear, but the full answer is long. The code I provide generates a .fasta file for each of your desired E.Coli genome sequences, yes only the "Complete Genomes" in NCBI.



                  You will see there are only six complete E.Coli reference genomes in NCBI (http://www.ncbi.nlm.nih.gov/genome/167):



                  enter image description here



                  To help you, here are the Genbank/Refseq links to their genomes:




                  1. http://www.ncbi.nlm.nih.gov/nuccore/NC_000913.3


                  2. http://www.ncbi.nlm.nih.gov/nuccore/NC_002695.1


                  3. http://www.ncbi.nlm.nih.gov/nuccore/NC_011750.1


                  4. http://www.ncbi.nlm.nih.gov/nuccore/NC_011751.1


                  5. http://www.ncbi.nlm.nih.gov/nuccore/NC_017634.1


                  6. http://www.ncbi.nlm.nih.gov/nuccore/NC_018658.1



                  Here is my code for Complete Genome Sequence Parsing into .FASTA files...



                  # Imports
                  from Bio import Entrez
                  from Bio import SeqIO

                  #############################
                  # Retrieve NCBI Data Online #
                  #############################

                  Entrez.email = "asiak@wp.pl" # Always tell NCBI who you are
                  genomeAccessions = ['NC_000913', 'NC_002695', 'NC_011750', 'NC_011751', 'NC_017634', 'NC_018658']
                  search = " ".join(genomeAccessions)
                  handle = Entrez.read(Entrez.esearch(db="nucleotide", term=search, retmode="xml"))
                  genomeIds = handle['IdList']
                  records = Entrez.efetch(db="nucleotide", id=genomeIds, rettype="gb", retmode="text")

                  ###############################
                  # Generate Genome Fasta files #
                  ###############################

                  sequences = # store your sequences in a list
                  headers = # store genome names in a list (db_xref ids)

                  for i,record in enumerate(records):

                  file_out = open("genBankRecord_"+str(i)+".gb", "w") # store each genomes .gb in separate files
                  file_out.write(record.read())
                  file_out.close()

                  genomeGenbank = SeqIO.read("genBankRecord"+str(i)+".gb", "genbank") # parse in the genbank files
                  header = genome.features[0].qualifiers['db_xref'][0] # name the genome using db_xfred ID
                  sequence = genome.seq.tostring() # obtain genome sequence

                  headers.append('>'+header) # store genome name in list
                  sequences.append(sequence) # store sequence in list

                  fasta_out = open("genome"+str(i)+".fasta","w") # store each genomes .fasta in separate files
                  fasta_out.write(header) # >header ... followed by:
                  fasta_out.write(sequence) # sequence ...
                  fasta_out.close() # close that .fasta file and move on to next genome
                  records.close()


                  Let me know how it goes!
                  Andy






                  share|improve this answer


























                    1












                    1








                    1







                    Your question is clear, but the full answer is long. The code I provide generates a .fasta file for each of your desired E.Coli genome sequences, yes only the "Complete Genomes" in NCBI.



                    You will see there are only six complete E.Coli reference genomes in NCBI (http://www.ncbi.nlm.nih.gov/genome/167):



                    enter image description here



                    To help you, here are the Genbank/Refseq links to their genomes:




                    1. http://www.ncbi.nlm.nih.gov/nuccore/NC_000913.3


                    2. http://www.ncbi.nlm.nih.gov/nuccore/NC_002695.1


                    3. http://www.ncbi.nlm.nih.gov/nuccore/NC_011750.1


                    4. http://www.ncbi.nlm.nih.gov/nuccore/NC_011751.1


                    5. http://www.ncbi.nlm.nih.gov/nuccore/NC_017634.1


                    6. http://www.ncbi.nlm.nih.gov/nuccore/NC_018658.1



                    Here is my code for Complete Genome Sequence Parsing into .FASTA files...



                    # Imports
                    from Bio import Entrez
                    from Bio import SeqIO

                    #############################
                    # Retrieve NCBI Data Online #
                    #############################

                    Entrez.email = "asiak@wp.pl" # Always tell NCBI who you are
                    genomeAccessions = ['NC_000913', 'NC_002695', 'NC_011750', 'NC_011751', 'NC_017634', 'NC_018658']
                    search = " ".join(genomeAccessions)
                    handle = Entrez.read(Entrez.esearch(db="nucleotide", term=search, retmode="xml"))
                    genomeIds = handle['IdList']
                    records = Entrez.efetch(db="nucleotide", id=genomeIds, rettype="gb", retmode="text")

                    ###############################
                    # Generate Genome Fasta files #
                    ###############################

                    sequences = # store your sequences in a list
                    headers = # store genome names in a list (db_xref ids)

                    for i,record in enumerate(records):

                    file_out = open("genBankRecord_"+str(i)+".gb", "w") # store each genomes .gb in separate files
                    file_out.write(record.read())
                    file_out.close()

                    genomeGenbank = SeqIO.read("genBankRecord"+str(i)+".gb", "genbank") # parse in the genbank files
                    header = genome.features[0].qualifiers['db_xref'][0] # name the genome using db_xfred ID
                    sequence = genome.seq.tostring() # obtain genome sequence

                    headers.append('>'+header) # store genome name in list
                    sequences.append(sequence) # store sequence in list

                    fasta_out = open("genome"+str(i)+".fasta","w") # store each genomes .fasta in separate files
                    fasta_out.write(header) # >header ... followed by:
                    fasta_out.write(sequence) # sequence ...
                    fasta_out.close() # close that .fasta file and move on to next genome
                    records.close()


                    Let me know how it goes!
                    Andy






                    share|improve this answer













                    Your question is clear, but the full answer is long. The code I provide generates a .fasta file for each of your desired E.Coli genome sequences, yes only the "Complete Genomes" in NCBI.



                    You will see there are only six complete E.Coli reference genomes in NCBI (http://www.ncbi.nlm.nih.gov/genome/167):



                    enter image description here



                    To help you, here are the Genbank/Refseq links to their genomes:




                    1. http://www.ncbi.nlm.nih.gov/nuccore/NC_000913.3


                    2. http://www.ncbi.nlm.nih.gov/nuccore/NC_002695.1


                    3. http://www.ncbi.nlm.nih.gov/nuccore/NC_011750.1


                    4. http://www.ncbi.nlm.nih.gov/nuccore/NC_011751.1


                    5. http://www.ncbi.nlm.nih.gov/nuccore/NC_017634.1


                    6. http://www.ncbi.nlm.nih.gov/nuccore/NC_018658.1



                    Here is my code for Complete Genome Sequence Parsing into .FASTA files...



                    # Imports
                    from Bio import Entrez
                    from Bio import SeqIO

                    #############################
                    # Retrieve NCBI Data Online #
                    #############################

                    Entrez.email = "asiak@wp.pl" # Always tell NCBI who you are
                    genomeAccessions = ['NC_000913', 'NC_002695', 'NC_011750', 'NC_011751', 'NC_017634', 'NC_018658']
                    search = " ".join(genomeAccessions)
                    handle = Entrez.read(Entrez.esearch(db="nucleotide", term=search, retmode="xml"))
                    genomeIds = handle['IdList']
                    records = Entrez.efetch(db="nucleotide", id=genomeIds, rettype="gb", retmode="text")

                    ###############################
                    # Generate Genome Fasta files #
                    ###############################

                    sequences = # store your sequences in a list
                    headers = # store genome names in a list (db_xref ids)

                    for i,record in enumerate(records):

                    file_out = open("genBankRecord_"+str(i)+".gb", "w") # store each genomes .gb in separate files
                    file_out.write(record.read())
                    file_out.close()

                    genomeGenbank = SeqIO.read("genBankRecord"+str(i)+".gb", "genbank") # parse in the genbank files
                    header = genome.features[0].qualifiers['db_xref'][0] # name the genome using db_xfred ID
                    sequence = genome.seq.tostring() # obtain genome sequence

                    headers.append('>'+header) # store genome name in list
                    sequences.append(sequence) # store sequence in list

                    fasta_out = open("genome"+str(i)+".fasta","w") # store each genomes .fasta in separate files
                    fasta_out.write(header) # >header ... followed by:
                    fasta_out.write(sequence) # sequence ...
                    fasta_out.close() # close that .fasta file and move on to next genome
                    records.close()


                    Let me know how it goes!
                    Andy







                    share|improve this answer












                    share|improve this answer



                    share|improve this answer










                    answered Nov 23 '13 at 23:35









                    hello_there_andyhello_there_andy

                    1,20121341




                    1,20121341

























                        1














                        You've done the hard part and worked out the query,



                        escherichia[orgn] AND complete genome[title]


                        So use that as the search query via Biopython as well!



                        from Bio import Entrez
                        Entrez.email = "asiakXX@wp.pl"
                        search_term = "escherichia[orgn] AND complete genome[title]"
                        handle = Entrez.esearch(db='nucleotide', term=search_term)
                        result = Entrez.read(handle)
                        handle.close()
                        print(result['Count']) # added parenthesis


                        Currently that gives me 140 results, starting with 545778205, which is the same as the website:
                        http://www.ncbi.nlm.nih.gov/nuccore/?term=escherichia%5Borgn%5D+AND+complete+genome%5Btitle%5D






                        share|improve this answer






























                          1














                          You've done the hard part and worked out the query,



                          escherichia[orgn] AND complete genome[title]


                          So use that as the search query via Biopython as well!



                          from Bio import Entrez
                          Entrez.email = "asiakXX@wp.pl"
                          search_term = "escherichia[orgn] AND complete genome[title]"
                          handle = Entrez.esearch(db='nucleotide', term=search_term)
                          result = Entrez.read(handle)
                          handle.close()
                          print(result['Count']) # added parenthesis


                          Currently that gives me 140 results, starting with 545778205, which is the same as the website:
                          http://www.ncbi.nlm.nih.gov/nuccore/?term=escherichia%5Borgn%5D+AND+complete+genome%5Btitle%5D






                          share|improve this answer




























                            1












                            1








                            1







                            You've done the hard part and worked out the query,



                            escherichia[orgn] AND complete genome[title]


                            So use that as the search query via Biopython as well!



                            from Bio import Entrez
                            Entrez.email = "asiakXX@wp.pl"
                            search_term = "escherichia[orgn] AND complete genome[title]"
                            handle = Entrez.esearch(db='nucleotide', term=search_term)
                            result = Entrez.read(handle)
                            handle.close()
                            print(result['Count']) # added parenthesis


                            Currently that gives me 140 results, starting with 545778205, which is the same as the website:
                            http://www.ncbi.nlm.nih.gov/nuccore/?term=escherichia%5Borgn%5D+AND+complete+genome%5Btitle%5D






                            share|improve this answer















                            You've done the hard part and worked out the query,



                            escherichia[orgn] AND complete genome[title]


                            So use that as the search query via Biopython as well!



                            from Bio import Entrez
                            Entrez.email = "asiakXX@wp.pl"
                            search_term = "escherichia[orgn] AND complete genome[title]"
                            handle = Entrez.esearch(db='nucleotide', term=search_term)
                            result = Entrez.read(handle)
                            handle.close()
                            print(result['Count']) # added parenthesis


                            Currently that gives me 140 results, starting with 545778205, which is the same as the website:
                            http://www.ncbi.nlm.nih.gov/nuccore/?term=escherichia%5Borgn%5D+AND+complete+genome%5Btitle%5D







                            share|improve this answer














                            share|improve this answer



                            share|improve this answer








                            edited Jan 1 at 5:15









                            Shred

                            210112




                            210112










                            answered Oct 18 '13 at 20:36









                            peterjcpeterjc

                            1,189511




                            1,189511























                                0














                                This works for me...



                                search_term = 'escherichia coli[orgn] AND complete genome[title]'
                                handle = Entrez.esearch(db='nucleotide', term=search_term)
                                genome_ids = Entrez.read(handle)['IdList']

                                for genome_id in genome_ids:
                                record = Entrez.efetch(db="nucleotide", id=genome_id, rettype="gb", retmode="text")

                                filename = 'generated/genBankRecord_{}.gb'.format(genome_id)
                                print('Writing:{}'.format(filename))
                                with open(filename, 'w') as f:
                                f.write(record.read())





                                share|improve this answer




























                                  0














                                  This works for me...



                                  search_term = 'escherichia coli[orgn] AND complete genome[title]'
                                  handle = Entrez.esearch(db='nucleotide', term=search_term)
                                  genome_ids = Entrez.read(handle)['IdList']

                                  for genome_id in genome_ids:
                                  record = Entrez.efetch(db="nucleotide", id=genome_id, rettype="gb", retmode="text")

                                  filename = 'generated/genBankRecord_{}.gb'.format(genome_id)
                                  print('Writing:{}'.format(filename))
                                  with open(filename, 'w') as f:
                                  f.write(record.read())





                                  share|improve this answer


























                                    0












                                    0








                                    0







                                    This works for me...



                                    search_term = 'escherichia coli[orgn] AND complete genome[title]'
                                    handle = Entrez.esearch(db='nucleotide', term=search_term)
                                    genome_ids = Entrez.read(handle)['IdList']

                                    for genome_id in genome_ids:
                                    record = Entrez.efetch(db="nucleotide", id=genome_id, rettype="gb", retmode="text")

                                    filename = 'generated/genBankRecord_{}.gb'.format(genome_id)
                                    print('Writing:{}'.format(filename))
                                    with open(filename, 'w') as f:
                                    f.write(record.read())





                                    share|improve this answer













                                    This works for me...



                                    search_term = 'escherichia coli[orgn] AND complete genome[title]'
                                    handle = Entrez.esearch(db='nucleotide', term=search_term)
                                    genome_ids = Entrez.read(handle)['IdList']

                                    for genome_id in genome_ids:
                                    record = Entrez.efetch(db="nucleotide", id=genome_id, rettype="gb", retmode="text")

                                    filename = 'generated/genBankRecord_{}.gb'.format(genome_id)
                                    print('Writing:{}'.format(filename))
                                    with open(filename, 'w') as f:
                                    f.write(record.read())






                                    share|improve this answer












                                    share|improve this answer



                                    share|improve this answer










                                    answered Oct 13 '14 at 19:49









                                    schryerschryer

                                    7312




                                    7312






























                                        draft saved

                                        draft discarded




















































                                        Thanks for contributing an answer to Stack Overflow!


                                        • Please be sure to answer the question. Provide details and share your research!

                                        But avoid



                                        • Asking for help, clarification, or responding to other answers.

                                        • Making statements based on opinion; back them up with references or personal experience.


                                        To learn more, see our tips on writing great answers.




                                        draft saved


                                        draft discarded














                                        StackExchange.ready(
                                        function () {
                                        StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f18461629%2fhow-to-download-complete-genome-sequence-in-biopython-entrez-esearch%23new-answer', 'question_page');
                                        }
                                        );

                                        Post as a guest















                                        Required, but never shown





















































                                        Required, but never shown














                                        Required, but never shown












                                        Required, but never shown







                                        Required, but never shown

































                                        Required, but never shown














                                        Required, but never shown












                                        Required, but never shown







                                        Required, but never shown







                                        Popular posts from this blog

                                        MongoDB - Not Authorized To Execute Command

                                        How to fix TextFormField cause rebuild widget in Flutter

                                        in spring boot 2.1 many test slices are not allowed anymore due to multiple @BootstrapWith