BioSeqIO not recognizing .gbff files

I am trying to convert a bunch of .gbff GenBank files to .gbk in order to parse sequences, etc. I got the following code to work for a single file:

import Bio
from Bio import SeqIO
count = SeqIO.convert("filename.gbff", "genbank", "filename.gbk", "genbank")

but I cannot get any code using "*.gbff" to work. For example:

from Bio import SeqIO
count = SeqIO.convert("*.gbff", "genbank", "*.gbk", "genbank")

I keep getting this error:

  File "", line 1
    count = SeqIO.convert(".gbff", "genbank", ".gbk", "genbank")
    ^
SyntaxError: invalid syntax

I've checked the syntax many times, and I'm starting to wonder whether Python simply doesn't recognize .gbff as a file format. Is there a way around this, or am I making some silly mistake that I haven't noticed?

Thanks in advance!!

There is 1 solution below

pippo1980 On

Here is my attempt, adapted from "How do I pass Biopython SeqIO.convert() over multiple files in a directory?":



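from Bio import SeqIO
import os

# Walk the current directory tree and convert every .gbff file found.
for path, dirs, files in os.walk(os.getcwd()):
    for filename in files:
        # splitext keeps dots inside the name (e.g. "GCF_000866645.1_...") intact
        stem, ext = os.path.splitext(filename)
        if ext == '.gbff':
            print(filename)
            # Join with `path` so files in subdirectories are found too
            src = os.path.join(path, filename)
            dst = os.path.join(path, stem + '.gbk')
            count = SeqIO.convert(src, "genbank", dst, "genbank")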

I tested this with .gbff files (multiple copies of https://ftp.ncbi.nlm.nih.gov/genomes/all/GCF/000/866/645/GCF_000866645.1_ViralMultiSegProj15620/GCF_000866645.1_ViralMultiSegProj15620_genomic.gbff.gz).

I'm not sure whether this is the same kind of .gbff you are talking about.
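
As for the "*.gbff" attempt in the question: SeqIO.convert() takes a single filename or handle and does not expand wildcards itself, so the pattern has to be expanded into individual files first. Below is a minimal sketch of that idea using pathlib, assuming all the .gbff files sit in one directory. The helper names gbk_targets and convert_all are hypothetical, made up for this illustration; only SeqIO.convert comes from Biopython.

```python
from pathlib import Path


def gbk_targets(directory="."):
    """Pair every *.gbff file in `directory` with a .gbk output path.

    Hypothetical helper: only the final extension is swapped, so dots
    inside the file name are left untouched.
    """
    pairs = []
    for src in sorted(Path(directory).glob("*.gbff")):
        pairs.append((src, src.with_suffix(".gbk")))
    return pairs


def convert_all(directory="."):
    """Run SeqIO.convert on each pair and return the total record count."""
    # Imported here so the path logic above can be tried without Biopython.
    from Bio import SeqIO

    total = 0
    for src, dst in gbk_targets(directory):
        # Both sides use the "genbank" format; only the extension changes.
        total += SeqIO.convert(str(src), "genbank", str(dst), "genbank")
    return total
```

Calling convert_all(".") would convert the files in the current directory. Since the input and output formats are both "genbank", the conversion only changes the extension; if the goal is just to parse the records, SeqIO.parse(path, "genbank") should accept the .gbff file directly.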