KeyError when getting features from a GenBank file with Biopython


I'm very new to Python, but I've been using it to extract the sequence of a gene from a GenBank file. The issue is that sometimes I get the output I want (the sequence is printed to a file) and sometimes it raises a KeyError, depending on which accession I'm using. Does anyone know why it might sometimes raise a KeyError? I thought it might be something to do with the GenBank records themselves, but they look pretty similar and the gene is there (in the gene feature qualifier). For example, it works with HG738867.1 but not AP019703.1. Here's my code:

from Bio import Entrez, SeqIO

gi_genome = 'accession'
name = 'acrA'
Entrez.email = 'email'
handle = Entrez.efetch(db="nucleotide", id=gi_genome, rettype="gbwithparts", retmode="text")
record = SeqIO.read(handle, "gb")
handle.close()
element = 0
for feature in record.features:
    if feature.type == 'CDS' and name in feature.qualifiers["gene"]:
        report = 'record.features[%s]' % str(element)
        gene_sequence = feature.extract(record.seq)
        with open('output.fasta', 'a') as f:
            print('>' + gi_genome + ' ' + name, file=f)
            print(gene_sequence, file=f)
        break
    else:
        element = element + 1

Here's the traceback -

Traceback (most recent call last):
  File "/home/ubuntu/Documents/Git_Branches/Project_planning/Learning/In_progress/utils/data.py", line 11, in <module>
    if feature.type == 'CDS' and name in feature.qualifiers["gene"]:
KeyError: 'gene'

Process finished with exit code 1

Thanks in advance!
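
A note on the likely cause, going only by the traceback: the lookup `feature.qualifiers["gene"]` runs for every CDS feature, so the KeyError means at least one CDS in the failing record has no `gene` qualifier and is reached before the acrA CDS. Some annotations give a CDS only a `locus_tag` or `product`, for instance. Below is a minimal sketch of a guarded lookup using `dict.get`. The helper name `find_cds_by_gene` and the sample features are hypothetical; the stand-ins are built with `types.SimpleNamespace` purely so the snippet runs without Biopython or network access.

```python
from types import SimpleNamespace


def find_cds_by_gene(features, gene_name):
    """Return the first CDS feature whose 'gene' qualifier lists gene_name, else None."""
    for feature in features:
        if feature.type != "CDS":
            continue
        # .get() falls back to an empty list when the qualifier is absent,
        # so a CDS without a 'gene' key is skipped instead of raising KeyError.
        if gene_name in feature.qualifiers.get("gene", []):
            return feature
    return None


# Stand-in features (made-up data) exposing the same .type and .qualifiers
# attributes the question's loop reads from each feature.
features = [
    SimpleNamespace(type="gene", qualifiers={"gene": ["acrA"]}),
    SimpleNamespace(type="CDS", qualifiers={"locus_tag": ["X_0001"]}),  # no 'gene' key
    SimpleNamespace(type="CDS", qualifiers={"gene": ["acrA"], "locus_tag": ["X_0002"]}),
]

match = find_cds_by_gene(features, "acrA")
print(match.qualifiers["locus_tag"])  # prints ['X_0002']
```

With a real record, the same function could be called as `find_cds_by_gene(record.features, name)`, followed by `feature.extract(record.seq)` on the result as in the original loop. Checking for a `None` return also makes it visible when a record has no CDS annotated with that gene name at all.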
