Are there certain emails that imaplib cannot fetch but does not return an error?


I'm using Python's imaplib to pull emails from an Outlook inbox. It works most of the time, but every so often a fetch returns no data even though the result is OK, the UID is valid, and I can see the email in the inbox. Are there certain kinds of emails that FETCH cannot load? The emails I cannot fetch are often replies to earlier emails, and their bodies are not particularly long or complicated: roughly 5k-10k characters, mostly text with a series of footers, and around 5 people in each of the To and CC fields. Some contain special characters, but not all. The most puzzling part is that the server never sends a BAD or any other error response. It sends back OK with no data.

At first I thought the email did not exist, but it shows up in the UID search and in the inbox. I then tried different fetch items: BODY.PEEK[HEADER.FIELDS (SUBJECT)], RFC822.HEADER and FLAGS. Finally, I saved the emails as .msg files to check whether anything was unusual about their format.

The basic set up of the email fetch script looks like this:

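import imaplib
imaplib._MAXLINE = 50000000  # raised line-length limit used in our script
server = imaplib.IMAP4_SSL(imap_server, imap_port)
server.login(username, password)  # placeholder credentials
server.select('INBOX', readonly=True)  # read-only select, sent as EXAMINE
result, data = server.uid('FETCH', '42859', '(BODY.PEEK[HEADER.FIELDS (SUBJECT)])')  # UID passed as a string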

Here's a UID search for UIDs between 42858 and 42865. You can see that it returns 3 UIDs, including the problematic email 42862:

  10:35.18 > b'FOFI4 EXAMINE INBOX'
  10:35.47 < b'* 11892 EXISTS'
  10:35.47 < b'* 11892 RECENT'
  10:35.47 < b'* FLAGS (\\Seen \\Answered \\Flagged \\Deleted \\Draft $MDNSent)'
  10:35.47 < b'* OK [PERMANENTFLAGS ()] Permanent flags'
  10:35.47 < b'* OK [UNSEEN 3] Is the first unseen message'
  10:35.47 < b'* OK [UIDVALIDITY 14] UIDVALIDITY value'
  10:35.47 < b'* OK [UIDNEXT 43441] The next unique identifier value'
  10:35.47 < b'FOFI4 OK [READ-ONLY] EXAMINE completed.'
  10:35.47 > b'FOFI5 UID SEARCH (UID 42858:42865)'
  10:35.51 < b'* SEARCH 42859 42862 42865'
  10:35.51 < b'FOFI5 OK SEARCH completed.'
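
The `* SEARCH` line above reaches Python as a single space-separated bytes value. A minimal sketch of turning it into a list of UID strings that can be passed to later `uid('FETCH', ...)` calls; `parse_uids` is a hypothetical helper name, and the input is the payload from the log above:

```python
def parse_uids(search_data):
    """Split a SEARCH payload such as [b'42859 42862 42865'] into UID strings."""
    # Guard against an empty list or a None placeholder.
    if not search_data or search_data[0] is None:
        return []
    return [uid.decode('ascii') for uid in search_data[0].split()]


uids = parse_uids([b'42859 42862 42865'])
print(uids)
```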

When I fetch the subject of email 42859, I see:

  10:38.79 > b'FOFI6 UID FETCH 42859 (BODY.PEEK[HEADER.FIELDS (SUBJECT)])'
  10:38.87 < b'* 11726 FETCH (BODY[HEADER.FIELDS (SUBJECT)] {126}'
  10:38.87 read literal size 126
  10:38.87 < b' UID 42859)'
  10:38.95 < b'FOFI6 OK FETCH completed.'
---
result: OK
data:   [(b'11726 (BODY[HEADER.FIELDS (SUBJECT)] {126}', b'Subject: XXX'), b' UID 42859)']

When I try the same for the problematic UID 42862:

  16:07.86 > b'OLDP6 UID FETCH 42862 (BODY.PEEK[HEADER.FIELDS (SUBJECT)])'
  16:09.08 < b'OLDP6 OK FETCH completed.'
---
result: OK
data:   [None]
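
Since the status is OK in both cases, only the payload distinguishes a real response from an empty one. A minimal sketch of a check the script could run after each fetch; `has_fetch_data` is a hypothetical helper, and the two inputs are the `data` values shown above:

```python
def has_fetch_data(result, data):
    """Return True only if the server said OK and sent at least one FETCH item."""
    # imaplib represents "no untagged response" as a list containing None.
    return result == 'OK' and any(item is not None for item in data)


good = [(b'11726 (BODY[HEADER.FIELDS (SUBJECT)] {126}', b'Subject: XXX'), b' UID 42859)']
empty = [None]
print(has_fetch_data('OK', good), has_fetch_data('OK', empty))
```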

Fetching RFC822.SIZE did return a size. Compared with the sizes of emails I was able to fetch, it was not out of the ordinary.

  11:35.69 > b'MOMM6 UID FETCH 42862 (RFC822.SIZE)'
  11:35.72 < b'* 11727 FETCH (RFC822.SIZE 120380 UID 42862)'
  11:35.80 < b'MOMM6 OK FETCH completed.'
---
result: OK
data: [b'11727 (RFC822.SIZE 120380 UID 42862)']

When I fetched the FLAGS for the problematic UID, I got the \Recent flag back:

  08:50.93 > b'MFBH6 UID FETCH 42862 (FLAGS)'
  08:50.96 < b'* 11727 FETCH (FLAGS (\\Recent) UID 42862)'
  08:51.04 < b'MFBH6 OK FETCH completed.'
---
result: OK
data: [b'11727 (FLAGS (\\Recent) UID 42862)']
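
To compare the responses that do come back, it can help to pull the sequence number, UID, size and flags out of each item. A rough sketch, assuming items shaped like the two `data` values above; `summarize_fetch_item` is a hypothetical helper, while `imaplib.ParseFlags` is the standard-library function:

```python
import imaplib
import re


def summarize_fetch_item(item):
    """Extract sequence number, UID, size and flags from one FETCH item (bytes)."""
    seq = re.match(rb'(\d+) \(', item)
    uid = re.search(rb'UID (\d+)', item)
    size = re.search(rb'RFC822\.SIZE (\d+)', item)
    return {
        'seq': int(seq.group(1)) if seq else None,
        'uid': int(uid.group(1)) if uid else None,
        'size': int(size.group(1)) if size else None,
        # ParseFlags returns an empty tuple when the item has no FLAGS part.
        'flags': imaplib.ParseFlags(item),
    }


size_info = summarize_fetch_item(b'11727 (RFC822.SIZE 120380 UID 42862)')
flag_info = summarize_fetch_item(b'11727 (FLAGS (\\Recent) UID 42862)')
print(size_info)
print(flag_info)
```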

I looked at the email properties and body formats but found no common thread among the problematic emails. So far I have not found any documentation or reports of fetch calls that return neither an error nor data. If there are other commands I should try to get more information about these emails, please let me know.
