mime multipart/mixed w/ pure binary + python

It's unclear to me whether raw (non-base64) binary is supported by the standard in MIME multipart/mixed, in particular when decoding with Python:

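import email
import email.policy

# fp is a file object opened in binary mode
msg = email.message_from_binary_file(fp, policy=email.policy.HTTP)

for part in msg.walk():
    # Skip multipart containers; only leaf parts carry a payload
    if part.get_content_maintype() == 'multipart':
        continue

    filename = part.get_filename()
    payload = part.get_content()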

payload gets processed as text (or something like it: CR bytes, 0x0D, turn into LF bytes, 0x0A). Obviously this corrupts the data. Is there a way to put the parser into a pure binary mode? Should I be base64-encoding the data instead? The MIME message itself looks like this:

Content-Type: multipart/mixed;boundary=123456789000000000000987654321
Transfer-Encoding: chunked


--123456789000000000000987654321
Content-Type: image/jpeg
Content-transfer-encoding: binary
Content-Disposition: attachment; filename="2024-02-08T000418.jpg"
Content-Length: 23302

<binary data>
--123456789000000000000987654321

UPDATE 08FEB24

  1. RFC 2045 tells us: "Content-Transfer-Encoding [...] "binary" all mean that the identity (i.e. NO) encoding transformation has been performed"
  2. RFC 2045 tells us: "there are no circumstances in which the "binary" Content-Transfer-Encoding is actually valid in Internet mail."
  3. RFC 2045 gives the grammar:
     mechanism := "7bit" / "8bit" / "binary" /
                  "quoted-printable" / "base64" /
  4. Python's email.message_from_bytes leaves CRs alone
  5. Yes, fp is opened in binary mode
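
Since email.message_from_bytes leaves CRs alone, one possible workaround is to read the whole input into memory and parse that instead of handing the file object to the parser. The following is only a sketch under that assumption: the sample message, its short boundary, the payload bytes, and the helper name extract_parts are all made up for illustration.

```python
import email
import email.policy

# Hypothetical sample data: CR, LF, CRLF and high bytes that must survive intact.
PAYLOAD = b"\xff\xd8\xff\xe0a\rb\nc\r\nd\x00\x10\x13"

# Hypothetical message modelled on the one in the question (short boundary for brevity).
RAW = (
    b"Content-Type: multipart/mixed;boundary=XYZ\r\n"
    b"\r\n"
    b"--XYZ\r\n"
    b"Content-Type: image/jpeg\r\n"
    b"Content-Transfer-Encoding: binary\r\n"
    b'Content-Disposition: attachment; filename="a.jpg"\r\n'
    b"\r\n" + PAYLOAD + b"\r\n"
    b"--XYZ--\r\n"
)


def extract_parts(raw: bytes):
    """Parse from bytes (not a file object) and return (filename, payload) pairs."""
    msg = email.message_from_bytes(raw, policy=email.policy.HTTP)
    return [
        (part.get_filename(), part.get_content())
        for part in msg.walk()
        # Skip multipart containers; only leaf parts carry a payload
        if part.get_content_maintype() != "multipart"
    ]


parts = extract_parts(RAW)
```

With a real file this would be called as extract_parts(fp.read()), at the cost of holding the whole message in memory.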

From this I conclude:

  • Content-transfer-encoding: base64 is standards-supported
  • Python very much took item #2 to heart, since this is an email processing mechanism
  • Presuming that the 28-year-old RFC still holds true, one might not accuse the Python email module of having a bug. However, I say it does: email.message_from_binary_file specifically fails on a binary file.
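
For comparison, here is a rough sketch of the base64 route, assuming the sender can be changed to encode the attachment. It builds a multipart/mixed message with EmailMessage.add_attachment, which base64-encodes bytes content by default, and then reads it back through email.message_from_binary_file. The sample bytes, the filename, and the helper names build_message and read_parts are hypothetical.

```python
import io
import email
import email.policy
from email.message import EmailMessage

# Hypothetical sample data containing CR, LF, CRLF and high bytes.
DATA = b"\xff\xd8\xff\xe0a\rb\nc\r\nd\x00\x10\x13"


def build_message(data: bytes, filename: str) -> bytes:
    """Serialize a multipart/mixed message with one base64-encoded attachment."""
    msg = EmailMessage()  # default policy; bytes content is base64-encoded
    msg.add_attachment(data, maintype="image", subtype="jpeg", filename=filename)
    return msg.as_bytes()


def read_parts(fp):
    """Parse a binary file object and return (filename, payload) pairs."""
    msg = email.message_from_binary_file(fp, policy=email.policy.HTTP)
    return [
        (part.get_filename(), part.get_content())
        for part in msg.walk()
        if part.get_content_maintype() != "multipart"
    ]


raw = build_message(DATA, "x.jpg")
roundtrip = read_parts(io.BytesIO(raw))
```

Because the base64 body is plain ASCII, line-ending changes made during parsing do not affect the decoded bytes.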

Is my logic sound?
