How to calculate the ID3v2 tag size from an MP3 file correctly?

To my shame, I still don't quite understand byte arithmetic and other manipulations. I am trying to calculate the size of the ID3 tag of an MP3 file, for ID3v2.3 or ID3v2.4 tags with no extended header. For simplicity, the function returns an empty list on any error.

ID3 description

from functools import reduce


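def id3_size_calc(file_path):
    try:
        file_open = open(file_path, 'rb')
    except Exception:
        return []
    with file_open:
        id3_header = file_open.read(10)
        if len(id3_header) < 10 or id3_header[0:3] != b'ID3':
            # File too short, or no ID3v2 tag at the start of the file.
            return []
        if id3_header[3] not in (3, 4):
            # Note: `!= (3 or 4)` only ever compared against 3.
            return []
        if id3_header[5] != 0:
            # Any flag set (extended header etc.) is out of scope here.
            return []
        # Last 4 header bytes: the size as four 7-bit groups, most significant first.
        size_encoded = id3_header[-4:]
        return reduce(lambda a, b: a * 128 + b, size_encoded, 0)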

I found this piece of code.

size = reduce(lambda a, b: a * 128 + b, size_encoded, 0)

However, I don't understand how it works. In addition, I read somewhere that the reduce function is considered outdated. Is there a more elegant way to calculate the size of this tag?

There are 3 best solutions below

checkmate101 On

The simplest way would be to use ffmpeg/ffprobe:

ffprobe -i file.mp3 -v debug 2>&1 | grep id3v2

This should give you output like:

id3v2 ver:4 flags:00 len:35
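If you want that value inside a script, one option is to pull the fields out of the matching line with a regular expression. This is only a sketch: it assumes the line looks exactly like the sample output above and that the flags field is hexadecimal, and parse_ffprobe_id3v2 is a made-up helper name, not part of ffmpeg.

```python
import re

# Assumed line format, taken from the sample output: "id3v2 ver:4 flags:00 len:35"
ID3V2_LINE = re.compile(r"id3v2 ver:(\d+) flags:([0-9a-fA-F]+) len:(\d+)")


def parse_ffprobe_id3v2(line):
    """Return the ver/flags/len fields as integers, or None if the line does not match."""
    match = ID3V2_LINE.search(line)
    if match is None:
        return None
    return {
        "version": int(match.group(1)),
        "flags": int(match.group(2), 16),  # assumption: flags are printed in hex
        "len": int(match.group(3)),
    }


print(parse_ffprobe_id3v2("id3v2 ver:4 flags:00 len:35"))
# {'version': 4, 'flags': 0, 'len': 35}
```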

If you would rather not use ffprobe, or do not have it installed, here is a Python snippet that gets the size:

from functools import reduce

with open('file.mp3', 'rb') as file_open:
    data = file_open.read(10)  # the ID3v2 header is the first 10 bytes

if data[0:3] != b'ID3':
    print('No ID3 header present in file.')
else:
    # The last 4 header bytes hold the size, 7 bits per byte.
    size_encoded = data[-4:]
    size = reduce(lambda a, b: a * 128 + b, size_encoded, 0)
    print(size)
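As for how the reduce line works: it walks the four size bytes from left to right, and at each step multiplies the running total by 128 (that is, shifts it left by 7 bits) before adding the next byte. reduce is not deprecated; it was only moved from the builtins into functools in Python 3. If you find it hard to read, a plain loop does the same thing. A minimal sketch, where synchsafe_reduce and synchsafe_loop are illustrative names of my own:

```python
from functools import reduce


def synchsafe_reduce(size_bytes):
    """Fold the bytes together: total = total * 128 + next_byte."""
    return reduce(lambda a, b: a * 128 + b, size_bytes, 0)


def synchsafe_loop(size_bytes):
    """Same calculation written as an explicit loop with bit operations."""
    size = 0
    for b in size_bytes:
        # Shifting left by 7 is the same as multiplying by 128;
        # masking with 0x7F keeps only the 7 payload bits of the byte.
        size = (size << 7) | (b & 0x7F)
    return size


example = bytes([0x00, 0x00, 0x02, 0x01])
# Step by step: 0 -> 0 -> 0*128+2 = 2 -> 2*128+1 = 257
print(synchsafe_reduce(example))  # 257
print(synchsafe_loop(example))    # 257
```

The two versions agree whenever every byte is below 128, which is what the header format requires for the size bytes.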
JayRizzo On

I ran into this error because the file was missing its tags. Credit: https://github.com/quodlibet/mutagen/issues/327#issuecomment-339316014

from mutagen.mp3 import MP3
from mutagen.id3 import ID3

def CreateMissingTag(filename):
    """Credit: https://github.com/quodlibet/mutagen/issues/327#issuecomment-339316014"""
    try:
        mp3 = MP3(filename)
        if mp3.tags is None:
            print("No ID3 Header or Tags Exist.")
            mp3.add_tags()
            print("Default Placeholder Tags Were Created.")
        tags = mp3.tags
        mp3.save()
    except Exception as e:
        print(f"{e}")

songfile = '/Music/Tom MacDonald - Angels (Explicit).mp3'

# If the following throws an error when you run:
tags = ID3(songfile)
# or if you deleted the tags
tags.delete(songfile) # This deletes the ID3 Tags.
tags = ID3(songfile)
'''When you see this error or similar:
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/mutagen/id3/_file.py", line 77, in __init__
    super(ID3, self).__init__(*args, **kwargs)
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/mutagen/id3/_tags.py", line 173, in __init__
    super(ID3Tags, self).__init__(*args, **kwargs)
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/mutagen/_util.py", line 534, in __init__
    super(DictProxy, self).__init__(*args, **kwargs)
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/mutagen/_tags.py", line 111, in __init__
    self.load(*args, **kwargs)
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/mutagen/_util.py", line 185, in wrapper
    return func(*args, **kwargs)
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/mutagen/_util.py", line 156, in wrapper
    return func(self, h, *args, **kwargs)
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/mutagen/id3/_file.py", line 152, in load
    self._header = ID3Header(fileobj)
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/mutagen/_util.py", line 185, in wrapper
    return func(*args, **kwargs)
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/mutagen/id3/_tags.py", line 67, in __init__
    raise ID3NoHeaderError("%r doesn't start with an ID3 tag" % fn)
mutagen.id3._util.ID3NoHeaderError: '/SongPath/Music/Tom MacDonald - Angels (Explicit).mp3' doesn't start with an ID3 tag
'''

# then just call CreateMissingTag on your file to fix it
CreateMissingTag(songfile)
# prints: 
# # No ID3 Header or Tags Exist.
# # Default Placeholder Tags Were Created.

# and now you can check the size of the ID3 Tags using the following
tags = ID3(songfile)
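tags.size  # in mutagen's documented API, size is an int attribute (tag size including the header), not a method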

# Returns:
# # 8896 as an Empty Tag List.
dsanchez On

The size of the tag is stored in the ID3v2 tag itself, more specifically in the ID3v2 header. According to the ID3v2 spec, when the tag sits at the start of an MP3 file, the first 10 bytes of the file are the ID3v2 header. The header has the following structure:

$49 44 33 yy yy xx zz zz zz zz

The "$" denotes hexadecimal data.

The structure can be broken down into the following parts:

  • The first 3 bytes are always "49 44 33". This is the hexadecimal equivalent of the characters "I", "D" and "3", or "ID3".
  • The next 2 bytes (4 and 5) denote the ID3v2 tag version. For example, "03 00" would mean that the tag version is ID3v2.3.0. These values are always less than "FF" (decimal 255).
  • The next byte (6) contains the ID3v2 flags. The data is stored in the top bits, starting with the most significant bit (bit 7, the leftmost bit): the first 3 bits in ID3v2.3, and the first 4 bits in ID3v2.4. All remaining bits are set to 0. The definition for what those flags are and what they mean can be found in the spec.
  • The next 4 bytes (7, 8, 9 and 10) are the tag size. Each byte is encoded with the most significant bit always set to 0, so each byte is effectively a 7-bit binary integer. This leaves a total of 28 bits for the size value, for a maximum tag size of 256MB. Each of these bytes will always be less than "80" (decimal 128).
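The breakdown above can be sketched in Python roughly as follows. I did not write this against real files, so treat it as an illustration of the structure rather than a finished parser; parse_id3v2_header is a name I made up for the example.

```python
import struct


def parse_id3v2_header(header):
    """Split a 10-byte ID3v2 header into (major, revision, flags, size)."""
    if len(header) != 10:
        raise ValueError("an ID3v2 header is exactly 10 bytes")
    # 3 identifier bytes, 2 version bytes, 1 flags byte, 4 size bytes.
    magic, major, revision, flags, s0, s1, s2, s3 = struct.unpack(">3sBBB4B", header)
    if magic != b"ID3":
        raise ValueError("missing 'ID3' identifier")
    if any(b >= 0x80 for b in (s0, s1, s2, s3)):
        raise ValueError("size bytes must have their most significant bit clear")
    # Four 7-bit groups, most significant group first: 28 bits in total.
    size = (s0 << 21) | (s1 << 14) | (s2 << 7) | s3
    return major, revision, flags, size


print(parse_id3v2_header(b"ID3\x03\x00\x00\x00\x00\x02\x01"))
# (3, 0, 0, 257)
```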

Please note the following (from the spec) regarding the stored tag size:

The ID3v2 tag size is the size of the complete tag after unsychronisation, including padding, excluding the header but not excluding the extended header (total tag size - 10)
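In other words, the stored value does not include the 10 header bytes, so to find where the tag ends (and the audio begins) you have to add them back. ID3v2.4 additionally allows an optional 10-byte footer, signalled by a flag bit, which is also not counted in the stored size. A small sketch of that arithmetic, with total_tag_bytes and the constant names being my own labels:

```python
HEADER_SIZE = 10
FOOTER_SIZE = 10
FOOTER_FLAG = 0x10  # "footer present" bit in the flags byte, ID3v2.4 only


def total_tag_bytes(stored_size, major_version, flags):
    """Bytes occupied by the whole tag, given the size stored in the header."""
    total = HEADER_SIZE + stored_size
    if major_version == 4 and flags & FOOTER_FLAG:
        total += FOOTER_SIZE
    return total


print(total_tag_bytes(257, 3, 0x00))  # 267
print(total_tag_bytes(257, 4, 0x10))  # 277
```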

Unfortunately, I am not a Python developer (I'm not a developer at all, I'm a photographer) so I can't provide you with any Python examples of how to extract and parse the data you're looking for. However, I have done this in C# and it's just a matter of pulling the first 10 bytes out of the file and parsing for the above structure to find the data you're looking for.

I hope this helps.