llvm linker(lld) mapfile format explanation


I want to parse the lld mapfile. Is there an explanation or documentation of the format? I checked the clang documentation and the lld documentation but failed to find helpful information. Here is an example of my mapfile:

             VMA              LMA     Size Align Out     In      Symbol
      2002a8           2002a8       1d     1 .interp
      2002a8           2002a8       1d     1         <internal>:(.interp)
      2002c8           2002c8       20     4 .note.ABI-tag
      2002c8           2002c8       20     4         /opt/tiger/typhoon-blade/gccs/x86_64-x86_64-gcc-830/sysroot/usr/lib/../lib64/crt1.o:(.note.ABI-tag)
      2002e8           2002e8     24c0     8 .dynsym
      2002e8           2002e8     24c0     8         <internal>:(.dynsym)
      2027a8           2027a8      310     2 .gnu.version
      2027a8           2027a8      310     2         <internal>:(.gnu.version)
      202ab8           202ab8      170     4 .gnu.version_r
      202ab8           202ab8      170     4         <internal>:(.gnu.version_r)
      202c28           202c28       24     8 .gnu.hash
      202c28           202c28       24     8         <internal>:(.gnu.hash)
      202c4c           202c4c      c48     4 .hash
      202c4c           202c4c      c48     4         <internal>:(.hash)
      203894           203894     2c20     1 .dynstr
      203894           203894     2c20     1         <internal>:(.dynstr)
      2064b8           2064b8     9d50     8 .rela.dyn
      2064b8           2064b8     9d50     8         <internal>:(.rela.dyn)
      210208           210208     21f0     8 .rela.plt
      210208           210208     21f0     8         <internal>:(.rela.plt)
      212400           212400     b7b7    16 .rodata
      212400           212400        8     4         <internal>:(.rodata)
      212410           212410      262    16         build64_release/version.cpp.o:(.rodata)
      212410           212410        4     1                 kSvnInfoCount
      212420           212420      1e5     1                 kSvnInfo
      212605           212605        e     1                 kMainInfo
      212613           212613        8     1                 kBuildType
      212620           212620       19     1                 kBuildTime
      212640           212640       10     1                 kBuilderName
      212650           212650        d     1                 kHostName
      212660           212660       11     1                 kCompiler
      212671           212671        1     1                 kScmVersion
      212680           212680       80    16         <internal>:(.rodata)
      212700           212700     4bea     1         <internal>:(.rodata)
      2172ec           2172ec       18     4         build64_release/cpputil/json/libjson.a(json_params.cpp.o):(.rodata._ZNK9rapidjson12GenericValueINS_4UTF8IcEENS_12CrtAllocatorEE6AcceptINS_6WriterINS_19GenericStringBufferIS2_S3_EES2_S2_S3_EEEEbRT_)
      217304           217304       18     4         build64_release/cpputil/json/libjson.a(json_params.cpp.o):(.rodata._ZNK9rapidjson12GenericValueINS_4UTF8IcEENS_12CrtAllocatorEE6AcceptINS_12PrettyWriterINS_15FileWriteStreamES2_S2_S3_EEEEbRT_)
      21731c           21731c       84     4         build64_release/cpputil/json/libjson.a(json_params.cpp.o):(.rodata._ZN9rapidjson13GenericReaderINS_4UTF8IcEES2_NS_12CrtAllocatorEE10ParseValueILj0ENS_19GenericStringStreamIS2_EENS_19GenericDocumentLiteIS2_S3_S3_EEEEvRT0_RT1_)

There are 2 best solutions below

David Ledger On

The header of this source file has a vague outline of the format; it's the best I have:

https://github.com/llvm/llvm-project/blob/main/lld/COFF/MapFile.cpp

It describes the format as the same as link.exe. Some of this is described here:

https://www.codeproject.com/Articles/3472/Finding-Crash-Information-Using-the-MAP-File

I wish I could help more, because I also need more information.

Greg Nelson On

The output you posted looks like lld's ELF map file format rather than the COFF format mentioned in the other answer. A better link for this would be:

https://github.com/llvm/llvm-project/blob/main/lld/ELF/MapFile.cpp

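The columns here correspond to the "virtual memory address" (VMA, hex), the "load memory address" (LMA, hex), the size of the map file entry in bytes (hex), the alignment in bytes (decimal), the "output section", the "input section", and the name of the symbol(s) in that section. The last three share one field: the amount of indentation tells you whether a line names an output section, an input section, or a symbol.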
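To make the column layout concrete, here is a minimal Python sketch of a line parser. It is not taken from lld; `parse_line`, `Entry` and `KINDS` are names invented for this example. It assumes what the sample above shows: four numeric fields, one space, then 0, 8 or 16 extra spaces before the name, depending on the nesting level. If your mapfile uses different widths, adjust accordingly.

```python
import re
from dataclasses import dataclass

# Four numeric columns, one separator space, then the indentation
# (captured on its own) followed by the name, which may contain spaces.
LINE_RE = re.compile(
    r"^\s*([0-9a-fA-F]+)\s+([0-9a-fA-F]+)\s+([0-9a-fA-F]+)\s+(\d+) ( *)(\S.*)$"
)

# Assumption based on the sample: indentation width selects the column.
KINDS = {0: "out", 8: "in", 16: "symbol"}


@dataclass
class Entry:
    vma: int     # virtual memory address
    lma: int     # load memory address
    size: int    # size in bytes
    align: int   # alignment in bytes
    kind: str    # "out", "in" or "symbol"
    name: str


def parse_line(line):
    """Return an Entry, or None for the header and other unparsable lines."""
    m = LINE_RE.match(line.rstrip("\n"))
    if m is None:
        return None
    vma, lma, size, align, indent, name = m.groups()
    kind = KINDS.get(len(indent))
    if kind is None:
        return None
    # VMA, LMA and Size are hex; Align is read as decimal, per the description above.
    return Entry(int(vma, 16), int(lma, 16), int(size, 16), int(align), kind, name)


if __name__ == "__main__":
    entry = parse_line("      212400           212400     b7b7    16 .rodata")
    print(entry)
```

Capturing the indentation separately, instead of splitting on whitespace, keeps names that contain spaces intact and lets the nesting level fall out of a single match.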

Let's look at your .rodata section which is more interesting than the ones earlier in your listing:

      212400           212400     b7b7    16 .rodata
      212400           212400        8     4         <internal>:(.rodata)
      212410           212410      262    16         build64_release/version.cpp.o:(.rodata)
      212410           212410        4     1                 kSvnInfoCount
      212420           212420      1e5     1                 kSvnInfo

It begins at address 0x212400. The virtual and load addresses are the same here, and I have not yet seen a mapfile for which this is not true, although they can differ when a linker script gives a section a separate load address with AT, as is common on embedded targets. It contains a total of 0xb7b7 (47031 dec) bytes, starting on a 16-byte boundary. The name of the section output by the linker is .rodata. It was built from multiple "input" sections (generated by the compiler, or possibly by separate linking steps).
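As a quick check of those numbers, the following Python sketch converts the hex fields and derives the end address of the output section. The variable names are made up for the example; the values are copied from the listing in the question.

```python
# Values copied from the .rodata line of the sample mapfile.
start = int("212400", 16)   # VMA of the output section
size = int("b7b7", 16)      # Size column, hex

end = start + size          # first address past the section

# The last input section shown in the question should lie inside it.
last_in_start = int("21731c", 16)
last_in_size = int("84", 16)
inside = start <= last_in_start and last_in_start + last_in_size <= end

print(size, hex(end), inside)
```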

The first of these is internal: <internal>:(.rodata) and contains 8 bytes of data aligned to a 4-byte boundary; since the containing section is already 16-byte aligned, this doesn't affect the starting address, and so it begins at the same place as the output section.

The second one came from a .o file called build64_release/version.cpp.o, which had its own .rodata (for example, a set of constant integers, strings, structures, etc.). It contributed 0x262 (610 dec) bytes and requires 16-byte alignment, so even though the previous input section was only 8 bytes long, the linker skipped 8 more bytes and this group starts at 0x212410. Within this group, I've shown the first two symbols (the first named symbols we have seen), which are kSvnInfoCount and kSvnInfo. We see here how big they are, and where they start. Both happen to start on 16-byte boundaries, but that is not a rule: further down your listing, kMainInfo starts at 0x212605, and the Align column is 1 for every symbol shown. Symbol addresses follow the layout the compiler chose inside the input section, so an object that needs a larger alignment (for example 32 bytes) would still skip forward to its own boundary within the section.
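The padding between input sections follows the usual round-up-to-alignment rule. A minimal sketch, assuming alignments are powers of two (`align_up` is a name chosen for this example, not an lld function):

```python
def align_up(addr, align):
    """Round addr up to the next multiple of align (align must be a power of two)."""
    return (addr + align - 1) & ~(align - 1)


# The first input section starts at 0x212400 and is 8 bytes long.
end_of_first = 0x212400 + 0x8

# version.cpp.o's .rodata needs 16-byte alignment, so it starts here:
second_start = align_up(end_of_first, 16)
print(hex(second_start))
```

Running the same calculation over consecutive lines of a mapfile is a simple way to check that a parser reads the Size and Align columns in the right base.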

I hope this is helpful to someone. Trust me, you'd rather parse this format than the GCC map file format, which in an effort to be human-readable can represent the same kinds of information in dozens of different ways, and requires a complex state machine for parsing.