'nm' reports different sizes for variables of the same type. How do I find out their real size?


I was trying to find out the addresses and sizes of variables in my program using nm, and I just realized that a whole bunch of my variables are unexpectedly large. I made the following test file, "test.c":

static char test1 = 0;
static char test2 = 0;

char test_f(void)
{
    test1 = test2;
    return test2;
}

int main(void)
{
    return test_f();
}

Then I ran the following commands:

gcc test.c
nm -C -S --size-sort a.exe | findstr /rc:"test"

And the output is

0000000140007040 0000000000000001 b test1
0000000140007041 000000000000000f b test2
0000000140001540 000000000000001a T test_f

I assume some sort of padding / alignment is at play here, but I don't understand why the padding became part of a symbol. Is there a way to produce a similar text log in which test1 and test2 would both have the size of 1?


Eric Postpischil On

… I don't understand why the padding became part of a symbol.

The nm man page says “The size is computed as the difference between the value of the symbol and the value of the symbol with the next higher value.” Therefore, if there is padding after variable A and before variable B, the padding will appear as part of the size of A.

In your example, test1 was apparently immediately followed by test2, so the size of test1 was computed as one byte. test2 was not followed by any explicit symbol in its program section; the next “symbol” that nm used may have been the beginning of the next section or the first symbol in it. That next section has some alignment requirement, so there is unused space, also called padding, after test2 and before the next section. So the difference between test2 and the next “symbol” includes that padding, and it shows up in the “size” of test2 as the man page states.

teapot418 On

I don't understand why the padding became part of a symbol

According to the man page: https://man7.org/linux/man-pages/man1/nm.1.html

The ELF format records a size for each symbol. For other formats (like PE, which EXE files use), nm can only compute the size as the interval from the start of this symbol to the start of the next one.

Is there a way to produce a similar text log in which test1 and test2 would both have the size of 1?

Do the same steps on Linux, which uses ELF binaries:

0000000000004011 0000000000000001 b test1
0000000000004012 0000000000000001 b test2
Dmitry Grigoryev On

In the end, there seems to be no way to get symbol sizes from a PE file without debug info. With debug info (compile with gcc -g), you can dump it with objdump -W a.exe, follow each variable's DW_AT_type reference to its type entry, and read that type's size from DW_AT_byte_size:

 <1><28ce>: Abbrev Number: 2 (DW_TAG_variable)
    <28cf>   DW_AT_name        : test1
    <28d5>   DW_AT_decl_file   : 1
    <28d6>   DW_AT_decl_line   : 3
    <28d7>   DW_AT_decl_column : 13
    <28d8>   DW_AT_type        : <0x28e6>
    <28dc>   DW_AT_location    : 9 byte block: 3 40 70 0 40 1 0 0 0     (DW_OP_addr: 140007040)
 <1><28e6>: Abbrev Number: 3 (DW_TAG_base_type)
    <28e7>   DW_AT_byte_size   : 1
    <28e8>   DW_AT_encoding    : 6  (signed char)
    <28e9>   DW_AT_name        : char
 <1><28ee>: Abbrev Number: 2 (DW_TAG_variable)
    <28ef>   DW_AT_name        : test2
    <28f5>   DW_AT_decl_file   : 1
    <28f6>   DW_AT_decl_line   : 4
    <28f7>   DW_AT_decl_column : 13
    <28f8>   DW_AT_type        : <0x28e6>
    <28fc>   DW_AT_location    : 9 byte block: 3 41 70 0 40 1 0 0 0     (DW_OP_addr: 140007041)