Why isn't the argument of brk(void *end_data_segment) rounded up to the next page boundary?

From The Linux Programming Interface:

int brk(void * end_data_segment );

The brk() system call sets the program break to the location specified by end_data_segment. Since virtual memory is allocated in units of pages, end_data_segment is effectively rounded up to the next page boundary.

So for this demo:

#include <stdio.h>
#include <unistd.h>

int
main(void)
{
    long int page_size = sysconf(_SC_PAGESIZE);
    printf("My page size: %ld\n", page_size);
    void* c1 = sbrk(0);                 /* current program break */
    printf("program break address: %p\n", c1);
    printf("sizeof char: %zu\n", sizeof(char));
    c1 = (void*) ((char*) c1 + 1);      /* one byte past the current break */
    printf("c1: %p\n", c1);
    if (brk(c1) == -1) {                /* brk() returns -1 on failure */
        perror("brk");
        return 1;
    }
    void* c2 = sbrk(0);                 /* read the break back */
    printf("program break address: %p\n", c2);
    return 0;
}

output:

My page size: 4096
program break address: 0x55b0bc104000
sizeof char: 1
c1: 0x55b0bc104001
program break address: 0x55b0bc104001

I expected the new program break address to be 0x55b0bc104000 + 0x1000 (4096 in hex) == 0x55b0bc105000.

Why did I get 0x55b0bc104001 instead of 0x55b0bc105000?
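
The expectation above is plain round-up-to-page arithmetic. A minimal sketch of it, assuming a power-of-two page size; round_up_to_page is a hypothetical helper written for illustration, not a library function:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: round addr up to the next multiple of page_size.
   Assumes page_size is a power of two (4096 in the output above).
   An address that is already page-aligned is returned unchanged. */
uint64_t round_up_to_page(uint64_t addr, uint64_t page_size)
{
    assert(page_size != 0 && (page_size & (page_size - 1)) == 0);
    return (addr + page_size - 1) & ~(page_size - 1);
}
```

With the values from the output, round_up_to_page(0x55b0bc104001, 4096) gives 0x55b0bc105000, which is the address the question expects sbrk(0) to report.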

Answer by Brendan:

Think of it as two possibilities:

  • align end_data_segment to a page boundary, so that the size of the underlying area of the virtual address space exactly matches the end_data_segment value

  • don't align end_data_segment to a page boundary, and instead round the size of the underlying area of the virtual address space up to the page size

With the first possibility, portable software (which has no idea what the page size will be) could, for example, increase end_data_segment by 1/8th of a page eight separate times. Instead of ending up with one extra page (the result you'd naturally expect), it would end up with 8 extra pages: 7 more than it wanted and 7 more than it expected. Worse, software could reduce end_data_segment by less than a page and nothing would happen, because the value would be rounded back up to the original; repeated many times, this leaves a large area that the software tried to release but that still exists. The two can also be combined: software could increase end_data_segment by 1 byte and then reduce it by 1 byte inside a loop, causing an unexpected memory (address space) leak that might quickly gobble up all available virtual address space when people expected it to waste nothing. Software could explicitly work around all of these problems by adding non-standard, unportable fix-ups everywhere, but that would be horribly ugly.
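
The scenarios above can be sketched with a toy model that only tracks the break value and makes no system calls. Every name here (set_break_aligned, set_break_exact, grow_in_eighths, wiggle) is hypothetical and exists only for illustration; the model assumes a power-of-two page size and a page-aligned starting break:

```c
#include <assert.h>
#include <stdint.h>

/* Round addr up to the next multiple of page (page must be a power of two). */
uintptr_t round_up(uintptr_t addr, uintptr_t page)
{
    assert(page != 0 && (page & (page - 1)) == 0);
    return (addr + page - 1) & ~(page - 1);
}

/* First possibility (hypothetical): the break itself is rounded up on every call. */
uintptr_t set_break_aligned(uintptr_t requested, uintptr_t page)
{
    return round_up(requested, page);
}

/* Second possibility: the break is stored exactly as requested. */
uintptr_t set_break_exact(uintptr_t requested, uintptr_t page)
{
    (void)page;  /* only the backing mapping would be rounded, not the break */
    return requested;
}

typedef uintptr_t (*policy_fn)(uintptr_t requested, uintptr_t page);

/* Grow the break by page/8 eight times; return how far the break moved. */
uintptr_t grow_in_eighths(policy_fn set_break, uintptr_t start, uintptr_t page)
{
    uintptr_t brk_now = start;
    for (int i = 0; i < 8; i++)
        brk_now = set_break(brk_now + page / 8, page);
    return brk_now - start;
}

/* Grow by 1 byte, then shrink by 1 byte, `rounds` times; return the net movement. */
uintptr_t wiggle(policy_fn set_break, uintptr_t start, uintptr_t page, int rounds)
{
    uintptr_t brk_now = start;
    for (int i = 0; i < rounds; i++) {
        brk_now = set_break(brk_now + 1, page);
        brk_now = set_break(brk_now - 1, page);
    }
    return brk_now - start;
}
```

In this model, with a 4096-byte page, grow_in_eighths moves the break by 8 pages under set_break_aligned but by exactly one page under set_break_exact, and wiggle leaks one page per round under set_break_aligned while leaving the break unchanged under set_break_exact.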

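The second possibility (don't align end_data_segment) is more intuitive, more convenient, and less error-prone. It is also what your output shows: the break is reported exactly as you set it (0x55b0bc104001), while the memory behind it is still mapped in whole pages.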
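
As a rough check on a real system, the eighth-of-a-page scenario can be tried with sbrk() directly. This is only a sketch under assumptions: a Linux/glibc-style sbrk(), a page size divisible by 8, and nothing else (such as malloc() or a first printf()) moving the break while the function runs. grow_break_in_eighths is a hypothetical name:

```c
#include <assert.h>
#include <unistd.h>

/* Grow the program break by page_size/8 eight times, then put it back.
   Returns how many bytes the break moved, or -1 if a call failed. */
long grow_break_in_eighths(void)
{
    long page_size = sysconf(_SC_PAGESIZE);
    if (page_size <= 0)
        return -1;
    assert(page_size % 8 == 0);     /* assumption: eight equal steps make one page */

    void *start = sbrk(0);          /* current program break */
    if (start == (void *)-1)
        return -1;

    for (int i = 0; i < 8; i++) {
        if (sbrk(page_size / 8) == (void *)-1)
            return -1;
    }

    void *end = sbrk(0);            /* break after the eight small steps */
    long moved = (long)((char *)end - (char *)start);
    sbrk(-moved);                   /* restore the original break */
    return moved;
}
```

If the break is stored unaligned, as in the output from the question, the returned value should equal the page size rather than eight times it.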