Project
I have a game project in C++ that I'm currently developing.
I compile every source file with -g3 -std=c++2a -Wall ... -fsanitize=address -fsanitize=leak to check for leaks and segfaults.
The main problem
The problem is that randomly (roughly 1 in 5 runs) asan (address or leak) terminates the program before reaching main with a SIGSEGV, without any useful diagnostics:
AddressSanitizer:DEADLYSIGNAL
=================================================================
==28573==ERROR: AddressSanitizer: SEGV on unknown address 0x625505a4ce68 (pc 0x7cc52585f38f bp 0x000000000000 sp 0x7fff63949020 T0)
==28573==The signal is caused by a READ memory access.
AddressSanitizer:DEADLYSIGNAL
AddressSanitizer: nested bug in the same thread, aborting.
The address the SEGV happens on is always different, as is the pc (except for the last 3 hex digits, e68 and 38f respectively).
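One possible reading of that pattern, assuming the usual 4 KiB page size on x86-64: address randomization moves mappings by whole pages, so the low 12 bits of an address (its last 3 hex digits) stay the same from run to run. The sketch below extracts those bits from the two values in the report above. The names page_offset, kPageMask and print_offsets are made up for illustration.

```cpp
#include <cassert>
#include <cstdint>
#include <cstdio>

// Assumption: 4 KiB pages, so the low 12 bits are the offset within a page.
constexpr std::uint64_t kPageMask = 0xfff;

// Hypothetical helper: returns the part of an address that page-granular
// randomization leaves unchanged.
constexpr std::uint64_t page_offset(std::uint64_t address) {
    return address & kPageMask;
}

// Values copied from the AddressSanitizer report above.
constexpr std::uint64_t kFaultAddress = 0x625505a4ce68ULL;
constexpr std::uint64_t kFaultPc = 0x7cc52585f38fULL;

// Prints the page offsets of the faulting address and of the pc.
void print_offsets() {
    // Sanity check against the digits quoted in the question.
    assert(page_offset(kFaultAddress) == 0xe68);
    assert(page_offset(kFaultPc) == 0x38f);
    std::printf("fault address offset: 0x%03llx\n",
                static_cast<unsigned long long>(page_offset(kFaultAddress)));
    std::printf("pc offset:            0x%03llx\n",
                static_cast<unsigned long long>(page_offset(kFaultPc)));
}
```

Calling print_offsets() from main prints both offsets. The same mask is what the gdb condition ((size_t)$pc & 0xfff) == 0x38f relies on.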
The system it runs on
My machine runs Arch Linux 6.7.0-arch3-1, and I'm using g++ (GCC) 13.2.1 20230801 and GNU gdb (GDB) 13.2, which are the latest versions in the repositories at the time of writing.
What I've tried
I have no idea how to hunt down this bug, nor what might be causing it.
In code
I am sure the problem happens before main, since printing something (with cout or printf) has no effect. The same goes for installing a signal handler with signal(SIGSEGV, &handle).
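As a general illustration of why code in main may never get a chance to run: with GCC and Clang, constructors of namespace-scope objects run before main is entered, and the dynamic loader and runtime libraries start up even earlier than that. The sketch below only shows that ordering. It does not show where the crash in this post occurs, and the names startup_log, EarlyInit and ran_before_main are hypothetical.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical log, wrapped in a function so it is constructed on first use
// and is safe to touch from other initializers.
std::vector<std::string>& startup_log() {
    static std::vector<std::string> log;
    return log;
}

// Hypothetical type whose constructor records that it ran.
struct EarlyInit {
    EarlyInit() { startup_log().push_back("EarlyInit constructor"); }
};

// Namespace-scope object: in practice its constructor runs before main.
EarlyInit early_init;

// Returns true if something was already recorded when this is called.
bool ran_before_main() {
    return !startup_log().empty();
}
```

Calling ran_before_main() as the first statement of main returns true, because the constructor has already run. A signal handler installed inside main is therefore too late for anything that fails during startup.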
asan is part of it
Without asan the SEGV does not happen. (I have tried about 50 times and the program started correctly every time.)
gdb
Running the program compiled with asan under gdb, with gdb's ASLR disabling turned off (set disable-randomization off, so addresses stay randomized), reproduces the SIGSEGV, and gdb catches it automatically.
assembly instruction of the problem
Given the strange pattern of addresses that the problem happens on, I tried using a watchpoint on any $pc ending with 38f (watch ((size_t)$pc & 0xfff) == 0x38f).
The watchpoint works. The address in question is inside a libc function (do_lookup_x or similar) that is seemingly called thousands of times before main begins, which makes debugging this way practically a nightmare.
The question
I would like to ask if anybody has any idea how to get more information out of asan, gdb, or any other tool, because at the moment I do not have enough information to know where the problem happens, or even whether the problem is in my code or not.
Updates
@marekR and @eljay suggested some kind of symbol collision with glibc function names. Most of my definitions are enclosed in a namespace (and are therefore name-mangled), and the only functions generic enough to collide with some other name are init(), loop(), and terminate(). Changing their names did not solve the issue.
Following @ÖöTiib's suggestion, I tested my git history with git bisect. The problem has been present since the first commit, back in 2019. This means that either it has gone unnoticed all this time (I'm the only one working on this project, but that seems unlikely), or it is caused by a combination of factors local to my machine, or by something else.
Thanks to @EmployedRussian I was able to track down the origin of the bug. Since that was the point of this question, I am closing this post.
I will try to solve the bug myself and, if I am not able to, open another question or a report on the asan bug tracker.
In any case, thank you for helping me.
For anyone interested: compiling the binary with -fsanitize=address and running it under gdb with set disable-randomization off can cause the SIGSEGV, and gdb should catch it automatically. I'd consider this question closed.