crash utility itself crashes while decoding kdump generated from null pointer dereference in kernel module

35 Views Asked by At

I am experimenting with crash utility that is used to decode linux kdump files.

My setup consists of linux kernel 6.5 running on qemu-system-aarch64. The rootfs used is buildroot. I have edited the /etc/init.d/rcS to setup crashkernel and generate kdumps.

#hackish way of achieving things
# Check if /proc/vmcore exists
if [ -e "/proc/vmcore" ]; then
    echo "collecting core"
    # Get the current date and time
    TSTAMP=$(date +"%Y%m%d%H%M%S")

    # Define the filename for the kdump file
    FILENAME="kernel.${TSTAMP}.core.kdump"

    # Run makedumpfile to create the vmcore dump
    makedumpfile --message-level 4 -d 17,31 /proc/vmcore "${FILENAME}"
    reboot
else
    echo "loading crashkernel into memory"
    # /proc/vmcore does not exist, so we run kexec
    kexec -p /Image --append="console=ttyAMA0,115200n8 root=/dev/nfs rw nfsroot=10.105.226.234:/home/naveen/nfsroot/rootfs-buildroot-arm64/,nolock,vers=4,tcp ip=10.105.226.235"
fi

I developed a simplistic kernel module with below code, to generate crash due to null pointer dereference :

static int __init null_deref_module_init(void) {
    // Pointer to an integer, initialized to NULL
    int *null_pointer = NULL;
    printk(KERN_INFO "Null dereference module loaded\n");

    // Dereferencing the NULL pointer to trigger a crash
    printk(KERN_INFO "Triggering null pointer dereference...\n");
    *null_pointer = 1; // This line will cause a null pointer dereference

    return 0; // This will never be reached
}

As soon as I load this kernel module, the qemu guest kernel crashes, as expected. And I have a successful kdump as well. When I go to decode this with kdump utility (tried multiple releases including latest 8.0.4), the utility itself crashes. What could be the possible reasons for the crash?

$ sudo crash ~/.repos/src/arm64/linux/vmlinux kernel.20240330170747.core.kdump
crash 8.0.4
Copyright (C) 2002-2022  Red Hat, Inc.
Copyright (C) 2004, 2005, 2006, 2010  IBM Corporation
Copyright (C) 1999-2006  Hewlett-Packard Co
Copyright (C) 2005, 2006, 2011, 2012  Fujitsu Limited
Copyright (C) 2006, 2007  VA Linux Systems Japan K.K.
Copyright (C) 2005, 2011, 2020-2022  NEC Corporation
Copyright (C) 1999, 2002, 2007  Silicon Graphics, Inc.
Copyright (C) 1999, 2000, 2001, 2002  Mission Critical Linux, Inc.
Copyright (C) 2015, 2021  VMware, Inc.
This program is free software, covered by the GNU General Public License,
and you are welcome to change it and/or distribute copies of it under
certain conditions.  Enter "help copying" to see the conditions.
This program has absolutely no warranty.  Enter "help warranty" for details.

GNU gdb (GDB) 10.2
Copyright (C) 2021 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Type "show copying" and "show warranty" for details.
This GDB was configured as "--host=x86_64-pc-linux-gnu --target=aarch64-elf-linux".
Type "show configuration" for configuration details.
Find the GDB manual and other documentation resources online at:
    <http://www.gnu.org/software/gdb/documentation/>.

For help, type "help".
Type "apropos word" to search for commands related to "word"...

please wait... (determining panic task)Segmentation fault

Had I built this module as built-in, I get proper crash message (of course no kdump),

[   63.406244] pc : null_deref_module_init+0x30/0x1000 [npdereference]
[   63.407287] lr : null_deref_module_init+0x24/0x1000 [npdereference]

I already logged this as an issue with maintainers, but I can't wait. Even though it should not have crashed, but I think I can make crash-utility happy by giving what it is expecting (yet to find what). Could it be related to the symbols of kernel module? Can someone please provide hints to get through it.

EDIT

The sp is coming null in the crash utility, which causes the segmentation fault.

Thread 1 "crash" received signal SIGSEGV, Segmentation fault.
value_search_module_6_4 (value=18446603338276298752, offset=0x7ffffffface0) at symbols.c:5564
5564                if (value < sp->value)
(gdb) bt
#0  value_search_module_6_4 (value=18446603338276298752, offset=0x7ffffffface0) at symbols.c:5564
#1  0x0000555555812bd0 in value_to_symstr (value=18446603338276298752,
    buf=buf@entry=0x7fffffffb9c0 "", radix=10, radix@entry=0) at symbols.c:5872
#2  0x00005555557694a2 in display_memory (addr=<optimized out>, count=2048, flag=208,
    memtype=memtype@entry=1, opt=opt@entry=0x0) at memory.c:1740
#3  0x0000555555769e1f in raw_stack_dump (stackbase=<optimized out>, size=<optimized out>)
    at memory.c:2194
#4  0x00005555557923ff in get_active_set_panic_task () at task.c:8639
#5  0x00005555557930d2 in get_dumpfile_panic_task () at task.c:7628
#6  0x00005555557a89d3 in panic_search () at task.c:7380
#7  get_panic_context () at task.c:6267
#8  task_init () at task.c:687
#9  0x00005555557305b3 in main_loop () at main.c:787
#10 0x0000555555a64331 in captured_main (data=<optimized out>) at main.c:1284
#11 gdb_main (args=<optimized out>) at main.c:1313
#12 0x0000555555a643b0 in gdb_main_entry (argc=<optimized out>, argv=argv@entry=0x7fffffffe508)
    at main.c:1338
#13 0x00005555557d1ece in gdb_main_loop (argc=<optimized out>, argc@entry=3,
    argv=argv@entry=0x7fffffffe508) at gdb_interface.c:81
#14 0x0000555555728dfc in main (argc=3, argv=0x7fffffffe508) at main.c:720

My kernel module has the debug symbols already though, and I am simply copying the .ko file to my system and doing insmod :

naveen@workstation:~/.repos/src/arm64/linux$ file drivers/naveen/npdereference.ko
drivers/naveen/npdereference.ko: ELF 64-bit LSB relocatable, ARM aarch64, version 1 (SYSV), BuildID[sha1]=118e35b0267440ef364c551c5890ff934392fb6c, with debug_info, not stripped
naveen@workstation:~/.repos/src/arm64/linux$
0

There are 0 best solutions below