Set loglevel for crash kernel to reduce verbosity and improve overall router recovery time (#18285)

Why I did it
On certain routers with baud rate 9600, crash kernel is taking a long time , close to ~5mins, to complete kernel dump and reload the box. On contrast to routers with baud rate 115200, crash kernel dump process is observed to be completed under 35s-60s (depending on the platform). Currently, all debug and informational messages are printed on the console which also factors in for the delay seen. Unless the router is monitored on console in real time, these messages are not very useful. Setting the loglevel to warning will help reduce the verbosity of logs on console, in turn allow crash kernel dump process to be completed in a reasonable time which will also help in overall router recovery time.

How I did it
Setting loglevel attribute in crashkernel cmdline

How to verify it
Install SONiC image with crashkernel cmdline with loglevel set to warning and initiate an induced a crash (sysrq-trigger)
crashkernel boot and dump process will be completed in 20s-30s depending on the platform
This commit is contained in:
amulyan7 2024-03-12 18:36:51 -07:00 committed by GitHub
parent b05f488ba6
commit 80448380e6
No known key found for this signature in database
GPG Key ID: B5690EEEBB952194

View File

@ -7,10 +7,11 @@ KDUMP_CMDLINE_APPEND="irqpoll nr_cpus=1 nousb systemd.unit=kdump-tools.service a
# Reboot crash kernel on panic
# Enable debug level logging of crash kernel for better visibility
# Set loglevel to reduce verbosity and print only warning conditions
# Disable advanced pcie features
# Disable high precision event timer as on some platforms it is interfering with the kdump operation
# Pass platform identifier string as part of crash kernel command line to be used by the reboot script during kdump
KDUMP_CMDLINE_APPEND="${KDUMP_CMDLINE_APPEND} panic=10 debug hpet=disable pcie_port=compat pci=nommconf sonic_platform=__PLATFORM__"
KDUMP_CMDLINE_APPEND="${KDUMP_CMDLINE_APPEND} panic=10 debug loglevel=4 hpet=disable pcie_port=compat pci=nommconf sonic_platform=__PLATFORM__"
# Use SONiC reboot wrapper script present in /usr/local/bin post kdump
PATH=/usr/local/bin:$PATH