[kdump] Fix OOM events in crashkernel (#6447)

A few issues were discovered with the crashkernel on Arista platforms.

1) Platforms using `docker_inram=on` would run out of memory (OOM) in the kdump environment.
This happens because the same initramfs is used by SONiC and the crashkernel.
With `docker_inram=on`, `dockerfs.tar.gz` is extracted into a `tmpfs` created for that purpose.
Since `dockerfs.tar.gz` weighs more than 1.5G, it doesn't fit into the kdump environment and triggers an OOM.
This OOM event can in turn trigger a panic.

2) Arista platforms with `secureboot` enabled would fail to load the crashkernel because the kernel parameter would be discarded on boot.
This happens because `boot0` is strict about kernel parameter injection in secureboot mode.

3) The secureboot path allowlist would remove kernel crash reports.

4) The kdump service would fail on Arista products since `/boot/` is empty in `secureboot` mode.

**- How I did it**

1) To prevent an OOM event in the crashkernel, the fix avoids the codepaths in `union-mount` that create a tmpfs and populate it. A few more codepaths specific to Arista devices are also skipped to make the kdump process faster.
This relies on detecting that the initramfs is starting in a kdump environment and skipping some initialization.
The `/usr/sbin/kdump-config` tool appends a few kernel cmdline arguments when loading the crashkernel.
The most distinctive one is `systemd.unit=kdump-tools.service`, which a few initramfs hooks match to detect kdump (either setting `in_kdump` or exiting early).
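
The detection described above can be sketched as follows. This is a minimal illustration rather than the exact hook code: the `detect_kdump` helper and the sample command lines are assumptions made for the example; only the `systemd.unit=kdump-tools.service` argument and the `in_kdump` variable come from this change.

```shell
#!/bin/sh
# Sketch: decide whether we are booting inside the kdump crashkernel.
# detect_kdump is a hypothetical helper; it takes the kernel cmdline as a
# string so it can be exercised without reading /proc/cmdline.
detect_kdump() {
    in_kdump=false
    # Unquoted on purpose: word splitting yields one kernel argument per loop.
    for x in $1; do
        case "$x" in
            systemd.unit=kdump-tools.service)
                in_kdump=true
                ;;
        esac
    done
    echo "$in_kdump"
}

# In an initramfs hook this would be: detect_kdump "$(cat /proc/cmdline)"
# The two command lines below are made-up samples.
detect_kdump "docker_inram=on systemd.unit=kdump-tools.service"
detect_kdump "docker_inram=on"
```

A hook can then guard the expensive steps (tmpfs creation, `dockerfs.tar.gz` extraction) with a check on `in_kdump`.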

2) To allow `kdump` to work in a `secureboot` environment, the cmdline generation in `boot0` was slightly modified.
The codepath that loads kernel parameters changed by SONiC now also runs when booting in secure mode.
It was altered to prevent an append-only behavior that would grow `kernel-cmdline` at every reboot.
This ever-growing cmdline would eventually cause `kexec` to fail to load the kernel because the cmdline is too long.
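
To illustrate why keeping only the last match per allowlisted field stops the cmdline from growing, here is a minimal sketch. The `filter_cmdline` helper and the sample values are hypothetical; the allowlist value matches the one set in this change. Note that the sketch anchors the pattern to the argument name (`^field=`), which is stricter than the unanchored `grep -E "$field"` used in `boot0`.

```shell
#!/bin/sh
# Sketch: keep at most one value per allowlisted field from a saved cmdline.
# filter_cmdline is a hypothetical stand-in for the loop in write_cmdline.
cmdline_allowlist="crashkernel hwaddr_ma1"

filter_cmdline() {
    for field in $cmdline_allowlist; do
        # One argument per line, then keep only the last match for this field.
        printf '%s\n' "$1" | tr ' ' '\n' | grep -E "^$field=" | tail -n 1
    done | tr '\n' ' ' | sed 's/ $//'
}

# A made-up saved cmdline that has already accumulated a duplicate.
saved="quiet crashkernel=256M foo=bar crashkernel=512M hwaddr_ma1=00:11:22:33:44:55"
filter_cmdline "$saved"
```

Running the filter on its own output returns the same arguments, so repeated reboots no longer lengthen `kernel-cmdline`.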

3) To keep kernel crash reports under `/var/crash`, this path has to be added to `allowlist_paths`.
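
As an illustration of the effect of the new entry, the sketch below tests relative paths against a list of patterns. It assumes the entries are extended regular expressions that must match the whole path; the `is_allowed` helper and the sample paths are hypothetical, and the real filtering in `union-mount` may be implemented differently.

```shell
#!/bin/sh
# Sketch: check whether a relative path is covered by allowlist patterns.
# The patterns are a subset of the allowlist, including the new var/crash entry.
allowlist_patterns='home/.*
var/core/.*
var/crash/.*
var/log/.*'

is_allowed() {
    # -x: the whole path must match; -E: extended regex.
    # grep treats each line of the -e argument as a separate pattern.
    printf '%s\n' "$1" | grep -qxE -e "$allowlist_patterns"
}

# Made-up sample paths.
for path in var/crash/example/dump etc/passwd; do
    if is_allowed "$path"; then
        echo "keep   $path"
    else
        echo "remove $path"
    fi
done
```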

4) The `/host/image-XXX/boot` folder is now populated in `secureboot` mode. It is not used by the boot process and is only extracted for kdump.

**- How to verify it**

Regular boot:
 - enable kdump
 - enable docker_inram=on via kernel-params
 - reboot
 - generate a crash `echo c > /proc/sysrq-trigger`
 - before: witness OOM events on the console
 - after: crash kernel works and crash available under /var/crash

Secure boot:
 - enable kdump
 - reboot
 - generate a crash `echo c > /proc/sysrq-trigger`
 - before: witness no kdump
 - after: crash kernel works and crash available under /var/crash
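
The "after" checks above could be scripted roughly as follows. Both helpers are hypothetical, and the sketch assumes crash reports are stored in per-crash subdirectories of the crash directory, which may not hold for every kdump configuration.

```shell
#!/bin/sh
# Sketch: post-reboot checks for the verification steps. The helpers take
# their inputs as arguments so they can be tried on sample data first.

# Succeeds if the given cmdline file carries a crashkernel= argument.
has_crashkernel() {
    tr ' ' '\n' < "$1" | grep -q '^crashkernel='
}

# Succeeds if the given crash directory holds at least one subdirectory.
has_crash_report() {
    [ -d "$1" ] && find "$1" -mindepth 1 -maxdepth 1 -type d | grep -q .
}

cmdline_file="${1:-/proc/cmdline}"
crash_dir="${2:-/var/crash}"

if has_crashkernel "$cmdline_file"; then
    echo "crashkernel present on cmdline"
else
    echo "crashkernel missing from cmdline"
fi

if has_crash_report "$crash_dir"; then
    echo "crash report present in $crash_dir"
else
    echo "no crash report in $crash_dir"
fi
```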


Co-authored-by: Boyang Yu <byu@arista.com>
Samuel Angebault 2021-02-02 01:55:09 -08:00 committed by GitHub
parent ee18483c0f
commit 0c4d4ace76
5 changed files with 63 additions and 37 deletions

@ -86,7 +86,7 @@ installer_image_path="$image_path/$installer_image"
boot_config="$target_path/boot-config"
cmdline_allowlist='crashkernel'
cmdline_allowlist="crashkernel hwaddr_ma1"
# for backward compatibility with the sonic_upgrade= behavior
install="${install:-${sonic_upgrade:-}}"
@ -392,7 +392,8 @@ extract_image() {
extract_image_secureboot() {
info "Extracting necessary swi content"
unzip -oq "$swipath" platform/firsttime .imagehash -d "$image_path"
# NOTE: boot/ is not used by the boot process but only extracted for kdump
unzip -oq "$swipath" 'boot/*' platform/firsttime .imagehash -d "$image_path"
info "Installing image as $installer_image_path"
mv "$swipath" "$installer_image_path"
@ -650,6 +651,27 @@ write_default_cmdline() {
cmdline_add "$delimiter"
}
write_cmdline() {
# use extra parameters from kernel-params hook if the file exists
if [ -f "$target_path/$kernel_params" ] && ! $secureboot; then
info "Loading extra kernel parameters from $kernel_params"
cat "$target_path/$kernel_params" | cmdline_append
fi
# FIXME: sonic sometimes adds extra kernel parameters from user space
# this is unsafe but some will be kept as part of the regular boot
if [ -f "$image_path/kernel-cmdline" ]; then
for field in $cmdline_allowlist; do
cat "$image_path/kernel-cmdline" | tr ' ' '\n' | grep -E "$field" | tail -n 1 | cmdline_append
done
fi
# FIXME: legacy configuration files used by fast-reboot and eos2sonic
# these should be deprecated over time.
cmdline_echo > "$image_path/kernel-cmdline"
cmdline_echo | sed 's/ cmdline-aboot-end.*$//' > "$target_path/kernel-params-base"
}
write_common_configs() {
write_default_cmdline
write_platform_specific_cmdline
@ -667,28 +689,14 @@ write_secureboot_configs() {
cmdline_add aboot.secureboot=enabled
# setting panic= has the side effect of disabling the initrd shell on error
cmdline_add panic=0
write_cmdline
}
write_regular_configs() {
write_common_configs
cmdline_add "loop=$image_name/fs.squashfs"
cmdline_add loopfstype=squashfs
# use extra parameters from kernel-params hook if the file exists
if [ -f "$target_path/$kernel_params" ]; then
cat "$target_path/$kernel_params" | cmdline_append
fi
# FIXME: sonic sometimes adds extra kernel parameters from user space
# this is unsafe but some will be kept as part of the regular boot
if [ -f "$image_path/kernel-cmdline" ]; then
cat "$image_path/kernel-cmdline" | tr ' ' '\n' | grep -E "$cmdline_allowlist" | cmdline_append
fi
# FIXME: legacy configuration files used by fast-reboot and eos2sonic
# these should be deprecated over time.
cmdline_echo > "$image_path/kernel-cmdline"
cmdline_echo | sed 's/ cmdline-aboot-end.*$//' > "$target_path/kernel-params-base"
write_cmdline
}
run_kexec() {
@ -753,8 +761,8 @@ secureboot_boot() {
regular_boot() {
# boot uses the image installed on the flash
run_hooks pre-kexec
write_regular_configs "$image_path"
run_hooks pre-kexec
update_next_boot
run_kexec
}

@ -1,5 +1,6 @@
home/.*
var/core/.*
var/crash/.*
var/log/.*
etc/adjtime
etc/default/ntp

@ -19,7 +19,8 @@ block_flash=''
aboot_flag=''
backup_file=''
prev_os=''
sonic_fast_reboot=''
sonic_fast_reboot=false
in_kdump=false
# Wait until get the fullpath of flash device, e.g., /dev/sda
wait_get_flash_dev() {
@ -141,14 +142,20 @@ for x in "$@"; do
SONIC_BOOT_TYPE=warm*|SONIC_BOOT_TYPE=fast*)
sonic_fast_reboot=true
;;
systemd.unit=kdump-tools.service)
in_kdump=true
;;
esac
done
# Check aboot
[ -z "$aboot_flag" ] && exit 0
# Check kdump
[ "$in_kdump" = true ] && exit 0
# Skip this script for warm-reboot/fast-reboot from sonic
[ "$sonic_fast_reboot" == true ] && [ "$prev_os" != eos ] && exit 0
[ "$sonic_fast_reboot" = true ] && [ "$prev_os" != eos ] && exit 0
# Get flash dev name
if [ -z "$block_flash" ]; then

@ -45,6 +45,10 @@ for x in "$@"; do
# Skip this script for warm-reboot and fast-reboot
exit 0
;;
systemd.unit=kdump-tools.service)
# In kdump environment, skip hooks
exit 0
;;
esac
done

@ -15,6 +15,7 @@ docker_inram=false
logs_inram=false
secureboot=false
bootloader=generic
in_kdump=false
# Extract kernel parameters
for x in $(cat /proc/cmdline); do
@ -35,6 +36,9 @@ for x in $(cat /proc/cmdline); do
platform=*)
platform_flag="${x#platform=}"
;;
systemd.unit=kdump-tools.service)
in_kdump=true
;;
esac
done
@ -86,7 +90,7 @@ mkdir -p "$rw_dir"
mkdir -p "$work_dir"
## Remove the files not in allowlist in the rw folder
if $secureboot; then
if [ "$secureboot" = true ] && [ "$in_kdump" = false ]; then
if [ "$bootloader" = "aboot" ]; then
swi_path="${rootmnt}/host/$(sed -E 's/.*loop=([^ ]+).*/\1/' /proc/cmdline)"
unzip -q "$swi_path" allowlist_paths.conf -d /tmp
@ -120,23 +124,25 @@ case "${ROOT}" in
esac
mkdir -p ${rootmnt}/var/lib/docker
if $secureboot; then
mount -t tmpfs -o rw,nodev,size={{ DOCKER_RAMFS_SIZE }} tmpfs ${rootmnt}/var/lib/docker
if [ "$bootloader" = "aboot" ]; then
unzip -qp "$swi_path" dockerfs.tar.gz | tar xz --numeric-owner -C ${rootmnt}/var/lib/docker
## Boot folder is not extracted during secureboot since content would inherently become unsafe
mkdir -p ${rootmnt}/host/$image_dir/boot
if [ "$in_kdump" = false ]; then
if [ "$secureboot" = true ]; then
mount -t tmpfs -o rw,nodev,size={{ DOCKER_RAMFS_SIZE }} tmpfs ${rootmnt}/var/lib/docker
if [ "$bootloader" = "aboot" ]; then
unzip -qp "$swi_path" dockerfs.tar.gz | tar xz --numeric-owner -C ${rootmnt}/var/lib/docker
## Boot folder is not extracted during secureboot since content would inherently become unsafe
mkdir -p ${rootmnt}/host/$image_dir/boot
else
echo "secureboot unsupported for bootloader $bootloader" 1>&2
exit 1
fi
elif [ -f ${rootmnt}/host/$image_dir/{{ FILESYSTEM_DOCKERFS }} ]; then
## mount tmpfs and extract docker into it
mount -t tmpfs -o rw,nodev,size={{ DOCKER_RAMFS_SIZE }} tmpfs ${rootmnt}/var/lib/docker
tar xz --numeric-owner -f ${rootmnt}/host/$image_dir/{{ FILESYSTEM_DOCKERFS }} -C ${rootmnt}/var/lib/docker
else
echo "secureboot unsupported for bootloader $bootloader" 1>&2
exit 1
## Mount the working directory of docker engine in the raw partition, bypass the overlay
mount --bind ${rootmnt}/host/$image_dir/{{ DOCKERFS_DIR }} ${rootmnt}/var/lib/docker
fi
elif [ -f ${rootmnt}/host/$image_dir/{{ FILESYSTEM_DOCKERFS }} ]; then
## mount tmpfs and extract docker into it
mount -t tmpfs -o rw,nodev,size={{ DOCKER_RAMFS_SIZE }} tmpfs ${rootmnt}/var/lib/docker
tar xz --numeric-owner -f ${rootmnt}/host/$image_dir/{{ FILESYSTEM_DOCKERFS }} -C ${rootmnt}/var/lib/docker
else
## Mount the working directory of docker engine in the raw partition, bypass the overlay
mount --bind ${rootmnt}/host/$image_dir/{{ DOCKERFS_DIR }} ${rootmnt}/var/lib/docker
fi
## Mount the boot directory in the raw partition, bypass the overlay