Otherwise, it may cause issues for warm restart and warm reboot. A warm restart of swss starts nat, which is not expected during a warm restart. It was also observed that, during warm-reboot script execution, the nat container gets started again after it was killed. This removes the NAT dump that nat generated previously.

A check `[ -f /host/warmboot/nat/nat_entries.dump ] || echo "NAT dump does not exists"` was added right before kexec:

```
Fri Jul 17 10:47:16 UTC 2020 Prepare MLNX ASIC to fastfast-reboot: install new FW if required
Fri Jul 17 10:47:18 UTC 2020 Pausing orchagent ...
Fri Jul 17 10:47:18 UTC 2020 Stopping nat ...
Fri Jul 17 10:47:18 UTC 2020 Stopped nat ...
Fri Jul 17 10:47:18 UTC 2020 Stopping radv ...
Fri Jul 17 10:47:19 UTC 2020 Stopping bgp ...
Fri Jul 17 10:47:19 UTC 2020 Stopped bgp ...
Fri Jul 17 10:47:21 UTC 2020 Initialize pre-shutdown ...
Fri Jul 17 10:47:21 UTC 2020 Requesting pre-shutdown ...
Fri Jul 17 10:47:22 UTC 2020 Waiting for pre-shutdown ...
Fri Jul 17 10:47:24 UTC 2020 Pre-shutdown succeeded ...
Fri Jul 17 10:47:24 UTC 2020 Backing up database ...
Fri Jul 17 10:47:25 UTC 2020 Stopping teamd ...
Fri Jul 17 10:47:25 UTC 2020 Stopped teamd ...
Fri Jul 17 10:47:25 UTC 2020 Stopping syncd ...
Fri Jul 17 10:47:35 UTC 2020 Stopped syncd ...
Fri Jul 17 10:47:35 UTC 2020 Stopping all remaining containers ...
Warning: Stopping telemetry.service, but it can still be activated by:
  telemetry.timer
Fri Jul 17 10:47:37 UTC 2020 Stopped all remaining containers ...
NAT dump does not exists
Fri Jul 17 10:47:39 UTC 2020 Rebooting with /sbin/kexec -e to SONiC-OS-201911.140-08245093 ...
```

With this change, warm-reboot was executed 10 times without hitting this issue, while without this change the issue reproduces easily on almost every warm-reboot run.

Signed-off-by: Stepan Blyschak <stepanb@mellanox.com>
- Aboot
- apt
- build_scripts
- build_templates
- dhcp
- docker
- image_config
- initramfs-tools
- scripts
- sshd
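
The pre-kexec check described in the commit message can be sketched as a small shell function. This is a minimal illustration, not the code from the warm-reboot script: the function name `check_nat_dump` and the `NAT_DUMP` override variable are hypothetical, and only the dump path and the diagnostic text come from the commit message.

```shell
#!/bin/sh
# Path taken from the commit message; NAT_DUMP is a hypothetical override
# so the check can be pointed at another file when experimenting.
NAT_DUMP="${NAT_DUMP:-/host/warmboot/nat/nat_entries.dump}"

# check_nat_dump FILE (hypothetical helper)
# Returns 0 if FILE is a regular file. Otherwise prints the diagnostic
# from the commit message (wording copied verbatim) and returns 1.
check_nat_dump() {
    if [ -f "$1" ]; then
        return 0
    fi
    echo "NAT dump does not exists"
    return 1
}
```

Calling `check_nat_dump "$NAT_DUMP"` just before the kexec step would print the diagnostic only when the dump has been removed, which is the symptom shown in the log.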