9ff56445c9
During ISSU, "mlxsw_minimal" driver still trying to access firmware, in some cases FW could return some wrong critical threshold value which will cause switch shutdown. **- How I did it** In order to prevent "mlxsw_minimal" driver from accessing ASIC during ISSU, SDK will raise "OFFLINE" 'udev' event at the early beginning of such flow. When this event is received, hw-management will remove "mlxsw_minimal" driver. There is no need to implement the opposite "ONLINE" event since this flow is ended up with "kexec". **- How to verify it** repeatedly perform warm reboot, make sure there is no switch shutdown occurred.
31 lines
1.5 KiB
Diff
31 lines
1.5 KiB
Diff
From 3e511778248403968e0a02857b7003f352669ba3 Mon Sep 17 00:00:00 2001
|
|
From: Vadim Pasternak <vadimp@nvidia.com>
|
|
Date: Wed, 13 Jan 2021 13:19:17 +0200
|
|
Subject: [PATCH] hw-mgmt: events: Add support for SDK OFFLINE event for
|
|
handling flow with in service firmware upgrade
|
|
|
|
In order to prevent "mlxsw_minimal" driver access to ASIC during in
|
|
service firmware upgrade flow, SDK will raise "OFFLINE" 'udev' event
|
|
at early beginning of such flow. When this event is received,
|
|
hw-managemnet will remove "mlxsw_minimal" driver.
|
|
There is no need to implement opposite "ONLINE" event, since this flow
|
|
is ended up with "kexec".
|
|
|
|
Signed-off-by: Vadim Pasternak <vadimp@mellanox.com>
|
|
---
|
|
usr/lib/udev/rules.d/50-hw-management-events.rules | 1 +
|
|
1 file changed, 1 insertion(+)
|
|
|
|
diff --git a/usr/lib/udev/rules.d/50-hw-management-events.rules b/usr/lib/udev/rules.d/50-hw-management-events.rules
|
|
index cf4219e..33ea1bc 100644
|
|
--- a/usr/lib/udev/rules.d/50-hw-management-events.rules
|
|
+++ b/usr/lib/udev/rules.d/50-hw-management-events.rules
|
|
@@ -269,3 +269,4 @@ SUBSYSTEM=="i2c", DEVPATH=="/devices/platform/mlxplat/i2c_mlxcpld*/i2c-*/i2c-*/*
|
|
# SDK
|
|
SUBSYSTEM=="pci", DRIVERS=="sx_core", ACTION=="add", RUN+="/usr/bin/hw-management-thermal-events.sh add sxcore add"
|
|
SUBSYSTEM=="pci", DRIVERS=="sx_core", ACTION=="remove", RUN+="/usr/bin/hw-management-thermal-events.sh rm sxcore remove"
|
|
+SUBSYSTEM=="pci", DRIVERS=="sx_core", ACTION=="offline", RUN+="/usr/bin/hw-management-thermal-events.sh rm sxcore remove"
|
|
--
|
|
1.9.1
|
|
|