Commit Graph

1907 Commits

Author SHA1 Message Date
Junchao-Mellanox
f73d322081
Fix issue: watchdogutil command does not work (#16242)
Conflicts:
	platform/mellanox/mlnx-platform-api/sonic_platform/watchdog.py
	platform/mellanox/mlnx-platform-api/tests/test_watchdog.py
2023-08-28 23:58:15 +08:00
Vadym Hlushko
adb43ff1f4
[mlxtrace] Add mft-fwtrace-cfg.deb which contains fwtrace_cfg files for the mlxtrace utility (#15960)
Backport of #15961

Why I did it
Added the fwtrace config files so that the mlxtrace utility can be called during the show techsupport dump.

How I did it
Added fwtrace config files. Added path to these files to sai.profile for each mlnx device.

How to verify it
Execute the show techsupport command and check that the mlxtrace output is in the system dump.
2023-08-20 19:29:32 +08:00
Kebo Liu
4f403c9079
[202211] [Mellanox] Update SDK/FW/SAI to 4.6.1020/2012.1020/SAIBuild2211.25.1.0 (#16096) (#16095)
This is to backport #16096

Why I did it
SONiC changes:

Support Spectrum4 ASIC FW binary building.
Support new SDK sx-obj-desc lib building since new SAI need it.
Remove SX_SCEW debian package from Mellanox SDK build since we are no longer using it (we use libxml2 instead).
Update SAI, SDK, FW to version 4.6.1020/2012.1020/SAIBuild2211.25.1.0
SDK/FW bug fixes

In SPC-1 platforms: Fastboot mode is not operational for Split port with Force mode in 50G speed
SFP modules are kept in disabled state after set LPM (low power mode) on/off for at least 3 minutes.
When performing fast boot from an old SDK version (currently installed) to a newer one (target version), and the system was initially loaded with a new SDK version (past version), and the system has not been wiped, under specific conditions, the fast boot would use the past version's data and may fail.
SDK/FW Features

On SN2700 all ports can support y cable by credo
SAI bug Fixes

When creating an ACL rule with SAI_ACL_ENTRY_ATTR_FIELD_SRC_IP/SAI_ACL_ENTRY_ATTR_FIELD_DST_IP enabled, and then disabling the field by setting enable=false, a match on L3_type=IPv4 will remain programmed for the rule. Issue resolved after the fix.
Allow the max scale of virtual routers to be configured for SPC-1, SPC-2, SPC-3 when fastboot is enabled
Remove default hash key of SRC_MAC, DST_MAC and ETH_TYPE
SAI features

Port init profile
Dual ToR Active-Standby | Additional MAC support
How I did it
Update SDK/FW/SAI make files

How to verify it
Run full sonic-mgmt regression on Mellanox platform
2023-08-20 19:24:32 +08:00
mssonicbld
cd06ae9599
[E1031] fix pca9548 initializes failed occasionally (#15712) (#16053) 2023-08-06 23:52:44 +08:00
mssonicbld
4217e52dd3
[Mellanox] Remove unnecessary file manipulation in the SAI Make file (#15993) (#16044) 2023-08-06 14:12:06 +08:00
mssonicbld
946d276f96
[Mellanox] Remove reset_from_comex from reboot cause mapping (#15793) (#16039) 2023-08-06 14:03:59 +08:00
mssonicbld
b04c39f656
[Mellanox] Add support for BIOS update on Spectrum-4 (#15795) (#15941) 2023-07-23 22:42:39 +08:00
Kebo Liu
19fd6d5b8d
[202211] [Mellanox] Update SAI build procedure (#15728) (#15742)
Backport #15728

Why I did it
To optimize Mellanox platform SAI build

How I did it
SAI debs are now downloaded as Spectrum-SDK-Drivers-SONiC-Bins release.

How to verify it
Configure/build for Mellanox platform, check the image and ensure that correct SAI debs are included.
2023-07-23 21:12:44 +08:00
Vivek
a416372e04 [Mellanox] Added patchwork link to commit message (#15301)
- Why I did it
Add the patchwork link to the commit description for non-upstream patches if present

- How I did it
Parse the patchwork/<patch_name>.txt file from hw-mgmt
2023-07-08 10:11:41 +08:00
mssonicbld
36f1c8c972
Revert "[gearbox] use credo sai v0.9.0 (#14149)" (#15708) (#15751) 2023-07-08 07:44:02 +08:00
mssonicbld
c442528379
[Mellanox] Add Copyright Headers for missing files (#15136) (#15733) 2023-07-07 07:08:17 +08:00
mssonicbld
28857e34b2
[Mellanox] Facilitate automatic integration of sdk kernel patches (#14652) (#15732) 2023-07-07 05:34:08 +08:00
mssonicbld
3d99bf71b6
[mlnx-ffb.sh] Update issu-version location (#14925) (#15674) 2023-06-30 08:10:20 +08:00
mssonicbld
923da6fc84
[Mellanox] get LED capability from capability file (#14584) (#15664) 2023-06-30 00:26:23 +08:00
mssonicbld
f407a10c27
[Mellanox] Adjust warning threshold implementation according to the latest algorithm update (#15092) (#15665) 2023-06-30 00:25:50 +08:00
Samuel Angebault
19be1fa775
[202211][Arista] Update platform library submodules (#15407)
fix pcied leak on chassis
fix fan status led setting on fixed systems
misc fixes
2023-06-22 08:14:17 -07:00
Pavan-Nokia
776abb002a
[armhf][Nokia-7215]Add SFP refactor support for Nokia-7215 platform (#14789)
Why I did it
Add support for SFP refactor on Nokia-7215 Marvell armhf platform.

Platform: armhf-nokia_ixs7215_52x-r0
HwSKU: Nokia-7215
ASIC: marvell
Port Config: 48x1G + 4x10G (SFP+)

How I did it
Modify sfp.py to support SFP refactor optoe driver and platform.json to facilitate proper OC test completion.

How to verify it
Build armhf target for Nokia-7215 and verify proper Xcvrd and SFP refactor operation.
2023-06-22 08:12:37 -07:00
pavannaregundi
b8cd8d8e06 [Marvell] Update armhf driver version (#15138)
Changes in MRVL_PRESTERA_DRIVER_1.4:
- Memory leak fixed by releasing pci device after retrieval.
- Fixes for 5.10 kernel porting.

Change-Id: I1d7ee4ec02ec17a29ddb8473725ab68ca399748b

Signed-off-by: Pavan Naregundi <pnaregundi@marvell.com>
2023-06-16 09:54:53 +08:00
Ikki Zhu
ea2e849607 [celestica/e1031]: enable emc2305 fan controller timeout feature (#14401)
Why I did it
There is a rare condition in which the emc2305 holds the SMBus and causes the SMBus completion wait to time out.

How I did it
Enable the EMC2305 SMBus timeout feature: a 30 ms period of inactivity resets the interface.

How to verify it
Use 'i2cget -y -f 23 0x4d 0x20 b' to read the EMC2305 configuration register and check that the DIS_TO bit is not set.
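The bit check in the verification step can be scripted. The following is a minimal sketch, not part of the original commit: it assumes DIS_TO is bit 6 of the configuration register at 0x20 (please confirm against the EMC2305 datasheet) and that the input is the hex string printed by i2cget in byte mode. The function name is hypothetical.

```python
# Assumption: DIS_TO is bit 6 of the EMC2305 configuration register (0x20).
# Verify the bit position against the datasheet before relying on this.
DIS_TO_MASK = 1 << 6


def smbus_timeout_enabled(i2cget_output: str) -> bool:
    """Return True when DIS_TO is clear, i.e. the SMBus timeout is active.

    `i2cget_output` is the text printed by i2cget in byte mode, e.g. "0x40".
    """
    value = int(i2cget_output.strip(), 16)
    return (value & DIS_TO_MASK) == 0
```

Feeding it the output of the i2cget command above gives a yes/no answer instead of a manual bit inspection.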

Signed-off-by: Eric Zhu <erzhu@celestica.com>
2023-06-16 09:54:47 +08:00
Lior Avramov
d26850611f
[Mellanox] [202211] Remove iproute2 SDK patches from SONiC tree and consume them from SDK github (#15061)
Why I did it
SDK patches for iproute2 were added to SONiC tree as a temporary solution.
Now that SDK with the patches is available, I have removed the patches from SONiC tree and we consume them from SDK github during compilation.

How I did it
During build we download SDK iproute2 patches from SDK github (or from the URL provided by user if compiling SDK from sources) and apply them before compilation.

How to verify it
Compile and load on switch, verify interfaces network devices created successfully.
Verify LLDP shows connections to neighbors.
Verify ping between 2 hosts over 2 router ports is successful.
2023-06-14 17:13:10 +08:00
StormLiangMS
8aeb2ba715
Cherrypick to 202211 [Mellanox] Add patch commit-id mapping to description #15416
cherry pick #15052
2023-06-10 13:58:12 +08:00
Junchao-Mellanox
af7412d3a1 [Mellanox] add PSU fan direction support (#14508)
- Why I did it
Add PSU fan direction support

- How I did it
Implement fan.get_direction for PSU fan

- How to verify it
Manual test
Unit test
2023-06-10 12:32:26 +08:00
Sudharsan Dhamal Gopalarathnam
d93970bc2e
[Mellanox] Update hw-mgmt to 7.0020.4301 (#15260) (#15283)
Manual Cherrypick of #15260

Why I did it
Bug fix:

I2C bus is stuck - Unable to probe I2C bus 2-0048, which causes /var/run/hw-management/config/sfp_counter, module_counter to be zero and pmon docker unable to start.
How I did it
Update HW-MGMT package version in the make file
Update HW-MGMT submodule pointer

How to verify it
run full sonic-mgmt regression
2023-06-01 11:41:59 +08:00
mssonicbld
d8f2f7c034
[Mellanox] Use sysfs for sfp reset/LPM/presence (#14130) (#15215) 2023-05-26 02:25:21 +08:00
mssonicbld
2098634ab3
[Mellanox] Update SAI to 2211.24.0.21 and SDK/FW to 4.5.5142/2010_5144 (#15072) (#15214) 2023-05-26 02:20:30 +08:00
mssonicbld
17771efecf
[armhf][Nokia-7215] changes fstrim.timer to daily (#14723) (#15125) 2023-05-18 07:12:24 +08:00
Song Yuan
01f61d1d29 Install ptf afpacket module required by ptf_nn_agent. (#14503)
Why I did it
ptf_nn_agent failed to start in dnx rpc syncd because module afpacket was not installed.
Please see issue sonic-net/sonic-mgmt#7822

How I did it
Add downloading ptf afpacket module in docker file.

How to verify it
Verified that ptf_nn_agent was started successfully in dnx rpc syncd with the change.
2023-05-18 06:32:29 +08:00
Samuel Angebault
59c7d39ef5
[202211][Arista] Update platform library submodules (#14828)
Fix watchdog reboot cause for wolverine linecard
Fix PSU fan speed of 0% by adding max RPM to most psu descriptions
Add product DCS-7060DX5-64
Add product DCS-7060DX5-32
2023-05-08 14:09:37 +08:00
Vivek
f49ae28948 [Mellanox] Fix the hw-mgmt intg tool case sensitivity for KConfig (#14709)
Fix the script to consider case sensitivity while writing the kconfig

Signed-off-by: Vivek Reddy Karri <vkarri@nvidia.com>
2023-05-06 12:32:19 +08:00
mssonicbld
10e635be93
[Mellanox] Facilitate automatic integration of new hw-mgmt (#14594) (#14966) 2023-05-06 09:08:54 +08:00
Lior Avramov
d7d8d7754d
[Mellanox] [202211] Replace iproute2 supplied by SDK to iproute2 downloaded from Debian repository (#14726) (#14724)
- Why I did it
Mellanox syncd container will be based on Debian iproute2 plus patches instead of Nvidia internal version of iproute2

- How I did it
Download iproute2 from Debian repository, apply patches and compile to create a new target.
The target is then deployed in syncd container of Mellanox switches only.
The new target is called IPROUTE2_MLNX.

- How to verify it
Compile and load on switch, verify interfaces network devices created successfully.
Verify LLDP shows connections to neighbors.
Verify ping between 2 hosts over 2 router ports is successful.
2023-05-02 10:29:02 +03:00
mssonicbld
e786108bbd
[Build] Fix the installation candidate not found issue when building docker-sonic-vs (#14439) (#14769) 2023-04-23 21:05:59 +08:00
Marty Y. Lok
1ce2b50143 [marvell-armhf][uboot-setting] Fix the print menu for marvell-armhf print menu on Nokia-7215 (#13933)
Why I did it
After sonic-installer installs a new image, print_menu is set to echo without any data, so no image info is shown between "Hit any key to stop autoboot:  0" and "(Re)start USB":

Board configuration detected:
Net:   
|  port  | Interface | PHY address  |
|--------|-----------|--------------|
No ethernet found.
Hit any key to stop autoboot:  0 

(Re)start USB...
USB0:   Port (usbActive) : 0    Interface (usbType = 2) : USB EHCI 1.00
scanning bus 0 for devices... 3 USB Device(s) found
       scanning usb for storage devices... 0 Storage Device(s) found
How I did it
The fw_setenv print_menu command is missing the double quotes, which causes the value to be truncated. Use double quotes in the environment setting.

How to verify it
Install new image with this fix. And reboot the system. The following section should be shown:

Signed-off-by: mlok <marty.lok@nokia.com>
2023-04-21 02:33:03 +08:00
Hua Liu
51b60613f7 [S6100] Improve S6100 serial-getty monitor, wait and re-check when getty not running to avoid false alert. (#14402)

#### Why I did it
On S6100, the serial-getty service sometimes can't be auto-restarted by systemd, so there is a monit unit that checks the serial-getty service status and restarts it.

However, this monit unit reports false alerts, because in most cases when serial-getty is not running, systemd can restart it successfully.

To avoid the false alerts, improve the monitor to wait and re-check.

Steps to reproduce this issue:
1. Log in to the device via console and keep the connection open.
2. Log in to the device via SSH and check that the serial-getty@ttyS1.service service is running.
3. Run 'monit reload' from the SSH connection.
4. Check syslog 1 minute later; there will be a false alert: "'serial-getty' process is not running"

#### How I did it
Add a check-getty.sh script that re-checks later when the getty service is not running, and update the monit unit to check the serial-getty service status with this script to avoid false alerts.
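
The wait-and-re-check idea can be sketched as follows. This is an illustrative Python sketch, not the contents of check-getty.sh: the function names and the 30-second default wait are assumptions, and `systemctl is-active --quiet` is used here as one possible way to probe the unit.

```python
import subprocess
import time
from typing import Callable


def is_active(unit: str) -> bool:
    # `systemctl is-active --quiet` exits with 0 only when the unit is active.
    result = subprocess.run(["systemctl", "is-active", "--quiet", unit])
    return result.returncode == 0


def check_with_recheck(check: Callable[[], bool],
                       wait_seconds: float = 30.0,  # hypothetical default
                       sleep: Callable[[float], None] = time.sleep) -> int:
    """Return 0 if the check passes now or after one wait, otherwise 1."""
    if check():
        return 0
    # Give systemd a chance to restart the service before reporting failure.
    sleep(wait_seconds)
    return 0 if check() else 1
```

A monitor would call something like `check_with_recheck(lambda: is_active("serial-getty@ttyS1.service"))` and alert only on a non-zero result, so a restart that systemd completes within the wait window never raises an alert.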

#### How to verify it
Pass all UT.
Manually check that the fixed code works correctly:


```
admin@***:~$ sudo systemctl stop  serial-getty@ttyS1.service
admin@***:~$ sudo /usr/local/bin/check-getty.sh 
admin@***:~$ echo $?
1
admin@***:~$ sudo systemctl status serial-getty@ttyS1.service
serial-getty@ttyS1.service - Serial Getty on ttyS1
     Loaded: loaded (/lib/systemd/system/serial-getty@.service; enabled-runtime; vendor preset: enabled)
     Active: inactive (dead) since Tue 2023-03-28 07:15:21 UTC; 1min 13s ago

admin@***:~$ sudo /usr/local/bin/check-getty.sh 
admin@***:~$ echo $?
0
admin@***:~$ sudo systemctl status serial-getty@ttyS1.service
serial-getty@ttyS1.service - Serial Getty on ttyS1
     Loaded: loaded (/lib/systemd/system/serial-getty@.service; enabled-runtime; vendor preset: enabled)
```

syslog:
```
Mar 28 07:10:37.597458 *** INFO systemd[1]: serial-getty@ttyS1.service: Succeeded.
Mar 28 07:12:43.010550 *** ERR monit[593]: 'serial-getty' status failed (1) -- no output
Mar 28 07:12:43.010744 *** INFO monit[593]: 'serial-getty' trying to restart
Mar 28 07:12:43.010846 *** INFO monit[593]: 'serial-getty' stop: '/bin/systemctl stop serial-getty@ttyS1.service'
Mar 28 07:12:43.132172 *** INFO monit[593]: 'serial-getty' start: '/bin/systemctl start serial-getty@ttyS1.service'
Mar 28 07:13:43.286276 *** INFO monit[593]: 'serial-getty' status succeeded (0) -- no output
```

#### Description for the changelog
[S6100] Improve S6100 serial-getty monitor.

2023-04-21 02:32:56 +08:00
mssonicbld
6781c4a4fb
Made non-upstream patch design order aware (#14434) (#14650) 2023-04-14 03:29:35 +08:00
xumia
5dbf512cda
Support to add SONiC OS Version in device info (#14601) (#14623)
Why I did it
Cherry-pick #14601, for code conflict.
Support to add SONiC OS Version in device info.
It will be used to display the version info in the SONiC command "show version". The version is used to do the FIPS certification. We do not do the FIPS certification on a specific release, but on the SONiC OS Version.

SONiC Software Version: SONiC.master-13812.218661-7d94c0c28
SONiC OS Version: 11
Distribution: Debian 11.6
Kernel: 5.10.0-18-2-amd64
Work item tracking
Microsoft ADO (number only): 17894593
How I did it
How to verify it
2023-04-13 19:28:03 +08:00
Jemston Fernando
8bbc8eb8cf
[celestica]: Fix Belgite platform issues (#14036)
As part of platform hardening this commit fixes several platform issues
in various components like PSU, FAN, Temperature, LED.
Cherrypick PR#13389
2023-03-27 10:16:16 -07:00
Sudharsan Dhamal Gopalarathnam
156189dbad [Mellanox]Fix lpmode set when logical port is larger than 64 (#14138)
- Why I did it
In the sfplpm API, the number of logical ports is hardcoded as 64. When a system contains more ports than this, the SDK APIs fail with syslog messages like the ones below

Mar 7 03:53:58.105980 r-leopard-58 ERR syncd#SDK: [MGMT_LIB.ERR] Slot [0] Module [0] has logport [0x00010069] in enabled state
Mar 7 03:53:58.105980 r-leopard-58 ERR syncd#SDK: [SDK_MGMT_LIB.ERR] Failed in __sdk_mgmt_phy_module_pwr_attr_set, error: Internal Error
Mar 7 03:53:58.106118 r-leopard-58 ERR pmon#-c: Error occurred when setting power mode for SFP module 0, slot 0, error code 1

- How I did it
Remove the hardcoded value of 64 and obtain the number of logical ports from the SDK

- How to verify it
Manual testing
2023-03-19 20:50:58 +08:00
Junhua Zhai
29f3c4944a [gearbox] use credo sai v0.9.0 (#14149)
Update credo sai package to the latest v0.9.0.
2023-03-19 20:50:53 +08:00
Dror Prital
ba14f728de Update SDK/FW to version 4.5.4206/4.5.4204 (#14164)
- Why I did it
To include latest fixes:

Fix traffic loss on all routed traffic when moving from 4.4.3372/XX_2008_3388 to 4.5.4118-012/XX_2010_4120-010. Issue occurred after the ISSU process on Spectrum 1 only, when upgrading from an older version to a new one. Neighbor entries are overwritten.
Fix: when using a mirror session policer on SPC2/3, the actual CIR was 1.28 times more than the configured CIR value.
Fix: creation of a router interface of type bridge may occasionally fail if create is performed immediately after delete.
Fix: false errors during SDK deinitialization may be seen in the syslog.

- How I did it
Updated SDK submodule and relevant makefiles with the required versions.

- How to verify it
Build an image and run tests from "sonic-mgmt".
2023-03-19 20:50:49 +08:00
dbarashinvd
d7ba89a95b [Mellanox] fix for watchdog device not found, adding dependency on hw-management (#14182)
- Why I did it
Sometimes the Nvidia watchdog device isn't ready when the watchdog-control service comes up after the first installation from ONIE.
The watchdog-control service needs to be delayed so that it starts after hw-mgmt, which gets the devices up and ready.

- How I did it
Delay the Nvidia watchdog-control service until after hw-mgmt has started on the Mellanox platform, in order to avoid a missing or not-ready watchdog device.

- How to verify it
verification test of ONIE installation of image in a loop
making sure watchdog service is always up (not failed) after first installation from ONIE
2023-03-19 20:50:44 +08:00
Volodymyr Samotiy
cc5ed4b632 [Mellanox] Update MFT to 4.22.1-15 (#14133)
Signed-off-by: Volodymyr Samotiy <volodymyrs@nvidia.com>
2023-03-19 18:33:57 +08:00
zitingguo-ms
3c312dec1c
Upgrade SAI xgs version to 8.4.0.2 and migrate to DMZ (#14119)
Why I did it
Update SAI xgs version to 8.4.0.2 and migrate xgs to DMZ repo.

How I did it
Update SAI xgs version in sai.mk.

How to verify it
Run the SONiC and SAI tests with the 8.4 SAI release pipeline.
2023-03-09 14:52:08 +08:00
Stepan Blyshchak
969166d769 [Mellanox] Place FW binaries under platform directory instead of squashfs (#13837)
Fixes #13568

Upgrade from old image always requires squashfs mount to get the next image FW binary. This can be avoided if we put FW binary under platform directory which is easily accessible after installation:

admin@r-spider-05:~$ ls /host/image-fw-new-loc.0-dirty-20230208.193534/platform/fw-SPC.mfa
/host/image-fw-new-loc.0-dirty-20230208.193534/platform/fw-SPC.mfa
admin@r-spider-05:~$ ls -al /tmp/image-fw-new-loc.0-dirty-20230208.193534-fs/etc/mlnx/fw-SPC.mfa
lrwxrwxrwx 1 root root 66 Feb  8 17:57 /tmp/image-fw-new-loc.0-dirty-20230208.193534-fs/etc/mlnx/fw-SPC.mfa -> /host/image-fw-new-loc.0-dirty-20230208.193534/platform/fw-SPC.mfa

- Why I did it
202211 and above use a different squashfs compression type that the 201911 kernel cannot handle. Therefore, we avoid mounting squashfs altogether with this change.

- How I did it
Place FW binary under /host/image-/platform/mlnx/, soft links in /etc/mlnx are created to avoid breaking existing scripts/automation.
/etc/mlnx/fw-SPCX.mfa is a soft link always pointing to the FW that should be used in current image
mlnx-fw-upgrade.sh is updated to prefer the /host/image-/platform/mlnx location and fall back to /etc/mlnx in squashfs in case the new location does not exist. This is necessary for image downgrade.
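
The prefer-then-fall-back lookup described above can be sketched as follows. This is an illustrative Python sketch, not the logic of mlnx-fw-upgrade.sh itself: the function name is hypothetical, and the caller is expected to pass the new platform location first and the squashfs location second.

```python
import os
from typing import Callable, Iterable, Optional


def pick_fw_file(candidates: Iterable[str],
                 exists: Callable[[str], bool] = os.path.exists) -> Optional[str]:
    """Return the first candidate path that exists, or None if none do.

    Candidates are listed in order of preference, so the new location
    wins whenever it is present and older images still resolve.
    """
    for path in candidates:
        if exists(path):
            return path
    return None
```

Keeping the preference order in one list means an image that lacks the new location still resolves to the old one without a separate code path.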

- How to verify it
Upgrade from 201911 to master
master to 201911 downgrade
master -> master reboot
ONIE -> master boot (First FW burn)
2023-03-08 13:50:18 +08:00
mssonicbld
aea96da04d
[Mellanox] Fix issue: cannot find label port for logical port when logical port number is larger than 64 (#13710) (#13962) 2023-03-06 16:47:31 +08:00
mssonicbld
1757f53290
[Mellanox] update sdk/fw build procedure (#14025) (#14059) 2023-03-03 02:43:19 +08:00
mssonicbld
72f9f51287
[Seastone] fix dx010 qsfp eeprom data write issue (#13930) (#14032) 2023-03-01 19:28:38 +08:00
mssonicbld
18bc044179
Remove support to Mellanox SPC4 ASIC (#13932) (#13957) 2023-02-23 22:22:35 +08:00
mssonicbld
310827c26c
Add PYTHON3_SWSSCOMMON as build time dependency to Mellanox platform API (#13847) (#13959) 2023-02-23 20:32:15 +08:00
mssonicbld
50aaf92590
[Mellanox] Non upstream patches for hw-mgmt V.4.0020.4104 (#13792) (#13960) 2023-02-23 20:32:09 +08:00