Commit Graph

2114 Commits

Author SHA1 Message Date
Saikrishna Arcot
27a0aab6bb
Fix saibcm-modules-dnx hash to use the one on master branch (#16633)
#15926 updated the submodule hash to point to a commit on a private branch that included changes for compiling with the 5.10.179 kernel. However, this submodule hash should've pointed to one on the master branch, not on a private branch. Fix this with this PR.

Signed-off-by: Saikrishna Arcot <sarcot@microsoft.com>
2023-09-21 10:37:18 -07:00
Rajkumar-Marvell
755ca6626b
[Marvell] Update armhf sai debian to 1.12.0-2 (#16638)
- Add sai_query_api_version API support

Signed-off-by: rajkumar38 <rpennadamram@marvell.com>
2023-09-21 10:35:01 -07:00
Saikrishna Arcot
d62ad707bc
Update to Linux 5.10.179 (#15926)
## How I did it

Depends on sonic-net/sonic-linux-kernel#328 and sonic-net/saibcm-modules#12.

#### How to verify it

Verified that the image boots up, BGP comes up, and a basic warm-reboot works on VS, broadcom, and mellanox.
2023-09-20 15:24:39 -07:00
Kevin Wang
7cd9f13253
[cisco]: Align the ref pointer to latest code drop on 202205 (#16600)
Signed-off-by: Kevin Wang <shengkaiwang@microsoft.com>
2023-09-20 10:56:37 -07:00
Aravind Mani
2f3f28691a
[devices]: Dell S6100 API 2.0 fix (#16363)
Why I did it
sonic-mgmt test failure is seen for update_firmware component API

Microsoft ADO: 25208748

How I did it
Edited API 2.0 to fix this issue.

How to verify it
Run sonic-mgmt test after the fix and verify it passes.
2023-09-19 10:24:05 -07:00
Christian Svensson
566fe1eb1b
[pddf] Enable deselect logic for CPLDMUX (#14631)
This feature was meant to be enabled but was accidentally left disabled.

Also downgrades the select/deselect messages to KERN_INFO to reduce log
spam.

Fixes #14546.

Signed-off-by: Christian Svensson <blue@cmd.nu>
2023-09-11 11:36:39 -07:00
Yakiv Huryk
2b1c39e6f6
[vs] support for ARM build (#15692)
- Why I did it
To support the building of ARM-based docker-sonic-vs.gz

- How I did it
Fixed SYNCD_VS build rule to be architecture-specific.

- How to verify it
make configure PLATFORM=vs PLATFORM_ARCH=arm64
make target/docker-sonic-vs.gz

Signed-off-by: Yakiv Huryk <yhuryk@nvidia.com>
2023-09-10 18:27:04 +03:00
snider-nokia
2f69a0eaa6
[Nokia][sonic-platform] Update Nokia sonic-platform submodule (#16348)
This likely fixes Nokia-ION/ndk#21

To fix a failure that results when edge condition results in MDIPC channel being freed with mismatched ownership.
2023-09-07 11:20:06 -07:00
Dror Prital
d7b85af18b
[Mellanox] Update SDK/FW to 4.6.1062/2012.1062 Update SDK/FW/SAI to 4.6.1062/2012.1062/SAIBuild2211.25.1.4 (#16478)
- Why I did it
SAI bug Fixes
1. When creating an ACL rule with SAI_ACL_ENTRY_ATTR_FIELD_SRC_IP/SAI_ACL_ENTRY_ATTR_FIELD_DST_IP enabled, and then disabling the field by setting enable=false, a match on L3_type=IPv4 will remain programmed for the rule Issue resolved after the fix
2. Allow the max scale of virtual routers to be configure for SPC-1, SPC-2, SPC-3 which is 255 when fastboot enable and 511 when fastboot disable
3. Remove default hash key of SRC_MAC, DST_MAC and ETH_TYPE

SAI features
1. Port init profile
2. Dual ToR Active-Standby | Additional MAC support

SDK/FW bug fixes
1. When preforming fast boot from an old SDK version (currently installed) to a newer one (target version), and the system was initially loaded with a new SDK version (past version), and the system has not been wiped, under specific conditions, the fast boot would use the past version's data and may fail.

- How I did it
Update SAI version to SAIBuild2211.25.1.4
Update SDK/FW version to 4.6.1062/2012.1062
2023-09-07 14:05:33 +03:00
Kebo Liu
e286869b24
[Mellanox] Update HW-MGMT package to new version V.7.0030.1011 (#16239)
- Why I did it
1. Update Mellanox HW-MGMT package to newer version V.7.0030.1011
2. Replace the SONiC PMON Thermal control algorithm with the one inside the HW-MGMT package on all Nvidia platforms
3. Support Spectrum-4 systems

- How I did it
1. Update the HW-MGMT package version number and submodule pointer
2. Remove the thermal control algorithm implementation from Mellanox platform API
3. Revise the patch to HW-MGMT package which will disable HW-MGMT from running on SIMX
4. Update the downstream kernel patch list

Signed-off-by: Kebo Liu <kebol@nvidia.com>
2023-09-06 11:32:08 +03:00
Rajkumar-Marvell
782a92213d
[Marvell] Update armhf sai debian to add SAI 1.12 support (#16299)
- SAI 1.12 support

Signed-off-by: rajkumar38 <rpennadamram@marvell.com>
2023-09-05 10:20:27 -07:00
Stephen Sun
b5e8c16134
[Mellanox] Enhance FW upgrade mechanism (#16090)
### Why I did it

1. Enhance the diagnosis information collecting mechanism
   - If the option `-v` is fed, it will pass additional diagnosis flags to mlxfwmanager
   - Collect all the output from mlxfwmanager and print them to syslog if it fails
2. Abort syncd in case waiting for device or upgrading firmware fails

Signed-off-by: Stephen Sun <stephens@nvidia.com>

### How I did it

#### How to verify it

Regression and manual test
2023-09-04 11:28:53 -07:00
Vadym Hlushko
78587cedc3
[Mellanox] Remove mlxtrace support for SPC4 (#16373)
- Why I did it
Because the Spectrum4 devices don't support mlxtrace utility.

- How I did it
Edit sai.profile and remove mlxtrace_spectrum4_itrace_*.cfg.ext files

Signed-off-by: vadymhlushko-mlnx <vadymh@nvidia.com>
2023-09-04 10:53:20 +03:00
Yoush
559151b41e
[centec]: update sonic master centec-sai reference to v1.12.0-1 (#16238)
Signed-off-by: yoush <yoush@centec.com>
2023-09-01 23:22:00 -07:00
Vadym Hlushko
9e3fdded69
[Mellanox][SFP] Remove unused function parameter (#16318)
Why I did it
To avoid errors when the sfputil show error-status -hw is called from the host OS (not from the pmon docker).

How I did it
Remove the self.sdk_handle parameter from the _get_module_info() function.

How to verify it
Execute the sfputil show error-status -hw

Signed-off-by: vadymhlushko-mlnx <vadymh@nvidia.com>
2023-09-01 23:06:04 -07:00
Andrew Sapronov
0405b369af
[Netberg][Barefoot] Added support for Aurora 750 (#16342)
Why I did it
Support Intel Tofino based platforms Netberg Aurora 750
ASIC: Intel Tofino BFN-T10-064Q
Pors: 64x 100G

How I did it
Added specification to device/netberg directory
Added platform/barefoot/sonic-platform-modules-netberg contains kernel modules, scripts and sonic_platform packages.
Modified the platform/barefoot/platform-modules-netberg.mk to include Aurora 750 related ID.

Signed-off-by: Andrew Sapronov <andrew.sapronov@gmail.com>
2023-09-01 22:52:39 -07:00
Guohan Lu
3bdfdd95ea Revert "[Ragile]: Add new centec platform ra-b6010 (#14819)"
This reverts commit 75062436e8.
2023-09-01 22:43:18 -07:00
pettershao-ragilenetworks
75062436e8
[Ragile]: Add new centec platform ra-b6010 (#14819)
What I did it
Add new platform arm64-ragile_ra-b6010-48gt4x-r0 (Centec)
ASIC Vendor: Centec
Switch ASIC: Centec
Port Config: 48x1G+4x10G

Why I did it
Add new platform RA-B6010-48GT4X

How I did it
Add new platform RA-B6010-48GT4X

Signed-off-by: pettershao-ragilenetworks <pettershao@ragilenetworks.com>
2023-08-31 08:38:24 -07:00
Nonodark Huang
42cf153624
[ufispace][pddf] Add the pddf package dependency to the ufispace platform modules mk file. (#16302)
Signed-off-by: nonodark <ef67891@yahoo.com.tw>
2023-08-29 09:45:28 -07:00
Stephen Sun
be8843b166
Fix issue: unprintable character is rendered when handling comments in j2 (#16287)
Use "{#-" and "-#}" to mark comments in jinja template

Signed-off-by: Stephen Sun <stephens@nvidia.com>
2023-08-28 15:03:39 +03:00
Samuel Angebault
c1c1054952
[Arista] Update platform library submodules (#16112)
Ignore intermittent IO errors during get_change_event in the Platform API
Fix tunings for some ports on CatalinaDD
Fix kernel module build for 6.1 kernel in preparation of bookworm upgrade
2023-08-23 11:05:42 -07:00
Aravind Mani
821be3f6fc
DellEMC: System health config changes (#15771)
Why I did it
System health config is missing in few Dell platforms.

How I did it
Added system health monitoring config and its related API's

How to verify it
show system-health summary/detail commands.
2023-08-23 11:05:03 -07:00
Junchao-Mellanox
95f317a5e2
[Mellanox] Fix issue: watchdogutil command does not work (#16091)
- Why I did it
watchdogutil uses platform API watchdog instance to control/query watchdog status. In Nvidia watchdog status, it caches "armed" status in a object member "WatchdogImplBase.armed". This is not working for CLI infrastructure because each CLI will create a new watchdog instance, the status cached in previous instance will totally lose. Consider following commands:

admin@sonic:~$ sudo watchdogutil arm -s 100      =====> watchdog instance1, armed=True
Watchdog armed for 100 seconds
admin@sonic:~$ sudo watchdogutil status             ======> watchdog instance2, armed=False
Status: Unarmed
admin@sonic:~$ sudo watchdogutil disarm            =======> watchdog instance3, armed=False
Failed to disarm Watchdog

- How I did it
Use sysfs to query watchdog status

- How to verify it
Manual test
Unit test
2023-08-23 09:30:58 +03:00
Vadym Hlushko
b214a8a8b6
[Mellanox] Change SDK API sx_mgmt_phy_module_info_get() to sysfs (#15963)
- Why I did it
Change Mellanox platform API implementation to use ASIC driver sysfs for the module operational state and status error fields.

- How I did it
Modify the platform/mellanox/mlnx-platform-api/sonic_platform/sfp.py file by change the call of sx_mgmt_phy_module_info_get() SDK API to sysfs

- How to verify it
Simulate the unplug cable event
Check the CLI output
sfputil show presence
sfputil show error-status -hw
Simulate the plug cable event
Repeat 2 step

Signed-off-by: vadymhlushko-mlnx <vadymh@nvidia.com>
2023-08-21 20:54:13 +03:00
Kebo Liu
1626e198a8
[Mellanox] Update SDK/FW/SAI to 4.6.1020/2012.1020/SAIBuild2305.25.0.3 (#16096)
SONiC changes:
1. Support Spectrum4 ASIC FW binary building.
2. Support new SDK sx-obj-desc lib building since new SAI need it.
3. Remove SX_SCEW debian package from Mellanox SDK build since we are no longer using it (we use libxml2 instead).
4. Update SAI, SDK, FW to version 4.6.1020/2012.1020/SAIBuild2305.25.0.3

SDK/FW bug fixes
1. In SPC-1 platforms: Fastboot mode is not operational for Split port with Force mode in 50G speed
SFP modules are kept in disabled state after set LPM (low power mode) on/off for at least 3 minutes.
2. When preforming fast boot from an old SDK version (currently installed) to a newer one (target version), and the system was initially loaded with a new SDK version (past version), and the system has not been wiped, under specific conditions, the fast boot would use the past version's data and may fail.

SDK/FW Features
1. On SN2700 all ports can support y cable by credo

SAI bug Fixes
1. When creating an ACL rule with SAI_ACL_ENTRY_ATTR_FIELD_SRC_IP/SAI_ACL_ENTRY_ATTR_FIELD_DST_IP enabled, and then disabling the field by setting enable=false, a match on L3_type=IPv4 will remain programmed for the rule Issue resolved after the fix
2. Allow the max scale of virtual routers to be configure for SPC-1, SPC-2, SPC-3 when fastboot enable 
3. Remove default hash key of SRC_MAC, DST_MAC and ETH_TYPE

SAI features
1. Port init profile

- How I did it
Update SDK/FW/SAI make files

- How to verify it
Run full sonic-mgmt regression on Mellanox platform

Signed-off-by: Kebo Liu <kebol@nvidia.com>
2023-08-15 15:32:52 +03:00
Kebo Liu
5aa2417c71
[Mellanox] Update MFT to newer version 4.25.0-62 (#16149)
- Why I did it
Update Mellanox MFT tool to version 4.25.0-62

- How I did it
Update the MFT tool make file

- How to verify it
Run full sonic-mgmt regression.

Signed-off-by: Kebo Liu <kebol@nvidia.com>
2023-08-15 09:49:19 +03:00
Nonodark Huang
1acafa4873
[Ufispace][PDDF] Add PDDF support on S9110-32X, S8901-54XC, S7801-54XS and S6301-56ST (#16017)
Why I did it
Add PDDF support on following Ufispace platforms with Broadcom ASIC

S9110-32X
S8901-54XC
S7801-54XS
S6301-56ST
How I did it
Add PDDF configuration files, scripts and python files

How to verify it
Run pddf commands and show commands.

Signed-off-by: nonodark <ef67891@yahoo.com.tw>
2023-08-14 15:56:03 -07:00
Zhijian Li
ab7c4ee661
[Celestica-E1031] Enable CPU watchdog (#16083)
Enable CPU watchdog on Celestica-E1031.
2023-08-13 21:33:19 -07:00
FuzailBrcm
bb8ce50cbe
Adding support for extra GPIO chips in the common PDDF driver (#16082) 2023-08-11 09:31:18 -07:00
Liu Shilong
3500f69fdb
Revert "[Ufispace][PDDF] Add PDDF support on S9180-32X (#14909)" (#16092)
This reverts commit d2b5d774c5.
2023-08-11 09:13:53 -07:00
Saikrishna Arcot
519a1e4a91
Update sairedis submodule (#16072)
* Update sairedis submodule

This submodule update needs to be manually done due to build changes
done in the sairedis submodule. Specifically, Debian build profiles are
now being used instead of dpkg build targets, and dbgsym packages are
being used instead of dbg packages. Because of this, there needs to be
changes on the sonic-buildimage side for this.

This is a reland of #15720, which was reverted in #15995 due to the RPC
package build failing. That failure has since been fixed, and the
PR pipeline has been updated to build the RPC package so that this is
checked at the PR stage.

This submodule update brings in the following changes:

```
4dbdb21 Fix RPC package build failure due to shell syntax issue (#1268)
588d596 Make sure new binaries replace existing binaries in docker-sonic-vs (#1269)
ce8f642 [vs] Use boost join to concatenate switch types in config (#1266)
d6055a2 [vslib]: Temporaily map DPU switch type to NVDA_MBF2H536C (#1259)
e1cdb4d [CodeQL]: Use dependencies with relevant versions in azp template. (#1262)
c08f9a2 [CI]: Fix collect log error in azp template. (#1260)
eed856c [CodeQL]: Fix syncd compilation in azp template. (#1261)
a3f1f1a Reland 'Make changes to building and packaging sairedis (#1116)' (#1194)
```

Signed-off-by: Saikrishna Arcot <sarcot@microsoft.com>

* Update sairedis submodule with the fix for the RPC package build

Signed-off-by: Saikrishna Arcot <sarcot@microsoft.com>

---------

Signed-off-by: Saikrishna Arcot <sarcot@microsoft.com>
2023-08-11 09:00:46 -07:00
Arun LK
97113bae61
Dell: E3224F platform onboarding (#16002)
* Dell: E3224F platform onboarding

* Dell: E3224F platform onboarding
2023-08-10 17:27:30 -07:00
FuzailBrcm
8524e563d3
PDDF: Supporting extra system fans in the common PDDF drivers (#15956) 2023-08-08 14:59:36 -07:00
pettershao-ragilenetworks
abccdaeb6c
[Ragile]Adapt kernel 5.10 for broadcom on RA-B6510-48V8C (#14809)
* Adapt kernel 5.10 for broadcom on RA-B6510-48V4C

Signed-off-by: pettershao-ragilenetworks <pettershao@ragilenetworks.com>

* update

Signed-off-by: pettershao-ragilenetworks <pettershao@ragilenetworks.com>

* update

Signed-off-by: pettershao-ragilenetworks <pettershao@ragilenetworks.com>

* update

Signed-off-by: pettershao-ragilenetworks <pettershao@ragilenetworks.com>

* update

Signed-off-by: pettershao-ragilenetworks <pettershao@ragilenetworks.com>

* modify one-image.mk file

Signed-off-by: pettershao-ragilenetworks <pettershao@ragilenetworks.com>

* modify debian/rule.mk

Signed-off-by: pettershao-ragilenetworks <pettershao@ragilenetworks.com>

* Add platform.json file

Signed-off-by: pettershao-ragilenetworks <pettershao@ragilenetworks.com>

---------

Signed-off-by: pettershao-ragilenetworks <pettershao@ragilenetworks.com>
2023-08-04 12:01:49 -07:00
Junchao-Mellanox
91f3da018e
[Mellanox] Add more unit test coverage for platform API (#15842)
- Why I did it
Increase UT coverage for Nvidia platform API code

Work item tracking
Microsoft ADO (number only):

- How I did it
Focus on low coverage file:
1. component.py
2. watchdog.py
3. pcie.py

- How to verify it
Run the unit test, the coverage has been changed from 70% to 90%
2023-08-03 13:54:31 +03:00
Kebo Liu
380898f3a1
[Mellanox] Remove unnecessary file manipulation in the SAI Make file (#15993)
Signed-off-by: Kebo Liu <kebol@nvidia.com>
2023-08-03 13:39:27 +03:00
Vadym Hlushko
521a86b2de
[Mellanox] Add mlxtrace to techsupport (#15961)
- Why I did it
Added the fwtrace config files in order to be able to call the mlxstrace utility during the show techsupport dump.

Work item tracking
Microsoft ADO (number only):

- How I did it
Added fwtrace config files. Added path to these files to sai.profile for each mlnx device.

- How to verify it
Execute the show techsupport command and check if mlxstrace output is in system dump.

Signed-off-by: vadymhlushko-mlnx <vadymh@nvidia.com>
2023-08-03 11:36:58 +03:00
Pavan-Nokia
a850175776
[Nokia-7215-A1] Update Nokia-7215-A1 platform (#15342)
Update Nokia-7215-A1 platform to address UT and OC test failures
2023-08-02 09:08:15 -07:00
Liu Shilong
04a6031b2d
Revert "Update sairedis submodule (#15720)" (#15995)
This reverts commit e0927e28af.


Why I did it
Reverts #15720

It breaks build for target/debs/bullseye/syncd_1.0.0_amd64.deb

make[2]: Entering directory '/sonic/src/sonic-sairedis'
dh_install
# Note: escape with an extra symbol
if [ -f debian/syncd-rpc/usr/bin/syncd_init_common.sh ] ; then
/bin/sh: 1: Syntax error: end of file unexpected (expecting "fi")
make[2]: *** [debian/rules:65: override_dh_install] Error 2
make[2]: Leaving directory '/sonic/src/sonic-sairedis'
make[1]: *** [debian/rules:51: binary] Error 2
make[1]: Leaving directory '/sonic/src/sonic-sairedis'
dpkg-buildpackage: error: fakeroot debian/rules binary subprocess returned exit status 2
Work item tracking
Microsoft ADO (number only): 24691535
How I did it
How to verify it
2023-07-31 16:09:35 +08:00
Santhosh Kumar T
141132c94b
Update the iSMART_64 tool (#15936)
Why I did it
Updating the iSMART_64 tool for supporting latest debian releases.

How I did it
On branch new_ismart
Changes to be committed:
(use "git restore --staged ..." to unstage)
modified: platform/broadcom/sonic-platform-modules-dell/s6100/scripts/iSMART_64

How to verify it
In s6100, run the iSMART_64 tool.
md5sum - 24725730d7649769c7ba50971c1f2955
2023-07-30 09:52:02 -07:00
kannansel
2d9be532c1
Why I did it (#14826)
Midstone platform has compilation error in master branch, fixed the same.

How I did it
Due to bullseye migration i2c_new_dummy API is deprecated modified with i2c_new_dummy_device.

How to verify it
Verified target/debs/bullseye/platform-modules-midstone-200i_0.2.2_amd64.deb is generated

Co-authored-by: Kannan Selvaraj <skannan@celestica.com>
2023-07-30 09:48:36 -07:00
Ikki Zhu
b23078b52b
[E1031] fix pca9548 initializes failed occasionally (#15712)
Why I did it
[E1031] fix pca9548 initializes failed occasionally in stress test.
When failure happened, ismt i2c bus hang up and need power cycle to
recover it.

How I did it
Add 0.5s delay between setuping and configuring pca9548 i2c mux.

How to verify it
Reboot stress test at least 100 times without failure.
2023-07-30 09:42:14 -07:00
Saikrishna Arcot
e0927e28af
Update sairedis submodule (#15720)
This submodule update needs to be manually done due to build changes
done in the sairedis submodule. Specifically, Debian build profiles are
now being used instead of dpkg build targets, and dbgsym packages are
being used instead of dbg packages. Because of this, there needs to be
changes on the sonic-buildimage side for this.

This submodule update brings in the following changes:

ce8f642 [vs] Use boost join to concatenate switch types in config (#1266)
d6055a2 [vslib]: Temporaily map DPU switch type to NVDA_MBF2H536C (#1259)
e1cdb4d [CodeQL]: Use dependencies with relevant versions in azp template. (#1262)
c08f9a2 [CI]: Fix collect log error in azp template. (#1260)
eed856c [CodeQL]: Fix syncd compilation in azp template. (#1261)
a3f1f1a Reland 'Make changes to building and packaging sairedis (#1116)' (#1194)

Signed-off-by: Saikrishna Arcot <sarcot@microsoft.com>
2023-07-24 17:05:03 -07:00
Jason Tsai
d2b5d774c5
[Ufispace][PDDF] Add PDDF support on S9180-32X (#14909)
* Add s9180-32x pddf support

Signed-off-by: cytsai0409 <cytsai0409@gmail.com>

* Fix memset_s parameter

Signed-off-by: cytsai0409 <cytsai0409@gmail.com>

* Update chassis.py and fan.py

1. remove duplicate get_sfp() in chassis.py
2. update get_direction() and get_target_speed() in fan.py

Signed-off-by: cytsai0409 <cytsai0409@gmail.com>

---------

Signed-off-by: cytsai0409 <cytsai0409@gmail.com>
2023-07-24 09:37:48 -07:00
wilson-smci
2d0bad0523
[Supermicro]: Add a new supported device and platform, SSE-T7132S. (#15368)
* * platform/innoviunm: Add a new supported device and platform, SSE-T7132S

* Switch Vendor: Supermicro
* Switch SKU:  Supermicro_sse_t7132s
* ASIC Vendor: innovium
* Swich ASIC: TL7
* Port Configuration: 32x400G
* SONiC Image: SONiC-ONIE-Innoviunm

Signed-off-by: wilsonw <wilsonw@supermicro.com.tw>
2023-07-20 10:24:56 -07:00
Aravind Mani
2de5abdaf4
DellEMC: Fix API2.0 initialization issue (#15687)
Why I did it
To fix sonic-net/sonic-mgmt#8786

How I did it
Modified Fan API to check whether the data retrieved is valid or not and return accordingly

How to verify it
Verify whether API 2.0 is loaded properly or not.
Execute CLI's like "show version", "show interface status", "show platform psustatus" etc..
2023-07-20 09:50:22 -07:00
Kebo Liu
b6986ffd68
[Mellanox] Update SAI build procedure (#15728)
= Why I did it
To optimize Mellanox platform SAI build

- How I did it
SAI debs are now downloaded as Spectrum-SDK-Drivers-SONiC-Bins release.

- How to verify it
Configure/build for Mellanox platform, check the image and ensure that correct SAI debs are included.
2023-07-15 01:03:33 +03:00
Junchao-Mellanox
ed21266ff4
[Mellanox] Remove reset_from_comex from reboot cause mapping (#15793)
- Why I did it
The reset cause "reset_from_comex" has been removed by hw-management, hence removing it from platform API code

- How I did it
Remove reset_from_comex from reboot cause mapping

- How to verify it
Manual test
2023-07-15 01:02:46 +03:00
DavidZagury
b06a856fba
[Mellanox] Add support for BIOS update on Spectrum-4 (#15795)
- Why I did it
BIOS on new generation switch can come with a file type of cap or cab. Needs to add support to these file type.
Also ONIE version on new devices can have a suffix of 'dev'.

- How I did it
Added cap & cab as possible component extensions for ComponentBIOS.
Update the ONIE version regex to include dev signed versions.

- How to verify it
Update BIOS.
2023-07-15 00:59:55 +03:00
prabhataravind
114f276dd4
[docker-sonic-vs]: More changes to support DPU-2P HWKSU (#15695)
Why I did it
port_config.ini and hwsku.json are needed to generate the default config
switch_type needs to be "dpu" to spawn the right set of processes during dvs initialization and to make sure that DASH APIs can be handled properly

Work item tracking
Microsoft ADO 24375371:

How I did it
Use the same hwsku.json and port_config.ini for DPU-2P as the ones used for Nvidia-MBF2H536C SKU in nvidia-sonic sonic-buildimage repo.
Set switch_type to "dpu" in DEVICE_METADATA configuration to make sure DASH specific APIs are handled properly

Signed-off-by: Prabhat Aravind <paravind@microsoft.com>
2023-07-11 09:57:50 -07:00
leo lin
c6dbfa988e
[Ufispace][PDDF] Add support for S9300-32D platform (#14922) 2023-07-05 14:39:01 -07:00
Arvindsrinivasan Lakshmi Narasimhan
eaa795deb8
Revert "[gearbox] use credo sai v0.9.0 (#14149)" (#15708)
Reverts #14149

This SAI libsaicredo_0.9.0_amd64.deb causing packet forwarding issues on Linecards aristanetworks/sonic#92

This reverts commit c4c621c614.
2023-07-05 10:42:46 -07:00
Andrew Sapronov
c190a8f795
[Netberg][Barefoot] Added support for Aurora 710 (#15298)
* [202012][platform/barefoot] (#8543)

Why I did it
Pcied running by python 2.

How I did it
dropped python2 support and add python3 support for pcied in file docker-pmon.supervisord.conf.j2

How to verify it
docker exec pmon supervisorctl status

* [Netberg][nba710] Added initial support for Aurora 710

Signed-off-by: Andrew Sapronov <andrew.sapronov@gmail.com>

---------

Signed-off-by: Andrew Sapronov <andrew.sapronov@gmail.com>
Co-authored-by: Kostiantyn Yarovyi <kostiantynx.yarovyi@intel.com>
2023-06-30 17:30:07 -07:00
snider-nokia
aa46167fdd
[Nokia][sonic-platform] Update Nokia sonic-platform submodule (#15239)
Why I did it
To support dynamic swapping of module types/speeds (400G/100G/40G)
To optimize CMIS ZR optics operation
How I did it
Reinitialize xcvr_api at module removal/insertion time, and also optimize cache for ZR optics.

How to verify it
Verify that different (supported) module types can be dynamically swapped (removed/inserted) and that each is properly provisioned by Xcvrd and has its EEPROM information accurately reported in Redis DB (using "show transceiver eeprom") as well as "sfputil show eeprom" direct access.

Also verify that Xcvrd initialization and operation with 400G CMIS ZR optics is both efficient and functional.
** edit 6/14/23: pushed enhanced caching (full memory map) support and elimination of base class APIs override.
2023-06-29 11:05:45 -07:00
Stepan Blyshchak
1ebdcda9e3
[nvidia] make sure shared storage with syncd is cleared on restarts (#14547)
Why I did it
Sharing the storage of syncd with other proprietary application extensions allows them to communicate with syncd in differnt ways.
If one container wants to pass some information to syncd then shared storage can be used. However, today the shared storage isn't cleaned on restarts making it possible for syncd to read out-of-date information generated in the past.

NOTE: No plans to use it for standard SONIC dockers and we are working on removing the SDK dependency from PMON docker

How I did it
Implemented new service to clean the shared storage.

How to verify it
Do reboot/fast-reboot/warm-reboot/config-reload/systemctl restart swss and verify /tmp/ is cleaned after each restart in syncd container.

Signed-off-by: Stepan Blyschak <stepanb@nvidia.com>
2023-06-28 15:26:49 -07:00
prabhataravind
d4de62d155
[docker-sonic-vs]: dd NPU SKU for docker-sonic-vs (#15604)
Define a generic 2-port NPU SKU for docker-sonic-vs to 
enable DASH vstests to pass on azure pipelines

Work item tracking
Microsoft ADO 24375371:

How I did it
Define a generic 2-port NPU hwsku that is used only for DASH-specific vstests.

Signed-off-by: Prabhat Aravind <paravind@microsoft.com>
2023-06-27 14:10:53 -07:00
Prince George
05f326eed9
Move /var/log to RAM for Mellanox SN2700, Nokia 7215 and Dell S6100 (#15077)
* add ONIE_PLATFORM_EXTRA_CMDLINE_LINUX to kernel bootparam
2023-06-26 10:58:39 -07:00
Samuel Angebault
4e43484f1d
[Arista] Update platform library submodules (#15405)
- fix pcied leak on chassis
- fix fan status led setting on fixed systems
- misc fixes
2023-06-26 10:57:08 -07:00
Rajkumar-Marvell
ec6723d4a3
[Marvell] Update arm64 sai debian (#15602)
- SAI-1.12.0 support

Signed-off-by: rajkumar38 <rpennadamram@marvell.com>
2023-06-26 10:53:37 -07:00
Mai Bui
110a3fd3ac
docker prefer COPY to ADD in dockerfile (#15394)
#### Why I did it
Docker best practices prefer COPY to ADD
https://docs.docker.com/develop/develop-images/dockerfile_best-practices/#add-or-copy
##### Work item tracking
- Microsoft ADO **(number only)**: 17418730

#### How I did it
Use the COPY command as opposed to ADD unless working with a tar file.
2023-06-22 13:16:56 -07:00
Stepan Blyshchak
e2e5b77f16
[mlnx-ffb.sh] Update issu-version location (#14925)
#### Why I did it

ISSU version check fails due to inability to mount squashfs from 202211 on 201911

#### How I did it

Put ISSU version file under platform directory

#### How to verify it

Warm-upgrade matrix:
- 201911 (with https://github.com/sonic-net/sonic-buildimage/pull/14928) to master
- 201911 (with https://github.com/sonic-net/sonic-buildimage/pull/14928) to 202211
- 202012 (with https://github.com/sonic-net/sonic-buildimage/pull/14927) to master
- 202205 (with this change cherry-picked) to master
2023-06-15 15:14:52 -07:00
pavannaregundi
bdc1d7ac35
[Marvell] Update armhf driver version (#15138)
Changes in MRVL_PRESTERA_DRIVER_1.4:
- Memory leak fixed by releasing pci device after retrieval.
- Fixes for 5.10 kernel porting.

Change-Id: I1d7ee4ec02ec17a29ddb8473725ab68ca399748b

Signed-off-by: Pavan Naregundi <pnaregundi@marvell.com>
2023-06-14 10:54:30 -07:00
Lior Avramov
c05d017091
[Mellanox] Remove iproute2 SDK patches from SONiC tree and consume them from SDK github (#15062)
- Why I did it
SDK patches for iproute2 were added to SONiC tree as a temporary solution.
Now that SDK with the patches is available, I have removed the patches from SONiC tree and we consume them from SDK github during compilation.

- How I did it
During build we download SDK iproute2 patches from SDK github (or from the URL provided by user if compiling SDK from sources) and apply them before compilation.

- How to verify it
Compile and load on switch, verify interfaces network devices created successfully.
Verify LLDP shows connections to neighbors.
Verify ping between 2 hosts over 2 router ports is successful.
2023-06-13 15:17:52 +03:00
Stephen Sun
238e6ffcc1
[Mellanox] Adjust warning threshold implementation according to the latest algorithm update (#15092)
- Why I did it
Adjust the warning threshold implementation according to the latest algorithm update

- How I did it
Modify power warning and critical thresholds methods

- How to verify it
Unit test updated to cover the change

Signed-off-by: Stephen Sun <stephens@nvidia.com>
2023-06-13 15:14:10 +03:00
Vivek
9d8ab1b8e4
[Mellanox] Added patchwork link to commit message (#15301)
- Why I did it
Add the patchwork link to the commit description for non-upstream patches if present

- How I did it
Parse the patchwork/<patch_name>.txt file from hw-mgmt
2023-06-08 18:51:58 +03:00
Aravind Mani
b26445cf7b
Dell FPGA driver fix (#15144)
Why I did it
FPGA driver crash was observed in Dell FPGA based platforms.

How I did it
Fixed FPGA crash

How to verify it
Load FPGA driver and check whether the kernel crashes.
2023-06-05 11:01:46 -07:00
Pavan-Nokia
70d637d904
[marvell-arm64] Update platform.conf (#15163)
Update platform.conf to have a successful marvell-arm64 target image.
2023-06-01 08:49:01 -07:00
Pavan-Nokia
59fc16fe20
[arm64] Fix marvell-arm64 pipeline build (#15228)
Why I did it
When git clone -b xxx command is used the versions-git will reset the HEAD of the git to the commit ID in the versions-git file. Which causes incorrect commit to be checked out causing build errors.

Work item tracking
Microsoft ADO (number only):
How I did it
Split ‘git clone -b’ into two steps to avoid owerwrite

Git clone
cd mrvl-prestera; git checkout ; cd ..
How to verify it
Build marvell-arm64 target using below instructions
make init
make configure PLATFORM=marvell-arm64 PLATFORM_ARCH=arm64
make target/sonic-marvell-arm64.bin SONIC_BUILD_JOBS=2
2023-05-31 16:41:16 +08:00
Kebo Liu
5bb3326d2b
[Mellanox] Update hw-mgmt to 7.0020.4301 (#15260)
- Why I did it
Bug fix:

- * I2C bus is stuck - Unable to probe I2C bus 2-0048, which causes /var/run/hw-management/config/sfp_counter, module_counter to be zero and pmon docker unable to start.

- How I did it
Update HW-MGMT package version in the make file
Update HW-MGMT submodule pointer

-How to verify it
Run full sonic-mgmt regression

Signed-off-by: Kebo Liu <kebol@nvidia.com>
2023-05-31 10:33:08 +03:00
Vivek
6852fcdc24
[Mellanox] Facilitate automatic integration of sdk kernel patches (#14652)
#### Why I did it

Facilitate Automatic integration of sdk kernel patches into SONiC. 

**Inputs to the Script:**
1) `MLNX_SDK_VERSION` Eg: `4.5.4206`
2) `MLNX_SDK_ISSU_VERSION` Eg: `101` 
 **Note: If nothing is provided the one already present in the sdk.mk file is used**
3) `MLNX_SDK_SOURCE_BASE_URL:` 
 **Note: If nothing is provided the upstream sdk drivers url is used**
4) `CREATE_BRANCH: (y|n)` Creates a branch instead of a commit (optional, default: n) 
5) `BRANCH_SONIC`:  Only relevant when CREATE_BRANCH is y. `Default: master`. 

Note: These should be provided through `SONIC_OVERRIDE_BUILD_VARS ` parameter

**Output:**
1) Script creates a commit in sonic-linux-kernel with any updates to sdk-kernel patches in sonic in accordance with the version provided by  `MLNX_SDK_VERSION`

**Note: Script Doesn't commit anything to linux-kernel when there aren't any changes required..**  

#### How I did it

1) Added a new make target which can be invoked by calling `make integrate-mlnx-sdk`

```
user@server:/sonic-buildimage/src/sonic-linux-kernel$ git rev-parse --abbrev-ref HEAD
master_6f38dca_integrate_4.5.4206

user@server:/sonic-buildimage/src/sonic-linux-kernel$ git log --oneline -n 1
d64d1e7 (HEAD -> master_6f38dca_integrate_4.5.4206) Intgerate MLNX SDK 4.5.4206 Kernel Patches
```

Changes made will be summarized under `sonic-buildimage/integrate-mlnx-sdk_user.out` file. Debugging and troubleshooting output is written to `sonic-buildimage/integrate-mlnx-sdk.log` files

[log_files.zip](https://github.com/sonic-net/sonic-buildimage/files/11226441/log_files.zip)


#### Limitations:
1) Assumes that the sdk kernel patches are always upstreamed

#### How to verify it

Build the Kernel and test
2023-05-29 22:24:06 -07:00
Oleksandr Ivantsiv
f3ce9ebda8
[Mellanox] Update SAI to v2305.24.0.1 (#15208)
Why I did it
Align with SAI headers v1.12.0

Work item tracking
Microsoft ADO (number only):
How I did it
Update Mellanox SAI submodule

How to verify it
Compile SONiC image
2023-05-26 17:53:17 +08:00
Vivek
d3f2d06117
[Mellanox] Add Copyright Headers for missing files (#15136)
Added NVIDIA copyright to missing files under platform/mellanox & device/mellanox
Signed-off-by: Vivek Reddy Karri <vkarri@nvidia.com>
2023-05-25 07:55:44 +03:00
Junchao-Mellanox
18cf719d6a
[Mellanox] Use sysfs for sfp reset/LPM/presence (#14130)
- Why I did it
The current implementation of SFP reset, LPM, present relies on SDK API. This PR moves the implementation to SDK sysfs. By this PR, it gains following benefit:
1. SDK sysfs provides better performance.
2. Host side and container side share the same code.
3. Code is much cleaner.

- How I did it
Use SDK sysfs to implement SFP reset, LPM, present.

- How to verify it
1. Manual test.
2. Unit test.
2023-05-24 17:24:34 +03:00
Kebo Liu
3e9437b63e
[Mellanox] Update SAI to 2211.24.0.21 and SDK/FW to 4.5.5142/2010_5144 (#15072)
SDK/FW Fixed Issues:
• When a system has more than 256 ACL entries, on rare occasion, removing/adding entries may cause some ACL entries not to work.
• When using mirror session policer on spectrum-2, spectrum-3, the actual CIR was 1.28 times more than the configured CIR value
• After warm boot process, when enabling ECN marking and the port is in split mode, traffic sent to the port under congestion (for example, when connecting two ports with a total speed of 50GbE to a single 25GbE port) is not marked.
• Warm boot might fail if the key value SAI_KEY_ACCUMULATED_FLOW_COUNTER_UNITS_IN_KB is set
• If counters are bound to an next hop group, there is a probability the next API calls that modify the next-hop group members will fail.
• In Spectrum platforms Fastboot mode is not operational for Split port with Force mode in 50G speed
• When fine grain next hop group has a size of 2K or 4K members, and group is removed, FW will remove only (size % 2048) members, resulting in leakage of KVD resources
• When reading some port statistics, or bulk reading some Queue or PG statistics, and in parallel reading or writing other counters, FW may, in rare cases, get stuck
• SN2201 Module 1 is considered to be present/linked while no cable/module is plugged
• On Spectrum-3 when port configure to 400G FW might stuck after running mlxlink while 400G interface connected and swap between upper and lower 4 lanes

SAI New features:
• ACL: Added support for an ACL match on the AETH field (SAI_ACL_TABLE_ATTR_FIELD_AETH_SYNDROME, SAI_ACL_ENTRY_ATTR_FIELD_AETH_SYNDROME) to count RoCE NAK and CNP packets.
• PLL Status: Added a new logging entry that alerts the user upon a PLL lock loss event.
• Dual ToR - Additional MAC Address: Added support for setting a MAC address for the router interface which is not part of the 10 bit MAC address available for RIFs on Spectrum-1, as part of the Dual ToR scenario.
• Dual ToR: DSCP Remapping Added support for tunnel QoS maps as part of the Dual TOR scenario.

SAI Fixed issues:
• When setting a WRED profile attribute for a color that was not enabled during the profile create time, an error would be returned. After the fix, a default profile is create on such scenario and the set attribute is applied on top of it
• When calling the flush FDB by using the SAI_FDB_FLUSH_ATTR_BRIDGE_PORT_ID attribute, the bridge bv_id value was filled on the notification callback where it should have been left empty.

Signed-off-by: Kebo Liu <kebol@nvidia.com>
2023-05-24 17:20:33 +03:00
Pavan-Nokia
c5d0507224
[arm64][Nokia-7215-A1]Add support for Nokia-7215-A1 platform (#13795)
Add new Nokia build target and establish an arm64 build:

    Platform: arm64-nokia_ixs7215_52xb-r0
    HwSKU: Nokia-7215-A1
    ASIC: marvell
    Port Config: 48x1G + 4x10G

How I did it

- Change make files for saiserver and syncd to use Bulleseye kernel
- Change Marvell SAI version to 1.11.0-1
- Add Prestera make files to build kernel, Flattened Device Tree blob and ramdisk for arm64 platforms
- Provide device and platform related files for new platform support (arm64-nokia_ixs7215_52xb-r0).
2023-05-18 14:24:05 -07:00
FuzailBrcm
37eddd479d
[pddf]: Adding S3IP supported attribute for FAN in PDDF (#15075)
The S3IP (Simplified Switch System INtegration Program) sysfs specification defines a unified interface to access peripheral hardware on devices from different vendors, making it easier for SONiC to support different devices and platforms.

PDDF is a framework to simplify the driver and SONiC platform APIs development for new platforms. This effort is first step in combining the two frameworks.

This specific PR adds S3IP supported sysfs attribute in common FAN driver of PDDF.
2023-05-18 14:06:46 -07:00
FuzailBrcm
d6768b3259
[pddf]: Adding S3IP supported attribute for LEDs in PDDF (#15074)
The S3IP (Simplified Switch System INtegration Program) sysfs specification defines a unified interface to access peripheral hardware on devices from different vendors, making it easier for SONiC to support different devices and platforms.

PDDF is a framework to simplify the driver and SONiC platform APIs development for new platforms. This effort is first step in combining the two frameworks.

This specific PR adds the S3IP supported sysfs attributes in PDDF common LED driver.
2023-05-18 14:06:19 -07:00
FuzailBrcm
771a1170d8
[pddf]: Adding and enabling S3IP support in PDDF (#15073)
Why I did it
The S3IP (Simplified Switch System INtegration Program) sysfs specification defines a unified interface to access peripheral hardware on devices from different vendors, making it easier for SONiC to support different devices and platforms.

PDDF is a framework to simplify the driver and SONiC platform APIs development for new platforms. This effort is first step in combining the two frameworks.

This specific PR adds support for pddf-s3ip-init.service and enables it in PDDF.
2023-05-18 13:13:16 -07:00
Song Yuan
21bcaab280
Install ptf afpacket module required by ptf_nn_agent. (#14503)
Why I did it
ptf_nn_agent failed to start in dnx rpc syncd because module afpacket was not installed.
Please see issue sonic-net/sonic-mgmt#7822

How I did it
Add downloading ptf afpacket module in docker file.

How to verify it
Verified that ptf_nn_agent was started successfully in dnx rpc syncd with the change.
2023-05-17 11:34:43 -07:00
Rajkumar-Marvell
f2ff2bfc3c
[Marvell] Update armhf sai debian (#15096)
- SAI-1.11.0 support
- SONIC 20220531.25 OC Failure: Everflow testcases failing due to SAI orchagent crash
- SONIC 20220531.25 OC Failure: ACL IPv6 testcases.
- TPID support

Signed-off-by: rajkumar38 <rpennadamram@marvell.com>
2023-05-16 23:03:15 -07:00
daxia16
1175143af1
[Mellanox] Support UID LED in platform API (#11592)
- Why I did it
As a LED indicator to help user to find switch location in the lab, UID LED is a useful LED in Mellanox switch.

- How I did it
I add a new member _led_uid in Mellanox/Chassis.py, and extend Mellanox/led.py to support blue color.
Relevant platform-common PR sonic-net/sonic-platform-common#369

- How to verify it
Add unit test cases in test.py, and do manual test including turn-on/off/show uid led.

Signed-off-by: David Xia <daxia@nvidia.com>
2023-05-16 08:24:39 +03:00
andywongarista
dad61f3d81
[Arista] Update platform library submodules (#15049)
Fix lpmode on 7060DX5-32
Fix psu led issue on 7060DX5-64
Use sonic_xcvr lpmode if platform does not support hw lpmode
Add chassis cooling algorithm
Change cooling algorithm default interval to 10s
Force filesystem sync on linecard reboot
2023-05-15 15:42:29 -07:00
Junchao-Mellanox
7962a5c0fa
[Mellanox] add PSU fan direction support (#14508)
- Why I did it
Add PSU fan direction support

- How I did it
Implement fan.get_direction for PSU fan

- How to verify it
Manual test
Unit test
2023-05-15 21:34:54 +03:00
Vivek
bc58c12ed8
[Mellanox] Add patch commit-id mapping to description (#15052)
- Why I did it
Add the commit-id patch map in the commit message.

- How I did it
By parsing the patch DB from hw-mgmt

Signed-off-by: Vivek Reddy Karri <vkarri@nvidia.com>
2023-05-15 13:47:24 +03:00
Junchao-Mellanox
9deca05f9d
[Mellanox] get LED capability from capability file (#14584)
- Why I did it
Currently, LED sysfs path is hardcoded. We will need change LED code if new LED color is supported for new platforms. This PR is aimed to improve this. By this PR, LED sysfs path is deduced from LED capability file.

- How I did it
Improve LED management on Nvidia platform:
get LED capability from capability file and deduce sysfs name according to the capability

- How to verify it
Unit test
Manual test
2023-05-10 20:53:50 +03:00
Yakiv Huryk
fa02411750
[Mellanox][asan] disable fast_unwind_on_malloc for mlnx syncd (#14858)
- Why I did it
To improve ASAN backtrace output when the call stack contains a code that is not compiled with -fno-omit-frame-pointer.

- How I did it
Added fast_unwind_on_malloc=0 to the ASAN_OPTIONS

- How to verify it
Build and test docker-syncd-mlnx.gz with ENABLE_ASAN=y

Signed-off-by: Yakiv Huryk <yhuryk@nvidia.com>
2023-05-10 20:50:42 +03:00
Jon Goldberg
0692e8aa43
[armhf][Nokia-7215] changes fstrim.timer to daily (#14723)
Using timer-override.conf, we modify the fstrim.timer service.

For armhf, Nokia-7215 platform, we modify fstrim.timer to run daily
instead of weekly.  This is required because the size of the SSD on
this platform is 16GB, which on average is nearly 10 times smaller than
most other sonic platforms.  With smaller disk and the ever increasing
level of logging done by sonic, this change is required to prevent
the SSD from entering a read-only state due to inadequate free blocks.
2023-05-03 10:26:41 -07:00
Samuel Angebault
205e60ea9e
[Arista] Update platform library submodules (#14827)
- Fix watchdog reboot cause for wolverine linecard
- Fix PSU fan speed of 0% by adding max RPM to most psu descriptions
- Add product DCS-7060DX5-64
- Add product DCS-7060DX5-32
2023-05-03 10:19:38 -07:00
Lior Avramov
97cdb6af5c
[Mellanox] Add copyright header to ECMP calculator files (#14825)
- Why I did it
Add NVIDIA Copyright header to NVIDIA files added lately

- How I did it
Add NVIDIA Copyright header for the relevant files

- How to verify it
N/A (only commented text was added).
2023-05-02 10:35:16 +03:00
Andrew Sapronov
59178e3636
[devices]: Netberg Aurora 610 reduce kernel module output (#13704)
Normally doesn't need to measure i2c calls.
Also switched to use timespec64_sub() to ensure time delta normalized

Co-authored-by: Kostiantyn Yarovyi <kostiantynx.yarovyi@intel.com>
2023-05-01 10:48:08 -07:00
Lior Avramov
2922f26b6c
[Mellanox] Replace iproute2 supplied by SDK to iproute2 downloaded from Debian repository (#14726)
- Why I did it
Mellanox syncd container will be based on Debian iproute2 plus patches instead of Nvidia internal version of iproute2

- How I did it
Download iproute2 from Debian repository, apply patches and compile to create a new target.
The target is then deployed in syncd container of Mellanox switches only.
The new target is called IPROUTE2_MLNX.

- How to verify it
Compile and load on switch, verify interfaces network devices created successfully.
Verify LLDP shows connections to neighbors.
Verify ping between 2 hosts over 2 router ports is successful.
2023-04-30 12:30:09 +03:00
Marty Y. Lok
a68b4ef149
[Nokia7250][sonic-platform] Update sonic-platform submodule for Nokia-7150IXRE platform (#14548)
Why I did it

Update sonic-platform submodule for Nokia-7250IXRE Platform. This requires the new NDK 22.9.8 and above

How I did it
Update submodule sonic-platform for Nokia-7250IXRE platform.
c9f316e Disparate process and thread-safe protection for MDIPC transport, and refactored presence logic to better align with SfpStateUpdateTask operation
a3486cc Added _get_module_bulk_info() and cache the info for 5 seconds to optimize the chassisd update.
4b2e729 Fixed the nokia_cmd show qfpga help display
7b87049 Fixed the nokia_cmd show midplane helper dispaly.
83eabea Add "nokia_cmd set ndk-monitor-action" and "nokia_cmd set ndk-log-level" commands
8aad7de Add nokia_cmd show ndk-version
d2c55e3 Modify the psu.py and module.py to optimize the psud running time


Signed-off-by: mlok <marty.lok@nokia.com>
2023-04-27 08:52:22 -07:00
Vivek
1b63543e7f
[Mellanox] Fix the hw-mgmt intg tool case sensitivity for KConfig (#14709)
Fix the script to consider case sensitivity while writing the kconfig

Signed-off-by: Vivek Reddy Karri <vkarri@nvidia.com>
2023-04-25 09:17:02 +03:00
Stepan Blyshchak
04099f075d
[BGP] support BGP pending FIB suppression (#12853)
Signed-off-by: Stepan Blyschak stepanb@nvidia.com

DEPENDS: #12852

Why I did it
To support BGP pending FIB suppression.

How I did it
I backported patches from FRR 8.4 feature that allows communicating ASIC route status back to FRR.
Also, added a new field in DEVICE_METADATA YANG model table. Added UT for YANG model changes.

How to verify it
Run on the switch.
2023-04-20 19:56:13 +08:00
Ravi [Marvell]
fa48caf39d
Add debug shell packages for Marvell Innovium platforms (#11845)
- Why I did it
Package Marvell/Innovium CLI shell.

- How I did it
Include shell packages.

- How to verify it
Platform specific shell commands.

Signed-off-by: rck-innovium rck@innovium.com
2023-04-13 22:04:36 +03:00
Vivek
397908aa59
[Mellanox] Facilitate automatic integration of new hw-mgmt (#14594)
- Why I did it
Facilitate Automatic integration of new hw-mgmt version into SONiC.

Inputs to the Script:

MLNX_HW_MANAGEMENT_VERSION Eg: 7.0040.5202
CREATE_BRANCH: (y|n) Creates a branch instead of a commit (optional, default: n)
BRANCH_SONIC: Only relevant when CREATE_BRANCH is y. Default: master.
Note: These should be provided through SONIC_OVERRIDE_BUILD_VARS  parameter

Output:

Script creates a commit (in each of sonic-buildimage, sonic-linux-kernel) with all the changes required for upgrading the hw-management version to a version provided by MLNX_HW_MANAGEMENT_VERSION
Brief Summary of the changes made:

MLNX_HW_MANAGEMENT_VERSION flag in the hw-management.mk file
hw-mgmt submodule is updated to the corresponding version
Updates are made to non-upstream-patches/patches and series.patch file
series, kconfig-inclusion and kconfig-exclusion files can be updated in the sonic-linux-kernel repo
sonic-linux-kernel/patches folder is updated with the corresponding upstream patches
Based on the inputs, there could be a branch seen in the local for each of the repo's. Branch is named as <branch>_<parent_commit>_integrate_<hw_mgmt_version>

- How I did it
Added a new make target which can be invoked by calling make integrate-mlnx-hw-mgmt
user@server:/sonic-buildimage$ git rev-parse --abbrev-ref HEAD
master_23193446a_integrate_7.0020.5052
user@server:/sonic-buildimage$ git log --oneline -n 2
f66e01867 (HEAD -> master_23193446a_integrate_V.7.0020.5052, show) Intgerate HW-MGMT V.7.0020.5052 Changes
23193446a (master_intg_hw_mgmt) Update logic

user@server:/sonic-buildimage/src/sonic-linux-kernel$ git rev-parse --abbrev-ref HEAD
master_6847319_integrate_7.0020.4104
user@server:/sonic-buildimage/src/sonic-linux-kernel$ git log --oneline -n 2
6094f71 (HEAD -> master_6847319_integrate_V.7.0020.5052) Intgerate HW-MGMT V.7.0020.5052 Changes
6847319 (origin/master, origin/HEAD) Read ID register for optoe1 to find pageable bit in optoe driver  (#308)
Changes made will be summarized under sonic-buildimage/integrate-mlnx-hw-mgmt_user.out file. Debugging and troubleshooting output is written to sonic-buildimage/integrate-mlnx-hw-mgmt.log files

User output file & stdout file:

log_files.tar.gz

Limitations:
Assumes the changes would only work for amd64
Assumes the non-upstream patches in mellanox only belong to hw-mgmt

- How to verify it
Build the Kernel

Signed-off-by: Vivek Reddy Karri <vkarri@nvidia.com>
2023-04-13 14:18:09 +03:00
xumia
f1fd42558a
Support to add SONiC OS Version in device info (#14601)
Why I did it
Support to add SONiC OS Version in device info.
It will be used to display the version info in the SONiC command "show version". The version is used to do the FIPS certification. We do not do the FIPS certification on a specific release, but on the SONiC OS Version.

SONiC Software Version: SONiC.master-13812.218661-7d94c0c28
SONiC OS Version: 11
Distribution: Debian 11.6
Kernel: 5.10.0-18-2-amd64
How I did it
2023-04-12 09:20:08 +08:00
Vivek
0df155b014
Made non-upstream patch design order aware (#14434)
- Why I did it

Currently, non upstream patches are applied only after upstream patches.

Depends on sonic-net/sonic-linux-kernel#313. Can be merged in any order, preferably together

- What I did it

Non upstream Patches that reside in the sonic repo will not be saved in a tar file bur rather in a folder pointed out by EXTERNAL_KERNEL_PATCH_LOC. This is to make changes to the non upstream patches easily traceable.
The build variable name is also updated to INCLUDE_EXTERNAL_PATCHES
Files/folders expected under EXTERNAL_KERNEL_PATCH_LOC
EXTERNAL_KERNEL_PATCH_LOC/
       ├──── patches/
             ├── 0001-xxxxx.patch
             ├── 0001-yyyyyyyy.patch
             ├── .............
       ├──── series.patch
series.patch should contain a diff that is applied on the sonic-linux-kernel/patch/series file. The diff should include all the non-upstream patches.
How to verify it

Build the Kernel and verified if all the patches are applied properly

Signed-off-by: Vivek Reddy Karri <vkarri@nvidia.com>
2023-04-10 19:48:27 +03:00
snider-nokia
6f54251375
[armhf][Nokia-7215]Add SFP refactor support for Nokia-7215 platform (#14396) 2023-04-06 08:04:45 -07:00
Hua Liu
e17e4fc4c0
[S6100] Improve S6100 serial-getty monitor, wait and re-check when getty not running to avoid false alert. (#14402)
[S6100] Improve S6100 serial-getty monitor, wait and re-check when getty not running to avoid false alert. 

#### Why I did it
On S6100, the serial-getty service some time can't auto-restart by systemd. So there is a monit unit to check serial-getty service status and restart it.

However, this monit will report false alert, because in most case when serial-getty not running, systemd can restart it successfully.

To avoid the false alert, improve the monitor to wait and re-check.

Steps to reproduce this issue:
1. User login to device via console, and keep the connection.
2. User login to device via SSH, check the serial-getty@ttyS1.service service, it's running.
3. Run 'monit reload' from SSH connection.
4. Check syslog 1 minutes later, there will be false alert: ' 'serial-getty' process is not running'

#### How I did it
Add check-getty.sh script to recheck again later when getty service not running.
And update monit unit to check serial-getty service status with this script to avoid false alert.

#### How to verify it
Pass all UT.
Manually check fixed code work correctly:


```
admin@***:~$ sudo systemctl stop  serial-getty@ttyS1.service
admin@***:~$ sudo /usr/local/bin/check-getty.sh 
admin@***:~$ echo $?
1
admin@***:~$ sudo systemctl status serial-getty@ttyS1.serviceserial-getty@ttyS1.service - Serial Getty on ttyS1
     Loaded: loaded (/lib/systemd/system/serial-getty@.service; enabled-runtime; vendor preset: enabled)
     Active: inactive (dead) since Tue 2023-03-28 07:15:21 UTC; 1min 13s ago

admin@***:~$ sudo /usr/local/bin/check-getty.sh 
admin@***:~$ echo $?
0
admin@***:~$ sudo systemctl status serial-getty@ttyS1.serviceserial-getty@ttyS1.service - Serial Getty on ttyS1
     Loaded: loaded (/lib/systemd/system/serial-getty@.service; enabled-runtime; vendor preset: enabled)
```

syslog:
```
Mar 28 07:10:37.597458 *** INFO systemd[1]: serial-getty@ttyS1.service: Succeeded.
Mar 28 07:12:43.010550 *** ERR monit[593]: 'serial-getty' status failed (1) -- no output
Mar 28 07:12:43.010744 *** INFO monit[593]: 'serial-getty' trying to restart
Mar 28 07:12:43.010846 *** INFO monit[593]: 'serial-getty' stop: '/bin/systemctl stop serial-getty@ttyS1.service'
Mar 28 07:12:43.132172 *** INFO monit[593]: 'serial-getty' start: '/bin/systemctl start serial-getty@ttyS1.service'
Mar 28 07:13:43.286276 *** INFO monit[593]: 'serial-getty' status succeeded (0) -- no output
```

#### Description for the changelog
[S6100] Improve S6100 serial-getty monitor.

#### Ensure to add label/tag for the feature raised. example - PR#2174 under sonic-utilities repo. where, Generic Config and Update feature has been labelled as GCU.
2023-04-05 21:34:31 -07:00
Santhosh Kumar T
c4435e833b
[DellEMC] S6100 - Adding logger to fetch SSD FW Upgrade status (#14247)
Adding logger to fetch SSD FW Upgrade status
2023-04-04 10:19:47 -07:00
Marty Y. Lok
54d6ea7c63
[marvell-armhf][uboot-setting] Fix the print menu for marvell-armhf print menu on Nokia-7215 (#13933)
Why I did it
After sonic-install install a new image, print_menu is set echo without any data. No image info between Hit any key to stop autoboot:  0 and  Start USB

Board configuration detected:
Net:   
|  port  | Interface | PHY address  |
|--------|-----------|--------------|
No ethernet found.
Hit any key to stop autoboot:  0 

(Re)start USB...
USB0:   Port (usbActive) : 0    Interface (usbType = 2) : USB EHCI 1.00
scanning bus 0 for devices... 3 USB Device(s) found
       scanning usb for storage devices... 0 Storage Device(s) found
How I did it
The fw_setenv print_menu is missing the double quotes. That causes the value is truncated. Using double quotes to in the environment setting.

How to verify it
Install new image with this fix. And reboot the system. The following section should be shown:

Signed-off-by: mlok <marty.lok@nokia.com>
2023-03-30 11:53:07 -07:00
andywongarista
896b292589
[Arista] Update platform library submodules (#14450)
implement chassis platform API reboot
fix rpc powercycle on linecard
fix psu/fan LED logic in arista daemon
remove psu LED for PikeZ
2023-03-30 11:50:40 -07:00
Ravi [Marvell]
78ca0dae2a
Add platform files for Innovium platform (#12653)
Why I did it
Add platform files for critical processes and default qos config for Innovium platforms

How I did it
Added default files for critical processes and qos config

How to verify it
Tested with autorestart/test_container_autorestart.py::test_containers_autorestart

Signed-off-by: rck-innovium rck@innovium.com
2023-03-30 11:33:20 -07:00
xumia
320366ab60
[Build] Fix the installation candidate not found issue when building docker-sonic-vs (#14439)
Why I did it
Fix the installation candidate not found issue when building docker-sonic-vs

How I did it
Need to run the command "apt-get update" to update the mirror indexes before installing the package gnupg

How to verify it
2023-03-28 14:04:33 +08:00
Keshav Gupta
d630b2f91c
[Innovium] Innovium build changes for master branch (#13512)
To Fix innovium build issue

Signed-off-by: Keshav Gupta <keshavg@marvell.com>
2023-03-27 10:29:31 -07:00
Ikki Zhu
105decc4d1
[celestica/e1031]: enable emc2305 fan controller timeout feature (#14401)
Why I did it
There is rare condition, emc2305 hold SMBus and cause SMBus completion wait timed out.

How I did it
Enable EMC2305 SMBus timeout feature, 30ms period of inactivity will reset the interface.

How to verify it
Use 'i2cget -y -f 23 0x4d 0x20 b' to read EMC2305 configuration register and check DIS_TO bit not set.

Signed-off-by: Eric Zhu <erzhu@celestica.com>
2023-03-27 10:14:37 -07:00
Saikrishna Arcot
3bbfaa1ee8
Upgrade docker-sonic-vs and docker-syncd-vs to Bullseye (#13294)
* Upgrade docker-sonic-vs and docker-syncd-vs to Bullseye

Signed-off-by: Saikrishna Arcot <sarcot@microsoft.com>

* iproute2: Force a new version and timestamp to be used for the package

There is an issue with Docker's overlay2 storage driver when not using
native diffs (and thus falling back to naive diff mode), which is the
case in the CI builds. The way the naive diff mode detects changes is by
comparing the file size and comparing the timestamps (specifically, I
believe it's the modification timestamp), and if there's a change there,
then it's considered a change that needs to be recorded as part of that
layer.

The problem is that with the code being added in the patch, the file
size remains the same, and the timestamp of binary files appear to be
the same timestamp as the changelog entry (likely for reproducible build
purposes). The file size remains the same likely due to extra padding
within the file introduced by relro. Because of this, Docker doesn't
detect this file has changed, and doesn't save the new file as part of
this layer.

To work around this, create a new changelog entry (with a new version as
well) with a new timestamp. This will result in the binary files having
a different timestamp, and thus will get saved by Docker as part of that
layer.

Signed-off-by: Saikrishna Arcot <sarcot@microsoft.com>

---------

Signed-off-by: Saikrishna Arcot <sarcot@microsoft.com>
2023-03-19 21:14:27 -07:00
dbarashinvd
06d6dafcf3
[Mellanox] fix for watchdog device not found, adding dependency on hw-management (#14182)
- Why I did it
Sometimes Nvidia watchdog device isn't ready when watchdog-control service is up after first installation from ONIE
need to delay watchdog control service to go up after hw-mgmt which gets devices up and ready

- How I did it
Delay Nvidia watchdog-control service before hw-mgmt has started on Mellanox platform in order to avoid missing or not ready watchdog device.

- How to verify it
verification test of ONIE installation of image in a loop
making sure watchdog service is always up (not failed) after first installation from ONIE
2023-03-15 18:36:20 +02:00
FuzailBrcm
d3e3565a6c
Fix issue: enhancing PDDF common eeprom APIs to use caching (#13835) (#13848)
Why I did it
To enhance pddf_eeprom.py to use caching and fix #13835

How I did it
Utilising the in-built caching mechanism in the base class eeprom_base.py.
Adding a cache file to store the eeprom data.

How to verify it
By running 'decode-syseeprom' or 'show platform syseeprom' commands.
2023-03-14 17:54:58 -07:00
FuzailBrcm
f822373e53
Enabling FPGA device support in PDDF (#13477)
Why I did it
To enable FPGA support in PDDF.

How I did it
Added FPGAI2C and FPGAPCI in the build path for the PDDF debian package
Added the support for FPGA access APIs in the drivers of fan, xcvr, led etc.
Added the FPGA device creation support in PDDF utils and parsers

How to verify it
These changes can be verified on some platform using such FPGAs. For testing purpose, we took Dell S5232f platform and brought it up using PDDF. In doing so, FPGA devices are created using PDDF and optics eeproms were accessed using common FPGA drivers. Below are some of the logs.
2023-03-14 17:53:35 -07:00
Samuel Angebault
8bd6a8891c
[Arista] Update platform library submodules (#14037)
- Add chassis platform API reboot
- Add fwutil hooks for firmware updates
- Fix PikeZ i2c bus identification issue
- Fix testing issue
2023-03-14 09:36:25 -07:00
Dror Prital
35f8101b50
Update SDK/FW to version 4.5.4206/4.5.4204 (#14164)
- Why I did it
To include latest fixes:

Fix traffic loss on all routed traffic when moving from 4.4.3372/XX_2008_3388 to 4.5.4118-012/XX_2010_4120-010. Issue occurred after ISSU process in Spectrum 1 only, When upgrading from older version to a new one. Neighbor entries are overwritten.
Fix When using mirror session policer on SPC2/3, the actual CIR was 1.28 times more than the configured CIR value.
Fix Creation of router interface of type bridge may occasionally fail if create is performed immediately after delete.
Fix False errors during SDK deinitialization may be seen in the syslog

- How I did it
Updated SDK submodule and relevant makefiles with the required versions.

- How to verify it
Build an image and run tests from "sonic-mgmt".
2023-03-14 16:27:38 +02:00
zitingguo-ms
1cd67444e4
Upgrade SAI xgs version to 8.4.0.2 and migrate to DMZ (#14212)
Why I did it
Upgrade SAI XGS version to 8.4.0.2 and migrate to DMZ repo.

How I did it
Update SAI XGS version in sai.mk.

How to verify it
Run the SONiC and SAI test with the SAI pipeline.

Signed-off-by: zitingguo-ms zitingguo@microsoft.com
2023-03-14 14:09:30 +08:00
Junhua Zhai
c4c621c614
[gearbox] use credo sai v0.9.0 (#14149)
Update credo sai package to the latest v0.9.0.
2023-03-08 23:42:10 -08:00
Sudharsan Dhamal Gopalarathnam
8d82a86134
[Mellanox]Fix lpmode set when logical port is larger than 64 (#14138)
- Why I did it
In sfplpm API, the number of logical ports is hardcoded as 64. When a system contains more port than this, the SDK APIs would fail with a syslog as below

Mar 7 03:53:58.105980 r-leopard-58 ERR syncd#SDK: [MGMT_LIB.ERR] Slot [0] Module [0] has logport [0x00010069] in enabled state
Mar 7 03:53:58.105980 r-leopard-58 ERR syncd#SDK: [SDK_MGMT_LIB.ERR] Failed in __sdk_mgmt_phy_module_pwr_attr_set, error: Internal Error
Mar 7 03:53:58.106118 r-leopard-58 ERR pmon#-c: Error occurred when setting power mode for SFP module 0, slot 0, error code 1

- How I did it
Remove the hardcoded value of 64. Obtained the number of logical ports from SDK

- How to verify it
Manual testing
2023-03-09 00:02:55 +02:00
Volodymyr Samotiy
15bee3a2c0
[Mellanox] Update MFT to 4.22.1-15 (#14133)
Signed-off-by: Volodymyr Samotiy <volodymyrs@nvidia.com>
2023-03-08 10:21:58 +02:00
Stepan Blyshchak
f908dfe919
[Mellanox] Place FW binaries under platform directory instead of squashfs (#13837)
Fixes #13568

Upgrade from old image always requires squashfs mount to get the next image FW binary. This can be avoided if we put FW binary under platform directory which is easily accessible after installation:

admin@r-spider-05:~$ ls /host/image-fw-new-loc.0-dirty-20230208.193534/platform/fw-SPC.mfa
/host/image-fw-new-loc.0-dirty-20230208.193534/platform/fw-SPC.mfa
admin@r-spider-05:~$ ls -al /tmp/image-fw-new-loc.0-dirty-20230208.193534-fs/etc/mlnx/fw-SPC.mfa
lrwxrwxrwx 1 root root 66 Feb  8 17:57 /tmp/image-fw-new-loc.0-dirty-20230208.193534-fs/etc/mlnx/fw-SPC.mfa -> /host/image-fw-new-loc.0-dirty-20230208.193534/platform/fw-SPC.mfa

- Why I did it
202211 and above uses different squashfs compression type that 201911 kernel can not handle. Therefore, we avoid mounting squashfs altogether with this change.

- How I did it
Place FW binary under /host/image-/platform/mlnx/, soft links in /etc/mlnx are created to avoid breaking existing scripts/automation.
/etc/mlnx/fw-SPCX.mfa is a soft link always pointing to the FW that should be used in current image
mlnx-fw-upgrade.sh is updated to prefer /host/image-/platform/mlnx location and fallback to /etc/mlnx in squashfs in case new location does not exist. This is necessary to do image downgrade.

- How to verify it
Upgrade from 201911 to master
master to 201911 downgrade
master -> master reboot
ONIE -> master boot (First FW burn)
Which release branch to backport (provide reason below if selected)
2023-03-06 13:36:43 +02:00
Yakiv Huryk
fffeec6ecc
[Mellanox] update sdk/fw build procedure (#14025)
- Why I did it
To optimize Mellanox platform build

- How I did it
sdk debs are now downloaded as Spectrum-SDK-Drivers-SONiC-Bins release
sx kernel is downloaded as zip from Spectrum-SDK-Drivers

- How to verify it
configure/build for Mellanox platform

Signed-off-by: Yakiv Huryk <yhuryk@nvidia.com>
2023-03-02 09:48:47 +02:00
Ikki Zhu
f801b8fb2d
[Seastone] fix dx010 qsfp eeprom data write issue (#13930)
Why I did it
Platform cases test_tx_disable, test_tx_disable_channel, test_power_override failed in dx010.

How I did it
Add i2c access algorithm for CPLD i2c adapters.

How to verify it
Verify it with platform_tests/api/test_sfp.py::TestSfpApi test cases.
2023-03-01 14:35:53 +08:00
RogerX87
33db298d70
[devices]: Update the Wistron platform support in master branch (#12110)
* Update the Wistron platform support in master branch

Signed-off-by: RogerX87 <RogerX87@gmail.com>
2023-02-23 09:08:13 -08:00
DavidZagury
ee1b6b3751
Remove support to Mellanox SPC4 ASIC (#13932)
- Why I did it
FW for Spectrum-4 ASIC not yet available

- How I did it
Remove in Mellanox fw make files to Spectrum-4 ASIC firmware binaries.
Remove from firmware upgrade scripts to be able Spectrum-4 ASIC.

- How to verify it
Run regression test
2023-02-23 08:25:34 +02:00
Marty Y. Lok
cf4a172486
[Nokia][sonic-platform] Update Nokia sonic-platform submodule (#13522)
d768d19 Remove warning msg when a transceiver op takes > 200ms
7451689 Support the module.py in IMM to query the Supervisor card eeprom info

Signed-off-by: mlok <marty.lok@nokia.com>
2023-02-21 11:22:04 -08:00
Junchao-Mellanox
f6d3615bb9
[Mellanox] Check system eeprom existence in a retry manner (#13884)
- Why I did it
On Mellanox platform, system EEPROM is a soft link provided by hw-management. There is chance that config-setup service accessing the EEPROM before hw-management creating it. It causes errors. The PR is aim to fix it.

- How I did it
Waiting EEPROM creation in platform API up to 10 seconds.

- How to verify it
Manual test
2023-02-21 19:40:16 +02:00
Lior Avramov
fd122efb40
[Mellanox] [ECMP calculator] Add support for 4600/4600C/2201 platforms with different interface naming method (#13814)
- Why I did it
Add support for systems 4600/4600C/2201 that are using sonic interface names aligned to 4 instead of 8 (which is the max number of lanes per port).
Improve DB access calls, now we use Python library functions.

- How I did it
Use addition information taken from Config DB in order to create map from SDK logical index to sonic interface name.

- How to verify it
Run ECMP calculator on 4600, 4600C and 2201 platforms.
2023-02-21 08:52:51 +02:00
Junchao-Mellanox
331b97e2aa
[Mellanox] Fix issue: cannot find label port for logical port when logical port number is larger than 64 (#13710)
- Why I did it
sfp_event.py gets a PMPE message when a cable event is available. In PMPE message, there is no label port available. Current sfp_event.py is using sx_api_port_device_get to get 64 logical ports attributes, and find the label port from those 64 attributes. However, if there are more than 64 ports, sfp_event.py might not be able to find the label port and drop the PMPE message.

- How I did it
Don't use hardcoded 64, get logical port number instead.

- How to verify it
Manual test
2023-02-21 08:14:29 +02:00
Junchao-Mellanox
cac95c2b4a
Add PYTHON3_SWSSCOMMON as build time dependency to Mellanox platform API (#13847)
- Why I did it
Add PYTHON3_SWSSCOMMON as build time dependency to Mellanox platform API to avoid issue like:

19:34:11  ImportError while loading conftest '/sonic/platform/mellanox/mlnx-platform-api/tests/conftest.py'.
19:34:11  tests/conftest.py:28: in <module>
19:34:11      from sonic_platform import utils
19:34:11  sonic_platform/__init__.py:18: in <module>
19:34:11      from sonic_platform import *
19:34:11  sonic_platform/platform.py:28: in <module>
19:34:11      raise ImportError(str(e) + "- required module not found")
19:34:11  E   ImportError: No module named 'swsscommon'- required module not found- required module not found
19:34:11  [  FAIL LOG END  ] [ target/python-wheels/bullseye/mlnx_platform_api-1.0-py3-none-any.whl ]
The issue only happens when calling below command:

make target/python-wheels/bullseye/mlnx_platform_api-1.0-py3-none-any.whl 

- How I did it
Add PYTHON3_SWSSCOMMON as build time dependency to Mellanox platform API

- How to verify it
Run build
2023-02-20 15:24:26 +02:00
Stephen Sun
1cd090277f
[Mellanox] Non upstream patches for hw-mgmt V.4.0020.4104 (#13792)
- Why I did it
Add non-upstream kernel patches for the Nvidia platforms
These patches are not yet upstream but needed for new technology.
A flow to upstream them is in progress and once they will be approved they will be moved officially to sonic-linux-kernel.
Till then to include them in the build (not must) the build option INCLUDE_EXTERNAL_PATCH_TAR=y should be included

- How I did it
Zip all the patches in to a tar.gz tarball.

- How to verify it
Manually test

Signed-off-by: Stephen Sun <stephens@nvidia.com>
2023-02-19 09:44:56 +02:00
Samuel Angebault
8437e893b4
[Arista] Update platform library submodules (#13870)
add SEU reporting on chassis
fix fallback logic for Clearlake eeprom identification
fix fan speed reporting for a specific model
move pcie timeout configuration for Upperlake in platform code (deprecates hwsku-init)
2023-02-17 13:51:17 -08:00
Kebo Liu
aee97a69c6
[Mellanox] Enhance MFT make file to download source code from any valid URL (#13801)
- Why I did it
Currently, when building MFT, it can only download the source code from the official download site: http://www.mellanox.com/downloads/MFT/, it's not possible to integrate an internal version that has not been officially released yet.

The intention of this PR is to make it possible to download the source code from any valid link.

- How I did it
Add a new parameter "MLNX_MFT_INTERNAL_SOURCE_BASE_URL", if an URL is given, it will download the source code from the given URL, otherwise, it downloads from the default official site.

- How to verify it
Specify a valid URL in the make file, the MFT debs should be built successfully.

Signed-off-by: Kebo Liu <kebol@nvidia.com>
2023-02-16 09:51:04 +02:00
Stephen Sun
71b5bb6f37
[Mellanox] Support per PSU slope value for PSU power threshold (#13757)
- Why I did it
Support per PSU slope value for PSU power threshold according to hardware team requirement

- How I did it
Pass the PSU number as a parameter when fetching the slope value of PSU.

- How to verify it
Running regression and manual test

Signed-off-by: Stephen Sun <stephens@nvidia.com>
2023-02-14 08:55:28 +02:00
Junchao-Mellanox
389b279ba9
[Mellanox] set select timeout to no more than 1 sec to make sure fast shutdown (#13611)
- Why I did it
Commit sonic-net/sonic-platform-daemons@153ea47 changed SfpStateUpdateTask from Process to Thread. In this commit, it raises an exception in SfpStateUpdateTask to make shutdown flow fast. But it does not work on Nvidia platform as Nvidia platform is passing timeout parameter of get_change_event to select. Linux select function can not be interrupted by a Python exception. There is no such issue on Nvidia platform before that commit. However, in order to comply with the commit and make shutdown flow fast, we decided to change Nvidia platform API implementation.

To fix issue #13591.

- How I did it
The select call in get_change_event should use no more than 1 second as timeout parameter.
Outside the select call, add a while loop to make sure timeout parameter of get_change_event work as expected

- How to verify it
Manual test
2023-02-14 08:26:25 +02:00
Stephen Sun
3112997b5a
[Mellanox] Advance hw-mgmt to v.7.0020.4104 (#13372)
- Why I did it
Advance hw-mgmt service to V.7.0020.4100
Add missing thermal sensors that are supported by hw-mgmt package
Delay system health service before hw-mgmt has started on Mellanox platform in order to avoid reading some sensors before ready.
Depends on sonic-net/sonic-linux-kernel#305

- How I did it
1. Update hw mgmt version
2. Add missing sensors
3. Delay service 

- How to verify it
Regression test.

Signed-off-by: Stephen Sun <stephens@nvidia.com>
2023-02-12 11:23:47 +02:00
Pavan-Nokia
5f26b33bec
[armhf][Nokia-7215]High CPU caused by entropy.py (#13694)
Why I did it
High CPU utilization by entropy.py

How I did it
Remove entropy script as it does not work anymore and is no longer needed for bullseye(202205).
In Buster(202012) the max available poolsize (entropy_avail) for entropy is 4096 and our entropy.py script was based on this value. With the change in kernel to bullseye on 202205 this entropy poolsize was changed to 256 which also causes our script to fail.

This script was initially added to provide SW assistance to improve the system entropy value available early on in the Sonic boot sequence on buster.
On bullseye (Linux kernel 5.10) this is no longer needed as this feature has been improved.

How to verify it
run "top" command to check CPU usage.
2023-02-09 13:29:06 -08:00
Pavan-Nokia
e98927f354
add sfp get error description (#13275)
Why I did it
Command "sudo sfputil show error-status -hw" shows "OK (Not implemented)" in the output.

How I did it
Add a new SFP API get_error_description support in Nokia sonic-platform sfp.py module.

How to verify it
Run the new image and execute command "sudo sfputil show error-status -hw"
2023-02-09 13:17:57 -08:00
FuzailBrcm
0704ff5e6c
[pddf]: Adding support for FPGAPCIe in PDDF (#13476)
Why I did it
Some of the platform vendors use FPGA in the HW design. This FPGA is connected to the CPU via PCIe interface. This FPGA also works as an I2C controller having other devices attached to the I2C channels emanating from it. Adding a common module, a driver and a platform specific algorithm module to be used for such FPGA in PDDF.

How I did it
Added 'pddf_fpgapci_module', 'pddf_fpgapci_driver' and a sample algorithm module for Xilinx device 7021. Kernel modules which takes the platform dependent data from PDDF JSON files and initialises the PCIe FPGA. The sample algorithm module can be used by the ODMs in case the communication algorithms are same for their device. Else, they need to come up with similar algo module.

How to verify it
Any platform having such an FPGA and brought up using PDDF would use these kernel modules. The detail representation of such a device in PDDF JSON file is covered in the HLD.
2023-02-06 13:48:31 -08:00
Dmytro Lytvynenko
5ff5e98437
[BFN] Update psu.py to process sigterm signal (#13350)
Why I did it
Sometime, SIGTERM processing by psud takes more then default 10sec (please see stopwaitsecs in http://supervisord.org/configuration.html).

Due to this, the following two testcases may fail:

test_pmon_psud_stop_and_start_status
test_pmon_psud_term_and_start_status
How I did it
Update PSU plugin to process sigterm signal so that psud runs faster to end last cycle in time

How to verify it
Run SONiC CTs:
test_pmon_psud_stop_and_start_status
test_pmon_psud_term_and_start_status
2023-02-06 09:52:28 -08:00
Stepan Blyshchak
68e1079202
[FRR] Switch to dplane_fpm_nl plugin instead of fpm (#12852)
Why I did it
dplane_fpm_nl is a new FPM implementation in FRR. The old plugin fpm will not have any new features implemented. Usage of the new plugin gives us ability to use BGP suppression feature and next hop groups in the future.

How I did it
Switch to dplane_fpm_nl zebra plugin from old fpm plugin which is not supported anymore
Remove stale patches for old fpm plugin and add similar patches for dplane_fpm_nl

How to verify it
Build and run on the switch.

Signed-off-by: Stepan Blyschak <stepanb@nvidia.com>
2023-02-06 09:38:39 -08:00
guxianghong
35e41687b4
[Centec] Upgrade Centec platform containers(syncd/saiserver/syncd-rpc) to bullseye (#13375)
Why I did it
Upgrade both Centec X86 and ARM64 platform containers(syncd/saiserver/syncd-rpc) to bullseye
Optimize Centec X86 platform makefile, change sdk.mk to sai.mk

How I did it
Modify Makefile and Dockerfile to use bullseye
Change filename form sdk.mk to sai.mk, optimize and modify related files

How to verify it
For Centec X86 platform, compile the code with : a) make configure PLATFORM=centec; b) make all
For Centec ARM64 platform, cmpile the code with: a) make configure PLATFORM=centec-arm64 PLATFORM_ARCH=arm64; b) make all
Verifiy the sonic-centec.bin and sonic-centec-arm64.bin on Centec chip based board.
2023-02-06 09:26:35 -08:00
Sudharsan Dhamal Gopalarathnam
1ff0c0b685
[Mellanox][sai_failure_dump]Added platform specific script to be invoked during SAI failure dump (#13533)
- Why I did it
Added platform specific script to be invoked during SAI failure dump. Added some generic changes to mount /var/log/sai_failure_dump as read write in the syncd docker

- How I did it
Added script in docker-syncd of mellanox and copied it to /usr/bin

- How to verify it
Manual UT and new sonic-mgmt tests
2023-02-05 16:45:49 +02:00
FuzailBrcm
120aa78b07
[pddf]: Modifying the PDDF common platform APIs as per the LED driver changes (#13474)
Why I did it
LED driver changed due to introduction of FPGA support. The PDDF parser and APIs need to be updated. In turn the common platform APIs also require changes.

How I did it
Changed the get/set status LED APIs for PSU, fan and fan_drawer.
Changed the color strings to plain color name. e.g. 'STATUS_LED_COLOR_GREEN' has been changed to 'green'
Added support for LED color get operation via BMC
How to verify it
Verified the new changes on Accton AS7816-64X platform.

root@sonic:/home/admin#
root@sonic:/home/admin# show platform summary
Platform: x86_64-accton_as7816_64x-r0
HwSKU: Accton-AS7816-64X
ASIC: broadcom
ASIC Count: 1
Serial Number: AAA1903AAEV
Model Number: FP3AT7664000A
Hardware Revision: N/A
root@sonic:/home/admin#
root@sonic:/home/admin# show ver |more

SONiC Software Version: SONiC.master.0-dirty-20230111.010655
Distribution: Debian 11.6
Kernel: 5.10.0-18-2-amd64
Build commit: 3176b15ae
Build date: Wed Jan 11 09:12:54 UTC 2023
Built by: fk410167@sonic-lvn-csg-006

Platform: x86_64-accton_as7816_64x-r0
HwSKU: Accton-AS7816-64X
ASIC: broadcom
ASIC Count: 1
Serial Number: AAA1903AAEV
Model Number: FP3AT7664000A
Hardware Revision: N/A
Uptime: 09:24:42 up 4 days, 22:45,  1 user,  load average: 1.97, 1.80, 1.51
Date: Mon 23 Jan 2023 09:24:42

Docker images:
REPOSITORY                    TAG                              IMAGE ID       SI
ZE
docker-orchagent              latest                           63262c7468d7   38
5MB
root@sonic:/home/admin#
root@sonic:/home/admin#
root@sonic:/home/admin# pddf_ledutil getstatusled LOC_LED
off
root@sonic:/home/admin# pddf_ledutil getstatusled DIAG_LED
green
root@sonic:/home/admin#
root@sonic:/home/admin#
root@sonic:/home/admin# pddf_ledutil setstatusled DIAG_LED red
True
root@sonic:/home/admin# pddf_ledutil getstatusled DIAG_LED
red
root@sonic:/home/admin#
root@sonic:/home/admin#
root@sonic:/home/admin#
root@sonic:/home/admin# pddf_ledutil setstatusled DIAG_LED amber
Invalid color
False
root@sonic:/home/admin# pddf_ledutil getstatusled DIAG_LED
red
root@sonic:/home/admin#
root@sonic:/home/admin#
root@sonic:/home/admin# pddf_ledutil setstatusled DIAG_LED green
True
root@sonic:/home/admin# pddf_ledutil getstatusled DIAG_LED
green
root@sonic:/home/admin#
root@sonic:/home/admin#
root@sonic:/home/admin#
root@sonic:/home/admin# pddf_ledutil getstatusled LOC_LED
off
root@sonic:/home/admin# pddf_ledutil setstatusled LOC_LED amber
True
root@sonic:/home/admin# pddf_ledutil getstatusled LOC_LED
amber
root@sonic:/home/admin# pddf_ledutil setstatusled LOC_LED off
True
root@sonic:/home/admin# pddf_ledutil getstatusled LOC_LED
off
root@sonic:/home/admin#
2023-02-02 11:23:30 -08:00
FuzailBrcm
0abc4f0c4a
[pddd]: Adding support for I2CFPGA in PDDF (#13475)
Why I did it
Some of the platform vendors use FPGA in the HW design. This FPGA is connected to the CPU via I2C bus. Adding a common module and a driver to be used for such FPGA in PDDF.

How I did it
Added 'pddf_fpgai2c_module' and 'pddf_fpgai2c_driver' kernel modules which takes the platform dependent data from PDDF JSON files and creates an I2C client for the FPGA.

How to verify it
Any platform having such an FPGA and brought up using PDDF would use these kernel modules. The detail representation of such a device in PDDF JSON file is covered in the HLD.
2023-02-02 11:20:59 -08:00
Junhua Zhai
876b96e5e8
[gearbox] use credo sai v0.8.2 (#13565)
Update credo sai package to the latest v0.8.2, which also has the fix for aristanetworks/sonic#52.
2023-02-01 23:38:17 -08:00
Volodymyr Samotiy
fd8d678927
[Mellanox] Update SDK/FW to 4.5.4150/2010.4150 (#13480)
- Why I did it
To include latest fixes and new functionality

SDK/FW
1. Fixed bug in recovery mechanism in case of I2C error when trying to access the XSFP module.
2. On the NVIDIA Spectrum-2 switch, when receiving a packet with Symbol Errors on ports that are configured to cut-thought mode, a pipeline might get stuck.
3. On the Spectrum-2 and Spectrum-3 switch, if you enable ECN marking and the port is in split mode, traffic sent to the port under congestion (for example, when connecting two ports with a total speed of 50GbE to a single 25GbE port) is not marked.
4. Modifying existing entry/Adding new one when switch is at its maximum capacity (full by maximum allowed entries from any type such as routes, FDB, and so forth), will fail with an error.
5. When many ports are active (e.g., 70 ports up), and the configuration of shared buffer is applied on the fly, occasionally, the firmware might get stuck.
6. When a system has more than 256 ACL rules, on rare occasion, removing/adding rules may cause some ACL rules not to work.
7. On SN2201 system, on RJ45 port, the link might appear in 'down' state even if it operations properly.
8. Layer 4 port information is not initialized for BFD packet event. To address the issue, remote peer UDP port information was added in BFD packet event.
9. When setting LAG as a SPAN analyzer, the distributor mode of the LAG members was not taken into account. It may happen that the LAG member with distributor mode disabled will be set as a SPAN analyzer port.

- How I did it
Updated SDK/SAI submodule and relevant makefiles with the required versions.

- How to verify it
Build an image and run tests from "sonic-mgmt".

Signed-off-by: Volodymyr Samotiy <volodymyrs@nvidia.com>
2023-01-26 12:41:22 +02:00
DavidZagury
4cc84c68dc
[Mellanox] Improve FW upgrade logging (#13465)
- Why I did it
To improve ASIC FW upgrade logging and have information about the cause of FW update failure in the log.

- How I did it
Added syslog logger support

In case the FW update has failed the update tool will give the cause of the failure in the output in the last line, starting with "Fail".
When running the tool, in case of a failed update, we will parse the output to retrieve the cause and log it.

Device #1:
 ----------
 
 Device Type:      ConnectX6DX
   Part Number:      MCX623106AN-CDA_Ax
   Description:      ConnectX-6 Dx EN adapter card; 100GbE; Dual-port QSFP56; PCIe 4.0/3.0 x16;
   PSID:             MT_0000000359
   PCI Device Name:  /dev/mst/mt4125_pciconf0
   Base GUID:        0c42a103007d22d4
   Base MAC:         0c42a17d22d4
   Versions:         Current        Available     
      FW             22.32.0498     22.32.0498    
      PXE            3.6.0500       3.6.0500      
      UEFI           14.25.0015     14.25.0015    
 
 Status:           Forced update required
 
---------
 Found 1 device(s) requiring firmware update...
 
Device #1: Updating FW ...     
 FSMST_INITIALIZE -   OK          
 Writing Boot image component -   OK          
 Fail : The Digest in the signature is wrong

- How to verify it
mlnx-fw-upgrade.sh --upgrade
2023-01-25 20:53:39 +02:00
Lior Avramov
9a49aec570
[Mellanox] [ECMP calculator] Add script usage and more information to script description in help option (#13493)
Add script usage and more information to script description being printed in help option.

- Why I did it
Missing information in script description in help option.

- How I did it
Expand script description and add script usage.

- How to verify it
Run the script with -h option.
2023-01-25 20:50:38 +02:00
Marty Y. Lok
fd3966a0b8
[Nokia][sonic-platform] Update sonic-platform submodule for Nokia IXR7250E platform (#13437)
Why I did it
Update Nokia sonic-platform submodule

81a9c77  [Supervisor] Modifed the get_description to fix the name for Nokia-IXR7250E-SUP-10 card.
e49ddfb Fix the LedContorlCommon to get the physical index from port mapping
dd143f1 [module] modify the chassis.py and module.py to allow supervisor to retrieve the line card eemprom info
How I did it
Update Nokia sonic-platform submodule

81a9c77  [Supervisor] Modifed the get_description to fix the name for Nokia-IXR7250E-SUP-10 card.
e49ddfb Fix the LedContorlCommon to get the physical index from port mapping
dd143f1 [module] modify the chassis.py and module.py to allow supervisor to retrieve the line card eemprom info
How to verify it
On supervisor, "show chassis module status" should show Nokia-IXR7250E-SUP-10 instead of Nokia-IXR7250-SUP-10

Signed-off-by: mlok <marty.lok@nokia.com>
2023-01-24 11:40:59 -08:00
Dror Prital
940e2cd9bf
[Mellanox] Add ASIC simulation version tag to fw.mk (#13470)
Signed-off-by: dprital <drorp@nvidia.com>
2023-01-23 13:30:02 +02:00
Marty Y. Lok
e1f0d7650e
[Nokia][sonic-platform] Update sonic-platform submodule for Nokia IXR7250E (#13145)
fcb45b5 Add MDIPC channel cleanup code at signal-based termination time and don't precache in get_presence unless required
8984b3d Properly synchronize transceiver module presence globally

Signed-off-by: mlok <marty.lok@nokia.com>

Signed-off-by: mlok <marty.lok@nokia.com>
2023-01-18 15:47:02 -08:00
Samuel Angebault
dfaf379e27
[Arista] Update platform library submodules (#13398)
- add module reboot APIs for chassis
- add supervisor module on linecard (fixes show chassis module midplane-status)
- improve RTC update mechanism and sync every 10 mins
- fix sbtsi temp sensor presence/thresholds
- fix Mineral status leds
- remove thermal object on xcvrs
- misc fixes
2023-01-18 10:03:48 -08:00