SONiC changes:
1. Support Spectrum4 ASIC FW binary building.
2. Support new SDK sx-obj-desc lib building since new SAI need it.
3. Remove SX_SCEW debian package from Mellanox SDK build since we are no longer using it (we use libxml2 instead).
4. Update SAI, SDK, FW to version 4.6.1020/2012.1020/SAIBuild2305.25.0.3
SDK/FW bug fixes
1. In SPC-1 platforms: Fastboot mode is not operational for Split port with Force mode in 50G speed
SFP modules are kept in disabled state after set LPM (low power mode) on/off for at least 3 minutes.
2. When preforming fast boot from an old SDK version (currently installed) to a newer one (target version), and the system was initially loaded with a new SDK version (past version), and the system has not been wiped, under specific conditions, the fast boot would use the past version's data and may fail.
SDK/FW Features
1. On SN2700 all ports can support y cable by credo
SAI bug Fixes
1. When creating an ACL rule with SAI_ACL_ENTRY_ATTR_FIELD_SRC_IP/SAI_ACL_ENTRY_ATTR_FIELD_DST_IP enabled, and then disabling the field by setting enable=false, a match on L3_type=IPv4 will remain programmed for the rule Issue resolved after the fix
2. Allow the max scale of virtual routers to be configure for SPC-1, SPC-2, SPC-3 when fastboot enable
3. Remove default hash key of SRC_MAC, DST_MAC and ETH_TYPE
SAI features
1. Port init profile
- How I did it
Update SDK/FW/SAI make files
- How to verify it
Run full sonic-mgmt regression on Mellanox platform
Signed-off-by: Kebo Liu <kebol@nvidia.com>
- Why I did it
Update Mellanox MFT tool to version 4.25.0-62
- How I did it
Update the MFT tool make file
- How to verify it
Run full sonic-mgmt regression.
Signed-off-by: Kebo Liu <kebol@nvidia.com>
Why I did it
Add PDDF support on following Ufispace platforms with Broadcom ASIC
S9110-32X
S8901-54XC
S7801-54XS
S6301-56ST
How I did it
Add PDDF configuration files, scripts and python files
How to verify it
Run pddf commands and show commands.
Signed-off-by: nonodark <ef67891@yahoo.com.tw>
* Update sairedis submodule
This submodule update needs to be manually done due to build changes
done in the sairedis submodule. Specifically, Debian build profiles are
now being used instead of dpkg build targets, and dbgsym packages are
being used instead of dbg packages. Because of this, there needs to be
changes on the sonic-buildimage side for this.
This is a reland of #15720, which was reverted in #15995 due to the RPC
package build failing. That failure has since been fixed, and the
PR pipeline has been updated to build the RPC package so that this is
checked at the PR stage.
This submodule update brings in the following changes:
```
4dbdb21 Fix RPC package build failure due to shell syntax issue (#1268)
588d596 Make sure new binaries replace existing binaries in docker-sonic-vs (#1269)
ce8f642 [vs] Use boost join to concatenate switch types in config (#1266)
d6055a2 [vslib]: Temporaily map DPU switch type to NVDA_MBF2H536C (#1259)
e1cdb4d [CodeQL]: Use dependencies with relevant versions in azp template. (#1262)
c08f9a2 [CI]: Fix collect log error in azp template. (#1260)
eed856c [CodeQL]: Fix syncd compilation in azp template. (#1261)
a3f1f1a Reland 'Make changes to building and packaging sairedis (#1116)' (#1194)
```
Signed-off-by: Saikrishna Arcot <sarcot@microsoft.com>
* Update sairedis submodule with the fix for the RPC package build
Signed-off-by: Saikrishna Arcot <sarcot@microsoft.com>
---------
Signed-off-by: Saikrishna Arcot <sarcot@microsoft.com>
- Why I did it
Increase UT coverage for Nvidia platform API code
Work item tracking
Microsoft ADO (number only):
- How I did it
Focus on low coverage file:
1. component.py
2. watchdog.py
3. pcie.py
- How to verify it
Run the unit test, the coverage has been changed from 70% to 90%
- Why I did it
Added the fwtrace config files in order to be able to call the mlxstrace utility during the show techsupport dump.
Work item tracking
Microsoft ADO (number only):
- How I did it
Added fwtrace config files. Added path to these files to sai.profile for each mlnx device.
- How to verify it
Execute the show techsupport command and check if mlxstrace output is in system dump.
Signed-off-by: vadymhlushko-mlnx <vadymh@nvidia.com>
This reverts commit e0927e28af.
Why I did it
Reverts #15720
It breaks build for target/debs/bullseye/syncd_1.0.0_amd64.deb
make[2]: Entering directory '/sonic/src/sonic-sairedis'
dh_install
# Note: escape with an extra symbol
if [ -f debian/syncd-rpc/usr/bin/syncd_init_common.sh ] ; then
/bin/sh: 1: Syntax error: end of file unexpected (expecting "fi")
make[2]: *** [debian/rules:65: override_dh_install] Error 2
make[2]: Leaving directory '/sonic/src/sonic-sairedis'
make[1]: *** [debian/rules:51: binary] Error 2
make[1]: Leaving directory '/sonic/src/sonic-sairedis'
dpkg-buildpackage: error: fakeroot debian/rules binary subprocess returned exit status 2
Work item tracking
Microsoft ADO (number only): 24691535
How I did it
How to verify it
Why I did it
Updating the iSMART_64 tool for supporting latest debian releases.
How I did it
On branch new_ismart
Changes to be committed:
(use "git restore --staged ..." to unstage)
modified: platform/broadcom/sonic-platform-modules-dell/s6100/scripts/iSMART_64
How to verify it
In s6100, run the iSMART_64 tool.
md5sum - 24725730d7649769c7ba50971c1f2955
Midstone platform has compilation error in master branch, fixed the same.
How I did it
Due to bullseye migration i2c_new_dummy API is deprecated modified with i2c_new_dummy_device.
How to verify it
Verified target/debs/bullseye/platform-modules-midstone-200i_0.2.2_amd64.deb is generated
Co-authored-by: Kannan Selvaraj <skannan@celestica.com>
Why I did it
[E1031] fix pca9548 initializes failed occasionally in stress test.
When failure happened, ismt i2c bus hang up and need power cycle to
recover it.
How I did it
Add 0.5s delay between setuping and configuring pca9548 i2c mux.
How to verify it
Reboot stress test at least 100 times without failure.
This submodule update needs to be manually done due to build changes
done in the sairedis submodule. Specifically, Debian build profiles are
now being used instead of dpkg build targets, and dbgsym packages are
being used instead of dbg packages. Because of this, there needs to be
changes on the sonic-buildimage side for this.
This submodule update brings in the following changes:
ce8f642 [vs] Use boost join to concatenate switch types in config (#1266)
d6055a2 [vslib]: Temporaily map DPU switch type to NVDA_MBF2H536C (#1259)
e1cdb4d [CodeQL]: Use dependencies with relevant versions in azp template. (#1262)
c08f9a2 [CI]: Fix collect log error in azp template. (#1260)
eed856c [CodeQL]: Fix syncd compilation in azp template. (#1261)
a3f1f1a Reland 'Make changes to building and packaging sairedis (#1116)' (#1194)
Signed-off-by: Saikrishna Arcot <sarcot@microsoft.com>
Why I did it
To fixsonic-net/sonic-mgmt#8786
How I did it
Modified Fan API to check whether the data retrieved is valid or not and return accordingly
How to verify it
Verify whether API 2.0 is loaded properly or not.
Execute CLI's like "show version", "show interface status", "show platform psustatus" etc..
= Why I did it
To optimize Mellanox platform SAI build
- How I did it
SAI debs are now downloaded as Spectrum-SDK-Drivers-SONiC-Bins release.
- How to verify it
Configure/build for Mellanox platform, check the image and ensure that correct SAI debs are included.
- Why I did it
The reset cause "reset_from_comex" has been removed by hw-management, hence removing it from platform API code
- How I did it
Remove reset_from_comex from reboot cause mapping
- How to verify it
Manual test
- Why I did it
BIOS on new generation switch can come with a file type of cap or cab. Needs to add support to these file type.
Also ONIE version on new devices can have a suffix of 'dev'.
- How I did it
Added cap & cab as possible component extensions for ComponentBIOS.
Update the ONIE version regex to include dev signed versions.
- How to verify it
Update BIOS.
Why I did it
port_config.ini and hwsku.json are needed to generate the default config
switch_type needs to be "dpu" to spawn the right set of processes during dvs initialization and to make sure that DASH APIs can be handled properly
Work item tracking
Microsoft ADO 24375371:
How I did it
Use the same hwsku.json and port_config.ini for DPU-2P as the ones used for Nvidia-MBF2H536C SKU in nvidia-sonic sonic-buildimage repo.
Set switch_type to "dpu" in DEVICE_METADATA configuration to make sure DASH specific APIs are handled properly
Signed-off-by: Prabhat Aravind <paravind@microsoft.com>
* [202012][platform/barefoot] (#8543)
Why I did it
Pcied running by python 2.
How I did it
dropped python2 support and add python3 support for pcied in file docker-pmon.supervisord.conf.j2
How to verify it
docker exec pmon supervisorctl status
* [Netberg][nba710] Added initial support for Aurora 710
Signed-off-by: Andrew Sapronov <andrew.sapronov@gmail.com>
---------
Signed-off-by: Andrew Sapronov <andrew.sapronov@gmail.com>
Co-authored-by: Kostiantyn Yarovyi <kostiantynx.yarovyi@intel.com>
Why I did it
To support dynamic swapping of module types/speeds (400G/100G/40G)
To optimize CMIS ZR optics operation
How I did it
Reinitialize xcvr_api at module removal/insertion time, and also optimize cache for ZR optics.
How to verify it
Verify that different (supported) module types can be dynamically swapped (removed/inserted) and that each is properly provisioned by Xcvrd and has its EEPROM information accurately reported in Redis DB (using "show transceiver eeprom") as well as "sfputil show eeprom" direct access.
Also verify that Xcvrd initialization and operation with 400G CMIS ZR optics is both efficient and functional.
** edit 6/14/23: pushed enhanced caching (full memory map) support and elimination of base class APIs override.
Why I did it
Sharing the storage of syncd with other proprietary application extensions allows them to communicate with syncd in differnt ways.
If one container wants to pass some information to syncd then shared storage can be used. However, today the shared storage isn't cleaned on restarts making it possible for syncd to read out-of-date information generated in the past.
NOTE: No plans to use it for standard SONIC dockers and we are working on removing the SDK dependency from PMON docker
How I did it
Implemented new service to clean the shared storage.
How to verify it
Do reboot/fast-reboot/warm-reboot/config-reload/systemctl restart swss and verify /tmp/ is cleaned after each restart in syncd container.
Signed-off-by: Stepan Blyschak <stepanb@nvidia.com>
Define a generic 2-port NPU SKU for docker-sonic-vs to
enable DASH vstests to pass on azure pipelines
Work item tracking
Microsoft ADO 24375371:
How I did it
Define a generic 2-port NPU hwsku that is used only for DASH-specific vstests.
Signed-off-by: Prabhat Aravind <paravind@microsoft.com>
- Why I did it
SDK patches for iproute2 were added to SONiC tree as a temporary solution.
Now that SDK with the patches is available, I have removed the patches from SONiC tree and we consume them from SDK github during compilation.
- How I did it
During build we download SDK iproute2 patches from SDK github (or from the URL provided by user if compiling SDK from sources) and apply them before compilation.
- How to verify it
Compile and load on switch, verify interfaces network devices created successfully.
Verify LLDP shows connections to neighbors.
Verify ping between 2 hosts over 2 router ports is successful.
- Why I did it
Adjust the warning threshold implementation according to the latest algorithm update
- How I did it
Modify power warning and critical thresholds methods
- How to verify it
Unit test updated to cover the change
Signed-off-by: Stephen Sun <stephens@nvidia.com>
- Why I did it
Add the patchwork link to the commit description for non-upstream patches if present
- How I did it
Parse the patchwork/<patch_name>.txt file from hw-mgmt
Why I did it
FPGA driver crash was observed in Dell FPGA based platforms.
How I did it
Fixed FPGA crash
How to verify it
Load FPGA driver and check whether the kernel crashes.
Why I did it
When git clone -b xxx command is used the versions-git will reset the HEAD of the git to the commit ID in the versions-git file. Which causes incorrect commit to be checked out causing build errors.
Work item tracking
Microsoft ADO (number only):
How I did it
Split ‘git clone -b’ into two steps to avoid owerwrite
Git clone
cd mrvl-prestera; git checkout ; cd ..
How to verify it
Build marvell-arm64 target using below instructions
make init
make configure PLATFORM=marvell-arm64 PLATFORM_ARCH=arm64
make target/sonic-marvell-arm64.bin SONIC_BUILD_JOBS=2
- Why I did it
Bug fix:
- * I2C bus is stuck - Unable to probe I2C bus 2-0048, which causes /var/run/hw-management/config/sfp_counter, module_counter to be zero and pmon docker unable to start.
- How I did it
Update HW-MGMT package version in the make file
Update HW-MGMT submodule pointer
-How to verify it
Run full sonic-mgmt regression
Signed-off-by: Kebo Liu <kebol@nvidia.com>
#### Why I did it
Facilitate Automatic integration of sdk kernel patches into SONiC.
**Inputs to the Script:**
1) `MLNX_SDK_VERSION` Eg: `4.5.4206`
2) `MLNX_SDK_ISSU_VERSION` Eg: `101`
**Note: If nothing is provided the one already present in the sdk.mk file is used**
3) `MLNX_SDK_SOURCE_BASE_URL:`
**Note: If nothing is provided the upstream sdk drivers url is used**
4) `CREATE_BRANCH: (y|n)` Creates a branch instead of a commit (optional, default: n)
5) `BRANCH_SONIC`: Only relevant when CREATE_BRANCH is y. `Default: master`.
Note: These should be provided through `SONIC_OVERRIDE_BUILD_VARS ` parameter
**Output:**
1) Script creates a commit in sonic-linux-kernel with any updates to sdk-kernel patches in sonic in accordance with the version provided by `MLNX_SDK_VERSION`
**Note: Script Doesn't commit anything to linux-kernel when there aren't any changes required..**
#### How I did it
1) Added a new make target which can be invoked by calling `make integrate-mlnx-sdk`
```
user@server:/sonic-buildimage/src/sonic-linux-kernel$ git rev-parse --abbrev-ref HEAD
master_6f38dca_integrate_4.5.4206
user@server:/sonic-buildimage/src/sonic-linux-kernel$ git log --oneline -n 1
d64d1e7 (HEAD -> master_6f38dca_integrate_4.5.4206) Intgerate MLNX SDK 4.5.4206 Kernel Patches
```
Changes made will be summarized under `sonic-buildimage/integrate-mlnx-sdk_user.out` file. Debugging and troubleshooting output is written to `sonic-buildimage/integrate-mlnx-sdk.log` files
[log_files.zip](https://github.com/sonic-net/sonic-buildimage/files/11226441/log_files.zip)
#### Limitations:
1) Assumes that the sdk kernel patches are always upstreamed
#### How to verify it
Build the Kernel and test
Why I did it
Align with SAI headers v1.12.0
Work item tracking
Microsoft ADO (number only):
How I did it
Update Mellanox SAI submodule
How to verify it
Compile SONiC image
- Why I did it
The current implementation of SFP reset, LPM, present relies on SDK API. This PR moves the implementation to SDK sysfs. By this PR, it gains following benefit:
1. SDK sysfs provides better performance.
2. Host side and container side share the same code.
3. Code is much cleaner.
- How I did it
Use SDK sysfs to implement SFP reset, LPM, present.
- How to verify it
1. Manual test.
2. Unit test.
SDK/FW Fixed Issues:
• When a system has more than 256 ACL entries, on rare occasion, removing/adding entries may cause some ACL entries not to work.
• When using mirror session policer on spectrum-2, spectrum-3, the actual CIR was 1.28 times more than the configured CIR value
• After warm boot process, when enabling ECN marking and the port is in split mode, traffic sent to the port under congestion (for example, when connecting two ports with a total speed of 50GbE to a single 25GbE port) is not marked.
• Warm boot might fail if the key value SAI_KEY_ACCUMULATED_FLOW_COUNTER_UNITS_IN_KB is set
• If counters are bound to an next hop group, there is a probability the next API calls that modify the next-hop group members will fail.
• In Spectrum platforms Fastboot mode is not operational for Split port with Force mode in 50G speed
• When fine grain next hop group has a size of 2K or 4K members, and group is removed, FW will remove only (size % 2048) members, resulting in leakage of KVD resources
• When reading some port statistics, or bulk reading some Queue or PG statistics, and in parallel reading or writing other counters, FW may, in rare cases, get stuck
• SN2201 Module 1 is considered to be present/linked while no cable/module is plugged
• On Spectrum-3 when port configure to 400G FW might stuck after running mlxlink while 400G interface connected and swap between upper and lower 4 lanes
SAI New features:
• ACL: Added support for an ACL match on the AETH field (SAI_ACL_TABLE_ATTR_FIELD_AETH_SYNDROME, SAI_ACL_ENTRY_ATTR_FIELD_AETH_SYNDROME) to count RoCE NAK and CNP packets.
• PLL Status: Added a new logging entry that alerts the user upon a PLL lock loss event.
• Dual ToR - Additional MAC Address: Added support for setting a MAC address for the router interface which is not part of the 10 bit MAC address available for RIFs on Spectrum-1, as part of the Dual ToR scenario.
• Dual ToR: DSCP Remapping Added support for tunnel QoS maps as part of the Dual TOR scenario.
SAI Fixed issues:
• When setting a WRED profile attribute for a color that was not enabled during the profile create time, an error would be returned. After the fix, a default profile is create on such scenario and the set attribute is applied on top of it
• When calling the flush FDB by using the SAI_FDB_FLUSH_ATTR_BRIDGE_PORT_ID attribute, the bridge bv_id value was filled on the notification callback where it should have been left empty.
Signed-off-by: Kebo Liu <kebol@nvidia.com>