Why I did it
1. Upgrade centec-arm64 platform to Bookworm.
2. Fix the docker-syncd-centec-rpc.gz compilation error on the centec platform.
How I did it
1. Modified platform driver to comply with bookworm kernel.
2. Upgrade SONiC package versions of the centec platform.
How to verify it
1. Compile the centec-arm64 platform to generate sonic-centec-arm64.bin.
2. Compile the centec platform to generate docker-syncd-centec-rpc.gz.
Signed-off-by: centecqianj <qianj@centec.com>
- Why I did it
Enable CMIS host management for Mellanox devices which are expected to support the feature
- How I did it
Added a new thread in a new file and changed the platform logic in chassis.py, which now starts this thread from get_change_event().
The thread in the new file runs a state machine per port.
Static detection takes place once the thread is up (during the switch bootup sequence) and runs until the final decision is made on whether the module is under FW control or SW control.
After it ends, dynamic detection takes place: the thread listens for changes on the per-port sysfs fds
so that it can detect cable plug-in and plug-out events.
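The two detection phases described above could be sketched as a small per-port state machine. This is only an illustrative sketch: the `PortState` names, the `PortStateMachine` class, and the `present`/`sw_control` inputs are hypothetical and are not taken from the actual Mellanox platform code, which derives these values from sysfs.

```python
from enum import Enum


class PortState(Enum):
    # Hypothetical state names, for illustration only.
    INIT = "init"
    NOT_PRESENT = "not_present"
    FW_CONTROL = "fw_control"
    SW_CONTROL = "sw_control"


class PortStateMachine:
    """Tracks one port through static and dynamic detection."""

    def __init__(self, port):
        self.port = port
        self.state = PortState.INIT

    def static_detect(self, present, sw_control):
        """Run once at thread start-up to reach a final FW/SW decision."""
        if not present:
            self.state = PortState.NOT_PRESENT
        elif sw_control:
            self.state = PortState.SW_CONTROL
        else:
            self.state = PortState.FW_CONTROL
        return self.state

    def on_presence_change(self, present, sw_control):
        """Dynamic detection: handle a plug-in or plug-out event.

        Returns True when the state changed, i.e. when an event would be
        worth reporting to the caller of get_change_event().
        """
        previous = self.state
        self.static_detect(present, sw_control)
        return self.state != previous
```

In a real thread the inputs would come from polling the per-port sysfs file descriptors rather than from function arguments.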
- How to verify it
Enhanced unit tests
run sonic mgmt on Nvidia SN4700 with CMIS host management enabled
Signed-off-by: Nazarii Hnydyn <nazariig@nvidia.com>
Closes #17345
This W/A was proposed by Nvidia FRR team before the long term solution is ready.
Why I did it
A W/A to fix default route installation during LAG member flap
Work item tracking
N/A
How I did it
Disabled FRR next hop group support
How to verify it
Do LAG member flap
- Why I did it
Provide a dummy implementation for SFP error description when CMIS host management is enabled. A future feature shall be raised to implement SFP error description for such mode.
- How I did it
If the SFP is under software control, provide "Not supported" as the error description.
If the SFP is under initialization, provide "Initializing" as the error description.
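The dummy behavior above could be sketched roughly as follows. The function name and its boolean parameters are hypothetical, and checking the initialization case first is an assumption, since the description does not say which condition takes precedence.

```python
def get_dummy_error_description(sw_control, initializing):
    """Return a placeholder error description, or None if not applicable.

    Both arguments are hypothetical flags; the real platform code
    determines these conditions itself.
    """
    if initializing:
        # Assumption: initialization is checked before software control.
        return "Initializing"
    if sw_control:
        return "Not supported"
    # None means "fall through to the regular error description path".
    return None
```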
- How to verify it
unit test
This commit adds support for the Pensando ASIC called ELBA. ELBA is used in PCI-based cards and in smartswitches.
#### Why I did it
This commit introduces pensando platform which is based on ELBA ASIC.
##### Work item tracking
- Microsoft ADO **(number only)**:
#### How I did it
Created the platform/pensando folder and makefiles specific to Pensando.
This mainly creates the Pensando docker (which OEMs need to download before building an image), which has all the userspace needed to initialize and use the DPU (ELBA ASIC).
The build process outputs two images, which can be used from ONIE and goldfw.
Using ONIE is recommended.
#### How to verify it
Load the SONiC image via ONIE or goldfw and make sure the interfaces are UP.
##### Description for the changelog
Add pensando platform support.
How I did it
Modified platform driver to comply with bookworm kernel.
Modified python build commands for building whl packages.
How to verify it
Verify whether all the platform bookworm debs are built.
make target/debs/bookworm/platform-modules-v682-48y8c-d_1.0_amd64.deb
Load the platform debian into the device and install it in bookworm image.
Verify the platform related CLI and the functionality
Signed-off-by: centecqianj <qianj@centec.com>
- Why I did it
The current low power mode implementation requires the user to set the port admin down before toggling LP mode, which is not backward compatible. Revert to the old behavior so that the user can toggle LP mode regardless of the port admin status.
- How I did it
Revert the recent changes related to LPM in PR #14130 and #16545
- How to verify it
Run all sfputil and SFP platform API related tests on all the Mellanox platforms.
Signed-off-by: Kebo Liu <kebol@nvidia.com>
Why I did it
[Bookworm] Update platform-modules-dell for Bookworm #16735
How I did it
Modified platform driver to comply with bookworm kernel.
Removed MODULE_SUPPORTED_DEVICE wherever used.
Modified python build commands for building whl packages.
How to verify it
Verify whether all the platform bookworm debs are built.
make target/debs/bookworm/platform-modules-z9100_1.1_amd64.deb
Load the platform debian into the device and install it in bookworm image.
Verify the platform related CLI and the functionality
Why I did it
Update SDK/SAI and FW for Mellanox Platform
How I did it
Update SDK/FW to v4.6.2104/v2012.2104
Fixed Issues:
Some of the Warmboot related files which were created by SDK during switch create are now generated during pre shutdown flow
New Features:
Debian 12 and kernel 6.1 support
Update SAI
New Features:
Auto Fec Support
FDB entries are now restored after warmboot to prevent temporary system flooding.
Minor Enhancement and Bug Fix in integrate-mlnx-sdk
How to verify it
Build Image and run tests
Signed-off-by: Vivek Reddy <vkarri@nvidia.com>
Why I did it
Add platform support for Debian 12 (Bookworm) on Mellanox Platform
How I did it
Update hw-management to v7.0030.2008
Deprecate the sfp_count == module_count approach in favour of asic init completion
Ref: Mellanox/hw-mgmt@bf4f593
Add xxd package to base image which is required by hw-management scripts
Add the non-upstream flag into linux kernel cache options
Update the thermalctl logic based on new sysfs attributes
Fix the integrate-mlnx-hw-mgmt script to not populate the arm64 Kconfig
How to verify it
Build kernel and run platform tests
Signed-off-by: Vivek Reddy <vkarri@nvidia.com>
Co-authored-by: Junchao-Mellanox <junchao@nvidia.com>
Co-authored-by: Junchao-Mellanox <57339448+Junchao-Mellanox@users.noreply.github.com>
Signed-off-by: Vivek Reddy <vkarri@nvidia.com>
[Nvidia] Enable iproute2 & fix mft build (#16)
* Enable iproute2 as the SDK is also built
Signed-off-by: Vivek Reddy <vkarri@nvidia.com>
* [Nvidia] Don't use mkbmdeb method of dkms to build the package
Signed-off-by: Vivek Reddy <vkarri@nvidia.com>
* Added linux image to the Depends section of mft
Signed-off-by: Vivek Reddy <vkarri@nvidia.com>
[Nvidia] [Bookworm] Separate KERNEL_MFT into a new target (#16782)
* [Nvidia] Separate KERNEL_MFT into a new target because of kernel header dependency
Signed-off-by: Vivek Reddy <vkarri@nvidia.com>
* Update linux-kernel submodule
Signed-off-by: Vivek Reddy <vkarri@nvidia.com>
* Fix parallel build problem
Signed-off-by: Vivek Reddy <vkarri@nvidia.com>
---------
Signed-off-by: Vivek Reddy <vkarri@nvidia.com>
* sonic-platform-modules-cel: broadcom: adapt for kernel 6.1 and bookworm
The i2c_driver->remove API declaration has been updated to return void instead
of int, as part of cleanup patches in 6.1 (see [1] for details).
Update the remove API definition in the modules accordingly and
clean up variables that are no longer used in the remove API.
Update python build commands for bookworm. Packaging by calling
setup.py directly is deprecated; using the build module or the pip utility is the
recommended method for python packaging/installation (see [2], [3] for
details). The build module is picky about the package information file,
which needs to be either setup.py or pyproject.toml.
Additionally, fix formatting inconsistencies in debian/changelog reported by
`dh_installchangelogs` during the build.
Tested the changes by compiling the changes as below:
make sonic-slave-bash NOBUSTER=1 NOBULLSEYE=1
sudo dpkg -i target/debs/bookworm/linux-headers-6.1.0-11-2-*.deb
cd platform/broadcom/sonic-platform-modules-cel
KVERSION=6.1.0-11-2-amd64 dpkg-buildpackage
Also verified the python scripts under the sonic-platform-modules-cel with
pyflakes to ensure no new errors are flagged (with exception of unused modules).
References:
[1] - https://github.com/torvalds/linux/commit/ed5c2f5f
[2] - https://blog.ganssle.io/articles/2021/10/setup-py-deprecated.htm
[3] - 0b20a4863 (Update Python build commands for Bookworm, 2023-09-07)
Signed-off-by: Ramasamy Chandramouli <rachandr@celestica.com>
* platform/pddf: i2c: adapt for kernel 6.1 and bookworm
* Fixup i2c_driver->remove API due to changes in the function
prototype (ref: [1]).
* Cleanup `MODULE_SUPPORTED_DEVICE` macros that were cleaned up in
the upstream (ref: [2]).
* Sanitize python packaging and installation using the `build` module
instead of calling the setup.py directly (ref: [3], [4]).
Tested the changes by compiling pddf module as below:
make sonic-slave-bash NOBUSTER=1 NOBULLSEYE=1
sudo dpkg -i target/debs/bookworm/linux-headers-6.1.0-11-2-*.deb
cd platform/pddf/i2c
KVERSION=6.1.0-11-2-amd64 dpkg-buildpackage
References:
[1] - https://github.com/torvalds/linux/commit/ed5c2f5f
[2] - https://github.com/torvalds/linux/commit/6417f031
[3] - https://blog.ganssle.io/articles/2021/10/setup-py-deprecated.htm
[4] - 0b20a4863 (Update Python build commands for Bookworm, 2023-09-07)
Signed-off-by: Ramasamy Chandramouli <rachandr@celestica.com>
* platform/broadcom: include platform-modules-cel in builds
With pddf modules patched for 6.1, platform-modules-cel can be compiled
and included in the final image.
Testing by building sonic-broadcom.bin/sonic-broadcom-dnx.bin.
Signed-off-by: Ramasamy Chandramouli <rachandr@celestica.com>
* pddf/i2c: revert correct rootdir for pip install
The pip install directory was set to test-pkg1/ for testing the build and
was mistakenly left that way. Revert this to the correct path $(PACKAGE_PRE_NAME).
Signed-off-by: Ramasamy Chandramouli <rachandr@celestica.com>
* platform/broadcom: include pddf/modules-cel in the base package
Without this change, the modules were built but not packaged in the final .bin.
The final sonic-broadcom.bin has been tested for bootup on Celestica's
Silverstone platform.
admin@sonic:~$ uname -a
Linux sonic 6.1.0-11-2-amd64 #1 SMP PREEMPT_DYNAMIC Debian 6.1.38-4 (2023-08-08) x86_64 GNU/Linux
admin@sonic:~$ show platform summary
Platform: x86_64-cel_silverstone-r0
HwSKU: Silverstone
ASIC: broadcom
ASIC Count: 1
Serial Number: R4009B2F062504LK200024
Model Number: N/A
Hardware Revision: N/A
admin@sonic:~$ show version | head
SONiC Software Version: SONiC.g0aad6c67c-rachandr
SONiC OS Version: 12
Distribution: Debian 12.2
Kernel: 6.1.0-11-2-amd64
Build commit: 0aad6c67c
Build date: Thu Oct 26 07:13:47 UTC 2023
Built by: rachandr@AZUHPS14
Platform: x86_64-cel_silverstone-r0
Signed-off-by: Ramasamy Chandramouli <rachandr@celestica.com>
---------
Signed-off-by: Ramasamy Chandramouli <rachandr@celestica.com>
* [Marvell-arm64] Add platform support for rd98DX35xx
This change adds following two variants of rd98DX35xx board to arm64
build.
Board with CPU integrated into the 98DX35xx switching chip:
Platform: arm64-marvell_rd98DX35xx-r0
HwSKU: rd98DX35xx
ASIC: marvell
Port Config: 32x1G + 16x2.5G + 6x25G
Board with external CN9131 CPU connected over PCI to 98DX35xx
switching chip:
Platform: arm64-marvell_rd98DX35xx_cn9131-r0
HwSKU: rd98DX35xx_cn9131
ASIC: marvell
Port Config: 32x1G + 16x2.5G + 6x25G
Change-Id: I21dc9fe972417daaabb20a5bddf7779d72b7972e
Signed-off-by: Pavan Naregundi <pnaregundi@marvell.com>
* Add HWSKU for rd98DX35xx and rd98DX35xx_cn9131
This patch adds new HWSKU's for Marvell arm64 platforms rd98DX35xx
and rd98DX35xx_cn9131.
Change-Id: Id7c14f49f0e304335cc4ca73dcae52362c49d231
Signed-off-by: Pavan Naregundi <pnaregundi@marvell.com>
---------
Signed-off-by: Pavan Naregundi <pnaregundi@marvell.com>
- Why I did it
Support running hw-management service on MSN4700 emulation platform.
- How I did it
Use physical EEPROM instead of the fake one
Do not skip PSUd, PCId, thermal control daemon
Adjust PCIe and thermal configuration files
Adjust platform.json for different chassis names and thermals
Remove a patch to hw-management in order to enable it
- How to verify it
Run Nvidia simulation on SN4700 (ASIC and Platform)
Signed-off-by: Stephen Sun <stephens@nvidia.com>
- Why I did it
A reboot is required to activate FW after it has been upgraded.
If the reboot was not performed and the user needs to upgrade to another SONiC image, the upgrade will fail.
The reason is that a new FW should be installed during the SONiC upgrade, but the installation fails because the previously installed FW was not activated.
To allow a second FW upgrade without a reboot in between, the FW image needs to be reactivated.
This change handles that flow.
Example of the issue scenario:
The user installed a SONiC image on the switch.
Then, for some reason, the FW was upgraded by the user or a script, but no reboot was performed to activate it.
After that, an upgrade to a new SONiC image fails because the new image needs to install FW, which fails because the previous FW was not activated.
- How I did it
In the "mlnx-fw-upgrade" script, check whether the FW upgrade failed with the error indicating that FW was already installed but a reboot was not performed.
If so, perform FW image reactivation and try to upgrade the FW again.
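The retry flow above can be sketched as follows. The actual change lives in the mlnx-fw-upgrade shell script; this Python sketch only illustrates the control flow, and the `FwActivationPending` exception and the `upgrade`/`reactivate` callables are hypothetical stand-ins for the real tool invocations.

```python
class FwActivationPending(Exception):
    """Hypothetical error: FW was installed earlier but never activated."""


def upgrade_fw_with_reactivation(upgrade, reactivate):
    """Try a FW upgrade; on a pending-activation error, reactivate and retry once.

    `upgrade` and `reactivate` are caller-supplied callables (assumptions
    made for this sketch). Any other exception propagates unchanged.
    """
    try:
        upgrade()
    except FwActivationPending:
        # A previous FW image is waiting for a reboot: reactivate the
        # image, then attempt the upgrade a second time.
        reactivate()
        upgrade()
```

Retrying only once keeps the flow bounded: if the second attempt fails, the error surfaces to the caller.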
- How to verify it
Install a SONiC image on the switch.
Then upgrade the FW but do not reboot.
After that, upgrade to a new SONiC image and check that the upgrade was successful.
Signed-off-by: Volodymyr Samotiy <volodymyrs@nvidia.com>
Why I did it
- Convert hw-dump into generate-dump plugins
- Enable DRAM scrubber on some products
- Fix xcvr driver active low register bit logic
- Improve cooling algorithm (now considers xcvrs and modules)
- Add linecard graceful shutdown (disabled by default)
The scrubber was enabled for the following products:
- DCS-7050QX-32S
- DCS-7050CX3-32S
- DCS-7060CX-32S
This is CSP CS00012280996.
The issue to fix is that the checksum was incorrect for all TCP packets leaving the system, so the BGP connection could not be established. We found the issue on BCM56993, and it may affect all platforms using linux_ngknet.
Why I did it
XGS saibcm-modules 8.4 is needed. #14471
Work item tracking
Microsoft ADO (number only): 24917414
How I did it
Copy files from xgs SDK 8.4 repo and modify makefiles to build the image.
Upgrade version to 8.4.0.2 in saibcm-modules.mk.
How to verify it
Build a private image and run full qualification with it: https://elastictest.org/scheduler/testplan/650419cb71f60aa92c456a2b
Why I did it
In an effort to allow people to build a slim version of SONiC that fits on devices with small storage, there is a need to disable some unneeded features.
The docker-gbsyncd containers are only applicable to devices with external gearboxes and might not apply to devices that need a small image.
It is therefore desirable to have a knob to not include these gbsyncd containers.
Work item tracking
Microsoft ADO (number only):
How I did it
Add a new config INCLUDE_GBSYNCD which is enabled by default to retain the previous behavior.
Setting it to n will not include the platform/components/docker-gbsyncd-*.mk.
How to verify it
Set INCLUDE_GBSYNCD = n and witness that docker-gbsyncd images are not present in the final image.
- Why I did it
Add an ability to add arm64 mellanox specific kconfig using the integration tool
Fix the existing duplicate kconfig problem by using the vanilla .config
Add an ability to patch kconfig-inclusions file. Renamed series.patch to external-changes.patch to reflect the behavior
NOTE: Minimum hw-mgmt version to use with these changes: V.7.0030.2000, not yet upstream but required prior to it.
This option will be enabled once the new hw-mgmt is upstream.
Depends on sonic-net/sonic-linux-kernel#336
Signed-off-by: Vivek Reddy Karri <vkarri@nvidia.com>
Why I did it
Added Marvell SAI-1.13.0 debian support for x86_64 platform.
Work item tracking
Microsoft ADO (number only):
How I did it
Compile Marvell libsai.so (with SAI headers from version 1.13.0) and package it with version 1.13.0-1
How to verify it
Updated sdk & driver requires hugepages to be reserved during kernel
boot. These kernel command line arguments are passed from installer.conf
in device folder.
Change-Id: Id43f61af2b050500775da66d058c2de78cb5ad15
Signed-off-by: Pavan Naregundi <pnaregundi@marvell.com>
This patch adds support for lazy install of Marvell prestera SDK
drivers for platform-nokia. Lazy install for drivers is added as
updated sdk driver needs to classify the drivers required for platform
during compile time. SDK drivers and platform files are now fetched
from a submodule (mrvl-prestera).
Additionally, DTB required for sonic_fit creation during compile time
is sourced from sonic-linux-kernel.
Change-Id: Id5b011e6bd67accf7b1579d91cb7affad464e916
Signed-off-by: Pavan Naregundi <pnaregundi@marvell.com>
Upgrade the xgs SAI version to 8.4.21.0 to include the following changes:
8.4.21.0: [CSP CS00012316669][SAI_BRANCH rel_ocp_sai_8_4] FP destroy API behavior change to avoid traffic leaks
8.4.20.0: [CSP CS00012312900] Max path used as 0 in ordered ECMP replace.
8.4.19.0: [CSP CS00012301679] sai_query_attribute_capability SAI_OBJECT_TYPE_SWITCH, fix few attrs in previous checkin
8.4.18.0: [CSP CS00012310706] Add SAI_TUNNEL_SUPPORT to azure pipeline build files
8.4.16.0: [CSP CS00012301679] sai_query_attribute_capability for obj type SAI_OBJECT_TYPE_SWITCH
8.4.15.0: [SAI_BRANCH rel_ocp_sai_8_4] Port SONIC-75025 to SAI 8.4
8.4.14.0: [CSP CS00012306356] Change log level of sai_bulk_object_get_stats, unsupported object type to warning
8.4.13.0: [CSP CS00012302193] backport SONIC-72912 jira on SAI 8.4 branch
8.4.12.0: [CSP CS00012296541][SAI_BRANCH rel_ocp_sai_8_4] Performance improvement for ECMP from SDK-354625
8.4.11.0: [CSP CS00012293985] Port SONIC-74816 fix to 8.4.
8.4.10.0: [CSP NA/SID-26013][SAI_BRANCH rel_ocp_sai_8_4] SID - L3 multicast packet drop due to wrong VFI derivation - SDK-350470
8.4.9.0: [CSP NA/SID-25917][SAI_BRANCH rel_ocp_sai_8_4] SID-Crash in ALPM algorithm during entry split SDK-343694
8.4.8.0: [CSP CS00012275265][SAI_BRANCH rel_ocp_sai_8_4] SID Deadlock in linkscan callback during flexport operations
8.4.7.0: [CSP CS00012284142] Fixed MMU buffer config issue with multicast queues
8.4.6.0: [CSP CS00012275454] sai_object_type_get_availability failed with SAI_STATUS_INVALID_PARAMETER; [CSP CS00012284121] [SAI_BRANCH rel_ocp_sai_8_4] SID - L2_ENTRY Table Lookups May Miss
8.4.4.0: [CSP CS00012287462] Uplift tunnel fix from SONIC-73462
8.4.2.0: Fixing the issue with SAI_QUEUE_STAT_DROPPED_PACKETS retrieval; Enable/Disable bitmask for egress stats; SAI - OCP SAI 8.4 - SAI: Reduce Index data type union _brcm_sai_indexed_data_t size to be below 2k.; Cut Down Version - Port Tpid Compilation Issue Fix
Signed-off-by: zitingguo-ms <zitingguo@microsoft.com>
- Why I did it
hw-management renamed PSU temperature related sysfs:
psu1_temp -> psu1_temp1
psu2_temp -> psu2_temp1
psu1_temp_max -> psu1_temp1_max
psu2_temp_max -> psu2_temp1_max
This PR is to align the change in SONiC.
- How I did it
Use the new sysfs nodes for PSU temperature and PSU temperature threshold
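The rename listed above could be captured in a small helper like the one below. This is a sketch only: the helper name is hypothetical, and the base directory is an assumption made for illustration; only the psuN_temp1 and psuN_temp1_max node names come from the description.

```python
import os

# Assumed location of the hw-management thermal nodes (illustrative only).
HW_MGMT_THERMAL_DIR = "/run/hw-management/thermal"


def psu_temp_sysfs(psu_index, threshold=False, base_dir=HW_MGMT_THERMAL_DIR):
    """Build the path of the renamed PSU temperature node.

    psu1_temp     -> psu1_temp1
    psu1_temp_max -> psu1_temp1_max
    """
    name = "psu{}_temp1".format(psu_index)
    if threshold:
        name += "_max"
    return os.path.join(base_dir, name)
```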
- How to verify it
Manual test
sonic-mgmt Regression test
1. Upgrade Centec SAI debian package version to v1.13, in order to match syncd's requirement.
2. Fix syncd compile failure caused by the missing sai_query_api_version function in the vendor SAI
Signed-off-by: Xianghong Gu <xgu@centec.com>
Why I did it
The SONiC service determine-reboot-cause might run before the driver creates the reset cause files. In that case, the reset cause will be "Unknown". This PR introduces a mechanism that waits for the reset cause sysfs files to be ready.
How I did it
/run/hw-management/config/reset_attr_ready is the file that indicates all reset cause files are ready. The chassis.get_reboot_cause function waits for /run/hw-management/config/reset_attr_ready for up to 45 seconds.
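A wait of this kind could look roughly like the sketch below. The helper name `wait_for_file` and the one-second polling interval are assumptions; only the file path and the 45-second limit come from the description, and the real platform code may use a different helper.

```python
import os
import time

RESET_ATTR_READY = "/run/hw-management/config/reset_attr_ready"


def wait_for_file(path=RESET_ATTR_READY, timeout=45.0, interval=1.0):
    """Poll until `path` exists or `timeout` seconds elapse.

    Returns True if the file appeared in time, False otherwise.
    """
    deadline = time.monotonic() + timeout
    while True:
        if os.path.exists(path):
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)
```

A caller would report the reboot cause as unknown only after this returns False, instead of immediately when the files are missing.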
How to verify it
Manual test on master/202211/202205
Why I did it
Add an info syslog message for cpu_wdt.service when the watchdog arm action is triggered.
How I did it
Add an info syslog message for cpu_wdt.service when the watchdog arm action is triggered.
This fixes Nokia-ION/ndk#22
Note that this PR must be coupled with NDK version >= 22.9.13
Why I did it
To provide proper support for CMIS compliant transceiver module CDB operations (including FW related operations).
How I did it
Enhanced the transport subsystem so as to provide for up to 2k bytes of data to be passed to/from modules (as contrasted with the prior max of 128 bytes).
How to verify it
Ensure that new FW (firmware) can be programmed to CMIS compliant module(s) using the 'sfputil firmware ...' commands.
To include files in path platform/broadcom/sonic-platform-modules-dell/s6100/bin during build in Dell S6100 platform deb package.
How I did it
During Dell S6100 platform deb package build, copy the files to the install location.
How to verify it
- Copy the required files to platform/broadcom/sonic-platform-modules-dell/s6100/bin.
- Build the SONiC image and install in a Dell S6100 device.
- The files will be available in /usr/share/sonic/device/x86_64-dell_s6100_c2538-r0/bin/.
SONiC CLI command was broken.
admin@sonic:~$ show platform psustatus
PSU Model Serial HW Rev Voltage (V) Current (A) Power (W) Status LED
----- --------------- ------------------ -------- ------------- ------------- ----------- -------- -----
PSU 1 PFE600-12-054NA 420000956420600006 206 N/A N/A 82.00 OK green
PSU 2 PFE600-12-054NA 420000956420600248 206 N/A N/A 60.00 NOT OK green
Management port currently broken for Edgecore AS4630-54PE platform due to NIC hardware numbering.
Created new PR with typo from Edgecore in original PR fixed. Here is a link to the old PR that has broken logic:
#9560
Sfp api can now be called from the host which doesn't have the python_sdk_api installed. Also, sfp api has been migrated to use sysfs instead of sdk handle.
Signed-off-by: Vivek Reddy Karri <vkarri@nvidia.com>
* platform/pddf/README.md: Fix typo in *development*
* platform/pddf/README.md: Remove trailing space
* platform/pddf/README.md: Remove leading space from all lines
#15926 updated the submodule hash to point to a commit on a private branch that included changes for compiling with the 5.10.179 kernel. However, this submodule hash should've pointed to one on the master branch, not on a private branch. Fix this with this PR.
Signed-off-by: Saikrishna Arcot <sarcot@microsoft.com>
Why I did it
sonic-mgmt test failure is seen for update_firmware component API
Microsoft ADO: 25208748
How I did it
Edited API 2.0 to fix this issue.
How to verify it
Run sonic-mgmt test after the fix and verify it passes.
This feature was meant to be enabled but was accidentally left disabled.
Also downgrades the select/deselect messages to KERN_INFO to reduce log
spam.
Fixes #14546.
Signed-off-by: Christian Svensson <blue@cmd.nu>
- Why I did it
To support the building of ARM-based docker-sonic-vs.gz
- How I did it
Fixed SYNCD_VS build rule to be architecture-specific.
- How to verify it
make configure PLATFORM=vs PLATFORM_ARCH=arm64
make target/docker-sonic-vs.gz
Signed-off-by: Yakiv Huryk <yhuryk@nvidia.com>
- Why I did it
SAI bug Fixes
1. When creating an ACL rule with SAI_ACL_ENTRY_ATTR_FIELD_SRC_IP/SAI_ACL_ENTRY_ATTR_FIELD_DST_IP enabled, and then disabling the field by setting enable=false, a match on L3_type=IPv4 will remain programmed for the rule. The issue is resolved after the fix.
2. Allow the max scale of virtual routers to be configured for SPC-1, SPC-2, SPC-3, which is 255 when fastboot is enabled and 511 when fastboot is disabled
3. Remove default hash key of SRC_MAC, DST_MAC and ETH_TYPE
SAI features
1. Port init profile
2. Dual ToR Active-Standby | Additional MAC support
SDK/FW bug fixes
1. When performing fast boot from an old SDK version (currently installed) to a newer one (target version), and the system was initially loaded with a new SDK version (past version), and the system has not been wiped, under specific conditions, the fast boot would use the past version's data and may fail.
- How I did it
Update SAI version to SAIBuild2211.25.1.4
Update SDK/FW version to 4.6.1062/2012.1062
- Why I did it
1. Update Mellanox HW-MGMT package to newer version V.7.0030.1011
2. Replace the SONiC PMON Thermal control algorithm with the one inside the HW-MGMT package on all Nvidia platforms
3. Support Spectrum-4 systems
- How I did it
1. Update the HW-MGMT package version number and submodule pointer
2. Remove the thermal control algorithm implementation from Mellanox platform API
3. Revise the patch to HW-MGMT package which will disable HW-MGMT from running on SIMX
4. Update the downstream kernel patch list
Signed-off-by: Kebo Liu <kebol@nvidia.com>
### Why I did it
1. Enhance the diagnosis information collecting mechanism
- If the option `-v` is fed, it will pass additional diagnosis flags to mlxfwmanager
- Collect all the output from mlxfwmanager and print them to syslog if it fails
2. Abort syncd in case waiting for device or upgrading firmware fails
Signed-off-by: Stephen Sun <stephens@nvidia.com>
### How I did it
#### How to verify it
Regression and manual test
- Why I did it
Because the Spectrum4 devices don't support mlxtrace utility.
- How I did it
Edit sai.profile and remove mlxtrace_spectrum4_itrace_*.cfg.ext files
Signed-off-by: vadymhlushko-mlnx <vadymh@nvidia.com>
Why I did it
To avoid errors when the sfputil show error-status -hw is called from the host OS (not from the pmon docker).
How I did it
Remove the self.sdk_handle parameter from the _get_module_info() function.
How to verify it
Execute the sfputil show error-status -hw
Signed-off-by: vadymhlushko-mlnx <vadymh@nvidia.com>
Why I did it
Support Intel Tofino based platforms Netberg Aurora 750
ASIC: Intel Tofino BFN-T10-064Q
Ports: 64x 100G
How I did it
Added specification to device/netberg directory
Added platform/barefoot/sonic-platform-modules-netberg, which contains kernel modules, scripts and sonic_platform packages.
Modified the platform/barefoot/platform-modules-netberg.mk to include Aurora 750 related ID.
Signed-off-by: Andrew Sapronov <andrew.sapronov@gmail.com>
What I did
Add new platform arm64-ragile_ra-b6010-48gt4x-r0 (Centec)
ASIC Vendor: Centec
Switch ASIC: Centec
Port Config: 48x1G+4x10G
Why I did it
Add new platform RA-B6010-48GT4X
How I did it
Add new platform RA-B6010-48GT4X
Signed-off-by: pettershao-ragilenetworks <pettershao@ragilenetworks.com>
Ignore intermittent IO errors during get_change_event in the Platform API
Fix tunings for some ports on CatalinaDD
Fix kernel module build for 6.1 kernel in preparation of bookworm upgrade
Why I did it
System health config is missing on a few Dell platforms.
How I did it
Added system health monitoring config and its related APIs
How to verify it
show system-health summary/detail commands.
- Why I did it
watchdogutil uses the platform API watchdog instance to control and query watchdog status. The Nvidia watchdog implementation caches the "armed" status in an object member, "WatchdogImplBase.armed". This does not work for the CLI infrastructure because each CLI invocation creates a new watchdog instance, so the status cached in the previous instance is lost. Consider the following commands:
admin@sonic:~$ sudo watchdogutil arm -s 100 =====> watchdog instance1, armed=True
Watchdog armed for 100 seconds
admin@sonic:~$ sudo watchdogutil status ======> watchdog instance2, armed=False
Status: Unarmed
admin@sonic:~$ sudo watchdogutil disarm =======> watchdog instance3, armed=False
Failed to disarm Watchdog
- How I did it
Use sysfs to query watchdog status
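Reading the state on every call, rather than caching it in the object, could look like the sketch below. The function name is hypothetical, and it assumes a state file whose content is "active" when the watchdog is armed, as the generic Linux watchdog sysfs `state` attribute reports; the actual Nvidia implementation and path may differ.

```python
def read_watchdog_armed(state_path):
    """Return True if the watchdog state file reports 'active'.

    `state_path` is an assumed sysfs path, e.g. a watchdog `state`
    attribute. A missing or unreadable file is treated as unarmed.
    """
    try:
        with open(state_path) as state_file:
            return state_file.read().strip() == "active"
    except OSError:
        return False
```

Because nothing is stored on the instance, separate `watchdogutil` invocations see the same answer regardless of which process armed the watchdog.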
- How to verify it
Manual test
Unit test
- Why I did it
Change Mellanox platform API implementation to use ASIC driver sysfs for the module operational state and status error fields.
- How I did it
Modify the platform/mellanox/mlnx-platform-api/sonic_platform/sfp.py file by changing the call of the sx_mgmt_phy_module_info_get() SDK API to use sysfs
- How to verify it
Simulate the unplug cable event
Check the CLI output
sfputil show presence
sfputil show error-status -hw
Simulate the plug cable event
Repeat step 2
Signed-off-by: vadymhlushko-mlnx <vadymh@nvidia.com>
SONiC changes:
1. Support Spectrum4 ASIC FW binary building.
2. Support building the new SDK sx-obj-desc lib since the new SAI needs it.
3. Remove SX_SCEW debian package from Mellanox SDK build since we are no longer using it (we use libxml2 instead).
4. Update SAI, SDK, FW to version 4.6.1020/2012.1020/SAIBuild2305.25.0.3
SDK/FW bug fixes
1. In SPC-1 platforms: Fastboot mode is not operational for Split port with Force mode in 50G speed
SFP modules are kept in disabled state after set LPM (low power mode) on/off for at least 3 minutes.
2. When performing fast boot from an old SDK version (currently installed) to a newer one (target version), and the system was initially loaded with a new SDK version (past version), and the system has not been wiped, under specific conditions, the fast boot would use the past version's data and may fail.
SDK/FW Features
1. On SN2700 all ports can support y cable by credo
SAI bug Fixes
1. When creating an ACL rule with SAI_ACL_ENTRY_ATTR_FIELD_SRC_IP/SAI_ACL_ENTRY_ATTR_FIELD_DST_IP enabled, and then disabling the field by setting enable=false, a match on L3_type=IPv4 will remain programmed for the rule. The issue is resolved after the fix.
2. Allow the max scale of virtual routers to be configured for SPC-1, SPC-2, SPC-3 when fastboot is enabled
3. Remove default hash key of SRC_MAC, DST_MAC and ETH_TYPE
SAI features
1. Port init profile
- How I did it
Update SDK/FW/SAI make files
- How to verify it
Run full sonic-mgmt regression on Mellanox platform
Signed-off-by: Kebo Liu <kebol@nvidia.com>
- Why I did it
Update Mellanox MFT tool to version 4.25.0-62
- How I did it
Update the MFT tool make file
- How to verify it
Run full sonic-mgmt regression.
Signed-off-by: Kebo Liu <kebol@nvidia.com>
Why I did it
Add PDDF support on following Ufispace platforms with Broadcom ASIC
S9110-32X
S8901-54XC
S7801-54XS
S6301-56ST
How I did it
Add PDDF configuration files, scripts and python files
How to verify it
Run pddf commands and show commands.
Signed-off-by: nonodark <ef67891@yahoo.com.tw>
* Update sairedis submodule
This submodule update needs to be manually done due to build changes
done in the sairedis submodule. Specifically, Debian build profiles are
now being used instead of dpkg build targets, and dbgsym packages are
being used instead of dbg packages. Because of this, there needs to be
changes on the sonic-buildimage side for this.
This is a reland of #15720, which was reverted in #15995 due to the RPC
package build failing. That failure has since been fixed, and the
PR pipeline has been updated to build the RPC package so that this is
checked at the PR stage.
This submodule update brings in the following changes:
```
4dbdb21 Fix RPC package build failure due to shell syntax issue (#1268)
588d596 Make sure new binaries replace existing binaries in docker-sonic-vs (#1269)
ce8f642 [vs] Use boost join to concatenate switch types in config (#1266)
d6055a2 [vslib]: Temporaily map DPU switch type to NVDA_MBF2H536C (#1259)
e1cdb4d [CodeQL]: Use dependencies with relevant versions in azp template. (#1262)
c08f9a2 [CI]: Fix collect log error in azp template. (#1260)
eed856c [CodeQL]: Fix syncd compilation in azp template. (#1261)
a3f1f1a Reland 'Make changes to building and packaging sairedis (#1116)' (#1194)
```
Signed-off-by: Saikrishna Arcot <sarcot@microsoft.com>
* Update sairedis submodule with the fix for the RPC package build
Signed-off-by: Saikrishna Arcot <sarcot@microsoft.com>
---------
Signed-off-by: Saikrishna Arcot <sarcot@microsoft.com>
- Why I did it
Increase UT coverage for Nvidia platform API code
Work item tracking
Microsoft ADO (number only):
- How I did it
Focus on low coverage file:
1. component.py
2. watchdog.py
3. pcie.py
- How to verify it
Run the unit tests; coverage increased from 70% to 90%
- Why I did it
Added the fwtrace config files in order to be able to call the mlxstrace utility during the show techsupport dump.
Work item tracking
Microsoft ADO (number only):
- How I did it
Added fwtrace config files. Added path to these files to sai.profile for each mlnx device.
- How to verify it
Execute the show techsupport command and check if mlxstrace output is in system dump.
Signed-off-by: vadymhlushko-mlnx <vadymh@nvidia.com>
This reverts commit e0927e28af.
Why I did it
Reverts #15720
It breaks build for target/debs/bullseye/syncd_1.0.0_amd64.deb
make[2]: Entering directory '/sonic/src/sonic-sairedis'
dh_install
# Note: escape with an extra symbol
if [ -f debian/syncd-rpc/usr/bin/syncd_init_common.sh ] ; then
/bin/sh: 1: Syntax error: end of file unexpected (expecting "fi")
make[2]: *** [debian/rules:65: override_dh_install] Error 2
make[2]: Leaving directory '/sonic/src/sonic-sairedis'
make[1]: *** [debian/rules:51: binary] Error 2
make[1]: Leaving directory '/sonic/src/sonic-sairedis'
dpkg-buildpackage: error: fakeroot debian/rules binary subprocess returned exit status 2
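The log above is consistent with an `if ... then` line reaching /bin/sh without its closing `fi`, which is what happens when make hands a recipe line to its own shell and the continuation is lost. The following is a minimal sketch that reproduces this class of error outside the build, assuming a POSIX `sh` is on PATH. The helper name `run_sh` and the path `/nonexistent` are illustrative only and are not taken from debian/rules.

```python
import subprocess


def run_sh(script: str) -> subprocess.CompletedProcess:
    """Run one script fragment with `sh -c`, as make does for a recipe line."""
    return subprocess.run(["sh", "-c", script], capture_output=True, text=True)


# An unterminated `if`, like a recipe line that lost its continuation.
broken = run_sh("if [ -f /nonexistent ] ; then")

# The same statement delivered to a single shell invocation.
fixed = run_sh("if [ -f /nonexistent ] ; then echo found ; fi")

print("broken:", broken.returncode, broken.stderr.strip())
print("fixed:", fixed.returncode)
```

The first call fails with a syntax error before running anything; the second exits cleanly because the whole statement is parsed at once.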
Work item tracking
Microsoft ADO (number only): 24691535
How I did it
How to verify it
Why I did it
Update the iSMART_64 tool to support the latest Debian releases.
How I did it
Modified platform/broadcom/sonic-platform-modules-dell/s6100/scripts/iSMART_64 (branch new_ismart).
How to verify it
On the s6100, run the iSMART_64 tool.
md5sum - 24725730d7649769c7ba50971c1f2955
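The checksum can also be checked without the md5sum CLI. The following is a small sketch, assuming the value above is the MD5 of the updated iSMART_64 file; the helper names `md5_of_file` and `matches_expected` are hypothetical.

```python
import hashlib

# Value quoted in this PR; assumed to be the MD5 of the updated iSMART_64 file.
EXPECTED_MD5 = "24725730d7649769c7ba50971c1f2955"


def md5_of_file(path: str, chunk_size: int = 65536) -> str:
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_expected(path: str, expected: str = EXPECTED_MD5) -> bool:
    """Compare a file's MD5 digest against the expected hex string."""
    return md5_of_file(path) == expected.lower()
```

For example, `matches_expected("/usr/local/bin/iSMART_64")` would return True for the updated tool, where the install path is only an illustration.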
The Midstone platform had a compilation error on the master branch; this change fixes it.
How I did it
Because the i2c_new_dummy API is deprecated after the bullseye migration, replaced it with i2c_new_dummy_device.
How to verify it
Verified that target/debs/bullseye/platform-modules-midstone-200i_0.2.2_amd64.deb is generated.
Co-authored-by: Kannan Selvaraj <skannan@celestica.com>
Why I did it
[E1031] Fix occasional pca9548 initialization failures in stress tests.
When the failure happens, the iSMT i2c bus hangs and needs a power cycle
to recover.
How I did it
Add a 0.5 s delay between setting up and configuring the pca9548 i2c mux.
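The ordering can be sketched as below. This only illustrates the sequencing, not the platform code: `init_mux`, `setup` and `configure` are hypothetical names, and only the 0.5 s value comes from this change.

```python
import time
from typing import Callable

MUX_SETTLE_SECONDS = 0.5  # delay quoted in this PR


def init_mux(setup: Callable[[], None],
             configure: Callable[[], None],
             sleep: Callable[[float], None] = time.sleep) -> None:
    """Set up the mux, wait for it to settle, then configure it."""
    setup()
    sleep(MUX_SETTLE_SECONDS)  # give the device time before the next access
    configure()
```

Passing `sleep` as a parameter keeps the sketch testable without actually waiting.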
How to verify it
Run the reboot stress test at least 100 times without failure.
This submodule update needs to be manually done due to build changes
done in the sairedis submodule. Specifically, Debian build profiles are
now being used instead of dpkg build targets, and dbgsym packages are
being used instead of dbg packages. Because of this, corresponding
changes are needed on the sonic-buildimage side.
This submodule update brings in the following changes:
ce8f642 [vs] Use boost join to concatenate switch types in config (#1266)
d6055a2 [vslib]: Temporaily map DPU switch type to NVDA_MBF2H536C (#1259)
e1cdb4d [CodeQL]: Use dependencies with relevant versions in azp template. (#1262)
c08f9a2 [CI]: Fix collect log error in azp template. (#1260)
eed856c [CodeQL]: Fix syncd compilation in azp template. (#1261)
a3f1f1a Reland 'Make changes to building and packaging sairedis (#1116)' (#1194)
Signed-off-by: Saikrishna Arcot <sarcot@microsoft.com>
Why I did it
To fix sonic-net/sonic-mgmt#8786
How I did it
Modified the Fan API to check whether the retrieved data is valid and return accordingly.
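The kind of check described can be sketched as follows. This is a sketch under assumptions, not the platform's Fan class: `parse_fan_rpm`, the `max_rpm` bound and the choice of `None` for invalid data are all hypothetical.

```python
from typing import Optional


def parse_fan_rpm(raw: Optional[str], max_rpm: int = 30000) -> Optional[int]:
    """Return the fan speed as an int, or None if the raw reading is invalid."""
    if raw is None:
        return None  # nothing was read from the device
    try:
        value = int(raw.strip())
    except ValueError:
        return None  # non-numeric reading such as "N/A"
    if value < 0 or value > max_rpm:
        return None  # out-of-range reading is treated as invalid
    return value
```

A caller can then return a fixed fallback when the result is `None` instead of raising on bad input.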
How to verify it
Verify that API 2.0 loads properly.
Execute CLI commands such as "show version", "show interface status", and "show platform psustatus".
- Why I did it
To optimize Mellanox platform SAI build
- How I did it
SAI debs are now downloaded from the Spectrum-SDK-Drivers-SONiC-Bins release.
- How to verify it
Configure/build for Mellanox platform, check the image and ensure that correct SAI debs are included.
- Why I did it
The reset cause "reset_from_comex" has been removed by hw-management, so it is removed from the platform API code as well.
- How I did it
Remove reset_from_comex from reboot cause mapping
- How to verify it
Manual test
- Why I did it
BIOS images on new-generation switches can come as cap or cab files, so support for these file types is needed.
Also, the ONIE version on new devices can have a 'dev' suffix.
- How I did it
Added cap & cab as possible component extensions for ComponentBIOS.
Updated the ONIE version regex to include dev-signed versions.
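The following is a sketch of both checks under stated assumptions. The names `BIOS_EXTENSIONS`, `is_supported_bios_image` and `ONIE_VERSION_RE` are hypothetical, the `.rom` entry is assumed to be the extension supported before this change, and the regex and sample version strings only illustrate an optional `dev` field; they are not the pattern used in the platform code.

```python
import os
import re

# ".cap" and ".cab" come from this change; ".rom" is an assumption.
BIOS_EXTENSIONS = (".rom", ".cap", ".cab")


def is_supported_bios_image(path: str) -> bool:
    """Check the image file extension case-insensitively."""
    return os.path.splitext(path)[1].lower() in BIOS_EXTENSIONS


# Illustrative pattern: <year>.<month>-<x>.<y>.<z>[-dev]-<build>
ONIE_VERSION_RE = re.compile(
    r"^(?P<base>\d{4}\.\d{2}-\d+\.\d+\.\d+)(?:-(?P<dev>dev))?-(?P<build>\d+)$"
)

for version in ("2022.08-5.3.0010-9600", "2022.08-5.3.0010-dev-9600"):
    match = ONIE_VERSION_RE.match(version)
    print(version, "dev-signed" if match and match.group("dev") else "release")
```

Making the `dev` group optional lets one pattern accept both forms without a second regex.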
- How to verify it
Update BIOS.
Why I did it
port_config.ini and hwsku.json are needed to generate the default config
switch_type needs to be "dpu" to spawn the right set of processes during dvs initialization and to make sure that DASH APIs can be handled properly
Work item tracking
Microsoft ADO (number only): 24375371
How I did it
Use the same hwsku.json and port_config.ini for DPU-2P as the ones used for Nvidia-MBF2H536C SKU in nvidia-sonic sonic-buildimage repo.
Set switch_type to "dpu" in DEVICE_METADATA configuration to make sure DASH specific APIs are handled properly
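The resulting DEVICE_METADATA entry can be sketched as below, assuming the usual config_db layout in which device-wide settings live under the "localhost" key. The helper `with_dpu_switch_type` is hypothetical and is not part of this change.

```python
import copy
import json


def with_dpu_switch_type(config: dict) -> dict:
    """Return a copy of a config_db-style dict with switch_type set to "dpu"."""
    updated = copy.deepcopy(config)  # leave the caller's dict untouched
    metadata = updated.setdefault("DEVICE_METADATA", {})
    metadata.setdefault("localhost", {})["switch_type"] = "dpu"
    return updated


print(json.dumps(with_dpu_switch_type({}), indent=4))
```

Other keys already present under DEVICE_METADATA are preserved; only switch_type is added or overwritten.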
Signed-off-by: Prabhat Aravind <paravind@microsoft.com>