Commit Graph

2091 Commits

Author SHA1 Message Date
dbarashinvd
7a34d4a275
[Mellanox] fix code for warm reboot to work with FW controlled ports (#18065)
- Why I did it
Fix the code to work also after warm reboot to work with FW controlled ports.
In warm reboot the control state sysfs of each port does not change unlike reboot or fast boot.

- How I did it
1. Check procfs cmdline if warm reboot done this is due to the fact pmon don't recognize warm reboot when it's taking place since pmon is loaded after warm reboot is finished.
2. If warm reboot done, check in static detection part for each port if it's FW controlled. If so, leave it this way and stop the state machine flow (set it to final state).

- How to verify it
1. Boot a switch with CMIS host management with at least one FW controlled port (non active cables or non cmis cables) then run warm reboot.
2. Verify no errors of sysfs reading appears for control sysfs
2024-02-08 14:49:56 +02:00
zitingguo-ms
74494010e1
[Broadcom] Upgrade xgs SAI to 10.1.6.0 (#18044)
Why I did it
Upgrade the xgs SAI version to 10.1.6.0 to include the following fix:

10.1.6.0: [CS00012332630][SAI_BRANCH rel_ocp_sai_10_1] SAI - OTHER - [SAI BUG] sflow use psample to send packet, but the psample in linux version is not right.
10.1.4.0: [CS00012329827]ECMP LB traffic polarization, configure hash_offset along with hash_seed attr
10.1.3.0: Double commit test code fixes in EM for 10.1.
10.1.2.0: fix ODP packaging in rel_ocp_sai_10_1
10.1.1.0: Use knet-cb procfs path for DNX port speed sampling rate (does not use new genl)
Work item tracking
Microsoft ADO (number only): 26720003
How I did it
Upgrade xgs SAI version in sai.mk file.

How to verify it
Run full qual on s6100 T1: https://elastictest.org/scheduler/testplan/65c1c2e69e3e72f540cae34b
2024-02-07 09:29:40 +08:00
Yevhen Fastiuk
2f35079979
[Mellanox] Fix uninitialized variable on module plug event (#17011)
- Why I did it
To fix uninitialized variable

- How I did it
Add initial value

Signed-off-by: Yevhen Fastiuk <yfastiuk@nvidia.com>
2024-02-05 19:41:16 +02:00
dbarashinvd
0aacc1f28e
[Mellanox] fix sysfs reading that gets garbage end of line using strip (#17830)
- Why I did it
when reading sysfs fd upon python poller events, there's end of line garbage like "# 012" (without space between the 2 parts) trailing the real value of 1 or 0

- How I did it
using python strip() to remove end of line

- How to verify it
run the CMIS host management feature on a switch
wait few minutes until switch completes boot up sequence including CMIS host manager
then disconnect or reconnect a port to create a poller event
2024-02-05 19:39:55 +02:00
Stepan Blyshchak
e1a8d2a6e8
[nvidia][syncd] fix incorrect permission of /tmp in syncd container (#17777)
Fixes #16034
2024-02-05 00:00:29 -08:00
wenyiz2021
892f171b80
[Master] [DNX SAI] Update DNX SAI to 9.2.X and SDK on master branch (#17935)
SAI 9.2.x was sanitized and posted on 202305 branch: https://github.com/sonic-net/sonic-buildimage/pull/17432/files

Posting SAI 9.2.x to master branch also.

26607678
2024-02-01 17:44:48 -08:00
Dror Prital
4af43dc63b
[Mellanox] Update SIMX version to 23.10-1123 (#17958)
- Why I did it
Update NVIDIA SIMX Version to 23.10-1123

- How I did it
Changed fw.mk file
2024-01-31 19:41:23 +02:00
Sudharsan Dhamal Gopalarathnam
77384494b3
[Mellanox]Update SDK/FW to 4.6.2202/2012.2202 (#17947)
- Why I did it
Update SDK/FW version to 4.6.2202/2012.2202

Fixed issues:
1. On Spectrum-3 systems, ports' toggling while sending traffic on 400G speed ports, might result in stuck FW.
2. In Spectrum-1 switch systems, 50G SR2 speed mode is not supported when AutoNeg is enabled. In this case although the max interface speed is 50G for SR2 or SR4 or SR, the actual max interface speed negotiated between the loopback is 25G.
3. On Spectrum-2 and Spectrum-3, Switch create in fastboot might take more than 40 seconds in case there are no active links.
4. When performing warmboot from version prior to 202205 to 202205 and above , no aging and mac move take place

- How I did it
Updating make files.

-How to verify it
Running regression
2024-01-31 08:35:16 +02:00
Lior Avramov
865042ed23
[Nvidia] Update syncd docker to use python version 3 (#17735)
* Remove python2 from compilation of python-sdk-api

* Upgrade Python version in syncd RPC docker image to Python3
2024-01-30 13:47:39 -08:00
xumia
bb5a420de5
[Build] Fix krb5 package not found issue (#17926)
Why I did it
Fix the build issue caused by the wrong version specified.

See the build error logs:

Try 4: /usr/bin/wget --retry-connrefused failed to get: -O
--2024-01-26 11:38:23--  https://sonicstorage.blob.core.windows.net/public/fips/bullseye/0.10/amd64/libk5crypto3_1.18.3-6+deb11u14+fips_amd64.deb
Resolving sonicstorage.blob.core.windows.net (sonicstorage.blob.core.windows.net)... 20.60.59.131
Connecting to sonicstorage.blob.core.windows.net (sonicstorage.blob.core.windows.net)|20.60.59.131|:443... connected.
HTTP request sent, awaiting response... 404 The specified blob does not exist.
2024-01-26 11:38:23 ERROR 404: The specified blob does not exist..

Try 5: /usr/bin/wget --retry-connrefused failed to get: -O
make[1]: *** [Makefile:12: /sonic/target/debs/bullseye/symcrypt-openssl_0.10_amd64.deb] Error 8
make[1]: Leaving directory '/sonic/src/sonic-fips'
Work item tracking
Microsoft ADO (number only): 26577929
The package not installed but PR passed issue is traced in another issue #17927

How I did it
Add the libkrb5-dev and the depended packages to fix docker-sonic-vs build failure.
The package libzmq3-dev has dependency on the libkrb5-dev.
2024-01-30 21:44:32 +08:00
dbarashinvd
927dde73f1
fix low polarity wrong value for hw_reset deassert and seek(0) before reading sysfs upon poll event (#17627)
* fix hw_reset low polarity (reverse values)

* move seek to beginning of sysfs fd before reading to resolve power_good
sysfs returns empty upon plug out cable
2024-01-22 10:53:55 -08:00
Junchao-Mellanox
91d77fe7ae
Fix error log while creating PSU thermal object (#17789)
- Why I did it
If a PSU is not present, there could be error log while restarting psud or thermalctld:

Jan  8 17:15:52.689616 sonic ERR pmon#psud: Thermal sysfs /run/hw-management/thermal/psu2_temp1_max does not exist

Jan  8 17:15:57.747723 sonic ERR pmon#thermalctld: Thermal sysfs /run/hw-management/thermal/psu2_temp1 does not exist

- How I did it
if a PSU is not present, we should not check the PSU temperature sysfs.
2024-01-22 16:22:07 +02:00
Liu Shilong
e30782b0fe
[ci] Enable cache for marvell-arm64 build in PR checks. (#15449)
Why I did it
Enable build cache for marvell-arm64 build to decrease PR check duration.

Work item tracking
Microsoft ADO (number only): 26340500
How I did it
How to verify it
2024-01-09 20:28:31 +08:00
snider-nokia
98f24b639e
[Nokia][sonic-platform] Update Nokia sonic-platform submodule and device data (#17378)
These changes, in conjunction with NDK version >= 22.9.17 address the thermal logging issues discussed at Nokia-ION/ndk#27. While the changes contained at this PR do not require coupling to NDK version >= 22.9.17, thermal logging enhancements will not be available without updated NDK >= 22.9.17. Thus, coupling with NDK >=22.9.17 is preferred and recommended.

Why I did it
To address thermal logging deficiencies.

Work item tracking
Microsoft ADO (number only): 26365734
How I did it
The following changes are included:

Threshold configuration values are provided in the associated device data .json files. There is also a change included to better handle the condition where an SFP module read fails.

Modify the module.py reboot to support reboot linecard from Supervisor

 - Modify reboot to call _reboot_imm for single IMM card reboot
 - Add log to the ndk_cmd to log the operation of "reboot-linecard" and "shutdown/satrtup the sfm"
Add new nokia_cmd set command and modify show ndk-status output

 - Add a new function reboot_imm() to nokia_common.py to support reboot a single IMM slot from CPM
 - Added new command: nokia_cmd set reboot-linecard <slot> [forece] for CPM
 - Append a new column "RebootStatus" at the end of output of "nokia_cmd show ndk-status"
 - Provide ability for IMM to disable all transceiver module TX at reboot time
 - Remove defunct xcvr-resync service
2024-01-08 11:38:46 -08:00
Junchao-Mellanox
ee49d0dfec
[Mellanox] Fix issues found for CMIS host management (#17637)
- Why I did it
1. Thermal updater should wait more time for module to be initialized
2. sfp should get temperature threshold from EEPROM because SDK sysfs is not yet supported
3. Rename sfp function to fix typo
4. sfp.get_presence should return False if module is under initialization

- How I did it
1. Thermal updater should wait more time for module to be initialized
2. sfp should get temperature threshold from EEPROM because SDK sysfs is not yet supported
3. Rename sfp function to fix typo
4. sfp.get_presence should return False if module is under initialization

- How to verify it
Manual test
Unit test
2024-01-04 09:42:33 +02:00
Junchao-Mellanox
7d388cd0e6
[Mellanox] wait until hw-management watchdog files ready (#17618)
- Why I did it
watchdog-control service always disarm watchdog during system startup stage. It could be the case that watchdog is not fully initialized while the watchdog-control service is accessing it. This PR adds a wait to make sure watchdog has been fully initialized.

- How I did it
adds a wait to make sure watchdog has been fully initialized.

- How to verify it
Manual test
sonic regression
2023-12-26 18:27:18 +02:00
Junchao-Mellanox
d8a1ffbace
[Mellanox] implement sfp.reset for CMIS management (#16862)
- Why I did it
For CMIS host management module, we need a different implementation for sfp.reset. This PR is to implement it

- How I did it
For SW control modules, do reset from hw_reset
For FW control modules, do reset as the original way

- How to verify it
Manual test
sonic-mgmt platform test
2023-12-17 08:02:47 +02:00
Junchao-Mellanox
c1cb292310
[Mellanox] implement platform wait in python code (#17398)
- Why I did it
New implementation of Nvidia platform_wait due to:
1. sysfs deprecated by hw-mgmt
2. new dependencies to SDK
3. For CMIS host management mode

- How I did it
wait hw-management ready
wait SDK sysfs nodes ready

- How to verify it
manual test
unit test
sonic-mgmt regression
2023-12-14 12:04:24 +02:00
Junchao-Mellanox
f373a16e95
[Mellanox] Fix race condition while creating SFP (#17441)
- Why I did it
Fix issue xcvrd crashes due to cannot import name 'initialize_sfp_thermal':

Nov 27 09:47:16.388639 sonic ERR pmon#xcvrd: Exception occured at CmisManagerTask thread due to ImportError("cannot import name 'initialize_sfp_thermal' from partially initialized module 'sonic_platform.thermal' (most likely due to a circular import) (/usr/local/lib/python3.9/dist-packages/sonic_platform/thermal.py)")

- How I did it
Add lock for creating SFP object

- How to verify it
Unit test
Manual Test
2023-12-14 12:01:11 +02:00
zitingguo-ms
6a9ec987b5
change branch name (#17267)
Why I did it
Upgrade xgs SAI to 10.1 version.

Work item tracking
Microsoft ADO (number only): 25931321
How I did it
Upgrade xgs SAI version in sai.mk file.

How to verify it
Run full qualification on 7050cx3/7260cx3:

7050cx3:
https://dev.azure.com/mssonic/internal/_build/results?buildId=425450&view=results
https://dev.azure.com/mssonic/internal/_build/results?buildId=425449&view=results
7260cx3: https://elastictest.org/scheduler/testplan/656f2b2b617fb27e41557494?leftSideViewMode=detail&prop=status&order=ascending
2023-12-14 09:37:35 +08:00
Junchao-Mellanox
1b84f3daa5
[Mellanox] update asic and module temperature in a thread for CMIS management (#16955)
- Why I did it
When module is totally under software control, driver cannot get module temperature/temperature threshold from firmware. In this case, sonic needs to get temperature/temperature threshold from EEPROM. In this PR, a thread thermal updater is created to update module temperature/temperature threshold while software control is enabled.

- How I did it
Query ASIC temperature from SDK sysfs and update hw-management-tc periodically
Query Module temperature from EEPROM and update hw-management-tc periodically

- How to verify it
Manual test
New Unit tests
2023-12-13 14:19:44 +02:00
Junchao-Mellanox
0d62cf0e92
[Mellanox] Remove EEPROM write limitation if it is software control (#17030)
- Why I did it
When module is under software control (CMIS host management enabled), EEPROM should be controlled by software and there should be no limitation for any write operation.

- How I did it
Remove EEPROM write limitation if a module is under software control

- How to verify it
Manual test
UT
2023-12-13 14:16:40 +02:00
Sudharsan Dhamal Gopalarathnam
dd39dd0e03
[Mellanox] Update SAI to 2311.26.0.28, SDK/FW to 4.6.2134/2012.2134 (#17481)
- Why I did it
Update SAI version to SAIBuild2311.26.0.28

Fixed issues
1. Traffic with unicast destination ip and multicast destination mac wasn't properly dropped
2. When working with SAI_DEFAULT_SWITCHING_MODE_STORE_FORWARD key/value enabled, trying to add a LAG member to a LAG which is created after warm boot initial configuration phase ended, will fail.
3. Optional feature of Port IP counters (SAI_PORT_STAT_IP*) , enabled by SAI XML per-port-ip-counter-enabled config node, wasn't initialized properly.
4. Creating BFD session for non default VRF fails (SAI_BFD_SESSION_ATTR_VIRTUAL_ROUTER != SAI_SWITCH_ATTR_DEFAULT_VIRTUAL_ROUTER_ID).
5. The default value for port FEC during switch init for Spectrum3 was initialized as 'auto' and not aligned to SAI header default 'none'. Note if setups has invalid configuration and relied previously on auto, now it might be necessary for the user to provide explicit valid value for SAI_PORT_ATTR_FEC_MODE

Update SDK/FW version to 4.6.2134/2012.2134
Fixed issues:
1. Updated SN3700C to enable limit to 100G speed.
2. Recovering from Low power mode might ends with port down.

- How I did it
Updating the versions in makefile

- How to verify it
Confirm issues fixed and run sonic-mgmt tests
2023-12-13 12:48:49 +02:00
Junchao-Mellanox
b0bb3d40d3
[Mellanox] Implement low power mode for cmis host management (#17159)
- Why I did it
For cmis host management mode, the prevous sysfs cannot be used for low power mode setting. This PR reuses existing low power mode implementation in sonic_xcvr package when CMIS host management mode is enabled

- How I did it
Use sonic_xcvr low power mode implementation when CMIS host management mode is enabled.

- How to verify it
Manual test for CMIS host management mode
Regression test for old mode and backward compatible test
2023-12-11 10:42:01 +02:00
Nazarii Hnydyn
278a958517
[Mellanox] Disable MFT bash autocompletion (#17442)
A W/A to overcome delay of about 20 sec on login due to MFT bash autocompletion bug.
Should be reverted once a formal solution will be available in future MFT release.

- Why I did it
To overcome SN2700 20 sec delay on login

- How I did it
Removed MFT bash autocompletion part

- How to verify it
1. Build a mellanox image
2. Verify no such links after system boot.

Signed-off-by: Nazarii Hnydyn <nazariig@nvidia.com>
2023-12-10 10:28:32 +02:00
Aravind-Subbaroyan
b222c7c240
Update cisco-8000.ini (#17428)
FCS/CRC Errors will only be reported as RX_ERR.
Fix to avoid the mac port related errors.
Fix for sharedResSize testcase failure in QoS-SAI
Fix the issue related to voltage in 'show platform psustatus'.
Support WRED drop for lossy queues.
Fixed an issue where lossy traffic was getting dropped.
Enhancement of SAI logging for errors and interrupts
2023-12-07 17:05:05 -08:00
Arun Saravanan Balachandran
80e743716c
[Dell] S6100 - Update EEPROM API serial_number_str to return service tag instead of serial number (#17440)
To modify EEPROM API serial_number_str to return service tag instead of serial number in Dell S6100.
Ref PR: #1239

How I did it
Update EEPROM API serial_number_str to return service tag instead of serial number.

How to verify it
Verify decode-syseeprom -s returns service tag in Dell S6100.
2023-12-07 10:08:42 -08:00
centecqianj
8ec4b53451
[Bookworm] Upgrade centec-arm64 platform to Bookworm. (#17411)
Why I did it
1. Upgrade centec-arm64 platform to Bookworm.
2. Solve the problem of compiling the docker-syncd-centec-rpc.gz error on the centec platform.

How I did it
1. Modified platform driver to comply with bookworm kernel.
2. Upgrade SONiC package versions of the centec platform.

How to verify it
1. Compile the centec-arm64 platform to generate sonic-centec-arm64.bin.
2. Compile the centec platform to generate docker-syncd-centec-rpc.gz.

Signed-off-by: centecqianj <qianj@centec.com>
2023-12-07 08:42:13 -08:00
dbarashinvd
000a2ef818
[Mellanox] Enable CMIS host management (#16846)
- Why I did it
Enable CMIS host management for Mellanox devices which are expected to support the feature

- How I did it
new thread in a new file and changing logic in platform code in chassis.py which is calling this thread from get_change_event()
this thread in the new file handles the state machine per port.
first the static detection takes place once the thread is up (during switch bootup sequence), until final decision if it's FW control or SW control module.
After it ends, the dynamic detection takes place, listening to changes in the sysfs fds, per port,
so it will be able to detect plug in or out events of a cable.

- How to verify it
Enhanced unit tests
run sonic mgmt on Nvidia SN4700 with CMIS host management enabled
2023-12-07 14:54:56 +02:00
Nazarii Hnydyn
1ff27db42f
[frr]: Force disable next hop group support. (#17344)
Signed-off-by: Nazarii Hnydyn nazariig@nvidia.com

Closes #17345

This W/A was proposed by Nvidia FRR team before the long term solution is ready.

Why I did it
A W/A to fix default route installation during LAG member flap
Work item tracking
N/A
How I did it
Disabled FRR next hop group support
How to verify it
Do LAG member flap
2023-12-06 11:09:54 +08:00
Junchao-Mellanox
26bf38b610
[Mellanox] Provide default implementation for sfp error description when CMIS host management is enabled (#17294)
- Why I did it
Provide a dummy implementation for SFP error description when CMIS host management is enabled. A future feature shall be raised to implement SFP error description for such mode.

- How I did it
if SFP is under software control, provide "Not supported" as error description
if SFP is under initialization, provide "Initializing" as error description

- How to verify it
unit test
2023-12-05 20:27:29 +02:00
Ashwin Hiranniah
ada7c6a72e
Add pensando platform (#15978)
This commit adds support for pensando asic called ELBA. ELBA is used in pci based cards and in smartswitches.

#### Why I did it
This commit introduces pensando platform which is based on ELBA ASIC.
##### Work item tracking
- Microsoft ADO **(number only)**:

#### How I did it
Created platform/pensando folder and created makefiles specific to pensando.
This mainly creates pensando docker (which OEM's need to download before building an image) which has all the userspace to initialize and use the DPU (ELBA ASIC).
Output of the build process creates two images which can be used from ONIE and goldfw.
Recommendation is use to use ONIE.
#### How to verify it
Load the SONiC image via ONIE or goldfw and make sure the interfaces are UP.

##### Description for the changelog
Add pensando platform support.
2023-12-04 14:41:52 -08:00
zitingguo-ms
ed8fa6a47e
Upgrade xgs SAI version to 8.4.31.0 (#17059)
Why I did it
Upgrade the xgs SAI version to 8.4.31.0 to include the following changes:

8.4.22.0: [SDK upgrade][CSP CS00012314723][SAI_BRANCH rel_ocp_sai_8_4] SID:bcmtmPfcDdrScan thread takes 100% CPU utilization
8.4.23.0: [SDK upgrade][CSP CS00012290176[SAI_BRANCH rel_ocp_sai_8_4] SDK-323160: bcm_l3_ecmp_member_add returns Table Full error while ISSU
8.4.24.0:
[SDK upgrade]Merge "[CSP NA][SAI_BRANCH rel_ocp_sai_8_4] SID: Software LinkScan Not Catching Short Local/Remote Fault Events" into hsdk_6.5.27_SAI_8.4.0_GA
[SDK upgrade][CSP NA][SAI_BRANCH rel_ocp_sai_8_4] SID: Software LinkScan Not Catching Short Local/Remote Fault Events
8.4.25.0: [SAI_BRANCH rel_ocp_sai_8_4]CLONE - SAI - 8.4 - _brcm_sai_cosq_stat_get errors for CPU queue 41
8.4.26.0: [CSP CS00012307911] Fixed incorrect CPU related SAI port obj encoding/decoding in most subsystems
8.4.27.0: [CSP CS00012309154] [TD3] SAI_STATUS_INVALID_PARAMETER on setting SAI_BUFFER_POOL_ATTR_SIZE, OA crash
8.4.28.0: [CSP CS00012315552] Excessive logging from _brcm_sai_acl_tbl_grp_mbr_migration
8.4.29.0: [CSP CS00012321369] Fix TH2 regression with MMU/pool size
8.4.30.0: [SDK upgrade][CSP CS00012316299][SAI_BRANCH rel_ocp_sai_8_4] L3 entry delete failed when SER error is present
8.4.31.0: [CSP CS00012307911] Revert and limit scope of previous change due to WB issue.
Work item tracking
Microsoft ADO (number only): 26021230
How I did it
Upgrade the SAI version in sai.mk file.

How to verify it
Run advanced reboot on TH2 and TD3:

https://dev.azure.com/mssonic/internal/_build/results?buildId=422024&view=results
https://dev.azure.com/mssonic/internal/_build/results?buildId=423352&view=results
@saiarcot895 run warm reboot from 202012 to target image and they've passed
TH2: https://dev.azure.com/mssonic/internal/_build/results?buildId=423112&view=logs&j=76acabad-01e9-5c52-6fe6-d396d63e85d2&t=0d14fb40-14d5-50ca-4a23-af1778140cbf
TH: https://dev.azure.com/mssonic/internal/_build/results?buildId=423119&view=logs&j=76acabad-01e9-5c52-6fe6-d396d63e85d2&t=0d14fb40-14d5-50ca-4a23-af1778140cbf
TD3: https://dev.azure.com/mssonic/internal/_build/results?buildId=423074&view=logs&j=76acabad-01e9-5c52-6fe6-d396d63e85d2&t=0d14fb40-14d5-50ca-4a23-af1778140cbf
2023-12-04 09:31:39 +08:00
centecqianj
8db3a99d11
[Bookworm] Upgrade centec platforms to Bookworm (#17364)
How I did it
Modified platform driver to comply with bookworm kernel.
Modified python build commands for building whl packages.

How to verify it
Verify whether all the platform bookworm debs are built.
make target/debs/bookworm/platform-modules-v682-48y8c-d_1.0_amd64.deb
Load the platform debian into the device and install it in bookworm image.
Verify the platform related CLI and the functionality

Signed-off-by: centecqianj <qianj@centec.com>
2023-12-01 16:07:52 -08:00
Pavan-Nokia
36c1b8ac51
[Nokia-7215][armhf] Enable Watchdog service (#16612)
Enable CPUWDT service to enable watchdog
2023-11-30 16:34:53 -08:00
Saikrishna Arcot
5d3e5c0094
Re-enable DNX module compilation (#17310)
Signed-off-by: Saikrishna Arcot <sarcot@microsoft.com>
2023-11-29 16:24:23 -08:00
Kebo Liu
bf4a2e3002
[Mellanox] Revert LPM implementation to the old way (#17096)
- Why I did it
The current low power mode setting implementation requests the user to set the port to admin down first before toggling LP mode, this is not backward compatible, now revert it to the old way so that the user can toggle the LP mode regardless of the port admin status.

- How I did it
Revert the recent changes related to LPM in PR #14130 and #16545

- How to verify it
Run all sfputil and SFP platform API related tests on all the Mellanox platforms.

Signed-off-by: Kebo Liu <kebol@nvidia.com>
2023-11-27 13:23:04 +02:00
Aravind Mani
c8fe65873f [Bookworm] Upgrade Dell platforms to Bookworm (#17003)
Why I did it
[Bookworm] Update platform-modules-dell for Bookworm #16735

How I did it
Modified platform driver to comply with bookworm kernel.
Removed MODULE_SUPPORTED_DEVICE wherever used.
Modified python build commands for building whl packages.

How to verify it
Verify whether all the platform bookworm debs are built.
make target/debs/bookworm/platform-modules-z9100_1.1_amd64.deb
Load the platform debian into the device and install it in bookworm image.
Verify the platform related CLI and the functionality
2023-11-21 18:53:15 -08:00
Vivek
b8548439b7 [mellanox] Update SAI to SAIBuild2311.25.0.36, SDK/FW to 4.6.2104/2012.2104 (#17131)
Why I did it
Update SDK/SAI and FW for Mellanox Platform

How I did it
Update SDK/FW to v4.6.2104/v2012.2104

Fixed Issues:

Some of the Warmboot related files which were created by SDK during switch create are now generated during pre shutdown flow
New Features:

Debian 12 and kernel 6.1 support
Update SAI

New Features:

Auto Fec Support
FDB entries are now restored after warmboot to prevent temporary system flooding.
Minor Enhancement and Bug Fix in integrate-mlnx-sdk

How to verify it
Build Image and run tests

Signed-off-by: Vivek Reddy <vkarri@nvidia.com>
2023-11-21 18:53:15 -08:00
Vivek
787dd7221d [Mellanox] Upgrade HW-MGMT to 7.0030.2008 and update platform-api (#17134)
Why I did it
Add platform support for Debian 12 (Bookworm) on Mellanox Platform

How I did it
Update hw-management to v7.0030.2008
Deprecate the sfp_count == module_count approach in favour of asic init completion
Ref: Mellanox/hw-mgmt@bf4f593
Add xxd package to base image which is required by hw-management scripts
Add the non-upstream flag into linux kernel cache options
Update the thermalctl logic based on new sysfs attributes
Fix the integrate-mlnx-hw-mgmt script to not populate the arm64 Kconfig
How to verify it
Build kernel and run platform tests

Signed-off-by: Vivek Reddy <vkarri@nvidia.com>
Co-authored-by: Junchao-Mellanox <junchao@nvidia.com>
Co-authored-by: Junchao-Mellanox <57339448+Junchao-Mellanox@users.noreply.github.com>
2023-11-21 18:53:15 -08:00
Pavan Naregundi
a4cfe25d3d [marvell-armhf]: Enable SDK module for bookworm (#17110)
* Enable SDK modules for Bookworm
* Update SAI deb to 1.13.0-1



Signed-off-by: Pavan Naregundi <pnaregundi@marvell.com>
2023-11-21 18:53:15 -08:00
Vivek
1fccc97ee7 [Nvidia] Fix mlnx-sai build failure (#14)
Signed-off-by: Vivek Reddy <vkarri@nvidia.com>

[Nvidia] Enable iproute2 & fix mft build (#16)

* Enable iproute2 as the SDK is also built

Signed-off-by: Vivek Reddy <vkarri@nvidia.com>

* [Nvidia] Dont use mkbmdeb method of dkms to build the package

Signed-off-by: Vivek Reddy <vkarri@nvidia.com>

* Added linux image to the Depends section of mft

Signed-off-by: Vivek Reddy <vkarri@nvidia.com>

[Nvidia] [Bookworm] Separate KERNEL_MFT into a new target (#16782)

* [Nvidia] Seperate KERNEL_MFT into a new target because of kernel header dependency

Signed-off-by: Vivek Reddy <vkarri@nvidia.com>

* Update linux-kernel submodule

Signed-off-by: Vivek Reddy <vkarri@nvidia.com>

* Fix paralell build problem

Signed-off-by: Vivek Reddy <vkarri@nvidia.com>

---------

Signed-off-by: Vivek Reddy <vkarri@nvidia.com>
2023-11-21 18:53:15 -08:00
Keshav Gupta
14b98705ed [marvell-arm64]: Enable SDK module for bookworm (#16909)
This patch enables compiling of Marvell platform
module and fixes sonic-platform-nokia compilation
issues for bookworm.
2023-11-21 18:53:15 -08:00
jostar-yang
6ed491db74 [Edgecore][sonic-platform-modules-accton]Support kernel 6.1 and bookworm (#16982)
* [Edgecore][sonic-platform-modules-accton]Support kernel 6.1 and bookworm

* Modify pddf drv code for i2c_remove_callback function fail
2023-11-21 18:53:15 -08:00
Ramasamy Chandramouli
c436ce20be [PR:16737, PR:16739] platform-modules: pddf, broadcom/cel: adapt for kernel 6.1 and bookworm (#16954)
* sonic-platform-modules-cel: broadcom: adapt for kernel 6.1 and bookworm

The i2c_driver->remove API declaration has been updated to return void instead
of int, as part of cleanup patches in 6.1. More details can be referred from
here: [1]. Update the remove API definition in the modules accordingly and
cleanup variables that go unused from the remove API.

Update python build commands for bookworm. The packaging based on calling
setup.py is deprecated and using build module/pip utility is the recommended
method for python packaging/installation. Further details can be referred to
from here: [2], [3]. The build module is picky about the package information file,
which needs to be either setup.py or pyproject.toml.

Additionally, fix formatting inconsistencies in debian/changelog reported by
`dh_installchangelogs` during the build.

Tested the changes by compiling the changes as below:

    make sonic-slave-bash NOBUSTER=1 NOBULLSEYE=1
    sudo dpkg -i target/debs/bookworm/linux-headers-6.1.0-11-2-*.deb
    cd platform/broadcom/sonic-platform-modules-cel
    KVERSION=6.1.0-11-2-amd64 dpkg-buildpackage

Also verified the python scripts under the sonic-platform-modules-cel with
pyflakes to ensure no new errors are flagged (with exception of unused modules).

References:
   [1] - https://github.com/torvalds/linux/commit/ed5c2f5f
   [2] - https://blog.ganssle.io/articles/2021/10/setup-py-deprecated.htm
   [3] - 0b20a4863 (Update Python build commands for Bookworm, 2023-09-07)

Signed-off-by: Ramasamy Chandramouli <rachandr@celestica.com>

* platform/pddf: i2c: adapt for kernel 6.1 and bookworm

   * Fixup i2c_driver->remove API due to changes in the function
     prototype (ref: [1]).

   * Cleanup `MODULE_SUPPORTED_DEVICE` macros that were cleaned up in
     the upstream (ref: [2]).

   * Sanitize python packaging and installation using the `build` module
   instead of calling the setup.py directly (ref: [3]. [4]).

Tested the changes by compiling pddf module as below:

     make sonic-slave-bash NOBUSTER=1 NOBULLSEYE=1
     sudo dpkg -i target/debs/bookworm/linux-headers-6.1.0-11-2-*.deb
     cd platform/pddf/i2c
     KVERSION=6.1.0-11-2-amd64 dpkg-buildpackage

References:
    [1] - https://github.com/torvalds/linux/commit/ed5c2f5f
    [2] - https://github.com/torvalds/linux/commit/6417f031
    [2] - https://blog.ganssle.io/articles/2021/10/setup-py-deprecated.htm
    [3] - 0b20a4863 (Update Python build commands for Bookworm, 2023-09-07)

Signed-off-by: Ramasamy Chandramouli <rachandr@celestica.com>

* platform/broadcom: include platform-modules-cel in builds

With pddf modules patched for 6.1, platform-modules-cel can be compiled
and included in the final image.

Testing by building sonic-broadcom.bin/sonic-broadcom-dnx.bin.

Signed-off-by: Ramasamy Chandramouli <rachandr@celestica.com>

* pddf/i2c: revert correct rootdir for pip install

The pip install directory has been set to test-pkg1/ for testing the build and
incorrectly retained as is. Revert this to the correct path $(PACKAGE_PRE_NAME).

Signed-off-by: Ramasamy Chandramouli <rachandr@celestica.com>

* platform/broadcom: include pddf/modules-cel in the base package

Without this change, the modules were built but not packaged in the final .bin.

The final sonic-broadcom.bin has been tested for bootup on Celestica's
Silverstone platform.

   admin@sonic:~$ uname -a
   Linux sonic 6.1.0-11-2-amd64 #1 SMP PREEMPT_DYNAMIC Debian 6.1.38-4 (2023-08-08) x86_64 GNU/Linux
   admin@sonic:~$ show platform summary
   Platform: x86_64-cel_silverstone-r0
   HwSKU: Silverstone
   ASIC: broadcom
   ASIC Count: 1
   Serial Number: R4009B2F062504LK200024
   Model Number: N/A
   Hardware Revision: N/A
   admin@sonic:~$ show version | head

   SONiC Software Version: SONiC.g0aad6c67c-rachandr
   SONiC OS Version: 12
   Distribution: Debian 12.2
   Kernel: 6.1.0-11-2-amd64
   Build commit: 0aad6c67c
   Build date: Thu Oct 26 07:13:47 UTC 2023
   Built by: rachandr@AZUHPS14

   Platform: x86_64-cel_silverstone-r0

Signed-off-by: Ramasamy Chandramouli <rachandr@celestica.com>

---------

Signed-off-by: Ramasamy Chandramouli <rachandr@celestica.com>
2023-11-21 18:53:15 -08:00
Saikrishna Arcot
79b8e2eec5 Disable several platform modules for Bookworm
Signed-off-by: Saikrishna Arcot <sarcot@microsoft.com>
2023-11-21 18:53:15 -08:00
Saikrishna Arcot
0909c671c6 Update sonic-linux-kernel to use 6.1.38
Signed-off-by: Saikrishna Arcot <sarcot@microsoft.com>
2023-11-21 18:53:15 -08:00
jfeng-arista
6dfaf5e293
[sonic-vs]: Add fabric port data for vs test, and start fabricmgrd in vs environment (#16791)
Add fabric port data for vs test, and start fabricmgrd in vs environment.

This PR depends on sonic-net/sonic-sairedis#1301

sonic-net/sonic-swss#2920 needs this one merge first.
2023-11-20 16:21:03 -08:00
Pavan Naregundi
307e39bde4
[Marvell-arm64] Add platform support for rd98DX35xx (#16874)
* [Marvell-arm64] Add platform support for rd98DX35xx

This change adds following two variants of rd98DX35xx board to arm64
build.

Board with CPU integrated into the 98DX35xx switching chip:

 Platform: arm64-marvell_rd98DX35xx-r0
 HwSKU: rd98DX35xx
 ASIC: marvell
 Port Config: 32x1G + 16x2.5G + 6x25G

Board with external CN9131 CPU connected over PCI to 98DX35xx
switching chip:

 Platform: arm64-marvell_rd98DX35xx_cn9131-r0
 HwSKU: rd98DX35xx_cn9131
 ASIC: marvell
 Port Config: 32x1G + 16x2.5G + 6x25G

Change-Id: I21dc9fe972417daaabb20a5bddf7779d72b7972e
Signed-off-by: Pavan Naregundi <pnaregundi@marvell.com>

* Add HWSKU for rd98DX35xx and rd98DX35xx_cn9131

This patch adds new HWSKU's for Marvell arm64 platforms rd98DX35xx
and rd98DX35xx_cn9131.

Change-Id: Id7c14f49f0e304335cc4ca73dcae52362c49d231
Signed-off-by: Pavan Naregundi <pnaregundi@marvell.com>

---------

Signed-off-by: Pavan Naregundi <pnaregundi@marvell.com>
2023-11-20 09:43:02 -08:00
Stephen Sun
b93852d53d
[Mellanox] Support running hw-management service on MSN4700 emulation platform (#16584)
- Why I did it
Support running hw-management service on MSN4700 emulation platform.

- How I did it
Use physical EEPROM instead of the fake one
Do not skip PSUd, PCId, thermal control daemon
Adjust PCIe and thermal configuration files
Adjust platform.json for different chassis names and thermals
Remove a patch to hw-management in order to enable it

- How to verify it
Run Nvidia simulation on SN4700 (ASIC and Platform)

Signed-off-by: Stephen Sun <stephens@nvidia.com>
2023-11-19 11:03:46 +02:00