Commit Graph

1715 Commits

Author SHA1 Message Date
Junchao-Mellanox
bc300b4d79 [Mellanox] add more log while doing sysfs reading (#11556)
- Why I did it
Add more log while doing sysfs reading to increase the debug capability

- How I did it
Log the relevant file path and error number while sysfs reading return None

- How to verify it
Manual test
2022-08-08 20:44:18 +00:00
Santhosh Kumar T
79e014efcb
[DellEMC] S6100 Platform Service optimization porting in 202205 (#11329)
To reduce rc.local script execution time. Porting changes from [DellEMC] S6100 Platform Service optimization #10989
Changes:
Moving platform-modules-s6100.service and s6100-lpc-monitor.service asynchronous to rc.local script.
2022-08-02 09:55:49 -07:00
Junchao-Mellanox
b7e3db4ef1 [Mellanox] Fix issue: failed to decode Json while there is no hwsku.json (#11436)
- Why I did it
Fix bug: pmon report error on start up because some SKUs do not have hwsku.json

- How I did it
If hwsku.json, do not extract RJ45 port information

- How to verify it
Manual test.
Unit test.
2022-07-29 04:49:32 +00:00
Stephen Sun
e317af0e9a Fix chassis test issue (#11460)
Signed-off-by: Stephen Sun <stephens@nvidia.com>
2022-07-28 20:36:44 +00:00
Stephen Sun
94df2c4b86 [Mellanox] Support new platform API get_port_or_cage_type for RJ45 ports (#11336)
- Why I did it
Support get_port_or_cage_type for RJ45 ports

- How I did it
Implement the new platform API get_port_or_cage_type
Fix the issue: unable to import SFP when chassis object is destructed

- How to verify it
Manually test and regression test

Signed-off-by: Stephen Sun <stephens@nvidia.com>
2022-07-28 20:31:20 +00:00
andywongarista
f377636747 Add gbsyncd container for broncos (#11154)
* Add docker-gbsyncd-broncos support
* Address review comments
* Add socket to gbsyncd
* Upgrade gbsyncd-broncos to bullseye
2022-07-28 20:27:21 +00:00
Kebo Liu
2f59460fc4 [Mellanox] Enhance Platform API to support SN2201 - RJ45 ports and new components mgmt. (#10377)
* Support new platform SN2201 and RJ45 port

Signed-off-by: Kebo Liu <kebol@nvidia.com>

* remove unused import and redundant function

Signed-off-by: Kebo Liu <kebol@nvidia.com>

* fix error introduced by rebase

Signed-off-by: Kebo Liu <kebol@nvidia.com>

* Revert the special handling of RJ45 ports (#56)

* Revert the special handling of RJ45 ports

sfp.py
sfp_event.py
chassis.py

Signed-off-by: Stephen Sun <stephens@nvidia.com>

* Remove deadcode

Signed-off-by: Stephen Sun <stephens@nvidia.com>

* Support CPLD update for SN2201

A new class is introduced, deriving from ComponentCPLD and overloading _install_firmware
Change _install_firmware from private (starting with __) to protected, making it overloadable

Signed-off-by: Stephen Sun <stephens@nvidia.com>

* Initialize component BIOS/CPLD

Signed-off-by: Stephen Sun <stephens@nvidia.com>

* Remove swb_amb which doesn't on DVT board any more

Signed-off-by: Stephen Sun <stephens@nvidia.com>

* Remove the unexisted sensor - switch board ambient - from platform.json

Signed-off-by: Stephen Sun <stephens@nvidia.com>

* Do not report error on receiving unknown status on RJ45 ports

Translate it to disconnect for RJ45 ports
Report error for xSFP ports

Signed-off-by: Stephen Sun <stephens@nvidia.com>

* Add reinit for RJ45 to avoid exception

Signed-off-by: Stephen Sun <stephens@nvidia.com>

Co-authored-by: Stephen Sun <5379172+stephenxs@users.noreply.github.com>
Co-authored-by: Stephen Sun <stephens@nvidia.com>
2022-07-28 20:24:49 +00:00
Samuel Angebault
8ae03c994d [Arista] Update platform library (#10922)
- Implement Pcie plugin for chassis
- Implement set_admin_status for chassis modules
- Fix phy declaration for phy-credo
2022-07-22 22:15:34 +00:00
zitingguo-ms
e13df585ee [bcm sai]upgrade Broadcom SAI to 7.1.0.0-6 (#11410)
- Default Not to report Single bit ECC correctable events to avoid the need to set SOC porperties.

Signed-off-by: zitingguo <zitingguo@microsoft.com>
2022-07-22 22:14:28 +00:00
pavannaregundi
fea173d54e [Marvell-armhf] Fix kernel hang due to kernel upgrade to 5.10.103 (#11298)
Give more room for the kernel image in memory

Change-Id: I015856d173d50d94e30d8c555590efb70eb712ae
Signed-off-by: Pavan Naregundi <pnaregundi@marvell.com>
2022-07-07 21:20:20 +00:00
Nazarii Hnydyn
a4da090e5a
[Mellanox] Update SAI to 1.21.2.0 (#11360)
- Why I did it
Advance to new SAI version for bugs fixes as well as new features/enhacements:

New:
1. ARM64 support
2. FG ECMP performance optimization
3. Support setting empty list for port ingress/egress buffer profile list
4. Add service port for SN5600
5. Add CR8/SR8/LR8/KR8 interface type
6. Disable mlxtrace during debug dump

Fixes:
1. Fix SAI_ACL_ENTRY_ATTR_FIELD_TC
2. Fix Packets loop back if no member in portchannel
3. Fix optimize descriptors apply time (and fast boot time)
4. Add flush fdb entries for vxlan tunnel bridge port
5. Don't disable used tunnel underlay interfaces

- How I did it
Advanced SAI submodule

- How to verify it
make configure PLATFORM=mellanox
make target/sonic-mellanox.bin

Signed-off-by: Nazarii Hnydyn <nazariig@nvidia.com>
2022-07-07 09:15:41 +03:00
Sudharsan Dhamal Gopalarathnam
119789719d
[202205][Mellanox]Check dmi file permission before access (#11346)
* [Mellanox]Check dmi file permission before access (#11309)

Signed-off-by: Sudharsan Dhamal Gopalarathnam sudharsand@nvidia.com

Why I did it
During the system boot up when 'show platform status' or 'show version' command is executed before STATE_DB CHASSIS_INFO table is populated, the show will try to fallback to use the platform API. The DMI file in mellanox platforms require root permission for access. So if the show commands are executed as admin or any other user, the following error log will appear in the syslog

Jun 28 17:21:25.612123 sonic ERR show: Fail to decode DMI /sys/firmware/dmi/entries/2-0/raw due to PermissionError(13, 'Permission denied')

How I did it
Check the file permission before accessing it.

How to verify it
Added UT to verify. Manually verified if the error log is not thrown.
2022-07-06 16:13:07 -07:00
Alexander Allen
f99d272669 [Mellanox] Add arch folder to SDK binary location (#11278)
- Why I did it
This is for the eventual support of multiple architectures for the mellanox platform.

- How I did it
Change the location of the binaries in Switch-SDK-drivers so that the path specifies the target architecture in addition to the target distribution that the debians are built for.

This is the most straightforward way to separate binaries built against different architectures and selectively target them for installation in the mellanox SONiC image.

- How to verify it
Build SONiC for mellanox and verify it compiles successfully.
2022-07-06 01:52:33 +00:00
Ying Xie
296b21ec40 Revert "[Mellanox]Check dmi file permission before access (#11309)"
This reverts commit 187f351b23.
2022-07-06 00:06:45 +00:00
Sudharsan Dhamal Gopalarathnam
187f351b23 [Mellanox]Check dmi file permission before access (#11309)
Signed-off-by: Sudharsan Dhamal Gopalarathnam sudharsand@nvidia.com

Why I did it
During the system boot up when 'show platform status' or 'show version' command is executed before STATE_DB CHASSIS_INFO table is populated, the show will try to fallback to use the platform API. The DMI file in mellanox platforms require root permission for access. So if the show commands are executed as admin or any other user, the following error log will appear in the syslog

Jun 28 17:21:25.612123 sonic ERR show: Fail to decode DMI /sys/firmware/dmi/entries/2-0/raw due to PermissionError(13, 'Permission denied')

How I did it
Check the file permission before accessing it.

How to verify it
Added UT to verify. Manually verified if the error log is not thrown.
2022-07-05 16:12:31 +00:00
Rajkumar-Marvell
0bc054e308 [Marvell] Update armhf sai deb (#11296)
1) Migrate SAI to 1.10.2
2) MRVL-SAI fixes from 202012 branch

Signed-off-by: rajkumar38 <rpennadamram@marvell.com>
2022-07-05 16:11:39 +00:00
Volodymyr Samotiy
421b659161 [Mellanox] Update SDK/FW to 4.5.2262/xx.2010.2262 (#10882)
- Why I did it
To include latest fixes:
1. Warmboot | When trying to reconfigure the Flex Parser header and Flex transition parameters after ISSU, the switch will returned an error even if the configuration was identical to that done before performing the ISSU.
2. Link Up | When toggling many ports of the Spectrum devices while raising 10GbE link up and link maintenance is enabled, the switch may get stuck and may need to be rebooted.
3. Shared buffer | While moving from lossless to lossy while shared headroom was used, reduction of the shared headroom can only be done prior to pool type change and when shared headroom is not utilized.

- How I did it
Updated SDK submodule along with the relevant Makefiles

- How to verify it
Build an image and run tests from "sonic-mgmt".

Signed-off-by: Volodymyr Samotiy <volodymyrs@nvidia.com>
2022-07-05 16:10:23 +00:00
Yakiv Huryk
97cb613729 [asan] add print_suppressions=0 to ASAN configs (#11252)
- Why I did it
To provide an ability to suppress ASAN false positives and have a clean ASAN report for docker-sonic-vs/mlnx-syncd/orchagent docker

- How I did it
Added the "print_suppressions=0" to ASAN configs.

- How to verify it
add a suppression to some ASAN-enabled component (the suppression should catch some leak)
build with ENABLE_ASAN=y
run a test and see that the ASAN report is empty instead of having the suppression summary

Signed-off-by: Yakiv Huryk <yhuryk@nvidia.com>
2022-06-28 16:11:05 +00:00
Yakiv Huryk
75f73899e1 [vs][asan] add /var/log/asan to ASAN-enabled docker-sonic-vs image (#11059)
To ensure that ASAN logs are always generated. Currently, the way to get the logs is to map the "/var/log/asan" outside of a container, which doesn't work for DVS test run with "--imgname" option.

Signed-off-by: Yakiv Huryk <yhuryk@nvidia.com>
2022-06-24 05:14:33 +00:00
saksarav-nokia
1976e55010 Update platform/broadcom/sonic-platform-modules-nokia (#11107) 2022-06-22 23:09:01 +00:00
Ying Xie
36b54da653
[brcm docker build] remove extra line (#11182)
Signed-off-by: Ying Xie <ying.xie@microsoft.com>
2022-06-17 07:51:35 -07:00
Ying Xie
95dc2e23ff
[202205][BRCM_SAI] update Brcm SAI dependencies (#11173)
Signed-off-by: Ying Xie <ying.xie@microsoft.com>
2022-06-17 05:02:00 -07:00
Ying Xie
9329c4b987
[202205][bcm sai] upgrade Broadcom SAI to 7.1.0.0-5 (#11159)
* [bcm sai] upgrade Broadcom SAI to 7.1.0.0-5

- Enable Microsoft AN/LT patch

Signed-off-by: Ying Xie <ying.xie@microsoft.com>
2022-06-16 16:00:17 -07:00
Jon Goldberg
b2685736e0 [installer]: fix armhf for installer.conf usage (#11121)
This fixes the build for armhf to be able to use '/device///installer.conf' files. Specifically, armhf needs support to be able to change the size of /var/log/ directory. It is hardcoded to 512 bytes on all armhf platforms currently. This change will allow any armhf platform to be able to use an installer.conf file to customize the installed image.
2022-06-16 02:12:59 +00:00
Junchao-Mellanox
00d04dcb5f [Mellanox] optimize platform API import time (#10815)
- Why I did it
"import sonic_platform" takes about 600ms ~ 1000ms, it is kind of slow. After this optimization, the time is about 100ms. The benefit is that those CLIs which does not need the slow import sentence would be faster than before.

- How I did it
Find slow import and call them when need.

- How to verify it
Measure the import time.
2022-06-09 16:50:12 +00:00
xumia
043656dfe8 Support symcrypt fips config for aboot/uboot (#10729)
Why I did it
Support symcrypt fips config for aboot/uboot
2022-06-05 15:20:20 +00:00
Ying Xie
ea3df2a21a
[platform build] fix platform ycabled build (#11020)
* remove python2 wheel for sonic-platform-common

Signed-off-by: vaibhav-dahiya <vdahiya@microsoft.com>
2022-06-04 09:43:05 -07:00
Yakiv Huryk
7306d68411
[build][asan] make dpkg cache asan-aware (#10750)
Currently, the build with ASAN_ENABLE=y reuses the packages built with
ASAN_ENABLE=n (and vice versa). To address this issue, ASAN_ENABLE is added to DEP_FLAGS for asan-enabled packages (docker-syncd-mlnx, syncd, docker-orchagent, swss).

- Why I did it
To make dpkg cache use/rebuild the packages for ASAN_ENABLE=y/n.

- How I did it
Added ASAN_ENABLE to the DEP_FLAGS for asan-enabled packages.

- How to verify it
Built with ASAN_ENABLE=y/n and checked the .flags .log files.

Signed-off-by: Yakiv Huryk <yhuryk@nvidia.com>
2022-05-31 11:15:44 +03:00
Yakiv Huryk
bd91b2eef3
[asan] add debug package for asan-enabled containers (#10953)
This is to improve the readability of ASAN reports. The debug package adds function names and source code references to the backtrace (currently, there are only binary addresses of functions)

Another way to address this issue is to build the image with "INSTALL_DEBUG_TOOLS=y". The downside of this approach is that the image size and compilation time are unnecessarily big. Also, the idea is to make the "ENABLE_ASAN" self-sufficient, which would not be the case for this approach.

- Why I did it
To improve the readability of asan logs.

- How I did it
Added SYNCD_DBG and SWSS_DBG to corresponding docker images for ASAN_ENABLE=y build

- How to verify it
Add artificial memory leak
Build with ASAN_ENABLE=y
Test the image and check the ASAN report

Signed-off-by: Yakiv Huryk <yhuryk@nvidia.com>
2022-05-31 09:24:18 +03:00
Eric Zhu
8c1ded61b0
[SONiC-CEL]: fix platform fancontrol testcase failure issue (#10934) 2022-05-31 10:54:55 +08:00
Andriy Yurkiv
29043ff026
[MFT] Update MFT version to MFT 4.20.0-34 (#10933)
- Why I did it
Update MFT to newer version

- How I did it
Update MFT_VERSION in platform/mellanox/mft.mk

- How to verify it
Check version via dpkg -l | grep mft

Signed-off-by: Andriy Yurkiv <ayurkiv@nvidia.com>
2022-05-28 15:45:02 +03:00
Alexander Allen
71c868f56a
Upgrade libasan to version 6 in docker-syncd-mlnx to align with bullseye libasan (#10886) 2022-05-27 11:28:01 -07:00
jerseyang
c92bfe0728
Add belgite support (#9511)
Why I did it
add celestica belgite platform

How I did it
add belgite platform in celestica

Co-authored-by: nicwu-cel <nicwu@celestica.com>
Co-authored-by: anjian <anjian@celestica.com>
Co-authored-by: sandycelestica <sandyli@celestica.com>
2022-05-23 18:45:37 -07:00
Samuel Angebault
70e2727b02
[Arista] Update platform submodules (#10800)
This update has following changes
Refactor pci topology logic for chassis (fixes some chassis commands and chassisd on linecard)
Introduce new cooling algorithm
Fix linecard poweroff logic when supervisor is going down
Fix linecard status led leading to system-health crashing
Misc fixes
2022-05-23 13:28:13 -07:00
vmittal-msft
f7882b3885
Fix for libsaithrift build for BRCM image (#10852)
Updated libsaibcm to fix libsaithrift compile issue on BRCM image
2022-05-19 11:15:34 -07:00
Andriy Yurkiv
70d71f99f5
[Mellanox] Credo Y-cable | add more log info, checks, fix exception message (#10779)
- Why I did it
Script fails when there is an exception while reading

- How I did it
Add more logs and checks. Fix wrong variable naming and messages.

- How to verify it
Provoke exception while read_eeprom() and check that it is handled properly
2022-05-19 17:36:02 +03:00
Arun Saravanan Balachandran
942bef4475
DellEMC: S6100, Z9332f - Include ONIE version in 'show platform firmware status' (#10493)
Why I did it
To include ONIE version in show platform firmware status command output in DellEMC S6100 and Z9332f platforms.

How I did it
Include ‘ONIE’ in the list of components provided by platform APIs in DellEMC S6100 and Z9332f.
Unmount ONIE-BOOT if mounted using fast/soft/warm-reboot plugins in DellEMC S6100.
2022-05-12 09:24:06 -07:00
Saikrishna Arcot
949e76a00f
Update Linux kernel from 5.10.46 to 5.10.103 (#10634)
Signed-off-by: Saikrishna Arcot <sarcot@microsoft.com>
2022-05-10 13:46:31 -07:00
Alexander Allen
d202bf26d7
Upgrade mellanox platform containers (syncd / saiserver / syncd-rpc) and pmon to bullseye (#10580)
Fixes #9279

- Why I did it
Part of larger effort to move all SONiC systems to bullseye

- How I did it
1. Update container makefiles with correct dependencies
2. Update container Dockerfile with correct base image
3. Update container Dockerfile with correct apt dependencies
4. Update any other makefiles with dependencies to remove python2 support
5. Minor changes to support bullseye / python3

- How to verify it
Run regression on the switch:
1. Verify PTF community tests work
2. Verify syncd runs and all ports come up / pass traffic
3. Verify all platform tests succeed
2022-05-10 12:45:28 +03:00
FuzailBrcm
f579f61e4c
Fix for Accton platform build failure when doing incremental build (#10541) 2022-05-09 12:17:38 -07:00
FuzailBrcm
0ed671c9af
Fixing some python errors in the common PDDF platform classes (#10669) 2022-05-09 10:49:25 -07:00
vmittal-msft
9ae17e66a3
[sonic-sairedis update] Support for SAI header v1.10.2 with BRCM SAI v7.1.0.0 and MLNX SAI v1.21.1.0 (#10583) 2022-05-05 20:27:29 -07:00
Alexander Allen
53e5fe6a93
[Mellanox] Upgrade mellanox SDK to 4.5.1500 and mlnx-sai to 1.21.1.1 (#10675)
Update SDK/FW to 4.5.1500/2010.1500 and SAI version to 1.21.1.1

SDK/FW features:
1. Added support for Finisar DR4 (FTCD4523E2PCM) on Spectrum-2 and Spectrum-3 systems.

SAI Features:
1. ECMP overlay support for IPv6
2. BFD offloading / 4K scale
3. Host interface user traps + improved trap registration (table entry)
4. gcc11 compilation fixes
5. Read support for ACL redirect action
6. Optimize ECMP DB size
7. Buffer descriptors new defaults
8. Updated port mapping for SN2201

SAI Fixes:
1. Debug counter removal when configured with all drop reasons

- Why I did it
Upgrade Mellanox SDK and SAI versions to latest

- How I did it
Updated submodule pointers

- How to verify it
Regression tested
2022-04-29 20:50:59 +03:00
Kalimuthu-Velappan
bc30528341
Parallel building of sonic dockers using native dockerd(dood). (#10352)
Currently, the build dockers are created as a user dockers(docker-base-stretch-<user>, etc) that are
specific to each user. But the sonic dockers (docker-database, docker-swss, etc) are
created with a fixed docker name and common to all the users.

    docker-database:latest
    docker-swss:latest

When multiple builds are triggered on the same build server that creates parallel building issue because
all the build jobs are trying to create the same docker with latest tag.
This happens only when sonic dockers are built using native host dockerd for sonic docker image creation.

This patch creates all sonic dockers as user sonic dockers and then, while
saving and loading the user sonic dockers, it rename the user sonic
dockers into correct sonic dockers with tag as latest.

	docker-database:latest <== SAVE/LOAD ==> docker-database-<user>:tag

The user sonic docker names are derived from 'DOCKER_USERNAME and DOCKER_USERTAG' make env
variable and using Jinja template, it replaces the FROM docker name with correct user sonic docker name for
loading and saving the docker image.
2022-04-28 08:39:37 +08:00
Zhaohui Sun
cc30771f6b
Add python3 virtual environment for docker-ptf (#10599)
Why I did it
Migrate ptftests script to python3, in order to do an incremental migration, add python virtual environment firstly, install all required python packages in virtual env as well.
Then migrate ptftests scripts from python2 to python3 one by one avoid impacting non-changed scripts.

Signed-off-by: Zhaohui Sun zhaohuisun@microsoft.com

How I did it
Add python3 virtual environment for docker-ptf.
Add submodule ptf-py3 and install patched ptf 0.9.3 into virtual environment as well, two ptf issues were reported here:
p4lang/ptf#173
p4lang/ptf#174

Signed-off-by: Zhaohui Sun <zhaohuisun@microsoft.com>
2022-04-26 09:13:26 +08:00
Stephen Sun
9237950a0c
Align threshold mode of zero buffer profile of egress_lossless_pool (#10627)
On vs platform, egress_lossless_pool's mode is static.
So the corresponding profile should be of static_th as well.

Signed-off-by: Stephen Sun <stephens@nvidia.com>
2022-04-25 21:56:05 +03:00
Eric Zhu
869ac1d1f2
sonic-platform-modules-cel dx010: speed up dx010 platform init script (#10313)
* Optimize dx010 sonic platform init script to speed up init process
* Merge issue #10152: [warm-upgrade][202012] Slow Celestica platform init
in rc.local causes lacp-teardown fix into master branch

Signed-off-by: Eric Zhu <erzhu@celestica.com>
2022-04-22 20:36:17 +08:00
Yakiv Huryk
3abf383d3d
[asan] add address sanitizer support to docker-sonic-vs (#10470)
- Why I did it
To support docker-sonic-vs image with ASAN.

- How I did it
1. Made the supervisord.conf a template
2. Added the 'log_path' environment variable for ASAN-enabled daemons
3. Added supervisord.conf.j2 generation and ASAN lib to the docker-sonic-vs/Dockerfile.j2

- How to verify it
1. Made a build with ENABLE_ASAN=y
2. Run the tests, checked ASAN reports

Signed-off-by: Yakiv Huryk <yhuryk@nvidia.com>
2022-04-22 11:07:07 +03:00
Junchao-Mellanox
af5e5c4c94
[Mellanox] Adjust PSU voltage WA (#10619)
- Why I did it
InvalidPsuVolWA.run might raise exception if user power off PSU when it is running. This exception is not caught and will be raised to psud which causes psud failed to update PSU data to DB.

- How I did it
1. Change the log level when WA does not work. This could happen when user power off PSU, hence changing the log level from error to warning is better
2. Change the wait time from 5 to 1 to avoid introduce too much delay in psud. 1 second is usually enough per my test
3. Give a default return value for function get_voltage_low_threshold and get_voltage_high_threshold to avoid exception reach to psud

- How to verify it
Manual test.
Run sonic-mgmt regression
2022-04-22 11:02:30 +03:00
Samuel Angebault
ea38864235
[Arista] Update platform submodules (#10561)
Update PikeZ platform definition
Improve powercycle behavior on chassis
2022-04-21 13:14:59 -07:00