Commit Graph

1876 Commits

Author SHA1 Message Date
Richard.Yu
bca955f527
[SAIServer]Upgrade SAI server init script (#13175)
Why I did it
why
In order to apply different config across different platform, and use the code with a unified format, reuse syncd init script to init saiserver.

How I did it
how
Reuse syncd init script

How to verify it
Test
Test in DUT s6000 and dx010 with sonic 202205
2023-01-01 13:47:44 +08:00
Richard.Yu
f542aec319
[202205][Submodule][SAI-Redis]Advance SAI Redis head pointer (#13156)
Why I did it
Advance SAI Redis head pointer

How I did it
changes:

sonic-net/sonic-sairedis@cf679e7
sonic-net/sonic-sairedis@8d6688e
[202205][Submodule][SAI]Advance SAI head pointer sonic-sairedis#1185 sonic-net/sonic-sairedis@66f2961
remove parameter --skip_error, which removed from [202205][Submodule][SAI]Advance SAI head pointer sonic-sairedis#1185
How to verify it
local image build
2022-12-25 10:17:34 +08:00
Kebo Liu
a8376ef109
[202205][Mellanox] change the implementation of is_host() to fix a stuck issue on simx platform (#13101)
This PR is to backport #13100 to the 202205 branch since can not be cleanly cherry-picked.

Following code to judge whether a process is running inside a docker could get stuck on the simx platform

subprocess.Popen(["docker", "--version"],
                                stdout=subprocess.PIPE,
                                stderr=subprocess.STDOUT,
                                universal_newlines=True)
When it gets stuck, the config-chassisdb service can not be successfully started, thus the system can not be booted up.

root@sonic:/# service config-chassisdb status
     config-chassisdb.service - Config chassis_db
     Loaded: loaded (/lib/systemd/system/config-chassisdb.service; enabled; vendor preset: enabled)
     Active: activating (start) since Thu 2022-12-15 09:23:02 UTC; 29min ago
   Main PID: 571 (config-chassisd)
      Tasks: 14 (limit: 9501)
     Memory: 132.4M
     CGroup: /system.slice/config-chassisdb.service
                        ├─571 /bin/bash /usr/bin/config-chassisdb
			├─575 /usr/bin/python3 /usr/local/bin/sonic-cfggen -H -v DEVICE_METADATA.localhost.platform
			├─602 /bin/sh -c sudo decode-syseeprom -m
			├─603 sudo decode-syseeprom -m
			├─607 /usr/bin/python3 /usr/local/bin/decode-syseeprom -m
			├─616 /bin/sh -c docker --version 2>/dev/null
			└─617 docker --version
- How I did it
Use an alternative way to implement this function and issue can be avoided:

docker_env_file = '/.dockerenv'
return os.path.exists(docker_env_file) is False

- How to verify it
run regression on real hardware and simx platform.
2022-12-20 09:59:24 +02:00
Vadym Hlushko
2dff179f2c
[SFP] Change logging severity when failed to read EEPROM (#13010)
Why I did it
In order to prevent the sonic-mgmt/tests/platform_tests/sfp/test_sfputil.py test failing on the log analyzer step.

The mentioned test is performing the sfputil reset EthernetX for every interface on the SONiC switch, this action will flap the SFP device status (INSTERTED -> REMOVED -> INSTERTED).

The SONiC XCVRD daemon will catch this SFP device status change (because it is monitoring the presence status of the cable).
To judge the cable presence status, currently, we are still leveraging to read the first bytes of the EEPROM, and the EEPROM could be not ready at some moment and the SONiC XCVRD daemon will print the error log to Syslog:

ERR pmon#xcvrd: Error! Unable to read data for 'xx' port, page 'xx' offset 128, rc = 1, err msg: Sending access register
How I did it
Change logging severity from ERR to WARNING

Signed-off-by: Vadym Hlushko <vadymh@nvidia.com>
2022-12-19 09:21:21 -08:00
zitingguo-ms
0dcc7e4651
update SAI version to 7.1.28.4 (#13072)
Why I did it
To bring the following fixes:

Applied customer patch for limited port breakout support in rel_ocp_sai_7_1.
Revert "Merged PR 7224850: SAI7.1_DNX: Temp workaround for Nexthop Group Scale Issue(CS00012251649)".
backport SONIC-67662 to SAI7.1:JR2C ECMP partition for NHgroup members.
How I did it
Updated SAI code with the fixes above.

How to verify it
Run the SONiC and SAI test with the SAI pipeline.
2022-12-16 19:07:31 +08:00
Rajkumar-Marvell
bc70978540
[Marvell] Update armhf sai debian (#13037)
Why I did it
Update Marvell armhf sai debian version. Includes supports for,
Autoneg
Everflow

How to verify it
Compile and load SONiC armhf bin

Signed-off-by: rajkumar38 <rpennadamram@marvell.com>
2022-12-14 12:10:09 +08:00
mssonicbld
1a34cff029
[sflow]: Unblocked psample_*() function calls in BRCM ESW platforms for proper functionality of sflow feature (#12918) (#13001) 2022-12-09 03:40:48 +08:00
Samuel Angebault
7c948a3b42
[202205][Arista] Update platform library submodules (#12968)
- add reboot cause support for linecards
- add back a Wolverine variant removed by mistake
- misc fixes and improvements
2022-12-07 14:59:29 -08:00
Marty Y. Lok
3d9dcf5294 [armhf][sonic-installer] Fix issue of the sonic-installer install a image after sonic-installer clean (#12609)
Signed-off-by: mlok <marty.lok@nokia.com>

Signed-off-by: mlok <marty.lok@nokia.com>
2022-12-08 02:37:52 +08:00
Ikki Zhu
561bf2e076 [Platform/Seastone]: fix syseeprom tlv read issue (#12200)
Why I did it
Fix Seastone syseeprom tlv header read incorrect issue

How I did it
Set mux idle_state

How to verify it
i2cdump -y -f 12 0x50 i
2022-12-07 12:38:37 +08:00
Santhosh Kumar T
dad23df0c3 [DellEMC] Master: S6100: SSD upgrade status: Moving from smartctl to iSMART (#12784)
Why I did it
smartctl tool is available only in PMON docker. Hence, the tool may be not accessible incase PMON docker goes down.
Using iSMART_64 tool to fetch the SSD firmware version and device model information.

How I did it
Replacing smartctl with iSMART_64.
2022-12-07 12:38:36 +08:00
Marty Y. Lok
f187febcca [Nokia]Update Nokia platform submodule for Nokia-IXR7250E platform (#12876)
1d53bf4 Skip platform NDK health check two times in watchdog.sh
d68297c Added code to shutdown the channel after the grpc call also fixed the show fp-status command
0769efe Impelemented the module API to return the correct eeprom info for fabric card.
171569c Remove explicit logger identifier for transceiver module operations; use inherited id
6c4d651 Corrected the log messages for firmware install

Signed-off-by: mlok <marty.lok@nokia.com>
2022-12-07 12:38:32 +08:00
LuiSzee
fa31cac213 [centec][arm64] fix tsingma bsp compile error (#12774)
fix centec arm64 tsingma bsp compile error caused by linux kernel api change
2022-12-06 18:35:30 +08:00
Richard.Yu
a30d4a9008
[SAI-PTF][202205]enable sai-ptf logger in sai_adapter to log all the sai api … (#12923)
Why I did it
enable sai-ptf logger in sai_adapter to log all the sai api invcations

How I did it
add build parameter to enable the sai-ptf logger when build sai PRC

How to verify it
local build test
test the generated sai_adapter
test with pipeline
Signed-off-by: richardyu-ms <richard.yu@microsoft.com>
2022-12-04 22:09:57 -08:00
Saikrishna Arcot
60afb50c52
[202205] Update Linux kernel from 5.10.103 to 5.10.140 (#12660) #12874
Signed-off-by: Saikrishna Arcot <sarcot@microsoft.com>
(cherry picked from commit 672367c33e)
2022-12-01 09:25:40 -08:00
tjchadaga
7855cd8d15
Update BRCM SAI version to 7.1.25.4 (#12878) 2022-11-30 14:24:29 -08:00
Rajkumar-Marvell
6f7b5293ad [Marvell] Move armhf syncd docker to bullseye. (#12585)
Why I did it
Move armhf syncd docker compilation to bullseye.

How I did it
compile syncd docker for armhf platform using below commands,
NOJESSIE=1 NOSTRETCH=1 NOBUSTER=1 BLDENV=bullseye make configure PLATFORM=marvell-armhf PLATFORM_ARCH=armhf
NOJESSIE=1 NOSTRETCH=1 NOBUSTER=1 BLDENV=bullseye make target/docker-syncd-mrvl.gz

How to verify it
upgrade the syncd docker and verify ports are up.

Signed-off-by: rajkumar38 <rpennadamram@marvell.com>
2022-11-28 18:49:28 +00:00
tjchadaga
7c1015b45e
Update BRCM SAI version to 7.1.24.4 (#12809) 2022-11-23 15:26:40 -08:00
Richard.Yu
c41cbc8f4f
Revert "[SAI PTF][202205]Support sai ptf v2 Syncd-rpc (#12763)" (#12785)
This reverts commit 99c01b5762.
2022-11-21 20:48:24 -08:00
Richard.Yu
99c01b5762
[SAI PTF][202205]Support sai ptf v2 Syncd-rpc (#12763)
cherry-pick #12761
Make syncd rpc docker which supports sai-ptf v2
Part of previous PR #11610

local bulild the target

NOSTRETCH=y NOJESSIE=y make configure PLATFORM=broadcom NOSTRETCH=y NOJESSIE=y ENABLE_SYNCD_RPC=y SAITHRIFT_V2=y make target/docker-syncd-brcm-rpcv2.gz NOSTRETCH=y NOJESSIE=y ENABLE_SYNCD_RPC=y SAITHRIFT_V2=y make target/docker-saiserverv2-brcm.gz

Signed-off-by: richardyu-ms <richard.yu@microsoft.com>

Signed-off-by: richardyu-ms <richard.yu@microsoft.com>
2022-11-20 20:49:58 -08:00
Samuel Angebault
7775f15222
[202205][Arista] Update platform library submodules (#12737)
add partial reboot cause support for linecards
add watchdog support for linecards
add power draw information for chassis
properly implement Chassis.get_port_or_cage_type
fix pcieutil on chassis with powered off cards
fix watchdog-control.service crash
misc fixes and cleanups
2022-11-18 13:23:52 -08:00
Richard.Yu
20fedefbde
[sai-ptf]Fix sai ptf issue (#12723)
* fix sai-ptf build issue caused by upgraded scapy in base ptf docker

Signed-off-by: richardyu-ms <richard.yu@microsoft.com>

* remove useless change

Signed-off-by: richardyu-ms <richard.yu@microsoft.com>

* reformat code

Signed-off-by: richardyu-ms <richard.yu@microsoft.com>

Signed-off-by: richardyu-ms <richard.yu@microsoft.com>
2022-11-16 20:28:30 -08:00
Junhua Zhai
bef2f3e6f6 [gbsyncd] Enable debug shell for BRCM broncos PHY (#12622)
* Build docker-gbsyncd-broncos image
* Correct typo in LIBSAI_BRONCOS_URL_PREFIX
* Update docker-gbsyncd-broncos/Dockerfile.j2
* Enable debug shell support on docker-gbsyncd-broncos
* Include bcmsh in docker-gbsyncd-broncos

Why I did it
In docker-gbsyncd-broncos image, enable debug shell support for BRCM broncos PHY.

How I did it
How to verify it
Note: need enable attr SAI_SWITCH_ATTR_SWITCH_SHELL_ENABLE support in BCM PAI library

# bcmsh 
Press Enter to show prompt.
Press Ctrl+C to exit.
NOTICE: Only one bcmsh or bcmcmd can connect to the shell at same time.


BRCM:> help
help
List of available commands
- h or help => Print command menu
- l => Print list of active ports on the PHY
- ps <port_id> <options> =>  Print port status
  <options> =>  1 -> Link status
            =>  2 -> Link training failure status
            =>  3 -> Link training RX status
            =>  4 -> PRBS lock status
            =>  5 -> PRBS lock loss status
- rd <port_id> <addr> <no of registers to read> => Read register contents
- wr <port_id> <addr> <data> => Write register data
- rrd <lanemap> <if_side> <addr> <no of registers to read> => Raw read register contents using lanemap and if_side (line = 0, system = 1)
- rwr <lanemap> <if_side> <addr> <data> => Raw write register data using lanemap and if_side (line = 0, system = 1)
- fw or firmware => Print firmware version of the PHY
- pd or port_dump <port_id> <flags> => Dump port status
- eyescan <port_id> => Display eye scan 
- fec_status <port_id> => Get fec status of the port
- polarity <lanemap> <if_side> <TX polarity> <RX Polarity> => Set TX and RX polarity
    <lanemap> => 0xF, 0xFF, or 0xFFFF based on number of lanes
    <if_side > => Line = 0, System = 1
    <TX/RX Polarity> =>_TX/RX Polarity bitmap of all lanes
        Each bit represents a lane number.
        E.g. Lane 0's polarity value (0 or 1) is populated in Bit 0.
- polarity <lanemap> <if_side> => Print TX and RX polarity
- lb <port_id> <lb_value> => Enable loopback on the port
  lb_value = 0 -> Disable, 1 -> PHY, 2 -> MAC
- lb <port_id> => Print loopback configuration of the port
- prbs <port_id> <options> <val> => Set/Get PRBS configuration
  <options> => 1 -> Get PRBS state and polynomial
               2 -> Set PRBS Polynomial, <val> - PRBS Polynomial
                    Please refer to phy/chip documentation for valid values
               3 -> Enable PRBS
                    <val> => 0 Disable PRBS
                             1 Enable both PRBS Transmitter and Receiver
                             2 Enable PRBS Receiver
                             3 Enable PRBS Transmitter
  exit or q => Exit the diagnostic shell
2022-11-10 18:15:51 +00:00
Kebo Liu
38bf957072 [Mellanox] [Platform API] Update SN2201 dynamic minimum fan speed table (#12602)
- Why I did it
Update SN2201 dynamic minimum fan speed table according to data provided by the thermal team.

- How I did it
Update the thermal table in device_data.py

- How to verify it
Run platform related regression

Signed-off-by: Kebo Liu <kebol@nvidia.com>
2022-11-10 18:15:21 +00:00
zitingguo-ms
f05941f7bd Update BRCM SAI version to 7.1.17.4 (#12546)
Signed-off-by: zitingguo-ms <zitingguo@microsoft.com>

Signed-off-by: zitingguo-ms <zitingguo@microsoft.com>
2022-11-10 18:10:16 +00:00
Dror Prital
52d746b775 [Mellanox] Update SDK/FW to version 4.5.3186/2010.3186 (#12542)
- Why I did it
Update SDK/FW version - 4.5.3186/2010_3186 in order to have the following changes:

New functionality:
1. Added support for 6.5W (Class 8) in ports 49-50, 53-54, 57-58, and 61-62 on SN4600 system

Fix the following issues:
1. On very rare occasion (~1/100K), during I2C transaction with MMS1V50-WM and MMS1V90-WR modules on SN4700 system, the module may send unexpected stop which violate the I2C specification, possibly affecting the link up flow
2. When running 1GbE speeds on SN4600 system, the port remained active while peer side was closed
3. While toggling the cable with ‘sfputil lpmode on/off’, error msg like “ERR pmon#xcvrd: Receive PMPE error event on module 1: status {X} error type {y}” could be received
4. When toggling many ports of the Spectrum devices while raising 10GbE link up and link maintenance is enabled, the switch may get stuck and may need to be rebooted
5. When trying to reconfigure the Flex Parser header and Flex transition parameters after ISSU, the switch will returned an error even if the configuration was identical to that done before performing the ISSU
6. While moving from lossless to lossy mode while shared headroom was used, reduction of the shared headroom can only be done prior to pool type change and when shared headroom is not utilized
7. SLL configuration is missing in SDK dump
8. If TTL_CMD_COPY is used in Encap direction for a packet with no TTL, then the value passed in the ttl data structure will be used if non-zero (default 255 if zero)
9. PCI calibration changes from a static to a dynamic mechanism
10. Layer 4 port information is not initialized for BFD packet event. To address the issue, remote peer UDP port information was added in BFD packet event
11. SDK returned error when FEC mode is set on twisted pair, when FEC was set to None

- How I did it
Update pointer for the SDK/FW

- How to verify it
Run regression tests

Signed-off-by: dprital <drorp@nvidia.com>
2022-11-10 18:10:05 +00:00
tjchadaga
10ea75eb4a Update BRCM SAI version to 7.1.16.4 (#12515) 2022-11-10 18:09:37 +00:00
Junchao-Mellanox
b57b17318d
[Mellanox] Use sdk sysfs instead of ethtool (#12480) (#12603)
Conflicts:
	platform/mellanox/mlnx-platform-api/sonic_platform/sfp.py
2022-11-09 16:45:25 -08:00
zitingguo-ms
d2540ae312
Add a parameter for libsaithrift to skip error on errno -2 (#12581)
Signed-off-by: zitingguo-ms <zitingguo@microsoft.com>

Signed-off-by: zitingguo-ms <zitingguo@microsoft.com>
2022-11-03 10:31:11 +08:00
Xu Chen
49d491e15c
update cisco 8000 tag: 202205-v0.6 (#12580) 2022-11-03 08:58:07 +08:00
vmittal-msft
93fbfbbf1f Updated BRCM SAI to version 7.1.10.4 (#12423) 2022-10-25 20:43:00 +00:00
xumia
db2128564b
[202205] Change submodule path from Azure to sonic-net (#12308)
Why I did it
Change the path of sonic submodules that point to "Azure" to point to "sonic-net"

How I did it
Replace "Azure" with "sonic-net" on all relevant paths of sonic submodules
2022-10-24 13:13:14 +08:00
Samuel Angebault
0772d36c6d
[202205][Arista] Update platform driver library (#12451)
fix linecard provisioning issue (500 error)
fix some value types for get_system_eeprom_info API
refactor code to leverage pci topology (enabling dynamic Pcie plugin)
refactor asic declaration logic to new style
misc fixes
2022-10-20 23:15:57 +08:00
Marty Y. Lok
526114ccf8 [Nokia] Update the nokia platform submodule for Nokia-IXR7250E platform (#12305)
Signed-off-by: mlok <marty.lok@nokia.com>
2022-10-11 21:48:01 +00:00
Marty Y. Lok
1af8985d8a [armhf][sonic-installer] Fix the sonic-installer install images on armhf platform issue (#12284)
Signed-off-by: mlok <marty.lok@nokia.com>

Signed-off-by: mlok <marty.lok@nokia.com>
2022-10-06 15:29:54 +00:00
Junchao-Mellanox
bca54e162b [Mellanox] Provide dummy implementation for get_rx_los and get_tx_fault (#12231)
- Why I did it
get_rx_los and get_tx_fault is not supported via the exisitng interface used, need provide dummy implementation for them.
NOTE: in later releases we will get them back via different interface.

- How I did it
Return False * lane_num for get_rx_los and get_tx_fault

- How to verify it
Added unit test
2022-10-03 19:01:03 +00:00
Samuel Angebault
009203def1
[202205][Arista] Update platform submodules (#12226)
Implement input power psu API
Report DC power output via API
Add bootloader Component in API
Fix issue where naming was not unique for Component
2022-09-30 16:03:48 +08:00
Volodymyr Samotiy
86ad8edb8b
[Mellanox] Update SAI to v2205.22.1.19 and SDK/FW to v4.5.3168/v2010.3070 (#12206)
- Why I did it
To include latest fixes and new functionality

SAI fixes and new features
fix #3205239, incorrect object type returned for SG child list
Fix VRF-VNI map entries remove issue
ECC health event and logging
[Port Buffers] restore default queue and pg configuration when all user pools are deleted
Fix EVPN type3 error on removal of uc/bc flood group
Fix EVPN type2 MAC move from local to remote results in SAI failure
Fix Disable learning on VXLAN tunnel
Fix error on VXLAN v6 tunnel removal
Fix port cannot apply schedule group when it is a lag member
Fix BFD add more detailed message on BFD packet not related to any existing session
gcc10 compilation fixes
Disable learning on VXLAN tunnel
Support BFD remote-disc exchange in negotiation stage
Tunnel Loopback packet action attribute implementation (for Dual TOR)
Add KVD resources MIN/MAX functionality (pending CRM issue with MIN only)
Support for CRC2 hash algorithm
Bulk counter support for PGs, queues
Support mirror sample rate attribute (SPC2+)
[Functional] [QoS] | Unable to remove SCHEDULE profile table even if there is no object referencing it
Next hop group optimized bulk API
Reduce verbosity of shared database already exists print
Span mirror policer (SPC2+), optimize pipeline for acl mirror action with policer on SPC2+
use same size descriptor pool for rx/tx
fix bfd - notify Sonic for admin-down event
2201 - empty list for supported fec for RJ45 ports
Fix don't disable used tunnel underlay interfaces

SDK fixes
100GbE FCI DAC (10137628-4050LF/HPE PN: 845408-B21) was recognized by mistake as supporting "cable burning' which caused the switch firmware to read page 0x9f (which unsupported in the cable) and to report this cable as having "bad eeprom".
Added remote peer UDP port information in BFD packet event.
After editing an ECMP, the resilient ECMP next-hop counter may not count correctly.
Fixed potential memory leaks in some APIs related to LPM
If TTL_CMD_COPY is used in Encap direction for a packet with no TTL, then the value passed in the ttl data structure will be used if non-zero (default 255 if zero).
In SN2201: When configuring Force mode, user should configure Speed and FEC on both sides
In Flex Tunnel encapsulation flow, if the encapsulation is with an IPv6 header, the flow label field may not be updated as expected.
In some cases, when changing speed to 400GbE over 8 lanes, the first few packets would be dropped.
In some traffic patterns involving small packets, the PortRcvErrors counter may mistakenly count events of local physical errors due to an internal flow in the hardware that involves link packets.
On Spectrum systems, sometimes during link failure, not all previous firmware indications cleared properly, potentially affecting the next link up attempt.
On the NVIDIA Spectrum-2 switch, when receiving a packet with Symbol Errors on ports that are configured to cut-thought mode, a pipeline might get stuck.
PCI calibration changes from a static to a dynamic mechanism.
SDK debug dump shows "Unknown" Counter in RFC3635 Counter Group.
SDK debug dump shows "Unknown" Counter in the PPCNT Traffic Class Counter Group.
SDK Dump missing column headers in some GC tables may result in difficulty understanding the dump.
SLL configuration is missing in SDK dump.
Spectrum-2 systems, do no support 1GbE on supported 40GbE modules.
When binding a UDP port which is already in use for BFD TX session, the error message appears incorrectly.
When Flex Tunnel was used, Flex Modifier sometimes experienced a brief mis-configuration during ISSU.
When many ports are active (e.g. 70 ports up), and the configuration of shared buffer is applied on the fly, occasionally, the firmware might get stuck.
When running 1GbE speeds on SN4600 system, the port remained active while peer side was closed.
When toggling many ports of the Spectrum devices while raising 10GbE link up and link maintenance is enabled, the switch may get stuck and may need to be rebooted.
When trying to reconfigure the Flex Parser header and Flex transition parameters after ISSU, the switch will returned an error even if the configuration was identical to that done before performing the ISSU.
While toggling the cable, and the low power mode is set to ON, an unexpected PMPE event error is received.
How I did it
Updated SDK/SAI submodule and relevant makefiles with the required versions.

- How to verify it
Build an image and run tests from "soni-mgmt".

Signed-off-by: Volodymyr Samotiy <volodymyrs@nvidia.com>
2022-09-30 09:39:12 +03:00
Stephen Sun
6dab2ffb46
[202205] [Mellanox] Fix typo in platform API (#12139)
- Why I did it
Fix a typo in chassis platform API which causes the following error

>>> import sonic_platform as P
>>> c = P.platform.Platform().get_chassis()
>>> sl = c.get_all_sfps()
>>> sl[0].get_lpmode()
Sep 28 07:48:33 INFO    LOG: Initializing SX log with STDOUT as output file.
False
>>> del c
Exception ignored in: <function Chassis.__del__ at 0x7f1d166ef8b0>
Traceback (most recent call last):
  File "/usr/local/lib/python3.9/dist-packages/sonic_platform/chassis.py", line 126, in __del__
    self.sfp_module.deinitialize_sdk_handle(sfp_module.SFP.shared_sdk_handle)
NameError: name 'sfp_module' is not defined

- How I did it
Use self while using the SDK handle

- How to verify it
Manual test

Signed-off-by: Stephen Sun <stephens@nvidia.com>
2022-09-28 11:09:46 +03:00
Junchao-Mellanox
ef84a41048
Revert "[Mellanox] Redirect ethtool stderr to subprocess for better error log (#12038)" (#12184)
This reverts commit 9750cb4.

There is a PR to handle master branch revert: #12183

- Why I did it
The PR to be reverted introduced many notice logs every 1 minute if SFP is not plugged:

Cannot get module EEPROM information: Input/output error
Before the "bad" PR, the message format is like this:

INFO pmon#supervisord: xcvrd Cannot get module EEPROM information: Input/output error
It was truncated by rsyslog because every message is the same. However, the "bad" PR introduces SFP index to the message:

NOTICE pmon#xcvrd: Failed to get EEPROM data for sfp 39: Cannot get module EEPROM information: Input/output error
Rsyslog no longer truncate such log and many such messages are flooded to syslog.

- How I did it
Revert the PR

- How to verify it
Manual test
2022-09-28 10:14:50 +03:00
Dror Prital
d5bd2dd6bf
[202205] [Mellanox] update NVIDIA copyright header for added files (#12126)
- Why I did it
Add NVIDIA Copyright header for new "NVIDIA" files

- How I did it
Add the copyright header as remark at the head of the file
2022-09-22 19:08:08 +03:00
Junchao-Mellanox
5ecb93f7e8 [Mellanox] Redirect ethtool stderr to subprocess for better error log (#12038)
- Why I did it
ethtool print error logs when EEPROM of a SFP is not available. It prints error like this:

INFO pmon#/supervisord: xcvrd Cannot get module EEPROM information: Input/output error
INFO pmon#/supervisord: xcvrd Cannot get Module EEPROM data: Invalid argument
However, this log does not contain the relevant SFP index which is hard for developer/qa to find the exactly SFP.

- How I did it
Redirect ethtool stderr to subprocess and log it better

- How to verify it
Manual test
2022-09-21 21:18:55 +00:00
Dror Prital
4e3aff882f [Mellanox] Update SDK/FW to version 4.5.2320/2010.2320 (#11990)
- Why I did it
Update SDK/FW version - 4.5.2320/2010_2320 in order to have the following fixes:
• Spectrum-3 | PCI calibration changes from a static to a dynamic mechanism.
• [VxLAN] TTL was set to 0 for non IP traffic (such as ARP)

- How I did it
Update pointer for the SDK/FW

- How to verify it
Run regression tests
2022-09-21 21:16:20 +00:00
Ravindranath C K
a700ffdb3d [innovium]: Enable syncd container autorestart for Innovium platforms (#11497)
Why I did it
Enable syncd container autorestart for Innovium platforms

How I did it
Add critical_process file and sypervisord.conf entry

How to verify it
Tested with autorestart/test_container_autorestart.py::test_containers_autorestart

PASSED autorestart/test_container_autorestart.py::test_containers_autorestart[sonic-xxx-dut-sonic-xxx-dut|syncd]

Signed-off-by: rck-innovium rck@innovium.com
2022-09-21 21:10:23 +00:00
Samuel Angebault
9f351aecd7
[202205][Arista] Update platform submodules (#12021) 2022-09-13 19:40:16 -07:00
xumia
7c5cb343e3 Fix dbus-run-session command not found issue when install dbus-python (#12009) 2022-09-08 16:34:38 +00:00
Junhua Zhai
f8a4ff04a6 [gbsyncd] Build docker-gbsyncd-broncos image (#11748)
The libsaibroncos debian package is published at $(LIBSAI_BRONCOS)_URL. Enable building docker-gbsyncd-broncos image on PLATFORM broadcom.
2022-09-01 00:01:35 +00:00
Ying Xie
875328bf8a
[202205] Update BRCM KNET modules to support new psample definitions from sflow (#11890)
* [202205][linux-kernel] advance submodule head

linux-kernel:
* 3fe00d4 2022-06-24 | [sflow + dropmon] added missing patch for the sFlow + dropmon feature. (#281) (HEAD -> 202205) [Vadym Hlushko]
* ca727d6 2022-04-21 | [sflow + dropmon] added patches for the sFlow + dropmon feature. (#276) [Vadym Hlushko]

Signed-off-by: Ying Xie <ying.xie@microsoft.com>

* Update BRCM KNET modules to support new psample definitions from sflow… (#11709)

* Update BRCM KNET module to support new psample definitions from sflow dropmon feature

* Update BRCM KNET module to support new psample definitions from sflow dropmon feature

* Advance saibcm-modules-dnx

Signed-off-by: Ying Xie <ying.xie@microsoft.com>
Co-authored-by: Michael Li <52540620+michaelli10@users.noreply.github.com>
2022-08-31 16:53:33 -07:00
Richard.Yu
a30daf0248
[202205][SAIServer] support saiserver v2 in bullseye (#11870)
* [SAIServer] support saiserver v2 in bullseye

Support build saiserverv2 in bullseye

Add dependencies for building saiserverv2
Upgrade libboost-atomic1.71 to libboost-atomic1.74

Test done:
Local builded with NOSTRETCH=y NOJESSIE=y NOBUSTER=y SAITHRIFT_V2=y make target/docker-saiserverv2-brcm.gz

* update libboost-atomic from 1.71 to 1.74 for bullseye
2022-08-29 18:59:53 -07:00
Samuel Angebault
e2cade3ca0
[202205][Arista] Update platform submodules (#11854)
Backport platform code for Arista products from master.
This most notably improve:

PikeZ platform support
Chassis fixes and improvements
2022-08-29 23:01:10 +08:00
Junhua Zhai
b722d5f7de Mount directory warmboot in docker gbsyncd (#11852)
Why I did it
The directory /var/warmboot as top directory for warmboot feature is also needed in docker gbsyncd. Some vendor SAI might save data under it. Without it, the SAI init/creation API failure has happened on PikeZ platform.

How I did it
Mount host directory /host/warmboot as /var/warmboot in docker gbsyncd, which is same as what it has done on docker syncd.
2022-08-27 16:16:17 +00:00
Junhua Zhai
322f14136a [BRCM SAI 7.1.7.2] catch up CS00012257483 patch (#11768)
Why I did it
It solves a swss orchagent crash issue on PikeZ device, due to link-training setting of external PHY port.

How I did it
Catch up the fix for CS00012257483 in version 7.1.7.2.
2022-08-27 16:16:17 +00:00
Volodymyr Samotiy
001177110d
[202205] [Mellanox] Update MFT to v4.21.0-100 (#11759)
- Why I did it
To update MFT package to the latest version.

- How I did it
Updated MFT_VERSION & MFT_REVISION in platform/mellanox/mft.mk.

- How to verify it
Run regression testing using tests from sonic-mgmt

Signed-off-by: Volodymyr Samotiy <volodymyrs@nvidia.com>
2022-08-24 11:48:43 +03:00
Kebo Liu
42c96feb9b
update Mellanox SDK/FW to 4.5.2318 2010.2318 (#11788)
Why I did it
Update SDK/FW version - 4.5.2318/2010_2318 to pick up new fixes:

Cr space timeout on Hold and Release GW - at warm boot
SPC-1 Port in stuck PHY_UP after peer side rebooted
memory leak in sx_api_router_ecmp_update_set
How I did it
update the make file with the new version number
update submodule Switch-SDK-drivers pointer

How to verify it
run sonic regression

Signed-off-by: Kebo Liu <kebol@nvidia.com>
2022-08-19 10:59:36 -07:00
Junchao-Mellanox
75844e6b2a [Mellanox] Fix issue: set lpmode by platform API does not work (#11732)
- Why I did it
Fix issue: set lpmode by platform API does not work

- How I did it
Fix miss return value in code

- How to verify it
Manual test
2022-08-19 15:22:07 +00:00
Junhua Zhai
a9f2edd195 Upgrade gbsyncd container to bullseye (#11288)
Update the base of docker gbsyncd from buster to bullseye
2022-08-12 03:11:33 +00:00
賓少鈺
113e1059b2 PDE migration to bullseye (#10836)
#### Why I did it
Upgrade docker-pde to bullseye

#### How to verify it
Check Azp status
2022-08-11 23:14:10 +00:00
gechiang
d1a9052a4e [BRCM SAI 7.1.7.1] catch up on all pending fix patches for REL_7.0/7.1 (#11693) 2022-08-11 16:21:20 +00:00
Saikrishna Arcot
92ed619f99 Update Broadcom SAI to 7.1.0.0-9 (#11612)
This brings in a SAI library that is compiled on Bullseye.

Signed-off-by: Saikrishna Arcot <sarcot@microsoft.com>
2022-08-11 16:19:01 +00:00
Kebo Liu
3b749d6320 [Mellanox] Update hw-mgmt package to V.7.0020.3006 (#11538)
- Why I did it
Update HW-MGMT to V.7.0020.3006
1. Support new system SN2201
2. Add COMEX BRDWL respin support

- How I did it
Update the version number of the makefile
Advance the hw-mgmt submodule pointer

- How to verify it
Run full regression on Nvidia platforms

Signed-off-by: Kebo Liu <kebol@nvidia.com>
2022-08-11 16:17:30 +00:00
Senthil Kumar Guruswamy
95bf542dd5 Upgrade broadcom platform containers(syncd/ saiserver/ syncd-rpc/ syncd-dnx-rpc) to bullseye (#10864) 2022-08-11 16:16:42 +00:00
Junchao-Mellanox
bc300b4d79 [Mellanox] add more log while doing sysfs reading (#11556)
- Why I did it
Add more log while doing sysfs reading to increase the debug capability

- How I did it
Log the relevant file path and error number while sysfs reading return None

- How to verify it
Manual test
2022-08-08 20:44:18 +00:00
Santhosh Kumar T
79e014efcb
[DellEMC] S6100 Platform Service optimization porting in 202205 (#11329)
To reduce rc.local script execution time. Porting changes from [DellEMC] S6100 Platform Service optimization #10989
Changes:
Moving platform-modules-s6100.service and s6100-lpc-monitor.service asynchronous to rc.local script.
2022-08-02 09:55:49 -07:00
Junchao-Mellanox
b7e3db4ef1 [Mellanox] Fix issue: failed to decode Json while there is no hwsku.json (#11436)
- Why I did it
Fix bug: pmon report error on start up because some SKUs do not have hwsku.json

- How I did it
If hwsku.json, do not extract RJ45 port information

- How to verify it
Manual test.
Unit test.
2022-07-29 04:49:32 +00:00
Stephen Sun
e317af0e9a Fix chassis test issue (#11460)
Signed-off-by: Stephen Sun <stephens@nvidia.com>
2022-07-28 20:36:44 +00:00
Stephen Sun
94df2c4b86 [Mellanox] Support new platform API get_port_or_cage_type for RJ45 ports (#11336)
- Why I did it
Support get_port_or_cage_type for RJ45 ports

- How I did it
Implement the new platform API get_port_or_cage_type
Fix the issue: unable to import SFP when chassis object is destructed

- How to verify it
Manually test and regression test

Signed-off-by: Stephen Sun <stephens@nvidia.com>
2022-07-28 20:31:20 +00:00
andywongarista
f377636747 Add gbsyncd container for broncos (#11154)
* Add docker-gbsyncd-broncos support
* Address review comments
* Add socket to gbsyncd
* Upgrade gbsyncd-broncos to bullseye
2022-07-28 20:27:21 +00:00
Kebo Liu
2f59460fc4 [Mellanox] Enhance Platform API to support SN2201 - RJ45 ports and new components mgmt. (#10377)
* Support new platform SN2201 and RJ45 port

Signed-off-by: Kebo Liu <kebol@nvidia.com>

* remove unused import and redundant function

Signed-off-by: Kebo Liu <kebol@nvidia.com>

* fix error introduced by rebase

Signed-off-by: Kebo Liu <kebol@nvidia.com>

* Revert the special handling of RJ45 ports (#56)

* Revert the special handling of RJ45 ports

sfp.py
sfp_event.py
chassis.py

Signed-off-by: Stephen Sun <stephens@nvidia.com>

* Remove deadcode

Signed-off-by: Stephen Sun <stephens@nvidia.com>

* Support CPLD update for SN2201

A new class is introduced, deriving from ComponentCPLD and overloading _install_firmware
Change _install_firmware from private (starting with __) to protected, making it overloadable

Signed-off-by: Stephen Sun <stephens@nvidia.com>

* Initialize component BIOS/CPLD

Signed-off-by: Stephen Sun <stephens@nvidia.com>

* Remove swb_amb which doesn't on DVT board any more

Signed-off-by: Stephen Sun <stephens@nvidia.com>

* Remove the unexisted sensor - switch board ambient - from platform.json

Signed-off-by: Stephen Sun <stephens@nvidia.com>

* Do not report error on receiving unknown status on RJ45 ports

Translate it to disconnect for RJ45 ports
Report error for xSFP ports

Signed-off-by: Stephen Sun <stephens@nvidia.com>

* Add reinit for RJ45 to avoid exception

Signed-off-by: Stephen Sun <stephens@nvidia.com>

Co-authored-by: Stephen Sun <5379172+stephenxs@users.noreply.github.com>
Co-authored-by: Stephen Sun <stephens@nvidia.com>
2022-07-28 20:24:49 +00:00
Samuel Angebault
8ae03c994d [Arista] Update platform library (#10922)
- Implement Pcie plugin for chassis
- Implement set_admin_status for chassis modules
- Fix phy declaration for phy-credo
2022-07-22 22:15:34 +00:00
zitingguo-ms
e13df585ee [bcm sai]upgrade Broadcom SAI to 7.1.0.0-6 (#11410)
- Default Not to report Single bit ECC correctable events to avoid the need to set SOC porperties.

Signed-off-by: zitingguo <zitingguo@microsoft.com>
2022-07-22 22:14:28 +00:00
pavannaregundi
fea173d54e [Marvell-armhf] Fix kernel hang due to kernel upgrade to 5.10.103 (#11298)
Give more room for the kernel image in memory

Change-Id: I015856d173d50d94e30d8c555590efb70eb712ae
Signed-off-by: Pavan Naregundi <pnaregundi@marvell.com>
2022-07-07 21:20:20 +00:00
Nazarii Hnydyn
a4da090e5a
[Mellanox] Update SAI to 1.21.2.0 (#11360)
- Why I did it
Advance to new SAI version for bugs fixes as well as new features/enhacements:

New:
1. ARM64 support
2. FG ECMP performance optimization
3. Support setting empty list for port ingress/egress buffer profile list
4. Add service port for SN5600
5. Add CR8/SR8/LR8/KR8 interface type
6. Disable mlxtrace during debug dump

Fixes:
1. Fix SAI_ACL_ENTRY_ATTR_FIELD_TC
2. Fix Packets loop back if no member in portchannel
3. Fix optimize descriptors apply time (and fast boot time)
4. Add flush fdb entries for vxlan tunnel bridge port
5. Don't disable used tunnel underlay interfaces

- How I did it
Advanced SAI submodule

- How to verify it
make configure PLATFORM=mellanox
make target/sonic-mellanox.bin

Signed-off-by: Nazarii Hnydyn <nazariig@nvidia.com>
2022-07-07 09:15:41 +03:00
Sudharsan Dhamal Gopalarathnam
119789719d
[202205][Mellanox]Check dmi file permission before access (#11346)
* [Mellanox]Check dmi file permission before access (#11309)

Signed-off-by: Sudharsan Dhamal Gopalarathnam sudharsand@nvidia.com

Why I did it
During the system boot up when 'show platform status' or 'show version' command is executed before STATE_DB CHASSIS_INFO table is populated, the show will try to fallback to use the platform API. The DMI file in mellanox platforms require root permission for access. So if the show commands are executed as admin or any other user, the following error log will appear in the syslog

Jun 28 17:21:25.612123 sonic ERR show: Fail to decode DMI /sys/firmware/dmi/entries/2-0/raw due to PermissionError(13, 'Permission denied')

How I did it
Check the file permission before accessing it.

How to verify it
Added UT to verify. Manually verified if the error log is not thrown.
2022-07-06 16:13:07 -07:00
Alexander Allen
f99d272669 [Mellanox] Add arch folder to SDK binary location (#11278)
- Why I did it
This is for the eventual support of multiple architectures for the mellanox platform.

- How I did it
Change the location of the binaries in Switch-SDK-drivers so that the path specifies the target architecture in addition to the target distribution that the debians are built for.

This is the most straightforward way to separate binaries built against different architectures and selectively target them for installation in the mellanox SONiC image.

- How to verify it
Build SONiC for mellanox and verify it compiles successfully.
2022-07-06 01:52:33 +00:00
Ying Xie
296b21ec40 Revert "[Mellanox]Check dmi file permission before access (#11309)"
This reverts commit 187f351b23.
2022-07-06 00:06:45 +00:00
Sudharsan Dhamal Gopalarathnam
187f351b23 [Mellanox]Check dmi file permission before access (#11309)
Signed-off-by: Sudharsan Dhamal Gopalarathnam sudharsand@nvidia.com

Why I did it
During the system boot up when 'show platform status' or 'show version' command is executed before STATE_DB CHASSIS_INFO table is populated, the show will try to fallback to use the platform API. The DMI file in mellanox platforms require root permission for access. So if the show commands are executed as admin or any other user, the following error log will appear in the syslog

Jun 28 17:21:25.612123 sonic ERR show: Fail to decode DMI /sys/firmware/dmi/entries/2-0/raw due to PermissionError(13, 'Permission denied')

How I did it
Check the file permission before accessing it.

How to verify it
Added UT to verify. Manually verified if the error log is not thrown.
2022-07-05 16:12:31 +00:00
Rajkumar-Marvell
0bc054e308 [Marvell] Update armhf sai deb (#11296)
1) Migrate SAI to 1.10.2
2) MRVL-SAI fixes from 202012 branch

Signed-off-by: rajkumar38 <rpennadamram@marvell.com>
2022-07-05 16:11:39 +00:00
Volodymyr Samotiy
421b659161 [Mellanox] Update SDK/FW to 4.5.2262/xx.2010.2262 (#10882)
- Why I did it
To include latest fixes:
1. Warmboot | When trying to reconfigure the Flex Parser header and Flex transition parameters after ISSU, the switch will returned an error even if the configuration was identical to that done before performing the ISSU.
2. Link Up | When toggling many ports of the Spectrum devices while raising 10GbE link up and link maintenance is enabled, the switch may get stuck and may need to be rebooted.
3. Shared buffer | While moving from lossless to lossy while shared headroom was used, reduction of the shared headroom can only be done prior to pool type change and when shared headroom is not utilized.

- How I did it
Updated SDK submodule along with the relevant Makefiles

- How to verify it
Build an image and run tests from "sonic-mgmt".

Signed-off-by: Volodymyr Samotiy <volodymyrs@nvidia.com>
2022-07-05 16:10:23 +00:00
Yakiv Huryk
97cb613729 [asan] add print_suppressions=0 to ASAN configs (#11252)
- Why I did it
To provide an ability to suppress ASAN false positives and have a clean ASAN report for docker-sonic-vs/mlnx-syncd/orchagent docker

- How I did it
Added the "print_suppressions=0" to ASAN configs.

- How to verify it
add a suppression to some ASAN-enabled component (the suppression should catch some leak)
build with ENABLE_ASAN=y
run a test and see that the ASAN report is empty instead of having the suppression summary

Signed-off-by: Yakiv Huryk <yhuryk@nvidia.com>
2022-06-28 16:11:05 +00:00
Yakiv Huryk
75f73899e1 [vs][asan] add /var/log/asan to ASAN-enabled docker-sonic-vs image (#11059)
To ensure that ASAN logs are always generated. Currently, the way to get the logs is to map the "/var/log/asan" outside of a container, which doesn't work for DVS test run with "--imgname" option.

Signed-off-by: Yakiv Huryk <yhuryk@nvidia.com>
2022-06-24 05:14:33 +00:00
saksarav-nokia
1976e55010 Update platform/broadcom/sonic-platform-modules-nokia (#11107) 2022-06-22 23:09:01 +00:00
Ying Xie
36b54da653
[brcm docker build] remove extra line (#11182)
Signed-off-by: Ying Xie <ying.xie@microsoft.com>
2022-06-17 07:51:35 -07:00
Ying Xie
95dc2e23ff
[202205][BRCM_SAI] update Brcm SAI dependencies (#11173)
Signed-off-by: Ying Xie <ying.xie@microsoft.com>
2022-06-17 05:02:00 -07:00
Ying Xie
9329c4b987
[202205][bcm sai] upgrade Broadcom SAI to 7.1.0.0-5 (#11159)
* [bcm sai] upgrade Broadcom SAI to 7.1.0.0-5

- Enable Microsoft AN/LT patch

Signed-off-by: Ying Xie <ying.xie@microsoft.com>
2022-06-16 16:00:17 -07:00
Jon Goldberg
b2685736e0 [installer]: fix armhf for installer.conf usage (#11121)
This fixes the build for armhf to be able to use '/device///installer.conf' files. Specifically, armhf needs support to be able to change the size of /var/log/ directory. It is hardcoded to 512 bytes on all armhf platforms currently. This change will allow any armhf platform to be able to use an installer.conf file to customize the installed image.
2022-06-16 02:12:59 +00:00
Junchao-Mellanox
00d04dcb5f [Mellanox] optimize platform API import time (#10815)
- Why I did it
"import sonic_platform" takes about 600ms ~ 1000ms, it is kind of slow. After this optimization, the time is about 100ms. The benefit is that those CLIs which does not need the slow import sentence would be faster than before.

- How I did it
Find slow import and call them when need.

- How to verify it
Measure the import time.
2022-06-09 16:50:12 +00:00
xumia
043656dfe8 Support symcrypt fips config for aboot/uboot (#10729)
Why I did it
Support symcrypt fips config for aboot/uboot
2022-06-05 15:20:20 +00:00
Ying Xie
ea3df2a21a
[platform build] fix platform ycabled build (#11020)
* remove python2 wheel for sonic-platform-common

Signed-off-by: vaibhav-dahiya <vdahiya@microsoft.com>
2022-06-04 09:43:05 -07:00
Yakiv Huryk
7306d68411
[build][asan] make dpkg cache asan-aware (#10750)
Currently, the build with ASAN_ENABLE=y reuses the packages built with
ASAN_ENABLE=n (and vice versa). To address this issue, ASAN_ENABLE is added to DEP_FLAGS for asan-enabled packages (docker-syncd-mlnx, syncd, docker-orchagent, swss).

- Why I did it
To make dpkg cache use/rebuild the packages for ASAN_ENABLE=y/n.

- How I did it
Added ASAN_ENABLE to the DEP_FLAGS for asan-enabled packages.

- How to verify it
Built with ASAN_ENABLE=y/n and checked the .flags .log files.

Signed-off-by: Yakiv Huryk <yhuryk@nvidia.com>
2022-05-31 11:15:44 +03:00
Yakiv Huryk
bd91b2eef3
[asan] add debug package for asan-enabled containers (#10953)
This is to improve the readability of ASAN reports. The debug package adds function names and source code references to the backtrace (currently, there are only binary addresses of functions)

Another way to address this issue is to build the image with "INSTALL_DEBUG_TOOLS=y". The downside of this approach is that the image size and compilation time are unnecessarily big. Also, the idea is to make the "ENABLE_ASAN" self-sufficient, which would not be the case for this approach.

- Why I did it
To improve the readability of asan logs.

- How I did it
Added SYNCD_DBG and SWSS_DBG to corresponding docker images for ASAN_ENABLE=y build

- How to verify it
Add artificial memory leak
Build with ASAN_ENABLE=y
Test the image and check the ASAN report

Signed-off-by: Yakiv Huryk <yhuryk@nvidia.com>
2022-05-31 09:24:18 +03:00
Eric Zhu
8c1ded61b0
[SONiC-CEL]: fix platform fancontrol testcase failure issue (#10934) 2022-05-31 10:54:55 +08:00
Andriy Yurkiv
29043ff026
[MFT] Update MFT version to MFT 4.20.0-34 (#10933)
- Why I did it
Update MFT to newer version

- How I did it
Update MFT_VERSION in platform/mellanox/mft.mk

- How to verify it
Check version via dpkg -l | grep mft

Signed-off-by: Andriy Yurkiv <ayurkiv@nvidia.com>
2022-05-28 15:45:02 +03:00
Alexander Allen
71c868f56a
Upgrade libasan to version 6 in docker-syncd-mlnx to align with bullseye libasan (#10886) 2022-05-27 11:28:01 -07:00
jerseyang
c92bfe0728
Add belgite support (#9511)
Why I did it
add celestica belgite platform

How I did it
add belgite platform in celestica

Co-authored-by: nicwu-cel <nicwu@celestica.com>
Co-authored-by: anjian <anjian@celestica.com>
Co-authored-by: sandycelestica <sandyli@celestica.com>
2022-05-23 18:45:37 -07:00
Samuel Angebault
70e2727b02
[Arista] Update platform submodules (#10800)
This update has following changes
Refactor pci topology logic for chassis (fixes some chassis commands and chassisd on linecard)
Introduce new cooling algorithm
Fix linecard poweroff logic when supervisor is going down
Fix linecard status led leading to system-health crashing
Misc fixes
2022-05-23 13:28:13 -07:00
vmittal-msft
f7882b3885
Fix for libsaithrift build for BRCM image (#10852)
Updated libsaibcm to fix libsaithrift compile issue on BRCM image
2022-05-19 11:15:34 -07:00
Andriy Yurkiv
70d71f99f5
[Mellanox] Credo Y-cable | add more log info, checks, fix exception message (#10779)
- Why I did it
Script fails when there is an exception while reading

- How I did it
Add more logs and checks. Fix wrong variable naming and messages.

- How to verify it
Provoke exception while read_eeprom() and check that it is handled properly
2022-05-19 17:36:02 +03:00
Arun Saravanan Balachandran
942bef4475
DellEMC: S6100, Z9332f - Include ONIE version in 'show platform firmware status' (#10493)
Why I did it
To include ONIE version in show platform firmware status command output in DellEMC S6100 and Z9332f platforms.

How I did it
Include ‘ONIE’ in the list of components provided by platform APIs in DellEMC S6100 and Z9332f.
Unmount ONIE-BOOT if mounted using fast/soft/warm-reboot plugins in DellEMC S6100.
2022-05-12 09:24:06 -07:00
Saikrishna Arcot
949e76a00f
Update Linux kernel from 5.10.46 to 5.10.103 (#10634)
Signed-off-by: Saikrishna Arcot <sarcot@microsoft.com>
2022-05-10 13:46:31 -07:00
Alexander Allen
d202bf26d7
Upgrade mellanox platform containers (syncd / saiserver / syncd-rpc) and pmon to bullseye (#10580)
Fixes #9279

- Why I did it
Part of larger effort to move all SONiC systems to bullseye

- How I did it
1. Update container makefiles with correct dependencies
2. Update container Dockerfile with correct base image
3. Update container Dockerfile with correct apt dependencies
4. Update any other makefiles with dependencies to remove python2 support
5. Minor changes to support bullseye / python3

- How to verify it
Run regression on the switch:
1. Verify PTF community tests work
2. Verify syncd runs and all ports come up / pass traffic
3. Verify all platform tests succeed
2022-05-10 12:45:28 +03:00
FuzailBrcm
f579f61e4c
Fix for Accton platform build failure when doing incremental build (#10541) 2022-05-09 12:17:38 -07:00
FuzailBrcm
0ed671c9af
Fixing some python errors in the common PDDF platform classes (#10669) 2022-05-09 10:49:25 -07:00
vmittal-msft
9ae17e66a3
[sonic-sairedis update] Support for SAI header v1.10.2 with BRCM SAI v7.1.0.0 and MLNX SAI v1.21.1.0 (#10583) 2022-05-05 20:27:29 -07:00
Alexander Allen
53e5fe6a93
[Mellanox] Upgrade mellanox SDK to 4.5.1500 and mlnx-sai to 1.21.1.1 (#10675)
Update SDK/FW to 4.5.1500/2010.1500 and SAI version to 1.21.1.1

SDK/FW features:
1. Added support for Finisar DR4 (FTCD4523E2PCM) on Spectrum-2 and Spectrum-3 systems.

SAI Features:
1. ECMP overlay support for IPv6
2. BFD offloading / 4K scale
3. Host interface user traps + improved trap registration (table entry)
4. gcc11 compilation fixes
5. Read support for ACL redirect action
6. Optimize ECMP DB size
7. Buffer descriptors new defaults
8. Updated port mapping for SN2201

SAI Fixes:
1. Debug counter removal when configured with all drop reasons

- Why I did it
Upgrade Mellanox SDK and SAI versions to latest

- How I did it
Updated submodule pointers

- How to verify it
Regression tested
2022-04-29 20:50:59 +03:00
Kalimuthu-Velappan
bc30528341
Parallel building of sonic dockers using native dockerd(dood). (#10352)
Currently, the build dockers are created as a user dockers(docker-base-stretch-<user>, etc) that are
specific to each user. But the sonic dockers (docker-database, docker-swss, etc) are
created with a fixed docker name and common to all the users.

    docker-database:latest
    docker-swss:latest

When multiple builds are triggered on the same build server that creates parallel building issue because
all the build jobs are trying to create the same docker with latest tag.
This happens only when sonic dockers are built using native host dockerd for sonic docker image creation.

This patch creates all sonic dockers as user sonic dockers and then, while
saving and loading the user sonic dockers, it rename the user sonic
dockers into correct sonic dockers with tag as latest.

	docker-database:latest <== SAVE/LOAD ==> docker-database-<user>:tag

The user sonic docker names are derived from 'DOCKER_USERNAME and DOCKER_USERTAG' make env
variable and using Jinja template, it replaces the FROM docker name with correct user sonic docker name for
loading and saving the docker image.
2022-04-28 08:39:37 +08:00
Zhaohui Sun
cc30771f6b
Add python3 virtual environment for docker-ptf (#10599)
Why I did it
Migrate ptftests script to python3, in order to do an incremental migration, add python virtual environment firstly, install all required python packages in virtual env as well.
Then migrate ptftests scripts from python2 to python3 one by one avoid impacting non-changed scripts.

Signed-off-by: Zhaohui Sun zhaohuisun@microsoft.com

How I did it
Add python3 virtual environment for docker-ptf.
Add submodule ptf-py3 and install patched ptf 0.9.3 into virtual environment as well, two ptf issues were reported here:
p4lang/ptf#173
p4lang/ptf#174

Signed-off-by: Zhaohui Sun <zhaohuisun@microsoft.com>
2022-04-26 09:13:26 +08:00
Stephen Sun
9237950a0c
Align threshold mode of zero buffer profile of egress_lossless_pool (#10627)
On vs platform, egress_lossless_pool's mode is static.
So the corresponding profile should be of static_th as well.

Signed-off-by: Stephen Sun <stephens@nvidia.com>
2022-04-25 21:56:05 +03:00
Eric Zhu
869ac1d1f2
sonic-platform-modules-cel dx010: speed up dx010 platform init script (#10313)
* Optimize dx010 sonic platform init script to speed up init process
* Merge issue #10152: [warm-upgrade][202012] Slow Celestica platform init
in rc.local causes lacp-teardown fix into master branch

Signed-off-by: Eric Zhu <erzhu@celestica.com>
2022-04-22 20:36:17 +08:00
Yakiv Huryk
3abf383d3d
[asan] add address sanitizer support to docker-sonic-vs (#10470)
- Why I did it
To support docker-sonic-vs image with ASAN.

- How I did it
1. Made the supervisord.conf a template
2. Added the 'log_path' environment variable for ASAN-enabled daemons
3. Added supervisord.conf.j2 generation and ASAN lib to the docker-sonic-vs/Dockerfile.j2

- How to verify it
1. Made a build with ENABLE_ASAN=y
2. Run the tests, checked ASAN reports

Signed-off-by: Yakiv Huryk <yhuryk@nvidia.com>
2022-04-22 11:07:07 +03:00
Junchao-Mellanox
af5e5c4c94
[Mellanox] Adjust PSU voltage WA (#10619)
- Why I did it
InvalidPsuVolWA.run might raise exception if user power off PSU when it is running. This exception is not caught and will be raised to psud which causes psud failed to update PSU data to DB.

- How I did it
1. Change the log level when WA does not work. This could happen when user power off PSU, hence changing the log level from error to warning is better
2. Change the wait time from 5 to 1 to avoid introduce too much delay in psud. 1 second is usually enough per my test
3. Give a default return value for function get_voltage_low_threshold and get_voltage_high_threshold to avoid exception reach to psud

- How to verify it
Manual test.
Run sonic-mgmt regression
2022-04-22 11:02:30 +03:00
Samuel Angebault
ea38864235
[Arista] Update platform submodules (#10561)
Update PikeZ platform definition
Improve powercycle behavior on chassis
2022-04-21 13:14:59 -07:00
FuzailBrcm
122cb90695
Fix AS7726 not showing serial number in 'show platform summary' (#10489) (#10509) 2022-04-20 13:06:22 -07:00
Junhua Zhai
128d762af3
[gearbox] Add peer gbsyncd for swss if gearbox exists (#10504)
Fix the issues #10501 and #9733

If having gearbox, we need:
    * add gbsyncd as a peer since swss also has dependency on gbsyncd
    * add service gbsyncd to FEATURE table if it is missing
2022-04-20 19:02:49 +08:00
Kebo Liu
a0c76b1bc9
[Mellanox] support newly added reboot cause (#10531)
- Why I did it
Implement newly added reboot causes in PR Azure/sonic-platform-common#277

- How I did it
Map the reboot cause sysfs to the newly added reboot causes.

- How to verify it
manual test, check whether the reboot cause is correct after rebooting the switch in various ways.
run the community reboot test to see whether the reboot cause checker is passing.

Signed-off-by: Kebo Liu <kebol@nvidia.com>
2022-04-18 10:55:56 +03:00
Taras Keryk
593ab45bde
[BFN] Added the latest version of FPGA driver and modules (#10458)
Signed-off-by: Taras Keryk <tarasx.keryk@intel.com>
2022-04-16 15:11:59 -07:00
Junhua Zhai
04f810a346
[gearbox] use credo sai v0.7.5 (#10578)
The v0.7.5 has bug fix for the support of gearbox port and macsec counters. It also includes a owl firmware update with owl.lz4.fw.1.94.0.bin.

How I did it
Update credo sai url for v0.7.5
Update gearbox_config.json with using firmware owl.lz4.fw.1.94.0.bin instead of owl.lz4.fw.1.92.1.bin

How to verify it
Test gearbox port and macsec counter successfully on A7280.
2022-04-15 10:41:43 -07:00
Yakiv Huryk
d9117d9411
[Mellanox][asan] add address sanitizer support for syncd (#10266)
Why I did it
To support address sanitizer for Mellanox syncd

How I did it
/var/log/asan is mapped for syncd container (the same as for swss)
container stop() has a timeout (60s) for syncd (the same as for swss)
This is so libasan has enough time to generate a report.
added ASAN's log path to Mellanox syncd supervisord.conf
added "asan: yes" to sonic_version.yml
How to verify it
Added artificial memory leaks
Compiled with ENABLE_ASAN=y
Installed the image on DUT
Rebooted the DUT
Verified that /var/log/asan/syncd-asan.log contains the leaks

Signed-off-by: Yakiv Huryk <yhuryk@nvidia.com>
2022-04-14 15:00:32 -07:00
Junchao-Mellanox
0191300b96
[Mellanox] Auto correct PSU voltage threshold (WA) (#10394)
- Why I did it
There is a hardware bug that PSU voltage threshold sysfs returns incorrect value. The workaround is to call "sensor -s" to refresh it.

- How I did it
Call "sensor -s" when the threshold value is not incorrect and PSU is "DELTA 1100"

- How to verify it
Unit test and Manual test
2022-04-14 08:14:40 +03:00
brandonchuang
2116f62978
[AS9716-32D] Support i2c mux reset (#10492)
Why I did it
    Prevent from i2c bus to get locked.

How I did it
    Add sysfs driver to access ioport.
    Command to reset i2c mux:
    echo 1 > /sys/devices/platform/as9716_32d_ioport/i2c_mux_rst
    Command to bring i2c mux out of reset:
    echo 0 > /sys/devices/platform/as9716_32d_ioport/i2c_mux_rst

Signed-off-by: Brandon Chuang <brandon_chuang@edge-core.com>
2022-04-09 10:51:49 -07:00
roman_savchuk
81ac482e5a
[BFN] updated SDE packages for BFN platforms (#10512)
Updated SDE packages for bfn platform

- introduced X6 profile
- fixes for drop counters
- fixes for platform part
2022-04-09 10:47:55 -07:00
byu343
6581decf38
[saibcm-modules]: Add linux_ngknet for trident4/tomahawk4 chips (#10517)
Why I did it
For trident4/tomahawk4, linux_ngknet.ko and linux_ngknetcb.ko have to be installed. Also, the kernel modules to load on such chips are different from existing ones, so we add an option is_ltsw_chip to determine the kernel modules to load. The option is_ltsw_chip is controlled by adding 'is_ltsw_chip=1' to platform_env.conf or not.

How to verify it
We verified that existing platforms still work after this change; and for platforms with trident4/tomahawk4, we can load the different kernel modules as expected after adding 'is_ltsw_chip=1' to platform_env.conf
2022-04-09 10:46:09 -07:00
Marty Y. Lok
487a29a43b
Update Nokia sonic-platform submodule (#10437)
b67d479 Fixed the sfp refactor issue
827c5a6 Added nokia_cmd command nokia_common grpc support for power down/up SFM module
aeb7f56 Added the nokia cli commands for midplane
c57d083 Fix the get_my_module issue and the thermal_infos exception issue.
0536293 Change the output of "show chassis module status"
63212d7 Enhance the help display for nokia_cmd command
e8d2599 Fix the sonic_install_ndk_service script issue
d52bdcf Add command nokia_cmd show sfm-eeprom support

Signed-off-by: mlok <marty.lok@nokia.com>
2022-04-08 10:33:02 -07:00
dflynn-Nokia
e348dff77a
[Nokia ixs7215] Platform API temperature threshold value fixes (#10372)
Incorrect high-threshold and critical-high-threshold values are displayed for
some of the temperature sensors. This commit fixes that.
2022-04-07 22:44:31 +08:00
Rajkumar-Marvell
faabf00f82
[Marvell] Update armhf sai deb (#10403)
1) DHCP trap for IPV4 and IPV6
2) Interface ACL's (Ingress Everflow support)
3) 1G Autoneg support

Signed-off-by: Rajkumar Pennadam Ramamoorthy <rpennadamram@marvell.com>
2022-04-07 22:32:31 +08:00
Taras Keryk
bfe5835650
[BFN] Fix exception when fwutil run without sudo (#10335)
* [BFN] Fix for run fwutil without sudo
            SONiC has a concept of "platform components"
            this may include - CPLD, FPGA, BIOS, BMC, etc.

            These changes are needed to read the version of the BIOS and BMC component.

            What I did
                    The previous implementaion of component.py expect fwutil run with sudo.
	       When fwutil run without sudo, there are an exception:
```
Traceback (most recent call last):
  File "/usr/local/bin/fwutil", line 5, in <module>
    from fwutil.main import cli
  File "/usr/local/lib/python3.9/dist-packages/fwutil/__init__.py", line 3, in <module>
    from . import main
  File "/usr/local/lib/python3.9/dist-packages/fwutil/main.py", line 40, in <module>
    pdp = PlatformDataProvider()
  File "/usr/local/lib/python3.9/dist-packages/fwutil/lib.py", line 159, in __init__
    self.__platform = Platform()
  File "/usr/local/lib/python3.9/dist-packages/sonic_platform/platform.py", line 21, in __init__
    self._chassis = Chassis()
  File "/usr/local/lib/python3.9/dist-packages/sonic_platform/chassis.py", line 48, in __init__
    self.__initialize_components()
  File "/usr/local/lib/python3.9/dist-packages/sonic_platform/chassis.py", line 136, in __initialize_components
    component = Components(index)
  File "/usr/local/lib/python3.9/dist-packages/sonic_platform/component.py", line 184, in __init__
    self.version = get_bios_version()
  File "/usr/local/lib/python3.9/dist-packages/sonic_platform/component.py", line 19, in get_bios_version
    return subprocess.check_output(['dmidecode', '-s', 'bios-version']).strip().decode()
  File "/usr/lib/python3.9/subprocess.py", line 424, in check_output
    return run(*popenargs, stdout=PIPE, timeout=timeout, check=True,
  File "/usr/lib/python3.9/subprocess.py", line 505, in run
    with Popen(*popenargs, **kwargs) as process:
  File "/usr/lib/python3.9/subprocess.py", line 951, in __init__
    self._execute_child(args, executable, preexec_fn, close_fds,
  File "/usr/lib/python3.9/subprocess.py", line 1823, in _execute_child
    raise child_exception_type(errno_num, err_msg, err_filename)
FileNotFoundError: [Errno 2] No such file or directory: 'dmidecode'

```
           How I did it
                    Modification of dmidecode command

           How to verify it
                    Run manually 'fwutil' (without sudo)

	   Previous command output had exception

	   New command output:
		Root privileges are required

Signed-off-by: Taras Keryk <tarasx.keryk@intel.com>

* Why I did it
The previous implementaion of component.py expect fwutil run with sudo.
When fwutil run without sudo, there are an exception:

Traceback (most recent call last):
  File "/usr/local/bin/fwutil", line 5, in <module>
    from fwutil.main import cli
  File "/usr/local/lib/python3.9/dist-packages/fwutil/__init__.py", line 3, in <module>
    from . import main
  File "/usr/local/lib/python3.9/dist-packages/fwutil/main.py", line 40, in <module>
    pdp = PlatformDataProvider()
  File "/usr/local/lib/python3.9/dist-packages/fwutil/lib.py", line 159, in __init__
    self.__platform = Platform()
  File "/usr/local/lib/python3.9/dist-packages/sonic_platform/platform.py", line 21, in __init__
    self._chassis = Chassis()
  File "/usr/local/lib/python3.9/dist-packages/sonic_platform/chassis.py", line 48, in __init__
    self.__initialize_components()
  File "/usr/local/lib/python3.9/dist-packages/sonic_platform/chassis.py", line 136, in __initialize_components
    component = Components(index)
  File "/usr/local/lib/python3.9/dist-packages/sonic_platform/component.py", line 184, in __init__
    self.version = get_bios_version()
  File "/usr/local/lib/python3.9/dist-packages/sonic_platform/component.py", line 19, in get_bios_version
    return subprocess.check_output(['dmidecode', '-s', 'bios-version']).strip().decode()
  File "/usr/lib/python3.9/subprocess.py", line 424, in check_output
    return run(*popenargs, stdout=PIPE, timeout=timeout, check=True,
  File "/usr/lib/python3.9/subprocess.py", line 505, in run
    with Popen(*popenargs, **kwargs) as process:
  File "/usr/lib/python3.9/subprocess.py", line 951, in __init__
    self._execute_child(args, executable, preexec_fn, close_fds,
  File "/usr/lib/python3.9/subprocess.py", line 1823, in _execute_child
    raise child_exception_type(errno_num, err_msg, err_filename)
FileNotFoundError: [Errno 2] No such file or directory: 'dmidecode'

How I did it
Modification of dmidecode command

How to verify it
Run manually 'fwutil' (without sudo)
Previous command output had exception

New command output:
Root privileges are required

Signed-off-by: Taras Keryk tarasx.keryk@intel.com

Signed-off-by: Taras Keryk <tarasx.keryk@intel.com>

* rewrite a call of dmidecode, when run without sudo

Signed-off-by: Taras Keryk <tarasx.keryk@intel.com>

* Why I did it
The previous implementaion of component.py expect fwutil run with sudo.
When fwutil run without sudo, there are an exception:

Traceback (most recent call last):
  File "/usr/local/bin/fwutil", line 5, in <module>
    from fwutil.main import cli
  File "/usr/local/lib/python3.9/dist-packages/fwutil/__init__.py", line 3, in <module>
    from . import main
  File "/usr/local/lib/python3.9/dist-packages/fwutil/main.py", line 40, in <module>
    pdp = PlatformDataProvider()
  File "/usr/local/lib/python3.9/dist-packages/fwutil/lib.py", line 159, in __init__
    self.__platform = Platform()
  File "/usr/local/lib/python3.9/dist-packages/sonic_platform/platform.py", line 21, in __init__
    self._chassis = Chassis()
  File "/usr/local/lib/python3.9/dist-packages/sonic_platform/chassis.py", line 48, in __init__
    self.__initialize_components()
  File "/usr/local/lib/python3.9/dist-packages/sonic_platform/chassis.py", line 136, in __initialize_components
    component = Components(index)
  File "/usr/local/lib/python3.9/dist-packages/sonic_platform/component.py", line 184, in __init__
    self.version = get_bios_version()
  File "/usr/local/lib/python3.9/dist-packages/sonic_platform/component.py", line 19, in get_bios_version
    return subprocess.check_output(['dmidecode', '-s', 'bios-version']).strip().decode()
  File "/usr/lib/python3.9/subprocess.py", line 424, in check_output
    return run(*popenargs, stdout=PIPE, timeout=timeout, check=True,
  File "/usr/lib/python3.9/subprocess.py", line 505, in run
    with Popen(*popenargs, **kwargs) as process:
  File "/usr/lib/python3.9/subprocess.py", line 951, in __init__
    self._execute_child(args, executable, preexec_fn, close_fds,
  File "/usr/lib/python3.9/subprocess.py", line 1823, in _execute_child
    raise child_exception_type(errno_num, err_msg, err_filename)
FileNotFoundError: [Errno 2] No such file or directory: 'dmidecode'

The previous implementaion of eeprom.py expect fwutil run with sudo.
When fwutil run without sudo, there are an exception:

Traceback (most recent call last):
File "/usr/lib/python3.9/logging/config.py", line 564, in configure
handler = self.configure_handler(handlers[name])
File "/usr/lib/python3.9/logging/config.py", line 745, in configure_handler
result = factory(**kwargs)
File "/usr/lib/python3.9/logging/handlers.py", line 153, in init
BaseRotatingHandler.init(self, filename, mode, encoding=encoding,
File "/usr/lib/python3.9/logging/handlers.py", line 58, in init
logging.FileHandler.init(self, filename, mode=mode,
File "/usr/lib/python3.9/logging/init.py", line 1142, in init
StreamHandler.init(self, self._open())
File "/usr/lib/python3.9/logging/init.py", line 1171, in _open
return open(self.baseFilename, self.mode, encoding=self.encoding,
PermissionError: [Errno 13] Permission denied: '/var/log/platform.log'
The above exception was the direct cause of the following exception:

Traceback (most recent call last):
File "/usr/local/bin/fwutil", line 5, in
from fwutil.main import cli
File "/usr/local/lib/python3.9/dist-packages/fwutil/init.py", line 3, in
from . import main
File "/usr/local/lib/python3.9/dist-packages/fwutil/main.py", line 41, in
pdp = PlatformDataProvider()
File "/usr/local/lib/python3.9/dist-packages/fwutil/lib.py", line 162, in init
self.chassis_component_map = self.__get_chassis_component_map()
File "/usr/local/lib/python3.9/dist-packages/fwutil/lib.py", line 168, in __get_chassis_component_map
chassis_name = self.__chassis.get_name()
File "/usr/local/lib/python3.9/dist-packages/sonic_platform/chassis.py", line 146, in get_name
return self._eeprom.modelstr()
File "/usr/local/lib/python3.9/dist-packages/sonic_platform/chassis.py", line 54, in _eeprom
self.__eeprom = Eeprom()
File "/usr/local/lib/python3.9/dist-packages/sonic_platform/eeprom.py", line 50, in init
logging.config.dictConfig(config_dict)
File "/usr/lib/python3.9/logging/config.py", line 809, in dictConfig
dictConfigClass(config).configure()
File "/usr/lib/python3.9/logging/config.py", line 571, in configure
raise ValueError('Unable to configure handler '
ValueError: Unable to configure handler 'file'

How I did it
Modification call of dmidecode command.
Added modification of log files access attributes before file open operations.

How to verify it
Run manually 'fwutil' (without sudo)

New command output have no exception.

Signed-off-by: Taras Keryk <tarasx.keryk@intel.com>

* Added file_check for checking access to log files for eeprom.py

Signed-off-by: Taras Keryk <tarasx.keryk@intel.com>

* Removed unused import

* Added logfile_create to eeprom.py and chassis.py

Signed-off-by: Taras Keryk <tarasx.keryk@intel.com>

* Created platform_utils.py

Signed-off-by: Taras Keryk <tarasx.keryk@intel.com>

* Added interpreter  string to  platform_utils.py

Signed-off-by: Taras Keryk <tarasx.keryk@intel.com>
2022-04-05 11:33:51 -07:00
Vadym Yashchenko
8e616c153b
[BFN] Refactoring and adding some functions of Thermal class (set and get thresholds and etc.) (#10205)
* Revised set_high_thershold and set_low_thershold methobs in the thermal.py

Signed-off-by: Vadym Yashchenko <vadymx.yashchenko@intel.com>

* Revised set_low_thershold and set_high_thershold

Signed-off-by: Vadym Yashchenko <vadymx.yashchenko@intel.com>

* Added separated files with thermal thresholds, changed platform.json and
thermal.py

Signed-off-by: Vadym Yashchenko <vadymx.yashchenko@intel.com>

* Revised on code revieww

Signed-off-by: Vadym Yashchenko <vadymx.yashchenko@intel.com>

* Reverted thermal.py

Signed-off-by: Vadym Yashchenko <vadymx.yashchenko@intel.com>

* Revised ther python.py

Signed-off-by: Vadym Yashchenko <vadymx.yashchenko@intel.com>

* Revised due to code review

Signed-off-by: Vadym Yashchenko <vadymx.yashchenko@intel.com>

* Added fucntion for fix the problem of tofino sensor high critical threshold

Signed-off-by: Vadym Yashchenko <vadymx.yashchenko@intel.com>

* Revised due to code review

Signed-off-by: Vadym Yashchenko <vadymx.yashchenko@intel.com>

* Revised due to code review

Signed-off-by: Vadym Yashchenko <vadymx.yashchenko@intel.com>

* Revised due to code review

Signed-off-by: Vadym Yashchenko <vadymx.yashchenko@intel.com>

* Revised only for cab18-4

Signed-off-by: Vadym Yashchenko <vadymx.yashchenko@intel.com>

* Revised default thresholds

Signed-off-by: Vadym Yashchenko <vadymx.yashchenko@intel.com>

* Revised ther def thresholds

Signed-off-by: Vadym Yashchenko <vadymx.yashchenko@intel.com>

* Revised on code review

Signed-off-by: Vadym Yashchenko <vadymx.yashchenko@intel.com>

* Revised platform.json and thermal_thresholds.json

Signed-off-by: Vadym Yashchenko <vadymx.yashchenko@intel.com>

* Code review in PR to azure (trigger CI)

Signed-off-by: Vadym Yashchenko <vadymx.yashchenko@intel.com>

* Added handle of exception

Signed-off-by: Vadym Yashchenko <vadymx.yashchenko@intel.com>

* Revised exception handler

* Added psu-1 thermal names to platfrom.json

Signed-off-by: Vadym Yashchenko <vadymx.yashchenko@intel.com>

* Changed platform.json and thermal_thresholds.json in
x86_64-acton_as9516_32d-r0

Signed-off-by: Vadym Yashchenko <vadymx.yashchenko@intel.com>

* Removed indentation from json file

Signed-off-by: Vadym Yashchenko <vadymx.yashchenko@intel.com>
2022-04-05 09:13:08 -07:00
jostar-yang
b39b7a3f2d
[Accton/PDDF] Support pddf to as4630/as7816/as7326 (#10340)
Why I did it
Support pddf to as4630/as7816/as7326

How I did it
Send needed file to the PR for these platform

How to verify it
Test sensors and show platform cmd.
root@as7326-56x-3:/home/admin# show platform psustatus
PSU Model Serial HW Rev Voltage (V) Current (A) Power (W) Status LED

PSU 1 FSF045-611 FSF0451912000505 N/A 12.06 5.50 66.00 OK green
PSU 2 FSF045-611 FSF0451912000568 N/A 12.00 5.50 66.00 OK green

root@as7326-56x-3:/home/admin# sensors
lm75-i2c-15-4a
Adapter: i2c-1-mux (chan_id 6)
Main Board Temperature: +35.5 C (high = +80.0 C, hyst = +75.0 C)

lm75-i2c-15-4b
Adapter: i2c-1-mux (chan_id 6)
CPU Board Temperature: +29.0 C (high = +80.0 C, hyst = +75.0 C)

fan_ctrl-i2c-11-66
Adapter: i2c-1-mux (chan_id 2)
fan1: 9100 RPM
fan2: 9400 RPM
fan3: 9300 RPM
fan4: 9600 RPM
fan5: 9000 RPM
fan6: 9100 RPM
fan7: 9100 RPM
fan8: 9300 RPM
fan9: 9200 RPM
fan10: 9400 RPM
fan11: 9200 RPM
fan12: 9400 RPM

pch_haswell-virtual-0
Adapter: Virtual device
temp1: +43.0 C

psu_pmbus-i2c-17-59
Adapter: i2c-1-mux (chan_id 0)
in3: +12.06 V
fan1: 6272 RPM
temp1: +37.0 C
power2: 60.00 W
curr2: +6.00 A

lm75-i2c-15-49
Adapter: i2c-1-mux (chan_id 6)
Main Board Temperature: +40.0 C (high = +80.0 C, hyst = +75.0 C)

lm75-i2c-15-48
Adapter: i2c-1-mux (chan_id 6)
Main Board Temperature: +39.0 C (high = +80.0 C, hyst = +75.0 C)

psu_pmbus-i2c-13-5b
Adapter: i2c-1-mux (chan_id 4)
in3: +12.00 V
fan1: 6144 RPM
temp1: +36.0 C
power2: 66.00 W
curr2: +5.50 A

coretemp-isa-0000
Adapter: ISA adapter
Package id 0: +50.0 C (high = +82.0 C, crit = +104.0 C)
Core 0: +50.0 C (high = +82.0 C, crit = +104.0 C)
Core 1: +50.0 C (high = +82.0 C, crit = +104.0 C)
Core 2: +50.0 C (high = +82.0 C, crit = +104.0 C)
Core 3: +50.0 C (high = +82.0 C, crit = +104.0 C)

Signed-off-by: Jostar Yang <jostar_yang@accton.com.tw>
2022-04-01 09:55:04 -07:00
arunlk-dell
a9e86c3b82
DellEMC: N3248TE platform API2.0 changes (#10400)
DellEMC: N3248TE platform API2.0 changes

Why I did it
N3248TE Platform API 2.0 changes

How I did it
Implemented the functional API's needed for Platform API 2.0
Added system_health_monitoring_config.json file
How to verify it
Used the API 2.0 test suite to validate the test cases.

Co-authored-by: Arun LK <Arun_L_K@dell.com>
2022-04-01 07:34:31 -07:00
arunlk-dell
94ec85464b
DellEMC: N3248TE/N3248PXE Watchdog Support (#9398)
Why I did it
DellEMC : Added support for N3248TE/N3248PXE platforms

How I did it
Implemented the changes to enable/disable the watchdog support

How to verify it
watchdog_unit_test.txt

Co-authored-by: Arun LK <Arun_L_K@dell.com>
2022-04-01 07:33:08 -07:00
Myron Sosyak
eb141f2163
[BFN] Remove SAI patches (#10343)
Signed-off-by: Myron Sosyak <myronx.sosyak@intel.com>
2022-03-31 21:49:22 +02:00
Kebo Liu
85539e7e08
[Mellanox] Update hw-mgmt package to version V.7.0020.2004 (#10401)
- Why I did it
Take new hw-mgmt release to SONiC, including:

New features:
1. hw-mgmt: add to PSU FW upgrade tool command to show current FW version
2. hw-mgmt: add to PSU FW upgrade tool support for single-PSU-in-the-system FW upgrade
3. hw-mgmt: add attribute “/firmware” to show FW version of restricted upgradable PSUs only
4. hw-mgmt: Add NVME temperature reports attributes (_alarm/_crit/_min/_max)

Bug fix:
1. psu: redundant i2c_addr attributes being created for psu 3 & 4 in system having only 2 psus.
2. hw-mgmt: in SPC1/2 i2c driver removal is too slow vs. ASIC reset causing non-functional log errors
3. PSU thresholds sysfs changed in 5.10 to “read only” preventing modification (modification required due PSU HW bug)
4. CPLD3 sysfs attribute missing after chip down/up flow
5. sysfs attributes missing when hw-mgmt is restarted (stop/start) within systemd

Release notes can be found from link https://github.com/Mellanox/hw-mgmt/blob/V.7.0020.2004/debian/Release.txt

- How I did it
Update hw-mgmt make file with new version number
Update hw-mgmt submodule pointer

- How to verify it
Run platform regression on all Mellanox platform

Signed-off-by: Kebo Liu <kebol@nvidia.com>
2022-03-31 08:21:09 +03:00
Andriy Yurkiv
1e2e493daa
[Mellanox] Credo Y-cable read_eeprom/write_eeprom API implementation (#10320)
- Why I did it
Implement read_eeprom/write_eeprom API for Credo Y-cable for Dual ToR Active-Standby

- How I did it
Use mlxreg utility for API implementation

Signed-off-by: Andriy Yurkiv <ayurkiv@nvidia.com>
2022-03-30 20:41:31 +03:00
tjchadaga
1ddcfd0c3c
Update Broadcom SAI version to 6.1 (#10344) 2022-03-29 13:41:31 -07:00
jostar-yang
ab3053b3df
[Accton/PDDF] Support show cmd for psu-temp and fan (#10215)
Why I did it
Support for show platform temp/fan for psu-temp and fan.
Original code doesn't has fan_drawer to support these information.

How I did it
Support for show platform temp/fan for psu-temp and fan.
Add fan_drawer.py and update thermal.py to add needed code.
It need PDDF common code to support . (Refer to #10213)

How to verify it
Test show platform temp and show platform fan.
root@as7726-32x-2:~# show platform fan
Drawer LED FAN Speed Direction Presence Status Timestamp

Fantray1 green Fantray1_1 38% EXHAUST Present OK 20220311 08:13:04
Fantray1 green Fantray1_2 38% EXHAUST Present OK 20220311 08:13:04
Fantray2 green Fantray2_1 38% EXHAUST Present OK 20220311 08:13:04
Fantray2 green Fantray2_2 38% EXHAUST Present OK 20220311 08:13:04
Fantray3 green Fantray3_1 38% EXHAUST Present OK 20220311 08:13:04
Fantray3 green Fantray3_2 38% EXHAUST Present OK 20220311 08:13:04
Fantray4 green Fantray4_1 38% EXHAUST Present OK 20220311 08:13:04
Fantray4 green Fantray4_2 38% EXHAUST Present OK 20220311 08:13:04
Fantray5 green Fantray5_1 38% EXHAUST Present OK 20220311 08:13:04
Fantray5 green Fantray5_2 38% EXHAUST Present OK 20220311 08:13:04
Fantray6 green Fantray6_1 38% EXHAUST Present OK 20220311 08:13:04
Fantray6 green Fantray6_2 38% EXHAUST Present OK 20220311 08:13:04
N/A green PSU1_FAN1 23% EXHAUST Present OK 20220311 08:13:04
N/A green PSU2_FAN1 22% EXHAUST Present OK 20220311 08:13:04
root@as7726-32x-2:~# show platform temp
Sensor Temperature High TH Low TH Crit High TH Crit Low TH Warning Timestamp

PSU1_TEMP1 28 N/A N/A N/A N/A False 20220311 08:13:04
PSU2_TEMP1 25 N/A N/A N/A N/A False 20220311 08:13:04
TEMP1 23.5 80.0 N/A N/A N/A False 20220311 08:13:04
TEMP2 27 80.0 N/A N/A N/A False 20220311 08:13:04
TEMP3 24 80.0 N/A N/A N/A False 20220311 08:13:04
TEMP4 27 80.0 N/A N/A N/A False 20220311 08:13:04
TEMP5 24 80.0 N/A N/A N/A False 20220311 08:13:04

Co-authored-by: Jostar Yang <jostar_yang@accton.com.tw>
2022-03-29 13:36:58 -07:00
Santhosh Kumar T
e2502edefd
Refactoring DELL platform init to reduce rc.local processing time porting changes in master (#10318)
Why I did it
To reduce the processing time of rc.local, refactoring s6100 platform initialization.
Porting changes from 202012 branch [202012] Refactoring DELL platform init to reduce rc.local processing time #10171
2022-03-24 11:14:37 -07:00
pavannaregundi
7eb321c376
[Marvell-armhf] Setting u-boot ftd_high to resolve kernel hung (#10204)
Why I did it
Kernel hang in during early boot is caused due overwriting of device tree with uncompressing kernel. Added the fdt_high which gives a safe offset from kernel location.

How I did it
Setting uboot environment variable fdt_high.

How to verify it
Successful boot of bullseye kernel on Marvell Armada 380/385.

Change-Id: I3e2521780f5ecdb3bdf6cbb6542250814ca11959
Signed-off-by: Pavan Naregundi <pnaregundi@marvell.com>
2022-03-24 07:24:49 -07:00
pavannaregundi
97c02075f5
[Marvell-armhf] Fixing issues related to partition label (#10203)
Why I did it
Removing incorrect check in plt setup for fw_env config: This check was added before to compare 2 different types of disk. Now the check is redundant and check is not required as transition is complete.
2)Removing legacy_volume_label in create_partition: legacy_volume_label is not used in armhf install files. With legacy_volume_label initialized to NULL, current code will always return true for check, if demo_part exits.

How I did it
Change is about removing the redundant/incorrect code explained above.

How to verify it
uboot fw_printenv and fw_setenv is tested
onie-nos-install has be verified.

Signed-off-by: Pavan Naregundi <pnaregundi@marvell.com>
2022-03-24 07:24:13 -07:00
Dror Prital
8bc81206c5
[Mellanox] Update NVIDIA License header for files changed since 1.1.2022 (#10289)
- Why I did it
Update NVIDIA Copyright header to "mellanox" files which were changed since 1.1.2022

- How I did it
Update the copyright header

- How to verify it
Sanity tests and PR checkers.
2022-03-23 13:19:25 +02:00
Oleksandr Ivantsiv
9565ef7a9a
[Mellanox] Refactor SFP to use new APIs. (#10317)
- Why I did it
Refactor SFP code to remove code duplication and to be able to use the latest features available in new APIs.

- How I did it
Refactor SFP code to remove code duplication and to be able to use the latest features available in new APIs.

- How to verify it
Run sonic-mgmt/platform_tests/sfp tests
2022-03-23 09:16:03 +02:00
dflynn-Nokia
99b7891e4d
[Nokia ixs7215] Fixes to support Debian bullseye (#10309)
The following changes are provided to support bullseye and the latest master
branch content.
    - Accommodate relocated fan and thermal sysfs entries in bullseye
    - Add support for chassis and PSU HW revision


Why I did it
Fix platform issues introduced by the bullseye kernel upgrade.

How I did it
Minor fixes to Nokia ixs7215 platform code

How to verify it
Execute the following CLI commands
show platform summary
show platform fan
show platform temperature
2022-03-22 17:58:46 -07:00
Kebo Liu
42ab5b8eaa
[Mellanox] update MFT version to 4.18.0-106 (#10304)
- Why I did it
With the previous MFT 4.18.1-16 there is a bug in mstdump tool accessing wrong address. it is confirmed this issue does not exist in official 4.18.0-106.

- How I did it
Update the MFT version to 4.18.0-106

- How to verify it
Run regression on Mellanox platforms
2022-03-21 19:38:37 +02:00
Junchao-Mellanox
f0ddd102d5
[Mellanox] Add CPU thermal control for Nvidia platforms (#10202)
Why I did it
Add CPU thermal control for Nvidia platforms which will be enabled for platforms that have heavy CPU load. Now it is only enabled on 4800, and it will be enabled on future platforms.

How I did it
Check CPU pack temperature and update cooling level accordingly

How to verify it
Manual test
Added sonic-mgmt test case, PR link will update later
2022-03-21 09:54:52 -07:00
Junchao-Mellanox
dc04d64219
[Mellanox] Fix issue: psu might use wrong voltage sysfs which causes invalid voltage value (#10231)
- Why I did it
Fix issue: psu might use wrong voltage sysfs which causes invalid voltage value. The flow is like:

1. User power off a PSU
2. All sysfs files related to this PSU are removed
3. User did a reboot/config reload
4. PSU will use wrong sysfs as voltage node

- How I did it
Always try find an existing sysfs.

- How to verify it
Manual test
2022-03-20 10:34:04 +02:00
Samuel Angebault
3a4ed99593
[Arista] Update driver submodules (#10267)
- Fix i2c bus on crow cpu
 - Fix exception handling in logs
 - Improve linecard mgmt interface configuration
 - Add new PSU models for chassis
 - Misc fixes
2022-03-18 12:10:30 -07:00
FuzailBrcm
d3f2da8679
[pddf]: Adding support for fan_drawer class in PDDF common platform APIs (#10213)
Why I did it
fan_drawer support was missing in PDDF common platform APIs. This resulted in 'thermalctld' not working and 'show platform fan' and 'show platfomr temperature' commands not working.
_thermal_list array inside PSU class was not initialized. Made changes to attach the PSU related thermal sensors in the PSU instance.
How I did it
Added a common class pddf_fan_drawer.py. This class uses the PDDF JSON to fetch the platform specific data. A platform which uses PDDF would follow the below hierarchy.

fan_drawer_base.py ---> pddf_fan_drawer.py ---> fan_drawer.py

How to verify it
Run the 'show platform fan' and 'show platform temperature' commands and check the o/p.

o/p on AS7326:

root@sonic:/home/admin# show platform fan
s  Drawer    LED         FAN    Speed    Direction    Presence    Status          Timestamp
--------  -----  ----------  -------  -----------  ----------  --------  -----------------
Fantray1  green  Fantray1_1      38%      EXHAUST     Present        OK  20220311 04:15:03
Fantray1  green  Fantray1_2      38%      EXHAUST     Present        OK  20220311 04:15:03
Fantray2  green  Fantray2_1      38%      EXHAUST     Present        OK  20220311 04:15:03
Fantray2  green  Fantray2_2      38%      EXHAUST     Present        OK  20220311 04:15:03
Fantray3  green  Fantray3_1      38%      EXHAUST     Present        OK  20220311 04:15:03
Fantray3  green  Fantray3_2      38%      EXHAUST     Present        OK  20220311 04:15:03
Fantray4  green  Fantray4_1      38%      EXHAUST     Present        OK  20220311 04:15:03
Fantray4  green  Fantray4_2      38%      EXHAUST     Present        OK  20220311 04:15:03
Fantray5  green  Fantray5_1      38%      EXHAUST     Present        OK  20220311 04:15:03
Fantray5  green  Fantray5_2      38%      EXHAUST     Present        OK  20220311 04:15:03
Fantray6  green  Fantray6_1      38%      EXHAUST     Present        OK  20220311 04:15:03
Fantray6  green  Fantray6_2      38%      EXHAUST     Present        OK  20220311 04:15:03
     N/A    off   PSU1_FAN1       0%                  Present    Not OK  20220311 04:15:05
     N/A  green   PSU2_FAN1      34%      EXHAUST     Present        OK  20220311 04:15:05
hroot@sonic:/home/admin# show platform temperature
    Sensor    Temperature    High TH    Low TH    Crit High TH    Crit Low TH    Warning          Timestamp
----------  -------------  ---------  --------  --------------  -------------  ---------  -----------------
PSU1_TEMP1            0          N/A       N/A             N/A            N/A      False  20220311 04:15:05
PSU2_TEMP1           37          N/A       N/A             N/A            N/A      False  20220311 04:15:05
     TEMP1           37         80.0       N/A             N/A            N/A      False  20220311 04:15:05
     TEMP2           27         80.0       N/A             N/A            N/A      False  20220311 04:15:05
     TEMP3           28.5       80.0       N/A             N/A            N/A      False  20220311 04:15:05
     TEMP4           30.5       80.0       N/A             N/A            N/A      False  20220311 04:15:05
root@sonic:/home/admin#
root@sonic:/home/admin#
root@sonic:/home/admin#
o/p on AS7726:

root@as7726-32x-2:~# show platform fan
  Drawer    LED         FAN    Speed    Direction    Presence    Status          Timestamp
--------  -----  ----------  -------  -----------  ----------  --------  -----------------
Fantray1  green  Fantray1_1      38%      EXHAUST     Present        OK  20220311 08:13:04
Fantray1  green  Fantray1_2      38%      EXHAUST     Present        OK  20220311 08:13:04
Fantray2  green  Fantray2_1      38%      EXHAUST     Present        OK  20220311 08:13:04
Fantray2  green  Fantray2_2      38%      EXHAUST     Present        OK  20220311 08:13:04
Fantray3  green  Fantray3_1      38%      EXHAUST     Present        OK  20220311 08:13:04
Fantray3  green  Fantray3_2      38%      EXHAUST     Present        OK  20220311 08:13:04
Fantray4  green  Fantray4_1      38%      EXHAUST     Present        OK  20220311 08:13:04
Fantray4  green  Fantray4_2      38%      EXHAUST     Present        OK  20220311 08:13:04
Fantray5  green  Fantray5_1      38%      EXHAUST     Present        OK  20220311 08:13:04
Fantray5  green  Fantray5_2      38%      EXHAUST     Present        OK  20220311 08:13:04
Fantray6  green  Fantray6_1      38%      EXHAUST     Present        OK  20220311 08:13:04
Fantray6  green  Fantray6_2      38%      EXHAUST     Present        OK  20220311 08:13:04
     N/A  green   PSU1_FAN1      23%      EXHAUST     Present        OK  20220311 08:13:04
     N/A  green   PSU2_FAN1      22%      EXHAUST     Present        OK  20220311 08:13:04
root@as7726-32x-2:~# show platform temp
    Sensor    Temperature    High TH    Low TH    Crit High TH    Crit Low TH    Warning          Timestamp
----------  -------------  ---------  --------  --------------  -------------  ---------  -----------------
PSU1_TEMP1           28          N/A       N/A             N/A            N/A      False  20220311 08:13:04
PSU2_TEMP1           25          N/A       N/A             N/A            N/A      False  20220311 08:13:04
     TEMP1           23.5       80.0       N/A             N/A            N/A      False  20220311 08:13:04
     TEMP2           27         80.0       N/A             N/A            N/A      False  20220311 08:13:04
     TEMP3           24         80.0       N/A             N/A            N/A      False  20220311 08:13:04
     TEMP4           27         80.0       N/A             N/A            N/A      False  20220311 08:13:04
     TEMP5           24         80.0       N/A             N/A            N/A      False  20220311 08:13:04
2022-03-18 11:14:10 -07:00
Jeff Henning
ad6200029f
Edgecore 4630/5835/7326/7816 API2.0 platform support (#10053)
* Initial pass of EdgeCore platform changes.

* Remove libevent dependency from lldpd.

* Remove python2 dependencies python3.7 force from platform install script.

* Include usbmount support changes.

* Add missing 4630 install file.

* Update a few file permissions.  Add umask line to Makefile.  Specify python3.9 in install script.

* Misc platform updates:
- Add missing fan drawer component to sonic_platform
- Remove kernel version specification from Makefile
- Update to 4630 utility

* - Fix file permissions on source files
- Fix compile issue with 4630 driver modules (set_fs, get_fs, no longer supported in kernel 5.10)

* Fix missing/extra parens in 4630 util script.

* Fix indentation in fanutil.py.

* Integrate deltas from Edgecore to ec_platform branch.

* Installer update from Edgecore to resolve smbus serial console errors.

* Update stable_size for warm boot.

* Fix SFP dictionary key to match xcvrd.

* - Add missing define in event.py files needed for xcvrd
- Fix SFP info dict key for 7xxx switches

* 5835 platform file updates including installer and 5835 utility.

* 5835 fix for DMAR errors on serial console.

* Don't skip starting thermalctld in the pmon container.

* Revert several changes that were not related to platform.

* Run thermalctld in pmon container.

* Don't disable thermalctld in the pmon container.

* Fix prints/parens in 7816 install utility.

* - Incorporate 7816 changes from Edgecore
- Fix 7326 driver file using old kernel function

* Update kernel modules to use kernel_read().

* Fix compile errors with 7816 and 7326 driver modules.

* Fix some indents preventing platform files from loading.

* Update 7816 platform sfp dictionary to match field names in xcvrd.

* Add missing service and util files for 7816.

* Update file names, etc. based on full SKU for 7816.

* Delete pddf files not needed.  These were causing conflicts with API2.0
implementation.

* Remove pddf files suggested by Edgecore that were preventing API2.0 support from starting.

* Install API2.0 file instead of pddf.

* Update 7326 mac service file to not use pddf.  Fix syntax errors in 7326 utility script.

* Fix sonic_platform setup file for 7326.

* Fix syntax errors in python scripts.

* Updates to 7326 platform files.

* Fix some tab errors pulled down from master merge.

* Remove pddf files that were added from previous merge.

* Updates for 5835.

* Fix missing command byte for 5835 psu status.

* Fix permission bits on 4630 service files.

* Update platforms to use new SFP refactoring.

* Fix unused var warnings.
2022-03-18 06:17:08 +05:30
Arun Saravanan Balachandran
5c7aa50463
DellEMC: Z9332f - Component API Fixes (#10187) 2022-03-16 14:16:26 -07:00
FuzailBrcm
98cfec2982
Using SFP refactoring framework in PDDF sfp class (#10047)
* Using SFP refactoring framework in PDDF sfp class

* Fixing a typo error
2022-03-16 06:35:34 +05:30
gechiang
004dc69910
[BRCM SAI 6.0.0.13-3] Fix Warmreboot issue (#10225) 2022-03-15 11:50:33 -07:00
Yang Wang
a29ba9cf22
Correct thrift 0141 typo fix (#10199)
Correct libsaithrift dependency package name from
LIBTHRIFT_DEV_0_14_1 THRIFT_COMPILER_0_14_1 to
LIBTHRIFT_0_14_1_DEV THRIFT_0_14_1_COMPILER

How I did it
How to verify it
Test Done:

make BLDENV=buster SAITHRIFT_V2=y -f Makefile.work target/debs/buster/saiserverv2_0.9.4_amd64.deb
2022-03-11 17:35:17 +08:00