Add support for sfputil show eeprom and sfputil reset, along with some component test case fixes
Co-authored-by: Carl Keene <keene@nokia.com>
Why I did it
Fix the issue where the system EEPROM is unavailable on DX010.
How I did it
Enable the I2C slave 30 ms timeout mechanism.
How to verify it
Run the i2cstress test on the DX010 iSMT controller bus.
Co-authored-by: nicwu-cel <nicwu@celestica.com>
Update the Makefile so that it does the following:
For a given platform, check whether platform/checkout/.ini exists and,
if so, run platform/checkout/template.j2. This allows platform
code to be checked out during the 'make configure' stage.
Why I did it
The serial-getty service exited randomly on Dell S6100 devices.
How I did it
Added serial-getty to monit services.
How to verify it
Stop serial-getty from an SSH session and check whether the service restarts.
The branch refers to the name of the branch that the commit is in,
for example master, 202012, 201911, ...
If there is no branch, the name will be HEAD.
The release is encoded in the /etc/sonic/sonic_release file.
This file is only available on a release branch.
It is not available on the master branch.
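The release lookup described above could be sketched as follows. This is a minimal illustration, not the actual build or runtime code: the helper name read_release and the assumption that the file holds the release name as plain text are hypothetical; only the file path comes from the description.

```python
import os

# Path named in the description above.
RELEASE_FILE = "/etc/sonic/sonic_release"


def read_release(path=RELEASE_FILE):
    """Return the release string, or None when the file is absent.

    Hypothetical helper: assumes the file holds the release name as plain text.
    """
    if not os.path.isfile(path):
        # Expected on images built from master, which have no release file.
        return None
    with open(path) as release_file:
        return release_file.read().strip() or None
```

A caller can treat a None result as "not a release build" and fall back to the branch name.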
Signed-off-by: Guohan Lu <lguohan@gmail.com>
Why I did it
The platform test suite failed for a few APIs on the DellEMC Z9332f platform.
How I did it
Modified the APIs in the script to return the expected values.
How to verify it
Run platform test suite after making the changes.
This is to pick up BRCM SAI 4.3.5.1 which contains the following fix:
CS00012201406: [4.3.3.9] SAI_STATUS_FAILURE on FDB flush after all ports flapped
Preliminary tests look fine. BGP neighbors were all up with proper routes programmed,
and all interfaces were up.
Manually ran the following test cases on z9332f (TH3) T0 DUT and all passed:
```
ipfwd/test_dir_bcast.py
fib/test_fib.py
```
Manually ran the following test cases on S6100 (TH) and all passed:
```
ipfwd/test_dir_bcast.py
fdb/test_fdb.py
```
This PR only contains backports from master
Fix a leak discovered on master; although 202012 is not affected, it is better to have the fix (fixes [master] thermalctld leak on Arista devices makes them unreachable when memory is exhausted #7515)
Fix the EepromDecoder implementation in the platform API (fixes syseepromd crashing repeatedly on SONiC.20201231.02 #8263)
Fix Mineral platform definition and configuration
Fix build issues in environments where /proc is not mounted or is restricted (fixes PLATFORM=broadcom fails arista "ReloadCauseManagerTest" first time #7800)
Fix some pytest issues
Add sfp-eeprom C API and also mount it in pmon
Why I did it
In rare cases, a BIOS upgrade cannot guarantee that the bus value remains the same across BIOS releases. This change ignores the bus field so that pcied does not fail, while still verifying the device ID in a different way. The solution is future-proof and will not require code changes when a new BIOS version becomes available.
How I did it
Since the bus is not a fixed value (it is determined by the BIOS version), we ignore this field and instead check whether there is a device that matches on all other fields and, in addition, has a matching device ID.
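A minimal sketch of this kind of bus-agnostic match, under stated assumptions: the function name device_present and the dictionary field names (bus, dev, fn, id) are illustrative and are not taken from the pcied source.

```python
def device_present(expected, discovered):
    """Return True if a discovered device matches `expected` on every field except 'bus'.

    Both arguments use plain dicts; the field names are assumptions for this sketch.
    """
    # Compare every expected field except the bus, which may change with the BIOS.
    keys = [key for key in expected if key != "bus"]
    return any(
        all(device.get(key) == expected[key] for key in keys)
        for device in discovered
    )
```

For example, an expected entry with bus "00" still matches a discovered device on bus "01" as long as the remaining fields, including the device ID, are equal.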
How to verify it
Verify that there are no errors or failures in pcied on different BIOS versions with the same code base.
- Why I did it
To prevent a Python exception when executing the warm-reboot command on the Mellanox simulator platform.
- How I did it
Return None from the watchdog Python script when the watchdog file does not exist.
- How to verify it
warm-reboot runs without the Python error. An error message will still appear in the log in these cases.
To avoid this error message, we can simulate the watchdog on the Mellanox simulator platform.
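The "return None when the watchdog file is missing" behavior can be sketched as below. This is an illustration only: the Watchdog stand-in class, the get_watchdog helper, and the default /dev/watchdog path are assumptions, not the actual Mellanox platform code.

```python
import os


class Watchdog:
    """Hypothetical stand-in for the platform watchdog object."""

    def __init__(self, device_path):
        self.device_path = device_path


def get_watchdog(device_path="/dev/watchdog"):
    """Return a Watchdog, or None when the device file does not exist."""
    if not os.path.exists(device_path):
        # Simulator platforms may have no watchdog device at all.
        return None
    return Watchdog(device_path)
```

Callers then need to handle a None result instead of relying on an exception being raised.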
- Removed the old function for detecting a faulty fan.
- Removed the old function for detecting excess temperature.
- Implement thermal_manager APIs based on ThermalManagerBase
- Implement thermal_conditions APIs based on ThermalPolicyConditionBase
- Implement thermal_actions APIs based on ThermalPolicyActionBase
- Implement thermal_info APIs based on ThermalPolicyInfoBase
- Add thermal_policy.json
Why I did it
To determine the revision of the pcie.yaml to be used based on BIOS version in DellEMC S6100 platform.
Depends on: Azure/sonic-platform-common#195
How I did it
Added two revisions of pcie.yaml: pcie_1.yaml and pcie_2.yaml.
Included a platform-specific Pcie class that provides the revision of pcie.yaml to be used by pcieutil/pcied.
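One way the revision selection could look is sketched below. The helper names, the dotted version format, and the cutoff value are all hypothetical; the description does not state which BIOS versions map to which revision, so the mapping here is purely illustrative.

```python
def parse_version(text):
    """Turn a dotted version string such as '2.10.0' into a comparable tuple."""
    return tuple(int(part) for part in text.split("."))


def pcie_yaml_for_bios(bios_version, cutoff="2.0.0"):
    """Pick pcie_1.yaml or pcie_2.yaml from the BIOS version.

    The cutoff is a made-up value used only to show the selection logic.
    """
    revision = 1 if parse_version(bios_version) < parse_version(cutoff) else 2
    return "pcie_{}.yaml".format(revision)
```

Comparing tuples of integers rather than raw strings avoids ordering "2.10.0" before "2.9.0".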
How to verify it
Execute the pcieutil check command (Azure/sonic-utilities#1672) and verify the list of PCIe devices displayed.
Logs: UT_logs.txt
Use udevadm to trigger the udev rules on the first boot
How to verify:
- Connect C0 with E1031;
- Install or upgrade SONiC to the 202012 branch;
- After logging in to SONiC, check that /dev/C0-1 through /dev/C0-48 exist.
Update FW version to 2008.3218, fixing the following issues:
- 50G/100G links that are operationally down before warm-reboot are not coming up after warm-reboot
- 50G/100G links with admin shut / no shut commands are not coming up after warm-reboot
Signed-off-by: Dror Prital <drorp@nvidia.com>
- Changes and new features:
1. Added support in SN4600C systems for new module Finisar ET7402-CWDM4 (100G CWDM4 QSFP28 1310nm SM 2KM).
2. Added support for new module MMS1W50-HM (2km transceiver FR4) for 200GbE
3. Improved performance of "per-port-buffer" counters
4. Added support for Kernel 5.10
- Bug fix:
On rare occasions (0.5%), in SN4600C systems, when using 100GbE NRZ mode and Fastboot flow, the link up time may take up to 10 seconds
Signed-off-by: Dror Prital <drorp@nvidia.com>
#### Why I did it
Support API 2.0 for S5248F platform
#### How I did it
Made changes to the S5248F platform-specific directory
Co-authored-by: Arun LK <Arun_L_K@dell.com>
#### Why I did it
The debian install files are required for installing sonic_platform packages
#### How I did it
Add install files under the debian folder
Co-authored-by: robert.hong <robert.hong@qct.io>
Add device and platform code for ix7-bwde, ix8a-bwde.
Support platform API 2.0 for all quanta platforms except for ix1b
Co-authored-by: robert.hong <robert.hong@qct.io>
Why I did it
There were two regression issues introduced by BRCM SAI 4.3.3.8:
CS00012196056 [4.3.3.8][WARMBOOT] syncd[2584]: segfault at 5616ad6c3d80 ip 00007f61e0c6bc65 sp 00007fff0c5a7a90 error 4 in libsai.so.1.0[7f61e0a95000+3cd8000]
CS00012195956 [4.3.3.8] [TD3]Syncd Crash at brcm_sai_tnl_mp_create_tunnel()
How I did it
Patch for CS00012195956 from BRCM was validated to have addressed the tunnel creation issue.
Temporarily worked around the issue by commenting out a portion of questionable code in BRCM SAI that seems to be the root cause of CS00012196056.
How to verify it
See the BRCM cases for details.
- Why I did it
Split and bulk counter bug fixes:
Init port auto neg to default on static (SAI XML) port split for 2nd+ port
- How I did it
Update submodule hash pointer.
- How to verify it
Verify that the above is handled properly; the reported issues are assumed to be fixed.
Signed-off-by: Shlomi Bitton <shlomibi@nvidia.com>
- Why I did it
Remove EEPROM cache file and use DB instead
- How I did it
Read EEPROM data from DB if possible
If data is not ready in DB, read from hardware using a visitor pattern
- How to verify it
Manual test and regression
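The DB-first read with a hardware fallback can be sketched as follows. This shows only the fallback order; it does not reproduce the visitor pattern mentioned above. The function name read_eeprom, the EEPROM_INFO key, and the use of a plain dict in place of the database are assumptions for illustration.

```python
def read_eeprom(db, read_hardware, key="EEPROM_INFO"):
    """Return EEPROM data from the DB when present, else read it from hardware.

    `db` is any mapping standing in for the database; `read_hardware` is a
    callable that performs the (slower) hardware read.
    """
    data = db.get(key)
    if data:
        return data
    # Data is not ready in the DB yet, so fall back to the hardware read.
    return read_hardware()
```

Because the hardware reader is passed in as a callable, it is only invoked when the DB has no data.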
#### Why I did it
Failures were observed when running the open community platform test suite (sonic-mgmt)
#### How I did it
Call the PSUBase class initializer from the derived class
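A minimal sketch of why the base initializer matters. The PsuBase class below is a simplified stand-in written for this example, not the real sonic_platform_base class, and the attribute and method names are assumptions.

```python
class PsuBase:
    """Simplified stand-in for the platform PSU base class."""

    def __init__(self):
        # State that base-class methods rely on.
        self._fan_list = []

    def get_num_fans(self):
        return len(self._fan_list)


class Psu(PsuBase):
    def __init__(self, index):
        # Without this call, self._fan_list would never be created and
        # get_num_fans() would raise AttributeError.
        super().__init__()
        self.index = index
```

With the initializer call in place, inherited methods such as get_num_fans() work on a freshly constructed Psu.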
There was an issue on master where `thermal.get_position_in_parent` in the platform API was returning -1 instead of a proper index. This is a backport of the fix for that issue.
- Why I did it
This back-ports Azure 7410 to the 202012 branch.
Enhance Python 3 support in the platform API. Originally, some platform APIs called SDK APIs that did not support Python 3. Now that Python 3 APIs are supported in SDK 4.4.3XXX, Python 3 is fully supported by the platform API.
- How I did it
Start all platform daemons from python3
1. Remove the #!/usr/bin/env python line at the beginning of each platform API file, as the platform API files are not started as daemons but are imported by other daemons.
2. Adjust SDK API calls accordingly
Signed-off-by: Stephen Sun <stephens@nvidia.com>
- Why I did it
Adjust the Makefile for SDK/python-SDK-API to support both python2 and python3
- How to verify it
Build the image and check whether python2 and python3 are both supported by SDK API.
Signed-off-by: Stephen Sun <stephens@nvidia.com>
Fix the determine-reboot-cause service, which was failing because get_reboot_cause raised NotImplementedError when the reboot was done with `sudo reboot` (cold reboot)
Signed-off-by: Volodymyr Boyko <volodymyrx.boiko@intel.com>
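The failure mode and one plausible shape of a fix can be sketched as below. The ChassisBase class here is a simplified stand-in, and returning a "Non-Hardware" cause with an empty description for a cold reboot is an assumption about the fix, not something stated in the commit message.

```python
class ChassisBase:
    """Simplified stand-in for the platform chassis base class."""

    REBOOT_CAUSE_NON_HARDWARE = "Non-Hardware"

    def get_reboot_cause(self):
        # An unimplemented getter is what made the service fail.
        raise NotImplementedError


class Chassis(ChassisBase):
    def get_reboot_cause(self):
        # Assumed behavior: report a software-initiated reboot as a
        # (cause, description) tuple instead of raising.
        return (self.REBOOT_CAUSE_NON_HARDWARE, "")
```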
Signed-off-by: Yong Zhao <yozhao@microsoft.com>
Why I did it
We currently leverage Supervisor to monitor the running status of critical processes in each container, which is more reliable and flexible than monitoring them with Monit. So we removed the Monit-based monitoring of critical processes.
How I did it
I removed the script process_checker and corresponding Monit configuration entries of critical processes.
How to verify it
I verified this on the device str-7260cx3-acs-1.
This is due to the fact that we use SONIC_OVERRIDE_BUILD_VARS internally
in our build jobs, and this is not accounted for in the caching framework.
So we add MLNX_SDK_DEB_VERSION to force a rebuild if we change it via
SONIC_OVERRIDE_BUILD_VARS.
Signed-off-by: Stepan Blyschak <stepanb@nvidia.com>
#### Why I did it
- After [sonic-linux-kernel#177](https://github.com/Azure/sonic-linux-kernel/pull/177) changes, the I2C mux channels of Baseboard and Switchboard CPLDs are moved from i2c-4 and i2c-5 to i2c-36 and i2c-37 respectively.
- This caused QSFP driver initialization on i2c-36 to i2c-41 to fail, which in turn caused the ports from Ethernet208 to Ethernet248 to fail.
#### How I did it
- The fix to this problem is to change the order of QSFP driver initialization to I2C mux channels.
- Instead of the order i2c-10 to i2c-41, the order i2c-4 to i2c-35 is being utilized.
- Also, the i2c-mux-channel numbers for the Baseboard CPLD and Switchboard CPLD need to be changed in the scripts that access them.
#### Why I did it
On our platforms syncd must be up while using the sonic_platform.
The issue is that the warm-reboot script first disables syncd and then instantiates Chassis, which tries to connect to syncd in __init__.
#### How I did it
Refactor Chassis to lazy initialize components.
Signed-off-by: Volodymyr Boyko <volodymyrx.boiko@intel.com>
Initialize fans and thermals lists on demand; make them properties in order to reduce Chassis object initialization time
Signed-off-by: Volodymyr Boyko <volodymyrx.boiko@intel.com>
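The lazy-initialization approach from the two commits above can be sketched as follows. This is a simplified illustration: the _build helper, the item counts, and the build_calls counter are hypothetical and exist only to make the deferred construction visible.

```python
class Chassis:
    def __init__(self):
        # Nothing expensive happens here; the lists are built on first access.
        self._fan_list = None
        self._thermal_list = None
        self.build_calls = 0  # illustration only: counts list constructions

    def _build(self, prefix, count):
        """Stand-in for the costly component discovery."""
        self.build_calls += 1
        return ["{}{}".format(prefix, i) for i in range(1, count + 1)]

    @property
    def fan_list(self):
        if self._fan_list is None:
            self._fan_list = self._build("fan", 4)
        return self._fan_list

    @property
    def thermal_list(self):
        if self._thermal_list is None:
            self._thermal_list = self._build("thermal", 2)
        return self._thermal_list
```

Constructing Chassis() performs no component setup, so a script that only needs the object itself never pays for (or fails on) fan and thermal initialization.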
- Why I did it
Update FW version to 2008_3110 fixing SN3800 specific warm boot scenario:
1. Disable interface
2. Warm Boot
3. Enable Interface --> link will remain down.
- How I did it
Use new FW that contains the fix for the problem mentioned above
- How to verify it
Run the scenario mentioned above and make sure that the link is up after warm boot
Signed-off-by: Dror Prital <drorp@nvidia.com>