Why I did it
Add CPU thermal control for Nvidia platforms which will be enabled for platforms that have heavy CPU load. Now it is only enabled on 4800, and it will be enabled on future platforms.
How I did it
Check CPU pack temperature and update cooling level accordingly
How to verify it
Manual test
Added sonic-mgmt test case, PR link will update later
- Why I did it
Fix issue: psu might use wrong voltage sysfs which causes invalid voltage value. The flow is like:
1. User power off a PSU
2. All sysfs files related to this PSU are removed
3. User did a reboot/config reload
4. PSU will use wrong sysfs as voltage node
- How I did it
Always try find an existing sysfs.
- How to verify it
Manual test
- Fix i2c bus on crow cpu
- Fix exception handling in logs
- Improve linecard mgmt interface configuration
- Add new PSU models for chassis
- Misc fixes
Why I did it
fan_drawer support was missing in PDDF common platform APIs. This resulted in 'thermalctld' not working and 'show platform fan' and 'show platfomr temperature' commands not working.
_thermal_list array inside PSU class was not initialized. Made changes to attach the PSU related thermal sensors in the PSU instance.
How I did it
Added a common class pddf_fan_drawer.py. This class uses the PDDF JSON to fetch the platform specific data. A platform which uses PDDF would follow the below hierarchy.
fan_drawer_base.py ---> pddf_fan_drawer.py ---> fan_drawer.py
How to verify it
Run the 'show platform fan' and 'show platform temperature' commands and check the o/p.
o/p on AS7326:
root@sonic:/home/admin# show platform fan
s Drawer LED FAN Speed Direction Presence Status Timestamp
-------- ----- ---------- ------- ----------- ---------- -------- -----------------
Fantray1 green Fantray1_1 38% EXHAUST Present OK 20220311 04:15:03
Fantray1 green Fantray1_2 38% EXHAUST Present OK 20220311 04:15:03
Fantray2 green Fantray2_1 38% EXHAUST Present OK 20220311 04:15:03
Fantray2 green Fantray2_2 38% EXHAUST Present OK 20220311 04:15:03
Fantray3 green Fantray3_1 38% EXHAUST Present OK 20220311 04:15:03
Fantray3 green Fantray3_2 38% EXHAUST Present OK 20220311 04:15:03
Fantray4 green Fantray4_1 38% EXHAUST Present OK 20220311 04:15:03
Fantray4 green Fantray4_2 38% EXHAUST Present OK 20220311 04:15:03
Fantray5 green Fantray5_1 38% EXHAUST Present OK 20220311 04:15:03
Fantray5 green Fantray5_2 38% EXHAUST Present OK 20220311 04:15:03
Fantray6 green Fantray6_1 38% EXHAUST Present OK 20220311 04:15:03
Fantray6 green Fantray6_2 38% EXHAUST Present OK 20220311 04:15:03
N/A off PSU1_FAN1 0% Present Not OK 20220311 04:15:05
N/A green PSU2_FAN1 34% EXHAUST Present OK 20220311 04:15:05
hroot@sonic:/home/admin# show platform temperature
Sensor Temperature High TH Low TH Crit High TH Crit Low TH Warning Timestamp
---------- ------------- --------- -------- -------------- ------------- --------- -----------------
PSU1_TEMP1 0 N/A N/A N/A N/A False 20220311 04:15:05
PSU2_TEMP1 37 N/A N/A N/A N/A False 20220311 04:15:05
TEMP1 37 80.0 N/A N/A N/A False 20220311 04:15:05
TEMP2 27 80.0 N/A N/A N/A False 20220311 04:15:05
TEMP3 28.5 80.0 N/A N/A N/A False 20220311 04:15:05
TEMP4 30.5 80.0 N/A N/A N/A False 20220311 04:15:05
root@sonic:/home/admin#
root@sonic:/home/admin#
root@sonic:/home/admin#
o/p on AS7726:
root@as7726-32x-2:~# show platform fan
Drawer LED FAN Speed Direction Presence Status Timestamp
-------- ----- ---------- ------- ----------- ---------- -------- -----------------
Fantray1 green Fantray1_1 38% EXHAUST Present OK 20220311 08:13:04
Fantray1 green Fantray1_2 38% EXHAUST Present OK 20220311 08:13:04
Fantray2 green Fantray2_1 38% EXHAUST Present OK 20220311 08:13:04
Fantray2 green Fantray2_2 38% EXHAUST Present OK 20220311 08:13:04
Fantray3 green Fantray3_1 38% EXHAUST Present OK 20220311 08:13:04
Fantray3 green Fantray3_2 38% EXHAUST Present OK 20220311 08:13:04
Fantray4 green Fantray4_1 38% EXHAUST Present OK 20220311 08:13:04
Fantray4 green Fantray4_2 38% EXHAUST Present OK 20220311 08:13:04
Fantray5 green Fantray5_1 38% EXHAUST Present OK 20220311 08:13:04
Fantray5 green Fantray5_2 38% EXHAUST Present OK 20220311 08:13:04
Fantray6 green Fantray6_1 38% EXHAUST Present OK 20220311 08:13:04
Fantray6 green Fantray6_2 38% EXHAUST Present OK 20220311 08:13:04
N/A green PSU1_FAN1 23% EXHAUST Present OK 20220311 08:13:04
N/A green PSU2_FAN1 22% EXHAUST Present OK 20220311 08:13:04
root@as7726-32x-2:~# show platform temp
Sensor Temperature High TH Low TH Crit High TH Crit Low TH Warning Timestamp
---------- ------------- --------- -------- -------------- ------------- --------- -----------------
PSU1_TEMP1 28 N/A N/A N/A N/A False 20220311 08:13:04
PSU2_TEMP1 25 N/A N/A N/A N/A False 20220311 08:13:04
TEMP1 23.5 80.0 N/A N/A N/A False 20220311 08:13:04
TEMP2 27 80.0 N/A N/A N/A False 20220311 08:13:04
TEMP3 24 80.0 N/A N/A N/A False 20220311 08:13:04
TEMP4 27 80.0 N/A N/A N/A False 20220311 08:13:04
TEMP5 24 80.0 N/A N/A N/A False 20220311 08:13:04
* Initial pass of EdgeCore platform changes.
* Remove libevent dependency from lldpd.
* Remove python2 dependencies python3.7 force from platform install script.
* Include usbmount support changes.
* Add missing 4630 install file.
* Update a few file permissions. Add umask line to Makefile. Specify python3.9 in install script.
* Misc platform updates:
- Add missing fan drawer component to sonic_platform
- Remove kernel version specification from Makefile
- Update to 4630 utility
* - Fix file permissions on source files
- Fix compile issue with 4630 driver modules (set_fs, get_fs, no longer supported in kernel 5.10)
* Fix missing/extra parens in 4630 util script.
* Fix indentation in fanutil.py.
* Integrate deltas from Edgecore to ec_platform branch.
* Installer update from Edgecore to resolve smbus serial console errors.
* Update stable_size for warm boot.
* Fix SFP dictionary key to match xcvrd.
* - Add missing define in event.py files needed for xcvrd
- Fix SFP info dict key for 7xxx switches
* 5835 platform file updates including installer and 5835 utility.
* 5835 fix for DMAR errors on serial console.
* Don't skip starting thermalctld in the pmon container.
* Revert several changes that were not related to platform.
* Run thermalctld in pmon container.
* Don't disable thermalctld in the pmon container.
* Fix prints/parens in 7816 install utility.
* - Incorporate 7816 changes from Edgecore
- Fix 7326 driver file using old kernel function
* Update kernel modules to use kernel_read().
* Fix compile errors with 7816 and 7326 driver modules.
* Fix some indents preventing platform files from loading.
* Update 7816 platform sfp dictionary to match field names in xcvrd.
* Add missing service and util files for 7816.
* Update file names, etc. based on full SKU for 7816.
* Delete pddf files not needed. These were causing conflicts with API2.0
implementation.
* Remove pddf files suggested by Edgecore that were preventing API2.0 support from starting.
* Install API2.0 file instead of pddf.
* Update 7326 mac service file to not use pddf. Fix syntax errors in 7326 utility script.
* Fix sonic_platform setup file for 7326.
* Fix syntax errors in python scripts.
* Updates to 7326 platform files.
* Fix some tab errors pulled down from master merge.
* Remove pddf files that were added from previous merge.
* Updates for 5835.
* Fix missing command byte for 5835 psu status.
* Fix permission bits on 4630 service files.
* Update platforms to use new SFP refactoring.
* Fix unused var warnings.
Correct libsaithrift dependency package name from
LIBTHRIFT_DEV_0_14_1 THRIFT_COMPILER_0_14_1 to
LIBTHRIFT_0_14_1_DEV THRIFT_0_14_1_COMPILER
How I did it
How to verify it
Test Done:
make BLDENV=buster SAITHRIFT_V2=y -f Makefile.work target/debs/buster/saiserverv2_0.9.4_amd64.deb
Signed-off-by: Taras Keryk <tarasx.keryk@intel.com>
Why I did it
The previous implementaion of API for platform component didn't have the new thrift files
How I did it
Add the new thrift-generated: pltfm_mgr_rpc.py, ttypes.py
How to verify it
Run manually 'fwutil show status' or run unit tests
Previous command output had no information about components
New command output
Chassis Module Component Version Description
------------------------ -------- ----------- --------- -------------
Chassis1 N/A BIOS 1.2.3 Chassis BIOS
BMC 5.1 Chassis BMC
* [BFN] Implementation API for platform component
SONiC has a concept of "platform components"
this may include - CPLD, FPGA, BIOS, BMC, etc.
These changes are needed to read the version of the BIOS and BMC component.
What I did
Create components.py module
Add funcion for reading componet version to thrift interface
How I did it
The previous implementaion didn't have platform components API, so fwutil return an empty list.
After implementation of the platform component API, we have actual list of platform components and firmware versions
How to verify it
Run manually 'fwutil show status' or run unit tests
Previous command output
Chassis Module Component Version Description
------------------------ -------- ----------- --------- -------------
New command output
Chassis Module Component Version Description
------------------------ -------- ----------- --------- -------------
Chassis1 N/A BIOS 1.2.3 Chassis BIOS
BMC 5.1 Chassis BMC
Signed-off-by: Taras Keryk <tarasx.keryk@intel.com>
* [BFN] Implementation API for platform component
SONiC has a concept of "platform components"
this may include - CPLD, FPGA, BIOS, BMC, etc.
These changes are needed to read the version of the BIOS and BMC component.
What I did
Create components.py module
Add funcion for reading componet version to thrift interface
How I did it
The previous implementaion didn't have platform components API, so fwutil return an empty list.
After implementation of the platform component API, we have actual list of platform components and firmware versions
How to verify it
Run manually 'fwutil show status' or run unit tests
Previous command output
Chassis Module Component Version Description
------------------------ -------- ----------- --------- -------------
New command output
Chassis Module Component Version Description
------------------------ -------- ----------- --------- -------------
Chassis1 N/A BIOS 1.2.3 Chassis BIOS
BMC 5.1 Chassis BMC
Signed-off-by: Taras Keryk <tarasx.keryk@intel.com>
* [BFN] Implementation API for platform component
get chassis name from json
* [BFN] Implementation API for platform component
Updated platform and platrom_components json
* [BFN] Implementation API for platform component
Fixed spaces in component.py
* [BFN] Implementation API for platform component
Fixed exception in component.py
* Update chassis.py
* [BFN] Implementation API for platform component
Fixed spaces in component.py, chassis.py
* [BFN] Implementation API for platform component: Fixed spaces in component.py, chassis.py
* Fixed exception in get_bios_version
Why I did it
N3248TE - Platform API 2.0 changes
How I did it
Implemented the functional API's needed for Platform API 2.0
How to verify it
Used the API 2.0 test suite to validate the test cases.
Signed-off-by: Jostar Yang jostar_yang@accton.com.tw
Why I did it
Fix led drv because CPLD SPEC is updated.
Fix i2c bus order
How I did it
Fix led drv. Set blacklist to i801 and ismt. Let accton util to modprobe i801 and ismt.
How to verify it
Test led and sensors cmd. Results are fine.
Why I did it
To fix I2C bus order to meet with HW SPEC. Let i801 use bus-0 and ismt use bus-1
How I did it
Modprobe i801 and then do ismt. So i801 will use bus-0 and ismt will use bus-1.
How to verify it
Test show cmd and sensors work well
Co-authored-by: Jostar Yang <jostar_yang@accton.com.tw>
As per linux kernel 5.10, 'force_deselct_on_exit' parameter used for driver i2c_mux_pca954x is no longer valid. Instead an attribute 'idle_state' is added per MUX device. This needs to be set to
-1 : For leaving the mux state as is
-2 : For deselecting the channel upon exit
: To always set a channel upon exit
This needs to be accommodated inside the PDDF JSON parser as well.
AS7816 support AT or non-AT DUT. They use different pmbus i2c bus. So use "pre_pddf_init.sh" to check this case.
Signed-off-by: Jostar Yang <jostar_yang@accton.com.tw>
- Why I did it
The latest upgrade of Mellanox hw-mgmt V7.0020.1300 introduced a couple new kernel modules for new Mellanox platforms that have yet to be upstreamed to the linux kernel.
As these new platforms do not have SONiC support we elected not to upstream these new drivers to sonic-linux-kernel but hw-mgmt expects them to exist which is causing a non-functional error on switch boot.
Feb 15 00:09:55.374130 r-leopard-simx-74 ERR systemd-modules-load[269]: Failed to find module 'emc2305'
Feb 15 00:09:55.374141 r-leopard-simx-74 ERR systemd-modules-load[269]: Failed to find module 'ads1015'
To resolve this we can patch hw-mgmt to no longer attempt to load these modules by default.
- How I did it
Added a SONiC patch to Mellanox hw-mgmt in order to remove the unused kernel modules which were not upstreamed to sonic-linux-kernel
- How to verify it
Boot switch and verify there are no error logs regarding kernel modules failing to load.
- Why I did it
In SONiC thermal control algorithm, it compares thermal zone temperature with thermal zone threshold. Previously, a thermal zone with no thermal sensor can still get its threshold. However, a recently driver patch changes this behavior: a thermal zone with no thermal sensor will return 0 for threshold. We need to ignore such thermal zone.
- How I did it
Ignore thermal zones whose temperature is 0.
- How to verify it
Added unit test case and Manual test
Fixes#10020
Why I did it
The platform api for parsing syseeprom information read from STATE DB has issue
with parsing the value part that has whitespace in the middle. The current
code assumes that the value part does not have whitespace. So everything after
the whitespace will be ignored. The syseeprom values returned from platform
API do not match the output of "show platform syseeprom".
How I did it
This change improved the regular expression for parsing syseeprom values to
accommodate whitespaces in the value.
How to verify it
Locally updated the code on a dx010 device. Call the platform API:
```
>>> import sonic_platform
>>> platform = sonic_platform.platform.Platform()
>>> chassis = platform.get_chassis()
>>> chassis.get_system_eeprom_info()
{'0x21': 'DX010', '0x22': 'R0872-F0020-02', '0x23': 'DX010B2F030A27BY200002', '0x24': '00:E0:EC:E7:71:0F', '0x25': '11/03/2020 21:22:56', '0x26': '3', '0x27': 'Seastone', '0x28': 'RANGELEY', '0x29': '2014.08', '0x2A': '131', '0x2B': 'CELESTICA', '0x2C': 'THA', '0x2D': 'Celestica', '0x2E': '1.0.5', '0x2F': 'LB', '0xFD': '', '0xFE': '0xAAB39BDB'}
```
Signed-off-by: Xin Wang <xiwang5@microsoft.com>
Generate the sai.profile base on the brcm j2 file if the sai.profile
is not existing in the dut mounted folder.
Change the supervisor service configuration accordingly.
Testing done:
Add the script and config in dut
saiservice server can start automatically with [systemctl start saiserver]
Signed-off-by: richardyu-ms <richard.yu@microsoft.com>
* Add support for Accton wedge100bf_32qs platform
This pull request is based on wedge100bf_32x.
The components on the mainboard are the same as wedge100bf_32x, except for tofino 32Q and COMe models, so it refers to wedge100bf_32x to create new model: wedge100bf_32qs.
Signed-off-by: alvin_feng <alvin_feng@accton.com>
* Fix lgtm alerts issues
Signed-off-by: alvin_feng <alvin_feng@accton.com>
* Modify some file permissions and use symlink to link wedge100bf-32qs/sonic_platform
Signed-off-by: alvin_feng <alvin_feng@accton.com>
* Remove switch-sai.conf file
Signed-off-by: alvin_feng <alvin_feng@accton.com>
* Modify platform.json to avoid platform TCs issues and changes for correct generating BUFFER_QUEUE values in DB.
Signed-off-by: alvin_feng <alvin_feng@accton.com>
* Fix error name in platform.json
* [PTF-SAIv2]Add ptf dockre for sai-ptf (saiv2)
Base on current ptf docker create a new docker for sai-ptf(saiv2)
upgrade related package
use the latest ptf and install it
test done:
NOJESSIE=1 NOSTRETCH=1 NOBULLSEYE=1 ENABLE_SYNCD_RPC=y make target/docker-ptf-sai.gz
BLDENV=buster make -f Makefile.work target/docker-ptf-sai.gz
* upgrade the thrift to 014
#### Why I did it
To bump the Thrift version to 0.14.1
- To avoid [CVE-2020-13949](https://nvd.nist.gov/vuln/detail/CVE-2020-13949)
- to fix some dependencies issues
#### How I did it
- rename `src/thrfit_0_13_0` to `src/thrift_2` to remove version number in the path. (`src/thrift` contains rules to build thrift 0.11.0 )
- Add thrift sources as submodule as there are no prepared debian packages for version >0.13.0 on [debian.org](https://packages.debian.org/search?searchon=sourcenames&keywords=thrift)
- Added patches with fixes for original thrift debian rules:(remove unneeded packages, fix multi job build)
#### How to verify it
```
BLDENV=buster make -f Makefile.work target/debs/buster/libthrift-dev_0.14.1_amd64.deb
```
- Why I did it
Update MFT to version 4.18.1-16 for bugs fixes and new SN2201 support
- How I did it
Advance to MFT tool version to 4.18.1-16
- How to verify it
Manually tested on all Mellanox platforms (ASIC FW Upgrade, link debug tools, CPLD upgrade, etc.)
ded0344 Return both 'vendor_rev' and 'hardware_rev' keys from get_transceiver_info to support both earlier and later versions of xcvrd
b3442cc Change log_error to log_info when transceiver module is transitioned
3d3a73c Fix problem introduced with new SFP caching paradigm
2915746 Add script to reboot all IMMs
17ad221 Fix the position_in_parent for the psu entity
207e731 No longer force read pages at SFP init time
b387921 Fixed the voltage and power with rounding to 2 digit decimals
Signed-off-by: mlok <marty.lok@nokia.com>
* [AS4630-54PE] Add to support PDDF
Signed-off-by: Jostar Yang <jostar_yang@accton.com.tw>
* Fix LGTM alerts
* Fix LGTM alerts
* Fix LGTM alerts
* Add post_device.sh to turn off stk led
* Add to support system_health
* Correct the wait and timeout mechanism for better CPU usage
* Add event.c to support port evt change
Co-authored-by: Jostar Yang <jostar_yang@accton.com.tw>
Support saiserver v2 with python3 and thrift 0.13.0
add variables to support the saiserverv2
build different thrift in saithrift depends on saiserver version
build differernt versions of saiserver
make the saiserver and saiserver docker with version number
test done:
build two different versions of sasiserver in local build environment
add saiserver to buster
Co-authored-by: richard.yu <richard.yu@microsoft.comwq>
* [AS7726-32X] Add to support mulit PSU in PDDF
Signed-off-by: Jostar Yang <jostar_yang@accton.com.tw>
* Modify SN offset and include path
* Fix device_name to PSU2-EEPROM in PSU2
Co-authored-by: Jostar Yang <jostar_yang@accton.com.tw>
- Why I did it
New version of mellanox platform management code available adding support for new platforms and fixing bugs.
- How I did it
1. Updated the submodule
2. Updated makefile version references
3. Regenerated SONiC patches
* Remove tm and all dependancies
Signed-off-by: Vadym Yashchenko <vadymx.yashchenko@intel.com>
* Removed line connected with thermal_manager
Signed-off-by: Vadym Yashchenko <vadymx.yashchenko@intel.com>
- Why I did it
Fix issue: 'sx_port_mapping_t' object has no attribute 'slot_id'. sx_port_mapping_t only has attribute slot.
- How I did it
Change slot_id to slot.
- How to verify it
Manual test
- Why I did it
Python select.select accept a optional timeout value in seconds, however, the value passes to it is a value in millisecond.
- How I did it
Transfer the value to millisecond.
- How to verify it
Manual test
- Why I did it
To include latest SDK fixes:
1. On CMIS modules, after low power configuration, the firmware waited for the module state to be ModuleReady instead of ModuleLowPower causing delays.
2. When connecting SN4600C, 100GbE port with CWDM4 module (Gen 3.0), link up time is 30 seconds.
and to include SAI fixes \ changes:
1. Reduce verbosity for resource check vendor data not found
2. Fix metadata validation, check default value on conditions check
3. Add 100MB, 10MB to 2201 system
4. L3 VXLAN overlay ECMP
5. VXLAN srcport API implementation
6. Fix scheduler profile null (default values) when set on sub group scheduler group
7. Fix ACL binding restoration when port leaves a LAG
8. Fix route logic for set next hop/action and reference counter for ECMP overlay
- How I did it
1. Updated SDK/FW submodule and relevant makefiles with the required versions.
2. Update SAI submodule and relevant makefile with the required version.
- How to verify it
Build an image and run tests from "sonic-mgmt".
Why I did it
Requirements from Microsoft for fwutil update all state that all firmwares which support this upgrade flow must support upgrade within a single boot cycle. This conflicted with a number of Mellanox upgrade flows which have been revised to safely meet this requirement.
How I did it
Added --no-power-cycle flags to SSD and ONIE firmware scripts
Modified Platform API to call firmware upgrade flows with this new flag during fwutil update all
Added a script to our reboot plugin to handle installing firmwares in the correct order with prior to reboot
How to verify it
Populate platform_components.json with firmware for CPLD / BIOS / ONIE / SSD
Execute fwutil update all fw --boot cold
CPLD will burn / ONIE and BIOS images will stage / SSD will schedule for reboot
Reboot the switch
SSD will install / CPLD will refresh / switch will power cycle into ONIE
ONIE installer will upgrade ONIE and BIOS / switch will reboot back into SONiC
In SONiC run fwutil show status to check that all firmware upgrades were successful
Why I did it
Old fan drv will be build fail under kernel 5.10. It get below error message.
/sonic/platform/broadcom/sonic-platform-modules-accton/as7312-54xs/modules/accton_as7312_54x_fan.c:483:5: error: implicit declarat ion of function 'set_fs'; did you mean 'sget_fc'? [-Werror=implicit-function-declaration]
set_fs(KERNEL_DS);
^~~~~~
sget_fc
How I did it
These code is old design and they are not needed currently. So remove them.
Signed-off-by: Jostar Yang <jostar_yang@accton.com>
Why I did it
Update Broadcom SAI to version 6.0.0.13, SDK 6.5.24, saibcm-modules to 6.5.24.gpl
How I did it
Brcm SAI 6.0 EA with fixes for CS00012203367, CS00012219613, CS00012213974, CS00012218290, CS00012217169, CS00012211718, CS00012213944, CS00012215529, CS00012218100, CS00012214196, CS00012212681, CS00012205138, CS00012208537, CS00012185316, CS00012208524, CS00012203367, CS00012197364.
- Why I did it
Optimize thermal control policies to simplify the logic and add more protection code in policies to make sure it works even if kernel algorithm does not work.
- How I did it
Reduce unused thermal policies
Add timely ASIC temperature check in thermal policy to make sure ASIC temperature and fan speed is coordinated
Minimum allowed fan speed now is calculated by max of the expected fan speed among all policies
Move some logic from fan.py to thermal.py to make it more readable
- How to verify it
1. Manual test
2. Regression
#### Why I did it
Build failed.
Due to the error message looks like build failed because `pyversions` utility was not found.
`pyversions` utility is a part of `python2-minimal` package and it wasn't installed.
#### How I did it
To avoid installing python2 just specify explicit python version `--with python3` and use build system for py3 `--buildsystem=pybuild`
#### How to verify it
Run build
* Add intel_iommu=off to installer.conf
* This solve flooding DMAR err msg: "handling fault status reg 2"
Signed-off-by: Sean Wu <sean_wu@edge-core.com>
* remove customized at24 driver
* Use kernel 5.10.46 upstream at24 driver directly. The ADDR16 issue on
old driver has gone.
Signed-off-by: Sean Wu <sean_wu@edge-core.com>
* pin I2C-0/I2C-1 bus order
* otherwise, sometimes I2C-0/I2C-1 will be assigned to the undesired one.
Signed-off-by: Sean Wu <sean_wu@edge-core.com>
* fix i2c bus num for fan driver
Signed-off-by: Sean Wu <sean_wu@edge-core.com>
* backward compatible with R0A/R0B HW
Signed-off-by: Sean Wu <sean_wu@edge-core.com>
* fix workdir for seastone2
Signed-off-by: Viktor Ekmark <viktor@ekmark.se>
* seastone2: Add I2C SFP definition for SFP1
Signed-off-by: Christian Svensson <blue@cmd.nu>
* [device/cel_seastone_2] sfputil logic for SFP1
Earlier logic resulted in the name of SFP1 being SFP33 which is not
correct. The cannonical source is seastone2_fpga module and it calls it
SFP1, so ensure the logic does as well.
Signed-off-by: Christian Svensson <blue@cmd.nu>
* [device/cel_seastone_2] sysfs paths for SFP1
Various changes that plumbs the correct port presence and DOM decoding
for the SFP1 port.
Signed-off-by: Christian Svensson <blue@cmd.nu>
Co-authored-by: Christian Svensson <blue@cmd.nu>
Enable pca954x idle_disconnect to avoid possible I2C device address conflict.
How I did it
Change pca954x device_attr idle_state to -2 (MUX_IDLE_DISCONNECT).
How to verify it
Cat pca954x device_attr idle_state and confirm the value is -2.
Signed-off-by: Sean Wu <sean_wu@edge-core.com>
Signed-off-by: Roman Savchuk romanx.savchuk@intel.com
Why I did it
Update SDK with fixed for warmboot tofino2
Added platform interface extension
How I did it
Create new package for Y2, Y1, X2, X1 profiles
How to verify it
Run test suites from sonic-mgmt repo
The SAI credo libraries don't depend on that library. This saves about
36MB of disk storage space.
Signed-off-by: Saikrishna Arcot <sarcot@microsoft.com>
Add docker-syncd-centec-rpc and docker-saiserver-centec to buster docker image list. These 2 docker images are based on docker-config-engine-buster.
Co-authored-by: Xianghong Gu <xgu@centecnetworks.com>
Why I did it
Recently additional sensors that were needed only for specific system added to all systems and caused errors.
How I did it
* Include CPU board and switch board sensors only on SN2201 system
* Fix issue in test_chassis_thermal, now it skips non existing thermals.
How to verify it
Run show platform temperature
Signed-off-by: liora <liora@nvidia.com>
Why I did it
Reboot cause is not working in S52xx platforms.
How I did it
Modified platform API's.
How to verify it
Check "show reboot-cause" to verify the reboot reason
Why I did it
To incorporate the below changes in DellEMC S6100, S6000 platforms.
S6100, S6000:
Implement 'get_revision' method for Chassis
Implement 'get_maximum_consumed_power' method for FanDrawer
Implement 'get_revision', 'get_maximum_supplied_power' methods for PSU
Implement 'get_error_description' method for SFP.
S6100:
Implement 'get_module_index' method for Chassis
Implement 'get_description', 'get_maximum_consumed_power', 'get_oper_status', 'get_slot' methods for Module
Update component names in platform.json
How I did it
Implement the platform API methods in the respective device files
How to verify it
Verified that the respective sonic-mgmt platform API test cases report success.
Why I did it
Some platforms need to run few steps before the PDDF service is actually started.
* Adding pre_pddf_init script in the service file
* Raising exception for get_target_speed() for PSU-fan in PDDF (#8129)
- Why I did it
PDDF utils were python2 compliant and they needed to be migrated to Python3 (as per Bullseye)
PDDF common platform APIs file name changed as the name was already in use
Indentation issues
Dead/redundant code needed to be removed
- How I did it
Made files Python3 compliant
Indentation corrected
Redundant code removed
- How to verify it
AS7326 Accton platform uses PDDF. PDDF utils were run on this platform to verify.
- Why I did it
There were compilation errors and warnings like,
/usr/src/linux-headers-5.10.0-8-2-common/scripts/Makefile.build:69: You cannot use subdir-y/m to visit a module Makefile. Use obj-y/m instead.
fatal error: linux/platform_data/pca954x.h: No such file or directory
hwmon_device_register() is deprecated. Please convert the driver to use hwmon_device_register_with_info().
If PDDF kernel module compilation fails, the PDDF debian package was not detecting the break.
- How I did it
Modified the code with new kernel 5.10 APIs.
Modified the Makefiles to use 'obj-m' instead of 'subdir-y'
- How to verify it
PDDF is supported on Accton platform. Load the build on AS7326 setup and check the 'dmesg'
- Why I did it
Added missing functionality for dynamic buffer calculation in Spectrum-4.
- How I did it
Added a section of code in asic_table.j2 for Spectrum-4, and added the simx version of SN5600 to the supported list.
- How to verify it
Manually: buffershow -l should show all ingress/egress lossy/lossless pools, and all fields of profiles should show values.
Automatically: https://github.com/Azure/sonic-mgmt/blob/master/tests/qos/test_buffer.py
Signed-off-by: Raphael Tryster <raphaelt@nvidia.com>
Why I did it
Adding platform support for centec v682-48y8c and v682-48x8c.
V682-48y8c switch has 48 SFP+ (1G/10G/25G) ports, 8 QSFP28 (40G/100G) ports on CENTEC TsingMa.MX.
V682-48y8c is different from V682-48y8c_d in that:
transceiver is managed by cpu smbus rather than TsingMa.MX i2c bus.
port led is managed by mcu inside TsingMa.MX.
fan, psu, sensors, leds are managed by cpu smbus other than the cpu board vendor's close sourse driver.
V682-48x8c switch has 48 SFP+ (1G/10G) ports, 8 QSFP28 (40G/100G) ports on CENTEC TsingMa.MX.
CPU used in v682-48y8c and v682-48x8c is Intel(R) Xeon(R) CPU D-1527.
How I did it
Modify related code in platform and device directory.
Upgrade centec sai to v1.9.
upgrade python to python3 and kernel version to 5.0 for V682-48y8c_d.
How to verify it
Build centec amd64 sonic image, verify platform functions (port, sfp, led etc) on centec v682-48y8c and v682-48x8c board.
Co-authored-by: shil <shil@centecnetworks.com>
- Why I did it
To fix an issue that hw-mgmt patches were not applied. One patch was already in upstream hw-mgmt package thus applying it again caused an error and no other patches were applied. Also, I did it to improve the Makefile, so that the make will fail in case patches fail to apply.
- How I did it
Removed obsolete patch, made applying patches a hard failure in the build.
- How to verify it
Run the make and verify patches are applied.
Signed-off-by: Stepan Blyschak <stepanb@nvidia.com>
- Why I did it
Rename platform x86_64-mlnx_msn4800 to x86_64-nvidia_msn4800
- How I did it
Rename platform folder as well as all code that reference the platform name
- How to verify it
Manual test
- Why I did it
To have an ability to use PRM sniffer.
- How I did it
Enabled the option in configure flags.
- How to verify it
Built and ran on switch. Enabled the feature in runtime and checked the sniffer recording.
Signed-off-by: Stepan Blyschak <stepanb@nvidia.com>
The same docker image is built multiple times after upgrading to bullseye, the build time is increased to about 15 hours from 6 hours.
See log: https://dev.azure.com/mssonic/be1b070f-be15-4154-aade-b1d3bfb17054/_apis/build/builds/50390/logs/9
Line 1437: 2021-11-11T11:15:02.7094923Z [ building ] [ target/docker-sonic-telemetry.gz ]
Line 1446: 2021-11-11T11:37:41.1073304Z [ finished ] [ target/docker-sonic-telemetry.gz ]
Line 1459: 2021-11-11T11:38:20.6293007Z [ building ] [ target/docker-sonic-telemetry.gz-load ]
Line 1462: 2021-11-11T11:38:28.1250201Z [ finished ] [ target/docker-sonic-telemetry.gz-load ]
Line 2906: 2021-11-11T18:57:42.8207365Z [ building ] [ target/docker-sonic-telemetry.gz ]
Line 2917: 2021-11-11T19:43:47.1860961Z [ finished ] [ target/docker-sonic-telemetry.gz ]
Line 3997: 2021-11-11T22:49:35.0196252Z [ building ] [ target/docker-sonic-telemetry.gz ]
Line 4002: 2021-11-11T23:14:00.4127728Z [ finished ] [ target/docker-sonic-telemetry.gz ]
How I did it
Place the python wheels in another folder relative to the build distribution.
Co-authored-by: Ubuntu <xumia@xumia-vm1.jqzc3g5pdlluxln0vevsg3s20h.xx.internal.cloudapp.net>
- Why I did it
Add new Spectrum-4 system support SN5600 on top of Nvidia ASIC simulator.
- How I did it
Add all relevant system and simulator SKU.
Updated syseeprom.hex and related directories to reflect Nvidia SN5600 brand name.
- How to verify it
Tested init flow, basic show commands, up interfaces, traffic test.
Signed-off-by: Raphael Tryster <raphaelt@nvidia.com>
- Why I did it
To include latest fixes.
SAI
1. Reclaim buffers for port which is admin down
2. Support for Spectrum-4 os Nvidia ASIC simulation
3. Support for SN2201
4. Fix host interface table entry, one channel per trap (fix sflow double registration)
5. 2 new queue counters - ecn marked packets + shared current occupancy
6. Fix storm policer unknown unicast
7. Add key/value for accuflow counters
8. Add MAC move
9. Add mirror congestion mode attribute
SDK
1. Under various circumstances, Ethernet ports falsely showed that InfiniBand cables were connected.
2. In SN4600C, at times, the link up time in both DAC and optics cables may, in the worst case, take up to 15 seconds.
3. Using SN4600C with copper or optics loopback cables in NRZ speeds, link may raise in long link up times
4. When ECMP has high amount of next-hops based on VLAN interfaces, in some rare cases, packets will get a wrong VLAN tag and will be dropped.
5. When connecting Spectrum devices with optical transceivers that support RXLOS, remote side port down might cause the switch firmware to get stuck and cause unexpected switch behavior.
6. Aggregation event is missing for WJH L2 drop reason 'Unicast egress port list is empty'.
7. Tying the SCL and SDA of the optical modules to 3.3V causes errors.
8. On SN4600, there was a delay of more than 10 seconds from the time a data packet is sent from CPU until it is transmitted through one of the switch ports.
9. While using SN4600C system with Finisar FTLC1157RGPL 100GbE CWDM4 modules, intermittent link flaps across multiple ports may be observed.
10. In Spectrum-2 and Spectrum-3 systems, link did not work in auto-negotiation when connected to Marvell PHY. KR mechanism has been enhanced to integrate with Marvell PHY.
11. The tunnel counter counts the drop packets now for Spectrum-2 and Spectrum-3 and consistent with Spectrum behavior and count the ECN dropped packets as well.
12. When connecting SN3800 to Cisco-9000, fast-linkup flow will fail and will rise in the normal flow.
13. Race condition in WJH library: when multiple threads load the LAG shared memory concurrently, the program may crash.
14. Add WJH L2 drop reason 'Unicast egress port list is empty' as a new drop reason.
15. Fixed a memory leak in sx_api_port_sflow_statistics_get API.
16. During initialization flow, the command interface that is used by the minimal driver and SDK caused the collision in the firmware since the same buffer is used in the firmware for the two interfaces.
17. Fix route issue on Kernel 5.10
- How I did it
Updated SDK/SAI submodule and relevant makefiles with the required versions.
- How to verify it
Build an image and run tests from "sonic-mgmt".
Signed-off-by: Volodymyr Samotiy <volodymyrs@nvidia.com>
Update Nokia platform sonic-pmon submoduel to the latest with the following commits
c41c823 Fix transceiver module dynamic insertion/removal operations
21a1df6 Fixed pcied process FATAl issue
7fc1fd4 Fix midplane status nokia_cmd
a14ee1c Override get_module() api in chassis
8a457fc SON-326: Watchdog logger changes and file scrubbing
7250eb1 SON-410: Fix missing eeprom access routine
7a70c42 Allow only reboot of self card for OC API test
6ab5d96 Fixed the flake8 compliant issues
807de95 APIs to set thermal threshold to return false
9b38265 SON-382: platform-dump with common techsupport
3f83a67 Add model, base_mac, system_eeprom and serial number support in moduel.py
848d311 SFP: Add get_error_description and fix return status for set_lpmode
1fcb5de PSU check presence of psu instance for APIs
7c68da3 Fixed the eagle and hornet card description
0c01d07 Module support for reboot API
Why I did it
Nvidia platform API does not support set LED to orange
How I did it
Allow user to set LED to orange
How to verify it
Added unit test
Manual test
- Use SfpOptoeBase by default to leverage new `sonic_xcvr` refactor
- Add support for `Woodleaf` product
- Move `libsfp-eeprom.so` to a different `.deb` package
- Add new logrotate configuration for arista logs
- Improve logging mechanism for the drivers (IO loglevel, fix syslog duplicates)
- Initialize chassis cards in parallel
- Refactor of `get_change_event` to fix interrupts treated as presence change
Why I did it
To implement fan control using thermalctld in DellEMC S6000 platform
Requires: Azure/sonic-linux-kernel#241
How I did it
Add thermal policies in 'thermal_policy.json'
Implemented thermal_manager.py and the necessary modules to perform fan control via thermalctld
Removed fancontrol.sh
How to verify it
Verified that the fan speeds are set based on the fan and temperature status.
Logs: S6000_fan_control_test_logs.txt
Why I did it
Adding SSD as part of platform components list.
Introducing platform_fw_au_reboot_handle to use auto-update functionality in fwutil
How I did it
Changes to be committed:
(use "git restore --staged <file>..." to unstage)
modified: platform/broadcom/sonic-platform-modules-dell/debian/platform-modules-s6100.install
new file: platform/broadcom/sonic-platform-modules-dell/s6100/scripts/platform_fw_au_reboot_handle
modified: platform/broadcom/sonic-platform-modules-dell/s6100/sonic_platform/chassis.py
modified: platform/broadcom/sonic-platform-modules-dell/s6100/sonic_platform/component.py
How to verify it
By running fwutil command.
Warning: fwupdate_fwimage_dir: /var/platform/fwpackage/.
Chassis Module Component Firmware Version (Current/Available) Status
--------- -------- ----------- ------------------------------- ----------------------------- ------------------
S6100-ON BIOS S6100-BIOS-3.25.0.2-9-noRP2.bin 3.25.0.2-8 / 3.25.0.2-9 update is required
FPGA smf_firmware_upgrade.tar 2.4 / 2.4 up-to-date
CPLD cpld_firmware_upgrade.tar 4 / 4 up-to-date
SSD ssd_firmware_upgrade.tar S16425cG / S16425cG up-to-date
root@sonic:~#
Why I did it
To support iTCO watchdog using watchdog APIs.
How I did it
Implemented a new watchdog class WatchdogTCO for interfacing with iTCO watchdog.
Updated reboot cause determination logic.
How to verify it
Verified that the watchdog APIs' return values are as expected.
Logs: UT_logs.txt
- Why I did it
Add support for SN2201 platform
- How I did it
Add required content for SN2201 platform
Note: still missing kernel driver support for this system. Once all is upstream will be updated as well.
- How to verify it
Install and basic sanity tests including traffic.
Signed-off-by: liora liora@nvidia.com
Depends on #9358
Why I did it
Adjust LED logical according to hw-mgmt change.
How I did it
Add a trigger to set LED to blink.
How to verify it
Manual test
For broadcom sai, we only need to upgrade the version, not necessary the token part in the url.
Co-authored-by: Ubuntu <xumia@xumia-vm1.jqzc3g5pdlluxln0vevsg3s20h.xx.internal.cloudapp.net>
#### Why I did it
Updated hw-mgmt pointer to updated branch and to include new bugfixes. The hw-mgmt submodule was previously pointing to an orphaned commit which could not be fetched from github, this has now been resolved.
#### How I did it
Updated submodule pointer.
#### How to verify it
Clone down repository and update all submodules.
Signed-off-by: Stephen Sun stephens@nvidia.com
Why I did it
Support zero buffer profiles
Add buffer profiles and pool definition for zero buffer profiles
Support applying zero profiles on INACTIVE PORTS
Enable dynamic buffer manager to load zero pools and profiles from a JSON file
Dependency: It depends on Azure/sonic-swss#1910 and submodule advancing PR once the former merged.
How I did it
Add buffer profiles and pool definition for zero buffer profiles
If the buffer model is static:
Apply normal buffer profiles to admin-up ports
Apply zero buffer profiles to admin-down ports
If the buffer model is dynamic:
Apply normal buffer profiles to all ports
buffer manager will take care when a port is shut down
Update buffers_config.j2 to support INACTIVE PORTS by extending the existing macros to generate the various buffer objects, including PGs, queues, ingress/egress profile lists
Originally, all the macros to generate the above buffer objects took active ports only as an argument
Now that buffer items need to be generated on inactive ports as well, an extra argument representing the inactive ports need to be added
To be backward compatible, a new series of macros are introduced to take both active and inactive ports as arguments
The original version (with active ports only) will be checked first. If it is not defined, then the extended version will be called
Only vendors who support zero profiles need to change their buffer templates
Enable buffer manager to load zero pools and profiles from a JSON file:
The JSON file is provided on a per-platform basis
It is copied from platform/<vendor> folder to /usr/share/sonic/temlates folder in compiling time and rendered when the swss container is being created.
To make code clean and reduce redundant code, extract common macros from buffer_defaults_t{0,1}.j2 of all SKUs to two common files:
One in Mellanox-SN2700-D48C8 for single ingress pool mode
The other in ACS-MSN2700 for double ingress pool mode
Those files of all other SKUs will be symbol link to the above files
Update sonic-cfggen test accordingly:
Adjust example output file of JSON template for unit test
Add unit test in for Mellanox's new buffer templates.
How to verify it
Regression test.
Unit test in sonic-cfggen
Run regression test and manually test.
* Add macsec-xpn-support iproute2 in syncd
Signed-off-by: Ze Gan <ganze718@gmail.com>
* Polish code
Signed-off-by: Ze Gan <ganze718@gmail.com>
* Remove useless files
Signed-off-by: Ze Gan <ganze718@gmail.com>
* Add self-compiled iproute2 to docker sonic vs
Signed-off-by: Ze Gan <ganze718@gmail.com>
* Enhance apt install for iproute2 dependencies
Signed-off-by: Ze Gan <ganze718@gmail.com>
Sonic master has moved to use SAI v1.9.1. For consistency, use the latest credo sai v0.7.2 package, built with
SAI header v1.9.1 too.
How I did it
Update credo sai url for v0.7.2
Add a debug tool crshell