Why I did it
- Convert hw-dump into generate-dump plugins
- Enable DRAM scrubber on some products
- Fix xcvr driver active low register bit logic
- Improve cooling algorithm (now considers xcvrs and modules)
- Add linecard graceful shutdown (disabled by default)
The scrubber was enabled for the following products:
- DCS-7050QX-32S
- DCS-7050CX3-32S
- DCS-7060CX-32S
The following SONiC CLI command was broken:
admin@sonic:~$ show platform psustatus
PSU    Model            Serial              HW Rev    Voltage (V)    Current (A)    Power (W)    Status    LED
-----  ---------------  ------------------  --------  -------------  -------------  -----------  --------  -----
PSU 1  PFE600-12-054NA  420000956420600006  206       N/A            N/A            82.00        OK        green
PSU 2  PFE600-12-054NA  420000956420600248  206       N/A            N/A            60.00        NOT OK    green
Why I did it
Add support for the Intel Tofino-based Netberg Aurora 750 platform
ASIC: Intel Tofino BFN-T10-064Q
Ports: 64x 100G
How I did it
Added the specification to the device/netberg directory.
Added platform/barefoot/sonic-platform-modules-netberg, which contains kernel modules, scripts and sonic_platform packages.
Modified platform/barefoot/platform-modules-netberg.mk to include the Aurora 750 related ID.
Signed-off-by: Andrew Sapronov <andrew.sapronov@gmail.com>
Ignore intermittent IO errors during get_change_event in the Platform API
Fix tunings for some ports on CatalinaDD
Fix kernel module build for 6.1 kernel in preparation of bookworm upgrade
* [202012][platform/barefoot] (#8543)
Why I did it
pcied was still being run with Python 2.
How I did it
Dropped Python 2 support and added Python 3 support for pcied in docker-pmon.supervisord.conf.j2.
How to verify it
docker exec pmon supervisorctl status
* [Netberg][nba710] Added initial support for Aurora 710
Signed-off-by: Andrew Sapronov <andrew.sapronov@gmail.com>
Co-authored-by: Kostiantyn Yarovyi <kostiantynx.yarovyi@intel.com>
Fix lpmode on 7060DX5-32
Fix psu led issue on 7060DX5-64
Use sonic_xcvr lpmode if platform does not support hw lpmode
Add chassis cooling algorithm
Change cooling algorithm default interval to 10s
Force filesystem sync on linecard reboot
- Fix watchdog reboot cause for wolverine linecard
- Fix PSU fan speed of 0% by adding max RPM to most psu descriptions
- Add product DCS-7060DX5-64
- Add product DCS-7060DX5-32
Normally there is no need to measure i2c calls.
Also switched to timespec64_sub() to ensure the time delta is normalized.
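To illustrate what a normalized time delta means, here is a small Python sketch of the arithmetic. It is not the kernel code: `timespec_sub` and the tuple layout are illustrative names, and the only behaviour assumed is that a normalized delta keeps its nanosecond part in the range [0, 1e9).

```python
NSEC_PER_SEC = 1_000_000_000


def timespec_sub(lhs, rhs):
    """Subtract two (seconds, nanoseconds) pairs and normalize the result.

    Illustrative helper, not the kernel's timespec64_sub(); it only mirrors
    the idea that the nanosecond field must end up in [0, NSEC_PER_SEC).
    """
    total_ns = (lhs[0] - rhs[0]) * NSEC_PER_SEC + (lhs[1] - rhs[1])
    # divmod() floors toward negative infinity, so nsec is never negative.
    return divmod(total_ns, NSEC_PER_SEC)


# A field-by-field subtraction would give (2, -800), which is not normalized.
delta = timespec_sub((5, 100), (3, 900))
```

Here `delta` carries one whole second and a positive nanosecond remainder, so later comparisons or conversions do not have to deal with a negative nanosecond field.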
Co-authored-by: Kostiantyn Yarovyi <kostiantynx.yarovyi@intel.com>
add SEU reporting on chassis
fix fallback logic for Clearlake eeprom identification
fix fan speed reporting for a specific model
move pcie timeout configuration for Upperlake in platform code (deprecates hwsku-init)
Why I did it
Sometimes SIGTERM processing by psud takes more than the default 10 seconds (see stopwaitsecs in http://supervisord.org/configuration.html).
Because of this, the following two test cases may fail:
test_pmon_psud_stop_and_start_status
test_pmon_psud_term_and_start_status
How I did it
Updated the PSU plugin to handle the SIGTERM signal so that psud ends its last cycle in time.
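One way to let the last cycle end early is to record the signal in a flag and check it between slow reads. The sketch below is an assumption about the general approach, not the plugin's actual code: `PsuPoller`, `read_status` and `install_handler` are hypothetical names.

```python
import signal


class PsuPoller:
    """Hypothetical poller that stops its cycle once SIGTERM was seen."""

    def __init__(self, psu_count):
        self.psu_count = psu_count
        self.stop_requested = False

    def install_handler(self):
        # Must be called from the main thread of the daemon.
        signal.signal(signal.SIGTERM, self._on_sigterm)

    def _on_sigterm(self, signum, frame):
        # Only set a flag here; the polling loop decides when to stop.
        self.stop_requested = True

    def poll_once(self, read_status):
        """Read every PSU, but give up early if a stop was requested."""
        results = {}
        for index in range(1, self.psu_count + 1):
            if self.stop_requested:
                break
            results[index] = read_status(index)
        return results
```

With this shape, a slow read that is already in progress still completes, but no further PSU is queried after the signal, which bounds the shutdown time to roughly one read.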
How to verify it
Run SONiC CTs:
test_pmon_psud_stop_and_start_status
test_pmon_psud_term_and_start_status
Why I did it
Enable testing of the SAI API on bfn with a lightweight container (saiserver).
How I did it
Enabled the saiserver container on the barefoot platform.
Added docker-saiserver-bfn.mk for building the saiserver container.
Added the files needed by the saiserver container in platform/barefoot/docker-saiserver-bfn.
How to verify it
Tested on Intel platform ec9516
Signed-off-by: richardyu-ms <richard.yu@microsoft.com>
Why I did it
The platform interface doesn't provide all sensors, and using it isn't effective.
How I did it
Request sensor data over HTTP from the BMC server and parse the result.
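A minimal sketch of this approach, assuming the BMC answers with a JSON object that maps sensor names to readings. The URL, the payload layout, and the names `fetch_sensors` and `parse_sensors` are all hypothetical; the real BMC interface may look quite different.

```python
import json
import urllib.request


def parse_sensors(payload):
    """Turn a JSON payload (bytes or str) into {sensor_name: float_value}.

    Entries whose value cannot be read as a number are skipped.
    """
    readings = {}
    for name, raw in json.loads(payload).items():
        try:
            readings[name] = float(raw)
        except (TypeError, ValueError):
            continue
    return readings


def fetch_sensors(url, timeout=5.0):
    """Request sensor data over HTTP and parse it (hypothetical endpoint)."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return parse_sensors(response.read())
```

Keeping the parsing separate from the HTTP call makes it possible to test the parser without a BMC present.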
How to verify it
The related daemon in pmon populates the Redis DB. Run this command to view the contents
Why I did it
Initial implementation of Watchdog platform plugin for BMC-based boards
How I did it
How to verify it
Run platform_tests/test_reload_config.py
Why I did it
SIGTERM takes more than 10 seconds to be processed, so psud is stopped by SIGKILL. This causes unexpected behavior because the database is not cleared.
How I did it
Decorated the get_presence API so that it is cancelled on SIGTERM, which avoids long processing.
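The decorator idea can be sketched as below. This is an illustration under assumptions, not the code from the change: `cancel_on_sigterm`, `PresenceCancelled` and the `probe` argument are hypothetical names, and the stand-in `get_presence` simply calls whatever probe it is given.

```python
import functools
import signal
import threading


class PresenceCancelled(Exception):
    """Raised inside the wrapped call when SIGTERM arrives."""


def cancel_on_sigterm(default=False):
    """Decorator sketch: abort the wrapped call on SIGTERM, return `default`."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            def on_sigterm(signum, frame):
                raise PresenceCancelled()

            # Signal handlers can only be installed from the main thread.
            in_main = threading.current_thread() is threading.main_thread()
            previous = signal.signal(signal.SIGTERM, on_sigterm) if in_main else None
            try:
                return func(*args, **kwargs)
            except PresenceCancelled:
                return default
            finally:
                # Put back whatever handler was active before the call.
                if previous is not None:
                    signal.signal(signal.SIGTERM, previous)
        return wrapper
    return decorator


@cancel_on_sigterm(default=False)
def get_presence(probe):
    """Hypothetical stand-in for a slow presence check."""
    return probe()
```

Note that in this sketch the signal is consumed by the temporary handler, so the caller still has to notice the cancelled result and shut down on its own.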
How to verify it
test_pmon_psud_stop_and_start_status
test_pmon_psud_term_and_start_status
add partial reboot cause support for linecards
add watchdog support for linecards
add power draw information for chassis
properly implement Chassis.get_port_or_cage_type
fix pcieutil on chassis with powered off cards
fix watchdog-control.service crash
misc fixes and cleanups
Why I did it
syseepromd in pmon crashes because of a missing import in its Python script and never reaches the running state.
How I did it
Fixed the missing import so that the Python script no longer fails.
How to verify it
Boot the system and wait until syseepromd reaches the running state.
Which release branch to backport (provide reason below if selected)
201811
201911
202006
202012
202106
202111
202205
Why I did it
When the device contains more than one fan drawer, the fan names were incorrect.
How I did it
Passed the maximum fan value to the FAN object.
Fixed get_name() FAN API
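The naming fix can be pictured as follows. This is a sketch under assumptions: the `Fan` class, its constructor arguments and the `fan-N` name format are hypothetical, and the real platform code may number fans differently. The point is only that each fan needs the per-drawer maximum to compute a name that is unique across drawers.

```python
class Fan:
    """Hypothetical fan object that knows how many fans each drawer holds."""

    def __init__(self, drawer_index, fan_index, max_fans_per_drawer):
        # Both indexes are 1-based in this sketch.
        self.drawer_index = drawer_index
        self.fan_index = fan_index
        self.max_fans_per_drawer = max_fans_per_drawer

    def get_name(self):
        # Offset the local index by the fans in all preceding drawers.
        number = (self.drawer_index - 1) * self.max_fans_per_drawer + self.fan_index
        return "fan-{}".format(number)


# Two drawers with two fans each.
names = [Fan(d, f, 2).get_name() for d in (1, 2) for f in (1, 2)]
```

Without the per-drawer maximum, the fans in the second drawer would repeat the names of the first.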
How to verify it
show platform fan
Signed-off-by: maipbui <maibui@microsoft.com>
#### Why I did it
Running commands through `os` (for example `os.system`) is not secure against maliciously constructed input and is dangerous when used to evaluate dynamic content.
#### How I did it
Replaced the `os` calls with `subprocess`.
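A minimal sketch of the pattern, assuming the commands can be expressed as argument lists. `run_command` is a hypothetical helper, not a function from the changed files; `subprocess.run` is the standard library call.

```python
import subprocess
import sys


def run_command(argv):
    """Run a command given as a list of arguments and return its stdout.

    No shell is involved, so characters such as ';' or '$' in an argument
    are passed to the program literally instead of being interpreted.
    """
    result = subprocess.run(argv, capture_output=True, text=True, check=True)
    return result.stdout


# The second argument looks like shell syntax but is treated as plain data.
echoed = run_command(
    [sys.executable, "-c", "import sys; print(sys.argv[1])", "a; echo injected"]
)
```

Because `check=True` is set, a non-zero exit status raises `subprocess.CalledProcessError` instead of being silently ignored.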
fix linecard provisioning issue (500 error)
fix some value types for get_system_eeprom_info API
refactor code to leverage pci topology (enabling dynamic Pcie plugin)
refactor asic declaration logic to new style
misc fixes
Signed-off-by: Mariusz Stachura <mariusz.stachura@intel.com>
What I did
Added dynamic headroom calculation support for Barefoot platforms.
Why I did it
To enable dynamic mode on Barefoot platforms.
How I verified it
The community tests were adjusted and pass.
* Move qsfp eeprom reading to new cached api
* Read multiple pages recursively
* Work around flat memory on CMIS
* remove workaround with memory model
* Remove unused imports
* draft upgrade to deb11 of syncd and syncd-rpc
* upgrade to python3
* revert workaround with libsaithrift
* Provide urls for sai and platform debs
* Downgrade python3 to python2
* Remove saithrift-patches
* Upgrade modules
* remove unnecessary lib
* remove more unnecessary modules
* Update sdk reference
* remove unnecessary packages from syncd-rpc
- Add Watchdog remaining time API
- Add support for non-swappable fans via a FixedDrawer
- Add ASIC voltage tweaks for PikeZ product
- Add better pylint support
- Fix reboot-cause decision issue for future products
- Fix thermal issue for RJ45 ports
- Deprecate Catalina prototype support
Fix an issue with front panel port led introduced in previous PR
Implement status led for linecards
Implement full power cycle for linecards
Improve reboot cause reporting for Ucd devices
Add fan support for PikeZ
Miscellaneous fixes and improvements
Why I did it
Add support for the Intel Tofino-based Netberg Aurora 610 platform
ASIC: Intel Tofino BFN-T10-032D-020
Ports: 48x 25G + 8x 100G
How I did it
Added the specification to the device/netberg directory.
Added platform/barefoot/sonic-platform-modules-netberg, which contains kernel modules, scripts and sonic_platform packages.
Modified platform/barefoot/one-image.mk and platform/barefoot/rule.mk to include the Aurora 610 related ID and files.
How to verify it
Build SONiC
Install the image on the device and verify the related components are installed and shown correctly.
This update has the following changes:
Refactor pci topology logic for chassis (fixes some chassis commands and chassisd on linecard)
Introduce new cooling algorithm
Fix linecard poweroff logic when supervisor is going down
Fix linecard status led leading to system-health crashing
Misc fixes
Currently, the build dockers are created as user dockers (docker-base-stretch-<user>, etc.) that are
specific to each user. The sonic dockers (docker-database, docker-swss, etc.), however, are
created with a fixed docker name that is common to all users.
docker-database:latest
docker-swss:latest
When multiple builds are triggered on the same build server, this causes parallel build issues because
all the build jobs try to create the same docker with the latest tag.
This happens only when the sonic dockers are built using the native host dockerd for sonic docker image creation.
This patch creates all sonic dockers as user sonic dockers and then, while
saving and loading the user sonic dockers, renames them to the
correct sonic docker names with the tag latest.
docker-database:latest <== SAVE/LOAD ==> docker-database-<user>:tag
The user sonic docker names are derived from the DOCKER_USERNAME and DOCKER_USERTAG make env
variables. Using a Jinja template, the FROM docker name is replaced with the correct user sonic docker name for
loading and saving the docker image.
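The name mapping described above can be sketched in a few lines of Python. This is only an illustration of the naming scheme shown in the SAVE/LOAD line; the function names are hypothetical, and the actual build does this through make variables and Jinja templates rather than Python.

```python
def user_image_name(image, username, usertag):
    """Build the per-user docker name, e.g. docker-database-<user>:<tag>."""
    return "{}-{}:{}".format(image, username, usertag)


def common_image_name(user_image, username):
    """Map a per-user docker name back to the shared <image>:latest name."""
    repo, _, _tag = user_image.partition(":")
    suffix = "-" + username
    if repo.endswith(suffix):
        repo = repo[:-len(suffix)]
    return repo + ":latest"


# Example values standing in for DOCKER_USERNAME and DOCKER_USERTAG.
per_user = user_image_name("docker-database", "alice", "build42")
shared = common_image_name(per_user, "alice")
```

Since every build job works on its own per-user name, two jobs on the same server no longer compete for the same `latest` tag while building.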