Signed-off-by: maipbui <maibui@microsoft.com>
Dependency: [PR (#12065)](https://github.com/sonic-net/sonic-buildimage/pull/12065) needs to merge first.
#### Why I did it
`subprocess.Popen()` and `subprocess.check_output()` is used with `shell=True`, which is very dangerous for shell injection.
#### How I did it
Disable `shell=True`, enable `shell=False`
#### How to verify it
Tested on DUT, compare and verify the output between the original behavior and the new changes' behavior.
[testresults.zip](https://github.com/sonic-net/sonic-buildimage/files/9550867/testresults.zip)
- Why I did it
To update MFT package to the latest version.
- How I did it
Updated MFT_VERSION & MFT_REVISION in platform/mellanox/mft.mk.
- How to verify it
Build an image and deploy to the switch
Check MFT version by dpkg -l | grep mft
Verify that all the SONiC services up and running
Run regression testing using tests from sonic-mgmt
Signed-off-by: Volodymyr Samotiy <volodymyrs@nvidia.com>
- Why I did it
To include latest fixes and new functionality
SAI fixes and new features
fix#3205239, incorrect object type returned for SG child list
Fix VRF-VNI map entries remove issue
ECC health event and logging
[Port Buffers] restore default queue and pg configuration when all user pools are deleted
Fix EVPN type3 error on removal of uc/bc flood group
Fix EVPN type2 MAC move from local to remote results in SAI failure
Fix Disable learning on VXLAN tunnel
Fix error on VXLAN v6 tunnel removal
Fix port cannot apply schedule group when it is a lag member
Fix BFD add more detailed message on BFD packet not related to any existing session
gcc10 compilation fixes
Disable learning on VXLAN tunnel
Support BFD remote-disc exchange in negotiation stage
Tunnel Loopback packet action attribute implementation (for Dual TOR)
Add KVD resources MIN/MAX functionality (pending CRM issue with MIN only)
Support for CRC2 hash algorithm
Bulk counter support for PGs, queues
Support mirror sample rate attribute (SPC2+)
[Functional] [QoS] | Unable to remove SCHEDULE profile table even if there is no object referencing it
Next hop group optimized bulk API
Reduce verbosity of shared database already exists print
Span mirror policer (SPC2+), optimize pipeline for acl mirror action with policer on SPC2+
use same size descriptor pool for rx/tx
fix bfd - notify Sonic for admin-down event
2201 - empty list for supported fec for RJ45 ports
Fix don't disable used tunnel underlay interfaces
SDK fixes
100GbE FCI DAC (10137628-4050LF/HPE PN: 845408-B21) was recognized by mistake as supporting "cable burning' which caused the switch firmware to read page 0x9f (which unsupported in the cable) and to report this cable as having "bad eeprom".
Added remote peer UDP port information in BFD packet event.
After editing an ECMP, the resilient ECMP next-hop counter may not count correctly.
Fixed potential memory leaks in some APIs related to LPM
If TTL_CMD_COPY is used in Encap direction for a packet with no TTL, then the value passed in the ttl data structure will be used if non-zero (default 255 if zero).
In SN2201: When configuring Force mode, user should configure Speed and FEC on both sides
In Flex Tunnel encapsulation flow, if the encapsulation is with an IPv6 header, the flow label field may not be updated as expected.
In some cases, when changing speed to 400GbE over 8 lanes, the first few packets would be dropped.
In some traffic patterns involving small packets, the PortRcvErrors counter may mistakenly count events of local physical errors due to an internal flow in the hardware that involves link packets.
On Spectrum systems, sometimes during link failure, not all previous firmware indications cleared properly, potentially affecting the next link up attempt.
On the NVIDIA Spectrum-2 switch, when receiving a packet with Symbol Errors on ports that are configured to cut-thought mode, a pipeline might get stuck.
PCI calibration changes from a static to a dynamic mechanism.
SDK debug dump shows "Unknown" Counter in RFC3635 Counter Group.
SDK debug dump shows "Unknown" Counter in the PPCNT Traffic Class Counter Group.
SDK Dump missing column headers in some GC tables may result in difficulty understanding the dump.
SLL configuration is missing in SDK dump.
Spectrum-2 systems, do no support 1GbE on supported 40GbE modules.
When binding a UDP port which is already in use for BFD TX session, the error message appears incorrectly.
When Flex Tunnel was used, Flex Modifier sometimes experienced a brief mis-configuration during ISSU.
When many ports are active (e.g. 70 ports up), and the configuration of shared buffer is applied on the fly, occasionally, the firmware might get stuck.
When running 1GbE speeds on SN4600 system, the port remained active while peer side was closed.
When toggling many ports of the Spectrum devices while raising 10GbE link up and link maintenance is enabled, the switch may get stuck and may need to be rebooted.
When trying to reconfigure the Flex Parser header and Flex transition parameters after ISSU, the switch will returned an error even if the configuration was identical to that done before performing the ISSU.
While toggling the cable, and the low power mode is set to ON, an unexpected PMPE event error is received.
- How I did it
Updated SDK/SAI submodule and relevant makefiles with the required versions.
- How to verify it
Build an image and run tests from "sonic-mgmt".
Signed-off-by: Volodymyr Samotiy <volodymyrs@nvidia.com>
- Why I did it
get_rx_los and get_tx_fault is not supported via the exisitng interface used, need provide dummy implementation for them.
NOTE: in later releases we will get them back via different interface.
- How I did it
Return False * lane_num for get_rx_los and get_tx_fault
- How to verify it
Added unit test
* Move qsfp eeprom reading to new cached api
* provide reading multiple pages in recursive manner
* workaround with flat memory on cmis
* remove workaround with memory model
* Remove unused imports
- Why I did it
Fix a typo in chassis platform API which causes the following error
>>> import sonic_platform as P
>>> c = P.platform.Platform().get_chassis()
>>> sl = c.get_all_sfps()
>>> sl[0].get_lpmode()
Sep 28 07:48:33 INFO LOG: Initializing SX log with STDOUT as output file.
False
>>> del c
Exception ignored in: <function Chassis.__del__ at 0x7f1d166ef8b0>
Traceback (most recent call last):
File "/usr/local/lib/python3.9/dist-packages/sonic_platform/chassis.py", line 126, in __del__
self.sfp_module.deinitialize_sdk_handle(sfp_module.SFP.shared_sdk_handle)
NameError: name 'sfp_module' is not defined
- How I did it
Use self while using the SDK handle
- How to verify it
Manual test
Signed-off-by: Stephen Sun <stephens@nvidia.com>
This reverts commit 9750cb4.
There is a PR to handle 202205 branch revert: #12184
- Why I did it
The PR to be reverted introduced many notice logs every 1 minute if SFP is not plugged:
Cannot get module EEPROM information: Input/output error
Before the "bad" PR, the message format is like this:
INFO pmon#supervisord: xcvrd Cannot get module EEPROM information: Input/output error
It was truncated by rsyslog because every message is the same. However, the "bad" PR introduces SFP index to the message:
NOTICE pmon#xcvrd: Failed to get EEPROM data for sfp 39: Cannot get module EEPROM information: Input/output error
Rsyslog no longer truncate such log and many such messages are flooded to syslog.
- How I did it
Revert the PR
- How to verify it
Manual test
- Why I did it
Update SDK/FW version - 4.5.2320/2010_2320 in order to have the following fixes:
• Spectrum-3 | PCI calibration changes from a static to a dynamic mechanism.
• [VxLAN] TTL was set to 0 for non IP traffic (such as ARP)
- How I did it
Update pointer for the SDK/FW
- How to verify it
Run regression tests
- Why I did it
ethtool print error logs when EEPROM of a SFP is not available. It prints error like this:
INFO pmon#/supervisord: xcvrd Cannot get module EEPROM information: Input/output error
INFO pmon#/supervisord: xcvrd Cannot get Module EEPROM data: Invalid argument
However, this log does not contain the relevant SFP index which is hard for developer/qa to find the exactly SFP.
- How I did it
Redirect ethtool stderr to subprocess and log it better
- How to verify it
Manual test
* Adding support for get/set low pwer mode for QSFPs in PDDF common APIs
* Adding support for get/set low pwer mode for QSFPs in PDDF common APIs - Review comments
Why I did it
To gracefully unmount filesystems and stop containers while performing a cold reboot.
Unmount ONIE-BOOT if mounted during fast/soft/warm reboot
How I did it
Override systemd-reboot service to perform a cold reboot.
Unmount ONIE-BOOT if mounted using fast/soft/warm-reboot plugins.
How to verify it
On reboot, verify that the container stop and filesystem unmount services have completed execution before the platform reboot.
Why I did it
The directory /var/warmboot as top directory for warmboot feature is also needed in docker gbsyncd. Some vendor SAI might save data under it. Without it, the SAI init/creation API failure has happened on PikeZ platform.
How I did it
Mount host directory /host/warmboot as /var/warmboot in docker gbsyncd, which is same as what it has done on docker syncd.
Why I did it
S5296F - Platform API 2.0 changes
How I did it
Implemented the functional API's needed for Platform API 2.0
How to verify it
Used the API 2.0 test suite to validate the test cases.
- Why I did it
Update SDK/FW version - 4.5.2318/2010_2318 to pick up new fixes:
1. Cr space timeout on Hold and Release GW - at warm boot
2. Spectrum Port in stuck PHY_UP after peer side rebooted
3. Memory leak in sx_api_router_ecmp_update_set
- How I did it
Update the make file with the new version number
Update submodule Switch-SDK-drivers pointer
- How to verify it
Run sonic regression
Signed-off-by: Kebo Liu <kebol@nvidia.com>
#### Why I did it
Update scripts in sonic-buildimage from py-swsssdk to swsscommon
#### How I did it
Replace swsssdk with swsscommon in centec devices.
#### How to verify it
Pass all E2E test case
#### Which release branch to backport (provide reason below if selected)
<!--
- Note we only backport fixes to a release branch, *not* features!
- Please also provide a reason for the backporting below.
- e.g.
- [x] 202006
-->
- [ ] 201811
- [ ] 201911
- [ ] 202006
- [ ] 202012
- [ ] 202106
- [ ] 202111
- [ ] 202205
#### Description for the changelog
Replace swsssdk with swsscommon in centec devices.
#### Link to config_db schema for YANG module changes
<!--
Provide a link to config_db schema for the table for which YANG model
is defined
Link should point to correct section on https://github.com/Azure/sonic-buildimage/blob/master/src/sonic-yang-models/doc/Configuration.md
-->
#### A picture of a cute animal (not mandatory but encouraged)
Why I did it
It solves a swss orchagent crash issue on PikeZ device, due to link-training setting of external PHY port.
How I did it
Catch up the fix for CS00012257483 in version 7.1.7.2.
* draft upgrade to deb11 of syncd and syncd-rpc
* upgrade to python3
* revert workaround with libsaithrift
* Provide urls for sai and platform debs
* Downgrade python3 to python2
* Remove saithrift-patches
* Upgrade modules
* remove unnecessary lib
* remove more unnecessary modules
* Update sdk reference
* remove unnecessary packages from syncd-rpc
# Why I did it
platform-modules-belgite's deb requests linux-image-5.10.0-8-2-amd64-unsigned, which does not match the runtime kernel version
# How I did it
update the belgite's deb configuration in deb's control
# How to verify it
check the firsttime boot log in belgite platform
Co-authored-by: nicwu-cel <nicwu@celestica.com>
* Update BRCM KNET module to support new psample definitions from sflow dropmon feature
* Update BRCM KNET module to support new psample definitions from sflow dropmon feature
* Advance saibcm-modules-dnx
- Add Watchdog remaining time API
- Add support for non-swappable fans via a FixedDrawer
- Add ASIC voltage tweaks for PikeZ product
- Add better pylint support
- Fix reboot-cause decision issue for future products
- Fix thermal issue for RJ45 ports
- Deprecate Catalina prototype support
- Why I did it
Update HW-MGMT to V.7.0020.3006
1. Support new system SN2201
2. Add COMEX BRDWL respin support
- How I did it
Update the version number of the makefile
Advance the hw-mgmt submodule pointer
- How to verify it
Run full regression on Nvidia platforms
Signed-off-by: Kebo Liu <kebol@nvidia.com>
- Why I did it
Add PSU input voltage and input current to mlnx platform api.
- How I did it
Implement 2 function of getting the psu voltage and psu current input:
Get the values from "power/psu{}_curr_in" , "power/psu{}_volt_in"
- How to verify it
Manual test.
Run sonic-mgmt regression
Signed-off-by: orfar1994 <orfar1994@gmail.com>
- Why I did it
Add more log while doing sysfs reading to increase the debug capability
- How I did it
Log the relevant file path and error number while sysfs reading return None
- How to verify it
Manual test
To reduce rc.local script execution time. Porting changes from [DellEMC] S6100 Platform Service optimization #10989
Changes:
Moving platform-modules-s6100.service and s6100-lpc-monitor.service asynchronous to rc.local script.
this upgrade contains two changes:
1. Add the following MacSec Initialization Condition:
- When MacSec feature is not included MacSec block should not be brought out of reset irrespective of the value of the newly added config variable.
- When included its initialization is controlled by the newly added config variable.
2. DNX buf fix: increase _BRCM_SAI_MAX_ACL_TABLES to 128
Signed-off-by: zitingguo <zitingguo@microsoft.com>
Why I did it
Enable syncd container autorestart for Innovium platforms
How I did it
Add critical_process file and sypervisord.conf entry
How to verify it
Tested with autorestart/test_container_autorestart.py::test_containers_autorestart
PASSED autorestart/test_container_autorestart.py::test_containers_autorestart[sonic-xxx-dut-sonic-xxx-dut|syncd]
Signed-off-by: rck-innovium rck@innovium.com
* Ported Marvell armhf build on x86 for debian buster to use cross-compilation instead of qemu emulation
Current armhf Sonic build on amd64 host uses qemu emulation. Due to the
nature of the emulation it takes a very long time, about 22-24 hours to
complete the build. The change I did to reduce the building time by
porting Sonic armhf build on amd64 host for Marvell platform for debian
buster to use cross-compilation on arm64 host for armhf target. The
overall Sonic armhf building time using cross-compilation reduced to
about 6 hours.
Signed-off-by: marvell <marvell@cpss-build3.marvell.com>
* Fixed final Sonic image build with dockers inside
* Update Dockerfile.j2
Fixed qemu-user-static:x86_64-aarch64-5.0.0-2 .
* Update cross-build-arm-python-reqirements.sh
Added support for both armhf and arm64 cross-build platform using $PY_PLAT environment variable.
* Update Makefile
Added TARGET=<cross-target> for armhf/arm64 cross-compilation.
* Reviewer's @qiluo-msft requests done
Signed-off-by: marvell <marvell@cpss-build3.marvell.com>
* Added new radius/pam patch for arm64 support
* Update slave.mk
Added missing back tick.
* Added libgtest-dev: libgmock-dev: to the buster Dockerfile.j2. Fixed arm perl version to be generic
* Added missing armhf/arm64 entries in /etc/apt/sources.list
* fix libc-bin core dump issue from xumia:fix-libc-bin-install-issue commit
* Removed unnecessary 'apt-get update' from sonic-slave-buster/Dockerfile.j2
* Fixed saiarcot895 reviewer's requests
* Fixed README and replaced 'sed/awk' with patches
* Fixed ntp build to use openssl
* Unuse sonic-slave-buster/cross-build-arm-python-reqirements.sh script (put all prebuilt python packages cross-compilation/install inside Dockerfile.j2). Fixed src/snmpd/Makefile to use -j1 in all cases
* Clean armhf cross-compilation build fixes
* Ported cross-compilation armhf build to bullseye
* Additional change for bullseye
* Set CROSS_BUILD_ENVIRON default value n
* Removed python2 references
* Fixes after merge with the upstream
* Deleted unused sonic-slave-buster/cross-build-arm-python-reqirements.sh file
* Fixed 2 @saiarcot895 requests
* Fixed @saiarcot895 reviewer's requests
* Removed use of prebuilt python wheels
* Incorporated saiarcot895 CC/CXX and other simplification/generalization changes
Signed-off-by: marvell <marvell@cpss-build3.marvell.com>
* Fixed saiarcot895 reviewer's additional requests
* src/libyang/patch/debian-packaging-files.patch
* Removed --no-deps option when installing wheels. Removed unnecessary lazy_object_proxy arm python3 package instalation
Co-authored-by: marvell <marvell@cpss-build3.marvell.com>
Co-authored-by: marvell <marvell@cpss-build2.marvell.com>
Fix an issue with front panel port led introduced in previous PR
Implement status led for linecards
Implement full power cycle for linecards
Improve reboot cause reporting for Ucd devices
Add fan support for PikeZ
Miscellaneous fixes and improvements