- Why I did it
In order to prevent the sonic-mgmt/tests/platform_tests/sfp/test_sfputil.py test failing on the log analyzer step.
The mentioned test is performing the sfputil reset EthernetX for every interface on the SONiC switch, this action will flap the SFP device status (INSTERTED -> REMOVED -> INSTERTED).
The SONiC XCVRD daemon will catch this SFP device status change (because it is monitoring the presence status of the cable).
To judge the cable presence status, currently, we are still leveraging to read the first bytes of the EEPROM, and the EEPROM could be not ready at some moment and the SONiC XCVRD daemon will print the error log to Syslog:
ERR pmon#xcvrd: Error! Unable to read data for 'xx' port, page 'xx' offset 128, rc = 1, err msg: Sending access register
- How I did it
Change logging severity from ERR to WARNING
- How to verify it
Run the sonic-mgmt/tests/platform_tests/sfp/test_sfputil.py
OR much faster way to run the next script on the switch:
#!/bin/bash
START=0
END=248
for (( intf=$START; intf<=$END; intf+=8))
do
sfputil reset Ethernet"${intf}"
done
sfputil show presence
- Why I did it
Remove TODO comments which are no longer needed
- How I did it
Remove TODO comments which are no longer needed
- How to verify it
Only comment change
- Why I did it
Following code to judge whether a process is running inside a docker could get stuck on the simx platform
subprocess.Popen(["docker", "--version"],
stdout=subprocess.PIPE,
stderr=subprocess.STDOUT,
universal_newlines=True)
When it gets stuck, the config-chassisdb service can not be successfully started, thus the system can not be booted up.
root@sonic:/# service config-chassisdb status
config-chassisdb.service - Config chassis_db
Loaded: loaded (/lib/systemd/system/config-chassisdb.service; enabled; vendor preset: enabled)
Active: activating (start) since Thu 2022-12-15 09:23:02 UTC; 29min ago
Main PID: 571 (config-chassisd)
Tasks: 14 (limit: 9501)
Memory: 132.4M
CGroup: /system.slice/config-chassisdb.service
├─571 /bin/bash /usr/bin/config-chassisdb
├─575 /usr/bin/python3 /usr/local/bin/sonic-cfggen -H -v DEVICE_METADATA.localhost.platform
├─602 /bin/sh -c sudo decode-syseeprom -m
├─603 sudo decode-syseeprom -m
├─607 /usr/bin/python3 /usr/local/bin/decode-syseeprom -m
├─616 /bin/sh -c docker --version 2>/dev/null
└─617 docker --version
- How I did it
Use an alternative way to implement this function and issue can be avoided:
docker_env_file = '/.dockerenv'
return os.path.exists(docker_env_file) is False
- How to verify it
run regression on real hardware and simx platform.
- Why I did it
In case of warm/fast reboot, the hardware reboot cause will NOT be cleared because CPLD will not be touched in this flow. To not confuse the reboot cause determine logic, the leftover hardware reboot cause shall be skipped by the platform API, platform API will return the 'REBOOT_CAUSE_NON_HARDWARE' instead of the "hardware" reboot cause.
- How I did it
Check the proc cmdline to see whether the last reboot is a warm or fast reboot, if yes skip checking the leftover hardware reboot cause.
- How to verify it
a. Manual test:
- Perform a power loss
- Perform a warm/fast reboot
- Check the reboot cause should be "warm-reboot" or "fast-reboot" instead of "power loss"
b. Run reboot cause related regression test.
Signed-off-by: Kebo Liu <kebol@nvidia.com>
Backport of https://github.com/sonic-net/sonic-buildimage/pull/12490 into 202211
- Why I did it
Support syslog rate limit configuration feature
- How I did it
Remove unused rsyslog.conf from containers
Modify docker startup script to generate rsyslog.conf from template files
Add metadata/init data for syslog rate limit configuration
- How to verify it
Manual test
New sonic-mgmt regression cases
Why I did it
Provide a Fan driver framework that complies with s3ip sysfs specification
How I did it
1、 The framework module provides register and unregister interface and implementation.
2、 The framework will help you create the sysfs node
How to verify it
A demo driver base on this framework will display the sysfs node wich conform to the s3ip sysfs specification
Co-authored-by: tianshangfei <31125751+tianshangfei@users.noreply.github.com>
Why I did it
Add two platform that support s3IP framework
How I did it
Add two platforms supporting S3IP SYSFS (TCS8400, TCS9400)
How to verify it
Manual test
Co-authored-by: tianshangfei <31125751+tianshangfei@users.noreply.github.com>
Why I did it
why
In order to apply different config across different platform, and use the code with a unified format, reuse syncd init script to init saiserver.
How I did it
how
Reuse syncd init script
How to verify it
Test
Test in DUT s6000 and dx010 with sonic 202205
Signed-off-by: maipbui <maibui@microsoft.com>
#### Why I did it
The [xml.etree.ElementTree](https://docs.python.org/3/library/xml.etree.elementtree.html#module-xml.etree.ElementTree) module is not secure against maliciously constructed data.
`os` - not secure against maliciously constructed input and dangerous if used to evaluate dynamic content
`subprocess.getstatusoutput` is dangerous because include shell=True in the implementation
#### How I did it
Remove xml. Use [lxml](https://pypi.org/project/lxml/) XML parsers package that prevent potentially malicious operation.
Replace `os` by `subprocess`
Use command as an array instead of string
Use `getstatusoutput_noshell` in `sonic_py_common` lib
- Why I did it
Add support for compiling Spectrum-4 ASIC firmware to the SONiC image
Add support for Spectrum-4 ASIC firmware upgrade
- How I did it
Update Mellanox fw make files to include Spectrum-4 ASIC firmware binaries.
Update firmware upgrade scripts to be able to detect Spectrum-4 ASIC.
- How to verify it
Run regression tests
Signed-off-by: Kebo Liu <kebol@nvidia.com>
- Why I did it
Add SDK hash calculator Debian and update SDK makefile to compile it.
- How I did it
SDK hash calculator Debian will be used by ECMP calculator (PR #12482)
- How to verify it
Compile sonic-buildimage and verify SDK hash calculator Debian exist in target folder.
* Support power threshold
Signed-off-by: Stephen Sun <stephens@nvidia.com>
* get_psu_power_warning_threshold => get_psu_power_warning_suppress_threshold
Signed-off-by: Stephen Sun <stephens@nvidia.com>
* Fix comments
Signed-off-by: Stephen Sun <stephens@nvidia.com>
Signed-off-by: Stephen Sun <stephens@nvidia.com>
Make syncd rpc docker which supports sai-ptf v2
local bulild the target
NOSTRETCH=y NOJESSIE=y make configure PLATFORM=vs
NOSTRETCH=y NOJESSIE=y NOBULLSEYE=y SAITHRIFT_V2=y make target/docker-ptf-sai.gz
NOSTRETCH=y NOJESSIE=y make configure PLATFORM=vs
NOSTRETCH=y NOJESSIE=y NOBULLSEYE=y make target/docker-ptf.gz
NOSTRETCH=y NOJESSIE=y make configure PLATFORM=broadcom
NOSTRETCH=y NOJESSIE=y ENABLE_SYNCD_RPC=y SAITHRIFT_V2=y make target/docker-syncd-brcm-rpcv2.gz
NOSTRETCH=y NOJESSIE=y ENABLE_SYNCD_RPC=y SAITHRIFT_V2=y make target/docker-saiserverv2-brcm.gz
Test done:
#12619
NOSTRETCH=y NOJESSIE=y make configure PLATFORM=broadcom
NOSTRETCH=y NOJESSIE=y ENABLE_SYNCD_RPC=y make target/docker-syncd-brcm-rpc.gz
NOSTRETCH=y NOJESSIE=y ENABLE_SYNCD_RPC=y make target/docker-saiserver-brcm.gz
Signed-off-by: richardyu-ms <richard.yu@microsoft.com>
Signed-off-by: richardyu-ms <richard.yu@microsoft.com>
add partial reboot cause support for linecards
add watchdog support for linecards
add power draw information for chassis
properly implement Chassis.get_port_or_cage_type
fix pcieutil on chassis with powered off cards
fix watchdog-control.service crash
misc fixes and cleanups
Why I did it
enable sai-ptf logger in sai_adapter to log all the sai api invcations
How I did it
add build parameter to enable the sai-ptf logger when build sai PRC
How to verify it
local build test
test the generated sai_adapter
test with pipeline
Signed-off-by: richardyu-ms <richard.yu@microsoft.com>
Signed-off-by: richardyu-ms <richard.yu@microsoft.com>
* Why I did it*
Advance submodule sairdis with sai 1.11 and add brcm and mlnx sai sdk
*How I did it*
Advance sairedis which contains
Todo: cause sairedis 202211 branch blocked by some dependences repo, map to sairedis master, will move to 202211 when branch ready
[submodule][SAI]Advance SAI head pointer sonic-sairedis#1155
[Recorder]: Acquire lock for ofstream changes sonic-sairedis#1145
[SAI submodule update] Enable support for SAI v1.11.0 sonic-sairedis#1140
Add brcm sdk 7.1 which update with sai 1.11
Add mlnx sdk which update with sai 1.11
*How to verify it*
Test with pipeline which enable RPC build as well https://github.com/sonic-net/sonic-buildimage/pull/12770/files
Test with sonic smoke test cases
Test with sai test cases
Signed-off-by: richardyu-ms <richard.yu@microsoft.com>
Signed-off-by: richardyu-ms <richard.yu@microsoft.com>
Signed-off-by: Kebo Liu <kebol@nvidia.com>
Co-authored-by: Kebo Liu <kebol@nvidia.com>
Signed-off-by: richardyu-ms <richard.yu@microsoft.com>
Signed-off-by: richardyu-ms <richard.yu@microsoft.com>
Signed-off-by: Kebo Liu <kebol@nvidia.com>
Co-authored-by: Kebo Liu <kebol@nvidia.com>
Why I did it
Move armhf syncd docker compilation to bullseye.
How I did it
compile syncd docker for armhf platform using below commands,
NOJESSIE=1 NOSTRETCH=1 NOBUSTER=1 BLDENV=bullseye make configure PLATFORM=marvell-armhf PLATFORM_ARCH=armhf
NOJESSIE=1 NOSTRETCH=1 NOBUSTER=1 BLDENV=bullseye make target/docker-syncd-mrvl.gz
How to verify it
upgrade the syncd docker and verify ports are up.
Signed-off-by: rajkumar38 <rpennadamram@marvell.com>
* [SAI PTF] SAI PTF docker support sai-ptf v2
Publish the sai-ptf docker.
Take part of the change from previous PR #11610 (already reverted as some cache issue)
Cause in #11610, added two new target in it, one is sai-ptf another one is syncd-rpc with sai-ptf v2, to make the upgrade with more clear target, use this one take the sai-ptf one.
Test one:
NOSTRETCH=y NOJESSIE=y make configure PLATFORM=vs
NOSTRETCH=y NOJESSIE=y NOBULLSEYE=y SAITHRIFT_V2=y make target/docker-ptf-sai.gz
Signed-off-by: richardyu-ms <richard.yu@microsoft.com>
* remove useless change
Signed-off-by: richardyu-ms <richard.yu@microsoft.com>
* remove useless parameters
Signed-off-by: richardyu-ms <richard.yu@microsoft.com>
* remove useless change
Signed-off-by: richardyu-ms <richard.yu@microsoft.com>
* Update azure-pipelines-build.yml
remove a useless option
Signed-off-by: richardyu-ms <richard.yu@microsoft.com>
This PR is part of the following HLD:
Persistent loglevel HLD: sonic-net/SONiC#1041
- Why I did it
After the Logger tables moved from the LOGLEVEL_DB to the CONFIG_DB and the jinja2_cache was deleted the LOGLEVEL_DB is not in use.
- How I did it
Removed the LOGLEVEL_DB from the SONiC code
- How to verify it
All tests were passed
Why I did it
syseepromd in pmon crashes because of missing import in python script and doesn't get in running state
How I did it
Fix missing import issue to avoid python script failing
How to verify it
Boot system and wait till syseepromd gets into running state
Which release branch to backport (provide reason below if selected)
201811
201911
202006
202012
202106
202111
202205
* Build docker-gbsyncd-broncos image
* Correct typo in LIBSAI_BRONCOS_URL_PREFIX
* Update docker-gbsyncd-broncos/Dockerfile.j2
* Enable debug shell support on docker-gbsyncd-broncos
* Include bcmsh in docker-gbsyncd-broncos
Why I did it
In docker-gbsyncd-broncos image, enable debug shell support for BRCM broncos PHY.
How I did it
How to verify it
Note: need enable attr SAI_SWITCH_ATTR_SWITCH_SHELL_ENABLE support in BCM PAI library
# bcmsh
Press Enter to show prompt.
Press Ctrl+C to exit.
NOTICE: Only one bcmsh or bcmcmd can connect to the shell at same time.
BRCM:> help
help
List of available commands
- h or help => Print command menu
- l => Print list of active ports on the PHY
- ps <port_id> <options> => Print port status
<options> => 1 -> Link status
=> 2 -> Link training failure status
=> 3 -> Link training RX status
=> 4 -> PRBS lock status
=> 5 -> PRBS lock loss status
- rd <port_id> <addr> <no of registers to read> => Read register contents
- wr <port_id> <addr> <data> => Write register data
- rrd <lanemap> <if_side> <addr> <no of registers to read> => Raw read register contents using lanemap and if_side (line = 0, system = 1)
- rwr <lanemap> <if_side> <addr> <data> => Raw write register data using lanemap and if_side (line = 0, system = 1)
- fw or firmware => Print firmware version of the PHY
- pd or port_dump <port_id> <flags> => Dump port status
- eyescan <port_id> => Display eye scan
- fec_status <port_id> => Get fec status of the port
- polarity <lanemap> <if_side> <TX polarity> <RX Polarity> => Set TX and RX polarity
<lanemap> => 0xF, 0xFF, or 0xFFFF based on number of lanes
<if_side > => Line = 0, System = 1
<TX/RX Polarity> =>_TX/RX Polarity bitmap of all lanes
Each bit represents a lane number.
E.g. Lane 0's polarity value (0 or 1) is populated in Bit 0.
- polarity <lanemap> <if_side> => Print TX and RX polarity
- lb <port_id> <lb_value> => Enable loopback on the port
lb_value = 0 -> Disable, 1 -> PHY, 2 -> MAC
- lb <port_id> => Print loopback configuration of the port
- prbs <port_id> <options> <val> => Set/Get PRBS configuration
<options> => 1 -> Get PRBS state and polynomial
2 -> Set PRBS Polynomial, <val> - PRBS Polynomial
Please refer to phy/chip documentation for valid values
3 -> Enable PRBS
<val> => 0 Disable PRBS
1 Enable both PRBS Transmitter and Receiver
2 Enable PRBS Receiver
3 Enable PRBS Transmitter
exit or q => Exit the diagnostic shell
- Why I did it
Update SN2201 dynamic minimum fan speed table according to data provided by the thermal team.
- How I did it
Update the thermal table in device_data.py
- How to verify it
Run platform related regression
Signed-off-by: Kebo Liu <kebol@nvidia.com>
Signed-off-by: maipbui <maibui@microsoft.com>
#### Why I did it
`subprocess.Popen()` and `subprocess.run()` is used with `shell=True`, which is very dangerous for shell injection.
`os` - not secure against maliciously constructed input and dangerous if used to evaluate dynamic content
#### How I did it
Replace `os` by `subprocess`
Remove unused functions