Commit Graph

6927 Commits

Author SHA1 Message Date
Andriy Kokhan
ffad305fd3
[BFN] Added watchdog platform plugin (#12995)
Why I did it
Initial implementation of Watchdog platform plugin for BMC-based boards

How I did it
How to verify it
Run platform_tests/test_reload_config.py
2022-12-08 21:56:40 +08:00
Arvindsrinivasan Lakshmi Narasimhan
7db272556e
[chassis] update the asic_status.py to read from CHASSIS_FABRIC_ASIC_INFO_TABLE (#12576)
Signed-off-by: Arvindsrinivasan Lakshmi Narasimhan arlakshm@microsoft.com

Why I did it
Fixes #12575 and #12575

How I did it
In the PR sonic-net/sonic-platform-daemons#311 chassisd updates to CHASSIS_FABRIC_ASIC_INFO with the fabric asic info.
Updating the asic_status.py to read from the correct table.

How to verify it
test on chassis

Signed-off-by: Arvindsrinivasan Lakshmi Narasimhan <arlakshm@microsoft.com>
2022-12-07 21:53:47 -08:00
hari-selvam
d993444883
[sflow]: Unblocked psample_*() function calls in BRCM ESW platforms for proper functionality of sflow feature (#12918)
*Replaced BRCM SDK's psample support flag(PSAMPLE_SUPPORT) with linux kernel psample module support config flag(CONFIG_PSAMPLE) in saibcm-modules.
*Replaced BUILD_PSAMPLE conditioanl check with CONFIG_PSAMPLE to build psample callback library(psample-cb.o), only if psample config is enabled in linux kernel.
*Cleaned up PSAMPLE_SUPPORT related commented code.

Signed-off-by: haris@celestica.com

Signed-off-by: haris@celestica.com
2022-12-07 17:14:34 -08:00
wenyiz2021
5073dc0f8b
[MASIC] [azp] remove official-build-multi-asic.yml (#12973)
remove ..multi-asic.yml file as original purpose was to test stability before multi_asic PR check were brought up
2022-12-07 15:12:12 -08:00
byu343
dd87a791b4
[Arista] Disable pcie checking on x86_64-arista_7050cx3_32s (#12900)
This change is to disable the pcie firmware check done by Broadcom SAI. The change is needed for the Arista platform x86_64-arista_7050cx3_32s; otherwise, the check will fail, blocking the initialization.

There was a pcie firmware check added in brcm SDK and certain Arista hardwares do not compliant with the check, so we added the disable_pcie_firmware_check originally for x86_64-arista_7060dx4_32. For x86_64-arista_7050cx3_32s, it was able to pass the check but some firmware change done in August made it fail.
2022-12-07 01:28:26 -08:00
Dmytro Lytvynenko
0711aea3aa
[bfn]: Fix sigterm processing (#12952)
Why I did it
SIGTERM takes more than 10 seconds to be processed, so psud is stopped by SIGKILL, this causes unexpected behavior since data base is not cleared

How I did it
Decorate get_presence api to cancel it on SIGTERM signal in order to avoid long processing.

How to verify it
test_pmon_psud_stop_and_start_status
test_pmon_psud_term_and_start_status
2022-12-06 23:38:23 -08:00
Liu Shilong
1bf5a245cd
[build]: increase raw image disk size to 4GB (#12958)
3GB disk size is not enough for broadcom raw image.
2022-12-06 23:37:05 -08:00
Saikrishna Arcot
61536028f8
[build]: Fix docker load image tag not being the expected tag (#12959)
PR #12829 modified the docker tagging scheme such that optional docker
containers would be tagged with the SONiC image version. However, the
docker-image-load macro wasn't updated for these changes. Update it
here.

Signed-off-by: Saikrishna Arcot <sarcot@microsoft.com>
2022-12-06 23:36:00 -08:00
Samuel Angebault
19ec89b830
[Arista] Update platform library submodules (#12967)
- add reboot cause support for linecards
- add back a Wolverine variant removed by mistake
- misc fixes and improvements
2022-12-06 23:34:59 -08:00
Michael Li
50b962b4a8
Limit reload BCM SDK kmods on syncd start to PikeZ platform (#12971)
Why I did it
Limiting #12804 changes to PikeZ platform only (Arista-720DT-48S). Note that this is a short term workaround for this platform until SDK investigation on SDK init failure on docker syncd restart due to DMA issues is resolved.

How I did it
Retrieve platform name from /host/machine.conf and only reload SDK kmods on Arista-720DT-48S platform.

Signed-off-by: Michael Li <michael.li@broadcom.com>
2022-12-07 09:53:21 +08:00
Stepan Blyshchak
8b8a7aaba8
[sonic-swss] update submodule (#12961)
Changes included:
```
28aa309 [fpm] Fix FpmLink to read all netlink messages from FPM message (#2492)
```
2022-12-06 12:06:57 -08:00
Zain Budhwani
0240763eb3
Update submodule ptr (#12953)
Incorporates following commits:

43a9179 Zain Budhwani Mon Dec 5 13:44:16 2022 -0800 Call evtc_stop after error (#64)
5712679 pettershao-ragilenetworks Fri Dec 2 11:04:08 2022 +0800 Fix the cfg variable configuration bug. (#65)
2022-12-06 09:29:43 -08:00
Stepan Blyshchak
8ca0530920
[swss.sh] optimize macsec feature state query (#12946)
- Why I did it
There's a slowdown in bootup related to the execution of a show command during startup of swss service. show is a pretty heavy command and takes long time to execute ~2 sec.

- How I did it
I replaced show with sonic-db-cli which takes a ms to run.

- How to verify it
Boot the switch and verify swss is active.

Signed-off-by: Stepan Blyschak <stepanb@nvidia.com>
2022-12-06 11:23:46 +02:00
Mykhailo Onipko
586d715f17
[BFN] Update BFN sdk to SAI 1.11.0 (#12945)
Why I did it
SONiC moved to SAI 1.11.0

How I did it
Build package with SAI 1.11.0

How to verify it
Ran sanity for all included profiles
2022-12-06 16:47:31 +08:00
zitingguo-ms
c55f4dca2d
[submodule] Advance sairedis header (#12937)
# Why I did it
Update sairedis submodule to include following changes:
1. Use github code scanning instead of LGTM sonic-sairedis#1160
2. enable cisco8000 SAI bulk API feature sonic-sairedis#1153
3. [submodule] Advance SAI header sonic-sairedis#1168
# How I did it
Advance sairedis header to keep up with master.

Signed-off-by: zitingguo-ms <zitingguo@microsoft.com>
2022-12-06 16:38:23 +08:00
Yutong Zhang
4de98ffecb
[TestbedV2][master]Set all jobs mandatory in pipeline. (#12938)
Recently, the job of t0-sonic, multi-asic and wan run stably in master branch, so in this pr, I set them mandatory in azure pipeline.

Why I did it
Recently, the job of t0-sonic, multi-asic and wan run stably in master branch, so in this pr, I set them mandatory in azure pipeline.

How I did it
Modify the value of continueOnError in each job from `true` to `false`.

Signed-off-by: Yutong Zhang <yutongzhang@microsoft.com>
2022-12-06 10:19:54 +08:00
Marty Y. Lok
f2ece3a4fc
[Nokia]Update Nokia platform submodule for Nokia-IXR7250E platform (#12876)
1d53bf4 Skip platform NDK health check two times in watchdog.sh
d68297c Added code to shutdown the channel after the grpc call also fixed the show fp-status command
0769efe Impelemented the module API to return the correct eeprom info for fabric card.
171569c Remove explicit logger identifier for transceiver module operations; use inherited id
6c4d651 Corrected the log messages for firmware install

Signed-off-by: mlok <marty.lok@nokia.com>
2022-12-05 11:38:52 -08:00
Ikki Zhu
64e7fff7c7
[Platform/Seastone]: fix syseeprom tlv read issue (#12200)
Why I did it
Fix Seastone syseeprom tlv header read incorrect issue

How I did it
Set mux idle_state

How to verify it
i2cdump -y -f 12 0x50 i
2022-12-05 09:49:43 -08:00
Ikki Zhu
ad49100985
Seastone: fix platform fan psu and temperature issues (#12567)
Why I did it:
Fix multiple seastone platform issues caused by sonic kernel upgrade.

How I did it:
Get gpio base id with new label path in gpio sys fs.

How to verify it:
After the change, show platform fan/psustatus/temperature works well.
2022-12-05 09:44:55 -08:00
Lior Avramov
e5808020a7
Add ECMP calculator tool (#12482)
- Why I did it
Added ECMP calculator tool.

- How I did it
New files were added.

- How to verify it
Manual tests performed according to tests chapter in HLD
Automated tests will be added by verification.
2022-12-04 17:14:25 +02:00
LuiSzee
5281f6c3f6
[build][arm64] disable p4rt compile on arm64 for bazel not work (#12798)
pre-compiled bazel is not work in arm64 docker container

shil@2f910d8d37b2:/sonic/src/sonic-p4rt/sonic-pins$ uname -a
Linux 2f910d8d37b2 5.4.0-132-generic #148-Ubuntu SMP Mon Oct 17 16:02:06 UTC 2022 aarch64 GNU/Linux
shil@2f910d8d37b2:/sonic/src/sonic-p4rt/sonic-pins$ bazel
Opening zip "/proc/self/exe": lseek(): Bad file descriptor
FATAL: Failed to open '/proc/self/exe' as a zip file: (error: 9): Bad file descriptor
shil@2f910d8d37b2:/sonic/src/sonic-p4rt/sonic-pins$
2022-12-03 23:07:12 -08:00
LuiSzee
cd12486316
[centec][arm64] fix tsingma bsp compile error (#12774)
fix centec arm64 tsingma bsp compile error caused by linux kernel api change
2022-12-03 23:05:59 -08:00
LuiSzee
3ef7b560ec
[build][arm64] fix debian source for arm64 bullseye docker image (#12778)
Why I did it
arm64 bullseye docker image source is set to jessie for VERSION_CODENAME is miss.

shil@localhost:~/sonic-buildimage$ docker run -it multiarch/debian-debootstrap:arm64-bullseye bash
root@b2e2fea86e2d:/# cat /etc/os-release | grep VERSION_CODENAME | cut -d= -f2
root@b2e2fea86e2d:/# cat /etc/os-release
PRETTY_NAME="Debian GNU/Linux bullseye/sid"
NAME="Debian GNU/Linux"
ID=debian
HOME_URL="https://www.debian.org/"
SUPPORT_URL="https://www.debian.org/support"
BUG_REPORT_URL="https://bugs.debian.org/"

How I did it
if DISTRO is NULL, find it in /etc/apt/sources.list

root@b2e2fea86e2d:/# cat /etc/apt/sources.list | grep deb.debian.org | awk '{print $3}'
bullseye
root@b2e2fea86e2d:/# cat /etc/apt/sources.list
deb http://deb.debian.org/debian bullseye main
root@b2e2fea86e2d:/#
2022-12-03 23:04:36 -08:00
LuiSzee
c154b68b61
[centec][arm64] support multi-platform device tree (#12846)
Why I did it
support multi-platform device tree for default dtb may not suitable on all vender hardware designs.

How I did it
use onie_platform variable to load device tree blob
2022-12-03 22:32:59 -08:00
Yutong Zhang
cb354a5af2
[TestbedV2][master] Remove timeout in each step. (#12915)
Previously, we set timeout in each step such as Lock testbed, Prepare testbed, Run test and KVM dump. When some issue suck like retry happens in one step, it will cause timeout error, but actually, it only needs more time to success. In this pr, we remove the timeout limit in each step and control the timeout outside in each job. When the job runs more than four hours, it will be cancelled.

Why I did it
Previously, we set timeout in each step such as Lock testbed, Prepare testbed, Run test and KVM dump. When some issue suck like retry happens in one step, it will cause timeout error, but actually, it only needs more time to success. In this pr, we remove the timeout limit in each step and control the timeout outside in each job. When the job runs more than four hours, it will be cancelled.

How I did it
Remove the timeout parameter in each step, and control the timeout outside in each job.

How to verify it
Set the timeout of one job to 4 hours, and when timeout happens, azure pipeline will cancel this job.
2022-12-03 22:30:03 -08:00
Marty Y. Lok
8bf7a8b2ce
[armhf][sonic-installer] Fix issue of the sonic-installer install a image after sonic-installer clean (#12609)
Signed-off-by: mlok <marty.lok@nokia.com>

Signed-off-by: mlok <marty.lok@nokia.com>
2022-12-02 13:52:59 -08:00
Jing Kan
272f61d0f1
[Arista 720DT] Create SKU alias Arista-720DT-G48S4 (#12905) 2022-12-02 18:53:02 +08:00
StormLiangMS
c7c921166c
Add 202211 backport option for the PR review template (#12884)
Why I did it
Add 202211 backport option

How I did it
Add option in .github/pull_request_template.md

How to verify it
2022-12-02 17:43:56 +08:00
Santhosh Kumar T
f10f79b754
[DellEMC] Master: S6100: SSD upgrade status: Moving from smartctl to iSMART (#12784)
Why I did it
smartctl tool is available only in PMON docker. Hence, the tool may be not accessible incase PMON docker goes down.
Using iSMART_64 tool to fetch the SSD firmware version and device model information.

How I did it
Replacing smartctl with iSMART_64.
2022-12-01 17:16:10 -08:00
Kalimuthu-Velappan
aaeafa8411
02.Version cache - docker cache build framework (#12001)
During docker build, host files can be passed to the docker build through
docker context files. But there is no straightforward way to transfer
the files from docker build to host.

This feature provides a tricky way to pass the cache contents from docker
build to host. It tar's the cached content and encodes them as base64 format
and passes it through a log file with a special tag as 'VCSTART and VCENT'.

Slave.mk in the host, it extracts the cache contents from the log and stores them
in the cache folder. Cache contents are encoded as base64 format for
easy passing.

<!--
     Please make sure you've read and understood our contributing guidelines:
     https://github.com/Azure/SONiC/blob/gh-pages/CONTRIBUTING.md

     ** Make sure all your commits include a signature generated with `git commit -s` **

     If this is a bug fix, make sure your description includes "fixes #xxxx", or
     "closes #xxxx" or "resolves #xxxx"

     Please provide the following information:
-->

#### Why I did it

#### How I did it

#### How to verify it
2022-12-02 08:28:45 +08:00
Robert J. Halstead
7a0152ad15
[sonic-pins] update submodule ptr (#12644)
Update submodule for sonic-pins to be aligned to following swss PRs
*New P4Orch development. sonic-swss#2425
*Upstream new development on p4orch sonic-swss#2237
2022-12-01 10:05:47 -08:00
Sudharsan Dhamal Gopalarathnam
15fc527d30
[yang] Add collector_vrf to sflow yang model (#12897)
- Why I did it
Fixed sflow yang model to include collector_vrf field.

- How I did it
Added leaf for collector_vrf under sflow_collector. Additionally aligned the configuration guide

- How to verify it
Added UT to verify.
2022-12-01 19:30:32 +02:00
Saikrishna Arcot
3226c40581
[build]: Disable stretch slave container (#12868)
The only platforms that currently need the stretch slave container are
innovium and nephos, and both are not building with the current code due
to other issues. All other platforms only need buster and bullseye slave
containers.

Signed-off-by: Saikrishna Arcot <sarcot@microsoft.com>
2022-12-01 09:26:46 -08:00
Mai Bui
2b3e884209
[nokia] Replace os.system and remove subprocess with shell=True (#12100)
Signed-off-by: maipbui <maibui@microsoft.com>
Dependency: [https://github.com/sonic-net/sonic-buildimage/pull/12065](https://github.com/sonic-net/sonic-buildimage/pull/12065)
#### Why I did it
`subprocess.Popen()` and `subprocess.run()` is used with `shell=True`, which is very dangerous for shell injection.
`os` - not secure against maliciously constructed input and dangerous if used to evaluate dynamic content
`getstatusoutput` is dangerous because it contains `shell=True` in the implementation
#### How I did it
Replace `os` by `subprocess`, use with `shell=False`
Remove unused functions
2022-12-01 12:12:50 -05:00
Stephen Sun
ec809bd7a1
[Submodule] Advance sonic-host-services pointer (#12902)
4a2ef99 Avoid printing message in error level when DEVICE_METADATA|localhost updates (sonic-net/sonic-host-services#25)
6c131c4 Use github code scanning instead of LGTM(sonic-net/sonic-host-services#26)
c55f5d1 Use github code scanning instead of LGTM

Signed-off-by: Stephen Sun <stephens@nvidia.com>
2022-12-01 15:35:44 +02:00
zitingguo-ms
b774ebfdc2
[SAI-PTF] Publish docker saiserverv2 in master branch (#12842)
Why I did it
Publish docker saiserverv2 in the build pipeline.

How I did it
Add docker saiserverv2 target in the build template.

How to verify it
Test this by running this pipeline: https://dev.azure.com/mssonic/build/_build/results?buildId=182134&view=results
2022-11-30 22:26:54 -08:00
vdahiya12
11d579ccb1
[sonic-platform-daemons] submodule update (#12841)
Signed-off-by: vaibhav-dahiya vdahiya@microsoft.com

e474335 (HEAD -> master, origin/master, origin/HEAD) [ycabled] fix minor appl_db retrieving logic for update (#319)
9b84b58 Use github code scanning instead of LGTM (#316)
f784ad7 Pass grid parameter while calling set_laser_freq (#317)
ed818f8 [PSU daemon] Support PSU power threshold checking (#288)
707a720 (origin/202211) [chassisd] update chassisd to write fabric and lc asics on sep erate table (#311)
e8c5657 [ycabled] fix exception-handling logic for ycabled (#306)
905874d [ycabled] move swsscommon API's from subroutines to call them exactly once per task_worker/thread (#303)
510d330 Fix typo in xcvrd (#313)
9ae551f [ycabled] add support for detach mode in 'active-active' topology (#309)

The above commits are added to sonic-platform-daemons
2022-11-30 19:34:36 -08:00
Junchao-Mellanox
ffa974c7f4
[system-health] Led color shall be controlled by configuration when system is booting (#12487)
* [system-health] Led color shall be controlled by configuration when system is booting

* Fix unit test issue
2022-11-30 18:38:50 -08:00
svshah-intel
f189986386
[submodule update] sairedis refpoint to include support for json sai attr value
sairedis commits:
b1e9c91 2022-11-29 | validation support for SAI_ATTR_VALUE_TYPE_JSON (sonic-net/sonic-sairedis#1152)
2022-11-30 18:12:41 -08:00
Stepan Blyshchak
d22cf46ceb
[dockers] save extension dockers with an image tag (#12829)
Fixes: #11521

- Why I did it
When build SONiC dockers, SONiC build system tags all of them with latest tag. This is Ok for all built-in dockers because we will also tag them with image version tag in sonic_debian_extension.j2 script. On the other hand, some of these dockers are SONiC packages and they are installed by sonic-package-manager which creates a only one tag whcih is recorded in the corresponding .gz file. This leads to having these dockers tagged only with latest tag. This change saves the tag as an image version string in .gz file, so that these dockers have version identification in their tag.

- How I did it
I modified slave.mk to save the version tag instead of latest tag.

- How to verify it
I verified this change by running show version

Signed-off-by: Stepan Blyschak <stepanb@nvidia.com>
2022-11-30 19:49:34 +02:00
Junchao-Mellanox
7d38b459e4
[Mellanox] Add device files for SN5600 (#12831)
- Why I did it
Add device files for new platform SN5600

- How I did it
Add device files for new platform SN5600

- How to verify it
Manual test
2022-11-30 19:47:50 +02:00
Yutong Zhang
df4312f7ef
Support passing the instance numbers of a testplan. (#12879)
Previously, we hard code the min and max numbers of instance in a plan. In this pr, we support passing the instance numbers of a testplan.

Why I did it
Previously, we hard code the min and max numbers of instance in a plan. In this pr, we support passing the instance numbers of a testplan.

How I did it
Use a variable to set the instance number.
2022-11-30 22:36:55 +08:00
Michael Li
f725b83bd6
Reload BCM SDK kmods on syncd start to handle syncd restart issues (#12804)
Why I did it
There is an issue on the Arista PikeZ platform (using T3.X2: BCM56274) while running SONiC. If the 'syncd' container in SONiC is restarted, the expected behaviour is that syncd will automatically restart/recover; however it does not and always fails at create_switch due to BCM SDK kmod DMA operation cancellation getting stuck.

Sep 16 22:19:44.855125 pkz208 ERR syncd#syncd: [none] SAI_API_SWITCH:platform_process_command:428 Platform command "init soc" failed, rc = -1. Sep 16 22:19:44.855206 pkz208 INFO syncd#supervisord: syncd CMIC_CMC0_PKTDMA_CH4_DESC_COUNT_REQ:0x33#015 Sep 16 22:19:44.855264 pkz208 CRIT syncd#syncd: [none] SAI_API_SWITCH:platformInit:1909 initialization command "init soc" failed, rc = -1 (Internal error). Sep 16 22:19:44.855403 pkz208 CRIT syncd#syncd: [none] SAI_API_SWITCH:sai_driver_init:642 Error initializing driver, rc = -1. ... Sep 16 22:19:44.855891 pkz208 CRIT syncd#syncd: [none] SAI_API_SWITCH:brcm_sai_create_switch:1173 initializing SDK failed with error Operation failed (0xfffffff5).

Reloading the BCM SDK kmods allows the switch init to continue properly.

How I did it
If BCM SDK kmods are loaded, unload and load them again on syncd docker start script.

How to verify it
Steps to reproduce:

In SONiC, run 'docker ps' to see current running containers; 'syncd' should be present.
Run 'docker stop syncd'
Wait ~1 minute.
Run 'docker ps' to see that syncd is missing.
Check logs to see messages similar to the above.

Signed-off-by: Michael Li <michael.li@broadcom.com>
2022-11-30 16:16:30 +08:00
Mai Bui
0bd3be32e6
[device/marvell] Mitigation for security vulnerability (#11876)
#### Why I did it
`os` and `commands` modules are not secure against maliciously constructed input
`getstatusoutput` is detected without a static string, uses `shell=True`
#### How I did it
Eliminate the use of `os` and `commands`
Use `subprocess` instead
2022-11-30 00:06:28 -08:00
Liu Shilong
6f2ddc5f49
[action] Add github action to merge mssonicbld's PRs which can be merged (#12564)
* [action] Add github action to scan auto-mergeable PRs
2022-11-30 11:28:06 +08:00
Neetha John
c323037815
Update ECN settings for storage backend (#12855)
Signed-off-by: Neetha John <nejo@microsoft.com>

Why I did it
ECN parameters need to be updated for storage backend

How I did it
Included the check for storage backend devices to update qos configs

How to verify it
Verified that the new ecn settings are applied on storage backend device.
Verified that the old ecn settings are applied for storage frontend, non storage frontend/backend devices
2022-11-29 10:19:06 -08:00
Mai Bui
95bb7f3b78
[device/ragile] Mitigation for security vulnerability (#11744)
Signed-off-by: maipbui <maibui@microsoft.com>
#### Why I did it
The [xml.etree.ElementTree](https://docs.python.org/3/library/xml.etree.elementtree.html#module-xml.etree.ElementTree) module is not secure against maliciously constructed data.
`os` - not secure against maliciously constructed input and dangerous if used to evaluate dynamic content
`subprocess.getstatusoutput` is dangerous because include shell=True in the implementation
#### How I did it
Remove xml. Use [lxml](https://pypi.org/project/lxml/) XML parsers package that prevent potentially malicious operation.
Replace `os` by `subprocess`
Use command as an array instead of string
Use `getstatusoutput_noshell` in `sonic_py_common` lib
2022-11-29 11:54:37 -05:00
Junchao-Mellanox
32eca3ff75
[YANG] Support syslog rate limit configuration (#12488)
- Why I did it
Change YANG model to support syslog rate limit configuration feature

- How I did it
modified sonic-syslog.yang and sonic-feature.yang to support the new added configuration schema

- How to verify it
Unit test
2022-11-29 16:49:13 +02:00
Kebo Liu
36a100083f
[Mellanox] Add support to Mellanox Spectrum-4 ASIC Firmware compiling and upgrade (#12844)
- Why I did it
Add support for compiling Spectrum-4 ASIC firmware to the SONiC image
Add support for Spectrum-4 ASIC firmware upgrade

- How I did it
Update Mellanox fw make files to include Spectrum-4 ASIC firmware binaries.
Update firmware upgrade scripts to be able to detect Spectrum-4 ASIC.

- How to verify it
Run regression tests

Signed-off-by: Kebo Liu <kebol@nvidia.com>
2022-11-29 16:38:41 +02:00
ganglv
c16b8dbcc5
[sonic-gnmi] Support GNMI native write (#10948)
Why I did it
Provide GNMI native write interface for configuration.

How I did it
Add configuration parameters for GNMI native write.

How to verify it
Check build pipeline.
2022-11-29 16:58:27 +08:00