Commit Graph

7337 Commits

Author SHA1 Message Date
Arvindsrinivasan Lakshmi Narasimhan
6a3a6c77f4 set the default value for the port fec to RS on J2 based LC (#15346)
Why I did it
Work item tracking
Microsoft ADO (24182162):
How I did it
Update config.bcm to set the default FEC to RS on the 100G linecard

How to verify it
Tests on chassis
2023-06-10 14:32:36 +08:00
DavidZagury
8de162d4af [Mellanox] Update SN5600 SAI XML file (#14947)
- Why I did it
Update SAI xml file to align with the default SKU

- How I did it
Update the SN5600 SAI xml file

- How to verify it
Install image on SN5600 device
2023-06-10 14:32:30 +08:00
Kebo Liu
3100425299 [Mellanox] Update SN5600 sensors.conf and pcie.yaml files (#14883)
- Why I did it
Update the sensors.conf and pcie.yaml according to the real hardware.

- How I did it
Update the sensors.conf and pcie.yaml

- How to verify it
run relevant sonic-mgmt test cases.

Signed-off-by: Kebo Liu <kebol@nvidia.com>
2023-06-10 14:32:26 +08:00
Junchao-Mellanox
b8ac86e14a [system-health] Add fan direction check for system health (#14509)
- Why I did it
Add a fan direction check to system health; all fans should be in the same direction

- How I did it
Add a fan direction check to system health; all fans should be in the same direction
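
The check described above can be sketched as follows. This is a minimal illustration, not the code from the PR: the function name `check_fan_directions`, the direction strings, and the majority-vote rule for deciding which fans are "wrong" are all assumptions made for the example.

```python
from collections import Counter


def check_fan_directions(directions):
    """Return (ok, mismatched) for a mapping of fan name -> direction.

    Hypothetical helper: fans whose direction is None (unknown) are skipped,
    and the most common direction is treated as the expected one.
    """
    known = {name: d for name, d in directions.items() if d is not None}
    if not known:
        return True, []
    # The most common direction is assumed to be the intended airflow.
    expected, _ = Counter(known.values()).most_common(1)[0]
    mismatched = sorted(name for name, d in known.items() if d != expected)
    return not mismatched, mismatched
```

For example, passing `{"fan1": "intake", "fan2": "intake", "fan3": "exhaust"}` reports `fan3` as the fan that disagrees with the others.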

- How to verify it
Manual test
Unit test
Added sonic-mgmt test case to verify
2023-06-10 14:32:21 +08:00
StormLiangMS
8aeb2ba715
Cherrypick to 202211 [Mellanox] Add patch commit-id mapping to description #15416
cherry pick #15052
2023-06-10 13:58:12 +08:00
Junchao-Mellanox
af7412d3a1 [Mellanox] add PSU fan direction support (#14508)
- Why I did it
Add PSU fan direction support

- How I did it
Implement fan.get_direction for PSU fan

- How to verify it
Manual test
Unit test
2023-06-10 12:32:26 +08:00
mssonicbld
c99e035232
Added change to add 'peerType' as element in NEIGH_STATE_TABLE. (#15265) (#15380) 2023-06-08 05:09:53 +08:00
mssonicbld
5f4b54a9cd
[ci/build]: Upgrade SONiC package versions (#15361) 2023-06-06 19:46:12 +08:00
mssonicbld
e4d8355976
[ci/build]: Upgrade SONiC package versions (#15329) 2023-06-04 18:12:12 +08:00
mssonicbld
4e9569ee3b
[ci/build]: Upgrade SONiC package versions (#15165) 2023-06-03 17:22:05 +08:00
mssonicbld
084564bdde
Fix for fast/cold-boot: call db_migrator only after old config is loaded (#14933) (#15317) 2023-06-03 09:16:42 +08:00
Ye Jianquan
dd989a64d7
[CI/CD] Refine pr test definition, remove old test jobs and testbedv2 flags (#15305) 2023-06-02 16:33:41 +08:00
Ye Jianquan
167704807e
[CI/CD] Migrate to SONiC Elastictest (#15273) 2023-06-02 10:38:55 +08:00
Sudharsan Dhamal Gopalarathnam
d93970bc2e
[Mellanox] Update hw-mgmt to 7.0020.4301 (#15260) (#15283)
Manual Cherrypick of #15260

Why I did it
Bug fix:

I2C bus is stuck - Unable to probe I2C bus 2-0048, which causes /var/run/hw-management/config/sfp_counter, module_counter to be zero and pmon docker unable to start.
Work item tracking
Microsoft ADO (number only):
How I did it
Update HW-MGMT package version in the make file
Update HW-MGMT submodule pointer

How to verify it
run full sonic-mgmt regression
2023-06-01 11:41:59 +08:00
Ye Jianquan
69d61047c4
[CI/CD] Refine PR test templates and test_plan.py to be ready to migrate to Elastictest (#15259) 2023-05-31 09:37:38 +08:00
Neetha John
b82145bc27 [qos] Update RDMA-CENTRIC lossy profile to use static threshold for Th devices (#14372)
Why I did it
For better accounting purposes, updating the ingress lossy traffic profile to use static threshold. This change is only intended for Th devices using RDMA-CENTRIC profiles

How I did it
Update the buffer templates for Th devices in RDMA-CENTRIC folder to use the correct threshold

How to verify it
Verified the changes manually on a Th device.
Existing unit tests render Th template from the RDMA-CENTRIC folder. Updated the expected output to use the correct threshold
2023-05-31 00:32:12 +08:00
lixiaoyuner
8867d2459f Clean up the old version container images (#14978)
Why I did it
Our k8s feature pulls new-version container images on each upgrade, so the number of container images inside SONiC keeps growing. There is currently no way to clean up old-version container images, so the disk may fill up. Logic for cleaning up old-version container images needs to be added.

Work item tracking
Microsoft ADO (number only):
17979809
How I did it
Remove old-version container images, keeping only the feature's current-version and last-version images. The last-version image is kept to support fallback.
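
The retention rule above can be sketched as a small selection function. This is only an illustration under assumptions: the name `images_to_remove` and the idea of passing tags as plain strings are hypothetical and are not taken from the PR.

```python
def images_to_remove(tags, current, last=None):
    """Return the image tags that may be deleted for one feature.

    Hypothetical helper: keeps the current version and, if given, the
    last version (kept so the feature can fall back to it).
    """
    keep = {current}
    if last is not None:
        keep.add(last)
    # Preserve the input order so the caller can log removals predictably.
    return [tag for tag in tags if tag not in keep]
```

With tags `["v1", "v2", "v3", "v4"]`, current version `v4` and last version `v3`, only `v1` and `v2` are selected for removal.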

How to verify it
Check whether the old version images are removed
2023-05-30 20:50:15 +08:00
mssonicbld
7b6a7d8283 [submodule] Update submodule sonic-swss to the latest HEAD automatically 2023-05-30 16:32:45 +08:00
mssonicbld
24daa8ab40
[healthd] Use unix_socket_path instead of loopback ip (#14843) (#15249) 2023-05-29 22:40:31 +08:00
Jing Kan
2cf1370ba0 [YANG] Add MgmtLeafRouter to Device Neighbor Metadata element type list (#15202)
Why I did it
Introduce a new valid neighbor element type to YANG.

Work item tracking
Microsoft ADO (number only): 23994521
How I did it
Add MgmtLeafRouter to element network type list.

How to verify it
Passes UTs
2023-05-29 14:34:10 +08:00
mssonicbld
d598217bab [submodule] Update submodule sonic-swss to the latest HEAD automatically 2023-05-26 16:32:43 +08:00
mssonicbld
d8f2f7c034
[Mellanox] Use sysfs for sfp reset/LPM/presence (#14130) (#15215) 2023-05-26 02:25:21 +08:00
mssonicbld
2098634ab3
[Mellanox] Update SAI to 2211.24.0.21 and SDK/FW to 4.5.5142/2010_5144 (#15072) (#15214) 2023-05-26 02:20:30 +08:00
mssonicbld
46e72ede39 [submodule] Update submodule sonic-platform-daemons to the latest HEAD automatically 2023-05-25 16:32:39 +08:00
Ye Jianquan
9764ec297e
Refine test job definition and assert logic (#14961)
Why I did it
Remove the 'kvmtest-t0' and 'kvmtest-t1-lag' test jobs. All test jobs are already required (continueOnError: false), and only one of the classical and testbedV2 test sets will be enabled, so there is no need for an extra test job that computes the 'or' of the two.
Change agent pool to reduce cost and avoid congestion
2023-05-24 10:26:49 +08:00
Yaqiang Zhu
782c044a75 [minigraph] Add rack_mgmt_rack parse support in minigraph.py (#15064)
Why I did it
We need to store power shelf information in config_db for the SONiC MX switch. The current minigraph parser cannot parse the rack_mgmt_map field.

Work item tracking
Microsoft ADO (number only): 22179645
How I did it
Add support for parsing rack_mgmt_map.
2023-05-23 14:33:24 +08:00
Yaqiang Zhu
8a48cab032
[202211][yang] Extend device_metadata yang model with rack_mgmt_map (#15141)
Why I did it
Manually cherry-pick and resolve conflicts of this PR: #15109
Extend device_metadata yang model.

Work item tracking
Microsoft ADO (number only): 22912178
How I did it
Add rack_mgmt_map field in yang model.

How to verify it
Build image.
2023-05-23 09:44:38 +08:00
mssonicbld
93d62f87a7
[submodule] Update submodule sonic-swss to the latest HEAD automatically (#15172) 2023-05-21 14:52:18 +08:00
mssonicbld
09e2bc9964
[submodule] Update submodule sonic-swss to the latest HEAD automatically (#15164) 2023-05-20 15:08:40 +08:00
Dror Prital
2e8b7d2ede Support pulling sonic-slave-docker image from path at REGISTRY_SERVER (#14907)
- Why I did it
To reduce SONiC build time, the sonic slave docker image(s) can be pulled from an artifact server (this reduces the sonic make configure time).
The current implementation supports only the following convention:

<REGISTRY_SERVER>:<REGISTRY_PORT>/<SLAVE_BASE_IMAGE>:<SLAVE_BASE_TAG>

If the SLAVE_BASE_IMAGE is located under an internal path on the server, the convention should be:

<REGISTRY_SERVER>:<REGISTRY_PORT><REGISTRY_SERVER_PATH>/<SLAVE_BASE_IMAGE>:<SLAVE_BASE_TAG>

REGISTRY_SERVER_PATH (set in rules/config) must start with "/".

If REGISTRY_SERVER_PATH is not set, the behavior remains the same as it is today.
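
The two naming conventions above can be illustrated with a short sketch. The real change lives in the build scripts and makefiles; the function `slave_image_ref` and the example registry values below are hypothetical and only show how the optional path fits into the image reference.

```python
def slave_image_ref(server, port, image, tag, server_path=""):
    """Compose <server>:<port><path>/<image>:<tag>.

    Hypothetical helper: an empty server_path reproduces the original
    convention; a non-empty one must start with "/".
    """
    if server_path and not server_path.startswith("/"):
        raise ValueError('REGISTRY_SERVER_PATH must start with "/"')
    return f"{server}:{port}{server_path}/{image}:{tag}"
```

With no path set, the result is `server:port/image:tag` as before; with a path such as `/internal/sonic`, the path is inserted between the port and the image name.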

- How I did it
Add ability to set REGISTRY_SERVER_PATH and update the code for docker image tag and docker image pull accordingly

- How to verify it
Use a sonic slave docker image from an artifact server where the image is kept in an internal folder, and make sure the build consumes it.
2023-05-18 14:33:33 +08:00
Vivek
e2876b0062 [Sys Mon] Fix the service entry delete in state_db because of timer job (#14702)
Why I did it
A systemd stop event on a service with timers can sometimes delete the state_db entry for the corresponding service.

Note: This won't be observed on the latest master label since the dependency on timer was removed with the recent config reload enhancement. However, it is better to have the fix since there might be some systemd services added to system health daemon in the future which may contain timers

root@qa-eth-vt01-4-3700c:/home/admin# systemctl stop snmp
root@qa-eth-vt01-4-3700c:/home/admin# show system-health sysready-status 
System is not ready - one or more services are not up

Service-Name            Service-Status    App-Ready-Status    Down-Reason
----------------------  ----------------  ------------------  -------------
<Truncated>
ssh                     OK                OK                  -
swss                    OK                OK                  -
syncd                   OK                OK                  -
sysstat                 OK                OK                  -
teamd                   OK                OK                  -
telemetry               OK                OK                  -
what-just-happened      OK                OK                  -
ztp                     OK                OK                  -
<Truncated>
Expected

Should see a Down entry for SNMP instead of the entry being deleted from the STATE_DB

root@qa-eth-vt01-4-3700c:/home/admin# show system-health sysready-status 
System is not ready - one or more services are not up

Service-Name            Service-Status    App-Ready-Status    Down-Reason
----------------------  ----------------  ------------------  -------------
<Truncated>
snmp                    Down              Down                Inactive
ssh                     OK                OK                  -
swss                    OK                OK                  -
syncd                   OK                OK                  -
sysstat                 OK                OK                  -
teamd                   OK                OK                  -
telemetry               OK                OK                  -
what-just-happened      OK                OK                  -
ztp                     OK                OK                  -
<Truncated>
How I did it
This happens because the timer is usually PartOf the service, so a stop on the service is propagated to the timer. Fixed the logic to handle this.

Apr 18 02:06:47.711252 r-lionfish-16 DEBUG healthd: Main process- received event:snmp.service from source:sysbus time:2023-04-17 23:06:47
Apr 18 02:06:47.711347 r-lionfish-16 INFO healthd: check_unit_status for [ snmp.service ] 
Apr 18 02:06:47.722363 r-lionfish-16 INFO healthd: snmp.service service state changed to [inactive/dead]

Apr 18 02:06:47.723230 r-lionfish-16 DEBUG healthd: Main process- received event:snmp.timer from source:sysbus time:2023-04-17 23:06:47
Apr 18 02:06:47.723328 r-lionfish-16 INFO healthd: check_unit_status for [ snmp.timer ] 

Signed-off-by: Vivek Reddy Karri <vkarri@nvidia.com>
2023-05-18 09:47:01 +08:00
Anish Narsian
71ecd727ac [arp_update] Resolve neighbors from config_db (#15006)
* Resolve NEIGH table entries present in CONFIG_DB. Without this change, ARP/NDP entries that are configured via CONFIG_DB and that we wish to resolve are not resolved.
2023-05-18 09:46:56 +08:00
mssonicbld
3869fbfb68
[macsec]: show macsec: add --profile option, include profile name in show command output (#13940) (#15127) 2023-05-18 08:42:44 +08:00
mssonicbld
17771efecf
[armhf][Nokia-7215] changes fstrim.timer to daily (#14723) (#15125) 2023-05-18 07:12:24 +08:00
Song Yuan
01f61d1d29 Install ptf afpacket module required by ptf_nn_agent. (#14503)
Why I did it
ptf_nn_agent failed to start in dnx rpc syncd because module afpacket was not installed.
Please see issue sonic-net/sonic-mgmt#7822

How I did it
Add downloading ptf afpacket module in docker file.

How to verify it
Verified that ptf_nn_agent was started successfully in dnx rpc syncd with the change.
2023-05-18 06:32:29 +08:00
mssonicbld
155477082f [submodule] Update submodule sonic-platform-common to the latest HEAD automatically 2023-05-17 18:32:19 +08:00
Nazarii Hnydyn
ba54e1e1ae
Revert "[swss/syncd] remove dependency on interfaces-config.service (#13084) (#14341)" (#15094)
This reverts commit 499f57a7f7.
2023-05-17 15:59:55 +08:00
Marty Y. Lok
4c707cbf25 [Nokia][device-data] Modify the Nokia-7250IXRE platform specific reboot script (#14568)
Why I did it

When the chassis is rebooted by issuing "sudo reboot" on the Supervisor card, the internal midplane communication interface xe0 should be shut down to avoid a double reboot of the linecard.
Added a udev link rule to disable autoneg on the AMD xgbe ports xe0 and xe1 and keep the setting in sync with the peer Broadcom Greyhound ports.

How I did it

Modify the Nokia-7250IXRE specific reboot script on the Supervisor card to shut down the internal interface xe0. Also move the linecard reboot code to the top of the script to make sure the notification has been sent to the linecard before the xe0 interface is shut down.
Introduced a new rule, 80-net-by-driver.link, to disable autoneg on the AMD side. This change requires the latest NDK, which contains the change to set autoneg on the xe0 and xe1 ports on the Greyhound.

Signed-off-by: mlok <marty.lok@nokia.com>
2023-05-17 14:32:57 +08:00
mssonicbld
a9ffcc8a6d
Add SECURE_UPGRADE_PROD_TOOL_ARGS flag to make it possible for vendors to pass their own arguments on the prod signing script (#14581) (#15095) 2023-05-17 02:26:58 +08:00
mssonicbld
a443f15617 [submodule] Update submodule sonic-py-swsssdk to the latest HEAD automatically 2023-05-17 00:36:54 +08:00
mssonicbld
146457bc60 [submodule] Update submodule sonic-utilities to the latest HEAD automatically 2023-05-16 00:36:53 +08:00
mssonicbld
fac120025a [submodule] Update submodule sonic-swss to the latest HEAD automatically 2023-05-16 00:36:48 +08:00
Hua Liu
50705e9d9f Fix per-command authorization failed issue when a command with wildcard match more than hundred files. (#14787)


#### Why I did it
When a user enables TACACS per-command authorization and runs a command with a wildcard, and the wildcard matches hundreds of files, the per-command authorization fails with the following message:
  *** authorize failed by TACACS+ with given arguments, not executing

The root cause is that bash expands the wildcard and replaces the wildcard argument with the matched files. When there are too many files, the TACACS plugin generates a large authorization request, which is rejected by the server side.

##### Work item tracking
- Microsoft ADO **(number only)**: 18074861

#### How I did it
Fix the bash patch file to use the original user input as the authorization parameters.
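
The effect of wildcard expansion on the request size can be illustrated with a small sketch. It does not reproduce the bash patch or the TACACS plugin; `authorization_args` and `demo` are hypothetical names, and Python's `glob` module stands in for bash's pathname expansion.

```python
import glob
import os
import tempfile


def authorization_args(typed_args, use_original=True):
    """Return the argument list that would be sent for authorization.

    Hypothetical helper: with use_original=False, wildcards are expanded
    the way a shell would do before running the command.
    """
    if use_original:
        return list(typed_args)
    expanded = []
    for arg in typed_args:
        matches = sorted(glob.glob(arg))
        # Like bash by default, keep the pattern itself if nothing matches.
        expanded.extend(matches if matches else [arg])
    return expanded


def demo(file_count=150):
    """Return (expanded_arg_count, original_arg_count) for 'ls <dir>/*.log'."""
    with tempfile.TemporaryDirectory() as tmp:
        for i in range(file_count):
            open(os.path.join(tmp, f"file{i}.log"), "w").close()
        typed = ["ls", os.path.join(glob.escape(tmp), "*.log")]
        return (len(authorization_args(typed, use_original=False)),
                len(authorization_args(typed, use_original=True)))
```

In this sketch the expanded form grows with the number of matching files, while the original user input stays at two arguments regardless of how many files match.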

#### How to verify it
Pass all UT.
Created a new UT to validate that TACACS authorization requests use the original command arguments.
UT PR: https://github.com/sonic-net/sonic-mgmt/pull/8115

#### Which release branch to backport (provide reason below if selected)

- [ ] 201811
- [ ] 201911
- [ ] 202006
- [ ] 202012
- [ ] 202106
- [ ] 202111
- [X] 202205
- [X] 202211

#### Tested branch (Please provide the tested image version)

- [x] 202205.258490-412b83d0f
- [x] 202211.71966120-1b971c54b5


#### Description for the changelog
Fix per-command authorization failed issue when a command with wildcard match more than hundred files.
2023-05-16 00:36:40 +08:00
xumia
30d919945d [Ci] Support marvell/marvell-arm64 build (#14875)
Why I did it
Support marvell/marvell-arm64 build

Work item tracking
Microsoft ADO (number only): 19995559
How I did it
2023-05-16 00:36:28 +08:00
mssonicbld
bb6a219520 [submodule] Update submodule wpasupplicant/sonic-wpa-supplicant to the latest HEAD automatically 2023-05-16 00:36:24 +08:00
DavidZagury
c59e15304c Change SECURE_UPGRADE_DEV_SIGNING_CERT to SECURE_UPGRADE_SIGNING_CERT (#14591)
Depends on https://github.com/sonic-net/sonic-linux-kernel/pull/315

#### Why I did it
The name SECURE_UPGRADE_DEV_SIGNING_CERT is misleading; this flag is relevant to both dev and prod signing.

#### How I did it
Rename all mentions of SECURE_UPGRADE_DEV_SIGNING_CERT to SECURE_UPGRADE_SIGNING_CERT. The same rename is done by a PR in the sonic-linux-kernel repository.

#### How to verify it
Build SONiC using your own prod script
2023-05-16 00:36:15 +08:00
DavidZagury
cfa36bbd7b Fix issue with prod script not found, change the prod signing to work with flags to align to the dev script (#14580)
- Why I did it
Fix an issue where the signing tool does not run because it is called with its path on the host rather than the path where it is mounted inside the docker-slave

- How I did it
Modified the path on the SECURE_UPGRADE_PROD_SIGNING_TOOL flag to the path where it is mounted inside the slave docker

- How to verify it
Build SONiC using your own prod script
2023-05-16 00:36:11 +08:00
mssonicbld
65f40a188e
[submodule] Update submodule sonic-linux-kernel to the latest HEAD automatically (#15014)
Why I did it
src/sonic-linux-kernel

* 3909870 - (HEAD -> 202211, origin/202211) Change SECURE_UPGRADE_DEV_SIGNING_CERT to SECURE_UPGRADE_SIGNING_CERT (#315) (4 days ago) [DavidZagury]
* baaa137 - [202211] Add Secure Boot Kernel configuration backport (#316) (4 days ago) [DavidZagury]
How I did it
How to verify it
2023-05-15 22:53:24 +08:00
mssonicbld
c948a7305a
Remove default value from SECURE_UPGRADE_DEV_SIGNING_KEY (#14582) (#15063)
This is done because when there is a default value, we mount to this path, and this creates this folder on the host.

#### Why I did it
Fix an issue where dummy folders are created on the host when running without overriding SECURE_UPGRADE_DEV_SIGNING_KEY and SECURE_UPGRADE_DEV_SIGNING_CERT.

#### How I did it
Removed the default assignment to SECURE_UPGRADE_DEV_SIGNING_KEY and SECURE_UPGRADE_DEV_SIGNING_CERT

#### How to verify it
Build SONiC using your own prod script

Co-authored-by: DavidZagury <32644413+DavidZagury@users.noreply.github.com>
2023-05-15 16:39:36 +08:00
DavidZagury
4fd2a6297f
[Secure Boot] Add Secure Boot Support (#12692) (#14963)
- Why I did it
Add Secure Boot support to SONiC OS.
Secure Boot (SB) is a verification mechanism for ensuring that code launched by a computer's UEFI firmware is trusted. It is designed to protect a system against malicious code being loaded and executed early in the boot process before the operating system has been loaded.

- How I did it
Added a signing process that signs the following components during the build, when the feature is enabled at build time as described in the HLD (the feature is disabled by default):
shim, grub, Linux kernel, and kernel modules.

- How to verify it
Each boot component is self-verified when the image is built. In addition, there is an existing end-to-end test in the sonic-mgmt repo that checks that the boot succeeds when loading a secure system (details below).

How to build a sonic image with secure boot feature: (more description in HLD)

Set the following build flags in rules/config:
SECURE_UPGRADE_MODE="dev"
SECURE_UPGRADE_DEV_SIGNING_KEY="/path/to/private/key.pem"
SECURE_UPGRADE_DEV_SIGNING_CERT="/path/to/cert/key.pem"
After setting those flags, build sonic-buildimage.
Before installing the image, prepare the setup (switch device) as follows:
check that the device supports UEFI
store the public keys in the UEFI DB
enable the Secure Boot flag in UEFI
How to run a test that verifies the Secure Boot flow:
The existing test "test_upgrade_path" under "sonic-mgmt/tests/upgrade_path/test_upgrade_path" is enough to validate proper boot.
You need to specify the following arguments:
Base_image_list your_secure_image
Target_image_list your_second_secure_image
Upgrade_type cold
Then run the test. The test installs the base image given in the parameter, upgrades to the target image with a cold reboot, and validates that all the services are up and working correctly.

Co-authored-by: davidpil2002 <91657985+davidpil2002@users.noreply.github.com>
2023-05-15 10:13:26 +08:00