Commit Graph

7338 Commits

Author SHA1 Message Date
Longxiang Lyu
3090d2671a [snmp] Check intfmgrd running before start (#16588)
Add a pre-start check to ensure intfmgrd is running.
The check runs for at most 20 seconds.
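
A bounded pre-start check of this kind can be sketched as a polling loop with a deadline. This is only an illustration, not the code from the PR: `wait_until` and its parameters are hypothetical names, and the 1-second polling interval is an assumption (the commit only states the 20-second upper bound).

```python
import time
from typing import Callable


def wait_until(check: Callable[[], bool],
               timeout_s: float = 20.0,
               interval_s: float = 1.0,
               clock: Callable[[], float] = time.monotonic,
               sleep: Callable[[float], None] = time.sleep) -> bool:
    """Poll check() until it returns True or timeout_s has elapsed.

    clock and sleep are injectable so the loop can be tested without
    actually waiting. Returns True on success, False on timeout.
    """
    deadline = clock() + timeout_s
    while True:
        if check():
            return True
        if clock() >= deadline:
            return False
        sleep(interval_s)
```

A caller would pass a function that reports whether the dependency (here, intfmgrd) is running and would refuse to start the service if `wait_until` returns False.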

Signed-off-by: Longxiang Lyu <lolv@microsoft.com>
2023-10-20 12:33:36 +08:00
Samuel Angebault
cf4f06d1c5 Disable CPU C-States other than C1 (#16703)
Why I did it
Networking devices need to be responsive. Such responsiveness is harmed when the CPU changes state.
There is a latency penalty when a CPU is idle (e.g. C2) and needs to exit this state to come back to the C1 state.
To prevent this from happening the CPU should be forced to remain in C1 state.

How I did it
Generalize the cstate forcing to C1 to all Arista products.
This is done by adding processor.max_cstate=1 to the kernel cmdline for all CPUs.
Additionally, Intel CPUs also need intel_idle.max_cstate=0 to fall back to the acpi_idle driver.

How to verify it
Check that processor.max_cstate=1 is present on the cmdline for AMD CPUs
Check that both processor.max_cstate=1 and intel_idle.max_cstate=0 are present on the cmdline for Intel CPUs
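
The verification steps above could be automated roughly as follows. This is a sketch, not part of the PR: `parse_cmdline` and `missing_cstate_params` are hypothetical helpers, and the caller is assumed to know whether the CPU is an Intel one.

```python
from typing import Dict, List


def parse_cmdline(cmdline: str) -> Dict[str, str]:
    """Split a kernel command line into a {parameter: value} mapping."""
    params = {}
    for token in cmdline.split():
        key, _, value = token.partition("=")
        params[key] = value
    return params


def missing_cstate_params(cmdline: str, is_intel: bool) -> List[str]:
    """Return the expected C-state parameters that are absent or wrong."""
    required = {"processor.max_cstate": "1"}
    if is_intel:
        # Intel CPUs additionally need the intel_idle driver disabled.
        required["intel_idle.max_cstate"] = "0"
    params = parse_cmdline(cmdline)
    return [f"{k}={v}" for k, v in required.items() if params.get(k) != v]
```

On a running device the input would typically come from `/proc/cmdline`, for example `missing_cstate_params(open("/proc/cmdline").read(), is_intel=True)`; an empty list means the expected parameters are present.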
2023-10-20 12:33:31 +08:00
Saikrishna Arcot
d504600c9f [baseimage]: Update openssh to 1:8.4p1-5+deb11u2 (#16826)
Openssh in Debian Bullseye has been updated to 1:8.4p1-5+deb11u2 to fix CVE-2023-38408. 
Since we're building openssh with some patches, we need to update our version as well.

Signed-off-by: Saikrishna Arcot <sarcot@microsoft.com>
2023-10-20 12:33:26 +08:00
mssonicbld
466f689e78
[submodule] Update submodule sonic-linux-kernel to the latest HEAD automatically (#16654)
#### Why I did it
src/sonic-linux-kernel
```
* 9534615 - (HEAD -> 202211, origin/202211) arm64: ac5: Fix watchdog timeleft (#334) (5 days ago) [pavannaregundi]
* 70c4df8 - [marvell-arm64]: Add support for 98DX35xx and 98CX85xx platform (#311) (6 days ago) [pavannaregundi]
* aab079e - [Mellanox] Upstream kernel patches with HW-MGMT 7.0030.1011 (#327) (4 weeks ago) [Kebo Liu]
```
#### How I did it
#### How to verify it
#### Description for the changelog
2023-10-19 16:34:30 +08:00
mssonicbld
c6a1838a32
[submodule] Update submodule linkmgrd to the latest HEAD automatically (#16855)
#### Why I did it
src/linkmgrd
```
* abb22d2 - (HEAD -> 202211, origin/202211) [warmboot] config all interfaces back to `auto` if reconciliation times out  (#220) (7 days ago) [Jing Zhang]
```
#### How I did it
#### How to verify it
#### Description for the changelog
2023-10-19 14:34:44 +08:00
mssonicbld
6d3cd99217
[submodule] Update submodule sonic-swss to the latest HEAD automatically (#16695)
#### Why I did it
src/sonic-swss
```
* 9647b81f - (HEAD -> 202211, origin/202211) [muxorch] Reorder the neighbor disable operations (#2917) (12 hours ago) [Longxiang Lyu]
* 30cea968 - Support type7 encoded CAK key for macsec in config_db (#2892) (5 days ago) [judyjoseph]
* 8d76a4e7 - [202211][ppi]: General code cleanup: remove unused methods. (#2868) (3 weeks ago) [Nazarii Hnydyn]
```
#### How I did it
#### How to verify it
#### Description for the changelog
2023-10-19 10:33:44 +08:00
Volodymyr Samotiy
29926587be
[202211][Mellanox] Update SAI version to SAIBuild2211.25.1.6 (#16522)
Signed-off-by: Volodymyr Samotiy <volodymyrs@nvidia.com>
2023-10-10 14:38:04 +08:00
Kebo Liu
b56b34aa2c
[202211][Mellanox] Update HW-MGMT package to new version V.7.0030.1011(#16239) (#16295)
* [Mellanox] Update HW-MGMT package to new version V.7.0030.1010

Signed-off-by: Kebo Liu <kebol@mellanox.com>

* Update hw-mgmt version to 7.0030.1011

Signed-off-by: Kebo Liu <kebol@nvidia.com>

---------

Signed-off-by: Kebo Liu <kebol@mellanox.com>
Signed-off-by: Kebo Liu <kebol@nvidia.com>
2023-10-10 14:30:56 +08:00
mssonicbld
2b738c53fa
Fix the dependency grpcio-tools version (#16776) (#16810)
2023-10-09 18:01:59 +08:00
Junchao-Mellanox
5c74ecb87f [Mellanox] wait reset cause ready (#16722)
Why I did it
The SONiC service determine-reboot-cause might run before the driver creates the reset cause files. In that case, the reset cause will be "Unknown". This PR introduces a mechanism that waits for the reset cause sysfs files to be ready.

How I did it
/run/hw-management/config/reset_attr_ready is the file that indicates all reset cause files are ready. The chassis.get_reboot_cause function waits for /run/hw-management/config/reset_attr_ready for up to 45 seconds.
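
The wait described above amounts to polling for a marker file with a deadline. A minimal sketch, assuming a hypothetical helper `wait_for_file` and a 1-second polling interval (neither is taken from the actual platform API code):

```python
import os
import time


def wait_for_file(path: str, timeout_s: float = 45.0,
                  interval_s: float = 1.0) -> bool:
    """Return True once path exists, or False if timeout_s elapses first."""
    deadline = time.monotonic() + timeout_s
    while not os.path.exists(path):
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval_s)
    return True
```

A caller would read the reset cause files only when `wait_for_file("/run/hw-management/config/reset_attr_ready")` returns True, and otherwise keep the "Unknown" reboot cause.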

How to verify it
Manual test on master/202211/202205
2023-10-05 09:33:21 +08:00
mssonicbld
7049b6f788
[Ci] Change the package upgrade PR title (#16674) (#16728)
2023-09-27 22:05:31 +08:00
mssonicbld
2cfa8b2d93
[build] Fix build issue in docker-ptf-sai caused by setuptools_scm new release (#16636) (#16680)
Why I did it
When SUPERVISOR_PROC_EXIT_LISTENER_SCRIPT changed, almost all dockers need to be built again.
But currently they are still loaded from the cache.

Work item tracking
Microsoft ADO (number only): 25123348
How I did it
Add $(DOCKER)_FILES into dependencies.

How to verify it
2023-09-26 18:47:01 +08:00
mssonicbld
5b6fcb7711
[ci/build]: Upgrade SONiC package versions (#15614)
2023-09-23 00:34:57 -07:00
Yoush
330d0780fd
[centec]: update sonic centec-sai reference to v1.11.0-2 for 202211 (#16241)
Change the makefile to reference the new SAI Debian package v1.11.0-2 for Centec on 202211

Signed-off-by: yoush <yoush@centec.com>
2023-09-23 00:29:16 -07:00
Stephen Sun
7ea54f53d5
Add yang model for scheduler in PORT_QOS_MAP (#16244) (#16359)
Signed-off-by: Stephen Sun <stephens@nvidia.com>
2023-09-19 10:06:32 +08:00
vganesan-nokia
9ffa92cc61
[swss] Chassis db clean up optimization and bug fixes (#16454) (#16540)
* [swss] Chassis db clean up optimization and bug fixes

This commit includes the following changes:
    - Fix for regression failure due to error in finding CHASSIS_APP_DB in
    pizzabox (#PR 16451)
    - After attempting to delete the system neighbor entries from
    chassis db, before starting to clear the system interface entries,
    wait for some time only if some system neighbors were deleted.
    If no system neighbor entries were deleted for the asic coming up,
    there is no need to wait.
    - Similar changes for system lag delete. Before deleting the
    system lag, wait for some time only if some system lag members were
    deleted. If no system lag members were deleted, there is no need to wait.
    - Flush the SYSTEM_NEIGH_TABLE from the local STATE_DB. While the asic
    is coming up, when system neigh entries are deleted from chassis app
    db (as part of chassis db clean up), there are no orchs/processes running to
    process the delete messages from chassis redis. Because of this, stale system
    neigh entries are present in the local STATE_DB. The stale entries result in
    creation of orphan (no corresponding data path/asic db entry) kernel neigh
    entries during STATE_DB:SYSTEM_NEIGH_TABLE entries processing by nbrmgr (after
    the swss service came up). This is avoided by flushing the SYSTEM_NEIGH_TABLE from
    the local STATE_DB when the service comes up.

Signed-off-by: vedganes <veda.ganesan@nokia.com>

* [swss] Chassis db clean up bug fixes review comment fix - 1

Debug logs added for deletion of other tables (SYSTEM_INTERFACE and SYSTEM_LAG_TABLE)

Signed-off-by: vedganes <veda.ganesan@nokia.com>

---------

Signed-off-by: vedganes <veda.ganesan@nokia.com>
(cherry picked from commit b13b41fc22)
2023-09-14 14:06:52 -07:00
SuvarnaMeenakshi
379f256bcf
[202211][SNMP][IPv6]: Revert PRs to support SNMP over IPv6 (#16278)
* Revert "[SNMP][IPv6]: Fix to use link local IPv6 address as snmp agentAddress (#16013)"

This reverts commit 803c71c86a.

* Revert "[SNMP][IPv6]: Fix SNMP IPv6 reachability issue in certain scenarios (#15487)"

This reverts commit 9864dfeaa1.
2023-09-10 22:18:17 +08:00
Dror Prital
4e67c18c11
[202211][Mellanox] Update SDK/FW to 4.6.1062/2012.1062 Update SDK/FW/SAI to 4.6.1062/2012.1062/SAIBuild2211.25.1.4 (#16434)
SAI bug Fixes

- When creating an ACL rule with SAI_ACL_ENTRY_ATTR_FIELD_SRC_IP/SAI_ACL_ENTRY_ATTR_FIELD_DST_IP enabled, and then disabling the field by setting enable=false, a match on L3_type=IPv4 would remain programmed for the rule. The issue is resolved by this fix.
- Allow the max scale of virtual routers to be configured for SPC-1, SPC-2, SPC-3, which is 255 when fastboot is enabled and 511 when fastboot is disabled
- Remove default hash key of SRC_MAC, DST_MAC and ETH_TYPE

SDK/FW bug fixes

- When performing fast boot from an old SDK version (currently installed) to a newer one (target version), and the system was initially loaded with a new SDK version (past version), and the system has not been wiped, under specific conditions, the fast boot would use the past version's data and may fail.
2023-09-08 23:55:42 -07:00
mssonicbld
4390159698
Update macsec CAK keys in profile for tests to change to type7 encoded format (#16388) (#16500)
2023-09-09 04:37:17 +08:00
Yaqiang Zhu
3310592d8f [yang] Add Bmc to Device Neighbor Metadata element type list (#16188)
Bmc is a valid neighbor type in minigraph, but it was missing from the YANG model definition. Usually, a Bmc type device can be a neighbor of BmcMgmtToRRouter. This PR introduces this type.
2023-09-07 12:33:20 +08:00
mssonicbld
d0325862a8
[Mellanox] set select timeout to no more than 1 sec to make sure fast shutdown (#13611) (#16448)
2023-09-06 04:23:23 +08:00
mssonicbld
67ea31edc8
[submodule] Update submodule sonic-utilities to the latest HEAD automatically (#16442)
2023-09-05 18:45:48 +08:00
Zain Budhwani
71678dc355 [eventd]: Remove unnecessary log (#16166)
Work item tracking
Microsoft ADO (number only): 16789053
2023-09-03 16:33:03 +08:00
Senthil Kumar Guruswamy
7ab6be8440 Handle service start-limit-hit failure event case in sysmonitor (#16174)
2023-09-03 16:32:58 +08:00
mssonicbld
1909f019ab
[P4RT]Disabling p4rt by default to overcome build issues (#16343) (#16426)
2023-09-03 16:06:31 +08:00
Stephen Sun
72aab2b58e Fix issue: unprintable character is rendered when handling comments in j2 (#16287)
Use "{#-" and "-#}" to mark comments in jinja template

Signed-off-by: Stephen Sun <stephens@nvidia.com>
2023-09-03 14:32:27 +08:00
mssonicbld
07b5677095
[Nokia][DeviceData] Update the Nokia platform IXR-7250E device data (#16028) (#16380)
2023-09-02 07:32:54 +08:00
mssonicbld
26e1d59867
[chassis] Chassis DB cleanup when asic comes up (#16213) (#16379)
2023-09-02 07:18:17 +08:00
mssonicbld
fea5bd34bc
chassis-packet: Update arp_update script for FAILED and STALE check (#16311) (#16384)
2023-09-02 07:11:54 +08:00
mssonicbld
f98bdb6eb5
[Nokia-IXR7250E] Modify the platform_ndk.json for Nokia-IXR7250E platform (#16355) (#16383)
2023-09-02 06:54:41 +08:00
mssonicbld
3214850e84
[submodule] Update submodule sonic-swss to the latest HEAD automatically (#16335)
#### Why I did it
src/sonic-swss
```
* 16817324 - (HEAD -> 202211, origin/202211) [mux]: Fix UTs segmentation fault (#2760) (12 hours ago) [Nazarii Hnydyn]
* 0fa5d880 - [orchagent]: Handle additional SAI error conditions gracefully (#2755) (2 days ago) [prabhataravind]
* 3726aebc - [mux]: Implement rollback for failed mux switchovers (#2714) (2 days ago) [Lawrence Lee]
* a8e50e7d - [portsorch]: Set default hostif TX queue (#2697) (2 days ago) [prabhataravind]
* 0689d656 - Add missing parameter to on_switch_shutdown_request method. (#2567) (2 days ago) [Hua Liu]
```
#### How I did it
#### How to verify it
#### Description for the changelog
2023-09-01 16:32:48 +08:00
Liping Xu
ddb4ce1040 update rsyslog log size conf (#15821)
Why I did it
For devices whose log folder size is larger than 200M (for example, 256M), LOG_FILE_ROTATE_SIZE_KB would be 16M, and
THRESHOLD_KB=$((USABLE_SPACE_KB - (NUM_LOGS_TO_ROTATE * LOG_FILE_ROTATE_SIZE_KB * 2)))
= $(( ((VAR_LOG_SIZE_KB * 90 / 100) - RESERVED_SPACE_KB) - (NUM_LOGS_TO_ROTATE * LOG_FILE_ROTATE_SIZE_KB * 2) ))
= $(( ((256M * 90 / 100) - 4096) - (8 * 16M * 2) ))
so the result would be a negative value.

Work item tracking
Microsoft ADO (number only):
24524827
How I did it
Add a case for 400M: if the log folder size is between 200M and 400M, set the log file size to 2M.
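
The arithmetic in the commit message can be checked with a short script. This is a sketch that mirrors the shell expression using integer division; it assumes "M" means 1024 KB and takes NUM_LOGS_TO_ROTATE=8 and RESERVED_SPACE_KB=4096 from the worked example above. The function name `threshold_kb` is hypothetical.

```python
def threshold_kb(var_log_size_kb: int,
                 log_file_rotate_size_kb: int,
                 num_logs_to_rotate: int = 8,
                 reserved_space_kb: int = 4096) -> int:
    """Mirror the THRESHOLD_KB shell arithmetic using integer division."""
    usable_space_kb = var_log_size_kb * 90 // 100 - reserved_space_kb
    return usable_space_kb - num_logs_to_rotate * log_file_rotate_size_kb * 2


MB = 1024  # KB per "M" (assumption)

# 256M log folder with the 16M rotate size: the threshold is negative.
print(threshold_kb(256 * MB, 16 * MB))  # -30311

# Same folder with the 2M rotate size from the fix: the threshold is positive.
print(threshold_kb(256 * MB, 2 * MB))   # 199065
```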

How to verify it
Run "sudo logrotate -f /etc/logrotate.conf" on a DUT whose var/log folder size is 256M, and check the syslog.
2023-09-01 14:33:30 +08:00
Xichen96
07f0a8ac7c add processor.max_cstate=0 to intel cpu cmdline (#16339)
Why I did it
This is a fix for PR sonic-net/sonic-buildimage#6051, "[kernel] Change grub cmdline to set c-states to 0 for "Intel" CPUs" by shlomibitton.

The original PR disables the intel_idle driver, but it cannot limit the max C-state to 1 because the system falls back to the acpi_idle driver.

Currently intel_idle.max_cstate=0 is already present, which disables the intel_idle driver. With the added option, the common idle driver is disabled as well, so there is no idle management. This prevents a bug that can be triggered by the idle instruction on Intel platforms.

How I did it
Add the option to the installer file alongside intel_idle.max_cstate=0
2023-09-01 04:32:42 +08:00
Zhijian Li
acad2e6cee [YANG SONIC-ACL] Fix Yang definition of IN_PORTS and OUT_PORTS (#16220)
How I did it
Update Yang definition of IN_PORTS and OUT_PORTS to string.
Since we cannot split the string on commas (,) and validate that each substring is a valid SONiC port name, the only restriction is that the value must be a string.

How to verify it
Verified by building sonic_yang_models-1.0-py3-none-any.whl. While building the target package, unit tests were run and passed.
Built a SONiC image based on the 202205 branch and installed it on a physical DUT. Retried the steps in "[Yang] Incorrect definition of IN_PORTS and OUT_PORTS in sonic-acl.yang" (#16190) and got a success response.
2023-08-31 16:33:40 +08:00
Aravind Mani
b3979d6da1 Dell S6100 Platform API 2.0 fixes (#16208)
Why I did it
Dell S6100 Platform components need to be updated.

How I did it
Modified platform.json to fix the issue.

How to verify it
Run sonic-mgmt component test and check whether it passes.
2023-08-31 16:33:34 +08:00
judyjoseph
38c29293eb sudo not required explicitly as /bin/ip netns identify is part of READ_ONLY_CMDS in sudoers file (#16115)
Why I did it
A few commands on multi-asic platforms, when run with the "sudo ip netns exec asic0" prefix, were taking around 15 minutes to return output. This behavior of sudo hanging was seen by just doing this:

jujoseph@svcstr-server-2:~ sudo ip netns exec asic0 bash
jujoseph@svcstr-server-2:~ sudo ls

Ideally sudo is not needed, as /bin/ip netns identify is present in the /etc/sudoers file. Hence removing it.
2023-08-31 16:33:30 +08:00
xumia
fa44b44dd9 Change the docker image from alpine to debian in Makefile (#15132)
Why I did it
For security and consistency considerations, change the Docker image from Alpine to Debian in the Makefile.

Work item tracking
Microsoft ADO (number only): 23077660
How I did it
Change the Docker image from Alpine to Debian in the Makefile.
2023-08-31 16:33:23 +08:00
xumia
ff20b785e0 Upgrade sonic-fips packages (#15400)
Why I did it
Downgrade the SymCrypt version: use SymCrypt v103.0.1 for certification.

Work item tracking
Microsoft ADO (number only): 24222567
How I did it
How to verify it
2023-08-31 16:33:19 +08:00
Vivek
181eb02865 Run db_migrator for non first-time reboots (#16116)
- Why I did it
The recent change #15685 (comment) removed the db migration for non-first reboots.
This is problematic for many deployments that don't rely on ZTP and push a custom config_db.json.
Port to older branches after #15685 is ported back.

- How I did it
Re-introduce the logic to run the db_migrator on non-first boots

- How to verify it
Verified reboot and warm-reboot cases

Signed-off-by: Vivek Reddy Karri <vkarri@nvidia.com>
2023-08-31 14:33:00 +08:00
Zhijian Li
4310b484c1 Fix openconfig_acl.py (#16303)
How I did it
Fix the regex for L4 port range in openconfig_acl.py.

How to verify it
Built an image and installed it on an Arista-720DT DUT, then tried the repro steps in #16189 and confirmed the ACL rule is set up correctly.
2023-08-31 06:32:40 +08:00
mssonicbld
80c19c2874
[submodule] Update submodule sonic-sairedis to the latest HEAD automatically (#16291)
#### Why I did it
src/sonic-sairedis
```
* 2ebbd48 - (HEAD -> 202211, origin/202211) [syncd] Add pre match logic for acl entry (#1240) (11 hours ago) [Kamil Cudnik]
* 1db8726 - Use SAI_STATUS_ITEM_NOT_FOUND when key not found (#1224) (11 hours ago) [Lawrence Lee]
* 9e4071b - [CI]: Fix collect log error in azp template. (#1282) (4 days ago) [Nazarii Hnydyn]
```
#### How I did it
#### How to verify it
#### Description for the changelog
2023-08-30 16:32:59 +08:00
mssonicbld
decbc0d39f
[submodule] Update submodule sonic-linux-kernel to the latest HEAD automatically (#16330)
#### Why I did it
src/sonic-linux-kernel
```
* 10d7946 - (HEAD -> 202211, origin/202211) PATCH] net: allow user to set metric on default route learned via Router Advertisement (#326) (8 hours ago) [abdosi]
```
#### How I did it
#### How to verify it
#### Description for the changelog
2023-08-30 16:32:51 +08:00
Nazarii Hnydyn
99e1ce9987
[202211][PPI]: Enable global port late create for SPC-4. (#15801)
DEPENDS:

[202211][ppi]: Implement port bulk comparison logic (#2564)  sonic-swss#2821
HLD: sonic-net/SONiC#1084

Why I did it
Enabled port late create on SN5600: the switch boots up with no ports
Work item tracking
N/A
How I did it
Updated SAI xml config file
How to verify it
Run sonic-mgmt tests fastboot
2023-08-30 16:05:58 +08:00
Kebo Liu
ba82b52a1a
[Mellanox] Update MFT to newer version 4.25.0-62 (#16149) (#16203)
- Why I did it
Update Mellanox MFT tool to version 4.25.0-62

- How I did it
Update the MFT tool make file

- How to verify it
Run full sonic-mgmt regression.

Signed-off-by: Kebo Liu <kebol@nvidia.com>
2023-08-30 16:04:07 +08:00
mssonicbld
207d0c38ca
[submodule] Update submodule sonic-platform-common to the latest HEAD automatically (#16332)
#### Why I did it
src/sonic-platform-common
```
* 05cf5c1 - (HEAD -> 202211, origin/202211) Change Y cable simulator log level from error to warning due to false alarm (11 hours ago) [ShiyanWangMS]
* 35ea290 - Update CMIS api's rendering max-duration (#375) (11 hours ago) [rajann]
* 33bd498 - Retrieve FW version using CDB command for CMIS transceivers + handle single bank FW versioning (#372) (11 hours ago) [mihirpat1]
* 2434362 - Render Media lane and Media assignment options info from Application Code (#368) (11 hours ago) [rajann]
* 862674b - Modify sfputil show fwversion to include build version for active/inactive FW version fields (#367) (11 hours ago) [mihirpat1]
* 8edfece - Adding electrical for 800G and 100G (#365) (11 hours ago) [mihirpat1]
* 8a1debf - SFF-8472: Fix tx_disable_channel to avoid write to read-only bit (#364) (11 hours ago) [mihirpat1]
* 223a231 - Update host electrical interface for 2x400G breakout cable (#363) (11 hours ago) [mihirpat1]
* baabd8f - fix get module hardware minor revision (#361) (11 hours ago) [Qingxiao Ren]
* 2ebabf5 - Prevent VDM dictionary related KeyError when a transceiver module is pulled while a bulk get method is interrogating said module (#360) (11 hours ago) [snider-nokia]
* 1498ed6 - [CMIS] Add API to get module power up duration (#354) (11 hours ago) [ChiouRung Haung]
* 1cae718 - Modify get_host_lane_assignment_option to return value based on application id (#352) (11 hours ago) [mihirpat1]
```
#### How I did it
#### How to verify it
#### Description for the changelog
2023-08-30 14:33:05 +08:00
mssonicbld
4c82749c2c
[submodule] Update submodule sonic-utilities to the latest HEAD automatically (#16336)
2023-08-30 13:57:20 +08:00
mssonicbld
a25f722dea
[submodule] Update submodule sonic-platform-daemons to the latest HEAD automatically (#16334)
2023-08-30 13:57:15 +08:00
Ye Jianquan
b2fbfbe7c3
[PRTest, 202211] Skip telemetry/test_telemetry.py::test_on_change_updates (#16314)
* [PRTest, 202211] Skip telemetry/test_telemetry.py::test_on_change_updates

* exclude test_warm_reboot
2023-08-30 10:43:57 +08:00
Junchao-Mellanox
f73d322081
Fix issue: watchdogutil command does not work (#16242)
Conflicts:
	platform/mellanox/mlnx-platform-api/sonic_platform/watchdog.py
	platform/mellanox/mlnx-platform-api/tests/test_watchdog.py
2023-08-28 23:58:15 +08:00
Vaibhav Hemant Dixit
e7ce179b73 Fix CONFIG_DB_INITIALIZED flag check logic and set/reset flag for warmboot (#15685)
* Fix CONFIG_DB_INITIALIZED flag check logic and set/reset flag for warm-reboot
* Fix db-cli usage
* Handle same image warm-reboot and generalize handling of INIT flag
* Cover boot from ONIE case: set config init flag when minigraph, config_db are missing
* Handle case: first boot of SONiC
* Check for config init flag
* Simplify logic, and do not call db_migrator for same image reboot
2023-08-25 02:32:24 +08:00