Commit Graph

8397 Commits

Author SHA1 Message Date
Longxiang Lyu
3f29b28b36 [dualtor] Disable zebra link-detect for vlan interfaces (#17784)
* [dualtor] Disable zebra link-detect for vlan interfaces

Signed-off-by: Longxiang Lyu <lolv@microsoft.com>
2024-01-19 06:33:12 +08:00
Junchao-Mellanox
0fbdc2b8ed [Mellanox] wait until hw-management watchdog files ready (#17618)
- Why I did it
The watchdog-control service always disarms the watchdog during the system startup stage. The watchdog may not be fully initialized when the watchdog-control service accesses it. This PR adds a wait to make sure the watchdog has been fully initialized.

- How I did it
Add a wait to make sure the watchdog has been fully initialized.

- How to verify it
Manual test
sonic regression
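
The wait described above can be sketched in Python as a poll with a timeout. This is only an illustration, not the code from the PR: the function name, the default timeout and the polling interval are assumptions.

```python
import os
import time


def wait_for_paths(paths, timeout=30.0, interval=0.5):
    """Poll until every path exists or the timeout expires.

    Returns True if all paths appeared in time, False otherwise.
    The defaults are illustrative, not values taken from the PR.
    """
    deadline = time.monotonic() + timeout
    while True:
        if all(os.path.exists(p) for p in paths):
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)
```

A caller would pass the watchdog files it depends on (the exact paths are not given in the commit message) and only touch the watchdog when the function returns True.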
2024-01-19 04:32:53 +08:00
Saikrishna Arcot
9d918889e4 dhcrelay: Don't look up the ifindex for the fallback interface (#17797)
Currently, whenever isc-dhcp-relay forwards a packet upstream,
internally, it will try to send it on a "fallback" interface. My
understanding is that this isn't meant to be a real interface, but
instead is basically saying to use Linux's regular routing stack to
route the packet appropriately (rather than having isc-dhcp-relay
specify specifically which interface to use).

The problem is that on systems with a weak CPU, a large number of
interfaces, and many upstream servers specified, this can introduce a
noticeable delay in packets getting sent. The delay comes from trying to
get the ifindex of the fallback interface. In one test case, it got to
the point that only 2 packets could be processed per second. Because of
this, dhcrelay will easily get backlogged and likely get to a point
where packets get dropped in the kernel.

Fix this by adding a check saying if we're using the fallback interface,
then don't try to get the ifindex of this interface. We're never going
to have an interface named this in SONiC.
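
The actual fix is in the isc-dhcp-relay C sources and is not shown here. As a rough Python illustration of the same idea, the lookup can be short-circuited for the pseudo-interface. The name `resolve_ifindex`, the constant `FALLBACK_NAME` and the use of 0 as "no index" are assumptions made for this sketch.

```python
import socket

# Assumed name of the pseudo-interface; no real SONiC interface uses it.
FALLBACK_NAME = "fallback"


def resolve_ifindex(ifname, lookup=socket.if_nametoindex):
    """Return the ifindex for ifname, or 0 for the fallback pseudo-interface."""
    if ifname == FALLBACK_NAME:
        # Skip the costly lookup: the kernel routing stack picks the interface.
        return 0
    return lookup(ifname)
```

The `lookup` parameter exists only so the sketch can be exercised without real interfaces.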

Signed-off-by: Saikrishna Arcot <sarcot@microsoft.com>
2024-01-18 16:33:48 +08:00
Nazarii Hnydyn
1687a442de [frr]: Force disable next hop group support. (#17344)
Signed-off-by: Nazarii Hnydyn <nazariig@nvidia.com>

Closes #17345

This W/A was proposed by Nvidia FRR team before the long term solution is ready.

Why I did it
A W/A to fix default route installation during LAG member flap
Work item tracking
N/A
How I did it
Disabled FRR next hop group support
How to verify it
Do LAG member flap
2024-01-18 14:36:23 +08:00
Junchao-Mellanox
56ba5b10b4 [Mellanox] implement sfp.reset for CMIS management (#16862)
- Why I did it
CMIS host management modules need a different implementation of sfp.reset. This PR implements it.

- How I did it
For SW control modules, do reset from hw_reset
For FW control modules, do reset as the original way
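
The dispatch described above might look roughly like the following. `SfpSketch` and its constructor arguments are hypothetical stand-ins; the real platform class and its hw_reset handling are not shown in the commit message.

```python
class SfpSketch:
    """Hypothetical stand-in for a platform SFP object."""

    def __init__(self, software_controlled, hw_reset, fw_reset):
        self.software_controlled = software_controlled
        self._hw_reset = hw_reset  # callable used for SW control modules
        self._fw_reset = fw_reset  # callable for the original FW control path

    def reset(self):
        # SW control modules reset through hw_reset, FW control modules
        # keep the original behaviour.
        if self.software_controlled:
            return self._hw_reset()
        return self._fw_reset()
```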

- How to verify it
Manual test
sonic-mgmt platform test
2024-01-17 06:33:11 +08:00
Kebo Liu
cacf46ff86
[202311][Mellanox] Integrate HW-MGMT Version 7.0030.2008 (#17659)
* Integrate HW-MGMT 7.0030.2008 Changes

 ## Patch List
* 0285-UBUNTU-SAUCE-mlxbf-gige-Fix-intermittent-no-ip-issue.patch :
* 0286-pinctrl-Introduce-struct-pinfunction-and-PINCTRL_PIN.patch :
* 0287-pinctrl-mlxbf3-Add-pinctrl-driver-support.patch :
* 0288-UBUNTU-SAUCE-gpio-mmio-handle-ngpios-properly-in-bgp.patch :
* 0289-UBUNTU-SAUCE-gpio-mlxbf3-Add-gpio-driver-support.patch :
* 0291-mlxsw-core_hwmon-Align-modules-label-name-assignment.patch :
* 0292-mlxsw-i2c-Limit-single-transaction-buffer-size.patch :
* 0293-mlxsw-reg-Limit-MTBR-register-records-buffer-by-one-.patch :
* 0296-UBUNTU-SAUCE-mmc-sdhci-of-dwcmshc-Add-runtime-PM-ope.patch :
* 0298-UBUNTU-SAUCE-mlxbf-ptm-use-0444-instead-of-S_IRUGO.patch :
* 0299-UBUNTU-SAUCE-mlxbf-ptm-add-atx-debugfs-nodes.patch :
* 0300-UBUNTU-SAUCE-mlxbf-ptm-update-module-version.patch :
* 0301-UBUNTU-SAUCE-mlxbf-gige-Fix-kernel-panic-at-shutdown.patch :
* 0302-UBUNTU-SAUCE-mlxbf-bootctl-support-SMC-call-for-sett.patch :
* 0303-UBUNTU-SAUCE-Add-BF3-related-ACPI-config-and-Ring-de.patch :
* 0306-dt-bindings-trivial-devices-Add-infineon-xdpe1a2g7.patch :
* 0307-leds-mlxreg-Add-support-for-new-flavour-of-capabilit.patch :
* 0308-leds-mlxreg-Remove-code-for-amber-LED-colour.patch :
* 0308-platform_data-mlxreg-Add-capability-bit-and-mask-fie.patch :
* 0309-hwmon-mlxreg-fan-Add-support-for-new-flavour-of-capa.patch :
* 0310-hwmon-mlxreg-fan-Extend-number-of-supporetd-fans.patch :
* 0317-platform-mellanox-Introduce-support-for-switches-equ.patch :
* 0318-mellanox-Relocate-mlx-platform-driver.patch :
* 0319-UBUNTU-SAUCE-mlxbf-tmfifo-fix-potential-race.patch :
* 0320-UBUNTU-SAUCE-mlxbf-tmfifo-Drop-the-Rx-packet-if-no-m.patch :
* 0321-UBUNTU-SAUCE-mlxbf-tmfifo-Drop-jumbo-frames.patch :
* 0322-UBUNTU-SAUCE-mlxbf-tmfifo.c-Amend-previous-tmfifo-pa.patch :
* 0323-mlxbf_gige-add-set_link_ksettings-ethtool-callback.patch :
* 0324-mlxbf_gige-fix-white-space-in-mlxbf_gige_eth_ioctl.patch :
* 0325-UBUNTU-SAUCE-mlxbf-bootctl-Fix-kernel-panic-due-to-b.patch :
* 0326-platform-mellanox-mlxreg-hotplug-Add-support-for-new.patch :
* 0327-platform-mellanox-mlx-platform-Change-register-name.patch :
* 0328-platform-mellanox-mlx-platform-Add-support-for-new-X.patch :

* [Mellanox] Don't populate arm64 Kconfig when integrating hw-mgmt

Signed-off-by: Vivek Reddy <vkarri@nvidia.com>

* [Mellanox] Remove thermal zone related code and replace with new one

* Revert "Revert "[Mellanox] Align PSU temperature sysfs node name with hw-management change (#16820)" (#16956)"

This reverts commit c2edc6f9d5.

* Update copyright header

Signed-off-by: Kebo Liu <kebol@nvidia.com>

---------

Signed-off-by: Vivek Reddy <vkarri@nvidia.com>
Signed-off-by: Kebo Liu <kebol@nvidia.com>
Co-authored-by: Vivek Reddy <vkarri@nvidia.com>
Co-authored-by: Junchao-Mellanox <junchao@nvidia.com>
Co-authored-by: Junchao-Mellanox <57339448+Junchao-Mellanox@users.noreply.github.com>
2024-01-16 08:33:50 -08:00
Junchao-Mellanox
0b511986ae
[202311][Mellanox] implement platform wait in python code (#17398) (#17719)
- Why I did it
New implementation of Nvidia platform_wait due to:
1. sysfs deprecated by hw-mgmt
2. new dependencies to SDK
3. For CMIS host management mode

- How I did it
wait hw-management ready
wait SDK sysfs nodes ready

- How to verify it
manual test
unit test
sonic-mgmt regression
2024-01-16 08:31:33 -08:00
mssonicbld
c51bfd2ee2
[submodule] Update submodule sonic-linux-kernel to the latest HEAD automatically (#17772)
#### Why I did it
src/sonic-linux-kernel
```
* 46db038 - (HEAD -> 202311, origin/202311) Intgerate HW-MGMT 7.0030.2008 Changes (#361) (#372) (9 hours ago) [Kebo Liu]
```
#### How I did it
#### How to verify it
#### Description for the changelog
2024-01-13 16:32:41 +08:00
mssonicbld
87b4dc8899
[submodule] Update submodule dhcpmon to the latest HEAD automatically (#17750)
#### Why I did it
src/dhcpmon
```
* 2443073 - (HEAD -> 202311, origin/202311) [counter] Clear counter table when dhcpmon init (#14) (#16) (2 days ago) [Yaqiang Zhu]
```
#### How I did it
#### How to verify it
#### Description for the changelog
2024-01-12 16:34:42 +08:00
mssonicbld
5886145160
[submodule] Update submodule sonic-utilities to the latest HEAD automatically (#17752)
#### Why I did it
src/sonic-utilities
```
* 72b6c04c - (HEAD -> 202311, origin/202311) Support disable/enable syslog rate limit feature (#3072) (2 days ago) [Junchao-Mellanox]
```
#### How I did it
#### How to verify it
#### Description for the changelog
2024-01-12 16:34:38 +08:00
zitingguo-ms
4a26cd81f5 [YANG] Enable Yang model for BGP_BBR config entry (#17622)
Why I did it
Enable Yang model for BGP_BBR config entry.

{
        "BGP_BBR": {
            "all": {
                "status": "enabled"/"disabled"
            }
        }
}
Work item tracking
Microsoft ADO (number only): 25988660
How I did it
Add yang model and ut for BGP_BBR.

How to verify it
Use GCU cmd to change bbr status.
Create following json patch: disable_bbr.json-patch

[
 {
  "op": "replace",
  "path": "/BGP_BBR/all/status",
  "value": "disabled"
 }
]
Run sudo config apply-patch ./disable_bbr.json-patch cmd on dut. Success.
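
For scripted testing, the same patch file could be generated rather than written by hand. This is a small sketch using only the Python standard library; the helper names `build_bbr_patch` and `write_patch` are made up for illustration.

```python
import json


def build_bbr_patch(status):
    """Return a JSON Patch document that sets the BGP_BBR status."""
    if status not in ("enabled", "disabled"):
        raise ValueError(f"unsupported status: {status!r}")
    return [{"op": "replace", "path": "/BGP_BBR/all/status", "value": status}]


def write_patch(path, status):
    """Write the patch to path so it can be passed to config apply-patch."""
    with open(path, "w") as fh:
        json.dump(build_bbr_patch(status), fh, indent=1)
```

Calling `write_patch("disable_bbr.json-patch", "disabled")` would produce a file equivalent to the one shown above.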
2024-01-11 16:34:42 +08:00
Lawrence Lee
bd63fff758 add timeout to ping6 command (#17729)
Signed-off-by: Lawrence Lee <lawlee@microsoft.com>
2024-01-11 10:41:00 +08:00
Junchao-Mellanox
4b6feaa69c Optimize syslog rate limit feature for fast and warm boot (#17458)
- Why I did it
Optimize syslog rate limit feature for fast and warm boot

- How I did it
Optimize redis start time
Don't render rsyslog.conf in container startup script
Disable containercfgd by default. There is a new CLI to enable it (in another PR)

- How to verify it
Manual test
Regression test
2024-01-10 14:35:30 +08:00
snider-nokia
bcdbaf1039 [Nokia][sonic-platform] Update Nokia sonic-platform submodule and device data (#17378)
These changes, in conjunction with NDK version >= 22.9.17 address the thermal logging issues discussed at Nokia-ION/ndk#27. While the changes contained at this PR do not require coupling to NDK version >= 22.9.17, thermal logging enhancements will not be available without updated NDK >= 22.9.17. Thus, coupling with NDK >=22.9.17 is preferred and recommended.

Why I did it
To address thermal logging deficiencies.

Work item tracking
Microsoft ADO (number only): 26365734
How I did it
The following changes are included:

Threshold configuration values are provided in the associated device data .json files. There is also a change included to better handle the condition where an SFP module read fails.

Modify the module.py reboot to support reboot linecard from Supervisor

 - Modify reboot to call _reboot_imm for single IMM card reboot
 - Add log to the ndk_cmd to log the operation of "reboot-linecard" and "shutdown/startup the sfm"
Add new nokia_cmd set command and modify show ndk-status output

 - Add a new function reboot_imm() to nokia_common.py to support reboot a single IMM slot from CPM
 - Added new command: nokia_cmd set reboot-linecard <slot> [force] for CPM
 - Append a new column "RebootStatus" at the end of output of "nokia_cmd show ndk-status"
 - Provide ability for IMM to disable all transceiver module TX at reboot time
 - Remove defunct xcvr-resync service
2024-01-10 12:35:13 +08:00
Marty Y. Lok
88ee9f78c2 [Nokia-IXR7250E] Modify the platform_reboot on the IXR7250E for PMON API reboot and Disable all SFPs (#17483)
Why I did it
When the Supervisor card is rebooted using the PMON API, it takes about 90 seconds to trigger the shutdown in the down path. By that time the linecards are already up. This delays linecard database initialization, which tries to PING/PONG the database-chassis. To address this issue, we modified the NDK to use the system call with "sudo reboot" when the request comes from the PMON API on the Supervisor. This requires NDK version 22.9.20 or greater, and the new NDK requires this modification of platform_reboot to work.

Work item tracking
Microsoft ADO (number only): 26365734
How I did it
Modify platform_reboot on the Supervisor so that it does not reboot all IMMs, since that is already done by the reboot() function in module.py. Also handle reboot-cause.txt on the Supervisor when the reboot is requested from the PMON API.
Modify the Nokia platform-specific platform_reboot on the linecard to disable all SFPs.
This PR works with NDK version 22.9.20 and above.

Signed-off-by: mlok <marty.lok@nokia.com>
2024-01-10 12:35:10 +08:00
Junchao-Mellanox
767944d7da [Mellanox] Fix race condition while creating SFP (#17441)
- Why I did it
Fix an issue where xcvrd crashes because it cannot import name 'initialize_sfp_thermal':

Nov 27 09:47:16.388639 sonic ERR pmon#xcvrd: Exception occured at CmisManagerTask thread due to ImportError("cannot import name 'initialize_sfp_thermal' from partially initialized module 'sonic_platform.thermal' (most likely due to a circular import) (/usr/local/lib/python3.9/dist-packages/sonic_platform/thermal.py)")

- How I did it
Add lock for creating SFP object

- How to verify it
Unit test
Manual Test
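
The locking approach can be illustrated with a minimal sketch. `SfpRegistry` and its `factory` argument are hypothetical; the real change lives in the Mellanox platform code and may be structured differently.

```python
import threading


class SfpRegistry:
    """Hypothetical container that creates SFP objects lazily and safely."""

    def __init__(self, factory):
        self._factory = factory
        self._lock = threading.Lock()
        self._sfps = {}

    def get(self, index):
        # Hold the lock across check-and-create so that only one thread
        # ever constructs the object for a given index.
        with self._lock:
            if index not in self._sfps:
                self._sfps[index] = self._factory(index)
            return self._sfps[index]
```

Serializing creation this way means two threads cannot both be inside the constructor (and its imports) for the same object at once.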
2024-01-09 14:34:47 +08:00
Junchao-Mellanox
8de7cb5988
[202311] [Mellanox] update asic and module temperature in a thread for CMIS management (#16955) (#17699)
- Why I did it
When a module is entirely under software control, the driver cannot get the module temperature or temperature thresholds from firmware. In this case, SONiC needs to read them from the EEPROM. This PR creates a thermal updater thread that updates the module temperature and temperature thresholds while software control is enabled.

- How I did it
Query ASIC temperature from SDK sysfs and update hw-management-tc periodically
Query Module temperature from EEPROM and update hw-management-tc periodically

- How to verify it
Manual test
New Unit tests
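
A periodic updater thread of the kind described above could be sketched as follows. The class name, the callables and the interval are assumptions; the real thermal updater reads from SDK sysfs and module EEPROM, which is not modelled here.

```python
import threading


class PeriodicUpdater:
    """Hypothetical worker that reads a value and publishes it periodically."""

    def __init__(self, interval, read_temperature, publish):
        self._interval = interval
        self._read = read_temperature  # e.g. a sysfs or EEPROM reader
        self._publish = publish        # e.g. a writer for the thermal control files
        self._stop = threading.Event()
        self._thread = threading.Thread(target=self._run, daemon=True)

    def start(self):
        self._thread.start()

    def stop(self):
        self._stop.set()
        self._thread.join()

    def _run(self):
        while not self._stop.is_set():
            self._publish(self._read())
            # wait() returns early when stop() is called.
            self._stop.wait(self._interval)
```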
2024-01-08 10:50:59 -08:00
mssonicbld
4060f5ce5b
[Mellanox] Remove EEPROM write limitation if it is software control (#17030) (#17694)
2024-01-07 13:16:25 +08:00
mssonicbld
fb7bad2d11
[Mellanox] Implement low power mode for cmis host management (#17159) (#17693)
2024-01-06 07:55:41 +08:00
Junchao-Mellanox
7368df7839
[Mellanox] Enable CMIS host management (#16846) (#17684)
- Why I did it
Enable CMIS host management for Mellanox devices which are expected to support the feature

- How I did it
Add a new thread in a new file and change the platform logic in chassis.py, which calls this thread from get_change_event().
The thread in the new file handles the state machine per port.
Static detection takes place first, once the thread is up (during the switch bootup sequence), until the final decision on whether the module is under FW control or SW control.
After that, dynamic detection takes place, listening for changes in the sysfs fds per port,
so that cable plug-in and plug-out events can be detected.

- How to verify it
Enhanced unit tests
run sonic mgmt on Nvidia SN4700 with CMIS host management enabled

Co-authored-by: dbarashinvd <105214075+dbarashinvd@users.noreply.github.com>
2024-01-05 12:07:30 -08:00
mssonicbld
aafbf5bdc6
Update Dockerfile.j2 (#17663) (#17682)
2024-01-05 06:22:58 +08:00
mssonicbld
ac4f6fcbc2
[docker_image_ctl.j2]: swss docker initialization improvements (#17628) (#17680)
2024-01-05 04:39:16 +08:00
mssonicbld
c5473c1d8b
Update backend_acl.py to specify ACL table name (#17553) (#17668)
2024-01-04 10:45:26 +08:00
Junchao-Mellanox
6d43d2f636 [Mellanox] Provide default implementation for sfp error description when CMIS host management is enabled (#17294)
- Why I did it
Provide a dummy implementation for SFP error description when CMIS host management is enabled. A future feature shall be raised to implement SFP error description for such mode.

- How I did it
if SFP is under software control, provide "Not supported" as error description
if SFP is under initialization, provide "Initializing" as error description

- How to verify it
unit test
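
The two rules above can be read as a small decision function. This sketch assumes the initialization check takes precedence and that a firmware lookup is used otherwise; neither detail is stated in the commit message, and the function name is made up.

```python
def error_description(software_controlled, initializing, firmware_lookup):
    """Return an SFP error description under the rules listed above."""
    if initializing:
        # Assumption: the initialization state is checked first.
        return "Initializing"
    if software_controlled:
        return "Not supported"
    # FW control modules keep whatever the existing lookup reports.
    return firmware_lookup()
```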
2024-01-04 10:38:38 +08:00
mssonicbld
48885b6ac9
[image_config]: Update DHCP rate-limit for mgmt TOR devices (#17630) (#17655)
2024-01-03 17:36:12 +08:00
mssonicbld
27c1e9bb42
[dhcp_server] Fix ut issue in test_utils and test_dhcp_cfggen (#17646) (#17651)
2024-01-03 04:40:15 +08:00
Ying Xie
af08f29d4d
[202311][YANG][sonic-utilities] update sonic DB version string format (#17600)
Old format: version_a_b_c
New format: version_<branch>_<nn>
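
To make the two formats concrete, here is a small parser sketch. The regular expressions and the example strings in the usage note are assumptions based only on the two patterns above, not code from db_migrator.

```python
import re

# Old format: version_a_b_c (three numeric fields)
OLD_FORMAT = re.compile(r"^version_(\d+)_(\d+)_(\d+)$")
# New format: version_<branch>_<nn>
NEW_FORMAT = re.compile(r"^version_([A-Za-z0-9]+)_(\d+)$")


def parse_db_version(version):
    """Classify a DB version string and return its parsed fields."""
    match = OLD_FORMAT.match(version)
    if match:
        return ("old", tuple(int(part) for part in match.groups()))
    match = NEW_FORMAT.match(version)
    if match:
        return ("new", (match.group(1), int(match.group(2))))
    raise ValueError(f"unrecognized version string: {version!r}")
```

With these patterns, a hypothetical `version_4_0_3` parses as old format and a hypothetical `version_202311_01` parses as new format with branch `202311`.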

sonic-utilities:
* fba4bf0b 2023-12-21 | [202311][db_migrator] add db migrator version space for 202305/202311 branch (#3082) (HEAD -> 202311, github/202311) [Ying Xie]

Signed-off-by: Ying Xie <ying.xie@microsoft.com>
2023-12-22 11:23:31 -08:00
Ying Xie
16e695b912
[202311] lock down submodule branches (#17597)
Signed-off-by: Ying Xie <ying.xie@microsoft.com>
2023-12-22 08:49:34 -08:00
Nazarii Hnydyn
49e96c3daa
[mellanox]: Disable MFT bash autocompletion. (#17543)
Signed-off-by: Nazarii Hnydyn <nazariig@nvidia.com>
2023-12-21 09:45:42 -08:00
Yevhen Fastiuk
f78cb9c55c
[202311][cherry-pick][NTP] Add NTP extended configuration (#17487)
* Add NTP YANG model

Signed-off-by: Yevhen Fastiuk <yfastiuk@nvidia.com>

* Extend NTP config generation mechanism

Signed-off-by: Yevhen Fastiuk <yfastiuk@nvidia.com>

* Add NTP YANG model tests

Signed-off-by: Yevhen Fastiuk <yfastiuk@nvidia.com>

* Add test for NTP Jinja templates

Signed-off-by: Yevhen Fastiuk <yfastiuk@nvidia.com>

* Add ntpdate package

Signed-off-by: Yevhen Fastiuk <yfastiuk@nvidia.com>

* Fix 'bad' when auth disabled

Signed-off-by: Yevhen Fastiuk <yfastiuk@nvidia.com>

* [NTP] Changed owner for ntp keys config file to root and remove read access for other.

Signed-off-by: Yevhen Fastiuk <yfastiuk@nvidia.com>

* Fix NTP warnings after restarting the service

Signed-off-by: Yevhen Fastiuk <yfastiuk@nvidia.com>

* Add ability to encrypt/decrypt NTP keys

Signed-off-by: Yevhen Fastiuk <yfastiuk@nvidia.com>

* Update Configuration reference

Signed-off-by: Yevhen Fastiuk <yfastiuk@nvidia.com>

* Fix NTP configuration template

* Align the description for setting interface
* Fix the usage of scoped variable

Signed-off-by: Yevhen Fastiuk <yfastiuk@nvidia.com>

* Fix YANG model description and tests

Signed-off-by: Yevhen Fastiuk <yfastiuk@nvidia.com>

* Align NTP test according to fixed condition

Signed-off-by: Yevhen Fastiuk <yfastiuk@nvidia.com>

* Allow eth0 to be as source ifc without defining it

Signed-off-by: Yevhen Fastiuk <yfastiuk@nvidia.com>

* Update sample config with NTP config

Signed-off-by: Yevhen Fastiuk <yfastiuk@nvidia.com>

---------

Signed-off-by: Yevhen Fastiuk <yfastiuk@nvidia.com>
2023-12-21 09:45:29 -08:00
kellyyeh
bd8ed6bc6d
Advance dhcprelay submodule (#17585)
5ae186f Yaqiang Zhu Tue Dec 19 12:05:15 2023 -0500 [counter] Clear counter table when init (#45)
2023-12-20 22:49:23 -08:00
Ying Xie
9e94c3689a
[202311] set sonic release value (#17582)
Why I did it
Each release branch needs to have release number set.

Work item tracking
Microsoft ADO (number only):
How I did it
How to verify it
This PR test.
2023-12-21 13:26:53 +08:00
Ze Gan
e28b48b842
[202311][submodule]: Update submodule sonic-swss/sonic-dash-api/protobuf (#17521)
* [submodule]: Update submodule sonic-swss/sonic-dash-api/protobuf (#17413)

1. Protobuf 3.21 has been released in the Debian bookworm
2. Update submodule sonic-swss and sonic-dash-api because they include related updates.

- Microsoft ADO **(number only)**:

1. In protobuf.mk, skip compiling the protobuf package if the distribution isn't bullseye
2. Move sonic-swss commits:
```
fd852084 (HEAD, origin/master, origin/HEAD) [dashrouteorch]: Rename dash route namespace (#2966)
```
3. Move sonic-dash-api and move build chain to its submodule
```
d4448c7 (HEAD, origin/master, origin/HEAD, master) [azp]: Add multi-platform artifacts (#11)
8a5e5cc [debian]: Add debian package (#10)
d96163a [misc]: Add dash utils and its tests (#9)
```

Signed-off-by: Ze Gan <ganze718@gmail.com>
2023-12-20 17:25:23 -08:00
Junhua Zhai
3d7459ccfc
[gbsyncd] Graceful shutdown of syncd process in container gbsyncd (#16812) (#17563)
Fix #16608. Need to gracefully shutdown syncd/gbsyncd individually.
2023-12-20 09:23:14 -08:00
Arun Saravanan Balachandran
9dbb016ad8 [Dell] S6100 - Update EEPROM API serial_number_str to return service tag instead of serial number (#17440)
To modify EEPROM API serial_number_str to return service tag instead of serial number in Dell S6100.
Ref PR: #1239

How I did it
Update EEPROM API serial_number_str to return service tag instead of serial number.

How to verify it
Verify decode-syseeprom -s returns service tag in Dell S6100.
2023-12-15 09:37:01 +08:00
mssonicbld
0cb0891227
[submodule] Update submodule sonic-utilities to the latest HEAD automatically (#17457)
src/sonic-utilities

* 1b1402f5 - (HEAD -> 202311, origin/202311) [hash]: Add ECMP/LAG hash algorithm CLI (#3036) (9 days ago) [Nazarii Hnydyn]
* 71514ea3 - Revert "Run yang validation in unit test (#3025)" (#3055) (9 days ago) [Ying Xie]
* b5daf5d4 - [dhcp_relay] Fix dhcp_relay counter display issue (#3054) (9 days ago) [Yaqiang Zhu]
* b3172505 - [sflow][db_migrator] Egress Sflow support (#3020) (9 days ago) [Rajkumar-Marvell]
* 1e813105 - [wol] Implement wol command line utility (#3048) (3 weeks ago) [Zhijian Li]
* 8ebc56a0 - [sonic_installer]: Improve exception handling: introduce notes. (#3029) (3 weeks ago) [Nazarii Hnydyn]
* 3610ce93 - [sonic-package-manager] Fix YANG validation failure on upgrade when feature has constraints in YANG model on FEATURE table (#2933) (3 weeks ago) [Stepan Blyshchak]
* cfd2dd39 - Add container rsyslog.conf to the sys dump (#3039) (4 weeks ago) [Vivek]
* c4b07828 - Support new platform in generic configuration update (#3038) (4 weeks ago) [Stephen Sun]
* a8d236c8 - [fast-reboot-filter-routes.py] Remove click and improve error reporting (#3030) (4 weeks ago) [Stepan Blyshchak]
* 75199c0f - [sonic-package-manager] insert newline in /etc/sonic/generated_services.conf (#3040) (4 weeks ago) [Stepan Blyshchak]
* cd855698 - [VOQ][saidump] Modify generate_dump: replace save_saidump with save_saidump_by_route_size (#2972) (4 weeks ago) [JunhongMao]
* f1e24ae5 - GCU support for Cisco-8000 features (#3010) (4 weeks ago) [rbpittman]
* 67e1c3dc - Update GCU rsyslog validator (#3012) (4 weeks ago) [jingwenxie]
* 253b7975 - [sonic-package-manager] do not modify config_db.json (#3032) (5 weeks ago) [Stepan Blyshchak]
* 177dd8e8 - [sonic-package-manager] add generated service to /etc/sonic/generated_services.conf (#3037) (5 weeks ago) [Stepan Blyshchak]
* 62fcd77a - Configure NTP according to extended configuration (#2835) (5 weeks ago) [Yevhen Fastiuk]
* ced09404 - [dualtor_neighbor_check] Adjust zero-mac check condition (#3034) (5 weeks ago) [Longxiang Lyu]
* a4eeb698 - [config] config reload should generate sysinfo if missing  (#3031) (6 weeks ago) [jingwenxie]
* e01fc891 - Run yang validation in unit test (#3025) (6 weeks ago) [ganglv]
2023-12-14 13:16:39 -08:00
zitingguo-ms
bd15b77ba9 change branch name (#17267)
Why I did it
Upgrade xgs SAI to 10.1 version.

Work item tracking
Microsoft ADO (number only): 25931321
How I did it
Upgrade xgs SAI version in sai.mk file.

How to verify it
Run full qualification on 7050cx3/7260cx3:

7050cx3:
https://dev.azure.com/mssonic/internal/_build/results?buildId=425450&view=results
https://dev.azure.com/mssonic/internal/_build/results?buildId=425449&view=results
7260cx3: https://elastictest.org/scheduler/testplan/656f2b2b617fb27e41557494?leftSideViewMode=detail&prop=status&order=ascending
2023-12-14 14:36:07 +08:00
mssonicbld
ee75667fd1
[submodule] Update submodule sonic-platform-daemons to the latest HEAD automatically (#17452)
src/sonic-platform-daemons

* 502c0b6 - (HEAD -> 202311, origin/202311) Add Port SI Configuration Per Speed  (#400) (12 days ago) [Tomer Shalvi]
* e2d9f87 - Add dynamic sensor logic for fixed and psu presence/state checking in thermalctld (#401) (2 weeks ago) [Gregory Boudreau]
2023-12-13 17:43:09 -08:00
mssonicbld
093abe423a
[submodule] Update submodule sonic-swss-common to the latest HEAD automatically (#17456)
src/sonic-swss-common

* 8dc6218 - (HEAD -> 202311, origin/202311) Add STATE_TRANSCEIVER_INFO_TABLE_NAME to shcema.h (#824) (2 weeks ago) [noaOrMlnx]
2023-12-13 17:34:35 -08:00
mssonicbld
4ee9a5c368
[submodule] Update submodule sonic-swss to the latest HEAD automatically (#17455)
src/sonic-swss

* d839eec3 - (HEAD -> 202311, origin/202311) Add support for fabric monitor daemon (swss part). (#2920) (11 days ago) [jfeng-arista]
* 8dc0a856 - Add support for new Port SI parameters in PortsOA (#2929) (11 days ago) [Tomer Shalvi]
* 9458b855 - [hash]: Add ECMP/LAG hash algorithm to OA (#2953) (12 days ago) [Nazarii Hnydyn]
* dac3972d - [coppmgrd] Fix Copp processing logic by using Producer del instead of del from Table (13 days ago) [Vivek]
* f6a35e98 - [gcov]: Fix directory prefix issue for (#2969) (13 days ago) [Lawrence Lee]
* 14408ca3 - [Chassis][master][orchagent] : Added test case to verify WRED profile on system ports (#2954) (2 weeks ago) [vmittal-msft]
* 2ca3deb0 - [dash] fix DASH ACL Rule protocol use-after-free (#2958) (3 weeks ago) [Yakiv Huryk]
* b8841ecb - [orchagent]: Extend the SRv6Orch to support the programming of the L3Adj (#2902) (3 weeks ago) [Carmine Scarpitta]
* 194566a7 - Fix the Orchagent Qos error messages reported in Issue #16787 (#2947) (3 weeks ago) [saksarav-nokia]
2023-12-13 15:42:15 -08:00
mssonicbld
d174ad33b7
[submodule] Update submodule sonic-platform-common to the latest HEAD automatically (#17450)
src/sonic-platform-common

* 5d69644 - (HEAD -> 202311, origin/202311) Adding supported vendor PNs for remote CDB FW upgrade (#418) (#419) (5 days ago) [mihirpat1]
* 036b2fc - [Credo][Ycable] Correct the lane mapping in the debugdumpregister function for the 50G cable (#417) (11 days ago) [Xinyu Lin]
* 2efe97e - Fix VDM freeze and unfreeze needed for PM stats collection  (#402) (2 weeks ago) [jaganbal-a]
* cb80f17 - Fix issue: QSFP module with id 0x0d can be parsed using 8636 (#412) (3 weeks ago) [Stephen Sun]
2023-12-13 15:41:55 -08:00
mssonicbld
f215595699
[submodule] Update submodule sonic-sairedis to the latest HEAD automatically (#17454)
src/sonic-sairedis

* 9621316 - (HEAD -> 202311, origin/202311) [syncd] Remove notify pointers manual handling (#1326) (2 weeks ago) [Kamil Cudnik]
* 4ee9c25 - Add TestSwitch missing attribute (#1327) (2 weeks ago) [noaOrMlnx]
* 4cbbeed - Add SAI Notification support for host_tx_ready (#1307) (2 weeks ago) [noaOrMlnx]
* 9804bd7 - Fix compilation issue due to PORT_STATE_CHANGE_QUEUE_SIZE undefined (#1324) (3 weeks ago) [Ashish Singh]
2023-12-13 15:34:35 -08:00
mssonicbld
2e8c2eba14
Revert "[swss/syncd] remove dependency on interfaces-config.service (#13084) (#14341)" (#15094) (#17367) (#17447)
2023-12-09 10:22:55 +08:00
Aravind-Subbaroyan
62429a2328
Update cisco-8000.ini (#17429)
FCS/CRC Errors will only be reported as RX_ERR.
Fix to avoid the mac port related errors.
Fix for sharedResSize testcase failure in QoS-SAI
Fix the issue related to voltage in 'show platform psustatus'.
Support WRED drop for lossy queues.
Fixed an issue where lossy traffic was getting dropped.
Enhancement of SAI logging for errors and interrupts
2023-12-07 17:04:45 -08:00
Ying Xie
6d22649c81
[202311] lock down some sub module branches (#17405)
Signed-off-by: Ying Xie <ying.xie@microsoft.com>
2023-12-04 18:35:14 -08:00
zitingguo-ms
897a023637 Upgrade xgs SAI version to 8.4.31.0 (#17059)
Why I did it
Upgrade the xgs SAI version to 8.4.31.0 to include the following changes:

8.4.22.0: [SDK upgrade][CSP CS00012314723][SAI_BRANCH rel_ocp_sai_8_4] SID:bcmtmPfcDdrScan thread takes 100% CPU utilization
8.4.23.0: [SDK upgrade][CSP CS00012290176][SAI_BRANCH rel_ocp_sai_8_4] SDK-323160: bcm_l3_ecmp_member_add returns Table Full error while ISSU
8.4.24.0:
[SDK upgrade]Merge "[CSP NA][SAI_BRANCH rel_ocp_sai_8_4] SID: Software LinkScan Not Catching Short Local/Remote Fault Events" into hsdk_6.5.27_SAI_8.4.0_GA
[SDK upgrade][CSP NA][SAI_BRANCH rel_ocp_sai_8_4] SID: Software LinkScan Not Catching Short Local/Remote Fault Events
8.4.25.0: [SAI_BRANCH rel_ocp_sai_8_4]CLONE - SAI - 8.4 - _brcm_sai_cosq_stat_get errors for CPU queue 41
8.4.26.0: [CSP CS00012307911] Fixed incorrect CPU related SAI port obj encoding/decoding in most subsystems
8.4.27.0: [CSP CS00012309154] [TD3] SAI_STATUS_INVALID_PARAMETER on setting SAI_BUFFER_POOL_ATTR_SIZE, OA crash
8.4.28.0: [CSP CS00012315552] Excessive logging from _brcm_sai_acl_tbl_grp_mbr_migration
8.4.29.0: [CSP CS00012321369] Fix TH2 regression with MMU/pool size
8.4.30.0: [SDK upgrade][CSP CS00012316299][SAI_BRANCH rel_ocp_sai_8_4] L3 entry delete failed when SER error is present
8.4.31.0: [CSP CS00012307911] Revert and limit scope of previous change due to WB issue.
Work item tracking
Microsoft ADO (number only): 26021230
How I did it
Upgrade the SAI version in sai.mk file.

How to verify it
Run advanced reboot on TH2 and TD3:

https://dev.azure.com/mssonic/internal/_build/results?buildId=422024&view=results
https://dev.azure.com/mssonic/internal/_build/results?buildId=423352&view=results
@saiarcot895 run warm reboot from 202012 to target image and they've passed
TH2: https://dev.azure.com/mssonic/internal/_build/results?buildId=423112&view=logs&j=76acabad-01e9-5c52-6fe6-d396d63e85d2&t=0d14fb40-14d5-50ca-4a23-af1778140cbf
TH: https://dev.azure.com/mssonic/internal/_build/results?buildId=423119&view=logs&j=76acabad-01e9-5c52-6fe6-d396d63e85d2&t=0d14fb40-14d5-50ca-4a23-af1778140cbf
TD3: https://dev.azure.com/mssonic/internal/_build/results?buildId=423074&view=logs&j=76acabad-01e9-5c52-6fe6-d396d63e85d2&t=0d14fb40-14d5-50ca-4a23-af1778140cbf
2023-12-04 22:14:03 +00:00
Kebo Liu
2528b70630 [Mellanox] Add special rsyslog filter for MSN2410 platform (#17365)
- Why I did it
Mellanox MSN2410 platforms have a non-functional error log: "ERR pmon#sensord: Error getting sensor data: dps460/#10: Can't read". This error is caused by a firmware issue with some PSUs, and we are not able to upgrade the FW online. Since there is no functional impact, this error log can be safely ignored.

- How I did it
Add a new rsyslog rule to the rsyslog-container.conf.j2, if the docker name is pmon and the platform name matches, the new rule will be inserted into the docker rsyslogd.conf

- How to verify it
Run regression on the MSN2410 platform to make sure the error log is not printed to the syslog.

Signed-off-by: Kebo Liu <kebol@nvidia.com>
2023-12-04 22:14:03 +00:00
Sudharsan Dhamal Gopalarathnam
8c782c91a4 [FRR]zebra: Fix fpm multipath encap addition (#17247)
Why I did it
To fix the EVPN type5 failure seen in FRR when there are multipaths for nexthop. The type5 routes were queued

show ip route vrf Vrf1
Codes: K - kernel route, C - connected, S - static, R - RIP,
       O - OSPF, I - IS-IS, B - BGP, E - EIGRP, N - NHRP,
       T - Table, v - VNC, V - VNC-Direct, A - Babel, F - PBR,
       f - OpenFabric,
       > - selected route, * - FIB route, q - queued, r - rejected, b - backup
       t - trapped, o - offload failure

VRF Vrf1:
B>q 5.5.5.0/24 [200/0] via 30.0.0.2, Vlan100 onlink, weight 1, 00:00:40
  q                    via 40.0.0.3, Vlan100 onlink, weight 1, 00:00:40
C>* 10.0.0.0/24 is directly connected, Vlan10, 00:00:43
B>q 100.0.0.0/24 [200/0] via 30.0.0.2, Vlan100 onlink, weight 1, 00:00:40
  q                      via 40.0.0.3, Vlan100 onlink, weight 1, 00:00:40
Work item tracking
Microsoft ADO (number only):
How I did it
Porting the FRR fix FRRouting/frr#14835

How to verify it
Validated EVPN multipath with the scenario and confirmed it's working.
2023-12-04 22:14:03 +00:00
Dev Ojha
15d9177c14 [Snappi] Update snappi module on sonic-mgmt docker (#17269)
* Update snappi module on Dockerfile.j2

* Update snappi module on Dockerfile.j2

* Update snappi module for py2 and venv
2023-12-04 22:14:03 +00:00
Tomer Shalvi
dccc5bf6cf Media_settings.json Validator Update (#16908)
The format of the media_settings.json file was updated to support the Port SI Per Speed Enhancements. Since media_checker is the validator for the media_settings.json file, it needs to be updated to align with the new format.


How I did it
I added six new SI parameter names introduced as part of the Port SI Per Speed Enhancements. Additionally, I implemented handling for the new hierarchy level (lane_speed_key) in the updated media_settings.json format while maintaining backward compatibility with vendors whose JSON does not support port SI per speed.
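
The backward-compatible handling of the extra hierarchy level can be sketched as below. This is not the media_checker code: the `speed:` key prefix, the function names and the parameter names in the usage note are assumptions for illustration, and the six new SI parameter names are not listed in the commit message, so the allowed set is passed in by the caller.

```python
def collect_param_names(media_entry, lane_speed_prefix="speed:"):
    """Return SI parameter names, looking through an optional lane-speed level."""
    names = set()
    for key, value in media_entry.items():
        if key.startswith(lane_speed_prefix) and isinstance(value, dict):
            # New format: parameters sit one level below the lane speed key.
            names.update(value.keys())
        else:
            # Legacy format: parameters sit directly under the media key.
            names.add(key)
    return names


def unknown_params(media_entry, allowed):
    """Return the sorted parameter names that are not in the allowed set."""
    return sorted(collect_param_names(media_entry) - set(allowed))
```

A legacy entry such as `{"preemphasis": {...}}` and a per-speed entry such as `{"speed:100G": {"idriver": {...}}}` would both be checked against the same allowed set.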

How to verify it
I locally built the Debian package using 'make target/debs/bullseye/sonic-device-data_1.0-1_all.deb,' and it completed successfully. Jenkins also built the entire image, which includes the media_checker as part of its process.
2023-12-04 22:14:03 +00:00