Commit Graph

7913 Commits

Author SHA1 Message Date
Zhijian Li
8dc43a6c72
[202305][Celestica-E1031] Enable CPU watchdog (#17415)
* [Celestica-E1031] Enable CPU watchdog (#16083)

Enable CPU watchdog on Celestica-E1031.

* Add info syslog for cpu_wdt.service (#16678)

Why I did it
Add an info-level syslog message for cpu_wdt.service when the watchdog arm action is triggered.

How I did it
Add an info-level syslog message for cpu_wdt.service when the watchdog arm action is triggered.
2023-12-26 15:46:00 +08:00
mssonicbld
79ddd800f2
[submodule] Update submodule sonic-utilities to the latest HEAD automatically (#17619) 2023-12-26 15:09:51 +08:00
anamehra
98e3dd131e
Update cisco-8000.ini (#17607)
Why I did it
Release notes for Cisco 8111-32EH-O, 8102-64H-O and 8101-32FH-O:
• Fixed a bug in PFC-WD where watchdog is triggered too often when sparse traffic is present, failing to detect the traffic traversal - (SR 696617830)
• Resolved an issue where SAI_STATUS_ITEM_NOT_FOUND error was seen while adding LAG members - (MIGSMSFT-354)
• Fixed Thermal API related error message (MIGSMSFT-354)
• Fixed an issue related to default config trap - (MIGSMSFT-354)
• Changed the message log level from error to debug in situations when the HW offloaded session is not found or was never created for the packet received. (MIGSMSFT-354)
• Fixed an issue where drop option was not working when encap and decap IPinIP tunnels share the same SDK tunnel port.
• Fixed an error while running VRF testcase (MIGSMSFT-354)
• Fixed an issue where BFD packets not egressing using Queue 7
• SAI support for additional FEC related attributes:
· SAI_PORT_ATTR_MAX_FEC_SYMBOL_ERRORS_DETECTABLE
· SAI_PORT_STAT_IF_IN_FEC_CODEWORD_ERRORS_S0
· SAI_PORT_STAT_IF_IN_FEC_CODEWORD_ERRORS_S16

Work item tracking
Microsoft ADO (number only):
2023-12-25 09:45:05 +08:00
mssonicbld
187bdaf190
[submodule] Update submodule sonic-utilities to the latest HEAD automatically (#17595) 2023-12-22 12:13:32 -08:00
Ying Xie
415c0b7de2
[202305][yang][sonic-utilities] update sonic DB version pattern (#17602)
Supports:
- old version: version_a_b_c
- new version: version_<branch>_<nn>
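
The two accepted formats can be illustrated with a small validator. This is only a sketch under stated assumptions: the regular expressions below are hypothetical and are not copied from the YANG model. They assume that `a`, `b` and `c` are numeric, that `<branch>` is alphanumeric, and that `<nn>` is exactly two digits.

```python
import re

# Hypothetical patterns (assumptions, not the actual YANG pattern):
# old style: version_a_b_c, with three numeric fields
# new style: version_<branch>_<nn>, with an alphanumeric branch and a two-digit counter
OLD_STYLE = re.compile(r"version_\d+_\d+_\d+")
NEW_STYLE = re.compile(r"version_[A-Za-z0-9]+_\d{2}")


def is_valid_db_version(value: str) -> bool:
    """Return True if value matches either assumed version format."""
    return bool(OLD_STYLE.fullmatch(value) or NEW_STYLE.fullmatch(value))
```

Under these assumptions, `version_4_0_3` and `version_202305_01` are accepted, while `version_202305` is rejected because it lacks the trailing counter.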

sonic-utilities:
* b0908bd7 2023-12-21 | [202305][db_migrator] add db migrator version space for 202305/202311 branch (#3084) (HEAD -> 202305, github/202305) [Ying Xie]
* 8f343ebb 2023-08-16 | [GCU] Add PORT table StateDB Validator (#2936) [isabelmsft]

Signed-off-by: Ying Xie <ying.xie@microsoft.com>
2023-12-22 11:23:48 -08:00
mssonicbld
40488d3797
[submodule] Update submodule sonic-swss to the latest HEAD automatically (#17593)
#### Why I did it
src/sonic-swss
```
* 5643db9a - (HEAD -> 202305, origin/202305) [muxorch] Fixing cache bug in updateRoute logic (#2982) (6 hours ago) [Nikola Dancejic]
```
#### How I did it
#### How to verify it
#### Description for the changelog
2023-12-22 06:32:20 +08:00
mssonicbld
223f048b0a
[submodule] Update submodule sonic-utilities to the latest HEAD automatically (#17590) 2023-12-21 15:55:02 +08:00
mssonicbld
833bb4ae61
[submodule] Update submodule sonic-swss to the latest HEAD automatically (#17589) 2023-12-21 15:37:22 +08:00
mssonicbld
cb8886a212
[submodule] Update submodule sonic-host-services to the latest HEAD automatically (#17587) 2023-12-21 15:10:18 +08:00
kellyyeh
58dcf7f6fc
[202305] Advance dhcprelay and dhcpmon submodule (#17584)
sonic-dhcp-relay
5ae186f Yaqiang Zhu Tue Dec 19 12:05:15 2023 -0500 [counter] Clear counter table when init (#45)
40c6877 Jing Zhang Fri Nov 10 12:41:23 2023 -0800 [CodeQL] fix unmet dependency for build-swss-common (#44)

sonic-dhcpmon
7c55e50 StormLiangMS Thu Sep 14 09:57:06 2023 +0800 Merge pull request #13 from jcaiMR/dev/jcai_master_interface_counter
085a087 jcaiMR Mon Sep 11 09:17:03 2023 +0000 refine counting logic
2023-12-20 22:48:12 -08:00
mssonicbld
109035e511
[Arista] Use port_config.ini for Arista-7050QX-32S-S4Q31 (#17253) (#17574) 2023-12-20 19:34:42 +08:00
mssonicbld
bf62b6ac29
[submodule] Update submodule linkmgrd to the latest HEAD automatically (#17554) 2023-12-20 10:49:04 +08:00
mssonicbld
668a3e0b80
[submodule] Update submodule sonic-platform-daemons to the latest HEAD automatically (#17555) 2023-12-20 10:41:23 +08:00
Junchao-Mellanox
9240bcb5b1
[Mellanox] Fix race condition while creating SFP (#17544)
Why I did it
Fix an issue where xcvrd crashes because it cannot import the name 'initialize_sfp_thermal':

Nov 27 09:47:16.388639 sonic ERR pmon#xcvrd: Exception occured at CmisManagerTask thread due to ImportError("cannot import name 'initialize_sfp_thermal' from partially initialized module 'sonic_platform.thermal' (most likely due to a circular import) (/usr/local/lib/python3.9/dist-packages/sonic_platform/thermal.py)")
Nov 27 09:47:16.392544 sonic ERR pmon#xcvrd: Traceback (most recent call last):
Nov 27 09:47:16.392643 sonic ERR pmon#xcvrd:   File "/usr/local/lib/python3.9/dist-packages/xcvrd/xcvrd.py", line 1518, in run
Nov 27 09:47:16.392757 sonic ERR pmon#xcvrd:     self.task_worker()
Nov 27 09:47:16.392757 sonic ERR pmon#xcvrd:   File "/usr/local/lib/python3.9/dist-packages/xcvrd/xcvrd.py", line 1240, in task_worker
Nov 27 09:47:16.392757 sonic ERR pmon#xcvrd:     sfp = platform_chassis.get_sfp(pport)
Nov 27 09:47:16.392793 sonic ERR pmon#xcvrd:   File "/usr/local/lib/python3.9/dist-packages/sonic_platform/chassis.py", line 346, in get_sfp
Nov 27 09:47:16.392830 sonic ERR pmon#xcvrd:     self.initialize_single_sfp(index)
Nov 27 09:47:16.392830 sonic ERR pmon#xcvrd:   File "/usr/local/lib/python3.9/dist-packages/sonic_platform/chassis.py", line 288, in initialize_single_sfp
Nov 27 09:47:16.392830 sonic ERR pmon#xcvrd:     self._sfp_list[index] = sfp_module.SFP(index)
Nov 27 09:47:16.392830 sonic ERR pmon#xcvrd:   File "/usr/local/lib/python3.9/dist-packages/sonic_platform/sfp.py", line 272, in __init__
Nov 27 09:47:16.392866 sonic ERR pmon#xcvrd:     from .thermal import initialize_sfp_thermal
Nov 27 09:47:16.392918 sonic ERR pmon#xcvrd: ImportError: cannot import name 'initialize_sfp_thermal' from partially initialized module 'sonic_platform.thermal' (most likely due to a circular import) (/usr/local/lib/python3.9/dist-packages/sonic_platform/thermal.py)
Nov 27 09:47:16.393103 sonic ERR pmon#xcvrd: Xcvrd: exception found at child thread CmisManagerTask due to ImportError("cannot import name 'initialize_sfp_thermal' from partially initialized module 'sonic_platform.thermal' (most likely due to a circular import) (/usr/local/lib/python3.9/dist-packages/sonic_platform/thermal.py)")
Nov 27 09:47:16.393103 sonic ERR pmon#xcvrd: Exiting main loop as child thread raised exception!
Work item tracking
Microsoft ADO (number only):
How I did it
Add lock for creating SFP object
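
A minimal sketch of the idea, assuming a simplified chassis object: the class, the `_sfp_lock` name and the `_create_sfp` helper are hypothetical stand-ins and are not the actual sonic_platform code. Creation is serialized with a lock so that two threads never construct the same SFP object concurrently.

```python
import threading


class Chassis:
    """Toy stand-in for a platform chassis; not the real sonic_platform class."""

    def __init__(self, sfp_count):
        self._sfp_list = [None] * sfp_count
        self._sfp_lock = threading.Lock()  # hypothetical lock name

    def _create_sfp(self, index):
        # Placeholder for the real SFP construction (hypothetical helper).
        return {"index": index}

    def get_sfp(self, index):
        # Hold the lock for the check and the creation, so only one thread
        # ever builds the object for a given index.
        with self._sfp_lock:
            if self._sfp_list[index] is None:
                self._sfp_list[index] = self._create_sfp(index)
            return self._sfp_list[index]
```

With this shape, every caller of `get_sfp(index)` receives the same object regardless of which thread asked first.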

How to verify it
UNIT TEST
Manual Test
2023-12-19 19:52:42 +08:00
mssonicbld
7f03584d9d
Fix syncd_request_shutdown coredump in config reload on KVM sonic (#17486) (#17564) 2023-12-19 19:02:31 +08:00
mssonicbld
d7ba0608ba
[gbsyncd] Graceful shutdown of syncd process in container gbsyncd (#16812) (#17523) 2023-12-17 22:39:08 +08:00
anamehra
4595db4666
Fixed determine/process reboot-cause service dependency (#17406)
Signed-off-by: anamehra anamehra@cisco.com

Why I did it
Fixes #16990 for 202305/202205 branch

Note: This PR is for 202305 and 202205. For master, a new PR will be raised with a new field (Uphold=) provided by debian bookworm to handle the dependency failure restartability of the processes.

1. The determine-reboot-cause and process-reboot-cause services do not start if the database service fails to restart on the first attempt. Even if the database service succeeds on the next attempt, these reboot-cause services do not start.

2. The process-reboot-cause service also does not restart if the docker or database service restarts, which leads to an empty reboot-cause history.

3. deploy-mg from sonic-mgmt also triggers a docker service restart, which causes the issue stated in 2 above. The docker restart also triggers determine-reboot-cause to restart, which creates an additional reboot-cause file in the history and modifies the last reboot-cause.

This PR fixes these issues by making both services start again once their dependency is met after a dependency failure, making both services restart when the database service restarts, and preventing duplicate processing of the last reboot reason.

Work item tracking
Microsoft ADO 25892856
How I did it
Modified systemd unit files to make determine-reboot-cause and process-reboot-cause services restartable when the database service restarts.
On the restart, the determine-reboot-cause service should not recreate a new reboot-cause entry in the database. Added check for first start or restart to skip entry for restart case.
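
The first-start-versus-restart check can be sketched as follows. This is an illustration only, not the PR's implementation: the marker-file approach and the `marker_path` parameter are assumptions, and the path is expected to live on storage that is cleared at every boot.

```python
import os


def should_record_reboot_cause(marker_path):
    """Return True on the first start after boot, False on later restarts.

    marker_path is a hypothetical file on storage that is cleared at boot
    (for example a tmpfs); if it already exists, this boot was handled.
    """
    try:
        # O_EXCL makes creation atomic: only the first caller succeeds.
        fd = os.open(marker_path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
    except FileExistsError:
        return False
    os.close(fd)
    return True
```

A service would call this once at startup and skip writing a new reboot-cause entry whenever it returns False.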
How to verify it
On a single-ASIC pizza box:

1. Install the image and check the reboot-cause history.
2. Restart the database service and verify that the determine-reboot-cause and process-reboot-cause services also restart. Verify that reboot-cause shows correct data and that no new entry is created for the restart.

On a chassis:

1. Install the image and check the reboot-cause history.
2. Restart the database service and verify that the determine-reboot-cause and process-reboot-cause services also restart. Verify that reboot-cause shows correct data and that no new entry is created for the restart.
3. Reboot the LC. On the Supervisor, stop the database-chassis service.
4. Let the database service on the LC fail the first time. determine-reboot-cause and process-reboot-cause should fail to start due to the dependency failure.
5. Start database-chassis on the Supervisor. The database service on the LC should now start successfully.
6. Verify that determine-reboot-cause and process-reboot-cause also start.
7. Verify the show reboot-cause history output.
2023-12-17 20:48:15 +08:00
mssonicbld
fc4574bcfc
[submodule] Update submodule sonic-utilities to the latest HEAD automatically (#17542) 2023-12-17 15:20:16 +08:00
mssonicbld
2858cfa240
[submodule] Update submodule sonic-snmpagent to the latest HEAD automatically (#17540) 2023-12-16 15:55:48 +08:00
mssonicbld
14ea72f378
[submodule] Update submodule sonic-sairedis to the latest HEAD automatically (#17539) 2023-12-16 15:44:27 +08:00
mssonicbld
07860da103
[submodule] Update submodule sonic-platform-common to the latest HEAD automatically (#17538) 2023-12-16 15:30:31 +08:00
mssonicbld
df10bfba16
[submodule] Update submodule sonic-host-services to the latest HEAD automatically (#17537) 2023-12-16 15:08:10 +08:00
Pavan-Nokia
c4ba9c9edb
[armhf][Nokia-7215] Remove platform reboot (#17010) 2023-12-16 12:33:36 +08:00
mssonicbld
9c8dbf6f1b
[Dell] S6100 - Update EEPROM API serial_number_str to return service tag instead of serial number (#17440) (#17494) 2023-12-16 01:44:16 +08:00
mssonicbld
b16e2da1be
[installer] Create a blank grubenv if doesn't exist. (#17414) (#17525) 2023-12-16 01:40:44 +08:00
mssonicbld
6dba9f8305
Change leaf value of used_cnt of sonic-events-swss:chk_crm_threshold (#17430) (#17527) 2023-12-16 01:36:36 +08:00
mssonicbld
c49d7c5417
[Nokia-7215][armhf] Enable Watchdog service (#16612) (#17522) 2023-12-16 01:35:16 +08:00
mssonicbld
c63a4c6a4a
[Mellanox][SKU] Adding Mellanox-SN4700-O8V48 SKU (#17425) (#17526) 2023-12-16 01:32:50 +08:00
zitingguo-ms
1a0268c224
Fix ecmp hash polarization by enable hash seed/offset config on T1 and upgrade xgs SAI to 8.4.35.0 (#17505)
Why I did it
To fix ecmp hash polarization issue.

Work item tracking
Microsoft ADO (number only): 26085143
How I did it
Add sai_hash_seed_config_hash_offset_enable=1 in all config.bcm that Broadcom T1 uses.

HardwareSku
Force10-S6100-T1
Force10-S6100-ITPAC-T1
Force10-S6100
Celestica-DX010-C32
Arista-7260CX3-C64
Arista-7060CX-32S-Q32
Arista-7060CX-32S-C32-T1
Arista-7060CX-32S-C32
Arista-7050QX32S-Q32
Arista-7050QX-32S-S4Q31
Arista-7050-QX32
Arista-7050-QX-32S

Include Broadcom's fix by upgrading xgs SAI version to 8.4.35.0.
8.4.35.0: [CSP 00012324019] back-porting SONIC-75006 to SAI8.4
8.4.34.0:
[CSP 00012318293] back-porting SONIC-81534 to SAI8.4;
ECMP LB traffic polarization, configure hash_offset along with hash_seed attr
Run qual with only xgs SAI version upgraded to 8.4.35.0:
on TH2: https://elastictest.org/scheduler/testplan/6579b36ccfacd86e78e3e885?leftSideViewMode=detail&prop=status&order=ascending
on TH: https://elastictest.org/scheduler/testplan/657a75f8c1d3b51fc1d585b4?leftSideViewMode=detail&prop=status&order=ascending
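
The config.bcm change described under "How I did it" amounts to making sure one property is present in each file. A minimal sketch, assuming the file is plain `key=value` lines with `#` comments (an assumption about the format; the helper name is hypothetical):

```python
PROPERTY = "sai_hash_seed_config_hash_offset_enable"


def ensure_property(config_text, key=PROPERTY, value="1"):
    """Return config_text with key=value set, replacing or appending as needed."""
    out, found = [], False
    for line in config_text.splitlines():
        stripped = line.strip()
        # Match only real assignments of this key, not comments.
        if not stripped.startswith("#") and stripped.split("=", 1)[0].strip() == key:
            out.append(f"{key}={value}")
            found = True
        else:
            out.append(line)
    if not found:
        out.append(f"{key}={value}")
    return "\n".join(out) + "\n"
```

The function is idempotent, so running it again over an already updated file leaves the text unchanged.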

How to verify it
Use tests/ecmp/test_ecmp_sai_value.py to verify.
2023-12-15 19:33:47 +08:00
mssonicbld
571efc2f3a
[submodule] Update submodule sonic-swss to the latest HEAD automatically (#17515) 2023-12-15 15:18:42 +08:00
mssonicbld
b09e7d1b6b
[gbsyncd]: Set SYSLOG_CONFIG_FEATURE for gbsyncd (#17325) (#17513) 2023-12-15 14:54:26 +08:00
Sudharsan Dhamal Gopalarathnam
8297800a5e
[FRR] Fix zebra memory leak when bgp fib suppress pending is enabled (#17484)
Fix zebra leaking memory with fib suppress enabled. Porting the fix from
FRRouting/frr#14983

While running test_stress_route.py, systems with lower memory started to throw low memory logs. On further investigation, a memory leak has been found in zebra which was fixed in the FRR community.
2023-12-15 14:34:37 +08:00
Sudharsan Dhamal Gopalarathnam
bed8d24a4a
[202305][Mellanox] Update SAI to SAIBuild2305.26.0.16, SDK/FW to 4.6.2134/2012.2134 (#17474)
Why I did it
Update SAI version to SAIBuild2305.26.0.16
Update SDK/FW to 4.6.2134/2012.2134

Fixed issues:

Updated SN3700C to enable limit to 100G speed.
Recovering from low-power mode might end with the port down.
Work item tracking
Microsoft ADO (number only):
How I did it
Updating the versions in makefile

How to verify it
Confirm issues fixed and run sonic-mgmt tests
2023-12-14 17:29:37 +08:00
arista-nwolfe
dd294f3883
Disable SA_EQUALS_DA trap on DNX LC SKUs (#17488)
This is a 202305 cast of this PR #17206
2023-12-14 08:44:44 +08:00
wenyiz2021
7fb7722959
[202305 branch] Upgrade dnx SAI version to 9.2.x (#17432)
202305 image does not come up on chassis with SAI 7.1.111.1.
SAI 9.2.0.0 on the 202305 image is verified to come up on Arista chassis. Initial testing is also done; there are no new failures compared to the 202205 image with SAI 7.1.111.1.

Why I did it
Bring up 202305 image on chassis.

Work item tracking
Microsoft ADO (number only): 18189434
How I did it
How to verify it
Brought up SAI 9.2.0.0 on Arista chassis.
Ran pipeline on acl, bgp, arp, acms, cacl, copp, decap, fib, iface_namingmode.
2023-12-13 11:42:24 +08:00
mssonicbld
5b1d18898f
[submodule] Update submodule sonic-platform-common to the latest HEAD automatically (#17477)
#### Why I did it
src/sonic-platform-common
```
* 57f63e6 - (HEAD -> 202305, origin/202305) Adding supported vendor PNs for remote CDB FW upgrade (#418) (4 hours ago) [mihirpat1]
```
#### How I did it
#### How to verify it
#### Description for the changelog
2023-12-12 16:34:31 +08:00
mssonicbld
d297c4fd34
[submodule] Update submodule sonic-utilities to the latest HEAD automatically (#17467)
#### Why I did it
src/sonic-utilities
```
* 7cf32a9f - (HEAD -> 202305, origin/202305) Reduce generate_dump mem usage for cores (#3052) (16 hours ago) [davidm-arista]
```
#### How I did it
#### How to verify it
#### Description for the changelog
2023-12-09 18:32:14 +08:00
Stepan Blyshchak
2cea4bcbdf
[config-chassisdb] use cached variables (#17342)
- Why I did it
Improve boot performance, which is mostly needed for fast boot and warm boot.

- How I did it
Use cached variable.

- How to verify it
Boot the system. Simply do "systemd-analyze blame" and look at service start time.
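
To compare service start times before and after the change, the blame output can be parsed into numbers. The following is a hedged sketch: `parse_blame` is a hypothetical helper, and it assumes each line consists of one or more time spans in `min`, `s`, `ms` or `us` followed by the unit name.

```python
import re

_UNITS = {"min": 60.0, "s": 1.0, "ms": 0.001, "us": 0.000001}
_TOKEN = re.compile(r"(\d+(?:\.\d+)?)(min|ms|us|s)")


def parse_blame(text):
    """Map unit name -> start time in seconds from 'systemd-analyze blame' text."""
    result = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) < 2:
            continue
        unit, spans = parts[-1], parts[:-1]
        seconds = 0.0
        for span in spans:
            match = _TOKEN.fullmatch(span)
            if match is None:
                break  # unrecognized time span: skip this line entirely
            seconds += float(match.group(1)) * _UNITS[match.group(2)]
        else:
            result[unit] = seconds
    return result
```

The text could be captured with `subprocess.run(["systemd-analyze", "blame"], capture_output=True, text=True).stdout` on the target system and the resulting dictionaries compared between two images.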

Signed-off-by: Stepan Blyschak <stepanb@nvidia.com>
2023-12-09 14:32:43 +08:00
Stepan Blyshchak
bc4bc03239
[config-topology] use cached variables (#17343)
- Why I did it
Improve boot performance, which is mostly needed for fast boot and warm boot.

- How I did it
Use cached variable.

- How to verify it
Boot the system. Simply do "systemd-analyze blame" and look at service start time.

Signed-off-by: Stepan Blyschak <stepanb@nvidia.com>
2023-12-09 14:32:39 +08:00
mssonicbld
688245a724
Revert "[swss/syncd] remove dependency on interfaces-config.service (#13084) (#14341)" (#15094) (#17367) (#17461) 2023-12-08 19:45:49 +08:00
Nazarii Hnydyn
06ed67dfa6
[mellanox]: Disable MFT bash autocompletion. (#17359)
A workaround for a delay of about 20 seconds on login caused by an MFT bash autocompletion bug.
It should be reverted once a formal fix is available in a future MFT release.

Why I did it
To overcome SN2700 20 sec delay on login
Work item tracking
N/A
How I did it
Removed MFT bash autocompletion part
How to verify it
make configure PLATFORM=mellanox
make target/sonic-mellanox.bin
2023-12-08 14:35:28 +08:00
mssonicbld
f445416ec5
[submodule] Update submodule sonic-platform-daemons to the latest HEAD automatically (#17420)
#### Why I did it
src/sonic-platform-daemons
```
* f23e342 - (HEAD -> 202305, origin/202305) Add dynamic sensor logic for fixed and psu presence/state checking in thermalctld (#401) (18 hours ago) [Gregory Boudreau]
```
#### How I did it
#### How to verify it
#### Description for the changelog
2023-12-07 10:39:51 +08:00
wadoodkhan
2b8efdbb11
[Marvell] Update armhf sai debian (#17301)
Why I did it
Fixed the issue - Some special IPv6 packets cannot be dropped by dataplane ACL rule

Work item tracking
Microsoft ADO (number only):
No
How I did it
How to verify it
Loaded the SAI Debian package (in the syncd docker) and re-ran the failed cases.
2023-12-06 20:07:07 +08:00
mssonicbld
a0d2968273
[submodule] Update submodule sonic-dbsyncd to the latest HEAD automatically (#17418)
#### Why I did it
src/sonic-dbsyncd
```
* 68baf40 - (HEAD -> 202305, origin/202305) [lldp-syncd] Fix unexpected exception in snmp-subagent (#64) (18 hours ago) [Zhaohui Sun]
```
#### How I did it
#### How to verify it
#### Description for the changelog
2023-12-06 16:34:22 +08:00
mssonicbld
3cb68edac5
[submodule] Update submodule sonic-utilities to the latest HEAD automatically (#17421)
#### Why I did it
src/sonic-utilities
```
* cebac831 - (HEAD -> 202305, origin/202305) [ci] Use correct bullseye docker image according to source branch. (17 hours ago) [Liu Shilong]
```
#### How I did it
#### How to verify it
#### Description for the changelog
2023-12-06 16:34:13 +08:00
mssonicbld
337f925058
[frr]: Force disable next hop group support. (#17344) (#17423) 2023-12-06 15:53:52 +08:00
mssonicbld
26250f4e4f
[Mellanox] remove log in RAM kernel option for 2700 A1 platform (#17254) (#17422) 2023-12-06 15:39:46 +08:00
Aravind-Subbaroyan
b6a8443487
Update cisco-8000.ini to 202305.1.0.3 (#17417)
Why I did it
FCS/CRC Errors will only be reported as RX_ERR.
Fix to avoid the mac port related errors.
Fix for sharedResSize testcase failure in QoS-SAI
Fix the issue related to voltage in 'show platform psustatus'.
Support WRED drop for lossy queues.
Fixed an issue where lossy traffic was getting dropped.
Enhancement of SAI logging for errors and interrupts
Work item tracking
Microsoft ADO (number only):
How I did it
Update Cisco platform to 202305.1.0.3

How to verify it
2023-12-06 14:22:56 +08:00
StormLiangMS
fa7be88599
Revert "[pmon] update gRPC version to 1.57.0 (#16257) (#17219)" (#17391)
This reverts commit 066065f1cd.
2023-12-05 10:38:51 +08:00
mssonicbld
2804987be0
[submodule] Update submodule sonic-restapi to the latest HEAD automatically (#17386) 2023-12-04 18:36:35 +08:00