DEPENDS ON: sonic-net/sonic-swss#2997, sonic-net/sonic-utilities#3093
What I did
Revert the feature.
Why I did it
Revert the BGP suppress FIB functionality because of FRR memory consumption issues and bugs found with it.
How I verified it
Basic sanity check on t1-lag, regression in progress.
Backport PR #17458 due to conflict.
Why I did it
Optimize syslog rate limit feature for fast and warm boot
Work item tracking
Microsoft ADO (number only):
How I did it
Optimize redis start time
Don't render rsyslog.conf in container startup script
Disable containercfgd by default. There is a new CLI to enable it (in another PR)
How to verify it
Manual test
Regression test
* [Celestica-E1031] Enable CPU watchdog (#16083)
Enable CPU watchdog on Celestica-E1031.
* Add info syslog for cpu_wdt.service (#16678)
Why I did it
Add an info syslog for cpu_wdt.service when the watchdog arm action is triggered.
How I did it
Add an info syslog for cpu_wdt.service when the watchdog arm action is triggered.
Why I did it
Release notes for Cisco 8111-32EH-O, 8102-64H-O and 8101-32FH-O:
• Fixed a bug in PFC-WD where watchdog is triggered too often when sparse traffic is present, failing to detect the traffic traversal - (SR 696617830)
• Resolved an issue where SAI_STATUS_ITEM_NOT_FOUND error was seen while adding LAG members - (MIGSMSFT-354)
• Fixed Thermal API related error message (MIGSMSFT-354)
• Fixed an issue related to default config trap - (MIGSMSFT-354)
• Changed the message log level from error to debug in situations when the HW offloaded session is not found or was never created for the packet received. (MIGSMSFT-354)
• Fixed an issue where drop option was not working when encap and decap IPinIP tunnels share the same SDK tunnel port.
• Fixed an error while running VRF testcase (MIGSMSFT-354)
• Fixed an issue where BFD packets were not egressing using Queue 7
• SAI support for additional FEC related attributes:
· SAI_PORT_ATTR_MAX_FEC_SYMBOL_ERRORS_DETECTABLE
· SAI_PORT_STAT_IF_IN_FEC_CODEWORD_ERRORS_S0
· SAI_PORT_STAT_IF_IN_FEC_CODEWORD_ERRORS_S16
Work item tracking
Microsoft ADO (number only):
#### Why I did it
src/sonic-swss
```
* 5643db9a - (HEAD -> 202305, origin/202305) [muxorch] Fixing cache bug in updateRoute logic (#2982) (6 hours ago) [Nikola Dancejic]
```
#### How I did it
#### How to verify it
#### Description for the changelog
Why I did it
Fix an issue where xcvrd crashes because it cannot import the name 'initialize_sfp_thermal':
Nov 27 09:47:16.388639 sonic ERR pmon#xcvrd: Exception occured at CmisManagerTask thread due to ImportError("cannot import name 'initialize_sfp_thermal' from partially initialized module 'sonic_platform.thermal' (most likely due to a circular import) (/usr/local/lib/python3.9/dist-packages/sonic_platform/thermal.py)")
Nov 27 09:47:16.392544 sonic ERR pmon#xcvrd: Traceback (most recent call last):
Nov 27 09:47:16.392643 sonic ERR pmon#xcvrd: File "/usr/local/lib/python3.9/dist-packages/xcvrd/xcvrd.py", line 1518, in run
Nov 27 09:47:16.392757 sonic ERR pmon#xcvrd: self.task_worker()
Nov 27 09:47:16.392757 sonic ERR pmon#xcvrd: File "/usr/local/lib/python3.9/dist-packages/xcvrd/xcvrd.py", line 1240, in task_worker
Nov 27 09:47:16.392757 sonic ERR pmon#xcvrd: sfp = platform_chassis.get_sfp(pport)
Nov 27 09:47:16.392793 sonic ERR pmon#xcvrd: File "/usr/local/lib/python3.9/dist-packages/sonic_platform/chassis.py", line 346, in get_sfp
Nov 27 09:47:16.392830 sonic ERR pmon#xcvrd: self.initialize_single_sfp(index)
Nov 27 09:47:16.392830 sonic ERR pmon#xcvrd: File "/usr/local/lib/python3.9/dist-packages/sonic_platform/chassis.py", line 288, in initialize_single_sfp
Nov 27 09:47:16.392830 sonic ERR pmon#xcvrd: self._sfp_list[index] = sfp_module.SFP(index)
Nov 27 09:47:16.392830 sonic ERR pmon#xcvrd: File "/usr/local/lib/python3.9/dist-packages/sonic_platform/sfp.py", line 272, in __init__
Nov 27 09:47:16.392866 sonic ERR pmon#xcvrd: from .thermal import initialize_sfp_thermal
Nov 27 09:47:16.392918 sonic ERR pmon#xcvrd: ImportError: cannot import name 'initialize_sfp_thermal' from partially initialized module 'sonic_platform.thermal' (most likely due to a circular import) (/usr/local/lib/python3.9/dist-packages/sonic_platform/thermal.py)
Nov 27 09:47:16.393103 sonic ERR pmon#xcvrd: Xcvrd: exception found at child thread CmisManagerTask due to ImportError("cannot import name 'initialize_sfp_thermal' from partially initialized module 'sonic_platform.thermal' (most likely due to a circular import) (/usr/local/lib/python3.9/dist-packages/sonic_platform/thermal.py)")
Nov 27 09:47:16.393103 sonic ERR pmon#xcvrd: Exiting main loop as child thread raised exception!
Work item tracking
Microsoft ADO (number only):
How I did it
Add lock for creating SFP object
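The idea can be illustrated with a minimal sketch. It assumes SFP objects are created lazily on first access, as the traceback suggests; the class names `Sfp` and `Chassis`, the `_sfp_lock` attribute, and the `created` counter are hypothetical and are not taken from the actual platform code.

```python
import threading


class Sfp:
    """Hypothetical stand-in for the platform SFP object."""

    def __init__(self, index):
        self.index = index


class Chassis:
    """Hypothetical chassis that creates SFP objects lazily."""

    def __init__(self, port_count):
        self._sfp_list = [None] * port_count
        self._sfp_lock = threading.Lock()
        self.created = 0  # counts constructor calls, for illustration only

    def get_sfp(self, index):
        # Serialize creation so that two threads never initialize the same
        # slot (or run the lazy import path) at the same time.
        with self._sfp_lock:
            if self._sfp_list[index] is None:
                self._sfp_list[index] = Sfp(index)
                self.created += 1
            return self._sfp_list[index]
```

With the lock held around the check and the creation, concurrent callers such as the xcvrd worker threads would all receive the same object for a given port.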
How to verify it
UNIT TEST
Manual Test
Signed-off-by: anamehra <anamehra@cisco.com>
Why I did it
Fixes #16990 for the 202305/202205 branches.
Note: This PR is for 202305 and 202205. For master, a new PR will be raised that uses a new field (Upholds=), available in Debian bookworm, to handle restarting the processes after a dependency failure.
1. The determine-reboot-cause and process-reboot-cause services do not start if the database service fails to start on the first attempt. Even if the database service succeeds on the next attempt, these reboot-cause services do not start.
2. The process-reboot-cause service also does not restart if the docker or database service restarts, which leads to an empty reboot-cause history.
3. deploy-mg from sonic-mgmt also triggers a docker service restart, which causes the issue stated in 2 above. The docker restart also triggers determine-reboot-cause to restart, which creates an additional reboot-cause file in the history and modifies the last reboot cause.
This PR fixes these issues by making both processes start again once the dependency is met after a dependency failure, making both processes restart when the database service restarts, and preventing duplicate processing of the last reboot reason.
Work item tracking
Microsoft ADO 25892856
How I did it
Modified systemd unit files to make determine-reboot-cause and process-reboot-cause services restartable when the database service restarts.
On restart, the determine-reboot-cause service should not create a new reboot-cause entry in the database. Added a check that distinguishes a first start from a restart and skips the entry in the restart case.
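A minimal sketch of such a first-start check follows. It assumes a marker file kept on a tmpfs (for example under /run) so that it disappears on reboot; the function names and the marker path are hypothetical, and the actual service may detect restarts differently.

```python
import os
import tempfile


def is_first_start(marker_path):
    """Return True on the first call per boot, False on later restarts.

    marker_path is a hypothetical file that is expected to live on a tmpfs,
    so it is gone after a reboot.
    """
    try:
        # O_EXCL makes creation atomic: only the first caller succeeds.
        fd = os.open(marker_path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
    except FileExistsError:
        return False
    os.close(fd)
    return True


def record_reboot_cause(history, cause, marker_path):
    """Append cause to history only on first start; skip it on restart."""
    if is_first_start(marker_path):
        history.append(cause)
    return history


# Demo: the second call simulates a service restart within the same boot.
demo_marker = os.path.join(tempfile.mkdtemp(), "determine-reboot-cause.started")
demo_history = []
record_reboot_cause(demo_history, "first start", demo_marker)
record_reboot_cause(demo_history, "restart", demo_marker)
print(demo_history)  # prints ['first start']
```

The design choice here is that the marker, not the history itself, decides whether an entry is written, so a restart triggered by the database or docker service leaves the existing reboot-cause data untouched.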
How to verify it
On a single-ASIC pizza box:
Install the image and check the reboot-cause history.
Restart the database service and verify that the determine-reboot-cause and process-reboot-cause services also restart. Verify that reboot-cause shows correct data and that no new entry is created for the restart.
On a chassis:
Install the image and check the reboot-cause history.
Restart the database service and verify that the determine-reboot-cause and process-reboot-cause services also restart. Verify that reboot-cause shows correct data and that no new entry is created for the restart.
Reboot the LC. On the Supervisor, stop the database-chassis service.
Let the database service on the LC fail the first time. determine-reboot-cause and process-reboot-cause will fail to start due to the dependency failure.
Start database-chassis on the Supervisor. The database service on the LC should now start successfully.
Verify that determine-reboot-cause and process-reboot-cause also start.
Verify the show reboot-cause history output.
Why I did it
To fix the ECMP hash polarization issue.
Work item tracking
Microsoft ADO (number only): 26085143
How I did it
Add sai_hash_seed_config_hash_offset_enable=1 in all config.bcm that Broadcom T1 uses.
HardwareSku
Force10-S6100-T1
Force10-S6100-ITPAC-T1
Force10-S6100
Celestica-DX010-C32
Arista-7260CX3-C64
Arista-7060CX-32S-Q32
Arista-7060CX-32S-C32-T1
Arista-7060CX-32S-C32
Arista-7050QX32S-Q32
Arista-7050QX-32S-S4Q31
Arista-7050-QX32
Arista-7050-QX-32S
Include Broadcom's fix by upgrading xgs SAI version to 8.4.35.0.
8.4.35.0: [CSP 00012324019] back-porting SONIC-75006 to SAI8.4
8.4.34.0:
[CSP 00012318293] back-porting SONIC-81534 to SAI8.4;
ECMP LB traffic polarization, configure hash_offset along with hash_seed attr
Run qual with only xgs SAI version upgraded to 8.4.35.0:
on TH2: https://elastictest.org/scheduler/testplan/6579b36ccfacd86e78e3e885?leftSideViewMode=detail&prop=status&order=ascending
on TH: https://elastictest.org/scheduler/testplan/657a75f8c1d3b51fc1d585b4?leftSideViewMode=detail&prop=status&order=ascending
How to verify it
Use tests/ecmp/test_ecmp_sai_value.py to verify.
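As a quick offline complement to that test, the sketch below checks whether the text of a config.bcm file enables sai_hash_seed_config_hash_offset_enable. The helper name `bcm_property_enabled` is hypothetical, and the sketch assumes plain `key=value` lines with `#` comments and that the last assignment of a key is the effective one.

```python
def bcm_property_enabled(config_text, name="sai_hash_seed_config_hash_offset_enable"):
    """Return True if the last assignment of `name` in config_text is 1."""
    value = None
    for raw in config_text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if "=" not in line:
            continue
        key, _, val = line.partition("=")
        if key.strip() == name:
            value = val.strip()  # assumption: a later assignment overrides
    return value == "1"
```

It could be run over the config.bcm of each HardwareSku listed above by reading the file and passing its contents as `config_text`.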
Fix zebra leaking memory with fib suppress enabled. Porting the fix from FRRouting/frr#14983.
While running test_stress_route.py, systems with lower memory started to throw low memory logs. On further investigation, a memory leak has been found in zebra which was fixed in the FRR community.