Cherry-pick the commit from master that fixes BGP template rendering on multi-asic platforms, where rendering needs the Loopback4096 IP address and fails without it. The issue is a timing/race condition: the peer gets added first, and the Loopback4096 notification reaches bgpcfgd only afterwards.
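The ordering problem described above can be illustrated with a minimal sketch. This is not bgpcfgd's actual code: the class, method names and the `rendered` list are hypothetical stand-ins, assuming only that a peer cannot be rendered until the Loopback4096 address is known.

```python
# Hypothetical sketch: names are illustrative, not bgpcfgd's real API.
class PeerRenderer:
    """Defers peer rendering until the Loopback4096 address is known."""

    def __init__(self):
        self.lo4096_ip = None  # set when the Loopback4096 notification arrives
        self.pending = []      # peers that arrived before the address
        self.rendered = []     # (peer, source address) pairs, stand-in for template output

    def on_peer_added(self, peer):
        if self.lo4096_ip is None:
            # Dependency missing: queue the peer instead of failing the render.
            self.pending.append(peer)
            return False
        self.rendered.append((peer, self.lo4096_ip))
        return True

    def on_loopback4096(self, ip):
        self.lo4096_ip = ip
        # Replay every peer that was waiting on the address.
        waiting, self.pending = self.pending, []
        for peer in waiting:
            self.on_peer_added(peer)
```

With this shape, the order in which the two notifications arrive no longer matters: an early peer is held back and rendered once the address shows up.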
- Why I did it
Collect an MST dump before syncd restarts when a shutdown notification is received during a SAI failure.
Dump can be found under:
root@sonic:/home/admin# ls -l /var/dump/mstdump/
total 10684
-rw-r--r-- 1 root root 5460332 Aug 15 18:41 mstdump_20220815_184143.tar.gz
-rw-r--r-- 1 root root 5473253 Aug 15 21:46 mstdump_20220815_214642.tar.gz
root@sonic:/home/admin# tar -xvzf /var/dump/mstdump/mstdump_20220815_214642.tar.gz
├── ir-gdb
│ └── core
└── mstdump
├── mstdump1
├── mstdump2
├── mstdump3
└── mststatus
- How I did it
Checked for shutdown notification log in sairedis and used it to determine whether the shutdown is normal or due to SAI failure
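A rough sketch of that check, under stated assumptions: the marker string `switch_shutdown_request` and the sample line layout are guesses for illustration, not a confirmed sairedis log format, and the helper name is hypothetical.

```python
def shutdown_requested(lines, marker="switch_shutdown_request"):
    """Return True if any log line mentions the shutdown notification.

    `marker` is an assumed substring; adjust it to whatever the
    sairedis log actually records for the notification.
    """
    return any(marker in line for line in lines)
```

A caller could read the sairedis log line by line, pass the lines to this helper, and collect the dump only when it returns True.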
- How to verify it
Simulated a SAI failure event and verified that the dump is generated. Also verified that the dump is not generated in the different reboot and config reload scenarios.
Signed-off-by: Vivek Reddy <vkarri@nvidia.com>
Why I did it
Add the hardware reboot cause when the previous software reboot failed
How I did it
Check both hardware reboot cause and software reboot cause.
If a hardware reboot cause is available alongside a software reboot cause, report the hardware reboot cause as the actual reboot cause.
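The selection rule can be sketched as follows. The function name and the sentinel values treated as "no hardware cause" are assumptions for illustration, not the values used by the actual reboot-cause code.

```python
# Assumed sentinels meaning "no hardware reboot cause was reported".
NO_HARDWARE_CAUSE = {None, "", "Unknown", "Non-Hardware"}

def pick_reboot_cause(hardware_cause, software_cause):
    """Prefer a real hardware cause; otherwise fall back to the software one."""
    if hardware_cause not in NO_HARDWARE_CAUSE:
        return hardware_cause
    return software_cause or "Unknown"
```

For example, a software reboot that hung and was followed by a power cycle would be reported with the hardware cause rather than the stale software one.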
How to verify it
Perform reboots and verify the reboot-cause
Why I did it
Add Celestica Silverstone-x platform
How I did it
Add Celestica Silverstone-x platform
How to verify it
Verified by the SONiC platform API tests.
Verified by SONiC utilities and commands including:
psuutil
psushow (show platform psustatus)
sfputil
sfpshow
tempershow (show platform temperature)
fanshow (show platform fan)
watchdogutil
fwutil (show platform firmware status)
decode-syseeprom -d (show platform syseeprom)
show platform ssdhealth
show platform summary
show interfaces status
What/Why I did:
Update Broadcom SAI debian package. New Package has following changes:
Case CS00012248135: Fix for the error message "linux-bcm-knet: Fatal error: Incomplete chain" followed by malformed LACP/LLDP packets
Why I did it
62b7b56 2022-07-13 | Remove disabled and not loaded services before calling reset-failed and restart services (#2266) [Zain Budhwani]
09b4678 2022-07-05 | [config/load_mgmt_config] Support load IPv6 mgmt IP (#2206) (#2246) [Jing Kan]
How I did it
Pulled latest commit from 201911 sonic-utilities branch and created PR
How to verify it
Look at build-image
```
23fc702 [201911][patch] mlxsw: i2c: Prevent transaction execution for special chip st (#278)
e4f44e4 [201911] Increase log buf len size to 1M (#265)
ef6abe3 [201911] Apply kernel patches to fix emmc unreliability (#264)
7458347 [201911] Increase log buf len size to 1M
4edf1b4 [201911] Apply kernel patches to fix emmc unreliability
```
Signed-off-by: Stepan Blyschak <stepanb@nvidia.com>
Why I did it
I updated sonic-linux-kernel to pick a fix for a bug happening during ISSU that caused CPU stall.
How I did it
Updated submodule
How to verify it
Build and run warm reboot
Why I did it
Improve throughput and latency for 7260 deployments
How I did it
Update the dynamic threshold to 0 and ECN settings as 2mb/10mb/5%
How to verify it
Added unit tests for rendering the qos template for 7260. Built sonic config engine wheel successfully
Signed-off-by: Neetha John <nejo@microsoft.com>
Why I did it
Added Support for Celestica Midstone-100x platform
How I did it
Implemented the support for Celestica Midstone-100x platform
Platform: x86_64-cel_midstone-100x-r0
HwSKU: Midstone-100x
ASIC: innovium
ASIC Count: 1
How to verify it
Run platform test on testbed
Why I did it
There is a need to select different mmu profiles based on deployment type
How I did it
Each hwsku folder will have separate subfolders (RDMA-CENTRIC, TCP-CENTRIC, BALANCED) that contain deployment-specific mmu and qos settings. The SonicQosProfile attribute in the minigraph determines which settings to use. If that attribute is not present, the default settings that exist in the hwsku folder are used.
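The lookup can be sketched like this. The function name and the example path are hypothetical; only the three subfolder names and the fallback to the hwsku folder come from the description above.

```python
from pathlib import Path

# Subfolder names taken from the description above.
PROFILES = ("RDMA-CENTRIC", "TCP-CENTRIC", "BALANCED")

def qos_settings_dir(hwsku_dir, sonic_qos_profile=None):
    """Return the folder holding the mmu/qos settings for this deployment."""
    base = Path(hwsku_dir)
    if sonic_qos_profile in PROFILES:
        # Deployment-specific settings live in a subfolder of the hwsku folder.
        return base / sonic_qos_profile
    # Attribute absent (or unrecognized, an assumption here): use the defaults.
    return base
```

Treating an unrecognized profile value the same as a missing attribute is a design choice of this sketch, not something the description specifies.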
Signed-off-by: Neetha John <nejo@microsoft.com>
Why I did it
This was an ask by Microsoft to provide:
7260 config.bcm file for hardware sku Arista-7260CX3-D92C16 (Named Arista-7260CX3-D96C16).
There are 16 100G uplinks:
Ethernet13-20/1
Ethernet45-52/1
All other ports are broken out into two 50G ports.
How I did it
Copied existing Arista-7260CX3-D108C8 HWSKU and altered the bcm.config and port_config.ini files.
How to verify it
The new 100G ports do come up with a 201811 image using this HWSKU.
Co-authored-by: Zhi Yuan (Carl) Zhao <zyzhao@arista.com>
Signed-off-by: Andriy Yurkiv <ayurkiv@nvidia.com>
Backport from master
Corresponding PR on master: #7735
Corresponding PR on master: #6444
Why I did it
PG drop counters should be enabled by default (merge from master)
After "config reload" or "docker swss restart" all counters were enabled even if they were disabled before
How I did it
1) Add a PG drop counter enable option to dockers/docker-orchagent/enable_counters.py
2) Check whether an entry already exists before setting default values
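The second step amounts to "set defaults only where nothing is configured yet". A minimal sketch, using a plain dict in place of the config DB table; the key and field names are illustrative assumptions, not necessarily those used by enable_counters.py.

```python
def apply_default_counters(table, defaults):
    """Write default entries only for keys the user has not configured."""
    for key, value in defaults.items():
        if key in table:
            # Existing entry (for example a user-disabled counter): keep it.
            continue
        table[key] = dict(value)
    return table
```

With this check in place, a counter disabled before "config reload" stays disabled afterwards, while counters with no entry at all still get enabled by default.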
How to verify it
- Install the image and run the counterpoll show CLI command; PG_STAT_DROP should be shown as enabled
- Disable a few counters
counterpoll pg-drop disable
counterpoll port disable
- Save and reload
config save
config reload
- Check enable status
* [ci] Set default ACR in UpgrateVersion/PR/official pipeline. (#10341)
Why I did it
Docker Hub limits the pull rate.
Use ACR instead to pull debian related docker image.
How I did it
Set DEFAULT_CONTAINER_REGISTRY in pipeline.
* Add a config variable to override default container registry instead of dockerhub. (#10166)
* Add variable to reset default docker registry
* fix bug in docker version control
c3d9d8f2bcd364dc81cd4d9bec02666cef648b10 (HEAD -> 201911, origin/201911) API for getting all members from all VLANs (#106)
Signed-off-by: Abhishek Dosi <abdosi@microsoft.com>
What I did:
Added support to create the route-map action set tag
when the allow prefix list matches. The tag can be defined by the user in
constants.yml.
Why I did:
For the Allow List feature, the base route-map calls the allow-list route-map. Having the set tag option in the allow-list route-map lets the base route-map match on the tag and take further action if needed. The tag provides metadata that the base route-map can use.
[201911][pfcwd] Avoid ingress drop by not attaching zero profiles when pfc storm is detected (#2279)
Signed-off-by: Abhishek Dosi <abdosi@microsoft.com>
* fix allow list issue
Signed-off-by: stormliang <stormliang@microsoft.com>
* add the ipaddress in the install list
* add unit test
Co-authored-by: Ubuntu <azureuser@SONIC-SH-STORM-02.5pu3m0fajw1edcfltykk1gauxa.gx.internal.cloudapp.net>
Why I did it
Removing part of the BGP allowed prefix list configuration failed. Details are in #10141.
How I did it
There are two issues:
1) In FRR, the IPv6 default route is ::/0, but in the configuration it is 0::/0, so a string comparison between the two returns false. Why, then, did removal from the allowed prefix list fail for IPv4 but work for IPv6? The second issue answers that.
2) The current managers_allow_list doesn’t support removing part of the prefix list. IPv6 only appeared to work because of the default route comparison bug in 1). The code compares the prefix list in FRR with the one in the configuration DB: if every entry in the DB is present in FRR, it does nothing; otherwise it updates the prefix list from the DB. Because the IPv6 comparison always failed, the update ran regardless of the operation.
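The string mismatch between 0::/0 and ::/0 can be avoided by comparing parsed networks instead of raw strings. A minimal sketch using Python's standard ipaddress module; the helper name is hypothetical and this is not necessarily how the fix itself is written.

```python
import ipaddress

def same_prefix(a, b):
    """Compare two prefixes by value rather than by spelling."""
    return ipaddress.ip_network(a, strict=False) == ipaddress.ip_network(b, strict=False)
```

Parsing normalizes the different spellings of the IPv6 default route, so both forms compare equal, while an IPv4 prefix never matches an IPv6 one.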
How to verify it
Follow the steps in #10141
f91a9e6e07a43cae531cda019935de3221e0bb09 (HEAD -> 201911, origin/201911) Fix: not to use blocking get_all() after keys() (#255)
Signed-off-by: Abhishek Dosi <abdosi@microsoft.com>
- Why I did it
To include the fix for an issue where modifying the shared headroom on the fly can lead to negative occupancy, which causes the switch to send PFC frames continuously.
- How I did it
Updated submodule pointer and version in relevant Makefile.
- How to verify it
Build an image and run tests from sonic-mgmt.
Signed-off-by: Volodymyr Samotiy <volodymyrs@nvidia.com>