Commit Graph

4220 Commits

Author SHA1 Message Date
Vadym Hlushko
1ea5fd7c70
[DPB][YANG-models] extended regex pattern according to Mellanox systems speeds requirements (#6279)
[DPB][MLNX][YANG] fixed range of max speed

- Why I did it
All Mellanox platforms require DPB modes with a specific set of speeds.

- How I did it
Extended regex pattern inside YANG model.
Supported platforms: SN2010, SN2100, SN2410, SN2700, SN3420, SN3700, SN3700C, SN3800, SN4600C, SN4410, SN4700

- How to verify it
Manually tested the DPB CLI on all platforms with all modes.

Signed-off-by: Vadym Hlushko <vadymh@nvidia.com>
2021-01-17 10:39:40 +02:00
Lawrence Lee
063e11cce1
[minigraph.py]: Don't create mux table entries for servers w/o loopbacks (#6457)
Avoid sonic-cfggen crashing when a server does not have a configured loopback address in the minigraph

Signed-off-by: Lawrence Lee <lawlee@microsoft.com>
2021-01-17 00:23:11 -08:00
dflynn-Nokia
2830a2bca6
[build arm] fix sonic-slave-buster build break (#6469)
When building the sonic-slave-buster docker container, the node.js package is
installed to meet the requirements of the Azure DevOps pipeline
build. Recently this install of node.js has been failing.

This commit fixes that build break by upgrading the
sonic-slave-buster build to install version 14.x of node.js which is the
current LTS version for buster.
2021-01-16 22:04:19 -08:00
Kebo Liu
4cf9316ec3
[Mellanox] Make determine-reboot-cause service start after hw-management service (#6465)
**- Why I did it**

On the Mellanox platform, the reboot cause is fetched from sysfs entries created by the hw-management service. The determine-reboot-cause service must therefore start after hw-management; otherwise it can fail because those sysfs entries are not yet available.

**- How I did it**

Add a patch to the hw-management service so that the determine-reboot-cause service starts after it.

Signed-off-by: Kebo Liu <kebol@nvidia.com>
2021-01-15 11:38:31 -08:00
Wirut Getbamrung
0ca343422d
[device/celestica]: Add thermalctld support on DX010 platform APIs (#6089)
**- Why I did it**
- The thermalctld daemon on the Pmon docker requires support from the thermal manager API.

**- How I did it**
- Removed the old function for detecting a faulty fan.
- Removed the old function for detecting excess temperature.
- Implement thermal_manager APIs based on ThermalManagerBase
- Implement thermal_conditions APIs based on ThermalPolicyConditionBase
- Implement thermal_actions APIs based on ThermalPolicyActionBase
- Implement thermal_info APIs based on ThermalPolicyInfoBase
- Add thermal_policy.json
2021-01-15 10:20:47 -08:00
Roy Lee
c9d3e25115
[device/accton]: As7816-64x, fix memory leakage on accton fan monitor. (#6168)
It has been reported that the accton fan monitor process keeps consuming more memory after a few days.
The amount of memory occupied increases linearly and is never released.

Signed-off-by: roy_lee <roy_lee@edge-core.com>
2021-01-15 08:06:21 -08:00
Lawrence Lee
ffcef27eb1
[minigraph.py]: Check for empty cluster tag before parsing (#6440)
Some non-production minigraphs will have an empty ClusterName tag
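
The guard can be sketched as follows. This is a minimal illustration, not the code in minigraph.py: the function name `parse_cluster_name` and the simplified, namespace-free XML layout are assumptions made for the example.

```python
import xml.etree.ElementTree as ET


def parse_cluster_name(xml_text):
    """Return the cluster name, or None when the tag is missing or empty."""
    root = ET.fromstring(xml_text)
    node = root.find("ClusterName")  # hypothetical, simplified layout
    # An empty element has text None; also treat whitespace-only text as empty.
    if node is None or not node.text or not node.text.strip():
        return None
    return node.text.strip()


empty = parse_cluster_name("<Meta><ClusterName></ClusterName></Meta>")
named = parse_cluster_name("<Meta><ClusterName>str-cluster</ClusterName></Meta>")
```

Checking for the empty tag first lets the caller skip any parsing that depends on the cluster name instead of failing on a None value.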

Signed-off-by: Lawrence Lee <lawlee@microsoft.com>
2021-01-15 08:03:47 -08:00
Kebo Liu
1b2980540d
[mellanox][platform api] fix a missing import time module (#6458)
The "time" module was not imported, which causes an error when the branch that uses it is hit.

Signed-off-by: Kebo Liu <kebol@nvidia.com>
2021-01-15 08:01:11 -08:00
Junchao-Mellanox
6db88e860f
[Mellanox] PSU and module thermals are no longer child of chassis (#6460)
To build up the device hierarchy, PSU and module thermals are no longer children of the chassis. PSU thermals now belong to PSU objects and SFP thermals belong to SFP objects. platform.json must be aligned with this change, so the thermal objects are moved to the correct parent device.
2021-01-15 08:00:15 -08:00
Ying Xie
054f5b7a53
[warm boot finalizer] only wait for enabled components to reconcile (#6454)
* [warm boot finalizer] only wait for enabled components to reconcile

Define each component with its associated service. During warm reboot, only wait for components whose associated service is enabled to reconcile.
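
The selection logic can be sketched in Python as below. This is only an illustration of the idea; the function name, the component-to-service mapping, and the service names are hypothetical and do not come from the finalizer itself.

```python
def components_to_wait_for(component_services, enabled_services):
    """Keep only components whose associated service is enabled."""
    return [
        component
        for component, service in component_services.items()
        if service in enabled_services
    ]


# Hypothetical mapping of component name -> owning service.
component_services = {
    "orchagent": "swss",
    "neighsyncd": "swss",
    "bgp": "bgp",
    "teamsyncd": "teamd",
}

# Hypothetical set of services enabled on this device.
waiting = components_to_wait_for(component_services, {"swss", "bgp"})
```

With this filter, a component that belongs to a disabled service never blocks the finalizer, since it would never report reconciliation.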

Signed-off-by: Ying Xie <ying.xie@microsoft.com>
2021-01-15 07:48:11 -08:00
Partha Dutta
58a13b4c11
Export libyang API "lyd_check_mandatory_tree" for Management framework (CVL) (#5714)
- Why I did it
Management framework (CVL) needs to call lyd_check_mandatory_tree() for validation and hence exported lyd_check_mandatory_tree() as an API.

- How I did it
Added "API" keyword before lyd_check_mandatory_tree() definition.

- How to verify it
There is no functional code change here and there are no specific steps to verify it. Management framework (CVL) should be able to call this function, and no patching or compilation errors should be seen.

- Description for the changelog
Added "API" keyword before export lyd_check_mandatory_tree() function definition.
2021-01-14 16:47:57 -08:00
pavel-shirshov
16e54340b7
[docker-frr]: Use egrep with regexp to match correct TSA rules (#6403)
**- Why I did it**
Earlier today we found a bug in the SONiC TSA implementation.
TSC shows incorrect output (see below) when a route-map entry contains a TSA route-map entry as a prefix.
```
admin@str-s6100-acs-1:~$ TSC
Traffic Shift Check:
System Mode: Not consistent
```
The reason is that the TSC implementation uses regexps in the TSA utilities that are too loose, so they match the wrong route-map entries.
For example, TSC currently matches the following:
```
route-map TO_BGP_PEER_V4 permit 200
route-map TO_BGP_PEER_V6 permit 200
```
But it should match only
```
route-map TO_BGP_PEER_V4 permit 20
route-map TO_BGP_PEER_V4 deny 30
route-map TO_BGP_PEER_V6 permit 20
route-map TO_BGP_PEER_V6 deny 30
```

**- How I did it**
I fixed it by using egrep with the `^` and `$` regexp markers, which match the beginning and end of the line.
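
The effect of the anchors can be illustrated with a short Python sketch. The two patterns below are assumptions chosen for illustration, not the expressions used in the TSA scripts, and `re.search` stands in for `egrep`.

```python
import re

config_lines = [
    "route-map TO_BGP_PEER_V4 permit 20",
    "route-map TO_BGP_PEER_V4 deny 30",
    "route-map TO_BGP_PEER_V4 permit 200",
]

# Hypothetical loose pattern: it may match anywhere in the line,
# so "permit 200" is matched because it starts with "permit 20".
loose = re.compile(r"route-map TO_BGP_PEER_V4 permit 20")

# Hypothetical anchored pattern: ^ and $ force the whole line to match.
anchored = re.compile(r"^route-map TO_BGP_PEER_V4 (permit 20|deny 30)$")

loose_hits = [line for line in config_lines if loose.search(line)]
anchored_hits = [line for line in config_lines if anchored.search(line)]
```

Here `loose_hits` picks up the unrelated `permit 200` entry, while `anchored_hits` contains only the `permit 20` and `deny 30` entries.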

**- How to verify it**
1. Add the following entry to the FRR config:
```
str-s6100-acs-1# 
str-s6100-acs-1# conf t
str-s6100-acs-1(config)# route-map TO_BGP_PEER_V4 permit 200 
str-s6100-acs-1(config-route-map)# end
```
2. Use the TSC command and check output. It should show normal.
```
admin@str-s6100-acs-1:~$ TSC
Traffic Shift Check:
System Mode: Normal
```
2021-01-14 11:09:16 -08:00
Joe LeVeque
419c10bf97
[sonic-platform-common] Enable pytest during build for Python 3 package (#6442)
**- Why I did it**

To enable running Pytest unit tests before building the Python 3 sonic-platform-common package

**- How I did it**

- Add Python 3 sonic-config-engine package as a dependency of Python 3 sonic-platform-common package (needed for both runtime and unit tests)
- No longer disable unit tests when building Python 3 sonic-platform-common package
2021-01-14 10:26:15 -08:00
Joe LeVeque
c141bb90e9
Remove things needed for building Python 3 from source (#6441)
**- Why I did it**

Prior to SONiC using Debian Buster, we needed to build Python 3.5 or newer from source for installation in the SNMP container, because it wasn't available from the Debian repository for Jessie or Stretch. Now that all containers are based on Buster, we simply install Python 3.7 from the Debian repository in the host as well as all containers. We are no longer building Python 3 from source, so the Makefile is unused and we no longer need to install build dependencies in the slave containers.

**- How I did it**

- Remove Python 3 makefile
- No longer install Python 3 build dependencies in the slave containers.
2021-01-14 10:25:40 -08:00
Stepan Blyshchak
9a1b42ff6b
[snmpagent] update submodule (#6169)
Includes below sonic-snmpagent commits
```
dfde06e 2021-01-13 | Revert "[rfc1213] Interface MIB add l3 vlan interfaces & aggregate rif counters (#169)" (#191) [Stepan Blyshchak]
45edd7e 2021-01-04 | [snmpagent] Fix hardcoded qsfp lane count by reading sensor status from DB (#184) [Junchao-Mellanox]
3b72a6f 2021-01-02 | Fix: handle empty LOC_CHASSIS_TABLE (#190) [Qi Luo]
4aad821 2020-12-29 | [sysName]: Implement sysName OID (#185) [SuvarnaMeenakshi]
8efb4bb 2020-12-29 | [rfc1213] fix counter value type (#189) [Stepan Blyshchak]
025483a 2020-12-23 | [RouteUpdater]: Fix multi_asic mock function implementation and multi_asic variable name (#186) [SuvarnaMeenakshi]
381ae47 2020-12-10 | [mibs] b'VLAN_TABLE:' -> 'VLAN_TABLE' (#181) [Stepan Blyshchak]
e54036c 2020-12-09 | [rfc1213] Interface MIB add l3 vlan interfaces & aggregate rif counters (#169) [Stepan Blyshchak]
fd1eae7 2020-11-24 | Set swsscommon logging level (#178) [Qi Luo]
706d504 2020-11-23 | Improve MockRedis _encode(): so it will work on all types of value (#179) [Qi Luo]
64c93a1 2020-11-16 | [RFC4292][Namespace]: Fix implementation of RouteUpdater for multi-asic platform (#176) [SuvarnaMeenakshi]
b8f19ee 2020-11-12 | [sonic-snmpagent] SONiC physical entity mib extension (#168) [Junchao-Mellanox]
6b94ec3 2020-11-05 | Replace swsssdk.SonicV2Connector with swsscommon.SonicV2Connector (SWIG wrapper of C++ implementation) in production code (#162) [Qi Luo]
```

Signed-off-by: Stepan Blyshchak <stepanb@nvidia.com>
Co-authored-by: Qi Luo <qiluo-msft@users.noreply.github.com>
2021-01-14 00:34:34 -08:00
Joe LeVeque
4612f680e6
[swss] Depend on Python 3 version of swsscommon rather than Python 2 (#6432)
The only Python code in the SwSS package, restore_neighbors.py, was recently converted to Python 3 and most dependencies were updated as part of #6207. However, the SwSS makefile still listed the Python 2 version of the swsscommon package as a dependency. This caused Python 2-related packages to be installed in containers unnecessarily.
2021-01-14 00:29:21 -08:00
Kebo Liu
86553342eb
[sonic-linux-kernel]: Update sonic-linux-kernel submodule (#6433)
Update sonic-linux-kernel pointer to pick up new commits:

- Backport patches to increase critical threshold for ASIC and validate transceiver temperature a7c1af7c44edde90dff49d672071139043bcdb65  548e8e0be4
- [ci]: Set up CI with Azure Pipelines   548e8e0be49692050ea4071d5e9945816bc5aacc a7c1af7c44

Signed-off-by: Kebo Liu <kebol@nvidia.com>
2021-01-13 15:18:05 -08:00
Kalimuthu-Velappan
18350a5dd9
[build]: Fix for missing dependencies in the DPKG framework (#6393)
1. Fixes the missing DPKG file for the gbsyncd-vs package
2. Fixes the softlink issue in the Platform-common and ztp packages
3. Fixes the missing PYTHON_DEBS list for DBG dockers.
2021-01-13 10:32:42 -08:00
lguohan
1c00145813
[ci]: cleanup fsroot reliably (#6431)
Signed-off-by: Guohan Lu <lguohan@gmail.com>
2021-01-13 10:30:49 -08:00
Vaibhav Hemant Dixit
8b4b146760
[sonic-swss-common] Advance swss-common head to include General Protection error fix (#6436)
To include the fix for the kernel GP fault seen in *syncd processes:
Azure/sonic-swss-common#444
2021-01-13 10:18:17 -08:00
Junchao-Mellanox
0a49edb68e
[Mellanox] Fix issue: need import initialize_sdk_handle in get_sdk_handle (#6435)
test_sfp.py failed because a method was used without being imported.
2021-01-13 09:42:04 -08:00
Samuel Angebault
68e9b83f3e
Update swi-tools in buster Dockerfile (#6414)
Fixed swi-tools code to work with `python3`
Updated the version of swi-tools downloaded by the `sonic-slave-buster/Dockerfile.j2`
Other Dockerfiles still use the `python2` version, though swi-tools is not used within the stretch builder.
2021-01-13 08:32:22 -08:00
Vadym Hlushko
b56320ce56
[SN4410] fixed 'port_config.ini' (#6316)
Signed-off-by: Vadym Hlushko <vadymh@nvidia.com>
2021-01-13 15:45:05 +02:00
xumia
1dcab4d1e3
Fix py3 version changed even version control enabled issue (#6422)
* Fix py3 version changed even version control enabled issue

* Add some comments and simplify the script

* Add the comment to explain how to get the not hooked command
2021-01-13 18:40:39 +08:00
dflynn-Nokia
674fac21c1
[Nokia ixs7215] Add SW assist for platform entropy & fix inband mgmt support (#6417)
- Improve random number generation during early Sonic initialization by providing SW updates to Linux entropy value.
- Improve handling of platform In-Band management port

This commit provides the following updates to the Nokia ixs7215 platform

1. The Marvell Armada-38x SOC requires SW assistance to improve the system
   entropy value available early on in the Sonic boot sequence.
2. The Nokia ixs7215 platform does not have a dedicated Out-Of-Band (OOB) mgmt
   port and thus requires additional logic to optionally support configuring
   front panel port 48 as an In-Band mgmt port. This commit provides additional
   logic to manage and maintain the operation of this In-Band mgmt port.
2021-01-12 16:59:42 -08:00
carl-nokia
380edf054d
[Platform][nokia]: python3-smbus package add with python3 and jinja fixes (#6416)
fix platform driver breakage due to python3 upgrade and fix load minigraph errors with config load_minigraph -y

**- How I did it**
added python3-smbus to the pmon docker template since the previous was python2 specific 
fixed additional "ord" python2 specific code 
fixed the jinja templates used by qos reload - the template logic required data to be parsed 

**- How to verify it**
run "show platform XXX" commands and verify output
run "sudo config load_minigraph -y" and verify configuration 
run "show interfaces XXX" and verify output 

Co-authored-by: Carl Keene <keene@nokia.com>
2021-01-12 15:05:06 -08:00
guxianghong
d4f9fa56aa
[Centec] upgrade to buster docker for DOCKER_SYNCD_CENTEC_RPC, docker-saiserver-centec and platform-modules (#6423)
Centec syncd has been upgraded to buster, so docker-syncd-centec-rpc no longer needs to generate a stretch-based docker.

Co-authored-by: Xianghong Gu <xgu@centecnetworks.com>
2021-01-12 12:36:10 -08:00
Kebo Liu
015b421e5e
[Mellanox] [platform API] Fix “local variable 'label_port' referenced before assignment” error (#6419)
In rare cases xcvrd fails with "UnboundLocalError: local variable 'label_port' referenced before assignment".

Initialize "label_port" to None at the beginning of the function, to cover the case where "label_port" is never assigned.
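
The failure mode and the fix can be reproduced with a minimal Python sketch. The function names and the event structure are hypothetical; only the `label_port` variable name comes from the error message above.

```python
def find_label_port_buggy(events):
    # label_port is bound only inside the loop body, so an empty
    # event list leaves it unassigned when the return is reached.
    for event in events:
        label_port = event["port"]
    return label_port


def find_label_port_fixed(events):
    # Binding the name up front guarantees it exists on every path.
    label_port = None
    for event in events:
        label_port = event["port"]
    return label_port
```

Calling `find_label_port_buggy([])` raises UnboundLocalError, while `find_label_port_fixed([])` returns None, which the caller can test for.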
2021-01-12 10:43:57 -08:00
jostar-yang
bbd6967c82
[as7326-54x] Remove not need executable flag (#6326)
Remove executable bit from the service files
2021-01-12 10:40:51 -08:00
gechiang
26fd52780d
Anchor the libprotobuf-dev version based on a fixed version by using debian control dependency (#6420)
2021-01-12 09:51:15 -08:00
Lawrence Lee
6e63ecfa1d
[minigraph.py]: Force /32 prefix for mux cable server IPv4 loopbacks (#6418)
Server IPv4 loopbacks do not always arrive with /32 prefix, which is a requirement for the MUX_CABLE table in config DB
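
A minimal sketch of the normalization, using the standard `ipaddress` module. The helper name `force_host_prefix` is an assumption for illustration and is not the function used in minigraph.py.

```python
import ipaddress


def force_host_prefix(address):
    """Return the IPv4 address with a /32 prefix, replacing any existing prefix."""
    host = address.split("/")[0]
    # IPv4Interface validates the address and renders it as "a.b.c.d/32".
    return str(ipaddress.IPv4Interface(host + "/32"))


bare = force_host_prefix("10.1.0.32")
masked = force_host_prefix("10.1.0.32/24")
```

Both a bare address and an address that arrives with a different prefix length come out in the /32 form that the MUX_CABLE table requires.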

Signed-off-by: Lawrence Lee <lawlee@microsoft.com>
2021-01-12 06:04:48 -08:00
Qi Luo
8b82408ba4
[sonic-slave]: Upgrade python lxml library version to 4.6.2 (#6404)
2021-01-12 06:03:59 -08:00
lguohan
ab2ae41212
[build]: fix dpkg admindir corruption issue in parallel build (#6408)
Fix #119

When parallel build is enabled, multiple dpkg-buildpackage
instances run at the same time. /var/lib/dpkg is shared
by all instances, so /var/lib/dpkg/updates can be corrupted
and cause the build to fail.

The fix is to use an overlay fs to mount a separate /var/lib/dpkg
for each dpkg-buildpackage instance so that they do not affect
each other.

Signed-off-by: Guohan Lu <lguohan@gmail.com>
2021-01-12 06:03:12 -08:00
lguohan
6ff8d2cb10
[ci]: add mellanox build to azure-pipeline (#6409)
Signed-off-by: Guohan Lu <lguohan@gmail.com>
2021-01-12 06:01:59 -08:00
xumia
4ef3f1376f
[arista]: Fix web package md5 hash not correct issue (#6411)
The follow-redirect option -L must be added when downloading a package whose URL redirects.
2021-01-11 10:33:08 -08:00
Lawrence Lee
3dd993e019
[minigraph.py]: Add peer switch hostname to device metadata (#6405)
To make the peer switch hostname easily accessible from config DB, add a peer_switch field to the DEVICE_METADATA table.

Signed-off-by: Lawrence Lee <lawlee@microsoft.com>
2021-01-11 10:18:39 -08:00
Ying Xie
281651510c
[utilities] advance utilities submodule head (#6402)
- (HEAD, github/master) [storyteller] adding a grep wrapper with predefined scenarios (#1349)
- Adding global-timeout, individual command timeout, log files collection (#1249)
- Add FW dump with new SAI implementation (#1338)
- [unit test][pfcwd] Fix tests that require sudo access (#1340)

Signed-off-by: Ying Xie <ying.xie@microsoft.com>
2021-01-11 08:13:04 -08:00
Ze Gan
c22575218a
[docker-macsec]: MACsec container and wpa_supplicant component (#5700)
The HLD for the MACsec feature is at:

https://github.com/Azure/SONiC/blob/master/doc/macsec/MACsec_hld.md

- How to verify it
This PR does not set the MACsec container to start automatically. Start the container manually with docker run docker-macsec.
The wpa_supplicant binary can be found in the MACsec container.
This PR depends on the WPA_SUPPLICANT PR. The MACsec container will be set to start automatically by a later PR.

Signed-off-by: zegan <zegan@microsoft.com>
2021-01-10 10:39:59 -08:00
Samuel Angebault
1498408ce7
[Arista] Update driver submodules (#6396)
- Cleanup and Refactor of library internals, logic mostly unchanged.
- Enhance debuggability with `arista dump` and `arista diag` commands.
- Fix power supply detection issue.
2021-01-10 07:42:56 -08:00
guxianghong
c64052bb28
[Centec ARM64]Upgrade Centec syncd docker to buster and Enable Telemetry on ARM64 (#6386)
* Enable telemetry for ARM64 by default

* [Centec] Upgrade Centec syncd docker to buster; libjemalloc2 is already installed in docker-base-buster, so remove libjemalloc1 from docker-syncd-centec's Dockerfile.j2

Co-authored-by: Gu Xianghong <xgu@centecnetworks.com>
2021-01-09 08:07:30 -08:00
dependabot[bot]
72b635083d
Bump lxml from 4.6.1 to 4.6.2 in /src/sonic-config-engine (#6385)
Bumps [lxml](https://github.com/lxml/lxml) from 4.6.1 to 4.6.2.
- [Release notes](https://github.com/lxml/lxml/releases)
- [Changelog](https://github.com/lxml/lxml/blob/master/CHANGES.txt)
- [Commits](https://github.com/lxml/lxml/compare/lxml-4.6.1...lxml-4.6.2)

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2021-01-08 15:47:29 -08:00
Qi Luo
5b7f88abf3
[sonic-swss-common] Update submodule (#6382)
Includes sonic-swss-common commits:
```
71dc350 2021-01-07 | Lower the log level for outdated key for SubscriberStateTable notification (#441) [Qi Luo]
7e40582 2021-01-08 | Add boost dependencies (#442) [Ze Gan]
30a8ddf 2021-01-05 | Change DBConnector::hgetall return type from map to unordered_map (#440) [Qi Luo]
021108d 2021-01-02 | MCLAG Enhancements per HLD https://github.com/Azure/SONiC/pull/596 (#405) [Praveen-Brcm]
54996fc 2021-01-02 | Implement ConfigDBConnector and ConfigDBPipeConnector in C++ (#437) [Qi Luo]
8286525 2020-12-27 | Simply refactor DBConnector hgetall() [Qi Luo]
6d1d33b 2020-12-27 | Fix RedisTransactioner: handle empty deque [Qi Luo]
624e0b8 2020-12-26 | Move complex class constructor as explicit, and fix several mistaken copy constructor usage [Qi Luo]
3b983f9 2020-12-30 | [ci]: add timeout to 180 minutes for arm build (#439) [lguohan]
f2e4210 2020-12-29 | Add utility for string and redis (#434) [Ze Gan]
7a885fd 2020-12-29 | [build]: add build check for arm64 and armhf (#436) [lguohan]
47bccc4 2020-12-24 | Add missed vector header to rediscommand.h (#435) [Ze Gan]
```
2021-01-08 14:13:25 -08:00
pavel-shirshov
83715cfc49
[bgpcfgd]: Support default action for "Allow prefix" feature (#6370)
* Use 20 and 30 route-map entries instead of 2 and 3 for TSA

* Added support for dynamic "Allow list" default action.

Co-authored-by: Pavel Shirshov <pavel.contrib@gmail.com>
2021-01-08 14:03:26 -08:00
yozhao101
04cd1d61e8
[Monit] Monitoring the running status of containers. (#6251)
**- Why I did it**
This PR aims to monitor the running status of each container. Currently the auto-restart feature is enabled: if a critical process exits unexpectedly, the container is restarted. If the container is restarted 3 times within 20 minutes, it will not run anymore unless we manually clear the flag using the command `sudo systemctl reset-failed <container_name>`.

**- How I did it**
We will employ Monit to monitor a script. This script will generate the expected running container list and compare it with the current running containers. If there are containers which were expected to run but were not running, then an alerting message will be written into syslog.
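
The comparison step can be sketched as below. This is an illustration only: the function names, the container lists, and the message wording are assumptions, and the actual script builds its lists from the device configuration and the container runtime.

```python
def find_missing_containers(expected, running):
    """Return containers that are expected to run but are not running."""
    return sorted(set(expected) - set(running))


def build_alert(missing):
    """Return an alert message for syslog, or None when nothing is missing."""
    if not missing:
        return None
    return "Expected containers not running: " + ", ".join(missing)


# Hypothetical container lists for illustration.
expected = ["bgp", "pmon", "snmp", "swss", "syncd"]
running = ["bgp", "pmon", "swss"]

alert = build_alert(find_missing_containers(expected, running))
```

A set difference keeps the check independent of ordering, and returning None for the healthy case lets the caller skip logging entirely.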

**- How to verify it**
I tested this feature on the lab device `str-a7050-acs-3`, which has a single ASIC, and on `str2-n3164-acs-3`, which is a multi-ASIC device. First I manually stopped a container by running the command `sudo systemctl stop <container_name>`, then I checked whether there was an alerting message in the syslog.

Signed-off-by: Yong Zhao <yozhao@microsoft.com>
2021-01-07 19:52:22 -08:00
lguohan
fc3cb76dcf
[ci]: Set up CI with Azure Pipelines (#6384)
[skip ci]
2021-01-07 19:50:11 -08:00
Renuka Manavalan
dbc6718408
Take a copy of existing TACACS credentials and restore it during upgrade (#6285)
In a scenario where the upgrade gets its config from the minigraph, the TACACS credentials can be lost because they are not in the minigraph. Hence restore them explicitly upon load-minigraph, if present.

- Why I did it
Upon boot, when config migration is required, the switch could load its config from the minigraph. Loading the config from the minigraph would wipe the TACACS key and disable login via TACACS, which would disable all remote user access. This change reconfigures TACACS if a saved copy is available.

- How I did it
When config is loaded from minigraph, look for a TACACS credentials back up (tacacs.json) under /etc/sonic/old_config. If present, load the credentials into running config, before config-save is called.

- How to verify it
Remove /etc/sonic/config_db.json and do an image update. Upon reboot, without this change, you will not be able to ssh in as a remote user. You can log in as admin and run "show tacacs" and "show aaa" to verify that the tacacs-key is missing and that login is not enabled for tacacs.
With this change applied, remove /etc/sonic/config_db.json, but save the tacacs and aaa credentials as tacacs.json in /etc/sonic/. Upon reboot, remote user access should be possible.
2021-01-07 16:45:38 -08:00
Joe LeVeque
e52581e919
[PDDF] Build and install Python 3 package (#6286)
- Make PDDF code compliant with both Python 2 and Python 3
- Align code with PEP8 standards using autopep8
- Build and install both Python 2 and Python 3 PDDF packages
2021-01-07 10:03:29 -08:00
Danny Allen
0ad2098402
[README] Update build badges to include 202012 build status (#6373)
Signed-off-by: Danny Allen <daall@microsoft.com>
2021-01-07 10:02:39 -08:00
Joe LeVeque
2d77a36658
[system-health] Make run_command() Python 3-compliant (#6371)
Pass universal_newlines=True parameter to subprocess.Popen(); no longer use .encode('utf-8') on resulting stdout.
This was missed in #5886

Note: I would prefer to use text=True instead of universal_newlines=True, as the former is an alias only available in Python 3 and is more understandable than the latter. However, even though the setup.py file for this package only specifies Python 3, the LGTM tool finds other Python 2 code in the repo, validates the code as Python 2, and alerts that text=True is an invalid parameter. I will stick with universal_newlines=True for now. Once all Python code in the repo has been converted to Python 3, I will change all universal_newlines=True to text=True.
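
A minimal sketch of the pattern described above. The signature of `run_command` here is an assumption for illustration and may differ from the one in the system-health package; the child command simply runs the current Python interpreter.

```python
import subprocess
import sys


def run_command(command):
    """Run a command and return its stdout as text (str), not bytes."""
    process = subprocess.Popen(
        command,
        stdout=subprocess.PIPE,
        universal_newlines=True,  # decode output to str; text=True is the Python 3.7+ alias
    )
    stdout, _ = process.communicate()
    return stdout


output = run_command([sys.executable, "-c", "print('ok')"])
```

Because the pipe is opened in text mode, `output` is already a str and no `.encode()` or `.decode()` call is needed on the result.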
2021-01-07 05:48:13 -08:00
gechiang
a6907a7c62
[brcm]: BRCM SAI 4.2.1.5-9 Fix _brcm_sai_indexed_data_get () with unexpected queue causing _brcm_sai_switch_assert () after warm reboot (#6374)
Starting with master build 176, warm reboot on the BRCM platform began to hit a syncd crash. Further debugging by Ying determined that the crash was related to the following new change:
[Dynamic buffer calc] Support dynamic buffer calculation (#1338)

Ying debugged further and found that the crash was caused by the buffer pool profile setting operation SAI_BUFFER_PROFILE_ATTR_SHARED_DYNAMIC_TH.

A case has been filed with BRCM. In the meantime, Ying tried a potential fix that appears to address the issue, and we are making this change available in the master branch to allow further feature validation and testing, especially in the warm reboot area.
Once BRCM provides an official fix, we will remove this in-house fix and apply the official one.

- How to verify it
Perform a warm reboot with any master build 175 or above and you should see this issue. Issuing the following command also causes the crash: "mmuconfig -p egress_lossy_profile -a 0"
2021-01-07 05:46:48 -08:00