2022-07-28 854d54e: Add support of mdio IPC server class using sai switch api and unix socket (sonic-net/sonic-sairedis#1080) (Jiahua Wang)
2022-07-27 513cb2a: [FlexCounter] Refactor FlexCounter class (sonic-net/sonic-sairedis#1073) (Junchao-Mellanox)
Why I did it
The directory /var/warmboot as top directory for warmboot feature is also needed in docker gbsyncd. Some vendor SAI might save data under it. Without it, the SAI init/creation API failure has happened on PikeZ platform.
How I did it
Mount host directory /host/warmboot as /var/warmboot in docker gbsyncd, which is same as what it has done on docker syncd.
Why I did it
S5296F - Platform API 2.0 changes
How I did it
Implemented the functional API's needed for Platform API 2.0
How to verify it
Used the API 2.0 test suite to validate the test cases.
Why I did it
VoQ chassis supervisor will have Fabric asics and the sub_role for fabric asics will be "Fabric".
The fabric asics namespaces are not being returned in get_all_namespaces() and is required in caclmgrd to add right cacl to allow internal docker traffic from fabric asic namespaces.
test_cacl_application fails on VoQ chassis Supervisor with the error:
Failed: Missing expected iptables rules: set(['-A INPUT -s 240.127.1.1/32 -d 240.127.1.1/32 -j ACCEPT', '-A INPUT -s 240.127.1.3/32 -d 240.127.1.1/32 -j ACCEPT', '-A INPUT -s 240.127.1.2/32 -d 240.127.1.1/32 -j ACCEPT'])
How I did it
Update get_all_namespaces to return fabric namespaces list.
How to verify it
Verified on VoQ chassis.
Why I did it
Address issue #10966
sign-off: Jing Zhang zhangjing@microsoft.com
How I did it
Add sonic-peer-switch.yang and unit tests.
How to verify it
Compile Compile target/python-wheels/sonic_yang_mgmt-1.0-py3-none-any.whl and target/python-wheels/sonic_yang_models-1.0-py3-none-any.whl.
Which release branch to backport (provide reason below if selected)
201811
201911
202006
202012
202106
202111
202205
Description for the changelog
Link to config_db schema for YANG module changes
b721ff87b9/src/sonic-yang-models/doc/Configuration.md (peer-switch)
#### Why I did it
This fixed memory leak in ETHERLIKE-MIB. The fix is not part of net-snmp(5.7.3 version). This PR includes the patch to fix memory leak issue.
```
ke->name in stdup-ed at line 297: n->name = strdup(RTA_DATA(tb[IFLA_IFNAME]));
```
#### How I did it
patched the fix.
[net-snmp] upstream fix link -> [snmpd}upstream link](ed4e48b5fa)
#### How to verify it
**Before The fix**
used valgrind to find memory leak.
```
root@lnos-x1-a-csw06:/# grep "definitely lost" valgrind-out.txt
==493== 4 bytes in 1 blocks are definitely lost in loss record 1 of 333
==493== 16 bytes in 1 blocks are definitely lost in loss record 25 of 333
==493== 757 bytes in 71 blocks are definitely lost in loss record 214 of 333
==493== 1,168 (32 direct, 1,136 indirect) bytes in 1 blocks are definitely lost in loss record 293 of 333
==493== 1,168 (32 direct, 1,136 indirect) bytes in 1 blocks are definitely lost in loss record 294 of 333
==493== 1,168 (32 direct, 1,136 indirect) bytes in 1 blocks are definitely lost in loss record 295 of 333
==493== 1,168 (32 direct, 1,136 indirect) bytes in 1 blocks are definitely lost in loss record 296 of 333
==493== definitely lost: 905 bytes in 77 blocks
```
_we can see the memory leak see in stack trace._
-> dot3stats_linux -> get_nlmsg -> strdup
https://github.com/net-snmp/net-snmp/blob/v5.7.3/agent/mibgroup/etherlike-mib/data_access/dot3stats_linux.chttps://github.com/net-snmp/net-snmp/blob/v5.7.3/agent/mibgroup/etherlike-mib/data_access/dot3stats_linux.c#L277
```
n = malloc(sizeof(*n));
memset(n, 0, sizeof(*n));
n->ifindex = ifi->ifi_index;
n->name = strdup(RTA_DATA(tb[IFLA_IFNAME]));
memcpy(&n->stats, RTA_DATA(tb[IFLA_STATS]), sizeof(n->stats));
n->next = kern_db;
kern_db = n;
return 0;
```
we were not freeing space for EtherLike-MIB.AS interface mib queries were getting increased, we see memory increment.
```
kern_db = ke->next;
free(ke);
```
https://github.com/net-snmp/net-snmp/blob/v5.7.3/agent/mibgroup/etherlike-mib/data_access/dot3stats_linux.c#L467
```
==55== 757 bytes in 71 blocks are definitely lost in loss record 186 of 299
==55== at 0x483577F: malloc (in /usr/lib/x86_64-linux-gnu/valgrind/vgpreload_memcheck-amd64-linux.so)
==55== by 0x4EB6E49: strdup (strdup.c:42)
==55== by 0x493F278: get_nlmsg (dot3stats_linux.c:299)
==55== by 0x493F529: rtnl_dump_filter_l.constprop.3 (dot3stats_linux.c:370)
==55== by 0x493FD7A: rtnl_dump_filter (dot3stats_linux.c:401)
==55== by 0x493FD7A: _dot3Stats_netlink_get_errorcntrs (dot3stats_linux.c:424)
==55== by 0x494009F: interface_dot3stats_get_errorcounters (dot3stats_linux.c:530)
==55== by 0x48F6FDA: dot3StatsTable_container_load (dot3StatsTable_data_access.c:330)
==55== by 0x485E76B: _cache_load (cache_handler.c:700)
==55== by 0x485FA37: netsnmp_cache_helper_handler (cache_handler.c:638)
==55== by 0x48720BC: netsnmp_call_handler (agent_handler.c:526)
==55== by 0x48720BC: netsnmp_call_next_handler (agent_handler.c:640)
==55== by 0x4865F75: table_helper_handler (table.c:717)
==55== by 0x4871B66: netsnmp_call_handler (agent_handler.c:526)
==55== by 0x4871B66: netsnmp_call_handlers (agent_handler.c:611)
757 bytes in 71 blocks are definitely lost in loss record 214 of 333
==493== at 0x483577F: malloc (in /usr/lib/x86_64-linux-gnu/valgrind/vgpreload_memcheck-amd64-linux.so)
==493== by 0x4EB6E49: strdup (strdup.c:42)
==493== by 0x493F278: ??? (in /usr/lib/x86_64-linux-gnu/libnetsnmpmibs.so.30.0.3)
==493== by 0x493F529: ??? (in /usr/lib/x86_64-linux-gnu/libnetsnmpmibs.so.30.0.3)
==493== by 0x493FD7A: _dot3Stats_netlink_get_errorcntrs (in /usr/lib/x86_64-linux-gnu/libnetsnmpmibs.so.30.0.3)
==493== by 0x494009F: interface_dot3stats_get_errorcounters (in /usr/lib/x86_64-linux-gnu/libnetsnmpmibs.so.30.0.3)
==493== by 0x48F6FDA: dot3StatsTable_container_load (in /usr/lib/x86_64-linux-gnu/libnetsnmpmibs.so.30.0.3)
==493== by 0x485E76B: _cache_load (cache_handler.c:700)
==493== by 0x485FA37: netsnmp_cache_helper_handler (cache_handler.c:638)
==493== by 0x48720BC: netsnmp_call_handler (agent_handler.c:526)
==493== by 0x48720BC: netsnmp_call_next_handler (agent_handler.c:640)
==493== by 0x4865F75: table_helper_handler (table.c:717)
==493== by 0x4871B66: netsnmp_call_handler (agent_handler.c:526)
==493== by 0x4871B66: netsnmp_call_handlers (agent_handler.c:611)
```
```
**After The fix**
no memory leak in valgrind stack trace related to etherlike MIB.
```
#### Why I did it
To deprecate swsssdk, remove all dependency to it.
#### How I did it
Remove swsssdk from rules and build image scripts.
#### How to verify it
Pass all UT and E2E test case
#### Which release branch to backport (provide reason below if selected)
<!--
- Note we only backport fixes to a release branch, *not* features!
- Please also provide a reason for the backporting below.
- e.g.
- [x] 202006
-->
- [ ] 201811
- [ ] 201911
- [ ] 202006
- [ ] 202012
- [ ] 202106
- [ ] 202111
- [ ] 202205
#### Description for the changelog
Remove swsssdk from rules and build image scripts.
#### Link to config_db schema for YANG module changes
<!--
Provide a link to config_db schema for the table for which YANG model
is defined
Link should point to correct section on https://github.com/Azure/sonic-buildimage/blob/master/src/sonic-yang-models/doc/Configuration.md
-->
#### A picture of a cute animal (not mandatory but encouraged)
- Pass TARGET_BOOTLOADER variable value to slave build infra
#### Why I did it
The TARGET_BOOTLOADER is always blank when referred to in the Makefiles which are executed inside the slave build container.
#### How I did it
Pass it on the make command invoking slave.mk explicitly similar to other environment variables.
#### How to verify it
kdump-tools package is installed on sonic-broadcom.bin image.
With the Broadcom syncd containers getting upgraded to Bullseye, the DNX
RPC container is no longer automatically built. Explicitly add a make
command to build it.
Signed-off-by: Saikrishna Arcot <sarcot@microsoft.com>
Signed-off-by: Saikrishna Arcot <sarcot@microsoft.com>
- Why I did it
This new breakout mode is required when a QSFP cable is used on the QSFP-DD supported 4700 port. since QSFP only uses the first 4 lanes, this mode is required to restrict the child ports to only use the first four lanes
- How I did it
Updated the platfrom.json file with the extended data
- How to verify it
Tested on one port:
root@msn-4700:/home/admin# show int status
Interface Lanes Speed MTU FEC Alias Vlan Oper Admin Type Asym PFC
----------- ------------------------------- ------- ----- ----- ------- ------ ------ ------- ----------------------------------------------- ----------
Ethernet0 0 25G 9100 N/A etp1a routed up up QSFP28 or later N/A
Ethernet1 1 25G 9100 N/A etp1b routed down up N/A N/A
Ethernet2 2 25G 9100 N/A etp1c routed down up N/A N/A
Ethernet3 3 25G 9100 N/A etp1d routed down up N/A N/A
Signed-off-by: Vivek Reddy Karri <vkarri@nvidia.com>
- Why I did it
To support saithriftv2 build for bullseye dockers
- How I did it
Added the dependencies documented in the SAI docs and used in sonic-slave-buster
- How to verify it
Build saithriftv2 in the sonic-slave-bullseye
Signed-off-by: Yakiv Huryk <yhuryk@nvidia.com>
- Why I did it
Profiled the execution for the following cmd intfutil -c status
- How I did it
Cached the following information:
1. get_sonic_version_info()
2. get_platform_info()
None of the API exposed to the user libraries (for eg: sonic-utilities) has been modified
These methods involve reading text files or from redis. Thus, caching helped to improve the execution time
- How to verify it
Added UT's.
Verified on the device
Signed-off-by: Vivek Reddy Karri <vkarri@nvidia.com>
- Why I did it
Update SDK/FW version - 4.5.2318/2010_2318 to pick up new fixes:
1. Cr space timeout on Hold and Release GW - at warm boot
2. Spectrum Port in stuck PHY_UP after peer side rebooted
3. Memory leak in sx_api_router_ecmp_update_set
- How I did it
Update the make file with the new version number
Update submodule Switch-SDK-drivers pointer
- How to verify it
Run sonic regression
Signed-off-by: Kebo Liu <kebol@nvidia.com>
#### Why I did it
Update scripts in sonic-buildimage from py-swsssdk to swsscommon
#### How I did it
Replace swsssdk with swsscommon in alphanetworks devices code.
#### How to verify it
Pass all E2E test case
#### Which release branch to backport (provide reason below if selected)
<!--
- Note we only backport fixes to a release branch, *not* features!
- Please also provide a reason for the backporting below.
- e.g.
- [x] 202006
-->
- [ ] 201811
- [ ] 201911
- [ ] 202006
- [ ] 202012
- [ ] 202106
- [ ] 202111
- [ ] 202205
#### Description for the changelog
Replace swsssdk with swsscommon in alphanetworks devices code.
#### Link to config_db schema for YANG module changes
<!--
Provide a link to config_db schema for the table for which YANG model
is defined
Link should point to correct section on https://github.com/Azure/sonic-buildimage/blob/master/src/sonic-yang-models/doc/Configuration.md
-->
#### A picture of a cute animal (not mandatory but encouraged)
#### Why I did it
Update scripts in sonic-buildimage from py-swsssdk to swsscommon
#### How I did it
Replace swsssdk with swsscommon in centec devices.
#### How to verify it
Pass all E2E test case
#### Which release branch to backport (provide reason below if selected)
<!--
- Note we only backport fixes to a release branch, *not* features!
- Please also provide a reason for the backporting below.
- e.g.
- [x] 202006
-->
- [ ] 201811
- [ ] 201911
- [ ] 202006
- [ ] 202012
- [ ] 202106
- [ ] 202111
- [ ] 202205
#### Description for the changelog
Replace swsssdk with swsscommon in centec devices.
#### Link to config_db schema for YANG module changes
<!--
Provide a link to config_db schema for the table for which YANG model
is defined
Link should point to correct section on https://github.com/Azure/sonic-buildimage/blob/master/src/sonic-yang-models/doc/Configuration.md
-->
#### A picture of a cute animal (not mandatory but encouraged)
#### Why I did it
Update scripts in sonic-buildimage from py-swsssdk to swsscommon
#### How I did it
Remove unused swsssdk import from accton device code
#### How to verify it
Pass all E2E test case
#### Which release branch to backport (provide reason below if selected)
<!--
- Note we only backport fixes to a release branch, *not* features!
- Please also provide a reason for the backporting below.
- e.g.
- [x] 202006
-->
- [ ] 201811
- [ ] 201911
- [ ] 202006
- [ ] 202012
- [ ] 202106
- [ ] 202111
- [ ] 202205
#### Description for the changelog
Remove unused swsssdk import from accton device code
#### Link to config_db schema for YANG module changes
<!--
Provide a link to config_db schema for the table for which YANG model
is defined
Link should point to correct section on https://github.com/Azure/sonic-buildimage/blob/master/src/sonic-yang-models/doc/Configuration.md
-->
#### A picture of a cute animal (not mandatory but encouraged)
Why I did it
Currently the CLI commands show interface status show interface counters and show interface description displays Ethernet-IB and Ethernet-Rec ports in the output. These are internal ports should only be displayed when the option -d all is used for the above mentioned CLI commands
How I did it
Add the port roles Inb and Rec when classifing a port as internal port.
How to verify it
Verify the CLI output of the command show interface status doesnt display the Ethenet-IB and Ethernet-Rec port when -d all option in not present
Before
Signed-off-by: Arvindsrinivasan Lakshmi Narasimhan <arlakshm@microsoft.com>
Signed-off-by: maipbui maibui@microsoft.com
Why I did it
The xml.etree.ElementTree module is not secure against maliciously constructed data.
How I did it
Remove xml. Use lxml XML parsers package that prevent potentially malicious operation.
Why I did it
It solves a swss orchagent crash issue on PikeZ device, due to link-training setting of external PHY port.
How I did it
Catch up the fix for CS00012257483 in version 7.1.7.2.
Why I did it
On a supervisor card in a chassis, syncd/teamd/swss/lldp etc dockers are created for each Switch Fabric card. However, not all chassis would have all the switch fabric cards present. In this case, only dockers for Switch Fabrics present would be created.
The monit 'container_checker' fails in this scenario as it is expecting dockers for all Switch Fabrics (based on NUM_ASIC defined in asic.conf file).
Why I did it
Migrate FRR to bullseye
How I did it
Makefile and docker config changes to refer to bullseye instead of buster.
How to verify it
Build bullseye frr docker.
Co-authored-by: Rajendra Dendukuri <rajendra.dendukuri@broadcom.com>
Why I did it
Change the path of sonic submodules that point to "Azure" to point to "sonic-net"
How I did it
Replace "Azure" with "sonic-net" on all relevant paths of sonic submodules
* draft upgrade to deb11 of syncd and syncd-rpc
* upgrade to python3
* revert workaround with libsaithrift
* Provide urls for sai and platform debs
* Downgrade python3 to python2
* Remove saithrift-patches
* Upgrade modules
* remove unnecessary lib
* remove more unnecessary modules
* Update sdk reference
* remove unnecessary packages from syncd-rpc
* [snmpd]: Update to 5.9+dfsg-4+deb11u1 to match Debian version
This brings in some security fixes.
Signed-off-by: Saikrishna Arcot <sarcot@microsoft.com>
* Update snmpd makefile
Signed-off-by: Saikrishna Arcot <sarcot@microsoft.com>
* Remove binNMU for snmpd
Signed-off-by: Saikrishna Arcot <sarcot@microsoft.com>
Signed-off-by: Saikrishna Arcot <sarcot@microsoft.com>
* [Bgpcfgd] Enhance add_peer/add_peer_ipv6 unit tests
Why I did it
The current input to add_peer/add_peer_ipv6 is admin status change, update the UT to supply new peer information.
Current UT does not check for case when check_neig_meta is true, update UT to check for this case
How I did it
By changing the input to add_peer/add_peer_ipv6
By modifying load_constants/constructor to take constants path as an input, and add two UT that uses a version of constants.yml that sets check_neig_meta to true.
How to verify it
UT failing before the change, and passing after the change.
#### Why I did it
- Building `sonic-$PLATFORM.img.gz` fails if KVM support is not enabled.
- Repos have been transferred over from Azure to sonic-net domain
- sonic-net repos no longer use Microsoft CLA, so updated the README to point towards Linux foundation CLA
- p4 platform is no longer supported. Reference: https://github.com/sonic-net/sonic-buildimage/issues/2591#issuecomment-649425081
# Why I did it
platform-modules-belgite's deb requests linux-image-5.10.0-8-2-amd64-unsigned, which does not match the runtime kernel version
# How I did it
update the belgite's deb configuration in deb's control
# How to verify it
check the firsttime boot log in belgite platform
Co-authored-by: nicwu-cel <nicwu@celestica.com>
Signed-off-by: maipbui <maibui@microsoft.com>
<!--
Please make sure you've read and understood our contributing guidelines:
https://github.com/Azure/SONiC/blob/gh-pages/CONTRIBUTING.md
** Make sure all your commits include a signature generated with `git commit -s` **
If this is a bug fix, make sure your description includes "fixes #xxxx", or
"closes #xxxx" or "resolves #xxxx"
Please provide the following information:
-->
#### Why I did it
Replace unsafe functions to safe functions
#### How I did it
Replace `strtok()` by `strtok_r()`
#### How to verify it
#### Which release branch to backport (provide reason below if selected)
<!--
- Note we only backport fixes to a release branch, *not* features!
- Please also provide a reason for the backporting below.
- e.g.
- [x] 202006
-->
- [ ] 201811
- [ ] 201911
- [ ] 202006
- [ ] 202012
- [ ] 202106
- [ ] 202111
- [ ] 202205
#### Description for the changelog
<!--
Write a short (one line) summary that describes the changes in this
pull request for inclusion in the changelog:
-->
#### Link to config_db schema for YANG module changes
<!--
Provide a link to config_db schema for the table for which YANG model
is defined
Link should point to correct section on https://github.com/Azure/sonic-buildimage/blob/master/src/sonic-yang-models/doc/Configuration.md
-->
#### A picture of a cute animal (not mandatory but encouraged)
Why I did:
In case of multi-asic platforms gbsyncd is not getting added to Feature Table of Host Config DB. Without this container_checker complains of not needed gbsyncd container's are running.
How I did:
Update Both Host and Namespace config db when gbsyncd docker is starting.
How I verify:
Verified on Multi-asic platforms.
Fix#10549Fix#10550
#### Why I did it
Create sonic yang model for SNMP
Tables:SNMP, SNMP_COMMUNITY
#### How I did it
Defined yang models based for SNMP based on snmp.yml
#### How to verify it
Added test cases to verify
Port index 22 is associated with phy23_config.json, then same port index 22 in phy24_config.json may cause gearbox port creation error. Port Ethernet22 maps to index 23.
Why I did it
This PR is to update Yang model for pfc_enable and pfcwd_sw_enable fields to support more than 2 queues, like 2,3,4,6.
Before this change, the regex "[0-7](,[0-7])?" accepts only no more than 2 queues.
How I did it
Update the regex pattern for pfc_enable and pfcwd_sw_enable, from "[0-7](,[0-7])?" to "[0-7](,[0-7])*
How to verify it
The change is verified by UT. The test input is updated to cover the change.
collected 3 items
tests/test_sonic_yang_models.py .. [ 66%]
tests/yang_model_tests/test_yang_model.py .
* Update BRCM KNET module to support new psample definitions from sflow dropmon feature
* Update BRCM KNET module to support new psample definitions from sflow dropmon feature
* Advance saibcm-modules-dnx
Change `sxdkernel start` to `sxdkernel restart`. If `syncd` service crashes in `ExecStartPre` systemd will not call `ExecStop` and thus will not call `sxdkernel stop`. Use of `sxdkernel restart` is more robust in terms of guarantees to restore the system after unexpected crashes.
Signed-off-by: Stepan Blyschak <stepanb@nvidia.com>
Signed-off-by: Stepan Blyschak <stepanb@nvidia.com>
- Why I did it
SN5600 has an additional service interface with a different parameters than other interfaces.
- How I did it
Added the etp65 interface with the correct parameters.
- How to verify it
Run platform test on SN5600 platform.
Check the service port can startup correctly.