utilities:
* c63a62b 2023-01-23 | [muxcable][config] Add support to enable/disable ceasing to be an advertisement interface when `radv` service is stopped (#2622) (HEAD -> 202205) [Jing Zhang]
Signed-off-by: Ying Xie <ying.xie@microsoft.com>
Signed-off-by: Ying Xie <ying.xie@microsoft.com>
Why I did it
Fix issue caused by dualtor support PR [dhcpmon] Open different socket for dual tor to enable interface filtering #11201
Improve code
How I did it
On single ToR, packets received count was duplicated due to socket filter set to "inbound"
Tx count not increasing due to filter set to "inbound". Added an outbound socket to count tx packets
Added vlan member interface mapping for Ethernet interface to vlan interface lookup in reference to PR Fix multiple vlan issue sonic-dhcp-relay#27
Exit when socket fails to initialize to allow dhcp_relay docker to restart
How to verify it
Tested on vstestbed single tor and dual tor, sent packets and verify printed out dhcpmon rx and tx counters is correct
Correct number of tx increases
Tx does not increase when ToR is on standby
Why I did it
This PR is to update minigraph.py to support both port alias and port name as input of AttachTo attribute of ACL table.
Before this change, only port alias is supported.
How I did it
Add a global variable to store port names
Search both port names and port alias wheh parsing the value of AttachTo.
How to verify it
Verified by a new unit test case test_minigraph_acl_attach_to_ports
Verified by copying the new minigraph.py to a testbed and run conflg load_minigraph.
utilities:
* 3ebe948 2023-01-14 | [show] Add bgpraw to show run all (#2537) (HEAD -> 202205) [jingwenxie]
* 7979b9b 2022-12-05 | Transceiver eeprom dom CLI modification to show output from TRANSCEIVER_DOM_THRESHOLD table (#2535) [mihirpat1]
swss:
* 4ad82c5 2023-01-13 | Changed the BFD default detect multiplier to 10x (#2614) (HEAD -> 202205) [siqbal1986]
* 4fe7138 2023-01-12 | [MuxOrch] Enabling neighbor when adding in active state (#2601) [Nikola Dancejic]
sairedis:
* 2f6cbd3 2023-01-19 | Fix for [EVPN] When MAC moves from remote end point to local, ASIC DB fields are not updated properly for the mac #11503 (#1173) (github/202205) [anilkpan]
platform-daemon:
* 2851d86 2023-01-17 | Chassisd do an explicit stop of the config_manager (#328) (HEAD -> 202205) [judyjoseph]
platform-common:
* 2995989 2022-12-06 | Add get_transceiver_status and get_transceiver_pm to API interface (#315) (HEAD -> 202205) [longhuan-cisco]
Signed-off-by: Ying Xie <ying.xie@microsoft.com>
Signed-off-by: Ying Xie <ying.xie@microsoft.com>
#### Why I did it
This fixed memory leak in ETHERLIKE-MIB. The fix is not part of net-snmp(5.7.3 version). This PR includes the patch to fix memory leak issue.
```
ke->name in stdup-ed at line 297: n->name = strdup(RTA_DATA(tb[IFLA_IFNAME]));
```
#### How I did it
patched the fix.
[net-snmp] upstream fix link -> [snmpd|upstream link](ed4e48b5fa)
#### How to verify it
**Before The fix**
used valgrind to find memory leak.
```
root@lnos-x1-a-csw06:/# grep "definitely lost" valgrind-out.txt
==493== 4 bytes in 1 blocks are definitely lost in loss record 1 of 333
==493== 16 bytes in 1 blocks are definitely lost in loss record 25 of 333
==493== 757 bytes in 71 blocks are definitely lost in loss record 214 of 333
==493== 1,168 (32 direct, 1,136 indirect) bytes in 1 blocks are definitely lost in loss record 293 of 333
==493== 1,168 (32 direct, 1,136 indirect) bytes in 1 blocks are definitely lost in loss record 294 of 333
==493== 1,168 (32 direct, 1,136 indirect) bytes in 1 blocks are definitely lost in loss record 295 of 333
==493== 1,168 (32 direct, 1,136 indirect) bytes in 1 blocks are definitely lost in loss record 296 of 333
==493== definitely lost: 905 bytes in 77 blocks
```
_we can see the memory leak see in stack trace._
-> dot3stats_linux -> get_nlmsg -> strdup
https://github.com/net-snmp/net-snmp/blob/v5.7.3/agent/mibgroup/etherlike-mib/data_access/dot3stats_linux.chttps://github.com/net-snmp/net-snmp/blob/v5.7.3/agent/mibgroup/etherlike-mib/data_access/dot3stats_linux.c#L277
```
n = malloc(sizeof(*n));
memset(n, 0, sizeof(*n));
n->ifindex = ifi->ifi_index;
n->name = strdup(RTA_DATA(tb[IFLA_IFNAME]));
memcpy(&n->stats, RTA_DATA(tb[IFLA_STATS]), sizeof(n->stats));
n->next = kern_db;
kern_db = n;
return 0;
```
we were not freeing space for EtherLike-MIB.AS interface mib queries were getting increased, we see memory increment.
```
kern_db = ke->next;
free(ke);
```
https://github.com/net-snmp/net-snmp/blob/v5.7.3/agent/mibgroup/etherlike-mib/data_access/dot3stats_linux.c#L467
```
==55== 757 bytes in 71 blocks are definitely lost in loss record 186 of 299
==55== at 0x483577F: malloc (in /usr/lib/x86_64-linux-gnu/valgrind/vgpreload_memcheck-amd64-linux.so)
==55== by 0x4EB6E49: strdup (strdup.c:42)
==55== by 0x493F278: get_nlmsg (dot3stats_linux.c:299)
==55== by 0x493F529: rtnl_dump_filter_l.constprop.3 (dot3stats_linux.c:370)
==55== by 0x493FD7A: rtnl_dump_filter (dot3stats_linux.c:401)
==55== by 0x493FD7A: _dot3Stats_netlink_get_errorcntrs (dot3stats_linux.c:424)
==55== by 0x494009F: interface_dot3stats_get_errorcounters (dot3stats_linux.c:530)
==55== by 0x48F6FDA: dot3StatsTable_container_load (dot3StatsTable_data_access.c:330)
==55== by 0x485E76B: _cache_load (cache_handler.c:700)
==55== by 0x485FA37: netsnmp_cache_helper_handler (cache_handler.c:638)
==55== by 0x48720BC: netsnmp_call_handler (agent_handler.c:526)
==55== by 0x48720BC: netsnmp_call_next_handler (agent_handler.c:640)
==55== by 0x4865F75: table_helper_handler (table.c:717)
==55== by 0x4871B66: netsnmp_call_handler (agent_handler.c:526)
==55== by 0x4871B66: netsnmp_call_handlers (agent_handler.c:611)
757 bytes in 71 blocks are definitely lost in loss record 214 of 333
==493== at 0x483577F: malloc (in /usr/lib/x86_64-linux-gnu/valgrind/vgpreload_memcheck-amd64-linux.so)
==493== by 0x4EB6E49: strdup (strdup.c:42)
==493== by 0x493F278: ??? (in /usr/lib/x86_64-linux-gnu/libnetsnmpmibs.so.30.0.3)
==493== by 0x493F529: ??? (in /usr/lib/x86_64-linux-gnu/libnetsnmpmibs.so.30.0.3)
==493== by 0x493FD7A: _dot3Stats_netlink_get_errorcntrs (in /usr/lib/x86_64-linux-gnu/libnetsnmpmibs.so.30.0.3)
==493== by 0x494009F: interface_dot3stats_get_errorcounters (in /usr/lib/x86_64-linux-gnu/libnetsnmpmibs.so.30.0.3)
==493== by 0x48F6FDA: dot3StatsTable_container_load (in /usr/lib/x86_64-linux-gnu/libnetsnmpmibs.so.30.0.3)
==493== by 0x485E76B: _cache_load (cache_handler.c:700)
==493== by 0x485FA37: netsnmp_cache_helper_handler (cache_handler.c:638)
==493== by 0x48720BC: netsnmp_call_handler (agent_handler.c:526)
==493== by 0x48720BC: netsnmp_call_next_handler (agent_handler.c:640)
==493== by 0x4865F75: table_helper_handler (table.c:717)
==493== by 0x4871B66: netsnmp_call_handler (agent_handler.c:526)
==493== by 0x4871B66: netsnmp_call_handlers (agent_handler.c:611)
```
```
**After The fix**
no memory leak in valgrind stack trace related to etherlike MIB.
```
Why I did it
When getting system mac of centec platform, it would increase by 1 the last byte of mac, but it could not consider the case of carry.
How I did it
Firstly, I would replace the ":" with "" of mac to a string.
And then, I would convert the mac from string to int and increase by 1, at last convert it to string with inserting ":".
Why I did it
There is a queue in sysmonitor.py that is created based on an object of multiprocessing.Manager.
After performing fast-reboot, system health monitor is being shut down, what causes this Manager to be shut down as well, since it is a child-process of healthd.
That's why I moved the creation of this Manager from the top of the file to the function Sysmonitor.system_service() (The only place it is used), to make Manager a child-process of Sysmonitor, instead of Healthd. This way both the queue (the Manager) and the processes that uses this queue will be child-processes of the same process, and the problematic scenario of sysmonitor sending messages to a dead queue will not be possible.
How I did it
Removed the definition of manager as global and moved it to system_service() function
How to verify it
Perform a fast reboot and verify the traceback issue is fixed
Why I did it
To bring in the following fixes:
Revert temporary fix added to disable SA equal DA drops
CS00012273013 - [7.1][J2, J2c+] Disable SA Equals DA trap on DNX
CS00012274222 - How to block the voq for given destination port for a flow from a remote mod-id
CS00012275381 - SAI_INGRESS_PRIORITY_GROUP_STAT_PACKETS is incremented for port's PG's even if there are no traffic sent to that PG
CS00012274433 - Local Fault and Remote Fault are not polled by linkscan thread
How I did it
Merged above fixes to SAI code
How to verify it
Validated by running the basic sanity tests on XGS and DNX chassis platforms including
fib/test_fib.py
decap/test_decap.py
drop_counters/test_drop_counters.py
arp/test_arpall.py
Why I did it
In some cases, dpkg will call dpkg to validate version.
dpkg hook will get stuck in a loop to lock.
How I did it
Use an env variable to skip duplicated lock.
The deepdiff python package was recently updated to 6.2.3. As part of
this, a dependency was introduced on orjson. There's no armv8l python
wheel available for orjson, which means it needs to be built from
source. However, building it requires rust (which Buster and Bullseye
don't have a new enough version of) and maturin.
As a quick fix, pin this to version 6.2.2, before the orjson dependency
is introduced.
Signed-off-by: Saikrishna Arcot <sarcot@microsoft.com>
Signed-off-by: Saikrishna Arcot <sarcot@microsoft.com>
Update sonic-utilities submodule pointer to include the following:
* dddd6c5 [202205] Revert the show-techsupport optimization PR's ([#2581](https://github.com/sonic-net/sonic-utilities/pull/2581))
Signed-off-by: dprital <drorp@nvidia.com>
Why I did it
In the voq chassis the buffer_queue configuration needs to be applied on system_port instead of the sonic port.
This PR has the change to do this.
How I did it
Modify buffer_config.j2 to generate buffer_queue configuration on system_ports if the device is Voq Chassis
How to verify it
Verify the buffer_queue configuration is generated properly using sonic-cfggen
Signed-off-by: Arvindsrinivasan Lakshmi Narasimhan <arlakshm@microsoft.com>
Update sonic-utilities submodule pointer to include the following:
3cb66b4 [202205] Preserve copp tables through DB migration (2524) (#2574)
Update sonic-swss submodule pointer to include the following:
c9ca7c8 Modify coppmgr mergeConfig to support preserving copp tables through reboot. (#2548)
Signed-off-by: dprital <drorp@nvidia.com>
Update sonic-utilities submodule pointer to include the following:
* c9ed09d [202205] [sonic_installer] use /etc/resolv.conf from the host when migrating packages (#2573) ([#2575](https://github.com/sonic-net/sonic-utilities/pull/2575))
Signed-off-by: dprital <drorp@nvidia.com>
Update sonic-utilities submodule pointer to include the following:
3bc2bc6 [Mellanox][202205] Change severity to NOTICE in Mellanox buffer migrator when unable to fetch DEVICE_METADATA due to empty CONFIG_DB during initialization (#2570)
e1c8243 [202205][generate_dump] Fix for a deletion flow for all secret files in the techsupport dump (#2572)
9f2984a [202205] Fix issue: unconfigured PGs are displayed in watermarkstat (#2568)
f7988b0 [202205] [timer.unit.j2] use wanted-by in timer unit (#2561)
f45dcfb [generate_dump] Optimize the execution time of 'show techsupport' CLI by paraller function execution (#2565)
67cbb15 [202205]Fixes 12170: Delete subinterface and recreate the subinterface in default-vrf (#2564)
93172c4 [202205] [generate_dump] Optimize the execution time of the 'show techsupport' script to 5-10% by reducing calls to the 'tar append' operation (#2562)
Signed-off-by: dprital <drorp@nvidia.com>
Why I did it
Advance SAI Redis head pointer
How I did it
changes:
sonic-net/sonic-sairedis@cf679e7sonic-net/sonic-sairedis@8d6688e
[202205][Submodule][SAI]Advance SAI head pointer sonic-sairedis#1185 sonic-net/sonic-sairedis@66f2961
remove parameter --skip_error, which removed from [202205][Submodule][SAI]Advance SAI head pointer sonic-sairedis#1185
How to verify it
local image build
This PR is a required for changing the L3 IP forwarding Behavior to SoC in active-active toplogy.
Basically a src IP is added to the SNAT rule so that only packets originating from ToR with src IP as vlan IP get natted by the rule and change the src IP to LoopBack IP
Master Branch PR with combined change is here
sonic-net/sonic-host-services#3
How I did it
check the config DB if the ToR is a DualToR and has an SoC IP assigned.
put an iptable rule
iptables -t nat -A POSTROUTING --destination -j SNAT --to-source "
Signed-off-by: vaibhav-dahiya vdahiya@microsoft.com
Signed-off-by: vaibhav-dahiya <vdahiya@microsoft.com>
Why I did it
To ensure, that after a BGP startup, dualtor T0 receives BGP updates before sending out BGP updates.
Please refer to sonic-net/SONiC#1161 for more details.
How I did it
add coalesce-time 10000 to the frr bgp startup config.
Signed-off-by: Longxiang Lyu <lolv@microsoft.com>
Signed-off-by: Longxiang Lyu <lolv@microsoft.com>
- Why I did it
Fixed sflow yang model to include collector_vrf field.
- How I did it
Added leaf for collector_vrf under sflow_collector. Additionally aligned the configuration guide
- How to verify it
Added UT to verify.
* Fix kube mode to local mode long duration issue
* Remove IPV6 parameters which is not necessary
* Fix read node labels bug
* Tag the running image to latest if it's stable
* Disable image_version_higher check
* Change image_version_higher checker test case
Signed-off-by: Yun Li <yunli1@microsoft.com>
Signed-off-by: Yun Li <yunli1@microsoft.com>
Co-authored-by: lixiaoyuner <35456895+lixiaoyuner@users.noreply.github.com>