Modify snmpd.conf to start snmpd to listen on specific management and loopback ips instead of listening on any ip.
#### Why I did it
SNMP over IPv6 is not working for all scenarios for a single asic platforms.
The expectation is that SNMP query over IPv6 should work over Management or Loopback0 addresses.
**Specific scenario where this issue is seen**
In case of Lab T0 device, when SNMP request is sent from a directly connected T1 neighbor over Loopback IP, SNMP response was not received.
This was because the SRC IP address in SNMP response was not Loopback IP, it was the PortChannel IP connected to the neighboring device.
```
23:18:51.620897 In 22:26:27:e6:e0:07 ethertype IPv6 (0x86dd), length 105: fc00::72.41725 > **fc00:1::32**.161: C="msft" **GetRequest**(28) .1.3.6.1.2.1.1.1.0
23:18:51.621441 Out 28:99:3a:a0:97:30 ethertype IPv6 (0x86dd), length 241: **fc00::71**.161 > fc00::72.41725: C="msft" **GetResponse**(162) .1.3.6.1.2.1.1.1.0="SONiC Software Version: SONiC.xxx - HwSku: xx - Distribution: Debian 10.13 - Kernel: 4.19.0-12-2-amd64"
```
In case of IPv4, the SRC IP in SNMP response was correctly set to Loopback IP.
```
23:25:32.769712 In 22:26:27:e6:e0:07 ethertype IPv4 (0x0800), length 85: 10.0.0.57.56701 > **10.1.0.32**.161: C="msft" **GetRequest**(28) .1.3.6.1.2.1.1.1.0
23:25:32.975967 Out 28:99:3a:a0:97:30 ethertype IPv4 (0x0800), length 221: **10.1.0.32**.161 > 10.0.0.57.56701: C="msft" **GetResponse**(162) .1.3.6.1.2.1.1.1.0="SONiC Software Version: SONiC.xxx - HwSku: xx - Distribution: Debian 10.13 - Kernel: 4.19.0-12-2-amd64"
```
**Sequence of SNMP request and response**
1. SNMP request will be sent with SRC IP fc00::72 DST IP fc00:1::32
2. SNMP request is received at SONiC device is sent to snmpd which is listening on port 161 :::161/
3. snmpd process will parse the request create a response and sent to DST IP fc00::72.
snmpd process does not track the DST IP on which the SNMP request was received, which in this case is Loopback IP.
snmpd process will only keep track what is tht IP to which the response should be sent to.
4. snmpd process will send the response packet.
5. Kernel will do a route look up on destination IP and find the best path.
ip -6 route get fc00::72
fc00::72 from :: dev PortChannel101 proto kernel src fc00::71 metric 256 pref medium
5. Using the "src" ip from about, the response is sent out. This SRC ip is that of the PortChannel and not the device Loopback IP.
The same issue is seen when SNMP query is sent from a remote server over Management IP.
SONiC device eth0 --------- Remote server
SNMP request comes with SRC IP <Remote_server> DST IP <Mgmt IP>
If kernel finds best route to Remote_server_IP is via BGP neighbors, then it will send the response via front-panel interface with SRC IP as Loopback IP instead of Management IP.
Main issue is that in case of IPv6, snmpd ignores the IP address to which SNMP request was sent, in case of IPv6.
In case of IPv4, snmpd keeps track of DST IP of SNMP request, it will keep track if the SNMP request was sent to mgmt IP or Loopback IP.
Later, this IP is used in ipi_spec_dst as SRC IP which helps kernel to find the route based on DST IP using the right SRC IP.
https://github.com/net-snmp/net-snmp/blob/master/snmplib/transports/snmpUDPBaseDomain.c#L300
ipi.ipi_spec_dst.s_addr = srcip->s_addr
Reference: https://man7.org/linux/man-pages/man7/ip.7.html
```
If IP_PKTINFO is passed to sendmsg(2)
and ipi_spec_dst is not zero, then it is used as the local
source address for the routing table lookup and for
setting up IP source route options. When ipi_ifindex is
not zero, the primary local address of the interface
specified by the index overwrites ipi_spec_dst for the
routing table lookup.
```
**This issue is not seen on multi-asic platform, why?**
on multi-asic platform, there exists different network namespaces.
SNMP docker with snmpd process runs on host namespace.
Management interface belongs to host namespace.
Loopback0 is configured on asic namespaces.
Additional inforamtion on how the packet coming over Loopback IP reaches snmpd process running on host namespace: https://github.com/sonic-net/sonic-buildimage/pull/5420
Because of this separation of network namespaces, the route lookup of destination IP is confined to routing table of specific namespace where packet is received.
if packet is received over management interface, SNMP response also is sent out of management interface. Same goes with packet received over Loopback Ip.
##### Work item tracking
- Microsoft ADO **17537063**:
#### How I did it
Have snmpd listen on specific Management and Loopback IPs specifically instead of listening on any IP for single-asic platform.
Before Fix
```
admin@xx:~$ sudo netstat -tulnp | grep 161
udp 0 0 0.0.0.0:161 0.0.0.0:* 15631/snmpd
udp6 0 0 :::161 :::* 15631/snmpd
```
After fix
```
admin@device:~$ sudo netstat -tulnp | grep 161
udp 0 0 10.1.0.32:161 0.0.0.0:* 215899/snmpd
udp 0 0 10.3.1.1:161 0.0.0.0:* 215899/snmpd
udp6 0 0 fc00:1::32:161 :::* 215899/snmpd
udp6 0 0 fc00:2::32:161 :::* 215899/snmpd
```
**How this change helps with the issue?**
To see snmpd trace logs, modify snmpd to start using the below parameters, in supervisord.conf file
```
/usr/sbin/snmpd -f -LS0-7i -Lf /var/log/snmpd.log
```
When snmpd listens on any IP, snmpd binds to IPv4 and IPv6 sockets as below:
```
netsnmp_udpbase: binding socket: 7 to UDP: [0.0.0.0]:0->[0.0.0.0]:161
trace: netsnmp_udp6_transport_bind(): transports/snmpUDPIPv6Domain.c, 303:
netsnmp_udpbase: binding socket: 8 to UDP/IPv6: [::]:161
```
When IPv4 response is sent, it goes out of fd 7 and IPv6 response goes out of fd 8.
When IPv6 response is sent, it does not have the right SRC IP and it can lead to the issue described.
When snmpd listens on specific Loopback/Management IPs, snmpd binds to different sockets:
```
trace: netsnmp_udpipv4base_transport_bind(): transports/snmpUDPIPv4BaseDomain.c, 207:
netsnmp_udpbase: binding socket: 7 to UDP: [0.0.0.0]:0->[10.250.0.101]:161
trace: netsnmp_udpipv4base_transport_bind(): transports/snmpUDPIPv4BaseDomain.c, 207:
netsnmp_udpbase: binding socket: 8 to UDP: [0.0.0.0]:0->[10.1.0.32]:161
trace: netsnmp_register_agent_nsap(): snmp_agent.c, 1261:
netsnmp_register_agent_nsap: fd 8
netsnmp_udpbase: binding socket: 10 to UDP/IPv6: [fc00:1::32]:161
trace: netsnmp_register_agent_nsap(): snmp_agent.c, 1261:
netsnmp_register_agent_nsap: fd 10
netsnmp_ipv6: fmtaddr: t = (nil), data = 0x7fffed4c85d0, len = 28
trace: netsnmp_udp6_transport_bind(): transports/snmpUDPIPv6Domain.c, 303:
netsnmp_udpbase: binding socket: 9 to UDP/IPv6: [fc00:2::32]:161
```
When SNMP request comes in via Loopback IPv4, SNMP response is sent out of fd 8
```
trace: netsnmp_udpbase_send(): transports/snmpUDPBaseDomain.c, 511:
netsnmp_udp: send 170 bytes from 0x5581f2fbe30a to UDP: [10.0.0.33]:46089->[10.1.0.32]:161 on fd 8
```
When SNMP request comes in via Loopback IPv6, SNMP response is sent out of fd 10
```
netsnmp_ipv6: fmtaddr: t = (nil), data = 0x5581f2fc2ff0, len = 28
trace: netsnmp_udp6_send(): transports/snmpUDPIPv6Domain.c, 164:
netsnmp_udp6: send 170 bytes from 0x5581f2fbe30a to UDP/IPv6: [fc00::42]:43750 on fd 10
```
#### How to verify it
Verified on single asic and multi-asic devices.
Single asic SNMP query with Loopback
```
ARISTA01T1#bash snmpget -v2c -c xxx 10.1.0.32 1.3.6.1.2.1.1.1.0
SNMPv2-MIB::sysDescr.0 = STRING: SONiC Software Version: SONiC.xx - HwSku: Arista-7260xx - Distribution: Debian 10.13 - Kernel: 4.19.0-12-2-amd64
ARISTA01T1#bash snmpget -v2c -c xxx fc00:1::32 1.3.6.1.2.1.1.1.0
SNMPv2-MIB::sysDescr.0 = STRING: SONiC Software Version: SONiC.xx - HwSku: Arista-7260xxx - Distribution: Debian 10.13 - Kernel: 4.19.0-12-2-amd64
```
On multi-asic -- no change.
```
sudo netstat -tulnp | grep 161
udp 0 0 0.0.0.0:161 0.0.0.0:* 17978/snmpd
udp6 0 0 :::161 :::* 17978/snmpd
```
Query result using Loopback IP from a directly connected BGP neighbor
```
ARISTA01T2#bash snmpget -v2c -c xxx 10.1.0.32 1.3.6.1.2.1.1.1.0
SNMPv2-MIB::sysDescr.0 = STRING: SONiC Software Version: SONiC.xx - HwSku: xx - Distribution: Debian 9.13 - Kernel: 4.9.0-14-2-amd64
ARISTA01T2#bash snmpget -v2c -c xxx fc00:1::32 1.3.6.1.2.1.1.1.0
SNMPv2-MIB::sysDescr.0 = STRING: SONiC Software Version: SONiC.xx - HwSku: xx - Distribution: Debian 9.13 - Kernel: 4.9.0-14-2-amd64
```
<!--
If PR needs to be backported, then the PR must be tested against the base branch and the earliest backport release branch and provide tested image version on these two branches. For example, if the PR is requested for master, 202211 and 202012, then the requester needs to provide test results on master and 202012.
-->
Why I did it
It is to fix the docker-ptf-sai build failure.
https://dev.azure.com/mssonic/build/_build/results?buildId=311315&view=logs&j=cef3d8a9-152e-5193-620b-567dc18af272&t=cf595088-5c84-5cf1-9d7e-03331f31d795
2023-07-09T13:53:19.9025355Z �[91mTraceback (most recent call last):
2023-07-09T13:53:19.9025715Z File "/root/ptf/.eggs/setuptools_scm-7.1.0-py3.7.egg/setuptools_scm/_entrypoints.py", line 74, in <module>
2023-07-09T13:53:19.9025933Z from importlib.metadata import entry_points # type: ignore
2023-07-09T13:53:19.9026167Z ModuleNotFoundError: No module named 'importlib.metadata'
Work item tracking
Microsoft ADO (number only): 24513583
How I did it
How to verify it
#### Why I did it
src/sonic-swss
```
* eff2b751 - (HEAD -> 202211, origin/202211) [202211][CodeQL]: Use dependencies with relevant versions in azp template. (#2843) (6 hours ago) [Nazarii Hnydyn]
```
#### How I did it
#### How to verify it
#### Description for the changelog
- What I did
Added support for secure upgrade.
- How I did it
During sonic_installer install, added secure upgrade image verification.
HLD can be found in the following PR: sonic-net/SONiC#1024
- Why I did it
Feature is used to allow image was not modified since built from vendor. During installation, image can be verified with a signature attached to it.
- How I did it
Feature includes image signing during build (in sonic buildimage repo) and verification during image install (in sonic-utilities).
- How to verify it
In order for image verification - image must be signed - need to provide signing key and certificate (paths in SECURE_UPGRADE_DEV_SIGNING_KEY and SECURE_UPGRADE_DEV_SIGNING_CERT in rules/config) during build , and during image install, need to enable secure boot flag in bios, and signing_certificate should be available in bios.
- Feature dependencies
In order for this feature to work smoothly, need to have secure boot feature implemented as well.
The Secure boot feature will be merged in the near future.
#### Why I did it
src/sonic-swss
```
* 0ec46f22 - (HEAD -> 202211, origin/202211) [muxorch] Skip programming ACL for standby `active-active` ports (#2569) (#2854) (7 hours ago) [Longxiang Lyu]
```
#### How I did it
#### How to verify it
#### Description for the changelog
#### Why I did it
src/sonic-swss
```
* ac698065 - (HEAD -> 202211, origin/202211) [202211][muxorch] Skip programming SoC IP kernel tunnel route (39 minutes ago) [Longxiang Lyu]
```
#### How I did it
#### How to verify it
#### Description for the changelog
Why I did it
To reduce the container's dependency from host system
Work item tracking
Microsoft ADO (number only):
17713469
How I did it
Move the k8s container startup script to config engine container, other than mount it from host.
How to verify it
Check file path(/usr/share/sonic/scripts/container_startup.py) inside config engine container.
Signed-off-by: Yun Li <yunli1@microsoft.com>
Co-authored-by: Qi Luo <qiluo-msft@users.noreply.github.com>
Why I did it
Refine PR test template.
How I did it
Refine PR test template.
How to verify it
PR test executed normally.
Signed-off-by: Chun'ang Li <chunangli@microsoft.com>
- Why I did it
Add the patchwork link to the commit description for non-upstream patches if present
- How I did it
Parse the patchwork/<patch_name>.txt file from hw-mgmt
#### Why I did it
src/sonic-sairedis
```
* da40f3f - (HEAD -> 202211, origin/202211) [202211][CodeQL]: Use dependencies with relevant versions in azp template. (#1263) (11 hours ago) [Nazarii Hnydyn]
```
#### How I did it
#### How to verify it
#### Description for the changelog
#### Why I did it
src/sonic-sairedis
```
* d1850e2 - (HEAD -> 202211, origin/202211) [CI]: Fix pipeline issue caused by urllib3 v2. (#1264) (13 hours ago) [Nazarii Hnydyn]
```
#### How I did it
#### How to verify it
#### Description for the changelog
#### Why I did it
src/sonic-utilities
```
* 5e50a4af - (HEAD -> 202211, origin/202211) Add support for secure upgrade (#2698) (14 hours ago) [ycoheNvidia]
```
#### How I did it
#### How to verify it
#### Description for the changelog
#### Why I did it
src/sonic-swss-common
```
* 7465f1b - (HEAD -> 202211, origin/202211) Fix pipeline issue caused by urllib3 v2 (12 hours ago) [Liu Shilong]
* 82f9d76 - [Ci] Fix collect log error in azp template (#799) (2 days ago) [xumia]
```
#### How I did it
#### How to verify it
#### Description for the changelog
#### Why I did it
src/sonic-swss
```
* 5fe6e25a - (HEAD -> 202211, origin/202211) [subinterface]: Fix admin state handling. (#2806) (4 hours ago) [Nazarii Hnydyn]
```
#### How I did it
#### How to verify it
#### Description for the changelog
#### Why I did it
src/sonic-host-services
```
* 3ebe922 - (HEAD -> 202211, origin/202211) Fix issue: hostcfgd unit test might be affected by other during parallel build (#65) (12 hours ago) [Junchao-Mellanox]
```
#### How I did it
#### How to verify it
#### Description for the changelog
#### Why I did it
src/sonic-platform-daemons
```
* 4daa748 - (HEAD -> 202211, origin/202211) PSUD-Delete or update CHASSIS_INFO table PSU/Modules data if added or… (#351) (33 hours ago) [prem-nokia]
```
#### How I did it
#### How to verify it
#### Description for the changelog
Why I did it
Don't auto update package version for release branch.
Work item tracking
Microsoft ADO (number only): 22335854
How I did it
How to verify it
Why I did it
Add support for SFP refactor on Nokia-7215 Marvell armhf platform.
Platform: armhf-nokia_ixs7215_52x-r0
HwSKU: Nokia-7215
ASIC: marvell
Port Config: 48x1G + 4x10G (SFP+)
How I did it
Modify sfp.py to support SFP refactor optoe driver and platform.json to facilitate proper OC test completion.
How to verify it
Build armhf target for Nokia-7215 and verify proper Xcvrd and SFP refactor operation.
Why I did it
Add yang model definition for CHASSIS_MODULE define and implemented for sonic chassis. HLD for this configuration is included in https://github.com/sonic-net/SONiC/blob/master/doc/pmon/pmon-chassis-design.md#configurationFixes#12640
How I did it
Added yang model definition, unit tests, sample config and documentation for the table
How to verify it
Validated config tree generation using "pyang -Vf tree -p /usr/local/share/yang/modules/ietf ./yang-models/sonic-voq-inband-interface.yang"
Built the below python-wheels to validate unit tests and other changes
target/python-wheels/bullseye/sonic_yang_mgmt-1.0-py3-none-any.whl
target/python-wheels/bullseye/sonic_yang_models-1.0-py3-none-any.whl
target/python-wheels/bullseye/sonic_config_engine-1.0-py3-none-any.whl
Why I did it
refine reproducible build.
How I did it
Fix reset map variable in bash.
Ignore empty web file md5sum value.
If web file didn't backup in azure storage, use file on web.
How to verify i
#### Why I did it
src/sonic-platform-common
```
* 459ffaa - (HEAD -> 202211, origin/202211) Fix issue: should use 'Value' column to calculate the health percentage (#381) (3 hours ago) [Junchao-Mellanox]
```
#### How I did it
#### How to verify it
#### Description for the changelog
#### Why I did it
src/sonic-utilities
```
* 0f0ec140 - (HEAD -> 202211, origin/202211) Fix issue: show interfaces transceiver eeprom -d should display same entry for CMIS cable (#2864) (3 hours ago) [Junchao-Mellanox]
```
#### How I did it
#### How to verify it
#### Description for the changelog
How I did it
During the disable flow of dhcp_relay, it entered the dnsrvs_name list, which caused the SYSTEM_STATE key to be set to DOWN. Right after that, the dhcp_relay service was removed from the full service list, however, but, when it was removed from the dnsrvs_name, there was no flow to reset the system state back to UP even though there was no more services in down state.
How to verify it
root@qa-eth-vt01-2-3700v:/home/admin# config feature state dhcp_relay enabled
root@qa-eth-vt01-2-3700v:/home/admin# show system-health sysready-status
root@qa-eth-vt01-2-3700v:/home/admin# config feature state dhcp_relay disabled
root@qa-eth-vt01-2-3700v:/home/admin# show system-health sysready-status
Should see
System is ready
This reverts commit 02b17839c3.
Reverts #14933
The earlier commit caused a race condition that particularly broke cross branch warm upgrade.
Issue happens when db_migrator is still migrating the DB and finalizer is checking DB for list of components to reconcile.
If migration is not complete, finalizer get an empty list to wait for. Due to this, finalizer concludes warmboot (deletes system wide warmboot flag) and cause all the services to do cold restart.
ADO: 24274591
* Re-add 127.0.0.1/8 when bringing down the interfaces
With #5353, 127.0.0.1/16 was added to the lo interface, and then
127.0.0.1/8 was removed. However, when bringing down the lo interface,
like during a config reload, 127.0.0.1/16 gets removed, but 127.0.0.1/8
isn't added back to the interface. This means that there's a period of
time where 127.0.0.1 is not available at all, and services that need to
connect to 127.0.01 (such as for redis DB) will fail.
To fix this, when going down, add 127.0.0.1/8. Add this address before
the existing configuration gets removed, so that 127.0.0.1 is available
at all times.
Note that running `ifdown lo` doesn't actually bring down the loopback
interface; the interface always stays "physically" up.
Signed-off-by: Saikrishna Arcot <sarcot@microsoft.com>
Why I did it
sonic-mgmt is failing tests due to invalid test data in platform.json
Fwutil is upset the chassis name in the platform_component.json of the 7060CX-32S
How I did it
Fixed the aforementioned issues