Commit Graph

1049 Commits

Author SHA1 Message Date
anamehra
a47e79d584 Chassis: fix pmon docker failure when DEVICE_METADATA is not available (#16527)
Signed-off-by: anamehra anamehra@cisco.com

Added a check for DEVICE_METADATA before accessing the data. This prevents the j2 failure when var is not available.
2023-10-20 12:34:10 +08:00
abdosi
366d558dd9 [chassis/multi-asic] Enable Sending BGP Community over internal neighbors over iBGP Session (#16705)
What I did:
Enable Sending BGP Community over internal neighbors over iBGP Session

Microsoft ADO: 25268695

Why I did:
Without this change BGP community send by e-BGP Peers are not carry-forward to other e-BGP peers.


str2-xxxx-lc1-2# show bgp ipv6  20c0:a801::/64
BGP routing table entry for 20c0:a801::/64, version 52141
Paths: (1 available, best #1, table default)
  Not advertised to any peer
  65000 65500
    2603:10e2:400::6 from 2603:10e2:400::6 (3.3.3.6)
      Origin IGP, localpref 100, valid, internal, best (First path received)
      Last update: Tue Sep 26 16:08:26 2023
str2-xxxx-lc1-2# show ip bgp 192.168.35.128/25
BGP routing table entry for 192.168.35.128/25, version 52688
Paths: (1 available, best #1, table default)
  Not advertised to any peer
  65000 65502
    3.3.3.6 from 3.3.3.6 (3.3.3.6)
      Origin IGP, localpref 100, valid, internal, best (First path received)
      Last update: Tue Sep 26 15:45:51 2023

After the change

str2-xxxx-lc2-2(config)# router bgp 65100
str2-xxxx-lc2-2(config-router)# address-family ipv4
str2-xxxx-lc2-2(config-router-af)# neighbor INTERNAL_PEER_V4 send-community
str2-xxxx-lc2-2(config-router-af)# exit
str2-xxxx-lc2-2(config-router)# address-family ipv6
str2-xxxx-lc2-2(config-router-af)# neighbor INTERNAL_PEER_V6 send-community
str2-xxxx-lc1-2# show bgp ipv6  20c0:a801::/64
BGP routing table entry for 20c0:a801::/64, version 52400
Paths: (1 available, best #1, table default)
  Not advertised to any peer
  65000 65500
    2603:10e2:400::6 from 2603:10e2:400::6 (3.3.3.6)
      Origin IGP, localpref 100, valid, internal, best (First path received)
      **Community: 1111:1111**
      Last update: Tue Sep 26 16:10:19 2023
str2-xxxx-lc1-2# show ip bgp 192.168.35.128/25
BGP routing table entry for 192.168.35.128/25, version 52947
Paths: (1 available, best #1, table default)
  Not advertised to any peer
  65000 65502
    3.3.3.6 from 3.3.3.6 (3.3.3.6)
      Origin IGP, localpref 100, valid, internal, best (First path received)
      **Community: 1111:1111**
      Last update: Tue Sep 26 16:10:09 2023

Signed-off-by: Abhishek Dosi <abdosi@microsoft.com>
2023-10-20 12:34:00 +08:00
mssonicbld
2cfa8b2d93
[build] Fix build issue in docker-ptf-sai caused by setuptools_scm new release (#16636) (#16680)
Why I did it
When SUPERVISOR_PROC_EXIT_LISTENER_SCRIPT changed, almost all dockers need to be built again.
But currently it will be loaded by cache.

Work item tracking
Microsoft ADO (number only): 25123348
How I did it
Add $(DOCKER)_FILES into dependencies.

How to verify it
2023-09-26 18:47:01 +08:00
SuvarnaMeenakshi
379f256bcf
[202211][SNMP][IPv6]: Revert PRs to support SNMP over IPv6 (#16278)
* Revert "[SNMP][IPv6]: Fix to use link local IPv6 address as snmp agentAddress (#16013)"

This reverts commit 803c71c86a.

* Revert "[SNMP][IPv6]: Fix SNMP IPv6 reachability issue in certain scenarios (#15487)"

This reverts commit 9864dfeaa1.
2023-09-10 22:18:17 +08:00
mssonicbld
4390159698
Update macsec CAK keys in profile for tests to change to type7 encoded format (#16388) (#16500) 2023-09-09 04:37:17 +08:00
SuvarnaMeenakshi
9e43edf237 [SNMP][IPv6]: Fix to use link local IPv6 address as snmp agentAddress (#16013)
<!--
     Please make sure you've read and understood our contributing guidelines:
     https://github.com/Azure/SONiC/blob/gh-pages/CONTRIBUTING.md

     ** Make sure all your commits include a signature generated with `git commit -s` **

     If this is a bug fix, make sure your description includes "fixes #xxxx", or
     "closes #xxxx" or "resolves #xxxx"

     Please provide the following information:
-->

#### Why I did it
fixes: https://github.com/sonic-net/sonic-buildimage/issues/16001
Caused by: https://github.com/sonic-net/sonic-buildimage/pull/15487

The above PR introduced change to use Management and Loopback Ipv4 and ipv6 addresses as snmpagent address in snmpd.conf file.
With this change, if Link local IP address is configured as management or Loopback IPv6 address, then snmpd tries to open socket on that ipv6 address and fails with the below error:
```
Error opening specified endpoint "udp6:[fe80::5054:ff:fe6f:16f0]:161"
Server Exiting with code 1
```
From RFC4007, if we need to specify non-global ipv6 address without ambiguity, we need to use zone id along with the ipv6 address: <address>%<zone_id>
Reference: https://datatracker.ietf.org/doc/html/rfc4007

##### Work item tracking
- Microsoft ADO **(number only)**:

#### How I did it
Modify snmpd.conf file to use the %zone_id representation for ipv6 address.
#### How to verify it
In VS testbed, modify config_db to use link local ipv6 address as management address:
    "MGMT_INTERFACE": {
        "eth0|10.250.0.101/24": {
            "forced_mgmt_routes": [
                "172.17.0.1/24"
            ],
            "gwaddr": "10.250.0.1"
        },
        "eth0|fe80::5054:ff:fe6f:16f0/64": {
            "gwaddr": "fe80::1"
        }
    },

Execute config_reload after the above change.
snmpd comes up and check if snmpd is listening on ipv4 and ipv6 addresses:
```
admin@vlab-01:~$ sudo netstat -tulnp | grep 161
tcp        0      0 127.0.0.1:3161          0.0.0.0:*               LISTEN      274060/snmpd        
udp        0      0 10.1.0.32:161           0.0.0.0:*                           274060/snmpd        
udp        0      0 10.250.0.101:161        0.0.0.0:*                           274060/snmpd        
udp6       0      0 fc00:1::32:161          :::*                                274060/snmpd        
udp6       0      0 fe80::5054:ff:fe6f::161 :::*                                274060/snmpd      -- Link local 
 
admin@vlab-01:~$ sudo ifconfig eth0
eth0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet 10.250.0.101  netmask 255.255.255.0  broadcast 10.250.0.255
        inet6 fe80::5054:ff:fe6f:16f0  prefixlen 64  scopeid 0x20<link>
        ether 52:54:00:6f:16:f0  txqueuelen 1000  (Ethernet)
        RX packets 36384  bytes 22878123 (21.8 MiB)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 261265  bytes 46585948 (44.4 MiB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

admin@vlab-01:~$ docker exec -it snmp snmpget -v2c -c public fe80::5054:ff:fe6f:16f0 1.3.6.1.2.1.1.1.0
iso.3.6.1.2.1.1.1.0 = STRING: "SONiC Software Version: SONiC.master.327516-04a6031b2 - HwSku: Force10-S6000 - Distribution: Debian 11.7 - Kernel: 5.10.0-18-2-amd64"
```
Logs from snmpd:
```
Turning on AgentX master support.
NET-SNMP version 5.9
Connection from UDP/IPv6: [fe80::5054:ff:fe6f:16f0%eth0]:44308
```
Ran test_snmp_loopback test to check if loopback ipv4 and ipv6 works:
```
./run_tests.sh -n vms-kvm-t0 -d vlab-01 -c snmp/test_snmp_loopback.py  -f vtestbed.yaml -i ../ansible/veos_vtb -e "--skip_sanity --disable_loganalyzer" -u
=== Running tests in groups ===
Running: pytest snmp/test_snmp_loopback.py --inventory ../ansible/veos_vtb --host-pattern vlab-01 --testbed vms-kvm-t0 --testbed_file vtestbed.yaml --log-cli-level warning --log-file-level debug --kube_master unset --showlocals --assert plain --show-capture no -rav --allow_recover --ignore=ptftests --ignore=acstests --ignore=saitests --ignore=scripts --ignore=k8s --ignore=sai_qualify --junit-xml=logs/tr.xml --log-file=logs/test.log --skip_sanity --disable_loganalyzer
..                                                                        

snmp/test_snmp_loopback.py::test_snmp_loopback[vlab-01] PASSED 
```
<!--
If PR needs to be backported, then the PR must be tested against the base branch and the earliest backport release branch and provide tested image version on these two branches. For example, if the PR is requested for master, 202211 and 202012, then the requester needs to provide test results on master and 202012.
-->

#### Which release branch to backport (provide reason below if selected)

<!--
- Note we only backport fixes to a release branch, *not* features!
- Please also provide a reason for the backporting below.
- e.g.
- [x] 202006
-->

- [ ] 201811
- [ ] 201911
- [ ] 202006
- [x] 202012
- [x] 202106
- [x] 202111
- [x] 202205
- [x] 202211
- [x] 202305

#### Tested branch (Please provide the tested image version)

<!--
- Please provide tested image version
- e.g.
- [x] 20201231.100
-->

- [ ] <!-- image version 1 -->
- [ ] <!-- image version 2 -->

#### Description for the changelog
<!--
Write a short (one line) summary that describes the changes in this
pull request for inclusion in the changelog:
-->

<!--
 Ensure to add label/tag for the feature raised. example - PR#2174 under sonic-utilities repo. where, Generic Config and Update feature has been labelled as GCU.
-->

#### Link to config_db schema for YANG module changes
<!--
Provide a link to config_db schema for the table for which YANG model
is defined
Link should point to correct section on https://github.com/Azure/sonic-buildimage/blob/master/src/sonic-yang-models/doc/Configuration.md
-->

#### A picture of a cute animal (not mandatory but encouraged)
2023-08-16 14:32:52 +08:00
Saikrishna Arcot
e76b4a0cc7 Upgrade scapy in the PTF's python3 virtualenv to 2.5.0 (#15573)
This is primarily to fix a bug in scapy hitting an error when trying to
listen on multiple interfaces in a single `sniff` call. This also
upgrades it to the current latest version.

Signed-off-by: Saikrishna Arcot <sarcot@microsoft.com>
2023-07-20 18:34:16 +08:00
xumia
7862722228 [Build] Fix the PyYang python package installation issue (#15890)
Why I did it
Fix the armhf build failure.
How to reproduce the issue:

docker run -it debain:bullseye bash
apt-get update && apt-get install -y python3-pip
pip3 install PyYAML==5.4.1
Error message:

Collecting PyYAML==5.4.1
  Installing build dependencies ... done
  Getting requirements to build wheel ... error
  ERROR: Command errored out with exit status 1:
   command: /usr/bin/python3 /tmp/tmp6xabslgb_in_process.py get_requires_for_build_wheel /tmp/tmp_er01ztl
....
      raise AttributeError(attr)
  AttributeError: cython_sources
  ----------------------------------------
WARNING: Discarding d63f2d7597/PyYAML-5.4.1.tar.gz (sha256)=607774cbba28732bfa802b54baa7484215f530991055bb562efbed5b2f20a45e (from https://pypi.org/simple/pyyaml/) (requires-python:>=2.7, !=3.0.*, !=3.1.*, !=3.2.*, !=3.3.*, !=3.4.*, !=3.5.*). Command errored out with exit status 1: /usr/bin/python3 /tmp/tmp6xabslgb_in_process.py get_requires_for_build_wheel /tmp/tmp_er01ztl Check the logs for full command output.
ERROR: Could not find a version that satisfies the requirement PyYAML==5.4.1
ERROR: No matching distribution found for PyYAML==5.4.1
root@fa2fa92edcfd:/# 
But if adding the option --no-build-isolation, then it is good, see fix.

install "PyYAML==5.4.1" --no-build-isolation
The same error can be found in the multiple builds.

Work item tracking
Microsoft ADO (number only): 24567457

How I did it
Add a build option --no-build-isolation.
2023-07-20 04:32:58 +08:00
lixiaoyuner
5a836b24f3 Add health check probe for k8s upgrade containers. (#15223)
#### Why I did it
After k8s upgrade a container, k8s can only know the container is running, don't know the service's status inside container. So we need a probe inside container, k8s will call the probe to check whether the container is really ready.
##### Work item tracking
- Microsoft ADO **(number only)**: 22453004
#### How I did it
Add a health check probe inside config engine container, the probe will check whether the start service exit normally or not if the start service exists and call the python script to do container self-related specific checks if the script is there. The python script should be implemented by feature owner if it's needed.

more details: [design doc](https://github.com/sonic-net/SONiC/blob/master/doc/kubernetes/health-check.md)
#### How to verify it
Check path /usr/bin/readiness_probe.sh inside container.

#### Which release branch to backport (provide reason below if selected)

- [ ] 201811
- [ ] 201911
- [ ] 202006
- [ ] 202012
- [ ] 202106
- [ ] 202111
- [x] 202205
- [x] 202211

#### Tested branch (Please provide the tested image version)
- [x] 20220531.28
2023-07-14 20:55:07 +08:00
SuvarnaMeenakshi
6d491fac2d [SNMP][IPv6]: Fix SNMP IPv6 reachability issue in certain scenarios (#15487)
Modify snmpd.conf to start snmpd to listen on specific management and loopback ips instead of listening on any ip.

#### Why I did it
SNMP over IPv6 is not working for all scenarios for a single asic platforms.
The expectation is that SNMP query over IPv6 should work over Management or Loopback0 addresses.
**Specific scenario where this issue is seen**
In case of Lab T0 device,  when SNMP request is sent from a directly connected T1 neighbor over Loopback IP, SNMP response was not received.
This was because the SRC IP address in SNMP response was not Loopback IP, it was the PortChannel IP connected to the neighboring device.
```
23:18:51.620897  In 22:26:27:e6:e0:07 ethertype IPv6 (0x86dd), length 105: fc00::72.41725 > **fc00:1::32**.161:  C="msft" **GetRequest**(28)  .1.3.6.1.2.1.1.1.0
23:18:51.621441 Out 28:99:3a:a0:97:30 ethertype IPv6 (0x86dd), length 241: **fc00::71**.161 > fc00::72.41725:  C="msft" **GetResponse**(162)  .1.3.6.1.2.1.1.1.0="SONiC Software Version: SONiC.xxx - HwSku: xx - Distribution: Debian 10.13 - Kernel: 4.19.0-12-2-amd64"
```
In case of IPv4, the SRC IP in SNMP response was correctly set to Loopback IP.
```
23:25:32.769712  In 22:26:27:e6:e0:07 ethertype IPv4 (0x0800), length 85: 10.0.0.57.56701 > **10.1.0.32**.161:  C="msft" **GetRequest**(28)  .1.3.6.1.2.1.1.1.0
23:25:32.975967 Out 28:99:3a:a0:97:30 ethertype IPv4 (0x0800), length 221: **10.1.0.32**.161 > 10.0.0.57.56701:  C="msft" **GetResponse**(162)  .1.3.6.1.2.1.1.1.0="SONiC Software Version: SONiC.xxx - HwSku: xx - Distribution: Debian 10.13 - Kernel: 4.19.0-12-2-amd64"
```

**Sequence of SNMP request and response**
1. SNMP request will be sent with SRC IP fc00::72 DST IP fc00:1::32
2. SNMP request is received at SONiC device is sent to snmpd which is listening on port 161 :::161/
3. snmpd process will parse the request create a response and sent to DST IP fc00::72. 
snmpd process does not track the DST IP on which the SNMP request was received, which in this case is Loopback IP.
snmpd process will only keep track what is tht IP to which the response should be sent to.
4. snmpd process will send the response packet.
5. Kernel will do a route look up on destination IP and find the best path.
ip -6 route get fc00::72
fc00::72 from :: dev PortChannel101 proto kernel src fc00::71 metric 256 pref medium
5. Using the "src" ip from about, the response is sent out. This SRC ip is that of the PortChannel and not the device Loopback IP.

The same issue is seen when SNMP query is sent from a remote server over Management IP.
SONiC device eth0 --------- Remote server
SNMP request comes with SRC IP <Remote_server> DST IP <Mgmt IP>
If kernel finds best route to Remote_server_IP is via BGP neighbors, then it will send the response via front-panel interface with SRC IP as Loopback IP instead of Management IP.

Main issue is that in case of IPv6, snmpd ignores the IP address to which SNMP request was sent, in case of IPv6.
In case of IPv4, snmpd keeps track of DST IP of SNMP request, it will keep track if the SNMP request was sent to mgmt IP or Loopback IP.
Later, this IP is used in ipi_spec_dst as SRC IP which helps kernel to find the route based on DST IP using the right SRC IP.
https://github.com/net-snmp/net-snmp/blob/master/snmplib/transports/snmpUDPBaseDomain.c#L300 
ipi.ipi_spec_dst.s_addr = srcip->s_addr
Reference: https://man7.org/linux/man-pages/man7/ip.7.html
```
If IP_PKTINFO is passed to sendmsg(2)
              and ipi_spec_dst is not zero, then it is used as the local
              source address for the routing table lookup and for
              setting up IP source route options.  When ipi_ifindex is
              not zero, the primary local address of the interface
              specified by the index overwrites ipi_spec_dst for the
              routing table lookup.
```

**This issue is not seen on multi-asic platform, why?**
on multi-asic platform, there exists different network namespaces.
SNMP docker with snmpd process runs on host namespace.
Management interface belongs to host namespace.
Loopback0 is configured on asic namespaces.
Additional inforamtion on how the packet coming over Loopback IP reaches snmpd process running on host namespace: https://github.com/sonic-net/sonic-buildimage/pull/5420
Because of this separation of network namespaces, the route lookup of destination IP is confined to routing table of specific namespace where packet is received.
if packet is received over management interface, SNMP response also is sent out of management interface. Same goes with packet received over Loopback Ip.

##### Work item tracking
- Microsoft ADO **17537063**:

#### How I did it
Have snmpd listen on specific Management and Loopback IPs specifically instead of listening on any IP for single-asic platform.

Before Fix
```
admin@xx:~$ sudo netstat -tulnp | grep 161   
udp        0      0 0.0.0.0:161             0.0.0.0:*                           15631/snmpd         
udp6       0      0 :::161                  :::*                                15631/snmpd  
```
After fix
```
admin@device:~$ sudo netstat -tulnp | grep 161
udp        0      0 10.1.0.32:161           0.0.0.0:*                           215899/snmpd        
udp        0      0 10.3.1.1:161             0.0.0.0:*                           215899/snmpd        
udp6       0      0 fc00:1::32:161          :::*                                215899/snmpd        
udp6       0      0 fc00:2::32:161          :::*                                215899/snmpd  
``` 

**How this change helps with the issue?**
To see snmpd trace logs, modify snmpd to start using the below parameters, in supervisord.conf file
```
/usr/sbin/snmpd -f -LS0-7i -Lf /var/log/snmpd.log
```
When snmpd listens on any IP, snmpd binds to IPv4 and IPv6 sockets as below:
```
netsnmp_udpbase: binding socket: 7 to UDP: [0.0.0.0]:0->[0.0.0.0]:161
trace: netsnmp_udp6_transport_bind(): transports/snmpUDPIPv6Domain.c, 303:
netsnmp_udpbase: binding socket: 8 to UDP/IPv6: [::]:161
```

When IPv4 response is sent, it goes out of fd 7 and IPv6 response goes out of fd 8.
When IPv6 response is sent, it does not have the right SRC IP and it can lead to the issue described.

When snmpd listens on specific Loopback/Management IPs, snmpd binds to different sockets:
```
trace: netsnmp_udpipv4base_transport_bind(): transports/snmpUDPIPv4BaseDomain.c, 207:
netsnmp_udpbase: binding socket: 7 to UDP: [0.0.0.0]:0->[10.250.0.101]:161
trace: netsnmp_udpipv4base_transport_bind(): transports/snmpUDPIPv4BaseDomain.c, 207:
netsnmp_udpbase: binding socket: 8 to UDP: [0.0.0.0]:0->[10.1.0.32]:161
trace: netsnmp_register_agent_nsap(): snmp_agent.c, 1261:
netsnmp_register_agent_nsap: fd 8
netsnmp_udpbase: binding socket: 10 to UDP/IPv6: [fc00:1::32]:161
trace: netsnmp_register_agent_nsap(): snmp_agent.c, 1261:
netsnmp_register_agent_nsap: fd 10
netsnmp_ipv6: fmtaddr: t = (nil), data = 0x7fffed4c85d0, len = 28
trace: netsnmp_udp6_transport_bind(): transports/snmpUDPIPv6Domain.c, 303:
netsnmp_udpbase: binding socket: 9 to UDP/IPv6: [fc00:2::32]:161
```
When SNMP request comes in via Loopback IPv4, SNMP response is sent out of fd 8
```
trace: netsnmp_udpbase_send(): transports/snmpUDPBaseDomain.c, 511:
netsnmp_udp: send 170 bytes from 0x5581f2fbe30a to UDP: [10.0.0.33]:46089->[10.1.0.32]:161 on fd 8
```
When SNMP request comes in via Loopback IPv6, SNMP response is sent out of fd 10
```
netsnmp_ipv6: fmtaddr: t = (nil), data = 0x5581f2fc2ff0, len = 28
trace: netsnmp_udp6_send(): transports/snmpUDPIPv6Domain.c, 164:
netsnmp_udp6: send 170 bytes from 0x5581f2fbe30a to UDP/IPv6: [fc00::42]:43750 on fd 10
```

#### How to verify it
Verified on single asic and multi-asic devices.
Single asic SNMP query with Loopback 
```
ARISTA01T1#bash snmpget -v2c -c xxx 10.1.0.32 1.3.6.1.2.1.1.1.0
SNMPv2-MIB::sysDescr.0 = STRING: SONiC Software Version: SONiC.xx - HwSku: Arista-7260xx - Distribution: Debian 10.13 - Kernel: 4.19.0-12-2-amd64
ARISTA01T1#bash snmpget -v2c -c xxx fc00:1::32 1.3.6.1.2.1.1.1.0
SNMPv2-MIB::sysDescr.0 = STRING: SONiC Software Version: SONiC.xx - HwSku: Arista-7260xxx - Distribution: Debian 10.13 - Kernel: 4.19.0-12-2-amd64
```

On multi-asic -- no change.
```
sudo netstat -tulnp | grep 161
udp        0      0 0.0.0.0:161             0.0.0.0:*                           17978/snmpd         
udp6       0      0 :::161                  :::*                                17978/snmpd 
```
Query result using Loopback IP from a directly connected BGP neighbor
```
ARISTA01T2#bash snmpget -v2c -c xxx 10.1.0.32 1.3.6.1.2.1.1.1.0
SNMPv2-MIB::sysDescr.0 = STRING: SONiC Software Version: SONiC.xx - HwSku: xx - Distribution: Debian 9.13 - Kernel: 4.9.0-14-2-amd64
ARISTA01T2#bash snmpget -v2c -c xxx fc00:1::32 1.3.6.1.2.1.1.1.0
SNMPv2-MIB::sysDescr.0 = STRING: SONiC Software Version: SONiC.xx - HwSku: xx - Distribution: Debian 9.13 - Kernel: 4.9.0-14-2-amd64  
```
<!--
If PR needs to be backported, then the PR must be tested against the base branch and the earliest backport release branch and provide tested image version on these two branches. For example, if the PR is requested for master, 202211 and 202012, then the requester needs to provide test results on master and 202012.
-->
2023-07-14 04:32:39 +08:00
xumia
de743f0530 [Build] Fix the python module importlib.metadata not found issue (#15800)
Why I did it
It is to fix the docker-ptf-sai build failure.
https://dev.azure.com/mssonic/build/_build/results?buildId=311315&view=logs&j=cef3d8a9-152e-5193-620b-567dc18af272&t=cf595088-5c84-5cf1-9d7e-03331f31d795

2023-07-09T13:53:19.9025355Z �[91mTraceback (most recent call last):
2023-07-09T13:53:19.9025715Z   File "/root/ptf/.eggs/setuptools_scm-7.1.0-py3.7.egg/setuptools_scm/_entrypoints.py", line 74, in <module>
2023-07-09T13:53:19.9025933Z     from importlib.metadata import entry_points  # type: ignore
2023-07-09T13:53:19.9026167Z ModuleNotFoundError: No module named 'importlib.metadata'
Work item tracking
Microsoft ADO (number only): 24513583
How I did it
How to verify it
2023-07-13 20:57:29 +08:00
lixiaoyuner
87a8de4816
Move k8s script to docker-config-engine (#14788) (#15767)
Why I did it
To reduce the container's dependency from host system

Work item tracking
Microsoft ADO (number only):
17713469
How I did it
Move the k8s container startup script to config engine container, other than mount it from host.

How to verify it
Check file path(/usr/share/sonic/scripts/container_startup.py) inside config engine container.

Signed-off-by: Yun Li <yunli1@microsoft.com>
Co-authored-by: Qi Luo <qiluo-msft@users.noreply.github.com>
2023-07-10 08:55:45 -07:00
mssonicbld
e4a7a53002
[chassis][lldp] Fix the lldp error log in host instance which doesn't contain front panel ports (#14814) (#15670) 2023-06-30 08:17:04 +08:00
abdosi
4111c25557 updated internal route policy for chassis-packet (#15349)
What I did:

Workaround for the issue seen here : FRRouting/frr#13682
It seems there is timing issue where there are multiple recursive lookup needed to resolve nexthop of the route it's possible that it does not happen correctly causing route to remain in inactive state

Issue is seen on chassis-packet as there 2 level of recursive lookup needed for a given e-BGP learnt route
- Level1 to resolve e-BGP peer (connected route via bgp ) over Loopback4096 (i-BGP peering)
- Level 2 Loopback4096 over backend port-channels next-hops

For VOQ chassis there is no e-BGP peer (connected route via bgp )  resolution as route is added as Static route by orchagent over Ethernet-IB.

Also as part of this remove route-map policy from instance.conf.j2 as same is define in peer-group.j2.

Microsoft ADO: https://msazure.visualstudio.com/One/_workitems/edit/24198507

How I verify:
Functional Verification manually
Updated UT.
We will be adding sanity check in sonic-mgmt to make sure none of route are in inactive state.

Signed-off-by: Abhishek Dosi <abdosi@microsoft.com>
2023-06-10 14:32:44 +08:00
mssonicbld
3869fbfb68
[macsec]: show macsec: add --profile option, include profile name in show command output (#13940) (#15127) 2023-05-18 08:42:44 +08:00
Gokulnath-Raja
eb99648a4e
[sflow] Exception handling for if_nametoindex (#11437) (#14457)
catch system error and log as warning level instead of
     error level in case interface was already deleted

Signed-off-by: Gokulnath-Raja <Gokulnath_R@dell.com>
2023-05-05 16:11:12 -07:00
mssonicbld
99d6003717
Changes to support TSA from supervisor (#14691) (#14878) 2023-04-28 21:11:55 +08:00
mssonicbld
e462cb1b07
Add monit_snmp file to monitor memory usage (#14464) (#14770) 2023-04-23 21:00:20 +08:00
Hua Liu
d859a4ed5b Install python-redis package to docker containers (#14632)
Install python-redis package to docker containers

#### Why I did it
This this bug: https://github.com/sonic-net/sonic-buildimage/issues/14531
The 'flush_unused_database' is part of docker-database, and docker-database does not install python-redis package by itself. it's using redis installed by sonic-py-swsssdk.
So after remove sonic-py-swsssdk from container, this script break.

To this this bug and avoid similer bug happen again, install python-redis to docker containers which removed sonic-py-swsssdk .

#### How I did it
Install python-redis to containers.

#### How to verify it
Pass all UT.
Create new UT to cover this scenario: https://github.com/sonic-net/sonic-mgmt/pull/8032

#### Description for the changelog
Improve sudo cat command for RO user.
2023-04-23 14:33:22 +08:00
mssonicbld
73e0cb63bc
Fix telemetry.sh passing in null as log level value (#14303) (#14763) 2023-04-22 20:42:14 +08:00
Vivek
4dc61fcbc1 [lldpmgrd] Don't log error message for outdated event (#14178)
- Why I did it
Fixes #14236

When a redis event quickly gets outdated during port breakout, error logs like this are seen

Mar  8 01:43:26.011724 r-leopard-56 INFO ConfigMgmt: Write in DB: {'PORT': {'Ethernet64': {'admin_status': 'down'}, 'Ethernet68': {'admin_status': 'down'}}}
Mar  8 01:43:26.012565 r-leopard-56 INFO ConfigMgmt: Writing in Config DB
Mar  8 01:43:26.013468 r-leopard-56 INFO ConfigMgmt: Write in DB: {'PORT': {'Ethernet64': None, 'Ethernet68': None}, 'INTERFACE': None}
Mar  8 01:43:26.018095 r-leopard-56 NOTICE swss#portmgrd: :- doTask: Configure Ethernet64 admin status to down
Mar  8 01:43:26.018309 r-leopard-56 NOTICE swss#portmgrd: :- doTask: Delete Port: Ethernet64
Mar  8 01:43:26.018641 r-leopard-56 NOTICE lldp#lldpmgrd[32]: :- pops: Miss table key PORT_TABLE:Ethernet64, possibly outdated
Mar  8 01:43:26.018654 r-leopard-56 ERR lldp#lldpmgrd[32]: unknown operation ''

- How I did it
Only log the error when the op is not empty and not one of ("SET" & "DEL" )

Signed-off-by: Vivek Reddy Karri <vkarri@nvidia.com>
2023-03-20 20:54:45 +08:00
mssonicbld
b3109fefe5
[dhcp-relay] Add dhcp_relay show cli (#13614) (#14342) 2023-03-19 22:48:25 +08:00
Zain Budhwani
d8c9517280 Remove dialout as critical process (#14006)
#### Why I did it

Remove dialout as critical process as it is no longer used in prod. As part of future work, can remove dialout completely

#### How I did it

Remove from critical process list
2023-03-19 22:32:52 +08:00
Tejaswini Chadaga
37be88bef2 Fix VOQ_CHASSIS_V6_PEER route-map config (#14055)
* Fix typo in VOQ_CHASSIS_V6_PEER route-map config

* Updated UT files with the changed config
2023-03-19 22:32:47 +08:00
xumia
05b89457c2 [Build] Fix the mirror gpg key expired issue (#14206)
Why I did it
[Build] Fix the mirror gpg key expired issue
See vs build: https://dev.azure.com/mssonic/build/_build/results?buildId=231680&view=logs&j=cef3d8a9-152e-5193-620b-567dc18af272&t=cf595088-5c84-5cf1-9d7e-03331f31d795

How I did it
Add the apt option not to check the valid until, the option is set to the SONiC docker base image, docker ptf missing the option.

Acquire::Check-Valid-Until "false";
How to verify it
The build of docker-ptf is succeeded after fixed.

2023-03-11T17:26:35.1801999Z [ building ] [ target/docker-ptf.gz ] 
2023-03-11T17:38:10.1608536Z [ finished ] [ target/docker-ptf.gz ]
2023-03-13 16:37:49 +08:00
Sudharsan Dhamal Gopalarathnam
6862692ae1 [dhcp_relay]Fix the clear dhcp6relay_counters CLI (#13148)
Avoid traceback on sonic-clear command

sonic-clear dhcp6relay_counters 
Traceback (most recent call last):
  File "/usr/local/bin/sonic-clear", line 8, in <module>
    sys.exit(cli())
  File "/usr/local/lib/python3.9/dist-packages/click/core.py", line 764, in __call__
    return self.main(*args, **kwargs)
  File "/usr/local/lib/python3.9/dist-packages/click/core.py", line 717, in main
    rv = self.invoke(ctx)
  File "/usr/local/lib/python3.9/dist-packages/click/core.py", line 1137, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/usr/local/lib/python3.9/dist-packages/click/core.py", line 956, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/usr/local/lib/python3.9/dist-packages/click/core.py", line 555, in invoke
    return callback(*args, **kwargs)
  File "/usr/local/lib/python3.9/dist-packages/clear/plugins/dhcp-relay.py", line 19, in dhcp6relay_clear_counters
    counter = DHCPv6_Counter()
NameError: name 'DHCPv6_Counter' is not defined

- How I did it
Corrected the way to import using importlib

- How to verify it
Tested the sonic-clear command and verified no traceback is seen
2023-02-16 18:36:47 +08:00
Yaqiang Zhu
30e4369255
[dhcp_relay] Remove exist check while adding dhcpv6 relay (#13826) 2023-02-16 11:08:34 +08:00
mssonicbld
b3cf657129
[chassis] Fixed critical process not correct for database-chassis docker (#13445) (#13679) 2023-02-13 01:22:50 +08:00
Vivek
ee7724e74d Fix dependency of dhcp-mon on VLAN with only v6 (#13006)
Signed-off-by: Vivek Reddy Karri <vkarri@nvidia.com>

Signed-off-by: Vivek Reddy Karri <vkarri@nvidia.com>
2023-02-06 12:36:55 +08:00
xumia
81dd4b8f7b [Build] Support j2 template for debian sources for docker ptf (#13198)
Change to use the sources.list from the file generated from the j2 template
2023-02-06 12:36:51 +08:00
Longxiang Lyu
918e2d11f8 [dualtor] Let T0 delay 10 seconds before sending BGP updates (#12996)
Why I did it
To ensure, that after a BGP startup, dualtor T0 receives BGP updates before sending out BGP updates.
Please refer to sonic-net/SONiC#1161 for more details.

How I did it
add coalesce-time 10000 to the frr bgp startup config.

Signed-off-by: Longxiang Lyu <lolv@microsoft.com>
2023-02-04 09:54:05 +08:00
Zain Budhwani
24be87504f Change bgp notification leaf name and mem_usage leaf type (#13012)
#### Why I did it

Improve naming convention for bgp notification events and change type of leaf for sonic-events-host mem usage from uint64 to decimal64

#### How I did it

Replace "-" with "_"

Replace uint64 with decimal64

#### How to verify it

Run yang model unit tests

#### Description for the changelog

Change YANG model leaf naming convention for bgp notification
2023-02-04 09:53:57 +08:00
Yaqiang Zhu
39c1f878b3 [dhcp-relay] Add support for dhcp_relay config cli (#13373)
Why I did it
Currently the config cli of dhcpv4 is may cause confusion and config of dhcpv6 is missing.

How I did it
Add dhcp_relay config cli and test cases.

config dhcp_relay ipv4 helper (add | del) <vlan_id> <helper_ip_list>
config dhcp_relay ipv6 destination (add | del) <vlan_id> <destination_ip_list>
Updated docs for it in sonic-utilities: https://github.com/sonic-net/sonic-utilities/pull/2598/files
How to verify it
Build docker-dhcp-relay.gz with and without INCLUDE_DHCP_RELAY, and check target/docker-dhcp-relay.gz.log
2023-02-04 09:53:30 +08:00
Junchao-Mellanox
e631f426f4
[infra] Support syslog rate limit configuration (#12490) (#13535)
Backport of https://github.com/sonic-net/sonic-buildimage/pull/12490 into 202211

- Why I did it
Support syslog rate limit configuration feature

- How I did it
Remove unused rsyslog.conf from containers
Modify docker startup script to generate rsyslog.conf from template files
Add metadata/init data for syslog rate limit configuration

- How to verify it
Manual test
New sonic-mgmt regression cases
2023-01-30 20:11:44 +02:00
mssonicbld
35cdb760dc
[containercfgd] Add containercfgd and syslog rate limit configuration support (#12489) (#13361) 2023-01-14 13:16:14 +08:00
Vivek
48d4c0aa1e [Bullseye] Upgrade sonic-sdk image to bullseye (#12649)
- Why I did it
Upgrade the app-extension developer environments (sonic-sdk & sonic-sdk-bullseye) to bullseye

- How to verify it
Built an app-extension using these images and verified if it is up and running.

Signed-off-by: Vivek Reddy <vkarri@nvidia.com>
2022-12-10 10:33:21 +08:00
Konstantin Vasin
36f79395e5 [Build] set apt Acquire::Retries to 3 for bullseye (#12758)
Why I did it
There were some changes in apt source code in version 2.1.9.
As a result apt used in bullseye (2.2.4) is intolerant to network issues.
This was fixed in 10631550f1 Already fixed version is used in bookworm (2.5.4)
And not yet affected version is used in buster (1.8.2.3)

How I did it
Set Acquire::Retries to 3 for sonic-slave-bullseye, docker-base-bullseye and final Debian image.

Ref: https://bugs.launchpad.net/ubuntu/+source/apt/+bug/1876035

Signed-off-by: Konstantin Vasin k.vasin@yadro.com
2022-12-10 10:33:21 +08:00
Richard.Yu
47d63bcd06
[SAI PTF] SAI PTF docker support sai-ptf v2 (#12719)
* [SAI PTF] SAI PTF docker support sai-ptf v2

Publish the sai-ptf docker.

Take part of the change from previous PR #11610 (already reverted as some cache issue)
Cause in #11610, added two new target in it, one is sai-ptf another one is syncd-rpc with sai-ptf v2, to make the upgrade with more clear target, use this one take the sai-ptf one.

Test one:
NOSTRETCH=y NOJESSIE=y make configure PLATFORM=vs
NOSTRETCH=y NOJESSIE=y NOBULLSEYE=y SAITHRIFT_V2=y make target/docker-ptf-sai.gz

Signed-off-by: richardyu-ms <richard.yu@microsoft.com>

* remove useless change

Signed-off-by: richardyu-ms <richard.yu@microsoft.com>

* remove useless parameters

Signed-off-by: richardyu-ms <richard.yu@microsoft.com>

* remove useless change

Signed-off-by: richardyu-ms <richard.yu@microsoft.com>

* Update azure-pipelines-build.yml

remove a useless option

Signed-off-by: richardyu-ms <richard.yu@microsoft.com>
2022-11-17 04:42:51 -08:00
Ashwin Srinivasan
7de04504c9
Added libpci and pciutils to the pmon docker (#12684)
This enables the pcied daemon to call the corresponding system commands needed for pci transactions
2022-11-14 11:44:36 -08:00
Arnaud
9d3814045b
[docker-fpm-frr]: Add unified-split mode to routing config (#11938)
- Why I did it
The values for config_db "docker_routing_config_mode" are:

separated: FRR config generated from ConfigDB, each FRR daemon has its own config file
unified: FRR config generated from ConfigDB, single FRR config file
split: FRR config not generated from ConfigDB, each FRR daemon has its own config file
This commit adds:
split-unified: FRR config not generated from ConfigDB, single FRR config file

- How I did it
In docker_init.sh, when split-unified is used, the FRR configs are not generated
from ConfigDB. What's more, "service integrated-vtysh-config" is configured in vtysh.conf.

- How to verify it
FRR config not overwritten when FRR container starts.

Signed-off-by: Arnaud le Taillanter <a.letaillanter@criteo.com>
2022-11-14 10:37:48 -08:00
Zain Budhwani
98ace33b0f
Add rsyslog plugin regex for select operation failure (#12659)
Added events for select op, alpm parity error, moved dhcp events from host to container
2022-11-13 21:41:33 -08:00
Liu Shilong
6d78199d6f
Revert "[SAI PTF]Syncd-rpc and PTF docker support sai ptf v2 (#11610)" (#12677)
This reverts commit f0873f29d8.
2022-11-14 09:56:10 +08:00
Caitlin Choate
66f1cc458d
Bugfix #9739: Support when 'bgp_asn' is set to 'None', 'Null', or missing. (#12588)
bgpd.main.conf.j2: bugfix-9739
* Update bgpd.main.conf.j2 to gracefully handle the bgp configuration cases for when 'bgp_asn' is set to 'None', 'Null', or missing.

How I did it
Include a conditional statement to avoid configuring bgp in FRR when 'bgp_asn' is missing or set to 'None' or 'Null'

How to verify it
Configure 'bgp_asn' as 'None', 'Null' or have it missing from configurations and verify that /etc/frr/bgpd.conf does not have invalid bgp configurations like 'router bgp None'

Description for the changelog
Update bgpd.main.conf.j2 to gracefully handle the bgp configuration cases for when 'bgp_asn' is set to 'None', 'Null', or missing for bugfix 9739.

Signed-off-by: cchoate54@gmail.com
2022-11-08 16:53:14 -08:00
xumia
ac5d89c6ac
[Build] Support j2 template for debian sources (#12557)
Why I did it
Unify the Debian mirror sources
Make easy to upgrade to the next Debian release, not source url code change required.
Support to customize the Debian mirror sources during the build
Relative issue: #12523
2022-11-09 08:09:53 +08:00
Richard.Yu
f0873f29d8
[SAI PTF]Syncd-rpc and PTF docker support sai ptf v2 (#11610)
* support sai-ptf-v2 in libsaithrift vs

* add build target docker-ptf-sai syncd-rpcv2 and saiserverv2

Signed-off-by: richardyu-ms <richard.yu@microsoft.com>

* add docker ptf sai

Signed-off-by: richardyu-ms <richard.yu@microsoft.com>

* add build condition for broadcom

Signed-off-by: richardyu-ms <richard.yu@microsoft.com>

* add docker syncd dbg and add debug symbol to docker-saiserverv2

Signed-off-by: richardyu-ms <richard.yu@microsoft.com>

* correct the build option

* change the azure pipeline build template

Signed-off-by: richardyu-ms <richard.yu@microsoft.com>

* change build option for docker-ptf-sai

* enable ptf-sai docker build

* remove the build for syncd-rpcv2

Signed-off-by: richardyu-ms <richard.yu@microsoft.com>

* fix issue in build tempalte

* ignore useless package build when build sai-ptf

Signed-off-by: richardyu-ms <richard.yu@microsoft.com>

* remove scapy version contraint

Signed-off-by: richardyu-ms <richard.yu@microsoft.com>

* remove duplicated target docker-ptf

Signed-off-by: richardyu-ms <richard.yu@microsoft.com>

* change template for testing the pipeline

Signed-off-by: richardyu-ms <richard.yu@microsoft.com>

* remove duplicated target

Signed-off-by: richardyu-ms <richard.yu@microsoft.com>

* fix error in make script

Signed-off-by: richardyu-ms <richard.yu@microsoft.com>

* add shel to setup env

Signed-off-by: richardyu-ms <richard.yu@microsoft.com>

* replace with certain platform name

Signed-off-by: richardyu-ms <richard.yu@microsoft.com>

* disable cache for syncd-rpcv2

Signed-off-by: richardyu-ms <richard.yu@microsoft.com>

* test without cache

Signed-off-by: richardyu-ms <richard.yu@microsoft.com>

* disable cache

Signed-off-by: richardyu-ms <richard.yu@microsoft.com>

* testing: disable the cache for build syncd-rpcv2

Signed-off-by: richardyu-ms <richard.yu@microsoft.com>

* add cache back and get the code ready for testing

Signed-off-by: richardyu-ms <richard.yu@microsoft.com>

* refactor code

Signed-off-by: richardyu-ms <richard.yu@microsoft.com>

* add workaround for issue in rules/sairedis.dep

Signed-off-by: richardyu-ms <richard.yu@microsoft.com>

* refactor code

Signed-off-by: richardyu-ms <richard.yu@microsoft.com>

Signed-off-by: richardyu-ms <richard.yu@microsoft.com>
2022-11-07 21:47:52 +08:00
Mai Bui
61a085e55e
Replace os.system and remove subprocess with shell=True (#12177)
Signed-off-by: maipbui <maibui@microsoft.com>
#### Why I did it
`subprocess` is used with `shell=True`, which is very dangerous for shell injection.
`os` - not secure against maliciously constructed input and dangerous if used to evaluate dynamic content
#### How I did it
remove `shell=True`, use `shell=False`
Replace `os` by `subprocess`
2022-11-04 10:48:51 -04:00
tjchadaga
763d3dc29d
Allow TSA on ibgp sessions between linecards on packet chassis (#12589) 2022-11-03 08:54:33 -07:00
arlakshm
a85b34fd36
update notify-keyspace-events in redis.conf (#12540)
Signed-off-by: Arvindsrinivasan Lakshmi Narasimhan arlakshm@microsoft.com

Why I did it
closes #12343

Today in SONiC the notify-keyspace-events is from DbInterface class when application try do any configdb set.
In Chassis the chassis_db may not get any configdb set operations, so there is chance this configuration will never be set.
So the chassis_db updates from one line card will not be propogated to other linecards, which are doing a psubscribe to get these event.

How I did it
update the redis.conf to set notify-keyspace-events AKE so that the notify-keyspace-events are set when the redis instance is started

How to verify it
Test on chassis
2022-10-28 18:28:57 -07:00
xumia
a771a26d99
[Build] Add the missing debian source bullseye-updates/buster-updates (#12522)
Why I did it
Add the missing debian source bullseye-updates/buster-updates

The build failure as below, it is caused by the docker image debian:bullseye used the version 2.31-13+deb11u5, but the version only available in bullseye-update.
2022-10-27 10:15:14 -07:00
kellyyeh
f4046c1417
Add dhcp6relay dualtor option (#12459) 2022-10-21 10:33:10 -07:00