Commit Graph

1172 Commits

Author SHA1 Message Date
Saikrishna Arcot
c991c5f16e
Upgrade scapy in the PTF's python3 virtualenv to 2.5.0 (#15573)
This is primarily to fix a bug in scapy hitting an error when trying to
listen on multiple interfaces in a single `sniff` call. This also
upgrades it to the current latest version.

Signed-off-by: Saikrishna Arcot <sarcot@microsoft.com>
2023-07-14 08:36:30 -07:00
xumia
dc5258eed5
[Build] Fix the python module importlib.metadata not found issue (#15800)
Why I did it
It is to fix the docker-ptf-sai build failure.
https://dev.azure.com/mssonic/build/_build/results?buildId=311315&view=logs&j=cef3d8a9-152e-5193-620b-567dc18af272&t=cf595088-5c84-5cf1-9d7e-03331f31d795

2023-07-09T13:53:19.9025355Z �[91mTraceback (most recent call last):
2023-07-09T13:53:19.9025715Z   File "/root/ptf/.eggs/setuptools_scm-7.1.0-py3.7.egg/setuptools_scm/_entrypoints.py", line 74, in <module>
2023-07-09T13:53:19.9025933Z     from importlib.metadata import entry_points  # type: ignore
2023-07-09T13:53:19.9026167Z ModuleNotFoundError: No module named 'importlib.metadata'
Work item tracking
Microsoft ADO (number only): 24513583
How I did it
How to verify it
2023-07-13 10:38:46 +08:00
SuvarnaMeenakshi
9864dfeaa1
[SNMP][IPv6]: Fix SNMP IPv6 reachability issue in certain scenarios (#15487)
Modify snmpd.conf to start snmpd to listen on specific management and loopback ips instead of listening on any ip.

#### Why I did it
SNMP over IPv6 is not working for all scenarios for a single asic platforms.
The expectation is that SNMP query over IPv6 should work over Management or Loopback0 addresses.
**Specific scenario where this issue is seen**
In case of Lab T0 device,  when SNMP request is sent from a directly connected T1 neighbor over Loopback IP, SNMP response was not received.
This was because the SRC IP address in SNMP response was not Loopback IP, it was the PortChannel IP connected to the neighboring device.
```
23:18:51.620897  In 22:26:27:e6:e0:07 ethertype IPv6 (0x86dd), length 105: fc00::72.41725 > **fc00:1::32**.161:  C="msft" **GetRequest**(28)  .1.3.6.1.2.1.1.1.0
23:18:51.621441 Out 28:99:3a:a0:97:30 ethertype IPv6 (0x86dd), length 241: **fc00::71**.161 > fc00::72.41725:  C="msft" **GetResponse**(162)  .1.3.6.1.2.1.1.1.0="SONiC Software Version: SONiC.xxx - HwSku: xx - Distribution: Debian 10.13 - Kernel: 4.19.0-12-2-amd64"
```
In case of IPv4, the SRC IP in SNMP response was correctly set to Loopback IP.
```
23:25:32.769712  In 22:26:27:e6:e0:07 ethertype IPv4 (0x0800), length 85: 10.0.0.57.56701 > **10.1.0.32**.161:  C="msft" **GetRequest**(28)  .1.3.6.1.2.1.1.1.0
23:25:32.975967 Out 28:99:3a:a0:97:30 ethertype IPv4 (0x0800), length 221: **10.1.0.32**.161 > 10.0.0.57.56701:  C="msft" **GetResponse**(162)  .1.3.6.1.2.1.1.1.0="SONiC Software Version: SONiC.xxx - HwSku: xx - Distribution: Debian 10.13 - Kernel: 4.19.0-12-2-amd64"
```

**Sequence of SNMP request and response**
1. SNMP request will be sent with SRC IP fc00::72 DST IP fc00:1::32
2. SNMP request is received at SONiC device is sent to snmpd which is listening on port 161 :::161/
3. snmpd process will parse the request create a response and sent to DST IP fc00::72. 
snmpd process does not track the DST IP on which the SNMP request was received, which in this case is Loopback IP.
snmpd process will only keep track what is tht IP to which the response should be sent to.
4. snmpd process will send the response packet.
5. Kernel will do a route look up on destination IP and find the best path.
ip -6 route get fc00::72
fc00::72 from :: dev PortChannel101 proto kernel src fc00::71 metric 256 pref medium
5. Using the "src" ip from about, the response is sent out. This SRC ip is that of the PortChannel and not the device Loopback IP.

The same issue is seen when SNMP query is sent from a remote server over Management IP.
SONiC device eth0 --------- Remote server
SNMP request comes with SRC IP <Remote_server> DST IP <Mgmt IP>
If kernel finds best route to Remote_server_IP is via BGP neighbors, then it will send the response via front-panel interface with SRC IP as Loopback IP instead of Management IP.

Main issue is that in case of IPv6, snmpd ignores the IP address to which SNMP request was sent, in case of IPv6.
In case of IPv4, snmpd keeps track of DST IP of SNMP request, it will keep track if the SNMP request was sent to mgmt IP or Loopback IP.
Later, this IP is used in ipi_spec_dst as SRC IP which helps kernel to find the route based on DST IP using the right SRC IP.
https://github.com/net-snmp/net-snmp/blob/master/snmplib/transports/snmpUDPBaseDomain.c#L300 
ipi.ipi_spec_dst.s_addr = srcip->s_addr
Reference: https://man7.org/linux/man-pages/man7/ip.7.html
```
If IP_PKTINFO is passed to sendmsg(2)
              and ipi_spec_dst is not zero, then it is used as the local
              source address for the routing table lookup and for
              setting up IP source route options.  When ipi_ifindex is
              not zero, the primary local address of the interface
              specified by the index overwrites ipi_spec_dst for the
              routing table lookup.
```

**This issue is not seen on multi-asic platform, why?**
on multi-asic platform, there exists different network namespaces.
SNMP docker with snmpd process runs on host namespace.
Management interface belongs to host namespace.
Loopback0 is configured on asic namespaces.
Additional inforamtion on how the packet coming over Loopback IP reaches snmpd process running on host namespace: https://github.com/sonic-net/sonic-buildimage/pull/5420
Because of this separation of network namespaces, the route lookup of destination IP is confined to routing table of specific namespace where packet is received.
if packet is received over management interface, SNMP response also is sent out of management interface. Same goes with packet received over Loopback Ip.

##### Work item tracking
- Microsoft ADO **17537063**:

#### How I did it
Have snmpd listen on specific Management and Loopback IPs specifically instead of listening on any IP for single-asic platform.

Before Fix
```
admin@xx:~$ sudo netstat -tulnp | grep 161   
udp        0      0 0.0.0.0:161             0.0.0.0:*                           15631/snmpd         
udp6       0      0 :::161                  :::*                                15631/snmpd  
```
After fix
```
admin@device:~$ sudo netstat -tulnp | grep 161
udp        0      0 10.1.0.32:161           0.0.0.0:*                           215899/snmpd        
udp        0      0 10.3.1.1:161             0.0.0.0:*                           215899/snmpd        
udp6       0      0 fc00:1::32:161          :::*                                215899/snmpd        
udp6       0      0 fc00:2::32:161          :::*                                215899/snmpd  
``` 

**How this change helps with the issue?**
To see snmpd trace logs, modify snmpd to start using the below parameters, in supervisord.conf file
```
/usr/sbin/snmpd -f -LS0-7i -Lf /var/log/snmpd.log
```
When snmpd listens on any IP, snmpd binds to IPv4 and IPv6 sockets as below:
```
netsnmp_udpbase: binding socket: 7 to UDP: [0.0.0.0]:0->[0.0.0.0]:161
trace: netsnmp_udp6_transport_bind(): transports/snmpUDPIPv6Domain.c, 303:
netsnmp_udpbase: binding socket: 8 to UDP/IPv6: [::]:161
```

When IPv4 response is sent, it goes out of fd 7 and IPv6 response goes out of fd 8.
When IPv6 response is sent, it does not have the right SRC IP and it can lead to the issue described.

When snmpd listens on specific Loopback/Management IPs, snmpd binds to different sockets:
```
trace: netsnmp_udpipv4base_transport_bind(): transports/snmpUDPIPv4BaseDomain.c, 207:
netsnmp_udpbase: binding socket: 7 to UDP: [0.0.0.0]:0->[10.250.0.101]:161
trace: netsnmp_udpipv4base_transport_bind(): transports/snmpUDPIPv4BaseDomain.c, 207:
netsnmp_udpbase: binding socket: 8 to UDP: [0.0.0.0]:0->[10.1.0.32]:161
trace: netsnmp_register_agent_nsap(): snmp_agent.c, 1261:
netsnmp_register_agent_nsap: fd 8
netsnmp_udpbase: binding socket: 10 to UDP/IPv6: [fc00:1::32]:161
trace: netsnmp_register_agent_nsap(): snmp_agent.c, 1261:
netsnmp_register_agent_nsap: fd 10
netsnmp_ipv6: fmtaddr: t = (nil), data = 0x7fffed4c85d0, len = 28
trace: netsnmp_udp6_transport_bind(): transports/snmpUDPIPv6Domain.c, 303:
netsnmp_udpbase: binding socket: 9 to UDP/IPv6: [fc00:2::32]:161
```
When SNMP request comes in via Loopback IPv4, SNMP response is sent out of fd 8
```
trace: netsnmp_udpbase_send(): transports/snmpUDPBaseDomain.c, 511:
netsnmp_udp: send 170 bytes from 0x5581f2fbe30a to UDP: [10.0.0.33]:46089->[10.1.0.32]:161 on fd 8
```
When SNMP request comes in via Loopback IPv6, SNMP response is sent out of fd 10
```
netsnmp_ipv6: fmtaddr: t = (nil), data = 0x5581f2fc2ff0, len = 28
trace: netsnmp_udp6_send(): transports/snmpUDPIPv6Domain.c, 164:
netsnmp_udp6: send 170 bytes from 0x5581f2fbe30a to UDP/IPv6: [fc00::42]:43750 on fd 10
```

#### How to verify it
Verified on single asic and multi-asic devices.
Single asic SNMP query with Loopback 
```
ARISTA01T1#bash snmpget -v2c -c xxx 10.1.0.32 1.3.6.1.2.1.1.1.0
SNMPv2-MIB::sysDescr.0 = STRING: SONiC Software Version: SONiC.xx - HwSku: Arista-7260xx - Distribution: Debian 10.13 - Kernel: 4.19.0-12-2-amd64
ARISTA01T1#bash snmpget -v2c -c xxx fc00:1::32 1.3.6.1.2.1.1.1.0
SNMPv2-MIB::sysDescr.0 = STRING: SONiC Software Version: SONiC.xx - HwSku: Arista-7260xxx - Distribution: Debian 10.13 - Kernel: 4.19.0-12-2-amd64
```

On multi-asic -- no change.
```
sudo netstat -tulnp | grep 161
udp        0      0 0.0.0.0:161             0.0.0.0:*                           17978/snmpd         
udp6       0      0 :::161                  :::*                                17978/snmpd 
```
Query result using Loopback IP from a directly connected BGP neighbor
```
ARISTA01T2#bash snmpget -v2c -c xxx 10.1.0.32 1.3.6.1.2.1.1.1.0
SNMPv2-MIB::sysDescr.0 = STRING: SONiC Software Version: SONiC.xx - HwSku: xx - Distribution: Debian 9.13 - Kernel: 4.9.0-14-2-amd64
ARISTA01T2#bash snmpget -v2c -c xxx fc00:1::32 1.3.6.1.2.1.1.1.0
SNMPv2-MIB::sysDescr.0 = STRING: SONiC Software Version: SONiC.xx - HwSku: xx - Distribution: Debian 9.13 - Kernel: 4.9.0-14-2-amd64  
```
<!--
If PR needs to be backported, then the PR must be tested against the base branch and the earliest backport release branch and provide tested image version on these two branches. For example, if the PR is requested for master, 202211 and 202012, then the requester needs to provide test results on master and 202012.
-->
2023-07-12 09:52:06 -07:00
lixiaoyuner
c470b7dfd1
Add health check probe for k8s upgrade containers. (#15223)
#### Why I did it
After k8s upgrade a container, k8s can only know the container is running, don't know the service's status inside container. So we need a probe inside container, k8s will call the probe to check whether the container is really ready.
##### Work item tracking
- Microsoft ADO **(number only)**: 22453004
#### How I did it
Add a health check probe inside config engine container, the probe will check whether the start service exit normally or not if the start service exists and call the python script to do container self-related specific checks if the script is there. The python script should be implemented by feature owner if it's needed.

more details: [design doc](https://github.com/sonic-net/SONiC/blob/master/doc/kubernetes/health-check.md)
#### How to verify it
Check path /usr/bin/readiness_probe.sh inside container.

#### Which release branch to backport (provide reason below if selected)

- [ ] 201811
- [ ] 201911
- [ ] 202006
- [ ] 202012
- [ ] 202106
- [ ] 202111
- [x] 202205
- [x] 202211

#### Tested branch (Please provide the tested image version)
- [x] 20220531.28
2023-07-10 22:16:29 -07:00
ShiyanWangMS
c58923053a
Add Python3 packages to sonic-mgmt-docker (#15726)
Why I did it
This is part of sonic-mgmt-docker Python3 migration project.
Currently Python3 packages are in the Python3 virtual environment. This PR will add Python3 packages to real file system.
After we migrate all script to use Python3 in real file system, the Python3 venv will be deleted.

After this PR, in sonic-mgmt-docker,
Directly run cmd - pytest will use Python2's version.
python3 -m pytest will use Python3's version.

How I did it
Modify sonic-mgmt-docker j2 script.

How to verify it
Build a private sonic-mgmt-docker and run basic test case with Python3.
2023-07-11 09:54:10 +08:00
lixiaoyuner
ca29197184
Move k8s script to docker-config-engine (#14788)
Why I did it
To reduce the container's dependency from host system

Work item tracking
Microsoft ADO (number only):
17713469
How I did it
Move the k8s container startup script to config engine container, other than mount it from host.

How to verify it
Check file path(/usr/share/sonic/scripts/container_startup.py) inside config engine container.

Signed-off-by: Yun Li <yunli1@microsoft.com>
Co-authored-by: Qi Luo <qiluo-msft@users.noreply.github.com>
2023-07-05 14:44:48 -07:00
Hua Liu
c91707ff31
Migrate flush_unused_database from py-redis to sonic-swss-common (#15511)
Migrate flush_unused_database from py-redis to sonic-swss-common

#### Why I did it
flush_unused_database using py-redis, but sonic-swss-common already support flushdb, so we need migrate to sonic-swss-common

##### Work item tracking
- Microsoft ADO **(number only)**: 24292565

#### How I did it
Migrate flush_unused_database from py-redis to sonic-swss-common

#### How to verify it
Pass all UT and E2E test

#### Description for the changelog
Migrate flush_unused_database from py-redis to sonic-swss-common
2023-06-29 15:08:54 -07:00
nmoray
f978b2bb53
Timezone sync issue between the host and containers (#14000)
#### Why I did it
To fix the timezone sync issue between the containers and the host. If a certain timezone has been configured on the host (SONIC) then the expectation is to reflect the same across all the containers.

This will fix [Issue:13046](https://github.com/sonic-net/sonic-buildimage/issues/13046).

For instance, a PST timezone has been set on the host and if the user checks the link flap logs (inside the FRR), it shows the UTC timestamp. Ideally, it should be PST.
2023-06-25 16:36:09 -07:00
Ye Jianquan
6bb0483af3
[sonic-mgmt] install newest az-cli to mitigate old version az-cli issue (#15621)
Force merge to work around the az-cli installation issue.
2023-06-25 16:51:58 +08:00
Marty Y. Lok
16bb026c9c
[chassis][lldp] Fix the lldp error log in host instance which doesn't contain front panel ports (#14814)
* [chassis][lldp] Fix the lldp error log in host instance which doesn't contain front pannel ports

---------

Signed-off-by: mlok <marty.lok@nokia.com>
2023-06-23 00:56:38 -07:00
Shashanka Balakuntala
13897723c2
Modify azure cli to install through apt-get and pyaml to specific version supported by py2 (#15472)
Why I did it
Current docker-sonic-mgmt build is broken. So below are two fixes which can help in mitigating the same.

PYAML - Download a specific version in python2 as after https://pypi.org/project/pyaml/23.5.5/ there was support only for python3. This update happened on May 5th. And consequently all daily builds after this changes https://dev.azure.com/mssonic/build/_build/results?buildId=266733&view=results (starting build to break) kept failing
Azure-CLI - this can be downloaded by apt-get repository. So modify as an improvement.
Work item tracking
Microsoft ADO (number only): [Build] fix docker-sonic-mgmt build #15567
How I did it
By manually checking the release notes of pyaml and install azure-cli in newly installed docker container using apt-get

How to verify it
You can run below commands to validate:

make configure PLATFORM=generic
make target/docker-sonic-mgmt.gz

Second line would fail without the commit.
2023-06-22 20:51:33 +08:00
Zain Budhwani
e0f287b19a
Update gnxi ptr (#15562)
#### Why I did it

Need new changes that were added to gnxi inside ptf docker

##### Work item tracking
- Microsoft ADO **(number only)**: 17747466

#### How I did it

Update commit number

#### How to verify it

Pipeline
2023-06-21 10:55:37 -07:00
Longxiang Lyu
7fd48eb82d
[mux] Integrate linkmgrd with swss logger (#15392)
Signed-off-by: Longxiang Lyu <lolv@microsoft.com>
2023-06-19 14:39:31 -07:00
Hua Liu
05f1a5a31e
Add watchdog mechanism to swss service and generate alert when swss have issue. (#15429)
Add watchdog mechanism to swss service and generate alert when swss have issue. 

**Work item tracking**
Microsoft ADO (number only): 16578912

**What I did**
Add orchagent watchdog to monitor and alert orchagent stuck issue.

**Why I did it**
Currently SONiC monit system only monit orchagent process exist or not. If orchagent process stuck and stop processing, current monit can't find and report it.

**How I verified it**
Pass all UT.

Manually test process_monitoring/test_critical_process_monitoring.py can pass.

Add new UT https://github.com/sonic-net/sonic-mgmt/pull/8306 to check watchdog works correctly.

Manually test, after pause orchagent with 'kill -STOP <pid>', check there are warning message exist in log:

Apr 28 23:36:41.504923 vlab-01 ERR swss#supervisor-proc-watchdog-listener: Process 'orchagent' is stuck in namespace 'host' (1.0 minutes).

**Details if related**
Heartbeat message PR: https://github.com/sonic-net/sonic-swss/pull/2737
UT PR: https://github.com/sonic-net/sonic-mgmt/pull/8306
2023-06-12 17:53:54 -07:00
Ye Jianquan
cec9d7b83a
Revert "Add watchdog mechanism to swss service and generate alert when swss have issue. (#14686)" (#15390)
This reverts commit 44427a2f6b.
Docker image not updated during PR validation and caused PR check failures.
Force merge this revert. After cache is updated after this PR is merged, issue should be fixed.
2023-06-09 09:10:35 +08:00
abdosi
6139c525d2
updated internal route policy for chassis-packet (#15349)
What I did:

Workaround for the issue seen here : FRRouting/frr#13682
It seems there is timing issue where there are multiple recursive lookup needed to resolve nexthop of the route it's possible that it does not happen correctly causing route to remain in inactive state

Issue is seen on chassis-packet as there 2 level of recursive lookup needed for a given e-BGP learnt route
- Level1 to resolve e-BGP peer (connected route via bgp ) over Loopback4096 (i-BGP peering)
- Level 2 Loopback4096 over backend port-channels next-hops

For VOQ chassis there is no e-BGP peer (connected route via bgp )  resolution as route is added as Static route by orchagent over Ethernet-IB.

Also as part of this remove route-map policy from instance.conf.j2 as same is define in peer-group.j2.

Microsoft ADO: https://msazure.visualstudio.com/One/_workitems/edit/24198507

How I verify:
Functional Verification manually
Updated UT.
We will be adding sanity check in sonic-mgmt to make sure none of route are in inactive state.

Signed-off-by: Abhishek Dosi <abdosi@microsoft.com>
2023-06-07 09:17:44 -07:00
Hua Liu
44427a2f6b
Add watchdog mechanism to swss service and generate alert when swss have issue. (#14686)
This PR depends on https://github.com/sonic-net/sonic-swss/pull/2737 merge first.

**What I did**
Add orchagent watchdog to monitor and alert orchagent stuck issue.

**Why I did it**
Currently SONiC monit system only monit orchagent process exist or not. If orchagent process stuck and stop processing, current monit can't find and report it.

**How I verified it**
Pass all UT.
Add new UT https://github.com/sonic-net/sonic-mgmt/pull/8306 to check watchdog works correctly.
Manually test, after pause orchagent with 'kill -STOP <pid>', check there are warning message exist in log:

Apr 28 23:36:41.504923 vlab-01 ERR swss#supervisor-proc-watchdog-listener: Process 'orchagent' is stuck in namespace 'host' (1.0 minutes).

**Details if related**
Heartbeat message PR: https://github.com/sonic-net/sonic-swss/pull/2737
UT PR: https://github.com/sonic-net/sonic-mgmt/pull/8306
2023-06-05 22:21:17 -07:00
Mai Bui
1477f779de
modify commands using utilities_common.cli.run_command and advance sonic-utilities submodule on master (#15193)
Dependency:
sonic-net/sonic-utilities#2718

Why I did it
This PR sonic-net/sonic-utilities#2718 reduce shell=True usage in utilities_common.cli.run_command() function.

Work item tracking
Microsoft ADO (number only): 15022050
How I did it
Replace strings commands using utilities_common.cli.run_command() function to list of strings

due to circular dependency, advance sonic-utilities submodule
72ca4848 (HEAD -> master, upstream/master, upstream/HEAD) Add CLI configuration options for teamd retry count feature (sonic-net/sonic-utilities#2642)
359dfc0c [Clock] Implement clock CLI (sonic-net/sonic-utilities#2793)
b316fc27 Add transceiver status CLI to show output from TRANSCEIVER_STATUS table (sonic-net/sonic-utilities#2772)
dc59dbd2 Replace pickle by json (sonic-net/sonic-utilities#2849)
a66f41c4 [show] replace shell=True, replace xml by lxml, replace exit by sys.exit (sonic-net/sonic-utilities#2666)
57500572 [utilities_common] replace shell=True (sonic-net/sonic-utilities#2718)
6e0ee3e7 [CRM][DASH] Extend CRM utility to support DASH resources. (sonic-net/sonic-utilities#2800)
b2c29b0b [config] Generate sysinfo in single asic (sonic-net/sonic-utilities#2856)
2023-06-05 17:08:13 +08:00
Tejaswini Chadaga
8058550c09
[bgp]: Add sudo check for TSA/B/C command execution (#15288)
TSA/B/C scripts invoke commands that require root permissions. If the user does not have sudo permissions, the scripts today execute until the command and throw a backtrace with error at the specific command. Added a check to ensure the operations check for root permissions upfront.
2023-06-03 23:47:26 +02:00
Baorong Liu
acb423b255
[staticroutebfd]fix an issue on deleting a non-bfd static route (#15269)
* [static_route][staticroutebfd]fix an issue on deleting a non-bfd static route

Fix an issue for deleting a non-bfd static route also remove the staticroutebfd from critical_processes list and make it auto restart in the case of crash.
2023-06-02 11:46:56 -07:00
qiwang4
359b80e012
[master]staticroutebfd process implementation (#13789)
* [BFD] staticroutebfd implementation
* To enable the BFD for static route

HLD: sonic-net/SONiC#1216
2023-05-26 16:32:05 -07:00
Sachin Holla
ba6aba2b92
[mgmt-framework] Fix rest-server startup script (#14979)
This script was using 'null' as default value for all optional fields
of REST_SERVER table -- due to incorrect use of 'jq -r' command.
Server was not coming up when REST_SERVER entry exists but some fields
were not given (which is a valid configuration).
Fixed the jq query expression to return empty string for non existing
fields.

Signed-off-by: Sachin Holla <sachin.holla@broadcom.com>
2023-05-22 17:42:38 -07:00
Tejaswini Chadaga
4e60f0d563
Template change for BGP monitors on T2 (#14844)
Why I did it
To support BGPMon sessions from each T2 linecard ASIC

Work item tracking
Microsoft ADO (number only): 17873174
How I did it
Added change in BGPMon configuration to use Loopback4096 as source interface, since this has a unique IP per ASIC.

How to verify it
Tested by manually setting up BGPMon session on T2 LC and verified that Loopback4096 could be used as source
2023-05-09 13:40:00 -07:00
abdosi
9b8b4e6e4d
[bgp/TSA]: Fixed the internal peer route-map policy (#14804)
What I did:
In FRR command update source <interface-name> is not at address-family level. Because of this
internal peer route-map for ipv6 were getting applied to ipv4 address family. As a result
TSA over iBGP for Ipv6 was not getting applied.

How I verify:

Manual Verification of TSA over both ipv4 and ipv6 after fix works fine.
Updated UT for this.

Added sonic-mgmt test gap: sonic-net/sonic-mgmt#8170

Signed-off-by: Abhishek Dosi <abdosi@microsoft.com>
2023-05-05 13:55:05 -07:00
Zain Budhwani
4974b5c49c
Add idle conn duration config to telemetry.sh (#14903)
Why I did it
Supports new field in sonic-net/sonic-gnmi@258b887

Work item tracking
Microsoft ADO (number only): 13468195
How I did it
Add new field in telemetry.sh

How to verify it
Pipeline
2023-05-04 16:47:02 -07:00
Tejaswini Chadaga
ca224863cb
Changes to support TSA from supervisor (#14691)
Why I did it
Support for SONIC chassis isolation using TSA and un-isolation using TSB from supervisor module

Work item tracking
Microsoft ADO (number only): 17826134
How I did it
When TSA is run on the supervisor, it triggers TSA on each of the linecards using the secure rexec infrastructure introduced in sonic-net/sonic-utilities#2701. User password is requested to allow secure login to linecards through ssh, before execution of TSA/TSB on the linecards

TSA of the chassis withdraws routes from all the external BGP neighbors on each linecard, in order to isolate the entire chassis. No route withdrawal is done from the internal BGP sessions between the linecards to prevent transient drops during internal route deletion. With these changes, complete isolation of a single linecard using TSA will not be possible (a separate CLI/script option will be introduced at a later time to achieve this)

Changes also include no-stats option with TSC for quick retrieval of the current system isolation state

This PR also reverts changes in #11403

How to verify it
These changes have a dependency on sonic-net/sonic-utilities#2701 for testing

Run TSA from supervisor module and ensure transition to Maintenance mode on each linecard
Verify that all routes are withdrawn from eBGP neighbors on all linecards
Run TSB from supervisor module and ensure transition to Normal mode on each linecard
Verify that all routes are re-advertised from eBGP neighbors on all linecards
Run TSC no-stats from supervisor and verify that just the system maintenance state is returned from all linecards
2023-04-28 16:28:06 +08:00
judyjoseph
6370257fa3
[macsec]: show macsec: add --profile option, include profile name in show command output (#13940)
This PR is to add the following

Add a new options "--profile" to the show macsec command, to show all profiles in device
Update the currentl show macsec command, to show profile in each interface o/p. This will tell which macsec profile the interface is attached to.
2023-04-27 08:51:28 -07:00
Stepan Blyshchak
04099f075d
[BGP] support BGP pending FIB suppression (#12853)
Signed-off-by: Stepan Blyschak stepanb@nvidia.com

DEPENDS: #12852

Why I did it
To support BGP pending FIB suppression.

How I did it
I backported patches from FRR 8.4 feature that allows communicating ASIC route status back to FRR.
Also, added a new field in DEVICE_METADATA YANG model table. Added UT for YANG model changes.

How to verify it
Run on the switch.
2023-04-20 19:56:13 +08:00
Hua Liu
a14cc76879
Install python-redis package to docker containers (#14632)
Install python-redis package to docker containers

#### Why I did it
This this bug: https://github.com/sonic-net/sonic-buildimage/issues/14531
The 'flush_unused_database' is part of docker-database, and docker-database does not install python-redis package by itself. it's using redis installed by sonic-py-swsssdk.
So after remove sonic-py-swsssdk from container, this script break.

To this this bug and avoid similer bug happen again, install python-redis to docker containers which removed sonic-py-swsssdk .

#### How I did it
Install python-redis to containers.

#### How to verify it
Pass all UT.
Create new UT to cover this scenario: https://github.com/sonic-net/sonic-mgmt/pull/8032

#### Description for the changelog
Improve sudo cat command for RO user.
2023-04-19 18:14:48 -07:00
Zain Budhwani
e9a9c9e31f
Update telemetry.sh with threshold config (#14615)
#### Why I did it

Threshold is a new config field passed to telelemetry.go as parameter

#### How I did it

Add check for threshold

#### How to verify it

Modify telemetry.sh, systemctl restart telemetry, telemetry process has threshold of 100
2023-04-18 14:29:30 -07:00
Kuanyu Chen
cffd87a627
Add monit_snmp file to monitor memory usage (#14464)
#### Why I did it

When CPU is busy, the sonic_ax_impl may not have sufficient speed to handle the notification message sent from REDIS.
Thus, the message will keep stacking in the memory space of sonic_ax_impl.

If the condition continues, the memory usage will keep increasing.

#### How I did it

Add a monit file to check if the SNMP container where sonic_ax_impl resides in use more than 4GB memory.
If yes, restart the sonic_ax_impl process.

#### How to verify it

Run a lot of this command: `while true; do ret=$(redis-cli -n 0 set LLDP_ENTRY_TABLE:test1 test1); sleep 0.1; done;`
And check the memory used by sonic_ax_impl keeps increasing.

After a period, make sure the sonic_ax_impl is restarted when the memory usage reaches the 4GB threshold.
And verify the memory usage of sonic_ax_impl drops down from 4GB.
2023-04-06 12:19:11 -07:00
Christian Svensson
bce824723c
[sflow] Switch to bullseye (#14494)
Change references to use bullseye instead of buster

Why I did it
Almost all daemons in 202211 and master uses bullseye, and sflow was easy to migrate.

How I did it
Replaced the references, built and tested in 202211.
How to verify it

Build with the changes, enable sflow:
admin@sonic:~$ sudo config sflow collector add test 1.2.3.4
admin@sonic:~$ sudo config sflow collector enable
tcpdump on 1.2.3.4 and see that UDP sFlow are being sent.

Signed-off-by: Christian Svensson <blue@cmd.nu>
2023-04-03 09:49:35 -07:00
Christian Svensson
67abcff944
[nat] Switch to bullseye (#14495)
Change references to use bullseye instead of buster

Why I did it
Almost all daemons in 202211 and master uses bullseye, and NAT seems easy to migrate.

How I did it
Replaced the references, built with 202211 branch.

How to verify it
Not sure, it builds and tests pass as far as I can tell but I don't use the feature myself.

Signed-off-by: Christian Svensson <blue@cmd.nu>
2023-04-02 14:02:33 -07:00
Gokulnath-Raja
cedc4d914f
[sflow] Exception handling for if_nametoindex (#11437) (#13567)
catch system error and log as warning level instead of
error level in case interface was already deleted.

Why I did it
sflow process exited when failed to convert the interface index from interface name

How I did it
Added exception handling code and logged when OSError exception.

How to verify it
Recreated the bug scenario #11437 and ensured that sflow process not exited.

Description for the changelog
catch system error and log as warning level instead of
error level in case interface was already deleted.

Logs
steps :

root@sonic:~# sudo config vlan member del 4094 PortChannel0001
root@sonic:~# sudo config vlan member del 4094 Ethernet2
root@sonic:~# sudo config vlan del 4094
root@sonic:~#

"WARNING sflow#port_index_mapper: no interface with this name" is  seen but no crash is reported
syslogs :


Jan 23 09:17:24.420448 sonic NOTICE swss#orchagent: :- removeVlanMember: Remove member Ethernet2 from VLAN Vlan4094 lid:ffe vmid:27000000000a53
Jan 23 09:17:24.420710 sonic NOTICE swss#orchagent: :- flushFdbEntries: flush key: SAI_OBJECT_TYPE_FDB_FLUSH:oid:0x21000000000000, fields: 3
Jan 23 09:17:24.420847 sonic NOTICE swss#orchagent: :- recordFlushFdbEntries: flush key: SAI_OBJECT_TYPE_FDB_FLUSH:oid:0x21000000000000, fields: 3
Jan 23 09:17:24.426082 sonic NOTICE syncd#syncd: :- processFdbFlush: fdb flush succeeded, updating redis database
Jan 23 09:17:24.426242 sonic NOTICE syncd#syncd: :- processFlushEvent: received a flush port fdb event, portVid = oid:0x3a000000000a52, bvId = oid:0x26000000000a51
Jan 23 09:17:24.426374 sonic NOTICE syncd#syncd: :- processFlushEvent: pattern ASIC_STATE:SAI_OBJECT_TYPE_FDB_ENTRY:*oid:0x26000000000a51*, portStr oid:0x3a000000000a52
Jan 23 09:17:24.427104 sonic NOTICE bgp#fpmsyncd: :- onRouteMsg: RouteTable del msg for route with only one nh on eth0/docker0: fe80::/64 :: eth0
Jan 23 09:17:24.427182 sonic NOTICE bgp#fpmsyncd: :- onRouteMsg: RouteTable del msg for route with only one nh on eth0/docker0: fd00::/80 :: docker0
Jan 23 09:17:24.428502 sonic NOTICE swss#orchagent: :- meta_sai_on_fdb_flush_event_consolidated: processing consolidated fdb flush event of type: SAI_FDB_ENTRY_TYPE_DYNAMIC
Jan 23 09:17:24.429058 sonic NOTICE swss#orchagent: :- meta_sai_on_fdb_flush_event_consolidated: fdb flush took 0.000606 sec
Jan 23 09:17:24.431496 sonic NOTICE swss#orchagent: :- setHostIntfsStripTag: Set SAI_HOSTIF_VLAN_TAG_STRIP to host interface: Ethernet2
Jan 23 09:17:24.431675 sonic NOTICE swss#orchagent: :- flushFdbEntries: flush key: SAI_OBJECT_TYPE_FDB_FLUSH:oid:0x21000000000000, fields: 2
Jan 23 09:17:24.431797 sonic NOTICE swss#orchagent: :- recordFlushFdbEntries: flush key: SAI_OBJECT_TYPE_FDB_FLUSH:oid:0x21000000000000, fields: 2
Jan 23 09:17:24.437009 sonic NOTICE swss#orchagent: :- meta_sai_on_fdb_flush_event_consolidated: processing consolidated fdb flush event of type: SAI_FDB_ENTRY_TYPE_DYNAMIC
Jan 23 09:17:24.437532 sonic NOTICE swss#orchagent: :- meta_sai_on_fdb_flush_event_consolidated: fdb flush took 0.000514 sec
Jan 23 09:17:24.437942 sonic NOTICE syncd#syncd: :- processFdbFlush: fdb flush succeeded, updating redis database
Jan 23 09:17:24.438065 sonic NOTICE syncd#syncd: :- processFlushEvent: received a flush port fdb event, portVid = oid:0x3a000000000a52, bvId = oid:0x0
Jan 23 09:17:24.438173 sonic NOTICE syncd#syncd: :- processFlushEvent: pattern ASIC_STATE:SAI_OBJECT_TYPE_FDB_ENTRY:*, portStr oid:0x3a000000000a52
Jan 23 09:17:24.440348 sonic NOTICE swss#orchagent: :- removeBridgePort: Remove bridge port Ethernet2 from default 1Q bridgeJan 23 09:17:29.782554 sonic NOTICE swss#orchagent: :- removeVlan: VLAN Vlan4094 still has 1 FDB entries
Jan 23 09:17:29.791373 sonic WARNING sflow#port_index_mapper: no interface with this name

Signed-off-by: Gokulnath-Raja <Gokulnath_R@dell.com>
2023-03-27 10:19:05 -07:00
ShiyanWangMS
06795931b7
Add AZP agent necessary packages to sonic-mgmt-docker (#14291)
Why I did it
Add AZP agent necessary packages to sonic-mgmt-docker
Remove Python 201811 venv
Update some packages in order to meet internal security requirements

How I did it
Update sonic-mgmt-docker file

How to verify it
sonic-mgmt-docker can run: bash, apt update, apt install and ping.
start.sh is under /azp with exec permission.
env-201811 venv is removed.
jinja2 is upgrade to 2.10.1
2023-03-21 08:09:44 +08:00
Zain Budhwani
881b925d19
Fix telemetry.sh passing in null as log level value (#14303)
#### Why I did it

Bug in script that was passing in null as log level value if missing from config_db

#### How I did it

Added more robust conditional statement

#### How to verify it

1) Remove log_level from config db
2) config reload -y
3) telemetry should not crash
2023-03-20 16:22:11 -07:00
Vivek
f19c414176
[lldpmgrd] Don't log error message for outdated event (#14178)
- Why I did it
Fixes #14236

When a redis event quickly gets outdated during port breakout, error logs like this are seen

Mar  8 01:43:26.011724 r-leopard-56 INFO ConfigMgmt: Write in DB: {'PORT': {'Ethernet64': {'admin_status': 'down'}, 'Ethernet68': {'admin_status': 'down'}}}
Mar  8 01:43:26.012565 r-leopard-56 INFO ConfigMgmt: Writing in Config DB
Mar  8 01:43:26.013468 r-leopard-56 INFO ConfigMgmt: Write in DB: {'PORT': {'Ethernet64': None, 'Ethernet68': None}, 'INTERFACE': None}
Mar  8 01:43:26.018095 r-leopard-56 NOTICE swss#portmgrd: :- doTask: Configure Ethernet64 admin status to down
Mar  8 01:43:26.018309 r-leopard-56 NOTICE swss#portmgrd: :- doTask: Delete Port: Ethernet64
Mar  8 01:43:26.018641 r-leopard-56 NOTICE lldp#lldpmgrd[32]: :- pops: Miss table key PORT_TABLE:Ethernet64, possibly outdated
Mar  8 01:43:26.018654 r-leopard-56 ERR lldp#lldpmgrd[32]: unknown operation ''

- How I did it
Only log the error when the op is not empty and not one of ("SET" & "DEL" )

Signed-off-by: Vivek Reddy Karri <vkarri@nvidia.com>
2023-03-16 18:15:50 +02:00
Ye Jianquan
5e85c01621
Add scandir into sonic-mgmt docker image (#14219)
Why I did it
TestbedV2 requires scandir python package

How I did it
Install scandir packages
2023-03-14 08:58:11 +08:00
xumia
5f4d063506
[Build] Fix the mirror gpg key expired issue (#14206)
Why I did it
[Build] Fix the mirror gpg key expired issue
See vs build: https://dev.azure.com/mssonic/build/_build/results?buildId=231680&view=logs&j=cef3d8a9-152e-5193-620b-567dc18af272&t=cf595088-5c84-5cf1-9d7e-03331f31d795

How I did it
Add the apt option not to check the valid until, the option is set to the SONiC docker base image, docker ptf missing the option.

Acquire::Check-Valid-Until "false";
How to verify it
The build of docker-ptf is succeeded after fixed.

2023-03-11T17:26:35.1801999Z [ building ] [ target/docker-ptf.gz ] 
2023-03-11T17:38:10.1608536Z [ finished ] [ target/docker-ptf.gz ]
2023-03-13 11:13:21 +08:00
Yaqiang Zhu
284ba61a86
[dhcp-relay] Add dhcp_relay show cli (#13614)
Why I did it
Currently the show and clear cli of dhcp_relayis may cause confusion.

How I did it
Add doc for it: [doc] Add docs for dhcp_relay show/clear cli sonic-utilities#2649
Add dhcp_relay config cli and test cases.
show dhcp_relay ipv4 helper
show dhcp_relay ipv6 destination
show dhcp_relay ipv6 counters
sonic-clear dhcp_relay ipv6 counters

How to verify it
Unit test all passed
2023-03-06 10:48:25 -08:00
ppikh
de84eb98c7
[ptf]: Added package "wireshark-common" into PTF docker (#14070)
It will allow us to have application called "mergecap" - which can merge multiple .pcap files into single .pcapng file and convert it to .pcap file

Signed-off-by: Petro Pikh <petrop@nvidia.com>
2023-03-04 17:47:42 -08:00
Vaibhav Hemant Dixit
860bc7492a
Add shellcheck and mock modules for running unit and linter test (#14062) 2023-03-03 19:24:26 +00:00
Tejaswini Chadaga
f80bf7783d
Fix VOQ_CHASSIS_V6_PEER route-map config (#14055)
* Fix typo in VOQ_CHASSIS_V6_PEER route-map config

* Updated UT files with the changed config
2023-03-03 09:28:57 -08:00
Zain Budhwani
165e33b4e4
Remove dialout as critical process (#14006)
#### Why I did it

Remove dialout as critical process as it is no longer used in prod. As part of future work, can remove dialout completely

#### How I did it

Remove from critical process list
2023-02-28 15:56:54 -08:00
Yaqiang Zhu
c5a7ce0cf4
[dhcp_relay] Remove exist check while adding dhcpv6 relay (#13822)
Why I did it
DHCPv6 relay config entry is not useful while del dhcpv6 relay config.

How I did it
Remove dhcpv6_relay entry if it is empty and not check entry exist while adding dhcpv6 relay
2023-02-15 10:23:39 -08:00
Stepan Blyshchak
68e1079202
[FRR] Switch to dplane_fpm_nl plugin instead of fpm (#12852)
Why I did it
dplane_fpm_nl is a new FPM implementation in FRR. The old plugin fpm will not have any new features implemented. Usage of the new plugin gives us ability to use BGP suppression feature and next hop groups in the future.

How I did it
Switch to dplane_fpm_nl zebra plugin from old fpm plugin which is not supported anymore
Remove stale patches for old fpm plugin and add similar patches for dplane_fpm_nl

How to verify it
Build and run on the switch.

Signed-off-by: Stepan Blyschak <stepanb@nvidia.com>
2023-02-06 09:38:39 -08:00
Yaqiang Zhu
bb48ee92ab
[dhcp-relay] Add support for dhcp_relay config cli (#13373)
Why I did it
Currently the config cli of dhcpv4 is may cause confusion and config of dhcpv6 is missing.

How I did it
Add dhcp_relay config cli and test cases.

config dhcp_relay ipv4 helper (add | del) <vlan_id> <helper_ip_list>
config dhcp_relay ipv6 destination (add | del) <vlan_id> <destination_ip_list>
Updated docs for it in sonic-utilities: https://github.com/sonic-net/sonic-utilities/pull/2598/files
How to verify it
Build docker-dhcp-relay.gz with and without INCLUDE_DHCP_RELAY, and check target/docker-dhcp-relay.gz.log
2023-01-30 17:48:01 -08:00
Zain Budhwani
2068a2697a
Change bgp notification leaf name and mem_usage leaf type (#13012)
#### Why I did it

Improve naming convention for bgp notification events and change type of leaf for sonic-events-host mem usage from uint64 to decimal64

#### How I did it

Replace "-" with "_"

Replace uint64 with decimal64

#### How to verify it

Run yang model unit tests

#### Description for the changelog

Change YANG model leaf naming convention for bgp notification
2023-01-24 15:47:32 -08:00
abdosi
439d4eab98
[chassis] Fixed critical process not correct for database-chassis docker (#13445)
*Critical process for database-chassis is redis-chassis but critical_process contains hard-coded
to `redis` program always. Instead using jinja2 template to render critical process list based on database docker type. redis-chassis for database-chassis docker and redis for regular database docker.
2023-01-20 10:21:48 -08:00
pettershao-ragilenetworks
3c9837b484
]pmon]: Import requests libraries for Ragile platform (#13171)
if there is no request, you need to use curl to get data from bmc, and each query needs to start a curl process. pmon is a circular query, which will pull up multiple processes in a loop, which consumes a lot. Using request does not need to pull up the process.
2023-01-07 21:12:13 -08:00
xumia
38c5d7fcec
[Build] Support j2 template for debian sources for docker ptf (#13198)
Change to use the sources.list from the file generated from the j2 template
2022-12-29 23:33:21 -08:00
Sudharsan Dhamal Gopalarathnam
bbc72e78b0
[dhcp_relay]Fix the clear dhcp6relay_counters CLI (#13148)
Avoid traceback on sonic-clear command

sonic-clear dhcp6relay_counters 
Traceback (most recent call last):
  File "/usr/local/bin/sonic-clear", line 8, in <module>
    sys.exit(cli())
  File "/usr/local/lib/python3.9/dist-packages/click/core.py", line 764, in __call__
    return self.main(*args, **kwargs)
  File "/usr/local/lib/python3.9/dist-packages/click/core.py", line 717, in main
    rv = self.invoke(ctx)
  File "/usr/local/lib/python3.9/dist-packages/click/core.py", line 1137, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/usr/local/lib/python3.9/dist-packages/click/core.py", line 956, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/usr/local/lib/python3.9/dist-packages/click/core.py", line 555, in invoke
    return callback(*args, **kwargs)
  File "/usr/local/lib/python3.9/dist-packages/clear/plugins/dhcp-relay.py", line 19, in dhcp6relay_clear_counters
    counter = DHCPv6_Counter()
NameError: name 'DHCPv6_Counter' is not defined

- How I did it
Corrected the way to import using importlib

- How to verify it
Tested the sonic-clear command and verified no traceback is seen
2022-12-26 09:14:59 +02:00
Junchao-Mellanox
2126def04e
[infra] Support syslog rate limit configuration (#12490)
- Why I did it
Support syslog rate limit configuration feature

- How I did it
Remove unused rsyslog.conf from containers
Modify docker startup script to generate rsyslog.conf from template files
Add metadata/init data for syslog rate limit configuration

- How to verify it
Manual test
New sonic-mgmt regression cases
2022-12-20 10:53:58 +02:00
Longxiang Lyu
d2ab55cc15
[dualtor] Let T0 delay 10 seconds before sending BGP updates (#12996)
Why I did it
To ensure, that after a BGP startup, dualtor T0 receives BGP updates before sending out BGP updates.
Please refer to sonic-net/SONiC#1161 for more details.

How I did it
add coalesce-time 10000 to the frr bgp startup config.

Signed-off-by: Longxiang Lyu <lolv@microsoft.com>
2022-12-15 22:14:46 +00:00
Dmytro Lytvynenko
5550c5da08
[BFN]: Implement getting psu related sensors in sonic_platform directly from BMC (#12786)
Why I did it
Platform interface doesn't provide all sensors and using it isn't effective

How I did it
Request sensors via http from BMC server and parse the result

How to verify it
Related daemon in pmon populates redis db, run this command to view the contents
2022-12-14 22:21:36 +08:00
Vivek
5624d15a7c
Fix dependency of dhcp-mon on VLAN with only v6 (#13006)
Signed-off-by: Vivek Reddy Karri <vkarri@nvidia.com>

Signed-off-by: Vivek Reddy Karri <vkarri@nvidia.com>
2022-12-09 14:41:07 -08:00
Junchao-Mellanox
3b3837a636
[containercfgd] Add containercfgd and syslog rate limit configuration support (#12489)
* [containercfgd] Add containercfgd and syslog rate limit configuration support

* Fix build issue

* Fix checker issue

* Fix review comment

* Fix review comment

* Update containercfgd.py
2022-12-08 08:58:35 -08:00
Arvindsrinivasan Lakshmi Narasimhan
7db272556e
[chassis] update the asic_status.py to read from CHASSIS_FABRIC_ASIC_INFO_TABLE (#12576)
Signed-off-by: Arvindsrinivasan Lakshmi Narasimhan arlakshm@microsoft.com

Why I did it
Fixes #12575 and #12575

How I did it
In the PR sonic-net/sonic-platform-daemons#311 chassisd updates to CHASSIS_FABRIC_ASIC_INFO with the fabric asic info.
Updating the asic_status.py to read from the correct table.

How to verify it
test on chassis

Signed-off-by: Arvindsrinivasan Lakshmi Narasimhan <arlakshm@microsoft.com>
2022-12-07 21:53:47 -08:00
Vivek
d82e1321bc
[Bullseye] Upgrade sonic-sdk image to bullseye (#12649)
- Why I did it
Upgrade the app-extension developer environments (sonic-sdk & sonic-sdk-bullseye) to bullseye

- How to verify it
Built an app-extension using these images and verified if it is up and running.

Signed-off-by: Vivek Reddy <vkarri@nvidia.com>
2022-11-28 18:57:26 +02:00
Konstantin Vasin
6448afd338
[Build] set apt Acquire::Retries to 3 for bullseye (#12758)
Why I did it
There were some changes in apt source code in version 2.1.9.
As a result apt used in bullseye (2.2.4) is intolerant to network issues.
This was fixed in 10631550f1 Already fixed version is used in bookworm (2.5.4)
And not yet affected version is used in buster (1.8.2.3)

How I did it
Set Acquire::Retries to 3 for sonic-slave-bullseye, docker-base-bullseye and final Debian image.

Ref: https://bugs.launchpad.net/ubuntu/+source/apt/+bug/1876035

Signed-off-by: Konstantin Vasin k.vasin@yadro.com
2022-11-21 08:05:16 +08:00
Richard.Yu
47d63bcd06
[SAI PTF] SAI PTF docker support sai-ptf v2 (#12719)
* [SAI PTF] SAI PTF docker support sai-ptf v2

Publish the sai-ptf docker.

Take part of the change from previous PR #11610 (already reverted as some cache issue)
Cause in #11610, added two new target in it, one is sai-ptf another one is syncd-rpc with sai-ptf v2, to make the upgrade with more clear target, use this one take the sai-ptf one.

Test one:
NOSTRETCH=y NOJESSIE=y make configure PLATFORM=vs
NOSTRETCH=y NOJESSIE=y NOBULLSEYE=y SAITHRIFT_V2=y make target/docker-ptf-sai.gz

Signed-off-by: richardyu-ms <richard.yu@microsoft.com>

* remove useless change

Signed-off-by: richardyu-ms <richard.yu@microsoft.com>

* remove useless parameters

Signed-off-by: richardyu-ms <richard.yu@microsoft.com>

* remove useless change

Signed-off-by: richardyu-ms <richard.yu@microsoft.com>

* Update azure-pipelines-build.yml

remove a useless option

Signed-off-by: richardyu-ms <richard.yu@microsoft.com>
2022-11-17 04:42:51 -08:00
Ashwin Srinivasan
7de04504c9
Added libpci and pciutils to the pmon docker (#12684)
This enables the pcied daemon to call the corresponding system commands needed for pci transactions
2022-11-14 11:44:36 -08:00
Arnaud
9d3814045b
[docker-fpm-frr]: Add unified-split mode to routing config (#11938)
- Why I did it
The values for config_db "docker_routing_config_mode" are:

separated: FRR config generated from ConfigDB, each FRR daemon has its own config file
unified: FRR config generated from ConfigDB, single FRR config file
split: FRR config not generated from ConfigDB, each FRR daemon has its own config file
This commit adds:
split-unified: FRR config not generated from ConfigDB, single FRR config file

- How I did it
In docker_init.sh, when split-unified is used, the FRR configs are not generated
from ConfigDB. What's more, "service integrated-vtysh-config" is configured in vtysh.conf.

- How to verify it
FRR config not overwritten when FRR container starts.

Signed-off-by: Arnaud le Taillanter <a.letaillanter@criteo.com>
2022-11-14 10:37:48 -08:00
Zain Budhwani
98ace33b0f
Add rsyslog plugin regex for select operation failure (#12659)
Added events for select op, alpm parity error, moved dhcp events from host to container
2022-11-13 21:41:33 -08:00
Liu Shilong
6d78199d6f
Revert "[SAI PTF]Syncd-rpc and PTF docker support sai ptf v2 (#11610)" (#12677)
This reverts commit f0873f29d8.
2022-11-14 09:56:10 +08:00
Caitlin Choate
66f1cc458d
Bugfix #9739: Support when 'bgp_asn' is set to 'None', 'Null', or missing. (#12588)
bgpd.main.conf.j2: bugfix-9739
* Update bgpd.main.conf.j2 to gracefully handle the bgp configuration cases for when 'bgp_asn' is set to 'None', 'Null', or missing.

How I did it
Include a conditional statement to avoid configuring bgp in FRR when 'bgp_asn' is missing or set to 'None' or 'Null'

How to verify it
Configure 'bgp_asn' as 'None', 'Null' or have it missing from configurations and verify that /etc/frr/bgpd.conf does not have invalid bgp configurations like 'router bgp None'

Description for the changelog
Update bgpd.main.conf.j2 to gracefully handle the bgp configuration cases for when 'bgp_asn' is set to 'None', 'Null', or missing for bugfix 9739.

Signed-off-by: cchoate54@gmail.com
2022-11-08 16:53:14 -08:00
xumia
ac5d89c6ac
[Build] Support j2 template for debian sources (#12557)
Why I did it
Unify the Debian mirror sources
Make easy to upgrade to the next Debian release, not source url code change required.
Support to customize the Debian mirror sources during the build
Relative issue: #12523
2022-11-09 08:09:53 +08:00
Richard.Yu
f0873f29d8
[SAI PTF]Syncd-rpc and PTF docker support sai ptf v2 (#11610)
* support sai-ptf-v2 in libsaithrift vs

* add build target docker-ptf-sai syncd-rpcv2 and saiserverv2

Signed-off-by: richardyu-ms <richard.yu@microsoft.com>

* add docker ptf sai

Signed-off-by: richardyu-ms <richard.yu@microsoft.com>

* add build condition for broadcom

Signed-off-by: richardyu-ms <richard.yu@microsoft.com>

* add docker syncd dbg and add debug symbol to docker-saiserverv2

Signed-off-by: richardyu-ms <richard.yu@microsoft.com>

* correct the build option

* change the azure pipeline build template

Signed-off-by: richardyu-ms <richard.yu@microsoft.com>

* change build option for docker-ptf-sai

* enable ptf-sai docker build

* remove the build for syncd-rpcv2

Signed-off-by: richardyu-ms <richard.yu@microsoft.com>

* fix issue in build tempalte

* ignore useless package build when build sai-ptf

Signed-off-by: richardyu-ms <richard.yu@microsoft.com>

* remove scapy version contraint

Signed-off-by: richardyu-ms <richard.yu@microsoft.com>

* remove duplicated target docker-ptf

Signed-off-by: richardyu-ms <richard.yu@microsoft.com>

* change template for testing the pipeline

Signed-off-by: richardyu-ms <richard.yu@microsoft.com>

* remove duplicated target

Signed-off-by: richardyu-ms <richard.yu@microsoft.com>

* fix error in make script

Signed-off-by: richardyu-ms <richard.yu@microsoft.com>

* add shel to setup env

Signed-off-by: richardyu-ms <richard.yu@microsoft.com>

* replace with certain platform name

Signed-off-by: richardyu-ms <richard.yu@microsoft.com>

* disable cache for syncd-rpcv2

Signed-off-by: richardyu-ms <richard.yu@microsoft.com>

* test without cache

Signed-off-by: richardyu-ms <richard.yu@microsoft.com>

* disable cache

Signed-off-by: richardyu-ms <richard.yu@microsoft.com>

* testing: disable the cache for build syncd-rpcv2

Signed-off-by: richardyu-ms <richard.yu@microsoft.com>

* add cache back and get the code ready for testing

Signed-off-by: richardyu-ms <richard.yu@microsoft.com>

* refactor code

Signed-off-by: richardyu-ms <richard.yu@microsoft.com>

* add workaround for issue in rules/sairedis.dep

Signed-off-by: richardyu-ms <richard.yu@microsoft.com>

* refactor code

Signed-off-by: richardyu-ms <richard.yu@microsoft.com>

Signed-off-by: richardyu-ms <richard.yu@microsoft.com>
2022-11-07 21:47:52 +08:00
Mai Bui
61a085e55e
Replace os.system and remove subprocess with shell=True (#12177)
Signed-off-by: maipbui <maibui@microsoft.com>
#### Why I did it
`subprocess` is used with `shell=True`, which is very dangerous for shell injection.
`os` - not secure against maliciously constructed input and dangerous if used to evaluate dynamic content
#### How I did it
remove `shell=True`, use `shell=False`
Replace `os` by `subprocess`
2022-11-04 10:48:51 -04:00
tjchadaga
763d3dc29d
Allow TSA on ibgp sessions between linecards on packet chassis (#12589) 2022-11-03 08:54:33 -07:00
arlakshm
a85b34fd36
update notify-keyspace-events in redis.conf (#12540)
Signed-off-by: Arvindsrinivasan Lakshmi Narasimhan arlakshm@microsoft.com

Why I did it
closes #12343

Today in SONiC the notify-keyspace-events is from DbInterface class when application try do any configdb set.
In Chassis the chassis_db may not get any configdb set operations, so there is chance this configuration will never be set.
So the chassis_db updates from one line card will not be propogated to other linecards, which are doing a psubscribe to get these event.

How I did it
update the redis.conf to set notify-keyspace-events AKE so that the notify-keyspace-events are set when the redis instance is started

How to verify it
Test on chassis
2022-10-28 18:28:57 -07:00
xumia
a771a26d99
[Build] Add the missing debian source bullseye-updates/buster-updates (#12522)
Why I did it
Add the missing debian source bullseye-updates/buster-updates

The build failure as below, it is caused by the docker image debian:bullseye used the version 2.31-13+deb11u5, but the version only available in bullseye-update.
2022-10-27 10:15:14 -07:00
kellyyeh
f4046c1417
Add dhcp6relay dualtor option (#12459) 2022-10-21 10:33:10 -07:00
Lawrence Lee
37ad8befc1
[tunnel_pkt_handler]: Skip nonexistent intfs (#12424)
- Skip the interface status check if the interface does not exist. In the future, when the interface is created/comes up this check will be triggered again.

Signed-off-by: Lawrence Lee <lawlee@microsoft.com>
2022-10-20 09:29:57 -07:00
Xin Wang
a07aaca831
[docker-sonic-mgmt] Cleanup and upgrade some packages (#12218)
Why I did it
The Dockerfile of docker-sonic-mgmt became a little bit messy over time. Some packages are also a little bit too old. It would be better to do some cleanup and upgrade some important packages.

How I did it
Updated the dockerfile template for building docker-sonic-mgmt.

How to verify it
Locally built the docker-sonic-mgmt image and used it to run some test scripts.

Description for the changelog:
The build-essential package contains gcc and make. It's unnecessary to install them again.
The python-is-python2 package is included in the python package for Ubuntu 20.04. It's unnecessary to install it again.
Sort the apt and pip packages by alphabetic order.
Cleanup get-pip.py after installation.
Cleanup the python-scapy deb package after installation.
Ensure that the python pip, setuptools and wheel packages are up to date.
Install pytest-ansible from pip instead of from source code.
While installing docker-ce-cli, it's unnecessary to install curl and software-properties-common again.
Merged some pip install steps into one step.
Upgrade ansible from 2.8.12 to 2.9.27 for env-python3.
Upgrade pytest to 7.1.3 for env-python3.
Add ncclient package to evn-python3.
2022-10-18 10:02:30 +08:00
cytsao1
9ef8464964
[pmon] Add smartmontools to pmon docker (#11837)
* Add smartmontools to pmon docker

* Set smartmontools to install version 7.2-1 in pmon to match host; clean up smartmontools build files

* Add comments on smartmontools version for both host and pmon
2022-10-17 13:26:31 -07:00
Vivek
34f9a642dd
[DHCP_RELAY] Updated wait_for_intf.sh to wait for ipv6 global and link local addr (#12273)
- Why I did it
Fixes #11431

- How I did it
dhcp6relay binds to ipv6 addresses configured on these vlan interfaces
Thus check if they are ready before launching dhcp6relay

- How to verify it
Unit Tests
Tested on a live device

Signed-off-by: Vivek Reddy Karri <vkarri@nvidia.com>
2022-10-12 11:46:20 +03:00
Ye Jianquan
62692f4228
Add storage blob related packages (#12220)
Why I did it
TestbedV2 relies on storage blob related python package to upload logs.

How I did it
Install storage blob related packages
2022-10-10 11:29:26 +08:00
Zain Budhwani
09fe3f467f
Add Structured Events w/ YANG Models (#12270)
Add events for dhcp-relay, bgp, syncd, & kernel.
2022-10-09 20:23:31 -07:00
Adam Yeung
80c1210a6f
iccpd bullseye migration (#12097) 2022-10-06 11:28:53 -07:00
Ye Jianquan
7666af9403
Fix pip install error (#12198)
Fix the error of pip install introduced in PR #12197
2022-09-28 14:39:33 +08:00
Ye Jianquan
9c602320c3
install missed package python-dateutil (#12197)
Why I did it
Fix issue of can't import dateutil.parser in show_techsupport/test_auto_techsupport.py

How I did it
install python-dateutil
2022-09-28 11:38:41 +08:00
ShiyanWangMS
1995540758
Upgrade docker-sonic-mgmt base image from Ubuntu18.04 to 20.04 (#12056)
Upgrade docker-sonic-mgmt base image from Ubuntu18.04 to 20.04
2022-09-27 09:15:48 +08:00
Xin Wang
f50dc28789
[docker-sonic-mgmt] Deprecate azure-kusto-data & azure-kusto-ingest for py2 (#12143)
Why I did it
The python packages azure-kusto-data and azure-kusto-ingest packages for python2 are too old and not really used. The python3 environment has newer version of these packages installed. This change is to deprecate these two packages for python2 in docker-sonic-mgmt image.

How I did it
Removed the lines for installing old version of packages azure-kusto-data and azure-kusto-ingest in python2 in the Dockerfile template.

Signed-off-by: Xin Wang <xiwang5@microsoft.com>
2022-09-26 10:48:02 +08:00
Zain Budhwani
fd6a1b0ce2
Add events to host and create rsyslog_plugin deb pkg (#12059)
Why I did it

Create rsyslog plugin deb for other containers/host to install
Add events for bgp and host events
2022-09-21 09:20:53 -07:00
Zhaohui Sun
1effff9836
Enable system-site-packages for ptf docker and install thrift for test_qos_sai (#12094)
Why I did it
test_sai_qos failed because of the following error:

"stderr_lines": [
        "Traceback (most recent call last):", 
        "  File \"/usr/bin/ptf\", line 522, in <module>", 
        "    test_modules = load_test_modules(config)", 
        "  File \"/usr/bin/ptf\", line 413, in load_test_modules", 
        "    mod = imp.load_module(modname, *imp.find_module(modname, [root]))", 
        "  File \"saitests/switch.py\", line 19, in <module>", 
        "    import switch_sai_thrift", 
        "ImportError: No module named switch_sai_thrift"
    ], 

It's because test_sai_qos runs ptf script which imports switch_sai_thrift, switch_sai_thrift is installed from python-saithrift_0.9.4_amd64.deb.
For master image, the deb file is for python3, but ptf only has virtual python3 environment, that's why we add --system-site-packages to allow virtual env to access system site-packeges.

Add thrift package in docker ptf virtual python3 env, because currently env-python3 doesn't have thrift module which is needed in switch_sai_thrift.

How I did it
Enable --system-site-packages for virtual py3 env in ptf docker and install thrift for test_qos_sai

How to verify it
load and login ptf conatiner
dpkg - i python-saithrift_0.9.4_amd64.deb
source /root/env-python3/bin/activate
python
import switch_sai_thrift.switch_sai_rpc
Signed-off-by: Zhaohui Sun <zhaohuisun@microsoft.com>
2022-09-17 13:33:53 +08:00
anamehra
714b1807b6
Fix radv.conf traceback when VLAN_INTERFACE is not defined (#12034)
*Fix the if block scope to prevent traceback due to undefined vlan_list when  VLAN_INTERFACE is not defined.
2022-09-09 12:54:05 -07:00
ShiyanWangMS
acc17db0c8
Revert PR#11831 (#12035) "Upgrade docker-sonic-mgmt base image from Ubuntu18.04 to 20.04" 2022-09-09 22:18:13 +08:00
Ze Gan
5efd6f9748
[macsec]: Add MACsec clear CLI support (#11731)
Why I did it
To support clear MACsec counters by sonic-clear macsec

How I did it
Add macsec sub-command in sonic-clear to cache the current macsec stats, and in the show macsec command to check the cache and return the diff with cache file.

How to verify it

admin@vlab-02:~$ show macsec  Ethernet0
MACsec port(Ethernet0)
---------------------  -----------
cipher_suite           GCM-AES-128
enable                 true
enable_encrypt         true
enable_protect         true
enable_replay_protect  false
replay_window          0
send_sci               true
---------------------  -----------
        MACsec Egress SC (52540067daa70001)
        -----------  -
        encoding_an  0
        -----------  -
                MACsec Egress SA (0)
                -------------------------------------  --------------------------------
                auth_key                               9DDD4C69220A1FA9B6763F229B75CB6F
                next_pn                                1
                sak                                    BA86574D054FCF48B9CD7CF54F21304A
                salt                                   000000000000000000000000
                ssci                                   0
                SAI_MACSEC_SA_ATTR_CURRENT_XPN         52
                SAI_MACSEC_SA_STAT_OCTETS_ENCRYPTED    0
                SAI_MACSEC_SA_STAT_OCTETS_PROTECTED    0
                SAI_MACSEC_SA_STAT_OUT_PKTS_ENCRYPTED  0
                SAI_MACSEC_SA_STAT_OUT_PKTS_PROTECTED  0
                -------------------------------------  --------------------------------
        MACsec Ingress SC (525400d4fd3f0001)
                MACsec Ingress SA (0)
                ---------------------------------------  --------------------------------
                active                                   true
                auth_key                                 9DDD4C69220A1FA9B6763F229B75CB6F
                lowest_acceptable_pn                     1
                sak                                      BA86574D054FCF48B9CD7CF54F21304A
                salt                                     000000000000000000000000
                ssci                                     0
                SAI_MACSEC_SA_ATTR_CURRENT_XPN           56
                SAI_MACSEC_SA_STAT_IN_PKTS_DELAYED       0
                SAI_MACSEC_SA_STAT_IN_PKTS_INVALID       0
                SAI_MACSEC_SA_STAT_IN_PKTS_LATE          0
                SAI_MACSEC_SA_STAT_IN_PKTS_NOT_USING_SA  0
                SAI_MACSEC_SA_STAT_IN_PKTS_NOT_VALID     0
                SAI_MACSEC_SA_STAT_IN_PKTS_OK            0
                SAI_MACSEC_SA_STAT_IN_PKTS_UNCHECKED     0
                SAI_MACSEC_SA_STAT_IN_PKTS_UNUSED_SA     0
                SAI_MACSEC_SA_STAT_OCTETS_ENCRYPTED      0
                SAI_MACSEC_SA_STAT_OCTETS_PROTECTED      0
                ---------------------------------------  --------------------------------

admin@vlab-02:~$ sonic-clear macsec
Clear MACsec counters

admin@vlab-02:~$ show macsec  Ethernet0
MACsec port(Ethernet0)
---------------------  -----------
cipher_suite           GCM-AES-128
enable                 true
enable_encrypt         true
enable_protect         true
enable_replay_protect  false
replay_window          0
send_sci               true
---------------------  -----------
        MACsec Egress SC (52540067daa70001)
        -----------  -
        encoding_an  0
        -----------  -
                MACsec Egress SA (0)
                -------------------------------------  --------------------------------
                auth_key                               9DDD4C69220A1FA9B6763F229B75CB6F
                next_pn                                1
                sak                                    BA86574D054FCF48B9CD7CF54F21304A
                salt                                   000000000000000000000000
                ssci                                   0
                SAI_MACSEC_SA_ATTR_CURRENT_XPN         52
                SAI_MACSEC_SA_STAT_OCTETS_ENCRYPTED    0
                SAI_MACSEC_SA_STAT_OCTETS_PROTECTED    0
                SAI_MACSEC_SA_STAT_OUT_PKTS_ENCRYPTED  0
                SAI_MACSEC_SA_STAT_OUT_PKTS_PROTECTED  0
                -------------------------------------  --------------------------------
        MACsec Ingress SC (525400d4fd3f0001)
                MACsec Ingress SA (0)
                ---------------------------------------  --------------------------------
                active                                   true
                auth_key                                 9DDD4C69220A1FA9B6763F229B75CB6F
                lowest_acceptable_pn                     1
                sak                                      BA86574D054FCF48B9CD7CF54F21304A
                salt                                     000000000000000000000000
                ssci                                     0
                SAI_MACSEC_SA_ATTR_CURRENT_XPN           0 <---this counters was cleared.
                SAI_MACSEC_SA_STAT_IN_PKTS_DELAYED       0
                SAI_MACSEC_SA_STAT_IN_PKTS_INVALID       0
                SAI_MACSEC_SA_STAT_IN_PKTS_LATE          0
                SAI_MACSEC_SA_STAT_IN_PKTS_NOT_USING_SA  0
                SAI_MACSEC_SA_STAT_IN_PKTS_NOT_VALID     0
                SAI_MACSEC_SA_STAT_IN_PKTS_OK            0
                SAI_MACSEC_SA_STAT_IN_PKTS_UNCHECKED     0
                SAI_MACSEC_SA_STAT_IN_PKTS_UNUSED_SA     0
                SAI_MACSEC_SA_STAT_OCTETS_ENCRYPTED      0
                SAI_MACSEC_SA_STAT_OCTETS_PROTECTED      0
                ---------------------------------------  --------------------------------


Signed-off-by: Ze Gan <ganze718@gmail.com>
Co-authored-by: Judy Joseph <jujoseph@microsoft.com>
2022-09-07 08:16:23 +08:00
Renuka Manavalan
31e750ee0b
Fix PR build failure (#11973)
Some PR builds fails to find this file. Remove it temporarily until we root cause it
2022-09-06 15:13:05 -07:00
Zain Budhwani
6a54bc439a
Streaming structured events implementation (#11848)
With this PR in, you flap BGP and use events_tool to see the published events.
With telemetry PR #111 in and corresponding submodule update done in buildimage, one could run gnmi_cli to capture BGP flap events.
2022-09-03 07:33:25 -07:00
Zhaohui Sun
88191b063b
Add python-is-python3 package for bullseye base docker (#11895)
Why I did it
In latest syncd container, it is installed bullseye, can't find command '/usr/bin/python'.
Some scripts such as test_copp still calls /usr/bin/python in syncd.
Submitted the change in #11807 for syncd docker, but it's better to add it in bullseye base docker.

How I did it
Install python-is-python3 package in bullseye base docker to resolve this issue, whatever run python or python3, it will run /usr/bin/python3, will not cause the error of can't find command '/usr/bin/python'

How to verify it
run python in syncd container.

Signed-off-by: Zhaohui Sun <zhaohuisun@microsoft.com>
2022-09-01 08:13:24 +08:00
ShiyanWangMS
83704d9955
Upgrade docker-sonic-mgmt base image from Ubuntu18.04 to 20.04 (#11831)
Update base image from ubuntu18.04 to ubuntu20.04
Fix necessary dependencies.
After upgrade, Py2 is 2.7.18, Py3 is 3.8.10.
2022-08-25 15:55:01 +08:00
Hasan Naqvi
2d4ab9e979
Bullseye frr (#11777)
Why I did it
Migrate FRR to bullseye

How I did it
Makefile and docker config changes to refer to bullseye instead of buster.

How to verify it
Build bullseye frr docker.

Co-authored-by: Rajendra Dendukuri <rajendra.dendukuri@broadcom.com>
2022-08-21 17:04:47 -07:00
Saikrishna Arcot
9753f28d17
Upgrade snmp docker to Bullseye (#11741)
Signed-off-by: Saikrishna Arcot <sarcot@microsoft.com>

Signed-off-by: Saikrishna Arcot <sarcot@microsoft.com>
2022-08-19 11:20:17 -07:00
Saikrishna Arcot
2af7498b38
Upgrade LLDP docker to Bullseye (#11628)
Signed-off-by: Saikrishna Arcot <sarcot@microsoft.com>
2022-08-09 17:05:36 -07:00
Lawrence Lee
4a996f3662
[swss]: Run tunnel_pkt_handler on dualtor only (#11627)
At SWSS docker init time, check the device subtype and enable tunnel packet handler only if it is dualtor

Signed-off-by: Lawrence Lee <lawlee@microsoft.com>
2022-08-09 16:19:59 -07:00
Junchao-Mellanox
736c739bf4
Fix issue: rsyslog rate limit does not work on version 8.2110.0 (#11588)
#### Why I did it

The default stable version of rsyslog on bullseye has a bug about rate limit. It causes rate limit not work. The bug has been fixed on backport version 8.2206.0-1~bpo11+1.

Buster has no such issue.

#### How I did it

Upgrade rsyslog from 8.2110.0 to 8.2206.0-1~bpo11+1

#### How to verify it

Manual test
2022-08-04 15:10:34 -07:00
Robert J. Halstead
16eaece11d
Update p4rt configuration to match SONiC upstream schema. (#10725)
*The initial commit for the P4RT docker hard coded all the flags which makes it difficult to configure at runtime. Reading them from the CONFIG_DB allows for more flexibility.
2022-08-04 14:56:48 -07:00
amulyan7
1b33f864e5
Add ping package to pmon docker (#11550)
ping command is not working inside PMON docker (bullseye)
Use case: chassisd checks for module reachability inside PMON for "show chassis modules midplane-status" CLI, and on Cisco chassis, this uses ping command to check network reachability
2022-08-03 10:04:37 -07:00
Hua Liu
45ded68d8d
Fix docker database flush_unused_database failed issue (#11600)
#### Why I did it
Fix docker-database flush_unused_database failed issue: https://github.com/Azure/sonic-buildimage/issues/11597
When change flush_unused_database from use swsssdk to use swsscommon, get_instancelist() and get_dblist() name changed but not update.

#### How I did it
Change flush_unused_database code to use swsscommon API:
Change get_instancelist to getInstanceList.
Change get_dblist to getDbList.

#### How to verify it
Pass all E2E test.
Manually check syslog make sure error log not exist and swss, syncd, bgp service started.
Search code in Azure make sure there all similer case are fixed in this PR.

#### Which release branch to backport (provide reason below if selected)

<!--
- Note we only backport fixes to a release branch, *not* features!
- Please also provide a reason for the backporting below.
- e.g.
- [x] 202006
-->

- [ ] 201811
- [ ] 201911
- [ ] 202006
- [ ] 202012
- [ ] 202106
- [ ] 202111
- [ ] 202205

#### Description for the changelog
Fix docker-database flush_unused_database failed issue: https://github.com/Azure/sonic-buildimage/issues/11597
When change flush_unused_database from use swsssdk to use swsscommon, get_instancelist() and get_dblist() name changed but not update.

#### Link to config_db schema for YANG module changes
<!--
Provide a link to config_db schema for the table for which YANG model
is defined
Link should point to correct section on https://github.com/Azure/sonic-buildimage/blob/master/src/sonic-yang-models/doc/Configuration.md
-->

#### A picture of a cute animal (not mandatory but encouraged)



Co-authored-by: liuh-80 <azureuser@liuh-dev-vm-02.5fg3zjdzj2xezlx1yazx5oxkzd.hx.internal.cloudapp.net>
2022-08-03 10:18:00 +08:00
tjchadaga
cdd2786117
Fix for TSA error logging on multi-asic (#11519) 2022-07-30 22:16:58 -07:00
Ye Jianquan
526cd92f53
Install celery in sonic-mgmt image (#11554)
Install celery in sonic-mgmt image
2022-07-29 14:46:24 +08:00
Saikrishna Arcot
aff1fdecb8
[teamd]: Upgrade teamd docker to Bullseye (#11536)
Signed-off-by: Saikrishna Arcot <sarcot@microsoft.com>
2022-07-27 18:21:22 -07:00
abdosi
a380105461
Enable ARP Update Script for Packet based chassis. (#11465)
What I did:

    Following changes done for packet based chassis:-
    1> Run arp_update on LC's to resolve static route nexthops over backend
    port-channel interfaces.
    2> On Supervisor make sure arp_update exit gracefully
2022-07-26 16:50:16 -07:00
Junhua Zhai
f01749de99
[macsec] cli multi-namespace support (#11285)
Enable multi-asic platform support for macsec cli
2022-07-22 09:52:46 +08:00
gregshpit
5df09490dc
Ported Marvell armhf build on amd64 host for debian buster to use cross-comp… (#8035)
* Ported Marvell armhf build on x86 for debian buster to use cross-compilation instead of qemu emulation

Current armhf Sonic build on amd64 host uses qemu emulation. Due to the
nature of the emulation it takes a very long time, about 22-24 hours to
complete the build. The change I did to reduce the building time by
porting Sonic armhf build on amd64 host for Marvell platform for debian
buster to use cross-compilation on arm64 host for armhf target. The
overall Sonic armhf building time using cross-compilation reduced to
about 6 hours.

Signed-off-by: marvell <marvell@cpss-build3.marvell.com>

* Fixed final Sonic image build with dockers inside

* Update Dockerfile.j2

Fixed qemu-user-static:x86_64-aarch64-5.0.0-2 .

* Update cross-build-arm-python-reqirements.sh

Added support for both armhf and arm64 cross-build platform using $PY_PLAT environment variable.

* Update Makefile

Added TARGET=<cross-target> for armhf/arm64 cross-compilation.

* Reviewer's @qiluo-msft requests done

Signed-off-by: marvell <marvell@cpss-build3.marvell.com>

* Added new radius/pam patch for arm64 support

* Update slave.mk

Added missing back tick.

* Added libgtest-dev: libgmock-dev: to the buster Dockerfile.j2. Fixed arm perl version to be generic

* Added missing armhf/arm64 entries in /etc/apt/sources.list

* fix libc-bin core dump issue from xumia:fix-libc-bin-install-issue commit

* Removed unnecessary 'apt-get update' from sonic-slave-buster/Dockerfile.j2

* Fixed saiarcot895 reviewer's requests

* Fixed README and replaced 'sed/awk' with patches

* Fixed ntp build to use openssl

* Unuse sonic-slave-buster/cross-build-arm-python-reqirements.sh script (put all prebuilt python packages cross-compilation/install inside Dockerfile.j2). Fixed src/snmpd/Makefile to use -j1 in all cases

* Clean armhf cross-compilation build fixes

* Ported cross-compilation armhf build to bullseye

* Additional change for bullseye

* Set CROSS_BUILD_ENVIRON default value n

* Removed python2 references

* Fixes after merge with the upstream

* Deleted unused sonic-slave-buster/cross-build-arm-python-reqirements.sh file

* Fixed 2 @saiarcot895 requests

* Fixed @saiarcot895 reviewer's requests

* Removed use of prebuilt python wheels

* Incorporated saiarcot895 CC/CXX and other simplification/generalization changes

Signed-off-by: marvell <marvell@cpss-build3.marvell.com>

* Fixed saiarcot895 reviewer's  additional requests

* src/libyang/patch/debian-packaging-files.patch

* Removed --no-deps option when installing wheels. Removed unnecessary lazy_object_proxy arm python3 package instalation

Co-authored-by: marvell <marvell@cpss-build3.marvell.com>
Co-authored-by: marvell <marvell@cpss-build2.marvell.com>
2022-07-21 14:15:16 -07:00
yozhao101
0dfa79d3c4
[pft_docker] Checks out the update of gNMI client script. (#11450)
Signed-off-by: Yong Zhao yozhao@microsoft.com

Why I did it
This PR aims to check in the commit f2b11e4 introduced by the update related to gNMI python client.

How I did it
I changed the Dockfile.j2 such that the update of gNMI client script will be checked in when ptf docker image is built.

How to verify it
A PTF container image was built and then loaded on a testbed. I checked the update of gNMI client script was checked in.
2022-07-18 14:39:47 -07:00
tjchadaga
077a537b14
Log message fix for TSB (#11441) 2022-07-14 12:26:58 -07:00
賓少鈺
f92aca837d
PDE migration to bullseye (#10836)
#### Why I did it
Upgrade docker-pde to bullseye

#### How to verify it
Check Azp status
2022-07-13 11:58:47 -07:00
tjchadaga
849eb4bf32
Changes to persist TSA/B state across reloads (#11257) 2022-07-12 00:22:48 -07:00
Hua Liu
a9b7a1facd
Replace swsssdk with swsscommon (#11215)
#### Why I did it
Update scripts in sonic-buildimage from py-swsssdk to swsscommon


#### How I did it
Change code to use swsscommon.

#### How to verify it
Pass all E2E test case

#### Which release branch to backport (provide reason below if selected)

<!--
- Note we only backport fixes to a release branch, *not* features!
- Please also provide a reason for the backporting below.
- e.g.
- [x] 202006
-->

- [ ] 201811
- [ ] 201911
- [ ] 202006
- [ ] 202012
- [ ] 202106
- [ ] 202111
- [ ] 202205

#### Description for the changelog
Update scripts in sonic-buildimage from py-swsssdk to swsscommon

#### Link to config_db schema for YANG module changes
<!--
Provide a link to config_db schema for the table for which YANG model
is defined
Link should point to correct section on https://github.com/Azure/sonic-buildimage/blob/master/src/sonic-yang-models/doc/Configuration.md
-->

#### A picture of a cute animal (not mandatory but encouraged)
2022-07-11 10:01:10 +08:00
Ye Jianquan
27d53cbba7
FIX the build error introduced by textfsm 1.1.3(Published on 2022/7/6) (#11394)
Why I did it
sonic-mgmt docker image build error, because of the textfsm new version(1.1.3).
https://dev.azure.com/mssonic/build/_build/results?buildId=119147&view=logs&j=3dc8fd7e-4368-5a92-293e-d53cefc8c4b3&t=44e6c678-cb87-52d9-8547-bcdbd0ad6ae4&l=43043

How I did it
Fix textfsm version to 1.1.2

How to verify it
I build the image on my local env, and reproduce the issue with 1.1.3, it's fixed after I change the version to 1.1.2 .

Signed-off-by: jianquanye@microsoft.com
2022-07-09 19:02:10 +08:00
Ye Jianquan
5873e8ef9c
Upgrade snappi version to 0.7.44 (#11335)
Why I did it
Keysight provide a new version with some snappi API source code related fix: snappi[ixnetwork,convergence]==0.7.44

How I did it
Upgrade snappi version to 0.7.44

How to verify it
Whether it's installed in sonic-mgmt docker container
2022-07-06 16:59:19 +08:00
yozhao101
1720fa21d9
[tunnel_packet_handler] Add a whitespace in the warning syslog message. (#11232)
*This PR aims to add a whitespace in the warning syslog message of process tunnel_packet_handler.
Signed-off-by: Yong Zhao <yozhao@microsoft.com>
2022-06-28 17:29:02 -07:00
Yakiv Huryk
0ced7081c7
[asan] add print_suppressions=0 to ASAN configs (#11252)
- Why I did it
To provide an ability to suppress ASAN false positives and have a clean ASAN report for docker-sonic-vs/mlnx-syncd/orchagent docker

- How I did it
Added the "print_suppressions=0" to ASAN configs.

- How to verify it
add a suppression to some ASAN-enabled component (the suppression should catch some leak)
build with ENABLE_ASAN=y
run a test and see that the ASAN report is empty instead of having the suppression summary

Signed-off-by: Yakiv Huryk <yhuryk@nvidia.com>
2022-06-28 18:45:52 +03:00
Prince George
11ff80b786
Skip CMIS manager (#10907)
* Removed unwanted changes

* Fix j2 compilation error

* Address review comment

* Add newline
2022-06-22 10:14:06 -07:00
Junhua Zhai
04ea32b0c2
[macsec] CLI Supports display of gearbox macsec counter (#11113)
Why I did it
To support gearbox macsec counter display, following Azure/sonic-swss-common#622.

How I did it
Use swsscommon CounterTable API
2022-06-18 19:17:05 +08:00
Saikrishna Arcot
ead106eda8
Add ping to swss-layer docker (#11093)
Signed-off-by: Saikrishna Arcot <sarcot@microsoft.com>
2022-06-10 07:40:37 -07:00
Sudharsan Dhamal Gopalarathnam
14f6f70ca3
[BGP]Adding configuration knob to allow advertise Loopback ipv6 /128 prefix (#10958)
* [BGP]Adding configuration knob to allow advertise Loopback ipv6 /128 prefix
By default when IPv6 address is configured with /128 as subnet mask in Loopback0 interface, it will be advertised as prefix with /64 subnet.
To control this behavior a new field 'bgp_adv_lo_prefix_as_128' is introduced in DEVICE_METADATA table which when set to true will advertise prefix with /128 subnet as it is.
2022-06-06 08:51:04 -07:00
kellyyeh
6bdbe14975
[dhcp_relay] Add "vlan missing ip helper" dhcp relay unittest (#10654) 2022-06-04 11:37:04 -07:00
abdosi
0285bfe42e
[chassis] Fix issues regarding database service failure handling and mid-plane connectivity for namespace. (#10500)
What/Why I did:

Issue1: By setting up of ipvlan interface in interface-config.sh we are not tolerant to failures. Reason being interface-config.service is one-shot and do not have restart capability. 

Scenario: For example if let's say database service goes in fail state  then interface-services also gets failed because of dependency check but later database service gets restart but interface service will remain in stuck state and the ipvlan interface nevers get created.

Solution: Moved all the logic in database service from interface-config service which looks more align logically also since the namespace is created here and all the network setting (sysctl) are happening here.With this if database starts we recreate the interface.

Issue 2: Use of IPVLAN vs MACVLAN

Currently we are using ipvlan mode.  However above failure scenario is not handle correctly by ipvlan mode. Once the ipvlan interface is created and ip address assign to it and if we restart interface-config or database (new PR) service Linux Kernel gives error "Error: Address already assigned to an ipvlan device."  based on this:https://github.com/torvalds/linux/blob/master/drivers/net/ipvlan/ipvlan_main.c#L978Reason being if we do not do cleanup of ip address assignment (need to be unique for IPVLAN)  it remains in Kernel Database and never goes to free pool even though namespace is deleted. 

Solution: Considering this hard dependency of unique ip macvlan mode is better for us and since everything is managed by Linux Kernel and no dependency for on user configured IP address.

Issue3: Namespace database Service do not check reachability to Supervisor Redis Chassis   Server.

Currently there is no explicit check as we never do Redis PING from namespace to Supervisor Redis Chassis  Server. With this check it's possible we will start database and all other docker even though there is no connectivity and will hit the error/failure late in cycle

Solution: Added explicit PING from namespace that will check this reachability.

Issue 4:flushdb give exception when trying to accces Chassis Server DB over Unix Sokcet.

Solution: Handle gracefully via try..except and log the message.
2022-05-24 16:54:12 -07:00
kellyyeh
2ead3aaefc
[dhcp6relay] Fix option parsing and add dhcpv6 client messages (#10819) 2022-05-24 14:37:16 -07:00
Ze Gan
0156c21eff
[macsec-cli]: Fixing to config MACsec on the port will clear port attributes in config db (#10903)
Why I did it
There is a bug that the Port attributes in CONFIG_DB will be cleared if using sudo config macsec port add Ethernet0 or sudo config macsec port del Ethernet0

How I did it
To fetch the port attributes before set/remove MACsec field in port table.

Signed-off-by: Ze Gan <ganze718@gmail.com>
2022-05-24 18:42:54 +08:00
Ze Gan
910e1c6eb4
[docker-macsec]: MACsec CLI Plugin (#9390)
#### Why I did it
To provide MACsec config and show CLI for manipulating MACsec

#### How I did it
Add `config macsec` and `show macsec`.

#### How to verify it

This PR includes unittest for MACsec CLI, check Azp status.
- Add MACsec profile
```
admin@sonic:~$ sudo config macsec profile add --help
Usage: config macsec profile add [OPTIONS] <profile_name>

  Add MACsec profile

Options:
  --priority <priority>           For Key server election. In 0-255 range with
                                  0 being the highest priority.  [default:
                                  255]
  --cipher_suite <cipher_suite>   The cipher suite for MACsec.  [default: GCM-
                                  AES-128]
  --primary_cak <primary_cak>     Primary Connectivity Association Key.
                                  [required]
  --primary_ckn <primary_cak>     Primary CAK Name.  [required]
  --policy <policy>               MACsec policy. INTEGRITY_ONLY: All traffic,
                                  except EAPOL, will be converted to MACsec
                                  packets without encryption.  SECURITY: All
                                  traffic, except EAPOL, will be encrypted by
                                  SecY.  [default: security]
  --enable_replay_protect / --disable_replay_protect
                                  Whether enable replay protect.  [default:
                                  False]
  --replay_window <enable_replay_protect>
                                  Replay window size that is the number of
                                  packets that could be out of order. This
                                  field works only if ENABLE_REPLAY_PROTECT is
                                  true.  [default: 0]
  --send_sci / --no_send_sci      Send SCI in SecTAG field of MACsec header.
                                  [default: True]
  --rekey_period <rekey_period>   The period of proactively refresh (Unit
                                  second).  [default: 0]
  -?, -h, --help                  Show this message and exit.
```
- Delete MACsec profile
```
admin@sonic:~$ sudo config macsec profile del --help
Usage: config macsec profile del [OPTIONS] <profile_name>

  Delete MACsec profile

Options:
  -?, -h, --help  Show this message and exit.
```
- Enable MACsec on the port
```
admin@sonic:~$ sudo config macsec port add --help
Usage: config macsec port add [OPTIONS] <port_name> <profile_name>

  Add MACsec port

Options:
  -?, -h, --help  Show this message and exit.
```
- Disable MACsec on the port
```
admin@sonic:~$ sudo config macsec port del --help
Usage: config macsec port del [OPTIONS] <port_name>

  Delete MACsec port

Options:
  -?, -h, --help  Show this message and exit.

```
Show MACsec
```
MACsec port(Ethernet0)
---------------------  -----------
cipher_suite           GCM-AES-256
enable                 true
enable_encrypt         true
enable_protect         true
enable_replay_protect  false
replay_window          0
send_sci               true
---------------------  -----------
	MACsec Egress SC (5254008f4f1c0001)
	-----------  -
	encoding_an  2
	-----------  -
		MACsec Egress SA (1)
		-------------------------------------  ----------------------------------------------------------------
		auth_key                               849B69D363E2B0AA154BEBBD7C1D9487
		next_pn                                1
		sak                                    AE8C9BB36EA44B60375E84BC8E778596289E79240FDFA6D7BA33D3518E705A5E
		salt                                   000000000000000000000000
		ssci                                   0
		SAI_MACSEC_SA_ATTR_CURRENT_XPN         179
		SAI_MACSEC_SA_STAT_OCTETS_ENCRYPTED    0
		SAI_MACSEC_SA_STAT_OCTETS_PROTECTED    0
		SAI_MACSEC_SA_STAT_OUT_PKTS_ENCRYPTED  0
		SAI_MACSEC_SA_STAT_OUT_PKTS_PROTECTED  0
		-------------------------------------  ----------------------------------------------------------------
		MACsec Egress SA (2)
		-------------------------------------  ----------------------------------------------------------------
		auth_key                               5A8B8912139551D3678B43DD0F10FFA5
		next_pn                                1
		sak                                    7F2651140F12C434F782EF9AD7791EE2CFE2BF315A568A48785E35FC803C9DB6
		salt                                   000000000000000000000000
		ssci                                   0
		SAI_MACSEC_SA_ATTR_CURRENT_XPN         87185
		SAI_MACSEC_SA_STAT_OCTETS_ENCRYPTED    0
		SAI_MACSEC_SA_STAT_OCTETS_PROTECTED    0
		SAI_MACSEC_SA_STAT_OUT_PKTS_ENCRYPTED  0
		SAI_MACSEC_SA_STAT_OUT_PKTS_PROTECTED  0
		-------------------------------------  ----------------------------------------------------------------
	MACsec Ingress SC (525400edac5b0001)
		MACsec Ingress SA (1)
		---------------------------------------  ----------------------------------------------------------------
		active                                   true
		auth_key                                 849B69D363E2B0AA154BEBBD7C1D9487
		lowest_acceptable_pn                     1
		sak                                      AE8C9BB36EA44B60375E84BC8E778596289E79240FDFA6D7BA33D3518E705A5E
		salt                                     000000000000000000000000
		ssci                                     0
		SAI_MACSEC_SA_ATTR_CURRENT_XPN           103
		SAI_MACSEC_SA_STAT_IN_PKTS_DELAYED       0
		SAI_MACSEC_SA_STAT_IN_PKTS_INVALID       0
		SAI_MACSEC_SA_STAT_IN_PKTS_LATE          0
		SAI_MACSEC_SA_STAT_IN_PKTS_NOT_USING_SA  0
		SAI_MACSEC_SA_STAT_IN_PKTS_NOT_VALID     0
		SAI_MACSEC_SA_STAT_IN_PKTS_OK            0
		SAI_MACSEC_SA_STAT_IN_PKTS_UNCHECKED     0
		SAI_MACSEC_SA_STAT_IN_PKTS_UNUSED_SA     0
		SAI_MACSEC_SA_STAT_OCTETS_ENCRYPTED      0
		SAI_MACSEC_SA_STAT_OCTETS_PROTECTED      0
		---------------------------------------  ----------------------------------------------------------------
		MACsec Ingress SA (2)
		---------------------------------------  ----------------------------------------------------------------
		active                                   true
		auth_key                                 5A8B8912139551D3678B43DD0F10FFA5
		lowest_acceptable_pn                     1
		sak                                      7F2651140F12C434F782EF9AD7791EE2CFE2BF315A568A48785E35FC803C9DB6
		salt                                     000000000000000000000000
		ssci                                     0
		SAI_MACSEC_SA_ATTR_CURRENT_XPN           91824
		SAI_MACSEC_SA_STAT_IN_PKTS_DELAYED       0
		SAI_MACSEC_SA_STAT_IN_PKTS_INVALID       0
		SAI_MACSEC_SA_STAT_IN_PKTS_LATE          0
		SAI_MACSEC_SA_STAT_IN_PKTS_NOT_USING_SA  0
		SAI_MACSEC_SA_STAT_IN_PKTS_NOT_VALID     0
		SAI_MACSEC_SA_STAT_IN_PKTS_OK            0
		SAI_MACSEC_SA_STAT_IN_PKTS_UNCHECKED     0
		SAI_MACSEC_SA_STAT_IN_PKTS_UNUSED_SA     0
		SAI_MACSEC_SA_STAT_OCTETS_ENCRYPTED      0
		SAI_MACSEC_SA_STAT_OCTETS_PROTECTED      0
		---------------------------------------  ----------------------------------------------------------------
MACsec port(Ethernet1)
---------------------  -----------
cipher_suite           GCM-AES-256
enable                 true
enable_encrypt         true
enable_protect         true
enable_replay_protect  false
replay_window          0
send_sci               true
---------------------  -----------
	MACsec Egress SC (5254008f4f1c0001)
	-----------  -
	encoding_an  1
	-----------  -
		MACsec Egress SA (1)
		-------------------------------------  ----------------------------------------------------------------
		auth_key                               35FC8F2C81BCA28A95845A4D2A1EE6EF
		next_pn                                1
		sak                                    1EC8572B75A840BA6B3833DC550C620D2C65BBDDAD372D27A1DFEB0CD786671B
		salt                                   000000000000000000000000
		ssci                                   0
		SAI_MACSEC_SA_ATTR_CURRENT_XPN         4809
		SAI_MACSEC_SA_STAT_OCTETS_ENCRYPTED    0
		SAI_MACSEC_SA_STAT_OCTETS_PROTECTED    0
		SAI_MACSEC_SA_STAT_OUT_PKTS_ENCRYPTED  0
		SAI_MACSEC_SA_STAT_OUT_PKTS_PROTECTED  0
		-------------------------------------  ----------------------------------------------------------------
	MACsec Ingress SC (525400edac5b0001)
		MACsec Ingress SA (1)
		---------------------------------------  ----------------------------------------------------------------
		active                                   true
		auth_key                                 35FC8F2C81BCA28A95845A4D2A1EE6EF
		lowest_acceptable_pn                     1
		sak                                      1EC8572B75A840BA6B3833DC550C620D2C65BBDDAD372D27A1DFEB0CD786671B
		salt                                     000000000000000000000000
		ssci                                     0
		SAI_MACSEC_SA_ATTR_CURRENT_XPN           5033
		SAI_MACSEC_SA_STAT_IN_PKTS_DELAYED       0
		SAI_MACSEC_SA_STAT_IN_PKTS_INVALID       0
		SAI_MACSEC_SA_STAT_IN_PKTS_LATE          0
		SAI_MACSEC_SA_STAT_IN_PKTS_NOT_USING_SA  0
		SAI_MACSEC_SA_STAT_IN_PKTS_NOT_VALID     0
		SAI_MACSEC_SA_STAT_IN_PKTS_OK            0
		SAI_MACSEC_SA_STAT_IN_PKTS_UNCHECKED     0
		SAI_MACSEC_SA_STAT_IN_PKTS_UNUSED_SA     0
		SAI_MACSEC_SA_STAT_OCTETS_ENCRYPTED      0
		SAI_MACSEC_SA_STAT_OCTETS_PROTECTED      0
		---------------------------------------  ----------------------------------------------------------------
```
2022-05-19 21:59:37 +08:00
Lawrence Lee
0eeb249fd8
[swss]: Convert swss docker to bullseye (#10484)
* [swss]: Convert swss docker to bullseye

Signed-off-by: Lawrence Lee <lawlee@microsoft.com>
2022-05-17 13:55:59 -07:00
Alexander Allen
d202bf26d7
Upgrade mellanox platform containers (syncd / saiserver / syncd-rpc) and pmon to bullseye (#10580)
Fixes #9279

- Why I did it
Part of larger effort to move all SONiC systems to bullseye

- How I did it
1. Update container makefiles with correct dependencies
2. Update container Dockerfile with correct base image
3. Update container Dockerfile with correct apt dependencies
4. Update any other makefiles with dependencies to remove python2 support
5. Minor changes to support bullseye / python3

- How to verify it
Run regression on the switch:
1. Verify PTF community tests work
2. Verify syncd runs and all ports come up / pass traffic
3. Verify all platform tests succeed
2022-05-10 12:45:28 +03:00
Jing Zhang
322363b9ab
[master][sonic-linkmgrd] submodule updates (#10763)
[master][sonic-linkmgrd] submodule updates

df51322 Longxiang Lyu   Fri May 6 10:01:46 2022 +0800   Add `ActiveActiveStateMachine` implementation (#64)
e721ceb Jing Zhang      Wed May 4 10:07:14 2022 -0700   Add doc for default route related changes  (#63)
7bb06fb Jing Zhang      Tue May 3 09:48:28 2022 -0700   Add Cli support to enable or disable default route related feature (#68)
e4b02cb Jing Zhang      Mon May 2 13:27:54 2022 -0700   Reset WaitActiveUp count before switching to active (#70)
212d960 Jing Zhang      Wed Apr 27 10:35:05 2022 -0700  lower log level to warning (#69)
48abc9e Jing Zhang      Thu Apr 14 16:50:04 2022 -0700  Add support to enable switchover time measurement (with link prober interval decreased to 10ms) feature  (#61)
c4858a6 Jing Zhang      Thu Apr 14 11:27:55 2022 -0700  Avoid proactively switching to `active` if default route is missing  (#62)

sign-off: Jing Zhang zhangjing@microsoft.com
2022-05-06 13:42:23 -07:00
kellyyeh
cfdb8431df
[dhcp6relay] Add dhcpv6 option check (#10486) 2022-05-05 18:04:14 -07:00
xumia
8ec8900d31
Support SONiC OpenSSL FIPS 140-3 based on SymCrypt engine (#9573)
Why I did it
Support OpenSSL FIPS 140-3, see design doc: https://github.com/Azure/SONiC/blob/master/doc/fips/SONiC-OpenSSL-FIPS-140-3.md.

How I did it
Install the fips packages.
To build the fips packages, see https://github.com/Azure/sonic-fips
Azure pipelines: https://dev.azure.com/mssonic/build/_build?definitionId=412

How to verify it
Validate the SymCrypt engine:

admin@sonic:~$ dpkg-query -W | grep openssl
openssl 1.1.1k-1+deb11u1+fips
symcrypt-openssl        0.1

admin@sonic:~$ openssl engine -v | grep -i symcrypt
(symcrypt) SCOSSL (SymCrypt engine for OpenSSL)
admin@sonic:~$
2022-05-06 07:21:30 +08:00
Lior Avramov
6284d136ec
[LLDP] Enhance lldmgrd Redis events handling (#10593)
Why I did it
When lldpmgrd handled events of other tables besides PORT_TABLE, error message was printed to log.

How I did it
Handle event according to its file descriptor instead of looping all registered selectables for each coming event.

How to verify it
I verified same events are being handled by printing events key and operation, before and after the change.
Also, before the change, in init flow after config reload, when lldpmgrd handled events of other tables besides PORT_TABLE, error messages were printed to log, this issue is solved now.
2022-05-04 08:21:02 -07:00
Kalimuthu-Velappan
bc30528341
Parallel building of sonic dockers using native dockerd(dood). (#10352)
Currently, the build dockers are created as a user dockers(docker-base-stretch-<user>, etc) that are
specific to each user. But the sonic dockers (docker-database, docker-swss, etc) are
created with a fixed docker name and common to all the users.

    docker-database:latest
    docker-swss:latest

When multiple builds are triggered on the same build server that creates parallel building issue because
all the build jobs are trying to create the same docker with latest tag.
This happens only when sonic dockers are built using native host dockerd for sonic docker image creation.

This patch creates all sonic dockers as user sonic dockers and then, while
saving and loading the user sonic dockers, it rename the user sonic
dockers into correct sonic dockers with tag as latest.

	docker-database:latest <== SAVE/LOAD ==> docker-database-<user>:tag

The user sonic docker names are derived from 'DOCKER_USERNAME and DOCKER_USERTAG' make env
variable and using Jinja template, it replaces the FROM docker name with correct user sonic docker name for
loading and saving the docker image.
2022-04-28 08:39:37 +08:00
Zhaohui Sun
cc30771f6b
Add python3 virtual environment for docker-ptf (#10599)
Why I did it
Migrate ptftests script to python3, in order to do an incremental migration, add python virtual environment firstly, install all required python packages in virtual env as well.
Then migrate ptftests scripts from python2 to python3 one by one avoid impacting non-changed scripts.

Signed-off-by: Zhaohui Sun zhaohuisun@microsoft.com

How I did it
Add python3 virtual environment for docker-ptf.
Add submodule ptf-py3 and install patched ptf 0.9.3 into virtual environment as well, two ptf issues were reported here:
p4lang/ptf#173
p4lang/ptf#174

Signed-off-by: Zhaohui Sun <zhaohuisun@microsoft.com>
2022-04-26 09:13:26 +08:00
Song Yuan
aa62e33339
[chassis] Do not configure LLDP on recirc ports (#7909)
Why I did it
Recirc port is used to only forward traffic from one asic to another asic. So it's not required to configure LLDP on it.

How I did it
Add interface prefix helper for recirc port. Similar to skip configuring LLDP on inband port, add check in lldpmgrd to skip recirc port by checking interface prefix.
2022-04-25 13:14:17 -07:00
Maxime Lorrillere
0606add017
[chassis] Get asic PCI ID from CHASSIS_STATE_DB and update asic_id in CONFIG_DB (#9681)
Asic PCI ID (PCI address) is collected by chassisd (inside pmon -
Azure/sonic-platform-daemons#175) and saved in CHASSIS_STATE_DB (in
redis_chassis). CHASSIS_STATE_DB is accessible by swss containers.

At docker-init.sh (script is called after swss container is created and before
anything that could run in swss like orchagent...), we wait until asic PCI ID
of the corresponding asic is populated by chassisd. We then update asic_id in
CONFIG_DB of asic's database.

A system supporting dynamic asic PCI ID identification requires to have a file
(empty) use_pci_id_chassis in its platform dir.

When orchagent runs, it has correct asic PCI ID in its CONFIG_DB.

Together with this PR:

Azure/sonic-platform-daemons#175
Azure/sonic-platform-common#185

Signed-off-by: Maxime Lorrillere <mlorrillere@arista.com>

Co-authored-by: Maxime Lorrillere <mlorrillere@arista.com>
2022-04-25 13:09:42 -07:00
yozhao101
e24fe9bc60
[Monit] Fix the issue which shows Monit can not reset its counter. (#10288)
Signed-off-by: Yong Zhao <yozhao@microsoft.com>

Why I did it
This PR aims to fix the Monit issue which shows Monit can't reset its counter when monitoring memory usage of telemetry container.

Specifically the Monit configuration file related to monitoring memory usage of telemetry container is as following:

  check program container_memory_telemetry with path "/usr/bin/memory_checker telemetry 419430400"
      if status == 3 for 10 times within 20 cycles then exec "/usr/bin/restart_service telemetry"
If memory usage of telemetry container is larger than 400MB for 10 times within 20 cycles (minutes), then it will be restarted.
Recently we observed, after telemetry container was restarted, its memory usage continuously increased from 400MB to 11GB within 1 hour, but it was not restarted anymore during this 1 hour sliding window.

The reason is Monit can't reset its counter to count again and Monit can reset its counter if and only if the status of monitored service was changed from Status failed to Status ok. However, during this 1 hour sliding window, the status of monitored service was not changed from Status failed to Status ok.

Currently for each service monitored by Monit, there will be an entry showing the monitoring status, monitoring mode etc. For example, the following output from command sudo monit status shows the status of monitored service to monitor memory usage of telemetry:

    Program 'container_memory_telemetry'
         status                             Status ok
         monitoring status          Monitored
         monitoring mode          active
         on reboot                      start
         last exit value                0
         last output                    -
         data collected               Sat, 19 Mar 2022 19:56:26
Every 1 minute, Monit will run the script to check the memory usage of telemetry and update the counter if memory usage is larger than 400MB. If Monit checked the counter and found memory usage of telemetry is larger than 400MB for 10 times
within 20 minutes, then telemetry container was restarted. Following is an example status of monitored service:

    Program 'container_memory_telemetry'
         status                             Status failed
         monitoring status          Monitored
         monitoring mode          active
         on reboot                      start
         last exit value                0
         last output                    -
         data collected               Tue, 01 Feb 2022 22:52:55
After telemetry container was restarted. we found memory usage of telemetry increased rapidly from around 100MB to more than 400MB during 1 minute and status of monitored service did not have a chance to be changed from Status failed to Status ok.

How I did it
In order to provide a workaround for this issue, Monit recently introduced another syntax format repeat every <n> cycles related to exec. This new syntax format will enable Monit repeat executing the background script if the error persists for a given number of cycles.

How to verify it
I verified this change on lab device str-s6000-acs-12. Another pytest PR (Azure/sonic-mgmt#5492) is submitted in sonic-mgmt repo for review.
2022-04-20 18:08:06 -07:00
Jing Zhang
8b5d908c92
Upgrade mux container to Bullseye (#10498)
sign-off: Jing Zhang zhangjing@microsoft.com

#### Why I did it
As part of the process moving containers from buster to bullseye.

#### How I did it
1. change base image from buster to bullseye. 
2. remove unused addition to orchagent run options 

#### How to verify it
Tested building locally.
2022-04-19 09:27:45 -07:00
Ze Gan
87036c34ec
[macsec]: Upgrade docker-macsec to bullseye (#10574)
Following the patch from : https://packages.debian.org/bullseye/wpasupplicant, to upgrade sonic-wpa-supplicant for supporting bullseye and upgrade docker-macsec.mk as a bullseye component.
2022-04-17 20:32:51 +08:00
Zhaohui Sun
44cf773a96
Revert "[docker-ptf]: Upgrade scapy to 2.4.5 in docker-ptf (#10507)" (#10537)
It upgraded scapy to 2.4.5 in docker-ptf container, after this upgrade, all scripts under ansible/roles/test/files/ptftests will import scapy 2.4.5, some test cases will fail because they are not upgraded accordingly.

Reverts #10507 to avoid breaking regression test.

This reverts commit 92efc01270.
2022-04-13 08:57:10 +08:00
kellyyeh
396a92cb2e
[dhcp_relay] Remove dhcp6mon (#10467) 2022-04-12 10:44:17 -07:00
Ashwin Srinivasan
b4f8f1dd22
Removed python2 dependency for sonic-pcied in sonic-platform-daemons (#10421)
Removed python2 support for sonic-platform-daemons that was causing unit
test errors in sonic_pcied.
* Removed config from docker supervisord jinja templates per VD review comment
* Removed space and python3 per QL comments
2022-04-09 13:16:50 -07:00
Ze Gan
92efc01270
[docker-ptf]: Upgrade scapy to 2.4.5 in docker-ptf (#10507)
Why I did it
Existing dataplane tests cannot be tested under MACsec environment due to the traffic under MACsec link is encrypted. So, I will override the dp_poll of ptf to MACsec dp_poll to decrypt the MACsec packets on injected ports (PR: Azure/sonic-mgmt#5490). MACsec decryption library depends on scapy 2.4.5.

How I did it
Upgrade scapy library to 2.4.5 by pip.

How to verify it
Check the scapy version in docker-ptf by

python -c "import scapy; print(scapy.__version__)"
2.4.5

Signed-off-by: Ze Gan <ganze718@gmail.com>
2022-04-08 22:26:40 -07:00
kellyyeh
330d11a128
Add EPMS and MgmtTsToR (#10478) 2022-04-07 21:49:42 -07:00
Stepan Blyshchak
4426f7715f
[scapy] update scapy to 2.4.5 and patch it (#10457)
Why I did it
Running warm-reboot in a loop for 500 times leads to this error on 318-th iteration:

Apr  2 15:56:27.346747 sonic INFO swss#/supervisord: restore_neighbors Traceback (most recent call last):
Apr  2 15:56:27.346747 sonic INFO swss#/supervisord: restore_neighbors   File "/usr/bin/restore_neighbors.py", line 24, in <module>
Apr  2 15:56:27.346747 sonic INFO swss#/supervisord: restore_neighbors     from scapy.all import conf, in6_getnsma, inet_pton, inet_ntop, in6_getnsmac, get_if_hwaddr, Ether, ARP, IPv6, ICMPv6ND_NS, ICMPv6NDOptSrcLLAddr
Apr  2 15:56:27.346795 sonic INFO swss#/supervisord: restore_neighbors   File "/usr/local/lib/python3.7/dist-packages/scapy/all.py", line 25, in <module>
Apr  2 15:56:27.346956 sonic INFO swss#/supervisord: restore_neighbors     from scapy.route import *
Apr  2 15:56:27.346995 sonic INFO swss#/supervisord: restore_neighbors   File "/usr/local/lib/python3.7/dist-packages/scapy/route.py", line 205, in <module>
Apr  2 15:56:27.347089 sonic INFO swss#/supervisord: restore_neighbors     conf.iface = get_working_if()
Apr  2 15:56:27.347129 sonic INFO swss#/supervisord: restore_neighbors   File "/usr/local/lib/python3.7/dist-packages/scapy/arch/linux.py", line 128, in get_working_if
Apr  2 15:56:27.347213 sonic INFO swss#/supervisord: restore_neighbors     ifflags = struct.unpack("16xH14x", get_if(i, SIOCGIFFLAGS))[0]
Apr  2 15:56:27.347250 sonic INFO swss#/supervisord: restore_neighbors   File "/usr/local/lib/python3.7/dist-packages/scapy/arch/common.py", line 31, in get_if
Apr  2 15:56:27.347345 sonic INFO swss#/supervisord: restore_neighbors     return ioctl(sck, cmd, struct.pack("16s16x", iff.encode("utf8")))
Apr  2 15:56:27.347365 sonic INFO swss#/supervisord: restore_neighbors OSError: [Errno 19] No such device
The issue was reported to scapy devs secdev/scapy#3369, the fix is secdev/scapy#3371, however there is no released scapy version with this fix right now, thus decided to build scapy v2.4.5 from sources and apply the fix in a form of a patch.

Signed-off-by: Stepan Blyschak <stepanb@nvidia.com>
2022-04-07 14:23:35 +03:00
kellyyeh
8cd346d80b
Update docker-router-advertiser.supervisord.conf.j2 (#10375) 2022-04-06 09:44:21 -07:00
Saikrishna Arcot
588ed0b760
Upgrade router-advertiser container to Bullseye (#10374)
Change the base image from `docker-config-engine-buster` to
`docker-config-engine-bullseye`, and remove the hardcoded
`radvd` version from the Dockerfile.

Signed-off-by: Saikrishna Arcot <sarcot@microsoft.com>
2022-04-01 16:12:43 -07:00
Junchao-Mellanox
106fac5f09
[counter] Fix issue: non default counters will be delayed forever after fastboot (#10413)
- Why I did it
Fastboot will delay all counters in CONFIG DB, it relies on enable_counters.py to recover the delayed counters. However, enable_counters.py does not recover those non-default counters.

- How I did it
For non-default counters, if it is in CONFIG DB, put delay status to false after the waiting.

- How to verify it
Manual test
2022-03-31 15:23:57 +03:00
Lawrence Lee
b31df59c7c
[tun_pkt]: Wait for AsyncSniffer to init fully (#10346)
Fix for Tunnel packet handler can crash at system startup 
Signed-off-by: Lawrence Lee <lawlee@microsoft.com>
2022-03-30 14:03:29 -07:00
judyjoseph
8e642848c2
Introduce the asic_subtype field for adding the sub platform variants. (#10235)
* Introduce the asic_subtype field for adding the sub platform variants. 
   It uses the value of TARGET_MACHINE variable in slave.mk.
2022-03-28 11:22:32 -07:00
tomer-israel
cc938e73a3
Dynamic port configuration - solve lldp issues when adding/removing ports (#9386)
#### Why I did it
when adding and removing ports after init stage we saw two issues:

first:
In several cases, after removing a port, lldpmgr is continuing to try to add a port to lldp with lldpcli command. the execution of this command is continuing to fail since the port is not existing anymore.

second:
after adding a port, we sometimes see this warning messgae:
"Command failed 'lldpcli configure ports Ethernet18 lldp portidsubtype local etp5b': 2021-07-27T14:16:54 [WARN/lldpctl] cannot find port Ethernet18"

we added these changes in order to solve it.
#### How I did it
port create events are taken from app db only.
lldpcli command is executed only when linux port is up.

when delete port event is received we remove this command from  pending_cmds dictionary

#### How to verify it
manual tests and running lldp tests


#### Description for the changelog
Dynamic port configuration - solve lldp issues when adding/removing ports
2022-03-25 17:47:24 -07:00