Why I did it
This PR is part of sonic-mgmt-docker Python3 migration project.
Work item tracking
Microsoft ADO (number only): 24397943
How I did it
Upgrade Ansible to 6.7.0
Make Python3 the default interpreter. python is a soft link to python3. To use python2, invoke the python2 command explicitly.
Upgrade some pip packages to higher versions to meet security requirements.
How to verify it
Build a private sonic-mgmt-docker successfully.
Verify python is python3.
Verify python2 is working with the 202012 and 202205 branches.
Verify python3 is working with master branch.
Verify with github PR test.
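The "Verify python is python3" step can be sketched as a small script run with the plain python command. This is only an illustration; the helper name is_python3 is hypothetical and is not part of the PR.

```python
import sys


def is_python3(version_info=sys.version_info):
    """Return True when the given version_info belongs to a Python 3 interpreter."""
    return version_info[0] == 3


# Report which major/minor version the invoking interpreter actually is.
print("python is Python %d.%d, python3=%s"
      % (sys.version_info[0], sys.version_info[1], is_python3()))
```

Running the same file with python2 should report python3=False, which gives a quick check that both interpreters are still available.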
### Why I did it
We use `EdgeZoneAggregator` in `db_migrator`, but we don't support this pattern in sonic yang models. Hence, we update this in the sonic-yang model.
##### Work item tracking
- Microsoft ADO **(number only)**: 25574132
#### How I did it
Update the device pattern list.
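A minimal sketch of how an alternation pattern for device types behaves once `EdgeZoneAggregator` is added. The list below is a hypothetical subset chosen for illustration; the real pattern lives in the sonic-yang model and may differ.

```python
import re

# Hypothetical subset of device types (assumption); the YANG model holds the real list.
DEVICE_TYPES = ["ToRRouter", "LeafRouter", "SpineRouter", "EdgeZoneAggregator"]

# YANG patterns match the whole value, so fullmatch is used below.
DEVICE_TYPE_PATTERN = re.compile("|".join(re.escape(t) for t in DEVICE_TYPES))


def is_valid_device_type(value):
    """Return True when value matches one of the allowed device types exactly."""
    return DEVICE_TYPE_PATTERN.fullmatch(value) is not None
```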
#### Why I did it
HLD implementation: Container Hardening (https://github.com/sonic-net/SONiC/pull/1364)
##### Work item tracking
- Microsoft ADO **(number only)**: 14807420
#### How I did it
Reduce Linux capabilities by disabling the privileged flag.
#### How to verify it
Run sflow sonic-mgmt tests
Check container's settings: Privileged is false and container only has default Linux caps, does not have extended caps.
```
admin@vlab-01:~$ docker inspect sflow | grep Privi
"Privileged": false,
admin@vlab-01:~$ docker exec -it sflow bash
root@vlab-01:/# capsh --print
Current: cap_chown,cap_dac_override,cap_fowner,cap_fsetid,cap_kill,cap_setgid,cap_setuid,cap_setpcap,cap_net_bind_service,cap_net_raw,cap_sys_chroot,cap_mknod,cap_audit_write,cap_setfcap=ep
Bounding set =cap_chown,cap_dac_override,cap_fowner,cap_fsetid,cap_kill,cap_setgid,cap_setuid,cap_setpcap,cap_net_bind_service,cap_net_raw,cap_sys_chroot,cap_mknod,cap_audit_write,cap_setfcap
```
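The capability check above could be automated roughly as follows. This sketch assumes the `capsh --print` output format shown above (`Current: <caps>=ep`) and takes the default set from that same output; the function names are hypothetical.

```python
# Default capability set, copied from the capsh output shown above.
DEFAULT_CAPS = frozenset(
    "cap_chown,cap_dac_override,cap_fowner,cap_fsetid,cap_kill,cap_setgid,"
    "cap_setuid,cap_setpcap,cap_net_bind_service,cap_net_raw,cap_sys_chroot,"
    "cap_mknod,cap_audit_write,cap_setfcap".split(",")
)


def parse_current_caps(capsh_output):
    """Extract the capability names from the 'Current:' line of capsh --print."""
    for line in capsh_output.splitlines():
        line = line.strip()
        if line.startswith("Current:"):
            caps = line[len("Current:"):].strip()
            # Drop the trailing flag suffix such as "=ep".
            caps = caps.split("=", 1)[0]
            return frozenset(c for c in caps.split(",") if c)
    return frozenset()


def extra_caps(capsh_output):
    """Return capabilities present in the container beyond the default set."""
    return parse_current_caps(capsh_output) - DEFAULT_CAPS
```

An empty result from `extra_caps` corresponds to "does not have extended caps".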
#### Why I did it
This header file comes from an external package, and a very old version of the header file has been checked into swss-common. This will cause problems for the upcoming Bookworm upgrade.
##### Work item tracking
- Microsoft ADO **(number only)**: 25411155
#### How I did it
Change references to the header file to use the Debian package nlohmann-json-dev, instead of from swss-common.
### Tested branch (Please provide the tested image version)
- [ ] <!-- image version 1 -->
- [ ] VS image from pipeline build
Verified that eventd was running
Why I did it
To avoid orchagent crashes such as sonic-net/sonic-swss#2935, disable unsupported counters on SONiC management devices.
Work item tracking
Microsoft ADO (number only): 25437720
How I did it
Update the minigraph parser to disable unsupported counters on management devices.
How to verify it
Verified by unittest.
Manually apply patch to DUT and do config load_minigraph
Why I did it
XGS saibcm-modules 8.4 is needed. #14471
Work item tracking
Microsoft ADO (number only): 24917414
How I did it
Copy files from xgs SDK 8.4 repo and modify makefiles to build the image.
Upgrade version to 8.4.0.2 in saibcm-modules.mk.
How to verify it
Build a private image and run full qualification with it: https://elastictest.org/scheduler/testplan/650419cb71f60aa92c456a2b
#### Why I did it
src/sonic-sairedis
```
* 7210b0c - (HEAD -> master, origin/master, origin/HEAD) [Link event damping] Add utility methods. (#1313) (20 hours ago) [Ashish Singh]
```
#### How I did it
#### How to verify it
#### Description for the changelog
Why I did it
It was observed that a flood of DHCP packets without rate-limiting can cause BGP flaps or lacp keepalive losses.
This change attempts to prevent or reduce such BGP flaps by enabling appropriate rate-limiting in SONiC for all traffic types.
Work item tracking
Microsoft ADO 17964421:
How I did it
Set a reasonable CIR/CBS value of 300 for queue4_group3 (dhcp, lldp, macsec) and 6000 for queue4_group1.
The value 300 was arrived at after testing with dhcp flooding using ptf (using multiple threads). Throttling at this rate was necessary to ensure that dhcp flooding does not cause BGP flaps.
How to verify it
Verified with this script running from ptf, that BGP flaps don't happen when CBS/CIR is set at 300 for queue4_group3.

    import threading
    from scapy.all import *

    def send_dhcp_discover(intf):
        dhcp_discover = Ether(dst='ff:ff:ff:ff:ff:ff',src=RandMAC()) \
                        /IP(src='1.1.1.1',dst='255.255.255.255') \
                        /UDP(sport=68,dport=67) \
                        /DHCP(options=[('message-type','discover'),('end')])
        sendp(dhcp_discover,count=100000,iface=intf)

    if __name__ == "__main__":
        t1 = threading.Thread(target=send_dhcp_discover, args=("eth1",))
        t2 = threading.Thread(target=send_dhcp_discover, args=("eth2",))
        t1.start()
        t2.start()
        t1.join()
        t2.join()

Verified on Arista-7260CX3-D108C8 running 202012 that the copp rule for queue4_group1 and queue4_group3 do NOT affect BGP packets. To verify this using PTF, the copp rules were modified to set the "CBS" and "CIR" for queue4_group1 and queue4_group3 at 600pps and 50k packets each of "BGP open" and "DHCP Discover" were simultaneously sent from the same PTF port to the DUT. It was verified using "show c cpu" that packets are hitting the cpu queue at 1200 pps (double the configured CIR/CBS for these packet types). This helped conclude that throttling rate is per trap (or packet type) and not per queue.
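The arithmetic behind that conclusion, as a small sketch. The function name and the per_trap switch are illustrative only; the numbers are the ones from the test above.

```python
def expected_cpu_rate(cir_pps, trap_count, per_trap):
    """Expected aggregate rate at the CPU queue for traps sharing one queue.

    per_trap=True models one policer per trap (packet type);
    per_trap=False models a single policer shared by the whole queue.
    """
    return cir_pps * trap_count if per_trap else cir_pps


# Two packet types (BGP open, DHCP Discover), each limited to 600 pps.
per_trap_rate = expected_cpu_rate(600, 2, per_trap=True)
per_queue_rate = expected_cpu_rate(600, 2, per_trap=False)
print(per_trap_rate, per_queue_rate)
```

The observed 1200 pps matches the per-trap model, not the per-queue model.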
Verified with updated sonic-mgmt tests ([tests/copp]: Update copp mgmt tests to support new rate-limits sonic-mgmt#8199) on broadcom and mellanox platforms that these traffic types are rate-limited.
Signed-off-by: Prabhat Aravind <paravind@microsoft.com>
#### Why I did it
src/sonic-sairedis
```
* 1ef16ee - (HEAD -> master, origin/master, origin/HEAD) [Link event damping] Add generic concurrent queue for link event damping. (#1297) (11 hours ago) [Ashish Singh]
```
#### How I did it
#### How to verify it
#### Description for the changelog
#### Why I did it
src/sonic-swss
```
* a9867e67 - (HEAD -> master, origin/master, origin/HEAD) Fix acl match ip_type_non_ipv4 and ip_type_non_ipv6. (#2842) (5 hours ago) [LTeng]
* dc8fd20f - [DASH] ACL tags implementation (#2915) (11 hours ago) [Oleksandr Ivantsiv]
```
#### How I did it
#### How to verify it
#### Description for the changelog
#### Why I did it
src/sonic-utilities
```
* 0ae5d2d2 - (HEAD -> master, origin/master, origin/HEAD) [ci] Use correct bullseye docker image according to source branch. (18 hours ago) [Liu Shilong]
```
#### How I did it
#### How to verify it
#### Description for the changelog
Why I did it
In an effort to allow people to build a slim version of SONiC that fits on devices with small storage, there is a need to disable some unneeded features.
The docker-gbsyncd containers are only applicable to devices with external gearboxes and might not apply to devices that need a small image.
It is therefore desirable to have a knob to exclude these gbsyncd containers.
Work item tracking
Microsoft ADO (number only):
How I did it
Add a new config INCLUDE_GBSYNCD which is enabled by default to retain the previous behavior.
Setting it to n will not include the platform/components/docker-gbsyncd-*.mk.
How to verify it
Set INCLUDE_GBSYNCD = n and witness that docker-gbsyncd images are not present in the final image.
With Debian Bookworm, Paramiko 2.9 or newer will need to be used to be
able to connect to devices running that version of Debian
(specifically, to those running OpenSSH 9.2).
Paramiko is currently on 3.3.1. For now, upgrade to 2.9.5.
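A minimal sketch of the version requirement stated above (2.9 or newer). The helper names are hypothetical, and the parser assumes purely numeric dotted versions.

```python
def version_tuple(version):
    """Convert a dotted numeric version string such as '2.9.5' to a tuple of ints."""
    return tuple(int(part) for part in version.split(".")[:3])


def supports_bookworm_ssh(paramiko_version, minimum=(2, 9)):
    """Return True when the Paramiko version meets the stated 2.9 minimum."""
    return version_tuple(paramiko_version) >= minimum
```

With Paramiko installed, this could be called as supports_bookworm_ssh(paramiko.__version__).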
Signed-off-by: Saikrishna Arcot <sarcot@microsoft.com>
#### Why I did it
src/sonic-sairedis
```
* eaa2bda - (HEAD -> master, origin/master, origin/HEAD) Update SAI submodule to latest (#1311) (12 hours ago) [Kamil Cudnik]
```
#### How I did it
#### How to verify it
#### Description for the changelog
Update the sonic-pins submodule. This brings in the following commit:
56a7762 Use json.hpp from nlohmann-json-dev instead of from swss-common (#22)
Signed-off-by: Saikrishna Arcot <sarcot@microsoft.com>
- Why I did it
Mellanox MSN2700 platforms log a non-functional error: "ERR pmon#sensord: Error getting sensor data: dps460/#10: Can't read". The error is caused by a firmware issue in some PSUs, and the firmware cannot be upgraded online. Since there is no functional impact, this error log can be safely ignored.
- How I did it
Add a new rsyslog rule to rsyslog-container.conf.j2. If the docker name is pmon and the platform name matches, the new rule is inserted into the docker's rsyslogd.conf.
- How to verify it
Run regression on the MSN2700 platform to make sure the error log is no longer printed to the syslog.
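To illustrate the kind of match such a rule performs, here is a sketch in Python. The regular expression is an assumption written against the quoted log line; the actual rsyslog rule in the template may match differently.

```python
import re

# Hypothetical pattern (assumption), based on the error line quoted above.
SENSORD_NOISE = re.compile(
    r"sensord: Error getting sensor data: dps460/#\d+: Can't read"
)


def should_drop(message):
    """Return True when a syslog message is the known harmless sensord error."""
    return SENSORD_NOISE.search(message) is not None
```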
Signed-off-by: Kebo Liu <kebol@nvidia.com>
#### Why I did it
src/sonic-platform-common
```
* 6d804d6 - (HEAD -> master, origin/master, origin/HEAD) Fix SSD health percentage issue for vendor Virtium (#407) (3 hours ago) [Stephen Sun]
```
#### How I did it
#### How to verify it
#### Description for the changelog
* Reduce SONiC image filesystem size
Add a build option to reduce the image size.
The image reduction process is affecting the builds in 2 ways:
- change some packages that are installed in the rootfs
- apply a rootfs reduction script
The script itself will perform a few steps:
- remove file duplication by leveraging hardlinks
  - under /usr/share/sonic since the symlinks under the device folder are lost during the build.
  - under /var/lib/docker since the files there will only be mounted ro
- remove some extra files (man, docs, licenses, ...)
- some image specific space reduction (only for aboot images currently)
The script can later be improved but for now it's reducing the rootfs
size by ~30%.
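The hardlink deduplication step can be sketched as below. This is not the reduction script from the commit; it is a simplified illustration that groups files by size and SHA-256 digest and ignores ownership and permission differences, which a real script would need to consider.

```python
import hashlib
import os


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def hardlink_duplicates(root):
    """Replace duplicate regular files under root with hardlinks; return the count."""
    seen = {}
    linked = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in sorted(filenames):
            path = os.path.join(dirpath, name)
            # Skip symlinks and anything that is not a regular file.
            if os.path.islink(path) or not os.path.isfile(path):
                continue
            key = (os.path.getsize(path), file_digest(path))
            original = seen.get(key)
            if original is None:
                seen[key] = path
            elif not os.path.samefile(original, path):
                # Same content, different inode: relink to the first copy.
                os.unlink(path)
                os.link(original, path)
                linked += 1
    return linked
```

Hardlinks only work within one filesystem, so this applies to a single rootfs tree such as the directories listed above.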
* restore fully featured vim package
Why I did it
Add a config option to set the pip HTTP timeout during the build, to make the build more stable.
The default value is 60.
Work item tracking
Microsoft ADO (number only): 25190067
How I did it
Insert timeout options in all pip commands.
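As a sketch of what "insert timeout options" amounts to, the helper below builds a pip command line with pip's --timeout option. The function name and the use of pip3 are assumptions for illustration; the build itself sets the option in its own build files.

```python
def pip_install_command(packages, timeout=60):
    """Build a pip install command line that carries an explicit HTTP timeout."""
    return ["pip3", "install", "--timeout", str(timeout)] + list(packages)


# Example: the default timeout of 60 applied to one package.
print(" ".join(pip_install_command(["psutil"])))
```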
Why I did it
K8S_OPTIONS may be empty, which causes a syntax error. This PR fixes that issue.
Work item tracking
Microsoft ADO (number only): 25495020
How I did it
Wrap K8S_OPTIONS in "" to avoid the exception.
How to verify it
No exception is thrown in the PR build validation pipeline anymore.
Why I did it
Part implementation of dhcp_server. HLD: sonic-net/SONiC#1282.
Add dhcpservd to dhcp_server container.
How I did it
Install the required package (psutil) in the Dockerfile.
Copy the required files (kea-dhcp related and dhcpservd related) into the container in the Dockerfile.
Add critical_process and supervisor config.
Add support for generating kea config (only in dhcpservd.py) and updating lease table (in dhcpservd.py and lease_update.sh)
How to verify it
Build the image with INCLUDE_DHCP_SERVER set to y and enable the dhcp_server feature after installing the image; the container starts as expected.
Enter the container and confirm that all processes defined in the supervisor configuration are running as expected.
Kill the processes defined in critical_processes; the container exits.
#### Why I did it
src/sonic-utilities
```
* 244ad2d6 - (HEAD -> master, origin/master, origin/HEAD) Revert "Remove syslog service validator in GCU (#2991)" (#3015) (2 hours ago) [jingwenxie]
* d857eb09 - [db_migrator] Fix the broken version chain (#3014) (11 hours ago) [Vivek]
* 424be9ca - [fwutil] Fix python SyntaxWarning for 'is' with literals (#3013) (23 hours ago) [Kebo Liu]
```
#### How I did it
#### How to verify it
#### Description for the changelog
### Why I did it
Privileges and volumes were incorrectly set in macsec container. Privileged flag is set to false and volumes are not mounted properly.
```
admin@vlab-01:~$ docker inspect macsec0 | grep Privi
"Privileged": false,
admin@vlab-01:~$ docker inspect macsec0 | grep -A 10 Binds
"Binds": [
"/var/run/redis0:/var/run/redis:rw",
"/var/run/redis-chassis:/var/run/redis-chassis:ro",
"/usr/share/sonic/device/x86_64-nokia_ixr7250e_36x400g-r0/Nokia-IXR7250E-36x100G/0:/usr/share/sonic/hwsku:ro",
"/var/run/redis0/:/var/run/redis0/:rw",
"/usr/share/sonic/device/x86_64-nokia_ixr7250e_36x400g-r0:/usr/share/sonic/platform:ro"
],
```
### How I did it
#### How to verify it
Make sure privileged settings remain unchanged and make sure volumes are properly mounted
```
admin@vlab-01:~$ docker inspect macsec | grep Privi
"Privileged": false,
admin@vlab-01:~$ docker inspect macsec | grep -A 10 Binds
"Binds": [
"/etc/timezone:/etc/timezone:ro",
"/var/run/redis:/var/run/redis:rw",
"/var/run/redis-chassis:/var/run/redis-chassis:ro",
"/etc/fips/fips_enable:/etc/fips/fips_enable:ro",
"/usr/share/sonic/templates/rsyslog-container.conf.j2:/usr/share/sonic/templates/rsyslog-container.conf.j2:ro",
"/etc/sonic:/etc/sonic:ro",
"/host/warmboot:/var/warmboot",
"/usr/share/sonic/device/x86_64-kvm_x86_64-r0/Force10-S6000/:/usr/share/sonic/hwsku:ro",
"/usr/share/sonic/device/x86_64-kvm_x86_64-r0:/usr/share/sonic/platform:ro"
],
```
Why I did it
The RFS cache has issues that break the official build and the PR checker.
When the cache is read, the fsroot-vs/lib/modules folder does not exist.
Work item tracking
Microsoft ADO (number only): 25481484
How I did it
Disable reading the cache for now.
How to verify it
### Why I did it
[Security] Upgrade the OpenSSL/OpenSSH to fix CVE alerts
Upgrade OpenSSL to 1.1.1n-0+deb11u5
Fix CVEs:
CVE-2023-0464 (Excessive Resource Usage Verifying X.509 Policy
CVE-2023-0465 (Invalid certificate policies in leaf certificates are
CVE-2023-0466 (Certificate policy check not enabled).
CVE-2022-4304 (Timing Oracle in RSA Decryption).
CVE-2023-2650 (Possible DoS translating ASN.1 object identifiers).
Upgrade OpenSSH to 8.4p1-5+deb11u2
Fix CVEs:
CVE-2023-38408 (Lacks SSH agent restriction)
##### Work item tracking
- Microsoft ADO **(number only)**: 25506776
#### How I did it
Upgrade the OpenSSL/OpenSSH package version and fix the UT failure.
#### How to verify it
Verified by UTs with and without FIPS enabled.
- Why I did it
Add an ability to add arm64 mellanox specific kconfig using the integration tool
Fix the existing duplicate kconfig problem by using the vanilla .config
Add an ability to patch kconfig-inclusions file. Renamed series.patch to external-changes.patch to reflect the behavior
NOTE: Min hw-mgmt version to use with these changes: V.7.0030.2000, not yet upstream but required prior to it.
This option will be enabled once the new hw-mgmt is upstream.
Depends on sonic-net/sonic-linux-kernel#336
Signed-off-by: Vivek Reddy Karri <vkarri@nvidia.com>
#### Why I did it
src/sonic-swss
```
* f31ccd09 - (HEAD -> master, origin/master, origin/HEAD) Add refillToSync() into ConsumerBase to support warmboot. (#2866) (2 days ago) [mint570]
```
#### How I did it
#### How to verify it
#### Description for the changelog
#### Why I did it
src/sonic-gnmi
```
* 07e0b36 - (HEAD -> master, origin/master, origin/HEAD) Recover from potential panic when doing map to JSON serialization (#161) (29 hours ago) [Zain Budhwani]
```
#### How I did it
#### How to verify it
#### Description for the changelog
#### Why I did it
src/sonic-linux-kernel
```
* 6508505 - (HEAD -> master, origin/master, origin/HEAD) Add drop monitor Kernel Patches for buffer support (#338) (3 hours ago) [Vivek]
```
#### How I did it
#### How to verify it
#### Description for the changelog
#### Why I did it
src/sonic-restapi
```
* ccad4a2 - (HEAD -> master, origin/master, origin/HEAD) [Tunnel] Support co-existence of IPv4 and IPv6 tunnels (#147) (8 hours ago) [Prince Sunny]
```
#### How I did it
#### How to verify it
#### Description for the changelog
Why I did it
Added Marvell SAI-1.13.0 debian support for x86_64 platform.
Work item tracking
Microsoft ADO (number only):
How I did it
Compile Marvell libsai.so (with SAI headers from version 1.13.0) and package it with version 1.13.0-1
How to verify it
* Re-add missing dependency for derived debs.
My previous change removed the whole dependency on the main deb
existing, not just the installation of the main deb. Fix this by
re-adding a dependency on the main deb being built/pulled from cache.
Signed-off-by: Saikrishna Arcot <sarcot@microsoft.com>
* Add the kernel and initramfs as dependencies for RFS build
Signed-off-by: Saikrishna Arcot <sarcot@microsoft.com>
---------
Signed-off-by: Saikrishna Arcot <sarcot@microsoft.com>
Why I did it
Networking devices need to be responsive. Such responsiveness is harmed when the CPU changes state.
There is a latency penalty when a CPU is idle (e.g. C2) and needs to exit this state to come back to the C1 state.
To prevent this from happening the CPU should be forced to remain in C1 state.
How I did it
Generalize the cstate forcing to C1 to all Arista products.
This is done by adding processor.max_cstate=1 to the kernel cmdline for all CPUs.
Additionally, Intel CPUs also need intel_idle.max_cstate=0 to fall back to the acpi_idle driver.
How to verify it
Check that processor.max_cstate=1 is present on the cmdline for AMD CPUs
Check that both processor.max_cstate=1 and intel_idle.max_cstate=0 are present on the cmdline for Intel CPUs
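The two checks above can be sketched as a helper that inspects a kernel command line string (for example, the contents of /proc/cmdline). The function name is hypothetical.

```python
def missing_cstate_params(cmdline, intel):
    """Return the required C-state parameters that are absent from cmdline.

    cmdline: kernel command line as a single string.
    intel:   True for Intel CPUs, which also need intel_idle.max_cstate=0.
    """
    required = ["processor.max_cstate=1"]
    if intel:
        required.append("intel_idle.max_cstate=0")
    present = set(cmdline.split())
    return [param for param in required if param not in present]
```

An empty list means the command line carries every parameter expected for that CPU vendor.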
#### Why I did it
src/sonic-linux-kernel
```
* fee7d7e - (HEAD -> master, origin/master, origin/HEAD) Add nvidia arm section and an ability to patch kconfig-inc and fix manage-config (#336) (3 days ago) [Vivek]
```
#### How I did it
#### How to verify it
#### Description for the changelog
#### Why I did it
src/sonic-swss
```
* b9313df0 - (HEAD -> master, origin/master, origin/HEAD) Reducing the severity of oper fec attribute get failure (#2924) (89 minutes ago) [Sudharsan Dhamal Gopalarathnam]
* cb98893f - Add support for SEND_TO_INGRESS port table. (#2816) (19 hours ago) [Yilan Ji]
* 966c5bb0 - [Dash] Fix wrong table name for acl_out_table (#2911) (2 days ago) [Ze Gan]
* 35996350 - [FEC]Auto FEC initial changes (#2893) (8 days ago) [Sudharsan Dhamal Gopalarathnam]
```
#### How I did it
#### How to verify it
#### Description for the changelog
#### Why I did it
src/sonic-sairedis
```
* 65323ca - (HEAD -> master, origin/master, origin/HEAD) [VOQ][saidump] To move saidump.sh from the sonic-buildimage repo to the sairedis repo (#1298) (3 days ago) [JunhongMao]
* d520642 - [syncd] Respect each api log level after sai discovery (#1303) (3 days ago) [Kamil Cudnik]
* 7c07d81 - [vslib]: Fix method signatures. (#1299) (3 days ago) [Nazarii Hnydyn]
```
#### How I did it
#### How to verify it
#### Description for the changelog
#### Why I did it
src/sonic-platform-common
```
* 76a8590 - (HEAD -> master, origin/master, origin/HEAD) Fix exception occurred during decode vendor name and pn (#406) (2 days ago) [Anoop Kamath]
```
#### How I did it
#### How to verify it
#### Description for the changelog
#### Why I did it
src/sonic-utilities
```
* bf9c07c4 - (HEAD -> master, origin/master, origin/HEAD) Add target mode to sfputil firmware (#3002) (22 hours ago) [Anoop Kamath]
* 0e43e4dc - [sflow] Added egress Sflow support. (#2790) (2 days ago) [Rajkumar-Marvell]
```
#### How I did it
#### How to verify it
#### Description for the changelog
#### Why I did it
src/sonic-ztp
```
* 739470d - (HEAD -> master, origin/master, origin/HEAD) [ZTP] 'config reload' use -f to avoid system checks (#52) (4 hours ago) [Peter Yu]
* 04cd8e8 - [ZTP] bufsize=1 not supported in binary mode (#51) (4 hours ago) [Peter Yu]
```
#### How I did it
#### How to verify it
#### Description for the changelog
Improve per-command authorization performance by reading the passwd entry with getpwent.
#### Why I did it
Currently, per-command authorization checks whether a user is a remote user with the getpwnam API, which triggers tacplus-nss to authenticate with the TACACS server.
This is unnecessary, because the user information is already added to the local passwd file when the user logs in.
The getpwent API can read directly from the passwd file, which improves per-command authorization performance.
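The idea of a local-only lookup can be illustrated in Python. This is not the plugin's code, which uses the C getpwent API; it is a sketch that scans passwd-format text for a user without any remote query, and the function name is hypothetical.

```python
def find_local_user(passwd_text, username):
    """Look up username in passwd-format text; return a dict or None."""
    for line in passwd_text.splitlines():
        if not line or line.startswith("#"):
            continue
        # passwd format: name:password:uid:gid:gecos:home:shell
        fields = line.split(":")
        if len(fields) >= 7 and fields[0] == username:
            return {
                "name": fields[0],
                "uid": int(fields[2]),
                "gid": int(fields[3]),
                "gecos": fields[4],
                "home": fields[5],
                "shell": fields[6],
            }
    return None
```

The text would typically come from reading /etc/passwd, so a lookup costs one local file scan and no round trip to the TACACS server.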
##### Work item tracking
- Microsoft ADO: 25104723
#### How I did it
Improve per-command authorization performance by reading the passwd entry with getpwent.
#### How to verify it
Pass all UT.
### Description for the changelog
Improve per-command authorization performance by reading the passwd entry with getpwent.