Why I did it
Enable the suppress fib feature by default.
Work item tracking
Microsoft ADO (25564723):
How I did it
In minigraph.py, add the field suppress-fib-pending and enable it for leafrouter devices.
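A minimal sketch of the idea, not the actual minigraph.py code. Only the field name suppress-fib-pending and the leafrouter condition come from the description above; the helper name `apply_suppress_fib_default`, the dictionary layout, and the literal value "enabled" are assumptions for illustration.

```python
def apply_suppress_fib_default(device_metadata, device_type):
    """Return a copy of the metadata with suppress-fib-pending set for leaf routers.

    Hypothetical helper: the real parser may structure this differently.
    """
    localhost = dict(device_metadata.get("localhost", {}))
    # Assumption: the feature is switched on only when the device is a leaf router.
    if device_type.lower() == "leafrouter":
        localhost["suppress-fib-pending"] = "enabled"
    result = dict(device_metadata)
    result["localhost"] = localhost
    return result


metadata = {"localhost": {"hostname": "str-7260cx3-acs-2"}}
print(apply_suppress_fib_default(metadata, "LeafRouter"))
```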
How to verify it
Build and load the image, then check config_db with the show CLI:
admin@str-7260cx3-acs-2:~$ show suppress-fib-pending
Enabled
The test tests/bgp/test_bgp_suppress_fib.py in the sonic-mgmt repo needs to be modified to check the config before restoring it. Otherwise, the test turns off suppress-fib-pending when it finishes.
sonic-net/sonic-mgmt#10612
Why I did it
This is part of the Python3 migration project. This PR adds a new makefile flag: LEGACY_SONIC_MGMT_DOCKER
By default, LEGACY_SONIC_MGMT_DOCKER = y builds sonic-mgmt-docker with both Python2 and Python3.
Setting LEGACY_SONIC_MGMT_DOCKER = n builds sonic-mgmt-docker with Python3 only.
Work item tracking
Microsoft ADO (number only): 25254349
How I did it
Add makefile flag: LEGACY_SONIC_MGMT_DOCKER
How to verify it
By default, sonic-mgmt-docker is built with Python2 and Python3, unchanged from before.
Setting LEGACY_SONIC_MGMT_DOCKER=n builds sonic-mgmt-docker with Python3 only.
This is CSP CS00012280996.
This fixes an issue where the checksum was incorrect for all TCP packets leaving the system, so BGP connections could not be established. We found the issue on BCM56993, and it may affect all platforms using linux_ngknet.
#### Why I did it
src/sonic-swss-common
```
* a57cf9e - (HEAD -> master, origin/master, origin/HEAD) Add batch support in ZmqProducerStateTable. (#803) (10 hours ago) [mint570]
```
#### How I did it
#### How to verify it
#### Description for the changelog
#### Why I did it
src/sonic-linux-kernel
```
* a75a3df - (HEAD -> master, origin/master, origin/HEAD) arm64: Kconfig inclusions to fix PCI hang and MTD detection (#350) (3 hours ago) [Pavan Naregundi]
```
#### How I did it
#### How to verify it
#### Description for the changelog
#### Why I did it
src/sonic-swss
```
* 917c21e0 - (HEAD -> master, origin/master, origin/HEAD) Add more debug information when PFC WD is triggered (#2858) (10 hours ago) [Stephen Sun]
```
#### How I did it
#### How to verify it
#### Description for the changelog
Why I did it
This PR is part of sonic-mgmt-docker Python3 migration project.
Work item tracking
Microsoft ADO (number only): 24397943
How I did it
Upgrade Ansible to 6.7.0
Make Python3 the default interpreter. python is a soft link to python3. To use python2, invoke the python2 command explicitly.
Upgrade some pip packages to higher versions to meet security requirements.
How to verify it
Build a private sonic-mgmt-docker successfully.
Verify python is python3.
Verify python2 is working with 202012 and 202205 branch.
Verify python3 is working with master branch.
Verify with github PR test.
### Why I did it
We use `EdgeZoneAggregator` in `db_migrator`, but the sonic yang models do not support this pattern. Hence, we add it to the sonic-yang model.
##### Work item tracking
- Microsoft ADO **(number only)**: 25574132
#### How I did it
Update the device pattern list.
#### Why I did it
HLD implementation: Container Hardening (https://github.com/sonic-net/SONiC/pull/1364)
##### Work item tracking
- Microsoft ADO **(number only)**: 14807420
#### How I did it
Reduce Linux capabilities by removing the privileged flag.
#### How to verify it
Run sflow sonic-mgmt tests
Check the container's settings: Privileged is false and the container has only the default Linux capabilities, with no extended capabilities.
```
admin@vlab-01:~$ docker inspect sflow | grep Privi
"Privileged": false,
admin@vlab-01:~$ docker exec -it sflow bash
root@vlab-01:/# capsh --print
Current: cap_chown,cap_dac_override,cap_fowner,cap_fsetid,cap_kill,cap_setgid,cap_setuid,cap_setpcap,cap_net_bind_service,cap_net_raw,cap_sys_chroot,cap_mknod,cap_audit_write,cap_setfcap=ep
Bounding set =cap_chown,cap_dac_override,cap_fowner,cap_fsetid,cap_kill,cap_setgid,cap_setuid,cap_setpcap,cap_net_bind_service,cap_net_raw,cap_sys_chroot,cap_mknod,cap_audit_write,cap_setfcap
```
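The capability check above can also be scripted. The following is a rough sketch that parses `capsh --print` output in the exact format shown above (other capsh versions print the `Current:` line differently). The default capability set is copied from that output; the function names are illustrative.

```python
# Default capability set, copied from the capsh output shown above.
DEFAULT_DOCKER_CAPS = {
    "cap_chown", "cap_dac_override", "cap_fowner", "cap_fsetid", "cap_kill",
    "cap_setgid", "cap_setuid", "cap_setpcap", "cap_net_bind_service",
    "cap_net_raw", "cap_sys_chroot", "cap_mknod", "cap_audit_write",
    "cap_setfcap",
}


def parse_current_caps(capsh_output):
    """Return the capability names listed on the 'Current:' line."""
    for line in capsh_output.splitlines():
        if line.startswith("Current:"):
            caps = line.split(":", 1)[1].strip()
            # Drop the trailing flag suffix such as "=ep".
            caps = caps.split("=", 1)[0]
            return {cap for cap in caps.split(",") if cap}
    return set()


def extended_caps(capsh_output):
    """Return capabilities present beyond the default set (empty means OK)."""
    return parse_current_caps(capsh_output) - DEFAULT_DOCKER_CAPS
```

An empty result from `extended_caps` means the container holds no capabilities beyond the defaults.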
#### Why I did it
This header file comes from an external package, and a very old version of the header file has been checked into swss-common. This will cause problems for the upcoming Bookworm upgrade.
##### Work item tracking
- Microsoft ADO **(number only)**: 25411155
#### How I did it
Change references to the header file to use the Debian package nlohmann-json-dev, instead of from swss-common.
### Tested branch (Please provide the tested image version)
- [ ] VS image from pipeline build
Verified that eventd was running
Why I did it
To avoid orchagent crashes like the one in sonic-net/sonic-swss#2935, disable unsupported counters on SONiC management devices.
Work item tracking
Microsoft ADO (number only): 25437720
How I did it
Update the minigraph parser to disable unsupported counters on management devices.
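A minimal sketch of this kind of parser change, under stated assumptions: the description does not name the counters or the device types involved, so the function name, the table layout, and every counter and device type name below are hypothetical placeholders.

```python
def disable_unsupported_counters(flex_counters, device_type, mgmt_types, unsupported):
    """Return a copy of the flex counter table with unsupported counters disabled.

    All arguments are illustrative; the real parser decides these internally.
    """
    if device_type not in mgmt_types:
        return flex_counters
    result = {name: dict(cfg) for name, cfg in flex_counters.items()}
    for name in unsupported:
        # Assumption: a counter group is turned off through a status field.
        result.setdefault(name, {})["FLEX_COUNTER_STATUS"] = "disable"
    return result


table = {"PORT": {"FLEX_COUNTER_STATUS": "enable"},
         "EXAMPLE_COUNTER": {"FLEX_COUNTER_STATUS": "enable"}}
print(disable_unsupported_counters(
    table, "ExampleMgmtDevice", {"ExampleMgmtDevice"}, ["EXAMPLE_COUNTER"]))
```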
How to verify it
Verified by unittest.
Manually applied the patch to a DUT and ran config load_minigraph.
Why I did it
XGS saibcm-modules 8.4 is needed. #14471
Work item tracking
Microsoft ADO (number only): 24917414
How I did it
Copy files from xgs SDK 8.4 repo and modify makefiles to build the image.
Upgrade version to 8.4.0.2 in saibcm-modules.mk.
How to verify it
Build a private image and run full qualification with it: https://elastictest.org/scheduler/testplan/650419cb71f60aa92c456a2b
#### Why I did it
src/sonic-sairedis
```
* 7210b0c - (HEAD -> master, origin/master, origin/HEAD) [Link event damping] Add utility methods. (#1313) (20 hours ago) [Ashish Singh]
```
#### How I did it
#### How to verify it
#### Description for the changelog
Why I did it
It was observed that a flood of DHCP packets without rate-limiting can cause BGP flaps or LACP keepalive losses.
This change attempts to prevent or reduce such BGP flaps by enabling appropriate rate-limiting in SONiC for all traffic types.
Work item tracking
Microsoft ADO 17964421:
How I did it
Set a reasonable CIR/CBS value of 300 for queue4_group3 (dhcp, lldp, macsec) and 6000 for queue4_group1.
The value 300 was chosen after testing DHCP flooding from PTF with multiple threads. Throttling at this rate was necessary to ensure that DHCP flooding does not cause BGP flaps.
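To illustrate what a CIR/CBS of 300 means for a flood, here is a small simulation of a generic single-rate token bucket. It is a sketch only and does not model the ASIC policer; it assumes CIR is in packets per second and CBS is a burst size in packets. The function name `police` is illustrative.

```python
def police(arrival_times, cir, cbs):
    """Count packets admitted by a single-rate token bucket.

    arrival_times: sorted packet arrival times in seconds.
    cir: refill rate in packets per second (assumed unit).
    cbs: bucket depth in packets (assumed unit).
    """
    tokens = float(cbs)  # the bucket starts full
    last = 0.0
    admitted = 0
    for now in arrival_times:
        # Refill for the elapsed time, capped at the bucket depth.
        tokens = min(float(cbs), tokens + (now - last) * cir)
        last = now
        if tokens >= 1.0:
            tokens -= 1.0
            admitted += 1
    return admitted


# Offer 10000 pps for 2 seconds against CIR = CBS = 300.
flood = [i / 10000 for i in range(20000)]
print(police(flood, cir=300, cbs=300))
```

In this model a sustained flood is admitted at roughly CBS + CIR × duration packets, while traffic below the CIR passes untouched.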
How to verify it
Verified with the following script, run from PTF, that BGP flaps do not happen when CBS/CIR is set to 300 for queue4_group3:
import threading
from scapy.all import *
def send_dhcp_discover(intf):
    dhcp_discover = Ether(dst='ff:ff:ff:ff:ff:ff',src=RandMAC()) \
        /IP(src='1.1.1.1',dst='255.255.255.255') \
        /UDP(sport=68,dport=67) \
        /DHCP(options=[('message-type','discover'),('end')])
    sendp(dhcp_discover,count=100000,iface=intf)
if __name__ == "__main__":
    t1 = threading.Thread(target=send_dhcp_discover, args=("eth1",))
    t2 = threading.Thread(target=send_dhcp_discover, args=("eth2",))
    t1.start()
    t2.start()
    t1.join()
    t2.join()
Verified on Arista-7260CX3-D108C8 running 202012 that the copp rules for queue4_group1 and queue4_group3 do NOT affect BGP packets. To verify this using PTF, the copp rules were modified to set the "CBS" and "CIR" for queue4_group1 and queue4_group3 to 600 pps, and 50k packets each of "BGP open" and "DHCP Discover" were sent simultaneously from the same PTF port to the DUT. "show c cpu" confirmed that packets hit the CPU queue at 1200 pps (double the configured CIR/CBS for these packet types). This led to the conclusion that the throttling rate is per trap (or packet type) and not per queue.
Verified with updated sonic-mgmt tests ([tests/copp]: Update copp mgmt tests to support new rate-limits sonic-mgmt#8199) on broadcom and mellanox platforms that these traffic types are rate-limited.
Signed-off-by: Prabhat Aravind <paravind@microsoft.com>
#### Why I did it
src/sonic-sairedis
```
* 1ef16ee - (HEAD -> master, origin/master, origin/HEAD) [Link event damping] Add generic concurrent queue for link event damping. (#1297) (11 hours ago) [Ashish Singh]
```
#### How I did it
#### How to verify it
#### Description for the changelog
#### Why I did it
src/sonic-swss
```
* a9867e67 - (HEAD -> master, origin/master, origin/HEAD) Fix acl match ip_type_non_ipv4 and ip_type_non_ipv6. (#2842) (5 hours ago) [LTeng]
* dc8fd20f - [DASH] ACL tags implementation (#2915) (11 hours ago) [Oleksandr Ivantsiv]
```
#### How I did it
#### How to verify it
#### Description for the changelog
#### Why I did it
src/sonic-utilities
```
* 0ae5d2d2 - (HEAD -> master, origin/master, origin/HEAD) [ci] Use correct bullseye docker image according to source branch. (18 hours ago) [Liu Shilong]
```
#### How I did it
#### How to verify it
#### Description for the changelog
Why I did it
To allow people to build a slim version of SONiC that fits on devices with small storage, there is a need to disable some unneeded features.
The docker-gbsyncd containers are only applicable to devices with external gearboxes and might not apply to devices that need a small image.
It is therefore desirable to have a knob to exclude these gbsyncd containers.
Work item tracking
Microsoft ADO (number only):
How I did it
Add a new config INCLUDE_GBSYNCD, which is enabled by default to retain the previous behavior.
Setting it to n excludes platform/components/docker-gbsyncd-*.mk from the build.
How to verify it
Set INCLUDE_GBSYNCD = n and confirm that the docker-gbsyncd images are not present in the final image.
With Debian Bookworm, Paramiko 2.9 or newer will need to be used to be
able to connect to devices running that version of Debian
(specifically, to those running OpenSSH 9.2).
Paramiko is currently on 3.3.1. For now, upgrade to 2.9.5.
Signed-off-by: Saikrishna Arcot <sarcot@microsoft.com>
#### Why I did it
src/sonic-sairedis
```
* eaa2bda - (HEAD -> master, origin/master, origin/HEAD) Update SAI submodule to latest (#1311) (12 hours ago) [Kamil Cudnik]
```
#### How I did it
#### How to verify it
#### Description for the changelog
Update the sonic-pins submodule. This brings in the following commit:
56a7762 Use json.hpp from nlohmann-json-dev instead of from swss-common (#22)
Signed-off-by: Saikrishna Arcot <sarcot@microsoft.com>
- Why I did it
Mellanox MSN2700 platforms log a non-functional error: "ERR pmon#sensord: Error getting sensor data: dps460/#10: Can't read". The error is caused by a firmware issue on some PSUs, and the firmware cannot be upgraded online. Since there is no functional impact, this error log can be safely ignored.
- How I did it
Add a new rsyslog rule to rsyslog-container.conf.j2. If the docker name is pmon and the platform name matches, the new rule is inserted into the container's rsyslogd.conf.
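The conditional logic can be sketched in plain Python as below. This is an illustration, not the Jinja2 template from the change: the function name, the platform string in the example, and the exact filter line are assumptions. The filter uses rsyslog's property-based filter form with the `stop` action.

```python
SENSORD_NOISE = "Error getting sensor data: dps460"


def extra_rsyslog_rules(docker_name, platform, affected_platforms):
    """Return extra rsyslog rule lines for a container, or an empty list.

    Hypothetical helper mirroring the condition described above.
    """
    if docker_name == "pmon" and platform in affected_platforms:
        # Property-based filter: discard messages containing the known text.
        return [':msg, contains, "{}" stop'.format(SENSORD_NOISE)]
    return []


# The platform name below is an example value, assumed for illustration.
print(extra_rsyslog_rules("pmon", "x86_64-mlnx_msn2700-r0",
                          {"x86_64-mlnx_msn2700-r0"}))
```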
- How to verify it
Run regression on the MSN2700 platform to make sure the error log is no longer printed to the syslog.
Signed-off-by: Kebo Liu <kebol@nvidia.com>
#### Why I did it
src/sonic-platform-common
```
* 6d804d6 - (HEAD -> master, origin/master, origin/HEAD) Fix SSD health percentage issue for vendor Virtium (#407) (3 hours ago) [Stephen Sun]
```
#### How I did it
#### How to verify it
#### Description for the changelog
* Reduce SONiC image filesystem size
Add a build option to reduce the image size.
The image reduction process is affecting the builds in 2 ways:
- change some packages that are installed in the rootfs
- apply a rootfs reduction script
The script itself will perform a few steps:
- remove file duplication by leveraging hardlinks
- under /usr/share/sonic since the symlinks under the device folder are lost during the build.
- under /var/lib/docker since the files there will only be mounted ro
- remove some extra files (man, docs, licenses, ...)
- some image specific space reduction (only for aboot images currently)
The script can later be improved but for now it's reducing the rootfs
size by ~30%.
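The hardlink deduplication step can be illustrated with the sketch below. It is not the reduction script from this change; comparing files by size, mode, owner, and SHA-256 digest is an assumption chosen for the example, as are the function names.

```python
import hashlib
import os


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def hardlink_duplicates(root):
    """Replace duplicate regular files under root with hardlinks.

    Returns the number of bytes freed. Symlinks are left alone.
    """
    first_seen = {}
    saved = 0
    for dirpath, _dirs, files in os.walk(root):
        for name in sorted(files):
            path = os.path.join(dirpath, name)
            if os.path.islink(path) or not os.path.isfile(path):
                continue
            st = os.stat(path)
            # Hardlinks share inode metadata, so only merge files whose
            # mode and ownership match as well as their content.
            key = (st.st_size, st.st_mode, st.st_uid, st.st_gid, file_digest(path))
            original = first_seen.setdefault(key, path)
            if original == path or os.path.samefile(original, path):
                continue
            # Link under a temporary name, then atomically replace the copy.
            tmp = path + ".hardlink-tmp"
            os.link(original, tmp)
            os.replace(tmp, path)
            saved += st.st_size
    return saved
```

Calling `hardlink_duplicates` on a directory tree leaves one inode per distinct file content, which is only safe when the files are never modified in place, as with the read-only mounts mentioned above.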
* restore fully featured vim package
Why I did it
Add a config option to set the pip HTTP timeout during the build process, making builds more stable.
Default value is 60.
Work item tracking
Microsoft ADO (number only): 25190067
How I did it
Insert timeout options in all pip commands.
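As an illustration of inserting the option, the sketch below rewrites a pip argument list to carry pip's `--timeout` general option. The helper name is hypothetical, and the build itself does this in makefiles and Dockerfiles rather than in Python; the default of 60 comes from the description above.

```python
def with_pip_timeout(pip_args, timeout=60):
    """Return a pip command line that carries an explicit --timeout option.

    pip_args: the command as a list, e.g. ["pip3", "install", "pkg"].
    """
    args = list(pip_args)
    if any(a == "--timeout" or a.startswith("--timeout=") for a in args):
        return args  # already set, leave the command unchanged
    # General options such as --timeout may precede the subcommand.
    return [args[0], "--timeout", str(timeout)] + args[1:]


print(with_pip_timeout(["pip3", "install", "example-package"]))
```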
Why I did it
K8S_OPTIONS may be empty, which causes a syntax error. This PR fixes the issue.
Work item tracking
Microsoft ADO (number only): 25495020
How I did it
Wrap K8S_OPTIONS in "" to avoid the exception.
How to verify it
No exception is thrown in the PR build validation pipeline anymore.
Why I did it
Part implementation of dhcp_server. HLD: sonic-net/SONiC#1282.
Add dhcpservd to dhcp_server container.
How I did it
Install the required package (psutil) in the Dockerfile.
Copy the required files (kea-dhcp related and dhcpservd related) into the container in the Dockerfile.
Add critical_process and supervisor config.
Add support for generating the kea config (only in dhcpservd.py) and updating the lease table (in dhcpservd.py and lease_update.sh).
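For readers unfamiliar with Kea, the sketch below shows roughly what generating a kea-dhcp4 configuration involves. It is not the dhcpservd.py implementation: the function name, its parameters, and the sample values are hypothetical, and only a handful of standard Kea `Dhcp4` settings are used.

```python
import json


def build_kea_dhcp4_config(subnet, pool_start, pool_end, lease_time, lease_file):
    """Build a minimal kea-dhcp4 configuration as a Python dict."""
    return {
        "Dhcp4": {
            "valid-lifetime": lease_time,
            # Store leases in a CSV file on disk.
            "lease-database": {
                "type": "memfile",
                "persist": True,
                "name": lease_file,
            },
            "subnet4": [
                {
                    "id": 1,
                    "subnet": subnet,
                    "pools": [{"pool": "{} - {}".format(pool_start, pool_end)}],
                }
            ],
        }
    }


# Sample values are placeholders, not taken from the change.
config = build_kea_dhcp4_config(
    "192.168.0.0/24", "192.168.0.2", "192.168.0.100", 3600,
    "/tmp/example-leases.csv")
print(json.dumps(config, indent=2))
```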
How to verify it
Build the image with INCLUDE_DHCP_SERVER set to y and enable the dhcp_server feature after installing the image; the container starts as expected.
Enter the container and confirm that all processes defined in the supervisor configuration are running as expected.
Kill a process defined in critical_processes; the container exits.
#### Why I did it
src/sonic-utilities
```
* 244ad2d6 - (HEAD -> master, origin/master, origin/HEAD) Revert "Remove syslog service validator in GCU (#2991)" (#3015) (2 hours ago) [jingwenxie]
* d857eb09 - [db_migrator] Fix the broken version chain (#3014) (11 hours ago) [Vivek]
* 424be9ca - [fwutil] Fix python SyntaxWarning for 'is' with literals (#3013) (23 hours ago) [Kebo Liu]
```
#### How I did it
#### How to verify it
#### Description for the changelog
### Why I did it
Privileges and volumes were incorrectly set in the macsec container: the privileged flag is set to false and volumes are not mounted properly.
```
admin@vlab-01:~$ docker inspect macsec0 | grep Privi
"Privileged": false,
admin@vlab-01:~$ docker inspect macsec0 | grep -A 10 Binds
"Binds": [
"/var/run/redis0:/var/run/redis:rw",
"/var/run/redis-chassis:/var/run/redis-chassis:ro",
"/usr/share/sonic/device/x86_64-nokia_ixr7250e_36x400g-r0/Nokia-IXR7250E-36x100G/0:/usr/share/sonic/hwsku:ro",
"/var/run/redis0/:/var/run/redis0/:rw",
"/usr/share/sonic/device/x86_64-nokia_ixr7250e_36x400g-r0:/usr/share/sonic/platform:ro"
],
```
### How I did it
#### How to verify it
Make sure privileged settings remain unchanged and make sure volumes are properly mounted
```
admin@vlab-01:~$ docker inspect macsec | grep Privi
"Privileged": false,
admin@vlab-01:~$ docker inspect macsec | grep -A 10 Binds
"Binds": [
"/etc/timezone:/etc/timezone:ro",
"/var/run/redis:/var/run/redis:rw",
"/var/run/redis-chassis:/var/run/redis-chassis:ro",
"/etc/fips/fips_enable:/etc/fips/fips_enable:ro",
"/usr/share/sonic/templates/rsyslog-container.conf.j2:/usr/share/sonic/templates/rsyslog-container.conf.j2:ro",
"/etc/sonic:/etc/sonic:ro",
"/host/warmboot:/var/warmboot",
"/usr/share/sonic/device/x86_64-kvm_x86_64-r0/Force10-S6000/:/usr/share/sonic/hwsku:ro",
"/usr/share/sonic/device/x86_64-kvm_x86_64-r0:/usr/share/sonic/platform:ro"
],
```
Why I did it
The RFS cache has issues that break the official build and the PR checker.
When the cache is read, the fsroot-vs/lib/modules folder does not exist.
Work item tracking
Microsoft ADO (number only): 25481484
How I did it
Disable reading from the cache for now.
How to verify it
### Why I did it
[Security] Upgrade the OpenSSL/OpenSSH to fix CVE alerts
Upgrade OpenSSL to 1.1.1n-0+deb11u5
Fix CVEs:
CVE-2023-0464 (Excessive Resource Usage Verifying X.509 Policy
CVE-2023-0465 (Invalid certificate policies in leaf certificates are
CVE-2023-0466 (Certificate policy check not enabled).
CVE-2022-4304 (Timing Oracle in RSA Decryption).
CVE-2023-2650 (Possible DoS translating ASN.1 object identifiers).
Upgrade OpenSSH to 8.4p1-5+deb11u2
Fix CVEs:
CVE-2023-38408 (Lacks SSH agent restriction)
##### Work item tracking
- Microsoft ADO **(number only)**: 25506776
#### How I did it
Upgrade the OpenSSL/OpenSSH package version and fix the UT failure.
#### How to verify it
Verified by UTs with and without FIPS enabled.
- Why I did it
Add the ability to add arm64 Mellanox-specific kconfig using the integration tool.
Fix the existing duplicate kconfig problem by using the vanilla .config.
Add the ability to patch the kconfig-inclusions file. Renamed series.patch to external-changes.patch to reflect the behavior.
NOTE: The minimum hw-mgmt version to use with these changes is V.7.0030.2000, which is not yet upstream but is required prior to it.
This option will be enabled once the new hw-mgmt is upstream.
Depends on sonic-net/sonic-linux-kernel#336
Signed-off-by: Vivek Reddy Karri <vkarri@nvidia.com>
#### Why I did it
src/sonic-swss
```
* f31ccd09 - (HEAD -> master, origin/master, origin/HEAD) Add refillToSync() into ConsumerBase to support warmboot. (#2866) (2 days ago) [mint570]
```
#### How I did it
#### How to verify it
#### Description for the changelog
#### Why I did it
src/sonic-gnmi
```
* 07e0b36 - (HEAD -> master, origin/master, origin/HEAD) Recover from potential panic when doing map to JSON serialization (#161) (29 hours ago) [Zain Budhwani]
```
#### How I did it
#### How to verify it
#### Description for the changelog
#### Why I did it
src/sonic-linux-kernel
```
* 6508505 - (HEAD -> master, origin/master, origin/HEAD) Add drop monitor Kernel Patches for buffer support (#338) (3 hours ago) [Vivek]
```
#### How I did it
#### How to verify it
#### Description for the changelog
#### Why I did it
src/sonic-restapi
```
* ccad4a2 - (HEAD -> master, origin/master, origin/HEAD) [Tunnel] Support co-existence of IPv4 and IPv6 tunnels (#147) (8 hours ago) [Prince Sunny]
```
#### How I did it
#### How to verify it
#### Description for the changelog
Why I did it
Added Marvell SAI-1.13.0 debian support for x86_64 platform.
Work item tracking
Microsoft ADO (number only):
How I did it
Compile the Marvell libsai.so (with SAI headers from version 1.13.0) and package it with version 1.13.0-1.
How to verify it