- Why I did it
When reading a sysfs fd on Python poller events, there is end-of-line garbage such as "# 012" (with no space between the two parts) trailing the real value of 1 or 0.
- How I did it
Use Python's strip() to remove the end-of-line characters.
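A minimal sketch of this kind of fix, assuming the value is read from an already-open sysfs file descriptor after a poll event. The helper names `clean_sysfs_value` and `read_sysfs_fd` are hypothetical and are not taken from the actual change.

```python
import os


def clean_sysfs_value(raw: bytes) -> str:
    """Decode a raw sysfs read and drop surrounding whitespace.

    The trailing newline is what shows up as the end-of-line
    garbage when the unstripped value is logged.
    """
    return raw.decode("ascii", errors="replace").strip()


def read_sysfs_fd(fd: int, max_bytes: int = 64) -> str:
    """Re-read a sysfs attribute from the start of an open fd."""
    os.lseek(fd, 0, os.SEEK_SET)  # rewind before every read
    return clean_sysfs_value(os.read(fd, max_bytes))
```

With this, a raw read of `b"1\n"` is reduced to the plain string `"1"` before it is compared or logged.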
- How to verify it
Run the CMIS host management feature on a switch.
Wait a few minutes until the switch completes its boot sequence, including the CMIS host manager.
Then disconnect or reconnect a port to create a poller event.
#### Why I did it
src/sonic-sairedis
```
* e5b8d4e - (HEAD -> master, origin/master, origin/HEAD) Make changes to support compiling on Bookworm (with GCC 12) (#1344) (3 days ago) [Saikrishna Arcot]
```
#### How I did it
#### How to verify it
#### Description for the changelog
#### Why I did it
src/sonic-dash-api
```
* ec15bc7 - (HEAD -> master, origin/master, origin/HEAD) Revert "rename VnetMapping.action_type" (#17) (2 hours ago) [Ze Gan]
* ad0f59e - Add unspecified default value to all enums (2 days ago) [Lawrence Lee]
* dd844b1 - Merge branch 'add-enum-default' of github.com:theasianpianist/sonic-dash-api into add-enum-default (4 days ago) [Lawrence Lee]
|\
| * 4b31135 - Merge branch 'master' into add-enum-default (4 days ago) [Lawrence Lee]
* | 4b41ea7 - rename VnetMapping.action_type (4 days ago) [Lawrence Lee]
|/
* b1ab99f - Add unspecified default value to all enums (4 days ago) [Lawrence Lee]
```
#### How I did it
#### How to verify it
#### Description for the changelog
Why I did it
The PR introduced a bug for the slim image build (#17905): sonic_asic_platform is missing when building docker images for the slim image.
[ building ] [ target/docker-dhcp-relay.gz ]
/sonic/dockers/docker-dhcp-relay/cli-plugin-tests /sonic
/sonic
Traceback (most recent call last):
File "/usr/local/bin/j2", line 8, in <module>
sys.exit(main())
File "/usr/local/lib/python3.9/dist-packages/j2cli/cli.py", line 202, in main
output = render_command(
File "/usr/local/lib/python3.9/dist-packages/j2cli/cli.py", line 186, in render_command
result = renderer.render(args.template, context)
File "/usr/local/lib/python3.9/dist-packages/j2cli/cli.py", line 85, in render
return self._env \
File "/usr/lib/python3/dist-packages/jinja2/environment.py", line 1090, in render
self.environment.handle_exception()
File "/usr/lib/python3/dist-packages/jinja2/environment.py", line 832, in handle_exception
reraise(*rewrite_traceback_stack(source=source))
File "/usr/lib/python3/dist-packages/jinja2/_compat.py", line 28, in reraise
raise value.with_traceback(tb)
File "/sonic/dockers/docker-dhcp-relay/Dockerfile.j2", line 48, in top-level template code
{% if build_reduce_image_size != "y" or sonic_asic_platform != "broadcom" %}
jinja2.exceptions.UndefinedError: 'sonic_asic_platform' is undefined
make: *** [slave.mk:1072: target/docker-dhcp-relay.gz] Error 1
make: *** Waiting for unfinished jobs....
[ finished ] [ target/docker-swss-layer-bullseye.gz ]
[ finished ] [ target/docker-syncd-brcm-dnx.gz ]
make[1]: *** [Makefile.work:608: target/sonic-broadcom.bin] Error 2
make[1]: Leaving directory '/data/work/1/s'
make: *** [Makefile:41: target/sonic-broadcom.bin] Error 2
Why did it slip past the PR test? The PR test doesn't build with the slim option, so the sonic_asic_platform != "broadcom" check is never evaluated in PR builds.
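The short-circuit behaviour behind this can be illustrated with a plain Python analogy. This is only a sketch: a missing dictionary key stands in for the undefined Jinja2 variable, and the function name `template_condition` is hypothetical.

```python
def template_condition(context: dict) -> bool:
    """Mirror the condition from Dockerfile.j2 line 48.

    A missing key raises KeyError here, standing in for the
    UndefinedError that j2 raised in the build log.
    """
    return (context["build_reduce_image_size"] != "y"
            or context["sonic_asic_platform"] != "broadcom")


# Non-slim build: the left operand is already true, so the right
# operand is never evaluated and the missing variable goes unnoticed.
regular = template_condition({"build_reduce_image_size": "n"})

# Slim build: the left operand is false, so the right operand is
# evaluated and the missing variable causes a failure.
try:
    template_condition({"build_reduce_image_size": "y"})
    slim_failed = False
except KeyError:
    slim_failed = True
```

Once the variable is exported to the docker build, both operands can be evaluated in either build mode.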
Work item tracking
Microsoft ADO (number only):
How I did it
Export sonic_asic_platform for docker build in slave.mk
How to verify it
build with slim image option.
#### Why I did it
src/sonic-swss-common
```
* 3c3ae57 - (HEAD -> master, origin/master, origin/HEAD) Provide build flag to Disable compilation of libyang dependent interfaces (#853) (5 hours ago) [Vivek]
```
#### How I did it
#### How to verify it
#### Description for the changelog
#### Why I did it
src/sonic-platform-common
```
* 538ec67 - (HEAD -> master, origin/master, origin/HEAD) Tx/Rx power values should be rounded up to 3 decimal places (#432) (6 hours ago) [mihirpat1]
```
#### How I did it
#### How to verify it
#### Description for the changelog
- The ubuntu 2004 is needed by 202311
- Because the ubuntu2004 artifacts are used by other repos, a daily build is needed even when this repo goes a long time without updates.
Signed-off-by: Ze Gan <ganze718@gmail.com>
#### Why I did it
src/sonic-swss-common
```
* 253ceb6 - (HEAD -> master, origin/master, origin/HEAD) Fix race condition in ZmqServer. (#850) (23 hours ago) [mint570]
```
#### How I did it
#### How to verify it
#### Description for the changelog
- Why I did it
Update SDK/FW version to 4.6.2202/2012.2202
Fixed issues:
1. On Spectrum-3 systems, toggling ports while sending traffic on 400G speed ports might result in stuck FW.
2. In Spectrum-1 switch systems, 50G SR2 speed mode is not supported when AutoNeg is enabled. In this case, although the max interface speed is 50G for SR2, SR4, or SR, the actual max interface speed negotiated between the loopback is 25G.
3. On Spectrum-2 and Spectrum-3, switch creation in fastboot might take more than 40 seconds when there are no active links.
4. When performing warmboot from a version prior to 202205 to 202205 or above, no aging or MAC move takes place.
- How I did it
Updating make files.
- How to verify it
Running regression
#### Why I did it
src/sonic-platform-pde
```
* f2cc748 - (HEAD -> master, origin/master, origin/HEAD) Merge pull request #35 from nonodark/local (21 hours ago) [賓少鈺]
* 607e920 - Fix 'Chassis' object has no attribute 'get_num_psu' in test_psu.py (3 weeks ago) [nonodark]
```
#### How I did it
#### How to verify it
#### Description for the changelog
Why I did it
Fix an error in the log_err call.
This error can be triggered by an invalid static route key. The code normally cannot reach this path with a valid config file, but the issue was hit with an invalid key during manual testing with redis-cli directly. The file is now scanned by Python lint to prevent such errors.
Work item tracking
Microsoft ADO: 26250268
How I did it
Fix the format error.
How to verify it
1. Ran pylint to check the design and make sure there is no such error in the design file.
2. Wrote a separate Python program to verify the log call.
Current logging-related tests usually patch/mock the logging functions. This specific error cannot be triggered when a mock function is called instead of the real function in the design, so lint checking is needed for code changes.
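The sketch below shows why a mocked logger can hide this class of bug. The exact faulty call is not shown in this description, so the single-argument `log_err` stand-in and the `report_bad_key` helpers are hypothetical illustrations only.

```python
from unittest import mock


def log_err(msg):
    """Stand-in for a logging helper that takes one message argument."""
    return "ERR: " + msg


def report_bad_key(key, logger=log_err):
    # Buggy call: the key is passed as a second argument instead of
    # being formatted into the message.
    return logger("Invalid static route key: '%s'", key)


def report_bad_key_fixed(key, logger=log_err):
    # Fixed call: the message is formatted before it is logged.
    return logger("Invalid static route key: '%s'" % key)


# The real function rejects the buggy call ...
try:
    report_bad_key("bad|key")
    real_call_failed = False
except TypeError:
    real_call_failed = True

# ... but a Mock accepts any arguments, so the bug stays hidden.
mocked = mock.Mock()
report_bad_key("bad|key", logger=mocked)
```

A linter sees the real signature rather than the mock, which is why the lint check catches what the mocked unit tests cannot.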
### Why I did it
Disable eventd at buildtime for slim images
##### Work item tracking
- Microsoft ADO **(number only)**:26386286
#### How I did it
Add flags for disabling eventd and only copy rsyslog conf files when eventd is included and not slim image
#### How to verify it
Manual testing
Why I did it
Fix the build issue caused by the wrong version specified.
See the build error logs:
Try 4: /usr/bin/wget --retry-connrefused failed to get: -O
--2024-01-26 11:38:23-- https://sonicstorage.blob.core.windows.net/public/fips/bullseye/0.10/amd64/libk5crypto3_1.18.3-6+deb11u14+fips_amd64.deb
Resolving sonicstorage.blob.core.windows.net (sonicstorage.blob.core.windows.net)... 20.60.59.131
Connecting to sonicstorage.blob.core.windows.net (sonicstorage.blob.core.windows.net)|20.60.59.131|:443... connected.
HTTP request sent, awaiting response... 404 The specified blob does not exist.
2024-01-26 11:38:23 ERROR 404: The specified blob does not exist..
Try 5: /usr/bin/wget --retry-connrefused failed to get: -O
make[1]: *** [Makefile:12: /sonic/target/debs/bullseye/symcrypt-openssl_0.10_amd64.deb] Error 8
make[1]: Leaving directory '/sonic/src/sonic-fips'
Work item tracking
Microsoft ADO (number only): 26577929
The problem of the PR passing even though the package was not installed is tracked in a separate issue, #17927.
How I did it
Add libkrb5-dev and the packages it depends on to fix the docker-sonic-vs build failure.
The package libzmq3-dev depends on libkrb5-dev.
#### Why I did it
src/sonic-sairedis
```
* 5b2a517 - (HEAD -> master, origin/master, origin/HEAD) Revert "add if statement for module control mode support" (#1341) (22 hours ago) [dbarashinvd]
```
#### How I did it
#### How to verify it
#### Description for the changelog
#### Why I did it
src/sonic-utilities
```
* 3d45c0c6 - (HEAD -> master, origin/master, origin/HEAD) Migrate GNMI table (#3053) (9 hours ago) [ganglv]
```
#### How I did it
#### How to verify it
#### Description for the changelog
Why I did it
An ICM was reported for "BGPMon Process exited", which was caused by a JSON load exception.
Work item tracking
Microsoft ADO (number only): 25916773
How I did it
Add exception handling around the JSON load.
How to verify it
Verified locally: added debug code to modify the command's output string so that it is not valid JSON, then checked the syslog.
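A minimal sketch of guarding a JSON load in this way, assuming the command output arrives as a string. The function name `load_cmd_output` and the logger name are hypothetical and are not the actual BGPMon code.

```python
import json
import logging

logger = logging.getLogger("bgpmon_sketch")


def load_cmd_output(output: str):
    """Parse command output as JSON, returning None on malformed input."""
    try:
        return json.loads(output)
    except ValueError as exc:  # json.JSONDecodeError is a ValueError subclass
        logger.warning("Failed to parse command output as JSON: %s", exc)
        return None
```

The caller can then skip the current update when `None` is returned instead of letting the process exit.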
Why I did it
Align the keywords to make qos configuration take effect
Work item tracking
Microsoft ADO (number only):
How I did it
Change the keyword to ComputeAI
How to verify it
reload minigraph and check the qos configuration
- Why I did it
Research indicates that some products might experience occasional IO failures in the communication between the CPU and the SSD because of NCQ.
There seems to be a problem between some kernel versions and some SATA controllers.
Syslog error message examples:
Error "ata1: SError: { UnrecovData Handshk }" - "failed command: WRITE FPDMA QUEUED".
Error "ata1: SError: { RecovComm HostInt PHYRdyChg CommWake 10B8B DevExch }" - "failed command: READ FPDMA QUEUED".
Some vendors already disabled NCQ on their platforms in SONiC due to similar issue:
[Arista] Disable ATA NCQ for a few products #13739
[Arista] Disable SSD NCQ on DCS-7050CX3-32S #13964
Also there are other discussions on Debian/Ubuntu forums about similar issues and it was suggested to disable NCQ:
https://askubuntu.com/questions/133946/are-these-sata-errors-dangerous
- How I did it
Add a kernel parameter to tell libata to disable NCQ
- How to verify it
Use FIO tool - fio --direct=1 --rw=randrw --bs=64k --ioengine=libaio --iodepth=64 --runtime=120 --numjobs=4
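As an additional check, the kernel command line can be inspected for the NCQ setting. The sketch below assumes the parameter is the documented `libata.force=noncq` option (this description does not name the exact parameter), and the function name is hypothetical.

```python
def ncq_disabled_in_cmdline(cmdline: str) -> bool:
    """Return True if a libata.force option on the command line includes noncq."""
    for token in cmdline.split():
        if not token.startswith("libata.force="):
            continue
        values = token.split("=", 1)[1].split(",")
        # A value may carry a port/device prefix, e.g. "1.00:noncq".
        if any(value.split(":")[-1] == "noncq" for value in values):
            return True
    return False
```

On a running system, the input would be the contents of `/proc/cmdline`.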
#### Why I did it
src/sonic-swss-common
```
* 41ee154 - (HEAD -> master, origin/master, origin/HEAD) [dbconnect]: Support DPU database schema (#845) (12 hours ago) [Ze Gan]
```
#### How I did it
#### How to verify it
#### Description for the changelog
#### Why I did it
src/sonic-utilities
```
* 96e42cc6 - (HEAD -> master, origin/master, origin/HEAD) Additional check to skip FRR-Offloaded check if the bgp route-src was not selected as best (#3130) (11 hours ago) [Deepak Singhal]
```
#### How I did it
#### How to verify it
#### Description for the changelog
#### Why I did it
src/sonic-mgmt-common
```
* 9905269 - (HEAD -> master, origin/master, origin/HEAD) Added support for singleton containers and a sibling list in a single SONIC table (3 days ago) [Mohammed Faraaz]
```
#### How I did it
#### How to verify it
#### Description for the changelog
### Why I did it
Fix the krb5 vulnerabilities:
CVE-2021-36222 allows remote attackers to cause a NULL pointer dereference and daemon crash
CVE-2021-37750 NULL pointer dereference in kdc/do_tgs_req.c via a FAST inner body that lacks a server field
DSA 5286-1 remote code execution
##### Work item tracking
- Microsoft ADO **(number only)**: 26577929
#### How I did it
Upgrade the krb5 version to 1.18.3-6+deb11u14+fips.
### Why I did it
- Modified "sonic-port.yang" for adding support in Port Yang model for the "mode" attribute for adding port modes
- Modified "sonic-portchannel.yang" for adding support in Port Channel Yang model for the "mode" attribute for adding port modes
- Updated tests for these modifications
#### How to verify it
- Added support to align SONiC yang with Config_db
### Why I did it
HLD implementation: Container Hardening (https://github.com/sonic-net/SONiC/pull/1364)
### How I did it
Reduce linux capabilities in privileged flag
#### How to verify it
Check container's settings: Privileged is false and container only has default Linux caps, does not have extended caps.
```
admin@vlab-01:~$ docker inspect nat | grep Privi
"Privileged": false,
admin@vlab-01:~$ docker exec -it nat bash
root@vlab-01:/# capsh --print
Current: cap_chown,cap_dac_override,cap_fowner,cap_fsetid,cap_kill,cap_setgid,cap_setuid,cap_setpcap,cap_net_bind_service,cap_net_raw,cap_sys_chroot,cap_mknod,cap_audit_write,cap_setfcap=ep
```
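The manual check above could be scripted roughly as follows. This is a sketch under two assumptions: the `docker inspect` output is passed in as JSON text, and the `capsh --print` "Current:" line has the same layout as in the output above. The expected capability set is copied from that output, and the function names are hypothetical.

```python
import json

# Capability set taken from the capsh output shown above.
EXPECTED_CAPS = frozenset({
    "cap_chown", "cap_dac_override", "cap_fowner", "cap_fsetid",
    "cap_kill", "cap_setgid", "cap_setuid", "cap_setpcap",
    "cap_net_bind_service", "cap_net_raw", "cap_sys_chroot",
    "cap_mknod", "cap_audit_write", "cap_setfcap",
})


def is_privileged(inspect_json: str) -> bool:
    """Read HostConfig.Privileged from `docker inspect <container>` output."""
    containers = json.loads(inspect_json)
    return bool(containers[0]["HostConfig"]["Privileged"])


def extra_caps(current_line: str) -> set:
    """Return capabilities on the 'Current:' line beyond the expected set."""
    caps_part = current_line.split(":", 1)[1].strip()
    names = caps_part.split("=", 1)[0]  # drop the trailing "=ep" flags
    return {cap for cap in names.split(",") if cap} - EXPECTED_CAPS
```

A hardened container should report `is_privileged(...)` as false and an empty set from `extra_caps(...)`.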
#### Why I did it
src/sonic-swss-common
```
* e4db436 - (HEAD -> master, origin/master, origin/HEAD) [schema] Add SAG table for static anycast gateway (#540) (8 hours ago) [Jimi Chen]
```
#### How I did it
#### How to verify it
#### Description for the changelog
#### Why I did it
src/sonic-utilities
```
* b3d856bf - (HEAD -> master, origin/master, origin/HEAD) Add all SKUs to the generic config update list (#3131) (7 hours ago) [Stephen Sun]
```
#### How I did it
#### How to verify it
#### Description for the changelog
Change orchagent stuck message from ERR to WARNING
#### Why I did it
During switch initialization, orchagent is sometimes busy for more than 40 seconds, which triggers the process-stuck watchdog error.
To mitigate this, change the watchdog error message to a warning message.
##### Work item tracking
- Microsoft ADO: 26517622
#### How I did it
Change orchagent stuck message from ERR to WARNING.
#### How to verify it
Pass all UT.
### Description for the changelog
Change orchagent stuck message from ERR to WARNING.
Fix the issue where, with TACACS set to "tacacs+, local", a user can run a blocked command with local permission.
#### Why I did it
When TACACS is set to "tacacs+, local", a user can still run a blocked command with local permission.
##### Work item tracking
- Microsoft ADO: 26399545
#### How I did it
Fix the code to reject the command when authorization fails on the TACACS server side.
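The intended decision logic can be sketched as below. This is an illustration, not the actual implementation: the `TacacsResult` states and the fallback to local authorization when no server can be reached are assumptions made for the example.

```python
from enum import Enum


class TacacsResult(Enum):
    PASS = "pass"                # server authorized the command
    FAIL = "fail"                # server answered and rejected the command
    UNREACHABLE = "unreachable"  # no server could be contacted (assumed state)


def authorize(tacacs_result: TacacsResult, local_allows: bool) -> bool:
    """Decide whether a command may run under "tacacs+, local"."""
    if tacacs_result is TacacsResult.PASS:
        return True
    if tacacs_result is TacacsResult.FAIL:
        # An explicit rejection from the server is final; local
        # permission must not override it.
        return False
    # Assumed behaviour: consult local permission only when the
    # server gave no answer at all.
    return local_allows
```

The key point is the second branch: a server-side rejection ends the decision, even if local permission would have allowed the command.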
#### How to verify it
Pass all UT.
### Description for the changelog
Fix the issue where, with TACACS set to "tacacs+, local", a user can run a blocked command with local permission.
### Why I did it
It is unnecessary to write logs to /tmp/${SERVICE}-debug.log because they are already written to syslog. Writing to the separate log is therefore removed, out of concern for memory space and because some services could not start up in the RO state.
##### Work item tracking
- Microsoft ADO **(number only)**:26458976
#### How I did it
Remove the DEBUGLOG definition and the line that echoes the message to the log file mentioned above.
#### How to verify it
Manually verified that the /tmp/${SERVICE}-debug.log files do not exist and that the service start log still appears in syslog.
#### Why I did it
src/sonic-swss
```
* 41330abf - (HEAD -> master, origin/master, origin/HEAD) [Build] Support to collect the test coverage in cobertura format (#3019) (33 hours ago) [xumia]
```
#### How I did it
#### How to verify it
#### Description for the changelog
#### Why I did it
src/sonic-gnmi
```
* 2c862b8 - (HEAD -> master, origin/master, origin/HEAD) Merge pull request #184 from abdosi/master (9 hours ago) [Rita Hui]
* 1d7f24c - Fix (4 days ago) [Abhishek Dosi]
* eda628c - Fix (4 days ago) [Abhishek Dosi]
* e37da40 - Fix Compile Error (4 days ago) [Abhishek Dosi]
* 22d0d0f - Update db_client.go (5 days ago) [abdosi]
```
#### How I did it
#### How to verify it
#### Description for the changelog
### Why I did it
Fix the issue detected by the [TestStaticMgmtPortIP::test_dynamic_dns_not_working_when_static_ip_configured](https://github.com/sonic-net/sonic-mgmt/blob/master/tests/dns/static_dns/test_static_dns.py#L105C9-L105C63) test.
### How I did it
Query MGMT interface configuration. Do not apply dynamic DNS configuration when MGMT interface has static IP address.
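The decision can be sketched as follows. This assumes the management interface configuration is available as a list of keys in the form `eth0|<ip>/<prefix>`. Both that key layout and the function names are assumptions for illustration, not the actual change.

```python
def has_static_mgmt_ip(mgmt_interface_keys, ifname: str = "eth0") -> bool:
    """Return True if any key assigns an address to the management interface."""
    for key in mgmt_interface_keys:
        name, sep, address = key.partition("|")
        if name == ifname and sep and address:
            return True
    return False


def should_apply_dynamic_dns(mgmt_interface_keys) -> bool:
    """Dynamic DNS settings are applied only without a static management IP."""
    return not has_static_mgmt_ip(mgmt_interface_keys)
```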
#### How to verify it
Run `tests/dns/static_dns/test_static_dns.py` sonic-mgmt tests.
### Why I did it
HLD implementation: Container Hardening (https://github.com/sonic-net/SONiC/pull/1364)
##### Work item tracking
- Microsoft ADO **(number only)**: 14807420
#### How I did it
Reduce linux capabilities in privileged flag
#### How to verify it
Check container's settings: Privileged is false and container only has default Linux caps, does not have extended caps.
```
admin@vlab-01:~$ docker inspect p4rt | grep Privi
"Privileged": false,
admin@vlab-01:~$ docker exec -it p4rt bash
root@vlab-01:/# capsh --print
Current: cap_chown,cap_dac_override,cap_fowner,cap_fsetid,cap_kill,cap_setgid,cap_setuid,cap_setpcap,cap_net_bind_service,cap_net_raw,cap_sys_chroot,cap_mknod,cap_audit_write,cap_setfcap=ep
```
Fix the reproducible build "Upgrade version" pipeline.
Remove the barefoot build because it failed on the sai package.
Add the marvell-arm64/pensando build.
Microsoft ADO (number only): 26515265
Improve SSHD config to use more secure settings
Why I did it
According to the SONiC OS review result, the SSHD config file /etc/ssh/sshd_config uses insecure settings.
Work item tracking
Microsoft ADO: 15022083
How I did it
Change the build_debian.sh script to apply the following settings to /etc/ssh/sshd_config:
ClientAliveInterval is set to 300
MaxAuthTries is set to default of 3
Banner set to /etc/issue
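The effect of these settings on the config file can be sketched in Python. This is only an illustration of the resulting file content. The actual change is made in the build_debian.sh shell script, and the helper name `apply_sshd_settings` is hypothetical.

```python
import re

# Values taken from the list above.
SETTINGS = {
    "ClientAliveInterval": "300",
    "MaxAuthTries": "3",
    "Banner": "/etc/issue",
}


def apply_sshd_settings(config_text: str, settings=SETTINGS) -> str:
    """Set each option, replacing its first (possibly commented) line."""
    remaining = dict(settings)
    out = []
    for line in config_text.splitlines():
        match = re.match(r"\s*#?\s*(\w+)\b", line)
        key = match.group(1) if match else None
        if key in remaining:
            out.append(f"{key} {remaining.pop(key)}")
        else:
            out.append(line)
    # Options that were not present at all are appended at the end.
    out.extend(f"{key} {value}" for key, value in remaining.items())
    return "\n".join(out) + "\n"
```

For example, an input containing `#ClientAliveInterval 0` and `MaxAuthTries 6` comes back with both lines rewritten and a `Banner /etc/issue` line appended.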
How to verify it
Pass all E2E test cases.
Ignore TACACS accounting trace log when debug disabled.
#### Why I did it
The TACACS accounting trace log is only for debugging. Improve the code so that it does not generate the trace log when debug is disabled.
##### Work item tracking
- Microsoft ADO: 25270078
#### How I did it
Ignore TACACS accounting trace log when debug disabled.
#### How to verify it
Pass all UT.
Manually verified that auditd-tacplus does not generate the trace log when debug is disabled.
### Description for the changelog
Ignore TACACS accounting trace log when debug disabled.