#### Why I did it
src/sonic-swss
```
* 859bd678 - (HEAD -> 202211, origin/202211) Fix error in peer response time when headroom is calculated for 800G (#2860) (2 days ago) [Stephen Sun]
* 5f294cf1 - [Dynamic Buffer][Mellanox] Skip PGs in pending deleting set while checking accumulative headroom of a port (#2871) (2 days ago) [Stephen Sun]
```
#### How I did it
#### How to verify it
#### Description for the changelog
#### Why I did it
src/sonic-platform-common
```
* 2da4286 - (HEAD -> 202211, origin/202211) Add new SSD type support (#390) (14 hours ago) [Junchao-Mellanox]
```
#### How I did it
#### How to verify it
#### Description for the changelog
#### Why I did it
src/sonic-utilities
```
* 46b32daa - (HEAD -> 202211, origin/202211) [kdump] Fix API to read the current running image (#2217) (14 hours ago) [rajendra-dendukuri]
```
#### How I did it
#### How to verify it
#### Description for the changelog
#### Why I did it
src/sonic-utilities
```
* ec37e5d4 - (HEAD -> 202211, origin/202211) [Techsupport] Update the message seen during the lock acquisition failure (#2897) (10 days ago) [Vivek]
```
#### How I did it
#### How to verify it
#### Description for the changelog
#### Why I did it
src/sonic-swss
```
* 63b08b59 - (HEAD -> 202211, origin/202211) [ASAN] Fix Indirect Mem Leaks in Orchagent (#2869) (2 days ago) [Vivek]
* 4248d01d - [muxorch] set mux state to init upon warm reboot (#2834) (5 days ago) [Nikola Dancejic]
* 3ca4b842 - Handle duplicate routes in a graceful manner (#2688) (5 days ago) [prabhataravind]
```
#### How I did it
#### How to verify it
#### Description for the changelog
#### Why I did it
src/sonic-platform-daemons
```
* 8ea4de3 - (HEAD -> 202211, origin/202211) [PSU power threshold] Fix logic error: compare the system power with the PSU's power threshold (#367) (2 days ago) [Stephen Sun]
```
#### How I did it
#### How to verify it
#### Description for the changelog
* [yang] Change swss-event, dhcp-relay-event leafref to string (#13326)
Why I did it
Do not require leafref as part of yang. Only need string to compare whether string received from event matches what is possible for ifname.
How I did it
How to verify it
Run UT
* Add fix to monit_regex.json for catching mem_usage and cpu_usage (#14954)
Why I did it
Current regex not able to capture logs, modify regex to capture syslog messages
Work item tracking
Microsoft ADO (number only): 13366345
How I did it
Code change
How to verify it
sonic-mgmt test case
* Update usage leaf in sonic-events-host yang models (#15805)
#### Why I did it
event yang models for usage currently use int as type for usage leaf, needs to be of type decimal64
##### Work item tracking
- Microsoft ADO **(number only)**:17747466
#### How I did it
Update yang models and UT
#### How to verify it
UT
#### Why I did it
src/sonic-utilities
```
* 9e8df7e5 - (HEAD -> 202211, origin/202211) [build] Fix dependency issue between responses and urllib3 package. (#2928) (#2929) (20 hours ago) [Liu Shilong]
```
#### How I did it
#### How to verify it
#### Description for the changelog
#### Why I did it
src/sonic-swss
```
* 6ec611dc - (HEAD -> 202211, origin/202211) [202211][ppi]: Implement port bulk comparison logic (#2564) (#2821) (19 hours ago) [Nazarii Hnydyn]
```
#### How I did it
#### How to verify it
#### Description for the changelog
Why I did it
During the upgrade process via k8s, the feature's systemd service will restart as well, all of the feature systemd service has restart number limit, and the limit number is too small, only three times. if fallback happens when upgrade, the start count will be 2, just once again, the systemd service will be down. So, need to bypass this. This restart function will be called when do local -> kube, kube -> kube, kube ->local, each time call this function, we indeed need to restart successfully, so do reset-failed every time we do restart.
When need to go back to local mode, we do systemd restart immediately without waiting the default restart interval time so that we can reduce the container down time.
Work item tracking
Microsoft ADO (number only):
24172368
How I did it
Before every restart for upgrade, do reset feature's restart number. The restart number will be reset to 0 to bypass the restart limit.
When need to go back to local mode, we do systemd restart immediately.
How to verify it
Feature's systemd service can be always restarted successfully during upgrade process via k8s.
#### Why I did it
src/sonic-platform-daemons
```
* ea8e5c7 - (HEAD -> 202211, origin/202211) [202211] chassisd: Fix crash on exit on linecard (#352) (6 hours ago) [Patrick MacArthur]
```
#### How I did it
#### How to verify it
#### Description for the changelog
Why I did it
When do clean up container images, current code has two bugs need to be fixed. And some variables' name maybe cause confused, change the variables' name.
Work item tracking
Microsoft ADO (number only): 24502294
How I did it
We do clean up after tag latest successfully. But currently tag latest function only return 0 and 1, 0 means succeed and 1 means failed, when we get 1, we will retry, when we get 0, we will do clean up. Actually the code 0 includes another case we don't need to do clean up. The case is that when we are doing tag latest, the container image we want to tag maybe not running, so we can not tag latest and don't need to cleanup, we need to separate this case from 0, return -1 now.
When local mode(v1) -> kube mode(v2) happens, one problem is how to handle the local image, there are two cases. one case is that there was one kube v1 container dry-run(cause we don't relace the local if kube version = local version), we will remove the kube v1 image and tag the local version with ACR prefix and remove local v1 local tag. Another case is that there was no kube v1 container dry-run, we remove the local v1 image directly, cause the local v1 image should not be the last desire version.
About the docker_id variable, it may cause confused, it's actually docker image id, so rename the variable. About the two dicts and the list, rename them to be more readable.
How to verify it
Check tag latest and image clean up result.
#### Why I did it
After k8s upgrade a container, k8s can only know the container is running, don't know the service's status inside container. So we need a probe inside container, k8s will call the probe to check whether the container is really ready.
##### Work item tracking
- Microsoft ADO **(number only)**: 22453004
#### How I did it
Add a health check probe inside config engine container, the probe will check whether the start service exit normally or not if the start service exists and call the python script to do container self-related specific checks if the script is there. The python script should be implemented by feature owner if it's needed.
more details: [design doc](https://github.com/sonic-net/SONiC/blob/master/doc/kubernetes/health-check.md)
#### How to verify it
Check path /usr/bin/readiness_probe.sh inside container.
#### Which release branch to backport (provide reason below if selected)
- [ ] 201811
- [ ] 201911
- [ ] 202006
- [ ] 202012
- [ ] 202106
- [ ] 202111
- [x] 202205
- [x] 202211
#### Tested branch (Please provide the tested image version)
- [x] 20220531.28
#### Why I did it
src/sonic-swss
```
* eff2b751 - (HEAD -> 202211, origin/202211) [202211][CodeQL]: Use dependencies with relevant versions in azp template. (#2843) (6 hours ago) [Nazarii Hnydyn]
```
#### How I did it
#### How to verify it
#### Description for the changelog
#### Why I did it
src/sonic-swss
```
* 0ec46f22 - (HEAD -> 202211, origin/202211) [muxorch] Skip programming ACL for standby `active-active` ports (#2569) (#2854) (7 hours ago) [Longxiang Lyu]
```
#### How I did it
#### How to verify it
#### Description for the changelog
#### Why I did it
src/sonic-swss
```
* ac698065 - (HEAD -> 202211, origin/202211) [202211][muxorch] Skip programming SoC IP kernel tunnel route (39 minutes ago) [Longxiang Lyu]
```
#### How I did it
#### How to verify it
#### Description for the changelog
Why I did it
To reduce the container's dependency from host system
Work item tracking
Microsoft ADO (number only):
17713469
How I did it
Move the k8s container startup script to config engine container, other than mount it from host.
How to verify it
Check file path(/usr/share/sonic/scripts/container_startup.py) inside config engine container.
Signed-off-by: Yun Li <yunli1@microsoft.com>
Co-authored-by: Qi Luo <qiluo-msft@users.noreply.github.com>
#### Why I did it
src/sonic-sairedis
```
* da40f3f - (HEAD -> 202211, origin/202211) [202211][CodeQL]: Use dependencies with relevant versions in azp template. (#1263) (11 hours ago) [Nazarii Hnydyn]
```
#### How I did it
#### How to verify it
#### Description for the changelog
#### Why I did it
src/sonic-sairedis
```
* d1850e2 - (HEAD -> 202211, origin/202211) [CI]: Fix pipeline issue caused by urllib3 v2. (#1264) (13 hours ago) [Nazarii Hnydyn]
```
#### How I did it
#### How to verify it
#### Description for the changelog
#### Why I did it
src/sonic-utilities
```
* 5e50a4af - (HEAD -> 202211, origin/202211) Add support for secure upgrade (#2698) (14 hours ago) [ycoheNvidia]
```
#### How I did it
#### How to verify it
#### Description for the changelog
#### Why I did it
src/sonic-swss-common
```
* 7465f1b - (HEAD -> 202211, origin/202211) Fix pipeline issue caused by urllib3 v2 (12 hours ago) [Liu Shilong]
* 82f9d76 - [Ci] Fix collect log error in azp template (#799) (2 days ago) [xumia]
```
#### How I did it
#### How to verify it
#### Description for the changelog
#### Why I did it
src/sonic-swss
```
* 5fe6e25a - (HEAD -> 202211, origin/202211) [subinterface]: Fix admin state handling. (#2806) (4 hours ago) [Nazarii Hnydyn]
```
#### How I did it
#### How to verify it
#### Description for the changelog
#### Why I did it
src/sonic-host-services
```
* 3ebe922 - (HEAD -> 202211, origin/202211) Fix issue: hostcfgd unit test might be affected by other during parallel build (#65) (12 hours ago) [Junchao-Mellanox]
```
#### How I did it
#### How to verify it
#### Description for the changelog
#### Why I did it
src/sonic-platform-daemons
```
* 4daa748 - (HEAD -> 202211, origin/202211) PSUD-Delete or update CHASSIS_INFO table PSU/Modules data if added or… (#351) (33 hours ago) [prem-nokia]
```
#### How I did it
#### How to verify it
#### Description for the changelog
Why I did it
Add yang model definition for CHASSIS_MODULE define and implemented for sonic chassis. HLD for this configuration is included in https://github.com/sonic-net/SONiC/blob/master/doc/pmon/pmon-chassis-design.md#configurationFixes#12640
How I did it
Added yang model definition, unit tests, sample config and documentation for the table
How to verify it
Validated config tree generation using "pyang -Vf tree -p /usr/local/share/yang/modules/ietf ./yang-models/sonic-voq-inband-interface.yang"
Built the below python-wheels to validate unit tests and other changes
target/python-wheels/bullseye/sonic_yang_mgmt-1.0-py3-none-any.whl
target/python-wheels/bullseye/sonic_yang_models-1.0-py3-none-any.whl
target/python-wheels/bullseye/sonic_config_engine-1.0-py3-none-any.whl
Why I did it
refine reproducible build.
How I did it
Fix reset map variable in bash.
Ignore empty web file md5sum value.
If web file didn't backup in azure storage, use file on web.
How to verify i
#### Why I did it
src/sonic-platform-common
```
* 459ffaa - (HEAD -> 202211, origin/202211) Fix issue: should use 'Value' column to calculate the health percentage (#381) (3 hours ago) [Junchao-Mellanox]
```
#### How I did it
#### How to verify it
#### Description for the changelog
#### Why I did it
src/sonic-utilities
```
* 0f0ec140 - (HEAD -> 202211, origin/202211) Fix issue: show interfaces transceiver eeprom -d should display same entry for CMIS cable (#2864) (3 hours ago) [Junchao-Mellanox]
```
#### How I did it
#### How to verify it
#### Description for the changelog
How I did it
During the disable flow of dhcp_relay, it entered the dnsrvs_name list, which caused the SYSTEM_STATE key to be set to DOWN. Right after that, the dhcp_relay service was removed from the full service list, however, but, when it was removed from the dnsrvs_name, there was no flow to reset the system state back to UP even though there was no more services in down state.
How to verify it
root@qa-eth-vt01-2-3700v:/home/admin# config feature state dhcp_relay enabled
root@qa-eth-vt01-2-3700v:/home/admin# show system-health sysready-status
root@qa-eth-vt01-2-3700v:/home/admin# config feature state dhcp_relay disabled
root@qa-eth-vt01-2-3700v:/home/admin# show system-health sysready-status
Should see
System is ready
* Re-add 127.0.0.1/8 when bringing down the interfaces
With #5353, 127.0.0.1/16 was added to the lo interface, and then
127.0.0.1/8 was removed. However, when bringing down the lo interface,
like during a config reload, 127.0.0.1/16 gets removed, but 127.0.0.1/8
isn't added back to the interface. This means that there's a period of
time where 127.0.0.1 is not available at all, and services that need to
connect to 127.0.01 (such as for redis DB) will fail.
To fix this, when going down, add 127.0.0.1/8. Add this address before
the existing configuration gets removed, so that 127.0.0.1 is available
at all times.
Note that running `ifdown lo` doesn't actually bring down the loopback
interface; the interface always stays "physically" up.
Signed-off-by: Saikrishna Arcot <sarcot@microsoft.com>
#### Why I did it
src/sonic-swss
```
* bccb1cc - (HEAD -> 202211, origin/202211) [202211] [sflowmgrd] Infer sampling rate dynamically based on oper speed (#2805) (4 hours ago) [Vivek]
```
#### How I did it
#### How to verify it
#### Description for the changelog
Backporting #13969
Why I did it
Implementing code changes for sonic-net/SONiC#1203
Work item tracking
Microsoft ADO (number only):
How I did it
Removed the timers and delayed target since the delayed services would start based on event driven approach.
Cleared port table during config reload and cold reboot scenario.
Modified yang model, init_cfg.json to change has_timer to delayed
How to verify it
Added UT to verify
What I did:
Workaround for the issue seen here : FRRouting/frr#13682
It seems there is timing issue where there are multiple recursive lookup needed to resolve nexthop of the route it's possible that it does not happen correctly causing route to remain in inactive state
Issue is seen on chassis-packet as there 2 level of recursive lookup needed for a given e-BGP learnt route
- Level1 to resolve e-BGP peer (connected route via bgp ) over Loopback4096 (i-BGP peering)
- Level 2 Loopback4096 over backend port-channels next-hops
For VOQ chassis there is no e-BGP peer (connected route via bgp ) resolution as route is added as Static route by orchagent over Ethernet-IB.
Also as part of this remove route-map policy from instance.conf.j2 as same is define in peer-group.j2.
Microsoft ADO: https://msazure.visualstudio.com/One/_workitems/edit/24198507
How I verify:
Functional Verification manually
Updated UT.
We will be adding sanity check in sonic-mgmt to make sure none of route are in inactive state.
Signed-off-by: Abhishek Dosi <abdosi@microsoft.com>
- Why I did it
Add fan direction check to system health, all fans should be in the same direction
- How I did it
Add fan direction check to system health, all fans should be in the same direction
- How to verify it
Manual test
Unit test
Added sonic-mgmt test case to verify