Commit Graph

8031 Commits

Author SHA1 Message Date
Liu Shilong
104ae5d185
[ci] Fix pipeline 'Check if vstest is needed.' issue. (#15958)
Why I did it
The script exits at line 7 when the k8s files have not changed.
Use 'System.PullRequest.TargetBranchName' instead of 'System.PullRequest.TargetBranch', because the git server in AzDevOps doesn't support 'System.PullRequest.TargetBranch'.
Work item tracking
Microsoft ADO (number only): 24636791
How I did it
How to verify it
2023-07-25 18:24:57 +08:00
Longxiang Lyu
dc139cfc32
[monit][dualtor] Periodically check mux neighbors consistency (#15769)
Signed-off-by: Longxiang Lyu <lolv@microsoft.com>
2023-07-24 21:16:49 -07:00
Saikrishna Arcot
e0927e28af
Update sairedis submodule (#15720)
This submodule update needs to be manually done due to build changes
done in the sairedis submodule. Specifically, Debian build profiles are
now being used instead of dpkg build targets, and dbgsym packages are
being used instead of dbg packages. Because of this, there needs to be
changes on the sonic-buildimage side for this.

This submodule update brings in the following changes:

ce8f642 [vs] Use boost join to concatenate switch types in config (#1266)
d6055a2 [vslib]: Temporaily map DPU switch type to NVDA_MBF2H536C (#1259)
e1cdb4d [CodeQL]: Use dependencies with relevant versions in azp template. (#1262)
c08f9a2 [CI]: Fix collect log error in azp template. (#1260)
eed856c [CodeQL]: Fix syncd compilation in azp template. (#1261)
a3f1f1a Reland 'Make changes to building and packaging sairedis (#1116)' (#1194)

Signed-off-by: Saikrishna Arcot <sarcot@microsoft.com>
2023-07-24 17:05:03 -07:00
lixiaoyuner
10b65d9826
Add k8s master code new (#15716)
Why I did it
Currently, the k8s master image is built from a separate branch that we created ourselves, not from a release branch. We need to commit the k8s master related code to the master branch to improve how the k8s master image is built.

Work item tracking
Microsoft ADO (number only):
19998138
How I did it
Install k8s dashboard docker images.
Install the geneva mds, mdsd and fluentd docker images and tag them as latest; tagging them as latest ensures that containers are always created with the latest version.
Install azure-storage-blob and azure-identity, which are used for etcd backup and restore.
Install the kubernetes python client packages, which are used to read worker and container state so that these metrics can be sent to Geneva.
Remove the mdm debian package; it will be replaced with the mdm docker image.
Add the k8s master entrance script, which is called by the rc-local service at system startup. We have some master systemd services in the compute-move repo; when the VMM service creates a master VM, it copies all master service files into the VM, and the entrance script sets up all services according to those service files.
When the entrance script content changes, the PR build sets include_kubernetes_master=y to help validate k8s master related code changes. The default value of include_kubernetes_master should always be n for the public master branch. We will generate the master image from the internal master branch.
How to verify it
Build with INCLUDE_KUBERNETES_MASTER = y
2023-07-25 07:44:59 +08:00
Jason Tsai
d2b5d774c5
[Ufispace][PDDF] Add PDDF support on S9180-32X (#14909)
* Add s9180-32x pddf support

Signed-off-by: cytsai0409 <cytsai0409@gmail.com>

* Fix memset_s parameter

Signed-off-by: cytsai0409 <cytsai0409@gmail.com>

* Update chassis.py and fan.py

1. remove duplicate get_sfp() in chassis.py
2. update get_direction() and get_target_speed() in fan.py

Signed-off-by: cytsai0409 <cytsai0409@gmail.com>

---------

Signed-off-by: cytsai0409 <cytsai0409@gmail.com>
2023-07-24 09:37:48 -07:00
Junchao-Mellanox
05f9c5c297
Fix issue: set delayed attribute to true for platform monitor service (#15816)
There is a redundant line in init_cfg.json.j2 that causes the pmon service to always have "delayed=False". However, PMON has a timer now, so this change fixes it.
2023-07-24 08:30:35 -07:00
mssonicbld
9129a7bf04
[submodule] Update submodule sonic-platform-daemons to the latest HEAD automatically (#15918)
#### Why I did it
src/sonic-platform-daemons
```
* 76baca3 - (HEAD -> master, origin/master, origin/HEAD) Fixes for the issues uncovered by sonic-pcied unit tests (#389) (32 hours ago) [Ashwin Srinivasan]
```
#### How I did it
#### How to verify it
#### Description for the changelog
2023-07-21 18:33:20 +08:00
xumia
a0ba49d732
[Build] Fix some of the patches not applied issue (#15660)
Why I did it
Fix the issue where some patches in the .patches folder are not applied.
The command "quilt applied" only lists the applied patches; if some of the patches have issues, they will not be applied when the build command is run again.

Work item tracking
Microsoft ADO (number only): 24410730
How I did it
Run the command to apply the patches unconditionally.
If it fails, check whether the failure reason is "series fully applied".
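
The two steps above can be sketched as follows. This is only an illustration, not the actual build rule: the function names are hypothetical, the use of `quilt push -a` is an assumption about which command is run, and the marker text is taken from the commit description.

```python
import subprocess

# Marker text quilt is assumed to print when nothing is left to apply
# (wording taken from the commit description above).
FULLY_APPLIED_MARKER = "series fully applied"


def patches_applied(returncode, output):
    """Return True if the patch stack is fully applied after a push attempt."""
    if returncode == 0:
        return True
    # A non-zero exit is acceptable only when every patch was already applied.
    return FULLY_APPLIED_MARKER in output.lower()


def apply_all_patches(workdir):
    """Always try to apply every patch, then classify the outcome."""
    proc = subprocess.run(
        ["quilt", "push", "-a"],
        cwd=workdir,
        capture_output=True,
        text=True,
    )
    return patches_applied(proc.returncode, proc.stdout + proc.stderr)
```

Keeping the classification in `patches_applied` separates the decision from the subprocess call, so it can be checked without quilt installed.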
How to verify it
2023-07-21 16:48:57 +08:00
mssonicbld
19638a4df6
[submodule] Update submodule sonic-gnmi to the latest HEAD automatically (#15929)
#### Why I did it
src/sonic-gnmi
```
* fb338d5 - (HEAD -> master, origin/master, origin/HEAD) Merge pull request #135 from liuh-80/dev/liuh/cherry-pick-zmq (3 hours ago) [Hua Liu]
* f8d9c7e - Merge branch 'master' into dev/liuh/cherry-pick-zmq (8 hours ago) [Qi Luo]
* cbd5185 - Fix PR comments (26 hours ago) [liuh-80]
* 226fc31 - Fix PR comments (2 days ago) [liuh-80]
* 6579847 - Fix UT (3 days ago) [liuh-80]
* 53713c3 - Improve code coverage (3 days ago) [liuh-80]
* d8ff562 - Fix UT (3 days ago) [liuh-80]
* c3a66bc - Cherry-pick ZMQ change from nvidia repo (3 days ago) [liuh-80]
```
#### How I did it
#### How to verify it
#### Description for the changelog
2023-07-21 16:32:44 +08:00
mssonicbld
287056110e
[submodule] Update submodule sonic-utilities to the latest HEAD automatically (#15931) 2023-07-21 15:38:23 +08:00
guangyao6
9567c06570
Add BGP configuration for BGPSentinel peer (#15714)
Why I did it
For the route registry service, in order to block hijacked routes, an IBGP session needs to be set up from the BGP sentinel service to SONiC, and the BGP sentinel service advertises the same route with a higher local-preference and the no-export community, so that SONiC takes the route from the BGP sentinel as the best path and does not advertise the route to EBGP peers.
This requires new route-maps, so this change adds a new set of templates, keeping BGPSentinel peers out of the other templates.

Work item tracking
Microsoft ADO (number only): 24451346
How I did it
Add sentinel_community in constants.yml; routes from BGPSentinel that do not match this community will be denied.
Add support to convert BGPSentinel related configuration in the BGPPeerPassive element of the minigraph to a new BGP_SENTINELS table in CONFIG_DB
Add a new set of "sentinels" templates to docker-fpm-frr
Add a new BGP peer manager to bgpcfgd, to add neighbors from the BGP_SENTINELS table using the "sentinels" templates
Add a test case for minigraph.py, making sure the BGPSentinel and BGPSentinelV6 elements create BGP_SENTINELS DB entry.
Add a set of test cases for the new sentinels templates in sonic-bgpcfgd tests.
Add sonic-bgp-sentinel.yang and a set of testcases for the yang file.

How to verify it
The newly added test cases and UT should pass.
Set up IPv4 and IPv6 BGPSentinel services in the minigraph and load the minigraph; CONFIG_DB and "show runningconfig bgp" should show that the configuration was loaded successfully.
Using the t1-lag topo, set up an IBGP session from BGPSentinel to the SONiC loopback address; the IBGP session should come up.
Advertise a route from BGPSentinel to T1 with sentinel_community, a higher local-preference and the no-export community. In T1, show bgp route reports "Not advertise to any EBGP peer".
Withdraw the route in BGPSentinel; in T1, the route should be advertised to EBGP peers.
Advertise a route from T1 that does not match sentinel_community; in T1, the route should not appear in show bgp route.
2023-07-21 09:32:29 +08:00
mssonicbld
bb99552f03
[submodule] Update submodule sonic-utilities to the latest HEAD automatically (#15861) 2023-07-21 07:14:31 +08:00
Jing Zhang
57b2ab4bc3
[YANG] add yang model for MUX_LINKMGR|MUXLOGGER (#15884)
Add yang model for MUX_LINKMGR|MUXLOGGER.
2023-07-20 13:12:35 -07:00
wilson-smci
2d0bad0523
[Supermicro]: Add a new supported device and platform, SSE-T7132S. (#15368)
* platform/innovium: Add a new supported device and platform, SSE-T7132S

* Switch Vendor: Supermicro
* Switch SKU:  Supermicro_sse_t7132s
* ASIC Vendor: innovium
* Switch ASIC: TL7
* Port Configuration: 32x400G
* SONiC Image: SONiC-ONIE-Innoviunm

Signed-off-by: wilsonw <wilsonw@supermicro.com.tw>
2023-07-20 10:24:56 -07:00
Aravind Mani
2de5abdaf4
DellEMC: Fix API2.0 initialization issue (#15687)
Why I did it
To fix sonic-net/sonic-mgmt#8786

How I did it
Modified Fan API to check whether the data retrieved is valid or not and return accordingly

How to verify it
Verify whether API 2.0 is loaded properly.
Execute CLIs such as "show version", "show interface status", "show platform psustatus", etc.
2023-07-20 09:50:22 -07:00
Aravind Mani
05314f9e5b
DellEMC: S5248F update LED Firmware (#15790)
* DellEMC: S5248F update LED firmware
2023-07-20 09:49:48 -07:00
Saikrishna Arcot
371c3a0be5
Add support for deb build profiles env variable (#15858)
Add support for a separate DEB_BUILD_PROFILES environment variable for
setting build profiles. This may be used to specify whether
python 2 bindings/libraries should be built, or which configuration
options should be specified for a package.

This also makes it easier to append/remove build profiles from our rules
files, which will be needed for the sairedis build.
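
As a rough illustration of appending and removing build profiles, the sketch below treats DEB_BUILD_PROFILES as a space-separated list. The helper names and the profile names used in the example are hypothetical and are not taken from the rules files.

```python
import os


def add_profile(profiles, name):
    """Append a profile to a space-separated list unless already present."""
    items = profiles.split()
    if name not in items:
        items.append(name)
    return " ".join(items)


def remove_profile(profiles, name):
    """Drop a profile from a space-separated list if it is present."""
    return " ".join(p for p in profiles.split() if p != name)


def build_env(profiles, base_env=None):
    """Return a copy of the environment with DEB_BUILD_PROFILES set."""
    env = dict(os.environ if base_env is None else base_env)
    env["DEB_BUILD_PROFILES"] = profiles
    return env


# Example with hypothetical profile names.
profiles = add_profile("nocheck", "nopython2")
env = build_env(profiles, base_env={})
```

The resulting `env` could then be passed to a build subprocess, for example through the `env` argument of `subprocess.run`.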

Signed-off-by: Saikrishna Arcot <sarcot@microsoft.com>
2023-07-20 09:14:23 -07:00
mssonicbld
a4787fd213
[submodule] Update submodule sonic-gnmi to the latest HEAD automatically (#15921)
#### Why I did it
src/sonic-gnmi
```
* 610509b - (HEAD -> master, origin/master, origin/HEAD) Install necessary debs instead of entire artifact in azp (#137) (2 hours ago) [Zain Budhwani]
```
#### How I did it
#### How to verify it
#### Description for the changelog
2023-07-20 20:50:08 +08:00
mssonicbld
601ec40700
[submodule] Update submodule sonic-linux-kernel to the latest HEAD automatically (#15916) 2023-07-20 19:20:29 +08:00
mssonicbld
135243d7bf
[submodule] Update submodule sonic-swss to the latest HEAD automatically (#15920) 2023-07-20 19:16:40 +08:00
Ye Jianquan
7533c8ccf6
[sonic-mgmt docker image] Upgrade celery in the python3 to 5.2.7, upgrade ipython to 8.12.2 (#15911)
Upgrade celery in the python3 environment to 5.2.7.
Upgrade ipython to 8.12.2, since ipython 5.4.1 requires prompt-toolkit<2.0.0,>=1.0.4,
but celery 5.2.7 relies on click-repl>=0.2.0, and click-repl>=0.2.0 relies on prompt-toolkit>=3.0.36.
Upgrading ipython resolves the prompt-toolkit version incompatibility.
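
The conflict can be checked with a small sketch. The version bounds come from the message above; the helper names and the sample prompt-toolkit versions are only illustrative, and the parser handles plain dotted numeric versions only.

```python
def parse(version):
    """Turn a dotted numeric version string into a comparable tuple."""
    return tuple(int(part) for part in version.split("."))


def ok_for_ipython_5_4_1(ptk_version):
    """prompt-toolkit<2.0.0,>=1.0.4 (required by ipython 5.4.1)."""
    return parse("1.0.4") <= parse(ptk_version) < parse("2.0.0")


def ok_for_click_repl(ptk_version):
    """prompt-toolkit>=3.0.36 (required by click-repl>=0.2.0)."""
    return parse(ptk_version) >= parse("3.0.36")


# Sample versions for illustration: none can satisfy both ranges,
# because the upper bound 2.0.0 lies below the lower bound 3.0.36.
samples = ["1.0.4", "1.0.18", "2.0.10", "3.0.36", "3.0.39"]
compatible = [v for v in samples if ok_for_ipython_5_4_1(v) and ok_for_click_repl(v)]
```

Because the two ranges are disjoint, no prompt-toolkit release can satisfy both constraints, which is why ipython itself has to move to a newer version.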
2023-07-20 14:28:08 +08:00
mssonicbld
e4d2752143
[submodule] Update submodule sonic-swss to the latest HEAD automatically (#15908)
#### Why I did it
src/sonic-swss
```
* cb1b3f40 - (HEAD -> master, origin/master, origin/HEAD) Remove system neighbor DEL operation in m_toSync if SET operation for (#2853) (7 hours ago) [Song Yuan]
```
#### How I did it
#### How to verify it
#### Description for the changelog
2023-07-20 09:47:43 +08:00
xumia
e73f1110ad
[Build] Fix the dash cache dependency issue (#15851)
#### Why I did it
[Build] Fix the dash cache dependency issue
```
12:47:34  [ finished ] [ target/files/bullseye/ctrmgrd.service ]
12:47:36  fatal: Unable to hash src/sonic-dash-api/sonic-dash-api
12:47:36  make: *** [Makefile.cache:528: target/debs/bullseye/libdashapi_1.0.0_amd64.deb.smdep] Error 123
12:47:36  make: *** Waiting for unfinished jobs....
```

##### Work item tracking
- Microsoft ADO **(number only)**: 24547630
2023-07-19 15:56:24 -07:00
vmittal-msft
fea10546f2
Update WRED profile on system ports (#15612)
* Update WRED profile on system ports
2023-07-19 15:00:39 -07:00
mssonicbld
c8ea7d26f3
[submodule] Update submodule linkmgrd to the latest HEAD automatically (#15885)
#### Why I did it
src/linkmgrd
```
* 6e5cfda - (HEAD -> master, origin/master, origin/HEAD) Change common_libs dependencies from buster to bullseye (#212) (2 days ago) [Ze Gan]
```
#### How I did it
#### How to verify it
#### Description for the changelog
2023-07-20 04:33:02 +08:00
mssonicbld
ecc0f4c243 [ci/build]: Upgrade SONiC package versions 2023-07-20 04:32:51 +08:00
Ashwin Srinivasan
0b067bfb2a
[master] Mellanox: 2700, 4600c - Quoted device IDs to prevent false flags in pcied (#15896)
Why I did it
Certain all-numeric device IDs of PCI devices in the pcie.yaml file are left unquoted, leading to false mismatch flags in the pcie daemon and, subsequently, to log flooding. This PR fixes that issue.

Work item tracking
Microsoft ADO (number only): 24578930
How I did it
Added quotes around numeric PCI device IDs in the pcie.yaml files of the following platforms:

x86_64-mlnx_msn2700-r0
x86_64-mlnx_msn4600c-r0
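
A minimal illustration of why quoting matters, assuming the daemon compares the ID loaded from YAML with an ID string read from the device. A YAML 1.1 loader such as PyYAML resolves an unquoted all-digit scalar to an integer and a quoted one to a string. The function name and the ID value below are hypothetical.

```python
def ids_match(configured_id, detected_id):
    """Strict equality: an int never equals a string with the same digits."""
    return configured_id == detected_id


detected = "1003"        # hypothetical ID string read from the device
unquoted_value = 1003    # what `id: 1003` becomes after YAML loading
quoted_value = "1003"    # what `id: "1003"` becomes after YAML loading

false_mismatch = not ids_match(unquoted_value, detected)
fixed = ids_match(quoted_value, detected)
```

With the unquoted form the comparison fails even though the digits are identical, which is the kind of false mismatch the commit describes.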

How to verify it
Install latest image after the merge and verify that syslogs are not flooded with PCI device mismatch errors
2023-07-18 21:14:00 -07:00
xumia
bdef73ea96
[Build] Fix the PyYang python package installation issue (#15890)
Why I did it
Fix the armhf build failure.
How to reproduce the issue:

docker run -it debian:bullseye bash
apt-get update && apt-get install -y python3-pip
pip3 install PyYAML==5.4.1
Error message:

Collecting PyYAML==5.4.1
  Installing build dependencies ... done
  Getting requirements to build wheel ... error
  ERROR: Command errored out with exit status 1:
   command: /usr/bin/python3 /tmp/tmp6xabslgb_in_process.py get_requires_for_build_wheel /tmp/tmp_er01ztl
....
      raise AttributeError(attr)
  AttributeError: cython_sources
  ----------------------------------------
WARNING: Discarding d63f2d7597/PyYAML-5.4.1.tar.gz (sha256)=607774cbba28732bfa802b54baa7484215f530991055bb562efbed5b2f20a45e (from https://pypi.org/simple/pyyaml/) (requires-python:>=2.7, !=3.0.*, !=3.1.*, !=3.2.*, !=3.3.*, !=3.4.*, !=3.5.*). Command errored out with exit status 1: /usr/bin/python3 /tmp/tmp6xabslgb_in_process.py get_requires_for_build_wheel /tmp/tmp_er01ztl Check the logs for full command output.
ERROR: Could not find a version that satisfies the requirement PyYAML==5.4.1
ERROR: No matching distribution found for PyYAML==5.4.1
root@fa2fa92edcfd:/# 
Adding the option --no-build-isolation avoids the error; see the fix:

install "PyYAML==5.4.1" --no-build-isolation
The same error appears in multiple builds.

Work item tracking
Microsoft ADO (number only): 24567457

How I did it
Add a build option --no-build-isolation.
2023-07-19 06:33:49 +08:00
Zain Budhwani
e2a58acf61
Update usage leaf in sonic-events-host yang models (#15805)
#### Why I did it

The event yang models for usage currently use int as the type of the usage leaf; it needs to be of type decimal64.

##### Work item tracking
- Microsoft ADO **(number only)**:17747466

#### How I did it

Update yang models and UT

#### How to verify it

UT
2023-07-18 10:28:39 -07:00
jcaiMR
bd413d20d2
advance dhcprelay to 6a6ce24, add default dhcpv6 dualtor source interface (#15864)
sonic-buildimage side change to fix source interface selection in the dual ToR scenario.
Related dhcprelay PR:
[master]fix dhcpv6 relay dual tor source interface selection issue sonic-dhcp-relay#42

Advance the dhcprelay submodule to 6a6ce24 (to invoke the #40 PR: [master]fix dhcpv6 relay dual tor source interface selection issue sonic-dhcp-relay#42)
2023-07-17 15:28:10 -07:00
mssonicbld
39f3e1f97a
[ci/build]: Upgrade SONiC package versions (#15862) 2023-07-17 19:08:24 +08:00
mssonicbld
1ec3b1dc6b
[submodule] Update submodule sonic-swss to the latest HEAD automatically (#15860)
#### Why I did it
src/sonic-swss
```
* 5b27c209 - (HEAD -> master, origin/master, origin/HEAD) Refactor Orch class to separate recorder implementation (#2837) (8 hours ago) [Vivek]
```
#### How I did it
#### How to verify it
#### Description for the changelog
2023-07-17 16:37:57 +08:00
ycoheNvidia
05bbf72c86
Reduced root directory privileges (#15147)
#### Why I did it
Reduced root directory privileges

#### How I did it
During build_debian, chroot is called to reduce the privileges of the root directory and its subdirectories to 744.
#### How to verify it
After image build and upgrade - check /root privileges by calling "ls -a /root"
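
For reference, the sketch below shows what mode 744 means for a directory, using Python's standard `stat` module. The helper names are illustrative, and actually checking `/root` on a device is left as a commented-out call.

```python
import os
import stat

EXPECTED_MODE = 0o744  # owner: rwx, group: r--, others: r--


def describe_dir_mode(mode):
    """Render permission bits the way a long directory listing shows them."""
    return stat.filemode(stat.S_IFDIR | mode)


def has_expected_mode(path, expected=EXPECTED_MODE):
    """Compare the permission bits of an existing path with the expected mode."""
    return stat.S_IMODE(os.stat(path).st_mode) == expected


summary = describe_dir_mode(EXPECTED_MODE)
# has_expected_mode("/root")  # run on the device after the upgrade
```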

#### Description for the changelog
reduced /root directory privileges
2023-07-16 11:06:29 -07:00
mssonicbld
c970ee0f42
[submodule] Update submodule sonic-utilities to the latest HEAD automatically (#15853) 2023-07-16 15:30:08 +08:00
mssonicbld
273cb46af9
[ci/build]: Upgrade SONiC package versions (#15854) 2023-07-15 20:23:42 +08:00
mssonicbld
3e9ae4fc7a
[submodule] Update submodule sonic-platform-daemons to the latest HEAD automatically (#15852)
#### Why I did it
src/sonic-platform-daemons
```
* 94242c2 - (HEAD -> master, origin/master, origin/HEAD) Use vendor customizable fan speed threshold checks (#378) (3 hours ago) [spilkey-cisco]
* db6e340 - Fix index out of range in the error log of invalid media lane mask received (#386) (8 hours ago) [MichaelWangSmci]
```
#### How I did it
#### How to verify it
#### Description for the changelog
2023-07-15 16:34:27 +08:00
Stephen Sun
2a55e8b359
Update the description message of PSU power threshold checking in system health (#15289)
- Why I did it
Adjust PSU power threshold logic in system health.

- How I did it
Update the description message in PSU power threshold checking
power of PSU x (xx w) exceeds threshold (xx w) => System power exceeds xx threshold (xx w)

- How to verify it
Manual test and unit test
2023-07-15 01:10:29 +03:00
Kebo Liu
b6986ffd68
[Mellanox] Update SAI build procedure (#15728)
- Why I did it
To optimize Mellanox platform SAI build

- How I did it
SAI debs are now downloaded as Spectrum-SDK-Drivers-SONiC-Bins release.

- How to verify it
Configure/build for Mellanox platform, check the image and ensure that correct SAI debs are included.
2023-07-15 01:03:33 +03:00
Junchao-Mellanox
ed21266ff4
[Mellanox] Remove reset_from_comex from reboot cause mapping (#15793)
- Why I did it
The reset cause "reset_from_comex" has been removed by hw-management, hence removing it from platform API code

- How I did it
Remove reset_from_comex from reboot cause mapping

- How to verify it
Manual test
2023-07-15 01:02:46 +03:00
DavidZagury
b06a856fba
[Mellanox] Add support for BIOS update on Spectrum-4 (#15795)
- Why I did it
The BIOS on new-generation switches can come as a file of type cap or cab, so support for these file types needs to be added.
Also, the ONIE version on new devices can have a 'dev' suffix.

- How I did it
Added cap & cab as possible component extensions for ComponentBIOS.
Update the ONIE version regex to include dev signed versions.

- How to verify it
Update BIOS.
2023-07-15 00:59:55 +03:00
Ze Gan
a24845997d
Add protobuf and dashapi to sonic-mgmt (#15743)
#### Why I did it
The testcases in sonic-mgmt need the packages of protobuf and dashapi

##### Work item tracking
- Microsoft ADO **(number only)**:

#### How I did it
Because the sonic-mgmt docker image is based on ubuntu20.04, it cannot directly install the packages compiled by the slave due to dependency issues. Download the related packages directly from Azp.

#### How to verify it
Check azp stats.
2023-07-14 11:23:25 -07:00
lixiaoyuner
2602ad25ba
[ctgmgr]: do not remove label when do systemd service stop when service is in kube mode (#15642)
Why I did it
When SONiC is managed by k8s, each SONiC container is managed by a k8s daemonset, and the daemonset identifies its members by labels. Currently, when a SONiC service is restarted via systemctl and the service's container is already managed by k8s, the systemd script stops the container by removing the feature label, so that it leaves the k8s daemonset, and then starts it by adding the label, so that it joins the k8s daemonset again.

This behavior causes a problem during k8s container upgrades. Containers in a daemonset are upgraded in a rolling fashion: the daemonset version is updated first, then the new version is rolled out to the containers one by one with precheck/postcheck. However, when a SONiC device joins a daemonset, k8s directly deploys a pod with the current version of the daemonset; this is expected when a device joins the k8s cluster for the first time.

For a device that has already joined the k8s cluster, however, re-joining the daemonset upgrades the container to the new version without a precheck. So if a systemd service is restarted during a daemonset upgrade, the container may be upgraded without a precheck, breaking the rolling update policy. To fix this, we need to remove the logic that drops the k8s label in the systemd service stop script for kube mode.

Work item tracking
Microsoft ADO (number only): 24304563

How I did it
Don't drop the label in the systemd service stop script when the feature's set_owner is kube. Only drop the label when the feature's set_owner is local.
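
The rule above reduces to a single decision. A minimal sketch, with a hypothetical function name and the owner values "kube" and "local" taken from the description:

```python
def should_drop_label_on_stop(set_owner):
    """Decide whether the stop script may remove the feature label.

    Only locally owned features drop the label; kube-owned features keep
    it so the device does not leave and re-join the daemonset.
    """
    return set_owner == "local"


decisions = {owner: should_drop_label_on_stop(owner) for owner in ("local", "kube")}
```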

How to verify it
The label feature_enabled should always be true if the feature's set_owner is kube.
2023-07-14 09:15:20 -07:00
Ying Xie
bf49154493
Potential fix for Celestica E1031 device hang (#15822)
set CPU max_cstate to 0

Co-authored-by: Sumukha Tumkur Vani <sumukhatv@outlook.com>
2023-07-14 08:38:45 -07:00
Saikrishna Arcot
c991c5f16e
Upgrade scapy in the PTF's python3 virtualenv to 2.5.0 (#15573)
This is primarily to fix a bug in scapy hitting an error when trying to
listen on multiple interfaces in a single `sniff` call. This also
upgrades it to the current latest version.

Signed-off-by: Saikrishna Arcot <sarcot@microsoft.com>
2023-07-14 08:36:30 -07:00
mssonicbld
23a0a87874
[submodule] Update submodule sonic-utilities to the latest HEAD automatically (#15812)
#### Why I did it
src/sonic-utilities
```
* 51c7a43c - (HEAD -> master, origin/master, origin/HEAD) [show][muxcable] update `show mux config` to print out `soc_ipv6` as well  (#2909) (6 hours ago) [Jing Zhang]
* fd497755 - [route_check][dualtor] Ignore vlan neighbor route miss (#2888) (18 hours ago) [Longxiang Lyu]
* 81c0ed4e - [show][muxcable] update `show mux tunnel-route` to check soc_ipv6 as well (33 hours ago) [Jing Zhang]
* 1ee73668 - [db_migrator] Migrate DNS configuratuion (#2893) (2 days ago) [ganglv]
* 553a3432 - [dualtor][route_check] filter out `soc_ipv6`  (#2899) (2 days ago) [Jing Zhang]
```
#### How I did it
#### How to verify it
#### Description for the changelog
2023-07-14 16:36:32 +08:00
Liping Xu
95d11976bd
update rsyslog log size conf (#15821)
Why I did it
On some devices the log folder is larger than 200M, for example 256M. In that case LOG_FILE_ROTATE_SIZE_KB is set to 16M, and
THRESHOLD_KB=$((USABLE_SPACE_KB - (NUM_LOGS_TO_ROTATE * LOG_FILE_ROTATE_SIZE_KB * 2)))
= $(( ((VAR_LOG_SIZE_KB * 90 / 100) - RESERVED_SPACE_KB) - (NUM_LOGS_TO_ROTATE * LOG_FILE_ROTATE_SIZE_KB * 2) ))
= $(( ((256M * 90 / 100) - 4096) - (8 * 16M * 2) ))
which is a negative value.
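
The arithmetic can be worked through in kilobytes. The sketch below reuses the constants that appear in the expression above (90 percent usable space, 4096 KB reserved, 8 rotated logs) and the 2M rotate size this commit introduces; the function name is illustrative, and integer division stands in for shell arithmetic.

```python
def threshold_kb(var_log_size_kb, rotate_size_kb, num_logs=8, reserved_kb=4096):
    """Mirror the THRESHOLD_KB expression using integer arithmetic."""
    usable_kb = var_log_size_kb * 90 // 100 - reserved_kb
    return usable_kb - num_logs * rotate_size_kb * 2


VAR_LOG_KB = 256 * 1024  # a 256M log folder, expressed in KB

with_16m = threshold_kb(VAR_LOG_KB, 16 * 1024)  # -30311, negative
with_2m = threshold_kb(VAR_LOG_KB, 2 * 1024)    # 199065, positive
```

With a 16M rotate size the eight rotated logs alone would need 8 * 16M * 2 = 256M, which exceeds the usable space; with a 2M rotate size the threshold stays positive.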

Work item tracking
Microsoft ADO (number only):
24524827
How I did it
Add a case for 400M: if the log folder size is between 200M and 400M, set the log file size to 2M.

How to verify it
Run "sudo logrotate -f /etc/logrotate.conf" on a DUT whose var/log folder size is 256M, and check the syslog.
2023-07-14 15:44:17 +08:00
lixiaoyuner
1bf2a613d5
[ctrmgr]: Container image clean up bug fix (#15772)
Why I did it
The current container image cleanup code has two bugs that need to be fixed. Some variable names are also confusing, so they are renamed.

Work item tracking
Microsoft ADO (number only): 24502294

How I did it
We clean up after tagging latest successfully. Currently the tag latest function only returns 0 or 1: 0 means success and 1 means failure. When we get 1 we retry, and when we get 0 we clean up. However, code 0 also covers a case in which no cleanup is needed: when we tag latest, the container image we want to tag may not be running, so we cannot tag latest and do not need to clean up. This case is now separated from 0 and returns -1.

When local mode (v1) -> kube mode (v2) happens, one problem is how to handle the local image. There are two cases. In the first case, a kube v1 container was dry-run (because we don't replace the local image if kube version = local version); we remove the kube v1 image, tag the local version with the ACR prefix, and remove the local v1 local tag. In the second case, no kube v1 container was dry-run; we remove the local v1 image directly, because the local v1 image should not be the last desired version.

The docker_id variable is confusing because it actually holds the docker image ID, so it is renamed. The two dicts and the list are also renamed to be more readable.
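
The three return codes described above map to three follow-up actions. A minimal sketch, in which the constant names, the function name, and the action strings are hypothetical and not the ctrmgr code:

```python
TAG_SUCCEEDED = 0     # latest was tagged: clean up old images
TAG_FAILED = 1        # tagging failed: retry later
TAG_NOT_RUNNING = -1  # image to tag is not running: nothing to tag or clean


def next_action(tag_result):
    """Map the tag-latest return code to the follow-up action."""
    if tag_result == TAG_SUCCEEDED:
        return "cleanup"
    if tag_result == TAG_FAILED:
        return "retry"
    if tag_result == TAG_NOT_RUNNING:
        return "skip"
    raise ValueError(f"unexpected tag result: {tag_result}")


actions = [next_action(code) for code in (0, 1, -1)]
```

Separating -1 from 0 is what prevents a cleanup from running when nothing was actually tagged.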

How to verify it
Check tag latest and image clean up result.
2023-07-13 22:44:24 -07:00
lixiaoyuner
df13380d70
[k8s]: Bypass the systemd service restart limit and do immediately restart when change to local mode (#15432)
Why I did it
During an upgrade via k8s, the feature's systemd service restarts as well. Every feature systemd service has a restart limit, and the limit is too small: only three times. If a fallback happens during the upgrade, the start count will already be 2, so one more restart will bring the systemd service down. We therefore need to bypass this limit. The restart function is called on local -> kube, kube -> kube and kube -> local transitions; each time it is called the restart must succeed, so we run reset-failed every time we restart.
When falling back to local mode, we restart the systemd service immediately, without waiting for the default restart interval, to reduce container downtime.

Work item tracking
Microsoft ADO (number only):
24172368

How I did it
Before every restart for an upgrade, reset the feature's restart count to 0 to bypass the restart limit.
When falling back to local mode, restart the systemd service immediately.
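
A rough sketch of the sequence, assuming it is driven through `systemctl reset-failed` followed by `systemctl restart`. The function names and the service name in the example are illustrative and not taken from the ctrmgr code.

```python
import subprocess


def restart_commands(service):
    """Commands that clear the start counter, then restart the service."""
    return [
        ["systemctl", "reset-failed", service],
        ["systemctl", "restart", service],
    ]


def restart_feature(service, run=subprocess.run):
    """Run the commands in order, raising if either one fails."""
    for command in restart_commands(service):
        run(command, check=True)


commands = restart_commands("snmp.service")  # example service name
```

Running reset-failed first clears the accumulated start count, so the subsequent restart is not rejected by the start limit.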

How to verify it
The feature's systemd service can always be restarted successfully during the upgrade process via k8s.
2023-07-13 22:42:17 -07:00
Mai Bui
d549787408
limit privileged flag for bgp container (#14932)
Why I did it
HLD implementation: Container Hardening (sonic-net/SONiC#1364)

Work item tracking
Microsoft ADO (number only): 14807420
How I did it
Reduce the Linux capabilities granted by the privileged flag, retaining the NET_ADMIN and SYS_ADMIN capabilities.

How to verify it
Install new image to DUT, verify bgp container is up
Run bgp sonic-mgmt kvmtest
2023-07-14 09:08:43 +08:00
xumia
30959ec901
[Build] Change the build option from ENABLE_FIPS_FEATURE to INCLUDE_FIPS (#15758)
Why I did it
[Build] Change the build option from ENABLE_FIPS_FEATURE to INCLUDE_FIPS

Work item tracking
Microsoft ADO (number only): 24485797
How I did it
2023-07-13 23:00:38 +08:00