Why I did it
Our k8s feature will pull new version container images for each upgrade, the container images inside sonic will be more and more, but for now we don’t have a way to clean up the old version container images, the disk may be filled up. Need to add cleaning up the old version container images logic.
Work item tracking
Microsoft ADO (number only):
17979809
How I did it
Remove the old version container images besides the feature's current version and last version image, last version image is saved for supporting fallback.
How to verify it
Check whether the old version images are removed
Why I did it
We found a bug when pilot, the tag function doesn't remove the ACR domain when do tag, it makes the latest tag not work. And in the original tag function, it calls os.system and os.popen which are not recommend, need to refactor.
How I did it
Do a split("/") when get image_rep to fix the acr domain bug
Refactor the tag function code and add test cases
How to verify it
Check whether container images are tagged as latest when in kube mode.
* Fix kube mode to local mode long duration issue
* Remove IPV6 parameters which is not necessary
* Fix read node labels bug
* Tag the running image to latest if it's stable
* Disable image_version_higher check
* Change image_version_higher checker test case
Signed-off-by: Yun Li <yunli1@microsoft.com>
Why I did it
The SONiC switches get their docker images from local repo, populated during install with container images pre-built into SONiC FW. With the introduction of kubernetes, new docker images available in remote repo could be deployed. This requires dockerd to be able to pull images from remote repo.
Depending on the Switch network domain & config, it may or may not be able to reach the remote repo. In the case where remote repo is unreachable, we could potentially make Kubernetes server to also act as http-proxy.
How I did it
When admin explicitly enables, the kubernetes-server could be configured as docker-proxy. But any update to docker-proxy has to be via service-conf file environment variable, implying a "service restart docker" is required. But restart of dockerd is vey expensive, as it would restarts all dockers, including database docker.
To avoid dockerd restart, pre-configure an http_proxy using an unused IP. When k8s server is enabled to act as http-proxy, an IP table entry would be created to direct all traffic to the configured-unused-proxy-ip to the kubernetes-master IP. This way any update to Kubernetes master config would be just manipulating IPTables, which will be transparent to all modules, until dockerd needs to download from remote repo.
How to verify it
Configure a switch such that image repo is unreachable
Pre-configure dockerd with http_proxy.conf using an unused IP (e.g. 172.16.1.1)
Update ctrmgrd.service to invoke ctrmgrd.py with "-p" option.
Configure a k8s server, and deploy an image for feature with set_owner="kube"
Check if switch could successfully download the image or not.
* First cut image update for kubernetes support.
With this,
1) dockers dhcp_relay, lldp, pmon, radv, snmp, telemetry are enabled
for kube management
init_cfg.json configure set_owner as kube for these
2) Each docker's start.sh updated to call container_startup.py to register going up
As part of this call, it registers the current owner as local/kube and its version
The images are built with its version ingrained into image during build
3) Update all docker's bash script to call 'container start/stop/wait' instead of 'docker start/stop/wait'.
For all locally managed containers, it calls docker commands, hence no change for locally managed.
4) Introduced a new ctrmgrd service, that helps with transition between owners as kube & local and carry over any labels update from STATE-DB to API server
5) hostcfgd updated to handle owner change
6) Reboot scripts are updatd to tag kube running images as local, so upon reboot they run the same image.
7) Added kube_commands.py to handle all updates with Kubernetes API serrver -- dedicated for k8s interaction only.