sonic-buildimage

Author	SHA1	Message	Date
Sudharsan Dhamal Gopalarathnam	11ed28c857	[ctnmgd]: Fixing netaddr build issue (#16668 ) Fixing the following build issue [2023-09-20T04:42:00.004Z] [ FAIL LOG START ] [ target/python-wheels/bullseye/sonic_bgpcfgd-1.0-py3-none-any.whl ] [2023-09-20T04:42:00.004Z] Build start time: Wed Sep 20 04:41:54 UTC 2023 [2023-09-20T04:42:00.004Z] [ REASON ] : target/python-wheels/bullseye/sonic_bgpcfgd-1.0-py3-none-any.whl does not exist NON-EXISTENT PREREQUISITES: target/python-wheels/bullseye/sonic_config_engine-1.0-py3-none-any.whl-install target/python-wheels/bullseye/sonic_yang_mgmt-1.0-py3-none-any.whl-install target/python-wheels/bullseye/sonic_yang_models-1.0-py3-none-any.whl-install target/debs/bullseye/libyang_1.0.73_amd64.deb-install target/debs/bullseye/libyang-cpp_1.0.73_amd64.deb-install target/debs/bullseye/python3-yang_1.0.73_amd64.deb-install target/debs/bullseye/python3-swsscommon_1.0.0_amd64.deb-install [2023-09-20T04:42:00.004Z] [ FLAGS FILE ] : [] [2023-09-20T04:42:00.005Z] [ FLAGS DEPENDS ] : [mellanox amd64 bullseye] [2023-09-20T04:42:00.005Z] [ FLAGS DIFF ] : [mellanox amd64 bullseye ] [2023-09-20T04:42:00.005Z] /sonic/src/sonic-bgpcfgd /sonic [2023-09-20T04:42:00.005Z] running pytest [2023-09-20T04:42:00.005Z] Searching for netaddr==0.8.0 [2023-09-20T04:42:00.005Z] Best match: netaddr 0.8.0 [2023-09-20T04:42:00.005Z] [2023-09-20T04:42:00.005Z] Using /var/sw-r2d2-bot/.local/lib/python3.9/site-packages [2023-09-20T04:42:00.005Z] running egg_info [2023-09-20T04:42:00.005Z] writing sonic_bgpcfgd.egg-info/PKG-INFO [2023-09-20T04:42:00.005Z] writing dependency_links to sonic_bgpcfgd.egg-info/dependency_links.txt [2023-09-20T04:42:00.005Z] writing entry points to sonic_bgpcfgd.egg-info/entry_points.txt [2023-09-20T04:42:00.005Z] writing requirements to sonic_bgpcfgd.egg-info/requires.txt [2023-09-20T04:42:00.005Z] writing top-level names to sonic_bgpcfgd.egg-info/top_level.txt [2023-09-20T04:42:00.005Z] reading manifest file 
'sonic_bgpcfgd.egg-info/SOURCES.txt' [2023-09-20T04:42:00.005Z] writing manifest file 'sonic_bgpcfgd.egg-info/SOURCES.txt' [2023-09-20T04:42:00.005Z] running build_ext [2023-09-20T04:42:00.005Z] Traceback (most recent call last): [2023-09-20T04:42:00.005Z] File "/sonic/src/sonic-bgpcfgd/setup.py", line 3, in <module> [2023-09-20T04:42:00.005Z] setuptools.setup( [2023-09-20T04:42:00.005Z] File "/usr/local/lib/python3.9/dist-packages/setuptools/__init__.py", line 163, in setup [2023-09-20T04:42:00.005Z] return distutils.core.setup(attrs) [2023-09-20T04:42:00.005Z] File "/usr/lib/python3.9/distutils/core.py", line 148, in setup [2023-09-20T04:42:00.005Z] dist.run_commands() [2023-09-20T04:42:00.006Z] File "/usr/lib/python3.9/distutils/dist.py", line 966, in run_commands [2023-09-20T04:42:00.006Z] self.run_command(cmd) [2023-09-20T04:42:00.006Z] File "/usr/lib/python3.9/distutils/dist.py", line 985, in run_command [2023-09-20T04:42:00.006Z] cmd_obj.run() [2023-09-20T04:42:00.006Z] File "/usr/local/lib/python3.9/dist-packages/ptr.py", line 208, in run [2023-09-20T04:42:00.006Z] with self.project_on_sys_path(): [2023-09-20T04:42:00.006Z] File "/usr/lib/python3.9/contextlib.py", line 117, in __enter__ [2023-09-20T04:42:00.006Z] return next(self.gen) [2023-09-20T04:42:00.006Z] File "/usr/local/lib/python3.9/dist-packages/setuptools/command/test.py", line 168, in project_on_sys_path [2023-09-20T04:42:00.006Z] require('%s==%s' % (ei_cmd.egg_name, ei_cmd.egg_version)) [2023-09-20T04:42:00.006Z] File "/usr/local/lib/python3.9/dist-packages/pkg_resources/__init__.py", line 899, in require [2023-09-20T04:42:00.006Z] needed = self.resolve(parse_requirements(requirements)) [2023-09-20T04:42:00.006Z] File "/usr/local/lib/python3.9/dist-packages/pkg_resources/__init__.py", line 790, in resolve [2023-09-20T04:42:00.006Z] raise VersionConflict(dist, req).with_context(dependent_req) [2023-09-20T04:42:00.006Z] pkg_resources.ContextualVersionConflict: (netaddr 0.9.0 
(/var/sw-r2d2-bot/.local/lib/python3.9/site-packages), Requirement.parse('netaddr==0.8.0'), {'sonic-bgpcfgd'}) [2023-09-20T04:42:00.007Z] [ FAIL LOG END ] [ target/python-wheels/bullseye/sonic_bgpcfgd-1.0-py3-none-any.whl ] [2023-09-20T04:42:00.007Z] make: * [slave.mk:881: target/python-wheels/bullseye/sonic_bgpcfgd-1.0-py3-none-any.whl] Error 1 [2023-09-20T04:42:00.007Z] make: *** Waiting for unfinished jobs....	2023-09-26 14:34:16 +08:00
mssonicbld	18b446bfe0	[ctgmgr]: do not remove label when do systemd service stop when service is in kube mode (#15642) (#15878)	2023-07-19 20:10:41 +08:00
mssonicbld	f4a7e22e4e	[k8s]: Bypass the systemd service restart limit and do immediately restart when change to local mode (#15432) (#15868)	2023-07-19 20:04:23 +08:00
mssonicbld	38e721bc24	[ctrmgr]: Container image clean up bug fix (#15772) (#15870)	2023-07-19 20:02:45 +08:00
mssonicbld	74598e568a	Add health check probe for k8s upgrade containers. (#15223) (#15867) #### Why I did it After k8s upgrades a container, it only knows that the container is running; it does not know the status of the service inside the container. So we need a probe inside the container that k8s can call to check whether the container is really ready. ##### Work item tracking - Microsoft ADO (number only): 22453004 #### How I did it Add a health check probe inside the config engine container. The probe checks whether the start service, if it exists, exited normally, and calls a Python script, if one is present, to do container-specific self-checks. The Python script should be implemented by the feature owner if it is needed. More details: [design doc](https://github.com/sonic-net/SONiC/blob/master/doc/kubernetes/health-check.md) #### How to verify it Check path /usr/bin/readiness_probe.sh inside container. #### Which release branch to backport (provide reason below if selected) - [ ] 201811 - [ ] 201911 - [ ] 202006 - [ ] 202012 - [ ] 202106 - [ ] 202111 - [x] 202205 - [x] 202211 #### Tested branch (Please provide the tested image version) - [x] 20220531.28 Co-authored-by: lixiaoyuner <35456895+lixiaoyuner@users.noreply.github.com>	2023-07-19 16:11:13 +08:00
lixiaoyuner	c59f55f6a3	Move k8s script to docker-config-engine (#14788) (#15768) Why I did it To reduce the container's dependency on the host system Work item tracking Microsoft ADO (number only): 17713469 How I did it Move the k8s container startup script into the config engine container, rather than mounting it from the host. How to verify it Check file path (/usr/share/sonic/scripts/container_startup.py) inside config engine container. Signed-off-by: Yun Li <yunli1@microsoft.com> Co-authored-by: Qi Luo <qiluo-msft@users.noreply.github.com>	2023-07-17 23:21:01 +08:00
Guilt	a73d443c1d	[CI][doc][build] Trim src folder files trailing blanks (#15162) - Run pre-commit tox profile to trim all trailing blanks - Use several commits with a per-folder based strategy to ease their merge Issue #15114 Signed-off-by: Guillaume Lambert <guillaume.lambert@orange.com>	2023-05-24 10:01:43 -07:00
lixiaoyuner	6dffa55e9c	Clean up the old version container images (#14978) Why I did it Our k8s feature pulls a new version of the container image on each upgrade, so container images accumulate inside SONiC. Currently there is no way to clean up old image versions, and the disk may fill up. We need to add logic to clean up old container image versions. Work item tracking Microsoft ADO (number only): 17979809 How I did it Remove all old container image versions except the feature's current version and last version; the last version is kept to support fallback. How to verify it Check whether the old version images are removed	2023-05-18 10:37:34 -07:00
lixiaoyuner	f51e5bba1f	Refactor the logic of tagging kube container as local latest (#14367) Why I did it We found a bug during the pilot: the tag function doesn't remove the ACR domain when tagging, so the latest tag does not work. Also, the original tag function calls os.system and os.popen, which are not recommended, so it needs refactoring. How I did it Do a split("/") when getting image_rep to fix the ACR domain bug Refactor the tag function code and add test cases How to verify it Check whether container images are tagged as latest when in kube mode.	2023-03-30 11:41:02 -07:00
lixiaoyuner	bc7b35473e	Add k8s support feature set and Add platform label for scheduler usage (#12997) Why I did it We plan to pilot the k8s feature and need to fix several bugs first, including enabling the telemetry feature and adding a platform label. How I did it Add support feature set, only enable telemetry container upgrade for now Add platform label for scheduler usage Remove CNI installation code, as it is installed automatically when kubeadm is installed How to verify it After the SONiC device joins the k8s cluster, show node labels to check if the platform label is visible. Signed-off-by: Yun Li <yunli1@microsoft.com>	2023-01-10 07:56:44 -08:00
lixiaoyuner	c3a51b2d0d	Fix code irregular issues (#12595) * Fix code irregular issues Signed-off-by: Yun Li <yunli1@microsoft.com>	2022-11-07 13:06:19 +08:00
lixiaoyuner	e1440f0044	Improve feature mode switch process (#12188) * Fix kube mode to local mode long duration issue * Remove IPv6 parameters which are not necessary * Fix read node labels bug * Tag the running image to latest if it's stable * Disable image_version_higher check * Change image_version_higher checker test case Signed-off-by: Yun Li <yunli1@microsoft.com>	2022-11-02 17:24:32 +08:00
Mai Bui	0fcd219c3b	[sonic-ctrmgrd] Replace os.system and remove subprocess with shell=True (#12534) Signed-off-by: maipbui <maibui@microsoft.com> #### Why I did it `subprocess.Popen()` and `subprocess.run()` are used with `shell=True`, which is vulnerable to shell injection. `os` - not secure against maliciously constructed input and dangerous if used to evaluate dynamic content #### How I did it Replace `os` with `subprocess`, remove `shell=True` #### How to verify it Passed UT Tested in DUT	2022-10-31 11:12:03 -04:00
lixiaoyuner	a1b50cac41	Make client identity by AME cert (#11946) * Make client identity by AME cert * Join k8s cluster by ipv6 * Change join test cases * Test case bug fix * Improve read node label func * Configure kubelet and change test cases * For kubernetes version 1.22.2 * Fix undefine issue Signed-off-by: Yun Li <yunli1@microsoft.com>	2022-09-16 13:13:39 +08:00
Sudharsan Dhamal Gopalarathnam	14de0a1548	[containerd]Fixing container commands when mode is local and state is disabled (#9986) Why I did it During warm-reboot and fast-reboot, the following error log appears: Feb 3 22:05:15.187408 r-lionfish-13 ERR container: docker cmd: kill for nat failed with 404 Client Error for http+docker://localhost/v1.41/containers/nat/json: Not Found ("No such container: nat") The container command, when called for local mode, doesn't check whether the feature is enabled before calling docker kill, which throws the above error. `b6ca76b482/scripts/fast-reboot (L699)` How I did it Check the feature state in local mode and return an error exit code along with a valid debug message. How to verify it Manually tested with warm-reboot and fast-reboot Added UT to verify it.	2022-03-02 19:08:06 -08:00
lguohan	75afb13ad3	[k8s]: disable http_proxy for docker by default (#8328) disable http_proxy for docker by default. by default, we should not use proxy. Signed-off-by: Guohan Lu <lguohan@gmail.com>	2021-08-04 00:30:43 -07:00
Renuka Manavalan	c5dff0c640	Revert "Revert "[Kubernetes]: The kube server could be used as http-proxy for docker (#7469)" (#8023)" (#8158) This reverts commit `7236fa98e8`. Restore original PR #7469	2021-07-15 19:48:55 -07:00
Ying Xie	7236fa98e8	Revert "[Kubernetes]: The kube server could be used as http-proxy for docker (#7469)" (#8023) This change causes the nightly test to fail because the fake proxy IP is not reachable. Reverts #7469 This reverts commit `f7ed82f44a`.	2021-06-29 18:43:53 -07:00
Renuka Manavalan	f7ed82f44a	[Kubernetes]: The kube server could be used as http-proxy for docker (#7469) Why I did it The SONiC switches get their docker images from local repo, populated during install with container images pre-built into SONiC FW. With the introduction of kubernetes, new docker images available in remote repo could be deployed. This requires dockerd to be able to pull images from remote repo. Depending on the Switch network domain & config, it may or may not be able to reach the remote repo. In the case where remote repo is unreachable, we could potentially make Kubernetes server to also act as http-proxy. How I did it When admin explicitly enables, the kubernetes-server could be configured as docker-proxy. But any update to docker-proxy has to be via service-conf file environment variable, implying a "service restart docker" is required. But a restart of dockerd is very expensive, as it would restart all dockers, including database docker. To avoid dockerd restart, pre-configure an http_proxy using an unused IP. When k8s server is enabled to act as http-proxy, an IP table entry would be created to direct all traffic to the configured-unused-proxy-ip to the kubernetes-master IP. This way any update to Kubernetes master config would be just manipulating IPTables, which will be transparent to all modules, until dockerd needs to download from remote repo. How to verify it Configure a switch such that image repo is unreachable Pre-configure dockerd with http_proxy.conf using an unused IP (e.g. 172.16.1.1) Update ctrmgrd.service to invoke ctrmgrd.py with "-p" option. Configure a k8s server, and deploy an image for feature with set_owner="kube" Check if switch could successfully download the image or not.	2021-06-16 07:46:01 -07:00
Renuka Manavalan	1b33ebc9cd	K8S handles hostname in lower case (#7694) Why I did it k8s handles hostnames in lower case, so the code ensures that it uses the hostname in all lower case How I did it Wrapper for device_info.get_hostname that returns the hostname in lower case. This wrapper is used in all places that require the hostname for use in kubectl commands. How to verify it Device joins successfully.	2021-05-26 09:17:48 -07:00
Renuka Manavalan	678bbc6ba3	Kubernetes server configurable using URL 1) Dropped non-required IP update in admin.conf, as all masters use VIP only (#7288) 2) Don't clear VERSION during stop, as it would overwrite new version pending to go. 3) subprocess, get return value from proc and do not imply with presence of data in stderr.	2021-04-16 13:55:36 -07:00
Joe LeVeque	ee1383791c	[sonic-py-common] Add 'general' module with load_module_from_source() function (#7167) #### Why I did it To eliminate the need to write duplicate code in order to import a Python module from a source file. #### How I did it Add `general` module to sonic-py-common, which contains a `load_module_from_source()` function which supports both Python 2 and 3. Call this new function in: - sonic-ctrmgrd/tests/container_test.py - sonic-ctrmgrd/tests/ctrmgr_tools_test.py - sonic-host-services/tests/determine-reboot-cause_test.py - sonic-host-services/tests/hostcfgd/hostcfgd_test.py - sonic-host-services/tests/procdockerstatsd_test.py - sonic-py-common/sonic_py_common/daemon_base.py	2021-04-08 08:29:28 -07:00
Renuka Manavalan	6f7cd8d772	Copy dummy flannel.conf to get around absence of CNI Network (#6985) Why I did it We skip install of CNI plugin, as we don't need it. But this leaves the node in "not ready" state upon joining master. To fix, we copy this dummy .conf file to /etc/cni/net.d How I did it Keep this file in /usr/share/sonic/templates and copy to /etc/cni/net.d upon joining k8s master. How to verify it Upon configuring master-IP and enable join, watch node join and move to ready state. You may verify using kubectl get nodes command	2021-03-09 19:49:54 -08:00
Joe LeVeque	980a024dd4	Fix Python 3 'importlib' bug; Add support for Python 2 back in sonic-py-common (#6933) Fix a strange bug introduced by https://github.com/Azure/sonic-buildimage/pull/6832 which would only occur in environments with both Python 2 and Python 3 installed (e.g., the PMon container). Error messages such as the following would be seen: ``` ERR pmon#ledd[29]: Failed to load ledutil: module 'importlib' has no attribute 'machinery' ``` This is very odd, and it seems like the Python 2 version of importlib, which is basically just a stub, is taking precedence over the Python 3 version. I found that this occurs when calling `import importlib`. However, calling `import importlib.machinery` and `import importlib.util` causes the proper package to be referenced, and the `machinery` and `util` modules are loaded successfully. This is how it is specified in examples in the official documentation, however there is nothing mentioned regarding that it should be done this way or that `import importlib` is unreliable. Also, since sonic-py-common is still used in environments with Python 2 installed we should maintain support for both Python 2 and 3 until we completely deprecate Python 2, so I have added this back in.	2021-03-02 18:31:19 -08:00
judyjoseph	46b3bd5503	[teamd]: Increase wait timeout for teamd docker stop to clean Port channels. (#6537) The Portchannels were not getting cleaned up as the cleanup activity was taking more than 10 secs, which is the default docker timeout after which a SIGKILL will be sent. Fixes #6199 To check if it works out for this issue in 201911 ? #6503 This issue is significantly seen in master branch compared to 201911 because the Portchannel cleanup takes more time in master. Test on a DUT with 8 Port Channels. master admin@str-s6000-acs-8:~$ time sudo systemctl stop teamd real 0m15.599s user 0m0.061s sys 0m0.038s Sonic 201911.v58 admin@str-s6000-acs-8:~$ time sudo systemctl stop teamd real 0m5.541s user 0m0.020s sys 0m0.028s	2021-01-23 20:57:52 -08:00
Renuka Manavalan	ba02209141	First cut image update for kubernetes support. (#5421) * First cut image update for kubernetes support. With this, 1) dockers dhcp_relay, lldp, pmon, radv, snmp, telemetry are enabled for kube management init_cfg.json configure set_owner as kube for these 2) Each docker's start.sh updated to call container_startup.py to register going up As part of this call, it registers the current owner as local/kube and its version The images are built with its version ingrained into image during build 3) Update all docker's bash script to call 'container start/stop/wait' instead of 'docker start/stop/wait'. For all locally managed containers, it calls docker commands, hence no change for locally managed. 4) Introduced a new ctrmgrd service, that helps with transition between owners as kube & local and carry over any labels update from STATE-DB to API server 5) hostcfgd updated to handle owner change 6) Reboot scripts are updated to tag kube running images as local, so upon reboot they run the same image. 7) Added kube_commands.py to handle all updates with Kubernetes API server -- dedicated for k8s interaction only.	2020-12-22 08:01:33 -08:00

26 Commits
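
Commit 0fcd219c3b replaces `os.system` and `shell=True` with plain `subprocess` calls, and commit 678bbc6ba3 (item 3) judges success by the process return value rather than by the presence of data on stderr. The following is a minimal sketch of both ideas, not the code from those commits; the helper name `run_command` and the demo commands are assumptions made for illustration.

```python
import subprocess
import sys


def run_command(args):
    """Run a command given as an argument list, without a shell.

    Hypothetical helper: returns (returncode, stdout, stderr).
    """
    proc = subprocess.run(args, capture_output=True, text=True, check=False)
    return proc.returncode, proc.stdout, proc.stderr


# A value that would be dangerous if interpolated into a shell command string.
untrusted = "hello; echo injected"

# The value is passed as a single argv element, so ';' is never parsed by a shell.
rc, out, err = run_command(
    [sys.executable, "-c", "import sys; print(sys.argv[1])", untrusted]
)

# A command may write to stderr and still succeed: decide on the return code.
warn_rc, warn_out, warn_err = run_command(
    [sys.executable, "-c", "import sys; sys.stderr.write('warning only\\n')"]
)
succeeded = warn_rc == 0
```

Because no shell is involved, the untrusted string reaches the child process unchanged, and the second call counts as successful even though stderr is not empty.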
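
Commits ee1383791c and 980a024dd4 describe a `load_module_from_source()` helper and the need to write `import importlib.machinery` and `import importlib.util` explicitly instead of a bare `import importlib`. The sketch below covers the Python 3 path only and is an illustration under that assumption, not the sonic-py-common implementation; the demo file name and its contents are made up.

```python
# Import the submodules explicitly, as described in commit 980a024dd4.
import importlib.machinery
import importlib.util
import os
import sys
import tempfile


def load_module_from_source(module_name, file_path):
    """Load a Python module from a source file, even one without a .py suffix."""
    loader = importlib.machinery.SourceFileLoader(module_name, file_path)
    spec = importlib.util.spec_from_loader(loader.name, loader)
    module = importlib.util.module_from_spec(spec)
    loader.exec_module(module)
    # Register the module so later imports by name find the same object.
    sys.modules[module_name] = module
    return module


# Demo: load an extension-less script, as a test would do for an executable script.
with tempfile.TemporaryDirectory() as tmp:
    script = os.path.join(tmp, "demo_script")
    with open(script, "w") as f:
        f.write("def greet(name):\n    return 'hello ' + name\n")
    demo = load_module_from_source("demo_script", script)
    greeting = demo.greet("sonic")
```

Using `SourceFileLoader` directly is what lets the helper load scripts that have no `.py` extension, which a normal `import` statement cannot do.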
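
Commit f51e5bba1f fixes tagging by splitting the image repository on "/" to drop the ACR domain, and commit 6dffa55e9c removes old image versions except the current and the last one. A rough sketch of those two rules as pure functions follows; the function names, the registry name, and the version strings are hypothetical and do not come from the repository.

```python
def strip_registry_domain(image_repo):
    """Drop everything before the last '/', e.g. a registry domain prefix."""
    return image_repo.split("/")[-1]


def images_to_remove(image_tags, current_tag, last_tag):
    """Return the tags to delete, keeping the current and last versions.

    The last version is kept so that a fallback remains possible.
    """
    keep = {current_tag, last_tag}
    return [tag for tag in image_tags if tag not in keep]


# Hypothetical registry and feature names, for illustration only.
local_name = strip_registry_domain("example.azurecr.io/docker-sonic-telemetry")

# Hypothetical version tags present on the device.
stale = images_to_remove(
    ["v1", "v2", "v3", "v4"], current_tag="v4", last_tag="v3"
)
```

Keeping the selection logic separate from the docker calls makes it straightforward to unit test without a container runtime.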