sonic-buildimage

Author	SHA1	Message	Date
xumia	a6644b2b99	[Build] Upgrade the python docker version (#15031 ) #### Why I did it [Build] Upgrade the python docker version to fix bgp not up issue ##### Work item tracking - Microsoft ADO (number only): 22236397	2023-05-12 11:37:00 -07:00
Neetha John	6c7e24381e	[storage_backend] Add backend acl service (#14229 ) Why I did it This PR addresses the issue mentioned above by loading the acl config as a service on a storage backend device How I did it The new acl service is a oneshot service which will start after swss and does some retries to ensure that the SWITCH_CAPABILITY info is present before attempting to load the acl rules. The service is also bound to sonic targets which ensures that it gets restarted during minigraph reload and config reload How to verify it Build an image with the following changes and did the following tests Verified that acl is loaded successfully on a storage backend device after a switch boot up Verified that acl is loaded successfully on a storage backend ToR after minigraph load and config reload Verified that acl is not loaded if the device is not a storage backend ToR or the device does not have a DATAACL table Signed-off-by: Neetha John <nejo@microsoft.com>	2023-03-20 20:25:21 +00:00
Stepan Blyshchak	73c7ced753	[202012][Mellanox] Place FW binaries under platform directory instead of squashfs (#13890 ) Upgrade from old image always requires squashfs mount to get the next image FW binary. This can be avoided if we put FW binary under platform directory which is easily accessible after installation: admin@r-spider-05:~$ ls /host/image-fw-new-loc.0-dirty-20230208.193534/platform/fw-SPC.mfa /host/image-fw-new-loc.0-dirty-20230208.193534/platform/fw-SPC.mfa admin@r-spider-05:~$ ls -al /tmp/image-fw-new-loc.0-dirty-20230208.193534-fs/etc/mlnx/fw-SPC.mfa lrwxrwxrwx 1 root root 66 Feb 8 17:57 /tmp/image-fw-new-loc.0-dirty-20230208.193534-fs/etc/mlnx/fw-SPC.mfa -> /host/image-fw-new-loc.0-dirty-20230208.193534/platform/fw-SPC.mfa - Why I did it 202211 and above uses different squashfs compression type that 201911 kernel can not handle. Therefore, we avoid mounting squashfs altogether with this change. - How I did it Place FW binary under /host/image-/platform/mlnx/, soft links in /etc/mlnx are created to avoid breaking existing scripts/automation. /etc/mlnx/fw-SPCX.mfa is a soft link always pointing to the FW that should be used in current image mlnx-fw-upgrade.sh is updated to prefer /host/image-/platform/mlnx location and fallback to /etc/mlnx in squashfs in case new location does not exist. This is necessary to do image downgrade. - How to verify it Upgrade from 201911 to 202012 202012 to 201911 downgrade 202012 -> 202012 reboot ONIE -> 202012 boot (First FW burn) Signed-off-by: Stepan Blyschak <stepanb@nvidia.com>	2023-02-22 17:38:54 +02:00
Nazarii Hnydyn	83b6518ae2	[202012][mellanox]: Add BIOS upgrade infra (#13571 ) - Why I did it Added BIOS upgrade infra - How I did it Added new make target - How to verify it Copy msn3800_bios.tar.gz to platform/mellanox/bios make configure PLATFORM=mellanox make target/files/buster/msn3800_bios.tar.gz Signed-off-by: Nazarii Hnydyn <nazariig@nvidia.com>	2023-02-02 10:07:03 +02:00
Lorne Long	3402094fd0	[Build] Use apt-get to predictably support dependency ordered configuration of lazy packages (#12164 ) Why I did it The current lazy installer relies on a filename sort for both unpack and configuration steps. When systemd services are configured [started] by multiple packages the order is by filename not by the declared package dependencies. This can cause the start order of services to differ between first-boot and subsequent boots. Declared systemd service dependencies further exacerbate the issue (e.g. blocking the first-boot script). The current installer leaves packages un-configured if the package dependency order does not match the filename order. This also fixes a trivial bug in [Build]: Support to use symbol links for lazy installation targets to reduce the image size #10923 where externally downloaded dependencies are duplicated across lazy package device directories. How I did it Changed the staging and first-boot scripts to use apt-get: dpkg -i /host/image-$SONIC_VERSION/platform/$platform/.deb becomes apt-get -y install /host/image-$SONIC_VERSION/platform/$platform/.deb when dependencies are detected during image staging. How to verify it Apt-get critical rules Add a Depends= to the control information of a package. Grep the syslog for rc.local between images and observe the configuration order of packages change.	2022-11-23 10:41:28 +00:00
lixiaoyuner	0abf8d0419	Make client indentity by AME cert (#11946 ) * Make client indentity by AME cert * Join k8s cluster by ipv6 * Change join test cases * Test case bug fix * Improve read node label func * Configure kubelet and change test cases * For kubernetes version 1.22.2 * Fix undefine issue Signed-off-by: Yun Li <yunli1@microsoft.com>	2022-09-17 00:41:53 +00:00
Neetha John	26ee4ae4a4	Add backend acl template (#11220 ) Why I did it Storage backend has all VLAN members tagged. If untagged packets are received on those links, they are counted as RX_DROPS, which can lead to false alarms in monitoring tools. This ACL is used to hide these drops. How I did it Created an ACL template which will be loaded during minigraph load for backend. This template allows tagged VLAN packets and drops untagged ones. How to verify it Unit tests Signed-off-by: Neetha John <nejo@microsoft.com>	2022-07-08 21:39:39 +00:00
xumia	32cda89f93	[Build]: Support to use symbol links for lazy installation targets to reduce the image size (#10923 ) Why I did it Support using symbolic links in the platform folder to reduce the image size. The current solution is to copy each lazy installation target (xxx.deb files) to each of the folders in the platform folder. The size keeps growing as more packages are added to the platform folder. For cisco-8000 as an example, the size will be up to 2G, while most of the content is duplicate packages in the platform folder. How I did it Create a new folder in platform/common; all the deb packages are copied to that folder, and any other folders that use the packages hold symbolic links to the common folder. Why platform.tar? We have implemented a patch for it, see #10775, but the problem is that ONIE uses a really old unzip version that cannot handle symbolic links. The current solution is similar to PR 10775, but packs the platform folder into a tar package, which ONIE can support. During the installation, the package.tar will be extracted to the original folder and removed.	2022-07-05 20:57:49 +00:00
Saikrishna Arcot	044570c42e	Remove SSH host keys after installing the custom version of sshd (#10633 ) (#11140 ) * Remove SSH host keys after installing the custom version of sshd Signed-off-by: Saikrishna Arcot <sarcot@microsoft.com> * Use an override for sshd instead of overwriting the service file Don't overwrite upstream's .service file, and instead use an override file for making sure the host key(s) are generated. Signed-off-by: Saikrishna Arcot <sarcot@microsoft.com>	2022-06-16 11:47:04 -07:00
xumia	06addae853	Revert "Reduce image size for lazy installation packages (#10775 )" (#10916 ) This reverts commit `15cf9b0d70`. Why I did it Revert the PR #10775, for it has impact on onie installation. It is caused by the symbol links not supported in some of the onie unzip. We will enable after fixing the issue, see #10914	2022-05-27 17:00:50 +00:00
shlomibitton	c71c91e2b0	[202012] [Fastboot] Delay PMON service for better fastboot performance (#10745 ) #### Why I did it When profiling the system state on init after fast-reboot during create_switch function execution, it is possible to see a few Python scripts running at the same time. This parallel execution consumes CPU time, so create_switch takes longer than it should. Following this finding, and to ensure these services will not interfere in the future, PMON is delayed by 90 seconds until the system finishes the init flow after fastboot. #### How I did it Add a timer for PMON service. Exclude for MLNX platform the start trigger of PMON when SYNCD starts in case of fastboot. Copy the timer file to the host bin image. #### How to verify it Run fast-reboot on MLNX platform and observe faster create_switch execution time.	2022-05-15 23:31:32 -07:00
shlomibitton	bca8a244c6	[202012] [Fastboot] Delay LLDP service for better fastboot performance (#10568 ) (#10744 ) This PR backports the fix in #10568. This PR depends on PR #10745. - Why I did it When profiling the system state on init after fast-reboot during create_switch function execution, it is possible to see a few Python scripts running at the same time. This parallel execution consumes CPU time, so create_switch takes longer than it should. Following this finding, and to ensure these services will not interfere in the future, LLDP is delayed by 90 seconds until the system finishes the init flow after fastboot. - How I did it Add a timer for LLDP service. Copy the timer file to the host bin image. - How to verify it Run fast-reboot on MLNX platform and observe faster create_switch execution time.	2022-05-15 15:05:29 +03:00
xumia	951d93e362	Reduce image size for lazy installation packages (#10775 ) Why I did it The image size is too large, when there are multiple lazy packages and multiple platforms. It is not necessary to keep the lazy installation packages in multiple copies. For cisco image, the image size will reduce from 3.5G to 1.7G. How I did it Use symbol links to only keep one package for each of the lazy package. Make a new folder fsroot/platform/common Copy the lazy packages into the folder. When using a package in each of the platform, such as x86_64-grub, x86_64-8800_rp-r0, x86_64-8201_on-r0, etc, only make a symbol link to the package in the common folder.	2022-05-10 06:44:40 +00:00
Stepan Blyshchak	fa1e364f54	[services] kill container on stop in warm/fast mode (#10511 ) To optimize stop on warm boot, added kill for containers Use service "kill" in the shutdown path for fast and warm reboot. For all other reload methods, service "stop" is used. This is done to save time in shutdown path, and to overall improve the time spent in warm and fast reload. How - Use service_mgmt.sh to trigger common logic to initiate kill (fast/warm) or stop (cold) for database.sh, radv.sh, snmp.sh, telemetry.sh, mgmt-framework.sh Signed-off-by: Stepan Blyschak <stepanb@nvidia.com>, Vaibhav H D <vaibhav.dixit@microsoft.com>	2022-04-18 14:27:48 -07:00
Saikrishna Arcot	aafb3d00e2	Start haveged before systemd-random-seed (#10328 ) The haveged service file in Debian Buster specifies that haveged should start after systemd-random-seed starts (this was removed in Bullseye after systemd changes caused a bootloop). This is a bit counterproductive, since haveged is meant to be used in environments with minimal sources of entropy, but one of the checks that systemd-random-seed does is to verify that entropy is present. Therefore, override the default .service file for haveged that moves systemd-random-seed to the Before list, allowing it to start before systemd-random-seed checks the system entropy level. (systemd doesn't allow removing items from dependency/ordering entries such as After= and Before=, so the entire .service file has to be overwritten.) Note that despite this, haveged takes up to two seconds to actually start working, so systemd-random-seed may still block for about two seconds. However, this still allows other work (such as running rc.local) to proceed a bit sooner. Signed-off-by: Saikrishna Arcot <sarcot@microsoft.com>	2022-03-24 14:28:42 -07:00
xumia	67312ff635	[Build]: Use one debian mirror config (#10281 ) Why I did it Use one debian mirror config. The empty config in https://github.com/Azure/sonic-buildimage/blob/master/files/image_config/apt/sources.list overrides the file https://github.com/Azure/sonic-buildimage/blob/master/files/apt/sources.list.amd64 (armhf/arm64), it does not make sense. All the content in files/image_config/apt is no use, any one wants to add mirror config, please add in files/apt. How I did it Remove files/image_config/apt and the reference.	2022-03-21 17:04:19 +08:00
xumia	413ee3e219	[Build]: Fix /proc not mounted issue (#10164 ) (#10256 ) [Build]: Fix /proc not mounted issue	2022-03-19 22:19:06 +08:00
xumia	a8d844c83d	[build]: Fix marvell-armhf build hung issue (#10156 ) The marvell-armhf build hangs; it does not exit even after waiting for a long time. It is caused by the process /etc/entropy.py which is started by the postinst script in target/debs/buster/sonic-platform-nokia-7215_1.0_armhf.deb $ cat postinst sh /usr/sbin/nokia-7215_plt_setup.sh ... $ cat usr/sbin/nokia-7215_plt_setup.sh | tail python /etc/entropy.py & $ cat etc/entropy.py if path.exists("/proc/sys/kernel/random/entropy_avail"): while 1: while avail() < 2048: with open('/dev/urandom', 'rb') as urnd, open("/dev/random", mode='wb') as rnd: d = urnd.read(512) t = struct.pack('ii', 4 * len(d), len(d)) + d fcntl.ioctl(rnd, RNDADDENTROPY, t) time.sleep(30) This is a workaround for the build issue; the Debian package needs to be fixed and this change reverted afterwards.	2022-03-07 08:00:56 -08:00
Stephen Sun	fafd5327bd	[Reclaim buffer] Common infrastructure update for reclaiming buffer (#9133 ) - Why I did it This is to update the common sonic-buildimage infra for reclaiming buffer. - How I did it Render zero_profiles.j2 to zero_profiles.json for vendors that support reclaiming buffer The zero profiles will be referenced in PR [Reclaim buffer] Reclaim unused buffers by applying zero buffer profiles #8768 on Mellanox platforms and there will be test cases to verify the behavior there. Rendering is done here for passing azure pipeline. Load zero_profiles.json when the dynamic buffer manager starts Generate inactive port list to reclaim buffer Signed-off-by: Stephen Sun <stephens@nvidia.com>	2021-12-01 02:28:46 +00:00
trzhang-msft	86fa5eede2	Add service mark_dhcp_packet to mux container (#9015 ) - add a new service "mark_dhcp_packet" to mux container - apply packet marks on a per-interface basis in ebtables - write packet marks to "DHCP_PACKET_MARK" table in state_db	2021-11-15 21:36:29 +00:00
Lawrence Lee	77378b4364	[mux]: Call write_standby from host only Signed-off-by: Lawrence Lee <lawlee@microsoft.com>	2021-11-10 18:54:33 -08:00
Lawrence Lee	25712c712e	[mux]: Make write_standby available on host Signed-off-by: Lawrence Lee <lawlee@microsoft.com> [write_standby]: Cleanup and fix build Signed-off-by: Lawrence Lee <lawlee@microsoft.com>	2021-11-10 18:54:33 -08:00
Sudharsan Dhamal Gopalarathnam	ba2284c4c0	Grouping delayed services under a target for config reload checks (#7846 ) #### Why I did it Create a target for delayed service timers. A few services in SONiC are started with a delay to speed up the bring-up of the system and essential services. However, there is no way to track when they start. This is a problem when executing config reload, as config reload expects all services to be up. Hence, all the timers that trigger the delayed services are grouped under one target so that they can be tracked by the 'config reload' command. #### How I did it Created the delay.target service and added a dependency on the delayed targets.	2021-08-16 07:50:56 +00:00
Guohan Lu	db8cc247e0	[build]: Fix docker pull on armhf platform armhf build uses native dockerd Signed-off-by: Guohan Lu <lguohan@gmail.com>	2021-08-06 23:35:25 -07:00
VenkatCisco	cb8ff6dba1	[baseimage]: add j2cli to sonic_debian_extension.j2 (#8019 ) j2cli provides access to jinja library. cisco platform.py requires j2cli to handle jinja template configuration files.	2021-08-05 15:22:57 +00:00
vdahiya12	5e594043ce	[pmon] create and mount firmware directory on PMON for firmware upgrade support on muxcable (#8283 ) This PR creates a directory firmware on the HOST with the path /usr/share/sonic/firmware, as well as this is mounted on PMON container with the same path /usr/share/sonic/firmware. This is required for firmware upgrade support for muxcable as currently by design all Y-Cable API's are called by xcvrd. As such if CLI has to transfer a file to PMON we need to mount a directory from host to PMON just for getting the firmware files. Hence we require this change. Signed-off-by: vaibhav-dahiya <vdahiya@microsoft.com>	2021-08-05 15:22:41 +00:00
Renuka Manavalan	91f611157a	cherry-pick PR #8158 & PR #8205 into 202012 (#8235 )	2021-07-20 20:52:33 -07:00
Guohan Lu	d3e2983188	Revert "[Kubernetes]: The kube server could be used as http-proxy for docker (#7469 )" This reverts commit `e851a42db7`.	2021-07-01 18:41:21 -07:00
Renuka Manavalan	e851a42db7	[Kubernetes]: The kube server could be used as http-proxy for docker (#7469 ) Why I did it The SONiC switches get their docker images from local repo, populated during install with container images pre-built into SONiC FW. With the introduction of kubernetes, new docker images available in remote repo could be deployed. This requires dockerd to be able to pull images from remote repo. Depending on the Switch network domain & config, it may or may not be able to reach the remote repo. In the case where remote repo is unreachable, we could potentially make the Kubernetes server also act as http-proxy. How I did it When admin explicitly enables, the kubernetes-server could be configured as docker-proxy. But any update to docker-proxy has to be via service-conf file environment variable, implying a "service restart docker" is required. But restart of dockerd is very expensive, as it would restart all dockers, including database docker. To avoid dockerd restart, pre-configure an http_proxy using an unused IP. When k8s server is enabled to act as http-proxy, an IP table entry would be created to direct all traffic to the configured-unused-proxy-ip to the kubernetes-master IP. This way any update to Kubernetes master config would be just manipulating IPTables, which will be transparent to all modules, until dockerd needs to download from remote repo. How to verify it Configure a switch such that image repo is unreachable Pre-configure dockerd with http_proxy.conf using an unused IP (e.g. 172.16.1.1) Update ctrmgrd.service to invoke ctrmgrd.py with "-p" option. Configure a k8s server, and deploy an image for feature with set_owner="kube" Check if switch could successfully download the image or not.	2021-06-17 07:09:50 +00:00
yozhao101	fb2c995f53	[202012][Monit] Deprecate the feature of monitoring the critical processes by Monit (#7823 ) Signed-off-by: Yong Zhao yozhao@microsoft.com Why I did it Currently we leveraged the Supervisor to monitor the running status of critical processes in each container and it is more reliable and flexible than doing the monitoring by Monit. So we removed the functionality of monitoring the critical processes by Monit. How I did it I removed the script process_checker and corresponding Monit configuration entries of critical processes. How to verify it I verified this on the device str-7260cx3-acs-1.	2021-06-09 09:04:22 -07:00
Renuka Manavalan	32e5137ab7	Add service to restore TACACS from old config (#7560 ) Why I did it In upgrade scenarios where config_db.json is not carried forward to the new image, the switch could be left without TACACS credentials. Added a service that triggers 5 minutes after boot and restores TACACS if /etc/sonic/old_config/tacacs.json is present. How I did it By adding a service that fires 5 minutes after boot. This service applies the TACACS config if available. How to verify it Upgrade and watch the status of tacacs.timer & tacacs.service. You may create /etc/sonic/old_config/tacacs.json with updated credentials (within 5 minutes after boot) and see that it appears in the config and is persisted too. Which release branch to backport (provide reason below if selected) 201911 202006 202012	2021-06-07 06:02:32 +00:00
yozhao101	3af05fdffe	[Monit] Restart telemetry container if memory usage is beyond the threshold (#7645 ) Signed-off-by: Yong Zhao yozhao@microsoft.com Why I did it This PR aims to monitor the memory usage of streaming telemetry container and restart streaming telemetry container if memory usage is larger than the pre-defined threshold. How I did it I borrowed the system tool Monit to run a script memory_checker which will periodically check the memory usage of streaming telemetry container. If the memory usage of telemetry container is larger than the pre-defined threshold for 10 times during 20 cycles, then an alerting message will be written into syslog and at the same time Monit will run the script restart_service to restart the streaming telemetry container. How to verify it I verified this implementation on device str-7260cx3-acs-1.	2021-05-31 04:38:18 +00:00
guxianghong	a0fde3a626	[arm] support compile sonic arm image on arm server (#7285 ) - Support compile sonic arm image on arm server. If arm image compiling is executed on arm server instead of using qemu mode on x86 server, compile time can be saved significantly. - Add kernel argument systemd.unified_cgroup_hierarchy=0 for upgrade systemd to version 247, according to #7228 - rename multiarch docker to sonic-slave-${distro}-march-${arch} Co-authored-by: Xianghong Gu <xgu@centecnetworks.com> Co-authored-by: Shi Lei <shil@centecnetworks.com>	2021-05-02 08:11:56 -07:00
Stepan Blyshchak	ae574ab000	[systemd] disable default systemd udev rules for interfaces (#7369 ) Fix #7364 99-default.link - was always in SONiC, but previous systemd (<247) had an issue and it did not work due to issue systemd/systemd#3374. Now systemd 247 works. However, such policy overrides teamd provided mac address which causes teamd netdev to use a random mac address. Therefore, needs to be disabled. Signed-off-by: Stepan Blyschak <stepanb@nvidia.com>	2021-05-01 19:43:41 -07:00
Renuka Manavalan	a4d81f3c19	Copy dummy flannel.conf to get around absence of CNI Network (#6985 ) Why I did it We skip install of CNI plugin, as we don't need. But this leaves node in "not ready" state, upon joining master. To fix, we copy this dummy .conf file in /etc/cni/net.d How I did it Keep this file in /usr/share/sonic/templates and copy to /etc/cni/net.d upon joining k8s master. How to verify it Upon configuring master-IP and enable join, watch node join and move to ready state. You may verify using kubectl get nodes command	2021-03-10 09:32:49 -08:00
Sujin Kang	15aed52ef2	[pcie.yaml] Move pcie configuration file path to platform directory (#6475 ) - Why I did it The pcie configuration file location is under plugin directory not under platform directory. #6437 - How I did it Move all pcie.yaml configuration file from plugin to platform directory. Remove unnecessary timer to start pcie-check.service Move pcie-check.service to sonic-host-services - How to verify it Verify on the device	2021-03-04 21:23:05 +00:00
Stepan Blyshchak	7fb5a72d23	[services] introduce sonic.target (#5705 ) - Why I did it Group all SONiC services together and able to manage them together. Will be used in config reload command as much simpler and generic way to restart services. - How I did it Add services to sonic.target - How to verify it Together with Azure/sonic-utilities#1199 config reload -y Signed-off-by: Stepan Blyshchak <stepanb@nvidia.com>	2021-03-04 21:23:05 +00:00
dflynn-Nokia	e3ab6b0494	[armhf build] Fix azure-storage dependency on cryptography package (#6780 ) Fix marvell-armhf build break The azure-storage package depends on the cryptography package. Newer versions of cryptography require the rust compiler, the correct version for which is not readily available in buster. Hence we pre-install an older version here to satisfy the azure-storage dependency. Note: This is not a problem for other architectures as pre-built versions of cryptography are available for those. This sequence can be removed after upgrading to debian bullseye.	2021-03-01 09:40:00 -08:00
Joe LeVeque	d7517a704c	[PDDF] Build and install Python 3 package (#6286 ) - Make PDDF code compliant with both Python 2 and Python 3 - Align code with PEP8 standards using autopep8 - Build and install both Python 2 and Python 3 PDDF packages	2021-02-23 23:56:01 +00:00
yozhao101	bfec282a82	[Monit] Monitoring the running status of containers. (#6251 ) - Why I did it This PR aims to monitor the running status of each container. Currently the auto-restart feature was enabled. If a critical process exited unexpected, the container will be restarted. If the container was restarted 3 times during 20 minutes, then it will not run anymore unless we cleared the flag using the command `sudo systemctl reset-failed <container_name>` manually. - How I did it We will employ Monit to monitor a script. This script will generate the expected running container list and compare it with the current running containers. If there are containers which were expected to run but were not running, then an alerting message will be written into syslog. - How to verify it I tested this feature on a lab device `str-a7050-acs-3` which has single ASIC and `str2-n3164-acs-3` which has a Multi-ASIC. First I manually stopped a container by running the command `sudo systemctl stop <container_name>`, then I checked whether there was an alerting message in the syslog. Signed-off-by: Yong Zhao <yozhao@microsoft.com>	2021-01-09 08:27:53 -08:00
Joe LeVeque	566ea4f601	[system-health] Convert to Python 3 (#5886 ) - Convert system-health scripts to Python 3 - Build and install system-health as a Python 3 wheel - Also convert newlines from DOS to UNIX	2020-12-29 14:04:09 -08:00
Joe LeVeque	62662acbd5	No longer install some unnecessary Python 2 packages in host (#6301 ) - No longer install Python 2 packages in host: - libpython2.7-dev - docker - ipaddress - netifaces - azure-storage - watchdog - futures - Install Python 3 versions of the following packages in host: - docker - azure-storage - watchdog - redis - swsssdk (install unconditionally)	2020-12-29 13:02:11 -08:00
lguohan	aa1cc848e2	[sonic-yang-mgmt-py2]: remove sonic-yang-mgmt py2 (#6262 ) No longer needed as sonic-utilities has been moved to Python 3 Signed-off-by: Guohan Lu <lguohan@gmail.com>	2020-12-22 21:05:33 -08:00
Renuka Manavalan	ba02209141	First cut image update for kubernetes support. (#5421 ) * First cut image update for kubernetes support. With this, 1) dockers dhcp_relay, lldp, pmon, radv, snmp, telemetry are enabled for kube management init_cfg.json configure set_owner as kube for these 2) Each docker's start.sh updated to call container_startup.py to register going up As part of this call, it registers the current owner as local/kube and its version The images are built with its version ingrained into image during build 3) Update all docker's bash script to call 'container start/stop/wait' instead of 'docker start/stop/wait'. For all locally managed containers, it calls docker commands, hence no change for locally managed. 4) Introduced a new ctrmgrd service, that helps with transition between owners as kube & local and carry over any labels update from STATE-DB to API server 5) hostcfgd updated to handle owner change 6) Reboot scripts are updated to tag kube running images as local, so upon reboot they run the same image. 7) Added kube_commands.py to handle all updates with Kubernetes API server -- dedicated for k8s interaction only.	2020-12-22 08:01:33 -08:00
Prabhu Sreenivasan	df2a4ded98	[ntp]: Source interface support for NTP (#6033 ) Added source interface support for NTP. Also made NTP start on Mgmt-VRF by default when configured. - How I did it 1) Updated hostcfg to listen to global config NTP and NTP_SERVER tables and restart ntp whenever the configuration changes. NTP table includes source interface configuration. 2) The ntp script is updated to start on Mgmt-VRF by default when configured. Signed-off-by: Prabhu Sreenivasan <prabhu.sreenivasan@broadcom>	2020-12-21 05:34:13 -08:00
Joe LeVeque	c829e6914a	Install 'wheel' package in host OS; upgrade pip and setuptools (#6187 ) Install the 'wheel' package in host OS (along with python3 and python3-distutils which are also needed for building some Python packages) to eliminate error messages like the following: ``` Running setup.py bdist_wheel for watchdog: started Running setup.py bdist_wheel for watchdog: finished with status 'error' Complete output from command /usr/bin/python -u -c "import setuptools, tokenize;__file__='/tmp/pip-install-Qd3K08/watchdog/setup.py';f=getattr(tokenize, 'open', open)(__file__);code=f.read().replace('\r\n', '\n');f.close();exec(compile(code, __file__, 'exec'))" bdist_wheel -d /tmp/pip-wheel-0AHpMe --python-tag cp27: usage: -c [global_opts] cmd1 [cmd1_opts] [cmd2 [cmd2_opts] ...] or: -c --help [cmd1 cmd2 ...] or: -c --help-commands or: -c cmd --help error: invalid command 'bdist_wheel' ---------------------------------------- Failed building wheel for watchdog ``` These error messages appear to have no impact on the image build, because the Python package seems to still get installed successfully afterward, just the building of a wheel package fails. Therefore, this is more of a cosmetic fix than an actual bug. This is an addendum to https://github.com/Azure/sonic-buildimage/pull/6182. Also upgrade pip and install more recent version of setuptools package via PyPI.	2020-12-16 16:38:15 -08:00
Sabareesh-Kumar-Anandan	9f4ca01388	[sonic-config-engine] Adding dependent pkgs needed for arm compilation (#6186 ) libxslt-dev and libz-dev are dependencies for lxml==4.6.1 which is required for pyangbind==0.8.1 lxml-4.6.2-cp37-cp37m-manylinux1_x86_64.whl is directly downloaded in amd64 whereas in arm this is built from lxml-4.6.2.tar.gz Signed-off-by: Sabareesh Kumar Anandan <sanandan@marvell.com>	2020-12-15 08:44:46 -08:00
Stephen Sun	e010d83fc3	[Dynamic buffer calc] Support dynamic buffer calculation (#6194 ) - Why I did it To support dynamic buffer calculation. This PR also depends on the following PRs for sub modules - [sonic-swss: [buffermgr/bufferorch] Support dynamic buffer calculation #1338](https://github.com/Azure/sonic-swss/pull/1338) - [sonic-swss-common: Dynamic buffer calculation #361](https://github.com/Azure/sonic-swss-common/pull/361) - [sonic-utilities: Support dynamic buffer calculation #973](https://github.com/Azure/sonic-utilities/pull/973) - How I did it 1. Introduce field `buffer_model` in `DEVICE_METADATA|localhost` to represent which buffer model is running in the system currently: - `dynamic` for the dynamic buffer calculation model - `traditional` for the traditional model in which the `pg_profile_lookup.ini` is used 2. Add the tables required for the feature: - ASIC_TABLE in platform/<vendor>/asic_table.j2 - PERIPHERAL_TABLE in platform/<vendor>/peripheral_table.j2 - PORT_PERIPHERAL_TABLE on a per-platform basis in device/<vendor>/<platform>/port_peripheral_config.j2 for each platform with gearbox installed. - DEFAULT_LOSSLESS_BUFFER_PARAMETER and LOSSLESS_TRAFFIC_PATTERN in files/build_templates/buffers_config.j2 - Add lossless PGs (3-4) for each port in files/build_templates/buffers_config.j2 3. Copy the newly introduced j2 files into the image and rendering them when the system starts 4. Update the CLI options for buffermgrd so that it can start with dynamic mode 5. Fetches the ASIC vendor name in orchagent: - fetch the vendor name when creates the docker and pass it as a docker environment variable - `buffermgrd` can use this passed-in variable 6. Clear buffer related tables from STATE_DB when swss docker starts 7. Update the src/sonic-config-engine/tests/sample_output/buffers-dell6100.json according to the buffer_config.j2 8. Remove buffer pool sizes for ingress pools and egress_lossy_pool Update the buffer settings for dynamic buffer calculation	2020-12-13 11:35:39 -08:00
Junchao-Mellanox	51c77b179f	[Mellanox] Add python3 support for Mellanox platform API (#6175 ) python2 is end of life and SONiC is going to support python3. This PR is going to support: 1. Mellanox SONiC platform API python3 support 2. Install both python2 and python3 version of Mellanox SONiC platform API on pmon and host side	2020-12-11 10:51:31 -08:00
Prabhu Sreenivasan	77afb8e54d	[ntp]: ntp-systemd-wrapper file is getting overwritten (#6179 ) ntp-systemd-wrapper file from files/image_config/ntp was not getting picked up. Added a line on sonic_debian_extension.j2 to copy over the file from files/image_config/ntp after installing the debian package. Signed-off-by: Prabhu Sreenivasan <prabhu.sreenivasan@broadcom.com>	2020-12-10 23:20:41 -08:00
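
Note on commit 3af05fdffe: the message describes restarting the telemetry container when memory usage exceeds the threshold "for 10 times during 20 cycles". A minimal Python sketch of that counting rule follows, assuming a sliding window over the most recent checks. The function name `should_restart` and its parameters are hypothetical and only illustrate the rule; this is not Monit's implementation or the `memory_checker` script from the commit.

```python
from collections import deque


def should_restart(samples, threshold, times=10, cycles=20):
    """Return True once `times` of the last `cycles` samples exceed `threshold`.

    `samples` is an iterable of memory readings, one per check cycle.
    All names here are illustrative assumptions, not taken from the commit.
    """
    window = deque(maxlen=cycles)  # keeps only the most recent `cycles` results
    for usage in samples:
        window.append(usage > threshold)
        if sum(window) >= times:  # count of over-threshold readings in the window
            return True
    return False
```

With these defaults, nine consecutive readings above the threshold followed by normal readings never trigger a restart, while ten readings above the threshold within any twenty consecutive cycles do.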


231 Commits