sonic-buildimage

Author	SHA1	Message	Date
Yaqiang Zhu	f48e8b61cf	[dhcp_relay] Use dhcprelayd to manage critical processes (#17236 ) Modify j2 template files in docker-dhcp-relay. Add dhcprelayd to group dhcp-relay instead of isc-dhcp-relay-VlanXXX, which makes dhcprelayd a critical process. In dhcprelayd, subscribe to the FEATURE table to check whether the dhcp_server feature is enabled. 2.1 If the dhcp_server feature is disabled, the original dhcp_relay functionality is needed and dhcprelayd does nothing. Because the dhcrelay/dhcpmon configuration is generated in the supervisord configuration, they run automatically. 2.2 If the dhcp_server feature is enabled, dhcprelayd stops the dhcpmon/dhcrelay processes started by supervisord and subscribes to the dhcp_server related tables in config_db to start dhcpmon/dhcrelay processes. 2.3 While dhcprelayd is running, it regularly checks the feature status (every 5s by default) and can encounter the following 4 state changes of the dhcp_server feature: A) disabled -> enabled In this scenario, dhcprelayd subscribes to the dhcp_server related tables, stops the dhcpmon/dhcrelay processes started by supervisord and starts a new pair of dhcpmon/dhcrelay processes. After this, the dhcpmon/dhcrelay processes are managed entirely by dhcprelayd. B) enabled -> enabled In this scenario, dhcprelayd monitors db changes in the dhcp_server related tables to determine whether to restart the dhcpmon/dhcrelay processes. C) enabled -> disabled In this scenario, dhcprelayd unsubscribes from the dhcp_server related tables and kills the dhcpmon/dhcrelay processes it started itself. dhcprelayd then starts the dhcpmon/dhcrelay processes via supervisorctl. D) disabled -> disabled dhcprelayd checks whether the running status of the dhcrelay processes is consistent with the supervisord configuration file. If it is not, dhcprelayd kills itself, and the dhcp_relay container then stops because dhcprelayd is a critical process.	2023-12-04 22:14:02 +00:00
Shashanka Balakuntala	c0963db5a3	[dhcp-relay]: Modify dhcp relay to pick primary address (#17012 ) This is change taken as part of the HLD: sonic-net/SONiC#1470 and this is a follow up on the PR #16827 where in the docker-dhcp we pick the value of primary gateway of the interface from the VLAN_Interface table which has "secondary" flag set in the config_db Microsoft ADO (number only): 16784946 How did I do it - Changes in the j2 file to add a new "-pg" parameter in the dhcpv4-relay.agents.j2, the ip would be retrieved from the config db's vlan_interface table such that the interface which are picked will have secondary field set. - Changes in isc-dhcp to re-order the addresses of the discovered interface and which has the ip which has the passed parameter.	2023-12-04 22:14:02 +00:00
Yaqiang Zhu	274d320443	[dhcp_server] Add dhcprelayd for dhcp_server feature (#16947 ) Add support in dhcp_relay container for dhcp_server_ipv4 feature. HLD: sonic-net/SONiC#1282	2023-11-02 08:09:01 -07:00
Zain Budhwani	d89dde3b6d	Fix regex and process name (#16647 ) ### Why I did it ### How I did it Fix regex such that dhcp bind failure event is detected as well as process name since dhcp relay processes that need to be detected are dhcprelay6 and dhcrelay. #### How to verify it Manual testing and nightly test event	2023-09-26 16:15:27 -07:00
jcaiMR	9c1c82e9ff	add show dhcp_relay ipv4 counter entry, fix interface name offset issue (#16507 ) Why I did it Add another cli entry: show dhcp_relay ipv4 counter Fix get all interface offset issue Work item tracking Microsoft ADO (17271822): How I did it show dhcp_relay ipv4 counter -i [ifname] show dhcp4relay_counters counts -i [ifname] How to verify it show dhcp4relay_counters counts | more 10 Message Type Ethernet144(RX)	2023-09-11 21:08:06 +08:00
jcaiMR	a522a63e25	[dhcp-relay]: dhcp/dhcpv6 per interface counter support (#16377 ) Why I did it Support DHCP/DHCPv6 per-interface counter, code change in sonic-build image. Work item tracking Microsoft ADO (17271822): How I did it - Introduce libjsoncpp-dev in dhcpmon and dhcprelay repo - Show CLI changes after counter format change How to verify it - Manually run show command - dhcpmon, dhcprelay integration tests	2023-09-05 10:16:39 -07:00
jcaiMR	bd413d20d2	advance dhcprelay to 6a6ce24, add default dhcpv6 dualtor source interface (#15864 ) sonic-build image side change to fix source interface selection in dual tor scenario. dhcprelay related PR: [master]fix dhcpv6 relay dual tor source interface selection issue sonic-dhcp-relay#42 Announce dhcprelay submodule to 6a6ce24([to invoke #40 PR]([master]fix dhcpv6 relay dual tor source interface selection issue sonic-dhcp-relay#42))	2023-07-17 15:28:10 -07:00
nmoray	f978b2bb53	Timezone sync issue between the host and containers (#14000 ) #### Why I did it To fix the timezone sync issue between the containers and the host. If a certain timezone has been configured on the host (SONIC) then the expectation is to reflect the same across all the containers. This will fix [Issue:13046](https://github.com/sonic-net/sonic-buildimage/issues/13046). For instance, a PST timezone has been set on the host and if the user checks the link flap logs (inside the FRR), it shows the UTC timestamp. Ideally, it should be PST.	2023-06-25 16:36:09 -07:00
Mai Bui	1477f779de	modify commands using utilities_common.cli.run_command and advance sonic-utilities submodule on master (#15193 ) Dependency: sonic-net/sonic-utilities#2718 Why I did it This PR sonic-net/sonic-utilities#2718 reduce shell=True usage in utilities_common.cli.run_command() function. Work item tracking Microsoft ADO (number only): 15022050 How I did it Replace strings commands using utilities_common.cli.run_command() function to list of strings due to circular dependency, advance sonic-utilities submodule 72ca4848 (HEAD -> master, upstream/master, upstream/HEAD) Add CLI configuration options for teamd retry count feature (sonic-net/sonic-utilities#2642) 359dfc0c [Clock] Implement clock CLI (sonic-net/sonic-utilities#2793) b316fc27 Add transceiver status CLI to show output from TRANSCEIVER_STATUS table (sonic-net/sonic-utilities#2772) dc59dbd2 Replace pickle by json (sonic-net/sonic-utilities#2849) a66f41c4 [show] replace shell=True, replace xml by lxml, replace exit by sys.exit (sonic-net/sonic-utilities#2666) 57500572 [utilities_common] replace shell=True (sonic-net/sonic-utilities#2718) 6e0ee3e7 [CRM][DASH] Extend CRM utility to support DASH resources. (sonic-net/sonic-utilities#2800) b2c29b0b [config] Generate sysinfo in single asic (sonic-net/sonic-utilities#2856)	2023-06-05 17:08:13 +08:00
Yaqiang Zhu	284ba61a86	[dhcp-relay] Add dhcp_relay show cli (#13614 ) Why I did it Currently the show and clear cli of dhcp_relay may cause confusion. How I did it Add doc for it: [doc] Add docs for dhcp_relay show/clear cli sonic-utilities#2649 Add dhcp_relay config cli and test cases. show dhcp_relay ipv4 helper show dhcp_relay ipv6 destination show dhcp_relay ipv6 counters sonic-clear dhcp_relay ipv6 counters How to verify it Unit test all passed	2023-03-06 10:48:25 -08:00
Yaqiang Zhu	c5a7ce0cf4	[dhcp_relay] Remove exist check while adding dhcpv6 relay (#13822 ) Why I did it DHCPv6 relay config entry is not useful while del dhcpv6 relay config. How I did it Remove dhcpv6_relay entry if it is empty and not check entry exist while adding dhcpv6 relay	2023-02-15 10:23:39 -08:00
Yaqiang Zhu	bb48ee92ab	[dhcp-relay] Add support for dhcp_relay config cli (#13373 ) Why I did it Currently the config cli of dhcpv4 may cause confusion and the config of dhcpv6 is missing. How I did it Add dhcp_relay config cli and test cases. config dhcp_relay ipv4 helper (add | del) <vlan_id> <helper_ip_list> config dhcp_relay ipv6 destination (add | del) <vlan_id> <destination_ip_list> Updated docs for it in sonic-utilities: https://github.com/sonic-net/sonic-utilities/pull/2598/files How to verify it Build docker-dhcp-relay.gz with and without INCLUDE_DHCP_RELAY, and check target/docker-dhcp-relay.gz.log	2023-01-30 17:48:01 -08:00
Sudharsan Dhamal Gopalarathnam	bbc72e78b0	[dhcp_relay]Fix the clear dhcp6relay_counters CLI (#13148 ) Avoid traceback on sonic-clear command sonic-clear dhcp6relay_counters Traceback (most recent call last): File "/usr/local/bin/sonic-clear", line 8, in <module> sys.exit(cli()) File "/usr/local/lib/python3.9/dist-packages/click/core.py", line 764, in __call__ return self.main(args, kwargs) File "/usr/local/lib/python3.9/dist-packages/click/core.py", line 717, in main rv = self.invoke(ctx) File "/usr/local/lib/python3.9/dist-packages/click/core.py", line 1137, in invoke return _process_result(sub_ctx.command.invoke(sub_ctx)) File "/usr/local/lib/python3.9/dist-packages/click/core.py", line 956, in invoke return ctx.invoke(self.callback, ctx.params) File "/usr/local/lib/python3.9/dist-packages/click/core.py", line 555, in invoke return callback(args, **kwargs) File "/usr/local/lib/python3.9/dist-packages/clear/plugins/dhcp-relay.py", line 19, in dhcp6relay_clear_counters counter = DHCPv6_Counter() NameError: name 'DHCPv6_Counter' is not defined - How I did it Corrected the way to import using importlib - How to verify it Tested the sonic-clear command and verified no traceback is seen	2022-12-26 09:14:59 +02:00
Junchao-Mellanox	2126def04e	[infra] Support syslog rate limit configuration (#12490 ) - Why I did it Support syslog rate limit configuration feature - How I did it Remove unused rsyslog.conf from containers Modify docker startup script to generate rsyslog.conf from template files Add metadata/init data for syslog rate limit configuration - How to verify it Manual test New sonic-mgmt regression cases	2022-12-20 10:53:58 +02:00
Vivek	5624d15a7c	Fix dependency of dhcp-mon on VLAN with only v6 (#13006 ) Signed-off-by: Vivek Reddy Karri <vkarri@nvidia.com> Signed-off-by: Vivek Reddy Karri <vkarri@nvidia.com>	2022-12-09 14:41:07 -08:00
Zain Budhwani	98ace33b0f	Add rsyslog plugin regex for select operation failure (#12659 ) Added events for select op, alpm parity error, moved dhcp events from host to container	2022-11-13 21:41:33 -08:00
kellyyeh	f4046c1417	Add dhcp6relay dualtor option (#12459 )	2022-10-21 10:33:10 -07:00
Vivek	34f9a642dd	[DHCP_RELAY] Updated wait_for_intf.sh to wait for ipv6 global and link local addr (#12273 ) - Why I did it Fixes #11431 - How I did it dhcp6relay binds to ipv6 addresses configured on these vlan interfaces Thus check if they are ready before launching dhcp6relay - How to verify it Unit Tests Tested on a live device Signed-off-by: Vivek Reddy Karri <vkarri@nvidia.com>	2022-10-12 11:46:20 +03:00
kellyyeh	6bdbe14975	[dhcp_relay] Add "vlan missing ip helper" dhcp relay unittest (#10654 )	2022-06-04 11:37:04 -07:00
kellyyeh	2ead3aaefc	[dhcp6relay] Fix option parsing and add dhcpv6 client messages (#10819 )	2022-05-24 14:37:16 -07:00
kellyyeh	cfdb8431df	[dhcp6relay] Add dhcpv6 option check (#10486 )	2022-05-05 18:04:14 -07:00
Kalimuthu-Velappan	bc30528341	Parallel building of sonic dockers using native dockerd(dood). (#10352 ) Currently, the build dockers are created as user dockers (docker-base-stretch-<user>, etc) that are specific to each user. But the sonic dockers (docker-database, docker-swss, etc) are created with a fixed docker name and are common to all users. docker-database:latest docker-swss:latest When multiple builds are triggered on the same build server, this creates a parallel-build issue because all the build jobs try to create the same docker with the latest tag. This happens only when sonic dockers are built using the native host dockerd for sonic docker image creation. This patch creates all sonic dockers as user sonic dockers and then, while saving and loading the user sonic dockers, it renames the user sonic dockers to the correct sonic dockers with the tag latest. docker-database:latest <== SAVE/LOAD ==> docker-database-<user>:tag The user sonic docker names are derived from the 'DOCKER_USERNAME and DOCKER_USERTAG' make env variables and, using a Jinja template, the FROM docker name is replaced with the correct user sonic docker name for loading and saving the docker image.	2022-04-28 08:39:37 +08:00
kellyyeh	396a92cb2e	[dhcp_relay] Remove dhcp6mon (#10467 )	2022-04-12 10:44:17 -07:00
Christian Svensson	660c0cbe7b	docker-dhcp-relay: Fix test call to MockConfigDb (#9903 ) *docker-dhcp-relay: Fix test call to MockConfigDb Signed-off-by: Christian Svensson <blue@cmd.nu>	2022-02-01 18:52:52 -08:00
kellyyeh	3e263fa6a8	[dhcp_relay] Remove dhcpv6 servers from VlanBrief (#9718 )	2022-01-19 07:47:08 -08:00
Saikrishna Arcot	bb3362760d	[docker-dhcprelay]: Update to Bullseye (#9736 ) As part of this, update the isc-dhcp package to match the Bullseye version (this fixes some compile errors related to BIND), clean up some of the build dependencies and runtime dependencies for debian packaging, and use the default Boost version to compile against instead of explicitly saying using 1.74. Signed-off-by: Saikrishna Arcot <sarcot@microsoft.com>	2022-01-18 15:11:36 -08:00
shlomibitton	eaa888d948	Fix import error for DHCP relay CLI (#9691 ) Signed-off-by: Shlomi Bitton <shlomibi@nvidia.com>	2022-01-16 08:08:01 +02:00
kellyyeh	f2ee94d201	[dhcp_relay] Update DHCPv6 counter on relayed messages (#9283 )	2021-11-30 20:15:30 -08:00
Saikrishna Arcot	c1d5e0682f	docker-dhcp-relay: Fix waiting for interfaces to get set up (#9034 ) Fix the check used to wait for interfaces to come up. The group name in the supervisor config files has changed from isc-dhcp-relay to dhcp-relay. Also, in the wait script, wait 10 additional seconds after the vlans, port channels, and any interfaces are up. This is because dhcrelay listens on all interfaces (in addition to port channels and vlans), and to ensure that it stays in a clean state during runtime, wait some extra time to make sure that those interfaces are created as well. Signed-off-by: Saikrishna Arcot <sarcot@microsoft.com>	2021-10-21 18:45:00 -07:00
shlomibitton	546340bf7b	[dhcp_relay] Fix import for dhcp_counters on clear_dhcp6relay_counter.py (#8991 ) #### Why I did it Import issue will cause: root@sonic:/# sudo sonic-clear arp failed to import plugin clear.plugins.dhcprelay: No module named 'show_dhcp_relay' #### How I did it Fix the import. #### How to verify it run sudo sonic-clear arp	2021-10-19 03:10:36 -07:00
kellyyeh	62a1f5eb19	Add CLI Support for IPv6 Helpers and DHCPv6 Relay Counters (#8593 )	2021-09-23 22:01:26 -07:00
kellyyeh	bc06c6fcb5	Incorporate DHCPv6 Relay Agent into dhcp-relay docker (#8321 )	2021-09-22 16:05:03 -07:00
shlomibitton	56533ceb9e	[dhcp_relay] Adapt config/show CLI commands to support DHCPv6 relay (#8211 ) #### Why I did it - Adapt config/show CLI commands to support DHCPv6 relay - Support multiple dhcp servers assignment in one command - Fix IP validation - Adapt UT and add new UT cases #### How I did it - Modify config/show dhcp relay files - Modify config/show UT files #### How to verify it This PR has a dependency on PR https://github.com/Azure/sonic-utilities/pull/1717 Build an image with the dependent PR and this PR Use config/show DHCPv6 relay commands.	2021-08-25 00:48:39 -07:00
shlomibitton	604becdd5c	[dhcp_relay] DHCP relay support for IPv6 (#7772 ) Why I did it Currently SONiC use the 'isc-dhcp-relay' package to allow DHCP relay functionality on IPv4 networks only. This will allow the IPv6 functionality along the IPv4 type. How I did it Edit supervisord template to start DHCPv6 instances when configured to do so on Config DB. Align cfg unit test to the new change. Add DHCPv6 relay minigraph parsing support and a suitable t0 topology xml file for UT. How to verify it Configure DHCPv6 agents as described on the feature HLD: Azure/SONiC#765 Test it with real client/server with IPv6 or use the dedicated automatic test: Azure/sonic-mgmt#3565 Signed-off-by: Shlomi Bitton <shlomibi@nvidia.com> * Split docker-dhcp-relay.supervisord.conf.j2 template into several files for easier code maintenance	2021-07-16 07:31:05 -07:00
Stepan Blyshchak	b3b6938fda	[dhcp-relay] make DHCP relay an extension (#6531 ) - Why I did it Make DHCP relay docker an extension. DHCP relay now carries dhcp relay commands CLI plugin and has a complete manifest. It is installed as extension if INCLUDE_DHCP_RELAY is set to y. DEPENDS on #5939 - How I did it Modify DHCP relay docker makefile and dockerfile. Make changes to sonic_debian_extension.j2 to install sonic packages. I moved DHCP related CLI tests from sonic-utilities to DHCP relay docker. This PR introduces a way to write a plugin as part of docker image and run the tests from cli-plugin-tests directory under docker directory. The test result is available in target/docker-dhcp-relay.gz.log: [ REASON ] : target/docker-dhcp-relay.gz does not exist NON-EXISTENT PREREQUISITES: docker-start target/docker-config-engine-buster.gz-load target/python-wheels/sonic_utilities-1.2-py3-none-any.whl-install target/debs/buster/python3-swsscommon_1.0.0_amd64.deb-install [ FLAGS FILE ] : [] [ FLAGS DEPENDS ] : [] [ FLAGS DIFF ] : [] ============================= test session starts ============================== platform linux -- Python 3.7.3, pytest-3.10.1, py-1.7.0, pluggy-0.8.0 -- /usr/bin/python3 cachedir: .pytest_cache rootdir: /sonic/dockers/docker-dhcp-relay/cli-plugin-tests, inifile: plugins: cov-2.6.0 collecting ... collected 10 items test_config_dhcp_relay.py::TestConfigVlanDhcpRelay::test_plugin_registration PASSED [ 10%] test_config_dhcp_relay.py::TestConfigVlanDhcpRelay::test_config_vlan_add_dhcp_relay_with_nonexist_vlanid PASSED [ 20%] test_config_dhcp_relay.py::TestConfigVlanDhcpRelay::test_config_vlan_add_dhcp_relay_with_invalid_vlanid PASSED [ 30%] test_config_dhcp_relay.py::TestConfigVlanDhcpRelay::test_config_vlan_add_dhcp_relay_with_invalid_ip PASSED [ 40%] test_config_dhcp_relay.py::TestConfigVlanDhcpRelay::test_config_vlan_add_dhcp_relay_with_exist_ip PASSED [ 50%] test_config_dhcp_relay.py::TestConfigVlanDhcpRelay::test_config_vlan_add_del_dhcp_relay_dest PASSED [ 60%] test_config_dhcp_relay.py::TestConfigVlanDhcpRelay::test_config_vlan_remove_nonexist_dhcp_relay_dest PASSED [ 70%] test_config_dhcp_relay.py::TestConfigVlanDhcpRelay::test_config_vlan_remove_dhcp_relay_dest_with_nonexist_vlanid PASSED [ 80%] test_show_dhcp_relay.py::TestVlanDhcpRelay::test_plugin_registration PASSED [ 90%] test_show_dhcp_relay.py::TestVlanDhcpRelay::test_dhcp_relay_column_output PASSED [100%] =============================== warnings summary =============================== /usr/local/lib/python3.7/dist-packages/tabulate.py:7 /usr/local/lib/python3.7/dist-packages/tabulate.py:7: DeprecationWarning: Using or importing the ABCs from 'collections' instead of from 'collections.abc' is deprecated, and in 3.8 it will stop working from collections import namedtuple, Iterable -- Docs: https://docs.pytest.org/en/latest/warnings.html ==================== 10 passed, 1 warnings in 0.35 seconds =====================	2021-07-15 10:35:56 -07:00
trzhang-msft	4f2b54e735	dhcpmon: support dual tor in docker template (#7470 )	2021-05-03 10:51:34 -07:00
Joe LeVeque	c651a9ade4	[dockers][supervisor] Increase event buffer size for process exit listener; Set all event buffer sizes to 1024 (#7083 ) To prevent error [messages](https://dev.azure.com/mssonic/build/_build/results?buildId=2254&view=logs&j=9a13fbcd-e92d-583c-2f89-d81f90cac1fd&t=739db6ba-1b35-5485-5697-de102068d650&l=802) like the following from being logged: ``` Mar 17 02:33:48.523153 vlab-01 INFO swss#supervisord 2021-03-17 02:33:48,518 ERRO pool supervisor-proc-exit-listener event buffer overflowed, discarding event 46 ``` This is basically an addendum to https://github.com/Azure/sonic-buildimage/pull/5247, which increased the event buffer size for dependent-startup. While supervisor-proc-exit-listener doesn't subscribe to as many events as dependent-startup, there is still a chance some containers (like swss, as in the example above) have enough processes running to cause an overflow of the default buffer size of 10. This is especially important for preventing erroneous log_analyzer failures in the sonic-mgmt repo regression tests, which have started occasionally causing PR check builds to fail. Example [here](https://dev.azure.com/mssonic/build/_build/results?buildId=2254&view=logs&j=9a13fbcd-e92d-583c-2f89-d81f90cac1fd&t=739db6ba-1b35-5485-5697-de102068d650&l=802). I set all supervisor-proc-exit-listener event buffer sizes to 1024, and also updated all dependent-startup event buffer sizes to 1024, as well, to keep things simple, unified, and allow headroom so that we will not need to adjust these values frequently, if at all.	2021-03-27 21:14:24 -07:00
trzhang-msft	97b371ee08	[docker-dhcp-relay]: add -si support in dhcp docker template (#7053 )	2021-03-15 09:21:03 -07:00
Tamer Ahmed	bb03e5bb37	Start DHCP Relay When Helpers IPs Are Available (#6961 ) #### Why I did it It is possible to have a DHCP relay configuration with no servers/helpers, which causes the DHCP container to crash. This PR fixes this issue by not starting DHCP relay for vlans with no DHCP helpers. resolves: #6931 closes: #6931 #### How I did it Do not add a program group for dhcp relay with no DHCP helpers #### How to verify it Unit test	2021-03-04 20:43:08 -08:00
Tamer Ahmed	8d857fab16	[dhcp-relay]: Launch DHCP Relay On L3 Vlan (#6527 ) Recent changes introduced the l2 vlan concept; such vlans do not have DHCP clients behind them, so DHCP relay is not required. Also, dhcpmon fails to launch on those vlans as their interfaces lack IP addresses. This PR limits the launch of both DHCP relay and dhcpmon to L3 vlans only. signed-off-by: Tamer Ahmed <tamer.ahmed@microsoft.com>	2021-01-25 10:48:48 -08:00
yozhao101	be3c036794	[supervisord] Monitoring the critical processes with supervisord. (#6242 ) - Why I did it Initially, we used Monit to monitor critical processes in each container. If one of the critical processes was not running or crashed for some reason, Monit would periodically write an alerting message into syslog. If we add a new process to a container, the corresponding Monit configuration file also needs to be updated, which makes maintenance harder. Currently we employ the event listener of Supervisord to do this monitoring. Since processes in each container are managed by Supervisord, we can focus only on the monitoring logic. - How I did it We borrowed the event listener of Supervisord to monitor critical processes in containers. The event listener takes the following steps if it is notified that one of the critical processes exited unexpectedly: The event listener first checks whether the auto-restart mechanism is enabled for this container. If the auto-restart mechanism is enabled, the event listener kills the Supervisord process, which should cause the container to exit and subsequently get restarted. If the auto-restart mechanism is not enabled for this container, the event listener enters a loop which first sleeps 1 minute and then checks whether the process is running. If yes, the event listener exits. If no, an alerting message is written into syslog. - How to verify it First, we need to check whether the auto-restart mechanism of a container is enabled by running the command show feature status. If enabled, one critical process should be selected and killed manually, then we need to check whether the container is restarted. Second, we can disable the auto-restart mechanism if it was enabled at step 1 by running the command sudo config feature autorestart <container_name> disabled. Then one critical process should be selected and killed. After that, we will see the alerting message appear in the syslog every 1 minute. - Which release branch to backport (provide reason below if selected) 201811 201911 [x ] 202006	2021-01-21 12:57:49 -08:00
Renuka Manavalan	ba02209141	First cut image update for kubernetes support. (#5421 ) * First cut image update for kubernetes support. With this, 1) dockers dhcp_relay, lldp, pmon, radv, snmp, telemetry are enabled for kube management init_cfg.json configure set_owner as kube for these 2) Each docker's start.sh updated to call container_startup.py to register going up As part of this call, it registers the current owner as local/kube and its version The images are built with its version ingrained into image during build 3) Update all docker's bash script to call 'container start/stop/wait' instead of 'docker start/stop/wait'. For all locally managed containers, it calls docker commands, hence no change for locally managed. 4) Introduced a new ctrmgrd service, that helps with transition between owners as kube & local and carry over any labels update from STATE-DB to API server 5) hostcfgd updated to handle owner change 6) Reboot scripts are updated to tag kube running images as local, so upon reboot they run the same image. 7) Added kube_commands.py to handle all updates with Kubernetes API server -- dedicated for k8s interaction only.	2020-12-22 08:01:33 -08:00
trzhang-msft	d4d90a8963	Support for dual tor option in dhcp docker template (#6152 )	2020-12-09 18:10:00 -08:00
lguohan	4d3eb18ca7	[supervisord]: use abspath as supervisord entrypoint (#5995 ) use abspath makes the entrypoint not affected by PATH env. Signed-off-by: Guohan Lu <lguohan@gmail.com>	2020-11-22 21:18:44 -08:00
Joe LeVeque	7bf05f7f4f	[supervisor] Install vanilla package once again, install Python 3 version in Buster container (#5546 ) - Why I did it We were building a custom version of Supervisor because I had added patches to prevent hangs and crashes if the system clock ever rolled backward. Those changes were merged into the upstream Supervisor repo as of version 3.4.0 (http://supervisord.org/changes.html#id9), therefore, we should be able to simply install the vanilla package via pip. This will also allow us to easily move to Python 3, as Python 3 support was added in version 4.0.0. - How I did it - Remove Makefiles and patches for building supervisor package from source - Install Python 3 supervisor package version 4.2.1 in Buster base container - Also install Python 3 version of supervisord-dependent-startup in Buster base container - Debian package installed binary in `/usr/bin/`, but pip package installs in `/usr/local/bin/`, so rather than update all absolute paths, I changed all references to simply call `supervisord` and let the system PATH find the executable to prevent future need for changes just in case we ever need to switch back to build a Debian package, then we won't need to modify these again. - Install Python 2 supervisor package >= 3.4.0 in Stretch and Jessie base containers	2020-11-19 23:41:32 -08:00
Tamer Ahmed	15f5d47338	[dhcpmon] Print Both Snapshot And Current Counters (#5374 ) Printing both snapshot and current counter sets will make it easier to pinpoint which message type(s) is/are not being relayed. This PR prints both counter sets. Also, this PR defines gnu11 as a C standard to compile with in order to avoid making changes when porting to 201811 branch. signed-off-by: Tamer Ahmed <tamer.ahmed@microsoft.com>	2020-09-15 15:27:36 -07:00
Tamer Ahmed	1bf6fdc6d2	[dhcpmon] Monitor Mgmt Interface For DHCP Packets (#5317 ) When BGP routes are missing, DHCP packets get relayed over the mgmt interface. This results in dhcpmon alerting that DHCP packets are not being relayed. This PR includes the mgmt interface as an uplink device, so if a DHCP packet gets relayed over the mgmt interface, the regular dhcpmon alert will not be issued. Instead, dhcpmon will check the mgmt interface counts and issue a separate alert regarding packets travelling through the mgmt network. In addition, this PR includes the following enhancements: 1. Add SIGUSR1 handler that prints out current packet counts 2. Increase alert grace window to 3 minutes from currently 2 minutes 3. Time is now computed more accurately 4. Print vlan name before counters signed-off-by: Tamer Ahmed <tamer.ahmed@microsoft.com>	2020-09-09 18:37:01 -07:00
Joe LeVeque	5b3b4804ad	[dockers][supervisor] Increase event buffer size for dependent-startup (#5247 ) When stopping the swss, pmon or bgp containers, log messages like the following can be seen: ``` Aug 23 22:50:43.789760 sonic-dut INFO swss#supervisord 2020-08-23 22:50:10,061 ERRO pool dependent-startup event buffer overflowed, discarding event 34 Aug 23 22:50:43.789760 sonic-dut INFO swss#supervisord 2020-08-23 22:50:10,063 ERRO pool dependent-startup event buffer overflowed, discarding event 35 Aug 23 22:50:43.789760 sonic-dut INFO swss#supervisord 2020-08-23 22:50:10,064 ERRO pool dependent-startup event buffer overflowed, discarding event 36 Aug 23 22:50:43.789760 sonic-dut INFO swss#supervisord 2020-08-23 22:50:10,066 ERRO pool dependent-startup event buffer overflowed, discarding event 37 ``` This is due to the number of programs in the container managed by supervisor, all generating events at the same time. The default event queue buffer size in supervisor is 10. This patch increases that value in all containers in order to eliminate these errors. As more programs are added to the containers, we may need to further adjust these values. I increased all buffer sizes to 25 except for containers with more programs or templated supervisor.conf files which allow for a variable number of programs. In these cases I increased the buffer size to 50. One final exception is the swss container, where the buffer fills up to ~50, so I increased this buffer to 100. Resolves https://github.com/Azure/sonic-buildimage/issues/5241	2020-09-08 23:36:38 -07:00
Tamer Ahmed	3a10e9c6fa	[dhcp-relay] Reduce Calls to SONiC Cfggen (#5175 ) Calls to sonic-cfggen are CPU expensive. This PR reduces calls to sonic-cfggen to one call during startup when starting dhcp-relay service. signed-off-by: Tamer Ahmed <tamer.ahmed@microsoft.com>	2020-08-17 15:47:14 -07:00
yozhao101	4fa81b4f8d	[dockers] Update critical_processes file syntax (#4831 ) - Why I did it Initially, the critical_processes file contained either the name of a critical process or the name of a group. For example, the critical_processes file in the dhcp_relay container contains a single group name `isc-dhcp-relay`. When testing the autorestart feature of each container, we need to get all the critical processes and test whether a container can be restarted correctly if one of its critical processes is killed. However, it is difficult to tell whether the names in the critical_processes file are critical processes or group names. Changing the syntax in this file separates the individual processes from the groups and also makes this clear to the user. Right now the critical_processes file contains two different kinds of entries. One is "program:xxx" which indicates a critical process. Another is "group:xxx" which indicates a group of critical processes managed by supervisord using the name "xxx". At the same time, I also updated the logic to parse the file critical_processes in supervisor-proc-event-listener script. - How to verify it We can first enable the autorestart feature of a specified container, for example `dhcp_relay`, by running the command `sudo config container feature autorestart dhcp_relay enabled` on DUT. Then we can select a critical process from the command `docker top dhcp_relay` and use the command `sudo kill -SIGKILL <pid>` to kill that critical process. Final step is to check whether the container is restarted correctly or not.	2020-06-25 21:18:21 -07:00
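
The `critical_processes` syntax described in commit 4fa81b4f8d ("program:xxx" for a single critical process, "group:xxx" for a supervisord group) can be illustrated with a minimal parsing sketch. This is not the code from the supervisor event-listener script; the function name `parse_critical_processes`, the entry `example-daemon`, and the choices to skip blank lines and to reject unknown prefixes are all assumptions made for illustration.

```python
from typing import Iterable, List, Tuple


def parse_critical_processes(lines: Iterable[str]) -> Tuple[List[str], List[str]]:
    """Split critical_processes entries into (programs, groups).

    Hypothetical helper: it only demonstrates the "program:xxx" /
    "group:xxx" entry syntax described in the commit message.
    """
    programs: List[str] = []
    groups: List[str] = []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # assumption: blank lines are ignored
        kind, sep, name = line.partition(":")
        kind, name = kind.strip(), name.strip()
        if not sep or not name:
            raise ValueError(f"malformed entry: {line!r}")
        if kind == "program":
            programs.append(name)
        elif kind == "group":
            groups.append(name)
        else:
            # assumption: any other prefix is treated as an error
            raise ValueError(f"unknown entry type: {line!r}")
    return programs, groups


if __name__ == "__main__":
    # "isc-dhcp-relay" is the group name quoted in the commit message;
    # "example-daemon" is a made-up program name.
    sample = ["group:isc-dhcp-relay", "program:example-daemon"]
    print(parse_critical_processes(sample))
```

Returning programs and groups separately mirrors the stated motivation of the change: a test harness can tell individual critical processes apart from supervisord group names without guessing.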


94 Commits