sonic-buildimage/dockers
Hua Liu 05f1a5a31e
Add watchdog mechanism to swss service and generate alert when swss have issue. (#15429)

**Work item tracking**
Microsoft ADO (number only): 16578912

**What I did**
Add an orchagent watchdog that monitors orchagent and raises an alert when it gets stuck.

**Why I did it**
Currently the SONiC monit system only checks whether the orchagent process exists. If orchagent gets stuck and stops processing, monit cannot detect or report it.
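
To illustrate the difference between an existence check and a liveness check, here is a minimal sketch of a heartbeat-based watchdog. It is not the listener added by this PR: the class name `HeartbeatWatchdog`, the helper `format_alert`, and the 60-second timeout are all assumptions made for illustration. Only the alert wording follows the log line shown in this description.

```python
import time


class HeartbeatWatchdog:
    """Hypothetical sketch: remember the last heartbeat per process and
    report processes that have been silent for too long."""

    def __init__(self, timeout_seconds=60.0, clock=time.monotonic):
        # timeout_seconds is an assumed value, not taken from the PR.
        self.timeout_seconds = timeout_seconds
        self.clock = clock
        self.last_heartbeat = {}

    def heartbeat(self, process_name):
        # Called whenever a heartbeat message arrives from the process.
        self.last_heartbeat[process_name] = self.clock()

    def stuck_processes(self):
        # Map each silent process to the seconds elapsed since its last heartbeat.
        now = self.clock()
        return {
            name: now - seen
            for name, seen in self.last_heartbeat.items()
            if now - seen >= self.timeout_seconds
        }


def format_alert(process_name, namespace, elapsed_seconds):
    # Mirrors the wording of the log line quoted in this description.
    minutes = elapsed_seconds / 60.0
    return "Process '{}' is stuck in namespace '{}' ({:.1f} minutes).".format(
        process_name, namespace, minutes
    )
```

A process that still exists but has stopped sending heartbeats shows up in `stuck_processes()`, which is exactly the case a pure existence check misses.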

**How I verified it**
All unit tests pass.

Ran process_monitoring/test_critical_process_monitoring.py manually; it passes.

Added a new test in https://github.com/sonic-net/sonic-mgmt/pull/8306 to check that the watchdog works correctly.

Manual test: after pausing orchagent with 'kill -STOP <pid>', the following message appears in the log:

Apr 28 23:36:41.504923 vlab-01 ERR swss#supervisor-proc-watchdog-listener: Process 'orchagent' is stuck in namespace 'host' (1.0 minutes).
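
The manual check above could be scripted roughly as follows. This is a sketch under assumptions: the function names are hypothetical, the caller has to supply the orchagent PID and the log lines, and the regular expression is derived only from the single log line quoted above.

```python
import os
import re
import signal

# Pattern derived from the one log line quoted in this description (assumption).
STUCK_PATTERN = re.compile(
    r"Process '(?P<process>[^']+)' is stuck in namespace "
    r"'(?P<namespace>[^']+)' \((?P<minutes>[0-9.]+) minutes\)\."
)


def pause_process(pid):
    # Equivalent to 'kill -STOP <pid>': the process stays alive but stops running.
    os.kill(pid, signal.SIGSTOP)


def resume_process(pid):
    # Equivalent to 'kill -CONT <pid>': lets the paused process continue.
    os.kill(pid, signal.SIGCONT)


def find_stuck_alerts(log_lines):
    # Return (process, namespace, minutes) for every stuck alert found.
    alerts = []
    for line in log_lines:
        match = STUCK_PATTERN.search(line)
        if match:
            alerts.append(
                (match.group("process"), match.group("namespace"), float(match.group("minutes")))
            )
    return alerts
```

After calling `pause_process` and waiting for the watchdog interval to pass, feeding the syslog lines to `find_stuck_alerts` should return an entry for orchagent; `resume_process` then restores normal operation.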

**Details if related**
Heartbeat message PR: https://github.com/sonic-net/sonic-swss/pull/2737
UT PR: https://github.com/sonic-net/sonic-mgmt/pull/8306
2023-06-12 17:53:54 -07:00
docker-base [infra] Support syslog rate limit configuration (#12490) 2022-12-20 10:53:58 +02:00
docker-base-bullseye [infra] Support syslog rate limit configuration (#12490) 2022-12-20 10:53:58 +02:00
docker-base-buster [infra] Support syslog rate limit configuration (#12490) 2022-12-20 10:53:58 +02:00
docker-base-stretch [infra] Support syslog rate limit configuration (#12490) 2022-12-20 10:53:58 +02:00
docker-basic_router [supervisord]: use abspath as supervisord entrypoint (#5995) 2020-11-22 21:18:44 -08:00
docker-config-engine Install python-redis package to docker containers (#14632) 2023-04-19 18:14:48 -07:00
docker-config-engine-bullseye Install python-redis package to docker containers (#14632) 2023-04-19 18:14:48 -07:00
docker-config-engine-buster Install python-redis package to docker containers (#14632) 2023-04-19 18:14:48 -07:00
docker-config-engine-stretch Install python-redis package to docker containers (#14632) 2023-04-19 18:14:48 -07:00
docker-database [chassis] Fixed critical process not correct for database-chassis docker (#13445) 2023-01-20 10:21:48 -08:00
docker-dhcp-relay modify commands using utilities_common.cli.run_command and advance sonic-utilities submodule on master (#15193) 2023-06-05 17:08:13 +08:00
docker-eventd Add events to host and create rsyslog_plugin deb pkg (#12059) 2022-09-21 09:20:53 -07:00
docker-fpm-frr updated internal route policy for chassis-packet (#15349) 2023-06-07 09:17:44 -07:00
docker-fpm-gobgp Parallel building of sonic dockers using native dockerd(dood). (#10352) 2022-04-28 08:39:37 +08:00
docker-iccpd [infra] Support syslog rate limit configuration (#12490) 2022-12-20 10:53:58 +02:00
docker-lldp [lldpmgrd] Don't log error message for outdated event (#14178) 2023-03-16 18:15:50 +02:00
docker-macsec modify commands using utilities_common.cli.run_command and advance sonic-utilities submodule on master (#15193) 2023-06-05 17:08:13 +08:00
docker-mux [infra] Support syslog rate limit configuration (#12490) 2022-12-20 10:53:58 +02:00
docker-nat [nat] Switch to bullseye (#14495) 2023-04-02 14:02:33 -07:00
docker-orchagent Add watchdog mechanism to swss service and generate alert when swss have issue. (#15429) 2023-06-12 17:53:54 -07:00
docker-pde [infra] Support syslog rate limit configuration (#12490) 2022-12-20 10:53:58 +02:00
docker-platform-monitor [pmon]: Import requests libraries for Ragile platform (#13171) 2023-01-07 21:12:13 -08:00
docker-ptf [Build] Fix the mirror gpg key expired issue (#14206) 2023-03-13 11:13:21 +08:00
docker-ptf-sai [SAI PTF] SAI PTF docker support sai-ptf v2 (#12719) 2022-11-17 04:42:51 -08:00
docker-router-advertiser Fix radv.conf traceback when VLAN_INTERFACE is not defined (#12034) 2022-09-09 12:54:05 -07:00
docker-sflow [sflow] Switch to bullseye (#14494) 2023-04-03 09:49:35 -07:00
docker-snmp Add monit_snmp file to monitor memory usage (#14464) 2023-04-06 12:19:11 -07:00
docker-sonic-mgmt Add AZP agent necessary packages to sonic-mgmt-docker (#14291) 2023-03-21 08:09:44 +08:00
docker-sonic-mgmt-framework [mgmt-framework] Fix rest-server startup script (#14979) 2023-05-22 17:42:38 -07:00
docker-sonic-p4rt Update p4rt configuration to match SONiC upstream schema. (#10725) 2022-08-04 14:56:48 -07:00
docker-sonic-restapi [infra] Support syslog rate limit configuration (#12490) 2022-12-20 10:53:58 +02:00
docker-sonic-sdk [Bullseye] Upgrade sonic-sdk image to bullseye (#12649) 2022-11-28 18:57:26 +02:00
docker-sonic-sdk-buildenv Parallel building of sonic dockers using native dockerd(dood). (#10352) 2022-04-28 08:39:37 +08:00
docker-sonic-telemetry Add idle conn duration config to telemetry.sh (#14903) 2023-05-04 16:47:02 -07:00
docker-swss-layer-bullseye Add ping to swss-layer docker (#11093) 2022-06-10 07:40:37 -07:00
docker-swss-layer-buster Add ping to swss-layer docker (#11093) 2022-06-10 07:40:37 -07:00
docker-teamd [infra] Support syslog rate limit configuration (#12490) 2022-12-20 10:53:58 +02:00
dockerfile-macros.j2 [sonic-config-engine] Clean up dependencies, pin versions; install Python 3 package in Buster container (#5656) 2020-10-26 13:48:50 -07:00