sonic-buildimage/files/build_templates
yozhao101 04cd1d61e8
[Monit] Monitoring the running status of containers. (#6251)
**- Why I did it**
This PR monitors the running status of each container. The auto-restart feature is currently enabled: if a critical process exits unexpectedly, the container is restarted. If the container is restarted 3 times within 20 minutes, it will not run again unless the flag is cleared manually with the command `sudo systemctl reset-failed <container_name>`.

**- How I did it**
We employ Monit to monitor a script. The script generates the list of containers that are expected to be running and compares it with the containers that are currently running. If any expected container is not running, an alert message is written to syslog.
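
The comparison step can be illustrated with a minimal Python sketch. This is not the script added by the PR; the function names (`get_running_containers`, `find_missing_containers`, `report_missing`) and the message text are hypothetical, and it assumes the running containers can be listed with `docker ps --format '{{.Names}}'`.

```python
import subprocess
import syslog


def get_running_containers():
    """Return the names of running containers as reported by `docker ps`.

    Assumption: the docker CLI is available on the device.
    """
    output = subprocess.check_output(
        ["docker", "ps", "--format", "{{.Names}}"], text=True
    )
    return {name for name in output.splitlines() if name}


def find_missing_containers(expected, running):
    """Return the expected containers that are not running, sorted by name."""
    return sorted(set(expected) - set(running))


def report_missing(missing, log=None):
    """Write one alert line if any container is missing; return True if so.

    `log` is a hypothetical hook so the function can be exercised without
    touching syslog; by default the message goes to syslog at LOG_ERR.
    """
    if log is None:
        log = lambda msg: syslog.syslog(syslog.LOG_ERR, msg)
    if missing:
        log("Expected containers not running: " + ", ".join(missing))
    return bool(missing)
```

In this sketch the expected list is passed in by the caller; how that list is derived (for example from the enabled features, or per ASIC on a multi-ASIC device) is left out because the commit message does not specify it.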

**- How to verify it**
I tested this feature on two lab devices: `str-a7050-acs-3`, which has a single ASIC, and `str2-n3164-acs-3`, which is multi-ASIC. First I manually stopped a container by running `sudo systemctl stop <container_name>`, then I checked whether an alert message appeared in syslog.

Signed-off-by: Yong Zhao <yozhao@microsoft.com>
2021-01-07 19:52:22 -08:00
..
per_namespace Move teamd warm reboot code to service script (#5163) 2020-11-13 13:34:18 -08:00
share_image [ChassisDB]: bring up ChassisDB service (#5283) 2020-10-14 15:15:24 -07:00
arp_update_vars.j2 [swss] Enhance ARP Update to Call Sonic Cfggen Once (#5398) 2020-09-18 18:44:23 -07:00
buffers_config.j2 [Dynamic buffer calc] Support dynamic buffer calculation (#6194) 2020-12-13 11:35:39 -08:00
config-chassisdb.service.j2 [ChassisDB]: bring up ChassisDB service (#5283) 2020-10-14 15:15:24 -07:00
config-setup.service.j2 [config-setup]: create a SONiC configuration management service (#3227) 2019-12-04 07:15:58 -08:00
database.service.j2 Multi-ASIC implementation (#3888) 2020-03-31 10:06:19 -07:00
dhcp_relay.service.j2 [services] Remove explicit dependencies from dhcp_relay service file, control in swss.sh (#3823) 2019-11-26 16:59:45 -08:00
docker_image_ctl.j2 First cut image update for kubernetes support. (#5421) 2020-12-22 08:01:33 -08:00
gbsyncd.service.j2 Add gearbox phy device files and a new physyncd docker to support VS gearbox phy feature (#4851) 2020-09-25 08:32:44 -07:00
iccpd.service.j2 MCLAG feature for SONIC (#2514) 2020-04-04 15:24:06 -07:00
init_cfg.json.j2 [init_cfg]: allow enable/disable swss/teamd/syncd services (#6291) 2020-12-28 10:33:46 -08:00
kube_cni.10-flannel.conflist First cut image update for kubernetes support. (#5421) 2020-12-22 08:01:33 -08:00
lldp.service.j2 Changes for LLDP docker to support multi-npu platforms (#4530) 2020-05-11 11:05:44 -07:00
mgmt-framework.service.j2 [services][mgmt-framework] delay mgmt-framework service on boot (#5226) 2020-08-27 21:53:58 +03:00
mgmt-framework.timer [services][mgmt-framework] delay mgmt-framework service on boot (#5226) 2020-08-27 21:53:58 +03:00
nat.service.j2 [services] remove swss from WantedBy for nat service (#4991) 2020-07-19 21:50:26 -07:00
organization_extensions.sh Framework to plugin Organization specific scripts during ONIE Image build (#951) 2017-09-19 16:23:31 -07:00
pcie-check.timer Add pcie-check service to check PCIe devices at boot (#4771) 2020-07-13 14:15:09 -07:00
pmon.service.j2 [psud]: Fix for psud crash because of database connection reset (#3647) 2020-01-10 13:26:04 -08:00
qos_config.j2 Make backend device checking more robust (#5730) 2020-11-10 15:06:35 -08:00
radv.service.j2 Move RADV fastboot handling to a service script (#5108) 2020-08-11 13:13:13 -07:00
restapi.service.j2 Start RestAPI container when sonic boots (#4140) 2020-02-12 16:38:45 -08:00
sflow.service.j2 [sflow] Fix race-condition seen with mVRF configured (#6102) 2020-12-03 01:33:10 -08:00
snmp.service.j2 [services] make snmp.timer work again and delay telemetry.service (#3742) 2019-12-16 09:07:05 -08:00
snmp.timer [build_templates]: Start SNMP timer after SWSS service (#6195) 2020-12-16 16:39:14 -08:00
sonic_debian_extension.j2 [Monit] Monitoring the running status of containers. (#6251) 2021-01-07 19:52:22 -08:00
swss_vars.j2 Enable synchronous mode by default and add in minigraph parser (#5735) 2020-10-29 09:15:12 -07:00
telemetry.service.j2 [services] make snmp.timer work again and delay telemetry.service (#3742) 2019-12-16 09:07:05 -08:00
telemetry.timer [services] make snmp.timer work again and delay telemetry.service (#3742) 2019-12-16 09:07:05 -08:00
updategraph.service.j2 [config-setup]: create a SONiC configuration management service (#3227) 2019-12-04 07:15:58 -08:00