sonic-buildimage/dockers/docker-platform-monitor/base_image_files/monit_pmon.j2
yozhao101 a8d2d0b5cd
[201911][Monit] Monitor critical processes in PMon container. (#7438)
Signed-off-by: Yong Zhao <yozhao@microsoft.com>

Why I did it
This PR uses Monit to monitor the critical processes in the PMon container on the 201911 branch.

How I did it
I created a Monit configuration template. The service generate_monit_config.service renders it
to generate the Monit configuration file for the PMon container.
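
The rendering step comes down to evaluating one condition per daemon against a set of template variables. The following is a minimal Python sketch of that selection logic, mirroring the `{% if %}` conditions in the template in this file. The `RULES` table and the `monitored_daemons` helper are hypothetical names introduced for illustration; this is not the code that generate_monit_config.service runs, and where the variable values come from is not covered here.

```python
# Hypothetical sketch: which daemons would get a Monit check for a given
# set of template variables. Mirrors the conditions in monit_pmon.j2.

# (daemon name, skip flag, config variable that must equal 1, or None)
RULES = [
    ("fancontrol", "skip_fancontrol", "HAVE_FANCONTROL_CONF"),
    ("ledd", "skip_ledd", None),
    ("psud", "skip_psud", None),
    ("sensord", "skip_sensors", "HAVE_SENSORS_CONF"),
    ("syseepromd", "skip_syseepromd", None),
    ("thermalctld", "skip_thermalctld", None),
    ("xcvrd", "skip_xcvrd", None),
]


def monitored_daemons(ctx):
    """Return the daemons for which the template would emit a check.

    ctx maps template variable names to values. A missing key is treated
    as unset, so the daemon is not skipped, but a missing HAVE_*_CONF
    variable does not satisfy the `== 1` test.
    """
    selected = []
    for daemon, skip_flag, conf_var in RULES:
        if ctx.get(skip_flag):
            continue  # `not skip_<daemon>` is false
        if conf_var is not None and ctx.get(conf_var) != 1:
            continue  # `HAVE_<X>_CONF == 1` is false
        selected.append(daemon)
    return selected


if __name__ == "__main__":
    # Example platform: sensors.conf present, no fancontrol config, ledd skipped.
    print(monitored_daemons({"HAVE_SENSORS_CONF": 1, "skip_ledd": True}))
```

Keeping the conditions in a table makes it easy to see that only fancontrol and sensord depend on a configuration file being present, while the other daemons are monitored unless explicitly skipped.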

How to verify it
I verified this on a Mellanox device str-msn2700-03 and an Arista device str-a7050-acs-1.

Which release branch to backport (provide reason below if selected)
[ ] 201811
[x] 201911
[ ] 202006
[ ] 202012
2021-04-28 17:12:21 -07:00

62 lines
2.3 KiB
Django/Jinja

{# This template is used to generate the Monit configuration file for the platform monitor container -#}
###############################################################################
## Monit configuration file for PMon container
## process list:
{% if not skip_fancontrol and HAVE_FANCONTROL_CONF == 1 %}
## fancontrol
{% endif %}
{% if not skip_ledd %}
## ledd
{% endif %}
{% if not skip_psud %}
## psud
{% endif %}
{% if not skip_sensors and HAVE_SENSORS_CONF == 1 %}
## sensord
{% endif %}
{% if not skip_syseepromd %}
## syseepromd
{% endif %}
{% if not skip_thermalctld %}
## thermalctld
{% endif %}
{% if not skip_xcvrd %}
## xcvrd
{% endif %}
###############################################################################
{% if not skip_fancontrol and HAVE_FANCONTROL_CONF == 1 %}
check program pmon|fancontrol with path "/usr/bin/process_checker pmon /bin/bash /usr/sbin/fancontrol"
    if status != 0 for 5 times within 5 cycles then alert repeat every 1 cycles
{% endif %}
{% if not skip_ledd %}
check program pmon|ledd with path "/usr/bin/process_checker pmon /usr/bin/python /usr/bin/ledd"
    if status != 0 for 5 times within 5 cycles then alert repeat every 1 cycles
{% endif %}
{% if not skip_psud %}
check program pmon|psud with path "/usr/bin/process_checker pmon /usr/bin/python /usr/bin/psud"
    if status != 0 for 5 times within 5 cycles then alert repeat every 1 cycles
{% endif %}
{% if not skip_sensors and HAVE_SENSORS_CONF == 1 %}
check program pmon|sensord with path "/usr/bin/process_checker pmon /usr/sbin/sensord -f daemon"
    if status != 0 for 5 times within 5 cycles then alert repeat every 1 cycles
{% endif %}
{% if not skip_syseepromd %}
check program pmon|syseepromd with path "/usr/bin/process_checker pmon /usr/bin/python /usr/bin/syseepromd"
    if status != 0 for 5 times within 5 cycles then alert repeat every 1 cycles
{% endif %}
{% if not skip_thermalctld %}
check program pmon|thermalctld with path "/usr/bin/process_checker pmon /usr/bin/python /usr/bin/thermalctld"
    if status != 0 for 5 times within 5 cycles then alert repeat every 1 cycles
{% endif %}
{% if not skip_xcvrd %}
check program pmon|xcvrd with path "/usr/bin/process_checker pmon /usr/bin/python /usr/bin/xcvrd"
    if status != 0 for 5 times within 5 cycles then alert repeat every 1 cycles
{%- endif -%}
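
Each rule above alerts when the checker program exits non-zero `for 5 times within 5 cycles`. The Python sketch below is a simplified model of that threshold, not Monit's implementation. It assumes the condition holds whenever at least `times` of the last `within` polling cycles failed, and that `repeat every 1 cycles` means the alert is raised again on every cycle while the condition still holds. The function name `alert_cycles` is hypothetical.

```python
from collections import deque


def alert_cycles(statuses, times=5, within=5):
    """Return the 1-based cycle numbers at which the threshold is met.

    statuses is the sequence of exit codes returned by the checker program,
    one per Monit polling cycle. This is a simplified sliding-window model.
    """
    window = deque(maxlen=within)  # failure flags for the last `within` cycles
    fired = []
    for cycle, status in enumerate(statuses, start=1):
        window.append(status != 0)
        if sum(window) >= times:
            fired.append(cycle)  # assumed: alert repeats while condition holds
    return fired


if __name__ == "__main__":
    # Daemon down for seven consecutive cycles: threshold reached at cycle 5.
    print(alert_cycles([1, 1, 1, 1, 1, 1, 1]))
    # One successful check at cycle 5 restarts the count of consecutive failures.
    print(alert_cycles([1, 1, 1, 1, 0, 1, 1, 1, 1, 1]))
```

Under this model, with both numbers set to 5, a single successful check inside the window delays the alert until five consecutive failures have accumulated again, which filters out short restarts of a daemon.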