a8d2d0b5cd
Signed-off-by: Yong Zhao <yozhao@microsoft.com>

Why I did it
This PR uses Monit to monitor the critical processes in the PMon container on the 201911 branch.

How I did it
I created a Monit configuration template. The generate_monit_config.service service renders it to produce the Monit configuration file for the PMon container.

How to verify it
I verified this on a Mellanox device (str-msn2700-03) and an Arista device (str-a7050-acs-1).

Which release branch to backport (provide reason below if selected)
[ ] 201811
[x] 201911
[ ] 202006
[ ] 202012
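As a rough illustration of what the rendering step produces, here is a sketch of the generated Monit configuration for a hypothetical platform where only ledd and xcvrd are enabled (that is, skip_fancontrol, skip_psud, skip_sensors, skip_syseepromd and skip_thermalctld are all assumed to be true). The flag values are assumptions for this example, and the exact blank lines and indentation depend on how the renderer is configured to trim template blocks:

```
###############################################################################
## Monit configuration file for PMon container
## process list:
## ledd
## xcvrd
###############################################################################
check program pmon|ledd with path "/usr/bin/process_checker pmon /usr/bin/python /usr/bin/ledd"
    if status != 0 for 5 times within 5 cycles then alert repeat every 1 cycles

check program pmon|xcvrd with path "/usr/bin/process_checker pmon /usr/bin/python /usr/bin/xcvrd"
    if status != 0 for 5 times within 5 cycles then alert repeat every 1 cycles
```

Each stanza runs process_checker against one daemon and raises an alert once the check has returned a non-zero status in 5 of the last 5 cycles, repeating the alert every cycle while the failure persists.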
62 lines
2.3 KiB
Django/Jinja
{# This template file is used to generate Monit configuration file of platform monitor container -#}

###############################################################################
## Monit configuration file for PMon container
## process list:
{% if not skip_fancontrol and HAVE_FANCONTROL_CONF == 1 %}
## fancontrol
{% endif %}
{% if not skip_ledd %}
## ledd
{% endif %}
{% if not skip_psud %}
## psud
{% endif %}
{% if not skip_sensors and HAVE_SENSORS_CONF == 1 %}
## sensord
{% endif %}
{% if not skip_syseepromd %}
## syseepromd
{% endif %}
{% if not skip_thermalctld %}
## thermalctld
{% endif %}
{% if not skip_xcvrd %}
## xcvrd
{% endif %}
###############################################################################
{% if not skip_fancontrol and HAVE_FANCONTROL_CONF == 1 %}
check program pmon|fancontrol with path "/usr/bin/process_checker pmon /bin/bash /usr/sbin/fancontrol"
    if status != 0 for 5 times within 5 cycles then alert repeat every 1 cycles
{% endif %}

{% if not skip_ledd %}
check program pmon|ledd with path "/usr/bin/process_checker pmon /usr/bin/python /usr/bin/ledd"
    if status != 0 for 5 times within 5 cycles then alert repeat every 1 cycles
{% endif %}

{% if not skip_psud %}
check program pmon|psud with path "/usr/bin/process_checker pmon /usr/bin/python /usr/bin/psud"
    if status != 0 for 5 times within 5 cycles then alert repeat every 1 cycles
{% endif %}

{% if not skip_sensors and HAVE_SENSORS_CONF == 1 %}
check program pmon|sensord with path "/usr/bin/process_checker pmon /usr/sbin/sensord -f daemon"
    if status != 0 for 5 times within 5 cycles then alert repeat every 1 cycles
{% endif %}

{% if not skip_syseepromd %}
check program pmon|syseepromd with path "/usr/bin/process_checker pmon /usr/bin/python /usr/bin/syseepromd"
    if status != 0 for 5 times within 5 cycles then alert repeat every 1 cycles
{% endif %}

{% if not skip_thermalctld %}
check program pmon|thermalctld with path "/usr/bin/process_checker pmon /usr/bin/python /usr/bin/thermalctld"
    if status != 0 for 5 times within 5 cycles then alert repeat every 1 cycles
{% endif %}

{% if not skip_xcvrd %}
check program pmon|xcvrd with path "/usr/bin/process_checker pmon /usr/bin/python /usr/bin/xcvrd"
    if status != 0 for 5 times within 5 cycles then alert repeat every 1 cycles
{%- endif -%}