fd22b3bcee
Run the VNET route consistency check periodically. On any failure, monit raises an alert based on the return code. The tool logs the required details.
39 lines
1.8 KiB
Plaintext
###############################################################################
## Monit configuration for SONiC host OS
##
## This includes system-level monitoring as well as processes which
## run in the host OS (i.e., not inside a Docker container)
###############################################################################

check filesystem root-overlay with path /
    if space usage > 90% for 10 times within 20 cycles then alert repeat every 1 cycles

check filesystem var-log with path /var/log
    if space usage > 90% for 10 times within 20 cycles then alert repeat every 1 cycles

check system $HOST
    if memory usage > 90% for 10 times within 20 cycles then alert repeat every 1 cycles
    if cpu usage (user) > 90% for 10 times within 20 cycles then alert repeat every 1 cycles
    if cpu usage (system) > 90% for 10 times within 20 cycles then alert repeat every 1 cycles

check process rsyslog with pidfile /var/run/rsyslogd.pid
    start program = "/bin/systemctl start rsyslog.service"
    stop program = "/bin/systemctl stop rsyslog.service"
    if totalmem > 800 MB for 10 times within 20 cycles then restart

# route_check.py: verifies that routes in APPL-DB and ASIC-DB are in sync.
# On any discrepancy, details are logged and a non-zero code is returned,
# which triggers a monit alert.
# Hence, for any discrepancy, there will be "ERR"-level log messages
# from both route_check.py and monit.
#
check program routeCheck with path "/usr/bin/route_check.py"
    every 5 cycles
    if status != 0 for 3 cycles then alert repeat every 1 cycles

# vnet_route_check.py: verifies VNET route consistency between SONiC and vendor SDK DBs.
check program vnetRouteCheck with path "/usr/bin/vnet_route_check.py"
    every 5 cycles
    if status != 0 for 3 cycles then alert repeat every 1 cycles
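
# Illustrative sketch only, not part of the shipped configuration: both route
# checks above follow the same pattern, so a further consistency check could
# be wired in the same way. The name exampleCheck and the path
# /usr/bin/example_check.py are hypothetical placeholders. The script is
# assumed to log its findings and exit with a non-zero code on a discrepancy.
# The block is left commented out so that it does not change monit's behavior.
#
# check program exampleCheck with path "/usr/bin/example_check.py"
#     every 5 cycles
#     if status != 0 for 3 cycles then alert repeat every 1 cycles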