3af05fdffe
Signed-off-by: Yong Zhao <yozhao@microsoft.com>

Why I did it
This PR monitors the memory usage of the streaming telemetry container and restarts the container if its memory usage exceeds a pre-defined threshold.

How I did it
I used the system tool Monit to run a script, memory_checker, which periodically checks the memory usage of the streaming telemetry container. If the memory usage of the telemetry container exceeds the pre-defined threshold 10 times within 20 cycles, an alert message is written to syslog and Monit runs the script restart_service to restart the streaming telemetry container.

How to verify it
I verified this implementation on device str-7260cx3-acs-1.
15 lines
828 B
Plaintext
###############################################################################
## Monit configuration for telemetry container
## process list:
##  telemetry
##  dialout_client
###############################################################################
check program telemetry|telemetry with path "/usr/bin/process_checker telemetry /usr/sbin/telemetry"
    if status != 0 for 5 times within 5 cycles then alert repeat every 1 cycles

check program telemetry|dialout_client with path "/usr/bin/process_checker telemetry /usr/sbin/dialout_client_cli"
    if status != 0 for 5 times within 5 cycles then alert repeat every 1 cycles

check program container_memory_telemetry with path "/usr/bin/memory_checker telemetry 419430400"
    if status == 3 for 10 times within 20 cycles then exec "/usr/bin/restart_service telemetry"
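
The "10 times within 20 cycles" rule in the last check can be read as a sliding-window count. The sketch below illustrates that reading only; it is not Monit's implementation or the memory_checker script, neither of which is shown here. The class name `WindowedTrigger` and the constant `THRESHOLD_BYTES` are hypothetical, and the sketch assumes the threshold argument 419430400 is in bytes (which would be 400 MiB) and that exit status 3 means "over threshold", as the `status == 3` test suggests.

```python
from collections import deque

# Assumption: the memory_checker argument is a byte count; 400 MiB = 419430400.
THRESHOLD_BYTES = 400 * 1024 * 1024


class WindowedTrigger:
    """Hypothetical model of 'for <times> times within <cycles> cycles'."""

    def __init__(self, times: int, cycles: int) -> None:
        self.times = times
        # Keep only the most recent `cycles` observations.
        self.window = deque(maxlen=cycles)

    def observe(self, over_threshold: bool) -> bool:
        """Record one cycle; return True when the rule would fire."""
        self.window.append(over_threshold)
        return sum(self.window) >= self.times


def simulate(samples_bytes, times=10, cycles=20):
    """Return the 0-based cycle index at which the rule first fires, or None."""
    trigger = WindowedTrigger(times, cycles)
    for index, usage in enumerate(samples_bytes):
        if trigger.observe(usage > THRESHOLD_BYTES):
            return index
    return None
```

Under this reading, the ten over-threshold cycles do not need to be consecutive: a container that alternates above and below the limit would still trigger the restart once ten of the last twenty checks were over the limit.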