b8ad0ed4e4
**- Why I did it**
After discussing with Joe, we use the string "/usr/bin/syncd\s" in the Monit configuration file to monitor the syncd process on Broadcom and Mellanox. Due to my carelessness, I did not find this bug during the previous testing.

If we use the string "/usr/bin/syncd" in the Monit configuration file to monitor the syncd process, Monit cannot detect whether the syncd process is running. If we run the command `sudo monit procmatch "/usr/bin/syncd"` on Broadcom, three processes in the syncd container match "/usr/bin/syncd": `/bin/bash /usr/bin/syncd.sh wait`, `/usr/bin/dsserve /usr/bin/syncd --diag -u -p /etc/sai.d/sai.profile` and `/usr/bin/syncd --diag -u -p /etc/sai.d/sai.profile`. Monit selects the process with the highest uptime (here `/bin/bash /usr/bin/syncd.sh wait`) and does not select `/usr/bin/syncd --diag -u -p /etc/sai.d/sai.profile`. Similarly, on Mellanox Monit also selects the process with the highest uptime (here `/bin/bash /usr/bin/syncd.sh wait`) and does not select `/usr/bin/syncd --diag -u -p /etc/sai.d/sai.profile`. That is why Monit is unable to detect whether the syncd process is running if we use the string "/usr/bin/syncd" in the Monit configuration file.

If we use the string "/usr/bin/syncd\s" in the Monit configuration file, Monit filters out the process `/bin/bash /usr/bin/syncd.sh wait` and can therefore monitor the syncd process correctly.

**- How I did it**

**- How to verify it**

Signed-off-by: Yong Zhao <yozhao@microsoft.com>
8 lines
339 B
Plaintext
###############################################################################
## Monit configuration for syncd container
## process list:
## syncd
###############################################################################
check process syncd matching "/usr/bin/syncd\s"
if does not exist for 5 times within 5 cycles then alert
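
The effect of the trailing `\s` described in the commit message can be checked with a small sketch. It uses Python's `re` module as a stand-in for Monit's own regular-expression matching, which is an assumption; the command lines are transcribed from the commit message with the option dash written as `--diag` and the profile path as `sai.profile`, and the helper name `matching` is hypothetical.

```python
import re

# Command lines quoted in the commit message (Broadcom syncd container).
# Transcription assumption: "--diag" and "sai.profile" spellings.
CMDLINES = [
    "/bin/bash /usr/bin/syncd.sh wait",
    "/usr/bin/dsserve /usr/bin/syncd --diag -u -p /etc/sai.d/sai.profile",
    "/usr/bin/syncd --diag -u -p /etc/sai.d/sai.profile",
]


def matching(pattern, cmdlines=CMDLINES):
    """Return the command lines in which `pattern` is found anywhere."""
    regex = re.compile(pattern)
    return [cmd for cmd in cmdlines if regex.search(cmd)]


if __name__ == "__main__":
    # Compare the old pattern with the one used in this configuration file.
    for pattern in (r"/usr/bin/syncd", r"/usr/bin/syncd\s"):
        print(pattern)
        for cmd in matching(pattern):
            print("    " + cmd)
```

Under these assumptions the plain pattern is found in all three command lines, while the pattern with `\s` no longer matches `/bin/bash /usr/bin/syncd.sh wait`, because `syncd` is followed there by `.` rather than whitespace. The `dsserve` command line still contains `/usr/bin/syncd` followed by a space, so it continues to match in this sketch.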