b8ad0ed4e4
**- Why I did it** After discussing with Joe, we use the string "/usr/bin/syncd\s" in the Monit configuration file to monitor the syncd process on Broadcom and Mellanox. Through my own carelessness, I did not find this bug during the previous testing. If we use the string "/usr/bin/syncd" in the Monit configuration file, Monit cannot detect whether the syncd process is running. Running the command `sudo monit procmatch "/usr/bin/syncd"` on Broadcom shows three processes in the syncd container that match "/usr/bin/syncd": `/bin/bash /usr/bin/syncd.sh wait`, `/usr/bin/dsserve /usr/bin/syncd --diag -u -p /etc/sai.d/sai.profile` and `/usr/bin/syncd --diag -u -p /etc/sai.d/sai.profile`. Monit selects the matching process with the highest uptime (here `/bin/bash /usr/bin/syncd.sh wait`) and does not select `/usr/bin/syncd --diag -u -p /etc/sai.d/sai.profile`. Similarly, on Mellanox, Monit also selects the process with the highest uptime (again `/bin/bash /usr/bin/syncd.sh wait`) and does not select `/usr/bin/syncd --diag -u -p /etc/sai.d/sai.profile`. That is why Monit is unable to detect whether the syncd process is running if we use the string "/usr/bin/syncd" in the Monit configuration file. If we use the string "/usr/bin/syncd\s" instead, the pattern no longer matches `/bin/bash /usr/bin/syncd.sh wait`, so Monit can correctly monitor the syncd process. **- How I did it** **- How to verify it** Signed-off-by: Yong Zhao <yozhao@microsoft.com> |
||
---|---|---|
base_image_files | ||
critical_processes | ||
Dockerfile.j2 | ||
supervisord.conf |
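
The effect of the trailing `\s` described in the commit message can be illustrated with a minimal Python sketch. It is only an approximation: Python's `re` module stands in for Monit's own regular-expression matching, the helper name `matching` is hypothetical, and the command lines are adapted from the commit message. It does not model Monit's uptime-based selection.

```python
import re

# Command lines adapted from the commit message (Broadcom syncd container).
CMDLINES = [
    "/bin/bash /usr/bin/syncd.sh wait",
    "/usr/bin/dsserve /usr/bin/syncd --diag -u -p /etc/sai.d/sai.profile",
    "/usr/bin/syncd --diag -u -p /etc/sai.d/sai.profile",
]


def matching(pattern, cmdlines=CMDLINES):
    """Return the command lines in which the pattern is found anywhere.

    Hypothetical helper: an unanchored search, used here as a stand-in
    for how a process-matching pattern is applied to a command line.
    """
    regex = re.compile(pattern)
    return [cmd for cmd in cmdlines if regex.search(cmd)]


# Without the trailing \s, the wrapper script syncd.sh also matches.
loose = matching(r"/usr/bin/syncd")

# With the trailing \s, "syncd" must be followed by whitespace,
# so "/usr/bin/syncd.sh" is excluded.
strict = matching(r"/usr/bin/syncd\s")

for cmd in strict:
    print(cmd)
```

In this sketch the looser pattern matches all three command lines, while the stricter pattern drops `/bin/bash /usr/bin/syncd.sh wait`, the process the commit message identifies as the wrong match.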