Signed-off-by: Yong Zhao yozhao@microsoft.com
Why I did it
Currently we leveraged the Supervisor to monitor the running status of critical processes in each container and it is more reliable and flexible than doing the monitoring by Monit. So we removed the functionality of monitoring the critical processes by Monit.
How I did it
I removed the script process_checker and corresponding Monit configuration entries of critical processes.
How to verify it
I verified this on the device str-7260cx3-acs-1.
Why/How I did:
Make sure first error syslog is triggered based on FAULT TOLERANCE condition.
Added support of repeat clause with alert action. This is used as trigger
for generation of periodic syslog error messages if error is persistent
Updated the monit conf files with repeat every x cycles for the alert action
We want to let Monit to unmonitor the processes in containers which are disabled in `FEATURE` table such that
Monit will not generate false alerting messages into the syslog.
Signed-off-by: Yong Zhao <yozhao@microsoft.com>
Signed-off-by: Arvindsrinivasan Lakshmi Narasimhan <arlakshm@microsoft.com>
This PR has changes to support accessing the bcmsh and bcmcmd utilities on multi ASIC devices
Changes done
- move the link of /var/run/sswsyncd from docker-syncd-brcm.mk to docker_image_ctl.j2
- update the bcmsh and bcmcmd scripts to take -n [ASIC_ID] as an argument on multi ASIC platforms
**- Why I did it**
After discussed with Joe, we use the string "/usr/bin/syncd\s" in Monit configuration file to monitor
syncd process on Broadcom and Mellanox. Due to my careless, I did not find this bug during the
previous testing. If we use the string "/usr/bin/syncd" in Monit configuration file to monitor the
syncd process, Monit will not detect whether syncd process is running or not.
If we ran the command `sudo monit procmactch “/usr/bin/syncd”` on Broadcom, there will be three
processes in syncd container which matched this "/usr/bin/syncd": `/bin/bash /usr/bin/syncd.sh
wait`, `/usr/bin/dsserve /usr/bin/syncd –diag -u -p /etc/sai.d/sai.profile` and `/usr/bin/syncd –diag -
u -p /etc/sai.d/said.profile`. Monit will select the processes with the highest uptime (at there
`/bin/bash /usr/bin/syncd.sh wait`) to match and did not select `/usr/bin/syncd –diag -u -p
/etc/sai.d/said.profile` to match.
Similarly, On Mellanox Monit will also select the process with the highest uptime (at there
`/bin/bash /usr/bin/syncd.sh wait`) to match and did not select `/usr/bin/syncd –diag -u -p
/etc/sai.d/said.profile` to match.
That is why Monit is unable to detect whether syncd process is running or not if we use the string “/usr/bin/syncd” in Monit configuration file. If we use the string "/usr/bin/syncd\s" in Monit configuration file, Monit can filter out the process `/bin/bash /usr/bin/syncd.sh wait` and thus can correctly monitor the syncd process.
**- How I did it**
**- How to verify it**
Signed-off-by: Yong Zhao <yozhao@microsoft.com>
Since the syncd process running on different platforms will have the different full path names, we
change the full path name of process syncd in the monit config file such that it will be universal and is not for a specific vendor.
Signed-off-by: Yong Zhao <yozhao@microsoft.com>
* Add a monit config file for teamd container.
Signed-off-by: Yong Zhao <yozhao@microsoft.com>
* Add a copy mechanism to put the monit config file in teamd container
into base image.
Signed-off-by: Yong Zhao <yozhao@microsoft.com>
* Add a monit config file for snmp container.
Signed-off-by: Yong Zhao <yozhao@microsoft.com>
* Add a copy mechanism to put the monit config file of snmp container into
the base image.
Signed-off-by: Yong Zhao <yozhao@microsoft.com>
* Add a monit config file for dhcp_relay container in the dir
base_image_files.
Signed-off-by: Yong Zhao <yozhao@microsoft.com>
* Add a copy mechanism to put the monit config file of dhcp_relay
container into base image under /etc/monit/conf.d.
Signed-off-by: Yong Zhao <yozhao@microsoft.com>
* Add a monit config file for router advertiser container.
Signed-off-by: Yong Zhao <yozhao@microsoft.com>
* Add a copy mechanism to put the monit config file of router advertiser
contianer into base image.
Signed-off-by: Yong Zhao <yozhao@microsoft.com>
* [Docker-Pmon] Add a monit config file for pmon container.
Signed-off-by: Yong Zhao <yozhao@microsoft.com>
* [Docker-Pmon] Add a copy mechanism to put the monit config file into the
base image.
Signed-off-by: Yong Zhao <yozhao@microsoft.com>
* [Docker-lldp] Add a monit config file for lldp container.
Signed-off-by: Yong Zhao <yozhao@microsoft.com>
* [Docker-lldp] Add a copy mechanism to put the monit config file into the
base image.
Signed-off-by: Yong Zhao <yozhao@microsoft.com>
* [Docker-bgp] Add a monit config file for BGP container.
Signed-off-by: Yong Zhao <yozhao@microsoft.com>
* [Docker-bgp] Add a copy mechanism to put monit config file into the base
image.
Signed-off-by: Yong Zhao <yozhao@microsoft.com>
* [Docker-swss] Add a monit config file for the swss container.
Signed-off-by: Yong Zhao <yozhao@microsoft.com>
* [Docker-swss] Add a copy mechanism to put monit config file into the
base image.
Signed-off-by: Yong Zhao <yozhao@microsoft.com>
* [Docker-syncd] Add a monit config file for syncd container on barefoot
platform.
Signed-off-by: Yong Zhao <yozhao@microsoft.com>
* [Docker-syncd] Add a copy mechanism to put the monit config file into
the base image on barefoot.
Signed-off-by: Yong Zhao <yozhao@microsoft.com>
* [Docker-syncd] Add a monit config file for syncd container on broadcom.
Signed-off-by: Yong Zhao <yozhao@microsoft.com>
* [Docker-syncd] Add a copy mechanism to put the monit config file into
the base image on broadcom.
Signed-off-by: Yong Zhao <yozhao@microsoft.com>
* [Docker-syncd] Add a monit config file for syncd container on cavium.
Signed-off-by: Yong Zhao <yozhao@microsoft.com>
* [Docker-syncd] Add a copy mechanism to put the monit config file into
the base image.
Signed-off-by: Yong Zhao <yozhao@microsoft.com>
* [Docker-centec] Add a monit config file for syncd container on centen
platform.
Signed-off-by: Yong Zhao <yozhao@microsoft.com>
* [Docker-syncd] Add a copy mechanism to put the monit config file into
the base image.
Signed-off-by: Yong Zhao <yozhao@microsoft.com>
* [Docker-syncd] Add a monit config file for syncd container on centen
platform.
Signed-off-by: Yong Zhao <yozhao@microsoft.com>
* [Docker-syncd] Add a copy mechanism to put the monit config file into
the base image.
Signed-off-by: Yong Zhao <yozhao@microsoft.com>
* [Docker-syncd] Add a monit config file for syncd container on marvell.
Signed-off-by: Yong Zhao <yozhao@microsoft.com>
* [Docker-syncd] Add a copy mechanism to put the monit conifg file into
the base image.
Signed-off-by: Yong Zhao <yozhao@microsoft.com>
* [Docker-syncd] Add a monit config file for syncd container on
marvell-arm64.
Signed-off-by: Yong Zhao <yozhao@microsoft.com>
* [Docker-syncd] Add a copy mechanism to put the monit config file into
the base image on marvell-arm64.
Signed-off-by: Yong Zhao <yozhao@microsoft.com>
* [Docker-syncd] Add a monit config file for syncd container on
marvell-armhf.
Signed-off-by: Yong Zhao <yozhao@microsoft.com>
* [Docker-syncd] Add a copy mechanism to put the monit config file into
the base image.
Signed-off-by: Yong Zhao <yozhao@microsoft.com>
* [Docker-syncd] Add a monit config file for syncd container on mellanox.
Signed-off-by: Yong Zhao <yozhao@microsoft.com>
* [Docker-syncd] Add a copy mechanism to put the monit config file into
the base image.
Signed-off-by: Yong Zhao <yozhao@microsoft.com>
* [Docker-syncd] Add a monit config file for syncd container on nephos.
Signed-off-by: Yong Zhao <yozhao@microsoft.com>
* [Docker-syncd] Add a copy mechanism to put the monit config file into
the base image.
Signed-off-by: Yong Zhao <yozhao@microsoft.com>
* [Docker-sflow] Add a monit config file for sflow container.
Signed-off-by: Yong Zhao <yozhao@microsoft.com>
* [Docker-sflow] Add a copy mechanism to put the monit conifg file into
the base image.
Signed-off-by: Yong Zhao <yozhao@microsoft.com>
* [Docker-telemetry] Add a monit config file for telemetry container.
Signed-off-by: Yong Zhao <yozhao@microsoft.com>
* [Docker-telemetry] Add a copy mechanism to put the monit config file
into the base image.
Signed-off-by: Yong Zhao <yozhao@microsoft.com>
* [Docker-database] Add a monit config file for database container.
Signed-off-by: Yong Zhao <yozhao@microsoft.com>
* [Docker-database] Add a copy mechanism to put the monit config file into
the base image.
Signed-off-by: Yong Zhao <yozhao@microsoft.com>
* [Docker-Dhcprelay] Change a typo.
Signed-off-by: Yong Zhao <yozhao@microsoft.com>
* [Docker-Dhcprelay] Change the process name in monit config file to
dhcrelay.
Signed-off-by: Yong Zhao <yozhao@microsoft.com>
* [Docker-syncd] There is no desserve process in syncd container on
barefoot.
Signed-off-by: Yong Zhao <yozhao@microsoft.com>
* [Docker-syncd] There is no process desserve in syncd container on
cavium.
Signed-off-by: Yong Zhao <yozhao@microsoft.com>
* [Docker-syncd] There is no process named desserve in syncd on centec.
Signed-off-by: Yong Zhao <yozhao@microsoft.com>
* [Docker-syncd] There is no process named desserve in syncd on marvell.
Signed-off-by: Yong Zhao <yozhao@microsoft.com>
* [Docker-syncd] Should not delete the process desserve in syncd container
on marvell.
Signed-off-by: Yong Zhao <yozhao@microsoft.com>
* [Docker-syncd] Delete the process dsserve in syncd on marvell.
Signed-off-by: Yong Zhao <yozhao@microsoft.com>
* [Docker-syncd] Delete the process dsserve in syncd container on
marvell-arm64.
Signed-off-by: Yong Zhao <yozhao@microsoft.com>
* [Docker-syncd] Delete the process dsserve in syncd container on
marvell-armhf.
Signed-off-by: Yong Zhao <yozhao@microsoft.com>
* [Docker-syncd] Delete the process dsserve in syncd container on
mellanox.
Signed-off-by: Yong Zhao <yozhao@microsoft.com>
* [Docker-Radv] Change the process name to radvd.
Signed-off-by: Yong Zhao <yozhao@microsoft.com>
* [Docker-telemetry] Correct a typo in monit_telemetry.
Signed-off-by: Yong Zhao <yozhao@microsoft.com>
* [Docker-teamd] Delete the monit config file for teamd.
Signed-off-by: Yong Zhao <yozhao@microsoft.com>
* [Docker-teamd] Delete the mechanism to copy the monit config file into
base image.
Signed-off-by: Yong Zhao <yozhao@microsoft.com>
* [Docker-dhcprelay] Delete the monit config file for dhcp_relay
container.
Signed-off-by: Yong Zhao <yozhao@microsoft.com>
* [Docker-dhcprelay] Delete the mechanism to copy the monit config file
into the base image.
Signed-off-by: Yong Zhao <yozhao@microsoft.com>
* [Docker-radv] Delete the monit config file foe radv container.
Signed-off-by: Yong Zhao <yozhao@microsoft.com>
* [Docker-radv] Delete the mechanism to copy the monit config file into
the base image.
Signed-off-by: Yong Zhao <yozhao@microsoft.com>
* [Docker-bgp] change the monit config file for BGP container such that
monit only generates alert if the process is not running for 5 minutes.
Signed-off-by: Yong Zhao <yozhao@microsoft.com>
* [Docker-snmp] Change the monit config file for snmp container such that
monit only generates alret if the process is not running for 5 minutes.
Signed-off-by: Yong Zhao <yozhao@microsoft.com>
* [Docker-pmon] Change the monit config file for pmon container such that
monit only generates alert if the processes are not running for 5
minutes.
Signed-off-by: Yong Zhao <yozhao@microsoft.com>
* [Docker-lldp] Change the monit config file for lldp container such that
monit only generates alerts if some processes are not running for 5
minutes.
Signed-off-by: Yong Zhao <yozhao@microsoft.com>
* [Docker-pmon] Delete the monit config file for pmon container since some
of processes are not running depended on the type of box.
Signed-off-by: Yong Zhao <yozhao@microsoft.com>
* [Docker-pmon] Delete the copy mechanism to copy the monit config file
into the base image.
Signed-off-by: Yong Zhao <yozhao@microsoft.com>
* [Docker-lldp] Change the matching name for the process lldpd.
Signed-off-by: Yong Zhao <yozhao@microsoft.com>
* [Docker-swss] Change the monit config file for swss container such that
monit only generates alerts if the processes are not running for 5
minutes.
Signed-off-by: Yong Zhao <yozhao@microsoft.com>
* [Docker-syncd] Change the monit config file for syncd container on
barefoot such that monit only generates alerts if the process is not
running for 5 minutes.
Signed-off-by: Yong Zhao <yozhao@microsoft.com>
* [Docker-syncd] Correct a typo in monit config file.
Signed-off-by: Yong Zhao <yozhao@microsoft.com>
* [Docker-syncd] Change the monit config file for syncd container on
broadcom such that monit only generates alerts if the processes are not
running for 5 minutes.
Signed-off-by: Yong Zhao <yozhao@microsoft.com>
* [Docker-syncd] Change the monit config file for syncd container on
cavium such that monit only generates alerts if the process is not
running for 5 minutes.
Signed-off-by: Yong Zhao <yozhao@microsoft.com>
* [Docker-syncd] Change the monit config file for syncd container such
that monit only generates alerts if the process is not running for 5
minutes.
Signed-off-by: Yong Zhao <yozhao@microsoft.com>
* [Docker-syncd] Change the monit config file for syncd container on
marvell such that monit only generates alerts if the process is not
running for 5 minutes.
Signed-off-by: Yong Zhao <yozhao@microsoft.com>
* [Docker-syncd] Change the monit config file for syncd container on
marvell-arm64 such that monit only generates alerts if the process is
not running for 5 minutes.
Signed-off-by: Yong Zhao <yozhao@microsoft.com>
* [Docker-syncd] Change the monit config file for syncd container on
marvell-armhf such that monit will generate alert if the process is not
running for 5 minutes.
Signed-off-by: Yong Zhao <yozhao@microsoft.com>
* [Docker-syncd] Change the monit config file for syncd container on
mellanox such that monit only generates alerts if the process is not
running for 5 minutes.
Signed-off-by: Yong Zhao <yozhao@microsoft.com>
* [Docker-sycnd] Change the monit config file for syncd container such
that monit only generates alerts if the processes are not running for 5
minutes.
Signed-off-by: Yong Zhao <yozhao@microsoft.com>
* [Docker-sflow] Change the monit config file for sflow container such
that monit only generates alerts if the process is not running for 5
minutes.
Signed-off-by: Yong Zhao <yozhao@microsoft.com>
* [Docker-telemetry] Change the monit config file for telemetry container
such that monit only generates alerts if the processes are not running
for 5 minutes.
Signed-off-by: Yong Zhao <yozhao@microsoft.com>
* [Docker-database] Change the monit config file for database container
such that monit only generates alerts if the process is not running for
5 minutes.
Signed-off-by: Yong Zhao <yozhao@microsoft.com>
* [Docker-database] Use 4 spaces to replace 2 spaces in monit config file.
Signed-off-by: Yong Zhao <yozhao@microsoft.com>
* [Docker-bgp] Use 4 spcess to replace 2 spaces in monit config file.
Signed-off-by: Yong Zhao <yozhao@microsoft.com>
* [Docker-lldp] Use 4 spaces to replace 2 spaces in monit config file.
Signed-off-by: Yong Zhao <yozhao@microsoft.com>
* [Docker-swss] Use 4 spaces to replace 2 space in monit config file.
Signed-off-by: Yong Zhao <yozhao@microsoft.com>
* [Docker-sflow] Use 4 spaces to replace 2 spaces in monit config file.
Signed-off-by: Yong Zhao <yozhao@microsoft.com>
* [Docker-snmp] Use 4 spaces to replace 2 spaces in monit config file.
Signed-off-by: Yong Zhao <yozhao@microsoft.com>
* [Docker-telemetry] Use 4 spaces to replace 2 spaces in monit config
file.
Signed-off-by: Yong Zhao <yozhao@microsoft.com>
* [Docker-syncd] Use 4 spaces to replace 2 spaces in the monit config file
on barefoot.
Signed-off-by: Yong Zhao <yozhao@microsoft.com>
* [Docker-syncd] Use 4 spaces to replace 2 spaces in the monit config file
on broadcom.
Signed-off-by: Yong Zhao <yozhao@microsoft.com>
* [Docker-syncd] Use 4 spaces to replace 2 spaces in the monit config file
on cavium.
Signed-off-by: Yong Zhao <yozhao@microsoft.com>
* [Docker-syncd] Use 4 spaces to replace 2 spaces in the monit config file
on centec.
Signed-off-by: Yong Zhao <yozhao@microsoft.com>
* [Docker-syncd] Use 4 spaces to replace 2 spaces in the monit config file
on marvell.
Signed-off-by: Yong Zhao <yozhao@microsoft.com>
* [Docker-syncd] Use 4 spaces to replace 2 spaces in the monit config file
on mellanox.
Signed-off-by: Yong Zhao <yozhao@microsoft.com>
* [Docker-syncd] Use 4 spaces to repalce 2 spaces in the monit config file
on nephos.
Signed-off-by: Yong Zhao <yozhao@microsoft.com>
* [Docker-bgp] Remove the trailing extra spaces in monit config file.
Signed-off-by: Yong Zhao <yozhao@microsoft.com>