834a29cb66
9 Commits
Author | SHA1 | Message | Date | |
---|---|---|---|---|
yozhao101
|
4fa81b4f8d
|
[dockers] Update critical_processes file syntax (#4831)
**- Why I did it** Initially, the critical_processes file contained either the name of a critical process or the name of a group. For example, the critical_processes file in the dhcp_relay container contains a single group name, `isc-dhcp-relay`. When testing the autorestart feature of each container, we need to get all the critical processes and test whether a container is restarted correctly if one of its critical processes is killed. However, it is difficult to tell whether a name in the critical_processes file is a critical process or a group name. Changing the syntax of this file separates individual processes from groups and also makes the distinction clear to the user. The critical_processes file now contains two kinds of entries: "program:xxx", which indicates a critical process, and "group:xxx", which indicates a group of critical processes managed by supervisord under the name "xxx". I also updated the logic that parses the critical_processes file in the supervisor-proc-event-listener script. **- How to verify it** First, enable the autorestart feature of a given container, for example `dhcp_relay`, by running the command `sudo config container feature autorestart dhcp_relay enabled` on the DUT. Then select a critical process from the output of `docker top dhcp_relay` and kill it with `sudo kill -SIGKILL <pid>`. The final step is to check whether the container is restarted correctly. |
||
yozhao101
|
b8ad0ed4e4
|
[Monit] Use the string "/usr/bin/syncd\s" to monitor the syncd process (#4706)
**- Why I did it** After discussing with Joe, we use the string "/usr/bin/syncd\s" in the Monit configuration file to monitor the syncd process on Broadcom and Mellanox. Due to my carelessness, I did not find this bug during the previous testing. If we use the string "/usr/bin/syncd" in the Monit configuration file to monitor the syncd process, Monit cannot detect whether the syncd process is running. If we run the command `sudo monit procmatch "/usr/bin/syncd"` on Broadcom, three processes in the syncd container match "/usr/bin/syncd": `/bin/bash /usr/bin/syncd.sh wait`, `/usr/bin/dsserve /usr/bin/syncd --diag -u -p /etc/sai.d/sai.profile` and `/usr/bin/syncd --diag -u -p /etc/sai.d/sai.profile`. Monit selects the process with the highest uptime (here `/bin/bash /usr/bin/syncd.sh wait`) and does not select `/usr/bin/syncd --diag -u -p /etc/sai.d/sai.profile`. Similarly, on Mellanox, Monit also selects the process with the highest uptime (here `/bin/bash /usr/bin/syncd.sh wait`) and does not select `/usr/bin/syncd --diag -u -p /etc/sai.d/sai.profile`. That is why Monit is unable to detect whether the syncd process is running if we use the string "/usr/bin/syncd" in the Monit configuration file. If we use the string "/usr/bin/syncd\s" in the Monit configuration file, Monit filters out the process `/bin/bash /usr/bin/syncd.sh wait` and can therefore monitor the syncd process correctly. **- How I did it** **- How to verify it** Signed-off-by: Yong Zhao <yozhao@microsoft.com> |
||
Guohan Lu
|
918cf632f4 | [docker-syncd-mrvl]: use service dependency in supervisord to start services | ||
yozhao101
|
91e5fb5602
|
[Service] Enable/disable container auto-restart based on configuration. (#4073) | ||
yozhao101
|
db7668638a |
[Monit] Change the full process name of syncd in the monit config file. (#4033)
Since the syncd process has different full path names on different platforms, we change the full path name of the syncd process in the monit config file so that it is universal rather than specific to one vendor. Signed-off-by: Yong Zhao <yozhao@microsoft.com> |
||
yozhao101
|
b7e48b422f |
[Services] Allow monit system tool to monitor the critical processes status running in various SONiC containers. (#3940)
* Add a monit config file for teamd container. Signed-off-by: Yong Zhao <yozhao@microsoft.com> * Add a copy mechanism to put the monit config file in teamd container into base image. Signed-off-by: Yong Zhao <yozhao@microsoft.com> * Add a monit config file for snmp container. Signed-off-by: Yong Zhao <yozhao@microsoft.com> * Add a copy mechanism to put the monit config file of snmp container into the base image. Signed-off-by: Yong Zhao <yozhao@microsoft.com> * Add a monit config file for dhcp_relay container in the dir base_image_files. Signed-off-by: Yong Zhao <yozhao@microsoft.com> * Add a copy mechanism to put the monit config file of dhcp_relay container into base image under /etc/monit/conf.d. Signed-off-by: Yong Zhao <yozhao@microsoft.com> * Add a monit config file for router advertiser container. Signed-off-by: Yong Zhao <yozhao@microsoft.com> * Add a copy mechanism to put the monit config file of router advertiser container into base image. Signed-off-by: Yong Zhao <yozhao@microsoft.com> * [Docker-Pmon] Add a monit config file for pmon container. Signed-off-by: Yong Zhao <yozhao@microsoft.com> * [Docker-Pmon] Add a copy mechanism to put the monit config file into the base image. Signed-off-by: Yong Zhao <yozhao@microsoft.com> * [Docker-lldp] Add a monit config file for lldp container. Signed-off-by: Yong Zhao <yozhao@microsoft.com> * [Docker-lldp] Add a copy mechanism to put the monit config file into the base image. Signed-off-by: Yong Zhao <yozhao@microsoft.com> * [Docker-bgp] Add a monit config file for BGP container. Signed-off-by: Yong Zhao <yozhao@microsoft.com> * [Docker-bgp] Add a copy mechanism to put monit config file into the base image. Signed-off-by: Yong Zhao <yozhao@microsoft.com> * [Docker-swss] Add a monit config file for the swss container. Signed-off-by: Yong Zhao <yozhao@microsoft.com> * [Docker-swss] Add a copy mechanism to put monit config file into the base image.
Signed-off-by: Yong Zhao <yozhao@microsoft.com> * [Docker-syncd] Add a monit config file for syncd container on barefoot platform. Signed-off-by: Yong Zhao <yozhao@microsoft.com> * [Docker-syncd] Add a copy mechanism to put the monit config file into the base image on barefoot. Signed-off-by: Yong Zhao <yozhao@microsoft.com> * [Docker-syncd] Add a monit config file for syncd container on broadcom. Signed-off-by: Yong Zhao <yozhao@microsoft.com> * [Docker-syncd] Add a copy mechanism to put the monit config file into the base image on broadcom. Signed-off-by: Yong Zhao <yozhao@microsoft.com> * [Docker-syncd] Add a monit config file for syncd container on cavium. Signed-off-by: Yong Zhao <yozhao@microsoft.com> * [Docker-syncd] Add a copy mechanism to put the monit config file into the base image. Signed-off-by: Yong Zhao <yozhao@microsoft.com> * [Docker-centec] Add a monit config file for syncd container on centec platform. Signed-off-by: Yong Zhao <yozhao@microsoft.com> * [Docker-syncd] Add a copy mechanism to put the monit config file into the base image. Signed-off-by: Yong Zhao <yozhao@microsoft.com> * [Docker-syncd] Add a monit config file for syncd container on centec platform. Signed-off-by: Yong Zhao <yozhao@microsoft.com> * [Docker-syncd] Add a copy mechanism to put the monit config file into the base image. Signed-off-by: Yong Zhao <yozhao@microsoft.com> * [Docker-syncd] Add a monit config file for syncd container on marvell. Signed-off-by: Yong Zhao <yozhao@microsoft.com> * [Docker-syncd] Add a copy mechanism to put the monit config file into the base image. Signed-off-by: Yong Zhao <yozhao@microsoft.com> * [Docker-syncd] Add a monit config file for syncd container on marvell-arm64. Signed-off-by: Yong Zhao <yozhao@microsoft.com> * [Docker-syncd] Add a copy mechanism to put the monit config file into the base image on marvell-arm64. Signed-off-by: Yong Zhao <yozhao@microsoft.com> * [Docker-syncd] Add a monit config file for syncd container on marvell-armhf.
Signed-off-by: Yong Zhao <yozhao@microsoft.com> * [Docker-syncd] Add a copy mechanism to put the monit config file into the base image. Signed-off-by: Yong Zhao <yozhao@microsoft.com> * [Docker-syncd] Add a monit config file for syncd container on mellanox. Signed-off-by: Yong Zhao <yozhao@microsoft.com> * [Docker-syncd] Add a copy mechanism to put the monit config file into the base image. Signed-off-by: Yong Zhao <yozhao@microsoft.com> * [Docker-syncd] Add a monit config file for syncd container on nephos. Signed-off-by: Yong Zhao <yozhao@microsoft.com> * [Docker-syncd] Add a copy mechanism to put the monit config file into the base image. Signed-off-by: Yong Zhao <yozhao@microsoft.com> * [Docker-sflow] Add a monit config file for sflow container. Signed-off-by: Yong Zhao <yozhao@microsoft.com> * [Docker-sflow] Add a copy mechanism to put the monit config file into the base image. Signed-off-by: Yong Zhao <yozhao@microsoft.com> * [Docker-telemetry] Add a monit config file for telemetry container. Signed-off-by: Yong Zhao <yozhao@microsoft.com> * [Docker-telemetry] Add a copy mechanism to put the monit config file into the base image. Signed-off-by: Yong Zhao <yozhao@microsoft.com> * [Docker-database] Add a monit config file for database container. Signed-off-by: Yong Zhao <yozhao@microsoft.com> * [Docker-database] Add a copy mechanism to put the monit config file into the base image. Signed-off-by: Yong Zhao <yozhao@microsoft.com> * [Docker-Dhcprelay] Change a typo. Signed-off-by: Yong Zhao <yozhao@microsoft.com> * [Docker-Dhcprelay] Change the process name in monit config file to dhcrelay. Signed-off-by: Yong Zhao <yozhao@microsoft.com> * [Docker-syncd] There is no dsserve process in syncd container on barefoot. Signed-off-by: Yong Zhao <yozhao@microsoft.com> * [Docker-syncd] There is no process dsserve in syncd container on cavium. Signed-off-by: Yong Zhao <yozhao@microsoft.com> * [Docker-syncd] There is no process named dsserve in syncd on centec.
Signed-off-by: Yong Zhao <yozhao@microsoft.com> * [Docker-syncd] There is no process named dsserve in syncd on marvell. Signed-off-by: Yong Zhao <yozhao@microsoft.com> * [Docker-syncd] Should not delete the process dsserve in syncd container on marvell. Signed-off-by: Yong Zhao <yozhao@microsoft.com> * [Docker-syncd] Delete the process dsserve in syncd on marvell. Signed-off-by: Yong Zhao <yozhao@microsoft.com> * [Docker-syncd] Delete the process dsserve in syncd container on marvell-arm64. Signed-off-by: Yong Zhao <yozhao@microsoft.com> * [Docker-syncd] Delete the process dsserve in syncd container on marvell-armhf. Signed-off-by: Yong Zhao <yozhao@microsoft.com> * [Docker-syncd] Delete the process dsserve in syncd container on mellanox. Signed-off-by: Yong Zhao <yozhao@microsoft.com> * [Docker-Radv] Change the process name to radvd. Signed-off-by: Yong Zhao <yozhao@microsoft.com> * [Docker-telemetry] Correct a typo in monit_telemetry. Signed-off-by: Yong Zhao <yozhao@microsoft.com> * [Docker-teamd] Delete the monit config file for teamd. Signed-off-by: Yong Zhao <yozhao@microsoft.com> * [Docker-teamd] Delete the mechanism to copy the monit config file into base image. Signed-off-by: Yong Zhao <yozhao@microsoft.com> * [Docker-dhcprelay] Delete the monit config file for dhcp_relay container. Signed-off-by: Yong Zhao <yozhao@microsoft.com> * [Docker-dhcprelay] Delete the mechanism to copy the monit config file into the base image. Signed-off-by: Yong Zhao <yozhao@microsoft.com> * [Docker-radv] Delete the monit config file for radv container. Signed-off-by: Yong Zhao <yozhao@microsoft.com> * [Docker-radv] Delete the mechanism to copy the monit config file into the base image. Signed-off-by: Yong Zhao <yozhao@microsoft.com> * [Docker-bgp] Change the monit config file for BGP container such that monit only generates alert if the process is not running for 5 minutes.
Signed-off-by: Yong Zhao <yozhao@microsoft.com> * [Docker-snmp] Change the monit config file for snmp container such that monit only generates alert if the process is not running for 5 minutes. Signed-off-by: Yong Zhao <yozhao@microsoft.com> * [Docker-pmon] Change the monit config file for pmon container such that monit only generates alert if the processes are not running for 5 minutes. Signed-off-by: Yong Zhao <yozhao@microsoft.com> * [Docker-lldp] Change the monit config file for lldp container such that monit only generates alerts if some processes are not running for 5 minutes. Signed-off-by: Yong Zhao <yozhao@microsoft.com> * [Docker-pmon] Delete the monit config file for pmon container since some of the processes are not running, depending on the type of box. Signed-off-by: Yong Zhao <yozhao@microsoft.com> * [Docker-pmon] Delete the copy mechanism to copy the monit config file into the base image. Signed-off-by: Yong Zhao <yozhao@microsoft.com> * [Docker-lldp] Change the matching name for the process lldpd. Signed-off-by: Yong Zhao <yozhao@microsoft.com> * [Docker-swss] Change the monit config file for swss container such that monit only generates alerts if the processes are not running for 5 minutes. Signed-off-by: Yong Zhao <yozhao@microsoft.com> * [Docker-syncd] Change the monit config file for syncd container on barefoot such that monit only generates alerts if the process is not running for 5 minutes. Signed-off-by: Yong Zhao <yozhao@microsoft.com> * [Docker-syncd] Correct a typo in monit config file. Signed-off-by: Yong Zhao <yozhao@microsoft.com> * [Docker-syncd] Change the monit config file for syncd container on broadcom such that monit only generates alerts if the processes are not running for 5 minutes. Signed-off-by: Yong Zhao <yozhao@microsoft.com> * [Docker-syncd] Change the monit config file for syncd container on cavium such that monit only generates alerts if the process is not running for 5 minutes.
Signed-off-by: Yong Zhao <yozhao@microsoft.com> * [Docker-syncd] Change the monit config file for syncd container such that monit only generates alerts if the process is not running for 5 minutes. Signed-off-by: Yong Zhao <yozhao@microsoft.com> * [Docker-syncd] Change the monit config file for syncd container on marvell such that monit only generates alerts if the process is not running for 5 minutes. Signed-off-by: Yong Zhao <yozhao@microsoft.com> * [Docker-syncd] Change the monit config file for syncd container on marvell-arm64 such that monit only generates alerts if the process is not running for 5 minutes. Signed-off-by: Yong Zhao <yozhao@microsoft.com> * [Docker-syncd] Change the monit config file for syncd container on marvell-armhf such that monit will generate alert if the process is not running for 5 minutes. Signed-off-by: Yong Zhao <yozhao@microsoft.com> * [Docker-syncd] Change the monit config file for syncd container on mellanox such that monit only generates alerts if the process is not running for 5 minutes. Signed-off-by: Yong Zhao <yozhao@microsoft.com> * [Docker-syncd] Change the monit config file for syncd container such that monit only generates alerts if the processes are not running for 5 minutes. Signed-off-by: Yong Zhao <yozhao@microsoft.com> * [Docker-sflow] Change the monit config file for sflow container such that monit only generates alerts if the process is not running for 5 minutes. Signed-off-by: Yong Zhao <yozhao@microsoft.com> * [Docker-telemetry] Change the monit config file for telemetry container such that monit only generates alerts if the processes are not running for 5 minutes. Signed-off-by: Yong Zhao <yozhao@microsoft.com> * [Docker-database] Change the monit config file for database container such that monit only generates alerts if the process is not running for 5 minutes. Signed-off-by: Yong Zhao <yozhao@microsoft.com> * [Docker-database] Use 4 spaces to replace 2 spaces in monit config file.
Signed-off-by: Yong Zhao <yozhao@microsoft.com> * [Docker-bgp] Use 4 spaces to replace 2 spaces in monit config file. Signed-off-by: Yong Zhao <yozhao@microsoft.com> * [Docker-lldp] Use 4 spaces to replace 2 spaces in monit config file. Signed-off-by: Yong Zhao <yozhao@microsoft.com> * [Docker-swss] Use 4 spaces to replace 2 spaces in monit config file. Signed-off-by: Yong Zhao <yozhao@microsoft.com> * [Docker-sflow] Use 4 spaces to replace 2 spaces in monit config file. Signed-off-by: Yong Zhao <yozhao@microsoft.com> * [Docker-snmp] Use 4 spaces to replace 2 spaces in monit config file. Signed-off-by: Yong Zhao <yozhao@microsoft.com> * [Docker-telemetry] Use 4 spaces to replace 2 spaces in monit config file. Signed-off-by: Yong Zhao <yozhao@microsoft.com> * [Docker-syncd] Use 4 spaces to replace 2 spaces in the monit config file on barefoot. Signed-off-by: Yong Zhao <yozhao@microsoft.com> * [Docker-syncd] Use 4 spaces to replace 2 spaces in the monit config file on broadcom. Signed-off-by: Yong Zhao <yozhao@microsoft.com> * [Docker-syncd] Use 4 spaces to replace 2 spaces in the monit config file on cavium. Signed-off-by: Yong Zhao <yozhao@microsoft.com> * [Docker-syncd] Use 4 spaces to replace 2 spaces in the monit config file on centec. Signed-off-by: Yong Zhao <yozhao@microsoft.com> * [Docker-syncd] Use 4 spaces to replace 2 spaces in the monit config file on marvell. Signed-off-by: Yong Zhao <yozhao@microsoft.com> * [Docker-syncd] Use 4 spaces to replace 2 spaces in the monit config file on mellanox. Signed-off-by: Yong Zhao <yozhao@microsoft.com> * [Docker-syncd] Use 4 spaces to replace 2 spaces in the monit config file on nephos. Signed-off-by: Yong Zhao <yozhao@microsoft.com> * [Docker-bgp] Remove the trailing extra spaces in monit config file. Signed-off-by: Yong Zhao <yozhao@microsoft.com> |
||
Joe LeVeque
|
85b0de3df1 |
[docker-syncd]: Restart SwSS, syncd and dependent services if a critical process in syncd container exits unexpectedly (#3534)
Add the same mechanism I developed for the SwSS service in #2845 to the syncd service. However, in order to cause the SwSS service to also exit and restart in this situation, I developed a docker-wait-any program which the SwSS service uses to wait for either the swss or syncd containers to exit. |
||
arheneus@marvell.com
|
11258e5db4 |
[build]: sonic arm64 changes (#3419)
Marvell arm64 changes over sonic Signed-off-by: Antony Rheneus <arheneus@marvell.com> |
||
arheneus@marvell.com
|
50fe458592 |
[build]: SONiC buildimage ARM arch support (#2980)
ARM architecture support in SONiC. `make configure platform=[ASIC_VENDOR_ARCH] PLATFORM_ARCH=[ARM_ARCH]`. SONIC_ARCH: default amd64; armhf - arm32bit; arm64 - arm64bit. Signed-off-by: Antony Rheneus <arheneus@marvell.com> |
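
Commit 4fa81b4f8d describes a `critical_processes` file whose entries take the form `program:xxx` or `group:xxx`. The following is a minimal sketch of how such a file could be parsed, not the actual logic in the supervisor-proc-event-listener script. The function name `parse_critical_processes`, the decision to raise `ValueError` on malformed lines, and the sample entry `example-proc` are assumptions made for illustration.

```python
from typing import Iterable, List, Tuple


def parse_critical_processes(lines: Iterable[str]) -> Tuple[List[str], List[str]]:
    """Split `program:xxx` / `group:xxx` entries into (programs, groups).

    Hypothetical helper: the real listener script may handle errors,
    whitespace, and ordering differently.
    """
    programs: List[str] = []
    groups: List[str] = []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # ignore blank lines
        kind, sep, name = line.partition(":")
        kind, name = kind.strip(), name.strip()
        if not sep or not name:
            raise ValueError(f"malformed entry: {line!r}")
        if kind == "program":
            programs.append(name)
        elif kind == "group":
            groups.append(name)
        else:
            raise ValueError(f"unknown entry type in: {line!r}")
    return programs, groups


if __name__ == "__main__":
    # `isc-dhcp-relay` is the group named in the commit message;
    # `example-proc` is a made-up program name.
    sample = ["group:isc-dhcp-relay", "program:example-proc"]
    print(parse_critical_processes(sample))
```

Keeping programs and groups in separate lists mirrors the stated motivation for the syntax change: a test can then look up the member processes of each group through supervisord instead of guessing what a bare name refers to.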
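
Commit b8ad0ed4e4 explains why the pattern "/usr/bin/syncd\s" is used instead of "/usr/bin/syncd". The sketch below reproduces that reasoning with Python's `re` module, using the three command lines quoted in the commit message. It only approximates Monit's behaviour: Monit applies its own regular-expression engine to the process table, and the helper name `matching` is an assumption made for illustration.

```python
import re
from typing import List

# Command lines quoted in the commit message for the syncd container on Broadcom.
CMDLINES = [
    "/bin/bash /usr/bin/syncd.sh wait",
    "/usr/bin/dsserve /usr/bin/syncd --diag -u -p /etc/sai.d/sai.profile",
    "/usr/bin/syncd --diag -u -p /etc/sai.d/sai.profile",
]


def matching(pattern: str, cmdlines: List[str]) -> List[str]:
    """Return the command lines in which `pattern` matches anywhere."""
    rx = re.compile(pattern)
    return [cmd for cmd in cmdlines if rx.search(cmd)]


if __name__ == "__main__":
    # Without the trailing \s, the wrapper script syncd.sh also matches.
    print(len(matching(r"/usr/bin/syncd", CMDLINES)))
    # With \s, "/usr/bin/syncd" must be followed by whitespace,
    # so "/usr/bin/syncd.sh" is excluded.
    for cmd in matching(r"/usr/bin/syncd\s", CMDLINES):
        print(cmd)
```

The trailing `\s` requires whitespace right after `/usr/bin/syncd`, which is what removes `/bin/bash /usr/bin/syncd.sh wait` from the candidates in this approximation.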