526dd3c4fb
20 Commits
Author | SHA1 | Message | Date | |
---|---|---|---|---|
yozhao101
|
fb2c995f53
|
[202012][Monit] Deprecate the feature of monitoring the critical processes by Monit (#7823)
Signed-off-by: Yong Zhao yozhao@microsoft.com Why I did it Currently we leveraged the Supervisor to monitor the running status of critical processes in each container and it is more reliable and flexible than doing the monitoring by Monit. So we removed the functionality of monitoring the critical processes by Monit. How I did it I removed the script process_checker and corresponding Monit configuration entries of critical processes. How to verify it I verified this on the device str-7260cx3-acs-1. |
||
Joe LeVeque
|
dd9be59cd1
|
[202012][dockers][supervisor] Increase event buffer size for process exit listener; Set all event buffer sizes to 1024 (#7203)
#### Why I did it Backport of https://github.com/Azure/sonic-buildimage/pull/7083 to the 202012 branch. To prevent error [messages](https://dev.azure.com/mssonic/build/_build/results?buildId=2254&view=logs&j=9a13fbcd-e92d-583c-2f89-d81f90cac1fd&t=739db6ba-1b35-5485-5697-de102068d650&l=802) like the following from being logged: ``` Mar 17 02:33:48.523153 vlab-01 INFO swss#supervisord 2021-03-17 02:33:48,518 ERRO pool supervisor-proc-exit-listener event buffer overflowed, discarding event 46 ``` This is basically an addendum to https://github.com/Azure/sonic-buildimage/pull/5247, which increased the event buffer size for dependent-startup. While supervisor-proc-exit-listener doesn't subscribe to as many events as dependent-startup, there is still a chance some containers (like swss, as in the example above) have enough processes running to cause an overflow of the default buffer size of 10. This is especially important for preventing erroneous log_analyzer failures in the sonic-mgmt repo regression tests, which have started occasionally causing PR check builds to fail. Example [here](https://dev.azure.com/mssonic/build/_build/results?buildId=2254&view=logs&j=9a13fbcd-e92d-583c-2f89-d81f90cac1fd&t=739db6ba-1b35-5485-5697-de102068d650&l=802). I set all supervisor-proc-exit-listener event buffer sizes to 1024, and also updated all dependent-startup event buffer sizes to 1024, as well, to keep things simple, unified, and allow headroom so that we will not need to adjust these values frequently, if at all. |
||
yozhao101
|
cc9c3f567e |
[supervisord] Monitoring the critical processes with supervisord. (#6242)
- Why I did it Initially, we used Monit to monitor critical processes in each container. If one of critical processes was not running or crashed due to some reasons, then Monit will write an alerting message into syslog periodically. If we add a new process in a container, the corresponding Monti configuration file will also need to update. It is a little hard for maintenance. Currently we employed event listener of Supervisod to do this monitoring. Since processes in each container are managed by Supervisord, we can only focus on the logic of monitoring. - How I did it We borrowed the event listener of Supervisord to monitor critical processes in containers. The event listener will take following steps if it was notified one of critical processes exited unexpectedly: The event listener will first check whether the auto-restart mechanism was enabled for this container or not. If auto-restart mechanism was enabled, event listener will kill the Supervisord process, which should cause the container to exit and subsequently get restarted. If auto-restart mechanism was not enabled for this contianer, the event listener will enter a loop which will first sleep 1 minute and then check whether the process is running. If yes, the event listener exits. If no, an alerting message will be written into syslog. - How to verify it First, we need checked whether the auto-restart mechanism of a container was enabled or not by running the command show feature status. If enabled, one critical process should be selected and killed manually, then we need check whether the container will be restarted or not. Second, we can disable the auto-restart mechanism if it was enabled at step 1 by running the commnad sudo config feature autorestart <container_name> disabled. Then one critical process should be selected and killed. After that, we will see the alerting message which will appear in the syslog every 1 minute. - Which release branch to backport (provide reason below if selected) 201811 201911 [x ] 202006 |
||
Denys Petryshyn
|
42fa096883 |
[BFN] Upgrade docker-syncd-bfn to buster (#6345)
* Add changes to allow migration of bfn syncd to buster * Update BFN packages for Debian 10 Signed-off-by: Denys Petryshyn <denysx.petryshyn@intel.com> |
||
Joe LeVeque
|
80bf8691e8
|
[Syncd] containers still based on Stretch must still use Python 2 (#6010)
Some syncd containers are still based on Debian Stretch, and thus do not have Python 3 available. For these containers, we must still rely on Python 2 to run supervisord_dependent_startup and supervisor-proc-exit-listener. |
||
lguohan
|
4d3eb18ca7
|
[supervisord]: use abspath as supervisord entrypoint (#5995)
use abspath makes the entrypoint not affected by PATH env. Signed-off-by: Guohan Lu <lguohan@gmail.com> |
||
Joe LeVeque
|
7bf05f7f4f
|
[supervisor] Install vanilla package once again, install Python 3 version in Buster container (#5546)
**- Why I did it** We were building a custom version of Supervisor because I had added patches to prevent hangs and crashes if the system clock ever rolled backward. Those changes were merged into the upstream Supervisor repo as of version 3.4.0 (http://supervisord.org/changes.html#id9), therefore, we should be able to simply install the vanilla package via pip. This will also allow us to easily move to Python 3, as Python 3 support was added in version 4.0.0. **- How I did it** - Remove Makefiles and patches for building supervisor package from source - Install Python 3 supervisor package version 4.2.1 in Buster base container - Also install Python 3 version of supervisord-dependent-startup in Buster base container - Debian package installed binary in `/usr/bin/`, but pip package installs in `/usr/local/bin/`, so rather than update all absolute paths, I changed all references to simply call `supervisord` and let the system PATH find the executable to prevent future need for changes just in case we ever need to switch back to build a Debian package, then we won't need to modify these again. - Install Python 2 supervisor package >= 3.4.0 in Stretch and Jessie base containers |
||
abdosi
|
dddf96933c
|
[monit] Adding patch to enhance syslog error message generation for monit alert action when status is failed. (#5720)
Why/How I did: Make sure first error syslog is triggered based on FAULT TOLERANCE condition. Added support of repeat clause with alert action. This is used as trigger for generation of periodic syslog error messages if error is persistent Updated the monit conf files with repeat every x cycles for the alert action |
||
yozhao101
|
13cec4c486
|
[Monit] Unmonitor the processes in containers which are disabled. (#5153)
We want to let Monit to unmonitor the processes in containers which are disabled in `FEATURE` table such that Monit will not generate false alerting messages into the syslog. Signed-off-by: Yong Zhao <yozhao@microsoft.com> |
||
Joe LeVeque
|
5b3b4804ad
|
[dockers][supervisor] Increase event buffer size for dependent-startup (#5247)
When stopping the swss, pmon or bgp containers, log messages like the following can be seen: ``` Aug 23 22:50:43.789760 sonic-dut INFO swss#supervisord 2020-08-23 22:50:10,061 ERRO pool dependent-startup event buffer overflowed, discarding event 34 Aug 23 22:50:43.789760 sonic-dut INFO swss#supervisord 2020-08-23 22:50:10,063 ERRO pool dependent-startup event buffer overflowed, discarding event 35 Aug 23 22:50:43.789760 sonic-dut INFO swss#supervisord 2020-08-23 22:50:10,064 ERRO pool dependent-startup event buffer overflowed, discarding event 36 Aug 23 22:50:43.789760 sonic-dut INFO swss#supervisord 2020-08-23 22:50:10,066 ERRO pool dependent-startup event buffer overflowed, discarding event 37 ``` This is due to the number of programs in the container managed by supervisor, all generating events at the same time. The default event queue buffer size in supervisor is 10. This patch increases that value in all containers in order to eliminate these errors. As more programs are added to the containers, we may need to further adjust these values. I increased all buffer sizes to 25 except for containers with more programs or templated supervisor.conf files which allow for a variable number of programs. In these cases I increased the buffer size to 50. One final exception is the swss container, where the buffer fills up to ~50, so I increased this buffer to 100. Resolves https://github.com/Azure/sonic-buildimage/issues/5241 |
||
yozhao101
|
4fa81b4f8d
|
[dockers] Update critical_processes file syntax (#4831)
**- Why I did it** Initially, the critical_processes file contains either the name of critical process or the name of group. For example, the critical_processes file in the dhcp_relay container contains a single group name `isc-dhcp-relay`. When testing the autorestart feature of each container, we need get all the critical processes and test whether a container can be restarted correctly if one of its critical processes is killed. However, it will be difficult to differentiate whether the names in the critical_processes file are the critical processes or group names. At the same time, changing the syntax in this file will separate the individual process from the groups and also makes it clear to the user. Right now the critical_processes file contains two different kind of entries. One is "program:xxx" which indicates a critical process. Another is "group:xxx" which indicates a group of critical processes managed by supervisord using the name "xxx". At the same time, I also updated the logic to parse the file critical_processes in supervisor-proc-event-listener script. **- How to verify it** We can first enable the autorestart feature of a specified container for example `dhcp_relay` by running the comman `sudo config container feature autorestart dhcp_relay enabled` on DUT. Then we can select a critical process from the command `docker top dhcp_relay` and use the command `sudo kill -SIGKILL <pid>` to kill that critical process. Final step is to check whether the container is restarted correctly or not. |
||
yozhao101
|
b8ad0ed4e4
|
[Monit] Use the string "/usr/bin/syncd\s" to monitor the syncd process (#4706)
**- Why I did it** After discussed with Joe, we use the string "/usr/bin/syncd\s" in Monit configuration file to monitor syncd process on Broadcom and Mellanox. Due to my careless, I did not find this bug during the previous testing. If we use the string "/usr/bin/syncd" in Monit configuration file to monitor the syncd process, Monit will not detect whether syncd process is running or not. If we ran the command `sudo monit procmactch “/usr/bin/syncd”` on Broadcom, there will be three processes in syncd container which matched this "/usr/bin/syncd": `/bin/bash /usr/bin/syncd.sh wait`, `/usr/bin/dsserve /usr/bin/syncd –diag -u -p /etc/sai.d/sai.profile` and `/usr/bin/syncd –diag - u -p /etc/sai.d/said.profile`. Monit will select the processes with the highest uptime (at there `/bin/bash /usr/bin/syncd.sh wait`) to match and did not select `/usr/bin/syncd –diag -u -p /etc/sai.d/said.profile` to match. Similarly, On Mellanox Monit will also select the process with the highest uptime (at there `/bin/bash /usr/bin/syncd.sh wait`) to match and did not select `/usr/bin/syncd –diag -u -p /etc/sai.d/said.profile` to match. That is why Monit is unable to detect whether syncd process is running or not if we use the string “/usr/bin/syncd” in Monit configuration file. If we use the string "/usr/bin/syncd\s" in Monit configuration file, Monit can filter out the process `/bin/bash /usr/bin/syncd.sh wait` and thus can correctly monitor the syncd process. **- How I did it** **- How to verify it** Signed-off-by: Yong Zhao <yozhao@microsoft.com> |
||
Guohan Lu
|
82dc8ace12 | [docker-syncd-bfn]: use service dependency in supervisord to start services | ||
yozhao101
|
91e5fb5602
|
[Service] Enable/disable container auto-restart based on configuration. (#4073) | ||
yozhao101
|
db7668638a |
[Monit] Change the full process name of syncd in the monit config file. (#4033)
Since the syncd process running on different platforms will have the different full path names, we change the full path name of process syncd in the monit config file such that it will be universal and is not for a specific vendor. Signed-off-by: Yong Zhao <yozhao@microsoft.com> |
||
yozhao101
|
b7e48b422f |
[Services] Allow monit system tool to monitor the critical processes status running in various SONiC containers. (#3940)
* Add a monit config file for teamd container. Signed-off-by: Yong Zhao <yozhao@microsoft.com> * Add a copy mechanism to put the monit config file in teamd container into base image. Signed-off-by: Yong Zhao <yozhao@microsoft.com> * Add a monit config file for snmp container. Signed-off-by: Yong Zhao <yozhao@microsoft.com> * Add a copy mechanism to put the monit config file of snmp container into the base image. Signed-off-by: Yong Zhao <yozhao@microsoft.com> * Add a monit config file for dhcp_relay container in the dir base_image_files. Signed-off-by: Yong Zhao <yozhao@microsoft.com> * Add a copy mechanism to put the monit config file of dhcp_relay container into base image under /etc/monit/conf.d. Signed-off-by: Yong Zhao <yozhao@microsoft.com> * Add a monit config file for router advertiser container. Signed-off-by: Yong Zhao <yozhao@microsoft.com> * Add a copy mechanism to put the monit config file of router advertiser contianer into base image. Signed-off-by: Yong Zhao <yozhao@microsoft.com> * [Docker-Pmon] Add a monit config file for pmon container. Signed-off-by: Yong Zhao <yozhao@microsoft.com> * [Docker-Pmon] Add a copy mechanism to put the monit config file into the base image. Signed-off-by: Yong Zhao <yozhao@microsoft.com> * [Docker-lldp] Add a monit config file for lldp container. Signed-off-by: Yong Zhao <yozhao@microsoft.com> * [Docker-lldp] Add a copy mechanism to put the monit config file into the base image. Signed-off-by: Yong Zhao <yozhao@microsoft.com> * [Docker-bgp] Add a monit config file for BGP container. Signed-off-by: Yong Zhao <yozhao@microsoft.com> * [Docker-bgp] Add a copy mechanism to put monit config file into the base image. Signed-off-by: Yong Zhao <yozhao@microsoft.com> * [Docker-swss] Add a monit config file for the swss container. Signed-off-by: Yong Zhao <yozhao@microsoft.com> * [Docker-swss] Add a copy mechanism to put monit config file into the base image. Signed-off-by: Yong Zhao <yozhao@microsoft.com> * [Docker-syncd] Add a monit config file for syncd container on barefoot platform. Signed-off-by: Yong Zhao <yozhao@microsoft.com> * [Docker-syncd] Add a copy mechanism to put the monit config file into the base image on barefoot. Signed-off-by: Yong Zhao <yozhao@microsoft.com> * [Docker-syncd] Add a monit config file for syncd container on broadcom. Signed-off-by: Yong Zhao <yozhao@microsoft.com> * [Docker-syncd] Add a copy mechanism to put the monit config file into the base image on broadcom. Signed-off-by: Yong Zhao <yozhao@microsoft.com> * [Docker-syncd] Add a monit config file for syncd container on cavium. Signed-off-by: Yong Zhao <yozhao@microsoft.com> * [Docker-syncd] Add a copy mechanism to put the monit config file into the base image. Signed-off-by: Yong Zhao <yozhao@microsoft.com> * [Docker-centec] Add a monit config file for syncd container on centen platform. Signed-off-by: Yong Zhao <yozhao@microsoft.com> * [Docker-syncd] Add a copy mechanism to put the monit config file into the base image. Signed-off-by: Yong Zhao <yozhao@microsoft.com> * [Docker-syncd] Add a monit config file for syncd container on centen platform. Signed-off-by: Yong Zhao <yozhao@microsoft.com> * [Docker-syncd] Add a copy mechanism to put the monit config file into the base image. Signed-off-by: Yong Zhao <yozhao@microsoft.com> * [Docker-syncd] Add a monit config file for syncd container on marvell. Signed-off-by: Yong Zhao <yozhao@microsoft.com> * [Docker-syncd] Add a copy mechanism to put the monit conifg file into the base image. Signed-off-by: Yong Zhao <yozhao@microsoft.com> * [Docker-syncd] Add a monit config file for syncd container on marvell-arm64. Signed-off-by: Yong Zhao <yozhao@microsoft.com> * [Docker-syncd] Add a copy mechanism to put the monit config file into the base image on marvell-arm64. Signed-off-by: Yong Zhao <yozhao@microsoft.com> * [Docker-syncd] Add a monit config file for syncd container on marvell-armhf. Signed-off-by: Yong Zhao <yozhao@microsoft.com> * [Docker-syncd] Add a copy mechanism to put the monit config file into the base image. Signed-off-by: Yong Zhao <yozhao@microsoft.com> * [Docker-syncd] Add a monit config file for syncd container on mellanox. Signed-off-by: Yong Zhao <yozhao@microsoft.com> * [Docker-syncd] Add a copy mechanism to put the monit config file into the base image. Signed-off-by: Yong Zhao <yozhao@microsoft.com> * [Docker-syncd] Add a monit config file for syncd container on nephos. Signed-off-by: Yong Zhao <yozhao@microsoft.com> * [Docker-syncd] Add a copy mechanism to put the monit config file into the base image. Signed-off-by: Yong Zhao <yozhao@microsoft.com> * [Docker-sflow] Add a monit config file for sflow container. Signed-off-by: Yong Zhao <yozhao@microsoft.com> * [Docker-sflow] Add a copy mechanism to put the monit conifg file into the base image. Signed-off-by: Yong Zhao <yozhao@microsoft.com> * [Docker-telemetry] Add a monit config file for telemetry container. Signed-off-by: Yong Zhao <yozhao@microsoft.com> * [Docker-telemetry] Add a copy mechanism to put the monit config file into the base image. Signed-off-by: Yong Zhao <yozhao@microsoft.com> * [Docker-database] Add a monit config file for database container. Signed-off-by: Yong Zhao <yozhao@microsoft.com> * [Docker-database] Add a copy mechanism to put the monit config file into the base image. Signed-off-by: Yong Zhao <yozhao@microsoft.com> * [Docker-Dhcprelay] Change a typo. Signed-off-by: Yong Zhao <yozhao@microsoft.com> * [Docker-Dhcprelay] Change the process name in monit config file to dhcrelay. Signed-off-by: Yong Zhao <yozhao@microsoft.com> * [Docker-syncd] There is no desserve process in syncd container on barefoot. Signed-off-by: Yong Zhao <yozhao@microsoft.com> * [Docker-syncd] There is no process desserve in syncd container on cavium. Signed-off-by: Yong Zhao <yozhao@microsoft.com> * [Docker-syncd] There is no process named desserve in syncd on centec. Signed-off-by: Yong Zhao <yozhao@microsoft.com> * [Docker-syncd] There is no process named desserve in syncd on marvell. Signed-off-by: Yong Zhao <yozhao@microsoft.com> * [Docker-syncd] Should not delete the process desserve in syncd container on marvell. Signed-off-by: Yong Zhao <yozhao@microsoft.com> * [Docker-syncd] Delete the process dsserve in syncd on marvell. Signed-off-by: Yong Zhao <yozhao@microsoft.com> * [Docker-syncd] Delete the process dsserve in syncd container on marvell-arm64. Signed-off-by: Yong Zhao <yozhao@microsoft.com> * [Docker-syncd] Delete the process dsserve in syncd container on marvell-armhf. Signed-off-by: Yong Zhao <yozhao@microsoft.com> * [Docker-syncd] Delete the process dsserve in syncd container on mellanox. Signed-off-by: Yong Zhao <yozhao@microsoft.com> * [Docker-Radv] Change the process name to radvd. Signed-off-by: Yong Zhao <yozhao@microsoft.com> * [Docker-telemetry] Correct a typo in monit_telemetry. Signed-off-by: Yong Zhao <yozhao@microsoft.com> * [Docker-teamd] Delete the monit config file for teamd. Signed-off-by: Yong Zhao <yozhao@microsoft.com> * [Docker-teamd] Delete the mechanism to copy the monit config file into base image. Signed-off-by: Yong Zhao <yozhao@microsoft.com> * [Docker-dhcprelay] Delete the monit config file for dhcp_relay container. Signed-off-by: Yong Zhao <yozhao@microsoft.com> * [Docker-dhcprelay] Delete the mechanism to copy the monit config file into the base image. Signed-off-by: Yong Zhao <yozhao@microsoft.com> * [Docker-radv] Delete the monit config file foe radv container. Signed-off-by: Yong Zhao <yozhao@microsoft.com> * [Docker-radv] Delete the mechanism to copy the monit config file into the base image. Signed-off-by: Yong Zhao <yozhao@microsoft.com> * [Docker-bgp] change the monit config file for BGP container such that monit only generates alert if the process is not running for 5 minutes. Signed-off-by: Yong Zhao <yozhao@microsoft.com> * [Docker-snmp] Change the monit config file for snmp container such that monit only generates alret if the process is not running for 5 minutes. Signed-off-by: Yong Zhao <yozhao@microsoft.com> * [Docker-pmon] Change the monit config file for pmon container such that monit only generates alert if the processes are not running for 5 minutes. Signed-off-by: Yong Zhao <yozhao@microsoft.com> * [Docker-lldp] Change the monit config file for lldp container such that monit only generates alerts if some processes are not running for 5 minutes. Signed-off-by: Yong Zhao <yozhao@microsoft.com> * [Docker-pmon] Delete the monit config file for pmon container since some of processes are not running depended on the type of box. Signed-off-by: Yong Zhao <yozhao@microsoft.com> * [Docker-pmon] Delete the copy mechanism to copy the monit config file into the base image. Signed-off-by: Yong Zhao <yozhao@microsoft.com> * [Docker-lldp] Change the matching name for the process lldpd. Signed-off-by: Yong Zhao <yozhao@microsoft.com> * [Docker-swss] Change the monit config file for swss container such that monit only generates alerts if the processes are not running for 5 minutes. Signed-off-by: Yong Zhao <yozhao@microsoft.com> * [Docker-syncd] Change the monit config file for syncd container on barefoot such that monit only generates alerts if the process is not running for 5 minutes. Signed-off-by: Yong Zhao <yozhao@microsoft.com> * [Docker-syncd] Correct a typo in monit config file. Signed-off-by: Yong Zhao <yozhao@microsoft.com> * [Docker-syncd] Change the monit config file for syncd container on broadcom such that monit only generates alerts if the processes are not running for 5 minutes. Signed-off-by: Yong Zhao <yozhao@microsoft.com> * [Docker-syncd] Change the monit config file for syncd container on cavium such that monit only generates alerts if the process is not running for 5 minutes. Signed-off-by: Yong Zhao <yozhao@microsoft.com> * [Docker-syncd] Change the monit config file for syncd container such that monit only generates alerts if the process is not running for 5 minutes. Signed-off-by: Yong Zhao <yozhao@microsoft.com> * [Docker-syncd] Change the monit config file for syncd container on marvell such that monit only generates alerts if the process is not running for 5 minutes. Signed-off-by: Yong Zhao <yozhao@microsoft.com> * [Docker-syncd] Change the monit config file for syncd container on marvell-arm64 such that monit only generates alerts if the process is not running for 5 minutes. Signed-off-by: Yong Zhao <yozhao@microsoft.com> * [Docker-syncd] Change the monit config file for syncd container on marvell-armhf such that monit will generate alert if the process is not running for 5 minutes. Signed-off-by: Yong Zhao <yozhao@microsoft.com> * [Docker-syncd] Change the monit config file for syncd container on mellanox such that monit only generates alerts if the process is not running for 5 minutes. Signed-off-by: Yong Zhao <yozhao@microsoft.com> * [Docker-sycnd] Change the monit config file for syncd container such that monit only generates alerts if the processes are not running for 5 minutes. Signed-off-by: Yong Zhao <yozhao@microsoft.com> * [Docker-sflow] Change the monit config file for sflow container such that monit only generates alerts if the process is not running for 5 minutes. Signed-off-by: Yong Zhao <yozhao@microsoft.com> * [Docker-telemetry] Change the monit config file for telemetry container such that monit only generates alerts if the processes are not running for 5 minutes. Signed-off-by: Yong Zhao <yozhao@microsoft.com> * [Docker-database] Change the monit config file for database container such that monit only generates alerts if the process is not running for 5 minutes. Signed-off-by: Yong Zhao <yozhao@microsoft.com> * [Docker-database] Use 4 spaces to replace 2 spaces in monit config file. Signed-off-by: Yong Zhao <yozhao@microsoft.com> * [Docker-bgp] Use 4 spcess to replace 2 spaces in monit config file. Signed-off-by: Yong Zhao <yozhao@microsoft.com> * [Docker-lldp] Use 4 spaces to replace 2 spaces in monit config file. Signed-off-by: Yong Zhao <yozhao@microsoft.com> * [Docker-swss] Use 4 spaces to replace 2 space in monit config file. Signed-off-by: Yong Zhao <yozhao@microsoft.com> * [Docker-sflow] Use 4 spaces to replace 2 spaces in monit config file. Signed-off-by: Yong Zhao <yozhao@microsoft.com> * [Docker-snmp] Use 4 spaces to replace 2 spaces in monit config file. Signed-off-by: Yong Zhao <yozhao@microsoft.com> * [Docker-telemetry] Use 4 spaces to replace 2 spaces in monit config file. Signed-off-by: Yong Zhao <yozhao@microsoft.com> * [Docker-syncd] Use 4 spaces to replace 2 spaces in the monit config file on barefoot. Signed-off-by: Yong Zhao <yozhao@microsoft.com> * [Docker-syncd] Use 4 spaces to replace 2 spaces in the monit config file on broadcom. Signed-off-by: Yong Zhao <yozhao@microsoft.com> * [Docker-syncd] Use 4 spaces to replace 2 spaces in the monit config file on cavium. Signed-off-by: Yong Zhao <yozhao@microsoft.com> * [Docker-syncd] Use 4 spaces to replace 2 spaces in the monit config file on centec. Signed-off-by: Yong Zhao <yozhao@microsoft.com> * [Docker-syncd] Use 4 spaces to replace 2 spaces in the monit config file on marvell. Signed-off-by: Yong Zhao <yozhao@microsoft.com> * [Docker-syncd] Use 4 spaces to replace 2 spaces in the monit config file on mellanox. Signed-off-by: Yong Zhao <yozhao@microsoft.com> * [Docker-syncd] Use 4 spaces to repalce 2 spaces in the monit config file on nephos. Signed-off-by: Yong Zhao <yozhao@microsoft.com> * [Docker-bgp] Remove the trailing extra spaces in monit config file. Signed-off-by: Yong Zhao <yozhao@microsoft.com> |
||
Joe LeVeque
|
85b0de3df1 |
[docker-syncd]: Restart SwSS, syncd and dependent services if a critical process in syncd container exits unexpectedly (#3534)
Add the same mechanism I developed for the SwSS service in #2845 to the syncd service. However, in order to cause the SwSS service to also exit and restart in this situation, I developed a docker-wait-any program which the SwSS service uses to wait for either the swss or syncd containers to exit. |
||
Andriy Kokhan
|
d038fd228a |
[build]: fixed BFN target build (#2784)
Signed-off-by: Andriy Kokhan <akokhan@barefootnetworks.com> |
||
Sagar Balani
|
37287d2eef |
[barefoot]: Fix SONiC Build for BFN platforms (#2124)
* BFN fixes for sonic master bld (#26) * Need to still update sde deb package with sai1.3.3, will be done in seperate commit * Update bfn sde url - sai1.3.3 |
||
Sagar Balani
|
93905d3d82 |
[barefoot]: Support for platforms based on Barefoot Networks' device (#1796)
* Initial commit * Add Ingrasys S9180-32X platform dirver. Signed-off-by: Wade He <chihen.he@gmail.com> * Add bfn.service for init barefoot. Signed-off-by: Wade He <chihen.he@gmail.com> * [Barefoot Beta] Add some functions and fixed some bugs. 1. Update sensors.conf. 2. Fixed IO expander init. 3. Fixed PSU EEPROM. 4. Fixed MB EEPROM. 5. Add fancontrol and fan init. 6. Add SYS LED control (sys, fan, fan tray). 7. 2.5V compute and setup max and min. 8. Fixed typo MB eeprom delete address. 9. Remove coretemp to BMC. 10. Add active CPLD. 11. Modify SFP+ GPIO slave address. 12. Modify tmp75 Near Port 32 slave address. Signed-off-by: Wade He <chihen.he@gmail.com> * Add bfn script in /etc/init.d/ Signed-off-by: Wade He <chihen.he@gmail.com> * Add bfn service in debian Signed-off-by: Wade He <chihen.he@gmail.com> * Fixed CPLD switch LED behavior. Signed-off-by: Wade He <chihen.he@gmail.com> * [Barefoot Beta] Fixed sensors and hwmon order. 1. Fixed ignore sensors Vbat. 2. Reorg hwmon order. Signed-off-by: Wade He <chihen.he@gmail.com> * Fixed PSU1 and PSU2 EEPROM order. Signed-off-by: Wade He <chihen.he@gmail.com> * initial barefoot checkin october 2017 * update refpoint * update refpoints * update refpoints to bf-master * update refpoint * update refpoint to tested version * change to platform from asic * update refpoint for swss * revert core creation setting * update refpoints * add telnet for debug shell * update refpoints 11/17/17 * missed change in file on previous merge * [CPLD] Fixed blink LED issue. * Fixed blink LED mask set error. Signed-off-by: Wade He <chihen.he@gmail.com> * Update bf_kdrv.c for 6.0.2.39 * Update bf kernel driver * Add bf_fun kernel module. * Update bf_tun for fixed build error * merge with Azure master (12/12/17) * update swss refpoint * update refpoint of swss * library dependency for stack unroll * update refpoint to bf-master * [DHCP relay]: Fix circuit ID and remote ID bugs (#1248) * [DHCP relay]: Fix circuit ID and remote ID bugs * Set circuit_id_len after setting circuit_id_len to ip->name * [Platform] Add Psuutil and update sensors.conf for S9100-32X, S8810-32Q and S9200-64X (#1272) * Add I2C CPLD kernel module for psuutil. * Support psuutil script. * Add voltage min and max threshold. * Update sensors.conf for tmp75. Signed-off-by: Wade He <chihen.he@gmail.com> * Allow multi platform support - infra (more changes to follow) * update relative path to include platform for clarity * [Platform] Add Ingrasys S9130-32X and S9230-64X with Nephos Switch ASIC for "branch 201712" (#1274) - What I did Add switch ASIC vendor: Nephos Add Nephos platforms: Ingrasys S9130-32X, Ingrasys S9230-64X - How I did it Add platform/nephos files Add platform/nephos/sonic-platform-modules-ingrasys submodule Add device/ingrasys/x86_64-ingrasys_s9130_32x-r0 files Add device/ingrasys/x86_64-ingrasys_s9230_64x-r0 files Add SONiC to support Nephos platform Update Head of submodule src/sonic-sairedis to "3b817bb" - How to verify it To build SONiC installer image and docker images, run the following commands: make configure PLATFORM=nephos make target/sonic-nephos.bin Check system and network feature is worked as well - Description for the changelog Add switch ASIC vendor and platforms for Nephos - A picture of a cute animal (not mandatory but encouraged) Signed-off-by: Sam Yang <yang.kaiyu@gmail.com> * change source of files to github (from dropbox), update sairedis refpoint * update refpoint of sairedis * [centec] support CENTEC SAI 1.0 on 201712 branch and update e582-48x6q board (#1269) * [marvel]: Marvell's updates for SONiC.201712 & SAI v1.0 (#1287) * update sairedis (fast-boot refpoint) * fix syncd rpc make files * update refpoint to handle Makefile change (no functional change) * [Marvell]: Add support for SLM5401-54x device (#1307) * Marvell's updates for SONiC.201712 & SAI v1.0 * [Platform] Add Marvell's SLM5401-54x for branch 201712 * [Broadcom]: Update Boradcom SAI package to 3.0.3.3-3 (#1312) (#1321) - update Arista 7050-QX32S config.bcm file - update Accton th-as771*-32x100G.config.bcm files * update refpoint for Makefile chnage in sairedis * update refpoint - sairedis * update sairedis to older refpoint till we debug clean build * export asic platform for build * update refpoint for makefiles * [PLATFORM] Centec update E582 driver fan/epprom/sensor (#1332) * Upload wnc-osw1800 * Modify for Barefoot suggest * Revert bfn-platform.mk * Update bfn-platform-wnc.mk Update parameter name * Update parameter name * initial support for WNC platform * change switch name to "switch" * Delete bf modules for rel_7_0 * Add Ingrasys S9180 platform Signed-off-by: Wade He <chihen.he@gmail.com> * Modify bfnsdk for Ingrasys S9180 platform Signed-off-by: Wade He <chihen.he@gmail.com> * Resolved the conflict. * Resolved the conflict. * Update submodule path and url. * Delete unused file. * Update PSU GPIO and EEPROM for psuutil. * Add psuutil in S9180-32X Signed-off-by: Wade He <chihen.he@gmail.com> * update refpoint * update refpoint * change contact email, update refpoint * cleanup and update kernel modules * updates based on review * update refpoint * update refpoint * fix typo in config script to check for platforms * remove stale file * resolve conflicts * cleanup diffs with Azure repo and update SDK debs * update refpoints to Azure * address review comments * revert refpoint of swss-common * porting the build fix from master * porting build fix from master * Minor Fix * Minor fix * Temp to sde deb packages url * Update sonic - sairedis,swss & swss-common refpoints * Update git modules url path to bfn repo * updated paths for swss, swss-common & sairedis * Update refpoint for sonic-swss to local bfn repo * Update URL for downloading sde debian packages * porting fix links of debian git server from master * porting fix links of debian git server from master * [Ingrasys] Add platform support for S9280-64X with Barefoot ASIC * Update ref points for swss, swss-common and sairedis repos * Add sonic platform scripts for bfn montara/maverick * Call sh scripts instead of calling py scripts * Address upstream PR Comments (#10) * Update bf-master with azure/master * Undo changes to some files * Revert "Address upstream PR Comments (#10)" This reverts commit |