105 Commits

Author SHA1 Message Date
Sujin Kang
02a98add92
Add pcied to PMON docker to monitor the PCIe device status (#5000)
* Add pcied to PMON container

* remove trailing spaces

* update pmon submodule

* review comments

* rebase to the latest
2020-07-29 11:27:49 -07:00
Joe LeVeque
2600747f0e
[docker-pmon] Fix copy of fancontrol config file (#5037)
Copy proper fancontrol config file to the proper destination. Also some minor refactoring for code reuse to help prevent issues like this in the future.

Fixes a bug introduced by #4599
2020-07-28 00:23:21 -07:00
Stepan Blyshchak
16a37d8c17
[dockers] update mellanox syncd and pmon to buster (#4818)
Upgrade to libsensors5

Updated sonic-sairedis pointer:
    d54bfb4 [SAI] update pointer (#636)
    1885a8c [syncd] Fix notification on shutdown request (#635)
    9e57ba2 Fixing hostif For Genetlink host interfaces (#633)
    449a092 sonic-sairedis: Add support to sonic-sairedis for gearbox phys (#632)

Signed-off-by: Stepan Blyschak <stepanb@mellanox.com>
2020-07-18 03:46:15 -07:00
yozhao101
4fa81b4f8d
[dockers] Update critical_processes file syntax (#4831)
**- Why I did it**
Initially, the critical_processes file contained either the name of a critical process or the name of a group.
For example, the critical_processes file in the dhcp_relay container contains a single group name,
`isc-dhcp-relay`. When testing the autorestart feature of each container, we need to get all the critical
processes and test whether a container is restarted correctly if one of its critical processes is
killed. However, it is difficult to tell whether the names in the critical_processes file are
critical processes or group names. Changing the syntax of this file separates individual processes from groups and also makes the distinction clear to the user.

Right now the critical_processes file contains two different kinds of entries. One is "program:xxx", which indicates a critical process. The other is "group:xxx", which indicates a group of critical processes
managed by supervisord under the name "xxx". I also updated the logic that
parses the critical_processes file in the supervisor-proc-event-listener script.
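
The two entry kinds described above can be told apart with a simple split on the first colon. The following is only a minimal sketch of that idea, not the actual logic in the supervisor-proc-event-listener script; the function name `parse_critical_processes` and the error handling are assumptions made for illustration.

```python
def parse_critical_processes(lines):
    """Split critical_processes entries into process names and group names.

    Hypothetical helper: each non-empty line is expected to look like
    "program:<name>" or "group:<name>".
    """
    processes, groups = [], []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # ignore blank lines
        kind, sep, name = line.partition(":")
        name = name.strip()
        if not sep or not name:
            raise ValueError(f"malformed entry: {line!r}")
        if kind == "program":
            processes.append(name)
        elif kind == "group":
            groups.append(name)
        else:
            raise ValueError(f"unknown entry type: {kind!r}")
    return processes, groups


# Example input; the entry names are illustrative.
sample = ["program:xcvrd", "group:isc-dhcp-relay", ""]
processes, groups = parse_critical_processes(sample)
print(processes, groups)
```

Rejecting lines without a recognized prefix keeps old-style entries (a bare name) from being silently treated as either a process or a group.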

**- How to verify it**
First, enable the autorestart feature of a specific container, for example `dhcp_relay`, by running the command `sudo config container feature autorestart dhcp_relay enabled` on the DUT. Then select a critical process from the output of `docker top dhcp_relay` and kill it with `sudo kill -SIGKILL <pid>`. The final step is to check whether the container is restarted correctly.
2020-06-25 21:18:21 -07:00
Guohan Lu
8da46d26c3 [docker-pmon]: use service dependency in supervisord to start services 2020-05-22 11:01:28 -07:00
Sujin Kang
cbc75fe4c8
[pmon]: Fix the continuous syseepromd autorestart issue on 201911 (#4478)
- Remove syseepromd from the critical process of pmon docker
- Fix supervisor autorestart configuration of syseepromd
2020-04-30 15:51:34 -07:00
Junchao-Mellanox
c730f3e207
[Mellanox] thermal control enhancement for dynamic minimum fan speed and PSU fan speed policy (#4403) 2020-04-21 08:09:53 -07:00
Kebo Liu
860cb265ac
[PMON] Extend pmon daemon start control to lm-sensors and fancontrol (#4447) 2020-04-21 08:00:48 -07:00
Kebo Liu
cfa112ace8
[Mellanox] Extend mellanox platform API to report SFP error event (#4365)
* extend mellanox platform API to report SFP error event
* remove unnecessary loop code
* install enum34 to pmon to support using Enum
2020-04-14 10:20:06 -07:00
lguohan
60b16495cc
[docker-base-stretch]: move common packages into docker-base-stretch (#4371)
libpython2.7, libdaemon0, libdbus-1-3, libjansson4 are common
across different containers. move them into docker-base-stretch

Signed-off-by: Guohan Lu <lguohan@gmail.com>
2020-04-05 13:29:34 -07:00
Sujin Kang
01f3f9286f
[fancontrol] Restart process upon unexpected exit, not entire pmon container (#4101)
* fancontrol restart

* Cleanup the default setting for exitcodes

* Remove the unnecessary stopwaitsecs default setting
2020-03-19 17:24:22 -07:00
Junchao-Mellanox
be549db395
Add thermal control support for SONiC (#3949) 2020-03-09 10:41:10 -07:00
yozhao101
91e5fb5602
[Service] Enable/disable container auto-restart based on configuration. (#4073) 2020-02-07 12:34:07 -08:00
yozhao101
4fa3a1e27e [Services] Restart Platform-monitor service upon unexpected critical process exit. (#3689)
Signed-off-by: Yong Zhao <yozhao@microsoft.com>
2019-11-04 17:44:01 -08:00
Andriy Moroz
976850fc00 [submodule update] Add SSD Health tools (#3218)
Signed-off-by: Andriy Moroz <c_andriym@mellanox.com>
2019-10-04 10:52:58 -07:00
sridhar-ravindran
56608bf06b [devices]: DELL Platform 2.0 API Infra and Reboot Reason support in Z9100 & S6100 (#3063) 2019-07-03 06:52:35 -07:00
Stepan Blyshchak
81cf33231f [build]: Improve dockerfile instructions (#3048)
- create a dockerfile-marcros.j2 file with all common operations
  written as j2 macro
- use single dockerfile instruction for COPY and RUN commands
  when possible to improve build time
- reorganize dockerfile instructions to make them more cache friendly
  (in case someday we will remove --no-cache to build docker images)

Signed-off-by: Stepan Blyschak <stepanb@mellanox.com>
2019-06-22 11:26:23 -07:00
Kebo Liu
8a08595006 [Pmon] Add new daemon "syseepromd" to pmon docker (#2866) 2019-06-18 11:02:24 -07:00
Stephen Sun
95452b7385 [docker-pmon] install dmidecode tool to pmon (#2990) 2019-06-12 12:10:43 +03:00
Kebo Liu
f5d3ee71a2 [pmon]: Add ethtool to pmon docker (#2943) 2019-05-25 17:59:56 -07:00
Samuel Angebault
77cde50541 [device/Arista] Improvements to the boot of Arista devices. (#2898)
* Fix showing systemd shutdown sequence when verbose is set

* Fix creation of kernel-cmdline file

Sometimes boot0 prints error
"mv: can't preserve ownership of '/mnt/flash/image-arsonic.xxxx/kernel-cmdline': Operation not permitted"

* Improve flash space usage during installation

Some older systems only have 2GB of flash available. Installing a second
image on these can prove to be challenging.
The new installation process moves the installer swi to memory in order
to free up space on the flash before uncompressing it there.
This removes the flash space usage spike and also improves IO,
since the installation no longer reads from and writes to the flash at
the same time.

* Add support of 7060CX-32S-SSD

* 7260CX3: use inventory powerCycle procedures

* 7050QX-32S: use inventory powerCycle procedures

* 7050QX-32: use inventory powerCycle procedures

* platform: arista: add common platform_reboot

Replace platform_reboot by a link to new common for devices already
using a similar script.

* 7060CX-32S: use inventory powerCycle procedures

* Install python smbus in pmon

Some platform plugins need the python smbus library to perform some actions.
This installs the dependency.
2019-05-15 12:45:05 -07:00
Qi Luo
6b3a26f0cc
Remove unused packages in docker images and host (#2807)
* Remove unneeded packages in docker images and host
* Remove libpython3.6 from snmp docker image
2019-04-29 17:21:24 -07:00
Wirut Getbamrung
27803ec603 [docker-platform-monitor]: Add smartmontools 6.6-1 (#2703) 2019-04-10 21:55:54 -07:00
Mykola F
3826ffd30f [pmon] move platform monitor docker to stretch (#2680)
Signed-off-by: Mykola Faryma <mykolaf@mellanox.com>
2019-03-22 16:42:56 -07:00
Kebo Liu
84b46bb0e0 [Pmon] dynamically load pmon daemons (#2654)
* dynamically load pmon daemons
2019-03-22 02:49:35 -07:00
Nazarii Hnydyn
b22fe37670 [mellanox]: Upgraded hw-management V.2.0.0160. (#2643)
Signed-off-by: Nazarii Hnydyn <nazariig@mellanox.com>
2019-03-06 18:51:46 -08:00
lguohan
f20665008c
[build]: put stretch debian packages under target/debs/stretch/ (#2519)
* [build]: put stretch debian packages under target/debs/stretch/

* in stretch build phase, all debian packages built in that stage are placed under target/debs/stretch directory.
* for python-based debian packages, since they are really the same for jessie and stretch, they are placed under target/python-debs directory.

Signed-off-by: Guohan Lu <gulv@microsoft.com>
2019-02-04 22:06:37 -08:00
Kevin(Shengkai) Wang
b3abf9af7f [docker-platform-monitor] add psud daemon to Pmon (#2423)
* Add psud daemon to pmon container
* Update submodule sonic-platform-daemons

Submodule update sonic-platform-daemons:

e5d8155 - [sonic-psud] add a new daemon sonic-psud to platform monitor (#20)

Signed-off-by: Kevin Wang <kevinw@mellanox.com>
2019-01-15 21:24:47 -08:00
lguohan
f3ca7c422f
[rsyslog]: use # to separate container name and program name in syslog message (#1918)
Previously, / was used to separate the container name and the program name.

However, in rsyslogd:

Precisely, the programname is terminated by either (whichever occurs first):

end of tag
nonprintable character
‘:’
‘[’
‘/’
The above definition has been taken from the FreeBSD syslogd sources.
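
The effect of that rule on the separator choice can be illustrated with a simplified model. This is a sketch of the quoted termination rule only, not rsyslog's actual parser; the function name `programname` and the sample tags are hypothetical.

```python
def programname(tag):
    """Return the part of a syslog tag up to the first terminator.

    Simplified model of the rule quoted above: the program name ends at
    the end of the tag, a nonprintable character, ':', '[' or '/'.
    """
    for i, ch in enumerate(tag):
        if ch in ":[/" or not ch.isprintable():
            return tag[:i]
    return tag


# With '/', the program name is cut off after the container name.
print(programname("pmon/ledd[42]:"))  # pmon
# With '#', container and program name stay together.
print(programname("pmon#ledd[42]:"))  # pmon#ledd
```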

Signed-off-by: Guohan Lu <gulv@microsoft.com>
2018-08-12 22:23:58 -07:00
paavaanan
ecfca8bf23 [devices]: DellEMC new platform support for z9264f - 64x100 (#26)
* Added new platform support DellEMC - Z9264f - 64x100

* Includes changes with Makefiles, sfputil, eeprom and default minigraph

* Led support for Z9264f platform

* Includes changes on default minigraph

* ipmitool implementation in pmon docker. platform_sensors script is included in pmon startup
2018-08-11 09:09:03 +00:00
Kebo Liu
38beca654c [docker-platform-monitor] make file and supervisord conf change for new xcvrd daemon (#1840)
* [docker-platform-monitor] make file and supervisord conf change for new xcvrd daemon

* make file change for the new daemon
* supervisord conf change for the new daemon

signed-off-by Liu Kebo kebol@mellanox.com

* make xcvrd start sequence aligned with the supervisord conf

* update submodules to include xcvrd modification
2018-08-03 16:33:56 -07:00
Qi Luo
7ba08e5bf6
Prefix docker container name to syslog syslogtag (program name) (#1810) 2018-06-25 10:48:42 -07:00
Joe LeVeque
1102acec48 [ledd] Exit with code 0 if we fail to find a platform-specific led_control module; no autorestart (#1688) 2018-05-10 01:20:22 -07:00
Joe LeVeque
1df7c9a993
[docker-platform-monitor] Convert ledd from polling-based to subscription-based model (#1623) 2018-04-20 10:42:19 -07:00
Joe LeVeque
e1cb2ace36 [base image files] All 'docker exec' wrapper scripts now dynamically adjust their flags depending on whether or not they are run on a terminal (#1507) 2018-03-17 00:43:29 -07:00
Joe LeVeque
ab26a5c589
Install sonic-platform-common package in platform-monitor docker for ledd (#1330)
* Install sonic-platform-common package in platform-monitor docker for ledd

* Specify Python wheel dependencies in docker-platform-monitor.mk; Remove explicit specifications from Dockerfile.j2
2018-01-22 10:52:52 -08:00
Joe LeVeque
def0f2e4de [sensors]: Workaround for apparent bug in lm-sensors (#1058) 2017-10-20 11:01:26 -07:00
Joe LeVeque
bbf1d6624b [docker-platform-monitor]: Remove stale fancontrol.pid file (if exists) before starting fancontrol (#1002) 2017-09-30 10:55:03 -07:00
Joe LeVeque
f938f3ecaf [docker-platform-monitor]: Prevent supervisor from logging unexpected exits from processes known to exit in < 1 second (#889) 2017-08-15 10:38:22 -07:00
Joe LeVeque
f49cac086f Remove extra trailing newlines at EOF (#804)
Files now end with a single newline
2017-07-12 20:54:37 -07:00
Joe LeVeque
22819d9983 [docker-platform-monitor]: Add fancontrol (#735) 2017-06-23 15:23:00 -07:00
Joe LeVeque
d094ceecc2 [docker-platform-monitor]: Add LED control daemon and plugin for x86_64-arista_7050_qx32 platform (#691)
* Add files for building ledd package; add ledd to docker-platform-monitor; Control platform monitor docker using supervisord

* Add sonic-platform-daemons submodule

* Rename ledd.mk -> sonic-ledd.mk

* Add led_control.py plugin for x86_64-arista_7050_qx32 platform

* Rename Dockerfile -> Dockerfile.j2

* Fix build

* Remove blank line
2017-06-10 22:05:11 -07:00
Joe LeVeque
d5c13c0a83 [dockers]: Disable autorestart on all supervisor processes inside containers (#580) 2017-05-09 17:37:08 -07:00
Joe LeVeque
8f348399f5 [Dockers]: Manage all Docker containers with Supervisord (#573)
- Consolidate config.sh and start.sh scripts into one script (start.sh)
 - Solve issue #435 - All dockers now run supervisord as their ENTRYPOINT
 - All stdout/stderr output from processes managed by supervisord is now sent to syslog instead of their own files
 - Supervisord log messages are now also sent to syslog
 - Removed unused smartmontools package from docker-platform-monitor
2017-05-08 15:43:31 -07:00
pavel-shirshov
814fd87e63 Remove /var/run/rsyslogd.pid before starting rsyslog (#453) 2017-03-29 18:07:25 -07:00
Taoyu Li
f08874db36 [platform-monitor]: Fix sensors.conf file path (#426)
sensors.conf file was moved in #316.
2017-03-22 16:59:12 -07:00
pavel-shirshov
a845740543 [All Dockerfiles]: Prevent apt asking questions on the console (#300)
Add noninteractive setting into every Dockerfile in the repo

Signed-off-by: Pavel Shirshov pavelsh@microsoft.com
2017-02-16 21:48:49 -08:00
lguohan
b6753e7960 [docker-config-engine]: introduce docker sonic config engine (#274)
* [docker-config-engine]: introduce docker sonic config engine

sonic config engine provides the sonic configuration engine for all sonic
dockers that rely on the engine to generate runtime configuration.
2017-02-07 18:11:19 -08:00
Oleksandr Ivantsiv
53a9792014 [Makefile]: Add possibility for docker containers to install files to base image (#240)
- Add vtysh/lldpctl/sensors to baseimage
2017-02-07 00:33:20 -08:00
Taoyu Li
e87498d16d Add platform config for 7050 and 6100 (#212)
* Add platform config for 7050 and 6100

* allow certain platforms to have no sensors.conf file in sonic-cfggen
2017-01-25 18:18:25 -08:00
Taoyu Li
1b49499b65 [docker]: Add a status file to mark that the file is generated by sonic-config-engine (#211) 2017-01-24 19:16:55 -08:00
Oleksandr Ivantsiv
ea65962fe4 Fix compilation issue. (#198)
Fix docker-platform-monitor compilation issue.
Update .gitignore file
2017-01-20 13:55:42 -08:00
Taoyu Li
4fe1bdcf87 sonic-cfggen with sonicv2 dockers (#190)
Add a sonic-config-engine to help generate config files based on minigraph and other data at runtime. Modify the fpm, teamd, lldp, snmp, and platform-monitor dockers to use sonic-config-engine to generate config in docker upon load.
2017-01-19 20:56:26 -08:00
lguohan
c07e54c3e1 [platform-monitor] update apt cache (#148) 2016-12-21 15:16:18 -08:00
Qi Luo
e4bd20c18a Squash merge master (11de390) 2016-08-04 10:39:33 -07:00