sonic-buildimage/device
Junchao-Mellanox 1c97a03b81
[system-health] Add support for monitoring system health (#4835)
* system health first commit

* system health daemon first commit

* Finish healthd

* Changes due to lower layer logic change

* Get ASIC temperature from TEMPERATURE_INFO table

* Add system health make rule and service files

* fix bugs found during manual test

* Change make file to install system-health library to host

* Set system LED to blink on bootup time

* Caught exceptions in system health checker to make it more robust

* fix issue that fan/psu presence will always be true

* fix issue for external checker

* move system-health service to right after rc-local service

* Set system-health service start after database service

* Get system up time via /proc/uptime

* Provide more information in stat for CLI to use

* fix typo

* Set default category to External for external checker

* If external checker reported OK, save it to stat too

* Trim string for external checker output

* fix issue: PSU voltage check always return OK

* Add unit test cases for system health library

* Fix LGTM warnings

* fix demo comments: 1. get boot up timeout from monit configuration file; 2. set system led in library instead of daemon

* Remove boot_timeout configuration because it will get from monit config file

* Fix argument miss

* fix unit test failure

* fix issue: summary status is not correct

* Fix format issues found in code review

* rename th to threshold to make it clearer

* Fix review comment: 1. add a .dep file for system health; 2. deprecated daemon_base and uses sonic-py-common instead

* Fix unit test failure

* Fix LGTM alert

* Fix LGTM alert

* Fix review comments

* Fix review comment

* 1. Add relevant comments for system health; 2. rename external_checker to user_define_checker

* Ignore check for unknown service type

* Fix unit test issue

* Rename user define checker to user defined checker

* Rename user_define_checkers to user_defined_checkers for configuration file

* Rename file user_define_checker.py -> user_defined_checker.py

* Fix typo

* Adjust import order for config.py

Co-authored-by: Joe LeVeque <jleveque@users.noreply.github.com>

* Adjust import order for src/system-health/health_checker/hardware_checker.py

Co-authored-by: Joe LeVeque <jleveque@users.noreply.github.com>

* Adjust import order for src/system-health/scripts/healthd

Co-authored-by: Joe LeVeque <jleveque@users.noreply.github.com>

* Adjust import orders in src/system-health/tests/test_system_health.py

* Fix typo

* Add new line after import

* If system health configuration file not exist, healthd should exit

* Fix indent and enable pytest coverage

* Fix typo

* Fix typo

* Remove global logger and use log functions inherited from super class

* Change info level logger to notice level

Co-authored-by: Joe LeVeque <jleveque@users.noreply.github.com>
2020-10-12 11:12:49 +03:00
accton [as7326-56x]Fix port_eeprom i2c mapping (#5466) 2020-09-26 11:21:31 -07:00
alphanetworks In SAI 3.5 by default we are supporting 256 Group with 64 Memeber each. (#5400) 2020-09-22 11:21:12 -07:00
arista [arista]: Add disable_pcie_firmware_check soc property (#5543) 2020-10-06 18:35:15 -07:00
barefoot [barefoot] Switch to Y profiles for Newport board (#5187) 2020-10-08 13:23:51 -07:00
broadcom/x86_64-bcm_xlr-r0 In SAI 3.5 by default we are supporting 256 Group with 64 Memeber each. (#5400) 2020-09-22 11:21:12 -07:00
celestica In SAI 3.5 by default we are supporting 256 Group with 64 Memeber each. (#5400) 2020-09-22 11:21:12 -07:00
centec [centec]: Add centec arm64 architecture support for E530 (#4641) 2020-08-06 03:16:11 -07:00
cig [device] set the port state to default down for device cig and ingrasys s9130 and s9230 (#4618) 2020-05-21 02:14:51 -07:00
dell In SAI 3.5 by default we are supporting 256 Group with 64 Memeber each. (#5400) 2020-09-22 11:21:12 -07:00
delta In SAI 3.5 by default we are supporting 256 Group with 64 Memeber each. (#5400) 2020-09-22 11:21:12 -07:00
facebook/x86_64-facebook_wedge100-r0 In SAI 3.5 by default we are supporting 256 Group with 64 Memeber each. (#5400) 2020-09-22 11:21:12 -07:00
ingrasys In SAI 3.5 by default we are supporting 256 Group with 64 Memeber each. (#5400) 2020-09-22 11:21:12 -07:00
inventec [Inventec][D6356] Update Inventec 6356 (#3839) 2020-02-10 12:26:48 -08:00
juniper In SAI 3.5 by default we are supporting 256 Group with 64 Memeber each. (#5400) 2020-09-22 11:21:12 -07:00
marvell [marvell] skip thermal control daemon for marvell device (#4703) 2020-06-09 09:20:51 -07:00
mellanox [system-health] Add support for monitoring system health (#4835) 2020-10-12 11:12:49 +03:00
mitac/x86_64-mitac_ly1200_b32h0_c3-r0 In SAI 3.5 by default we are supporting 256 Group with 64 Memeber each. (#5400) 2020-09-22 11:21:12 -07:00
pegatron/x86_64-pegatron_porsche-r0 [fix]: various minor fixes (#2246) 2018-11-10 13:39:30 -08:00
quanta In SAI 3.5 by default we are supporting 256 Group with 64 Memeber each. (#5400) 2020-09-22 11:21:12 -07:00
virtual/x86_64-kvm_x86_64-r0 Add gearbox phy device files and a new physyncd docker to support VS gearbox phy feature (#4851) 2020-09-25 08:32:44 -07:00
wnc/x86_64-wnc_osw1800-r0 [barefoot][build] Fixed BFN platform build failure (#3766) 2019-11-19 22:14:29 -08:00