sonic-buildimage/files/build_templates
Junchao-Mellanox 1c97a03b81
[system-health] Add support for monitoring system health (#4835)
* system health first commit

* system health daemon first commit

* Finish healthd

* Changes due to lower layer logic change

* Get ASIC temperature from TEMPERATURE_INFO table

* Add system health make rule and service files

* fix bugs found during manual test

* Change make file to install system-health library to host

* Set system LED to blink on bootup time

* Caught exceptions in system health checker to make it more robust

* fix issue that fan/psu presence will always be true

* fix issue for external checker

* move system-health service to right after rc-local service

* Set system-health service start after database service

* Get system up time via /proc/uptime

* Provide more information in stat for CLI to use

* fix typo

* Set default category to External for external checker

* If external checker reported OK, save it to stat too

* Trim string for external checker output

* fix issue: PSU voltage check always return OK

* Add unit test cases for system health library

* Fix LGTM warnings

* fix demo comments: 1. get boot up timeout from monit configuration file; 2. set system led in library instead of daemon

* Remove boot_timeout configuration because it will get from monit config file

* Fix argument miss

* fix unit test failure

* fix issue: summary status is not correct

* Fix format issues found in code review

* rename th to threshold to make it clearer

* Fix review comment: 1. add a .dep file for system health; 2. deprecate daemon_base and use sonic-py-common instead

* Fix unit test failure

* Fix LGTM alert

* Fix LGTM alert

* Fix review comments

* Fix review comment

* 1. Add relevant comments for system health; 2. rename external_checker to user_define_checker

* Ignore check for unknown service type

* Fix unit test issue

* Rename user define checker to user defined checker

* Rename user_define_checkers to user_defined_checkers for configuration file

* Rename file user_define_checker.py -> user_defined_checker.py

* Fix typo

* Adjust import order for config.py

Co-authored-by: Joe LeVeque <jleveque@users.noreply.github.com>

* Adjust import order for src/system-health/health_checker/hardware_checker.py

Co-authored-by: Joe LeVeque <jleveque@users.noreply.github.com>

* Adjust import order for src/system-health/scripts/healthd

Co-authored-by: Joe LeVeque <jleveque@users.noreply.github.com>

* Adjust import orders in src/system-health/tests/test_system_health.py

* Fix typo

* Add new line after import

* If system health configuration file not exist, healthd should exit

* Fix indent and enable pytest coverage

* Fix typo

* Fix typo

* Remove global logger and use log functions inherited from super class

* Change info level logger to notice level

Co-authored-by: Joe LeVeque <jleveque@users.noreply.github.com>
2020-10-12 11:12:49 +03:00
per_namespace BGP Service script path and error fix (#5183) 2020-08-15 12:09:10 -07:00
arp_update_vars.j2 [swss] Enhance ARP Update to Call Sonic Cfggen Once (#5398) 2020-09-18 18:44:23 -07:00
buffers_config.j2 Fix the build issue when port2cable length define in (#5437) 2020-09-23 08:07:09 -07:00
config-setup.service.j2 [config-setup]: create a SONiC configuration management service (#3227) 2019-12-04 07:15:58 -08:00
database.service.j2 Multi-ASIC implementation (#3888) 2020-03-31 10:06:19 -07:00
dhcp_relay.service.j2 [services] Remove explicit dependencies from dhcp_relay service file, control in swss.sh (#3823) 2019-11-26 16:59:45 -08:00
docker_image_ctl.j2 [Multi-Asic] Forward SNMP requests received on front panel interface to SNMP agent in host. (#5420) 2020-09-26 12:14:30 -07:00
gbsyncd.service.j2 Add gearbox phy device files and a new physyncd docker to support VS gearbox phy feature (#4851) 2020-09-25 08:32:44 -07:00
iccpd.service.j2 MCLAG feature for SONIC (#2514) 2020-04-04 15:24:06 -07:00
init_cfg.json.j2 Enhanced Feature Table state enable/disable for multi-asic platforms. (#5358) 2020-09-22 08:34:02 -07:00
lldp.service.j2 Changes for LLDP docker to support multi-npu platforms (#4530) 2020-05-11 11:05:44 -07:00
mgmt-framework.service.j2 [services][mgmt-framework] delay mgmt-framework service on boot (#5226) 2020-08-27 21:53:58 +03:00
mgmt-framework.timer [services][mgmt-framework] delay mgmt-framework service on boot (#5226) 2020-08-27 21:53:58 +03:00
nat.service.j2 [services] remove swss from WantedBy for nat service (#4991) 2020-07-19 21:50:26 -07:00
organization_extensions.sh Framework to plugin Organization specific scripts during ONIE Image build (#951) 2017-09-19 16:23:31 -07:00
pcie-check.timer Add pcie-check service to check PCIe devices at boot (#4771) 2020-07-13 14:15:09 -07:00
pmon.service.j2 [psud]: Fix for psud crash because of database connection reset (#3647) 2020-01-10 13:26:04 -08:00
process-reboot-cause.timer [reboot cause]: Delay process-reboot-cause service until network connection is stable (#4003) 2020-01-10 09:47:13 -08:00
qos_config.j2 [qos]: Alpha and ECN settings change for Th (#4564) 2020-05-09 11:21:18 -07:00
radv.service.j2 Move RADV fastboot handling to a service script (#5108) 2020-08-11 13:13:13 -07:00
restapi.service.j2 Start RestAPI container when sonic boots (#4140) 2020-02-12 16:38:45 -08:00
sflow.service.j2 [services] sflow service sets swss service as Requisite=, not Requires= (#3819) 2019-12-03 09:50:49 -08:00
snmp.service.j2 [services] make snmp.timer work again and delay telemetry.service (#3742) 2019-12-16 09:07:05 -08:00
snmp.timer [services] make snmp.timer work again and delay telemetry.service (#3742) 2019-12-16 09:07:05 -08:00
sonic_debian_extension.j2 [system-health] Add support for monitoring system health (#4835) 2020-10-12 11:12:49 +03:00
telemetry.service.j2 [services] make snmp.timer work again and delay telemetry.service (#3742) 2019-12-16 09:07:05 -08:00
telemetry.timer [services] make snmp.timer work again and delay telemetry.service (#3742) 2019-12-16 09:07:05 -08:00
updategraph.service.j2 [config-setup]: create a SONiC configuration management service (#3227) 2019-12-04 07:15:58 -08:00