[memory_checker] Add a specific log message in a case when the docker service is not running. (#16018)
#### Why I did it
To fix the logic introduced by [[memory_checker] Do not check memory usage of containers which are not created #11129](https://github.com/sonic-net/sonic-buildimage/pull/11129).
The following scenario can occur shortly before a reboot:
1. The `docker` service has stopped.
2. Very shortly afterwards, the monit service runs `root@sonic:/home/admin# monit status container_memory_telemetry`.

In this scenario, the `memory_checker` script writes an error to the syslog:
```
ERR memory_checker: Failed to retrieve the running container list from docker daemon! Error message is: 'Error while fetching server API version: ('Connection aborted.', FileNotFoundError(2, 'No such file or directory'))'
```
However, this is expected behavior: when the docker service is stopped, its Unix socket is destroyed, which is why the `FileNotFoundError(2, 'No such file or directory')` exception appears in the syslog.
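
The effect of a missing socket file can be reproduced without Docker. The following is a minimal sketch, assuming a Linux host; `probe_unix_socket` is a hypothetical helper written for this illustration and is not part of `memory_checker`. It connects to a Unix socket path that does not exist and returns the resulting exception, which on Linux is a `FileNotFoundError` with errno 2.

```python
import errno
import os
import socket
import tempfile


def probe_unix_socket(path):
    """Try to connect to a Unix socket.

    Returns None if the connection succeeds, otherwise the OSError raised.
    (Hypothetical helper, for illustration only.)
    """
    sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    try:
        sock.connect(path)
        return None
    except OSError as err:
        return err
    finally:
        sock.close()


# Use a path inside a fresh temporary directory so it is guaranteed not to exist.
with tempfile.TemporaryDirectory() as tmp_dir:
    missing_path = os.path.join(tmp_dir, "docker.sock")
    error = probe_unix_socket(missing_path)
    print(type(error).__name__, error.errno == errno.ENOENT)
```

The Docker client wraps this low-level error in its own exception, which is why the syslog message shows it nested inside `Error while fetching server API version`.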
#### How I did it
Changed the log severity to warning and changed the return value.
#### How to verify it
It is hard to catch the exact moment described in the `Why I did it` section.
To check the logic:
1. Change the Unix socket path to a non-existent one in the [/usr/bin/memory_checker](47742dfc2c/files/image_config/monit/memory_checker (L139)) file on the switch.
2. Execute `root@sonic:/home/admin# monit restart container_memory_telemetry`.
3. Check the syslog for messages like these:
```
WARNING memory_checker: Failed to retrieve the running container list from docker daemon! Error message is: 'Error while fetching server API version: ('Connection aborted.', FileNotFoundError(2, 'No such file or directory'))'
INFO memory_checker: [memory_checker] Exits without checking memory usage since container 'telemetry' is not running!
```
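
Step 3 can also be scripted. This is a rough sketch, not part of the change itself: both function names are hypothetical, and reading from `/var/log/syslog` is an assumption about where the switch writes its log. It filters log lines for `memory_checker` entries and checks whether the "Exits without checking" message is present.

```python
def memory_checker_entries(lines):
    """Return the log lines that mention memory_checker (hypothetical helper)."""
    return [line.rstrip("\n") for line in lines if "memory_checker" in line]


def exited_without_checking(lines, container="telemetry"):
    """Return True if memory_checker reported that it skipped the given container."""
    expected = ("Exits without checking memory usage since container "
                "'{}' is not running!".format(container))
    return any(expected in line for line in memory_checker_entries(lines))


# Sample lines taken from the expected output above.
sample = [
    "WARNING memory_checker: Failed to retrieve the running container list from docker daemon!\n",
    "INFO memory_checker: [memory_checker] Exits without checking memory usage "
    "since container 'telemetry' is not running!\n",
    "INFO some_other_service: unrelated message\n",
]
print(len(memory_checker_entries(sample)), exited_without_checking(sample))
```

On a switch, the same functions could be fed an open file object, for example `open("/var/log/syslog")`, assuming that is the log location.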
parent edc1e48c17
commit b7dfc5b280
@@ -140,6 +140,11 @@ def get_running_container_names():
         running_container_list = docker_client.containers.list(filters={"status": "running"})
         running_container_names = [ container.name for container in running_container_list ]
     except (docker.errors.APIError, docker.errors.DockerException) as err:
+        if not is_service_active("docker"):
+            syslog.syslog(syslog.LOG_INFO,
+                          "[memory_checker] Docker service is not running. Error message is: '{}'".format(err))
+            return []
+
         syslog.syslog(syslog.LOG_ERR,
                       "Failed to retrieve the running container list from docker daemon! Error message is: '{}'"
                       .format(err))
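
The control flow of this change can be exercised in isolation. The sketch below is an illustration under stated assumptions, not the actual script: `running_container_names`, `failing_list`, and the injected `list_names`, `is_service_active`, and `log` callables are hypothetical stand-ins for the Docker client call, the service check, and `syslog.syslog`. It catches a broad `Exception` instead of the Docker exception classes so that it runs without the `docker` package, and because the hunk does not show what follows the `LOG_ERR` call, it simply re-raises at that point.

```python
import syslog


def running_container_names(list_names, is_service_active, log):
    """Return running container names, or [] when the docker service is down.

    All three arguments are injected stand-ins (assumptions for this sketch).
    """
    try:
        return list_names()
    except Exception as err:  # the real code catches docker.errors.* instead
        if not is_service_active("docker"):
            # Docker is stopped, so the failure is expected: log at INFO level.
            log(syslog.LOG_INFO,
                "[memory_checker] Docker service is not running. "
                "Error message is: '{}'".format(err))
            return []

        # Docker is running but the query failed: this is a real error.
        log(syslog.LOG_ERR,
            "Failed to retrieve the running container list from docker daemon! "
            "Error message is: '{}'".format(err))
        raise  # placeholder: the follow-up handling is not shown in the hunk


def failing_list():
    """Simulate the Docker client failing because its socket is gone."""
    raise FileNotFoundError(2, "No such file or directory")


messages = []
result = running_container_names(
    failing_list,
    lambda name: False,  # pretend the docker service is inactive
    lambda priority, text: messages.append((priority, text)),
)
print(result, messages[0][0] == syslog.LOG_INFO)
```

Injecting the service check keeps the sketch independent of systemd; on a real switch the check would query the service manager.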