[S6100] Improve S6100 serial-getty monitor: wait and re-check when getty is not running to avoid false alerts. (#14402)


#### Why I did it
On the S6100, systemd sometimes fails to restart the serial-getty service automatically, so a monit unit checks the serial-getty service status and restarts it when needed.

However, this monit unit reports false alerts: in most cases when serial-getty is not running, systemd restarts it successfully on its own.

To avoid the false alerts, the monitor now waits and re-checks before reporting a failure.

Steps to reproduce this issue:
1. Log in to the device via the console and keep the connection open.
2. Log in to the device via SSH and check that serial-getty@ttyS1.service is running.
3. Run 'monit reload' from the SSH connection.
4. Check syslog 1 minute later; it will contain a false alert: "'serial-getty' process is not running"
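
Step 4 can be scripted. The following is a minimal sketch, not part of this change: `count_false_alerts` is a hypothetical helper name, and the default log path `/var/log/syslog` is an assumption that may differ on a given image.

```shell
#!/bin/bash
# Sketch only: count occurrences of the false alert in a syslog file.
# The default path /var/log/syslog is an assumption; pass another file as $1.
count_false_alerts() {
    local logfile="${1:-/var/log/syslog}"
    # -F matches the pattern as a fixed string; -c prints the number of matching lines.
    # Note: grep exits non-zero when the count is 0, but still prints "0".
    grep -c -F "'serial-getty' process is not running" "$logfile"
}
```

After running 'monit reload' and waiting about a minute, calling `count_false_alerts` with no argument would print how many times the alert was logged in the assumed default file.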

#### How I did it
Add a check-getty.sh script that re-checks the getty service several times before reporting that it is not running.
Update the monit unit to check the serial-getty service status with this script, which avoids the false alert.
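
The wait-and-re-check idea can be illustrated with a generic helper. This is a sketch under stated assumptions rather than the script added by this change: `retry_until_ok` and its parameters are hypothetical names, and unlike the committed script it does not sleep after the final failed attempt.

```shell
#!/bin/bash
# retry_until_ok ATTEMPTS DELAY COMMAND [ARGS...]
# Hypothetical helper: run COMMAND up to ATTEMPTS times, sleeping DELAY seconds
# between attempts. Returns 0 as soon as COMMAND succeeds, 1 if it never does.
retry_until_ok() {
    local attempts="$1" delay="$2"
    shift 2
    local i
    for ((i = 1; i <= attempts; i++)); do
        if "$@"; then
            return 0
        fi
        # Skip the final sleep: nothing is re-checked after the last attempt.
        if ((i < attempts)); then
            sleep "$delay"
        fi
    done
    return 1
}

# Example use for the getty check (needs systemd, so it is left commented out):
# retry_until_ok 5 1 /bin/systemctl --quiet is-active serial-getty@ttyS1.service
```

With 5 attempts and a 1 second delay, a genuinely dead service is reported after roughly 4 seconds, which gives systemd a short window to restart getty before monit sees a failure.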

#### How to verify it
All unit tests pass.
Manually verified that the fix works correctly:


```
admin@***:~$ sudo systemctl stop  serial-getty@ttyS1.service
admin@***:~$ sudo /usr/local/bin/check-getty.sh 
admin@***:~$ echo $?
1
admin@***:~$ sudo systemctl status serial-getty@ttyS1.service
serial-getty@ttyS1.service - Serial Getty on ttyS1
     Loaded: loaded (/lib/systemd/system/serial-getty@.service; enabled-runtime; vendor preset: enabled)
     Active: inactive (dead) since Tue 2023-03-28 07:15:21 UTC; 1min 13s ago

admin@***:~$ sudo /usr/local/bin/check-getty.sh 
admin@***:~$ echo $?
0
admin@***:~$ sudo systemctl status serial-getty@ttyS1.service
serial-getty@ttyS1.service - Serial Getty on ttyS1
     Loaded: loaded (/lib/systemd/system/serial-getty@.service; enabled-runtime; vendor preset: enabled)
```

syslog:
```
Mar 28 07:10:37.597458 *** INFO systemd[1]: serial-getty@ttyS1.service: Succeeded.
Mar 28 07:12:43.010550 *** ERR monit[593]: 'serial-getty' status failed (1) -- no output
Mar 28 07:12:43.010744 *** INFO monit[593]: 'serial-getty' trying to restart
Mar 28 07:12:43.010846 *** INFO monit[593]: 'serial-getty' stop: '/bin/systemctl stop serial-getty@ttyS1.service'
Mar 28 07:12:43.132172 *** INFO monit[593]: 'serial-getty' start: '/bin/systemctl start serial-getty@ttyS1.service'
Mar 28 07:13:43.286276 *** INFO monit[593]: 'serial-getty' status succeeded (0) -- no output
```
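
To pull a timeline like the one above out of a larger log, a filter along these lines may help. It is a sketch only: `monit_events` is a hypothetical helper, and the default path `/var/log/syslog` is an assumption.

```shell
#!/bin/bash
# Sketch only: print the monit log lines that mention a given service.
# Usage: monit_events SERVICE [LOGFILE]; LOGFILE defaults to /var/log/syslog (assumed path).
monit_events() {
    local service="$1" logfile="${2:-/var/log/syslog}"
    # First keep lines written by monit, then keep those naming the quoted service.
    grep -F "monit[" "$logfile" | grep -F "'$service'"
}
```

For example, `monit_events serial-getty` would list the status, stop, and start events that monit logged for the serial-getty check.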

#### Description for the changelog
[S6100] Improve S6100 serial-getty monitor.

Hua Liu 2023-04-05 21:34:31 -07:00 committed by mssonicbld
parent aea1980b14
commit 51b60613f7
3 changed files with 20 additions and 1 deletions


@@ -35,6 +35,7 @@ s6100/systemd/s6100-ssd-upgrade-status.service etc/systemd/system
 s6100/systemd/s6100-reboot-cause.service etc/systemd/system
 s6100/systemd/s6100-platform-startup.service etc/systemd/system
 s6100/scripts/s6100_serial_getty_monitor etc/monit/conf.d
+s6100/scripts/check-getty.sh usr/local/bin
 common/fw-updater usr/local/bin
 common/onie_mode_set usr/local/bin
 common/onie_version usr/local/bin


@@ -0,0 +1,17 @@
+#!/bin/bash
+RETRY=0
+while [ $RETRY -lt 5 ]; do
+    let RETRY=$RETRY+1
+    /bin/systemctl --quiet is-active serial-getty@ttyS1.service
+    status=$?
+    if [ $status == 0 ]; then
+        exit 0
+    fi
+    # When serial-getty is not running, recheck later, because systemd will restart serial-getty automatically.
+    sleep 1
+done
+exit 1


@@ -1,4 +1,5 @@
 #Dell S6100 serial getty monitor
-check process serial-getty matching "ttyS"
+check program serial-getty with path /usr/local/bin/check-getty.sh
 start program = "/bin/systemctl start serial-getty@ttyS1.service"
 stop program = "/bin/systemctl stop serial-getty@ttyS1.service"
+if status != 0 then restart