[S6100] Improve serial-getty monitor: wait and re-check when getty is not running to avoid false alerts.
#### Why I did it
On S6100, systemd sometimes fails to auto-restart the serial-getty service, so a monit unit checks the serial-getty service status and restarts it when needed.
However, this monit check reports false alerts, because in most cases when serial-getty is not running, systemd restarts it successfully on its own.
To avoid the false alerts, improve the monitor to wait and re-check.
Steps to reproduce this issue:
1. Log in to the device via the console and keep the connection open.
2. Log in to the device via SSH and check serial-getty@ttyS1.service; it is running.
3. Run 'monit reload' from the SSH connection.
4. Check syslog 1 minute later; there will be a false alert: "'serial-getty' process is not running"
#### How I did it
Add a check-getty.sh script that re-checks later when the getty service is not running.
Update the monit unit to check the serial-getty service status with this script, so that a service systemd is about to restart no longer triggers a false alert.
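The wait-and-re-check idea can be sketched as follows. This is a minimal illustration in Python, not the contents of check-getty.sh: the function names, the injectable `is_active` and `sleep` parameters, and the 10-second `wait_seconds` grace period are all assumptions made for the example. Only the unit name and the 0/1 exit status convention come from this description.

```python
import subprocess
import time

# Unit name taken from the reproduction steps above.
GETTY_UNIT = "serial-getty@ttyS1.service"


def unit_is_active(unit=GETTY_UNIT):
    """Return True if systemd reports the unit as active."""
    # `systemctl is-active --quiet` exits with 0 only when the unit is active.
    result = subprocess.run(["systemctl", "is-active", "--quiet", unit], check=False)
    return result.returncode == 0


def check_with_recheck(is_active=unit_is_active, wait_seconds=10, sleep=time.sleep):
    """Return 0 if the service runs now or after one delayed re-check, else 1.

    wait_seconds is a hypothetical grace period, not the value used by
    check-getty.sh.
    """
    if is_active():
        return 0
    # Give systemd a chance to restart the service before reporting a failure.
    sleep(wait_seconds)
    return 0 if is_active() else 1
```

A non-zero result only after the second check means monit is alerted and restarts the service solely when systemd did not recover it within the grace period.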
#### How to verify it
All unit tests pass.
Manually verified that the fixed code works correctly:
```
admin@***:~$ sudo systemctl stop serial-getty@ttyS1.service
admin@***:~$ sudo /usr/local/bin/check-getty.sh
admin@***:~$ echo $?
1
admin@***:~$ sudo systemctl status serial-getty@ttyS1.service
● serial-getty@ttyS1.service - Serial Getty on ttyS1
Loaded: loaded (/lib/systemd/system/serial-getty@.service; enabled-runtime; vendor preset: enabled)
Active: inactive (dead) since Tue 2023-03-28 07:15:21 UTC; 1min 13s ago
admin@***:~$ sudo /usr/local/bin/check-getty.sh
admin@***:~$ echo $?
0
admin@***:~$ sudo systemctl status serial-getty@ttyS1.service
● serial-getty@ttyS1.service - Serial Getty on ttyS1
Loaded: loaded (/lib/systemd/system/serial-getty@.service; enabled-runtime; vendor preset: enabled)
```
syslog:
```
Mar 28 07:10:37.597458 *** INFO systemd[1]: serial-getty@ttyS1.service: Succeeded.
Mar 28 07:12:43.010550 *** ERR monit[593]: 'serial-getty' status failed (1) -- no output
Mar 28 07:12:43.010744 *** INFO monit[593]: 'serial-getty' trying to restart
Mar 28 07:12:43.010846 *** INFO monit[593]: 'serial-getty' stop: '/bin/systemctl stop serial-getty@ttyS1.service'
Mar 28 07:12:43.132172 *** INFO monit[593]: 'serial-getty' start: '/bin/systemctl start serial-getty@ttyS1.service'
Mar 28 07:13:43.286276 *** INFO monit[593]: 'serial-getty' status succeeded (0) -- no output
```
#### Description for the changelog
[S6100] Improve S6100 serial-getty monitor.
Why I did it
The smartctl tool is available only in the PMON docker, so it may not be accessible if the PMON docker goes down.
Use the iSMART_64 tool instead to fetch the SSD firmware version and device model information.
How I did it
Replaced smartctl with iSMART_64.
Why I did it
To gracefully unmount filesystems and stop containers while performing a cold reboot.
To unmount ONIE-BOOT, if mounted, during fast/soft/warm reboot.
How I did it
Override systemd-reboot service to perform a cold reboot.
Unmount ONIE-BOOT if mounted using fast/soft/warm-reboot plugins.
How to verify it
On reboot, verify that the container stop and filesystem unmount services have completed execution before the platform reboot.
Why I did it
S5296F - Platform API 2.0 changes
How I did it
Implemented the functional APIs needed for Platform API 2.0
How to verify it
Used the API 2.0 test suite to validate the test cases.
To reduce rc.local script execution time. Ports the changes from "[DellEMC] S6100 Platform Service optimization" (#10989).
Changes:
Run platform-modules-s6100.service and s6100-lpc-monitor.service asynchronously to the rc.local script.
Why I did it
To make the 'update_firmware' component API on the DellEMC Z9332f platform return False if the firmware image is not present at the provided image path.
How I did it
Updated 'update_firmware' in component.py to return False if the image is not found at the location given by 'image_path'.
How to verify it
Verified that the API returns False when an invalid image path is specified.
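The behavior described above can be sketched as below. This is an illustration only, not the Z9332f component.py code: the `Component` class here is a hypothetical stand-in, and `_install` is an assumed placeholder for the platform-specific update step. Only the method name `update_firmware`, the `image_path` parameter, and the False return on a missing image come from this description.

```python
import os


class Component:
    """Hypothetical stand-in for the platform component class (illustration only)."""

    def _install(self, image_path):
        # Placeholder for the platform-specific firmware update step.
        return True

    def update_firmware(self, image_path):
        """Return False when no firmware image exists at image_path."""
        if not os.path.isfile(image_path):
            return False
        return self._install(image_path)
```

Checking for the file before starting any update step lets the caller distinguish a bad path from a failed update without touching the hardware.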
Why I did it
S5212F - Platform API 2.0 changes
S5224F - Platform API 2.0 changes
How I did it
Implemented the functional APIs needed for Platform API 2.0
Added media_settings.json, pcie.yaml, platform.json, system_health_monitoring_config.json files.
How to verify it
Used the API 2.0 test suite to validate the test cases.
Why I did it
Added support for the device Z9432F
How I did it
Implemented the support for the platform Z9432F
Switch Vendor: DellEMC
Switch SKU: Z9432F-ON
ASIC Vendor: Broadcom
SONiC Image: sonic-broadcom.bin
Why I did it
To include the ONIE version in the 'show platform firmware status' command output on DellEMC S6100 and Z9332f platforms.
How I did it
Include ‘ONIE’ in the list of components provided by platform APIs in DellEMC S6100 and Z9332f.
Unmount ONIE-BOOT if mounted using fast/soft/warm-reboot plugins in DellEMC S6100.
DellEMC: N3248TE platform API2.0 changes
Why I did it
N3248TE Platform API 2.0 changes
How I did it
Implemented the functional APIs needed for Platform API 2.0
Added system_health_monitoring_config.json file
How to verify it
Used the API 2.0 test suite to validate the test cases.
Co-authored-by: Arun LK <Arun_L_K@dell.com>
Why I did it
DellEMC : Added support for N3248TE/N3248PXE platforms
How I did it
Implemented the changes to enable/disable the watchdog support
How to verify it
watchdog_unit_test.txt
Co-authored-by: Arun LK <Arun_L_K@dell.com>
Why I did it
To reduce the processing time of rc.local by refactoring the S6100 platform initialization.
Ports the changes from the 202012 branch: "[202012] Refactoring DELL platform init to reduce rc.local processing time" (#10171).
Why I did it
N3248TE - Platform API 2.0 changes
How I did it
Implemented the functional APIs needed for Platform API 2.0
How to verify it
Used the API 2.0 test suite to validate the test cases.
Why I did it
Reboot cause reporting does not work on S52xx platforms.
How I did it
Modified the platform APIs.
How to verify it
Run "show reboot-cause" and verify the reported reboot reason.
Why I did it
To incorporate the below changes in DellEMC S6100, S6000 platforms.
S6100, S6000:
Implement 'get_revision' method for Chassis
Implement 'get_maximum_consumed_power' method for FanDrawer
Implement 'get_revision', 'get_maximum_supplied_power' methods for PSU
Implement 'get_error_description' method for SFP.
S6100:
Implement 'get_module_index' method for Chassis
Implement 'get_description', 'get_maximum_consumed_power', 'get_oper_status', 'get_slot' methods for Module
Update component names in platform.json
How I did it
Implement the platform API methods in the respective device files
How to verify it
Verified that the respective sonic-mgmt platform API test cases report success.
Why I did it
To implement fan control using thermalctld in DellEMC S6000 platform
Requires: Azure/sonic-linux-kernel#241
How I did it
Add thermal policies in 'thermal_policy.json'
Implemented thermal_manager.py and the necessary modules to perform fan control via thermalctld
Removed fancontrol.sh
How to verify it
Verified that the fan speeds are set based on the fan and temperature status.
Logs: S6000_fan_control_test_logs.txt
Why I did it
Add the SSD to the platform components list.
Introduce platform_fw_au_reboot_handle to use the auto-update functionality in fwutil.
How I did it
Changes to be committed:
modified: platform/broadcom/sonic-platform-modules-dell/debian/platform-modules-s6100.install
new file: platform/broadcom/sonic-platform-modules-dell/s6100/scripts/platform_fw_au_reboot_handle
modified: platform/broadcom/sonic-platform-modules-dell/s6100/sonic_platform/chassis.py
modified: platform/broadcom/sonic-platform-modules-dell/s6100/sonic_platform/component.py
How to verify it
Verified by running the fwutil command.
    Warning: fwupdate_fwimage_dir: /var/platform/fwpackage/.
    Chassis    Module    Component    Firmware                         Version (Current/Available)    Status
    ---------  --------  -----------  -------------------------------  -----------------------------  ------------------
    S6100-ON             BIOS         S6100-BIOS-3.25.0.2-9-noRP2.bin  3.25.0.2-8 / 3.25.0.2-9        update is required
                         FPGA         smf_firmware_upgrade.tar         2.4 / 2.4                      up-to-date
                         CPLD         cpld_firmware_upgrade.tar        4 / 4                          up-to-date
                         SSD          ssd_firmware_upgrade.tar         S16425cG / S16425cG            up-to-date
    root@sonic:~#
Why I did it
To support iTCO watchdog using watchdog APIs.
How I did it
Implemented a new watchdog class WatchdogTCO for interfacing with iTCO watchdog.
Updated reboot cause determination logic.
How to verify it
Verified that the watchdog APIs' return values are as expected.
Logs: UT_logs.txt
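The kind of return-value check mentioned above can be illustrated with an in-memory stand-in. This is a sketch under stated assumptions, not the WatchdogTCO implementation: `FakeWatchdog` and its injectable `clock` are hypothetical, no iTCO device is touched, and the conventions shown (`arm` returning the armed timeout or -1, `get_remaining_time` returning -1 when disarmed) reflect the SONiC watchdog API contract as understood here and should be checked against the platform API base class.

```python
import time


class FakeWatchdog:
    """Hypothetical in-memory stand-in that mimics the watchdog API contract."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._deadline = None  # None means the watchdog is disarmed

    def arm(self, seconds):
        """Arm or re-arm the timer; return the armed timeout, or -1 on bad input."""
        if seconds <= 0:
            return -1
        self._deadline = self._clock() + seconds
        return seconds

    def disarm(self):
        """Stop the timer; return True on success."""
        self._deadline = None
        return True

    def is_armed(self):
        return self._deadline is not None

    def get_remaining_time(self):
        """Return whole seconds left before expiry, or -1 when disarmed."""
        if self._deadline is None:
            return -1
        return max(0, int(self._deadline - self._clock()))
```

Injecting the clock makes the remaining-time behavior checkable without waiting for real time to pass.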
Why I did it
To include capabilities fields in platform.json of DellEMC S6000, S6100, Z9332f platforms.
How I did it
Add the capabilities fields in each platform's respective platform.json.
How to verify it
Ran sonic-mgmt platform api test cases that use capabilities fields and verified that the results are as expected.
The ipmihelper files are duplicated across a few DellEMC platforms. Removed the
copies in sonic_platform, since the debian rules already copy ipmihelper
to the necessary directory.
Also add an out-of-tree pca9548 mux driver that uses platform data to map I2C buses to front panel ports.
Signed-off-by: Jakkapan Jangmuang <jjangmua@celestica.com>
Co-authored-by: Saikrishna Arcot <sarcot@microsoft.com>
Why I did it
Added Support for Dell EMC S5212f platform
How I did it
Implemented the support for Dell EMC S5212f platform
Platform: x86_64-dellemc_s5212f_c3538-r0
HwSKU: DellEMC-S5212f-P-25G
ASIC: broadcom
ASIC Count: 1
How to verify it
Verified the show command outputs
Why I did it
PCIe Gen1 settings were needed for the Dell S6000 device.
How I did it
Changed the speed of the Dell S6000 PCIe devices from Gen2 to Gen1.
How to verify it
Check the lspci output and verify the syslogs.
Why I did it
Added support for the device S5224F
How I did it
Implemented the support for the platform S5224F
Switch Vendor: DellEMC
Switch SKU: S5224F-ON
ASIC Vendor: Broadcom
SONiC Image: sonic-broadcom.bin
How to verify it
Verified the show platform/interface commands
Why I did it
Added support for the device N3248PXE
How I did it
Implemented the support for the platform N3248PXE
n3248pxe_unit_test_log.txt
Switch Vendor: DellEMC
Switch SKU: N3248PXE
ASIC Vendor: Broadcom
SONiC Image: sonic-broadcom.bin
How to verify it
Verified the show platform commands
Why I did it
Added support for the device N3248TE
How I did it
Implemented the support for the platform N3248TE
Switch Vendor: DellEMC
Switch SKU: N3248TE
ASIC Vendor: Broadcom
SONiC Image: sonic-broadcom.bin
How to verify it
Verified the show platform commands