Why I did it
To monitor the SSD health condition in DellEMC S6100 platform post upgrade.
A daemon is introduced to monitor the SSD every hour.
To check the SSD status at boot time and at cold reboot.
All these changes are supported only for newer SSD firmware.
Porting changes from 201911 branch
Added a platform_reboot_pre_check script to prevent cold-reboot based on SSD status.
Depends on Azure/sonic-utilities#1788
DO NOT MERGE UNTIL ABOVE PR IS MERGED
How I did it
On branch s6100_ssd_202012
Changes to be committed:
modified: platform/broadcom/sonic-platform-modules-dell/debian/platform-modules-s6100.install
new file: platform/broadcom/sonic-platform-modules-dell/s6100/scripts/iSMART_64
new file: platform/broadcom/sonic-platform-modules-dell/s6100/scripts/platform_reboot_pre_check
modified: platform/broadcom/sonic-platform-modules-dell/s6100/scripts/s6100_platform.sh
new file: platform/broadcom/sonic-platform-modules-dell/s6100/scripts/s6100_ssd_mon.sh
new file: platform/broadcom/sonic-platform-modules-dell/s6100/scripts/s6100_ssd_upgrade_status.sh
new file: platform/broadcom/sonic-platform-modules-dell/s6100/scripts/soft-reboot_plugin
new file: platform/broadcom/sonic-platform-modules-dell/s6100/systemd/s6100-ssd-monitor.service
new file: platform/broadcom/sonic-platform-modules-dell/s6100/systemd/s6100-ssd-monitor.timer
new file: platform/broadcom/sonic-platform-modules-dell/s6100/systemd/s6100-ssd-upgrade-status.service
This PR fixes the healthd crash on the Celestica E1031 platform by adding a system health monitoring configuration file under device/celestica/x86_64-cel_e1031-r0/.
How to verify it
I manually restarted system-health.service and confirmed that healthd was running.
Signed-off-by: Yong Zhao <yozhao@microsoft.com>
Added logrotate files for wtmp and btmp that override the default configuration and cap the size at 100K, as done in
PR: #865. On buster this is controlled by the separate wtmp and btmp files.
Signed-off-by: Abhishek Dosi <abdosi@microsoft.com>
* c3691d3 [202012][pfcwd] Convert polling interval from ms to us in LUA scripts (#1909)
* 549c804 Mux state order change (#1902)
* 6b0b2c4 Update acl type check logic (#1886)
Signed-off-by: Volodymyr Samotiy <volodymyrs@nvidia.com>
* 2a8957d 2021-09-14 | [202012][sonic-utilities] CLI support for port auto negotiation (#1817) (HEAD, origin/202012) [vdahiya12]
Signed-off-by: Guohan Lu <lguohan@gmail.com>
* Removed execute permissions from the systemd copp-config.service file.
Without this we will get a warning: "Configuration file /lib/systemd/system/copp-config.service is marked executable. Please remove executable permission bits. Proceeding anyway."
Why I did it
fstrim has a dependency on the pmon docker.
How I did it
Start the fstrim timer after sonic.target.
How to verify it
local test and PR test.
Signed-off-by: Ying Xie <ying.xie@microsoft.com>
* d03ba4f [202012] [portstat, intfstat] added rates and utilization (#1812)
* 499ad3f [config reload] Fix config reload failure due to sonic.target job cancellation (#1814)
* 96d658c [202012][sonic installer] Add swap setup support (#1815)
* a9c6970 platform pre-check for reboot in 202012 branch (#1788)
* 0e0478b Unify the number format in the output of portstat and pfcstat in all cases (#1795)
* 2d1e00e [ecnconfig] Fix exception seen during display and add unit tests (#1784) (#1789)
Signed-off-by: Volodymyr Samotiy <volodymyrs@nvidia.com>
* 0323d5e noaOrMlnx Fix flex counters logic of converting poll interval to seconds from MS (#878)
Signed-off-by: Volodymyr Samotiy <volodymyrs@nvidia.com>
* [Nokia ixs7215] Miscellaneous platform API fixes
This commit delivers the following fixes for the Nokia ixs7215 platform
- Fix bug in a fan API error path
- Add support for setting the fan drawer led
- Add support for getting/setting the front panel PSU status led
- Add support for getting the min/max observed temperature value
* [Nokia ixs7215] code review changes: temperature min/max values
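As a rough illustration of the min/max observed temperature support mentioned above, the sketch below tracks the lowest and highest readings seen so far. The class name `ThermalRecorder` and the `record` method are hypothetical; the getter names mirror `get_minimum_recorded` and `get_maximum_recorded` from the SONiC thermal base class, but this is not the Nokia ixs7215 implementation.

```python
class ThermalRecorder:
    """Hypothetical helper that remembers the min/max temperature observed."""

    def __init__(self):
        # No reading has been observed yet.
        self._min = None
        self._max = None

    def record(self, temperature):
        """Update the observed extremes with a new reading (degrees Celsius)."""
        if self._min is None or temperature < self._min:
            self._min = temperature
        if self._max is None or temperature > self._max:
            self._max = temperature

    def get_minimum_recorded(self):
        """Return the lowest reading seen so far, or None if there is none."""
        return self._min

    def get_maximum_recorded(self):
        """Return the highest reading seen so far, or None if there is none."""
        return self._max
```

A real sensor class would presumably call something like `record` each time the temperature is read, so the extremes cover the lifetime of the process.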
Why I did it
Power cycle test case fails for Z9332f in the sonic-mgmt framework (#8605).
How I did it
Modified the platform API to return expected strings.
How to verify it
Power cycle the device and verify the reboot reason.
Run sonic-mgmt test_reboot script.
#### Why I did it
During a SONiC upgrade, the image is extracted to tmpfs for installation, so the tmpfs must be larger than the image. Installation fails if the image is larger than the tmpfs.
We hit the error below while installing a debug image larger than the tmpfs, which is 1.5 GB on the Marvell armhf platform.
sonic-installer install <url>
New image will be installed, continue? [y/N]: y
Downloading image...
...99%, 1744 MB, 708 KB/s, 0 seconds left...
Installing image SONiC-OS-202012.0-dirty-20210311.224845 and setting it as default...
Command: bash /tmp/sonic_image
tar: installer/fs.zip: Wrote only 7680 of 10240 bytes
tar: installer/onie-image-arm64.conf: Cannot write: No space left on device
tar: Exiting with failure status due to previous errors
Verifying image checksum ... OK.
Preparing image archive ...
#### How I did it
Compare the downloaded image size with the tmpfs size; if the tmpfs is smaller than the image, resize the tmpfs to fit the image.
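The comparison described above could look roughly like the following sketch. The function names, the optional `headroom_bytes` parameter, and the choice of a `mount -o remount,size=...` command are assumptions for illustration; the actual installer change may be structured differently.

```python
def required_tmpfs_bytes(image_bytes, tmpfs_bytes, headroom_bytes=0):
    """Return the tmpfs size needed to hold the image, or None if it already fits.

    headroom_bytes is a hypothetical safety margin, not taken from the PR.
    """
    needed = image_bytes + headroom_bytes
    if tmpfs_bytes >= needed:
        return None
    return needed


def remount_command(mount_point, size_bytes):
    """Build (but do not run) a command that resizes a tmpfs mount."""
    # tmpfs accepts a size with a 'k' suffix; round up to whole KiB.
    size_kib = -(-size_bytes // 1024)
    return ["mount", "-o", f"remount,size={size_kib}k", mount_point]
```

In practice the inputs might come from `os.path.getsize(image_path)` and `shutil.disk_usage(mount_point).total`, and the resulting command would be run with root privileges before the image is extracted.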
#### How to verify it
Install an image larger than the tmpfs. We verified this by installing a 1.9 GB debug image, which is larger than the 1.5 GB tmpfs.
This PR updates the following commits in sonic-platform-common
9d2e7d5 Add y-cable driver for simulated mux (#213)
e3e8f09 [Y-Cable][Broadcom] Broadcom implementation of YCable class which inherits from YCableBase required for Y-Cable API's in sonic-platform-daemons (#208)
This PR updates the following commits in sonic-platform-daemons
ebc4f3f [Y-Cable] create unknown entries for mux_cable when there is a cable present but module definition is not present/invalid module
b10c417 [xcvrd] initial support for integrating vendor specific class objects for calling Y-Cable API's inside xcvrd (#197) (#213)
f3fc1ea [y-cable] fix for logging the xcvrd metrics before writing the state to the State-DB (#208)
Signed-off-by: vaibhav-dahiya <vdahiya@microsoft.com>
- Why I did it
Update SDK\FW version to 4.4.3326\2008.3326. This version contains:
New Features:
1. Add support for Fast Boot for SN3800
Bug Fixing:
1. In some cases, when the total number of allocations exceeds the resource limit, an error can occur due to incorrect resource release procedure. This issue is most likely to affect the following resources: flow counters, ACL actions, PBS, WJH filter, Tunnels, ECMP containers, MC (L2 & L3).
2. On Spectrum systems, when using Async Router API with IPV6, an error message in the log regarding failing to remove ECMP container may show up. This error is not functional and can be safely ignored.
3. On Spectrum-2 systems and above, when using warm boot, setting max_bridge_num to a value greater than 1968 will cause an error and potential crash.
4. Some Molex cables do not support speed after reboot
- How I did it
Update submodule and .mk files
- How to verify it
Verified by running regression tests, including the complete set of supported sonic-mgmt tests.
Signed-off-by: Shlomi Bitton <shlomibi@nvidia.com>
Enabling the saiserver docker on different platforms requires different configuration files. Make the saiserver docker mount them from the hwsku folder.
Co-authored-by: Ubuntu <richardyu@richardyu-ubuntu-vm0.trsxrdzozv2e1czsze2t05vqzh.ix.internal.cloudapp.net>
Why I did it
The first 4 ports on this DUT are breakout ports. They might not always be connected in the lab. Mark them as 'RJ45' to skip the SFP check, since they are disabled by default.
How to verify it
Run platform test_reboot.py
Signed-off-by: Ying Xie <ying.xie@microsoft.com>
The Lodoga platform also matched crow, which hardcoded the flash
size to 3700. This change enables autodetect on Clearlake, which in turn
allows autodetect for Lodoga.
The threshold was bumped from 3700 to 4000 because size computation can
differ slightly and report slightly above 3700.
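The effect of the threshold bump can be sketched as follows. This assumes the threshold is an upper bound, in MB, for treating a device as belonging to the smaller flash class; the function name and the 3720 reading are hypothetical, and the real boot script may apply the value differently.

```python
OLD_THRESHOLD_MB = 3700
NEW_THRESHOLD_MB = 4000


def is_small_flash(detected_mb, threshold_mb=NEW_THRESHOLD_MB):
    """Return True if the autodetected flash size falls in the small-flash class."""
    return detected_mb <= threshold_mb


# Illustrative reading: a nominally 3700 MB flash that autodetects slightly high.
detected = 3720
with_old_threshold = is_small_flash(detected, OLD_THRESHOLD_MB)
with_new_threshold = is_small_flash(detected, NEW_THRESHOLD_MB)
```

With the old value, a reading slightly above 3700 would fall outside the small-flash class; the wider margin absorbs that variation.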