On S6000 devices, the cold reboot is abrupt and is likely to cause issues that land the device in the EFI shell. Hence the platform reboot now happens after a graceful unmount of all filesystems, as on S6100.
Why I did it
To monitor the SSD health condition in DellEMC S6100 platform post upgrade.
A daemon is introduced to monitor the SSD every hour.
The SSD status is also checked at boot time and at the time of cold-reboot.
All these changes are supported only for newer SSD firmware.
Added a platform_reboot_pre_check script to prevent cold-reboot based on SSD status.
Depends on Azure/sonic-utilities#1472
DO NOT MERGE UNTIL ABOVE PR IS MERGED
During platform deinitialization, dell_ich is not removed properly, so when the S6100 platform is initialized again the ICH driver sysfs attributes are not attached. Because of this, get_transceiver_change_event returns an error, which causes xcvrd to crash.
* [platform] Add Support For Environment Variable
This PR adds the ability to read an environment file from /etc/sonic.
The file contains immutable SONiC config attributes such as platform,
hwsku, version, device_type. The aim is to minimize calls being made
into sonic-cfggen during boot time.
Signed-off-by: Tamer Ahmed <tamer.ahmed@microsoft.com>
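
A minimal sketch of how such an environment file could be consumed, assuming it holds plain `KEY=VALUE` lines. The file name `sonic-environment` and the key names `PLATFORM` and `HWSKU` used here are assumptions for illustration; the commit message only states that the file lives under /etc/sonic.

```python
def parse_env_file(text):
    """Parse KEY=VALUE lines into a dict, skipping blank lines and comments."""
    env = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop optional surrounding double quotes from the value.
        env[key.strip()] = value.strip().strip('"')
    return env


def load_env(path="/etc/sonic/sonic-environment"):
    """Read the environment file; the default path is an assumed file name."""
    try:
        with open(path) as handle:
            return parse_env_file(handle.read())
    except OSError:
        # A caller could fall back to sonic-cfggen when the file is missing.
        return {}


if __name__ == "__main__":
    sample = 'PLATFORM=x86_64-dell_s6100_c2538-r0\nHWSKU="Force10-S6100"\n'
    print(parse_env_file(sample))
```

Reading a small flat file once is cheaper than spawning sonic-cfggen for each attribute, which is the stated motivation for the change.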
- Why I did it
For fixing PCA MUX attachment issue in Dell S6100 platform.
- How I did it
Wait until the IOM MUX is powered up properly, then start I2C enumeration.
Align SFP key names with new standard defined in https://github.com/Azure/sonic-platform-common/pull/97
- hardwarerev -> hardware_rev
- serialnum -> serial
- manufacturename -> manufacturer
- modelname -> model
- Connector -> connector
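
The renames listed above can be expressed as a lookup table. This is an illustrative sketch only; the helper name `rename_sfp_keys` is hypothetical and not taken from the platform code.

```python
# Old key name -> new key name, as listed in the commit message.
SFP_KEY_RENAMES = {
    "hardwarerev": "hardware_rev",
    "serialnum": "serial",
    "manufacturename": "manufacturer",
    "modelname": "model",
    "Connector": "connector",
}


def rename_sfp_keys(info):
    """Return a copy of info with legacy keys renamed; other keys pass through."""
    return {SFP_KEY_RENAMES.get(key, key): value for key, value in info.items()}


if __name__ == "__main__":
    legacy = {"serialnum": "ABC123", "Connector": "LC", "other": "unchanged"}
    print(rename_sfp_keys(legacy))
```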
- Xilinx/Pericom peripherals are not actively used in the DellEMC S6100 switch.
- On some units these peripherals generate PCIe corrected-error messages that fill the syslog.
- Since they are not used, they are disabled at startup.
**- Why I did it**
For decoding system EEPROM of S6000 based on Dell offset format and S6000-ON’s system EEPROM in ONIE TLV format.
**- How I did it**
- Differentiate between S6000 and S6000-ON using the product name available in ‘dmi’ ( “/sys/class/dmi/id/product_name” )
- Added a new class ‘EepromS6000’ in eeprom.py to decode the S6000 system EEPROM in Dell offset format and update the redis DB with the EEPROM contents.
- Renamed certain methods in both Eeprom, EepromS6000 classes to accommodate the plugin-specific methods.
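
A rough sketch of the DMI-based selection described above. The exact product-name strings reported by the hardware are not given in this description, so the `-ON` suffix check and the returned format labels are assumptions for illustration.

```python
def read_product_name(path="/sys/class/dmi/id/product_name"):
    """Return the DMI product name, or an empty string if it cannot be read."""
    try:
        with open(path) as handle:
            return handle.read().strip()
    except OSError:
        return ""


def eeprom_format(product_name):
    """Pick the EEPROM layout from the product name.

    Assumption: the S6000-ON reports a product name ending in "-ON",
    while the plain S6000 does not.
    """
    if product_name.upper().endswith("-ON"):
        return "onie_tlv"
    return "dell_offset"


if __name__ == "__main__":
    print(eeprom_format(read_product_name()))
```

A caller would then instantiate either the ONIE TLV decoder or the ‘EepromS6000’ class depending on the returned label.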
**- How to verify it**
- Use 'decode-syseeprom' command to list the system EEPROM details.
- Wrote a python script to load chassis class and call the appropriate methods.
UT Logs: [S6000_eeprom_logs.txt](https://github.com/Azure/sonic-buildimage/files/4735515/S6000_eeprom_logs.txt), [S6000-ON_eeprom_logs.txt](https://github.com/Azure/sonic-buildimage/files/4735461/S6000-ON_eeprom_logs.txt)
Test script: [eeprom_test_py.txt](https://github.com/Azure/sonic-buildimage/files/4735509/eeprom_test_py.txt)
**- Why I did it**
- Skip thermalctld in DellEMC S6000, S6100, Z9100 and Z9264 platforms.
- Change the return type of thermal Platform APIs in DellEMC S6000, S6100 and Z9100 platforms to 'float'.
**- How I did it**
- Add 'skip_thermalctld:true' in pmon_daemon_control.json for DellEMC S6000, S6100, Z9100 and Z9264 platforms.
- Made changes in thermal.py, for 'get_temperature', 'get_high_threshold' and 'get_low_threshold' to return 'float' value.
**- How to verify it**
- Check thermalctld is not running in 'pmon'.
- Wrote a python script to load Chassis class and then call the APIs accordingly and verify the return type.
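
The verification script itself is not included here. The following is a sketch of what such a check might look like, using a stand-in `FakeThermal` class (hypothetical) so that it runs without switch hardware.

```python
THERMAL_METHODS = ("get_temperature", "get_high_threshold", "get_low_threshold")


def check_thermal_types(thermal):
    """Map each thermal API name to True if it returned a float."""
    return {name: isinstance(getattr(thermal, name)(), float) for name in THERMAL_METHODS}


class FakeThermal:
    """Hypothetical stand-in for a platform thermal object."""

    def get_temperature(self):
        return 34.0

    def get_high_threshold(self):
        return 80.0

    def get_low_threshold(self):
        return 0.0


if __name__ == "__main__":
    print(check_thermal_types(FakeThermal()))
```

On a device, the thermal objects would instead come from the loaded Chassis class; how they are obtained there is outside this sketch.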
For detecting transceiver change events through xcvrd in DellEMC S6000, S6100 and Z9100 platforms.
- In S6000, rename 'get_transceiver_change_event' in chassis.py to 'get_change_event' and return appropriate values.
- In S6100, implement 'get_change_event' through polling method (poll interval = 1 second) in chassis.py (Transceiver insertion/removal does not generate interrupts due to a CPLD bug)
- In Z9100, implement 'get_change_event' through interrupt method using select.epoll().
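
The polling approach used on S6100 can be sketched as below. This is not the code in chassis.py: `read_presence` is a hypothetical callable that returns a `{port: bool}` presence map, and the `{"sfp": {port: "1" or "0"}}` event layout is an assumption about what the caller expects.

```python
import time


def diff_presence(previous, current):
    """Return {port: '1' or '0'} for ports whose presence changed."""
    return {
        str(port): "1" if present else "0"
        for port, present in current.items()
        if previous.get(port) != present
    }


def poll_change_event(read_presence, previous, timeout_s, interval_s=1.0, sleep=time.sleep):
    """Poll presence until something changes or the timeout expires.

    Returns (status, event_dict, latest_presence).
    """
    deadline = time.monotonic() + timeout_s
    while True:
        current = read_presence()
        changes = diff_presence(previous, current)
        if changes:
            return True, {"sfp": changes}, current
        if time.monotonic() >= deadline:
            # Timed out with no change: report an empty event.
            return True, {"sfp": {}}, current
        sleep(interval_s)
```

Polling once per second trades a little latency for independence from the interrupt line, which fits the CPLD limitation noted for S6100.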
- Added support for Thermal event in Last Reboot Reason "show reboot-cause" command.
- Added support for sending log message in case of thermal shutdown.
sonic NOTICE root: Shutting down due to over temperature (40 degree, 30 degree, 34 degree)
FPGA driver crash fix for stale buffer in i2c transfer
LED firmware load issue fix.
10G port swap fix
psu/sfp bug fixes to report correct states/status of hw
This patch upgrades the kernel from version
4.9.0-9-2 (4.9.168-1+deb9u3) to 4.9.0-11-2 (4.9.189-3+deb9u2)
Co-authored-by: rajendra-dendukuri <47423477+rajendra-dendukuri@users.noreply.github.com>
- Implemented fancontrol service to monitor S6000 fans and adjust fan speed w.r.t temperature.
- fancontrol.service starts the fancontrol script at startup.
- The script averages the readings of three temperature sensors and sets the fans to the appropriate RPM for that temperature.
- When the fan speed is adjusted, the script logs the change to syslog for future reference.
- The script also checks for faulty fans and reports their status in syslog.
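
The averaging and speed selection described above can be illustrated with a small sketch. The temperature steps and RPM values below are made-up placeholders, not the values used by the S6000 fancontrol script.

```python
# Hypothetical (upper temperature bound in degrees C, target RPM) steps.
FAN_STEPS = [(30.0, 7000), (40.0, 10000), (50.0, 13000)]
MAX_RPM = 18000  # Hypothetical speed used above the last step.


def average_temperature(readings):
    """Average the sensor readings (three sensors in the S6000 case)."""
    return sum(readings) / len(readings)


def target_rpm(avg_temp):
    """Return the RPM for the first step whose bound exceeds avg_temp."""
    for limit, rpm in FAN_STEPS:
        if avg_temp < limit:
            return rpm
    return MAX_RPM


if __name__ == "__main__":
    avg = average_temperature([32.0, 35.0, 38.0])
    print(avg, target_rpm(avg))
```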
* This method is used to update firmware components such as CPLD, FPGA, BIOS and SMF.
* This uses the ONIE firmware upgrade design to stage the firmware update from the NOS.
a) Copy the latest firmware updater image to the target running SONiC.
b) Run “./fw-updater -u onie-firmware-x86_64-dellemc_s5200_c3538-r0.3.40.5.1-9.bin”.
c) This automatically reboots into ONIE update mode, updates the firmware components to the latest versions, and reboots back into SONiC without any user intervention.
Signed-off-by: Srideep Devireddy <srideep_devireddy@dell.com>
- The optoe driver truncates invalid pages (ff) but the sff driver does not, so the DOM-related calculations made by the sff8436 driver show incorrect data.
- A few optics do not support DOM.
- SFP plugins currently return None for unreadable pages, which throws an error in sfpshow eeprom --dom.
Implement Watchdog platform2.0 API for DellEMC S6100 platform.
- Added new file watchdog.py in sonic_platform directory.
- Enabled API support to Enable/disable watchdog.
Fixed the FPGA crash seen within a 15-20 minute time frame after onie-install. The crash is caused by accessing a stale i2c transfer message buffer. The message buffer becomes stale due to a race between the i2c transfer and the FPGA interrupt handler.
The new state STATE_STOP is not exposed to the wake-up call until all ISRs of the previous transfer have completed successfully.
Implemented remaining APIs in S6100, Z9100 and S6000
Removed the soft links in S6100, S6000 and Z9100 and implemented each separately
Implemented get_transceiver_change_event in S6000
* [DELL][Z9100,S6100,S6000] Platform 2.0 SFP Changes
Added generic support in sfp.py. The EEPROM path and sfp_control path are passed in from chassis.py.