Why I did it
To monitor the SSD health condition in DellEMC S6100 platform post upgrade.
A daemon is introduced to monitor the SSD every one hour.
To check for SSD status at boot time and at the time of cold-reboot.
All these changes are supported only for newer SSD firmware.
Porting changes from 201911 branch
Added a platform_reboot_pre_check script to prevent cold-reboot based on SSD status.
Depends on Azure/sonic-utilities#1556
DO NOT MERGE UNTIL ABOVE PR IS MERGED
Co-authored-by: Arun LK <Arun_L_K@dell.com>
Why I did it
Support for show system-health command in s5232f
How I did it
Added the configuration, API changes to support system health
How to verify it
Execute "show system-health summary/detail/monitor-list" CLI.
Why I did it
platform test suite failed for few API's in DellEMC Z9332f platform.
How I did it
Modified the API's to return the expected values in the script.
How to verify it
Run platform test suite after making the changes.
Why I did it
Added a check in determining CPU reset in fast-/warm-reboot in their respective platform plugin.
Introducing reboot plugin for "reboot" command to handle its own platform plugin.
How I did it
On branch s6100_fast_warm_check
Changes to be committed:
(use "git reset HEAD ..." to unstage)
modified: ../../debian/platform-modules-s6100.install
modified: ../scripts/fast-reboot_plugin
modified: ../scripts/platform_reboot_override
new file: ../scripts/reboot_plugin
modified: ../scripts/track_reboot_reason.sh
modified: chassis.py
How to verify it
Triggered cold-reset inside fast-reboot to test out the reboot-cause 2.0 API.
Why I did it
serial-getty service exited in Dell S6100 device randomly.
How I did it
Added serial-getty to monit services.
How to verify it
Stop serial-getty in ssh session and check whether the service restarts or not.
Why I did it
To determine the revision of the pcie.yaml to be used based on BIOS version in DellEMC S6100 platform.
Depends on: Azure/sonic-platform-common#195
How I did it
Added two revisions of pcie.yaml pcie_1.yaml and pcie_2.yaml
Included a platform-specific Pcie class to provide the revision of the pcie.yaml to be used by pcieutil/pcied.
How to verify it
Execute pcieutil check (Azure/sonic-utilities#1672) command and verify the list of PCIe devices displayed.
Logs: UT_logs.txt
#### Why I did it
Support API 2.0 for S5248F platform
#### How I did it
Making changes to S5248F platform specific directory
Co-authored-by: Arun LK <Arun_L_K@dell.com>
#### Why I did it
- To build flashrom properly with dependency tracking.
#### How I did it
- Moved flashrom code from platform/broadcom/sonic-platform-modules-dell/tools directory to src/flashrom directory.
- At the end, flashrom_0.9.7_amd64.deb package is build which will be installed in the devices.
- Currently flashrom builds only for Dell S6100 platforms.
#### Why I did it
- After [sonic-linux-kernel#177](https://github.com/Azure/sonic-linux-kernel/pull/177) changes, the I2C mux channels of Baseboard and Switchboard CPLDs are moved from i2c-4 and i2c-5 to i2c-36 and i2c-37 respectively.
- This caused QSFP driver initialization of i2c-36 to i2c-41 to fail causing the ports from Ethernet208 to Ethernet248 fail.
#### How I did it
- The fix to this problem is to change the order of QSFP driver initialization to I2C mux channels.
- Instead of the order i2c-10 to i2c-41, the order i2c-4 to i2c-35 is being utilized.
- Also, need to change the i2c-mux-channel number for Baseboard CPLD and switchboard CPLD in scripts to access them.
#### Why I did it
xcvrd crashes when the application advertisement capability flag is not seen for few transceivers.
#### How I did it
Initialize the additional application capability in dunder init
#### Why I did it
400G media EEPROM and DOM information are not populated properly in DellEMC Z9332f platform.
#### How I did it
Handled QSFP_DD, QSFP28/QSFP+, SFP+ accordingly based on media type detected.
#### Why I did it
To build flashrom properly with dependency tracking.
#### How I did it
Moved flashrom code from platform/broadcom/sonic-platform-modules-dell/tools directory to src/flashrom directory.
At the end, flashrom_0.9.7_amd64.deb package is build which will be installed in the devices.
#### Why I did it
- xcvrd crash was seen in latest 201811 images.
- For Dell S6100,API 2.0 uses poll mode while 1.0 was still using interrupt mode.
#### How I did it
- Modified get_transceiver_change_event in 1.0 to poll mode.
Incorporate the below changes in DellEMC Z9332F platform:
- Implemented watchdog platform API support
- Implement ‘get_position_in_parent’, ‘is_replaceable’ methods for all device types
- Change return type of SFP methods to match specification in sonic_platform_common/sfp_base.py
- Added platform.json file in device directory.
Co-authored-by: V P Subramaniam <Subramaniam_Vellalap@dell.com>
- Why I did it
To collect platform based logs along with "show techsupport" on S6000 and S6100 plaforms.
- How I did it
On branch dell_techsupport_dump
Changes to be committed:
(use "git reset HEAD ..." to unstage)
new file: platform/broadcom/sonic-platform-modules-dell/common/actions.sh
modified: platform/broadcom/sonic-platform-modules-dell/debian/platform-modules-s6000.install
modified: platform/broadcom/sonic-platform-modules-dell/debian/platform-modules-s6100.install
new file: platform/broadcom/sonic-platform-modules-dell/s6000/scripts/hw-management-generate-dump.sh
new file: platform/broadcom/sonic-platform-modules-dell/s6100/scripts/hw-management-generate-dump.sh
- How to verify it
hw-mgmt-dump.tar.gz will be found in sonic_dump__< YYYYMMDD_HHMMSS>.tar.gz.
In preparation for the merging of Azure/sonic-platform-common#173, which properly defines class and instance members in the Platform API base classes.
It is proper object-oriented methodology to call the base class initializer, even if it is only the default initializer. This also future-proofs the potential addition of custom initializers in the base classes down the road.
#### Why I did it
To incorporate the below changes in DellEMC S5232, Z9264, Z9332 platforms.
- Update thermal high threshold values
- Make watchdog API Python2 and Python3 compatible
- Fix LGTM alerts
- Z9264: Fix get_change_event timer value
#### How I did it
- Use 'universal_newlines=True' in subprocess.Popen call.
- Change the timeout in 'get_change_event' to milliseconds to match specification in sonic_platform_common/chassis_base.py
The S6000 devices, the cold reboot is abrupt and it is likely to cause issues which will cause the device to land into EFI shell. Hence the platform reboot will happen after graceful unmount of all the filesystems as in S6100.
Moved the platform_reboot to platform_reboot_override and hooked it to the systemd shutdown services as in S6100
**- Why I did it**
PR https://github.com/Azure/sonic-platform-common/pull/102 modified the name of the SFF-8436 (QSFP) method to align the method name between all drivers, renaming it from `parse_qsfp_dom_capability` to `parse_dom_capability`. Once the submodule was updated, the callers using the old nomenclature broke. This PR updates all callers to use the new naming convention.
**- How I did it**
Update the name of the function globally for all calls into the SFF-8436 driver.
Note that the QSFP-DD driver still uses the old nomenclature and should be modified similarly. I will open a PR to handle this separately.
**- Why I did it**
To incorporate the below changes in DellEMC S6100, S6000 platforms.
- S6100, S6000:
- Enable 'thermalctld'
- Implement DeviceBase methods (presence, status, model, serial) for Fantray and Component
- Implement ‘get_position_in_parent’, ‘is_replaceable’ methods for all device types
- Implement ‘get_status’ method for Fantray
- Implement ‘get_temperature’, ‘get_temperature_high_threshold’, ‘get_voltage_high_threshold’, ‘get_voltage_low_threshold’ methods for PSU
- Implement ‘get_status_led’, ‘set_status_led’ methods for Chassis
- SFP:
- Make EEPROM read both Python2 and Python3 compatible
- Fix ‘get_tx_disable_channel’ method’s return type
- Implement ‘tx_disable’, ‘tx_disable_channel’ and ‘set_power_override’ methods
- S6000:
- Move PSU thermal sensors from Chassis to respective PSU
- Make available the data of both Fans present in each Fantray
**- How I did it**
- Remove 'skip_thermalctld:true' in pmon_daemon_control.json
- Implement the platform API methods in the respective device files
- Use `bytearray` for data read from transceiver EEPROM
- Change return type of 'get_tx_disable_channel' to match specification in sonic_platform_common/sfp_base.py
Fixes#6445
Because the ipmihelper.py script in the 9332 folder is slightly different than the common one (due to LGTM fixes), when the common one gets copied during build time it causes the workspace/build to become dirty.
Signed-off-by: Danny Allen <daall@microsoft.com>
- Dynamically change EEPROM driver based on media type.
- Otherwise, EEPROM INFO and DOM INFO might not be fetched properly and will result in erroneous output.
Add a systemd dependency to make platform-modules service as a prerequisite for determine-reboot-cause service to ensure platform initialization is complete before determine-reboot-cause.service executes.
- Why I did it
For determining reboot-cause while running newer BIOS, SMF firmware.
- How I did it
Made changes in reboot-cause determination script to add support for behavior of newer firmware.
- How to verify it
Performed different type of resets and verified "show reboot-cause" provides the correct reason.
Logs: UT_logs.txt
- Description for the changelog
DellEMC S6100: Update reboot-cause determination to support new firmware
- Change return type of SFP methods to match specification in sonic_platform_common/sfp_base.py
- Use init methods of base classes to initialize common instance variables
- Handle negative timeout values in S6100's watchdog ‘arm’ method
- Return appropriate values for 'get_target_speed', 'set_status_led' to avoid false warnings
- Implement FanDrawer and get_high_critical_threshold Platform API for S6000, S6100, Z9100 and Z9264F.
- Fix incorrect fan direction values in S6100, Z9100
- Make DellEMC platform modules Python3 compliant.
- Change return type of PSU Platform APIs in DellEMC Z9264, S5232 and Thermal Platform APIs in S5232 to 'float'.
- Remove multiple copies of pcisysfs.py.
- PEP8 style changes for utility scripts.
- Build and install Python3 version of sonic_platform package.
- Fix minor Platform API issues.
During platform deinitialization, dell_ich is not removed properly and when we do initialize s6100 platform, ICH driver sysfs attributes are not attached. Because of this, get_transceiver_change_event returns error and this leads xcvrd to crash.
- Why I did it
For fixing PCA MUX attachment issue in Dell S6100 platform.
- How I did it
Wait till IOM MUX powered up properly and start I2C enumeration.
We are moving toward building all Python packages for SONiC as wheel packages rather than Debian packages. This will also allow us to more easily transition to Python 3.
Python files are now packaged in "sonic-utilities" Pyhton wheel. Data files are now packaged in "sonic-utilities-data" Debian package.
**- How I did it**
- Build and install sonic-utilities as a Python package
- Remove explicit installation of wheel dependencies, as these will now get installed implicitly by pip when installing sonic-utilities as a wheel
- Build and install new sonic-utilities-data package to install data files required by sonic-utilities applications
- Update all references to sonic-utilities scripts/entrypoints to either reference the new /usr/local/bin/ location or remove absolute path entirely where applicable
Submodule updates:
* src/sonic-utilities aa27dd9...2244d7b (5):
> Support building sonic-utilities as a Python wheel package instead of a Debian package (#1122)
> [consutil] Display remote device name in show command (#1120)
> [vrf] fix check state_db error when vrf moving (#1119)
> [consutil] Fix issue where the ConfigDBConnector's reference is missing (#1117)
> Update to make config load/reload backward compatible. (#1115)
* src/sonic-ztp dd025bc...911d622 (1):
> Update paths to reflect new sonic-utilities install location, /usr/local/bin/ (#19)
* [platform] Add Support For Environment Variable
This PR adds the ability to read environment file from /etc/sonic.
the file contains immutable SONiC config attributes such as platform,
hwsku, version, device_type. The aim is to minimize calls being made
into sonic-cfggen during boot time.
singed-off-by: Tamer Ahmed <tamer.ahmed@microsoft.com>
Align SFP key names with new standard defined in https://github.com/Azure/sonic-platform-common/pull/97
- hardwarerev -> hardware_rev
- serialnum -> serial
- manufacturename -> manufacturer
- modelname -> model
- Connector -> connector
- Xilinx/pericom peripherals are not actively used in DellEMC S6100 switch.
- These peripherals are throwing PCIE corrected messages in some of the units and filling syslog.
- Since it is not usable disabling it at startup.
- Add .gitignore files in each subdirectory of src/, so as to reduce the size of the .gitignore file in the project root, and also make it easier to maintain (i.e., if a directory in src/ is removed, there will not be outdated entries in the root .gitignore file.
- Also add missing .gitignore entries and remove outdated entries and duplicates.
**- Why I did it**
For decoding system EEPROM of S6000 based on Dell offset format and S6000-ON’s system EEPROM in ONIE TLV format.
**- How I did it**
- Differentiate between S6000 and S6000-ON using the product name available in ‘dmi’ ( “/sys/class/dmi/id/product_name” )
- For decoding S6000 system EEPROM in Dell offset format and updating the redis DB with the EEPROM contents, added a new class ‘EepromS6000’ in eeprom.py,
- Renamed certain methods in both Eeprom, EepromS6000 classes to accommodate the plugin-specific methods.
**- How to verify it**
- Use 'decode-syseeprom' command to list the system EEPROM details.
- Wrote a python script to load chassis class and call the appropriate methods.
UT Logs: [S6000_eeprom_logs.txt](https://github.com/Azure/sonic-buildimage/files/4735515/S6000_eeprom_logs.txt), [S6000-ON_eeprom_logs.txt](https://github.com/Azure/sonic-buildimage/files/4735461/S6000-ON_eeprom_logs.txt)
Test script: [eeprom_test_py.txt](https://github.com/Azure/sonic-buildimage/files/4735509/eeprom_test_py.txt)
- Skip thermalctld in DellEMC S6000, S6100, Z9100 and Z9264 platforms.
- Change the return type of thermal Platform APIs in DellEMC S6000, S6100, Z9100 and Z9264 platforms to 'float'.
For detecting transceiver change events through xcvrd in DellEMC S6000, S6100 and Z9100 platforms.
- In S6000, rename 'get_transceiver_change_event' in chassis.py to 'get_change_event' and return appropriate values.
- In S6100, implement 'get_change_event' through polling method (poll interval = 1 second) in chassis.py (Transceiver insertion/removal does not generate interrupts due to a CPLD bug)
- In Z9100, implement 'get_change_event' through interrupt method using select.epoll().
1. undefine led_classdev_register as it is defined in leds.h
2. header file location change
a. linux/i2c/pmbus.h -> linux/pmbus.h
b. linux/i2c-mux-gpio.h -> linux/platform_data/i2c-mux-gpio.h
c. linux/i2c/pca954x.h -> linux/platform_data/pca954x.h
FPGA driver crash fix for stale buffer in i2c transfer
LED firmware load issue fix.
10G port swapfix
psu/sfp bug fixes to report correct states/status of hw
This patch upgrade the kernel from version
4.9.0-9-2 (4.9.168-1+deb9u3) to 4.9.0-11-2 (4.9.189-3+deb9u2)
Co-authored-by: rajendra-dendukuri <47423477+rajendra-dendukuri@users.noreply.github.com>
- Fix for Azure/sonic-buildimage#4095
- Exit status from failed make command(action) didnt reached parent target because the make command is inside the "for" loop.
- Only the exit status of the last command in the last iteration of the for loop is read by parent target.
- This is the reason why dpkg-buildpackage ignored the make error.
- Fixed the issue with help of "set -e".
- Added support for Thermal event in Last Reboot Reason "show reboot-cause" command.
- Added support for sending log message in case of thermal shutdown.
sonic NOTICE root: Shutting down due to over temperature (40 degree, 30 degree, 34 degree)
Implement classes IpmiSensor, IpmiFru to obtain platform sensors information for Platform2.0 APIs in DellEMC Z9264 platform.
Add a new file ipmihelper.py with the implementation for IpmiSensor, IpmiFru classes.
- Sfp,Eeprom,Chassis(transceiver change event) support added for z9264f Platform 2.0 API
- Added Interrupt handler to SFP change event in dell_z9264f_fpga_ocores.c
- Fixed few indentation and offset issues in sfputil.py for z9264f
- Implemented fancontrol service to monitor S6000 fans and adjust fan speed w.r.t temperature.
- fancontrol.service starts the fancontrol script at startup.
- This script takes the average temperature by reading three sensors and configure FANS to appropritate RPM against the temperature.
- When the temperature is adjusted script will log in syslog for future reference.
- Also, script checks for faulty fans and report the status in syslog.
This common utility would set next boot option as onie mode and
when reboot is triggered it would reboot the box into that specific onie mode.
Current support modes are rescue/install/uninstall
* This method is used to update firmware components such as CPLD,FPGA,BIOS,SMF
* This uses ONIE firmware upgrade design to stage firmware update from NOS.
a) Copy latest firmware updater image to target running sonic.
b) Run “./fw-updater -u onie-firmware-x86_64-dellemc_s5200_c3538-r0.3.40.5.1-9.bin”.
c) This would automatically reboot ONIE into update mode and update firmware components to latest and reboot back to sonic without any user intervention.
Signed-off-by: Srideep Devireddy <srideep_devireddy@dell.com>
- optoe driver truncates invalid pages(ff) but sff driver doesn't truncate.so,the DOM related calculation made by sff8436 driver will show incorrect data.
- Few optics doesn't support DOM.
- SFP plugins currently returns None for unreadable pages and this'd throw the below mentioned error in sfpshow eeprom --dom.
Implement Watchdog platform2.0 API for DellEMC S6100 platform.
- Added new file watchdog.py in sonic_platform directory.
- Enabled API support to Enable/disable watchdog.
Fixed the fpga crash issue which we see in 15-20 mins time frame after onie-install. Accessing stale i2c transfer message buffer causes this crash. Te message buffer becomes stale due to race between i2c transfer and fpga interrupt handler.
This new state STATE_STOP will not be exposed for the wake up call till all the ISR of previous transfer is completed successfully.