During platform deinitialization, dell_ich is not removed properly, so when the S6100 platform is initialized again the ICH driver sysfs attributes are not attached. As a result, get_transceiver_change_event returns an error, which causes xcvrd to crash.
- Why I did it
To fix the PCA MUX attachment issue on the Dell S6100 platform.
- How I did it
Wait until the IOM MUX is powered up properly, then start I2C enumeration.
We are moving toward building all Python packages for SONiC as wheel packages rather than Debian packages. This will also allow us to more easily transition to Python 3.
Python files are now packaged in the "sonic-utilities" Python wheel. Data files are now packaged in the "sonic-utilities-data" Debian package.
**- How I did it**
- Build and install sonic-utilities as a Python package
- Remove explicit installation of wheel dependencies, as these will now get installed implicitly by pip when installing sonic-utilities as a wheel
- Build and install new sonic-utilities-data package to install data files required by sonic-utilities applications
- Update all references to sonic-utilities scripts/entrypoints to either reference the new /usr/local/bin/ location or remove absolute path entirely where applicable
Submodule updates:
* src/sonic-utilities aa27dd9...2244d7b (5):
> Support building sonic-utilities as a Python wheel package instead of a Debian package (#1122)
> [consutil] Display remote device name in show command (#1120)
> [vrf] fix check state_db error when vrf moving (#1119)
> [consutil] Fix issue where the ConfigDBConnector's reference is missing (#1117)
> Update to make config load/reload backward compatible. (#1115)
* src/sonic-ztp dd025bc...911d622 (1):
> Update paths to reflect new sonic-utilities install location, /usr/local/bin/ (#19)
* [platform] Add Support For Environment Variable
This PR adds the ability to read an environment file from /etc/sonic.
The file contains immutable SONiC config attributes such as platform,
hwsku, version, device_type. The aim is to minimize calls being made
into sonic-cfggen during boot time.
Signed-off-by: Tamer Ahmed <tamer.ahmed@microsoft.com>
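A minimal sketch of how a consumer might read such a file, assuming it holds simple KEY=VALUE lines. The key names and sample values below are illustrative only and are not taken from the PR.

```python
def parse_environment(text):
    """Parse KEY=VALUE lines into a dict, skipping blank lines and comments."""
    env = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        # Drop surrounding whitespace and optional double quotes.
        env[key.strip()] = value.strip().strip('"')
    return env


# Hypothetical file contents; the real key names may differ.
sample = """
# immutable attributes
PLATFORM=x86_64-dell_s6100_c2538-r0
HWSKU=Force10-S6100
DEVICE_TYPE="ToRRouter"
"""

env = parse_environment(sample)
print(env["PLATFORM"])
```

Reading the values once from a flat file like this is what lets boot-time scripts avoid repeated sonic-cfggen invocations.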
Align SFP key names with new standard defined in https://github.com/Azure/sonic-platform-common/pull/97
- hardwarerev -> hardware_rev
- serialnum -> serial
- manufacturename -> manufacturer
- modelname -> model
- Connector -> connector
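The renames above can be sketched as a simple key mapping. The helper name `rename_sfp_keys` and the sample dictionary are hypothetical; only the old and new key names come from the list above.

```python
# Old key name -> new key name, as listed in this change.
KEY_RENAMES = {
    "hardwarerev": "hardware_rev",
    "serialnum": "serial",
    "manufacturename": "manufacturer",
    "modelname": "model",
    "Connector": "connector",
}


def rename_sfp_keys(info):
    """Return a copy of info with legacy key names replaced; other keys are kept."""
    return {KEY_RENAMES.get(key, key): value for key, value in info.items()}


# Hypothetical transceiver info using the legacy names.
old = {"serialnum": "ABC123", "Connector": "LC", "vendor_oui": "00-00-00"}
new = rename_sfp_keys(old)
print(new)
```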
- Xilinx/Pericom peripherals are not actively used in the DellEMC S6100 switch.
- On some units these peripherals generate PCIe corrected-error messages that fill the syslog.
- Since they are not used, disable them at startup.
- Add .gitignore files in each subdirectory of src/, so as to reduce the size of the .gitignore file in the project root, and also make it easier to maintain (i.e., if a directory in src/ is removed, there will not be outdated entries in the root .gitignore file).
- Also add missing .gitignore entries and remove outdated entries and duplicates.
**- Why I did it**
For decoding system EEPROM of S6000 based on Dell offset format and S6000-ON’s system EEPROM in ONIE TLV format.
**- How I did it**
- Differentiate between S6000 and S6000-ON using the product name available in ‘dmi’ (“/sys/class/dmi/id/product_name”).
- Added a new class ‘EepromS6000’ in eeprom.py to decode the S6000 system EEPROM in Dell offset format and update the redis DB with the EEPROM contents.
- Renamed certain methods in both Eeprom, EepromS6000 classes to accommodate the plugin-specific methods.
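A rough sketch of the product-name check described above. The function names and the exact matching rule (a product name ending in "-ON" selects the ONIE TLV decoder) are assumptions for illustration; the actual strings reported by DMI on these units are not shown in this change.

```python
DMI_PRODUCT_NAME = "/sys/class/dmi/id/product_name"


def read_product_name(path=DMI_PRODUCT_NAME):
    """Read the DMI product name from sysfs."""
    with open(path) as handle:
        return handle.read().strip()


def eeprom_format(product_name):
    """Pick the EEPROM layout for a product name.

    Assumption: the ONIE variant reports a name ending in "-ON".
    """
    if product_name.strip().upper().endswith("-ON"):
        return "onie_tlv"
    return "dell_offset"


print(eeprom_format("S6000-ON"))
print(eeprom_format("S6000"))
```

On a real switch the caller would pass `read_product_name()` to `eeprom_format()` and instantiate either Eeprom or EepromS6000 accordingly.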
**- How to verify it**
- Use 'decode-syseeprom' command to list the system EEPROM details.
- Wrote a python script to load chassis class and call the appropriate methods.
UT Logs: [S6000_eeprom_logs.txt](https://github.com/Azure/sonic-buildimage/files/4735515/S6000_eeprom_logs.txt), [S6000-ON_eeprom_logs.txt](https://github.com/Azure/sonic-buildimage/files/4735461/S6000-ON_eeprom_logs.txt)
Test script: [eeprom_test_py.txt](https://github.com/Azure/sonic-buildimage/files/4735509/eeprom_test_py.txt)
- Skip thermalctld in DellEMC S6000, S6100, Z9100 and Z9264 platforms.
- Change the return type of thermal Platform APIs in DellEMC S6000, S6100, Z9100 and Z9264 platforms to 'float'.
For detecting transceiver change events through xcvrd in DellEMC S6000, S6100 and Z9100 platforms.
- In S6000, rename 'get_transceiver_change_event' in chassis.py to 'get_change_event' and return appropriate values.
- In S6100, implement 'get_change_event' through polling method (poll interval = 1 second) in chassis.py (Transceiver insertion/removal does not generate interrupts due to a CPLD bug)
- In Z9100, implement 'get_change_event' through interrupt method using select.epoll().
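The polling approach used for S6100 can be sketched as below. This is not the platform code: `read_presence` is a hypothetical callable returning `{port_index: bool}`, and the `(bool, {'sfp': {...}})` return shape with '1'/'0' values is an assumption about what xcvrd consumes.

```python
import time


def poll_change_event(read_presence, timeout_ms=0, interval_s=1.0):
    """Poll transceiver presence until something changes or the timeout expires.

    timeout_ms == 0 means wait indefinitely.
    Returns (True, {'sfp': {port: '1' or '0'}}); the inner dict is empty on timeout.
    """
    deadline = None if timeout_ms == 0 else time.monotonic() + timeout_ms / 1000.0
    previous = read_presence()
    while True:
        current = read_presence()
        changed = {
            port: ("1" if present else "0")
            for port, present in current.items()
            if previous.get(port) != present
        }
        if changed:
            return True, {"sfp": changed}
        if deadline is not None and time.monotonic() >= deadline:
            return True, {"sfp": {}}
        time.sleep(interval_s)


# Simulated presence readings: port 1 is inserted on the third read.
states = iter([{1: False, 2: True}, {1: False, 2: True}, {1: True, 2: True}])
ok, events = poll_change_event(lambda: next(states), interval_s=0)
print(ok, events)
```

A real implementation would keep the previous presence state across calls rather than re-reading it at the start of each call.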
1. undefine led_classdev_register as it is defined in leds.h
2. header file location change
a. linux/i2c/pmbus.h -> linux/pmbus.h
b. linux/i2c-mux-gpio.h -> linux/platform_data/i2c-mux-gpio.h
c. linux/i2c/pca954x.h -> linux/platform_data/pca954x.h
FPGA driver crash fix for stale buffer in i2c transfer
LED firmware load issue fix.
10G port swap fix
PSU/SFP bug fixes to report the correct state/status of the hardware
This patch upgrades the kernel from version
4.9.0-9-2 (4.9.168-1+deb9u3) to 4.9.0-11-2 (4.9.189-3+deb9u2)
Co-authored-by: rajendra-dendukuri <47423477+rajendra-dendukuri@users.noreply.github.com>
- Fix for Azure/sonic-buildimage#4095
- The exit status of a failed make command (action) did not reach the parent target because the make command runs inside a "for" loop.
- The parent target reads only the exit status of the last command in the last iteration of the for loop.
- This is why dpkg-buildpackage ignored the make error.
- Fixed the issue by using "set -e".
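The behaviour described above can be reproduced with a small script, assuming a POSIX `sh` is available. The loop below stands in for the make invocations: its first iteration fails and its last succeeds.

```python
import subprocess

# First iteration runs `false` (fails), second runs `true` (succeeds).
LOOP = "for cmd in false true; do $cmd; done"

# Without "set -e" the loop's status is that of the last command executed.
without_e = subprocess.run(["sh", "-c", LOOP]).returncode

# With "set -e" the shell exits at the first failing command.
with_e = subprocess.run(["sh", "-c", "set -e; " + LOOP]).returncode

print(without_e, with_e)  # prints "0 1"
```

The first result shows how the failure is masked; the second shows the failure propagating once "set -e" is in effect.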
- Added support for Thermal event in Last Reboot Reason "show reboot-cause" command.
- Added support for sending log message in case of thermal shutdown.
sonic NOTICE root: Shutting down due to over temperature (40 degree, 30 degree, 34 degree)
Implement classes IpmiSensor, IpmiFru to obtain platform sensors information for Platform2.0 APIs in DellEMC Z9264 platform.
Add a new file ipmihelper.py with the implementation for IpmiSensor, IpmiFru classes.
- Added Sfp, Eeprom and Chassis (transceiver change event) support for the z9264f Platform 2.0 API
- Added an interrupt handler for SFP change events in dell_z9264f_fpga_ocores.c
- Fixed a few indentation and offset issues in sfputil.py for z9264f
- Implemented fancontrol service to monitor S6000 fans and adjust fan speed w.r.t temperature.
- fancontrol.service starts the fancontrol script at startup.
- The script averages the temperature read from three sensors and sets the fans to the appropriate RPM for that temperature.
- When the fan speed is adjusted for a temperature change, the script logs it in syslog for future reference.
- The script also checks for faulty fans and reports their status in syslog.
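The averaging and speed-selection logic might look roughly like the sketch below. The threshold table and RPM values are invented for illustration; the real script's values are not given in this description.

```python
# Hypothetical (upper temperature bound in deg C, target RPM) pairs.
SPEED_TABLE = [(35, 7000), (45, 10000), (55, 13000)]
MAX_RPM = 18000  # hypothetical speed used above the last threshold


def average_temperature(readings):
    """Average the sensor readings (three sensors in the S6000 case)."""
    return sum(readings) / len(readings)


def target_rpm(avg_temp):
    """Return the RPM for the first threshold the temperature falls below."""
    for limit, rpm in SPEED_TABLE:
        if avg_temp < limit:
            return rpm
    return MAX_RPM


avg = average_temperature([30, 32, 34])
print(avg, target_rpm(avg))
```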
This common utility sets the next boot option to an ONIE mode, so that
when a reboot is triggered the box reboots into that specific ONIE mode.
Currently supported modes are rescue/install/uninstall.
* This method is used to update firmware components such as CPLD,FPGA,BIOS,SMF
* This uses ONIE firmware upgrade design to stage firmware update from NOS.
a) Copy latest firmware updater image to target running sonic.
b) Run “./fw-updater -u onie-firmware-x86_64-dellemc_s5200_c3538-r0.3.40.5.1-9.bin”.
c) This would automatically reboot ONIE into update mode and update firmware components to latest and reboot back to sonic without any user intervention.
Signed-off-by: Srideep Devireddy <srideep_devireddy@dell.com>
- The optoe driver truncates invalid pages (ff) but the sff driver does not, so the DOM-related calculations made by the sff8436 driver show incorrect data.
- A few optics do not support DOM.
- SFP plugins currently return None for unreadable pages, which causes an error in sfpshow eeprom --dom.
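One way to guard against both cases (a page read that returns None, and a page that is entirely 0xff) is sketched below. The helper names and the "N/A" placeholder are hypothetical and are not the plugin's actual API.

```python
def usable_page(raw):
    """Return the page bytes, or None if the page is unreadable or all 0xff."""
    if not raw:
        return None
    if all(byte == 0xFF for byte in raw):
        return None
    return raw


def dom_byte(raw, offset, default="N/A"):
    """Fetch one byte from a DOM page, falling back to a placeholder."""
    page = usable_page(raw)
    if page is None or offset >= len(page):
        return default
    return page[offset]


print(dom_byte(None, 0))
print(dom_byte(bytes([0xFF] * 4), 1))
print(dom_byte(bytes([0x10, 0x20]), 1))
```

Returning a placeholder instead of operating on None keeps the display path from raising when an optic lacks DOM support.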
Implement Watchdog platform2.0 API for DellEMC S6100 platform.
- Added new file watchdog.py in sonic_platform directory.
- Enabled API support to Enable/disable watchdog.
Fixed the FPGA crash seen 15-20 minutes after onie-install. The crash is caused by accessing a stale i2c transfer message buffer. The message buffer becomes stale due to a race between the i2c transfer and the FPGA interrupt handler.
The new state STATE_STOP is not exposed to the wake-up call until all ISRs of the previous transfer have completed successfully.
Implemented remaining APIs in s6100, z9100, s6000
Removed soft links in s6100, s6000, z9100 and implemented them separately
Implemented get_transceiver_change_event in S6000
* [DELL][Z9100,S6100,S6000] Platform 2.0 SFP Changes
Added support in sfp.py file which will be generic. Send the eeprom path and sfp_control path from chassis.py
Added Reboot Reason for S6000 in platform 2.0
Fixed issue in process-reboot-cause
Added package uninstall code in platform de-init code for z9100, s6100
- How I did it
-> Added support for S6000 Reboot Reason
-> Added platform.py for all platforms
-> Verified show reboot-cause command with the code changes. Added UT logs with show reboot-cause
-> Modified process-reboot-cause service to start after pmon.service. In S6000, we have to wait for nvram to be loaded.
-> If reboot-cause service starts before pmon.service, show reboot-cause is showing incorrect reason.
-> Bug fix in process-reboot-cause file
- import sonic_platform
+ import sonic_platform.platform
The following commit addresses the graceful unmounting of file
system and graceful shutdown of dockers before calling a
cold reboot which will cause a power cycle of SSD. This ensures
orderly shutdown and no corruption of file systems because
of the power cycle to SSD.
This commit will use the existing systemd-reboot service scripts
and override the configuration to do cold reboot for S6100 and
Z9100.
Unit tested the fix and graceful shutdown of file system and
dockers are done with cold reboot.
Signed-off-by: Harish Venkatraman <harish_venkatraman@dell.com>
* [devices]: Add a new supported device DellEMC s5232f
* Switch Vendor: DellEMC
* Switch SKU: s5232F
* ASIC Vendor: Broadcom
* Switch ASIC: Trident3
* Port Configuration: 32x100G
* SONiC Image: sonic-broadcom.bin
* LED support for s5232f
* Changes include an ipmitool implementation for the platform_sensors script, which is included in pmon startup
* Added 100G, 25G, 10G configuration (100G is default).
* s5232[device] PSU detection and default LED state support
- What I did
Added a daemon to log LPC bus degradation in Intel C2000 processors. Intel Rangeley C2000 processors with revision less than or equal to 2 have an issue where the LPC bus degrades over time in some processors. To identify and report the problem, a daemon has been added which logs a message when it encounters the issue.
- How I did it
Added a daemon which validates the CPLD scratch(0x102) and SMF scratch(0x202) registers by writing and reading values on regular polling intervals (300 seconds). If there is a discrepancy between read and write, a critical log will be thrown.
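The write/read-back check can be sketched as follows. The register accessors are passed in as hypothetical callables, and the test patterns 0x55/0xAA are assumptions; only the register addresses (0x102, 0x202) and the 300-second interval come from the description above.

```python
POLL_INTERVAL_S = 300  # polling interval stated in the description
SCRATCH_REGISTERS = (0x102, 0x202)  # CPLD scratch, SMF scratch
TEST_PATTERNS = (0x55, 0xAA)  # hypothetical test values


def check_lpc_bus(write_reg, read_reg, log_critical):
    """Write each pattern to each scratch register and verify the read-back.

    Returns True if every read matched its write; logs each mismatch.
    """
    healthy = True
    for address in SCRATCH_REGISTERS:
        for pattern in TEST_PATTERNS:
            write_reg(address, pattern)
            value = read_reg(address)
            if value != pattern:
                log_critical(
                    "LPC bus check failed at 0x%x: wrote 0x%x, read 0x%x"
                    % (address, pattern, value)
                )
                healthy = False
    return healthy


# Simulated registers backed by a dict; a daemon would call check_lpc_bus
# every POLL_INTERVAL_S seconds with real register accessors.
registers = {}
messages = []
print(check_lpc_bus(registers.__setitem__, registers.__getitem__, messages.append))
```

Swapping in a `read_reg` that returns a different value simulates the degradation and is a convenient way to exercise the logging path.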
- How to verify it
The infrastructure is verified by simulating the issue: the register value is modified between the write and the read, and the appearance of the log is checked.
- Description for the changelog
Added Daemon to identify LPC bus degradation issue and notify using syslog in Dell S6100 and Z9100 platforms. This daemon will only run on processors with revision less than or equal to 2.
* [submodule] update sonic-linux-kernel
* update linux kernel version
* Fix many version strings
* update mellanox components (built with new kernel)
* [mlnx] add make files for SDK WJH libs
* Update arista driver submodule (#8)
Make the debian packaging point to a newer kernel version.
This weekly service lets the SSD firmware perform garbage collection
after the file system has deleted files. This avoids slowness or
even READ-ONLY errors caused by the SSD being unable to free pages
even though the file system reports plenty of free space.
Signed-off-by: Zhenggen Xu <zxu@linkedin.com>
The dell_ich module sometimes fails to load because pci_get_drvdata() fails.
This function is responsible for fetching the Intel PCI-related memory handle in the kernel. This is implemented in the lpc_ich kernel module.
Due to a race in the addition/deletion of kernel modules, lpc_ich sometimes loads after dell_ich.
Because of this behaviour, the dell_ich module fails to load.
Fixed by adding a dependency between the modules.
Removed the i2c_mux_gpio module from the blacklist, as it is not the original root cause of this issue.