Why I did it
smartctl tool is available only in PMON docker. Hence, the tool may be not accessible incase PMON docker goes down.
Using iSMART_64 tool to fetch the SSD firmware version and device model information.
How I did it
Replacing smartctl with iSMART_64.
Why I did it
To port DellEMC S6100 SSD upgrade status checker changes from master (based on #7289) to 201811 branch
Handle newer SSD firmware version (S210506G - 3IE devices)
Recover SSD upgrade state if in case ssd_fw_upgrade folder got deleted
Why I did it
serial-getty service exited in Dell S6100 device randomly.
How I did it
Added serial-getty to monit services.
How to verify it
Stop serial-getty in ssh session and check whether the service restarts or not.
Why I did it
To monitor the SSD health condition in DellEMC S6100 platform post upgrade.
A daemon is introduced to monitor the SSD every one hour.
To check for SSD status at boot time and at the time of cold-reboot.
All these changes are supported only for newer SSD firmware.
Porting changes from 201911 branch
Added a platform_reboot_pre_check script to prevent cold-reboot based on SSD status.
Depends on Azure/sonic-utilities#1557
Basically mlnx-fw-upgrade.sh is used in two places:
1. https://github.com/Azure/sonic-buildimage/blob/201811/files/scripts/syncd.sh#L109
```bash
/usr/bin/mst start
/usr/bin/mlnx-fw-upgrade.sh
/etc/init.d/sxdkernel start
/sbin/modprobe i2c-dev
```
2. https://github.com/Azure/sonic-buildimage/blob/201811/device/mellanox/x86_64-mlnx_msn2700-r0/platform_reboot#L32
```bash
ParseArguments "$@"
${FW_UPGRADE_SCRIPT} --upgrade --verbose
EXIT_CODE="$?"
```
In first case the `stdout` is redirected to `syslog` directly by `systemd`.
Thus, the `syslog` logger is only required in second case.
#### Why I did it
* To improve ASIC/CPLD FW upgrade logging
* To improve CPLD upgrade time
#### How I did it
* Added `syslog` logger support
* Replaced `_pciconf0` -> `_pci_cr0` to reduce CPLD upgrade time
#### How to verify it
1. mlnx-fw-upgrade.sh --upgrade
Why I did it
The S6000 devices, the cold reboot is abrupt and it is likely to cause issues which will cause the device to land into EFI shell. Hence the platform reboot will happen after graceful unmount of all the filesystems as in S6100.
How I did it
Moved the platform_reboot to platform_reboot_override and hooked it to the systemd shutdown services as in S6100.
Fixed the "/host unmount failed" issue as well in 201811.
How to verify it
Issue "reboot" command to verify if the reboot is happening gracefully.
#### Why I did it
- The iSMT SMBUS I2c bus number conflicts in different kernel versions.
#### How I did it
- Add I2cbus number detector for iSMT bus
- Replace iSMT bus number in fancontrol config
Updated commit hash pointer in the relevant Makefile for the repository which contains SDK packages.
Signed-off-by: Volodymyr Samotiy <volodymyrs@nvidia.com>
when parallel build is enabled, both docker-fpm-frr and docker-syncd-brcm
is built at the same time, docker-fpm-frr requires swss which requires to
install libsaivs-dev. docker-syncd-brcm requires syncd package which requires
to install libsaibcm-dev.
since libsaivs-dev and libsaibcm-dev install the sai header in the same
location, these two packages cannot be installed at the same time. Therefore,
we need to serialize the build between these two packages. Simply uninstall
the conflict package is not enough to solve this issue. The correct solution
is to have one package wait for another package to be uninstalled.
For example, if syncd is built first, then it will install libsaibcm-dev.
Meanwhile, if the swss build job starts and tries to install libsaivs-dev,
it will first try to query if libsaibcm-dev is installed or not. if it is
installed, then it will wait until libsaibcm-dev is uninstalled. After syncd
job is finished, it will uninstall libsaibcm-dev and swss build job will be
unblocked.
To solve this issue, _UNINSTALLS is introduced to uninstall a package that
is no longer needed and to allow blocked job to continue.
Signed-off-by: Guohan Lu <lguohan@gmail.com>
backport c4b5b002c3
make swss build depends only on libsairedis instead of syncd. This allows to build swss without depending
on vendor sai library.
Currently, libsairedis build also buils syncd which requires vendor SAI lib. This makes difficult to build
swss docker in buster while still keeping syncd docker in stretch, as swss requires libsairedis which also
build syncd and requires vendor to provide SAI for buster. As swss docker does not really contain syncd
binary, so it is not necessary to build syncd for swss docker.
[submodule]: update sonic-sairedis
* 9a66890 2020-06-28 | [build]: add option to build without syncd (HEAD -> 201811, origin/201811) [Guohan Lu]
Signed-off-by: Guohan Lu <lguohan@gmail.com>
Haliburton needed watchdog daemon to monitor the basic health of a machine. If something goes wrong, such as a crashing program overloading the CPU, or no more free memory on the system, watchdog can safely reboot the machine,
* [SAI] Add PFC pause duration counters in microseconds
**- Why I did it**
To add PFC pause duration counters in microseconds
**- How I did it**
Updated SAI to version 1.14.3
**- How to verify it**
**- Description for the changelog**
[Mellanox] Update SAI to version 1.14.3
Since DOCKER_SYNCD_VS is no longer being used, the mount option does not properly mount the warmboot file directory. Fix the mount option so that the directory is properly mounted.
During platform deinitialization, dell_ich is not removed properly and when we do initialize s6100 platform, ICH driver sysfs attributes are not attached. Because of this, get_transceiver_change_event returns error and this leads xcvrd to crash.
* [platform/cel-dx010]: add gpio init for fan direction
* [platform/cel-dx010]: remove invalid code on fancontrol service
* [platform/cel-dx010]: modify fancontrol service permission
* [platform/cel-dx010]: install fancontrol in pmon
If a device had a master or 201911 image then installed a 201811 image, it could result in a prefdl that was not properly processed by 201811 Arista code.
This is a commit that was on 201911 and master branch.
Co-authored-by: Zhi Yuan Carl Zhao <zyzhao@arista.com>
Fix the method get_transceiver_change_event to abide by the function description, return True status and use timeout in milliseconds.
Co-authored-by: Zhi Yuan Carl Zhao <zyzhao@arista.com>
Update fancontrol service for Seastone-DX010/E1031 device to support hysteresis temperature threshold and difference config for each unit fan direction type (B2F/F2B); follow master branch
- Broadcom SAI 3.5 GA code drop on 20200608.
Changes:
- CS9533198
- CS10283709
- CS00009716645
- CS00010389861
- CS00010406122
- CS00010503275
- Addressed a few memory leak issues.
- Addressed an array memory allocation issue.
- Addressed assert during SER handling.
Signed-off-by: Ying Xie <ying.xie@microsoft.com>
* fixes an issue when /host/warmboot/issu_bank.txt is empty/corrupted
switch is not able to over come this and enters continuos reload/reboot
failure.
Signed-off-by: Stepan Blyschak <stepanb@mellanox.com>
admin@sonic:~$ sudo hw-management-wd.sh
Usage: hw-management-wd.sh start [timeout] | stop | tleft | check_reset | help
start - start watchdog
timeout is optional. Default value will be used in case if it's omitted
timeout provided in seconds
stop - stop watchdog
tleft - check watchdog timeout left
check_reset - check if previous reset was caused by watchdog
Prints only in case of watchdog reset
help -this help
Signed-off-by: Stepan Blyschak <stepanb@mellanox.com>