872 Commits

Author SHA1 Message Date
yozhao101
bbcd4c6235 [Monit] Use the string "/usr/bin/syncd\s" to monitor the syncd process (#4706)
**- Why I did it**
After discussing with Joe, we decided to use the string "/usr/bin/syncd\s" in the Monit configuration file to monitor 
the syncd process on Broadcom and Mellanox. Due to my carelessness, I did not find this bug during 
previous testing. If we use the string "/usr/bin/syncd" in the Monit configuration file to monitor the 
syncd process, Monit cannot detect whether the syncd process is running. 

If we run the command `sudo monit procmatch "/usr/bin/syncd"` on Broadcom, three 
processes in the syncd container match "/usr/bin/syncd": `/bin/bash /usr/bin/syncd.sh wait`, 
`/usr/bin/dsserve /usr/bin/syncd --diag -u -p /etc/sai.d/sai.profile` and 
`/usr/bin/syncd --diag -u -p /etc/sai.d/sai.profile`. Monit selects the matching process with the 
highest uptime (here, `/bin/bash /usr/bin/syncd.sh wait`) and does not select 
`/usr/bin/syncd --diag -u -p /etc/sai.d/sai.profile`. 

Similarly, on Mellanox, Monit also selects the process with the highest uptime (here, 
`/bin/bash /usr/bin/syncd.sh wait`) and does not select 
`/usr/bin/syncd --diag -u -p /etc/sai.d/sai.profile`.

That is why Monit is unable to detect whether the syncd process is running if we use the string "/usr/bin/syncd" in the Monit configuration file. If we use the string "/usr/bin/syncd\s" instead, Monit filters out the process `/bin/bash /usr/bin/syncd.sh wait` and can therefore monitor the syncd process correctly.
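
The difference between the two patterns can be illustrated with a minimal Python sketch. It uses Python's `re` module as a stand-in for Monit's own matcher, which is an assumption made only for illustration, and the helper name `matching` is hypothetical. The command lines are the three processes described in this commit message.

```python
import re

# The three command lines described in the commit message.
CMDLINES = [
    "/bin/bash /usr/bin/syncd.sh wait",
    "/usr/bin/dsserve /usr/bin/syncd --diag -u -p /etc/sai.d/sai.profile",
    "/usr/bin/syncd --diag -u -p /etc/sai.d/sai.profile",
]


def matching(pattern, cmdlines):
    """Return the command lines in which `pattern` is found anywhere.

    Hypothetical helper: Python's `re` stands in for Monit's matcher.
    """
    regex = re.compile(pattern)
    return [cmd for cmd in cmdlines if regex.search(cmd)]


# Without the trailing whitespace class, the wrapper script matches too.
loose = matching(r"/usr/bin/syncd", CMDLINES)

# Requiring whitespace after "syncd" excludes "/usr/bin/syncd.sh".
strict = matching(r"/usr/bin/syncd\s", CMDLINES)

for cmd in strict:
    print(cmd)
```

In this sketch the stricter pattern still matches the `dsserve` line, because that command line also contains `/usr/bin/syncd` followed by a space; only the `syncd.sh` wrapper is excluded.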

**- How I did it**

**- How to verify it**

Signed-off-by: Yong Zhao <yozhao@microsoft.com>
2020-06-28 07:29:59 -07:00
Wirut Getbamrung
3d0126baeb [platform-celestica]: Update fancontrol service for Seastone-DX010 device (#3690)
* [platform/cel]: add fancontrol service support for dx010

* [device/celestica]: add hysteresis temp to dx010 fancontrol configuration
2020-06-28 07:27:20 -07:00
Junchao-Mellanox
acafde1895 [Mellanox] Change port index in port_config.ini to 1-based (#4781)
* Change port index in port_config.ini to 1-based
* Add default port index to port_config.ini, change platform plugins to accept 1-based port index
* fix port index in sfp_event.py
2020-06-28 07:24:10 -07:00
Kebo Liu
9db492e31f [Mellanox] Update SDK to 4.4.0952, FW to *.2007.1280 (#4842) 2020-06-28 07:19:25 -07:00
abdosi
a84b534ed7
[broadcom-sai]: Updated broadcom SAI to fix High CPU on TH/Th2 platform. (#4859)
Verified that CPU usage went down after loading on TH platforms:

Previous:
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
2521 root 20 0 1512860 360452 63540 S 144.6 4.4 8:00.03 syncd

After Fix:
7500 root 20 0 1592420 350912 64184 S 45.4 4.3 3:50.99 syncd
2020-06-27 01:10:41 -07:00
yozhao101
c2364cf03e
[201911][dockers] Update critical_processes file syntax (#4854)
Backport of https://github.com/Azure/sonic-buildimage/pull/4831 to the 201911 branch
2020-06-26 11:37:05 -07:00
madhanmellanox
337502c220
converting to Platform based utils (#4830)
Co-authored-by: Madhan Babu <madhan@arc-build-server.mtr.labs.mlnx>
2020-06-23 09:12:54 -07:00
Nazarii Hnydyn
7b18d9c15c [Mellanox] Update MFT to v4.14.5-2. (#4784)
Signed-off-by: Nazarii Hnydyn <nazariig@mellanox.com>
2020-06-17 10:18:01 -07:00
Volodymyr Samotiy
2f82cce3e8
[Mellanox] Update SDK 4.4.0940 and FW xx.2007.1244 (#4777) 2020-06-16 10:28:22 -07:00
Arun Saravanan Balachandran
030570de81 [DellEMC]: EEPROM decoder for S6000, S6000-ON (#4718)
**- Why I did it**

To decode the S6000 system EEPROM, which uses the Dell offset format, and the S6000-ON system EEPROM, which uses the ONIE TLV format.

**- How I did it**

- Differentiate between S6000 and S6000-ON using the product name available in DMI ("/sys/class/dmi/id/product_name").
- Added a new class 'EepromS6000' in eeprom.py to decode the S6000 system EEPROM in Dell offset format and update the redis DB with the EEPROM contents.
- Renamed certain methods in both the Eeprom and EepromS6000 classes to accommodate the plugin-specific methods.
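
The platform detection step could look roughly like the sketch below. The DMI path comes from this commit message; the function names, the returned labels, and the assumption that the S6000-ON product name contains the substring "S6000-ON" are all hypothetical and not taken from the actual eeprom.py.

```python
from pathlib import Path

# Path named in the commit message.
DMI_PRODUCT_NAME = Path("/sys/class/dmi/id/product_name")


def read_product_name(path=DMI_PRODUCT_NAME):
    """Return the DMI product name, or an empty string if it cannot be read."""
    try:
        return path.read_text().strip()
    except OSError:
        return ""


def eeprom_format(product_name):
    """Pick an EEPROM format label from the product name.

    Assumption for illustration only: a product name containing
    "S6000-ON" means ONIE TLV, anything else means Dell offset format.
    """
    return "onie-tlv" if "S6000-ON" in product_name else "dell-offset"


if __name__ == "__main__":
    print(eeprom_format(read_product_name()))
```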

**- How to verify it**

- Use 'decode-syseeprom' command to list the system EEPROM details.
- Wrote a python script to load chassis class and call the appropriate methods.

UT Logs: [S6000_eeprom_logs.txt](https://github.com/Azure/sonic-buildimage/files/4735515/S6000_eeprom_logs.txt), [S6000-ON_eeprom_logs.txt](https://github.com/Azure/sonic-buildimage/files/4735461/S6000-ON_eeprom_logs.txt)
Test script: [eeprom_test_py.txt](https://github.com/Azure/sonic-buildimage/files/4735509/eeprom_test_py.txt)
2020-06-16 08:15:28 -07:00
Junchao-Mellanox
d10b597f50 [Mellanox] Upgrade mft to 4.14.1-8 (#4701) 2020-06-16 08:14:18 -07:00
Junchao-Mellanox
62690f504a
[Mellanox] Initialize system LED color to green for 201911 (#4743)
* [Mellanox] Initialize system LED color to green for 201911

* Rename variable to make it more readable
2020-06-16 15:38:17 +03:00
Nazarii Hnydyn
50f4e7de5f
[Mellanox] Add ONIE and SSD platform components. (#4764)
Signed-off-by: Nazarii Hnydyn <nazariig@mellanox.com>
2020-06-15 13:04:44 +03:00
Arun Saravanan Balachandran
093d7731ab
[201911] DellEMC: Skip thermalctld and thermal platform API changes (#4752)
**- Why I did it**

- Skip thermalctld in DellEMC S6000, S6100, Z9100 and Z9264 platforms.
- Change the return type of thermal Platform APIs in DellEMC S6000, S6100 and Z9100 platforms to 'float'.

**- How I did it**

- Add 'skip_thermalctld:true' in pmon_daemon_control.json for DellEMC S6000, S6100, Z9100 and Z9264 platforms.
- Made changes in thermal.py, for 'get_temperature', 'get_high_threshold' and 'get_low_threshold' to return 'float' value.

**- How to verify it**

- Check thermalctld is not running in 'pmon'.
- Wrote a python script to load Chassis class and then call the APIs accordingly and verify the return type.
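
A return-type check of this kind might be sketched as follows. The three method names come from this commit message; `FakeThermal` and `non_float_results` are hypothetical stand-ins, since the real script loads the platform Chassis class on the device.

```python
# Method names listed in the commit message.
THERMAL_METHODS = ("get_temperature", "get_high_threshold", "get_low_threshold")


def non_float_results(thermal):
    """Return {method_name: type_name} for each method not returning a float."""
    bad = {}
    for name in THERMAL_METHODS:
        value = getattr(thermal, name)()
        if not isinstance(value, float):
            bad[name] = type(value).__name__
    return bad


class FakeThermal:
    """Hypothetical stand-in for a platform thermal object."""

    def get_temperature(self):
        return 41.0

    def get_high_threshold(self):
        return 85.0

    def get_low_threshold(self):
        return 0  # deliberately an int, so the check has something to report


print(non_float_results(FakeThermal()))
```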
2020-06-11 10:48:27 -07:00
Junchao-Mellanox
0a70571011
[201911][thermal control] Backport feature from master branch (#4677)
Backport thermal control feature from master branch to 201911 branch by cherry-picking commits and manually resolving conflicts.
2020-06-08 11:20:43 -07:00
judyjoseph
ccf12d2ff7
SAI 3.7.5.1 (#4710) 2020-06-07 20:45:12 -07:00
Volodymyr Samotiy
e73a5f1375
[Mellanox] Update SAI, SDK 4.4.0928 and FW xx.2007.1208 (#4704)
Signed-off-by: Volodymyr Samotiy <volodymyrs@mellanox.com>
2020-06-04 13:28:12 -07:00
Kebo Liu
618d529ef4
[201911][Mellanox] Update hw-mgmt package to V.7.0010.1000 (#4688) 2020-06-02 14:53:09 -07:00
Arun Saravanan Balachandran
98b8d1eee1 DellEMC: get_change_event Platform API implementation for S6000, S6100 and Z9100 (#4593)
Detect transceiver change events through xcvrd on DellEMC S6000, S6100 and Z9100 platforms.

- In S6000, rename 'get_transceiver_change_event' in chassis.py to 'get_change_event' and return appropriate values.
- In S6100, implement 'get_change_event' through polling method (poll interval = 1 second) in chassis.py (Transceiver insertion/removal does not generate interrupts due to a CPLD bug)
- In Z9100, implement 'get_change_event' through interrupt method using select.epoll().
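
The polling approach described for S6100 can be sketched as below. This is not the code from chassis.py: the function name, its parameters, and the shape of the returned dictionary are assumptions for illustration, and `read_presence` is a hypothetical callable that returns the current presence state per port.

```python
import time


def poll_change_event(read_presence, previous, timeout_ms=0,
                      interval_s=1.0, sleep=time.sleep, clock=time.monotonic):
    """Poll until port presence differs from `previous` or the timeout expires.

    read_presence: hypothetical callable returning {port_index: bool}.
    timeout_ms:    0 means wait until a change is seen (assumed convention).
    Returns (changed, events, current_presence).
    """
    deadline = None if timeout_ms == 0 else clock() + timeout_ms / 1000.0
    while True:
        current = read_presence()
        # Report only the ports whose presence changed since the last snapshot.
        events = {
            port: ("1" if present else "0")
            for port, present in current.items()
            if previous.get(port) != present
        }
        if events:
            return True, {"sfp": events}, current
        if deadline is not None and clock() >= deadline:
            return False, {"sfp": {}}, current
        sleep(interval_s)
```

A caller would keep the returned snapshot and pass it back as `previous` on the next call. The one-second default interval mirrors the poll interval stated above.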
2020-05-27 18:00:45 -07:00
Andriy Kokhan
11ce7617d2 [BFN] Updated Barefoot SDK to 2020-05-07 (#4566)
Signed-off-by: Andriy Kokhan <akokhan@barefootnetworks.com>
2020-05-20 07:55:54 -07:00
Santhosh Kumar T
045d5e6f23 [DellEMC] S6000 Disable Low power mode by default (#4592) 2020-05-20 07:55:16 -07:00
Kebo Liu
fffee7e33a [mellanox]: Update SAI to 1.16.4, SDK to 4.4.0918, FW to *.2007.1140 (#4571)
- mgmt buffer issue on 400G port
- high CPU utilization issue caused by some counter reading
2020-05-12 22:46:21 -07:00
Santhosh Kumar T
1e3df476e5 [DellEMC] S6100 Last Reboot Reason Thermal Support (#3767) 2020-05-09 18:37:31 -07:00
shlomibitton
8367dfebaa
hw-mgmt_V.7.0000.3034 integration (#4518)
Signed-off-by: Shlomi Bitton <shlomibi@mellanox.com>
2020-05-06 12:14:41 +03:00
Samuel Angebault
8456aeba98
Update arista drivers submodules (#4532) 2020-05-04 20:10:50 -07:00
Nazarii Hnydyn
c266435d40
Revert "Add thermal control support for SONiC (#3949)" (#4527)
This reverts commit 109a13cc03.

Conflicts:
	dockers/docker-platform-monitor/docker-pmon.supervisord.conf.j2
2020-05-04 21:20:47 +03:00
Junchao-Mellanox
109a13cc03 Add thermal control support for SONiC (#3949) 2020-04-30 22:39:17 -07:00
Kebo Liu
4bd47e3d7f [mellanox]: MSN4700 support 8 lanes 400G with new SAI/SDK/FW (#4509)
Update SAI/SDK/FW and MSN4700 device files to support 8 lanes 400G

Update SAI to 1.16.3
Update SDK to 4.4.0914
Update FW to *.2007.1112
Update MSN4700 device files to support 8 lanes 400G
2020-04-30 22:19:21 -07:00
Nazarii Hnydyn
bd370fd2b6 [mellanox]: Align CPLD component with latest hw-mgmt. (#4485)
Signed-off-by: Nazarii Hnydyn <nazariig@mellanox.com>
2020-04-30 22:16:29 -07:00
shlomibitton
7709ff5228 [Mellanox] Add a new Mellanox platform x86_64-mlnx_msn4600c and new SKU ACS-MSN4600C (#4483)
* New SKU support for MSN4600C

Signed-off-by: Shlomi Bitton <shlomibi@mellanox.com>
2020-04-30 22:15:52 -07:00
rajendra-dendukuri
e843d994d0 [brcm-th-svk]: Fix errors in BCM956960K switch (#4390)
Fix Broadcom TH SVK boot up crash
2020-04-30 22:10:16 -07:00
Nazarii Hnydyn
8fb0549a48 [mellanox]: Add DPKG local caching support. (#4441)
Signed-off-by: Nazarii Hnydyn <nazariig@mellanox.com>
2020-04-27 08:50:23 -07:00
shlomibitton
aa33a959dd [Mellanox] Add a new Mellanox platform x86_64-mlnx_msn3420 and new SKU ACS-MSN3420 (#4436)
* New SKU support for MSN3420

Signed-off-by: Shlomi Bitton <shlomibi@mellanox.com>

Conflicts:
	device/mellanox/x86_64-mlnx_msn2700-r0/plugins/sfputil.py

* Add CPLD's

* Symlink fixes and semantics

* Adding new platform at end of lines
2020-04-27 08:50:23 -07:00
Kebo Liu
1d7d8fac66 [Mellanox] Update hw-mgmt package to V.7.0000.3020 (#4362)
* update hw-mgmt package to V.7.0000.3020
* update sonic-linux-kernel repo to pick up new patches
2020-04-27 08:50:23 -07:00
Wataru Ishida
674f72e2ce [broadcom]: respect the current network namespace when creating netdev (#3896)
https://github.com/Broadcom-Switch/OpenNSL/issues/26

Signed-off-by: Wataru Ishida <ishida@nel-america.com>
2020-04-27 08:50:23 -07:00
Santhosh Kumar T
214541b75b [DellEMC] S6000 - Thermal support - Last Reboot Reason (#4097)
- Added support for Thermal event in Last Reboot Reason "show reboot-cause" command.
- Added support for sending log message in case of thermal shutdown.

sonic NOTICE root: Shutting down due to over temperature (40 degree, 30 degree, 34 degree)
2020-04-27 08:50:23 -07:00
abdosi
5839a01abd [sonic-buildimage] libsaibcm Debian package update (#4439)
from 3.7.3.3-3 to 3.7.3.3-4
Fixes for PFC WD

Signed-off-by: Abhishek Dosi <abdosi@microsoft.com>
2020-04-19 16:29:32 -07:00
Kebo Liu
e4bd7ab189 [Mellanox] Extend mellanox platform API to report SFP error event (#4365)
* extend mellanox platform API to report SFP error event
* remove unnecessary loop code
* install enum34 to pmon to support using Enum
2020-04-15 13:11:59 -07:00
Nazarii Hnydyn
c3e030b769 [mellanox]: Enable CPLD update progress bar (#4363)
Signed-off-by: Nazarii Hnydyn <nazariig@mellanox.com>
2020-04-15 13:11:29 -07:00
Myron Sosyak
9dd1fa016c Update Barefoot kdrv (#4355) 2020-04-15 13:09:40 -07:00
SuvarnaMeenakshi
0099305475 Multi-ASIC implementation (#3888)
Changes made to support multi-asic platform. Added multi-instance support for swss, syncd, database, bgp, teamd and lldp.
2020-04-15 13:08:34 -07:00
Nazarii Hnydyn
0b35fcf3bf [mellanox]: Add SSD FW update tool (#4351)
* [mellanox]: Add SSD FW update tool.

Signed-off-by: Nazarii Hnydyn <nazariig@mellanox.com>

* [mellanox]: Align Platform API.

Signed-off-by: Nazarii Hnydyn <nazariig@mellanox.com>

* [mellanox]: Fix firmware description.

Signed-off-by: Nazarii Hnydyn <nazariig@mellanox.com>

* [mellanox]: Update SSD tool.

Signed-off-by: Nazarii Hnydyn <nazariig@mellanox.com>
2020-04-15 13:02:36 -07:00
Kebo Liu
4ee0f1ff08 update SAI 1.16.2 SDK 4.4.0800 FW *.2007.0872 (#4417) 2020-04-15 13:01:09 -07:00
Joe LeVeque
d09fba379f
[201911][Juniper QFX5210] Fix Python errors (#4413) 2020-04-11 16:42:56 -07:00
Srideep
c080e80165 [DellEMC] S5232 platform updates (#4360)
FPGA driver crash fix for stale buffer in i2c transfer
LED firmware load issue fix.
10G port swap fix
psu/sfp bug fixes to report correct states/status of hw
2020-04-10 21:22:48 -07:00
Andriy Kokhan
0d5c9aadcb [BFN] Update Barefoot SDK packages (#4397)
Signed-off-by: Andriy Kokhan <akokhan@barefootnetworks.com>
2020-04-10 21:21:08 -07:00
Abhishek Dosi
249265ad99 Revert "Multi-ASIC implementation (#3888)"
This reverts commit 2e87a16941.
2020-04-03 14:34:38 -07:00
Samuel Angebault
8819322210
[Arista] Update drivers submodules (#4353)
* Update arista drivers submodules

* Add device configs for 7060CX2-32S

* Update boot0 and union-mount for 7060CX2-32S

* Add 7170-32C and 7170-32CD support in boot0

* Sync after writing boot configs

* Add 7170-32C and 7170-32CD device configurations

Co-authored-by: Boyang Yu <byu@arista.com>
2020-04-01 23:26:42 -07:00
SuvarnaMeenakshi
2e87a16941 Multi-ASIC implementation (#3888)
Changes made to support multi-asic platform. Added multi-instance support for swss, syncd, database, bgp, teamd and lldp.
2020-04-01 23:21:49 -07:00
Aravind Mani
9385e0201f [DellEMC] Fix Z9100 port index issue (#4309) 2020-03-24 15:15:14 -07:00