Commit Graph

983 Commits

Author SHA1 Message Date
abdosi
0fad6bdc7f [monit] Adding patch to enhance syslog error message generation for monit alert action when status is failed. (#5720)
Why/How I did:

Make sure first error syslog is triggered based on FAULT TOLERANCE condition.

Added support of repeat clause with alert action. This is used as trigger
for generation of periodic syslog error messages if error is persistent

Updated the monit conf files with repeat every x cycles for the alert action
2020-11-01 10:27:10 -08:00
Andriy Kokhan
77a1ef236b
[BFN] Updated SDK packages to 20201023_sai_1.5.2 (#5711)
Signed-off-by: Andriy Kokhan <akokhan@barefootnetworks.com>
2020-10-30 10:53:42 -07:00
Junchao-Mellanox
06b5ad02ac [Mellanox] Re-initialize SFP object when detecting a new SFP insertion (#5695)
When detecting a new SFP insertion, read its SFP type and DOM capability from EEPROM again.

SFP object will be initialized to a certain type even if no SFP present. A case could be:

1. A SFP object is initialized to QSFP type by default when there is no SFP present
2. User insert a SFP with an adapter to this QSFP port
3. The SFP object fail to read EEPROM because it still treats itself as QSFP.

This PR fixes this issue.
2020-10-30 09:04:26 -07:00
Aravind Mani
66e0298a89 [Dell S6100] Properly release memory upon ICH driver deinit (#5561)
During platform deinitialization, dell_ich is not removed properly and when we do initialize s6100 platform, ICH driver sysfs attributes are not attached. Because of this, get_transceiver_change_event returns error and this leads xcvrd to crash.
2020-10-30 08:59:18 -07:00
Stepan Blyshchak
f7d753fd70 [Mellanox] Configure SAI to log to syslog instead of stdout. (#5634)
Example of syslog message from Mellanox SAI:

"Oct  7 15:39:11.482315 arc-switch1025 INFO syncd#supervisord: syncd Oct 07 15:39:11 NOTICE  SAI_BUFFER: mlnx_sai_buffer.c[3893]- mlnx_clear_buffer_pool_stats: Clear pool stats pool id:1"

There is a log INFO from supervisord which actually printed NOTICE and
date again. This confusion happens becuase if SAI is not built to log
to syslog it will log everything to stdout with format "[date] [level]
[message]" so supervisord sends it to syslog with level INFO.

New logs look like:

"Oct  7 15:40:21.488055 arc-switch1025 NOTICE syncd#SDK  [SAI_BUFFER]: mlnx_sai_buffer.c[3893]- mlnx_clear_buffer_pool_stats: Clear pool stats pool id:17"

Signed-off-by: Stepan Blyschak <stepanb@nvidia.com>
2020-10-30 08:57:21 -07:00
Aravind Mani
3734bf326b
[201911] DellEMC platform API 2.0 for Z9264f, S5232f (#5637)
Add platform API 2.0 support for Z9264f, S5232f in 201911 branch
2020-10-28 10:01:47 -07:00
Junchao-Mellanox
ea28f2dcb2 [Mellanox] Fix issue: read data from eeprom should trim tail \0 (#5670)
Now we are reading base mac, product name from eeprom data, and the data read from eeprom contains multiple "\0" characters at the end, need trim them to make the string clean and display correct.
2020-10-22 10:54:02 -07:00
Kebo Liu
ae3f09246c [Mellanox] Optimize SFP Platform API implementation (#5476)
Each SFP object inside Chassis will open an SDK client, this is not necessary and SDK client can be shared between SFP objects.
2020-10-21 08:13:59 -07:00
Samuel Angebault
ce7248604e
[Arista] Update arista driver submodules (#5654)
Only import yaml python module if necessary
2020-10-17 21:06:17 -07:00
Nazarii Hnydyn
bd61e3811b
[Mellanox] Update SDK 4.4.1912, FW XX.2008.1912 (#5575)
- SN3800 vs Cisco9236 - no link copper or optics - start sending IDLE before PHY_UP for specific OPNs

Signed-off-by: Nazarii Hnydyn <nazariig@nvidia.com>
2020-10-11 15:34:05 -07:00
Tony Titus
05dc6834c2
[201911] Update Innovium device support (#5341)
- Add new device delta_evs-a-32q56 support for Innovium
- Update qos and buffer json files
- Update Innovium config files
2020-10-07 10:51:24 -07:00
Kebo Liu
ae3d458d48 [Mellanox] Refactor platform API to remove dependency on database (#5468)
**- Why I did it**
- Platform API implementation using sonic-cfggen to get platform name and SKU name, which will fail when the database is not available.
- Chassis name is not correctly assigned, it shall be assigned with EEPROM TLV "Product Name", instead of SKU name
- Chassis model is not implemented, it shall be assigned with EEPROM TLV "Part Number"

**- How I did it**

1. Chassis

> - Get platform name from /host/machine.conf
> - Remove get SKU name with sonic-cfggen
> - Get Chassis name and model from EEPROM TLV "Product Name" and "Part Number"
> - Add function to return model

2. EEPROM

> - Add function to return product name and part number

3. Platform

> - Init EEPROM on the host side, so also can get the Chassis name model from EEPROM on the host side.
2020-10-06 11:30:16 -07:00
Kebo Liu
8c3dbce209 [Mellanox] Fix truncated manufacture date returned from platform API (#5473)
The manufacture date returned from platform API was truncated, time is not included. Revise the regular expression used for matching.
2020-10-06 11:22:08 -07:00
abdosi
7db8cb0e03
Taken the fix from BRCM SAI 4.2.1.3 (6.5.19 hsdk). (#5550)
The error message is updated to provide correct information
of Netdevice link being down.

Signed-off-by: Abhishek Dosi <abdosi@microsoft.com>
2020-10-06 09:04:51 -07:00
Junchao-Mellanox
de3045188d [Mellanox] Update dynamic minimum table for 4700, 3420 and 4600C (#5388)
Update dynamic minimum fan speed table according to data provided by thermal team.
2020-10-06 06:04:06 +00:00
Junchao-Mellanox
e614697967
Fix issue: there should be 2 cpu core temperature sensors in 3420 (#5402) 2020-10-04 16:58:17 +03:00
Stephen Sun
1da60a6811
Integrate sdk and fw 4.4.1910 (#5495)
Signed-off-by: Stephen Sun <stephens@nvidia.com>
2020-10-01 17:22:05 +03:00
Kebo Liu
46e57e050c [Mellanox] Refactor SFP related platform API and plugins with new SDK API (#5326)
Refactor SFP reset, low power get/set API, and plugins with new SDK SX APIs. Previously they were calling SDK SXD APIs which have glibc dependency because of shared memory usage.

Remove implementation "set_power_override", "tx_disable_channel", "tx_disable" which using SXD APIs, once related SDK SX API available, will add them back based on new SDK SX APIs.
2020-09-29 15:39:52 +00:00
Tamer Ahmed
2cc98b4bac [platform] Add Support For Environment Variable File (#5010)
* [platform] Add Support For Environment Variable

This PR adds the ability to read environment file from /etc/sonic.
the file contains immutable SONiC config attributes such as platform,
hwsku, version, device_type. The aim is to minimize calls being made
into sonic-cfggen during boot time.

singed-off-by: Tamer Ahmed <tamer.ahmed@microsoft.com>
2020-09-28 21:14:39 +00:00
Joe LeVeque
b70c6f72b2 [dockers][supervisor] Increase event buffer size for dependent-startup (#5247)
When stopping the swss, pmon or bgp containers, log messages like the following can be seen:

```
Aug 23 22:50:43.789760 sonic-dut INFO swss#supervisord 2020-08-23 22:50:10,061 ERRO pool dependent-startup event buffer overflowed, discarding event 34
Aug 23 22:50:43.789760 sonic-dut INFO swss#supervisord 2020-08-23 22:50:10,063 ERRO pool dependent-startup event buffer overflowed, discarding event 35
Aug 23 22:50:43.789760 sonic-dut INFO swss#supervisord 2020-08-23 22:50:10,064 ERRO pool dependent-startup event buffer overflowed, discarding event 36
Aug 23 22:50:43.789760 sonic-dut INFO swss#supervisord 2020-08-23 22:50:10,066 ERRO pool dependent-startup event buffer overflowed, discarding event 37
```

This is due to the number of programs in the container managed by supervisor, all generating events at the same time. The default event queue buffer size in supervisor is 10. This patch increases that value in all containers in order to eliminate these errors. As more programs are added to the containers, we may need to further adjust these values. I increased all buffer sizes to 25 except for containers with more programs or templated supervisor.conf files which allow for a variable number of programs. In these cases I increased the buffer size to 50. One final exception is the swss container, where the buffer fills up to ~50, so I increased this buffer to 100.

Resolves https://github.com/Azure/sonic-buildimage/issues/5241
2020-09-28 16:12:53 +00:00
Vitaliy Senchyshyn
7c849ed037
[barefoot][platform] Update BFN platforms (#5419)
* [barefoot][platform] Update BFN platforms (#5356)

1. Added support of BFN newport new platform name.
2. Updated debian version for montara and mavericks platforms.
2020-09-27 10:21:02 -07:00
yozhao101
7580c846ad
[201911][Monit] Unmonitor processes in disabled containers (#5462)
We want to let Monit to unmonitor the processes in containers which are disabled in `FEATURE` table such that
Monit will not generate false alerting messages into the syslog.

- Backport of https://github.com/Azure/sonic-buildimage/pull/5153 to the 201911 branch

Signed-off-by: Yong Zhao <yozhao@microsoft.com>
2020-09-25 00:30:41 -07:00
Wirut Getbamrung
10534a39eb
[device/celestica]: Update DX010 platform APIS on 201911 branch (#5416)
* [device/celestica]: DX010 platform API update (#4608)

- Fix fancontrol.service path
- Fix return temp format in thermal API
- Improve init time in chassis API
- Upgrade sfp API

* [device/celestica]: Update DX010 reboot cause API (#4678)

- Add more cases support in DX010 reboot cause API
    - Add Thermal Overload reboot cause support
    - Add new Watchdog reboot cause support

* [device/celestica]: using sonic-py-common package
2020-09-24 10:20:15 -07:00
Aravind Mani
a637c7f9b7
[201911] Dell S6100 fix mux issue (#5415)
- Why I did it
For fixing PCA MUX attachment issue in Dell S6100 platform.

- How I did it
Wait till IOM MUX powered up properly and start I2C enumeration.
2020-09-22 15:29:21 -07:00
Arun Saravanan Balachandran
cb55779937 DellEMC S6100: Log HW reboot reason registers (#5361) 2020-09-19 14:09:31 -07:00
Samuel Angebault
4a185fc09b
[Arista] Update driver submodules. (#5408)
- Fix show platform firmware platform plugin error
- Fix import behavior for arista's sonic_platform implementation
- Fix fan led color and detection on Smartsville
2020-09-18 21:40:49 -07:00
judyjoseph
ef89cb96a0
[201911] Broadcom SAI 3.7.5.2 (#5330)
* Broadcom SAI 3.7.5.2, with the fixes for following CSP's 

e5e06f4 Fix for CS00010914668(KB0029456/SDK-218585) and CS00010503275(KB0029315/SDK-213475)
cf4f8da Solution for CS00010775359 in 3.7
0348f03 Patch for CS00010897814
a2d2fdd Patch for CS00010817763
4d362e8 Patch for CS00010636736
557ddc6 Solution for CS00010443542
0f122f1 Port SDK SER fix for dynamic tables (SDK-175398 / SDK-221245) to SAI 3.7
37e5c5e Fix for CS00010790550
64daf8a Fix for CS00010726597
e7f000e Fix for CS00010697761
44b7ab3 Solution for CS00010617498.
1475c24 CSP10503275 request to pull KB0029314 into 3.7
2020-09-18 17:17:30 -07:00
noaOrMlnx
75068f3a62
[Mellanox] Update SAI version to 1.17.3 in 201911 branch (#5376)
Signed-off-by: Noa Or <noaor@nvidia.com>
2020-09-15 20:26:57 +03:00
Volodymyr Samotiy
68d054e925
[Mellanox] Update SDK 4.4.1622, FW xx.2008.1622 (#5299)
Signed-off-by: Volodymyr Samotiy <volodymyrs@nvidia.com>
2020-09-03 15:03:52 -07:00
Samuel Angebault
cde4e88f3a
[201911][Arista] Update arista drivers submodules (#5290)
- fix watchdog timeout units
 - remove arista bind mounts for docker-snmp
 - add python3 mounts for pmon
2020-09-01 20:19:13 -07:00
Mykola F
c243b8a9f5
[201911] Update SAI-Implementation submodule and enable port in/out dropped pkts stats (#5093)
- Enable port buffer drops by default
- Update SAI submodule

Signed-off-by: Mykola Faryma <mykolaf@mellanox.com>
2020-08-25 08:20:05 -07:00
Joe LeVeque
9048d7ae4d
[201911] Remove sonic-daemon-base package (#5181)
sonic-daemon-base package has been deprecated in favor of the sonic-py-common package. All related functionality has been moved there.

This is a backport of https://github.com/Azure/sonic-buildimage/pull/5131 and parts of https://github.com/Azure/sonic-buildimage/pull/5168 to the 201911 branch
2020-08-22 17:55:27 -07:00
Aravind Mani
fef4626804
Dell S6100 Port I2C changes to 201911 branch (#5148) 2020-08-18 14:39:36 -07:00
Guohan Lu
cf3260861b [docker-syncd-nephos]: use service dependency in supervisord to start services 2020-08-15 22:32:12 -07:00
Guohan Lu
8d3fd18162 [docker-syncd-mrvl]: use service dependency in supervisord to start services 2020-08-15 22:32:05 -07:00
Guohan Lu
82adac4926 [docker-syncd-centec]: use service dependency in supervisord to start services
Signed-off-by: Guohan Lu <lguohan@gmail.com>
2020-08-15 22:31:55 -07:00
Guohan Lu
8de1e8de28 [docker-syncd-invm]: use service dependency in supervisord to start services 2020-08-15 22:31:46 -07:00
Guohan Lu
0936e9abc5 [docker-syncd-bfn]: use service dependency in supervisord to start services 2020-08-15 22:31:39 -07:00
Guohan Lu
b52b9c12cf [docker-syncd-mlnx]: use service dependency in supervisord to start services 2020-08-15 22:31:32 -07:00
Guohan Lu
da7e36e869 [docker-syncd-brcm]: use service dependency in supervisord to start services 2020-08-15 22:31:23 -07:00
Guohan Lu
6909170c3e [docker-syncd-vs]: use service dependency in supervisord to start services 2020-08-15 22:31:13 -07:00
Kebo Liu
7140055d73 [Mellanox] Update the sfp platform API to get the ext_specification_compliance with new way (#5123)
Update the platform API implementation with calling dedicated parse function which defined in the platform-common as defined by https://github.com/Azure/sonic-platform-common/pull/112
2020-08-13 23:03:03 -07:00
Joe LeVeque
309a098b21
[201911][Python] Migrate applications/scripts to import sonic-py-common package (#5132)
As part of consolidating all common Python-based functionality into the new sonic-py-common package, this pull request:
1. Redirects all Python applications/scripts in sonic-buildimage repo which previously imported sonic_device_util or sonic_daemon_base to instead import sonic-py-common, which was added to the 201911 branch in https://github.com/Azure/sonic-buildimage/pull/5063
2. Replaces all calls to `sonic_device_util.get_platform_info()` to instead call `sonic_py_common.get_platform()` and removes any calls to `sonic_device_util.get_machine_info()` which are no longer necessary (i.e., those which were only used to pass the results to `sonic_device_util.get_platform_info()`.
3. Removes unused imports to the now-deprecated sonic-daemon-base package and sonic_device_util.py module

This is a step toward resolving https://github.com/Azure/sonic-buildimage/issues/4999
2020-08-13 16:35:53 -07:00
abdosi
ea1b25a210
[libsaibcm] updated libsaibcm debian package for (#5138)
3.7.5.1-3 in 201911

Signed-off-by: Abhishek Dosi <abdosi@microsoft.com>
2020-08-11 08:55:43 -07:00
Kebo Liu
5d1065dddc
[Mellanox] Update SDK to 4.4.1306, FW to *.2008.1310 (#5124)
* Update SDK/FW version number in the make file

* update Switch-SDK-drivers submodule
2020-08-11 10:05:47 +03:00
lguohan
78c803851c [build]: combine feature and container feature table (#5081)
1. remove container feature table
2. do not generate feature entry if the feature is not included
   in the image
3. rename ENABLE_* to INCLUDE_* for better clarity
4. rename feature status to feature state
5. [submodule]: update sonic-utilities

* 9700e45 2020-08-03 | [show/config]: combine feature and container feature cli (#1015) (HEAD, origin/master, origin/HEAD) [lguohan]
* c9d3550 2020-08-03 | [tests]: fix drops_group_test failure on second run (#1023) [lguohan]
* dfaae69 2020-08-03 | [lldpshow]: Fix input device is not a TTY error (#1016) [Arun Saravanan Balachandran]
* 216688e 2020-08-02 | [tests]: rename sonic-utilitie-tests to tests (#1022) [lguohan]

Signed-off-by: Guohan Lu <lguohan@gmail.com>
2020-08-09 11:55:40 -07:00
shlomibitton
fad242480c Add support for 'Extended Specification Compliance' for QSFP cables parser (#5096)
Signed-off-by: Shlomi Bitton <shlomibi@mellanox.com>
2020-08-09 10:44:14 -07:00
Joe LeVeque
890e1f38cc [sonic-py-common] get_platform(): Refactor method of retrieving platform identifier (#5094)
Applications running in the host OS can read the platform identifier from /host/machine.conf. When loading configuration, sonic-config-engine *needs* to read the platform identifier from machine.conf, as it it responsible for populating the value in Config DB.

When an application is running inside a Docker container, the machine.conf file is not accessible, as the /host directory is not mounted. So we need to retrieve the platform identifier from Config DB if get_platform() is called from inside a Docker 
container. However, we can't simply check that we're running in a Docker container because the host OS of the SONiC virtual switch is running inside a Docker container. So I refactored `get_platform()` to:
    1. Read from the `PLATFORM` environment variable if it exists (which is defined in a virtual switch Docker container)
    2. Read from machine.conf if possible (works in the host OS of a standard SONiC image, critical for sonic-config-engine at boot)
    3. Read the value from Config DB (needed for Docker containers running in SONiC, as machine.conf is not accessible to them)

- Also fix typo in daemon_base.py
- Also changes to align `get_hwsku()` with `get_platform()`
2020-08-09 10:40:20 -07:00
Joe LeVeque
6556c40040
[201911] Introduce sonic-py-common package (#5063)
Consolidate common SONiC Python-language functionality into one shared package (sonic-py-common) and eliminate duplicate code.

The package currently includes four modules:
- daemon_base
- device_info
- logger
- task_base

NOTE: This is a combination of all changes from https://github.com/Azure/sonic-buildimage/pull/5003, https://github.com/Azure/sonic-buildimage/pull/5049 and some changes from https://github.com/Azure/sonic-buildimage/pull/5043 backported to align with the 201911 branch. As part of the 201911 port, I am not installing the Python 3 package in the base image or in the VS container, because we do not have pip3 installed, and we do not intend to migrate to Python 3 in 201911.
2020-08-03 11:50:06 -07:00
Nazarii Hnydyn
4e558bca25
[201911][Mellanox] Update MFT to 4.15.0-104 (#5077)
* [Mellanox] Update MFT to 4.15.0-104.

Signed-off-by: Nazarii Hnydyn <nazariig@mellanox.com>

* [Mellanox] Remove build system W/A.

Signed-off-by: Nazarii Hnydyn <nazariig@mellanox.com>

* [Mellanox] Add MFT DKMS build support.

Signed-off-by: Nazarii Hnydyn <nazariig@mellanox.com>
2020-08-03 13:53:33 +03:00
Andriy Kokhan
fbf3cb13a5
[bfn] Updated BFN SDK packages to 20200731 with SAI v1.5.2 (#5082)
Signed-off-by: Andriy Kokhan <akokhan@barefootnetworks.com>
2020-08-03 02:12:55 -07:00
Kebo Liu
c8c4493a96
Update SAI to 1.16.6, SDK to 4.4.1014, FW to *.2008.1032 (#5056)
SAI:
    Fix ECMP max groups logic
    add set issu log level for spc2/spc3, as now issu is supported
    set vlan max swid = 0 on sdk init, as only single swid is needed, for efficient resource usage
    Fix traffic lost during FFB related to buffer config + optimize buffer config timing for FB
    Add ACL fields BTH, IP flags
    Add ACL infrastructure of different fields per ASIC type
    Add port stat ether rx/tx oversize pkts
  SDK/FW:
    Added support for Finisar 100GbE SWDM Transceiver FTLC9152RGPL.
    Spectrum-2 Added support for 10G BaseT modules
    Added link LED support for SN4600C.
    Counters | In SDK debug dump, the incorrect counter type appears for vtraps.
    WJH | Without any traffic or events on the idle system, the CPU load is constantly above 4%
    WJH | WJH filter currently cannot filter by PORT for buffer drop reason.
    Spectrum | ACL, Unbind, Lazy Delete | Running Lazy Delete together with auto_unbind may cause rate condition errors. To work work with Lazy Delete use new INIT parameter "acl_manual_unbind" so that ACLs will notbe removed automatically when binding point is deleted.
    Spectrum | ISSU | In ISSU mode, when querying for the number of configurable buffers, using the API sx_api_cos_port_buff_type_get with the count parameter as 0, the API returns the number for NORMAL mode instead.
    Spectrum-2 | BER | BER monitor counts raw errors instead of effective errors
    Spectrum-2 | BER | Connecting to ConnectX-5 adapter card with copper splitter cable MCP7H50-V001R30 in 1
    Spectrum-2 | Cables | Link flaps in 200GbE with AOM Optic cable MMA1T00-VS
    Spectrum-3 | Speeds, Link | When moving from a 400GbE link to a 1GbE link, packets may drop for 1msec right after link up
    Spectrum-3 | Cables, Speeds | Using 400GbE with 3rd party systems is not supported
    Spectrum-3 | LAG | After a while, LAG members become out of sync with one another
    Spectrum-3 | VLAN, Ports | Packets with VLAN headers are sent to
2020-07-30 13:37:54 +03:00
shlomibitton
9385775803 [Mellanox] Change fan tolerance to 50% (#5018)
Mellanox platforms fan tolerance should change to 50%

Signed-off-by: Shlomi Bitton <shlomibi@mellanox.com>
2020-07-26 11:18:00 -07:00
shlomibitton
d0be3ebe16 Add support for QSFP-DD cables on MLNX platform API (#4965)
Signed-off-by: Shlomi Bitton <shlomibi@mellanox.com>
2020-07-26 11:15:05 -07:00
Nazarii Hnydyn
0ec979dd30
[Mellanox] Fix SN3700 platform string. (#5035)
Signed-off-by: Nazarii Hnydyn <nazariig@mellanox.com>
2020-07-25 03:04:03 -07:00
Joe LeVeque
840be7732c
[201911][devices] Update SFP keys to align with new standard (#4976)
Align SFP key names with new standard defined in https://github.com/Azure/sonic-platform-common/pull/97

- hardwarerev -> hardware_rev
- serialnum -> serial
- manufacturename -> manufacturer
- modelname -> model
- Connector -> connector
2020-07-16 11:09:47 -07:00
Samuel Angebault
41ba95ee3f
[arista] update Arista drivers submodules (#4967)
Merge most of the changes that recently made it to master.
This will be the last such merge operation and future commits will only cherry-pick fixes and targeted features.

Major fixes and features,
- reboot cause enhancement with more hardware reboot cause reporting
- fix reboot cause parsing issue with 201811 release
- fix get_change_event logic
- fix error message on missing sysfs entry by our plugins
- final piece of the platform refactors for fan and sensor reporting through the platform API
2020-07-16 10:36:07 -07:00
arlakshm
7c699df654 Add support for bcmsh and bcmcmd utlitites in multi ASIC devices (#4926)
Signed-off-by: Arvindsrinivasan Lakshmi Narasimhan <arlakshm@microsoft.com>
This PR has changes to support accessing the bcmsh and bcmcmd utilities on multi ASIC devices
Changes done
- move the link of /var/run/sswsyncd from docker-syncd-brcm.mk to docker_image_ctl.j2
- update the bcmsh and bcmcmd scripts to take -n [ASIC_ID] as an argument on multi ASIC platforms
2020-07-11 09:47:24 -07:00
gracelicd
3f05e4a15e
[nephos]: upgrade Nephos SAI version to c749df (#4814)
Verified with Nephos nps8365 based platform Accton AS7116-54x.
2020-07-08 06:55:57 -07:00
abdosi
43bb028ed6
[brcmsai]: Updated BRCM SAI Debina package to 3.7.5.1-2 (#4916)
Fix for Copp Rules not having Policer Rate-Limit applied.

Signed-off-by: Abhishek Dosi <abdosi@microsoft.com>
2020-07-08 06:47:33 -07:00
paavaanan
369bf887d4 [Dell]: DellEMC S6100 disable pericom/xlinx chipset (#4868)
- Xilinx/pericom peripherals are not actively used in DellEMC S6100 switch.
- These peripherals are throwing PCIE corrected messages in some of the units and filling syslog.
- Since it is not usable disabling it at startup.
2020-07-05 15:37:38 -07:00
yozhao101
bbcd4c6235 [Monit] Use the string "/usr/bin/syncd\s" to monitor the syncd process (#4706)
**- Why I did it**
After discussed with Joe, we use the string "/usr/bin/syncd\s" in Monit configuration file to monitor 
syncd process on Broadcom and Mellanox. Due to my careless, I did not find this bug during the 
previous testing. If we use the string "/usr/bin/syncd" in Monit configuration file to monitor the 
syncd process, Monit will not detect whether syncd process is running or not. 

If we ran the command  `sudo monit procmactch “/usr/bin/syncd”` on Broadcom, there will be three 
processes in syncd container which matched this "/usr/bin/syncd": `/bin/bash /usr/bin/syncd.sh
wait`, `/usr/bin/dsserve /usr/bin/syncd –diag -u -p /etc/sai.d/sai.profile` and `/usr/bin/syncd –diag -
u -p /etc/sai.d/said.profile`. Monit will select the processes with the highest uptime (at there 
`/bin/bash /usr/bin/syncd.sh wait`) to match and did not select `/usr/bin/syncd –diag -u -p
/etc/sai.d/said.profile` to match. 

Similarly, On Mellanox Monit will also select the process with the highest uptime (at there 
`/bin/bash /usr/bin/syncd.sh wait`) to match and did not select `/usr/bin/syncd –diag -u -p
/etc/sai.d/said.profile` to match.

That is why Monit is unable to detect whether syncd process is running or not if we use the string “/usr/bin/syncd” in Monit configuration file. If we use the string "/usr/bin/syncd\s" in Monit configuration file, Monit can filter out the process `/bin/bash /usr/bin/syncd.sh wait` and thus can correctly monitor the syncd process.

**- How I did it**

**- How to verify it**

Signed-off-by: Yong Zhao <yozhao@microsoft.com>
2020-06-28 07:29:59 -07:00
Wirut Getbamrung
3d0126baeb [platform-celestica]: Update fancontrol service for Seastone-DX010 device (#3690)
* [platform/cel]: add fancontrol service support for dx010

* [device/celestica]: add hysteresis temp to dx010 fancontrol configuration
2020-06-28 07:27:20 -07:00
Junchao-Mellanox
acafde1895 [Mellanox] Change port index in port_config.ini to 1-based (#4781)
* Change port index in port_config.ini to 1-based
* Add default port index to port_config.ini, change platform plugins to accept 1-based port index
* fix port index in sfp_event.py
2020-06-28 07:24:10 -07:00
Kebo Liu
9db492e31f [Mellanox] Update SDK to 4.4.0952, FW to *.2007.1280 (#4842) 2020-06-28 07:19:25 -07:00
abdosi
a84b534ed7
[broadcom-sai]: Updated broadcom SAI to fix High CPU on TH/Th2 platform. (#4859)
Verified after loading on TH platforms cpu usage gone down:

Previous:
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
2521 root 20 0 1512860 360452 63540 S 144.6 4.4 8:00.03 syncd

After Fix:
7500 root 20 0 1592420 350912 64184 S 45.4 4.3 3:50.99 syncd
2020-06-27 01:10:41 -07:00
yozhao101
c2364cf03e
[201911][dockers] Update critical_processes file syntax (#4854)
Backport of https://github.com/Azure/sonic-buildimage/pull/4831 to the 201911 branch
2020-06-26 11:37:05 -07:00
madhanmellanox
337502c220
converting to Platform based utils (#4830)
Co-authored-by: Madhan Babu <madhan@arc-build-server.mtr.labs.mlnx>
2020-06-23 09:12:54 -07:00
Nazarii Hnydyn
7b18d9c15c [Mellanox] Update MFT to v4.14.5-2. (#4784)
Signed-off-by: Nazarii Hnydyn <nazariig@mellanox.com>
2020-06-17 10:18:01 -07:00
Volodymyr Samotiy
2f82cce3e8
[Mellanox] Update SDK 4.4.0940 and FW xx.2007.1244 (#4777) 2020-06-16 10:28:22 -07:00
Arun Saravanan Balachandran
030570de81 [DellEMC]: EEPROM decoder for S6000, S6000-ON (#4718)
**- Why I did it**

For decoding system EEPROM of S6000 based on Dell offset format and S6000-ON’s system EEPROM in ONIE TLV format.

**- How I did it**

- Differentiate between S6000 and S6000-ON using the product name available in ‘dmi’  ( “/sys/class/dmi/id/product_name” )
- For decoding S6000 system EEPROM in Dell offset format and updating the redis DB with the EEPROM contents, added a new class ‘EepromS6000’ in eeprom.py, 
- Renamed certain methods in both Eeprom, EepromS6000 classes to accommodate the plugin-specific methods.

**- How to verify it**

- Use 'decode-syseeprom' command to list the system EEPROM details.
- Wrote a python script to load chassis class and call the appropriate methods.

UT Logs: [S6000_eeprom_logs.txt](https://github.com/Azure/sonic-buildimage/files/4735515/S6000_eeprom_logs.txt), [S6000-ON_eeprom_logs.txt](https://github.com/Azure/sonic-buildimage/files/4735461/S6000-ON_eeprom_logs.txt)
Test script: [eeprom_test_py.txt](https://github.com/Azure/sonic-buildimage/files/4735509/eeprom_test_py.txt)
2020-06-16 08:15:28 -07:00
Junchao-Mellanox
d10b597f50 [Mellanox] Upgrade mft to 4.14.1-8 (#4701) 2020-06-16 08:14:18 -07:00
Junchao-Mellanox
62690f504a
[Mellanox] Initialize system LED color to green for 201911 (#4743)
* [Mellanox] Initialize system LED color to green for 201911

* Rename variable to make it more readable
2020-06-16 15:38:17 +03:00
Nazarii Hnydyn
50f4e7de5f
[Mellanox] Add ONIE and SSD platform components. (#4764)
Signed-off-by: Nazarii Hnydyn <nazariig@mellanox.com>
2020-06-15 13:04:44 +03:00
Arun Saravanan Balachandran
093d7731ab
[201911] DellEMC: Skip thermalctld and thermal platform API changes (#4752)
**- Why I did it**

- Skip thermalctld in DellEMC S6000, S6100, Z9100 and Z9264 platforms.
- Change the return type of thermal Platform APIs in DellEMC S6000, S6100 and Z9100 platforms to 'float'.

**- How I did it**

- Add 'skip_thermalctld:true' in pmon_daemon_control.json for DellEMC S6000, S6100, Z9100 and Z9264 platforms.
- Made changes in thermal.py, for 'get_temperature', 'get_high_threshold' and 'get_low_threshold' to return 'float' value.

**- How to verify it**

- Check thermalctld is not running in 'pmon'.
- Wrote a python script to load Chassis class and then call the APIs accordingly and verify the return type.
2020-06-11 10:48:27 -07:00
Junchao-Mellanox
0a70571011
[201911][thermal control] Backport feature from master branch (#4677)
Backport thermal control feature from master branch to 201911 branch by cherry-picking commits and manually resolving conflicts.
2020-06-08 11:20:43 -07:00
judyjoseph
ccf12d2ff7
SAI 3.7.5.1 (#4710) 2020-06-07 20:45:12 -07:00
Volodymyr Samotiy
e73a5f1375
[Mellanox] Update SAI, SDK 4.4.0928 and FW xx.2007.1208 (#4704)
Signed-off-by: Volodymyr Samotiy <volodymyrs@mellanox.com>
2020-06-04 13:28:12 -07:00
Kebo Liu
618d529ef4
[201911][Mellanox] Update hw-mgmt package to V.7.0010.1000 (#4688) 2020-06-02 14:53:09 -07:00
Arun Saravanan Balachandran
98b8d1eee1 DellEMC: get_change_event Platform API implementation for S6000, S6100 and Z9100 (#4593)
For detecting transceiver change events through xcvrd in DellEMC S6000, S6100 and Z9100 platforms.

- In S6000, rename 'get_transceiver_change_event' in chassis.py to 'get_change_event' and return appropriate values.
- In S6100, implement 'get_change_event' through polling method (poll interval = 1 second) in chassis.py (Transceiver insertion/removal does not generate interrupts due to a CPLD bug)
- In Z9100, implement 'get_change_event' through interrupt method using select.epoll().
2020-05-27 18:00:45 -07:00
Andriy Kokhan
11ce7617d2 [BFN] Updated Barefoot SDK to 2020-05-07 (#4566)
Signed-off-by: Andriy Kokhan <akokhan@barefootnetworks.com>
2020-05-20 07:55:54 -07:00
Santhosh Kumar T
045d5e6f23 [DellEMC] S6000 Disable Low power mode by default (#4592) 2020-05-20 07:55:16 -07:00
Kebo Liu
fffee7e33a [mellanox]: Update SAI to 1.16.4, SDK to 4.4.0918, FW to *.2007.1140 (#4571)
- mgmt buffer issue on 400G port
- high CPU utilization issue caused by some counter reading
2020-05-12 22:46:21 -07:00
Santhosh Kumar T
1e3df476e5 [DellEMC] S6100 Last Reboot Reason Thermal Support (#3767) 2020-05-09 18:37:31 -07:00
shlomibitton
8367dfebaa
hw-mgmt_V.7.0000.3034 integration (#4518)
Signed-off-by: Shlomi Bitton <shlomibi@mellanox.com>
2020-05-06 12:14:41 +03:00
Samuel Angebault
8456aeba98
Update arista drivers submodules (#4532) 2020-05-04 20:10:50 -07:00
Nazarii Hnydyn
c266435d40
Revert "Add thermal control support for SONiC (#3949)" (#4527)
This reverts commit 109a13cc03.

Conflicts:
	dockers/docker-platform-monitor/docker-pmon.supervisord.conf.j2
2020-05-04 21:20:47 +03:00
Junchao-Mellanox
109a13cc03 Add thermal control support for SONiC (#3949) 2020-04-30 22:39:17 -07:00
Kebo Liu
4bd47e3d7f [mellanox]: MSN4700 support 8 lanes 400G with new SAI/SDK/FW (#4509)
Update SAI/SDK/FW and MSN4700 device files to support 8 lanes 400G

Update SAI to 1.16.3
Update SDK to 4.4.0914
Update FW to *.2007.1112
Update MSN4700 device files to support 8 lanes 400G
2020-04-30 22:19:21 -07:00
Nazarii Hnydyn
bd370fd2b6 [mellanox]: Align CPLD component with latest hw-mgmt. (#4485)
Signed-off-by: Nazarii Hnydyn <nazariig@mellanox.com>
2020-04-30 22:16:29 -07:00
shlomibitton
7709ff5228 [Mellanox] Add a new Mellanox platform x86_64-mlnx_msn4600c and new SKU ACS-MSN4600C (#4483)
* New SKU support for MSN4600C

Signed-off-by: Shlomi Bitton <shlomibi@mellanox.com>
2020-04-30 22:15:52 -07:00
rajendra-dendukuri
e843d994d0 [brcm-th-svk]: Fix errors in BCM956960K switch (#4390)
Fix Broadcom TH SVK boot up crash
2020-04-30 22:10:16 -07:00
Nazarii Hnydyn
8fb0549a48 [mellanox]: Add DPKG local caching support. (#4441)
Signed-off-by: Nazarii Hnydyn <nazariig@mellanox.com>
2020-04-27 08:50:23 -07:00
shlomibitton
aa33a959dd [Mellanox] Add a new Mellanox platform x86_64-mlnx_msn3420 and new SKU ACS-MSN3420 (#4436)
* New SKU support for MSN3420

Signed-off-by: Shlomi Bitton <shlomibi@mellanox.com>

Conflicts:
	device/mellanox/x86_64-mlnx_msn2700-r0/plugins/sfputil.py

* Add CPLD's

* Symlink fixes and semantics

* Adding new platform at end of lines
2020-04-27 08:50:23 -07:00
Kebo Liu
1d7d8fac66 [Mellanox] Update hw-mgmt package to V.7.0000.3020 (#4362)
* update hw-mgmt package to V.7.0000.3020
* update sonic-linux-kernel repo to pick up new patches
2020-04-27 08:50:23 -07:00
Wataru Ishida
674f72e2ce [broadcom]: respect the current network namespace when creating netdev (#3896)
https://github.com/Broadcom-Switch/OpenNSL/issues/26

Signed-off-by: Wataru Ishida <ishida@nel-america.com>
2020-04-27 08:50:23 -07:00
Santhosh Kumar T
214541b75b [DellEMC] S6000 - Thermal support - Last Reboot Reason (#4097)
- Added support for Thermal event in Last Reboot Reason "show reboot-cause" command.
- Added support for sending log message in case of thermal shutdown.

sonic NOTICE root: Shutting down due to over temperature (40 degree, 30 degree, 34 degree)
2020-04-27 08:50:23 -07:00
abdosi
5839a01abd [sonic-buildimage] libsaibcm Debian package update (#4439)
from 3.7.3.3-3 to 3.7.3.3-4
Fixes for PFC WD

Signed-off-by: Abhishek Dosi <abdosi@microsoft.com>
2020-04-19 16:29:32 -07:00
Kebo Liu
e4bd7ab189 [Mellanox] Extend mellanox platform API to report SFP error event (#4365)
* extend mellanox platform API to report SFP error event
* remove unnecessary loop code
* install enum34 to pmon to support using Enum
2020-04-15 13:11:59 -07:00
Nazarii Hnydyn
c3e030b769 [mellanox]: Enable CPLD update progress bar (#4363)
Signed-off-by: Nazarii Hnydyn <nazariig@mellanox.com>
2020-04-15 13:11:29 -07:00