Commit Graph

140 Commits

Author SHA1 Message Date
shlomibitton
e6ec5d0774
Fix MSN4700 sensors labels (#5861)
Signed-off-by: Shlomi Bitton <shlomibi@nvidia.com>
2020-11-10 18:33:24 +02:00
Kebo Liu
1158701edc
add pcied config files for mellanox platform (#5669)
This PR has a dependency on community change to move PCIe config files from $PLATFORM/plugin folder to $PLATFORM/ folder
- Why I did it
To support PCIed daemon on Mellanox platforms
- How I did it
Add PCIed config yaml files for all Mellanox platforms
Update pmon daemon config files for SimX platforms
2020-11-02 19:45:36 -08:00
Nazarii Hnydyn
5486f87afc
[Mellanox] Update platform components config files. (#5685)
Signed-off-by: Nazarii Hnydyn <nazariig@nvidia.com>
2020-10-25 19:44:37 +02:00
shlomibitton
a5242a65dc
[Mellanox] Fixes sensors labels for human readable output for MSN3420 (#5664)
Fixes sensors labels for human readable output for MSN3420

Signed-off-by: Shlomi Bitton <shlomibi@nvidia.com>
2020-10-19 16:04:07 -07:00
shlomibitton
7ecc15e26d
[Mellanox] Add sensors labels for human readable output for MSN2740 (#5662)
Add sensors labels for human readable output for MSN2740
2020-10-19 16:03:04 -07:00
shlomibitton
de1f7421ac
[Mellanox] Add sensors labels for human readable output for MSN2700 (#5661)
Add sensors labels for human readable output for MSN2700

Signed-off-by: Shlomi Bitton <shlomibi@nvidia.com>
2020-10-19 16:02:27 -07:00
shlomibitton
b5043a2e49
[Mellanox] Add sensors labels for human readable output for MSN2410 (#5660)
Add sensors labels for human readable output for MSN2410
2020-10-19 09:51:52 -07:00
shlomibitton
9f73b8aeb0
[Mellanox] Add sensors labels for human readable output for MSN2100 (#5659)
Add sensors labels for human readable output for MSN2100

Signed-off-by: Shlomi Bitton <shlomibi@nvidia.com>
2020-10-19 09:50:44 -07:00
shlomibitton
97caf46b00
[Mellanox] Add sensors labels for human readable output for MSN2010 (#5658)
Add sensors labels for human readable output for MSN2010

Signed-off-by: Shlomi Bitton <shlomibi@nvidia.com>
2020-10-19 09:49:19 -07:00
Junchao-Mellanox
1c97a03b81
[system-health] Add support for monitoring system health (#4835)
* system health first commit

* system health daemon first commit

* Finish healthd

* Changes due to lower layer logic change

* Get ASIC temperature from TEMPERATURE_INFO table

* Add system health make rule and service files

* fix bugs found during manual test

* Change make file to install system-health library to host

* Set system LED to blink on bootup time

* Caught exceptions in system health checker to make it more robust

* fix issue that fan/psu presence will always be true

* fix issue for external checker

* move system-health service to right after rc-local service

* Set system-health service start after database service

* Get system up time via /proc/uptime

* Provide more information in stat for CLI to use

* fix typo

* Set default category to External for external checker

* If external checker reported OK, save it to stat too

* Trim string for external checker output

* fix issue: PSU voltage check always return OK

* Add unit test cases for system health library

* Fix LGTM warnings

* fix demo comments: 1. get boot up timeout from monit configuration file; 2. set system led in library instead of daemon

* Remove boot_timeout configuration because it will get from monit config file

* Fix argument miss

* fix unit test failure

* fix issue: summary status is not correct

* Fix format issues found in code review

* rename th to threshold to make it clearer

* Fix review comment: 1. add a .dep file for system health; 2. deprecated daemon_base and uses sonic-py-common instead

* Fix unit test failure

* Fix LGTM alert

* Fix LGTM alert

* Fix review comments

* Fix review comment

* 1. Add relevant comments for system health; 2. rename external_checker to user_define_checker

* Ignore check for unknown service type

* Fix unit test issue

* Rename user define checker to user defined checker

* Rename user_define_checkers to user_defined_checkers for configuration file

* Renmae file user_define_checker.py -> user_defined_checker.py

* Fix typo

* Adjust import order for config.py

Co-authored-by: Joe LeVeque <jleveque@users.noreply.github.com>

* Adjust import order for src/system-health/health_checker/hardware_checker.py

Co-authored-by: Joe LeVeque <jleveque@users.noreply.github.com>

* Adjust import order for src/system-health/scripts/healthd

Co-authored-by: Joe LeVeque <jleveque@users.noreply.github.com>

* Adjust import orders in src/system-health/tests/test_system_health.py

* Fix typo

* Add new line after import

* If system health configuration file not exist, healthd should exit

* Fix indent and enable pytest coverage

* Fix typo

* Fix typo

* Remove global logger and use log functions inherited from super class

* Change info level logger to notice level

Co-authored-by: Joe LeVeque <jleveque@users.noreply.github.com>
2020-10-12 11:12:49 +03:00
vdahiya12
f2194010f8
[MSN2700] Add platform.json file containing platform hardware facts (#5189)
As part of Platform api testing for multiple platforms, this pull request adds a platform.json file which contains all
the static data for Mellanox-2700 platform. This file would provide all the platform specific data required for testing of all the Platform tests . As part of testing the API's the values of static/default objects within this specific platform file  will be compared  against the values returned by calling the Platform specific API's in a typical platform test
2020-09-22 17:12:51 -07:00
Stephen Sun
c8277a4eba
Update buffer configuration for SKUs based on SN3800 (#5320)
C64: 32 100G down links and 32 100G up links.
D112C8: 112 50G down links and 8 100G up links.
D24C52: 24 50G down links, 20 100G down links, and 32 100G up links.
D28C50: 28 50G down links, 18 100G down links, and 32 100G up links.

Signed-off-by: Stephen Sun <stephens@nvidia.com>
2020-09-22 19:10:18 +03:00
Nazarii Hnydyn
e2b4afc438
[Mellanox] Update platform components config files. (#5360)
Signed-off-by: Nazarii Hnydyn <nazariig@nvidia.com>
2020-09-13 10:23:22 +03:00
Kebo Liu
72ec212fa7
[Mellanox] Refactor SFP related platform API and plugins with new SDK API (#5326)
Refactor SFP reset, low power get/set API, and plugins with new SDK SX APIs. Previously they were calling SDK SXD APIs which have glibc dependency because of shared memory usage.

Remove implementation "set_power_override", "tx_disable_channel", "tx_disable" which using SXD APIs, once related SDK SX API available, will add them back based on new SDK SX APIs.
2020-09-11 13:23:23 -07:00
Stephen Sun
54fcdbb380
Support single ingress pool for MSFT SKUs and optimize headroom calculation (#4686)
Calculate pool size in t1 as 24 * downlink port + 8 * uplink port

- Take both port and peer MTU into account when calculating headroom
- Worst case factor is decreased to 50%
- Mellanox-SN2700-C28D8 t0, assume 48 * 50G/5m + 8 * 100G/40m ports
- Mellanox-SN2700 (C32)
  - t0: 16 * 100G/5m + 16 * 100G/40m
  - t1: 16 * 100G/40m + 16 * 100G/300m

Signed-off-by: Stephen Sun <stephens@mellanox.com>

Co-authored-by: Stephen Sun <stephens@mellanox.com>
2020-08-13 13:12:09 +03:00
shlomibitton
70bf65302d
MSN2100 and MSN2010 platforms are not supporting PSU temperature sampling, ignore temperature check by default for these platforms (#5047)
Signed-off-by: Shlomi Bitton <shlomibi@mellanox.com>
2020-08-04 14:49:19 +03:00
Joe LeVeque
3b89e5d467
[Python] Migrate applications/scripts to import sonic-py-common package (#5043)
As part of consolidating all common Python-based functionality into the new sonic-py-common package, this pull request:
1. Redirects all Python applications/scripts in sonic-buildimage repo which previously imported sonic_device_util or sonic_daemon_base to instead import sonic-py-common, which was added in https://github.com/Azure/sonic-buildimage/pull/5003
2. Replaces all calls to `sonic_device_util.get_platform_info()` to instead call `sonic_py_common.get_platform()` and removes any calls to `sonic_device_util.get_machine_info()` which are no longer necessary (i.e., those which were only used to pass the results to `sonic_device_util.get_platform_info()`.
3. Removes unused imports to the now-deprecated sonic-daemon-base package and sonic_device_util.py module

This is the next step toward resolving https://github.com/Azure/sonic-buildimage/issues/4999

Also reverted my previous change in which device_info.get_platform() would first try obtaining the platform ID string from Config DB and fall back to gathering it from machine.conf upon failure because this function is called by sonic-cfggen before the data is in the DB, in which case, the db_connect() call will hang indefinitely, which was not the behavior I expected. As of now, the function will always reference machine.conf.
2020-08-03 11:43:12 -07:00
Stephen Sun
0db7e88313
[Mellanox] Update the buffer setting (#4989)
* Update the buffer size based on the latest excel

Signed-off-by: Stephen Sun <stephens@mellanox.com>

* Align the buffer configuration with the latest formula:

- reduce redundant "*2" in formula
- use port MTU for local sending the PFC frame and peer lossless MTU for peer sending lossless traffic

Buffer pool size updated accordingly.

Signed-off-by: Stephen Sun <stephens@mellanox.com>
2020-07-30 14:22:08 +03:00
Joe LeVeque
9905d9382d
[devices] Update SFP keys to align with new standard (#4975)
Align SFP key names with new standard defined in https://github.com/Azure/sonic-platform-common/pull/97

- hardwarerev -> hardware_rev
- serialnum -> serial
- manufacturename -> manufacturer
- modelname -> model
- Connector -> connector
2020-07-16 13:03:50 -07:00
Junchao-Mellanox
e1f7fb135b
[Mellanox] Add system health configuration file for Mellanox platforms (#4834)
The new feature system health support a platform based configuration file. Add configuration files for all Mellanox platform.

Add a configuration file for SN2700, other platform will use a soft link to it.
2020-07-13 10:20:22 -07:00
Stephen Sun
153f880e6b [mellanox]: Support warm reboot on MSN4700 (#4910) 2020-07-12 18:08:52 +00:00
shlomibitton
e666bf8490 [Mellanox] Add a new SKU Mellanox-SN4600C-D112C8 (#4833)
Add related files to the device folder:

buffer config templates
pg lookup profile
port_config.ini
sai profile
sensor conf
plugins

Co-authored-by: Stephen Sun <stephens@mellanox.com>
2020-07-12 18:08:52 +00:00
Junchao-Mellanox
563a0fd21e
[Mellanox] Change port index in port_config.ini to 1-based (#4781)
* Change port index in port_config.ini to 1-based
* Add default port index to port_config.ini, change platform plugins to accept 1-based port index
* fix port index in sfp_event.py
2020-06-23 17:21:36 -07:00
madhanmellanox
d2366d4ff7
added files to create SKU Mellanox-SN3800-C64 (#4812)
* added files to create SKU Mellanox-SN3800-C64
Co-authored-by: Madhan Babu <madhan@arc-build-server.mtr.labs.mlnx>
2020-06-22 08:42:30 -07:00
madhanmellanox
b5f1b37386
added files to create SKU Mellanox-SN3800-D24C52 (#4808)
* added files to create SKU Mellanox-SN3800-D24C52

Co-authored-by: Madhan Babu <madhan@arc-build-server.mtr.labs.mlnx>
2020-06-21 12:19:11 -07:00
madhanmellanox
5efd1e7527
added files to create SKU Mellanox-SN3800-D28C50 (#4809)
* added files to create SKU Mellanox-SN3800-D28C50

Co-authored-by: Madhan Babu <madhan@arc-build-server.mtr.labs.mlnx>
2020-06-21 12:17:56 -07:00
madhanmellanox
b5d0bada19
modified files relevant to SKU Mellanox-SN3800-D112C8 (#4810)
* modified files relevant to SKU Mellanox-SN3800-D112C8

Co-authored-by: Madhan Babu <madhan@arc-build-server.mtr.labs.mlnx>
2020-06-21 12:16:16 -07:00
madhanmellanox
2c830f4074
Modified SKU based utils to Platform based utils (#4786)
Co-authored-by: Madhan Babu <madhan@arc-build-server.mtr.labs.mlnx>
2020-06-21 12:15:23 -07:00
shlomibitton
0029d366c4
Fix MSN4700 sensors (#4753)
Signed-off-by: Shlomi Bitton <shlomibi@mellanox.com>
2020-06-16 15:39:27 +03:00
shlomibitton
ea63f3e4b2
[mellanox]: Fix for MSN4600C sensors (#4754)
Signed-off-by: Shlomi Bitton <shlomibi@mellanox.com>
2020-06-12 16:08:27 -07:00
Junchao-Mellanox
e25c2d984f
[Mellanox] Never disable kernel thermal algorithm at real-time (#4638) 2020-05-26 10:46:29 -07:00
Junchao-Mellanox
5e6c20481d
[Mellanox] Enhancement for fan led management (#4437) 2020-05-13 10:01:32 -07:00
shlomibitton
404ae85e2c
[Mellanox] Fix 'sensors.conf' mapping for MSN4700 (#4511)
* [Mellanox] Fix 'sensors.conf' mapping for SN4700

Signed-off-by: Shlomi Bitton <shlomibi@mellanox.com>

* Fix some labels name
2020-05-07 16:19:51 +03:00
shlomibitton
d9210d7ace
[Mellanox] Fix SN3420 'sensors.conf' label names (#4544)
Signed-off-by: Shlomi Bitton <shlomibi@mellanox.com>
2020-05-07 16:15:13 +03:00
Kebo Liu
352a39742a
[mellanox]: MSN4700 support 8 lanes 400G with new SAI/SDK/FW (#4509)
Update SAI/SDK/FW and MSN4700 device files to support 8 lanes 400G

Update SAI to 1.16.3
Update SDK to 4.4.0914
Update FW to *.2007.1112
Update MSN4700 device files to support 8 lanes 400G
2020-04-30 15:46:21 -07:00
Stephen Sun
a87bf4df83
[Mellanox] Fix error in sensors.conf for 3700/3700c/3800 (#4506) 2020-04-30 10:30:58 -07:00
shlomibitton
b6291372d9
[Mellanox] Add a new Mellanox platform x86_64-mlnx_msn4600c and new SKU ACS-MSN4600C (#4483)
* New SKU support for MSN4600C

Signed-off-by: Shlomi Bitton <shlomibi@mellanox.com>
2020-04-30 00:30:11 -07:00
Stephen Sun
e363293ac0
[Mellanox]Mellanox-SN3800-D112C8 support warm-reboot (#4482) 2020-04-27 12:47:44 -07:00
shlomibitton
ac6cfb115f
[Mellanox] Add a new Mellanox platform x86_64-mlnx_msn3420 and new SKU ACS-MSN3420 (#4436)
* New SKU support for MSN3420

Signed-off-by: Shlomi Bitton <shlomibi@mellanox.com>

Conflicts:
	device/mellanox/x86_64-mlnx_msn2700-r0/plugins/sfputil.py

* Add CPLD's

* Symlink fixes and semantics

* Adding new platform at end of lines
2020-04-26 14:39:55 +03:00
Junchao-Mellanox
c730f3e207
[Mellanox] thermal control enhancement for dynamic minimum fan speed and PSU fan speed policy (#4403) 2020-04-21 08:09:53 -07:00
Kebo Liu
860cb265ac
[PMON] Extend pmon daemon start control to lm-sensors and fancontrol (#4447) 2020-04-21 08:00:48 -07:00
Kebo Liu
e04eb806b6
[Mellanox] bug fix - adpt sfputil plugin to support ACS-MSN4700 (#4361) 2020-04-14 10:27:27 -07:00
noaOrMlnx
837c13fa63
[Mellanox] Enable ISSU on MSN2100, MSN2740, MSN3800 (#4387) 2020-04-08 02:39:36 -07:00
Mykola F
c5c1ae2173
[Mellanox] update eeprom.py plugin for SimX (#4364)
Signed-off-by: Mykola Faryma <mykolaf@mellanox.com>
2020-04-03 12:41:37 -07:00
Mykola F
7e7777e374
[devices][Mellanox] create sai.xml for MSN3800-D112-C8 (#4334)
sai_xml contains info about port splits, previously it simply linked to the MSN3800 sai xml, which does not have splits. New version describes splits and speeds according to Mellanox-SN3800-D112-C8 SKU.

Practically it can cause port recreation on SAI init.

Signed-off-by: Mykola Faryma <mykolaf@mellanox.com>
2020-03-31 11:25:52 -07:00
Kebo Liu
f4ed88297d
[Mellanox] Add a new Mellanox platform x86_64-mlnx_msn4700 and new SKU ACS-MSN4700 (#3901)
* add MSN4700 device files

* update ACS-MSN4700 sai profile

* update buffer pool size, headroom, sensor conf, port config and reboot scripts

* fix ident

* update sensor conf and buffer pool

* [sn4700] add sku 4700 to chassis.py

* [Mellanox-4700] Add 4700 info to psu and thermal platform API

* update buffer config file template to the latest.
update SAI profile to use 100G X 4lanes for now
update port_config.ini according to the SAI profile

* [Mellanox]Update the buffer configurations for 4700

* fix alignment in pg_profile_lookup.ini

* add platform components file for new sku

* Update device/mellanox/x86_64-mlnx_msn4700-r0/ACS-MSN4700/pg_profile_lookup.ini

Co-Authored-By: Nazarii Hnydyn <nazariig@mellanox.com>

* remove redundant line

* [Mellanox]Correct type, buffer size

Co-authored-by: Nazarii Hnydyn <nazariig@mellanox.com>
Co-authored-by: junchao <junchao@mellanox.com>
Co-authored-by: Stephen Sun <stephens@mellanox.com>
2020-03-24 14:32:52 +02:00
Stephen Sun
8f0ff4b764
[Mellanox] Calculate the buffer size based on the latest excel and with gearbox considered (#4239) 2020-03-11 13:44:18 -07:00
Junchao-Mellanox
be549db395
Add thermal control support for SONiC (#3949) 2020-03-09 10:41:10 -07:00
Joe LeVeque
f6ec95cf2b
[Mellanox]the port index in port_config.ini should starts from 0 (#4152) 2020-03-04 19:15:12 -08:00
Kebo Liu
7a9c8ee1fc
filter out CPU ports to avoid any operation on them (#4197) 2020-03-03 21:19:30 +02:00