20 Commits

Author SHA1 Message Date
Junchao-Mellanox
ee49d0dfec
[Mellanox] Fix issues found for CMIS host management (#17637)
- Why I did it
1. Thermal updater should wait more time for module to be initialized
2. sfp should get temperature threshold from EEPROM because SDK sysfs is not yet supported
3. Rename sfp function to fix typo
4. sfp.get_presence should return False if module is under initialization

- How I did it
1. Thermal updater should wait more time for module to be initialized
2. sfp should get temperature threshold from EEPROM because SDK sysfs is not yet supported
3. Rename sfp function to fix typo
4. sfp.get_presence should return False if module is under initialization

- How to verify it
Manual test
Unit test
2024-01-04 09:42:33 +02:00
Junchao-Mellanox
d8a1ffbace
[Mellanox] implement sfp.reset for CMIS management (#16862)
- Why I did it
CMIS host management modules need a different implementation of sfp.reset. This PR implements it.

- How I did it
For SW control modules, do reset from hw_reset
For FW control modules, do reset as the original way
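
The dispatch described above could look roughly like the sketch below. The class and the `_hw_reset` / `_fw_reset` helper names are hypothetical placeholders, not the real sfp.py code; only the SW-control versus FW-control split comes from the commit message.

```python
class SfpSketch:
    """Hypothetical stand-in for the platform SFP object; not the real class."""

    def __init__(self, sw_control):
        self.sw_control = sw_control  # True when the module is under software control
        self.calls = []               # records which reset path ran, for illustration only

    def _hw_reset(self):
        # Placeholder for the hardware reset path used by SW control modules.
        self.calls.append("hw_reset")
        return True

    def _fw_reset(self):
        # Placeholder for the original, firmware-driven reset path.
        self.calls.append("fw_reset")
        return True

    def reset(self):
        # Choose the reset path based on who controls the module.
        if self.sw_control:
            return self._hw_reset()
        return self._fw_reset()
```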

- How to verify it
Manual test
sonic-mgmt platform test
2023-12-17 08:02:47 +02:00
Junchao-Mellanox
1b84f3daa5
[Mellanox] update asic and module temperature in a thread for CMIS management (#16955)
- Why I did it
When a module is fully under software control, the driver cannot get the module temperature or temperature thresholds from firmware. In this case, SONiC needs to read them from EEPROM. This PR adds a thermal updater thread that updates the module temperature and temperature thresholds while software control is enabled.

- How I did it
Query ASIC temperature from SDK sysfs and update hw-management-tc periodically
Query Module temperature from EEPROM and update hw-management-tc periodically
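
A periodic updater of this kind can be sketched with a plain Python thread. Everything named here (`ThermalUpdaterSketch`, the reader and `publish` callables, the 5-second default interval) is an assumption for illustration; the real updater, its sysfs and EEPROM readers, and its polling interval are not shown in this log.

```python
import threading


class ThermalUpdaterSketch:
    """Hypothetical periodic updater; the reader and sink callables are assumptions."""

    def __init__(self, read_asic_temp, read_module_temp, publish, interval=5.0):
        self._read_asic_temp = read_asic_temp      # e.g. would read SDK sysfs
        self._read_module_temp = read_module_temp  # e.g. would read module EEPROM
        self._publish = publish                    # e.g. would write to hw-management-tc
        self._interval = interval
        self._stop = threading.Event()
        self._thread = None

    def update_once(self):
        # One polling cycle: read both sources and hand the values to the sink.
        self._publish("asic", self._read_asic_temp())
        self._publish("module", self._read_module_temp())

    def _run(self):
        # Event.wait acts as an interruptible sleep; it returns True once stop() is called.
        while not self._stop.wait(self._interval):
            self.update_once()

    def start(self):
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()

    def stop(self):
        self._stop.set()
        if self._thread is not None:
            self._thread.join()
```

Using an `Event` instead of `time.sleep` lets `stop()` interrupt the wait immediately, so shutdown does not have to wait for a full polling interval.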

- How to verify it
Manual test
New Unit tests
2023-12-13 14:19:44 +02:00
Junchao-Mellanox
0d62cf0e92
[Mellanox] Remove EEPROM write limitation if it is software control (#17030)
- Why I did it
When module is under software control (CMIS host management enabled), EEPROM should be controlled by software and there should be no limitation for any write operation.

- How I did it
Remove EEPROM write limitation if a module is under software control

- How to verify it
Manual test
UT
2023-12-13 14:16:40 +02:00
Junchao-Mellanox
b0bb3d40d3
[Mellanox] Implement low power mode for cmis host management (#17159)
- Why I did it
For CMIS host management mode, the previous sysfs cannot be used for low power mode setting. This PR reuses the existing low power mode implementation in the sonic_xcvr package when CMIS host management mode is enabled.

- How I did it
Use sonic_xcvr low power mode implementation when CMIS host management mode is enabled.

- How to verify it
Manual test for CMIS host management mode
Regression test for old mode and backward compatible test
2023-12-11 10:42:01 +02:00
Junchao-Mellanox
26bf38b610
[Mellanox] Provide default implementation for sfp error description when CMIS host management is enabled (#17294)
- Why I did it
Provide a dummy implementation for SFP error description when CMIS host management is enabled. A future feature shall be raised to implement SFP error description for such mode.

- How I did it
If the SFP is under software control, provide "Not supported" as the error description
If the SFP is under initialization, provide "Initializing" as the error description
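
As a minimal sketch of the two rules above: the function name, its parameters, and the order in which the two conditions are checked are assumptions, since the commit message does not say which one takes priority.

```python
def get_error_description_sketch(sw_control, initializing, fw_description=None):
    """Hypothetical helper; names and check order are illustrative only."""
    if initializing:
        # Assumption: the initialization check takes priority.
        return "Initializing"
    if sw_control:
        # No real error description is available yet in software control mode.
        return "Not supported"
    # Otherwise fall back to whatever the firmware-controlled path reports.
    return fw_description
```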

- How to verify it
unit test
2023-12-05 20:27:29 +02:00
Kebo Liu
bf4a2e3002
[Mellanox] Revert LPM implementation to the old way (#17096)
- Why I did it
The current low power mode implementation requires the user to set the port to admin down before toggling LP mode, which is not backward compatible. This change reverts to the old behavior so that the user can toggle LP mode regardless of the port admin status.

- How I did it
Revert the recent changes related to LPM in PR #14130 and #16545

- How to verify it
Run all sfputil and SFP platform API related tests on all the Mellanox platforms.

Signed-off-by: Kebo Liu <kebol@nvidia.com>
2023-11-27 13:23:04 +02:00
Vivek
456a90e1ab
[Nvidia] Remove the dependency on python_sdk_api for sfp api (#16545)
The SFP API can now be called from a host that doesn't have python_sdk_api installed. The SFP API has also been migrated to use sysfs instead of an SDK handle.

Signed-off-by: Vivek Reddy Karri <vkarri@nvidia.com>
2023-09-23 00:19:27 -07:00
Junchao-Mellanox
91f3da018e
[Mellanox] Add more unit test coverage for platform API (#15842)
- Why I did it
Increase UT coverage for Nvidia platform API code

- How I did it
Focus on low-coverage files:
1. component.py
2. watchdog.py
3. pcie.py

- How to verify it
Run the unit tests; coverage increased from 70% to 90%.
2023-08-03 13:54:31 +03:00
Junchao-Mellanox
18cf719d6a
[Mellanox] Use sysfs for sfp reset/LPM/presence (#14130)
- Why I did it
The current implementation of SFP reset, LPM, and presence relies on the SDK API. This PR moves the implementation to SDK sysfs, which brings the following benefits:
1. SDK sysfs provides better performance.
2. Host side and container side share the same code.
3. Code is much cleaner.

- How I did it
Use SDK sysfs to implement SFP reset, LPM, and presence.
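
Reading a sysfs attribute usually comes down to reading a small text file. The sketch below assumes a hypothetical presence file whose content is `1` when a module is present; the actual SDK sysfs paths and value encoding are not given in this log, and the helper names are illustrative.

```python
def read_int_from_file_sketch(path, default=0):
    """Hypothetical helper: read an integer from a sysfs-style text file."""
    try:
        with open(path, "r") as f:
            return int(f.read().strip())
    except (OSError, ValueError):
        # Missing file or unparsable content: fall back to the default.
        return default


def get_presence(present_path):
    # Assumption: the file contains 1 when the module is present.
    return read_int_from_file_sketch(present_path) == 1
```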

- How to verify it
1. Manual test.
2. Unit test.
2023-05-24 17:24:34 +03:00
Junchao-Mellanox
830b7d8cb4
[Mellanox] Use sdk sysfs instead of ethtool (#12480)
2022-11-03 11:17:44 -07:00
Junchao-Mellanox
1d69f0916e
[Mellanox] Provide dummy implementation for get_rx_los and get_tx_fault (#12231)
- Why I did it
get_rx_los and get_tx_fault are not supported via the existing interface, so a dummy implementation is needed for them.
NOTE: later releases will bring them back via a different interface.

- How I did it
Return [False] * lane_num for get_rx_los and get_tx_fault
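
In Python, a per-lane list of `False` values is built by repeating a one-element list. A minimal sketch, with a hypothetical function name:

```python
def dummy_lane_flags(lane_num):
    # One False entry per lane. Repeating a list of immutable bools is safe
    # because the entries cannot be mutated through a shared reference.
    return [False] * lane_num
```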

- How to verify it
Added unit test
2022-09-30 09:38:05 +03:00
Junchao-Mellanox
46ebd06403
[Mellanox] Fix issue: set lpmode by platform API does not work (#11732)
- Why I did it
Fix issue: set lpmode by platform API does not work

- How I did it
Fix missing return value in code

- How to verify it
Manual test
2022-08-18 13:07:38 +03:00
Kebo Liu
7ac590b5c5
[Mellanox] Enhance Platform API to support SN2201 - RJ45 ports and new components mgmt. (#10377)
* Support new platform SN2201 and RJ45 port

Signed-off-by: Kebo Liu <kebol@nvidia.com>

* remove unused import and redundant function

Signed-off-by: Kebo Liu <kebol@nvidia.com>

* fix error introduced by rebase

Signed-off-by: Kebo Liu <kebol@nvidia.com>

* Revert the special handling of RJ45 ports (#56)

* Revert the special handling of RJ45 ports

sfp.py
sfp_event.py
chassis.py

Signed-off-by: Stephen Sun <stephens@nvidia.com>

* Remove deadcode

Signed-off-by: Stephen Sun <stephens@nvidia.com>

* Support CPLD update for SN2201

A new class is introduced, deriving from ComponentCPLD and overloading _install_firmware
Change _install_firmware from private (starting with __) to protected, making it overloadable

Signed-off-by: Stephen Sun <stephens@nvidia.com>

* Initialize component BIOS/CPLD

Signed-off-by: Stephen Sun <stephens@nvidia.com>

* Remove swb_amb which doesn't exist on DVT board any more

Signed-off-by: Stephen Sun <stephens@nvidia.com>

* Remove the nonexistent sensor - switch board ambient - from platform.json

Signed-off-by: Stephen Sun <stephens@nvidia.com>

* Do not report error on receiving unknown status on RJ45 ports

Translate it to disconnect for RJ45 ports
Report error for xSFP ports

Signed-off-by: Stephen Sun <stephens@nvidia.com>

* Add reinit for RJ45 to avoid exception

Signed-off-by: Stephen Sun <stephens@nvidia.com>

Co-authored-by: Stephen Sun <5379172+stephenxs@users.noreply.github.com>
Co-authored-by: Stephen Sun <stephens@nvidia.com>
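
The CPLD item above renames the install method from a `__`-prefixed name to a single-underscore name so that a subclass can override it. The sketch below shows why that matters in Python; all class names are hypothetical and the method bodies are placeholders.

```python
class ManglingBase:
    def __install(self):
        # Name-mangled to _ManglingBase__install inside this class.
        return "base"

    def run(self):
        # This call also resolves to _ManglingBase__install.
        return self.__install()


class ManglingChild(ManglingBase):
    def __install(self):
        # Mangled to _ManglingChild__install, so it does NOT override the base method.
        return "child"


class ProtectedBase:
    def _install_firmware(self):
        return "base"

    def run(self):
        # Normal attribute lookup, so a subclass override is picked up.
        return self._install_firmware()


class ProtectedChild(ProtectedBase):
    def _install_firmware(self):
        return "child"
```

With the double-underscore name, `ManglingChild().run()` still runs the base implementation; with the single-underscore name, `ProtectedChild().run()` runs the override.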
2022-06-20 19:12:20 -07:00
Andriy Yurkiv
1e2e493daa
[Mellanox] Credo Y-cable read_eeprom/write_eeprom API implementation (#10320)
- Why I did it
Implement read_eeprom/write_eeprom API for Credo Y-cable for Dual ToR Active-Standby

- How I did it
Use mlxreg utility for API implementation

Signed-off-by: Andriy Yurkiv <ayurkiv@nvidia.com>
2022-03-30 20:41:31 +03:00
Oleksandr Ivantsiv
9565ef7a9a
[Mellanox] Refactor SFP to use new APIs. (#10317)
- Why I did it
Refactor SFP code to remove code duplication and to be able to use the latest features available in new APIs.

- How I did it
Refactor SFP code to remove code duplication and to be able to use the latest features available in new APIs.

- How to verify it
Run sonic-mgmt/platform_tests/sfp tests
2022-03-23 09:16:03 +02:00
Junchao-Mellanox
e8b4c2a1f4
[Mellanox] Refactor Mellanox platform API to support dynamic port configuration (#8422)
- Why I did it
* To support systems with dynamic port configuration
* Apply lazy initialization to speed up loading of the platform API

- How I did it
* Add module.py to implement dynamic port configuration (aka line card model)
* Adjust chassis.py, platform.py, thermal.py, sfp.py to support dynamic port configuration
* Optimize existing code

- How to verify it
Platform regression on MSN4700, MSN3800 and MSN2700, 100% pass
Unit test covers all new changes.
2021-10-25 07:59:06 +03:00
Dror Prital
5356244e53
[Mellanox] Add NVIDIA Copyright header to "mellanox" files (#8799)
- Why I did it
Add NVIDIA Copyright header to "mellanox" files

- How I did it
Add NVIDIA Copyright header as a comment for Mellanox files

- How to verify it
Sanity tests and PR checkers.
2021-10-17 19:03:02 +03:00
Junchao-Mellanox
147bf240f0
[Mellanox] Add bitmap support for SFP error event (#7605)
#### Why I did it

Currently, SONiC uses a single value to represent an SFP error. However, multiple SFP errors can exist at the same time. This PR adds support for that case.

#### How I did it

Return a bitmap instead of a single value when an SFP event occurs
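
A bitmap lets one integer carry several errors at once, with each error owning one bit. The sketch below illustrates the idea with `enum.IntFlag`; the error names and bit positions are hypothetical and are not the values used by the platform code.

```python
from enum import IntFlag


class SfpErrorSketch(IntFlag):
    # Hypothetical error bits, for illustration only.
    POWER_BUDGET_EXCEEDED = 1 << 0
    BAD_EEPROM = 1 << 1
    HIGH_TEMPERATURE = 1 << 2


def decode_errors(bitmap):
    # Return the names of all error bits set in the bitmap, in definition order.
    return [flag.name for flag in SfpErrorSketch if bitmap & flag]
```

For example, a bitmap of `0b101` decodes to both `POWER_BUDGET_EXCEEDED` and `HIGH_TEMPERATURE`, which a single error value could not express.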

Signed-off-by: Stephen Sun <stephens@nvidia.com>
2021-06-25 10:56:47 -07:00
Stephen Sun
9f0dce0313
[Mellanox] Optimize SFP modules initialization (#7537)
Originally, SFP modules were always accessed from platform daemons, and a daemon can access arbitrary SFP modules. So all SFP modules were initialized in one shot once one of the following chassis APIs was called:
- get_all_sfps
- get_num_sfps
- get_sfp

Recently, we noticed that SFP modules can also be accessed from the CLI, e.g., the latest refactor of `sfputil`.

In this case, only one SFP module is accessed in the chassis object's life cycle.
Initializing all SFP modules in one shot is a waste of time and causes the CLI to take much more time to finish.
So we would like to optimize the initialization flow by introducing a two-phase initialization approach:
- Partial initialization, which means the `chassis._sfp_list` has been initialized with proper length and all elements being `None`
- Full initialization, which means all elements in `chassis._sfp_list` are created

If the relevant function is called,
- `get_sfp`, only partial initialization will be done, and then the specific SFP module is initialized.
- `get_all_sfps` or `get_num_sfps`, full initialization will be done, which means all SFP modules are initialized.
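
The two-phase approach can be sketched as follows. `ChassisSketch`, the `sfp_factory` callable, and the 0-based index are assumptions for illustration; only `_sfp_list` and the three API names come from the description above.

```python
class ChassisSketch:
    """Hypothetical chassis showing partial vs. full SFP initialization."""

    def __init__(self, port_count, sfp_factory):
        self._port_count = port_count
        self._sfp_factory = sfp_factory  # hypothetical callable: index -> SFP object
        self._sfp_list = None            # nothing initialized yet

    def _partial_init(self):
        # Phase 1: allocate the list with the proper length, all elements None.
        if self._sfp_list is None:
            self._sfp_list = [None] * self._port_count

    def _full_init(self):
        # Phase 2: create every SFP object that does not exist yet.
        self._partial_init()
        for index in range(self._port_count):
            if self._sfp_list[index] is None:
                self._sfp_list[index] = self._sfp_factory(index)

    def get_sfp(self, index):
        # Only the requested module is created (0-based index is an assumption).
        self._partial_init()
        if self._sfp_list[index] is None:
            self._sfp_list[index] = self._sfp_factory(index)
        return self._sfp_list[index]

    def get_all_sfps(self):
        self._full_init()
        return self._sfp_list

    def get_num_sfps(self):
        self._full_init()
        return len(self._sfp_list)
```

A CLI call that touches a single port then pays for one SFP object, while daemons that call `get_all_sfps` still get the fully populated list.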

Signed-off-by: Stephen Sun <stephens@nvidia.com>
2021-05-06 10:14:48 -07:00