Commit Graph

39 Commits

Author SHA1 Message Date
mssonicbld
06aa8aa11b
[Mellanox] Support DSCP remapping in dual ToR topo on T0 switch (#12605) (#13745)
- Why I did it
Support DSCP remapping in dual ToR topo on T0 switch for SKU Mellanox-SN4600c-C64, Mellanox-SN4600c-D48C40, Mellanox-SN2700, Mellanox-SN2700-D48C8.

- How I did it
Regarding buffer settings, originally, there are two lossless PGs and queues 3, 4. In dual ToR scenario, the lossless traffic from the leaf switch to the uplink of the ToR switch can be bounced back.
To avoid PFC deadlock, we need to map the bounce-back lossless traffic to different PGs and queues. Therefore, 2 additional lossless PGs and queues are allocated on uplink ports on ToR switches.

On uplink ports, map DSCP 2/6 to TC 2/6 respectively
On downlink ports, both DSCP 2/6 are still mapped to TC 1
Buffer adjusted according to the ports information:
Mellanox-SN4600c-C64:
56 downlinks 50G + 8 uplinks 100G
Mellanox-SN4600c-D48C40, Mellanox-SN2700, Mellanox-SN2700-D48C8:
24 downlinks 50G + 8 uplinks 100G

- How to verify it
Unit test.

Signed-off-by: Stephen Sun <stephens@nvidia.com>
Co-authored-by: Stephen Sun <5379172+stephenxs@users.noreply.github.com>
2023-02-10 09:16:56 -08:00
Kebo Liu
f4806091bb [Mellanox] Add Sensor conf to support respined platforms(SN3700/SN3700C/SN4600C) (#11553)
- Why I did it
Add new sensor conf file to support respined platforms(SN3700/SN3700C/SN4600C)

- How I did it
Add new sensor conf
Update the get_sensors_conf_path scripts to apply the sensor conf according to the HW respin version info

- How to verify it
run platform test(including sensor test)

Signed-off-by: Kebo Liu <kebol@nvidia.com>
2022-11-10 18:17:49 +00:00
Stephen Sun
b4d8ee3fec [Mellanox] Support Mellanox-SN4600C-C64 as T1 switch in dual-ToR scenario (#11261)
- Why I did it
Support Mellanox-SN4600C-C64 as T1 switch in dual-ToR scenario
This is to port #11032 and #11299 from 202012 to master.

Support additional queue and PG in buffer templates, including both traditional and dynamic model
Support mapping DSCP 2/6 to lossless traffic in the QoS template.
Add macros to generate additional lossless PG in the dynamic model
Adjust the order in which the generic/dedicated (with additional lossless queues) macros are checked and called to generate buffer tables in common template buffers_config.j2
Buffer tables are rendered via using macros.
Both generic and dedicated macros are defined on our platform. Currently, the generic one is called as long as it is defined, which causes the generic one always being called on our platform. To avoid it, the dedicated macrio is checked and called first and then the generic ones.
Support MAP_PFC_PRIORITY_TO_PRIORITY_GROUP on ports with additional lossless queues.
On Mellanox-SN4600C-C64, buffer configuration for t1 is calculated as:

40 * 100G downlink ports with 4 lossless PGs/queues, 1 lossy PG, and 3 lossy queues
16 * 100G uplink ports with 2 lossless PGs/queues, 1 lossy PG, and 5 lossy queues

Signed-off-by: Stephen Sun <stephens@nvidia.com>
2022-07-28 20:30:00 +00:00
Junchao-Mellanox
0191300b96
[Mellanox] Auto correct PSU voltage threshold (WA) (#10394)
- Why I did it
There is a hardware bug that PSU voltage threshold sysfs returns incorrect value. The workaround is to call "sensor -s" to refresh it.

- How I did it
Call "sensor -s" when the threshold value is not incorrect and PSU is "DELTA 1100"

- How to verify it
Unit test and Manual test
2022-04-14 08:14:40 +03:00
Andriy Yurkiv
dd87b948a5
[Mellanox][vxlan] remove old static config for VXLAN src port range feature (#9956)
- Why I did it
Need to remove old static configs from sai.profile files.
New implementation: Azure/sonic-swss#1959
New configuration: #9658

- How I did it
Remove SAI_VXLAN_SRCPORT_RANGE_ENABLE=1 lines from files per HWSKU

- How to verify it
When static config is removed following test will fail (src port will be in range 0-255)

py.test vxlan/test_vnet_vxlan.py --inventory "../ansible/inventory, ../ansible/veos" --host-pattern (testbed)-t0 --module-path ../ansible/library/ --testbed (testbed)-t0 --testbed_file ../ansible/testbed.csv --allow_recover --assert plain --log-cli-level info --show-capture=no -ra --showlocals --disable_loganalyzer --skip_sanity --upper_bound_udp_port 65535 --lower_bound_udp_port 64128
2022-02-13 14:55:59 +02:00
Junchao-Mellanox
05cc8f9df2
[Mellanox] Fix issue: SN4600C has 4 CPU core temperature sensors (#9930)
- Why I did it
platform.json of 4600C only has 2 CPU core thermal sensors, but there are 4 actually

- How I did it
Added thermal sensors for CPU core 2 and core 3.

- How to verify it
Build.
2022-02-09 13:26:40 +02:00
Kebo Liu
8bb9ed6876
[Mellanox] Add sensors conf for MSN4600C A1 platform (#9706)
- Why I did it
Add sensor conf for MSN4600C A1 platform

- How I did it
Add a new sensor conf file and relevant scripts to support two different versions of the platform

- How to verify it
Run "sensors" cmd to check the output on the A1 platform to see whether it's as expected.
Signed-off-by: Kebo Liu <kebol@nvidia.com>
2022-01-13 07:59:35 +02:00
Stephen Sun
34be53a22e
Fix typo and missing files in SN3800 and SN4600C's buffer templates (#9537)
Why I did it
Fix typo and missing files in SN3800 and SN4600C's buffer templates

How I did it
ingress_lossless_xoff_size => ingress_lossless_pool_xoff add missing files for SN4600C-D100C12S2

How to verify it
Deploy the fix and verify whether the device can be up.

Signed-off-by: Stephen Sun <stephens@nvidia.com>
2021-12-20 13:31:45 -08:00
Alexander Allen
325276a917
[Mellanox] Add copyright headers to Mellanox-SN4600C-D100C12S2 SKU (#9345)
Missing NVIDIA copyright header from some files.
2021-12-10 08:23:26 +02:00
Stephen Sun
2d097ded8a
[Mellanox] Adjust buffer parameters with 2km cable supported for 4600C non-generic SKUs (#9215)
- Why I did it
Also recalculated all parameters with the latest algorithm with per-speed peer response time taken into account

- How I did it
Detailed information of each SKU:

C64:
t0: 32 100G downlinks and 32 100G uplinks
t1: 56 100G downlinks and 8 100G uplinks with 2km-cable supported
D112C8: 112 50G downlinks and 8 100G uplinks.
D48C40: 48 50G downlinks, 32 100G downlinks, and 8 100G uplinks
D100C12S2: 4 100G downlinks, 2 10G downlinks, 100 50G downlinks, and 8 100G uplinks
2km cable is supported for C64 on t1 only

- How to verify it
Run regression test (QoS)

Signed-off-by: Stephen Sun <stephens@nvidia.com>
2021-12-08 15:51:33 +02:00
Stephen Sun
ba853348d5
[Reclaim buffer] Reclaim unused buffers by applying zero buffer profiles (#8768)
Signed-off-by: Stephen Sun stephens@nvidia.com

Why I did it
Support zero buffer profiles

Add buffer profiles and pool definition for zero buffer profiles
Support applying zero profiles on INACTIVE PORTS
Enable dynamic buffer manager to load zero pools and profiles from a JSON file
Dependency: It depends on Azure/sonic-swss#1910 and submodule advancing PR once the former merged.

How I did it
Add buffer profiles and pool definition for zero buffer profiles

If the buffer model is static:
Apply normal buffer profiles to admin-up ports
Apply zero buffer profiles to admin-down ports
If the buffer model is dynamic:
Apply normal buffer profiles to all ports
buffer manager will take care when a port is shut down
Update buffers_config.j2 to support INACTIVE PORTS by extending the existing macros to generate the various buffer objects, including PGs, queues, ingress/egress profile lists

Originally, all the macros to generate the above buffer objects took active ports only as an argument
Now that buffer items need to be generated on inactive ports as well, an extra argument representing the inactive ports need to be added
To be backward compatible, a new series of macros are introduced to take both active and inactive ports as arguments
The original version (with active ports only) will be checked first. If it is not defined, then the extended version will be called
Only vendors who support zero profiles need to change their buffer templates
Enable buffer manager to load zero pools and profiles from a JSON file:

The JSON file is provided on a per-platform basis
It is copied from platform/<vendor> folder to /usr/share/sonic/temlates folder in compiling time and rendered when the swss container is being created.
To make code clean and reduce redundant code, extract common macros from buffer_defaults_t{0,1}.j2 of all SKUs to two common files:

One in Mellanox-SN2700-D48C8 for single ingress pool mode
The other in ACS-MSN2700 for double ingress pool mode
Those files of all other SKUs will be symbol link to the above files

Update sonic-cfggen test accordingly:

Adjust example output file of JSON template for unit test
Add unit test in for Mellanox's new buffer templates.

How to verify it
Regression test.
Unit test in sonic-cfggen
Run regression test and manually test.
2021-11-29 08:04:01 -08:00
Dror Prital
5356244e53
[Mellanox] Add NVIDIA Copyright header to "mellanox" files (#8799)
- Why I did it
Add NVIDIA Copyright header to "mellanox" files

- How I did it
Add NVIDIA Copyright header as a comment for Mellanox files

- How to verify it
Sanity tests and PR checkers.
2021-10-17 19:03:02 +03:00
Qi Luo
add9b651b6
Add platform_asic file to each platform folder in sonic-device-data based package (#8542)
#### Why I did it
Add platform_asic file to each platform folder in sonic-device-data package. The file content will be used as the ground truth of mapping from PLATFORM_STRING to switch ASIC family.

One use case of the mapping is to prevent installing a wrong image, which targets for other ASIC platforms. For example, currently we have several ONIE images naming as sonic-*.bin, it's easy to mistakenly install the wrong image. With this mapping built into image, we could fetch the ONIE platform string, and figure out which ASIC it is using, and check we are installing the correct image.

After this PR merged, each platform vendor has to add one mandatory text file  `device/PLATFORM_VENDOR/PLATFORM_STRING/platform_asic`, with the content of the platform's switch ASIC family.

I will update https://github.com/Azure/SONiC/wiki/Porting-Guide after this PR is merged.

You can get a list of the ASIC platforms by `ls -b platform | cat`. Currently the options are
```
barefoot
broadcom
cavium
centec
centec-arm64
generic
innovium
marvell
marvell-arm64
marvell-armhf
mellanox
nephos
p4
vs
```

Also support
```
broadcom-dnx
```

#### How I did it

#### How to verify it
Test one image on DUT. And check the folders under `/usr/share/sonic/device`
2021-10-08 19:27:48 -07:00
Alexander Allen
57ad1ed680
Add Mellanox-SN4600C-D100C12S2 SKU (#8832)
*Add Mellanox-SN4600C-D100C12S2 SKU
2021-09-29 14:14:11 -07:00
Ashok Daparthi-Dell
6cbdf11e53
SONIC QOS YANG - Remove qos tables field value refernce format (#7752)
Depends on Azure/sonic-utilities#1626
Depends on Azure/sonic-swss#1754

QOS tables in config db used ABNF format i.e "[TABLE_NAME|name] to refer fieldvalue to other qos tables.

Example:
Config DB:
"Ethernet92|3": {
"scheduler": "[SCHEDULER|scheduler.1]",
"wred_profile": "[WRED_PROFILE|AZURE_LOSSLESS]"
},
"Ethernet0|0": {
"profile": "[BUFFER_PROFILE|ingress_lossy_profile]"
},
"Ethernet0": {
"dscp_to_tc_map": "[DSCP_TO_TC_MAP|AZURE]",
"pfc_enable": "3,4",
"pfc_to_queue_map": "[MAP_PFC_PRIORITY_TO_QUEUE|AZURE]",
"tc_to_pg_map": "[TC_TO_PRIORITY_GROUP_MAP|AZURE]",
"tc_to_queue_map": "[TC_TO_QUEUE_MAP|AZURE]"
},

This format is not consistent with other DB schema followed in sonic.
And also this reference in DB is not required, This is taken care by YANG "leafref".

Removed this format from all platform files to consistent with other sonic db schema.
Example:
"Ethernet92|3": {
"scheduler": "scheduler.1",
"wred_profile": "AZURE_LOSSLESS"
},

Dependent pull requests:
#7752 - To modify platfrom files
#7281 - Yang model
Azure/sonic-utilities#1626 - DB migration
Azure/sonic-swss#1754 - swss change to remove ABNF format
2021-09-28 09:21:24 -07:00
Alexander Allen
5259af2e2b
[mellanox] remove 2x40G and 4x40G breakout modes due to no hardware support (#8280)
Mellanox platforms do not support 2x40G or 4x40G breakout modes.

Signed-off-by: Alexander Allen <arallen@nvidia.com>
2021-08-01 13:24:26 -07:00
Vivek Reddy
b0620ee055
[Mellanox] [SKU] Fix the shared headroom for 4600C-C64 SKU (#8242)
Removed ingress_lossy_pool from the BUFFER_POOL list
Fx the the egress_lossless_pool_size value

Signed-off-by: Vivek Reddy Karri <vkarri@nvidia.com>
2021-07-29 08:21:28 -07:00
Vivek Reddy
69ed4417ca
[Mellanox] [master] Added D48C40 SKU for 4600C platform (#8201)
*Added new SKU for SN4600C Platform: Mellanox-SN4600C-D48C40
Signed-off-by: Vivek Reddy Karri <vkarri@nvidia.com>
2021-07-19 10:07:35 -07:00
Vivek Reddy
f95f8ead9e
[Mellanox][master][SKU] sonic interface names are aligned to 4 instead of 8 for 4600/4600C platforms (#8155)
*Edited platform.json for 4600 & 4600C
*Edited hwsku.json and port_config.ini files for all the SKU's present under these platforms
Signed-off-by: Vivek Reddy Karri <vkarri@nvidia.com>
2021-07-13 18:59:48 -07:00
DavidZagury
9d1c1659bd
[Mellanox] Update SKUs to enable SDK dumps (#7708)
- Why I did it
To create SDK dump on Mellanox devices when SDK event has occurred.

- How I did it
Set the SKUs keys needed to initialize the feature in SAI.

- How to verify it
Simulate SDK event and check that dump is created in the expected path.
2021-06-21 16:41:18 +03:00
madhanmellanox
4e206ba36e
Adding new SKU Mellanox-SN4600C-C4 (#7815)
Add new SKU of SN4600C switch: Mellanox-SN4600c-c64

Co-authored-by: Madhan Babu <madhan@r-build-sonic06.mtr.labs.mlnx>
2021-06-17 10:04:38 -07:00
Kebo Liu
629f4459d7
[Mellanox] Align PSU fan name in platform.json with latest change in PR #7490 (#7557)
The PSU fan name convention was changed from "psu_{}_fan_{}" to "psu{}_fan{}" in PR #7490, platform.json need to be changed and aligned.
2021-05-07 09:42:40 -07:00
Kebo Liu
5ac048f7e7
[Mellanox] Enhance the platform.json with adding more platform device facts. (#7495)
#### Why I did it

Current platform.json lacks some peripheral device related facts, like chassis/fan/pasu/drawer/thermal/components names, numbers, etc.

#### How I did it

Add platform device facts to the platform.json file

Signed-off-by: Kebo Liu <kebol@nvidia.com>
2021-05-03 12:22:13 -07:00
Nazarii Hnydyn
39b1c12731
[Mellanox]: Fix PCIEd config for SN4600c (#6892)
Signed-off-by: Nazarii Hnydyn <nazariig@nvidia.com>
2021-02-26 00:28:32 -08:00
Sangita Maity
18263c99dd
[DPB|master] Update Dynamic Port Breakout Logic for flexible alias support a… (#6831)
To fix [DPB| wrong aliases for interfaces](https://github.com/Azure/sonic-buildimage/issues/6024) issue, implimented flexible alias support [design doc](https://github.com/Azure/SONiC/pull/749)

> [[dpb|config] Fix the validation logic of breakout mode](https://github.com/Azure/sonic-utilities/pull/1440) depends on this

#### How I did it

1. Removed `"alias_at_lanes"` from port-configuration file(i.e. platfrom.json) 
2. Added dictionary to "breakout_modes" values. This defines the breakout modes available on the platform for this parent port, and it maps to the alias list. The alias list presents the alias names for individual ports in order under this breakout mode.
```
{
    "interfaces": {
        "Ethernet0": {
            "index": "1,1,1,1",
            "lanes": "0,1,2,3",
            "breakout_modes": {
                "1x100G[40G]": ["Eth1"],
                "2x50G": ["Eth1/1", "Eth1/2"],
                "4x25G[10G]": ["Eth1/1", "Eth1/2", "Eth1/3", "Eth1/4"],
                "2x25G(2)+1x50G(2)": ["Eth1/1", "Eth1/2", "Eth1/3"],
                "1x50G(2)+2x25G(2)": ["Eth1/1", "Eth1/2", "Eth1/3"]
            }
        }
}
```
#### How to verify it
`config interface breakout`

Signed-off-by: Sangita Maity <samaity@linkedin.com>
2021-02-26 00:13:33 -08:00
shlomibitton
2076260180
Fix for Mellanox-SN4600C-D112C8 SKU (#6817)
- Why I did it
Mellanox-SN4600C-D112C8 SKU is not configured properly.
It should have 112 50G interfaces and 8 100G interfaces as described on this PR.

- How I did it
Modify sai_profile, port_config.ini and hwsku.json for DPB.

- How to verify it
Apply this HwSKU to a MSN4600C Mellanox platform.

Signed-off-by: Shlomi Bitton <shlomibi@nvidia.com>
2021-02-18 17:44:03 +02:00
Stephen Sun
7790a74d90
Support shared headroom pool for Microsoft SKUs (#6366)
- Why I did it
Support shared headroom pool

Signed-off-by: Stephen Sun stephens@nvidia.com

- How I did it
Port configurations for SKUs based on 2700/3800 platform from 201911
For SN3800 platform:
C64: 32 100G down links and 32 100G up links.
D112C8: 112 50G down links and 8 100G up links.
D24C52: 24 50G down links, 20 100G down links, and 32 100G up links.
D28C50: 28 50G down links, 18 100G down links, and 32 100G up links.
For SN2700 platform:
D48C8: 48 50G down links and 8 100G up links
C32: 16 100G downlinks and 16 100G uplinks
Add configuration for Mellanox-SN4600C-D112C8
112 50G down links and 8 100G up links.

- How to verify it
Run regression test.
2021-02-16 08:53:40 -08:00
Vadym Hlushko
ece8297cf8
[DPB] [Mellanox] added capability files for SN4600C platform (#6061)
- Why I did it
platform.json and hwsku.json files are required for a feature called Dynamic Port Breakout

- How I did it
Created capability files according to platform specification SN4600C

- How to verify it
Full qualification requires bugs fixes reported under sonic-buildimage

NOTE: breakout to 4 is currently not available as of missing functionality in DPB implementation.

Signed-off-by: Vadym Hlushko <vadymh@nvidia.com>
2021-01-17 11:13:54 +02:00
Stephen Sun
e010d83fc3
[Dynamic buffer calc] Support dynamic buffer calculation (#6194)
**- Why I did it**
To support dynamic buffer calculation.
This PR also depends on the following PRs for sub modules
- [sonic-swss: [buffermgr/bufferorch] Support dynamic buffer calculation #1338](https://github.com/Azure/sonic-swss/pull/1338)
- [sonic-swss-common: Dynamic buffer calculation #361](https://github.com/Azure/sonic-swss-common/pull/361)
- [sonic-utilities: Support dynamic buffer calculation #973](https://github.com/Azure/sonic-utilities/pull/973)

**- How I did it**
1. Introduce field `buffer_model` in `DEVICE_METADATA|localhost` to represent which buffer model is running in the system currently:
    - `dynamic` for the dynamic buffer calculation model
    - `traditional` for the traditional model in which the `pg_profile_lookup.ini` is used
2. Add the tables required for the feature:
   - ASIC_TABLE in platform/\<vendor\>/asic_table.j2
   - PERIPHERAL_TABLE in platform/\<vendor\>/peripheral_table.j2
   - PORT_PERIPHERAL_TABLE on a per-platform basis in device/\<vendor\>/\<platform\>/port_peripheral_config.j2 for each platform with gearbox installed.
   - DEFAULT_LOSSLESS_BUFFER_PARAMETER and LOSSLESS_TRAFFIC_PATTERN in files/build_templates/buffers_config.j2
   - Add lossless PGs (3-4) for each port in files/build_templates/buffers_config.j2
3. Copy the newly introduced j2 files into the image and rendering them when the system starts
4. Update the CLI options for buffermgrd so that it can start with dynamic mode
5. Fetches the ASIC vendor name in orchagent:
   - fetch the vendor name when creates the docker and pass it as a docker environment variable
   - `buffermgrd` can use this passed-in variable
6. Clear buffer related tables from STATE_DB when swss docker starts
7. Update the src/sonic-config-engine/tests/sample_output/buffers-dell6100.json according to the buffer_config.j2
8. Remove buffer pool sizes for ingress pools and egress_lossy_pool
   Update the buffer settings for dynamic buffer calculation
2020-12-13 11:35:39 -08:00
Kebo Liu
1158701edc
add pcied config files for mellanox platform (#5669)
This PR has a dependency on community change to move PCIe config files from $PLATFORM/plugin folder to $PLATFORM/ folder
- Why I did it
To support PCIed daemon on Mellanox platforms
- How I did it
Add PCIed config yaml files for all Mellanox platforms
Update pmon daemon config files for SimX platforms
2020-11-02 19:45:36 -08:00
Nazarii Hnydyn
5486f87afc
[Mellanox] Update platform components config files. (#5685)
Signed-off-by: Nazarii Hnydyn <nazariig@nvidia.com>
2020-10-25 19:44:37 +02:00
Nazarii Hnydyn
e2b4afc438
[Mellanox] Update platform components config files. (#5360)
Signed-off-by: Nazarii Hnydyn <nazariig@nvidia.com>
2020-09-13 10:23:22 +03:00
Stephen Sun
54fcdbb380
Support single ingress pool for MSFT SKUs and optimize headroom calculation (#4686)
Calculate pool size in t1 as 24 * downlink port + 8 * uplink port

- Take both port and peer MTU into account when calculating headroom
- Worst case factor is decreased to 50%
- Mellanox-SN2700-C28D8 t0, assume 48 * 50G/5m + 8 * 100G/40m ports
- Mellanox-SN2700 (C32)
  - t0: 16 * 100G/5m + 16 * 100G/40m
  - t1: 16 * 100G/40m + 16 * 100G/300m

Signed-off-by: Stephen Sun <stephens@mellanox.com>

Co-authored-by: Stephen Sun <stephens@mellanox.com>
2020-08-13 13:12:09 +03:00
Junchao-Mellanox
e1f7fb135b
[Mellanox] Add system health configuration file for Mellanox platforms (#4834)
The new feature system health support a platform based configuration file. Add configuration files for all Mellanox platform.

Add a configuration file for SN2700, other platform will use a soft link to it.
2020-07-13 10:20:22 -07:00
shlomibitton
e666bf8490 [Mellanox] Add a new SKU Mellanox-SN4600C-D112C8 (#4833)
Add related files to the device folder:

buffer config templates
pg lookup profile
port_config.ini
sai profile
sensor conf
plugins

Co-authored-by: Stephen Sun <stephens@mellanox.com>
2020-07-12 18:08:52 +00:00
Junchao-Mellanox
563a0fd21e
[Mellanox] Change port index in port_config.ini to 1-based (#4781)
* Change port index in port_config.ini to 1-based
* Add default port index to port_config.ini, change platform plugins to accept 1-based port index
* fix port index in sfp_event.py
2020-06-23 17:21:36 -07:00
shlomibitton
ea63f3e4b2
[mellanox]: Fix for MSN4600C sensors (#4754)
Signed-off-by: Shlomi Bitton <shlomibi@mellanox.com>
2020-06-12 16:08:27 -07:00
Junchao-Mellanox
5e6c20481d
[Mellanox] Enhancement for fan led management (#4437) 2020-05-13 10:01:32 -07:00
shlomibitton
b6291372d9
[Mellanox] Add a new Mellanox platform x86_64-mlnx_msn4600c and new SKU ACS-MSN4600C (#4483)
* New SKU support for MSN4600C

Signed-off-by: Shlomi Bitton <shlomibi@mellanox.com>
2020-04-30 00:30:11 -07:00