Commit Graph

900 Commits

Author SHA1 Message Date
Neetha John
bbcad1362f
[202012] [Profile separation] MMU infrastructure update for TD2 (#12739)
Signed-off-by: Neetha John <nejo@microsoft.com>

This is to backport #12626

Why I did it
There is a need to have separate profiles on compute and storage and this infra update will help achieve that

How I did it
Moved buffer pool/profile and qos definitions on TD2 to a common folder and all TD2 hwsku's will reference that folder
2022-11-29 12:53:58 -08:00
bingwang-ms
47d7e5d0d2
[202012] Apply separated DSCP_TO_TC_MAP and TC_TO_QUEUE_MAP on dualtor (#12792)
* Apply separated DSCP_TO_TC_MAP and TC_TO_QUEUE_MAP on dualtor
2022-11-23 21:49:00 +08:00
zzhiyuan
3a68dc0325 [Arista] Increase switch PCIe timeout for 7060-cx32s (#9248)
Co-authored-by: Zhi Yuan (Carl) Zhao <zyzhao@arista.com>
Why I did it
Arista 7060 platform has a rare and unreproduceable PCIe timeout that could possibly be solved with increasing the switch PCIe timeout value. To do this we'll call a script for this platform to increase the PCIe timeout on boot-up.

No issues would be expected from the setpci command. From the PCIe spec:

"Software is permitted to change the value in this field at any
time. For Requests already pending when the Completion
Timeout Value is changed, hardware is permitted to use either
the new or the old value for the outstanding Requests, and is
permitted to base the start time for each Request either on when
this value was changed or on when each request was issued. "

How I did it
Add "platform-init" support in swss docker similar to how "hwsku-init" is called, only this would be for any device belonging to a platform. Then the script would reside in device data folder.

Additionally, add pciutils dependency to docker-orchagent so it can run the setpci commands.

How to verify it
On bootup of an Arista 7060, can execute:
lspci -vv -s 01:00.0 | grep -i "devctl2"
In order to check that the timeout has changed.
2022-11-23 10:43:54 +00:00
Samuel Angebault
ea3620cde5
[Arista] Remove pcie device monitoring for 7260CX3-64 (#12654) 2022-11-15 15:26:32 -08:00
Ying Xie
eb37bed49c
[201811][DX010] enable LPM (#12641)
Why I did it
Without LPM enabled, the routing table size is very small.

How I did it
Enabling LPM.

Signed-off-by: Ying Xie ying.xie@microsoft.com
2022-11-09 08:16:03 -08:00
Ying Xie
778df1e178
[202012][RDMA] create split profiles for Arista-7050CX3-32S (#12478)
* [202012][RDMA] create split profiles for Arista-7050CX3-32S

Manually cherry-picking #12228.

Signed-off-by: Ying Xie <ying.xie@microsoft.com>
2022-10-31 22:55:06 -07:00
Vivek
458f583e3b
[202012] [Mellanox] [SKU] Mellanox-SN4700-V48C32 SKU added (#12150)
New SKU for MSN-4700 Platform i.e. Mellanox-SN4700-C128

Requirements:
* Breakout: Port 1-32: 4x100G
* Downlinks: 120 (1-30)
* Uplinks: 8 (31-32)
* Shared Headroom: Enabled
* Over Subscribe Ratio: 1:8
* Default Topology: T2
* Default Cable Length for T2: 1500m
* QoS params: The default ones defined in qos_config.j2 will be applied
* Small Packet Percentage: Used 50% for traditional buffer model Note: For dynamic model, the value defined in LOSSLESS_TRAFFIC_PATTERN|AZURE|small_packet_percentage is used

Additional Details:
Switch Type has to be programmed as SpineRouter through config_db.json in DEVICE_METADATA|localhost|type field for the buffer values & cable lengths defined in the buffers_defaults_t2.j2 to apply on the device
Cable Lengths Used for generating buffer_defaults_{t0,t1,t2}.j2 

Signed-off-by: Vivek Reddy Karri <vkarri@nvidia.com>
2022-10-13 16:52:59 +03:00
Vivek
d14e6f69d5
[202012] [Mellanox] [SKU] Mellanox-SN4700-A96C8V8 SKU added (#12151)
A new SKU for MSN4700 Platform i.e. Mellanox-SN4700-A96C8V8

Requirements:

Breakout:
Port 1-24: 4x25G(4)[10G,1G]
Port 25-28: 2x100G[200G,50G,40G,25G,10G,1G]
Port 29-32: 2x200G[100G,50G,40G,25G,10G,1G]
Downlinks: 96 (1-24) + 4 (25-28)
Uplinks: 4 (29-32)
Shared Headroom: Enabled
Over Subscribe Ratio: 1:4
Default Topology: T0
Default Cable Length for T1: 5m
VxLAN source port range set: No
Static Policy Based Hashing Supported: No

Additional Details:
QoS params: The default ones defined in qos_config.j2 will be applied
Small Packet Percentage: Used 50% for traditional buffer model Note: For dynamic model, the value defined in LOSSLESS_TRAFFIC_PATTERN|AZURE|small_packet_percentage is used
SKU was drafted under the assumption that the downlink ports uses xcvr's that will only support the first 4 lanes of the physical port they are connected to. Hence for the ports 1-24, the last four lanes are not used
Cable Lengths used for generating buffer_defaults_{t0,t1}.j2 D

Signed-off-by: Vivek Reddy <vkarri@nvidia.com>
2022-10-13 16:51:17 +03:00
bingwang-ms
ee7d9d1c45 Map TC6 to Queue 1 for regular traffic (#11904)
Why I did it
This PR is to update TC_TO_QUEUE_MAP|AZURE for SKU Arista-7050CX3-32S-D48C8 and Arista-7260CX3 T0.

The change is only to align the TC_TO_QUEUE_MAP for regular traffic and bounced traffic. It has no impact on business because we have no traffic being mapped to TC2 or TC6.

How I did it
Update TC_TO_QUEUE_MAP|AZURE , and test cases as well.

How to verify it
Verified by running test case test_j2files.py

/sonic/src/sonic-config-engine$ python3 setup.py test -s tests/test_j2files.py
running test
......
----------------------------------------------------------------------
Ran 29 tests in 25.390s

OK
2022-09-17 00:41:48 +00:00
Vivek
484402ff08
[202012] [Mellanox] [SKU] Mellanox-SN4700-C128 SKU added (11574) (#11878)
- Why I did it
New SKU for MSN-4700 Platform i.e. Mellanox-SN4700-C128

Requirements:
* Breakout: Port 1-32: 4x100G
* Downlinks: 120 (1-30)
* Uplinks: 8 (31-32)
* Shared Headroom: Enabled
* Over Subscribe Ratio: 1:8
* Default Topology: T2
* Default Cable Length for T2: 1500m
* QoS params: The default ones defined in qos_config.j2 will be applied
* Small Packet Percentage: Used 50% for traditional buffer model Note: For dynamic model, the value defined in LOSSLESS_TRAFFIC_PATTERN|AZURE|small_packet_percentage is used

Additional Details:
Switch Type has to be programmed as SpineRouter through config_db.json in DEVICE_METADATA|localhost|type field for the buffer values & cable lengths defined in the buffers_defaults_t2.j2 to apply on the device
Cable Lengths Used for generating buffer_defaults_{t0,t1,t2}.j2 values

Signed-off-by: Vivek Reddy <vkarri@nvidia.com>
2022-09-04 11:05:22 +03:00
Dev Ojha
8c57f0521f [Arista7050cx3] TD3 SKU changes for pg headroom value after interop testing with cisco 8102 (#11901)
Why I did it
After PFC interop testing between 8102 and 7050cx3, data packet losses were observed on the Rx ports of the 7050cx3 (inflow from 8102) during testing. This was primarily due to the slower response times to react to PFC pause packets for the 8102, when receiving such frames from neighboring devices. To solve for the packet drops, the 7050cx3 pg headroom size has to be increased to 160kB.

How I did it
Modified the xoff threshold value to 160kB in the pg_profile file to allow for the buffer manager to read that value when building the image, and configuring the device

How to verify it
run "mmuconfig -l" once image is built


Signed-off-by: dojha <devojha@microsoft.com>
2022-08-31 11:10:22 -07:00
Arun Saravanan Balachandran
c1712b8c9a
[202012] DellEMC: S6000, S6100, Z9332f - Add capabilities fields in platform.json (#11772) 2022-08-31 09:06:47 -07:00
Ikki Zhu
cf12aa549a [hlx/sfp] fix hlx platform sfp+ tx disable issue (#11532)
Why I did it:
To fix hlx platform sfp+ module tx disable issue

How I did it:
Fix sfp+ tx disable function according SFF-8472 specification

Co-authored-by: Eric Zhu <erzhu@celestica.com>
2022-08-09 21:05:08 +00:00
bingwang-ms
84aca00847
[202012]Support different DSCP_TO_TC_MAP for T1 in dualtor deployment (#11580)
Why I did it
This PR is to backport #11569 into 202012 branch.
This PR is to apply different DSCP_TO_TC_MAP to downlink and uplink ports on T1 in dualtor deployment.
For T1 downlink ports (To T0)
The DSCP_TO_TC_MAP is not changed. DSCP2 and DSCP6 are mapped to TC2 and TC6 respectively.
For T1 uplink ports (To T1)
A new DSCP_TO_TC_MAP|AZURE_UPLINK is defined and applied. DSCP2 and DSCP6 are mapped to TC1 to avoid mixing up lossy and lossless traffic from T2.
The extra lossy PG2 and PG6 added in PR #11157 is reverted as well because no traffic from T2 is mapped to PG2 or PG6 now.

How I did it
Define a new map DSCP_TO_TC_MAP|AZURE_UPLINK for 7260 T1.

How to verify it
Verified by test case in test_j2files.py.
2022-08-01 08:59:45 -07:00
Stephen Sun
44ecff1154
Support queue 7 in dual ToR scenario (#11570)
Signed-off-by: Stephen Sun <stephens@nvidia.com>
2022-08-01 09:27:49 +08:00
Kebo Liu
c40435c94c
[202012] [Mellanox] Add new sensor conf to support SN4410 A1 system (#8379) (#11530)
- Why I did it
New SN410 A1 system has a different sensor layout with A0 system, needs a new sensor conf file to support it.

- How I did it
Since the SN4410 A1 system use exactly the same sensor layout as the SN4700 A1 system, so add a symbol link linking to the SN4700 A1 sensor conf file to reuse.

- How to verify it
Run sensor test against the SN4410 A1 system;
Run platform related regression test against the SN4410 A1 system
2022-07-29 17:41:44 +02:00
Taylor Cai
c4927e0e68 [device/celestica]:Fix failed test case of Seastone snmp (#11430)
* Update psu.py
* Update thermal.py
2022-07-27 23:28:11 +00:00
Neetha John
15cc046eda
[202012] Update MMU and ECN settings for Arista-7260CX3-D96C16 (#11427)
Signed-off-by: Neetha John <nejo@microsoft.com>

Why I did it
Missed this sku in the previous PR #11398

How I did it
Update the dynamic threshold to 0 and ECN settings as 2mb/10mb/5%

How to verify it
Updated unit tests to use the modified values for 7260 ecn settings.
2022-07-15 09:33:39 -07:00
Kebo Liu
aa4379ddbe
[202012] [Mellanox] Add sensor conf file for new version of MSN3700/3700C/4600C platforms (#11358)
- Why I did it
MSN3700/3700C/4600C have been re-spined, the new HW version of platforms has different sensors, so need to apply the correct sensor.conf for them.

- How I did it
Add new sensor.conf files for the new re-spined platforms.
Enhance the logic of "get_sensors_conf_path" for the related platforms in order to load the correct sensor.conf for each version of platforms.

- How to verify it
run sensors test on different versions of platforms
Signed-off-by: Kebo Liu <kebol@nvidia.com>
2022-07-14 08:59:10 +03:00
Neetha John
4de610af15
[202012] Update 7260 MMU and ECN settings (#11398)
Signed-off-by: Neetha John <nejo@microsoft.com>

Why I did it
Improve throughput and latency for 7260 deployments

How I did it
Update the dynamic threshold to 0 and ECN settings as 2mb/10mb/5%

How to verify it
Updated unit tests to use the modified values for 7260 ecn settings.
2022-07-12 08:46:44 -07:00
Ying Xie
1d55dca6d3 [Buffer] Separate buffer profile for Arista-7060CX-32S-Q24C8
Signed-off-by: Ying Xie <ying.xie@microsoft.com>
2022-07-07 14:09:01 -07:00
Ying Xie
17a9259c55 [7060] fix default port map
Signed-off-by: Ying Xie <ying.xie@microsoft.com>
2022-07-07 14:09:01 -07:00
Kevin Wang
41518aa825 [Buffer] Separate buffer profile for Arista-7260CX3-Q64
Signed-off-by: Kevin Wang <shengkaiwang@microsoft.com>
2022-07-07 14:09:01 -07:00
Kevin Wang
f53b2620db [Buffer] Separate buffer profile for Arista-7260CX3-D108C8
Signed-off-by: Kevin Wang <shengkaiwang@microsoft.com>
2022-07-07 14:09:01 -07:00
Kevin Wang
b625085b46 [Buffer] Separate buffer profile for Arista-7260CX3-C64
50G data is not accurate, needs further update.

Signed-off-by: Kevin Wang <shengkaiwang@microsoft.com>
2022-07-07 14:09:01 -07:00
Kevin Wang
5b42ba021b [Buffer] Separate buffer profile for Arista-7060CX-32S-C32
Signed-off-by: Kevin Wang <shengkaiwang@microsoft.com>
2022-07-07 14:09:01 -07:00
Kevin Wang
83780549c7 [Buffer] Separate buffer profile for Arista-7060CX-32S-D48C8
Signed-off-by: Kevin Wang <shengkaiwang@microsoft.com>
2022-07-07 14:09:01 -07:00
Kevin Wang
e8f04cd2e6 [Buffer] Separate buffer profile for Arista-7060CX-32S-Q32
Signed-off-by: Kevin Wang <shengkaiwang@microsoft.com>
2022-07-07 14:09:01 -07:00
Kevin Wang
2c81f02b13 [Buffer] Separate buffer profile for Celestica-DX010-D48C8
Signed-off-by: Kevin Wang <shengkaiwang@microsoft.com>
2022-07-07 14:09:01 -07:00
Kevin Wang
4dbdc8e0a0 [Buffer] Separate buffer profile for Force10-S6100
Signed-off-by: Kevin Wang <shengkaiwang@microsoft.com>
2022-07-07 14:09:01 -07:00
Ying Xie
2ed29da38d [buffer] create infrastructure to enable buffer/QoS profiles
Signed-off-by: Ying Xie <ying.xie@microsoft.com>
2022-07-07 14:09:01 -07:00
vmittal-msft
6ada55439d Updated buffer profile settings for TD3 based HWSKUs (Arista-7050CX3-32S-C32, Arista-7050CX3-32S-D48C8) (#11202)
* Updated buffer profile settings for TD3 based HWSKUs (Arista-7050CX3-32S-C32, Arista-7050CX3-32S-D48C8)
2022-07-05 20:57:53 +00:00
Stephen Sun
fe6be5da92
[202012] Configure different map between uplink and downlink on t1 switch in dual ToR scenario (#11299)
- Why I did it
Configure different DSCP_TO_TC_MAP between uplink and downlink on T1 switch in dual ToR scenario
On T1 uplink, both DSCP 2/6 will be mapped to TC 1 for the purpose of avoiding such traffic occupying lossless buffers.
On T1 downlink, they will be mapped to TC 2/6 respectively. (unchanged)

- How I did it
For vendors who want to configure different DSCP_TO_TC_MAP between uplinks and downlinks on T1, they should
Define generate_dscp_to_tc_map macro in SKU's qos.json.j2 file
Define map AZURE for downlink and AZURE_UPLINK for uplink
Define jinja2 variable different_dscp_to_tc_map as True

Signed-off-by: Stephen Sun <stephens@nvidia.com>
2022-07-03 15:58:06 +03:00
Stephen Sun
307d0e2aca
[Mellanox][202012] Support Mellanox-SN4600C-C64 as T1 switch in dual-ToR scenario (#11032)
Why I did it
Support Mellanox-SN4600C-C64 as T1 switch in dual-ToR scenario

1. Support additional queue and PG in buffer templates, including both traditional and dynamic model
2. Support mapping DSCP 2/6 to lossless traffic in the QoS template.
3. Add macros to generate additional lossless PG in the dynamic model
4. Adjust the order in which the generic/dedicated (with additional lossless queues) macros are checked and called to generate buffer tables in common template buffers_config.j2
  - Buffer tables are rendered via using macros.
  - Both generic and dedicated macros are defined on our platform. Currently, the generic one is called as long as it is defined, which causes the generic one always being called on our platform. To avoid it, the dedicated macrio is checked and called first and then the generic ones.
5. Support MAP_PFC_PRIORITY_TO_PRIORITY_GROUP on ports with additional lossless queues.

On Mellanox-SN4600C-C64, buffer configuration for t1 is calculated as:
40 * 100G downlink ports with 4 lossless PGs/queues, 1 lossy PG, and 3 lossy queues
16 * 100G uplink ports with 2 lossless PGs/queues, 1 lossy PG, and 5 lossy queues

Signed-off-by: Stephen Sun stephens@nvidia.com

How to verify it
Run regression test.
2022-06-21 10:04:49 -07:00
bingwang-ms
6ddf5cd7dc
[202012] [cherry-pick] Generate switch level dscp_to_tc_map entry from qos_config template (#11132)
* Generate switch level dscp_to_tc_map

Signed-off-by: bingwang <wang.bing@microsoft.com>
2022-06-17 20:49:56 +08:00
Jon Goldberg
159613153f [Nokia ixs7215] change var/log size to 4GB (#11122)
This makes use of #11121 to add support for configuration of VAR_LOG_SIZE on Nokia IXS7215
2022-06-14 09:02:14 -07:00
Guohan Lu
b0c48f9b31 [devices]: fix j2 syntax error for the config.bcm in Arista-7260CX3-D108C8
Signed-off-by: Guohan Lu <lguohan@gmail.com>
2022-06-10 11:23:28 -07:00
Neetha John
881796f376
[202012] Adjust 7260 buffer sizes to accomodate extra lossless queues (#11050)
Backport changes from #11018

Signed-off-by: Neetha John <nejo@microsoft.com>

Why I did it
As part of PCBB changes, we need to enable 2 extra lossless queues. The changes in this PR are done to adjust only the reserved sizes on Th2 for the additional 2 lossless queues
Calculations are done based on 40 downlinks for T1 and 16 uplinks for dual ToR

How to verify it
Verified that the rendering works fine on Th2 dut
Unit tests have been updated to reflect the modified buffer sizes when pcbb is enabled. There are existing testcases that will test the original buffer sizes when pcbb is disabled. With these changes, was able to build sonic-config-engine wheel successfully
2022-06-06 18:13:16 -07:00
Guohan Lu
693bc5faae Revert "[qos]: Adjust 7260 buffer sizes to accomodate extra lossless queues (#11018)"
This reverts commit 21f14dc6ea.

unit test needs to be cherry-picked.
2022-06-06 06:30:24 -07:00
Neetha John
21f14dc6ea [qos]: Adjust 7260 buffer sizes to accomodate extra lossless queues (#11018)
Why I did it
As part of PCBB changes, we need to enable 2 extra lossless queues. The changes in this PR are done to adjust only the reserved sizes on Th2 for the additional 2 lossless queues
Calculations are done based on 40 downlinks for T1 and 16 uplinks for dual ToR

How to verify it
Verified that the rendering works fine on Th2 dut
Unit tests have been updated to reflect the modified buffer sizes when pcbb is enabled. There are existing testcases that will test the original buffer sizes when pcbb is disabled. With these changes, was able to build sonic-config-engine wheel successfully

Signed-off-by: Neetha John <nejo@microsoft.com>
2022-06-05 22:23:13 -07:00
Richard.Yu
f555a4a0a0 [Tunnel PFC] Add property for tunnel PFC (#10962)
* [Tunnel PFC] Add property for tunnel PFC

Replace the config.bcm file with j2 template file
- Add 'sai_remap_prio_on_tnl_egress=1' property when device metadata local
- Host subtype is 'dualtor'
- Change sai.profile foe the new config.bcm.j2
2022-06-05 22:02:19 -07:00
bingwang-ms
e159998657
[202012][cherry-pick] Add two extra lossless queues for bounced back traffic (#10715)
* Add extra lossless queues

Signed-off-by: bingwang <bingwang@microsoft.com>
2022-06-04 19:25:02 +08:00
bingwang-ms
7ec6a60230
[cherry-pick] [202012] Update qos config to clear queues for bounced back traffic (#10608)
* Update qos config to clear queues for bounced back traffic

Signed-off-by: bingwang <wang.bing@microsoft.com>
2022-06-02 16:29:25 +08:00
Taylor Cai
f5ecf1ee1c [device/celestica]:Fix failed test case of Haliburton snmp (#10844) 2022-05-25 22:56:57 +00:00
Vivek R
cd0a0608a9 Removed platform specific reboot files for mellanox simx platforms (#10806)
- Why I did it
Platform_reboot files for simx doesn't do aything different apart from calling /sbin/reboot. which is anyway done in the /usr/local/bin/reboot script i.e. the parent script which calls the platform specific reboot scripts if present.

Moreover, /sbin/reboot invoked in the platform specific reboot script is a non-blocking call and thus it returns back to the original script (although /sbin/reboot does it job in the background) and we see messages like this.

Signed-off-by: Vivek Reddy Karri <vkarri@nvidia.com>
2022-05-16 23:28:02 +00:00
vmittal-msft
7b7737ef0f Adjustment to ingress pool size to accomodate brcm sai (#10694) 2022-05-03 00:42:27 +00:00
Nikola Dancejic
602c8e99dc
[device config] Adding configuration for default route fallback (#10692)
Set sai_tunnel_underlay_route_mode attribute to fallback to default
route if more specific route is unavailable.
Signed-off-by: Nikola Dancejic <ndancejic@microsoft.com>
2022-04-29 16:20:18 -07:00
Taylor Cai
1aeb658964 Fix issue test_crm and test_fib (#10585)
Why I did it
Fix issue (https://github.com/Azure/sonic-buildimage/issues/9171) and (https://github.com/Azure/sonic-buildimage/issues/9236)

How I did it
Add flag in config file for get correct count of IPv6 entry.
Add init config file to set IPv4 ECMP hash on L4.

How to verify it
Compile the sonic_platform wheel for e1031, then upload to device and install the wheel, verify using testbed.
2022-04-26 17:40:47 +00:00
vmittal-msft
fcf5dcf5eb Changes to support topology and port speed agnostic switch init for TD3 based platforms (#10587) 2022-04-21 22:00:38 +00:00
Eric Zhu
8a246b88b1 Fix issue of partially parsing syseeprom value (#10020) (#10276)
Why I did it
The current code assumes that the value part does not have whitespace. So everything after the whitespace will be ignored. The syseeprom values returned from platform API do not match the output of "show platform syseeprom" on dx010 and e1031 device.

How I did it
This change improved the regular expression for parsing syseeprom values to accommodate whitespaces in the value.
PR 10021 provides the solution, but committed to the wrong place for dx010 and e1031.

How to verify it
Compile the sonic_platform wheel for dx010, then upload to device and install the wheel, verify the platform eeprom API.

Signed-off-by: Eric Zhu <erzhu@celestica.com>
2022-03-25 21:53:45 +00:00