1268 Commits

Author SHA1 Message Date
Nazarii Hnydyn
845bb80a3c
[ppi]: Enable global port late create for all Mellanox HWSKUs. (#16945)
HLD: sonic-net/SONiC#1084

To improve FAST reboot dataplane downtime

Signed-off-by: Nazarii Hnydyn <nazariig@nvidia.com>
2023-11-01 21:50:14 -07:00
Pavan Naregundi
add98b221b
[Marvell-arm64]: Add hugepage cmdline argument
The updated SDK and driver require hugepages to be reserved during kernel
boot. These kernel command-line arguments are passed from installer.conf
in the device folder.

Change-Id: Id43f61af2b050500775da66d058c2de78cb5ad15
Signed-off-by: Pavan Naregundi <pnaregundi@marvell.com>
2023-10-12 02:07:36 -07:00
Samuel Angebault
be22217b64
[Arista] Remove pcie device monitoring for 7260CX3-64 (#12734)
On some products from this line one of the management NIC might be unpopulated.
On such products this leads to errors from pcied and pcie-check.sh

How I did it
Remove this PCIe device from pcie.yaml

How to verify it
Run pcieutil check on the 2 hardware variants and validate that it passes.
Restart pcied and make sure that there are no more error logs in the syslog.

ADO: 25447788
2023-10-11 22:57:34 -07:00
Ashwin Srinivasan
61683d9d64
Revert "Move /var/log to RAM for Mellanox SN2700, Nokia 7215 and Dell S6100 (#15077)" (#16775)
This reverts commit 05f326eed9.

Microsoft ADO 25355843:
2023-10-11 10:36:29 -07:00
Yakiv Huryk
5719d1a59a
[Mellanox] add Mellanox-SN4700-O28 SKU (#16784)
- Why I did it
To add new SKU for Virtual Smart Switch. T1 switch with 28x400G ports.

- How I did it
Add new SKU with all relevant files.

- How to verify it
run sonic-mgmt t1-28 test suites based on master.
A few issues were observed that relate to the stability of master, not to the topology.

Signed-off-by: Yakiv Huryk <yhuryk@nvidia.com>
2023-10-10 19:20:10 +03:00
Nazarii Hnydyn
875a6d9a1f
[Mellanox][Switching Mode] Enable Store-And-Forward switching mode on specific platforms (#16781)
- Why I did it
To enable Store-And-Forward switching mode for SN2700/SN3800/SN4600C/SN4700 on specific, requested SKUs. The default SKU remains untouched.

- How I did it
Added vendor SAI config options

- How to verify it
make configure PLATFORM=mellanox
make target/sonic-mellanox.bin
Run sonic-mgmt test suites while this option is enabled.

Signed-off-by: Nazarii Hnydyn <nazariig@nvidia.com>
2023-10-09 19:00:02 +03:00
Vadym Hlushko
3bd396043e
[buffers] Add 'create_only_config_db_buffers.json' file for the Mellanox devices (not MSFT SKU) (#16233)
* [buffers] Add create_only_config_db_buffers.json for MLNX devices (not MSFT SKU), inject it at the start of the swss docker

Signed-off-by: vadymhlushko-mlnx <vadymh@nvidia.com>

* [buffers] Align the sonic-device_metadata.yang

Signed-off-by: vadymhlushko-mlnx <vadymh@nvidia.com>

---------

Signed-off-by: vadymhlushko-mlnx <vadymh@nvidia.com>
2023-10-03 08:35:57 -07:00
Nazarii Hnydyn
d1ea3620c0
[Mellanox]: Update default SKU for SN2700. (#16663)
Set default SKU for SN2700: Mellanox-SN2700 -> ACS-MSN2700

Signed-off-by: Nazarii Hnydyn <nazariig@nvidia.com>
2023-09-30 01:43:30 -07:00
vmittal-msft
9068bd986b
[nokia]: Updated total headroom pool size to accommodate 100G ports on T2 uplinks (#16690)
Microsoft ADO (25266920)

The sonic-mgmt xoff test was failing for [100g,120km]. The total headroom pool size needed to be updated for the case where a 100G line card is used as a T2 uplink.

The size was calculated assuming 100G ports are used for downlinks, with a cable length of 2 km, but they can also be used for uplinks, where the cable length is 120 km. The calculation must therefore be based on 120 km, not 2 km. This wastes some buffer in the 2 km scenario, but it covers both cases.
2023-09-26 15:58:34 -07:00
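The cable-length reasoning in the headroom commit above (#16690) can be illustrated with a rough calculation. This is a minimal sketch, not the formula used by the PR: it models only the data in flight during one cable round trip and assumes about 5000 ns of propagation delay per km of fiber. The function name and the constant are hypothetical; real headroom sizing also accounts for factors such as MTU, PFC response time and cell size.

```python
# Simplified estimate of the data in flight on a link during one cable round trip.
# Assumption: ~5000 ns of propagation delay per km of optical fiber.
FIBER_DELAY_NS_PER_KM = 5000


def in_flight_bytes(port_speed_gbps: int, cable_km: int) -> int:
    """Bytes that can arrive at line rate during one cable round trip."""
    round_trip_ns = 2 * cable_km * FIBER_DELAY_NS_PER_KM
    bits = port_speed_gbps * round_trip_ns  # 1 Gbit/s is exactly 1 bit per ns
    return bits // 8


downlink_bytes = in_flight_bytes(100, 2)    # 100G port, 2 km cable
uplink_bytes = in_flight_bytes(100, 120)    # 100G port, 120 km cable
print(downlink_bytes, uplink_bytes, uplink_bytes // downlink_bytes)
```

Under these assumptions the in-flight allowance grows linearly with cable length, which is why a pool sized for 2 km falls short when the same port is used on a 120 km uplink.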
byu343
504f1163d3
[Arista] Add new hwskus to x86_64-arista_7060dx5_32 (#16077)
Add two new hwskus for different port speed layouts

Arista-7060DX5-32-25Gx96-100Gx8-200Gx8
Arista-7060DX5-32-200Gx50-100Gx14

Disable BFD on all hwskus for x86_64-arista_7060dx5_32 because its dependencies are not ready yet; leaving it enabled results in a runtime error.
2023-09-23 01:42:31 -07:00
Junchao-Mellanox
5138afe4e7
[Mellanox] add new platform 2700 a1 (#16515)
- new pcie.yaml
- new sensors.conf
- new thermal support
- new platform.json file
- adjust test code
2023-09-23 00:15:17 -07:00
Myron Sosyak
d35bf7ef57
[devices] Add DPB support for x86_64-dell_z9100_c2538-r0 (#16538)
Why I did it
Added DPB support for x86_64-dell_z9100_c2538-r0 device

How I did it
Added new SKU folder Force10-Z9100 based on Force10-Z9100-C32
Added platform.json and hwsku.json
Added generic th-z9100-flex-all.config.bcm

How to verify it
On x86_64-dell_z9100_c2538-r0 with changes from this PR

change default SKU to Force10-Z9100
do factory reset
reboot

Signed-off-by: Myron Sosyak <myron.sosyak@plvision.eu>
Co-authored-by: Andriy Kokhan <andriy.kokhan@gmail.com>
2023-09-23 00:12:43 -07:00
Kebo Liu
e286869b24
[Mellanox] Update HW-MGMT package to new version V.7.0030.1011 (#16239)
- Why I did it
1. Update Mellanox HW-MGMT package to newer version V.7.0030.1011
2. Replace the SONiC PMON Thermal control algorithm with the one inside the HW-MGMT package on all Nvidia platforms
3. Support Spectrum-4 systems

- How I did it
1. Update the HW-MGMT package version number and submodule pointer
2. Remove the thermal control algorithm implementation from Mellanox platform API
3. Revise the patch to HW-MGMT package which will disable HW-MGMT from running on SIMX
4. Update the downstream kernel patch list

Signed-off-by: Kebo Liu <kebol@nvidia.com>
2023-09-06 11:32:08 +03:00
Prince George
a4e37a5cd6
[platform]: Disable interrupt for intel i2c-i801 driver (#16309)
On S6100 we are seeing almost 100K interrupts per second on Intel's i801 SMBus controller, which affects system performance.

We now disable the i801 driver interrupt and enable polling instead.

Microsoft ADO (number only): 24910530

How I did it
Disable the interrupt by passing the interrupt disable feature argument to i2c-i801 driver

How to verify it
This fix is NOT applicable to ARM-based platforms. It applies only to Intel-based platforms:

- On SN2700 it's already disabled in Mellanox hw-mgmt
- Celestica DX010 and E1031
- Dell S6100: verified the interrupts are no longer incrementing.
- Arista 7260CX3

Signed-off-by: Prince George <prgeor@microsoft.com>
2023-09-05 10:23:57 -07:00
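The verification step in the i2c-i801 commit above (#16309), checking that the interrupt counter stops incrementing, can be sketched as a comparison of two snapshots of /proc/interrupts. This is an illustrative sketch only: the sample text is made up, and the helper name `irq_count` is hypothetical. For context, the upstream i2c-i801 driver documents a `disable_features` module parameter in which bit 0x10 turns off interrupt use; whether the PR passes exactly that value is an assumption.

```python
def irq_count(proc_interrupts_text: str, name: str) -> int:
    """Sum the per-CPU counters of every /proc/interrupts row mentioning `name`."""
    total = 0
    for line in proc_interrupts_text.splitlines():
        fields = line.split()
        # Data rows start with "<irq>:"; the header row ("CPU0 CPU1 ...") does not.
        if not fields or not fields[0].endswith(":") or name not in line:
            continue
        for field in fields[1:]:
            if not field.isdigit():
                break  # reached the controller/handler columns
            total += int(field)
    return total


# Made-up snapshots taken one second apart (illustration only).
BEFORE = """\
           CPU0       CPU1
 16:     100000     200000   IO-APIC  16-fasteoi   i801_smbus
"""
AFTER = """\
           CPU0       CPU1
 16:     150000     250000   IO-APIC  16-fasteoi   i801_smbus
"""

interval_s = 1
rate = (irq_count(AFTER, "i801_smbus") - irq_count(BEFORE, "i801_smbus")) / interval_s
print(f"i801_smbus interrupts per second: {rate:.0f}")
```

On a system where the fix is active, the two snapshots would be expected to show the same count, giving a rate of zero.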
Pavan-Nokia
31194124b5
[armhf][Nokia-7215]Add HWSKU files for new SAI (#16321)
Add new easy bringup (EZB) files for new SAI 1.12.0
2023-09-05 10:21:53 -07:00
Vadym Hlushko
78587cedc3
[Mellanox] Remove mlxtrace support for SPC4 (#16373)
- Why I did it
Because the Spectrum4 devices don't support mlxtrace utility.

- How I did it
Edit sai.profile and remove mlxtrace_spectrum4_itrace_*.cfg.ext files

Signed-off-by: vadymhlushko-mlnx <vadymh@nvidia.com>
2023-09-04 10:53:20 +03:00
Andrew Sapronov
0405b369af
[Netberg][Barefoot] Added support for Aurora 750 (#16342)
Why I did it
Support Intel Tofino based platforms Netberg Aurora 750
ASIC: Intel Tofino BFN-T10-064Q
Ports: 64x 100G

How I did it
Added specification to device/netberg directory
Added platform/barefoot/sonic-platform-modules-netberg, which contains kernel modules, scripts and sonic_platform packages.
Modified the platform/barefoot/platform-modules-netberg.mk to include Aurora 750 related ID.

Signed-off-by: Andrew Sapronov <andrew.sapronov@gmail.com>
2023-09-01 22:52:39 -07:00
Guohan Lu
3bdfdd95ea
Revert "[Ragile]: Add new centec platform ra-b6010 (#14819)"
This reverts commit 75062436e8.
2023-09-01 22:43:18 -07:00
Marty Y. Lok
de7fb325ae
[Nokia-IXR7250E] Modify the platform_ndk.json for Nokia-IXR7250E platform (#16355)
Signed-off-by: mlok <marty.lok@nokia.com>
2023-09-01 08:54:40 -07:00
Junchao-Mellanox
0be57803e2
[Mellanox] Revise label name and fix typo in sensor.conf of 4600C (#16271)
- Why I did it
Revise label name and fix typo in sensor.conf of 4600C

- How I did it
Revise label name and fix typo in sensor.conf of 4600C

- How to verify it
Manual test
sonic-mgmt test_sensors.py
2023-08-31 19:41:12 +03:00
pettershao-ragilenetworks
75062436e8
[Ragile]: Add new centec platform ra-b6010 (#14819)
What I did
Add new platform arm64-ragile_ra-b6010-48gt4x-r0 (Centec)
ASIC Vendor: Centec
Switch ASIC: Centec
Port Config: 48x1G+4x10G

Why I did it
Add new platform RA-B6010-48GT4X

How I did it
Add new platform RA-B6010-48GT4X

Signed-off-by: pettershao-ragilenetworks <pettershao@ragilenetworks.com>
2023-08-31 08:38:24 -07:00
vmittal-msft
9a15221e46
Update CPU transmitted packets to queue 7 for chassis (#16254)
* Update CPU transmitted packets to TC = 7 for SONIC chassis

* Added new SOC property to permitted list
2023-08-29 18:33:16 -07:00
Nazarii Hnydyn
65b0011866
[Mellanox] [PPI] Enable global port late create for SN5600 (#15866)
- Why I did it
Enabled port late create on SN5600 so that the Spectrum-4 switch boots up with no ports

Work item tracking
N/A

- How I did it
Updated SAI xml config file

- How to verify it
Run sonic-mgmt tests of fastboot

Signed-off-by: Nazarii Hnydyn <nazariig@nvidia.com>
2023-08-28 14:50:53 +03:00
Aravind Mani
821be3f6fc
DellEMC: System health config changes (#15771)
Why I did it
System health config is missing on a few Dell platforms.

How I did it
Added system health monitoring config and its related APIs

How to verify it
show system-health summary/detail commands.
2023-08-23 11:05:03 -07:00
Aravind Mani
0eb7907e87
Dell S6100 Platform API 2.0 fixes (#16208)
Why I did it
Dell S6100 platform components need to be updated.

How I did it
Modified platform.json to fix the issue.

How to verify it
Run sonic-mgmt component test and check whether it passes.
2023-08-23 11:01:22 -07:00
Samuel Angebault
d42066cf8d
[Arista] Remove one pcie device across platforms (#16173)
These devices do not reliably report the proper devid and vendorid
when these are read directly from the PCI config space.
The values can be read but shouldn't be compared against a fixed value like
the one stored in pcie.yaml.

Since this makes pcied unhappy, the simplest path forward is to just
remove this device from monitoring.
2023-08-22 17:07:14 -07:00
Aaron Payment
a4098de529
Misc platform improvements for DCS-7060DX5-64S (#13875)
* sonic-buildimage: Add 7060DX5-64S brcm tunnel config

Add bcm_tunnel_term_compatible_mode: 1 support, which allows
Loopback configuration to no longer result in SAI failure
"tunnel terminator add failed with error Feature unavailable"
that caused Orchagent SIGABRT

Signed-off-by: Aaron Payment <aaronp@arista.com>

* sonic-buildimage: Set port config ENABLE:0 in 7060DX5-64S brcm config

Set ENABLE:0 for the front panel ports in the brcm config so that the
ports are default admin down. This change prevents the issue that ports
are able to link up and pass traffic resulting in mac learn events after
SAI create switch and before SAI admin state up. The unexpected mac learn events
resulted in Orch agent crash in PortsOrch init, which occurs after SAI
create switch and before SAI admin state up.

* fix sensors.conf on CatalinaDD

* Add support for two sfp ports

* Add copper 50g tuning to babbagelp on catalina

---------

Signed-off-by: Aaron Payment <aaronp@arista.com>
Co-authored-by: enes.oncu <enes.oncu@arista.com>
Co-authored-by: Boyang Yu <byu@arista.com>
2023-08-18 13:05:05 -07:00
Marty Y. Lok
a28352e781
[Nokia][DeviceData] Update the Nokia platform IXR-7250E device data (#16028)
Why I did it
Update the platform_reboot of Nokia Platform IXR-7250E-36x400G to display the correct reboot-cause history when rebooting from the supervisor card.

Work item tracking
Microsoft ADO (number only):
How I did it
Modify the platform_reboot script to copy the correct reboot-cause.txt file from NDK to the /host/reboot-cause directory during the down cycle when the reboot is issued from the Supervisor (both for the reboot right after installing a new image and for a normal reboot)

Signed-off-by: mlok <marty.lok@nokia.com>
2023-08-17 16:35:21 -07:00
Vivek
d4923615d6
[Mellanox] [SN4410] Support new breakout modes for PAM4 (#15668)
- Why I did it
Add new breakout modes to be used in PAM4 supported cables

- How I did it

- How to verify it
Verified the 50G per lane breakout modes are applied properly on the switch

Signed-off-by: Vivek Reddy Karri <vkarri@nvidia.com>
2023-08-16 08:30:33 +03:00
Nonodark Huang
1acafa4873
[Ufispace][PDDF] Add PDDF support on S9110-32X, S8901-54XC, S7801-54XS and S6301-56ST (#16017)
Why I did it
Add PDDF support on following Ufispace platforms with Broadcom ASIC

S9110-32X
S8901-54XC
S7801-54XS
S6301-56ST
How I did it
Add PDDF configuration files, scripts and python files

How to verify it
Run pddf commands and show commands.

Signed-off-by: nonodark <ef67891@yahoo.com.tw>
2023-08-14 15:56:03 -07:00
bingwang-ms
d50ae1fd09
[arista]: Always set sai_tunnel_support on Arista-7260cx3 (#16097)
Why I did it
To overwrite the default DSCP_TO_TC_MAP for tunnel traffic, the attribute sai_tunnel_support must be set to 1.
Before this change, the attribute is set only on dual-tor platform when remap is enabled.
This PR is to set the attribute on all Arista-7260cx3 devices.

Work item tracking
Microsoft ADO 24785776

How I did it
Update the config.bcm template for Arista-7260cx3 devices.

How to verify it
The change is verified by manually rendering the j2 on a T1 testbed.
2023-08-11 11:51:25 -07:00
Liu Shilong
3500f69fdb
Revert "[Ufispace][PDDF] Add PDDF support on S9180-32X (#14909)" (#16092)
This reverts commit d2b5d774c5.
2023-08-11 09:13:53 -07:00
Aaron Payment
eedaa2adbf
sonic-buildimage: Fix SAI_API_TUNNEL SAI_STATUS_NOT_SUPPORTED error (#13874)
Syncd will abort in handleSaiCreateStatus with
'Encountered failure in create operation, exiting orchagent,SAI API: SAI_API_TUNNEL, status: SAI_STATUS_NOT_SUPPORTED'

The fix is to add the following brcm config to prevent the error:
sai_tunnel_global_sip_mask_enable=1
bcm_tunnel_term_compatible_mode=1

Signed-off-by: Aaron Payment <aaronp@arista.com>
2023-08-11 13:36:18 +08:00
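The fix in the commit above (#13874) amounts to making sure two key=value settings are present in the Broadcom config. A minimal sketch of an idempotent way to apply such settings to config text follows. The helper `ensure_settings` and the sample input are hypothetical; the PR itself edits the config files directly, and only the two setting names and values come from the commit message.

```python
REQUIRED = {
    "sai_tunnel_global_sip_mask_enable": "1",
    "bcm_tunnel_term_compatible_mode": "1",
}


def ensure_settings(config_text: str, required: dict) -> str:
    """Return config text in which every required key=value pair is set exactly once."""
    lines = config_text.splitlines()
    seen = set()
    for i, line in enumerate(lines):
        key, sep, _ = line.partition("=")
        key = key.strip()
        if sep and key in required:
            lines[i] = f"{key}={required[key]}"  # overwrite a stale value in place
            seen.add(key)
    for key, value in required.items():
        if key not in seen:
            lines.append(f"{key}={value}")       # append settings that were missing
    return "\n".join(lines) + "\n"


# Made-up input for illustration.
sample = "os=unix\nbcm_tunnel_term_compatible_mode=0\n"
updated = ensure_settings(sample, REQUIRED)
print(updated, end="")
```

Running the helper a second time on its own output leaves the text unchanged, which makes it safe to apply repeatedly.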
vmittal-msft
12d24d572a
Updated PG headroom settings for 40g port speed (#16038)
2023-08-10 17:35:43 -07:00
Arun LK
97113bae61
Dell: E3224F platform onboarding (#16002)
* Dell: E3224F platform onboarding

* Dell: E3224F platform onboarding
2023-08-10 17:27:30 -07:00
andywongarista
96fa513690
[Arista] Add support for DCS-7060DX5-32 (#14793)
* Add asic support for blackhawkth4dd

* Add bfd feature to BlackhawkTh4Dd

* Add platform data for blackhawkth4

* Add Qos settings for Blackhawk-TH4

* Add pg and queue settings for Blackhawk-TH4

* Add buffers_defaults_t0.j2

* Add blackhawkth4 to boot0

* Update 7060dx5 config.bcm

* Fix build error

---------

Co-authored-by: Boyang Yu <byu@arista.com>
Co-authored-by: David Meggy <davidm@arista.com>
2023-08-05 22:11:45 +08:00
Stephen Sun
97a091abd2
[Mellanox] Use Debian reboot in Nvidia platform reboot when it is invoked from kdump capture boot (#15701)
#### Why I did it

When a kernel crash occurs, the system will reboot to the kdump capture kernel if kdump is enabled (`config kdump enable`). In the kdump capture boot, it only stores the crash information and then reboots the system to a normal boot.
In this boot, no SONiC service is started but it invokes `reboot`, which is actually the SONiC reboot that depends on SONiC services. There is logic in SONiC reboot to skip all SONiC steps and invoke the platform reboot to avoid issues.
However, on Nvidia platforms, the platform reboot still depends on SONiC services, which can cause issues.
So, the Debian reboot is called directly in platform reboot if it is invoked from the kdump capture boot.

#### How I did it

Manual test
2023-08-04 13:24:38 -07:00
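The decision described in the commit above (#15701) can be sketched as follows. This is a hedged illustration, not the PR's implementation: the commit message does not say how the capture boot is detected, so the sketch assumes the presence of /proc/vmcore (which the kernel exposes in a kdump capture kernel) as the signal. The command paths and both function names are hypothetical placeholders.

```python
import os

DEBIAN_REBOOT = ["/sbin/reboot"]                          # assumed path of the stock Debian reboot
PLATFORM_REBOOT = ["/usr/bin/hypothetical-platform-reboot"]  # placeholder for the service-dependent path


def in_kdump_capture_boot(vmcore_path="/proc/vmcore", exists=os.path.exists) -> bool:
    """Assumption: /proc/vmcore is present only when running the kdump capture kernel."""
    return exists(vmcore_path)


def pick_reboot_command(capture_boot: bool) -> list:
    """Bypass anything that needs SONiC services when they were never started."""
    return DEBIAN_REBOOT if capture_boot else PLATFORM_REBOOT


print(pick_reboot_command(in_kdump_capture_boot()))
```

The `exists` parameter is injected so the check can be exercised without actually booting a capture kernel.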
pettershao-ragilenetworks
abccdaeb6c
[Ragile]Adapt kernel 5.10 for broadcom on RA-B6510-48V8C (#14809)
* Adapt kernel 5.10 for broadcom on RA-B6510-48V8C

Signed-off-by: pettershao-ragilenetworks <pettershao@ragilenetworks.com>

* update

Signed-off-by: pettershao-ragilenetworks <pettershao@ragilenetworks.com>

* update

Signed-off-by: pettershao-ragilenetworks <pettershao@ragilenetworks.com>

* update

Signed-off-by: pettershao-ragilenetworks <pettershao@ragilenetworks.com>

* update

Signed-off-by: pettershao-ragilenetworks <pettershao@ragilenetworks.com>

* modify one-image.mk file

Signed-off-by: pettershao-ragilenetworks <pettershao@ragilenetworks.com>

* modify debian/rule.mk

Signed-off-by: pettershao-ragilenetworks <pettershao@ragilenetworks.com>

* Add platform.json file

Signed-off-by: pettershao-ragilenetworks <pettershao@ragilenetworks.com>

---------

Signed-off-by: pettershao-ragilenetworks <pettershao@ragilenetworks.com>
2023-08-04 12:01:49 -07:00
Vadym Hlushko
521a86b2de
[Mellanox] Add mlxtrace to techsupport (#15961)
- Why I did it
Added the fwtrace config files so that the mlxtrace utility can be called during the show techsupport dump.

Work item tracking
Microsoft ADO (number only):

- How I did it
Added fwtrace config files. Added path to these files to sai.profile for each mlnx device.

- How to verify it
Execute the show techsupport command and check that the mlxtrace output is in the system dump.

Signed-off-by: vadymhlushko-mlnx <vadymh@nvidia.com>
2023-08-03 11:36:58 +03:00
Ikki Zhu
9a7eb495c2
[E1031] add platform specific reboot command support (#15889)
* [E1031] add platform specific reboot command support

Why I did it
E1031: add platform specific cold reboot support

How I did it
Use the CPLD to trigger board level power cycle when cold reboot

How to verify it
Do reboot stress test and check the reboot cause history

* [E1031] try to umount filesystem before power cycle reboot

* [E1031] remove fstrim in customized reboot script
2023-08-02 17:20:53 -07:00
Pavan-Nokia
a850175776
[Nokia-7215-A1] Update Nokia-7215-A1 platform (#15342)
Update Nokia-7215-A1 platform to address UT and OC test failures
2023-08-02 09:08:15 -07:00
Jason Tsai
d2b5d774c5
[Ufispace][PDDF] Add PDDF support on S9180-32X (#14909)
* Add s9180-32x pddf support

Signed-off-by: cytsai0409 <cytsai0409@gmail.com>

* Fix memset_s parameter

Signed-off-by: cytsai0409 <cytsai0409@gmail.com>

* Update chassis.py and fan.py

1. remove duplicate get_sfp() in chassis.py
2. update get_direction() and get_target_speed() in fan.py

Signed-off-by: cytsai0409 <cytsai0409@gmail.com>

---------

Signed-off-by: cytsai0409 <cytsai0409@gmail.com>
2023-07-24 09:37:48 -07:00
wilson-smci
2d0bad0523
[Supermicro]: Add a new supported device and platform, SSE-T7132S. (#15368)
* platform/innovium: Add a new supported device and platform, SSE-T7132S

* Switch Vendor: Supermicro
* Switch SKU:  Supermicro_sse_t7132s
* ASIC Vendor: innovium
* Swich ASIC: TL7
* Port Configuration: 32x400G
* SONiC Image: SONiC-ONIE-Innoviunm

Signed-off-by: wilsonw <wilsonw@supermicro.com.tw>
2023-07-20 10:24:56 -07:00
Aravind Mani
05314f9e5b
DellEMC: S5248F update LED Firmware (#15790)
* DellEMC: S5248F update LED firmware
2023-07-20 09:49:48 -07:00
Ashwin Srinivasan
0b067bfb2a
[master] Mellanox: 2700, 4600c - Quoted device IDs to prevent false flags in pcied (#15896)
Why I did it
Certain all-numeric device IDs of PCI devices in the pcie.yaml file were left unquoted, leading to false mismatch flags in the pcie daemon and, subsequently, to log flooding. This PR fixes that issue.

Work item tracking
Microsoft ADO (number only): 24578930
How I did it
Added quotes around numeric PCI device IDs in the pcie.yaml files of the following platforms:

x86_64-mlnx_msn2700-r0
x86_64-mlnx_msn4600c-r0

How to verify it
Install latest image after the merge and verify that syslogs are not flooded with PCI device mismatch errors
2023-07-18 21:14:00 -07:00
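The mismatch described in the commit above (#15896) comes from YAML typing: an unquoted all-numeric scalar such as `id: 1003` is resolved as an integer, while a quoted `id: '1003'` stays a string. The sketch below reproduces the effect without a YAML library by writing out, by hand, the values a loader would typically produce. The entry contents and the `matches` helper are made up for illustration, and the assumption that the checker compares against a string read from the device is not confirmed by the commit message.

```python
# Hand-written stand-ins for parsed pcie.yaml entries (illustrative values only).
unquoted_entry = {"bus": "00", "dev": "1f", "fn": "0", "id": 1003}    # from `id: 1003`
quoted_entry = {"bus": "00", "dev": "1f", "fn": "0", "id": "1003"}    # from `id: '1003'`


def matches(expected_entry: dict, id_from_hardware: str) -> bool:
    """Assumed check: compare the configured ID with the string read from the device."""
    return expected_entry["id"] == id_from_hardware


id_read_from_device = "1003"  # device IDs are read back as hex strings
print(matches(unquoted_entry, id_read_from_device))
print(matches(quoted_entry, id_read_from_device))
```

IDs that contain a hex letter (for example `1f10`) are not affected, because YAML already resolves them as strings; only the all-numeric ones change type when left unquoted.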
Ying Xie
bf49154493
Potential fix for Celestica E1031 device hang (#15822)
set CPU max_cstate to 0

Co-authored-by: Sumukha Tumkur Vani <sumukhatv@outlook.com>
2023-07-14 08:38:45 -07:00
prabhataravind
114f276dd4
[docker-sonic-vs]: More changes to support DPU-2P HWSKU (#15695)
Why I did it
port_config.ini and hwsku.json are needed to generate the default config
switch_type needs to be "dpu" to spawn the right set of processes during dvs initialization and to make sure that DASH APIs can be handled properly

Work item tracking
Microsoft ADO 24375371:

How I did it
Use the same hwsku.json and port_config.ini for DPU-2P as the ones used for Nvidia-MBF2H536C SKU in nvidia-sonic sonic-buildimage repo.
Set switch_type to "dpu" in DEVICE_METADATA configuration to make sure DASH specific APIs are handled properly

Signed-off-by: Prabhat Aravind <paravind@microsoft.com>
2023-07-11 09:57:50 -07:00
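The DEVICE_METADATA setting mentioned in the commit above (#15695) can be pictured as the following config fragment. This is a minimal sketch that assumes the usual `DEVICE_METADATA` / `localhost` nesting of SONiC config; the helper `device_metadata` is hypothetical and only the `hwsku` and `switch_type` values come from the commit message.

```python
import json


def device_metadata(hwsku: str, switch_type: str) -> dict:
    """Build a DEVICE_METADATA fragment with the two fields discussed in the commit."""
    return {
        "DEVICE_METADATA": {
            "localhost": {
                "hwsku": hwsku,
                "switch_type": switch_type,
            }
        }
    }


config = device_metadata("DPU-2P", "dpu")
print(json.dumps(config, indent=4))
```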
leo lin
c6dbfa988e
[Ufispace][PDDF] Add support for S9300-32D platform (#14922)
2023-07-05 14:39:01 -07:00
Andrew Sapronov
c190a8f795
[Netberg][Barefoot] Added support for Aurora 710 (#15298)
* [202012][platform/barefoot] (#8543)

Why I did it
pcied was running under Python 2.

How I did it
Dropped Python 2 support and added Python 3 support for pcied in the file docker-pmon.supervisord.conf.j2

How to verify it
docker exec pmon supervisorctl status

* [Netberg][nba710] Added initial support for Aurora 710

Signed-off-by: Andrew Sapronov <andrew.sapronov@gmail.com>

---------

Signed-off-by: Andrew Sapronov <andrew.sapronov@gmail.com>
Co-authored-by: Kostiantyn Yarovyi <kostiantynx.yarovyi@intel.com>
2023-06-30 17:30:07 -07:00
prabhataravind
d4de62d155
[docker-sonic-vs]: Add NPU SKU for docker-sonic-vs (#15604)
Define a generic 2-port NPU SKU for docker-sonic-vs to
enable DASH vstests to pass on Azure pipelines

Work item tracking
Microsoft ADO 24375371:

How I did it
Define a generic 2-port NPU hwsku that is used only for DASH-specific vstests.

Signed-off-by: Prabhat Aravind <paravind@microsoft.com>
2023-06-27 14:10:53 -07:00