Commit Graph

1261 Commits

Author SHA1 Message Date
Zhijian Li
5bc2aa3c63 [E1031] Bugfix for Python syntax error in sonic_platform/common.py (#18386)
Why I did it
Bugfix for Python syntax error in sonic_platform/common.py.
A method of class need to have self as parameter.

Fixing below issue:

e1031:~$ show int st
Traceback (most recent call last):
  File "/usr/local/bin/intfutil", line 836, in <module>
    main()
  File "/usr/local/bin/intfutil", line 819, in main
    interface_stat.display_intf_status()
  File "/usr/local/bin/intfutil", line 448, in display_intf_status
    self.get_intf_status()
  File "/usr/local/lib/python3.9/dist-packages/utilities_common/multi_asic.py", line 157, in wrapped_run_on_all_asics
    func(self,  *args, **kwargs)
  File "/usr/local/bin/intfutil", line 529, in get_intf_status
    self.portchannel_speed_dict = po_speed_dict(self.po_int_dict, self.db)
  File "/usr/local/bin/intfutil", line 334, in po_speed_dict
    optics_type = port_optics_get(appl_db, value[0], PORT_OPTICS_TYPE)
  File "/usr/local/bin/intfutil", line 224, in port_optics_get
    if is_rj45_port(intf_name):
  File "/usr/local/lib/python3.9/dist-packages/utilities_common/platform_sfputil_helper.py", line 120, in is_rj45_port
    platform_chassis = sonic_platform.platform.Platform().get_chassis()
  File "/usr/local/lib/python3.9/dist-packages/sonic_platform/platform.py", line 21, in __init__
    self._chassis = Chassis()
  File "/usr/local/lib/python3.9/dist-packages/sonic_platform/chassis.py", line 37, in __init__
    self._is_host = self._api_common.is_host()
TypeError: is_host() takes 0 positional arguments but 1 was given
Work item tracking
Microsoft ADO (number only): 27208152
How I did it
Add self parameter to function Common::is_host().

How to verify it
Verified on E1031 DUT with this patch.
2024-03-20 01:01:07 +08:00
Xichen96
09eb07da59 replace host check command in e1031 (#18279) 2024-03-12 13:01:20 +08:00
mssonicbld
5fdc7d29dd
[Nokia-IXR7250E][Devicedata] Update the device data for Nokia IXR7250E platform (thermal logging thresholds) (#18063) (#18171)
These changes adjust Nokia IXR7250 thermal sensor logging thresholds.

Why I did it
To modify the thermal sensor logging thresholds used on LC and Supervisor.

How I did it
Modified the JSON based thermal logging thresholds used to determine when to log current high sensor temperature and hottest sensor margin fluctuations.

How to verify it
Verify that syslog messages indicating current (high) temperature and margin values are only logged when these respective values fluctuate by at least 5 degrees.

Co-authored-by: snider-nokia <76123698+snider-nokia@users.noreply.github.com>
2024-02-25 09:06:59 +08:00
Volodymyr Samotiy
15a0f912bd [Mellanox] Disable SSD NCQ on Mellanox platforms (#17567)
- Why I did it
Based on some research some products might experience an occasional IO failures in the communication between CPU and SSD because of NCQ.
There seems to be a problem between some kernel versions and some SATA controllers.

Syslog error message examples:

Error "ata1: SError: { UnrecovData Handshk }" - "failed command: WRITE FPDMA QUEUED".
Error "ata1: SError: { RecovComm HostInt PHYRdyChg CommWake 10B8B DevExch }" - "failed command: READ FPDMA QUEUED".
Some vendors already disabled NCQ on their platforms in SONiC due to similar issue:

[Arista] Disable ATA NCQ for a few products #13739 [Arista] Disable ATA NCQ for a few products
[Arista] Disable SSD NCQ on DCS-7050CX3-32S #13964 [Arista] Disable SSD NCQ on DCS-7050CX3-32S
Also there are other discussions on Debian/Ubuntu forums about similar issues and it was suggested to disable NCQ:

https://askubuntu.com/questions/133946/are-these-sata-errors-dangerous

- How I did it
Add a kernel parameter to tell libata to disable NCQ

- How to verify it
Use FIO tool - fio --direct=1 --rw=randrw --bs=64k --ioengine=libaio --iodepth=64 --runtime=120 --numjobs=4
2024-02-01 04:32:27 +08:00
snider-nokia
9f81d9d1bf [Nokia][sonic-platform] Update Nokia sonic-platform submodule and device data (#17378)
These changes, in conjunction with NDK version >= 22.9.17 address the thermal logging issues discussed at Nokia-ION/ndk#27. While the changes contained at this PR do not require coupling to NDK version >= 22.9.17, thermal logging enhancements will not be available without updated NDK >= 22.9.17. Thus, coupling with NDK >=22.9.17 is preferred and recommended.

Why I did it
To address thermal logging deficiencies.

Work item tracking
Microsoft ADO (number only): 26365734
How I did it
The following changes are included:

Threshold configuration values are provided in the associated device data .json files. There is also a change included to better handle the condition where an SFP module read fails.

Modify the module.py reboot to support reboot linecard from Supervisor

 - Modify reboot to call _reboot_imm for single IMM card reboot
 - Add log to the ndk_cmd to log the operation of "reboot-linecard" and "shutdown/satrtup the sfm"
Add new nokia_cmd set command and modify show ndk-status output

 - Add a new function reboot_imm() to nokia_common.py to support reboot a single IMM slot from CPM
 - Added new command: nokia_cmd set reboot-linecard <slot> [forece] for CPM
 - Append a new column "RebootStatus" at the end of output of "nokia_cmd show ndk-status"
 - Provide ability for IMM to disable all transceiver module TX at reboot time
 - Remove defunct xcvr-resync service
2024-01-18 14:36:12 +08:00
Marty Y. Lok
2f8630c85f [Nokia-IXR7250E] Modify the platform_reboot on the IXR7250E for PMON API reboot and Disable all SFPs (#17483)
Why I did it
When Supervisor card is rebooted by using PMON API, it takes about 90 seconds to trigger the shutdown in down path. At this time linecards have been up. This delays linecards database initialization which is trying to PING/PONG the database-chassis. To address this issue, we modified the NDK to use the system call with "sudo reboot" when the request is from PMON API on Supervisor case. The NDK version is 22.9.20 and greater. This new NDK requires this modifcaiton of platform_reboot to work with.

Work item tracking
Microsoft ADO (number only): 26365734
How I did it
Modify the platform_reboot In Supervisor not to reboot all IMMs since it has been done in the function reboot() in module.py. Also handle the reboot-cause.txt for on the Supervisor when the reboot is request from PMON API.
Modify the Nokia platform specific platform_reboot in linecard to disable all SPFs.
This PR works with NDK version 22.9.20 and above

Signed-off-by: mlok <marty.lok@nokia.com>
2024-01-18 14:36:08 +08:00
vdahiya12
a262306822 [Arista] Update config.bcm of 7060_cx32s for handling 40g optics with unreliable los settings (#17768)
For 40G optics there is SAI handling of T0 facing ports to be set with SR4 type and unreliable los set for a fixed set of ports. For this property to be invoked the requirement is set
phy_unlos_msft=1 in config.bcm.
This change is to meet the requirement and once this property is set, the los/interface type settings is applied by SAI on the required ports.

Why I did it
For Arista-7060CX-32S-Q32 T1, 40G ports RX_ERR minimalization during connected device reboot
can be achieved by turning on Unreliable LOS and SR4 media_type for all ports which are connected to T0.

The property phy_unlos_msft=1 is to exclusively enable this property.

Microsoft ADO: 25941176

How I did it
Changes in SAI and turning on property

How to verify it
Ran the changes on a testbed and verified configurations are as intended.

with property

admin@sonic2:~$ bcmcmd "phy diag xe8 dsc config" | grep -C 2 "LOS"
Brdfe_on                    = 0
Media Type                  = 2
Unreliable LOS              = 1
Scrambling Disable          = 0
Lane Config from PCS        = 0

without property

admin@sonic:~$ bcmcmd "phy diag xe8 dsc config" | grep -C 2 "LOS"
Brdfe_on                    = 0
Media Type                  = 0
Unreliable LOS              = 0
Scrambling Disable          = 0
Lane Config from PCS        = 0

Signed-off-by: vaibhav-dahiya <vdahiya@microsoft.com>
2024-01-18 04:32:39 +08:00
mssonicbld
df22ca9132
[Arista] Remove aggregate port config files for multi-asic devices (#16923) (#17678) 2024-01-05 01:04:39 +08:00
mssonicbld
109035e511
[Arista] Use port_config.ini for Arista-7050QX-32S-S4Q31 (#17253) (#17574) 2023-12-20 19:34:42 +08:00
Pavan-Nokia
c4ba9c9edb [armhf][Nokia-7215] Remove platform reboot (#17010) 2023-12-16 12:33:36 +08:00
mssonicbld
c63a4c6a4a
[Mellanox][SKU] Adding Mellanox-SN4700-O8V48 SKU (#17425) (#17526) 2023-12-16 01:32:50 +08:00
zitingguo-ms
1a0268c224
Fix ecmp hash polarization by enable hash seed/offset config on T1 and upgrade xgs SAI to 8.4.35.0 (#17505)
Why I did it
To fix ecmp hash polarization issue.

Work item tracking
Microsoft ADO (number only): 26085143
How I did it
Add sai_hash_seed_config_hash_offset_enable=1 in all config.bcm that Broadcom T1 uses.

HardwareSku
Force10-S6100-T1
Force10-S6100-ITPAC-T1
Force10-S6100
Celestica-DX010-C32
Arista-7260CX3-C64
Arista-7060CX-32S-Q32
Arista-7060CX-32S-C32-T1
Arista-7060CX-32S-C32
Arista-7050QX32S-Q32
Arista-7050QX-32S-S4Q31
Arista-7050-QX32
Arista-7050-QX-32SInclude Broadcom's fix by upgrading xgs SAI version to 8.4.35.0.
8.4.35.0: [CSP 00012324019] back-porting SONIC-75006 to SAI8.4
8.4.34.0:
[CSP 00012318293] back-porting SONIC-81534 to SAI8.4;
ECMP LB traffic polarization, configure hash_offset along with hash_seed attr
Run qual with only xgs SAI version upgraded to 8.4.35.0:
on TH2: https://elastictest.org/scheduler/testplan/6579b36ccfacd86e78e3e885?leftSideViewMode=detail&prop=status&order=ascending
on TH: https://elastictest.org/scheduler/testplan/657a75f8c1d3b51fc1d585b4?leftSideViewMode=detail&prop=status&order=ascending

How to verify it
use tests/ecmp/test_ecmp_sai_value.py to verify.
2023-12-15 19:33:47 +08:00
arista-nwolfe
dd294f3883
Disable SA_EQUALS_DA trap on DNX LC SKUs (#17488)
This is a 202305 cast of this PR #17206
2023-12-14 08:44:44 +08:00
mssonicbld
26250f4e4f
[Mellanox] remove log in RAM kernel option for 2700 A1 platform (#17254) (#17422) 2023-12-06 15:39:46 +08:00
mssonicbld
afe382a5f9
[Arista]: Set SYNCD_SHM_SIZE for Arista DNX Devices (#17205) (#17287) 2023-11-24 00:54:55 +08:00
mssonicbld
8575a5b5ed
[Nokia][Nokia-IXR7250E-SUP-10] Update BCM config for supervisor card to reduce the CPU usage (#16790) (#17224) 2023-11-19 14:35:05 +08:00
mssonicbld
7d5b882877
[Mellanox] fix new MSN2700-A1 platform name (#17151) (#17198) 2023-11-16 21:40:47 +08:00
Pavan Naregundi
fdf54a01cc
[Marvell-arm64] Support lazy install of sdk drivers (#17135)
* Support lazy install of sdk drivers

This patch adds support for lazy install of Marvell prestera SDK
drivers for platform-nokia. Lazy install for drivers is added as
updated sdk driver needs to classify the drivers required for platform
during compile time. SDK drivers and platform files are now fetched
from a submodule(mrvl-prestera).

Additionaly, DTB required for sonic_fit creation during compile time
is sourced from sonic-linux-kernel.

Signed-off-by: Pavan Naregundi <pnaregundi@marvell.com>

* Add hugepage cmdline agrument

Updated sdk & driver requries hugepage to be reserved during kernel
boot. These kernel command line agrument are passed from installer.conf
in device folder.

Signed-off-by: Pavan Naregundi <pnaregundi@marvell.com>

* Update SAI deb to 1.12.0-3

Signed-off-by: Pavan Naregundi <pnaregundi@marvell.com>

---------

Signed-off-by: Pavan Naregundi <pnaregundi@marvell.com>
2023-11-16 21:24:53 +08:00
mssonicbld
403d09db78
[Mellanox] [SN5600] Removing 8x DPB mode from platform files (#17071) (#17119) 2023-11-08 22:48:22 +08:00
Vadym Hlushko
28ecd068d4
[202305][buffers] Add 'create_only_config_db_buffers.json' file for the Mellanox devices (not MSFT SKU) (#17006)
Why I did it
Add the create_only_config_db_buffers attribute to the DEVICE_METADATA|localhost. If the "create_only_config_db_buffers" exists and is equal to "true" - the buffers will be created according to the config_db configuration (for example BUFFER_QUEUE|* table), otherwise the maximum available buffers (which are read from SAI) will be created, regardless of the CONFIG_DB buffers configuration.

Work item tracking
Microsoft ADO (number only):
How I did it
Add the create_only_config_db_buffers.json files for Mellanox devices (not MSFT SKU's), and inject the content to the CONFIG_DB during the swss docker container start.

How to verify it
Manual verification:

Install the image with this PR included on the not MSFT SKU switch
Check the show queue counters output and verify that only configured in CONFIG_DB buffers are created
root@sonic:/home/admin# show queue counters
     Port    TxQ    Counter/pkts    Counter/bytes    Drop/pkts    Drop/bytes
---------  -----  --------------  ---------------  -----------  ------------
Ethernet0    UC0               0                0            0           N/A
Ethernet0    UC1               0                0            0           N/A
Ethernet0    UC2               0                0            0           N/A
Ethernet0    UC3               0                0            0           N/A
Ethernet0    UC4               0                0            0           N/A
Ethernet0    UC5               0                0            0           N/A
Ethernet0    UC6               0                0            0           N/A
Open the /usr/share/sonic/device/$DEVICE/$SKU/create_only_config_db_buffers.json and change it to:
"create_only_config_db_buffers": "false"
Do config reload
Check the show queue counters output and verify that all available buffers are created
root@sonic:/home/admin# show queue counters
     Port    TxQ    Counter/pkts    Counter/bytes    Drop/pkts    Drop/bytes
---------  -----  --------------  ---------------  -----------  ------------
Ethernet0    UC0               0                0            0           N/A
Ethernet0    UC1               0                0            0           N/A
Ethernet0    UC2               0                0            0           N/A
Ethernet0    UC3               0                0            0           N/A
Ethernet0    UC4               0                0            0           N/A
Ethernet0    UC5               0                0            0           N/A
Ethernet0    UC6               0                0            0           N/A
Ethernet0    UC7              60            15346            0           N/A
Ethernet0    MC8             N/A              N/A          N/A           N/A
Ethernet0    MC9             N/A              N/A          N/A           N/A
Ethernet0   MC10             N/A              N/A          N/A           N/A
Ethernet0   MC11             N/A              N/A          N/A           N/A
Ethernet0   MC12             N/A              N/A          N/A           N/A
Ethernet0   MC13             N/A              N/A          N/A           N/A
Ethernet0   MC14             N/A              N/A          N/A           N/A
Ethernet0   MC15             N/A              N/A          N/A           N/A
2023-11-03 14:27:17 +08:00
Nazarii Hnydyn
7d54155f67
[ppi]: Enable global port late create for all Mellanox HWSKUs. (#16946)
Why I did it
To improve FAST reboot dataplane downtime
Work item tracking
N/A
How I did it
Updated SAI xml config file
How to verify it
Run sonic-mgmt tests of fastboot
2023-11-02 23:14:18 +08:00
Ashwin Srinivasan
c31ccbaba8 Revert "Move /var/log to RAM for Mellanox SN2700, Nokia 7215 and Dell S6100 (#15077)" (#16775)
This reverts commit 05f326eed9.

Microsoft ADO 25355843:
2023-10-18 00:37:18 +08:00
Samuel Angebault
a8b53e1452 [Arista] Remove pcie device monitoring for 7260CX3-64 (#12734)
On some products from this line one of the management NIC might be unpopulated.
On such products this leads to errors from pcied and pcie-check.sh

How I did it
Remove this PCIe device from pcie.yaml

How to verify it
Run pcieutil check on the 2 hardware variants and validate that it passes.
Restart pcied and make sure that there is no more error logs in the syslog.

ADO: 25447788
2023-10-17 20:49:02 +08:00
Nazarii Hnydyn
ea40778e35
[ssm]: Enable Store-And-Forward switching mode for SN2700/SN3800/SN4600C/SN4700. (#16761)
BACKPORT: #16781

Why I did it
To enable Store-And-Forward switching mode for SN2700/SN3800/SN4600C/SN4700
Work item tracking
N/A
How I did it
Added vendor SAI config options
How to verify it
make configure PLATFORM=mellanox
make target/sonic-mellanox.bin
2023-10-08 12:29:09 +08:00
mssonicbld
70332c1fde
[nokia]: Updated total headroom pool size to accommodate 100G ports on T2 uplinks (#16690) (#16798) 2023-10-08 04:04:33 +08:00
mssonicbld
c3ea44a522
[Mellanox] add new platform 2700 a1 (#16515) (#16795) 2023-10-08 03:06:03 +08:00
mssonicbld
413f4bd253
[Arista] Add new hwskus to x86_64-arista_7060dx5_32 (#16077) (#16794) 2023-10-08 02:59:07 +08:00
Pavan-Nokia
370980d42f [armhf][Nokia-7215]Add HWSKU files for new SAI (#16321)
Add new easy bringup (EZB) files for new SAI 1.12.0
2023-09-21 22:33:17 +08:00
mssonicbld
a713299614
[Mellanox] Remove mlxtrace support for SPC4 (#16373) (#16625)
- Why I did it
Because the Spectrum4 devices don't support mlxtrace utility.

- How I did it
Edit sai.profile and remove mlxtrace_spectrum4_itrace_*.cfg.ext files

Signed-off-by: vadymhlushko-mlnx <vadymh@nvidia.com>
Co-authored-by: Vadym Hlushko <62022266+vadymhlushko-mlnx@users.noreply.github.com>
2023-09-21 20:38:22 +08:00
Kebo Liu
27f15d40e1 [Mellanox] Update HW-MGMT package to new version V.7.0030.1011 (#16239)
- Why I did it
1. Update Mellanox HW-MGMT package to newer version V.7.0030.1011
2. Replace the SONiC PMON Thermal control algorithm with the one inside the HW-MGMT package on all Nvidia platforms
3. Support Spectrum-4 systems

- How I did it
1. Update the HW-MGMT package version number and submodule pointer
2. Remove the thermal control algorithm implementation from Mellanox platform API
3. Revise the patch to HW-MGMT package which will disable HW-MGMT from running on SIMX
4. Update the downstream kernel patch list

Signed-off-by: Kebo Liu <kebol@nvidia.com>
2023-09-21 18:34:07 +08:00
Prince George
d5a96f69f1 [platform]: Disable interrupt for intel i2c-i801 driver (#16309)
On S6100 we are seeing almost 100K interrupts per second on intels i801 SMBUS controller which affects systems performance.

We now disable the i801 driver interrupt and instead enable polling

Microsoft ADO (number only): 24910530

How I did it
Disable the interrupt by passing the interrupt disable feature argument to i2c-i801 driver

How to verify it
This fix is NOT applicable for ARM based platforms. Applicable only for intel based platforms:-

- On SN2700 its already disabled in Mellanox hw-mgmt
- Celestica DX010 and E1031
- Dell S6100 verified the interrupts are no longer incrementing.
- Arista 7260CX3

Signed-off-by: Prince George <prgeor@microsoft.com>
2023-09-21 16:33:37 +08:00
Pavan-Nokia
393c6911c5 [Nokia-7215-A1] Update Nokia-7215-A1 platform (#15342)
Update Nokia-7215-A1 platform to address UT and OC test failures
2023-09-05 00:35:18 +08:00
mssonicbld
fdc4c039b8
[Nokia-IXR7250E] Modify the platform_ndk.json for Nokia-IXR7250E platform (#16355) (#16428) 2023-09-03 22:19:40 +08:00
Marty Y. Lok
5774ce2206 [Nokia][DeviceData] Update the Nokia platform IXR-7250E device data (#16028)
Why I did it
Update the platform_reboot of Nokia Platform IXR-7250E-36x400G to displays the correct reboot-cause history when reboot from supervisor card.

Work item tracking
Microsoft ADO (number only):
How I did it
Modify the platform_reboot script to copy the correct reboo-cause.txt file from NDK to the /host/reboot-cause directory at the down cycle when the reboot is issued from Supervisor (for both reboot right after install a new image and normal reboot)

Signed-off-by: mlok <marty.lok@nokia.com>
2023-09-03 20:44:41 +08:00
Aravind Mani
82dfe0db78 Dell S6100 Platform API 2.0 fixes (#16208)
Why I did it
Dell S6100 Platform components needs to be updated.

How I did it
Modified platform.json to fix the issue.

How to verify it
Run sonic-mgmt component test and check whether it passes.
2023-09-03 20:44:11 +08:00
Junchao-Mellanox
6c9c2cca42 [Mellanox] Revise label name and fix typo in sensor.conf of 4600C (#16271)
- Why I did it
Revise lable name and fix typo in sensor.conf of 4600C

- How I did it
Revise lable name and fix typo in sensor.conf of 4600C

- How to verify it
Manual test
sonic-mgmt test_sensors.py
2023-09-03 18:32:30 +08:00
Vivek
1908a04fdf [Mellanox] [SN4410] Support new breakout modes for PAM4 (#15668)
- Why I did it
Add new breakout modes to be used in PAM4 supported cables

- How I did it

- How to verify it
Verified the 50G per lane breakout modes are applied properly on the switch

Signed-off-by: Vivek Reddy Karri <vkarri@nvidia.com>
2023-09-03 16:33:21 +08:00
Nazarii Hnydyn
dd08437b3d [Mellanox] [PPI] Enable global port late create for SN5600 (#15866)
- Why I did it
Enabled port late create on SN5600 Spectrum-4 switch boots up with no ports

Work item tracking
N/A

- How I did it
Updated SAI xml config file

- How to verify it
Run sonic-mgmt tests of fastboot

Signed-off-by: Nazarii Hnydyn <nazariig@nvidia.com>
2023-09-03 16:33:15 +08:00
andywongarista
f0823e6dd0
[Arista] Add support for DCS-7060DX5-32 (#14793) (#16176)
* Add asic support for blackhawkth4dd

* Add bfd feature to BlackhawkTh4Dd

* Add platform data for blackhawkth4

* Add Qos settings for Blackhawk-TH4

* Add pg and queue settings for Blackhawk-TH4

* Add buffers_defaults_t0.j2

* Add blackhawkth4 to boot0

* Update 7060dx5 config.bcm

* Fix build error

---------

Co-authored-by: Boyang Yu <byu@arista.com>
Co-authored-by: David Meggy <davidm@arista.com>
2023-09-03 09:21:33 +08:00
mssonicbld
875b81e407
[Mellanox] Add mlxtrace to techsupport (#15961) (#16215) 2023-08-20 23:51:38 +08:00
vmittal-msft
7b7aae981a Updated PG headroom settings for 40g port speed (#16038) 2023-08-15 16:32:30 +08:00
Stephen Sun
33d14521f2 [Mellanox] Use Debian reboot in Nvidia platform reboot when it is invoked from kdump capture boot (#15701)
#### Why I did it

When a kernel crash occurs, the system will reboot to the kdump capture kernel if kdump is enabled (`config kdump enable`). In the kdump capture boot, it only stores the crash information, and then reboot the system to a normal boot.
In this boot, no SONiC service is started but it invokes `reboot` which is actually the SONiC reboot that depends on SONiC services. There is a logic to skip all SONiC stuff and invoke platform reboot in SONiC reboot to avoid issues.
However, on Nvidia platforms, the platform reboot still depends on SONiC services, which can cause issues.
So, the Debian reboot is called directly in platform reboot if it is invoked from the kdump capture boot.

#### How I did it

Manual test
2023-08-07 14:33:34 +08:00
mssonicbld
471a3a8067
Add support data for fabric monitoring in CONFIG_DB. (#14170) (#16045)
Added support data for fabric monitoring in CONFIG_DB

The CONFIG_DB now has the FABRIC_MONITOR|FABRIC_MONITOR_DATA table for default value for fabric port monitoring. An example output of getting this table is:

sonic-db-cli CONFIG_DB hgetall "FABRIC_MONITOR|FABRIC_MONITOR_DATA"
{'monErrThreshCrcCells': '1', 'monErrThreshRxCells': '61035156', 'monPollThreshIsolation': '1', 'monPollThreshRecovery': '8'}

The CONFIG_DB now also has a table for each fabric port for its isolate status.
An example output of getting this table is:

sonic-db-cli CONFIG_DB hgetall "FABRIC_PORT|Fabric20"
{'alias': 'Fabric20', 'isolateStatus': 'False', 'lanes': '20'}

Co-authored-by: jfeng-arista <98421150+jfeng-arista@users.noreply.github.com>
2023-08-07 09:26:45 +08:00
Ying Xie
8369e1c6b7 Potential fix for Celestica E1031 device hang (#15822)
set CPU max_cstate to 0

Co-authored-by: Sumukha Tumkur Vani <sumukhatv@outlook.com>
2023-07-21 14:33:59 +08:00
mssonicbld
0eb0749442
Move /var/log to RAM for Mellanox SN2700, Nokia 7215 and Dell S6100 (#15077) (#15871)
Why I did it
Move the /var/log on RAM. This is to prevent too many disk write on /var/log when mounted on disk.

Work item tracking
Microsoft ADO (number only): 17955517

How I did it
Pass kernel cmdline option "log_inram=on"

How to verify it
Mellanox SN2700
root@str-msn2700-02:~# df -h
Filesystem Size Used Avail Use% Mounted on
udev 3.9G 0 3.9G 0% /dev
tmpfs 791M 15M 776M 2% /run
root-overlay 15G 12G 2.9G 80% /
/dev/sda3 15G 12G 2.9G 80% /host
tmpfs 790M 12M 779M 2% /var/log
tmpfs 3.9G 107M 3.8G 3% /dev/shm
tmpfs 5.0M 0 5.0M 0% /run/lock
tmpfs 4.0M 0 4.0M 0% /sys/fs/cgroup
overlay 15G 12G 2.9G 80% /var/lib/docker/overlay2/f50948841bee041368bf7c0546ceab4c71f05951fb0ed5ae70411f28dde68907/merged
overlay 15G 12G 2.9G 80% /var/lib/docker/overlay2/c45de6c53e7185631a37e87686dd296b2585425f638aa92c720c90eae038480c/merged
overlay 15G 12G 2.9G 80% /var/lib/docker/overlay2/f5bc87d5c2965b21b222f09dd57fe0fc798e518101d7ecd25d170b7662ae3e80/merged
overlay 15G 12G 2.9G 80% /var/lib/docker/overlay2/b2f435a256b930da4897d8a096095dcce183a6efa55b5b637187a654db0585ee/merged
overlay 15G 12G 2.9G 80% /var/lib/docker/overlay2/5c3588e42b29fd0516a164c00de621b7a00236ecbb240c4d0b3903ec706c220d/merged
overlay 15G 12G 2.9G 80% /var/lib/docker/overlay2/5a4a2a2602fb4ed1d1df90c3916076f595b4d8bc18eb465dd23e33f354adcfb8/merged
overlay 15G 12G 2.9G 80% /var/lib/docker/overlay2/9926f7378de9223fd3e88c8f59d888ad178e2ca23fa978f372e9838f10b7b803/merged
overlay 15G 12G 2.9G 80% /var/lib/docker/overlay2/130abaf95cffc06d952adacb6aa54a2f5e7c54c81fa8c15184389e25a7884328/merged
overlay 15G 12G 2.9G 80% /var/lib/docker/overlay2/aeef95cf5af6e20909a4cfd6c696176cc5dcb31dd456cc8acbbd3d59d47333d7/merged
overlay 15G 12G 2.9G 80% /var/lib/docker/overlay2/ef9bb94012b9fe987e55c9b73141296da8081d258d0d134922776c3c4b3ec551/merged
overlay 15G 12G 2.9G 80% /var/lib/docker/overlay2/cf425d372b347fd68569f128e1771e5a70dbf504b2f013304d60bcef6dfbd0da/merged
overlay 15G 12G 2.9G 80% /var/lib/docker/overlay2/7a2592cdac5c7369a6a98e07dbf1c2d96d29634e7d7b593617c50cc7e09e5cb3/merged
root@str-msn2700-02:~# 
root@str-msn2700-02:~# free -h
 total used free shared buff/cache available
Mem: 7.7Gi 3.0Gi 3.3Gi 133Mi 1.5Gi 4.4Gi
Swap: 0B 0B 0B
root@str-msn2700-02:~# 


Dell S6100

root@str-s6100-acs-5:~# df -h
Filesystem Size Used Avail Use% Mounted on
udev 3.9G 0 3.9G 0% /dev
tmpfs 794M 15M 780M 2% /run
root-overlay 14G 9.9G 3.6G 74% /
/dev/sda4 14G 9.9G 3.6G 74% /host
tmpfs 793M 13M 781M 2% /var/log
tmpfs 3.9G 60K 3.9G 1% /dev/shm
tmpfs 5.0M 0 5.0M 0% /run/lock
tmpfs 4.0M 0 4.0M 0% /sys/fs/cgroup
overlay 14G 9.9G 3.6G 74% /var/lib/docker/overlay2/f94441208fba5df49b0b8f0b49c699475ed0fd07673ab4a3eb574869b8e17c83/merged
overlay 14G 9.9G 3.6G 74% /var/lib/docker/overlay2/1c3dc3b582599602aec0dbd78945560f330f6244d2e218750622b3814dc53ed3/merged
overlay 14G 9.9G 3.6G 74% /var/lib/docker/overlay2/ab5b96e72e323fff5168abc69f8599fa244410d856dbd10cdbf73c99a4fe8d67/merged
overlay 14G 9.9G 3.6G 74% /var/lib/docker/overlay2/0e6e3adaba6bb1d2684da444661e540030d588ef498466b7d8ff773ce263a2ea/merged
overlay 14G 9.9G 3.6G 74% /var/lib/docker/overlay2/1218ed8bfa7a17c8927b20005d45f5e1e4a634e653d5c5c2057ac54713dc3387/merged
overlay 14G 9.9G 3.6G 74% /var/lib/docker/overlay2/b31486f665e5c929966185397990553fee6b41b515cbef28c945096673ac9bef/merged
overlay 14G 9.9G 3.6G 74% /var/lib/docker/overlay2/b984fa70f30bd1bac92bdf8d36542ed4433b4dabc33f7bb1f0a17a5eaee90f3e/merged
overlay 14G 9.9G 3.6G 74% /var/lib/docker/overlay2/b7866a1462768f3564b832187837c7a5e3d493b8084204e59610960cc5f6bc19/merged
overlay 14G 9.9G 3.6G 74% /var/lib/docker/overlay2/113bbbe88ee8452f4310b02a1343cfb4f1beb5fedf68a7d810ff5b5d7457c9f0/merged
overlay 14G 9.9G 3.6G 74% /var/lib/docker/overlay2/e7cc383186c6f9acecf2031c0c1f0870b8a7f63e1918b8359afa7a13d3c28963/merged
overlay 14G 9.9G 3.6G 74% /var/lib/docker/overlay2/c5d269100da205981c51e70e9e86facf69487f99e234dcdac822b8ab01af3d6a/merged
overlay 14G 9.9G 3.6G 74% /var/lib/docker/overlay2/463874ab78b2e45a34cf4d3d1cd2e45ff18c0abbf37be62d2c8559dce38d6219/merged
root@str-s6100-acs-5:~# free -h
 total used free shared buff/cache available
Mem: 7.8Gi 2.1Gi 4.1Gi 69Mi 1.6Gi 5.3Gi
Swap: 0B 0B 0B
root@str-s6100-acs-5:~# 

Nokia-7215

root@str-2-7215-acs-4:~# df -h
Filesystem Size Used Avail Use% Mounted on
udev 1.5G 0 1.5G 0% /dev
tmpfs 303M 14M 289M 5% /run
root-overlay 15G 7.2G 7.2G 51% /
/dev/sda2 15G 7.2G 7.2G 51% /host
tmpfs 302M 7.5M 295M 3% /var/log
tmpfs 1.5G 60K 1.5G 1% /dev/shm
tmpfs 5.0M 0 5.0M 0% /run/lock
tmpfs 4.0M 0 4.0M 0% /sys/fs/cgroup
overlay 15G 7.2G 7.2G 51% /var/lib/docker/overlay2/617e49b8b8e4368db2b3b2fb3e3204e80ec572fe7981d67ad2116d9c3e4472f3/merged
overlay 15G 7.2G 7.2G 51% /var/lib/docker/overlay2/c94b855482fc14aa1f032b0c8dc035b02f37ad9e4341cb5a8d22f14e14c63824/merged
overlay 15G 7.2G 7.2G 51% /var/lib/docker/overlay2/2d8c6ee95b212bbc8376d15916723128455678f2a3c88f382b451bec88297341/merged
overlay 15G 7.2G 7.2G 51% /var/lib/docker/overlay2/92114013a19dc19f30505ba645f961d50e093365422a9b22116ced1fa88ded2b/merged
overlay 15G 7.2G 7.2G 51% /var/lib/docker/overlay2/c8e79a8403863887666324f163a4b6633c40c8b349402b3a0f40ba7e51adb28b/merged
overlay 15G 7.2G 7.2G 51% /var/lib/docker/overlay2/27fd4a51859f3febd345a8551a0b4686d696c205048e1d595b76114385a68949/merged
root@str-2-7215-acs-4:~#
2023-07-19 16:09:37 +08:00
vdahiya12
acc5dfb603 [Arista][x86_64-arista_7050_qx32] Add Components to platform.json (#15252)
* [Arista][x86_64-arista_7050_qx32] Add Components to platform.json

Signed-off-by: vaibhav dahiya <vdahiya@microsoft.com>

* fix comment

Signed-off-by: vaibhav dahiya <vdahiya@microsoft.com>

* fix comment

Signed-off-by: vaibhav dahiya <vdahiya@microsoft.com>

* reformat

Signed-off-by: vaibhav dahiya <vdahiya@microsoft.com>

---------

Signed-off-by: vaibhav dahiya <vdahiya@microsoft.com>
2023-06-26 16:41:07 +08:00
byu343
c2b2407335
[Arista] Update hwsku.json for Arista-7050QX-32S-S4Q31 (#15251)
* [Arista] Update hwsku.json for Arista-7050QX-32S-S4Q31

* Change to 3x10G(3)+1x1G(1) on Arista-7050QX-32S-S4Q31
2023-06-14 16:16:24 -07:00
Samuel Angebault
afc6f7acc7
[Arista] fix platform.json for a few devices (#15308)
Why I did it
sonic-mgmt is failing tests due to invalid test data in platform.json
Fwutil is upset the chassis name in the platform_component.json of the 7060CX-32S

How I did it
Fixed the aforementioned issues
2023-06-14 13:19:28 -07:00
Kebo Liu
3cb13226be
Update SN5600 platform.json with service port sfp (#15337)
Signed-off-by: Kebo Liu <kebol@nvidia.com>
2023-06-13 14:15:15 +03:00
Arvindsrinivasan Lakshmi Narasimhan
0f194c5a03
set the default value for the port fec to RS on J2 based LC (#15346)
Why I did it
Work item tracking
Microsoft ADO (24182162):
How I did it
update the config.bcm to set the default fec RS 100G Linecard

How to verify it
Tests on chassis
2023-06-08 11:08:48 -07:00