1220 Commits

Author SHA1 Message Date
Andrew Sapronov
c190a8f795
[Netberg][Barefoot] Added support for Aurora 710 (#15298)
* [202012][platform/barefoot] (#8543)

Why I did it
pcied was running under Python 2.

How I did it
Dropped Python 2 support and added Python 3 support for pcied in docker-pmon.supervisord.conf.j2.

How to verify it
docker exec pmon supervisorctl status

* [Netberg][nba710] Added initial support for Aurora 710

Signed-off-by: Andrew Sapronov <andrew.sapronov@gmail.com>

---------

Signed-off-by: Andrew Sapronov <andrew.sapronov@gmail.com>
Co-authored-by: Kostiantyn Yarovyi <kostiantynx.yarovyi@intel.com>
2023-06-30 17:30:07 -07:00
prabhataravind
d4de62d155
[docker-sonic-vs]: Add NPU SKU for docker-sonic-vs (#15604)
Define a generic 2-port NPU SKU for docker-sonic-vs to 
enable DASH vstests to pass on azure pipelines

Work item tracking
Microsoft ADO 24375371:

How I did it
Define a generic 2-port NPU hwsku that is used only for DASH-specific vstests.

Signed-off-by: Prabhat Aravind <paravind@microsoft.com>
2023-06-27 14:10:53 -07:00
Prince George
05f326eed9
Move /var/log to RAM for Mellanox SN2700, Nokia 7215 and Dell S6100 (#15077)
* add ONIE_PLATFORM_EXTRA_CMDLINE_LINUX to kernel bootparam
2023-06-26 10:58:39 -07:00
vdahiya12
78c262ea9f
[Arista][x86_64-arista_7050_qx32] Add Components to platform.json (#15252)
* [Arista][x86_64-arista_7050_qx32] Add Components to platform.json

Signed-off-by: vaibhav dahiya <vdahiya@microsoft.com>

* fix comment

Signed-off-by: vaibhav dahiya <vdahiya@microsoft.com>

* fix comment

Signed-off-by: vaibhav dahiya <vdahiya@microsoft.com>

* reformat

Signed-off-by: vaibhav dahiya <vdahiya@microsoft.com>

---------

Signed-off-by: vaibhav dahiya <vdahiya@microsoft.com>
2023-06-22 09:04:05 -07:00
jfeng-arista
4b31e30924
Add support data for fabric monitoring in CONFIG_DB. (#14170)
Added support data for fabric monitoring in CONFIG_DB

The CONFIG_DB now has the FABRIC_MONITOR|FABRIC_MONITOR_DATA table for default value for fabric port monitoring. An example output of getting this table is:

sonic-db-cli CONFIG_DB hgetall "FABRIC_MONITOR|FABRIC_MONITOR_DATA"
{'monErrThreshCrcCells': '1', 'monErrThreshRxCells': '61035156', 'monPollThreshIsolation': '1', 'monPollThreshRecovery': '8'}

The CONFIG_DB now also has a table for each fabric port for its isolate status.
An example output of getting this table is:

sonic-db-cli CONFIG_DB hgetall "FABRIC_PORT|Fabric20"
{'alias': 'Fabric20', 'isolateStatus': 'False', 'lanes': '20'}
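
The thresholds come back from `sonic-db-cli` as strings. A minimal sketch, assuming the printed output is a Python-style dict literal as in the example above, of converting it to integers. The helper name `parse_fabric_monitor` is hypothetical and not part of SONiC:

```python
import ast

def parse_fabric_monitor(raw: str) -> dict:
    """Parse the dict printed by `sonic-db-cli ... hgetall` and cast values to int."""
    fields = ast.literal_eval(raw)  # safe evaluation of the printed dict literal
    return {key: int(value) for key, value in fields.items()}

# Sample copied from the FABRIC_MONITOR|FABRIC_MONITOR_DATA output above.
sample = (
    "{'monErrThreshCrcCells': '1', 'monErrThreshRxCells': '61035156', "
    "'monPollThreshIsolation': '1', 'monPollThreshRecovery': '8'}"
)
thresholds = parse_fabric_monitor(sample)
print(thresholds["monErrThreshRxCells"])
```

The FABRIC_PORT table contains non-numeric fields such as `alias` and `isolateStatus`, so this helper would only suit the threshold table.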
2023-06-16 15:16:40 -07:00
byu343
c2b2407335
[Arista] Update hwsku.json for Arista-7050QX-32S-S4Q31 (#15251)
* [Arista] Update hwsku.json for Arista-7050QX-32S-S4Q31

* Change to 3x10G(3)+1x1G(1) on Arista-7050QX-32S-S4Q31
2023-06-14 16:16:24 -07:00
Samuel Angebault
afc6f7acc7
[Arista] fix platform.json for a few devices (#15308)
Why I did it
sonic-mgmt tests are failing due to invalid test data in platform.json.
fwutil also complains about the chassis name in the platform_components.json of the 7060CX-32S.

How I did it
Fixed the aforementioned issues
2023-06-14 13:19:28 -07:00
Kebo Liu
3cb13226be
Update SN5600 platform.json with service port sfp (#15337)
Signed-off-by: Kebo Liu <kebol@nvidia.com>
2023-06-13 14:15:15 +03:00
Arvindsrinivasan Lakshmi Narasimhan
0f194c5a03
set the default value for the port fec to RS on J2 based LC (#15346)
Why I did it
Work item tracking
Microsoft ADO (24182162):
How I did it
Update config.bcm to set the default FEC to RS on the 100G linecard

How to verify it
Tests on chassis
2023-06-08 11:08:48 -07:00
Ikki Zhu
9fcbd5ed1d
fix possible cpld race access issue (#15371)
Why I did it
Fix a possible CPLD read race between the watchdog and the reboot-cause
process.

How I did it
Use fcntl.flock to serialize access to the CPLD sysfs file.

How to verify it
The race can be simulated and the fix verified with the following Python script:

``` python3
import fcntl
import signal
import threading

exit_flag = False

def get_cpld_reg_value(getreg_path, register):
    file = open(getreg_path, 'w+')
    # Acquire an exclusive lock on the file
    fcntl.flock(file, fcntl.LOCK_EX)

    try:
        file.write(register + '\n')
        file.flush()

        # Seek to the beginning of the file
        file.seek(0)

        # Read the content of the file
        result = file.readline().strip()
    finally:
        # Release the lock and close the file
        fcntl.flock(file, fcntl.LOCK_UN)
        file.close()

    return result

def cpld_read(thread_num, cpld_reg, expect_val):
    while not exit_flag:
        val = get_cpld_reg_value("/sys/devices/platform/dx010_cpld/getreg",
                                 cpld_reg)
        #print(f"Thread {thread_num}: get cpld reg {cpld_reg}, value {val}")
        if val != expect_val:
            print(f"Thread {thread_num}: get cpld reg {cpld_reg}, value "
                  f"{val}, expect_val {expect_val}")

def signal_handler(sig, frame):
    global exit_flag
    print("Ctrl+C detected. Quitting...")
    exit_flag = True

if __name__ == '__main__':
    # Register the signal handler for Ctrl+C
    signal.signal(signal.SIGINT, signal_handler)

    t1 = threading.Thread(target=cpld_read, args=(1, '0x103', '0x11',))
    t2 = threading.Thread(target=cpld_read, args=(2, '0x141', '0x00',))
    t1.start()
    t2.start()
    t1.join()
    t2.join()
```
2023-06-07 11:29:18 -07:00
Marty Y. Lok
d4a81ea121
[Nokia-IXR7250E][Devicedata] update the device data for Nokia IXR7250E platform (#15216)
Why I did it
Update the device data files to support 1024 LAGs for Nokia IXR7250E platform
fixes https://github.com/Nokia-ION/ndk/issues/15

How I did it
Update the lag_id_end=1024 in chassisdb.conf file and add the trunk_group_max_members=16 in the BCM config file

How to verify it
Check that LAG IDs up to 1024 can be created, each with up to 16 port members.
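
The limits named above can be expressed as a small bounds check. This is only a sketch: `lag_is_allowed` is a hypothetical helper, and the starting LAG ID of 1 is an assumption, since the commit states only `lag_id_end` and `trunk_group_max_members`.

```python
LAG_ID_START = 1              # assumption: not stated in the commit message
LAG_ID_END = 1024             # lag_id_end from chassisdb.conf
TRUNK_GROUP_MAX_MEMBERS = 16  # trunk_group_max_members from the BCM config file

def lag_is_allowed(lag_id: int, member_count: int) -> bool:
    """Return True if the LAG ID and member count fit the configured limits."""
    id_ok = LAG_ID_START <= lag_id <= LAG_ID_END
    members_ok = 0 <= member_count <= TRUNK_GROUP_MAX_MEMBERS
    return id_ok and members_ok

print(lag_is_allowed(1024, 16))
```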

Signed-off-by: mlok <marty.lok@nokia.com>
2023-06-05 12:02:05 -07:00
Neetha John
6a8f1bad63
[brcm] Update SOC properties for DLR_INIT based pfcwd recovery (#15286)
* [202205] Update SOC properties for DLR_INIT based pfcwd recovery (#15217)

Why I did it
Update soc properties for certain roles that need to use pfcwd dlr init based recovery mechanism

How to verify it
Updated the templates on a 7050cx3 dual tor and 7260 T1 which satisfies these conditions and validated pfcwd recovery which uses DLR_INIT based mechanism. Also validated that this mechanism is not used on 7050cx3 single tor with the updated templates

Signed-off-by: Neetha John <nejo@microsoft.com>
2023-06-03 14:39:38 -07:00
Sudharsan Dhamal Gopalarathnam
5680c544b6
[Mellanox]Adding SKU Mellanox-SN4700-O8C48 (#15179)
#### Why I did it
To add new SKU Mellanox-SN4700-O8C48 with following requirements:

| Port configuration | Value |
| ------ | --------- |
| Breakout mode for each port | **Defined in port mapping** |
| Speed of the port | **Defined in port mapping** |
| Auto-negotiation enable/disable | **No setting required** |
| FEC mode | **No setting required** |
| Type of transceiver used | **Not needed** |

| Buffer configuration | Value |
| ------ | --------- |
| Shared headroom | **Enabled** |
| Shared headroom pool factor | **2** |
| Dynamic Buffer | **Disable** |
| In static buffer scenario how many uplinks and downlinks? | **48x100G downlinks and 8x400G uplinks** |
| 2km cable support required? | **Yes** |

| Switch configuration | Value |
| ------ | --------- |
| Warmboot enabled? | **Yes** |
| Should warmboot be added to SAI profile when enabled? | **Yes** |
| Is VxLAN source port range set? | **No** |
| Should VxLAN source port range be added to SAI profile when set? | **No** |
| Is Static Policy Based Hashing enabled? | **No** |

Port Mapping

| Ports  | Mode      |
| ------  |--------- |
| 1-12   | 2x100G |
| 13-20  | 1x400G   | 
| 21-32  | 2x100G |


Number of Uplinks / Downlinks:
T1 topology: **48x100G Downlinks 8x400G uplinks**.
Length of downlink: **40m**
Length of uplink: **2000m**
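
The downlink and uplink counts follow from the Port Mapping table. A short sketch of that arithmetic, with the port ranges and breakout modes copied from the table; `count_links` is a hypothetical helper used only for illustration:

```python
from collections import Counter

# (port range, logical ports per physical port, speed in Gb/s), per the Port Mapping table.
PORT_MAPPING = [
    (range(1, 13), 2, 100),   # ports 1-12: 2x100G
    (range(13, 21), 1, 400),  # ports 13-20: 1x400G
    (range(21, 33), 2, 100),  # ports 21-32: 2x100G
]

def count_links(mapping):
    """Sum the number of logical links per speed across all port ranges."""
    totals = Counter()
    for ports, links_per_port, speed_gbps in mapping:
        totals[speed_gbps] += len(ports) * links_per_port
    return dict(totals)

link_counts = count_links(PORT_MAPPING)
print(link_counts)  # {100: 48, 400: 8}
```

The 24 ports in 2x100G mode give 48 downlinks, and the 8 ports in 1x400G mode give 8 uplinks, matching the stated T1 topology.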

##### Work item tracking
- Microsoft ADO **(number only)**:

#### How I did it
Defined the SKU as per requirements

#### How to verify it
Load the SKU and verify if all links come up and traffic passes.


2023-06-02 13:10:04 -07:00
Vivek
d3f2d06117
[Mellanox] Add Copyright Headers for missing files (#15136)
Added NVIDIA copyright to missing files under platform/mellanox & device/mellanox
Signed-off-by: Vivek Reddy Karri <vkarri@nvidia.com>
2023-05-25 07:55:44 +03:00
Pavan-Nokia
4d8f3cda35
[armhf][Nokia-7215]Add HWSKU files for new SAI (#15146)
* [armhf][Nokia-7215]Add HWSKU files for new SAI

Add new easy bringup (EZB) files for new SAI 1.11.0

* [Nokia][devicedata]Modified the port autoneg default setting for Nokia-7215 platform

[armhf][Nokia-7215]Update profile.ini
2023-05-24 21:01:40 -07:00
george-deng88
2b527d301f
[Celestica] Optimize Silverstone led init process (#14852)
Why I did it
Optimize the Silverstone LED init process. Setting linkscan = off can leave the SONiC port link status out of sync with the BCM shell after reboot.

How I did it
Remove redundant code.

How to verify it
After reboot, the ports can linkup normally.
2023-05-24 15:50:12 -07:00
vmittal-msft
ecb4db58a9
Update PG headroom settings ports based on port speed/cable length (#14908)
* Update PG headroom settings ports based on port speed/cable length

* Updated XOFF settings to use chip-level numbers rather than core-level ones

* Updated PG headroom based on uplink/downlink side

* fix for sonic-config-gen tests

* More fixes for unit test cases

* more test fixes

* Merged multiple functions into one
2023-05-19 08:19:27 -07:00
Pavan-Nokia
c5d0507224
[arm64][Nokia-7215-A1]Add support for Nokia-7215-A1 platform (#13795)
Add new Nokia build target and establish an arm64 build:

    Platform: arm64-nokia_ixs7215_52xb-r0
    HwSKU: Nokia-7215-A1
    ASIC: marvell
    Port Config: 48x1G + 4x10G

How I did it

- Change make files for saiserver and syncd to use the Bullseye kernel
- Change Marvell SAI version to 1.11.0-1
- Add Prestera make files to build kernel, Flattened Device Tree blob and ramdisk for arm64 platforms
- Provide device and platform related files for new platform support (arm64-nokia_ixs7215_52xb-r0).
2023-05-18 14:24:05 -07:00
arista-nwolfe
93add6ed05
Add soc property sai_pfc_dlr_init_capability=0 to missing DNX SKUs (#15098)
2023-05-17 14:03:42 -07:00
DavidZagury
a10c1951d6
[Mellanox] Update SN5600 SAI XML file (#14947)
- Why I did it
Update SAI xml file to align with the default SKU

- How I did it
Update the SN5600 SAI xml file

- How to verify it
Install image on SN5600 device
2023-05-10 20:43:27 +03:00
vmittal-msft
5fc85f3274
Updated default ECN settings for T2 chassis (#14388)
Why I did it
Update ECN settings for T2 chassis

How I did it
Updated qos config file to load these settings during switch bootup

How to verify it
Verified on line card on T2 chassis
2023-05-04 10:01:09 -07:00
Kebo Liu
14a5f21c08
[Mellanox] Update SN5600 sensors.conf and pcie.yaml files (#14883)
- Why I did it
Update the sensors.conf and pcie.yaml according to the real hardware.

- How I did it
Update the sensors.conf and pcie.yaml

- How to verify it
run relevant sonic-mgmt test cases.

Signed-off-by: Kebo Liu <kebol@nvidia.com>
2023-05-02 10:36:57 +03:00
Song Yuan
48ed53cbf2
[chassis/arista]: Increase LAG Ids to 1024 (#10519)
Why I did it
Today at most 128 LAGs are supported. This is not sufficient if there are many LAGs with just a few ports each.

How I did it
Increase LAG Ids to 1024 for DNX device.
2023-04-27 11:28:23 -07:00
Marty Y. Lok
a9cc1fb11d
[Nokia][device-data] Modify the Nokia-7250IXRE platform specific reboot script (#14568)
Why I did it

When the chassis is rebooted by issuing "sudo reboot" on the Supervisor card, the internal midplane communication interface xe0 should be shut down to avoid a double reboot of the linecard.
Added a udev link rule to disable autoneg on the AMD xgbe ports xe0 and xe1 so that the setting stays in sync with the peer Broadcom Greyhound ports.

How I did it

Modified the Nokia-7250IXRE-specific reboot script on the Supervisor card to shut down the internal interface xe0. Also moved the linecard reboot code to the top of the script so that the notification is sent to the linecard before xe0 is shut down.
Introduced a new rule, 80-net-by-driver.link, to disable autoneg on the AMD side. This change requires the latest NDK, which contains the change to set autoneg on the xe0 and xe1 ports on the Greyhound.

Signed-off-by: mlok <marty.lok@nokia.com>
2023-04-27 08:53:16 -07:00
arista-nwolfe
990993e3f4
[devices/arista]: Added recycle ports required for egress mirroring (#13967)
Why I did it
Support Egress Mirroring on supported Arista platforms

How I did it
Add necessary soc properties for egress mirroring recycle ports to be created

Signed-off-by: Nathan Wolfe <nwolfe@arista.com>
2023-04-06 10:58:01 -07:00
kenneth-arista
8ddfaec34f
[devices/arista] Update asic_port_name in Arista LCs (#14234)
Updated asic_port_names for all Arista LC SKUs to follow latest naming
conventions to remove redundant ASICx suffix. For
Arista-7800R3-48CQ2-C48, added the asic_port_name mapping.
2023-04-06 10:53:42 -07:00
snider-nokia
6f54251375
[armhf][Nokia-7215]Add SFP refactor support for Nokia-7215 platform (#14396)
2023-04-06 08:04:45 -07:00
Ikki Zhu
f550c86bd7
[Seastone] DX010 platform switch to sfp-refactor based sfp impl (#13972)
Why I did it
The sonic-sfp based SFP implementation will be deprecated in the future, so switch to the sfp-refactor based implementation.

How I did it
Use the new sfp-refactor based SFP implementation for Seastone.

How to verify it
Manually test the SFP platform API or run the SFP platform test cases.
2023-03-27 10:17:21 -07:00
Neetha John
ab097788d5
[qos] Update RDMA-CENTRIC lossy profile to use static threshold for Th devices (#14372)
Why I did it
For better accounting purposes, updating the ingress lossy traffic profile to use static threshold. This change is only intended for Th devices using RDMA-CENTRIC profiles

How I did it
Update the buffer templates for Th devices in RDMA-CENTRIC folder to use the correct threshold

How to verify it
Verified the changes manually on a Th device.
Existing unit tests render Th template from the RDMA-CENTRIC folder. Updated the expected output to use the correct threshold
2023-03-23 09:31:06 -07:00
Neetha John
8e4ce44e5c
Update dynamic threshold for TD2 (#14224)
Why I did it
Update dynamic threshold to -1 to get optimal performance for RDMA traffic

How I did it
Modified pg_profile_lookup.ini to reflect the correct value

Signed-off-by: Neetha John <nejo@microsoft.com>
2023-03-16 10:06:46 -07:00
Samuel Angebault
1516ace9a5
[Arista] Add missing platform_components.json (#14067)
Provide platform_components.json for Clearwater2 and Wolverine

These files are needed for fwutil platform sonic-mgmt tests to pass.

Fix PikeZ platform_components.json

Co-authored-by: Patrick MacArthur <pmacarthur@arista.com>
Co-authored-by: Andy Wong <andywong@arista.com>
2023-03-13 12:18:42 -07:00
Sambath Kumar Balasubramanian
71835385c1
sonic-buildimage Remove unused SAT port from arista configs. (#14167)
Why I did it
To fix aristanetworks/sonic#85

How I did it
Remove unnecessary SAT ports

How to verify it
Change the speed from 400G to 100G and confirm there are no errors.
2023-03-09 15:54:20 -08:00
Ikki Zhu
f801b8fb2d
[Seastone] fix dx010 qsfp eeprom data write issue (#13930)
Why I did it
Platform cases test_tx_disable, test_tx_disable_channel, test_power_override failed in dx010.

How I did it
Add i2c access algorithm for CPLD i2c adapters.

How to verify it
Verify it with platform_tests/api/test_sfp.py::TestSfpApi test cases.
2023-03-01 14:35:53 +08:00
Song Yuan
bab8230444
Add QOS profiles for Arista SKUs (#13829)
2023-02-28 20:43:12 -08:00
Sambath Kumar Balasubramanian
d1bca210a6
sonic-buildimage Make changes to arista config.bcm files to support max cores (#13831)
To support 64 cores on arista skus. Fixes aristanetworks/sonic#77
Remapped recycle ports to lower core port IDs and set appl_param_nof_ports_per_modid to 64.
2023-02-23 17:54:43 -08:00
RogerX87
33db298d70
[devices]: Update the Wistron platform support in master branch (#12110)
* Update the Wistron platform support in master branch

Signed-off-by: RogerX87 <RogerX87@gmail.com>
2023-02-23 09:08:13 -08:00
Ikki Zhu
f8a393c3a1
add psu fans status led available config (#13926)
Why I did it
Seastone does not have the psu fans' status led, need to reflect it in platform.json.

How I did it
Set the psu fans status led available to false.

How to verify it
Verify it with platform_tests/api/test_psu_fans.py::TestPsuFans::test_set_fans_led case.
2023-02-22 10:55:55 -08:00
Samuel Angebault
38c9d3a53d
[Arista] update sensors.conf to ignore sensors (#12529)
Why I did it
The sensors and sensord processes were reporting data on unused sensors.
This led to ALARM messages or erroneous values that could be misinterpreted.

How I did it
Ignore the affected sensors in the sensors.conf

How to verify it
Check that there are no longer ALARM messages from sensord in the syslog or in the output of sensors
2023-02-20 23:45:16 -08:00
byu343
b0e0b23d2a
[arista] Add tuning values for phys on 7280cr3 (#10084)
Why I did it
This change specifies the tuning values for each lane of the B52 phy chips. These values can be different for different ports. The values being set are under the assumption of optical transceivers. This change depends on the change to sonic-swss: sonic-net/sonic-swss#2158.

How to verify it
We verified the values are correctly set on the B52 chips of Arista 7280cr3, by reading them from the debug cli of the B52 driver.
2023-02-15 10:25:49 -08:00
george-deng88
3097e4258d
Fix Celestica Silverstone addDecapTunnel failed issue (#13235)
Why I did it
The current SONiC image fails to bring up on Silverstone:
4:56:15.957705 sonic NOTICE swss#orchagent: :- addDecapTunnel: Create overlay loopback router interface oid:6000000000520

How I did it
Enable the BCM switch tunnel function on Silverstone.

How to verify it
The new SONiC image brings up correctly on Silverstone.
2023-02-14 12:18:19 -08:00
Stephen Sun
3112997b5a
[Mellanox] Advance hw-mgmt to v.7.0020.4104 (#13372)
- Why I did it
Advance hw-mgmt service to V.7.0020.4100
Add missing thermal sensors that are supported by hw-mgmt package
Delay the system health service until hw-mgmt has started on Mellanox platforms, to avoid reading sensors before they are ready.
Depends on sonic-net/sonic-linux-kernel#305

- How I did it
1. Update hw mgmt version
2. Add missing sensors
3. Delay service 

- How to verify it
Regression test.

Signed-off-by: Stephen Sun <stephens@nvidia.com>
2023-02-12 11:23:47 +02:00
Ikki Zhu
f6701f5fd7
[DX010 platform] fix dx010 platform testcase issues (#13595)
Why I did it
1. fix chassis test_set_fans_led case
2. fix chassis get_name case mismatch issue
3. fix fan_drawer test_set_fans_speed
4. fix component test_components test case

How I did it
Add corresponding configuration into chassis json file

How to verify it
Run platform tests cases to verify these failure cases
2023-02-09 19:07:13 -08:00
andywongarista
c1355625ca
[Arista] Add other chassis names to platform_components.json for 720DT-48S (#12378)
Why I did it
The 720DT-48S platform has variants with different chassis names, and these need to all be included in platform_components.json to ensure that sonic-mgmt platform_tests/fwutil/test_fwutil.py::test_fwutil_show passes

How I did it
Updated platform_components.json with the variant names for 720DT-48S.

How to verify it
Ran aforementioned testcase and verified that it passes on the different variants.
2023-02-09 13:23:30 -08:00
Patrick MacArthur
39cbd28486
fix platform.json on Wolverine for thermal sensors (#13524)
Why I did it
The current platform.json contains entries for thermals and SFPs that do not exist on Wolverine.

How I did it
I removed the incorrect entries.

How to verify it
Verify using applicable sonic-mgmt platform API tests.
2023-02-08 10:38:36 -08:00
Stephen Sun
e3ff08833e
[Mellanox] Support DSCP remapping in dual ToR topo on T0 switch (#12605)
- Why I did it
Support DSCP remapping in dual ToR topo on T0 switch for SKU Mellanox-SN4600c-C64, Mellanox-SN4600c-D48C40, Mellanox-SN2700, Mellanox-SN2700-D48C8.

- How I did it
Regarding buffer settings, originally, there are two lossless PGs and queues 3, 4. In dual ToR scenario, the lossless traffic from the leaf switch to the uplink of the ToR switch can be bounced back.
To avoid PFC deadlock, we need to map the bounce-back lossless traffic to different PGs and queues. Therefore, 2 additional lossless PGs and queues are allocated on uplink ports on ToR switches.

On uplink ports, map DSCP 2/6 to TC 2/6 respectively
On downlink ports, both DSCP 2/6 are still mapped to TC 1
Buffer adjusted according to the ports information:
Mellanox-SN4600c-C64:
56 downlinks 50G + 8 uplinks 100G
Mellanox-SN4600c-D48C40, Mellanox-SN2700, Mellanox-SN2700-D48C8:
24 downlinks 50G + 8 uplinks 100G
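
The DSCP-to-TC rule described above can be sketched as a small lookup. This is an illustration only, not the actual QoS template: `tc_for_dscp` and the `"uplink"`/`"downlink"` role strings are hypothetical names, and only DSCP 2 and 6 are covered because those are the only values the commit discusses.

```python
def tc_for_dscp(dscp: int, port_role: str) -> int:
    """Return the traffic class for DSCP 2 or 6 on a dual-ToR T0 switch port."""
    if dscp not in (2, 6):
        raise ValueError("this sketch only covers DSCP 2 and 6")
    if port_role == "uplink":
        return dscp  # uplink: DSCP 2 -> TC 2, DSCP 6 -> TC 6
    if port_role == "downlink":
        return 1     # downlink: both DSCP 2 and 6 stay on TC 1
    raise ValueError(f"unknown port role: {port_role}")

print(tc_for_dscp(6, "uplink"), tc_for_dscp(6, "downlink"))
```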

- How to verify it
Unit test.

Signed-off-by: Stephen Sun <stephens@nvidia.com>
2023-02-07 16:21:59 +02:00
Dmytro Lytvynenko
346576bcf4
[BFN] Remove not common entries from pcie yaml configuration (#12816)
Why I did it
By default, pcieutil uses one configuration for all models of a platform

How I did it
Use one configuration file as the base for all models of a given platform; model-specific devices are not listed in it

How to verify it
Run pmon#pcied and verify that there are no error/warning logs during initialization
2023-02-06 09:54:43 -08:00
arunlk-dell
8fdbf9dce3
[devices]: DellEMC: Add platform_env.conf for Z9432F platform (#13003)
Added the platform specific non-default values.
2023-02-06 09:51:00 -08:00
Ikki Zhu
1dec473495
[Celestica DX010] fix fan drawer and watchdog platform testcase issues (#13426)
Why I did it
fix DX010 fan drawer and watchdog platform test case issues

How I did it
1. Add fan_drawer get_maximum_consumed_power support
2. Adjust maximum watchdog timeout value check

How to verify it
Run test_fan_drawer and test_watchdog test cases.
2023-02-06 09:27:46 -08:00
wenyiz2021
85b978a1ca
[Arista] [Platform] Update platform.json for psu led (#13523)
Why I did it
Setting 'status_led' 'controllable' to false in the psu section indicates that the platform does not yet support the PSU status LED

How I did it
Set 'status_led' 'controllable' to false in the psu section

How to verify it
Ran the test in pdb and manually added {'status_led' : {'controllable' : False}} to the dictionary;
the flag is then read as False and the test is skipped:
ce290c735d/tests/platform_tests/api/test_psu.py (L337)
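
A minimal sketch of how such a flag could be read from the PSU section of the platform data. The helper `psu_led_controllable` is hypothetical, and treating a missing key as "controllable" is an assumption; the actual lookup in sonic-mgmt may differ.

```python
def psu_led_controllable(psu_facts: dict) -> bool:
    """Read status_led.controllable, defaulting to True when the key is absent (assumption)."""
    return psu_facts.get("status_led", {}).get("controllable", True)

# The dictionary entry from the verification step above.
psu_facts = {"status_led": {"controllable": False}}
if not psu_led_controllable(psu_facts):
    print("PSU status LED is not controllable; skipping LED test")
```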
2023-02-01 09:53:22 -08:00
Richard.Yu
a096363b48
[broadcom]: Set default SYNCD_SHM_SIZE for Broadcom XGS devices (#13297)
After the upgrade to brcmsai 8.1, the recommended minimum memory sizes for the SDK running environment (container) are as follows:

TH4/TD4 (ltsw) use 512MB
TH3 uses 300MB
Helix4/TD2/TD3/TH/TH2 use 256MB
Based on this requirement, adjust the default syncd shared memory size and set the memory size for specific ASICs in the platform_env.conf file for the different types of Broadcom ASICs.

How I did it
Add the platform_env.conf file for Broadcom platforms that lack one (based on the platform_asic file)
Add 'SYNCD_SHM_SIZE' and set the value

for ltsw (TD4/TH4) devices, set to at least 512M (update the platform_env.conf)
for TD2/TH2/TH devices, set to 256M
for TH3, set to 300M

How to verify it
verify the image with code fix
Check with UT
Check on lab devices

On a problematic device which cannot start successfully
Run with the command
$ cat /proc/linux-kernel-bde
Broadcom Device Enumerator (linux-kernel-bde)
Module parameters:
        maxpayload=128
        usemsi=0
        dmasize=32M
        himem=(null)
        himemaddr=(null)
DMA Memory (kernel): 33554432 bytes, 0 used, 33554432 free, local mmap
No devices found
$ docker rm -f syncd
syncd
$ sudo /usr/bin/syncd.sh start
Cannot get Broadcom Chip Id. Skip set SYNCD_SHM_SIZE.
Creating new syncd container with HWSKU Force10-S6000
a4862129a7fea04f00ed71a88715eac65a41cdae51c3158f9cdd7de3ccc3dd31
$ docker inspect syncd | grep -i shm
            "ShmSize": 67108864,
                "Tag": "fix_8.1_shm_issue.67873427-9f7ca60a0e",
On Normal device
$ docker inspect syncd | grep -i shm
            "ShmSize": 268435456,
                "Tag": "fix_8.1_shm_issue.67873427-9f7ca60a0e"
change the config syncd_shm.ini to b85=128m

$ docker rm -f syncd
syncd
$ sudo /usr/bin/syncd.sh start
Creating new syncd container with HWSKU Force10-S6000
3209ffc1e5a7224b99640eb9a286c4c7aa66a2e6a322be32fb7fe2113bb9524c
$  docker inspect syncd | grep -i shm
            "ShmSize": 134217728,
                "Tag": "fix_8.1_shm_issue.67873427-9f7ca60a0e",
change the config under
/usr/share/sonic/device/x86_64-dell_s6000_s1220-r0/Force10-S6000/platform_env.conf
and run command

$ cat /usr/share/sonic/device/x86_64-dell_s6000_s1220-r0/platform_env.conf
SYNCD_SHM_SIZE=300m

$ sudo /usr/bin/syncd.sh start
Creating new syncd container with HWSKU Force10-S6000
897f6fcde1f669ad2caab7da4326079abd7e811bf73f018c6dacc24cf24bfda5
$  docker inspect syncd | grep -i shm
            "ShmSize": 314572800,
                "Tag": "fix_8.1_shm_issue.67873427-9f7ca60a0e",

Signed-off-by: richardyu-ms <richard.yu@microsoft.com>
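
The `ShmSize` values reported by `docker inspect` in the transcript above are byte counts of the configured sizes. A small sketch of the conversion, assuming binary units (1m = 1024 * 1024 bytes); `shm_size_to_bytes` is a hypothetical helper, not part of syncd.sh:

```python
UNITS = {"k": 1024, "m": 1024 ** 2, "g": 1024 ** 3}

def shm_size_to_bytes(size: str) -> int:
    """Convert a size string such as '300m' to a byte count."""
    size = size.strip().lower()
    if size[-1] in UNITS:
        return int(size[:-1]) * UNITS[size[-1]]
    return int(size)  # a plain number is already in bytes

for setting in ("128m", "256m", "300m"):
    print(setting, shm_size_to_bytes(setting))
```

These give 134217728, 268435456 and 314572800, the three non-default `ShmSize` values shown in the transcript. The 67108864 seen when SYNCD_SHM_SIZE is skipped corresponds to 64m.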
2023-01-30 20:23:03 -08:00