Commit Graph

1171 Commits

Author SHA1 Message Date
Richard.Yu
a096363b48
[broadcom]: Set default SYNCD_SHM_SIZE for Broadcom XGS devices (#13297)
After upgrade to brcmsai 8.1, the sdk running environment (container) recommended with mininum memory size as below

TH4/TD4(ltsw) uses 512MB
TH3 used 300MB
Helix4/TD2/TD3/TH/TH 256 MB
Base on this requirement, adjust the default syncd share memory size and set the memory size for special ACISs in platform_env.conf file for different types of Broadcom ASICs.

How I did it
Add the platform_env.conf file if none of it for broadcom platform (base on platform_asic file)
Add the 'SYNCD_SHM_SIZE' and set the value

for ltsw(TD4/TH4) devices set to 512M at least (update the platform_env.conf)
for Td2/TH2/TH devices set to 256M
for TH3 set to 300M

verify

How to verify it
verify the image with code fix
Check with UT
Check on lab devices

On a problematic device which cannot start successfully
Run with the command
$ cat /proc/linux-kernel-bde
Broadcom Device Enumerator (linux-kernel-bde)
Module parameters:
        maxpayload=128
        usemsi=0
        dmasize=32M
        himem=(null)
        himemaddr=(null)
DMA Memory (kernel): 33554432 bytes, 0 used, 33554432 free, local mmap
No devices found
$ docker rm -f syncd
syncd
$ sudo /usr/bin/syncd.sh start
Cannot get Broadcom Chip Id. Skip set SYNCD_SHM_SIZE.
Creating new syncd container with HWSKU Force10-S6000
a4862129a7fea04f00ed71a88715eac65a41cdae51c3158f9cdd7de3ccc3dd31
$ docker inspect syncd | grep -i shm
            "ShmSize": 67108864,
                "Tag": "fix_8.1_shm_issue.67873427-9f7ca60a0e",
On Normal device
$ docker inspect syncd | grep -i shm
            "ShmSize": 268435456,
                "Tag": "fix_8.1_shm_issue.67873427-9f7ca60a0e"
change the config syncd_shm.ini to b85=128m

$ docker rm -f syncd
syncd
$ sudo /usr/bin/syncd.sh start
Creating new syncd container with HWSKU Force10-S6000
3209ffc1e5a7224b99640eb9a286c4c7aa66a2e6a322be32fb7fe2113bb9524c
$  docker inspect syncd | grep -i shm
            "ShmSize": 134217728,
                "Tag": "fix_8.1_shm_issue.67873427-9f7ca60a0e",
change the config under
/usr/share/sonic/device/x86_64-dell_s6000_s1220-r0/Force10-S6000/platform_env.conf
and run command

$ cat /usr/share/sonic/device/x86_64-dell_s6000_s1220-r0/platform_env.conf
SYNCD_SHM_SIZE=300m

$ sudo /usr/bin/syncd.sh start
Creating new syncd container with HWSKU Force10-S6000
897f6fcde1f669ad2caab7da4326079abd7e811bf73f018c6dacc24cf24bfda5
$  docker inspect syncd | grep -i shm
            "ShmSize": 314572800,
                "Tag": "fix_8.1_shm_issue.67873427-9f7ca60a0e",

Signed-off-by: richardyu-ms <richard.yu@microsoft.com>
2023-01-30 20:23:03 -08:00
kenneth-arista
8c2d8ea4af
[device/arista] Reduce SDK stat polling freq in DNX devices (#13429)
Eariler the SDK stat polling was erroneously set to once every msec
which is far more frequent than required by SWSS. The new setting, which
is consistent with other vendor SKUs, is once a second. The net result
is reduced CPU MHz by syncd.
2023-01-30 14:13:01 -08:00
Ying Xie
e0ed5f968f
[Arista] add support for hardware sku Arista-7260CX3-D92C16 (#13438)
Signed-off-by: Ying Xie <ying.xie@microsoft.com>

Signed-off-by: Ying Xie <ying.xie@microsoft.com>
2023-01-19 11:08:21 -08:00
Ikki Zhu
eba30ff26f
[Celestica Seastone] fix multi sonic platform issues (#13356)
Why I did it
Fix the following issues for Seastone platform:

- system-health issue: show system-health detail will not complete #9530, Celestica Seastone DX010-C32: show system-health detail fails with 'Chassis' object has no attribute 'initizalize_system_led' #11322
- show platform firmware updates issue: Celestica Seastone DX010-C32: show platform firmware updates #11317
- other platform optimization

How I did it
Modify and optimize the platform implememtation.

How to verify it
Manual run the test commands described in these issues.
2023-01-18 16:27:48 -08:00
Jemston Fernando
892f26556c
[platform]: Fix Belgite platform issues (#13389)
As part of platform hardening this commit fixes several platform issues
in various components like PSU, FAN, Temperature, LED.
2023-01-18 10:00:07 -08:00
kenneth-arista
06d55b8027
[device/arista] Disabled polled_irq_mode for DNX SKUs (#13349)
Disabled polled_irq_mode for all Arista DNX devices as this mode
leads to excessive use of the CPU via an unneeded interrupt
polling thread.
2023-01-12 23:48:37 -08:00
pettershao-ragilenetworks
bce4aa1412
[ragile] adapter for kernel 5.x (#10762)
Why I did it
Ragile adapter ra-b6510-32c ra-b6510-48v8c ra-b6910-64c ra-b6920-4s to kernel 5.x

Signed-off-by: “pettershao” pettershao@ragilenetworks.com
2023-01-12 18:01:47 -08:00
Hua Liu
ce88a38185
Fix code issue when SonicV2Connector.get() return None. (#13250)
Fix code issue when SonicV2Connector.get() return None.

#### Why I did it
When database key does not exist, SonicV2Connector.get() will return None.
Code will break if not check return value.

#### How I did it
Check SonicV2Connector.get() return value before use it.

#### How to verify it
Pass all E2E test case.
2023-01-09 11:37:43 -08:00
Mai Bui
06e1a0bc14
[device/dell] Mitigation for security vulnerability (#11875)
Dependency: [PR (#12065)](https://github.com/sonic-net/sonic-buildimage/pull/12065) needs to merge first.

#### Why I did it
`commands` module is not protected against malicious input
`getstatusoutput` is detected without a static string, uses `shell=True`
#### How I did it
Eliminate the use of `commands`
Use `subprocess.run()`, commands in `subprorcess.run()` are totally static
Fix indentation
#### How to verify it
Tested on DUT
[dell_log.txt](https://github.com/sonic-net/sonic-buildimage/files/9561332/dell_log.txt)
2023-01-05 16:22:09 -08:00
byu343
83fd368f19
[Arista]: Add hwSku Arista-7260CX3-D108C10 (#13242)
* [Arista]: Add hwSku Arista-7260CX3-D108C10

* Add buffer-related config for Arista-7260CX3-D108C10
2023-01-04 13:21:43 -08:00
Ikki Zhu
8ad69f77a4
Seastone add platform capability enhancement config (#13079) 2023-01-04 10:17:50 -08:00
Marty Y. Lok
948ce3fe09
[Nokia][device-data] Rename the port name in the port_config.ini and also remove ECN setting in the BCM config file (#13053)
Signed-off-by: mlok <marty.lok@nokia.com>
2022-12-22 20:41:29 +00:00
roger530-ho
3740f1efb7
[Edgecore][device/accton] Fix subprocess.call issue in is_host(). (#13111)
Signed-off-by: roger530-ho <roger530_ho@edge-core.com>
2022-12-20 13:13:42 -08:00
tianshangfei
b65e06f998
two platforms supporting S3IP SYSFS (TCS8400, TCS9400) (#12386)
Why I did it
Add two platform that support s3IP framework

How I did it
Add two platforms supporting S3IP SYSFS (TCS8400, TCS9400)

How to verify it
Manual test
2022-12-18 16:16:53 +08:00
FSSec
bb09ebe977
[FS][arm64] support new boars s5800-48t4s and s5800-48t8s-mars8p (#12994)
Adding platform support for FS s5800-48t4s and s5800-48t8s-mars8p.

Both s5800-48t4s and s5800-48t8s-mars8p have 48 * 10/100/1000 Base-T ports, 4 * 10GE SFP+ Ports on Centec TsingMa.
s5800-48t4s is different from s5800-48t8s-mars8p in that:

The phy chip used by s5800-48t4s is Marvell 88e1680;
The phy chip used by s5800-48t4s-mars8p is Centec ctc21108;
2022-12-17 14:48:02 -08:00
andywongarista
372a7c85c5
[Arista] Update ip packet checksum when set to 0xffff on 720DT-48S (#13088)
Why I did it
This is to fix test_forward_ip_packet_with_0xffff_chksum_tolerant test failure on 720DT-48S. IP packets with checksum set to 0xffff will be forwarded with the same checksum on this platform, instead of updating to the correct value.

How I did it
Add bcm config sai_verify_incoming_chksum=0 so that checksum is updated instead of being left unchanged when checksum is 0xffff. Note that packets with invalid checksum are still dropped with this config.
2022-12-17 13:47:05 -08:00
Deepak Singhal
bf428fd9a7
DNX(J2/J2c/J2c+): Reserve Non-ECMP Fec Resource for Non-ECMP Route Nexthops/NBR Entries (#13076)
Why I did it
On DNX (J2/J2c/J2c+) platforms, Single Path Nexthops and ECMp Nexthop resources(FECs) are shared. BRCM SAI do not have partition of this resource, and hence more single path Nexthop entries, causes ECMP programming to fail in scaled setup.

How I did it
Broadcom provided SAI changes to reserve resources for single path nexthop entries(More details in CSP: https://brcmsemiconductor-csm.wolkenservicedesk.com/wolken-support/allcases/request-details?requestId=CS00012251649).
Along with SAI changes, they provided configurable Macro/flag to reserve NON_ECMP entries.
This PR is to add that flag in various sai.profile files wherever applicable.

PS: We are reserving 3072 single path Nexthop entries on each Linecard. Calculation is as follows.
Max Slots per chassis: 8
Max No of Ports(each LC): 64
MyIP/Subnet Entries per port: 4(v4/v6)
Nbr Entries Per port: 2(v4/v6)

Total Non_ECMP Count: 8x64x(4+2) = 3072

How to verify it
Without this change, the ECMP group count will be shown as Max_count in 'crm show resources all' command, and with this change the ECMP group count will be decreased by 24(3072/128).
2022-12-16 16:43:43 -08:00
Song Yuan
1fd2395f29
Fix port index for multi-asic (#13042)
Port indexes of front panel ports are not contiguous in multi-asic because we didn't distiguish between
front panel and internal ports, e.g., recycle ports. Fix this by assigning index to front panel port first
and then internal ports.
2022-12-16 09:12:36 -08:00
vaibhav-dahiya
0eb852c4a4 Revert "[Arista] Disable pcie checking on x86_64-arista_7050cx3_32s (#12900)"
This reverts commit dd87a791b4.
2022-12-15 22:56:19 -08:00
Maxime Lorrillere
298de5abef
Fix missing system_ref_core_clock_khz (#12663)
Add missing system_ref_core_clock_khz in Arista-7800R3A-36D2-C36 and Arista-7800R3A-36D2-C72
2022-12-13 22:20:16 -08:00
kenneth-arista
570e6fb28f
Add aggregate port_config.ini for Wolverine SKU (#12951)
Add missing aggregate port_config.ini needed by sonic-mgmt

Concatenate the ASIC specific port_config.ini from device/arista/x86_64-arista_7800r3a_36d2_lc/Arista-7800R3A-36D2-C36/[01] to create the aggregate file.
2022-12-13 22:02:14 -08:00
wenyiz2021
8a8d83b814
[arista] Add platform.json for arista chassis LC5 (#12949)
Add components all LCs
add platform.json for new sku LC5
mark thermal controllable to false to skip setter function of high/low threshold
2022-12-09 13:45:22 -08:00
Mai Bui
51a1eb112b
[device/celestica] Mitigation for command injection vulnerability (#11740)
Signed-off-by: maipbui <maibui@microsoft.com>
Dependency: [PR (#12065)](https://github.com/sonic-net/sonic-buildimage/pull/12065) needs to merge first.
#### Why I did it
1. `eval()` - not secure against maliciously constructed input, can be dangerous if used to evaluate dynamic content. This may be a code injection vulnerability.
2. `subprocess()` - when using with `shell=True` is dangerous. Using subprocess function without a static string can lead to command injection.
3. `os` - not secure against maliciously constructed input and dangerous if used to evaluate dynamic content.
4. `is` operator - string comparison should not be used with reference equality.
5. `globals()` - extremely dangerous because it may allow an attacker to execute arbitrary code on the system
#### How I did it
1. `eval()` - use `literal_eval()`
2. `subprocess()` - use `shell=False` instead. use an array string. Ref: [https://semgrep.dev/docs/cheat-sheets/python-command-injection/#mitigation](https://semgrep.dev/docs/cheat-sheets/python-command-injection/#mitigation)
3. `os` - use with `subprocess`
4. `is` - replace by `==` operator for value equality
5. `globals()` - avoid the use of globals()
2022-12-09 10:30:20 -05:00
byu343
dd87a791b4
[Arista] Disable pcie checking on x86_64-arista_7050cx3_32s (#12900)
This change is to disable the pcie firmware check done by Broadcom SAI. The change is needed for the Arista platform x86_64-arista_7050cx3_32s; otherwise, the check will fail, blocking the initialization.

There was a pcie firmware check added in brcm SDK and certain Arista hardwares do not compliant with the check, so we added the disable_pcie_firmware_check originally for x86_64-arista_7060dx4_32. For x86_64-arista_7050cx3_32s, it was able to pass the check but some firmware change done in August made it fail.
2022-12-07 01:28:26 -08:00
Ikki Zhu
ad49100985
Seastone: fix platform fan psu and temperature issues (#12567)
Why I did it:
Fix multiple seastone platform issues caused by sonic kernel upgrade.

How I did it:
Get gpio base id with new label path in gpio sys fs.

How to verify it:
After the change, show platform fan/psustatus/temperature works well.
2022-12-05 09:44:55 -08:00
Jing Kan
272f61d0f1
[Arista 720DT] Create SKU alias Arista-720DT-G48S4 (#12905) 2022-12-02 18:53:02 +08:00
Mai Bui
2b3e884209
[nokia] Replace os.system and remove subprocess with shell=True (#12100)
Signed-off-by: maipbui <maibui@microsoft.com>
Dependency: [https://github.com/sonic-net/sonic-buildimage/pull/12065](https://github.com/sonic-net/sonic-buildimage/pull/12065)
#### Why I did it
`subprocess.Popen()` and `subprocess.run()` is used with `shell=True`, which is very dangerous for shell injection.
`os` - not secure against maliciously constructed input and dangerous if used to evaluate dynamic content
`getstatusoutput` is dangerous because it contains `shell=True` in the implementation
#### How I did it
Replace `os` by `subprocess`, use with `shell=False`
Remove unused functions
2022-12-01 12:12:50 -05:00
Junchao-Mellanox
7d38b459e4
[Mellanox] Add device files for SN5600 (#12831)
- Why I did it
Add device files for new platform SN5600

- How I did it
Add device files for new platform SN5600

- How to verify it
Manual test
2022-11-30 19:47:50 +02:00
Mai Bui
0bd3be32e6
[device/marvell] Mitigation for security vulnerability (#11876)
#### Why I did it
`os` and `commands` modules are not secure against maliciously constructed input
`getstatusoutput` is detected without a static string, uses `shell=True`
#### How I did it
Eliminate the use of `os` and `commands`
Use `subprocess` instead
2022-11-30 00:06:28 -08:00
Neetha John
c323037815
Update ECN settings for storage backend (#12855)
Signed-off-by: Neetha John <nejo@microsoft.com>

Why I did it
ECN parameters need to be updated for storage backend

How I did it
Included the check for storage backend devices to update qos configs

How to verify it
Verified that the new ecn settings are applied on storage backend device.
Verified that the old ecn settings are applied for storage frontend, non storage frontend/backend devices
2022-11-29 10:19:06 -08:00
Mai Bui
95bb7f3b78
[device/ragile] Mitigation for security vulnerability (#11744)
Signed-off-by: maipbui <maibui@microsoft.com>
#### Why I did it
The [xml.etree.ElementTree](https://docs.python.org/3/library/xml.etree.elementtree.html#module-xml.etree.ElementTree) module is not secure against maliciously constructed data.
`os` - not secure against maliciously constructed input and dangerous if used to evaluate dynamic content
`subprocess.getstatusoutput` is dangerous because include shell=True in the implementation
#### How I did it
Remove xml. Use [lxml](https://pypi.org/project/lxml/) XML parsers package that prevent potentially malicious operation.
Replace `os` by `subprocess`
Use command as an array instead of string
Use `getstatusoutput_noshell` in `sonic_py_common` lib
2022-11-29 11:54:37 -05:00
Aravind Mani
724a285569
[DPB] Dell Z9332f port breakout changes (#12789) 2022-11-28 22:16:51 -08:00
andywongarista
e7f4da5823
[Arista] Enable ipv6 128b lpm on 720DT-48S (#12832)
Why I did it
Added to allow test_crm_route to pass; the test tries to add a /126 ipv6 route and this change is required in order for the count of available routes to be updated correctly.
2022-11-29 11:20:50 +08:00
bingwang-ms
818f275957
Add missing flags in qos template file (#12793) 2022-11-23 09:08:38 +08:00
Mai Bui
2f6b34a637
[device/juniper] Mitigation for security vulnerability (#11838)
Signed-off-by: maipbui maibui@microsoft.com
Dependency: [https://github.com/sonic-net/sonic-buildimage/pull/12065](https://github.com/sonic-net/sonic-buildimage/pull/12065)
#### Why I did it
`commands` module is not secure
command injection in `getstatusoutput` being used without a static string
#### How I did it
Eliminate `commands` module, use `subprocess` module only
Convert Python 2 to Python 3
2022-11-22 10:46:12 -05:00
Bohan Yang
16a15e9ae0
[Arista]Add media_settings.json for x86_64-arista_7800r3a_36d2_lc (#12444)
Why I did it
TX FIR tuning should be done based on the type of inserted transceiver

How I did it
Add media_settings.json which contains the tuning data for 100G optic and 400G optic.

How to verify it
Tested against x86_64-arista_7800r3a_36d2_lc
2022-11-21 14:50:34 -08:00
bingwang-ms
f402e6b5c6
Apply separated DSCP_TO_TC_MAP and TC_TO_QUEUE_MAP to uplink ports on dualtor (#12730)
Why I did it
The PR is to apply separated DSCP_TO_TC_MAP and TC_TO_QUEUE_MAP to uplink ports on dualtor.
The traffic with DSCP 2 and DSCP 6 from T1 is treated as lossless traffic.

DSCP    TC    Queue
2      2     2
6      6     6
Traffic with DSCP 2 or DSCP 6 from downlink is still treated as lossy traffic as before.

How I did it
Define DSCP_TO_TC_MAP|AZURE_UPLINK and TC_TO_QUEUE_MAP|AZURE_UPLINK.

How to verify it
Verified by UT
Verified by coping the new template to a testbed, and rendering a config_db.json
2022-11-21 11:42:28 -08:00
Samuel Angebault
b05d2e3729
[Arista] Update platform.json for 7060CX-32S (#12783)
Why I did it
Some sonic-mgmt platform_tests/api were failing on the 7060CX-32S

How I did it
Added the missing metadata in platform.json and platform_components.json
This is purely test data and does not impact our API implementation.

How to verify it
Run platform_tests / api and expect 100% pass rate.
2022-11-21 09:19:24 -08:00
Samuel Angebault
46bd5f695c
[Arista] Update platform.json for 7260CX3-64 (#12757)
Why I did it
Some sonic-mgmt platform_tests/api were failing on the 7260CX3-64

How I did it
Added the missing metadata in platform.json and platform_components.json
This is purely test data and does not impact our API implementation.

How to verify it
Run platform_tests/api and expect 100% passrate.
2022-11-19 12:45:37 -08:00
Neetha John
dc21c9605e
[Profile separation] MMU infrastructure update for TD2 (#12626)
Signed-off-by: Neetha John <nejo@microsoft.com>

Why I did it
There is a need to have separate profiles on compute and storage and this infra update will help achieve that

How I did it
Moved buffer pool/profile and qos definitions on TD2 to a common folder and all TD2 hwsku's will reference that folder
2022-11-17 12:58:11 -08:00
wenyiz2021
fecc7c6a1d
[Arista] [platform] Add thermal info in platform.json (#12714)
add 1 more thermal entry
2022-11-17 11:05:16 -08:00
Samuel Angebault
4d62689914
[Arista] Add pcie.yaml to 7280CR3-32D4 variants (#12700) 2022-11-14 13:29:30 -08:00
wenyiz2021
39ebf80f1b
[arista] [chassis] Add psu/thermal info in platform.json for sup (#12667)
update psu info in platform.json on sup
2022-11-14 11:03:52 -08:00
Ying Xie
a544a07931
Enable Dx010 LPM (#12642)
Why I did it
DX010 platform has limited routing table size.

How I did it
Enabling LPM.

Signed-off-by: Ying Xie <ying.xie@microsoft.com>
2022-11-09 08:15:41 -08:00
wenyiz2021
f581a77a64
[Chassis] [Arista] correct platform.json for sup and LC6 names (#12627)
add platform.json separately for LC6 that has different name, bc of supporting macsec
Signed-off-by: Wenyi Zhang <wenyizhang@microsoft.com>
2022-11-08 12:56:39 -08:00
judyjoseph
c259c996b4
Use the macsec_enabled flag in platform to enable macsec feature state (#11998)
* Use the macsec_enabled flag in platform to enable macesc feature state
* Add macsec supported metadata in DEVICE_RUNTIME_METADATA
2022-11-08 11:03:38 -08:00
arlakshm
c4be3a51aa
[chassis][Arista] add supervisor to the platform_env.conf (#12615)
Why I did it
Fixes #12614

How I did it
In the container_checker the database_chassis is added to expected container if device is supervisor
To detect the device is superviso, add supervisor=1 to the platform_env.conf of 7808 sup platform

How to verify it
run container_checker monit check
Signed-off-by: Arvindsrinivasan Lakshmi Narasimhan <arlakshm@microsoft.com>
2022-11-07 15:30:02 -08:00
Mai Bui
5b0c4ec1e6
[device/accton] Replace os.system and remove subprocess with shell=True (#11985)
Signed-off-by: maipbui <maibui@microsoft.com>
#### Why I did it
`subprocess.Popen()` and `subprocess.run()` is used with `shell=True`, which is very dangerous for shell injection.
`os` - not secure against maliciously constructed input and dangerous if used to evaluate dynamic content
#### How I did it
Replace `os` by `subprocess`
Remove unused functions
2022-11-07 10:31:32 -05:00
roman_savchuk
a31a4e7f82
Revert "[Barefoot] Add xon_offset to pg_profile_lookup.ini (#12073)" (#12568)
Why I did it
This changes should go with updated SDE for BFN. Without update we do see orchagent core dump.

How I did it
Revert changes

How to verify it
Deploy topology. No core dump appears
2022-11-02 09:10:07 +08:00
Vivek
a68ce12dd6
[Mellanox] [SKU] Added Mellanox-SN4700-A96C8V8 SKU (#12347)
- Why I did it
A new SKU for MSN4700 Platform i.e. Mellanox-SN4700-V16A96

Requirements:

Breakout:
Port 1-24: 4x25G(4)[10G,1G]
Port 25-28: 2x100G[200G,50G,40G,25G,10G,1G]
Port 29-32: 2x200G[100G,50G,40G,25G,10G,1G]
Downlinks: 96 (1-24) + 4 (25-28)
Uplinks: 4 (29-32)
Shared Headroom: Enabled
Over Subscribe Ratio: 1:4
Default Topology: T0
Default Cable Length for T1: 5m
VxLAN source port range set: No
Static Policy Based Hashing Supported: No

Additional Details:
QoS params: The default ones defined in qos_config.j2 will be applied
Small Packet Percentage: Used 50% for traditional buffer model Note: For dynamic model, the value defined in LOSSLESS_TRAFFIC_PATTERN|AZURE|small_packet_percentage is used
SKU was drafted under the assumption that the downlink ports uses xcvr's that will only support the first 4 lanes of the physical port they are connected to. Hence for the ports 1-24, the last four lanes are not used
Cable Lengths used for generating buffer_defaults_{t0,t1}.j2 values

Signed-off-by: Vivek Reddy Karri <vkarri@nvidia.com>
2022-10-20 09:50:07 +03:00