Why I did it
System health config is missing in few Dell platforms.
How I did it
Added system health monitoring config and its related API's
How to verify it
show system-health summary/detail commands.
Why I did it
Dell S6100 Platform components needs to be updated.
How I did it
Modified platform.json to fix the issue.
How to verify it
Run sonic-mgmt component test and check whether it passes.
These devices will not reliabily report the proper devid and vendorid
when reading it is read directly from the pci config space.
It can be read but shouldn't be compared against some fixed value like
the one stored in pcie.yaml.
Since this makes pcied unhappy, the simplest path forward is to just
remove this device from monitoring.
* sonic-buildimage: Add 7060DX5-64S brcm tunnel config
Add bcm_tunnel_term_compatible_mode: 1 support, which allows
Loopback configuration to no longer result in SAI failure
"tunnel terminator add failed with error Feature unavailable"
that caused Orchagent SIGABRT
Signed-off-by: Aaron Payment <aaronp@arista.com>
* sonic-buildimage: Set port config ENABLE:0 in 7060DX5-64S brcm config
Set ENABLE:0 for the front panel ports in the brcm config so that the
ports are default admin down. This change prevents the issue that ports
are able to link up and pass traffic resulting in mac learn events after
SAI create switch and before SAI admin state up. The unexpected mac learn events
resulted in Orch agent crash in PortsOrch init, which occurs after SAI
create switch and before SAI admin state up.
* fix sensors.conf on CatalinaDD
* Add support for two sfp ports
* Add copper 50g tuning to babbagelp on catalina
---------
Signed-off-by: Aaron Payment <aaronp@arista.com>
Co-authored-by: enes.oncu <enes.oncu@arista.com>
Co-authored-by: Boyang Yu <byu@arista.com>
Why I did it
Update the platform_reboot of Nokia Platform IXR-7250E-36x400G to displays the correct reboot-cause history when reboot from supervisor card.
Work item tracking
Microsoft ADO (number only):
How I did it
Modify the platform_reboot script to copy the correct reboo-cause.txt file from NDK to the /host/reboot-cause directory at the down cycle when the reboot is issued from Supervisor (for both reboot right after install a new image and normal reboot)
Signed-off-by: mlok <marty.lok@nokia.com>
- Why I did it
Add new breakout modes to be used in PAM4 supported cables
- How I did it
- How to verify it
Verified the 50G per lane breakout modes are applied properly on the switch
Signed-off-by: Vivek Reddy Karri <vkarri@nvidia.com>
Why I did it
Add PDDF support on following Ufispace platforms with Broadcom ASIC
S9110-32X
S8901-54XC
S7801-54XS
S6301-56ST
How I did it
Add PDDF configuration files, scripts and python files
How to verify it
Run pddf commands and show commands.
Signed-off-by: nonodark <ef67891@yahoo.com.tw>
Why I did it
To overwrite the default DSCP_TO_TC_MAP for tunnel traffic, the attribute sai_tunnel_support must be set to 1.
Before this change, the attribute is set only on dual-tor platform when remap is enabled.
This PR is to set the attribute on all Arista-7260cx3 devices.
Work item tracking
Microsoft ADO 24785776
How I did it
Update the config.bcm template for Arista-7260cx3 devices.
How to verify it
The change is verified by manually rendering the j2 on a T1 testbed.
Syncd will abort in handleSaiCreateStatus with
'Encountered failure in create operation, exiting orchagent,SAI API: SAI_API_TUNNEL, status: SAI_STATUS_NOT_SUPPORTED'
The fix is to add the following brcm config to prevent the error:
sai_tunnel_global_sip_mask_enable=1
bcm_tunnel_term_compatible_mode=1
Signed-off-by: Aaron Payment <aaronp@arista.com>
#### Why I did it
When a kernel crash occurs, the system will reboot to the kdump capture kernel if kdump is enabled (`config kdump enable`). In the kdump capture boot, it only stores the crash information, and then reboot the system to a normal boot.
In this boot, no SONiC service is started but it invokes `reboot` which is actually the SONiC reboot that depends on SONiC services. There is a logic to skip all SONiC stuff and invoke platform reboot in SONiC reboot to avoid issues.
However, on Nvidia platforms, the platform reboot still depends on SONiC services, which can cause issues.
So, the Debian reboot is called directly in platform reboot if it is invoked from the kdump capture boot.
#### How I did it
Manual test
- Why I did it
Added the fwtrace config files in order to be able to call the mlxstrace utility during the show techsupport dump.
Work item tracking
Microsoft ADO (number only):
- How I did it
Added fwtrace config files. Added path to these files to sai.profile for each mlnx device.
- How to verify it
Execute the show techsupport command and check if mlxstrace output is in system dump.
Signed-off-by: vadymhlushko-mlnx <vadymh@nvidia.com>
* [E1031] add platform specific reboot command support
Why I did it
E1031: add platform specific cold reboot support
How I did it
Use the CPLD to trigger board level power cycle when cold reboot
How to verify it
Do reboot stress test and check the reboot cause history
* [E1031] try to umount filesystem before power cycle reboot
* [E1031] remove fstrim in customized reboot script
Why I did it
Certain all-numeric device IDs of PCI devices in the pcie.yaml file are left unquoted, leading to false mismatch flags in the pcie daemon and subsequently leads to log flooding. This PR fixes that issue.
Work item tracking
Microsoft ADO (number only): 24578930
How I did it
Added quotes around numeric PCI devices in the pcie.yaml files of the following platforms:
x86_64-mlnx_msn2700-r0
x86_64-mlnx_msn4600c-r0
How to verify it
Install latest image after the merge and verify that syslogs are not flooded with PCI device mismatch errors
Why I did it
port_config.ini and hwsku.json are needed to generate the default config
switch_type needs to be "dpu" to spawn the right set of processes during dvs initialization and to make sure that DASH APIs can be handled properly
Work item tracking
Microsoft ADO 24375371:
How I did it
Use the same hwsku.json and port_config.ini for DPU-2P as the ones used for Nvidia-MBF2H536C SKU in nvidia-sonic sonic-buildimage repo.
Set switch_type to "dpu" in DEVICE_METADATA configuration to make sure DASH specific APIs are handled properly
Signed-off-by: Prabhat Aravind <paravind@microsoft.com>
* [202012][platform/barefoot] (#8543)
Why I did it
Pcied running by python 2.
How I did it
dropped python2 support and add python3 support for pcied in file docker-pmon.supervisord.conf.j2
How to verify it
docker exec pmon supervisorctl status
* [Netberg][nba710] Added initial support for Aurora 710
Signed-off-by: Andrew Sapronov <andrew.sapronov@gmail.com>
---------
Signed-off-by: Andrew Sapronov <andrew.sapronov@gmail.com>
Co-authored-by: Kostiantyn Yarovyi <kostiantynx.yarovyi@intel.com>
Define a generic 2-port NPU SKU for docker-sonic-vs to
enable DASH vstests to pass on azure pipelines
Work item tracking
Microsoft ADO 24375371:
How I did it
Define a generic 2-port NPU hwsku that is used only for DASH-specific vstests.
Signed-off-by: Prabhat Aravind <paravind@microsoft.com>
Added support data for fabric monitoring in CONFIG_DB
The CONFIG_DB now has the FABRIC_MONITOR|FABRIC_MONITOR_DATA table for default value for fabric port monitoring. An example output of getting this table is:
sonic-db-cli CONFIG_DB hgetall "FABRIC_MONITOR|FABRIC_MONITOR_DATA"
{'monErrThreshCrcCells': '1', 'monErrThreshRxCells': '61035156', 'monPollThreshIsolation': '1', 'monPollThreshRecovery': '8'}
The CONFIG_DB now also has a table for each fabric port for its isolate status.
An example output of getting this table is:
sonic-db-cli CONFIG_DB hgetall "FABRIC_PORT|Fabric20"
{'alias': 'Fabric20', 'isolateStatus': 'False', 'lanes': '20'}
Why I did it
sonic-mgmt is failing tests due to invalid test data in platform.json
Fwutil is upset the chassis name in the platform_component.json of the 7060CX-32S
How I did it
Fixed the aforementioned issues
Why I did it
Work item tracking
Microsoft ADO (24182162):
How I did it
update the config.bcm to set the default fec RS 100G Linecard
How to verify it
Tests on chassis
Why I did it
fix possible cpld race read issue between watchdog and reboot cause
process
How I did it
Use fcntl.flock to limit parallel access to cpld sys file
How to verify it
It can be simulated and verified with following python script
``` python3
import fcntl
import signal
import threading
exit_flag = False
def get_cpld_reg_value(getreg_path, register):
file = open(getreg_path, 'w+')
# Acquire an exclusive lock on the file
fcntl.flock(file, fcntl.LOCK_EX)
try:
file.write(register + '\n')
file.flush()
# Seek to the beginning of the file
file.seek(0)
# Read the content of the file
result = file.readline().strip()
finally:
# Release the lock and close the file
fcntl.flock(file, fcntl.LOCK_UN)
file.close()
return result
def cpld_read(thread_num, cpld_reg, expect_val):
while not exit_flag:
val
= get_cpld_reg_value("/sys/devices/platform/dx010_cpld/getreg",
cpld_reg)
#print(f"Thread {thread_num}: get cpld reg {cpld_reg}, value
{val}")
if val != expect_val:
print(f"Thread {thread_num}: get cpld reg {cpld_reg}, value
{val}, expect_val {expect_val}")
def signal_handler(sig, frame):
global exit_flag
print("Ctrl+C detected. Quitting...")
exit_flag = True
if __name__ == '__main__':
# Register the signal handler for Ctrl+C
signal.signal(signal.SIGINT, signal_handler)
t1 = threading.Thread(target=cpld_read, args=(1, '0x103', '0x11',))
t2 = threading.Thread(target=cpld_read, args=(2, '0x141', '0x00',))
t1.start()
t2.start()
t1.join()
t2.join()
```
Why I did it
Update the device data files to support 1024 LAGs for Nokia IXR7250E platform
fixes https://github.com/Nokia-ION/ndk/issues/15
How I did it
Update the lag_id_end=1024 in chassisdb.conf file and add the trunk_group_max_members=16 in the BCM config file
How to verify it
check to allow to create lag ids up to 1024 with 16 port members
Signed-off-by: mlok <marty.lok@nokia.com>
* [202205] Update SOC properties for DLR_INIT based pfcwd recovery (#15217)
Why I did it
Update soc properties for certain roles that need to use pfcwd dlr init based recovery mechanism
How to verify it
Updated the templates on a 7050cx3 dual tor and 7260 T1 which satisfies these conditions and validated pfcwd recovery which uses DLR_INIT based mechanism. Also validated that this mechanism is not used on 7050cx3 single tor with the updated templates
Signed-off-by: Neetha John <nejo@microsoft.com>
#### Why I did it
To add new SKU Mellanox-SN4700-O8C48 with following requirements:
| Port configuration | Value |
| ------ |--------- |
| Breakout mode for each port |**Defined in port mapping** |
| Speed of the port | **Defined in Port mapping** |
| Auto-negotiation enable/disable | **No setting required** |
| FEC mode | **No setting required** |
|Type of transceiver used | **Not needed**|
Buffer configuration | Value
------ |---------
Shared headroom | **Enabled**
Shared headroom pool factor | **2**
Dynamic Buffer | **Disable**
In static buffer scenario how many uplinks and downlinks? | **48x100G Downlinks and 8x400G uplinks**
2km cable support required? | **Yes**
Switch configuration | Value
------ |---------
Warmboot enabled? | **yes**
Should warmboot be added to SAI profile when enabled? | **yes**
Is VxLAN source port range set? | **No**
Should Vxlan source port range be added to SAI profile when set. | **No**
Is Static Policy Based Hashing enabled? | **No**
Port Mapping
| Ports | Mode |
| ------ |--------- |
| 1-12 | 2x100G |
| 13-20 | 1x400G |
| 21-32 | 2x100G |
Number of Uplinks / Downlinks:
T1 topology: **48x100G Downlinks 8x400G uplinks**.
Length of downlink: **40m**
Length of uplink: **2000m**
##### Work item tracking
- Microsoft ADO **(number only)**:
#### How I did it
Defined the SKU as per requirements
#### How to verify it
Load the SKU and verify if all links come up and traffic passes.
#### A picture of a cute animal (not mandatory but encouraged)
* [armhf][Nokia-7215]Add HWSKU files for new SAI
Add new easy bringup (EZB) files for new SAI 1.11.0
* [Nokia][devicedata]Modified the port autoneg default setting for Nokia-7215 platform
[armhf][Nokia-7215]Update profile.ini
Why I did it
Optimize Silverstone led init process, this linkscan = off can cause the sonic port link status async with bcm shell after reboot.
How I did it
Remove redundant code.
How to verify it
After reboot, the ports can linkup normally.
* Update PG headroom settings ports based on port speed/cable length
* Updated XOFF settings to use chip level numbers than core
* Updated PG headroom based on uplink/downlink side
* fix for sonic-config-gen tests
* More fixes for unit test cases
* more test fixes
* Merged multiple functions into one
Add new Nokia build target and establish an arm64 build:
Platform: arm64-nokia_ixs7215_52xb-r0
HwSKU: Nokia-7215-A1
ASIC: marvell
Port Config: 48x1G + 4x10G
How I did it
- Change make files for saiserver and syncd to use Bulleseye kernel
- Change Marvell SAI version to 1.11.0-1
- Add Prestera make files to build kernel, Flattened Device Tree blob and ramdisk for arm64 platforms
- Provide device and platform related files for new platform support (arm64-nokia_ixs7215_52xb-r0).
- Why I did it
Update SAI xml file to align with the default SKU
- How I did it
Update the SN5600 SAI xml file
- How to verify it
Install image on SN5600 device
Why I did it
Update ECN settings for T2 chassis
How I did it
Updated qos config file to load these settings during switch bootup
How to verify it
Verified on line card on T2 chassis
- Why I did it
Update the sensors.conf and pcie.yaml according to the real hardware.
- How I did it
Update the sensors.conf and pcie.yaml
- How to verify it
run relevant sonic-mgmt test cases.
Signed-off-by: Kebo Liu <kebol@nvidia.com>
Why I did it
Today at most 128 LAGs are supported. This is not sufficient if there are many LAGs with just few ports.
How I did it
Increase LAG Ids to 1024 for DNX device.
Why I did it
When reboot the chassis by issuing "sudo reboot" on Supervisor card. The internal midplane communication xe0 should be shutdown to avoid double reboot on the linecard.
Added a udev link rule to disable the autoneg on AMD xgbe port Xe0 and Xe1 and make the setting in sync with the peer Broadcom greyhound ports.
How I did it
Modify the Nokia-7250IXRE specific reboot script on the Supervisor card to shutdown the internal interface xe0. Also move reboot linecard code to the top of the script to make sure the notification has been send to Linecard before shutdown the xe0 interface.
Introduced a new rule 80-net-by-driver.link to disable the autoneg on the AMD size. This change requires the latest NDK which contains the change to set the autoneg on the xe0 and xe1 port on the Greyhound.
Signed-off-by: mlok <marty.lok@nokia.com>
Why I did it
Support Egress Mirroring on supported Arista platforms
How I did it
Add necessary soc properties for egress mirroring recycle ports to be created
Signed-off-by: Nathan Wolfe <nwolfe@arista.com>
Updated asic_port_names for all Arista LC SKUs to follow latest naming
conventions to remove redundant ASICx suffix. For
Arista-7800R3-48CQ2-C48, added the asic_port_name mapping.
Why I did it
sonic-sfp based sfp impl would be deprecated in future, change to sfp-refactor based implementation.
How I did it
Use the new sfp-refactor based sfp implementation for seastone.
How to verify it
Manual test sfp platform api or run sfp platform test cases.
Why I did it
For better accounting purposes, updating the ingress lossy traffic profile to use static threshold. This change is only intended for Th devices using RDMA-CENTRIC profiles
How I did it
Update the buffer templates for Th devices in RDMA-CENTRIC folder to use the correct threshold
How to verify it
Verified the changes manually on a Th device.
Existing unit tests render Th template from the RDMA-CENTRIC folder. Updated the expected output to use the correct threshold
Why I did it
Update dynamic threshold to -1 to get optimal performance for RDMA traffic
How I did it
Modified pg_profile_lookup.ini to reflect the correct value
Signed-off-by: Neetha John <nejo@microsoft.com>
Provide platform-components.json for Clearwater2 and Wolverine
These files are needed for fwutil platform sonic-mgmt tests to pass.
Fix PikeZ platform_components.json
Co-authored-by: Patrick MacArthur <pmacarthur@arista.com>
Co-authored-by: Andy Wong <andywong@arista.com>
Why I did it
Platform cases test_tx_disable, test_tx_disable_channel, test_power_override failed in dx010.
How I did it
Add i2c access algorithm for CPLD i2c adapters.
How to verify it
Verify it with platform_tests/api/test_sfp.py::TestSfpApi test cases.
To support 64 cores on arista skus. Fixesaristanetworks/sonic#77
Remapped recycle ports to lowers core port ids and set appl_param_nof_ports_per_modid to 64.
Why I did it
Seastone does not have the psu fans' status led, need to reflect it in platform.json.
How I did it
Set the psu fans status led available to false.
How to verify it
Verify it with platform_tests/api/test_psu_fans.py::TestPsuFans::test_set_fans_led case.
Why I did it
The sensors and sensord processes were reporting data on unused sensors.
This lead to ALARM messages or erroneous values that could be misinterpreted.
How I did it
Ignore the affected sensors in the sensors.conf
How to verify it
Check that there are no longer ALARM messages from sensord in the syslog or in the output of sensors
Why I did it
This change specifies the tuning values for each lane of the B52 phy chips. These values can be different for different ports. The values being set are under the assumption of optical transceivers. This change depends on the change to sonic-swss: sonic-net/sonic-swss#2158.
How to verify it
We verified the values are correctly set on the B52 chips of Arista 7280cr3, by reading them from the debug cli of the B52 driver.
*Why I did it
The current sonic image bringup failed on Silverstone:
4:56:15.957705 sonic NOTICE swss#orchagent: :- addDecapTunnel: Create overlay loopback router interface oid:6000000000520
*How I did it
Enable bcm switch Tunnel function on Silverstone.
*How to verify it
The new sonic image can bringup OK on Silverstone.
- Why I did it
Advance hw-mgmt service to V.7.0020.4100
Add missing thermal sensors that are supported by hw-mgmt package
Delay system health service before hw-mgmt has started on Mellanox platform in order to avoid reading some sensors before ready.
Depends on sonic-net/sonic-linux-kernel#305
- How I did it
1. Update hw mgmt version
2. Add missing sensors
3. Delay service
- How to verify it
Regression test.
Signed-off-by: Stephen Sun <stephens@nvidia.com>
Why I did it
1. fix chassis test_set_fans_led case
2. fix chassis get_name case mismatch issue
3. fix fan_drawer test_set_fans_speed
4. fix component test_components test case
How I did it
Add corresponding configuration into chassis json file
How to verify it
Run platform tests cases to verify these failure cases
Why I did it
The 720DT-48S platform has variants with different chassis names, and these need to all be included in platform_components.json to ensure that sonic-mgmt platform_tests/fwutil/test_fwutil.py::test_fwutil_show passes
How I did it
Updated platform_components.json with the variant names for 720DT-48S.
How to verify it
Ran aforementioned testcase and verified that it passes on the different variants.
Why I did it
The current platform.json contains entries for thermals and SFPs that do not exist on Wolverine.
How I did it
I removed the incorrect entries.
How to verify it
Verify using applicable sonic-mgmt platform API tests.
- Why I did it
Support DSCP remapping in dual ToR topo on T0 switch for SKU Mellanox-SN4600c-C64, Mellanox-SN4600c-D48C40, Mellanox-SN2700, Mellanox-SN2700-D48C8.
- How I did it
Regarding buffer settings, originally, there are two lossless PGs and queues 3, 4. In dual ToR scenario, the lossless traffic from the leaf switch to the uplink of the ToR switch can be bounced back.
To avoid PFC deadlock, we need to map the bounce-back lossless traffic to different PGs and queues. Therefore, 2 additional lossless PGs and queues are allocated on uplink ports on ToR switches.
On uplink ports, map DSCP 2/6 to TC 2/6 respectively
On downlink ports, both DSCP 2/6 are still mapped to TC 1
Buffer adjusted according to the ports information:
Mellanox-SN4600c-C64:
56 downlinks 50G + 8 uplinks 100G
Mellanox-SN4600c-D48C40, Mellanox-SN2700, Mellanox-SN2700-D48C8:
24 downlinks 50G + 8 uplinks 100G
- How to verify it
Unit test.
Signed-off-by: Stephen Sun <stephens@nvidia.com>
Why I did it
Default pcieutil uses one configuration for all models of platform
How I did it
Take the configuration file as base for all models of concrete platform where model-specific devices are not listed in this configuration
How to verify it
Run pmon#pcied and verify that there is no error/warning logs on initialization step
Why I did it
fix DX010 fan drawer and watchdog platform test case issues
How I did it
1. Add fan_drawer get_maximum_consumed_power support
2. Adjust maximum watchdog timeout value check
How to verify it
Run test_fan_drawer and test_watchdog test cases.
Why I did it
By specifying 'status_led' 'controllable' to false for psu section, it means the platform is not yet supporting psu status led
How I did it
specify 'status_led' 'controllable' to false for psu section
How to verify it
by running test in pdb, manually add {'status_led' : {'controllable' : False}} in dictionary
this flag will be able to get False and skip testing:
ce290c735d/tests/platform_tests/api/test_psu.py (L337)
After upgrade to brcmsai 8.1, the sdk running environment (container) recommended with mininum memory size as below
TH4/TD4(ltsw) uses 512MB
TH3 used 300MB
Helix4/TD2/TD3/TH/TH 256 MB
Base on this requirement, adjust the default syncd share memory size and set the memory size for special ACISs in platform_env.conf file for different types of Broadcom ASICs.
How I did it
Add the platform_env.conf file if none of it for broadcom platform (base on platform_asic file)
Add the 'SYNCD_SHM_SIZE' and set the value
for ltsw(TD4/TH4) devices set to 512M at least (update the platform_env.conf)
for Td2/TH2/TH devices set to 256M
for TH3 set to 300M
verify
How to verify it
verify the image with code fix
Check with UT
Check on lab devices
On a problematic device which cannot start successfully
Run with the command
$ cat /proc/linux-kernel-bde
Broadcom Device Enumerator (linux-kernel-bde)
Module parameters:
maxpayload=128
usemsi=0
dmasize=32M
himem=(null)
himemaddr=(null)
DMA Memory (kernel): 33554432 bytes, 0 used, 33554432 free, local mmap
No devices found
$ docker rm -f syncd
syncd
$ sudo /usr/bin/syncd.sh start
Cannot get Broadcom Chip Id. Skip set SYNCD_SHM_SIZE.
Creating new syncd container with HWSKU Force10-S6000
a4862129a7fea04f00ed71a88715eac65a41cdae51c3158f9cdd7de3ccc3dd31
$ docker inspect syncd | grep -i shm
"ShmSize": 67108864,
"Tag": "fix_8.1_shm_issue.67873427-9f7ca60a0e",
On Normal device
$ docker inspect syncd | grep -i shm
"ShmSize": 268435456,
"Tag": "fix_8.1_shm_issue.67873427-9f7ca60a0e"
change the config syncd_shm.ini to b85=128m
$ docker rm -f syncd
syncd
$ sudo /usr/bin/syncd.sh start
Creating new syncd container with HWSKU Force10-S6000
3209ffc1e5a7224b99640eb9a286c4c7aa66a2e6a322be32fb7fe2113bb9524c
$ docker inspect syncd | grep -i shm
"ShmSize": 134217728,
"Tag": "fix_8.1_shm_issue.67873427-9f7ca60a0e",
change the config under
/usr/share/sonic/device/x86_64-dell_s6000_s1220-r0/Force10-S6000/platform_env.conf
and run command
$ cat /usr/share/sonic/device/x86_64-dell_s6000_s1220-r0/platform_env.conf
SYNCD_SHM_SIZE=300m
$ sudo /usr/bin/syncd.sh start
Creating new syncd container with HWSKU Force10-S6000
897f6fcde1f669ad2caab7da4326079abd7e811bf73f018c6dacc24cf24bfda5
$ docker inspect syncd | grep -i shm
"ShmSize": 314572800,
"Tag": "fix_8.1_shm_issue.67873427-9f7ca60a0e",
Signed-off-by: richardyu-ms <richard.yu@microsoft.com>
Eariler the SDK stat polling was erroneously set to once every msec
which is far more frequent than required by SWSS. The new setting, which
is consistent with other vendor SKUs, is once a second. The net result
is reduced CPU MHz by syncd.
Why I did it
Fix the following issues for Seastone platform:
- system-health issue: show system-health detail will not complete #9530, Celestica Seastone DX010-C32: show system-health detail fails with 'Chassis' object has no attribute 'initizalize_system_led' #11322
- show platform firmware updates issue: Celestica Seastone DX010-C32: show platform firmware updates #11317
- other platform optimization
How I did it
Modify and optimize the platform implememtation.
How to verify it
Manual run the test commands described in these issues.
Why I did it
Ragile adapter ra-b6510-32c ra-b6510-48v8c ra-b6910-64c ra-b6920-4s to kernel 5.x
Signed-off-by: “pettershao” pettershao@ragilenetworks.com
Fix code issue when SonicV2Connector.get() return None.
#### Why I did it
When database key does not exist, SonicV2Connector.get() will return None.
Code will break if not check return value.
#### How I did it
Check SonicV2Connector.get() return value before use it.
#### How to verify it
Pass all E2E test case.
Why I did it
Add two platform that support s3IP framework
How I did it
Add two platforms supporting S3IP SYSFS (TCS8400, TCS9400)
How to verify it
Manual test
Adding platform support for FS s5800-48t4s and s5800-48t8s-mars8p.
Both s5800-48t4s and s5800-48t8s-mars8p have 48 * 10/100/1000 Base-T ports, 4 * 10GE SFP+ Ports on Centec TsingMa.
s5800-48t4s is different from s5800-48t8s-mars8p in that:
The phy chip used by s5800-48t4s is Marvell 88e1680;
The phy chip used by s5800-48t4s-mars8p is Centec ctc21108;
Why I did it
This is to fix test_forward_ip_packet_with_0xffff_chksum_tolerant test failure on 720DT-48S. IP packets with checksum set to 0xffff will be forwarded with the same checksum on this platform, instead of updating to the correct value.
How I did it
Add bcm config sai_verify_incoming_chksum=0 so that checksum is updated instead of being left unchanged when checksum is 0xffff. Note that packets with invalid checksum are still dropped with this config.
Why I did it
On DNX (J2/J2c/J2c+) platforms, Single Path Nexthops and ECMp Nexthop resources(FECs) are shared. BRCM SAI do not have partition of this resource, and hence more single path Nexthop entries, causes ECMP programming to fail in scaled setup.
How I did it
Broadcom provided SAI changes to reserve resources for single path nexthop entries(More details in CSP: https://brcmsemiconductor-csm.wolkenservicedesk.com/wolken-support/allcases/request-details?requestId=CS00012251649).
Along with SAI changes, they provided configurable Macro/flag to reserve NON_ECMP entries.
This PR is to add that flag in various sai.profile files wherever applicable.
PS: We are reserving 3072 single path Nexthop entries on each Linecard. Calculation is as follows.
Max Slots per chassis: 8
Max No of Ports(each LC): 64
MyIP/Subnet Entries per port: 4(v4/v6)
Nbr Entries Per port: 2(v4/v6)
Total Non_ECMP Count: 8x64x(4+2) = 3072
How to verify it
Without this change, the ECMP group count will be shown as Max_count in 'crm show resources all' command, and with this change the ECMP group count will be decreased by 24(3072/128).
Port indexes of front panel ports are not contiguous in multi-asic because we didn't distiguish between
front panel and internal ports, e.g., recycle ports. Fix this by assigning index to front panel port first
and then internal ports.
Add missing aggregate port_config.ini needed by sonic-mgmt
Concatenate the ASIC specific port_config.ini from device/arista/x86_64-arista_7800r3a_36d2_lc/Arista-7800R3A-36D2-C36/[01] to create the aggregate file.
Signed-off-by: maipbui <maibui@microsoft.com>
Dependency: [PR (#12065)](https://github.com/sonic-net/sonic-buildimage/pull/12065) needs to merge first.
#### Why I did it
1. `eval()` - not secure against maliciously constructed input, can be dangerous if used to evaluate dynamic content. This may be a code injection vulnerability.
2. `subprocess()` - when using with `shell=True` is dangerous. Using subprocess function without a static string can lead to command injection.
3. `os` - not secure against maliciously constructed input and dangerous if used to evaluate dynamic content.
4. `is` operator - string comparison should not be used with reference equality.
5. `globals()` - extremely dangerous because it may allow an attacker to execute arbitrary code on the system
#### How I did it
1. `eval()` - use `literal_eval()`
2. `subprocess()` - use `shell=False` instead. use an array string. Ref: [https://semgrep.dev/docs/cheat-sheets/python-command-injection/#mitigation](https://semgrep.dev/docs/cheat-sheets/python-command-injection/#mitigation)
3. `os` - use with `subprocess`
4. `is` - replace by `==` operator for value equality
5. `globals()` - avoid the use of globals()
This change is to disable the pcie firmware check done by Broadcom SAI. The change is needed for the Arista platform x86_64-arista_7050cx3_32s; otherwise, the check will fail, blocking the initialization.
There was a pcie firmware check added in brcm SDK and certain Arista hardwares do not compliant with the check, so we added the disable_pcie_firmware_check originally for x86_64-arista_7060dx4_32. For x86_64-arista_7050cx3_32s, it was able to pass the check but some firmware change done in August made it fail.
Why I did it:
Fix multiple seastone platform issues caused by sonic kernel upgrade.
How I did it:
Get gpio base id with new label path in gpio sys fs.
How to verify it:
After the change, show platform fan/psustatus/temperature works well.