- Why I did it
Support Mellanox-SN4600C-C64 as T1 switch in dual-ToR scenario
This is to port #11032 and #11299 from 202012 to master.
Support additional queue and PG in buffer templates, including both traditional and dynamic model
Support mapping DSCP 2/6 to lossless traffic in the QoS template.
Add macros to generate additional lossless PG in the dynamic model
Adjust the order in which the generic/dedicated (with additional lossless queues) macros are checked and called to generate buffer tables in common template buffers_config.j2
Buffer tables are rendered via using macros.
Both generic and dedicated macros are defined on our platform. Currently, the generic one is called as long as it is defined, which causes the generic one always being called on our platform. To avoid it, the dedicated macrio is checked and called first and then the generic ones.
Support MAP_PFC_PRIORITY_TO_PRIORITY_GROUP on ports with additional lossless queues.
On Mellanox-SN4600C-C64, buffer configuration for t1 is calculated as:
40 * 100G downlink ports with 4 lossless PGs/queues, 1 lossy PG, and 3 lossy queues
16 * 100G uplink ports with 2 lossless PGs/queues, 1 lossy PG, and 5 lossy queues
Signed-off-by: Stephen Sun <stephens@nvidia.com>
* Support new platform SN2201 and RJ45 port
Signed-off-by: Kebo Liu <kebol@nvidia.com>
* remove unused import and redundant function
Signed-off-by: Kebo Liu <kebol@nvidia.com>
* fix error introduced by rebase
Signed-off-by: Kebo Liu <kebol@nvidia.com>
* Revert the special handling of RJ45 ports (#56)
* Revert the special handling of RJ45 ports
sfp.py
sfp_event.py
chassis.py
Signed-off-by: Stephen Sun <stephens@nvidia.com>
* Remove deadcode
Signed-off-by: Stephen Sun <stephens@nvidia.com>
* Support CPLD update for SN2201
A new class is introduced, deriving from ComponentCPLD and overloading _install_firmware
Change _install_firmware from private (starting with __) to protected, making it overloadable
Signed-off-by: Stephen Sun <stephens@nvidia.com>
* Initialize component BIOS/CPLD
Signed-off-by: Stephen Sun <stephens@nvidia.com>
* Remove swb_amb which doesn't on DVT board any more
Signed-off-by: Stephen Sun <stephens@nvidia.com>
* Remove the unexisted sensor - switch board ambient - from platform.json
Signed-off-by: Stephen Sun <stephens@nvidia.com>
* Do not report error on receiving unknown status on RJ45 ports
Translate it to disconnect for RJ45 ports
Report error for xSFP ports
Signed-off-by: Stephen Sun <stephens@nvidia.com>
* Add reinit for RJ45 to avoid exception
Signed-off-by: Stephen Sun <stephens@nvidia.com>
Co-authored-by: Stephen Sun <5379172+stephenxs@users.noreply.github.com>
Co-authored-by: Stephen Sun <stephens@nvidia.com>
- Why I did it
1. SN2201 sai profile needs to be updated according to the latest hardware.
2. In the reboot script, need to use the common symbol link of the power_cycle sysfs instead of directly accessing it due to SN2201 sysfs is different than other platforms.
3. echo 1 > $SYSFS_PWR_CYCLE will trigger the reboot immediately, the following sleep 3 and echo 0 > $SYSFS_PWR_CYCLE will never be executed, can be removed.
- How I did it
1. Replace the SN2201 sai profile with the latest one.
2. In the platform_reboot script, replace the direct sysfs path with the symbol link path.
3. Remove the redundant code from platform_reboot
- How to verify it
Perform reboot on all the Nvidia platforms, and check all can be rebooted successfully.
Signed-off-by: Kebo Liu <kebol@nvidia.com>
- Why I did it
Platform_reboot files for simx doesn't do aything different apart from calling /sbin/reboot. which is anyway done in the /usr/local/bin/reboot script i.e. the parent script which calls the platform specific reboot scripts if present.
Moreover, /sbin/reboot invoked in the platform specific reboot script is a non-blocking call and thus it returns back to the original script (although /sbin/reboot does it job in the background) and we see messages like this.
Signed-off-by: Vivek Reddy Karri <vkarri@nvidia.com>
- Why I did it
There is a hardware bug that PSU voltage threshold sysfs returns incorrect value. The workaround is to call "sensor -s" to refresh it.
- How I did it
Call "sensor -s" when the threshold value is not incorrect and PSU is "DELTA 1100"
- How to verify it
Unit test and Manual test
- Why I did it
Update NVIDIA Copyright header to "mellanox" files which were changed since 1.1.2022
- How I did it
Update the copyright header
- How to verify it
Sanity tests and PR checkers.
Update device-specific files for new platform SN2201, including:
device/mellanox/x86_64-nvidia_sn2201-r0/ACS-SN2201/buffers_defaults_objects.j2
device/mellanox/x86_64-nvidia_sn2201-r0/ACS-SN2201/hwsku.json
device/mellanox/x86_64-nvidia_sn2201-r0/default_sku
device/mellanox/x86_64-nvidia_sn2201-r0/pcie.yaml
device/mellanox/x86_64-nvidia_sn2201-r0/platform.json
device/mellanox/x86_64-nvidia_sn2201-r0/platform_components.json
device/mellanox/x86_64-nvidia_sn2201-r0/sensors.conf
Signed-off-by: Kebo Liu <kebol@nvidia.com>
Co-authored-by: Stephen Sun <stephens@nvidia.com>
- Why I did it
The chassis name in MSN4410 platform_components.json is not correct
- How I did it
Fix the chassis name
- How to verify it
Run relevant platform API test
Signed-off-by: Kebo Liu <kebol@nvidia.com>
- Why I did it
Need to remove old static configs from sai.profile files.
New implementation: Azure/sonic-swss#1959
New configuration: #9658
- How I did it
Remove SAI_VXLAN_SRCPORT_RANGE_ENABLE=1 lines from files per HWSKU
- How to verify it
When static config is removed following test will fail (src port will be in range 0-255)
py.test vxlan/test_vnet_vxlan.py --inventory "../ansible/inventory, ../ansible/veos" --host-pattern (testbed)-t0 --module-path ../ansible/library/ --testbed (testbed)-t0 --testbed_file ../ansible/testbed.csv --allow_recover --assert plain --log-cli-level info --show-capture=no -ra --showlocals --disable_loganalyzer --skip_sanity --upper_bound_udp_port 65535 --lower_bound_udp_port 64128
- Why I did it
platform.json of 4600C only has 2 CPU core thermal sensors, but there are 4 actually
- How I did it
Added thermal sensors for CPU core 2 and core 3.
- How to verify it
Build.
- Why I did it
There was a missing file of platform.json on SN4800 SIMx platform
- How I did it
Add symbolic link to platform.json file from the real SN4800 platform
- How to verify it
Run SN4800 SIMx and verify that platfrom.json file exist at the following folder:
/usr/share/sonic/device/x86_64-nvidia_sn4800_simx-r0
- Why I did it
Remove obsolete parameter that enables static VXLAN src port range
provide functionality no generate json config file according to appropriate parameter in config_db
Done for
SN3800:
• Mellanox-SN3800-D28C50
• Mellanox-SN3800-C64
• Mellanox-SN3800-D28C49S1 (New 10G SKU)
SN2700:
• Mellanox-SN2700-D48C8
- How I did it
Remove SAI_VXLAN_SRCPORT_RANGE_ENABLE=1 from appropriate sai.profile files
Created vxlan.json file and added few params that depends on DEVICE_METADATA.localhost.vxlan_port_range
- How to verify it
File /etc/swss/config.d/vxlan.json should be generated inside swss docker when it restart
[
{
"SWITCH_TABLE:switch": {
"vxlan_src": "0xFF00",
"vxlan_mask": "8"
},
"OP": "SET"
}
]
Signed-off-by: Andriy Yurkiv <ayurkiv@nvidia.com>
Why I did it
Requirements from Microsoft for fwutil update all state that all firmwares which support this upgrade flow must support upgrade within a single boot cycle. This conflicted with a number of Mellanox upgrade flows which have been revised to safely meet this requirement.
How I did it
Added --no-power-cycle flags to SSD and ONIE firmware scripts
Modified Platform API to call firmware upgrade flows with this new flag during fwutil update all
Added a script to our reboot plugin to handle installing firmwares in the correct order with prior to reboot
How to verify it
Populate platform_components.json with firmware for CPLD / BIOS / ONIE / SSD
Execute fwutil update all fw --boot cold
CPLD will burn / ONIE and BIOS images will stage / SSD will schedule for reboot
Reboot the switch
SSD will install / CPLD will refresh / switch will power cycle into ONIE
ONIE installer will upgrade ONIE and BIOS / switch will reboot back into SONiC
In SONiC run fwutil show status to check that all firmware upgrades were successful
- Why I did it
For MSN4410/MSN4600/MSN4700 now they can support fetching PSU voltage threshold, no need to skip the psu voltage check in system health monitoring, so update the system health monitoring configuration file for these platforms.
- How I did it
remove skip PSU change config from the system_health_monitoring_config.json file
- How to verify it
Build image run on these platforms, system health monitoring will not report error against PSU voltage
Signed-off-by: Kebo Liu <kebol@nvidia.com>
- Why I did it
MSN4700 platform has 8 lanes per port and thus can support 2x40G with each lane running at 10G
- How I did it
Added 40G to 2x200G breakout mode in platform.json
- How to verify it
Run config int break Ethernet0 2x40G[200G,100G,50G,25G,10G,1G]
And verify the command runs successfully and the port speed was set to 40G with a 2x breakout.
- Why I did it
Add sensor conf for MSN4600C A1 platform
- How I did it
Add a new sensor conf file and relevant scripts to support two different versions of the platform
- How to verify it
Run "sensors" cmd to check the output on the A1 platform to see whether it's as expected.
Signed-off-by: Kebo Liu <kebol@nvidia.com>
- Why I did it
Added missing functionality for dynamic buffer calculation in Spectrum-4.
- How I did it
Added a section of code in asic_table.j2 for Spectrum-4, and added the simx version of SN5600 to the supported list.
- How to verify it
Manually: buffershow -l should show all ingress/egress lossy/lossless pools, and all fields of profiles should show values.
Automatically: https://github.com/Azure/sonic-mgmt/blob/master/tests/qos/test_buffer.py
Signed-off-by: Raphael Tryster <raphaelt@nvidia.com>
Why I did it
Fix typo and missing files in SN3800 and SN4600C's buffer templates
How I did it
ingress_lossless_xoff_size => ingress_lossless_pool_xoff add missing files for SN4600C-D100C12S2
How to verify it
Deploy the fix and verify whether the device can be up.
Signed-off-by: Stephen Sun <stephens@nvidia.com>
- Why I did it
The capability files were incorrect in comparison to the marketing spec of the SN4410 platform.
- How I did it
Aligned the capability files according to the marketing spec.
- How to verify it
Basic manual sanity checks:
1. Check if critical docker containers were UP
2. Check if interfaces were created and were UP
3. Check if interfaces created in the syncd docker container by executing – sx_api_ports_dump.py script
4. Check the logs from the start of the switch – everything was OK
5. Verified the port breakout
Signed-off-by: Vadym Hlushko <vadymh@nvidia.com>
- Why I did it
Rename platform x86_64-mlnx_msn4800 to x86_64-nvidia_msn4800
- How I did it
Rename platform folder as well as all code that reference the platform name
- How to verify it
Manual test
- Why I did it
Add new Spectrum-4 system support SN5600 on top of Nvidia ASIC simulator.
- How I did it
Add all relevant system and simulator SKU.
Updated syseeprom.hex and related directories to reflect Nvidia SN5600 brand name.
- How to verify it
Tested init flow, basic show commands, up interfaces, traffic test.
Signed-off-by: Raphael Tryster <raphaelt@nvidia.com>
- Why I did it
Also recalculated all parameters with the latest algorithm with per-speed peer response time taken into account
- How I did it
Detailed information of each SKU:
C64:
t0: 32 100G downlinks and 32 100G uplinks
t1: 56 100G downlinks and 8 100G uplinks with 2km-cable supported
D112C8: 112 50G downlinks and 8 100G uplinks.
D48C40: 48 50G downlinks, 32 100G downlinks, and 8 100G uplinks
D100C12S2: 4 100G downlinks, 2 10G downlinks, 100 50G downlinks, and 8 100G uplinks
2km cable is supported for C64 on t1 only
- How to verify it
Run regression test (QoS)
Signed-off-by: Stephen Sun <stephens@nvidia.com>
- Why I did it
Add support for SN2201 platform
- How I did it
Add required content for SN2201 platform
Note: still missing kernel driver support for this system. Once all is upstream will be updated as well.
- How to verify it
Install and basic sanity tests including traffic.
Signed-off-by: liora liora@nvidia.com
Signed-off-by: Stephen Sun stephens@nvidia.com
Why I did it
Support zero buffer profiles
Add buffer profiles and pool definition for zero buffer profiles
Support applying zero profiles on INACTIVE PORTS
Enable dynamic buffer manager to load zero pools and profiles from a JSON file
Dependency: It depends on Azure/sonic-swss#1910 and submodule advancing PR once the former merged.
How I did it
Add buffer profiles and pool definition for zero buffer profiles
If the buffer model is static:
Apply normal buffer profiles to admin-up ports
Apply zero buffer profiles to admin-down ports
If the buffer model is dynamic:
Apply normal buffer profiles to all ports
buffer manager will take care when a port is shut down
Update buffers_config.j2 to support INACTIVE PORTS by extending the existing macros to generate the various buffer objects, including PGs, queues, ingress/egress profile lists
Originally, all the macros to generate the above buffer objects took active ports only as an argument
Now that buffer items need to be generated on inactive ports as well, an extra argument representing the inactive ports need to be added
To be backward compatible, a new series of macros are introduced to take both active and inactive ports as arguments
The original version (with active ports only) will be checked first. If it is not defined, then the extended version will be called
Only vendors who support zero profiles need to change their buffer templates
Enable buffer manager to load zero pools and profiles from a JSON file:
The JSON file is provided on a per-platform basis
It is copied from platform/<vendor> folder to /usr/share/sonic/temlates folder in compiling time and rendered when the swss container is being created.
To make code clean and reduce redundant code, extract common macros from buffer_defaults_t{0,1}.j2 of all SKUs to two common files:
One in Mellanox-SN2700-D48C8 for single ingress pool mode
The other in ACS-MSN2700 for double ingress pool mode
Those files of all other SKUs will be symbol link to the above files
Update sonic-cfggen test accordingly:
Adjust example output file of JSON template for unit test
Add unit test in for Mellanox's new buffer templates.
How to verify it
Regression test.
Unit test in sonic-cfggen
Run regression test and manually test.
- Why I did it
Wrong SKU configuration will lead to longer init flow.
This will affect fast-reboot feature by increasing the traffic downtime.
Since MLNX met the required downtime period with this SKU this bug found with a delay.
- How I did it
Add the required split labels for ports.
- How to verify it
Run fast-reboot with this platform using SN3800-D112C8 SKU.
Why I did it
"chassis_db_init" task of PMON should be skipped on Mellanox simx platform, since the hardware info which this task is trying to access is not available on simx platforms, It will introduce some error log.
How I did it
Add the capability for "chassis_db_init" in the template for it can be skipped by adding configuration in "pmon_daemon_control.json".
add "skip_chassis_db_init" configuration for simx platforms.
use symbol link for "pmon_daemon_control.json" since all the simx platforms share the same configuration
How to verify it
Build an image and install it on simx platform to check whether "chassis_db_init" task is skipped.
Signed-off-by: Kebo Liu <kebol@nvidia.com>
- Why I did it
Add NVIDIA Copyright header to "mellanox" files
- How I did it
Add NVIDIA Copyright header as a comment for Mellanox files
- How to verify it
Sanity tests and PR checkers.
#### Why I did it
Fixing https://github.com/Azure/sonic-buildimage/issues/8938
Fixing 1x10G DPB mode in Mellanox-SN2700-D40C8S8 SKU as it was causing sonic-cfggen to fail.
#### How I did it
Added correct mode format in hwksu.json in Mellanox-SN2700-D40C8S8 and updated platform.json for the new mode.
#### How to verify it
Using sonic-cfggen verify it works fine
#### Why I did it
Add platform_asic file to each platform folder in sonic-device-data package. The file content will be used as the ground truth of mapping from PLATFORM_STRING to switch ASIC family.
One use case of the mapping is to prevent installing a wrong image, which targets for other ASIC platforms. For example, currently we have several ONIE images naming as sonic-*.bin, it's easy to mistakenly install the wrong image. With this mapping built into image, we could fetch the ONIE platform string, and figure out which ASIC it is using, and check we are installing the correct image.
After this PR merged, each platform vendor has to add one mandatory text file `device/PLATFORM_VENDOR/PLATFORM_STRING/platform_asic`, with the content of the platform's switch ASIC family.
I will update https://github.com/Azure/SONiC/wiki/Porting-Guide after this PR is merged.
You can get a list of the ASIC platforms by `ls -b platform | cat`. Currently the options are
```
barefoot
broadcom
cavium
centec
centec-arm64
generic
innovium
marvell
marvell-arm64
marvell-armhf
mellanox
nephos
p4
vs
```
Also support
```
broadcom-dnx
```
#### How I did it
#### How to verify it
Test one image on DUT. And check the folders under `/usr/share/sonic/device`
Depends on Azure/sonic-utilities#1626
Depends on Azure/sonic-swss#1754
QOS tables in config db used ABNF format i.e "[TABLE_NAME|name] to refer fieldvalue to other qos tables.
Example:
Config DB:
"Ethernet92|3": {
"scheduler": "[SCHEDULER|scheduler.1]",
"wred_profile": "[WRED_PROFILE|AZURE_LOSSLESS]"
},
"Ethernet0|0": {
"profile": "[BUFFER_PROFILE|ingress_lossy_profile]"
},
"Ethernet0": {
"dscp_to_tc_map": "[DSCP_TO_TC_MAP|AZURE]",
"pfc_enable": "3,4",
"pfc_to_queue_map": "[MAP_PFC_PRIORITY_TO_QUEUE|AZURE]",
"tc_to_pg_map": "[TC_TO_PRIORITY_GROUP_MAP|AZURE]",
"tc_to_queue_map": "[TC_TO_QUEUE_MAP|AZURE]"
},
This format is not consistent with other DB schema followed in sonic.
And also this reference in DB is not required, This is taken care by YANG "leafref".
Removed this format from all platform files to consistent with other sonic db schema.
Example:
"Ethernet92|3": {
"scheduler": "scheduler.1",
"wred_profile": "AZURE_LOSSLESS"
},
Dependent pull requests:
#7752 - To modify platfrom files
#7281 - Yang model
Azure/sonic-utilities#1626 - DB migration
Azure/sonic-swss#1754 - swss change to remove ABNF format
- Why I did it
Removed 2x40G for SN3800. This mode is not supported by hardware.
- How I did it
Removing it from hwsku.json and platform.json
- How to verify it
Load it in the device and check supported modes
- Why I did it
SN4600 A0 platform was EOL, so there is no need to support it, sensor conf can be removed and we don't need to maintain 2 sensor conf files, only A1 platform is needed.
- How I did it
Remove get_sensors_conf_path which intends to load correct sensor conf for different(A0/A1) platforms.
Remove the sensor conf for A0 platform, rename previous sensor.conf.a1 to sensor.conf
- How to verify it
Run sensor test on the SN4600 platform.
- Why I did it
Update platform data files for SN4800 to support chassis management
- How I did it
Update pcie.yml
Update sensors.conf
Update platform.json
- How to verify it
Run platform test suite in sonic-mgmt
- Why I did it
The MSN4410 platform was missing 2x100G and 4x50G supported breakout modes in platform.json
- How I did it
Added the aforementioned breakout modes to platform.json
- How to verify it
Run show interface breakout on a MSN4410 and verify that 2x100G and 4x50G are listed in the supported breakout modes for ports 1-24.
#### Why I did it
The values of the syseeprom were not valid
#### How I did it
I took the correct hexdump values from a real switch and created this hex file again
#### How to verify it
decode-syseeprom will display the new values
#### Why I did it
New SN410 A1 system has a different sensor layout with A0 system, needs a new sensor conf file to support it.
#### How I did it
Since the SN4410 A1 system use exactly the same sensor layout as the SN4700 A1 system, so add a symbol link linking to the SN4700 A1 sensor conf file to reuse.
#### How to verify it
Run sensor test against the SN4410 A1 system;
Run platform related regression test against the SN4410 A1 system
*Edited platform.json for 4600 & 4600C
*Edited hwsku.json and port_config.ini files for all the SKU's present under these platforms
Signed-off-by: Vivek Reddy Karri <vkarri@nvidia.com>
- Why I did it
To create SDK dump on Mellanox devices when SDK event has occurred.
- How I did it
Set the SKUs keys needed to initialize the feature in SAI.
- How to verify it
Simulate SDK event and check that dump is created in the expected path.
- Why I did it
The default breakout mode according to hwsku.json for the MSN4410 is 1x400G and this is not a supported breakout mode according to its platform.json
This causes a conflict on boot of this platform and no containers on the switch will init successfully.
- How I did it
Referenced the platform specification files and updated platform.json
- How to verify it
Install master version of SONiC on MSN4410
Boot switch and verify swss is successfully running using docker ps
- Why I did it
Enhance the Python3 support for platform API. Originally, some platform APIs call SDK API which didn't support Python 3. Now the Python 3 APIs have been supported in SDK 4.4.3XXX, Python3 is completely supported by platform API
- How I did it
Start all platform daemons from python3
1. Remove #/usr/bin/env python at the beginning of each platform API file as the platform API won't be started as daemons but be imported from other daemons.
2. Adjust SDK API calls accordingly
- How to verify it
Manually test and run regression platform test
Signed-off-by: Stephen Sun <stephens@nvidia.com>
#### Why I did it
Support 2km cables for Microsoft SKUs
#### How I did it
1. Update pg_profile_lookup.ini with 2000m cable supported
2. Update buffer configuration for t1 with uplink cable 2000m
- For SN3800 platform:
- C64:
- t0: 32 100G down links and 32 100G up links.
- t1: 56 100G down links and 8 100G up links with 2 km cable.
- D112C8: 112 50G down links and 8 100G up links.
- D24C52: 24 50G down links, 20 100G down links, and 32 100G up links.
- D28C50: 28 50G down links, 18 100G down links, and 32 100G up links.
- For SN2700 platform:
- D48C8: 48 50G down links and 8 100G up links.
- C32:
- t0: 16 100G down links and 16 100G up links.
- t1: 24 100G down links and 8 100G up links with 2 km cable.
- For SN4600C platform:
- D112C8: 112 50G down links and 8 100G up links.
#### How to verify it
Run regression test
#### Why I did it
The label for PSU related sensors on the Spectrum-2 platform is not aligned with the physical location of the PSU.
#### How I did it
Update the label in the sensor conf file for those relevant platforms
Signed-off-by: Kebo Liu <kebol@nvidia.com>
#### Why I did it
Add initial support of SN4800 platform for Mellanox ASIC simulation device.
NOTE: This is work in progress and not full support of the platform.
#### How I did it
Add new folders for SN4800 with zero ports based on SN4700 Spectrum-3 switch.
- Why I did it
Add initial support of SN4800 platform .
NOTE: This is work in progress and not full support of the platform.
- How I did it
Add new folders for SN4800 with zero ports based on SN4700 Spectrum-3 switch.
- How to verify it
Simulator device was tested. See #7448
#### Why I did it
MSN4700 A1/A0 used different sensor chip but keep the existing platform name *x86_64-mlnx_msn4700-r0*, this is a workaround to replace the sensor conf on MSN4700 A1/A0
#### How I did it
Use a shell script to get the sensor conf path and copy that files to /etc/sensors.d/sensors.conf
- Why I did it
Enable VXLAN src port range configuration via SAI profile for Mellanox-SN3800-D28C49S1 SKU
- How I did it
Added SAI_VXLAN_SRCPORT_RANGE_ENABLE=1 configuration to appropriate sai.profile
Signed-off-by: Andriy Yurkiv <ayurkiv@nvidia.com>
#### Why I did it
Current platform.json lacks some peripheral device related facts, like chassis/fan/pasu/drawer/thermal/components names, numbers, etc.
#### How I did it
Add platform device facts to the platform.json file
Signed-off-by: Kebo Liu <kebol@nvidia.com>
Fix to the correct value for all SPC1 devices.
For 10G added 10GB_CX4_XAUI, 10GB_KX4, 10GB_KR, 10GB_SR and 10GB_ER_LR
For 50G added 50GB_SR2
This bitmask represents all the options available for interface type and some were missing.
Note: it was working just fine if you were setting the value from SONiC CLI but not from the default SAI Profile.
Signed-off-by: Shlomi Bitton <shlomibi@nvidia.com>
The platform name for MSN4600C in sfputil pliugin is not complete: "x86_64-mlnx_msn4600c" -> "x86_64-mlnx_msn4600c-r0"
Signed-off-by: Kebo Liu <kebol@nvidia.com>
- Why I did it
Add missed files for dynamic buffer calculation for ACS-MSN3420 and ACS-MSN4410
- How I did it
asic_table.j2: Add mapping from platform to ASIC
Add buffer_dynamic.json.j2 for ACS-MSN4410.
- How to verify it
Check whether the dynamic buffer calculation daemon starts successfully.
Signed-off-by: Stephen Sun <stephens@nvidia.com>
The file device/mellanox/x86_64-mlnx_msn4410-r0/plugins/sfputil.py is not a software link for device/mellanox/x86_64-mlnx_msn2700-r0/plugins/sfputil.py. And it is still using python2 syntex which causes some SFP CLI error. The PR is to change it to a softlink and add 4410 support in device/mellanox/x86_64-mlnx_msn2700-r0/plugins/sfputil.py.
#### Why I did it
Additional file for DPB in order to support SKU SN2700-D40C8S8 on master
#### How I did it
Add hwsku.json file
#### How to verify it
Enforce "Mellanox-SN2700-D40C8S8 SKU on Master and see it works as expected, meaning:
Port 1/3 will be used as 4x10G
Port 2/4 - Not exist (blocked since 1 and 3 split to 4)
Port 7/8/9/10/23/24/25/26 will used as 100G
All other ports will be used as 2x50G
This PR should be added on top of PR:
https://github.com/Azure/sonic-buildimage/pull/6876
#### Description for the changelog
Adding hwsku.json file to SN2700-D40C8S8 SKU
#### Why I did it
Change buffer config for new SKU Mellanox-SN2700-D40C8S8
#### How I did it
Reuse the buffer config of SKU Mellanox-SN2700-D48C8
#### How to verify it
Run sonic-mgmt qos test and all passed
- Why I did it
Fix the build and fix the SN4600 DPB support
- How I did it
Fix port configuration file for SN4600 based on recent changes
- How to verify it
System bringup is completed, all interfaces are up.
Platform tests suits all is passing.
- Why I did it
To add support for the dynamic breakout on Mellanox platform x86_64-mlnx_msn4600
- How I did it
Add the relevant files describing Mellanox platform x86_64-mlnx_msn4600 breakout modes to a new device folder.
- How to verify it
System bringup is completed, all interfaces are up.
Platform tests suits all is passing.
- Why I did it
To fix PCIEd errors in log.
- How I did it
Update pcie.yaml with the right PCI addresses.
- How to verify it
Check logs, operation occurs each minute.
Signed-off-by: liora <liora@nvidia.com>
To fix [DPB| wrong aliases for interfaces](https://github.com/Azure/sonic-buildimage/issues/6024) issue, implimented flexible alias support [design doc](https://github.com/Azure/SONiC/pull/749)
> [[dpb|config] Fix the validation logic of breakout mode](https://github.com/Azure/sonic-utilities/pull/1440) depends on this
#### How I did it
1. Removed `"alias_at_lanes"` from port-configuration file(i.e. platfrom.json)
2. Added dictionary to "breakout_modes" values. This defines the breakout modes available on the platform for this parent port, and it maps to the alias list. The alias list presents the alias names for individual ports in order under this breakout mode.
```
{
"interfaces": {
"Ethernet0": {
"index": "1,1,1,1",
"lanes": "0,1,2,3",
"breakout_modes": {
"1x100G[40G]": ["Eth1"],
"2x50G": ["Eth1/1", "Eth1/2"],
"4x25G[10G]": ["Eth1/1", "Eth1/2", "Eth1/3", "Eth1/4"],
"2x25G(2)+1x50G(2)": ["Eth1/1", "Eth1/2", "Eth1/3"],
"1x50G(2)+2x25G(2)": ["Eth1/1", "Eth1/2", "Eth1/3"]
}
}
}
```
#### How to verify it
`config interface breakout`
Signed-off-by: Sangita Maity <samaity@linkedin.com>
- Why I did it
Add support for new 64x200G SN4600 systems
- How I did it
Add all relevant files (w/o platform.json and hwsku.json as they will come later) with default SKU.
- How to verify it
Install image on switch, verify all ports are up and configured properly, run full platform SONiC tests.
#### Why I did it
Add new SKU for SN2700 Mellanox system that supports the following port configuration:
8 X 100G
40 X 50G
8 X 10G
#### How I did it
Add new Folder - "Mellanox-SN2700-D40C8S8" under /sonic-buildimage/device/mellanox/x86_64-mlnx_msn2700-r0/
that contains the relevant files supporting this SKU
the buffers are based on SKU: D48C8 . Later on it will be configured specific for this SKU
#### How to verify it
Bring up the image, run "show interface status" and make sure that all ports are up and reflect the following requirement:
Port 1/3 will be used as 4x10G
Port 2/4 - Not exist (blocked since 1 and 3 split to 4)
Port 7/8/9/10/23/24/25/26 will used as 100G
All other ports will be used as 2x50G
#### Which release branch to backport (provide reason below if selected)
- [ ] 201811
- [ ] 201911
- [ ] 202006
- [X] 202012
#### Description for the changelog
Support new SKU under the name of SN2700-D40C8S8
- Why I did it
Mellanox-SN4600C-D112C8 SKU is not configured properly.
It should have 112 50G interfaces and 8 100G interfaces as described on this PR.
- How I did it
Modify sai_profile, port_config.ini and hwsku.json for DPB.
- How to verify it
Apply this HwSKU to a MSN4600C Mellanox platform.
Signed-off-by: Shlomi Bitton <shlomibi@nvidia.com>
- Why I did it
Support shared headroom pool
Signed-off-by: Stephen Sun stephens@nvidia.com
- How I did it
Port configurations for SKUs based on 2700/3800 platform from 201911
For SN3800 platform:
C64: 32 100G down links and 32 100G up links.
D112C8: 112 50G down links and 8 100G up links.
D24C52: 24 50G down links, 20 100G down links, and 32 100G up links.
D28C50: 28 50G down links, 18 100G down links, and 32 100G up links.
For SN2700 platform:
D48C8: 48 50G down links and 8 100G up links
C32: 16 100G downlinks and 16 100G uplinks
Add configuration for Mellanox-SN4600C-D112C8
112 50G down links and 8 100G up links.
- How to verify it
Run regression test.
**- Why I did it**
PR https://github.com/Azure/sonic-platform-common/pull/102 modified the name of the SFF-8436 (QSFP) method to align the method name between all drivers, renaming it from `parse_qsfp_dom_capability` to `parse_dom_capability`. Once the submodule was updated, the callers using the old nomenclature broke. This PR updates all callers to use the new naming convention.
**- How I did it**
Update the name of the function globally for all calls into the SFF-8436 driver.
Note that the QSFP-DD driver still uses the old nomenclature and should be modified similarly. I will open a PR to handle this separately.
- Why I did it
platform.json and hwsku.json files are required for a feature called Dynamic Port Breakout
- How I did it
Created capability files according to platform specification SN4600C
- How to verify it
Full qualification requires bugs fixes reported under sonic-buildimage
NOTE: breakout to 4 is currently not available as of missing functionality in DPB implementation.
Signed-off-by: Vadym Hlushko <vadymh@nvidia.com>
- Why I did it
platform.json and hwsku.json files are required for a feature called Dynamic Port Breakout
- How I did it
Created capability files according to platform specification SN4410
- How to verify it
Full qualification requires bugs fixes reported under sonic-buildimage
Signed-off-by: Vadym Hlushko <vadymh@nvidia.com>
- Why I did it
platform.json and hwsku.json files are required for a feature called Dynamic Port Breakout
- How I did it
Created capability files according to platform specification SN3700
- How to verify it
Full qualification requires bugs fixes reported under sonic-buildimage
Signed-off-by: Vadym Hlushko <vadymh@nvidia.com>
- Why I did it
platform.json and hwsku.json files are required for a feature called Dynamic Port Breakout
- How I did it
Created capability files according to platform specification SN2410
- How to verify it
Full qualification requires bugs fixes reported under sonic-buildimage
NOTE: breakout to 4 is currently not available as of missing functionality in DPB implementation.
Signed-off-by: Vadym Hlushko <vadymh@nvidia.com>
- Why I did it
platform.json and hwsku.json files are required for a feature called Dynamic Port Breakout
- How I did it
Created capability files according to platform specification SN2100
- How to verify it
Full qualification requires bugs fixes reported under sonic-buildimage
Signed-off-by: Vadym Hlushko <vadymh@nvidia.com>
- Why I did it
platform.json and hwsku.json files are required for a feature called Dynamic Port Breakout
- How I did it
Created capability files according to platform specification SN2010
- How to verify it
Full qualification requires bugs fixes reported under sonic-buildimage
Signed-off-by: Vadym Hlushko <vadymh@nvidia.com>
- Why I did it
platform.json and hwsku.json files are required for a feature called Dynamic Port Breakout
- How I did it
Created capability files according to platform specification SN3800
- How to verify it
Full qualification requires bugs fixes reported under sonic-buildimage
Signed-off-by: Vadym Hlushko <vadymh@nvidia.com>
[DPB] added capability files for SN2700 platform
- Why I did it
platform.json and hwsku.json files are required for a feature called Dynamic Port Breakout
- How I did it
Created capability files according to platform specification SN2700
- How to verify it
Full qualification requires bugs fixes reported under sonic-buildimage
NOTE: breakout to 4 is currently not available as of missing functionality in DPB implementation.
Signed-off-by: Vadym Hlushko <vadymh@nvidia.com>
In order to build up device hierachy, PSU and module thermals are no longer child of chassis. PSU thermal belongs to PSU objects and SFP thermals belong to SFP object now. Need align this change in platform.json. Move thermal objects to correct parent device
**- Why I did it**
To support dynamic buffer calculation.
This PR also depends on the following PRs for sub modules
- [sonic-swss: [buffermgr/bufferorch] Support dynamic buffer calculation #1338](https://github.com/Azure/sonic-swss/pull/1338)
- [sonic-swss-common: Dynamic buffer calculation #361](https://github.com/Azure/sonic-swss-common/pull/361)
- [sonic-utilities: Support dynamic buffer calculation #973](https://github.com/Azure/sonic-utilities/pull/973)
**- How I did it**
1. Introduce field `buffer_model` in `DEVICE_METADATA|localhost` to represent which buffer model is running in the system currently:
- `dynamic` for the dynamic buffer calculation model
- `traditional` for the traditional model in which the `pg_profile_lookup.ini` is used
2. Add the tables required for the feature:
- ASIC_TABLE in platform/\<vendor\>/asic_table.j2
- PERIPHERAL_TABLE in platform/\<vendor\>/peripheral_table.j2
- PORT_PERIPHERAL_TABLE on a per-platform basis in device/\<vendor\>/\<platform\>/port_peripheral_config.j2 for each platform with gearbox installed.
- DEFAULT_LOSSLESS_BUFFER_PARAMETER and LOSSLESS_TRAFFIC_PATTERN in files/build_templates/buffers_config.j2
- Add lossless PGs (3-4) for each port in files/build_templates/buffers_config.j2
3. Copy the newly introduced j2 files into the image and rendering them when the system starts
4. Update the CLI options for buffermgrd so that it can start with dynamic mode
5. Fetches the ASIC vendor name in orchagent:
- fetch the vendor name when creates the docker and pass it as a docker environment variable
- `buffermgrd` can use this passed-in variable
6. Clear buffer related tables from STATE_DB when swss docker starts
7. Update the src/sonic-config-engine/tests/sample_output/buffers-dell6100.json according to the buffer_config.j2
8. Remove buffer pool sizes for ingress pools and egress_lossy_pool
Update the buffer settings for dynamic buffer calculation
python2 is end of life and SONiC is going to support python3. This PR is going to support:
1. Mellanox SONiC platform API python3 support
2. Install both python2 and python3 verson of Mellanox SONiC platform API or pmon and host side