- Why I did it
1. SN2201 sai profile needs to be updated according to the latest hardware.
2. In the reboot script, need to use the common symbol link of the power_cycle sysfs instead of directly accessing it due to SN2201 sysfs is different than other platforms.
3. echo 1 > $SYSFS_PWR_CYCLE will trigger the reboot immediately, the following sleep 3 and echo 0 > $SYSFS_PWR_CYCLE will never be executed, can be removed.
- How I did it
1. Replace the SN2201 sai profile with the latest one.
2. In the platform_reboot script, replace the direct sysfs path with the symbol link path.
3. Remove the redundant code from platform_reboot
- How to verify it
Perform reboot on all the Nvidia platforms, and check all can be rebooted successfully.
Signed-off-by: Kebo Liu <kebol@nvidia.com>
Why I did it
As part of PCBB changes, we need to enable 2 extra lossless queues. The changes in this PR are done to adjust only the reserved sizes on Th2 for the additional 2 lossless queues
Calculations are done based on 40 downlinks for T1 and 16 uplinks for dual ToR
How to verify it
Verified that the rendering works fine on Th2 dut
Unit tests have been updated to reflect the modified buffer sizes when pcbb is enabled. There are existing testcases that will test the original buffer sizes when pcbb is disabled. With these changes, was able to build sonic-config-engine wheel successfully
Signed-off-by: Neetha John <nejo@microsoft.com>
* [Tunnel PFC] Add property for tunnel PFC
Replace the config.bcm file with j2 template file
- Add 'sai_remap_prio_on_tnl_egress=1' property when device metadata local
- Host subtype is 'dualtor'
- Change sai.profile foe the new config.bcm.j2
Signed-off-by: bingwang <bingwang@microsoft.com>
Why I did it
This PR is to add two extra lossless queues for bounced back traffic.
HLD sonic-net/SONiC#950
SKUs include
Arista-7050CX3-32S-C32
Arista-7050CX3-32S-D48C8
Arista-7260CX3-D108C8
Arista-7260CX3-C64
Arista-7260CX3-Q64
How I did it
Update the buffers.json.j2 template and buffers_config.j2 template to generate new BUFFER_QUEUE table.
For T1 devices, queue 2 and queue 6 are set as lossless queues on T0 facing ports.
For T0 devices, queue 2 and queue 6 are set as lossless queues on T1 facing ports.
Queue 7 is added as a new lossy queue as DSCP 48 is mapped to TC 7, and then mapped into Queue 7
How to verify it
Verified by UT
Verified by coping the new template and generate buffer config with sonic-cfggen
* Removed unused default_config.json
* Remove asic.conf file from HW SKUs directories as they are not used by upstream code
* Enable dynamic PCI ID identification on Otterlake2
Co-authored-by: Maxime Lorrillere <mlorrillere@arista.com>
What/Why I did:
Issue1: By setting up of ipvlan interface in interface-config.sh we are not tolerant to failures. Reason being interface-config.service is one-shot and do not have restart capability.
Scenario: For example if let's say database service goes in fail state then interface-services also gets failed because of dependency check but later database service gets restart but interface service will remain in stuck state and the ipvlan interface nevers get created.
Solution: Moved all the logic in database service from interface-config service which looks more align logically also since the namespace is created here and all the network setting (sysctl) are happening here.With this if database starts we recreate the interface.
Issue 2: Use of IPVLAN vs MACVLAN
Currently we are using ipvlan mode. However above failure scenario is not handle correctly by ipvlan mode. Once the ipvlan interface is created and ip address assign to it and if we restart interface-config or database (new PR) service Linux Kernel gives error "Error: Address already assigned to an ipvlan device." based on this:https://github.com/torvalds/linux/blob/master/drivers/net/ipvlan/ipvlan_main.c#L978Reason being if we do not do cleanup of ip address assignment (need to be unique for IPVLAN) it remains in Kernel Database and never goes to free pool even though namespace is deleted.
Solution: Considering this hard dependency of unique ip macvlan mode is better for us and since everything is managed by Linux Kernel and no dependency for on user configured IP address.
Issue3: Namespace database Service do not check reachability to Supervisor Redis Chassis Server.
Currently there is no explicit check as we never do Redis PING from namespace to Supervisor Redis Chassis Server. With this check it's possible we will start database and all other docker even though there is no connectivity and will hit the error/failure late in cycle
Solution: Added explicit PING from namespace that will check this reachability.
Issue 4:flushdb give exception when trying to accces Chassis Server DB over Unix Sokcet.
Solution: Handle gracefully via try..except and log the message.
Why I did it
add celestica belgite platform
How I did it
add belgite platform in celestica
Co-authored-by: nicwu-cel <nicwu@celestica.com>
Co-authored-by: anjian <anjian@celestica.com>
Co-authored-by: sandycelestica <sandyli@celestica.com>
- Why I did it
Platform_reboot files for simx doesn't do aything different apart from calling /sbin/reboot. which is anyway done in the /usr/local/bin/reboot script i.e. the parent script which calls the platform specific reboot scripts if present.
Moreover, /sbin/reboot invoked in the platform specific reboot script is a non-blocking call and thus it returns back to the original script (although /sbin/reboot does it job in the background) and we see messages like this.
Signed-off-by: Vivek Reddy Karri <vkarri@nvidia.com>
Why I did it
To include ONIE version in show platform firmware status command output in DellEMC S6100 and Z9332f platforms.
How I did it
Include ‘ONIE’ in the list of components provided by platform APIs in DellEMC S6100 and Z9332f.
Unmount ONIE-BOOT if mounted using fast/soft/warm-reboot plugins in DellEMC S6100.
Why I did it
Fixes some pmon errors/warnings by providing missing configuration files
How I did it
Add missing pcie.yaml and sensors.conf for supported linecards
How to verify it
pcie-check should pass
sensors should display proper sensor names
* [PDDF] Rename temp for 7816/7326/7726
Signed-off-by: Jostar Yang <jostar_yang@accton.com.tw>
* Change naming to pddf device
Co-authored-by: Jostar Yang <jostar_yang@accton.com.tw>
Co-authored-by: ecsonic <ecsonic@edge-core.com>
Why I did it
The customer report of the PCIe Bus Errors upon the SDK initialization of as7816-64x.
How I did it
Based on the internal info and discussion, update "pcie_aspm=off" into ONIE_PLATFORM_EXTRA_CMDLINE_LINUX of installer.conf to resolve it.
Why I did it
The buffer pool & profile setting in buffer template was not correct and caused the errors like the following:
ERR swss#orchagent: :- parseReference: malformed reference:[BUFFER_PROFILE|ingress_lossless_profile]. Must not be surrounded by [ ]
How I did it
Fix the buffer pool & profile setting by removing "[]".
How to verify it
Loaded image with this fix in a switch and made sure the error was not seen anymore.
The v0.7.5 has bug fix for the support of gearbox port and macsec counters. It also includes a owl firmware update with owl.lz4.fw.1.94.0.bin.
How I did it
Update credo sai url for v0.7.5
Update gearbox_config.json with using firmware owl.lz4.fw.1.94.0.bin instead of owl.lz4.fw.1.92.1.bin
How to verify it
Test gearbox port and macsec counter successfully on A7280.
- Why I did it
There is a hardware bug that PSU voltage threshold sysfs returns incorrect value. The workaround is to call "sensor -s" to refresh it.
- How I did it
Call "sensor -s" when the threshold value is not incorrect and PSU is "DELTA 1100"
- How to verify it
Unit test and Manual test
This PR includes necessary changes for correct generating BUFFER_QUEUE values in DB. Changes are based on the schema.md
Why I did it
Change format of generating BUFFER_QUEUE in DB according to schema.md and yang-model.
Old format:
"BUFFER_QUEUE": {
"Ethernet0,Ethernet100,Ethernet104,Ethernet108,Ethernet112,Ethernet116,Ethernet12,Ethernet120,Ethernet124,Ethernet16,Ethernet20,Ethernet24,Ethernet28,Ethernet32,Ethernet36,Ethernet4,Ethernet40,Ethernet44,Ethernet48,Ethernet52,Ethernet56,Ethernet60,Ethernet64,Ethernet68,Ethernet72,Ethernet76,Ethernet8,Ethernet80,Ethernet84,Ethernet88,Ethernet92,Ethernet96|queue": {
"profile": "profile"
},
"Ethernet0,Ethernet100,Ethernet104,Ethernet108,Ethernet112,Ethernet116,Ethernet12,Ethernet120,Ethernet124,Ethernet16,Ethernet20,Ethernet24,Ethernet28,Ethernet32,Ethernet36,Ethernet4,Ethernet40,Ethernet44,Ethernet48,Ethernet52,Ethernet56,Ethernet60,Ethernet64,Ethernet68,Ethernet72,Ethernet76,Ethernet8,Ethernet80,Ethernet84,Ethernet88,Ethernet92,Ethernet96|queue": {
"profile": "profile"
}
},
New format:
"BUFFER_QUEUE": {
"Ethernet0|queue": {
"profile": "profile"
},
"Ethernet0|queue": {
"profile": "profile"
},
"Ethernet4|queue": {
"profile": "profile"
},
"Ethernet4|queue": {
"profile": "profile"
},
"Ethernet8|queue": {
"profile": "profile"
},
"Ethernet8|queue": {
"profile": "profile"
},
...
}
How I did it
Updated structure of buffers_defaults jinja templates.
Signed-off-by: Oleksandr Kozodoi <oleksandrx.kozodoi@intel.com>
* [device config] Adding configuration for default route fallback
* Set sai_tunnel_underlay_route_mode attribute to fallback to default route if more specific route is unavailable.
Why I did it
Prevent from i2c bus to get locked.
How I did it
Add sysfs driver to access ioport.
Command to reset i2c mux:
echo 1 > /sys/devices/platform/as9716_32d_ioport/i2c_mux_rst
Command to bring i2c mux out of reset:
echo 0 > /sys/devices/platform/as9716_32d_ioport/i2c_mux_rst
Signed-off-by: Brandon Chuang <brandon_chuang@edge-core.com>
Why I did it
Support pddf to as4630/as7816/as7326
How I did it
Send needed file to the PR for these platform
How to verify it
Test sensors and show platform cmd.
root@as7326-56x-3:/home/admin# show platform psustatus
PSU Model Serial HW Rev Voltage (V) Current (A) Power (W) Status LED
PSU 1 FSF045-611 FSF0451912000505 N/A 12.06 5.50 66.00 OK green
PSU 2 FSF045-611 FSF0451912000568 N/A 12.00 5.50 66.00 OK green
root@as7326-56x-3:/home/admin# sensors
lm75-i2c-15-4a
Adapter: i2c-1-mux (chan_id 6)
Main Board Temperature: +35.5 C (high = +80.0 C, hyst = +75.0 C)
lm75-i2c-15-4b
Adapter: i2c-1-mux (chan_id 6)
CPU Board Temperature: +29.0 C (high = +80.0 C, hyst = +75.0 C)
fan_ctrl-i2c-11-66
Adapter: i2c-1-mux (chan_id 2)
fan1: 9100 RPM
fan2: 9400 RPM
fan3: 9300 RPM
fan4: 9600 RPM
fan5: 9000 RPM
fan6: 9100 RPM
fan7: 9100 RPM
fan8: 9300 RPM
fan9: 9200 RPM
fan10: 9400 RPM
fan11: 9200 RPM
fan12: 9400 RPM
pch_haswell-virtual-0
Adapter: Virtual device
temp1: +43.0 C
psu_pmbus-i2c-17-59
Adapter: i2c-1-mux (chan_id 0)
in3: +12.06 V
fan1: 6272 RPM
temp1: +37.0 C
power2: 60.00 W
curr2: +6.00 A
lm75-i2c-15-49
Adapter: i2c-1-mux (chan_id 6)
Main Board Temperature: +40.0 C (high = +80.0 C, hyst = +75.0 C)
lm75-i2c-15-48
Adapter: i2c-1-mux (chan_id 6)
Main Board Temperature: +39.0 C (high = +80.0 C, hyst = +75.0 C)
psu_pmbus-i2c-13-5b
Adapter: i2c-1-mux (chan_id 4)
in3: +12.00 V
fan1: 6144 RPM
temp1: +36.0 C
power2: 66.00 W
curr2: +5.50 A
coretemp-isa-0000
Adapter: ISA adapter
Package id 0: +50.0 C (high = +82.0 C, crit = +104.0 C)
Core 0: +50.0 C (high = +82.0 C, crit = +104.0 C)
Core 1: +50.0 C (high = +82.0 C, crit = +104.0 C)
Core 2: +50.0 C (high = +82.0 C, crit = +104.0 C)
Core 3: +50.0 C (high = +82.0 C, crit = +104.0 C)
Signed-off-by: Jostar Yang <jostar_yang@accton.com.tw>
Why I did it
Added platform.json file for N3248TE
How I did it
Defined the platform.json file with the required components under chassis.
How to verify it
validated the API 2.0 test suite
Co-authored-by: Arun LK <Arun_L_K@dell.com>
DellEMC: N3248TE platform API2.0 changes
Why I did it
N3248TE Platform API 2.0 changes
How I did it
Implemented the functional API's needed for Platform API 2.0
Added system_health_monitoring_config.json file
How to verify it
Used the API 2.0 test suite to validate the test cases.
Co-authored-by: Arun LK <Arun_L_K@dell.com>
When do "skip_thermalcltd: true" will let "show platform fan" and "show platform temp" fail.
When enable thermalctld, these cmd will work well.
Signed-off-by: Jostar Yang <jostar_yang@accton.com.tw>
- Why I did it
Update NVIDIA Copyright header to "mellanox" files which were changed since 1.1.2022
- How I did it
Update the copyright header
- How to verify it
Sanity tests and PR checkers.
The following changes are provided to support bullseye and the latest master
branch content.
- Accommodate relocated fan and thermal sysfs entries in bullseye
- Add support for chassis and PSU HW revision
Why I did it
Fix platform issues introduced by the bullseye kernel upgrade.
How I did it
Minor fixes to Nokia ixs7215 platform code
How to verify it
Execute the following CLI commands
show platform summary
show platform fan
show platform temperature
Why I did it
The current code assumes that the value part does not have whitespace. So everything after the whitespace will be ignored. The syseeprom values returned from platform API do not match the output of "show platform syseeprom" on dx010 and e1031 device.
How I did it
This change improved the regular expression for parsing syseeprom values to accommodate whitespaces in the value.
PR 10021 provides the solution, but committed to the wrong place for dx010 and e1031.
How to verify it
Compile the sonic_platform wheel for dx010, then upload to device and install the wheel, verify the platform eeprom API.
Signed-off-by: Eric Zhu <erzhu@celestica.com>
Update device-specific files for new platform SN2201, including:
device/mellanox/x86_64-nvidia_sn2201-r0/ACS-SN2201/buffers_defaults_objects.j2
device/mellanox/x86_64-nvidia_sn2201-r0/ACS-SN2201/hwsku.json
device/mellanox/x86_64-nvidia_sn2201-r0/default_sku
device/mellanox/x86_64-nvidia_sn2201-r0/pcie.yaml
device/mellanox/x86_64-nvidia_sn2201-r0/platform.json
device/mellanox/x86_64-nvidia_sn2201-r0/platform_components.json
device/mellanox/x86_64-nvidia_sn2201-r0/sensors.conf
Signed-off-by: Kebo Liu <kebol@nvidia.com>
Co-authored-by: Stephen Sun <stephens@nvidia.com>
* Initial pass of EdgeCore platform changes.
* Remove libevent dependency from lldpd.
* Remove python2 dependencies python3.7 force from platform install script.
* Include usbmount support changes.
* Add missing 4630 install file.
* Update a few file permissions. Add umask line to Makefile. Specify python3.9 in install script.
* Misc platform updates:
- Add missing fan drawer component to sonic_platform
- Remove kernel version specification from Makefile
- Update to 4630 utility
* - Fix file permissions on source files
- Fix compile issue with 4630 driver modules (set_fs, get_fs, no longer supported in kernel 5.10)
* Fix missing/extra parens in 4630 util script.
* Fix indentation in fanutil.py.
* Integrate deltas from Edgecore to ec_platform branch.
* Installer update from Edgecore to resolve smbus serial console errors.
* Update stable_size for warm boot.
* Fix SFP dictionary key to match xcvrd.
* - Add missing define in event.py files needed for xcvrd
- Fix SFP info dict key for 7xxx switches
* 5835 platform file updates including installer and 5835 utility.
* 5835 fix for DMAR errors on serial console.
* Don't skip starting thermalctld in the pmon container.
* Revert several changes that were not related to platform.
* Run thermalctld in pmon container.
* Don't disable thermalctld in the pmon container.
* Fix prints/parens in 7816 install utility.
* - Incorporate 7816 changes from Edgecore
- Fix 7326 driver file using old kernel function
* Update kernel modules to use kernel_read().
* Fix compile errors with 7816 and 7326 driver modules.
* Fix some indents preventing platform files from loading.
* Update 7816 platform sfp dictionary to match field names in xcvrd.
* Add missing service and util files for 7816.
* Update file names, etc. based on full SKU for 7816.
* Delete pddf files not needed. These were causing conflicts with API2.0
implementation.
* Remove pddf files suggested by Edgecore that were preventing API2.0 support from starting.
* Install API2.0 file instead of pddf.
* Update 7326 mac service file to not use pddf. Fix syntax errors in 7326 utility script.
* Fix sonic_platform setup file for 7326.
* Fix syntax errors in python scripts.
* Updates to 7326 platform files.
* Fix some tab errors pulled down from master merge.
* Remove pddf files that were added from previous merge.
* Updates for 5835.
* Fix missing command byte for 5835 psu status.
* Fix permission bits on 4630 service files.
* Update platforms to use new SFP refactoring.
* Fix unused var warnings.
* [BFN] Implementation API for platform component
SONiC has a concept of "platform components"
this may include - CPLD, FPGA, BIOS, BMC, etc.
These changes are needed to read the version of the BIOS and BMC component.
What I did
Create components.py module
Add funcion for reading componet version to thrift interface
How I did it
The previous implementaion didn't have platform components API, so fwutil return an empty list.
After implementation of the platform component API, we have actual list of platform components and firmware versions
How to verify it
Run manually 'fwutil show status' or run unit tests
Previous command output
Chassis Module Component Version Description
------------------------ -------- ----------- --------- -------------
New command output
Chassis Module Component Version Description
------------------------ -------- ----------- --------- -------------
Chassis1 N/A BIOS 1.2.3 Chassis BIOS
BMC 5.1 Chassis BMC
Signed-off-by: Taras Keryk <tarasx.keryk@intel.com>
* [BFN] Implementation API for platform component
SONiC has a concept of "platform components"
this may include - CPLD, FPGA, BIOS, BMC, etc.
These changes are needed to read the version of the BIOS and BMC component.
What I did
Create components.py module
Add funcion for reading componet version to thrift interface
How I did it
The previous implementaion didn't have platform components API, so fwutil return an empty list.
After implementation of the platform component API, we have actual list of platform components and firmware versions
How to verify it
Run manually 'fwutil show status' or run unit tests
Previous command output
Chassis Module Component Version Description
------------------------ -------- ----------- --------- -------------
New command output
Chassis Module Component Version Description
------------------------ -------- ----------- --------- -------------
Chassis1 N/A BIOS 1.2.3 Chassis BIOS
BMC 5.1 Chassis BMC
Signed-off-by: Taras Keryk <tarasx.keryk@intel.com>
* [BFN] Implementation API for platform component
get chassis name from json
* [BFN] Implementation API for platform component
Updated platform and platrom_components json
* [BFN] Implementation API for platform component
Fixed spaces in component.py
* [BFN] Implementation API for platform component
Fixed exception in component.py
* Update chassis.py
* [BFN] Implementation API for platform component
Fixed spaces in component.py, chassis.py
* [BFN] Implementation API for platform component: Fixed spaces in component.py, chassis.py
* Fixed exception in get_bios_version
Signed-off-by: Jostar Yang jostar_yang@accton.com.tw
Why I did it
Fix led drv because CPLD SPEC is updated.
Fix i2c bus order
How I did it
Fix led drv. Set blacklist to i801 and ismt. Let accton util to modprobe i801 and ismt.
How to verify it
Test led and sensors cmd. Results are fine.
Why I did it
To fix I2C bus order to meet with HW SPEC. Let i801 use bus-0 and ismt use bus-1
How I did it
Modprobe i801 and then do ismt. So i801 will use bus-0 and ismt will use bus-1.
How to verify it
Test show cmd and sensors work well
Co-authored-by: Jostar Yang <jostar_yang@accton.com.tw>