Provide platform-components.json for Clearwater2 and Wolverine
These files are needed for fwutil platform sonic-mgmt tests to pass.
Fix PikeZ platform_components.json
Co-authored-by: Patrick MacArthur <pmacarthur@arista.com>
Co-authored-by: Andy Wong <andywong@arista.com>
Why I did it
Update SAI xgs version to 7.1.36.4 to include the following changes and migrate xgs to DMZ repo.
JIRA# SONIC-69731 (7.1.33.4)
Issue Summary: SAI_SWITCH_ATTR_SWITCH_HARDWARE_INFO brcm_sai_get_switch_attribute returns null.
Root Cause: Not implemented.
Fix Description: Get support for SAI switch attr SAI_SWITCH_ATTR_SWITCH_HARDWARE_INFO added
JIRA# SONIC-70737 (7.1.34.4)
Issue Summary: ECN being marked as CE even without congestion
Root Cause: ecn_thresh was set to very low value and packets were 100% marked.
Fix Description: ecn_thresh set to correct value
backport SONIC-70081 to SAI7.1 (7.1.35.4)
egress lossy queue PFC Rx fix:ignore PFC signals from egress
Update git submodules (7.1.36.4)
Update sdk-src/hsdk_6.5.24_SAI_7.1.0_GA from branch 'hsdk_6.5.24_SAI_7.1.0_GA'
to 57d0e360269c4ab659c4790ae471aa4dba2532b4
[SAI_BRANCH rel_ocp_sai_7_1] Broadcom image build failed with SAI 7.1 in DMZ repo (on bullseye)
How I did it
Update SAI xgs code.
How to verify it
Run the SONiC and SAI test with the 7.1 SAI pipeline.
Signed-off-by: zitingguo-ms <zitingguo@microsoft.com>
Why I did it
Add 'channel' to the CONFIG_DB PORT table. This will be needed to support PORT breakout to multiple channel ports so that Xcvrd can understand which datapath or channel to initialize on the CMIS compliant optics
How I did it
Add 'channel' to the CONFIG_DB PORT table.
How to verify it
Added unit test for valid and invalid channel number
Channel 0 -> No breakout
Channel 1 to 8 -> Breakout channel 1,2, ..8
Signed-off-by: Prince George <prgeor@microsoft.com>
Porting fix https://github.com/sonic-net/sonic-buildimage/pull/14045 to 202205. 202205 doesn't have fast_rate and hence only fallback is updated in yang model
#### Why I did it
Added Missing fields in sonic-portchannel yang model.
"fallback" is present in configuration schema but not in yang model. This leads to traceback when yang is validated
#### How I did it
Updated yang model
#### How to verify it
Added tests to verify
#### Ensure to add label/tag for the feature raised. example - PR#2174 under sonic-utilities repo. where, Generic Config and Update feature has been labelled as GCU.
#### Link to config_db schema for YANG module changes
Part of the PR
Why I did it
To include DNX fix
Temp workaround for CS00012281200: J2C+ : Scope of config.bcm SOC property bcm_stat_interval
How I did it
Updated SAI version
How to verify it
Basic validation on DNX platform
Why I did it
Vendor platform may require running platform specific pre-shutdown routine before shutting down the syncd process which runs the SAI and vendor sdk instance.
How I did it
Added a platform script hook which will be executed if the plugin script is provided by the platform in device//plugins/
What I did
Check /etc/pam.d/sshd integrity after modify it in hostcfgd.
Why I did it
Found some incident that /etc/pam.d/sshd become empty file during OR upgrade.
How I verified it
Pass all UT.
Add new UT to cover new code.
Why I did it
Porting/cherry-pick PR sonic-net/sonic-host-services#46
"show reboot-cause history" shows empty history. When the previous-reboot-cause has a broken symlink, And rebooting the system will not be able to generate a new symlink of the new previous-reboot-cause.
admin@sonic:~$ show reboot-cause history
Name Cause Time User Comment
------ ------- ------ ------ ---------
How I did it
Somehow, when the symlink file /host/reboot-cause/previous-reboot-cause is broken (which its destination files doesn't exist in this case), the current condition check "if os.path,exists(PREVIOUS_REBOOT_CAUSE_FILE)" will return False in determine-reboot-cause script. Hence, the current previous-reboot-cause is not been removed and the recreation of the new previous-reboot-cause failed. In case of previous-reboot-cause is a broken synlink file, add condition os.path.islink(PREVIOUS_REBOOT_CAUSE) to check and allow the remove operation happens.
How to verify it
Manually make the /host/reboot-cause/previous-reboot-cause to be a broken symlink file by removing its destination file
reboot the system. "show reboot-cause history" should show the correct info
Signed-off-by: mlok <marty.lok@nokia.com>
- Why I did it
Some scripts on syncd require Python2 support.
- How I did it
Add Python2 to syncd docker
- How to verify it
Run manually python scripts under Nvidia SDK debug to ensure they are working
Fixe #12047. After the c++ implementation of the sonic-db-cli, sonic-db-cli PING command tries to initialize the global database for all instances database starting. If all instance database-config.json are not ready yet. it will crash and generate core file. PR sonic-net/sonic-swss-common#701 only fix the crash and the process abortion.
Signed-off-by: mlok <marty.lok@nokia.com>
Why I did it
Currently the show and clear cli of dhcp_relayis may cause confusion.
How I did it
Add doc for it: [doc] Add docs for dhcp_relay show/clear cli sonic-utilities#2649
Add dhcp_relay config cli and test cases.
show dhcp_relay ipv4 helper
show dhcp_relay ipv6 destination
show dhcp_relay ipv6 counters
sonic-clear dhcp_relay ipv6 counters
How to verify it
Unit test all passed
Why I did it
[Build] Support to use loosen version when failed to install python packages
It is to fix the issue #14012
How I did it
Try to use the installation command without constraint
How to verify it
* sdk debs are now downloaded as Spectrum-SDK-Drivers-SONiC-Bins release
* sx kernel is downloaded as zip from Spectrum-SDK-Drivers
Signed-off-by: Yakiv Huryk <yhuryk@nvidia.com>
#### Why I did it
Remove dialout as critical process as it is no longer used in prod. As part of future work, can remove dialout completely
#### How I did it
Remove from critical process list
Why I did it
Platform cases test_tx_disable, test_tx_disable_channel, test_power_override failed in dx010.
How I did it
Add i2c access algorithm for CPLD i2c adapters.
How to verify it
Verify it with platform_tests/api/test_sfp.py::TestSfpApi test cases.
To support 64 cores on arista skus. Fixesaristanetworks/sonic#77
Remapped recycle ports to lowers core port ids and set appl_param_nof_ports_per_modid to 64.
Co-authored-by: Sambath Kumar Balasubramanian <63021927+skbarista@users.noreply.github.com>
Why I did it
It is to fix the broadcom build failure, it is caused by the build image docker-dhcp-relay:latest not found.
2022-12-14T00:09:57.5464893Z [ FAIL LOG START ] [ target/docker-dhcp-relay.gz-load ]
2022-12-14T00:09:57.5466036Z Attempting docker image lock for docker-dhcp-relay load
2022-12-14T00:09:57.5467113Z Obtained docker image lock for docker-dhcp-relay load
2022-12-14T00:09:57.5468206Z Loading docker image target/docker-dhcp-relay.gz
2022-12-14T00:09:57.5469361Z Loaded image: docker-dhcp-relay:internal.65852159-11ad82a07a
2022-12-14T00:09:57.5470686Z Tagging docker image docker-dhcp-relay:latest as docker-dhcp-relay-sonic:latest
2022-12-14T00:09:57.5471997Z Error response from daemon: No such image: docker-dhcp-relay:latest
2022-12-14T00:09:57.5473122Z [ FAIL LOG END ] [ target/docker-dhcp-relay.gz-load ]
2022-12-14T00:09:57.5539792Z make: *** [slave.mk:1180: target/docker-dhcp-relay.gz-load] Error 1
2022-12-14T00:09:57.5540958Z make: *** Waiting for unfinished jobs....
The image had been built succeeded
2022-12-13T17:01:59.9046935Z [ finished ] [ target/docker-eventd.gz ]
2022-12-13T17:02:00.4947165Z [ building ] [ target/docker-dhcp-relay.gz ]
2022-12-13T17:02:00.6688627Z /sonic/dockers/docker-dhcp-relay/cli-plugin-tests /sonic
2022-12-13T17:02:41.1123955Z /sonic
2022-12-13T17:07:04.1786069Z [ finished ] [ target/docker-dhcp-relay.gz ]
But it was tagged by another value:
Obtained docker image lock for docker-dhcp-relay save
Tagging docker image docker-dhcp-relay-sonic:latest as docker-dhcp-relay:internal.65852159-11ad82a07a
Saving docker image docker-dhcp-relay:internal.65852159-11ad82a07a
Released docker image lock for docker-dhcp-relay save
Removing docker image docker-dhcp-relay-sonic:latest
Untagged: docker-dhcp-relay-sonic:latest
target/docker-dhcp-relay.gz
File /dpkg_cache/docker-dhcp-relay.gz-2ddfa01a109ca69b7621f1a-450bae36026d9dee62646f2.tgz saved in cache
[ CACHE::SAVED ] /dpkg_cache/docker-dhcp-relay.gz-2ddfa01a109ca69b7621f1a-450bae36026d9dee62646f2.tgz
How I did it
When the feature SONIC_CONFIG_USE_NATIVE_DOCKERD_FOR_BUILD not enabled, always save as the latest tag, not use the specify version.
The version is dynamic, it is changed when a new commit checked in, but the image of docker-dhcp-relay is not necessary to change.
What I did:
Added IP Table rule to make sure we do not drop chassis internal traffic on eth1-midpplane when Control Plane ACL's are installed.
Why I did:
When Control Plane ACL's are installed there is default Catch All rule is added to drop all traffic that is not white-listed explicitly https://github.com/sonic-net/sonic-host-services/blob/master/scripts/caclmgrd#L735. In this case Internal Traffic between Supervisor and LC will get drop. To fix this added explicit rule to allow all traffic coming from eth1-midplane.
Fixes#11873.
When loading from minigraph, for port channels, don't create the members@ array in config_db in the PORTCHANNEL table. This is no longer needed or used.
In addition, when adding a port channel member from the CLI, that member doesn't get added into the members@ array, resulting in a bit of inconsistency. This gets rid of that inconsistency.
- Why I did it
Advance hw-mgmt service to V.7.0020.4100
Add missing thermal sensors that are supported by hw-mgmt package
Delay system health service before hw-mgmt has started on Mellanox platform in order to avoid reading some sensors before ready.
Depends on sonic-net/sonic-linux-kernel#305
- How I did it
1. Update hw mgmt version
2. Add missing sensors
3. Delay service
- How to verify it
Regression test.
Signed-off-by: Stephen Sun <stephens@nvidia.com>
Why I did it
On a supervisor card in a chassis, syncd/teamd/swss/lldp etc dockers are created for each Switch Fabric card. However, not all chassis would have all the switch fabric cards present. In this case, only dockers for Switch Fabrics present would be created.
system-health indicates errors in this scenario as it is expecting dockers for all Switch Fabrics (based on NUM_ASIC defined in asic.conf file).
system-health process error messages were also altered to indicate which container had the issue; multiple containers may run processes with the same name, which can result in identical system-health error messages, causing ambiguity.
How I did it
Port container_checker logic from #11442 into service_checker for system-health.
How to verify it
Bringup Supervisor card with one or more missing fabric cards. Execute 'show system-health summary'. The command should not report failure due to missing dockers for the asics on the fabric cards which are not present.