- Why I did it
Platform API SFP object initialization has two steps: reading the xSFP type from EEPROM, and parsing the xSFP DOM support capability. The xSFP EEPROM may not be ready when the read starts, in which case the SFP object ends up without a correctly initialized type and DOM capability, and this causes further issues. A retry mechanism is needed for this case.
- How I did it
Add flags to indicate whether the SFP object has been correctly initialized. The flag is set when an error happens, and cleared after all relevant bytes from EEPROM have been correctly read out and parsed.
Use a Python decorator on the related functions. Each time a decorated function is called, the decorator checks whether the SFP object has been correctly initialized; if not, it reads the EEPROM and parses it again.
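A minimal sketch of the flag-plus-decorator idea, not the actual platform code. The class `Sfp`, the decorator name `retry_init_if_needed`, the `init_failed` flag and the injected `read_eeprom` callable are all hypothetical names chosen for illustration.

```python
import functools


def retry_init_if_needed(method):
    """Re-run EEPROM initialization before the call if it failed earlier."""
    @functools.wraps(method)
    def wrapper(self, *args, **kwargs):
        if self.init_failed:
            self._init_from_eeprom()
        return method(self, *args, **kwargs)
    return wrapper


class Sfp:
    """Toy SFP object; read_eeprom stands in for the real EEPROM access."""

    def __init__(self, read_eeprom):
        self._read_eeprom = read_eeprom
        self.sfp_type = None
        self.dom_supported = None
        self.init_failed = True  # stays set until both fields are parsed
        self._init_from_eeprom()

    def _init_from_eeprom(self):
        data = self._read_eeprom()
        if data is None:  # EEPROM not ready yet
            self.init_failed = True
            return
        self.sfp_type = data["type"]
        self.dom_supported = data["dom"]
        self.init_failed = False  # everything read and parsed

    @retry_init_if_needed
    def get_type(self):
        return self.sfp_type


# The first read fails (EEPROM not ready), the second succeeds.
reads = iter([None, {"type": "QSFP", "dom": True}])
sfp = Sfp(lambda: next(reads))
sfp_type = sfp.get_type()  # triggers the retry
```

In this sketch the retry happens lazily on the next API call rather than in a background loop, which keeps the constructor non-blocking.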
- How to verify it
Run SFP-related platform tests to make sure no new issue is introduced.
Signed-off-by: Kebo Liu <kebol@nvidia.com>
The following packages have unmet dependencies:
libssl-dev : Depends: libssl1.1 (= 1.1.1n-0+deb11u3) but 1.1.1n-0+deb11u2 is to be installed
E: Unable to correct problems, you have held broken packages.
Fix sonic-db-cli high CPU usage on SONiC startup issue: https://github.com/Azure/sonic-buildimage/issues/10218
ETA of this issue will be 2022/05/31
Rewrite sonic-db-cli in C++ in sonic-swss-common: https://github.com/Azure/sonic-swss-common/pull/607
Modify the swss-common rules and slave.mk to install the C++ version of sonic-db-cli.
All E2E test scenarios pass.
Build and install c++ version sonic-db-cli from swss-common.
update sonic-utilities submodule
8ac2810 [202111] [generate dump] Move the Core/Log collection to the End of process Execution and removed default timeout (#2225)
77891de [202111] Fix UT failed cause by change pycommon to use swsscommon (#2085) (#2231)
Advance sonic-submodule to get the following fixes:
ebdc242 Fix fdborch to properly handle syncd FDB FLUSH Notif
8fd44da [ci] Don't publish gcov artifact when test failed
dcf6429 [ci] Change artifact reference pipeline to common lib pipeline.
387ed60 [ci] Use correct branch when downloading artifact.
169f597 [ci] Improve azp trigger settings to automatically support new release branch
- Why I did it
Update SAI pointer to include the following fix:
Fix size descriptor pool for rx/tx
- How I did it
Update SAI pointer to point on the new commit
- How to verify it
Run regression tests
#### Why I did it
The service checker's periodic operation may determine that a specific container is running, but by the time it tries to perform an operation on that container, the user has already stopped it. This is a valid flow, so we should not log an error message; an informative warning is enough.
#### How I did it
I reduced the log severity from error to warning.
#### How to verify it
I verified it manually.
- Why I did it
When LLDP is disabled through the feature command, it still gets spawned after reboot.
- How I did it
In syncd.sh, check whether the service is enabled before spawning it automatically during cold reboot.
- How to verify it
Disable the lldp feature. Perform a cold reboot and verify it is not spawned.
sonic-swss
25fe915 [crmorch] Prevent exceededLogCounter from resetting when low and high values are equal (#2327)
1c3c5e0 [BFD]Retry create BFD with different source UDP port on failure (#2225)
d5775b1 Skip consistent fail tests (#2269)
sonic-utilities
5800b73 Fix header for the output table following 'show ipv6 interface' command (#2219)
- Why I did it
A recent change that delays the PMON service in case of fast/warm reboot introduced an issue when restarting only the SWSS service after fast/warm reboot on Nvidia platforms.
The timer is triggered only when the system boots. If the system has gone through a fast/warm reboot and the user then restarts the SWSS service, the syncd.sh script stops the PMON service but the timer does not start again.
- How I did it
In the syncd.sh script, in case of a fast/warm indication, check if pmon.timer is running.
If it is running it means we are at the first boot and continue normally.
If it is not running, meaning the service was restarted, start the timer to keep the system behavior consistent.
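The check described above can be sketched as follows. The real change lives in the syncd.sh shell script; this Python version only illustrates the decision logic. The function names `systemctl` and `ensure_pmon_timer`, the `boot_type` argument and the returned status strings are assumptions made for the example.

```python
import subprocess


def systemctl(*args):
    """Run systemctl and return its exit code (0 means success/active)."""
    return subprocess.run(["systemctl", *args], check=False).returncode


def ensure_pmon_timer(boot_type, run=systemctl):
    """Start pmon.timer after a service restart that follows fast/warm reboot."""
    if boot_type not in ("fast", "warm"):
        return "not-applicable"
    # Timer still active: this is the first boot, continue normally.
    if run("is-active", "--quiet", "pmon.timer") == 0:
        return "already-running"
    # Timer not active: the service was restarted, so start the timer again.
    run("start", "pmon.timer")
    return "started"
```

The `run` parameter is injectable so the logic can be exercised without a live systemd.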
- How to verify it
Run fast/warm reboot.
service swss restart.
Observe PMON service starting.
Signed-off-by: Shlomi Bitton <shlomibi@nvidia.com>
- Why I did it
Update SAI version to 1.21.2.1 in order to gain the following fixes:
1. Optimize descriptors apply time
2. BFD - notify SONiC for admin-down event
3. Don't disable used tunnel underlay interfaces
- How I did it
Update the SAI pointer and the SAI version in the Makefile
- How to verify it
Run regression tests
sonic-utilities
e2dd672 [yang] remove mistakenly added parameter for 'get_module_name' (#2193)
2b12a39 Add check to not allow deleting PO if its member of vlan. (#2141)
sonic-platform-common
309d169 [ssd_generic] Fix innodisk health regex (#287)
Why I did it
[Build]: Fix pip version constraint conflict issue
When a version is specified in the constraint file, upgrading that version in a build script causes a conflict.
How I did it
If a version is specified on the pip command line, the version constraint for that package is skipped.
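A rough sketch of the skipping rule, assuming constraints and command-line pins both use the `name==version` form. The helper names `normalize`, `pinned_packages` and `filter_constraints` are hypothetical and do not reflect how the build scripts actually implement it.

```python
import re


def normalize(name):
    """Normalize a package name so foo_bar, foo.bar and Foo-Bar compare equal."""
    return re.sub(r"[-_.]+", "-", name).lower()


def pinned_packages(pip_args):
    """Return the names of packages given as name==version on the command line."""
    names = set()
    for arg in pip_args:
        match = re.match(r"^([A-Za-z0-9._-]+)==", arg)
        if match:
            names.add(normalize(match.group(1)))
    return names


def filter_constraints(constraint_lines, pip_args):
    """Drop constraint lines for packages pinned explicitly in pip_args."""
    skip = pinned_packages(pip_args)
    kept = []
    for line in constraint_lines:
        match = re.match(r"^\s*([A-Za-z0-9._-]+)\s*==", line)
        if match and normalize(match.group(1)) in skip:
            continue  # the command line wins, so skip this constraint
        kept.append(line)
    return kept


constraints = ["pip==20.0.2", "wheel==0.35.1"]
effective = filter_constraints(constraints, ["install", "pip==21.1"])
```

Here `effective` keeps only the `wheel` line, so the explicit `pip==21.1` request no longer conflicts with the constraint file.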
Update submodule ptr for sonic-utilities to include
[202111] [portchannel] Added ACL/PBH binding checks to the port before getting added to portchannel (#2186)
Signed-off-by: Vivek Reddy Karri <vkarri@nvidia.com>
#### Why I did it
To ensure that some internal testcases do not break due to external changes
#### How to verify it
Ran test_cfggen.py with the changes and it passed
Why I did it
It is not necessary to trigger the publish pipeline when the build has failed.
How I did it
Remove the condition in the azp task, change to use template condition.
- Why I did it
To include latest fixes:
1. Warmboot | When trying to reconfigure the Flex Parser header and Flex transition parameters after ISSU, the switch returned an error even if the configuration was identical to the one applied before performing the ISSU.
2. Link Up | When toggling many ports of the Spectrum devices while raising 10GbE link up and link maintenance is enabled, the switch may get stuck and may need to be rebooted.
3. Shared buffer | While moving from lossless to lossy while shared headroom was used, reduction of the shared headroom can only be done prior to pool type change and when shared headroom is not utilized.
4. Added support for Finisar DR4 (FTCD4523E2PCM) on Spectrum-2 and Spectrum-3 systems.
SAI
1. ECMP overlay support for IPv4 and IPv6
2. BFD offloading / 4K scale
SAI fixes
1. Reduce verbosity of print in case packet ingress on invalid port
2. Added support for Host table entry removal API to remove registration of a trap to a channel
- How I did it
Updated SAI & SDK submodules along with the relevant Makefiles
- How to verify it
Build an image and run tests from "sonic-mgmt".
Signed-off-by: Volodymyr Samotiy <volodymyrs@nvidia.com>
- Why I did it
The error message "ERR healthd: Failed to read from file /var/run/hw-management/led/led_status_capability" is observed during system startup.
The system-health daemon waits for 5 minutes before it starts to run.
During this time, the only thing it does is set the LED.
However, the corresponding sysfs node is not yet ready when it is read, which causes the error message.
- How I did it
Defer system-health daemon until hw-management service starts
- How to verify it
Run regression test
Signed-off-by: Stephen Sun <stephens@nvidia.com>
Why I did it
UT for sonic-config-engine is broken.
How I did it
Remove yang validation.
How to verify it
Run UT for sonic-config-engine.
Signed-off-by: Gang Lv <ganglv@microsoft.com>
sonic-utilities
0225195 Accept 0 for queue and dscp (#2162)
282faf0 [show][vrf]Fixing show vrf to include vlan subinterface (#2158)
f3f1b11 Validate destination port is not LAG (#2053)
sonic-platform-common
0f6cccd [sonic_ssd] Nokia-7215: "show platform ssdhealth" not showing health percent (#279)
Why I did it
Config db schema generated by minigraph should run yang validation.
How I did it
Modify run_script to add yang validation.
How to verify it
Run sonic-config-engine unit test.
Signed-off-by: Gang Lv <ganglv@microsoft.com>
Why I did it
Support triggering a pipeline that downloads artifacts and publishes them to storage and the container registry.
Support specifying patterns that select which docker images to upload.
How I did it
Pass the pipeline information and the artifact information as pipeline parameters to the pipeline that is triggered. This decouples artifact generation from the publish logic: how and where the artifacts and docker images are published depends on the triggered pipeline.
How to verify it
- Why I did it
The platform_reboot files for simx don't do anything apart from calling /sbin/reboot, which is done anyway in the /usr/local/bin/reboot script, i.e. the parent script that calls the platform-specific reboot scripts if present.
Moreover, /sbin/reboot invoked in the platform-specific reboot script is a non-blocking call, so it returns to the original script (although /sbin/reboot does its job in the background) and we see messages like this.
Signed-off-by: Vivek Reddy Karri <vkarri@nvidia.com>
In Makefile.cache, for $(1)_DEP_PKGS_SHA, the intention is to include
the DEP_MOD_SHA and MOD_HASH of each of the current package's
dependencies. However, there's a level of dereferencing missing; instead
of grabbing the value of $(dfile)_DEP_MOD_SHA, it is literally using the
variable name $(dfile)_DEP_MOD_SHA. This means that the value of this
variable will not change when some dependency changes.
The impact of this is in transitive dependencies. For a specific
example, if there is some change in sairedis, then sairedis will be
rebuilt (because there's a change within that component), and swss will
be rebuilt (because it's a direct dependency), but
docker-swss-layer-buster will not get rebuilt, because only the direct
dependencies are effectively being checked, and those aren't changing.
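The effect of the missing dereference can be shown with a small Python model. This is a simplified analogy, not the Makefile.cache logic itself: the `DEPS` table, the `content` dictionary (standing in for each package's MOD_HASH) and the `dep_pkgs_sha` function are assumptions made for illustration.

```python
import hashlib

# Hypothetical dependency chain taken from the example in the text.
DEPS = {
    "sairedis": [],
    "swss": ["sairedis"],
    "docker-swss-layer-buster": ["swss"],
}


def sha(text):
    return hashlib.sha256(text.encode()).hexdigest()[:12]


def dep_pkgs_sha(pkg, content, deref):
    """Hash over each dependency's DEP_MOD_SHA and MOD_HASH analogues.

    deref=False mimics the bug: the literal variable name is hashed
    instead of the value that the variable holds.
    """
    parts = []
    for dep in DEPS[pkg]:
        if deref:
            dep_sha = dep_pkgs_sha(dep, content, deref)  # actual value
        else:
            dep_sha = dep + "_DEP_MOD_SHA"  # literal name, never changes
        parts.append(dep_sha + ":" + content[dep])
    return sha(" ".join(parts))


before = {"sairedis": "v1", "swss": "v1", "docker-swss-layer-buster": "v1"}
after = dict(before, sairedis="v2")  # only sairedis changes

top = "docker-swss-layer-buster"
buggy_changed = dep_pkgs_sha(top, before, False) != dep_pkgs_sha(top, after, False)
fixed_changed = dep_pkgs_sha(top, before, True) != dep_pkgs_sha(top, after, True)
```

In this model `buggy_changed` is false and `fixed_changed` is true: without the dereference, a change two levels down never reaches the top-level hash.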
Signed-off-by: Saikrishna Arcot <sarcot@microsoft.com>
Fix issue: Non compliant leaf list in config_db schema: https://github.com/Azure/sonic-buildimage/issues/9801
The basic flow of DPB is like:
1. Transfer config db json value to YANG json value, name it “yangIn”
2. Validate “yangIn” by libyang
3. Generate a YANG json value to represent the target configuration, name it “yangTarget”
4. Do diff between “yangIn” and “yangTarget”
5. Apply the diff to CONFIG DB json and save it back to DB
The fix:
• For step #1, if the value of a leaf-list field is a string, convert it to a list by splitting it on “,” so that it passes validation in step #2. Also save <table_name>.<key>.<field_name> to a set named “leaf_list_with_string_value_set”.
• For step #5, loop over “leaf_list_with_string_value_set” and change those fields back to strings.
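The two parts of the fix can be sketched as below. This is an illustration only and not the code in sonic_yang_ext.py: the function names, the nested-dict config layout and the `leaf_list_fields` argument (the set of table/field pairs that the YANG model declares as leaf-list) are assumptions.

```python
import copy


def to_yang_input(config, leaf_list_fields):
    """Step 1: split string-valued leaf-list fields into lists.

    Returns the converted config and the set of
    <table_name>.<key>.<field_name> paths that were converted.
    """
    yang_in = copy.deepcopy(config)
    leaf_list_with_string_value_set = set()
    for table, entries in yang_in.items():
        for key, fields in entries.items():
            for field, value in fields.items():
                if (table, field) in leaf_list_fields and isinstance(value, str):
                    fields[field] = value.split(",")
                    leaf_list_with_string_value_set.add(f"{table}.{key}.{field}")
    return yang_in, leaf_list_with_string_value_set


def restore_strings(config, leaf_list_with_string_value_set):
    """Step 5: join the recorded fields back into comma-separated strings."""
    restored = copy.deepcopy(config)
    for path in leaf_list_with_string_value_set:
        # Keys may contain dots (for example IP prefixes), so split the
        # table off the front and the field off the back.
        table, rest = path.split(".", 1)
        key, field = rest.rsplit(".", 1)
        restored[table][key][field] = ",".join(restored[table][key][field])
    return restored


config_db = {"ACL_TABLE": {"DATAACL": {"ports": "Ethernet0,Ethernet4", "type": "L3"}}}
yang_in, marked = to_yang_input(config_db, {("ACL_TABLE", "ports")})
round_trip = restore_strings(yang_in, marked)
```

Recording the converted paths means that only fields that arrived as strings are turned back into strings, while fields that were already lists stay lists.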
1. Manual test
2. Changed sample config DB and unit test passed
Conflicts:
src/sonic-yang-mgmt/sonic_yang_ext.py