Why I did it
Fix UT failure caused by typing-extensions version update.
Work item tracking
Microsoft ADO (number only): 25123371
How I did it
How to verify it
* [Build] Fix the PyYAML python package installation issue (#15890)
Why I did it
Fix the armhf build failure.
How to reproduce the issue:
docker run -it debian:bullseye bash
apt-get update && apt-get install -y python3-pip
pip3 install PyYAML==5.4.1
Error message:
Collecting PyYAML==5.4.1
Installing build dependencies ... done
Getting requirements to build wheel ... error
ERROR: Command errored out with exit status 1:
command: /usr/bin/python3 /tmp/tmp6xabslgb_in_process.py get_requires_for_build_wheel /tmp/tmp_er01ztl
....
raise AttributeError(attr)
AttributeError: cython_sources
----------------------------------------
WARNING: Discarding d63f2d7597/PyYAML-5.4.1.tar.gz (sha256)=607774cbba28732bfa802b54baa7484215f530991055bb562efbed5b2f20a45e (from https://pypi.org/simple/pyyaml/) (requires-python:>=2.7, !=3.0.*, !=3.1.*, !=3.2.*, !=3.3.*, !=3.4.*, !=3.5.*). Command errored out with exit status 1: /usr/bin/python3 /tmp/tmp6xabslgb_in_process.py get_requires_for_build_wheel /tmp/tmp_er01ztl Check the logs for full command output.
ERROR: Could not find a version that satisfies the requirement PyYAML==5.4.1
ERROR: No matching distribution found for PyYAML==5.4.1
root@fa2fa92edcfd:/#
Adding the --no-build-isolation option makes the installation succeed (this is the fix):
pip3 install "PyYAML==5.4.1" --no-build-isolation
The same error appears in multiple builds.
Work item tracking
Microsoft ADO (number only): 24567457
How I did it
Add a build option --no-build-isolation.
* Fix docker-platform-monitor python2 issue
* Fix wheel dependency issue
Why I did it
Fix the issue where all mirrors are commented out in sources.list in the slave image, which breaks installing additional packages in the slave container.
Running the add-apt-repository command adds an extra space character after the # of commented-out entries.
For example:
The original config in /etc/apt/sources.list
#deb [arch=amd64] http://deb.debian.org/debian/ bullseye main contrib non-free
Run the following command:
add-apt-repository "deb [arch=amd64] https://download.docker.com/linux/debian bullseye stable"
Then the setting changed to: (added a new space character after #)
# deb [arch=amd64] http://deb.debian.org/debian/ bullseye main contrib non-free
How I did it
Extend the regex to allow an optional space after the #. With the fix, the entry matches whether or not the space character is present.
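A minimal sketch of the idea in Python. The real regex lives in the build scripts and is not shown here, so the pattern and the `is_commented_mirror` helper below are assumptions for illustration only. Allowing optional whitespace after the `#` makes both forms of the entry match:

```python
import re

# Hypothetical pattern: "#", optional whitespace, then a "deb" entry.
COMMENTED_MIRROR = re.compile(r"^#\s*deb\s")

def is_commented_mirror(line):
    """Return True if the line is a commented-out deb entry."""
    return COMMENTED_MIRROR.match(line) is not None

entry = "deb [arch=amd64] http://deb.debian.org/debian/ bullseye main contrib non-free"
print(is_commented_mirror("#" + entry))   # form without the space
print(is_commented_mirror("# " + entry))  # form left by add-apt-repository
```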
How to verify it
Co-authored-by: xumia <59720581+xumia@users.noreply.github.com>
Why I did it
Set build options in pipeline UI.
Support setting reproducible build options to py2,py3 in release branch and none in master branch.
Work item tracking
Microsoft ADO (number only): 22335854
How I did it
How to verify it
Why I did it
Enable the reproducible build for PR build for master branch
Fix the reproducible build variable display error in the slave container.
The value below is displayed as none even though the option is set and takes effect:
"SONIC_VERSION_CONTROL_COMPONENTS": "none"
How I did it
Pass the variable on the slave container command line.
The variable already reaches the slave container and the other docker containers through a config file; the command-line value is only used to display it during the build.
How to verify it
See https://dev.azure.com/mssonic/build/_build/results?buildId=247960&view=logs&j=88ce9a53-729c-5fa9-7b6e-3d98f2488e3f&t=88f376cf-c35d-5783-0a48-9ad83a873284
"SONIC_VERSION_CONTROL_COMPONENTS": "deb,py2,py3,web,git,docker"
Cherry pick PR#12592
Why I did it
The nameserver and domain entries from the build system's fsroot get into the SONiC image.
How I did it
Clear /etc/resolv.conf before building image
How to verify it
Built image with it and verified with install that /etc/resolv.conf is empty
Co-authored-by: Devesh Pathak <54966909+devpatha@users.noreply.github.com>
Why I did it
Cherry-pick commits from master to support the snapshot based mirror, and fix the code conflicts.
ad162ae [Build] Optimize the version control for Debian packages (#14557)
38c5d7f [Build] Support j2 template for debian sources for docker ptf (#13198)
5e4826e [Ci] Support to use the same snapshot for all platform builds (#13913)
8206925 [Build] Change the default mirror version config file (#13786)
5e4a866 [Build] Support Debian snapshot mirror to improve build stability (#13097)
ac5d89c [Build] Support j2 template for debian sources (#12557)
Work item tracking
Microsoft ADO (number only): 18018114
How I did it
How to verify it
Why I did it
mmh3's new version 3.1.0 breaks pipeline build.
bullseye/buster/jessie pinned the version to 2.5.1
How I did it
Pin the mmh3 version as the other dists do.
How to verify it
Co-authored-by: Liu Shilong <shilongliu@microsoft.com>
Why I did it
Add platform files for critical processes and default qos config for Innovium platforms
How I did it
Added default files for critical processes and qos config
How to verify it
Tested with autorestart/test_container_autorestart.py::test_containers_autorestart
Signed-off-by: rck-innovium rck@innovium.com
Why I did it
The sonic-slave-stretch build failed after mmh3 was updated to version 3.1.0 on Mar 24.
How I did it
Enable reproducible build for vhdx image.
How to verify it
Why I did it
In PR check pipelines, there are too many duplicated warnings:
fatal: No names found, cannot describe anything.
SONIC_IMAGE_VERSION does not change within one build, so it does not need to be recalculated at every reference; it only needs to be calculated once and recorded.
In a Makefile, a variable assigned with '=' is re-evaluated every time it is referenced.
How I did it
Fix it in Makefile.
How to verify it
Check this PR's check pipeline result.
Why I did it
Docker builds occasionally hang.
They hang at different steps, so it looks like a bug in the docker daemon.
How I did it
Start a daemon process that scans for processes running longer than 1 hour and kills them.
How to verify it
Why I did it
If make fails, we can't rerun the make process, because existing patches can't apply again.
How I did it
Check whether the patches are already applied; if so, do not apply them again.
How to verify it
Why I did it
sonic_host_services depends on deepdiff.
But the latest deepdiff version has an error.
How I did it
Pin deepdiff to the previous version.
How to verify it
Why I did it
The Makefile fetches some dependencies from the Internet, so it can fail because of network-related issues.
Retries fix most of these failures.
How I did it
Add retries when running commands that may depend on the network.
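The retry idea can be sketched in Python as follows. This is only an illustration: the actual change is in the Makefile, and the `run_with_retries` name, the attempt count, and the delay are assumptions, not values taken from the PR.

```python
import subprocess
import time

def run_with_retries(cmd, attempts=3, delay=1.0, runner=subprocess.run):
    """Run cmd, retrying on a non-zero exit code up to `attempts` times.

    `runner` defaults to subprocess.run and can be swapped out for testing.
    """
    for attempt in range(1, attempts + 1):
        result = runner(cmd)
        if result.returncode == 0:
            return result
        if attempt < attempts:
            time.sleep(delay)  # give a transient network problem time to clear
    raise RuntimeError(f"command failed after {attempts} attempts: {cmd}")
```

A caller would wrap only the network-dependent commands, for example `run_with_retries(["git", "fetch"])`, and leave purely local commands to fail fast.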
How to verify it
Cherry-pick PR: #11846
Signed-off-by: Saikrishna Arcot sarcot@microsoft.com
Why I did it
The error handling code for when a deb package fails to be
installed currently has a chain of commands linked together by && and
ends with exit 1. The assumption is that the commands would succeed,
and the last exit 1 would end it with a non-zero return code, thus
fully failing the target and causing the build to stop because of bash's
-e flag.
However, if one of the commands prior to exit 1 returns a non-zero
return code, then bash won't actually treat it as a terminating error.
From bash's man page:
-e Exit immediately if a pipeline (which may consist of a single simple
command), a list, or a compound command (see SHELL GRAMMAR above),
exits with a non-zero status. The shell does not exit if the
command that fails is part of the command list immediately
following a while or until keyword, part of the test following the
if or elif reserved words, part of any command executed in a && or
|| list except the command following the final && or ||, any
command in a pipeline but the last, or if the command's return
value is being inverted with !. If a compound command other than a
subshell returns a non-zero status because a command failed while
-e was being ignored, the shell does not exit.
The part "part of any command executed in a && or || list except the
command following the final && or ||" says that if the failing command
is not the exit 1 that we have at the end, then bash doesn't treat it
as an error and exit immediately. Additionally, since this is a compound
command, but isn't in a subshell (subshells are marked by ( and ),
whereas { and } just tell bash to run the commands in the current
environment), bash doesn't exit. The result of this is that in the
deb-install target, if a package installation fails, it may be
infinitely stuck in that while-loop.
This was seen when the snmpd package upgrade happened, and
builds were failing to install the mismatching libsnmp-dev package,
the builds did not immediately terminate; instead, the installation
was retried again and again, suggesting it was stuck in some infinite
loop. The build jobs finally terminated only because of the timeout
specified for the jobs.
How I did it
There are two fixes for this: change to using a subshell, or use ;
instead of &&. Using a subshell would, I think, require exporting any
shell variables used in the subshell, so I chose to change the && to
;. In addition, at the start of the subshell, set +e is added in,
which removes the exit-on-error handling of bash. This makes sure that
all commands are run (the output of which may help for debugging) and
that it still exits with 1, which will then fully fail the target.
How to verify it
Why I did it
Change the path of sonic submodules that point to "Azure" to point to "sonic-net"
How I did it
Replace "Azure" with "sonic-net" on all relevant paths of sonic submodules
Why I did it
Fix the issue where the official build is not triggered correctly because the azp template path does not exist.
How I did it
Change the azp template path.
Why I did it
isc-dhcp currently uses the code below to remove a DHCP option:
memmove(sp, op, op[1] + 2);
sp += op[1] + 2;
sp points to the option to be stripped; call it option S.
op points to the option after option S; call it option O.
A DHCP option is a typical type-length-value structure: the first byte is the type, the second byte is the length, and the remaining bytes are the value.
In this case, option O is longer than option S and more than 2 bytes long, so after the memmove both option S and option O have been overwritten. op[1] was the length of option O, and the memmove has modified it.
But the current implementation still uses op[1] as the length to update sp (sp += op[1] + 2), so we get the wrong sp.
How I did it
Create patch from https://github.com/isc-projects/dhcp
The new implementation uses mlen to store the length of option O before the memmove, which is how it fixes the bug.
size_t mlen = op[1] + 2;
memmove(sp, op, mlen);
sp += mlen;
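The effect can be reproduced with a small Python simulation of the two C snippets, using a bytearray as the option buffer and integer indices in place of the sp and op pointers. The option types and values below are arbitrary example data, not taken from a real packet, and the function names are made up for this sketch:

```python
def strip_option_buggy(buf, sp, op):
    """Mirror the old C code: op[1] is re-read after the move."""
    n = buf[op + 1] + 2
    buf[sp:sp + n] = buf[op:op + n]      # memmove(sp, op, op[1] + 2)
    return sp + buf[op + 1] + 2          # sp += op[1] + 2 (op[1] may be clobbered)

def strip_option_fixed(buf, sp, op):
    """Mirror the patched C code: the length is saved before the move."""
    mlen = buf[op + 1] + 2               # size_t mlen = op[1] + 2
    buf[sp:sp + mlen] = buf[op:op + mlen]
    return sp + mlen                     # sp += mlen

# Option S: type 82, length 2. Option O: type 12, length 4, value b"host".
packet = bytes([82, 2, 0xAA, 0xBB, 12, 4]) + b"host"

print(strip_option_buggy(bytearray(packet), sp=0, op=4))
print(strip_option_fixed(bytearray(packet), sp=0, op=4))
```

With this data the move copies 6 bytes over a 4-byte option, so the byte at op[1] is replaced by the last value byte of option O. The buggy version then advances sp by that byte plus 2, while the fixed version advances it by the saved 6.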
How to verify it
I have a PR for sonic-mgmt to cover this issue:
sonic-net/sonic-mgmt#6330
Signed-off-by: Gang Lv ganglv@microsoft.com
Co-authored-by: ganglv <88995770+ganglyu@users.noreply.github.com>
Why I did it
Fix the issue where versions are not locked for apt-get remove/purge when no apt-get options are specified.
How I did it
Add a space character before and after the command line parameters.
Co-authored-by: xumia <59720581+xumia@users.noreply.github.com>
Why I did it
Cherry-pick #12009, and fix code conflict.
Fix the dbus-python installation failure when building docker-sonic-vs, caused by the command dbus-run-session not being found.
The command "dbus-run-session" appears to be a new dependency introduced in dbus-python 1.3.2; the old version 1.2.18 does not have the issue.
How I did it
Install the dbus debian package which contains the command dbus-run-session.
It is not a blocking issue on release branches. Release branches with the reproducible build feature avoid this issue in official builds and PR builds; it only blocks the version upgrade (from 1.2.18 to 1.3.2).
How to verify it
- Why I did it
Update SDK/FW version - 4.5.2318/2010_2318 to pick up new fixes:
1. Cr space timeout on Hold and Release GW - at warm boot
2. SPC-1 Port in stuck PHY_UP after peer side rebooted
3. memory leak in sx_api_router_ecmp_update_set
- How I did it
update the make file with the new version number
update submodule Switch-SDK-drivers pointer
- How to verify it
run sonic regression
Signed-off-by: Kebo Liu <kebol@nvidia.com>
* [snmpd]: Update to 5.9+dfsg-4+deb11u1 to match Debian version
This brings in some security fixes.
Signed-off-by: Saikrishna Arcot <sarcot@microsoft.com>
Co-authored-by: Saikrishna Arcot <sarcot@microsoft.com>
- Why I did it
This is for the eventual support of multiple architectures for the mellanox platform.
- How I did it
Change the location of the binaries in Switch-SDK-drivers so that the path specifies the target architecture in addition to the target distribution that the debians are built for.
This is the most straightforward way to separate binaries built against different architectures and selectively target them for installation in the mellanox SONiC image.
- How to verify it
Build SONiC for mellanox and verify it compiles successfully.
Why I did it
Fix the missing debian package for reproducible build issue.
The gnupg2 should be added into the version file.
https://dev.azure.com/mssonic/build/_build/results?buildId=118139&view=logs&j=88ce9a53-729c-5fa9-7b6e-3d98f2488e3f&t=8d99be27-49d0-54d0-99b1-cfc0d47f0318
The following packages have unmet dependencies:
gnupg2 : Depends: gnupg (>= 2.2.27-2+deb11u2) but 2.2.27-2+deb11u1 is to be installed
E: Unable to correct problems, you have held broken packages.
The issue was caused by gnupg2 being removed without the removal being detected.
sonic-buildimage/build_debian.sh, line 250 in 4fb6cf0:
sudo LANG=C chroot $FILESYSTEM_ROOT apt-get -y remove software-properties-common gnupg2 python3-gi
Why I did it
Fix the openssh build issue, upgrade from 8.4p1-5 to 8.4p1-5+deb11u1.
https://dev.azure.com/mssonic/build/_build/results?buildId=120209&view=logs&j=88ce9a53-729c-5fa9-7b6e-3d98f2488e3f&t=8d99be27-49d0-54d0-99b1-cfc0d47f0318
+ sudo dpkg --root=./fsroot-broadcom -i target/debs/bullseye/openssh-server_8.4p1-5_amd64.deb
dpkg: warning: downgrading openssh-server from 1:8.4p1-5+deb11u1 to 1:8.4p1-5
(Reading database ... 44818 files and directories currently installed.)
Preparing to unpack .../openssh-server_8.4p1-5_amd64.deb ...
Unpacking openssh-server (1:8.4p1-5) over (1:8.4p1-5+deb11u1) ...
dpkg: dependency problems prevent configuration of openssh-server:
openssh-server depends on openssh-client (= 1:8.4p1-5); however:
Version of openssh-client on system is 1:8.4p1-5+deb11u1.
dpkg: error processing package openssh-server (--install):
dependency problems - leaving unconfigured
Errors were encountered while processing:
openssh-server
+ clean_sys
How I did it
Upgrade openssh from 8.4p1-5 to 8.4p1-5+deb11u1.
How to verify it
Why I did it
When any test job fails in the test stage, rerunning does not work: the test stage is skipped automatically, so there is no chance to rerun it, the test checks stay in failed status, and the PR is blocked from merging forever.
This is probably caused by the condition in the Test stage; we should specify the status of the BuildVS stage.
How I did it
Fix stage dependency logic.
- Why I did it
During platform API SFP object initialization there are two steps: one reads the xSFP type from the EEPROM, and the other parses the xSFP DOM support capability. The xSFP EEPROM may not be ready when the read starts, which leaves the SFP object without a correctly initialized type and DOM capability and causes further issues. A retry mechanism is needed for this case.
- How I did it
Add flags to indicate whether the SFP object has been correctly initialized: a flag is set when an error happens and cleared after all relevant bytes from the EEPROM have been read out and parsed correctly.
Use a Python decorator on the related functions: each time such a function is called, the decorator checks whether the SFP object has been correctly initialized and, if not, reads the EEPROM and parses it again.
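A minimal sketch of this pattern, not the actual platform code: the `Sfp` class, the `ensure_initialized` decorator, and the `read_eeprom` callable are hypothetical names, and the "type" is simply taken as the first EEPROM byte for illustration.

```python
import functools

def ensure_initialized(method):
    """Retry initialization before the call if the previous attempt failed."""
    @functools.wraps(method)
    def wrapper(self, *args, **kwargs):
        if self._init_failed:
            self._try_init()
        return method(self, *args, **kwargs)
    return wrapper

class Sfp:
    """Hypothetical stand-in for the platform SFP object."""

    def __init__(self, read_eeprom):
        self._read_eeprom = read_eeprom  # callable returning bytes, or None if not ready
        self.sfp_type = None
        self._init_failed = True
        self._try_init()

    def _try_init(self):
        data = self._read_eeprom()
        if data is None:                 # EEPROM not ready yet: keep the flag set
            self._init_failed = True
            return
        self.sfp_type = data[0]          # assumption: type is the first byte
        self._init_failed = False        # everything parsed: clear the flag

    @ensure_initialized
    def get_type(self):
        return self.sfp_type
```

The decorator keeps the retry logic in one place, so each API method stays free of explicit "is the object ready" checks.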
- How to verify it
Run SFP-related platform tests to make sure no new issue is introduced.
Signed-off-by: Kebo Liu <kebol@nvidia.com>
The following packages have unmet dependencies:
libssl-dev : Depends: libssl1.1 (= 1.1.1n-0+deb11u3) but 1.1.1n-0+deb11u2 is to be installed
E: Unable to correct problems, you have held broken packages.
Fix sonic-db-cli high CPU usage on SONiC startup issue: https://github.com/Azure/sonic-buildimage/issues/10218
ETA of this issue will be 2022/05/31
Re-write sonic-cli with c++ in sonic-swss-common: https://github.com/Azure/sonic-swss-common/pull/607
Modify swss-common rules and slave.mk to install c++ version sonic-db-cli.
Passed all E2E test scenarios.
Build and install c++ version sonic-db-cli from swss-common.
update sonic-utilities submodule
8ac2810 [202111] [generate dump] Move the Core/Log collection to the End of process Execution and removed default timeout (#2225)
77891de [202111] Fix UT failed cause by change pycommon to use swsscommon (#2085) (#2231)
Advance sonic-submodule to get the following fixes:
ebdc242 Fix fbdorch to properly handle syncd FDB FLUSH Notif
8fd44da [ci] Don't publish gcov artifact when test failed
dcf6429 [ci] Change artifact reference pipeline to common lib pipeline.
387ed60 [ci] Use correct branch when downloading artifact.
169f597 [ci] Improve azp trigger settings to automaticlly support new release branch
- Why I did it
Update SAI pointer to include the following fix:
Fix size descriptor pool for rx/tx
- How I did it
Update SAI pointer to point on the new commit
- How to verify it
Run regression tests
#### Why I did it
The service checker's periodic operation may determine that a specific container is running, but by the time it tries to perform an operation on it, the user has already stopped it. This is a valid flow and should not log an error message; an informative warning is enough.
#### How I did it
I reduce log severity.
#### How to verify it
I verified it manually.
- Why I did it
When LLDP is disabled through feature command, it gets spawned after reboot.
- How I did it
In syncd.sh check if the service is enabled before spawning automatically during cold reboot.
- How to verify it
Disable the lldp feature. Perform a cold reboot and verify it's not spawned.
sonic-swss
25fe915 [crmorch] Prevent exceededLogCounter from resetting when low and high values are equal (#2327)
1c3c5e0 [BFD]Retry create BFD with different source UDP port on failure (#2225)
d5775b1 Skip consistent fail tests (#2269)
sonic-utilities
5800b73 Fix header for the output table following 'show ipv6 interface' command (#2219)
- Why I did it
A recent change to delay the PMON service on fast/warm reboot introduced an issue when restarting only the SWSS service after a fast/warm reboot on Nvidia platforms.
The timer is triggered only when the system boots. If the user restarts the SWSS service after a fast/warm reboot, the syncd.sh script stops the PMON service, but the timer does not start again.
- How I did it
On syncd.sh script, in case of fast/warm indication, check if pmon.timer is running.
If it is running it means we are at the first boot and continue normally.
If it is not running, meaning the service was restarted, start the timer to keep the system behavior consistent.
- How to verify it
Run fast/warm reboot.
service swss restart.
Observe PMON service starting.
Signed-off-by: Shlomi Bitton <shlomibi@nvidia.com>
- Why I did it
Update SAI version to 1.21.2.1 in order to gain the following fixes:
1. Optimize descriptors apply time
2. BFD - notify SONiC for admin-down event
3. Don't disable used tunnel underlay interfaces
- How I did it
Update SAI pointer and update SAI version on make file
- How to verify it
Run regression tests
sonic-utilities
e2dd672 [yang] remove mistakenly added parameter for 'get_module_name' (#2193)
2b12a39 Add check to not allow deleting PO if its member of vlan. (#2141)
sonic-platform-common
309d169 [ssd_generic] Fix innodisk health regex (#287)
Why I did it
[Build]: Fix pip version constraint conflict issue
When a version is specified in the constraint file, upgrading that version in a build script causes a conflict.
How I did it
If a version is specified on the pip command line, the version constraint for that package is skipped.
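As a rough sketch of that rule in Python (the real logic is in the build's pip wrapper and is not shown here; the function names and the `name==version` matching below are assumptions for illustration):

```python
import re

# Matches arguments of the form name==version; options such as
# --no-build-isolation do not match because they start with "-".
PINNED_ARG = re.compile(r"^([A-Za-z0-9][A-Za-z0-9._-]*)==")

def packages_pinned_on_cli(args):
    """Return the lower-cased names of packages pinned on the command line."""
    names = set()
    for arg in args:
        match = PINNED_ARG.match(arg)
        if match:
            names.add(match.group(1).lower())
    return names

def filter_constraints(constraints, args):
    """Drop constraint entries for packages already pinned on the command line."""
    pinned = packages_pinned_on_cli(args)
    return [c for c in constraints
            if c.split("==")[0].strip().lower() not in pinned]

constraints = ["PyYAML==5.4.1", "mmh3==2.5.1"]
print(filter_constraints(constraints, ["install", "PyYAML==6.0"]))
```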
Update submodule ptr for sonic-utilities to include
[202111] [portchannel] Added ACL/PBH binding checks to the port before getting added to portchannel (#2186)
Signed-off-by: Vivek Reddy Karri <vkarri@nvidia.com>
#### Why I did it
To ensure that some internal testcases do not break due to external changes
#### How to verify it
Ran test_cfggen.py with the changes and it passed
Why I did it
It is not necessary to trigger the publish pipeline when the build has failed.
How I did it
Remove the condition in the azp task, change to use template condition.
- Why I did it
To include latest fixes:
1. Warmboot | When trying to reconfigure the Flex Parser header and Flex transition parameters after ISSU, the switch returned an error even if the configuration was identical to that done before performing the ISSU.
2. Link Up | When toggling many ports of the Spectrum devices while raising 10GbE link up and link maintenance is enabled, the switch may get stuck and may need to be rebooted.
3. Shared buffer | While moving from lossless to lossy while shared headroom was used, reduction of the shared headroom can only be done prior to pool type change and when shared headroom is not utilized.
4. Added support for Finisar DR4 (FTCD4523E2PCM) on Spectrum-2 and Spectrum-3 systems.
SAI
1. ECMP overlay support for IPv4 and IPv6
2. BFD offloading / 4K scale
SAI fixes
1. Reduce verbosity of print in case packet ingress on invalid port
2. Added support for Host table entry removal API to remove registration of a trap to a channel
- How I did it
Updated SAI & SDK submodules along with the relevant Makefiles
- How to verify it
Build an image and run tests from "sonic-mgmt".
Signed-off-by: Volodymyr Samotiy <volodymyrs@nvidia.com>
- Why I did it
Error message: "ERR healthd: Failed to read from file /var/run/hw-management/led/led_status_capability" is observed during system starting
The system-health daemon will wait for 5 minutes before it starts to run.
During this time, the only thing it does is to set the LED even before it starts.
However, the corresponding sysfs has not been ready at the time it is being read, which causes the error message.
- How I did it
Defer system-health daemon until hw-management service starts
- How to verify it
Run regression test
Signed-off-by: Stephen Sun <stephens@nvidia.com>
Why I did it
UT for sonic-config-engine is broken.
How I did it
Remove yang validation.
How to verify it
Run UT for sonic-config-engine.
Signed-off-by: Gang Lv ganglv@microsoft.com
sonic-utilities
0225195 Accept 0 for queue and dscp (#2162)
282faf0 [show][vrf]Fixing show vrf to include vlan subinterface (#2158)
f3f1b11 Validate destination port is not LAG (#2053)
sonic-platform-common
0f6cccd [sonic_ssd] Nokia-7215: "show platform ssdhealth" not showing health percent (#279)
Why I did it
Config db schema generated by minigraph should run yang validation.
How I did it
Modify run_script to add yang validation.
How to verify it
Run sonic-config-engine unit test.
Signed-off-by: Gang Lv ganglv@microsoft.com
Why I did it
Support to trigger a pipeline to download and publish artifacts to storage and container registry.
Support to specify the patterns which docker images to upload.
How I did it
Pass the pipeline information and the artifact information as pipeline parameters to the pipeline that is triggered. This decouples artifact generation from the publish logic: how and where the artifacts/docker images are published depends on the triggered pipeline.
How to verify it
- Why I did it
The platform_reboot files for simx don't do anything apart from calling /sbin/reboot, which is done anyway in the /usr/local/bin/reboot script, i.e. the parent script that calls the platform-specific reboot scripts if present.
Moreover, /sbin/reboot invoked in the platform specific reboot script is a non-blocking call and thus it returns back to the original script (although /sbin/reboot does it job in the background) and we see messages like this.
Signed-off-by: Vivek Reddy Karri <vkarri@nvidia.com>
In Makefile.cache, for $(1)_DEP_PKGS_SHA, the intention is to include
the DEP_MOD_SHA and MOD_HASH of each of the current package's
dependencies. However, there's a level of dereferencing missing; instead
of grabbing the value of $(dfile)_DEP_MOD_SHA, it is literally using the
variable name $(dfile)_DEP_MOD_SHA. This means that the value of this
variable will not change when some dependency changes.
The impact of this is in transitive dependencies. For a specific
example, if there is some change in sairedis, then sairedis will be
rebuilt (because there's a change within that component), and swss will
be rebuilt (because it's a direct dependency), but
docker-swss-layer-buster will not get rebuilt, because only the direct
dependencies are effectively being checked, and those aren't changing.
Signed-off-by: Saikrishna Arcot <sarcot@microsoft.com>
Fix issue: Non compliant leaf list in config_db schema: https://github.com/Azure/sonic-buildimage/issues/9801
The basic flow of DPB is like:
1. Transfer config db json value to YANG json value, name it “yangIn”
2. Validate “yangIn” by libyang
3. Generate a YANG json value to represent the target configuration, name it “yangTarget”
4. Do diff between “yangIn” and “yangTarget”
5. Apply the diff to CONFIG DB json and save it back to DB
The fix:
• For step #1, if the value of a leaf-list field is a string, convert it to a list by splitting it on “,” so that step #2 passes validation. We also save <table_name>.<key>.<field_name> to a set named “leaf_list_with_string_value_set”.
• For step #5, loop over “leaf_list_with_string_value_set” and change those fields back to strings.
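The two steps of the fix can be sketched like this. It is an illustration under assumptions, not the code in sonic_yang_ext.py: the function names are made up, the set of leaf-list fields is passed in explicitly instead of being derived from the YANG schema, and the remembered entries are stored as tuples rather than dotted strings.

```python
def split_leaf_lists(config, leaf_list_fields):
    """Step 1: turn comma-separated strings into lists and remember which ones."""
    converted = set()
    for table, entries in config.items():
        for key, fields in entries.items():
            for field, value in fields.items():
                if (table, field) in leaf_list_fields and isinstance(value, str):
                    fields[field] = value.split(",")
                    converted.add((table, key, field))
    return converted

def join_leaf_lists(config, converted):
    """Step 5: change the remembered fields back to comma-separated strings."""
    for table, key, field in converted:
        fields = config.get(table, {}).get(key, {})
        if isinstance(fields.get(field), list):
            fields[field] = ",".join(fields[field])

# Hypothetical table and field names, for illustration only.
config = {"EXAMPLE_TABLE": {"key1": {"members": "a,b,c", "mtu": "9100"}}}
remembered = split_leaf_lists(config, {("EXAMPLE_TABLE", "members")})
print(config)
join_leaf_lists(config, remembered)
print(config)
```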
1. Manual test
2. Changed sample config DB and unit test passed
Conflicts:
src/sonic-yang-mgmt/sonic_yang_ext.py
079f80a (HEAD -> 202111, origin/202111) Fix: if routestr does not exist, skip (#257)
8fd0fe1 Fix: not to use blocking get_all() after keys() (#255)
981107a Add VoQ Recirc interface (i.e., Ethernet-Rec) to interface maps for S… (#244)
f4ecfb6 (HEAD -> 202111, origin/202111) Removing Vnet with scope default (#2239)
Why I did it
The PR fixes a bug where the mgmt port eth0 may lose its IP even if the user configured a static IP for eth0. The issue is not always reproducible; the flow that reproduces it is:
1. Systemd starts the networking service, which runs a DHCP-based configuration and assigns an IP from DHCP.
2. Systemd starts the interface-config service, which depends on the networking service.
3. The interface-config service runs the command “ifdown --force eth0”, but the networking service is still running, so the command fails with: “error: Another instance of this program is already running.” This error is printed by the ifupdown2 lib, which is the main process of the networking service. So ifdown does not actually work here, and the IP of eth0 is not brought down.
4. The interface-config service updates /etc/networking/interface to the static configuration.
5. The interface-config service runs the command “systemctl restart networking”. This command kills the previous networking-related processes (log: networking.service: Main process exited, code=killed, status=15/TERM) and tries to reconfigure the IP address with the static configuration. But it detects that the configured IP and the existing IP are the same, so it does not actually configure the IP in the kernel. Hence, the IP is still the one obtained from DHCP. (This could be a bug in ifupdown2: the previous IP is from DHCP and the new IP is static, but it treats them as the same instead of reconfiguring the IP.)
6. When the lease of the IP expires, the kernel removes the IP of eth0 and the issue reproduces.
The issue is not always reproducible because the networking service usually runs fast enough that step #3 is not hit.
How I did it
Check the networking service state before running "ifdown --force eth0", and wait for it to finish if it is still activating.
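The wait can be sketched as a simple polling loop. This is an illustration only: the function name, timeout, and polling interval are assumptions, and `get_state` stands for whatever returns the unit state (for example the output of `systemctl is-active networking`).

```python
import time

def wait_while_activating(get_state, timeout=60.0, interval=0.5, sleep=time.sleep):
    """Poll get_state() until it stops returning "activating" or the timeout expires.

    Returns the last observed state so the caller can decide how to proceed.
    """
    waited = 0.0
    state = get_state()
    while state == "activating" and waited < timeout:
        sleep(interval)
        waited += interval
        state = get_state()
    return state
```

Only after this returns something other than "activating" would the script go on to run ifdown, which avoids the "Another instance of this program is already running" failure from step #3.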
How to verify it
Manual test.
Why I did it
When lldpmgrd handled events from tables other than PORT_TABLE, an error message was printed to the log.
How I did it
Handle event according to its file descriptor instead of looping all registered selectables for each coming event.
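The dispatch-by-descriptor idea can be sketched with Python's standard `selectors` module. lldpmgrd itself uses the swsscommon select API, so this is a stand-in under that assumption, and the `dispatch_ready` name and the pipe-based demo are made up for illustration:

```python
import os
import selectors

def dispatch_ready(selector, timeout=0):
    """Call only the handler registered for each ready file descriptor."""
    handled = []
    for key, _events in selector.select(timeout):
        handler = key.data               # handler was attached at registration time
        handled.append(handler(key.fd))
    return handled

# Demo: two pipes play the role of two subscribed tables.
selector = selectors.DefaultSelector()
port_r, port_w = os.pipe()
other_r, other_w = os.pipe()
selector.register(port_r, selectors.EVENT_READ, lambda fd: ("port", os.read(fd, 16)))
selector.register(other_r, selectors.EVENT_READ, lambda fd: ("other", os.read(fd, 16)))

os.write(port_w, b"up")                  # only the "port" source has an event
print(dispatch_ready(selector))
```

Because each event is routed by its descriptor, a handler never sees events that belong to another table, which is what produced the error messages before the change.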
How to verify it
I verified that the same events are handled before and after the change by printing each event's key and operation.
Also, before the change, in the init flow after config reload, error messages were printed to the log when lldpmgrd handled events from tables other than PORT_TABLE; this is now resolved.
- Why I did it
Profiling the system state on init after fast-reboot, during create_switch function execution, shows a few Python scripts running at the same time.
This parallel execution consumes CPU time, so create_switch takes longer than it should.
Following this finding, and to ensure these services do not interfere in the future, PMON is delayed by 90 seconds until the system finishes the init flow after fastboot.
- How I did it
Add a timer for PMON service.
For the MLNX platform, exclude the trigger that starts PMON when SYNCD starts in case of fastboot.
Copy the timer file to the host bin image.
- How to verify it
Run fast-reboot on MLNX platform and observe faster create_switch execution time.
#### Why I did it
Need to pass LY_CTX_DISABLE_SEARCHDIR_CWD to Context in order to disable automatically searching for schemas in current working directory (which is by default searched automatically)
#### How I did it
add additional attribute into YANG context
#### How to verify it
Create some invalid link on switch :
1) **ln -s /usr/abc xxx**
2) run **spm list**
--> There should not be these messages:
```
libyang[1]: Unable to get information about "xxx" file in "/tmp" when searching for (sub)modules (No such file or directory)
libyang[1]: Unable to get information about "xxx" file in "/tmp" when searching for (sub)modules (No such file or directory)
libyang[1]: Unable to get information about "xxx" file in "/tmp" when searching for (sub)modules (No such file or directory)
libyang[1]: Unable to get information about "xxx" file in "/tmp" when searching for (sub)modules (No such file or directory)
```
- Why I did it
Profiling the system state on init after fast-reboot, during create_switch function execution, shows a few Python scripts running at the same time.
This parallel execution consumes CPU time, so create_switch takes longer than it should.
Following this finding, and to ensure these services do not interfere in the future, LLDP is delayed by 90 seconds until the system finishes the init flow after fastboot.
- How I did it
Add a timer for LLDP service.
Copy the timer file to the host bin image.
- How to verify it
Run fast-reboot on MLNX platform and observe faster create_switch execution time.
This PR is dependent on PR: #10567
* Remove SSH host keys after installing the custom version of sshd
Signed-off-by: Saikrishna Arcot <sarcot@microsoft.com>
* Use an override for sshd instead of overwriting the service file
Don't overwrite upstream's .service file, and instead use an override
file for making sure the host key(s) are generated.
Signed-off-by: Saikrishna Arcot <sarcot@microsoft.com>
e46b243b Fix checkReplyType failed issue via recreating xcvr_table_helper on forking subprocess (#255) (#256)
Signed-off-by: Stephen Sun <stephens@nvidia.com>
fc29641 [pbh] [aclorch] Fixed a bug causes by updating the flow-counter value for the PBH rule (#2226)
6c38ef7 [QoS] Resolve an issue in the sequence where a referenced object removed and then the referencing object deleting and then re-adding (#2210)
- Why I did it
InvalidPsuVolWA.run might raise an exception if the user powers off the PSU while it is running. This exception is not caught and propagates to psud, which then fails to update PSU data in the DB.
- How I did it
1. Change the log level when the WA does not work. This can happen when the user powers off the PSU, so a warning is more appropriate than an error
2. Change the wait time from 5 to 1 to avoid introducing too much delay in psud. 1 second is usually enough in my tests
3. Give get_voltage_low_threshold and get_voltage_high_threshold a default return value so that exceptions do not reach psud
- How to verify it
Manual test.
Run sonic-mgmt regression
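A minimal sketch of the "default return value" idea from item 3, not the actual platform API code. The helper name `read_threshold`, the default of `0.0` and the simulated sysfs failure are all assumptions made for illustration:

```python
import logging

logger = logging.getLogger("psu_sketch")

# Hypothetical default; the value used by the real fix is not stated above.
DEFAULT_THRESHOLD = 0.0


def read_threshold(read_fn, default=DEFAULT_THRESHOLD):
    """Return read_fn() as a float, or `default` if the read fails."""
    try:
        return float(read_fn())
    except (OSError, ValueError) as err:
        # Warning rather than error: a powered-off PSU is an expected cause.
        logger.warning("PSU threshold unavailable, using default: %s", err)
        return default


def missing_sysfs():
    # Simulates a sysfs node that vanished because the PSU was powered off.
    raise OSError("No such file or directory")


ok_value = read_threshold(lambda: "11300")
fallback_value = read_threshold(missing_sysfs)
print(ok_value, fallback_value)
```

With this shape the caller (psud in the description above) always receives a number and never sees the exception.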
#### Why I did it
The test plan described in the `How to verify it` section caused an issue when 3 images (instead of 2) were present when executing `show boot` or `sonic-installer list` commands:
```
root@sonic:/home/admin# show boot
Current: SONiC-OS-master.0-dirty-20220118.165941
Next: SONiC-OS-master.0-dirty-20220118.165941
Available:
SONiC-OS-master.0-dirty-20220118.165941
SONiC-OS-202012.201-a0376a6e5_Internal
SONiC-OS-202012.201-a0376a6e5_Internal_RPC
```
#### How I did it
Fixed the `sed` pattern to match the current image revision in the `install.sh` script.
#### How to verify it
Test plan:
1. Install the `imageA` by using ONIE
2. Install the `imageA-rpc` by using `sonic-installer`
3. Reboot the switch
4. Swap to the `imageA` - `sonic-installer set-default imageA`
5. Reboot the switch
6. Install the `imageB` by using `sonic-installer`
7. Check the installed images - `show boot`
8. Reboot the switch
9. Check the installed images - `show boot`
Why I did it
To sign the SONiC kernel image and allow secure-boot-based systems to verify the SONiC image before loading it.
How I did it
Pass the following parameters in rules/config.user
Ex:
SONIC_ENABLE_SECUREBOOT_SIGNATURE := y
SIGNING_KEY := /path/to/key/private.key
SIGNING_CERT := /path/to/public/public.cert
How to verify it
A secure-boot-enabled system that has the right public key for the image enrolled in the platform UEFI database will be able to verify the image before loading it.
Alternatively, one can verify offline with the sbsign tool as below.
export SBSIGN_KEY=/abc/bcd/xyz/
sbverify --cert $SBSIGN_KEY/public_cert.cert fsroot-platform-XYZ/boot/vmlinuz-5.10.0-8-2-amd64 mage
O/P:
Signature verification OK
* [CG-Fix-CVE-2021-44906] Patching on thrift.0.14.1 for package minimist
Signed-off-by: richardyu-ms <richard.yu@microsoft.com>
* add more information in patch
Signed-off-by: richardyu-ms <richard.yu@microsoft.com>
* Update 0003-Remove-minimist-packages.patch
* change the thrift 0.14.1 to package download
Signed-off-by: richardyu-ms <richard.yu@microsoft.com>
* use the series file for patching
* fix a code defect
Co-authored-by: Richard.Yu <richard.yu@microsoft.com>
Why I did it
The minigraph parser added a new field 'cluster' to device_metadata, which blocks yang validation.
How I did it
Add 'cluster' to device_metadata yang models.
How to verify it
Run UT for sonic-yang-models.
Use minigraph parser to generate ConfigDB schema and run yang validation.
Signed-off-by: Gang Lv ganglv@microsoft.com
Why I did it
dhcp_server is introduced, so the yang model needs to be updated.
How I did it
Update yang models and add unit test.
How to verify it
Run unit test for sonic-yang-models.
Signed-off-by: Gang Lv ganglv@microsoft.com
The interface renaming logic fails if one interface is missing.
Because of the `set -e` the whole initramfs hook would abort early on
error.
This change fixes the current behavior to make sure missing interfaces
are properly skipped and existing interfaces are renamed.
On some products the pci enumeration adds randomness into which nic gets
initialized first.
Because SONiC doesn't use deterministic interface naming but instead old
style interface naming, this leads to eth0 not always being the
management port.
To make sure eth0 is always the management port (SONiC expectation)
rename the interfaces in the initramfs for Arista products.
- Why I did it
There is a hardware bug where the PSU voltage threshold sysfs returns an incorrect value. The workaround is to call "sensor -s" to refresh it.
- How I did it
Call "sensor -s" when the threshold value is incorrect and the PSU is "DELTA 1100"
- How to verify it
Unit test and Manual test
Why I did it
The minigraph parser has introduced a new type.
How I did it
Update yang models to support BmcMgmtToRRouter.
How to verify it
Run unit test for sonic-yang-models
Signed-off-by: Gang Lv ganglv@microsoft.com
Why I did it
Need to run yang validation for the sonic-cfggen unit tests, and many unit tests do not provide speed for the port table.
How I did it
Update minigraph xml.
How to verify it
Run sonic-cfggen unit test.
Signed-off-by: Gang Lv ganglv@microsoft.com
Why I did it
The valid ASN range is 1 to 4294967295, so invalid ASNs need to be removed.
How I did it
Update unit test and replace ASN 0.
How to verify it
Run unit test for sonic-config-engine.
Signed-off-by: Gang Lv ganglv@microsoft.com
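For reference, the ASN range above can be checked with a few lines of Python. This is only an illustrative sketch; the function name `is_valid_asn` is hypothetical and is not part of sonic-config-engine:

```python
ASN_MIN = 1
ASN_MAX = 4294967295  # 2**32 - 1, the upper bound quoted above


def is_valid_asn(value):
    """Return True if `value` is a decimal integer within the ASN range."""
    try:
        asn = int(str(value), 10)
    except ValueError:
        return False
    return ASN_MIN <= asn <= ASN_MAX


print(is_valid_asn(0), is_valid_asn(65100), is_valid_asn(4294967296))
```

ASN 0, the value replaced in the unit tests, is rejected because it falls below the lower bound.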
Why I did it
The sonic-config-engine unit test uses an invalid switch_type.
How I did it
Update xml with correct switch_type
How to verify it
Run UT for sonic-config-engine
Signed-off-by: Gang Lv ganglv@microsoft.com
Why I did it
Need to run yang validation for the sonic-cfggen unit tests, and many unit tests do not provide lanes for the port table.
How I did it
Update port config file.
How to verify it
Run sonic-cfggen unit test.
Use the PR below to verify:
#10228
Signed-off-by: Gang Lv ganglv@microsoft.com
Why I did it
The ConfigDB schema generated by minigraph can't pass yang validation, because deployment_id can't be None for yang validation.
How I did it
Update minigraph.py to skip deployment_id when its value is None.
How to verify it
Run UT for sonic-config-engine.
Run command 'sonic-cfggen -m tests/multi_npu_data/sample-minigraph-noportchannel.xml -p tests/multi_npu_data/sample_port_config-3.ini -n asic3 --print-data'.
Signed-off-by: Gang Lv ganglv@microsoft.com
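The "skip deployment_id with None value" change can be pictured roughly as follows. This is a sketch under assumptions, not the code from minigraph.py: the function name `build_device_metadata` and the reduced set of fields are made up for illustration:

```python
def build_device_metadata(hostname, deployment_id=None):
    """Build a DEVICE_METADATA-like dict, omitting deployment_id when it is None."""
    metadata = {"hostname": hostname}
    if deployment_id is not None:
        # Only emit the field when a real value exists, so yang validation
        # never sees a None value.
        metadata["deployment_id"] = str(deployment_id)
    return {"DEVICE_METADATA": {"localhost": metadata}}


without_id = build_device_metadata("switch1")
with_id = build_device_metadata("switch1", deployment_id=1)
print(without_id)
print(with_id)
```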
Why I did it
Multi-asic platforms add asic_port_name and role to the PORT table, and the port_index range has changed.
How I did it
Update sonic-port.yang, add asic_port_name and role, and remove range limitation.
How to verify it
Run UT for sonic-yang-models.
Signed-off-by: Gang Lv ganglv@microsoft.com
Why I did it
end2end test is blocked by Yang model for BGP_PEER_RANGE.
How I did it
Add new yang models.
How to verify it
Run UT for sonic-yang-models.
Signed-off-by: Gang Lv ganglv@microsoft.com
#### Why I did it
end2end test is blocked by Yang model for AAA login pattern.
#### How I did it
Add pattern to AAA yang models.
#### How to verify it
Run UT for sonic-yang-models.
#### Description for the changelog
Fix #9713
* Ported platform from master
Signed-off-by: Petro Bratash <petrox.bratash@intel.com>
* [BFN] Updated x86_64-accton_as9516_32d-r0/platform.json
* [BFN] Refactoring and adding some functions of Thermal class (set and
get thresholds and etc.)
* [BFN] Fix exception when fwutil run without sudo
* Revert "[BFN] syncd-rpc build with thrift 0.14.1 (#9884)"
This reverts commit bec35267cb.
* [BFN] Updated SDK to 20220127_sai_1.9.1 (#9870)
Signed-off-by: Andriy Kokhan <andriyx.kokhan@intel.com>
* [BFN] updated SDE packages for BFN platforms (#10512)
Updated SDE packages for bfn platform
- introduced X6 profile
- fixes for drop counters
- fixes for platform part
Co-authored-by: Andriy Kokhan <AndriyX.Kokhan@intel.com>
Co-authored-by: roman_savchuk <romanx.savchuk@intel.com>
* Bump Thrift version from 0.13.0 to 0.14.1 (#9881)
#### Why I did it
To bump the Thrift version to 0.14.1
- To avoid [CVE-2020-13949](https://nvd.nist.gov/vuln/detail/CVE-2020-13949)
- To fix some dependency issues
#### How I did it
- Rename `src/thrift_0_13_0` to `src/thrift_2` to remove the version number from the path. (`src/thrift` contains rules to build thrift 0.11.0)
- Add thrift sources as a submodule, as there are no prepared debian packages for version >0.13.0 on [debian.org](https://packages.debian.org/search?searchon=sourcenames&keywords=thrift)
- Add patches with fixes for the original thrift debian rules (remove unneeded packages, fix multi-job build)
#### How to verify it
```
BLDENV=buster make -f Makefile.work target/debs/buster/libthrift-dev_0.14.1_amd64.deb
```
* Correct thrift 0141 typo fix (#10199)
Correct libsaithrift dependency package name from
LIBTHRIFT_DEV_0_14_1 THRIFT_COMPILER_0_14_1 to
LIBTHRIFT_0_14_1_DEV THRIFT_0_14_1_COMPILER
How I did it
How to verify it
Test Done:
make BLDENV=buster SAITHRIFT_V2=y -f Makefile.work target/debs/buster/saiserverv2_0.9.4_amd64.deb
Co-authored-by: Myron Sosyak <myronx.sosyak@intel.com>
89d0e92 (HEAD -> 202111, origin/202111) [scripts/fast-reboot] cleanup (#2132)
e0baa93 [config/config_mgmt.py]: Fix dpb issue with upper case mac in (#2066)
2135269 (HEAD -> 202111, origin/202111) [SSD]Enhance ssd_generic with more error handling to avoid python crash #271
393fbee [SSD Generic] Add support for parsing nvme ssd model, health and temperature (#265)
b67d479 Fixed the sfp refactor issue
827c5a6 Added nokia_cmd command nokia_common grpc support for power down/up SFM module
aeb7f56 Added the nokia cli commands for midplane
c57d083 Fix the get_my_module issue and the thermal_infos exception issue.
0536293 Change the output of "show chassis module status"
63212d7 Enhance the help display for nokia_cmd command
e8d2599 Fix the sonic_install_ndk_service script issue
d52bdcf Add command nokia_cmd show sfm-eeprom support
Signed-off-by: mlok <marty.lok@nokia.com>
Why I did it
Running warm-reboot in a loop 500 times leads to this error on the 318th iteration:
Apr 2 15:56:27.346747 sonic INFO swss#/supervisord: restore_neighbors Traceback (most recent call last):
Apr 2 15:56:27.346747 sonic INFO swss#/supervisord: restore_neighbors File "/usr/bin/restore_neighbors.py", line 24, in <module>
Apr 2 15:56:27.346747 sonic INFO swss#/supervisord: restore_neighbors from scapy.all import conf, in6_getnsma, inet_pton, inet_ntop, in6_getnsmac, get_if_hwaddr, Ether, ARP, IPv6, ICMPv6ND_NS, ICMPv6NDOptSrcLLAddr
Apr 2 15:56:27.346795 sonic INFO swss#/supervisord: restore_neighbors File "/usr/local/lib/python3.7/dist-packages/scapy/all.py", line 25, in <module>
Apr 2 15:56:27.346956 sonic INFO swss#/supervisord: restore_neighbors from scapy.route import *
Apr 2 15:56:27.346995 sonic INFO swss#/supervisord: restore_neighbors File "/usr/local/lib/python3.7/dist-packages/scapy/route.py", line 205, in <module>
Apr 2 15:56:27.347089 sonic INFO swss#/supervisord: restore_neighbors conf.iface = get_working_if()
Apr 2 15:56:27.347129 sonic INFO swss#/supervisord: restore_neighbors File "/usr/local/lib/python3.7/dist-packages/scapy/arch/linux.py", line 128, in get_working_if
Apr 2 15:56:27.347213 sonic INFO swss#/supervisord: restore_neighbors ifflags = struct.unpack("16xH14x", get_if(i, SIOCGIFFLAGS))[0]
Apr 2 15:56:27.347250 sonic INFO swss#/supervisord: restore_neighbors File "/usr/local/lib/python3.7/dist-packages/scapy/arch/common.py", line 31, in get_if
Apr 2 15:56:27.347345 sonic INFO swss#/supervisord: restore_neighbors return ioctl(sck, cmd, struct.pack("16s16x", iff.encode("utf8")))
Apr 2 15:56:27.347365 sonic INFO swss#/supervisord: restore_neighbors OSError: [Errno 19] No such device
The issue was reported to the scapy developers as secdev/scapy#3369 and fixed in secdev/scapy#3371. However, no released scapy version contains this fix yet, so scapy v2.4.5 is built from sources with the fix applied as a patch.
Signed-off-by: Stepan Blyschak <stepanb@nvidia.com>
- Why I did it
Fixes #9628
During bootup, this error log is seen:
Dec 22 04:26:29 sonic interfaces-config.sh[2546]: error: main exception: cannot find interfaces: eth0 (interface was probably never up ?)
This is non-functional and doesn't affect the flow.
- How I did it
Don't run ifdown if it is not needed
- How to verify it
Verified during reboot. Log did not appear and IP was acquired on eth0 as expected
Signed-off-by: Vivek Reddy Karri <vkarri@nvidia.com>
Cherry-pick #10108
- Why I did it
Fixing issue #9991
The ACL RULE table field ETHER_TYPE can accept both hex and decimal values. However, the yang model didn't allow decimal values. Fixed it to allow decimal values in the same range as the hex pattern (1536-65535).
- How I did it
Updated yang model to handle decimal values
- How to verify it
Added UT to verify it.
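To illustrate the accepted ETHER_TYPE values, here is a small Python sketch that takes either notation and applies the 1536-65535 range. It is not the YANG pattern itself; the function name `parse_ether_type` is hypothetical:

```python
ETHER_TYPE_MIN = 1536   # 0x0600
ETHER_TYPE_MAX = 65535  # 0xFFFF


def parse_ether_type(text):
    """Return the EtherType as an int, or None if it is malformed or out of range."""
    text = text.strip()
    base = 16 if text.lower().startswith("0x") else 10
    try:
        value = int(text, base)
    except ValueError:
        return None
    return value if ETHER_TYPE_MIN <= value <= ETHER_TYPE_MAX else None


print(parse_ether_type("0x0800"), parse_ether_type("2048"), parse_ether_type("1535"))
```

Both `0x0800` and `2048` resolve to the same value, while `1535` is rejected because it is below the lower bound.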
bd97dfe Fix urllib3 CVE-2021-33503 issue (#104)
f159bfa Upgrade the containers to be based on Debian Buster (#103)
a1830c1 Fix OpenAPI spec to be readable by autorest (#101)
94805a3 Identify and report Vnet GUID for conflicting VNI (#99)
4832dfd Static route expiry if not specified as persistent (#98)
5cc4358 Add support for overlay ECMP (#96)
6822a46 [CI] Set diff cover threshold to 50% (#97)
dcc826a Add PR diff coverage (#95)
This PR is to backport #10401 to 202111
- Why I did it
Take new hw-mgmt release to SONiC, including:
new features:
hw-mgmt: add to PSU FW upgrade tool command to show current FW version
hw-mgmt: add to PSU FW upgrade tool support for single-PSU-in-the-system FW upgrade
hw-mgmt: add attribute “/firmware” to show FW version of restricted upgradable PSUs only
hw-mgmt: Add NVME temperature reports attributes (_alarm/_crit/_min/_max)
bug fix:
psu: redundant i2c_addr attributes being created for psu 3 & 4 in system having only 2 psus.
hw-mgmt: in SPC1/2 i2c driver removal is too slow vs. ASIC reset causing non-functional log errors
PSU thresholds sysfs changed in 5.10 to “read only” preventing modification (modification required due PSU HW bug)
CPLD3 sysfs attribute missing after chip down/up flow
sysfs attributes missing when hw-mgmt is restarted (stop/start) within systemd
release notes can be found from link https://github.com/Mellanox/hw-mgmt/blob/V.7.0020.2004/debian/Release.txt
- How I did it
Update hw-mgmt make file with new version number
Update hw-mgmt submodule pointer
- How to verify it
Run platform regression on all Mellanox platform
Signed-off-by: Kebo Liu <kebol@nvidia.com>
Why I did it
Exclude the innovium build from the version upgrade build. These builds currently always fail, so exclude them temporarily.
Increase the broadcom build timeout.
- Why I did it
Fastboot delays all counters in CONFIG DB and relies on enable_counters.py to recover the delayed counters. However, enable_counters.py does not recover non-default counters.
- How I did it
For non-default counters that are in CONFIG DB, set the delay status to false after the wait.
- How to verify it
Manual test
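A rough sketch of the recovery step, using a plain dict in place of CONFIG DB. The field name `FLEX_COUNTER_DELAY_STATUS`, the counter group names and the function `release_delayed_counters` are assumptions for illustration and are not taken from enable_counters.py:

```python
def release_delayed_counters(flex_counter_table):
    """Set the delay status to 'false' for every delayed counter group.

    `flex_counter_table` maps a counter group name to its fields. The groups
    that were changed are returned in sorted order.
    """
    released = []
    for group, fields in flex_counter_table.items():
        if fields.get("FLEX_COUNTER_DELAY_STATUS") == "true":
            fields["FLEX_COUNTER_DELAY_STATUS"] = "false"
            released.append(group)
    return sorted(released)


# Stand-in for the table contents after fastboot (hypothetical values).
config_db = {
    "PORT": {"FLEX_COUNTER_STATUS": "enable", "FLEX_COUNTER_DELAY_STATUS": "true"},
    "PG_DROP": {"FLEX_COUNTER_STATUS": "enable", "FLEX_COUNTER_DELAY_STATUS": "true"},
    "QUEUE": {"FLEX_COUNTER_STATUS": "enable"},
}
released = release_delayed_counters(config_db)
print(released)
```

Iterating over whatever is present in the table, rather than over a fixed list of default groups, is what lets non-default counters be recovered as well.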
c5c105f (HEAD -> 202111, origin/202111) [PBH] Implement Edit Flows (#2093)
7050581 [techsupport] Handle minor fixes of TS Lock and update auto-TS (#2114)
3602f99 Fix issues in clear_qos (#2122)
4f96d3b Fixing get port speed when oper status is down (#2123)
5bb99c7 Validate LAG has members before mirror session create (#2130)
ec6c8af [vxlan] Remove tunnel map objects on VNET tunnel removal (#2150)
7e7db19 [BFD]Registering BFD state change callback during session creation (#2202)
618fe07 [VNET]Fixing nexthop group delete during route change (#2198)
91b66df [portsorch]: Prevent LAG member configuration when port has active ACL binding (#2165)
29de9d0 Remove redundant and problematic code to skip "pool" field in buffer profile handling (#2197)
ded0b45 [PBH] Implement Edit Flows (#2169)
2ee0f49 [neighsyncd] increase neighsyncd timeout (#2209)
a0160c0 [QosOrch] The notifications cannot be drained in QosOrch in case the first one needs to retry (#2206)
Fix the version file generation failure caused by the artifacts folder change.
When the same template started to be used for the PR build, the official build and the package version upgrade, a "target" folder was added to the artifacts folder, so the version upgrade task has to change accordingly.
Why I did it
Docker Hub limits the pull rate.
Use ACR instead to pull the Debian-related docker images.
How I did it
Set DEFAULT_CONTAINER_REGISTRY in pipeline.
* [Marvell] Update armhf SAI deb version 1.9.1 (#9865)
Move marvell armhf SAI deb to 1.9.1 to address build failures.
Signed-off-by: Rajkumar Pennadam Ramamoorthy <rpennadamram@marvell.com>
* [Marvell] Update armhf driver/sai deb version (#10126)
Fixed Marvell SAI deb version naming issue reported in Marvell-switching/sonic-marvell-binaries#62
Signed-off-by: Rajkumar Pennadam Ramamoorthy <rpennadamram@marvell.com>
* [Build]: only install grpc in amd64 (#10212)
[Build]: only install grpc in amd64
Unblock marvell-armhf build.
Co-authored-by: Rajkumar-Marvell <54936542+rajkumar38@users.noreply.github.com>
Why I did it
Cherry-pick commits from master to 202111 to fix the broken build.
See the commits for details.
Why I did it
Fix host image debian package version issue.
Package dependencies may break when some Debian packages of the base image are upgraded. For example, libc is installed in the base image, but if the mirror has a newer version, running "apt-get upgrade" upgrades the package unexpectedly. To avoid this, the versions need to be added when building the host image.
How I did it
The package versions of host-image should contain those of host-base-image.
#### Why I did it
When adding and removing ports after the init stage we saw two issues.
First:
In several cases, after a port is removed, lldpmgr keeps trying to add the port to lldp with the lldpcli command. The command keeps failing because the port no longer exists.
Second:
After adding a port, we sometimes see this warning message:
"Command failed 'lldpcli configure ports Ethernet18 lldp portidsubtype local etp5b': 2021-07-27T14:16:54 [WARN/lldpctl] cannot find port Ethernet18"
These changes were added to solve both issues.
#### How I did it
Port create events are taken from APP DB only.
The lldpcli command is executed only when the Linux port is up.
When a port delete event is received, the command is removed from the pending_cmds dictionary.
#### How to verify it
manual tests and running lldp tests
#### Description for the changelog
Dynamic port configuration - solve lldp issues when adding/removing ports
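The pending-command handling described above could look roughly like this. It is a simplified sketch and not the lldpmgrd implementation: the class and method names are hypothetical, and only the `pending_cmds` dictionary name comes from the description:

```python
class PendingLldpCommands:
    """Track one pending lldpcli command per port."""

    def __init__(self):
        self.pending_cmds = {}

    def on_port_create(self, port, cmd):
        # Remember the command; it is not executed yet.
        self.pending_cmds[port] = cmd

    def on_port_delete(self, port):
        # Drop the command so it is never retried for a port that is gone.
        self.pending_cmds.pop(port, None)

    def runnable(self, oper_up_ports):
        # Only commands whose Linux port is up are eligible to run.
        return {p: c for p, c in self.pending_cmds.items() if p in oper_up_ports}


mgr = PendingLldpCommands()
mgr.on_port_create("Ethernet18", "lldpcli configure ports Ethernet18 lldp portidsubtype local etp5b")
mgr.on_port_create("Ethernet20", "lldpcli configure ports Ethernet20 lldp portidsubtype local etp6a")
mgr.on_port_delete("Ethernet20")
ready = mgr.runnable(oper_up_ports={"Ethernet18"})
print(sorted(ready))
```

Removing the entry on delete addresses the first issue, and gating execution on the port being up addresses the "cannot find port" warning.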
Why I did it
A kernel hang during early boot is caused by the device tree being overwritten while the kernel is uncompressed. Added fdt_high, which gives a safe offset from the kernel location.
How I did it
Setting uboot environment variable fdt_high.
How to verify it
Successful boot of bullseye kernel on Marvell Armada 380/385.
Change-Id: I3e2521780f5ecdb3bdf6cbb6542250814ca11959
Signed-off-by: Pavan Naregundi <pnaregundi@marvell.com>
Why I did it
1) Removing incorrect check in plt setup for fw_env config: This check was added earlier to compare 2 different types of disk. The check is now redundant and no longer required, as the transition is complete.
2) Removing legacy_volume_label in create_partition: legacy_volume_label is not used in the armhf install files. With legacy_volume_label initialized to NULL, the current code always returns true for the check if demo_part exists.
How I did it
Change is about removing the redundant/incorrect code explained above.
How to verify it
uboot fw_printenv and fw_setenv are tested.
onie-nos-install has been verified.
Signed-off-by: Pavan Naregundi <pnaregundi@marvell.com>
For Bullseye, Python 2 isn't present at all. This means that in certain
build cases (such as building something only for Bullseye), the version
file may not exist, and so the sort command would fail.
For most normal build commands, this probably won't be an issue, because
the SONiC build will start with Buster (which has both Python 2 and
Python 3 wheels built), and so the py2 and py3 files will be present
even during the Bullseye builds.
Signed-off-by: Saikrishna Arcot <sarcot@microsoft.com>
Why I did it
[Build]: Enable marvell-armhf PR check
Improve the azp dependencies by making the Test stage depend only on the BuildVS stage. The Test stage is triggered as soon as the BuildVS stage finishes, which reduces the waiting time.
- Why I did it
With the previous MFT 4.18.1-16 there is a bug where the mstdump tool accesses a wrong address. It is confirmed that this issue does not exist in the official 4.18.0-106.
- How I did it
Update the MFT version to 4.18.0-106
- How to verify it
Run regression on Mellanox platforms
Why I did it
Support collecting versions when purging Debian packages.
Support collecting versions multiple times.
How I did it
Add the collection action before purging.
#### Why I did it
To fix https://github.com/Azure/sonic-buildimage/issues/9643
#### How I did it
Instead of ast.literal_eval, added python2 compat code for the unicode -> str conversion of json strings.
We need python2 compatibility since the py2 sonic config engine (buster/sonic_config_engine-1.0-py2-none-any.whl target) is still included in the build (ENABLE_PY2_MODULES flag is set for buster). Once we abandon buster and python2, this compat code and ast.literal_eval can be cleaned up throughout the code base.
#### How to verify it
run steps from the linked issue
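One common way to write such a unicode -> str compat helper is sketched below. It is an illustration under assumptions rather than the code added by this change; the name `to_native_str` and the sample JSON are hypothetical:

```python
import json
import sys

PY2 = sys.version_info[0] == 2


def to_native_str(obj):
    """Recursively convert Python 2 unicode objects to str; a no-op on Python 3."""
    if PY2 and isinstance(obj, unicode):  # noqa: F821 (only evaluated on Python 2)
        return obj.encode("utf-8")
    if isinstance(obj, dict):
        return {to_native_str(k): to_native_str(v) for k, v in obj.items()}
    if isinstance(obj, list):
        return [to_native_str(item) for item in obj]
    return obj


data = to_native_str(json.loads('{"PORT": {"Ethernet0": {"lanes": "1,2"}}}'))
print(data)
```

Because the data is parsed with `json.loads` and only its string type is adjusted, the input never has to be evaluated as a Python literal.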
Why I did it
ACL does have an ACCEPT action, but the yang model doesn't support it.
How I did it
Add 'ACCEPT' enum to sonic-types.yang.j2
How to verify it
Run the YANG model unit tests
6562ad3 (HEAD -> 202111, origin/202111) [sfpshow][recycle_port] sfpshow script needs to skip recycle ports (#2109)
f184a61 Update `config mirror_session` CLI to support heximal gre type value (#2095)
03936ea (HEAD -> 202111, origin/202111) define index for recirc port (#118)
d48f750 [port_util] Fix issue: port_util.get_vlan_interface_oid_map should not raise exception when DB has not RIF data (#117)
- Why I did it
PDDF utils were python2 compliant and they needed to be migrated to Python3 (as per Bullseye)
PDDF common platform APIs file name changed as the name was already in use
Indentation issues
Dead/redundant code needed to be removed
- How I did it
Made files Python3 compliant
Indentation corrected
Redundant code removed
- How to verify it
AS7326 Accton platform uses PDDF. PDDF utils were run on this platform to verify.
cherry-pick of #9393 for 202111
- Use SfpOptoeBase by default to leverage new `sonic_xcvr` refactor
- Add support for `Woodleaf` product
- Move `libsfp-eeprom.so` to a different `.deb` package
- Add new logrotate configuration for arista logs
- Improve logging mechanism for the drivers (IO loglevel, fix syslog duplicates)
- Initialize chassis cards in parallel
- Refactor of `get_change_event` to fix interrupts treated as presence change
Why I did it
[Build]: Fix armhf mirrors not existing issue
The mirror endpoint debian-archive.trafficmanager.net does not support armhf, change to use deb.debian.org and security.debian.org.
Correct thrift.0.13.0 dependent package name.
In the previous code, the buildout target was named PYTHON3_THRIFT_0_13_0,
but when the package was added to LIBTHRIFT_0_13_0, it was mistyped as PYTHON_THRIFT_0_13_0.
Co-authored-by: Yang Wang<yangwang1@microsoft.com>
9968d60 (HEAD -> 202111, origin/202111) [sonic-package-manager] do not mod_config for whole config db when setting init_cfg (#2055)
4b3d53f [generate_dump] exclude mft and mlx folders from /etc (#2072)
51d92ae Validation check correction while adding a member to PortChannel (#2078)
6a43306 [techsupport] Added a lock to avoid running techsupport in parallel (#2065)
44cfdd9 Try get port operational speed from STATE DB (#2030)
45ea623 Fix sonic-installer failure due to missing import
- Why I did it
Fix issue: the PSU might use the wrong voltage sysfs, which causes an invalid voltage value. The flow is:
1. User powers off a PSU
2. All sysfs files related to this PSU are removed
3. User does a reboot/config reload
4. The PSU uses the wrong sysfs as the voltage node
- How I did it
Always try to find an existing sysfs.
- How to verify it
Manual test
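The "always try to find an existing sysfs" approach amounts to probing candidate paths at read time instead of caching one choice. A minimal sketch, assuming hypothetical file names (`psu1_volt`, `psu1_volt_in`) and a temporary directory standing in for the real sysfs tree:

```python
import os
import tempfile


def find_existing_sysfs(candidates):
    """Return the first candidate path that currently exists, or None."""
    for path in candidates:
        if os.path.exists(path):
            return path
    return None


with tempfile.TemporaryDirectory() as sysfs_dir:
    present = os.path.join(sysfs_dir, "psu1_volt_in")
    open(present, "w").close()
    missing = os.path.join(sysfs_dir, "psu1_volt")

    chosen = find_existing_sysfs([missing, present])
    fallback = find_existing_sysfs([missing])

print(os.path.basename(chosen), fallback)
```

Re-probing on each call means a node removed while the PSU was powered off is not selected after a reboot or config reload.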
#### Why I did it
PR https://github.com/Azure/sonic-utilities/pull/1825 added validation for the input of `config mirror session add`, and only decimal value is accepted.
An issue https://github.com/Azure/sonic-buildimage/issues/10096 was raised to suggest accepting HEX value as well, and the suggestion makes sense to me.
To accept HEX value for GRE type, and keep backward compatibility as well, I updated the YANG model to support both decimal and hexadecimal input for GRE type.
#### How I did it
Update the regex for GRE type.
#### How to verify it
Verified by UT
```
platform linux -- Python 3.9.2, pytest-6.0.2, py-1.10.0, pluggy-0.13.0
rootdir: /sonic/src/sonic-yang-models
plugins: pyfakefs-4.5.4, cov-2.10.1
collected 3 items
tests/test_sonic_yang_models.py .. [ 66%]
tests/yang_model_tests/test_yang_model.py . [100%]
========================================================================================== 3 passed in 2.53s ==========================================================================================
```
#### Description for the changelog
Update YANG model for mirror session to support decimal value for GRE type.
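As an illustration of accepting both notations, the regular expression below matches a decimal number or a `0x`-prefixed hexadecimal number. It is an assumed pattern written for this sketch, not the one in the YANG model, and it checks only the shape of the input, not its numeric range:

```python
import re

# Assumed pattern: up to 4 hex digits after "0x", or up to 5 decimal digits.
GRE_TYPE_RE = re.compile(r"^(0[xX][0-9a-fA-F]{1,4}|[0-9]{1,5})$")


def parse_gre_type(text):
    """Return the GRE type as an int, or None if the text matches neither form."""
    if not GRE_TYPE_RE.match(text):
        return None
    base = 16 if text[:2].lower() == "0x" else 10
    return int(text, base)


print(parse_gre_type("0x88be"), parse_gre_type("35006"), parse_gre_type("88be"))
```

Inputs that were valid before stay valid, which is the backward-compatibility goal stated above.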
This can save 6 sec for teamd LAG restoration - the time between:
```
Mar 9 13:51:10.467757 r-panther-13 WARNING teamd#teamd_PortChannel1[28]: Got SIGUSR1.
Mar 9 13:52:33.310707 r-panther-13 INFO teamd#teamd_PortChannel1[27]: carrier changed to UP
```
- Why I did it
Optimize warm boot. Specifically reduce the time needed for LAG restoration.
- How I did it
Kill teamd docker after graceful shutdown of teamd processes.
- How to verify it
Run warm reboot.
Signed-off-by: Stepan Blyschak <stepanb@nvidia.com>
Why I did it
The marvell-armhf build hangs and does not exit even after a long wait.
It is caused by the process /etc/entropy.py, which is started by the postinst script in target/debs/buster/sonic-platform-nokia-7215_1.0_armhf.deb
When mounting the partition that contains `/host` during initramfs, the
mount binary available there (coming from busybox) tries each filesystem
in `/proc/filesystems` and sees which one succeeds. During this time,
there may be some error messages logged into dmesg because some of the
incorrect filesystems failed to mount the partition.
Specify the filesystem type explicitly so that initramfs knows it's that
type, and we know what filesystem will always get used there.
Fixes #9998
Signed-off-by: Saikrishna Arcot <sarcot@microsoft.com>
- Why I did it
To implement blocking feature state change.
- How I did it
Record the actual feature state in STATE DB from hostcfg.
- How to verify it
UT + verification by running on the switch and checking STATE DB.
Signed-off-by: Stepan Blyschak <stepanb@nvidia.com>
e56e9b4 Fix CVE-2021-3121 warning (#96)
bf1be4f [ci]: Support code diff coverage threshold 50% (#94)
64e516c Ported Marvell armhf build on x86 for debian buster to use cross-compilation instead of qemu emulation (#80)
e426388 [ci]: Support azp code coverage (#87)
Why I did it
The uboot env get and set commands fw_printenv/fw_setenv are not available in the bullseye sonic image. Some platforms using them were failing, e.g. sonic-installer commands on marvell-armhf.
In case of buster, u-boot-tools was providing these commands.
How I did it
Added libubootenv-tool which provides these tools along with other uboot tools in build_debian.sh.
How to verify it
root@localhost:# fw_printenv serverip
serverip=10.4.50.39
root@localhost:# fw_setenv serverip 10.4.50.38
root@localhost:~# fw_printenv serverip
serverip=10.4.50.38
Change-Id: I558f8737f41d83d3e8527ce340391ae8f978b6d8
Signed-off-by: Pavan Naregundi <pnaregundi@marvell.com>
Why I did it
It is to fix the issue #10048
When building a .raw image, for instance target/sonic-broadcom.raw, a .bin image, target/sonic-broadcom.bin, is generated as an intermediate file. The intermediate file is a build target that may contain different dependencies from the raw one.
6a6b711 (HEAD -> 202111, origin/202111) Fix issue: sometimes PFC WD unable to create zero buffer pool (#2164)
459aee0 Use abort instead of exit in case calling SAI API failure (#2170)
e767137 Fix issue config qos reload causing orchagent aborted via tracking dependencies among QoS tables (#2116)
Why I did it
To enable PTF-SAI testing on 202111 branch, cherry-pick necessary PR from master
How I did it
Based on the current ptf docker create a new docker for sai-ptf(saiv2)
upgrade related package
use the latest ptf and install it
How to verify it
Tested on DUT
* [PTF-SAIv2]Add ptf docker for sai-ptf (saiv2) (#9729)
* [PTF-SAIv2]Add ptf docker for sai-ptf (saiv2)
Based on the current ptf docker, create a new docker for sai-ptf(saiv2)
upgrade related package
use the latest ptf and install it
test done:
NOJESSIE=1 NOSTRETCH=1 NOBULLSEYE=1 ENABLE_SYNCD_RPC=y make target/docker-ptf-sai.gz
BLDENV=buster make -f Makefile.work target/docker-ptf-sai.gz
* upgrade the thrift to 014
* install xmlrunner python3 version (#10086)
Co-authored-by: Yang Wang <yangwang1@microsoft.com>
Support saiserver v2 with python3 and thrift 0.13.0
add variables to support the saiserverv2
build a different thrift in saithrift depending on the saiserver version
build different versions of saiserver
make the saiserver and saiserver docker with version number
test done:
build two different versions of saiserver in local build environment
add saiserver to buster
Co-authored-by: Yang Wang <yangwang1@microsoft.com>
Generate the sai.profile based on the brcm j2 file if the sai.profile
does not exist in the dut mounted folder.
Change the supervisor service configuration accordingly.
Testing done:
Add the script and config in dut
saiservice server can start automatically with [systemctl start saiserver]
Signed-off-by: richardyu-ms <richard.yu@microsoft.com>
Co-authored-by: yangwang <yangwang@microsoft.com>
Cherry pick of #10072
- Why I did it
Removing DPB breakout modes that require adjacent ports to be disabled as that is not supported by the current DPB infrastructure.
Correspondingly had to remove the hwsku.json file from any SKUs which utilized these removed modes such that the system will fall back to ports_config.ini and DPB will not be supported for those SKUs.
- How I did it
Modified the platform.json files of Mellanox devices.
- How to verify it
Execute show int break [Ethernet] on the affected platforms and ensure there are no modes present that would require an adjacent port to be disabled to function.
91d7558 (HEAD -> 202111, origin/202111) Allow IPv4 link-local nexthops (#1903)
ceb5161 Fix for 2053, Fix IPv6 BGP multipath-relax peer-type. (#2062)
b3b279a [crm] Use sai_object_type_get_availability() API to get counters (#2098)
28955f4 Try get port operational speed from STATE DB (#2119)
Why I did it
Fixed the monit container_checker failure caused by the unexpected "database-chassis" docker running on the Supervisor card in the VOQ chassis. Fixes #9042
How I did it
Added database-chassis to the always-running docker list if the platform is a supervisor card.
How to verify it
Execute the CLI command "sudo monit status container_checker"
Signed-off-by: mlok <marty.lok@nokia.com>
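The check can be sketched as a set comparison between the running containers and the expected ones. The function names and the sample container lists below are hypothetical; only the `database-chassis` name and the supervisor condition come from the description above:

```python
def expected_containers(enabled_features, is_supervisor):
    """Containers that are allowed to be running on this card."""
    expected = {"database"} | set(enabled_features)
    if is_supervisor:
        # On a supervisor card the chassis database is always running.
        expected.add("database-chassis")
    return expected


def unexpected_containers(running, enabled_features, is_supervisor):
    """Running containers that are not expected, in sorted order."""
    return sorted(set(running) - expected_containers(enabled_features, is_supervisor))


running = ["database", "database-chassis", "pmon"]
on_supervisor = unexpected_containers(running, ["pmon"], is_supervisor=True)
on_linecard = unexpected_containers(running, ["pmon"], is_supervisor=False)
print(on_supervisor, on_linecard)
```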
* Update container_checker for multi-asic devices
Update container_checker for multi-asic devices to add database containers in always_running_containers.
The previous change was made for single-asic devices, and database containers were not considered features when writing to state_db.
* Update container_checker
Update an indent
Why I did it
Code review was still in progress when #9858 was merged and upon further testing I have arrived at a better solution.
How I did it
Modified supervisord configuration j2 template for pmon to require no minimum uptime for chassis_db_init and to remove the redundant exit_codes directive
How to verify it
Boot switch and verify in syslog that there are no errors related to chassis_db_init
Why I did it
The armhf build fails while building the sonic-config-engine whl package
https://dev.azure.com/mssonic/be1b070f-be15-4154-aade-b1d3bfb17054/_apis/build/builds/77089/logs/9
The failure is caused by a new line generated at the top of the file in the buffer config test cases when building for Broadcom-based platforms; the issue is not seen on Marvell-based platforms.
How I did it
Removed the new line for all the buffer test cases as there is no need to add it and accordingly changed the buffer_config.j2 where the new line is generated.
Why I did it
During warm-reboot and fast-reboot the below error logs appear
Feb 3 22:05:15.187408 r-lionfish-13 ERR container: docker cmd: kill for nat failed with 404 Client Error for http+docker://localhost/v1.41/containers/nat/json: Not Found ("No such container: nat")
The container command when called for local mode doesn't check if it is enabled before calling docker kill which throws the above errors.
b6ca76b482/scripts/fast-reboot (L699)
How I did it
Checking feature state if local mode and returning error exit code along with valid debug message.
How to verify it
Manually tested with warm-reboot and fast-reboot
Added UT to verify it.
PR #9481 changed auditd's log directory to be /var/log instead of
/var/log/audit, because SONiC mounts a disk image at /var/log during
runtime, and so the /var/log/audit directory might not exist (since it
would've been created during package installation, mounting another
partition at /var/log will hide it). However, for security reasons,
auditd changes the log directory to have 0750 permissions, so that not
everyone can see or read the audit logs.
To fix this, revert the change to auditd's log directory, and tell
systemd to create the audit log directory at runtime if it doesn't
exist. Because the disk image gets mounted during initramfs (before
systemd starts), systemd will make sure that the /var/log/audit
directory will exist.
Fixes #9548 and #10015
Signed-off-by: Saikrishna Arcot <sarcot@microsoft.com>
- Why I did it
The latest upgrade of Mellanox hw-mgmt V7.0020.1300 introduced a couple new kernel modules for new Mellanox platforms that have yet to be upstreamed to the linux kernel.
As these new platforms do not have SONiC support we elected not to upstream these new drivers to sonic-linux-kernel but hw-mgmt expects them to exist which is causing a non-functional error on switch boot.
Feb 15 00:09:55.374130 r-leopard-simx-74 ERR systemd-modules-load[269]: Failed to find module 'emc2305'
Feb 15 00:09:55.374141 r-leopard-simx-74 ERR systemd-modules-load[269]: Failed to find module 'ads1015'
To resolve this we can patch hw-mgmt to no longer attempt to load these modules by default.
- How I did it
Added a SONiC patch to Mellanox hw-mgmt in order to remove the unused kernel modules which were not upstreamed to sonic-linux-kernel
- How to verify it
Boot switch and verify there are no error logs regarding kernel modules failing to load.
- Why I did it
Stopping swss and syncd causes some driver modules to be unloaded. PMON depends on those driver modules, so this could trigger ERROR logs in syslog.
- How I did it
Adjust warmboot shutdown order in make file
- How to verify it
Manual test
- Why I did it
The SONiC thermal control algorithm compares each thermal zone's temperature with its threshold. Previously, a thermal zone with no thermal sensor could still report its threshold. However, a recent driver patch changes this behavior: a thermal zone with no thermal sensor now returns 0 for the threshold. We need to ignore such thermal zones.
- How I did it
Ignore thermal zones whose temperature is 0.
- How to verify it
Added unit test case and Manual test
- Why I did it
swsscommon.ConfigDBConnector does not automatically close its connection when the instance is garbage-collected by Python, so a new instance should not be created on every call to check_services. Doing so causes errors like: Failed to read from file /var/run/hw-management/led/led_status_capability - OSError(24, 'Too many open files')
- How I did it
Only connect DB once in init
- How to verify it
Manual test
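The connect-once pattern described above can be sketched as follows. This is only an illustration: `FakeConfigDBConnector` is a hypothetical stand-in for `swsscommon.ConfigDBConnector` so the example runs without SONiC, and `ServiceChecker` and its `FEATURE` lookup are assumed names, not the actual implementation.

```python
class FakeConfigDBConnector:
    """Hypothetical stand-in for swsscommon.ConfigDBConnector."""

    open_connections = 0  # connections opened and never closed

    def connect(self):
        FakeConfigDBConnector.open_connections += 1

    def get_table(self, name):
        # Canned data for illustration only.
        return {"swss": {"state": "enabled"}} if name == "FEATURE" else {}


class ServiceChecker:
    def __init__(self):
        # Connect once here; every check_services call reuses the connection.
        self.config_db = FakeConfigDBConnector()
        self.config_db.connect()

    def check_services(self):
        return sorted(self.config_db.get_table("FEATURE"))


checker = ServiceChecker()
for _ in range(1000):
    services = checker.check_services()
```

However many times `check_services` runs, only one connection (and so one file descriptor) is held.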
Why I did it
In the recent minigraph changes we add separate BGP session configuration for V4 and V6 internal VoQ neighbors.
This PR is adding different Peer groups for V4 and V6 neighbors
How I did it
Add VOQ_CHASSIS_V4_PEER and VOQ_CHASSIS_V6_PEER groups
Add extra Unit tests
How to verify it
Signed-off-by: Arvindsrinivasan Lakshmi Narasimhan <arlakshm@microsoft.com>
5331ecd [vslib]: Fix MACsec bug in SCI and XPN (#1003)
ac04509 Fix build issues on gcc-10 (#999)
1b8ce97 [pipeline] Download swss common artifact in a separated directory (#995)
7a2e096 Change sonic-buildimage.vs artifact source from CI build to official build. (#992)
d5866a3 [vslib]: fix create MACsec SA error (#986)
f36f7ce Added Support for enum query capability of Nexthop Group Type. (#989)
323b89b Support for MACsec statistics (#892)
26a8a12 Prevent other notification event storms to keep enqueue unchecked and drained all memory that leads to crashing the switch router (#968)
0cb253a Fix object availability conversion (#974)
Enable dbgsym package for dhcpmon.
Allow CFLAGS and LDFLAGS from environment variables to be used
in the dhcp6relay build. This makes sure that the -O2 flag from
dpkg-buildflags gets used.
Finally, enable all hardening flags in dpkg-buildflags for
dhcp6relay and dhcpmon. The change from the default set of flags is that
during linking, immediate binding of symbols is done instead of lazy
binding.
Signed-off-by: Saikrishna Arcot <sarcot@microsoft.com>
sonic-swss
1aa40f7 Remove port serdes object before removing port (#2152)
876d690 [doc] Updating Policer config in Configuration manual (#2144)
sonic-utilities
dfed952 show_platfom_info not run for simx (#2042)
71fdee7 [aclshow] fix aclshow when clear is called before counters are populated (#2037)
a48a027 [sonic-package-manager] implement blocking feature state change (#2035)
c51871d [ci] Fix python dependencies reference path. (#2060)
Why I did it
The radvd.conf.j2 template creates two copies of the vlan interface when more than one ipv6 address is assigned to a single vlan interface. Changed the format to add prefixes under the same vlan interface block.
How I did it
Modified radvd.conf.j2 and added unit tests
How to verify it
Configure multiple ipv6 addresses on the same vlan, start radvd
Unit test will check if radvd.conf with multiple ipv6 addresses is formed correctly
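The grouping idea can be sketched in Python as below. This is a simplified illustration, not the actual Jinja2 template: the function names, the sample addresses, and the reduced radvd-style output are all assumptions.

```python
from collections import OrderedDict


def group_prefixes_by_interface(vlan_prefixes):
    """Collect every IPv6 prefix of a VLAN under one interface entry."""
    grouped = OrderedDict()
    for name, prefix in vlan_prefixes:
        grouped.setdefault(name, []).append(prefix)
    return grouped


def render(grouped):
    """Render a reduced radvd-style config: one block per interface."""
    lines = []
    for name, prefixes in grouped.items():
        lines.append("interface %s" % name)
        lines.append("{")
        for prefix in prefixes:
            lines.append("    prefix %s" % prefix)
            lines.append("    {")
            lines.append("    };")
        lines.append("};")
    return "\n".join(lines)


# Hypothetical input: Vlan1000 carries two IPv6 prefixes.
sample = [
    ("Vlan1000", "fc02:1000::/64"),
    ("Vlan1000", "fc02:2000::/64"),
    ("Vlan2000", "fc02:3000::/64"),
]
config = render(group_prefixes_by_interface(sample))
```

Each interface appears once in `config`, with all of its prefixes nested inside the same block.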
This issue causes negative threshold value and thus deleting log files even when there is enough space.
- Why I did it
To fix an issue when log files get deleted even if there is enough space.
- How I did it
Fixed a typo.
- How to verify it
Run the portion of the script that calculates threshold, see that the threshold is calculated correctly.
Signed-off-by: Stepan Blyschak <stepanb@nvidia.com>
Why I did it
1. The strcpy and buffer allocation are not safe: they corrupt 1 byte on the stack. Depending on the memory layout, this may or may not cause an issue immediately.
2. The message type is not validated before updating the counter, which could cause a segmentation fault.
How I did it
Remove the unsafe strcpy, use config->interface.c_str() instead.
Check message type before updating counters.
How to verify it
Issue (1) caused a segmentation fault on a specific platform. The fix was validated there. Issue (2) was precautionary. Added a log in case it triggers.
- Why I did it
Update MFT to version 4.18.1-16 for bug fixes and new SN2201 support
- How I did it
Advance the MFT tool version to 4.18.1-16
- How to verify it
Manually tested on all Mellanox platforms (ASIC FW Upgrade, link debug tools, CPLD upgrade, etc.)
- Why I did it
Error log was shown on switches during boot
pmon#supervisord 2021-12-22 04:27:16,709 INFO exited: chassis_db_init (exit status 0; not expected)
- How I did it
Add exit code zero as an expected exit code and also disable autorestart.
- How to verify it
Boot the switch and ensure the above log line does not appear.
cb3ddf5 [pmon][xcvrd] xcvrd process show backtrace on the internal port. Port PR233 (#236)
5b4c9e1 Fix python wheels path downloaded from vs official build. (#244)
- Why I did it
Fix Issue 9972: Incorrect information about release version in sonic_version.yml
- How I did it
Add "sonic_release" file to /sonic-buildimage/files/image_config/
- How to verify it
Install the image and run: cat /etc/sonic/sonic_version.yml
Verify the following item on sonic_version.yml file: release: '202111'
- Why I did it
platform.json of 4600C only has 2 CPU core thermal sensors, but there are 4 actually
- How I did it
Added thermal sensors for CPU core 2 and core 3.
- How to verify it
Build.
- Why I did it
The chassis name in MSN4410 platform_components.json is not correct
- How I did it
Fix the chassis name
- How to verify it
Run relevant platform API test
Signed-off-by: Kebo Liu <kebol@nvidia.com>
ded0344 Return both 'vendor_rev' and 'hardware_rev' keys from get_transceiver_info to support both earlier and later versions of xcvrd
b3442cc Change log_error to log_info when transceiver module is transitioned
3d3a73c Fix problem introduced with new SFP caching paradigm
2915746 Add script to reboot all IMMs
17ad221 Fix the position_in_parent for the psu entity
207e731 No longer force read pages at SFP init time
b387921 Fixed the voltage and power with rounding to 2 digit decimals
Signed-off-by: mlok <marty.lok@nokia.com>
Why I did it
Updated the BCM config recommended by Broadcom for Nokia-IXR7250E-36x400G
How I did it
Updated the BCM config file
How to verify it
Verified running the image with this BCM config in Nokia-IXR7250E-36x400G and ensured that the syncd container was stable, ports were up and passing the traffic.
Signed-off-by: Sakthivadivu Saravanaraj <sakthivadivu.saravanaraj@nokia.com>
[Build]: Fix hundreds of thousands of lines of logs printed in marvell-armhf
This is caused by the bad format of the marvell sai package mrvllibsai_armhf_1.7.1-6.deb. Increase the waiting time to reduce the logs and the wasted CPU.
- Why I did it
Need to remove old static configs from sai.profile files.
New implementation: Azure/sonic-swss#1959
New configuration: #9658
- How I did it
Remove SAI_VXLAN_SRCPORT_RANGE_ENABLE=1 lines from files per HWSKU
- How to verify it
When static config is removed following test will fail (src port will be in range 0-255)
py.test vxlan/test_vnet_vxlan.py --inventory "../ansible/inventory, ../ansible/veos" --host-pattern (testbed)-t0 --module-path ../ansible/library/ --testbed (testbed)-t0 --testbed_file ../ansible/testbed.csv --allow_recover --assert plain --log-cli-level info --show-capture=no -ra --showlocals --disable_loganalyzer --skip_sanity --upper_bound_udp_port 65535 --lower_bound_udp_port 64128
- Why I did it
Remove obsolete parameter that enables static VXLAN src port range
Provide functionality to generate the json config file according to the appropriate parameter in config_db
Done for
SN3800:
• Mellanox-SN3800-D28C50
• Mellanox-SN3800-C64
• Mellanox-SN3800-D28C49S1 (New 10G SKU)
SN2700:
• Mellanox-SN2700-D48C8
- How I did it
Remove SAI_VXLAN_SRCPORT_RANGE_ENABLE=1 from appropriate sai.profile files
Created the vxlan.json file and added a few params that depend on DEVICE_METADATA.localhost.vxlan_port_range
- How to verify it
File /etc/swss/config.d/vxlan.json should be generated inside the swss docker when it restarts
[
    {
        "SWITCH_TABLE:switch": {
            "vxlan_src": "0xFF00",
            "vxlan_mask": "8"
        },
        "OP": "SET"
    }
]
Signed-off-by: Andriy Yurkiv <ayurkiv@nvidia.com>
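As a rough illustration of what the two values in vxlan.json could mean, the sketch below assumes that `vxlan_mask` gives the number of least significant bits of the source port that may vary, with the remaining bits fixed by `vxlan_src`. That interpretation and the helper name `vxlan_src_port_range` are assumptions and should be checked against the swss and SAI documentation.

```python
def vxlan_src_port_range(vxlan_src, vxlan_mask):
    """Return (lowest, highest) UDP source port under the assumed semantics."""
    base = int(vxlan_src, 16)
    low_bits = (1 << int(vxlan_mask)) - 1   # bits that are allowed to vary
    lower = base & ~low_bits & 0xFFFF       # fixed upper bits, variable bits cleared
    upper = lower | low_bits                # fixed upper bits, variable bits set
    return lower, upper


lower, upper = vxlan_src_port_range("0xFF00", "8")
print(lower, upper)
```

Under that assumption the example values give a source port range of 65280 to 65535.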
- Why I did it
New version of mellanox platform management code available adding support for new platforms and fixing bugs.
- How I did it
1. Updated the submodule
2. Updated makefile version references
3. Regenerated SONiC patches
Added midplane_subnet in chassisdb.conf for interfaces-config.sh to create midplane interface in multi-asic namespaces.
Signed-off-by: Sakthivadivu Saravanaraj <sakthivadivu.saravanaraj@nokia.com>
#### Why I did it
PR9611 - sonic-scheduler.yang pattern issue
#### How I did it
Modified the scheduler name pattern string to accept any string
#### How to verify it
Sonic yang tests
Why I did it
Need to be able to run smartctl when pmon docker is not running.
How I did it
Removed the pmon dependency for smartctl as well as the command wrapper and added it to the debian-extension.
How to verify it
Stop pmon
Run smartctl from the host and verify it runs without error
Updates include the following changes in order to support new Mellanox platforms and drivers (Azure/sonic-linux-kernel#259)
10ef390 Update kconfig to support / enable newly backported mellanox patches.
6a949e1 Add backported patches for Mellanox hw-mgmt V.7.0020.1300
e1913f7 Rename and reformat patch headers
#### Why I did it
Include sonic-bgp-monitor to setup.py so it gets included in /usr/local/yang-models when installing the package
#### How I did it
#### How to verify it
install the package
- Why I did it
Fix issue: 'sx_port_mapping_t' object has no attribute 'slot_id'. sx_port_mapping_t only has attribute slot.
- How I did it
Change slot_id to slot.
- How to verify it
Manual test
- Why I did it
Python select.select accepts an optional timeout value in seconds; however, the value passed to it was in milliseconds.
- How I did it
Convert the value from milliseconds to seconds.
- How to verify it
Manual test
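A minimal sketch of the unit conversion, assuming a Linux host. The helper name `wait_readable` and the use of a pipe are illustrative only and are not taken from the platform code.

```python
import os
import select


def wait_readable(fd, timeout_ms):
    """Wait until fd is readable; timeout_ms is given in milliseconds."""
    timeout_s = timeout_ms / 1000.0  # select.select expects seconds
    readable, _, _ = select.select([fd], [], [], timeout_s)
    return bool(readable)


read_fd, write_fd = os.pipe()
empty = wait_readable(read_fd, 50)   # nothing written yet: times out after about 50 ms
os.write(write_fd, b"x")
ready = wait_readable(read_fd, 50)   # data is pending: returns immediately
os.close(read_fd)
os.close(write_fd)
```

Passing the millisecond value unconverted would make the same call wait up to 50 seconds instead of 50 milliseconds.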
Why I did it
To enable test support for BFD-related features, the PTF docker needs to have the proper support for BFD. This PR aims to add BFD support in ptf docker.
How I did it
Clone and build OpenBFDD for PTF docker.
How to verify it
Build locally and verify BFD is supported.
- Why I did it
To include latest SDK fixes:
1. On CMIS modules, after low power configuration, the firmware waited for the module state to be ModuleReady instead of ModuleLowPower causing delays.
2. When connecting SN4600C, 100GbE port with CWDM4 module (Gen 3.0), link up time is 30 seconds.
and to include SAI fixes \ changes:
1. Reduce verbosity for resource check vendor data not found
2. Fix metadata validation, check default value on conditions check
3. Add 100MB, 10MB to 2201 system
4. L3 VXLAN overlay ECMP
5. VXLAN srcport API implementation
6. Fix scheduler profile null (default values) when set on sub group scheduler group
7. Fix ACL binding restoration when port leaves a LAG
8. Fix route logic for set next hop/action and reference counter for ECMP overlay
- How I did it
1. Updated SDK/FW submodule and relevant makefiles with the required versions.
2. Update SAI submodule and relevant makefile with the required version.
- How to verify it
Build an image and run tests from "sonic-mgmt".
Why I did it
Requirements from Microsoft for fwutil update all state that all firmwares which support this upgrade flow must support upgrade within a single boot cycle. This conflicted with a number of Mellanox upgrade flows which have been revised to safely meet this requirement.
How I did it
Added --no-power-cycle flags to SSD and ONIE firmware scripts
Modified Platform API to call firmware upgrade flows with this new flag during fwutil update all
Added a script to our reboot plugin to handle installing firmwares in the correct order prior to reboot
How to verify it
Populate platform_components.json with firmware for CPLD / BIOS / ONIE / SSD
Execute fwutil update all fw --boot cold
CPLD will burn / ONIE and BIOS images will stage / SSD will schedule for reboot
Reboot the switch
SSD will install / CPLD will refresh / switch will power cycle into ONIE
ONIE installer will upgrade ONIE and BIOS / switch will reboot back into SONiC
In SONiC run fwutil show status to check that all firmware upgrades were successful
- Why I did it
The feature state can be a jinja template, like in this file - https://github.com/Azure/sonic-buildimage/blob/master/files/build_templates/init_cfg.json.j2#L39.
Without this change it is not possible to validate a configuration file.
- How I did it
Relaxes the constraint on feature state. Feature state leaf can be any string.
- How to verify it
Run UT.
Signed-off-by: Stepan Blyschak <stepanb@nvidia.com>
Why I did it
Update Broadcom SAI to version 6.0.0.13, SDK 6.5.24, saibcm-modules to 6.5.24.gpl
How I did it
Brcm SAI 6.0 EA with fixes for CS00012203367, CS00012219613, CS00012213974, CS00012218290, CS00012217169, CS00012211718, CS00012213944, CS00012215529, CS00012218100, CS00012214196, CS00012212681, CS00012205138, CS00012208537, CS00012185316, CS00012208524, CS00012203367, CS00012197364.
51a9fbf [debug dump] Missing Dict Key handled in the MatchOptimizer (#2014)
ac8fdd3 [Auto Techsupport] Added Event Driven TS to Command Reference (#1985)
458a0c2 [fdbshow] Adding more options for fdbshow and show mac (#1982)
#### Why I did it
The file mode of src\tacacs\bash_tacplus\debian\rules is 644, and the debian build changes it to 755, which causes the image version to contain 'dirty'
#### How I did it
Change src\tacacs\bash_tacplus\debian\rules file mode to 755
#### How to verify it
Check that the image version does not contain 'dirty'
#### Which release branch to backport (provide reason below if selected)
- [ ] 201811
- [ ] 201911
- [ ] 202006
- [ ] 202012
- [ ] 202106
- [*] 202111
#### Description for the changelog
Change src\tacacs\bash_tacplus\debian\rules file mode to 755
Why I did it
sonic-broadcom-dnx.bin should be installable on DNX-supported platforms, but currently it is not.
How I did it
Changed CONFIGURED_PLATFORM to TARGET_MACHINE to distinguish broadcom and broadcom-dnx
How to verify it
tar sonic-broadcom-dnx.bin and verify its platforms_asic contains dnx platforms
Also verify on image with other asic, no regression.
Why I did it
Eliminate benign firsttime boot error reported when running on platforms that do not support kdump.
How I did it
Change rc.local to check for presence of the file /etc/default/kdump-tools before referencing it.
How to verify it
Install a new image on an armhf or arm64 platform and check for a failed reference to /etc/default/kdump-tools on firsttime boot.
- External PHY is managed via gearbox (gbsyncd docker container) in SONiC
- Enhanced 'External PHY management' from SONiC's single-ASIC environment to multi-ASIC
- Enhanced gbsyncd docker container from single Namespace to multi-Namespace mode
- Added gbsyncd.service.j2 on per_namespace basis.
- Each namespace/ASIC now has its own unique gbsyncd<ASIC#> docker container with its
own Gearbox table, redis-DB
Signed-off-by: Shyam Kumar <shyakuma@cisco.com>
Why I did it
ConfigDB schema generated by minigraph parser can't pass yang validation.
How I did it
Modify minigraph.py, and use 'state' to replace 'status'.
How to verify it
Run UT for sonic-config-engine.
Use minigraph parser to generate ConfigDB schema, and run yang validation.
Signed-off-by: Gang Lv ganglv@microsoft.com
Why I did it
end2end test is blocked by Yang model for BGP monitor.
How I did it
Create new yang files for BGP monitor, and add UT.
How to verify it
Follow the steps in #9711.
Run UT for sonic-yang-models.
Signed-off-by: Gang Lv ganglv@microsoft.com
#### Why I did it
AAA yang model is not up to date.
#### How I did it
Add fallback and trace field, and replace boolean_type
#### How to verify it
Run UT for sonic_yang_models.
Follow the steps from #9710
Why I did it
Config db schema generated by minigraph can’t pass yang validation, bgp_asn must not be None.
How I did it
Update sample-voq-graph.xml to add bgp_asn.
How to verify it
Build sonic-config-engine.
Run command 'sonic-cfggen -m tests/sample-voq-graph.xml -p tests/voq-sample-port-config.ini --print-data', and check bgp_asn.
Signed-off-by: Gang Lv ganglv@microsoft.com
Tested on a Celestica Seastone2 DX030 switch
Testing scenarios:
- Various QSFP ports in both normal and breakout config.
- 100G and 40G link speed show different colors.
- SFP1 port works.
Signed-off-by: Christian Svensson <blue@cmd.nu>
- Why I did it
MSN4410/MSN4600/MSN4700 now support fetching the PSU voltage threshold, so there is no need to skip the PSU voltage check in system health monitoring. Update the system health monitoring configuration file for these platforms accordingly.
- How I did it
remove skip PSU change config from the system_health_monitoring_config.json file
- How to verify it
Build an image and run it on these platforms; system health monitoring should not report errors against PSU voltage
Signed-off-by: Kebo Liu <kebol@nvidia.com>
- Why I did it
Optimize thermal control policies to simplify the logic and add more protection code in policies to make sure it works even if kernel algorithm does not work.
- How I did it
Reduce unused thermal policies
Add timely ASIC temperature check in thermal policy to make sure ASIC temperature and fan speed is coordinated
Minimum allowed fan speed now is calculated by max of the expected fan speed among all policies
Move some logic from fan.py to thermal.py to make it more readable
- How to verify it
1. Manual test
2. Regression
- Why I did it
MSN4700 platform has 8 lanes per port and thus can support 2x40G with each lane running at 10G
- How I did it
Added 40G to 2x200G breakout mode in platform.json
- How to verify it
Run config int break Ethernet0 2x40G[200G,100G,50G,25G,10G,1G]
And verify the command runs successfully and the port speed was set to 40G with a 2x breakout.
* Description: Currently IPv4 routes with IPv6 link local next hops are
not properly installed in FPM.
Reason is the netlink decoding truncates the ipv6 LL address to 4 byte
ipv4 address.
Ex : fe80:: is directly converted to ipv4 and it results in 254.128.0.0
as next hop for below routes
show ip route
Codes: K - kernel route, C - connected, S - static, R - RIP,
O - OSPF, I - IS-IS, B - BGP, E - EIGRP, N - NHRP,
T - Table, v - VNC, V - VNC-Direct, A - Babel, D - SHARP,
F - PBR, f - OpenFabric,
> - selected route, * - FIB route, q - queued, r - rejected, b - backup
B>* 2.1.0.0/16 [200/0] via fe80::268a:7ff:fed0:d40, Ethernet0, weight 1, 02:22:26
B>* 5.1.0.0/16 [200/0] via fe80::268a:7ff:fed0:d40, Ethernet0, weight 1, 02:22:26
B>* 10.1.0.2/32 [200/0] via fe80::268a:7ff:fed0:d40, Ethernet0, weight 1, 02:22:26
Hence this fix converts the ipv6-LL address to ipv4-LL (169.254.0.1)
address before sending it to FPM. This is inline with how these types of
routes are currently programmed into kernel.
Signed-off-by: Nikhil Kelapure <nikhil.kelapure@broadcom.com>
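The truncation described above can be reproduced with Python's ipaddress module. This is only an illustration of the arithmetic. The helper `fpm_nexthop` is a hypothetical name that mirrors the described fix and is not the actual FRR code.

```python
import ipaddress

# Keeping only the first 4 bytes of the IPv6 link-local nexthop
# yields the bogus IPv4 address mentioned in the description.
nexthop = ipaddress.IPv6Address("fe80::268a:7ff:fed0:d40")
truncated = ipaddress.IPv4Address(nexthop.packed[:4])
print(truncated)


def fpm_nexthop(addr):
    """Hypothetical helper: map an IPv6 link-local nexthop to 169.254.0.1."""
    ip = ipaddress.ip_address(addr)
    if ip.version == 6 and ip.is_link_local:
        return ipaddress.IPv4Address("169.254.0.1")
    return ip
```

The first four bytes of any fe80:: address are fe 80 00 00, which is why every such nexthop collapsed to 254.128.0.0.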
- Why I did it
The feature state can be a jinja template, like in this file - https://github.com/Azure/sonic-buildimage/blob/master/files/build_templates/init_cfg.json.j2#L39.
Without this change it is not possible to validate a configuration file.
- How I did it
Relaxes the constraint on feature state. Feature state leaf can be any string.
- How to verify it
Run UT.
Signed-off-by: Stepan Blyschak <stepanb@nvidia.com>
Fixes #9561
Fixes #9570
Fixes #9563
Partial fix for #9556
#### Why I did it
- Attributes for dual ToR configs lack YANG model support
#### How I did it
- Extend YANG tests to cover dual ToR use cases
- Extend YANG model to cover dual ToR use cases
- Reduce the default log level to warning so only test failures are printed
#### How to verify it
- Run the YANG model unit tests
* fix workdir for seastone2
Signed-off-by: Viktor Ekmark <viktor@ekmark.se>
* seastone2: Add I2C SFP definition for SFP1
Signed-off-by: Christian Svensson <blue@cmd.nu>
* [device/cel_seastone_2] sfputil logic for SFP1
Earlier logic resulted in the name of SFP1 being SFP33 which is not
correct. The canonical source is seastone2_fpga module and it calls it
SFP1, so ensure the logic does as well.
Signed-off-by: Christian Svensson <blue@cmd.nu>
* [device/cel_seastone_2] sysfs paths for SFP1
Various changes that plumb the correct port presence and DOM decoding
for the SFP1 port.
Signed-off-by: Christian Svensson <blue@cmd.nu>
Co-authored-by: Christian Svensson <blue@cmd.nu>
#### Why I did it
resolves https://github.com/Azure/sonic-buildimage/issues/8779
snmpd writes the below error message in syslog :
snmp#snmpd[27]: truncating integer value > 32 bits
This message is written in syslog when hrSystemUptime (1.3.6.1.2.1.25.1.1.0, system uptime) or sysUpTime (1.3.6.1.2.1.1.3, network management portion or snmpd uptime) is queried after either of these counters overflows beyond a 32-bit value. This happens when the device uptime or snmpd uptime is more than 497 days.
#### How I did it
Reference: https://access.redhat.com/solutions/367093 and https://linux.die.net/man/1/snmpcmd
To avoid seeing this message if the counter grows, the snmpd error log level is changed to display LOG_EMERG, LOG_ALERT, LOG_CRIT, and LOG_DEBUG.
Without this change, LOG_ERR and LOG_WARNING would also be logged in syslog.
#### How to verify it
On a device which is up for more than 497 days, modify supervisord.conf with the change and restart snmp.
Query 1.3.6.1.2.1.1.3 and verify that log message is not seen.
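The 497-day figure follows from the counter width. SNMP TimeTicks values count hundredths of a second in an unsigned 32-bit integer, so the arithmetic can be checked as follows:

```python
TICKS_PER_SECOND = 100      # TimeTicks are hundredths of a second
SECONDS_PER_DAY = 86400

max_ticks = 2 ** 32         # first value that no longer fits in 32 bits
days_until_overflow = max_ticks / (TICKS_PER_SECOND * SECONDS_PER_DAY)
print(round(days_until_overflow, 1))
```

The counter therefore wraps a little after 497 days of uptime.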
Why I did it
The existing log file size in sonic is 1 Mb. Over a period of time this leads to huge number of log files which becomes difficult for monitoring applications to handle.
Instead of a large number of small files, the size of the log file is now set to 16 Mb which reduces the number of files over a period of time.
How I did it
Changed the size parameter and related macros in logrotate config for rsyslog
How to verify it
Execute logrotate manually and verify the limit when the file gets rotated.
Signed-off-by: Sudharsan Dhamal Gopalarathnam <sudharsand@nvidia.com>
- Why I did it
Add sensor conf for MSN4600C A1 platform
- How I did it
Add a new sensor conf file and relevant scripts to support two different versions of the platform
- How to verify it
Run "sensors" cmd to check the output on the A1 platform to see whether it's as expected.
Signed-off-by: Kebo Liu <kebol@nvidia.com>
#### Why I did it
It should be handled by `ConfigDBConnector.typed_to_raw()`.
This is a bug for `sonic-cfggen -m --print-data` only
```
"PORTCHANNEL_MEMBER": {
"PortChannel0001|Ethernet112": {
"NULL": "NULL"
},
"PortChannel0002|Ethernet116": {
"NULL": "NULL"
},
"PortChannel0003|Ethernet120": {
"NULL": "NULL"
},
"PortChannel0004|Ethernet124": {
"NULL": "NULL"
}
},
```
But it does not appear in `sonic-cfggen -d --print-data`.
```
"PORTCHANNEL_MEMBER": {
"PortChannel0001|Ethernet112": {},
"PortChannel0002|Ethernet116": {},
"PortChannel0003|Ethernet120": {},
"PortChannel0004|Ethernet124": {}
},
```
Tested in a T0 KVM.
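The placeholder handling can be sketched as below. Redis cannot store a hash with no fields, so an empty entry needs a placeholder when written and the placeholder should be dropped when read back. The functions `to_raw` and `to_typed` are simplified, hypothetical stand-ins and are not the library's code.

```python
def to_raw(typed_data):
    """Simplified stand-in: give an empty entry a NULL placeholder for storage."""
    if not typed_data:
        return {"NULL": "NULL"}
    return dict(typed_data)


def to_typed(raw_data):
    """Simplified stand-in: drop the NULL placeholder when reading back."""
    return {k: v for k, v in raw_data.items() if k != "NULL"}


member = {}                      # e.g. "PortChannel0001|Ethernet112" has no fields
stored = to_raw(member)          # what ends up in the database
restored = to_typed(stored)      # what a reader should see
```

If the placeholder is added only at the storage boundary, data printed before it is written stays an empty dict, matching the second output above.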
What I did:-
Enhanced minigraph parser to parse interface name associated with static route nexthop
Why I did:-
One use case for supporting the interface name is Chassis Packet. For Chassis Packet we have static routes configured to route traffic across line cards. If FRR programs a static route without the interface name and the IP interface associated with the nexthop goes down, FRR resolves the static route nexthop over the default route, because we have the FRR config ip nht-resolve-via-default; this causes undesired behavior. Having the interface name with the static route prevents recursive lookup on the default route.
How I verify:
Updated unit-test cases
Manual verification
dd71848 [GCU] Show default option for '--format' (#2003)
f296e76 [GCU] Disallowing DeleteInsteadOfReplaceMoveExtender from generating delete whole config move (#2006)
731d643 [flow counter] Fix issue: should not compare str with int (#2001)
e628f01 Support CLI for buffer queue configuration (#1965)
585fd40 Fix show ip bgp nei command rw required issue (#2011)
Update ztp sub module to include the below fixes:
f7dd3c5 [sonic-ztp]Fixing build failure after bullseye integration (#30)
9218e16 Replace swsssdk.ConfigDBConnector and SonicV2Connector with swsscommon(#28)
Signed-off-by: Sudharsan Dhamal Gopalarathnam <sudharsand@nvidia.com>
* Add boolean as typedef to sonic-types
* Fix boolean in sonic-feature yang model
* Fix boolean in sonic-flex_counter yang model
#### Why I did it
It was requested to cherry-pick the fix from master (#9418) to the 202111 branch to fix an issue when boolean is used in different literal cases.
#### How I did it
Added boolean to sonic-types as typedef with different literal cases.
#### How to verify it
Run the command config interface breakout <interface_name> <breakout_mode>
4236bc4 [config reload] Fixing config reload when timer based delayed services are disabled (#1967)
d2514e4 [GCU] Different apply-patch runs should produce same sorted steps (#1988)
2878adb [GCU] Using simulated config instead of target config when validating replace operation in NoDependencyMoveValidator (#1987)
fb8ca98 [GCU] Loading yang-models only once (#1981)
f88ee92 [GCU] Copying config_db before callding sonic_yang.loadData (#1983)
9ed0e91 [GCU] Implementing DryRun by printing patch-sorter steps/imitating config_db (#1973)
b36b5e3 [GCU] Moving PatchSorter unit-test to json file to make it easier to read/maintain (#1977)
c0fa28b [generic-config-updater] Improving CreateOnly validator and marking /LOOPBACK_INTERFACE/LOOPBACK#/vrf_name as create-only (#1969)
0559d04 [generic-config-updater] Adding non-strict mode (#1929)
b07f477 [debug dump util] FDB debug dump util changes (#1968)
6d8757a [warm/fast-reboot] Fix kexec portion to support platforms based on Device Tree (#1966)
cc1409e [Auto Techsupport] Event driven Techsupport Bug Fixes (#1986)
6c48bd5 Fix wrong help message for cable length setting (#1978)
c0bbbe3 [breakout] Fix the check when port is not present in BREAKOUT_CFG table (#1765)
5bb8cad [doc][DPB] Update DPB related interface breakout command Info (#1438)
e6fd990 [config] Fix 'config reload -l' command to get filename by default (#1611)
bd8f7bb Update swss_ready check to check per namespace swss service (#1974)
5439f94 [soft-reboot] Add support for platforms based on Device Tree (#1963)
7c5810a [config] Add portchannel support for static route (#1857)
7cb6a1b preserve old order for config reload (#1964)
20bddbd [Auto-Techsupport] Issues related to Multiple Cores crashing handled (#1948)
On a multi-asic Supervisor card, running commands like
'show interface counter' opens a config_db connection per
namespace per interface which results in many duplicate connections
exceeding the allowed open file handles. This causes the command to fail.
Caching the connections to prevent duplicate handles.
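A minimal sketch of per-namespace connection caching. `FakeConnector` is a hypothetical stand-in for the real config_db connector, and `get_config_db`, the namespace names and the interface loop are assumed for illustration.

```python
class FakeConnector:
    """Hypothetical stand-in for a config_db connector."""

    opened = 0  # number of connections (file handles) opened

    def __init__(self, namespace):
        self.namespace = namespace

    def connect(self):
        FakeConnector.opened += 1


_connections = {}  # namespace -> connected connector


def get_config_db(namespace):
    """Return the cached connection for a namespace, creating it on first use."""
    if namespace not in _connections:
        conn = FakeConnector(namespace)
        conn.connect()
        _connections[namespace] = conn
    return _connections[namespace]


# Look up the connection once per interface in each of three namespaces.
for namespace in ("asic0", "asic1", "asic2"):
    for _ in range(100):
        get_config_db(namespace)
```

With the cache, 300 lookups open only three connections, one per namespace.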
Why I did it
Config db schema generated by minigraph can’t pass yang validation, there's no Vlan31 in 'VLAN' table.
How I did it
Update test minigraph to add vlan interface.
How to verify it
Build sonic-yang-models.
Run command 'sonic-cfggen -m tests/fg-ecmp-sample-minigraph.xml -p tests/mellanox-sample-port-config.ini --print-data', and run yang validation.
Signed-off-by: Gang Lv ganglv@microsoft.com
Why I did it
'SYSLOG_SERVER': {'': {}, '10.0.10.5': {}, '10.0.10.6': {}},
Config db schema generated by minigraph can’t pass yang validation, server address can't be empty.
How I did it
Update test minigraph to remove wrong configuration.
How to verify it
Build sonic-config-engine.
Run command 'sonic-cfggen -m xxx.xml --print-data', and check the SYSLOG_SERVER table.
Signed-off-by: Gang Lv ganglv@microsoft.com
Why I did it
Config db schema generated by minigraph can’t pass yang validation, portchannel_member has invalid port.
How I did it
Update test minigraph to remove invalid port channel.
How to verify it
Build sonic-config-engine.
Run command 'sonic-cfggen -m xxx.xml --print-data', and check port channel member.
Signed-off-by: Gang Lv ganglv@microsoft.com
Why I did it
Config db schema generated from test minigraph can't pass yang validation.
How I did it
Update test minigraph to fix interface
How to verify it
Build sonic-config-engine.
Run command 'sonic-cfggen -m xxx.xml --print-data', and check interface table and port table.
Signed-off-by: Gang Lv ganglv@microsoft.com
Why I did it
Config db schema generated from test minigraph can't pass yang validation.
How I did it
Update minigraph xml to add DeploymentId.
How to verify it
Build sonic-config-engine.
Run command 'sonic-cfggen -m xxx.xml --print-data', and check deployment_id field.
Signed-off-by: Gang Lv ganglv@microsoft.com
Why I did it
Config db schema generated by minigraph can’t pass yang validation, and there's no 'alias' field in yang model.
Minigraph parser supports 'alias' field for VLAN.
How I did it
Add 'alias' field to sonic-vlan.yang
How to verify it
Build sonic-yang-models.
Run command 'sonic-cfggen -m xxx.xml --print-data', and run yang validation.
Signed-off-by: Gang Lv ganglv@microsoft.com
Signed-off-by: Neetha John <nejo@microsoft.com>
Bring back the changes in #9226 that were reverted. Unable to do a revert-revert.
Why I did it
A few device types were missing from the DEVICE_METADATA type field.
How I did it
Added missing device types to the device metadata yang
Why I did it
#9122
DEVICE_METADATA does not have cloudtype and region.
How I did it
Add cloudtype and region to DEVICE_METADATA.
How to verify it
Follow the steps in #9122.
Build sonic-yang-model.
Signed-off-by: Gang Lv ganglv@microsoft.com
Why I did it
Add yang model for syslog server
How I did it
Add new file sonic-syslog.yang and new files for tests
How to verify it
Compile target/python-wheels/sonic_yang_mgmt-1.0-py3-none-any.whl
Signed-off-by: Arvindsrinivasan Lakshmi Narasimhan arlakshm@microsoft.com
#### Why I did it
Fixes https://github.com/Azure/sonic-utilities/issues/1871
From [generic-config-updater](https://github.com/Azure/sonic-utilities/tree/master/generic_config_updater) we call `sonic-yang-mgmt` multiple times in order to check a certain change to ConfigDb is valid or not. It is expected for some changes to be invalid, so always printing errors from `sonic-yang-mgmt` makes the output hard to read.
In this PR, we are adding a way to control if logs should be printed or not.
#### How I did it
- Added `print_log_enabled` flag to sonic_yang ctor
- Converted all `print` statements to `sysLog(..., doPrint=True)`
#### How to verify it
unit-test passing means the change did not break logs.
#### Info about libyang logging
libyang provides an extensive logging logic which can support a lot of scenarios:
- ly_log_level: setting logging level
  - LY_LLERR
  - LY_LLWRN
  - ...
- ly_set_log_clb: setting log callback to customize the default behavior which is printing the msgs
- ly_log_options: setting logging options
  - LY_LOLOG: If callback is set use it, otherwise just print. If flag is not set, do nothing.
  - ...
For more info refer to:
- https://netopeer.liberouter.org/doc/libyang/devel/html/group__logopts.html#gaff80501597ed76344a679be2b90a1d0a
- https://netopeer.liberouter.org/doc/libyang/devel/html/group__log.html#gac88b78694dfe9efe0450a69603f7eceb
#### What's next?
Consume the new flag `print_log_enabled` in [generic-config-updater](https://github.com/Azure/sonic-utilities/tree/master/generic_config_updater) to reduce the logging clutter.
#### Which release branch to backport (provide reason below if selected)
- [ ] 201811
- [ ] 201911
- [ ] 202006
- [ ] 202012
- [ ] 202106
#### Why I did it
Fix issue https://github.com/Azure/sonic-utilities/issues/1962
The problem is that the current implementation of [sonic-yang-mgmt::find_data_dependencies](f2774b635d/src/sonic-yang-mgmt/sonic_yang.py (L518)) does not find referrers that use a `must` statement; they have to use `leafref`.
For now we can convert `must` to `leafref` where possible. In the future we will investigate getting referrers by `must` statements as well: https://github.com/Azure/sonic-buildimage/issues/9534
#### How I did it
Instead of `must` use `leafref`
#### How to verify it
unit-test
#### Which release branch to backport (provide reason below if selected)
- [ ] 201811
- [ ] 201911
- [ ] 202006
- [ ] 202012
- [ ] 202106
#### Why I did it
Fixing issue #9294
#### How I did it
Updating ACL yang model
#### How to verify it
Validate that the issue with `config apply-patch` is fixed.
- Start a KVM
- Add file `add-ctrl-plane-tbl.json-patch` with content:
```json
[
    {
        "op": "add",
        "path": "/ACL_TABLE/ACTRLPLANETABLE",
        "value": {
            "policy_desc": "ACTRLPLANETABLE",
            "services": [
                "SSH"
            ],
            "stage": "ingress",
            "type": "CTRLPLANE"
        }
    }
]
```
- Run `sudo config apply-patch add-ctrl-plane-tbl.json-patch`
Before:
```
Patch Applier: The patch was sorted into 4 changes:
Patch Applier: * [{"op": "add", "path": "/ACL_TABLE/ACTRLPLANETABLE", "value": {"type": "CTRLPLANE"}}]
Patch Applier: * [{"op": "add", "path": "/ACL_TABLE/ACTRLPLANETABLE/policy_desc", "value": "ACTRLPLANETABLE"}]
Patch Applier: * [{"op": "add", "path": "/ACL_TABLE/ACTRLPLANETABLE/services", "value": ["SSH"]}]
Patch Applier: * [{"op": "add", "path": "/ACL_TABLE/ACTRLPLANETABLE/stage", "value": "ingress"}]
```
After:
```
Patch Applier: The patch was sorted into 1 change:
Patch Applier: * [{"op": "add", "path": "/ACL_TABLE/ACTRLPLANETABLE", "value": {"policy_desc": "ACTRLPLANETABLE", "services": ["SSH"], "stage": "ingress", "type": "CTRLPLANE"}}]
```
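As a sanity check on the two outputs above, the sketch below applies both change sequences to the same starting config and compares the results. It is a simplified illustration: `apply_add` is a hypothetical helper that handles only JSON Patch "add" on nested objects (no arrays, no `~` escaping), and the starting config with an empty `ACL_TABLE` is an assumption.

```python
import copy


def apply_add(doc, path, value):
    """Apply one JSON Patch "add" operation to nested dicts (objects only)."""
    *parents, leaf = path.strip("/").split("/")
    target = doc
    for key in parents:
        target = target[key]  # parent objects must already exist
    target[leaf] = value


# Assumed starting point: ACL_TABLE exists but has no entries.
base = {"ACL_TABLE": {}}

before_changes = [
    ("/ACL_TABLE/ACTRLPLANETABLE", {"type": "CTRLPLANE"}),
    ("/ACL_TABLE/ACTRLPLANETABLE/policy_desc", "ACTRLPLANETABLE"),
    ("/ACL_TABLE/ACTRLPLANETABLE/services", ["SSH"]),
    ("/ACL_TABLE/ACTRLPLANETABLE/stage", "ingress"),
]
after_changes = [
    ("/ACL_TABLE/ACTRLPLANETABLE", {
        "policy_desc": "ACTRLPLANETABLE",
        "services": ["SSH"],
        "stage": "ingress",
        "type": "CTRLPLANE",
    }),
]


def apply_all(changes):
    doc = copy.deepcopy(base)
    for path, value in changes:
        apply_add(doc, path, value)
    return doc


same_result = apply_all(before_changes) == apply_all(after_changes)
print(same_result)
```

Both sequences end in the same config; the difference is that the four-step version passes through intermediate states in which the table is only partially defined.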
#### Which release branch to backport (provide reason below if selected)
- [ ] 201811
- [ ] 201911
- [ ] 202006
- [ ] 202012
- [ ] 202106
Why I did it
Additional sensors that are needed only for a specific system were recently added to all systems and caused errors.
How I did it
* Include CPU board and switch board sensors only on SN2201 system
* Fix issue in test_chassis_thermal; it now skips non-existent thermals.
How to verify it
Run show platform temperature
Signed-off-by: liora <liora@nvidia.com>
Why I did it
To include newer Fan LED, thermal capabilities fields in platform.json of DellEMC S6000, S6100, Z9332f platforms.
How I did it
Add the capabilities fields in each platform's respective platform.json.
How to verify it
Ran sonic-mgmt platform api test cases that use capabilities fields and verified that the results are as expected.
- Why I did it
Add new Spectrum-4 system support SN5600 on top of Nvidia ASIC simulator.
- How I did it
Add all relevant system and simulator SKU.
Updated syseeprom.hex and related directories to reflect Nvidia SN5600 brand name.
- How to verify it
Tested init flow, basic show commands, up interfaces, traffic test.
Signed-off-by: Raphael Tryster <raphaelt@nvidia.com>
Why I did it
Fix typo and missing files in SN3800 and SN4600C's buffer templates
How I did it
ingress_lossless_xoff_size => ingress_lossless_pool_xoff
Add missing files for SN4600C-D100C12S2
How to verify it
Deploy the fix and verify whether the device can be up.
Signed-off-by: Stephen Sun <stephens@nvidia.com>
* [arm64]: Fix registration of the qemu interpreters
The current code doesn't properly run the container that registers the
qemu interpreters. It checks to see if the container is "known" by
Docker, but that doesn't indicate whether it's been run or not.
Therefore, just always register the qemu interpreters in the kernel, to
make sure the binary that's in the slave images that we build is used.
* [build]: Reduce the number of python calls
Modify the BLDENV and PROJECT_ROOT variables in slave.mk to be
immediate execution instead of lazy execution. Neither of these
variables should be changing for the duration of the build in each slave
container, so just run it once instead of every time they're referenced.
When running `make configure` for broadcom arm64 (where all of the slave
images are already built) on an amd64 host, this reduces the time spent
in each slave container from 4.5-5 minutes to 2 minutes.
* [sonic-slave]: Upgrade the qemu used for Bullseye arm64 to 6.1.0
Signed-off-by: Saikrishna Arcot <sarcot@microsoft.com>
- Why I did it
Rename platform x86_64-mlnx_msn4800 to x86_64-nvidia_msn4800
- How I did it
Rename platform folder as well as all code that reference the platform name
- How to verify it
Manual test
When a package name contains special characters, such as +, the name may be encoded as %2b in the URL, and the package URL cannot be found when reproducible build is enabled.
For the Broadcom SAI, we only need to upgrade the version; it is not necessary to change the token part of the URL.
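The encoding mismatch can be reproduced with Python's standard `urllib.parse`. This is only an illustration of percent-encoding: the file name and the `stored_names` lookup set are hypothetical and not taken from the build scripts.

```python
from urllib.parse import quote, unquote

# Hypothetical package file name containing a '+' character.
package_file = "example-package_1.0+build1_amd64.deb"

# Percent-encoding turns '+' into '%2B' (hex digits are case-insensitive,
# so '%2b' denotes the same character).
encoded = quote(package_file)
print(encoded)

# A lookup keyed by the raw name misses when given the encoded name...
stored_names = {package_file}
print(encoded in stored_names)

# ...and succeeds once the name is decoded first.
print(unquote(encoded) in stored_names)
```

Whichever side does the lookup, both sides have to agree on whether names are stored encoded or decoded.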
Co-authored-by: Ubuntu <xumia@xumia-vm1.jqzc3g5pdlluxln0vevsg3s20h.xx.internal.cloudapp.net>
691c37b7 [Route bulk] Fix bugs in case a SET operation follows a DEL operation in the same bulk (Azure/sonic-swss#2086)
a4c80c3d patch for issue Azure/sonic-swss#1971 - enable Rx Drop handling for cisco-8000 (Azure/sonic-swss#2041)
71751d10 [macsec] Support setting IPG by gearbox_config.json (Azure/sonic-swss#2051)
5d5c1692 [bulk mode] Fix bulk conflict when in case there are both remove and set operations (Azure/sonic-swss#2071)
8bbdbd2b Fix SRV6 NHOP CRM object type (Azure/sonic-swss#2072)
ef5b35f3 [vstest] VS test failure fix after fabric port orch PR merge (Azure/sonic-swss#1811)
89ea5385 Supply the missing ingress/egress port profile list in document (Azure/sonic-swss#2064)
81234373 [pfc_detect] fix RedisReply errors (Azure/sonic-swss#2040)
b38f527a [swss][CRM][MPLS] MPLS CRM Nexthop - switch back to using SAI OBJECT rather than SWITCH OBJECT
ae061e55 create debug_shell_enable config to enable debug shell (Azure/sonic-swss#2060)
45e446d9 [cbf] Fix max FC value (Azure/sonic-swss#2049)
b1b5b297 Initial p4orch pytest code. (Azure/sonic-swss#2054)
d352d5a9 Update default route status to state DB (Azure/sonic-swss#2009)
24a64d65 Orchagent: Integrate P4Orch (Azure/sonic-swss#2029)
15a3b6ca Delete the IPv6 link-local Neighbor when ipv6 link-local mode is disabled (Azure/sonic-swss#1897)
ed783e1f [orchagent] Add trap flow counter support (Azure/sonic-swss#1951)
e9b05a31 [vnetorch] ECMP for vnet tunnel routes with endpoint health monitor (Azure/sonic-swss#1955)
bcb7d61a P4Orch: inital add of source (Azure/sonic-swss#1997)
f6f6f867 [mclaglink] fix acl out ports (Azure/sonic-swss#2026)
fd887bf8 [Reclaim buffer] Reclaim unused buffer for dynamic buffer model (Azure/sonic-swss#1910)
92589789 [orchagent, cfgmgr] Add response publisher and state recording (Azure/sonic-swss#1992)
3d862a72 Fixing subport vs test script for subport under VNET (Azure/sonic-swss#2048)
fb0a5fd8 Don't handle buffer pool watermark during warm reboot reconciling (Azure/sonic-swss#1987)
16d4bcdb Routed subinterface enhancements (Azure/sonic-swss#1907)
9639db78 [vstest/subintf] Add vs test to validate sub interface ingress to a vnet (Azure/sonic-swss#1642)
Signed-off-by: Stephen Sun stephens@nvidia.com
#### Why I did it
Created SONiC Yang model for Mirror.
Tables: MIRROR_SESSION
#### How I did it
Defined Yang models for COPP based on Guideline doc:
https://github.com/Azure/SONiC/blob/master/doc/mgmt/SONiC_YANG_Model_Guidelines.md
and
https://github.com/Azure/sonic-utilities/blob/master/doc/Command-Reference.md
#### How to verify it
```
============================= test session starts ==============================
platform linux -- Python 3.7.3, pytest-3.10.1, py-1.7.0, pluggy-0.8.0
rootdir: /sonic/src/sonic-yang-models, inifile:
plugins: cov-2.6.0
collected 3 items
tests/test_sonic_yang_models.py .. [ 66%]
tests/yang_model_tests/test_yang_model.py . [100%]
=============================== warnings summary ===============================
module: sonic-mirror-session
  +--rw sonic-mirror-session
     +--rw MIRROR_SESSION
        +--rw MIRROR_SESSION_LIST* [name]
           +--rw name         string
           +--rw type?        string
           +--rw src_ip?      inet:ipv4-address
           +--rw dst_ip?      inet:ipv4-address
           +--rw gre_type?    string
           +--rw dscp?        uint8
           +--rw ttl?         uint8
           +--rw queue?       uint8
           +--rw dst_port?    -> /port:sonic-port/PORT/PORT_LIST/name
           +--rw src_port?    union
           +--rw direction?   string
```
#### Why I did it
Currently only IP ACL and related model is defined. Support for MAC ACL is missing. Added support for it.
#### How I did it
ACL_RULE table is added with new MAC ACL related fields namely Source MAC, Destination MAC, Ethertype (Pattern updated to match any valid Ethertypes), VLAN, PCP, DEI
#### How to verify it
Yang model tests are attached.
Depends on #9358
Why I did it
Adjust LED logic according to hw-mgmt change.
How I did it
Add a trigger to set LED to blink.
How to verify it
Manual test
#### Why I did it
Add the configuration for the set_owner in the `feature` yang model
#### How I did it
Add new leaf `set_owner` to the `feature` yang model
#### How to verify it
compile `sonic_yang_mgmt-1.0-py3-none-any.whl`
Why I did it
To support iTCO watchdog using watchdog APIs.
How I did it
Implemented a new watchdog class WatchdogTCO for interfacing with iTCO watchdog.
Updated reboot cause determination logic.
How to verify it
Verified that the watchdog APIs' return values are as expected.
Logs: UT_logs.txt
Update Nokia platform sonic-pmon submodule to the latest with the following commits
c41c823 Fix transceiver module dynamic insertion/removal operations
21a1df6 Fixed pcied process FATAl issue
7fc1fd4 Fix midplane status nokia_cmd
a14ee1c Override get_module() api in chassis
8a457fc SON-326: Watchdog logger changes and file scrubbing
7250eb1 SON-410: Fix missing eeprom access routine
7a70c42 Allow only reboot of self card for OC API test
6ab5d96 Fixed the flake8 compliant issues
807de95 APIs to set thermal threshold to return false
9b38265 SON-382: platform-dump with common techsupport
3f83a67 Add model, base_mac, system_eeprom and serial number support in moduel.py
848d311 SFP: Add get_error_description and fix return status for set_lpmode
1fcb5de PSU check presence of psu instance for APIs
7c68da3 Fixed the eagle and hornet card description
0c01d07 Module support for reboot API
- Why I did it
Also recalculated all parameters with the latest algorithm with per-speed peer response time taken into account
- How I did it
Detailed information of each SKU:
C64:
t0: 32 100G downlinks and 32 100G uplinks
t1: 56 100G downlinks and 8 100G uplinks with 2km-cable supported
D112C8: 112 50G downlinks and 8 100G uplinks.
D48C40: 48 50G downlinks, 32 100G downlinks, and 8 100G uplinks
D100C12S2: 4 100G downlinks, 2 10G downlinks, 100 50G downlinks, and 8 100G uplinks
2km cable is supported for C64 on t1 only
- How to verify it
Run regression test (QoS)
Signed-off-by: Stephen Sun <stephens@nvidia.com>
The same docker image is built multiple times after upgrading to bullseye, and the build time has increased from 6 hours to about 15 hours.
See log: https://dev.azure.com/mssonic/be1b070f-be15-4154-aade-b1d3bfb17054/_apis/build/builds/50390/logs/9
Line 1437: 2021-11-11T11:15:02.7094923Z [ building ] [ target/docker-sonic-telemetry.gz ]
Line 1446: 2021-11-11T11:37:41.1073304Z [ finished ] [ target/docker-sonic-telemetry.gz ]
Line 1459: 2021-11-11T11:38:20.6293007Z [ building ] [ target/docker-sonic-telemetry.gz-load ]
Line 1462: 2021-11-11T11:38:28.1250201Z [ finished ] [ target/docker-sonic-telemetry.gz-load ]
Line 2906: 2021-11-11T18:57:42.8207365Z [ building ] [ target/docker-sonic-telemetry.gz ]
Line 2917: 2021-11-11T19:43:47.1860961Z [ finished ] [ target/docker-sonic-telemetry.gz ]
Line 3997: 2021-11-11T22:49:35.0196252Z [ building ] [ target/docker-sonic-telemetry.gz ]
Line 4002: 2021-11-11T23:14:00.4127728Z [ finished ] [ target/docker-sonic-telemetry.gz ]
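To quantify the repeated work, the timestamps in the log lines above can be paired up and subtracted. The sketch below covers only the three `docker-sonic-telemetry.gz` builds (not the `-load` step); the `parse` helper is a hypothetical convenience that trims the 7-digit fractional seconds so `strptime` accepts them.

```python
from datetime import datetime, timedelta


def parse(ts):
    # Keep 6 fractional digits (microseconds) and drop the rest plus the 'Z'.
    return datetime.strptime(ts[:26], "%Y-%m-%dT%H:%M:%S.%f")


# (building, finished) timestamps copied from the log lines above.
builds = [
    ("2021-11-11T11:15:02.7094923Z", "2021-11-11T11:37:41.1073304Z"),
    ("2021-11-11T18:57:42.8207365Z", "2021-11-11T19:43:47.1860961Z"),
    ("2021-11-11T22:49:35.0196252Z", "2021-11-11T23:14:00.4127728Z"),
]

durations = [parse(end) - parse(start) for start, end in builds]
total_minutes = sum(durations, timedelta()).total_seconds() / 60

for duration in durations:
    print(duration)
print(round(total_minutes))
```

Any build after the first one is redundant time for this single image, and the same pattern repeats for other images.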
How I did it
Place the python wheels in another folder relative to the build distribution.
Co-authored-by: Ubuntu <xumia@xumia-vm1.jqzc3g5pdlluxln0vevsg3s20h.xx.internal.cloudapp.net>
Why I did it
Nvidia platform API does not support setting the LED to orange
How I did it
Allow user to set LED to orange
How to verify it
Added unit test
Manual test
#### Why I did it
DPB fails due to missing POLL_INTERVAL in sonic-flex_counter yang model.
#### How I did it
Added POLL_INTERVAL leaf to ACL container in sonic-flex_counter yang model.
#### How to verify it
Run the command config interface breakout <interface> <breakout_mode>
**NOTE:**
To verify this fix, a PR ([add set_owner to feature yang](https://github.com/Azure/sonic-buildimage/pull/9075)) that fixes another bug in SONiC should be merged to master.
Closes #7958
#### Why I did it
The previous implementation of sonic-cfggen did a simple comparison between default breakout mode in
hwsku.json and supported modes in platform.json. To set a different default speed in hwsku.json
it was required to add one more entry to supported modes in platform.json file:
1x10G[100G,50G] vs 1x100G[50G,10G]
The new implementation does more intelligent parsing and analysis of supported and default modes. It
allows changing default speed without adding a new entry to platform.json.
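One way to picture the difference between string comparison and parsed comparison is the sketch below. It is not the sonic-cfggen implementation: `parse_mode` and `is_supported` are hypothetical helpers, only the simple `NxSPEED[ALT,...]` form is handled, and the matching rule (same port count, speeds form a subset) is an assumption.

```python
import re

# Matches e.g. "1x100G[50G,10G]" or "2x50G".
MODE_RE = re.compile(r"^(\d+)x(\d+G)(?:\[([^\]]*)\])?$")


def parse_mode(mode):
    """Split a mode string into (port_count, default_speed, all_speeds)."""
    match = MODE_RE.match(mode)
    if match is None:
        raise ValueError(f"unsupported breakout mode: {mode!r}")
    count, default_speed, alternates = match.groups()
    speeds = {default_speed}
    if alternates:
        speeds.update(s.strip() for s in alternates.split(","))
    return int(count), default_speed, frozenset(speeds)


def is_supported(default_mode, supported_modes):
    """True if some supported mode has the same port count and covers all speeds."""
    count, _default, speeds = parse_mode(default_mode)
    for supported in supported_modes:
        s_count, _s_default, s_speeds = parse_mode(supported)
        if count == s_count and speeds <= s_speeds:
            return True
    return False


supported = ["1x100G[50G,10G]", "2x50G", "4x25G[10G]"]

# A plain string comparison rejects the reordered mode...
print("1x10G[100G,50G]" in supported)
# ...while the parsed comparison accepts it.
print(is_supported("1x10G[100G,50G]", supported))
```

Comparing sets of speeds rather than raw strings is what lets the default speed change without a new platform.json entry.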
#### How I did it
Add more intelligent parsing and analysis of supported and default modes.
#### How to verify it
Run sonic-config-engine unit tests from sonic-config-engine/tests directory
- Why I did it
Add support for SN2201 platform
- How I did it
Add required content for SN2201 platform
Note: kernel driver support for this system is still missing. It will be updated once everything is upstream.
- How to verify it
Install and basic sanity tests including traffic.
Signed-off-by: liora liora@nvidia.com
Why I did it
Adding platform support for centec v682-48y8c and v682-48x8c.
V682-48y8c switch has 48 SFP+ (1G/10G/25G) ports, 8 QSFP28 (40G/100G) ports on CENTEC TsingMa.MX.
V682-48y8c is different from V682-48y8c_d in that:
the transceiver is managed by the CPU SMBus rather than the TsingMa.MX I2C bus.
the port LED is managed by the MCU inside TsingMa.MX.
fans, PSUs, sensors, and LEDs are managed by the CPU SMBus rather than the CPU board vendor's closed-source driver.
V682-48x8c switch has 48 SFP+ (1G/10G) ports, 8 QSFP28 (40G/100G) ports on CENTEC TsingMa.MX.
CPU used in v682-48y8c and v682-48x8c is Intel(R) Xeon(R) CPU D-1527.
How I did it
Modify related code in platform and device directory.
Upgrade centec sai to v1.9.
Upgrade Python to python3 and the kernel version to 5.0 for V682-48y8c_d.
How to verify it
Build centec amd64 sonic image, verify platform functions (port, sfp, led etc) on centec v682-48y8c and v682-48x8c board.
Co-authored-by: shil <shil@centecnetworks.com>
- Why I did it
Fix sonic-config-engine unit test failure
- How I did it
* Do not use a pytest fixture in the test, since it is not compatible with the unittest framework used by all the other test cases.
* Supply 2 missing files
- How to verify it
Run the unit tests or compile the module (the unit tests then run automatically)
Signed-off-by: Stephen Sun <stephens@nvidia.com>
This interface type is used for recirculation on chassis.
The definition is required to prevent this interface from being
considered a physical interface in sonic-platform-common and
sonic-platform-daemon
- Why I did it
To include latest fixes.
SAI
1. Reclaim buffers for port which is admin down
2. Support for Spectrum-4 on Nvidia ASIC simulation
3. Support for SN2201
4. Fix host interface table entry, one channel per trap (fix sflow double registration)
5. 2 new queue counters - ecn marked packets + shared current occupancy
6. Fix storm policer unknown unicast
7. Add key/value for accuflow counters
8. Add MAC move
9. Add mirror congestion mode attribute
SDK
1. Under various circumstances, Ethernet ports falsely showed that InfiniBand cables were connected.
2. In SN4600C, at times, the link up time in both DAC and optics cables may, in the worst case, take up to 15 seconds.
3. Using SN4600C with copper or optics loopback cables in NRZ speeds, the link may rise with long link-up times
4. When ECMP has high amount of next-hops based on VLAN interfaces, in some rare cases, packets will get a wrong VLAN tag and will be dropped.
5. When connecting Spectrum devices with optical transceivers that support RXLOS, remote side port down might cause the switch firmware to get stuck and cause unexpected switch behavior.
6. Aggregation event is missing for WJH L2 drop reason 'Unicast egress port list is empty'.
7. Tying the SCL and SDA of the optical modules to 3.3V causes errors.
8. On SN4600, there was a delay of more than 10 seconds from the time a data packet is sent from CPU until it is transmitted through one of the switch ports.
9. While using SN4600C system with Finisar FTLC1157RGPL 100GbE CWDM4 modules, intermittent link flaps across multiple ports may be observed.
10. In Spectrum-2 and Spectrum-3 systems, link did not work in auto-negotiation when connected to Marvell PHY. KR mechanism has been enhanced to integrate with Marvell PHY.
11. The tunnel counter now counts dropped packets for Spectrum-2 and Spectrum-3, consistent with Spectrum behavior, and counts ECN dropped packets as well.
12. When connecting SN3800 to Cisco-9000, the fast-linkup flow fails and the link rises in the normal flow.
13. Race condition in WJH library: when multiple threads load the LAG shared memory concurrently, the program may crash.
14. Add WJH L2 drop reason 'Unicast egress port list is empty' as a new drop reason.
15. Fixed a memory leak in sx_api_port_sflow_statistics_get API.
16. During initialization flow, the command interface that is used by the minimal driver and SDK caused the collision in the firmware since the same buffer is used in the firmware for the two interfaces.
17. Fix route issue on Kernel 5.10
- How I did it
Updated SDK/SAI submodule and relevant makefiles with the required versions.
- How to verify it
Build an image and run tests from "sonic-mgmt".
Signed-off-by: Volodymyr Samotiy <volodymyrs@nvidia.com>
#### Why I did it
1. Fix the auditd log file path because of a known issue: https://github.com/Azure/sonic-buildimage/issues/9548
2. When SONiC moved to bullseye, auditd was upgraded from 2.8.4 to 3.0.2. In auditd 3.0.2 the plugin file path changed to /etc/audit/plugins.d, but the upstream audisp-tacplus project has not followed this change and still installs the plugin config file to /etc/audit/audisp.d, so the plugin cannot be launched correctly. The code change in src/tacacs/audisp/patches/0001-Porting-to-sonic.patch fixes this issue.
#### How I did it
Fix tacacs plugin config file path.
Create /var/log/audit folder for auditd.
#### How to verify it
Pass all UT; also run the per-command accounting UT to validate that the plugin is loaded.
#### Which release branch to backport (provide reason below if selected)
- [ ] 201811
- [ ] 201911
- [ ] 202006
- [ ] 202012
- [ ] 202106
#### Description for the changelog
Fix tacacs plugin config file path.
Create /var/log/audit folder for auditd.
What I did:
Updated the Jinja template to enable BGP Graceful Restart based on device role. By default it is enabled only if the device role type is TorRouter.
Why I did it:
By default FRR is configured in Graceful Helper mode. Graceful Restart is needed on T0/TorRouter only, since that device can go through warm-reboot. T1/LeafRouter needs to be in Helper mode only.
- Why I did it
To have an ability to use PRM sniffer.
- How I did it
Enabled the option in configure flags.
- How to verify it
Built and ran on switch. Enabled the feature in runtime and checked the sniffer recording.
Signed-off-by: Stepan Blyschak <stepanb@nvidia.com>
Why I did it
database.sh failed to create the database for namespace in multiasic platform.
With the latest code using Docker version 20.10.x, the "docker create" command no longer accepts the optional "NET=" with an empty value, so the current docker create command in database.sh fails with a syntax error. Issue #9503
How I did it
Modify the docker_image_ctl.j2 to set default network setting NET="bridge" instead of empty for namespace database.
#### Why I did it
POLL_INTERVAL cannot be set if any of the detection/restoration times in this table is less than the POLL_INTERVAL.
#### How I did it
Add "must" constraint to make sure detection/restoration times are greater than POLL_INTERVAL.
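The intent of the constraint can be sketched in plain Python. This is not the YANG expression itself: the field names `detection_time` and `restoration_time`, the table layout, and the choice to allow values equal to POLL_INTERVAL are all assumptions made for illustration.

```python
def poll_interval_is_valid(poll_interval, entries):
    """Return True when no detection/restoration time is below poll_interval."""
    for entry in entries.values():
        for field in ("detection_time", "restoration_time"):
            value = entry.get(field)
            # Values are stored as strings in this hypothetical table.
            if value is not None and int(value) < poll_interval:
                return False
    return True


# Hypothetical table contents keyed by port name.
table = {
    "Ethernet0": {"detection_time": "400", "restoration_time": "400"},
    "Ethernet4": {"detection_time": "200", "restoration_time": "600"},
}

print(poll_interval_is_valid(200, table))
print(poll_interval_is_valid(300, table))
```

A `must` statement in the model expresses the same check declaratively, so an invalid POLL_INTERVAL is rejected at validation time instead of being applied.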
#### How to verify it
Use apply-patch command to update POLL_INTERVAL.
Build sonic-yang-model.
- Why I did it
The capability files were incorrect in comparison to the marketing spec of the SN4410 platform.
- How I did it
Aligned the capability files according to the marketing spec.
- How to verify it
Basic manual sanity checks:
1. Check if critical docker containers were UP
2. Check if interfaces were created and were UP
3. Check if interfaces were created in the syncd docker container by executing the sx_api_ports_dump.py script
4. Check the logs from the start of the switch – everything was OK
5. Verified the port breakout
Signed-off-by: Vadym Hlushko <vadymh@nvidia.com>
- Why I did it
To fix an issue that hw-mgmt patches were not applied. One patch was already in upstream hw-mgmt package thus applying it again caused an error and no other patches were applied. Also, I did it to improve the Makefile, so that the make will fail in case patches fail to apply.
- How I did it
Removed obsolete patch, made applying patches a hard failure in the build.
- How to verify it
Run the make and verify patches are applied.
Signed-off-by: Stepan Blyschak <stepanb@nvidia.com>
- Why I did it
To fix the above error when running make slave.mk with PLATFORM=vs.
- How I did it
Instead of:
export BUILD_MULTIASIC_KVM=$(BUILD_MULTIASIC_KVM)
do just the export:
export BUILD_MULTIASIC_KVM
BUILD_MULTIASIC_KVM is already defined to be either empty, or from rules/config or from the environment - from Makefile.work. No need to dereference the variable in the export statement.
- How to verify it
PLATFORM=vs make -f slave.mk list # verify no error and BUILD_MULTIASIC_KVM is empty in the output
PLATFORM=vs BUILD_MULTIASIC_KVM=y make -f slave.mk list # verify no error and BUILD_MULTIASIC_KVM is set to y in the output
Signed-off-by: Stepan Blyshchak <stepanb@nvidia.com>
Fix the issue that nodesource.list cannot be read; it is caused by not using the full path.
```
2021-12-03T06:59:26.0019306Z Removing intermediate container 77cfe980cd36
2021-12-03T06:59:26.0020872Z ---> 528fd40e60f6
2021-12-03T06:59:26.0021457Z Step 81/81 : RUN post_run_buildinfo
2021-12-03T06:59:26.0841136Z ---> Running in d804bd7e1b06
2021-12-03T06:59:29.1626594Z DEPRECATION: Python 2.7 reached the end of its life on January 1st, 2020. Please upgrade your Python as Python 2.7 is no longer maintained. pip 21.0 will drop support for Python 2.7 in January 2021. More details about Python 2 support in pip can be found at https://pip.pypa.io/en/latest/development/release-process/#python-2-support pip 21.0 will remove support for this functionality.
2021-12-03T06:59:34.2960105Z /usr/bin/sed: can't read nodesource.list: No such file or directory
2021-12-03T06:59:34.5094880Z The command '/bin/sh -c post_run_buildinfo' returned a non-zero code: 2
```
Co-authored-by: Ubuntu <xumia@xumia-vm1.jqzc3g5pdlluxln0vevsg3s20h.xx.internal.cloudapp.net>
Fix no space left on device issue in tmpfs.
2021-12-01T06:30:40.1651742Z cp: write error: No space left on device
2021-12-01T06:30:40.1652225Z Failure: local_fs_run():/dev/vdb Unable to copy /tmp/tmp.gl4Sgp/onie-installer.bin to tmpfs
Updated BGP Template for the case:
1. For Packet Chassis, do not advertise the Loopback4096 address into BGP, as there is a static route for the same.
Having this route in BGP causes two levels of recursion in Zebra and causes an assert in Zebra
when many nexthops are involved.
2. Advertise only P2P connected IPs into BGP (external peers). For Packet Chassis we have the backend IP interface subnet, and if
it gets advertised into BGP then it also causes recursion.
- Add INCLUDE_PINS to config to enable/disable container
- Add Docker files and supporting resources
- Add sonic-pins submodule and associated make files
Submission containing materials of a third party:
Copyright Google LLC; Licensed under Apache 2.0
#### Why I did it
Adds P4RT container to SONiC for PINS
The P4RT app is covered by this HLD:
https://github.com/pins/SONiC/blob/master/doc/pins/p4rt_app_hld.md
#### How I did it
Followed the pattern and templates used for other SONiC applications
#### How to verify it
Build SONiC with INCLUDE_P4RT set to "y".
Verify that the resulting build has a container called "p4rt" running.
You can verify that the service is up by running the following command on the SONiC switch:
```bash
sudo netstat -lpnt | grep p4rt
```
You should see the service listening on TCP port 9559.
#### Which release branch to backport (provide reason below if selected)
None
#### Description for the changelog
Build P4RT container for PINS
6f2d8d2110967d813053bcfcd8b34c42c5d0cda2 (HEAD -> 202111, origin/202111) [Voq][Inband] Support the Ethernet-IB port (#228
Signed-off-by: Abhishek Dosi <abdosi@microsoft.com>
f81043b1f9ff02196629655f4735b33afd7f0ae1 (HEAD -> 202111, origin/202111) [port2alias]: Fix to get right number of return values (#1906)
bbbf65943ec46e9330eadaed8bcdf1612cb8bd55 [CLI][show bgp] On chassis don't show internal BGP sessions by default (#1927)
e12de7e7bf6cff3ec127f261bf88e4d29776d27b [port] Fix port speed set (#1952)
cae7af752d484956d7fe40e4c3a849ddad460976 Fix invalid output of syslog IPv6 servers (#1933)
6009341ddf790094166be5f0a81b4c114f00220b Routed subinterface enhancements (#1821)
6ab9d67ca6550c592b97afb513804be474f84eb0 Enhance sfputil for CMIS QSFP (#1949)
76cc67ba4f81c69b20efb3341808037c9db8f703 [debug dump] Refactoring Modules and Unit Tests (#1943)
cff58a8171423e4012bc8caf9748996a1e98b7e2 Add command reference for trap flow counters (#1876)
71cf3ee43524d56ad57dd90b937cfbf4bf63ba6a [Reclaim buffer] [Mellanox] Db migrator support reclaiming reserved buffer for unused ports (#1822)
e699b49fb722e6d6fe5a1d2dacd2d39eb085c1e4 Add show command for BFD sessions (#1942)
bb6c5774c843dbfad5f1ba00ee76dae7720902d1 [warm-reboot] Fix failures of warm reboot on disconnect of ssh session (#1529)
2e8bbb308477862a76d2327fcf696875e8f08650 Add trap flow counter support (#1868)
58407c1386ef13772a9a9320a795e380f162ab2c [load_minigraph] Delay pfcwd start until the buffer templates are rendered (#1937)
eb388e0584ba1fe8d8dba58f1c5a148036ffe047 [sonic-package-manager] support sonic-cli-gen and packages with YANG model (#1650)
2371d84e7d281bdb9988b5a1a012498dbbfb89ec generic_config_updater: Filename changed & VLAN validator added (#1919)
7c0718dfaf23289d4ecc3ada9332e465c9a4e56b [config reload] Update command reference (#1941)
Signed-off-by: Abhishek Dosi <abdosi@microsoft.com>
c2aac75 [SFP-Refactor] Fix LP mode API issue (#247)
dba17c8 Firmware upgrade CLI support for QSFP-DD transceivers (#244)
cd69212 [SFP-Refactor] Implement CMIS Low Power mode (#237)
9cea07f Fix RegGroupField decode (#245)
6ae1909 Add CMIS QSFP support (#246)
c1f317d Gracefully handle CMIS APIs for passive modules (#238)
ec7335d fix for firmware functions (#243)
cf2ebe9 Fix RegBitField decode/encode (#242)
ef4f2c6 Fix SFP_CABLE_TECH_FIELD (#240)
e118644 remove time counting message in functions because function running time could be difficult to predict in unit tests (#241)
Signed-off-by: Prince George <prgeor@microsoft.com>