Why I did it
Running warm-reboot in a loop for 500 times leads to this error on 318-th iteration:
Apr 2 15:56:27.346747 sonic INFO swss#/supervisord: restore_neighbors Traceback (most recent call last):
Apr 2 15:56:27.346747 sonic INFO swss#/supervisord: restore_neighbors File "/usr/bin/restore_neighbors.py", line 24, in <module>
Apr 2 15:56:27.346747 sonic INFO swss#/supervisord: restore_neighbors from scapy.all import conf, in6_getnsma, inet_pton, inet_ntop, in6_getnsmac, get_if_hwaddr, Ether, ARP, IPv6, ICMPv6ND_NS, ICMPv6NDOptSrcLLAddr
Apr 2 15:56:27.346795 sonic INFO swss#/supervisord: restore_neighbors File "/usr/local/lib/python3.7/dist-packages/scapy/all.py", line 25, in <module>
Apr 2 15:56:27.346956 sonic INFO swss#/supervisord: restore_neighbors from scapy.route import *
Apr 2 15:56:27.346995 sonic INFO swss#/supervisord: restore_neighbors File "/usr/local/lib/python3.7/dist-packages/scapy/route.py", line 205, in <module>
Apr 2 15:56:27.347089 sonic INFO swss#/supervisord: restore_neighbors conf.iface = get_working_if()
Apr 2 15:56:27.347129 sonic INFO swss#/supervisord: restore_neighbors File "/usr/local/lib/python3.7/dist-packages/scapy/arch/linux.py", line 128, in get_working_if
Apr 2 15:56:27.347213 sonic INFO swss#/supervisord: restore_neighbors ifflags = struct.unpack("16xH14x", get_if(i, SIOCGIFFLAGS))[0]
Apr 2 15:56:27.347250 sonic INFO swss#/supervisord: restore_neighbors File "/usr/local/lib/python3.7/dist-packages/scapy/arch/common.py", line 31, in get_if
Apr 2 15:56:27.347345 sonic INFO swss#/supervisord: restore_neighbors return ioctl(sck, cmd, struct.pack("16s16x", iff.encode("utf8")))
Apr 2 15:56:27.347365 sonic INFO swss#/supervisord: restore_neighbors OSError: [Errno 19] No such device
The issue was reported to scapy devs secdev/scapy#3369, the fix is secdev/scapy#3371, however there is no released scapy version with this fix right now, thus decided to build scapy v2.4.5 from sources and apply the fix in a form of a patch.
Signed-off-by: Stepan Blyschak <stepanb@nvidia.com>
- Why I did it
Fixes#9628
During bootup, this error log is seen
Dec 22 04:26:29 sonic interfaces-config.sh[2546]: error: main exception: cannot find interfaces: eth0 (interface was probably never up ?)
This is of non-functional nature and doesn't affect the flow.
- How I did it
Dont take the ifdown if not needed
- How to verify it
Verified during reboot. Log did not appear and IP was acquired on eth0 as expected
Signed-off-by: Vivek Reddy Karri <vkarri@nvidia.com>
Cherry-pick #10108
- Why I did it
Fixing issue #9991
The ACL RULE table field ETHER_TYPE can accept both hex as well as decimal values. However yang model didn't allow decimal values. Fixed it to allow decimal values (same pattern as in hex (1536-65535)
- How I did it
Updated yang model to handle decimal values
- How to verify it
Added UT to verify it.
bd97dfe Fix urllib3 CVE-2021-33503 issue (#104)
f159bfa Upgrade the containers to be based on Debian Buster (#103)
a1830c1 Fix OpenAPI spec to be readable by autorest (#101)
94805a3 Identify and report Vnet GUID for conflicting VNI (#99)
4832dfd Static route expiry if not specified as persistent (#98)
5cc4358 Add support for overlay ECMP (#96)
6822a46 [CI] Set diff cover threshold to 50% (#97)
dcc826a Add PR diff coverage (#95)
This PR is to backport #10401 to 202111
- Why I did it
Take new hw-mgmt release to SONiC, including:
new features:
hw-mgmt: add to PSU FW upgrade tool command to show current FW version
hw-mgmt: add to PSU FW upgrade tool support for single-PSU-in-the-system FW upgrade
hw-mgmt: add attribute “/firmware” to show FW version of restricted upgradable PSUs only
hw-mgmt: Add NVME temperature reports attributes (_alarm/_crit/_min/_max)
bug fix:
psu: redundant i2c_addr attributes being created for psu 3 & 4 in system having only 2 psus.
hw-mgmt: in SPC1/2 i2c driver removal is too slow vs. ASIC reset causing non-functional log errors
PSU thresholds sysfs changed in 5.10 to “read only” preventing modification (modification required due PSU HW bug)
CPLD3 sysfs attribute missing after chip down/up flow
sysfs attributes missing when hw-mgmt is restarted (stop/start) within systemd
release notes can be found from link https://github.com/Mellanox/hw-mgmt/blob/V.7.0020.2004/debian/Release.txt
- How I did it
Update hw-mgmt make file with new version number
Update hw-mgmt submodule pointer
- How to verify it
Run platform regression on all Mellanox platform
Signed-off-by: Kebo Liu <kebol@nvidia.com>
Why I did it
Exclude the innovium build in upgrading version build, currently, the builds are always failed, exclude the build temporarily.
Increase the broadcom build timeout.
- Why I did it
Fastboot will delay all counters in CONFIG DB, it relies on enable_counters.py to recover the delayed counters. However, enable_counters.py does not recover those non-default counters.
- How I did it
For non-default counters, if it is in CONFIG DB, put delay status to false after the waiting.
- How to verify it
Manual test
c5c105f (HEAD -> 202111, origin/202111) [PBH] Implement Edit Flows (#2093)
7050581 [techsupport] Handle minor fixes of TS Lock and update auto-TS (#2114)
3602f99 Fix issues in clear_qos (#2122)
4f96d3b Fixing get port speed when oper status is down (#2123)
5bb99c7 Validate LAG has members before mirror session create (#2130)
ec6c8af [vxlan] Remove tunnel map objects on VNET tunnel removal (#2150)
7e7db19 [BFD]Registering BFD state change callback during session creation (#2202)
618fe07 [VNET]Fixing nexthop group delete during route change (#2198)
91b66df [portsorch]: Prevent LAG member configuration when port has active ACL binding (#2165)
29de9d0 Remove redundant and problematic code to skip "pool" field in buffer profile handling (#2197)
ded0b45 [PBH] Implement Edit Flows (#2169)
2ee0f49 [neighsyncd] increase neighsyncd timeout (#2209)
a0160c0 [QosOrch] The notifications cannot be drained in QosOrch in case the first one needs to retry (#2206)
Fix the generating version file failure issue caused by artifacts folder change.
When changing to use the same template for PR build, official build and packages version upgrade, the artifacts folder adding a "target" folder, the version upgrade task should be changed accordingly.
Why I did it
docker hub will limit the pull rate.
Use ACR instead to pull debian related docker image.
How I did it
Set DEFAULT_CONTAINER_REGISTRY in pipeline.
* [Marvell] Update armhf SAI deb version 1.9.1 (#9865)
Move marvell armhf SAI deb to 1.9.1 to address build failures.
Signed-off-by: Rajkumar Pennadam Ramamoorthy <rpennadamram@marvell.com>
* [Marvell] Update armhf driver/sai deb version (#10126)
Fixed Marvell SAI deb version naming issue reported in Marvell-switching/sonic-marvell-binaries#62
Signed-off-by: Rajkumar Pennadam Ramamoorthy <rpennadamram@marvell.com>
* [Build]: only install grpc in amd64 (#10212)
[Build]: only install grpc in amd64
Unblock marvell-armhf build.
Co-authored-by: Rajkumar-Marvell <54936542+rajkumar38@users.noreply.github.com>
Why I did it
Cherry-pick commits from master to 202111 to fix build broken issue.
See detail in the commits.
Why I did it
Fix host image debian package version issue.
The package dependencies may have issue, when some of debian packages of the base image are upgraded. For example, libc is installed in base image, but if the mirror has new version, when running "apt-get upgrade", the package will be upgraded unexpected. To avoid such issue, need to add the versions when building the host image.
How I did it
The package versions of host-image should contain host-base-image.
#### Why I did it
when adding and removing ports after init stage we saw two issues:
first:
In several cases, after removing a port, lldpmgr is continuing to try to add a port to lldp with lldpcli command. the execution of this command is continuing to fail since the port is not existing anymore.
second:
after adding a port, we sometimes see this warning messgae:
"Command failed 'lldpcli configure ports Ethernet18 lldp portidsubtype local etp5b': 2021-07-27T14:16:54 [WARN/lldpctl] cannot find port Ethernet18"
we added these changes in order to solve it.
#### How I did it
port create events are taken from app db only.
lldpcli command is executed only when linux port is up.
when delete port event is received we remove this command from pending_cmds dictionary
#### How to verify it
manual tests and running lldp tests
#### Description for the changelog
Dynamic port configuration - solve lldp issues when adding/removing ports
Why I did it
Kernel hang in during early boot is caused due overwriting of device tree with uncompressing kernel. Added the fdt_high which gives a safe offset from kernel location.
How I did it
Setting uboot environment variable fdt_high.
How to verify it
Successful boot of bullseye kernel on Marvell Armada 380/385.
Change-Id: I3e2521780f5ecdb3bdf6cbb6542250814ca11959
Signed-off-by: Pavan Naregundi <pnaregundi@marvell.com>
Why I did it
Removing incorrect check in plt setup for fw_env config: This check was added before to compare 2 different types of disk. Now the check is redundant and check is not required as transition is complete.
2)Removing legacy_volume_label in create_partition: legacy_volume_label is not used in armhf install files. With legacy_volume_label initialized to NULL, current code will always return true for check, if demo_part exits.
How I did it
Change is about removing the redundant/incorrect code explained above.
How to verify it
uboot fw_printenv and fw_setenv is tested
onie-nos-install has be verified.
Signed-off-by: Pavan Naregundi <pnaregundi@marvell.com>
For Bullseye, Python 2 isn't present at all. This means that in certain
build cases (such as building something only for Bullseye), the version
file may not exist, and so the sort command would fail.
For most normal build commands, this probably won't be an issue, because
the SONiC build will start with Buster (which has both Python 2 and
Python 3 wheels built), and so the py2 and py3 files will be present
even during the Bullseye builds.
Signed-off-by: Saikrishna Arcot <sarcot@microsoft.com>
Why I did it
[Build]: Enable marvell-armhf PR check
Improve the azp dependencies, make the Test stage only depended on BuildVS stage. The Test stage will be triggered once the BuildVS stage finished, reduce the waiting time.
- Why I did it
With the previous MFT 4.18.1-16 there is a bug in mstdump tool accessing wrong address. it is confirmed this issue does not exist in official 4.18.0-106.
- How I did it
Update the MFT version to 4.18.0-106
- How to verify it
Run regression on Mellanox platforms
Why I did it
support to collect version when purging debian package
Support to collect version multiple times
How I did it
Add the collection action before purging.
#### Why I did it
To fix https://github.com/Azure/sonic-buildimage/issues/9643
#### How I did it
Instead of ast.literal_eval added python2 compat code for json strings unicode -> str convertion.
We need python2 compatibility since py2 sonic config engine (buster/sonic_config_engine-1.0-py2-none-any.whl target) is still included into the build (ENABLE_PY2_MODULES flag is set for buster). Once we abandon buster and python2, this compat and ast.literal_eval could be cleaned up all through the code base.
#### How to verify it
run steps from the linked issue
Why I did it
ACL have ACCEPT action indeed, but yang doesn't support it.
How I did it
Add 'ACCEPT' enum to sonic-types.yang.j2
How to verify it
Run the YANG model unit tests
6562ad3 (HEAD -> 202111, origin/202111) [sfpshow][recycle_port] sfpshow script needs to skip recycle ports (#2109)
f184a61 Update `config mirror_session` CLI to support heximal gre type value (#2095)
03936ea (HEAD -> 202111, origin/202111) define index for recirc port (#118)
d48f750 [port_util] Fix issue: port_util.get_vlan_interface_oid_map should not raise exception when DB has not RIF data (#117)
- Why I did it
PDDF utils were python2 compliant and they needed to be migrated to Python3 (as per Bullseye)
PDDF common platform APIs file name changed as the name was already in use
Indentation issues
Dead/redundant code needed to be removed
- How I did it
Made files Python3 compliant
Indentation corrected
Redundant code removed
- How to verify it
AS7326 Accton platform uses PDDF. PDDF utils were run on this platform to verify.
cherry-pick of #9393 for 202111
- Use SfpOptoeBase by default to leverage new `sonic_xcvr` refactor
- Add support for `Woodleaf` product
- Move `libsfp-eeprom.so` to a different `.deb` package
- Add new logrotate configuration for arista logs
- Improve logging mechanism for the drivers (IO loglevel, fix syslog duplicates)
- Initialize chassis cards in parallel
- Refactor of `get_change_event` to fix interrupts treated as presence change
Why I did it
[Build]: Fix armhf mirrors not existing issue
The mirror endpoint debian-archive.trafficmanager.net does not support armhf, change to use deb.debian.org and security.debian.org.
Correct thrift.0.13.0 dependent package name.
In previous code, the buildout target was named as PYTHON3_THRIFT_0_13_0
But when add the prackage to LIBTHRIFT_0_13_0, it typo as PYTHON_THRIFT_0_13_0
Co-authored-by: Yang Wang<yangwang1@microsoft.com>
9968d60 (HEAD -> 202111, origin/202111) [sonic-package-manager] do not mod_config for whole config db when setting init_cfg (#2055)
4b3d53f [generate_dump] exclude mft and mlx folders from /etc (#2072)
51d92ae Validation check correction while adding a member to PortChannel (#2078)
6a43306 [techsupport] Added a lock to avoid running techsupport in parallel (#2065)
44cfdd9 Try get port operational speed from STATE DB (#2030)
45ea623 Fix sonic-installer failure due to missing import
- Why I did it
Fix issue: psu might use wrong voltage sysfs which causes invalid voltage value. The flow is like:
1. User power off a PSU
2. All sysfs files related to this PSU are removed
3. User did a reboot/config reload
4. PSU will use wrong sysfs as voltage node
- How I did it
Always try find an existing sysfs.
- How to verify it
Manual test