- Why I did it
Implement newly added reboot causes in PR Azure/sonic-platform-common#277
- How I did it
Map the reboot cause sysfs to the newly added reboot causes.
- How to verify it
Manual test: check whether the reboot cause is correct after rebooting the switch in various ways.
Run the community reboot test and confirm that the reboot cause checker passes.
Signed-off-by: Kebo Liu <kebol@nvidia.com>
9ac12bf (HEAD -> master, origin/master, origin/HEAD) Fix platform daemon chassisd to handle auto restart on fail (#247)
24fba04 [ycable] fix the logic to update cable_info values when ycable is not present; fix read side logic for ycable (#249)
v0.7.5 includes bug fixes for gearbox port and macsec counter support. It also includes an owl firmware update, owl.lz4.fw.1.94.0.bin.
How I did it
Update credo sai url for v0.7.5
Update gearbox_config.json with using firmware owl.lz4.fw.1.94.0.bin instead of owl.lz4.fw.1.92.1.bin
How to verify it
Test gearbox port and macsec counter successfully on A7280.
Updating sonic-utilities sub module with the following commits
f09bd31 Fix UT failed cause by change pycommon to use swsscommon
c092300 Increased pcied unit test coverage to > 80%
7d7c85e Modular chassis: Psud set master led on first run
7195dcc Remove py2 from pipeline
c2e7393 [ycabled] increase UT coverage of ycabled daemon
#### Why I did it
After pycommon was changed to use swsscommon, unit tests failed in sonic-platform-daemon, so the submodule needs to be updated to pick up the UT fix.
#### How I did it
#### How to verify it
#### Which release branch to backport (provide reason below if selected)
#### Description for the changelog
Fix UT failed cause by change pycommon to use swsscommon
Increased pcied unit test coverage to > 80%
Modular chassis: Psud set master led on first run
Remove py2 from pipeline
[ycabled] increase UT coverage of ycabled daemon
Why I did it
To support address sanitizer for Mellanox syncd
How I did it
/var/log/asan is mapped for syncd container (the same as for swss)
container stop() has a timeout (60s) for syncd (the same as for swss)
This is so libasan has enough time to generate a report.
added ASAN's log path to Mellanox syncd supervisord.conf
added "asan: yes" to sonic_version.yml
How to verify it
Added artificial memory leaks
Compiled with ENABLE_ASAN=y
Installed the image on DUT
Rebooted the DUT
Verified that /var/log/asan/syncd-asan.log contains the leaks
Signed-off-by: Yakiv Huryk <yhuryk@nvidia.com>
- Why I did it
There is a hardware bug that PSU voltage threshold sysfs returns incorrect value. The workaround is to call "sensor -s" to refresh it.
- How I did it
Call "sensor -s" when the threshold value is incorrect and the PSU is "DELTA 1100"
- How to verify it
Unit test and manual test
Why I did it
Add libgmock-dev to the package list required by linkmgrd unittests.
Required by PR: Azure/sonic-linkmgrd#45
How I did it
Add the package to the package list.
How to verify it
Build docker-mux with KEEP_SLAVE_ON=yes and verify libgmock-dev is present.
Signed-off-by: Longxiang Lyu <lolv@microsoft.com>
This PR includes the changes needed to generate BUFFER_QUEUE values correctly in the DB. The changes are based on schema.md.
Why I did it
Change format of generating BUFFER_QUEUE in DB according to schema.md and yang-model.
Old format:
"BUFFER_QUEUE": {
"Ethernet0,Ethernet100,Ethernet104,Ethernet108,Ethernet112,Ethernet116,Ethernet12,Ethernet120,Ethernet124,Ethernet16,Ethernet20,Ethernet24,Ethernet28,Ethernet32,Ethernet36,Ethernet4,Ethernet40,Ethernet44,Ethernet48,Ethernet52,Ethernet56,Ethernet60,Ethernet64,Ethernet68,Ethernet72,Ethernet76,Ethernet8,Ethernet80,Ethernet84,Ethernet88,Ethernet92,Ethernet96|queue": {
"profile": "profile"
},
"Ethernet0,Ethernet100,Ethernet104,Ethernet108,Ethernet112,Ethernet116,Ethernet12,Ethernet120,Ethernet124,Ethernet16,Ethernet20,Ethernet24,Ethernet28,Ethernet32,Ethernet36,Ethernet4,Ethernet40,Ethernet44,Ethernet48,Ethernet52,Ethernet56,Ethernet60,Ethernet64,Ethernet68,Ethernet72,Ethernet76,Ethernet8,Ethernet80,Ethernet84,Ethernet88,Ethernet92,Ethernet96|queue": {
"profile": "profile"
}
},
New format:
"BUFFER_QUEUE": {
"Ethernet0|queue": {
"profile": "profile"
},
"Ethernet0|queue": {
"profile": "profile"
},
"Ethernet4|queue": {
"profile": "profile"
},
"Ethernet4|queue": {
"profile": "profile"
},
"Ethernet8|queue": {
"profile": "profile"
},
"Ethernet8|queue": {
"profile": "profile"
},
...
}
How I did it
Updated structure of buffers_defaults jinja templates.
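The key change from the old format to the new one can be sketched in Python. This is only an illustration of the transformation: the actual change lives in the jinja templates, and the function name `split_port_keys` is hypothetical. The `queue` and `profile` values are the same placeholders used in the examples above.

```python
def split_port_keys(table):
    """Expand 'PortA,PortB|queue' keys into one entry per port."""
    expanded = {}
    for key, value in table.items():
        # Everything before the first '|' is the comma-joined port list.
        ports, _, suffix = key.partition("|")
        for port in ports.split(","):
            # Copy the value so entries do not share one dict object.
            expanded[f"{port}|{suffix}"] = dict(value)
    return expanded


old_format = {"Ethernet0,Ethernet4,Ethernet8|queue": {"profile": "profile"}}
new_format = split_port_keys(old_format)
print(sorted(new_format))
```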
Signed-off-by: Oleksandr Kozodoi <oleksandrx.kozodoi@intel.com>
Why I did it
[Build]: Fix pip version constraint conflict issue
When a version is pinned in the constraint file and the build script upgrades the same package to a different version, pip reports a conflict.
How I did it
If a version is specified on the pip command line, the version constraint for that package is skipped.
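A minimal sketch of that idea, assuming the constraints are available as a list of lines and that command-line pins use the `name==version` form. The helper names and the example package versions are hypothetical and are not taken from the build scripts.

```python
import re


def pinned_names(pip_args):
    """Return names of packages given as 'name==version' on the command line."""
    names = set()
    for arg in pip_args:
        match = re.match(r"^([A-Za-z0-9._-]+)==", arg)
        if match:
            names.add(match.group(1).lower())
    return names


def filter_constraints(constraint_lines, pip_args):
    """Drop constraint lines for packages already pinned on the command line."""
    skip = pinned_names(pip_args)
    kept = []
    for line in constraint_lines:
        # The package name is everything before the first version operator.
        name = re.split(r"[=<>!~ ]", line.strip(), maxsplit=1)[0].lower()
        if name not in skip:
            kept.append(line)
    return kept


# Hypothetical inputs for illustration only.
constraints = ["scapy==2.4.4", "six==1.16.0"]
args = ["install", "scapy==2.4.5"]
result = filter_constraints(constraints, args)
print(result)
```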
PR #10507 upgraded scapy to 2.4.5 in the docker-ptf container. After this upgrade, all scripts under ansible/roles/test/files/ptftests import scapy 2.4.5, and some test cases fail because they have not been updated accordingly.
Reverts #10507 to avoid breaking regression test.
This reverts commit 92efc01270.
If it is run during image install, it's not guaranteed that the
installation environment will have tune2fs available. Therefore, run it
during initramfs instead.
Signed-off-by: Saikrishna Arcot <sarcot@microsoft.com>
* [device config] Adding configuration for default route fallback
* Set sai_tunnel_underlay_route_mode attribute to fallback to default route if more specific route is unavailable.
Why I did it
The config DB generated from a minigraph cannot pass yang validation when the PORT table does not have the 'lanes' and 'speed' fields.
How I did it
Make the cfggen command fail when 'lanes' and 'speed' are not provided
How to verify it
Run 'sonic-cfggen -m xxx.xml --print-data' and make sure the command fails when 'lanes' and 'speed' are not in the PORT table
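The kind of check described here can be sketched as follows. This is an illustrative stand-alone function, not the sonic-cfggen code: `check_port_table` and the example port values are hypothetical.

```python
REQUIRED_PORT_FIELDS = ("lanes", "speed")


def check_port_table(port_table):
    """Raise ValueError if any port entry lacks a required field."""
    for port, attrs in port_table.items():
        missing = [field for field in REQUIRED_PORT_FIELDS if field not in attrs]
        if missing:
            raise ValueError(f"{port}: missing {', '.join(missing)}")


# A complete entry passes silently (example values are hypothetical).
check_port_table({"Ethernet0": {"lanes": "0,1,2,3", "speed": "100000"}})
```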
Removed python2 support for sonic-platform-daemons that was causing unit
test errors in sonic_pcied.
* Removed config from docker supervisord jinja templates per VD review comment
* Removed space and python3 per QL comments
Why I did it
Prevent the i2c bus from getting locked.
How I did it
Add sysfs driver to access ioport.
Command to reset i2c mux:
echo 1 > /sys/devices/platform/as9716_32d_ioport/i2c_mux_rst
Command to bring i2c mux out of reset:
echo 0 > /sys/devices/platform/as9716_32d_ioport/i2c_mux_rst
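The two commands above can be combined into a single reset pulse. The sketch below assumes the sysfs node accepts the strings "1" and "0" as shown above; the function name and the 0.1 second hold time are assumptions, not values from the driver.

```python
import time

I2C_MUX_RST = "/sys/devices/platform/as9716_32d_ioport/i2c_mux_rst"


def pulse_i2c_mux_reset(path=I2C_MUX_RST, hold_seconds=0.1):
    """Assert the i2c mux reset, wait briefly, then release it."""
    with open(path, "w") as node:
        node.write("1")  # put the mux into reset
    time.sleep(hold_seconds)  # hold time is an assumed value
    with open(path, "w") as node:
        node.write("0")  # bring the mux out of reset
```

Calling `pulse_i2c_mux_reset()` on the target platform requires permission to write to the sysfs node.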
Signed-off-by: Brandon Chuang <brandon_chuang@edge-core.com>
Why I did it
In the bringup of tomahawk4/trident4, we realized that such chips need a larger size of /dev/shm in syncd container, so we added the option --shm-size to the docker create for syncd. The default value for shm-size is 64m; after this change, people can add SYNCD_SHM_SIZE=128m to platform_env.conf to change it to 128m.
How to verify it
We verified that after this change, 1) on existing platforms without platform_env.conf, the size of /dev/shm in syncd container (df -h | grep shm) is still the default 64M; 2) after we add SYNCD_SHM_SIZE=128m to platform_env.conf, /dev/shm in syncd becomes 128M.
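The lookup with a default can be sketched like this. It assumes platform_env.conf holds simple KEY=VALUE lines; the helper names and the comment handling are assumptions, and the real logic lives in the syncd container startup script rather than in Python.

```python
def read_platform_env(text):
    """Parse KEY=VALUE lines, ignoring blanks and '#' comments (assumed format)."""
    env = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        env[key.strip()] = value.strip()
    return env


def syncd_shm_args(platform_env_text, default="64m"):
    """Build the --shm-size argument, falling back to the 64m default."""
    size = read_platform_env(platform_env_text).get("SYNCD_SHM_SIZE", default)
    return ["--shm-size=" + size]


print(syncd_shm_args(""))
print(syncd_shm_args("SYNCD_SHM_SIZE=128m\n"))
```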
Why I did it
For trident4/tomahawk4, linux_ngknet.ko and linux_ngknetcb.ko have to be installed. Also, the kernel modules to load on such chips are different from the existing ones, so we add an option is_ltsw_chip to determine which kernel modules to load. The option is enabled by adding 'is_ltsw_chip=1' to platform_env.conf and is off otherwise.
How to verify it
We verified that existing platforms still work after this change; and for platforms with trident4/tomahawk4, we can load the different kernel modules as expected after adding 'is_ltsw_chip=1' to platform_env.conf
Why I did it
Existing dataplane tests cannot run in a MACsec environment because traffic on a MACsec link is encrypted. So, I will override the dp_poll of ptf with a MACsec dp_poll that decrypts the MACsec packets on injected ports (PR: Azure/sonic-mgmt#5490). The MACsec decryption library depends on scapy 2.4.5.
How I did it
Upgrade scapy library to 2.4.5 by pip.
How to verify it
Check the scapy version in docker-ptf by
python -c "import scapy; print(scapy.__version__)"
2.4.5
Signed-off-by: Ze Gan <ganze718@gmail.com>
b67d479 Fixed the sfp refactor issue
827c5a6 Added nokia_cmd command nokia_common grpc support for power down/up SFM module
aeb7f56 Added the nokia cli commands for midplane
c57d083 Fix the get_my_module issue and the thermal_infos exception issue.
0536293 Change the output of "show chassis module status"
63212d7 Enhance the help display for nokia_cmd command
e8d2599 Fix the sonic_install_ndk_service script issue
d52bdcf Add command nokia_cmd show sfm-eeprom support
Signed-off-by: mlok <marty.lok@nokia.com>
Why I did it
The minigraph parser has introduced a new type.
How I did it
Update yang models to support BmcMgmtToRRouter.
How to verify it
Run unit test for sonic-yang-models
Signed-off-by: Gang Lv <ganglv@microsoft.com>
#### Why I did it
As of https://github.com/Azure/sonic-swss-common/pull/587 the blackout issue in ConfigDBConnector has been resolved.
In the past hostcfgd was refactored to use SubscriberStateTable instead of ConfigDBConnector for subscribing to CONFIG_DB updates due to a "blackout" period between hostcfgd pulling the table data down and running the initialization and actually calling `listen()` on ConfigDBConnector which starts the update handler.
However, SubscriberStateTable creates many file descriptors against the redis DB, which is inefficient compared to ConfigDBConnector, which only opens a single file descriptor.
With the new fix to ConfigDBConnector I refactored hostcfgd to take advantage of these updates.
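The blackout problem can be illustrated with a toy model: if updates that arrive during the initial load are queued and replayed afterwards, none are lost. This is a sketch of the general idea only, not the ConfigDBConnector implementation; the class, its methods, and the `FEATURE|telemetry` key are hypothetical.

```python
import queue


class BufferedSubscriber:
    """Toy model: updates arriving during the initial load are queued, not lost."""

    def __init__(self):
        self._pending = queue.Queue()
        self.state = {}

    def on_update(self, key, value):
        # Called as soon as the (hypothetical) subscription is active.
        self._pending.put((key, value))

    def load(self, snapshot):
        # Initial load of the table contents.
        self.state.update(snapshot)

    def drain(self):
        # Apply the updates that arrived while load() was running.
        while not self._pending.empty():
            key, value = self._pending.get()
            self.state[key] = value


sub = BufferedSubscriber()
sub.on_update("FEATURE|telemetry", "enabled")  # update arrives mid-load
sub.load({"FEATURE|telemetry": "disabled"})    # snapshot taken earlier
sub.drain()
print(sub.state)
```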
#### How I did it
Replaced SubscriberStateTable with ConfigDBConnector
#### How to verify it
The functionality of hostcfgd can be verified by booting the switch and verifying that NTP is properly configured.
To check the blackout period you can add a delay in the hostcfgd `load()` function and also add a print statement before and after the load so you know when it occurs. Then restart hostcfgd and wait for the load to start, then during the load push a partial change to the FEATURE table and verify that the change is picked up and the feature is enabled after the load period finishes.
#### Description for the changelog
[hostcfgd] Move hostcfgd back to ConfigDBConnector for subscribing to updates
Why I did it
Exclude the innovium build from the version upgrade build. These builds currently always fail, so exclude them temporarily.
Increase the broadcom build timeout.
Why I did it
Running warm-reboot in a loop 500 times leads to this error on the 318th iteration:
Apr 2 15:56:27.346747 sonic INFO swss#/supervisord: restore_neighbors Traceback (most recent call last):
Apr 2 15:56:27.346747 sonic INFO swss#/supervisord: restore_neighbors File "/usr/bin/restore_neighbors.py", line 24, in <module>
Apr 2 15:56:27.346747 sonic INFO swss#/supervisord: restore_neighbors from scapy.all import conf, in6_getnsma, inet_pton, inet_ntop, in6_getnsmac, get_if_hwaddr, Ether, ARP, IPv6, ICMPv6ND_NS, ICMPv6NDOptSrcLLAddr
Apr 2 15:56:27.346795 sonic INFO swss#/supervisord: restore_neighbors File "/usr/local/lib/python3.7/dist-packages/scapy/all.py", line 25, in <module>
Apr 2 15:56:27.346956 sonic INFO swss#/supervisord: restore_neighbors from scapy.route import *
Apr 2 15:56:27.346995 sonic INFO swss#/supervisord: restore_neighbors File "/usr/local/lib/python3.7/dist-packages/scapy/route.py", line 205, in <module>
Apr 2 15:56:27.347089 sonic INFO swss#/supervisord: restore_neighbors conf.iface = get_working_if()
Apr 2 15:56:27.347129 sonic INFO swss#/supervisord: restore_neighbors File "/usr/local/lib/python3.7/dist-packages/scapy/arch/linux.py", line 128, in get_working_if
Apr 2 15:56:27.347213 sonic INFO swss#/supervisord: restore_neighbors ifflags = struct.unpack("16xH14x", get_if(i, SIOCGIFFLAGS))[0]
Apr 2 15:56:27.347250 sonic INFO swss#/supervisord: restore_neighbors File "/usr/local/lib/python3.7/dist-packages/scapy/arch/common.py", line 31, in get_if
Apr 2 15:56:27.347345 sonic INFO swss#/supervisord: restore_neighbors return ioctl(sck, cmd, struct.pack("16s16x", iff.encode("utf8")))
Apr 2 15:56:27.347365 sonic INFO swss#/supervisord: restore_neighbors OSError: [Errno 19] No such device
The issue was reported to the scapy developers as secdev/scapy#3369 and fixed in secdev/scapy#3371. No released scapy version includes this fix yet, so scapy v2.4.5 is built from source with the fix applied as a patch.
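The traceback shows an interface lookup failing with ENODEV (errno 19) partway through a scan of interfaces. A general way to tolerate that is to skip interfaces that cannot be queried. The sketch below shows that pattern only and is not the scapy patch; `first_working_interface` and the injected `get_flags` callable are hypothetical.

```python
import errno

IFF_UP = 0x1  # Linux interface flag: interface is up


def first_working_interface(names, get_flags):
    """Return the first interface that is up, skipping ones that have vanished."""
    for name in names:
        try:
            flags = get_flags(name)
        except OSError as exc:
            if exc.errno == errno.ENODEV:
                continue  # interface disappeared between listing and query
            raise
        if flags & IFF_UP:
            return name
    return None


def fake_get_flags(name):
    # Stand-in for an ioctl-based lookup; 'gone0' simulates a removed device.
    if name == "gone0":
        raise OSError(errno.ENODEV, "No such device")
    return IFF_UP if name == "eth0" else 0


print(first_working_interface(["gone0", "lo", "eth0"], fake_get_flags))
```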
Signed-off-by: Stepan Blyschak <stepanb@nvidia.com>
In order to include the following commit:
0f06910 [PBH] Implement Edit Flows (Azure/sonic-swss#2169)
sonic-swss
50d5be2 Make changes to support compiling on Bullseye with GCC 10 (#2216)
0870cf5 [mirrororch]: Implement HW resources availability validation for SPAN/ERSPAN (#2187)
f4ec565 [vlanmgrd] fix use-after-free memory issue (#2211)
c2de7fc [QosOrch] The notifications cannot be drained in QosOrch in case the first one needs to retry (#2206)
5575935 [neighsyncd] increase neighsyncd timeout (#2209)
0f06910 [PBH] Implement Edit Flows (#2169)
6241bbf Remove redundant and problematic code to skip "pool" field in buffer profile handling (#2197)
a55343c [azp]: Set diff coverage threshhold to 80% (#2188)
390cae1 [portsorch]: Prevent LAG member configuration when port has active ACL binding (#2165)
c1d47e6 [VNET]Fixing nexthop group delete during route change (#2198)
8941cc0 [BFD]Registering BFD state change callback during session creation (#2202)
680c539 [vxlan] Remove tunnel map objects on VNET tunnel removal (#2150)
20dde0c Fix for handling broadcom DNX ASIC to have ipv4 and ipv6 ACL rules in separate tables. (#2178)
5b7c949 [FdbOrch] SAI_FDB_EVENT_MOVE generates update with empty update.entry.port_name (#2200)
7350d49 [Vxlanmgr] vnet netdev cleanup during config reload fix (#2191)
2bef62b Validate LAG has members before mirror session create (#2130)
1e4d4ce [VS test] Increase VS test time, skip dpb flaky test (#2195)
6eda965 [vstest]Migrating vs tests from using click commands to direct DB access (#2179)
Signed-off-by: Nazarii Hnydyn <nazariig@nvidia.com>
Why I did it
Yang validation needs to run in the sonic-cfggen unit tests, and many unit tests do not provide a speed for the PORT table.
How I did it
Update minigraph xml.
How to verify it
Run sonic-cfggen unit test.
Signed-off-by: Gang Lv <ganglv@microsoft.com>
Why I did it
Fix #9746
How I did it
Split the check into two separate conditions: the value does not exist, and the value has zero length.
How to verify it
Run verification script when table contains empty value
890f32f LLDPLocalSystemDataUpdater Exception Log Handled (#249)
2151731 Handle error seen on system where vlan interface map is not present (#246)
c6141c7 [build] use Azure.sonic-buildimage.official.vs pipeline as artifact source (#248)
Signed-off-by: Vivek Reddy Karri <vkarri@nvidia.com>
- Why I did it
Fixes #9628
During bootup, this error log is seen
Dec 22 04:26:29 sonic interfaces-config.sh[2546]: error: main exception: cannot find interfaces: eth0 (interface was probably never up ?)
This is of non-functional nature and doesn't affect the flow.
- How I did it
Don't run ifdown when it is not needed
- How to verify it
Verified during reboot. Log did not appear and IP was acquired on eth0 as expected
Signed-off-by: Vivek Reddy Karri <vkarri@nvidia.com>
Why I did it
The valid ASN range is 1 to 4294967295, so invalid ASNs need to be removed.
How I did it
Update unit test and replace ASN 0.
How to verify it
Run unit test for sonic-config-engine.
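The range rule stated above can be expressed as a small check. This is a stand-alone illustration, not code from sonic-config-engine; the function name is hypothetical.

```python
ASN_MIN = 1
ASN_MAX = 4294967295  # 2**32 - 1


def is_valid_asn(asn):
    """Return True if asn parses as an integer within the valid range."""
    try:
        value = int(asn)
    except (TypeError, ValueError):
        return False
    return ASN_MIN <= value <= ASN_MAX


print(is_valid_asn(0), is_valid_asn("65100"))
```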
Signed-off-by: Gang Lv <ganglv@microsoft.com>