Why I did it
Fix apt-get remove/purge version not locked issue when the apt-get options not specified.
How I did it
Add a space character before and after the command line parameters.
Why I did it
For the change to support gearbox taps by gearbox_config.json (sonic-net/sonic-swss#2158), I need to add tests to sonic-swss/tests/test_gearbox.py to satisfy the test coverage of the change. The existing code in test_gearbox.py has already used brcm_gearbox_vs and here is to add some gearbox tap value to its gearbox_config.json, for the added tests in sonic-swss/tests/test_gearbox.py.
How I did it
How to verify it
This change itself will not affect existing code.
Signed-off-by: maipbui <maibui@microsoft.com>
#### Why I did it
`os` - not secure against maliciously constructed input and dangerous if used to evaluate dynamic content.
#### How I did it
`os` - use with `subprocess`
#### How to verify it
ac71d745d [VxLAN]Fix Vxlan delete command to throw error when there are references (#2404)
7419c6731 Added cisco config platform commands (#2242)
8760bbe80 Add UT to check sonic installer does not depend on database (#2401)
6bef65260 [doc] add documentation on automatic techsupport based on memory (#2411)
4a783745f [doc] update "config feature" section with "--block" option (#2409)
dd6210fcc [Vxlanmgrd] [CPA] Update the vxlan_tunnel name len to be under IFNAMIZ to overcome netdev creation failure (#2398)
bdc4a8a60 Fix broken pipeline build URL (#2363)
b31681b43 Fix display disorder problem of show vrf (#2392)
123504a85 YANG validation for ConfigDB Updates: portchannel add/remove, loopback interface, VLAN
28f6820c6 [link-local]Modify RIF check to include link-local enabled interfaces (#2394)
Signed-off-by: Neetha John nejo@microsoft.com
Why I did it
slb and bgp mon peers are not needed for storage backend. These neighbor are present in the minigraph.
How I did it
After minigraph parsing, remove these neighbors if it is a storage backend device
How to verify it
Unit tests
Verified on the device that once these tables are removed, these peers don't show up in "show runningconfig bgp" output
Signed-off-by: maipbui <maibui@microsoft.com>
#### Why I did it
`subprocess.Popen()` and `subprocess.run()` is used with `shell=True`, which is very dangerous for shell injection.
`os` - not secure against maliciously constructed input and dangerous if used to evaluate dynamic content
#### How I did it
Replace `os` by `subprocess`, remove `shell=True`
Remove unused functions
Include:
df92fb72 Improve verbosity level and provide more info in the log (#2472)
e81ed20b [intfmgr]: Enable `accept_untracked_na` kernel param (#2436)
24d29f18 [orchdaemon]: Fixed sairedis record file rotation (#2299)
b8ee07d7 [build] add missing package libyang-dev in lgtm.yml (#2475)
e46dd294 [crm] Fix issue with continues EXCEEDED and CLEAR logs for ACL group/table counters (#2463)
b61d24cd [doc]: Update README.md (#2456)
b9ade5d2 [orchagent] Fix issue: ip prefix shall be inited even if VRF/VNET is not ready (#2461)
f0f1eb47 Revert "[counters] Improve performance by polling only configured ports buffer queue/pg counters (#2360)" (#2458)
3d757a83 [ci][asan] add DVS tests run with ASAN (#2441)
04fbc8e3 [ci] Only when test stage succeeded or succeededwithissues, PR run Gcov (#2460)
7cc035f9 [orchagent]: Publish identified events via structured-events channel (#2446)
efa0f01d [QoS] Enforce drop probability only for colors whose WRED are enabled (#2422)
05c5c2f6 [swss] Replace memset functions (#2423)
9ff993db Modified the test file to remove click commands and do the REDIS-DB u… (#2264)
9e376af3 Install libyang in azure pipeline. (#2445)
c1eb99a7 check state_db for po before sending ARP/ND pkts (#2444)
43cc4869 [portmgr] Fixed the orchagent crash due to late arrival of notif (#2431)
b62c7162 Enhance orchagent and buffer manager in error handling (#2414)
13bda3c6 [Everflow/ERSPAN] Set correct destination port and mac address when the nexthop is updated for ERSPAN mirror destination (#2392)
0ccb315c Revert "[VS Test] Skip failing subport tests (#2370)" (#2421)
ac8a83f0 [UT] [Portsyncd] Added Unit Tests for portsyncd (#2297)
83a186a9 Change the log messages in addKernelNeigh/Route from ERROR to INFO (#2437)
9c23389b [BFD]Clean up state_db BFD entries on swss restart (#2434)
d41aebfd EntityBulker SIGSEGV when create_entry attr_count 0 (#2224)
f52a7b1c Fix the Fec Mode Setting of gbsyncd (#2430)
8cc0a451 [neighsyncd] Enabling ipv4 link local entries for non-dualtor (#2427)
5624e875 Revert "[ci][asan] add DVS tests run with ASAN (#2363)" (#2433)
a26b26ac Dynamic port configuration - add port buffer cfg to the port ref counter (#2194)
486939a9 tlm_teamd: Filter portchannel subinterface events from STATE_DB LAG_TABLE (#2408)
a4b89925 [counters] Improve performance by polling only configured ports buffer queue/pg counters (#2360)
4aaeec91 added support for Xsight platform (#2426)
ca9edcad [ci][asan] add DVS tests run with ASAN (#2363)
dec4570c Handle dual ToR neighbor miss scenario (#2151)
9eb44220 Upstream new development on p4orch (#2237)
e9be2c0e [lgtm] Fix dependency (#2419)
c0168f35 [muxorch] Returning true if nbr in skip_neighbor_ in isNeighborActive() (#2415)
cfcf3d87 [macsec]: Set MTU for MACsec (#2398)
8346034b Delete Invalid if condition in intfsorch.cpp (#2411)
Signed-off-by: Lawrence Lee <lawlee@microsoft.com>
Signed-off-by: maipbui <maibui@microsoft.com>
Dependency: [PR (#12065)](https://github.com/sonic-net/sonic-buildimage/pull/12065) needs to merge first.
#### Why I did it
`subprocess.Popen()` and `subprocess.check_output()` is used with `shell=True`, which is very dangerous for shell injection.
#### How I did it
Disable `shell=True`, enable `shell=False`
#### How to verify it
Tested on DUT, compare and verify the output between the original behavior and the new changes' behavior.
[testresults.zip](https://github.com/sonic-net/sonic-buildimage/files/9550867/testresults.zip)
There's an odd crash that intermittently happens after the teamd container
exits, and a signal is raised to the main thread to exit. This thread (watching
teamd) continues execution because it's in a `while True`. The subsequent wait
call on the teamd container very likely returns immediately, and it calls
`is_warm_restart_enabled` and `is_fast_reboot_enabled`. In either of these
cases, sometimes, there is a crash in the transition from C code to Python code
(after the function gets executed). Python sees that this thread got a signal
to exit, because the main thread is exiting, and tells pthread to exit the
thread. However, during the stack unwinding, _something_ is telling the
unwinder to call `std::terminate`. The reason is unknown.
This then results in a python3 SIGABRT, and systemd then doesn't call the stop
script to actually stop the container (possibly because the main process exited
with a SIGABRT, so it's a hard crash). This means that the container doesn't
actually get stopped or restarted, resulting in an inconsistent state
afterwards.
The workaround appears to be that if we know the main thread needs to exit,
just return here, and don't continue execution. This at least tries to avoid it
from getting into the problematic code path. However, it's still feasible to
get a SIGABRT, depending on thread/process timings (i.e. teamd exits, signals
the main thread to exit, and then syncd exits, and syncd calls one of the two C
functions, potentially hitting the issue).
Signed-off-by: Saikrishna Arcot <sarcot@microsoft.com>
Signed-off-by: Saikrishna Arcot <sarcot@microsoft.com>
Why I did it
When sending a PR only CI change, as expected, the target target/python-wheels/buster/sonic_config_engine-1.0-py2-none-any.whl should be from the cache, because the depended files were not changed, but it rebuilt.
How I did it
Sort the files by name.
- The Makefile.work becomes complex and it is very difficult to manage the changes across branches.
- Restructured the Makefile.work and it becomes more readable.
- Added $(QUIET) option to turn on command echo mode through command line option.
- Exported the SONIC_BUILD_VARS variable, through which make options can be set dynamically.
Eg: make SONIC_BUILD_VARS='INCLUDE_NAT=y'
Why I did it
Add components data for sonic-mgmt testing
How I did it
Update platform.json and add platform_components.json
How to verify it
Ran sonic-mgmt tests (test_chassis and test_component)
0a7557bd9 [minigraph] add option to specify golden path in load_minigraph (#2350)
322aefc37 [GCU]Remove GCU unique lane check for duplicate lanes platforms (#2343)
7099fffa7 [fastboot] fastboot enhancement: Use warm-boot infrastructure for fast-boot (#2286)
09026edbb [warm-reboot] fix warm-reboot when /tmp/cache is missing (#2367)
a3c404c74 Fix typo in platform_sfputil_helper.is_rj45_port (#2374)
637d834ce Vnet_route_check Vxlan tunnel route update. (#2281)
29a3e5180 Added support for tunnel route status in show vnet routes all. (#2341)
1ac584bb3 Use 'default' VRF when VRF name is not provided (#2368)
4d377a620 [subinterface]Added additional checks in portchannel and subinterface commands (#2345)
bbcdf2ed7 disk_check: Publish event for RO state (#2320)
3fd537b0a Support the bandit check by GitHub Action (#2358)
491d3d380 [generate dump]Added error message when saisdkdump fails (#2356)
6830e01ec [counterpoll]Fixing counterpoll show for tunnel and acl stats (#2355)
3be2ad7de [fast-reboot]Avoid stopping masked services during fast-reboot (#2335)
0e1b0cf20 [GCU] Fix missing backend in dry run (#2347)
676c31bd0 Add verification for override (#2305)
48997c266 Add Password Hardening CLI support (#2338)
414e239ea update unit tests for swap allocator
a91a4922f consider swap checking memory in installer
f0ce58635 [route_check]: Ignore standalone tunnel routes (#2325)
Why I did it
Fix PR merge failed because 'vstest' step does not install libyang.
How I did it
Install libyang in azure pipeline.
How to verify it
Pass vstest step.
- Why I did it
To update MFT package to the latest version.
- How I did it
Updated MFT_VERSION & MFT_REVISION in platform/mellanox/mft.mk.
- How to verify it
Build an image and deploy to the switch
Check MFT version by dpkg -l | grep mft
Verify that all the SONiC services up and running
Run regression testing using tests from sonic-mgmt
Signed-off-by: Volodymyr Samotiy <volodymyrs@nvidia.com>
- Why I did it
To include latest fixes and new functionality
SAI fixes and new features
fix#3205239, incorrect object type returned for SG child list
Fix VRF-VNI map entries remove issue
ECC health event and logging
[Port Buffers] restore default queue and pg configuration when all user pools are deleted
Fix EVPN type3 error on removal of uc/bc flood group
Fix EVPN type2 MAC move from local to remote results in SAI failure
Fix Disable learning on VXLAN tunnel
Fix error on VXLAN v6 tunnel removal
Fix port cannot apply schedule group when it is a lag member
Fix BFD add more detailed message on BFD packet not related to any existing session
gcc10 compilation fixes
Disable learning on VXLAN tunnel
Support BFD remote-disc exchange in negotiation stage
Tunnel Loopback packet action attribute implementation (for Dual TOR)
Add KVD resources MIN/MAX functionality (pending CRM issue with MIN only)
Support for CRC2 hash algorithm
Bulk counter support for PGs, queues
Support mirror sample rate attribute (SPC2+)
[Functional] [QoS] | Unable to remove SCHEDULE profile table even if there is no object referencing it
Next hop group optimized bulk API
Reduce verbosity of shared database already exists print
Span mirror policer (SPC2+), optimize pipeline for acl mirror action with policer on SPC2+
use same size descriptor pool for rx/tx
fix bfd - notify Sonic for admin-down event
2201 - empty list for supported fec for RJ45 ports
Fix don't disable used tunnel underlay interfaces
SDK fixes
100GbE FCI DAC (10137628-4050LF/HPE PN: 845408-B21) was recognized by mistake as supporting "cable burning' which caused the switch firmware to read page 0x9f (which unsupported in the cable) and to report this cable as having "bad eeprom".
Added remote peer UDP port information in BFD packet event.
After editing an ECMP, the resilient ECMP next-hop counter may not count correctly.
Fixed potential memory leaks in some APIs related to LPM
If TTL_CMD_COPY is used in Encap direction for a packet with no TTL, then the value passed in the ttl data structure will be used if non-zero (default 255 if zero).
In SN2201: When configuring Force mode, user should configure Speed and FEC on both sides
In Flex Tunnel encapsulation flow, if the encapsulation is with an IPv6 header, the flow label field may not be updated as expected.
In some cases, when changing speed to 400GbE over 8 lanes, the first few packets would be dropped.
In some traffic patterns involving small packets, the PortRcvErrors counter may mistakenly count events of local physical errors due to an internal flow in the hardware that involves link packets.
On Spectrum systems, sometimes during link failure, not all previous firmware indications cleared properly, potentially affecting the next link up attempt.
On the NVIDIA Spectrum-2 switch, when receiving a packet with Symbol Errors on ports that are configured to cut-thought mode, a pipeline might get stuck.
PCI calibration changes from a static to a dynamic mechanism.
SDK debug dump shows "Unknown" Counter in RFC3635 Counter Group.
SDK debug dump shows "Unknown" Counter in the PPCNT Traffic Class Counter Group.
SDK Dump missing column headers in some GC tables may result in difficulty understanding the dump.
SLL configuration is missing in SDK dump.
Spectrum-2 systems, do no support 1GbE on supported 40GbE modules.
When binding a UDP port which is already in use for BFD TX session, the error message appears incorrectly.
When Flex Tunnel was used, Flex Modifier sometimes experienced a brief mis-configuration during ISSU.
When many ports are active (e.g. 70 ports up), and the configuration of shared buffer is applied on the fly, occasionally, the firmware might get stuck.
When running 1GbE speeds on SN4600 system, the port remained active while peer side was closed.
When toggling many ports of the Spectrum devices while raising 10GbE link up and link maintenance is enabled, the switch may get stuck and may need to be rebooted.
When trying to reconfigure the Flex Parser header and Flex transition parameters after ISSU, the switch will returned an error even if the configuration was identical to that done before performing the ISSU.
While toggling the cable, and the low power mode is set to ON, an unexpected PMPE event error is received.
- How I did it
Updated SDK/SAI submodule and relevant makefiles with the required versions.
- How to verify it
Build an image and run tests from "sonic-mgmt".
Signed-off-by: Volodymyr Samotiy <volodymyrs@nvidia.com>
- Why I did it
get_rx_los and get_tx_fault is not supported via the exisitng interface used, need provide dummy implementation for them.
NOTE: in later releases we will get them back via different interface.
- How I did it
Return False * lane_num for get_rx_los and get_tx_fault
- How to verify it
Added unit test
Why I did it
With continueOnError: true, a failed job returns the result: partiallySuccess, which cause it can't be rerun, since AZP consider it as passed. Then we can't only rerun t0 jobs when it fails.
How I did it
Mark t0 part1 and part2 as continueOnError: false.
How to verify it
The pipeline will verify it.
* Move qsfp eeprom reading to new cached api
* provide reading multiple pages in recursive manner
* workaround with flat memory on cmis
* remove workaround with memory model
* Remove unused imports
Porting sonic_db_dump_load.py from sonic-py-swsssdk to sonic-py-common.
#### Why I did it
sonic-py-swsssdk will be deprecate, so porting sonic_db_dump_load.py to sonic-py-common.
#### How I did it
Copy sonic_db_dump_load.py to sonic-py-common, and fix minor API different.
#### How to verify it
Pass all E2E test.
The platform_tests/test_advanced_reboot.py::test_warm_reboot will cover this script.
#### Which release branch to backport (provide reason below if selected)
<!--
- Note we only backport fixes to a release branch, *not* features!
- Please also provide a reason for the backporting below.
- e.g.
- [x] 202006
-->
- [ ] 201811
- [ ] 201911
- [ ] 202006
- [ ] 202012
- [ ] 202106
- [ ] 202111
- [ ] 202205
#### Description for the changelog
Porting sonic_db_dump_load.py from sonic-py-swsssdk to sonic-py-common.
#### Ensure to add label/tag for the feature raised. example - [PR#2174](https://github.com/sonic-net/sonic-utilities/pull/2174) where, Generic Config and Update feature has been labelled as GCU.
#### Link to config_db schema for YANG module changes
<!--
Provide a link to config_db schema for the table for which YANG model
is defined
Link should point to correct section on https://github.com/Azure/sonic-buildimage/blob/master/src/sonic-yang-models/doc/Configuration.md
-->
#### A picture of a cute animal (not mandatory but encouraged)
- Why I did it
Fix a typo in chassis platform API which causes the following error
>>> import sonic_platform as P
>>> c = P.platform.Platform().get_chassis()
>>> sl = c.get_all_sfps()
>>> sl[0].get_lpmode()
Sep 28 07:48:33 INFO LOG: Initializing SX log with STDOUT as output file.
False
>>> del c
Exception ignored in: <function Chassis.__del__ at 0x7f1d166ef8b0>
Traceback (most recent call last):
File "/usr/local/lib/python3.9/dist-packages/sonic_platform/chassis.py", line 126, in __del__
self.sfp_module.deinitialize_sdk_handle(sfp_module.SFP.shared_sdk_handle)
NameError: name 'sfp_module' is not defined
- How I did it
Use self while using the SDK handle
- How to verify it
Manual test
Signed-off-by: Stephen Sun <stephens@nvidia.com>
This reverts commit 9750cb4.
There is a PR to handle 202205 branch revert: #12184
- Why I did it
The PR to be reverted introduced many notice logs every 1 minute if SFP is not plugged:
Cannot get module EEPROM information: Input/output error
Before the "bad" PR, the message format is like this:
INFO pmon#supervisord: xcvrd Cannot get module EEPROM information: Input/output error
It was truncated by rsyslog because every message is the same. However, the "bad" PR introduces SFP index to the message:
NOTICE pmon#xcvrd: Failed to get EEPROM data for sfp 39: Cannot get module EEPROM information: Input/output error
Rsyslog no longer truncate such log and many such messages are flooded to syslog.
- How I did it
Revert the PR
- How to verify it
Manual test
Why I did it
Fix the build unstable issue caused by the kvm 9000 port is not ready to use in 2 seconds.
2022-09-02T10:57:30.8122304Z + /usr/bin/kvm -m 8192 -name onie -boot order=cd,once=d -cdrom target/files/bullseye/onie-recovery-x86_64-kvm_x86_64_4_asic-r0.iso -device e1000,netdev=onienet -netdev user,id=onienet,hostfwd=:0.0.0.0:3041-:22 -vnc 0.0.0.0:0 -vga std -drive file=target/sonic-6asic-vs.img,media=disk,if=virtio,index=0 -drive file=./sonic-installer.img,if=virtio,index=1 -serial telnet:127.0.0.1:9000,server
2022-09-02T10:57:30.8123378Z + sleep 2.0
2022-09-02T10:57:30.8123889Z + '[' -d /proc/284923 ']'
2022-09-02T10:57:30.8124528Z + echo 'to kill kvm: sudo kill 284923'
2022-09-02T10:57:30.8124994Z to kill kvm: sudo kill 284923
2022-09-02T10:57:30.8125362Z + ./install_sonic.py
2022-09-02T10:57:30.8125720Z Trying 127.0.0.1...
2022-09-02T10:57:30.8126041Z telnet: Unable to connect to remote host: Connection refused
How I did it
Waiting more time until the tcp port 9000 is ready, waiting for 60 seconds in maximum.
Why I did it
The python packages azure-kusto-data and azure-kusto-ingest packages for python2 are too old and not really used. The python3 environment has newer version of these packages installed. This change is to deprecate these two packages for python2 in docker-sonic-mgmt image.
How I did it
Removed the lines for installing old version of packages azure-kusto-data and azure-kusto-ingest in python2 in the Dockerfile template.
Signed-off-by: Xin Wang <xiwang5@microsoft.com>
Build swss-common with libyang
#### Why I did it
sonic-swss-common lib add dependency to libyang recently, so need update make file before update sonic-swss-common submodule.
#### How I did it
Add dependency to libyang in rules/swss-common.mk
#### How to verify it
Pass all E2E test case.
#### Which release branch to backport (provide reason below if selected)
<!--
- Note we only backport fixes to a release branch, *not* features!
- Please also provide a reason for the backporting below.
- e.g.
- [x] 202006
-->
- [ ] 201811
- [ ] 201911
- [ ] 202006
- [ ] 202012
- [ ] 202106
- [ ] 202111
- [ ] 202205
#### Description for the changelog
Add new Redis database PROFILE_DB
#### Link to config_db schema for YANG module changes
<!--
Provide a link to config_db schema for the table for which YANG model
is defined
Link should point to correct section on https://github.com/Azure/sonic-buildimage/blob/master/src/sonic-yang-models/doc/Configuration.md
-->
#### A picture of a cute animal (not mandatory but encouraged)