changes:
-- yang model for mpls_tc_to_tc_map table.
-- tests.
#### Why I did it
yang model for mpls_tc_to_tc_map table.
#### How I did it
-- yang model for mpls_tc_to_tc_map table.
-- yang model tests.
#### How to verify it
-- yang model build time tests.
Why I did it
BGP service has always been starting after interface-config. However, recently we discovered an issue where some BGP sessions are unable to establish due to BGP daemon not able to read the interface IP.
This issue was clearly observed after upgrading to FRR 8.2.2. See more details in #12380.
How I did it
Delaying starting BGP seems to be a workaround for this issue.
However, caution is that this delay might impact warm reboot timing and other timing sequences.
This workaround is reducing the probability of hitting the issue by close to 100X. However, this workaround is not bulletproof as test shows. It is still preferrable to have a proper FRR fix and revert this change in the future.
How to verify it
Continuously issuing config reload and check BGP session status afterwards.
Signed-off-by: Ying Xie <ying.xie@microsoft.com>
This PR fixes the issue reported in PR#12367
https://github.com/sonic-net/sonic-buildimage/pull/12367
The issue is that exit code always being 0 for the builds that are failed.
Fix is added in the Makefile.work to return the error code
when the slave build is failed with an error.
This is causing a build failure for all builds. The PR build was incorrectly marked as passing due to a different build issue.
libyang[0]: Regular expression "(/[a-zA-Z0-9_-.]+)*/([a-zA-Z0-9_-.]+)./[a-z]{3}" is not valid (".]+)*/([a-zA-Z0-9_-.]+)./[a-z]{3})$": range out of order in character class).
libyang[0]: Module "sonic-restapi" parsing failed.
ERROR:YANG-TEST: Exception >Module "sonic-restapi" parsing failed.< in /sonic/src/sonic-yang-models/tests/yang_model_tests/test_yang_model.py:114
ERROR:YANG-TEST: Exception >Module "sonic-restapi" parsing failed.< in /sonic/src/sonic-yang-models/tests/yang_model_test
This reverts commit e1765121b2.
- Why I did it
Fixes#11431
- How I did it
dhcp6relay binds to ipv6 addresses configured on these vlan interfaces
Thus check if they are ready before launching dhcp6relay
- How to verify it
Unit Tests
Tested on a live device
Signed-off-by: Vivek Reddy Karri <vkarri@nvidia.com>
Remove swsssdk from sonic OS image and docker image
#### Why I did it
swsssdk is deprecated, so need remove from image.
#### How I did it
Update config file to remove swsssdk from image.
#### How to verify it
Pass all test case.
#### Which release branch to backport (provide reason below if selected)
<!--
- Note we only backport fixes to a release branch, *not* features!
- Please also provide a reason for the backporting below.
- e.g.
- [x] 202006
-->
- [ ] 201811
- [ ] 201911
- [ ] 202006
- [ ] 202012
- [ ] 202106
- [ ] 202111
- [ ] 202205
#### Description for the changelog
Remove swsssdk from sonic OS image and docker image
#### Ensure to add label/tag for the feature raised. example - PR#2174 under sonic-utilities repo. where, Generic Config and Update feature has been labelled as GCU.
#### Link to config_db schema for YANG module changes
<!--
Provide a link to config_db schema for the table for which YANG model
is defined
Link should point to correct section on https://github.com/Azure/sonic-buildimage/blob/master/src/sonic-yang-models/doc/Configuration.md
-->
#### A picture of a cute animal (not mandatory but encouraged)
Signed-off-by: maipbui <maibui@microsoft.com>
#### Why I did it
`eval()` - not secure against maliciously constructed input, can be dangerous if used to evaluate dynamic content. This may be a code injection vulnerability.
#### How I did it
`eval()` - use `literal_eval()`
changed the platform device name under nokia directory; we now need to specify marvell armhf/arm64 to provide more accurate platform identity. otherwise onie discovery won't recognize the asic being installed.
Why I did it
when we load images using onie discovery, the process was failing because of marvell ASIC mismatch
How I did it
replace the platform asic with marvell-armhf under 7215
How to verify it
load a new image using http server and verify that the image can be loaded successfully
Why I did it
Fix apt-get remove/purge version not locked issue when the apt-get options not specified.
How I did it
Add a space character before and after the command line parameters.
Why I did it
For the change to support gearbox taps by gearbox_config.json (sonic-net/sonic-swss#2158), I need to add tests to sonic-swss/tests/test_gearbox.py to satisfy the test coverage of the change. The existing code in test_gearbox.py has already used brcm_gearbox_vs and here is to add some gearbox tap value to its gearbox_config.json, for the added tests in sonic-swss/tests/test_gearbox.py.
How I did it
How to verify it
This change itself will not affect existing code.
Signed-off-by: maipbui <maibui@microsoft.com>
#### Why I did it
`os` - not secure against maliciously constructed input and dangerous if used to evaluate dynamic content.
#### How I did it
`os` - use with `subprocess`
#### How to verify it
ac71d745d [VxLAN]Fix Vxlan delete command to throw error when there are references (#2404)
7419c6731 Added cisco config platform commands (#2242)
8760bbe80 Add UT to check sonic installer does not depend on database (#2401)
6bef65260 [doc] add documentation on automatic techsupport based on memory (#2411)
4a783745f [doc] update "config feature" section with "--block" option (#2409)
dd6210fcc [Vxlanmgrd] [CPA] Update the vxlan_tunnel name len to be under IFNAMIZ to overcome netdev creation failure (#2398)
bdc4a8a60 Fix broken pipeline build URL (#2363)
b31681b43 Fix display disorder problem of show vrf (#2392)
123504a85 YANG validation for ConfigDB Updates: portchannel add/remove, loopback interface, VLAN
28f6820c6 [link-local]Modify RIF check to include link-local enabled interfaces (#2394)
Signed-off-by: Neetha John nejo@microsoft.com
Why I did it
slb and bgp mon peers are not needed for storage backend. These neighbor are present in the minigraph.
How I did it
After minigraph parsing, remove these neighbors if it is a storage backend device
How to verify it
Unit tests
Verified on the device that once these tables are removed, these peers don't show up in "show runningconfig bgp" output
Signed-off-by: maipbui <maibui@microsoft.com>
#### Why I did it
`subprocess.Popen()` and `subprocess.run()` is used with `shell=True`, which is very dangerous for shell injection.
`os` - not secure against maliciously constructed input and dangerous if used to evaluate dynamic content
#### How I did it
Replace `os` by `subprocess`, remove `shell=True`
Remove unused functions
Include:
df92fb72 Improve verbosity level and provide more info in the log (#2472)
e81ed20b [intfmgr]: Enable `accept_untracked_na` kernel param (#2436)
24d29f18 [orchdaemon]: Fixed sairedis record file rotation (#2299)
b8ee07d7 [build] add missing package libyang-dev in lgtm.yml (#2475)
e46dd294 [crm] Fix issue with continues EXCEEDED and CLEAR logs for ACL group/table counters (#2463)
b61d24cd [doc]: Update README.md (#2456)
b9ade5d2 [orchagent] Fix issue: ip prefix shall be inited even if VRF/VNET is not ready (#2461)
f0f1eb47 Revert "[counters] Improve performance by polling only configured ports buffer queue/pg counters (#2360)" (#2458)
3d757a83 [ci][asan] add DVS tests run with ASAN (#2441)
04fbc8e3 [ci] Only when test stage succeeded or succeededwithissues, PR run Gcov (#2460)
7cc035f9 [orchagent]: Publish identified events via structured-events channel (#2446)
efa0f01d [QoS] Enforce drop probability only for colors whose WRED are enabled (#2422)
05c5c2f6 [swss] Replace memset functions (#2423)
9ff993db Modified the test file to remove click commands and do the REDIS-DB u… (#2264)
9e376af3 Install libyang in azure pipeline. (#2445)
c1eb99a7 check state_db for po before sending ARP/ND pkts (#2444)
43cc4869 [portmgr] Fixed the orchagent crash due to late arrival of notif (#2431)
b62c7162 Enhance orchagent and buffer manager in error handling (#2414)
13bda3c6 [Everflow/ERSPAN] Set correct destination port and mac address when the nexthop is updated for ERSPAN mirror destination (#2392)
0ccb315c Revert "[VS Test] Skip failing subport tests (#2370)" (#2421)
ac8a83f0 [UT] [Portsyncd] Added Unit Tests for portsyncd (#2297)
83a186a9 Change the log messages in addKernelNeigh/Route from ERROR to INFO (#2437)
9c23389b [BFD]Clean up state_db BFD entries on swss restart (#2434)
d41aebfd EntityBulker SIGSEGV when create_entry attr_count 0 (#2224)
f52a7b1c Fix the Fec Mode Setting of gbsyncd (#2430)
8cc0a451 [neighsyncd] Enabling ipv4 link local entries for non-dualtor (#2427)
5624e875 Revert "[ci][asan] add DVS tests run with ASAN (#2363)" (#2433)
a26b26ac Dynamic port configuration - add port buffer cfg to the port ref counter (#2194)
486939a9 tlm_teamd: Filter portchannel subinterface events from STATE_DB LAG_TABLE (#2408)
a4b89925 [counters] Improve performance by polling only configured ports buffer queue/pg counters (#2360)
4aaeec91 added support for Xsight platform (#2426)
ca9edcad [ci][asan] add DVS tests run with ASAN (#2363)
dec4570c Handle dual ToR neighbor miss scenario (#2151)
9eb44220 Upstream new development on p4orch (#2237)
e9be2c0e [lgtm] Fix dependency (#2419)
c0168f35 [muxorch] Returning true if nbr in skip_neighbor_ in isNeighborActive() (#2415)
cfcf3d87 [macsec]: Set MTU for MACsec (#2398)
8346034b Delete Invalid if condition in intfsorch.cpp (#2411)
Signed-off-by: Lawrence Lee <lawlee@microsoft.com>
Signed-off-by: maipbui <maibui@microsoft.com>
Dependency: [PR (#12065)](https://github.com/sonic-net/sonic-buildimage/pull/12065) needs to merge first.
#### Why I did it
`subprocess.Popen()` and `subprocess.check_output()` is used with `shell=True`, which is very dangerous for shell injection.
#### How I did it
Disable `shell=True`, enable `shell=False`
#### How to verify it
Tested on DUT, compare and verify the output between the original behavior and the new changes' behavior.
[testresults.zip](https://github.com/sonic-net/sonic-buildimage/files/9550867/testresults.zip)
There's an odd crash that intermittently happens after the teamd container
exits, and a signal is raised to the main thread to exit. This thread (watching
teamd) continues execution because it's in a `while True`. The subsequent wait
call on the teamd container very likely returns immediately, and it calls
`is_warm_restart_enabled` and `is_fast_reboot_enabled`. In either of these
cases, sometimes, there is a crash in the transition from C code to Python code
(after the function gets executed). Python sees that this thread got a signal
to exit, because the main thread is exiting, and tells pthread to exit the
thread. However, during the stack unwinding, _something_ is telling the
unwinder to call `std::terminate`. The reason is unknown.
This then results in a python3 SIGABRT, and systemd then doesn't call the stop
script to actually stop the container (possibly because the main process exited
with a SIGABRT, so it's a hard crash). This means that the container doesn't
actually get stopped or restarted, resulting in an inconsistent state
afterwards.
The workaround appears to be that if we know the main thread needs to exit,
just return here, and don't continue execution. This at least tries to avoid it
from getting into the problematic code path. However, it's still feasible to
get a SIGABRT, depending on thread/process timings (i.e. teamd exits, signals
the main thread to exit, and then syncd exits, and syncd calls one of the two C
functions, potentially hitting the issue).
Signed-off-by: Saikrishna Arcot <sarcot@microsoft.com>
Signed-off-by: Saikrishna Arcot <sarcot@microsoft.com>
Why I did it
When sending a PR only CI change, as expected, the target target/python-wheels/buster/sonic_config_engine-1.0-py2-none-any.whl should be from the cache, because the depended files were not changed, but it rebuilt.
How I did it
Sort the files by name.
- The Makefile.work becomes complex and it is very difficult to manage the changes across branches.
- Restructured the Makefile.work and it becomes more readable.
- Added $(QUIET) option to turn on command echo mode through command line option.
- Exported the SONIC_BUILD_VARS variable, through which make options can be set dynamically.
Eg: make SONIC_BUILD_VARS='INCLUDE_NAT=y'
Why I did it
Add components data for sonic-mgmt testing
How I did it
Update platform.json and add platform_components.json
How to verify it
Ran sonic-mgmt tests (test_chassis and test_component)
0a7557bd9 [minigraph] add option to specify golden path in load_minigraph (#2350)
322aefc37 [GCU]Remove GCU unique lane check for duplicate lanes platforms (#2343)
7099fffa7 [fastboot] fastboot enhancement: Use warm-boot infrastructure for fast-boot (#2286)
09026edbb [warm-reboot] fix warm-reboot when /tmp/cache is missing (#2367)
a3c404c74 Fix typo in platform_sfputil_helper.is_rj45_port (#2374)
637d834ce Vnet_route_check Vxlan tunnel route update. (#2281)
29a3e5180 Added support for tunnel route status in show vnet routes all. (#2341)
1ac584bb3 Use 'default' VRF when VRF name is not provided (#2368)
4d377a620 [subinterface]Added additional checks in portchannel and subinterface commands (#2345)
bbcdf2ed7 disk_check: Publish event for RO state (#2320)
3fd537b0a Support the bandit check by GitHub Action (#2358)
491d3d380 [generate dump]Added error message when saisdkdump fails (#2356)
6830e01ec [counterpoll]Fixing counterpoll show for tunnel and acl stats (#2355)
3be2ad7de [fast-reboot]Avoid stopping masked services during fast-reboot (#2335)
0e1b0cf20 [GCU] Fix missing backend in dry run (#2347)
676c31bd0 Add verification for override (#2305)
48997c266 Add Password Hardening CLI support (#2338)
414e239ea update unit tests for swap allocator
a91a4922f consider swap checking memory in installer
f0ce58635 [route_check]: Ignore standalone tunnel routes (#2325)
Why I did it
Fix PR merge failed because 'vstest' step does not install libyang.
How I did it
Install libyang in azure pipeline.
How to verify it
Pass vstest step.
- Why I did it
To update MFT package to the latest version.
- How I did it
Updated MFT_VERSION & MFT_REVISION in platform/mellanox/mft.mk.
- How to verify it
Build an image and deploy to the switch
Check MFT version by dpkg -l | grep mft
Verify that all the SONiC services up and running
Run regression testing using tests from sonic-mgmt
Signed-off-by: Volodymyr Samotiy <volodymyrs@nvidia.com>
- Why I did it
To include latest fixes and new functionality
SAI fixes and new features
fix#3205239, incorrect object type returned for SG child list
Fix VRF-VNI map entries remove issue
ECC health event and logging
[Port Buffers] restore default queue and pg configuration when all user pools are deleted
Fix EVPN type3 error on removal of uc/bc flood group
Fix EVPN type2 MAC move from local to remote results in SAI failure
Fix Disable learning on VXLAN tunnel
Fix error on VXLAN v6 tunnel removal
Fix port cannot apply schedule group when it is a lag member
Fix BFD add more detailed message on BFD packet not related to any existing session
gcc10 compilation fixes
Disable learning on VXLAN tunnel
Support BFD remote-disc exchange in negotiation stage
Tunnel Loopback packet action attribute implementation (for Dual TOR)
Add KVD resources MIN/MAX functionality (pending CRM issue with MIN only)
Support for CRC2 hash algorithm
Bulk counter support for PGs, queues
Support mirror sample rate attribute (SPC2+)
[Functional] [QoS] | Unable to remove SCHEDULE profile table even if there is no object referencing it
Next hop group optimized bulk API
Reduce verbosity of shared database already exists print
Span mirror policer (SPC2+), optimize pipeline for acl mirror action with policer on SPC2+
use same size descriptor pool for rx/tx
fix bfd - notify Sonic for admin-down event
2201 - empty list for supported fec for RJ45 ports
Fix don't disable used tunnel underlay interfaces
SDK fixes
100GbE FCI DAC (10137628-4050LF/HPE PN: 845408-B21) was recognized by mistake as supporting "cable burning' which caused the switch firmware to read page 0x9f (which unsupported in the cable) and to report this cable as having "bad eeprom".
Added remote peer UDP port information in BFD packet event.
After editing an ECMP, the resilient ECMP next-hop counter may not count correctly.
Fixed potential memory leaks in some APIs related to LPM
If TTL_CMD_COPY is used in Encap direction for a packet with no TTL, then the value passed in the ttl data structure will be used if non-zero (default 255 if zero).
In SN2201: When configuring Force mode, user should configure Speed and FEC on both sides
In Flex Tunnel encapsulation flow, if the encapsulation is with an IPv6 header, the flow label field may not be updated as expected.
In some cases, when changing speed to 400GbE over 8 lanes, the first few packets would be dropped.
In some traffic patterns involving small packets, the PortRcvErrors counter may mistakenly count events of local physical errors due to an internal flow in the hardware that involves link packets.
On Spectrum systems, sometimes during link failure, not all previous firmware indications cleared properly, potentially affecting the next link up attempt.
On the NVIDIA Spectrum-2 switch, when receiving a packet with Symbol Errors on ports that are configured to cut-thought mode, a pipeline might get stuck.
PCI calibration changes from a static to a dynamic mechanism.
SDK debug dump shows "Unknown" Counter in RFC3635 Counter Group.
SDK debug dump shows "Unknown" Counter in the PPCNT Traffic Class Counter Group.
SDK Dump missing column headers in some GC tables may result in difficulty understanding the dump.
SLL configuration is missing in SDK dump.
Spectrum-2 systems, do no support 1GbE on supported 40GbE modules.
When binding a UDP port which is already in use for BFD TX session, the error message appears incorrectly.
When Flex Tunnel was used, Flex Modifier sometimes experienced a brief mis-configuration during ISSU.
When many ports are active (e.g. 70 ports up), and the configuration of shared buffer is applied on the fly, occasionally, the firmware might get stuck.
When running 1GbE speeds on SN4600 system, the port remained active while peer side was closed.
When toggling many ports of the Spectrum devices while raising 10GbE link up and link maintenance is enabled, the switch may get stuck and may need to be rebooted.
When trying to reconfigure the Flex Parser header and Flex transition parameters after ISSU, the switch will returned an error even if the configuration was identical to that done before performing the ISSU.
While toggling the cable, and the low power mode is set to ON, an unexpected PMPE event error is received.
- How I did it
Updated SDK/SAI submodule and relevant makefiles with the required versions.
- How to verify it
Build an image and run tests from "sonic-mgmt".
Signed-off-by: Volodymyr Samotiy <volodymyrs@nvidia.com>
- Why I did it
get_rx_los and get_tx_fault is not supported via the exisitng interface used, need provide dummy implementation for them.
NOTE: in later releases we will get them back via different interface.
- How I did it
Return False * lane_num for get_rx_los and get_tx_fault
- How to verify it
Added unit test
Why I did it
With continueOnError: true, a failed job returns the result: partiallySuccess, which cause it can't be rerun, since AZP consider it as passed. Then we can't only rerun t0 jobs when it fails.
How I did it
Mark t0 part1 and part2 as continueOnError: false.
How to verify it
The pipeline will verify it.