Why I did it
There is a queue in sysmonitor.py that is created based on an object of multiprocessing.Manager.
After performing fast-reboot, system health monitor is being shut down, what causes this Manager to be shut down as well, since it is a child-process of healthd.
That's why I moved the creation of this Manager from the top of the file to the function Sysmonitor.system_service() (The only place it is used), to make Manager a child-process of Sysmonitor, instead of Healthd. This way both the queue (the Manager) and the processes that uses this queue will be child-processes of the same process, and the problematic scenario of sysmonitor sending messages to a dead queue will not be possible.
How I did it
Removed the definition of manager as global and moved it to system_service() function
How to verify it
Perform a fast reboot and verify the traceback issue is fixed
Why I did it
To bring in the following fixes:
Revert temporary fix added to disable SA equal DA drops
CS00012273013 - [7.1][J2, J2c+] Disable SA Equals DA trap on DNX
CS00012274222 - How to block the voq for given destination port for a flow from a remote mod-id
CS00012275381 - SAI_INGRESS_PRIORITY_GROUP_STAT_PACKETS is incremented for port's PG's even if there are no traffic sent to that PG
CS00012274433 - Local Fault and Remote Fault are not polled by linkscan thread
How I did it
Merged above fixes to SAI code
How to verify it
Validated by running the basic sanity tests on XGS and DNX chassis platforms including
fib/test_fib.py
decap/test_decap.py
drop_counters/test_drop_counters.py
arp/test_arpall.py
Why I did it
In some cases, dpkg will call dpkg to validate version.
dpkg hook will get stuck in a loop to lock.
How I did it
Use an env variable to skip duplicated lock.
The deepdiff python package was recently updated to 6.2.3. As part of
this, a dependency was introduced on orjson. There's no armv8l python
wheel available for orjson, which means it needs to be built from
source. However, building it requires rust (which Buster and Bullseye
don't have a new enough version of) and maturin.
As a quick fix, pin this to version 6.2.2, before the orjson dependency
is introduced.
Signed-off-by: Saikrishna Arcot <sarcot@microsoft.com>
Signed-off-by: Saikrishna Arcot <sarcot@microsoft.com>
Update sonic-utilities submodule pointer to include the following:
* dddd6c5 [202205] Revert the show-techsupport optimization PR's ([#2581](https://github.com/sonic-net/sonic-utilities/pull/2581))
Signed-off-by: dprital <drorp@nvidia.com>
Why I did it
In the voq chassis the buffer_queue configuration needs to be applied on system_port instead of the sonic port.
This PR has the change to do this.
How I did it
Modify buffer_config.j2 to generate buffer_queue configuration on system_ports if the device is Voq Chassis
How to verify it
Verify the buffer_queue configuration is generated properly using sonic-cfggen
Signed-off-by: Arvindsrinivasan Lakshmi Narasimhan <arlakshm@microsoft.com>
Update sonic-utilities submodule pointer to include the following:
3cb66b4 [202205] Preserve copp tables through DB migration (2524) (#2574)
Update sonic-swss submodule pointer to include the following:
c9ca7c8 Modify coppmgr mergeConfig to support preserving copp tables through reboot. (#2548)
Signed-off-by: dprital <drorp@nvidia.com>
Update sonic-utilities submodule pointer to include the following:
* c9ed09d [202205] [sonic_installer] use /etc/resolv.conf from the host when migrating packages (#2573) ([#2575](https://github.com/sonic-net/sonic-utilities/pull/2575))
Signed-off-by: dprital <drorp@nvidia.com>
Update sonic-utilities submodule pointer to include the following:
3bc2bc6 [Mellanox][202205] Change severity to NOTICE in Mellanox buffer migrator when unable to fetch DEVICE_METADATA due to empty CONFIG_DB during initialization (#2570)
e1c8243 [202205][generate_dump] Fix for a deletion flow for all secret files in the techsupport dump (#2572)
9f2984a [202205] Fix issue: unconfigured PGs are displayed in watermarkstat (#2568)
f7988b0 [202205] [timer.unit.j2] use wanted-by in timer unit (#2561)
f45dcfb [generate_dump] Optimize the execution time of 'show techsupport' CLI by paraller function execution (#2565)
67cbb15 [202205]Fixes 12170: Delete subinterface and recreate the subinterface in default-vrf (#2564)
93172c4 [202205] [generate_dump] Optimize the execution time of the 'show techsupport' script to 5-10% by reducing calls to the 'tar append' operation (#2562)
Signed-off-by: dprital <drorp@nvidia.com>
Why I did it
Advance SAI Redis head pointer
How I did it
changes:
sonic-net/sonic-sairedis@cf679e7sonic-net/sonic-sairedis@8d6688e
[202205][Submodule][SAI]Advance SAI head pointer sonic-sairedis#1185 sonic-net/sonic-sairedis@66f2961
remove parameter --skip_error, which removed from [202205][Submodule][SAI]Advance SAI head pointer sonic-sairedis#1185
How to verify it
local image build
This PR is a required for changing the L3 IP forwarding Behavior to SoC in active-active toplogy.
Basically a src IP is added to the SNAT rule so that only packets originating from ToR with src IP as vlan IP get natted by the rule and change the src IP to LoopBack IP
Master Branch PR with combined change is here
sonic-net/sonic-host-services#3
How I did it
check the config DB if the ToR is a DualToR and has an SoC IP assigned.
put an iptable rule
iptables -t nat -A POSTROUTING --destination -j SNAT --to-source "
Signed-off-by: vaibhav-dahiya vdahiya@microsoft.com
Signed-off-by: vaibhav-dahiya <vdahiya@microsoft.com>
Why I did it
To ensure, that after a BGP startup, dualtor T0 receives BGP updates before sending out BGP updates.
Please refer to sonic-net/SONiC#1161 for more details.
How I did it
add coalesce-time 10000 to the frr bgp startup config.
Signed-off-by: Longxiang Lyu <lolv@microsoft.com>
Signed-off-by: Longxiang Lyu <lolv@microsoft.com>
- Why I did it
Fixed sflow yang model to include collector_vrf field.
- How I did it
Added leaf for collector_vrf under sflow_collector. Additionally aligned the configuration guide
- How to verify it
Added UT to verify.
* Fix kube mode to local mode long duration issue
* Remove IPV6 parameters which is not necessary
* Fix read node labels bug
* Tag the running image to latest if it's stable
* Disable image_version_higher check
* Change image_version_higher checker test case
Signed-off-by: Yun Li <yunli1@microsoft.com>
Signed-off-by: Yun Li <yunli1@microsoft.com>
Co-authored-by: lixiaoyuner <35456895+lixiaoyuner@users.noreply.github.com>
Signed-off-by: Neetha John <nejo@microsoft.com>
Why I did it
ECN parameters need to be updated for storage backend
How I did it
Included the check for storage backend devices to update qos configs
How to verify it
Verified that the new ecn settings are applied on storage backend device.
Verified that the old ecn settings are applied for storage frontend, non storage frontend/backend devices
swss:
* 7e274a4 2022-11-18 | [Fdbsyncd] Bug Fix for remote MAC move to local MAC and Fix for Static MAC advertisement in EVPN. (#2521) (HEAD -> 202205, github/202205) [KISHORE KUNAL]
* 434e80c 2022-11-02 | Fix vs test issue: failed to remove vlan due to referenced by vlan interface (#2504) [Stephen Sun]
* 11bef87 2022-11-27 | [dual-tor] add missing SAI attribte in order to create IPNIP tunnel (#2503) [Andriy Yurkiv]
* 11aba29 2022-11-09 | [SWSS] Innovium platform specific changes in PFC Detect lua script (#2493) [maulik_patel_marvell]
* 4a165ee 2022-11-14 | Revert "[vlanmgr] Disable `arp_evict_nocarrier` for vlan host intf (#2469)" (#2518) [Longxiang Lyu]
Signed-off-by: Ying Xie <ying.xie@microsoft.com>
Signed-off-by: Ying Xie <ying.xie@microsoft.com>
Why I did it
The PR is to apply separated DSCP_TO_TC_MAP and TC_TO_QUEUE_MAP to uplink ports on dualtor.
The traffic with DSCP 2 and DSCP 6 from T1 is treated as lossless traffic.
DSCP TC Queue
2 2 2
6 6 6
Traffic with DSCP 2 or DSCP 6 from downlink is still treated as lossy traffic as before.
How I did it
Define DSCP_TO_TC_MAP|AZURE_UPLINK and TC_TO_QUEUE_MAP|AZURE_UPLINK.
How to verify it
Verified by UT
Verified by coping the new template to a testbed, and rendering a config_db.json
- Why I did it
The values for config_db "docker_routing_config_mode" are:
separated: FRR config generated from ConfigDB, each FRR daemon has its own config file
unified: FRR config generated from ConfigDB, single FRR config file
split: FRR config not generated from ConfigDB, each FRR daemon has its own config file
This commit adds:
split-unified: FRR config not generated from ConfigDB, single FRR config file
- How I did it
In docker_init.sh, when split-unified is used, the FRR configs are not generated
from ConfigDB. What's more, "service integrated-vtysh-config" is configured in vtysh.conf.
- How to verify it
FRR config not overwritten when FRR container starts.
Signed-off-by: Arnaud le Taillanter <a.letaillanter@criteo.com>
Why I did it
Database container takes long time ( more than 1.5 minutes ) on some vendor platforms.
This makes the determine-reboot-cause starts later than process-reboot-cause service.
And that results in the incorrect reboot-cause determination.
How I did it
Add the dependency of determine-reboot-cause service to process-reboot-cause service
Export remote address to environment variable for TACACS authorization.
#### Why I did it
When remote user login, nss-tacplus need user remove address for TACACSS authorization.
#### How I did it
Export remote address to environment variable "SSH_REMOTE_IP"
#### How to verify it
Pass all E2E test.
#### Which release branch to backport (provide reason below if selected)
<!--
- Note we only backport fixes to a release branch, *not* features!
- Please also provide a reason for the backporting below.
- e.g.
- [x] 202006
-->
- [ ] 201811
- [ ] 201911
- [ ] 202006
- [ ] 202012
- [ ] 202106
- [ ] 202111
- [ ] 202205
#### Description for the changelog
Export remote address to environment variable for TACACS authorization.
#### Ensure to add label/tag for the feature raised. example - PR#2174 under sonic-utilities repo. where, Generic Config and Update feature has been labelled as GCU.
#### Link to config_db schema for YANG module changes
<!--
Provide a link to config_db schema for the table for which YANG model
is defined
Link should point to correct section on https://github.com/Azure/sonic-buildimage/blob/master/src/sonic-yang-models/doc/Configuration.md
-->
#### A picture of a cute animal (not mandatory but encouraged)
Send remote address in TACACS+ authorization message.
#### Why I did it
TACACS+ authorization message not send remote address to server side.
#### How I did it
Send remote address in TACACS+ authorization message.
#### How to verify it
Pass all E2E test.
Create new test case to validate remote address been send to server side.
#### Which release branch to backport (provide reason below if selected)
<!--
- Note we only backport fixes to a release branch, *not* features!
- Please also provide a reason for the backporting below.
- e.g.
- [x] 202006
-->
- [ ] 201811
- [ ] 201911
- [ ] 202006
- [ ] 202012
- [ ] 202106
- [ ] 202111
- [ ] 202205
#### Description for the changelog
Send remote address in TACACS+ authorization message.
#### Ensure to add label/tag for the feature raised. example - [PR#2174](https://github.com/sonic-net/sonic-utilities/pull/2174) where, Generic Config and Update feature has been labelled as GCU.
#### Link to config_db schema for YANG module changes
<!--
Provide a link to config_db schema for the table for which YANG model
is defined
Link should point to correct section on https://github.com/Azure/sonic-buildimage/blob/master/src/sonic-yang-models/doc/Configuration.md
-->
#### A picture of a cute animal (not mandatory but encouraged)
* [openssh]: Restore behavior of ClientAliveCountMax=0
OpenSSH 8.2 changed the behavior of ClientAliveCountMax=0 such that
setting it to 0 disables connection-killing entirely when the connection
is idle. Revert that change.
Signed-off-by: Saikrishna Arcot <sarcot@microsoft.com>
Why I did it
Changes from master branch PR sonic-net/sonic-host-services#13
est_cacl_application fails on VoQ chassis Supervisor with the error:
Failed: Missing expected iptables rules: set(['-A INPUT -s 240.127.1.1/32 -d 240.127.1.1/32 -j ACCEPT', '-A INPUT -s 240.127.1.3/32 -d 240.127.1.1/32 -j ACCEPT', '-A INPUT -s 240.127.1.2/32 -d 240.127.1.1/32 -j ACCEPT'])
This failure is seen because acl rules to allow traffic from fabric namespaces is missing.
This PR is to include fabric namespace docker mgmt ips so that acl rules to allow traffic from namespace is added for fabric namespace as well.
How I did it
Get list of fabric namespaces, use this list to get docker mgmt ip of fabric asic namespace as well.
How to verify it
Verified on voq chassis.
unit-test passes
Include following commit:
936f1b1 Revert "[config reload]: On dual ToR systems, cache ARP and FDB table… (#2461)
Signed-off-by: Lawrence Lee <lawlee@microsoft.com>
Cherry-pick #12306 to 202205 branch
Why I did it
Add yang model definition for VOQ_INBAND_INTERFACE defined and implemented for VOQ chassis. HLD for voq-inband-interface is included in https://github.com/sonic-net/SONiC/blob/master/doc/voq/voq_hld.md
How I did it
Added yang model definition, unit tests, sample config and documentation for the table
How to verify it
Validated config tree generation using "pyang -Vf tree -p /usr/local/share/yang/modules/ietf ./yang-models/sonic-voq-inband-interface.yang"
Built the below python-wheels to validate unit tests and other changes
target/python-wheels/bullseye/sonic_yang_mgmt-1.0-py3-none-any.whl
target/python-wheels/bullseye/sonic_yang_models-1.0-py3-none-any.whl
target/python-wheels/bullseye/sonic_config_engine-1.0-py3-none-any.whl
* Fix CVE-2022-37032 on FRR submodule
Patch was cherry picked from FRRouting/frr repo - d8d77d3733bc299ed5dd7b44c4d464ba2bfed288
* Fix CVE-2022-37032 on FRR submodule
Patch was cherry picked from FRRouting/frr repo - d8d77d3733bc299ed5dd7b44c4d464ba2bfed288
* Update patch version number
Fixing issue FRRouting/frr#11108
For interface based peers with peer-groups, "no neighbor capability extended-nexthop" gets added by default. This will result in IPv4 routes not having ipv6 next hops.
- How I did it
Porting the commit FRRouting/frr@8e89adc to FRR 8.2.2 which fixes the issue
- How to verify it
Load FRR and verify if the "no neighbor capability extended-nexthop" not gets added for interfaces associated with peer-groups
Why I did it
There is an outstanding FRR issue #12380. This seems to be a known issue but without good fix so far. The root cause is around zebra and kernel netlink interaction. The failure was previously not noticed by zebra.
How I did it
Port the patch that would make the issue obvious.
Signed-off-by: Ying Xie ying.xie@microsoft.com
Why I did it
#4021 describes an issue that is still being observed on master image whereby sensord does not start in pmon due to missing service.
How I did it
Updated the lm-sensors install patch with a case for systemd
How to verify it
Verified that sensord is up in pmon after boot
Co-authored-by: Boyang Yu <byu@arista.com>
Why I did it
Fix apt-get remove/purge version not locked issue when the apt-get options not specified.
How I did it
Add a space character before and after the command line parameters.
- Why I did it
Fixes#11431
- How I did it
dhcp6relay binds to ipv6 addresses configured on these vlan interfaces
Thus check if they are ready before launching dhcp6relay
- How to verify it
Unit Tests
Tested on a live device
Signed-off-by: Vivek Reddy Karri <vkarri@nvidia.com>
Signed-off-by: Neetha John nejo@microsoft.com
Why I did it
slb and bgp mon peers are not needed for storage backend. These neighbor are present in the minigraph.
How I did it
After minigraph parsing, remove these neighbors if it is a storage backend device
How to verify it
Unit tests
Verified on the device that once these tables are removed, these peers don't show up in "show runningconfig bgp" output
Why I did it
Change the path of sonic submodules that point to "Azure" to point to "sonic-net"
How I did it
Replace "Azure" with "sonic-net" on all relevant paths of sonic submodules
To get following fixes:
be7da6b [sonic-installer] use host docker startup arguments when running dockerd in chroot (#2179) (#2407)
d112f7c [202205][auto-ts] add memory check (#2116) (#2413)
linkmgrd:
* a0834e4 2022-09-22 | [Active-Active] server side admin forwarding state sync up (#133) (HEAD -> 202205) [Jing Zhang]
* ea56020 2022-09-21 | Post switchover reasons to STATE DB (#131) [Jing Zhang]
swss:
* d9cbf44 2022-09-24 | [orchagent] Fix issue: ip prefix shall be inited even if VRF/VNET is not ready (#2461) (HEAD -> 202205, github/202205) [Junchao-Mellanox]
* 3dcd6ff 2022-09-24 | [macsec]: Set MTU for MACsec (#2398) (#2466) [Ze Gan]
platform-daemon:
* 48b6239 2022-09-27 | [ycabled] add support for getting grpc secerts via shared file (#298) (HEAD -> 202205) [vdahiya12]
Signed-off-by: Ying Xie <ying.xie@microsoft.com>
Signed-off-by: Ying Xie <ying.xie@microsoft.com>
Update sonic-utilities submodule pointer to include the following:
* 99ed8ea [link-local]Modify RIF check to include link-local enabled interfaces ([#2394](https://github.com/Azure/sonic-utilities/pull/2394))
Signed-off-by: dgsudharsan <sudharsand@nvidia.com>
Signed-off-by: dgsudharsan <sudharsand@nvidia.com>
Why I did it
Current isc-dhcp uses below code to remove DHCP option:
memmove(sp, op, op[1] + 2);
sp += op[1] + 2;
sp points to the option to be stripped, we can call it as option S.
op points to the option after options S, we can call it as option O.
DHCP option is a typical type-length-value structure, the first byte is type, the second byte is length, and remain parts are value.
In this case, option O length is bigger than option S, and more than 2 bytes, after the memmove, we will get this result:
Now Option S and Option O are overwritten, op[1] was the length of Option O, and it's modified after memmove.
But current implementation is still using op[1] as length to update sp (sp+=op[1]+2), so we get the wrong sp.
How I did it
Create patch from https://github.com/isc-projects/dhcp
The new impelementation use mlen to store the length of Option O before memmove, that's how it fixed the bug.
size_t mlen = op[1] + 2;
memmove(sp, op, mlen);
sp += mlen;
How to verify it
I have a PR for sonic-mgmt to cover this issue:
sonic-net/sonic-mgmt#6330
Signed-off-by: Gang Lv ganglv@microsoft.com
- Why I did it
To pickup new commit from sonic-swss-common submodule:
afd2382 [202205] Fix sonic-db-cli dictionary output format not backward compatible issue. sonic-net/sonic-swss-common#690
- How I did it
Advance the sonic-swss-common submodule pointer
Signed-off-by: Kebo Liu <kebol@nvidia.com>
Update sonic-utilities submodule pointer to include the following:
[202205][subinterface]Added additional checks in portchannel and subinterface commands (#2371)
[202205] Use warm-boot infrastructure for fast-boot (#2365)
*The sonic-frr was upgraded to FRR 8.2.2 as part of PR #10691. However, sonic-frr/frr submodule was still referring to previous 7.5 version. Update the sonic-frr/frr submodule to 8.2.2 commit id. Fixes issue #11484.
Co-authored-by: Hasan Naqvi <56742004+hasan-brcm@users.noreply.github.com>
Why I did it
This PR is to update TC_TO_QUEUE_MAP|AZURE for SKU Arista-7050CX3-32S-D48C8 and Arista-7260CX3 T0.
The change is only to align the TC_TO_QUEUE_MAP for regular traffic and bounced traffic. It has no impact on business because we have no traffic being mapped to TC2 or TC6.
How I did it
Update TC_TO_QUEUE_MAP|AZURE , and test cases as well.
How to verify it
Verified by running test case test_j2files.py
/sonic/src/sonic-config-engine$ python3 setup.py test -s tests/test_j2files.py
running test
......
----------------------------------------------------------------------
Ran 29 tests in 25.390s
OK
For the Restapi/gnmi use-cases, Sonic has to support a new Table: EXTERNAL_CLIENT of type CTRLPLANE, stage ingress
This shall match on 'src ip prefix' and dst port '8080'. Caclmgrd must parse this from acl.json and install as in the below example:
iptables -A INPUT -s 20.20.20.20/27 -p tcp --dport 8080 -j ACCEPT
or ip6tables if the 'src ip prefix' is IPv6.
This change for master branch is in PR sonic-net/sonic-host-services#9
Signed-off-by: Zhaohui Sun <zhaohuisun@microsoft.com>
Update sonic-swss submodule pointer to include the following:
[BFD]Clean up state_db BFD entries on swss restart (#2434)
Fix the Fec Mode Setting of gbsyncd (#2430)
[neighsyncd] Enabling ipv4 link local entries for non-dualtor (#2427)
tlm_teamd: Filter portchannel subinterface events from STATE_DB LAG_TABLE (#2408)
PFCWD recovery changes using DLR_INIT (#2316)
Dynamic port configuration - add port buffer cfg to the port ref counter (#2194)
Signed-off-by: dprital <drorp@nvidia.com>
Why I did it:
API get_device_runtime_metadata() added by #11795 uses merge operator for dict but that is supported only for python version >=3.9. This API will be be used by scrips eg:hostcfgd which is still build for buster which does not have python 3.9 support.
Why I did it
Currently the CLI commands show interface status show interface counters and show interface description displays Ethernet-IB and Ethernet-Rec ports in the output. These are internal ports should only be displayed when the option -d all is used for the above mentioned CLI commands
How I did it
Add the port roles Inb and Rec when classifing a port as internal port.
How to verify it
Verify the CLI output of the command show interface status doesnt display the Ethenet-IB and Ethernet-Rec port when -d all option in not present
Before
Signed-off-by: Arvindsrinivasan Lakshmi Narasimhan <arlakshm@microsoft.com>
update sai module with commit
- 566d4a8ef2 2022-08-11 | [SAI-PTF] Enable saiserverv2 with syncd-rpc and fix saithriftv2 build (#1552) (#1533) (#1514) (#1492) (#1558) (#1557) [Richard.Yu]
- a1796a53cc 2022-08-11 | Add support of mdio IPC server class using sai switch api and unix socket
Signed-off-by: richardyu-ms <richard.yu@microsoft.com>
Signed-off-by: richardyu-ms <richard.yu@microsoft.com>
* [sonic_py_common] Cache Static Information in device_info to speed up CLI response (#11696)
- Why I did it
Profiled the execution for the following cmd intfutil -c status
- How I did it
Cached the following information:
1. get_sonic_version_info()
2. get_platform_info()
None of the API exposed to the user libraries (for eg: sonic-utilities) has been modified
These methods involve reading text files or from redis. Thus, caching helped to improve the execution time
- How to verify it
Added UT's.
Verified on the device
Signed-off-by: Vivek Reddy Karri <vkarri@nvidia.com>
* Removed UT since libswsscommom dep is missing in <=202205
Signed-off-by: Vivek Reddy <vkarri@nvidia.com>
Signed-off-by: Vivek Reddy Karri <vkarri@nvidia.com>
Signed-off-by: Vivek Reddy <vkarri@nvidia.com>
Why I did it
VoQ chassis supervisor will have Fabric asics and the sub_role for fabric asics will be "Fabric".
The fabric asics namespaces are not being returned in get_all_namespaces() and is required in caclmgrd to add right cacl to allow internal docker traffic from fabric asic namespaces.
test_cacl_application fails on VoQ chassis Supervisor with the error:
Failed: Missing expected iptables rules: set(['-A INPUT -s 240.127.1.1/32 -d 240.127.1.1/32 -j ACCEPT', '-A INPUT -s 240.127.1.3/32 -d 240.127.1.1/32 -j ACCEPT', '-A INPUT -s 240.127.1.2/32 -d 240.127.1.1/32 -j ACCEPT'])
How I did it
Update get_all_namespaces to return fabric namespaces list.
How to verify it
Verified on VoQ chassis.
Why I did it
On a supervisor card in a chassis, syncd/teamd/swss/lldp etc dockers are created for each Switch Fabric card. However, not all chassis would have all the switch fabric cards present. In this case, only dockers for Switch Fabrics present would be created.
The monit 'container_checker' fails in this scenario as it is expecting dockers for all Switch Fabrics (based on NUM_ASIC defined in asic.conf file).
* [snmpd]: Update to 5.9+dfsg-4+deb11u1 to match Debian version
This brings in some security fixes.
Signed-off-by: Saikrishna Arcot <sarcot@microsoft.com>
* Update snmpd makefile
Signed-off-by: Saikrishna Arcot <sarcot@microsoft.com>
* Remove binNMU for snmpd
Signed-off-by: Saikrishna Arcot <sarcot@microsoft.com>
Signed-off-by: Saikrishna Arcot <sarcot@microsoft.com>
Update the bcm config file system_ref_core_clock_khz param to
handlesystems with J2cplus linecards.
We need system_ref_core_clock_khz to be set to 1600000 for supporting j2
and j2cplus linecards on the same chassis.
Signed-off-by: maipbui <maibui@microsoft.com>
<!--
Please make sure you've read and understood our contributing guidelines:
https://github.com/Azure/SONiC/blob/gh-pages/CONTRIBUTING.md
** Make sure all your commits include a signature generated with `git commit -s` **
If this is a bug fix, make sure your description includes "fixes #xxxx", or
"closes #xxxx" or "resolves #xxxx"
Please provide the following information:
-->
#### Why I did it
Replace unsafe functions to safe functions
#### How I did it
Replace `strtok()` by `strtok_r()`
#### How to verify it
#### Which release branch to backport (provide reason below if selected)
<!--
- Note we only backport fixes to a release branch, *not* features!
- Please also provide a reason for the backporting below.
- e.g.
- [x] 202006
-->
- [ ] 201811
- [ ] 201911
- [ ] 202006
- [ ] 202012
- [ ] 202106
- [ ] 202111
- [ ] 202205
#### Description for the changelog
<!--
Write a short (one line) summary that describes the changes in this
pull request for inclusion in the changelog:
-->
#### Link to config_db schema for YANG module changes
<!--
Provide a link to config_db schema for the table for which YANG model
is defined
Link should point to correct section on https://github.com/Azure/sonic-buildimage/blob/master/src/sonic-yang-models/doc/Configuration.md
-->
#### A picture of a cute animal (not mandatory but encouraged)
Why I did it
This PR is to update Yang model for pfc_enable and pfcwd_sw_enable fields to support more than 2 queues, like 2,3,4,6.
Before this change, the regex "[0-7](,[0-7])?" accepts only no more than 2 queues.
How I did it
Update the regex pattern for pfc_enable and pfcwd_sw_enable, from "[0-7](,[0-7])?" to "[0-7](,[0-7])*
How to verify it
The change is verified by UT. The test input is updated to cover the change.
collected 3 items
tests/test_sonic_yang_models.py .. [ 66%]
tests/yang_model_tests/test_yang_model.py .
Signed-off-by: Arvindsrinivasan Lakshmi Narasimhan arlakshm@microsoft.com
Why I did it
Generate the port configuration required 400G ZR port from minigraph.
How I did it
Add parse logic to get tx_power and laser_freq from LinkMetadata section of the minigraph.
Add UT for packet-chassis and voq chassis
How to verify it
UT
Signed-off-by: Arvindsrinivasan Lakshmi Narasimhan <arlakshm@microsoft.com>
utilities:
* fbf82b9 2022-08-08 | Fix the issue that sonic_platform is not installed on vs image (#2300) (HEAD -> 202205, github/202205) [Stephen Sun]
* 19a3540 2022-08-13 | Revert "Revert "Improve the way to check port type of RJ45 port (#2249)"" [Ying Xie]
Signed-off-by: Ying Xie <ying.xie@microsoft.com>
Signed-off-by: Ying Xie <ying.xie@microsoft.com>
Why I did it
2 things are missing in current allow-prefix list implementation.
In some usecase, need to tell the BGP neighbor and have different allow-prefix list for different neighbors, which is not supported.
for the prefix list, can't support flexible le and ge.
How I did it
To enhance the bgp allow-prefix list feature to have:
To include the neighbor type info for the allow-prefix list.
To support flexible le and ge length for allow-prefix list.
How to verify it
4 new unit test cases are added in this PR to cover changes.
Why I did it
src/dhcprelay is being split out to be its own submodule.
How I did it
Add existing dhcprelay commits into the new repo.
Clean up Makefile (sonic-net/sonic-dhcp-relay@772625f)
Add LGTM config (sonic-net/sonic-dhcp-relay@5cc0889)
Add Azure pipeline config (sonic-net/sonic-dhcp-relay@c79cdb7)
Add submodule reference, renaming most references of dhcp6relay to dhcprelay (to reflect that this will not just be for IPv6 in the future).
How to verify it
Successful run of LGTM is tested at sonic-net/sonic-dhcp-relay#4. Failure run of LGTM is tested at sonic-net/sonic-dhcp-relay#3.
Azure pipeline is run for each commit/PR, and will build for amd64, armhf, and arm64. UT/code coverage check is not yet done.
Signed-off-by: Saikrishna Arcot <sarcot@microsoft.com>
24f505148 [eloop.c]: Increase timeout of signal termination (#62)
2b2c1ad72 [driver_macsec_sonic.c]: Fixbug: a wrong db_wait in delete sa (#61)
Signed-off-by: Ze Gan <ganze718@gmail.com>
Why I did it
Fix CVE-2017-1000487 alert in thrift 0.14.1.
See https://nvd.nist.gov/vuln/detail/CVE-2017-1000487
How I did it
Change the version of org.codehaus.plexus:plexus-utils from 3.0.14 to 3.0.16.
Why I did it
The bgpcfgd doesn't support deletion of 'zebra set src', if an interface is deleted, the bgpcfgd will drop a warning message. In current implementation, we only care about the loopback0 interface but not others.
To improve the log print to have the key info, which will give the name of the deleted interface. We can ignore it if it is not the loopback0 interface. The application layer should be aware of that update and deletion is not supported, delete or update with a new address of loopback0 could cause issue, this log can give enough info to root cause the issue.
How I did it
How to verify it
Updating sonic-utilities sub module with the following commits
ca785a2 Remove sonic-db-cli
#### Why I did it
To fix sonic-db-cli high CPU usage on SONiC startup issue: https://github.com/Azure/sonic-buildimage/issues/10218
sonic-db-cli re-write with c++ and move to sonic-swss-common repo.
#### How I did it
#### How to verify it
#### Which release branch to backport (provide reason below if selected)
#### Description for the changelog
ca785a2 Remove sonic-db-cli
#### A picture of a cute animal (not mandatory but encouraged)
Co-authored-by: liuh-80 <azureuser@liuh-dev-vm-02.5fg3zjdzj2xezlx1yazx5oxkzd.hx.internal.cloudapp.net>
Why I did it
This PR is to backport PR #11056 and PR #11045 into master branch.
This PR is to enable tunnel_qos_remap on T1 and T0 in DualToR deployment.
On T1, we check the property DownstreamRedundancyTypes. On T0, we check the property RedundancyType.
tunnel_qos_remap is set to enabled if gemini is in DownstreamRedundancyTypes (on T1) or RedundancyType (on T0).
How I did it
The change is implemented in minigraph.py.
How to verify it
Verified by test_minigraph_case.py and 'test_j2files.py`.
Why I did it
ip command cannot update packet number if the cipher is XPN.
How I did it
Specify SSCI when update packet number and ignore SSCI value if update action.
Signed-off-by: Ze Gan <ganze718@gmail.com>
What I did:
Added Support for deployment_id parsing for Device Asic metadata.
Why I did:-
Deployment Id is used in BGP docker for FRR template generation. For multi-asic platforms running in namespace without deployment id as key in DEVICE_METADATA FRR template generation fails. This change is needed after this #10154 where if deployment_id is none we don't update DEVICE_METADA dictionary.
How I verify:-
Added unit-test.
- Why I did it
Support Mellanox-SN4600C-C64 as T1 switch in dual-ToR scenario
This is to port #11032 and #11299 from 202012 to master.
Support additional queue and PG in buffer templates, including both traditional and dynamic model
Support mapping DSCP 2/6 to lossless traffic in the QoS template.
Add macros to generate additional lossless PG in the dynamic model
Adjust the order in which the generic/dedicated (with additional lossless queues) macros are checked and called to generate buffer tables in common template buffers_config.j2
Buffer tables are rendered via using macros.
Both generic and dedicated macros are defined on our platform. Currently, the generic one is called as long as it is defined, which causes the generic one always being called on our platform. To avoid it, the dedicated macrio is checked and called first and then the generic ones.
Support MAP_PFC_PRIORITY_TO_PRIORITY_GROUP on ports with additional lossless queues.
On Mellanox-SN4600C-C64, buffer configuration for t1 is calculated as:
40 * 100G downlink ports with 4 lossless PGs/queues, 1 lossy PG, and 3 lossy queues
16 * 100G uplink ports with 2 lossless PGs/queues, 1 lossy PG, and 5 lossy queues
Signed-off-by: Stephen Sun <stephens@nvidia.com>
Signed-off-by: Neetha John <nejo@microsoft.com>
Why I did it
Improve throughput and latency for 7260 deployments
How I did it
Update the dynamic threshold to 0 and ECN settings as 2mb/10mb/5%
How to verify it
Updated unit tests to use the modified values for 7260 ecn settings.
This PR is a required for changing the L3 IP forwarding Behavior to SoC in active-active toplogy. Basically, for getting a packet to be forwarded to the SoC IP in active-active topology, the requirement is to use the the LoopBack 3 IP inside SONiC device as the SRC IP. This is required because in active-active topology by default if the ToR wants to send packet to the SoC, it would pick the Vlan IP since that's the IP in the subnet, but since there are firewalls inside the SoC , the IP packets with Vlan IP as src IP in the IP header will be dropped. Hence to overcome this limitation, there is an iptable nat rule that is installed inside the kernel, with which all the packets which have SoC IP as destination IP, use Loopnack 3 IP as src in IP header
How I did it
check the config DB if the ToR is a DualToR and has an SoC IP assigned.
put an iptable rule
iptables -t nat -A POSTROUTING --destination -j SNAT --to-source "
Signed-off-by: vaibhav-dahiya vdahiya@microsoft.com
Why I did it
This PR is to add a flag to control whether to generate PORT_QOS_MAP|global entry or not.
It's because for some HWSKU, such as BackEndToRRouter and BackEndLeafRouter, there is no DSCP_TO_TC_MAP defined.
Hence, if the PORT_QOS_MAP|global entry is generated, OA will report some error because the DSCP_TO_TC_MAP map AZURE can not be found.
Jul 14 00:24:40.286767 str2-7050qx-32s-acs-03 ERR swss#orchagent: :- saiObjectTypeQuery: invalid object id oid:0x7fddb43605d0
Jul 14 00:24:40.286767 str2-7050qx-32s-acs-03 ERR swss#orchagent: :- meta_generic_validation_objlist: SAI_SWITCH_ATTR_QOS_DSCP_TO_TC_MAP:SAI_ATTR_VALUE_TYPE_OBJECT_ID object on list [0] oid 0x7fddb43605d0 is not valid, returned null object id
Jul 14 00:24:40.286767 str2-7050qx-32s-acs-03 ERR swss#orchagent: :- applyDscpToTcMapToSwitch: Failed to apply DSCP_TO_TC QoS map to switch rv:-5
Jul 14 00:24:40.286767 str2-7050qx-32s-acs-03 ERR swss#orchagent: :- doTask: Failed to process QOS task, drop it
This PR is to address the issue.
How I did it
Add a flag require_global_dscp_to_tc_map to control whether to generate the PORT_QOS_MAP|global entry. The default value for require_global_dscp_to_tc_map is true. If the device type is storage backend, the value is changed to false. Then the PORT_QOS_MAP|global entry is not generated.
How to verify it
Update the current test_qos_dscp_remapping_render_template to cover storage backend.
Signed-off-by: Neetha John <nejo@microsoft.com>
Why I did it
There is a need to select different mmu profiles based on deployment type
How I did it
There will be separate subfolders (RDMA-CENTRIC, TCP-CENTRIC, BALANCED) in each hwsku folder which contains deployment specific mmu and qos settings. SonicQosProfile attribute in the minigraph will be used to determine which settings to use. If that attribute is not present, the default settings that exist in the hwsku folder will be used
Why I did it
Add infrastructure to support adding feature specific acls.
If feature specific ACLs has to be added:
if feature_name in self.feature_present and self.feature_present.get('feature_name'):
add_feature_specific_acls()
How I did it
Add function to get features present in feature table.
How to verify it
unit-test passes.
* [device]: Add SAI checksum verify to TD3 config
* A new config option was added to control the value of IPV4_INCR_CHECKSUM_ORIGINAL_VALUE_VERIFY in the EGR_FLEX_CONFIG control register (this prevents checksums of 0xffff from being propagated to other devices)
Why I did it
Fix the missing debian package for reproducible build issue.
The gnupg2 should be added into the version file.
https://dev.azure.com/mssonic/build/_build/results?buildId=118139&view=logs&j=88ce9a53-729c-5fa9-7b6e-3d98f2488e3f&t=8d99be27-49d0-54d0-99b1-cfc0d47f0318
The following packages have unmet dependencies:
gnupg2 : Depends: gnupg (>= 2.2.27-2+deb11u2) but 2.2.27-2+deb11u1 is to be installed
E: Unable to correct problems, you have held broken packages.
The issue was caused by the gnupg2 removed, and not detected.
sonic-buildimage/build_debian.sh
Line 250 in 4fb6cf0
sudo LANG=C chroot $FILESYSTEM_ROOT apt-get -y remove software-properties-common gnupg2 python3-gi
How I did it
Export the debian packages when any debian package being removed.
Signed-off-by: Neetha John nejo@microsoft.com
Why I did it
For storage backend, certain rules will be applied to the DATAACL table to allow only vlan tagged packets and drop untagged packets.
How I did it
Create DATAACL table if the device is a storage backend device
To avoid ACL resource issues, remove EVERFLOW related tables if the device is a storage backend device
How to verify it
Added the following unit tests
- verify that EVERFLOW acl tables is removed and DATAACL table is added for storage backend tor
- verify that no DATAACL tables are created and EVERFLOW tables exist for storage backend leaf
Why I did it
Storage backend has all vlan members tagged. If untagged packets are received on those links, they are accounted as RX_DROPS which can lead to false alarms in monitoring tools. Using this acl to hide these drops.
How I did it
Created a acl template which will be loaded during minigraph load for backend. This template will allow tagged vlan packets and dropped untagged
How to verify it
Unit tests
Signed-off-by: Neetha John <nejo@microsoft.com>
* [sflow + dropmon] added INCLUDE_SFLOW_DROPMON flag, added patches for hsflowd
*Added a capability of monitoring dropped packets for the sFlow daemon in order to improve network - monitoring, diagnostic, and troubleshooting. The drop monitor service allows the sFlow daemon to export another type of sample - dropped packets as Discard samples alongside Counter samples and Packet Flow samples.
Signed-off-by: Vadym Hlushko <vadymh@nvidia.com>
Why I did it
To further support parse out soc_ipv4 and soc_ipv6 out of Dpg:
<DeviceDataPlaneInfo>
<IPSecTunnels />
<LoopbackIPInterfaces xmlns:a="http://schemas.datacontract.org/2004/07/Microsoft.Search.Autopilot.Evolution">
<a:LoopbackIPInterface>
<ElementType>LoopbackInterface</ElementType>
<Name>HostIP</Name>
<AttachTo>Loopback0</AttachTo>
<a:Prefix xmlns:b="Microsoft.Search.Autopilot.NetMux">
<b:IPPrefix>10.10.10.2/32</b:IPPrefix>
</a:Prefix>
<a:PrefixStr>10.10.10.2/32</a:PrefixStr>
</a:LoopbackIPInterface>
<a:LoopbackIPInterface>
<ElementType>LoopbackInterface</ElementType>
<Name>HostIP1</Name>
<AttachTo>Loopback0</AttachTo>
<a:Prefix xmlns:b="Microsoft.Search.Autopilot.NetMux">
<b:IPPrefix>fe80::0002/128</b:IPPrefix>
</a:Prefix>
<a:PrefixStr>fe80::0002/128</a:PrefixStr>
</a:LoopbackIPInterface>
<a:LoopbackIPInterface>
<ElementType>LoopbackInterface</ElementType>
<Name>SoCHostIP0</Name>
<AttachTo>server2SOC</AttachTo>
<a:Prefix xmlns:b="Microsoft.Search.Autopilot.NetMux">
<b:IPPrefix>10.10.10.3/32</b:IPPrefix>
</a:Prefix>
<a:PrefixStr>10.10.10.3/32</a:PrefixStr>
</a:LoopbackIPInterface>
<a:LoopbackIPInterface>
<ElementType>LoopbackInterface</ElementType>
<Name>SoCHostIP1</Name>
<AttachTo>server2SOC</AttachTo>
<a:Prefix xmlns:b="Microsoft.Search.Autopilot.NetMux">
<b:IPPrefix>fe80::0003/128</b:IPPrefix>
</a:Prefix>
<a:PrefixStr>fe80::0003/128</a:PrefixStr>
</a:LoopbackIPInterface>
</LoopbackIPInterfaces>
</DeviceDataPlaneInfo>
Signed-off-by: Longxiang Lyu lolv@microsoft.com
How I did it
For servers loopback definitions in Dpg, if they contain LoopbackIPInterface with tags AttachTo, which has value of format like <server_name>SOC, the address will be regarded as a SoC IP, and sonic-cfggen now will treat the port connected to the server as active-active if the redundancy_type is either Libra or Mixed.
How to verify it
Pass the unittest.
Signed-off-by: Longxiang Lyu <lolv@microsoft.com>
swss:
* 6bafea4 2022-06-29 | [vnetorch] [vxlanorch] fix a set of memory usage issues (#2352) (HEAD -> 202205, github/202205) [Yakiv Huryk]
utilities:
* c64454c 2022-06-28 | [GCU] Moving UniqueLanes from only validating moves, to be a supplemental YANG validator (#2234) (HEAD -> 202205, github/202205) [Mohamed Ghoneim]
* fbd79d4 2022-06-29 | Add check to not allow deleting PO if its member of vlan (#2237) [anilkpan]
Signed-off-by: Ying Xie <ying.xie@microsoft.com>
- Why I did it
New security feature for enforcing strong passwords when login or changing passwords of existing users into the switch.
- How I did it
By using mainly Linux package named pam-cracklib that support the enforcement of user passwords, the daemon named hostcfgd, will support add/modify password policies that enforce and strengthen the user passwords.
- How to verify it
Manually Verification-
1. Enable the feature, using the new sonic-cli command passw-hardening or manually add the password hardening table like shown in HLD by using redis-cli command
2. Change password policies manually like in step 1.
Notes:
password hardening CLI can be found in sonic-utilities repo-
P.R: Add support for Password Hardening sonic-utilities#2121
code config path: config/plugins/sonic-passwh_yang.py
code show path: show/plugins/sonic-passwh_yang.py
3. Create a new user (using adduser command) or modify an existing password by using passwd command in the terminal. And it will now request a strong password instead of default linux policies.
Automatic Verification - Unitest:
This PR contained unitest that cover:
1. test default init values of the feature in PAM files
2. test all the types of classes policies supported by the feature in PAM files
3. test aging policy configuration in PAM files
Signed-off-by: bingwang <wang.bing@microsoft.com>
Why I did it
This PR brings two changes
Add lossy PG profile for PG2 and PG6 on T1 for ports between T1 and T2.
After PR Update qos config to clear queues for bounced back traffic #10176 , the DSCP_TO_TC_MAP and TC_TO_PG_MAP is updated when remapping is enable
DSCP_TO_TC_MAP
Before After Why do this change
"2" : "1" "2" : "2" Only change for leaf router to map DSCP 2 to TC 2 as TC 2 will be used for lossless TC
"6" : "1" "6" : "6" Only change for leaf router to map DSCP 6 to TC 6 as TC 6 will be used for lossless TC
TC_TO_PRIORITY_GROUP_MAP
Before After Why do this change
"2" : "0" "2" : "2" Only change for leaf router to map TC 2 to PG 2 as PG 2 will be used for lossless PG
"6" : "0" "6" : "6" Only change for leaf router to map TC 6 to PG 6 as PG 6 will be used for lossless PG
So, we have two new lossy PGs (2 and 6) for the T2 facing ports on T1, and two new lossless PGs (2 and 6) for the T0 facing port on T1.
However, there is no lossy PG profile for the T2 facing ports on T1. The lossless PGs for ports between T1 and T0 have been handled by buffermgrd .Therefore, We need to add lossy PG profiles for T2 facing ports on T1.
We don't have this issue on T0 because PG 2 and PG 6 are lossless PGs, and there is no lossy traffic mapped to PG 2 and PG 6
Map port level TC7 to PG0
Before the PCBB change, DSCP48 -> TC 6 -> PG 0.
After the PCBB change, DSCP48 -> TC 7 -> PG 7
Actually, we can map TC7 to PG0 to save a lossy PG.
How I did it
Update the qos and buffer template.
How to verify it
Verified by UT.
- Why I did it
While doing config reload, FEATURE table may be removed and re-add. During this process, updating FEATURE table is not atomic. It could be that the FEATURE table has entry, but each entry has no field. This PR introduces a retry mechanism to avoid this.
- How I did it
Introduces a retry mechanism to avoid this.
- How to verify it
New unit test added to verify the flow as well as running some manual test.
#### Why I did it
Support the following tables which were introduced during dynamic buffer calculation
- LOSSLESS_TRAFFIC_PATTERN
- DEFAULT_LOSSLESS_BUFFER_PARAMETER
#### How I did it
- LOSSLESS_TRAFFIC_PATTERN
|name|type|range|mandatory|description|
|---|---|---|---|---|
|mtu|uint16|64~10240|true|The maximum packet size of a lossless packet|
|small_packet_percentage|uint8|0~100|true|The percentage of small packet|
- DEFAULT_LOSSLESS_BUFFER_PARAMETER
|name|type|range|mandatory|description|
|---|---|---|---|---|
|default_dynamic_th|int8|-8~7|true|The default dynamic_th for all buffer profiles that are dynamically generated for lossless PG|
|over_subscribe_ratio|uint16|-|false|The oversubscribe ratio for shared headroom pool.|
|||||Semantically, the upper bound is the number of physical ports but it can not be represented in the yang module. So we keep the upper bound open. As the type is (signed) integer whose lower bound is 0 by nature, we do not need to specify the range.|
#### How to verify it
Run unit test
Why I did it
To address internal build failures where the cable len for some of the skus is set to 300m for all tiers.
How I did it
For the buffers test, generate a new output file based off the original expected output with CABLE_LENGTH table updated to use 300m. In the comparison logic, compare against each of the expected output files and if any matches, the testcase is set to pass
Signed-off-by: Neetha John <nejo@microsoft.com>
Why I did it
As part of PCBB changes, we need to enable 2 extra lossless queues. The changes in this PR are done to adjust only the reserved sizes on Th2 for the additional 2 lossless queues
Calculations are done based on 40 downlinks for T1 and 16 uplinks for dual ToR
How to verify it
Verified that the rendering works fine on Th2 dut
Unit tests have been updated to reflect the modified buffer sizes when pcbb is enabled. There are existing testcases that will test the original buffer sizes when pcbb is disabled. With these changes, was able to build sonic-config-engine wheel successfully
Signed-off-by: Neetha John <nejo@microsoft.com>
#### Why I did it
There might be a case where service checker periodic operation determined that specific container is running but when it tries to perform an operation on it, it was already closed by the user. This is a valid flow and we should not log an error message, informative warning is enough.
#### How I did it
I reduce log severity.
#### How to verify it
I verified it manually.
Signed-off-by: bingwang <bingwang@microsoft.com>
Why I did it
This PR is to add two extra lossless queues for bounced back traffic.
HLD sonic-net/SONiC#950
SKUs include
Arista-7050CX3-32S-C32
Arista-7050CX3-32S-D48C8
Arista-7260CX3-D108C8
Arista-7260CX3-C64
Arista-7260CX3-Q64
How I did it
Update the buffers.json.j2 template and buffers_config.j2 template to generate new BUFFER_QUEUE table.
For T1 devices, queue 2 and queue 6 are set as lossless queues on T0 facing ports.
For T0 devices, queue 2 and queue 6 are set as lossless queues on T1 facing ports.
Queue 7 is added as a new lossy queue as DSCP 48 is mapped to TC 7, and then mapped into Queue 7
How to verify it
Verified by UT
Verified by coping the new template and generate buffer config with sonic-cfggen
Why I did it
Recently the nightly testing pipeline found that the autorestart test case was failed when it was run against master image. The reason is Restart= field in each container's systemd configuration file was set to Restart=no even the value of auto_restart field in FEATURE table of CONFIG_DB is enabled.
This issue introduced by #10168 can be reproduced by the following steps:
Issues the config command to disable the auto-restart feature of a container
Runs command config reload or config reload minigraph to enable auto-restart of the container
Checks Restart= field in the container's systemd config file mentioned in step 1 by running the command
sudo systemctl cat <container_name>.service
Initially this PR (#10168) wants to revert the changes proposed by this: #8861. However, it did not fully revert all the changes.
How I did it
When hostcfgd started or was restarted, the Restart= field in each container's systemd configuration file should be initialized according to the value of auto_restart field in FEATURE table of CONFIG_DB.
How to verify it
I verified this change by running auto-restart test case against newly built master image and also ran the unittest:
The following commits are pushed
1f112b8 (HEAD -> 202205, origin/202205) [sonic-ycabled] fix grpc logic for timeout,cli HWSTATUS value retrival logic for active-active cable (#264)
Signed-off-by: vaibhav-dahiya vdahiya@microsoft.com
* [Tunnel PFC] Tests for adding property 'sai_remap_prio_on_tnl_egress'
Add tests for adding property 'sai_remap_prio_on_tnl_egress', this
property should only be added in dual tor environment.
Test done:
Run test test_j2files.py
Co-authored-by: richardyu <richardyu@contoso.com>
Including change:
* 7ff8f75 2022-06-03 | Revert "[portsorch]: Prevent LAG member configuration when port has active ACL binding (#2165)" (#2306) (HEAD -> 202205, github/202205) [bingwang-ms]
Signed-off-by: Ying Xie <ying.xie@microsoft.com>
- Why I did it
Yang Model about password hardening feature, the sonic CLI of this feature was autogenerated from this Yang model
- How I did it
Create new Yang model in src/sonic-yang-models/yang-models/sonic-passwh.yang.
- How to verify it
There are unitests(yang test) in this P.R covering all the passwords policies with good and bad values cases.
Or is possible manually using the config/show password commands that were autogenerated from this Yang model. (this CLI code added in sonic-utilities)
Signed-off-by: Neetha John <nejo@microsoft.com>
Why I did it
There was a typo in hwsku specified as part of #10889
How I did it
Replaced with the correct hwsku
How to verify it
test_cfggen.py is passing
Updating sonic-utilities sub module with the following commits
84dbd93 Clear the fvs vector before popping the message out of notification
a90b2b7 selectabletimer: add mutex to start() and stop()
7ae22be Fix SIGTERM can't terminate PubSub::listen issue
#### Why I did it
To fix hostcfgd can't terminate by sigint issue, need update sonic-swss-common submodule.
#### How I did it
#### How to verify it
#### Which release branch to backport (provide reason below if selected)
#### Description for the changelog
84dbd93 Clear the fvs vector before popping the message out of notification
a90b2b7 selectabletimer: add mutex to start() and stop()
7ae22be Fix SIGTERM can't terminate PubSub::listen issue
#### A picture of a cute animal (not mandatory but encouraged)
#### Why I did it
For yang model, sample_config_db.json file was missing Sample data for the features SNAT/DNAT/IPMC
#### How I did it
Added the SNAT,DNAT,IPMC(low Threshold/high threshold/threshold_type)entries in CRM table.
#### How to verify it
With sanity Build/test only.
[muxorch] Handling optional attributes in muxorch (#2288)
Update netlink messages handler (#2233)
Broadcast Unknown-multicast and Unknown-unicast Storm-control (#1306)
[vstest]: Increase PollingConfig default timeout (#2285)
[FDB] Fix fbdorch to properly handle syncd FDB FLUSH Notif (#2254)
[macsecorch]: Support for non-default sa per sc (#2250)
Migrating the NAT vs tests from Click to direct DB access (#2278)
[neighsync] Ignoring IPv4 link local addresses (#2260)
[IntfMgrd] Retry adding ipv6 prefix by setting disabled_ipv6 flag (#2267)
Increase Redis Timeout value for Switch Create Opration for Packet (#2243)
Update fdborch.cpp (#2261)
Signed-off-by: dprital <drorp@nvidia.com>
- Why I did it
With SAI V.1.10.2, new interface types CR8/SR8/KR8/LR8 have been introduced, we should also support them from the CLI configuration.
- How I did it
Add new enum for the new interface types
- How to verify it
Run the "config interface type" command to verify new interface types can be accepted and handled correctly.
Signed-off-by: Kebo Liu <kebol@nvidia.com>
Why I did it
It is to improve the build performance, when building multiple targets.
The modified time of downloaded files should be not older than the file .platform.
If not, the file will be downloaded again, when building any dependent targets.
How I did it
When downloading the packages from web site, the modified time will be changed by the command "touch".
#### Why I did it
To ensure that some internal testcases do not break due to external changes
#### How to verify it
Ran test_cfggen.py with the changes and it passed
- Why I did it
YANG schema is missing for sonic-telemetry
- How I did it
Added YANG schema to sonic-yang-models and appropriate unit tests inside of test and test_config
- How to verify it
Build sonic-yang-models python wheels target and verify that unit tests are passing
Why I did it
Upgrade FRR to version 8.2.2. Build libyang2 required by FRR.
How I did it
Update FRR version and tag.
How to verify it
Following tests were performed on sonic-vs:
BGP docker status check
BGP configuration and session establishment
Route redistribution and ping
Issued show commands to check the bgp neighbor and routes
Checked app-db to ensure bgp routes are installed with correct interface and nexthop.
Create VRF and check FRR knows the VRF
Check VRF routes are installed in app-db with correct Vrf name and next-hop
Establish BGP Evpn session and check if Evpn routes (multicast, mac, prefix) are exchanged and installed correctly in app-db.
Signed-off-by: Arvindsrinivasan Lakshmi Narasimhan arlakshm@microsoft.com
Why I did it
resolves#10761.
For VOQ chassis, the Recirc port, which was added for the Everflow, stays admin down after load minigraph.
This PR add the fix to make the recirc port as admin up
How I did it
The PR adds a change in minigraph.py, if port has role as Rec make the the port as admin-status up.
How to verify it
UT
Signed-off-by: Arvindsrinivasan Lakshmi Narasimhan <arlakshm@microsoft.com>
Why I did it
To further add cable_type and soc_ipv4 field to table MUX_CABLE, this PR tries to parse the minigraph like the following:
```
<Device i:type="SmartCable">
<ElementType>SmartCable</ElementType>
<SubType>active-active</SubType>
<Address xmlns:d5p1="Microsoft.Search.Autopilot.NetMux">
<d5p1:IPPrefix>192.168.0.3/21</d5p1:IPPrefix>
</Address>
<AddressV6 xmlns:d5p1="Microsoft.Search.Autopilot.NetMux">
<d5p1:IPPrefix>::/0</d5p1:IPPrefix>
</AddressV6>
<ManagementAddress xmlns:d5p1="Microsoft.Search.Autopilot.NetMux">
<d5p1:IPPrefix>0.0.0.0/0</d5p1:IPPrefix>
</ManagementAddress>
<ManagementAddressV6 xmlns:d5p1="Microsoft.Search.Autopilot.NetMux">
<d5p1:IPPrefix>::/0</d5p1:IPPrefix>
</ManagementAddressV6>
<SerialNumber i:nil="true" />
<Hostname>svcstr-7050-acs-1-Servers0-SC</Hostname>
</Device>
<Device i:type="Server">
<ElementType>Server</ElementType>
<Address xmlns:d5p1="Microsoft.Search.Autopilot.NetMux">
<d5p1:IPPrefix>192.168.0.2/21</d5p1:IPPrefix>
</Address>
<AddressV6 xmlns:d5p1="Microsoft.Search.Autopilot.NetMux">
<d5p1:IPPrefix>fc02:1000::2/64</d5p1:IPPrefix>
</AddressV6>
<ManagementAddress xmlns:d5p1="Microsoft.Search.Autopilot.NetMux">
<d5p1:IPPrefix>0.0.0.0/0</d5p1:IPPrefix>
</ManagementAddress>
<Hostname>Servers0</Hostname>
</Device>
```
Signed-off-by: Longxiang Lyu lolv@microsoft.com
How I did it
get_mux_cable_entries will try to get the mux cable device from the devices list and get the cable type and soc ip address from the device definition.
How to verify it
Pass the unit-test
Why I did it
At present, there is no mechanism in an event driven model to know that the system is up with all the essential sonic services and also, all the docker apps are ready along with port ready status to start the network traffic. With the asynchronous architecture of SONiC, we will not be able to verify if the config has been applied all the way down to the HW. But we can get the closest up status of each app and arrive at the system readiness.
How I did it
A new python based system monitor tool is introduced under system-health framework to monitor all the essential system host services including docker wrapper services on an event based model and declare the system is ready. This framework gives provision for docker apps to notify its closest up status. CLIs are provided to fetch the current system status and also service running status and its app ready status along with failure reason if any.
How to verify it
"show system-health sysready-status" click CLI
Syslogs for system ready
What I did:
Added support to create route-map action set tag <user define value>
when the the allow prefix list matches. The tag can ben define by user in
constants.yml.
Why I did:
Since for Allow List feature we call from base route-map allow-list route-map having set tag option provides way for base route-map to do match tag and take any further action if needed. Adding tag provide metadata that can used by base route-map
Why I did it
https://github.com/Azure/SONiC/blob/master/doc/vxlan/Overlay%20ECMP%20with%20BFD.md
From the design, need to advertise the route with community string, the PR is to implement this.
How I did it
To use the route-map as the profile for the community string, all advertised routes can be associated with one route-map.
Add one file, mangers_rm.py, which is to add/update/del the route-map. Modified the managers_advertise_rt.py file to associate profile with IP route.
The route-map usage is very flexible, by this PR, we only support one fixed usage to add community string for route to simplify this design.
How to verify it
Implement new unit tests for mangers_rm.py and updated unit test for managers_advertise_rt.py.
Manually verified the test case in the test plan section, will add testcase in sonic-mgmt later. Azure/sonic-mgmt#5581
Why I did it
Config db schema generated by minigraph should run yang validation.
How I did it
Modify run_script to add yang validation.
How to verify it
Run sonic-config-engine unit test.
Signed-off-by: Gang Lv ganglv@microsoft.com
This is part of HLD Azure/SONiC#925
#### Why I did it
Add link-training support
#### How I did it
Update SONiC YANG for port link-training support
#### Description for the changelog
Add "link_training" to sonic-port.yang
#### Link to config_db schema for YANG module changes
https://github.com/sonic-net/SONiC/wiki/Configuration#port
Why I did it
Previous subport unit tests uses port channel names like PortChannel01, so for subport name generated PortChannel01.10, it exceeds Linux network interface name 15 char limit.
Signed-off-by: Longxiang Lyu lolv@microsoft.com
How I did it
Modify PortChannel01 to PortChannel1.
Why I did it
Fixes#10793
How I did it
Removed the switch_type validation from the Yang model.
How to verify it
compile sonic_yang_mgmt-1.0-py3-none-any.whl and sonic_yang_mgmt-1.0-py3-none-any.whl
Signed-off-by: Arvindsrinivasan Lakshmi Narasimhan <arlakshm@microsoft.com>
288c2d8 Revert "[scripts/fast-reboot] Shutdown remaining containers through systemd (#2133)" (#2161)
bce4694 [autoneg] add support for remote speed advertisement (#2124)
a73f156 [show][vrf]Fixing show vrf to include vlan subinterface (#2158)
7a06457 [auto_ts] Enable register/de-register auto_ts config for APP Extension (#2139)
083ebcc Add transceiver-info items advertised for cmis-supported moddules (#2135)
0811214 Validate destination port is not LAG (#2053)
6ab1c51 [minigraph] Consume golden_config_db.json while loading minigraph (#2140)
c37a957 [Kdump] Remove the duplicate logic if Kdump was disabled (#2128)
1143869 Ordering fix for sfpshow eeprom (#2113)
fdb79b8 Allow fw update for other boot type against on the previous "none" boot fw update (#2040)
a54a091 [GCU] Supressing YANG errors from libyang while sorting (#1991)
fbfa8bc [GCU] Enabling AddRack and adding RemoveRack tests (#2143)
d012be9 [Command-Reference] Add CLI docs for route flow counter (#2069)
8c07d59 [Mellanox] [reboot] [asan] stop asan-enabled containers on reboot (#2107)
697aae3 Fix speed parsing when speed is NOT fetched from APPL_DB (#2138)
22a388b [show] fix get routing stack routine (#2137)
cb3a047 Support option --ports of config qos reload for reloading ports' QoS and buffer configuration to default (#2125)
154a801 Enhance "config interface type/advertised-type" to be blocked on RJ45 ports (#2112)
3732ac5 Add CLI for route flow counter feature (#2031)
29771e7 [techsupport] improve robustness (#2117)
f9dc681 [intfutil] Display RJ45 port and portchannel speed in 'M' instead of 'G' when it's <= 1000M (#2110)
781ae9f [config] Do not enable pfcwd for BmcMgmtToRRouter (#2136)
23e9398 [scripts/fast-reboot] Shutdown remaining containers through systemd (#2133)
576c9ef [scripts/fast-reboot] stop timers in advance (#2131)
4dad79c bugfix: incorrect command for portchannel creation (#2134)
c17b1f4 [show][muxcable] Decrease the timeout for show mux status/hwmode (#2130)
49d61f8 [scripts/fast-reboot] cleanup (#2132)
52ca324 [config/config_mgmt.py]: Fix dpb issue with upper case mac in (#2066)
9e2fbf4 Update db_migrator to support `pfcwd_sw_enable` (#2087)
4010bd0 FGNHG CLI changes (#1588)
6bd54d0 Fix 'show mac' output when FDB entry for default vlan is None instead of 1 (#2126)
Signed-off-by: Ze Gan <ganze718@gmail.com>
#### Why I did it
The SSCI is wrong in the output of MACsec so that the virtual SAI cannot parse the output corretly.
The wrong output:
```
142: macsec_eth1: protect on validate strict sc off sa off encrypt on send_sci on end_station off scb off replay off
cipher suite: GCM-AES-XPN-256, using ICV length 16
TXSC: 5254008f4f1c0001 on SA 0
0: PN 103, state on, key 12cbc4b64e26c9a1ba14d810da20d16e
SSCI 33554432, RXSC: 525400edac5b0001, state on
0: PN 107, state on, key 12cbc4b64e26c9a1ba14d810da20d16e
offload: off
```
Expected
```
142: macsec_eth1: protect on validate strict sc off sa off encrypt on send_sci on end_station off scb off replay off
cipher suite: GCM-AES-XPN-256, using ICV length 16
TXSC: 5254008f4f1c0001 on SA 0
0: PN 252, state on, SSCI 33554432, key 12cbc4b64e26c9a1ba14d810da20d16e
RXSC: 525400edac5b0001, state on
0: PN 264, state on, key 12cbc4b64e26c9a1ba14d810da20d16e
```
#### How I did it
Move SSCI before the key so that SSCI will not be the front of SC information.
#### Why I did it
To pick up new commits:
* 60d2467 Add depends to p4rt debian package
#### How I did it
update sonic-p4rt/sonic-pins submodule pointer
#### How to verify it
should be able to build with p4rt enabled.
#### Why I did it
This function is critical for is_multi_asic() and SonicDBConfig initializing. No explicit reading ConfigDB. Otherwise it will implicitly trigger SonicDBConfig initializing.
#### How I did it
1. No explicit reading ConfigDB in get_asic_conf_file_path()
2. Collect asic_conf_path_candidates lazily to prevent any unnecessary side effect and improve the performance
Signed-off-by: Arvindsrinivasan Lakshmi Narasimhan <arlakshm@microsoft.com>
Why I did it
submodule update for the following commits
7a203b1 [chassis] Add new tables in counter db for Voq counter support. (#530)
5effea3 add new table schema for bgp profile (#608)
130dca5 [ci] Update azure pipeline branch variable reference.
708ed39 [ci] Parameterize pipeline and improve azure pipeline (#599)
9c08456 Added new P4RT tables. (#604)
#### Why I did it
Fix issue: Non compliant leaf list in config_db schema: https://github.com/Azure/sonic-buildimage/issues/9801
#### How I did it
The basic flow of DPB is like:
1. Transfer config db json value to YANG json value, name it “yangIn”
2. Validate “yangIn” by libyang
3. Generate a YANG json value to represent the target configuration, name it “yangTarget”
4. Do diff between “yangIn” and “yangTarget”
5. Apply the diff to CONFIG DB json and save it back to DB
The fix:
• For step #1, If value of a leaf-list field string type, transfer it to a list by splitting it with “,” the purpose here is to make step#2 happy. We also need to save <table_name>.<key>.<field_name> to a set named “leaf_list_with_string_value_set”.
• For step#5, loop “leaf_list_with_string_value_set” and change those fields back to a string.
#### How to verify it
1. Manual test
2. Changed sample config DB and unit test passed
Signed-off-by: Neetha John nejo@microsoft.com
Why I did it
Address build failures due to sonic config engine unit tests failing. Failures are due to referencing format used in Arista 7800 sample output for buffer template
How I did it
Remove referencing format
How to verify it
Sonic config engine wheel should be built successfully
Signed-off-by: Arvindsrinivasan Lakshmi Narasimhan arlakshm@microsoft.com
Why I did it
Fixes#10158
How I did it
Add yang model for config_db table BGP_VOQ_CHASSIS_NEIGHBOR and UT
closes#10157
Why I did it
Add yang model for the bgp_internal_neighbor table in config_db
How I did it
Add new yang model file and unit tests
How to verify it
UT and compile sonic_yang_models-1.0-py3-none-any.whl and sonic_yang_mgmt-1.0-py3-none-any.whl
Signed-off-by: Arvindsrinivasan Lakshmi Narasimhan <arlakshm@microsoft.com>
- Why I did it
To add support for 800G speed for port in the yang.
- How I did it
Change limitation from 400G to 800G.
- How to verify it
Set a port speed to 800G and run the yang DB validation. e.g. by using dynamic port breakout.
#### Why I did it
Need to pass LY_CTX_DISABLE_SEARCHDIR_CWD to Context in order to disable automatically searching for schemas in current working directory (which is by default searched automatically)
#### How I did it
add additional attribute into YANG context
#### How to verify it
Create some invalid link on switch :
1) **ln -s /usr/abc xxx**
2) run **spm list**
--> There should not be these messages:
```
libyang[1]: Unable to get information about "xxx" file in "/tmp" when searching for (sub)modules (No such file or directory)
libyang[1]: Unable to get information about "xxx" file in "/tmp" when searching for (sub)modules (No such file or directory)
libyang[1]: Unable to get information about "xxx" file in "/tmp" when searching for (sub)modules (No such file or directory)
libyang[1]: Unable to get information about "xxx" file in "/tmp" when searching for (sub)modules (No such file or directory)
```
Add the following commits:
- [orchagent, crm]: Reset crm threshold exceed count when threshold type changed 5ba6a54786c0fd9b155bb9ea2a7ed724a58aab74
- [pbh] [aclorch] Fixed a bug causes by updating the flow-counter value for the PBH rule 841f00389b338e91ddc4de460ace4ff96adfa796
- [ACL]Avoid incrementing crm count when ACL rule create fails 3d3364f9715fa05fbdf2d09b08676c3055903b84
- set remote vtep the netdev down before delete 7f53db782aed2973f4ff6807911b5a549461f3c7
- Removing Vnet with scope default 2ea8581da4ba6f97bebde4845a234d7c810e5515
#### Why I did it
Adding exceptlionList to validation exception
#### How I did it
Check code.
#### How to verify it
Ran manually.
- Run full config validation from a KVM
- Print the thrown exception
**Before**
```
Error: Data Loading Failed
All Keys are not parsed in FEATURE
dict_keys(['telemetry'])
```
**After**
```
Error: Data Loading Failed
All Keys are not parsed in FEATURE
dict_keys(['telemetry'])
exceptionList:["'status'"]
```
#### Which release branch to backport (provide reason below if selected)
<!--
- Note we only backport fixes to a release branch, *not* features!
- Please also provide a reason for the backporting below.
- e.g.
- [x] 202006
-->
- [ ] 201811
- [ ] 201911
- [ ] 202006
- [ ] 202012
- [ ] 202106
- [ ] 202111
#### Description for the changelog
<!--
Write a short (one line) summary that describes the changes in this
pull request for inclusion in the changelog:
-->
#### Link to config_db schema for YANG module changes
<!--
Provide a link to config_db schema for the table for which YANG model
is defined
Link should point to correct section on https://github.com/Azure/SONiC/wiki/Configuration.
-->
#### A picture of a cute animal (not mandatory but encouraged)
Why I did it
Can not start sonic-hostservice
How I did it
Install python3-dbus and systemd-python, and replace invalid path
How to verify it
Start the service with below commands:
sudo systemctl start sonic-hostservice
sudo systemctl status sonic-hostservice
Signed-off-by: Gang Lv ganglv@microsoft.com
Why I did it
Migrate ptftests script to python3, in order to do an incremental migration, add python virtual environment firstly, install all required python packages in virtual env as well.
Then migrate ptftests scripts from python2 to python3 one by one avoid impacting non-changed scripts.
Signed-off-by: Zhaohui Sun zhaohuisun@microsoft.com
How I did it
Add python3 virtual environment for docker-ptf.
Add submodule ptf-py3 and install patched ptf 0.9.3 into virtual environment as well, two ptf issues were reported here:
p4lang/ptf#173p4lang/ptf#174
Signed-off-by: Zhaohui Sun <zhaohuisun@microsoft.com>
Why I did it
Allow portchannel vlan sub intf long name format as long as it follows Linux interface name length limit(<16).
How I did it
Modify the leaf name check.
How to verify it
Test case passes.
Why I did it
Provide fix for comment: https://github.com/Azure/sonic-buildimage/pull/10475/files#r847753187;
How I did it
Try exception is not required in this scenario, so remove and modify to initial db config according to single or multi-asic platforms.
How to verify it
Verified on multi-asic device.
* [CG-Fix-CVE-2021-44906] Patching on thrift.0.14.1 for package minimist
Signed-off-by: richardyu-ms <richard.yu@microsoft.com>
* add more information in patch
Signed-off-by: richardyu-ms <richard.yu@microsoft.com>
* Update 0003-Remove-minimist-packages.patch
* change the thrift 0.14.1 to package download
Signed-off-by: richardyu-ms <richard.yu@microsoft.com>
* use the series file for patching
* fix a code defect
#### Why I did it
Fix several bugs:
1. If one vlan member belongs to multiple vlans, and if any of the vlans is "Tagged" type, we respect the tagged type
2. If one vlan member belongs to multiple vlans, and all of the vlans have no "Tagged" type, we override it to be a tagged member
3. make sure `vlantype_name` is assigned correctly in each iteration
#### How to verify it
1. Test the command line to parse a minigraph and make sure the output does not change.
```
./sonic-cfggen -m minigraph.mlnx20.xml
```
The minigraph is for HwSKU Mellanox-SN2700-D40C8S8.
2. Test on a DUT with HwSKU Mellanox-SN2700-D40C8S8
```
sudo config load_minigraph
show vlan brief
```
Checked the "Port Tagging" column in the output.
* [build]: Patch debootstrap to not unmount the host's /proc filesystem
Currently, when the final image is being built (sonic-vs.img.gz,
sonic-broadcom.bin, or similar), each invocation of sudo in the
build_debian.sh script takes 0.8 seconds to run and execute the actual
command. This is because the /proc filesystem in the slave container has
been unmounted somehow. This is happening when debootstrap is running,
and it incorrectly unmounts the host's (in our case, the slave
container's) /proc filesystem because in the new image being built,
/proc is a symlink to the host's (the slave container's) /proc. Because
of that, /proc is gone, and each invocation of sudo adds 0.8 seconds
overhead. As a side effect, docker exec into the slave container during
this time will fail, because /proc/self/fd doesn't exist anymore, and
docker exec assumes that that exists.
Debootstrap has fixed this in 1.0.124 and newer, so backport the patch
that fixes this into the version that Bullseye has.
Signed-off-by: Saikrishna Arcot <sarcot@microsoft.com>
* [build_debian.sh]: Use eatmydata to speed up deb package installations
During package installations, dpkg calls fsync multiples times (for each
package) to ensure that tht efiles are written to disk, so that if
there's some system crash during package installation, then it is in at
least a somewhat recoverable state. For our use case though, we're
installing packages in a chroot in fsroot-* from a slave container and
then packaging it into an image. If there were a system crash (or even
if docker crashed), the fsroot-* directory would first be removed, and
the process would get restarted. This means that the fsync calls aren't
really needed for our use case.
The eatmydata package includes a library that will block/suppress the
use of fsync (and similar) system calls from applications and will
instead just return success, so that the application is not blocked on
disk writes, which can instead happen in the background instead as
necessary. If dpkg is run with this library, then the fsync calls that
it does will have no effect.
Therefore, install the eatmydata package at the beginning of
build_debian.sh and have dpkg be run under eatmydata for almost all
package installations/removals. At the end of the installation, remove
it, so that the final image uses dpkg as normal.
In my testing, this saves about 2-3 minutes from the image build time.
Signed-off-by: Saikrishna Arcot <sarcot@microsoft.com>
* Change ln syntax to use chroot
Signed-off-by: Saikrishna Arcot <sarcot@microsoft.com>
9ac12bf (HEAD -> master, origin/master, origin/HEAD) Fix platform daemon chassisd to handle auto restart on fail (#247)
24fba04 [ycable] fix the logic to update cable_info values when ycable is not present; fix read side logic for ycable (#249)
Updating sonic-utilities sub module with the following commits
f09bd31 Fix UT failed cause by change pycommon to use swsscommon
c092300 Increased pcied unit test coverage to > 80%
7d7c85e Modular chassis: Psud set master led on first run
7195dcc Remove py2 from pipeline
c2e7393 [ycabled] increase UT coverage of ycabled daemon
#### Why I did it
When change pycommon to use swsscommon UT failed in sonic-platform-daemon, need submodule update with UT issue fix.
#### How I did it
#### How to verify it
#### Which release branch to backport (provide reason below if selected)
#### Description for the changelog
Fix UT failed cause by change pycommon to use swsscommon
Increased pcied unit test coverage to > 80%
Modular chassis: Psud set master led on first run
Remove py2 from pipeline
[ycabled] increase UT coverage of ycabled daemon
#### A picture of a cute animal (not mandatory but encouraged)
Why I did it
[Build]: Fix pip version constraint conflict issue
When a version is specified in the constraint file, if upgrading the version in build script, it will have conflict issue.
How I did it
If a specified version has specified in pip command line, then the version constraint will be skipped.
* [device config] Adding configuration for default route fallback
* Set sai_tunnel_underlay_route_mode attribute to fallback to default route if more specific route is unavailable.
Why I did it
Config db schema generated by minigraph can’t pass yang validation, PORT table does not have 'lanes' and 'speed' field.
How I did it
Make cfggen command fail when 'lanes' and 'speed' are not provided
How to verify it
Run 'sonic-cfggen -m xxx.xml --print-data' to make sure command fail when 'lanes' and 'speed' not in PORT table
Why I did it
minigraph parser has introduced new type.
How I did it
Update yang models to support BmcMgmtToRRouter.
How to verify it
Run unit test for sonic-yang-models
Signed-off-by: Gang Lv ganglv@microsoft.com
#### Why I did it
As of https://github.com/Azure/sonic-swss-common/pull/587 the blackout issue in ConfigDBConnector has been resolved.
In the past hostcfgd was refactored to use SubscriberStateTable instead of ConfigDBConnector for subscribing to CONFIG_DB updates due to a "blackout" period between hostcfgd pulling the table data down and running the initialization and actually calling `listen()` on ConfigDBConnector which starts the update handler.
However SusbscriberStateTable creates many file descriptors against the redis DB which is inefficient compared to ConfigDBConnector which only opens a single file descriptor.
With the new fix to ConfigDBConnector I refactored hostcfgd to take advantage of these updates.
#### How I did it
Replaced SubscriberStateTable with ConfigDBConnector
#### How to verify it
The functionality of hostcfgd can be verified by booting the switch and verifying that NTP is properly configured.
To check the blackout period you can add a delay in the hostcfgd `load()` function and also add a print statement before and after the load so you know when it occurs. Then restart hostcfgd and wait for the load to start, then during the load push a partial change to the FEATURE table and verify that the change is picked up and the feature is enabled after the load period finishes.
#### Description for the changelog
[hostcfgd] Move hostcfgd back to ConfigDBConnector for subscribing to updates
Why I did it
Running warm-reboot in a loop for 500 times leads to this error on 318-th iteration:
Apr 2 15:56:27.346747 sonic INFO swss#/supervisord: restore_neighbors Traceback (most recent call last):
Apr 2 15:56:27.346747 sonic INFO swss#/supervisord: restore_neighbors File "/usr/bin/restore_neighbors.py", line 24, in <module>
Apr 2 15:56:27.346747 sonic INFO swss#/supervisord: restore_neighbors from scapy.all import conf, in6_getnsma, inet_pton, inet_ntop, in6_getnsmac, get_if_hwaddr, Ether, ARP, IPv6, ICMPv6ND_NS, ICMPv6NDOptSrcLLAddr
Apr 2 15:56:27.346795 sonic INFO swss#/supervisord: restore_neighbors File "/usr/local/lib/python3.7/dist-packages/scapy/all.py", line 25, in <module>
Apr 2 15:56:27.346956 sonic INFO swss#/supervisord: restore_neighbors from scapy.route import *
Apr 2 15:56:27.346995 sonic INFO swss#/supervisord: restore_neighbors File "/usr/local/lib/python3.7/dist-packages/scapy/route.py", line 205, in <module>
Apr 2 15:56:27.347089 sonic INFO swss#/supervisord: restore_neighbors conf.iface = get_working_if()
Apr 2 15:56:27.347129 sonic INFO swss#/supervisord: restore_neighbors File "/usr/local/lib/python3.7/dist-packages/scapy/arch/linux.py", line 128, in get_working_if
Apr 2 15:56:27.347213 sonic INFO swss#/supervisord: restore_neighbors ifflags = struct.unpack("16xH14x", get_if(i, SIOCGIFFLAGS))[0]
Apr 2 15:56:27.347250 sonic INFO swss#/supervisord: restore_neighbors File "/usr/local/lib/python3.7/dist-packages/scapy/arch/common.py", line 31, in get_if
Apr 2 15:56:27.347345 sonic INFO swss#/supervisord: restore_neighbors return ioctl(sck, cmd, struct.pack("16s16x", iff.encode("utf8")))
Apr 2 15:56:27.347365 sonic INFO swss#/supervisord: restore_neighbors OSError: [Errno 19] No such device
The issue was reported to scapy devs secdev/scapy#3369, the fix is secdev/scapy#3371, however there is no released scapy version with this fix right now, thus decided to build scapy v2.4.5 from sources and apply the fix in a form of a patch.
Signed-off-by: Stepan Blyschak <stepanb@nvidia.com>
In order to include the following commit:
0f06910 [PBH] Implement Edit Flows (Azure/sonic-swss#2169)
sonic-swss
50d5be2 Make changes to support compiling on Bullseye with GCC 10 (#2216)
0870cf5 [mirrororch]: Implement HW resources availability validation for SPAN/ERSPAN (#2187)
f4ec565 [vlanmgrd] fix use-after-free memory issue (#2211)
c2de7fc [QosOrch] The notifications cannot be drained in QosOrch in case the first one needs to retry (#2206)
5575935 [neighsyncd] increase neighsyncd timeout (#2209)
0f06910 [PBH] Implement Edit Flows (#2169)
6241bbf Remove redundant and problematic code to skip "pool" field in buffer profile handling (#2197)
a55343c [azp]: Set diff coverage threshhold to 80% (#2188)
390cae1 [portsorch]: Prevent LAG member configuration when port has active ACL binding (#2165)
c1d47e6 [VNET]Fixing nexthop group delete during route change (#2198)
8941cc0 [BFD]Registering BFD state change callback during session creation (#2202)
680c539 [vxlan] Remove tunnel map objects on VNET tunnel removal (#2150)
20dde0c Fix for handling broadcom DNX ASIC to have ipv4 and ipv6 ACL rules in separate tables. (#2178)
5b7c949 [FdbOrch] SAI_FDB_EVENT_MOVE generates update with empty update.entry.port_name (#2200)
7350d49 [Vxlanmgr] vnet netdev cleanup during config reload fix (#2191)
2bef62b Validate LAG has members before mirror session create (#2130)
1e4d4ce [VS test] Increase VS test time, skip dpb flaky test (#2195)
6eda965 [vstest]Migrating vs tests from using click commands to direct DB access (#2179)
Signed-off-by: Nazarii Hnydyn <nazariig@nvidia.com>
Why I did it
Need to run yang validation for sonic-cfggen unit test, and many unit test does not provide speed for port table.
How I did it
Update minigraph xml.
How to verify it
Run sonic-cfggen unit test.
Signed-off-by: Gang Lv ganglv@microsoft.com
Why I did it
Fix#9746
How I did it
Split the check condition based on non-exist and zero length.
How to verify it
Run verification script when table contains empty value
890f32f LLDPLocalSystemDataUpdater Exception Log Handled (#249)
2151731 Handle error seen on system where vlan interface map is not present (#246)
c6141c7 [build] use Azure.sonic-buildimage.official.vs pipeline as artifact source (#248)
Signed-off-by: Vivek Reddy Karri <vkarri@nvidia.com>