Commit Graph

5560 Commits

Author SHA1 Message Date
Guohan Lu
693bc5faae Revert "[qos]: Adjust 7260 buffer sizes to accomodate extra lossless queues (#11018)"
This reverts commit 21f14dc6ea.

unit test needs to be cherry-picked.
2022-06-06 06:30:24 -07:00
Neetha John
21f14dc6ea [qos]: Adjust 7260 buffer sizes to accomodate extra lossless queues (#11018)
Why I did it
As part of PCBB changes, we need to enable 2 extra lossless queues. The changes in this PR are done to adjust only the reserved sizes on Th2 for the additional 2 lossless queues
Calculations are done based on 40 downlinks for T1 and 16 uplinks for dual ToR

How to verify it
Verified that the rendering works fine on Th2 dut
Unit tests have been updated to reflect the modified buffer sizes when pcbb is enabled. Existing test cases still cover the original buffer sizes when pcbb is disabled. With these changes, I was able to build the sonic-config-engine wheel successfully.

Signed-off-by: Neetha John <nejo@microsoft.com>
2022-06-05 22:23:13 -07:00
Richard.Yu
f555a4a0a0 [Tunnel PFC] Add property for tunnel PFC (#10962)
* [Tunnel PFC] Add property for tunnel PFC

Replace the config.bcm file with j2 template file
- Add 'sai_remap_prio_on_tnl_egress=1' property when device metadata local host subtype is 'dualtor'
- Change sai.profile for the new config.bcm.j2
2022-06-05 22:02:19 -07:00
bingwang-ms
e159998657
[202012][cherry-pick] Add two extra lossless queues for bounced back traffic (#10715)
* Add extra lossless queues

Signed-off-by: bingwang <bingwang@microsoft.com>
2022-06-04 19:25:02 +08:00
bingwang-ms
eda95b8caa
[sonic-sairedis]: update submodule sonic-sairedis (#10982)
544e4c2 (origin/202012) [ci] Paralize azure pipeline (#1044)
45b310d Enable SAI_SWITCH_ATTR_UNINIT_DATA_PLANE_ON_REMOVAL attribute (#1016)
9e56674 Advance submodule SAI for 202012 branch (#1032)

Signed-off-by: bingwang <wang.bing@microsoft.com>
2022-06-03 23:33:28 -07:00
Arun Saravanan Balachandran
8981ae5cae
[202012][cherry-pick] DellEMC: Z9332f - Component API Fixes (#10997)
2022-06-03 10:27:39 -07:00
Saikrishna Arcot
3289598ea4
Split kvmtest t0 job into two jobs and run in parallel (#10044) (#10858)
Why I did it
PR Azure/sonic-mgmt#4861 introduced 2 sub jobs for the kvmtest t0 job in the sonic-mgmt repo.
But in the sonic-buildimage repo, because the section parameter is null, the kvmtest t0 job always runs only the part 2 test scripts.
It misses the part 1 test scripts in kvmtest.sh.

How I did it
Split the kvmtest t0 job into two sub jobs, as in the sonic-mgmt repo, and run them in parallel to save time.

How to verify it
Submitting a PR triggers the pipeline run.

Signed-off-by: Zhaohui Sun <zhaohuisun@microsoft.com>
(cherry picked from commit 44028723ef)

Co-authored-by: Zhaohui Sun <94606222+ZhaohuiS@users.noreply.github.com>
2022-06-03 09:39:30 -07:00
Shilong Liu
5d2ae332d4
[ci] Add arm artifacts in common lib azure pipeline (#10892)
2022-06-03 12:20:49 +08:00
bingwang-ms
7ec6a60230
[cherry-pick] [202012] Update qos config to clear queues for bounced back traffic (#10608)
* Update qos config to clear queues for bounced back traffic

Signed-off-by: bingwang <wang.bing@microsoft.com>
2022-06-02 16:29:25 +08:00
Jing Kan
14fdcc815a
[202012][openssh] openssh: Upgrade from 7.9 to 8.4, to match version in buster-backports (#10910)
* Use buster-backports version
* Use dget dsc file instead source repo
* Update make files
* Upgrade openssh-client to 8.4 in base image
* Remove useless installation
* Install openssh-server from buster-backports in build_debian
* Update dev buster package version list

Signed-off-by: Jing Kan jika@microsoft.com
2022-06-02 16:06:22 +08:00
kellyyeh
79555c894a
[dhcp6relay] Add dhcpv6 option check (#10486) (#10808)
2022-06-01 14:20:26 -07:00
vdahiya12
148ff285a6
[202012][sonic-platform-common][sonic-platform-daemons] submodule (#10938)
Signed-off-by: vaibhav-dahiya <vdahiya@microsoft.com>

The following commits are added in sonic-platform-common
ac25515 (HEAD -> 202012, origin/202012) [Credo][Ycable] changes for
synchronizing executing Telemetry API's when mux toggle is inprogress
(#280)
fb304c1 [sonic_ssd] Nokia-7215: "show platform ssdhealth" not showing
health percent (#279)
8bfe9c0 [Credo][Ycable] improve logging for Server Powered off/Faulty
cables (#272)

The following commits are added in sonic-platform-daemons
6695c55 (HEAD -> 202012, origin/202012) [ycabled] fix the posting for
mux_cable_static_info per downlink when ycabled is spawned;
synchronizing executing Telemetry API (#257)
6a315ae (HEAD -> 202012, origin/202012) [ycabled] Fix some syntax
warnings in ycabled (#263)
2022-05-31 11:03:15 -07:00
StormLiangMS
1ec4f3ea5c
submodule advance (#10949)
Update sonic-swss-common submodule pointer to include the following:

fb89310 - (HEAD, origin/202012) [Overlay ECMP] add new table schema for bgp profile #608 (10 hours ago)
3632090 - [ci] Update azure pipeline branch variable reference. (4 weeks ago)
2022-05-31 23:47:33 +08:00
xumia
aebc3a8d6a
[Build]: Support to use the base image version when a package version not specified (#10971) (#10976)
Why I did it
It fixes issue #10952:
[Build]: Support to use the base image version when a package version not specified
2022-05-31 22:18:17 +08:00
xumia
6d3f1346fb
Set the version for python2 package protobuf (#10964)
Why I did it
Python 2 does not support installing protobuf>=4.21.
2022-05-30 20:00:54 +08:00
bingwang-ms
c4e806fcf7
[202012][cherry-pick] Define SYSTEM_DEFAULTS table to control tunnel_qos_remap (#10930)

Signed-off-by: bingwang <bingwang@microsoft.com>
2022-05-30 17:52:43 +08:00
SuvarnaMeenakshi
ec9732aa3b
[202012][multi-asic][sonic-config-engine]: Get PORT table from namespace config db (#10475)
Why I did it
Cherry-pick of: #7632
portconfig.py gets PORT table from config_db if it is present. If not, port_config.ini files are parsed.
On multi-asic platforms, if a namespace is passed to get_port_config(), the config_db connection was always made to the host namespace rather than to the ASIC-specific namespace.
Provides fix for: #7161

How I did it
Modify db connection function to connect to namespace config_db.

How to verify it
Unit-test passed.
Verified on multi-asic VS platform.
2022-05-27 16:28:33 -07:00
Jing Zhang
0761850f17
[sonic-linkmgrd][202012] submodule updates (#10924)

489cf3 Jing Zhang Wed May 18 09:59:02 2022 -0700 Avoid switching active when LinkState == Down (#77)
a6c9713 Jing Zhang Tue May 24 11:03:54 2022 -0700 [202012] Add option to enable or disable default route related feature (#72)
dbb607d Jing Zhang Thu May 12 08:19:20 2022 -0700 [ci]: uplift diff coverage threshold to 80% (#71)

sign-off: Jing Zhang zhangjing@microsoft.com
2022-05-27 14:38:43 -07:00
Arun Saravanan Balachandran
33c1ba1b2c [DellEMC S5248f] Remove duplicate ipmihelper.py (#10455)
Why I did it
To remove the ipmihelper.py in S5248f directory to prevent the image label being marked 'dirty', due to the file being replaced by the ipmihelper.py in common folder during build.

How I did it
Remove ipmihelper.py in S5248f directory.

How to verify it
Build a broadcom image and verify that the tracked files are not modified.

Description for the changelog
DellEMC S5248f : Remove duplicate ipmihelper.py
2022-05-27 17:28:56 +00:00
Neetha John
876d982bce [sonic-config-engine] Fix typo in hwsku name in sample graph (#10941)
Signed-off-by: Neetha John <nejo@microsoft.com>

Why I did it
There was a typo in hwsku specified as part of #10889

How I did it
Replaced with the correct hwsku

How to verify it
test_cfggen.py is passing
2022-05-27 17:00:55 +00:00
xumia
06addae853 Revert "Reduce image size for lazy installation packages (#10775)" (#10916)
This reverts commit 15cf9b0d70.
Why I did it
Revert PR #10775 because it affects ONIE installation.
The problem is caused by symbolic links not being supported by some ONIE unzip implementations.
We will re-enable the change after fixing the issue; see #10914.
2022-05-27 17:00:50 +00:00
mssonicbld
8ce3cab508
[ci/build]: Upgrade SONiC package versions (#10732)
Co-authored-by: mssonicbld <vsts@fv-az31-361.b1uo4dmaffwenkazr3a2h2ovdb.jx.internal.cloudapp.net>
2022-05-27 01:10:32 -07:00
Guohan Lu
9b84294ffc Revert "[bgpcfgd] ECMP overlay VxLan with BGP support (#10716)"
This reverts commit 35c9becc3c.
2022-05-26 06:20:40 -07:00
Neetha John
a76899b04f [sonic-config-engine] Change hwsku for sample graph in unit tests (#10889)
#### Why I did it
To ensure that some internal testcases do not break due to external changes

#### How to verify it
Ran test_cfggen.py with the changes and it passed
2022-05-25 22:57:01 +00:00
Taylor Cai
f5ecf1ee1c [device/celestica]:Fix failed test case of Haliburton snmp (#10844)
2022-05-25 22:56:57 +00:00
xumia
455d44efea [Ci]: Fix to trigger the publish pipeline in failure build issue (#10847)
Why I did it
It is not necessary to trigger the publish pipeline when the build has failed.

How I did it
Remove the condition in the azp task, change to use template condition.
2022-05-24 23:14:29 +00:00
abdosi
51f4bf111e Added Support for BGP allow list feature to have route-map action of setting tag (#10731)
What I did:
Added support to create the route-map action set tag <user-defined value>
when the allow prefix list matches. The tag can be defined by the user in
constants.yml.

Why I did:
For the Allow List feature, the allow-list route-map is called from the base route-map. Having the set tag option lets the base route-map match on the tag and take further action if needed. The tag provides metadata that the base route-map can use.
2022-05-24 23:14:25 +00:00
StormLiangMS
35c9becc3c [bgpcfgd] ECMP overlay VxLan with BGP support (#10716)
Why I did it
https://github.com/Azure/SONiC/blob/master/doc/vxlan/Overlay%20ECMP%20with%20BFD.md
According to the design, routes need to be advertised with a community string. This PR implements that.

How I did it
To use the route-map as the profile for the community string, all advertised routes can be associated with one route-map.
Add one file, managers_rm.py, which adds, updates, and deletes the route-map. Modify the managers_advertise_rt.py file to associate a profile with an IP route.

Route-map usage is very flexible. To keep the design simple, this PR supports only one fixed usage: adding a community string to a route.

How to verify it
Implemented new unit tests for managers_rm.py and updated the unit tests for managers_advertise_rt.py.
Manually verified the test cases in the test plan section; a test case will be added to sonic-mgmt later. Azure/sonic-mgmt#5581
2022-05-24 23:14:21 +00:00
Lawrence Lee
5205a379e4 [scapy]: Patch scapy 2.4.5 for sniffing on intfs (#10644)
Apply scapy fix (https://github.com/secdev/scapy/pull/3240) since it is not available in release yet

Signed-off-by: Lawrence Lee <lawlee@microsoft.com>
2022-05-24 23:14:17 +00:00
StormLiangMS
e659cfb2de [bgpcfgd] to support removal part of configuration of bgp allowed prefix list (#10165)
* fix allow list issue

Signed-off-by: stormliang <stormliang@microsoft.com>

* add the ipaddress in the install list

* add unit test

Co-authored-by: Ubuntu <azureuser@SONIC-SH-STORM-02.5pu3m0fajw1edcfltykk1gauxa.gx.internal.cloudapp.net>

Why I did it
Removing part of the configuration of the bgp allowed prefix list failed. The details are in #10141.

How I did it
There are two issues:

1. In FRR, the IPv6 default route is ::/0, but in the configuration it is 0::/0, so a string comparison of the two returns false. But why does removing part of the allowed prefix list fail for IPv4 while it works for IPv6? See the next item for the answer.

2. The current managers_allow_list doesn’t support removing part of the prefix list. So why does IPv6 work in item 1? Because of the bug in the IPv6 default route comparison, the update is performed regardless of the operation. (The code compares the prefix list in FRR with the one in the configuration DB: if all configurations in the DB are present in FRR, it does nothing; otherwise it updates the prefix list based on the configuration from the DB.)
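
The string-comparison pitfall described above can be illustrated with a minimal sketch. It assumes Python's standard ipaddress module is used to normalize prefixes; the helper name same_prefix is hypothetical and is not taken from bgpcfgd.

```python
import ipaddress


def same_prefix(a: str, b: str) -> bool:
    """Compare two prefixes by value rather than by spelling (hypothetical helper)."""
    net_a = ipaddress.ip_network(a, strict=False)
    net_b = ipaddress.ip_network(b, strict=False)
    return net_a == net_b


# Plain string comparison treats the two spellings of the IPv6 default route as different.
print("::/0" == "0::/0")              # False
# Parsing both sides first makes them compare equal.
print(same_prefix("::/0", "0::/0"))   # True
```

Comparing parsed networks instead of raw strings is one way to avoid the mismatch; whether the actual fix takes this approach is not stated in the commit message.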

How to verify it
Follow the steps in #10141.
2022-05-24 23:14:13 +00:00
Aravind Mani
9caf12859d
DellEMC: S52xx Reboot cause fix (#10783)
2022-05-23 21:03:11 -07:00
Rajkumar-Marvell
abb977f4c4
[Marvell] Marvell armhf SAI debian (#10854)
Addressed a system-health failure that occurs when the learned src-mac is the same as switchMac.

Signed-off-by: rajkumar38 <rpennadamram@marvell.com>
2022-05-23 19:05:22 +08:00
Volodymyr Samotiy
6b029a613b
[202012] [Mellanox] Update SAI to 1.21.1.2 and SDK/FW to 4.5.2262/xx.2010.2262 (#10880)
- Why I did it
To include latest fixes:
1. Warmboot | When trying to reconfigure the Flex Parser header and Flex transition parameters after ISSU, the switch returned an error even if the configuration was identical to the one applied before performing the ISSU.
2. Link Up | When toggling many ports of the Spectrum devices while raising 10GbE link up and link maintenance is enabled, the switch may get stuck and may need to be rebooted.
3. Shared buffer | While moving from lossless to lossy while shared headroom was used, reduction of the shared headroom can only be done prior to pool type change and when shared headroom is not utilized.

- How I did it
Updated SAI & SDK submodules along with the relevant Makefiles

- How to verify it
Build an image and run tests from "sonic-mgmt".

Signed-off-by: Volodymyr Samotiy <volodymyrs@nvidia.com>
2022-05-22 09:48:37 +03:00
Aravind Mani
cceef8e36d
Dell S6100: Addition of SFP type_abbrv_name field (#10846)
* Dell S6100: Addition of SFP type_abbrv_name field

* Update sfp.py

* Update sfp.py

Co-authored-by: Aravind Mani <aravind.m1@dell.com>
2022-05-19 12:14:53 -07:00
Dror Prital
77298da275
[202012][submodule] Update sonic-swss pointer (#10837)
Signed-off-by: Dror Prital <drorp@nvidia.com>

Co-authored-by: liat-grozovik <44433539+liat-grozovik@users.noreply.github.com>
Co-authored-by: Prince Sunny <prince.sunny@microsoft.com>
Co-authored-by: bingwang-ms <66248323+bingwang-ms@users.noreply.github.com>
2022-05-19 17:35:39 +08:00
Jing Zhang
acfee3be9a
[sonic-linkmgrd][202012] submodule update (#10814)
3d13ff2 Jing Zhang      Wed May 4 10:07:14 2022 -0700   Add doc for default route related changes  (#63)
c703be4 Jing Zhang      Mon May 2 13:27:54 2022 -0700   Reset WaitActiveUp count before switching to active (#70)
86eb727 Jing Zhang      Wed Apr 27 10:35:05 2022 -0700  lower log level to warning (#69)
e22c736 Jing Zhang      Mon May 2 13:33:24 2022 -0700   [202012] Avoid proactively switching to active if default route is missing (#67)
d4f282b Jing Zhang      Thu Apr 28 18:35:11 2022 -0700  [202012] Add support to enable switchover time measurement (with link prober interval decreased to 10ms) feature (#66)

sign-off: Jing Zhang zhangjing@microsoft.com
2022-05-18 17:36:48 -07:00
Shilong Liu
f5c876edf3 [build] docker-sonic-mgmt replace USER by whoami (#9702)
2022-05-18 22:33:14 +08:00
Saikrishna Arcot
ee84ba81ed
Use 202012 branch for sonic-mgmt (#10851)
Signed-off-by: Saikrishna Arcot <sarcot@microsoft.com>
2022-05-17 14:59:31 -07:00
xumia
f6a52e31ec [Ci] Support to trigger a pipeline to download and publish artifacts to storage (#10820)
Why I did it
Support to trigger a pipeline to download and publish artifacts to storage and container registry.
Support to specify the patterns which docker images to upload.

How I did it
Pass the pipeline information and the artifact information as pipeline parameters to the pipeline that will be triggered. This decouples artifact generation from the publish logic: how and where the artifacts and docker images are published depends on the triggered pipeline.

How to verify it
2022-05-16 23:28:06 +00:00
Vivek R
cd0a0608a9 Removed platform specific reboot files for mellanox simx platforms (#10806)
- Why I did it
The platform_reboot files for simx don't do anything apart from calling /sbin/reboot, which is done anyway in the /usr/local/bin/reboot script, i.e. the parent script that calls the platform-specific reboot scripts if present.

Moreover, /sbin/reboot invoked in the platform-specific reboot script is a non-blocking call, so it returns to the original script (although /sbin/reboot does its job in the background) and we see messages like this.

Signed-off-by: Vivek Reddy Karri <vkarri@nvidia.com>
2022-05-16 23:28:02 +00:00
shlomibitton
c71c91e2b0
[202012] [Fastboot] Delay PMON service for better fastboot performance (#10745)
#### Why I did it

Profiling the system state on init after fast-reboot, during create_switch function execution, shows a few python scripts running at the same time.
This parallel execution consumes CPU time, and create_switch takes longer than it should.
Following this finding, and to ensure these services will not interfere in the future, PMON is delayed by 90 seconds until the system finishes the init flow after fastboot.

#### How I did it

Add a timer for PMON service.
For the MLNX platform, exclude the PMON start trigger when SYNCD starts in the fastboot case.
Copy the timer file to the host bin image.

#### How to verify it

Run fast-reboot on MLNX platform and observe faster create_switch execution time.
2022-05-15 23:31:32 -07:00
shlomibitton
bca8a244c6
[202012] [Fastboot] Delay LLDP service for better fastboot performance (#10568) (#10744)
This PR is to backport a fix #10568
This PR is dependent on PR: #10745

- Why I did it
Profiling the system state on init after fast-reboot, during create_switch function execution, shows a few python scripts running at the same time.
This parallel execution consumes CPU time, and create_switch takes longer than it should.
Following this finding, and to ensure these services will not interfere in the future, LLDP is delayed by 90 seconds until the system finishes the init flow after fastboot.

- How I did it
Add a timer for LLDP service.
Copy the timer file to the host bin image.

- How to verify it
Run fast-reboot on MLNX platform and observe faster create_switch execution time.
2022-05-15 15:05:29 +03:00
Junchao-Mellanox
4f326e8779
Fix race condition between networking service and interface-config service (#10573) (#10766)
Backport https://github.com/Azure/sonic-buildimage/pull/10573 to 202012.

#### Why I did it

This PR fixes a bug where the mgmt port eth0 may lose its IP even if the user configured a static IP for eth0. The issue is not always reproducible; the reproducing flow is:

1. Systemd starts the networking service, which runs a DHCP-based configuration and is assigned an IP from DHCP.
2. Systemd starts the interface-config service, which depends on the networking service.
3. The interface-config service runs the command "ifdown --force eth0", check [line](16717d2dc5/files/image_config/interfaces/interfaces-config.sh (L4)). But the networking service is still running, so this [line](ac32bec0e2/ifupdown2/ifupdown/main.py (L74)) fails with the error "error: Another instance of this program is already running.". This error is printed by the ifupdown2 lib, which is the main process of the networking service. So ifdown does not actually work here, and the IP of eth0 is not taken down.
4. The interface-config service updates /etc/networking/interface to the static configuration.
5. The interface-config service runs the command "systemctl restart networking". This command kills the previous networking-related processes (log: networking.service: Main process exited, code=killed, status=15/TERM) and tries to reconfigure the IP address with the static configuration. But it detects that the configured IP and the existing IP are the same, so it does not actually configure the IP in the kernel. Hence, the IP is still the one obtained from DHCP. (This could be a bug in ifupdown2: the previous IP is from DHCP and the new IP is static, yet it treats them as the same instead of reconfiguring the IP.)
6. When the lease of the IP expires, the IP of eth0 is removed by the kernel and the issue reproduces.

The issue is not always reproducible because the networking service usually runs fast enough that step 3 is not hit.

#### How I did it

Check the networking service state before running "ifdown --force eth0", and wait for it to finish if it is still activating.
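
The wait-before-ifdown idea can be sketched as follows. This is an illustration only, written in Python rather than as the actual change to interfaces-config.sh. The function names, the 60-second timeout, and the 0.5-second polling interval are assumptions. The only external call is `systemctl is-active`, which prints the unit's current state (for example `active` or `activating`).

```python
import subprocess
import time


def unit_state(unit: str) -> str:
    """Return the state printed by `systemctl is-active <unit>`."""
    result = subprocess.run(
        ["systemctl", "is-active", unit],
        capture_output=True, text=True, check=False,
    )
    return result.stdout.strip()


def wait_until_settled(unit, get_state=unit_state, timeout=60.0, interval=0.5):
    """Poll until the unit leaves the 'activating' state or the timeout expires.

    Returns the last observed state. The timeout and interval defaults are
    illustrative assumptions, not values from the original fix.
    """
    deadline = time.monotonic() + timeout
    state = get_state(unit)
    while state == "activating" and time.monotonic() < deadline:
        time.sleep(interval)
        state = get_state(unit)
    return state
```

A caller would run `wait_until_settled("networking")` and only then invoke ifdown, so that the second ifupdown2 instance does not collide with the one still running inside the networking service.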

#### How to verify it

Manual test.
2022-05-14 14:58:24 -07:00
Sudharsan Dhamal Gopalarathnam
f16d11237a
[202012][submodule] Advance sonic-swss submodule pointer (#10803)
Update sonic-swss submodule to include below commits

b9163d3 [Vnet] Set BFD multihop to true for Vnet routes
cfed8c7 [202012][cherry-pick]Update orchagent to support new field pfcwd_sw_enable
172cd13 [ACL]Avoid incrementing crm count when ACL rule create fails
7377901 [pfcwd] Add vs test infrastructure
0b58595 Removing Vnet with scope default
2022-05-14 10:29:34 +03:00
Saikrishna Arcot
8970425a75 Fix calculation of $(1)_DEP_PKGS_SHA in Makefile.cache (#10764)
In Makefile.cache, for $(1)_DEP_PKGS_SHA, the intention is to include
the DEP_MOD_SHA and MOD_HASH of each of the current package's
dependencies. However, there's a level of dereferencing missing; instead
of grabbing the value of $(dfile)_DEP_MOD_SHA, it is literally using the
variable name $(dfile)_DEP_MOD_SHA. This means that the value of this
variable will not change when some dependency changes.

The impact of this is in transitive dependencies. For a specific
example, if there is some change in sairedis, then sairedis will be
rebuilt (because there's a change within that component), and swss will
be rebuilt (because it's a direct dependency), but
docker-swss-layer-buster will not get rebuilt, because only the direct
dependencies are effectively being checked, and those aren't changing.
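
The effect of the missing dereference can be modelled in a few lines of Python. This is an analogy rather than the Makefile code: the dictionary stands in for make variables, and the function name and the sample values are hypothetical.

```python
import hashlib


def sha(text: str) -> str:
    """Return the SHA-256 hex digest of a string."""
    return hashlib.sha256(text.encode()).hexdigest()


def dep_pkgs_sha(deps, variables, dereference):
    """Hash the dependency hashes of a package.

    With dereference=False the variable *names* are hashed (the bug);
    with dereference=True their current *values* are hashed (the fix).
    """
    parts = []
    for dfile in deps:
        name = f"{dfile}_DEP_MOD_SHA"
        parts.append(variables[name] if dereference else name)
    return sha(" ".join(parts))


# swss depends on sairedis, so a sairedis change alters swss's own hash.
before = {"swss_DEP_MOD_SHA": sha("sairedis-v1")}
after = {"swss_DEP_MOD_SHA": sha("sairedis-v2")}

# Buggy form: the result never changes, so the docker layer is not rebuilt.
print(dep_pkgs_sha(["swss"], before, False) == dep_pkgs_sha(["swss"], after, False))  # True
# Fixed form: the change propagates to the dependent package.
print(dep_pkgs_sha(["swss"], before, True) == dep_pkgs_sha(["swss"], after, True))    # False
```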

Signed-off-by: Saikrishna Arcot <sarcot@microsoft.com>
2022-05-10 06:44:45 +00:00
xumia
951d93e362 Reduce image size for lazy installation packages (#10775)
Why I did it
The image is too large when there are multiple lazy packages and multiple platforms. It is not necessary to keep multiple copies of the lazy installation packages.
For the Cisco image, the image size is reduced from 3.5G to 1.7G.

How I did it
Use symbolic links to keep only one copy of each lazy package.
Make a new folder fsroot/platform/common.
Copy the lazy packages into the folder.
When a platform such as x86_64-grub, x86_64-8800_rp-r0, or x86_64-8201_on-r0 uses a package, only make a symbolic link to the package in the common folder.
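
A minimal sketch of the layout described above, under stated assumptions: the helper name, the package file name, and the choice of relative links are illustrative and are not taken from the build scripts.

```python
import os
import tempfile


def link_lazy_package(fsroot, platform, package, payload=b""):
    """Store `package` once under platform/common and symlink it into `platform`."""
    common = os.path.join(fsroot, "platform", "common")
    os.makedirs(common, exist_ok=True)
    target = os.path.join(common, package)
    if not os.path.exists(target):
        # First use of this package: write the single shared copy.
        with open(target, "wb") as f:
            f.write(payload)
    plat_dir = os.path.join(fsroot, "platform", platform)
    os.makedirs(plat_dir, exist_ok=True)
    link = os.path.join(plat_dir, package)
    # Relative link (an assumption) so the tree stays valid if fsroot moves.
    os.symlink(os.path.join("..", "common", package), link)
    return link


with tempfile.TemporaryDirectory() as fsroot:
    for plat in ("x86_64-8800_rp-r0", "x86_64-8201_on-r0"):
        link = link_lazy_package(fsroot, plat, "example-lazy.deb", b"data")
        print(plat, "->", os.readlink(link))
```

Each platform directory then holds only a link, so the package payload is stored once regardless of how many platforms reference it.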
2022-05-10 06:44:40 +00:00
Shilong Liu
a296267097 [ci] Support multi tags when pushing docker image (#10771)
2022-05-10 06:44:35 +00:00
Qi Luo
be5eb80b14
[202012] Fix tagged VlanInterface if attached to multiple vlan as untagged member (#10589)
Backport https://github.com/Azure/sonic-buildimage/pull/8927 to 202012 branch
2022-05-09 14:07:02 -07:00
Sudharsan Dhamal Gopalarathnam
502ddbb249
[202012][caclmgrd]Added logic to allow BFD port numbers (#10740)
* [caclmgrd]Added logic to allow BFD port numbers
2022-05-06 10:38:05 -07:00
Sudharsan Dhamal Gopalarathnam
2a232730b0
[202012][Mellanox] Update SDK/FW to 4.5.1500/2010.1500 and SAI version to 1.21.1.2 (#10464)
* [Mellanox] Update SDK/FW to 4.5.1500/2010.1500 and SAI version to 1.21.0.1

Signed-off-by: Sudharsan Dhamal Gopalarathnam <sudharsand@nvidia.com>

* Updating Switch-SDK-drivers submodule pointer

* Updating SAI version
2022-05-04 06:07:10 +03:00