4564 Commits

Author SHA1 Message Date
trzhang-msft
1c2c1f50ba dhcpmon: support dual tor scenario (#7471) 2021-05-05 09:35:01 -07:00
trzhang-msft
d76206bae4 dhcpmon: support dual tor in docker template (#7470) 2021-05-05 09:34:42 -07:00
Stephen Sun
a554ddc91d [Mellanox] Adopt single way to get fan direction for all ASIC types (#7386)
#### Why I did it
Adopt a single way to get fan direction for all ASIC types.
It depends on hw-mgmt V.7.0010.2000.2303. Depends on https://github.com/Azure/sonic-buildimage/pull/7419

#### How I did it
Originally, the get_direction was implemented by fetching and parsing `/var/run/hw-management/system/fan_dir` on the Spectrum-2 and the Spectrum-3 systems. It isn't supported on the Spectrum system.
Now, it is implemented by fetching `/var/run/hw-management/thermal/fanX_dir` for all the platforms.
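
A minimal Python sketch of the per-fan lookup described above. The `fanX_dir` path comes from the commit message. The function name, the `base_dir` parameter, and returning the raw file content are assumptions for illustration, because the commit does not say how the value is encoded.

```python
import os

# Directory named in the commit message; overridable so the sketch can be tested.
HW_MGMT_THERMAL_DIR = "/var/run/hw-management/thermal"


def read_fan_direction_raw(fan_index, base_dir=HW_MGMT_THERMAL_DIR):
    """Return the raw content of fan<fan_index>_dir, or None if it cannot be read.

    Hypothetical helper: the real platform API may interpret the value further.
    """
    path = os.path.join(base_dir, "fan{}_dir".format(fan_index))
    try:
        with open(path, "r") as f:
            return f.read().strip()
    except (IOError, OSError):
        # The file is missing or unreadable, for example on an unsupported system.
        return None
```

The same code path serves every ASIC type because only the fan index varies.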

Signed-off-by: Stephen Sun <stephens@nvidia.com>
2021-05-05 09:34:25 -07:00
vmittal-msft
f766a1bccf Updated Qos/MMU settings for Arista-7050CX3-32S-C32 & Arista-7050CX3-32S-D48C8 (#7068)
* TD3 Qos/MMU settings for Arista-7050CX3-32S-C32 & Arista-7050CX3-32S-D48C8
2021-05-05 09:33:19 -07:00
Guohan Lu
9d5b899452 [ci]: enable sonic-slave build on 202012 branch
Signed-off-by: Guohan Lu <lguohan@gmail.com>
2021-05-05 07:31:27 -07:00
Shilong Liu
90bcc51203 [CI] Add Bldenv pipeline files (#7458)
Add 3 pipeline files:

- pipeline to build the docker image sonic-slave-[buster|jessie|stretch] for amd64/armhf/arm64 and push it to ACR (sonicdev-microsoft.com)
- pipeline to build the docker image sonic-mgmt and push it to ACR
- pipeline to clean dpkg cache entries created more than 30 days ago

Co-authored-by: lguohan <lguohan@gmail.com>
2021-05-05 07:30:39 -07:00
Guohan Lu
7841b7513c [ci]: change timeout to 36 hours for armhf/arm64 build
Signed-off-by: Guohan Lu <lguohan@gmail.com>
2021-05-05 07:30:07 -07:00
Guohan Lu
a0575a7790 [ci]: increase official build to 12 hours
10-hour limit is not enough to finish several jobs.

Signed-off-by: Guohan Lu <lguohan@gmail.com>
2021-05-05 07:11:59 -07:00
xumia
e27293f4df
Disable the reproducible build for arm64 and armhf temporarily (#7517)
Disable the reproducible build for arm64 and armhf temporarily.
The package version upgrade has been disabled for arm64 and armhf because of the long run time, so disable the reproducible build for the official builds as well. We will enable it when the arm64 Azure agent is ready.
2021-05-05 21:02:39 +08:00
Danny Allen
6060da3a36
[202012][swss/swss-common/utilities/kernel] Update submodule (#7487)
* [202012][swss/swss-common/utilities/kernel] Update submodule

sonic-swss:
- [Monitor Vlan] Fix a typo in hostif (#1722)
- Update pool sizes during initialization from timer only (#1708)
- [SflowMgr] SamplingRate Update by Speed Change Added (#1721)

sonic-swss-common:
- [swss-common] Add MUX Metrics Table (#482)
- [azp] Purge swss before installing the newly built deb package (#472)

sonic-utilities:
- disk_check: Check & mount RO as RW using tmpfs (#1569)
- No more IP validation as it is more likely a URL (#1555)
- Stop PMON docker before cold and soft reboots (#1514)
- Add soft-reboot reboot type (#1453)
- [acl] Use a list instead of a comma-separated string for ACL port list (#1519)
- sonic-installer: fix py3 issues in bootloader.aboot (#1553)
- Fix unsupported fs.squashfs extraction in sonic-installer (#1366)
- [show][config] cli support for firmware upgrade on Y-Cable (#1528) (#1558)

sonic-linux-kernel:
- [Mellanox] backport kernel patches for hw-management 7.0100.2303 (#211)

Signed-off-by: Danny Allen <daall@microsoft.com>

* Update utilities w/ build fix
2021-05-04 08:35:48 -07:00
Volodymyr Samotiy
4152c3e337
[Mellanox] [202012] Update SAI submodule pointer (#7499)
- Why I did it
To include below changes:
Set monitoring VLAN hostif up by default (for VNET ping tool)

- How I did it
Updated SAI submodule pointer

- How to verify it
Create VLAN hostif according to changes in PR: Azure/sonic-swss#1645
Verify it is admin up by default

Signed-off-by: Volodymyr Samotiy <volodymyrs@nvidia.com>
2021-05-04 17:29:47 +03:00
judyjoseph
8cea931cad Fixes for errors seen in staging devices (#7171)
With the latest 201911 image, the following error was seen on staging devices with the TSB command (on both single-ASIC and multi-ASIC devices). Although this error message doesn't affect the TSB functionality, it is worth fixing.

admin@STG01-0101-0102-01T1:~$ TSB
BGP0 : % Could not find route-map entry TO_TIER0_V4 20
line 1: Failure to communicate[13] to zebra, line: no route-map TO_TIER0_V4 permit 20
% Could not find route-map entry TO_TIER0_V4 30
line 2: Failure to communicate[13] to zebra, line: no route-map TO_TIER0_V4 deny 30

In addition, this PR fixes the message displayed to the user when there are no BGP neighbors configured on a BGP instance. On a multi-ASIC device there could be a case where no BGP neighbors are configured on a particular ASIC.
2021-05-03 13:19:29 -07:00
judyjoseph
7ae4a990e7 [docker-fpm-frr]: TSA/B/C changes for multi-asic (#6510)
- Introduced TS common file in docker as well and moved common functions.
- TSA/B/C scripts run only in BGP instances for front end ASICs.
       In addition skip enforcing it on route maps used between internal BGP sessions.

admin@str--acs-1:~$ sudo /usr/bin/TSA
System Mode: Normal -> Maintenance

and in case of Multi-ASIC
admin@str--acs-1:~$ sudo /usr/bin/TSA
BGP0 : System Mode: Normal -> Maintenance
BGP1 : System Mode: Normal -> Maintenance
BGP2 : System Mode: Normal -> Maintenance
2021-05-03 13:19:17 -07:00
Nazarii Hnydyn
0e970582c1
[swss_vars]: Add 'resource_type' attribute. (#7188)
Signed-off-by: Nazarii Hnydyn <nazariig@nvidia.com>
2021-05-03 10:38:11 -07:00
guxianghong
a0fde3a626 [arm] support compile sonic arm image on arm server (#7285)
- Support compiling the sonic arm image on an arm server. Compiling the arm image natively on an arm server, instead of in qemu mode on an x86 server, saves significant compile time.
- Add kernel argument systemd.unified_cgroup_hierarchy=0 for the systemd upgrade to version 247, according to #7228
- rename multiarch docker to sonic-slave-${distro}-march-${arch}

Co-authored-by: Xianghong Gu <xgu@centecnetworks.com>
Co-authored-by: Shi Lei <shil@centecnetworks.com>
2021-05-02 08:11:56 -07:00
Samuel Angebault
30cc959787 [Arista] Fix dockerd issue on Arista platforms (#7376)
Why I did it
Recent systemd upgrade from #7228 requires an extra cmdline parameter for dockerd to start properly.
Updating boot0 was missed as part of the systemd upgrade change.

How I did it
Just added the missing cmdline parameter in files/Aboot/boot0.j2
This change fixes #7372

How to verify it
Boot the image and dockerd should start normally.
2021-05-01 19:43:51 -07:00
Stepan Blyshchak
ae574ab000 [systemd] disable default systemd udev rules for interfaces (#7369)
Fix #7364

99-default.link has always been in SONiC, but it did not take effect with previous systemd versions (<247) due to issue systemd/systemd#3374. With systemd 247 it works.

However, this policy overrides the MAC address provided by teamd, which causes the teamd netdev to use a random MAC
address. Therefore, it needs to be disabled.

Signed-off-by: Stepan Blyschak <stepanb@nvidia.com>
2021-05-01 19:43:41 -07:00
Guohan Lu
4f33a90a82 Revert "Revert "[build_debian.sh] fix systemd is not from backports buster (#7323)""
This reverts commit a253f2039a.
2021-05-01 19:43:15 -07:00
Guohan Lu
55f8f8647b Revert "Revert "[debian] install systemd version 247 from buster-backports (#7228)""
This reverts commit 86f9c7b7a6.
2021-05-01 19:43:04 -07:00
Andriy Yurkiv
c65a8a227f [devices][hwsku] add support to VXLAN src port range feature (#7394)
Enable VXLAN src port range configuration via SAI profile
2021-04-29 10:11:14 -07:00
Stephen Sun
48908b1c5a Fix issue: exception occurred during chassis object being destroyed (#7446)
The following error message is observed while the chassis object is being destroyed:

"Exception ignored in: <function Chassis.__del__ at 0x7fd22165cd08>
Traceback (most recent call last):
  File "/usr/local/lib/python3.7/dist-packages/sonic_platform/chassis.py", line 83, in __del__
ImportError: sys.meta_path is None, Python is likely shutting down
The chassis tries to import deinitialize_sdk_handle while it is being destroyed in order to release the sdk_handle.
However, importing another module during shutdown can cause the error because some of the fundamental infrastructure is no longer available."

This error occurs when a chassis object is created and then destroyed in the Python shell.

- How I did it
To fix it, record the deinitialize_sdk_handle in the chassis object when sdk_handle is being initialized and call the deinitialize handler when the chassis object is being destroyed
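
The fix described above can be sketched in Python as follows. This is an illustrative stand-in, not the actual `sonic_platform` code: the class body, the `initialize_sdk_handle` method signature, and the two callables passed to it are assumptions. It shows only the pattern of recording the release function at initialization so that `__del__` never needs an import.

```python
class Chassis(object):
    """Illustrative stand-in, not the real sonic_platform Chassis class."""

    def __init__(self):
        self.sdk_handle = None
        self.deinitialize_sdk_handle = None

    def initialize_sdk_handle(self, initialize, deinitialize):
        # 'initialize' and 'deinitialize' are hypothetical callables that were
        # imported earlier, while the interpreter was fully alive.
        self.sdk_handle = initialize()
        # Record the release function now so __del__ never needs an import.
        self.deinitialize_sdk_handle = deinitialize

    def __del__(self):
        # No import statement here: only call what was recorded earlier.
        if self.deinitialize_sdk_handle is not None and self.sdk_handle is not None:
            self.deinitialize_sdk_handle(self.sdk_handle)
            self.sdk_handle = None
```

Because the handle is cleared after release, a second call to `__del__` is a no-op.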

- How to verify it
Manually test.
2021-04-29 10:10:25 -07:00
Junchao-Mellanox
d4e8c3f666 [Mellanox] Upgrade hw-mgmt to 7.0100.2303 (#7419)
- Why I did it
Upgrade hw-mgmt to 7.0100.2303

Bug fixes

1. Fan direction feature fix for fixed FAN system (using shell instead of binutils/strings)
2. Remove cpld 4th link on systems with only 3 CPLDs
3. hw-mgmt: thermal: Add hardcoded critical trip point. Follow-up after patch "Removing critical thermal zones to prevent unexpected software system shutdown".
4. Fix sensor attribute mapping to be label based instead of index based to allow common handling of voltage regulator names independently of hardware changes.
5. Update 'lm-sensors' custom configuration file. Relevant only for users utilizing sensors.conf files coming along with hw-management package.
6. For full feature list please follow https://github.com/Mellanox/hw-mgmt/blob/V.7.0010.2300_BR/debian/Release.txt

- How I did it
Update hw-mgmt pointer
Remove unused patches
Fix existing patch to make sure it applies successfully

- How to verify it
Full platform regression on all mellanox platforms
2021-04-29 10:09:58 -07:00
xumia
bdb23a0d94 Fix workflow permission issue when running in merge branch (#7417)
Fix the labeler workflow permission issue when merging from fork repo.
It impacts the labeler workflow to support auto-merge for package versions upgrade on 202012 branch. The current workaround is to add the label "automerge" on the PR sent by mssonicbld, then the automerge workflow will merge the PR.
2021-04-29 10:09:43 -07:00
xumia
1b05982727 Support readonly vtysh for sudoers (#7383)
Why I did it
Support readonly version of the command vtysh

How I did it
Check whether the command starts with "show", and verify that it contains only a single command in the script.
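
A rough Python sketch of such a check, under stated assumptions: the function name and the set of separators treated as "more than one command" are hypothetical, and the actual script in the PR may implement the check differently.

```python
# Characters or sequences assumed to indicate command chaining (hypothetical list).
CHAINING_TOKENS = (";", "|", "&", "\n", "`", "$(")


def is_readonly_vtysh_command(command):
    """Return True only for a single command whose first word is 'show'."""
    stripped = command.strip()
    # Reject anything that could smuggle in a second command.
    if any(token in stripped for token in CHAINING_TOKENS):
        return False
    words = stripped.split()
    return len(words) > 0 and words[0] == "show"
```

Rejecting by default keeps the read-only wrapper conservative: anything that is not clearly a single `show` command is refused.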
2021-04-29 10:08:55 -07:00
anish-n
17d5e69c5b Add downstreamsubrole parsing to minigraph.py (#7193) 2021-04-29 10:08:03 -07:00
madhanmellanox
051b848377
202012: Created new SKU Mellanox-SN3800-D28C49S1 (#7466)
platform files for the new SKU D28C49S1
2021-04-29 08:54:22 -07:00
xumia
0472323860
[ci] Fix the generating version file failure issue caused by artifacts folder change (#7464)
Fix the generating version file failure issue caused by artifacts folder change.
When we changed to use the same template for the PR build, the official build, and the package version upgrade, a "target" folder was added to the artifacts folder, so the version upgrade task must be changed accordingly.
2021-04-28 16:01:07 -07:00
dflynn-Nokia
83e23801fd [build]: Fix ARM build break introduced in PR# 7249 (#7395)
PR# 7249 introduced a new bit of logic _after_ the point where the qemu based
build environment for ARM is removed. Hence the new logic fails when building
for ARM. Builds for AMD64 were not affected.

This commit moves the new logic introduced by PR# 7249 to just _before_ the
point where the qemu based build environment for ARM is removed. A comment is
added to reduce the likelihood of this sort of ARM build break from happening
again.
2021-04-28 09:25:26 -07:00
xumia
a53e9ebfa6
[ci] Add the platform filter to ignore some platforms not ready to use (#7453)
Why I did it
Add the platform filter to ignore some platforms that are not ready to use.
The centos-arm64 and marvell-armhf platforms are not ready to use now. We will add them when they are available.
2021-04-28 21:22:53 +08:00
Praveen Chaudhary
59cae24e43 [sonic-slave-buster]: upgrade pyang version to 2.4.0 and install only using pip3. (#7441)
[sonic-slave-stretch]: upgrade pyang version to 2.4.0.

Signed-off-by: Praveen Chaudhary <pchaudhary@linkedin.com>
2021-04-26 21:54:05 -07:00
Dror Prital
16dc30b944
[Mellanox] [202012] Update SAI version 1.18.3.0 (#7427)
- Why I did it
Changes in the new release:
1. Fix 10G and 50G speeds in SAI XML to support all interface types
2. Enable SMAC=DMAC and SMAC MC in tunnel debug counter
3. Add tunnel statistics
4. Add isolation group API implementation
5. Fix ACL ANY debug counter to correctly track ACL drops
6. Add VXLAN source port hard coded range, controlled by K/V
7. FW dump me now feature
8. Add mlxtrace to saidump
9. Speed lane setting and AN control
10. Implement query stats API
11. VNI miss part of tunnel decap drop reason

- How I did it
Update the version number in SAI make file, update the mlnx-sai submodule pointer.

- How to verify it
Run full regression tests on Mellanox platforms

Signed-off-by: Dror Prital <drorp@nvidia.com>
2021-04-26 20:44:36 +03:00
xumia
193c376f97 Export the azure pipeline build id for SONiC version (#7406)
Improve the SONiC version, fix the "azure pipeline build id" part

<target branch name>-<pullrequest id>.<azure pipelines build id>-<merge commit id>
Example: master-7381.11668-43df5c87
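
The version layout above can be reproduced with a small Python helper. The function and parameter names are illustrative assumptions. Only the format string and the example value come from the commit message.

```python
def pr_build_version(target_branch, pull_request_id, build_id, merge_commit_id):
    """Compose <target branch>-<PR id>.<pipeline build id>-<merge commit id>."""
    return "{}-{}.{}-{}".format(
        target_branch, pull_request_id, build_id, merge_commit_id
    )


# Reproduces the example from the commit message.
example = pr_build_version("master", 7381, 11668, "43df5c87")
```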
2021-04-25 11:35:57 -07:00
xumia
2694afb0cd [ci]: Fix official build not existing issue (#7408)
When submitting a new official build for broadcom or vs, an error message is shown saying that the job is not defined.
It was caused by the default option "[]", which is not empty and is therefore used as the jobGroups parameter.
2021-04-25 11:35:57 -07:00
xumia
a2d27eeb78 Improve the PR build version (#7381)
Why I did it
Improve the version of the Pull Request build by changing the local branch name.

How I did it
Change the default branch name merge to [target_branch_name]-[pullrequestid].

How to verify it
For official build, the version is not changed.
For pull request build, the version as below:
2021-04-25 11:35:57 -07:00
xumia
fd8e84ac50 [ci] Fix the boolean value case sensitive issue in Azure Pipelines (#7399)
Why I did it
Fix the boolean value case sensitive issue in Azure Pipelines

When passing parameters to a template, "true" or "false" values run into a case-sensitivity issue, which appears to be a type-casting issue.
To fix it, we change true/false to yes/no to avoid the trap.

Support overriding the job groups in the template, so that the PR build can use different build parameters and build only simple targets. For example, for broadcom, we only build target/sonic-broadcom.bin; the other images, such as swi, debug bin, etc., will not be built.
2021-04-25 11:35:57 -07:00
Shilong Liu
e45d636b77 fix 2021-04-25 11:35:57 -07:00
Shilong Liu
bf02ba16ee [CI] Add azure pipeline file for official build
Signed-off-by: Shilong Liu <shilongliu@microsoft.com>
2021-04-25 11:35:57 -07:00
Stephen Sun
c91b02cb1b
[submodule][202012] Advance submodule head for sonic-swss (#7411)
adf5ab58 [vstest/subintf] Add vs test case to validate processing sequence of APPL DB keys (#1663)
8a732726 [intfsorch] Create subport with the entry contains necessary attributes (#1650)
7ba813b2 [vstest/subintf] Update vs tests to validate physical port host interface vlan tag attribute (#1634)
ed32e333 [portsorch] Configure hostif tagging for subports (#1573)
b5209c43 Handle IPv6 and ECMP routes to be programmed to ASIC (#1711)
515cc1a7 [Dynamic buffer calc][Mellanox] Fix bug: buffer over subscription in buffer pool size calculation (#1706)
0ad524b2 [202012] Allowing the first time FEC and AN configuration to be pushed to SAI (#1710)

Signed-off-by: Stephen Sun <stephens@nvidia.com>
2021-04-24 13:13:37 -07:00
mssonicbld
f7adf9c180
[ci/build]: Upgrade SONiC package versions (#7415) 2021-04-24 15:17:07 +00:00
Kamil Cudnik
4ac2734c34
[submodule] Advance sonic-sairedis submodule (#7409)
To fix innovium platform compilation and fix missing -lpython3.5 module
2021-04-23 18:25:34 +02:00
kakkotetsu
e6bbb3c344 [restapi] fix python version during restapi startup (#7056)
changed from python3 to python in supervisord.conf.
2021-04-22 14:36:09 -07:00
gechiang
4c43ecc81b
[202012] BRCM SAI 4.3.3.5 pick up 2 bug fixes and MMU changes (#7400)
This is the SAI 4.3.3.5 code drop from BRCM to address 2 CSP cases and the initial MMU changes.
Note that the MMU changes are the same as those of SAI 4.3.3.4-1 (#7341) but with the official patch.

- Case CS00012178716 [4.3] Polling SAI_QUEUE_STAT_SHARED_WATERMARK_BYTES fails often on TH3
- Case CS00012159273 [4.3.3.3] [TD3][IPFWD] subnet broadcast flooding end with extra VLAN Tag if member port of VLAN interface deleted then added back to VLAN

Once we have this PR merged, will validate each one of the above to ensure they are indeed fixed...

Preliminary tests look fine. BGP neighbors were all up with proper routes programmed, and
interfaces are all up.
Manually ran the following test cases on TD3 DUT and all passed:

     ipfwd/test_dir_bcast.py
     fib/test_fib.py
     vxlan/test_vxlan_decap.py 
     decap/test_decap.py
     fdb/test_fdb.py
2021-04-22 08:34:52 -07:00
Rajkumar-Marvell
b8daa00ef0 [marvell] Move armhf syncd build from stretch to buster. (#7366)
Signed-off-by: Rajkumar Pennadam Ramamoorthy <rpennadamram@marvell.com>
2021-04-21 15:40:52 -07:00
Kebo Liu
2208c9212a [Mellanox] Update SDK to 4.4.2522 and FW to 2008.2520 (#7391)
New features and fixes in the new SDK/FW:

SN4600C | AN/LT support
SN2700 | AN/LT bug fixes
WJH | FID_MISS support

Signed-off-by: Kebo Liu <kebol@nvidia.com>
2021-04-21 14:05:56 -07:00
Prince Sunny
75ac46eab0 [Broadcom] Set hierarchical ecmp levels to 2 (#7370)
Set hierarchical ecmp level to 2 instead of 3. Based on CS00011833367, ecmp level must be set to 2.
This is already handled for TH2 platforms. Change is required only for TD3

Co-authored-by: Ubuntu <prsunny@prince-vm.vzw1i4tqyeburcdz5lrgulxi2c.yx.internal.cloudapp.net>
2021-04-21 14:05:31 -07:00
Aravind Mani
8832c10fb7 [Dell S6100]: Add dell ich driver (#7336)
dell_ich driver was removed as part of #7309 and it is needed for watchdog tickle in S6100 platform.
2021-04-21 14:02:27 -07:00
Aravind Mani
80fdb29957 Dell S6100: Modify transceiver change event from interrupt to poll mode (#7309)
#### Why I did it

- xcvrd crash was seen in latest 201811 images.
- For Dell S6100, API 2.0 uses poll mode while 1.0 was still using interrupt mode.

#### How I did it

- Modified get_transceiver_change_event in 1.0 to poll mode.
2021-04-21 14:01:53 -07:00
Renuka Manavalan
09c53af61b Kubernetes server configurable using URL
1) Dropped non-required IP update in admin.conf, as all masters use VIP only (#7288)
2) Don't clear VERSION during stop, as it would overwrite new version pending to go.
3) For subprocess calls, take the return value from the process instead of inferring failure from the presence of data in stderr.
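
Item 3 can be illustrated with a short Python sketch. The helper name `run_command` and its return shape are assumptions, not the code from this commit. The point is only that success is decided by the exit code, so a command that writes warnings to stderr but exits with 0 is still treated as successful.

```python
import subprocess
import sys


def run_command(args):
    """Run a command and report success based on its exit code only."""
    proc = subprocess.Popen(
        args,
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        universal_newlines=True,
    )
    out, err = proc.communicate()
    # Data on stderr is returned for logging but does not imply failure.
    succeeded = proc.returncode == 0
    return succeeded, out, err


# A child process that warns on stderr yet exits cleanly.
ok, _, warning_text = run_command(
    [sys.executable, "-c", "import sys; sys.stderr.write('warning'); sys.exit(0)"]
)
```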
2021-04-21 14:00:53 -07:00
shlomibitton
f7ddf1e73c [Mellanox] Fix for all Spectrum based systems: SAI profile speed configurations (#7119)
Fix to the correct value for all SPC1 devices.
For 10G added 10GB_CX4_XAUI, 10GB_KX4, 10GB_KR, 10GB_SR and 10GB_ER_LR
For 50G added 50GB_SR2

This bitmask represents all the options available for interface type and some were missing.
Note: it was working just fine if you were setting the value from SONiC CLI but not from the default SAI Profile.

Signed-off-by: Shlomi Bitton <shlomibi@nvidia.com>
2021-04-21 13:59:12 -07:00
Kuanyu Chen
66dedf38c2 [config-setup]: Fix a bug in checking if updategraph is enabled (#7093)
An error is encountered during "config-setup boot" if updategraph is enabled.

How I did it
Correct the code inside the config-setup script.
Remove the spaces around the assignment operator.

How to verify it
Remove the /etc/sonic/config_db.json and reboot the device.
Originally, it returned the following error after boot up:
rv: command not found
After modification, it can correctly parse the status of updategraph without error.
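
The failure mode can be reproduced outside the device with a small Python sketch that hands two snippets to `sh`. The variable name `rv` is taken from the error message above. The value `enabled`, the helper name `run_sh`, and the availability of a POSIX `sh` on the PATH are assumptions. With spaces around `=`, the shell treats `rv` as a command name rather than a variable.

```python
import subprocess


def run_sh(snippet):
    """Run a snippet with sh and return (exit code, stdout, stderr)."""
    proc = subprocess.Popen(
        ["sh", "-c", snippet],
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        universal_newlines=True,
    )
    out, err = proc.communicate()
    return proc.returncode, out.strip(), err.strip()


# With spaces, sh tries to execute a command called "rv" and fails.
broken = run_sh("rv = enabled")
# Without spaces, sh performs a plain variable assignment.
fixed = run_sh('rv=enabled; echo "$rv"')
```

The first call exits with status 127, the conventional status for a command that cannot be found, while the second echoes the assigned value.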
2021-04-21 13:58:03 -07:00