Commit Graph

5114 Commits

Author SHA1 Message Date
Guohan Lu
8a024f00ce [owners]: add initial owners
Signed-off-by: Guohan Lu <lguohan@gmail.com>
2021-01-31 21:19:53 -08:00
lguohan
dbfdab7c4e
[ci]: add t1-lag testbed (#6619)
introduce run-test template 

Signed-off-by: Guohan Lu <lguohan@gmail.com>
2021-01-31 19:47:10 -08:00
guxianghong
4421a6823b
[arm64] disable snmp's parallel make (#6592)
snmpd's compile is always failed with file truncated on ARM64 arch, the error log is like "/usr/bin/ld: mibgroup/ip-forward-mib/inetCidrRouteTable/.libs/inetCidrRouteTable_interface.o: file not recognized: file truncated"

Co-authored-by: Xianghong Gu <xgu@centecnetworks.com>
2021-01-31 17:30:30 -08:00
gechiang
7928fbf36b
[broadcom]: broadcom sai update to 4.3.0.10-3 (#6620)
1. BRCM SAI Debian build need not have any Kernel version dependency - Starting with 4.3 BRCM made changes in SAI so that this dependency has been cleaned up. We can now remove the Kernel Version dependency from Azure Pipeline build script.

2. Bypass PEER_MODE p2mp setting causing SYNCd crash on non-TD3 SKUs - Temporarily patch BRCM SAI code to not cause SYNCd crash when Orchagent program SAI_TUNNEL_ATTR_PEER_MODE: SAI_TUNNEL_PEER_MODE_P2MP on Non-TD3 SKUs. Will remove this when BRCM provide proper fix to address this issue.
2021-01-31 16:59:01 -08:00
Stephen Sun
4f50658cfc
[syncd-rpc docker] Fix issue: ptf_nn_agent isn't able to start in syncd-rpc docker on buster (#6448)
- Why I did it
Fix issue: ptf_nn_agent isn't able to start in syncd-rpc docker on buster.

- How I did it
The issue is fixed by installing python-dev, cffi and nnpy for python 2 explicitly.

- How to verify it
Run copp test on RPC image.
2021-01-31 09:11:33 +02:00
Junchao-Mellanox
2a0351c65c
Check fan speed before check fan status (#6586)
**- Why I did it**
In thermalctd, when speed of fan exceeds threshold, the fan status will be saved as "bad". So in system health, it is better to check fan speed before fan status. In this case, if fan speed exceeds threshold, we get more detailed information.

**- How I did it**
Move fan speed check logic before fan status check

**- How to verify it**
Manual test
2021-01-31 09:06:36 +02:00
Guohan Lu
c041d25f92 [ci]: cleanup source directory upon checkout
Signed-off-by: Guohan Lu <lguohan@gmail.com>
2021-01-30 17:38:33 -08:00
Guohan Lu
83c51e4803 [kvm]: install net-tools package for debug
Signed-off-by: Guohan Lu <lguohan@gmail.com>
2021-01-30 17:38:33 -08:00
Mahesh Maddikayala
908884b0e8
[broadcom]: Add BCM config variable that contains premier cancun firmware path (#6611)
BRCM SDK 6.5.21 includes firmware updates (premier cancun) for TD3 platforms. The firmware update is required on TD3 platforms, which is packaged with BCMSAI 4.3.0.10.

**- How I did it**

Updated BCM config with a new variable that specifies the firmware package path. SDK uses this path to locate firmware packages and load during cold boot.

**- How to verify it**

 
bsv
BRCM SAI ver: [4.3.0.10], OCP SAI ver: [1.7.1], SDK ver: [sdk-6.5.21] CANCUN ver: [5.3.3]
drivshell>
admin@str2-7050cx3-acs-02:~$ bcmsh
Press Enter to show prompt.
Press Ctrl+C to exit.
NOTICE: Only one bcmsh or bcmcmd can connect to the shell at same time.
 
 
drivshell>cancun stat
cancun stat
UNIT0 CANCUN:
        CIH: LOADED
        Ver: 06.06.01
 
        CMH: LOADED
        Ver: 06.06.01
        SDK Ver: 06.05.21
 
        CCH: LOADED
       Ver: 06.06.01
        SDK Ver: 06.05.21
 
        CEH: LOADED
        Ver: 06.06.01
        SDK Ver: 06.05.21
 
drivshell>
2021-01-30 14:41:17 -08:00
Guohan Lu
3ffa352ac7 [ci]: reset the repo
Signed-off-by: Guohan Lu <lguohan@gmail.com>
2021-01-30 06:27:40 -08:00
gechiang
c5d47791e8
[broadcom]: Fix BRCM Syncd Error:syncd#/supervisord: syncd sh: 1: ethtool: not found (#6615)
Starting with BRCM SAI 4.3.1.5 we see the following :ethtool not fount" error in syslog during boot up:
```
Jan 27 07:36:14.712472 str-s6100-acs-1 INFO syncd#/supervisord: syncd sh: 1:
Jan 27 07:36:14.712844 str-s6100-acs-1 INFO syncd#/supervisord: syncd ethtool: not found
Jan 27 07:36:14.713228 str-s6100-acs-1 INFO syncd#/supervisord: syncd #015
Jan 27 07:36:14.713840 str-s6100-acs-1 INFO syncd#syncd: [0] SAI_API_HOSTIF:_brcm_sai_hostif_speed_set:11894 cmd ethtool -s Ethernet39 speed 40000 rc:32512
Jan 27 07:36:14.717204 str-s6100-acs-1 NOTICE swss#orchagent: :- setHostIntfsOperStatus: Set operation status DOWN to host interface Ethernet39
Jan 27 07:36:14.717204 str-s6100-acs-1 NOTICE swss#orchagent: :- initPort: Initialized port Ethernet39
Jan 27 07:36:14.717204 str-s6100-acs-1 NOTICE swss#orchagent: :- initializePort: Initializing port alias:Ethernet36 pid:1000000000040
Jan 27 07:36:14.726793 str-s6100-acs-1 NOTICE swss#portsyncd: :- onMsg: nlmsg type:16 key:Ethernet36 admin:0 oper:0 addr:4c:76:25:f5:48:80 ifindex:75 master:0
Jan 27 07:36:14.727967 str-s6100-acs-1 NOTICE swss#portsyncd: :- onMsg: Publish Ethernet36(ok) to state db
Jan 27 07:36:14.729331 str-s6100-acs-1 NOTICE swss#orchagent: :- addHostIntfs: Create host interface for port Ethernet36
Jan 27 07:36:14.752398 str-s6100-acs-1 INFO syncd#/supervisord: syncd sh: 1: ethtool: not found#015
Jan 27 07:36:14.752689 str-s6100-acs-1 INFO syncd#syncd: [0] SAI_API_HOSTIF:_brcm_sai_hostif_speed_set:11894 cmd ethtool -s Ethernet36 speed 40000 rc:32512
Jan 27 07:36:14.756050 str-s6100-acs-1 NOTICE swss#orchagent: :- setHostIntfsOperStatus: Set operation status DOWN to host interface Ethernet36
Jan 27 07:36:14.757585 str-s6100-acs-1 NOTICE swss#orchagent: :- initPort: Initialized port Ethernet36
```
It seems that starting with BRCM SAI 4.2.1.5 syncd is using ethtool to set the host interface speed and since this ethtool was not part of the syncd Docker, we observe these "ethtool not found" issue.
2021-01-30 05:28:33 -08:00
Volodymyr Boiko
481870670d
[barefoot][platform] platform API 2.0 fixes (#6607)
To improve python3 support of berefoot's sonic_platform

Signed-off-by: Volodymyr Boyko <volodymyrx.boiko@intel.com>
2021-01-29 17:29:37 -08:00
Tamer Ahmed
284c2738e8
[sonic-device-data]: Update BRCM Tunnel/ECMP Parameter For 7050cx3 SKUs (#6415)
Update Tunnel and ECMP parameters for brcm 7050cx3 48x50G+8x100G and 32x100G SKUs.

signed-off-by: Tamer Ahmed <tamer.ahmed@microsoft.com>
2021-01-29 14:15:48 -08:00
dflynn-Nokia
2a2c6b73fa
[submodule] update sonic-sairedis (#6609)
This update includes the following changes

  > [syncd armhf] Fix syncd crash when running community test suites (#777)
  > Revert "[tests]:Add unittest for MACsec on p2p establishment (#771)"
  > [tests]:Add unittest for MACsec on p2p establishment (#771)
  > [tests] Enable azure pipeline make check to respect unittests (#760)
2021-01-29 11:32:27 -08:00
Joe LeVeque
f9d75a046f
[build_debian.sh] Freeze pip2 < version 21 (#6597)
**- Why I did it**

As per https://pypi.org/project/pip/ pip 21.0 does not not support Python 2 from Jan 2021. Most places in the codebase have already been pinned, but this one was missed.

**- How I did it**

Pin pip2 < version 21 in build_debian.sh
2021-01-29 10:24:24 -08:00
lguohan
759936c67c
[submodule]: update sonic-swss (#6601)
* 832815e 2021-01-28 | [orchagent]: Add MACsec Orchagent (#1474) (HEAD, origin/master, origin/HEAD) [Ze Gan]
* dd4e409 2021-01-28 | [MACsecMgr]: Add MACsec Manager (#1475) [Ze Gan]
* 91e231c 2021-01-28 | [portsorch] Configure hostif tagging for subports (#1573) [Vitaliy Senchyshyn]
* 008325c 2021-01-29 | [PortsOrch] Add reference counting to ports for ACL bindings (#1614) [chaoskao]
* bbd2ca6 2021-01-28 | [crm]: Ignore unsupported/non-implemented switch attributes (#1613) [Prabhu Sreenivasan]

Signed-off-by: Guohan Lu <lguohan@gmail.com>
2021-01-29 08:21:55 -08:00
arlakshm
b5225407ef
[baseimage]: add docker ps to the sudoer file (#6604)
fixes Azure/sonic-utilities#1389

With the recent changes in sudoer files. The  show commands fails for the read-only users. 
The problem here is the 'docker ps' is failing in the function [get_routing_stack()](8a1109ed30/show/main.py (L54)) therefore all the CLI commands are failing.

Signed-off-by: Arvindsrinivasan Lakshmi Narasimhan <arlakshm@microsoft.com>
2021-01-29 08:16:32 -08:00
Qi Luo
e623c903bb
Revert "[build]: disable unit tests for sonic-utilities" (#6598)
This reverts commit 470ed18a6b.
2021-01-29 02:08:56 -08:00
arlakshm
ff8cc49b18
[multi asic] add ip netns identify command to sudoer (#6591)
Signed-off-by: Arvindsrinivasan Lakshmi Narasimhan <arlakshm@microsoft.com>

- Why I did it
The command sudo ip netns identify <pid> is used in function get_current_namespace
to check in the cli command is running in host context or within a namespace.

This function is used for every CLI command and command sudo ip netns identify <pid> needs to be added in sudoer files to allow users with RO access to run show cli commands

This problem is not there on single asic platforms.

- How I did it
Add ip netns identify [0-9]* to sudoers file.
2021-01-28 23:12:01 -08:00
Qi Luo
1c8d5ec500
Bump pyyaml from 5.3.1 to 5.4.1 (#6511)
RCE resolved in new version https://github.com/yaml/pyyaml/issues/420
2021-01-28 10:46:56 -08:00
Joe LeVeque
5985d949fa
[docker-sonic-vs] Install sonic-platform-common package (#6587)
**- Why I did it**

sonic-utilities will become dependent upon sonic-platform-common as of https://github.com/Azure/sonic-utilities/pull/1386.

**- How I did it**

- Add sonic-platform-common as a dependency in docker-sonic-vs.mk
- Additionally, no longer install Python 2 packages of swsssdk and sonic-py-common, as they should no longer be needed.
2021-01-28 09:44:43 -08:00
Mahesh Maddikayala
f6b842edd3
[BCMSAI] Update BCMSAI debian to 4.3.0.10 with 6.5.21 SDK, and opennsl module to 6.5.21 (#6526)
BCMSAI 4.3.0.10, 6.5.21 SDK release with enhancements and fixes for vxlan, TD3 MMU, TD4-X9 EA support, etc.
2021-01-28 08:38:47 -08:00
Qi Luo
0e7287856c
[build]: stop prompt during build (#6585)
Some commands used during build will prompt user interactively, but this is not expected during build. Since most output is collected into log file, user could not see the prompt and feel the build process hangs.

- How I did it

Use mv command in non interactive mode
Redirect stdin to null if command output is collected into log file.
2021-01-28 02:21:38 -08:00
Guohan Lu
fb0b999361 [ci]: append job.attempt in memdump/log artifacts
azure pipepline does not allow upload same artifacts again.
thus, use job.attempt to uniquely name the test artifacts

Signed-off-by: Guohan Lu <lguohan@gmail.com>
2021-01-28 01:05:44 -08:00
Vaibhav Hemant Dixit
98298f7f7f
[sonic-sairedis] advance submodule to include fix for syncd crash during shutdown (#6581)
Remove unregisterMessageHandler from NetMsgRegistrar thread (#779)
2021-01-27 21:55:18 -08:00
lguohan
7d01613300
[ci]: correct ownership of artifacts (#6582)
Signed-off-by: Guohan Lu <lguohan@gmail.com>
2021-01-27 20:57:17 -08:00
Qi Luo
69c5832153
[ci]: Download artifact instead of using nfs storage (#6570)
I notice that I rerun a failed job (not the stages), the nfs store is already cleaned by previous failed jobs.
2021-01-27 19:58:58 -08:00
Guohan Lu
f7346cca32 [docker-fmp-frr]: remove blank lines in generated critical_process
Signed-off-by: Guohan Lu <lguohan@gmail.com>
2021-01-27 19:41:59 -08:00
Guohan Lu
34cca20cb6 [proc-exit-listener]: ignore blank lines
make proc-exit-listener more rebust

Signed-off-by: Guohan Lu <lguohan@gmail.com>
2021-01-27 19:41:59 -08:00
Shi Su
aab37b7f42
[FRR] Create a separate script to wait zebra to be ready to receive connections (#6519)
The requirement for zebra to be ready to accept connections is a generic problem that is not 
specific to bgpd. Making the script to wait for zebra socket a separate script and let bgpd and 
staticd to wait for zebra socket.
2021-01-27 12:36:02 -08:00
Kebo Liu
7f222e7bc1
[mellanox]: Update SAI to sonic2012 1.18.1.0 (#6566)
Changes in the new release:

1. Policy based hashing optimization
2. New attribute support for Max port headroom
3. Tunnel ECN map fixes
4. Tunnel EVPN skeleton extensions (peer attrib, maps)
5. Bridge port admin not affecting port admin (optimize port down time)
6. CRM new API for neighbors and tunnel termination entries
7. Improve FDB event for flush by bridge port (before, null bridge was reported to SONiC, now the bridge will be extracted from bridge port)
8. DHCP L2 v4+v6 traps (for ZTP use case)
9. Generic counter implementation

Signed-off-by: Kebo Liu <kebol@nvidia.com>
2021-01-27 12:29:28 -08:00
dflynn-Nokia
1f2797a56d
[docker-config-engine-stretch]: Fix dependency typo PYTHON2_SWSSCOMMON (#6568)
This commit fixes a typo in the fix delivered in PR #6538

syncd fails on the armhf platform within sonic-config-engine/portconfig.py when importing the following
'from swsscommon.swsscommon import ConfigDBConnector'
2021-01-27 12:27:41 -08:00
abdosi
cfa8fbbf1a
[baseimage]: Updates for Ebtables and support for multi-asic (#6542)
Following changes were done for ebtables:

- Support for Multi-asic platforms. Ebtable filters are installed in namespace for multi-asic and not host. On Single asic installed on  host.

- For Multi-asic platforms we don't want to install on host otherwise Namespace-to-Namespace communication does not happens since ARP Request are not forwarded.

- Updated to use text file to restore ebtables rules then the binary format. Rules are restore as part of Database docker init instead of rc.local

- Removed the ebtable service files for buster as not needed as filters are restored/installed as part of database docker init.
   All the binaries are pre-installed with ebtables* binary are same as ebatbles-legacy-* 

Signed-off-by: Abhishek Dosi <abdosi@microsoft.com>
2021-01-27 08:36:10 -08:00
Guohan Lu
f3a901c41e [ci]: build docker-ptf on vs platform
Signed-off-by: Guohan Lu <lguohan@gmail.com>
2021-01-27 08:28:21 -08:00
Guohan Lu
044efe72ca [build]: add _BUILD_ENV to specify env for dpkg-buildpackage
Signed-off-by: Guohan Lu <lguohan@gmail.com>
2021-01-27 08:28:21 -08:00
Guohan Lu
ca0e8cbe0e [docker-ptf]: build docker ptf
- combine docker-ptf-saithrift into docker-ptf docker
- build docker-ptf under platform vs
- remove docker-ptf for other platforms

Signed-off-by: Guohan Lu <lguohan@gmail.com>
2021-01-27 08:28:21 -08:00
Kebo Liu
9ff56445c9
Add hw-mgmt patch to support SDK OFFLINE event for handling flow within service firmware upgrade (#6550)
During ISSU, "mlxsw_minimal" driver still trying to access firmware, in some cases FW could return some wrong critical threshold value which will cause switch shutdown.

**- How I did it**
In order to prevent "mlxsw_minimal" driver from accessing ASIC during ISSU, SDK will raise "OFFLINE" 'udev' event
at the early beginning of such flow. When this event is received, hw-management will remove "mlxsw_minimal" driver.
There is no need to implement the opposite "ONLINE" event since this flow is ended up with "kexec".

**- How to verify it**
repeatedly perform warm reboot, make sure there is no switch shutdown occurred.
2021-01-27 15:39:54 +02:00
bingwang-ms
6fa807d0d0
[bgpmon]: Fix exception in bgpmon caused by duplicate bgp neighbor ID (#6546)
* Fix exception in bgpmon caused by duplicate keys
It is possible that BGP neighbors in IPv4 and IPv6 address families
share the same name (such as bgp monitor). However, such case is not
handled in bgpmon, and an Exception will be raised. This commit will
address the issue by Using set instead of list to avoid duplicate keys.
2021-01-26 23:01:42 -08:00
Qi Luo
e616a32950
[ci]: add master and 202012 into azure-pipelines trigger (#6560) 2021-01-26 15:42:04 -08:00
Prince Sunny
7337483381
[submodule]: update sonic-swss (#6561)
f4aefba - 2021-01-25 : [Mux] Fix repeating logs in case of tunnel creation fail (#1610) [Prince Sunny]
2021-01-26 13:35:11 -08:00
lguohan
a9a0e3062c
[ci]: archive kvmtest artifacts (#6567)
- archive logs
- archive kvm memdump when failed
- publish kvm test results

Signed-off-by: Guohan Lu <lguohan@gmail.com>
2021-01-26 13:09:56 -08:00
lguohan
30ae46ea7f
[ci]: add -k ceos option to setup t0 testbed (#6565)
this is due to command line change in
1e12790a93

this is due to command line change in
Azure/sonic-mgmt@1e12790

Signed-off-by: Guohan Lu <lguohan@gmail.com>
2021-01-26 02:28:27 -08:00
lguohan
6957e376be
[ci]: reset the owner for all files under working directory (#6557)
reset the owner for all files under working directory. some files were owned by root after build, which cause
next build to fail since directory cannot be cleanned.

Signed-off-by: Guohan Lu <lguohan@gmail.com>
2021-01-25 19:05:01 -08:00
Kebo Liu
84985e103d
[mellanox]: Update SDK to 4.4.2308, FW to *.2008.2308 (#6552)
Bugs fixes:
    All | Kernel | During system reload when CPU is loaded with heavy traffic, a Kernel Panic may occur.
    All | Modules, Port split | FW stuck when device rebooted with locked Optical Transceivers in split mode
    Spectrum-3 | PFC | On Spectrum-3 systems, slow reaction time to Rx pause packets on 40GbE ports may lead to buffer overflow on servers.
    Spectrum-3 | SN4700, Port Split | On rare occasion SN4700, conducting 100G split (4x25G) in NRZ when splitter port 1 or 2 are down, ports 3 and 4 will also go down.

Enahncments:
    All | Kernel | new notification on ISSU start, so other kernel drivers can disable any interface to ASIC

Signed-off-by: Kebo Liu <kebol@nvidia.com>
2021-01-25 10:52:22 -08:00
Tamer Ahmed
8d857fab16
[dhcp-relay]: Launch DHCP Relay On L3 Vlan (#6527)
Recent changes brought l2 vlan concept which do not have DHCP
clients behind them and so DHCP relay is not required. Also,
dhcpmon fails to launch on those vlans as their interfaces
lack IP addresses. This PR limit launch of both DHCP relay
and dhcpmon to L3 vlans only.

singed-off-by: Tamer Ahmed <tamer.ahmed@microsoft.com>
2021-01-25 10:48:48 -08:00
Guohan Lu
a38377e96d [submodule]: update sonic-swss
* f4e8245 2021-01-24 | [fpmsyncd] Skip routes to eth0 or docker0 (#1606) (HEAD, origin/master, origin/HEAD) [Shi Su]
* f4c3579 2021-01-23 | Enhance dynamic buffer calculation and bug fixes (#1601) [Stephen Sun]
* e800c9f 2021-01-22 | [logfile]: Add option to specify swss rec file name (#1546) [arlakshm]
* 1acf60e 2021-01-17 | Implementation of System ports initialization, Interface & Neighbor Synchronization... (#1431) [minionatwork]

Signed-off-by: Guohan Lu <lguohan@gmail.com>
2021-01-24 22:25:11 -08:00
Guohan Lu
4b5212b8b2 [vstest]: add default vs test
Check #6483

add test to make sure default route change in eth0 does not
affect the default route in the default vrf

Signed-off-by: Guohan Lu <lguohan@gmail.com>
2021-01-24 22:25:11 -08:00
lguohan
cd3ed549c3
[submodule]: update sonic-sairedis (#6544)
* 20b9573 2021-01-24 | [SAI]: update SAI submodule (#775) (HEAD, origin/master, origin/HEAD) [lguohan]
* 667c33d 2021-01-22 | [syncd] Comparison logic add support to LABEL attribute with higher priority (#764) [Kamil Cudnik]
* aaf5b98 2021-01-22 | [vslib]: Fix missing MACsec Create Port action (#770) [Ze Gan]

Signed-off-by: Guohan Lu <lguohan@gmail.com>
2021-01-24 21:01:47 -08:00
lguohan
3bc82e5556
[ci]: add vs tests (#6506)
Signed-off-by: Guohan Lu <lguohan@gmail.com>
2021-01-24 21:01:19 -08:00
Zhenhong Zhao
a171e6c5e4
[frrcfgd] introduce frrcfgd to manage frr config when frr_mgmt_framework_config is true (#5142)
- Support for non-template based FRR configurations (BGP, route-map, OSPF, static route..etc) using config DB schema.
- Support for save & restore - Jinja template based config-DB data read and apply to FRR during startup

**- How I did it**

- add frrcfgd service
- when frr_mgmg_framework_config is set, frrcfgd starts in bgp container
- when user changed the BGP or other related table entries in config DB, frrcfgd will run corresponding VTYSH commands to program on FRR.
- add jinja template to generate FRR config file to be used by FRR daemons while bgp container restarted

**- How to verify it**
1. Add/delete data on config DB and then run VTYSH "show running-config" command to check if FRR configuration changed.
1. Restart bgp container and check if generated FRR config file is correct and run VTYSH "show running-config" command to check if FRR configuration is consistent with attributes in config DB

Co-authored-by: Zhenhong Zhao <zhenhong.zhao@dell.com>
2021-01-24 17:57:03 -08:00