Commit Graph

4304 Commits

Author SHA1 Message Date
gechiang
c5d47791e8
[broadcom]: Fix BRCM Syncd Error:syncd#/supervisord: syncd sh: 1: ethtool: not found (#6615)
Starting with BRCM SAI 4.3.1.5 we see the following "ethtool: not found" error in syslog during boot up:
```
Jan 27 07:36:14.712472 str-s6100-acs-1 INFO syncd#/supervisord: syncd sh: 1:
Jan 27 07:36:14.712844 str-s6100-acs-1 INFO syncd#/supervisord: syncd ethtool: not found
Jan 27 07:36:14.713228 str-s6100-acs-1 INFO syncd#/supervisord: syncd #015
Jan 27 07:36:14.713840 str-s6100-acs-1 INFO syncd#syncd: [0] SAI_API_HOSTIF:_brcm_sai_hostif_speed_set:11894 cmd ethtool -s Ethernet39 speed 40000 rc:32512
Jan 27 07:36:14.717204 str-s6100-acs-1 NOTICE swss#orchagent: :- setHostIntfsOperStatus: Set operation status DOWN to host interface Ethernet39
Jan 27 07:36:14.717204 str-s6100-acs-1 NOTICE swss#orchagent: :- initPort: Initialized port Ethernet39
Jan 27 07:36:14.717204 str-s6100-acs-1 NOTICE swss#orchagent: :- initializePort: Initializing port alias:Ethernet36 pid:1000000000040
Jan 27 07:36:14.726793 str-s6100-acs-1 NOTICE swss#portsyncd: :- onMsg: nlmsg type:16 key:Ethernet36 admin:0 oper:0 addr:4c:76:25:f5:48:80 ifindex:75 master:0
Jan 27 07:36:14.727967 str-s6100-acs-1 NOTICE swss#portsyncd: :- onMsg: Publish Ethernet36(ok) to state db
Jan 27 07:36:14.729331 str-s6100-acs-1 NOTICE swss#orchagent: :- addHostIntfs: Create host interface for port Ethernet36
Jan 27 07:36:14.752398 str-s6100-acs-1 INFO syncd#/supervisord: syncd sh: 1: ethtool: not found#015
Jan 27 07:36:14.752689 str-s6100-acs-1 INFO syncd#syncd: [0] SAI_API_HOSTIF:_brcm_sai_hostif_speed_set:11894 cmd ethtool -s Ethernet36 speed 40000 rc:32512
Jan 27 07:36:14.756050 str-s6100-acs-1 NOTICE swss#orchagent: :- setHostIntfsOperStatus: Set operation status DOWN to host interface Ethernet36
Jan 27 07:36:14.757585 str-s6100-acs-1 NOTICE swss#orchagent: :- initPort: Initialized port Ethernet36
```
It seems that starting with BRCM SAI 4.2.1.5, syncd uses ethtool to set the host interface speed. Since ethtool was not part of the syncd Docker image, we observe this "ethtool not found" issue.
2021-01-30 05:28:33 -08:00
Volodymyr Boiko
481870670d
[barefoot][platform] platform API 2.0 fixes (#6607)
Improve Python 3 support in Barefoot's sonic_platform.

Signed-off-by: Volodymyr Boyko <volodymyrx.boiko@intel.com>
2021-01-29 17:29:37 -08:00
Tamer Ahmed
284c2738e8
[sonic-device-data]: Update BRCM Tunnel/ECMP Parameter For 7050cx3 SKUs (#6415)
Update Tunnel and ECMP parameters for brcm 7050cx3 48x50G+8x100G and 32x100G SKUs.

Signed-off-by: Tamer Ahmed <tamer.ahmed@microsoft.com>
2021-01-29 14:15:48 -08:00
dflynn-Nokia
2a2c6b73fa
[submodule] update sonic-sairedis (#6609)
This update includes the following changes

  > [syncd armhf] Fix syncd crash when running community test suites (#777)
  > Revert "[tests]:Add unittest for MACsec on p2p establishment (#771)"
  > [tests]:Add unittest for MACsec on p2p establishment (#771)
  > [tests] Enable azure pipeline make check to respect unittests (#760)
2021-01-29 11:32:27 -08:00
Joe LeVeque
f9d75a046f
[build_debian.sh] Freeze pip2 < version 21 (#6597)
**- Why I did it**

As per https://pypi.org/project/pip/, pip 21.0 (January 2021) no longer supports Python 2. Most places in the codebase have already been pinned, but this one was missed.

**- How I did it**

Pin pip2 < version 21 in build_debian.sh
2021-01-29 10:24:24 -08:00
lguohan
759936c67c
[submodule]: update sonic-swss (#6601)
* 832815e 2021-01-28 | [orchagent]: Add MACsec Orchagent (#1474) (HEAD, origin/master, origin/HEAD) [Ze Gan]
* dd4e409 2021-01-28 | [MACsecMgr]: Add MACsec Manager (#1475) [Ze Gan]
* 91e231c 2021-01-28 | [portsorch] Configure hostif tagging for subports (#1573) [Vitaliy Senchyshyn]
* 008325c 2021-01-29 | [PortsOrch] Add reference counting to ports for ACL bindings (#1614) [chaoskao]
* bbd2ca6 2021-01-28 | [crm]: Ignore unsupported/non-implemented switch attributes (#1613) [Prabhu Sreenivasan]

Signed-off-by: Guohan Lu <lguohan@gmail.com>
2021-01-29 08:21:55 -08:00
arlakshm
b5225407ef
[baseimage]: add docker ps to the sudoer file (#6604)
fixes Azure/sonic-utilities#1389

With the recent changes to the sudoers files, the show commands fail for read-only users.
The problem is that 'docker ps' fails in the function [get_routing_stack()](8a1109ed30/show/main.py (L54)), so all the CLI commands fail.

Signed-off-by: Arvindsrinivasan Lakshmi Narasimhan <arlakshm@microsoft.com>
2021-01-29 08:16:32 -08:00
Qi Luo
e623c903bb
Revert "[build]: disable unit tests for sonic-utilities" (#6598)
This reverts commit 470ed18a6b.
2021-01-29 02:08:56 -08:00
arlakshm
ff8cc49b18
[multi asic] add ip netns identify command to sudoer (#6591)
Signed-off-by: Arvindsrinivasan Lakshmi Narasimhan <arlakshm@microsoft.com>

- Why I did it
The command sudo ip netns identify <pid> is used in the function get_current_namespace
to check whether the CLI command is running in the host context or within a namespace.

This function is used by every CLI command, so sudo ip netns identify <pid> needs to be added to the sudoers file to allow users with read-only access to run show CLI commands.

This problem does not occur on single-ASIC platforms.

- How I did it
Add ip netns identify [0-9]* to sudoers file.
2021-01-28 23:12:01 -08:00
Qi Luo
1c8d5ec500
Bump pyyaml from 5.3.1 to 5.4.1 (#6511)
RCE resolved in new version https://github.com/yaml/pyyaml/issues/420
2021-01-28 10:46:56 -08:00
Joe LeVeque
5985d949fa
[docker-sonic-vs] Install sonic-platform-common package (#6587)
**- Why I did it**

sonic-utilities will become dependent upon sonic-platform-common as of https://github.com/Azure/sonic-utilities/pull/1386.

**- How I did it**

- Add sonic-platform-common as a dependency in docker-sonic-vs.mk
- Additionally, no longer install Python 2 packages of swsssdk and sonic-py-common, as they should no longer be needed.
2021-01-28 09:44:43 -08:00
Mahesh Maddikayala
f6b842edd3
[BCMSAI] Update BCMSAI debian to 4.3.0.10 with 6.5.21 SDK, and opennsl module to 6.5.21 (#6526)
BCMSAI 4.3.0.10, 6.5.21 SDK release with enhancements and fixes for vxlan, TD3 MMU, TD4-X9 EA support, etc.
2021-01-28 08:38:47 -08:00
Qi Luo
0e7287856c
[build]: stop prompt during build (#6585)
Some commands used during the build prompt the user interactively, which is not expected during a build. Since most output is collected into a log file, the user cannot see the prompt and the build appears to hang.

- How I did it

- Use the mv command in non-interactive mode.
- Redirect stdin to null if command output is collected into a log file.
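
The stdin redirection described above can be illustrated with a small Python sketch. It is only an illustration under assumptions: the build itself is driven by make and shell, and `run_logged` is a hypothetical helper name, not something from the build system. The point it shows is that a command whose output goes to a log file gets the null device as stdin, so a prompt reads end-of-file at once instead of waiting for input that nobody can see being requested.

```python
import subprocess


def run_logged(cmd, log_path):
    """Run cmd with its output appended to log_path and stdin set to the null device.

    Hypothetical helper for illustration: any read from stdin returns
    end-of-file immediately, so an interactive prompt cannot block a
    command whose output is hidden in a log file.
    """
    with open(log_path, "ab") as log:
        result = subprocess.run(
            cmd,
            stdin=subprocess.DEVNULL,   # nothing to read, so no waiting on a prompt
            stdout=log,
            stderr=subprocess.STDOUT,   # keep errors in the same log
            check=False,
        )
    return result.returncode
```

A command that tries to read a line from stdin under `run_logged` receives an empty string and carries on, which is the behaviour wanted for an unattended build.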
2021-01-28 02:21:38 -08:00
Guohan Lu
fb0b999361
[ci]: append job.attempt in memdump/log artifacts
Azure Pipelines does not allow uploading the same artifact again,
so use job.attempt to uniquely name the test artifacts.

Signed-off-by: Guohan Lu <lguohan@gmail.com>
2021-01-28 01:05:44 -08:00
Vaibhav Hemant Dixit
98298f7f7f
[sonic-sairedis] advance submodule to include fix for syncd crash during shutdown (#6581)
Remove unregisterMessageHandler from NetMsgRegistrar thread (#779)
2021-01-27 21:55:18 -08:00
lguohan
7d01613300
[ci]: correct ownership of artifacts (#6582)
Signed-off-by: Guohan Lu <lguohan@gmail.com>
2021-01-27 20:57:17 -08:00
Qi Luo
69c5832153
[ci]: Download artifact instead of using nfs storage (#6570)
I noticed that when I rerun a failed job (not the stage), the NFS store has already been cleaned by the previous failed job.
2021-01-27 19:58:58 -08:00
Guohan Lu
f7346cca32
[docker-fpm-frr]: remove blank lines in generated critical_process
Signed-off-by: Guohan Lu <lguohan@gmail.com>
2021-01-27 19:41:59 -08:00
Guohan Lu
34cca20cb6
[proc-exit-listener]: ignore blank lines
Make proc-exit-listener more robust.

Signed-off-by: Guohan Lu <lguohan@gmail.com>
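
Ignoring blank lines while reading a list of critical processes can be sketched as follows. This is a minimal sketch, not the proc-exit-listener code: it assumes one entry per line, and the function name `read_critical_processes` and the sample entries are made up for illustration.

```python
def read_critical_processes(lines):
    """Return the non-blank entries from an iterable of text lines.

    Assumes one entry per line (an assumption for this sketch); lines
    that are empty or contain only whitespace are skipped.
    """
    entries = []
    for line in lines:
        entry = line.strip()
        if not entry:
            continue  # blank line: nothing to monitor
        entries.append(entry)
    return entries


# Illustrative input with blank lines mixed in.
print(read_critical_processes(["bgpd\n", "\n", "   \n", "zebra\n"]))
```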
2021-01-27 19:41:59 -08:00
Shi Su
aab37b7f42
[FRR] Create a separate script to wait zebra to be ready to receive connections (#6519)
The requirement for zebra to be ready to accept connections is a generic problem that is not
specific to bgpd. Move the wait for the zebra socket into a separate script and let both bgpd and
staticd wait for the zebra socket.
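
The idea of waiting until a daemon's socket accepts connections can be sketched in Python. This is an illustration under assumptions, not the script added by this commit: the function name, the timeout values, and the choice of a Unix stream socket are all assumptions, and the socket path is left to the caller.

```python
import socket
import time


def wait_for_unix_socket(path, timeout=30.0, interval=0.5):
    """Poll until a Unix stream socket at path accepts a connection.

    Returns True once a connection succeeds, or False if timeout
    seconds pass first. Hypothetical helper for illustration.
    """
    deadline = time.monotonic() + timeout
    while True:
        try:
            with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as sock:
                sock.connect(path)
            return True
        except OSError:
            # Socket file missing, or nobody is listening on it yet.
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval)
```

Keeping the wait in one reusable helper mirrors the commit's design choice: every daemon that depends on the socket can call the same check before it starts.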
2021-01-27 12:36:02 -08:00
Kebo Liu
7f222e7bc1
[mellanox]: Update SAI to sonic2012 1.18.1.0 (#6566)
Changes in the new release:

1. Policy based hashing optimization
2. New attribute support for Max port headroom
3. Tunnel ECN map fixes
4. Tunnel EVPN skeleton extensions (peer attrib, maps)
5. Bridge port admin not affecting port admin (optimize port down time)
6. CRM new API for neighbors and tunnel termination entries
7. Improve FDB event for flush by bridge port (before, null bridge was reported to SONiC, now the bridge will be extracted from bridge port)
8. DHCP L2 v4+v6 traps (for ZTP use case)
9. Generic counter implementation

Signed-off-by: Kebo Liu <kebol@nvidia.com>
2021-01-27 12:29:28 -08:00
dflynn-Nokia
1f2797a56d
[docker-config-engine-stretch]: Fix dependency typo PYTHON2_SWSSCOMMON (#6568)
This commit fixes a typo in the fix delivered in PR #6538

syncd fails on the armhf platform within sonic-config-engine/portconfig.py when importing the following
'from swsscommon.swsscommon import ConfigDBConnector'
2021-01-27 12:27:41 -08:00
abdosi
cfa8fbbf1a
[baseimage]: Updates for Ebtables and support for multi-asic (#6542)
The following changes were made for ebtables:

- Support for multi-ASIC platforms. Ebtables filters are installed in the namespaces on multi-ASIC platforms, not on the host. On single-ASIC platforms they are installed on the host.

- On multi-ASIC platforms we don't want to install the filters on the host; otherwise namespace-to-namespace communication does not happen, since ARP requests are not forwarded.

- Updated to use a text file rather than the binary format to restore ebtables rules. Rules are restored as part of the database docker init instead of rc.local.

- Removed the ebtables service files for buster, as they are not needed now that filters are restored/installed as part of the database docker init.
   All the binaries are pre-installed, and the ebtables* binaries are the same as ebtables-legacy-*.

Signed-off-by: Abhishek Dosi <abdosi@microsoft.com>
2021-01-27 08:36:10 -08:00
Guohan Lu
f3a901c41e
[ci]: build docker-ptf on vs platform
Signed-off-by: Guohan Lu <lguohan@gmail.com>
2021-01-27 08:28:21 -08:00
Guohan Lu
044efe72ca
[build]: add _BUILD_ENV to specify env for dpkg-buildpackage
Signed-off-by: Guohan Lu <lguohan@gmail.com>
2021-01-27 08:28:21 -08:00
Guohan Lu
ca0e8cbe0e
[docker-ptf]: build docker ptf
- combine docker-ptf-saithrift into docker-ptf docker
- build docker-ptf under platform vs
- remove docker-ptf for other platforms

Signed-off-by: Guohan Lu <lguohan@gmail.com>
2021-01-27 08:28:21 -08:00
Kebo Liu
9ff56445c9
Add hw-mgmt patch to support SDK OFFLINE event for handling flow within service firmware upgrade (#6550)
During ISSU, the "mlxsw_minimal" driver still tries to access the firmware. In some cases the FW can return a wrong critical threshold value, which causes the switch to shut down.

**- How I did it**
In order to prevent the "mlxsw_minimal" driver from accessing the ASIC during ISSU, the SDK raises an "OFFLINE" 'udev' event
at the very beginning of this flow. When this event is received, hw-management removes the "mlxsw_minimal" driver.
There is no need to implement the opposite "ONLINE" event, since this flow ends with "kexec".

**- How to verify it**
Repeatedly perform warm reboot and make sure that no switch shutdown occurs.
2021-01-27 15:39:54 +02:00
bingwang-ms
6fa807d0d0
[bgpmon]: Fix exception in bgpmon caused by duplicate bgp neighbor ID (#6546)
* Fix exception in bgpmon caused by duplicate keys
It is possible that BGP neighbors in IPv4 and IPv6 address families
share the same name (such as bgp monitor). However, this case is not
handled in bgpmon, and an exception is raised. This commit
addresses the issue by using a set instead of a list to avoid duplicate keys.
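
The effect of collecting keys in a set rather than a list can be sketched as follows. This is a minimal illustration, not the bgpmon code: the function name `merge_neighbor_keys` and the sample neighbor names are hypothetical.

```python
def merge_neighbor_keys(ipv4_keys, ipv6_keys):
    """Combine neighbor keys from both address families without duplicates.

    A set keeps a single copy of a key that appears in both families,
    whereas concatenating two lists would keep it twice.
    """
    merged = set(ipv4_keys)
    merged.update(ipv6_keys)
    return merged


# "bgp-monitor" appears in both families but is stored only once.
keys = merge_neighbor_keys(["10.0.0.57", "bgp-monitor"], ["fc00::72", "bgp-monitor"])
print(len(keys))  # prints 3
```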
2021-01-26 23:01:42 -08:00
Qi Luo
e616a32950
[ci]: add master and 202012 into azure-pipelines trigger (#6560)
2021-01-26 15:42:04 -08:00
Prince Sunny
7337483381
[submodule]: update sonic-swss (#6561)
f4aefba - 2021-01-25 : [Mux] Fix repeating logs in case of tunnel creation fail (#1610) [Prince Sunny]
2021-01-26 13:35:11 -08:00
lguohan
a9a0e3062c
[ci]: archive kvmtest artifacts (#6567)
- archive logs
- archive kvm memdump when failed
- publish kvm test results

Signed-off-by: Guohan Lu <lguohan@gmail.com>
2021-01-26 13:09:56 -08:00
lguohan
30ae46ea7f
[ci]: add -k ceos option to setup t0 testbed (#6565)
This is due to the command-line change in
Azure/sonic-mgmt@1e12790

Signed-off-by: Guohan Lu <lguohan@gmail.com>
2021-01-26 02:28:27 -08:00
lguohan
6957e376be
[ci]: reset the owner for all files under working directory (#6557)
Reset the owner for all files under the working directory. Some files were owned by root after the build, which caused
the next build to fail since the directory could not be cleaned.

Signed-off-by: Guohan Lu <lguohan@gmail.com>
2021-01-25 19:05:01 -08:00
Kebo Liu
84985e103d
[mellanox]: Update SDK to 4.4.2308, FW to *.2008.2308 (#6552)
Bug fixes:
    All | Kernel | During system reload when CPU is loaded with heavy traffic, a Kernel Panic may occur.
    All | Modules, Port split | FW stuck when device rebooted with locked Optical Transceivers in split mode
    Spectrum-3 | PFC | On Spectrum-3 systems, slow reaction time to Rx pause packets on 40GbE ports may lead to buffer overflow on servers.
    Spectrum-3 | SN4700, Port Split | On rare occasion SN4700, conducting 100G split (4x25G) in NRZ when splitter port 1 or 2 are down, ports 3 and 4 will also go down.

Enhancements:
    All | Kernel | new notification on ISSU start, so other kernel drivers can disable any interface to ASIC

Signed-off-by: Kebo Liu <kebol@nvidia.com>
2021-01-25 10:52:22 -08:00
Tamer Ahmed
8d857fab16
[dhcp-relay]: Launch DHCP Relay On L3 Vlan (#6527)
Recent changes introduced the L2 VLAN concept. L2 VLANs do not have DHCP
clients behind them, so DHCP relay is not required. Also,
dhcpmon fails to launch on those VLANs as their interfaces
lack IP addresses. This PR limits the launch of both DHCP relay
and dhcpmon to L3 VLANs only.

Signed-off-by: Tamer Ahmed <tamer.ahmed@microsoft.com>
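
Selecting only the VLANs that have an IP address can be sketched in Python. This is a sketch under stated assumptions, not the change made by this PR: it assumes VLAN interface keys shaped like `Vlan1000` (no address) or `Vlan1000|192.168.0.1/21` (address attached), and the function name `l3_vlans` is hypothetical.

```python
def l3_vlans(vlan_interface_keys):
    """Return the sorted names of VLANs that have at least one IP prefix.

    Assumes keys shaped like "Vlan1000" (no address) or
    "Vlan1000|192.168.0.1/21" (address attached); this key layout is an
    assumption made for the sketch.
    """
    names = set()
    for key in vlan_interface_keys:
        name, sep, prefix = key.partition("|")
        if sep and prefix:
            names.add(name)  # an address is configured, so treat it as L3
    return sorted(names)


# Vlan2000 has no address, so it is left out.
print(l3_vlans(["Vlan1000", "Vlan1000|192.168.0.1/21", "Vlan2000"]))  # prints ['Vlan1000']
```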
2021-01-25 10:48:48 -08:00
Guohan Lu
a38377e96d
[submodule]: update sonic-swss
* f4e8245 2021-01-24 | [fpmsyncd] Skip routes to eth0 or docker0 (#1606) (HEAD, origin/master, origin/HEAD) [Shi Su]
* f4c3579 2021-01-23 | Enhance dynamic buffer calculation and bug fixes (#1601) [Stephen Sun]
* e800c9f 2021-01-22 | [logfile]: Add option to specify swss rec file name (#1546) [arlakshm]
* 1acf60e 2021-01-17 | Implementation of System ports initialization, Interface & Neighbor Synchronization... (#1431) [minionatwork]

Signed-off-by: Guohan Lu <lguohan@gmail.com>
2021-01-24 22:25:11 -08:00
Guohan Lu
4b5212b8b2
[vstest]: add default vs test
Check #6483

add test to make sure default route change in eth0 does not
affect the default route in the default vrf

Signed-off-by: Guohan Lu <lguohan@gmail.com>
2021-01-24 22:25:11 -08:00
lguohan
cd3ed549c3
[submodule]: update sonic-sairedis (#6544)
* 20b9573 2021-01-24 | [SAI]: update SAI submodule (#775) (HEAD, origin/master, origin/HEAD) [lguohan]
* 667c33d 2021-01-22 | [syncd] Comparison logic add support to LABEL attribute with higher priority (#764) [Kamil Cudnik]
* aaf5b98 2021-01-22 | [vslib]: Fix missing MACsec Create Port action (#770) [Ze Gan]

Signed-off-by: Guohan Lu <lguohan@gmail.com>
2021-01-24 21:01:47 -08:00
lguohan
3bc82e5556
[ci]: add vs tests (#6506)
Signed-off-by: Guohan Lu <lguohan@gmail.com>
2021-01-24 21:01:19 -08:00
Zhenhong Zhao
a171e6c5e4
[frrcfgd] introduce frrcfgd to manage frr config when frr_mgmt_framework_config is true (#5142)
- Support for non-template based FRR configurations (BGP, route-map, OSPF, static route..etc) using config DB schema.
- Support for save & restore - Jinja template based config-DB data read and apply to FRR during startup

**- How I did it**

- add frrcfgd service
- when frr_mgmt_framework_config is set, frrcfgd starts in the bgp container
- when the user changes the BGP or other related table entries in config DB, frrcfgd runs the corresponding VTYSH commands to program FRR.
- add a jinja template to generate the FRR config file used by the FRR daemons when the bgp container is restarted

**- How to verify it**
1. Add/delete data in config DB and then run the VTYSH "show running-config" command to check whether the FRR configuration changed.
2. Restart the bgp container, check that the generated FRR config file is correct, and run the VTYSH "show running-config" command to check that the FRR configuration is consistent with the attributes in config DB.

Co-authored-by: Zhenhong Zhao <zhenhong.zhao@dell.com>
2021-01-24 17:57:03 -08:00
Dmytro Shevchuk
dd0e1100a5
[sonic-cfggen] parse optional fec and autoneg fields from hwsku.json (#6155)
**- Why I did it**

Currently `hwsku.json` and `platform.json` don't support optional fields. For example, there is no way to add a `fec` or `autoneg` field using `platform.json` and `hwsku.json`.

**- How I did it**
Added parsing of optional fields from hwsku.json.

**- How to verify it**
Add an optional field to `hwsku.json`. After the first boot a new `config_db.json` is generated, or you can generate it using the `sonic-cfggen` command. The optional field from `hwsku.json` must appear in this file; you can also check with the command `redis-cli hgetall PORT_TABLE:Ethernet0`.
Example of a `hwsku.json` that must be parsed:
```
{
    "interfaces": {
        "Ethernet0": {
            "default_brkout_mode": "1x100G[40G]",
            "fec": "rs",
            "autoneg": "0"
        },
...
}
```
Example of generated `config_db.json`:
```
    "PORT": {
        "Ethernet0": {
            "alias": "Ethernet0",
            "lanes": "0,1,2,3",
            "speed": "100000",
            "index": "1",
            "admin_status": "up",
            "fec": "rs",
            "autoneg": "0",
            "mtu": "9100"
        },
```
So we can see these entries in the redis db:
```
admin@sonic:~$ redis-cli hgetall PORT_TABLE:Ethernet0

 1) "alias"
 2) "Ethernet0"
 3) "lanes"
 4) "0,1,2,3"
 5) "speed"
 6) "100000"
 7) "index"
 8) "1"
 9) "admin_status"
10) "up"
11) "fec"
12) "rs"
13) "autoneg"
14) "0"
15) "mtu"
16) "9100"
17) "description"
18) ""
19) "oper_status"
20) "up"
```

This is also a way to fix the `FEC` field in `show interfaces status`, but the `FEC` field also needs to be added to `hwsku.json`.
Before:
```
admin@sonic:~$ show interfaces status
  Interface            Lanes    Speed    MTU    FEC        Alias    Vlan    Oper    Admin             Type    Asym PFC
-----------  ---------------  -------  -----  -----  -----------  ------  ------  -------  ---------------  ----------
  Ethernet0          0,1,2,3     100G   9100     N/A    Ethernet0  routed      up       up  QSFP28 or later         N/A
```
After:
```
admin@sonic:~$ show interfaces status
  Interface            Lanes    Speed    MTU    FEC        Alias    Vlan    Oper    Admin             Type    Asym PFC
-----------  ---------------  -------  -----  -----  -----------  ------  ------  -------  ---------------  ----------
  Ethernet0          0,1,2,3     100G   9100     rs    Ethernet0  routed      up       up  QSFP28 or later         N/A
```
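
The parsing described in this commit can be sketched in Python. This is a minimal sketch and not the sonic-cfggen implementation: the names `OPTIONAL_FIELDS` and `apply_optional_fields` are hypothetical, and the sample data is a shortened version of the `hwsku.json` example above.

```python
import json

# Optional per-port fields named in this commit.
OPTIONAL_FIELDS = ("fec", "autoneg")


def apply_optional_fields(ports, hwsku):
    """Copy optional per-port fields from parsed hwsku data into the port table.

    ports: dict of port name -> attribute dict (modified in place).
    hwsku: dict parsed from hwsku.json, with an "interfaces" mapping.
    Hypothetical helper for illustration only.
    """
    for name, attrs in hwsku.get("interfaces", {}).items():
        if name not in ports:
            continue  # port unknown to the port table: nothing to update
        for field in OPTIONAL_FIELDS:
            if field in attrs:
                ports[name][field] = attrs[field]
    return ports


hwsku = json.loads("""
{
    "interfaces": {
        "Ethernet0": {
            "default_brkout_mode": "1x100G[40G]",
            "fec": "rs",
            "autoneg": "0"
        }
    }
}
""")
ports = {"Ethernet0": {"alias": "Ethernet0", "speed": "100000", "mtu": "9100"}}
print(apply_optional_fields(ports, hwsku)["Ethernet0"]["fec"])  # prints rs
```

Only the fields listed in `OPTIONAL_FIELDS` are copied, so other keys such as `default_brkout_mode` stay out of the port table in this sketch.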
2021-01-24 17:46:33 -08:00
Praveen Chaudhary
24df482e0e
[yang_model_test]: Tests for default value of docker_routing_config_mode and Empty ACL ports. (#6470)
Tests for default value of docker_routing_config_mode and Empty ACL ports.

Signed-off-by: Praveen Chaudhary <pchaudhary@linkedin.com>
2021-01-24 17:33:12 -08:00
lguohan
0daad0b51d
[ci]: build syncd-rpc for broadcom and mellanox (#6522)
Signed-off-by: Guohan Lu <lguohan@gmail.com>
2021-01-24 17:31:47 -08:00
Vadym Hlushko
48e7116bcd
[DPB][SN3700C] extended set of speeds for split modes (#6277)
The platform.json and hwsku.json files do not have a full set of speeds for split modes.

Signed-off-by: Vadym Hlushko <vadymh@nvidia.com>
2021-01-24 16:33:52 -08:00
Vadym Hlushko
709c1ecb06
[DPB][SN4700] extended set of speeds for split modes (#6278)
The platform.json and hwsku.json files do not have a full set of speeds for split modes.

Signed-off-by: Vadym Hlushko <vadymh@nvidia.com>
2021-01-24 16:33:25 -08:00
Antonina Melnyk
da7f80d70b
[barefoot] Fixes for platform API (#6487)
There was a mismatch between the Eeprom class method names and the methods called from the Eeprom class.

Signed-off-by: Antonina Melnyk <antoninax.melnyk@intel.com>
2021-01-24 16:32:18 -08:00
judyjoseph
46b3bd5503
[teamd]: Increase wait timeout for teamd docker stop to clean Port channels. (#6537)
The Portchannels were not getting cleaned up because the cleanup took more than 10 seconds, which is the default docker timeout after which a SIGKILL is sent.
Fixes #6199
To check whether this also resolves the issue in 201911: #6503

This issue is seen significantly more often in the master branch than in 201911 because the Portchannel cleanup takes more time in master. Tested on a DUT with 8 Port Channels.

master

    admin@str-s6000-acs-8:~$ time sudo systemctl stop teamd
    real    0m15.599s
    user    0m0.061s
    sys     0m0.038s
Sonic 201911.v58

    admin@str-s6000-acs-8:~$ time sudo systemctl stop teamd
    real    0m5.541s
    user    0m0.020s
    sys     0m0.028s
2021-01-23 20:57:52 -08:00
Joe LeVeque
238803d6bf
[sonic-host-services] Report unit test coverage (#6533)
Report unit test coverage of the sonic-host-services package at build time.
2021-01-23 00:32:06 -08:00
Tamer Ahmed
8ce1e3ed92
[build-docker-buster]: Install libboost 1.71 In Build Docker (#6532)
Install the newest buster version of libboost (v1.71) in the build docker.

Signed-off-by: Tamer Ahmed <tamer.ahmed@microsoft.com>
2021-01-23 00:29:35 -08:00
Joe LeVeque
d4cde6d310
[process-reboot-cause] Make process-reboot-cause executable (#6534)
process-reboot-cause script should be executable.
2021-01-23 00:29:13 -08:00