Commit Graph

3596 Commits

Author SHA1 Message Date
Dror Prital
940944e41c
Support new SKU under the name of SN2700-D40C8S8 (#6822)
#### Why I did it

Add new SKU for SN2700 Mellanox system that supports the following port configuration:
8 X 100G
40 X 50G
8 X 10G

#### How I did it

Add new Folder - "Mellanox-SN2700-D40C8S8" under /sonic-buildimage/device/mellanox/x86_64-mlnx_msn2700-r0/
that contains the relevant files supporting this SKU

#### How to verify it

Bring up the image, run "show interface status" and make sure that all ports are up and reflect the following requirement:
Port 1/3 will be used as 4x10G
Port 2/4 - Not exist (blocked since 1 and 3 split to 4)
Port 7/8/9/10/23/24/25/26 will used as 100G
All other ports will be used as 2x50G

#### Which release branch to backport (provide reason below if selected)

- [ ] 201811
- [X] 201911
- [ ] 202006
- [ ] 202012

#### Description for the changelog

Support new SKU under the name of SN2700-D40C8S8
2021-02-21 09:24:45 -08:00
Qi Luo
712f3311fb
[mgmt-framework]: Update submodule (#6829)
Including commits:
```
58a77fa 2021-02-20 | Git clone go dependencies instead of 'go get' (#79) [Sachin Holla]
```
2021-02-19 22:57:10 -08:00
Abhishek Dosi
fa1934f715 [submodule update] sonic-utilities
Refactor neighbor_advertiser script (#1447)

Signed-off-by: Abhishek Dosi <abdosi@microsoft.com>
2021-02-19 18:38:53 -08:00
Prince Sunny
10e4fd637c Submodule update for restapi (#6808) 2021-02-19 16:10:25 -08:00
Abhishek Dosi
dc306eeba3 [submodule-update] sonic-utilities
02438f953aafa3303792eda2309f8f3303e55dc7 (HEAD -> 201911, origin/201911) Cherry-pick Master PR for route-checker tool (#1433)
e54fb69f7323f6ef48f44a1a893fe8266fd6f817 [201911][vnet] Add "vnet_route_check" script (#1443)

Signed-off-by: Abhishek Dosi <abdosi@microsoft.com>
2021-02-18 18:14:25 -08:00
Kebo Liu
5ec164b694 [Mellanox] [platform API] Fix “local variable 'label_port' referenced before assignment” error (#6419)
In rare case can see that xcvrd failed due to "UnboundLocalError: local variable 'label_port' referenced before assignment"

Init "label_port" as None at the beginning of the function, to avoid the case that "label_port" not assigned.
2021-02-18 18:10:54 -08:00
Roy Lee
ce6cc3821f [device/accton]: As7816-64x, fix memory leakage on accton fan monitor. (#6168)
It's been reported that accton fan monitor process keeps consuming memory after few days.
The amount of memory occupied increases in linear and never leased.

Signed-off-by: roy_lee <roy_lee@edge-core.com>
2021-02-18 18:10:22 -08:00
Wirut Getbamrung
a5de91069c [device/celestica]: Add thermalctld support on DX010 platform APIs (#6089)
**- Why I did it**
- The thermalctld daemon on the Pmon docker requires support from the thermal manager API.

**- How I did it**
- Removed the old function for detecting a faulty fan.
- Removed the old function for detecting excess temperature.
- Implement thermal_manager APIs based on ThermalManagerBase
- Implement thermal_conditions APIs based on ThermalPolicyConditionBase
- Implement thermal_actions APIs based on ThermalPolicyActionBase
- Implement thermal_info APIs based on ThermalPolicyInfoBase
- Add thermal_policy.json
2021-02-18 18:09:57 -08:00
Volodymyr Boiko
06334ff438 [barefoot][device][plugins] Fix sfp reset (#6745)
Fix sfp reset in Barefoot's sfputil

Signed-off-by: Volodymyr Boyko <volodymyrx.boiko@intel.com>
2021-02-18 18:07:49 -08:00
SuvarnaMeenakshi
9e777e90a0 [multi_asic][vs]: Add dependency in teamd service to start after topology service(#6594)
[multi_asic][vs]: Add dependency in teamd service to start after topology service.
- Why I did it
In multi-asic VS, topology service is run after database service to set up the internal asic topology.
swss and syncd have a dependency to start after topology service is run so that the interfaces are moved to right namespace and created in the right namespace. In case of multi-asic vs, during the initial boot up, when there is no configuration added, teamd service starts and swss/syncd do not start as topology service does not start. Upon loading configuration using config_db or minigraph, swss and sycnd start up , but teamd is not restarted as swss is not stopped and started. This causes teamd to be in a bad state and requires a reload of config.

- How I did it
Add dependency in teamd service to start after topology service is completed.

- How to verify it
No change in single asic vs or platform.
No change in multi-asic regular image.
Change only in multi-asic VS. Bring up a multi-asic VS image without any configration, teamd service will fail to start due to dependency failure. Load minigraph, start topology service, load configuration, ensure all services come up.
Signed-off-by: SuvarnaMeenakshi <sumeenak@microsoft.com>
2021-02-18 18:05:10 -08:00
judyjoseph
86a13610cb [docker-fpm-frr]: TSA/B/C changes for multi-asic (#6510)
- Introduced TS common file in docker as well and moved common functions.
- TSA/B/C scripts run only in BGP instances for front end ASICs.
       In addition skip enforcing it on route maps used between internal BGP sessions.

admin@str--acs-1:~$ sudo /usr/bin/TSA
System Mode: Normal -> Maintenance

and in case of Multi-ASIC
admin@str--acs-1:~$ sudo /usr/bin/TSA
BGP0 : System Mode: Normal -> Maintenance
BGP1 : System Mode: Normal -> Maintenance
BGP2 : System Mode: Normal -> Maintenance
2021-02-18 18:04:24 -08:00
Sumukha Tumkur Vani
c4a4399da5 Disable port 8090 (#6764) 2021-02-18 17:46:44 -08:00
shlomibitton
4a1742e839
Stop teamd service before syncd (#6756)
When large number of port channels (more than 64) is configured, a config reload command might ends with not all port channel configured and up. Further debug shows that unloading the port channels on the ASIC driver take a lot of time.
With the change, deleting all port channels before the syncd restart will free resources better and the ASIC driver will unload all netdev fast and the operation will execute properly.
2021-02-18 15:48:11 -08:00
Volodymyr Samotiy
dc5eaf618f
[201911][Mellanox][SAI] update submodule pointer (#6805)
* Open ACL Outer VLAN ID for egress for ports part of VLAN RIF

Signed-off-by: Volodymyr Samotiy <volodymyrs@nvidia.com>
2021-02-17 22:26:17 -08:00
Guohan Lu
8d1adb4701 [ci]: add debug log for cleanup
Signed-off-by: Guohan Lu <lguohan@gmail.com>
2021-02-17 16:30:05 -08:00
Stepan Blyshchak
d328af4016
[mellanox] update FW to *.2008.2314 (#6790)
Bring in a fix for thermal shutdown observed while executing warm-reboot:

- All | prevent FW access during ISSU

Signed-off-by: Stepan Blyschak <stepanb@nvidia.com>
2021-02-17 16:23:25 -08:00
abdosi
7a78dccfb4
[build]: Installl setuptool tools explicitly via pip3 install (#6800)
Fix the build failure of asyncsnmp-2.1.0-py3-none-any.whl .
Looks like a recent update on https://packages.debian.org/stretch/libpython3.5-stdlib
3.5.3-1+deb9u1 -> 3.5.3-1+deb9u3 break the build command python3 setup.py bdist_wheel

Signed-off-by: Abhishek Dosi <abdosi@microsoft.com>
2021-02-17 04:15:38 -08:00
Kamil Cudnik
f755129507
[sonic-sairedis] Advance submodule to include timeout knob (#6741) 2021-02-16 18:13:18 +01:00
Abhishek Dosi
5265472188 [submodule update] sonic-utilities
603ac53c573ca6fbb8c1ca67091ebac428b8661e (HEAD -> 201911, origin/201911) Advertise ipv6 link local address (#1402)
9b0680c3bdc4c8f28d0266a7c422b29582e0888a [acl_loader] Fix default DENY rule for V6 dataplane ACLs (#1281)
25e64ce5fe9597a13a4259c185f72df9b655ab0c [show] Fix `show ip bgp sum` (#1194)
4709da02bec6014a5f4a3946a2885f50878fce3c Revert "Add FW dump with new SAI implementation (#1298)" (#1408)

Signed-off-by: Abhishek Dosi <abdosi@microsoft.com>
2021-02-11 16:43:42 -08:00
Petro Bratash
4031791b4e [lldp]: Add verification IPv4 address on LLDP conf Jinja2 Template (#5699)
Fix #5812

LLDP conf Jinja2 Template does not verify IPv4 address and can use IPv6 version. This issue does not effect control LLDP daemon. Issue can be reproduced via `test_snmp_lldp` test. LLDP conf Jinja2 Template selects first item from the list of mgmt interfaces.

TESTBED_1 LLDP conf

```
configure ports eth0 lldp portidsubtype local eth0
configure system ip management pattern FC00:3::32
configure system hostname dut-1
```
TESTBED_2  LLDP conf

```
configure ports eth0 lldp portidsubtype local eth0
configure system ip management pattern 10.22.24.61
configure system hostname dut-2
```
TESTBED_1  MGMT_INTERFACE

```
$ redis-cli -n 4 keys "*" | grep MGMT_INTERFACE
MGMT_INTERFACE|eth0|10.22.24.53/23
MGMT_INTERFACE|eth0|FC00:3::32/64
```
TESTBED_2  MGMT_INTERFACE

```
$ redis-cli -n 4 keys "*" | grep MGMT_INTERFACE
MGMT_INTERFACE|eth0|FC00:3::32/64
MGMT_INTERFACE|eth0|10.22.24.61/23

```

Signed-off-by: Petro Bratash <petrox.bratash@intel.com>
2021-02-11 15:34:06 -08:00
Volodymyr Samotiy
4742eaacc3
[201911][Mellanox] Update SDK to 4.4.2318, FW to *.2008.2312 (#6752)
To have the following fixes:
* All | Port status remains down after warm boot and flapping the port on peer side
* All | LAG HASH  | IPv6 SRC_IP is not accounted in LAG hashing [
* All | ASIC driver | Kernel crash observed when driver reload is initiated before it fully loaded
* Spectrum-3 | Buffer | In lossless configuration, headroom is been evicted only when the shared buffers is free

Signed-off-by: Volodymyr Samotiy <volodymyrs@nvidia.com>
2021-02-10 23:28:33 -08:00
Lior Avramov
eed13d9d53
[submodule] update sonic-sairedis (#6748)
af0d084 2021-02-08 [sairedis] Add get response timeout knob (#776)

Signed-off-by: liora <liora@nvidia.com>
2021-02-10 23:25:41 -08:00
Samuel Angebault
6cc5c93484
[arista]: Update Arista driver submodules (#6670)
On the DCS-7060CX-32S, a SEU can happen on a CPLD which by default would reboot the platform.
Other SEU scenarios are already handled but this one was missed since it's specific to this platform.
It's a pretty rare case which will now be reported in the syslog the same way others are.
2021-02-10 23:15:55 -08:00
Stepan Blyshchak
313bfdfc4c
[Mellanox][SAI] update submodule pointer (#6730)
Include SAI bug fixes:

Apply device MAC on port host interface when port is removed from LAG.
[Shared Headroom]: fixed watermark handling for SHP flow
Decrease verbosity of policer unbind message when no policer is attached

Signed-off-by: Stepan Blyschak <stepanb@nvidia.com>
2021-02-09 14:47:41 -08:00
Andriy Kokhan
33995efed5
[BFN] Updated SAI/SDK packages to 20210128 (#6595)
Signed-off-by: Andriy Kokhan <andriyx.kokhan@intel.com>

Co-authored-by: Andriy Kokhan <andriyx.kokhan@intel.com>
2021-02-04 09:24:52 -08:00
abdosi
fede95da19 Fix Allow prefix Delete case (#6671)
When we add allow-list key with action above route-map gets updated . For eg if we add deny action above template will become to no-export community. Now if we delete the key Issue is we still keep the no-export and do not move back to drop community.

This PR fixes this issue by rolling back default route-map community value back to constants.yml default action.
2021-02-04 09:04:13 -08:00
Eran Dahan
9c9f0453f9
[MLNX] update SAI submodule (#6666)
** Why I did it **
Disable SDK extended dump due to issue found

** How I did it ** 
Update SAI submodule

** How to verify it **
Verify the SDK extended dump is not called.

Signed-off-by: Eran Dahan <erand@nvidia.com>
2021-02-04 09:03:51 +02:00
Guohan Lu
418665cede [ci]: further clean up the source directory before checkout
Signed-off-by: Guohan Lu <lguohan@gmail.com>
2021-02-03 15:41:24 -08:00
xumia
6fc6346263 [ci]: Cleanup fsroot before checking out code (#6639)
Signed-off-by: Guohan Lu <lguohan@gmail.com>
Co-authored-by: Guohan Lu <lguohan@gmail.com>
2021-02-03 15:41:20 -08:00
arlakshm
a750f89630 [multi asic] add ip netns identify command to sudoer (#6591)
Signed-off-by: Arvindsrinivasan Lakshmi Narasimhan <arlakshm@microsoft.com>

- Why I did it
The command sudo ip netns identify <pid> is used in function get_current_namespace
to check in the cli command is running in host context or within a namespace.

This function is used for every CLI command and command sudo ip netns identify <pid> needs to be added in sudoer files to allow users with RO access to run show cli commands

This problem is not there on single asic platforms.

- How I did it
Add ip netns identify [0-9]* to sudoers file.
2021-02-02 10:32:59 -08:00
Abhishek Dosi
075bab813c [submodule update] sonic-sairedis
1f6982d786292390cf0dc7a3da936e035b7685e4 (HEAD -> 201911, origin/201911) [201911 Flex Counters] Add PFC pause duration counters in microseconds (#785)

Signed-off-by: Abhishek Dosi <abdosi@microsoft.com>
2021-02-02 09:17:55 -08:00
Guohan Lu
3479308117 [ci]: cleanup source directory upon checkout
Signed-off-by: Guohan Lu <lguohan@gmail.com>
2021-01-30 23:09:39 -08:00
Guohan Lu
67754a843a [ci]: reset the repo
Signed-off-by: Guohan Lu <lguohan@gmail.com>
2021-01-30 06:28:55 -08:00
lguohan
fcf93dda12
[sonic-linux-kernel]: kernel security update to 4.9.246 (#6545)
* [sonic-linux-kernel]: kernel security update to 4.9.246
* [Arista] Update driver submodule (#60)
     Update kernel dependency to 4.9.0-14-2

Signed-off-by: Guohan Lu <lguohan@gmail.com>
Co-authored-by: Samuel Angebault <angebault.samuel@gmail.com>
2021-01-28 08:46:07 -08:00
abdosi
95bcefa7c9
[201911] Fix PTF Docker Build Error (#6583)
We are hitting the issue as described pypa/pip#9520.
Fix to use get_pip.py from 2.7 repo.

Signed-off-by: Abhishek Dosi <abdosi@microsoft.com>
2021-01-28 02:19:12 -08:00
Lawrence Lee
e9cab58c2d [minigraph.py]: Check for empty cluster tag before parsing (#6440)
Some non-production minigraphs will have an empty ClusterName tag

Signed-off-by: Lawrence Lee <lawlee@microsoft.com>
2021-01-27 17:52:20 -08:00
Abhishek Dosi
8606d78688 [submodule update] sonic-utilities
d324eaec945081f8718468b39a8cf12dae965fd5 (HEAD -> 201911, origin/201911) [PFCWD] Fix 'start' pfcwd command (#1345)
235c61cccbbbb1f948f53b561c98888681b7071a [ecnconfig] handle backend port names when extracting port I/F ID from the port name (#1361)
7f5c3b497148fdd8e710131c5ac3f9f0a5f2cddf Drop explict 3 seconds pause between two object updates/deletes. (#1359)
12c899207917751eac719916be69c0078671963d add vlan_intf_object only if there are ipv4 or ipv6 mappings (#1377)
52ce2c32bf4e267d043a739641f5eefba3f3910f Add  subcommand description to interfaces counters (#1373)
Signed-off-by: Abhishek Dosi <abdosi@microsoft.com>
2021-01-27 17:35:01 -08:00
Abhishek Dosi
1f326ca7e7 [submodule update] sonic-swss
5aa80a0f7b27204e7cc23d99ba24ea716f5fb32f (HEAD -> 201911, origin/201911) [logfile]: Add option to specify swss rec file name (#1546)

Signed-off-by: Abhishek Dosi <abdosi@microsoft.com>
2021-01-27 17:21:15 -08:00
arlakshm
3cd536bb45 [Multi Asic] support of swss.rec and sairedis.rec for multi asic (#6310)
Signed-off-by: Arvindsrinivasan Lakshmi Narasimhan arlakshm@microsoft.com

- Why I did it
This PR has the changes to support having different swss.rec and sairedis.rec for each asic.
The logrotate script is updated as well

- How I did it

Update the orchagent.sh script to use the logfile name options in these PRs(Azure/sonic-swss#1546 and Azure/sonic-sairedis#747)
In multi asic platforms the record files will be different for each asic, with the format swss.asic{x}.rec and sairedis.asic{x}.rec

Update the logrotate script for multiasic platform .
2021-01-27 17:12:32 -08:00
bingwang-ms
869b3bc415 [bgpmon]: Fix exception in bgpmon caused by duplicate bgp neighbor ID (#6546)
* Fix exception in bgpmon caused by duplicate keys
It is possible that BGP neighbors in IPv4 and IPv6 address families
share the same name (such as bgp monitor). However, such case is not
handled in bgpmon, and an Exception will be raised. This commit will
address the issue by Using set instead of list to avoid duplicate keys.
2021-01-27 17:08:52 -08:00
abdosi
9779560b63 [baseimage]: Updates for Ebtables and support for multi-asic (#6542)
Following changes were done for ebtables:

- Support for Multi-asic platforms. Ebtable filters are installed in namespace for multi-asic and not host. On Single asic installed on  host.

- For Multi-asic platforms we don't want to install on host otherwise Namespace-to-Namespace communication does not happens since ARP Request are not forwarded.

- Updated to use text file to restore ebtables rules then the binary format. Rules are restore as part of Database docker init instead of rc.local

- Removed the ebtable service files for buster as not needed as filters are restored/installed as part of database docker init.
   All the binaries are pre-installed with ebtables* binary are same as ebatbles-legacy-*

Signed-off-by: Abhishek Dosi <abdosi@microsoft.com>
2021-01-27 16:59:10 -08:00
arheneus@marvell.com
e9d3d96c69 [ebtbles] Replace binary config file to text config file for ebtables (#5252)
Issue: Binary ebtables config file is CPU arch dependent
Fix: Load the text config during firsttime boot and
     Generate the binary persistent atomic file

Signed-off-by: Antony Rheneus <arheneus@marvell.com>
2021-01-27 16:57:41 -08:00
Guohan Lu
cc998f3059 [build]: fix dpkg uninstall bug
fix a bug when there are multiple debian packages to be uninstalled

Signed-off-by: Guohan Lu <lguohan@gmail.com>
2021-01-27 14:08:36 -08:00
lguohan
22a19e87aa [build]: wait for conflicts package to be uninstalled (#5039)
when parallel build is enabled, both docker-fpm-frr and docker-syncd-brcm
is built at the same time, docker-fpm-frr requires swss which requires to
install libsaivs-dev. docker-syncd-brcm requires syncd package which requires
to install libsaibcm-dev.

since libsaivs-dev and libsaibcm-dev install the sai header in the same
location, these two packages cannot be installed at the same time. Therefore,
we need to serialize the build between these two packages. Simply uninstall
the conflict package is not enough to solve this issue. The correct solution
is to have one package wait for another package to be uninstalled.

For example, if syncd is built first, then it will install libsaibcm-dev.
Meanwhile, if the swss build job starts and tries to install libsaivs-dev,
it will first try to query if libsaibcm-dev is installed or not. if it is
installed, then it will wait until libsaibcm-dev is uninstalled. After syncd
job is finished, it will uninstall libsaibcm-dev and swss build job will be
unblocked.

To solve this issue, _UNINSTALLS is introduced to uninstall a package that
is no longer needed and to allow blocked job to continue.

Signed-off-by: Guohan Lu <lguohan@gmail.com>
2021-01-27 14:07:30 -08:00
Abhishek Dosi
16233ad877 [Submodule update] sonic-swss
0b662807d6c3e23349ef3ce4cd63c961c991fd09 (HEAD -> 201911, origin/201911) [201911-SWSS] Change Error log to NOTICE log for FDB flush notification failure (#1593)
4db83289f60f307788106143a3be43f66da3458f [pfcwd] Update PFC storm detection logic for Mellanox platforms (#1587)

Signed-off-by: Abhishek Dosi <abdosi@microsoft.com>
2021-01-27 13:56:34 -08:00
lguohan
8bcdefbc34 [docker-orchagent]: make build depends only on sairedis package (#6467)
backport c4b5b002c3

make swss build depends only on libsairedis instead of syncd. This allows to build swss without depending
on vendor sai library.

Currently, libsairedis build also buils syncd which requires vendor SAI lib. This makes difficult to build
swss docker in buster while still keeping syncd docker in stretch, as swss requires libsairedis which also
build syncd and requires vendor to provide SAI for buster. As swss docker does not really contain syncd
binary, so it is not necessary to build syncd for swss docker.

[submodule]: update sonic-sairedis
1e42517996bfe41ac58d4c25ee3f93502befcb9d (HEAD -> 201911) [build]: add option to build without syncd

Signed-off-by: Guohan Lu <lguohan@gmail.com>
2021-01-27 13:51:24 -08:00
zzhiyuan
511541f7f0
[Arista] Use thermalctld instead of fancontrol (#6173)
**- Why I did it**
There is a preference to use thermalctld instead of fancontrol for 201911 release branch. The Arista platform submodule updates and thermal policies in the platforms will allow Arista devices to use thermalctld instead of fancontrol.

**- How I did it**
I cherry-picked the necessary commits from master branch for sonic-platform-modules-arista into 201911 branch. I've also added the file to skip fancontrol and added the thermal policies json.

**- How to verify it**
On Gardena, Upperlake, Clearlake, and Lodoga thermalctld is up and running with no errors. Fans show ~29%.

Co-authored-by: Zhi Yuan Carl Zhao <zyzhao@arista.com>
2021-01-27 08:31:32 -08:00
Kebo Liu
35d93ff8a3
[201911][Mellanox] Add hw-mgmt patch to support SDK OFFLINE event handling during ISSU (#6551)
In order to prevent "mlxsw_minimal" driver accessing ASIC during in
service firmware upgrade flow, SDK will raise "OFFLINE" 'udev' event
at early beginning of such flow. When this event is received,
hw-managemnet will remove "mlxsw_minimal" driver.
There is no need to implement opposite "ONLINE" event, since this flow
is ended up with "kexec".

Signed-off-by: Kebo Liu <kebol@nvidia.com>
2021-01-26 16:49:13 -08:00
Kebo Liu
687e1b9931
[mellanox]: Update SDK to 4.4.2308, FW to *.2008.2308 (#6553)
Bugs fixes:
    All | Kernel | During system reload when CPU is loaded with heavy traffic, a Kernel Panic may occur.
    All | Modules, Port split | FW stuck when device rebooted with locked Optical Transceivers in split mode
    Spectrum-3 | PFC | On Spectrum-3 systems, slow reaction time to Rx pause packets on 40GbE ports may lead to buffer overflow on servers.
    Spectrum-3 | SN4700, Port Split | On rare occasion SN4700, conducting 100G split (4x25G) in NRZ when splitter port 1 or 2 are down, ports 3 and 4 will also go down.

Enahncments:
    All | Kernel | new notification on ISSU start, so other kernel drivers can disable any interface to ASIC

Signed-off-by: Kebo Liu <kebol@nvidia.com>
2021-01-25 20:10:15 -08:00
lguohan
a90eac73bf [mellanox]: fix mellanox hw-management build (#6471)
use dpkg-buildpackage build with fakeroot

Signed-off-by: Guohan Lu <lguohan@gmail.com>
2021-01-25 12:44:50 -08:00