Commit Graph

3822 Commits

Author SHA1 Message Date
Junchao-Mellanox
5cf9c369e9 Change buffer config for new SKU Mellanox-SN2700-D40C8S8 (#6926)
#### Why I did it

Change buffer config for new SKU Mellanox-SN2700-D40C8S8

#### How I did it

Reuse the buffer config of SKU Mellanox-SN2700-D48C8

#### How to verify it

Run sonic-mgmt qos test and all passed
2021-03-04 22:45:45 -08:00
Danny Allen
5975d54917
[201911][openconfig_acl] Allow setting ICMP type/code to 0 (#6941)
Signed-off-by: Danny Allen <daall@microsoft.com>
2021-03-03 14:03:10 -08:00
Lior Avramov
c7b9aa7fb4
[thermalctld] Disable thermalctld on Mellanox simx platforms (#6855)
Signed-off-by: liora <liora@nvidia.com>

Co-authored-by: liora <liora@nvidia.com>
2021-03-03 11:33:27 -08:00
gechiang
705b0c4daa
[broadcom]: BRCM SAI 3.7.5.2-2 Pick up fix for CS00011729558 SAI_STATUS_INSUFFICIENT_RESOURCE wit attr SAI_BUFFER_PROFILE_ATTR_RESERVED_BUFFER_SIZE on the buffer profile using mmuconfig -p egress_lossy_profile (#6900)
This is to address the issue when "mmuconfig -p egress_lossy_profile" is executed which causes SYNCd failure with SAI_STATUS_INSUFFICIENT_RESOURCE for attr SAI_BUFFER_PROFILE_ATTR_RESERVED_BUFFER_SIZE.
This change also requires the change from (https://github.com/Azure/sonic-swss/pull/1649)
This SAI change was already tested as part of the (https://github.com/Azure/sonic-swss/pull/1649) PR.
2021-03-03 11:09:32 -08:00
abdosi
9dc285ab05 Changes in FRR temapltes for multi-asic (#6901)
1. Made the command next-hop-self force only applicable on back-end asic bgp. This is done so that BGPL iBGP session running on backend can send e-BGP learn nexthop. Back end asic FRR is able to recursively resolve the eBGP nexthop in its routing table since it knows about all the connected routes advertise from front end asic.

2. Made all front-end asic bgp use global loopback ip (Loopback0) as router id and back end asic bgp use Loopbacl4096 as ruter-id and originator id for Route-Reflector. This is done so that routes learnt by external peer do not see Loopback4096 as router id in show ip bgp <route-prerfix> output.

3. To handle above change need to pass Loopback4096 from BGP manager for jinja2 template generation. This was missing and this change/fix is needed for this also https://github.com/Azure/sonic-buildimage/blob/master/dockers/docker-fpm-frr/frr/bgpd/templates/dynamic/instance.conf.j2#L27

4. Enhancement to add mult_asic specific bgpd template generation unit test cases.
2021-03-02 14:42:22 -08:00
abdosi
fbc3386825 [multi-asic] BBR support on internal-peers for multi-asic platfroms. (#6848)
Enable BBR config allowas-in 1 for internal peers

Why I did:
To advertise BBR routes learnt via e-BGP peer in one asic/namespace to another iBGP asic/namespace via Route Reflector.
2021-03-02 13:44:17 -08:00
Danny Allen
16e11cf875
[201911][openconfig_acl] Add SONiC ACL extension to open config ACL model (#6897)
Add support for VLAN ID match
Add support for ICMP type/code match

To allow users to add ACL rules w/ ICMP and VLAN qualifiers via acl-loader.
2021-02-28 12:02:56 -08:00
Danny Allen
a1faa590ae
[201911][submodule] Update swss submodule (#6899)
[201911][acl] Enable VLAN ID qualifier for ACL rules (#1648) (#1651)
Skip setting not implemented brcm attr in buffer profile (#1649)
2021-02-28 12:00:02 -08:00
arlakshm
5595633008
[201911][baseimage] Install pyroute and submodule update sonic-utilities (#6916)
Install pyroute2 need for sonic-utilities in sonic-slave-stretch docker.
Submodule update of sonic-utilities to the commit 9297d5c5a00e64b5dea94a49a69cb776ac862bdc
2021-02-28 11:59:10 -08:00
Qi Luo
95ec75e24e
For egress ACL attaching to vlan, we break them into vlan members (#6898)
Same as https://github.com/Azure/sonic-buildimage/pull/6895
But target against 201911 branch
2021-02-27 20:19:29 -08:00
Eric Seifert
3a554794f2
Add missing mgmt-framework dep to telemetry build (#6910)
To prevent build issue, build mgmt-framework before telemetry.
2021-02-27 17:37:42 -08:00
Joe LeVeque
b2b6b75d2a [201911] Install Python 3 scapy version 2.4.4 in host OS 2021-02-27 20:07:19 +00:00
judyjoseph
b05a4f1c30
Port fix for https://github.com/Azure/sonic-buildimage/pull/6537 in 201911 (#6648)
The Portchannels were not getting cleaned up as the cleanup activity was taking more than 10 secs which is default docker timeout after which a SIGKILL will be send.

Fix Issue #6537
2021-02-26 17:16:33 -08:00
Abhishek Dosi
8e0faf42f3 Revert "[submodule-update] sonic-utilities"
This reverts commit f0a86bf038.
2021-02-26 11:21:46 -08:00
Myron Sosyak
7ce40c52a3 [BFN] Fix MTU for internal interface (#6783)
Set correct MTU size of internal interface for Newport platform
2021-02-25 18:56:02 -08:00
SuvarnaMeenakshi
272781855e [multi-asic][vs]: Add new multi-asic vs hwsku with four asics (#6558)
- Why I did it
Current mutli-asic vs hwsku consists of 6 asics with each asic having 32 interfaces. When bringing this up, below issue was seen:
When all 32 interfaces(sonic interfaces and linux interface) are set to 9100 mtu, DMA error is seen "DMA: Out of SW-IOMMU space for 4096 bytes at device 0000:06:03.0" which can be fixed by updating swiotlb=65536 in /host/grub/grub.cfg .In order to keep multi-asic VS lighter and easier to bring up and test, new hwsku 'msft_four_asic_vs' is added to represent 4-asic hwsku with 2 frontend asics and 2 backend asics and each asic having 8 interfaces interconnected by port-channels.
- How I did it
Add msft_four_asic_hwsku directory to have the right number of directories (4) and update port_config.ini and lanemap.ini files to include 8 ports information.
Add topology.sh script to create the internal asic-asic connectivity.
- How to verify it
Update asic.conf with the 4 asic information as below and build sonic-vs.img:
NUM_ASIC=4
DEV_ID_ASIC_0=0
DEV_ID_ASIC_1=1
DEV_ID_ASIC_2=2
DEV_ID_ASIC_3=3
Modify sonic_multiasic.xml to have 8 front panel interfaces.
create virtual switch using "sudo virsh sonic_mutliasic.xml" command.
Start topology service and Load config_db files for switch and each asic.
Ensure that that all internal interfaces and port_channels are coming up.
multi-asic vs testbed:
Bring up mutli-asic VS testbed with a multi-asic image(asic.conf updated to 4 asics) and using t1-lag topology.
./testbed-cli.sh -t vtestbed.csv -m veos_vtb -k ceos add-topo vms-kvm-four-asic-t1-lag password.txt
Load minigraph/config_dbs.
Ensure all internal and external interfaces come up.
No change on single asic vs.
2021-02-25 18:55:21 -08:00
Abhishek Dosi
2f1eacbb74 [submoudle-update] sonic-platform-daemons
61acd3a2e4a457f3bc706cbfaf3162b947763864 (HEAD -> 201911, origin/201911)
[xcvrd] Change in xcvrd ports cache creation, now ports are being
fetched from config DB (#5892) (#155)

Signed-off-by: Abhishek Dosi <abdosi@microsoft.com>
2021-02-25 18:52:48 -08:00
Abhishek Dosi
1a62cd2f67 [submodule update] sonic-platform-common
0b9429d032c2c0449dfeaad07542707f78b5c01f (HEAD -> 201911, origin/201911)
[sfputilhelper] Add new option in ports cache creation, fetch ports from config DB (#5892) (#172)

Signed-off-by: Abhishek Dosi <abdosi@microsoft.com>
2021-02-25 18:51:22 -08:00
SuvarnaMeenakshi
f694787521 [vs]: Update swiotlb buffer size to support multi-asic VS platform. (#6674)
Current mutli-asic vs hwsku consists of 6 asics with each asic having 32 interfaces.
When bringing this up, below issue was seen:
When all 32 interfaces in each namespace (sonic interfaces and linux interface) is set to 9100 mtu, DMA error is seen "DMA: Out of SW-IOMMU space for 4096 bytes at device 0000:06:03.0" which can be fixed by updating swiotlb=65536 in /host/grub/grub.cfg .

Signed-off-by: SuvarnaMeenakshi <sumeenak@microsoft.com>
2021-02-25 18:43:15 -08:00
SuvarnaMeenakshi
9208dc507b [multi-asic][vs]: Update topology script to retrieve hwsku from minigraph (#6219)
Update topology script to retrieve hwsku from minigraph
if hwsku information is not available in config_db.
Fix clean up of interfaces in msft_multi_asic_vs hwsku
topology script.
- Why I did it
When bringing up multi-asic VS switch, topology service is started during boot up.
Topology service starts a shell script which runs the topology script present in /usr/share/sonic/device// directory. To invoke hwsku specific script, the topology script tries to retrieve hwsku information from config_db.
During initial boot up config_db might not be populated. In order to start topology service before config_db is updated,
update topology script to get hwsku information from minigraph.xml if it is available.
This will be helpful to bring up multi-asic VS testbed by loading minigraph and starting topology service.
- How I did it
Update topology.sh script to retrieve hwsku information from minigraph.xml.
Fix clean up function on msft_multi_asic_vs toplogy script.
- How to verify it
single-asic VS - no change; topology service is only enabled for multi-asic VS.
multi-asic VS - Bring up multi-asic VS image, copy minigraph to vs image, start topology service. Topology service should be successful.
to test clean up function fix, start topology service - make sure interfaces are created and moved to the right namespaces.
stop topology service - make sure namespace do not have any interface and all front end interfaces are present in default namespace.
2021-02-25 18:42:44 -08:00
Abhishek Dosi
f0a86bf038 [submodule-update] sonic-utilities
[201911] show ip int changes (#1437)

Signed-off-by: Abhishek Dosi <abdosi@microsoft.com>
2021-02-25 18:41:29 -08:00
abdosi
1064cd5cd0 [multi-asic] Enhanced iptable default rules (#6765)
What I did:-

For multi-asic platforms added iptable v4 rule to communicate on docker bridge ip
For multi-asic platforms extend iptable v4 rule for iptable v6 also
For multi-asic program made all internal rules applicable for all protocols (not filter based on tcp/udp). This is done to be consistent same as local host rule
For multi-asic platforms made nat rule (to forward traffic from namespace to host) generic for all protocols and also use Source IP if present for matching
2021-02-25 18:39:43 -08:00
arlakshm
5822b42fdb
[sudoers]: add ipintutil in sudoer file (#6857)
This PR is port of #6845 for 201911

show ip interfaces is enhanced recently to support multi ASIC platforms in this Azure/sonic-utilities#1437. The ipintutil script as to run as sudo user, to get the ip interface from each namespace.
Add this script to the sudoer file so that show ip interface command is available for user with read-only permissions

Signed-off-by: Arvindsrinivasan Lakshmi Narasimhan <arlakshm@microsoft.com>
2021-02-23 13:26:53 -08:00
arlakshm
daecc34180
[201911][baseimage] Install pyroute2 for sonic-utilites (#6792)
Signed-off-by: Arvindsrinivasan Lakshmi Narasimhan <arlakshm@microsoft.com>

Install pyroute2 for sonic-utilities. This change is needed for Azure/sonic-utilities#1437
2021-02-22 23:30:27 -08:00
Qi Luo
950557a0f5
[minigraph] Support tagged VlanInterface if attached to multiple vlans (#6846)
Same as https://github.com/Azure/sonic-buildimage/pull/6833
But adapted for 201911 branch
2021-02-22 21:18:35 -08:00
Qi Luo
c9febff961 [radv] Disable radv for specific deployment_id (#6830) 2021-02-22 18:52:40 -08:00
Dror Prital
940944e41c
Support new SKU under the name of SN2700-D40C8S8 (#6822)
#### Why I did it

Add new SKU for SN2700 Mellanox system that supports the following port configuration:
8 X 100G
40 X 50G
8 X 10G

#### How I did it

Add new Folder - "Mellanox-SN2700-D40C8S8" under /sonic-buildimage/device/mellanox/x86_64-mlnx_msn2700-r0/
that contains the relevant files supporting this SKU

#### How to verify it

Bring up the image, run "show interface status" and make sure that all ports are up and reflect the following requirement:
Port 1/3 will be used as 4x10G
Port 2/4 - Not exist (blocked since 1 and 3 split to 4)
Port 7/8/9/10/23/24/25/26 will used as 100G
All other ports will be used as 2x50G

#### Which release branch to backport (provide reason below if selected)

- [ ] 201811
- [X] 201911
- [ ] 202006
- [ ] 202012

#### Description for the changelog

Support new SKU under the name of SN2700-D40C8S8
2021-02-21 09:24:45 -08:00
Qi Luo
712f3311fb
[mgmt-framework]: Update submodule (#6829)
Including commits:
```
58a77fa 2021-02-20 | Git clone go dependencies instead of 'go get' (#79) [Sachin Holla]
```
2021-02-19 22:57:10 -08:00
Abhishek Dosi
fa1934f715 [submodule update] sonic-utilities
Refactor neighbor_advertiser script (#1447)

Signed-off-by: Abhishek Dosi <abdosi@microsoft.com>
2021-02-19 18:38:53 -08:00
Prince Sunny
10e4fd637c Submodule update for restapi (#6808) 2021-02-19 16:10:25 -08:00
Abhishek Dosi
dc306eeba3 [submodule-update] sonic-utilities
02438f953aafa3303792eda2309f8f3303e55dc7 (HEAD -> 201911, origin/201911) Cherry-pick Master PR for route-checker tool (#1433)
e54fb69f7323f6ef48f44a1a893fe8266fd6f817 [201911][vnet] Add "vnet_route_check" script (#1443)

Signed-off-by: Abhishek Dosi <abdosi@microsoft.com>
2021-02-18 18:14:25 -08:00
Kebo Liu
5ec164b694 [Mellanox] [platform API] Fix “local variable 'label_port' referenced before assignment” error (#6419)
In rare case can see that xcvrd failed due to "UnboundLocalError: local variable 'label_port' referenced before assignment"

Init "label_port" as None at the beginning of the function, to avoid the case that "label_port" not assigned.
2021-02-18 18:10:54 -08:00
Roy Lee
ce6cc3821f [device/accton]: As7816-64x, fix memory leakage on accton fan monitor. (#6168)
It's been reported that accton fan monitor process keeps consuming memory after few days.
The amount of memory occupied increases in linear and never leased.

Signed-off-by: roy_lee <roy_lee@edge-core.com>
2021-02-18 18:10:22 -08:00
Wirut Getbamrung
a5de91069c [device/celestica]: Add thermalctld support on DX010 platform APIs (#6089)
**- Why I did it**
- The thermalctld daemon on the Pmon docker requires support from the thermal manager API.

**- How I did it**
- Removed the old function for detecting a faulty fan.
- Removed the old function for detecting excess temperature.
- Implement thermal_manager APIs based on ThermalManagerBase
- Implement thermal_conditions APIs based on ThermalPolicyConditionBase
- Implement thermal_actions APIs based on ThermalPolicyActionBase
- Implement thermal_info APIs based on ThermalPolicyInfoBase
- Add thermal_policy.json
2021-02-18 18:09:57 -08:00
Volodymyr Boiko
06334ff438 [barefoot][device][plugins] Fix sfp reset (#6745)
Fix sfp reset in Barefoot's sfputil

Signed-off-by: Volodymyr Boyko <volodymyrx.boiko@intel.com>
2021-02-18 18:07:49 -08:00
SuvarnaMeenakshi
9e777e90a0 [multi_asic][vs]: Add dependency in teamd service to start after topology service(#6594)
[multi_asic][vs]: Add dependency in teamd service to start after topology service.
- Why I did it
In multi-asic VS, topology service is run after database service to set up the internal asic topology.
swss and syncd have a dependency to start after topology service is run so that the interfaces are moved to right namespace and created in the right namespace. In case of multi-asic vs, during the initial boot up, when there is no configuration added, teamd service starts and swss/syncd do not start as topology service does not start. Upon loading configuration using config_db or minigraph, swss and sycnd start up , but teamd is not restarted as swss is not stopped and started. This causes teamd to be in a bad state and requires a reload of config.

- How I did it
Add dependency in teamd service to start after topology service is completed.

- How to verify it
No change in single asic vs or platform.
No change in multi-asic regular image.
Change only in multi-asic VS. Bring up a multi-asic VS image without any configration, teamd service will fail to start due to dependency failure. Load minigraph, start topology service, load configuration, ensure all services come up.
Signed-off-by: SuvarnaMeenakshi <sumeenak@microsoft.com>
2021-02-18 18:05:10 -08:00
judyjoseph
86a13610cb [docker-fpm-frr]: TSA/B/C changes for multi-asic (#6510)
- Introduced TS common file in docker as well and moved common functions.
- TSA/B/C scripts run only in BGP instances for front end ASICs.
       In addition skip enforcing it on route maps used between internal BGP sessions.

admin@str--acs-1:~$ sudo /usr/bin/TSA
System Mode: Normal -> Maintenance

and in case of Multi-ASIC
admin@str--acs-1:~$ sudo /usr/bin/TSA
BGP0 : System Mode: Normal -> Maintenance
BGP1 : System Mode: Normal -> Maintenance
BGP2 : System Mode: Normal -> Maintenance
2021-02-18 18:04:24 -08:00
Sumukha Tumkur Vani
c4a4399da5 Disable port 8090 (#6764) 2021-02-18 17:46:44 -08:00
shlomibitton
4a1742e839
Stop teamd service before syncd (#6756)
When large number of port channels (more than 64) is configured, a config reload command might ends with not all port channel configured and up. Further debug shows that unloading the port channels on the ASIC driver take a lot of time.
With the change, deleting all port channels before the syncd restart will free resources better and the ASIC driver will unload all netdev fast and the operation will execute properly.
2021-02-18 15:48:11 -08:00
Volodymyr Samotiy
dc5eaf618f
[201911][Mellanox][SAI] update submodule pointer (#6805)
* Open ACL Outer VLAN ID for egress for ports part of VLAN RIF

Signed-off-by: Volodymyr Samotiy <volodymyrs@nvidia.com>
2021-02-17 22:26:17 -08:00
Guohan Lu
8d1adb4701 [ci]: add debug log for cleanup
Signed-off-by: Guohan Lu <lguohan@gmail.com>
2021-02-17 16:30:05 -08:00
Stepan Blyshchak
d328af4016
[mellanox] update FW to *.2008.2314 (#6790)
Bring in a fix for thermal shutdown observed while executing warm-reboot:

- All | prevent FW access during ISSU

Signed-off-by: Stepan Blyschak <stepanb@nvidia.com>
2021-02-17 16:23:25 -08:00
abdosi
7a78dccfb4
[build]: Installl setuptool tools explicitly via pip3 install (#6800)
Fix the build failure of asyncsnmp-2.1.0-py3-none-any.whl .
Looks like a recent update on https://packages.debian.org/stretch/libpython3.5-stdlib
3.5.3-1+deb9u1 -> 3.5.3-1+deb9u3 break the build command python3 setup.py bdist_wheel

Signed-off-by: Abhishek Dosi <abdosi@microsoft.com>
2021-02-17 04:15:38 -08:00
Kamil Cudnik
f755129507
[sonic-sairedis] Advance submodule to include timeout knob (#6741) 2021-02-16 18:13:18 +01:00
Abhishek Dosi
5265472188 [submodule update] sonic-utilities
603ac53c573ca6fbb8c1ca67091ebac428b8661e (HEAD -> 201911, origin/201911) Advertise ipv6 link local address (#1402)
9b0680c3bdc4c8f28d0266a7c422b29582e0888a [acl_loader] Fix default DENY rule for V6 dataplane ACLs (#1281)
25e64ce5fe9597a13a4259c185f72df9b655ab0c [show] Fix `show ip bgp sum` (#1194)
4709da02bec6014a5f4a3946a2885f50878fce3c Revert "Add FW dump with new SAI implementation (#1298)" (#1408)

Signed-off-by: Abhishek Dosi <abdosi@microsoft.com>
2021-02-11 16:43:42 -08:00
Petro Bratash
4031791b4e [lldp]: Add verification IPv4 address on LLDP conf Jinja2 Template (#5699)
Fix #5812

LLDP conf Jinja2 Template does not verify IPv4 address and can use IPv6 version. This issue does not effect control LLDP daemon. Issue can be reproduced via `test_snmp_lldp` test. LLDP conf Jinja2 Template selects first item from the list of mgmt interfaces.

TESTBED_1 LLDP conf

```
configure ports eth0 lldp portidsubtype local eth0
configure system ip management pattern FC00:3::32
configure system hostname dut-1
```
TESTBED_2  LLDP conf

```
configure ports eth0 lldp portidsubtype local eth0
configure system ip management pattern 10.22.24.61
configure system hostname dut-2
```
TESTBED_1  MGMT_INTERFACE

```
$ redis-cli -n 4 keys "*" | grep MGMT_INTERFACE
MGMT_INTERFACE|eth0|10.22.24.53/23
MGMT_INTERFACE|eth0|FC00:3::32/64
```
TESTBED_2  MGMT_INTERFACE

```
$ redis-cli -n 4 keys "*" | grep MGMT_INTERFACE
MGMT_INTERFACE|eth0|FC00:3::32/64
MGMT_INTERFACE|eth0|10.22.24.61/23

```

Signed-off-by: Petro Bratash <petrox.bratash@intel.com>
2021-02-11 15:34:06 -08:00
Volodymyr Samotiy
4742eaacc3
[201911][Mellanox] Update SDK to 4.4.2318, FW to *.2008.2312 (#6752)
To have the following fixes:
* All | Port status remains down after warm boot and flapping the port on peer side
* All | LAG HASH  | IPv6 SRC_IP is not accounted in LAG hashing [
* All | ASIC driver | Kernel crash observed when driver reload is initiated before it fully loaded
* Spectrum-3 | Buffer | In lossless configuration, headroom is been evicted only when the shared buffers is free

Signed-off-by: Volodymyr Samotiy <volodymyrs@nvidia.com>
2021-02-10 23:28:33 -08:00
Lior Avramov
eed13d9d53
[submodule] update sonic-sairedis (#6748)
af0d084 2021-02-08 [sairedis] Add get response timeout knob (#776)

Signed-off-by: liora <liora@nvidia.com>
2021-02-10 23:25:41 -08:00
Samuel Angebault
6cc5c93484
[arista]: Update Arista driver submodules (#6670)
On the DCS-7060CX-32S, a SEU can happen on a CPLD which by default would reboot the platform.
Other SEU scenarios are already handled but this one was missed since it's specific to this platform.
It's a pretty rare case which will now be reported in the syslog the same way others are.
2021-02-10 23:15:55 -08:00
Stepan Blyshchak
313bfdfc4c
[Mellanox][SAI] update submodule pointer (#6730)
Include SAI bug fixes:

Apply device MAC on port host interface when port is removed from LAG.
[Shared Headroom]: fixed watermark handling for SHP flow
Decrease verbosity of policer unbind message when no policer is attached

Signed-off-by: Stepan Blyschak <stepanb@nvidia.com>
2021-02-09 14:47:41 -08:00