Commit Graph

3833 Commits

Author SHA1 Message Date
Santhosh Kumar T
140576ddbb
[201911] DellEMC S6100 SSD Monitor (#6934)
Why I did it
To monitor the SSD health condition in DellEMC S6100 platform post upgrade.

A daemon is introduced to monitor the SSD every one hour.

To check for SSD status at boot time and at the time of cold-reboot.

All these changes are supported only for newer SSD firmware.

Added a platform_reboot_pre_check script to prevent cold-reboot based on SSD status.
Depends on Azure/sonic-utilities#1472
DO NOT MERGE UNTIL ABOVE PR IS MERGED
2021-03-12 17:02:17 -08:00
abdosi
9b553d905d
Fix bgpmon.py syslog for exception handling. (#7030)
[201911] Fix bgpmon.py syslog message during exception handling.
2021-03-12 11:11:59 -08:00
Kebo Liu
c2806eb756
Pickup latest change in sonic-platform-daemon (#7014)
Pick up the latest change in sonic-platform-daemons submodule: Azure/sonic-platform-daemons@f59480d

Signed-off-by: Kebo Liu <kebol@nvidia.com>
2021-03-11 12:00:37 +02:00
judyjoseph
b20e67819f [sonic-cfggen]: Use unix socket when reading from DB only if we are using sudo. (#7002)
Closes issue #6982.
The root cause was that we were using the unix_socket as the default mechanism for reading from the DB (#5250). The redis unix socket is created as follows.

admin@str--acs-1:~$ ls -lrt /var/run/redis/redis.sock 
srwxrw---- 1 root redis 0 Mar  6 01:57 /var/run/redis/redis.sock
So it used to work fine for the user "root" or for a user in the redis group (admin was made part of the redis group by default).

If the user has sudo permissions, use the redis unix socket; otherwise fall back to the TCP socket.
2021-03-10 12:47:20 -08:00
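The socket selection described in #7002 above can be sketched roughly as follows. This is a minimal illustration, not the actual sonic-cfggen code: the function name `redis_connection_params`, the TCP host and port, and the use of the effective UID as the "running under sudo" check are all assumptions; only the socket path comes from the commit message.

```python
import os

# Socket path taken from the commit message; TCP host/port are assumed defaults.
REDIS_UNIX_SOCKET = "/var/run/redis/redis.sock"
REDIS_TCP_HOST = "127.0.0.1"
REDIS_TCP_PORT = 6379


def redis_connection_params(euid=None):
    """Pick connection parameters: unix socket for root, TCP for everyone else.

    Hypothetical helper. A process started with sudo has effective UID 0,
    which is what this sketch uses as the "has sudo permissions" test.
    """
    if euid is None:
        euid = os.geteuid()
    if euid == 0:
        # Root can always open the socket regardless of redis group membership.
        return {"unix_socket_path": REDIS_UNIX_SOCKET}
    # Unprivileged users may not be in the redis group, so use TCP instead.
    return {"host": REDIS_TCP_HOST, "port": REDIS_TCP_PORT}
```

Passing `euid` explicitly makes the choice easy to exercise without actually running as root.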
Ze Gan
b73d5a659e [docker-ptf]: Add teamd dependency to ptf (#6994)
Signed-off-by: Ze Gan <ganze718@gmail.com>
2021-03-10 10:50:17 -08:00
Qi Luo
b12383013f [build]: Fix get-pip 2.7 url according to upstream announcement (#6999)
ref: https://bootstrap.pypa.io/2.7/get-pip.py

The URL you are using to fetch this script has changed, and this one will no
longer work. Please use get-pip.py from the following URL instead:

    https://bootstrap.pypa.io/pip/2.7/get-pip.py
2021-03-10 09:51:31 -08:00
Abhishek Dosi
38fbd98cd7 [submodule update] sonic-utilities
9e740759c370645b4367acf22856aebcfb7fce45 (HEAD -> 201911, origin/201911) [201911][multi asic] show ip bgp summary changes for bgp mon (#1483)
fa07245786df11e6df902c33fcd9c7115a7c5380 [CLI][techsupport] Merge 'show techsupport' changes from master (#1468)

Signed-off-by: Abhishek Dosi <abdosi@microsoft.com>
2021-03-06 21:25:34 -08:00
abdosi
ab05a2f58a
Add support for BGP Monitors on multi asic SONiC platforms. (#6977)
This PR is cherry-pick of master
https://github.com/Azure/sonic-buildimage/pull/6920

Why I did it
Add support for BGP Monitors on multi asic SONiC platforms.

How I did it
On multi ASIC SONiC platforms, BGP monitor session will be established from Backend ASIC.
To achieve this, the following changes are made:

Add BGP monitor configuration on the backend ASIC.
The BGP monitor configuration is present in the DPG of the device in minigraph.xml of multi-ASIC device, so this configuration will be added to the config_db of the host, when the minigraph is loaded.
To add configuration for this in the Backend ASIC, a new class MultiAsicBgpMonCfg is added to the hostcfgd service to update the config_db of the backend ASIC when the BGP_MONITOR table of the host config_db is updated.
This way incremental BGP_MONITOR configuration can also be handled.

Changes to establish BGP session with bgp monitor.

Add a route in the host main routing table to go to one of the pre-defined backend ASICs.
Add an iptables rule on the front ASIC to mark BGP packets whose destination is the IPv4 Loopback.
Add an IP rule in the front ASIC namespace to match marked BGP packets and look up the default table.
Program the default route in the frontend ASIC namespace docker default table as part of start.sh of the BGP container.
This needs to be done as part of start.sh; otherwise the FRR default route will get overwritten.
How to verify it

Signed-off-by: Abhishek Dosi <abdosi@microsoft.com>
Co-authored-by: Arvind <arlakshm@microsoft.com>
2021-03-06 21:21:52 -08:00
Qi Luo
32e3cd9454
Revert "[monit] Periodically monitor VNET route consistency (#6819)" (#6975)
This reverts commit 2c6be7e0f5.
Reverts #6819
2021-03-06 06:56:26 -08:00
Volodymyr Samotiy
2c6be7e0f5
[monit] Periodically monitor VNET route consistency (#6819)
To run VNET route consistency check periodically.

For any failure, the monit will raise alert based on return code.
The tool will log required details.
2021-03-05 13:15:19 -08:00
Danny Allen
603767d94a
[201911][submodule] Update sonic-utilities submodule (#6966)
- [201911][acl] Expand VLAN into VLAN members when creating an ACL table (#1477)
- [201911][acl-loader] Add support for matching on ICMP and VLAN info (#1476)
- [201911][acl-loader] Improve input validation for acl_loader (#1481)

Signed-off-by: Danny Allen <daall@microsoft.com>
2021-03-05 07:26:10 -08:00
Junchao-Mellanox
5cf9c369e9 Change buffer config for new SKU Mellanox-SN2700-D40C8S8 (#6926)
#### Why I did it

Change buffer config for new SKU Mellanox-SN2700-D40C8S8

#### How I did it

Reuse the buffer config of SKU Mellanox-SN2700-D48C8

#### How to verify it

Run sonic-mgmt qos test and all passed
2021-03-04 22:45:45 -08:00
Danny Allen
5975d54917
[201911][openconfig_acl] Allow setting ICMP type/code to 0 (#6941)
Signed-off-by: Danny Allen <daall@microsoft.com>
2021-03-03 14:03:10 -08:00
Lior Avramov
c7b9aa7fb4
[thermalctld] Disable thermalctld on Mellanox simx platforms (#6855)
Signed-off-by: liora <liora@nvidia.com>

Co-authored-by: liora <liora@nvidia.com>
2021-03-03 11:33:27 -08:00
gechiang
705b0c4daa
[broadcom]: BRCM SAI 3.7.5.2-2 Pick up fix for CS00011729558 SAI_STATUS_INSUFFICIENT_RESOURCE with attr SAI_BUFFER_PROFILE_ATTR_RESERVED_BUFFER_SIZE on the buffer profile using mmuconfig -p egress_lossy_profile (#6900)
This is to address the issue when "mmuconfig -p egress_lossy_profile" is executed which causes SYNCd failure with SAI_STATUS_INSUFFICIENT_RESOURCE for attr SAI_BUFFER_PROFILE_ATTR_RESERVED_BUFFER_SIZE.
This change also requires the change from (https://github.com/Azure/sonic-swss/pull/1649)
This SAI change was already tested as part of the (https://github.com/Azure/sonic-swss/pull/1649) PR.
2021-03-03 11:09:32 -08:00
abdosi
9dc285ab05 Changes in FRR templates for multi-asic (#6901)
1. Made the command next-hop-self force applicable only on back-end asic bgp. This is done so that the BGPL iBGP session running on the backend can send the eBGP-learnt nexthop. Back-end asic FRR is able to recursively resolve the eBGP nexthop in its routing table since it knows about all the connected routes advertised from the front-end asic.

2. Made all front-end asic bgp use the global loopback ip (Loopback0) as router id, and back-end asic bgp use Loopback4096 as router-id and originator id for Route-Reflector. This is done so that routes learnt by an external peer do not show Loopback4096 as router id in show ip bgp <route-prefix> output.

3. To handle the above change, Loopback4096 needs to be passed from BGP manager for jinja2 template generation. This was missing, and this change/fix is also needed for https://github.com/Azure/sonic-buildimage/blob/master/dockers/docker-fpm-frr/frr/bgpd/templates/dynamic/instance.conf.j2#L27

4. Enhancement to add multi-asic specific bgpd template generation unit test cases.
2021-03-02 14:42:22 -08:00
abdosi
fbc3386825 [multi-asic] BBR support on internal-peers for multi-asic platforms. (#6848)
Enable BBR config allowas-in 1 for internal peers

Why I did:
To advertise BBR routes learnt via e-BGP peer in one asic/namespace to another iBGP asic/namespace via Route Reflector.
2021-03-02 13:44:17 -08:00
Danny Allen
16e11cf875
[201911][openconfig_acl] Add SONiC ACL extension to open config ACL model (#6897)
Add support for VLAN ID match
Add support for ICMP type/code match

To allow users to add ACL rules w/ ICMP and VLAN qualifiers via acl-loader.
2021-02-28 12:02:56 -08:00
Danny Allen
a1faa590ae
[201911][submodule] Update swss submodule (#6899)
[201911][acl] Enable VLAN ID qualifier for ACL rules (#1648) (#1651)
Skip setting not implemented brcm attr in buffer profile (#1649)
2021-02-28 12:00:02 -08:00
arlakshm
5595633008
[201911][baseimage] Install pyroute and submodule update sonic-utilities (#6916)
Install pyroute2, needed for sonic-utilities, in the sonic-slave-stretch docker.
Submodule update of sonic-utilities to the commit 9297d5c5a00e64b5dea94a49a69cb776ac862bdc
2021-02-28 11:59:10 -08:00
Qi Luo
95ec75e24e
For egress ACL attaching to vlan, we break them into vlan members (#6898)
Same as https://github.com/Azure/sonic-buildimage/pull/6895
But target against 201911 branch
2021-02-27 20:19:29 -08:00
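The idea behind #6898 above (breaking a VLAN into its member ports when binding an egress ACL) can be illustrated with a small sketch. The function name `expand_vlan_ports` and the dictionary shape used for VLAN membership are hypothetical and chosen for illustration only; they are not taken from the actual change.

```python
def expand_vlan_ports(ports, vlan_members):
    """Replace each VLAN in `ports` with its member ports.

    Hypothetical helper: `vlan_members` maps a VLAN name to a list of member
    port names. Non-VLAN entries pass through unchanged; order is preserved
    and duplicates are dropped.
    """
    expanded = []
    for port in ports:
        # A name not present in vlan_members is treated as a plain port.
        for member in vlan_members.get(port, [port]):
            if member not in expanded:
                expanded.append(member)
    return expanded
```

For example, binding to `["Vlan1000", "Ethernet8"]` with `Vlan1000` containing `Ethernet0` and `Ethernet4` would yield the three physical ports.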
Eric Seifert
3a554794f2
Add missing mgmt-framework dep to telemetry build (#6910)
To prevent build issue, build mgmt-framework before telemetry.
2021-02-27 17:37:42 -08:00
Joe LeVeque
b2b6b75d2a [201911] Install Python 3 scapy version 2.4.4 in host OS 2021-02-27 20:07:19 +00:00
judyjoseph
b05a4f1c30
Port fix for https://github.com/Azure/sonic-buildimage/pull/6537 in 201911 (#6648)
The Portchannels were not getting cleaned up because the cleanup activity took more than 10 secs, which is the default docker timeout after which a SIGKILL is sent.

Fix Issue #6537
2021-02-26 17:16:33 -08:00
Abhishek Dosi
8e0faf42f3 Revert "[submodule-update] sonic-utilities"
This reverts commit f0a86bf038.
2021-02-26 11:21:46 -08:00
Myron Sosyak
7ce40c52a3 [BFN] Fix MTU for internal interface (#6783)
Set correct MTU size of internal interface for Newport platform
2021-02-25 18:56:02 -08:00
SuvarnaMeenakshi
272781855e [multi-asic][vs]: Add new multi-asic vs hwsku with four asics (#6558)
- Why I did it
Current multi-asic vs hwsku consists of 6 asics with each asic having 32 interfaces. When bringing this up, the issue below was seen:
When all 32 interfaces (sonic interfaces and linux interfaces) are set to 9100 mtu, a DMA error is seen: "DMA: Out of SW-IOMMU space for 4096 bytes at device 0000:06:03.0". This can be fixed by updating swiotlb=65536 in /host/grub/grub.cfg. In order to keep multi-asic VS lighter and easier to bring up and test, a new hwsku 'msft_four_asic_vs' is added to represent a 4-asic hwsku with 2 frontend asics and 2 backend asics, each asic having 8 interfaces interconnected by port-channels.
- How I did it
Add msft_four_asic_hwsku directory to have the right number of directories (4) and update port_config.ini and lanemap.ini files to include 8 ports information.
Add topology.sh script to create the internal asic-asic connectivity.
- How to verify it
Update asic.conf with the 4 asic information as below and build sonic-vs.img:
NUM_ASIC=4
DEV_ID_ASIC_0=0
DEV_ID_ASIC_1=1
DEV_ID_ASIC_2=2
DEV_ID_ASIC_3=3
Modify sonic_multiasic.xml to have 8 front panel interfaces.
Create the virtual switch using the "sudo virsh sonic_multiasic.xml" command.
Start the topology service and load config_db files for the switch and each asic.
Ensure that all internal interfaces and port_channels are coming up.
multi-asic vs testbed:
Bring up the multi-asic VS testbed with a multi-asic image (asic.conf updated to 4 asics) using the t1-lag topology.
./testbed-cli.sh -t vtestbed.csv -m veos_vtb -k ceos add-topo vms-kvm-four-asic-t1-lag password.txt
Load minigraph/config_dbs.
Ensure all internal and external interfaces come up.
No change on single asic vs.
2021-02-25 18:55:21 -08:00
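The asic.conf contents listed in #6558 above are plain KEY=VALUE lines, so a quick sanity check of such a file can be sketched as below. This is an illustrative parser only; the function name `parse_asic_conf` is hypothetical and this is not how SONiC itself reads the file.

```python
def parse_asic_conf(text):
    """Parse asic.conf-style text into (num_asic, {asic_index: device_id}).

    Hypothetical helper: blank lines, comments and lines without '=' are
    skipped. Device ids are kept as strings, exactly as written in the file.
    """
    num_asic = None
    dev_ids = {}
    prefix = "DEV_ID_ASIC_"
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = (part.strip() for part in line.split("=", 1))
        if key == "NUM_ASIC":
            num_asic = int(value)
        elif key.startswith(prefix):
            dev_ids[int(key[len(prefix):])] = value
    return num_asic, dev_ids


# The four-asic example from the commit message.
EXAMPLE = """\
NUM_ASIC=4
DEV_ID_ASIC_0=0
DEV_ID_ASIC_1=1
DEV_ID_ASIC_2=2
DEV_ID_ASIC_3=3
"""
```

A caller could then confirm that the number of `DEV_ID_ASIC_n` entries matches `NUM_ASIC` before building the image.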
Abhishek Dosi
2f1eacbb74 [submodule-update] sonic-platform-daemons
61acd3a2e4a457f3bc706cbfaf3162b947763864 (HEAD -> 201911, origin/201911)
[xcvrd] Change in xcvrd ports cache creation, now ports are being
fetched from config DB (#5892) (#155)

Signed-off-by: Abhishek Dosi <abdosi@microsoft.com>
2021-02-25 18:52:48 -08:00
Abhishek Dosi
1a62cd2f67 [submodule update] sonic-platform-common
0b9429d032c2c0449dfeaad07542707f78b5c01f (HEAD -> 201911, origin/201911)
[sfputilhelper] Add new option in ports cache creation, fetch ports from config DB (#5892) (#172)

Signed-off-by: Abhishek Dosi <abdosi@microsoft.com>
2021-02-25 18:51:22 -08:00
SuvarnaMeenakshi
f694787521 [vs]: Update swiotlb buffer size to support multi-asic VS platform. (#6674)
Current multi-asic vs hwsku consists of 6 asics with each asic having 32 interfaces.
When bringing this up, the issue below was seen:
When all 32 interfaces in each namespace (sonic interfaces and linux interfaces) are set to 9100 mtu, a DMA error is seen: "DMA: Out of SW-IOMMU space for 4096 bytes at device 0000:06:03.0". This can be fixed by updating swiotlb=65536 in /host/grub/grub.cfg.

Signed-off-by: SuvarnaMeenakshi <sumeenak@microsoft.com>
2021-02-25 18:43:15 -08:00
SuvarnaMeenakshi
9208dc507b [multi-asic][vs]: Update topology script to retrieve hwsku from minigraph (#6219)
Update topology script to retrieve hwsku from minigraph
if hwsku information is not available in config_db.
Fix clean up of interfaces in msft_multi_asic_vs hwsku
topology script.
- Why I did it
When bringing up multi-asic VS switch, topology service is started during boot up.
Topology service starts a shell script which runs the topology script present in /usr/share/sonic/device// directory. To invoke hwsku specific script, the topology script tries to retrieve hwsku information from config_db.
During initial boot up config_db might not be populated. In order to start topology service before config_db is updated,
update topology script to get hwsku information from minigraph.xml if it is available.
This will be helpful to bring up multi-asic VS testbed by loading minigraph and starting topology service.
- How I did it
Update topology.sh script to retrieve hwsku information from minigraph.xml.
Fix clean up function on msft_multi_asic_vs topology script.
- How to verify it
single-asic VS - no change; topology service is only enabled for multi-asic VS.
multi-asic VS - Bring up multi-asic VS image, copy minigraph to vs image, start topology service. Topology service should be successful.
To test the clean up function fix, start topology service and make sure interfaces are created and moved to the right namespaces.
Stop topology service and make sure the namespaces do not have any interfaces and all front end interfaces are present in the default namespace.
2021-02-25 18:42:44 -08:00
Abhishek Dosi
f0a86bf038 [submodule-update] sonic-utilities
[201911] show ip int changes (#1437)

Signed-off-by: Abhishek Dosi <abdosi@microsoft.com>
2021-02-25 18:41:29 -08:00
abdosi
1064cd5cd0 [multi-asic] Enhanced iptable default rules (#6765)
What I did:-

For multi-asic platforms added iptable v4 rule to communicate on docker bridge ip
For multi-asic platforms extend iptable v4 rule for iptable v6 also
For multi-asic platforms made all internal rules applicable for all protocols (not filtered based on tcp/udp). This is done for consistency with the local host rule
For multi-asic platforms made nat rule (to forward traffic from namespace to host) generic for all protocols and also use Source IP if present for matching
2021-02-25 18:39:43 -08:00
arlakshm
5822b42fdb
[sudoers]: add ipintutil in sudoer file (#6857)
This PR is port of #6845 for 201911

show ip interfaces was recently enhanced to support multi ASIC platforms in Azure/sonic-utilities#1437. The ipintutil script has to run as sudo user to get the ip interfaces from each namespace.
Add this script to the sudoers file so that the show ip interface command is available to users with read-only permissions

Signed-off-by: Arvindsrinivasan Lakshmi Narasimhan <arlakshm@microsoft.com>
2021-02-23 13:26:53 -08:00
arlakshm
daecc34180
[201911][baseimage] Install pyroute2 for sonic-utilities (#6792)
Signed-off-by: Arvindsrinivasan Lakshmi Narasimhan <arlakshm@microsoft.com>

Install pyroute2 for sonic-utilities. This change is needed for Azure/sonic-utilities#1437
2021-02-22 23:30:27 -08:00
Qi Luo
950557a0f5
[minigraph] Support tagged VlanInterface if attached to multiple vlans (#6846)
Same as https://github.com/Azure/sonic-buildimage/pull/6833
But adapted for 201911 branch
2021-02-22 21:18:35 -08:00
Qi Luo
c9febff961 [radv] Disable radv for specific deployment_id (#6830) 2021-02-22 18:52:40 -08:00
Dror Prital
940944e41c
Support new SKU under the name of SN2700-D40C8S8 (#6822)
#### Why I did it

Add new SKU for SN2700 Mellanox system that supports the following port configuration:
8 X 100G
40 X 50G
8 X 10G

#### How I did it

Add new Folder - "Mellanox-SN2700-D40C8S8" under /sonic-buildimage/device/mellanox/x86_64-mlnx_msn2700-r0/
that contains the relevant files supporting this SKU

#### How to verify it

Bring up the image, run "show interface status" and make sure that all ports are up and reflect the following requirement:
Port 1/3 will be used as 4x10G
Port 2/4 - Do not exist (blocked since 1 and 3 are split to 4)
Port 7/8/9/10/23/24/25/26 will be used as 100G
All other ports will be used as 2x50G

#### Which release branch to backport (provide reason below if selected)

- [ ] 201811
- [X] 201911
- [ ] 202006
- [ ] 202012

#### Description for the changelog

Support new SKU under the name of SN2700-D40C8S8
2021-02-21 09:24:45 -08:00
Qi Luo
712f3311fb
[mgmt-framework]: Update submodule (#6829)
Including commits:
```
58a77fa 2021-02-20 | Git clone go dependencies instead of 'go get' (#79) [Sachin Holla]
```
2021-02-19 22:57:10 -08:00
Abhishek Dosi
fa1934f715 [submodule update] sonic-utilities
Refactor neighbor_advertiser script (#1447)

Signed-off-by: Abhishek Dosi <abdosi@microsoft.com>
2021-02-19 18:38:53 -08:00
Prince Sunny
10e4fd637c Submodule update for restapi (#6808) 2021-02-19 16:10:25 -08:00
Abhishek Dosi
dc306eeba3 [submodule-update] sonic-utilities
02438f953aafa3303792eda2309f8f3303e55dc7 (HEAD -> 201911, origin/201911) Cherry-pick Master PR for route-checker tool (#1433)
e54fb69f7323f6ef48f44a1a893fe8266fd6f817 [201911][vnet] Add "vnet_route_check" script (#1443)

Signed-off-by: Abhishek Dosi <abdosi@microsoft.com>
2021-02-18 18:14:25 -08:00
Kebo Liu
5ec164b694 [Mellanox] [platform API] Fix “local variable 'label_port' referenced before assignment” error (#6419)
In rare cases, xcvrd fails with "UnboundLocalError: local variable 'label_port' referenced before assignment".

Init "label_port" as None at the beginning of the function, to avoid the case where "label_port" is never assigned.
2021-02-18 18:10:54 -08:00
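The failure mode fixed in #6419 above is a general Python pitfall and can be reproduced in isolation. The sketch below is not the Mellanox platform code: the function names and the port mapping are hypothetical, and only the variable name `label_port` comes from the commit message.

```python
def find_label_port_buggy(logical_port, mapping):
    """Look up the label port for a logical port (illustrative, buggy)."""
    for label, logical in mapping.items():
        if logical == logical_port:
            label_port = label
            break
    # If no entry matched, label_port was never bound and this line raises
    # UnboundLocalError.
    return label_port


def find_label_port_fixed(logical_port, mapping):
    """Same lookup with label_port initialised up front, as in the fix."""
    label_port = None  # always bound, even when nothing matches
    for label, logical in mapping.items():
        if logical == logical_port:
            label_port = label
            break
    return label_port
```

With the initialisation in place, callers get `None` for an unknown port and can handle it explicitly instead of crashing the daemon.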
Roy Lee
ce6cc3821f [device/accton]: As7816-64x, fix memory leakage on accton fan monitor. (#6168)
It's been reported that the accton fan monitor process keeps consuming memory after a few days.
The amount of memory occupied increases linearly and is never released.

Signed-off-by: roy_lee <roy_lee@edge-core.com>
2021-02-18 18:10:22 -08:00
Wirut Getbamrung
a5de91069c [device/celestica]: Add thermalctld support on DX010 platform APIs (#6089)
**- Why I did it**
- The thermalctld daemon on the Pmon docker requires support from the thermal manager API.

**- How I did it**
- Removed the old function for detecting a faulty fan.
- Removed the old function for detecting excess temperature.
- Implement thermal_manager APIs based on ThermalManagerBase
- Implement thermal_conditions APIs based on ThermalPolicyConditionBase
- Implement thermal_actions APIs based on ThermalPolicyActionBase
- Implement thermal_info APIs based on ThermalPolicyInfoBase
- Add thermal_policy.json
2021-02-18 18:09:57 -08:00
Volodymyr Boiko
06334ff438 [barefoot][device][plugins] Fix sfp reset (#6745)
Fix sfp reset in Barefoot's sfputil

Signed-off-by: Volodymyr Boyko <volodymyrx.boiko@intel.com>
2021-02-18 18:07:49 -08:00
SuvarnaMeenakshi
9e777e90a0 [multi_asic][vs]: Add dependency in teamd service to start after topology service(#6594)
[multi_asic][vs]: Add dependency in teamd service to start after topology service.
- Why I did it
In multi-asic VS, topology service is run after database service to set up the internal asic topology.
swss and syncd have a dependency to start after topology service is run so that the interfaces are created in and moved to the right namespace. In the case of multi-asic vs, during the initial boot up, when there is no configuration added, teamd service starts and swss/syncd do not start as topology service does not start. Upon loading configuration using config_db or minigraph, swss and syncd start up, but teamd is not restarted as swss is not stopped and started. This causes teamd to be in a bad state and requires a reload of config.

- How I did it
Add dependency in teamd service to start after topology service is completed.

- How to verify it
No change in single asic vs or platform.
No change in multi-asic regular image.
Change only in multi-asic VS. Bring up a multi-asic VS image without any configuration; teamd service will fail to start due to dependency failure. Load minigraph, start topology service, load configuration, and ensure all services come up.
Signed-off-by: SuvarnaMeenakshi <sumeenak@microsoft.com>
2021-02-18 18:05:10 -08:00
judyjoseph
86a13610cb [docker-fpm-frr]: TSA/B/C changes for multi-asic (#6510)
- Introduced TS common file in docker as well and moved common functions.
- TSA/B/C scripts run only in BGP instances for front end ASICs.
       In addition skip enforcing it on route maps used between internal BGP sessions.

admin@str--acs-1:~$ sudo /usr/bin/TSA
System Mode: Normal -> Maintenance

and in case of Multi-ASIC
admin@str--acs-1:~$ sudo /usr/bin/TSA
BGP0 : System Mode: Normal -> Maintenance
BGP1 : System Mode: Normal -> Maintenance
BGP2 : System Mode: Normal -> Maintenance
2021-02-18 18:04:24 -08:00
Sumukha Tumkur Vani
c4a4399da5 Disable port 8090 (#6764) 2021-02-18 17:46:44 -08:00
shlomibitton
4a1742e839
Stop teamd service before syncd (#6756)
When a large number of port channels (more than 64) is configured, a config reload command might end with not all port channels configured and up. Further debugging shows that unloading the port channels on the ASIC driver takes a lot of time.
With this change, deleting all port channels before the syncd restart frees resources better, the ASIC driver unloads all netdevs quickly, and the operation executes properly.
2021-02-18 15:48:11 -08:00