Commit Graph

3184 Commits

Author SHA1 Message Date
rkdevi27
f1bbda19f0 Fix "/host unmount failure" during reboot (#4558) 2020-08-09 10:34:02 -07:00
Guohan Lu
544aa236c4 [submodule]: update sonic-utilities
ef0b1fa 2020-07-21 | [config] Restart telemetry service upon config (re)load (#992)

Signed-off-by: Guohan Lu <lguohan@gmail.com>
2020-08-05 17:30:35 -07:00
Stephen Sun
b76f8fafdb [Mellanox] Update the buffer setting (#4989)
* Update the buffer size based on the latest excel

Signed-off-by: Stephen Sun <stephens@mellanox.com>

* Align the buffer configuration with the latest formula:

- reduce redundant "*2" in formula
- use port MTU for local sending the PFC frame and peer lossless MTU for peer sending lossless traffic

Buffer pool size updated accordingly.

Signed-off-by: Stephen Sun <stephens@mellanox.com>
2020-08-03 23:06:21 -07:00
Joe LeVeque
6556c40040
[201911] Introduce sonic-py-common package (#5063)
Consolidate common SONiC Python-language functionality into one shared package (sonic-py-common) and eliminate duplicate code.

The package currently includes four modules:
- daemon_base
- device_info
- logger
- task_base

NOTE: This is a combination of all changes from https://github.com/Azure/sonic-buildimage/pull/5003, https://github.com/Azure/sonic-buildimage/pull/5049 and some changes from https://github.com/Azure/sonic-buildimage/pull/5043 backported to align with the 201911 branch. As part of the 201911 port, I am not installing the Python 3 package in the base image or in the VS container, because we do not have pip3 installed, and we do not intend to migrate to Python 3 in 201911.
2020-08-03 11:50:06 -07:00
Nazarii Hnydyn
4e558bca25
[201911][Mellanox] Update MFT to 4.15.0-104 (#5077)
* [Mellanox] Update MFT to 4.15.0-104.

Signed-off-by: Nazarii Hnydyn <nazariig@mellanox.com>

* [Mellanox] Remove build system W/A.

Signed-off-by: Nazarii Hnydyn <nazariig@mellanox.com>

* [Mellanox] Add MFT DKMS build support.

Signed-off-by: Nazarii Hnydyn <nazariig@mellanox.com>
2020-08-03 13:53:33 +03:00
Andriy Kokhan
fbf3cb13a5
[bfn] Updated BFN SDK packages to 20200731 with SAI v1.5.2 (#5082)
Signed-off-by: Andriy Kokhan <akokhan@barefootnetworks.com>
2020-08-03 02:12:55 -07:00
Abhishek Dosi
3c224e7ce3 [submodule update] sonic-swss
Revert "Refine getDbId() calling to fix build after swss-common change (#1245)"

 This should fix VS build.

Signed-off-by: Abhishek Dosi <abdosi@microsoft.com>
2020-08-01 17:41:30 -07:00
Abhishek Dosi
35ff8b3e12 [submodule update] sonic-platform-daemons
[xcvrd] Fix bailing out on platforms that do not support QSFP-DD (#78)

Signed-off-by: Abhishek Dosi <abdosi@microsoft.com>
2020-08-01 14:24:13 -07:00
Abhishek Dosi
ae65de6cac [submodule update] sonic-swss
Remove 00-copp.config.json from swss debian package. (#1366)
2020-07-31 17:30:09 -07:00
pavel-shirshov
f757a5d6eb Fix for ipv6 local-addr problem (#4876)
Co-authored-by: Pavel Shirshov <pavel.contrib@gmail.com>
2020-07-31 17:26:07 -07:00
abdosi
e3eddede1e Changes to add template support for copp.json. (#5053)
* Changes to add template support for copp.json.
This is needed so that we can install different types of
traps based on Device Role (Tor/Leaf/Mgmt/etc...).

Initial use case is to install the DHCP/DHCPv6 trap only
for ToR routers.

Signed-off-by: Abhishek Dosi <abdosi@microsoft.com>

* Fixed based on review comments.

Signed-off-by: Abhishek Dosi <abdosi@microsoft.com>

* Fixed based on review comment.
2020-07-31 17:24:45 -07:00
Joe LeVeque
c96c3cd311 [caclmgrd] Always restart service upon process termination (#5065) 2020-07-31 17:23:48 -07:00
Abhishek Dosi
a4d399c1a9 [submodule update] sonic-platform-common
Fix import issue for python3 whl building (#105)
2020-07-31 10:24:53 -07:00
Abhishek Dosi
b576443af9 [submodule update] sonic-platform-daemons
[xcvrd] Add support for QSFP-DD cables (#66)
2020-07-30 12:12:03 -07:00
Abhishek Dosi
a2aa5c4d8c [submodule update] sonic-platform-common
[Transceiver] Add parser for QSFP-DD cable type and dictionaries for
 QSFP-DD codes (#101)
2020-07-30 12:10:20 -07:00
Kebo Liu
c8c4493a96
Update SAI to 1.16.6, SDK to 4.4.1014, FW to *.2008.1032 (#5056)
SAI:
    Fix ECMP max groups logic
    add set issu log level for spc2/spc3, as now issu is supported
    set vlan max swid = 0 on sdk init, as only single swid is needed, for efficient resource usage
    Fix traffic lost during FFB related to buffer config + optimize buffer config timing for FB
    Add ACL fields BTH, IP flags
    Add ACL infrastructure of different fields per ASIC type
    Add port stat ether rx/tx oversize pkts
  SDK/FW:
    Added support for Finisar 100GbE SWDM Transceiver FTLC9152RGPL.
    Spectrum-2 Added support for 10G BaseT modules
    Added link LED support for SN4600C.
    Counters | In SDK debug dump, the incorrect counter type appears for vtraps.
    WJH | Without any traffic or events on the idle system, the CPU load is constantly above 4%
    WJH | WJH filter currently cannot filter by PORT for buffer drop reason.
    Spectrum | ACL, Unbind, Lazy Delete | Running Lazy Delete together with auto_unbind may cause race condition errors. To work with Lazy Delete, use the new INIT parameter "acl_manual_unbind" so that ACLs will not be removed automatically when the binding point is deleted.
    Spectrum | ISSU | In ISSU mode, when querying for the number of configurable buffers, using the API sx_api_cos_port_buff_type_get with the count parameter as 0, the API returns the number for NORMAL mode instead.
    Spectrum-2 | BER | BER monitor counts raw errors instead of effective errors
    Spectrum-2 | BER | Connecting to ConnectX-5 adapter card with copper splitter cable MCP7H50-V001R30 in 1
    Spectrum-2 | Cables | Link flaps in 200GbE with AOM Optic cable MMA1T00-VS
    Spectrum-3 | Speeds, Link | When moving from a 400GbE link to a 1GbE link, packets may drop for 1msec right after link up
    Spectrum-3 | Cables, Speeds | Using 400GbE with 3rd party systems is not supported
    Spectrum-3 | LAG | After a while, LAG members become out of sync with one another
    Spectrum-3 | VLAN, Ports | Packets with VLAN headers are sent to
2020-07-30 13:37:54 +03:00
Kebo Liu
b4f010f042
[kernel]: Update sonic-linux-kernel to pick up new fix (#5044)
Azure/sonic-linux-kernel#154 patch psample to safely unload the module
2020-07-28 15:06:51 -07:00
Abhishek Dosi
a163583d93 [submodule update] sonic-swss-common
This is fix for compilation error also on 201911.
[schema]: Add a new table "NAT_DNAT_POOL_TABLE" to hold the DNAT Pool
entries.

Signed-off-by: Abhishek Dosi <abdosi@microsoft.com>
2020-07-27 08:58:42 -07:00
Stephen Sun
33a3cc8861
[submodule]: update submodule head for sonic-sairedis on 201911 (#5041)
Update the meta code to support DNAT Pool changes (#616)
[syncd] Fix notification on shutdown request (#637)
Advance the submodule head of SAI (#641)

Signed-off-by: Stephen Sun <stephens@mellanox.com>
2020-07-26 13:43:01 -07:00
Abhishek Dosi
940e5c0db0 [submodule update] sonic-utilities
[config] Add 'config interface mtu' command #793
add fec configuration 'config interface fec [OPTIONS] <interface_name>
<interface_fec>' #764
2020-07-26 11:29:10 -07:00
Abhishek Dosi
6216d32db0 [submodule update] sonic-swss
[201911] Update nat entries to use nat_type to support DNAT Pool
  changes. (#1297)
2020-07-26 11:22:41 -07:00
Nazarii Hnydyn
7fe359747a [orchagent]: Fix platform string export. (#4993)
Signed-off-by: Nazarii Hnydyn <nazariig@mellanox.com>
2020-07-26 11:20:56 -07:00
Tamer Ahmed
755319c37c [docker-orchagent] Call sonic-cfggen Once (#4936)
Optimizing number of calls made to sonic-cfggen during service
start up as it adds to total system boot up time.

signed-off-by: Tamer Ahmed <tamer.ahmed@microsoft.com>

**- Why I did it**
sonic-cfggen call is slow and it adds to system start up time

**- How I did it**
Placed all required variables into a single template and called sonic-cfggen once using this template

**- How to verify it**
***-Test 1***
There is an average saving of 0.5 to 1 sec between the old script and the new script
```
root@str-s6000-acs-14:/# time ./orchagent_old.sh
/usr/bin/orchagent -d /var/log/swss -b 8192 -m f4:8e:38:16:bc:8d

real	0m3.546s
user	0m2.365s
sys	0m0.585s

root@str-s6000-acs-14:/# time ./orchagent_new.sh
/usr/bin/orchagent -d /var/log/swss -b 8192 -m f4:8e:38:16:bc:8d

real	0m2.058s
user	0m1.650s
sys	0m0.363s
```
***-Test 2***
Built an image with this change and orchagent is running with intended params:
```
admin@str-s6000-acs-14:~$ ps -ef | grep orchagent
root      2988  1901  1 02:09 pts/0    00:00:02 /usr/bin/orchagent -d /var/log/swss -b 8192 -m f4:8e:38:16:bc:8d
```

signed-off-by: Tamer Ahmed <tamer.ahmed@microsoft.com>
2020-07-26 11:19:15 -07:00
Akhilesh Samineni
dd26117bf2 [NAT]: Update the conntrack entries timeout to Max value after warmboot (#4596)
Signed-off-by: Akhilesh Samineni <akhilesh.samineni@broadcom.com>

All new NAT conntrack entries are added to the kernel with the maximum entry timeout of 432000; this change sets the same timeout during system warm reboot as well
2020-07-26 11:18:42 -07:00
shlomibitton
9385775803 [Mellanox] Change fan tolerance to 50% (#5018)
Mellanox platforms fan tolerance should change to 50%

Signed-off-by: Shlomi Bitton <shlomibi@mellanox.com>
2020-07-26 11:18:00 -07:00
anish-n
733f7091ac [bgpcfgd]: Add Vlan prefix list to the FRR templates (#5005)
add the Vlan prefix list to the FRR templates
2020-07-26 11:17:29 -07:00
madhanmellanox
130aeb4cc1 [caclmgrd] Log error message if IPv4 ACL table contains IPv6 rule and vice-versa (#4498)
* Defect 2082949: Handling Control Plane ACLs so that IPv4 rules and IPv6 rules are not added to the same ACL table

* Addressed previous code review comments: added functions for is_ipv4_rule and is_ipv6_rule, and raise Exceptions instead of simply aborting when the conflict occurs

* Addressed code review comment to replace duplicate code with already existing functions

* removed raising Exception when rule conflict in Control plane ACLs are found

* added code to remove the rule_props if the rule conflicts with the IP version of the ACL table

* addressed review comment to add ignoring rule in the error statement

Co-authored-by: Madhan Babu <madhan@arc-build-server.mtr.labs.mlnx>
2020-07-26 11:16:30 -07:00
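The caclmgrd change above (130aeb4cc1) logs an error and ignores a rule whose IP version does not match its ACL table. A minimal sketch of that idea follows. It is not the caclmgrd code: the function names `rule_ip_version` and `drop_conflicting_rules`, the logger name, and the rule layout (rule name mapped to a prefix string) are all assumptions made for illustration.

```python
import ipaddress
import logging

# Hypothetical logger name, used only in this sketch.
log = logging.getLogger("cacl-sketch")


def rule_ip_version(prefix):
    """Return 4 or 6 for a prefix string such as '10.0.0.0/8' or 'fc00::/7'."""
    return ipaddress.ip_network(prefix, strict=False).version


def drop_conflicting_rules(table_version, rules):
    """Keep rules whose prefix matches table_version (4 or 6).

    Rules of the other IP version are logged as errors and skipped,
    rather than raising an exception.
    """
    kept = {}
    for name, prefix in rules.items():
        version = rule_ip_version(prefix)
        if version == table_version:
            kept[name] = prefix
        else:
            log.error("Ignoring rule %s: IPv%d prefix in IPv%d ACL table",
                      name, version, table_version)
    return kept
```

For example, `drop_conflicting_rules(4, {"RULE_1": "10.0.0.0/8", "RULE_2": "fc00::/7"})` keeps only `RULE_1` and logs an error for `RULE_2`.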
shlomibitton
d0be3ebe16 Add support for QSFP-DD cables on MLNX platform API (#4965)
Signed-off-by: Shlomi Bitton <shlomibi@mellanox.com>
2020-07-26 11:15:05 -07:00
vdahiya12
2b86e51026 [daemon_base] fix to not reregister signal handler (#4998)
* [daemon_base] fix to not reregister signal handler

-src/sonic-daemon-base/sonic_daemon_base/daemon_base.py
Problem:
Currently all daemons inherit from the daemon_base class, and for
signal handling functionality they register the signal_handler() by
overriding the signal_handler() in daemon_base with their own
implementation.
But some sonic_platform instances can also invoke the daemon_base
constructor while trying to instantiate the common utilities,
for example
platform_chassis = sonic_platform.platform.Platform().get_chassis()
This will cause the re-registration of signal_handler, which will
cause the base class signal_handler() to be invoked when the daemon
gets a signal, whereas their own signal_handler should have been
invoked.

Fix:
We only register the signal_handler once, and if signal_handler has
already been registered, we do not re-register it.

Signed-off-by: vaibhav-dahiya <vdahiya@microsoft.com>

* [daemon_base] fix to not reregister signal handler

Signed-off-by: vaibhav-dahiya <vdahiya@microsoft.com>
2020-07-26 11:14:17 -07:00
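The daemon_base fix above (2b86e51026) registers the signal handler only once. The sketch below shows one way to get that behaviour, assuming a class-level flag. The flag name `_handlers_registered` and the classes `DaemonBase` and `MyDaemon` are hypothetical, and the actual change in daemon_base.py may use a different guard.

```python
import signal


class DaemonBase:
    # Hypothetical class-level flag shared by every instance in the process.
    _handlers_registered = False

    def __init__(self):
        # Register only on first construction, so a later construction of the
        # base class cannot replace a subclass handler that is already set.
        if not DaemonBase._handlers_registered:
            signal.signal(signal.SIGTERM, self.signal_handler)
            DaemonBase._handlers_registered = True

    def signal_handler(self, sig, frame):
        print("base handler got signal", sig)


class MyDaemon(DaemonBase):
    def signal_handler(self, sig, frame):
        print("daemon handler got signal", sig)


daemon = MyDaemon()    # registers MyDaemon.signal_handler
helper = DaemonBase()  # stands in for platform code constructing the base class again
```

After both constructions, `signal.getsignal(signal.SIGTERM)` still refers to the `MyDaemon` handler. Without the guard, the second construction would have replaced it with the base class handler, which is the problem the commit describes.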
Stepan Blyshchak
b100ec559e [services] remove swss from WantedBy for nat service (#4991)
Otherwise, it may cause issues for warm restarts, warm reboot.
Warm restart of swss will start nat which is not expected for warm
restart. Also it is observed that during warm-reboot script execution
nat container gets started after it was killed. This causes removal of
nat dump generated by nat previously:

A check [ -f /host/warmboot/nat/nat_entries.dump ] || echo "NAT dump
does not exists" was added right before kexec:

```
Fri Jul 17 10:47:16 UTC 2020 Prepare MLNX ASIC to fastfast-reboot:
install new FW if required
Fri Jul 17 10:47:18 UTC 2020 Pausing orchagent ...
Fri Jul 17 10:47:18 UTC 2020 Stopping nat ...
Fri Jul 17 10:47:18 UTC 2020 Stopped nat ...
Fri Jul 17 10:47:18 UTC 2020 Stopping radv ...
Fri Jul 17 10:47:19 UTC 2020 Stopping bgp ...
Fri Jul 17 10:47:19 UTC 2020 Stopped bgp ...
Fri Jul 17 10:47:21 UTC 2020 Initialize pre-shutdown ...
Fri Jul 17 10:47:21 UTC 2020 Requesting pre-shutdown ...
Fri Jul 17 10:47:22 UTC 2020 Waiting for pre-shutdown ...
Fri Jul 17 10:47:24 UTC 2020 Pre-shutdown succeeded ...
Fri Jul 17 10:47:24 UTC 2020 Backing up database ...
Fri Jul 17 10:47:25 UTC 2020 Stopping teamd ...
Fri Jul 17 10:47:25 UTC 2020 Stopped teamd ...
Fri Jul 17 10:47:25 UTC 2020 Stopping syncd ...
Fri Jul 17 10:47:35 UTC 2020 Stopped syncd ...
Fri Jul 17 10:47:35 UTC 2020 Stopping all remaining containers ...
Warning: Stopping telemetry.service, but it can still be activated by:
  telemetry.timer
Fri Jul 17 10:47:37 UTC 2020 Stopped all remaining containers ...
NAT dump does not exists
Fri Jul 17 10:47:39 UTC 2020 Rebooting with /sbin/kexec -e to
SONiC-OS-201911.140-08245093 ...
```

With this change, executed warm-reboot 10 times without hitting this
issue, while without this change the issue is easily reproducible almost
every warm-reboot run.

Signed-off-by: Stepan Blyschak <stepanb@mellanox.com>
2020-07-26 11:11:35 -07:00
anish-n
b0ccf58682 [bgpcfgd]: Add fix to bgpcfgd to ignore NEIGHBOR_METADATA entries for dynamic peers (#5008)
This fix removes the requirement to have a NEIGHBOR_METADATA entry for dynamic peers. The change is made because NEIGHBOR_METADATA entries do not need to be present for dynamic neighbors.
2020-07-26 11:10:24 -07:00
Kebo Liu
0701da4145 [Mellanox] remove code which instructs hw-mgmt to skip mlsw_minimal probing in fast-boot flow (#5011) 2020-07-26 11:09:26 -07:00
isabelmsft
ca844ec6b3 Update Kubernetes and kubernetes-cni versions (#5024)
This PR updates kubernetes version to 1.18.6 and kubernetes-cni version to 0.8.6

signed-off by: Isabel Li isabel.li@microsoft.com

Why I did it
Previous kubernetes-cni version (0.7.5) introduced Kubernetes Man In The Middle Vulnerability. “A vulnerability was found in all versions of containernetworking/plugins before version 0.8.6, that allows malicious containers in Kubernetes clusters to perform man-in-the-middle (MitM) attacks. A malicious container can exploit this flaw by sending rogue IPv6 router advertisements to the host or other containers, to redirect traffic to the malicious container.”

How I did it
Defined kubernetes-cni version to be 0.8.6 and updated kubernetes version to be 1.18.6

How to verify it
Check versions by running dpkg -l | grep kube
2020-07-26 11:08:21 -07:00
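The Kubernetes update above (ca844ec6b3) states that versions before 0.8.6 are affected by the MitM vulnerability. A small sketch of checking a version string against that threshold is shown here. The helper names `parse_version` and `cni_is_patched` are hypothetical, and the handling of a trailing package revision such as `-00` is an assumption about how the version string might appear in `dpkg -l` output.

```python
import re

# First fixed version, as stated in the commit message above.
PATCHED = (0, 8, 6)


def parse_version(text):
    """Extract the leading 'major.minor.patch' numbers from a version string."""
    match = re.match(r"(\d+)\.(\d+)\.(\d+)", text)
    if match is None:
        raise ValueError("unrecognised version string: %r" % text)
    return tuple(int(part) for part in match.groups())


def cni_is_patched(version_text):
    """Return True if the version is 0.8.6 or later."""
    return parse_version(version_text) >= PATCHED
```

With this sketch, `cni_is_patched("0.7.5")` is false and `cni_is_patched("0.8.6-00")` is true.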
Joe LeVeque
4a2db8e216 [caclmgrd] remove default DROP rule on FORWARD chain (#5034) 2020-07-26 11:07:42 -07:00
Nazarii Hnydyn
0ec979dd30
[Mellanox] Fix SN3700 platform string. (#5035)
Signed-off-by: Nazarii Hnydyn <nazariig@mellanox.com>
2020-07-25 03:04:03 -07:00
Joe LeVeque
3f3fcd3253 [caclmgrd] Filter DHCP packets based on dest port only (#4995) 2020-07-21 10:13:17 +00:00
Joe LeVeque
52e45e823e
[201911][sudoers] Add sonic_installer list to read-only commands (#4997)
`sonic_installer list` is a read-only command. Specify it as such in the sudoers file.

This will also ensure the new `show boot` command, which calls `sudo sonic_installer list` under the hood doesn't fail due to permissions.
2020-07-17 20:13:42 -07:00
Joe LeVeque
5591131bba
[201911][sonic-telemetry] Update submodule (#4987)
Point submodule to new 201911 branch of sonic-telemetry and update pointer to the current HEAD of the 201911 branch

* src/sonic-telemetry aaa9188...01b5365 (1):
  > [testdata] Update SFP keys to align with new standard (#39)
2020-07-17 17:37:21 -07:00
Joe LeVeque
840be7732c
[201911][devices] Update SFP keys to align with new standard (#4976)
Align SFP key names with new standard defined in https://github.com/Azure/sonic-platform-common/pull/97

- hardwarerev -> hardware_rev
- serialnum -> serial
- manufacturename -> manufacturer
- modelname -> model
- Connector -> connector
2020-07-16 11:09:47 -07:00
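The SFP key renames listed in the commit above (840be7732c) can be expressed as a lookup table. This is an illustrative sketch only: the names `SFP_KEY_RENAMES` and `rename_sfp_keys` are hypothetical and do not come from the commit.

```python
# Old-to-new SFP key names, taken from the list in the commit message above.
SFP_KEY_RENAMES = {
    "hardwarerev": "hardware_rev",
    "serialnum": "serial",
    "manufacturename": "manufacturer",
    "modelname": "model",
    "Connector": "connector",
}


def rename_sfp_keys(info):
    """Return a copy of info with legacy key names replaced.

    Keys that are not in the rename table pass through unchanged.
    """
    return {SFP_KEY_RENAMES.get(key, key): value for key, value in info.items()}
```

For example, `rename_sfp_keys({"serialnum": "ABC123", "Connector": "LC"})` yields `{"serial": "ABC123", "connector": "LC"}` (the values here are made up).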
Samuel Angebault
41ba95ee3f
[arista] update Arista drivers submodules (#4967)
Merge most of the changes that recently made it to master.
This will be the last such merge operation and future commits will only cherry-pick fixes and targeted features.

Major fixes and features,
- reboot cause enhancement with more hardware reboot cause reporting
- fix reboot cause parsing issue with 201811 release
- fix get_change_event logic
- fix error message on missing sysfs entry by our plugins
- final piece of the platform refactors for fan and sensor reporting through the platform API
2020-07-16 10:36:07 -07:00
Danny Allen
0824509373 [docker-ptf] Add support for spytest to ptf container (#4410)
- Install apt and pip dependencies
- Define traffic generator service

Signed-off-by: Danny Allen <daall@microsoft.com>
2020-07-14 22:06:05 -07:00
Abhishek Dosi
0d6754d140 [Submodule update] sonic-snmpagent
[201911] Fix interface counters in RFC1213 (#144)
2020-07-14 22:05:13 -07:00
abdosi
b725572023
Fix the below frr start.sh jinja2 exception in 201911 image syslog: (#4958)
File "/usr/local/bin/sonic-cfggen", line 380, in <module>
     main()
   File "/usr/local/bin/sonic-cfggen", line 354, in main
     print(template.render(data))
   File "/usr/local/lib/python2.7/dist-packages/jinja2/environment.py", line 1090, in render
     self.environment.handle_exception()
   File "/usr/local/lib/python2.7/dist-packages/jinja2/environment.py", line 832, in handle_exception
     reraise(*rewrite_traceback_stack(source=source))
   File "<template>", line 1, in top-level template code
   File "/usr/local/lib/python2.7/dist-packages/jinja2/environment.py", line 471, in getattr
     return getattr(obj, attribute)
 jinja2.exceptions.UndefinedError: 'WARM_RESTART' is undefined

Signed-off-by: Abhishek Dosi <abdosi@microsoft.com>
2020-07-14 08:02:32 -07:00
Abhishek Dosi
7591d76b55 [Submodule Update] sonic-utilities
Fix the None Type Exception when Interface Table does not exist (cold
boot) as part of db migration (#986)
2020-07-13 09:09:00 -07:00
Prince Sunny
79d434a442 [bgpcfgd] - Fix a key error during delete (#4946) 2020-07-11 22:36:41 -07:00
Abhishek Dosi
b78e17a143 [Submodule update] sonic-snmpagent. Move to 201911 branch with the
following PRs:
Implement cbgpPeer2State in CiscoBgp4MIB (#119)
Fix index nodes in LLDP tables whose access right is not-accessible.
(#112)
 Fix quagga/FRR parser on IPv6 BGP sessions (#122)
 [lint] Fix some syntax errors or warnings (#127)
  Update README.md: Add lgtm badges (#128)
  [Multi-asic]: Support multi-asic platform (#126)
  Simplify test code (#132)
  [Multi-asic]: Namespace support for LLDP and Sensor tables (#131)
  Fix undefined variable and warning message (#134)
  Fix SNMP AgentX socket connection timeout when using
  Namespace.get_all() (#140)
  [Namespace] Fix interfaces counters in InterfacesMIB RFC 2863 (#141)
   Fix LGTM reported alert of PR#141 (#142)
2020-07-11 14:07:50 -07:00
Abhishek Dosi
93d2eda1a2 [Submodule Update] sonic-py-swssdk
[MultiDB]: use python class composition to avoid confusion in base
class (#74)
2020-07-11 09:54:04 -07:00
Abhishek Dosi
7aefe6d58f [Submodule Update] sonic-utilities
Intf table migration for APP_DB entries during warmboot (#980)
[Multi NPU] Time Improvements to the config reload/load_minigraph
commands  (#917)
2020-07-11 09:51:16 -07:00
Joe LeVeque
0559b7d3b6 [caclmgrd] Improve code reuse (#4931)
Improve code reuse in `generate_block_ip2me_traffic_iptables_commands()` function.
2020-07-11 09:48:10 -07:00
arlakshm
7c699df654 Add support for bcmsh and bcmcmd utilities in multi ASIC devices (#4926)
Signed-off-by: Arvindsrinivasan Lakshmi Narasimhan <arlakshm@microsoft.com>
This PR has changes to support accessing the bcmsh and bcmcmd utilities on multi ASIC devices
Changes done
- move the link of /var/run/sswsyncd from docker-syncd-brcm.mk to docker_image_ctl.j2
- update the bcmsh and bcmcmd scripts to take -n [ASIC_ID] as an argument on multi ASIC platforms
2020-07-11 09:47:24 -07:00