2156 Commits

Author SHA1 Message Date
zhenggen-xu
34f3caf2e2 Set the default mac ageing time to 600 seconds (#2365)
* Set the default mac ageing time to 300 seconds

MAC ageing is currently disabled, which could let the MAC address
table grow over time and cause resource and performance issues.

Signed-off-by: Zhenggen Xu <zxu@linkedin.com>

* Update the default HW ageing timer to be 600 seconds.

This is the safer choice, because the ARP update interval
is 300 seconds and SONiC does not flood when an ARP entry ages out.

Signed-off-by: Zhenggen Xu <zxu@linkedin.com>
2019-06-16 05:50:27 +00:00
pavel-shirshov
f71c665705
[libteam]: Reimplement Warm-Reboot procedure (#2999)
* [libteam]: Reimplement Warm-Reboot procedure

* Address internal comments
2019-06-14 13:56:16 -07:00
Ying Xie
983a4b24eb [bcm SAI] upgrade Broadcom SAI to version 3.3.6.1-9 (#3009)
- Broadcom SAI GA version 20190513
- Broadcom fix for CS7999193, CS7913246, CS4529162, CS8180755, CS8242625

Signed-off-by: Ying Xie <ying.xie@microsoft.com>
2019-06-13 22:23:40 -07:00
Ying Xie
be799cbed3
[swss][utilities] advance sub module head (#3010)
Submodule src/sonic-utilities 46b5aa8..5b73b83:
  > [intfutil] Fix error when <interface name> specified in show interface related commands (#548)

Submodule src/sonic-swss a637562..93497ec:
  > [orchagent] PFC WD support for BFN platform (#916)

Signed-off-by: Ying Xie <ying.xie@microsoft.com>
2019-06-13 12:24:45 -07:00
Joe LeVeque
13b066fc0e [201803][monit] Restart rsyslog service if rsyslogd consumes > 800 MB memory (#2963) 2019-06-13 19:10:14 +00:00
SuvarnaMeenakshi
0023fca739 [baseimage] kernel oom-killer to panic when the system is truly out of memory (#2988)
- What I did
Currently, when the system is under memory pressure, the OOM killer kicks in and kills a rogue process. Killing a rogue process can leave the device unhealthy, which leads to traffic being blackholed.

To avoid this, configure the OOM killer to trigger a kernel panic, which causes the device to reboot and come back up healthy.

- How I did it
Added the sysctl variable panic_on_oom and set the value to 2.
Setting it to 2 ensures that the OOM killer always triggers a kernel panic.
2019-06-13 18:59:51 +00:00
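
The panic_on_oom setting described in the entry above can be inspected with a short script. This is a minimal sketch, not code from the commit: the helper names and the wording of the descriptions are hypothetical, and it assumes a Linux host where the value is exposed at /proc/sys/vm/panic_on_oom.

```python
from pathlib import Path

# Standard procfs location of the vm.panic_on_oom sysctl on Linux.
PANIC_ON_OOM = Path("/proc/sys/vm/panic_on_oom")

# Short paraphrases of the three documented values (wording is ours).
MEANINGS = {
    0: "never panic: the OOM killer kills a process instead",
    1: "panic on system-wide OOM, but not on cpuset/mempolicy-limited OOM",
    2: "always panic, even on cpuset/mempolicy-limited OOM",
}


def describe(value):
    """Return a human-readable meaning for a panic_on_oom value."""
    return MEANINGS.get(value, "unknown value")


def read_panic_on_oom(path=PANIC_ON_OOM):
    """Return the current setting as an int, or None if it cannot be read."""
    try:
        return int(path.read_text().strip())
    except (OSError, ValueError):
        return None


current = read_panic_on_oom()
if current is None:
    print("panic_on_oom is not readable on this host")
else:
    print(f"panic_on_oom = {current}: {describe(current)}")
```

On an image built with this change, the script would be expected to report the value 2.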
pavel-shirshov
2295dab965 [submodule]: Update sonic-quagga submodule (#2984) 2019-06-13 18:59:31 +00:00
Ying Xie
fbe9715f85
[201811][swss][utilities] advance sub module head (#2968)
Submodule src/sonic-utilities 6b4d1a0..46b5aa8:
  > [show ip interface] Add support for 'alias' interface naming mode (#486)

Submodule src/sonic-swss 9c4ae18..a637562:
  > Suppress storm detect counter increment for ongoing pfc storm case during a warm reboot (#869)
  > Remove *_LEFT fields to allow PFC watchdog to enter fresh into the (#897)
  > Set LAG mtu value based on kernel netlink msg (#922)
  > [warm restart assist] assume vector values could be reordered (#921)

Signed-off-by: Ying Xie <ying.xie@microsoft.com>
2019-06-04 11:35:08 -07:00
Ying Xie
fbe55e9adf
[201811][utilities] advance utilities sub module head (#2960)
Submodule src/sonic-utilities 4488525..6b4d1a0:
  > [show vlan brief] Support 'alias' interface naming mode (#497)
  > [show interface neighbor expected] Support 'alias' interface naming mode (#495)
  > updated show ipv6 interface for alias mode (#493)
  > [show] Add serial numbers/uptime/hwinfo to 'show version' output (#488)
  > [show] show interface status added vlan and portchannels to command (#483)

Signed-off-by: Ying Xie <ying.xie@microsoft.com>
2019-05-30 14:34:48 -07:00
pavel-shirshov
3954e0821c [libteam] Send updates as soon as we need to update the LACP partner about changes (#2955) 2019-05-30 21:15:12 +00:00
Phanindra TV
abc25df612 [teamd]: Administratively shutdown port channel has member ports in deselected state and traffic is not forwarded. #1771 (#2882) 2019-05-30 21:15:05 +00:00
Ying Xie
f791502237
[201811][utilities][swss] advance sub-module heads (#2953)
Submodule src/sonic-utilities 7a2348c..4488525:
  > use vlan members (#542)
  > [sonic_installer] If asked to install an image which is already installed, simply set as default (#534)

Submodule src/sonic-swss 8246bd9..9c4ae18:
  > Ignore neighbor entry with BCAST MAC, check SAI status exists (#914)

Signed-off-by: Ying Xie <ying.xie@microsoft.com>
2019-05-28 17:58:30 -07:00
Kebo Liu
506081813a [mellanox]: fix wrong type of parameter (#2950) 2019-05-29 00:53:36 +00:00
Joe LeVeque
8ae67c4c5d [logrotate] Enhance robustness (#2942)
* [logrotate] Decrease frequency to every 10 minutes; kill any lingering logrotate processes

* [logrotate] Delete all *.1.gz files as firstaction; Remove note about init-system-helpers < 1.47 workaround

However, continue to send SIGHUP directly to rsyslogd process
because 'service rsyslog rotate' still doesn't work properly with
init-system-helpers version 1.48
2019-05-29 00:53:13 +00:00
Qi Luo
0f4cb41efc [monit] Set memory usage alert at 50% (#2939)
Signed-off-by: Qi Luo <qiluo-msft@users.noreply.github.com>
2019-05-29 00:52:43 +00:00
Sudharsan D.G
fb1f156eb2 [devices]: Optics fixes in Dell Z9100/Z9264f platforms (#2936) 2019-05-29 00:51:43 +00:00
Stepan Blyshchak
fae35536c3 [swss.sh] flush FDB table during cold start (#2933)
Signed-off-by: Stepan Blyschak <stepanb@mellanox.com>
2019-05-29 00:51:09 +00:00
paavaanan
5b52a24e25 [devices]: Export reboot_reason sysfs attribute for DellEMC S6100/Z9100 (#2922) 2019-05-29 00:50:40 +00:00
paavaanan
c49bac1457 [devices]: Dell Hwmon S6100/Z9100 SFM version export (#2521) 2019-05-29 00:50:13 +00:00
Ying Xie
f434b80758
[201811][utilities] update sub-module head (#2927)
Submodule src/sonic-utilities a1f961c..7a2348c:
  > [201811] enable DB migrator code (#536)

Signed-off-by: Ying Xie <ying.xie@microsoft.com>
2019-05-20 12:09:28 -07:00
Ying Xie
5975a9c25b [updategraph] set DB version after minigraph reload (#2917)
Signed-off-by: Ying Xie <ying.xie@microsoft.com>
2019-05-20 19:05:29 +00:00
Stepan Blyshchak
712d4b90fe [mlnx] fix incorrect attr assignment in mlnx-sfpd (#2913)
* [mlnx] fix incorrect attr assignment in mlnx-sfpd

Signed-off-by: Stepan Blyschak <stepanb@mellanox.com>

* [mlnx] on_pmpe returns bool and not SX_STATUS_SUCCESS

Signed-off-by: Stepan Blyschak <stepanb@mellanox.com>

* [mlnx] fix typo

Signed-off-by: Stepan Blyschak <stepanb@mellanox.com>
2019-05-20 19:04:52 +00:00
Stepan Blyshchak
82cd144fbd [mlnx] refactor and fix mlnx-sfpd shutdown (#2907)
* [mlnx] fix mlnx-sfpd shutdown

Signed-off-by: Stepan Blyschak <stepanb@mellanox.com>

* fix type and handle only EINTR and EAGAIN errors from select

Signed-off-by: Stepan Blyschak <stepanb@mellanox.com>

* handle select.error as well during init/run

Signed-off-by: Stepan Blyschak <stepanb@mellanox.com>
2019-05-20 19:03:44 +00:00
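
The second and third bullets of the entry above describe treating only EINTR and EAGAIN from select as recoverable. A minimal sketch of that pattern follows. It is not the mlnx-sfpd code: the function name wait_readable is hypothetical, and it assumes Python 3, where select.error is an alias of OSError and an interrupted select call is usually retried by the interpreter itself.

```python
import errno
import os
import select


def wait_readable(fds, timeout):
    """Return the fds that are readable, or [] on timeout or a transient error."""
    try:
        readable, _, _ = select.select(fds, [], [], timeout)
    except OSError as err:  # select.error is an alias of OSError on Python 3
        if err.errno in (errno.EINTR, errno.EAGAIN):
            return []       # transient: the caller simply polls again
        raise               # anything else is a real failure
    return readable


# Usage, with a pipe standing in for the real event source.
read_fd, write_fd = os.pipe()
nothing_ready = wait_readable([read_fd], 0)   # no data written yet
os.write(write_fd, b"x")
ready = wait_readable([read_fd], 1.0)         # data is now pending
os.close(read_fd)
os.close(write_fd)
```

Re-raising every other error keeps genuine failures, such as a closed descriptor, visible to the caller instead of hiding them inside a retry loop.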
Sudharsan D.G
85c51bf5c9 [devices]: Added index for dell z9100 c32 (#2892) 2019-05-20 18:59:55 +00:00
Renuka Manavalan
238db1e06a [tacacs]: skip accessing tacacs servers for local non-tacacs users (#2843)
* Switch the nss lookup order to "compat" followed by "tacplus".
This uses the legacy passwd file for user info and goes to tacacs only if the user is not found there.
As a result, we never contact tacacs for local users like "admin".
This isolates local users from any issues with tacacs servers.
Without this fix, sudo commands run by local users could take <count of servers> * <tacacs timeout> seconds if the tacacs servers are unreachable.

* Skip tacacs server access for local non-tacacs users.
Revert the order of 'compat tacplus' to original 'tacplus compat' as tacplus
access is required for all tacacs users, who also get created locally.
2019-05-20 18:59:26 +00:00
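
The worst-case delay quoted in the entry above, <count of servers> * <tacacs timeout>, can be worked through with concrete numbers. The server count and timeout below are hypothetical values chosen for illustration; they do not come from the commit.

```python
def worst_case_delay(server_count, timeout_seconds):
    """Seconds spent waiting if every configured server has to time out in turn."""
    return server_count * timeout_seconds


# Hypothetical deployment: 3 unreachable servers, 5 second timeout each.
delay = worst_case_delay(server_count=3, timeout_seconds=5)
print(f"a local user's sudo command could stall for up to {delay} seconds")
```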
paavaanan
643d16a4d7 LED Support For DellEMC Z9100 (#2799) 2019-05-20 18:58:55 +00:00
Joe LeVeque
bd7b96fea3 [201811][dhcp_relay] Add support for DHCP client(s) on one VLAN and DHCP server(s) on another (#2919)
* Change URL for isc-dhcp source repository

* Modify supervisor conf to generate dhcrelay commands with '-id' and '-iu' options

* Comments; Also clean up jinja2 syntax

* Patch relay to open one socket per interface and send to all servers on all upstream interfaces

* Patch relay agent to properly forward BOOTREQUEST only on appropriate interface if it is a directed broadcast

* Port upstream patches to isc-dhcp-relay to support upstream/downstream interfaces

* Update patch to properly support interfaces with multiple IP addresses assigned

* Pass --enable-use-sockets to configure instead of uncommenting USE_SOCKETS directly
2019-05-18 10:33:26 -07:00
Ying Xie
116246de1b
[201811][utilities] update sub module head (#2897)
Submodule src/sonic-utilities 6130695..a1f961c:
  > update scheme variable name (#531)
  > [teamshow]: Add * to indicate if the state has been synced into database (#395)

Signed-off-by: Ying Xie <ying.xie@microsoft.com>
2019-05-14 15:39:16 -07:00
Sumukha Tumkur Vani
c1836146a3 Fix for LLDP portname issue (#2886)
* Fix for LLDP portname issue
First check for operstate; if it is not present, check for ifindex

* Addressing review comments
2019-05-14 18:03:29 +00:00
pavel-shirshov
99de97c5cc [sonic-quagga]: Fix missing fpm messages (#2884) 2019-05-14 18:03:06 +00:00
Ying Xie
21f31e97e1
[201811][swss][utilities] advance sub-module head (#2878)
Submodule src/sonic-swss e26e1d8..8246bd9:
  > [watermarkorch] only perform periodic clear if the polling is on (#781)

Submodule src/sonic-utilities e3bb8b9..6130695:
  > [reboot] log reboot progress and add a sanity check before reboot (#526)
  > Fix TODO to get/set active ports only (#494)

Signed-off-by: Ying Xie <ying.xie@microsoft.com>
2019-05-09 17:34:21 -07:00
paavaanan
b54d7874c6 [devices]: DellEMC S6100/Z9100 sensor.conf update (#2861) 2019-05-09 16:55:37 +00:00
Ying Xie
dc2fb747a5 [ebtables] install ebtables in base image and install filter rules
- Add ebtables package, and install some filter rules:
  1. ebtables -A FORWARD -d BGA -j DROP
  2. ebtables -A FORWARD -p ARP -j DROP

Basically, we let the ASIC forward the ARP packets in the VLAN. The
kernel gets a copy of these ARP packets, and forwarding from the kernel is
dropped. So there is always only one copy of each ARP request/response in the VLAN.

Signed-off-by: Ying Xie <ying.xie@microsoft.com>
2019-05-06 22:13:03 +00:00
Ying Xie
d12782cc48
[201811][swss] advance sub-module head (#2868)
Submodule src/sonic-swss 6e8f991..e26e1d8:
  > [arp] copy arp IO to cpu instead of trap and drop (#812)

Signed-off-by: Ying Xie <ying.xie@microsoft.com>
2019-05-06 15:09:31 -07:00
sridhar-ravindran
4e99b603ea Enable Debugs in BCM Kernel-bde and Knet Modules (#2786)
* Enable Debugs in BCM Kernel-bde and Knet Modules

* Added Explanation for debugs enabled
2019-05-06 17:21:04 +00:00
Andriy Moroz
cf6f22f775 [mellanox]: Update SAI (#2841)
Add support to trap copy action

Signed-off-by: Andriy Moroz <c_andriym@mellanox.com>
2019-05-01 11:57:08 -07:00
Ying Xie
65cd722223
[201811][utilities] advance sub-module head (#2849)
Submodule src/sonic-utilities 584e706..e3bb8b9:
  > [show] Call teamshow using sudo in 'show interfaces portchannel' (#524)

Signed-off-by: Ying Xie <ying.xie@microsoft.com>
2019-05-01 08:51:56 -07:00
Joe LeVeque
cc90d7f5ee [sudoers] Add /usr/bin/teamshow to READ_ONLY_CMDS (#2846) 2019-05-01 15:51:13 +00:00
Ying Xie
3b02eec933 [db migrator] migrate the DB to latest schema when needed (#2808)
Signed-off-by: Ying Xie <ying.xie@microsoft.com>
2019-04-30 23:43:52 +00:00
Ying Xie
f22666ce58
[201811][utilities] advance sub-module head (#2844)
Submodule src/sonic-utilities 9005508..584e706:
  > [db migrator] Introduce the DB migration infrastructure (#519)

Signed-off-by: Ying Xie <ying.xie@microsoft.com>
2019-04-30 16:41:38 -07:00
paavaanan
af380e9c79 [devices]: DellEMC S6000 xcvrd support (#2560)
* DellEMC S6000, xcvrd support

* sleep 1 second to avoid busy looping

* removal of dead code

* Correct typo error to 1 second

* Introduced 1 second sleep

* Revamped script with blocking call support

* get_transceiver_change_event api definition update

* adding timeout support for get_transceiver_change_event
2019-04-30 19:16:19 +00:00
Qi Luo
dd31c2d84a Remove unused packages in docker images and host (#2807)
* Remove unneeded packages in docker images and host
* Remove libpython3.6 from snmp docker image
2019-04-30 19:12:00 +00:00
Ying Xie
e4a663a606 [teamd] do not process lacpdu before the port ifinfo is set (#2815)
Port libteam patch which fixes the race condition we observed during
warm reboot.

Remove early patches: 0006, 0008, 0009.

Signed-off-by: Ying Xie <ying.xie@microsoft.com>
2019-04-30 19:07:39 +00:00
Stepan Blyshchak
675a89959a [mellanox] Update Mellanox FW version (#2827)
Fixes random failures during ISSU start (warm pre-shutdown)

Signed-off-by: Stepan Blyschak <stepanb@mellanox.com>
2019-04-28 23:13:10 -07:00
Shuotian Cheng
34734e4c55 [minigraph]: Fix bug in copying list in Python (#2831)
'=' cannot be used for copying the list

Signed-off-by: Shu0T1an ChenG <shuche@microsoft.com>
2019-04-26 22:29:03 +00:00
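
The bug class named in the entry above, using '=' where a copy of a list was intended, can be reproduced in a few lines. This is a generic illustration with hypothetical port names, not the minigraph code that the commit changed.

```python
import copy

original = ["Ethernet0", "Ethernet4"]

alias = original                 # '=' binds a second name to the same list object
shallow = list(original)         # a new list holding the same elements
deep = copy.deepcopy(original)   # a new list with recursively copied elements

alias.append("Ethernet8")        # mutates the one list shared by both names

print(original)   # the append made through `alias` is visible here
print(shallow)    # unaffected, because it is a separate list
```

A shallow copy is enough when the elements are immutable, as strings are here. Nested lists or dicts need copy.deepcopy to be fully independent.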
Ying Xie
edc8685e1e [teamd service] start teamd service after swss (#2829)
SWSS clears DB tables. If teamd is not started after swss, there is a
race condition in which swss might clear vital teamd information.

Signed-off-by: Ying Xie <ying.xie@microsoft.com>
2019-04-26 22:14:14 +00:00
Ying Xie
042d6145a5 [201811][sairedis][utilities] advance sub module heads (#2830)
Submodule src/sonic-sairedis 74f0f44..d027eae:
  > [SAI header] upgrade SAI header to version v1.3.7 (#445)

Submodule src/sonic-utilities 0f7e75c..9005508:
  > Bring queue storm status to 'pfcwd show stats' (#500)

Signed-off-by: Ying Xie <ying.xie@microsoft.com>
2019-04-26 12:00:14 -07:00
pavel-shirshov
525ee59165 Downport the netlink patch to libteam1.26. Increase netlink buffers (#2822) 2019-04-26 15:27:22 +00:00
Andriy Moroz
5004d2b4fe Increase syncd start timeout (#2776)
* Increase syncd start timeout

Signed-off-by: Andriy Moroz <c_andriym@mellanox.com>

* Replace TimeoutSec with TimeoutStartSec

Signed-off-by: Andriy Moroz <c_andriym@mellanox.com>
2019-04-26 15:27:11 +00:00
Ying Xie
5663c812e1 Revert "[devices]: Watchdog enable/disable in DellEMC S6100 (#2730)" (#2817)
This reverts commit 22d17da09c.
2019-04-26 15:25:50 +00:00