Commit Graph

2412 Commits

Author SHA1 Message Date
Aravind Mani
bee516e370
Dell S6100- Fix PCA MUX attachment issue (#5401)
* Dell S6100- Fix PCA MUX attachment issue

* Update s6100_i2c_enumeration.sh

* Update s6100_i2c_enumeration.sh
2020-09-20 20:05:53 -07:00
Ying Xie
f041345e4e
[201811][bcm SAI] upgrade Broadcom SAI to 3.5.3.5-2 (#5405)
Including following Broadcom patches:
- CS00010869953, CS00010914668(KB29456), CS00010503275(KB0029315), CS00010914673(KB0029442)

Signed-off-by: Ying Xie <ying.xie@microsoft.com>
2020-09-18 14:54:54 -07:00
Wirut Getbamrung
ec07d10748
[201811][platform-cel]: Fix dx010 FSC error (#4949)
* [platform/cel-dx010]: add gpio init for fan direction

* [platform/cel-dx010]: remove invalid code on fancontrol service

* [platform/cel-dx010]: modify fancontrol service permission

* [platform/cel-dx010]: install fancontrol in pmon
2020-09-17 15:30:15 -07:00
Tamer Ahmed
b903c8e198 [dhcpmon] Print Both Snapshot And Current Counters (#5374)
Printing both snapshot and current counter sets will make it easier to pinpoint
which message type(s) is/are not being relayed. This PR prints both counter sets.
Also, this PR defines gnu11 as a C standard to compile with in order to avoid
making changes when porting to 201811 branch.

signed-off-by: Tamer Ahmed <tamer.ahmed@microsoft.com>
2020-09-16 09:57:36 -07:00
Tamer Ahmed
949bdee24e [dhcpmon] Monitor Mgmt Interface For DHCP Packets (#5317)
When BGP routes are missing, DHCP packets get relayed over mgmt
interface. This results in dhcpmon alerting that DHCP packets are
not being relayed. This PR includes the mgmt interface as an uplink
device, and so, if a DHCP packet gets relayed over the mgmt interface,
the regular dhcpmon alert will not be issued. Instead, dhcpmon will
check the mgmt interface counts and issue a separate alert regarding
packets travelling through mgmt network.

In addition, this PR includes the following enhancements:
1. Add SIGUSR1 handler that prints out current packet counts
2. Increase the alert grace window from the current 2 minutes to 3 minutes
3. Time is now computed more accurately
4. Print vlan name before counters

signed-off-by: Tamer Ahmed <tamer.ahmed@microsoft.com>
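
The SIGUSR1 enhancement listed above can be illustrated with a minimal sketch. dhcpmon itself is written in C, so this Python version only shows the general pattern; the counter names and the `format_counters` helper are hypothetical and do not come from the dhcpmon source.

```python
import signal

# Hypothetical packet counters; dhcpmon keeps its own set in C.
counters = {"Discover": 0, "Offer": 0, "Request": 0, "Ack": 0}


def format_counters(name, values):
    """Render one counter set as a single log-style line."""
    body = ", ".join(f"{key}: {value}" for key, value in values.items())
    return f"[{name}] {body}"


def on_sigusr1(signum, frame):
    """Print the current packet counts when SIGUSR1 is received."""
    print(format_counters("current", counters))


# Register the handler; SIGUSR1 is available on POSIX systems only.
signal.signal(signal.SIGUSR1, on_sigusr1)
```

With a handler like this installed, sending `kill -USR1 <pid>` to the process prints the counts without stopping it.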
2020-09-16 09:57:36 -07:00
Ying Xie
db1ef65102
[201811][swss-common] advance swss-common sub module head (#5369)
* [201811][swss-common] advance swss-common sub module head

- Fix SubscriberStateTable::hasCachedData formula for a timing risk (#379)

Signed-off-by: Ying Xie <ying.xie@microsoft.com>

* Fix build of the unit test of SubscriberStateTable (#383)
2020-09-15 09:10:19 -07:00
Ying Xie
6c4914b62c Revert "[dhcpmon] Monitor Mgmt Interface For DHCP Packets (#5317)"
This reverts commit 44d6e03df3.
2020-09-14 22:03:55 +00:00
Tamer Ahmed
44d6e03df3 [dhcpmon] Monitor Mgmt Interface For DHCP Packets (#5317)
When BGP routes are missing, DHCP packets get relayed over mgmt
interface. This results in dhcpmon alerting that DHCP packets are
not being relayed. This PR includes the mgmt interface as an uplink
device, and so, if a DHCP packet gets relayed over the mgmt interface,
the regular dhcpmon alert will not be issued. Instead, dhcpmon will
check the mgmt interface counts and issue a separate alert regarding
packets travelling through mgmt network.

In addition, this PR includes the following enhancements:
1. Add SIGUSR1 handler that prints out current packet counts
2. Increase the alert grace window from the current 2 minutes to 3 minutes
3. Time is now computed more accurately
4. Print vlan name before counters

signed-off-by: Tamer Ahmed <tamer.ahmed@microsoft.com>
2020-09-14 16:56:01 +00:00
Guohan Lu
083607f4d1 [submodule]: update sonic-utilities
* 4d69425 2020-09-12 | [utilities] Define Explicit Dependency On Ipaddress Package (#1113) (HEAD, origin/201811) [Guohan Lu]

Signed-off-by: Guohan Lu <lguohan@gmail.com>
2020-09-12 23:08:39 -07:00
Ying Xie
6597bd8dea
[201811][utilities] advance utilities sub module head (#5339)
- [filter-fdb] Call Filter FDB Main From Within Test Code #1051 and #1059 (#1086)

Signed-off-by: Ying Xie <ying.xie@microsoft.com>
2020-09-08 14:35:22 -07:00
Blueve
55d2d15e4e [conf] append nos-config-part for s6100 (#5234)
* [conf] append nos-config-part for s6100

* modify rc.local

Signed-off-by: Guohan Lu <lguohan@gmail.com>

* Update rc.local

Co-authored-by: Blueve <jika@microsoft.com>
Co-authored-by: Guohan Lu <lguohan@gmail.com>
Co-authored-by: Ying Xie <yxieca@users.noreply.github.com>
2020-09-08 19:30:33 +00:00
Ying Xie
6b75059b1d
[201811][kernel][utilities][sairedis] advance submodule heads (#5288)
- Kernel: [201811] Fix I2C ISMT DMA buffer alignment issue (#158)
- utilities: Fix pfcwd stats crash with invalid queue name (#1077)
- sairedis: [syncd] Fix notification on switch shutdown request (#638)

Signed-off-by: Ying Xie <ying.xie@microsoft.com>
2020-09-02 12:15:47 -07:00
zhenggen-xu
a99026acdc
[Build] pin down setuptools for build issues (#5280)
See: https://github.com/Azure/sonic-buildimage/issues/5279

Signed-off-by: Zhenggen Xu <zxu@linkedin.com>
2020-09-01 16:29:31 -07:00
Joe LeVeque
c909422abc [caclmgrd] Always restart service upon process termination (#5065) 2020-08-31 20:31:13 +00:00
Joe LeVeque
4547ea022d [caclmgrd] Improve code reuse (#4931)
Improve code reuse in `generate_block_ip2me_traffic_iptables_commands()` function.
2020-08-31 20:30:54 +00:00
Baptiste Covolato
c706a1079f [arista/aboot]: Zero out 1st MB before repartitioning (#5220)
The first partition starting point was changed to be 1M as part of this
commit: 6ba2f97f1e. On systems that are misaligned before conversion
(partition start is the first sector), the relic partition that is
left in the first MB can cause problems in Aboot and result in corruption
of the filesystem on the new aligned partition.

Zeroing this old relic makes sure that there is nothing left of the old
partition lying around. There won't be any risk of having Aboot corrupt
the new filesystem because of the old relic.

Signed-off-by: Baptiste Covolato <baptiste@arista.com>
2020-08-22 18:48:10 -07:00
Santhosh Kumar T
a2cb92056a
Dell S6100 Port I2C changes to 201811 branch (#5150)
* Dell S6100 Port I2C changes to 201811 branch

* Update s6100_i2c_enumeration.sh
2020-08-18 14:38:28 -07:00
zhenggen-xu
e1e97199e3
[201811 Monit] Enable monitoring of SWSS daemons (#5144)
Signed-off-by: Zhenggen Xu <zxu@linkedin.com>
2020-08-13 20:42:06 -07:00
Samuel Angebault
5d891d8832
[201811][Arista] Update arista driver submodules (#5149) 2020-08-12 09:31:58 -07:00
pavel-shirshov
e03ce8ba14
Clarify error message for bgpcfgd update loopback address (#5076) 2020-07-31 07:46:47 -07:00
pavel-shirshov
459c29cfaa
[bgpcfgd]: Fix bgpcfgd crash on reset Loopback0 ip addresses (#5050)
Fix an error that causes bgpcfgd to crash on an invalid IP address. Before the fix, when either the loopback IPv4 or IPv6 address was already set and bgpcfgd received another "SET" message for that already-set loopback address, bgpcfgd sent a syslog message about an ambiguous IP address (even though the address was valid) and then crashed. With this change, if bgpcfgd receives an IP address that is already set, it sends this message to syslog and returns from the handler.
2020-07-28 12:18:07 -07:00
Joe LeVeque
6120145bf1 [caclmgrd] remove default DROP rule on FORWARD chain (#5034) 2020-07-24 19:09:32 +00:00
zzhiyuan
59072a627b
[201811][Arista] Update 201811 branch with Arista syseeprom fix (#5016)
If a device had a master or 201911 image then installed a 201811 image, it could result in a prefdl that was not properly processed by 201811 Arista code.

This is a commit that was on 201911 and master branch.

Co-authored-by: Zhi Yuan Carl Zhao <zyzhao@arista.com>
2020-07-22 10:57:18 -07:00
Joe LeVeque
cf142e7e6c [caclmgrd] Filter DHCP packets based on dest port only (#4995) 2020-07-17 18:17:27 +00:00
Ying Xie
a37a7d3dcf
[201811][snmpagent] advance snmpagent submodule head (#4988)
- [psutil] pin psutil version to 5.7.0.

Signed-off-by: Ying Xie <ying.xie@microsoft.com>
2020-07-17 06:53:35 -07:00
Joe LeVeque
3d31ef3a0c
[201811][sonic-platform-daemons] Update submodule (#4974) 2020-07-14 19:12:41 -07:00
pavel-shirshov
b7a0669f36
[201811][quagga]: Use 201811 branch of sonic-quagga (#4966)
sonic-quagga was using a utility from the master branch of sonic-buildimage. I had to create a 201811 branch in sonic-quagga that works with the 201811 branch of sonic-buildimage.
2020-07-14 10:09:11 -07:00
pavel-shirshov
8a78ff6944
[quagga]: Update sonic-quagga (#4962)
The sonic-quagga repository has a new fix. Update the submodule to bring the fix into the image.
2020-07-13 23:14:02 -07:00
Ying Xie
0a1f043b02
[201811][utilities] advance utilities submodule head (#4947)
- [filter-fdb] Fix For Vlan Defined With No CIDR (#976)

Signed-off-by: Ying Xie <ying.xie@microsoft.com>
2020-07-11 21:11:36 -07:00
zzhiyuan
a43eec53b7
[201811][Arista] Update Arista submodules (#4939)
Fix the method get_transceiver_change_event to abide by the function description, return True status and use timeout in milliseconds.

Co-authored-by: Zhi Yuan Carl Zhao <zyzhao@arista.com>
2020-07-09 22:55:06 -07:00
Ying Xie
ecd93eb8ab
[201811][swss] advance swss submodule head (#4935)
[aclorch] Use IPv6 Next Header internally for protocol number on MLNX platform (#1343)
Add/Del lag_name_map item according to lag adding and removing (#1124)

Signed-off-by: Ying Xie <ying.xie@microsoft.com>
2020-07-09 15:42:58 -07:00
Wirut Getbamrung
c9d86f0587
[201811][platform/cel]: Port fancontrol changes for dx010/e1031 to 201811 branch (#4867)
Update fancontrol service for Seastone-DX010/E1031 device to support hysteresis temperature threshold and difference config for each unit fan direction type (B2F/F2B); follow master branch
2020-07-03 19:59:55 -07:00
Guohan Lu
d04ad415b4 [docker-config-engine]: lockdown netaddr,ipaddr,jinja pip version
Signed-off-by: Guohan Lu <lguohan@gmail.com>
2020-06-25 06:58:02 +00:00
Ying Xie
6fc62208d5
[201811][utilities] advance utilities sub module head (#4844)
[filter-fdb] Check VLAN Presence When Filter FDB (#957)
[mellanox] enable watchdog before fast-reboot (#844)

Signed-off-by: Ying Xie <ying.xie@microsoft.com>
2020-06-24 10:15:23 -07:00
Tamer Ahmed
ab3400f217 [fast-reboot] Back up FDB/ARP/Default routes (#4795)
FDB/ARP/Default routes files are deleted after swssconfig. This
makes debugging/validation of device conversion hard. This PR
saves those files in order to facilitate debugging of device conversion.

signed-off-by: Tamer Ahmed <tamer.ahmed@microsoft.com>
2020-06-21 02:41:39 +00:00
padmanarayana
062fd849b3 [DELL]: FTOS to SONiC fast conversion fixes (#4807)
While migrating to SONiC 20181130, identified a couple of issues:
1. union-mount needs /host/machine.conf parameters for vendor specific checks : however, in case of migration, the /host/machine.conf is extracted from ONIE only in https://github.com/Azure/sonic-buildimage/blob/master/files/image_config/platform/rc.local#L127.
2. Since grub.cfg is updated to have net.ifnames=0 biosdevname=0, 70-persistent-net.rules changes are no longer required.
2020-06-19 22:35:29 +00:00
Joe LeVeque
d9b8bed916 [caclmgrd] Don't limit connection tracking to TCP (#4796)
Don't limit iptables connection tracking to TCP protocol; allow connection tracking for all protocols. This allows services like NTP, which is UDP-based, to receive replies from an NTP server even if the port is blocked, as long as it is in reply to a request sent from the device itself.
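
A minimal sketch of the difference, assuming the rule is built as a command string: omitting the `-p` option makes the connection-tracking match apply to every protocol. The function name and the use of `-m conntrack --ctstate` are assumptions for illustration; the exact rule text caclmgrd generates is not shown in this log.

```python
def conntrack_accept_rule(binary="iptables", protocol=None):
    """Build an INPUT rule accepting packets that belong to tracked connections.

    Passing protocol=None omits "-p", so the rule matches all protocols
    (for example UDP replies from an NTP server), not only TCP.
    """
    cmd = [binary, "-A", "INPUT"]
    if protocol is not None:
        cmd += ["-p", protocol]
    cmd += ["-m", "conntrack", "--ctstate", "ESTABLISHED,RELATED", "-j", "ACCEPT"]
    return " ".join(cmd)


# TCP-only form versus the protocol-agnostic form.
print(conntrack_accept_rule(protocol="tcp"))
print(conntrack_accept_rule())
```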
2020-06-19 04:33:50 +00:00
Qi Luo
e02de1dc89 Fix bug: check port alias even when port_config_file parameter is not provided (#4787) 2020-06-18 01:21:53 +00:00
Ying Xie
d07649e6b6
[201811][swss-common] advance swss-common submodule head (#4761)
- Add missed BGP tables into the schema (#351)

Signed-off-by: Ying Xie <ying.xie@microsoft.com>
2020-06-11 15:28:02 -07:00
Ying Xie
4cd54ed58c [ntp] disable ntp long jump (#4748)
Found another syncd timing issue related to clock going backwards.
To be safe disable the ntp long jump.

Signed-off-by: Ying Xie <ying.xie@microsoft.com>
2020-06-11 22:03:22 +00:00
Ying Xie
d433e529fd
[bcm SAI] upgrade Broadcom SAI to version 3.5.3.5-1 (#4739)
- Broadcom SAI 3.5 GA code drop on 20200608.

Changes:
- CS9533198
- CS10283709
- CS00009716645
- CS00010389861
- CS00010406122
- CS00010503275
- Addressed a few memory leak issues.
- Addressed an array memory allocation issue.
- Addressed assert during SER handling.

Signed-off-by: Ying Xie <ying.xie@microsoft.com>
2020-06-10 01:29:39 -07:00
Joe LeVeque
7ae30d7898 [caclmgrd] Get first VLAN host IP address via next() (#4685)
I found that with IPv4Network types, calling list(ip_ntwrk.hosts()) is reliable. However, when doing the same with an IPv6Network, I found that the conversion to a list can hang indefinitely. This appears to me to be a bug in the ipaddress.IPv6Network implementation. However, I could not find any other reports on the web.

This patch changes the behavior to call next() on the ip_ntwrk.hosts() generator instead, which returns the IP address of the first host.
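
The approach can be sketched as follows. The helper name `first_host` is hypothetical and not taken from caclmgrd. One plausible explanation for the apparent hang, not stated in the commit, is that a /64 IPv6 network contains 2^64 addresses, so converting its hosts to a list tries to materialize all of them.

```python
import ipaddress


def first_host(prefix):
    """Return the first host address of the network that contains `prefix`."""
    network = ipaddress.ip_network(prefix, strict=False)
    # hosts() is normally a lazy generator; iter() also covers the cases where
    # some Python versions return a plain list (for example a /32 network).
    return next(iter(network.hosts()))


print(first_host("10.0.0.57/24"))
print(first_host("fc02:1000::1/64"))
```

Only one address is ever produced, so the cost does not depend on the size of the network.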
2020-06-09 16:30:45 +00:00
pavel-shirshov
c587f3c4d5 [sonic-slave]: Install pympler to find the memory leaks in python (#4652) 2020-06-09 16:27:53 +00:00
Joe LeVeque
494701a0ee [caclmgrd] Allow more ICMP types (#4625) 2020-06-09 16:07:51 +00:00
yozhao101
aa949cdc74 [docker-syncd] Add timeout to force stop syncd container (#4617)
**- Why I did it**
While testing the auto-restart feature of the swss container, I manually killed one of its critical processes, and swss stopped as expected. The syncd container, as its peer container, should then stop as well. However, syncd stopped only some of the time. When it did not stop, the process executing the stop() function (/usr/local/bin/syncd.sh stop) was stuck in lines 164–167 shown below. Systemd waits 90 seconds and then kills this process.

164 # wait until syncd quit gracefully
165 while docker top syncd$DEV | grep -q /usr/bin/syncd; do
166     sleep 0.1
167 done

First, I profiled how long this while loop spins when the syncd container stops normally after the swss container is stopped. The result was 5 to 6 seconds. When the syncd container stops normally, two messages are written to syslog:

str-a7050-acs-3 NOTICE syncd#dsserve: child /usr/bin/syncd exited status: 134
str-a7050-acs-3 INFO syncd#supervisord: syncd [5] child /usr/bin/syncd exited status: 134

Second, I added a timer to the condition of the while loop so that the loop is forced to exit after 20 seconds.

With this change, the syncd container stops normally when swss is stopped first. Note that if the syncd container stops within 5 to 6 seconds, the two log messages still appear in syslog. However, if the while loop runs longer than 20 seconds and is forced to exit, the syncd container is still stopped, but I did not see these two messages in syslog. Further, although the auto-restart feature of the swss container now works correctly in my testing, I cannot be sure that the issue of the syncd container failing to stop will not occur again in the future.

**- How I did it**
I added a timer around the while loop in the stop() function. The while loop exits after spinning for 20 seconds.

Signed-off-by: Yong Zhao <yozhao@microsoft.com>
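
The actual change lives in the shell script syncd.sh and is not reproduced in this log. The bounded-wait pattern it describes can be sketched in Python as follows; the function name `wait_until` and its parameters are hypothetical, with only the 20-second limit and the 0.1-second poll interval taken from the description above.

```python
import time


def wait_until(predicate, timeout_s=20.0, interval_s=0.1):
    """Poll `predicate` until it returns True or `timeout_s` seconds elapse.

    Returns True if the condition was met, False if the wait timed out.
    """
    deadline = time.monotonic() + timeout_s
    while not predicate():
        if time.monotonic() >= deadline:
            return False  # give up instead of spinning forever
        time.sleep(interval_s)
    return True
```

A caller would pass a predicate that checks whether the syncd process has exited, and treat a False result as the forced-exit case.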
2020-06-09 16:07:24 +00:00
Santhosh Kumar T
e6312e72f2 [DellEMC] S6000 Disable Low power mode by default (#4592) 2020-06-09 16:06:00 +00:00
Joe LeVeque
7da0c15af5 [caclmgrd] Ignore keys in interface-related tables if no IP prefix is present (#4581)
Since the introduction of VRF, interface-related tables in ConfigDB will have multiple entries, one of which only contains the interface name and no IP prefix. Thus, when iterating over the keys in the tables, we need to ignore the entries which do not contain IP prefixes.
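
A minimal sketch of that filtering, under the assumption that table keys arrive either as a bare interface name or as an (interface, prefix) tuple. The key shapes and the function name `ip_prefixes` are illustrative assumptions, not the caclmgrd code.

```python
import ipaddress


def ip_prefixes(table_keys):
    """Yield (interface, network) pairs, skipping keys with no IP prefix."""
    for key in table_keys:
        # Assumed shapes: "Ethernet0" (name only) or ("Ethernet0", "10.0.0.0/31").
        if not isinstance(key, tuple) or len(key) != 2:
            continue
        iface, prefix = key
        yield iface, ipaddress.ip_network(prefix, strict=False)


keys = ["Ethernet0", ("Ethernet0", "10.0.0.0/31"), ("Loopback0", "fc00::1/128")]
for iface, network in ip_prefixes(keys):
    print(iface, network)
```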
2020-06-09 16:05:40 +00:00
Qi Luo
f71389bc34
[submodule] Update submodule: swss-common (#4729)
7c1cce5 2020-05-27 | Fix memory leak in pyext when Selectable is returned to Python (#343)  [pavel-shirshov]
1e8b5ca 2020-04-04 | [table] add hdel operation [Guohan Lu]
50bf741 2020-03-23 | [201811][schema] Add COUNTERS_LAG_NAME_MAP table in COUNTERS_DB (#334) [Joe LeVeque]
2020-06-09 09:02:37 -07:00
Joe LeVeque
3ee9c5d1e3 [caclmgrd] Add some default ACCEPT rules and lastly drop all incoming packets (#4412)
Modified caclmgrd behavior to enhance control plane security as follows:

Upon starting or receiving notification of ACL table/rule changes in Config DB:
1. Add iptables/ip6tables commands to allow all incoming packets from established TCP sessions or new TCP sessions which are related to established TCP sessions
2. Add iptables/ip6tables commands to allow bidirectional ICMPv4 ping and traceroute
3. Add iptables/ip6tables commands to allow bidirectional ICMPv6 ping and traceroute
4. Add iptables/ip6tables commands to allow all incoming Neighbor Discovery Protocol (NDP) NS/NA/RS/RA messages
5. Add iptables/ip6tables commands to allow all incoming IPv4 DHCP packets
6. Add iptables/ip6tables commands to allow all incoming IPv6 DHCP packets
7. Add iptables/ip6tables commands to allow all incoming BGP traffic
8. Add iptables/ip6tables commands for all ACL rules for recognized services (currently SSH, SNMP, NTP)
9. For all services which we did not find configured ACL rules, add iptables/ip6tables commands to allow all incoming packets for those services (allows the device to accept SSH connections before the device is configured)
10. Add iptables rules to drop all packets destined for loopback interface IP addresses
11. Add iptables rules to drop all packets destined for management interface IP addresses
12. Add iptables rules to drop all packets destined for point-to-point interface IP addresses
13. Add iptables rules to drop all packets destined for our VLAN interface gateway IP addresses
14. Add iptables/ip6tables commands to allow all incoming packets with TTL of 0 or 1 (This allows the device to respond to tools like tcptraceroute)
15. If we found control plane ACLs in the configuration and applied them, we lastly add iptables/ip6tables commands to drop all other incoming packets
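
The ordering in the list above matters: accept rules come first, and the final catch-all drop is added only when control plane ACLs were found and applied. A reduced sketch of that ordering, covering only a few of the fifteen steps, might look like the following. The function name `build_rules` and the specific rule strings are assumptions for illustration and are not copied from caclmgrd.

```python
def build_rules(acl_rules, binary="iptables"):
    """Return rule commands in order: default accepts, ACL rules, optional drop."""
    cmds = [
        # Step 1: established/related TCP sessions.
        f"{binary} -A INPUT -p tcp -m conntrack --ctstate ESTABLISHED,RELATED -j ACCEPT",
        # Step 7: BGP (TCP port 179).
        f"{binary} -A INPUT -p tcp --dport 179 -j ACCEPT",
    ]
    # Step 8: rules derived from the configured control plane ACLs.
    cmds.extend(acl_rules)
    # Step 15: drop everything else, but only if ACLs were actually applied.
    if acl_rules:
        cmds.append(f"{binary} -A INPUT -j DROP")
    return cmds


for line in build_rules(["iptables -A INPUT -p tcp -s 10.0.0.0/24 --dport 22 -j ACCEPT"]):
    print(line)
```

Skipping the final drop when no ACLs exist keeps an unconfigured device reachable, which matches the intent described in step 9.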
2020-06-09 04:21:27 +00:00
Wirut Getbamrung
9f8d691d4e
[platform/cel]: Backport reboot cause API to 201811 branch (#4619)
Add reboot cause API to support process-reboot-cause.service
Implement chassis.get_reboot_cause platform API
2020-05-26 02:27:03 -07:00