2501 Commits

Author SHA1 Message Date
Ying Xie
6b75059b1d
[201811][kernel][utilities][sairedis] advance submodule heads (#5288)
- Kernel: [201811] Fix I2C ISMT DMA buffer alignment issue (#158)
- utilities: Fix pfcwd stats crash with invalid queue name (#1077)
- sairedis: [syncd] Fix notification on switch shutdown request (#638)

Signed-off-by: Ying Xie <ying.xie@microsoft.com>
2020-09-02 12:15:47 -07:00
zhenggen-xu
a99026acdc
[Build] pin down setuptools for build issues (#5280)
See: https://github.com/Azure/sonic-buildimage/issues/5279

Signed-off-by: Zhenggen Xu <zxu@linkedin.com>
2020-09-01 16:29:31 -07:00
Joe LeVeque
c909422abc [caclmgrd] Always restart service upon process termination (#5065) 2020-08-31 20:31:13 +00:00
Joe LeVeque
4547ea022d [caclmgrd] Improve code reuse (#4931)
Improve code reuse in `generate_block_ip2me_traffic_iptables_commands()` function.
2020-08-31 20:30:54 +00:00
Baptiste Covolato
c706a1079f [arista/aboot]: Zero out 1st MB before repartitioning (#5220)
The first partition starting point was changed to be 1M as part of this
commit: 6ba2f97f1e. On systems that are misaligned before conversion
(partition start is the first sector), the relic partition that is
left in the first MB can cause problems in Aboot and result in corruption
of the filesystem on the new aligned partition.

Zeroing this old relic makes sure that there is nothing left of the old
partition lying around. There won't be any risk of having Aboot corrupt
the new filesystem because of the old relic.

Signed-off-by: Baptiste Covolato <baptiste@arista.com>
2020-08-22 18:48:10 -07:00
Santhosh Kumar T
a2cb92056a
Dell S6100 Port I2C changes to 201811 branch (#5150)
* Dell S6100 Port I2C changes to 201811 branch

* Update s6100_i2c_enumeration.sh
2020-08-18 14:38:28 -07:00
zhenggen-xu
e1e97199e3
[201811 Monit] Enable monitoring of SWSS daemons (#5144)
Signed-off-by: Zhenggen Xu <zxu@linkedin.com>
2020-08-13 20:42:06 -07:00
Samuel Angebault
5d891d8832
[201811][Arista] Update arista driver submodules (#5149) 2020-08-12 09:31:58 -07:00
pavel-shirshov
e03ce8ba14
Clarify error message for bgpcfgd update loopback address (#5076) 2020-07-31 07:46:47 -07:00
pavel-shirshov
459c29cfaa
[bgpcfgd]: Fix bgpcfgd crash on reset Loopback0 ip addresses (#5050)
Fix an error that causes bgpcfgd to crash on an invalid IP address. Before the fix, when either the loopback IPv4 or IPv6 address was already set and bgpcfgd received another "SET" message for the already-set loopback address, bgpcfgd sent a syslog message about an ambiguous IP address (even though the address was good) and then crashed. With this change, if bgpcfgd receives an IP address that is already set, it sends this message to syslog and returns from the handler.
2020-07-28 12:18:07 -07:00
Joe LeVeque
6120145bf1 [caclmgrd] remove default DROP rule on FORWARD chain (#5034) 2020-07-24 19:09:32 +00:00
zzhiyuan
59072a627b
[201811][Arista] Update 201811 branch with Arista syseeprom fix (#5016)
If a device had a master or 201911 image then installed a 201811 image, it could result in a prefdl that was not properly processed by 201811 Arista code.

This is a commit that was on 201911 and master branch.

Co-authored-by: Zhi Yuan Carl Zhao <zyzhao@arista.com>
2020-07-22 10:57:18 -07:00
Joe LeVeque
cf142e7e6c [caclmgrd] Filter DHCP packets based on dest port only (#4995) 2020-07-17 18:17:27 +00:00
Ying Xie
a37a7d3dcf
[201811][snmpagent] advance snmpagent submodule head (#4988)
- [psutil] pin psutil version to 5.7.0.

Signed-off-by: Ying Xie <ying.xie@microsoft.com>
2020-07-17 06:53:35 -07:00
Joe LeVeque
3d31ef3a0c
[201811][sonic-platform-daemons] Update submodule (#4974) 2020-07-14 19:12:41 -07:00
pavel-shirshov
b7a0669f36
[201811][quagga]: Use 201811 branch of sonic-quagga (#4966)
sonic-quagga was using a utility from the master branch of sonic-buildimage. I had to create a 201811 branch in sonic-quagga that works with the 201811 branch of sonic-buildimage.
2020-07-14 10:09:11 -07:00
pavel-shirshov
8a78ff6944
[quagga]: Update sonic-quagga (#4962)
The sonic-quagga repository has a new fix. Update the submodule to bring the fix into the image.
2020-07-13 23:14:02 -07:00
Ying Xie
0a1f043b02
[201811][utilities] advance utilities submodule head (#4947)
- [filter-fdb] Fix For Vlan Defined With No CIDR (#976)

Signed-off-by: Ying Xie <ying.xie@microsoft.com>
2020-07-11 21:11:36 -07:00
zzhiyuan
a43eec53b7
[201811][Arista] Update Arista submodules (#4939)
Fix the method get_transceiver_change_event to abide by the function description, return True status and use timeout in milliseconds.

Co-authored-by: Zhi Yuan Carl Zhao <zyzhao@arista.com>
2020-07-09 22:55:06 -07:00
Ying Xie
ecd93eb8ab
[201811][swss] advance swss submodule head (#4935)
[aclorch] Use IPv6 Next Header internally for protocol number on MLNX platform (#1343)
Add/Del lag_name_map item according to lag adding and removing (#1124)

Signed-off-by: Ying Xie <ying.xie@microsoft.com>
2020-07-09 15:42:58 -07:00
Wirut Getbamrung
c9d86f0587
[201811][platform/cel]: Port fancontrol changes for dx010/e1031 to 201811 branch (#4867)
Update fancontrol service for Seastone-DX010/E1031 device to support hysteresis temperature threshold and difference config for each unit fan direction type (B2F/F2B); follow master branch
2020-07-03 19:59:55 -07:00
Guohan Lu
d04ad415b4 [docker-config-engine]: lockdown netaddr,ipaddr,jinja pip version
Signed-off-by: Guohan Lu <lguohan@gmail.com>
2020-06-25 06:58:02 +00:00
Ying Xie
6fc62208d5
[201811][utilities] advance utilities submodule head (#4844)
[filter-fdb] Check VLAN Presence When Filter FDB (#957)
[mellanox] enable watchdog before fast-reboot (#844)

Signed-off-by: Ying Xie <ying.xie@microsoft.com>
2020-06-24 10:15:23 -07:00
Tamer Ahmed
ab3400f217 [fast-reboot] Back up FDB/ARP/Default routes (#4795)
FDB/ARP/Default routes files are deleted after swssconfig. This
makes debugging/validation of device conversion hard. This PR
saves those files in order to facilitate debugging of device conversion.

signed-off-by: Tamer Ahmed <tamer.ahmed@microsoft.com>
2020-06-21 02:41:39 +00:00
padmanarayana
062fd849b3 [DELL]: FTOS to SONiC fast conversion fixes (#4807)
While migrating to SONiC 20181130, identified a couple of issues:
1. union-mount needs /host/machine.conf parameters for vendor-specific checks; however, in case of migration, /host/machine.conf is extracted from ONIE only in https://github.com/Azure/sonic-buildimage/blob/master/files/image_config/platform/rc.local#L127.
2. Since grub.cfg is updated to have net.ifnames=0 biosdevname=0, 70-persistent-net.rules changes are no longer required.
2020-06-19 22:35:29 +00:00
Joe LeVeque
d9b8bed916 [caclmgrd] Don't limit connection tracking to TCP (#4796)
Don't limit iptables connection tracking to TCP protocol; allow connection tracking for all protocols. This allows services like NTP, which is UDP-based, to receive replies from an NTP server even if the port is blocked, as long as it is in reply to a request sent from the device itself.
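The effect of dropping the protocol match can be illustrated with a minimal Python sketch that builds the command strings. This is not the actual caclmgrd code: the function name `conntrack_accept_commands` is hypothetical, and the INPUT chain and `-A` append position are assumptions. Only the standard `conntrack` match with `--ctstate` is used.

```python
def conntrack_accept_commands():
    """Build accept rules for replies to traffic the device itself started.

    Hypothetical helper for illustration; chain and rule position are assumed.
    """
    commands = []
    for tool in ("iptables", "ip6tables"):
        # No "-p tcp" match: the rule applies to every protocol, so UDP
        # replies (for example from an NTP server) are accepted as well.
        commands.append(
            f"{tool} -A INPUT -m conntrack --ctstate ESTABLISHED,RELATED -j ACCEPT"
        )
    return commands


for command in conntrack_accept_commands():
    print(command)
```

Because the rule matches on connection state rather than on a port, a reply is accepted even when the service port itself is blocked for unsolicited traffic.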
2020-06-19 04:33:50 +00:00
Qi Luo
e02de1dc89 Fix bug: check port alias even when port_config_file parameter is not provided (#4787) 2020-06-18 01:21:53 +00:00
Ying Xie
d07649e6b6
[201811][swss-common] advance swss-common submodule head (#4761)
- Add missed BGP tables into the schema (#351)

Signed-off-by: Ying Xie <ying.xie@microsoft.com>
2020-06-11 15:28:02 -07:00
Ying Xie
4cd54ed58c [ntp] disable ntp long jump (#4748)
Found another syncd timing issue related to clock going backwards.
To be safe disable the ntp long jump.

Signed-off-by: Ying Xie <ying.xie@microsoft.com>
2020-06-11 22:03:22 +00:00
Ying Xie
d433e529fd
[bcm SAI] upgrade Broadcom SAI to version 3.5.3.5-1 (#4739)
- Broadcom SAI 3.5 GA code drop on 20200608.

Changes:
- CS9533198
- CS10283709
- CS00009716645
- CS00010389861
- CS00010406122
- CS00010503275
- Addressed a few memory leak issues.
- Addressed an array memory allocation issue.
- Addressed assert during SER handling.

Signed-off-by: Ying Xie <ying.xie@microsoft.com>
2020-06-10 01:29:39 -07:00
Joe LeVeque
7ae30d7898 [caclmgrd] Get first VLAN host IP address via next() (#4685)
I found that with IPv4Network types, calling list(ip_ntwrk.hosts()) is reliable. However, when doing the same with an IPv6Network, I found that the conversion to a list can hang indefinitely. This appears to me to be a bug in the ipaddress.IPv6Network implementation. However, I could not find any other reports on the web.

This patch changes the behavior to call next() on the ip_ntwrk.hosts() generator instead, which returns the IP address of the first host.
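A minimal sketch of the approach described above, using the Python 3 standard-library `ipaddress` module; the helper name `first_host` and the example prefixes are assumptions made for illustration, not taken from caclmgrd. One plausible reason the list conversion appears to hang is that a /64 network contains 2**64 addresses, all of which `list()` would try to materialise.

```python
import ipaddress


def first_host(prefix):
    """Return the first usable host address in `prefix` without listing them all."""
    # strict=False accepts an interface-style prefix such as "192.168.0.1/24".
    network = ipaddress.ip_network(prefix, strict=False)
    # hosts() is consumed lazily here, so only the first address is produced.
    # iter() also covers Python versions where hosts() returns a list.
    return next(iter(network.hosts()))


print(first_host("192.168.0.1/24"))   # 192.168.0.1
print(first_host("fc02:1000::1/64"))  # fc02:1000::1
```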
2020-06-09 16:30:45 +00:00
pavel-shirshov
c587f3c4d5 [sonic-slave]: Install pympler to find the memory leaks in python (#4652) 2020-06-09 16:27:53 +00:00
Joe LeVeque
494701a0ee [caclmgrd] Allow more ICMP types (#4625) 2020-06-09 16:07:51 +00:00
yozhao101
aa949cdc74 [docker-syncd] Add timeout to force stop syncd container (#4617)
**- Why I did it**
When I tested the auto-restart feature of the swss container by manually killing one of its critical processes, swss was stopped. The syncd container, as the peer container, should then also be stopped. However, I found that the syncd container sometimes stopped and sometimes did not. When it does not stop, the process (/usr/local/bin/syncd.sh stop) executing the stop() function is stuck between lines 164-167. Systemd waits for 90 seconds and then kills this process.

164 # wait until syncd quit gracefully
165 while docker top syncd$DEV | grep -q /usr/bin/syncd; do
166 sleep 0.1
167 done

The first thing I did is to profile how long this while loop will spin if syncd container can be
normally stopped after swss container is stopped. The result is 5 seconds or 6 seconds. If syncd
container can be normally stopped, two messages will be written into syslog:

str-a7050-acs-3 NOTICE syncd#dsserve: child /usr/bin/syncd exited status: 134
str-a7050-acs-3 INFO syncd#supervisord: syncd [5] child /usr/bin/syncd exited status: 134

The second thing I did was to add a timer to the condition of the while loop so that the loop is forced to exit after 20 seconds.

After that, testing showed that the syncd container stops normally if swss is stopped first. One more thing I want to mention: if the syncd container stops within 5 or 6 seconds, the two log messages can still be seen in syslog. However, if the while loop runs longer than 20 seconds and is forced to exit, the syncd container is stopped but I did not see these two messages in syslog. Further, although I observed that the auto-restart feature of the swss container now works correctly, I cannot be sure that the issue of the syncd container failing to stop will not occur in the future.

**- How I did it**
I added a timer around the while loop in stop() function. This while loop will exit after spinning
20 seconds.
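
The actual change lives in a shell script and is not reproduced here. As a rough illustration of the same idea, a bounded polling loop can be sketched in Python. The helper `wait_until` and its `predicate` argument are hypothetical; the predicate stands in for "the syncd process is no longer listed", and the 20-second limit and 0.1-second interval mirror the values mentioned above.

```python
import time


def wait_until(predicate, timeout_s=20.0, interval_s=0.1):
    """Poll `predicate` until it returns True or `timeout_s` elapses.

    Returns True if the condition was met, False if the wait timed out.
    """
    # A monotonic clock is unaffected by wall-clock adjustments.
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        if predicate():
            return True
        time.sleep(interval_s)
    return False


# The caller can tell a graceful exit from a forced one by the return value.
stopped_gracefully = wait_until(lambda: True)
print(stopped_gracefully)
```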

Signed-off-by: Yong Zhao <yozhao@microsoft.com>
2020-06-09 16:07:24 +00:00
Santhosh Kumar T
e6312e72f2 [DellEMC] S6000 Disable Low power mode by default (#4592) 2020-06-09 16:06:00 +00:00
Joe LeVeque
7da0c15af5 [caclmgrd] Ignore keys in interface-related tables if no IP prefix is present (#4581)
Since the introduction of VRF, interface-related tables in ConfigDB will have multiple entries, one of which only contains the interface name and no IP prefix. Thus, when iterating over the keys in the tables, we need to ignore the entries which do not contain IP prefixes.
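A minimal sketch of that filtering step follows. It assumes that a key is either a bare interface name or an (interface, prefix) tuple; these key shapes, the sample values, and the helper name `keys_with_ip_prefix` are assumptions for illustration and may not match the real ConfigDB client.

```python
def keys_with_ip_prefix(table_keys):
    """Keep only (interface, prefix) keys; skip bare interface-name entries."""
    kept = []
    for key in table_keys:
        # Assumed key shapes: "Loopback0" or ("Loopback0", "10.1.0.32/32").
        if isinstance(key, tuple) and len(key) == 2:
            kept.append(key)
    return kept


sample_keys = [
    "Loopback0",
    ("Loopback0", "10.1.0.32/32"),
    ("Loopback0", "fc00:1::32/128"),
]
print(keys_with_ip_prefix(sample_keys))
```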
2020-06-09 16:05:40 +00:00
Qi Luo
f71389bc34
[submodule] Update submodule: swss-common (#4729)
7c1cce5 2020-05-27 | Fix memory leak in pyext when Selectable is returned to Python (#343)  [pavel-shirshov]
1e8b5ca 2020-04-04 | [table] add hdel operation [Guohan Lu]
50bf741 2020-03-23 | [201811][schema] Add COUNTERS_LAG_NAME_MAP table in COUNTERS_DB (#334) [Joe LeVeque]
2020-06-09 09:02:37 -07:00
Joe LeVeque
3ee9c5d1e3 [caclmgrd] Add some default ACCEPT rules and lastly drop all incoming packets (#4412)
Modified caclmgrd behavior to enhance control plane security as follows:

Upon starting or receiving notification of ACL table/rule changes in Config DB:
1. Add iptables/ip6tables commands to allow all incoming packets from established TCP sessions or new TCP sessions which are related to established TCP sessions
2. Add iptables/ip6tables commands to allow bidirectional ICMPv4 ping and traceroute
3. Add iptables/ip6tables commands to allow bidirectional ICMPv6 ping and traceroute
4. Add iptables/ip6tables commands to allow all incoming Neighbor Discovery Protocol (NDP) NS/NA/RS/RA messages
5. Add iptables/ip6tables commands to allow all incoming IPv4 DHCP packets
6. Add iptables/ip6tables commands to allow all incoming IPv6 DHCP packets
7. Add iptables/ip6tables commands to allow all incoming BGP traffic
8. Add iptables/ip6tables commands for all ACL rules for recognized services (currently SSH, SNMP, NTP)
9. For all services for which we did not find configured ACL rules, add iptables/ip6tables commands to allow all incoming packets for those services (allows the device to accept SSH connections before the device is configured)
10. Add iptables rules to drop all packets destined for loopback interface IP addresses
11. Add iptables rules to drop all packets destined for management interface IP addresses
12. Add iptables rules to drop all packets destined for point-to-point interface IP addresses
13. Add iptables rules to drop all packets destined for our VLAN interface gateway IP addresses
14. Add iptables/ip6tables commands to allow all incoming packets with TTL of 0 or 1 (This allows the device to respond to tools like tcptraceroute)
15. If we found control plane ACLs in the configuration and applied them, we lastly add iptables/ip6tables commands to drop all other incoming packets
2020-06-09 04:21:27 +00:00
Wirut Getbamrung
9f8d691d4e
[platform/cel]: Backport reboot cause API to 201811 branch (#4619)
Add reboot cause API to support process-reboot-cause.service
Implement chassis.get_reboot_cause platform API
2020-05-26 02:27:03 -07:00
Guohan Lu
236707ac64 [baseimage]: install same version for docker-ce and docker-ce-cli
Signed-off-by: Guohan Lu <lguohan@gmail.com>
2020-05-20 01:08:44 +00:00
lguohan
8e014bb7e7 [baseimage]: pin down package version for azure-storage, watchdog and futures (#4575)
Signed-off-by: Guohan Lu <lguohan@gmail.com>
2020-05-13 05:05:29 +00:00
Ying Xie
f52e59a032
[ntp] enable/disable NTP long jump according to reboot type (#4582)
- Enable NTP long jump after cold reboot.
- Disable NTP long jump after warm/fast reboot.

Signed-off-by: Ying Xie <ying.xie@microsoft.com>
2020-05-12 12:23:47 -07:00
Qi Luo
8d200300ca
[minigraph] Support FECDisabled in minigraph parser (#4556) (#4567)
Signed-off-by: Qi Luo <qiluo-msft@users.noreply.github.com>
2020-05-09 15:54:39 -07:00
Neetha John
3d41c271a4 [qos]: Alpha and ECN settings change for Th (#4564)
Dynamic threshold setting changed to 0 and WRED profile green min threshold set to 250000 for Tomahawk devices

Changed the dynamic threshold settings in pg_profile_lookup.ini
Added a macro for WRED profiles in qos.json.j2 for Tomahawk devices
Necessary changes made in qos.config.j2 to use the macro if present

Signed-off-by: Neetha John <nejo@microsoft.com>
2020-05-09 18:25:17 +00:00
Ying Xie
660b0be9c5
[201811][sairedis] advance sairedis submodule head (#4562)
Submodule src/sonic-sairedis 5065d7858..370e3c171:
  > [syncd] Use steady clock for TimerWatchdog (#613)

Signed-off-by: Ying Xie <ying.xie@microsoft.com>
2020-05-09 09:04:16 -07:00
Joe LeVeque
ceb878414d [process-reboot-cause] If software reboot cause is unknown add note if first boot into new image (#4538) 2020-05-08 20:37:22 +00:00
Qi Luo
708d901209
[bgpcfgd]: ip_addr is not defined (#4560) 2020-05-08 12:19:48 -07:00
Guohan Lu
9966a0a341 [bgpcfgd]: fix missing reference
Signed-off-by: Guohan Lu <lguohan@gmail.com>
2020-05-08 06:09:40 +00:00
Renuka Manavalan
de05770895
Extend debug image build ability to all platforms. (#3134) (#4524) 2020-05-04 09:48:40 -07:00
Ying Xie
e2ae4ff365
[201811][utilities] advance utilities submodule head (#4490)
Submodule src/sonic-utilities d7e8f84cf..8c21fc151:
  > [utility] Filter FDB entries (#890)
  > Fix the warm-reboot script to support FRR based warm-reboot (#842)

Signed-off-by: Ying Xie <ying.xie@microsoft.com>
2020-04-28 17:27:52 -07:00