With the release of pip 21.0 (https://pypi.org/project/pip/#history), the stretch build
on branch 201811 is failing.
As per https://pypi.org/project/pip/, pip 21.0 does not support Python 2
as of Jan 2021. To fix this, pin pip to version 20.3.3, which was the last version in use
and works fine.
Signed-off-by: Guohan Lu <lguohan@gmail.com>
We believe that the supervisord issue in the face of the clock rolling backwards
has been addressed. Therefore, revert change 2598 to allow NTP to sync
to the right clock at startup.
Signed-off-by: Ying Xie <ying.xie@microsoft.com>
* [cron.d] Create cron job to periodically clean-up core files
* Create script to scan /var/core and clean-up older core files
* Create cron job to run clean-up script
Signed-off-by: Danny Allen <daall@microsoft.com>
* Update interval for running cron job
* Respond to feedback
* Change syslog id
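For illustration, a minimal Python sketch of the kind of clean-up the cron job could run. The age-based retention policy, the 7-day default, and the function names are assumptions made for this example; the commit only states that older core files under /var/core are removed.

```python
import os
import time


def find_stale_core_files(core_dir, max_age_seconds, now=None):
    """Return sorted paths of regular files in core_dir older than max_age_seconds."""
    now = time.time() if now is None else now
    stale = []
    for entry in os.scandir(core_dir):
        # Only consider regular files; skip directories and symlinks.
        if entry.is_file(follow_symlinks=False):
            age = now - entry.stat(follow_symlinks=False).st_mtime
            if age > max_age_seconds:
                stale.append(entry.path)
    return sorted(stale)


def clean_core_files(core_dir="/var/core", max_age_seconds=7 * 24 * 3600):
    """Delete stale files (hypothetical 7-day default) and return the removed paths."""
    removed = []
    for path in find_stale_core_files(core_dir, max_age_seconds):
        os.remove(path)
        removed.append(path)
    return removed
```

Separating the scan from the deletion keeps the selection logic easy to check without touching real core files.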
- monit config was broken by a monit upgrade
- abandon the sed approach since it is susceptible to monit config changes
- use unixsocket instead of httpd due to a bug in 5.20.0
[build_debian] Generate checksum of ASIC config files
* Adds script to generate checksums for ASIC config files
* Adds step to build_debian that copies ASIC config checksum into SONiC filesystem
Signed-off-by: Danny Allen <daall@microsoft.com>
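A rough Python sketch of what checksum generation for a tree of config files could look like. The hash algorithm (SHA-256), the glob pattern, and the function names are assumptions for this example; the commit does not say which algorithm or file layout the real script uses.

```python
import hashlib
from pathlib import Path


def file_checksum(path, algorithm="sha256", chunk_size=65536):
    """Hash a file in chunks so large files are not read into memory at once."""
    digest = hashlib.new(algorithm)
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def checksum_config_files(root, pattern="*.conf", algorithm="sha256"):
    """Map each matching file (path relative to root) to its hex digest.

    The "*.conf" pattern is a placeholder, not the real ASIC config naming.
    """
    root = Path(root)
    return {
        str(path.relative_to(root)): file_checksum(path, algorithm)
        for path in sorted(root.rglob(pattern))
        if path.is_file()
    }
```

The resulting mapping could be written to a file that is copied into the image, so that a later run can detect modified config files by comparing digests.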
ndisc6 gathers a few diagnostic tools for IPv6 networks including:
- ndisc6, which performs ICMPv6 Neighbor Discovery in userland,
- rdisc6, which performs ICMPv6 Router Discovery in userland,
- rltraceroute6, a UDP/ICMP IPv6 implementation of traceroute,
- tcptraceroute6, a TCP/IPv6-based traceroute implementation,
- tcpspray6, a TCP/IP Discard/Echo bandwidth meter,
- addrinfo, easy script interface for hostname and address resolution,
- dnssort, DNS sorting script.
Signed-off-by: Guohan Lu <gulv@microsoft.com>
- What I did
Configure sshd to close all SSH connections after 15 minutes of inactivity.
- How I did it
Set ClientAliveInterval to 900 (900 seconds = 15 minutes) and ClientAliveCountMax to 0
in /etc/ssh/sshd_config using augtool in build_debian.sh. In the process, I refactored the existing augtool command for sshd_config to add comments and empty lines to the file for readability.
- How to verify it
Log into the device via the management port. Wait 15 minutes without sending a keystroke -- you should be automatically logged out.
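The change itself uses augtool. Purely as an illustration of the resulting edit, here is a small Python sketch that sets the same two options in sshd_config text. The helper name set_sshd_option is hypothetical, and the sketch ignores Match blocks.

```python
def set_sshd_option(config_text, key, value):
    """Return config_text with exactly one active `key value` directive.

    The first active occurrence is replaced in place, later duplicates are
    dropped, and the directive is appended if it was not present.
    sshd keywords are case-insensitive, so the match ignores case.
    """
    new_line = f"{key} {value}"
    result = []
    replaced = False
    for line in config_text.splitlines():
        tokens = line.split()
        if tokens and tokens[0].lower() == key.lower():
            if not replaced:
                result.append(new_line)
                replaced = True
            continue  # drop duplicate directives
        result.append(line)
    if not replaced:
        result.append(new_line)
    return "\n".join(result) + "\n"


# Values taken from the commit message: 900 seconds and a count of 0.
example = set_sshd_option("Port 22\n", "ClientAliveInterval", 900)
example = set_sshd_option(example, "ClientAliveCountMax", 0)
```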
2) Install debug tools in every debug docker image
3) Install available debug symbols in debug docker image
4) Provide additional host/docker mapping for host dirs /src & /debug
4.1) The one-image will have source code under /src
4.2) /debug is mapped as rw. Users can put their core files there and use this dir to
collect debug session logs too.
5) Build debug image using debug dockers
6) Source code is archived into /src of debug image
7) The welcome banner is extended to display these additional facilities in debug image.
* [submodule] update sonic-linux-kernel (#2985)
* Fix many version strings
* Update minor version
* Update arista-drivers submodule (#9)
* Rebuild SDK on new kernel (#10)
- What I did
Currently, when the system is under memory pressure, the OOM killer kicks in and kills a rogue process. Killing a rogue process can leave the device unhealthy, leading to blackholing of traffic.
To avoid this, configure the kernel to panic on OOM, which causes the device to reboot and come back up healthy.
- How I did it
Added the sysctl variable panic_on_oom and set its value to 2.
Setting it to 2 ensures that an OOM condition always triggers a kernel panic.
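As a sketch of how the setting could be checked, the Python below parses sysctl.conf-style text and tests whether vm.panic_on_oom is 2. The function names are hypothetical, and the full key name vm.panic_on_oom is the standard kernel sysctl path rather than something spelled out in this commit.

```python
def parse_sysctl_conf(text):
    """Parse `key = value` lines, skipping blanks and '#' or ';' comments."""
    settings = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith(("#", ";")):
            continue
        key, sep, value = line.partition("=")
        if sep:
            settings[key.strip()] = value.strip()
    return settings


def oom_always_panics(settings):
    """True only for mode 2, where the kernel panics on every OOM condition.

    Mode 0 runs the OOM killer; mode 1 panics except when the allocation is
    constrained by a mempolicy or cpuset.
    """
    return settings.get("vm.panic_on_oom") == "2"
```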
- Add ebtables package, and install some filter rules:
1. ebtables -A FORWARD -d BGA -j DROP
2. ebtables -A FORWARD -p ARP -j DROP
Basically, ARP packets in the VLAN are forwarded by the ASIC; the
kernel gets a copy of these ARP packets, and forwarding by the kernel is
dropped. So there is always only one copy of each ARP request/response in the VLAN.
Signed-off-by: Ying Xie <ying.xie@microsoft.com>
When rebooting without the platform_reboot plugin, systemd takes a few
minutes to shut down properly because it blocks on a docker cleanup
operation.
As described by https://github.com/docker/for-linux/issues/421 there
is a race between docker.service and containerd.service.
docker needs containerd to properly stop the containers.
* [security kernel] Upgrade kernel from 4.9.110-3+deb9u2 to 4.9.110-3+deb9u6
short version: 4.9.0-7 to 4.9.0-8
See changelogs for security fixes:
https://tracker.debian.org/media/packages/l/linux/changelog-4.9.110-3deb9u6
Signed-off-by: Zhenggen Xu <zxu@linkedin.com>
* Update sonic-linux-kernel submodule after it was merged
Signed-off-by: Zhenggen Xu <zxu@linkedin.com>
* [baseimage] set default locale en_US.UTF-8
Signed-off-by: chenhu <chenhu@didichuxing.com>
* [baseimage] set default locale to en_US.UTF-8, clean all other unused
* [baseimage] update-locale after locale-gen
* correct update-locale command line
Signed-off-by: Guohan Lu <gulv@microsoft.com>
* [baseimage]: install picocom 3.1 in base image
Signed-off-by: Guohan Lu <gulv@microsoft.com>
* add picocom to stretch build
Signed-off-by: Guohan Lu <gulv@microsoft.com>
* fix slave.mk bug
Signed-off-by: Guohan Lu <gulv@microsoft.com>
The stretch docker-engine in the base image is not started by default
during the build process, so an empty /var/lib/docker needs to be created.
Signed-off-by: Guohan Lu <gulv@microsoft.com>
* Fix potential blackholing/looping traffic and refresh ipv6 neighbor to avoid CPU hit
When IPv6 global addresses are configured on L3 interfaces and used for peering,
and the routing protocol uses link-local addresses on the same interfaces as preferred nexthops,
the link-local addresses can age out after a while because no traffic is sent to the link-local
addresses themselves. When new routes with those link-local nexthops are then received, SONiC does not insert
them into the HW, which causes looping or blackholed traffic.
Global IPv6 addresses on L3 interfaces between switches are refreshed by BGP keepalive and other messages.
On the server-facing side, traffic may hit the forwarding plane only, so the IPv6 neighbor entries are not refreshed regularly.
The Linux kernel IPv6 neighbor entries can then age out and the HW neighbor table entries can be removed,
so traffic to those neighbors hits the CPU, causing traffic drops and temporarily high CPU load.
Also, if link-local addresses were not learned, we may not learn them at all later.
This change is intended to fix all of the above issues.
Changes:
Add the ndisc6 package to the swss docker and use it for IPv6 NDP ping to update the neighbors' state on Vlan interfaces
Change the default IPv6 neighbor reachable timer to 30 minutes
Add a periodic IPv6 multicast ping to ff02::11 to get/refresh link-local neighbor info.
* Fix review comments:
Add PORTCHANNEL_INTERFACE interface for ipv6 multicast ping
format issue
* Combine regular L3 interface and portchannel interface for looping
* Add ndisc6 package to vs docker
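To illustrate the neighbor refresh idea, the Python sketch below picks neighbor entries in a given state from `ip -6 neigh show` style output and builds one ndisc6 invocation per entry. The choice of the STALE state, the sample line layout, and the function names are assumptions for this example and are not taken from the actual script.

```python
def neighbors_to_refresh(neigh_output, states=("STALE",)):
    """Return (address, interface) pairs whose neighbor state is in `states`.

    Assumes lines shaped like:
        fc02:1000::2 dev Vlan1000 lladdr 00:11:22:33:44:55 STALE
    """
    pairs = []
    for line in neigh_output.splitlines():
        tokens = line.split()
        if len(tokens) < 2 or tokens[-1] not in states:
            continue
        if "dev" not in tokens:
            continue
        dev_index = tokens.index("dev")
        # The interface name must sit between "dev" and the trailing state.
        if dev_index + 1 < len(tokens) - 1:
            pairs.append((tokens[0], tokens[dev_index + 1]))
    return pairs


def ndisc6_commands(pairs):
    """Build `ndisc6 <address> <interface>` argument lists, one per neighbor."""
    return [["ndisc6", address, interface] for address, interface in pairs]
```

Each argument list could be handed to subprocess.run; the sketch stops short of executing anything so it can be checked against sample text.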
Some platform drivers install blacklist.conf in /etc/modprobe.d.
That configuration should be propagated into the initramfs to avoid
loading the blacklisted drivers.
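As a small illustration of what such a blacklist.conf carries, this Python sketch extracts the module names from `blacklist <module>` lines. The function name and the sample module names used with it are hypothetical; how the file is actually copied into the initramfs is not shown here.

```python
def blacklisted_modules(conf_text):
    """Return module names from `blacklist <module>` lines, ignoring comments."""
    modules = []
    for line in conf_text.splitlines():
        # Strip anything after '#', then split on whitespace.
        tokens = line.split("#", 1)[0].split()
        if len(tokens) == 2 and tokens[0] == "blacklist":
            modules.append(tokens[1])
    return modules
```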