- What I did
Currently when the system is under memory pressure, the OOM killer kicks in and kills a rogue process. Killing a rogue process can cause the device to be un-healthy leading to blackholing of the traffic.
To avoid this, configure the OOM to do a kernel panic which will cause the device to reboot and come back up healthy.
- How I did it
Added the sysctl variable panic_on_oom and set the value to 2.
Setting it to 2 will ensure OOM killer to always do a kernel panic.
- Add ebtables package, and install some filter rules:
1. ebtables -A FORWARD -d BGA -j DROP
2. ebtables -A FORWARD -p ARP -j DROP
Basically, we let the ARP packets in the VLAN being forwarded by the ASIC,
kernel gets a copy of these ARP packets and the forwarding from Kenerl gets
dropped. So there is always only one copy of ARP/response in the VLAN.
Signed-off-by: Ying Xie <ying.xie@microsoft.com>
We are going to use initramfs hook for firmware upgrades
To install Arista hook:
- create folder /mnt/flash/<image dir>/platform/hooks/boot1/ from Aboot or
/host/<image dir>/platform/hooks/boot1/ from Sonic
- add executable script to created folder
* [build]: put stretch debian packages under target/debs/stretch/
* in stretch build phase, all debian packages built in that stage are placed under target/debs/stretch directory.
* for python-based debian packages, since they are really the same for jessie and stretch, they are placed under target/python-debs directory.
Signed-off-by: Guohan Lu <gulv@microsoft.com>
When rebooting without the platform_reboot plugin, systemd takes a few
minutes to properly shutdown. It's blocking on some docker cleanup
operation.
As described by https://github.com/docker/for-linux/issues/421 there
is a race between docker.service and containerd.service.
docker needs containerd to properly stop the containers.
* [security kernel] Upgrade kernel from 4.9.110-3+deb9u2 to 4.9.110-3+deb9u6
short version: 4.9.0-7 to 4.9.0-8
See changelogs for security fixes:
https://tracker.debian.org/media/packages/l/linux/changelog-4.9.110-3deb9u6
Signed-off-by: Zhenggen Xu <zxu@linkedin.com>
* Update sonic-linux-kernel submodule after it was merged
Signed-off-by: Zhenggen Xu <zxu@linkedin.com>
* [baseimage] set default locale en_US.UTF-8
Signed-off-by: chenhu <chenhu@didichuxing.com>
* [baseimage]set default locale to en_US.UTF-8, clean all other unused
* [baseimage] update-locale after locale-gen
* correct update-locale command line
Signed-off-by: Guohan Lu <gulv@microsoft.com>
* [baseimage]: install picocom 3.1 in base image
Signed-off-by: Guohan Lu <gulv@microsoft.com>
* add picocom to stretch build
Signed-off-by: Guohan Lu <gulv@microsoft.com>
* fix slave.mk bug
Signed-off-by: Guohan Lu <gulv@microsoft.com>
stretch docker-engine in base image is not started by default
in the build process. Need to create empty /var/lib/docker
Signed-off-by: Guohan Lu <gulv@microsoft.com>
* Fix potential blackholing/looping traffic and refresh ipv6 neighbor to avoid CPU hit
In case ipv6 global addresses were configured on L3 interfaces and used for peering,
and routing protocol was using link-local addresses on the same interfaces as prefered nexthops,
the link-local addresses could be aged out after a while due to no activities towards the link-local
addresses themselves. And when we receive new routes with the link-local nexthops, SONiC won't insert
them to the HW, and thus cause looping or blackholing traffic.
Global ipv6 addresses on L3 interfaces between switches are refreshed by BGP keeplive and other messages.
On server facing side, traffic may hit fowarding plane only, and no refresh for the ipv6 neighbor entries regularly.
This could age-out the linux kernel ipv6 neighbor entries, and HW neighbor table entries could be removed,
and thus traffic going to those neighbors would hit CPU, and cause traffic drop and temperary CPU high load.
Also, if link-local addresses were not learned, we may not get them at all later.
It is intended to fix all above issues.
Changes:
Add ndisc6 package in swss docker and use it for ipv6 ndp ping to update the neighbors' state on Vlan interfaces
Change the default ipv6 neighbor reachable timer to 30mins
Add periodical ipv6 multicast ping to ff02::11 to get/refresh link-local neighbor info.
* Fix review comments:
Add PORTCHANNEL_INTERFACE interface for ipv6 multicast ping
format issue
* Combine regular L3 interface and portchannel interface for looping
* Add ndisc6 package to vs docker
some platform drivers install blacklist.conf in /etc/modprobe.d.
Those configuration should be proprogated into initramfs to avoid
loading those blacklisted driver.
* [slave.mk]: Fix displaying username and password in build summary
We display contents of DEFAULT_USERNAME and DEFAULT_PASSWORD, while
image can be build with USERNAME and/or PASSWORD given on make(1)
command line. For example:
$ make USERNAME=adm PASSWORD=mypass target/sonic-broadcom.bin
Fix by displaying USERNAME and PASSWORD variables in build summary.
Signed-off-by: Sergey Popovich <sergey.popovich@ordnance.co>
* [baseimage]: Improve default user account handling
There are couple of issues with current implementation of default
user account management in baseimage:
1) It uses DES to encrypt accounts password. Furthermore this
effectively limits password length to 8 symbols, even if more
provided with PASSWORD or DEFAULT_PASSWORD from rules/config.
2) Salt value for password is same on all builds even with different
password increasing attack surface.
3) During the build process password passed as command line parameter
either as plain text (if given to make(1) as "make PASSWORD=...")
or DES encrypted (if given to build_debian.sh) can be seen by
non-build users using /proc/<pid>/cmdline file that has group and
world readable permissions.
Both 1) and 2) come from:
perl -e 'print crypt("$(PASSWORD)", "salt"),"\n"')"
that by defalt uses DES if salt does not have format $<id>$<salt>$,
where <id> is hashing function id. See crypt(3) for more details on
valid <id> values.
To address issues above we propose following changes:
1) Do not create password by hands (e.g. using perl snippet above):
put this job to chpasswd(8) which is aware about system wide
password hashing policy specified in /etc/login.defs with
ENCRYPT_METHOD (by default it is SHA512 for Debian 8).
2) Now chpasswd(8) will take care about proper salt value.
3) This has two steps:
3.1) For compatibility reasons accept USERNAME and PASSWORD as
make(1) parameters, but warn user that this is unsafe.
3.2) Use process environment to pass USERNAME and PASSWORD variables
from Makefile to build_debian.sh as more secure alternative to
passing via command line parameters: /proc/<pid>/environ
readable only by user running process or privileged users like
root.
Before change:
--------------
hash1
-----
# u='admin'
# p="$(LANG=C perl -e 'print crypt("YourPaSs", "salt"),"\n"')"
^^^^^^^^
8 symbols
# echo "$u:$p" | chpasswd -e
# getent shadow admin
admin:sazQDkwgZPfSk:17680:0:99999:7:::
^^^^^^^^^^^^^
Note the hash (DES encrypted password)
hash2
-----
# u='admin'
# p="$(LANG=C perl -e 'print crypt("YourPaSsWoRd", "salt"),"\n"')"
^^^^^^^^^^^^
12 symbols
# echo "$u:$p" | chpasswd -e
# getent shadow admin
admin:sazQDkwgZPfSk:17680:0:99999:7:::
^^^^^^^^^^^^^
Hash is the same as for "YourPaSs"
After change:
-------------
hash1
-----
# echo "admin:YourPaSs" | chpasswd
# getent shadow admin
admin:$6$1Nho1jHC$T8YwK58FYToXMFuetQta7/XouAAN2q1IzWC3bdIg86woAs6WuTg\
^^^^^^^^
Note salt here
ksLO3oyQInax/wNVq.N4de6dyWZDsCAvsZ1:17681:0:99999:7:::
hash2
-----
# echo "admin:YourPaSs" | chpasswd
# getent shadow admin
admin:$6$yKU5g7BO$kdT02Z1wHXhr1VCniKkZbLaMPZXK0WSSVGhSLGrNhsrsVxCJ.D9\
^^^^^^^^
Here salt completely different from case above
plFpd8ksGNpw/Vb92hvgYyCL2i5cfI8QEY/:17681:0:99999:7:::
Since salt is different hashes for same password different too.
hash1
-----
# LANG=C perl -e 'print crypt("YourPaSs", "\$6\$salt\$"),"\n"'
^^^^^
We want SHA512 hash
$6$salt$qkwPvXqUeGpexO1vatnIQFAreOTXs6rnDX.OI.Sz2rcy51JrO8dFc9aGv82bB\
yd2ELrIMJ.FQLNjgSD0nNha7/
hash2
-----
# LANG=C perl -e 'print crypt("YourPaSsWoRd", "\$6\$salt\$"),"\n"'
$6$salt$1JVndGzyy/dj7PaXo6hNcttlQoZe23ob8GWYWxVGEiGOlh6sofbaIvwl6Ho7N\
kYDI8zwRumRwga/A29nHm4mZ1
Now with same "salt" and $<id>$, and same 8 symbol prefix in password, but
different password length we have different hashes.
Signed-off-by: Sergey Popovich <sergey.popovich@ordnance.co>
* Support OS9 -> SONiC fast-reboot migration
* Address review comments. Update NOS mac in EEPROM and net.rules for eth0
* Address review comments. Update sonic-platform-modules-dell to fac81d...
* Fix script for POSIX compliance
* Reduce SONiC migration partition from 8G to 1G.
* Changes to create 1G partition with ability to resize post migration.
* Remove redundant changes in varlog
* Use findfs to interpret root. Move resize in case cmdline params are reordered
* Upgrade linux-image version
* Add missing dependency of igb
* Fix mft build rule
* Add missing dependency of ixgbe
* [Broadcom]: Update OpenNSL modules to be compatible with kernel 3.16.0-5 (#3)
* [Nephos] Update SDK version to support new kernel module 3.16.0-5 (#4)
* [mellanox]: Update URL for SDK (#5)
* Add switch ASIC vendor and platforms for Nephos
- What I did
Add switch ASIC vendor: Nephos
Add Nephos platforms: Ingrasys S9130-32X, Ingrasys S9230-64X
- How I did it
Add platform/nephos files
Add platform/nephos/sonic-platform-modules-ingrasys submodule
Add device/ingrasys/x86_64-ingrasys_s9130_32x-r0 files
Add device/ingrasys/x86_64-ingrasys_s9230_64x-r0 files
Add SONiC to support Nephos platform
- How to verify it
To build SONiC installer image and docker images, run the following commands:
make configure PLATFORM=nephos
make target/sonic-nephos.bin
Check system and network feature is worked as well
- Description for the changelog
Add switch ASIC vendor and platforms for Nephos
- A picture of a cute animal (not mandatory but encouraged)
Signed-off-by: Sam Yang <yang.kaiyu@gmail.com>
* Advance sonic-sairedis submodule to include #271 (Add Nephos ASIC)
* Framework to plugin Organization specific scripts
* Framework to plugin Organization specific scripts
* Framework to plugin Organization specific scripts
* add getopt option to organization script
moving to initramfs unifies disk allocate on different platforms.
use fallocate instead of dd to speed up the disk allocation.
By default, mkfs.ext4 has -E discard option which discards the blocks
at the mkfs time, also speed up the initialization time.
* [core dump] pass unix time to coredump-compress script
Currently we only have program name (e.g. bgpd) and PID in the core file
name. PID could collide especially after docker restart or recreate.
Passing the unix time to coredump-compress so it could also add time to
the core file name.
* [utilities] include the change to coredump_compress script
* [quagga] enable core dump for bgpd and zebra
bgpd and zebra downgrade their privilege shortly after started. For that
sysctrl kernel.suid_dumpable needs to be set to 2, so that they can dump
core.
Note that fs.suid_dumpable SHOULD NOT be set to 1. Which will bypass all
system security.
* Add monit for disk>85% into pmon docker
* Revert "Add monit for disk>85% into pmon docker"
This reverts commit 9cbbf591c08bce4b52a0f68cbbddae102d7fc614.
* Install monit in base image
* [build]: Include SONiC version into installer.
Signed-off-by: marian-pritsak <marianp@mellanox.com>
* Append dirty if contains local changes
Signed-off-by: marian-pritsak <marianp@mellanox.com>
* Update config
* Use correct name for kernel version field
* Update sysDescription.j2
- Create /var/run/redis/ folder on the host
- Install Python client for Redis on the host
- Mount /var/run/redis/ as read/write from host for all dockers
- Enable accessing the database everywhere including on the host and from remote
Signed-off-by: Shuotian Cheng <shuche@microsoft.com>
The reason is that /etc/network/interfaces file is in base image. After booting,
docker-swss is not ready and thus the empty VLAN interfaces cannot be created
when the brctl is pointing to the binary inside the swss docker.
Add the bridge-utils into the base image and add bridge_ports none to the
/etc/network/interfaces file so that after boot-up the empty VLAN interfaces
will be created to let the members to join later.
Signed-off-by: Shuotian Cheng <shuche@microsoft.com>
- Add a functionality to get SNMP community from DHCP (option 224)
- Add a functionality to get minigraph from http service instead of using default minigraph
- The url for graph service is passed through DHCP option 225
- This feature is by default disabled. Modify rule/config to enable it on build time, or modify /etc/sonic/graph_service_url on run time.
- Fix a bug that getting hostname from DHCP is not working correctly
* Single image
* Fix review comments
* Update syncd service. Add HW mgmt to Mellanox single image.
* Add single image template for Broadcom platform.
SKU should be provided during configure:
make configure PLATFORM=broadcom SKU=Force10-S6000
* Add single image template for Cavium platform.
SKU should be provided during configure:
make configure PLATFORM=cavium SKU=AS7512
* Add description to sonic_debian_extension.j2 file.
This commit will convert the existing file system of flash drive on Arista switches from VFAT to EXT4 in the booting of SONiC. It will take the whole flash and therefore remove the recovery partition. There is a check in the script making sure that the conversion operation will not happen on a non-Arista switch or if the existing file system is not VFAT.
* Build improvements
Fix dependencies
Add configuration options
Automatically build sonic-slave
* Set default number of jobs to 1
* Auto generate target/debs directory
Signed-off-by: marian-pritsak <marianp@mellanox.com>
* Automatically remove sonic-slave container after exit
* Silence clean-logs
* Add SONIC_CLEAN_TARGETS to clean
* Use second expansion for clean dependencies
* Avoid creating empty log files
Remove log file on flush instead of writing empty string
* Put dpkg install inside lock
Use same lock as debian install targets do to avoid
race condition in dpkg installation
* Remove redirect to log from docker save
* Add .platform dependency to all and clean targets
* Remove header and footer from clean targets
* Disable messages for SONIC_CLEAN_TARGETS
* Exit with error if dpkg-buildpackage fails
* Set new location for debs in build_debian.sh
* Add recipe for docker-database
* Update redis version to 3.2.4
* Add support for p4 platform
* Add recipe for snmpd
* Add slave targets to phony and make all target default
* Remove build.sh from thrift
* Add versioning to team, nl, hiredis and initramfs
* Change sonic-slave to support snmpd build from sources
* Remove src/tenjin
* Add recipe for lldpd
* Add recipe for mpdecimal
* Remove hiredis directory on rebuild
* Add recipe for Mellanox hw management
* Remove generic image from all targets for Mellanox
* Add support for python wheels
* Add lldp and snmp dockers
* Sync docker-database to include libjemalloc
* Fix asyncsnmp variable name
* Change default build configuration
Redirect output to log files by default
Set number of jobs to nproc value
Do not print dependencies
Fix logging to print log of failed job into console
* Use docker inspect to check if sonic-slave image exists
* Use config in slave.mk directly
* Disable color output by default
* Remove sswsdk dependency from lldp and snmp dockers
* Fix comment in py wheels install targets
* Add dependency between two versions of sswsdk
* Add containers to mellanox platform
lldp, snmp and database containers
* Add recipe for team docker
* Add team docker to mellanox platform
* Encrypt password passed to build_debian.sh
* Update mellanox SAI version
Make version and revision setting only in main recipe
* Fix error handling in makefiles
As makefiles use .ONESHELL we should add -e
option to shell options in order to exit after any command fails
* Add recipe for platform monitor image
* Add platfotm monitor to mellanox targets
* Ignore submodules when building base image
This change disables DAD (IPv6's Duplicate Address Detection). DAD
protects against IP address conflicts. The way it works is that after
an address is added to an interface, the operating system uses the
Neighbor Discovery Protocol to check if any other host on the network
has the same address. If it finds a neighbor with the same address,
the address is removed from the interface.
The problem here is that the time waiting for DAD to be done is fairly
long and because that we set the host interface operating status to be
down at first, the port cannot exchange the Neighbor Discovery Protocol
and DAD will time out. The host interface is only brought up after we
have received the port admin status up notification from the kernel,
which happens only after the DAD is done or times out. This makes the
whole host interfaces bringing up procedure very slow.
This the DAD is disabled. When it is disabled, addresses are immediately
usable. Without DAD, we need to make sure that the IPv6 addresses don't
have conflicts. For now, we have two IPv6 addresses. One is assigned
manually, which prevents conflicts at first. Another one is the IPv6
link-local address. It is derived from the MAC address and thus all the
link-local addresses are the same on one box. Because link-local addresses
are not used, it will not trigger issues even if they are the same.
* Automatic fw upgrade for mlnx platform
Implement script for firmware upgrade to required version
Add firmware binary and script to ops-syncd-mlnx container
Add pciutils and usbutils to sonic-generic.bin
* Update firmware installation message
It is possible to do both upgrade and downgrade
Change "Upgrading" to "Installing compatible version"
Signed-off-by: marian-pritsak <marianp@mellanox.com>