Commit Graph

480 Commits

Author SHA1 Message Date
Michel Moriniaux
dc747247d1 [ARISTA] adding 7060_cs32s to eMMC exclusions (#2982)
* [ARISTA] adding 7060_cs32s to eMMC exclusions

Following PR 2774 we added the 7060-cx32s according to the guidelines of
PR 2780

This adds the 7060-cx32s to the list f devices that mount /var/log as a
tmpfs to mitigate eMMC wearout

Signed-off-by: Michel Moriniaux <m.moriniaux@criteo.com>

* [ARISTA] adding 7060_cs32s to eMMC exclusions

Following PR 2774 we added the 7060-cx32s according to the guidelines of
PR 2780

This adds the 7060-cx32s to the list f devices that mount /var/log as a
tmpfs to mitigate eMMC wearout

Signed-off-by: Michel Moriniaux <m.moriniaux@criteo.com>
2019-07-02 11:52:43 -07:00
Stepan Blyshchak
6961816dec fix fast reboot compatibility (#3083)
* fix fast reboot compatibility

We should handle both cases for backward-compatible with 201803:
 - fast-reboot
 - SONIC_BOOT_TYPE=fast-reboot

* handle review comments
* add a comment that getBootType code snippet is shared between two files
2019-06-26 12:46:58 -07:00
Jipan Yang
9a1bebe496 [telemetry]: change the service dependency from swss to database (#3072)
Signed-off-by: Jipan Yang <jipan.yang@alibaba-inc.com>
2019-06-24 12:36:16 -07:00
Joe LeVeque
319d854e46 [baseimage]: Increase TMOUT for serial port connections to 15 minutes (#3032)
Increase TMOUT value in order to close inactive serial console connections after 900 seconds (15 minutes) of inactivity
2019-06-19 00:16:01 -07:00
Qi Luo
e7b1988638
[submodule] update sonic-linux-kernel (#2985)
* [submodule] update sonic-linux-kernel
* update linux kernel version
* Fix many version strings
* update mellanox components (built with new kernel)
* [mlnx] add make files for SDK WJH libs
* Update arista driver submodule (#8)
Make the debian packaging point to a newer kernel version.
2019-06-18 10:00:16 -07:00
Kebo Liu
c927517355 [Mellanox] Inject SDK libs dependency to pmon on Mellanox platform (#3000)
* inject sdk libs to pmon
* fix wrong code
2019-06-14 17:38:24 -07:00
lguohan
8f6ae90cba
[docker]: get hostname from config db instead of minigraph (#3004)
minigraph may not be always available on the some system configuration.
Should use config db as the source of truth.
2019-06-13 22:24:09 -07:00
Renuka Manavalan
cdca062693 [build]: Build sonic-broadcom.bin using debug dockers for all stretch based dockers (#2833)
* Updated Makefile infrastructure to build debug images.
As a sample, platform/broadcom/docker-orchagent-brcm.mk is updated to add a docker-orchagent-brcm-dbg.gz target.

Now "BLDENV=stretch make target/docker-orchagent-brcm-dbg.gz" will build the debug image.

NOTE: If you don't specify NOSTRETcH=1, it implicitly calls "make stretch", which builds all stretch targets and that would include debug dockers too.

This debug image can be used in any linux box to inspect core file. If your module's external dependency can be suitably mocked, you my even manually run it inside.

"docker run -it --entrypoint=/bin/bash e47a8fb8ed38"

You may map the core file path to this docker run.

* Dropped the regular binary using DBG_PACKAGES and a small name change to help readability.

* Tweaked the changes to retain the existing behavior w.r.t INSTALL_DEBUG_TOOLS=y.

When this change ('building debug docker image transparently') is extended to all dockers, this flag would become redundant. Yet, there can be some test based use cases that rely on this flag.

Until after all the dockers gets their debug images by default and we switch all use cases of this flag to use the newly built debug images, we need to maintain the existing behavior.

* 1) slave.mk - Dropped unused Docker build args
2) Debug template builder: renamed build_dbg_j2.sh to build_debug_docker_j2.sh
3) Dropped insignifcant statement CMD from debug Docker file, as base docker has Entrypoint.

* Reverted some changes, per review comments.
"User, uid, guid, frr-uid & frr-guid" are required for all docker images, with exception of debug images.

* Get in sync with the new update that filters out dockers to be built (SONIC_STRETCH_DOCKERS_FOR_INSTALLERS) and build debug-dockers only for those to be built and debug target is available.

* Mkae a template for each target that can be shared by all platforms.
Where needed a platform entry can override the template.
This avoids duplication, hence easier to maintain.

* A small change, that can fit better with other targets too.
Just take the platform code and do the rest in template.

* Extended debug to all stretch based docker images

* 1) Combined all orchagent makefiles into one platform independent make under rules/docker-orchagent.mk
2) Extened debug image to all stretch dockers

* Changes per review comments:
1) Dropped LIBSAIREDIS_DBG from database, teamd, router-advertiser, telemetry, and platform-monitor docker*.mk files from _DBG_DEPENDS list
2) W.r.t docker make for syncd, moved DEPENDS from template to specific makefile and let the template has stuff that is applicable to all.

* 1) Corrected a copy/paste mistake

* Fixed a copy/paste bug

* The base syncd dockers follow a template, which defines the base docker as DOCKER_SYNCD_BASE instead of DOCKER_SYNCD_<platform code>. Fix the docker-syncd-<mlnx, bfn>.mk to use the new one.

[Yet to be tested locally]

* Fixed spelling mistake

* Enable build of dbg-sonic-broadcom.bin, which uses dbg-dockers in place of regular dockers, for dockers that build debug version. For dockers that do not build debug version, it uses the regular docker.

This debug bin is installable and usable in a DUT, just like a regular bin.

* Per review comments:
  1) Share a single rule for final image for normal & debug flavors (e.g. sonic-broadcom.bin & sonic-broadcom-dbg.bin)
  2) Put dbg as suffix in final image name.
  3) Compared target/sonic-broadcom.bin.logs with & w/o fix to verify integrity of sonic-broadcom.bin
  4) Compared target/sonic-broadcom.bin.logs with sonic-broadcom-dbg.bin.log for verification

This fix takes care of ONIE image only. The next PR will cover the rest.
The next PR, will also make debug image conditional with flag.

* Updated per comments.
Now that debug dockers are available, do not need a way to install debug symbols in regular dockers.

With this commit, when INSTALL_DEBUG_TOOLS=y is set, it builds debug dockers (for dockers that enable debug build) and the final image uses debug dockers. For dockers that do not enable debug build, regular dockers get used in the final image.

Note:
The debug dockers are explicitly named as <docker name>-dbg.gz. But there is no "-dbg" suffix for image.
Hence if you make two runs with and w/o INSTALL_DEBUG_TOOLS=y, you have complete set of regular dockers + debug dockers. But the image gets overwritten.
Hence if both regular & debug images are needed, make two runs, as one with INSTALL_DEBUG_TOOLS=y and one w/o. Make sure to copy/rename the final image, before making the second run.
2019-06-12 01:36:21 -07:00
Prince Sunny
231d309b69
Generate interface table to have an entry designated to default VRF. (#2848)
* Generate default VRF table for router interfaces

* Updated jinja2 template to have prefix filter
2019-06-10 14:02:55 -07:00
Myron Sosyak
3ec95e17c8 [build_templates] [hostcfgd] Keep containers hostname up to date (#2924)
* Add updateHostName function to docker_image_ctl.j2
* Add hostname specification on container creating step
* Add listener for hostname changes in hostcfgd

Signed-off-by: Myron Sosyak <msosyak@barefootnetworks.com>
2019-06-06 00:41:30 -07:00
Kebo Liu
bd519322cb [Mellanox] Expose SDK share buffer and unix socket from syncd (#2951)
* expose SDK share buffer and unix socket from syncd
* fix PR comments
* fix community comments and add TODO
2019-06-05 11:19:56 -07:00
Nazarii Hnydyn
e041b15d10 [mellanox]: Fixed config reload race. (#2930)
Signed-off-by: Nazarii Hnydyn <nazariig@mellanox.com>
2019-05-29 09:57:29 +03:00
lguohan
30b37ec6fb
[build]: make sonic-slave-stretch as the default build docker (#2921)
Signed-off-by: Guohan Lu <gulv@microsoft.com>
2019-05-27 15:50:51 -07:00
Joe LeVeque
3ec3e20e5a [logrotate] Enhance robustness (#2942)
* [logrotate] Decrease frequency to every 10 minutes; kill any lingering logrotate processes

* [logrotate] Delete all *.1.gz files as firstaction; Remove note about init-system-helpers < 1.47 workaround

However, continue to send SIGHUP directly to rsyslogd process
because 'service rsyslog rotate' still doesn't work properly with
init-system-helpers version 1.48
2019-05-25 18:00:18 -07:00
Stepan Blyshchak
9523e64666 [swss.sh] flush FDB table during cold start (#2933)
Signed-off-by: Stepan Blyschak <stepanb@mellanox.com>
2019-05-22 22:07:29 -07:00
Ying Xie
222706120d [updategraph] set DB version after minigraph reload (#2917)
Signed-off-by: Ying Xie <ying.xie@microsoft.com>
2019-05-18 22:08:41 -07:00
Samuel Angebault
aac0c24312 [device/Arista] Add support for the 7280CR3-32P4 (#2910)
* Add boot0 support for the 7280CR3

* Add platform and plugins for 7280CR3

* Add port config for 7280CR3

* Add platform_reboot for 7280CR3

* Add support for 7280CR3-32D4 based on the 7280CR3-32P4

* Update arista driver submodules

 - Introduce new 7280CR3-32P4
 - Improve to the led plugin for OSFP
2019-05-18 10:34:07 -07:00
Samuel Angebault
77cde50541 [device/Arista] Improvements to the boot of Arista devices. (#2898)
* Fix showing systemd shutdown sequence when verbose is set

* Fix creation of kernel-cmdline file

Sometimes boot0 prints error
"mv: can't preserve ownership of '/mnt/flash/image-arsonic.xxxx/kernel-cmdline': Operation not permitted"

* Improve flash space usage during installation

Some older systems only have 2GB of flash available. Installing a second
image on these can prove to be challenging.
The new installation process moves the installer swi to memory in order
to avoid free up space from the flash before uncompressing it there.
It removes all the flash space usage spike and also improves the IO
since the installation is no more reading and writting to the flash at
the same time.

* Add support of 7060CX-32S-SSD

* 7260CX3: use inventory powerCycle procedures

* 7050QX-32S: use inventory powerCycle procedures

* 7050QX-32: use inventory powerCycle procedures

* platform: arista: add common platform_reboot

Replace platform_reboot by a link to new common for devices already
using a similar script.

* 7060CX-32S: use inventory powerCycle procedures

* Install python smbus in pmon

Some platform plugin need the python smbus library to perform some actions.
This installs the dependency.
2019-05-15 12:45:05 -07:00
Renuka Manavalan
a357693f52 [tacacs]: skip accessing tacacs servers for local non-tacacs users (#2843)
* Switch the nss look up order as "compat" followed by "tacplus".
This helps use the legacy passwd file for user info and go to tacacs only if not found.
This means, we never contact tacacs for local users like "admin".
This isolates local users from any issues with tacacs servers.
W/o this fix, the sudo commands by local users could take <count of servers> * <tacacs timeout> seconds, if the tacacs servers are unreachable.

* Skip tacacs server access for local non-tacacs users.
Revert the order of 'compat tacplus' to original 'tacplus compat' as tacplus
access is required for all tacacs users, who also get created locally.
2019-05-09 14:36:32 -07:00
Ying Xie
9efcf1759a
[ebtables] install ebtables in base image and install filter rules (#2805)
- Add ebtables package, and install some filter rules:
  1. ebtables -A FORWARD -d BGA -j DROP
  2. ebtables -A FORWARD -p ARP -j DROP

Basically, we let the ARP packets in the VLAN being forwarded by the ASIC,
kernel gets a copy of these ARP packets and the forwarding from Kenerl gets
dropped. So there is always only one copy of ARP/response in the VLAN.

Signed-off-by: Ying Xie <ying.xie@microsoft.com>
2019-05-09 09:44:41 -07:00
lguohan
5fb185cd83
[docker-frr]: bring quagga docker features to frr docker (#2870)
- use superviord to manage process in frr docker
- intro separated configuration mode for frr
- bring quagga configuration template to frr.

Signed-off-by: Guohan Lu <gulv@microsoft.com>
2019-05-08 23:00:49 -07:00
Joe LeVeque
6eca27e564 [services] Restart SwSS service upon unexpected critical process exit (#2845)
* [service] Restart SwSS Docker container if orchagent exits unexpectedly

* Configure systemd to stop restarting swss if it attempts to restart more than 3 times in 20 minutes

* Move supervisor-proc-exit-listener script

* [docker-dhcp-relay] Enhance wait_for_intf.sh.j2 to utilize STATEDB

* Ensure dependent services stop/start/restart with SwSS

* Change 'StartLimitInterval' to 'StartLimitIntervalSec', as Stretch installs systemd 232 (>= v230)

* Also update journald.conf options

* Remove 'PartOf' option from unit files

* Add '$(SUPERVISOR_PROC_EXIT_LISTENER_SCRIPT)' to new shared docker-orchagent makefile

* Make supervisor-proc-exit-listener script read from 'critical_processes' file inside container

* Update critical_processes file for swss container
2019-05-01 08:02:38 -07:00
Joe LeVeque
2736da97c7 [sudoers] Add /usr/bin/teamshow to READ_ONLY_CMDS (#2846) 2019-05-01 08:01:44 -07:00
Ying Xie
6431248243
[db migrator] migrate the DB to latest schema when needed (#2808)
Signed-off-by: Ying Xie <ying.xie@microsoft.com>
2019-04-30 14:46:18 -07:00
Qi Luo
6b3a26f0cc
Remove unused packages in docker images and host (#2807)
* Remove unneeded packages in docker images and host
* Remove libpython3.6 from snmp docker image
2019-04-29 17:21:24 -07:00
Ying Xie
c7af19a4db
[teamd service] start teamd service after swss (#2829)
SWSS clears DB tables, if teamd is not started after swss, there is a
race condition that swss might clear vital teamd information.

Signed-off-by: Ying Xie <ying.xie@microsoft.com>
2019-04-26 15:12:33 -07:00
Andriy Moroz
ca7924eb27 Increase syncd start timeout (#2776)
* Increase syncd start timeout

Signed-off-by: Andriy Moroz <c_andriym@mellanox.com>

* Replace TimeoutSec to TimeoutStartSec

Signed-off-by: Andriy Moroz <c_andriym@mellanox.com>
2019-04-24 17:51:26 +03:00
zhenggen-xu
75964ef243 [baseimage]: Add fstrim service and fstrim timer by default (#2804)
This service (weekly) will let SSD firmware to do the garbage collection
after file-system deleted files. It could avoid slowness or
even READ-ONLY error due to SSD not being able to free the pages
even though the file system thinks there was a lot of space left.

Signed-off-by: Zhenggen Xu <zxu@linkedin.com>
2019-04-21 14:21:16 -07:00
Stepan Blyshchak
6a4ffef1fd [snmp.service] Make swss.service a requisite (#2790) 2019-04-16 18:32:36 -07:00
Ying Xie
8bf9247c5e
[tmpfs var/log] mount /var/log as tmpfs for some platforms (#2780)
SONiC is a heavy writer to /var/log partition, we noticed that this
behavior causes certain flash drive to become read-only over time.
To avoid this issue, we mount /var/log parition on these devices as
tmpfs.

- Mount /var/log as tmpfs
- /var/log default size is 128M
- Adjust size according to existing var-log.ext4 file size.
- Adjust size to between 5% to 10% of total memory size.

Signed-off-by: Ying Xie <ying.xie@microsoft.com>
2019-04-14 22:46:26 -07:00
Ying Xie
f583f57af6
[service] add warmboot finializer service (#2715)
After warm reboot is done, we need to disable warm reboot flag and
tear down anything setup for warm reboot and persisted across.

Signed-off-by: Ying Xie <ying.xie@microsoft.com>
2019-04-12 15:45:58 -07:00
Renuka Manavalan
6d7ecc426c [hostcfgd] -- Fix the default for failthrough as false.
This implies that by default, if TACACS is configured properly and it reported auth_err, then don't try fail through to traditional unix authentication through /etc/passwd.

If this failthrough is intended, make it explicit through "sudo config aaa authentication failthrough enable"

Removed an unused variable "aaa.fallback"

Tested manually. Note the presence of 'auth_err=die' in all cases except when failthrough is explicitly enabled.

admin@str-s6000-acs-13:~$ sudo config aaa authentication failthrough default; date
Wed Apr  3 23:05:18 UTC 2019
admin@str-s6000-acs-13:~$ ls -lrt /etc/pam.d/common-auth-sonic ; grep 123 /etc/pam.d/common-auth-sonic
-rw-r--r-- 1 root root 1316 Apr  3 23:05 /etc/pam.d/common-auth-sonic
auth    [success=done new_authtok_reqd=done default=ignore auth_err=die]        pam_tacplus.so server=100.127.20.22:49 secret=testing123 login=login timeout=5 try_first_pass
auth    [success=done new_authtok_reqd=done default=ignore auth_err=die]        pam_tacplus.so server=100.127.20.21:49 secret=testing123 login=login timeout=5 try_first_pass

admin@str-s6000-acs-13:~$ sudo config aaa authentication failthrough enable; date ; h4 "AAA|authentication"
Wed Apr  3 23:06:37 UTC 2019
admin@str-s6000-acs-13:~$ ls -lrt /etc/pam.d/common-auth-sonic ; grep 123 /etc/pam.d/common-auth-sonic
-rw-r--r-- 1 root root 1294 Apr  3 23:06 /etc/pam.d/common-auth-sonic
auth    [success=done new_authtok_reqd=done default=ignore]     pam_tacplus.so server=100.127.20.22:49 secret=testing123 login=login timeout=5 try_first_pass
auth    [success=done new_authtok_reqd=done default=ignore]     pam_tacplus.so server=100.127.20.21:49 secret=testing123 login=login timeout=5 try_first_pass

admin@str-s6000-acs-13:~$ sudo config aaa authentication failthrough disable; date ; h4 "AAA|authentication"
Wed Apr  3 23:07:09 UTC 2019
admin@str-s6000-acs-13:~$ ls -lrt /etc/pam.d/common-auth-sonic ; grep 123 /etc/pam.d/common-auth-sonic
-rw-r--r-- 1 root root 1321 Apr  3 23:07 /etc/pam.d/common-auth-sonic
auth    [success=done new_authtok_reqd=done default=ignore auth_err=die]        pam_tacplus.so server=100.127.20.22:49 secret=testing123 login=login timeout=5 try_first_pass
auth    [success=done new_authtok_reqd=done default=ignore auth_err=die]        pam_tacplus.so server=100.127.20.21:49 secret=testing123 login=login timeout=5 try_first_pass
2019-04-03 23:16:56 +00:00
Ying Xie
00a0f22f38
Revert "[teamd service] teamd service should start after syncd (#2724)" (#2733)
This reverts commit 0d1efb131c.
2019-04-03 08:20:44 -07:00
paavaanan
b56124bf48 removing dhcp- turn- off option from initrd (#2555)
* removing dhcp changes from initrd

* removing mgmt-intf-dhcp file
2019-04-02 15:48:04 -07:00
Ying Xie
0d1efb131c
[teamd service] teamd service should start after syncd (#2724)
* [teamd service] teamd service should start after syncd

Signed-off-by: Ying Xie <ying.xie@microsoft.com>

* combine after lines
2019-04-01 15:40:22 -07:00
Qi Luo
9c83b5480d
[security] Do not generate ssh server keys for non RSA protocols (#2718) 2019-03-29 15:27:33 -07:00
Ying Xie
698b248a13
[docker script] skip docker mount point checking for database container (#2683)
database container doesn't mount hwsku folder.

Signed-off-by: Ying Xie <ying.xie@microsoft.com>
2019-03-19 20:14:07 -07:00
Renuka Manavalan
ae05579c67 [baseos]: Install ipaddress python package that has deprecated current ipaddr. … (#2674)
* Install ipaddress python package that has deprecated current ipaddr. ipaddress has backport to python2.7

* Install python ipaddress module as required by route_check.py sonic utility. BTW, ipaddress deprecates ipaddr and ipaddress has python2 backport

* Revert the old chaneg per review comments.

Signed-off-by: Renuka Manavalan <remanava@microsoft.com>
2019-03-18 11:12:47 -07:00
Pavlo Yadvichuk
11c2e9ee3d [barefoot]: Allow configuration of platform-specific interfaces used for internal purposes (#2631)
- Why it is required
since SONiC master switches ifupdown package to the new implementation (ifupdown2), it is required to change the configuration of a platform-specific interface for wedge100bf_32x and wedge100bf_65x platforms (bc of ifupdown2 doesn't support auto mode for inet6 protocol).

Also, need to make some refactoring and remove if platform == smth then.. from the system level scripts.

- What I did

removed customization of /usr/bin/interfaces-config.sh
explicitly created directory /etc/network/interfaces.d
added "source" to the /etc/network/interfaces generation template (to include platform-specific interfaces processing)
added platform-specific interfaces config itself (for wedge100bf_32x and wedge100bf_65x)
fixed testcase in sonic-config-engine
- How to verify it

build image for wedge100bf_32x
perform sudo config reload -y on new installation
check the correct configuration of usb0 interface
- Description for the changelog

Allow configuration of platform-specific interfaces
2019-03-09 06:22:32 -08:00
Joe LeVeque
2bb5400948 [services] Services which start containers now use 'docker wait' instead of 'docker attach' (#2661) 2019-03-08 10:59:41 -08:00
Wenda Ni
f9c9fa8ba1 [qos]: Map tc 1, 2, 5, and 6 back to pg 0 (#2650)
Lossy traffic does not need to be mapped to different ingress PGs. They can all share the same ingress PG.

Signed-off-by: Wenda Ni <wenni@microsoft.com>
2019-03-08 02:23:32 -08:00
Nazarii Hnydyn
b22fe37670 [mellanox]: Upgraded hw-management V.2.0.0160. (#2643)
Signed-off-by: Nazarii Hnydyn <nazariig@mellanox.com>
2019-03-06 18:51:46 -08:00
Wenda Ni
784bf77a92 Add hook to allow customizing link cable lengths
Signed-off-by: Wenda Ni <wenni@microsoft.com>
2019-03-05 22:06:00 +00:00
Ying Xie
66f5202b9f
[swss/syncd] cold start syncd service in swss in attach method (#2639)
start() is called by service startPre method, which is blocking. Starting
syncd service here is causing deadlock.

attach() is called by service start method, which is non-blocking.

Signed-off-by: Ying Xie <ying.xie@microsoft.com>
2019-03-04 16:46:55 -08:00
RAMA CHANDRA REDDY GADDAM
b9edb7153d [aaa] Fix common-auth-sonic.j2 template issue (#2613) 2019-03-02 15:36:35 -08:00
Joe LeVeque
5eb7872a07 [services] Ensure swss and syncd services start before dependent services (#2634)
* [services] Ensure swss and syncd services start before dependent services

* Add 'attach' functions to scripts which get installed to /usr/local/bin so that services only reference the one script each

* Add 'After=swss.service' to syncd.service
2019-03-02 15:28:34 -08:00
yurypm
d632569a6a Add initramfs hook for Arista devices (#2595)
We are going to use initramfs hook for firmware upgrades
To install Arista hook:
- create folder /mnt/flash/<image dir>/platform/hooks/boot1/ from Aboot or
  /host/<image dir>/platform/hooks/boot1/ from Sonic
- add executable script to created folder
2019-02-27 10:28:04 -08:00
Ying Xie
3086f4f391
Revert "[baseimage] Delay ntp-config service to start after 5 minutes (#2494)" (#2590)
This reverts commit 33fe8d298e.
2019-02-21 10:04:54 -08:00
Nikos
1158277533 [frr]: staticd terminating due to inadequate permissions (#2580)
Signed-off-by: nikos <ntriantafillis@gmail.com>
2019-02-19 21:50:19 -08:00
lguohan
572db1e0a9
[swss]: flush asic db in swss.sh for non warm-boot (#2582)
need to flush asic db in swss.sh instead of syncd.sh

orchagent might already started in swss.sh and put commands
into asic db before asic db is flushed in syncd.sh. This
causes race condition such as INIT_VIEW not passing to syncd.

Signed-off-by: Guohan Lu <gulv@microsoft.com>
2019-02-19 21:48:43 -08:00
Jipan Yang
ff74daaf13 Move warm_restart enable/disable config to stateDB WARM_RESTART_ENABLE_TABLE (#2538)
Signed-off-by: Jipan Yang <jipan.yang@alibaba-inc.com>
2019-02-19 17:06:56 -08:00
Renuka Manavalan
fa7c46611e [hostcfgd]: Promote logs for update-notifications-from-DB from DEBUG to INFO (#2576)
* Add a log message for each notification of add/del TACACS server.

Signed-off-by: Renuka Manavalan <remanava@microsoft.com>

* Moved another syslog message from DEBUG to INFO to be able to see those notifications.

All these changes are to help with a one-time-seen-bug, that hostcfgd did not act upon changes to redis for TACACS servers. We could not repro the bug.

Signed-off-by: Renuka Manavalan <remanava@microsoft.com>
2019-02-16 10:17:13 -08:00
Stepan Blyshchak
2dd769bf46 [syncd.sh] Don't stop sxdkernel during warm shutdown on Mellanox platform (#2572)
/etc/init.d/sxdkernel stop may take up to 15 sec which has impact on
control plane downtime

Signed-off-by: Stepan Blyschak <stepanb@mellanox.com>
2019-02-15 16:08:08 -08:00
Nazarii Hnydyn
d53df059d4 [devices]: Added new SN3700/SN3700C Mellanox platforms (#2548)
* [mlnx-msn3700]: Added MSN3700 platform.

Signed-off-by: Nazarii Hnydyn <nazariig@mellanox.com>

* [mlnx-msn3700]: Upgrade FW burn: use ASIC auto detect.

Signed-off-by: Nazarii Hnydyn <nazariig@mellanox.com>

* [mlnx-msn3700]: Updated HW-MGMT/FW/MFT/SAI/SDK.

Signed-off-by: Nazarii Hnydyn <nazariig@mellanox.com>

* [mlnx-msn3700]: Added MSN3700C platform.

Signed-off-by: Nazarii Hnydyn <nazariig@mellanox.com>
2019-02-13 23:08:04 -08:00
Ying Xie
44551d0fb5
[swss/syncd] log swss/syncd service script activities (#2545)
Signed-off-by: Ying Xie <ying.xie@microsoft.com>
2019-02-10 11:56:31 -08:00
zzhiyuan
6037707abc [devices]: Add device data for Arista 7060PX/DX4-32 (#2534)
* Add boot0 definition for Arista 7060PX4-32 and 7060DX4-32

* Add port configuration for Arista 7060PX4-32

* Add plugins for Arista 7060PX4-32

* Add platform_reboot for Arista 7060PX4-32

* Add Arista 7060DX4-32 as symlink of 7060PX4-32

* Add sensors configuration and fancontrol for Arista 7060PX4-32

* Update arista-driver submodules for barefoot/broadcom

* Add platform_reboot script for Alhambra

* Rook fancontrol CPLD rename
2019-02-08 22:02:01 -08:00
Nadiia Stetskovych
bb5a171ffc [minigraph]: Do not fail for minigraphs which do not have neighbors listed in <Devices> section (#2522)
Signed-off-by: Nadiya.Stetskovych <nstetskovych@barefootnetworks.com>
2019-02-04 22:43:08 -08:00
lguohan
f20665008c
[build]: put stretch debian packages under target/debs/stretch/ (#2519)
* [build]: put stretch debian packages under target/debs/stretch/

* in stretch build phase, all debian packages built in that stage are placed under target/debs/stretch directory.
* for python-based debian packages, since they are really the same for jessie and stretch, they are placed under target/python-debs directory.

Signed-off-by: Guohan Lu <gulv@microsoft.com>
2019-02-04 22:06:37 -08:00
zhenggen-xu
982eddfaa4 [updategraph] After system upgrade, restore files/directories with original attributes etc. (#2368)
* [updategraph] After system upgrade, restore files/directories with
original attributes etc.
Restore a few more files that was missed before.
Restore FRR configuration directory if exists on old system

Signed-off-by: Zhenggen Xu <zxu@linkedin.com>

* Removed deployment_id_asn_map.yml from copy list

Signed-off-by: Zhenggen Xu <zxu@linkedin.com>
2019-02-02 12:50:19 -08:00
lguohan
9c2d7240ea
[vs]: Force10-S6000 buffer settings for virtual switch (#2515)
Signed-off-by: Guohan Lu <gulv@microsoft.com>
2019-02-01 11:18:02 -08:00
Prince Sunny
39e12a1d82 [swss]: Change VrfMgrd startup order, cleanup VRF_TABLE from state DB (#2510) 2019-01-31 23:28:31 -08:00
Wenda Ni
58adf06cc0 [QoS]: Link pg 2 and 6 to lossy buffer profile (#2511)
* Link pg 2 and 6 to lossy buffer profile

Signed-off-by: Wenda <wenni@microsoft.com>
2019-01-31 23:27:58 -08:00
Joe LeVeque
33fe8d298e [baseimage] Delay ntp-config service to start after 5 minutes (#2494) 2019-01-30 19:01:21 -08:00
Wenda Ni
ce9a3f0c5a [QoS]: QoS Config change for multiple devices (#2505)
* QoS config change: 1) DSCP mapping; 2) link pg/queue 6 to lossy buffer;
3) redistribute scheduler

Signed-off-by: Wenda <wenni@microsoft.com>

* Add scheduling weight to queue 2

Signed-off-by: Wenda <wenni@microsoft.com>

* Link pg/queue 2 to lossy buffer

Signed-off-by: Wenda <wenni@microsoft.com>

* Update the pg headroom for a7060-D48C8 50G

Signed-off-by: Wenda <wenni@microsoft.com>

* Update config gen test for qos

Signed-off-by: Wenda <wenni@microsoft.com>

* Update pg headroom size, and update egress lossy pool size accordingly

Signed-off-by: Wenda <wenni@microsoft.com>

* Update headroom pool size; Update ingress service pool and egress lossy
pool sizes accordingly;

Signed-off-by: Wenda <wenni@microsoft.com>

* a7260: update headroom pool size; Update ingress service pool and egress lossy pool sizes accordingly;

Signed-off-by: Wenda <wenni@microsoft.com>

* Update config gen test for buffer

Signed-off-by: Wenda <wenni@microsoft.com>
2019-01-30 19:00:13 -08:00
Joe LeVeque
39b60d2a50 [reboot cause] Move reboot-cause files to /host directory so they persist across SONiC upgrades (#2490)
* [reboot cause] Move reboot-cause files to /host directory so they persist across SONiC upgrades

* [sonic-utilities] Update submodule to include related changes
2019-01-29 03:42:19 -08:00
Joe LeVeque
8f43cad061 [rsyslog] Suppress duplicate messages from base image and all Docker containers (#2497) 2019-01-29 03:41:40 -08:00
lguohan
4ccd35bc25
[kernel]: update sonic kernel to 4.9.0-8-2 (#2468)
* [kernel]: update sonic kernel to 4.9.0-8-2

* 3b2114d 2019-01-20 | [sonic-linux-kernel] add udp_l3mdev_accept kernel upstream patch (#70) (HEAD, azure/master) [Harish Venkatraman]
* 37734aa 2019-01-10 | L3mdev cgroup (#73) [lguohan]
* d631eeb 2018-12-15 | yet another uart race condition fix (#75) [lguohan]

Signed-off-by: Guohan Lu <gulv@microsoft.com>

* Update Mellanox SDK

Signed-off-by: Guohan Lu <gulv@microsoft.com>

* Update arista platform driver to match 4.9.0-8-2 kernel

Signed-off-by: Guohan Lu <gulv@microsoft.com>
2019-01-25 00:46:09 -08:00
Joe LeVeque
116ddb996a [caclmgrd] Don't crash if we find empty/null rule_props (#2475)
* [caclmgrd] Don't crash if we find empty/null rule_props
2019-01-23 18:47:05 -08:00
Prabhu Sreenivasan
f28a670097 [baseimage]: Avoid removing localhost entry from /etc/hosts file (#2452)
- What I did
This fix removes the possibility of 'localhost' entry getting removed from /etc/hosts file by hostname-config service.

Without this change, whenever we change the hostname from 'localhost' to any other name on the config_db.json and reload the config, /etc/hosts file will only have the new hostname on it. But there are multiple sonic utilities (eg: swssconfig) which relies on the hard coded 'localhost' name and they tend to stop working.

- How I did it
Added a new check on hostname-config.sh script to avid blindly deleting the line containing the old hostname from /etc/hosts file. Now it will delete the old hostname only if its not localhost or when the hostname is not changing.

- How to verify it

Bring up SONiC on a device with hostname as localhost
Edit /etc/sonic/config_db.json to update the 'hostname' filed under DEVICE_METADATA from "hostname" : "localhost" --> "hostname" : "sonic"
run config reload -y to reflect the hostname change done on config_db.json file.
cat /etc/hosts and check whether both 127.0.0.1 localhost and 127.0.0.1 sonic entry are present on the file.
ping localhost should work fine.
- Description for the changelog
Make hostname-config service more robust in handling SONiC hostname change from localhost to anything else.
2019-01-17 22:47:19 -08:00
stepanblyschak
20dfb03359 [mellanox|ffb] ISSU version check (#2437)
* Revert "[mellanox]: Integrate CRIU tool to SYNCD docker container (#2061)"

This reverts commit 514b38f348.

Conflicts:
	platform/mellanox/docker-syncd-mlnx.mk
	sonic-slave/Dockerfile

* [mellanox|ffb] remove unused scripts

Signed-off-by: Stepan Blyschak <stepanb@mellanox.com>

* [mellanox|ffb] ISSU version check

Signed-off-by: Stepan Blyschak <stepanb@mellanox.com>

* [mlnx|ffb] remove extra ';'

Signed-off-by: Stepan Blyschak <stepanb@mellanox.com>
2019-01-17 14:41:32 -08:00
Nikos
e55a7d7db7 [baseimage]: Initial changes for dhcp to support eth0 in a mgmt vrf (#2348)
* Initial changes to support eth0 in a mgmt vrf
2019-01-15 18:15:56 -08:00
stepanblyschak
ff526dd103 [mellanox|ffb] use system level warm reboot for Mellanox fastfast boot (#2374)
* [mellanox|ffb] use system level warm reboot for Mellanox fastfast boot

Signed-off-by: Stepan Blyschak <stepanb@mellanox.com>

* [mellanox|ffb] add comments for mellanox start/stop drivers section

Signed-off-by: Stepan Blyschak <stepanb@mellanox.com>
2019-01-10 14:09:03 -08:00
lguohan
b57a376622
[docker-engine]: upgrade docker engine to 18.09 (#2417)
* [docker-engine]: upgrade docker engine to 18.09
2019-01-04 20:47:43 -08:00
Volodymyr Samotiy
b506241b84 [syncd]: Fix reload flow for Mellanox platforms (#2386)
* Perform stop/start of Mellanox driver tools for all types of reboot
* Don't set Mellanox FAST_BOOT option for "cold" reboot
* Don't send "syncd_request_shutdown" event for "cold" reboot on Mellanox platforms

Signed-off-by: Volodymyr Samotiy <volodymyrs@mellanox.com>
2018-12-15 11:36:12 -08:00
zhenggen-xu
f093ef2a9f [security kernel] Upgrade kernel from 4.9.110-3+deb9u2 to 4.9.110-3+deb9u6 (#2367)
* [security kernel] Upgrade kernel from 4.9.110-3+deb9u2 to 4.9.110-3+deb9u6
short version: 4.9.0-7 to 4.9.0-8

See changelogs for security fixes:
https://tracker.debian.org/media/packages/l/linux/changelog-4.9.110-3deb9u6

Signed-off-by: Zhenggen Xu <zxu@linkedin.com>

* Update sonic-linux-kernel submodule after it was merged

Signed-off-by: Zhenggen Xu <zxu@linkedin.com>
2018-12-11 04:17:17 -08:00
Samuel Angebault
6c7bcf5067 [device/Arista] fix small issue for the 7170 (#2373)
* Fix boot0 install on vfat

* Only display the hook name in boot0

Instead of printing the entire path

* Update arista driver submodule
2018-12-11 04:14:46 -08:00
Ying Xie
6ba93acd9c
[update graph] adapt to warm reboot scenario (#2353)
* [update graph] adapt to warm reboot scenario

When migrating configuration, always copy config files from old_config
to /etc/sonic. But if warm reboot is detected, then skip configuration
operations.

Signed-off-by: Ying Xie <ying.xie@microsoft.com>

* log file copies and misses
2018-12-06 10:24:50 -08:00
Wenda Ni
f5e678cf84 Port QoS & buffer changes in 0330 to master (#2239)
* 1) DSCP 46 to 5; 2) ecn config for lossless traffic; 3) ecn on by default; 4) DWRR equal weight;

Signed-off-by: Wenda <wenni@microsoft.com>

* 1) link pg & queue 5 to lossy buffer profile; 2) ingress lossless alpha 1/8

Signed-off-by: Wenda <wenni@microsoft.com>

* Update the test case for qos & buffer json template

Signed-off-by: Wenda <wenni@microsoft.com>

* Migrate a7050-qx32 and s6000 to use pg_profile lookup architecture

Signed-off-by: Wenda <wenni@microsoft.com>

* Update pg headroom egress service pool for a7050-qx-32s, a7050-qx32, and s6000

Signed-off-by: Wenda <wenni@microsoft.com>

* Link queue 5 to lossy profile

Signed-off-by: Wenda <wenni@microsoft.com>
2018-12-04 20:51:55 -08:00
kannankvs
a9a7ce1091 tacacs management vrf changes (#2217) 2018-12-04 10:22:48 -08:00
Volodymyr Samotiy
75b41233d2 [Mellanox|FFB]: Add support for Mellanox fast-fast boot (#2294)
* [mlnx|ffb] Add support for mellanox fast-fast boot

Signed-off-by: Stepan Blyschak <stepanb@mellanox.com>

* [mlnx|ffb]: Add support of "config end" event for mlnx fast-fast boot

Signed-off-by: Volodymyr Samotiy <volodymyrs@mellanox.com>

* [Mellanox|FFB]: Fix review comments

* Change naming convention from "fast-fast" to "fastfast"

Signed-off-by: Volodymyr Samotiy <volodymyrs@mellanox.com>
2018-12-04 10:11:24 -08:00
Samuel Angebault
989b60059b [device/arista]: Update (#2336)
* Update arista drivers submodule

* Ignore the possible timestamp warning in tar extraction

* Add verbosity toggle to boot0

Console logging is slow because of the 9600 baud rate.
Some time can be saved by decreasing the console verbosity.

* Add hook mechanism in boot0.

Support additional features in boot0 via hooks.
Hooks are unpacked and executed at post-install or pre-exec time.

* Fix 7170 sensors.conf file

Fix critical temperature settings for MAX6658 sensors

* Fix the random swap of storage devices

For arista 7050 switches running with linux 4.9, it is likely the device
name of flash drive (/dev/sda) and usb (/dev/sdb) randomly swap in kernel
booting, depending on which one is ready first. It breaks the expectation
that flash will be mounted as root by setting root=/dev/sda1. This patch
will correct ROOT to flash device refering to the path under block_flash.

* Fix 7170 fancontrol

* Do not remove aquota.user file in boot0

This file is a filesystem protected file used by EOS.
It can be simply removed and will make the SONiC installation failed if
not skipped.
2018-12-04 10:08:55 -08:00
Taoyu Li
aedfd6e708 [sonic-cfggen] Multi-key should be in form of (a,b) instead of 'a|b' (#2337) 2018-12-04 10:07:44 -08:00
Joe LeVeque
298d2ad8f4
[boot] Refactor: All services which start Docker containers start before ntp-config service (#2335) 2018-12-03 16:01:44 -08:00
Ying Xie
84bde1511a
[sonic boot] disable dhcp during boot up, until updategraph service is running (#2316)
* [sonic] disable management port eth0 during boot up

Signed-off-by: Ying Xie <ying.xie@microsoft.com>

* [updategraph] enable dhcp client on management port eth0

Signed-off-by: Ying Xie <ying.xie@microsoft.com>
2018-11-29 08:34:22 -08:00
Joe LeVeque
d1c9b0cb77 [boot] Start ntp-config service after all Docker containers are started (#2303) 2018-11-28 00:12:03 -08:00
Ying Xie
ce60c53933
[build image] copy init_interfaces to interfaces (#2302)
init_interfaces meant to be sonic init interfaces configuration file.
However, it needs to be copied to the right file name to take effect.

Signed-off-by: Ying Xie <ying.xie@microsoft.com>
2018-11-27 14:35:17 -08:00
Nikos
7056b49af7 Routing application split config support (#2286)
* Routing application split config support

Signed-off-by: nikos <ntriantafillis@gmail.com>

* Routing application split config support
Routing application split config support

Signed-off-by: nikos <Nikos Triantafillis>
2018-11-26 18:19:12 -08:00
zzhiyuan
f0540e7381 Fix networking.service waiting for udevadm settle (#2295)
There was a fix to speed up initialization when networking used init.d
but it did not carry over to systemd networking.service. This fix will
apply the same change on the systemd service.

The result is much less time spent being blocked in networking.service.
2018-11-23 17:06:23 -08:00
Qi Luo
c2ae736f2e [warmboot] Load database from redis-cli save (#2287)
* [warmboot] Load database from `redis-cli save`

Signed-off-by: Qi Luo <qiluo-msft@users.noreply.github.com>

* Add trivial statement to make bash function valid

Signed-off-by: Qi Luo <qiluo-msft@users.noreply.github.com>

* Update submodule sonic-utilities: Use 'redis-cli save' to dump database to file

Signed-off-by: Qi Luo <qiluo-msft@users.noreply.github.com>

* Move configdb-load.sh outside docker, and only run in cold

Signed-off-by: Qi Luo <qiluo-msft@users.noreply.github.com>

* Fix for more strict warm check

Signed-off-by: Qi Luo <qiluo-msft@users.noreply.github.com>
2018-11-22 15:13:35 -08:00
Ying Xie
4abbe43463 [syncd] skip ledinit during syncd warm start (#2285)
* [syncd] skip ledinit during syncd warm start

Signed-off-by: Ying Xie <ying.xie@microsoft.com>
2018-11-21 17:56:19 -08:00
Ying Xie
873df9d8e8
[bde driver] black list linux_kernel_bde driver (#2284)
This driver should be loaded by sonic service. If kernel tries to load
it, the driver would be loaded with default parameters, which is not
right for sonic.

Signed-off-by: Ying Xie <ying.xie@microsoft.com>
2018-11-21 08:08:37 -08:00
Qi Luo
465ebbafff
Build patched redis-dump-load (#2277)
* Build patched redis-dump-load
* Fix build
* Add build rule
2018-11-20 19:27:56 -08:00
Qi Luo
b4fd40a75e Fix redis-py version to 2.10.6 (#2273)
* Fix redis-py version

Signed-off-by: Qi Luo <qiluo-msft@users.noreply.github.com>

* Update submodule sonic-py-swsssdk: Fix redis-py version to 2.10.6

Signed-off-by: Qi Luo <qiluo-msft@users.noreply.github.com>
2018-11-19 12:03:15 -08:00
Ying Xie
5c8650aaaa [swss service] don't clear WARM_RESTART table (#2256)
Clear WARM_RESTART table could cause component level warm restart to
fail due to missing WARM_RESTART state.

Signed-off-by: Ying Xie <ying.xie@microsoft.com>
2018-11-15 22:04:53 -08:00
Ying Xie
8598ccaf84
[syncd] extend syncd service script to support both warm/cold shutdown (#2238)
- cold shutdown is used by regular service stop and/or fast reboot
- warm shutdown is used by warm restart and/or warm reboot

Signed-off-by: Ying Xie <ying.xie@microsoft.com>
2018-11-15 15:47:33 -08:00
Joe LeVeque
f126000cc9
[sudoers] Add 'SONIC_CLI_IFACE_MODE' to env_keep to ensure variable is made available to sudo calls (#2249) 2018-11-15 15:16:06 -08:00
stepanblyschak
447ae7b61a [mlnx] Fix fast reboot (#2237)
Signed-off-by: Stepan Blyschak <stepanb@mellanox.com>
2018-11-09 21:54:20 -08:00
Ying Xie
914d5c7451 [warm boot] restore log level DB during warm reboot (#2233)
Signed-off-by: Ying Xie <ying.xie@microsoft.com>
2018-11-07 21:59:24 -08:00
Shuotian Cheng
110355201b [swss]: Update swss.sh script to clean up specific db when start (#2223)
This script shall not flush all the entries in the state database
when it starts up, since there are entries maintained and written
by other processes outside this docker.

The issue we noticed was that the portchannel states are cleaned
up after teamsyncd writes the entries into the database, which
causes the IPs failed to be configured because intfmgrd considers
the portchannels are not ready yet.

Signed-off-by: Shu0T1an ChenG <shuche@microsoft.com>
2018-11-03 12:32:46 -07:00
Qi Luo
8b67424101 Warm reboot: restore the database docker with content saved (#2216)
* Database service warm start

Signed-off-by: Qi Luo <qiluo-msft@users.noreply.github.com>

* Update files/build_templates/docker_image_ctl.j2

Co-Authored-By: qiluo-msft <qiluo-msft@users.noreply.github.com>

* Update files/build_templates/docker_image_ctl.j2

Co-Authored-By: qiluo-msft <qiluo-msft@users.noreply.github.com>

* Update files/build_templates/docker_image_ctl.j2

Co-Authored-By: qiluo-msft <qiluo-msft@users.noreply.github.com>

* Fix sudo, and exit immediately if any failure

Signed-off-by: Qi Luo <qiluo-msft@users.noreply.github.com>

* Fix syntax

Signed-off-by: Qi Luo <qiluo-msft@users.noreply.github.com>

* Fix redisLoadAndDelete argument, and refactor

Signed-off-by: Qi Luo <qiluo-msft@users.noreply.github.com>

* Fix: sudo, ping through unix socket

Signed-off-by: Qi Luo <qiluo-msft@users.noreply.github.com>
2018-11-02 07:20:07 -07:00
Ying Xie
5cff136951 [console speed] lock console speed to start up speed (#1734)
Auto negotiating console speed could cause sonic to lock on a wrong
speed under rare conditions. The only way to come out of the wrong
speed is to issue line break or restart console service with forced
speed, or reboot sonic.

Lock down the console speed to avoid these situations.

Signed-off-by: Ying Xie <ying.xie@microsoft.com>
2018-11-01 15:12:22 -07:00
Taoyu Li
2897686de8
[updategraph] Use empty configuration when DHCP graphurl option is missing (#2185) 2018-10-29 12:16:00 -07:00
Joe LeVeque
1e1add90f9
Remove Arista-specific service ACL solution; All platforms now use caclmgrd (#2202) 2018-10-29 10:25:18 -07:00
Wenda Ni
09ae9a8965 In the case of upgrade, have pfcwd enabled on the upgraded sonic (#2192)
Signed-off-by: Wenda <wenni@microsoft.com>
2018-10-26 09:13:45 -07:00
Shuotian Cheng
7313e7d9bc [teamd]: Add teammgrd in docker-teamd (#2064)
Remove the teamd.j2 templates used for starting the teamd. Add
teammgrd instead to manage all port channel related configuration
changes. Remove front panel port related configurations in
interfaces.j2 templates as well.

Remove teamd.sh script and use teammgrd to start all the teamd
processes. Remove all the logics in the start.sh script as well.

Update the sonic-swss submodule.

Signed-off-by: Shu0T1an ChenG <shuche@microsoft.com>
2018-10-19 03:41:53 -07:00
Taoyu Li
2a24a303ec [tacplus nss conf] tacplus should be before compat (#2163) 2018-10-18 12:42:24 -07:00
Wenda Ni
77652c55fd [QoS]: Unify qos json by using qos_config.j2 template (#2023)
* Unify qos config with qos_config.j2 template

Signed-off-by: Wenda <wenni@microsoft.com>

* Change 7050 to use qos config template

Signed-off-by: Wenda <wenni@microsoft.com>

	modified:   device/arista/x86_64-arista_7050_qx32/Arista-7050-QX32/qos.json.j2
	modified:   device/arista/x86_64-arista_7050_qx32s/Arista-7050-QX-32S/qos.json.j2

* Change a7060, a7260, s6000, s6100, z9100  to use qos config template

Signed-off-by: Wenda <wenni@microsoft.com>

* Change mlnx devices to use qos config template

Signed-off-by: Wenda <wenni@microsoft.com>

	modified:   ../../../mellanox/x86_64-mlnx_msn2100-r0/ACS-MSN2100/qos.json.j2
	modified:   ../../../mellanox/x86_64-mlnx_msn2410-r0/ACS-MSN2410/qos.json.j2
	modified:   ../../../mellanox/x86_64-mlnx_msn2700-r0/ACS-MSN2700/qos.json.j2
	modified:   ../../../mellanox/x86_64-mlnx_msn2700-r0/Mellanox-SN2700-D48C8/qos.json.j2

* Change barefoot devices to use qos config template

Signed-off-by: Wenda <wenni@microsoft.com>

	modified:   barefoot/x86_64-accton_wedge100bf_32x-r0/montara/qos.json.j2
	modified:   barefoot/x86_64-accton_wedge100bf_65x-r0/mavericks/qos.json.j2

* Change accton as7212 to use qos config template

Signed-off-by: Wenda <wenni@microsoft.com>

	modified:   accton/x86_64-accton_as7212_54x-r0/AS7212-54x/qos.json.j2

* Apply PORT_QOS_MAP to active ports only

Signed-off-by: Wenda <wenni@microsoft.com>

* Update qos config test with qos_config.j2 template

Signed-off-by: Wenda <wenni@microsoft.com>

* Update sample output of qos-dell6100.json

Signed-off-by: Wenda <wenni@microsoft.com>

* Remove generating the default port name and index list, i.e., remove the generate_port_lists macro, because PORT is always defined

Signed-off-by: Wenda <wenni@microsoft.com>

* Include pfc_to_pg_map according to platform asic type obtained from
/etc/sonic/sonic_version.yml rather than specifying per hwsku

Signed-off-by: Wenda Ni <wenni@microsoft.com>

* Customize TC_TO_PRIORITY_GROUP_MAP and
PFC_PRIORITY_TO_PRIORITY_GROUP_MAP for barefoot

Signed-off-by: Wenda <wenni@microsoft.com>

* Unify PFC_PRIORITY_TO_PRIORITY_GROUP_MAP: remove "0":"0", "1":"1" as
these two pgs do not generate PFC frames.

Signed-off-by: Wenda <wenni@microsoft.com>
2018-10-17 14:10:34 -07:00
Ying Xie
f3ab8cdf9a [warm boot] syncd warm start could be individual warm start (#2147)
Signed-off-by: Ying Xie <ying.xie@microsoft.com>
2018-10-16 11:20:39 -07:00
Qi Luo
7d51f8363f Fix bug: if all containers killed, service stop will throw exception because no redis (#2139)
Signed-off-by: Qi Luo <qiluo-msft@users.noreply.github.com>
2018-10-12 08:39:06 -07:00
Joe LeVeque
f047756d7b [image config] Install Python tabulate library v0.8.2 via pip (#2130) 2018-10-08 18:36:37 -07:00
Kevin(Shengkai) Wang
ea4b4bd650 [mellanox]: Update recipe for hw-mgmt according to latest changes (#2128)
Update the hw-mgmt to latest release V.2.0.0060.
Update the related files according to the latest hw-mgmt.

Signed-off-by: Kevin Wang <kevinw@mellanox.com>
2018-10-08 18:33:44 -07:00
Samuel Angebault
6ba2f97f1e [devices]: Align flash partition at 1M (#2104)
Flashes used for the 7050QX-32 and 7050QX-32S have a fw issue.
The best option to solve the problem is to upgrade to a newer firmware.
However this can only be done while in memory and take 10 seconds.
Adding an upgrade mechanism is possible but would need more
consideration as flashing the firmware and reformating the flash will
exceed the fast-reboot requirements.

A quick mitigation is to align the ext4 partition that we create on
these vfat based system on a 4k boundary.
Here we chose 1M instead but it's the same.
Newer version of sfdisk do this automatically but the one in SONiC
today doesn't have this behavior.

This workaround will only reduce the pace of the flash health
degradation. The only long term fix is to flash the firmware.
2018-10-02 06:10:12 -07:00
Jipan Yang
dedd5624a0 Adapt to the new WARM_RESTART_TABLE table schema: change from restart… (#2083)
* Adapt to the new WARM_RESTART_TABLE table schema: change from restart_count to restore_count

Signed-off-by: Jipan Yang <jipan.yang@alibaba-inc.com>

* Update variable and function name to match restore_count name change

Signed-off-by: Jipan Yang <jipan.yang@alibaba-inc.com>

* Update swss submodule for warm restart schema change

Signed-off-by: Jipan Yang <jipan.yang@alibaba-inc.com>
2018-10-02 06:08:26 -07:00
Ying Xie
c8e6b15504
[syncd] warn shutdown syncd process when warm boot is enabled (#2078)
* [syncd] warn shutdown syncd process when warm boot is enabled

Signed-off-by: Ying Xie <ying.xie@microsoft.com>

* [warmboot] mount folder to hold warmboot temporary files

Signed-off-by: Ying Xie <ying.xie@microsoft.com>

* Fix a typo
2018-10-01 19:01:04 -07:00
Samuel Angebault
e72d63cf92 [arista] Update Arista drivers submodule (#2097)
* [arista] Update Arista drivers submodule

* Fix 7050qx32 fancontrol for kernel 4.9

* Fix 7060cx32s fancontrol for kernel 4.9

* Install python3-yaml for sonic config tests

* Fix 7260cx3 fancontrol for kernel 4.9

* Fix hwsku-init scripts and permissions

* Preserve old_config folder in boot0
2018-09-28 21:27:41 -07:00
Ying Xie
cfe01f19e4
Separate syncd service from swss service (#2051)
* [swss.sh] refactor ssh service script code

- Move checks and waits to helper functions.
- Remove early returns from code stream

Signed-off-by: Ying Xie <ying.xie@microsoft.com>

* [swss.sh] Add debug log for service state changes

Signed-off-by: Ying Xie <ying.xie@microsoft.com>

* [syncd] Separate out syncd service from swss service

Still make them start/stop/restart synchronously so existing scripts
continue working.

Signed-off-by: Ying Xie <ying.xie@microsoft.com>

* Remove extra 'After' in swss service and remove syncd docker warm boot code

Syncd warm boot needs more thinking, we can put it back once the work
flow has been defined and ready for coding/testing.

* [syncd] syncd start/stop/restart shouldn't affect swss state

Semi-detach syncd service state change from swss:

- swss state change still chase syncd service to follow except warm boot
- syncd state change will only affect itself.

Signed-off-by: Ying Xie <ying.xie@microsoft.com>

* add missing '{'
2018-09-24 16:35:01 -07:00
Taoyu Li
018b5899be [updategraph] add support to use preset config instead of default minigraph (#2050)
* [updategraph] add support to use preset config instead of default minigraph

* Fix variable case

* Remove default minigraph case

* Remove default minigraphs and add default_sku files
2018-09-21 22:01:10 -07:00
Samuel Angebault
7ece396592 Add SWI_DEFAULT support in boot0 (#2056)
Currently setting the next boot image is the same as setting a default
image.
With this change SWI_DEFAULT= will be considered the default image and
SWI= the next image.
When executing the boot0 SWI= will be overriden by SWI_DEFAULT= if it
exists and create in with the value of SWI= otherwise.
2018-09-20 00:19:40 -07:00
Taoyu Li
47c9542c63 Don't reuse init_cfg.json from old image during upgrade (#2036) 2018-09-11 21:26:51 -07:00
Jipan Yang
3f37b96de6 [swss]: Add support for swss docker warm restart (#1982)
Signed-off-by: Jipan Yang <jipan.yang@alibaba-inc.com>
2018-08-25 01:39:09 -07:00
lguohan
83f0822dde
[build]: run docker info at later stage in the build (#1984)
wait till docker service started

Signed-off-by: Guohan Lu <gulv@microsoft.com>
2018-08-24 10:33:56 -07:00
lguohan
80c6453731
[swss]: simplify swss systemd service file (#1965)
move the swss service start/stop logic into /usr/local/bin/swss.sh

Signed-off-by: Guohan Lu <gulv@microsoft.com>
2018-08-22 13:02:32 -07:00
yurypm
de0e892eaa [arista] Fix arista-convertfs script (#1961) 2018-08-21 15:19:31 -07:00
Samuel Angebault
788b20ee12 [arista]: Fix mount point discovery in boot0 for overlayfs (#1959)
On overlay filesystem the name of the mountpoint will also match in the
mount command for overlayfs as upperdir=
To prevent detecting the wrong partition we now look for space before.
This ensure that we match mountpoint and not devices in df and mount
outputs.
2018-08-21 00:58:16 -07:00
Shuotian Cheng
9413fa9a7b
[interfaces]: Move IP/MTU information from interfaces file into database (#1908)
- Move front panel ports and port channels MTU and IP configurations out of
the current /etc/network/interfaces file and store them in the configuration
database.

- The default MTU value for both front panel ports and the port channels is
9100. They are set via the minigraph or 9100 by default.

- Introduce portmgrd which will pick up the MTU configurations from the
configuration database.

- The updated intfmgrd will pick up IP address changes from the configuration
database.

- Update sonic-swss submodule

Signed-off-by: Shu0T1an ChenG <shuche@microsoft.com>
2018-08-20 11:19:16 -07:00
Joe LeVeque
98082d56a0 [baseimage]: Download picocom version 3.1-2 from stretch-backports; No longer build from source (#1946) 2018-08-17 17:38:20 -07:00
lguohan
38f3eba695
[kernel]: upgrade kernel to 4.9.0-7 (4.9.110-3+deb9u1) (#1922)
* [kernel]: upgrade kernel to 4.9.0-7 (4.9.110-3+deb9u1)

Signed-off-by: Guohan Lu <gulv@microsoft.com>

* [mellanox]: Update SDK pointer for 4.9.0-7 kernel (#44)

Signed-off-by: Volodymyr Samotiy <volodymyrs@mellanox.com>

* Update arista drivers for 4.9.0-7 linux kernel (#43)
2018-08-16 08:56:56 -07:00
cawand
9f545456c9 Added picocom and pexpect to base image, for use in consutil (#1935)
Signed-off-by: Cayla Wanderman-Milne <t-cawand@microsoft.com>
2018-08-15 21:41:12 -07:00
Volodymyr Samotiy
746ad967a4 [mellanox]: Fix post stop action in swss service template (#1928)
Signed-off-by: Volodymyr Samotiy <volodymyrs@mellanox.com>
2018-08-14 11:46:01 -07:00
lguohan
f3ca7c422f
[rsyslog]: use # to separate container name and program name in syslog message (#1918)
Previously use / to separate container name and program name.

However, in rsyslogd:

Precisely, the programname is terminated by either (whichever occurs first):

end of tag
nonprintable character
‘:’
‘[‘
‘/’
The above definition has been taken from the FreeBSD syslogd sources.

Signed-off-by: Guohan Lu <gulv@microsoft.com>
2018-08-12 22:23:58 -07:00
zhenggen-xu
d761630f73 Fix potential blackholing/looping traffic when link-local was used and refresh ipv6 neighbor to avoid CPU hit (#1904)
* Fix potential blackholing/looping traffic and refresh ipv6 neighbor to avoid CPU hit

In case ipv6 global addresses were configured on L3 interfaces and used for peering,
and routing protocol was using link-local addresses on the same interfaces as prefered nexthops,
the link-local addresses could be aged out after a while due to no activities towards the link-local
addresses themselves. And when we receive new routes with the link-local nexthops, SONiC won't insert
them to the HW, and thus cause looping or blackholing traffic.

Global ipv6 addresses on L3 interfaces between switches are refreshed by BGP keeplive and other messages.

On server facing side, traffic may hit fowarding plane only, and no refresh for the ipv6 neighbor entries regularly.
This could age-out the linux kernel ipv6 neighbor entries, and HW neighbor table entries could be removed,
and thus traffic going to those neighbors would hit CPU, and cause traffic drop and temperary CPU high load.

Also, if link-local addresses were not learned, we may not get them at all later.

It is intended to fix all above issues.

Changes:
Add ndisc6 package in swss docker and use it for ipv6 ndp ping to update the neighbors' state on Vlan interfaces
Change the default ipv6 neighbor reachable timer to 30mins
Add periodical ipv6 multicast ping to ff02::11 to get/refresh link-local neighbor info.

* Fix review comments:
Add PORTCHANNEL_INTERFACE interface for ipv6 multicast ping
format issue

* Combine regular L3 interface and portchannel interface for looping

* Add ndisc6 package to vs docker
2018-08-12 03:14:55 -07:00
Guohan Lu
7f7a2a019e [sshd]: regenerate ssh key if ssh_host_rsa_key is not present
ssh_host_key is removed in debian stretch. Use ssh_host_rsa_key
to decide if the host keys are present.

Signed-off-by: Guohan Lu <gulv@microsoft.com>
2018-08-11 21:38:33 +00:00
Volodymyr Samotiy
6a3c05f498 [mellanox]: Update recipe for hw-mgmt according to latest changes (#40)
Signed-off-by: Volodymyr Samotiy <volodymyrs@mellanox.com>
2018-08-11 09:09:03 +00:00
Guohan Lu
46b0847339 [baseimage]: use original stretch bash in the base image 2018-08-11 09:09:03 +00:00
paavaanan
ecfca8bf23 [devices]: DellEMC new platform support for z9264f - 64x100 (#26)
* Added new platform support DellEMC - Z92264f - 64x100

* Includes changes with Makefiles, sfputil, eeprom and default minigraph

* Led support for Z9264f platform

* Includes changes on default minigraph

* ipmitool implementation in pmon docker. platform_sensors script is inclued in pmon startup
2018-08-11 09:09:03 +00:00
Samuel Angebault
0f0e7ab7e8 Add support 4.9 support for 7260CX3 (#34) 2018-08-11 09:09:03 +00:00
Samuel Angebault
764a7edd83 [device]: Enable arista drivers for sonic-linux-kernel 4.9 (#21)
* Enable arista drivers

* Add vfat ascii charset in initramfs

* Update boot0.j2 for 4.9 kernel

* Fix i2c offsets in sensors.conf

* Bump sonic-platform-modules-arista submodule
2018-08-11 09:09:03 +00:00
lguohan
35ab7a6e09 [kernel]: upgrade linux kernel to 4.9.0-5 (4.9.65-3+deb9u2) (#8) 2018-08-11 09:09:03 +00:00
Guohan Lu
0827ed3b1a [baseimage]: install tacacs dependencies 2018-08-11 09:07:59 +00:00
Guohan Lu
540a87a9db [opennsl]: use opennsl kernel module based on kernel 4.9.0-3 2018-08-11 09:07:59 +00:00
Guohan Lu
8c72d8c6f2 [build]: insert overlay kmod for base image build 2018-08-11 09:07:59 +00:00
Guohan Lu
b03e974bb3 [baseimage]: let docker in base image use overlay fs instead of aufs 2018-08-11 09:07:59 +00:00
Guohan Lu
f64ffe8571 [baseimage]: build root filesystem via overlay fs instead of aufs 2018-08-11 09:07:59 +00:00
Guohan Lu
4d701ad037 [baseimage]: update base image from jessie to stretch 2018-08-11 09:07:59 +00:00
Joe LeVeque
7aefa185d4 Download newer version (8.23.0-2) of rsyslog from jessie-backports in hopes of eliminating memory leaks (#1912) 2018-08-09 23:56:41 -07:00
Taoyu Li
530e2dc4e1
Only keep most recent one in old_config (#1884) 2018-07-31 12:50:54 -07:00
Sagar Balani
5011622c6f [platform]: bfn intf: allow-hotplug for usb0 interface (#1889) 2018-07-30 09:54:05 -07:00
Rodny Molina
c3c8f7fd7f Fix for bash's memory-leak (#1879)
* Fix for bash's memory-leak

Memory leak is observed during the execution of scripts that make use of bash-arrays. In scenarios where the offending script is executed on a regular basis (e.g. fancontrol), the leaking process may end up consuming most of the system resources.

In this PR i'm replacing bash in all the contexts where it executes (both host and dockers). The official patch for this issue is here: https://ftp.gnu.org/gnu/bash/bash-4.3-patches/bash43-040

* Fixing minor issue during code-merge

Signed-off-by: Rodny Molina <rmolina@linkedin.com>
2018-07-27 17:46:33 -07:00
pavel-shirshov
10b4bbcae8 [swss]: Start counter from swss container (#1875)
* sonic-quagga update. Don't spam with 'Vtysh connected from' message

* Enable counters inside swss container. systemd is not flexible enough to follow our business rules
2018-07-26 13:39:08 -07:00
Qi Luo
e3abf0c070
[docker] Exit if docker run fails (#1870) 2018-07-24 21:46:55 -07:00