Commit Graph

309 Commits

Author SHA1 Message Date
stepanblyschak
ff526dd103 [mellanox|ffb] use system level warm reboot for Mellanox fastfast boot (#2374)
* [mellanox|ffb] use system level warm reboot for Mellanox fastfast boot

Signed-off-by: Stepan Blyschak <stepanb@mellanox.com>

* [mellanox|ffb] add comments for mellanox start/stop drivers section

Signed-off-by: Stepan Blyschak <stepanb@mellanox.com>
2019-01-10 14:09:03 -08:00
lguohan
b57a376622
[docker-engine]: upgrade docker engine to 18.09 (#2417)
* [docker-engine]: upgrade docker engine to 18.09
2019-01-04 20:47:43 -08:00
Volodymyr Samotiy
b506241b84 [syncd]: Fix reload flow for Mellanox platforms (#2386)
* Perform stop/start of Mellanox driver tools for all types of reboot
* Don't set Mellanox FAST_BOOT option for "cold" reboot
* Don't send "syncd_request_shutdown" event for "cold" reboot on Mellanox platforms

Signed-off-by: Volodymyr Samotiy <volodymyrs@mellanox.com>
2018-12-15 11:36:12 -08:00
zhenggen-xu
f093ef2a9f [security kernel] Upgrade kernel from 4.9.110-3+deb9u2 to 4.9.110-3+deb9u6 (#2367)
* [security kernel] Upgrade kernel from 4.9.110-3+deb9u2 to 4.9.110-3+deb9u6
short version: 4.9.0-7 to 4.9.0-8

See changelogs for security fixes:
https://tracker.debian.org/media/packages/l/linux/changelog-4.9.110-3deb9u6

Signed-off-by: Zhenggen Xu <zxu@linkedin.com>

* Update sonic-linux-kernel submodule after it was merged

Signed-off-by: Zhenggen Xu <zxu@linkedin.com>
2018-12-11 04:17:17 -08:00
Samuel Angebault
6c7bcf5067 [device/Arista] fix small issue for the 7170 (#2373)
* Fix boot0 install on vfat

* Only display the hook name in boot0

Instead of printing the entire path

* Update arista driver submodule
2018-12-11 04:14:46 -08:00
Ying Xie
6ba93acd9c
[update graph] adapt to warm reboot scenario (#2353)
* [update graph] adapt to warm reboot scenario

When migrating configuration, always copy config files from old_config
to /etc/sonic. But if warm reboot is detected, then skip configuration
operations.

Signed-off-by: Ying Xie <ying.xie@microsoft.com>

* log file copies and misses
2018-12-06 10:24:50 -08:00
Wenda Ni
f5e678cf84 Port QoS & buffer changes in 0330 to master (#2239)
* 1) DSCP 46 to 5; 2) ecn config for lossless traffic; 3) ecn on by default; 4) DWRR equal weight;

Signed-off-by: Wenda <wenni@microsoft.com>

* 1) link pg & queue 5 to lossy buffer profile; 2) ingress lossless alpha 1/8

Signed-off-by: Wenda <wenni@microsoft.com>

* Update the test case for qos & buffer json template

Signed-off-by: Wenda <wenni@microsoft.com>

* Migrate a7050-qx32 and s6000 to use pg_profile lookup architecture

Signed-off-by: Wenda <wenni@microsoft.com>

* Update pg headroom egress service pool for a7050-qx-32s, a7050-qx32, and s6000

Signed-off-by: Wenda <wenni@microsoft.com>

* Link queue 5 to lossy profile

Signed-off-by: Wenda <wenni@microsoft.com>
2018-12-04 20:51:55 -08:00
kannankvs
a9a7ce1091 tacacs management vrf changes (#2217) 2018-12-04 10:22:48 -08:00
Volodymyr Samotiy
75b41233d2 [Mellanox|FFB]: Add support for Mellanox fast-fast boot (#2294)
* [mlnx|ffb] Add support for mellanox fast-fast boot

Signed-off-by: Stepan Blyschak <stepanb@mellanox.com>

* [mlnx|ffb]: Add support of "config end" event for mlnx fast-fast boot

Signed-off-by: Volodymyr Samotiy <volodymyrs@mellanox.com>

* [Mellanox|FFB]: Fix review comments

* Change naming convention from "fast-fast" to "fastfast"

Signed-off-by: Volodymyr Samotiy <volodymyrs@mellanox.com>
2018-12-04 10:11:24 -08:00
Samuel Angebault
989b60059b [device/arista]: Update (#2336)
* Update arista drivers submodule

* Ignore the possible timestamp warning in tar extraction

* Add verbosity toggle to boot0

Console logging is slow because of the 9600 baud rate.
Some time can be saved by decreasing the console verbosity.

* Add hook mechanism in boot0.

Support additional features in boot0 via hooks.
Hooks are unpacked and executed at post-install or pre-exec time.

* Fix 7170 sensors.conf file

Fix critical temperature settings for MAX6658 sensors

* Fix the random swap of storage devices

For arista 7050 switches running with linux 4.9, it is likely the device
name of flash drive (/dev/sda) and usb (/dev/sdb) randomly swap in kernel
booting, depending on which one is ready first. It breaks the expectation
that flash will be mounted as root by setting root=/dev/sda1. This patch
will correct ROOT to flash device refering to the path under block_flash.

* Fix 7170 fancontrol

* Do not remove aquota.user file in boot0

This file is a filesystem protected file used by EOS.
It can be simply removed and will make the SONiC installation failed if
not skipped.
2018-12-04 10:08:55 -08:00
Taoyu Li
aedfd6e708 [sonic-cfggen] Multi-key should be in form of (a,b) instead of 'a|b' (#2337) 2018-12-04 10:07:44 -08:00
Joe LeVeque
298d2ad8f4
[boot] Refactor: All services which start Docker containers start before ntp-config service (#2335) 2018-12-03 16:01:44 -08:00
Ying Xie
84bde1511a
[sonic boot] disable dhcp during boot up, until updategraph service is running (#2316)
* [sonic] disable management port eth0 during boot up

Signed-off-by: Ying Xie <ying.xie@microsoft.com>

* [updategraph] enable dhcp client on management port eth0

Signed-off-by: Ying Xie <ying.xie@microsoft.com>
2018-11-29 08:34:22 -08:00
Joe LeVeque
d1c9b0cb77 [boot] Start ntp-config service after all Docker containers are started (#2303) 2018-11-28 00:12:03 -08:00
Ying Xie
ce60c53933
[build image] copy init_interfaces to interfaces (#2302)
init_interfaces meant to be sonic init interfaces configuration file.
However, it needs to be copied to the right file name to take effect.

Signed-off-by: Ying Xie <ying.xie@microsoft.com>
2018-11-27 14:35:17 -08:00
Nikos
7056b49af7 Routing application split config support (#2286)
* Routing application split config support

Signed-off-by: nikos <ntriantafillis@gmail.com>

* Routing application split config support
Routing application split config support

Signed-off-by: nikos <Nikos Triantafillis>
2018-11-26 18:19:12 -08:00
zzhiyuan
f0540e7381 Fix networking.service waiting for udevadm settle (#2295)
There was a fix to speed up initialization when networking used init.d
but it did not carry over to systemd networking.service. This fix will
apply the same change on the systemd service.

The result is much less time spent being blocked in networking.service.
2018-11-23 17:06:23 -08:00
Qi Luo
c2ae736f2e [warmboot] Load database from redis-cli save (#2287)
* [warmboot] Load database from `redis-cli save`

Signed-off-by: Qi Luo <qiluo-msft@users.noreply.github.com>

* Add trivial statement to make bash function valid

Signed-off-by: Qi Luo <qiluo-msft@users.noreply.github.com>

* Update submodule sonic-utilities: Use 'redis-cli save' to dump database to file

Signed-off-by: Qi Luo <qiluo-msft@users.noreply.github.com>

* Move configdb-load.sh outside docker, and only run in cold

Signed-off-by: Qi Luo <qiluo-msft@users.noreply.github.com>

* Fix for more strict warm check

Signed-off-by: Qi Luo <qiluo-msft@users.noreply.github.com>
2018-11-22 15:13:35 -08:00
Ying Xie
4abbe43463 [syncd] skip ledinit during syncd warm start (#2285)
* [syncd] skip ledinit during syncd warm start

Signed-off-by: Ying Xie <ying.xie@microsoft.com>
2018-11-21 17:56:19 -08:00
Ying Xie
873df9d8e8
[bde driver] black list linux_kernel_bde driver (#2284)
This driver should be loaded by sonic service. If kernel tries to load
it, the driver would be loaded with default parameters, which is not
right for sonic.

Signed-off-by: Ying Xie <ying.xie@microsoft.com>
2018-11-21 08:08:37 -08:00
Qi Luo
465ebbafff
Build patched redis-dump-load (#2277)
* Build patched redis-dump-load
* Fix build
* Add build rule
2018-11-20 19:27:56 -08:00
Qi Luo
b4fd40a75e Fix redis-py version to 2.10.6 (#2273)
* Fix redis-py version

Signed-off-by: Qi Luo <qiluo-msft@users.noreply.github.com>

* Update submodule sonic-py-swsssdk: Fix redis-py version to 2.10.6

Signed-off-by: Qi Luo <qiluo-msft@users.noreply.github.com>
2018-11-19 12:03:15 -08:00
Ying Xie
5c8650aaaa [swss service] don't clear WARM_RESTART table (#2256)
Clear WARM_RESTART table could cause component level warm restart to
fail due to missing WARM_RESTART state.

Signed-off-by: Ying Xie <ying.xie@microsoft.com>
2018-11-15 22:04:53 -08:00
Ying Xie
8598ccaf84
[syncd] extend syncd service script to support both warm/cold shutdown (#2238)
- cold shutdown is used by regular service stop and/or fast reboot
- warm shutdown is used by warm restart and/or warm reboot

Signed-off-by: Ying Xie <ying.xie@microsoft.com>
2018-11-15 15:47:33 -08:00
Joe LeVeque
f126000cc9
[sudoers] Add 'SONIC_CLI_IFACE_MODE' to env_keep to ensure variable is made available to sudo calls (#2249) 2018-11-15 15:16:06 -08:00
stepanblyschak
447ae7b61a [mlnx] Fix fast reboot (#2237)
Signed-off-by: Stepan Blyschak <stepanb@mellanox.com>
2018-11-09 21:54:20 -08:00
Ying Xie
914d5c7451 [warm boot] restore log level DB during warm reboot (#2233)
Signed-off-by: Ying Xie <ying.xie@microsoft.com>
2018-11-07 21:59:24 -08:00
Shuotian Cheng
110355201b [swss]: Update swss.sh script to clean up specific db when start (#2223)
This script shall not flush all the entries in the state database
when it starts up, since there are entries maintained and written
by other processes outside this docker.

The issue we noticed was that the portchannel states are cleaned
up after teamsyncd writes the entries into the database, which
causes the IPs failed to be configured because intfmgrd considers
the portchannels are not ready yet.

Signed-off-by: Shu0T1an ChenG <shuche@microsoft.com>
2018-11-03 12:32:46 -07:00
Qi Luo
8b67424101 Warm reboot: restore the database docker with content saved (#2216)
* Database service warm start

Signed-off-by: Qi Luo <qiluo-msft@users.noreply.github.com>

* Update files/build_templates/docker_image_ctl.j2

Co-Authored-By: qiluo-msft <qiluo-msft@users.noreply.github.com>

* Update files/build_templates/docker_image_ctl.j2

Co-Authored-By: qiluo-msft <qiluo-msft@users.noreply.github.com>

* Update files/build_templates/docker_image_ctl.j2

Co-Authored-By: qiluo-msft <qiluo-msft@users.noreply.github.com>

* Fix sudo, and exit immediately if any failure

Signed-off-by: Qi Luo <qiluo-msft@users.noreply.github.com>

* Fix syntax

Signed-off-by: Qi Luo <qiluo-msft@users.noreply.github.com>

* Fix redisLoadAndDelete argument, and refactor

Signed-off-by: Qi Luo <qiluo-msft@users.noreply.github.com>

* Fix: sudo, ping through unix socket

Signed-off-by: Qi Luo <qiluo-msft@users.noreply.github.com>
2018-11-02 07:20:07 -07:00
Ying Xie
5cff136951 [console speed] lock console speed to start up speed (#1734)
Auto negotiating console speed could cause sonic to lock on a wrong
speed under rare conditions. The only way to come out of the wrong
speed is to issue line break or restart console service with forced
speed, or reboot sonic.

Lock down the console speed to avoid these situations.

Signed-off-by: Ying Xie <ying.xie@microsoft.com>
2018-11-01 15:12:22 -07:00
Taoyu Li
2897686de8
[updategraph] Use empty configuration when DHCP graphurl option is missing (#2185) 2018-10-29 12:16:00 -07:00
Joe LeVeque
1e1add90f9
Remove Arista-specific service ACL solution; All platforms now use caclmgrd (#2202) 2018-10-29 10:25:18 -07:00
Wenda Ni
09ae9a8965 In the case of upgrade, have pfcwd enabled on the upgraded sonic (#2192)
Signed-off-by: Wenda <wenni@microsoft.com>
2018-10-26 09:13:45 -07:00
Shuotian Cheng
7313e7d9bc [teamd]: Add teammgrd in docker-teamd (#2064)
Remove the teamd.j2 templates used for starting the teamd. Add
teammgrd instead to manage all port channel related configuration
changes. Remove front panel port related configurations in
interfaces.j2 templates as well.

Remove teamd.sh script and use teammgrd to start all the teamd
processes. Remove all the logics in the start.sh script as well.

Update the sonic-swss submodule.

Signed-off-by: Shu0T1an ChenG <shuche@microsoft.com>
2018-10-19 03:41:53 -07:00
Taoyu Li
2a24a303ec [tacplus nss conf] tacplus should be before compat (#2163) 2018-10-18 12:42:24 -07:00
Wenda Ni
77652c55fd [QoS]: Unify qos json by using qos_config.j2 template (#2023)
* Unify qos config with qos_config.j2 template

Signed-off-by: Wenda <wenni@microsoft.com>

* Change 7050 to use qos config template

Signed-off-by: Wenda <wenni@microsoft.com>

	modified:   device/arista/x86_64-arista_7050_qx32/Arista-7050-QX32/qos.json.j2
	modified:   device/arista/x86_64-arista_7050_qx32s/Arista-7050-QX-32S/qos.json.j2

* Change a7060, a7260, s6000, s6100, z9100  to use qos config template

Signed-off-by: Wenda <wenni@microsoft.com>

* Change mlnx devices to use qos config template

Signed-off-by: Wenda <wenni@microsoft.com>

	modified:   ../../../mellanox/x86_64-mlnx_msn2100-r0/ACS-MSN2100/qos.json.j2
	modified:   ../../../mellanox/x86_64-mlnx_msn2410-r0/ACS-MSN2410/qos.json.j2
	modified:   ../../../mellanox/x86_64-mlnx_msn2700-r0/ACS-MSN2700/qos.json.j2
	modified:   ../../../mellanox/x86_64-mlnx_msn2700-r0/Mellanox-SN2700-D48C8/qos.json.j2

* Change barefoot devices to use qos config template

Signed-off-by: Wenda <wenni@microsoft.com>

	modified:   barefoot/x86_64-accton_wedge100bf_32x-r0/montara/qos.json.j2
	modified:   barefoot/x86_64-accton_wedge100bf_65x-r0/mavericks/qos.json.j2

* Change accton as7212 to use qos config template

Signed-off-by: Wenda <wenni@microsoft.com>

	modified:   accton/x86_64-accton_as7212_54x-r0/AS7212-54x/qos.json.j2

* Apply PORT_QOS_MAP to active ports only

Signed-off-by: Wenda <wenni@microsoft.com>

* Update qos config test with qos_config.j2 template

Signed-off-by: Wenda <wenni@microsoft.com>

* Update sample output of qos-dell6100.json

Signed-off-by: Wenda <wenni@microsoft.com>

* Remove generating the default port name and index list, i.e., remove the generate_port_lists macro, because PORT is always defined

Signed-off-by: Wenda <wenni@microsoft.com>

* Include pfc_to_pg_map according to platform asic type obtained from
/etc/sonic/sonic_version.yml rather than specifying per hwsku

Signed-off-by: Wenda Ni <wenni@microsoft.com>

* Customize TC_TO_PRIORITY_GROUP_MAP and
PFC_PRIORITY_TO_PRIORITY_GROUP_MAP for barefoot

Signed-off-by: Wenda <wenni@microsoft.com>

* Unify PFC_PRIORITY_TO_PRIORITY_GROUP_MAP: remove "0":"0", "1":"1" as
these two pgs do not generate PFC frames.

Signed-off-by: Wenda <wenni@microsoft.com>
2018-10-17 14:10:34 -07:00
Ying Xie
f3ab8cdf9a [warm boot] syncd warm start could be individual warm start (#2147)
Signed-off-by: Ying Xie <ying.xie@microsoft.com>
2018-10-16 11:20:39 -07:00
Qi Luo
7d51f8363f Fix bug: if all containers killed, service stop will throw exception because no redis (#2139)
Signed-off-by: Qi Luo <qiluo-msft@users.noreply.github.com>
2018-10-12 08:39:06 -07:00
Joe LeVeque
f047756d7b [image config] Install Python tabulate library v0.8.2 via pip (#2130) 2018-10-08 18:36:37 -07:00
Kevin(Shengkai) Wang
ea4b4bd650 [mellanox]: Update recipe for hw-mgmt according to latest changes (#2128)
Update the hw-mgmt to latest release V.2.0.0060.
Update the related files according to the latest hw-mgmt.

Signed-off-by: Kevin Wang <kevinw@mellanox.com>
2018-10-08 18:33:44 -07:00
Samuel Angebault
6ba2f97f1e [devices]: Align flash partition at 1M (#2104)
Flashes used for the 7050QX-32 and 7050QX-32S have a fw issue.
The best option to solve the problem is to upgrade to a newer firmware.
However this can only be done while in memory and take 10 seconds.
Adding an upgrade mechanism is possible but would need more
consideration as flashing the firmware and reformating the flash will
exceed the fast-reboot requirements.

A quick mitigation is to align the ext4 partition that we create on
these vfat based system on a 4k boundary.
Here we chose 1M instead but it's the same.
Newer version of sfdisk do this automatically but the one in SONiC
today doesn't have this behavior.

This workaround will only reduce the pace of the flash health
degradation. The only long term fix is to flash the firmware.
2018-10-02 06:10:12 -07:00
Jipan Yang
dedd5624a0 Adapt to the new WARM_RESTART_TABLE table schema: change from restart… (#2083)
* Adapt to the new WARM_RESTART_TABLE table schema: change from restart_count to restore_count

Signed-off-by: Jipan Yang <jipan.yang@alibaba-inc.com>

* Update variable and function name to match restore_count name change

Signed-off-by: Jipan Yang <jipan.yang@alibaba-inc.com>

* Update swss submodule for warm restart schema change

Signed-off-by: Jipan Yang <jipan.yang@alibaba-inc.com>
2018-10-02 06:08:26 -07:00
Ying Xie
c8e6b15504
[syncd] warn shutdown syncd process when warm boot is enabled (#2078)
* [syncd] warn shutdown syncd process when warm boot is enabled

Signed-off-by: Ying Xie <ying.xie@microsoft.com>

* [warmboot] mount folder to hold warmboot temporary files

Signed-off-by: Ying Xie <ying.xie@microsoft.com>

* Fix a typo
2018-10-01 19:01:04 -07:00
Samuel Angebault
e72d63cf92 [arista] Update Arista drivers submodule (#2097)
* [arista] Update Arista drivers submodule

* Fix 7050qx32 fancontrol for kernel 4.9

* Fix 7060cx32s fancontrol for kernel 4.9

* Install python3-yaml for sonic config tests

* Fix 7260cx3 fancontrol for kernel 4.9

* Fix hwsku-init scripts and permissions

* Preserve old_config folder in boot0
2018-09-28 21:27:41 -07:00
Ying Xie
cfe01f19e4
Separate syncd service from swss service (#2051)
* [swss.sh] refactor ssh service script code

- Move checks and waits to helper functions.
- Remove early returns from code stream

Signed-off-by: Ying Xie <ying.xie@microsoft.com>

* [swss.sh] Add debug log for service state changes

Signed-off-by: Ying Xie <ying.xie@microsoft.com>

* [syncd] Separate out syncd service from swss service

Still make them start/stop/restart synchronously so existing scripts
continue working.

Signed-off-by: Ying Xie <ying.xie@microsoft.com>

* Remove extra 'After' in swss service and remove syncd docker warm boot code

Syncd warm boot needs more thinking, we can put it back once the work
flow has been defined and ready for coding/testing.

* [syncd] syncd start/stop/restart shouldn't affect swss state

Semi-detach syncd service state change from swss:

- swss state change still chase syncd service to follow except warm boot
- syncd state change will only affect itself.

Signed-off-by: Ying Xie <ying.xie@microsoft.com>

* add missing '{'
2018-09-24 16:35:01 -07:00
Taoyu Li
018b5899be [updategraph] add support to use preset config instead of default minigraph (#2050)
* [updategraph] add support to use preset config instead of default minigraph

* Fix variable case

* Remove default minigraph case

* Remove default minigraphs and add default_sku files
2018-09-21 22:01:10 -07:00
Samuel Angebault
7ece396592 Add SWI_DEFAULT support in boot0 (#2056)
Currently setting the next boot image is the same as setting a default
image.
With this change SWI_DEFAULT= will be considered the default image and
SWI= the next image.
When executing the boot0 SWI= will be overriden by SWI_DEFAULT= if it
exists and create in with the value of SWI= otherwise.
2018-09-20 00:19:40 -07:00
Taoyu Li
47c9542c63 Don't reuse init_cfg.json from old image during upgrade (#2036) 2018-09-11 21:26:51 -07:00
Jipan Yang
3f37b96de6 [swss]: Add support for swss docker warm restart (#1982)
Signed-off-by: Jipan Yang <jipan.yang@alibaba-inc.com>
2018-08-25 01:39:09 -07:00
lguohan
83f0822dde
[build]: run docker info at later stage in the build (#1984)
wait till docker service started

Signed-off-by: Guohan Lu <gulv@microsoft.com>
2018-08-24 10:33:56 -07:00