Commit Graph

21 Commits

Author SHA1 Message Date
andywongarista
6e0559d5fa
[Arista] Add initial support for 720DT-48S (#10656)
Added initial set of config files to allow for booting and partial traffic testing in SONiC on the 720DT-48S.

How to verify it
- Switch boots
- show interfaces status shows links up on interfaces Ethernet24-51
- Traffic flows with no errors on interfaces Ethernet24-51
2022-06-29 09:56:24 -07:00
Samuel Angebault
0c4d4ace76
[kdump] Fix OOM events in crashkernel (#6447)
A few issues where discovered with crashkernel on Arista platforms.

1) platforms using `docker_inram=on` would end up OOM in kdump environment.
This happens because the same initramfs is used by SONiC and the crashkernel.
With `docker_inram=on` the `dockerfs.tar.gz` is extracted in a `tmpfs` created for the occasion.
Since `dockerfs.tar.gz` weights more than 1.5G, it doesn't fit into the kdump environment and ends up OOM.
This OOM event can in turn trigger a panic.

2) Arista platforms with `secureboot` enabled would fail to load the crashkernel because the kernel parameter would be discarded on boot.
This happens because the `boot0` in secureboot mode is strict about kernel parameter injection.

3) The secureboot path allowlist would remove kernel crash reports.

4) The kdump service would fail on Arista products since `/boot/` is empty in `secureboot`

**- How I did it**

1) To prevent an OOM event in the crashkernel the fix is to avoid the codepaths in `union-mount` that create tmpfs and populate them. Some more codepath specific to Arista devices are also skipped to make the kdump process faster.
This relies on detecting that the initramfs is starting in a kdump environment and skipping some initialization.
The `/usr/sbin/kdump-config` tool appends a few kernel cmdline arguments when loading the crashkernel.
The most unique one is `systemd.unit=kdump-tools.service` which is used in a few initramfs hooks to set `in_kdump`.

2) To allow `kdump` to work in `secureboot` environment the cmdline generation in boot0 was slightly modified.
The codepath to load kernel parameters changed by SONiC is now running for booting in secure mode.
It was altered to prevent an append only behavior which would grow the `kernel-cmdline` at every reboot.
This ever growing behavior would lead `kexec` to fail to load the kernel due to a too long cmdline.

3) To get the kernel crash under /var/crash this path has to be added to `allowlist_paths`

4) The `/host/image-XXX/boot` folder is now populated in `secureboot` mode but not used.

**- How to verify it**

Regular boot:
 - enable kdump
 - enable docker_inram=on via kernel-params
 - reboot
 - generate a crash `echo c > /proc/sysrq-trigger`
 - before: witness OOM events on the console
 - after: crash kernel works and crash available under /var/crash

Secure boot:
 - enable kdump
 - reboot
 - generate a crash `echo c > /proc/sysrq-trigger`
 - before: witness no kdump
 - after: crash kernel works and crash available under /var/crash


Co-authored-by: Boyang Yu <byu@arista.com>
2021-02-02 01:55:09 -08:00
Baptiste Covolato
cd486a82a4
[arista/aboot]: Zero out 1st MB before repartitioning (#5220)
The first partition starting point was changed to be 1M as part of this
commit: 6ba2f97f1e. On systems that are misaligned before conversion
(partition start is the first sector), the relica partition that is
left in the first MB can cause problems in Aboot and result in corruption
of the filesystem on the new aligned partition.

Zeroing this old relica makes sure that there is nothing left of the old
partition lying around. There won't be any risk of having Aboot corrupt
the new filesystem because of the old relica.

Signed-off-by: Baptiste Covolato <baptiste@arista.com>
2020-08-22 18:46:30 -07:00
Guohan Lu
6f5ac4b282 [initramfs]: move mke2fs to /usr/local/sbin
/usr/sbin/mke2fs is now rom busybox which does not support
the operation we need
2020-04-17 04:51:51 +00:00
byu343
fb3253329e
[arista]: Fix convertfs condition for booting from EOS (#4139)
Fix the issue of incorrectly skipping the convertfs hook when fast-reboot from EOS, by adding an extra kernel cmdline param "prev_os" to differentiate fast-reboot from EOS and from SONiC.

This is because we still do disk conversion for fast reboot from eos to sonic, like format the disk.
2020-02-11 18:44:25 -08:00
lguohan
2b28d55853
[build]: enable docker in ram option for small disk device (#3279)
when device disk is small, do not unzip dockerfs.tar.gz on disk.
keep the tar file on the disk, unzip to tmpfs in the initrd phase.

enabled this for 7050-qx32

Signed-off-by: Guohan Lu <gulv@microsoft.com>
2019-08-06 23:04:00 -07:00
byu343
6add9445c8 [aboot-image]: Skip arista-hook and arista-convertfs for fast/warm-reboot (#3242) 2019-07-31 14:20:17 -07:00
Samuel Angebault
989b60059b [device/arista]: Update (#2336)
* Update arista drivers submodule

* Ignore the possible timestamp warning in tar extraction

* Add verbosity toggle to boot0

Console logging is slow because of the 9600 baud rate.
Some time can be saved by decreasing the console verbosity.

* Add hook mechanism in boot0.

Support additional features in boot0 via hooks.
Hooks are unpacked and executed at post-install or pre-exec time.

* Fix 7170 sensors.conf file

Fix critical temperature settings for MAX6658 sensors

* Fix the random swap of storage devices

For arista 7050 switches running with linux 4.9, it is likely the device
name of flash drive (/dev/sda) and usb (/dev/sdb) randomly swap in kernel
booting, depending on which one is ready first. It breaks the expectation
that flash will be mounted as root by setting root=/dev/sda1. This patch
will correct ROOT to flash device refering to the path under block_flash.

* Fix 7170 fancontrol

* Do not remove aquota.user file in boot0

This file is a filesystem protected file used by EOS.
It can be simply removed and will make the SONiC installation failed if
not skipped.
2018-12-04 10:08:55 -08:00
Samuel Angebault
6ba2f97f1e [devices]: Align flash partition at 1M (#2104)
Flashes used for the 7050QX-32 and 7050QX-32S have a fw issue.
The best option to solve the problem is to upgrade to a newer firmware.
However this can only be done while in memory and take 10 seconds.
Adding an upgrade mechanism is possible but would need more
consideration as flashing the firmware and reformating the flash will
exceed the fast-reboot requirements.

A quick mitigation is to align the ext4 partition that we create on
these vfat based system on a 4k boundary.
Here we chose 1M instead but it's the same.
Newer version of sfdisk do this automatically but the one in SONiC
today doesn't have this behavior.

This workaround will only reduce the pace of the flash health
degradation. The only long term fix is to flash the firmware.
2018-10-02 06:10:12 -07:00
yurypm
de0e892eaa [arista] Fix arista-convertfs script (#1961) 2018-08-21 15:19:31 -07:00
Samuel Angebault
0f0e7ab7e8 Add support 4.9 support for 7260CX3 (#34) 2018-08-11 09:09:03 +00:00
Samuel Angebault
26afa348ea [device] Misc fixes for Arista platforms (#1844)
* Update sensors.conf for 7050QX-32 and 7050QX-32S

These two platforms were using a previous version of a kernel driver.
The new one names the i2c buses differently.
We therefore need to rename them here.

* Fix the default minigraph for the 7050QX-32S

The interface offset is invalid which makes sonic-cfggen generate an
invalid config_db.jon in rc.local.
This config then silently makes orchagent/syncd fail.

* Use the partition on which sonic-aboot.swi is

Instead of always assuming /mnt/flash, use the partition where the image
to be installed lies.
This allow for the image to be on any partition.
2018-07-05 14:30:57 -07:00
Samuel Angebault
7f25b94378 [aboot]: Add setfacl in the initramfs (#1185)
Arista platforms need the filesystem ACLs to be removed on boot to
prevent invalid permission to be set for new files.
2017-11-24 17:30:11 -08:00
Samuel Angebault
ca214b947c [arista]: Bump sonic-platform-modules-arista submodule (#1111)
* Bump sonic-platform-modules-arista

Improves i2c performance for xcvrs
Fix the led_plugin by ignoring unknown ports
Miscellaneous improvements

* Fix index column for Arista-7260CX3-D108C8

* Fix flash permissions for Arista platforms

The ext4 flash uses acl to properly handle permissions in EOS.
Aboot isn't built with this support and therefore can't be used
to set the flash permissions. It has to be deferred in sonic initrd.
2017-11-03 15:22:05 -07:00
lguohan
116ba4b180 [baseimage]: allocate varlog disk in the initramfs stage (#936)
moving to initramfs unifies disk allocate on different platforms.
use fallocate instead of dd to speed up the disk allocation.

By default, mkfs.ext4 has -E discard option which discards the blocks
at the mkfs time, also speed up the initialization time.
2017-09-06 20:07:32 -07:00
Samuel Angebault
97e4360d9b [platform] Add support for Arista DCS-7260CX3-64 (#863)
* Update sonic-platform-modules-arista submodule

* Update boot0 to handle DCS-7260CX3-64

* Add sys eeprom plugin for DCS-7260CX3-64

* Add sfputil plugin for DCS-7260CX3-64

* Add sensors config for DCS-7260CX3-64

* Add Arista-7260CX3-64 HwSku port_config

* Handle slow flash partition re-read

* Add minigraph.xml for DCS-7260CX3-64 64x100G
2017-08-05 20:56:32 -07:00
Samuel Angebault
7d33387e7c [platform] Complete support for Arista-7050QX-32S (#661)
* Bump sonic-platform-modules-arista submodule

* Use sonic_sfputil plugin from the arista library

* Fix undefined variable varlog_size

* Prevent minigraph.xml to be removed from the flash

* Update DCS-7050QX-32 sensors config
2017-06-02 01:31:53 -07:00
lguohan
34adb715df [oneimage]: use loop variable to get image dir in /proc/cmdline for aboot image (#534) 2017-04-24 13:39:29 -07:00
Andriy Moroz
b549adc36c [image]: SONiC-to-SONiC update (#464) 2017-04-21 17:23:36 -07:00
lguohan
1458e9ea6b [aboot]: add varlog limit file in aboot image (#487)
* [aboot]: add varlog limit file in aboot image
2017-04-07 15:28:30 -07:00
lguohan
8826beb597 [docker]: change hardcoded value to DOCKERFS_DIR for docker directory on the disk (#269) 2017-02-06 08:17:16 -08:00