Why I did it
After docker_inram is enabled, the docker folder's default max size is 1.5G.
It's not big enough for some tests which need to install additional docker images or install extra packages.
Work item tracking
Microsoft ADO 24199761:
How I did it
add docker_inram into cmdline_allowlist
How to verify it
sudo sh -c 'echo "docker_inram_size=3000M" >> kernel-cmdline-append'
sudo reboot and check the docker folder size
* Re-add 127.0.0.1/8 when bringing down the interfaces
With #5353, 127.0.0.1/16 was added to the lo interface, and then
127.0.0.1/8 was removed. However, when bringing down the lo interface,
like during a config reload, 127.0.0.1/16 gets removed, but 127.0.0.1/8
isn't added back to the interface. This means that there's a period of
time where 127.0.0.1 is not available at all, and services that need to
connect to 127.0.01 (such as for redis DB) will fail.
To fix this, when going down, add 127.0.0.1/8. Add this address before
the existing configuration gets removed, so that 127.0.0.1 is available
at all times.
Note that running `ifdown lo` doesn't actually bring down the loopback
interface; the interface always stays "physically" up.
Signed-off-by: Saikrishna Arcot <sarcot@microsoft.com>
Why I did it
Update cable length for uplink/downlink ports for chassis and and update PG/pool headroom size accordingly.
Work item tracking
17880812
How I did it
Updated cable length as well as buffer config in HWSKU files.
* [Arista] Fix boot0 code for docker_inram
Enable docker_inram for all systems with 4GB or less of flash.
This is mandatory to allow these systems to store 2 SONiC images.
This change also fixes the missing docker_inram attribute when
installing a new image from SONiC.
Because the SWI image can ship with additional kernel parameters within
such as `sonic_fips=` this lead to a conflict.
To prevent the conflict, the extra kernel parameters from the SWI are
now stored in the file `kernel-cmdline-append` which isn't used anywhere.
* Add optional zram compression for docker_inram
Some devices running SONiC have a small storage device (2G and 4G mainly)
The SONiC image growth over time has made it impossible to install
2 images on a single device.
Some mitigations have been implemented in the past for some devices but
there is a need to do more.
One such mitigation is `docker_inram` which creates a `tmpfs` and
extracts `dockerfs.tar.gz` in it.
This all happens in the SONiC initramfs and by ensuring the installation
process does not extract `dockerfs.tar.gz` on the flash but keep the file as is.
This mitigation does a tradeoff by using more RAM to reduce the disk footprint.
It however creates new issues for devices with 4G of system memory since
the extracted `dockerfs.tar.gz` nears the 1.6G.
Considering debian upgrades (with dual base images) and the continuous
stream of features this is only going to get bigger.
This change introduces an alternative to the `tmpfs` by allowing a system
to extract the `dockerfs.tar.gz` inside a `zram` device thus bringing
compression in play at the detriment of performance.
Introduce 2 new optional kernel parameters to be consumed by SONiC initramfs.
- `docker_inram_size` which represent the max physical size of the
`zram` or `tmpfs` volume (defaults to DOCKER_RAMFS_SIZE)
- `docker_inram_algo` which is the method to use to extract the
`dockerfs.tar.gz` (defaults to `tmpfs`)
other values are considered to be compression algorithm for `zram`
(e.g `zstd`, `zlo-rle`, `lz4`)
Refactored the logic to mount the docker fs in the SONiC initramfs under
the `union-mount` script.
Moved the code into a function to make it cleaner and separated the
inram volume creation and docker extraction.
On Arista platform with a flash smaller or equal to 4GB set
`docker_inram_algo` to `zstd` which produces the best compression ratio
at the detriment of a slower write performance and a similar read
performance to other `zram` compression algorithms.
Why I did it
Fixes#14179
chassis-packet: missing arp entries for static routes causing high orchagent cpu usage
It is observed that some sonic-mgmt test case calls sonic-clear arp, which clears the static arp entries as well. Orchagent or arp_update process does not try to resolve the missing arp entries after clear.
How I did it
arp_update should resolve the missing arp/ndp static route
entries. Added code to check for missing entries and try ping if any
found to resolve it.
How to verify it
After boot or config reload, check ipv4 and ipv4 neigh entries to make sure all static route entries are present
manual validation:
Use sonic-clear arp and sonic-clear ndp to clear all neighbor entries
run arp_update
Check for neigh entries. All entries should be present.
Testing on T0 setup route/for test_static_route.py
The test set the STATIC_ROUTE entry in conifg db without ifname:
sonic-db-cli CONFIG_DB hmset 'STATIC_ROUTE|2.2.2.0/24' nexthop 192.168.0.18,192.168.0.25,192.168.0.23
"STATIC_ROUTE": {
"2.2.2.0/24": {
"nexthop": "192.168.0.18,192.168.0.25,192.168.0.23"
}
},
Validate that the arp_update gets the proper ARP_UPDATE_VARDS using arp_update_vars.j2 template from config db and does not crash:
{ "switch_type": "", "interface": "", "pc_interface" : "PortChannel101 PortChannel102 PortChannel103 PortChannel104 ", "vlan_sub_interface": "", "vlan" : "Vlan1000", "static_route_nexthops": "192.168.0.18 192.168.0.25 192.168.0.23 ", "static_route_ifnames": "" }
validate route/test_static_route.py testcase pass.
Why I did it
Support to add SONiC OS Version in device info.
It will be used to display the version info in the SONiC command "show version". The version is used to do the FIPS certification. We do not do the FIPS certification on a specific release, but on the SONiC OS Version.
SONiC Software Version: SONiC.master-13812.218661-7d94c0c28
SONiC OS Version: 11
Distribution: Debian 11.6
Kernel: 5.10.0-18-2-amd64
How I did it
Why I did it
SONiC currently does not identify 'EdgeZoneAggregator' neighbor. As a result, the buffer profile attached to those interfaces uses the default cable length which could cause ingress packet drops due to insufficient headroom. Hence, there is a need to update the buffer templates to identify such neighbors and assign the same cable length as used by the T1.
How I did it
Modified the buffer template to identify EdgeZoneAggregator as a neighbor device type and assign it the same cable length as a T1/leaf router.
How to verify it
Unit tests pass, and manually checked on a 7260 to see the changes take effect.
Improve sudo cat command for RO user.
#### Why I did it
RO user can use sudo command show none syslog files.
#### How I did it
Improve sudo cat command for RO user.
#### How to verify it
Pass all UT.
Manually check fixed code work correctly.
#### Description for the changelog
Improve sudo cat command for RO user.
arp_update should resolve the missing arp/ndp static route
entries. Added code to check for missing entries and try ping to
resolve the missing entry.
Why I did it
Fixes#14179
chassis-packet: missing arp entries for static routes causing high orchagent cpu usage
It is observed that some sonic-mgmt test case calls sonic-clear arp, which clears the static arp entries as well. Orchagent or arp_update process does not try to resolve the missing arp entries after clear.
How I did it
arp_update should resolve the missing arp/ndp static route
entries. Added code to check for missing entries and try ping if any
found to resolve it.
How to verify it
After boot or config reload, check ipv4 and ipv4 neigh entries to make sure all static route entries are present
manual validation:
Use sonic-clear arp and sonic-clear ndp to clear all neighbor entries
run arp_update
Check for neigh entries. All entries should be present.
Signed-off-by: anamehra <anamehra@cisco.com>
Why I did it
After warm reboot, show environment prints the following error:
failed to import plugin show.plugins.macsec: [Errno 13] Permission denied: '/tmp/cache/macsec'
How I did it
Set owner back to admin after restoring counters folder.
How to verify it
sudo warm-reboot, then ensure show environement does not print errors.
Signed-off-by: Oleksandr Kolomeiets <oleksandrx.kolomeiets@intel.com>
Co-authored-by: oleksandrx-kolomeiets <oleksandrx.kolomeiets@intel.com>
Why I did it
Fix some of vulnerability issue relative python packages #14269
Pillow: [CVE-2021-27921]
Wheel: [CVE-2022-40898]
lxml: [CVE-2022-2309]
How I did it
- Why I did it
Remove dependency on interfaces-config.service to speed up boot, because interfaces-config.service takes a lot of time on boot.
- How I did it
Changed service files for swss, syncd.
- How to verify it
Boot and check swss/syncd start time comparing to interfaces-config
Signed-off-by: Stepan Blyschak <stepanb@nvidia.com>