Commit Graph

1244 Commits

Author SHA1 Message Date
mssonicbld
fbe5fe736e
[ci/build]: Upgrade SONiC package versions (#15326) 2023-06-06 15:40:37 -07:00
mssonicbld
b0abe7149a
Fix for fast/cold-boot: call db_migrator only after old config is loaded (#14933) (#15316) 2023-06-03 09:30:12 +08:00
vmittal-msft
723c508a30
Update PG headroom settings ports based on port speed/cable length (#15287)
Why I did it
Update cable length for uplink/downlink ports for chassis and and update PG/pool headroom size accordingly.

Work item tracking
17880812

How I did it
Updated cable length as well as buffer config in HWSKU files.
2023-06-02 15:48:11 -07:00
Samuel Angebault
feb8671601
[202205] Implement zram compression for docker in RAM (#15137)
* [Arista] Fix boot0 code for docker_inram

Enable docker_inram for all systems with 4GB or less of flash.
This is mandatory to allow these systems to store 2 SONiC images.

This change also fixes the missing docker_inram attribute when
installing a new image from SONiC.
Because the SWI image can ship with additional kernel parameters within
such as `sonic_fips=` this lead to a conflict.
To prevent the conflict, the extra kernel parameters from the SWI are
now stored in the file `kernel-cmdline-append` which isn't used anywhere.

* Add optional zram compression for docker_inram

Some devices running SONiC have a small storage device (2G and 4G mainly)
The SONiC image growth over time has made it impossible to install
2 images on a single device.
Some mitigations have been implemented in the past for some devices but
there is a need to do more.

One such mitigation is `docker_inram` which creates a `tmpfs` and
extracts `dockerfs.tar.gz` in it.
This all happens in the SONiC initramfs and by ensuring the installation
process does not extract `dockerfs.tar.gz` on the flash but keep the file as is.

This mitigation does a tradeoff by using more RAM to reduce the disk footprint.
It however creates new issues for devices with 4G of system memory since
the extracted `dockerfs.tar.gz` nears the 1.6G.
Considering debian upgrades (with dual base images) and the continuous
stream of features this is only going to get bigger.

This change introduces an alternative to the `tmpfs` by allowing a system
to extract the `dockerfs.tar.gz` inside a `zram` device thus bringing
compression in play at the detriment of performance.

Introduce 2 new optional kernel parameters to be consumed by SONiC initramfs.
 - `docker_inram_size` which represent the max physical size of the
   `zram` or `tmpfs` volume (defaults to DOCKER_RAMFS_SIZE)
 - `docker_inram_algo` which is the method to use to extract the
   `dockerfs.tar.gz` (defaults to `tmpfs`)
   other values are considered to be compression algorithm for `zram`
   (e.g `zstd`, `zlo-rle`, `lz4`)

Refactored the logic to mount the docker fs in the SONiC initramfs under
the `union-mount` script.
Moved the code into a function to make it cleaner and separated the
inram volume creation and docker extraction.

On Arista platform with a flash smaller or equal to 4GB set
`docker_inram_algo` to `zstd` which produces the best compression ratio
at the detriment of a slower write performance and a similar read
performance to other `zram` compression algorithms.
2023-06-02 08:36:18 -07:00
mssonicbld
6cf6c59c8c
[ci/build]: Upgrade SONiC package versions (#15245) 2023-05-28 20:42:02 +08:00
mssonicbld
d00dd1fca7
[ci/build]: Upgrade SONiC package versions (#15243) 2023-05-27 20:17:51 +08:00
mssonicbld
036b8d1315
[ci/build]: Upgrade SONiC package versions (#15192) 2023-05-23 20:43:44 +08:00
mssonicbld
11226b9ca4
[ci/build]: Upgrade SONiC package versions (#15174) 2023-05-21 20:38:55 +08:00
mssonicbld
80aba31433
[arp_update] Resolve neighbors from config_db (#15006) (#15124) 2023-05-18 08:50:55 +08:00
judyjoseph
b6df524b0f
Add override_config to load_minigraph in config-setup service (#14834) (#15097)
This PR is to handle the override minigraph config by golden_config_db.json file if it is present in the backup location.
2023-05-17 13:17:20 -07:00
Tejaswini Chadaga
a3a041a3cd
Revert "Add load_minigraph option to include traffic-shift-away during config migration (#11403)" (#14881)
This reverts commit 0c7f0aa9b7.
2023-05-03 17:10:15 -07:00
mssonicbld
0ed0df6ddb
[ci/build]: Upgrade SONiC package versions (#14913) 2023-05-02 20:20:59 +08:00
Ying Xie
9ac9908321
Revert "Clear DNS configuration received from DHCP during networking reconfiguration in Linux. (#13516) (#13695)" (#14900)
This reverts commit d1fa414f1b.
2023-05-01 16:48:56 -07:00
mssonicbld
5d9658f503
[ci/build]: Upgrade SONiC package versions (#14839) 2023-04-29 20:49:34 +08:00
mssonicbld
36b6d5824c
[ci/build]: Upgrade SONiC package versions (#14812) 2023-04-23 20:52:29 +08:00
mssonicbld
7cc8c76f0f
Increase wait_for_tunnel() timeout to 90s (#14279) (#14733) 2023-04-20 05:47:12 +08:00
mssonicbld
20bb5daa6a [ci/build]: Upgrade SONiC package versions 2023-04-18 22:39:02 +08:00
mssonicbld
94ba969676
[write standby] force DB connections to use unix socket to connect (#14524) (#14553) 2023-04-18 17:11:59 +08:00
anamehra
0b30826e56 chassis-packet: resolve the missing static routes (#14593)
Why I did it
Fixes #14179
chassis-packet: missing arp entries for static routes causing high orchagent cpu usage

It is observed that some sonic-mgmt test case calls sonic-clear arp, which clears the static arp entries as well. Orchagent or arp_update process does not try to resolve the missing arp entries after clear.

How I did it
arp_update should resolve the missing arp/ndp static route
entries. Added code to check for missing entries and try ping if any
found to resolve it.

How to verify it
After boot or config reload, check ipv4 and ipv4 neigh entries to make sure all static route entries are present
manual validation:
Use sonic-clear arp and sonic-clear ndp to clear all neighbor entries
run arp_update
Check for neigh entries. All entries should be present.
Testing on T0 setup route/for test_static_route.py

The test set the STATIC_ROUTE entry in conifg db without ifname:
sonic-db-cli CONFIG_DB hmset 'STATIC_ROUTE|2.2.2.0/24' nexthop 192.168.0.18,192.168.0.25,192.168.0.23

"STATIC_ROUTE": {
    "2.2.2.0/24": {
        "nexthop": "192.168.0.18,192.168.0.25,192.168.0.23"
    }
},
Validate that the arp_update gets the proper ARP_UPDATE_VARDS using arp_update_vars.j2 template from config db and does not crash:

{ "switch_type": "", "interface": "", "pc_interface" : "PortChannel101 PortChannel102 PortChannel103 PortChannel104 ", "vlan_sub_interface": "", "vlan" : "Vlan1000", "static_route_nexthops": "192.168.0.18 192.168.0.25 192.168.0.23 ", "static_route_ifnames": "" }

validate route/test_static_route.py testcase pass.
2023-04-18 14:34:49 +08:00
mssonicbld
c9d5e20923
[image_config] add rasdaemon.timer (#14300) (#14692) 2023-04-18 12:33:15 +08:00
xumia
1a9d6cdc5a
Support to add SONiC OS Version in device info (#14601) (#14624)
Why I did it
Support to add SONiC OS Version in device info.
It will be used to display the version info in the SONiC command "show version". The version is used to do the FIPS certification. We do not do the FIPS certification on a specific release, but on the SONiC OS Version.

SONiC Software Version: SONiC.master-13812.218661-7d94c0c28
SONiC OS Version: 11
Distribution: Debian 11.6
Kernel: 5.10.0-18-2-amd64
How I did it
2023-04-17 17:30:49 -07:00
mssonicbld
c919e758db
[ci/build]: Upgrade SONiC package versions (#14681) 2023-04-16 20:54:23 +08:00
mssonicbld
32b8764d3b
[ci/build]: Upgrade SONiC package versions (#14674) 2023-04-15 20:36:48 +08:00
mssonicbld
597d37d395
[ci/build]: Upgrade SONiC package versions (#14604) 2023-04-11 21:29:00 +08:00
Ye Jianquan
c55a5a94eb
Revert "chassis-packet: resolve the missing static routes (#14230)" (#14545)
This reverts commit a6d597a811.
2023-04-10 23:40:31 +08:00
mssonicbld
c62ed6800e
[ci/build]: Upgrade SONiC package versions (#14579) 2023-04-09 20:29:39 +08:00
mssonicbld
9680eb9e07
[ci/build]: Upgrade SONiC package versions (#14573) 2023-04-08 20:37:10 +08:00
Dev Ojha
82873c24ce
[202205][Buffer] Added cable length config to buffer config template for EdgeZoneAggregator (#14538)
Why I did it
SONiC currently does not identify 'EdgeZoneAggregator' neighbor. As a result, the buffer profile attached to those interfaces uses the default cable length which could cause ingress packet drops due to insufficient headroom. Hence, there is a need to update the buffer templates to identify such neighbors and assign the same cable length as used by the T1.

How I did it
Modified the buffer template to identify EdgeZoneAggregator as a neighbor device type and assign it the same cable length as a T1/leaf router.

How to verify it
Unit tests pass, and manually checked on a 7260 to see the changes take effect.
2023-04-06 07:52:43 -07:00
mssonicbld
eab159f5e3
Delay mux/sflow/snmp timer after interface-config service (#14506) (#14523) 2023-04-05 15:28:41 +08:00
mssonicbld
8ca2d47aa9 [ci/build]: Upgrade SONiC package versions 2023-04-04 22:32:19 +08:00
Hua Liu
0a58f4f68b Improve sudo cat command for RO user. (#14428)
Improve sudo cat command for RO user.

#### Why I did it
RO user can use sudo command show none syslog files.

#### How I did it
Improve sudo cat command for RO user.

#### How to verify it
Pass all UT.
Manually check fixed code work correctly.

#### Description for the changelog
Improve sudo cat command for RO user.
2023-04-03 16:34:36 +08:00
anamehra
a6d597a811 chassis-packet: resolve the missing static routes (#14230)
arp_update should resolve the missing arp/ndp static route
entries. Added code to check for missing entries and try ping to
resolve the missing entry.

Why I did it
Fixes #14179

chassis-packet: missing arp entries for static routes causing high orchagent cpu usage

It is observed that some sonic-mgmt test case calls sonic-clear arp, which clears the static arp entries as well. Orchagent or arp_update process does not try to resolve the missing arp entries after clear.

How I did it
arp_update should resolve the missing arp/ndp static route
entries. Added code to check for missing entries and try ping if any
found to resolve it.

How to verify it
After boot or config reload, check ipv4 and ipv4 neigh entries to make sure all static route entries are present
manual validation:
Use sonic-clear arp and sonic-clear ndp to clear all neighbor entries
run arp_update
Check for neigh entries. All entries should be present.

Signed-off-by: anamehra <anamehra@cisco.com>
2023-04-03 16:34:21 +08:00
mssonicbld
2b3bea61de
[ci/build]: Upgrade SONiC package versions (#14489) 2023-04-02 20:39:29 +08:00
mssonicbld
40889d25ce
[ci/build]: Upgrade SONiC package versions (#14443) 2023-03-28 20:37:28 +08:00
mssonicbld
16bbbd776a
Set owner after restoring counters folder during warmboot (#13507) (#14435)
Why I did it
After warm reboot, show environment prints the following error:
failed to import plugin show.plugins.macsec: [Errno 13] Permission denied: '/tmp/cache/macsec'

How I did it
Set owner back to admin after restoring counters folder.

How to verify it
sudo warm-reboot, then ensure show environement does not print errors.

Signed-off-by: Oleksandr Kolomeiets <oleksandrx.kolomeiets@intel.com>
Co-authored-by: oleksandrx-kolomeiets <oleksandrx.kolomeiets@intel.com>
2023-03-27 14:57:26 -07:00
mssonicbld
5fb685ebb8
[ci/build]: Upgrade SONiC package versions (#14418) 2023-03-26 20:42:23 +08:00
mssonicbld
f76fbfca5b
[ci/build]: Upgrade SONiC package versions (#14415) 2023-03-25 21:02:51 +08:00
Stepan Blyshchak
e2ed36c764
[202205] Clear teamd-timer when finalizing fast-reboot (#14295)
* Clear teamd-timer when finalizing fast-reboot
* Move config save after finalizing fast/warm reboot
2023-03-22 12:07:58 -07:00
mssonicbld
22b7b68ff0 [ci/build]: Upgrade SONiC package versions 2023-03-21 20:46:04 +08:00
xumia
e5e8d46fe1
[Security] Fix some of vulnerability issue relative python packages (#14269) (#14353)
Why I did it
Fix some of vulnerability issue relative python packages #14269
Pillow: [CVE-2021-27921]
Wheel: [CVE-2022-40898]
lxml: [CVE-2022-2309]

How I did it
2023-03-20 16:52:40 -07:00
mssonicbld
19a89aa6e2
[ci/build]: Upgrade SONiC package versions (#14344) 2023-03-19 20:19:28 +08:00
mssonicbld
3df60a15dd
[ci/build]: Upgrade SONiC package versions (#14314) 2023-03-19 00:04:48 +08:00
mssonicbld
51df8df2a8
[ci/build]: Upgrade SONiC package versions (#14299) 2023-03-18 02:31:30 +08:00
mssonicbld
cac2cf5d24
[storage_backend] Add backend acl service (#14229) (#14281) 2023-03-17 08:50:15 +08:00
Stepan Blyshchak
58937201f9 [swss/syncd] remove dependency on interfaces-config.service (#13084)
- Why I did it
Remove dependency on interfaces-config.service to speed up boot, because interfaces-config.service takes a lot of time on boot.

- How I did it
Changed service files for swss, syncd.

- How to verify it
Boot and check swss/syncd start time comparing to interfaces-config

Signed-off-by: Stepan Blyschak <stepanb@nvidia.com>
2023-03-16 14:37:17 +08:00
Aryeh Feigin
4a3c5e42c2
[202205] Fast reboot finalizer 202205 (#14143)
* Finalize fast-reboot in warmboot finalizer

* update fast/warm-reboot finalizer

* support compatibility for fast-reboot from previous versions (prior 202205)

* advance pointers: sairedis, utilities
2023-03-15 09:34:05 -07:00
Arvindsrinivasan Lakshmi Narasimhan
dddf1db1d3
[202205]Revert "Revert "[Chassis][Voq]update to add buffer_queue config on sy… (#14173)
* Revert "Revert "[Chassis][Voq]update to add buffer_queue config on system ports (#12156)" (#13421)"

This reverts commit 73c0deb810.

* update swss submodule

Signed-off-by: Arvindsrinivasan Lakshmi Narasimhan <arlakshm@microsoft.com>

---------

Signed-off-by: Arvindsrinivasan Lakshmi Narasimhan <arlakshm@microsoft.com>
2023-03-10 13:09:13 -08:00
anamehra
0c5ab622c4 Add support for platform syncd pre shutdown plugin (#13564)
Why I did it
Vendor platform may require running platform specific pre-shutdown routine before shutting down the syncd process which runs the SAI and vendor sdk instance.

How I did it
Added a platform script hook which will be executed if the plugin script is provided by the platform in device//plugins/
2023-03-09 04:33:42 +08:00
Marty Y. Lok
432c4f9222 [Chassis][multiasic] Fix the sonic-db-cli core files issue on multiasic platform after the c++ implementation of sonic-db-cli (#13207)
Fixe #12047. After the c++ implementation of the sonic-db-cli, sonic-db-cli PING command tries to initialize the global database for all instances database starting. If all instance database-config.json are not ready yet. it will crash and generate core file. PR sonic-net/sonic-swss-common#701 only fix the crash and the process abortion. 

Signed-off-by: mlok <marty.lok@nokia.com>
2023-03-07 14:39:25 +08:00
mssonicbld
cf5b888534
[ci/build]: Upgrade SONiC package versions (#14083) 2023-03-05 18:57:50 +08:00
mssonicbld
b40cccafa4
[ci/build]: Upgrade SONiC package versions (#14079) 2023-03-04 22:15:09 +08:00
mssonicbld
bab3243d93
[Arista] Disable SSD NCQ on Lodoga (#13964) (#14073) 2023-03-04 04:41:19 +08:00
mssonicbld
9d6457a2ff
[ci/build]: Upgrade SONiC package versions (#14046) 2023-03-02 05:37:16 +08:00
xumia
b8ef3c07df
Bump lxml from 4.6.5 to 4.9.1 in /src/sonic-config-engine (#14011)
Why I did it
It is to fix the security alert CVE-2022-2309, see https://security-tracker.debian.org/tracker/CVE-2022-2309
The fix has already merged in master, See detail in PR #11366

How I did it
Upgrade version to 4.9.1

How to verify it
2023-03-01 08:21:57 +00:00
mssonicbld
73f572948f
[netlink] Increse netlink buffer size from 3MB to 16MB (#13965) (#14027) 2023-03-01 11:47:29 +08:00
mssonicbld
7f4afadd1a
[ci/build]: Upgrade SONiC package versions (#13995) 2023-02-26 20:02:25 +08:00
mssonicbld
3170ef9b50
[ci/build]: Upgrade SONiC package versions (#13991) 2023-02-25 20:14:59 +08:00
mssonicbld
668774db9f
[ci/build]: Upgrade SONiC package versions (#13975) 2023-02-24 17:32:54 +08:00
Stepan Blyshchak
70e2ea1e87
[202205][Mellanox] Place FW binaries under platform directory instead of squashfs (#13838)
Fixes #13568
Backport of #13837

Upgrade from old image always requires squashfs mount to get the next image FW binary. This can be avoided if we put FW binary under platform directory which is easily accessible after installation:

admin@r-spider-05:~$ ls /host/image-fw-new-loc.0-dirty-20230208.193534/platform/fw-SPC.mfa
/host/image-fw-new-loc.0-dirty-20230208.193534/platform/fw-SPC.mfa
admin@r-spider-05:~$ ls -al /tmp/image-fw-new-loc.0-dirty-20230208.193534-fs/etc/mlnx/fw-SPC.mfa
lrwxrwxrwx 1 root root 66 Feb  8 17:57 /tmp/image-fw-new-loc.0-dirty-20230208.193534-fs/etc/mlnx/fw-SPC.mfa -> /host/image-fw-new-loc.0-dirty-20230208.193534/platform/fw-SPC.mfa

- Why I did it
202211 and above uses different squashfs compression type that 201911 kernel can not handle. Therefore, we avoid mounting squashfs altogether with this change.

- How I did it
Place FW binary under /host/image-/platform/mlnx/, soft links in /etc/mlnx are created to avoid breaking existing scripts/automation.
/etc/mlnx/fw-SPCX.mfa is a soft link always pointing to the FW that should be used in current image
mlnx-fw-upgrade.sh is updated to prefer /host/image-/platform/mlnx location and fallback to /etc/mlnx in squashfs in case new location does not exist. This is necessary to do image downgrade.

- How to verify it
Upgrade from 201911 to 202205
202205 to 201911 downgrade
202205 -> 202205 reboot
ONIE -> 202205 boot (First FW burn)

Signed-off-by: Stepan Blyschak <stepanb@nvidia.com>
2023-02-22 17:40:50 +02:00
zhixzhu
8b5a42794d
set cable length of backplane ports to 1m (#13279)
* set cable length of backplane ports to 1m

Signed-off-by: Zhixin Zhu <zhixzhu@cisco.com>

* add UT for cable length

Signed-off-by: Zhixin Zhu <zhixzhu@cisco.com>

* correct argument format

---------

Signed-off-by: Zhixin Zhu <zhixzhu@cisco.com>
2023-02-21 22:14:53 +00:00
mssonicbld
9bf90a5f2e
Use tmpfs for /var/log on Arista 7050CX3-32S (#13805) (#13843)
This is to reduce writes to the SSD on the device.

Signed-off-by: Saikrishna Arcot <sarcot@microsoft.com>
Co-authored-by: Saikrishna Arcot <sarcot@microsoft.com>
2023-02-21 13:25:35 -08:00
Samuel Angebault
638fdd0e93 [Arista] Disable ATA NCQ for a few products (#13739)
Why I did it
Some products might experience an occasional IO failure in the communication between CPU and SSD.
Based on some research it could be attributable to some device not handling ATA NCQ (Native Command Queue).

This issue currently affect 4 products:

DCS-7170-32C*
DCS-7170-64C
DCS-7060DX4-32
DCS-7260CX3-64

How I did it
This change disable NCQ on the affected drive for a small set of products.

How to verify it
When the fix is applied, these 2 patterns can be found in the dmesg.
ata1.00: FORCE: horkage modified (noncq)
NCQ (not used)

Test results using: fio --direct=1 --rw=randrw --bs=64k --ioengine=libaio --iodepth=64 --runtime=120 --numjobs=4

with NCQ (ata1.00: 61865984 sectors, multi 1: LBA48 NCQ (depth 32), AA)

   READ: bw=33.9MiB/s (35.6MB/s), 33.9MiB/s-33.9MiB/s (35.6MB/s-35.6MB/s), io=4073MiB (4270MB), run=120078-120078msec
  WRITE: bw=34.1MiB/s (35.8MB/s), 34.1MiB/s-34.1MiB/s (35.8MB/s-35.8MB/s), io=4100MiB (4300MB), run=120078-120078msec
without NCQ (ata1.00: 61865984 sectors, multi 1: LBA48 NCQ (not used))

   READ: bw=31.7MiB/s (33.3MB/s), 31.7MiB/s-31.7MiB/s (33.3MB/s-33.3MB/s), io=3808MiB (3993MB), run=120083-120083msec
  WRITE: bw=31.9MiB/s (33.4MB/s), 31.9MiB/s-31.9MiB/s (33.4MB/s-33.4MB/s), io=3830MiB (4016MB), run=120083-120083msec
Which release branch to backport (provide reason below if selected)
2023-02-22 04:34:01 +08:00
Stepan Blyshchak
b5be0da272 [dockerd] Force usage of cgo DNS resolver (#13649)
Go's runtime (and dockerd inherits this) uses own DNS resolver implementation by default on Linux.
It has been observed that there are some DNS resolution issues when executing ```docker pull``` after first boot.

Consider the following script:

```
admin@r-boxer-sw01:~$ while :; do date; cat /etc/resolv.conf; ping -c 1 harbor.mellanox.com; docker pull harbor.mellanox.com/sonic/cpu-report:1.0.0 ; sleep 1; done
Fri 03 Feb 2023 10:06:22 AM UTC
nameserver 10.211.0.124
nameserver 10.211.0.121
nameserver 10.7.77.135
search mtr.labs.mlnx labs.mlnx mlnx lab.mtl.com mtl.com
PING harbor.mellanox.com (10.7.1.117) 56(84) bytes of data.
64 bytes from harbor.mtl.labs.mlnx (10.7.1.117): icmp_seq=1 ttl=53 time=5.99 ms

--- harbor.mellanox.com ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 0ms
rtt min/avg/max/mdev = 5.989/5.989/5.989/0.000 ms
Error response from daemon: Get "https://harbor.mellanox.com/v2/": dial tcp: lookup harbor.mellanox.com on [::1]:53: read udp [::1]:57245->[::1]:53: read: connection refused
Fri 03 Feb 2023 10:06:23 AM UTC
nameserver 10.211.0.124
nameserver 10.211.0.121
nameserver 10.7.77.135
search mtr.labs.mlnx labs.mlnx mlnx lab.mtl.com mtl.com
PING harbor.mellanox.com (10.7.1.117) 56(84) bytes of data.
64 bytes from harbor.mtl.labs.mlnx (10.7.1.117): icmp_seq=1 ttl=53 time=5.56 ms

--- harbor.mellanox.com ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 0ms
rtt min/avg/max/mdev = 5.561/5.561/5.561/0.000 ms
Error response from daemon: Get "https://harbor.mellanox.com/v2/": dial tcp: lookup harbor.mellanox.com on [::1]:53: read udp [::1]:53299->[::1]:53: read: connection refused
Fri 03 Feb 2023 10:06:24 AM UTC
nameserver 10.211.0.124
nameserver 10.211.0.121
nameserver 10.7.77.135
search mtr.labs.mlnx labs.mlnx mlnx lab.mtl.com mtl.com
PING harbor.mellanox.com (10.7.1.117) 56(84) bytes of data.
64 bytes from harbor.mtl.labs.mlnx (10.7.1.117): icmp_seq=1 ttl=53 time=5.78 ms

--- harbor.mellanox.com ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 0ms
rtt min/avg/max/mdev = 5.783/5.783/5.783/0.000 ms
Error response from daemon: Get "https://harbor.mellanox.com/v2/": dial tcp: lookup harbor.mellanox.com on [::1]:53: read udp [::1]:55765->[::1]:53: read: connection refused
Fri 03 Feb 2023 10:06:25 AM UTC
nameserver 10.211.0.124
nameserver 10.211.0.121
nameserver 10.7.77.135
search mtr.labs.mlnx labs.mlnx mlnx lab.mtl.com mtl.com
PING harbor.mellanox.com (10.7.1.117) 56(84) bytes of data.
64 bytes from harbor.mtl.labs.mlnx (10.7.1.117): icmp_seq=1 ttl=53 time=7.17 ms

--- harbor.mellanox.com ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 0ms
rtt min/avg/max/mdev = 7.171/7.171/7.171/0.000 ms
Error response from daemon: Get "https://harbor.mellanox.com/v2/": dial tcp: lookup harbor.mellanox.com on [::1]:53: read udp [::1]:44877->[::1]:53: read: connection refused
Fri 03 Feb 2023 10:06:26 AM UTC
nameserver 10.211.0.124
nameserver 10.211.0.121
nameserver 10.7.77.135
search mtr.labs.mlnx labs.mlnx mlnx lab.mtl.com mtl.com
PING harbor.mellanox.com (10.7.1.117) 56(84) bytes of data.
64 bytes from harbor.mtl.labs.mlnx (10.7.1.117): icmp_seq=1 ttl=53 time=5.66 ms

--- harbor.mellanox.com ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 0ms
rtt min/avg/max/mdev = 5.656/5.656/5.656/0.000 ms
Error response from daemon: Get "https://harbor.mellanox.com/v2/": dial tcp: lookup harbor.mellanox.com on [::1]:53: read udp [::1]:54604->[::1]:53: read: connection refused
Fri 03 Feb 2023 10:06:27 AM UTC
nameserver 10.211.0.124
nameserver 10.211.0.121
nameserver 10.7.77.135
search mtr.labs.mlnx labs.mlnx mlnx lab.mtl.com mtl.com
PING harbor.mellanox.com (10.7.1.117) 56(84) bytes of data.
64 bytes from harbor.mtl.labs.mlnx (10.7.1.117): icmp_seq=1 ttl=53 time=8.22 ms

--- harbor.mellanox.com ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 0ms
rtt min/avg/max/mdev = 8.223/8.223/8.223/0.000 ms
1.0.0: Pulling from sonic/cpu-report
004f1eed87df: Downloading [===================>                               ]   19.3MB/50.43MB
5d6f1e8117db: Download complete
48c2faf66abe: Download complete
234b70d0479d: Downloading [=========>                                         ]  9.363MB/51.84MB
6fa07a00e2f0: Downloading [==>                                                ]   9.51MB/192.4MB
04a31b4508b8: Waiting
e11ae5168189: Waiting
8861a99744cb: Waiting
d59580d95305: Waiting
12b1523494c1: Waiting
d1a4b09e9dbc: Waiting
99f41c3f014f: Waiting
```

While /etc/resolv.conf has the correct content and ping (and any other utility that uses libc's DNS resolution implementation) works correctly
docker is unable to resolve the hostname and falls back to default [::1]:53. This started to happen after PR https://github.com/sonic-net/sonic-buildimage/pull/13516 has been merged.
As you can see from the log, dockerd is able to pick up the correct /etc/resolv.conf only after 5 sec since first try. This seems to be somehow related to the logic in Go's DNS resolver
https://github.com/golang/go/blob/master/src/net/dnsclient_unix.go#L385.

There have been issues like that reported in docker like:
  - https://github.com/docker/cli/issues/2299
  - https://github.com/docker/cli/issues/2618
  - https://github.com/moby/moby/issues/22398

Since this starts to happen after inclusion of resolvconf package by
above mentioned PR and the fact I can't see any problem with that (ping,
nslookup, etc. works) the choice is made to force dockerd to use cgo
(libc) resolver.

Signed-off-by: Stepan Blyschak <stepanb@nvidia.com>
2023-02-22 04:33:44 +08:00
mssonicbld
25ead73d10
[ci/build]: Upgrade SONiC package versions (#13893) 2023-02-21 19:42:54 +08:00
mssonicbld
0469a2a02f
[ci/build]: Upgrade SONiC package versions (#13881) 2023-02-19 18:47:51 +08:00
mssonicbld
7521705bb8
[ci/build]: Upgrade SONiC package versions (#13877) 2023-02-18 19:09:33 +08:00
mssonicbld
e170a4b8a1
[ci/build]: Upgrade SONiC package versions (#13840) 2023-02-17 07:40:37 +08:00
mssonicbld
df0685fb19
[Arista] Add emmc quirks in boot0 to improve reliability (#10013) (#13743)
Why I did it
Fix some unreliability seen on emmc device with some AMD CPUs

How I did it
Added a kernel parameter to add quirks to
It depends on a sonic-linux-kernel change to work properly but will be a no-op without it.
The quirk added is SDHCI_QUIRK2_BROKEN_HS200 used to downgrade the link speed for the eMMC.

Co-authored-by: Samuel Angebault <staphylo@arista.com>
2023-02-10 09:28:24 -08:00
mssonicbld
f5656d1aad
[Mellanox][sai_failure_dump]Added platform specific script to be invoked during SAI failure dump (#13533) (#13749)
- Why I did it
Added platform specific script to be invoked during SAI failure dump. Added some generic changes to mount /var/log/sai_failure_dump as read write in the syncd docker

- How I did it
Added script in docker-syncd of mellanox and copied it to /usr/bin

- How to verify it
Manual UT and new sonic-mgmt tests

Co-authored-by: Sudharsan Dhamal Gopalarathnam <dgsudharsan@users.noreply.github.com>
2023-02-10 09:23:10 -08:00
mssonicbld
d623dd2fca
Increase PikeZ varlog size (#13550) (#13750)
Why I did it
To address error sometimes seen when running sonic-mgmt test_stress_routes.py::test_announce_withdraw_route on 720DT-48S

How I did it
Update boot0 logic to set platform specific varlog size for 720DT-48S

How to verify it
Verified that /var/log size increased and error is no longer observed when running test

Co-authored-by: andywongarista <78833093+andywongarista@users.noreply.github.com>
2023-02-10 09:20:36 -08:00
mssonicbld
06aa8aa11b
[Mellanox] Support DSCP remapping in dual ToR topo on T0 switch (#12605) (#13745)
- Why I did it
Support DSCP remapping in dual ToR topo on T0 switch for SKU Mellanox-SN4600c-C64, Mellanox-SN4600c-D48C40, Mellanox-SN2700, Mellanox-SN2700-D48C8.

- How I did it
Regarding buffer settings, originally, there are two lossless PGs and queues 3, 4. In dual ToR scenario, the lossless traffic from the leaf switch to the uplink of the ToR switch can be bounced back.
To avoid PFC deadlock, we need to map the bounce-back lossless traffic to different PGs and queues. Therefore, 2 additional lossless PGs and queues are allocated on uplink ports on ToR switches.

On uplink ports, map DSCP 2/6 to TC 2/6 respectively
On downlink ports, both DSCP 2/6 are still mapped to TC 1
Buffer adjusted according to the ports information:
Mellanox-SN4600c-C64:
56 downlinks 50G + 8 uplinks 100G
Mellanox-SN4600c-D48C40, Mellanox-SN2700, Mellanox-SN2700-D48C8:
24 downlinks 50G + 8 uplinks 100G

- How to verify it
Unit test.

Signed-off-by: Stephen Sun <stephens@nvidia.com>
Co-authored-by: Stephen Sun <5379172+stephenxs@users.noreply.github.com>
2023-02-10 09:16:56 -08:00
Oleksandr Ivantsiv
d1fa414f1b
Clear DNS configuration received from DHCP during networking reconfiguration in Linux. (#13516) (#13695)
- Why I did it
fixes #12907

When the management interface IP address configuration changes from dynamic to static the DNS configuration (retrieved from the DHCP server) in /etc/resolv.conf remains uncleared. This leads to a DNS configuration pointing to the wrong nameserver. To make the behavior clear DNS configuration received from DHCP should be cleared.

- How I did it
Use resolvconf package for managing DNS configuration. It is capable of tracking the source of DNS configuration and puts the configuration retrieved from the DHCP servers into a separate file. This allows the implementation of DNS configuration cleanup retrieved from DHCP during networking reconfiguration.

- How to verify it
Ensure that the management interface has no static configuration.
Check that /etc/resolv.conf has DNS configuration.
Configure a static IP address on the management interface.
Verify that /etc/resolv.conf has no DNS configuration.
Remove the static IP address from the management interface.
Verify that /etc/resolv.conf has DNS configuration retrieved form DHCP server.
2023-02-10 09:11:05 -08:00
xumia
7642f4c07f
[Security][202205] Upgrade the openssl version to 1.1.1n-0+deb11u4+fips #13737 (#13759)
* [Security] Upgrade the openssl version to 1.1.1n-0+deb11u4+fips (#13737)

Why I did it
[Security] Upgrade the openssl version to 1.1.1n-0+deb11u4+fips

f6df7303d8 Update expired certs.
84540b59c1 CVE-2022-2068
f763d8a93e Prepare 1.1.1n-0+deb11u2
576562cebe CVE-2022-1292
How I did it
Upgrade the OpenSSL version

* [Security] Upgrade OpenSSL version for armhf
2023-02-10 12:01:22 +00:00
mssonicbld
6620871fff
Add support for platform topology configuration service (#12066) (#13605) 2023-02-03 06:34:17 +08:00
Saikrishna Arcot
78ed216167 Use tmpfs for /var/log for Arista 7260 (#13587)
This is to reduce writes to disk, which then can use the SSD to get worn
out faster.

Signed-off-by: Saikrishna Arcot <sarcot@microsoft.com>
2023-02-03 04:32:29 +08:00
mssonicbld
3591f6b8a3
rsyslog to start after interfaces-config (#13503) (#13529) 2023-01-27 16:10:20 +08:00
mssonicbld
3860186ec2
[sudoers] add /usr/local/bin/storyteller to READ_ONLY_CMDS (#13422) (#13530) 2023-01-27 16:08:57 +08:00
mssonicbld
5d29448f41
change default to be on (#13495) (#13498)
Changing the default config knob value to be True for killing radv, due to the reasons below:

Killing RADV is to prevent sending the "cease to be advertising interface" protocol packet.
RFC 4861 says this ceasing packet as "should" instead of "must", considering that it's fatal to not do this.
In active-active scenario, host side might have difficulty distinguish if the "cease to be advertising interface" is for the last interface leaving.
6.2.5. Ceasing To Be an Advertising Interface

shutting down the system.
In such cases, the router SHOULD transmit one or more (but not more
than MAX_FINAL_RTR_ADVERTISEMENTS) final multicast Router
Advertisements on the interface with a Router Lifetime field of zero.
In the case of a router becoming a host, the system SHOULD also
depart from the all-routers IP multicast group on all interfaces on
which the router supports IP multicast (whether or not they had been
advertising interfaces). In addition, the host MUST ensure that
subsequent Neighbor Advertisement messages sent from the interface
have the Router flag set to zero.

sign-off: Jing Zhang zhangjing@microsoft.com

Co-authored-by: Jing Zhang <zhangjing@microsoft.com>
2023-01-25 09:58:53 -08:00
mssonicbld
9d4c5f1ad5
[ci/build]: Upgrade SONiC package versions (#13492) 2023-01-24 22:55:46 +08:00
mssonicbld
a80c8b4c01
[ci/build]: Upgrade SONiC package versions (#13466) 2023-01-22 22:50:02 +08:00
mssonicbld
8e52922c4e
[ci/build]: Upgrade SONiC package versions (#13463) 2023-01-21 22:31:16 +08:00
mssonicbld
a67ed33ec1
[dualtor][active-active]Killing radv instead of stopping on active-active dualtor if config knob is on (#13408) (#13458) 2023-01-21 12:55:10 +08:00
Arvindsrinivasan Lakshmi Narasimhan
73c0deb810
Revert "[Chassis][Voq]update to add buffer_queue config on system ports (#12156)" (#13421)
This reverts commit 1cffbc7b07.

Why I did it
This PR reverts the changes done in #12156 in 202205.
The dependant swss changes in PR sonic-net/sonic-swss#2618 are not merged in 202205 yet.

This revert is to avoid issues on 202205 till the sonic-net/sonic-swss#2618 is merged in.

Once sonic-net/sonic-swss#2618 changes are merged, this change will be added back.
2023-01-18 16:38:05 -08:00
mssonicbld
d7ac478415
[ci/build]: Upgrade SONiC package versions (#13393) 2023-01-17 22:56:10 +08:00
mssonicbld
da8bc0bb12
[ci/build]: Upgrade SONiC package versions (#13369) 2023-01-15 23:11:45 +08:00
mssonicbld
242e5d8db8
[ci/build]: Upgrade SONiC package versions (#13365) 2023-01-14 22:50:30 +08:00
mssonicbld
3c86702d15
[Bug] Fix SONiC installation failure caused by pip/pip3 not found (#13284) (#13352) 2023-01-13 09:57:35 +08:00
mssonicbld
991c98e560
During build time mask only those feature/services that are disabled excplicitly (#13283) (#13296)
What I did:
Fix : #13117

How I did:
During build time mask only those feature/services that are disabled explicitly. Some of the features ((eg: teamd/bgp/dhcp-relay/mux/etc..)) state is determine run-time so for those feature by default service will be up and running and then later hostcfgd will mask them if needed.

So Default behavior will be

init_cfg.json.j2 during build time make state as disabled then mask the service
init_cfg.json.j2 during build time make state as another jinja2 template render string than do no mask the service
init_cfg.json.j2 during build time make state as enabled then do not mask the service

How I verify:
Manual Verification.

Signed-off-by: Abhishek Dosi <abdosi@microsoft.com>

Signed-off-by: Abhishek Dosi <abdosi@microsoft.com>
Co-authored-by: abdosi <58047199+abdosi@users.noreply.github.com>
2023-01-09 10:28:03 -08:00
Arvindsrinivasan Lakshmi Narasimhan
1cffbc7b07 [Chassis][Voq]update to add buffer_queue config on system ports (#12156)
Why I did it
In the voq chassis the buffer_queue configuration needs to be applied on system_port instead of the sonic port.
This PR has the change to do this.

How I did it
Modify buffer_config.j2 to generate buffer_queue configuration on system_ports if the device is Voq Chassis

How to verify it
Verify the buffer_queue configuration is generated properly using sonic-cfggen

Signed-off-by: Arvindsrinivasan Lakshmi Narasimhan <arlakshm@microsoft.com>
2023-01-05 12:37:16 +08:00
mssonicbld
f5974090ad
[ci/build]: Upgrade SONiC package versions (#13221) 2023-01-03 22:46:44 +08:00
mssonicbld
bf05eeada6
[ci/build]: Upgrade SONiC package versions (#13209) 2022-12-31 22:30:03 +08:00
xumia
8395de69d3
[Build] Support j2 template for debian sources (#12557) (#13185)
Why I did it
Unify the Debian mirror sources
Make easy to upgrade to the next Debian release, not source url code change required. Support to customize the Debian mirror sources during the build
Relative issue: #12523

How I did it
How to verify it
2022-12-30 09:47:33 +08:00
mssonicbld
8358cffe2d
[ci/build]: Upgrade SONiC package versions (#13169) 2022-12-25 22:12:53 +08:00
mssonicbld
7722833311
[ci/build]: Upgrade SONiC package versions (#13159) 2022-12-24 22:37:23 +08:00
mssonicbld
4b5d019bbe
[ci/build]: Upgrade SONiC package versions (#13147) 2022-12-23 06:25:41 +08:00
mssonicbld
b193ba4d9f
[ci/build]: Upgrade SONiC package versions (#13095) 2022-12-18 22:36:59 +08:00
mssonicbld
a8382a3245
[ci/build]: Upgrade SONiC package versions (#13093) 2022-12-17 23:56:08 +08:00
mssonicbld
faaaea0464
[ci/build]: Upgrade SONiC package versions (#13036) 2022-12-13 23:15:57 +08:00
Saikrishna Arcot
c725dfb975 Replace logrotate cron file with (adapted) systemd timer file (#12921)
Debian is shipping a systemd timer unit for logrotate, but we're also
packaging in a cron job, which means both of them will run, potentially
at the same time. Remove our cron file, and add an override to the
shipped timer file to have it be run every 10 minutes.

Fixes #12392.

Signed-off-by: Saikrishna Arcot <sarcot@microsoft.com>

Signed-off-by: Saikrishna Arcot <sarcot@microsoft.com>
2022-12-13 06:34:28 +08:00
mssonicbld
6891aa915a
[ci/build]: Upgrade SONiC package versions (#13017) 2022-12-11 22:19:51 +08:00