Commit Graph

6966 Commits

Author SHA1 Message Date
xumia
b8ef3c07df
Bump lxml from 4.6.5 to 4.9.1 in /src/sonic-config-engine (#14011)
Why I did it
This fixes the security alert CVE-2022-2309; see https://security-tracker.debian.org/tracker/CVE-2022-2309
The fix has already been merged into master; see PR #11366 for details.

How I did it
Upgrade version to 4.9.1

How to verify it
2023-03-01 08:21:57 +00:00
Ying Xie
1f6456a601
[202205][utilities][swss-common][sairedis][platform-common][swss] advance submodule head (#14029)
utilities:
* a4f141f1 2023-01-10 | [sfputil] Firmware download/upgrade CLI support for QSFP-DD (#1947) (#2349) (HEAD -> 202205) [CliveNi]

swss-common:
* 41fcad8 2023-01-30 | Increase the netlink buffer size from 3MB to 16MB. (#739) (HEAD -> 202205) [KISHORE KUNAL]

sairedis:
* 5ce9990 2023-02-27 | [Dual-ToR] update sai.profile with SAI_ADDITIONAL_MAC_ENABLED attribute if corresponding arg passed to syncd (#1201) (HEAD -> 202205) [Andriy Yurkiv]
* 3c2e0c5 2023-02-23 | Use new value of STATE_DB FAST_REBOOT entry (#1196) [Aryeh Feigin]
* fe7756f 2023-02-28 | [submodule][SAI]Advance SAI head (#1210) (github/202205) [Richard.Yu]

platform-common:
* 321a8e7 2022-09-23 | Cdb fw upgrade (#308) (HEAD -> 202205) [CliveNi]

swss:
* ceea558 2023-02-28 | [orchagent]: Get bridge port ID from orchagent cache instead of SAI API (#2657) (HEAD -> 202205) [Lawrence Lee]
* bd04e24 2023-03-01 | [dualtor] Fix neighbor miss when mux is not ready (#2676) (HEAD -> 202205) [Longxiang Lyu]
* 7d87a90 2023-02-28 | [ci] Fix pipeline error about team5 not found. (#2684) [Liu Shilong]
* 93a924c 2023-02-27 | [aclorch] Fixed issue #2204. Support IN_PORTS qualifier in MIRRORV6 table. (#2668) [Rajkumar-Marvell]
* 9d87ec4 2023-02-23 | swss: Fix Invalid port oid messages generated because of voq counters. (#2653) [Sambath Kumar Balasubramanian]

Signed-off-by: Ying Xie <ying.xie@microsoft.com>
2023-03-01 06:24:34 +00:00
Saikrishna Arcot
052cf9a0a6
Don't create the members@ array in config_db for PC when reading from minigraph (#13660) (#14028)
Fixes #11873.

When loading from minigraph, for port channels, don't create the members@ array in config_db in the PORTCHANNEL table. This is no longer needed or used.

In addition, when adding a port channel member from the CLI, that member doesn't get added into the members@ array, resulting in a bit of inconsistency. This gets rid of that inconsistency.
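A minimal sketch of how port channel membership can be read without a members@ array. It assumes memberships are stored in a separate PORTCHANNEL_MEMBER table whose keys have the form `<portchannel>|<port>`; the commit message does not spell this out, and the helper name `members_by_portchannel` and the sample data are hypothetical.

```python
from collections import defaultdict


def members_by_portchannel(config):
    """Group member ports by port channel.

    Assumption: memberships are keyed as "<portchannel>|<port>" in a
    PORTCHANNEL_MEMBER table, so no members@ array is needed in PORTCHANNEL.
    """
    members = defaultdict(list)
    for key in config.get("PORTCHANNEL_MEMBER", {}):
        portchannel, port = key.split("|", 1)
        members[portchannel].append(port)
    return {pc: sorted(ports) for pc, ports in members.items()}


# Hypothetical config_db content: PORTCHANNEL carries no members@ field.
config = {
    "PORTCHANNEL": {"PortChannel0001": {"admin_status": "up", "mtu": "9100"}},
    "PORTCHANNEL_MEMBER": {
        "PortChannel0001|Ethernet0": {},
        "PortChannel0001|Ethernet4": {},
    },
}

print(members_by_portchannel(config))
```

With a single source of membership, a member added from the CLI and one loaded from minigraph are represented the same way.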
2023-03-01 04:52:03 +00:00
Stephen Sun
76a5c75b82 [Mellanox] Advance hw-mgmt to v.7.0020.4104 (#13372)
- Why I did it
Advance hw-mgmt service to V.7.0020.4100
Add missing thermal sensors that are supported by hw-mgmt package
Delay the system health service until hw-mgmt has started on Mellanox platforms, to avoid reading sensors before they are ready.
Depends on sonic-net/sonic-linux-kernel#305

- How I did it
1. Update hw mgmt version
2. Add missing sensors
3. Delay service 

- How to verify it
Regression test.

Signed-off-by: Stephen Sun <stephens@nvidia.com>
2023-03-01 12:38:50 +08:00
mssonicbld
73f572948f
[netlink] Increase netlink buffer size from 3MB to 16MB (#13965) (#14027) 2023-03-01 11:47:29 +08:00
spilkey-cisco
b2e124cf00
Add asic presence filtering for container checking in system-health (#13497) (#13966)
Why I did it
On a supervisor card in a chassis, syncd/teamd/swss/lldp etc dockers are created for each Switch Fabric card. However, not all chassis would have all the switch fabric cards present. In this case, only dockers for Switch Fabrics present would be created.

system-health indicates errors in this scenario as it is expecting dockers for all Switch Fabrics (based on NUM_ASIC defined in asic.conf file).

system-health process error messages were also altered to indicate which container had the issue; multiple containers may run processes with the same name, which can result in identical system-health error messages, causing ambiguity.

How I did it
Port container_checker logic from #11442 into service_checker for system-health.
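The filtering idea described above can be sketched as follows. This is an illustration only, not the actual service_checker code: the function name, the `<service><asic index>` container naming, and the way presence is supplied are all assumptions.

```python
# Per-ASIC services named in the commit message (the list there ends with "etc").
PER_ASIC_SERVICES = ("syncd", "teamd", "swss", "lldp")


def expected_containers(num_asics, present_asics, services=PER_ASIC_SERVICES):
    """Return the containers that should be running.

    num_asics     -- NUM_ASIC as defined in asic.conf
    present_asics -- indices of ASICs whose fabric card is actually present
                     (how presence is detected is outside this sketch)
    Assumption: per-ASIC containers are named "<service><asic index>".
    """
    return {
        f"{service}{asic}"
        for asic in range(num_asics)
        if asic in present_asics
        for service in services
    }


# Three ASICs configured, but the fabric card for ASIC 1 is missing.
print(sorted(expected_containers(3, {0, 2}, services=("syncd", "swss"))))
```

Only containers for present ASICs end up in the expected set, so a missing fabric card no longer shows up as a health error.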

How to verify it
Bring up a supervisor card with one or more fabric cards missing and execute 'show system-health summary'. The command should not report failures for missing dockers belonging to ASICs on fabric cards that are not present.
2023-02-28 15:40:21 -08:00
Arvindsrinivasan Lakshmi Narasimhan
3e96341049
400 to 100 speed change (#14024)
Signed-off-by: Arvindsrinivasan Lakshmi Narasimhan <arlakshm@microsoft.com>
2023-02-28 14:19:35 -08:00
jhli-cisco
a100e250f6
Update cisco-8000.ini (#13794)
Why I did it
Code drop with incremental fixes for Cisco 8808 and Cisco 8101-32FH-O platforms
Support for ordered ECMP capability
Support for BFD Serviceability CLI
Support for VxLAN Serviceability CLI
Support for 200G Fabric port speed for Fabric Card 8808-FC0 and Line Card 88-LC0-36FH-M/88-LC0-36FH-MO
PFC support for Q200 ASIC based line cards
Support for Single-TOR and Dual-TOR on Cisco 8102-64H-O Platform

How I did it
update cisco 8000 tag

How to verify it
2023-02-27 14:41:25 +08:00
mssonicbld
7f4afadd1a
[ci/build]: Upgrade SONiC package versions (#13995) 2023-02-26 20:02:25 +08:00
mssonicbld
3170ef9b50
[ci/build]: Upgrade SONiC package versions (#13991) 2023-02-25 20:14:59 +08:00
mssonicbld
668774db9f
[ci/build]: Upgrade SONiC package versions (#13975) 2023-02-24 17:32:54 +08:00
Ying Xie
7d49a38e05
[202205][linkmgrd] advance submodule head (#13968)
linkmgrd:
* a2e8391 2023-02-23 | [active-active] link operational down didn't trigger toggle to standby if `MuxUnknown` event arrives first.  (#175) (HEAD -> 202205) [Jing Zhang]

Signed-off-by: Ying Xie <ying.xie@microsoft.com>
2023-02-24 04:14:40 +00:00
Ying Xie
ade145f58a
[202205][linkmgrd] advance submodule head (#13939)
linkmgrd:
* d2227d8 2023-02-22 | [active-standby] Toggle to standby if link down and config auto (#173) (HEAD -> 202205) [Longxiang Lyu]

Signed-off-by: Ying Xie <ying.xie@microsoft.com>
2023-02-23 11:38:09 -08:00
Dror Prital
d0d59e3f51
[202205][submodule] Advance sonic-platform-common pointer (#13948)
Update sonic-platform-common submodule pointer to include the following:
* d353e7f Update host electrical interface for 2x100G AOC ([#346](https://github.com/sonic-net/sonic-platform-common/pull/346))

Signed-off-by: dprital <drorp@nvidia.com>
2023-02-23 15:39:40 +02:00
Dror Prital
72b57c0913
[202205][submodule] Advance sonic-platform-daemons pointer (#13947)
Update sonic-platform-daemons submodule pointer to include the following:
* 7ee7805 Update CMIS module types for 2x100G AOC support ([#339](https://github.com/sonic-net/sonic-platform-daemons/pull/339))

Signed-off-by: dprital <drorp@nvidia.com>
2023-02-23 15:39:32 +02:00
Dror Prital
cf5a145a69
[202205][submodule] Advance sonic-swss pointer (#13949)
Update sonic-swss submodule pointer to include the following:
* 3aeb4be Align watermark flow with port configuration ([#2672](https://github.com/sonic-net/sonic-swss/pull/2672))

Signed-off-by: dprital <drorp@nvidia.com>
2023-02-23 15:02:46 +02:00
Liu Shilong
b23f79f463
[ci] Fix docker hang issue and change template reference branch (#13894) (#13921)
Why I did it
Cherry pick PR(#13894)
Azure pipeline change.
Use common template to make it easy to change common steps. Fix docker hang issue.

How I did it
How to verify
2023-02-23 14:19:56 +08:00
Stepan Blyshchak
0095ad57f2
[systemd-sonic-generator] Fix overlapping strings being passed to strcpy/strcat (#13889)
Why I did it
Fix an issue that services do not start automatically on first boot and start only after hostcfgd enables them.
This is due to a bug in systemd-sonic-generator:

Signed-off-by: Stepan Blyschak <stepanb@nvidia.com>
2023-02-22 17:04:57 -08:00
mathieulaunay
bda91a19e3
build: add an env var to run make reset unattended (#13820)
- previously "make reset" was expecting user input from the terminal to do its job
- setting UNATTENDED to any non-zero string will allow "make reset" to run without interactive confirmation
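The behaviour described in the two points above can be sketched in Python. The real logic lives in the Makefile; the function name and prompt text here are hypothetical, and "non-zero string" is interpreted as "non-empty string", which is an assumption.

```python
import os


def reset_confirmed(env=os.environ, ask=input):
    """Decide whether a destructive reset may proceed.

    Assumption: any non-empty UNATTENDED value skips the interactive prompt.
    """
    if env.get("UNATTENDED"):
        return True
    answer = ask("Proceed with reset? [y/N] ")  # hypothetical prompt text
    return answer.strip().lower() in ("y", "yes")


# Unattended run: the prompt is never consulted.
print(reset_confirmed({"UNATTENDED": "1"}, ask=lambda prompt: "n"))
```

Passing `env` and `ask` as parameters keeps the sketch testable without a terminal.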

Signed-off-by: Mathieu Launay <m.launay@criteo.com>
2023-02-23 00:42:51 +00:00
mssonicbld
35a4410f86
[Ci] Support to use the same snapshot for all platform builds (#13913) (#13938) 2023-02-23 08:04:43 +08:00
Tejaswini Chadaga
4345a556eb
Update DNX BRCM SAI version to 7.1.35.4 (#13907) 2023-02-22 14:43:39 -08:00
mssonicbld
e842241f71
add psu fans status led available config (#13926) (#13936) 2023-02-23 06:29:38 +08:00
Stepan Blyshchak
70e2ea1e87
[202205][Mellanox] Place FW binaries under platform directory instead of squashfs (#13838)
Fixes #13568
Backport of #13837

Upgrade from old image always requires squashfs mount to get the next image FW binary. This can be avoided if we put FW binary under platform directory which is easily accessible after installation:

admin@r-spider-05:~$ ls /host/image-fw-new-loc.0-dirty-20230208.193534/platform/fw-SPC.mfa
/host/image-fw-new-loc.0-dirty-20230208.193534/platform/fw-SPC.mfa
admin@r-spider-05:~$ ls -al /tmp/image-fw-new-loc.0-dirty-20230208.193534-fs/etc/mlnx/fw-SPC.mfa
lrwxrwxrwx 1 root root 66 Feb  8 17:57 /tmp/image-fw-new-loc.0-dirty-20230208.193534-fs/etc/mlnx/fw-SPC.mfa -> /host/image-fw-new-loc.0-dirty-20230208.193534/platform/fw-SPC.mfa

- Why I did it
202211 and above use a different squashfs compression type that the 201911 kernel cannot handle. Therefore, we avoid mounting squashfs altogether with this change.

- How I did it
Place the FW binary under /host/image-/platform/mlnx/; soft links in /etc/mlnx are created to avoid breaking existing scripts/automation.
/etc/mlnx/fw-SPCX.mfa is a soft link that always points to the FW that should be used in the current image.
mlnx-fw-upgrade.sh is updated to prefer the /host/image-/platform/mlnx location and fall back to /etc/mlnx in squashfs if the new location does not exist. This is necessary for image downgrade.
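The prefer-then-fall-back lookup can be sketched as below. The actual change is in the shell script mlnx-fw-upgrade.sh; this Python version only illustrates the ordering. The function name is hypothetical and the directories in the example are placeholders, since the real platform path is elided in the text above.

```python
import os
import posixpath


def pick_fw_path(platform_dir, squashfs_dir, fw_name, exists=os.path.exists):
    """Prefer the firmware under the platform directory, else use squashfs.

    The fallback covers images that still keep the firmware only in the
    squashfs location, such as a downgrade target.
    """
    preferred = posixpath.join(platform_dir, fw_name)
    if exists(preferred):
        return preferred
    return posixpath.join(squashfs_dir, fw_name)


# Placeholder directories; only the preferred location "exists" here.
print(pick_fw_path("/new-location", "/old-location", "fw-SPC.mfa",
                   exists=lambda path: path.startswith("/new-location")))
```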

- How to verify it
Upgrade from 201911 to 202205
202205 to 201911 downgrade
202205 -> 202205 reboot
ONIE -> 202205 boot (First FW burn)

Signed-off-by: Stepan Blyschak <stepanb@nvidia.com>
2023-02-22 17:40:50 +02:00
Judy Joseph
3c81038bc2 To fix the test failure when porting PR to 202205 2023-02-22 16:37:36 +08:00
judyjoseph
bc8b34c49a Voq Chassis: Add the Recirc ports to the INTERFACES table to make it routed intf (#13779)
* VOQ: Add the Recirc ports to the INTERFACES table to make it routed intf

* Add a test to cover Recirc port generation in INTERFACE table
2023-02-22 16:37:36 +08:00
mssonicbld
1e15f00a4f
[Mellanox] Check system eeprom existence in a retry manner (#13884) (#13915) 2023-02-22 13:43:03 +08:00
mssonicbld
b3bcd3d08b
[Mellanox] Fix issue: cannot find label port for logical port when logical port number is larger than 64 (#13710) (#13908)
- Why I did it
sfp_event.py gets a PMPE message when a cable event is available. The PMPE message does not carry the label port. sfp_event.py currently calls sx_api_port_device_get to fetch the attributes of 64 logical ports and looks up the label port among them. If there are more than 64 ports, sfp_event.py may fail to find the label port and drop the PMPE message.

- How I did it
Don't use hardcoded 64, get logical port number instead.
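A sketch of the general pattern: ask for the port count first, then fetch exactly that many attributes. It does not use the real SDK. `query_ports` stands in for the SDK call, and the assumption that a zero-sized request reports the total count is hypothetical, as are the attribute field names.

```python
def find_label_port(query_ports, logical_port):
    """Look up the label port for a logical port without a fixed-size limit.

    query_ports(max_count) is a stand-in for the SDK query; it is assumed to
    return (total_port_count, attributes[:max_count]).
    """
    total, _ = query_ports(0)        # first call: learn how many ports exist
    _, attrs = query_ports(total)    # second call: fetch all of them
    for attr in attrs:
        if attr["logical_port"] == logical_port:
            return attr["label_port"]
    return None


def make_fake_sdk(num_ports):
    """Build a fake query function backed by an in-memory port table."""
    table = [{"logical_port": 0x10000 + i, "label_port": i + 1}
             for i in range(num_ports)]

    def query_ports(max_count):
        return len(table), table[:max_count]

    return query_ports


query = make_fake_sdk(128)
# Port index 100 lies beyond the first 64 entries, so a 64-entry query misses it.
print(find_label_port(query, 0x10000 + 100))
```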

- How to verify it
Manual test

Co-authored-by: Junchao-Mellanox <57339448+Junchao-Mellanox@users.noreply.github.com>
2023-02-21 18:15:06 -08:00
Stepan Blyshchak
b51de79ffc [systemd-sonic-generator] Fix overlapping strings being passed to strcpy/strcat (#13647)
#### Why I did it

Fix an issue that services do not start automatically on first boot and start only after hostcfgd enables them.
This is due to a bug in systemd-sonic-generator:

```
admin@arc-switch1004:~$ /usr/lib/systemd/system-generators/systemd-sonic-generator dir
Failed to open file /usr/lib/systemd/system/database.servcee
Error parsing targets for database.servcee
Error parsing database.servcee
Failed to open file /usr/lib/systemd/system/bgp.servcee
Error parsing targets for bgp.servcee
Error parsing bgp.servcee
Failed to open file /usr/lib/systemd/system/lldp.servcee
Error parsing targets for lldp.servcee
Error parsing lldp.servcee
Failed to open file /usr/lib/systemd/system/swss.servcee
Error parsing targets for swss.servcee
Error parsing swss.servcee
Failed to open file /usr/lib/systemd/system/teamd.servcee
Error parsing targets for teamd.servcee
Error parsing teamd.servcee
Failed to open file /usr/lib/systemd/system/syncd.servcee
Error parsing targets for syncd.servcee
Error parsing syncd.servcee
```

A wrong file name is generated (e.g. database.**servcee**).

#### How I did it

Fixed overlapping strings being passed to strcpy/strcat, which take restrict-qualified pointers (the strings must not overlap).

#### How to verify it

Perform first boot and observe that services start immediately after boot.
2023-02-22 10:00:56 +08:00
Ying Xie
602ffb135a
[202205][linkmgrd][utilities][swss][swss-common][platform-daemons][linux-kernel] advance submodule head (#13906)
linkmgrd:
* 3e7a9df 2023-02-19 | [active-active] Toggle to standby if default route is missing (#171) (HEAD -> 202205) [Longxiang Lyu]
* 8ab1b2b 2023-02-15 | [active-active] fix issue that interfaces get stuck in `active` if service starts up with link state down (#169) [Jing Zhang]
* df862ad 2023-02-11 | Fix mux config when gRPC connection is lost (#166) [Longxiang Lyu]

utilities:
* 8aa7930c 2023-02-13 | [portstat CLI] don't print reminder if use json format (#2670) (HEAD -> 202205, github/202205) [wenyiz2021]
* 4e3bb6fa 2023-02-21 | Add "show fabric reachability" command. (#2672) [jfeng-arista]
* 3587a94b 2023-02-18 | [202205][dhcp_relay] Remove add field of vlanid to DHCP_RELAY table while adding vlan (#2680) [Yaqiang Zhu]
* 4f07f7f0 2023-02-10 | Skip saidump for Spine Router as this can take more than 5 sec (#2637) (#2671) [kenneth-arista]
* e61c5ec4 2023-02-10 | [vlan] Refresh dhcpv6_relay config while adding/deleting a vlan (#2660) (#2669) [Yaqiang Zhu]

swss:
* 1bbf725 2023-02-14 | [Workaround] EvpnRemoteVnip2pOrch warmboot check failure (#2626) (HEAD -> 202205) [jcaiMR]
* 380f72b 2023-02-20 | Support for tc-dot1p and tc-dscp qosmap (#2559) [Divya Mukundan]
* dbf6fcc 2022-11-01 | Added LAG member check on addLagMember() (#2464) [Andriy Kokhan]

swss-common:
* b31391b 2023-02-21 | Prevent sonic-db-cli generate core dump (#749) (HEAD -> 202205) [Hua Liu]
* 16ff689 2022-12-13 | Support for TC-DOT1p qos map (#721) [Divya Mukundan]

platform-daemons:
* fb92af4 2023-02-09 | [ycabled] add more coverage to ycabled; add minor name change for vendor API CLI return key-values pairs (#338) (HEAD -> 202205) [vdahiya12]

linux-kernel:
* 4e62401 2023-02-09 | Update linux kernel for hw-mgmt V.7.0020.4104 (#305) (HEAD -> 202205) [Stephen Sun]

Signed-off-by: Ying Xie <ying.xie@microsoft.com>
2023-02-22 00:45:03 +00:00
mssonicbld
cfb9877372
[Nokia][sonic-platform] Update Nokia sonic-platform submodule (#13522) (#13909) 2023-02-22 07:18:09 +08:00
mssonicbld
6d75f80856
[Build] Change the default mirror version config file (#13786) (#13903)
Why I did it
Change the mirror config file
Use the files/build/versions/default/versions-mirror only when reproducible build enabled.
The config in files/build/versions is only for reproducible build, while snapshot mirror feature does not have the dependency on the reproducible build.

How I did it
Skip the mirror config in files/build/versions/default/versions-mirror if reproducible build not enabled.

How to verify it

Co-authored-by: xumia <59720581+xumia@users.noreply.github.com>
2023-02-21 15:07:21 -08:00
Ashwin Srinivasan
57139110ca Added libpci and pciutils to the pmon docker (#12684)
This enables the pcied daemon to call the corresponding system commands needed for pci transactions
2023-02-22 06:34:40 +08:00
zhixzhu
8b5a42794d
set cable length of backplane ports to 1m (#13279)
* set cable length of backplane ports to 1m

Signed-off-by: Zhixin Zhu <zhixzhu@cisco.com>

* add UT for cable length

Signed-off-by: Zhixin Zhu <zhixzhu@cisco.com>

* correct argument format

---------

Signed-off-by: Zhixin Zhu <zhixzhu@cisco.com>
2023-02-21 22:14:53 +00:00
mssonicbld
9bf90a5f2e
Use tmpfs for /var/log on Arista 7050CX3-32S (#13805) (#13843)
This is to reduce writes to the SSD on the device.

Signed-off-by: Saikrishna Arcot <sarcot@microsoft.com>
Co-authored-by: Saikrishna Arcot <sarcot@microsoft.com>
2023-02-21 13:25:35 -08:00
mssonicbld
9cbd4ddd05
[Build] Remove the additional space character in the mirrors.list file (#13812) (#13904) 2023-02-22 04:34:21 +08:00
Samuel Angebault
638fdd0e93 [Arista] Disable ATA NCQ for a few products (#13739)
Why I did it
Some products might experience an occasional IO failure in the communication between the CPU and the SSD.
Based on some research, this could be attributable to some devices not handling ATA NCQ (Native Command Queuing).

This issue currently affects 4 products:

DCS-7170-32C*
DCS-7170-64C
DCS-7060DX4-32
DCS-7260CX3-64

How I did it
This change disables NCQ on the affected drive for a small set of products.

How to verify it
When the fix is applied, these 2 patterns can be found in the dmesg.
ata1.00: FORCE: horkage modified (noncq)
NCQ (not used)
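The verification step can be automated with a small check over captured dmesg text. This is a sketch: the function name is hypothetical, and it only searches for the two patterns quoted above.

```python
def ncq_disabled(dmesg_text):
    """Return True when both patterns from the verification step are present."""
    lines = dmesg_text.splitlines()
    forced = any("horkage modified (noncq)" in line for line in lines)
    unused = any("NCQ (not used)" in line for line in lines)
    return forced and unused


# Sample lines taken from the commit message.
sample = "\n".join([
    "ata1.00: FORCE: horkage modified (noncq)",
    "ata1.00: 61865984 sectors, multi 1: LBA48 NCQ (not used)",
])
print(ncq_disabled(sample))
```

The text could come from something like `subprocess.run(["dmesg"], capture_output=True, text=True).stdout`, run with sufficient privileges.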

Test results using: fio --direct=1 --rw=randrw --bs=64k --ioengine=libaio --iodepth=64 --runtime=120 --numjobs=4

with NCQ (ata1.00: 61865984 sectors, multi 1: LBA48 NCQ (depth 32), AA)

   READ: bw=33.9MiB/s (35.6MB/s), 33.9MiB/s-33.9MiB/s (35.6MB/s-35.6MB/s), io=4073MiB (4270MB), run=120078-120078msec
  WRITE: bw=34.1MiB/s (35.8MB/s), 34.1MiB/s-34.1MiB/s (35.8MB/s-35.8MB/s), io=4100MiB (4300MB), run=120078-120078msec
without NCQ (ata1.00: 61865984 sectors, multi 1: LBA48 NCQ (not used))

   READ: bw=31.7MiB/s (33.3MB/s), 31.7MiB/s-31.7MiB/s (33.3MB/s-33.3MB/s), io=3808MiB (3993MB), run=120083-120083msec
  WRITE: bw=31.9MiB/s (33.4MB/s), 31.9MiB/s-31.9MiB/s (33.4MB/s-33.4MB/s), io=3830MiB (4016MB), run=120083-120083msec
2023-02-22 04:34:01 +08:00
Samuel Angebault
3c3a4ac517 [Arista] update sensors.conf to ignore sensors (#12529)
Why I did it
The sensors and sensord processes were reporting data on unused sensors.
This led to ALARM messages or erroneous values that could be misinterpreted.

How I did it
Ignore the affected sensors in the sensors.conf

How to verify it
Check that there are no longer ALARM messages from sensord in the syslog or in the output of sensors
2023-02-22 04:33:53 +08:00
xumia
ccfb13aa2a [Build] Support j2 template for debian sources for docker ptf (#13198)
Change to use the sources.list from the file generated from the j2 template
2023-02-22 04:33:49 +08:00
Stepan Blyshchak
b5be0da272 [dockerd] Force usage of cgo DNS resolver (#13649)
Go's runtime (and dockerd inherits this) uses its own DNS resolver implementation by default on Linux.
It has been observed that DNS resolution sometimes fails when executing ```docker pull``` after first boot.

Consider the following script:

```
admin@r-boxer-sw01:~$ while :; do date; cat /etc/resolv.conf; ping -c 1 harbor.mellanox.com; docker pull harbor.mellanox.com/sonic/cpu-report:1.0.0 ; sleep 1; done
Fri 03 Feb 2023 10:06:22 AM UTC
nameserver 10.211.0.124
nameserver 10.211.0.121
nameserver 10.7.77.135
search mtr.labs.mlnx labs.mlnx mlnx lab.mtl.com mtl.com
PING harbor.mellanox.com (10.7.1.117) 56(84) bytes of data.
64 bytes from harbor.mtl.labs.mlnx (10.7.1.117): icmp_seq=1 ttl=53 time=5.99 ms

--- harbor.mellanox.com ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 0ms
rtt min/avg/max/mdev = 5.989/5.989/5.989/0.000 ms
Error response from daemon: Get "https://harbor.mellanox.com/v2/": dial tcp: lookup harbor.mellanox.com on [::1]:53: read udp [::1]:57245->[::1]:53: read: connection refused
Fri 03 Feb 2023 10:06:23 AM UTC
nameserver 10.211.0.124
nameserver 10.211.0.121
nameserver 10.7.77.135
search mtr.labs.mlnx labs.mlnx mlnx lab.mtl.com mtl.com
PING harbor.mellanox.com (10.7.1.117) 56(84) bytes of data.
64 bytes from harbor.mtl.labs.mlnx (10.7.1.117): icmp_seq=1 ttl=53 time=5.56 ms

--- harbor.mellanox.com ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 0ms
rtt min/avg/max/mdev = 5.561/5.561/5.561/0.000 ms
Error response from daemon: Get "https://harbor.mellanox.com/v2/": dial tcp: lookup harbor.mellanox.com on [::1]:53: read udp [::1]:53299->[::1]:53: read: connection refused
Fri 03 Feb 2023 10:06:24 AM UTC
nameserver 10.211.0.124
nameserver 10.211.0.121
nameserver 10.7.77.135
search mtr.labs.mlnx labs.mlnx mlnx lab.mtl.com mtl.com
PING harbor.mellanox.com (10.7.1.117) 56(84) bytes of data.
64 bytes from harbor.mtl.labs.mlnx (10.7.1.117): icmp_seq=1 ttl=53 time=5.78 ms

--- harbor.mellanox.com ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 0ms
rtt min/avg/max/mdev = 5.783/5.783/5.783/0.000 ms
Error response from daemon: Get "https://harbor.mellanox.com/v2/": dial tcp: lookup harbor.mellanox.com on [::1]:53: read udp [::1]:55765->[::1]:53: read: connection refused
Fri 03 Feb 2023 10:06:25 AM UTC
nameserver 10.211.0.124
nameserver 10.211.0.121
nameserver 10.7.77.135
search mtr.labs.mlnx labs.mlnx mlnx lab.mtl.com mtl.com
PING harbor.mellanox.com (10.7.1.117) 56(84) bytes of data.
64 bytes from harbor.mtl.labs.mlnx (10.7.1.117): icmp_seq=1 ttl=53 time=7.17 ms

--- harbor.mellanox.com ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 0ms
rtt min/avg/max/mdev = 7.171/7.171/7.171/0.000 ms
Error response from daemon: Get "https://harbor.mellanox.com/v2/": dial tcp: lookup harbor.mellanox.com on [::1]:53: read udp [::1]:44877->[::1]:53: read: connection refused
Fri 03 Feb 2023 10:06:26 AM UTC
nameserver 10.211.0.124
nameserver 10.211.0.121
nameserver 10.7.77.135
search mtr.labs.mlnx labs.mlnx mlnx lab.mtl.com mtl.com
PING harbor.mellanox.com (10.7.1.117) 56(84) bytes of data.
64 bytes from harbor.mtl.labs.mlnx (10.7.1.117): icmp_seq=1 ttl=53 time=5.66 ms

--- harbor.mellanox.com ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 0ms
rtt min/avg/max/mdev = 5.656/5.656/5.656/0.000 ms
Error response from daemon: Get "https://harbor.mellanox.com/v2/": dial tcp: lookup harbor.mellanox.com on [::1]:53: read udp [::1]:54604->[::1]:53: read: connection refused
Fri 03 Feb 2023 10:06:27 AM UTC
nameserver 10.211.0.124
nameserver 10.211.0.121
nameserver 10.7.77.135
search mtr.labs.mlnx labs.mlnx mlnx lab.mtl.com mtl.com
PING harbor.mellanox.com (10.7.1.117) 56(84) bytes of data.
64 bytes from harbor.mtl.labs.mlnx (10.7.1.117): icmp_seq=1 ttl=53 time=8.22 ms

--- harbor.mellanox.com ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 0ms
rtt min/avg/max/mdev = 8.223/8.223/8.223/0.000 ms
1.0.0: Pulling from sonic/cpu-report
004f1eed87df: Downloading [===================>                               ]   19.3MB/50.43MB
5d6f1e8117db: Download complete
48c2faf66abe: Download complete
234b70d0479d: Downloading [=========>                                         ]  9.363MB/51.84MB
6fa07a00e2f0: Downloading [==>                                                ]   9.51MB/192.4MB
04a31b4508b8: Waiting
e11ae5168189: Waiting
8861a99744cb: Waiting
d59580d95305: Waiting
12b1523494c1: Waiting
d1a4b09e9dbc: Waiting
99f41c3f014f: Waiting
```

While /etc/resolv.conf has the correct content and ping (like any other utility that uses libc's DNS resolution implementation) works correctly,
docker is unable to resolve the hostname and falls back to the default [::1]:53. This started to happen after PR https://github.com/sonic-net/sonic-buildimage/pull/13516 was merged.
As the log shows, dockerd picks up the correct /etc/resolv.conf only about 5 seconds after the first attempt. This seems to be related to the logic in Go's DNS resolver:
https://github.com/golang/go/blob/master/src/net/dnsclient_unix.go#L385.

There have been issues like that reported in docker like:
  - https://github.com/docker/cli/issues/2299
  - https://github.com/docker/cli/issues/2618
  - https://github.com/moby/moby/issues/22398

Since this starts to happen after inclusion of resolvconf package by
above mentioned PR and the fact I can't see any problem with that (ping,
nslookup, etc. works) the choice is made to force dockerd to use cgo
(libc) resolver.
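The commit message does not say how dockerd is forced onto the cgo resolver. One mechanism documented in Go's net package is the GODEBUG environment variable with the `netdns=cgo` setting. The sketch below assumes that mechanism and shows how such a value could be merged into an existing GODEBUG string; the helper name is hypothetical.

```python
def with_cgo_resolver(godebug):
    """Return a GODEBUG value that selects the cgo (libc) DNS resolver.

    Any existing netdns setting is replaced; other settings are preserved.
    """
    settings = [item for item in godebug.split(",")
                if item and not item.startswith("netdns=")]
    settings.append("netdns=cgo")
    return ",".join(settings)


# Starting from an empty GODEBUG value.
print(with_cgo_resolver(""))
```

The resulting value would be placed in the environment of the dockerd process, for example through its service unit.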

Signed-off-by: Stepan Blyschak <stepanb@nvidia.com>
2023-02-22 04:33:44 +08:00
mssonicbld
e86f8fa31e
Add PYTHON3_SWSSCOMMON as build time dependency to Mellanox platform API (#13847) (#13905) 2023-02-22 02:34:55 +08:00
suresh-rupanagudi
d1d5bce4f6
cherry-picked snmp sonic-yang file from 2bb8306d8e (#13896) 2023-02-21 08:31:50 -08:00
mssonicbld
25ead73d10
[ci/build]: Upgrade SONiC package versions (#13893) 2023-02-21 19:42:54 +08:00
xumia
c7c59ee8c7
[Build] Clean up the debian preference config file (#13886) 2023-02-21 05:52:33 +00:00
mssonicbld
0469a2a02f
[ci/build]: Upgrade SONiC package versions (#13881) 2023-02-19 18:47:51 +08:00
mssonicbld
7521705bb8
[ci/build]: Upgrade SONiC package versions (#13877) 2023-02-18 19:09:33 +08:00
Samuel Angebault
aa912ec925
[202205][Arista] Update platform library submodules (#13871)
add SEU reporting on chassis
fix fallback logic for Clearlake eeprom identification
fix fan speed reporting for a specific model
move pcie timeout configuration for Upperlake in platform code (deprecates hwsku-init)
2023-02-17 13:52:14 -08:00
Yaqiang Zhu
928aad1944 [dhcp_relay] Remove exist check while adding dhcpv6 relay (#13822)
Why I did it
The DHCPv6 relay config entry serves no purpose once the dhcpv6 relay config is deleted.

How I did it
Remove the dhcpv6_relay entry if it is empty, and do not check whether the entry exists when adding a dhcpv6 relay.
2023-02-17 20:53:42 +08:00
Richard.Yu
cf5ca9d27c
[SAI-PTF][mlnx]Enable saiserver test container on mlnx platforms (#13311)
* Why I did it
Enable Test sai api on bfn container with a lightweight container(saiserver).
[SAI-PTF][mlnx]Enable saiserver test container on mlnx container

How I did it
Enable the saiserver container on the mlnx platform.

Add docker-saiserver-mlnx.mk for building the saiserver container.
In platform/barefoot/docker-saiserver-mlnx, add the files needed by the saiserver container.
How to verify it

Signed-off-by: richardyu-ms <richard.yu@microsoft.com>
2023-02-16 15:42:12 -08:00
mssonicbld
e170a4b8a1
[ci/build]: Upgrade SONiC package versions (#13840) 2023-02-17 07:40:37 +08:00
mssonicbld
e44b255555
[DX010 platform] fix dx010 platform testcase issues (#13595) (#13778)
Why I did it
1. fix chassis test_set_fans_led case
2. fix chassis get_name case mismatch issue
3. fix fan_drawer test_set_fans_speed
4. fix component test_components test case

How I did it
Add corresponding configuration into chassis json file

How to verify it
Run the platform test cases to verify that these failing cases now pass.

Co-authored-by: Ikki Zhu <79439153+qnos@users.noreply.github.com>
2023-02-10 18:18:00 -08:00