sonic-buildimage

Author	SHA1	Message	Date
lixiaoyuner	e33af15d2d	Install kubernetes-cni for kubelet (#14163) Why I did it Found a new bug on the kubelet side. The kubernetes-cni plug-in install step was removed in #12997 because the plug-in is installed automatically when kubeadm is installed, and keeping the explicit install code would report an error. After the removal, however, the automatically installed version differs from the one installed before. This affects kubelet behavior in some scenarios that were not seen previously, so the plug-in needs to be installed another way. How I did it Install kubernetes-cni==0.8.7-00 before installing kubeadm How to verify it The Flannel binary will be installed under the /opt/cni/bin/ folder	2023-03-19 22:32:35 +08:00
jhli-cisco	098678fd3f	[sonic-slave]: update sonic-slave docker files to include cisco sdk dependencies (#14203) cisco SDK dependencies needed	2023-03-19 22:32:29 +08:00
Neetha John	17bf0c85cb	Update dynamic threshold for TD2 (#14224) Why I did it Update dynamic threshold to -1 to get optimal performance for RDMA traffic How I did it Modified pg_profile_lookup.ini to reflect the correct value Signed-off-by: Neetha John <nejo@microsoft.com>	2023-03-19 22:32:26 +08:00
Neetha John	0aacc4531a	[storage_backend] Add backend acl service (#14229) Why I did it This PR addresses the issue mentioned above by loading the acl config as a service on a storage backend device How I did it The new acl service is a oneshot service which starts after swss and retries to ensure that the SWITCH_CAPABILITY info is present before attempting to load the acl rules. The service is also bound to sonic targets, which ensures that it gets restarted during minigraph reload and config reload How to verify it Built an image with these changes and ran the following tests: Verified that acl is loaded successfully on a storage backend device after a switch boot up. Verified that acl is loaded successfully on a storage backend ToR after minigraph load and config reload. Verified that acl is not loaded if the device is not a storage backend ToR or the device does not have a DATAACL table. Signed-off-by: Neetha John <nejo@microsoft.com>	2023-03-19 22:32:22 +08:00
mssonicbld	5c55eb8c40	[ci/build]: Upgrade SONiC package versions	2023-03-19 20:51:06 +08:00
Sudharsan Dhamal Gopalarathnam	156189dbad	[Mellanox] Fix lpmode set when logical port is larger than 64 (#14138) - Why I did it In the sfplpm API, the number of logical ports is hardcoded as 64. When a system contains more ports than this, the SDK APIs fail with syslog messages such as: Mar 7 03:53:58.105980 r-leopard-58 ERR syncd#SDK: [MGMT_LIB.ERR] Slot [0] Module [0] has logport [0x00010069] in enabled state Mar 7 03:53:58.105980 r-leopard-58 ERR syncd#SDK: [SDK_MGMT_LIB.ERR] Failed in __sdk_mgmt_phy_module_pwr_attr_set, error: Internal Error Mar 7 03:53:58.106118 r-leopard-58 ERR pmon#-c: Error occurred when setting power mode for SFP module 0, slot 0, error code 1 - How I did it Removed the hardcoded value of 64 and obtained the number of logical ports from the SDK - How to verify it Manual testing	2023-03-19 20:50:58 +08:00
Junhua Zhai	29f3c4944a	[gearbox] use credo sai v0.9.0 (#14149) Update credo sai package to the latest v0.9.0.	2023-03-19 20:50:53 +08:00
Dror Prital	ba14f728de	Update SDK/FW to version 4.5.4206/4.5.4204 (#14164) - Why I did it To include latest fixes: Fix: traffic loss on all routed traffic when moving from 4.4.3372/XX_2008_3388 to 4.5.4118-012/XX_2010_4120-010. The issue occurred after the ISSU process on Spectrum 1 only, when upgrading from an older version to a new one. Neighbor entries are overwritten. Fix: when using a mirror session policer on SPC2/3, the actual CIR was 1.28 times more than the configured CIR value. Fix: creation of a router interface of type bridge may occasionally fail if create is performed immediately after delete. Fix: false errors during SDK deinitialization may be seen in the syslog - How I did it Updated SDK submodule and relevant makefiles with the required versions. - How to verify it Build an image and run tests from "sonic-mgmt".	2023-03-19 20:50:49 +08:00
dbarashinvd	d7ba89a95b	[Mellanox] fix for watchdog device not found, adding dependency on hw-management (#14182) - Why I did it Sometimes the Nvidia watchdog device isn't ready when the watchdog-control service comes up after the first installation from ONIE. The watchdog-control service needs to be delayed until after hw-mgmt, which gets devices up and ready - How I did it Delay the Nvidia watchdog-control service until hw-mgmt has started on the Mellanox platform, in order to avoid a missing or not-ready watchdog device. - How to verify it Ran the ONIE image installation in a loop, making sure the watchdog service is always up (not failed) after the first installation from ONIE	2023-03-19 20:50:44 +08:00
Volodymyr Samotiy	cc5ed4b632	[Mellanox] Update MFT to 4.22.1-15 (#14133) Signed-off-by: Volodymyr Samotiy <volodymyrs@nvidia.com>	2023-03-19 18:33:57 +08:00
mssonicbld	66447256a6	[ci/build]: Upgrade SONiC package versions (#14313)	2023-03-18 19:58:17 +08:00
mssonicbld	4e54c580cd	[submodule] Update submodule to the latest HEAD automatically (#14308)	2023-03-18 15:59:42 +08:00
mssonicbld	9eb5cb4104	[ci/build]: Upgrade SONiC package versions (#14301)	2023-03-18 05:28:33 +08:00
mssonicbld	16eca71f35	[submodule] Update submodule to the latest HEAD automatically	2023-03-17 16:36:38 +08:00
Vivek	efc79b2272	[202211] Advance sonic-dbsyncd submodule (#14226) fa8b709 Handled the error case of negative age (#57) 990f5b0 Use github code scanning instead of LGTM (#55) a7992c5 Install libyang for swss-common. (#50) 244fa86 Update README.md Signed-off-by: Vivek Reddy <vkarri@nvidia.com>	2023-03-16 20:57:40 +08:00
mssonicbld	5312a814b3	[submodule] Update submodule to the latest HEAD automatically	2023-03-15 12:36:48 +08:00
Sudharsan Dhamal Gopalarathnam	bc414bb82d	[202211][yang] Add missing fields in PortChannel yang model (#14045) (#14145) Manual cherry-pick of #14045 Why I did it Fixing issue #13983 Added missing fields in the sonic-portchannel yang model. The "fallback" and "fast_rate" fields are present in the configuration schema but not in the yang model. This leads to a traceback when yang is validated: sonic_yang(3):All Keys are not parsed in PORTCHANNEL dict_keys(['PortChannel100']) sonic_yang(3):exceptionList:["'fast_rate'"] sonic_yang(3):Data Loading Failed:All Keys are not parsed in PORTCHANNEL dict_keys(['PortChannel100']) exceptionList:["'fast_rate'"] Data Loading Failed All Keys are not parsed in PORTCHANNEL dict_keys(['PortChannel100']) exceptionList:["'fast_rate'"] ConfigMgmt Class creation failed Failed to break out Port. Error: Failed to load the config. Error: ConfigMgmtDPB Class creation failed How I did it Updated yang model How to verify it Added tests to verify	2023-03-14 12:06:34 +08:00
xumia	05b89457c2	[Build] Fix the mirror gpg key expired issue (#14206) Why I did it Fix the mirror gpg key expired issue. See vs build: https://dev.azure.com/mssonic/build/_build/results?buildId=231680&view=logs&j=cef3d8a9-152e-5193-620b-567dc18af272&t=cf595088-5c84-5cf1-9d7e-03331f31d795 How I did it Add the apt option not to check Valid-Until. The option is set in the SONiC docker base image, but docker-ptf was missing it. Acquire::Check-Valid-Until "false"; How to verify it The build of docker-ptf succeeds after the fix. 2023-03-11T17:26:35.1801999Z [ building ] [ target/docker-ptf.gz ] 2023-03-11T17:38:10.1608536Z [ finished ] [ target/docker-ptf.gz ]	2023-03-13 16:37:49 +08:00
Andriy Yurkiv	c4e488c84f	[Dual-ToR] add default value for ACL rule for mellanox platform (#13547) - Why I did it Need to add the possibility to choose between dropping packets (using ACL) on ingress or egress in the Dual ToR scenario - How I did it Add new attribute "mux_tunnel_ingress_acl" to SYSTEM_DEFAULTS table - How to verify it Check that the new attribute exists in redis: admin@sonic:~$ redis-cli -n 4 127.0.0.1:6379[4]> HGETALL SYSTEM_DEFAULTS|mux_tunnel_ingress_acl 1."state" 2."false" Signed-off-by: Andriy Yurkiv <ayurkiv@nvidia.com>	2023-03-10 14:39:38 +08:00
zitingguo-ms	3c312dec1c	Upgrade SAI xgs version to 8.4.0.2 and migrate to DMZ (#14119) Why I did it Update SAI xgs version to 8.4.0.2 and migrate xgs to DMZ repo. How I did it Update SAI xgs version in sai.mk. How to verify it Run the SONiC and SAI tests with the 8.4 SAI release pipeline.	2023-03-09 14:52:08 +08:00
Samuel Angebault	6173b4dbe5	[Arista] Disable SSD NCQ on Lodoga (#13964) Why I did it Fix similar issue seen on #13739 but only for DCS-7050CX3-32S How I did it Add a kernel parameter to tell libata to disable NCQ How to verify it The message ata2.00: FORCE: horkage modified (noncq) should appear on the dmesg. Test results using: fio --direct=1 --rw=randrw --bs=64k --ioengine=libaio --iodepth=64 --runtime=120 --numjobs=4 with NCQ READ: bw=26.1MiB/s (27.4MB/s), 26.1MiB/s-26.1MiB/s (27.4MB/s-27.4MB/s), io=3136MiB (3288MB), run=120053-120053msec WRITE: bw=26.3MiB/s (27.6MB/s), 26.3MiB/s-26.3MiB/s (27.6MB/s-27.6MB/s), io=3161MiB (3315MB), run=120053-120053msec without NCQ READ: bw=22.0MiB/s (23.1MB/s), 22.0MiB/s-22.0MiB/s (23.1MB/s-23.1MB/s), io=2647MiB (2775MB), run=120069-120069msec WRITE: bw=22.2MiB/s (23.3MB/s), 22.2MiB/s-22.2MiB/s (23.3MB/s-23.3MB/s), io=2665MiB (2795MB), run=120069-120069msec	2023-03-08 13:50:25 +08:00
Stepan Blyshchak	969166d769	[Mellanox] Place FW binaries under platform directory instead of squashfs (#13837) Fixes #13568 Upgrade from an old image always requires a squashfs mount to get the next image FW binary. This can be avoided if we put the FW binary under the platform directory, which is easily accessible after installation: admin@r-spider-05:~$ ls /host/image-fw-new-loc.0-dirty-20230208.193534/platform/fw-SPC.mfa /host/image-fw-new-loc.0-dirty-20230208.193534/platform/fw-SPC.mfa admin@r-spider-05:~$ ls -al /tmp/image-fw-new-loc.0-dirty-20230208.193534-fs/etc/mlnx/fw-SPC.mfa lrwxrwxrwx 1 root root 66 Feb 8 17:57 /tmp/image-fw-new-loc.0-dirty-20230208.193534-fs/etc/mlnx/fw-SPC.mfa -> /host/image-fw-new-loc.0-dirty-20230208.193534/platform/fw-SPC.mfa - Why I did it 202211 and above use a different squashfs compression type that the 201911 kernel cannot handle. Therefore, we avoid mounting squashfs altogether with this change. - How I did it Place the FW binary under /host/image-/platform/mlnx/; soft links in /etc/mlnx are created to avoid breaking existing scripts/automation. /etc/mlnx/fw-SPCX.mfa is a soft link always pointing to the FW that should be used in the current image. mlnx-fw-upgrade.sh is updated to prefer the /host/image-/platform/mlnx location and fall back to /etc/mlnx in squashfs in case the new location does not exist. This is necessary to do image downgrade. - How to verify it Upgrade from 201911 to master; master to 201911 downgrade; master -> master reboot; ONIE -> master boot (first FW burn)	2023-03-08 13:50:18 +08:00
StormLiangMS	f06732632a	[submodule advance] Advance/sonic utilities 202211 #14124 Why I did it 8c7ddf56 - [warm/fast-reboot] Backup logs from tmpfs to disk during fast/warm shutdown (#2714) (3 hours ago) [Vaibhav Hemant Dixit] f2a31b30 - [ci] Fix pipeline issue caused by sonic-slave-* change. (#2709) (3 hours ago) [Liu Shilong] 586ecf0e - [dhcp_relay] Fix dhcp_relay restart error while add/del vlan (#2688) (3 hours ago) [Yaqiang Zhu] 07b0ef4c - [portstat CLI] don't print reminder if use json format (#2670) (3 hours ago) [wenyiz2021] 48d3d3ef - [show][muxcable] add some new commands health, reset-cause, queue_info support for muxcable (#2414) (3 hours ago) [vdahiya12] How I did it How to verify it	2023-03-08 08:22:40 +08:00
StormLiangMS	b1445648ae	[submodule advance] advance sonic-swss #14116 Why I did it submodule advance b085b5f - [ci] Fix pipeline error about team5 not found. (#2684) (3 hours ago) [Liu Shilong] 4549b4c - Fix issue: there is no retry while creating a RIF which is in removing state (#2679) (3 hours ago) [Junchao-Mellanox] 980a45b - [FDB]Fixing FDB consolidated flush for Remote MACs (#2673) (3 hours ago) [Sudharsan Dhamal Gopalarathnam] c646607 - Do not allow to add port to .1Q bridge while router port deletion is not completed (#2669) (3 hours ago) [Lior Avramov] 4a321f0 - [orchagent]: Get bridge port ID from orchagent cache instead of SAI API (#2657) (3 hours ago) [Lawrence Lee] f4b88f3 - [Dual-ToR] handle 'mux_tunnel_egress_acl' attrib in order to change ACL configuration (drop on ingress/egress) on standby ToR (#2646) (3 hours ago) [Andriy Yurkiv] a4f29c1 - [Workaround] EvpnRemoteVnip2pOrch warmboot check failure (#2626) (3 hours ago) [jcaiMR] 53ee0a8 - Support for tc-dot1p and tc-dscp qosmap (#2559) (3 hours ago) [Divya Mukundan] b953866 - [dual-tor] add missing SAI attribute in order to create IPNIP tunnel (#2503) (3 hours ago) [Andriy Yurkiv] How I did it How to verify it	2023-03-08 08:21:53 +08:00
Sudharsan Dhamal Gopalarathnam	e1536c00a7	[netlink] Increase netlink buffer size from 3MB to 16MB (#13965) Why I did it Following PR https://github.com/sonic-net/sonic-swss-common/pull/739, this increases the netlink buffer size in the Linux kernel. fdbsyncd reports the netlink error "out of memory on reading a netlink socket". It is seen when the kernel is sending 10k remote MACs to fdbsyncd. How I did it Increased the size of the netlink buffer from 3MB to 16MB How to verify it Verified with 10k remote MACs by restarting the fdbsyncd process, so that the kernel sends the bridge fdb dump to fdbsyncd. Verified that the netlink buffer error is not reported in the syslog.	2023-03-08 06:35:20 +08:00
StormLiangMS	e57197bc8c	[submodule advance] Advance/sonic sairedis 202211 #14121 Why I did it cf9a66b - Fix issue: bulk counter feature is disabled (#1205) (4 hours ago) [Lior Avramov] 8b1583b - [Dual-ToR] update sai.profile with SAI_ADDITIONAL_MAC_ENABLED attribute if corresponding arg passed to syncd (#1201) (4 hours ago) [Andriy Yurkiv] 50d8e21 - [syncd]: Enable port bulk API (#1197) (4 hours ago) [Nazarii Hnydyn] a72438a - Use new value of STATE_DB FAST_REBOOT entry (#1196) (4 hours ago) [Aryeh Feigin] d78ce86 - validation support for SAI_ATTR_VALUE_TYPE_JSON (#1152) (4 hours ago) [svshah-intel] How I did it How to verify it	2023-03-08 00:32:39 +08:00
StormLiangMS	132ff067d3	[submodule advance] Advance/sonic platform common 202211 #14122 Why I did it 9ccaaa5 - Update host electrical interface for 2x100G AOC (#346) (4 hours ago) [mihirpat1] d7016a4 - [ssd_generic] Get health status from Remaining_Life_Left field for virtium SSD (#344) (4 hours ago) [Junchao-Mellanox] How I did it How to verify it	2023-03-07 23:11:55 +08:00
StormLiangMS	fab25c9d4a	[submodule advance] advance src/sonic-platform-daemons 202211 #14123 Why I did it 6391de0 - [ycable] add changes for correcting telemetry values for 'active-active' (#341) (4 hours ago) [vdahiya12] 2cb31c4 - Update CMIS module types for 2x100G AOC support (#339) (4 hours ago) [mihirpat1] 2ea9cf2 - [ycabled] add more coverage to ycabled; add minor name change for vendor API CLI return key-values pairs (#338) (4 hours ago) [vdahiya12] How I did it How to verify it	2023-03-07 23:11:20 +08:00
StormLiangMS	d8765f780a	[submodule advance] advance src/sonic-swss-common 202211 #14126 Why I did it e732ed0 - Prevent sonic-db-cli generate core dump (#749) (4 minutes ago) [Hua Liu] 28adcb4 - Support for TC-DOT1p qos map (#721) (5 minutes ago) [Divya Mukundan] How I did it How to verify it	2023-03-07 23:10:23 +08:00
Mai Bui	eeb3ae17a6	Revert "[system-health] Remove subprocess with shell=True (#12572)" (#13505) This reverts commit `b3a8167968`. Due to issue https://github.com/sonic-net/sonic-buildimage/issues/13432	2023-03-06 19:30:11 +08:00
mssonicbld	aea96da04d	[Mellanox] Fix issue: cannot find label port for logical port when logical port number is larger than 64 (#13710) (#13962)	2023-03-06 16:47:31 +08:00
mssonicbld	523cd8dab5	[ci/build]: Upgrade SONiC package versions (#14077)	2023-03-04 20:49:07 +08:00
xumia	b8fe3c2989	[Build] Support to use loosen version when failed to install python packages (#14013) Why I did it Support falling back to a loosened version when installing python packages fails. This fixes issue #14012 How I did it Retry the installation command without the constraint How to verify it	2023-03-03 19:30:57 +08:00
mssonicbld	1757f53290	[Mellanox] update sdk/fw build procedure (#14025) (#14059)	2023-03-03 02:43:19 +08:00
mssonicbld	72f9f51287	[Seastone] fix dx010 qsfp eeprom data write issue (#13930) (#14032)	2023-03-01 19:28:38 +08:00
Sudharsan Dhamal Gopalarathnam	76cc29b19d	[202211] Added vni field in VRF Yang for VxLAN L3 VNI Support (#13980) Manual cherry-pick of #13735 Why I did it Added vni field in VRF Yang for VxLAN L3 VNI Support. The VRF table schema as per EVPN HLD is below https://github.com/sonic-net/SONiC/blob/master/doc/vxlan/EVPN/EVPN_VXLAN_HLD.md Addresses Issue #13456	2023-02-28 14:35:20 +08:00
Patrick MacArthur	ff5605ae00	fix platform.json on Wolverine for thermal sensors (#13984) Why I did it Manual rebase of PR #13524 to 202211 branch. How I did it See PR #13524	2023-02-28 08:54:01 +08:00
mssonicbld	f1f1af841f	[ci/build]: Upgrade SONiC package versions (#13994)	2023-02-26 19:41:42 +08:00
mssonicbld	f18f424d17	[ci/build]: Upgrade SONiC package versions (#13990)	2023-02-25 20:39:59 +08:00
judyjoseph	16e3a72925	Voq Chassis: Add the Recirc ports to the INTERFACES table to make it routed intf (#13779) * VOQ: Add the Recirc ports to the INTERFACES table to make it routed intf * Add a test to cover Recirc port generation in INTERFACE table	2023-02-25 06:35:01 +08:00
mssonicbld	18bc044179	Remove support to Mellanox SPC4 ASIC (#13932) (#13957)	2023-02-23 22:22:35 +08:00
mssonicbld	310827c26c	Add PYTHON3_SWSSCOMMON as build time dependency to Mellanox platform API (#13847) (#13959)	2023-02-23 20:32:15 +08:00
mssonicbld	50aaf92590	[Mellanox] Non upstream patches for hw-mgmt V.4.0020.4104 (#13792) (#13960)	2023-02-23 20:32:09 +08:00
Junchao-Mellanox	e8789a2e11	[Mellanox] Check system eeprom existence in a retry manner (#13884) - Why I did it On the Mellanox platform, the system EEPROM is a soft link provided by hw-management. There is a chance that the config-setup service accesses the EEPROM before hw-management creates it, which causes errors. This PR aims to fix that. - How I did it Wait for EEPROM creation in the platform API for up to 10 seconds. - How to verify it Manual test	2023-02-23 20:31:29 +08:00
mssonicbld	6a12ca9332	[Mellanox] [ECMP calculator] Add support for 4600/4600C/2201 platforms with different interface naming method (#13814) (#13931)	2023-02-22 22:14:09 +08:00
andywongarista	be51191fd8	[Arista] Add other chassis names to platform_components.json for 720DT-48S (#12378) Why I did it The 720DT-48S platform has variants with different chassis names, and these need to all be included in platform_components.json to ensure that sonic-mgmt platform_tests/fwutil/test_fwutil.py::test_fwutil_show passes How I did it Updated platform_components.json with the variant names for 720DT-48S. How to verify it Ran aforementioned testcase and verified that it passes on the different variants.	2023-02-22 20:55:50 +08:00
Stepan Blyshchak	708e83ea63	[dockerd] Force usage of cgo DNS resolver (#13649) Go's runtime (and dockerd inherits this) uses own DNS resolver implementation by default on Linux. It has been observed that there are some DNS resolution issues when executing ```docker pull``` after first boot. Consider the following script:
```
admin@r-boxer-sw01:~$ while :; do date; cat /etc/resolv.conf; ping -c 1 harbor.mellanox.com; docker pull harbor.mellanox.com/sonic/cpu-report:1.0.0 ; sleep 1; done
Fri 03 Feb 2023 10:06:22 AM UTC
nameserver 10.211.0.124
nameserver 10.211.0.121
nameserver 10.7.77.135
search mtr.labs.mlnx labs.mlnx mlnx lab.mtl.com mtl.com
PING harbor.mellanox.com (10.7.1.117) 56(84) bytes of data.
64 bytes from harbor.mtl.labs.mlnx (10.7.1.117): icmp_seq=1 ttl=53 time=5.99 ms
--- harbor.mellanox.com ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 0ms
rtt min/avg/max/mdev = 5.989/5.989/5.989/0.000 ms
Error response from daemon: Get "https://harbor.mellanox.com/v2/": dial tcp: lookup harbor.mellanox.com on [::1]:53: read udp [::1]:57245->[::1]:53: read: connection refused
Fri 03 Feb 2023 10:06:23 AM UTC
nameserver 10.211.0.124
nameserver 10.211.0.121
nameserver 10.7.77.135
search mtr.labs.mlnx labs.mlnx mlnx lab.mtl.com mtl.com
PING harbor.mellanox.com (10.7.1.117) 56(84) bytes of data.
64 bytes from harbor.mtl.labs.mlnx (10.7.1.117): icmp_seq=1 ttl=53 time=5.56 ms
--- harbor.mellanox.com ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 0ms
rtt min/avg/max/mdev = 5.561/5.561/5.561/0.000 ms
Error response from daemon: Get "https://harbor.mellanox.com/v2/": dial tcp: lookup harbor.mellanox.com on [::1]:53: read udp [::1]:53299->[::1]:53: read: connection refused
Fri 03 Feb 2023 10:06:24 AM UTC
nameserver 10.211.0.124
nameserver 10.211.0.121
nameserver 10.7.77.135
search mtr.labs.mlnx labs.mlnx mlnx lab.mtl.com mtl.com
PING harbor.mellanox.com (10.7.1.117) 56(84) bytes of data.
64 bytes from harbor.mtl.labs.mlnx (10.7.1.117): icmp_seq=1 ttl=53 time=5.78 ms
--- harbor.mellanox.com ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 0ms
rtt min/avg/max/mdev = 5.783/5.783/5.783/0.000 ms
Error response from daemon: Get "https://harbor.mellanox.com/v2/": dial tcp: lookup harbor.mellanox.com on [::1]:53: read udp [::1]:55765->[::1]:53: read: connection refused
Fri 03 Feb 2023 10:06:25 AM UTC
nameserver 10.211.0.124
nameserver 10.211.0.121
nameserver 10.7.77.135
search mtr.labs.mlnx labs.mlnx mlnx lab.mtl.com mtl.com
PING harbor.mellanox.com (10.7.1.117) 56(84) bytes of data.
64 bytes from harbor.mtl.labs.mlnx (10.7.1.117): icmp_seq=1 ttl=53 time=7.17 ms
--- harbor.mellanox.com ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 0ms
rtt min/avg/max/mdev = 7.171/7.171/7.171/0.000 ms
Error response from daemon: Get "https://harbor.mellanox.com/v2/": dial tcp: lookup harbor.mellanox.com on [::1]:53: read udp [::1]:44877->[::1]:53: read: connection refused
Fri 03 Feb 2023 10:06:26 AM UTC
nameserver 10.211.0.124
nameserver 10.211.0.121
nameserver 10.7.77.135
search mtr.labs.mlnx labs.mlnx mlnx lab.mtl.com mtl.com
PING harbor.mellanox.com (10.7.1.117) 56(84) bytes of data.
64 bytes from harbor.mtl.labs.mlnx (10.7.1.117): icmp_seq=1 ttl=53 time=5.66 ms
--- harbor.mellanox.com ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 0ms
rtt min/avg/max/mdev = 5.656/5.656/5.656/0.000 ms
Error response from daemon: Get "https://harbor.mellanox.com/v2/": dial tcp: lookup harbor.mellanox.com on [::1]:53: read udp [::1]:54604->[::1]:53: read: connection refused
Fri 03 Feb 2023 10:06:27 AM UTC
nameserver 10.211.0.124
nameserver 10.211.0.121
nameserver 10.7.77.135
search mtr.labs.mlnx labs.mlnx mlnx lab.mtl.com mtl.com
PING harbor.mellanox.com (10.7.1.117) 56(84) bytes of data.
64 bytes from harbor.mtl.labs.mlnx (10.7.1.117): icmp_seq=1 ttl=53 time=8.22 ms
--- harbor.mellanox.com ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 0ms
rtt min/avg/max/mdev = 8.223/8.223/8.223/0.000 ms
1.0.0: Pulling from sonic/cpu-report
004f1eed87df: Downloading [===================> ] 19.3MB/50.43MB
5d6f1e8117db: Download complete
48c2faf66abe: Download complete
234b70d0479d: Downloading [=========> ] 9.363MB/51.84MB
6fa07a00e2f0: Downloading [==> ] 9.51MB/192.4MB
04a31b4508b8: Waiting
e11ae5168189: Waiting
8861a99744cb: Waiting
d59580d95305: Waiting
12b1523494c1: Waiting
d1a4b09e9dbc: Waiting
99f41c3f014f: Waiting
```
While /etc/resolv.conf has the correct content and ping (and any other utility that uses libc's DNS resolution implementation) works correctly docker is unable to resolve the hostname and falls back to default [::1]:53. This started to happen after PR https://github.com/sonic-net/sonic-buildimage/pull/13516 has been merged. As you can see from the log, dockerd is able to pick up the correct /etc/resolv.conf only after 5 sec since first try. This seems to be somehow related to the logic in Go's DNS resolver https://github.com/golang/go/blob/master/src/net/dnsclient_unix.go#L385. There have been issues like that reported in docker like: - https://github.com/docker/cli/issues/2299 - https://github.com/docker/cli/issues/2618 - https://github.com/moby/moby/issues/22398 Since this starts to happen after inclusion of resolvconf package by above mentioned PR and the fact I can't see any problem with that (ping, nslookup, etc. works) the choice is made to force dockerd to use cgo (libc) resolver. Signed-off-by: Stepan Blyschak <stepanb@nvidia.com>	2023-02-22 20:55:46 +08:00
Saikrishna Arcot	228763fac7	Add lsof and sysstat packages to the base system for debugging purposes (#13741) The lsof and sysstat packages make determining what files/sockets a program has open a bit easier. This helps if, for example, some application has a file open that's been deleted from disk. Signed-off-by: Saikrishna Arcot <sarcot@microsoft.com>	2023-02-22 20:55:41 +08:00
mssonicbld	6d66a320a6	[ci/build]: Upgrade SONiC package versions	2023-02-22 20:55:33 +08:00
Pavan-Nokia	d7815f3229	add sfp get error description (#13275) Why I did it Command "sudo sfputil show error-status -hw" shows "OK (Not implemented)" in the output. How I did it Add a new SFP API get_error_description support in Nokia sonic-platform sfp.py module. How to verify it Run the new image and execute command "sudo sfputil show error-status -hw"	2023-02-22 18:36:56 +08:00
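
Commit e8789a2e11 above ([Mellanox] Check system eeprom existence in a retry manner) describes waiting up to 10 seconds for a soft link to be created before reading it. A minimal sketch of that kind of bounded wait is shown here. It is not the code from that PR: the helper name `wait_for_path` and the polling interval are assumptions made for illustration, and only the 10-second upper bound comes from the commit message.

```python
import os
import time


def wait_for_path(path, timeout=10.0, interval=0.5):
    """Poll until `path` exists or `timeout` seconds have elapsed.

    Hypothetical helper: the name, signature, and interval are illustrative.
    Returns True if the path appeared in time, False otherwise.
    """
    deadline = time.monotonic() + timeout
    while True:
        # os.path.exists follows symlinks, so a soft link whose target
        # has not been created yet still counts as "not ready".
        if os.path.exists(path):
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)
```

A caller would check the return value before reading the file and report an error if the wait timed out, rather than failing on the first access.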


7240 Commits