sonic-buildimage

Archived

Author	SHA1	Message	Date
Michael Li	132c6e934a	Reload BCM SDK kmods on syncd start to handle syncd restart issues (#12804 ) Why I did it There is an issue on the Arista PikeZ platform (using T3.X2: BCM56274) while running SONiC. If the 'syncd' container in SONiC is restarted, the expected behaviour is that syncd will automatically restart/recover; however it does not and always fails at create_switch due to BCM SDK kmod DMA operation cancellation getting stuck. Sep 16 22:19:44.855125 pkz208 ERR syncd#syncd: [none] SAI_API_SWITCH:platform_process_command:428 Platform command "init soc" failed, rc = -1. Sep 16 22:19:44.855206 pkz208 INFO syncd#supervisord: syncd CMIC_CMC0_PKTDMA_CH4_DESC_COUNT_REQ:0x33#015 Sep 16 22:19:44.855264 pkz208 CRIT syncd#syncd: [none] SAI_API_SWITCH:platformInit:1909 initialization command "init soc" failed, rc = -1 (Internal error). Sep 16 22:19:44.855403 pkz208 CRIT syncd#syncd: [none] SAI_API_SWITCH:sai_driver_init:642 Error initializing driver, rc = -1. ... Sep 16 22:19:44.855891 pkz208 CRIT syncd#syncd: [none] SAI_API_SWITCH:brcm_sai_create_switch:1173 initializing SDK failed with error Operation failed (0xfffffff5). Reloading the BCM SDK kmods allows the switch init to continue properly. How I did it If BCM SDK kmods are loaded, unload and load them again on syncd docker start script. How to verify it Steps to reproduce: In SONiC, run 'docker ps' to see current running containers; 'syncd' should be present. Run 'docker stop syncd' Wait ~1 minute. Run 'docker ps' to see that syncd is missing. Check logs to see messages similar to the above. Signed-off-by: Michael Li <michael.li@broadcom.com>	2022-12-01 01:36:18 +00:00
abdosi	81fe1d9c1a	Added Support to runtime render bgp and teamd feature state and lldp has_asic_scope flag (#11796 ) (#12856 ) Added Support to runtime render bgp and teamd feature state and lldp has_asic_scope flag	2022-11-29 13:47:37 -08:00
bingwang-ms	4f7a0b4705	Apply separated DSCP_TO_TC_MAP and TC_TO_QUEUE_MAP to uplink ports on dualtor (#12730 ) Why I did it The PR is to apply separated DSCP_TO_TC_MAP and TC_TO_QUEUE_MAP to uplink ports on dualtor. The traffic with DSCP 2 and DSCP 6 from T1 is treated as lossless traffic. DSCP/TC/Queue mapping: DSCP 2 -> TC 2 -> Queue 2; DSCP 6 -> TC 6 -> Queue 6. Traffic with DSCP 2 or DSCP 6 from downlink is still treated as lossy traffic as before. How I did it Define DSCP_TO_TC_MAP|AZURE_UPLINK and TC_TO_QUEUE_MAP|AZURE_UPLINK. How to verify it Verified by UT. Verified by copying the new template to a testbed and rendering a config_db.json.	2022-11-28 18:51:04 +00:00
Lorne Long	5a4efe211c	[Build] Use apt-get to predictably support dependency ordered configuration of lazy packages (#12164 ) Why I did it The current lazy installer relies on a filename sort for both unpack and configuration steps. When systemd services are configured [started] by multiple packages the order is by filename not by the declared package dependencies. This can cause the start order of services to differ between first-boot and subsequent boots. Declared systemd service dependencies further exacerbate the issue (e.g. blocking the first-boot script). The current installer leaves packages un-configured if the package dependency order does not match the filename order. This also fixes a trivial bug in [Build]: Support to use symbol links for lazy installation targets to reduce the image size #10923 where externally downloaded dependencies are duplicated across lazy package device directories. How I did it Changed the staging and first-boot scripts to use apt-get: dpkg -i /host/image-$SONIC_VERSION/platform/$platform/.deb becomes apt-get -y install /host/image-$SONIC_VERSION/platform/$platform/.deb when dependencies are detected during image staging. How to verify it Apt-get critical rules Add a Depends= to the control information of a package. Grep the syslog for rc.local between images and observe the configuration order of packages change.	2022-11-28 18:48:36 +00:00
abdosi	88bb83e859	[chassis-packet] fix the issue of internal ip arp not getting resolved. (#12127 ) Fix the issue where arp_update will not ping some of the ip's even though they are in failed state since grep of that ip on ip neigh show command does not do exact word match and can return multiple match.	2022-11-28 18:48:36 +00:00
arlakshm	b86b3b0d7d	[202205][chassis] update the asic_status.py to read from CHASSIS_FABRIC_ASIC_INFO_TABLE (#12780 ) Signed-off-by: Arvindsrinivasan Lakshmi Narasimhan <arlakshm@microsoft.com>	2022-11-26 20:27:30 -08:00
mssonicbld	66ba3285ac	[ci/build]: Upgrade SONiC package versions (#12830 )	2022-11-25 21:59:36 +08:00
mssonicbld	4424937611	[ci/build]: Upgrade SONiC package versions (#12812 )	2022-11-23 21:35:21 +08:00
mssonicbld	2d2305091f	[ci/build]: Upgrade SONiC package versions (#12772 )	2022-11-20 22:50:40 +08:00
mssonicbld	13b8078555	[ci/build]: Upgrade SONiC package versions (#12759 )	2022-11-19 04:22:29 +08:00
mssonicbld	f4bace99f1	[ci/build]: Upgrade SONiC package versions (#12726 )	2022-11-17 02:52:28 +08:00
mssonicbld	1be9baa1c0	[ci/build]: Upgrade SONiC package versions (#12691 )	2022-11-13 22:33:45 +08:00
mssonicbld	2b641e0505	[ci/build]: Upgrade SONiC package versions (#12656 )	2022-11-11 23:34:54 +08:00
Jing Kan	b2d3e2cf2e	[dhcp_relay] Enable DHCP Relay for BmcMgmtToRRouter in init_cfg (#12648 ) Why I did it DHCP relay feature needs to be enabled for BmcMgmtToRRouter by default How I did it Update device type list	2022-11-10 18:16:15 +00:00
Sudharsan Dhamal Gopalarathnam	1ea37e2723	[logrotate]Fix logrotate firstaction script to reflect correct size (#12599 ) - Why I did it Fix logrotate firstaction script to reflect correct size. The size was modified to change dynamically based on disk size. However this variable was not updated #9504 - How I did it Updated the variable based on disk size - How to verify it Verify in the generated rsyslog file if the variable is correctly generated from jinja template	2022-11-10 18:15:10 +00:00
bingwang-ms	d824846928	Add lossy scheduler for queue 7 (#12596 ) * Add lossy scheduler for queue 7	2022-11-10 18:14:55 +00:00
Devesh Pathak	c7ce62154b	Clear /etc/resolv.conf before building image (#12592 ) Why I did it nameserver and domain entries from build system fsroot gets into sonic image. How I did it Clear /etc/resolv.conf before building image How to verify it Built image with it and verified with install that /etc/resolv.conf is empty	2022-11-10 18:14:10 +00:00
Lawrence Lee	f60e22a5c3	[arp_update]: Fix hardcoded vlan (#12566 ) Typo in prior PR #11919 hardcodes Vlan name. Change command to use the $vlan variable instead Signed-off-by: Lawrence Lee <lawlee@microsoft.com>	2022-11-10 18:12:02 +00:00
judyjoseph	ab713dcfb6	Use the macsec_enabled flag in platform to enable macsec feature state (#11998 ) * Use the macsec_enabled flag in platform to enable macsec feature state * Add macsec supported metadata in DEVICE_RUNTIME_METADATA	2022-11-10 18:08:42 +00:00
mssonicbld	584aaa7058	[ci/build]: Upgrade SONiC package versions (#12612 )	2022-11-06 22:25:30 +08:00
mssonicbld	98c3e24770	[ci/build]: Upgrade SONiC package versions (#12606 )	2022-11-05 00:36:02 +08:00
mssonicbld	1463af1227	[ci/build]: Upgrade SONiC package versions (#12584 )	2022-11-03 00:12:56 +08:00
mssonicbld	fe62175aa6	[ci/build]: Upgrade SONiC package versions (#12571 )	2022-11-02 01:18:10 +08:00
mssonicbld	ae681eabb8	[ci/build]: Upgrade SONiC package versions (#12556 )	2022-11-01 03:49:12 +08:00
mssonicbld	483257d88c	[ci/build]: Upgrade SONiC package versions (#12543 )	2022-10-28 23:15:39 +08:00
Samuel Angebault	8e44292d74	[202205][Arista] Fix cmdline generation during warm-reboot from 201811/201911 (#12371 ) * [202012][Arista] Fix cmdline generation during warm-reboot from 201811/201911 (#11161) Issue fixed: when performing a warm-reboot or fast-reboot from 201811 or 201911 to 202012 the kernel command line contains duplicate information. This issue is related to a change that was made to make 202012 boot0 file more futureproof. A cold reboot brings everything back into a clean slate though not always desirable. Changes done: Added some logic to properly detect the end of the Aboot cmdline when cmdline-aboot-end delimiter is not set (clean case) Added some logic to regenerate the Aboot cmdline when cmdline-aboot-end is set but duplicate parameters exist before (dirty case). Reorganized some code to handle duplicate parameter handling in the allowlist. * Fix cmdline generation due to sonic_fips	2022-10-27 10:14:26 -07:00
Samuel Angebault	b1c0d8d5e4	Add emmc quirks to boot0 (#9989 ) (#12373 ) Why I did it Fix some unreliability seen on emmc device with some AMD CPUs How I did it Added a kernel parameter to add quirks to It depends on a sonic-linux-kernel change to work properly but will be a no-op without it. Description for the changelog Add emmc quirks for Upperlake	2022-10-27 07:09:03 -07:00
Devesh Pathak	17c213a264	Fix to improve hostname handling (#12064 ) * Fix to improve hostname handling If config_db.json is missing hostname entry, hostname-config.sh ends up deleting existing entry too and hostname changes to default 'localhost' * default hostname to 'sonic' if missing in config file	2022-10-25 21:52:42 +00:00
Samuel Angebault	94c8107f5e	Fix extraction of platform.tar.gz for firsttime (#11935 )	2022-10-25 20:43:32 +00:00
cytsao1	8930d70972	[pmon] Add smartmontools to pmon docker (#11837 ) * Add smartmontools to pmon docker * Set smartmontools to install version 7.2-1 in pmon to match host; clean up smartmontools build files * Add comments on smartmontools version for both host and pmon	2022-10-25 20:41:26 +00:00
xumia	db2128564b	[202205] Change submodule path from Azure to sonic-net (#12308 ) Why I did it Change the path of sonic submodules that point to "Azure" to point to "sonic-net" How I did it Replace "Azure" with "sonic-net" on all relevant paths of sonic submodules	2022-10-24 13:13:14 +08:00
mssonicbld	abc92c6248	[ci/build]: Upgrade SONiC package versions (#12452 )	2022-10-20 03:23:45 +08:00
mssonicbld	5d2db5068c	[ci/build]: Upgrade SONiC package versions (#12437 )	2022-10-18 22:19:35 +08:00
mssonicbld	cfc9af71ef	[ci/build]: Upgrade SONiC package versions (#12418 )	2022-10-16 22:24:10 +08:00
mssonicbld	b4e6a06d1a	[ci/build]: Upgrade SONiC package versions (#12409 )	2022-10-14 23:51:03 +08:00
Ying Xie	a1365b44c3	[BGP] starting BGP service after swss (#12381 ) Why I did it BGP service has always been starting after interface-config. However, recently we discovered an issue where some BGP sessions are unable to establish due to BGP daemon not able to read the interface IP. This issue was clearly observed after upgrading to FRR 8.2.2. See more details in #12380. How I did it Delaying starting BGP seems to be a workaround for this issue. However, caution is that this delay might impact warm reboot timing and other timing sequences. This workaround is reducing the probability of hitting the issue by close to 100X. However, this workaround is not bulletproof, as testing shows. It is still preferable to have a proper FRR fix and revert this change in the future. How to verify it Continuously issuing config reload and check BGP session status afterwards. Signed-off-by: Ying Xie <ying.xie@microsoft.com>	2022-10-13 16:34:10 +00:00
mssonicbld	3435a8a305	[ci/build]: Upgrade SONiC package versions (#12372 )	2022-10-13 02:58:26 +08:00
mssonicbld	1b5d61246a	[ci/build]: Upgrade SONiC package versions (#12324 )	2022-10-09 21:44:14 +08:00
Stepan Blyshchak	06f8b1f98a	[auto-ts] add memory check (#10433 ) (#12291 ) #### Why I did it To support automatic techsupport invocation in case memory usage is too high. #### How I did it Implemented according to https://github.com/Azure/SONiC/pull/939 #### How to verify it UT, manual test on the switch. DEPENDS on https://github.com/Azure/sonic-utilities/pull/2116	2022-10-06 08:06:46 -07:00
Prince George	fab37239dd	Disable bracketed-paste mode by default (#12285 ) * Disable bracketed-paste mode by default * address review comment	2022-10-06 14:58:46 +00:00
Saikrishna Arcot	ac19e2a8ba	[docker-wait-any]: Exit worker thread if main thread is expected to exit (#12255 ) There's an odd crash that intermittently happens after the teamd container exits, and a signal is raised to the main thread to exit. This thread (watching teamd) continues execution because it's in a `while True`. The subsequent wait call on the teamd container very likely returns immediately, and it calls `is_warm_restart_enabled` and `is_fast_reboot_enabled`. In either of these cases, sometimes, there is a crash in the transition from C code to Python code (after the function gets executed). Python sees that this thread got a signal to exit, because the main thread is exiting, and tells pthread to exit the thread. However, during the stack unwinding, _something_ is telling the unwinder to call `std::terminate`. The reason is unknown. This then results in a python3 SIGABRT, and systemd then doesn't call the stop script to actually stop the container (possibly because the main process exited with a SIGABRT, so it's a hard crash). This means that the container doesn't actually get stopped or restarted, resulting in an inconsistent state afterwards. The workaround appears to be that if we know the main thread needs to exit, just return here, and don't continue execution. This at least tries to avoid it from getting into the problematic code path. However, it's still feasible to get a SIGABRT, depending on thread/process timings (i.e. teamd exits, signals the main thread to exit, and then syncd exits, and syncd calls one of the two C functions, potentially hitting the issue). Signed-off-by: Saikrishna Arcot <sarcot@microsoft.com>	2022-10-06 14:57:53 +00:00
mssonicbld	204cf58221	[ci/build]: Upgrade SONiC package versions (#12278 )	2022-10-05 20:38:20 +08:00
Ying Xie	76f7d7fa53	Revert "[auto-ts] add memory check (#10433 )" This reverts commit `a2cd0f5d4c`.	2022-10-04 21:53:45 +00:00
mssonicbld	1a08069d40	[ci/build]: Upgrade SONiC package versions (#12268 )	2022-10-04 21:09:24 +08:00
Stepan Blyshchak	a2cd0f5d4c	[auto-ts] add memory check (#10433 ) #### Why I did it To support automatic techsupport invocation in case memory usage is too high. #### How I did it Implemented according to https://github.com/Azure/SONiC/pull/939 #### How to verify it UT, manual test on the switch. DEPENDS on https://github.com/Azure/sonic-utilities/pull/2116	2022-10-03 18:58:38 +00:00
mssonicbld	89643d4717	[ci/build]: Upgrade SONiC package versions (#12245 )	2022-10-02 21:13:07 +08:00
mssonicbld	a7d088c47c	[ci/build]: Upgrade SONiC package versions (#12191 )	2022-09-28 23:25:55 +08:00
mssonicbld	1c5abca0a6	[ci/build]: Upgrade SONiC package versions (#12187 )	2022-09-27 08:41:31 +08:00
mssonicbld	99f9c53d19	[ci/build]: Upgrade SONiC package versions (#12142 )	2022-09-25 21:57:18 +08:00
Volodymyr Boiko	3d620370f7	[bgp][service] Start bgp service after interfaces-config service (#11827 ) - Why I did it interfaces-config service restarts networking service, during the restart loopback interface address is being removed and reassigned back, leaving loopback without an ipv4 address for a while. On SONiC startup and config reload interfaces-config and bgp services start in parallel and sometimes fpmsyncd in bgp attempts bind to loopback while it does not have an address, fails with the log Exception "Cannot assign requested address" had been thrown in daemon and exits with rc 0. root@sonic:/# supervisorctl status fpmsyncd EXITED Jul 20 05:04 AM zebra RUNNING pid 35, uptime 6:15:05 zsocket EXITED Jul 20 05:04 AM docker logs bgp INFO exited: fpmsyncd (exit status 0; expected) With fpmsyncd dead, configured routes do not appear in the database. - How I did it Added ordering dependency on interfaces-config service into bgp.config - How to verify it Itself the issue reproduces quite rarely, but one can gain the time interval between networking down and networking up in interfaces-config.sh like this: diff --git a/files/image_config/interfaces/interfaces-config.sh b/files/image_config/interfaces/interfaces-config.sh index f6aa4147a..87caceeff 100755 --- a/files/image_config/interfaces/interfaces-config.sh +++ b/files/image_config/interfaces/interfaces-config.sh @@ -63,7 +63,11 @@ done # Read sysctl conf files again sysctl -p /etc/sysctl.d/90-dhcp6-systcl.conf -systemctl restart networking +# systemctl restart networking + +systemctl start networking +sleep 10 +systemctl stop networking # Clean-up created files rm -f /tmp/ztp_input.json /tmp/ztp_port_data.json with this change the issue reproduces on every config reload. Signed-off-by: Volodymyr Boyko <volodymyrx.boiko@intel.com>	2022-09-21 21:15:08 +00:00
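
Note on commit 88bb83e859 ([chassis-packet] internal ip arp): the message says a grep for an IP in the `ip neigh show` output is not an exact word match and can return several entries. The sketch below only illustrates that pitfall in Python; it is not the code from the commit. The sample text and the function names `substring_matches` and `exact_matches` are assumptions made for this illustration.

```python
# Hypothetical sample in the style of `ip neigh show` output; not from a real device.
SAMPLE = """\
10.0.0.1 dev eth0 lladdr 00:11:22:33:44:55 REACHABLE
10.0.0.10 dev eth0 FAILED
10.0.0.100 dev eth0 lladdr 00:11:22:33:44:66 STALE
"""


def substring_matches(ip, text):
    """Return lines that contain ip anywhere (over-matches longer addresses)."""
    return [line for line in text.splitlines() if ip in line]


def exact_matches(ip, text):
    """Return lines whose first whitespace-separated field equals ip exactly."""
    return [line for line in text.splitlines() if line.split()[:1] == [ip]]


if __name__ == "__main__":
    # The substring search also picks up 10.0.0.10 and 10.0.0.100.
    for line in substring_matches("10.0.0.1", SAMPLE):
        print("substring:", line)
    # The exact comparison keeps only the entry for 10.0.0.1 itself.
    for line in exact_matches("10.0.0.1", SAMPLE):
        print("exact:", line)
```

With a substring search, the state of a different neighbor (for example a FAILED entry for 10.0.0.10) can be mistaken for the state of 10.0.0.1, which is the kind of confusion the commit message describes.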

1086 Commits
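
Note on commit ac19e2a8ba ([docker-wait-any]): the workaround described there is that a worker thread returns as soon as it knows the main thread is expected to exit, instead of looping again and calling further checks. A minimal sketch of that pattern follows. It is not the actual docker-wait-any code: `watch`, `wait_for_exit`, `should_ignore_exit`, and `exit_event` are hypothetical names, and the two callables stand in for the real container wait call and the warm-restart / fast-reboot queries.

```python
import threading


def watch(name, wait_for_exit, should_ignore_exit, exit_event):
    """Worker loop for one container (all parameters are assumptions of this sketch).

    wait_for_exit(name)      blocks until the container exits
    should_ignore_exit(name) stands in for the warm-restart / fast-reboot checks
    exit_event               is set once the main thread is expected to exit
    """
    while True:
        wait_for_exit(name)
        if exit_event.is_set():
            # The main thread is already on its way out: return right away
            # rather than calling the follow-up checks again.
            return
        if should_ignore_exit(name):
            # Expected exit (e.g. during a warm restart): keep watching.
            continue
        # Unexpected exit: tell the main thread to stop.
        exit_event.set()


if __name__ == "__main__":
    exit_event = threading.Event()
    worker = threading.Thread(
        target=watch,
        args=("teamd", lambda n: None, lambda n: False, exit_event),
    )
    worker.start()
    exit_event.wait()   # main thread blocks until a worker requests exit
    worker.join()
```

Without the `exit_event.is_set()` check, the second pass through the loop would call the checks again after the exit has been requested, which is the code path the commit message tries to avoid.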