sonic-buildimage

Author	SHA1	Message	Date
mssonicbld	5d2db5068c	[ci/build]: Upgrade SONiC package versions (#12437 )	2022-10-18 22:19:35 +08:00
mssonicbld	cfc9af71ef	[ci/build]: Upgrade SONiC package versions (#12418 )	2022-10-16 22:24:10 +08:00
mssonicbld	b4e6a06d1a	[ci/build]: Upgrade SONiC package versions (#12409 )	2022-10-14 23:51:03 +08:00
Ying Xie	a1365b44c3	[BGP] starting BGP service after swss (#12381 ) Why I did it BGP service has always been starting after interface-config. However, recently we discovered an issue where some BGP sessions are unable to establish due to BGP daemon not able to read the interface IP. This issue was clearly observed after upgrading to FRR 8.2.2. See more details in #12380. How I did it Delaying starting BGP seems to be a workaround for this issue. However, note that this delay might impact warm reboot timing and other timing sequences. This workaround is reducing the probability of hitting the issue by close to 100X. However, this workaround is not bulletproof, as testing shows. It is still preferable to have a proper FRR fix and revert this change in the future. How to verify it Continuously issuing config reload and check BGP session status afterwards. Signed-off-by: Ying Xie <ying.xie@microsoft.com>	2022-10-13 16:34:10 +00:00
mssonicbld	3435a8a305	[ci/build]: Upgrade SONiC package versions (#12372 )	2022-10-13 02:58:26 +08:00
mssonicbld	1b5d61246a	[ci/build]: Upgrade SONiC package versions (#12324 )	2022-10-09 21:44:14 +08:00
Stepan Blyshchak	06f8b1f98a	[auto-ts] add memory check (#10433 ) (#12291 ) #### Why I did it To support automatic techsupport invocation in case memory usage is too high. #### How I did it Implemented according to https://github.com/Azure/SONiC/pull/939 #### How to verify it UT, manual test on the switch. DEPENDS on https://github.com/Azure/sonic-utilities/pull/2116	2022-10-06 08:06:46 -07:00
Prince George	fab37239dd	Disable brackted-paste mode off by default (#12285 ) * Disable brackted-paste mode off by default * address review comment	2022-10-06 14:58:46 +00:00
Saikrishna Arcot	ac19e2a8ba	[docker-wait-any]: Exit worker thread if main thread is expected to exit (#12255 ) There's an odd crash that intermittently happens after the teamd container exits, and a signal is raised to the main thread to exit. This thread (watching teamd) continues execution because it's in a `while True`. The subsequent wait call on the teamd container very likely returns immediately, and it calls `is_warm_restart_enabled` and `is_fast_reboot_enabled`. In either of these cases, sometimes, there is a crash in the transition from C code to Python code (after the function gets executed). Python sees that this thread got a signal to exit, because the main thread is exiting, and tells pthread to exit the thread. However, during the stack unwinding, _something_ is telling the unwinder to call `std::terminate`. The reason is unknown. This then results in a python3 SIGABRT, and systemd then doesn't call the stop script to actually stop the container (possibly because the main process exited with a SIGABRT, so it's a hard crash). This means that the container doesn't actually get stopped or restarted, resulting in an inconsistent state afterwards. The workaround appears to be that if we know the main thread needs to exit, just return here, and don't continue execution. This at least tries to avoid it from getting into the problematic code path. However, it's still feasible to get a SIGABRT, depending on thread/process timings (i.e. teamd exits, signals the main thread to exit, and then syncd exits, and syncd calls one of the two C functions, potentially hitting the issue). Signed-off-by: Saikrishna Arcot <sarcot@microsoft.com> Signed-off-by: Saikrishna Arcot <sarcot@microsoft.com>	2022-10-06 14:57:53 +00:00
mssonicbld	204cf58221	[ci/build]: Upgrade SONiC package versions (#12278 )	2022-10-05 20:38:20 +08:00
Ying Xie	76f7d7fa53	Revert "[auto-ts] add memory check (#10433 )" This reverts commit `a2cd0f5d4c`.	2022-10-04 21:53:45 +00:00
mssonicbld	1a08069d40	[ci/build]: Upgrade SONiC package versions (#12268 )	2022-10-04 21:09:24 +08:00
Stepan Blyshchak	a2cd0f5d4c	[auto-ts] add memory check (#10433 ) #### Why I did it To support automatic techsupport invocation in case memory usage is too high. #### How I did it Implemented according to https://github.com/Azure/SONiC/pull/939 #### How to verify it UT, manual test on the switch. DEPENDS on https://github.com/Azure/sonic-utilities/pull/2116	2022-10-03 18:58:38 +00:00
mssonicbld	89643d4717	[ci/build]: Upgrade SONiC package versions (#12245 )	2022-10-02 21:13:07 +08:00
mssonicbld	a7d088c47c	[ci/build]: Upgrade SONiC package versions (#12191 )	2022-09-28 23:25:55 +08:00
mssonicbld	1c5abca0a6	[ci/build]: Upgrade SONiC package versions (#12187 )	2022-09-27 08:41:31 +08:00
mssonicbld	99f9c53d19	[ci/build]: Upgrade SONiC package versions (#12142 )	2022-09-25 21:57:18 +08:00
Volodymyr Boiko	3d620370f7	[bgp][service] Start bgp service after interfaces-config service (#11827 ) - Why I did it The interfaces-config service restarts the networking service. During the restart the loopback interface address is removed and then reassigned, leaving loopback without an IPv4 address for a while. On SONiC startup and config reload, the interfaces-config and bgp services start in parallel, and sometimes fpmsyncd in bgp attempts to bind to loopback while it does not have an address, fails with the log Exception "Cannot assign requested address" had been thrown in daemon and exits with rc 0. root@sonic:/# supervisorctl status fpmsyncd EXITED Jul 20 05:04 AM zebra RUNNING pid 35, uptime 6:15:05 zsocket EXITED Jul 20 05:04 AM docker logs bgp INFO exited: fpmsyncd (exit status 0; expected) With fpmsyncd dead, configured routes do not appear in the database. - How I did it Added ordering dependency on interfaces-config service into bgp.config - How to verify it The issue itself reproduces quite rarely, but one can widen the time interval between networking down and networking up in interfaces-config.sh like this: diff --git a/files/image_config/interfaces/interfaces-config.sh b/files/image_config/interfaces/interfaces-config.sh index f6aa4147a..87caceeff 100755 --- a/files/image_config/interfaces/interfaces-config.sh +++ b/files/image_config/interfaces/interfaces-config.sh @@ -63,7 +63,11 @@ done # Read sysctl conf files again sysctl -p /etc/sysctl.d/90-dhcp6-systcl.conf -systemctl restart networking +# systemctl restart networking + +systemctl start networking +sleep 10 +systemctl stop networking # Clean-up created files rm -f /tmp/ztp_input.json /tmp/ztp_port_data.json With this change the issue reproduces on every config reload. Signed-off-by: Volodymyr Boyko <volodymyrx.boiko@intel.com>	2022-09-21 21:15:08 +00:00
Maxime Lorrillere	458b12b4af	[Chassis][Voq]Configure midplane network on supervisor (#11725 ) Multi-asic Docker instances are created behind Docker's default bridge which doesn't allow talking to other Docker instances that are in the host network (like database-chassis). On linecards, we configure midplane interfaces to let per-asic docker containers talk to CHASSIS_DB on the supervisor through internal chassis network. On the supervisor we don't need to use chassis internal network, but we still need a similar setup in order to allow fabric containers to talk to database-chassis	2022-09-21 21:12:40 +00:00
mssonicbld	77b469d7c8	[ci/build]: Upgrade SONiC package versions (#12121 )	2022-09-20 21:24:25 +08:00
Oleksandr Ivantsiv	c9ba827773	[202205] [services] Update "WantedBy=" section for tacacs-config.timer. (#11893 ) (#12080 ) Manually cherry-picking #11893 - Why I did it The timer execution may fail if triggered during a config reload (when the sonic.target is stopped). This might happen in a rare situation if config reload is executed after reboot in a small time slot (for 0 to 30 seconds) before the tacacs-config timer is triggered: systemctl status tacacs-config.timer tacacs-config.timer - Delays tacacs apply until SONiC has started Loaded: loaded (/lib/systemd/system/tacacs-config.timer; enabled-runtime; vendor preset: enabled) Active: failed (Result: resources) since Mon 2022-08-29 15:53:03 IDT; 1min 28s ago Trigger: n/a Triggers: tacacs-config.service Aug 29 15:47:53 r-boxer-sw01 systemd[1]: Started Delays tacacs apply until SONiC has started. Aug 29 15:53:03 r-boxer-sw01 systemd[1]: tacacs-config.timer: Failed to queue unit startup job: Transaction for tacacs-config.service/start is destructive (mgmt-framework.timer has 's> Aug 29 15:53:03 r-boxer-sw01 systemd[1]: tacacs-config.timer: Failed with result 'resources'. - How I did it To ensure that timer execution will be resumed after a config reload the WantedBy section of the systemd service is updated to describe relation to sonic.target. - How to verify it Reboot the system After reboot monitor tacacs-config.timer status. 30 seconds before timer activation run "config reload -y" command. Check system status. Signed-off-by: Oleksandr Ivantsiv <oivantsiv@nvidia.com>	2022-09-19 09:20:10 +03:00
mssonicbld	f361c029c5	[ci/build]: Upgrade SONiC package versions (#11980 )	2022-09-19 12:31:16 +08:00
Aryeh Feigin	b8c6e2a45d	Use warm-boot infrastructure for fast-boot (#12026 )	2022-09-14 21:23:34 +03:00
Saikrishna Arcot	f1243bad1b	Pin version of bazelisk to v1.13.0 (#12027 ) * Pin version of bazelisk to v1.13.0 This tries to avoid builds failures due to the latest version of bazelisk changing and causing hash mismatches. Signed-off-by: Saikrishna Arcot <sarcot@microsoft.com>	2022-09-08 21:15:35 -07:00
Ying Xie	ee40402ab7	Revert "[build] Fix version of bazelist which is lost acccidently (#12012 )" This reverts commit `36c5787daf`.	2022-09-09 04:14:59 +00:00
Liu Shilong	36c5787daf	[build] Fix version of bazelist which is lost acccidently (#12012 ) Why I did it The bazelisk package with hash value 1227b24db77557d552701f6add122edc was deleted from the GitHub release. The reproducible build cached only the hash value; the package file itself was not cached, because the two are in different pipelines. Using the latest package hash instead.	2022-09-09 07:24:44 +08:00
Ze Gan	0a54c46a0d	[docker-macsec]: Add dependencies of MACsec (#11770 ) Why I did it If the SWSS service is restarted, the MACsec service should also be restarted. Otherwise the data in wpa_supplicant and orchagent will not be consistent. How I did it Add dependency in docker-macsec.mk. How to verify it Manually check by 'sudo service swss restart'. The MACsec container should be started after swss; the syslog will look like Sep 8 14:36:29.562953 sonic INFO swss.sh[9661]: Starting existing swss container with HWSKU Force10-S6000 Sep 8 14:36:30.024399 sonic DEBUG container: container_start: BEGIN ... Sep 8 14:36:33.391706 sonic INFO systemd[1]: Starting macsec container... Sep 8 14:36:33.392925 sonic INFO systemd[1]: Starting Management Framework container... Signed-off-by: Ze Gan <ganze718@gmail.com>	2022-09-08 15:50:06 +00:00
Ying Xie	b4bf4aca3f	[mux] skip mux operations during warm shutdown (#11937 ) * [mux] skip mux operations during warm shutdown - Enhance write_standby.py script to skip actions during warm shutdown. - Expand the support to BGP service. - MuX support was added by a previous PR. - don't skip action during warm recovery Signed-off-by: Ying Xie <ying.xie@microsoft.com>	2022-09-08 15:48:56 +00:00
Lawrence Lee	12e6b89d80	[arp_update]: Set failed IPv6 neighbors to incomplete (#11919 ) After pinging any failed IPv6 neighbor entries, set the remaining failed/incomplete entries to a permanent INCOMPLETE state. This manual setting to INCOMPLETE prevents these entries from automatically transitioning to FAILED state, and since they are now incomplete, any subsequent NA messages for these neighbors are able to resolve the entry in the cache. Signed-off-by: Lawrence Lee <lawlee@microsoft.com>	2022-09-08 15:48:05 +00:00
Stepan Blyshchak	8431d3ab36	[docker-wait-any] immediately start to wait (#11595 ) It could happen that a container has already crashed but docker-wait-any will wait forever till it starts. It should, however, immediately exit to make the service restart. #### Why I did it It is observed in some circumstances that the auto-restart mechanism does not work. Specifically for ```swss.service```, ```orchagent``` had crashed before ```docker-wait-any``` started in ```swss.sh```. This led ```docker-wait-any``` to wait forever for ```swss``` to be in ```"Running"``` state and it results in: ``` CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES 1abef1ecebff bcbca2b74df6 "/usr/local/bin/supe…" 22 hours ago Up 22 hours what-just-happened 3c924d405cd5 docker-lldp:latest "/usr/bin/docker-lld…" 22 hours ago Up 22 hours lldp eb2b12a98c13 docker-router-advertiser:latest "/usr/bin/docker-ini…" 22 hours ago Up 22 hours radv d6aac4a46974 docker-sonic-mgmt-framework:latest "/usr/local/bin/supe…" 22 hours ago Up 22 hours mgmt-framework d880fd07aab9 docker-platform-monitor:latest "/usr/bin/docker_ini…" 22 hours ago Up 22 hours pmon 75f9e22d4fdd docker-snmp:latest "/usr/local/bin/supe…" 22 hours ago Up 22 hours snmp 76d570a4bd1c docker-sonic-telemetry:latest "/usr/local/bin/supe…" 22 hours ago Up 22 hours telemetry ee49f50344b3 docker-syncd-mlnx:latest "/usr/local/bin/supe…" 22 hours ago Up 22 hours syncd 1f0b0bab3687 docker-teamd:latest "/usr/local/bin/supe…" 22 hours ago Up 22 hours teamd 917aeeaf9722 docker-orchagent:latest "/usr/bin/docker-ini…" 22 hours ago Exited (0) 22 hours ago swss 81a4d3e820e8 docker-fpm-frr:latest "/usr/bin/docker_ini…" 22 hours ago Up 22 hours bgp f6eee8be282c docker-database:latest "/usr/local/bin/dock…" 22 hours ago Up 22 hours database ``` The check for ```"Running"``` state is not needed because for cold boot case we do ```start_peer_and_dependent_services``` and for warm boot case the loop will retry to wait for container if this container is doing warm boot: `d01a91a569/files/image_config/misc/docker-wait-any (L56)` #### How I did it Removed the check for ```"Running"```. #### How to verify it Kill swss before ```docker-wait-any``` is reached and verify auto restart will restart swss service.	2022-09-08 15:47:27 +00:00
mssonicbld	dc987ebd2c	[ci/build]: Upgrade SONiC package versions (#11951 )	2022-09-05 14:42:32 +08:00
mssonicbld	613d3431d1	[ci/build]: Upgrade SONiC package versions (#11913 ) Upgrade SONiC Versions	2022-09-01 15:47:48 +08:00
abdosi	72852cdd02	Address Review Comment to define SONIC_GLOBAL_DB_CLI in gbsyncd.sh (#11857 ) As part of PR #11754, a change was added to use the variable SONIC_DB_NS_CLI for namespace, but that will not work since ./files/scripts/syncd_common.sh uses SONIC_DB_CLI. So revert to using SONIC_DB_CLI and define a new variable, SONIC_GLOBAL_DB_CLI, for global/host db cli access. Also fixed DB_CLI not working for namespace.	2022-09-01 00:12:56 +00:00
Longxiang Lyu	d7f049ebf0	[mux] Exit to write `standby` state to `active-active` ports (#11821 ) [mux] Exit to write standby state to `active-active` ports Signed-off-by: Longxiang Lyu <lolv@microsoft.com>	2022-09-01 00:11:09 +00:00
andywongarista	0adfd724e6	[202205][Arista] Add initial support for 720DT-48S (#10656 ) (#11860 ) Added initial set of config files to allow for booting and partial traffic testing in SONiC on the 720DT-48S. How to verify it - Switch boots - show interfaces status shows links up on interfaces Ethernet24-51 - Traffic flows with no errors on interfaces Ethernet24-51	2022-08-30 12:39:26 +08:00
Stepan Blyshchak	c60d78dd1f	[syncd.sh] 'sxdkernel start' => 'sxdkernel restart' (#11718 ) Change `sxdkernel start` to `sxdkernel restart`. If `syncd` service crashes in `ExecStartPre` systemd will not call `ExecStop` and thus will not call `sxdkernel stop`. Use of `sxdkernel restart` is more robust in terms of guarantees to restore the system after unexpected crashes. Signed-off-by: Stepan Blyschak <stepanb@nvidia.com> Signed-off-by: Stepan Blyschak <stepanb@nvidia.com>	2022-08-27 16:16:17 +00:00
anamehra	a2bed2ae4a	container_checker on supervisor should check containers based on asic presence (#11442 ) Why I did it On a supervisor card in a chassis, syncd/teamd/swss/lldp etc dockers are created for each Switch Fabric card. However, not all chassis would have all the switch fabric cards present. In this case, only dockers for Switch Fabrics present would be created. The monit 'container_checker' fails in this scenario as it is expecting dockers for all Switch Fabrics (based on NUM_ASIC defined in asic.conf file).	2022-08-26 20:50:24 +00:00
Saikrishna Arcot	91e9db005a	[202205]: Update package versions (#11801 ) This was done manually, to try to get past a build error due to changing package versions in Debian. Signed-off-by: Saikrishna Arcot <sarcot@microsoft.com> Signed-off-by: Saikrishna Arcot <sarcot@microsoft.com>	2022-08-21 15:23:44 -07:00
abdosi	0355caf20b	Added support to add gbsyncd in Feature Table of Host Config DB (#11754 ) Why I did: In case of multi-asic platforms gbsyncd is not getting added to Feature Table of Host Config DB. Without this, container_checker complains that unneeded gbsyncd containers are running. How I did: Update both Host and Namespace config db when gbsyncd docker is starting. How I verify: Verified on Multi-asic platforms.	2022-08-19 15:22:12 +00:00
Nikola Dancejic	f63dc738f9	[swss] Adding conditional for bgp when on multi ASIC platform (#11691 ) bgp should be a per-asic service, and runs for each namespace on multi-asic platforms. However, putting bgp in MULTI_INST_DEPENDENT causes swss to be restarted as well as bgp. This is causing issues after #11000. Issue: #11653. This fix: removes bgp from the dependents list; adds a conditional that adds either bgp or bgp@$DEV, to distinguish between single and multi-asic platforms.	2022-08-17 17:10:29 +00:00
Hua Liu	6a2c540cba	[swsscommon] Add c++ version sonic-db-cli from sonic-swss-common (#10825 ) (#11713 ) Cherry pick PR https://github.com/sonic-net/sonic-buildimage/pull/10825 to 202205 branch #### Why I did it Fix sonic-db-cli high CPU usage on SONiC startup issue: https://github.com/sonic-net/sonic-buildimage/issues/10218 ETA of this issue will be 2022/05/31 #### How I did it Re-write sonic-cli with c++ in sonic-swss-common: https://github.com/sonic-net/sonic-swss-common/pull/607 Modify swss-common rules and slave.mk to install c++ version sonic-db-cli. #### How to verify it Pass all E2E test scenario. #### Description for the changelog Build and install c++ version sonic-db-cli from swss-common.	2022-08-17 15:35:00 +08:00
mssonicbld	5c306cc2e5	[ci/build]: Upgrade SONiC package versions (#11679 )	2022-08-15 05:50:59 +00:00
Lawrence Lee	15c80b207c	[arp_update]: Resolve failed neighbors on dualtor (#11615 ) In arp_update, check for FAILED or INCOMPLETE kernel neighbor entries and manually ping them to try and resolve the neighbor Signed-off-by: Lawrence Lee <lawlee@microsoft.com>	2022-08-11 16:19:25 +00:00
Stepan Blyshchak	3201dc93f6	[swss.sh/syncd.sh] Trap only on EXIT (#11590 ) When using trap on SIGTERM the script will not react to the SIGTERM signal sent while a child is executing. I.e, the following script does not react on SIGTERM sent to it if it is waiting for sleep to finish: ``` trap "echo Handled SIGTERM" 0 2 3 15 echo "Before sleep" sleep inf echo "After sleep" ``` Instead, trap only on EXIT which covers also a scenario with exit on SIGINT, SIGTERM. Signed-off-by: Stepan Blyschak <stepanb@nvidia.com>	2022-08-11 16:18:00 +00:00
Ying Xie	094745f06f	[write_standby] update write_standby.py script (#11650 ) Why I did it The initial value has to be present for the state machines to work. In active-standby dual-tor scenario, or any hardware mux scenario, the value will be updated eventually with a delay. However, in active-active dual-tor scenario, there is no other mechanism to initialize the value and get state machines started. So this script will have to write something at start up time. For active-active dualtor, 'active' is the preferred initial value; the state machine will switch the state to standby soon if the link prober finds the link is not in a good state. How I did it Update the script to always provide initial values. How to verify it Tested on active-active dual-tor testbed. Signed-off-by: Ying Xie <ying.xie@microsoft.com>	2022-08-09 23:02:09 +00:00
Sudharsan Dhamal Gopalarathnam	871a1c51d8	[vs]Preventing ebtables cfg to be applied on vs (#11585 ) * Prevent ebtables rules from being applied on the KVM image. The ebtables rules in SONiC are added so that ARP as well as L2 forwarding is blocked in the Linux kernel, since the hardware will take care of the actual L2 forwarding. However, this is not the case with KVM, where Linux needs to forward even L2 packets.	2022-08-08 20:45:28 +00:00
bingwang-ms	fda1290926	Support different `DSCP_TO_TC_MAP` for T1 in dualtor deployment (#11569 ) * Support different DSCP_TO_TC_MAP for T1 in dualtor deployment	2022-08-08 20:44:32 +00:00
Stepan Blyshchak	29d29b9491	[swss.sh] clear counters cache folder on swss cold/fast reload (#11244 ) A change in sonic-utilities makes all cache files be saved into a /tmp/cache. On swss restart this cache has to be removed in case swss starts in cold or fast mode. A related cache restoration in the warmboot finalizer script is also updated to use new location. - Why I did it To fix #9817. Clear the cache directory on swss.sh except for warm start. Also, adapted finalize-warmboot script to take the cache directory. - How I did it A change in sonic-utilities makes all cache files be saved into a /tmp/cache. On swss restart this cache has to be removed in case swss starts in cold or fast mode. A related cache restoration in the warmboot finalizer script is also updated to use new location. - How to verify it Run together with Azure/sonic-utilities#2232. Verify counters cache is removed on config reload, cold/fast reboots, swss restart. Signed-off-by: Stepan Blyschak <stepanb@nvidia.com>	2022-08-08 20:42:54 +00:00
Nikola Dancejic	32fb4c7772	[swss] Adding bgp container as dependent of swss (#11000 ) What I did: Added bgp as a dependent of swss Why I did it: bgp container was not restarting on swss crash. When swss crashes, linkmgrd doesn't initiate a switchover because it cannot access the default route from orchagent. Bringing down bgp with swss will isolate the ToR, causing linkmgrd to initiate a switchover to the peer ToR, avoiding significant packet loss. How I did it: Added bgp to DEPENDENT Signed-off-by: Nikola Dancejic <ndancejic@microsoft.com>	2022-08-08 20:40:35 +00:00
mssonicbld	f30e85358e	[ci/build]: Upgrade SONiC package versions (#11438 ) Upgrade SONiC Versions	2022-08-07 11:29:11 +08:00
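
The docker-wait-any workaround in commit ac19e2a8ba (the worker thread returns once the main thread is expected to exit) can be illustrated with a minimal Python sketch. This is not the actual script: the names `exit_requested`, `watch_container`, `wait_for_exit` and `is_warm_restart` are hypothetical, and the Docker wait call and the warm-restart check are replaced by injected callables so the sketch runs without Docker or swsscommon.

```python
import threading

# Hypothetical flag: set once the main thread has been told the process must exit.
exit_requested = threading.Event()


def watch_container(name, wait_for_exit, is_warm_restart):
    """Watch one container and report why the watcher stopped.

    wait_for_exit and is_warm_restart are injected callables (assumptions),
    standing in for the Docker wait call and the warm-restart check.
    """
    while True:
        wait_for_exit(name)  # blocks until the container stops
        if exit_requested.is_set():
            # Another watcher already asked the main thread to exit:
            # return immediately instead of calling into any further code.
            return "exit-already-requested"
        if is_warm_restart(name):
            continue  # container is restarting on purpose; keep watching
        exit_requested.set()  # tell the main thread to exit
        return "requested-exit"


# Usage: teamd exits first, then syncd exits while shutdown is in progress.
checked = []


def fake_wait(name):
    pass  # pretend the container exited immediately


def fake_warm_check(name):
    checked.append(name)  # record which containers reached the warm-restart check
    return False


first = watch_container("teamd", fake_wait, fake_warm_check)
second = watch_container("syncd", fake_wait, fake_warm_check)
```

In this sketch the second watcher returns before reaching the warm-restart check, which mirrors the intent described in the commit message: avoid entering the problematic code path once an exit is already under way. As the commit message notes, the real workaround only reduces the window for the crash and does not eliminate it.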


1054 Commits