sonic-buildimage

Archived

Author	SHA1	Message	Date
Feng-msft	46c0d073a5	Update golang version for telemetry build in sonic-slave-buster to fix (#14636 ) Update golang version for telemetry build in sonic-slave-jessie to fix CVE-2021-33195, this PR will be merged into 201911 branch finally. #### Why I did it Go before 1.15.13 and 1.16.x before 1.16.5 has functions for DNS lookups that do not validate replies from DNS servers, and thus a return value may contain an unsafe injection (e.g., XSS) that does not conform to the RFC1035 format. Now in 201911 and 202012 branch we're using 1.14.2 ##### Work item tracking - Microsoft ADO (number only):17727291 #### How I did it Bump golang version into 1.15.15 which contains corresponding fix. #### How to verify it unit test to do sanity check.	2023-04-16 23:44:11 -07:00
mssonicbld	7931abd527	[submodule] Update submodule sonic-host-services to the latest HEAD automatically (#14670 )	2023-04-16 15:04:19 +08:00
mssonicbld	49dbaeb649	[ci/build]: Upgrade SONiC package versions (#14672 )	2023-04-15 18:21:50 +08:00
mssonicbld	98ed13b978	[submodule] Update submodule sonic-swss to the latest HEAD automatically (#14648 )	2023-04-15 15:04:14 +08:00
Saikrishna Arcot	070a64af89	Fix backend port channels and routes being displayed (#14479 ) * Fix backend port channels and routes being displayed In `show interface portchannel` and `show ip route`, backend port channels and routes were being displayed. This is due to changes in #13660. Fix these issues by switching to reading from PORTCHANNEL_MEMBERS table instead. Fixes #14459. * Replace table name with constant Signed-off-by: Saikrishna Arcot <sarcot@microsoft.com>	2023-04-14 19:54:02 -07:00
mssonicbld	d014b03849	[submodule] Update submodule sonic-utilities to the latest HEAD automatically (#14649 )	2023-04-14 15:10:53 +08:00
Ravi [Marvell]	fa48caf39d	Add debug shell packages for Marvell Innovium platforms (#11845 ) - Why I did it Package Marvell/Innovium CLI shell. - How I did it Include shell packages. - How to verify it Platform specific shell commands. Signed-off-by: rck-innovium rck@innovium.com	2023-04-13 22:04:36 +03:00
Dror Prital	d8f75c932b	[submodule] Advance sonic-platform-daemons pointer (#14595 ) Update sonic-platform-daemons submodule pointer to include the following: * d1203ef Update xcvrd to use new STATE_DB FAST_REBOOT entry ([#335](https://github.com/sonic-net/sonic-platform-daemons/pull/335)) Signed-off-by: dprital <drorp@nvidia.com>	2023-04-13 18:21:32 +03:00
mssonicbld	3e529cbab3	[submodule] Update submodule to the latest HEAD automatically	2023-04-13 20:51:28 +08:00
Vivek	397908aa59	[Mellanox] Facilitate automatic integration of new hw-mgmt (#14594 ) - Why I did it Facilitate Automatic integration of new hw-mgmt version into SONiC. Inputs to the Script: MLNX_HW_MANAGEMENT_VERSION Eg: 7.0040.5202 CREATE_BRANCH: (y|n) Creates a branch instead of a commit (optional, default: n) BRANCH_SONIC: Only relevant when CREATE_BRANCH is y. Default: master. Note: These should be provided through SONIC_OVERRIDE_BUILD_VARS parameter Output: Script creates a commit (in each of sonic-buildimage, sonic-linux-kernel) with all the changes required for upgrading the hw-management version to a version provided by MLNX_HW_MANAGEMENT_VERSION Brief Summary of the changes made: MLNX_HW_MANAGEMENT_VERSION flag in the hw-management.mk file hw-mgmt submodule is updated to the corresponding version Updates are made to non-upstream-patches/patches and series.patch file series, kconfig-inclusion and kconfig-exclusion files can be updated in the sonic-linux-kernel repo sonic-linux-kernel/patches folder is updated with the corresponding upstream patches Based on the inputs, there could be a branch seen in the local for each of the repo's.
Branch is named as <branch>_<parent_commit>_integrate_<hw_mgmt_version> - How I did it Added a new make target which can be invoked by calling make integrate-mlnx-hw-mgmt user@server:/sonic-buildimage$ git rev-parse --abbrev-ref HEAD master_23193446a_integrate_7.0020.5052 user@server:/sonic-buildimage$ git log --oneline -n 2 f66e01867 (HEAD -> master_23193446a_integrate_V.7.0020.5052, show) Intgerate HW-MGMT V.7.0020.5052 Changes `23193446a` (master_intg_hw_mgmt) Update logic user@server:/sonic-buildimage/src/sonic-linux-kernel$ git rev-parse --abbrev-ref HEAD master_6847319_integrate_7.0020.4104 user@server:/sonic-buildimage/src/sonic-linux-kernel$ git log --oneline -n 2 6094f71 (HEAD -> master_6847319_integrate_V.7.0020.5052) Intgerate HW-MGMT V.7.0020.5052 Changes 6847319 (origin/master, origin/HEAD) Read ID register for optoe1 to find pageable bit in optoe driver (#308) Changes made will be summarized under sonic-buildimage/integrate-mlnx-hw-mgmt_user.out file. Debugging and troubleshooting output is written to sonic-buildimage/integrate-mlnx-hw-mgmt.log files User output file & stdout file: log_files.tar.gz Limitations: Assumes the changes would only work for amd64 Assumes the non-upstream patches in mellanox only belong to hw-mgmt - How to verify it Build the Kernel Signed-off-by: Vivek Reddy Karri <vkarri@nvidia.com>	2023-04-13 14:18:09 +03:00
Sudharsan Dhamal Gopalarathnam	2804998766	[config reload]Config Reload Enhancement (#13969 ) #### Why I did it Implementing code changes for https://github.com/sonic-net/SONiC/pull/1203 #### How I did it Removed the timers and delayed target since the delayed services would start based on event driven approach. Cleared port table during config reload and cold reboot scenario. Modified yang model, init_cfg.json to change has_timer to delayed #### How to verify it Running regression	2023-04-12 11:20:03 -07:00
Christian Svensson	97c29a45bd	[build] Do not ignore well-known debian files (#14565 ) Includes the common debian files that we always want to include. This mitigates but does not fully solve #7683 as it could be more files that are ignored by this default rule. Signed-off-by: Christian Svensson <blue@cmd.nu>	2023-04-12 09:10:22 -07:00
mssonicbld	f9eb849d75	[ci/build]: Upgrade SONiC package versions (#14620 )	2023-04-12 20:05:30 +08:00
anamehra	f34360f101	chassis-packet: resolve the missing static routes (#14593 ) Why I did it Fixes #14179 chassis-packet: missing arp entries for static routes causing high orchagent cpu usage It is observed that some sonic-mgmt test cases call sonic-clear arp, which clears the static arp entries as well. Orchagent or arp_update process does not try to resolve the missing arp entries after clear. How I did it arp_update should resolve the missing arp/ndp static route entries. Added code to check for missing entries and try ping if any found to resolve it. How to verify it After boot or config reload, check ipv4 and ipv6 neigh entries to make sure all static route entries are present manual validation: Use sonic-clear arp and sonic-clear ndp to clear all neighbor entries run arp_update Check for neigh entries. All entries should be present. Testing on T0 setup route/for test_static_route.py The test sets the STATIC_ROUTE entry in config db without ifname: sonic-db-cli CONFIG_DB hmset 'STATIC_ROUTE|2.2.2.0/24' nexthop 192.168.0.18,192.168.0.25,192.168.0.23 "STATIC_ROUTE": { "2.2.2.0/24": { "nexthop": "192.168.0.18,192.168.0.25,192.168.0.23" } }, Validate that the arp_update gets the proper ARP_UPDATE_VARDS using arp_update_vars.j2 template from config db and does not crash: { "switch_type": "", "interface": "", "pc_interface" : "PortChannel101 PortChannel102 PortChannel103 PortChannel104 ", "vlan_sub_interface": "", "vlan" : "Vlan1000", "static_route_nexthops": "192.168.0.18 192.168.0.25 192.168.0.23 ", "static_route_ifnames": "" } validate route/test_static_route.py testcase pass.	2023-04-12 15:07:42 +08:00
xumia	f1fd42558a	Support to add SONiC OS Version in device info (#14601 ) Why I did it Support to add SONiC OS Version in device info. It will be used to display the version info in the SONiC command "show version". The version is used to do the FIPS certification. We do not do the FIPS certification on a specific release, but on the SONiC OS Version. SONiC Software Version: SONiC.master-13812.218661-7d94c0c28 SONiC OS Version: 11 Distribution: Debian 11.6 Kernel: 5.10.0-18-2-amd64 How I did it	2023-04-12 09:20:08 +08:00
Qi Luo	38f0ec6563	Update pull request template for test evidence, and work item trackers (#14552 ) Update pull request template for test evidence, and work item trackers	2023-04-11 15:16:35 -07:00
xumia	ad162ae0e8	[Build] Optimize the version control for Debian packages (#14557 ) Why I did it Optimize the version control for Debian packages. Fix sonic-slave-buster/sources.list.amd64 not found display issue, need to generate the file before running the shell command to evaluate the sonic image tag. When using the snapshot mirror, it is not necessary to update the version file based on the base image. It will reduce the version dependency issue, when an image is not run when freezing the version. How I did it Not to update the version file when snapshot mirror enabled. How to verify it	2023-04-11 17:07:26 +08:00
Konstantin Vasin	d7d6445abf	[Build] disable DOCKER_BUILDKIT explicitly (#14405 ) Why I did it Fix #14081 By default DOCKER_BUILDKIT is enabled after docker version 23.0.0 So we need to disable it explicitly if SONIC_USE_DOCKER_BUILDKIT is not set. Otherwise it will produce larger installable images. How I did it set DOCKER_BUILDKIT=0 in slave.mk How to verify it	2023-04-11 08:06:07 +00:00
Liu Shilong	3d32008e49	[build] Fix reproducible build version issue when failed to download web file (#14587 ) Why I did it refine reproducible build. How I did it Fix reset map variable in bash. Ignore empty web file md5sum value. If web file didn't backup in azure storage, use file on web. How to verify i	2023-04-11 10:47:30 +08:00
Vivek	0df155b014	Made non-upstream patch design order aware (#14434 ) - Why I did it Currently, non upstream patches are applied only after upstream patches. Depends on sonic-net/sonic-linux-kernel#313. Can be merged in any order, preferably together - What I did it Non upstream Patches that reside in the sonic repo will not be saved in a tar file but rather in a folder pointed out by EXTERNAL_KERNEL_PATCH_LOC. This is to make changes to the non upstream patches easily traceable. The build variable name is also updated to INCLUDE_EXTERNAL_PATCHES Files/folders expected under EXTERNAL_KERNEL_PATCH_LOC EXTERNAL_KERNEL_PATCH_LOC/ ├──── patches/ ├── 0001-xxxxx.patch ├── 0001-yyyyyyyy.patch ├── ............. ├──── series.patch series.patch should contain a diff that is applied on the sonic-linux-kernel/patch/series file. The diff should include all the non-upstream patches. How to verify it Build the Kernel and verified if all the patches are applied properly Signed-off-by: Vivek Reddy Karri <vkarri@nvidia.com>	2023-04-10 19:48:27 +03:00
mssonicbld	4e5c8988b1	[ci/build]: Upgrade SONiC package versions (#14586 )	2023-04-10 18:10:37 +08:00
mssonicbld	4ff784a489	[submodule] Update submodule to the latest HEAD automatically (#14585 )	2023-04-10 15:00:12 +08:00
xumia	09bd333b63	[Build] Fix the reproducible build variable display error in the slave container (#14543 ) Why I did it Enable the reproducible build for PR build for master branch Fix the reproducible build variable display error in the slave container. The below config is none, although the config is set and takes effect. "SONIC_VERSION_CONTROL_COMPONENTS": "none" How I did it Passing the variable through the slave container command line. The variable has been passed to the slave container and the other docker container by a config file, it is only used to display the value during the build. How to verify it See https://dev.azure.com/mssonic/build/_build/results?buildId=247960&view=logs&j=88ce9a53-729c-5fa9-7b6e-3d98f2488e3f&t=88f376cf-c35d-5783-0a48-9ad83a873284 "SONIC_VERSION_CONTROL_COMPONENTS": "deb,py2,py3,web,git,docker"	2023-04-10 14:56:30 +08:00
Konstantin Vasin	1bf50a5566	[Build] use snapshots of debian mirrors for sonic-slave containers #14400 Why I did it We don't use snapshots of debian mirrors for sonic-slave containers even if MIRROR_SNAPSHOT is enabled. How I did it Export MIRROR_SNAPSHOT in Makefile.work to generate sources.list for sonic-slave containers using debian snapshot mirror How to verify it	2023-04-10 09:15:10 +08:00
Aryeh Feigin	41a9813018	Finalize fast-reboot in warmboot finalizer (#14238 ) - Why I did it To solve an issue with upgrade with fast-reboot including FW upgrade which has been introduced since moving to fast-reboot over warm-reboot infrastructure. As well, this introduces fast-reboot finalizing logic to determine fast-reboot is done. - How I did it Added logic to finalize-warmboot script to handle fast-reboot as well, this makes sense as using fast-reboot over warm-reboot this script will be invoked. The script will clear fast-reboot entry from state-db instead of previous implementation that relied on timer. The timer could expire in some scenarios between fast-reboot finished causing fallback to cold-reboot and possible crashes. As well this PR updates all services/scripts reading fast-reboot state-db entry to look for the updated value representing fast-reboot is active. - How to verify it Run fast-reboot and check that fast-reboot entry exists in state-db right after startup and being cleared as warm-reboot is finalized and not due to a timer.	2023-04-09 16:59:15 +03:00
mssonicbld	e32624d362	[ci/build]: Upgrade SONiC package versions (#14571 )	2023-04-08 18:00:30 +08:00
mssonicbld	95fb9ee637	[submodule] Update submodule to the latest HEAD automatically (#14525 )	2023-04-08 17:05:31 +08:00
Stephen Sun	152148fb81	Enhance the error message output mechanism (#14384 ) #### Why I did it Enhance the error message output mechanism during swss docker creating #### How I did it Capture the output to stderr of `sonic-cfggen` and output it using `echo` to make sure the error message will be logged in syslog. #### How to verify it Manually test	2023-04-07 14:23:35 -07:00
Lior Avramov	71f2a6a3a9	Add teamd patches to solve traffic loss issue when removing port from LAG (#14002 ) #### Why I did it When removing a port from a LAG while traffic is running through the LAG, there is a traffic disruption of 60 seconds. Fix issue https://github.com/sonic-net/sonic-buildimage/issues/14381 #### How I did it The patch I added introduces a "port_removing" op and calls it right before the kernel is asked to remove the port. Implement the op in the LACP runner to disable the port, which leads to a proper LACPDU being sent. #### How to verify it Set LAG between 2 switches. Set LAGs to be router port and set ip address. In switch A send ping to ip address of LAG in switch B. In switch B, while ping is running remove port from LAG. Verify ping is not stopping.	2023-04-07 14:15:19 -07:00
Stephen Sun	3b5871f7f8	Fix issue: wrong teamd link watch state after warm reboot (#14084 ) #### Why I did it Fix issue: wrong teamd link watch state after warm reboot due to TEAM_ATTR_PORT_CHANGED lost The flag TEAM_ATTR_PORT_CHANGED is maintained by kernel team driver: - a flag "changed" is maintained in struct team_port - the flag is set by __team_port_change_send once relevant information is updated, including port linkup (together with speed, duplex), adding or removing - the flag is cleared by team_nl_fill_one_port_get once the updated information has been notified to user space via RTNL In the userspace, the change flag is maintained by libteam in struct team_port. The team daemon calls port_priv_change_handler_func on receiving port change event. The logic in port_priv_change_handler_func 1. creates the port if it did not exist, which triggers port add event and eventually calls lacp_port_added callback. 2. triggers port change event if team_port->changed is true, which eventually calls lw_ethtool_event_watch_port_changed to update port state for link watch ethtool. 3. removes the port if team_port->removed is set In lacp_port_added, it calls team_refresh to refresh ifinfo, port info, and option info from the kernel via RTNL. In this step, port_priv_change_handler_func is called recursively. - In the inner call, it won't get TEAM_ATTR_PORT_CHANGED flag because kernel has already notified that. - As a result, team_port->changed flag is cleared in the libteam. - The port change event won't be triggered from either inner or outer call of port_priv_change_handler_func. If the port has been up when the port is being added to the team device, the "port up" information is carried in the outer call but will be lost. In case the flag TEAM_ATTR_PORT_CHANGED is set only in the inner call, function port_priv_change_handler_func can be called in the inner call. However, it will fail to fetch "enable" options because option_list_init has not been called.
Signed-off-by: Stephen Sun <stephens@nvidia.com> #### How I did it Fix: Do not call check_call_change_handlers when parsing RTNL function is called from another check_call_change_handlers recursively. #### How to verify it - Manually test - Regression test - warm reboot - warm reboot sad lag - warm reboot sad lag member - warm reboot sad (partial)	2023-04-07 14:13:33 -07:00
Devesh Pathak	d74055e12c	Increase wait_for_tunnel() timeout to 90s (#14279 ) Why I did it Orchagent sometimes take additional time to execute Tunnel tasks. This cause write_standby script to error out and mux state machines are not initialized. It results in show mux status missing some ports in output. Mar 13 20:36:52.337051 m64-tor-0-yy41 INFO systemd[1]: Starting MUX Cable Container... Mar 13 20:37:52.480322 m64-tor-0-yy41 ERR write_standby: Timed out waiting for tunnel MuxTunnel0, mux state will not be written Mar 13 20:37:58.983412 m64-tor-0-yy41 NOTICE swss#orchagent: :- doTask: Tunnel(s) added to ASIC_DB. How I did it Increase timeout from 60s to 90s How to verify it Verified that mux state machine is initialized and show mux status has all needed ports in it.	2023-04-07 11:30:58 +08:00
xumia	6e43b5c515	[Build] Support to use the snapshot mirror for debian base image (#14474 ) Why I did it [Build] Support to use the snapshot mirror for debian base image How I did it If the MIRROR_SNAPSHOT=n, then use the default mirror http://deb.debian.org/debian If the MIRROR_SNAPSHOT=y, then use the snapshot mirror, for instance http://packages.trafficmanager.net/snapshot/debian/20230330T000330Z/. How to verify it + scripts/build_debian_base_system.sh amd64 bullseye ./fsroot-vs I: Target architecture can be executed I: Retrieving InRelease I: Checking Release signature I: Valid Release signature (key id A4285295FC7B1A81600062A9605C66F00D6C9793) I: Retrieving Packages I: Validating Packages I: Resolving dependencies of required packages... I: Resolving dependencies of base packages... I: Checking component main on http://packages.trafficmanager.net/snapshot/debian/20230331T000125Z... I: Retrieving libacl1 2.2.53-10	2023-04-07 11:05:51 +08:00
xumia	46cb2ad03d	[Ci] Fix the wrong SONIC_BUILD_JOBS build variable used issue in Azp (#14071 ) Why I did it [Ci] Fix the no parallel jobs in some of the platforms issue We observed some of the pipelines running longer than expected. The issue is that SONIC_BUILD_JOBS uses the wrong value, 1. It is caused by a runtime variable issue: an additional single quotation mark character is added in the make command line. make 'SONIC_BUILD_JOBS=$(nproc)' target/xxxx needs to change to make SONIC_BUILD_JOBS=$(nproc) target/xxxx This improves the build performance for the platforms that were using SONIC_BUILD_JOBS=1. Good one vs: https://dev.azure.com/mssonic/build/_build/results?buildId=227986&view=logs&j=cef3d8a9-152e-5193-620b-567dc18af272&t=cf595088-5c84-5cf1-9d7e-03331f31d795 "SONIC_BUILD_JOBS" : "8" Bad one barefoot: https://dev.azure.com/mssonic/build/_build/results?buildId=227379&view=logs&j=993d6e22-aeec-5c03-fa19-35ecba587dd9&t=7be0d2ec-661f-5569-462c-2d9b7ca4ca5d "SONIC_BUILD_JOBS" : "1" How I did it Expand the BUILD_OPTIONS variable for all platforms.	2023-04-07 09:35:02 +08:00
Ying Xie	737d0e57ad	[write standby] force DB connections to use unix socket to connect (#14524 ) Why I did it At service start up time, there are chances that the networking service is being restarted by interface-config service. When that happens, write_standby could fail to make DB connections due to loopback interface is being reconfigured. How I did it Force the db connector to use unix socket to avoid loopback reconfig timing window. How to verify it Run config reload test 20+ times and no issue encountered. Signed-off-by: Ying Xie <ying.xie@microsoft.com> * use unix socket instead Signed-off-by: Ying Xie <ying.xie@microsoft.com>	2023-04-06 13:54:56 -07:00
Kuanyu Chen	cffd87a627	Add monit_snmp file to monitor memory usage (#14464 ) #### Why I did it When the CPU is busy, sonic_ax_impl may not be fast enough to handle the notification messages sent from REDIS. Thus, the messages will keep stacking up in the memory space of sonic_ax_impl. If the condition continues, the memory usage will keep increasing. #### How I did it Add a monit file to check whether the SNMP container where sonic_ax_impl resides uses more than 4GB memory. If yes, restart the sonic_ax_impl process. #### How to verify it Run a lot of this command: `while true; do ret=$(redis-cli -n 0 set LLDP_ENTRY_TABLE:test1 test1); sleep 0.1; done;` And check that the memory used by sonic_ax_impl keeps increasing. After a period, make sure sonic_ax_impl is restarted when the memory usage reaches the 4GB threshold. And verify the memory usage of sonic_ax_impl drops down from 4GB.	2023-04-06 12:19:11 -07:00
shdasari	dd6659ae07	Modify common-auth-sonic to take care of case where no RADIUS servers are configured. (#14514 ) #### Why I did it Fixes #14277. Fixes the inconsistent fallback behaviour for RADIUS authentication when AAA authentication is configured as "radius, local". #### How I did it Modified common-auth-sonic.j2 template to make sure that when no RADIUS servers are configured (with AAA authentication login method set to radius, local), the system falls back to local authentication successfully. #### How to verify it 1. Configure authentication based on RADIUS and local. config aaa authentication login radius local 2. Configure an unreachable RADIUS server. config radius add 6.6.6.6 3. Try to login to switch with existing admin user credentials. This is successful. 4. Remove RADIUS server configuration. config radius delete 6.6.6.6 5. Try to login to switch with admin user credentials. This is successful.	2023-04-06 12:14:01 -07:00
mihirpat1	63cee3ff3c	[yang]: Modify yang model to handle subport in PORT table (#14519 ) Based on the port breakout HLD, we are now using subport instead of channel in the CONFIG_DB PORT table to handle port breakout. The yang schema needs to be modified accordingly to handle the corresponding change. The corresponding code changes have been merged through sonic-net/sonic-platform-daemons/pull/342 merged Signed-off-by: Mihir Patel <patelmi@microsoft.com>	2023-04-06 10:59:47 -07:00
arista-nwolfe	990993e3f4	[devices/arista]: Added recycle ports required for egress mirroring (#13967 ) Why I did it Support Egress Mirroring on supported Arista platforms How I did it Add necessary soc properties for egress mirroring recycle ports to be created Signed-off-by: Nathan Wolfe <nwolfe@arista.com>	2023-04-06 10:58:01 -07:00
kenneth-arista	8ddfaec34f	[devices/arista] Update asic_port_name in Arista LCs (#14234 ) Updated asic_port_names for all Arista LC SKUs to follow latest naming conventions to remove redundant ASICx suffix. For Arista-7800R3-48CQ2-C48, added the asic_port_name mapping.	2023-04-06 10:53:42 -07:00
Ye Jianquan	6c04ed987d	Revert "chassis-packet: resolve the missing static routes (#14230 )" (#14544 ) This reverts commit `a8f8ea3b50`.	2023-04-06 10:36:10 -07:00
snider-nokia	6f54251375	[armhf][Nokia-7215]Add SFP refactor support for Nokia-7215 platform (#14396 )	2023-04-06 08:04:45 -07:00
xumia	9b769244d5	[Build] Fix the SLAVE_DRI not defined issue in the slave container issue (#14297 ) Why I did it It is to fix the issue #13773 It only has impact on the build triggered manually inside of the slave container. Developers can go to the slave container do a build, it will print a skippable error message complaining the variable not found. How I did it Add the default value for variable SLAVE_DRI. How to verify it	2023-04-06 16:42:59 +08:00
Hua Liu	e17e4fc4c0	[S6100] Improve S6100 serial-getty monitor, wait and re-check when getty not running to avoid false alert. (#14402 ) [S6100] Improve S6100 serial-getty monitor, wait and re-check when getty not running to avoid false alert. #### Why I did it On S6100, the serial-getty service some time can't auto-restart by systemd. So there is a monit unit to check serial-getty service status and restart it. However, this monit will report false alert, because in most case when serial-getty not running, systemd can restart it successfully. To avoid the false alert, improve the monitor to wait and re-check. Steps to reproduce this issue: 1. User login to device via console, and keep the connection. 2. User login to device via SSH, check the serial-getty@ttyS1.service service, it's running. 3. Run 'monit reload' from SSH connection. 4. Check syslog 1 minutes later, there will be false alert: ' 'serial-getty' process is not running' #### How I did it Add check-getty.sh script to recheck again later when getty service not running. And update monit unit to check serial-getty service status with this script to avoid false alert. #### How to verify it Pass all UT. Manually check fixed code work correctly: ``` admin@*:~$ sudo systemctl stop serial-getty@ttyS1.service admin@:~$ sudo /usr/local/bin/check-getty.sh admin@:~$ echo $? 1 admin@:~$ sudo systemctl status serial-getty@ttyS1.service ● serial-getty@ttyS1.service - Serial Getty on ttyS1 Loaded: loaded (/lib/systemd/system/serial-getty@.service; enabled-runtime; vendor preset: enabled) Active: inactive (dead) since Tue 2023-03-28 07:15:21 UTC; 1min 13s ago admin@:~$ sudo /usr/local/bin/check-getty.sh admin@:~$ echo $? 
0 admin@:~$ sudo systemctl status serial-getty@ttyS1.service ● serial-getty@ttyS1.service - Serial Getty on ttyS1 Loaded: loaded (/lib/systemd/system/serial-getty@.service; enabled-runtime; vendor preset: enabled) ``` syslog: ``` Mar 28 07:10:37.597458 * INFO systemd[1]: serial-getty@ttyS1.service: Succeeded. Mar 28 07:12:43.010550 * ERR monit[593]: 'serial-getty' status failed (1) -- no output Mar 28 07:12:43.010744 * INFO monit[593]: 'serial-getty' trying to restart Mar 28 07:12:43.010846 * INFO monit[593]: 'serial-getty' stop: '/bin/systemctl stop serial-getty@ttyS1.service' Mar 28 07:12:43.132172 * INFO monit[593]: 'serial-getty' start: '/bin/systemctl start serial-getty@ttyS1.service' Mar 28 07:13:43.286276 *** INFO monit[593]: 'serial-getty' status succeeded (0) -- no output ``` #### Description for the changelog [S6100] Improve S6100 serial-getty monitor. #### Ensure to add label/tag for the feature raised. example - PR#2174 under sonic-utilities repo. where, Generic Config and Update feature has been labelled as GCU.	2023-04-05 21:34:31 -07:00
mssonicbld	41c46aedf6	[ci/build]: Upgrade SONiC package versions (#14528 )	2023-04-05 18:36:57 +08:00
Ying Xie	d3f3ac6411	Delay mux/sflow/snmp timer after interface-config service (#14506 ) Why I did it All these 3 services started after swss service, which used to start after interface-config service. But #13084 removed the time constraints for swss. After that, these 3 services have a chance of starting earlier, while the interface-config service is restarting the networking service, which could cause db connect requests to fail. How I did it Delay mux/sflow/snmp timer after the interface-config service. How to verify it PR test. Config reload can repro the issue in 1-3 retries. With this change, config reload ran 30+ iterations without hitting the issue. Signed-off-by: Ying Xie <ying.xie@microsoft.com>	2023-04-04 16:23:00 -07:00
Santhosh Kumar T	c4435e833b	[DellEMC] S6100 - Adding logger to fetch SSD FW Upgrade status (#14247 ) Adding logger to fetch SSD FW Upgrade status	2023-04-04 10:19:47 -07:00
mssonicbld	8fc8578c4d	[submodule] Update submodule to the latest HEAD automatically (#14491 )	2023-04-04 14:55:27 +08:00
Christian Svensson	bce824723c	[sflow] Switch to bullseye (#14494 ) Change references to use bullseye instead of buster Why I did it Almost all daemons in 202211 and master uses bullseye, and sflow was easy to migrate. How I did it Replaced the references, built and tested in 202211. How to verify it Build with the changes, enable sflow: admin@sonic:~$ sudo config sflow collector add test 1.2.3.4 admin@sonic:~$ sudo config sflow collector enable tcpdump on 1.2.3.4 and see that UDP sFlow are being sent. Signed-off-by: Christian Svensson <blue@cmd.nu>	2023-04-03 09:49:35 -07:00
mssonicbld	884dfa5427	[ci/build]: Upgrade SONiC package versions (#14498 )	2023-04-03 18:34:35 +08:00
Christian Svensson	67abcff944	[nat] Switch to bullseye (#14495 ) Change references to use bullseye instead of buster Why I did it Almost all daemons in 202211 and master uses bullseye, and NAT seems easy to migrate. How I did it Replaced the references, built with 202211 branch. How to verify it Not sure, it builds and tests pass as far as I can tell but I don't use the feature myself. Signed-off-by: Christian Svensson <blue@cmd.nu>	2023-04-02 14:02:33 -07:00
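
Commit 46c0d073a5 in the log above states the affected range for CVE-2021-33195 as "Go before 1.15.13 and 1.16.x before 1.16.5". As a rough illustration only, that range can be written as a small version check. The function names below are hypothetical and are not part of sonic-buildimage; the sketch assumes plain dotted numeric versions such as `1.14.2`.

```python
from typing import Tuple


def parse_go_version(version: str) -> Tuple[int, ...]:
    """Turn a dotted version such as '1.15.15' into (1, 15, 15)."""
    return tuple(int(part) for part in version.split("."))


def is_affected_by_cve_2021_33195(version: str) -> bool:
    """Hypothetical helper: True if the version falls in the range quoted
    in the commit message (before 1.15.13, or 1.16.x before 1.16.5)."""
    parsed = parse_go_version(version)
    if parsed[:2] == (1, 16):
        return parsed < (1, 16, 5)
    return parsed < (1, 15, 13)


if __name__ == "__main__":
    # 1.14.2 is the version the commit says the older branches used;
    # 1.15.15 is the version the commit bumps to.
    for candidate in ("1.14.2", "1.15.15"):
        print(candidate, is_affected_by_cve_2021_33195(candidate))
```

Under these assumptions the old toolchain version is reported as affected and the bumped one is not, which matches the reasoning given in the commit message.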

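Commit e17e4fc4c0 in the log above describes avoiding false alerts by waiting and re-checking when serial-getty is found not running, with `check-getty.sh` exiting 0 when the service is up and 1 when it is not. The actual script is not shown on this page, so the following is only a sketch of the wait-and-re-check idea. The function name, the injected `is_running` callback, and the delay value are all assumptions made for illustration.

```python
import time
from typing import Callable


def check_with_recheck(
    is_running: Callable[[], bool],
    wait_seconds: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Hypothetical sketch: return 0 if the service is running, either
    immediately or after one delayed re-check; return 1 otherwise."""
    if is_running():
        return 0
    # Give the service manager time to restart the service on its own
    # before reporting a failure (the delay here is an assumed value).
    sleep(wait_seconds)
    return 0 if is_running() else 1


if __name__ == "__main__":
    # Simulate a service that is down on the first probe and back up on
    # the second, as when systemd restarts it in between.
    states = iter([False, True])
    print(check_with_recheck(lambda: next(states), wait_seconds=0.0))
```

A monitor that only alerts on a non-zero result from such a check would stay quiet when the service recovers within the wait window, which is the behavior the commit message aims for.
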

7342 Commits