sonic-buildimage

Author	SHA1	Message	Date
Vaibhav Hemant Dixit	2969d84e58	Revert "Revert "Fix for fast/cold-boot: call db_migrator only after old config is loaded (#14933 )" (#15464 )" (#15684 ) This reverts commit `9649a44470`.	2023-08-15 04:32:38 +08:00
Yevhen Fastiuk	4602d30a73	[syslog] Add remote syslog configuration (cherry-pick to 202305) (#15897 ) cherry-pick: #14513 depends: https://github.com/sonic-net/sonic-utilities/pull/2939 * Add an ability to configure remote syslog servers * Add an initial configuration for remote syslog * Extend YANG module and add unit tests #### Why I did it Adding the following functionality to rsyslog feature: * Configure remote syslog servers: protocol, filter, severity level * Update global syslog configuration: severity level, message format #### How I did it added parameters to syslog server and global configuration. #### How to verify it create syslog server using CLI/adding to Redis-DB verify server is added to file /etc/rsyslog.conf and server is functional. #### Description for the changelog extend rsyslog capabilities, added server and global configuration parameters. #### Link to config_db schema for YANG module changes [sonic-syslog.yang](https://github.com/sonic-net/sonic-buildimage/blob/master/src/sonic-yang-models/yang-models/sonic-syslog.yang)	2023-08-14 13:12:33 -07:00
mssonicbld	ec73d0f3ff	[chassis]: removed dependency for bgp and swss for chassis supervisor (#15734 ) (#16135 ) Fixes #15667 and #13293 Work item tracking Microsoft ADO 24472854: How I did it On chassis supervisor bgp feature is disabled in hostcfgd. The dependency between swss and bgp causes the bgp containers to start even though the feature is disabled. How to verify it Tests on chassis supervisor and LC Co-authored-by: Arvindsrinivasan Lakshmi Narasimhan <55814491+arlakshm@users.noreply.github.com>	2023-08-14 22:39:24 +08:00
Longxiang Lyu	6e49fa5fd2	[monit][dualtor] Periodically check mux neighbors consistency (#15769 ) Signed-off-by: Longxiang Lyu <lolv@microsoft.com>	2023-08-08 18:33:29 +08:00
mssonicbld	4ca01a7715	[syncd.sh] Clear semaphore before updating firmware (#15818 ) (#16067 )	2023-08-07 18:20:15 +08:00
vmittal-msft	5ee18ece65	Update WRED profile on system ports (#15612 ) * Update WRED profile on system ports	2023-08-07 14:33:42 +08:00
mssonicbld	33a10b479a	[nvidia] make sure shared storage with syncd is cleared on restarts (#14547 ) (#16046 ) Why I did it Sharing the storage of syncd with other proprietary application extensions allows them to communicate with syncd in different ways. If one container wants to pass some information to syncd then shared storage can be used. However, today the shared storage isn't cleaned on restarts making it possible for syncd to read out-of-date information generated in the past. NOTE: No plans to use it for standard SONIC dockers and we are working on removing the SDK dependency from PMON docker How I did it Implemented new service to clean the shared storage. How to verify it Do reboot/fast-reboot/warm-reboot/config-reload/systemctl restart swss and verify /tmp/ is cleaned after each restart in syncd container. Signed-off-by: Stepan Blyschak <stepanb@nvidia.com> Co-authored-by: Stepan Blyshchak <38952541+stepanblyschak@users.noreply.github.com>	2023-08-07 09:27:43 +08:00
Junchao-Mellanox	bf37c3162c	Fix issue: set delayed attribute to true for platform monitor service (#15816 ) There is a redundant line in init_cfg.json.j2. It would cause pmon service always has "delayed=False". However, we know that PMON has a timer now. So, I try to fix it here.	2023-08-07 00:34:12 +08:00
mssonicbld	6004054711	[arp_update]: Fix IPv6 neighbor race condition (#15583 ) (#15877 )	2023-07-19 20:06:12 +08:00
lixiaoyuner	c59f55f6a3	Move k8s script to docker-config-engine (#14788 ) (#15768 ) Why I did it To reduce the container's dependency from host system Work item tracking Microsoft ADO (number only): 17713469 How I did it Move the k8s container startup script to config engine container, other than mount it from host. How to verify it Check file path(/usr/share/sonic/scripts/container_startup.py) inside config engine container. Signed-off-by: Yun Li <yunli1@microsoft.com> Co-authored-by: Qi Luo <qiluo-msft@users.noreply.github.com>	2023-07-17 23:21:01 +08:00
mssonicbld	0b1f834e22	update rsyslog log size conf (#15821 ) (#15837 )	2023-07-14 20:34:22 +08:00
mssonicbld	bb3eff6ab4	Revert "Fix for fast/cold-boot: call db_migrator only after old config is loaded (#14933 )" (#15464 ) (#15618 )	2023-06-29 22:35:47 +08:00
Stepan Blyshchak	e2e5b77f16	[mlnx-ffb.sh] Update issu-version location (#14925 ) #### Why I did it ISSU version check fails due to inability to mount squashfs from 202211 on 201911 #### How I did it Put ISSU version file under platform directory #### How to verify it Warm-upgrade matrix: - 201911 (with https://github.com/sonic-net/sonic-buildimage/pull/14928) to master - 201911 (with https://github.com/sonic-net/sonic-buildimage/pull/14928) to 202211 - 202012 (with https://github.com/sonic-net/sonic-buildimage/pull/14927) to master - 202205 (with this change cherry-picked) to master	2023-06-15 15:14:52 -07:00
Saikrishna Arcot	f84dfd2345	Re-add 127.0.0.1/8 when bringing down the interfaces (#15080 ) * Re-add 127.0.0.1/8 when bringing down the interfaces With #5353, 127.0.0.1/16 was added to the lo interface, and then 127.0.0.1/8 was removed. However, when bringing down the lo interface, like during a config reload, 127.0.0.1/16 gets removed, but 127.0.0.1/8 isn't added back to the interface. This means that there's a period of time where 127.0.0.1 is not available at all, and services that need to connect to 127.0.0.1 (such as for redis DB) will fail. To fix this, when going down, add 127.0.0.1/8. Add this address before the existing configuration gets removed, so that 127.0.0.1 is available at all times. Note that running `ifdown lo` doesn't actually bring down the loopback interface; the interface always stays "physically" up. Signed-off-by: Saikrishna Arcot <sarcot@microsoft.com>	2023-06-13 18:45:39 -07:00
Hua Liu	05f1a5a31e	Add watchdog mechanism to swss service and generate alert when swss have issue. (#15429 ) Add watchdog mechanism to swss service and generate alert when swss have issue. Work item tracking Microsoft ADO (number only): 16578912 What I did Add orchagent watchdog to monitor and alert orchagent stuck issue. Why I did it Currently SONiC monit system only monit orchagent process exist or not. If orchagent process stuck and stop processing, current monit can't find and report it. How I verified it Pass all UT. Manually test process_monitoring/test_critical_process_monitoring.py can pass. Add new UT https://github.com/sonic-net/sonic-mgmt/pull/8306 to check watchdog works correctly. Manually test, after pause orchagent with 'kill -STOP <pid>', check there are warning message exist in log: Apr 28 23:36:41.504923 vlab-01 ERR swss#supervisor-proc-watchdog-listener: Process 'orchagent' is stuck in namespace 'host' (1.0 minutes). Details if related Heartbeat message PR: https://github.com/sonic-net/sonic-swss/pull/2737 UT PR: https://github.com/sonic-net/sonic-mgmt/pull/8306	2023-06-12 17:53:54 -07:00
Alpesh Patel	633fff8c10	enable ethernet backplane port support in port config for packet mode T2 devices (#14533 ) For T2 systems using packet mode, the backplane interfaces (Ethernet-BP#) and the fabric card ethernet interfaces are not visible as neighbor interfaces. In packet mode, these interfaces needs qos and buffer config as well. This fix addresses that issue and adds the backplane interfaces to the PORTS_ACTIVE list	2023-06-12 14:02:22 -07:00
mssonicbld	cb9d9e57a6	[ci/build]: Upgrade SONiC package versions (#15431 ) Upgrade SONiC Versions	2023-06-12 22:27:29 +08:00
mssonicbld	a45595158b	[ci/build]: Upgrade SONiC package versions (#15345 )	2023-06-10 20:38:13 +08:00
Liping Xu	78c41a1e58	allow docker_inram to kernel cmd list (#15374 ) Why I did it After docker_inram is enabled, the docker folder's default max size is 1.5G. It's not big enough for some tests which need to install additional docker images or install extra packages. Work item tracking Microsoft ADO 24199761: How I did it add docker_inram into cmdline_allowlist How to verify it sudo sh -c 'echo "docker_inram_size=3000M" >> kernel-cmdline-append' sudo reboot and check the docker folder size	2023-06-10 14:19:44 +08:00
Sudharsan Dhamal Gopalarathnam	162856ad9a	[sflow]Delay starting sflow service until ports are created (#15333 ) * [sflow]Delay starting sflow service until ports are created * Removing sflow from sonic.target dependency since it will be managed by hostcfgd	2023-06-09 16:28:15 -07:00
Ye Jianquan	cec9d7b83a	Revert "Add watchdog mechanism to swss service and generate alert when swss have issue. (#14686 )" (#15390 ) This reverts commit `44427a2f6b`. Docker image not updated during PR validation and caused PR check failures. Force merge this revert. After cache is updated after this PR is merged, issue should be fixed.	2023-06-09 09:10:35 +08:00
Yevhen Fastiuk	8a6d45227e	[Clock] Add timezone config YANG model (#14651 ) * Add the ability to configure timezone Signed-off-by: Yevhen Fastiuk <yfastiuk@nvidia.com> * Add YANG model for timezone Signed-off-by: Yevhen Fastiuk <yfastiuk@nvidia.com> * Add timezone reference Signed-off-by: Yevhen Fastiuk <yfastiuk@nvidia.com> --------- Signed-off-by: Yevhen Fastiuk <yfastiuk@nvidia.com>	2023-06-07 10:39:24 -07:00
Hua Liu	44427a2f6b	Add watchdog mechanism to swss service and generate alert when swss have issue. (#14686 ) This PR depends on https://github.com/sonic-net/sonic-swss/pull/2737 merge first. What I did Add orchagent watchdog to monitor and alert orchagent stuck issue. Why I did it Currently SONiC monit system only monit orchagent process exist or not. If orchagent process stuck and stop processing, current monit can't find and report it. How I verified it Pass all UT. Add new UT https://github.com/sonic-net/sonic-mgmt/pull/8306 to check watchdog works correctly. Manually test, after pause orchagent with 'kill -STOP <pid>', check there are warning message exist in log: Apr 28 23:36:41.504923 vlab-01 ERR swss#supervisor-proc-watchdog-listener: Process 'orchagent' is stuck in namespace 'host' (1.0 minutes). Details if related Heartbeat message PR: https://github.com/sonic-net/sonic-swss/pull/2737 UT PR: https://github.com/sonic-net/sonic-mgmt/pull/8306	2023-06-05 22:21:17 -07:00
siqbal1986	381cfe4485	Added VNET_MONITOR_TABLE,BFD_SESSION_TABLE,VNET_ROUTE_TUNNEL_TABLE to the list (#14992 ) * The 3 tables in state DB need to be cleaned up after SWSS restart to have a consistent state.	2023-06-05 13:18:50 -07:00
mssonicbld	4335690de7	[ci/build]: Upgrade SONiC package versions	2023-06-05 20:51:47 +08:00
Arvindsrinivasan Lakshmi Narasimhan	3f4b959d3f	[chassis] add libffi-dev for sonic-utilities (#15218 ) In the PR sonic-net/sonic-utilities#2850 , for support remote access of linecards paramiko package is installed in sonic-utilities. libffi-dev needs to installed to be able to compile for armhf image Signed-off-by: Arvindsrinivasan Lakshmi Narasimhan <arlakshm@microsoft.com>	2023-06-03 14:36:50 -07:00
mssonicbld	f80e182c22	[ci/build]: Upgrade SONiC package versions (#15325 )	2023-06-03 19:45:07 +08:00
mssonicbld	c044e6e34e	[ci/build]: Upgrade SONiC package versions (#15307 )	2023-06-02 21:40:29 +08:00
Vaibhav Hemant Dixit	02b17839c3	Fix for fast/cold-boot: call db_migrator only after old config is loaded (#14933 ) Why I did it Fix the issue where db_migrator is called before DB is loaded w/ config. This leads to db_migrator: Not finding anything, and resumes to incorrectly migrate every missing config This is not expected. migration should happen after the old config is loaded and only new schema changes need migration. Since DB does not have anything when migrator is called, db_migrator fails when some APIs return None. The reason for incorrect call is that: database service starts db_migrator as part of startup sequence. config-setup service loads data from old-config/minigraph. However, since it has Requires=database.service. Hence, config-setup starts only when database service is started. And database service is started when db_migrator is completed. Fixed by: Check if this is first time boot by checking pending_config_migration flag. If pending_config_migration is enabled, then do not call db_migrator as part of database service startup. Let database service start which triggers config-setup service to start. Now call db_migrator after when config-setup service loads old-config/minigraph	2023-05-30 10:16:21 -07:00
vmittal-msft	ecb4db58a9	Update PG headroom settings ports based on port speed/cable length (#14908 ) * Update PG headroom settings ports based on port speed/cable length * Updated XOFF settings to use chip level numbers than core * Updated PG headroom based on uplink/downlink side * fix for sonic-config-gen tests * More fixes for unit test cases * more test fixes * Merged multiple functions into one	2023-05-19 08:19:27 -07:00
Pavan-Nokia	c5d0507224	[arm64][Nokia-7215-A1]Add support for Nokia-7215-A1 platform (#13795 ) Add new Nokia build target and establish an arm64 build: Platform: arm64-nokia_ixs7215_52xb-r0 HwSKU: Nokia-7215-A1 ASIC: marvell Port Config: 48x1G + 4x10G How I did it - Change make files for saiserver and syncd to use Bullseye kernel - Change Marvell SAI version to 1.11.0-1 - Add Prestera make files to build kernel, Flattened Device Tree blob and ramdisk for arm64 platforms - Provide device and platform related files for new platform support (arm64-nokia_ixs7215_52xb-r0).	2023-05-18 14:24:05 -07:00
Samuel Angebault	fa95ebcaae	Add optional zram compression for docker_inram Some devices running SONiC have a small storage device (2G and 4G mainly) The SONiC image growth over time has made it impossible to install 2 images on a single device. Some mitigations have been implemented in the past for some devices but there is a need to do more. One such mitigation is `docker_inram` which creates a `tmpfs` and extracts `dockerfs.tar.gz` in it. This all happens in the SONiC initramfs and by ensuring the installation process does not extract `dockerfs.tar.gz` on the flash but keep the file as is. This mitigation does a tradeoff by using more RAM to reduce the disk footprint. It however creates new issues for devices with 4G of system memory since the extracted `dockerfs.tar.gz` nears the 1.6G. Considering debian upgrades (with dual base images) and the continuous stream of features this is only going to get bigger. This change introduces an alternative to the `tmpfs` by allowing a system to extract the `dockerfs.tar.gz` inside a `zram` device thus bringing compression in play at the detriment of performance. Introduce 2 new optional kernel parameters to be consumed by SONiC initramfs. - `docker_inram_size` which represent the max physical size of the `zram` or `tmpfs` volume (defaults to DOCKER_RAMFS_SIZE) - `docker_inram_algo` which is the method to use to extract the `dockerfs.tar.gz` (defaults to `tmpfs`) other values are considered to be compression algorithm for `zram` (e.g. `zstd`, `lzo-rle`, `lz4`) Refactored the logic to mount the docker fs in the SONiC initramfs under the `union-mount` script. Moved the code into a function to make it cleaner and separated the inram volume creation and docker extraction. On Arista platform with a flash smaller or equal to 4GB set `docker_inram_algo` to `zstd` which produces the best compression ratio at the detriment of a slower write performance and a similar read performance to other `zram` compression algorithms.	2023-05-18 14:21:52 -07:00
Samuel Angebault	467994c024	[Arista] Fix boot0 code for docker_inram Enable docker_inram for all systems with 4GB or less of flash. This is mandatory to allow these systems to store 2 SONiC images. This change also fixes the missing docker_inram attribute when installing a new image from SONiC. Because the SWI image can ship with additional kernel parameters within such as `sonic_fips=` this lead to a conflict. To prevent the conflict, the extra kernel parameters from the SWI are now stored in the file `kernel-cmdline-append` which isn't used anywhere.	2023-05-18 14:21:52 -07:00
Anish Narsian	05a85b57b8	[arp_update] Resolve neighbors from config_db (#15006 ) * To resolve NEIGH table entries present in CONFIG_DB. Without this change arp/ndp entries which we wish to resolve, and configured via CONFIG_DB are not resolved.	2023-05-17 10:42:03 -07:00
mssonicbld	3d1ae46f90	[ci/build]: Upgrade SONiC package versions	2023-05-15 18:32:43 +08:00
mssonicbld	31223fb9fe	[ci/build]: Upgrade SONiC package versions (#15057 )	2023-05-13 18:30:20 +08:00
judyjoseph	efeae03ea3	Add override_config to load_minigraph in config-setup service (#14834 ) This PR is to handle the override minigraph config by golden_config_db.json file if it is present in the backup location.	2023-05-10 11:54:33 -07:00
Zain Budhwani	a738c39328	Add fix to monit_regex.json for catching mem_usage and cpu_usage (#14954 ) Why I did it Current regex not able to capture logs, modify regex to capture syslog messages Work item tracking Microsoft ADO (number only): 13366345 How I did it Code change How to verify it sonic-mgmt test case	2023-05-08 11:48:17 -07:00
Ying Xie	72c52bc677	Revert "Clear DNS configuration received from DHCP during networking reconfiguration in Linux. (#13516 )" (#14902 ) This reverts commit `c7ecd92c54`.	2023-05-01 17:12:38 -07:00
mssonicbld	80c5ab4a4a	[ci/build]: Upgrade SONiC package versions (#14896 )	2023-05-01 18:10:48 +08:00
mssonicbld	0d709a3655	[ci/build]: Upgrade SONiC package versions (#14888 )	2023-04-29 17:42:19 +08:00
Tejaswini Chadaga	ca224863cb	Changes to support TSA from supervisor (#14691 ) Why I did it Support for SONIC chassis isolation using TSA and un-isolation using TSB from supervisor module Work item tracking Microsoft ADO (number only): 17826134 How I did it When TSA is run on the supervisor, it triggers TSA on each of the linecards using the secure rexec infrastructure introduced in sonic-net/sonic-utilities#2701. User password is requested to allow secure login to linecards through ssh, before execution of TSA/TSB on the linecards TSA of the chassis withdraws routes from all the external BGP neighbors on each linecard, in order to isolate the entire chassis. No route withdrawal is done from the internal BGP sessions between the linecards to prevent transient drops during internal route deletion. With these changes, complete isolation of a single linecard using TSA will not be possible (a separate CLI/script option will be introduced at a later time to achieve this) Changes also include no-stats option with TSC for quick retrieval of the current system isolation state This PR also reverts changes in #11403 How to verify it These changes have a dependency on sonic-net/sonic-utilities#2701 for testing Run TSA from supervisor module and ensure transition to Maintenance mode on each linecard Verify that all routes are withdrawn from eBGP neighbors on all linecards Run TSB from supervisor module and ensure transition to Normal mode on each linecard Verify that all routes are re-advertised from eBGP neighbors on all linecards Run TSC no-stats from supervisor and verify that just the system maintenance state is returned from all linecards	2023-04-28 16:28:06 +08:00
Stephen Sun	9e56fea091	Temporary WA for the issue that asic_table.json can not be rendered (#13888 ) - Why I did it We suspect the issue #13791 is caused by redis server being temporarily unavailable during system initialization so we do not use -d in sonic-cfggen, for now, to avoid accessing redis server - How I did it Provide a string containing required json data when calling sonic-cfggen - How to verify it Manually test it Signed-off-by: Stephen Sun <stephens@nvidia.com>	2023-04-24 17:02:35 +03:00
mssonicbld	5ad844f185	[ci/build]: Upgrade SONiC package versions	2023-04-24 18:33:06 +08:00
mssonicbld	81a557885b	[ci/build]: Upgrade SONiC package versions (#14799 )	2023-04-22 17:47:40 +08:00
mssonicbld	d006219e2d	[ci/build]: Upgrade SONiC package versions (#14718 )	2023-04-19 18:59:16 +08:00
Aryeh Feigin	039a9c998a	[Fast-boot] Clear teamd-timer when finalizing fast-reboot (#14583 ) Part of sonic-net/sonic-utilities#2760 Similar to #14295 - Why I did it To clear teamd timer when fast-reboot is finalized to prevent any further effect. - How I did it Deleted teamd timer from config-db in fast-reboot finalizer. config save call is moved to after clearing teamd-timer so it won't have any further effect as well. - How to verify it Verified manually that entry was deleted after fast-reboot was finalized.	2023-04-18 09:15:42 +03:00
Stepan Blyshchak	d73c810e86	[image_config] add rasdaemon.timer (#14300 ) rasdaemon is a tool to log hardware errors. It takes 100% CPU during boot for a few seconds. It impacts fast/warm boot by delaying control plane restoration for 5 sec on some platforms. Signed-off-by: Stepan Blyschak <stepanb@nvidia.com>	2023-04-17 08:58:45 -07:00
mssonicbld	7f262d71da	[ci/build]: Upgrade SONiC package versions (#14685 )	2023-04-17 19:58:43 +08:00
mssonicbld	49dbaeb649	[ci/build]: Upgrade SONiC package versions (#14672 )	2023-04-15 18:21:50 +08:00

1192 Commits
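
Commit fa95ebcaae above describes two optional kernel parameters, `docker_inram_size` and `docker_inram_algo`, and their defaults. The following is only a rough Python sketch of that selection logic, not the actual `union-mount` shell code. The function name `parse_docker_inram` is hypothetical, and the fallback size is passed in by the caller because the real value of DOCKER_RAMFS_SIZE is not given in this log.

```python
def parse_docker_inram(cmdline, default_size):
    """Pick the in-RAM docker volume settings from a kernel command line.

    Hypothetical helper for illustration only. `default_size` stands in for
    DOCKER_RAMFS_SIZE, whose actual value is not stated in the commit log.
    Returns a (size, algo, backend) tuple.
    """
    params = {}
    for token in cmdline.split():
        key, sep, value = token.partition("=")
        if sep:  # keep only key=value tokens
            params[key] = value

    # Per the commit message: size defaults to DOCKER_RAMFS_SIZE,
    # algo defaults to tmpfs.
    size = params.get("docker_inram_size", default_size)
    algo = params.get("docker_inram_algo", "tmpfs")

    # Any value other than tmpfs is treated as a zram compression algorithm.
    backend = "tmpfs" if algo == "tmpfs" else "zram"
    return size, algo, backend


if __name__ == "__main__":
    # "1500M" is an assumed placeholder default, not the real constant.
    print(parse_docker_inram("quiet docker_inram_size=3000M docker_inram_algo=zstd", "1500M"))
    print(parse_docker_inram("quiet", "1500M"))
```

The sketch keeps the backend decision separate from the size lookup, mirroring the commit's note that volume creation and docker extraction were split into distinct steps.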