sonic-buildimage

Author	SHA1	Message	Date
Lawrence Lee	e821dd8551	[arp_update]: Set failed IPv6 neighbors to incomplete (#11919 ) After pinging any failed IPv6 neighbor entries, set the remaining failed/incomplete entries to a permanent INCOMPLETE state. This manual setting to INCOMPLETE prevents these entries from automatically transitioning to FAILED state, and since they are now incomplete, any subsequent NA messages for these neighbors are able to resolve the entry in the cache. Signed-off-by: Lawrence Lee <lawlee@microsoft.com>	2022-09-02 21:57:47 +00:00
Ying Xie	4ab83170a5	[write_standby] update write_standby.py script (#11650 ) Why I did it The initial value has to be present for the state machines to work. In an active-standby dual-tor scenario, or any hardware mux scenario, the value will eventually be updated after a delay. However, in an active-active dual-tor scenario, there is no other mechanism to initialize the value and get the state machines started, so this script has to write something at startup time. For active-active dualtor, 'active' is the preferred initial value; the state machine will soon switch the state to standby if the link prober finds the link is not in a good state. How I did it Update the script to always provide initial values. How to verify it Tested on active-active dual-tor testbed. Signed-off-by: Ying Xie ying.xie@microsoft.com	2022-09-01 23:57:23 +00:00
Jing Zhang	9d3194c77a	Avoid write_standby in warm restart context (#11283 ) Avoid write_standby in warm restart context. sign-off: Jing Zhang zhangjing@microsoft.com Why I did it In warm restart context, we should avoid mux state change. How I did it Check warm restart flag before applying changes to app db. How to verify it Ran write_standby in table missing, key missing, field missing scenarios. Did a warm restart, app db changes were skipped. Saw this in syslog: WARNING write_standby: Taking no action due to ongoing warmrestart.	2022-09-01 23:57:17 +00:00
mssonicbld	ed68e4c97c	[ci/build]: Upgrade SONiC package versions (#11896 )	2022-08-30 22:44:47 +08:00
mssonicbld	347b2dddcd	[ci/build]: Upgrade SONiC package versions (#11757 )	2022-08-29 14:08:14 +08:00
mssonicbld	07082bb5f5	[ci/build]: Upgrade SONiC package versions (#11676 )	2022-08-16 13:07:32 +00:00
Stepan Blyshchak	8ab448a852	[swss.sh/syncd.sh] Trap only on EXIT (#11590 ) When using trap on SIGTERM the script will not react to the SIGTERM signal sent while a child is executing. That is, the following script does not react to a SIGTERM sent to it while it is waiting for sleep to finish: ``` trap "echo Handled SIGTERM" 0 2 3 15; echo "Before sleep"; sleep inf; echo "After sleep" ``` Instead, trap only on EXIT, which also covers the case of exiting on SIGINT or SIGTERM. Signed-off-by: Stepan Blyschak <stepanb@nvidia.com>	2022-08-11 20:38:20 +00:00
zitingguo-ms	5b5bd5e818	[202012 BRCM SAI 4.3.7.0] Pick up fixes and make up BRCM SAI version to 4.3.7.0 (#11681 ) Pick up the following fixes and update BRCM SAI to 4.3.7.0: CS00012208537: Add back previous commit 54c5bc4848eb748; CS00012253061,SONIC-63280: WB from 3.5 to 4.3, followed by WB to 4.3; CS00012207978: SDK-296517, time spent for SAI operations; CS00012245601,SONIC-62898: Egress ACL Counted as Interface TX drops; Update pcbb with Fixes for CS00012243699; Upgrade on pcbb with Fixes for KB0025353, CS00012221689, CS00012221688, KB0025391, CS00012230519. The commit for "CS00012221688:PFC frames egressing, PFC storm happens simultaneously on 2 ports" is purposely skipped, to be picked up later, because the SWSS dependency is not ready. Why I did it How I did it How to verify it Tested build target, successful. Manually ran these tests after installing the sai binary within image 20201231.73 on 7050CX3 (TD3) T0 DUT, all passed: vxlan/test_vxlan_decap.py fdb/test_fdb.py pfcwd/test_pfcwd_all_port_storm.py acl/null_route/test_null_route_helper.py acl/test_acl.py vlan/test_vlan.py platform_tests/test_reboot.py Signed-off-by: zitingguo-ms <zitingguo@microsoft.com>	2022-08-10 15:02:47 -07:00
Jing Zhang	ffd9e190e1	Update WARM START FINALIZER to wait for linkmgrd to reconcile (#11477 ) Stemming from sonic-net/sonic-linkmgrd#76, this PR updates the warm restart finalizer to wait for linkmgrd to be reconciled. sign-off: Jing Zhang zhangjing@microsoft.com Why I did it To make sure the finalizer saves the config after linkmgrd's reconciliation. How I did it Add linkmgrd to the reconciliation wait list of the warmboot finalizer. How to verify it Verified on lab device, linkmgrd reconciled as expected.	2022-08-09 21:05:12 +00:00
mssonicbld	14f93e15c6	[ci/build]: Upgrade SONiC package versions (#11629 ) Why I did it Upgrade SONiC Versions	2022-08-07 11:27:16 +08:00
Lawrence Lee	04ba6da1ab	[202012][arp_update]: Resolve failed neighbors on dualtor (#11641 ) In arp_update, check for FAILED or INCOMPLETE kernel neighbor entries and manually ping them to try and resolve the neighbor Signed-off-by: Lawrence Lee <lawlee@microsoft.com>	2022-08-05 23:30:04 -07:00
tjchadaga	6d66d9b8fc	Revert "Add load_minigraph option to include traffic-shift-away during config migration (#11403 )" (#11625 ) This reverts commit `6c2f99a327`.	2022-08-06 10:05:45 +05:30
bingwang-ms	84aca00847	[202012]Support different `DSCP_TO_TC_MAP` for T1 in dualtor deployment (#11580 ) Why I did it This PR backports #11569 into the 202012 branch. It applies a different DSCP_TO_TC_MAP to downlink and uplink ports on T1 in dualtor deployment. For T1 downlink ports (To T0): the DSCP_TO_TC_MAP is not changed. DSCP2 and DSCP6 are mapped to TC2 and TC6 respectively. For T1 uplink ports (To T1): a new DSCP_TO_TC_MAP|AZURE_UPLINK is defined and applied. DSCP2 and DSCP6 are mapped to TC1 to avoid mixing up lossy and lossless traffic from T2. The extra lossy PG2 and PG6 added in PR #11157 are reverted as well, because no traffic from T2 is mapped to PG2 or PG6 now. How I did it Define a new map DSCP_TO_TC_MAP|AZURE_UPLINK for 7260 T1. How to verify it Verified by test case in test_j2files.py.	2022-08-01 08:59:45 -07:00
Nikola Dancejic	c5a5734242	[swss] Adding bgp container as dependent of swss (#11168 ) What I did: Added bgp as a dependent of swss. Why I did it: The bgp container was not restarting on swss crash. When swss crashes, linkmgrd doesn't initiate a switchover because it cannot access the default route from orchagent. Bringing down bgp with swss will isolate the ToR, causing linkmgrd to initiate a switchover to the peer ToR and avoiding significant packet loss. Signed-off-by: Nikola Dancejic <ndancejic@microsoft.com>	2022-07-29 09:37:09 -07:00
Lior Avramov	a40aca43b9	[memory_checker] Do not check memory usage of containers if docker daemon is not running (#11476 ) Fix in Monit memory_checker plugin. Skip fetching running containers if docker engine is down (can happen in deinit). This PR fixes issue #11472. Signed-off-by: liora liora@nvidia.com Why I did it In the case where Monit runs during deinit flow, memory_checker plugin is fetching the running containers without checking if Docker service is still running. I added this check. How I did it Use systemctl is-active to check if Docker engine is still running. How to verify it Use systemctl to stop docker engine and reload Monit, no errors in log and relevant print appears in log. Which release branch to backport (provide reason below if selected) The fix is required in 202205 and 202012 since the PR that introduced the issue was cherry picked to those branches (#11129).	2022-07-27 23:28:19 +00:00
tjchadaga	6c2f99a327	Add load_minigraph option to include traffic-shift-away during config migration (#11403 )	2022-07-27 23:27:21 +00:00
bingwang-ms	c5eb031111	[202012] Add flag to control the generation of global level map (#11451 ) Why I did it This PR is to cherry-pick #11448 to 202012 branch after resolving conflicts. There are conflicts in files/build_templates/qos_config.j2 src/sonic-config-engine/tests/test_j2files.py	2022-07-15 09:44:45 -07:00
mssonicbld	550ab26fc7	[ci/build]: Upgrade SONiC package versions (#11422 )	2022-07-12 15:39:32 +00:00
Neetha John	26ee4ae4a4	Add backend acl template (#11220 ) Why I did it Storage backend has all vlan members tagged. If untagged packets are received on those links, they are accounted as RX_DROPS, which can lead to false alarms in monitoring tools. This ACL is used to hide these drops. How I did it Created an ACL template which will be loaded during minigraph load for backend. This template allows tagged vlan packets and drops untagged ones. How to verify it Unit tests Signed-off-by: Neetha John <nejo@microsoft.com>	2022-07-08 21:39:39 +00:00
mssonicbld	9a86fa9264	[ci/build]: Upgrade SONiC package versions (#11074 ) Upgrade SONiC Versions	2022-07-06 11:00:50 +08:00
xumia	32cda89f93	[Build]: Support to use symbol links for lazy installation targets to reduce the image size (#10923 ) Why I did it Support using symbolic links in the platform folder to reduce the image size. The current solution copies each lazy installation target (xxx.deb files) to each of the folders in the platform folder. The size keeps growing as more packages are added to the platform folder. For cisco-8000 as an example, the size will be up to 2G, while most of it is duplicate packages in the platform folder. How I did it Create a new folder in platform/common and copy all the deb packages to it; any other folders that use the packages hold symbolic links to the common folder. Why platform.tar? We implemented a patch for this, see #10775, but the problem is that ONIE uses a really old unzip version that cannot support symbolic links. The current solution is similar to PR 10775, but packs the platform folder into a tar package, which ONIE can handle. During the installation, the package.tar will be extracted to the original folder and removed.	2022-07-05 20:57:49 +00:00
yozhao101	4487a962e3	[memory_checker] Do not check memory usage of containers which are not created (#11129 ) Signed-off-by: Yong Zhao yozhao@microsoft.com Why I did it This PR aims to fix an issue (#10088) by enhancing the script memory_checker. Specifically, if a container is not created successfully while the device is booting or rebooting, then memory_checker does not need to check its memory usage. How I did it In the script memory_checker, a function is added to get the names of running containers. If the specified container name is not in the current running container list, the script exits without checking its memory usage. How to verify it I tested on a lab device with the following steps: Stop the telemetry container with the command sudo systemctl stop telemetry.service. Remove the telemetry container with the command docker rm telemetry. Check whether the script memory_checker, run by Monit, generates the syslog message saying it will exit without checking memory usage of telemetry.	2022-07-05 20:57:45 +00:00
Samuel Angebault	d15a484dfa	[202012][Arista] Fix cmdline generation during warm-reboot from 201811/201911 (#11161 ) Issue fixed: when performing a warm-reboot or fast-reboot from 201811 or 201911 to 202012, the kernel command line contains duplicate information. This issue is related to a change that was made to make the 202012 boot0 file more futureproof. A cold reboot restores a clean slate, though that is not always desirable. Changes done: Added some logic to properly detect the end of the Aboot cmdline when the cmdline-aboot-end delimiter is not set (clean case). Added some logic to regenerate the Aboot cmdline when cmdline-aboot-end is set but duplicate parameters exist before it (dirty case). Reorganized some code to handle duplicate parameters in the allowlist.	2022-07-04 11:01:03 -07:00
Stephen Sun	fe6be5da92	[202012] Configure different map between uplink and downlink on t1 switch in dual ToR scenario (#11299 ) - Why I did it Configure different DSCP_TO_TC_MAP between uplink and downlink on T1 switch in dual ToR scenario On T1 uplink, both DSCP 2/6 will be mapped to TC 1 for the purpose of avoiding such traffic occupying lossless buffers. On T1 downlink, they will be mapped to TC 2/6 respectively. (unchanged) - How I did it For vendors who want to configure different DSCP_TO_TC_MAP between uplinks and downlinks on T1, they should Define generate_dscp_to_tc_map macro in SKU's qos.json.j2 file Define map AZURE for downlink and AZURE_UPLINK for uplink Define jinja2 variable different_dscp_to_tc_map as True Signed-off-by: Stephen Sun <stephens@nvidia.com>	2022-07-03 15:58:06 +03:00
Stephen Sun	307d0e2aca	[Mellanox][202012] Support Mellanox-SN4600C-C64 as T1 switch in dual-ToR scenario (#11032 ) Why I did it Support Mellanox-SN4600C-C64 as T1 switch in dual-ToR scenario 1. Support additional queue and PG in buffer templates, including both traditional and dynamic model 2. Support mapping DSCP 2/6 to lossless traffic in the QoS template. 3. Add macros to generate additional lossless PG in the dynamic model 4. Adjust the order in which the generic/dedicated (with additional lossless queues) macros are checked and called to generate buffer tables in common template buffers_config.j2 - Buffer tables are rendered using macros. - Both generic and dedicated macros are defined on our platform. Currently, the generic one is called as long as it is defined, which causes the generic one to always be called on our platform. To avoid this, the dedicated macro is checked and called first, and then the generic ones. 5. Support MAP_PFC_PRIORITY_TO_PRIORITY_GROUP on ports with additional lossless queues. On Mellanox-SN4600C-C64, buffer configuration for t1 is calculated as: 40 * 100G downlink ports with 4 lossless PGs/queues, 1 lossy PG, and 3 lossy queues; 16 * 100G uplink ports with 2 lossless PGs/queues, 1 lossy PG, and 5 lossy queues. Signed-off-by: Stephen Sun stephens@nvidia.com How to verify it Run regression test.	2022-06-21 10:04:49 -07:00
bingwang-ms	6ddf5cd7dc	[202012] [cherry-pick] Generate switch level dscp_to_tc_map entry from qos_config template (#11132 ) * Generate switch level dscp_to_tc_map Signed-off-by: bingwang <wang.bing@microsoft.com>	2022-06-17 20:49:56 +08:00
Jing Kan	5b2261da37	Revert "[202012][openssh] openssh: Upgrade from 7.9 to 8.4, to match version in buster-backports (#10910 )" (#11136 ) This reverts commit `14fdcc815a`.	2022-06-17 20:46:43 +08:00
Saikrishna Arcot	044570c42e	Remove SSH host keys after installing the custom version of sshd (#10633 ) (#11140 ) * Remove SSH host keys after installing the custom version of sshd Signed-off-by: Saikrishna Arcot <sarcot@microsoft.com> * Use an override for sshd instead of overwriting the service file Don't overwrite upstream's .service file, and instead use an override file for making sure the host key(s) are generated. Signed-off-by: Saikrishna Arcot <sarcot@microsoft.com>	2022-06-16 11:47:04 -07:00
Lukas Stockner	ab10005729	[swss] Clear VXLAN tunnel table from State DB on startup (#11078 ) *Clear VXLAN tunnel table from State DB on startup Signed-off-by: Lukas Stockner <lstockner@genesiscloud.com>	2022-06-10 11:50:56 -07:00
shlomibitton	2a9aa0836c	[202012] [Mellanox] [pmon] Fix for PMON service not starting when restarting SWSS service after fast/warm reboot (#10902 ) - Why I did it A recent change to delay the PMON service on fast/warm reboot introduced an issue when restarting only the SWSS service after a fast/warm reboot on Nvidia platforms. Since the timer is triggered only when the system boots, when the system has come up from a fast/warm reboot and the user restarts the SWSS service, the PMON service is stopped as part of the syncd.sh script but the timer does not start again. - How I did it In the syncd.sh script, on a fast/warm indication, check whether pmon.timer is running. If it is running, we are at the first boot and continue normally. If it is not running, the service was restarted, so start the timer to keep the system behavior consistent. - How to verify it Run fast/warm reboot. service swss restart. Observe PMON service starting.	2022-06-08 09:46:54 +03:00
mssonicbld	855ae0491f	[ci/build]: Upgrade SONiC package versions (#11051 ) Upgrade SONiC Versions #11051	2022-06-08 08:42:04 +08:00
Richard.Yu	8f3edde302	[202012][BRCM SAI 4.3.5.3-5] Update saibcm for pcbb feature (#10998 ) Support Tunnel PFC/pcbb feature on Broadcom platform. How to verify it Tested build target, successful make target/docker-syncd-brcm.gz manual run those tests after installing sai binary within image 20201231.67 on 7050CX3 (TD3) T0 DUT, all passed fib/test_fib.py vxlan/test_vxlan_decap.py fdb/test_fdb.py decap/test_decap.py pfcwd/test_pfcwd_all_port_storm.py acl/null_route/test_null_route_helper.py acl/test_acl.py vlan/test_vlan.py platform_tests/test_reboot.py Signed-off-by: richardyu-ms <richard.yu@microsoft.com>	2022-06-06 09:54:00 -07:00
bingwang-ms	e159998657	[202012][cherry-pick] Add two extra lossless queues for bounced back traffic (#10715 ) * Add extra lossless queues Signed-off-by: bingwang <bingwang@microsoft.com>	2022-06-04 19:25:02 +08:00
bingwang-ms	7ec6a60230	[cherry-pick] [202012] Update qos config to clear queues for bounced back traffic (#10608 ) * Update qos config to clear queues for bounced back traffic Signed-off-by: bingwang <wang.bing@microsoft.com>	2022-06-02 16:29:25 +08:00
Jing Kan	14fdcc815a	[202012][openssh] openssh: Upgrade from 7.9 to 8.4, to match version in buster-backports (#10910 ) * Use buster-backports version * Use dget dsc file instead source repo * Update make files * Upgrade openssh-client to 8.4 in base image * Remove useless installation * Install openssh-server from buster-backports in build_debian * Update dev buster package version list Signed-off-by: Jing Kan jika@microsoft.com	2022-06-02 16:06:22 +08:00
xumia	06addae853	Revert "Reduce image size for lazy installation packages (#10775 )" (#10916 ) This reverts commit `15cf9b0d70`. Why I did it Revert the PR #10775, for it has impact on onie installation. It is caused by the symbol links not supported in some of the onie unzip. We will enable after fixing the issue, see #10914	2022-05-27 17:00:50 +00:00
mssonicbld	8ce3cab508	[ci/build]: Upgrade SONiC package versions (#10732 ) Co-authored-by: mssonicbld <vsts@fv-az31-361.b1uo4dmaffwenkazr3a2h2ovdb.jx.internal.cloudapp.net>	2022-05-27 01:10:32 -07:00
shlomibitton	c71c91e2b0	[202012] [Fastboot] Delay PMON service for better fastboot performance (#10745 ) Why I did it Profiling the system state on init after fast-reboot, during create_switch function execution, shows a few Python scripts running at the same time. This parallel execution consumes CPU time, and create_switch takes longer than it should. Following this finding, and to ensure these services will not interfere in the future, PMON is delayed by 90 seconds until the system finishes the init flow after fastboot. How I did it Add a timer for PMON service. Exclude for MLNX platform the start trigger of PMON when SYNCD starts in case of fastboot. Copy the timer file to the host bin image. How to verify it Run fast-reboot on MLNX platform and observe faster create_switch execution time.	2022-05-15 23:31:32 -07:00
shlomibitton	bca8a244c6	[202012] [Fastboot] Delay LLDP service for better fastboot performance (#10568 ) (#10744 ) This PR backports the fix #10568 and depends on PR #10745. - Why I did it Profiling the system state on init after fast-reboot, during create_switch function execution, shows a few Python scripts running at the same time. This parallel execution consumes CPU time, and create_switch takes longer than it should. Following this finding, and to ensure these services will not interfere in the future, LLDP is delayed by 90 seconds until the system finishes the init flow after fastboot. - How I did it Add a timer for LLDP service. Copy the timer file to the host bin image. - How to verify it Run fast-reboot on MLNX platform and observe faster create_switch execution time.	2022-05-15 15:05:29 +03:00
Junchao-Mellanox	4f326e8779	Fix race condition between networking service and interface-config service (#10573 ) (#10766 ) Backport https://github.com/Azure/sonic-buildimage/pull/10573 to 202012. Why I did it The PR fixes a bug where the mgmt port eth0 may lose its IP even if the user configured a static IP for eth0. The issue is not always reproducible. The reproducing flow is: 1. Systemd starts the networking service, which runs a dhcp based configuration and is assigned an IP from dhcp. 2. Systemd starts the interface-config service, which depends on the networking service. 3. The interface-config service runs the command “ifdown --force eth0”, check [line](`16717d2dc5/files/image_config/interfaces/interfaces-config.sh (L4)`), but the networking service is still running, so this [line](`ac32bec0e2/ifupdown2/ifupdown/main.py (L74)`) fails with the error: “error: Another instance of this program is already running.”. This error is printed by the ifupdown2 lib, which is the main process of the networking service. So ifdown does not actually work here, and the IP of eth0 is not brought down. 4. The interface-config service updates /etc/networking/interface to the static configuration. 5. The interface-config service runs the command “systemctl restart networking”. This command kills the previous networking related processes (log: networking.service: Main process exited, code=killed, status=15/TERM) and tries to reconfigure the IP address with the static configuration. But it detects that the configured IP and the existing IP are the same, and it does not actually configure the IP in the kernel. Hence, the IP is still the one obtained from dhcp. (This could be a bug in ifupdown2: the previous IP is from dhcp and the new IP is static, but it treats them as the same instead of re-configuring the IP.) 6. When the lease of the IP expires, the IP of eth0 is removed by the kernel and the issue reproduces. The issue is not always reproducible because the networking service usually runs fast enough that it does not hit step 3. 
How I did it Check the networking service state before running "ifdown --force eth0", and wait for it to finish if it is activating. How to verify it Manual test.	2022-05-14 14:58:24 -07:00
xumia	951d93e362	Reduce image size for lazy installation packages (#10775 ) Why I did it The image size is too large when there are multiple lazy packages and multiple platforms. It is not necessary to keep multiple copies of the lazy installation packages. For the cisco image, the image size is reduced from 3.5G to 1.7G. How I did it Use symbolic links to keep only one copy of each lazy package. Make a new folder fsroot/platform/common. Copy the lazy packages into the folder. When a package is used in a platform, such as x86_64-grub, x86_64-8800_rp-r0, x86_64-8201_on-r0, etc., only make a symbolic link to the package in the common folder.	2022-05-10 06:44:40 +00:00
Samuel Angebault	705d3c0804	[Arista] Remove arista.log from rsyslog default logrotate (#9731 ) Why I did it In parallel of this change Arista added a custom logrotate configuration as part of its driver library. Having 2 logrotate configuration for the same log file triggers an issue. Fixes aristanetworks/sonic#38 How I did it Arista merged a few changes in sonic-buildimage which added a logrotate configuration aristanetworks/sonic@e43c797 It is therefore the right path to remove the arista.log line from the logrotate.d/rsyslog configuration. How to verify it Logrotate works without any error message, arista log rotation happens and arista daemons still append logs once file was truncated.	2022-04-28 23:58:41 +00:00
mssonicbld	1c9cdc4c7a	[ci/build]: Upgrade SONiC package versions (#10594 )	2022-04-27 15:25:14 +00:00
yozhao101	e6c18fa6dd	[Monit] Fix the issue which shows Monit can not reset its counter. (#10288 ) Signed-off-by: Yong Zhao <yozhao@microsoft.com> Why I did it This PR aims to fix the Monit issue where Monit can't reset its counter when monitoring memory usage of the telemetry container. Specifically, the Monit configuration related to monitoring memory usage of the telemetry container is as follows: `check program container_memory_telemetry with path "/usr/bin/memory_checker telemetry 419430400" if status == 3 for 10 times within 20 cycles then exec "/usr/bin/restart_service telemetry"` If memory usage of the telemetry container is larger than 400MB for 10 times within 20 cycles (minutes), then it will be restarted. Recently we observed that, after the telemetry container was restarted, its memory usage continuously increased from 400MB to 11GB within 1 hour, but it was not restarted again during this 1 hour sliding window. The reason is that Monit resets its counter if and only if the status of the monitored service changes from Status failed to Status ok, and during this 1 hour sliding window the status never changed from Status failed to Status ok. Currently, for each service monitored by Monit, there is an entry showing the monitoring status, monitoring mode, etc. For example, the following output from the command `sudo monit status` shows the status of the monitored service that checks memory usage of telemetry: `Program 'container_memory_telemetry' status Status ok monitoring status Monitored monitoring mode active on reboot start last exit value 0 last output - data collected Sat, 19 Mar 2022 19:56:26` Every minute, Monit runs the script to check the memory usage of telemetry and updates the counter if memory usage is larger than 400MB. If Monit checked the counter and found that memory usage of telemetry was larger than 400MB for 10 times within 20 minutes, the telemetry container was restarted. 
The following is an example status of the monitored service: `Program 'container_memory_telemetry' status Status failed monitoring status Monitored monitoring mode active on reboot start last exit value 0 last output - data collected Tue, 01 Feb 2022 22:52:55` After the telemetry container was restarted, we found that memory usage of telemetry increased rapidly from around 100MB to more than 400MB within 1 minute, so the status of the monitored service did not have a chance to change from Status failed to Status ok. How I did it As a workaround for this issue, Monit recently introduced another syntax format, `repeat every <n> cycles`, related to exec. This new syntax lets Monit repeat executing the background script if the error persists for a given number of cycles. How to verify it I verified this change on lab device str-s6000-acs-12. Another pytest PR (Azure/sonic-mgmt#5492) is submitted in sonic-mgmt repo for review.	2022-04-21 22:00:42 +00:00
Samuel Angebault	9de6b2ca12	[Arista] Fix arista-net initramfs hook (#10626 ) The interface renaming logic fails if one interface is missing. Because of the `set -e`, the whole initramfs hook would abort early on error. This change fixes the current behavior to make sure missing interfaces are properly skipped and existing interfaces are renamed.	2022-04-20 10:03:37 -07:00
Jing Kan	4ee75f490e	[202012][copp_cfg] Enable dhcp trap for BmcMgmtToRRouter (#10596 ) Signed-off-by: Jing Kan jika@microsoft.com	2022-04-19 15:59:20 +08:00
Stepan Blyshchak	fa1e364f54	[services] kill container on stop in warm/fast mode (#10511 ) To optimize stop on warm boot, added kill for containers Use service "kill" in the shutdown path for fast and warm reboot. For all other reload methods, service "stop" is used. This is done to save time in shutdown path, and to overall improve the time spent in warm and fast reload. How - Use service_mgmt.sh to trigger common logic to initiate kill (fast/warm) or stop (cold) for database.sh, radv.sh, snmp.sh, telemetry.sh, mgmt-framework.sh Signed-off-by: Stepan Blyschak <stepanb@nvidia.com>, Vaibhav H D <vaibhav.dixit@microsoft.com>	2022-04-18 14:27:48 -07:00
Ying Xie	6af3de4372	[202012][copp cfg] enable dhcp trap for a couple more devices (#10582 ) * [copp cfg] enable copp trap for a couple more devices Signed-off-by: Ying Xie <ying.xie@microsoft.com>	2022-04-15 11:47:02 -07:00
Saikrishna Arcot	29b6f62902	[202012] Run tune2fs during initramfs instead of image install (#10558 ) If it is run during image install, it's not guaranteed that the installation environment will have tune2fs available. Therefore, run it during initramfs instead. Signed-off-by: Saikrishna Arcot <sarcot@microsoft.com>	2022-04-12 19:59:24 -07:00
mssonicbld	e0fa07307a	[ci/build]: Upgrade SONiC package versions (#10395 ) [ci/build]: Upgrade SONiC package versions (#10395)	2022-04-10 17:00:00 +08:00
kellyyeh	b68f4dd74c	Enable dhcp copp trap for EPMS and MgmtTsToR (#10439 )	2022-04-06 09:46:08 -07:00
Saikrishna Arcot	e9db38594d	Image disk space reduction (#10172 ) (#10371 ) Reduce the disk space taken up during bootup and runtime. 1. Remove python package cache from the base image and from the containers. 2. During bootup, if logs are to be stored in memory, then don't create the `var-log.ext4` file just to delete it later during bootup. 3. For the partition containing `/host`, don't reserve any blocks for just the root user. This just makes sure all disk space is available for all users, if needed during upgrades (for example). * Remove pip2 and pip3 caches from some containers Only containers which appeared to have a significant pip cache size are included here. Signed-off-by: Saikrishna Arcot <sarcot@microsoft.com> * Don't create var-log.ext4 if we're storing logs in memory Signed-off-by: Saikrishna Arcot <sarcot@microsoft.com> * Run tune2fs on the device containing /host to not reserve any blocks for just the root user Signed-off-by: Saikrishna Arcot <sarcot@microsoft.com> (cherry picked from commit `5617b1ae3e`)	2022-03-29 10:11:28 -07:00
mssonicbld	873689ef6e	[ci/build]: Upgrade SONiC package versions (#10373 )	2022-03-28 23:08:38 +00:00
mssonicbld	e71c14502d	[ci/build]: Upgrade SONiC package versions (#10331 ) Upgrade SONiC Versions	2022-03-25 15:09:37 +08:00
Saikrishna Arcot	aafb3d00e2	Start haveged before systemd-random-seed (#10328 ) The haveged service file in Debian Buster specifies that haveged should start after systemd-random-seed starts (this was removed in Bullseye after systemd changes caused a bootloop). This is a bit counterproductive, since haveged is meant to be used in environments with minimal sources of entropy, but one of the checks that systemd-random-seed does is to verify that entropy is present. Therefore, override the default .service file for haveged that moves systemd-random-seed to the Before list, allowing it to start before systemd-random-seed checks the system entropy level. (systemd doesn't allow removing items from dependency/ordering entries such as After= and Before=, so the entire .service file has to be overwritten.) Note that despite this, haveged takes up to two seconds to actually start working, so systemd-random-seed may still block for about two seconds. However, this still allows other work (such as running rc.local) to proceed a bit sooner. Signed-off-by: Saikrishna Arcot <sarcot@microsoft.com>	2022-03-24 14:28:42 -07:00
noaOrMlnx	4f021c44c2	Update docker-sonic-vs infrastructure in order to run CoPP UT (#10230 ) *Changes to run CoPP UT in docker-sonic-vs	2022-03-21 21:55:24 -07:00
xumia	67312ff635	[Build]: Use one debian mirror config (#10281 ) Why I did it Use one debian mirror config. The empty config in https://github.com/Azure/sonic-buildimage/blob/master/files/image_config/apt/sources.list overrides the file https://github.com/Azure/sonic-buildimage/blob/master/files/apt/sources.list.amd64 (armhf/arm64), which does not make sense. The content in files/image_config/apt serves no purpose; anyone who wants to add a mirror config should add it in files/apt. How I did it Remove files/image_config/apt and the references to it.	2022-03-21 17:04:19 +08:00
gechiang	a984757b9d	[202012 BRCM SAI 4.3.5.3-3] Picked up fixes that makes up BRCM SAI version 4.3.5.3-3 (#10255 )	2022-03-19 17:18:50 -07:00
xumia	413ee3e219	[Build]: Fix /proc not mounted issue (#10164 ) (#10256 ) [Build]: Fix /proc not mounted issue	2022-03-19 22:19:06 +08:00
mssonicbld	03d058efe4	[ci/build]: Upgrade SONiC package versions (#10283 ) [ci/build]: Upgrade SONiC package versions	2022-03-19 11:09:51 +08:00
Stepan Blyshchak	8ce5e4e77b	[teamd.sh] kill teamd docker on warm shutdown for faster shutdown (#10219 ) This can save 6 sec for teamd LAG restoration - the time between: ``` Mar 9 13:51:10.467757 r-panther-13 WARNING teamd#teamd_PortChannel1[28]: Got SIGUSR1. Mar 9 13:52:33.310707 r-panther-13 INFO teamd#teamd_PortChannel1[27]: carrier changed to UP ``` - Why I did it Optimize warm boot. Specifically reduce the time needed for LAG restoration. - How I did it Kill teamd docker after graceful shutdown of teamd processes. - How to verify it Run warm reboot. Signed-off-by: Stepan Blyschak <stepanb@nvidia.com>	2022-03-16 22:22:26 +00:00
wenyiz2021	5878cfdb06	Update container_checker for multi-asic devices when state is 'always_enabled' (#10067 ) * Update container_checker for multi-asic devices Update container_checker for multi-asic devices to add database containers in always_running_containers. Previous change was made for single-asic, and that database containers were not considered as feature when writing to state_db. * Update container_checker Update an indent	2022-03-14 23:01:43 +00:00
mssonicbld	1c4364222d	[ci/build]: Upgrade SONiC package versions (#10214 )	2022-03-11 15:13:09 +00:00
Santhosh Kumar T	e83955599d	[202012] Refactoring DELL platform init to reduce rc.local processing time (#10171 ) Why I did it To reduce the processing time of rc.local, refactoring s6100 platform initialization. Fixing [warm-upgrade][202012] Slow DELL platform init in rc.local causes lacp-teardown #10150 How I did it On branch 202012-s6100-rclocal Changes to be committed: (use "git restore --staged <file>..." to unstage) modified: ../../../../files/image_config/platform/rc.local modified: ../debian/platform-modules-s6100.install modified: scripts/fast-reboot_plugin modified: scripts/s6100_platform.sh renamed: scripts/s6100_i2c_enumeration.sh -> scripts/s6100_platform_startup.sh renamed: systemd/s6100-i2c-enumerate.service -> systemd/s6100-platform-startup.service	2022-03-10 18:51:07 -08:00
mssonicbld	7fe1489061	[ci/build]: Upgrade SONiC package versions (#10194 )	2022-03-09 22:41:51 +00:00
mssonicbld	063882cf87	[ci/build]: Upgrade SONiC package versions (#10069 ) [ci/build]: Upgrade SONiC package versions (#10069)	2022-03-08 21:32:36 +08:00
xumia	a8d844c83d	[build]: Fix marvell-armhf build hung issue (#10156 ) The marvell-armhf build hangs; it does not exit even after waiting for a long time. It is caused by the process /etc/entropy.py which is started by the postinst script in target/debs/buster/sonic-platform-nokia-7215_1.0_armhf.deb $ cat postinst sh /usr/sbin/nokia-7215_plt_setup.sh ... $ cat usr/sbin/nokia-7215_plt_setup.sh | tail python /etc/entropy.py & $ cat etc/entropy.py if path.exists("/proc/sys/kernel/random/entropy_avail"): while 1: while avail() < 2048: with open('/dev/urandom', 'rb') as urnd, open("/dev/random", mode='wb') as rnd: d = urnd.read(512) t = struct.pack('ii', 4 * len(d), len(d)) + d fcntl.ioctl(rnd, RNDADDENTROPY, t) time.sleep(30) This is a workaround for the build issue; the Debian package still needs to be fixed, after which this change should be reverted.	2022-03-07 08:00:56 -08:00
roman_savchuk	4d6f9f2de7	[ BFN ] update SDE package for BFN platform (#10049 ) Updated SDE package for Barefoot platform with fixes for: - NAT - VRF	2022-03-04 20:43:08 -08:00
Qi Luo	04925df451	[build] Fix the urllib3 version in sonic-mgmt-framework (#10149 ) Fix the urllib3 version in sonic-mgmt-framework constrain file because it is already updated in Dockerfile	2022-03-04 20:34:23 -08:00
gechiang	7fb546dce4	[202012]BRCM SAI 4.3.5.3-2 Fixes CS00012228504, SONIC-55963:SID, CS00012209080, CS00012220761, and CS00012222414 (#10155 )	2022-03-04 16:24:59 -08:00
Lawrence Lee	4d1abbc09b	[write_standby]: Increase timeout to 60s (#10065 ) - Avoid scenarios where script times out before orchagent can establish IPinIP tunnel Signed-off-by: Lawrence Lee <lawlee@microsoft.com>	2022-03-01 22:49:17 +00:00
noaOrMlnx	7a35504ff7	[202012] [CoPP] Add always_enabled field (#9999 ) Add the "always_enabled" field to copp_cfg.j2 file, in order to allow traps without an entry in features table, to be installed automatically. This is a cherry-pick of https://github.com/Azure/sonic-buildimage/pull/9302 - Why I did it In order to allow traps without an entry in features table, to be installed automatically. - How I did it Add always_enabled field to traps without a feature	2022-02-20 12:42:39 +02:00
mssonicbld	a23aac25d3	[ci/build]: Upgrade SONiC package versions (#10023 ) [ci/build]: Upgrade SONiC package versions	2022-02-19 08:10:17 +08:00
Samuel Angebault	b32d7eedaf	Add emmc quirks to boot0 (#9989 ) Why I did it Fix some unreliability seen on emmc device with some AMD CPUs How I did it Added a kernel parameter to add quirks to It depends on a sonic-linux-kernel change to work properly but will be a no-op without it. Description for the changelog Add emmc quirks for Upperlake	2022-02-17 08:55:01 -08:00
vmittal-msft	304ec5b0cd	Updated traffic scheduler settings for HWSKUs : DellEMC-Z9332f-O32 & DellEMC-Z9332f-M-O16C64 (#9927 )	2022-02-15 16:15:20 -08:00
mssonicbld	f746d27c7d	[ci/build]: Upgrade SONiC package versions (#9933 )	2022-02-09 00:59:47 +00:00
Prince George	c1a0871fe9	Close console session due to user inactivity (#9890 ) Signed-off-by: Prince George <prgeor@microsoft.com>	2022-02-08 19:07:29 +00:00
tbgowda	78dc2d8a7b	Enable SAI_SWITCH_ATTR_UNINIT_DATA_PLANE_ON_REMOVAL attribute (#9419 ) Why I did it Fixes #8980 partly. The corresponding changes in sonic-sairedis is here : Azure/sonic-sairedis#975 How I did it Include changes from both repos and build an image for verification. How to verify it Trigger fast-reboot with the changes, see the attribute SAI_SWITCH_ATTR_UNINIT_DATA_PLANE_ON_REMOVAL being set at the SAI level. Signed-off-by: Thushar Gowda <24815472+tbgowda@users.noreply.github.com>	2022-02-08 19:07:08 +00:00
vmittal-msft	7435613216	[202012] BRCM SAI 4.3.5.3-1 Fix for CS00012218555 (#9923 )	2022-02-07 08:02:57 -08:00
Shi Su	4191889803	[bgpcfgd] Add bgpcfgd support to advertise routes (#9197 ) (#9697 ) Why I did it Cherry pick changes in #9197 to 202012 branch Add bgpcfgd support to advertise routes. How I did it Make bgpcfgd subscribe to the ADVERTISE_NETWORK table in STATE_DB and configure route advertisement accordingly. How to verify it Added unit tests in bgpcfgd and verify on KVM about route advertisement.	2022-01-26 14:38:04 -08:00
mssonicbld	3dae536de4	[ci/build]: Upgrade SONiC package versions (#9834 )	2022-01-23 22:13:50 +00:00
mssonicbld	ae7514b1bd	[ci/build]: Upgrade SONiC package versions (#9832 )	2022-01-22 16:01:17 +00:00
dflynn-Nokia	c715bdbf56	[firsttime boot] suppress error message on platforms not supporting kdump (#9521 ) Why I did it Eliminate benign firsttime boot error reported when running on platforms that do not support kdump. How I did it Change rc.local to check for presence of the file /etc/default/kdump-tools before referencing it. How to verify it Install a new image on an armhf or arm64 platform and check for a failed reference to /etc/default/kdump-tools on firsttime boot.	2022-01-21 02:39:17 +00:00
gechiang	090ef33ca2	[202012]BRCM SAI 4.3.5.3 Fixes CS00012218100,CS00012215529,CS00012208995,CS00012220761,CS00012211718,CS00012208995,CS00012220761, and CS00012225760 (#9815 )	2022-01-20 15:28:34 -08:00
mssonicbld	2eb8fe3a2c	[ci/build]: Upgrade SONiC package versions (#9799 )	2022-01-19 22:46:23 +00:00
gechiang	bdc7ce86de	[202012] BRCM SAI 4.3.5.2 Fixes CS00012205357, CS00012214196, CS00012213974 (#9754 )	2022-01-13 11:40:43 -08:00
mssonicbld	a0376a6e59	[ci/build]: Upgrade SONiC package versions (#9680 )	2022-01-07 22:12:12 +00:00
mssonicbld	9b1a3971bd	[ci/build]: Upgrade SONiC package versions (#9645 )	2021-12-26 23:30:40 +00:00
mssonicbld	813a6387c5	[ci/build]: Upgrade SONiC package versions (#9543 )	2021-12-24 17:05:45 +00:00
vmittal-msft	724037ebc3	BRCM SAI 4.3.5.1-9 for enabling SAI_SWITCH_ATTR_QOS_DSCP_TO_TC_MAP capability (#9463 )	2021-12-14 09:56:21 -08:00
Lawrence Lee	b3a3aa0c38	[mux]: Fix `mark_dhcp_packet` (#9373 ) - Consolidate the two [Service] sections by moving the ExecStartPre line for mark_dhcp_packet.py to the first section and removing the second. - Make the mark_dhcp_packet.py file executable - Also clean up mark_dhcp_packet.py - Remove unused imports - Fix spacing and line lengths to conform to PEP8 Signed-off-by: Lawrence Lee <lawlee@microsoft.com>	2021-12-01 02:28:56 +00:00
Stephen Sun	fafd5327bd	[Reclaim buffer] Common infrastructure update for reclaiming buffer (#9133 ) - Why I did it This is to update the common sonic-buildimage infra for reclaiming buffer. - How I did it Render zero_profiles.j2 to zero_profiles.json for vendors that support reclaiming buffer The zero profiles will be referenced in PR [Reclaim buffer] Reclaim unused buffers by applying zero buffer profiles #8768 on Mellanox platforms and there will be test cases to verify the behavior there. Rendering is done here for passing azure pipeline. Load zero_profiles.json when the dynamic buffer manager starts Generate inactive port list to reclaim buffer Signed-off-by: Stephen Sun <stephens@nvidia.com>	2021-12-01 02:28:46 +00:00
gechiang	a5f4780c64	[202012] BRCM SAI 4.3.5.1-8 Pick up fix for PFCWD getting continuously triggered/restored when pause frames are sent continuously to both queues of a port (#9296 ) 1. CS00012211718 [4.3] Pfcwd getting continuously triggered/restored when pause frames are sent continuously to both queues of a port (TD2/Th/Th2/TD3) MSFT Default Preliminary tests look fine. BGP neighbors were all up with proper routes programmed interfaces are all up Manually ran the following test cases on 7050CX3 (TD3) T0 DUT and all passed: ``` fib/test_fib.py vxlan/test_vxlan_decap.py fdb/test_fdb.py decap/test_decap.py ipfwd/test_dip_sip.py ipfwd/test_dir_bcast.py acl/test_acl.py vlan/test_vlan.py platform_tests/test_reboot.py ```	2021-11-17 21:30:10 -08:00
trzhang-msft	19008889de	update DHCP_PACKET_MARK schema (#9077 ) - update DHCP_PACKET_MARK schema in state_db - this is an update over PR: Add service mark_dhcp_packet to mux container #9015	2021-11-15 21:37:08 +00:00
trzhang-msft	86fa5eede2	Add service mark_dhcp_packet to mux container (#9015 ) - add a new service "mark_dhcp_packet" to mux container - apply packet marks on a per-interface basis in ebtables - write packet marks to "DHCP_PACKET_MARK" table in state_db	2021-11-15 21:36:29 +00:00
Renuka Manavalan	6cb7af73d9	add arista.log to logrotate (#9245 )	2021-11-15 21:32:03 +00:00
mssonicbld	36f1a547b1	[ci/build]: Upgrade SONiC package versions (#9255 )	2021-11-14 23:26:35 +00:00
mssonicbld	4d15a1c1f6	[ci/build]: Upgrade SONiC package versions (#9221 )	2021-11-13 23:37:09 +00:00
gechiang	7ac5b40f4b	[202012]BRCM SAI 4.3.5.1-7 Picked up fixes for CS00012209390, CS00012212995, SONIC-51583, CS00012215744, and SONIC-51638 (#9252 ) This is to pick up BRCM SAI 4.3.5.1-7 fixes which contains the following fixes: 1. CS00012209390: SONIC-50037, Used SAI_SWITCH_ATTR_QOS_DSCP_TO_TC_MAP as a default decap map for IPinIP tunnels. 2. CS00012212995: SONIC-50948 SAI_API_QUEUE:_brcm_sai_cosq_stat_get:1353 egress Min limit get failed with error Invalid parameter 3. SONIC-51583: Fixed acl group member creation failure with priority of -1 4. CS00012215744:SONIC-51395 [TH, TH2] WB 3.5 to 4.3 fails at APPLY_VIEW while setting SAI_PORT_ATTR_EGRESS_ACL 5. SONIC-51638: SDK-249337 ERROR: AddressSanitizer: heap-buffer-overflow in _tlv_print_array Preliminary tests look fine. BGP neighbors were all up with proper routes programmed interfaces are all up Manually ran the following test cases on 7050CX3 (TD3) T0 DUT and all passed: ``` fib/test_fib.py vxlan/test_vxlan_decap.py fdb/test_fdb.py decap/test_decap.py ipfwd/test_dip_sip.py ipfwd/test_dir_bcast.py acl/test_acl.py vlan/test_vlan.py platform_tests/test_reboot.py ```	2021-11-13 10:45:46 -08:00
Mykhailo Onipko	a7117b905f	[BFN]: Updated SDK packages to 20211112 (#9244 ) Signed-off-by: Mykhailo Onipko <monipko@barefootnetworks.com>	2021-11-12 21:47:56 -08:00
Lawrence Lee	b027e87ffb	[mux.service]: Remove pmon dependency (#9211 ) Signed-off-by: Lawrence Lee <lawlee@microsoft.com>	2021-11-11 02:56:27 +00:00
Lawrence Lee	f317d93cb0	Merged PR 4679112: [write_standby]: Ignore non-auto interfaces [write_standby]: Ignore non-auto interfaces * In the event that `write_standby.py` is used to automatically switchover interfaces when linkmgrd or bgp crashes, ignore any interfaces that are not configured to auto-switch Signed-off-by: Lawrence Lee <lawlee@microsoft.com>	2021-11-10 18:54:33 -08:00
Lawrence Lee	57ad50cfd9	Merged PR 4559560: [bgp]: Switch to standby if BGP container exits [bgp]: Switch mux to standby if BGP container exits Signed-off-by: Lawrence Lee <lawlee@microsoft.com>	2021-11-10 18:54:33 -08:00
Lawrence Lee	6a9c709336	[write_standby]: Improve logging Signed-off-by: Lawrence Lee <lawlee@microsoft.com>	2021-11-10 18:54:33 -08:00
Lawrence Lee	77378b4364	[mux]: Call write_standby from host only Signed-off-by: Lawrence Lee <lawlee@microsoft.com>	2021-11-10 18:54:33 -08:00
Lawrence Lee	25712c712e	[mux]: Make write_standby available on host Signed-off-by: Lawrence Lee <lawlee@microsoft.com> [write_standby]: Cleanup and fix build Signed-off-by: Lawrence Lee <lawlee@microsoft.com>	2021-11-10 18:54:33 -08:00
Tamer Ahmed	18d1f65339	Merged PR 4813977: [mux] Update Service Install With SONiC Target [mux] Update Service Install With SONiC Target A recent PR grouped all SONiC services into sonic.target. The install section of mux.service was not updated, which causes delays when using config reload because the service's failed state is not reset. signed-off-by: Tamer Ahmed <tamer.ahmed@microsoft.com>	2021-11-10 18:54:33 -08:00
Lawrence Lee	70fbd6826c	Merged PR 4366316: [mux.service]: Bind to sonic.target [mux.service]: Bind to sonic.target Signed-off-by: Lawrence Lee <lawlee@microsoft.com>	2021-11-10 18:54:33 -08:00
Tamer Ahmed	b42aef68f3	Merged PR 4234524: [mux] Start Mux on Only Dual-ToR Platform [mux] Start Mux on Only Dual-ToR Platform mux docker depends on the presence of mux cable hardware and is supposed to run only on Gemini ToRs. This PR changes the mux feature config in order to enable mux docker based on device configuration. signed-off-by: Tamer Ahmed <tamer.ahmed@microsoft.com>	2021-11-10 18:54:33 -08:00
Tamer Ahmed	b8f70f8986	Merged PR 3845699: [linkmgrd]: Introduce MUX cable linkmgrd Linkmgrd monitors link status, mux status, and link state. If the link becomes unhealthy, linkmgrd will trigger mux switchover on a standby ToR, ensuring uninterrupted service to servers/blades. This PR is the initial implementation of linkmgrd. Also, the docker-mux container holds packages related to maintaining and managing the mux cable. It currently runs the linkmgrd binary, which monitors and switches the mux if needed. This PR also introduces mux-container and starts linkmgrd at startup when the build is configured with INCLUDE_MUX=y Edit: linkmgrd PR will follow. signed-off-by: Tamer Ahmed <tamer.ahmed@microsoft.com> Related work items: #2315, #3146150	2021-11-10 18:54:33 -08:00
tjchadaga	9a1b1bc44e	Fix for additional intf flap during fast-reboot (#9166 )	2021-11-09 23:20:06 +00:00
mssonicbld	c15bae7c84	[ci/build]: Upgrade SONiC package versions (#9128 )	2021-11-09 22:52:26 +00:00
gechiang	400e40f255	[202012] BRCM SAI 4.3.5.1-6 Picked up fixes for CS00012213351, CS00012182162, and CS00012210826 (#9158 ) This is to pick up BRCM SAI 4.3.5.1-6 fixes which contains the following fixes: 1. CS00012213351 SONIC-50679: [TH, TH2] Warm-reboot from 3.5 to 4.3 fails due to null objects discovered 2. CS00012182162: SONIC-49805 TD3 MMU config profile optimization changes 3. CS00012210826:SONIC-50205/760c60fc: Should read MMU_INTFI_MMU_PORT_TO_MMU_QUEUES_FC_BKP for TH3 Preliminary tests look fine. BGP neighbors were all up with proper routes programmed interfaces are all up Manually ran the following test cases on 7050CX3 (TD3) T0 DUT and all passed: ``` fib/test_fib.py vxlan/test_vxlan_decap.py fdb/test_fdb.py decap/test_decap.py ipfwd/test_dip_sip.py ipfwd/test_dir_bcast.py acl/test_acl.py vlan/test_vlan.py platform_tests/test_reboot.py ```	2021-11-03 07:24:33 -07:00
Sumukha Tumkur Vani	65626c8925	Flush RESTAPI DB upon config reload (#9093 )	2021-10-28 09:31:38 -07:00
Nazarii Hnydyn	0cbda8d362	[teamd]: Send USR1/USR2 only to subscribers. (#8856 ) To fix teamd signal handling, without which Process 'tlm_teamd' exited unexpectedly	2021-10-27 03:54:58 +00:00
mssonicbld	1c86196411	[ci/build]: Upgrade SONiC package versions (#9050 )	2021-10-25 17:09:12 +00:00
gechiang	c95178157d	[202012]BRCM SAI 4.5.3.1-5 picked up SAI fixes for several CSP cases (#9003 )	2021-10-19 14:08:31 -07:00
Ying Xie	f1d5aaced0	[copp] bind copp-config.service to sonic.target (#8969 ) copp-config service needs to be started after sonic.target so that it could render the copp-config with the latest information. It also needs to be restarted when config reload or load_minigraph is invoked. Signed-off-by: Ying Xie <ying.xie@microsoft.com>	2021-10-15 00:40:05 +00:00
gechiang	eca9020a48	[202012] BRCM SAI 4.5.3.1-4 Fixes dscp-uniform mode, th3 debug counter bmp crash (#8968 ) * [202012] BRCM SAI 4.5.3.1-4 Fixes dscp-uniform mode, th3 debug counter bmp crash	2021-10-13 08:25:44 -07:00
mssonicbld	b11d6cf5ee	[ci/build]: Upgrade SONiC package versions (#8919 )	2021-10-09 19:12:09 +00:00
mssonicbld	0f48239167	[ci/build]: Upgrade SONiC package versions (#8894 )	2021-10-02 19:01:40 +00:00
mssonicbld	d790caecbc	[ci/build]: Upgrade SONiC package versions (#8867 )	2021-09-29 17:11:32 +00:00
Vaibhav Hemant Dixit	636870d86f	Save DB dump after warm/fast reboot (#8803 ) As a part of warmboot, redis database is dumped: `c97fe546e5/scripts/fast-reboot (L269)` However, this dump file is deleted, after it is loaded back into db post reboot. The DB dump can be useful for debugging purpose, hence taking a backup of it can be useful. Instead of deleting the dump, rename and keep the dump.	2021-09-27 02:29:12 +00:00
gechiang	ac9feadbf1	[202012] BRCMSAI 4.3.5.1-3 fix CS00012203600, CS00012202255, CS00012208537 (#8840 )	2021-09-25 17:09:34 -07:00
mssonicbld	667fe3702c	[ci/build]: Upgrade SONiC package versions (#8829 )	2021-09-23 17:34:56 +00:00
mssonicbld	c988a7766c	[ci/build]: Upgrade SONiC package versions (#8800 )	2021-09-20 12:48:20 +00:00
mssonicbld	7ce529ea35	[ci/build]: Upgrade SONiC package versions (#8795 )	2021-09-19 15:26:49 +00:00
mssonicbld	f716745d76	[ci/build]: Upgrade SONiC package versions (#8637 )	2021-09-17 16:40:09 +00:00
abdosi	7732fa95bb	[baseimage]: Logrotate for wtmp and btmp files. (#8743 ) Added logrotate file for wtmp and btmp to override default conf and set size cap as 100K as done in PR: #865. For buster this is controlled by separate files, wtmp and btmp. Signed-off-by: Abhishek Dosi <abdosi@microsoft.com>	2021-09-17 08:24:10 +00:00
Sudharsan Dhamal Gopalarathnam	9c5917d8dd	Removing execute permission from copp config file (#8680 ) *Removed execute permissions from the systemd copp-config.service file. Without this we will get a warning: "Configuration file /lib/systemd/system/copp-config.service is marked executable. Please remove executable permission bits. Proceeding anyway."	2021-09-14 08:59:21 +00:00
Ying Xie	e8b8012818	[202012][fstrim] delay fstrim timer after sonic.target (#8737 ) Why I did it fstrim has dependency on pmon docker. How I did it start fstrim timer after sonic.target. How to verify it local test and PR test. Signed-off-by: Ying Xie ying.xie@microsoft.com	2021-09-14 08:59:17 +00:00
gechiang	84b5659372	[202012] BRCM SAI 4.3.5.1-2 Fix BRCM SAI regression due to ACL Egress Mirroring Action capability (#8682 )	2021-09-06 22:12:59 -07:00
Samuel Angebault	96f2eaaadb	[Arista] Fix flash size computation for Lodoga (#8622 ) The Lodoga platform also matched crow which was hardcoding the flash size to 3700. This change enables autodetect on Clearlake which in turn allows autodetect for Lodoga. The threshold was bumped from 3700 to 4000 because size computation can differ slightly and report slightly above 3700.	2021-09-01 01:40:45 +00:00
mssonicbld	7eb4a345fa	[ci/build]: Upgrade SONiC package versions (#8584 ) Co-authored-by: mssonicbld <vsts@fv-az232-326.x3jni0md3anuvcz2px3t3ecixa.bx.internal.cloudapp.net>	2021-08-30 16:24:18 +08:00
Samuel Angebault	01117d58b5	[Arista] Rely on automatic flash size detection for Lodoga (#8608 ) Lodoga actually has a 8GB storage device. LodogaSsd variant has a 30GB SSD drive. However, in boot0 both were mishandled and assigned 4GB for legacy reasons. Remove the hardcoding of the flash size and let boot0 autodetect the available space.	2021-08-27 02:27:15 +00:00
dflynn-Nokia	2c91efcd15	[Nokia ixs7215] Add support for changing the console baud rate (#8595 ) This commit adds support for changing the default console baud rate configured within the U-Boot bootloader. That default baud rate is exposed via the value of the U-Boot 'baudrate' environment variable. This commit removes logic that hardcoded the console baud rate to 115200 and instead ensures that the U-Boot 'baudrate' variable is always used when constructing the Linux kernel boot arguments used when booting Sonic. A change is also made to rc.local to ensure that the specified baud rate is set correctly in the serial getty service.	2021-08-27 02:27:06 +00:00
gechiang	fcdd63835b	[202012]BRCM SAI 4.3.5.1-1 Fix configurable drop counter out of resource (#8601 ) * [202012]BRCM SAI 4.3.5.1 Fix for configurable drop counter out of resource	2021-08-26 14:30:22 -07:00
mssonicbld	98dd76c485	[ci/build]: Upgrade SONiC package versions (#8561 )	2021-08-24 14:53:16 +00:00
mssonicbld	8f604998b4	[ci/build]: Upgrade SONiC package versions (#8556 )	2021-08-23 17:32:59 +00:00
Volodymyr Samotiy	8365245122	[monit] Periodically monitor VNET route consistency (#8266 ) To run VNET route consistency check periodically. For any failure, the monit will raise alert based on return code. Signed-off-by: Volodymyr Samotiy <volodymyrs@nvidia.com>	2021-08-23 03:05:16 +00:00
mssonicbld	733c851fc9	[ci/build]: Upgrade SONiC package versions (#8527 )	2021-08-19 18:49:15 +00:00
mssonicbld	aa5d05ed2c	[ci/build]: Upgrade SONiC package versions (#8385 )	2021-08-16 10:48:24 +00:00
Stephen Sun	d599450052	Use predefined macro as vendor information (#8361 ) #### Why I did it Use a predefined variable to get vendor information when the swss docker container is created #### How I did it Use `{{ sonic_asic_platform }}` instead of `$SONIC_CFGGEN -y /etc/sonic/sonic_version.yml -v asic_type` #### How to verify it Manually test.	2021-08-16 07:51:01 +00:00
Sudharsan Dhamal Gopalarathnam	ba2284c4c0	Grouping delayed services under a target for config reload checks (#7846 ) #### Why I did it Create a target for delayed service timers. A few services in SONiC are delayed to speed up the bring-up of the system and essential services. However, there is no way to track when they start. This is a problem when executing config reload, as config reload expects all services to be up. Hence all the timers that trigger the delayed services are grouped under one target so that they can be tracked in the 'config reload' command #### How I did it Created delay.target service and added a dependency on the delayed targets.	2021-08-16 07:50:56 +00:00
Ying Xie	92fb9c94bd	[aboot] use ram partition for /var/log for devices with 3.7G disks (#8400 ) Master/202012 image size grew quite a bit. 3.7G harddrive can no longer hold one image and safely upgrade to another image. Every bit of harddrive space is precious to save now. Also sh syntax seemingly changed, [ condition ] && action was a legit syntax in 201911 branch but it is an error when condition not met with 202012 or later images. Change the syntax to if statement to avoid the issue. Signed-off-by: Ying Xie ying.xie@microsoft.com	2021-08-14 17:22:01 -07:00
novikauanton	aae4e8dc7c	[build]: Fix bfn package version for reproducible build (#8468 ) Barefoot pipeline is broken because the version has not been updated by the CI build yet.	2021-08-14 14:27:45 -07:00
Vladyslav Morokhovych	754378f1d8	[swss] Fix arp_update script (#8412 ) Fix #7968 Issue is detected on SONiC.20201231.11 In test_static_route.py::test_static_route_ecmp static routes are configured, but neighbors are not resolved after config reload even after 10 minutes. It looks like the arp_update script is starting to ping when Vlan1000 is not fully configured. When issue is reproduced, stuck ping6 process is observed in swss container : USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND root 180 0.1 0.0 6296 1272 pts/0 S 17:03 0:03 ping6 -I Vlan1000 -n -q -i 0 -c 1 -W 0 ff02::1 And when arp_update script successfully resolves neighbors, we observe sleep 300 instead of ping process	2021-08-12 23:25:28 -07:00
mssonicbld	1c30a8b0c1	[ci/build]: Upgrade SONiC package versions (#8376 )	2021-08-08 19:05:13 +00:00
Guohan Lu	ceab083fc5	[build]: add sonic_release 202012 Signed-off-by: Guohan Lu <lguohan@gmail.com>	2021-08-07 18:04:28 -07:00
Longxiang Lyu	25f53289eb	[swss][arp_update] Send ipv6 pings over vlan sub interfaces (#8363 ) #### Why I did it * `arp_update` fails to ping those neighbors over vlan sub interfaces. #### How I did it * modify `arp_update_vars.j2` to get vlan sub interfaces with ipv6 addresses assigned. * modify `arp_update` to send ipv6 pings over those retrieved vlan sub interfaces. Signed-off-by: Longxiang Lyu <lolv@microsoft.com>	2021-08-07 12:43:51 +00:00

1104 Commits