sonic-buildimage

Author	SHA1	Message	Date
Neetha John	876d982bce	[sonic-config-engine] Fix typo in hwsku name in sample graph (#10941 ) Signed-off-by: Neetha John <nejo@microsoft.com> Why I did it There was a typo in hwsku specified as part of #10889 How I did it Replaced with the correct hwsku How to verify it test_cfggen.py is passing	2022-05-27 17:00:55 +00:00
xumia	06addae853	Revert "Reduce image size for lazy installation packages (#10775 )" (#10916 ) This reverts commit `15cf9b0d70`. Why I did it Revert the PR #10775, for it has impact on onie installation. It is caused by the symbol links not supported in some of the onie unzip. We will enable after fixing the issue, see #10914	2022-05-27 17:00:50 +00:00
mssonicbld	8ce3cab508	[ci/build]: Upgrade SONiC package versions (#10732 ) Co-authored-by: mssonicbld <vsts@fv-az31-361.b1uo4dmaffwenkazr3a2h2ovdb.jx.internal.cloudapp.net>	2022-05-27 01:10:32 -07:00
Guohan Lu	9b84294ffc	Revert "[bgpcfgd] ECMP overlay VxLan with BGP support (#10716 )" This reverts commit `35c9becc3c`.	2022-05-26 06:20:40 -07:00
Neetha John	a76899b04f	[sonic-config-engine] Change hwsku for sample graph in unit tests (#10889 ) #### Why I did it To ensure that some internal testcases do not break due to external changes #### How to verify it Ran test_cfggen.py with the changes and it passed	2022-05-25 22:57:01 +00:00
Taylor Cai	f5ecf1ee1c	[device/celestica]:Fix failed test case of Haliburton snmp (#10844 )	2022-05-25 22:56:57 +00:00
xumia	455d44efea	[Ci]: Fix to trigger the publish pipeline in failure build issue (#10847 ) Why I did it It is not necessary to trigger the publish pipeline when build is failed. How I did it Remove the condition in the azp task, change to use template condition.	2022-05-24 23:14:29 +00:00
abdosi	51f4bf111e	Added Support for BGP allow list feature to have route-map action of setting tag (#10731 ) What I did: Added support to create route-map action set tag <user define value> when the the allow prefix list matches. The tag can ben define by user in constants.yml. Why I did: Since for Allow List feature we call from base route-map allow-list route-map having set tag option provides way for base route-map to do match tag and take any further action if needed. Adding tag provide metadata that can used by base route-map	2022-05-24 23:14:25 +00:00
StormLiangMS	35c9becc3c	[bgpcfgd] ECMP overlay VxLan with BGP support (#10716 ) Why I did it https://github.com/Azure/SONiC/blob/master/doc/vxlan/Overlay%20ECMP%20with%20BFD.md From the design, need to advertise the route with community string, the PR is to implement this. How I did it To use the route-map as the profile for the community string, all advertised routes can be associated with one route-map. Add one file, mangers_rm.py, which is to add/update/del the route-map. Modified the managers_advertise_rt.py file to associate profile with IP route. The route-map usage is very flexible, by this PR, we only support one fixed usage to add community string for route to simplify this design. How to verify it Implement new unit tests for mangers_rm.py and updated unit test for managers_advertise_rt.py. Manually verified the test case in the test plan section, will add testcase in sonic-mgmt later. Azure/sonic-mgmt#5581	2022-05-24 23:14:21 +00:00
Lawrence Lee	5205a379e4	[scapy]: Patch scapy 2.4.5 for sniffing on intfs (#10644 ) Apply scapy fix (https://github.com/secdev/scapy/pull/3240) since it is not available in release yet Signed-off-by: Lawrence Lee <lawlee@microsoft.com>	2022-05-24 23:14:17 +00:00
StormLiangMS	e659cfb2de	[bgpcfgd] to support removal part of configuration of bgp allowed prefix list (#10165 ) * fix allow list issue Signed-off-by: stormliang <stormliang@microsoft.com> * add the ipaddress in the install list * add unit test Co-authored-by: Ubuntu <azureuser@SONIC-SH-STORM-02.5pu3m0fajw1edcfltykk1gauxa.gx.internal.cloudapp.net> Why I did it Failed to remove part of configuration of bgp allowed prefix list. The details in #10141 How I did it There are two issues: In FRR, ipv6 default route is ::/0, but in the configuration, it is 0::/0, string comparison would be false, but why ipv4 failed to remove the allowed prefix list, ipv6 works? Looks into next one for the answer. The current managers_allow_list doesn’t support removal part of the prefix list. But why IPv6 works in 1? It is because the bug for the IPv6 default route comparison, it would do the update no matter what is the operation (the code will compare the prefix list in the FRR and configuration db, if all configurations in db are presented in FRR, it do nothing, otherwise it will update the prefix list based on the configuration from db). How to verify it Follow the step in #10141	2022-05-24 23:14:13 +00:00
Aravind Mani	9caf12859d	DellEMC: S52xx Reboot cause fix (#10783 )	2022-05-23 21:03:11 -07:00
Rajkumar-Marvell	abb977f4c4	[Marvell] Marvell armhf SAI debian (#10854 ) Addressed system-health failure, when src-mac learned same as switchMac. Signed-off-by: rajkumar38 <rpennadamram@marvell.com>	2022-05-23 19:05:22 +08:00
Volodymyr Samotiy	6b029a613b	[202012] [Mellanox] Update SAI to 1.21.1.2 and SDK/FW to 4.5.2262/xx.2010.2262 (#10880 ) - Why I did it To include latest fixes: 1. Warmboot \| When trying to reconfigure the Flex Parser header and Flex transition parameters after ISSU, the switch will returned an error even if the configuration was identical to that done before performing the ISSU. 2. Link Up \| When toggling many ports of the Spectrum devices while raising 10GbE link up and link maintenance is enabled, the switch may get stuck and may need to be rebooted. 3. Shared buffer \| While moving from lossless to lossy while shared headroom was used, reduction of the shared headroom can only be done prior to pool type change and when shared headroom is not utilized. - How I did it Updated SAI & SDK submodules along with the relevant Makefiles - How to verify it Build an image and run tests from "sonic-mgmt". Signed-off-by: Volodymyr Samotiy <volodymyrs@nvidia.com>	2022-05-22 09:48:37 +03:00
Aravind Mani	cceef8e36d	Dell S6100: Addition of SFP type_abbrv_name field (#10846 ) * Dell S6100: Addition of SFP type_abbrv_name field * Update sfp.py * Update sfp.py Co-authored-by: Aravind Mani <aravind.m1@dell.com>	2022-05-19 12:14:53 -07:00
Dror Prital	77298da275	[202012][submodule] Update sonic-swss pointer (#10837 ) Signed-off-by: Dror Prital <drorp@nvidia.com> Co-authored-by: liat-grozovik <44433539+liat-grozovik@users.noreply.github.com> Co-authored-by: Prince Sunny <prince.sunny@microsoft.com> Co-authored-by: bingwang-ms <66248323+bingwang-ms@users.noreply.github.com>	2022-05-19 17:35:39 +08:00
Jing Zhang	acfee3be9a	[sonic-linkmgrd][202012] submodule update (#10814 ) [sonic-linkmgrd][202012] submodule update 3d13ff2 Jing Zhang Wed May 4 10:07:14 2022 -0700 Add doc for default route related changes (#63) c703be4 Jing Zhang Mon May 2 13:27:54 2022 -0700 Reset WaitActiveUp count before switching to active (#70) 86eb727 Jing Zhang Wed Apr 27 10:35:05 2022 -0700 lower log level to warning (#69) e22c736 Jing Zhang Mon May 2 13:33:24 2022 -0700 [202012] Avoid proactively switching to active if default route is missing (#67) d4f282b Jing Zhang Thu Apr 28 18:35:11 2022 -0700 [202012] Add support to enable switchover time measurement (with link prober interval decreased to 10ms) feature (#66) sign-off: Jing Zhang [zhangjing@microsoft.com](mailto:zhangjing@microsoft.com)	2022-05-18 17:36:48 -07:00
Shilong Liu	f5c876edf3	[build] docker-sonic-mgmt replace USER by whoami (#9702 )	2022-05-18 22:33:14 +08:00
Saikrishna Arcot	ee84ba81ed	Use 202012 branch for sonic-mgmt (#10851 ) Signed-off-by: Saikrishna Arcot <sarcot@microsoft.com>	2022-05-17 14:59:31 -07:00
xumia	f6a52e31ec	[Ci] Support to trigger a pipeline to download and publish artifacts to storage (#10820 ) Why I did it Support to trigger a pipeline to download and publish artifacts to storage and container registry. Support to specify the patterns which docker images to upload. How I did it Pass the pipeline information and the artifact information by pipeline parameters to the pipeline which will be triggered a new build. It is to decouple the artifacts generation and the publish logic, how and where the artifacts/docker images will be published, depends on the triggered pipeline. How to verify it	2022-05-16 23:28:06 +00:00
Vivek R	cd0a0608a9	Removed platform specific reboot files for mellanox simx platforms (#10806 ) - Why I did it Platform_reboot files for simx doesn't do aything different apart from calling /sbin/reboot. which is anyway done in the /usr/local/bin/reboot script i.e. the parent script which calls the platform specific reboot scripts if present. Moreover, /sbin/reboot invoked in the platform specific reboot script is a non-blocking call and thus it returns back to the original script (although /sbin/reboot does it job in the background) and we see messages like this. Signed-off-by: Vivek Reddy Karri <vkarri@nvidia.com>	2022-05-16 23:28:02 +00:00
shlomibitton	c71c91e2b0	[202012] [Fastboot] Delay PMON service for better fastboot performance (#10745 ) #### Why I did it Profiling the system state on init after fast-reboot during create_switch function execution, it is possible to see few python scripts running at the same time. This parallel execution consume CPU time and the duration of create_switch is longer than it should be. Following this finding, and the motivation to ensure these services will not interfere in the future, PMON is delayed in 90 seconds until the system finish the init flow after fastboot. #### How I did it Add a timer for PMON service. Exclude for MLNX platform the start trigger of PMON when SYNCD starts in case of fastboot. Copy the timer file to the host bin image. #### How to verify it Run fast-reboot on MLNX platform and observe faster create_switch execution time.	2022-05-15 23:31:32 -07:00
shlomibitton	bca8a244c6	[202012] [Fastboot] Delay LLDP service for better fastboot performance (#10568 ) (#10744 ) This PR is to backport a fix #10568 This PR is dependent on PR: #10745 - Why I did it Profiling the system state on init after fast-reboot during create_switch function execution, it is possible to see few python scripts running at the same time. This parallel execution consume CPU time and the duration of create_switch is longer than it should be. Following this finding, and the motivation to ensure these services will not interfere in the future, LLDP is delayed in 90 seconds until the system finish the init flow after fastboot. - How I did it Add a timer for LLDP service. Copy the timer file to the host bin image. - How to verify it Run fast-reboot on MLNX platform and observe faster create_switch execution time.	2022-05-15 15:05:29 +03:00
Junchao-Mellanox	4f326e8779	Fix race condition between networking service and interface-config service (#10573 ) (#10766 ) Backport https://github.com/Azure/sonic-buildimage/pull/10573 to 202012. #### Why I did it The PR is aimed to fix a bug that mgmt port eth0 may loss IP even if user configured static IP of eth0. This is not a always reproduceable issue, the reproducing flow is like: 1. Systemd starts networking service, which runs a dhcp based configuration and assigned an ip from dhcp. 2. Systemd starts interface-config service who depends on networking service 3. Interface-config service runs command “ifdown –force eth0”, check [line](`16717d2dc5/files/image_config/interfaces/interfaces-config.sh (L4)`). but networking service is still running so that this [line](`ac32bec0e2/ifupdown2/ifupdown/main.py (L74)`) failed with error: “error: Another instance of this program is already running.”. This error is printed by ifupdown2 lib who is the main process of networking service. So, ifdown actually does not work here, the ip of eth0 is not down. 4. Interface-config service updates /etc/networking/interface to static configuration. 5. Interface-config service runs command “systemctl restart networking”. This command kills the previous networking related processes (log: networking.service: Main process exited, code=killed, status=15/TERM), and try to reconfigure the ip address with static configuration. But it detects that the configured IP and the existing IP are the same, and it does not really configure the ip to kernel. Hence, the ip is still getting from dhcp. (this could be a bug of ifupdown2: previous ip is from dhcp, new ip is a static ip, it treats them as same instead of re-configuring the IP) 6. When the lease of the ip expires, the ip of eth0 is removed by kernel and the issue reproduces. The issue is not always reproduceable because networking service usually runs fast so that it won't hit step#3. #### How I did it Check networking service state before running "ifdown –force eth0", wait for it done if it is activating. #### How to verify it Manual test.	2022-05-14 14:58:24 -07:00
Sudharsan Dhamal Gopalarathnam	f16d11237a	[202012][submodule] Advance sonic-swss submodule pointer (#10803 ) Update sonic-swss submodule to include below commits b9163d3 [Vnet] Set BFD multihop to true for Vnet routes cfed8c7 [202012][cherry-pick]Update orchagent to support new field pfcwd_sw_enable 172cd13 [ACL]Avoid incrementing crm count when ACL rule create fails 7377901 [pfcwd] Add vs test infrastructure 0b58595 Removing Vnet with scope default	2022-05-14 10:29:34 +03:00
Saikrishna Arcot	8970425a75	Fix calculation of $(1)_DEP_PKGS_SHA in Makefile.cache (#10764 ) In Makefile.cache, for $(1)_DEP_PKGS_SHA, the intention is to include the DEP_MOD_SHA and MOD_HASH of each of the current package's dependencies. However, there's a level of dereferencing missing; instead of grabbing the value of $(dfile)_DEP_MOD_SHA, it is literally using the variable name $(dfile)_DEP_MOD_SHA. This means that the value of this variable will not change when some dependency changes. The impact of this is in transitive dependencies. For a specific example, if there is some change in sairedis, then sairedis will be rebuilt (because there's a change within that component), and swss will be rebuilt (because it's a direct dependency), but docker-swss-layer-buster will not get rebuilt, because only the direct dependencies are effectively being checked, and those aren't changing. Signed-off-by: Saikrishna Arcot <sarcot@microsoft.com>	2022-05-10 06:44:45 +00:00
xumia	951d93e362	Reduce image size for lazy installation packages (#10775 ) Why I did it The image size is too large, when there are multiple lazy packages and multiple platforms. It is not necessary to keep the lazy installation packages in multiple copies. For cisco image, the image size will reduce from 3.5G to 1.7G. How I did it Use symbol links to only keep one package for each of the lazy package. Make a new folder fsroot/platform/common Copy the lazy packages into the folder. When using a package in each of the platform, such as x86_64-grub, x86_64-8800_rp-r0, x86_64-8201_on-r0, etc, only make a symbol link to the package in the common folder.	2022-05-10 06:44:40 +00:00
Shilong Liu	a296267097	[ci] Support multi tags when pushing docker image (#10771 )	2022-05-10 06:44:35 +00:00
Qi Luo	be5eb80b14	[202012] Fix tagged VlanInterface if attached to multiple vlan as untagged member (#10589 ) Backport https://github.com/Azure/sonic-buildimage/pull/8927 to 202012 branch	2022-05-09 14:07:02 -07:00
Sudharsan Dhamal Gopalarathnam	502ddbb249	[202012][caclmgrd]Added logic to allow BFD port numbers (#10740 ) * [caclmgrd]Added logic to allow BFD port numbers	2022-05-06 10:38:05 -07:00
Sudharsan Dhamal Gopalarathnam	2a232730b0	[202012][Mellanox] Update SDK/FW to 4.5.1500/2010.1500 and SAI version to 1.21.1.2 (#10464 ) * [Mellanox] Update SDK/FW to 4.5.1500/2010.1500 and SAI version to 1.21.0.1 Signed-off-by: Sudharsan Dhamal Gopalarathnam <sudharsand@nvidia.com> * Updating Switch-SDK-drivers submodule pointer * Updating SAI version	2022-05-04 06:07:10 +03:00
Qi Luo	9b55564289	[sonic-snmpagent] Update submodule (#10730 ) Include below commits: ``` c75440b 2022-05-02 \| Fix: not to use blocking get_all() after keys() (#255) [Qi Luo] ```	2022-05-02 23:46:33 -07:00
kellyyeh	96c0d8a7f8	[dhcp6relay] Add retry mechanism for binding socket to interface ipv6 addresses (#10712 )	2022-05-03 00:42:49 +00:00
vmittal-msft	7b7737ef0f	Adjustment to ingress pool size to accomodate brcm sai (#10694 )	2022-05-03 00:42:27 +00:00
xumia	c70a35dda3	Fix the build target error when building sonic-rest-api (#10693 ) Why I did it Fix target target/debs/bullseye/sonic-rest-api_1.0.1_arm64.deb not existing issue, the correct target is target/debs/bullseye/sonic-rest-api_1.0.1_armhf.deb. Fix issue: #9896 [ FAIL LOG START ] [ target/debs/stretch/sonic-rest-api_1.0.1_amd64.deb ] [ REASON ] : target/debs/stretch/sonic-rest-api_1.0.1_amd64.deb does not exist NON-EXISTENT PREREQUISITES: [ FLAGS FILE ] : []	2022-05-03 00:42:21 +00:00
Jing Zhang	3da032766e	[sonic-linkmgrd][202012] submodule update (#10703 ) [sonic-linkmgrd][202012] submodule update 3523738 Jing Zhang Sun Apr 3 20:54:40 2022 -0700 Reset link prober state when default route is back #56 8282e78 Jing Zhang Fri Apr 15 15:59:34 2022 -0700 Keep incrementing sequence number when link prober is suspended and shutdown #55 (#65) 8246eb8 Jing Zhang Thu Apr 14 18:49:36 2022 -0700 Shutdown ICMP heartbeats when default route state is missing and ToR is in auto mode #44 (#59) sign-off: Jing Zhang zhangjing@microsoft.com	2022-05-02 09:37:14 -07:00
Vaibhav Hemant Dixit	26055cf46e	[submodule]: update sonic-utilities submodule (#10713 ) [202012][dualtor] Fix config_db.json path for config-reload	2022-04-30 10:40:27 -07:00
Nikola Dancejic	602c8e99dc	[device config] Adding configuration for default route fallback (#10692 ) Set sai_tunnel_underlay_route_mode attribute to fallback to default route if more specific route is unavailable. Signed-off-by: Nikola Dancejic <ndancejic@microsoft.com>	2022-04-29 16:20:18 -07:00
bingwang-ms	a07930bead	Update submodule sonic-sws-common (#10707 ) Signed-off-by: bingwang <bingwang@microsoft.com>	2022-04-29 15:37:26 +08:00
Samuel Angebault	705d3c0804	[Arista] Remove arista.log from rsyslog default logrotate (#9731 ) Why I did it In parallel of this change Arista added a custom logrotate configuration as part of its driver library. Having 2 logrotate configuration for the same log file triggers an issue. Fixes aristanetworks/sonic#38 How I did it Arista merged a few changes in sonic-buildimage which added a logrotate configuration aristanetworks/sonic@e43c797 It is therefore the right path to remove the arista.log line from the logrotate.d/rsyslog configuration. How to verify it Logrotate works without any error message, arista log rotation happens and arista daemons still append logs once file was truncated.	2022-04-28 23:58:41 +00:00
mssonicbld	1c9cdc4c7a	[ci/build]: Upgrade SONiC package versions (#10594 )	2022-04-27 15:25:14 +00:00
Taylor Cai	1aeb658964	Fix issue test_crm and test_fib (#10585 ) Why I did it Fix issue (https://github.com/Azure/sonic-buildimage/issues/9171) and (https://github.com/Azure/sonic-buildimage/issues/9236) How I did it Add flag in config file for get correct count of IPv6 entry. Add init config file to set IPv4 ECMP hash on L4. How to verify it Compile the sonic_platform wheel for e1031, then upload to device and install the wheel, verify using testbed.	2022-04-26 17:40:47 +00:00
Vaibhav Hemant Dixit	4dcf6c3dc9	Advanced sonic-sairedis submodule (#10684 )	2022-04-26 10:07:45 -07:00
xumia	6ad9daded3	[Submodule]: update submodule for sonic-restapi (#10679 ) Why I did it Update submodule sonic-restapi e83e0e8 Fix Ctype_char larger than address space issue in 32-bit armhf (#107)	2022-04-26 17:54:51 +08:00
Shilong Liu	cc591039b3	[submodule] Update submodule for sonic-mgmt-common (#10666 )	2022-04-25 17:04:37 +08:00
dflynn-Nokia	44ec8372a4	[Nokia ixs7215] Platform API temperature threshold value fixes (#10533 ) Incorrect high-threshold and critical-high-threshold values are displayed for some of the temperature sensors. This commit fixes that. Co-authored-by: Qi Luo <qiluo-msft@users.noreply.github.com> Co-authored-by: Jing Kan <jika@microsoft.com>	2022-04-25 09:28:13 +08:00
Shilong Liu	48f5c0ebff	[CG] Fix CG alert about underscore version. (#10606 ) Fix CG CVE-2021-23358	2022-04-24 19:18:55 +08:00
Shilong Liu	5779a92d99	[ci] Fix PR checker archieve artifacts step (#9357 ) (#10652 ) Why I did it When a failed job retry. Publish artifact will fail for duplicated name	2022-04-23 13:57:50 +08:00
xumia	55a6faf925	[Ci]: Support to sign image for cisco-8000 uefi secure boot (#10616 ) Why I did it [Ci]: Support to sign image for cisco-8000 uefi secure boot	2022-04-21 22:00:47 +00:00
yozhao101	e6c18fa6dd	[Monit] Fix the issue which shows Monit can not reset its counter. (#10288 ) Signed-off-by: Yong Zhao <yozhao@microsoft.com> Why I did it This PR aims to fix the Monit issue which shows Monit can't reset its counter when monitoring memory usage of telemetry container. Specifically the Monit configuration file related to monitoring memory usage of telemetry container is as following: check program container_memory_telemetry with path "/usr/bin/memory_checker telemetry 419430400" if status == 3 for 10 times within 20 cycles then exec "/usr/bin/restart_service telemetry" If memory usage of telemetry container is larger than 400MB for 10 times within 20 cycles (minutes), then it will be restarted. Recently we observed, after telemetry container was restarted, its memory usage continuously increased from 400MB to 11GB within 1 hour, but it was not restarted anymore during this 1 hour sliding window. The reason is Monit can't reset its counter to count again and Monit can reset its counter if and only if the status of monitored service was changed from Status failed to Status ok. However, during this 1 hour sliding window, the status of monitored service was not changed from Status failed to Status ok. Currently for each service monitored by Monit, there will be an entry showing the monitoring status, monitoring mode etc. For example, the following output from command sudo monit status shows the status of monitored service to monitor memory usage of telemetry: Program 'container_memory_telemetry' status Status ok monitoring status Monitored monitoring mode active on reboot start last exit value 0 last output - data collected Sat, 19 Mar 2022 19:56:26 Every 1 minute, Monit will run the script to check the memory usage of telemetry and update the counter if memory usage is larger than 400MB. If Monit checked the counter and found memory usage of telemetry is larger than 400MB for 10 times within 20 minutes, then telemetry container was restarted. Following is an example status of monitored service: Program 'container_memory_telemetry' status Status failed monitoring status Monitored monitoring mode active on reboot start last exit value 0 last output - data collected Tue, 01 Feb 2022 22:52:55 After telemetry container was restarted. we found memory usage of telemetry increased rapidly from around 100MB to more than 400MB during 1 minute and status of monitored service did not have a chance to be changed from Status failed to Status ok. How I did it In order to provide a workaround for this issue, Monit recently introduced another syntax format repeat every <n> cycles related to exec. This new syntax format will enable Monit repeat executing the background script if the error persists for a given number of cycles. How to verify it I verified this change on lab device str-s6000-acs-12. Another pytest PR (Azure/sonic-mgmt#5492) is submitted in sonic-mgmt repo for review.	2022-04-21 22:00:42 +00:00

1 2 3 4 5 ...

5391 Commits