sonic-buildimage

Author	SHA1	Message	Date
Richard.Yu	5d7a345c09	[SAI-PTF][202012]Fix sai ptf 202012 (#12724 ) * fix sai-ptf docker build error Signed-off-by: richardyu-ms <richard.yu@microsoft.com> * correct the docker image version Signed-off-by: richardyu-ms <richard.yu@microsoft.com> * update thrift package Signed-off-by: richardyu-ms <richard.yu@microsoft.com> * fix version upgrade issue in 202012 Signed-off-by: richardyu-ms <richard.yu@microsoft.com> * remove useless file Signed-off-by: richardyu-ms <richard.yu@microsoft.com> Signed-off-by: richardyu-ms <richard.yu@microsoft.com>	2022-11-16 20:32:24 -08:00
kellyyeh	5d8efe9470	Add dhcp6relay dualtor option (#12459 )	2022-10-26 05:48:05 +00:00
Lawrence Lee	888f6ec157	[tunnel_pkt_handler]: Skip nonexistent intfs (#12424 ) - Skip the interface status check if the interface does not exist. In the future, when the interface is created/comes up this check will be triggered again. Signed-off-by: Lawrence Lee <lawlee@microsoft.com>	2022-10-26 05:47:59 +00:00
Vivek	0de604baa6	[DHCP_RELAY] Updated wait_for_intf.sh to wait for ipv6 global and link local addr (#12273 ) - Why I did it Fixes #11431 - How I did it dhcp6relay binds to ipv6 addresses configured on these vlan interfaces Thus check if they are ready before launching dhcp6relay - How to verify it Unit Tests Tested on a live device Signed-off-by: Vivek Reddy Karri <vkarri@nvidia.com>	2022-10-12 23:27:03 +00:00
Sudharsan Dhamal Gopalarathnam	81e139f483	VxLAN Tunnel Counters and Rates implementation (#8369 ) (#11986 ) * Enable flex counters for Vxlan tunnel * VxLAN Tunnel Counters and Rates implementation (#8369) (#11986)	2022-09-09 16:43:09 -07:00
kellyyeh	973fb9e494	[dhcp_relay] Add "vlan missing ip helper" dhcp relay unittest (#10654 ) (#11794 )	2022-08-24 19:53:11 -07:00
Lawrence Lee	663bf00c22	[swss]: Run tunnel_pkt_handler on dualtor only (#11626 ) At SWSS docker init time, check the device subtype and enable tunnel packet handler only if it is dualtor Signed-off-by: Lawrence Lee <lawlee@microsoft.com>	2022-08-05 11:33:37 -07:00
yozhao101	c1ab4c6831	[tunnel_packet_handler] Add a whitespace in the warning syslog message. (#11232 ) *This PR aims to add a whitespace in the warning syslog message of process tunnel_packet_handler. Signed-off-by: Yong Zhao <yozhao@microsoft.com>	2022-07-05 20:57:57 +00:00
xumia	6d3f1346fb	Set the version for python2 package protobuf (#10964 ) Why I did it Python 2 does not support installing protobuf>=4.21.	2022-05-30 20:00:54 +08:00
Jing Zhang	0761850f17	[sonic-linkmgrd][202012] submodule updates (#10924 ) [sonic-linkmgrd][202012] submodule updates 489cf3 Jing Zhang Wed May 18 09:59:02 2022 -0700 Avoid switching active when LinkState == Down (#77) a6c9713 Jing Zhang Tue May 24 11:03:54 2022 -0700 [202012] Add option to enable or disable default route related feature (#72) dbb607d Jing Zhang Thu May 12 08:19:20 2022 -0700 [ci]: uplift diff coverage threshold to 80% (#71) sign-off: Jing Zhang zhangjing@microsoft.com	2022-05-27 14:38:43 -07:00
yozhao101	e6c18fa6dd	[Monit] Fix the issue which shows Monit can not reset its counter. (#10288 ) Signed-off-by: Yong Zhao <yozhao@microsoft.com> Why I did it This PR aims to fix the Monit issue which shows Monit can't reset its counter when monitoring memory usage of telemetry container. Specifically, the Monit configuration file related to monitoring memory usage of telemetry container is as follows: `check program container_memory_telemetry with path "/usr/bin/memory_checker telemetry 419430400" if status == 3 for 10 times within 20 cycles then exec "/usr/bin/restart_service telemetry"` If memory usage of telemetry container is larger than 400MB for 10 times within 20 cycles (minutes), then it will be restarted. Recently we observed that, after telemetry container was restarted, its memory usage continuously increased from 400MB to 11GB within 1 hour, but it was not restarted anymore during this 1 hour sliding window. The reason is that Monit can't reset its counter to count again: Monit can reset its counter if and only if the status of monitored service was changed from `Status failed` to `Status ok`. However, during this 1 hour sliding window, the status of monitored service was not changed from `Status failed` to `Status ok`. Currently for each service monitored by Monit, there will be an entry showing the monitoring status, monitoring mode etc. For example, the following output from command `sudo monit status` shows the status of monitored service to monitor memory usage of telemetry: `Program 'container_memory_telemetry' status Status ok monitoring status Monitored monitoring mode active on reboot start last exit value 0 last output - data collected Sat, 19 Mar 2022 19:56:26` Every 1 minute, Monit will run the script to check the memory usage of telemetry and update the counter if memory usage is larger than 400MB. If Monit checked the counter and found memory usage of telemetry is larger than 400MB for 10 times within 20 minutes, then telemetry container was restarted. Following is an example status of monitored service: `Program 'container_memory_telemetry' status Status failed monitoring status Monitored monitoring mode active on reboot start last exit value 0 last output - data collected Tue, 01 Feb 2022 22:52:55` After telemetry container was restarted, we found memory usage of telemetry increased rapidly from around 100MB to more than 400MB during 1 minute and status of monitored service did not have a chance to be changed from `Status failed` to `Status ok`. How I did it In order to provide a workaround for this issue, Monit recently introduced another syntax format `repeat every <n> cycles` related to `exec`. This new syntax format will enable Monit to repeat executing the background script if the error persists for a given number of cycles. How to verify it I verified this change on lab device str-s6000-acs-12. Another pytest PR (Azure/sonic-mgmt#5492) is submitted in sonic-mgmt repo for review.	2022-04-21 22:00:42 +00:00
kellyyeh	6e17ef311a	[dhcp_relay] Remove dhcp6mon (#10467 )	2022-04-12 18:39:19 +00:00
Stepan Blyshchak	721a53b9a0	[scapy] update scapy to 2.4.5 and patch it (#10457 ) Why I did it Running warm-reboot in a loop for 500 times leads to this error on 318-th iteration: Apr 2 15:56:27.346747 sonic INFO swss#/supervisord: restore_neighbors Traceback (most recent call last): Apr 2 15:56:27.346747 sonic INFO swss#/supervisord: restore_neighbors File "/usr/bin/restore_neighbors.py", line 24, in <module> Apr 2 15:56:27.346747 sonic INFO swss#/supervisord: restore_neighbors from scapy.all import conf, in6_getnsma, inet_pton, inet_ntop, in6_getnsmac, get_if_hwaddr, Ether, ARP, IPv6, ICMPv6ND_NS, ICMPv6NDOptSrcLLAddr Apr 2 15:56:27.346795 sonic INFO swss#/supervisord: restore_neighbors File "/usr/local/lib/python3.7/dist-packages/scapy/all.py", line 25, in <module> Apr 2 15:56:27.346956 sonic INFO swss#/supervisord: restore_neighbors from scapy.route import * Apr 2 15:56:27.346995 sonic INFO swss#/supervisord: restore_neighbors File "/usr/local/lib/python3.7/dist-packages/scapy/route.py", line 205, in <module> Apr 2 15:56:27.347089 sonic INFO swss#/supervisord: restore_neighbors conf.iface = get_working_if() Apr 2 15:56:27.347129 sonic INFO swss#/supervisord: restore_neighbors File "/usr/local/lib/python3.7/dist-packages/scapy/arch/linux.py", line 128, in get_working_if Apr 2 15:56:27.347213 sonic INFO swss#/supervisord: restore_neighbors ifflags = struct.unpack("16xH14x", get_if(i, SIOCGIFFLAGS))[0] Apr 2 15:56:27.347250 sonic INFO swss#/supervisord: restore_neighbors File "/usr/local/lib/python3.7/dist-packages/scapy/arch/common.py", line 31, in get_if Apr 2 15:56:27.347345 sonic INFO swss#/supervisord: restore_neighbors return ioctl(sck, cmd, struct.pack("16s16x", iff.encode("utf8"))) Apr 2 15:56:27.347365 sonic INFO swss#/supervisord: restore_neighbors OSError: [Errno 19] No such device The issue was reported to scapy devs secdev/scapy#3369, the fix is secdev/scapy#3371, however there is no released scapy version with this fix right now, thus decided to build scapy v2.4.5 from sources and apply the fix in a form of a patch. Signed-off-by: Stepan Blyschak <stepanb@nvidia.com>	2022-04-07 22:57:47 +00:00
kellyyeh	0e6f1833e0	Update docker-router-advertiser.supervisord.conf.j2 (#10375 )	2022-04-07 22:57:37 +00:00
Lawrence Lee	5b0f0c1d99	[tun_pkt]: Wait for AsyncSniffer to init fully (#10346 ) Fix for Tunnel packet handler can crash at system startup Signed-off-by: Lawrence Lee <lawlee@microsoft.com>	2022-03-30 21:16:18 +00:00
Lior Avramov	07c170fa04	Remove quagga from SONiC (#10384 ) Quagga is no longer being used in SONiC. Cherry-pick from master PR #7898 Co-authored-by: liora <liora@nvidia.com>	2022-03-30 13:57:34 -07:00
Saikrishna Arcot	e9db38594d	Image disk space reduction (#10172 ) (#10371 ) Reduce the disk space taken up during bootup and runtime. 1. Remove python package cache from the base image and from the containers. 2. During bootup, if logs are to be stored in memory, then don't create the `var-log.ext4` file just to delete it later during bootup. 3. For the partition containing `/host`, don't reserve any blocks for just the root user. This just makes sure all disk space is available for all users, if needed during upgrades (for example). * Remove pip2 and pip3 caches from some containers Only containers which appeared to have a significant pip cache size are included here. Signed-off-by: Saikrishna Arcot <sarcot@microsoft.com> * Don't create var-log.ext4 if we're storing logs in memory Signed-off-by: Saikrishna Arcot <sarcot@microsoft.com> * Run tune2fs on the device containing /host to not reserve any blocks for just the root user Signed-off-by: Saikrishna Arcot <sarcot@microsoft.com> (cherry picked from commit `5617b1ae3e`)	2022-03-29 10:11:28 -07:00
Saikrishna Arcot	e4b30e3090	[restapi]: Don't use python/python2 for restapi start scripts (#10285 ) Python 2 isn't installed by default in Buster and Bullseye containers, and the scripts/modules can be used with Python 3, so make sure Python 3 is used. Why I did it After the Buster and Bullseye upgrade for the restapi container, processes will no longer start because supervisord is trying to call python and python2, both of which are unavailable. Signed-off-by: Saikrishna Arcot <sarcot@microsoft.com>	2022-03-22 18:35:27 -07:00
kellyyeh	adaec6337f	[radv] Support multiple ipv6 prefixes per vlan interface (#9934 ) (#10253 ) Why I did it Radvd.conf.j2 template creates two copies of the vlan interface when there are more than one ipv6 address assigned to a single vlan interface. Changed the format to add prefixes under the same vlan interface block. How I did it Modifies radvd.conf.j2 and added unit tests How to verify it Configure multiple ipv6 address to the same vlan, start radvd Unit test will check if radvd.conf with multiple ipv6 addresses is formed correctly	2022-03-20 17:17:59 -07:00
Shilong Liu	3455e99d45	Add a config variable to override default container registry instead of dockerhub. (#10166 ) (#10262 ) * Add variable to reset default docker registry * fix bug in docker version control	2022-03-18 12:01:52 +08:00
Longxiang Lyu	259aa0856b	Add dualtor TSA/B/C support (#9726 ) Why I did it Add TSA/B/C dualtor support Signed-off-by: Longxiang Lyu lolv@microsoft.com How I did it For TSA, toggle all the mux to standby if the device type is dualtor and there are active mux ports. For TSC, add mux status output. How to verify it Run TSA/B/C on a dualtor setup	2022-03-08 19:02:06 +00:00
Saikrishna Arcot	ee2b08e988	[202012] Upgrade restapi docker to Buster (#10003 ) Backport the changes done in #9791 to the 202012 branch, and change the base image to Buster. Signed-off-by: Saikrishna Arcot <sarcot@microsoft.com>	2022-03-04 20:44:07 -08:00
Lawrence Lee	d162ffe0a5	[swss]: Wait for vlan intf to start ndppd (#10119 ) (#10153 ) 202012 version of #10119 Why I did it If the VLAN interface is not up when ndppd starts, it will fail to enable allmulti mode on the interface and be unable to process received NDP packets The following logs are seen: /var/log/syslog.33.gz:Feb 18 10:33:12.825406 sonic INFO swss#/supervisord: ndppd (error) Failed to set allmulti: No such device How I did it Use the wait_for_link script currently used by radv to delay ndppd startup until the vlan interface is ready How to verify it Apply the changes to a device. config reload the device and confirm that the above error logs are not observed when ndppd starts. Run the arp/test_arp_dualtor.py::test_proxy_arp test case and verify it passes.	2022-03-04 20:40:29 -08:00
xumia	2a7378b8c4	[Security]: Upgrade urllib3 to fix CVE-2021-33503 See https://security.archlinux.org/CVE-2021-33503	2022-02-25 09:11:56 +00:00
Richard.Yu	38f5e3bc66	[PTF-SAIv2]Add ptf docker for sai-ptf (saiv2) (#9729 ) * [PTF-SAIv2]Add ptf docker for sai-ptf (saiv2) Based on current ptf docker create a new docker for sai-ptf(saiv2) upgrade related package use the latest ptf and install it test done: NOJESSIE=1 NOSTRETCH=1 NOBULLSEYE=1 ENABLE_SYNCD_RPC=y make target/docker-ptf-sai.gz BLDENV=buster make -f Makefile.work target/docker-ptf-sai.gz * upgrade the thrift to 014	2022-02-23 22:46:33 +00:00
Travis Van Duyn	d18b7fa24c	updated jinja template for snmp contact python2 vs python3 issue (#9949 )	2022-02-12 01:06:13 +00:00
arlakshm	14bbccc9d6	[multi-asic] fix network command for internal loopback (#7878 ) Signed-off-by: Arvindsrinivasan Lakshmi Narasimhan <arlakshm@microsoft.com> In the multi asic platforms all the ASIC are advertising the same IPv6 /64 network from Loopback4096. Therefore, the IPv6 loopback address of backend asic is not learnt on the frontend asic. Change the bgpd.conf.main.conf.j2 template file to advertise the Loopback4096 ipv6 address as /128	2022-02-09 19:27:46 +00:00
abdosi	17a8f42704	[muti-asic] Updated BGP community for Internal routes (#7617 ) Following changes are done: Internal routes are tagged with no-export instead of local-AS Option to add User Define BGP community on top of no-export	2022-02-09 19:27:32 +00:00
Lawrence Lee	59a7dc9f1e	[swss]: Reduce tunnel_packet_handler memory usage (#9762 ) * Configure scapy to not store sniffed packets Signed-off-by: Lawrence Lee <lawlee@microsoft.com>	2022-02-08 19:07:40 +00:00
vdahiya12	73b27b7c9e	fix build error (#9902 ) Signed-off-by: vaibhav-dahiya <vdahiya@microsoft.com>	2022-02-03 08:52:29 +05:30
Shi Su	0b9077dc47	Add openbfdd to ptf docker (#9488 ) Why I did it To enable test support for BFD-related features, the PTF docker needs to have the proper support for BFD. This PR aims to add BFD support in ptf docker. How I did it Clone and build OpenBFDD for PTF docker. How to verify it Build locally and verify BFD is supported.	2022-01-31 20:08:49 +00:00
Saikrishna Arcot	5f3269a61b	Create a docker-swss-layer that holds the swss package. This is to save about 40MB of disk space, since 5 containers individually install this package. Signed-off-by: Saikrishna Arcot <sarcot@microsoft.com> (cherry picked from commit `bd479cad29`)	2022-01-27 23:53:09 -08:00
SuvarnaMeenakshi	d2ee7a5bef	[docker-snmp]: Modify log level of snmpd (#9734 ) #### Why I did it resolves https://github.com/Azure/sonic-buildimage/issues/8779 snmpd writes the below error message in syslog : snmp#snmpd[27]: truncating integer value > 32 bits This message is written in syslog when the hrSystemUptime(1.3.6.1.2.1.25.1.1.0 / system uptime) or sysUpTime(1.3.6.1.2.1.1.3 network management portion or snmpd uptime) is queried when either of these counters overflows beyond a 32 bit value. This happens when the device uptime or snmpd uptime is more than 497 days. #### How I did it Reference: https://access.redhat.com/solutions/367093 and https://linux.die.net/man/1/snmpcmd To avoid seeing this message if the counter grows, the snmpd error log level is changed to display LOG_EMERG, LOG_ALERT, LOG_CRIT, and LOG_DEBUG. Without this change, LOG_ERR and LOG_WARNING would also be logged in syslog. #### How to verify it On a device which is up for more than 497 days, modify supervisord.conf with the change and restart snmp. Query 1.3.6.1.2.1.1.3 and verify that log message is not seen.	2022-01-14 23:01:19 +00:00
Shi Su	60ac485f96	Reduce route selection deferral timer for bgp graceful restart (#7533 ) Why I did it There are scenarios in which the End-of-RIB from some of the peers arrives after reconciliation. In such scenarios, if the route selection deferral timer has the default value of 360 seconds, FRR would not set up routes and all routes would be removed after reconciliation. This PR reduces the route selection deferral timer so that at least the routes to some of the peers get restored at the point of reconciliation. Fix #7488 How I did it Reduce route selection deferral timer for bgp graceful restart to 15 seconds.	2021-12-20 19:24:58 +00:00
Lawrence Lee	a41c15a329	[swss]: Listen for undeliverable tunnel packets (#9348 ) - Create a script in the orchagent docker container which listens for these encapsulated packets which are trapped to CPU (indicating that they cannot be routed/no neighbor info exists for the inner packet). When such a packet is received, the script will issue a ping command to the packet's inner destination IP to start the neighbor learning process. - This script is also resilient to portchannel status changes (i.e. interface going up or down). An interface going down does not affect traffic sniffing on interfaces which are still up. When an interface comes back up, we restart the sniffer to start capturing traffic on that interface again.	2021-12-16 11:59:34 -08:00
Travis Van Duyn	0226140e9c	[snmp]: updated to support snmp config from redis configdb (#6134 ) - Why I did it I'm updating the jinja2 template to support getting SNMP information from the redis configdb. I'm using the format approved here: https://github.com/Azure/SONiC/pull/718 This will pave the way for us to decrement using the snmp.yml in the future. Right now we will still be using both the snmp.yml and configdb to get variable information in order to create the snmpd.conf via the sonic-cfggen tool. - How I did it I first updated the SNMP Schema in PR #718 to get that approved as a standardized format. Then I verified I could add snmp configs to the configdb using this standard schema. Once the configs were added to the configdb then I updated the snmpd.conf.j2 file to support the updates via the configdb while still using the variables in the snmp.yml file in parallel. This way we will have backward compatibility until we can fully migrate to the configdb only. By updating the snmpd.conf.j2 template and running the sonic-cfggen tool the snmpd.conf gets generated with using the values in both the configdb and snmp.yml file. Co-authored-by: trvanduy <trvanduy@microsoft.com>	2021-12-13 17:42:48 +00:00
kellyyeh	2019ccaa2a	[radv] Run radv on MgmtToRRouter (#9424 ) * Allow radv to run on mgmt tor and EPMS	2021-12-06 21:32:33 +00:00
arlakshm	9f0fc89cff	remove staticd.conf.j2 (#9182 ) Why I did it resolves #8979 and #9055 How I did it Remove the file staticd.conf.j2, which adds the default route on eth0, from bgp docker Signed-off-by: Arvindsrinivasan Lakshmi Narasimhan <arlakshm@microsoft.com>	2021-12-01 02:28:51 +00:00
Stephen Sun	fafd5327bd	[Reclaim buffer] Common infrastructure update for reclaiming buffer (#9133 ) - Why I did it This is to update the common sonic-buildimage infra for reclaiming buffer. - How I did it Render zero_profiles.j2 to zero_profiles.json for vendors that support reclaiming buffer The zero profiles will be referenced in PR [Reclaim buffer] Reclaim unused buffers by applying zero buffer profiles #8768 on Mellanox platforms and there will be test cases to verify the behavior there. Rendering is done here for passing azure pipeline. Load zero_profiles.json when the dynamic buffer manager starts Generate inactive port list to reclaim buffer Signed-off-by: Stephen Sun <stephens@nvidia.com>	2021-12-01 02:28:46 +00:00
Lawrence Lee	77378b4364	[mux]: Call write_standby from host only Signed-off-by: Lawrence Lee <lawlee@microsoft.com>	2021-11-10 18:54:33 -08:00
Lawrence Lee	25712c712e	[mux]: Make write_standby available on host Signed-off-by: Lawrence Lee <lawlee@microsoft.com> [write_standby]: Cleanup and fix build Signed-off-by: Lawrence Lee <lawlee@microsoft.com>	2021-11-10 18:54:33 -08:00
Lawrence Lee	84cd0e9471	[mux]: Initialize all mux ports as standby Signed-off-by: Lawrence Lee <lawlee@microsoft.com>	2021-11-10 18:54:33 -08:00
Tamer Ahmed	b8f70f8986	Merged PR 3845699: [linkmgrd]: Introduce MUX cable linkmgrd Linkmgrd monitors link status, mux status, and link state. If the link becomes unhealthy, linkmgrd will trigger mux switchover on a standby ToR ensuring uninterrupted service to servers/blades. This PR is initial implementation of linkmgrd. Also, docker-mux container holds packages related to maintaining and managing mux cable. It currently runs linkmgrd binary that monitors and switches the mux if needed. This PR also introduces mux-container and starts linkmgrd at startup when build is configured with INCLUDE_MUX=y Edit: linkmgrd PR will follow. signed-off-by: Tamer Ahmed <tamer.ahmed@microsoft.com> Related work items: #2315, #3146150	2021-11-10 18:54:33 -08:00
tjchadaga	9a1b1bc44e	Fix for additional intf flap during fast-reboot (#9166 )	2021-11-09 23:20:06 +00:00
Lawrence Lee	8ada006302	[swss]: Start ndppd after vlanmgrd (#9155 ) Why I did it During swss container startup, if ndppd starts up before/with vlanmgrd, ndppd will be pinned at nearly 100% CPU usage. How I did it Only start ndppd after vlanmgrd is running. Also, call ndppd directly instead of through bash for improved logging and to prevent orphaned processes. Signed-off-by: Lawrence Lee <lawlee@microsoft.com>	2021-11-05 00:39:10 +00:00
Saikrishna Arcot	bb1bc59a22	docker-dhcp-relay: Fix waiting for interfaces to get set up (#9034 ) Fix the check used to wait for interfaces to come up. The group name in the supervisor config files has changed from isc-dhcp-relay to dhcp-relay. Also, in the wait script, wait 10 additional seconds after the vlans, port channels, and any interfaces are up. This is because dhcrelay listens on all interfaces (in addition to port channels and vlans), and to ensure that it stays in a clean state during runtime, wait some extra time to make sure that those interfaces are created as well. Signed-off-by: Saikrishna Arcot <sarcot@microsoft.com>	2021-10-22 17:14:22 +00:00
kellyyeh	d4a6a009cf	Change radv interval to 3min (#8891 ) (cherry picked from commit `0e175e6d6c`)	2021-10-01 23:00:17 -07:00
kellyyeh	a4b6788b4b	Replace isc-dhcp with DHCPv6 Relay in dhcp_relay docker (#8884 )	2021-10-01 19:55:03 -07:00
kellyyeh	47ba7a9091	[dhcp_relay] DHCP relay support for IPv6 (#7772 ) (#8871 )	2021-09-30 01:33:02 -07:00
Christian Svensson	5dce093464	[mgmt-framework]: Fix typo in mgmt_vars.j2 (#8475 ) Signed-off-by: Christian Svensson <blue@cmd.nu>	2021-08-25 04:11:16 +00:00


868 Commits