sonic-buildimage

Author	SHA1	Message	Date
judyjoseph	ce86621399	[multi-ASIC] BGP internal neighbor table support (#5520 ) * Initial commit for BGP internal neighbor table support. > Add new template named "internal" for the internal BGP sessions > Add a new table in database "BGP_INTERNAL_NEIGHBOR" > The internal BGP sessions will be stored in this new table "BGP_INTERNAL_NEIGHBOR" * Changes in template generation tests with the introduction of internal neighbor template files.	2020-11-10 12:52:58 -08:00
abdosi	65cc37cadf	[multi-asic] teamdctl support for multi-asic (#5851 ) Signed-off-by: Abhishek Dosi <abdosi@microsoft.com>	2020-11-09 12:33:41 -08:00
Junchao-Mellanox	1070d024bc	[thermalctld] Enlarge startretries value to avoid thermalctld not able to restart during regression test (#5633 ) Increase startretries value from default of 10 to 50 to prevent supervisor from placing thermalctld in FATAL state during regression testing. Also ensures supervisord tries hard to get thermalctld running in production, as thermalctld is critical to prevent device from overheating.	2020-11-03 08:19:19 -08:00
abdosi	0fad6bdc7f	[monit] Adding patch to enhance syslog error message generation for monit alert action when status is failed. (#5720 ) Why/How I did: Make sure first error syslog is triggered based on FAULT TOLERANCE condition. Added support of repeat clause with alert action. This is used as trigger for generation of periodic syslog error messages if error is persistent Updated the monit conf files with repeat every x cycles for the alert action	2020-11-01 10:27:10 -08:00
shlomibitton	97f2cafe0b	[LLDP] Fix for LLDP advertisements being sent with wrong information. (#5493 ) * Fix for LLDP advertisements being sent with wrong information. Since lldpd starts before lldpmgr, some advertisement packets might be sent with the default value, the MAC address, as Port ID. This fix holds the packets from being sent by lldpd until all interfaces are configured by lldpmgrd. Signed-off-by: Shlomi Bitton <shlomibi@nvidia.com> * Fix comments * Fix unit-test output caused a failure during build * Add 'run_cmd' function and use it * Resume lldpd even if port init timeout reached	2020-10-30 09:06:23 -07:00
pavel-shirshov	2eec3b3254	[bgpcfgd]: Dynamic BBR support (#5626 ) - Why I did it To introduce dynamic support of BBR functionality into bgpcfgd. BBR adds `neighbor PEER_GROUP allowas-in 1` for all BGP peer-groups which point to T0. Now we can add and remove this configuration based on a CONFIG_DB entry - How I did it I introduced a new CONFIG_DB entry: - table name: "BGP_BBR" - key value: "all". Currently only "all" is supported, which means that all peer-groups which point to T0s will be updated - data value: a dictionary: {"status": "status_value"}, where status_value could be either "enabled" or "disabled" Initially, when bgpcfgd starts, it reads initial BBR status values from the [constants.yml](https://github.com/Azure/sonic-buildimage/pull/5626/files#diff-e6f2fe13a6c276dc2f3b27a5bef79886f9c103194be4fcb28ce57375edf2c23cR34). Then you can control BBR status by changing the "BGP_BBR" table in the CONFIG_DB (see examples below). bgpcfgd knows which peer-groups to change from [constants.yml](https://github.com/Azure/sonic-buildimage/pull/5626/files#diff-e6f2fe13a6c276dc2f3b27a5bef79886f9c103194be4fcb28ce57375edf2c23cR39). The dictionary contains peer-group names as keys, and a list of address-families as values. So when bgpcfgd gets a request to change the BBR state, it changes the state only for peer-groups listed in the constants.yml dictionary (and only for address families from the peer-group value). - How to verify it Initially, when we start SONiC, FRR has BBR enabled for PEER_V4 and PEER_V6: ``` admin@str-s6100-acs-1:~$ vtysh -c 'show run' | egrep 'PEER_V.? allowas' neighbor PEER_V4 allowas-in 1 neighbor PEER_V6 allowas-in 1 ``` Then we apply the following configuration to the db: ``` admin@str-s6100-acs-1:~$ cat disable.json { "BGP_BBR": { "all": { "status": "disabled" } } } admin@str-s6100-acs-1:~$ sonic-cfggen -j disable.json -w ``` The log output is: ``` Oct 14 18:40:22.450322 str-s6100-acs-1 DEBUG bgp#bgpcfgd: Received message : '('all', 'SET', (('status', 'disabled'),))' Oct 14 18:40:22.450620 str-s6100-acs-1 DEBUG bgp#bgpcfgd: execute command '['vtysh', '-f', '/tmp/tmpmWTiuq']'. Oct 14 18:40:22.681084 str-s6100-acs-1 DEBUG bgp#bgpcfgd: execute command '['vtysh', '-c', 'clear bgp peer-group PEER_V4 soft in']'. Oct 14 18:40:22.904626 str-s6100-acs-1 DEBUG bgp#bgpcfgd: execute command '['vtysh', '-c', 'clear bgp peer-group PEER_V6 soft in']'. ``` Check the FRR configuration and see that no allowas parameters are there: ``` admin@str-s6100-acs-1:~$ vtysh -c 'show run' | egrep 'PEER_V.? allowas' admin@str-s6100-acs-1:~$ ``` Then we apply the enabling configuration back: ``` admin@str-s6100-acs-1:~$ cat enable.json { "BGP_BBR": { "all": { "status": "enabled" } } } admin@str-s6100-acs-1:~$ sonic-cfggen -j enable.json -w ``` The log output: ``` Oct 14 18:40:41.074720 str-s6100-acs-1 DEBUG bgp#bgpcfgd: Received message : '('all', 'SET', (('status', 'enabled'),))' Oct 14 18:40:41.074720 str-s6100-acs-1 DEBUG bgp#bgpcfgd: execute command '['vtysh', '-f', '/tmp/tmpDD6SKv']'. Oct 14 18:40:41.587257 str-s6100-acs-1 DEBUG bgp#bgpcfgd: execute command '['vtysh', '-c', 'clear bgp peer-group PEER_V4 soft in']'. Oct 14 18:40:42.042967 str-s6100-acs-1 DEBUG bgp#bgpcfgd: execute command '['vtysh', '-c', 'clear bgp peer-group PEER_V6 soft in']'. ``` Check the FRR configuration and see that the BBR configuration is back: ``` admin@str-s6100-acs-1:~$ vtysh -c 'show run' | egrep 'PEER_V.? allowas' neighbor PEER_V4 allowas-in 1 neighbor PEER_V6 allowas-in 1 ``` * The test coverage * Below is the test coverage ``` ---------- coverage: platform linux2, python 2.7.12-final-0 ---------- Name Stmts Miss Cover ---------------------------------------------------- bgpcfgd/__init__.py 0 0 100% bgpcfgd/__main__.py 3 3 0% bgpcfgd/config.py 78 41 47% bgpcfgd/directory.py 63 34 46% bgpcfgd/log.py 15 3 80% bgpcfgd/main.py 51 51 0% bgpcfgd/manager.py 41 23 44% bgpcfgd/managers_allow_list.py 385 21 95% bgpcfgd/managers_bbr.py 76 0 100% bgpcfgd/managers_bgp.py 193 193 0% bgpcfgd/managers_db.py 9 9 0% bgpcfgd/managers_intf.py 33 33 0% bgpcfgd/managers_setsrc.py 45 45 0% bgpcfgd/runner.py 39 39 0% bgpcfgd/template.py 64 11 83% bgpcfgd/utils.py 32 24 25% bgpcfgd/vars.py 1 0 100% ---------------------------------------------------- TOTAL 1128 530 53% ``` - Which release branch to backport (provide reason below if selected) - [ ] 201811 - [x] 201911 - [x] 202006	2020-10-30 08:58:27 -07:00
pavel-shirshov	84405ab953	[bgp]: Enable next-hop-tracking through default (#5600 ) - Why I did it FRR introduced [next hop tracking](http://docs.frrouting.org/projects/dev-guide/en/latest/next-hop-tracking.html) functionality. That functionality requires resolving BGP neighbors before setting up a BGP connection (or an explicit ebgp-multihop command). Sometimes (BGP MONITORS) our neighbors are not directly connected and the sessions are IBGP. In this case the current configuration prevents FRR from establishing BGP connections. The reason would be "waiting for NHT". To fix that we need to either add static routes for each not-directly-connected IBGP neighbor, or enable the command `ip nht resolve-via-default` - How I did it Put `ip nht resolve-via-default` into the config - How to verify it Build an image. Enable BGP_MONITOR entry and check that entry is Established or Connecting in FRR Co-authored-by: Pavel Shirshov <pavel.contrib@gmail.com> Signed-off-by: Abhishek Dosi <abdosi@microsoft.com>	2020-10-13 22:42:29 -07:00
abdosi	9202b1c7eb	Fix monit complaining of snmp on 201911 branch. (#5612 ) There is difference between master and 201911 how sonic_ax_impl is started. Signed-off-by: Abhishek Dosi <abdosi@microsoft.com>	2020-10-13 17:17:43 -07:00
Mahesh Maddikayala	f354a20d94	[ECMP][Multi-ASIC] Have different ECMP seed value on each ASIC (#5357 ) * Calculate ECMP hash seed based on ASIC ID on multi ASIC platform. Each ASIC will have a unique ECMP hash seed value. Signed-off-by: Abhishek Dosi <abdosi@microsoft.com>	2020-10-13 09:48:57 -07:00
pavel-shirshov	437ad95646	[bgp] Add 'allow list' manager feature (#5513 ) implements a new feature: "BGP Allow list." This feature allows us to control which IP prefixes are going to be advertised via ebgp from the routes received from EBGP neighbors.	2020-10-06 11:15:19 -07:00
abdosi	3a29249e04	[Multi-asic] Fixed Default Route to be BGP (#5548 ) Learned and not docker default route for multi-asic platforms. Signed-off-by: Abhishek Dosi <abdosi@microsoft.com>	2020-10-06 06:04:31 +00:00
Nazarii Hnydyn	f456f1fd03	[monit]: Fix process checker. (#5480 ) Signed-off-by: Nazarii Hnydyn <nazariig@nvidia.com>	2020-09-30 00:25:37 +00:00
Stephen Sun	e9c2fdbf4a	[watermark] Fix error: BUFFER_POOL_WATERMARK isn't enabled by default (#4882 ) (#5455 ) * Fix error: watermarkstat -t buffer_pool doesn't work Signed-off-by: Stephen Sun <stephens@nvidia.com>	2020-09-29 13:59:26 -07:00
arlakshm	c8f92232ef	Vtysh support for multi asic (#5479 ) Signed-off-by: Arvindsrinivasan Lakshmi Narasimhan <arlakshm@microsoft.com>	2020-09-29 19:40:37 +00:00
Abhishek Dosi	04725bc030	Revert "[bgp] Add 'allow list' manager feature (#5309 )" This reverts commit `b5d33b39de`.	2020-09-29 15:39:04 +00:00
judyjoseph	4dbe391b9a	[multi-Asic] Add support for multi-asic to swssloglevel (#5316 ) * Support for multi-asic platform for swssloglevel command admin@str-acs-1:~$ swssloglevel Usage: /usr/bin/swssloglevel -n [0 to 3] [OPTION]... * Update to use the env file to get the PLATFORM string.	2020-09-28 21:15:44 +00:00
Tamer Ahmed	2cc98b4bac	[platform] Add Support For Environment Variable File (#5010 ) * [platform] Add Support For Environment Variable This PR adds the ability to read an environment file from /etc/sonic. The file contains immutable SONiC config attributes such as platform, hwsku, version, device_type. The aim is to minimize calls being made into sonic-cfggen during boot time. signed-off-by: Tamer Ahmed <tamer.ahmed@microsoft.com>	2020-09-28 21:14:39 +00:00
pavel-shirshov	b5d33b39de	[bgp] Add 'allow list' manager feature (#5309 ) implements a new feature: "BGP Allow list." This feature allows us to control which IP prefixes are going to be advertised via ebgp from the routes received from EBGP neighbors.	2020-09-28 16:20:27 +00:00
Sumukha Tumkur Vani	d6856aa424	Update conf DB with CA cert & rename ca_crt field (#5448 )	2020-09-28 16:19:27 +00:00
Tamer Ahmed	dd87bf7f7c	[swss] Start Restore Neighbor After SWSS Config (#5451 ) SWSS config script restore ARP/FDB/Routes. Restore neighbor script uses config DB ARP information to restore ARP entries and so needs to be started after swssconfig exits. signed-off-by: Tamer Ahmed <tamer.ahmed@microsoft.com>	2020-09-28 16:15:19 +00:00
Joe LeVeque	b70c6f72b2	[dockers][supervisor] Increase event buffer size for dependent-startup (#5247 ) When stopping the swss, pmon or bgp containers, log messages like the following can be seen: ``` Aug 23 22:50:43.789760 sonic-dut INFO swss#supervisord 2020-08-23 22:50:10,061 ERRO pool dependent-startup event buffer overflowed, discarding event 34 Aug 23 22:50:43.789760 sonic-dut INFO swss#supervisord 2020-08-23 22:50:10,063 ERRO pool dependent-startup event buffer overflowed, discarding event 35 Aug 23 22:50:43.789760 sonic-dut INFO swss#supervisord 2020-08-23 22:50:10,064 ERRO pool dependent-startup event buffer overflowed, discarding event 36 Aug 23 22:50:43.789760 sonic-dut INFO swss#supervisord 2020-08-23 22:50:10,066 ERRO pool dependent-startup event buffer overflowed, discarding event 37 ``` This is due to the number of programs in the container managed by supervisor, all generating events at the same time. The default event queue buffer size in supervisor is 10. This patch increases that value in all containers in order to eliminate these errors. As more programs are added to the containers, we may need to further adjust these values. I increased all buffer sizes to 25 except for containers with more programs or templated supervisor.conf files which allow for a variable number of programs. In these cases I increased the buffer size to 50. One final exception is the swss container, where the buffer fills up to ~50, so I increased this buffer to 100. Resolves https://github.com/Azure/sonic-buildimage/issues/5241	2020-09-28 16:12:53 +00:00
yozhao101	7580c846ad	[201911][Monit] Unmonitor processes in disabled containers (#5462 ) We want to let Monit to unmonitor the processes in containers which are disabled in `FEATURE` table such that Monit will not generate false alerting messages into the syslog. - Backport of https://github.com/Azure/sonic-buildimage/pull/5153 to the 201911 branch Signed-off-by: Yong Zhao <yozhao@microsoft.com>	2020-09-25 00:30:41 -07:00
gechiang	6ae77f87cc	Renamed sonic-bgpcfgd/bgpmon_proj directory to sonic-bgpcfgd/bgpmon so it is in sync with master branch naming change. Also made bgpmon auto restart enabled (#5453 ) Sync up the changes from master branch where bgpmon_proj is renamed to bgpmon. Added bgpmon to be autorestart enabled by supervisord	2020-09-24 08:57:55 -07:00
gechiang	7168fc8c07	Add bgpmon under sonic-bgpcfgd to be started as a new daemon under BGP docker (#5426 ) This is to port the same set of changes from master branch to 201911 branch for the bgpmon daemon running under bgp docker.	2020-09-22 12:13:37 -07:00
lguohan	13d28f9d19	[docker-base-stretch]: install rsyslog from stretch-backports (#5411 ) Install a newer version of rsyslog from stretch-backports to support -iNONE Previous backport from master use -iNONE option which is only available after v8.32.0 Signed-off-by: Guohan Lu <lguohan@gmail.com>	2020-09-21 02:08:30 -07:00
Prince Sunny	20f627044f	Add new DB for Restapi to database config (#5350 )	2020-09-19 14:08:36 -07:00
Tamer Ahmed	56cab18501	[dhcpmon] Print Both Snapshot And Current Counters (#5374 ) Printing both snapshot and current counter sets will make it easier to pinpoint which message type(s) is/are not being relayed. This PR prints both counter sets. Also, this PR defines gnu11 as a C standard to compile with in order to avoid making changes when porting to 201811 branch. signed-off-by: Tamer Ahmed <tamer.ahmed@microsoft.com>	2020-09-19 14:06:25 -07:00
Tamer Ahmed	b27ba0630c	[dhcpmon] Monitor Mgmt Interface For DHCP Packets (#5317 ) When BGP routes are missing, DHCP packets get relayed over the mgmt interface. This results in dhcpmon alerting that DHCP packets are not being relayed. This PR includes the mgmt interface as an uplink device, so if a DHCP packet gets relayed over the mgmt interface, the regular dhcpmon alert will not be issued. Instead, dhcpmon will check the mgmt interface counts and issue a separate alert regarding packets travelling through the mgmt network. In addition, this PR includes the following enhancements: 1. Add SIGUSR1 handler that prints out current packet counts 2. Increase alert grace window to 3 minutes from currently 2 minutes 3. Time is now computed more accurately 4. Print vlan name before counters signed-off-by: Tamer Ahmed <tamer.ahmed@microsoft.com>	2020-09-19 14:05:49 -07:00
Joe LeVeque	c3117bc35e	[lldpmgrd] Inherit DaemonBase class from sonic-py-common package (#5370 ) Eliminate duplicate logging and signal handling code by inheriting from DaemonBase class in sonic-py-common package.	2020-09-19 13:59:01 -07:00
Tamer Ahmed	4f7c346c53	[swss] Start Arp Update Process (#5391 ) Arp update process was not being started due to an issue with the directory name having an extra 'd' in supervisor as in '/etc/supervisord/conf.d/arp_update.conf'. signed-off-by: Tamer Ahmed <tamer.ahmed@microsoft.com>	2020-09-19 13:52:18 -07:00
Joe LeVeque	1ee4fa5a40	[docker-radv] Fix startup issues (#5230 ) - Why I did it PR https://github.com/Azure/sonic-buildimage/pull/4599 introduced two bugs in the startup of the router advertiser container: 1. References to the `wait_for_intf.sh` script were changed to `wait_for_link.sh`, but the actual script was not renamed 2. The `ipv6_found` Jinja2 variable added to the supervisor config file goes out of scope before it is read. - How I did it 1. Rename the `wait_for_intf.sh` script to `wait_for_link.sh` 2. Use the Jinja2 "namespace" construct to fix the scope issue - How to verify it Ensure all processes in the radv container start properly under the correct conditions (i.e., whether or not there is at least one VLAN with an IPv6 address assigned).	2020-09-04 21:20:08 +00:00
abdosi	e564142df2	Fix the issue as reported in (#5315 ) https://github.com/Azure/sonic-buildimage/issues/5255 Root Cause: Waiting on Restore count != 0 can lead to race condition between orchagent process and swssconfig.sh. Ideally check of Restore count != 0 is not needed as the State DB cannot be flushed as if it was flushed then Warm Restart or swss-restart should not be true also.	2020-09-04 21:10:39 +00:00
Prince Sunny	b1acfb60a7	Skip vnet-vxlan interfaces from generating networks (#5251 ) * Skip Vnet interface from generating networks	2020-09-03 15:49:59 -07:00
arlakshm	15a2195236	[Multi-ASIC]:Update the template to add ipinip entry for Loopback4096 (#5235 ) Signed-off-by: Arvindsrinivasan Lakshmi Narasimhan <arlakshm@microsoft.com> The following changes are done. - Multi-ASIC platforms have 2 Loopback interfaces, Loopback0 and Loopback4096. IPinIP decap entries need to be added for both of them. Update the ipinip.json.j2 template to add decap entries for Loopback4096. - Add corresponding unit test	2020-09-03 15:48:39 -07:00
zhenggen-xu	a949cf004e	[Build] pin down setuptools for build issues (#5281 ) Pin down setuptools version to fix build issues. See: https://github.com/Azure/sonic-buildimage/issues/5279 Signed-off-by: Zhenggen Xu <zxu@linkedin.com>	2020-08-31 20:44:39 -07:00
pra-moh	c43a994486	[docker-ptf] add gnmi python client (#4928 ) For telemetry regression test we need gnmi client to be present on ptfdocker. Gnmi-server will be present on SONiC DuT. Further, we can access gnmi_get from ptfdocker inside pytest to verify gnmi server streaming data successfully or not.	2020-08-27 08:05:41 -07:00
Mykola F	c243b8a9f5	[201911] Update SAI-Implementation submodule and enable port in/out dropped pkts stats (#5093 ) - Enable port buffer drops by default - Update SAI submodule Signed-off-by: Mykola Faryma <mykolaf@mellanox.com>	2020-08-25 08:20:05 -07:00
RayWang910012	4810db8447	[monit]: monit_telemetry which will have error when telemetry is in secure mode (#4286 ) When telemetry is in secure mode, the monitor logs an error because of the match string "--insecure". So I modified it to be compatible with both insecure mode and secure mode. Co-authored-by: Ubuntu <ubuntu@ip-10-5-1-21.ap-south-1.compute.internal>	2020-08-24 10:22:25 -07:00
Tamer Ahmed	9514932ed5	[telemetry] Fix telemetry vars template path (#4938 ) The template is referenced relative to the script path and this could result in errors in case the script is run from root. Add explicit path to the template file name. Also, moving telemetry_var template to template dir. And remove double quotes from around json dict. signed-off-by: Tamer Ahmed <tamer.ahmed@microsoft.com>	2020-08-19 16:59:32 -07:00
lguohan	92270544c5	[docker-orchagent]: start portsyncd before orchagent (#4845 ) When portsyncd starts, it first enumerates all front panel ports and marks them as old interfaces. Then, for new front panel ports it checks if their indexes exist in previous sets. If yes, it treats them as old interfaces and ignores them. The reason we have this check is because broadcom SAI only removes front panel ports after sai switch init. So, if portsyncd starts after orchagent, new interfaces could be created before portsyncd and treated as old interfaces. Signed-off-by: Guohan Lu <lguohan@gmail.com>	2020-08-16 08:25:36 -07:00
Joe LeVeque	802e77c3f1	[docker-pmon] Fix copy of fancontrol config file (#5037 ) Copy proper fancontrol config file to the proper destination. Also some minor refactoring for code reuse to help prevent issues like this in the future. Fixes a bug introduced by #4599	2020-08-15 22:35:02 -07:00
Guohan Lu	42f9be1de3	[docker-database]: do not generate pidfile for rsyslogd Signed-off-by: Guohan Lu <lguohan@gmail.com>	2020-08-15 22:32:25 -07:00
Guohan Lu	569766f698	[docker-snmp-sv2]: use service dependency in supervisord to start services Signed-off-by: Guohan Lu <lguohan@gmail.com>	2020-08-15 22:32:19 -07:00
Guohan Lu	b378b4d249	[docker-dhcp-relay]: use service dependency in supervisord to start services	2020-08-15 22:25:52 -07:00
Guohan Lu	7158ccd30d	[docker-teamd]: use service dependency in supervisord to start services	2020-08-15 22:25:46 -07:00
Guohan Lu	1b6b6055e7	[docker-mgmt-framework]: use service dependency in supervisord to start services Signed-off-by: Guohan Lu <lguohan@gmail.com>	2020-08-15 22:25:38 -07:00
Guohan Lu	4d2f9d1245	[docker-telemetry]: use service dependency in supervisord to start services Signed-off-by: Guohan Lu <lguohan@gmail.com>	2020-08-15 22:25:32 -07:00
Guohan Lu	9f5c5c7a4a	[docker-restapi]: use service dependency in supervisord to start services	2020-08-15 22:25:24 -07:00
Guohan Lu	763673993e	[docker-pmon]: use service dependency in supervisord to start services	2020-08-15 22:23:50 -07:00
Guohan Lu	aa0b875b03	[docker-sflow]: use service dependency in supervisord to start services	2020-08-15 22:22:00 -07:00
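
The "BGP_BBR" CONFIG_DB entry described in commit 2eec3b3254 above can be sketched in Python as follows. This is only an illustration built from the JSON shown in that commit message: the helper names `bbr_config` and `write_bbr_config` are hypothetical and are not part of bgpcfgd or sonic-cfggen. The sketch assumes the table name "BGP_BBR", the single supported key "all", and the two status values "enabled" and "disabled" exactly as the commit message states them.

```python
import json

# Status values named in the commit message for the BGP_BBR table.
VALID_STATUSES = ("enabled", "disabled")


def bbr_config(status):
    """Return a CONFIG_DB fragment that sets the BBR status (hypothetical helper)."""
    if status not in VALID_STATUSES:
        raise ValueError("status must be 'enabled' or 'disabled', got %r" % (status,))
    # "all" is the only key the commit message says is currently supported.
    return {"BGP_BBR": {"all": {"status": status}}}


def write_bbr_config(path, status):
    """Write the fragment to a JSON file (hypothetical helper)."""
    with open(path, "w") as handle:
        json.dump(bbr_config(status), handle, indent=4)


if __name__ == "__main__":
    # Prints the same structure as disable.json in the commit message.
    print(json.dumps(bbr_config("disabled"), indent=4))
```

A file written this way would then be applied as in the commit message, for example with `sonic-cfggen -j disable.json -w`.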

631 Commits