sonic-buildimage

Author	SHA1	Message	Date
Joe LeVeque	7bf05f7f4f	[supervisor] Install vanilla package once again, install Python 3 version in Buster container (#5546 ) - Why I did it We were building a custom version of Supervisor because I had added patches to prevent hangs and crashes if the system clock ever rolled backward. Those changes were merged into the upstream Supervisor repo as of version 3.4.0 (http://supervisord.org/changes.html#id9), so we should be able to simply install the vanilla package via pip. This will also allow us to easily move to Python 3, as Python 3 support was added in version 4.0.0. - How I did it - Remove Makefiles and patches for building supervisor package from source - Install Python 3 supervisor package version 4.2.1 in Buster base container - Also install Python 3 version of supervisord-dependent-startup in Buster base container - The Debian package installed the binary in `/usr/bin/`, but the pip package installs it in `/usr/local/bin/`. Rather than update all absolute paths, I changed all references to simply call `supervisord` and let the system PATH find the executable. This way, if we ever need to switch back to building a Debian package, we won't need to modify these again. - Install Python 2 supervisor package >= 3.4.0 in Stretch and Jessie base containers	2020-11-19 23:41:32 -08:00
pavel-shirshov	af654944bd	[bgp]: Update TSA functionality (#5906 ) Fixed TSA bugs: 1. TSA didn't advertise Loopback ipv6 address 2. TSA and TSB changed BGP dynamic and BGP monitors sessions - How to verify it Build an image and run on your DUT. ``` admin@str-s6100-acs-1:~$ TSA System Mode: Normal -> Maintenance admin@str-s6100-acs-1:~$ vtysh -c 'show bgp ipv4 neighbors 10.0.0.1 advertised-routes' BGP table version is 6, local router ID is 10.1.0.32, vrf id 0 Default local pref 100, local AS 64601 Status codes: s suppressed, d damped, h history, * valid, > best, = multipath, i internal, r RIB-failure, S Stale, R Removed Nexthop codes: @NNN nexthop's vrf id, < announce-nh-self Origin codes: i - IGP, e - EGP, ? - incomplete Network Next Hop Metric LocPrf Weight Path *> 10.1.0.32/32 0.0.0.0 0 32768 i Total number of prefixes 1 admin@str-s6100-acs-1:~$ vtysh -c 'show bgp ipv6 neighbors fc00::a advertised-routes' BGP table version is 6, local router ID is 10.1.0.32, vrf id 0 Default local pref 100, local AS 64601 Status codes: s suppressed, d damped, h history, * valid, > best, = multipath, i internal, r RIB-failure, S Stale, R Removed Nexthop codes: @NNN nexthop's vrf id, < announce-nh-self Origin codes: i - IGP, e - EGP, ? - incomplete Network Next Hop Metric LocPrf Weight Path *> fc00:1::/64 :: 0 32768 i Total number of prefixes 1 admin@str-s6100-acs-1:~$ TSB System Mode: Maintenance -> Normal ``` Co-authored-by: Pavel Shirshov <pavel.contrib@gmail.com>	2020-11-13 17:54:20 -08:00
judyjoseph	f2b22b5cd1	[multi-ASIC] util changes with the BGP_INTERNAL_NEIGHBOR table (#5874 ) Reintroduce #5760, along with the fix needed in the template file for python3 compatibility.	2020-11-10 09:34:56 -08:00
judyjoseph	b5121dcfd4	Revert "[multi-ASIC] util changes with the BGP_INTERNAL_NEIGHBOR table. (#5760 )" (#5871 ) This reverts commit `c972052594`.	2020-11-09 14:30:13 -08:00
judyjoseph	c972052594	[multi-ASIC] util changes with the BGP_INTERNAL_NEIGHBOR table. (#5760 ) - Why I did it Update the routine is_bgp_session_internal() to check the BGP_INTERNAL_NEIGHBOR table. Additionally, to address the review comment on #5520, add timer settings as well in the internal session templates and keep them minimal, as these sessions will always be up. Update the internal test data and add all of it to the template tests. - How I did it Updated the APIs and the template files. - How to verify it Verified that the internal BGP sessions are displayed correctly by the show commands that use the is_bgp_session_internal() API	2020-11-09 11:10:10 -08:00
pavel-shirshov	cdcd20a7b5	[BGP]: Convert ip address to network address for the LOCAL_VLAN filter (#5832 ) * [BGP]: Convert ip address to network address for the LOCAL_VLAN prefix filter	2020-11-06 17:47:08 -08:00
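The conversion named in the commit above (interface IP address to network address) can be illustrated with Python's standard `ipaddress` module. This is a minimal sketch of the idea only, not the code from the PR; the helper name `to_network_prefix` and the sample addresses are assumptions made for illustration.

```python
import ipaddress


def to_network_prefix(address_with_mask: str) -> str:
    """Return the network prefix that contains an interface address.

    Hypothetical helper: ip_interface() accepts an address with host bits
    set, and its .network attribute is the same prefix with host bits zeroed.
    """
    return str(ipaddress.ip_interface(address_with_mask).network)


# A Vlan interface address and the prefix a prefix-list entry would need.
print(to_network_prefix("192.168.0.1/27"))
print(to_network_prefix("fc02:1000::1/64"))
```

A prefix filter matches on network prefixes, so an entry built from the raw interface address (with host bits set) would not describe the Vlan subnet as intended; zeroing the host bits avoids that.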
pavel-shirshov	13f8e9ce5e	[bgpcfgd]: Convert bgpcfgd and bgpmon to python3 (#5746 ) * Convert bgpcfgd to python3 Convert bgpmon to python3 Fix some issues in bgpmon * Add python3-swsscommon as depends * Install dependencies * reorder deps Co-authored-by: Pavel Shirshov <pavel.contrib@gmail.com>	2020-11-05 10:01:43 -08:00
judyjoseph	6088bd59de	[multi-ASIC] BGP internal neighbor table support (#5520 ) * Initial commit for BGP internal neighbor table support. > Add new template named "internal" for the internal BGP sessions > Add a new table in database "BGP_INTERNAL_NEIGHBOR" > The internal BGP sessions will be stored in this new table "BGP_INTERNAL_NEIGHBOR" * Changes in template generation tests with the introduction of internal neighbor template files.	2020-10-28 16:41:27 -07:00
pavel-shirshov	c94f93f046	[bgpcfgd]: Dynamic BBR support (#5626 ) - Why I did it To introduce dynamic support of BBR functionality into bgpcfgd. BBR adds `neighbor PEER_GROUP allowas-in 1` for all BGP peer-groups which point to T0s. Now we can add and remove this configuration based on a CONFIG_DB entry. - How I did it I introduced a new CONFIG_DB entry: - table name: "BGP_BBR" - key value: "all". Currently only "all" is supported, which means that all peer-groups which point to T0s will be updated - data value: a dictionary: {"status": "status_value"}, where status_value could be either "enabled" or "disabled" Initially, when bgpcfgd starts, it reads initial BBR status values from the [constants.yml](https://github.com/Azure/sonic-buildimage/pull/5626/files#diff-e6f2fe13a6c276dc2f3b27a5bef79886f9c103194be4fcb28ce57375edf2c23cR34). Then you can control BBR status by changing the "BGP_BBR" table in the CONFIG_DB (see examples below). bgpcfgd knows which peer-groups to change from [constants.yml](https://github.com/Azure/sonic-buildimage/pull/5626/files#diff-e6f2fe13a6c276dc2f3b27a5bef79886f9c103194be4fcb28ce57375edf2c23cR39). The dictionary contains peer-group names as keys, and a list of address-families as values. So when bgpcfgd gets a request to change the BBR state, it changes the state only for peer-groups listed in the constants.yml dictionary (and only for address families from the peer-group value). - How to verify it Initially, when we start SONiC, FRR has BBR enabled for PEER_V4 and PEER_V6: ``` admin@str-s6100-acs-1:~$ vtysh -c 'show run' | egrep 'PEER_V.? allowas' neighbor PEER_V4 allowas-in 1 neighbor PEER_V6 allowas-in 1 ``` Then we apply the following configuration to the db: ``` admin@str-s6100-acs-1:~$ cat disable.json { "BGP_BBR": { "all": { "status": "disabled" } } } admin@str-s6100-acs-1:~$ sonic-cfggen -j disable.json -w ``` The log output is: ``` Oct 14 18:40:22.450322 str-s6100-acs-1 DEBUG bgp#bgpcfgd: Received message : '('all', 'SET', (('status', 'disabled'),))' Oct 14 18:40:22.450620 str-s6100-acs-1 DEBUG bgp#bgpcfgd: execute command '['vtysh', '-f', '/tmp/tmpmWTiuq']'. Oct 14 18:40:22.681084 str-s6100-acs-1 DEBUG bgp#bgpcfgd: execute command '['vtysh', '-c', 'clear bgp peer-group PEER_V4 soft in']'. Oct 14 18:40:22.904626 str-s6100-acs-1 DEBUG bgp#bgpcfgd: execute command '['vtysh', '-c', 'clear bgp peer-group PEER_V6 soft in']'. ``` Check the FRR configuration and see that no allowas parameters are there: ``` admin@str-s6100-acs-1:~$ vtysh -c 'show run' | egrep 'PEER_V.? allowas' admin@str-s6100-acs-1:~$ ``` Then we apply the enabling configuration back: ``` admin@str-s6100-acs-1:~$ cat enable.json { "BGP_BBR": { "all": { "status": "enabled" } } } admin@str-s6100-acs-1:~$ sonic-cfggen -j enable.json -w ``` The log output: ``` Oct 14 18:40:41.074720 str-s6100-acs-1 DEBUG bgp#bgpcfgd: Received message : '('all', 'SET', (('status', 'enabled'),))' Oct 14 18:40:41.074720 str-s6100-acs-1 DEBUG bgp#bgpcfgd: execute command '['vtysh', '-f', '/tmp/tmpDD6SKv']'. Oct 14 18:40:41.587257 str-s6100-acs-1 DEBUG bgp#bgpcfgd: execute command '['vtysh', '-c', 'clear bgp peer-group PEER_V4 soft in']'. Oct 14 18:40:42.042967 str-s6100-acs-1 DEBUG bgp#bgpcfgd: execute command '['vtysh', '-c', 'clear bgp peer-group PEER_V6 soft in']'. ``` Check the FRR configuration and see that the BBR configuration is back: ``` admin@str-s6100-acs-1:~$ vtysh -c 'show run' | egrep 'PEER_V.? allowas' neighbor PEER_V4 allowas-in 1 neighbor PEER_V6 allowas-in 1 ``` * The test coverage * Below is the test coverage ``` ---------- coverage: platform linux2, python 2.7.12-final-0 ---------- Name Stmts Miss Cover ---------------------------------------------------- bgpcfgd/__init__.py 0 0 100% bgpcfgd/__main__.py 3 3 0% bgpcfgd/config.py 78 41 47% bgpcfgd/directory.py 63 34 46% bgpcfgd/log.py 15 3 80% bgpcfgd/main.py 51 51 0% bgpcfgd/manager.py 41 23 44% bgpcfgd/managers_allow_list.py 385 21 95% bgpcfgd/managers_bbr.py 76 0 100% bgpcfgd/managers_bgp.py 193 193 0% bgpcfgd/managers_db.py 9 9 0% bgpcfgd/managers_intf.py 33 33 0% bgpcfgd/managers_setsrc.py 45 45 0% bgpcfgd/runner.py 39 39 0% bgpcfgd/template.py 64 11 83% bgpcfgd/utils.py 32 24 25% bgpcfgd/vars.py 1 0 100% ---------------------------------------------------- TOTAL 1128 530 53% ``` - Which release branch to backport (provide reason below if selected) - [ ] 201811 - [x] 201911 - [x] 202006	2020-10-22 11:04:21 -07:00
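The BBR flow described in the commit above (a status value, plus a dictionary of peer-groups to address families, turned into FRR configuration and soft clears) might be sketched as follows. This is not the code in `managers_bbr.py`: the function names, the `asn` parameter, and the `address-family ... unicast` nesting are assumptions made for illustration. Only the `allowas-in 1` line and the `clear bgp peer-group ... soft in` commands are taken from the commit message and its logs.

```python
from typing import Dict, List


def bbr_config_lines(status: str, peer_groups: Dict[str, List[str]], asn: int) -> List[str]:
    """Build FRR config lines that enable or disable BBR (hypothetical helper).

    peer_groups maps a peer-group name to its address families, mirroring
    the dictionary the commit says is read from constants.yml.
    """
    if status not in ("enabled", "disabled"):
        raise ValueError("unsupported BBR status: %r" % status)
    negate = "" if status == "enabled" else "no "
    lines = ["router bgp %d" % asn]
    for group in sorted(peer_groups):
        for family in peer_groups[group]:
            # Assumed nesting: the neighbor statement sits under its address family.
            lines.append(" address-family %s unicast" % family)
            lines.append("  %sneighbor %s allowas-in 1" % (negate, group))
            lines.append(" exit-address-family")
    return lines


def soft_clear_commands(peer_groups: Dict[str, List[str]]) -> List[str]:
    """Commands run afterwards, as seen in the log output of the commit."""
    return ["clear bgp peer-group %s soft in" % group for group in sorted(peer_groups)]


groups = {"PEER_V4": ["ipv4"], "PEER_V6": ["ipv6"]}
for line in bbr_config_lines("disabled", groups, asn=64601):
    print(line)
for command in soft_clear_commands(groups):
    print(command)
```

Restricting the change to the peer-groups listed in the dictionary is what keeps other peer-groups untouched when the `BGP_BBR` entry changes.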
pavel-shirshov	812e1a3489	[bgp]: Enable next-hop-tracking through default (#5600 ) - Why I did it FRR introduced [next hop tracking](http://docs.frrouting.org/projects/dev-guide/en/latest/next-hop-tracking.html) functionality. That functionality requires resolving BGP neighbors before setting up a BGP connection (or an explicit ebgp-multihop command). Sometimes (BGP MONITORS) our neighbors are not directly connected and the sessions are IBGP. In this case the current configuration prevents FRR from establishing BGP connections. The reason shown would be "waiting for NHT". To fix that we need to either add static routes for each not directly connected ibgp neighbor, or enable the command `ip nht resolve-via-default` - How I did it Put `ip nht resolve-via-default` into the config - How to verify it Build an image. Enable a BGP_MONITOR entry and check that the entry is Established or Connecting in FRR Co-authored-by: Pavel Shirshov <pavel.contrib@gmail.com>	2020-10-13 22:21:28 -07:00
pavel-shirshov	ffae82f8be	[bgp] Add 'allow list' manager feature (#5513 ) implements a new feature: "BGP Allow list." This feature allows us to control which IP prefixes are going to be advertised via ebgp from the routes received from EBGP neighbors.	2020-10-02 10:06:04 -07:00
Tamer Ahmed	6754635010	[cfggen] Make Jinja2 Template Python 3 Compatible Jinja2 templates rendered using the Python 3 interpreter are required to conform to the new Python 3 semantics. Signed-off-by: Tamer Ahmed <tamer.ahmed@microsoft.com>	2020-09-30 07:07:43 -07:00
Guohan Lu	e412338743	Revert "[bgp] Add 'allow list' manager feature (#5309 )" This reverts commit `6eed0820c8`.	2020-09-28 22:00:29 -07:00
pavel-shirshov	6eed0820c8	[bgp] Add 'allow list' manager feature (#5309 ) implements a new feature: "BGP Allow list." This feature allows us to control which IP prefixes are going to be advertised via ebgp from the routes received from EBGP neighbors.	2020-09-27 10:47:43 -07:00
gechiang	43a8368874	make bgpmon autorestart enabled by supervisord (#5460 )	2020-09-25 10:25:11 -07:00
gechiang	128def6969	Add bgpmon to be started as a new daemon under BGP docker (#5329 ) * Add bgpmon under sonic-bgpcfgd to be started as a new daemon under BGP docker * Added bgpmon to be monitored by Monit so that an alert is raised if it crashes * use console_scripts entry point to package bgpmon	2020-09-20 14:32:09 -07:00
Joe LeVeque	5b3b4804ad	[dockers][supervisor] Increase event buffer size for dependent-startup (#5247 ) When stopping the swss, pmon or bgp containers, log messages like the following can be seen: ``` Aug 23 22:50:43.789760 sonic-dut INFO swss#supervisord 2020-08-23 22:50:10,061 ERRO pool dependent-startup event buffer overflowed, discarding event 34 Aug 23 22:50:43.789760 sonic-dut INFO swss#supervisord 2020-08-23 22:50:10,063 ERRO pool dependent-startup event buffer overflowed, discarding event 35 Aug 23 22:50:43.789760 sonic-dut INFO swss#supervisord 2020-08-23 22:50:10,064 ERRO pool dependent-startup event buffer overflowed, discarding event 36 Aug 23 22:50:43.789760 sonic-dut INFO swss#supervisord 2020-08-23 22:50:10,066 ERRO pool dependent-startup event buffer overflowed, discarding event 37 ``` This is due to the number of programs in the container managed by supervisor, all generating events at the same time. The default event queue buffer size in supervisor is 10. This patch increases that value in all containers in order to eliminate these errors. As more programs are added to the containers, we may need to further adjust these values. I increased all buffer sizes to 25 except for containers with more programs or templated supervisor.conf files which allow for a variable number of programs. In these cases I increased the buffer size to 50. One final exception is the swss container, where the buffer fills up to ~50, so I increased this buffer to 100. Resolves https://github.com/Azure/sonic-buildimage/issues/5241	2020-09-08 23:36:38 -07:00
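The sizing rule in the commit above (25 by default, 50 for containers with many or a variable number of programs, 100 for swss) can be written down as a small sketch. The function names and the `many_programs` flag are assumptions for illustration; `buffer_size` is the supervisor event listener option, and the rendered section is only a fragment, since a real listener section also needs options such as `command` and `events`.

```python
def dependent_startup_buffer_size(container: str, many_programs: bool = False) -> int:
    """Pick an event buffer size following the rule stated in the commit.

    many_programs is a hypothetical flag standing for "more programs, or a
    templated supervisor.conf with a variable number of programs".
    """
    if container == "swss":
        return 100  # the commit reports this buffer filling up to ~50
    if many_programs:
        return 50
    return 25


def render_listener_fragment(container: str, many_programs: bool = False) -> str:
    """Render only the buffer_size part of the event listener section."""
    size = dependent_startup_buffer_size(container, many_programs)
    return "[eventlistener:dependent-startup]\nbuffer_size=%d\n" % size


print(render_listener_fragment("swss"))
print(render_listener_fragment("bgp", many_programs=True))
```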
Prince Sunny	4338d8293f	Skip vnet-vxlan interfaces from generating networks (#5251 ) * Skip Vnet interface from generating networks	2020-08-27 14:14:04 -07:00
Tamer Ahmed	a10c5bfd02	[frr] Reduce Calls to SONiC Cfggen (#5176 ) Calls to sonic-cfggen are CPU expensive. This PR reduces the number of sonic-cfggen calls to two during startup of the frr service. Signed-off-by: Tamer Ahmed <tamer.ahmed@microsoft.com>	2020-08-17 15:47:42 -07:00
pavel-shirshov	89184038fd	[docker-fpm-frr]: Start bgpd after zebra was started (#5038 ) fixes https://github.com/Azure/sonic-buildimage/issues/5026 Explanation: In the log from the issue I found: ``` I see following in the log Jul 22 21:13:06.574831 vlab-01 WARNING bgp#bgpd[49]: [EC 33554499] sendmsg_nexthop: zclient_send_message() failed ``` Analyzing the source code I found that the error message could be issued only when `zclient_send_rnh()` returns less than 0. ``` ret = zclient_send_rnh(zclient, command, p, exact_match, bnc->bgp->vrf_id); /* TBD: handle the failure */ if (ret < 0) flog_warn(EC_BGP_ZEBRA_SEND, "sendmsg_nexthop: zclient_send_message() failed"); ``` I checked [zclient_send_rnh()](`88351c8f6d/lib/zclient.c (L654)`) and found that this function returns the exit code which it gets from [zclient_send_message()](`88351c8f6d/lib/zclient.c (L266)`) But the latter function could return non-zero in two cases: 1. bgpd didn't connect to the zclient socket yet [code](`88351c8f6d/lib/zclient.c (L269)`) 2. The socket was closed. But in this case we would receive the error message in the log. (And I can find the message in the log when we reboot sonic) [code](`88351c8f6d/lib/zclient.c (L277)`) Also I see from the logs that the client connection was set up after we had the issue in bgpd. Bgpd.log ``` Jul 22 21:13:06.574831 vlab-01 WARNING bgp#bgpd[49]: [EC 33554499] sendmsg_nexthop: zclient_send_message() failed ``` Vs Zebra.log ``` Jul 22 21:13:12.713249 vlab-01 NOTICE bgp#zebra[48]: client 25 says hello and bids fair to announce only static routes vrf=0 Jul 22 21:13:12.820352 vlab-01 NOTICE bgp#zebra[48]: client 30 says hello and bids fair to announce only bgp routes vrf=0 Jul 22 21:13:12.820352 vlab-01 NOTICE bgp#zebra[48]: client 33 says hello and bids fair to announce only vnc routes vrf=0 ``` So in our case we should start zebra first, wait until it is started, and then start bgpd and the other daemons. - How I did it I changed the dependency graph to start daemons in the following order: 1. First start zebra 2. Then start staticd and bgpd 3. Then start vtysh -b and bgpeoi after bgpd is started.	2020-07-25 03:48:47 -07:00
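The start order listed in the commit above is a dependency graph, and a valid order can be derived with a topological sort. The sketch below uses `graphlib.TopologicalSorter` from the Python standard library (Python 3.9 or newer). It only illustrates the ordering; it is not how supervisord or supervisord-dependent-startup is configured, and the `DEPENDENCIES` mapping and the `vtysh_b` label are assumptions transcribed from the three steps in the commit.

```python
from graphlib import TopologicalSorter
from typing import Dict, List, Set

# Each program maps to the set of programs that must be started before it.
DEPENDENCIES: Dict[str, Set[str]] = {
    "zebra": set(),
    "staticd": {"zebra"},
    "bgpd": {"zebra"},
    "vtysh_b": {"bgpd"},  # stands for the "vtysh -b" step
    "bgpeoi": {"bgpd"},
}


def start_order(dependencies: Dict[str, Set[str]]) -> List[str]:
    """Return one start order in which every program follows its prerequisites."""
    return list(TopologicalSorter(dependencies).static_order())


print(start_order(DEPENDENCIES))
```

Programs with no ordering constraint between them (for example staticd and bgpd) may appear in either order; the only guarantee is that zebra comes first and that vtysh -b and bgpeoi come after bgpd.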
anish-n	da017f4ec9	[bgpcfgd]: Add Vlan prefix list to the FRR templates (#5005 ) add the Vlan prefix list to the FRR templates	2020-07-21 19:26:19 -07:00
arlakshm	97fa2c087b	[config]: Multi ASIC loopback changes (#4895 ) Resubmitting the changes for (#4825) with fixes for sonic-bgpcfgd test failures Signed-off-by: Arvindsrinivasan Lakshmi Narasimhan <arlakshm@microsoft.com>	2020-07-12 18:08:51 +00:00
Guohan Lu	f8da3e4c69	Revert "[config]: Loopback Interface changes for multi ASIC devices (#4825 )" This reverts commit `cae65a451c`.	2020-07-12 18:08:51 +00:00
arlakshm	002335a3d5	[config]: Loopback Interface changes for multi ASIC devices (#4825 ) * Loopback IP changes for multi ASIC devices Multi ASIC devices will have 2 Loopback interfaces: - Loopback0 has a globally unique IP address, which is advertised by the multi ASIC device to its peers. This way all the external devices will see this device as a single device. - Loopback4096 is assigned an IP address whose scope is within the device. Each ASIC has a different IP address for Loopback4096. This IP address will be used as the Router-Id by the bgp instance on multi ASIC devices. This PR implements this change for multi ASIC devices Signed-off-by: Arvindsrinivasan Lakshmi Narasimhan <arlakshm@microsoft.com>	2020-07-12 18:08:51 +00:00
pavel-shirshov	2b137fb540	Tests of FRR templates which rendered by sonic-cfggen (#4875 ) * Tests of FRR templates which rendered by sonic-cfggen	2020-07-12 18:08:51 +00:00
pavel-shirshov	d592e9b0f8	Tests for bgpcfgd templates (#4841 ) * Tests for bgpcfgd templates	2020-06-25 14:54:02 -07:00
pavel-shirshov	0d863c39ac	[bgpcfgd]: make a package for bgpcfgd (#4813 )	2020-06-20 21:01:24 -07:00
arlakshm	80298fae1f	[bgp]:Add redistribution connected for ipv6 also for Frontend ASICs (#4767 ) * fix redistribution connected for ipv6 also Signed-off-by: Arvindsrinivasan Lakshmi Narasimhan <arlakshm@microsoft.com>	2020-06-16 06:45:45 -07:00
Guohan Lu	2c7e55ae98	[docker-frr]: use service dependency in supervisord to start services Signed-off-by: Guohan Lu <lguohan@gmail.com>	2020-05-22 11:01:28 -07:00
judyjoseph	4ba2f608c1	Adding new BGP peer groups PEER_V4_INT and PEER_V6_INT. (#4620 ) * Adding new BGP peer groups PEER_V4_INT and PEER_V6_INT. The internal BGP sessions will be added to these peer groups, while the external BGP sessions will be added to the existing PEER_V4 and PEER_V6 peer groups. * Check for "ASIC" keyword in the hostname to identify the internal neighbors.	2020-05-20 20:52:11 -07:00
arlakshm	d3c28a45d9	Change to enable redistribute connected on Frontend asics instead of backend asics (#4588 ) Signed-off-by: Arvindsrinivasan Lakshmi Narasimhan <arlakshm@microsoft.com>	2020-05-13 08:56:01 -07:00
arlakshm	2db87669c2	[bgp]: align the bgp templates with new minigraph for multi NPU platforms (#4488 ) - change the references to 'type' field to 'sub_role' - change the references to 'InternalFrontend' and 'InternalBackend' to 'FrontEnd' and 'BackEnd' respectively - add a statement to reflect route-reflector for backend asics - add a change to set "next-hop-self force" configuration for internal BGP session in multi asic platform. Signed-off-by: Arvindsrinivasan Lakshmi Narasimhan <arlakshm@microsoft.com>	2020-05-06 14:58:02 -07:00
pavel-shirshov	057ced0391	[bgpcfgd]: Split one bgp mega-template to chunks. (#4143 ) The one big bgp configuration template was split into chunks. Currently we have three types of bgp neighbor peers: - general bgp peers, represented by CONFIG_DB::BGP_NEIGHBOR table entries - dynamic bgp peers, represented by CONFIG_DB::BGP_PEER_RANGE table entries - monitors bgp peers, represented by CONFIG_DB::BGP_MONITORS table entries This PR introduces three templates for each peer type: - bgp policies: represent policies that will be applied to the bgp peer-group (ip prefix-lists, route-maps, etc) - bgp peer-group: represents the bgp peer group which has the common configuration for the bgp peer type and uses the bgp routing policy from the previous item - bgp peer-group instance: represents the bgp configuration which will be used to instantiate a bgp peer-group for the bgp peer type. Usually this one is simple and consists of a reference to the bgp peer-group, the bgp peer description and the bgp peer ip address. This PR redefines the constant.yml file. The file now has a setting that controls whether bgp_neighbor metadata is used. It also has more parameters which are not used yet; they will be used in the next iteration of bgpcfgd. Currently all tests have been disabled. I'm going to create the next PR with the tests right after this PR is merged. I'm going to introduce a better bgpcfgd in a short time. It will include support for dynamic changes to the templates. FIX:: #4231	2020-04-23 09:42:22 -07:00

33 Commits