Why I did it
In the recent minigraph changes we add separate BGP session configuration for V4 and V6 internal VoQ neighbors.
This PR is adding different Peer groups for V4 and V6 neighbors
How I did it
Add VOQ_CHASSIS_V4_PEER and VOQ_CHASSIS_V6_PEER groups
Add extra Unit tests
How to verify it
Signed-off-by: Arvindsrinivasan Lakshmi Narasimhan <arlakshm@microsoft.com>
What I did:
Updated Jinja Template to enable BGP Graceful Restart based on device role. By default it will be enable only if the device role type is TorRouter.
Why I did:-
By default FRR is configured in Graceful Helper mode. Graceful Restart is needed on T0/TorRouter only since the device can go for warm-reboot. For T1/LeafRouter it need to be in Helper mode only
Updated BGP Template for the case:
1. For Packet Chassis do not advertise Loopback4096 address into BGP as there is Static Route for same.
Having this route in BGP causes two level of recursion in Zebra and cause assert in Zebra
when there are many nexthop involved
2. Advertise only P2P Connected IP's into BGP (External Peers). For Packet chassis we have backend IP Interface subnet and if
they get advertised into BGP then it also causes recursion
The BGP_VOQ_CHASSIS_NEIGHBOR keepalive and holdtime timers are
configured similar to general neighbors. Changes are done to configure
BGP_VOQ_CHASSIS_NEIGHBOR timers similar to BGP_INTENAL_NEIGBOR since voq
chassis bgp neighbors are similar to bgp internal neighbors in
multi-asic. As it is done for bgp internal neighbors, the keepalive and
holdtime timers are set to 3 and 10 seconds respectively. Also similar
to bgp internal neighbors, connection retry timer is also configured for
voq chassis bgp neighbors.
Signed-off-by: vedganes <vedavinayagam.ganesan@nokia.com>
Why I did it
resolves#8979 and #9055
How I did it
Remove the file static.conf.j2,which adds the default route on eth0 from bgp docker
Signed-off-by: Arvindsrinivasan Lakshmi Narasimhan <arlakshm@microsoft.com>
What I did:
Fix the typo in Internal Peer Group template for Packet-based Chassis.
Address Review comments of PR: [chassis-packet] minigraph parsing and BGP template changes #8966
- Static Route Parsing for Host
- Formatting of chassis port_config.ini
1. Changes for Generation LC-Graph for packet-based chassis.
2. Added Support Ipv6 Peering on Loopback4096 for voq also
3. Updated asic topology yml files to be offset of slot
4. Made slot_num to take string slot<number> instead of number
5. Consolidated template_dpg_voq_asic.j2 into dpg_asic.j2
6. Remove Loopback4096 from asic topology and parse as dut invertory for
multi-asic
7. Updated topo_facts parsing for asic topology_
8. Internal BGP Session rename from <VoqChassisInternal> to <ChassisInternal> and take switch_type as value.
Signed-off-by: Abhishek Dosi <abdosi@microsoft.com>
For multiasic, the back end asics use ip addresss of Loopback4096 for BGP router id. In VOQ multi-asic chassis there are no back end asics. All the asics are front end and the iBGP connections are established via Ethernet-IB of asics. Since these asics are not designated as BackEnd, the ip address of interface Loopback0 is used as BGP router id. Since the ip address of Loopback0 is same for all the asics in the line card, same router id is used for voq iBGP configurations and hence the iBGP connections are not established. Changes are done to fix this
Why I did it
There are scenarios that End-of-RIB comes from a part of the peers arrives after reconciliation. In such scenarios, if the route selection deferral timer has the default value of 360 seconds, FRR would not set up routes and all routes would be removed after reconciliation. This PR reduces the route selection deferral timer so that at least routes to parts of the peers get restored at the point of reconciliation.
Fix#7488
How I did it
Reduce route selection deferral timer for bgp graceful restart to 15 seconds.
Signed-off-by: Arvindsrinivasan Lakshmi Narasimhan <arlakshm@microsoft.com>
In the multi asic platforms all the ASIC are advertising the same IPv6 /64 network from Loopback4096.
Therefore, the IPv6 loopback address of backend asic is not learnt on the frontend asic.
Change the bgpd.conf.main.conf.j2 template file to advertise the Loopback4096 ipv6 address as /128
Signed-off-by: Yong Zhao yozhao@microsoft.com
Why I did it
Currently we leveraged the Supervisor to monitor the running status of critical processes in each container and it is more reliable and flexible than doing the monitoring by Monit. So we removed the functionality of monitoring the critical processes by Monit.
How I did it
I removed the script process_checker and corresponding Monit configuration entries of critical processes.
How to verify it
I verified this on the device str-7260cx3-acs-1.
Why I did it
Support readonly version of the command vtysh
How I did it
Check if the command starting with "show", and verify only contains single command in script.
For the split mode, the config files, like bgpd.conf, zebra.conf and so on, were provided by outside. But the docker_init.sh will overwrite the outside config files if restart bgp service.
How I did it
Add a split mode checking in docker_init.sh, if docker_routing_config_mode is split, don't overwrite the existing routing config files.
How to verify it
Set split mode in config db
{
"DEVICE_METADATA": {
"localhost": {
"hwsku": "Force10-S6000",
"platform": "x86_64-kvm_x86_64-r0",
"docker_routing_config_mode": "split"
...
}
}
}
Replace your bgpd.conf to /etc/sonic/frr/bgpd.conf
Restart bgp service by sudo service bgp restart
The /etc/sonic/frr/bgpd.conf your provided shouldn't be overwritten
Signed-off-by: Ze Gan <ganze718@gmail.com>
This commit has following changes:
* Add templates and code to support VoQ chassis iBGP peers
* Add support to convert a new VoQChassisInternal element in the
BGPSession element of the minigraph to a new BGP_VOQ_CHASSIS_NEIGHBOR
table in CONFIG_DB.
* Add a new set of "voq_chassis" templates to docker-fpm-frr
* Add a new BGP peer manager to bgpcfgd to add neighbors from the
BGP_VOQ_CHASSIS_NEIGHBOR table using the voq_chassis templates.
* Add a test case for minigraph.py, making sure the VoQChassisInternal
element creates a BGP_VOQ_CHASSIS_NEIGHBOR entry, but not if its
value is "false".
* Add a set of test cases for the new voq_chassis templates in
sonic-bgpcfgd tests.
Note that the templates expect the new
"bgp bestpath peer-type multipath-relax" bgpd configuration to be
available.
Signed-off-by: Joanne Mikkelson <jmmikkel@arista.com>
With the latest 201911 image, the following error was seen on staging devices with TSB command ( for both single asic, multi asic ). Though this err message doesn't affect the TSB functionality, it is good to fix.
admin@STG01-0101-0102-01T1:~$ TSB
BGP0 : % Could not find route-map entry TO_TIER0_V4 20
line 1: Failure to communicate[13] to zebra, line: no route-map TO_TIER0_V4 permit 20
% Could not find route-map entry TO_TIER0_V4 30
line 2: Failure to communicate[13] to zebra, line: no route-map TO_TIER0_V4 deny 30
In addition, in this PR I am fixing the message displayed to user when there are no BGP neighbors configured on that BGP instance. In multi-asic device there could be case where there are no BGP neighbors configured on a particular ASIC.
The default bgp connect retry timer is 120 seconds. A reconnection will happen 120 seconds if the initial connection fails. This PR aims to allow a more frequent retry.
Why I did it
It was observed that on a multi-asic DUT bootup, the BGP internal sessions between ASIC's was taking more time to get ESTABLISHED than external BGP sessions. The internal sessions was coming up almost exactly 120 secs later.
In multi-asic platform the bgp dockers ( which is per ASIC ) on switch start are bring brought up around the same time and they try to make the bgp sessions with neighbors (in peer ASIC's) which may be not be completely up. This results in BGP connect fail and the retry happens after 120sec which is the default Connect Retry Timer
How I did it
Add the command to set the bgp neighboring session retry timer to 10sec for internal bgp neighbors.
1. Made the command next-hop-self force only applicable on back-end asic bgp. This is done so that BGPL iBGP session running on backend can send e-BGP learn nexthop. Back end asic FRR is able to recursively resolve the eBGP nexthop in its routing table since it knows about all the connected routes advertise from front end asic.
2. Made all front-end asic bgp use global loopback ip (Loopback0) as router id and back end asic bgp use Loopbacl4096 as ruter-id and originator id for Route-Reflector. This is done so that routes learnt by external peer do not see Loopback4096 as router id in show ip bgp <route-prerfix> output.
3. To handle above change need to pass Loopback4096 from BGP manager for jinja2 template generation. This was missing and this change/fix is needed for this also https://github.com/Azure/sonic-buildimage/blob/master/dockers/docker-fpm-frr/frr/bgpd/templates/dynamic/instance.conf.j2#L27
4. Enhancement to add mult_asic specific bgpd template generation unit test cases.
Enable BBR config allowas-in 1 for internal peers
Why I did:
To advertise BBR routes learnt via e-BGP peer in one asic/namespace to another iBGP asic/namespace via Route Reflector.
- Introduced TS common file in docker as well and moved common functions.
- TSA/B/C scripts run only in BGP instances for front end ASICs.
In addition skip enforcing it on route maps used between internal BGP sessions.
admin@str--acs-1:~$ sudo /usr/bin/TSA
System Mode: Normal -> Maintenance
and in case of Multi-ASIC
admin@str--acs-1:~$ sudo /usr/bin/TSA
BGP0 : System Mode: Normal -> Maintenance
BGP1 : System Mode: Normal -> Maintenance
BGP2 : System Mode: Normal -> Maintenance
The requirement for zebra to be ready to accept connections is a generic problem that is not
specific to bgpd. Making the script to wait for zebra socket a separate script and let bgpd and
staticd to wait for zebra socket.
- Support for non-template based FRR configurations (BGP, route-map, OSPF, static route..etc) using config DB schema.
- Support for save & restore - Jinja template based config-DB data read and apply to FRR during startup
**- How I did it**
- add frrcfgd service
- when frr_mgmg_framework_config is set, frrcfgd starts in bgp container
- when user changed the BGP or other related table entries in config DB, frrcfgd will run corresponding VTYSH commands to program on FRR.
- add jinja template to generate FRR config file to be used by FRR daemons while bgp container restarted
**- How to verify it**
1. Add/delete data on config DB and then run VTYSH "show running-config" command to check if FRR configuration changed.
1. Restart bgp container and check if generated FRR config file is correct and run VTYSH "show running-config" command to check if FRR configuration is consistent with attributes in config DB
Co-authored-by: Zhenhong Zhao <zhenhong.zhao@dell.com>
- Why I did it
Initially, we used Monit to monitor critical processes in each container. If one of critical processes was not running
or crashed due to some reasons, then Monit will write an alerting message into syslog periodically. If we add a new process
in a container, the corresponding Monti configuration file will also need to update. It is a little hard for maintenance.
Currently we employed event listener of Supervisod to do this monitoring. Since processes in each container are managed by
Supervisord, we can only focus on the logic of monitoring.
- How I did it
We borrowed the event listener of Supervisord to monitor critical processes in containers. The event listener will take
following steps if it was notified one of critical processes exited unexpectedly:
The event listener will first check whether the auto-restart mechanism was enabled for this container or not. If auto-restart mechanism was enabled, event listener will kill the Supervisord process, which should cause the container to exit and subsequently get restarted.
If auto-restart mechanism was not enabled for this contianer, the event listener will enter a loop which will first sleep 1 minute and then check whether the process is running. If yes, the event listener exits. If no, an alerting message will be written into syslog.
- How to verify it
First, we need checked whether the auto-restart mechanism of a container was enabled or not by running the command show feature status. If enabled, one critical process should be selected and killed manually, then we need check whether the container will be restarted or not.
Second, we can disable the auto-restart mechanism if it was enabled at step 1 by running the commnad sudo config feature autorestart <container_name> disabled. Then one critical process should be selected and killed. After that, we will see the alerting message which will appear in the syslog every 1 minute.
- Which release branch to backport (provide reason below if selected)
201811
201911
[x ] 202006
Fix#5026
There is a race condition between zebra server accepts connections and bgpd tries to connect. Bgpd has a chance to try to connect before zebra is ready. In this scenario, bgpd will try again after 10 seconds and operate as normal within these 10 seconds. As a consequence, whatever bgpd tries to sent to zebra will be missing in the 10 seconds. To avoid such a scenario, bgpd should start after zebra is ready to accept connections.
**- Why I did it**
Earlier today we found a bug in the SONiC TSA implementation.
TSC shows incorrect output (see below) in case we have a route-map which contains TSA route-map as a prefix.
```
admin@str-s6100-acs-1:~$ TSC
Traffic Shift Check:
System Mode: Not consistent
```
The reason is that TSC implementation has too loose regexps in TSA utilities, which match wrong route-map entries:
For example, current TSC matches following
```
route-map TO_BGP_PEER_V4 permit 200
route-map TO_BGP_PEER_V6 permit 200
```
But it should match only
```
route-map TO_BGP_PEER_V4 permit 20
route-map TO_BGP_PEER_V4 deny 30
route-map TO_BGP_PEER_V6 permit 20
route-map TO_BGP_PEER_V6 deny 30
```
**- How I did it**
I fixed it by using egrep with `^` and `$` regexp markers which match begin and end of the line.
**- How to verify it**
1. Add follwing entry to FRR config:
```
str-s6100-acs-1#
str-s6100-acs-1# conf t
str-s6100-acs-1(config)# route-map TO_BGP_PEER_V4 permit 200
str-s6100-acs-1(config-route-map)# end
```
2. Use the TSC command and check output. It should show normal.
```
admin@str-s6100-acs-1:~$ TSC
Traffic Shift Check:
System Mode: Normal```
* Use 20 and 30 route-map entries instead of 2 and 3 for TSA
* Added support for dynamic "Allow list" default action.
Co-authored-by: Pavel Shirshov <pavel.contrib@gmail.com>
frr does not advertise route if local route is not reachable, as a result
loopback route /64 is not advertised to the neighbors. Add static route
allows frr to advertise the route to its peers
Signed-off-by: Guohan Lu <lguohan@gmail.com>
**- Why I did it**
We were building a custom version of Supervisor because I had added patches to prevent hangs and crashes if the system clock ever rolled backward. Those changes were merged into the upstream Supervisor repo as of version 3.4.0 (http://supervisord.org/changes.html#id9), therefore, we should be able to simply install the vanilla package via pip. This will also allow us to easily move to Python 3, as Python 3 support was added in version 4.0.0.
**- How I did it**
- Remove Makefiles and patches for building supervisor package from source
- Install Python 3 supervisor package version 4.2.1 in Buster base container
- Also install Python 3 version of supervisord-dependent-startup in Buster base container
- Debian package installed binary in `/usr/bin/`, but pip package installs in `/usr/local/bin/`, so rather than update all absolute paths, I changed all references to simply call `supervisord` and let the system PATH find the executable to prevent future need for changes just in case we ever need to switch back to build a Debian package, then we won't need to modify these again.
- Install Python 2 supervisor package >= 3.4.0 in Stretch and Jessie base containers
Fixed TSA bugs:
1. TSA didn't advertise Loopback ipv6 address
2. TSA and TSB changed BGP dynamic and BGP monitors sessions
**- How to verify it**
Build an image and run on your DUT.
```
admin@str-s6100-acs-1:~$ TSA
System Mode: Normal -> Maintenance
admin@str-s6100-acs-1:~$ vtysh -c 'show bgp ipv4 neighbors 10.0.0.1 advertised-routes'
BGP table version is 6, local router ID is 10.1.0.32, vrf id 0
Default local pref 100, local AS 64601
Status codes: s suppressed, d damped, h history, * valid, > best, = multipath,
i internal, r RIB-failure, S Stale, R Removed
Nexthop codes: @NNN nexthop's vrf id, < announce-nh-self
Origin codes: i - IGP, e - EGP, ? - incomplete
Network Next Hop Metric LocPrf Weight Path
*> 10.1.0.32/32 0.0.0.0 0 32768 i
Total number of prefixes 1
admin@str-s6100-acs-1:~$ vtysh -c 'show bgp ipv6 neighbors fc00::a advertised-routes'
BGP table version is 6, local router ID is 10.1.0.32, vrf id 0
Default local pref 100, local AS 64601
Status codes: s suppressed, d damped, h history, * valid, > best, = multipath,
i internal, r RIB-failure, S Stale, R Removed
Nexthop codes: @NNN nexthop's vrf id, < announce-nh-self
Origin codes: i - IGP, e - EGP, ? - incomplete
Network Next Hop Metric LocPrf Weight Path
*> fc00:1::/64 :: 0 32768 i
Total number of prefixes 1
admin@str-s6100-acs-1:~$ TSB
System Mode: Maintenance -> Normal
```
Co-authored-by: Pavel Shirshov <pavel.contrib@gmail.com>
- Why I did it
Update the routine is_bgp_session_internal() by checking the BGP_INTERNAL_NEIGHBOR table.
Additionally to address the review comment #5520 (comment)
Add timer settings as will in the internal session templates and keep it minimal as these sessions which will always be up.
Updates to the internal tests data + add all of it to template tests.
- How I did it
Updated the APIs and the template files.
- How to verify it
Verified the internal BGP sessions are displayed correctly with show commands with this API is_bgp_session_internal()
lldpmgrd, bgpcfgd, and bgpmon are reported error status not running due
to recent change of shebang to use `Python3`. Modifying the argument of
`process_checker` to follow this change.
Signed-off-by: Longxiang Lyu <lolv@microsoft.com>
* Convert bgpcfgd to python3
Convert bgpmon to python3
Fix some issues in bgpmon
* Add python3-swsscommon as depends
* Install dependencies
* reorder deps
Co-authored-by: Pavel Shirshov <pavel.contrib@gmail.com>
Why/How I did:
Make sure first error syslog is triggered based on FAULT TOLERANCE condition.
Added support of repeat clause with alert action. This is used as trigger
for generation of periodic syslog error messages if error is persistent
Updated the monit conf files with repeat every x cycles for the alert action
* Initial commit for BGP internal neighbor table support.
> Add new template named "internal" for the internal BGP sessions
> Add a new table in database "BGP_INTERNAL_NEIGHBOR"
> The internal BGP sessions will be stored in this new table "BGP_INTERNAL_NEIGHBOR"
* Changes in template generation tests with the introduction of internal neighbor template files.
**- Why I did it**
To introduce dynamic support of BBR functionality into bgpcfgd.
BBR is adding `neighbor PEER_GROUP allowas-in 1' for all BGP peer-groups which points to T0
Now we can add and remove this configuration based on CONFIG_DB entry
**- How I did it**
I introduced a new CONFIG_DB entry:
- table name: "BGP_BBR"
- key value: "all". Currently only "all" is supported, which means that all peer-groups which points to T0s will be updated
- data value: a dictionary: {"status": "status_value"}, where status_value could be either "enabled" or "disabled"
Initially, when bgpcfgd starts, it reads initial BBR status values from the [constants.yml](https://github.com/Azure/sonic-buildimage/pull/5626/files#diff-e6f2fe13a6c276dc2f3b27a5bef79886f9c103194be4fcb28ce57375edf2c23cR34). Then you can control BBR status by changing "BGP_BBR" table in the CONFIG_DB (see examples below).
bgpcfgd knows what peer-groups to change fron [constants.yml](https://github.com/Azure/sonic-buildimage/pull/5626/files#diff-e6f2fe13a6c276dc2f3b27a5bef79886f9c103194be4fcb28ce57375edf2c23cR39). The dictionary contains peer-group names as keys, and a list of address-families as values. So when bgpcfgd got a request to change the BBR state, it changes the state only for peer-groups listed in the constants.yml dictionary (and only for address families from the peer-group value).
**- How to verify it**
Initially, when we start SONiC FRR has BBR enabled for PEER_V4 and PEER_V6:
```
admin@str-s6100-acs-1:~$ vtysh -c 'show run' | egrep 'PEER_V.? allowas'
neighbor PEER_V4 allowas-in 1
neighbor PEER_V6 allowas-in 1
```
Then we apply following configuration to the db:
```
admin@str-s6100-acs-1:~$ cat disable.json
{
"BGP_BBR": {
"all": {
"status": "disabled"
}
}
}
admin@str-s6100-acs-1:~$ sonic-cfggen -j disable.json -w
```
The log output are:
```
Oct 14 18:40:22.450322 str-s6100-acs-1 DEBUG bgp#bgpcfgd: Received message : '('all', 'SET', (('status', 'disabled'),))'
Oct 14 18:40:22.450620 str-s6100-acs-1 DEBUG bgp#bgpcfgd: execute command '['vtysh', '-f', '/tmp/tmpmWTiuq']'.
Oct 14 18:40:22.681084 str-s6100-acs-1 DEBUG bgp#bgpcfgd: execute command '['vtysh', '-c', 'clear bgp peer-group PEER_V4 soft in']'.
Oct 14 18:40:22.904626 str-s6100-acs-1 DEBUG bgp#bgpcfgd: execute command '['vtysh', '-c', 'clear bgp peer-group PEER_V6 soft in']'.
```
Check FRR configuraiton and see that no allowas parameters are there:
```
admin@str-s6100-acs-1:~$ vtysh -c 'show run' | egrep 'PEER_V.? allowas'
admin@str-s6100-acs-1:~$
```
Then we apply enabling configuration back:
```
admin@str-s6100-acs-1:~$ cat enable.json
{
"BGP_BBR": {
"all": {
"status": "enabled"
}
}
}
admin@str-s6100-acs-1:~$ sonic-cfggen -j enable.json -w
```
The log output:
```
Oct 14 18:40:41.074720 str-s6100-acs-1 DEBUG bgp#bgpcfgd: Received message : '('all', 'SET', (('status', 'enabled'),))'
Oct 14 18:40:41.074720 str-s6100-acs-1 DEBUG bgp#bgpcfgd: execute command '['vtysh', '-f', '/tmp/tmpDD6SKv']'.
Oct 14 18:40:41.587257 str-s6100-acs-1 DEBUG bgp#bgpcfgd: execute command '['vtysh', '-c', 'clear bgp peer-group PEER_V4 soft in']'.
Oct 14 18:40:42.042967 str-s6100-acs-1 DEBUG bgp#bgpcfgd: execute command '['vtysh', '-c', 'clear bgp peer-group PEER_V6 soft in']'.
```
Check FRR configuraiton and see that the BBR configuration is back:
```
admin@str-s6100-acs-1:~$ vtysh -c 'show run' | egrep 'PEER_V.? allowas'
neighbor PEER_V4 allowas-in 1
neighbor PEER_V6 allowas-in 1
```
*** The test coverage ***
Below is the test coverage
```
---------- coverage: platform linux2, python 2.7.12-final-0 ----------
Name Stmts Miss Cover
----------------------------------------------------
bgpcfgd/__init__.py 0 0 100%
bgpcfgd/__main__.py 3 3 0%
bgpcfgd/config.py 78 41 47%
bgpcfgd/directory.py 63 34 46%
bgpcfgd/log.py 15 3 80%
bgpcfgd/main.py 51 51 0%
bgpcfgd/manager.py 41 23 44%
bgpcfgd/managers_allow_list.py 385 21 95%
bgpcfgd/managers_bbr.py 76 0 100%
bgpcfgd/managers_bgp.py 193 193 0%
bgpcfgd/managers_db.py 9 9 0%
bgpcfgd/managers_intf.py 33 33 0%
bgpcfgd/managers_setsrc.py 45 45 0%
bgpcfgd/runner.py 39 39 0%
bgpcfgd/template.py 64 11 83%
bgpcfgd/utils.py 32 24 25%
bgpcfgd/vars.py 1 0 100%
----------------------------------------------------
TOTAL 1128 530 53%
```
**- Which release branch to backport (provide reason below if selected)**
- [ ] 201811
- [x] 201911
- [x] 202006
**- Why I did it**
FRR introduced [next hop tracking](http://docs.frrouting.org/projects/dev-guide/en/latest/next-hop-tracking.html) functionality.
That functionality requires resolving BGP neighbors before setting BGP connection (or explicit ebgp-multihop command). Sometimes (BGP MONITORS) our neighbors are not directly connected and sessions are IBGP. In this case current configuration prevents FRR to establish BGP connections. Reason would be "waiting for NHT". To fix that we need either add static routes for each not-directly connected ibgp neighbor, or enable command `ip nht resolve-via-default`
**- How I did it**
Put `ip nht resolve-via-default` into the config
**- How to verify it**
Build an image. Enable BGP_MONITOR entry and check that entry is Established or Connecting in FRR
Co-authored-by: Pavel Shirshov <pavel.contrib@gmail.com>
implements a new feature: "BGP Allow list."
This feature allows us to control which IP prefixes are going to be advertised via ebgp from the routes received from EBGP neighbors.