What I did:
Workaround for the issue seen here : FRRouting/frr#13682
It seems there is timing issue where there are multiple recursive lookup needed to resolve nexthop of the route it's possible that it does not happen correctly causing route to remain in inactive state
Issue is seen on chassis-packet as there 2 level of recursive lookup needed for a given e-BGP learnt route
- Level1 to resolve e-BGP peer (connected route via bgp ) over Loopback4096 (i-BGP peering)
- Level 2 Loopback4096 over backend port-channels next-hops
For VOQ chassis there is no e-BGP peer (connected route via bgp ) resolution as route is added as Static route by orchagent over Ethernet-IB.
Also as part of this remove route-map policy from instance.conf.j2 as same is define in peer-group.j2.
Microsoft ADO: https://msazure.visualstudio.com/One/_workitems/edit/24198507
How I verify:
Functional Verification manually
Updated UT.
We will be adding sanity check in sonic-mgmt to make sure none of route are in inactive state.
Signed-off-by: Abhishek Dosi <abdosi@microsoft.com>
Why I did it
To ensure, that after a BGP startup, dualtor T0 receives BGP updates before sending out BGP updates.
Please refer to sonic-net/SONiC#1161 for more details.
How I did it
add coalesce-time 10000 to the frr bgp startup config.
Signed-off-by: Longxiang Lyu <lolv@microsoft.com>
#### Why I did it
Improve naming convention for bgp notification events and change type of leaf for sonic-events-host mem usage from uint64 to decimal64
#### How I did it
Replace "-" with "_"
Replace uint64 with decimal64
#### How to verify it
Run yang model unit tests
#### Description for the changelog
Change YANG model leaf naming convention for bgp notification
Backport of https://github.com/sonic-net/sonic-buildimage/pull/12490 into 202211
- Why I did it
Support syslog rate limit configuration feature
- How I did it
Remove unused rsyslog.conf from containers
Modify docker startup script to generate rsyslog.conf from template files
Add metadata/init data for syslog rate limit configuration
- How to verify it
Manual test
New sonic-mgmt regression cases
- Why I did it
The values for config_db "docker_routing_config_mode" are:
separated: FRR config generated from ConfigDB, each FRR daemon has its own config file
unified: FRR config generated from ConfigDB, single FRR config file
split: FRR config not generated from ConfigDB, each FRR daemon has its own config file
This commit adds:
split-unified: FRR config not generated from ConfigDB, single FRR config file
- How I did it
In docker_init.sh, when split-unified is used, the FRR configs are not generated
from ConfigDB. What's more, "service integrated-vtysh-config" is configured in vtysh.conf.
- How to verify it
FRR config not overwritten when FRR container starts.
Signed-off-by: Arnaud le Taillanter <a.letaillanter@criteo.com>
bgpd.main.conf.j2: bugfix-9739
* Update bgpd.main.conf.j2 to gracefully handle the bgp configuration cases for when 'bgp_asn' is set to 'None', 'Null', or missing.
How I did it
Include a conditional statement to avoid configuring bgp in FRR when 'bgp_asn' is missing or set to 'None' or 'Null'
How to verify it
Configure 'bgp_asn' as 'None', 'Null' or have it missing from configurations and verify that /etc/frr/bgpd.conf does not have invalid bgp configurations like 'router bgp None'
Description for the changelog
Update bgpd.main.conf.j2 to gracefully handle the bgp configuration cases for when 'bgp_asn' is set to 'None', 'Null', or missing for bugfix 9739.
Signed-off-by: cchoate54@gmail.com
With this PR in, you flap BGP and use events_tool to see the published events.
With telemetry PR #111 in and corresponding submodule update done in buildimage, one could run gnmi_cli to capture BGP flap events.
Why I did it
Migrate FRR to bullseye
How I did it
Makefile and docker config changes to refer to bullseye instead of buster.
How to verify it
Build bullseye frr docker.
Co-authored-by: Rajendra Dendukuri <rajendra.dendukuri@broadcom.com>
* [BGP]Adding configuration knob to allow advertise Loopback ipv6 /128 prefix
By default when IPv6 address is configured with /128 as subnet mask in Loopback0 interface, it will be advertised as prefix with /64 subnet.
To control this behavior a new field 'bgp_adv_lo_prefix_as_128' is introduced in DEVICE_METADATA table which when set to true will advertise prefix with /128 subnet as it is.
Currently, the build dockers are created as a user dockers(docker-base-stretch-<user>, etc) that are
specific to each user. But the sonic dockers (docker-database, docker-swss, etc) are
created with a fixed docker name and common to all the users.
docker-database:latest
docker-swss:latest
When multiple builds are triggered on the same build server that creates parallel building issue because
all the build jobs are trying to create the same docker with latest tag.
This happens only when sonic dockers are built using native host dockerd for sonic docker image creation.
This patch creates all sonic dockers as user sonic dockers and then, while
saving and loading the user sonic dockers, it rename the user sonic
dockers into correct sonic dockers with tag as latest.
docker-database:latest <== SAVE/LOAD ==> docker-database-<user>:tag
The user sonic docker names are derived from 'DOCKER_USERNAME and DOCKER_USERTAG' make env
variable and using Jinja template, it replaces the FROM docker name with correct user sonic docker name for
loading and saving the docker image.
Why I did it
In the recent minigraph changes we add separate BGP session configuration for V4 and V6 internal VoQ neighbors.
This PR is adding different Peer groups for V4 and V6 neighbors
How I did it
Add VOQ_CHASSIS_V4_PEER and VOQ_CHASSIS_V6_PEER groups
Add extra Unit tests
How to verify it
Signed-off-by: Arvindsrinivasan Lakshmi Narasimhan <arlakshm@microsoft.com>
Why I did it
Add TSA/B/C dualtor support
Signed-off-by: Longxiang Lyu lolv@microsoft.com
How I did it
For TSA, toggle all the mux to standby if the device type is dualtor and there are active mux ports.
For TSC, add mux status output.
How to verify it
Run TSA/B/C on a dualtor setup
What I did:
Updated Jinja Template to enable BGP Graceful Restart based on device role. By default it will be enable only if the device role type is TorRouter.
Why I did:-
By default FRR is configured in Graceful Helper mode. Graceful Restart is needed on T0/TorRouter only since the device can go for warm-reboot. For T1/LeafRouter it need to be in Helper mode only
Updated BGP Template for the case:
1. For Packet Chassis do not advertise Loopback4096 address into BGP as there is Static Route for same.
Having this route in BGP causes two level of recursion in Zebra and cause assert in Zebra
when there are many nexthop involved
2. Advertise only P2P Connected IP's into BGP (External Peers). For Packet chassis we have backend IP Interface subnet and if
they get advertised into BGP then it also causes recursion
The BGP_VOQ_CHASSIS_NEIGHBOR keepalive and holdtime timers are
configured similar to general neighbors. Changes are done to configure
BGP_VOQ_CHASSIS_NEIGHBOR timers similar to BGP_INTENAL_NEIGBOR since voq
chassis bgp neighbors are similar to bgp internal neighbors in
multi-asic. As it is done for bgp internal neighbors, the keepalive and
holdtime timers are set to 3 and 10 seconds respectively. Also similar
to bgp internal neighbors, connection retry timer is also configured for
voq chassis bgp neighbors.
Signed-off-by: vedganes <vedavinayagam.ganesan@nokia.com>
Why I did it
resolves#8979 and #9055
How I did it
Remove the file static.conf.j2,which adds the default route on eth0 from bgp docker
Signed-off-by: Arvindsrinivasan Lakshmi Narasimhan <arlakshm@microsoft.com>
What I did:
Fix the typo in Internal Peer Group template for Packet-based Chassis.
Address Review comments of PR: [chassis-packet] minigraph parsing and BGP template changes #8966
- Static Route Parsing for Host
- Formatting of chassis port_config.ini
1. Changes for Generation LC-Graph for packet-based chassis.
2. Added Support Ipv6 Peering on Loopback4096 for voq also
3. Updated asic topology yml files to be offset of slot
4. Made slot_num to take string slot<number> instead of number
5. Consolidated template_dpg_voq_asic.j2 into dpg_asic.j2
6. Remove Loopback4096 from asic topology and parse as dut invertory for
multi-asic
7. Updated topo_facts parsing for asic topology_
8. Internal BGP Session rename from <VoqChassisInternal> to <ChassisInternal> and take switch_type as value.
Signed-off-by: Abhishek Dosi <abdosi@microsoft.com>
For multiasic, the back end asics use ip addresss of Loopback4096 for BGP router id. In VOQ multi-asic chassis there are no back end asics. All the asics are front end and the iBGP connections are established via Ethernet-IB of asics. Since these asics are not designated as BackEnd, the ip address of interface Loopback0 is used as BGP router id. Since the ip address of Loopback0 is same for all the asics in the line card, same router id is used for voq iBGP configurations and hence the iBGP connections are not established. Changes are done to fix this
Why I did it
There are scenarios that End-of-RIB comes from a part of the peers arrives after reconciliation. In such scenarios, if the route selection deferral timer has the default value of 360 seconds, FRR would not set up routes and all routes would be removed after reconciliation. This PR reduces the route selection deferral timer so that at least routes to parts of the peers get restored at the point of reconciliation.
Fix#7488
How I did it
Reduce route selection deferral timer for bgp graceful restart to 15 seconds.
Signed-off-by: Arvindsrinivasan Lakshmi Narasimhan <arlakshm@microsoft.com>
In the multi asic platforms all the ASIC are advertising the same IPv6 /64 network from Loopback4096.
Therefore, the IPv6 loopback address of backend asic is not learnt on the frontend asic.
Change the bgpd.conf.main.conf.j2 template file to advertise the Loopback4096 ipv6 address as /128
Signed-off-by: Yong Zhao yozhao@microsoft.com
Why I did it
Currently we leveraged the Supervisor to monitor the running status of critical processes in each container and it is more reliable and flexible than doing the monitoring by Monit. So we removed the functionality of monitoring the critical processes by Monit.
How I did it
I removed the script process_checker and corresponding Monit configuration entries of critical processes.
How to verify it
I verified this on the device str-7260cx3-acs-1.
Why I did it
Support readonly version of the command vtysh
How I did it
Check if the command starting with "show", and verify only contains single command in script.
For the split mode, the config files, like bgpd.conf, zebra.conf and so on, were provided by outside. But the docker_init.sh will overwrite the outside config files if restart bgp service.
How I did it
Add a split mode checking in docker_init.sh, if docker_routing_config_mode is split, don't overwrite the existing routing config files.
How to verify it
Set split mode in config db
{
"DEVICE_METADATA": {
"localhost": {
"hwsku": "Force10-S6000",
"platform": "x86_64-kvm_x86_64-r0",
"docker_routing_config_mode": "split"
...
}
}
}
Replace your bgpd.conf to /etc/sonic/frr/bgpd.conf
Restart bgp service by sudo service bgp restart
The /etc/sonic/frr/bgpd.conf your provided shouldn't be overwritten
Signed-off-by: Ze Gan <ganze718@gmail.com>
This commit has following changes:
* Add templates and code to support VoQ chassis iBGP peers
* Add support to convert a new VoQChassisInternal element in the
BGPSession element of the minigraph to a new BGP_VOQ_CHASSIS_NEIGHBOR
table in CONFIG_DB.
* Add a new set of "voq_chassis" templates to docker-fpm-frr
* Add a new BGP peer manager to bgpcfgd to add neighbors from the
BGP_VOQ_CHASSIS_NEIGHBOR table using the voq_chassis templates.
* Add a test case for minigraph.py, making sure the VoQChassisInternal
element creates a BGP_VOQ_CHASSIS_NEIGHBOR entry, but not if its
value is "false".
* Add a set of test cases for the new voq_chassis templates in
sonic-bgpcfgd tests.
Note that the templates expect the new
"bgp bestpath peer-type multipath-relax" bgpd configuration to be
available.
Signed-off-by: Joanne Mikkelson <jmmikkel@arista.com>
With the latest 201911 image, the following error was seen on staging devices with TSB command ( for both single asic, multi asic ). Though this err message doesn't affect the TSB functionality, it is good to fix.
admin@STG01-0101-0102-01T1:~$ TSB
BGP0 : % Could not find route-map entry TO_TIER0_V4 20
line 1: Failure to communicate[13] to zebra, line: no route-map TO_TIER0_V4 permit 20
% Could not find route-map entry TO_TIER0_V4 30
line 2: Failure to communicate[13] to zebra, line: no route-map TO_TIER0_V4 deny 30
In addition, in this PR I am fixing the message displayed to user when there are no BGP neighbors configured on that BGP instance. In multi-asic device there could be case where there are no BGP neighbors configured on a particular ASIC.
The default bgp connect retry timer is 120 seconds. A reconnection will happen 120 seconds if the initial connection fails. This PR aims to allow a more frequent retry.
Why I did it
It was observed that on a multi-asic DUT bootup, the BGP internal sessions between ASIC's was taking more time to get ESTABLISHED than external BGP sessions. The internal sessions was coming up almost exactly 120 secs later.
In multi-asic platform the bgp dockers ( which is per ASIC ) on switch start are bring brought up around the same time and they try to make the bgp sessions with neighbors (in peer ASIC's) which may be not be completely up. This results in BGP connect fail and the retry happens after 120sec which is the default Connect Retry Timer
How I did it
Add the command to set the bgp neighboring session retry timer to 10sec for internal bgp neighbors.
1. Made the command next-hop-self force only applicable on back-end asic bgp. This is done so that BGPL iBGP session running on backend can send e-BGP learn nexthop. Back end asic FRR is able to recursively resolve the eBGP nexthop in its routing table since it knows about all the connected routes advertise from front end asic.
2. Made all front-end asic bgp use global loopback ip (Loopback0) as router id and back end asic bgp use Loopbacl4096 as ruter-id and originator id for Route-Reflector. This is done so that routes learnt by external peer do not see Loopback4096 as router id in show ip bgp <route-prerfix> output.
3. To handle above change need to pass Loopback4096 from BGP manager for jinja2 template generation. This was missing and this change/fix is needed for this also https://github.com/Azure/sonic-buildimage/blob/master/dockers/docker-fpm-frr/frr/bgpd/templates/dynamic/instance.conf.j2#L27
4. Enhancement to add mult_asic specific bgpd template generation unit test cases.
Enable BBR config allowas-in 1 for internal peers
Why I did:
To advertise BBR routes learnt via e-BGP peer in one asic/namespace to another iBGP asic/namespace via Route Reflector.
- Introduced TS common file in docker as well and moved common functions.
- TSA/B/C scripts run only in BGP instances for front end ASICs.
In addition skip enforcing it on route maps used between internal BGP sessions.
admin@str--acs-1:~$ sudo /usr/bin/TSA
System Mode: Normal -> Maintenance
and in case of Multi-ASIC
admin@str--acs-1:~$ sudo /usr/bin/TSA
BGP0 : System Mode: Normal -> Maintenance
BGP1 : System Mode: Normal -> Maintenance
BGP2 : System Mode: Normal -> Maintenance
The requirement for zebra to be ready to accept connections is a generic problem that is not
specific to bgpd. Making the script to wait for zebra socket a separate script and let bgpd and
staticd to wait for zebra socket.
- Support for non-template based FRR configurations (BGP, route-map, OSPF, static route..etc) using config DB schema.
- Support for save & restore - Jinja template based config-DB data read and apply to FRR during startup
**- How I did it**
- add frrcfgd service
- when frr_mgmg_framework_config is set, frrcfgd starts in bgp container
- when user changed the BGP or other related table entries in config DB, frrcfgd will run corresponding VTYSH commands to program on FRR.
- add jinja template to generate FRR config file to be used by FRR daemons while bgp container restarted
**- How to verify it**
1. Add/delete data on config DB and then run VTYSH "show running-config" command to check if FRR configuration changed.
1. Restart bgp container and check if generated FRR config file is correct and run VTYSH "show running-config" command to check if FRR configuration is consistent with attributes in config DB
Co-authored-by: Zhenhong Zhao <zhenhong.zhao@dell.com>
- Why I did it
Initially, we used Monit to monitor critical processes in each container. If one of critical processes was not running
or crashed due to some reasons, then Monit will write an alerting message into syslog periodically. If we add a new process
in a container, the corresponding Monti configuration file will also need to update. It is a little hard for maintenance.
Currently we employed event listener of Supervisod to do this monitoring. Since processes in each container are managed by
Supervisord, we can only focus on the logic of monitoring.
- How I did it
We borrowed the event listener of Supervisord to monitor critical processes in containers. The event listener will take
following steps if it was notified one of critical processes exited unexpectedly:
The event listener will first check whether the auto-restart mechanism was enabled for this container or not. If auto-restart mechanism was enabled, event listener will kill the Supervisord process, which should cause the container to exit and subsequently get restarted.
If auto-restart mechanism was not enabled for this contianer, the event listener will enter a loop which will first sleep 1 minute and then check whether the process is running. If yes, the event listener exits. If no, an alerting message will be written into syslog.
- How to verify it
First, we need checked whether the auto-restart mechanism of a container was enabled or not by running the command show feature status. If enabled, one critical process should be selected and killed manually, then we need check whether the container will be restarted or not.
Second, we can disable the auto-restart mechanism if it was enabled at step 1 by running the commnad sudo config feature autorestart <container_name> disabled. Then one critical process should be selected and killed. After that, we will see the alerting message which will appear in the syslog every 1 minute.
- Which release branch to backport (provide reason below if selected)
201811
201911
[x ] 202006
Fix#5026
There is a race condition between zebra server accepts connections and bgpd tries to connect. Bgpd has a chance to try to connect before zebra is ready. In this scenario, bgpd will try again after 10 seconds and operate as normal within these 10 seconds. As a consequence, whatever bgpd tries to sent to zebra will be missing in the 10 seconds. To avoid such a scenario, bgpd should start after zebra is ready to accept connections.