sonic-buildimage/dockers/docker-fpm-frr/frr
pavel-shirshov 89184038fd
[docker-fpm-frr]: Start bgpd after zebra was started (#5038)
fixes https://github.com/Azure/sonic-buildimage/issues/5026
Explanation:
In the log from the issue I found:
```
Jul 22 21:13:06.574831 vlab-01 WARNING bgp#bgpd[49]: [EC 33554499] sendmsg_nexthop: zclient_send_message() failed
```
Analyzing the source code, I found that this error message can be issued only when `zclient_send_rnh()` returns a value less than 0.
```
	ret = zclient_send_rnh(zclient, command, p, exact_match,
			       bnc->bgp->vrf_id);
	/* TBD: handle the failure */
	if (ret < 0)
		flog_warn(EC_BGP_ZEBRA_SEND,
			  "sendmsg_nexthop: zclient_send_message() failed");
```
I checked [zclient_send_rnh()](88351c8f6d/lib/zclient.c (L654)) and found that this function returns the code it gets from [zclient_send_message()](88351c8f6d/lib/zclient.c (L266)). The latter function can return a nonzero value in two cases:
1. bgpd has not connected to the zclient socket yet. [code](88351c8f6d/lib/zclient.c (L269))
2. The socket was closed. In that case, however, we would also see an error message in the log. (I can find that message in the log when we reboot SONiC.) [code](88351c8f6d/lib/zclient.c (L277))
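
The return-code chain described above can be sketched with a toy model. This is not FRR code: `ToyZclient`, `send_message`, `send_rnh` and `register_nexthop` are hypothetical Python stand-ins, and the sketch only covers the first case (a client whose socket is not connected yet).

```python
class ToyZclient:
    """Toy stand-in for a zebra client; sock stays at -1 until connected."""

    def __init__(self):
        self.sock = -1
        self.sent = []


def send_message(client, payload):
    """Return -1 when the client is not connected, 0 after queuing the payload."""
    if client.sock < 0:
        return -1
    client.sent.append(payload)
    return 0


def send_rnh(client, prefix):
    """Pass the result of send_message() straight back to the caller."""
    return send_message(client, ("rnh", prefix))


def register_nexthop(client, prefix, warnings):
    """Mirror the caller quoted above: record a warning on a negative result."""
    ret = send_rnh(client, prefix)
    if ret < 0:
        warnings.append("sendmsg_nexthop: zclient_send_message() failed")
    return ret
```

In this model, calling `register_nexthop()` before `sock` is set records the warning, and the same call after `sock` is set does not.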

The logs also show that the client connection was established after the issue occurred in bgpd.

Bgpd.log
```
Jul 22 21:13:06.574831 vlab-01 WARNING bgp#bgpd[49]: [EC 33554499] sendmsg_nexthop: zclient_send_message() failed
```
Vs
Zebra.log
```
Jul 22 21:13:12.713249 vlab-01 NOTICE bgp#zebra[48]: client 25 says hello and bids fair to announce only static routes vrf=0
Jul 22 21:13:12.820352 vlab-01 NOTICE bgp#zebra[48]: client 30 says hello and bids fair to announce only bgp routes vrf=0
Jul 22 21:13:12.820352 vlab-01 NOTICE bgp#zebra[48]: client 33 says hello and bids fair to announce only vnc routes vrf=0
```
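
The gap between the two events can be checked with a short script. This is only a sketch: the two timestamps are copied from the excerpts above, and `parse_syslog_time` is a hypothetical helper name.

```python
from datetime import datetime

# Timestamps copied from the bgpd and zebra log excerpts (syslog format, no year).
BGPD_WARNING = "Jul 22 21:13:06.574831"
ZEBRA_BGP_HELLO = "Jul 22 21:13:12.820352"


def parse_syslog_time(text):
    """Parse a 'Mon DD HH:MM:SS.ffffff' timestamp; the year defaults to 1900."""
    return datetime.strptime(text, "%b %d %H:%M:%S.%f")


# Positive gap means the bgpd warning was logged before zebra saw the bgp client.
gap = parse_syslog_time(ZEBRA_BGP_HELLO) - parse_syslog_time(BGPD_WARNING)
print(f"bgpd warned {gap.total_seconds():.6f} s before the bgp client hello in zebra")
```

The warning in bgpd precedes the hello from the bgp client in zebra by a little over six seconds.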
So in our case we should start zebra first, wait until it is running, and only then start bgpd and the other daemons.
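
One generic way to "wait until it is started" is to poll until a daemon's listening socket accepts a connection. The sketch below only illustrates that idea and is not necessarily what this change does; `wait_for_unix_socket` is a hypothetical helper, and the socket path is left to the caller because the real path is not stated here.

```python
import socket
import time


def wait_for_unix_socket(path, timeout=10.0, interval=0.1):
    """Poll until a stream Unix socket at `path` accepts a connection.

    Returns True on the first successful connect, False once `timeout`
    seconds have passed without one.
    """
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as probe:
            try:
                probe.connect(path)
                return True
            except OSError:
                # Socket file missing or nobody listening yet: retry shortly.
                time.sleep(interval)
    return False
```

A launcher would call this with the path of the server socket and start the dependent daemons only after it returns `True`.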

**- How I did it**

I changed the startup graph so that the daemons start in the following order:
1. First, start zebra.
2. Then start staticd and bgpd.
3. Then start vtysh -b and bgpeoi, after bgpd has started.
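
The order above can be expressed as a small dependency graph and resolved into start "waves". This is a sketch only: the `DEPENDS_ON` map is an assumption derived from the list above (staticd and bgpd wait for zebra; vtysh -b and bgpeoi wait for bgpd), not the actual supervisord configuration, and `start_waves` is a hypothetical helper. It uses `graphlib` from the Python standard library (3.9+).

```python
from graphlib import TopologicalSorter

# Assumed dependency map: each daemon lists what must be running before it.
DEPENDS_ON = {
    "zebra": [],
    "staticd": ["zebra"],
    "bgpd": ["zebra"],
    "vtysh -b": ["bgpd"],
    "bgpeoi": ["bgpd"],
}


def start_waves(depends_on):
    """Group daemons into waves; each wave needs only earlier waves to be running."""
    sorter = TopologicalSorter(depends_on)
    sorter.prepare()
    waves = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # sorted for a stable, readable order
        waves.append(ready)
        sorter.done(*ready)
    return waves


waves = start_waves(DEPENDS_ON)
for number, wave in enumerate(waves, start=1):
    print(number, ", ".join(wave))
```

With this map the first wave contains only zebra, the second staticd and bgpd, and the third vtysh -b and bgpeoi.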
2020-07-25 03:48:47 -07:00
| Name | Last commit | Date |
| --- | --- | --- |
| bgpd | [bgpcfgd]: Add Vlan prefix list to the FRR templates (#5005) | 2020-07-21 19:26:19 -07:00 |
| common | [bgpcfgd]: Split one bgp mega-template to chunks. (#4143) | 2020-04-23 09:42:22 -07:00 |
| staticd | [bgpcfgd]: Split one bgp mega-template to chunks. (#4143) | 2020-04-23 09:42:22 -07:00 |
| supervisord | [docker-fpm-frr]: Start bgpd after zebra was started (#5038) | 2020-07-25 03:48:47 -07:00 |
| zebra | [bgpcfgd]: Split one bgp mega-template to chunks. (#4143) | 2020-04-23 09:42:22 -07:00 |
| frr.conf.j2 | [bgpcfgd]: Split one bgp mega-template to chunks. (#4143) | 2020-04-23 09:42:22 -07:00 |
| isolate.j2 | [bgpcfgd]: Split one bgp mega-template to chunks. (#4143) | 2020-04-23 09:42:22 -07:00 |
| unisolate.j2 | [bgpcfgd]: Split one bgp mega-template to chunks. (#4143) | 2020-04-23 09:42:22 -07:00 |