Commit Graph

15 Commits

Author SHA1 Message Date
Joe LeVeque
72b32a96fc
[201911][dockers][supervisor] Increase event buffer size for process exit listener (#7106)
Backport of https://github.com/Azure/sonic-buildimage/pull/7083 to the 201911 branch.

#### Why I did it

To prevent error [messages](https://dev.azure.com/mssonic/build/_build/results?buildId=2254&view=logs&j=9a13fbcd-e92d-583c-2f89-d81f90cac1fd&t=739db6ba-1b35-5485-5697-de102068d650&l=802) like the following from being logged:

```
Mar 17 02:33:48.523153 vlab-01 INFO swss#supervisord 2021-03-17 02:33:48,518 ERRO pool supervisor-proc-exit-listener event buffer overflowed, discarding event 46
```

This is basically an addendum to https://github.com/Azure/sonic-buildimage/pull/5247, which increased the event buffer size for dependent-startup. While supervisor-proc-exit-listener doesn't subscribe to as many events as dependent-startup, there is still a chance some containers (like swss, as in the example above) have enough processes running to cause an overflow of the default buffer size of 10.

This is especially important for preventing erroneous log_analyzer failures in the sonic-mgmt repo regression tests, which have started occasionally causing PR check builds to fail. Example [here](https://dev.azure.com/mssonic/build/_build/results?buildId=2254&view=logs&j=9a13fbcd-e92d-583c-2f89-d81f90cac1fd&t=739db6ba-1b35-5485-5697-de102068d650&l=802).

I set all supervisor-proc-exit-listener event buffer sizes to 1024, and also updated all dependent-startup event buffer sizes to 1024, as well, to keep things simple, unified, and allow headroom so that we will not need to adjust these values frequently, if at all.
2021-03-29 10:07:43 -07:00
Tamer Ahmed
7c5f0ff316
Start DHCP Relay When Helpers IPs Are Available (#6961) (#7059)
It is possible to have DHCP relay configuration with no servers/
helpers which result in DHCP container to crash. This PR fixes this
issue by not starting DHCP relay for vlans with no DHCP helpers.

resolves: #6931
closes: #6931
Do not add program group for dhcp relay with not dhcp helpers

Unit test
2021-03-15 14:43:50 -07:00
trzhang-msft
a0b824f83e
[docker-dhcp-relay]: add -si support in dhcp docker template (#7054) 2021-03-15 09:21:32 -07:00
Tamer Ahmed
c5bd46f857 [dhcp-relay]: Launch DHCP Relay On L3 Vlan (#6527)
Recent changes brought l2 vlan concept which do not have DHCP
clients behind them and so DHCP relay is not required. Also,
dhcpmon fails to launch on those vlans as their interfaces
lack IP addresses. This PR limit launch of both DHCP relay
and dhcpmon to L3 vlans only.

singed-off-by: Tamer Ahmed <tamer.ahmed@microsoft.com>
2021-01-25 12:38:16 -08:00
Joe LeVeque
b70c6f72b2 [dockers][supervisor] Increase event buffer size for dependent-startup (#5247)
When stopping the swss, pmon or bgp containers, log messages like the following can be seen:

```
Aug 23 22:50:43.789760 sonic-dut INFO swss#supervisord 2020-08-23 22:50:10,061 ERRO pool dependent-startup event buffer overflowed, discarding event 34
Aug 23 22:50:43.789760 sonic-dut INFO swss#supervisord 2020-08-23 22:50:10,063 ERRO pool dependent-startup event buffer overflowed, discarding event 35
Aug 23 22:50:43.789760 sonic-dut INFO swss#supervisord 2020-08-23 22:50:10,064 ERRO pool dependent-startup event buffer overflowed, discarding event 36
Aug 23 22:50:43.789760 sonic-dut INFO swss#supervisord 2020-08-23 22:50:10,066 ERRO pool dependent-startup event buffer overflowed, discarding event 37
```

This is due to the number of programs in the container managed by supervisor, all generating events at the same time. The default event queue buffer size in supervisor is 10. This patch increases that value in all containers in order to eliminate these errors. As more programs are added to the containers, we may need to further adjust these values. I increased all buffer sizes to 25 except for containers with more programs or templated supervisor.conf files which allow for a variable number of programs. In these cases I increased the buffer size to 50. One final exception is the swss container, where the buffer fills up to ~50, so I increased this buffer to 100.

Resolves https://github.com/Azure/sonic-buildimage/issues/5241
2020-09-28 16:12:53 +00:00
Tamer Ahmed
56cab18501 [dhcpmon] Print Both Snapshot And Current Counters (#5374)
Printing both snapshot and current counter sets will make it easier to pinpoint
which message type(s) is/are not being relayed. This PR prints both counter sets.
Also, this PR defines gnu11 as a C standard to compile with in order to avoid
making changes when porting to 201811 branch.

singed-off-by: Tamer Ahmed <tamer.ahmed@microsoft.com>
2020-09-19 14:06:25 -07:00
Tamer Ahmed
b27ba0630c [dhcpmon] Monitor Mgmt Interface For DHCP Packets (#5317)
When BGP routes are missing, DHCP packets get relayed over mgmt
interface. This results in dhcpmon alerting that DHCP packets are
not being relayed. This is PR include mgmt interface as uplink
device, and so, if DHCP packet gets relayed over mgmt interface,
regular dhcpmon alert will not be issues. Instead, dhcpmon will
check the mgmt interface counts and issue a separate alert regarding
packets travelling through mgmt network.

In addition, this PR includes the following enhancements:
1. Add SIGUSR1 handler that prints out current packet counts
2. Increase alert grace window to 3 minutes from currently 2 minutes
3. Time is now computed more accurately
4. Print vlan name before counters

signed-off-by: Tamer Ahmed <tamer.ahmed@microsoft.com>
2020-09-19 14:05:49 -07:00
Guohan Lu
b378b4d249 [docker-dhcp-relay]: use service dependency in supervisord to start services 2020-08-15 22:25:52 -07:00
yozhao101
71225ea4cc [Service] Enable/disable container auto-restart based on configuration. (#4073) 2020-02-13 16:20:21 -08:00
Tamer Ahmed
c883583e20 [dhcp-relay]: Add DHCP Relay Monitor (#3886)
DHCP relay MONitor (dhcpmon) keeps track of DORA messages. If DHCP Relay
is detected to be not forwarding DORA message, dhcpmon will log such event
to syslog. Under the hood dhcpmon keeps counts of clients DR messages,
forwarded DR messages, DHCP server OA messages, and forwarded OA messages.
dhcpmon will check every 12 sec (configurable) if counts are monotonically
increasing and record snapshot of those counters. dhcpmon will report
discrepancies when detected between current counters and snapshot counters.

pull-request: https://github.com/Azure/sonic-buildimage/pull/3886
signed-off-by: Tamer Ahmed <tamer.ahmed@microsoft.com>
2020-01-21 10:41:30 -08:00
yozhao101
ed79f54569 [Services] Restart DHCP-Relay service upon unexpected critical process exit. (#3667)
Signed-off-by: Yong Zhao <yozhao@microsoft.com>
2019-11-05 18:32:14 -08:00
wangshengjun
9fdc6bde8c [dhcp_relay]:filter out the ipv6 address of dhcp server for dhcp rela… (#3397)
* [dhcp_relay]:filter out the ipv6 address of dhcp server for dhcp relay(v4) config file.

Signed-off-by: wangshengjun <wangshengjun@asterfusion.com>
2019-09-06 12:01:08 -07:00
Prince Sunny
231d309b69
Generate interface table to have an entry designated to default VRF. (#2848)
* Generate default VRF table for router interfaces

* Updated jinja2 template to have prefix filter
2019-06-10 14:02:55 -07:00
Joe LeVeque
552684fc08
[dhcp_relay] Add support for DHCP client(s) on one VLAN and DHCP server(s) on another (#2946) 2019-06-03 14:26:45 -07:00
Joe LeVeque
1d16a37d48 [DHCP Relay]: Support Multiple VLANs (Separate DHCP Relay Agents, One Per VLAN) (#999)
* [DHCP Relay]: Support new <DhcpRelays> minigraph tag; support multiple VLANs

* Don't start dhcrelay in quiet mode so as to get startup output in syslog

* Update sonic-cfggen tests to support new '<DhcpRelays>' tag

* <DhcpRelays> tag is only present for VLANs which require a DHCP relay agent -- only parse if present

* Don't attempt to configure a DHCP relay agent for VLANs without specified DHCP servers

* Modify to work with Taoyu's minigraph/DB changes (#942)

* Reduce number of DHCP servers in sonic-cfggen unit tests from 4 to 2

* Remove isc-dhcp-relay sample output file from sonic-cfggen test, as we no longer generate that file

* Update Option 82 isc-dhcp-relay patch to load all interface name-alias maps into memory once at start instead of calling sonic-cfggen on each packet we relay

* Remove executable permission from Jinja2 template

* Set max hop count to 1 so that DHCP relay will only relay packets with a hop count of zero

* Replace tabs with spaces

* Modify overlooked sonic-cfggen call, use Config DB instead of minigraph

* Also ensure > 1 VLAN requires a DHCP relay agent before outputting to template

* Generate port name-alias map file using sonic-cfggen and parse that in lieu of parsing port_config.ini directly

* No longer drop packets with hop count > 0; Instead, drop packets which already contain agent info
2017-10-04 23:35:43 -07:00