Commit Graph

21 Commits

Author SHA1 Message Date
Lawrence Lee
d0f16c0d79
Make backend device checking more robust (#5730)
Treat devices that are ToRRouters (ToRRouters and BackEndToRRouters) the same when rendering templates
 Except for BackEndToRRouters belonging to a storage cluster, since these devices have extra sub-interfaces created
Treat devices that are LeafRouters (LeafRouters and BackEndLeafRouters) the same when rendering templates

Signed-off-by: Lawrence Lee <lawlee@microsoft.com>
2020-11-10 15:06:35 -08:00
Joe LeVeque
5b3b4804ad
[dockers][supervisor] Increase event buffer size for dependent-startup (#5247)
When stopping the swss, pmon or bgp containers, log messages like the following can be seen:

```
Aug 23 22:50:43.789760 sonic-dut INFO swss#supervisord 2020-08-23 22:50:10,061 ERRO pool dependent-startup event buffer overflowed, discarding event 34
Aug 23 22:50:43.789760 sonic-dut INFO swss#supervisord 2020-08-23 22:50:10,063 ERRO pool dependent-startup event buffer overflowed, discarding event 35
Aug 23 22:50:43.789760 sonic-dut INFO swss#supervisord 2020-08-23 22:50:10,064 ERRO pool dependent-startup event buffer overflowed, discarding event 36
Aug 23 22:50:43.789760 sonic-dut INFO swss#supervisord 2020-08-23 22:50:10,066 ERRO pool dependent-startup event buffer overflowed, discarding event 37
```

This is due to the number of programs in the container managed by supervisor, all generating events at the same time. The default event queue buffer size in supervisor is 10. This patch increases that value in all containers in order to eliminate these errors. As more programs are added to the containers, we may need to further adjust these values. I increased all buffer sizes to 25 except for containers with more programs or templated supervisor.conf files which allow for a variable number of programs. In these cases I increased the buffer size to 50. One final exception is the swss container, where the buffer fills up to ~50, so I increased this buffer to 100.

Resolves https://github.com/Azure/sonic-buildimage/issues/5241
2020-09-08 23:36:38 -07:00
Joe LeVeque
fb8f09a116
[radvd] No longer build from source; Install vanilla Debian package once again (#5242)
Remove radvd Makefile and patch, change docker-router-advertiser Dockerfile template to simply install the vanilla radvd package using apt-get.

- In PR https://github.com/Azure/sonic-buildimage/pull/2795, we started building radvd from source and patching it to prevent it from erroring out when advertising an MTU of 9100 which was greater than the MTU size configured on the bridge interface (1500), which was due to a limitation in the 4.9 Linux kernel.
- Master branch is now using Linux kernel 4.19. As of 4.18, the kernel supports setting a bridge MTU to a value > 1500.
- PR https://github.com/Azure/sonic-swss/pull/1393 modified vlanmgrd to take advantage of this and now configures the MTU of bridge interfaces in SONiC to the proper size of 9100. Therefore, we no longer need to patch radvd. Since we no longer need to patch radvd, we no longer need to build it from source, so we can save build time by going back to simply installing the vanilla radvd Debian package in the router-advertiser container.
2020-09-01 13:53:36 -07:00
Joe LeVeque
97d44214cf
[docker-radv] Fix startup issues (#5230)
**- Why I did it**

PR https://github.com/Azure/sonic-buildimage/pull/4599 introduced two bugs in the startup of the router advertiser container:

1. References to the `wait_for_intf.sh` script were changed to `wait_for_link.sh`, but the actual script was not renamed
2. The `ipv6_found` Jinja2 variable added to the supervisor config file goes out of scope before it is read.

**- How I did it**
1. Rename the `wait_for_intf.sh` script to `wait_for_link.sh`
2. Use the Jinja2 "namespace" construct to fix the scope issue

**- How to verify it**

Ensure all processes in the radv container start properly under the correct conditions (i.e., whether or not there is at least one VLAN with an IPv6 address assigned).
2020-08-21 13:12:01 -07:00
Tamer Ahmed
adcca53b8d
[radv] Reduce Calls to SONiC Cfggen (#5178)
Calls to sonic-cfggen is CPU expensive. This PR reduces calls to
sonic-cfggen to one call during startup when starting radv service.

singed-off-by: Tamer Ahmed <tamer.ahmed@microsoft.com>
2020-08-17 15:48:04 -07:00
yozhao101
4fa81b4f8d
[dockers] Update critical_processes file syntax (#4831)
**- Why I did it**
Initially, the critical_processes file contains either the name of critical process or the name of group.
For example, the critical_processes file in the dhcp_relay container contains a single group name
`isc-dhcp-relay`. When testing the autorestart feature of each container, we need get all the critical
processes and test whether a  container can be restarted correctly if one of its critical processes is
killed. However, it will be difficult to differentiate whether the names in the critical_processes file are
the critical processes or group names. At the same time, changing the syntax in this file will separate the individual process from the groups and also makes it clear to the user.

Right now the critical_processes file contains two different kind of entries. One is "program:xxx" which indicates a critical process. Another is "group:xxx" which indicates a group of critical processes
managed by supervisord using the name "xxx". At the same time, I also updated the logic to
parse the file critical_processes in supervisor-proc-event-listener script.

**- How to verify it**
We can first enable the autorestart feature of a specified container for example `dhcp_relay` by running the comman `sudo config container feature autorestart dhcp_relay enabled` on DUT. Then we can select a critical process from the command `docker top dhcp_relay` and use the command `sudo kill -SIGKILL <pid>` to kill that critical process. Final step is to check whether the container is restarted correctly or not.
2020-06-25 21:18:21 -07:00
joyas-joseph
1714e621be
[docker-radv]: Convert radv docker to buster (#4727)
* Set radvd version to match buster version(2.17-2)

Signed-off-by: Joyas Joseph <joyas_joseph@dell.com>
2020-06-12 16:10:23 -07:00
Guohan Lu
7ea6d9dc8f [docker-radvd]: use service dependency in supervisord to start services 2020-05-22 11:01:28 -07:00
yozhao101
91e5fb5602
[Service] Enable/disable container auto-restart based on configuration. (#4073) 2020-02-07 12:34:07 -08:00
Dong Zhang
5057ac3122 [MultiDB] (./dockers dir) : replace redis-cli with sonic-db-cli and use new DBConnector (#3923)
* [MultiDB] (./dockers dirs): replace redis-cli with sonic-db-cli and use new DBConnector

* remove unnecessary quota

* update typo
2020-01-22 11:27:21 -08:00
yozhao101
cff30c59d0 [Services] Restart Router-advertiser service upon unexpected critical process exit (#3681)
Signed-off-by: Yong Zhao <yozhao@microsoft.com>
2019-10-30 16:41:55 -07:00
Stepan Blyshchak
81cf33231f [build]: Improve dockerfile instructions (#3048)
- create a dockerfile-marcros.j2 file with all common operations
  written as j2 macro
- use single dockerfile instruction for COPY and RUN commands
  when possible to improve build time
- reorganize dockerfile instructions to make more cache friendly
  (in case someday we will remove --no-cache to build docker images)

Signed-off-by: Stepan Blyschak <stepanb@mellanox.com>
2019-06-22 11:26:23 -07:00
Prince Sunny
231d309b69
Generate interface table to have an entry designated to default VRF. (#2848)
* Generate default VRF table for router interfaces

* Updated jinja2 template to have prefix filter
2019-06-10 14:02:55 -07:00
Joe LeVeque
c0904f766b
[radvd] Build radvd from source; Patch so as not to treat out-of-range MTU as an error (#2795) 2019-04-17 16:41:20 -07:00
Joe LeVeque
b48037090e [router-advertiser] Add templated script to wait for pertinent interfaces to be ready before starting radvd (#2558) 2019-03-02 15:45:43 -08:00
lguohan
f682e7b131
[docker-radvd]: upgrade docker radvd to stretch based (#2524)
* [docker-radvd]: upgrade docker radvd to stretch based

* install jinja>=2.10

Signed-off-by: Guohan Lu <gulv@microsoft.com>

* install pip packages for testing sonic-utilities

Signed-off-by: Guohan Lu <gulv@microsoft.com>

* set storage driver to vfs

Signed-off-by: Guohan Lu <gulv@microsoft.com>
2019-02-06 21:28:07 -08:00
lguohan
f3ca7c422f
[rsyslog]: use # to separate container name and program name in syslog message (#1918)
Previously use / to separate container name and program name.

However, in rsyslogd:

Precisely, the programname is terminated by either (whichever occurs first):

end of tag
nonprintable character
‘:’
‘[‘
‘/’
The above definition has been taken from the FreeBSD syslogd sources.

Signed-off-by: Guohan Lu <gulv@microsoft.com>
2018-08-12 22:23:58 -07:00
Qi Luo
7ba08e5bf6
Prefix docker container name to syslog syslogtag (program name) (#1810) 2018-06-25 10:48:42 -07:00
Joe LeVeque
f7151e8ddb
[radvd] Ensure at least one interface is specified in radvd.conf before starting radvd (#1636) 2018-04-24 15:13:51 -07:00
Joe LeVeque
30466b27c1 [router advertiser] Only start radvd process if device role is 'ToRRouter' (#1569) 2018-04-06 19:24:18 -07:00
Joe LeVeque
cea87e985c
Add docker-router-advertiser to support IPv6 router advertisements (#1103) 2017-11-14 14:40:15 -08:00