Commit Graph

51 Commits

Author SHA1 Message Date
Joe LeVeque
b70c6f72b2 [dockers][supervisor] Increase event buffer size for dependent-startup (#5247)
When stopping the swss, pmon or bgp containers, log messages like the following can be seen:

```
Aug 23 22:50:43.789760 sonic-dut INFO swss#supervisord 2020-08-23 22:50:10,061 ERRO pool dependent-startup event buffer overflowed, discarding event 34
Aug 23 22:50:43.789760 sonic-dut INFO swss#supervisord 2020-08-23 22:50:10,063 ERRO pool dependent-startup event buffer overflowed, discarding event 35
Aug 23 22:50:43.789760 sonic-dut INFO swss#supervisord 2020-08-23 22:50:10,064 ERRO pool dependent-startup event buffer overflowed, discarding event 36
Aug 23 22:50:43.789760 sonic-dut INFO swss#supervisord 2020-08-23 22:50:10,066 ERRO pool dependent-startup event buffer overflowed, discarding event 37
```

This is due to the number of programs in the container managed by supervisor, all generating events at the same time. The default event queue buffer size in supervisor is 10. This patch increases that value in all containers in order to eliminate these errors. As more programs are added to the containers, we may need to further adjust these values. I increased all buffer sizes to 25 except for containers with more programs or templated supervisor.conf files which allow for a variable number of programs. In these cases I increased the buffer size to 50. One final exception is the swss container, where the buffer fills up to ~50, so I increased this buffer to 100.

Resolves https://github.com/Azure/sonic-buildimage/issues/5241
2020-09-28 16:12:53 +00:00
yozhao101
7580c846ad
[201911][Monit] Unmonitor processes in disabled containers (#5462)
We want to let Monit to unmonitor the processes in containers which are disabled in `FEATURE` table such that
Monit will not generate false alerting messages into the syslog.

- Backport of https://github.com/Azure/sonic-buildimage/pull/5153 to the 201911 branch

Signed-off-by: Yong Zhao <yozhao@microsoft.com>
2020-09-25 00:30:41 -07:00
Guohan Lu
7158ccd30d [docker-teamd]: use service dependency in supervisord to start services 2020-08-15 22:25:46 -07:00
yozhao101
c2364cf03e
[201911][dockers] Update critical_processes file syntax (#4854)
Backport of https://github.com/Azure/sonic-buildimage/pull/4831 to the 201911 branch
2020-06-26 11:37:05 -07:00
yozhao101
71225ea4cc [Service] Enable/disable container auto-restart based on configuration. (#4073) 2020-02-13 16:20:21 -08:00
judyjoseph
d815328043 [teamd]: increase startsecs to 5 seconds for teamsyncd (#4083)
Updating the startsecs=5sec for teamsyncd to make the time for which the process needs to stay up before declaring the startup successfull.
2020-01-30 18:14:46 -08:00
pavel-shirshov
74b45be487 [fast-reboot]: Save fast-reboot state into the db (#3741)
Put a flag for fast-reboot to the db using EXPIRE feature. Using this flag in other part of SONiC to start in Fast-reboot mode. If we reload a config, the state in the db will be removed.
2020-01-06 10:30:36 -08:00
yozhao101
4c31ef3cd2 [Services] Restart Teamd service upon unexpected critical process exit. (#3703)
Signed-off-by: Yong Zhao <yozhao@microsoft.com>
2019-11-04 17:45:41 -08:00
Jipan Yang
9a8202a39d [database]: Update redis to 5.0.3 (#3066)
Signed-off-by: Jipan Yang <jipan.yang@alibaba-inc.com>
2019-07-03 22:16:09 -07:00
Stepan Blyshchak
81cf33231f [build]: Improve dockerfile instructions (#3048)
- create a dockerfile-marcros.j2 file with all common operations
  written as j2 macro
- use single dockerfile instruction for COPY and RUN commands
  when possible to improve build time
- reorganize dockerfile instructions to make more cache friendly
  (in case someday we will remove --no-cache to build docker images)

Signed-off-by: Stepan Blyschak <stepanb@mellanox.com>
2019-06-22 11:26:23 -07:00
Haiyang Zheng
0af5f0b6b5 [docker-team]: update teamd docker to stretch (#2734)
Signed-off-by: Haiyang Zheng <haiyang.z@alibaba-inc.com>
2019-04-12 10:14:51 -07:00
zhenggen-xu
51a76614a3 Restore neighbor table to kernel during system warm-reboot (#2213)
* Restore neighbor table to kernel during system warm-reboot

Added a service: "restore_neighbors" to restore neighbor table into
kernel during system warm reboot. The service is started by supervisord
in swss docker when the docker is started.

In case system warm reboot is enabled, it will try to restore the neighbor
table from appDB into kernel through netlink API calls and update the neighbor
table by sending arp/ns requests to all neighbor entries, then it sets the
stateDB flag for neighsyncd to continue the reconciliation process.

-- Added tcpdump python-scapy debian package into orchagent and vs dockers.
-- Added python module: pyroute2 netifaces into orchagent and vc dockers.
-- Workarounded tcpdump issue in the vs docker

Signed-off-by: Zhenggen Xu <zxu@linkedin.com>

* Move the restore_neighbors.py to sonic-swss submodule
Made changes to makefiles accordingly

Make dockerfile.j2 changes and supervisord config changes

Add python monotonic lib for time access

Signed-off-by: Zhenggen Xu <zxu@linkedin.com>

* Added PYTHON_SWSSCOMMON as swss runtime dependency

Signed-off-by: Zhenggen Xu <zxu@linkedin.com>
2018-11-09 17:06:09 -08:00
pavel-shirshov
61fe8fd2b5
Add /host filesystem into teamd docker (#2235) 2018-11-09 16:14:03 -08:00
Shuotian Cheng
7313e7d9bc [teamd]: Add teammgrd in docker-teamd (#2064)
Remove the teamd.j2 templates used for starting the teamd. Add
teammgrd instead to manage all port channel related configuration
changes. Remove front panel port related configurations in
interfaces.j2 templates as well.

Remove teamd.sh script and use teammgrd to start all the teamd
processes. Remove all the logics in the start.sh script as well.

Update the sonic-swss submodule.

Signed-off-by: Shu0T1an ChenG <shuche@microsoft.com>
2018-10-19 03:41:53 -07:00
pavel-shirshov
3e1b9e17b4 [teamd] Force team device recreation in case it already exists (#2168)
* Force team device recreation in case it already exists
2018-10-18 15:22:26 -07:00
lguohan
f3ca7c422f
[rsyslog]: use # to separate container name and program name in syslog message (#1918)
Previously use / to separate container name and program name.

However, in rsyslogd:

Precisely, the programname is terminated by either (whichever occurs first):

end of tag
nonprintable character
‘:’
‘[‘
‘/’
The above definition has been taken from the FreeBSD syslogd sources.

Signed-off-by: Guohan Lu <gulv@microsoft.com>
2018-08-12 22:23:58 -07:00
Andriy Moroz
dadc17d9e6 [Mellanox] Use MAC from EEPROM for PortChannels and VLAN Interfaces (#1793)
* Use MAC from EEPROM for PortChannels

Signed-off-by: Andriy Moroz <c_andriym@mellanox.com>

* Use MAC from EEPROM in DEVICE_METADATA

Will affect MAC for VLAN interfaces

Signed-off-by: Andriy Moroz <c_andriym@mellanox.com>

* Get MAC via decode-syseeprom

Signed-off-by: Andriy Moroz <c_andriym@mellanox.com>

* hw-management is now a service

Signed-off-by: Andriy Moroz <c_andriym@mellanox.com>

* Add error handling for MAC fetch process

Signed-off-by: Andriy Moroz <c_andriym@mellanox.com>
2018-07-23 15:51:03 -07:00
Qi Luo
7ba08e5bf6
Prefix docker container name to syslog syslogtag (program name) (#1810) 2018-06-25 10:48:42 -07:00
Joe LeVeque
e1cb2ace36 [base image files] All 'docker exec' wrapper scripts now dynamically adjust their flags depending on whether or not they are run on a terminal (#1507) 2018-03-17 00:43:29 -07:00
Haiyang Zheng
2abdf8dc58 [libteam] Add fallback support for single-member-port LAG (#1118)
* [libteam] Add fallback support for single-member-port LAG

* Allow the port to be selected if the LAG is configured
with fallback and port is in defaulted state due to missing
LACP PDUs from remote end
* Only enable port if LAG is admin up and the member port
is link up

* [team] Add lacp fallback config to teamd.j2 template

* [teamd] Resolve config conflict between fallback and minlink

* Remove min_link config if fallback is configured
* Add support for fallback config in minigraph

* [teamd] Only enable fallback if it is single-member-port LAG

Signed-off-by: Haiyang Zheng <haiyang.z@alibaba-inc.com>

* [teamd] Removing the admin status check in lacp_port_link_update

Will submit another pull request to fix this issue.

Signed-off-by: Haiyang Zheng <haiyang.z@alibaba-inc.com>
2017-12-16 11:28:18 -08:00
Shuotian Cheng
c93c008bae
Revert "[docker-teamd]: Manage teamd and teamsyncd processes with supervisor (#1137)" (#1156)
This reverts commit a6edef2fa5.

The reason to revert this commit is that it breaks the current nightly test as
no port channel interfaces are get created after boot. teamd failed to start and
complained about 'Cannot allocate memory' possibly due to nlmsg_alloc function
failure.

Will revert this commit to investigate it further before moving to supervisor.

Signed-off-by: Shu0T1an ChenG <shuche@microsoft.com>
2017-11-16 10:30:41 -08:00
Joe LeVeque
a6edef2fa5
[docker-teamd]: Manage teamd and teamsyncd processes with supervisor (#1137) 2017-11-13 14:16:19 -08:00
Shuotian Cheng
d1156ca3b2 [teamd]: Bring down all member interfaces before starting teamd (#1081)
teamd requires all members to be set down before adding as a team port

Signed-off-by: Shu0T1an ChenG <shuche@microsoft.com>
2017-10-26 14:20:44 -07:00
Shuotian Cheng
0f6c8c14e8 [teamd]: Remove deprecated blocking logic before starting teamd (#976)
With the fixes in /etc/network/interfaces file, host interfaces
could be added into the corresponding LAGs automatically. Thus,
the logic of checking if port initialization is ready is no longer
needed.

Signed-off-by: Shu0T1an ChenG <shuche@microsoft.com>
2017-09-21 14:56:16 -07:00
Taoyu Li
c9cc7aea41 [configdb] Migrate minigraph configurations to DB (#942)
Modify minigraph parser output format so it fit DB schema
Modify configuration templates to fit new schema
Systemd services dependencies are modified so database starts before any configuration consumer
2017-09-12 14:13:27 -07:00
Joe LeVeque
f49cac086f Remove extra trailing newlines at EOF (#804)
Files now end with a single newline
2017-07-12 20:54:37 -07:00
Shuotian Cheng
a74b3a1eb7 [teamd]: Fix Jinja2 template for calculating min_ports (#791)
In Jinja2, '|' cannot be treated directly as piping operator. The
operator precedence of '|' is higher than '*'. The filter only applies
to the value just before it. Group the expression to make sure that the
filter is applied to the outcome of the expression.

Update the unit test to add such case.
2017-07-06 16:33:24 -07:00
Marian Pritsak
3903b45d41 [teamd.sh]: Remove LAG interfaces on exit (#643)
Use -k option for teamd to properly remove LAG interfaces when docker is
exiting.

Signed-off-by: marian-pritsak <marianp@mellanox.com>
2017-06-08 01:53:51 -07:00
Marian Pritsak
820e7aafd0 [docker-teamd]: Explicitly set LAG hwaddr (#664)
* [docker-teamd]: Explicitly set LAG hwaddr

Team device is initially created without any members and has a random HW
address, which is later changed to port's address. This configuration
sets team device's address explicitly to base MAC to avoid reassignment.

Signed-off-by: marian-pritsak <marianp@mellanox.com>

* Update teamd config tests with hwaddr

Signed-off-by: marian-pritsak <marianp@mellanox.com>

* Align HW addr byte for Centec and Mellanox

Signed-off-by: marian-pritsak <marianp@mellanox.com>

* Change HW addr to unicast in config tests

Signed-off-by: marian-pritsak <marianp@mellanox.com>
2017-06-06 16:13:38 -07:00
Joe LeVeque
d5c13c0a83 [dockers]: Disable autorestart on all supervisor processes inside containers (#580) 2017-05-09 17:37:08 -07:00
Joe LeVeque
8f348399f5 [Dockers]: Manage all Docker containers with Supervisord (#573)
- Consolidate config.sh and start.sh scripts into one script (start.sh)
 - Solve issue #435 - All dockers now run supervisord as their ENTRYPOINT
 - All stdout/stderr output from processes managed by supervisord is now sent to syslog instead of their own files
 - Supervisord log messages are now also sent to syslog
 - Removed unused smartmontools package from docker-platform-monitor
2017-05-08 15:43:31 -07:00
pavel-shirshov
814fd87e63 Remove /var/run/rsyslogd.pid bofore starting rsyslog (#453) 2017-03-29 18:07:25 -07:00
Shuotian Cheng
27dae90726 [docker-teamd]: Clean /etc/teamd/ folder before adding new configurations (#451)
Signed-off-by: Shuotian Cheng <shuche@microsoft.com>
2017-03-28 15:49:48 -07:00
Shuotian Cheng
7b7a61693a [sonic-cfggen]: Add -p option and add teamd.j2 test (#414)
- Add -p --port-config option to feed sonic-cfggen with port_config.ini
  file when necessary.
- Update minigraph.py file to accept the -p option
- Add test_j2files.py test to test config.sh and all .j2 templates
  * Currently test_teamd is added to test both the config.sh and teamd.j2
    file works well with the t0 sample minigraph and sample port config
    file
  * The sample output is added to the folder sample_output for comparison

Signed-off-by: Shuotian Cheng <shuche@microsoft.com>
2017-03-17 21:38:20 -07:00
Shuotian Cheng
783be14a5d [minigraph]: Add portchannels/vlans dictionary and update teamd templates (#408)
- minigraph_portchannel_interfaces and minigraph_vlan_interfaces are lists
  of interfaces and the name could duplicate due to multiple IPs
- Add minigraph_portchannels and minigraph_vlans dictionaries to support
  querying port channels and vlans via the name
- Update teamd.j2 template and config.sh file in docker-teamd
- Update zebra.conf.j2 template to add port channel interfaces

Signed-off-by: Shuotian Cheng <shuche@microsoft.com>
2017-03-17 16:48:13 -07:00
Joe LeVeque
5dafa907b2 [dockers]: Add base image files to syncd-brcm, database and teamd (#380) 2017-03-06 12:22:42 -08:00
Shuotian Cheng
f9d8137975 [teamd]: Add redis-tools as the dependency of docker-teamd and fix bugs (#363)
- Fix the if condition bug to wait till PORT_TABLE:ConfigDone before start

Signed-off-by: Shuotian Cheng <shuche@microsoft.com>
2017-03-02 18:07:43 -08:00
Shuotian Cheng
f06dc5d3f9 [teamd]: Update the start.sh script to clean up the docker state (#351)
This change should be temporary because the current teamd cannot
re-create net devices acrosss restart. Basically, it will fail
when there're files in /var/run/teamd/ folder or the previously
created net devices are still there. Thus, the current workaround
is to remove the obsolete files to restart the docker-teamd.

This workaround cannot resolve the swss restart issue. Before
restarting swss, docker teamd needs to be stopped manually. After
swss starts, docker teamd needs to be restarted manually.

This change will only make sure that rebooting the switch will
make the switch at the correct state.

Signed-off-by: Shuotian Cheng <shuche@microsoft.com>
2017-03-02 15:59:16 -08:00
Taoyu Li
903499000f [docker-teamd] fix a config bug introduced in #345 (#353) 2017-03-01 14:32:52 -08:00
Shuotian Cheng
ad1b581111 [docker]: Add -c option in teamd docker Dockerfile (#348)
CMD is not longer a file name but a command that needs to be executed,
thus /bin/bash is not enough for the entrypoint and -c is needed.

Signed-off-by: Shuotian Cheng <shuche@microsoft.com>
2017-02-28 13:36:59 -08:00
Taoyu Li
d9b1000e6c [cfggen] Add support in -v for jinja2 expression (#345)
* Add support in -v for jinja2 expression
* Format json output
2017-02-28 10:52:56 -08:00
Taoyu Li
a16c780285 [teamd] Fix a bug in #305 that will break teamd (#329) 2017-02-23 14:47:51 -08:00
Taoyu Li
073c28bf15 Move template files to /usr/share/sonic/templates (#305) 2017-02-18 17:50:29 -08:00
pavel-shirshov
a845740543 [All Dockerfiles]: Prevent apt asking questions on the console (#300)
Add noninteractive setting into every Dockerfile in the repo

Signed-off-by: Pavel Shirshov pavelsh@microsoft.com
2017-02-16 21:48:49 -08:00
Shuotian Cheng
877291ae91 [docker-teamd]: Automatically start the processes after host interfaces are created (#278)
Signed-off-by: Shuotian Cheng <shuche@microsoft.com>
2017-02-15 08:54:10 -08:00
lguohan
b6753e7960 [docker-config-engine]: introduce docker sonic config engine (#274)
* [docker-config-engine]: introduce docker sonic config engine

sonic config engine provide the sonic configure engine for all sonic
dockers that rely on the engine to generate runtime configuration.
2017-02-07 18:11:19 -08:00
Taoyu Li
be8ed80554 teamd: Use 75% links upperbound as min-links (#224) 2017-01-30 17:33:03 -08:00
Taoyu Li
1b49499b65 [docker]: Add a status file to mark that the file is generated by sonic-config-engine (#211) 2017-01-24 19:16:55 -08:00
Taoyu Li
4fe1bdcf87 sonic-cfggen with sonicv2 dockers (#190)
Add a sonic-config-engine to help generate config file based on minigraph and other data on runtime. Modify fpm, teamd, lldp, snmp, and platform-monitor docker to use sonic-config-engine to generate config in docker upon load.
2017-01-19 20:56:26 -08:00
Oleksandr Ivantsiv
80d0d2d43b Reduce docker images size. (#196)
* Reduce docker images size.

Install only required dependencies.

* Update Dockerfile.j2
2017-01-19 12:19:21 -08:00