Commit Graph

82 Commits

Author SHA1 Message Date
Joe LeVeque
72b32a96fc
[201911][dockers][supervisor] Increase event buffer size for process exit listener (#7106)
Backport of https://github.com/Azure/sonic-buildimage/pull/7083 to the 201911 branch.

#### Why I did it

To prevent error [messages](https://dev.azure.com/mssonic/build/_build/results?buildId=2254&view=logs&j=9a13fbcd-e92d-583c-2f89-d81f90cac1fd&t=739db6ba-1b35-5485-5697-de102068d650&l=802) like the following from being logged:

```
Mar 17 02:33:48.523153 vlab-01 INFO swss#supervisord 2021-03-17 02:33:48,518 ERRO pool supervisor-proc-exit-listener event buffer overflowed, discarding event 46
```

This is basically an addendum to https://github.com/Azure/sonic-buildimage/pull/5247, which increased the event buffer size for dependent-startup. While supervisor-proc-exit-listener doesn't subscribe to as many events as dependent-startup, there is still a chance some containers (like swss, as in the example above) have enough processes running to cause an overflow of the default buffer size of 10.

This is especially important for preventing erroneous log_analyzer failures in the sonic-mgmt repo regression tests, which have started occasionally causing PR check builds to fail. Example [here](https://dev.azure.com/mssonic/build/_build/results?buildId=2254&view=logs&j=9a13fbcd-e92d-583c-2f89-d81f90cac1fd&t=739db6ba-1b35-5485-5697-de102068d650&l=802).

I set all supervisor-proc-exit-listener event buffer sizes to 1024, and also updated all dependent-startup event buffer sizes to 1024, as well, to keep things simple, unified, and allow headroom so that we will not need to adjust these values frequently, if at all.
2021-03-29 10:07:43 -07:00
lguohan
8bcdefbc34 [docker-orchagent]: make build depends only on sairedis package (#6467)
backport c4b5b002c3

Make the swss build depend only on libsairedis instead of syncd. This allows swss to be built without depending
on the vendor SAI library.

Currently, the libsairedis build also builds syncd, which requires the vendor SAI library. This makes it difficult to build
the swss docker in buster while keeping the syncd docker in stretch, because swss requires libsairedis, which also
builds syncd and requires the vendor to provide SAI for buster. Since the swss docker does not actually contain the syncd
binary, it is not necessary to build syncd for the swss docker.

[submodule]: update sonic-sairedis
1e42517996bfe41ac58d4c25ee3f93502befcb9d (HEAD -> 201911) [build]: add option to build without syncd

Signed-off-by: Guohan Lu <lguohan@gmail.com>
2021-01-27 13:51:24 -08:00
Tamer Ahmed
fae4c4bfcc [swss] Enhance ARP Update to Call Sonic Cfggen Once (#5398)
This PR limited the number of calls to sonic-cfggen to one call
per iteration instead of current 3 calls per iteration.

The PR also installs jq on host for future scripts if needed.

signed-off-by: Tamer Ahmed <tamer.ahmed@microsoft.com>
2020-12-22 09:51:54 -08:00
Joe LeVeque
b70c6f72b2 [dockers][supervisor] Increase event buffer size for dependent-startup (#5247)
When stopping the swss, pmon or bgp containers, log messages like the following can be seen:

```
Aug 23 22:50:43.789760 sonic-dut INFO swss#supervisord 2020-08-23 22:50:10,061 ERRO pool dependent-startup event buffer overflowed, discarding event 34
Aug 23 22:50:43.789760 sonic-dut INFO swss#supervisord 2020-08-23 22:50:10,063 ERRO pool dependent-startup event buffer overflowed, discarding event 35
Aug 23 22:50:43.789760 sonic-dut INFO swss#supervisord 2020-08-23 22:50:10,064 ERRO pool dependent-startup event buffer overflowed, discarding event 36
Aug 23 22:50:43.789760 sonic-dut INFO swss#supervisord 2020-08-23 22:50:10,066 ERRO pool dependent-startup event buffer overflowed, discarding event 37
```

This is due to the number of programs in the container managed by supervisor, all generating events at the same time. The default event queue buffer size in supervisor is 10. This patch increases that value in all containers in order to eliminate these errors. As more programs are added to the containers, we may need to further adjust these values. I increased all buffer sizes to 25 except for containers with more programs or templated supervisor.conf files which allow for a variable number of programs. In these cases I increased the buffer size to 50. One final exception is the swss container, where the buffer fills up to ~50, so I increased this buffer to 100.
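
The overflow described above can be illustrated with a toy model. This is only a sketch, not supervisor's code: the class and function names are hypothetical, and it assumes that a full queue drops its oldest entry when a new event arrives and that no events are consumed during the burst.

```python
from collections import deque


class BoundedEventBuffer:
    """Toy model of a fixed-size event queue that drops the oldest entry on overflow."""

    def __init__(self, buffer_size=10):
        self.buffer_size = buffer_size
        self.events = deque()
        self.discarded = []

    def push(self, event):
        # When the queue is full, evict the oldest event before queueing the new one.
        if len(self.events) >= self.buffer_size:
            self.discarded.append(self.events.popleft())
        self.events.append(event)


def count_discarded(burst, buffer_size):
    """Return how many events are lost when `burst` events arrive before any is consumed."""
    buf = BoundedEventBuffer(buffer_size)
    for serial in range(burst):
        buf.push(serial)
    return len(buf.discarded)
```

Under these assumptions, a burst of about 50 events loses 40 of them with a buffer of 10 and none with a buffer of 100, which matches the reasoning behind the larger value chosen for swss.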

Resolves https://github.com/Azure/sonic-buildimage/issues/5241
2020-09-28 16:12:53 +00:00
Guohan Lu
6909170c3e [docker-syncd-vs]: use service dependency in supervisord to start services 2020-08-15 22:31:13 -07:00
Joe LeVeque
890e1f38cc [sonic-py-common] get_platform(): Refactor method of retrieving platform identifier (#5094)
Applications running in the host OS can read the platform identifier from /host/machine.conf. When loading configuration, sonic-config-engine *needs* to read the platform identifier from machine.conf, as it is responsible for populating the value in Config DB.

When an application is running inside a Docker container, the machine.conf file is not accessible, as the /host directory is not mounted. So we need to retrieve the platform identifier from Config DB if get_platform() is called from inside a Docker container. However, we can't simply check that we're running in a Docker container because the host OS of the SONiC virtual switch is running inside a Docker container. So I refactored `get_platform()` to:
    1. Read from the `PLATFORM` environment variable if it exists (which is defined in a virtual switch Docker container)
    2. Read from machine.conf if possible (works in the host OS of a standard SONiC image, critical for sonic-config-engine at boot)
    3. Read the value from Config DB (needed for Docker containers running in SONiC, as machine.conf is not accessible to them)
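
The three-step fallback above could look roughly like the following. This is a minimal sketch and not the code from the PR: `get_platform_sketch`, `read_machine_conf`, and the `config_db_lookup` hook are hypothetical names, and the machine.conf key names are assumptions.

```python
import os


def read_machine_conf(path):
    """Parse KEY=VALUE lines from a machine.conf-style file into a dict."""
    values = {}
    with open(path) as conf:
        for line in conf:
            key, sep, value = line.strip().partition("=")
            if sep:
                values[key] = value
    return values


def get_platform_sketch(environ=None, machine_conf_path="/host/machine.conf",
                        config_db_lookup=None):
    """Resolve the platform identifier using the three-step fallback order."""
    environ = os.environ if environ is None else environ

    # 1. The PLATFORM environment variable (defined in a virtual switch container).
    platform = environ.get("PLATFORM")
    if platform:
        return platform

    # 2. machine.conf, when it is readable (host OS of a standard image).
    try:
        machine_info = read_machine_conf(machine_conf_path)
    except OSError:
        machine_info = {}
    # Assumed key names; the real file may use others.
    for key in ("onie_platform", "aboot_platform"):
        if machine_info.get(key):
            return machine_info[key]

    # 3. Config DB, reached through a caller-supplied lookup (hypothetical hook).
    if config_db_lookup is not None:
        return config_db_lookup()
    return None
```

Passing the Config DB access in as a callable keeps the sketch free of any SONiC dependency; in a container, the caller would supply a function that reads the platform field from Config DB.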

- Also fix typo in daemon_base.py
- Also changes to align `get_hwsku()` with `get_platform()`
2020-08-09 10:40:20 -07:00
Joe LeVeque
6556c40040
[201911] Introduce sonic-py-common package (#5063)
Consolidate common SONiC Python-language functionality into one shared package (sonic-py-common) and eliminate duplicate code.

The package currently includes four modules:
- daemon_base
- device_info
- logger
- task_base

NOTE: This is a combination of all changes from https://github.com/Azure/sonic-buildimage/pull/5003, https://github.com/Azure/sonic-buildimage/pull/5049 and some changes from https://github.com/Azure/sonic-buildimage/pull/5043 backported to align with the 201911 branch. As part of the 201911 port, I am not installing the Python 3 package in the base image or in the VS container, because we do not have pip3 installed, and we do not intend to migrate to Python 3 in 201911.
2020-08-03 11:50:06 -07:00
yozhao101
c2364cf03e
[201911][dockers] Update critical_processes file syntax (#4854)
Backport of https://github.com/Azure/sonic-buildimage/pull/4831 to the 201911 branch
2020-06-26 11:37:05 -07:00
SuvarnaMeenakshi
0099305475 Multi-ASIC implementation (#3888)
Changes made to support multi-asic platform. Added multi-instance support for swss, syncd, database, bgp, teamd and lldp.
2020-04-15 13:08:34 -07:00
Abhishek Dosi
249265ad99 Revert "Multi-ASIC implementation (#3888)"
This reverts commit 2e87a16941.
2020-04-03 14:34:38 -07:00
SuvarnaMeenakshi
2e87a16941 Multi-ASIC implementation (#3888)
Changes made to support multi-asic platform. Added multi-instance support for swss, syncd, database, bgp, teamd and lldp.
2020-04-01 23:21:49 -07:00
Danny Allen
d793cc8fb0 [vs] Add dependencies for NAT to docker-sonic-vs (#4259)
I added dependencies to support the NAT feature on the virtual switch.

Signed-off-by: Danny Allen <daall@microsoft.com>
2020-03-22 23:01:21 -07:00
Joe LeVeque
8e36068237 [sonic-cfggen] Loading the configuration from init_cfg.json and then from config_db.json (#4148) 2020-03-15 08:54:05 -07:00
Prince Sunny
4f3d399092 [orchagent] Use mac address from config_db instead of from eth0 (#4166)
* Use mac address from config_db instead of eth0
2020-02-24 10:26:19 -08:00
SuvarnaMeenakshi
abe7ef7e2e [baseimage]: support building multi-asic component (#3856)
- move single instance services into their own folder
- generate Systemd templates for any multi-instance service files in slave.mk
- detect single or multi-instance platform in systemd-sonic-generator based on the platform-specific asic.conf file.
- update container hostname after creation instead of during creation (docker_image_ctl)
- run Docker containers in a network namespace if specified
- add a service to create a simulated multi-ASIC topology on the virtual switch platform
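
As a rough illustration of the single versus multi-instance detection listed above, the sketch below reads an ASIC count from asic.conf-style text and expands a systemd template unit name accordingly. It is not the generator's actual code: the function names are hypothetical, and the `NUM_ASIC` key and the instance naming scheme are assumptions.

```python
def read_num_asic(asic_conf_text):
    """Return the ASIC count from asic.conf-style text, defaulting to 1."""
    for line in asic_conf_text.splitlines():
        key, sep, value = line.strip().partition("=")
        # NUM_ASIC is an assumed key name for this sketch.
        if sep and key.strip().upper() == "NUM_ASIC":
            return int(value.strip())
    return 1


def expand_service(template_unit, num_asic):
    """Expand a systemd template unit name into per-ASIC instance names."""
    if num_asic <= 1:
        # Single-ASIC platforms run one plain instance.
        return [template_unit.replace("@.", ".")]
    return [template_unit.replace("@.", "@%d." % i) for i in range(num_asic)]
```

For example, with an ASIC count of 3, `expand_service("swss@.service", 3)` yields one instance name per ASIC, while a missing or single-ASIC configuration falls back to a plain `swss.service`.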

Signed-off-by: Lawrence Lee <t-lale@microsoft.com>
Signed-off-by: Suvarna Meenakshi <Suvarna.Meenaksh@microsoft.com>
2020-02-03 15:32:21 -08:00
Kiran Kumar Kella
a943e6ce45 Changes in sonic-buildimage to support the NAT feature (#3494)
* Changes in sonic-buildimage for the NAT feature
- Docker for NAT
- installing the required tools iptables and conntrack for nat

Signed-off-by: kiran.kella@broadcom.com

* Add redis-tools dependencies in the docker nat compilation

* Addressed review comments

* add natsyncd to warm-boot finalizer list

* addressed review comments

* using swsscommon.DBConnector instead of swsssdk.SonicV2Connector

* Enable NAT application in docker-sonic-vs
2020-02-03 15:30:39 -08:00
Dong Zhang
1d5005bc8c [multiDB] add database_config.json into vs images (#3757) 2019-11-20 10:40:19 -08:00
Sudharsan D.G
1942e3363b Enable sflowmgrd in docker-sonic-vs (#3595) 2019-10-29 18:05:55 -07:00
zhenggen-xu
c23aac1581 [swss] Remove "-p port_config.ini" option from the portsyncd (#3671)
* [portsyncd] Remove "-p port_config.ini" option from the portsyncd

Signed-off-by: Zhenggen Xu <zxu@linkedin.com>
2019-10-27 21:15:39 -07:00
Mike Lazar
e9a0c57714 [vsimage]: Support for the creation of a GNS3 appliance file (#3553)
The script sonic-gns3a.sh creates a GNS3 appliance file that points to a sonic-vs.img (SONiC Virtual Switch).

The appliance file (and sonic-vs.img file) can subsequently be imported into a GNS3 simulation environment.
2019-10-07 07:16:11 -07:00
padmanarayana
75104bb35d [sflow]: Build infrastructure changes to support sflow docker and utilities (#3251)
Introduce a new "sflow" container (if ENABLE_SFLOW is set). The new docker will include:
hsflowd : host-sflow based daemon is the sFlow agent
psample : Built from libpsample repository. Useful in debugging sampled packets/groups.
sflowtool : Locally dump sflow samples (e.g. with an in-unit collector)

In case of SONiC-VS, enable psample & act_sample kernel modules.

VS' syncd needs iproute2=4.20.0-2~bpo9+1 & libcap2-bin=1:2.25-1 to support tc-sample

tc-syncd is provided as a convenience tool for debugging (e.g. tc-syncd filter show ...)
2019-09-14 20:27:09 -07:00
Wenda Ni
81aef6b64c [Qos] use dot1p to tc mapping for backend switches (#3422)
* Use dot1p to tc mapping for backend switches

Signed-off-by: Wenda Ni <wenni@microsoft.com>

* Do not write DSCP to TC mapping into CONFIG_DB or config_db.json for
storage switches

Signed-off-by: Wenda Ni <wenni@microsoft.com>
2019-09-13 11:28:25 -07:00
Lawrence Lee
7271fe598f [build]: Move Systemd service start to systemd generator (#3172)
- What I did

 Move the enabling of Systemd services from sonic_debian_extension to a new systemd generator

- How I did it

  Create a new systemd generator to manually create symlinks to enable systemd services
  Add rules/Makefile to build generator
  Add services to be enabled to /etc/sonic/generated_services.conf to be read by the generator at boot time
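
The generator itself is not shown in this log, but the idea of enabling services through symlinks can be sketched as follows. This is an illustration under stated assumptions, not the actual implementation: `enable_services` and its parameters are hypothetical, the file is assumed to list one unit name per line, and a single target is used where a real generator would likely consult each unit's install settings.

```python
import os


def enable_services(services_conf, unit_dir, dest_dir, target="multi-user.target"):
    """Symlink each unit listed in services_conf into <dest_dir>/<target>.wants/."""
    wants_dir = os.path.join(dest_dir, target + ".wants")
    os.makedirs(wants_dir, exist_ok=True)
    enabled = []
    with open(services_conf) as conf:
        for line in conf:
            unit = line.strip()
            # Skip blank lines and comments.
            if not unit or unit.startswith("#"):
                continue
            link = os.path.join(wants_dir, unit)
            if not os.path.islink(link):
                os.symlink(os.path.join(unit_dir, unit), link)
            enabled.append(unit)
    return enabled
```

A symlink inside a target's `.wants` directory is how systemd records that a unit is enabled for that target, so writing the links at boot has the same effect as enabling the units ahead of time.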

Signed-off-by: Lawrence Lee <t-lale@microsoft.com>
2019-07-29 15:52:15 -07:00
Renuka Manavalan
92efe73e48
Enable debug image build for kvm image. (#3203) 2019-07-22 14:30:13 -07:00
Renuka Manavalan
a1b91937ca
Extend debug image build ability to all platforms. (#3134) 2019-07-10 12:23:13 -07:00
Jipan Yang
9a8202a39d [database]: Update redis to 5.0.3 (#3066)
Signed-off-by: Jipan Yang <jipan.yang@alibaba-inc.com>
2019-07-03 22:16:09 -07:00
pavel-shirshov
dd0f005b8a
[FRR]: Port some patches from sonic-quagga repo (#3017)
* Update sonic-quagga submodule

* Port some patches from sonic-quagga

* Fix Makefile

* Another patch

* Uncomment bgp test

* Downport Nikos's patch

* Add a patch to alleviate the vendor issue

* use patch instead of stg
2019-06-23 15:26:02 -07:00
Shuotian Cheng
c0eb90b96c
[docker-vs]: Start staticd by default (#2929)
Signed-off-by: Shu0T1an ChenG <shuche@microsoft.com>
2019-05-29 16:14:48 -07:00
Shuotian Cheng
d2eba43b40 [docker-vs]: Connect zebra with fpm and add staticd (#2925)
Since we moved to FRR, we need to connect FRR with fpmsyncd via FPM.
Adding static routes is also required.

Signed-off-by: Shu0T1an ChenG <shuche@microsoft.com>
2019-05-20 15:06:04 -07:00
lguohan
f35daa7694
[frr]: change frr as default sonic routing stack (#2863)
* [frr]: change frr as default sonic routing stack

* fix quagga configuration

* [vstest]: fix bgp test for frr

* [vstest]: skip bgp/test_invalid_nexthop.py for frr

Signed-off-by: Guohan Lu <gulv@microsoft.com>
2019-05-07 23:40:40 -07:00
lguohan
8080695ecf
[docker-{sonic,syncd}-vs]: upgrade {sonic,syncd}-vs docker to stretch (#2865)
* [docker-{sonic,syncd}-vs]: upgrade sonic-vs and syncd-vs docker to stretch

* remove python-click 6.6

Signed-off-by: Guohan Lu <gulv@microsoft.com>
2019-05-06 07:19:36 -07:00
Ze Gan
2e86caaedb [vxlanmgrd]: Add vxlanmgrd start command (#2705)
* Add bridge-utils to orchagent image

- Add vxlanmgrd to supervisorctl in docker-orchagent

Signed-off-by: Ze Gan zegan@microsoft.com

* Update submodule pointer for swss to include Vxlanmgrd changes
2019-04-23 20:38:08 -07:00
Renuka Manavalan
ba0ca01ee0 [build]: Makefile: Extend to build debug docker images for all stretch dockers (#2789)
Overall goal: Build debug images for every stretch docker.

An earlier PR (#2789) made the first cut, by transforming broadcom/orchagent to build target/docker-orchagent-dbg.gz.

Changes in this PR:

1. Made docker-orchagent build to be platform independent.
1.1) Created rules/docker_orchagent.mk
1.2) Removed platform//docker-orchagent-*.mk
1.3) Removed the corresponding entry from platform//rules.mk

2. Extended the debug docker image build to stretch based syncd dockers.
2.1) For now, only mellanox & barefoot are stretch based.
2.2) All the common variable definitions are put in one place platform/template/docker-syncd-base.mk
2.3) platform/[mellanox, bfn]/docker-syncd-[mlnx, bfn].mk are updated as detailed below.
2.3.1) Set platform code and include template base file
2.3.2) Add the dependencies & debug dependencies and any update over what base template offers.

3. Extended all stretch based non-platform dockers to build debug dockers too.
3.1) Affected are:
docker-database.mk,
docker-platform-monitor.mk,
docker-router-advertiser.mk,
docker-teamd.mk,
docker-telemetry.mk

Next: Build debug flavor of final images with regular dockers replaced with debug dockers where available.
2019-04-19 18:49:21 -07:00
Stepan Blyshchak
ea078e7823 [buildsystem] Install debug packages in syncd when INSTALL_DEBUG_TOOLS=y (#2702)
* [buildsystem] Install debug packages in syncd when INSTALL_DEBUG_TOOLS=y

Signed-off-by: Stepan Blyschak <stepanb@mellanox.com>
2019-04-18 02:25:51 -07:00
pavel-shirshov
d6cf075ca5 [vstest]: Test for quagga livelock fix (#2751)
* Test for quagga livelock fix

* Create /usr/local/etc for the test

* Add more debug info

* Install specific version of exabgp

* Update sonic-quagga
2019-04-09 09:03:25 -07:00
lguohan
b73f9a5b1d
[swss]: update swss docker to stretch (#2714)
* [swss]: update swss docker to stretch

sonic-swss update:

* aa92326 2019-03-29 | fix c++ 11 build complaint for destructors default to noexcept (#822) (HEAD, origin/master, origin/HEAD) [lguohan]
* a304007 2019-03-28 | Allow ACL entry creation without ACL counter (#818) [Wenda Ni]
* 60a8a0d 2019-03-28 | [orchagent]: Cast enum class variable to int (#819) (HEAD, origin/master, origin/HEAD) [Shuotian Cheng]
* 3dd37a4 2019-03-26 | [vnetorch]: Add VNET/tunnel/route removal flows for Bitmap VNET implementation (#816) [Volodymyr Samotiy]
* a937f92 2019-03-22 | [VS]: fix occasional test_fdb_notifications vs test failure (#813) [Jipan Yang]
* ea54825 2019-03-21 | [portsorch] Fix inconsistent return value in bindAclTable (#791) [yorke]
* 5984e3a 2019-03-07 | Fix orchagent SEGV when PortConfigDone not set (#803) [Ramesh Santhanakrishnan]

Signed-off-by: Guohan Lu <gulv@microsoft.com>
2019-03-30 11:57:25 -07:00
Volodymyr Samotiy
419c69b289 [vs]: Add option to specify platform name for DVS orchagent (#2571)
Signed-off-by: Volodymyr Samotiy <volodymyrs@mellanox.com>
2019-03-06 16:27:41 -08:00
lguohan
f20665008c
[build]: put stretch debian packages under target/debs/stretch/ (#2519)
* [build]: put stretch debian packages under target/debs/stretch/

* in stretch build phase, all debian packages built in that stage are placed under target/debs/stretch directory.
* for python-based debian packages, since they are really the same for jessie and stretch, they are placed under target/python-debs directory.

Signed-off-by: Guohan Lu <gulv@microsoft.com>
2019-02-04 22:06:37 -08:00
lguohan
9c2d7240ea
[vs]: Force10-S6000 buffer settings for virtual switch (#2515)
Signed-off-by: Guohan Lu <gulv@microsoft.com>
2019-02-01 11:18:02 -08:00
Prince Sunny
39e12a1d82 [swss]: Change VrfMgrd startup order, cleanup VRF_TABLE from state DB (#2510) 2019-01-31 23:28:31 -08:00
lguohan
2b01beb7d4
[kvm]: support for all hwsku in kvm switch (#2495)
Signed-off-by: Guohan Lu <gulv@microsoft.com>
2019-01-30 02:04:20 -08:00
lguohan
bc3f649631
[swss]: remove intfsyncd service (#2499)
intfsyncd is replaced by intfmgrd service

Signed-off-by: Guohan Lu <gulv@microsoft.com>
2019-01-29 10:36:04 -08:00
Prince Sunny
43f6df4654 Add nbrmgr to supervisor control (#2265)
* Add nbrmgr to supervisord conf

* Corrected priority values [Fix typo]

* Submodule update for Neighbor manager daemon

Submodule update sonic-swss-common:

edbfeec - Remove default docker name value of swss. (#250)
9728462 - Corrected configDB name for neigh table (#251)
6decc65 - Add NEIGH_TABLE to configDB for neighbor configuration (#249)
9918ae6 - Add ProducerStateTable temp view implementation and UT (#247)
41408f2 - Update README on dependencies
d9c0ba4 - Update README on the section 'Build with Google Test'
bb7fa5b - [ut]: explicit convert is to bool type (#248)
661b82c - Add gtest instruction in README

Submodule update sonic-swss

705b092 - Support ConfigDB neighbor configuration, introduce nbrmgr daemon (#693)
8522390 - Add vxlan switch attributes to switch orch (#712)
b123fa0 - [schema] update WARM_RESTART_TABLE:process_name schema document (#707)
2d7ab0c - Revert "Align default MTU value as SAI default (#705)" (#710)
836a58c - Align default MTU value as SAI default (#705)
bffa01f - VNET/VXLAN changes (#643)
b750a4b - [watermarkorch] add watermarkorch, extend queue and pg counters with wat… (#629)
2018-11-28 21:58:59 -08:00
lguohan
6e71cc7887
[vs]: sync changes to disk and add e1000 driver to sonic vm (#2288)
* sync changes to disk and add e1000 driver to sonic vm

* add pg_profile_lookup.ini

Signed-off-by: Guohan Lu <gulv@microsoft.com>

* update swss and sairedis

sairedis:
 * d146572 2018-11-22 | Fix interface name used on link message using lane map (#386)

swss:
* c74dc60 2018-11-22 | [vstest]: use eth1~32 as physical interface name in vs docker (#700) (HEAD -> master, origin/master, origin/HEAD) [lguohan]
* 6007e7f 2018-11-22 | [portmgrd]: Fix setting default port admin status and MTU (#691) [stepanblyschak]
* 6c70f6d 2018-11-22 | [portsorch] Fix port queue index init bug (#505) [yangbashuang]
* 70ac79b 2018-11-21 | [gitignore]: Update all binary names in the ignore list (#698) [Shuotian Cheng]
* 2a3626c 2018-11-21 | [test]: Remove duplicate legacy ACL tests (#699) [Shuotian Cheng]
* 8099811 2018-11-20 | [aclorch]: Remove unnecessary warning message (#696) [Shuotian Cheng]
* 63d8ebc 2018-11-18 | [portsorch]: Remove duplicate local variables - port (#690) [Shuotian Cheng]
* 28dc042 2018-11-18 | Remove default docker name value of swss. (#692) [Jipan Yang]

Signed-off-by: Guohan Lu <gulv@microsoft.com>
2018-11-22 12:09:21 -08:00
lguohan
64a2b1ce99
[vs]: build sonic vs kvm image (#2269)
Signed-off-by: Guohan Lu <gulv@microsoft.com>
2018-11-20 22:32:40 -08:00
Shuotian Cheng
ecca7e9697 [vs]: Add time.sleep(1) to make test stable (#2274)
time.sleep(1) after running the command to enslave member port

Signed-off-by: Shu0T1an ChenG <shuche@microsoft.com>
2018-11-19 12:03:57 -08:00
Shuotian Cheng
3b4d85239f [vs]: Create /var/warmboot/teamd folder for teammgrd (#2262)
To suppress the error message:
INFO #supervisord: teammgrd Can't write to the lacp directory
'/var/warmboot/teamd/': No such file or directory

Signed-off-by: Shu0T1an ChenG <shuche@microsoft.com>
2018-11-16 09:40:35 -08:00
lguohan
c038626273
[vstest]: add testlog for vstests (#2247)
Signed-off-by: Guohan Lu <gulv@microsoft.com>
2018-11-10 13:40:02 -08:00
zhenggen-xu
51a76614a3 Restore neighbor table to kernel during system warm-reboot (#2213)
* Restore neighbor table to kernel during system warm-reboot

Added a service: "restore_neighbors" to restore neighbor table into
kernel during system warm reboot. The service is started by supervisord
in swss docker when the docker is started.

In case system warm reboot is enabled, it will try to restore the neighbor
table from appDB into kernel through netlink API calls and update the neighbor
table by sending arp/ns requests to all neighbor entries, then it sets the
stateDB flag for neighsyncd to continue the reconciliation process.

-- Added tcpdump python-scapy debian package into orchagent and vs dockers.
-- Added python module: pyroute2 netifaces into orchagent and vs dockers.
-- Worked around the tcpdump issue in the vs docker

Signed-off-by: Zhenggen Xu <zxu@linkedin.com>

* Move the restore_neighbors.py to sonic-swss submodule
Made changes to makefiles accordingly

Make dockerfile.j2 changes and supervisord config changes

Add python monotonic lib for time access

Signed-off-by: Zhenggen Xu <zxu@linkedin.com>

* Added PYTHON_SWSSCOMMON as swss runtime dependency

Signed-off-by: Zhenggen Xu <zxu@linkedin.com>
2018-11-09 17:06:09 -08:00
Shuotian Cheng
1bc8e4bd97 [vs]: Add missing packages to speed up build process (#2228)
Add logrotate, apt-utils, psmisc packages to speed up
build process.

Signed-off-by: Shu0T1an ChenG <shuche@microsoft.com>
2018-11-06 21:07:12 -08:00