sonic-buildimage

Author	SHA1	Message	Date
Robert J. Halstead	147d631065	[PINS] update sonic-p4rt docker to bullseye (#10182 ) #### Why I did it SONiC is migrating to bullseye. This will update the sonic-pins container to bullseye. #### How I did it The [sonic-pins code](https://github.com/Azure/sonic-buildimage/blob/master/rules/p4rt.mk) isn't dependent on any architecture so it will already build successfully for bullseye. This PR updates the docker to use bullseye. #### How to verify it Today we cannot build the docker-sonic-p4rt.gz target (e.g. Issue #9885). With this change the docker will build successfully. The P4RT executable will not run, because of a missing runtime library, libgmpxx, which I'll address in a followup PR. #### Description for the changelog Update docker-sonic-p4rt.gz target to build with Bullseye instead of Buster.	2022-03-23 17:21:36 -07:00
Saikrishna Arcot	4a5e75e45e	[restapi]: Don't use python/python2 for restapi start scripts (#10285 ) Python 2 isn't installed by default in Buster and Bullseye containers, and the scripts/modules can be used with Python 3, so make sure Python 3 is used. Why I did it After the Buster and Bullseye upgrade for the restapi container, processes will no longer start because supervisord is trying to call python and python2, both of which are unavailable. Signed-off-by: Saikrishna Arcot <sarcot@microsoft.com>	2022-03-22 18:34:42 -07:00
Oleksandr Kozodoi	c5849c9650	Add scapy support for python3 virtual environment in the sonic-mgmt docker container (#10234 ) Why I did it Migration of sonic-mgmt codebase from Python 2 to Python 3 How I did it Added scapy dependencies to the env-python3 virtual environment. How to verify it Run test case: py.test --testbed=testbed-t0 --inventory=../ansible/lab --testbed_file=../ansible/testbed.csv --host-pattern=testbed-t0 --module-path=../ansible/library lldp Signed-off-by: Oleksandr Kozodoi <oleksandrx.kozodoi@intel.com>	2022-03-16 12:00:51 +08:00
Saikrishna Arcot	5617b1ae3e	Image disk space reduction (#10172 ) # Why I did it Reduce the disk space taken up during bootup and runtime. # How I did it 1. Remove python package cache from the base image and from the containers. 2. During bootup, if logs are to be stored in memory, then don't create the `var-log.ext4` file just to delete it later during bootup. 3. For the partition containing `/host`, don't reserve any blocks for just the root user. This just makes sure all disk space is available for all users, if needed during upgrades (for example). * Remove pip2 and pip3 caches from some containers Only containers which appeared to have a significant pip cache size are included here. Signed-off-by: Saikrishna Arcot <sarcot@microsoft.com> * Don't create var-log.ext4 if we're storing logs in memory Signed-off-by: Saikrishna Arcot <sarcot@microsoft.com> * Run tune2fs on the device containing /host to not reserve any blocks for just the root user Signed-off-by: Saikrishna Arcot <sarcot@microsoft.com>	2022-03-15 18:12:49 -07:00
Shilong Liu	3fa627f290	Add a config variable to override default container registry instead of dockerhub. (#10166 ) * Add variable to reset default docker registry * fix bug in docker version control	2022-03-14 18:09:20 +08:00
xwjiang2021	b73da484c4	Install the allure-pytest package globally in sonic-mgmt docker (#10216 ) Why I did it This fix is to address issue: Azure/sonic-mgmt#5280 In the sonic-mgmt Dockerfile, python package allure-pytest is installed after ENV USER $user. Consequently the package is installed to path /home/$user/.local and is only available to the $user account. If we use root account in sonic-mgmt docker container to run tests, any script importing the allure package will fail with ImportError. We need to install the allure-pytest package to global directory instead of user local directory. How I did it Update the sonic-mgmt Dockerfile to ensure that the allure-pytest package is installed to global directory How to verify it Build a new sonic-mgmt docker image based on the changes. Use sonic-mgmt docker container of the newly built image to run test scripts that depend on the allure-pytest package. No ImportError is raised.	2022-03-12 20:18:12 +08:00
Oleksandr Kozodoi	3fa18d18d4	Add necessary changes for python3 virtual environment of sonic-mgmt docker container (#9277 ) This PR includes necessary changes for the setup of the Python3 virtual environment in the sonic-mgmt docker container. How to activate Python3 virtual environment? Connect to the sonic-mgmt container $ docker exec -ti sonic-mgmt bash Activate the virtual environment $ source /var/user/env-python3/bin/activate Why I did it Migration of sonic-mgmt codebase from Python 2 to Python 3 How I did it Added all necessary dependencies to the env-python3 virtual environment. Signed-off-by: Oleksandr Kozodoi <oleksandrx.kozodoi@intel.com>	2022-03-09 12:28:01 +08:00
Alexander Allen	d0ff8b5f48	[pmon] Clean up supervisord chassis_db_init entry and fix startsecs (#10071 ) Why I did it Code review was still in progress when #9858 was merged, and upon further testing I have arrived at a better solution. How I did it Modified the supervisord configuration j2 template for pmon to require no minimum uptime for chassis_db_init and to remove the redundant exit_codes directive How to verify it Boot the switch and verify in syslog that there are no errors related to chassis_db_init	2022-03-03 17:10:15 -08:00
Lawrence Lee	4d2a55d373	[swss]: Wait for vlan intf to start ndppd (#10119 ) - Use the `wait_for_link.sh` script to delay ndppd start until after the VLAN interface is ready - Avoids issue where ndppd tries to change interface attributes before the interface is ready	2022-03-02 16:23:56 -08:00
Lawrence Lee	47d9b26063	Revert "[swss]: Wait for vlan intf to start ndppd (#10036 )" (#10085 ) This reverts commit `91204879df`. #10036 breaks ndppd functionality	2022-02-28 15:42:02 -08:00
Yang Wang	b8fa5e0d8d	install xmlrunner python3 version (#10086 )	2022-02-28 11:21:04 +08:00
Lawrence Lee	91204879df	[swss]: Wait for vlan intf to start ndppd (#10036 ) - Use the `wait_for_link.sh` script to delay ndppd start until after the VLAN interface is ready - Avoids issue where ndppd tries to change interface attributes before the interface is ready Signed-off-by: Lawrence Lee <lawlee@microsoft.com>	2022-02-24 17:54:45 -08:00
xumia	b101b023d3	[Security]: Upgrade urllib3 to fix CVE-2021-33503 See https://security.archlinux.org/CVE-2021-33503	2022-02-25 08:59:57 +08:00
arlakshm	fd22635de0	[chassis][bgp] create v4 and v6 peer group for VoQ internal neighbors (#9693 ) Why I did it In the recent minigraph changes we add separate BGP session configuration for V4 and V6 internal VoQ neighbors. This PR is adding different Peer groups for V4 and V6 neighbors How I did it Add VOQ_CHASSIS_V4_PEER and VOQ_CHASSIS_V6_PEER groups Add extra Unit tests How to verify it Signed-off-by: Arvindsrinivasan Lakshmi Narasimhan <arlakshm@microsoft.com>	2022-02-24 11:21:26 -08:00
Richard.Yu	2210c82ef8	[PTF-SAIv2]Add ptf docker for sai-ptf (saiv2) (#9729 ) * [PTF-SAIv2]Add ptf docker for sai-ptf (saiv2) Based on the current ptf docker, create a new docker for sai-ptf (saiv2), upgrade related packages, use the latest ptf and install it test done: NOJESSIE=1 NOSTRETCH=1 NOBULLSEYE=1 ENABLE_SYNCD_RPC=y make target/docker-ptf-sai.gz BLDENV=buster make -f Makefile.work target/docker-ptf-sai.gz * upgrade the thrift to 014	2022-02-18 01:48:50 -08:00
kellyyeh	f136c53d19	[radv] Support multiple ipv6 prefixes per vlan interface (#9934 ) Why I did it The radvd.conf.j2 template creates two copies of the vlan interface when more than one ipv6 address is assigned to a single vlan interface. Changed the format to add prefixes under the same vlan interface block. How I did it Modified radvd.conf.j2 and added unit tests How to verify it Configure multiple ipv6 addresses on the same vlan, start radvd Unit test will check if radvd.conf with multiple ipv6 addresses is formed correctly	2022-02-16 14:17:26 -08:00
Jason Lyu	b023c29a1e	[redis] Upgrade redis version (#9757 ) #### Why I did it The current redis version of SONiC is `6.0.6`, which contains many high-risk security issues such as CVEs that are fixed in the latest version. The Redis release notes also strongly recommend upgrading, with SECURITY urgency. ``` ================================================================================ Redis 6.0.16 Released Mon Oct 4 12:00:00 IDT 2021 ================================================================================ Upgrade urgency: SECURITY, contains fixes to security issues. Security Fixes: * (CVE-2021-41099) Integer to heap buffer overflow handling certain string commands and network payloads, when proto-max-bulk-len is manually configured to a non-default, very large value [reported by yiyuaner]. * (CVE-2021-32762) Integer to heap buffer overflow issue in redis-cli and redis-sentinel parsing large multi-bulk replies on some older and less common platforms [reported by Microsoft Vulnerability Research]. * (CVE-2021-32687) Integer to heap buffer overflow with intsets, when set-max-intset-entries is manually configured to a non-default, very large value [reported by Pawel Wieczorkiewicz, AWS]. * (CVE-2021-32675) Denial Of Service when processing RESP request payloads with a large number of elements on many connections. * (CVE-2021-32672) Random heap reading issue with Lua Debugger [reported by Meir Shpilraien]. * (CVE-2021-32628) Integer to heap buffer overflow handling ziplist-encoded data types, when configuring a large, non-default value for hash-max-ziplist-entries, hash-max-ziplist-value, zset-max-ziplist-entries or zset-max-ziplist-value [reported by sundb]. * (CVE-2021-32627) Integer to heap buffer overflow issue with streams, when configuring a non-default, large value for proto-max-bulk-len and client-query-buffer-limit [reported by sundb]. * (CVE-2021-32626) Specially crafted Lua scripts may result with Heap buffer overflow [reported by Meir Shpilraien]. Other bug fixes: * Fix appendfsync to always guarantee fsync before reply, on MacOS and FreeBSD (kqueue) (#9416) * Fix the wrong mis-detection of sync_file_range system call, affecting performance (#9371) * Fix replication issues when repl-diskless-load is used (#9280) ``` #### How I did it Edit `Dockerfile.j2` file #### How to verify it Check redis version #### Description for the changelog This PR will upgrade redis-server version to `6.0.16`.	2022-02-15 16:43:01 -08:00
Alexander Allen	9677401f4a	[pmon] Fix chassis_db_init exit not being expected (#9858 ) - Why I did it Error log was shown on switches during boot pmon#supervisord 2021-12-22 04:27:16,709 INFO exited: chassis_db_init (exit status 0; not expected) - How I did it Add exit code zero as an expected exit code and also disable autorestart. - How to verify it Boot the switch and ensure the above log line does not appear.	2022-02-15 10:51:24 +02:00
Travis Van Duyn	62934ad4c4	updated jinja template for snmp contact python2 vs python3 issue (#9949 )	2022-02-10 09:01:46 -08:00
Oleksandr Ivantsiv	25a0ce5eb1	[asan] Add address sanitizer support. (#9857 ) Implement infrastructure that allows enabling address sanitizer for docker containers. Enable address sanitizer for SWSS container. - Why I did it To add a possibility to compile SONiC applications with address sanitizer (ASAN). ASAN is a memory error detector for C/C++. It finds: 1. Use after free (dangling pointer dereference) 2. Heap buffer overflow 3. Stack buffer overflow 4. Global buffer overflow 5. Use after return 6. Use after the scope 7. Initialization order bugs 8. Memory leaks - How I did it By adding new ENABLE_ASAN configuration option. - How to verify it By default ASAN is disabled and the SONiC image is not affected. When ASAN is enabled it inspects all allocation, deallocation, and memory usage that the application does in run time. To verify whether the application has memory errors tests that trigger memory usage of the application should be run. Ideally, the whole regression tests should be run. Memory leaks reports will be placed in /var/log/asan/ directory of SONiC host OS. Signed-off-by: Oleksandr Ivantsiv <oivantsiv@nvidia.com>	2022-02-09 13:29:18 +02:00
abdosi	e44a40cc3b	Updated Internal BGP Templates for chassis packet (#9674 ) Fixes: https://github.com/Azure/sonic-buildimage/issues/9610	2022-02-08 09:36:32 -08:00
Lawrence Lee	eff80f750f	[swss]: Reduce tunnel_packet_handler memory usage (#9762 ) * Configure scapy to not store sniffed packets Signed-off-by: Lawrence Lee <lawlee@microsoft.com>	2022-02-07 11:55:48 -08:00
Christian Svensson	660c0cbe7b	docker-dhcp-relay: Fix test call to MockConfigDb (#9903 ) *docker-dhcp-relay: Fix test call to MockConfigDb Signed-off-by: Christian Svensson <blue@cmd.nu>	2022-02-01 18:52:52 -08:00
Andriy Yurkiv	cb3b9416a6	[Mellanox][VXLAN] add params to vxlan.json file in order to configure VXLAN src port range feature (#9658 ) - Why I did it Remove the obsolete parameter that enables the static VXLAN src port range. Provide functionality to generate a json config file according to the appropriate parameter in config_db. Done for SN3800: • Mellanox-SN3800-D28C50 • Mellanox-SN3800-C64 • Mellanox-SN3800-D28C49S1 (New 10G SKU) SN2700: • Mellanox-SN2700-D48C8 - How I did it Removed SAI_VXLAN_SRCPORT_RANGE_ENABLE=1 from the appropriate sai.profile files. Created the vxlan.json file and added a few params that depend on DEVICE_METADATA.localhost.vxlan_port_range - How to verify it File /etc/swss/config.d/vxlan.json should be generated inside the swss docker when it restarts: [ { "SWITCH_TABLE:switch": { "vxlan_src": "0xFF00", "vxlan_mask": "8" }, "OP": "SET" } ] Signed-off-by: Andriy Yurkiv <ayurkiv@nvidia.com>	2022-01-31 15:57:30 +02:00
vdahiya12	61e9a7683c	[y_cable] Support for initialization of new daemon ycable to support ycables (#9125 ) * [y_cable] Support for initialization of new Daemon ycable to support ycables This PR also adds the commit in sonic-platform-daemons 94fa239 [y_cable] refactor y_cable to a seperate logic and new daemon from xcvrd (#219) Why I did it This PR separates the logic of Y-Cable from xcvrd. Before this change we were using the xcvrd daemon to control all aspects of Y-Cable, from initialization to processing requests from other entities like orch and linkmgr. Now another daemon, ycabled, serves this purpose. Logically everything remains the same from the perspective of other daemons. It also takes care of aspects like init/delete of the daemon from the Y-Cable perspective. How I did it We build a new wheel, sonic_ycabled-1.0-py3-none-any.whl, and install it inside pmon. We also initialize the ycabled daemon inside pmon as part of this refactor. How to verify it Ran the changes with an image for dualtor tests on a 7050cx3 platform Signed-off-by: vaibhav-dahiya <vdahiya@microsoft.com>	2022-01-25 11:10:25 -08:00
Longxiang Lyu	49a036e90c	Add dualtor TSA/B/C support (#9726 ) Why I did it Add TSA/B/C dualtor support Signed-off-by: Longxiang Lyu lolv@microsoft.com How I did it For TSA, toggle all the mux to standby if the device type is dualtor and there are active mux ports. For TSC, add mux status output. How to verify it Run TSA/B/C on a dualtor setup	2022-01-25 10:50:29 +08:00
DINESH KUMAR SELLAPPAN	d9b1577675	Support for Statistics Python Module in sonic-mgmt docker image (#9682 ) This PR includes the support for statistics module in sonic-mgmt docker image	2022-01-25 10:32:22 +08:00
xumia	7a226ffd0d	Support bullseye for docker-sonic-restapi docker-sonic-telemetry (#9791 ) Support bullseye for docker-sonic-restapi docker-sonic-telemetry Upgrade to bullseye and Golang-1.15 to support FIPS.	2022-01-21 08:41:39 +08:00
kellyyeh	3e263fa6a8	[dhcp_relay] Remove dhcpv6 servers from VlanBrief (#9718 )	2022-01-19 07:47:08 -08:00
Saikrishna Arcot	bb3362760d	[docker-dhcprelay]: Update to Bullseye (#9736 ) As part of this, update the isc-dhcp package to match the Bullseye version (this fixes some compile errors related to BIND), clean up some of the build dependencies and runtime dependencies for debian packaging, and use the default Boost version to compile against instead of explicitly saying using 1.74. Signed-off-by: Saikrishna Arcot <sarcot@microsoft.com>	2022-01-18 15:11:36 -08:00
shlomibitton	eaa888d948	Fix import error for DHCP relay CLI (#9691 ) Signed-off-by: Shlomi Bitton <shlomibi@nvidia.com>	2022-01-16 08:08:01 +02:00
SuvarnaMeenakshi	945278f6d8	[docker-snmp]: Modify log level of snmpd (#9734 ) #### Why I did it resolves https://github.com/Azure/sonic-buildimage/issues/8779 snmpd writes the below error message in syslog : snmp#snmpd[27]: truncating integer value > 32 bits This message is written in syslog when hrSystemUptime (1.3.6.1.2.1.25.1.1.0 / system uptime) or sysUpTime (1.3.6.1.2.1.1.3 network management portion or snmpd uptime) is queried after either of these counters overflows beyond a 32-bit value. This happens when the device uptime or snmpd uptime is more than 497 days. #### How I did it Reference: https://access.redhat.com/solutions/367093 and https://linux.die.net/man/1/snmpcmd To avoid seeing this message if the counter grows, the snmpd error log level is changed to display LOG_EMERG, LOG_ALERT, LOG_CRIT, and LOG_DEBUG. Without this change, LOG_ERR and LOG_WARNING would also be logged in syslog. #### How to verify it On a device which has been up for more than 497 days, modify supervisord.conf with the change and restart snmp. Query 1.3.6.1.2.1.1.3 and verify that the log message is not seen.	2022-01-12 14:40:01 -08:00
Saikrishna Arcot	fee2441717	Create docker-base-bullseye and docker-config-engine-bullseye (#9666 ) * [slave-bullseye]: Remove Python 2 It shouldn't be needed anymore. Signed-off-by: Saikrishna Arcot <sarcot@microsoft.com> * [dockers]: Add docker-base-bullseye and docker-config-engine-bullseye Also upgrade socat from 1.7.3.1 to 1.7.4.1 Signed-off-by: Saikrishna Arcot <sarcot@microsoft.com>	2022-01-11 09:23:42 -08:00
abdosi	6c507329b7	Enable/Disable Order ECMP feature. (#9651 ) Updated Jinja2 Template in switch.json.j2 for enabling/disabling Order ECMP feature based on device role. Changes as per design: Azure/SONiC#896	2022-01-06 16:40:50 -08:00
Saikrishna Arcot	bd479cad29	Create a docker-swss-layer that holds the swss package. This is to save about 50MB of disk space, since 6 containers individually install this package. Signed-off-by: Saikrishna Arcot <sarcot@microsoft.com>	2022-01-06 09:26:55 -08:00
Saikrishna Arcot	b09b845225	[docker-platform-monitor]: Remove Python 2 Python 2 doesn't appear to be required any more.	2022-01-06 09:26:55 -08:00
Shilong Liu	36d866002a	[build] Fix docker-sonic-mgmt pylint dependency lazy-object-proxy version (#9596 )	2021-12-24 10:42:37 +08:00
zzhiyuan	a6d0a27a18	[Arista] Increase switch PCIe timeout for 7060-cx32s (#9248 ) Co-authored-by: Zhi Yuan (Carl) Zhao <zyzhao@arista.com> Why I did it Arista 7060 platform has a rare and unreproducible PCIe timeout that could possibly be solved by increasing the switch PCIe timeout value. To do this we'll call a script for this platform to increase the PCIe timeout on boot-up. No issues would be expected from the setpci command. From the PCIe spec: "Software is permitted to change the value in this field at any time. For Requests already pending when the Completion Timeout Value is changed, hardware is permitted to use either the new or the old value for the outstanding Requests, and is permitted to base the start time for each Request either on when this value was changed or on when each request was issued." How I did it Add "platform-init" support in swss docker similar to how "hwsku-init" is called, only this would be for any device belonging to a platform. Then the script would reside in the device data folder. Additionally, add pciutils dependency to docker-orchagent so it can run the setpci commands. How to verify it On bootup of an Arista 7060, execute: lspci -vv -s 01:00.0 | grep -i "devctl2" to check that the timeout has changed.	2021-12-17 08:43:25 -08:00
Lawrence Lee	7bd0a2ad11	[swss]: Listen for undeliverable tunnel packets (#9348 ) - Create a script in the orchagent docker container which listens for these encapsulated packets which are trapped to CPU (indicating that they cannot be routed/no neighbor info exists for the inner packet). When such a packet is received, the script will issue a ping command to the packet's inner destination IP to start the neighbor learning process. - This script is also resilient to portchannel status changes (i.e. interface going up or down). An interface going down does not affect traffic sniffing on interfaces which are still up. When an interface comes back up, we restart the sniffer to start capturing traffic on that interface again.	2021-12-14 14:45:23 -08:00
Shi Su	f2774b635d	Add openbfdd to ptf docker (#9488 ) Why I did it To enable test support for BFD-related features, the PTF docker needs to have the proper support for BFD. This PR aims to add BFD support in ptf docker. How I did it Clone and build OpenBFDD for PTF docker. How to verify it Build locally and verify BFD is supported.	2021-12-14 11:46:48 -08:00
abdosi	6c0da4bcf0	[bgp] Enable BGP Graceful Restart based on device role (#9486 ) What I did: Updated Jinja Template to enable BGP Graceful Restart based on device role. By default it will be enabled only if the device role type is TorRouter. Why I did: By default FRR is configured in Graceful Helper mode. Graceful Restart is needed on T0/TorRouter only, since the device can go for warm-reboot. For T1/LeafRouter it needs to be in Helper mode only	2021-12-13 10:14:50 -08:00
novikauanton	969cea07aa	add platform to iccpd's env (#8945 )	2021-12-08 09:21:44 -08:00
Brian O'Connor	46bcda359c	[PINS] Build P4RT container for PINS (#9083 ) - Add INCLUDE_PINS to config to enable/disable container - Add Docker files and supporting resources - Add sonic-pins submodule and associated make files Submission containing materials of a third party: Copyright Google LLC; Licensed under Apache 2.0 #### Why I did it Adds P4RT container to SONiC for PINS The P4RT app is covered by this HLD: https://github.com/pins/SONiC/blob/master/doc/pins/p4rt_app_hld.md #### How I did it Followed the pattern and templates used for other SONiC applications #### How to verify it Build SONiC with INCLUDE_P4RT set to "y". Verify that the resulting build has a container called "p4rt" running. You can verify that the service is up by running the following command on the SONiC switch: ```bash sudo netstat -lpnt | grep p4rt ``` You should see the service listening on TCP port 9559. #### Which release branch to backport (provide reason below if selected) None #### Description for the changelog Build P4RT container for PINS	2021-12-07 11:11:25 -08:00
abdosi	f501311f11	Updated BGP Template for Chassis/Multi-asic (#9291 ) Updated BGP Template for the following cases: 1. For Packet Chassis, do not advertise the Loopback4096 address into BGP, as there is a Static Route for the same. Having this route in BGP causes two levels of recursion in Zebra and causes an assert in Zebra when many nexthops are involved 2. Advertise only P2P Connected IPs into BGP (External Peers). For Packet chassis we have a backend IP Interface subnet, and if it gets advertised into BGP then it also causes recursion	2021-12-06 09:36:24 -08:00
kellyyeh	d11207d4f4	[radv] Run radv on MgmtToRRouter (#9424 ) * Allow radv to run on mgmt tor and EPMS	2021-12-03 09:45:06 -08:00
kellyyeh	f2ee94d201	[dhcp_relay] Update DHCPv6 counter on relayed messages (#9283 )	2021-11-30 20:15:30 -08:00
vganesan-nokia	78de10713c	[voq-chassis][bgpcfg] VOQ_BGP_CHASSIS_NEIGHBORS timers default (#8455 ) The BGP_VOQ_CHASSIS_NEIGHBOR keepalive and holdtime timers are configured similar to general neighbors. Changes are done to configure BGP_VOQ_CHASSIS_NEIGHBOR timers similar to BGP_INTERNAL_NEIGHBOR, since voq chassis bgp neighbors are similar to bgp internal neighbors in multi-asic. As is done for bgp internal neighbors, the keepalive and holdtime timers are set to 3 and 10 seconds respectively. Also similar to bgp internal neighbors, the connection retry timer is configured for voq chassis bgp neighbors. Signed-off-by: vedganes <vedavinayagam.ganesan@nokia.com>	2021-11-30 12:10:27 -08:00
Saikrishna Arcot	fdd8236864	[docker-mgmt-framework]: Don't overwrite /etc/passwd and /etc/group with symlinks (#9375 ) Fixes #9376 Because /etc/passwd and /etc/group have been overwritten with symlinks to /host_etc/passwd and /host_etc/group, the debug container build fails. This is because the debug container is built without /etc being mounted at /host_etc in the container (which does happen at runtime). Because of that, /etc/passwd and /etc/group don't exist, which causes some package installation errors when openssh-client tries to create a group. This is a partial revert of `1347f29178`. Signed-off-by: Saikrishna Arcot <sarcot@microsoft.com>	2021-11-24 23:51:59 -08:00
arlakshm	5830852832	remove staticd.conf.j2 (#9182 ) Why I did it resolves #8979 and #9055 How I did it Remove the file staticd.conf.j2, which adds the default route on eth0, from bgp docker Signed-off-by: Arvindsrinivasan Lakshmi Narasimhan <arlakshm@microsoft.com>	2021-11-24 15:32:16 -08:00
Junchao-Mellanox	554b04f312	Add trap flow counter support (#8940 ) *Add trap flow counter support	2021-11-24 15:26:52 -08:00
Brian O'Connor	002827f08e	[PINS] Add APPL_STATE_DB and response path log (#9082 ) - Add APPL_STATE_DB to database_config.json - Clear APPL_STATE_DB during SwSS container restarts - Add response path log file to logrotate config: responsepublisher.rec Co-authored-by: PINS Working Group <sonic-pins-subgroup@googlegroups.com>	2021-11-24 10:31:06 -08:00
Stephen Sun	b3ccef9c08	[Reclaim buffer] Common infrastructure update for reclaiming buffer (#9133 ) - Why I did it This is to update the common sonic-buildimage infra for reclaiming buffer. - How I did it Render zero_profiles.j2 to zero_profiles.json for vendors that support reclaiming buffer The zero profiles will be referenced in PR [Reclaim buffer] Reclaim unused buffers by applying zero buffer profiles #8768 on Mellanox platforms and there will be test cases to verify the behavior there. Rendering is done here for passing azure pipeline. Load zero_profiles.json when the dynamic buffer manager starts Generate inactive port list to reclaim buffer Signed-off-by: Stephen Sun <stephens@nvidia.com>	2021-11-24 15:00:23 +02:00
Ze Gan	79b8ff52b0	[sonic-mgmt]: Upgrade scapy (#8554 ) * Upgrade scapy Signed-off-by: Ze Gan <ganze718@gmail.com> * Add scapy version Signed-off-by: Ze Gan <ganze718@gmail.com>	2021-11-24 10:07:50 +08:00
Stepan Blyshchak	a2c2d67098	[ACL] enable ACL FC when generating config from minigraph but disable by default (#8908 ) * [ACL] enable ACL FC when generating config from minigraph but disable by default Why I did it To support ACL counters on Flex Counter Infrastructure. How I did it Enable ACL FC in init_cfg and minigraph. Disable when generating configuration from preset. How to verify it Together with dependent PRs. Run ACL/Everflow test suite. Signed-off-by: Stepan Blyshchak <stepanb@nvidia.com>	2021-11-11 09:07:54 +08:00
tjchadaga	8544147a70	Fix for additional intf flap during fast-reboot (#9166 )	2021-11-08 15:21:11 -08:00
Lawrence Lee	7c0507b6db	[swss]: Start ndppd after vlanmgrd (#9155 ) Why I did it During swss container startup, if ndppd starts up before/with vlanmgrd, ndppd will be pinned at nearly 100% CPU usage. How I did it Only start ndppd after vlanmgrd is running. Also, call ndppd directly instead of through bash for improved logging and to prevent orphaned processes. Signed-off-by: Lawrence Lee <lawlee@microsoft.com>	2021-11-03 11:03:01 -07:00
Sudharsan Dhamal Gopalarathnam	fcff3f3d09	VxLAN Tunnel Counters and Rates implementation (#8369 ) * Enable flex counters for Vxlan tunnel	2021-11-01 10:42:21 -07:00
abdosi	919b3e5cdf	[chassis-packet] Fixed BGP Internal Peer template (#9106 ) What I did: Fix the typo in Internal Peer Group template for Packet-based Chassis. Address Review comments of PR: [chassis-packet] minigraph parsing and BGP template changes #8966 - Static Route Parsing for Host - Formatting of chassis port_config.ini	2021-10-29 11:02:38 -07:00
Junhua Zhai	7de673cb5b	[gearbox] Use separator ':' for GB_ASIC_DB, GB_COUNTERS_DB and GB_FLEX_COUNTER_DB (#9100 ) Keep GB_ASIC_DB, etc consistent with the ones in sonic-swss-common/common/database_config.json	2021-10-28 10:27:52 -07:00
Marty Y. Lok	b91190d82d	[Nokia] Add protobuf and grpc C++ and python lib to support Nokia IXR7250E platform (#8366 ) #### Why I did it Nokia IXR7250E platform requires the grpcio and grpcio-tools python libraries, and the libprotobuf-dev and libgrpc++ libraries #### How I did it Modified build_debian.sh to install libprotobuf-dev and libgrpc++ to support the nokia ndk. Modified sonic_debian_extension.j2 to install grpcio and grpcio-tools in the host. Modified docker-platform-monitor/Dockerfile.j2 to install grpcio and grpcio-tools for the pmon container. #### How to verify it Image running success.	2021-10-26 18:09:32 -07:00
Kebo Liu	9c4a7c2fed	[PMON] Skip chassis_db_init task on Mellanox simx platform (#9017 ) Why I did it The "chassis_db_init" task of PMON should be skipped on the Mellanox simx platform, since the hardware info which this task is trying to access is not available on simx platforms and the attempt produces error logs. How I did it Add the capability for "chassis_db_init" to be skipped in the template by adding configuration in "pmon_daemon_control.json". Add "skip_chassis_db_init" configuration for simx platforms. Use a symbolic link for "pmon_daemon_control.json" since all the simx platforms share the same configuration How to verify it Build an image and install it on a simx platform to check whether the "chassis_db_init" task is skipped. Signed-off-by: Kebo Liu <kebol@nvidia.com>	2021-10-24 09:10:41 -07:00
Saikrishna Arcot	c1d5e0682f	docker-dhcp-relay: Fix waiting for interfaces to get set up (#9034 ) Fix the check used to wait for interfaces to come up. The group name in the supervisor config files has changed from isc-dhcp-relay to dhcp-relay. Also, in the wait script, wait 10 additional seconds after the vlans, port channels, and any interfaces are up. This is because dhcrelay listens on all interfaces (in addition to port channels and vlans), and to ensure that it stays in a clean state during runtime, wait some extra time to make sure that those interfaces are created as well. Signed-off-by: Saikrishna Arcot <sarcot@microsoft.com>	2021-10-21 18:45:00 -07:00
shlomibitton	546340bf7b	[dhcp_relay] Fix import for dhcp_counters on clear_dhcp6relay_counter.py (#8991 ) #### Why I did it Import issue will cause: root@sonic:/# sudo sonic-clear arp failed to import plugin clear.plugins.dhcprelay: No module named 'show_dhcp_relay' #### How I did it Fix the import. #### How to verify it run sudo sonic-clear arp	2021-10-19 03:10:36 -07:00
abdosi	3bb248bd67	[chassis-packet] minigraph parsing and BGP template changes (#8966 ) 1. Changes for Generation LC-Graph for packet-based chassis. 2. Added Support Ipv6 Peering on Loopback4096 for voq also 3. Updated asic topology yml files to be offset of slot 4. Made slot_num to take string slot<number> instead of number 5. Consolidated template_dpg_voq_asic.j2 into dpg_asic.j2 6. Remove Loopback4096 from asic topology and parse as dut inventory for multi-asic 7. Updated topo_facts parsing for asic topology_ 8. Internal BGP Session rename from <VoqChassisInternal> to <ChassisInternal> and take switch_type as value. Signed-off-by: Abhishek Dosi <abdosi@microsoft.com>	2021-10-18 18:44:24 -07:00
Lawrence Lee	fad5ec47b4	[mux]: Call write_standby from host only Signed-off-by: Lawrence Lee <lawlee@microsoft.com>	2021-10-15 09:59:59 -07:00
Lawrence Lee	5232647b33	[mux]: Make write_standby available on host Signed-off-by: Lawrence Lee <lawlee@microsoft.com> [write_standby]: Cleanup and fix build Signed-off-by: Lawrence Lee <lawlee@microsoft.com>	2021-10-15 09:59:59 -07:00
Lawrence Lee	14403c61d2	[mux]: Initialize all mux ports as standby Signed-off-by: Lawrence Lee <lawlee@microsoft.com>	2021-10-15 09:59:59 -07:00
Tamer Ahmed	c9c2826520	Merged PR 3845699: [linkmgrd]: Introduce MUX cable linkmgrd Linkmgrd monitors link status, mux status, and link state. If the link becomes unhealthy, linkmgrd will trigger a mux switchover on a standby ToR, ensuring uninterrupted service to servers/blades. This PR is the initial implementation of linkmgrd. Also, the docker-mux container holds packages related to maintaining and managing the mux cable. It currently runs the linkmgrd binary, which monitors and switches the mux if needed. This PR also introduces mux-container and starts linkmgrd at startup when the build is configured with INCLUDE_MUX=y Edit: linkmgrd PR will follow. signed-off-by: Tamer Ahmed <tamer.ahmed@microsoft.com> Related work items: #2315, #3146150	2021-10-15 09:59:59 -07:00
kellyyeh	df6361f50c	Change radv interval to 3min (#8882 )	2021-10-01 15:00:16 -07:00
Ye Jianquan	38500fa92e	Add gdb and pyrasite to ptf image (#8816 )	2021-09-24 17:10:48 +08:00
kellyyeh	62a1f5eb19	Add CLI Support for IPv6 Helpers and DHCPv6 Relay Counters (#8593 )	2021-09-23 22:01:26 -07:00
DINESH KUMAR SELLAPPAN	31a647a72d	[docker-sonic-mgmt]: Snappi version to 0.5.11 (#8790 )	2021-09-23 02:12:12 -07:00
kellyyeh	bc06c6fcb5	Incorporate DHCPv6 Relay Agent into dhcp-relay docker (#8321 )	2021-09-22 16:05:03 -07:00
shlomibitton	112fda7877	[Flex Counters] Reset flex counters delay flag on config DB when enable_counters script is called (#8500 ) #### Why I did it Reset flex counters delay flag on config DB when enable_counters script is called to allow enablement of flex counters in orchagent. #### How I did it Push to config DB 'false' value for delay indication when enable_counters script is called before enabling the counters. #### How to verify it Observe counters are created when enable_counters script is called.	2021-09-01 21:17:36 -07:00
richardyu	479f61404b	Add thrift in the docker-sonic-mgmt (#8623 ) Co-authored-by: richardyu-ms <richard.yu@microsoft.com>	2021-08-31 19:20:06 -07:00
shlomibitton	56533ceb9e	[dhcp_relay] Adapt config/show CLI commands to support DHCPv6 relay (#8211 ) #### Why I did it - Adapt config/show CLI commands to support DHCPv6 relay - Support multiple dhcp servers assignment in one command - Fix IP validation - Adapt UT and add new UT cases #### How I did it - Modify config/show dhcp relay files - Modify config/show UT files #### How to verify it This PR has a dependency on PR https://github.com/Azure/sonic-utilities/pull/1717 Build an image with the dependent PR and this PR Use config/show DHCPv6 relay commands.	2021-08-25 00:48:39 -07:00
Christian Svensson	f7de685be2	[mgmt-framework]: Fix typo in mgmt_vars.j2 (#8475 ) Signed-off-by: Christian Svensson <blue@cmd.nu>	2021-08-24 10:54:13 -07:00
Kostiantyn Yarovyi	6530f93881	[Pcied] run by python 3 Why I did it Pcied running by python 2. How I did it dropped python2 support and add python3 support for pcied in file docker-pmon.supervisord.conf.j2 How to verify it docker exec pmon supervisorctl status	2021-08-23 03:30:12 +00:00
Myron Sosyak	4d03526311	[docker-ptf] Upgrade to buster (#8254 ) Co-authored-by: Your Name <you@example.com>	2021-08-18 10:42:03 -07:00
xumia	a4405f09ed	Support to build armhf/arm64 platforms on arm based system (#7731 ) Why I did it Support building armhf/arm64 platforms on an arm based system without the qemu simulator. When building armhf/arm64 on an arm based system, it is not necessary to use the qemu simulator. How I did it When building armhf on an armhf system, or arm64 on an arm64 system, the qemu simulator will not be used by default. When building armhf on arm64, and you have enabled armhf docker, then it will build images without the simulator automatically. It is based on how the docker service is run. Docker base image change: For amd64, change from debian:to amd64/debian: For arm64, change from multiarch/debian-debootstrap:arm64- to arm64v8/debian: For armhf, change from multiarch/debian-debootstrap:armhf- to arm32v7/debian: See https://github.com/docker-library/official-images#architectures-other-than-amd64 The mapping relations: arm32v6 --- armel arm32v7 --- armhf arm64v8 --- arm64 Docker image armhf deprecated info: https://hub.docker.com/r/armhf/debian, using arm32v7 instead.	2021-08-12 22:24:37 +08:00
richardyu	9417fe9303	PTF adds unittest-xml-reporting (#8417 ) Co-authored-by: richardyu-ms <richard.yu@microsoft.com>	2021-08-11 20:55:21 -07:00
Blueve	aa01315f60	[ARM] Fix issue where the ping6 tool is missing from orchagent docker (#8345 ) Signed-off-by: Jing Kan jika@microsoft.com	2021-08-05 22:00:50 +08:00
Sujin Kang	447f0c64da	[pmon]: Enable Autorestart of the daemons in PMON for unexpected exit cases (#8326 ) Remove the daemon list from the critical_process file, which prevents PMON from restarting when an individual daemon crashes.	2021-08-04 09:57:54 -07:00
VenkatCisco	0803f7bf34	[pmon]: add python3-jsonschema pmon (#8018 ) jsonschema is an implementation of JSON Schema for Python . Signed-off-by: Venkat Garigipati <venkatg@cisco.com>	2021-08-03 18:08:09 -07:00
vganesan-nokia	f9231723f9	[multiasic][voq][bgpconf] Fix for the issue of same BGP router id in all asics (#8049 ) For multiasic, the back end asics use the IP address of Loopback4096 for the BGP router id. In a VOQ multi-asic chassis there are no back end asics. All the asics are front end and the iBGP connections are established via Ethernet-IB of the asics. Since these asics are not designated as BackEnd, the IP address of interface Loopback0 is used as the BGP router id. Since the IP address of Loopback0 is the same for all the asics in the line card, the same router id is used for the voq iBGP configurations and hence the iBGP connections are not established. Changes are done to fix this.	2021-07-26 12:54:52 -07:00
Shi Su	8a48be9b74	Reduce route selection deferral timer for bgp graceful restart (#7533 ) Why I did it There are scenarios in which End-of-RIB from some of the peers arrives after reconciliation. In such scenarios, if the route selection deferral timer has the default value of 360 seconds, FRR would not set up routes and all routes would be removed after reconciliation. This PR reduces the route selection deferral timer so that at least the routes to some of the peers get restored at the point of reconciliation. Fix #7488 How I did it Reduce route selection deferral timer for bgp graceful restart to 15 seconds.	2021-07-26 10:16:19 -07:00
賓少鈺	aa59bfeab7	[PDE]: introduce the SONiC Platform Development Env (#7510 ) The PDE silicon test harness and platform test harness can be found in src/sonic-platform-pdk-pde	2021-07-24 16:24:43 -07:00
slutati1536	de43c6a163	Added retry to sonic-mgmt docker container (#7997 ) Why I did it the motivation for this PR is to add retry_call to several test cases in the community, for example, the following cases: test_show_platform_fanstatus_mocked test_show_platform_temperature_mocked are executing a command once and comparing the output to the expected mock data, sometimes differences between the mock and the actual are causing the tests to fail. retry will make these tests more stable. retry will also be more efficient than sleep which will cause the tests to run longer because sometimes it is not necessary to sleep all that time, retry will only run a function only until it passed. How I did it added retry to the docker file How to verify it I run the tests with retry on the docker after installing the retry package Signed-off-by: Sharon Lutati <slutati@nvidia.com>	2021-07-20 09:28:10 -07:00
shlomibitton	604becdd5c	[dhcp_relay] DHCP relay support for IPv6 (#7772 ) Why I did it Currently SONiC uses the 'isc-dhcp-relay' package to allow DHCP relay functionality on IPv4 networks only. This will allow IPv6 functionality alongside IPv4. How I did it Edit supervisord template to start DHCPv6 instances when configured to do so in Config DB. Align cfg unit test to the new change. Add DHCPv6 relay minigraph parsing support and a suitable t0 topology xml file for UT. How to verify it Configure DHCPv6 agents as described in the feature HLD: Azure/SONiC#765 Test it with a real client/server with IPv6 or use the dedicated automatic test: Azure/sonic-mgmt#3565 Signed-off-by: Shlomi Bitton <shlomibi@nvidia.com> * Split docker-dhcp-relay.supervisord.conf.j2 template into several files for easier code maintenance	2021-07-16 07:31:05 -07:00
Stepan Blyshchak	b3b6938fda	[dhcp-relay] make DHCP relay an extension (#6531 ) - Why I did it Make DHCP relay docker an extension. DHCP relay now carries dhcp relay commands CLI plugin and has a complete manifest. It is installed as extension if INCLUDE_DHCP_RELAY is set to y. DEPENDS on #5939 - How I did it Modify DHCP relay docker makefile and dockerfile. Make changes to sonic_debian_extension.j2 to install sonic packages. I moved DHCP related CLI tests from sonic-utilities to DHCP relay docker. This PR introduces a way to write a plugin as part of docker image and run the tests from cli-plugin-tests directory under docker directory. The test result is available in target/docker-dhcp-relay.gz.log: [ REASON ] : target/docker-dhcp-relay.gz does not exist NON-EXISTENT PREREQUISITES: docker-start target/docker-config-engine-buster.gz-load target/python-wheels/sonic_utilities-1.2-py3-none-any.whl-install target/debs/buster/python3-swsscommon_1.0.0_amd64.deb-install [ FLAGS FILE ] : [] [ FLAGS DEPENDS ] : [] [ FLAGS DIFF ] : [] ============================= test session starts ============================== platform linux -- Python 3.7.3, pytest-3.10.1, py-1.7.0, pluggy-0.8.0 -- /usr/bin/python3 cachedir: .pytest_cache rootdir: /sonic/dockers/docker-dhcp-relay/cli-plugin-tests, inifile: plugins: cov-2.6.0 collecting ... 
collected 10 items test_config_dhcp_relay.py::TestConfigVlanDhcpRelay::test_plugin_registration PASSED [ 10%] test_config_dhcp_relay.py::TestConfigVlanDhcpRelay::test_config_vlan_add_dhcp_relay_with_nonexist_vlanid PASSED [ 20%] test_config_dhcp_relay.py::TestConfigVlanDhcpRelay::test_config_vlan_add_dhcp_relay_with_invalid_vlanid PASSED [ 30%] test_config_dhcp_relay.py::TestConfigVlanDhcpRelay::test_config_vlan_add_dhcp_relay_with_invalid_ip PASSED [ 40%] test_config_dhcp_relay.py::TestConfigVlanDhcpRelay::test_config_vlan_add_dhcp_relay_with_exist_ip PASSED [ 50%] test_config_dhcp_relay.py::TestConfigVlanDhcpRelay::test_config_vlan_add_del_dhcp_relay_dest PASSED [ 60%] test_config_dhcp_relay.py::TestConfigVlanDhcpRelay::test_config_vlan_remove_nonexist_dhcp_relay_dest PASSED [ 70%] test_config_dhcp_relay.py::TestConfigVlanDhcpRelay::test_config_vlan_remove_dhcp_relay_dest_with_nonexist_vlanid PASSED [ 80%] test_show_dhcp_relay.py::TestVlanDhcpRelay::test_plugin_registration PASSED [ 90%] test_show_dhcp_relay.py::TestVlanDhcpRelay::test_dhcp_relay_column_output PASSED [100%] =============================== warnings summary =============================== /usr/local/lib/python3.7/dist-packages/tabulate.py:7 /usr/local/lib/python3.7/dist-packages/tabulate.py:7: DeprecationWarning: Using or importing the ABCs from 'collections' instead of from 'collections.abc' is deprecated, and in 3.8 it will stop working from collections import namedtuple, Iterable -- Docs: https://docs.pytest.org/en/latest/warnings.html ==================== 10 passed, 1 warnings in 0.35 seconds =====================	2021-07-15 10:35:56 -07:00
Vivek Reddy	e439676455	autorestart inside restapi docker is disabled (#8006 ) Fix issue with critical process in the restapi docker restarting immediately after getting killed Signed-off-by: Vivek Reddy Karri <vkarri@nvidia.com>	2021-07-14 11:37:00 -07:00
Guohan Lu	4f2bc1fbed	Revert "Add ethtool to docker-platform-monitor (#8017 )" This reverts commit `1618aec370`.	2021-07-07 23:36:44 -07:00
Stepan Blyshchak	9dd05bb1f6	[docker-teamd]: Increase teammgrd timeout to allow graceful shutdown. (#7662 ) (#8045 ) NOTE: This is cherry-pick from 1911/2012 to master. - Why I did it To fix LAG IP configuration race - How I did it Extended timeout for teammgrd - How to verify it Add >80 router LAGs. Do config reload Signed-off-by: Nazarii Hnydyn <nazariig@nvidia.com>	2021-07-07 14:48:29 +03:00
novikauanton	1da8c145a6	[iccpd][docker] fix initial startup configuration (#7982 ) #### Why I did it The process of config generation (sonic-cfggen) fails, but the services continue to run with invalid config #### How I did it * add exit with error on errors in start.sh script (because supervisord relies on start.sh return code). * fix jinja template. Jinja uses common python expressions under the hood and the `has_key` method was removed from dict in py3, so check with the `in` operator, as it is supported by both py2 and py3. #### How to verify it * compile sonic with iccp enabled. * add mclag config to CONFIG_DB. ``` 'MC_LAG\|1' => { "local_ip": "10.0.0.2", "peer_ip": "10.0.0.3", "peer_link": "Ethernet8", "mclag_interface": "Ethernet12" } * unmask, enable and start swss and iccpd services in sonic. * log in to the iccpd container and check the config file `/etc/iccpd/iccpd.conf` * expected config: ``` mclag_id:1 local_ip:10.0.0.2 peer_ip:10.0.0.3 peer_link:Ethernet8 mclag_interface:Ethernet12 system_mac:YOUR_SYSTEM_MAC #### Description for the changelog Fixed initial iccpd startup configuration.	2021-07-01 00:47:26 -07:00
VenkatCisco	1618aec370	Add ethtool to docker-platform-monitor (#8017 ) #### Why I did it ethtool can be used to query and change settings such as speed, auto- negotiation and checksum offload on many network devices, especially Ethernet devices. #### How I did it add package extension to docker-platform-monitor/Dockerfile.j2	2021-06-30 09:36:47 -07:00
VenkatCisco	c5855eba08	Add libpci3 pkg to docker-platform-monitor (#8016 ) #### Why I did it The libpci library provides portable access to configuration registers of devices connected to the PCI bus. #### How I did it update dockers/docker-platform-monitor/Dockerfile.j2	2021-06-30 09:35:16 -07:00
thomas.cappleman@metaswitch.com	101b1fa08b	[build]: Fix sonic-cfggen contextlib err (#7996 ) A recent version of contextlib2 (https://pypi.org/project/contextlib2/21.6.0/#history) has broken Python2 compatibility, so the version picked up by netaddr when using Python2 must be specified, or else builds fail Co-authored-by: Tom Zhu <tom.zhu@metaswitch.com>	2021-06-28 17:15:03 -07:00
arlakshm	ef67ba5f6e	[multi-asic] fix network command for internal loopback (#7878 ) Signed-off-by: Arvindsrinivasan Lakshmi Narasimhan <arlakshm@microsoft.com> In the multi asic platforms all the ASIC are advertising the same IPv6 /64 network from Loopback4096. Therefore, the IPv6 loopback address of backend asic is not learnt on the frontend asic. Change the bgpd.conf.main.conf.j2 template file to advertise the Loopback4096 ipv6 address as /128	2021-06-24 12:02:01 -07:00
Shi Su	f52ba3b496	Remove quagga-related code (#7898 ) Why I did it Quagga is no longer being used. Remove quagga-related code (e.g., docker-fpm-quagga, sonic-quagga, etc.). How I did it Remove quagga-related code.	2021-06-23 09:15:56 -07:00
Qi Luo	658ed4fd37	Revert "Remove quagga related code (#7476 )" (#7831 ) Reverts Azure/sonic-buildimage#7476 It remove bgpd.conf.j2 and zebra.conf.j2, which is still used by sonic-config-engine unit test.	2021-06-09 18:52:45 -07:00
ngoc-do	710563f83d	[fabric] Disable unnecessary processes in swss and the orchagent-portsyncd dependency for fabric asic (#5569 ) * Disable unnecessary processes in swss for fabric asic Signed-off-by: ngocdo <ngocdo@arista.com>	2021-06-09 10:53:47 -07:00
Andriy Yurkiv	0c2521b936	Set default values only on the first start (#7735 )	2021-06-09 18:39:22 +08:00
Shi Su	62a4603eef	Remove quagga related code (#7476 ) Why I did it Quagga is no longer being used. Remove quagga-related code (e.g., docker-fpm-quagga, sonic-quagga, etc.). How I did it Remove quagga-related code.	2021-06-07 16:44:54 -07:00
yozhao101	1a3cab43ac	[Monit] Deprecate the feature of monitoring the critical processes by Monit (#7676 ) Signed-off-by: Yong Zhao yozhao@microsoft.com Why I did it Currently we leveraged the Supervisor to monitor the running status of critical processes in each container and it is more reliable and flexible than doing the monitoring by Monit. So we removed the functionality of monitoring the critical processes by Monit. How I did it I removed the script process_checker and corresponding Monit configuration entries of critical processes. How to verify it I verified this on the device str-7260cx3-acs-1.	2021-06-04 10:16:53 -07:00
Kwan	1347f29178	[docker-mgmt-framework]: update mgmt framework docker to support sonic-cli cmd (#6148 ) - Why I did it migrate to python3 support add dependent packages for Klish allow login as non-root user - How I did it update sonic-cli script to start Klish with user name, system name and timeout update the Dockerfile.j2 to resolve dependent packages add python3-dev for Klish use - How to verify it Incremental buster build with Azure/sonic-mgmt-framework#76 and verify the sonic-cli - Description for the changelog Migrate to python3.7 support, update sonic-cli script and resolve package dependencies	2021-06-02 19:38:21 -07:00
ppikh	3ad4f79fea	[sonic-mgmt docker]: Added allure-pytest library to sonic-mgmt docker container (#7665 ) * Modified Dockerfile.j2 - added allure-pytest library Signed-off-by: Petro Pikh <petrop@nvidia.com>	2021-06-02 08:42:30 -07:00
Myron Sosyak	3bf60b3db2	[docker-database] Fix Python3 issue (#7700 ) #### Why I did it To avoid the following error ``` Traceback (most recent call last): File "/usr/local/bin/flush_unused_database", line 10, in <module> if 'PONG' in output: TypeError: a bytes-like object is required, not 'str' ``` `communicate` method returns the strings if streams were opened in text mode; otherwise, bytes. In our case text arg in Popen is not true and that means that `communicate` return the bytes #### How I did it Set `text=True` to get strings instead of bytes #### How to verify it run `/usr/local/bin/flush_unused_database` inside database container	2021-05-31 05:36:24 -07:00
bingwang-ms	3bb123930b	Fix lldpmgrd syntax issue (#7742 ) Signed-off-by: bingwang <bingwang@microsoft.com>	2021-05-31 16:41:28 +08:00
Alexander Allen	21b9fccd75	[dockers][platform-monitor] Add chassis_db_init to platform monitor tasks (#7596 ) I added `chassis_db_init` to the startup tasks for the `docker-platform-monitor` docker so that the script is run on startup of the switch and the chassis info is correctly provisioned to STATE_DB. Depends on https://github.com/Azure/sonic-platform-daemons/pull/183	2021-05-28 12:01:03 -07:00
yozhao101	37863ac854	[Monit] Restart telemetry container if memory usage is beyond the threshold (#7645 ) Signed-off-by: Yong Zhao yozhao@microsoft.com Why I did it This PR aims to monitor the memory usage of streaming telemetry container and restart streaming telemetry container if memory usage is larger than the pre-defined threshold. How I did it I borrowed the system tool Monit to run a script memory_checker which will periodically check the memory usage of streaming telemetry container. If the memory usage of telemetry container is larger than the pre-defined threshold for 10 times during 20 cycles, then an alerting message will be written into syslog and at the same time Monit will run the script restart_service to restart the streaming telemetry container. How to verify it I verified this implementation on device str-7260cx3-acs-1.	2021-05-28 11:13:44 -07:00
Stepan Blyshchak	d7b96dfdf1	[sonic-sdk] add sonic sdk and sonic sdk buildenv (#6712 ) - Why I did it To give SONiC Application Extension developers an environment to run and develop their apps. - How I did it Created sonic-sdk and sonic-sdk-buildenv dockers and their dbg versions. - How to verify it Build: $ make -f slave target/sonic-sdk.gz target/sonic-sdk-buildenv.gz	2021-05-28 10:16:02 -07:00
bingwang-ms	e304182116	Fix supervisor-proc-exit-listener startup issue in restapi (#7681 ) * Fix supervisor-proc-exit-listener startup issue in restapi Signed-off-by: bingwang <bingwang@microsoft.com>	2021-05-26 18:28:10 +08:00
LuiSzee	cf83a99f45	[radv] fix bug for radv can't startup if DEVICE_METADATA.localhost.type is NULL (#7651 ) Co-authored-by: Shi Lei <shil@centecnetworks.com>	2021-05-25 08:17:44 -07:00
Myron Sosyak	5ab300b626	Fix python version (#7658 ) #### Why I did it To avoid the following logs ``` Mar 15 15:52:04.599302 igk-dut-04 INFO database#/supervisord: flushdb /bin/bash: /usr/local/bin/flush_unused_database: /usr/bin/python: bad interpreter: No such file or directory Mar 15 15:52:04.599947 igk-dut-04 INFO database#supervisord 2021-03-15 15:52:04,599 INFO exited: flushdb (exit status 126; not expected) ``` #### How I did it Fix shebang #### How to verify it Check the logs	2021-05-20 15:47:46 -07:00
xumia	9387350e19	Fix the typo issue in rvtysh (#7648 ) Why I did it Fix the typo issue in the command rvtysh: change PARA/para to PARAM/param	2021-05-20 21:35:23 +08:00
sudhanshukumar22	f783aefd6d	docker-lldp:intermittent DB errors will result in Client termination (#6119 ) This PR allows listen to hostname changes and mgmt ip changes.	2021-05-18 09:51:02 -07:00
abdosi	f27aa33e69	[multi-asic] Updated BGP community for Internal routes (#7617 ) Following changes are done: Internal routes are tagged with no-export instead of local-AS Option to add a user-defined BGP community on top of no-export	2021-05-16 19:44:06 -07:00
VenkatCisco	db3d353e77	[pmon]: add psmisc to bring fuser that identifies processes that are using files or sockets (#7509 ) fuser support is required since the new cisco hardware watchdog plugin uses it to check whether anyone else is using the /dev/watchdogX resource. The actual validation happens in the platform code, but the package is required for the pmon container. Currently the /dev/watchdogX is being used by the cisco platform-monitor service. The Cisco chassis level watchdog plugin uses "fuser" to claim the watchdog release from the platform-monitor service.	2021-05-06 22:24:07 -07:00
Junchao-Mellanox	a795bc0b8e	[Mellanox] Support new sensor conf file for MSN4700 A1/A0 (#7535 ) #### Why I did it MSN4700 A1/A0 used different sensor chip but keep the existing platform name x86_64-mlnx_msn4700-r0, this is a workaround to replace the sensor conf on MSN4700 A1/A0 #### How I did it Use a shell script to get the sensor conf path and copy that files to /etc/sensors.d/sensors.conf	2021-05-06 10:13:26 -07:00
trzhang-msft	4f2b54e735	dhcpmon: support dual tor in docker template (#7470 )	2021-05-03 10:51:34 -07:00
Lawrence Lee	1b39424520	[docker-orchagent]: Increase ndppd kernel poll interval (#7456 ) Why I did it ndppd by default reads /proc/net/ipv6_route every 30 seconds. Since T1s advertise so many routes to ToRs, this file is extremely large, and reading it causes ndppd's CPU usage to spike every 30 seconds How I did it Increase the delay for reading this file to the maximum possible value (max integer value), which will result in CPU spikes every ~24 days instead of every 30 seconds How to verify it Start ndppd with the new config file, confirm that no CPU spikes are seen except at startup Signed-off-by: Lawrence Lee <lawlee@microsoft.com>	2021-04-30 16:30:30 -07:00
Wei Bai	3967c28a76	[docker-sonic-mgmt]: Upgrade Tgen version in SONiC mgmt docker (#7472 )	2021-04-29 12:31:46 -07:00
Xin Wang	a7e1f7cbad	[docker-sonic-mgmt]: Install aiohttp package to sonic-mgmt docker (#7429 ) The aiohttp package is required by azure.kusto.data which is used by sonic-mgmt/test_reporting. This change is to ensure that the dependent package is installed in the sonic-mgmt docker. Signed-off-by: Xin Wang <xiwang5@microsoft.com>	2021-04-26 23:38:16 -07:00
xumia	56bdd750ab	Support readonly vtysh for sudoers (#7383 ) Why I did it Support readonly version of the command vtysh How I did it Check if the command starting with "show", and verify only contains single command in script.	2021-04-25 16:32:02 +08:00
ajbalogh	990b1127a7	[docker-sonic-mgmt] update version of ixnetwork client packages (#7242 ) * Why I did it Upgrade to the latest ixnetwork-restpy and ixnetwork-open-traffic-generator pypi packages * How I did it Updated the pip install entries for the packages in the Dockerfile.j2 * How to verify it pip show ixnetwork-restpy pip show ixnetwork-open-traffic-generator Co-authored-by: Neetha John <nejo@microsoft.com>	2021-04-23 10:17:19 -07:00
Ze Gan	f77d719f7c	[docker-fpm-frr]: Add split mode to routing config (#7307 ) For the split mode, the config files, like bgpd.conf, zebra.conf and so on, were provided by outside. But the docker_init.sh will overwrite the outside config files if restart bgp service. How I did it Add a split mode checking in docker_init.sh, if docker_routing_config_mode is split, don't overwrite the existing routing config files. How to verify it Set split mode in config db { "DEVICE_METADATA": { "localhost": { "hwsku": "Force10-S6000", "platform": "x86_64-kvm_x86_64-r0", "docker_routing_config_mode": "split" ... } } } Replace your bgpd.conf to /etc/sonic/frr/bgpd.conf Restart bgp service by sudo service bgp restart The /etc/sonic/frr/bgpd.conf your provided shouldn't be overwritten Signed-off-by: Ze Gan <ganze718@gmail.com>	2021-04-23 10:16:20 -07:00
guxianghong	6fe6d7394d	[arm] support compile sonic arm image on arm server (#7285 ) - Support compile sonic arm image on arm server. If arm image compiling is executed on arm server instead of using qemu mode on x86 server, compile time can be saved significantly. - Add kernel argument systemd.unified_cgroup_hierarchy=0 for upgrade systemd to version 247, according to #7228 - rename multiarch docker to sonic-slave-${distro}-march-${arch} Co-authored-by: Xianghong Gu <xgu@centecnetworks.com> Co-authored-by: Shi Lei <shil@centecnetworks.com>	2021-04-18 08:17:57 -07:00
jmmikkel	43342b33b8	[chassis] Add templates and code to support VoQ chassis iBGP peers (#5622 ) This commit has following changes: * Add templates and code to support VoQ chassis iBGP peers * Add support to convert a new VoQChassisInternal element in the BGPSession element of the minigraph to a new BGP_VOQ_CHASSIS_NEIGHBOR table in CONFIG_DB. * Add a new set of "voq_chassis" templates to docker-fpm-frr * Add a new BGP peer manager to bgpcfgd to add neighbors from the BGP_VOQ_CHASSIS_NEIGHBOR table using the voq_chassis templates. * Add a test case for minigraph.py, making sure the VoQChassisInternal element creates a BGP_VOQ_CHASSIS_NEIGHBOR entry, but not if its value is "false". * Add a set of test cases for the new voq_chassis templates in sonic-bgpcfgd tests. Note that the templates expect the new "bgp bestpath peer-type multipath-relax" bgpd configuration to be available. Signed-off-by: Joanne Mikkelson <jmmikkel@arista.com>	2021-04-16 11:11:32 -07:00
ANISH-GOTTAPU	e858d6e346	adding snappi to docker (#7292 ) For the migration of tests that involves tgen from abstract to snappi, snappi library is needed	2021-04-15 08:24:31 -07:00
judyjoseph	1ad5dbeab6	Fixes for errors seen in staging devices (#7171 ) With the latest 201911 image, the following error was seen on staging devices with TSB command ( for both single asic, multi asic ). Though this err message doesn't affect the TSB functionality, it is good to fix. admin@STG01-0101-0102-01T1:~$ TSB BGP0 : % Could not find route-map entry TO_TIER0_V4 20 line 1: Failure to communicate[13] to zebra, line: no route-map TO_TIER0_V4 permit 20 % Could not find route-map entry TO_TIER0_V4 30 line 2: Failure to communicate[13] to zebra, line: no route-map TO_TIER0_V4 deny 30 In addition, in this PR I am fixing the message displayed to user when there are no BGP neighbors configured on that BGP instance. In multi-asic device there could be case where there are no BGP neighbors configured on a particular ASIC.	2021-04-08 15:16:43 -07:00
Prince Sunny	20c8dd2691	[IPinIP] Add Loopback2 interface, change dscp mode to uniform (#7234 ) Co-authored-by: Ubuntu <prsunny>	2021-04-07 09:58:12 -07:00
Stephen Sun	0b16ca4ae9	[monit] Avoid monit error log by removing "-l" from monit_swss\|buffermgrd (#7236 ) Avoid the following error messages while dynamic buffer calculation is enabled ``` ERR monit[491]: 'swss\|buffermgrd' status failed (1) -- '/usr/bin/buffermgrd -l' is not running in host ``` Change /usr/bin/buffermgrd -l to /usr/bin/buffermgrd. The buffermgrd is started by -l for traditional model or -a for dynamic model. So we need to use the common section of both. Signed-off-by: Stephen Sun <stephens@nvidia.com>	2021-04-06 10:12:23 -07:00
vganesan-nokia	973affce39	[voq/inbandif] Support for inband port as regular port (#6477 ) Changes in this PR are to make LLDP to consider Inband port and to avoid regular port handling on Inband port.	2021-04-01 16:24:57 -07:00
kakkotetsu	e11397df1d	[restapi] fix python version during restapi startup (#7056 ) changed from python3 to python in supervisord.conf.	2021-03-30 13:54:37 -07:00
Joe LeVeque	c651a9ade4	[dockers][supervisor] Increase event buffer size for process exit listener; Set all event buffer sizes to 1024 (#7083 ) To prevent error [messages](https://dev.azure.com/mssonic/build/_build/results?buildId=2254&view=logs&j=9a13fbcd-e92d-583c-2f89-d81f90cac1fd&t=739db6ba-1b35-5485-5697-de102068d650&l=802) like the following from being logged: ``` Mar 17 02:33:48.523153 vlab-01 INFO swss#supervisord 2021-03-17 02:33:48,518 ERRO pool supervisor-proc-exit-listener event buffer overflowed, discarding event 46 ``` This is basically an addendum to https://github.com/Azure/sonic-buildimage/pull/5247, which increased the event buffer size for dependent-startup. While supervisor-proc-exit-listener doesn't subscribe to as many events as dependent-startup, there is still a chance some containers (like swss, as in the example above) have enough processes running to cause an overflow of the default buffer size of 10. This is especially important for preventing erroneous log_analyzer failures in the sonic-mgmt repo regression tests, which have started occasionally causing PR check builds to fail. Example [here](https://dev.azure.com/mssonic/build/_build/results?buildId=2254&view=logs&j=9a13fbcd-e92d-583c-2f89-d81f90cac1fd&t=739db6ba-1b35-5485-5697-de102068d650&l=802). I set all supervisor-proc-exit-listener event buffer sizes to 1024, and also updated all dependent-startup event buffer sizes to 1024, as well, to keep things simple, unified, and allow headroom so that we will not need to adjust these values frequently, if at all.	2021-03-27 21:14:24 -07:00
Shi Su	de64c4e34c	[bgp]: Reduce bgp connect retry timer to 10 seconds (#7169 ) The default bgp connect retry timer is 120 seconds. A reconnection will happen after 120 seconds if the initial connection fails. This PR aims to allow a more frequent retry.	2021-03-27 11:36:56 -07:00
judyjoseph	9d9503e1fe	To decrease the Connect Retry Timer from default value which is 120sec to 10 sec. (#7087 ) Why I did it It was observed that on a multi-asic DUT bootup, the BGP internal sessions between ASICs were taking more time to get ESTABLISHED than external BGP sessions. The internal sessions were coming up almost exactly 120 secs later. On a multi-asic platform the bgp dockers (which are per ASIC) are being brought up around the same time on switch start, and they try to make the bgp sessions with neighbors (in peer ASICs) which may not be completely up. This results in a BGP connect failure, and the retry happens after 120sec, which is the default Connect Retry Timer How I did it Add the command to set the bgp neighboring session retry timer to 10sec for internal bgp neighbors.	2021-03-17 23:14:38 -07:00
shlomibitton	43d4d45645	Backport ethtool to support QSFP-DD (#5725 ) Backport ethtool debian package version 5.9 to support QSFP-DD cable parsing. Signed-off-by: Shlomi Bitton <shlomibi@nvidia.com>	2021-03-16 09:56:53 -07:00
trzhang-msft	97b371ee08	[docker-dhcp-relay]: add -si support in dhcp docker template (#7053 )	2021-03-15 09:21:03 -07:00
Ying Xie	070b020bc3	[sonic-mgmt docker] pin cryptography version to 3.3.2 (#7009 ) Why I did it sonic-mgmt-docker build was failing. How I did it pin cryptography version to 3.3.2 How to verify it build sonic-mgmt docker.	2021-03-10 19:15:11 -08:00
Ze Gan	5221e68b99	[docker-ptf]: Add teamd dependency to ptf (#6994 ) Signed-off-by: Ze Gan <ganze718@gmail.com>	2021-03-10 09:11:23 -08:00
Qi Luo	38d973b834	[build]: Fix get-pip 2.7 url according to upstream announcement (#6999 ) ref: https://bootstrap.pypa.io/2.7/get-pip.py The URL you are using to fetch this script has changed, and this one will no longer work. Please use get-pip.py from the following URL instead: https://bootstrap.pypa.io/pip/2.7/get-pip.py	2021-03-09 18:15:16 -08:00
Tamer Ahmed	bb03e5bb37	Start DHCP Relay When Helpers IPs Are Available (#6961 ) #### Why I did it It is possible to have a DHCP relay configuration with no servers/helpers, which causes the DHCP container to crash. This PR fixes this issue by not starting DHCP relay for vlans with no DHCP helpers. resolves: #6931 closes: #6931 #### How I did it Do not add a program group for DHCP relay on vlans with no DHCP helpers #### How to verify it Unit test	2021-03-04 20:43:08 -08:00
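The rule described in #6961 (start a relay only for VLANs that actually have helpers) can be sketched as a simple filter. This is a minimal illustration, not the template logic used in the container; the `dhcp_servers` key, the sample data, and the function name `vlans_needing_relay` are assumptions made for the example.

```python
def vlans_needing_relay(vlan_table):
    """Return the names of VLANs that have at least one DHCP helper configured."""
    return sorted(
        name
        for name, attrs in vlan_table.items()
        if attrs.get("dhcp_servers")  # a missing key or an empty list means no helpers
    )


# Hypothetical sample data, for illustration only.
sample = {
    "Vlan1000": {"dhcp_servers": ["192.0.2.1", "192.0.2.2"]},
    "Vlan2000": {"dhcp_servers": []},
    "Vlan3000": {},
}
print(vlans_needing_relay(sample))
```

Only VLANs returned by such a filter would get a relay program group; the others are skipped instead of being started with an empty helper list.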
abdosi	30b6668b7d	Changes in FRR templates for multi-asic (#6901 ) 1. Made the command next-hop-self force only applicable on back-end asic bgp. This is done so that BGPL iBGP session running on backend can send e-BGP learnt nexthop. Back-end asic FRR is able to recursively resolve the eBGP nexthop in its routing table since it knows about all the connected routes advertised from the front-end asic. 2. Made all front-end asic bgp use global loopback ip (Loopback0) as router id and back-end asic bgp use Loopback4096 as router-id and originator id for Route-Reflector. This is done so that routes learnt by external peers do not show Loopback4096 as router id in show ip bgp <route-prefix> output. 3. To handle the above change, Loopback4096 needs to be passed from BGP manager for jinja2 template generation. This was missing, and this change/fix is needed for this also https://github.com/Azure/sonic-buildimage/blob/master/dockers/docker-fpm-frr/frr/bgpd/templates/dynamic/instance.conf.j2#L27 4. Enhancement to add multi_asic specific bgpd template generation unit test cases.	2021-02-26 17:05:15 -08:00
abdosi	a520cecb44	[multi-asic] BBR support on internal-peers for multi-asic platforms. (#6848 ) Enable BBR config allowas-in 1 for internal peers Why I did it: To advertise BBR routes learnt via an e-BGP peer in one asic/namespace to another iBGP asic/namespace via Route Reflector.	2021-02-25 23:15:02 -08:00
Ze Gan	4068944202	[MACsec]: Set MACsec feature to be auto-start (#6678 ) 1. Add supervisord as the entrypoint of docker-macsec 2. Add wpa_supplicant conf into docker-macsec 3. Set the macsecmgrd as the critical_process 4. Configure supervisor to monitor macsecmgrd 5. Set macsec in the features list 6. Add config variable `INCLUDE_MACSEC` 7. Add macsec.service - How to verify it Change the `/etc/sonic/config_db.json` as follow ``` { "PORT": { "Ethernet0": { ... "macsec": "test" } } ... "MACSEC_PROFILE": { "test": { "priority": 64, "cipher_suite": "GCM-AES-128", "primary_cak": "0123456789ABCDEF0123456789ABCDEF", "primary_ckn": "6162636465666768696A6B6C6D6E6F707172737475767778797A303132333435", "policy": "security" } } } ``` To execute `sudo config reload -y`, We should find the following new items were inserted in app_db of redis ``` 127.0.0.1:6379> keys MAC 1) "MACSEC_EGRESS_SC_TABLE:Ethernet0:72152375678227538" 2) "MACSEC_PORT_TABLE:Ethernet0" 127.0.0.1:6379> hgetall "MACSEC_EGRESS_SC_TABLE:Ethernet0:72152375678227538" 1) "ssci" 2) "" 3) "encoding_an" 4) "0" 127.0.0.1:6379> hgetall "MACSEC_PORT_TABLE:Ethernet0" 1) "enable" 2) "false" 3) "cipher_suite" 4) "GCM-AES-128" 5) "enable_protect" 6) "true" 7) "enable_encrypt" 8) "true" 9) "enable_replay_protect" 10) "false" 11) "replay_window" 12) "0" ``` Signed-off-by: Ze Gan <ganze718@gmail.com>	2021-02-23 13:22:45 -08:00
Qi Luo	ce3b2cbfc5	[radv] Disable radv for specific deployment_id (#6830 )	2021-02-20 11:01:12 -08:00
pra-moh	2e42ecb5e7	[StreamingTelemetry] add noTLS support for debug purpose (#6704 ) adding noTLS mode for debugging purpose Removing config-set for port 8080. It fails to start telemetry if docker restarts in case on noTLS mode because it expects log_level config to be present as well.	2021-02-17 17:23:00 -08:00
Andriy Yurkiv	bf83b6ca59	Enable SAI_INGRESS_PRIORITY_GROUP_STAT_DROPPED_PACKETS counter by default (#6444 ) Signed-off-by: Andriy Yurkiv <ayurkiv@nvidia.com>	2021-02-17 10:04:48 -08:00
yozhao101	4b10924c2f	[SwSS] Disabled the autorestart of process `coppmgrd`. (#6774 ) coppmgrd process do not need to be auto-restarted if it exited unexpectedly. Signed-off-by: Yong Zhao <yozhao@microsoft.com>	2021-02-12 10:59:29 -08:00
judyjoseph	ad88700912	[docker-fpm-frr]: TSA/B/C changes for multi-asic (#6510 ) - Introduced TS common file in docker as well and moved common functions. - TSA/B/C scripts run only in BGP instances for front end ASICs. In addition skip enforcing it on route maps used between internal BGP sessions. admin@str--acs-1:~$ sudo /usr/bin/TSA System Mode: Normal -> Maintenance and in case of Multi-ASIC admin@str--acs-1:~$ sudo /usr/bin/TSA BGP0 : System Mode: Normal -> Maintenance BGP1 : System Mode: Normal -> Maintenance BGP2 : System Mode: Normal -> Maintenance	2021-02-12 10:56:44 -08:00
Guohan Lu	f7346cca32	[docker-fmp-frr]: remove blank lines in generated critical_process Signed-off-by: Guohan Lu <lguohan@gmail.com>	2021-01-27 19:41:59 -08:00
Shi Su	aab37b7f42	[FRR] Create a separate script to wait zebra to be ready to receive connections (#6519 ) The requirement for zebra to be ready to accept connections is a generic problem that is not specific to bgpd. Making the script to wait for zebra socket a separate script and let bgpd and staticd to wait for zebra socket.	2021-01-27 12:36:02 -08:00
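The idea behind #6519 (block a dependent daemon until zebra's socket accepts connections) can be sketched as a polling loop. This is an illustrative Python version only, not the script added by the PR; the function name, the timeout values, and any socket path passed to it are assumptions.

```python
import socket
import time


def wait_for_unix_socket(path, timeout=30.0, interval=0.5):
    """Poll until a Unix stream socket at `path` accepts a connection.

    Returns True once a connection succeeds, or False if `timeout` seconds
    elapse first.
    """
    deadline = time.monotonic() + timeout
    while True:
        try:
            with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as sock:
                sock.connect(path)
            return True
        except OSError:
            # The socket file is missing or nothing is listening on it yet.
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval)
```

A start script for bgpd or staticd could call such a helper with the path of the zebra socket in use on that system and launch the daemon only once it returns True.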
Guohan Lu	ca0e8cbe0e	[docker-ptf]: build docker ptf - combine docker-ptf-saithrift into docker-ptf docker - build docker-ptf under platform vs - remove docker-ptf for other platforms Signed-off-by: Guohan Lu <lguohan@gmail.com>	2021-01-27 08:28:21 -08:00
Tamer Ahmed	8d857fab16	[dhcp-relay]: Launch DHCP Relay On L3 Vlan (#6527 ) Recent changes introduced the L2 vlan concept. L2 vlans do not have DHCP clients behind them, so DHCP relay is not required. Also, dhcpmon fails to launch on those vlans as their interfaces lack IP addresses. This PR limits the launch of both DHCP relay and dhcpmon to L3 vlans only. Signed-off-by: Tamer Ahmed <tamer.ahmed@microsoft.com>	2021-01-25 10:48:48 -08:00
Zhenhong Zhao	a171e6c5e4	[frrcfgd] introduce frrcfgd to manage frr config when frr_mgmt_framework_config is true (#5142 ) - Support for non-template based FRR configurations (BGP, route-map, OSPF, static route..etc) using config DB schema. - Support for save & restore - Jinja template based config-DB data read and apply to FRR during startup - How I did it - add frrcfgd service - when frr_mgmg_framework_config is set, frrcfgd starts in bgp container - when user changed the BGP or other related table entries in config DB, frrcfgd will run corresponding VTYSH commands to program on FRR. - add jinja template to generate FRR config file to be used by FRR daemons while bgp container restarted - How to verify it 1. Add/delete data on config DB and then run VTYSH "show running-config" command to check if FRR configuration changed. 1. Restart bgp container and check if generated FRR config file is correct and run VTYSH "show running-config" command to check if FRR configuration is consistent with attributes in config DB Co-authored-by: Zhenhong Zhao <zhenhong.zhao@dell.com>	2021-01-24 17:57:03 -08:00
arlakshm	0e12ca81c7	[Multi Asic] support of swss.rec and sairedis.rec for multi asic (#6310 ) Signed-off-by: Arvindsrinivasan Lakshmi Narasimhan arlakshm@microsoft.com - Why I did it This PR has the changes to support having different swss.rec and sairedis.rec for each asic. The logrotate script is updated as well - How I did it Update the orchagent.sh script to use the logfile name options in these PRs(Azure/sonic-swss#1546 and Azure/sonic-sairedis#747) In multi asic platforms the record files will be different for each asic, with the format swss.asic{x}.rec and sairedis.asic{x}.rec Update the logrotate script for multiasic platform .	2021-01-22 09:42:19 -08:00
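The naming scheme described in #6310 can be illustrated with a small helper. This is only a sketch of the file-name format quoted in the commit message (`swss.asic{x}.rec` and `sairedis.asic{x}.rec`); the function name `record_file_names` is hypothetical, and the single-asic names are taken from the commit title.

```python
def record_file_names(asic_id=None):
    """Return (swss, sairedis) record file names.

    With no asic_id the plain single-asic names are used; otherwise the
    per-asic format from the commit message is applied.
    """
    if asic_id is None:
        return ("swss.rec", "sairedis.rec")
    return (f"swss.asic{asic_id}.rec", f"sairedis.asic{asic_id}.rec")


print(record_file_names())
print(record_file_names(2))
```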
Samuel Angebault	0464d15b18	[pmon]: Run ledd using python3 unless excluded (#6528 ) - Why I did it Ledd is the last daemon that is not enabled to run in python3. Even though there is a plan to deprecate this daemon and to replace it by something else it's one simple step toward python2 deprecation. - How I did it Changed the `command=` line for `ledd` in the `supervisord` configuration of `pmon`. Copied what was done for other daemons. - How to verify it Booting a product that has a `led_control.py` should now show the ledd running in python3. I ran `python3 -m pylint` on all `led_control.py` plugin which means that most of them should be python3 compliant. There is however still a risk that some might not work.	2021-01-22 07:12:01 -08:00
yozhao101	be3c036794	[supervisord] Monitoring the critical processes with supervisord. (#6242 ) - Why I did it Initially, we used Monit to monitor critical processes in each container. If one of the critical processes was not running or had crashed, Monit would periodically write an alerting message into syslog. Whenever a new process was added to a container, the corresponding Monit configuration file also needed to be updated, which made maintenance harder. We now use a Supervisord event listener for this monitoring. Since the processes in each container are already managed by Supervisord, we only need to focus on the monitoring logic. - How I did it We use the Supervisord event listener to monitor critical processes in containers. When the event listener is notified that a critical process exited unexpectedly, it takes the following steps: It first checks whether the auto-restart mechanism is enabled for this container. If auto-restart is enabled, the event listener kills the Supervisord process, which should cause the container to exit and subsequently get restarted. If auto-restart is not enabled for this container, the event listener enters a loop which sleeps for 1 minute and then checks whether the process is running. If yes, the event listener exits. If no, an alerting message is written into syslog. - How to verify it First, check whether the auto-restart mechanism of a container is enabled by running the command show feature status. If enabled, select one critical process and kill it manually, then check whether the container is restarted. Second, disable the auto-restart mechanism if it was enabled in step 1 by running the command sudo config feature autorestart <container_name> disabled. Then select and kill one critical process. After that, an alerting message should appear in the syslog every minute. - Which release branch to backport (provide reason below if selected) 201811 201911 [x ] 202006	2021-01-21 12:57:49 -08:00
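The first step of the listener described in #6242, recognising that a critical process exited unexpectedly, can be sketched as follows. This is a minimal illustration of parsing Supervisor's `key:value` event header and payload lines, not the listener shipped in the containers; the function names and the sample process names are hypothetical, and treating `expected:0` as "unexpected" is an assumption about how the check might be done.

```python
def parse_tokens(line):
    """Parse a 'key:value key:value ...' line from the Supervisor event protocol."""
    return dict(token.split(":", 1) for token in line.split())


def is_unexpected_critical_exit(header_line, payload_line, critical_processes):
    """Return True if the event reports an unexpected exit of a critical process."""
    header = parse_tokens(header_line)
    if header.get("eventname") != "PROCESS_STATE_EXITED":
        return False
    payload = parse_tokens(payload_line)
    return (
        payload.get("processname") in critical_processes
        and payload.get("expected") == "0"
    )


# Hypothetical event, shaped like the lines Supervisor sends to a listener.
header = "ver:3.0 server:supervisor serial:21 pool:listener poolserial:10 eventname:PROCESS_STATE_EXITED len:71"
payload = "processname:orchagent groupname:orchagent from_state:RUNNING expected:0 pid:2766"
print(is_unexpected_critical_exit(header, payload, {"orchagent"}))
```

Once such a check fires, the listener would go on to the auto-restart decision described in the commit message: kill Supervisord if auto-restart is enabled, or otherwise loop and log an alert every minute while the process stays down.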
Shi Su	afee1a851c	[bgpd]: Check zebra is ready to connect when starting bgpd (#6478 ) Fix #5026 There is a race condition between the zebra server accepting connections and bgpd trying to connect. Bgpd may try to connect before zebra is ready. In this scenario, bgpd will try again after 10 seconds and operate as normal within these 10 seconds. As a consequence, whatever bgpd tries to send to zebra during those 10 seconds will be lost. To avoid such a scenario, bgpd should start after zebra is ready to accept connections.	2021-01-19 00:23:36 -08:00
pavel-shirshov	16e54340b7	[docker-frr]: Use egrep with regexp to match correct TSA rules (#6403 ) - Why I did it Earlier today we found a bug in the SONiC TSA implementation. TSC shows incorrect output (see below) in case we have a route-map whose name or sequence number contains a TSA route-map entry as a prefix. ``` admin@str-s6100-acs-1:~$ TSC Traffic Shift Check: System Mode: Not consistent ``` The reason is that the TSC implementation uses regexps in the TSA utilities that are too loose and match the wrong route-map entries. For example, the current TSC matches the following ``` route-map TO_BGP_PEER_V4 permit 200 route-map TO_BGP_PEER_V6 permit 200 ``` But it should match only ``` route-map TO_BGP_PEER_V4 permit 20 route-map TO_BGP_PEER_V4 deny 30 route-map TO_BGP_PEER_V6 permit 20 route-map TO_BGP_PEER_V6 deny 30 ``` - How I did it I fixed it by using egrep with the `^` and `$` regexp markers, which match the beginning and end of the line. - How to verify it 1. Add the following entry to the FRR config: ``` str-s6100-acs-1# str-s6100-acs-1# conf t str-s6100-acs-1(config)# route-map TO_BGP_PEER_V4 permit 200 str-s6100-acs-1(config-route-map)# end ``` 2. Use the TSC command and check the output. It should show normal. ``` admin@str-s6100-acs-1:~$ TSC Traffic Shift Check: System Mode: Normal ```	2021-01-14 11:09:16 -08:00
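The difference between the loose and the anchored match in #6403 can be reproduced with Python's `re` module. This is only a sketch of the anchoring idea using the route-map lines quoted in the commit message; the exact patterns used by the TSA/TSC scripts are not shown in the log, so the two patterns below are assumptions.

```python
import re

# Route-map lines taken from the commit message examples.
config = """\
route-map TO_BGP_PEER_V4 permit 20
route-map TO_BGP_PEER_V4 deny 30
route-map TO_BGP_PEER_V4 permit 200
"""

# Unanchored: also matches "permit 200" because "permit 20" is a prefix of it.
loose = re.compile(r"route-map TO_BGP_PEER_V4 permit 20")
# Anchored with ^ and $: the whole line must match.
anchored = re.compile(r"^route-map TO_BGP_PEER_V4 permit 20$", re.MULTILINE)

loose_hits = [line for line in config.splitlines() if loose.search(line)]
anchored_hits = anchored.findall(config)

print(len(loose_hits), len(anchored_hits))
```

The unanchored pattern reports two lines while the anchored one reports one, which is the kind of mismatch that led TSC to print "Not consistent".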
carl-nokia	380edf054d	[Platform][nokia]: python3-smbus package add with python3 and jinja fixes (#6416 ) fix platform driver breakage due to python3 upgrade and fix load minigraph errors with config load_minigraph -y - How I did it added python3-smbus to the pmon docker template since the previous was python2 specific fixed additional "ord" python2 specific code fixed the jinja templates used by qos reload - the template logic required data to be parsed - How to verify it run "show platform XXX" commands and verify output run "sudo config load_minigraph -y" and verify configuration run "show interfaces XXX" and verify output Co-authored-by: Carl Keene <keene@nokia.com>	2021-01-12 15:05:06 -08:00
Ze Gan	c22575218a	[docker-macsec]: MACsec container and wpa_supplicant component (#5700 ) The HLD about MACsec feature is at : https://github.com/Azure/SONiC/blob/master/doc/macsec/MACsec_hld.md - How to verify it This PR doesn't set MACsec container automatically start, You should manually start the container by docker run docker-macsec wpa_supplicant binary can be found at MACsec container. This PR depends on the PR, WPA_SUPPLICANT, and The MACsec container will be set as automatically start by later PR. Signed-off-by: zegan <zegan@microsoft.com>	2021-01-10 10:39:59 -08:00
pavel-shirshov	83715cfc49	[bgpcfgd]: Support default action for "Allow prefix" feature (#6370 ) * Use 20 and 30 route-map entries instead of 2 and 3 for TSA * Added support for dynamic "Allow list" default action. Co-authored-by: Pavel Shirshov <pavel.contrib@gmail.com>	2021-01-08 14:03:26 -08:00
Joe LeVeque	e52581e919	[PDDF] Build and install Python 3 package (#6286 ) - Make PDDF code compliant with both Python 2 and Python 3 - Align code with PEP8 standards using autopep8 - Build and install both Python 2 and Python 3 PDDF packages	2021-01-07 10:03:29 -08:00
abdosi	afc87b8ccd	Updated imfile configuration for supervisord logs (#6368 ) Updated imfile configuration for supervisord logs for stretch and buster.	2021-01-06 18:47:36 -08:00
sudhanshukumar22	8a3ac8ff9c	[docker-lldp]: sonic advertise meaningful SysDescription instead of debian (#6114 ) Sonic devices advertise meaningful system description along with Debian package information. before the fix: ------------- admin@sonic:~$ show lldp neighbors ------------------------------------------------------------------------------- LLDP neighbors: ------------------------------------------------------------------------------- Interface: Ethernet0, via: LLDP, RID: 3, Time: 0 day, 16:36:30 SysName: sonic SysDescr: Debian GNU/Linux 9 (stretch) Linux 4.9.0-11-2-amd64 #1 SMP Debian 4.9.189-3+deb9u2 (2019-11-11) x86_64 ------------------------------------------------------------------------------- After the fix: root@sonic:~# show lldp neighbors Ethernet16 ------------------------------------------------------------------------------- LLDP neighbors: ------------------------------------------------------------------------------- Interface: Ethernet16, via: LLDP, RID: 10, Time: 0 day, 00:01:00 SysName: sonic SysDescr: SONiC Software Version: SONiC.sonic_upstream_1.0_daily_201130_1501_62-dirty-20201130.203529 - HwSku: Accton-AS7816-64X - Distribution: Debian 10.6 - Kernel: 4.19.0-9-2-amd64 ------------------------------------------------------------------------------- Signed-off-by: sudhanshukumar22 <sudhanshu.kumar@broadcom.com>	2021-01-06 12:24:57 -08:00
abdosi	afd60bdc48	[rsyslog]: Explicitly set the notify mode for rsyslog imfile module (#6351 ) Enable the notify mode of the rsyslogd imfile module used for supervisord logs in docker containers. Set mode="inotify" when loading imfile to make sure supervisord logs reach the host immediately. Signed-off-by: Abhishek Dosi <abdosi@microsoft.com>	2021-01-06 00:00:18 -08:00
Travis Van Duyn	d769ef2abd	[snmp]: updated to support snmp config from redis configdb (#6134 ) - Why I did it I'm updating the jinja2 template to support getting SNMP information from the redis configdb. I'm using the format approved here: https://github.com/Azure/SONiC/pull/718 This will pave the way for us to decrement using the snmp.yml in the future. Right now we will still be using both the snmp.yml and configdb to get variable information in order to create the snmpd.conf via the sonic-cfggen tool. - How I did it I first updated the SNMP Schema in PR #718 to get that approved as a standardized format. Then I verified I could add snmp configs to the configdb using this standard schema. Once the configs were added to the configdb then I updated the snmpd.conf.j2 file to support the updates via the configdb while still using the variables in the snmp.yml file in parallel. This way we will have backward compatibility until we can fully migrate to the configdb only. By updating the snmpd.conf.j2 template and running the sonic-cfggen tool the snmpd.conf gets generated with using the values in both the configdb and snmp.yml file. Co-authored-by: trvanduy <trvanduy@microsoft.com>	2021-01-05 13:43:29 -08:00
abdosi	ef0088c29f	Enable the notify mode of rsyslogd imfile module used for supervisord (#6298 ) Enable the notify mode of rsyslogd imfile module used for supervisord logs in docker container	2020-12-31 17:01:57 -08:00
Ubuntu	273846a412	FRR 7.5 Build libyang1 which is required for frr 7.5	2020-12-29 03:44:49 -08:00
Stepan Blyshchak	23f1d51de3	[ipinip.json.j2] align mellanox configuration dst_ip with other platforms (#6304 ) Mellanox already supports multiple destination IPs in IPinIP tunnel configuration, thus removing mellanox exception for IPinIP configuration. - How I did it Removed "dst_ip" field generation in mellanox platform condition. Sorted the "dst_ip" list, so that it is easier to test against sample configuration in unit tests. Aligned unit test sample. Signed-off-by: Stepan Blyschak <stepanb@nvidia.com>	2020-12-28 20:53:12 -08:00
Travis Van Duyn	6efc0a885f	Convert snmp.yml to configdb (#6205 ) This PR is in preparation to move from snmp.yml to configdb. This will more closely align with other commands in sonic and use configdb as the source of truth for snmp configuration. Note: This is the first of 2 PR's to enable this. This PR will not change any functionality but will allow the snmp.yml file info to be put into the configdb. Created a script that takes the snmp.yml variables and converts them to the configdb format. Added file to dockerfile.j2 so that file is copied in the container. Updated start.sh file to automatically run the python conversion script each time the docker container is restarted.	2020-12-28 11:51:58 -08:00
Guohan Lu	ed58684e36	[docker-frr]: add static ipv6 loopback route to allow bgp to advertise prefix frr does not advertise route if local route is not reachable, as a result loopback route /64 is not advertised to the neighbors. Add static route allows frr to advertise the route to its peers Signed-off-by: Guohan Lu <lguohan@gmail.com>	2020-12-28 10:34:34 -08:00
Junchao-Mellanox	51f896b33e	Add pmon daemons python3 build support (#6176 ) - Why I did it python2 is end of life and SONiC is going to support python3. This PR is going to support: 1. Build pmon daemons with python3 2. Install and run python3 version pmon daemons - How I did it 1. Change pmon daemons make files to build both python2 and python3 whl 2. Change docker-platform-monitor make files to install both python2 and python3 whl 3. Change pmon docker startup files to start pmon daemons according to the supported platform API version	2020-12-28 10:19:24 -08:00
Prince Sunny	8fd50e895c	[submodule]: swss Tunnel Manager changes (#5843 ) Introduce tunnel manager daemon. Start the process as part of swss container Submodule update for swss: 9ed3026 - 2020-12-24 : [NAT] ACL Rule with DO_NOT_NAT action is getting failed. (#1502) [Akhilesh Samineni] c39a4b1 - 2020-12-23 : Mux/IPTunnel orchagent changes (#1497) [Prince Sunny] bc8df0e - 2020-12-23 : Add support for headroom pool watermark (#1567) [Neetha John]	2020-12-26 11:17:18 -08:00
Joe LeVeque	d40c9a1e8d	[docker-base-buster][docker-config-engine-buster] No longer install Python 2 (#6162 ) - Why I did it As part of migrating SONiC codebase from Python 2 to Python 3 - How I did it - No longer install Python 2 in docker-base-buster or docker-config-engine-buster. - Install Python 2 and pip2 in the following containers until we can completely eliminate it there: - docker-platform-monitor - docker-sonic-mgmt-framework - docker-sonic-vs - Pin pip2 version <21 where it is still temporarily needed, as pip version 21 will drop support for Python 2 - Also perform some other cleanup, ensuring that pip3, setuptools and wheel packages are installed in docker-base-buster, and then removing any attempts to re-install them in derived containers	2020-12-25 21:29:25 -08:00
KISHORE KUNAL	4bb8ab3495	Add support to start fdbsyncd when orchagent docker starts (#5979 ) Add support to start fdbsyncd when swss docker starts. A new daemon is added to sync MAC entries from the kernel to the DB and vice versa.	2020-12-24 18:36:01 -08:00
Wei Bai	8939202f67	[docker-sonic-mgmt]: Upgrade Tgen API from 0.0.42 to 0.0.70 (#6275 ) Tgen API 0.0.42 has many problems. We have fixed them in 0.0.70.	2020-12-24 01:53:31 -08:00
faraazbrcm	9d35fa19dc	[mgmt-framework]: support python3 in mgmt-framework (#6038 ) Fix specific version for mmh3 for python2 and python3 and Add pyang for python3	2020-12-22 12:06:28 -08:00
Renuka Manavalan	ba02209141	First cut image update for kubernetes support. (#5421 ) * First cut image update for kubernetes support. With this, 1) dockers dhcp_relay, lldp, pmon, radv, snmp, telemetry are enabled for kube management init_cfg.json configure set_owner as kube for these 2) Each docker's start.sh updated to call container_startup.py to register going up As part of this call, it registers the current owner as local/kube and its version The images are built with its version ingrained into image during build 3) Update all docker's bash script to call 'container start/stop/wait' instead of 'docker start/stop/wait'. For all locally managed containers, it calls docker commands, hence no change for locally managed. 4) Introduced a new ctrmgrd service, that helps with transition between owners as kube & local and carry over any labels update from STATE-DB to API server 5) hostcfgd updated to handle owner change 6) Reboot scripts are updated to tag kube running images as local, so upon reboot they run the same image. 7) Added kube_commands.py to handle all updates with Kubernetes API server -- dedicated for k8s interaction only.	2020-12-22 08:01:33 -08:00
xumia	0a36de3a89	Recover "Support SONiC Reproduceable Build-debian/pip/web packages (#6255 ) * Revert "Revert "Support SONiC Reproduceable Build-debian/pip/web packages (#5718)"" This reverts commit `17497a65e3`. * Revert "Revert "Remove unnecessary sudo authority in build Makefile (#6237)"" This reverts commit `163b7111b5`.	2020-12-21 15:31:10 +08:00
Guohan Lu	17497a65e3	Revert "Support SONiC Reproduceable Build-debian/pip/web packages (#5718 )" This reverts commit `55a707586b`.	2020-12-18 23:37:27 -08:00
macikgozwa	169b2fb188	[docker-ptf]: Updating Python-based GNMI client (#6216 ) Upgrading the reference for the Python GNMI tool repository. The commit for the new payload Co-authored-by: Murat Acikgoz <muacikgo@microsoft.com>	2020-12-17 22:12:04 -08:00
xumia	55a707586b	Support SONiC Reproduceable Build-debian/pip/web packages (#5718 ) * Support SONiC reproduceable build for deb/py2/py3/web * Remove j2 files * Fix bug * Fix some issues 1. Change some code format issues 2. Fix curl calling wget command, pip2 calling pip3 issue 3. Fix wget/curl downloading multiple urls issue * Fix some code format issue * Fix bug * Fix bug * Fix command path hard code in build info scripts issue * Add debian package sonic-build-tools * Fix auto debian package removed issue * Change build debian package name, and change the folder * Collect the pre-versions and post-versions * Change to use debian:buster * Remove apt-mark and improve code * Remove set_build_hooks * Change docker trusted gpg files * Fix docker build COPY directory name issue * Move the trusted gpg files into the sonic-build-hooks package	2020-12-17 13:06:53 +08:00
mprabhu-nokia	41012f791e	In modular chassis, add CHASSIS_STATE_DB on control card (#5624 ) HLD: Azure/SONiC#646 In modular chassis, add CHASSIS_STATE_DB on control card Why I did it Modular Chassis has control-cards, line-cards and fabric-cards along with other peripherals. Control-Card CHASSIS_STATE_DB will be the central DB to maintain any state information of cards that is accessible to the control-card. How I did it Adding another DB on an existing REDIS instance running on port 6380.	2020-12-15 17:15:00 -08:00
mprabhu-nokia	00cea080af	Chassisd to monitor cards in a modular chassis (#5523 ) HLD: Azure/SONiC#646 Introducing chassisd process to monitor status of the control, line and fabric cards in a modular chassis. - Why I did it Modular Chassis has control-cards, line-cards and fabric-cards along with other peripherals. Chassisd will be a central entity that has visibility of the entire chassis. - How I did it Chassisd process will monitor cards in the main thread. Another configuation_handling_task is created to listen to CONFIG_DB for admin_status up/down events. The monitored status is persisted in REDIS-DB.	2020-12-15 16:28:58 -08:00
zhenggen-xu	182a809dc3	[docker-vs][docker-orchagent] install python3 dependent packages for restore_neighbors.py (#6207 ) Install the necessary python3 dependent packages to convert restore_neighbor.py to support python3 as python2 is EOL. See: Azure/sonic-swss#1542 Signed-off-by: Zhenggen Xu <zxu@linkedin.com>	2020-12-15 11:06:30 -08:00
Sabareesh-Kumar-Anandan	9f4ca01388	[sonic-config-engine] Adding dependent pkgs needed for arm compilation (#6186 ) libxslt-dev and libz-dev are dependencies for lxml==4.6.1 which is required for pyangbind==0.8.1 lxml-4.6.2-cp37-cp37m-manylinux1_x86_64.whl is directly downloaded in amd64 whereas in arm this is built from lxml-4.6.2.tar.gz Signed-off-by: Sabareesh Kumar Anandan <sanandan@marvell.com>	2020-12-15 08:44:46 -08:00
Stephen Sun	e010d83fc3	[Dynamic buffer calc] Support dynamic buffer calculation (#6194 ) - Why I did it To support dynamic buffer calculation. This PR also depends on the following PRs for sub modules - [sonic-swss: [buffermgr/bufferorch] Support dynamic buffer calculation #1338](https://github.com/Azure/sonic-swss/pull/1338) - [sonic-swss-common: Dynamic buffer calculation #361](https://github.com/Azure/sonic-swss-common/pull/361) - [sonic-utilities: Support dynamic buffer calculation #973](https://github.com/Azure/sonic-utilities/pull/973) - How I did it 1. Introduce field `buffer_model` in `DEVICE_METADATA\|localhost` to represent which buffer model is running in the system currently: - `dynamic` for the dynamic buffer calculation model - `traditional` for the traditional model in which the `pg_profile_lookup.ini` is used 2. Add the tables required for the feature: - ASIC_TABLE in platform/\<vendor\>/asic_table.j2 - PERIPHERAL_TABLE in platform/\<vendor\>/peripheral_table.j2 - PORT_PERIPHERAL_TABLE on a per-platform basis in device/\<vendor\>/\<platform\>/port_peripheral_config.j2 for each platform with gearbox installed. - DEFAULT_LOSSLESS_BUFFER_PARAMETER and LOSSLESS_TRAFFIC_PATTERN in files/build_templates/buffers_config.j2 - Add lossless PGs (3-4) for each port in files/build_templates/buffers_config.j2 3. Copy the newly introduced j2 files into the image and rendering them when the system starts 4. Update the CLI options for buffermgrd so that it can start with dynamic mode 5. Fetches the ASIC vendor name in orchagent: - fetch the vendor name when creates the docker and pass it as a docker environment variable - `buffermgrd` can use this passed-in variable 6. Clear buffer related tables from STATE_DB when swss docker starts 7. Update the src/sonic-config-engine/tests/sample_output/buffers-dell6100.json according to the buffer_config.j2 8. 
Remove buffer pool sizes for ingress pools and egress_lossy_pool Update the buffer settings for dynamic buffer calculation	2020-12-13 11:35:39 -08:00
Dong Zhang	b2a3de5f4f	[MultiDB] add multidb warmboot support - restoring database (#5773 ) * restoring each database with all data before warmboot and then flush unused data in each instance, following the multiDB warmboot design at https://github.com/Azure/SONiC/blob/master/doc/database/multi_database_instances.md * restore needs to be done in database docker since we need to know the database_config.json in new version * copy all data rdb file into each instance restoration location and then flush unused database * other logic is the same as before * backing up database part is in another PR at sonic-utilities https://github.com/Azure/sonic-utilities/pull/1205, they depend on each other	2020-12-10 11:06:19 -08:00
trzhang-msft	d4d90a8963	Support for dual tor option in dhcp docker template (#6152 )	2020-12-09 18:10:00 -08:00
Samuel Angebault	8576911a57	[database-chassis]: Fix the way database-chassis start (#6099 ) The service crashes when the platform boots due to missing waits: /usr/bin/database.sh tries to operate on a missing socket and fails. We now wait for the chassis database to be ready the same way we do for database.	2020-12-04 10:09:35 -08:00
Joe LeVeque	83f0d8240e	[pmon]: Install vanilla 'thrift' Python 2 and 3 packages for Barefoot in host and PMon (#6080 ) Barefoot platform vendors' sonic_platform packages import the Python 'thrift' library. Previously, our custom-built package was being installed in the PMon container and host OS. However, we are only building a Python 2 version of that package, which was only intended for use with saithrift. Fixes #6077	2020-12-04 08:41:17 -08:00
Joe LeVeque	905a5127bb	[Python] Align files in root dir, dockers/ and files/ with PEP8 standards (#6109 ) - Why I did it Align style with slightly modified PEP8 standards (extend maximum line length to 120 chars). This will also help in the transition to Python 3, where it is more strict about whitespace, plus it helps unify style among the SONiC codebase. Will tackle other directories in separate PRs. - How I did it Using `autopep8 --in-place --max-line-length 120` and some manual tweaks.	2020-12-03 15:57:50 -08:00
Sabareesh-Kumar-Anandan	fe524c37e7	[platform][marvell] Arm 32-bit Arch support changes (#5749 ) - Added Arm 32-bit arch build fixes - Added marvell armhf platform specific changes Signed-off-by: Sabareesh Kumar Anandan <sanandan@marvell.com>	2020-12-03 12:38:50 -08:00
Junchao-Mellanox	68464381bc	Add a configuration to delay start xcvrd for fast-reboot (#5643 )	2020-12-02 21:28:18 +02:00
abdosi	872c85d8e7	[lldp]: Lldp docker to use python3 version of sonic-db-syncd package. (#6046 ) Made changes so that Lldp docker start using py3 of sonic-db-syncd submodule update sonic-db-syncd 5cc29a1b32d8d1f4dfbc967bfea2727c50a49c76 (HEAD -> master, origin/master, origin/HEAD) Changes to convert sonic-dbsyncd from python 2 to 3 Signed-off-by: Abhishek Dosi <abdosi@microsoft.com>	2020-11-30 10:44:40 -08:00
pavel-shirshov	fd87ba0aee	[bgpcfgd]: Add on-match next rule for set ipv6 next-hop prefer-global (#6011 ) * Add 'on-match next' after every 'set ipv6 next-hop prefer-global' * Check that 'set ipv6 next-hop prefer-global' rule has 'on-match' next	2020-11-24 08:33:31 -08:00
Sudharsan Dhamal Gopalarathnam	98a434e8c1	Copp Manager Changes (#4861 ) *Introduce CoPP Manager infrastructure Copp service to generate initial copp config template file Co-authored-by: dgsudharsan <sudharsan_gopalarat@dell.com>	2020-11-23 09:31:42 -08:00
pavel-shirshov	5df8af5378	[TSA]: Fix TSC. Avoid 'Not consistent' state (#5968 )	2020-11-23 09:30:39 -08:00
lguohan	4d3eb18ca7	[supervisord]: use abspath as supervisord entrypoint (#5995 ) use abspath makes the entrypoint not affected by PATH env. Signed-off-by: Guohan Lu <lguohan@gmail.com>	2020-11-22 21:18:44 -08:00
Joe LeVeque	7bf05f7f4f	[supervisor] Install vanilla package once again, install Python 3 version in Buster container (#5546 ) - Why I did it We were building a custom version of Supervisor because I had added patches to prevent hangs and crashes if the system clock ever rolled backward. Those changes were merged into the upstream Supervisor repo as of version 3.4.0 (http://supervisord.org/changes.html#id9), therefore, we should be able to simply install the vanilla package via pip. This will also allow us to easily move to Python 3, as Python 3 support was added in version 4.0.0. - How I did it - Remove Makefiles and patches for building supervisor package from source - Install Python 3 supervisor package version 4.2.1 in Buster base container - Also install Python 3 version of supervisord-dependent-startup in Buster base container - Debian package installed binary in `/usr/bin/`, but pip package installs in `/usr/local/bin/`, so rather than update all absolute paths, I changed all references to simply call `supervisord` and let the system PATH find the executable to prevent future need for changes just in case we ever need to switch back to build a Debian package, then we won't need to modify these again. - Install Python 2 supervisor package >= 3.4.0 in Stretch and Jessie base containers	2020-11-19 23:41:32 -08:00
Mykola F	bbbd94f4dd	[enable counters] provide initial rates parameters (#5048 ) * [enable counters] provide initial rates parameters Signed-off-by: Mykola Faryma <mykolaf@mellanox.com> * add descriptive comment Signed-off-by: Mykola Faryma <mykolaf@mellanox.com> Co-authored-by: Volodymyr Samotiy <volodymyrs@nvidia.com>	2020-11-18 19:33:19 +02:00
pavel-shirshov	af654944bd	[bgp]: Update TSA functionality (#5906 ) Fixed TSA bugs: 1. TSA didn't advertise Loopback ipv6 address 2. TSA and TSB changed BGP dynamic and BGP monitors sessions - How to verify it Build an image and run on your DUT. ``` admin@str-s6100-acs-1:~$ TSA System Mode: Normal -> Maintenance admin@str-s6100-acs-1:~$ vtysh -c 'show bgp ipv4 neighbors 10.0.0.1 advertised-routes' BGP table version is 6, local router ID is 10.1.0.32, vrf id 0 Default local pref 100, local AS 64601 Status codes: s suppressed, d damped, h history, * valid, > best, = multipath, i internal, r RIB-failure, S Stale, R Removed Nexthop codes: @NNN nexthop's vrf id, < announce-nh-self Origin codes: i - IGP, e - EGP, ? - incomplete Network Next Hop Metric LocPrf Weight Path *> 10.1.0.32/32 0.0.0.0 0 32768 i Total number of prefixes 1 admin@str-s6100-acs-1:~$ vtysh -c 'show bgp ipv6 neighbors fc00::a advertised-routes' BGP table version is 6, local router ID is 10.1.0.32, vrf id 0 Default local pref 100, local AS 64601 Status codes: s suppressed, d damped, h history, * valid, > best, = multipath, i internal, r RIB-failure, S Stale, R Removed Nexthop codes: @NNN nexthop's vrf id, < announce-nh-self Origin codes: i - IGP, e - EGP, ? - incomplete Network Next Hop Metric LocPrf Weight Path *> fc00:1::/64 :: 0 32768 i Total number of prefixes 1 admin@str-s6100-acs-1:~$ TSB System Mode: Maintenance -> Normal ``` Co-authored-by: Pavel Shirshov <pavel.contrib@gmail.com>	2020-11-13 17:54:20 -08:00
fk410167	a3dd3f55f9	Platform Driver Development Framework (PDDF) (#4756 ) This change introduces PDDF, which is described here: https://github.com/Azure/SONiC/pull/536 Most of the platform bring-up effort goes into developing the platform device drivers and SONiC platform APIs and validating them. Typically each platform vendor writes its own drivers and platform APIs, which are tailored to that platform. This involves writing code, building it, installing it on the target platform devices, and testing it. Many of the platform details are hard-coded into these drivers from the HW spec. Vendors repeat this cycle until everything works and is validated before upstreaming the code. PDDF aims to make platform driver and platform API development much simpler by providing a data-driven development framework. This is enabled by: JSON descriptor files for platform data; generic data-driven drivers for various devices; generic SONiC platform APIs; and vendor-specific extensions for customisation and extensibility. Signed-off-by: Fuzail Khan <fuzail.khan@broadcom.com>	2020-11-12 10:22:38 -08:00
Sabareesh-Kumar-Anandan	6c362a08e7	[armhf][redis] compilation fixes for armhf arch (#5901 ) 1. Update SSL ca certificates for secure download [arm specific] 2. Using redis-tools from blob sonic-storage for docker-base-stretch Signed-off-by: Sabareesh Kumar Anandan <sanandan@marvell.com>	2020-11-11 18:19:48 -08:00
Lawrence Lee	d0f16c0d79	Make backend device checking more robust (#5730 ) Treat devices that are ToRRouters (ToRRouters and BackEndToRRouters) the same when rendering templates Except for BackEndToRRouters belonging to a storage cluster, since these devices have extra sub-interfaces created Treat devices that are LeafRouters (LeafRouters and BackEndLeafRouters) the same when rendering templates Signed-off-by: Lawrence Lee <lawlee@microsoft.com>	2020-11-10 15:06:35 -08:00
judyjoseph	f2b22b5cd1	[multi-ASIC] util changes with the BGP_INTERNAL_NEIGHBOR table (#5874 ) Reintroduce #5760, along with the fix needed in the template file for python3 compatibility.	2020-11-10 09:34:56 -08:00
judyjoseph	b5121dcfd4	Revert "[multi-ASIC] util changes with the BGP_INTERNAL_NEIGHBOR table. (#5760 )" (#5871 ) This reverts commit `c972052594`.	2020-11-09 14:30:13 -08:00
ajbalogh	a9fc866ba5	dockers/docker-sonic-mgmt/Dockerfile.j2: Add keysight ixnetwork-open-… (#5762 ) Why I did it Provide access to the ixnetwork-open-traffic-generator pypi package How I did it Added a pip install entry for the package in the Dockerfile.j2 How to verify it pip show ixnetwork-open-traffic-generator Description for the changelog Installing the ixnetwork-open-traffic-generator pypi package is required for the proposed rdma tests.	2020-11-09 14:12:39 -08:00
abdosi	84aa99d04b	[multi-asic] teamdctl support for multi-asic (#5851 ) Signed-off-by: Abhishek Dosi <abdosi@microsoft.com>	2020-11-09 12:31:33 -08:00
judyjoseph	c972052594	[multi-ASIC] util changes with the BGP_INTERNAL_NEIGHBOR table. (#5760 ) - Why I did it Update the routine is_bgp_session_internal() to check the BGP_INTERNAL_NEIGHBOR table. Additionally, to address a review comment on #5520, add timer settings as well in the internal session templates and keep them minimal, as these sessions will always be up. Update the internal tests data and add all of it to the template tests. - How I did it Updated the APIs and the template files. - How to verify it Verified that the internal BGP sessions are displayed correctly by show commands that use the is_bgp_session_internal() API	2020-11-09 11:10:10 -08:00
Longxiang Lyu	385dfc4921	[monit] Fix status error due to shebang change (#5865 ) lldpmgrd, bgpcfgd, and bgpmon are reported as not running because their shebangs were recently changed to use `python3`. Modify the arguments of `process_checker` to follow this change. Signed-off-by: Longxiang Lyu <lolv@microsoft.com>	2020-11-09 01:52:22 -08:00
Garrick He	9fda295cd0	[sflow] Clean-up sFlow container and port_index_mapper.py script (#5846 ) * Fix some spelling errors and add an additional check in the port_index_mapper.py script. * Remove unneeded pyroute2 python module from sFlow container Signed-off-by: Garrick He <garrick_he@dell.com>	2020-11-07 20:23:01 -08:00
Petro Bratash	32a832a8ac	[lldp]: Add verification IPv4 address on LLDP conf Jinja2 Template (#5699 ) Fix #5812 The LLDP conf Jinja2 template does not verify that the management address is IPv4 and can pick an IPv6 address. This issue does not affect control of the LLDP daemon. The issue can be reproduced via the `test_snmp_lldp` test. The LLDP conf Jinja2 template selects the first item from the list of mgmt interfaces. TESTBED_1 LLDP conf ``` # cat /etc/lldpd.conf configure ports eth0 lldp portidsubtype local eth0 configure system ip management pattern FC00:3::32 configure system hostname dut-1 ``` TESTBED_2 LLDP conf ``` # cat /etc/lldpd.conf configure ports eth0 lldp portidsubtype local eth0 configure system ip management pattern 10.22.24.61 configure system hostname dut-2 ``` TESTBED_1 MGMT_INTERFACE ``` $ redis-cli -n 4 keys "" | grep MGMT_INTERFACE MGMT_INTERFACE|eth0|10.22.24.53/23 MGMT_INTERFACE|eth0|FC00:3::32/64 ``` TESTBED_2 MGMT_INTERFACE ``` $ redis-cli -n 4 keys "" | grep MGMT_INTERFACE MGMT_INTERFACE|eth0|FC00:3::32/64 MGMT_INTERFACE|eth0|10.22.24.61/23 ``` Signed-off-by: Petro Bratash <petrox.bratash@intel.com>	2020-11-07 10:30:41 -08:00
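The selection rule described in #5699 (use an IPv4 management address rather than whichever entry comes first) can be sketched in Python. This is only an illustration of the idea, not the Jinja2 template from the commit; the helper name `first_ipv4_mgmt_address` is hypothetical, and the keys are assumed to have the `MGMT_INTERFACE|<ifname>|<address>/<prefixlen>` shape shown in the commit message.

```python
import ipaddress


def first_ipv4_mgmt_address(keys):
    """Return the first IPv4 management address found in CONFIG_DB-style keys.

    Hypothetical helper: each key is assumed to look like
    "MGMT_INTERFACE|eth0|10.22.24.53/23". Returns None if no IPv4 entry exists.
    """
    for key in keys:
        _table, _ifname, prefix = key.split("|", 2)
        iface = ipaddress.ip_interface(prefix)
        if iface.version == 4:
            # Strip the prefix length; lldpd's management pattern takes an address.
            return str(iface.ip)
    return None


testbed_2 = [
    "MGMT_INTERFACE|eth0|FC00:3::32/64",
    "MGMT_INTERFACE|eth0|10.22.24.61/23",
]
print(first_ipv4_mgmt_address(testbed_2))
```

With this rule the result no longer depends on the order in which the IPv4 and IPv6 entries are listed.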
pavel-shirshov	cdcd20a7b5	[BGP]: Convert ip address to network address for the LOCAL_VLAN filter (#5832 ) * [BGP]: Convert ip address to network address for the LOCAL_VLAN prefix filter	2020-11-06 17:47:08 -08:00
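For readers unfamiliar with the conversion named in #5832, the following Python sketch shows what turning an interface address into its network address means. It uses the standard `ipaddress` module and is not the bgpcfgd code; the function name `to_network` and the sample addresses are hypothetical.

```python
import ipaddress


def to_network(address_with_prefix):
    """Map an interface address such as "192.168.0.1/27" to its network prefix.

    Hypothetical helper: the host bits are cleared and the prefix length is kept.
    """
    return str(ipaddress.ip_interface(address_with_prefix).network)


print(to_network("192.168.0.1/27"))
print(to_network("fc02:1000::1/64"))
```

A prefix filter built from the network address matches the whole VLAN subnet instead of a single host address.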
Joe LeVeque	ad555d9ffd	[restore_nat_entries.py] Convert to Python 3 (#5788 ) - Convert restore_nat_entries.py script to Python 3 - Use logger from sonic-py-common for uniform logging - Reorganize imports alphabetically per PEP8 standard - Two blank lines precede functions per PEP8 standard	2020-11-06 10:15:49 -08:00
Garrick He	27a911f16e	[sflow] Fix port_index_mapper.py script; Convert to Python 3 (#5800 ) - Why I did it A memory issue was discovered during system test for scaling. The issue is documented here: https://docs.pyroute2.org/ipdb.html > One of the major issues with IPDB is its memory footprint. It proved not to be suitable for environments with thousands of routes or neighbours. Being a design issue, it could not be fixed, so a new module was started, NDB, that aims to replace IPDB. IPDB is still more feature rich, but NDB is already more fast and stable. - How I did it - Rewrote the port_index_mapper.py script to use dB events. - Convert to Python 3	2020-11-06 10:15:06 -08:00
Joe LeVeque	51292330e9	[enable_counters.py] Convert to Python 3 (#5789 ) - Why I did it As part of moving all SONiC code from Python 2 (no longer supported) to Python 3 - How I did it - Convert enable_counters.py script to Python 3 - Reorganize imports per PEP8 standard - Two blank lines precede functions per PEP8 standard	2020-11-06 09:00:19 -08:00
pavel-shirshov	13f8e9ce5e	[bgpcfgd]: Convert bgpcfgd and bgpmon to python3 (#5746 ) * Convert bgpcfgd to python3 Convert bgpmon to python3 Fix some issues in bgpmon * Add python3-swsscommon as depends * Install dependencies * reorder deps Co-authored-by: Pavel Shirshov <pavel.contrib@gmail.com>	2020-11-05 10:01:43 -08:00
Joe LeVeque	ba7fda7fd1	[docker-platform-monitor] Install Python 2 'enum34' package to fix Arista platforms (#5779 ) Recent changes to dependencies caused the 'enum34' package to cease being installed for Python 2 in the PMon container. This broke Arista platforms, where the Arista sonic_platform package imports 'enum'. This is because on Arista devices, the sonic_platform wheel is not installed in the container. Instead, the installation directory is mounted from the host OS. However, this method doesn't ensure all dependencies are installed in the container.	2020-11-04 11:23:03 -08:00
dflynn-Nokia	ac3a605c75	[build]: ARM build: Download redis-tools and redis-server from sonicstorage (#5797 ) Prevent intermittent build failures when building Sonic for the ARM platform architecture due to version upgrades of the redis-tools and redis-server packages. Modify select Dockerfile templates to download the redis-tools and redis-server packages from sonicstorage rather than from debian.org. This PR has been made possible by the inclusion of ARM versions of redis-tools and redis-server into sonicstorage as described in Issue# 5701	2020-11-04 09:31:06 -08:00
Joe LeVeque	e3164d5fb4	[lldpmgrd] Convert to Python 3 (#5785 ) - Convert lldpmgrd to Python 3 - Install Python 3 swsscommon package in docker-lldp	2020-11-03 12:50:11 -08:00
Lawrence Lee	10ab46f7a0	Revert "[docker-base]: Rate limit priority INFO and lower in syslog" (#5763 ) * This was a temporary fix for orchagent spamming log messages and causing rate limiting, leading to critical messages being dropped for the syslog. No longer needed since Azure/sonic-sairedis#680 was merged.	2020-11-02 08:49:40 -08:00
Joe LeVeque	f2a258aca9	[docker-platform-monitor] Check if sonic_platform is available before installed (#5764 ) On Arista platforms, sonic_platform packages are not installed in the PMon container, but are rather mounted into the container from the host OS. Therefore, pip show sonic_platform will fail in the PMon container. This change will first check if we can import sonic_platform. If this fails, it will then fall back to checking if the package is installed. If both fail, it will attempt to install the package.	2020-11-01 01:48:07 -08:00
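The three-step fallback described in #5764 (try to import, then check whether the package is installed, and only then install) can be sketched as follows. This is an assumption-laden illustration, not the PMon start script: `needs_install` is a hypothetical name, `importlib.util.find_spec` stands in for the import attempt, and the installed-package check is passed in as a callable so the sketch stays self-contained.

```python
import importlib.util


def needs_install(module_name, is_pip_installed):
    """Decide whether a package still has to be installed.

    Hypothetical helper. `is_pip_installed` is a caller-supplied callable
    (for example a wrapper around `pip show`) that returns True or False.
    """
    # Step 1: an importable module (e.g. mounted from the host OS) is enough.
    if importlib.util.find_spec(module_name) is not None:
        return False
    # Step 2: fall back to asking the package manager.
    if is_pip_installed(module_name):
        return False
    # Step 3: neither check succeeded, so the caller should install it.
    return True


print(needs_install("json", lambda name: False))
```

Checking importability first is what covers the Arista case from the commit message, where the package directory is mounted into the container and is therefore invisible to pip.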
abdosi	dddf96933c	[monit] Adding patch to enhance syslog error message generation for monit alert action when status is failed. (#5720 ) Why/How I did: Make sure first error syslog is triggered based on FAULT TOLERANCE condition. Added support of repeat clause with alert action. This is used as trigger for generation of periodic syslog error messages if error is persistent Updated the monit conf files with repeat every x cycles for the alert action	2020-10-31 17:29:49 -07:00
Junchao-Mellanox	781188f549	[thermalctld] Enlarge startretries value to avoid thermalctld not able to restart during regression test (#5633 ) Increase startretries value from default of 10 to 50 to prevent supervisor from placing thermalctld in FATAL state during regression testing. Also ensures supervisord tries hard to get thermalctld running in production, as thermalctld is critical to prevent device from overheating.	2020-10-30 12:01:17 -07:00
Joe LeVeque	6333bb73b0	Explicitly call `pip2` rather than `pip` in locations where both pip2 and pip3 are installed (#5747 ) As part of the transition from Python 2 to Python 3, we are installing both pip2 and pip3 in the slave and config-engine containers. This PR replaces calls to `pip` in these containers with an explicit call to `pip2` to ensure the proper version of pip is executed, no matter which version of pip is aliased to `pip`, as we no longer rely on that alias. Also some other pip-related cleanup	2020-10-30 09:43:14 -07:00
Joe LeVeque	b132ca0980	[build]: Upgrade pip3 before pip2 (#5743 ) Upgrading pip3 after pip2 caused the pip command to be aliased to the pip3 command. However, since we are still transitioning from Python 2 to Python 3, most pip commands in the codebase are expecting pip to alias to pip2. The proper solution here is to explicitly call pip2 and pip3, and no longer call pip, however this will require extensive changes and testing, so to quickly fix this issue, we upgraded pip2 after pip3 to ensure that pip2 is installed after pip3.	2020-10-29 19:01:17 -07:00
judyjoseph	6088bd59de	[multi-ASIC] BGP internal neighbor table support (#5520 ) * Initial commit for BGP internal neighbor table support. > Add new template named "internal" for the internal BGP sessions > Add a new table in database "BGP_INTERNAL_NEIGHBOR" > The internal BGP sessions will be stored in this new table "BGP_INTERNAL_NEIGHBOR" * Changes in template generation tests with the introduction of internal neighbor template files.	2020-10-28 16:41:27 -07:00
Joe LeVeque	9e34003136	[sonic-config-engine] Clean up dependencies, pin versions; install Python 3 package in Buster container (#5656 ) To clean up the image build procedure, and let setuptools/pip[3] implicitly install Python dependencies. Also use ipaddress package instead of ipaddr.	2020-10-26 13:48:50 -07:00
shlomibitton	e66d49a57c	[LLDP] Fix for LLDP advertisements being sent with wrong information. (#5493 ) * Fix for LLDP advertisements being sent with wrong information. Since lldpd starts before lldpmgrd, some advertisement packets might be sent with a default value, the MAC address, as Port ID. This fix holds the packets from being sent by lldpd until all interfaces are configured by lldpmgrd. Signed-off-by: Shlomi Bitton <shlomibi@nvidia.com> * Fix comments * Fix unit-test output caused a failure during build * Add 'run_cmd' function and use it * Resume lldpd even if port init timeout reached	2020-10-26 19:38:09 +02:00
lguohan	7d4ab4237a	[docker-base]: swss/syncd support use zmq as rpc channel (#5715 ) install libzmq5 in docker-base Signed-off-by: Guohan Lu <lguohan@gmail.com>	2020-10-25 20:39:38 -07:00
Shi Su	67408c85aa	[synchronous-mode] Add template file for synchronous mode (#5644 ) The orchagent and syncd need to have the same default synchronous mode configuration. This PR adds a template file to translate the default value in CONFIG_DB (empty field) to an explicit mode so that the orchagent and syncd could have the same default mode.	2020-10-23 13:08:35 -07:00
pavel-shirshov	c94f93f046	[bgpcfgd]: Dynamic BBR support (#5626 ) - Why I did it To introduce dynamic support of BBR functionality into bgpcfgd. BBR adds `neighbor PEER_GROUP allowas-in 1` for all BGP peer-groups which point to T0. Now we can add and remove this configuration based on a CONFIG_DB entry - How I did it I introduced a new CONFIG_DB entry: - table name: "BGP_BBR" - key value: "all". Currently only "all" is supported, which means that all peer-groups which point to T0s will be updated - data value: a dictionary: {"status": "status_value"}, where status_value could be either "enabled" or "disabled" Initially, when bgpcfgd starts, it reads initial BBR status values from the [constants.yml](https://github.com/Azure/sonic-buildimage/pull/5626/files#diff-e6f2fe13a6c276dc2f3b27a5bef79886f9c103194be4fcb28ce57375edf2c23cR34). Then you can control BBR status by changing the "BGP_BBR" table in the CONFIG_DB (see examples below). bgpcfgd knows which peer-groups to change from [constants.yml](https://github.com/Azure/sonic-buildimage/pull/5626/files#diff-e6f2fe13a6c276dc2f3b27a5bef79886f9c103194be4fcb28ce57375edf2c23cR39). The dictionary contains peer-group names as keys, and a list of address-families as values. So when bgpcfgd gets a request to change the BBR state, it changes the state only for peer-groups listed in the constants.yml dictionary (and only for address families from the peer-group value). - How to verify it Initially, when we start SONiC, FRR has BBR enabled for PEER_V4 and PEER_V6: ``` admin@str-s6100-acs-1:~$ vtysh -c 'show run' | egrep 'PEER_V.? allowas' neighbor PEER_V4 allowas-in 1 neighbor PEER_V6 allowas-in 1 ``` Then we apply the following configuration to the db: ``` admin@str-s6100-acs-1:~$ cat disable.json { "BGP_BBR": { "all": { "status": "disabled" } } } admin@str-s6100-acs-1:~$ sonic-cfggen -j disable.json -w ``` The log output is: ``` Oct 14 18:40:22.450322 str-s6100-acs-1 DEBUG bgp#bgpcfgd: Received message : '('all', 'SET', (('status', 'disabled'),))' Oct 14 18:40:22.450620 str-s6100-acs-1 DEBUG bgp#bgpcfgd: execute command '['vtysh', '-f', '/tmp/tmpmWTiuq']'. Oct 14 18:40:22.681084 str-s6100-acs-1 DEBUG bgp#bgpcfgd: execute command '['vtysh', '-c', 'clear bgp peer-group PEER_V4 soft in']'. Oct 14 18:40:22.904626 str-s6100-acs-1 DEBUG bgp#bgpcfgd: execute command '['vtysh', '-c', 'clear bgp peer-group PEER_V6 soft in']'. ``` Check the FRR configuration and see that no allowas parameters are there: ``` admin@str-s6100-acs-1:~$ vtysh -c 'show run' | egrep 'PEER_V.? allowas' admin@str-s6100-acs-1:~$ ``` Then we apply the enabling configuration back: ``` admin@str-s6100-acs-1:~$ cat enable.json { "BGP_BBR": { "all": { "status": "enabled" } } } admin@str-s6100-acs-1:~$ sonic-cfggen -j enable.json -w ``` The log output: ``` Oct 14 18:40:41.074720 str-s6100-acs-1 DEBUG bgp#bgpcfgd: Received message : '('all', 'SET', (('status', 'enabled'),))' Oct 14 18:40:41.074720 str-s6100-acs-1 DEBUG bgp#bgpcfgd: execute command '['vtysh', '-f', '/tmp/tmpDD6SKv']'. Oct 14 18:40:41.587257 str-s6100-acs-1 DEBUG bgp#bgpcfgd: execute command '['vtysh', '-c', 'clear bgp peer-group PEER_V4 soft in']'. Oct 14 18:40:42.042967 str-s6100-acs-1 DEBUG bgp#bgpcfgd: execute command '['vtysh', '-c', 'clear bgp peer-group PEER_V6 soft in']'. ``` Check the FRR configuration and see that the BBR configuration is back: ``` admin@str-s6100-acs-1:~$ vtysh -c 'show run' | egrep 'PEER_V.? allowas' neighbor PEER_V4 allowas-in 1 neighbor PEER_V6 allowas-in 1 ``` * The test coverage * Below is the test coverage ``` ---------- coverage: platform linux2, python 2.7.12-final-0 ---------- Name Stmts Miss Cover ---------------------------------------------------- bgpcfgd/__init__.py 0 0 100% bgpcfgd/__main__.py 3 3 0% bgpcfgd/config.py 78 41 47% bgpcfgd/directory.py 63 34 46% bgpcfgd/log.py 15 3 80% bgpcfgd/main.py 51 51 0% bgpcfgd/manager.py 41 23 44% bgpcfgd/managers_allow_list.py 385 21 95% bgpcfgd/managers_bbr.py 76 0 100% bgpcfgd/managers_bgp.py 193 193 0% bgpcfgd/managers_db.py 9 9 0% bgpcfgd/managers_intf.py 33 33 0% bgpcfgd/managers_setsrc.py 45 45 0% bgpcfgd/runner.py 39 39 0% bgpcfgd/template.py 64 11 83% bgpcfgd/utils.py 32 24 25% bgpcfgd/vars.py 1 0 100% ---------------------------------------------------- TOTAL 1128 530 53% ``` - Which release branch to backport (provide reason below if selected) - [ ] 201811 - [x] 201911 - [x] 202006	2020-10-22 11:04:21 -07:00
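The BGP_BBR entry described in #5626 can be illustrated with a short Python sketch. It builds the same payload as the `disable.json` and `enable.json` files in the commit message and derives the `allowas-in` lines for a set of peer-groups. Both helper names are hypothetical, the `no ` prefix for the disabled case is an assumption about the generated FRR configuration, and address-family handling is left out; this is not the `managers_bbr.py` implementation.

```python
import json

VALID_STATUSES = ("enabled", "disabled")


def bbr_config(status):
    """Build a CONFIG_DB payload for the BGP_BBR table (hypothetical helper).

    "all" is the only key the commit message says is supported.
    """
    if status not in VALID_STATUSES:
        raise ValueError("status must be 'enabled' or 'disabled', got %r" % (status,))
    return {"BGP_BBR": {"all": {"status": status}}}


def bbr_neighbor_lines(peer_groups, status):
    """Return one allowas-in line per peer-group (hypothetical helper).

    `peer_groups` maps peer-group names to address-family lists, mirroring the
    constants.yml dictionary described above. Assumes a "no " prefix removes
    the setting when BBR is disabled.
    """
    prefix = "" if status == "enabled" else "no "
    return ["%sneighbor %s allowas-in 1" % (prefix, name) for name in sorted(peer_groups)]


print(json.dumps(bbr_config("disabled")))
print(bbr_neighbor_lines({"PEER_V4": ["ipv4"], "PEER_V6": ["ipv6"]}, "enabled"))
```

A payload like the first one could be saved to a file and loaded with `sonic-cfggen -j <file> -w`, as shown in the verification steps of the commit message.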
BrynXu	29928c93a1	[chassis]: Use correct path for chassisdb.conf file (#5632 ) Use the correct chassisdb.conf path while bringing up the chassis_db service on a VoQ modular switch. resolves #5631 Signed-off-by: Honggang Xu <hxu@arista.com>	2020-10-21 01:40:04 -07:00
Lawrence Lee	207587d97c	[docker-base]: Rate limit priority INFO and lower in syslog (#5666 ) There is currently a bug where messages from swss with priority lower than the current log level are still being counted against the syslog rate limiting threshold. This leads to rate-limiting in syslog when the rate-limiting conditions have not been met, which causes several sonic-mgmt tests to fail since they are dependent on LogAnalyzer. It also omits potentially useful information from the syslog. Only rate-limiting messages of level INFO and lower allows these tests to pass successfully. Signed-off-by: Lawrence Lee <lawlee@microsoft.com>	2020-10-20 11:52:46 -07:00
BrynXu	a2e3d2fcea	[ChassisDB]: bring up ChassisDB service (#5283 ) bring up chassisdb service on sonic switch according to the design in Distributed Forwarding in VoQ Arch HLD Signed-off-by: Honggang Xu <hxu@arista.com> - Why I did it To bring up new ChassisDB service in sonic as designed in ['Distributed forwarding in a VOQ architecture HLD' ](`90c1289eaf/doc/chassis/architecture.md`). - How I did it Implement the section 2.3.1 Global DB Organization of the VOQ architecture HLD. - How to verify it ChassisDB service won't start without chassisdb.conf file on the existing platforms. ChassisDB service is accessible with global.conf file in the distributed architecture. Signed-off-by: Honggang Xu <hxu@arista.com>	2020-10-14 15:15:24 -07:00
Joe LeVeque	88c1d66c27	[python-click] No longer build our own package, let pip/setuptools install vanilla (#5549 ) We were building our own python-click package because we needed features/bug fixes available as of version 7.0.0, but the most recent version available from Debian was in the 6.x range. "Click" is needed for building/testing and installing sonic-utilities. Now that we are building sonic-utilities as a wheel, with Click specified as a dependency in the setup.py file, setuptools will install a more recent version of Click in the sonic-slave-buster container when building the package, and pip will install a more recent version of Click in the host OS of SONiC when installing the sonic-utilities package. Also, we don't need to worry about installing the Python 2 or 3 version of the package, as the proper one will be installed as necessary.	2020-10-14 10:16:35 -07:00
pavel-shirshov	812e1a3489	[bgp]: Enable next-hop-tracking through default (#5600 ) - Why I did it FRR introduced [next hop tracking](http://docs.frrouting.org/projects/dev-guide/en/latest/next-hop-tracking.html) functionality. That functionality requires resolving BGP neighbors before setting BGP connection (or explicit ebgp-multihop command). Sometimes (BGP MONITORS) our neighbors are not directly connected and sessions are IBGP. In this case current configuration prevents FRR to establish BGP connections. Reason would be "waiting for NHT". To fix that we need either add static routes for each not-directly connected ibgp neighbor, or enable command `ip nht resolve-via-default` - How I did it Put `ip nht resolve-via-default` into the config - How to verify it Build an image. Enable BGP_MONITOR entry and check that entry is Established or Connecting in FRR Co-authored-by: Pavel Shirshov <pavel.contrib@gmail.com>	2020-10-13 22:21:28 -07:00
Mahesh Maddikayala	744612d269	[ECMP][Multi-ASIC] Have different ECMP seed value on each ASIC (#5357 ) * Calculate ECMP hash seed based on ASIC ID on multi ASIC platform. Each ASIC will have a unique ECMP hash seed value.	2020-10-08 09:05:37 -07:00
abdosi	70528f7460	[Multi-asic] Fixed Default Route to be BGP (#5548 ) Learned and not docker default route for multi-asic platforms. Signed-off-by: Abhishek Dosi <abdosi@microsoft.com>	2020-10-05 22:54:47 -07:00
Lawrence Lee	8c344095a8	[docker-orchagent]: Add NDP Proxy Daemon (#5517 ) * Install ndppd during image build, and copy config files to image * Configure proxy settings based on config DB at container start * Pipe ndppd output to logger inside container to log output in syslog	2020-10-05 08:48:13 -07:00
pavel-shirshov	ffae82f8be	[bgp] Add 'allow list' manager feature (#5513 ) implements a new feature: "BGP Allow list." This feature allows us to control which IP prefixes are going to be advertised via ebgp from the routes received from EBGP neighbors.	2020-10-02 10:06:04 -07:00
Tamer Ahmed	6754635010	[cfggen] Make Jinja2 Template Python 3 Compatible Jinja2 templates rendered using the Python 3 interpreter are required to conform with the new Python 3 semantics. Signed-off-by: Tamer Ahmed <tamer.ahmed@microsoft.com>	2020-09-30 07:07:43 -07:00
Nazarii Hnydyn	79bda7d0d6	[monit]: Fix process checker. (#5480 ) Signed-off-by: Nazarii Hnydyn <nazariig@nvidia.com>	2020-09-29 17:23:09 -07:00
Volodymyr Boiko	d71a4efe3b	[sonic-platform-common] Install Python 3 package in host OS and PMon container (#5461 ) Signed-off-by: Volodymyr Boyko <volodymyrx.boiko@intel.com>	2020-09-29 13:57:54 -07:00
arlakshm	e3a0feaa47	Vtysh support for multi asic (#5479 ) Signed-off-by: Arvindsrinivasan Lakshmi Narasimhan <arlakshm@microsoft.com>	2020-09-29 12:39:53 -07:00
Guohan Lu	e412338743	Revert "[bgp] Add 'allow list' manager feature (#5309 )" This reverts commit `6eed0820c8`.	2020-09-28 22:00:29 -07:00
pavel-shirshov	6eed0820c8	[bgp] Add 'allow list' manager feature (#5309 ) implements a new feature: "BGP Allow list." This feature allows us to control which IP prefixes are going to be advertised via ebgp from the routes received from EBGP neighbors.	2020-09-27 10:47:43 -07:00
gechiang	43a8368874	make bgpmon autorestart enabled by supervisord (#5460 )	2020-09-25 10:25:11 -07:00
Sumukha Tumkur Vani	b5bcfef013	Update conf DB with CA cert & rename ca_crt field (#5448 )	2020-09-25 09:20:09 -07:00
Syd Logan	0311a4a037	Add gearbox phy device files and a new physyncd docker to support VS gearbox phy feature (#4851 ) * buildimage: Add gearbox phy device files and a new physyncd docker to support VS gearbox phy feature * scripts and configuration needed to support a second syncd docker (physyncd) * physyncd supports gearbox device and phy SAI APIs and runs multiple instances of syncd, one per phy in the device * support for VS target (sonic-sairedis vslib has been extended to support a virtual BCM81724 gearbox PHY). HLD is located at `b817a12fd8/doc/gearbox/gearbox_mgr_design.md` - Why I did it This work is part of the gearbox phy joint effort between Microsoft and Broadcom, and is based on multi-switch support in sonic-sairedis. - How I did it Overall feature was implemented across several projects. The collective pull requests (some in late stages of review at this point): https://github.com/Azure/sonic-utilities/pull/931 - CLI (merged) https://github.com/Azure/sonic-swss-common/pull/347 - Minor changes (merged) https://github.com/Azure/sonic-swss/pull/1321 - gearsyncd, config parsers, changes to orchagent to create gearbox phy on supported systems https://github.com/Azure/sonic-sairedis/pull/624 - physyncd, virtual BCM81724 gearbox phy added to vslib - How to verify it In a vslib build: root@sonic:/home/admin# show gearbox interfaces status PHY Id Interface MAC Lanes MAC Lane Speed PHY Lanes PHY Lane Speed Line Lanes Line Lane Speed Oper Admin -------- ----------- --------------- ---------------- --------------- ---------------- ------------ ----------------- ------ ------- 1 Ethernet48 121,122,123,124 25G 200,201,202,203 25G 204,205 50G down down 1 Ethernet49 125,126,127,128 25G 206,207,208,209 25G 210,211 50G down down 1 Ethernet50 69,70,71,72 25G 212,213,214,215 25G 216 100G down down In addition, docker ps | grep phy should show a physyncd docker running. Signed-off-by: syd.logan@broadcom.com	2020-09-25 08:32:44 -07:00
yozhao101	13cec4c486	[Monit] Unmonitor the processes in containers which are disabled. (#5153 ) We want to let Monit to unmonitor the processes in containers which are disabled in `FEATURE` table such that Monit will not generate false alerting messages into the syslog. Signed-off-by: Yong Zhao <yozhao@microsoft.com>	2020-09-25 00:28:28 -07:00
Tamer Ahmed	b43f1129b4	[swss] Start Restore Neighbor After SWSS Config (#5451 ) SWSS config script restore ARP/FDB/Routes. Restore neighbor script uses config DB ARP information to restore ARP entries and so needs to be started after swssconfig exits. signed-off-by: Tamer Ahmed <tamer.ahmed@microsoft.com>	2020-09-24 14:57:42 -07:00
Danny Allen	a56ad41b9e	[mgmt] Install dhclient in sonic-mgmt docker (#5447 ) Signed-off-by: Danny Allen <daall@microsoft.com>	2020-09-23 20:01:58 -07:00
Danny Allen	7eda531ffd	[mgmt] Install ip command in mgmt docker (#5430 ) Signed-off-by: Danny Allen <daall@microsoft.com>	2020-09-22 18:39:14 -07:00
Danny Allen	9feba88455	[mgmt] Fix Azure CLI install behind proxy (#5407 ) Signed-off-by: Danny Allen <daall@microsoft.com>	2020-09-22 09:37:03 -07:00
lguohan	0ed44db7b7	[docker-base-stretch]: install rsyslog from stretch-backports (#5410 ) Install a newer version of rsyslog from stretch-backports to support -iNONE Previous backport from master use -iNONE option which is only available after v8.32.0 Signed-off-by: Guohan Lu <lguohan@gmail.com>	2020-09-21 02:09:50 -07:00
Joe LeVeque	3987cbd80a	[sonic-utilities] Build and install as a Python wheel package (#5409 ) We are moving toward building all Python packages for SONiC as wheel packages rather than Debian packages. This will also allow us to more easily transition to Python 3. Python files are now packaged in "sonic-utilities" Python wheel. Data files are now packaged in "sonic-utilities-data" Debian package. - How I did it - Build and install sonic-utilities as a Python package - Remove explicit installation of wheel dependencies, as these will now get installed implicitly by pip when installing sonic-utilities as a wheel - Build and install new sonic-utilities-data package to install data files required by sonic-utilities applications - Update all references to sonic-utilities scripts/entrypoints to either reference the new /usr/local/bin/ location or remove absolute path entirely where applicable Submodule updates: * src/sonic-utilities aa27dd9...2244d7b (5): > Support building sonic-utilities as a Python wheel package instead of a Debian package (#1122) > [consutil] Display remote device name in show command (#1120) > [vrf] fix check state_db error when vrf moving (#1119) > [consutil] Fix issue where the ConfigDBConnector's reference is missing (#1117) > Update to make config load/reload backward compatible. (#1115) * src/sonic-ztp dd025bc...911d622 (1): > Update paths to reflect new sonic-utilities install location, /usr/local/bin/ (#19)	2020-09-20 20:16:42 -07:00
gechiang	128def6969	Add bgpmon to be started as a new daemon under BGP docker (#5329 ) * Add bgpmon under sonic-bgpcfgd to be started as a new daemon under BGP docker * Added bgpmon to be monitored by Monit so that if it crashed, it gets alerted * use console_scripts entry point to package bgpmon	2020-09-20 14:32:09 -07:00
Tamer Ahmed	2de3afaf35	[swss] Enhance ARP Update to Call Sonic Cfggen Once (#5398 ) This PR limited the number of calls to sonic-cfggen to one call per iteration instead of current 3 calls per iteration. The PR also installs jq on host for future scripts if needed. signed-off-by: Tamer Ahmed <tamer.ahmed@microsoft.com>	2020-09-18 18:44:23 -07:00
Danny Allen	2868a27935	[mgmt] Upgrade sonic-mgmt container to stretch (#5397 ) - Bump sonic-mgmt version to 18.04 - Update installation methods - Add virtualenv for python3 Signed-off-by: Danny Allen <daall@microsoft.com>	2020-09-17 22:00:32 -07:00
Tamer Ahmed	7515aea9db	[swss] Start Arp Update Process (#5391 ) Arp update process was not being started due to an issue with the directory name having an extra 'd' in supervisor as in '/etc/supervisord/conf.d/arp_update.conf'. signed-off-by: Tamer Ahmed <tamer.ahmed@microsoft.com>	2020-09-17 18:33:10 -07:00
Prince Sunny	ae9a86fac4	Add new DB for Restapi to database config (#5350 )	2020-09-16 19:02:47 -07:00
judyjoseph	642479f75d	[multi-Asic] Add support for multi-asic to swssloglevel (#5316 ) * Support for multi-asic platform for swssloglevel command admin@str-acs-1:~$ swssloglevel Usage: /usr/bin/swssloglevel -n [0 to 3] [OPTION]... * Update to use the env file to get the PLATFORM string.	2020-09-16 11:29:05 -07:00
Tamer Ahmed	15f5d47338	[dhcpmon] Print Both Snapshot And Current Counters (#5374 ) Printing both snapshot and current counter sets will make it easier to pinpoint which message type(s) is/are not being relayed. This PR prints both counter sets. Also, this PR defines gnu11 as a C standard to compile with in order to avoid making changes when porting to 201811 branch. signed-off-by: Tamer Ahmed <tamer.ahmed@microsoft.com>	2020-09-15 15:27:36 -07:00
Joe LeVeque	12c94a7431	[lldpmgrd] Inherit DaemonBase class from sonic-py-common package (#5370 ) Eliminate duplicate logging and signal handling code by inheriting from DaemonBase class in sonic-py-common package.	2020-09-15 10:55:55 -07:00
Tamer Ahmed	1bf6fdc6d2	[dhcpmon] Monitor Mgmt Interface For DHCP Packets (#5317 ) When BGP routes are missing, DHCP packets get relayed over mgmt interface. This results in dhcpmon alerting that DHCP packets are not being relayed. This PR includes the mgmt interface as an uplink device, and so, if a DHCP packet gets relayed over the mgmt interface, the regular dhcpmon alert will not be issued. Instead, dhcpmon will check the mgmt interface counts and issue a separate alert regarding packets travelling through mgmt network. In addition, this PR includes the following enhancements: 1. Add SIGUSR1 handler that prints out current packet counts 2. Increase alert grace window to 3 minutes from currently 2 minutes 3. Time is now computed more accurately 4. Print vlan name before counters signed-off-by: Tamer Ahmed <tamer.ahmed@microsoft.com>	2020-09-09 18:37:01 -07:00
Joe LeVeque	5b3b4804ad	[dockers][supervisor] Increase event buffer size for dependent-startup (#5247 ) When stopping the swss, pmon or bgp containers, log messages like the following can be seen: ``` Aug 23 22:50:43.789760 sonic-dut INFO swss#supervisord 2020-08-23 22:50:10,061 ERRO pool dependent-startup event buffer overflowed, discarding event 34 Aug 23 22:50:43.789760 sonic-dut INFO swss#supervisord 2020-08-23 22:50:10,063 ERRO pool dependent-startup event buffer overflowed, discarding event 35 Aug 23 22:50:43.789760 sonic-dut INFO swss#supervisord 2020-08-23 22:50:10,064 ERRO pool dependent-startup event buffer overflowed, discarding event 36 Aug 23 22:50:43.789760 sonic-dut INFO swss#supervisord 2020-08-23 22:50:10,066 ERRO pool dependent-startup event buffer overflowed, discarding event 37 ``` This is due to the number of programs in the container managed by supervisor, all generating events at the same time. The default event queue buffer size in supervisor is 10. This patch increases that value in all containers in order to eliminate these errors. As more programs are added to the containers, we may need to further adjust these values. I increased all buffer sizes to 25 except for containers with more programs or templated supervisor.conf files which allow for a variable number of programs. In these cases I increased the buffer size to 50. One final exception is the swss container, where the buffer fills up to ~50, so I increased this buffer to 100. Resolves https://github.com/Azure/sonic-buildimage/issues/5241	2020-09-08 23:36:38 -07:00
Qi Luo	d4fc8e5b22	[redis] Use redis-server and redis-tools in blob storage to prevent upstream link broken (#5340 ) * [redis] Use redis-server and redis-tools in blob storage to prevent upstream link broken * Use curl instead of wget * Explicitly install dependencies	2020-09-08 19:30:14 -07:00
abdosi	eff745fbb1	Fix the issue as reported in (#5315 ) https://github.com/Azure/sonic-buildimage/issues/5255 Root Cause: Waiting on Restore count != 0 can lead to race condition between orchagent process and swssconfig.sh. Ideally check of Restore count != 0 is not needed as the State DB cannot be flushed as if it was flushed then Warm Restart or swss-restart should not be true also.	2020-09-04 09:34:26 -07:00
Joe LeVeque	fb8f09a116	[radvd] No longer build from source; Install vanilla Debian package once again (#5242 ) Remove radvd Makefile and patch, change docker-router-advertiser Dockerfile template to simply install the vanilla radvd package using apt-get. - In PR https://github.com/Azure/sonic-buildimage/pull/2795, we started building radvd from source and patching it to prevent it from erroring out when advertising an MTU of 9100 which was greater than the MTU size configured on the bridge interface (1500), which was due to a limitation in the 4.9 Linux kernel. - Master branch is now using Linux kernel 4.19. As of 4.18, the kernel supports setting a bridge MTU to a value > 1500. - PR https://github.com/Azure/sonic-swss/pull/1393 modified vlanmgrd to take advantage of this and now configures the MTU of bridge interfaces in SONiC to the proper size of 9100. Therefore, we no longer need to patch radvd. Since we no longer need to patch radvd, we no longer need to build it from source, so we can save build time by going back to simply installing the vanilla radvd Debian package in the router-advertiser container.	2020-09-01 13:53:36 -07:00
arlakshm	17e78715ae	[Multi-ASIC]:Update the template to add ipinip entry for Loopback4096 (#5235 ) Signed-off-by: Arvindsrinivasan Lakshmi Narasimhan <arlakshm@microsoft.com> The following changes are done. - Multi asic platform have 2 Loopback interfaces, Loopback0 and Loopback4096. IPinIP decap entries need to be added for both of them. Update the ipinip.json.j2 template to add decap entries for Loopback4096. - Add corresponding unit test	2020-08-31 17:35:48 -07:00
Prince Sunny	4338d8293f	Skip vnet-vxlan interfaces from generating networks (#5251 ) * Skip Vnet interface from generating networks	2020-08-27 14:14:04 -07:00
Mykola F	834a29cb66	[enable counters] Enable port buffer drops by default and update MLNX SAI submodule (#5059 ) * Enable port buffer drops by default * [Mellanox] Update SAI_Implementation Signed-off-by: Mykola Faryma <mykolaf@mellanox.com>	2020-08-25 08:48:56 -07:00
shi-su	f3feb56c8a	Add switch for synchronous mode (#5237 ) Add a master switch so that the sync/async mode can be configured. Example usage of the switch: 1. Configure mode while building an image `make ENABLE_SYNCHRONOUS_MODE=y <target>` 2. Configure when the device is running Change CONFIG_DB with `sonic-cfggen -a '{"DEVICE_METADATA":{"localhost": {"synchronous_mode": "enable"}}}' --write-to-db` Restart swss with `systemctl restart swss`	2020-08-24 14:04:10 -07:00
Joe LeVeque	97d44214cf	[docker-radv] Fix startup issues (#5230 ) - Why I did it PR https://github.com/Azure/sonic-buildimage/pull/4599 introduced two bugs in the startup of the router advertiser container: 1. References to the `wait_for_intf.sh` script were changed to `wait_for_link.sh`, but the actual script was not renamed 2. The `ipv6_found` Jinja2 variable added to the supervisor config file goes out of scope before it is read. - How I did it 1. Rename the `wait_for_intf.sh` script to `wait_for_link.sh` 2. Use the Jinja2 "namespace" construct to fix the scope issue - How to verify it Ensure all processes in the radv container start properly under the correct conditions (i.e., whether or not there is at least one VLAN with an IPv6 address assigned).	2020-08-21 13:12:01 -07:00
Tamer Ahmed	adcca53b8d	[radv] Reduce Calls to SONiC Cfggen (#5178 ) Calls to sonic-cfggen are CPU expensive. This PR reduces calls to sonic-cfggen to one call during startup when starting radv service. signed-off-by: Tamer Ahmed <tamer.ahmed@microsoft.com>	2020-08-17 15:48:04 -07:00
Tamer Ahmed	89f3206a3f	[swss] Reduce Calls to SONiC Cfggen (#5177 ) Calls to sonic-cfggen are CPU expensive. This PR reduces calls to sonic-cfggen to one call during startup when starting swss service. signed-off-by: Tamer Ahmed <tamer.ahmed@microsoft.com>	2020-08-17 15:47:52 -07:00
Tamer Ahmed	a10c5bfd02	[frr] Reduce Calls to SONiC Cfggen (#5176 ) Calls to sonic-cfggen are CPU expensive. This PR reduces calls to sonic-cfggen to two calls during startup when starting frr service. signed-off-by: Tamer Ahmed <tamer.ahmed@microsoft.com>	2020-08-17 15:47:42 -07:00
Tamer Ahmed	3a10e9c6fa	[dhcp-relay] Reduce Calls to SONiC Cfggen (#5175 ) Calls to sonic-cfggen are CPU expensive. This PR reduces calls to sonic-cfggen to one call during startup when starting dhcp-relay service. signed-off-by: Tamer Ahmed <tamer.ahmed@microsoft.com>	2020-08-17 15:47:14 -07:00
Tamer Ahmed	cc970627d0	[snmp]: Reduce Calls to SONiC Cfggen (#5166 ) Calls to sonic-cfggen are CPU expensive. This PR reduces calls to sonic-cfggen to one call during snmp startup. signed-off-by: Tamer Ahmed <tamer.ahmed@microsoft.com>	2020-08-17 15:46:29 -07:00
Joe LeVeque	6132ae34fe	[build] Build/install remaining platform daemons as Python wheel packages (#5188 ) As part of migrating all Python-based package installers to wheel format rather than Debian packages. Also to allow for easily building a Python 3 version of the package in the near future. ledd and psud were converted in earlier PRs. This PR converts the remainder: - pcied - syseepromd - thermalctld - xcvrd	2020-08-15 08:42:11 -07:00
Joe LeVeque	c3202d8982	[build] Build/install sonic-psud as a Python wheel package (#5182 ) As part of migrating all Python-based package installers to wheel format rather than Debian packages. Also to allow for easily building a Python 3 version of the package in the near future.	2020-08-14 11:11:45 -07:00
Joe LeVeque	fc9e97fc3d	[build] Build/install sonic-ledd as a Python wheel package (#5168 ) As part of migrating all Python-based package installers to wheel format rather than Debian packages. Also to allow for easily building a Python 3 version of the package in the near future. - Also remove some references to sonic-daemon-base which I previously missed and add missing sonic-py-common dependency for sonic-pcied.	2020-08-13 11:26:43 -07:00
Tamer Ahmed	7872b4e196	[platform] Add Support For Environment Variable File (#5010 ) * [platform] Add Support For Environment Variable This PR adds the ability to read an environment file from /etc/sonic. The file contains immutable SONiC config attributes such as platform, hwsku, version, device_type. The aim is to minimize calls being made into sonic-cfggen during boot time. signed-off-by: Tamer Ahmed <tamer.ahmed@microsoft.com>	2020-07-31 17:59:09 -07:00
abdosi	ec435b955c	Changes to add template support for copp.json. (#5053 ) * Changes to add template support for copp.json. This is needed so that we can install different types of traps based on Device Role (Tor/Leaf/Mgmt/etc...). Initial use case is to install DHCP/DHCPv6 trap only for tor router. Signed-off-by: Abhishek Dosi <abdosi@microsoft.com> * Fixed based on review comments. Signed-off-by: Abhishek Dosi <abdosi@microsoft.com> * Fixed based on review comment.	2020-07-31 14:14:21 -07:00
joyas-joseph	f0dfe36953	[docker-fpm-frr]: Upgrade docker-fpm-frr to buster (#4920 ) Verify that /etc/apt/sources.list points to buster using docker exec bgp cat /etc/apt/sources.list BGP neighborship is established. root@sonic:~# show ip bgp summary IPv4 Unicast Summary: BGP router identifier 10.1.0.1, local AS number 65100 vrf-id 0 BGP table version 1 RIB entries 1, using 184 bytes of memory Peers 1, using 20 KiB of memory Neighbor V AS MsgRcvd MsgSent TblVer InQ OutQ Up/Down State/PfxRcd 6.1.1.1 4 100 96 96 0 0 0 01:32:04 0 Total number of neighbors 1 root@sonic:~# Signed-off-by: Joyas Joseph <joyas_joseph@dell.com>	2020-07-29 14:19:03 -07:00
Sujin Kang	02a98add92	Add pcied to PMON docker to monitor the PCIe device status (#5000 ) * Add pcied to PMON container * remove trailing spaces * update pmon submodule * review comments * rebase to the latest	2020-07-29 11:27:49 -07:00
Qi Luo	48b5792b07	[redis] Upgrade redis version (#5060 ) buster-backports updated and the old version disappeared	2020-07-28 20:50:31 -07:00
Joe LeVeque	2600747f0e	[docker-pmon] Fix copy of fancontrol config file (#5037 ) Copy proper fancontrol config file to the proper destination. Also some minor refactoring for code reuse to help prevent issues like this in the future. Fixes a bug introduced by #4599	2020-07-28 00:23:21 -07:00
pavel-shirshov	89184038fd	[docker-fpm-frr]: Start bgpd after zebra was started (#5038 ) fixes https://github.com/Azure/sonic-buildimage/issues/5026 Explanation: In the log from the issue I found: ``` I see following in the log Jul 22 21:13:06.574831 vlab-01 WARNING bgp#bgpd[49]: [EC 33554499] sendmsg_nexthop: zclient_send_message() failed ``` Analyzing source code I found that the error message could be issued only when `zclient_send_rnh()` returns less than 0. ``` ret = zclient_send_rnh(zclient, command, p, exact_match, bnc->bgp->vrf_id); /* TBD: handle the failure */ if (ret < 0) flog_warn(EC_BGP_ZEBRA_SEND, "sendmsg_nexthop: zclient_send_message() failed"); ``` I checked [zclient_send_rnh()](`88351c8f6d/lib/zclient.c (L654)`) and found that this function will return the exit code which the function gets from [zclient_send_message()](`88351c8f6d/lib/zclient.c (L266)`) But the latter function can return a nonzero value in two cases: 1. bgpd didn’t connect to the zclient socket yet [code](`88351c8f6d/lib/zclient.c (L269)`) 2. The socket was closed. But in this case we would receive the error message in the log. (And I can find the message in the log when we reboot sonic) [code](`88351c8f6d/lib/zclient.c (L277)`) Also, I see from the logs that the client connection was set up after the issue occurred in bgpd. Bgpd.log ``` Jul 22 21:13:06.574831 vlab-01 WARNING bgp#bgpd[49]: [EC 33554499] sendmsg_nexthop: zclient_send_message() failed ``` Vs Zebra.log ``` Jul 22 21:13:12.713249 vlab-01 NOTICE bgp#zebra[48]: client 25 says hello and bids fair to announce only static routes vrf=0 Jul 22 21:13:12.820352 vlab-01 NOTICE bgp#zebra[48]: client 30 says hello and bids fair to announce only bgp routes vrf=0 Jul 22 21:13:12.820352 vlab-01 NOTICE bgp#zebra[48]: client 33 says hello and bids fair to announce only vnc routes vrf=0 ``` So in our case we should start zebra first. Wait until it is started and then start bgpd and other daemons. How I did it: I changed a graph to start daemons in the following order: 1. First start zebra 2. Then start staticd and bgpd 3. Then start vtysh -b and bgpeoi after bgpd is started.	2020-07-25 03:48:47 -07:00
anish-n	da017f4ec9	[bgpcfgd]: Add Vlan prefix list to the FRR templates (#5005 ) add the Vlan prefix list to the FRR templates	2020-07-21 19:26:19 -07:00
Stepan Blyshchak	16a37d8c17	[dockers] update mellanox syncd and pmon to buster (#4818 ) Upgrade to libsensors5 Updated sonic-sairedis pointer: d54bfb4 [SAI] update pointer (#636) 1885a8c [syncd] Fix notification on shutdown request (#635) 9e57ba2 Fixing hostif For Genetlink host interfaces (#633) 449a092 sonic-sairedis: Add support to sonic-sairedis for gearbox phys (#632) Signed-off-by: Stepan Blyschak <stepanb@mellanox.com>	2020-07-18 03:46:15 -07:00
joyas-joseph	78945766fc	[docker-iccpd]: Upgrade docker-iccpd to buster (#4984 ) Signed-off-by: Joyas Joseph <joyas_joseph@dell.com>	2020-07-18 00:12:59 -07:00
Nazarii Hnydyn	a3f4c31193	[orchagent]: Fix platform string export. (#4993 ) Signed-off-by: Nazarii Hnydyn <nazariig@mellanox.com>	2020-07-18 00:05:10 -07:00
joyas-joseph	18bfa6df08	[docker-nat]: upgrade docker-nat to buster (#4943 ) move iptables to 1.8.2-4 (version in buster) Signed-off-by: Joyas Joseph <joyas_joseph@dell.com>	2020-07-15 22:48:09 -07:00
taochengyi	1ca47da40d	[build][arm] Adding a backport source to arm to resolve docker-base-stretch install redis-tools=5:5.0.3-3~bpo9+2 failure issue (#4950 )	2020-07-15 12:23:20 -07:00

1172 Commits