sonic-buildimage

Author	SHA1	Message	Date
Devesh Pathak	4dba276094	Fix to improve hostname handling (#12064) * Fix to improve hostname handling If config_db.json is missing the hostname entry, hostname-config.sh ends up deleting the existing entry too and the hostname changes to the default 'localhost' * default hostname to 'sonic' if missing in config file	2023-01-30 18:39:09 +00:00
Prince George	cd2bb08545	Close console session due to user inactivity (#9890 ) Signed-off-by: Prince George <prgeor@microsoft.com>	2023-01-30 18:36:11 +00:00
arheneus@marvell.com	fc1295bdcc	[ntp][apparmor] Allow apparmor read permission for ntpd under rw mount path of rootfs (#6040) Certain platform-specific packages, such as sonic-platform-xyz, install files onto the rootfs, which are placed on the read-write mount path at /host/image-name/rw/... When ntpd starts, it tries to read /usr/bin, /usr/sbin/ and /usr/local/bin, which in turn link to the read-write mount path, and ntpd gets the AppArmor warning messages below. LOG:- audit: type=1400 audit(1606226503.240:21): apparmor="DENIED" operation="open" profile="/usr/sbin/ntpd" name="/image-HEAD-dirty-20201111.173951/rw/usr/local/bin/" pid=3733 comm="ntpd" requested_mask="r" denied_mask="r" fsuid=0 ouid=0 audit: type=1400 audit(1606226503.240:22): apparmor="DENIED" operation="open" profile="/usr/sbin/ntpd" name="/image-HEAD-dirty-20201111.173951/rw/usr/sbin/" pid=3733 comm="ntpd" requested_mask="r" denied_mask="r" fsuid=0 ouid=0 audit: type=1400 audit(1606226503.240:23): apparmor="DENIED" operation="open" profile="/usr/sbin/ntpd" name="/image-HEAD-dirty-20201111.173951/rw/usr/bin/" pid=3733 comm="ntpd" requested_mask="r" denied_mask="r" fsuid=0 ouid=0 Fix: Add rw/.. mount path similar to root path access provided for ntpd in /etc/apparmor.d/usr.sbin.ntpd Signed-off-by: Antony Rheneus <arheneus@marvell.com>	2022-10-16 05:42:35 +00:00
xumia	37fa1014ad	[201911] Change submodule path from Azure to sonic-net (#12313 ) Why I did it Change the path of sonic submodules that point to "Azure" to point to "sonic-net" How I did it Replace "Azure" with "sonic-net" on all relevant paths of sonic submodules	2022-10-12 21:07:22 +08:00
Sujin Kang	61a34fcf22	[201911] Add hardware reboot cause when software reboot failed (#11753 ) Why I did it Add the hardware reboot cause when the previous software reboot failed How I did it Check both hardware reboot cause and software reboot cause. Add the hardware reboot as actual reboot cause if any hardware reboot cause is available for any software reboot. How to verify it Perform reboots and verify the reboot-cause	2022-08-25 12:30:53 -07:00
Ying Xie	db5b9ee834	[warm boot finalizer] only wait for enabled components to reconcile (#6454 ) * [warm boot finalizer] only wait for enabled components to reconcile Define the component with its associated service. Only wait for components that have associated service enabled to reconcile during warm reboot. Signed-off-by: Ying Xie <ying.xie@microsoft.com>	2022-03-31 12:01:25 -07:00
Renuka Manavalan	eda84d2209	Invoke disk check periodically (#7374 ) Helps with periodic scan of disk for RO state. If found, this script makes transient fix and raise error message.	2021-11-19 16:45:21 -08:00
Renuka Manavalan	8cd6714ef4	hostcfgd: Handle missed tacacs updates between load & listen (#8223) Why I did it The time gap between the last config load & db-listen seems to have increased. Any config updates that occur in this gap get missed by db-listen. This could miss updating /etc/pam.d/common-auth-sonic How I did it Add a one-shot timer, just before db-listen. The timer will fire after the subscribe is done. When the timer fires, reload tacacs & aaa	2021-08-06 10:38:37 -07:00
xumia	e4a4cfed98	Fix vtysh shell-ingestion security issue (#8022 ) Why I did it Fix vtysh shell-ingestion security issue Only expose the limited parameters of the command vtysh show.	2021-06-30 19:34:55 +08:00
Renuka Manavalan	3ea38a9788	Add service to restore TACACS from old config (#7560) (#7865) In upgrade scenarios where config_db.json is not carried forward to the new image, it could be left w/o TACACS credentials. Added a service to trigger 5 minutes after boot and restore TACACS, if /etc/sonic/old_config/tacacs.json is present. How I did it By adding a service that fires 5 mins after boot. This service applies tacacs if available. How to verify it Upgrade and watch status of tacacs.timer & tacacs.service You may create /etc/sonic/old_config/tacacs.json with updated credentials (within 5 mins after boot) and see that it appears in config & is persisted too.	2021-06-15 10:52:31 -07:00
Kuanyu Chen	c4f8cf9371	[config-setup]: Fix a bug in checking if updategraph is enabled (#7093) An error is encountered during "config-setup boot" if updategraph is enabled. How I did it Correct the code inside the config-setup script. Remove the spaces around the assignment operator. How to verify it Remove /etc/sonic/config_db.json and reboot the device. Originally, it returns the following error after boot up. rv: command not found After the modification, it correctly parses the status of updategraph without error.	2021-05-31 08:11:08 -07:00
xumia	7aa8a021ea	Support readonly vtysh for sudoers (#7383) (#7572) * Support readonly vtysh for sudoers (#7383) Why I did it Support a readonly version of the command vtysh How I did it Check if the command starts with "show", and verify that it contains only a single command in script. * Fix the type issue in rvtysh	2021-05-19 09:02:16 +08:00
yozhao101	24e1cde1e6	[201911][Monit] Restart telemetry container if memory usage is beyond the threshold (#7618 ) This PR aims to monitor the memory usage of streaming telemetry container and restart streaming telemetry container if memory usage is larger than the pre-defined threshold.	2021-05-17 16:51:13 -07:00
yozhao101	a8d2d0b5cd	[201911][Monit] Monitor critical processes in PMon container. (#7438) Signed-off-by: Yong Zhao yozhao@microsoft.com Why I did it This PR aims to monitor the critical processes in PMon container by Monit in 201911 branch. How I did it I created a template configuration file of Monit and it will be rendered to generate Monit configuration file of PMon container by a service generate_monit_config.service. How to verify it I verified this on a Mellanox device str-msn2700-03 and an Arista device str-a7050-acs-1. Which release branch to backport (provide reason below if selected) 201811 [x] 201911 202006 202012	2021-04-28 17:12:21 -07:00
yozhao101	528543bc6a	[201911][Monit] Monitor critical processes in radv and dhcp_relay containers. (#7340) Signed-off-by: Yong Zhao yozhao@microsoft.com Why I did it This PR aims to monitor critical processes in router advertiser and dhcp_relay containers by Monit. How I did it The router advertiser container only runs on a T0 device, and the T0 device should have at least one VLAN interface configured with an IPv6 address. At the same time, the router advertiser container will not run on devices whose deployment type is 8. As such, I created a service which dynamically generates the Monit configuration file of router advertiser from a template. Similarly, the Monit configuration file of dhcp_relay is also generated from a template, since the number of dhcrelay processes in the dhcp_relay container depends on the number of VLANs. How to verify it I verified this implementation on a DuT.	2021-04-16 08:40:06 -07:00
pra-moh	e1eb1bda59	[201911][procdockerstatsd] fix typo for variable name (#7183 )	2021-03-29 19:22:03 -07:00
pra-moh	afe548b61a	[201911][procdockerstatsd] Add missing unit conversion (#7157 ) Fixing same issue in 201911 as mention here #7151	2021-03-26 10:24:02 -07:00
Volodymyr Samotiy	fd22b3bcee	[monit] Periodically monitor VNET route consistency (#7078 ) To run VNET route consistency check periodically. For any failure, the monit will raise alert based on return code. The tool will log required details.	2021-03-25 07:24:59 -07:00
pra-moh	5f5644bb93	[201911][procdockerstatsd] Fix bug in procdockerstatsd (#7073 ) Fix incorrect variable name	2021-03-16 18:41:45 -07:00
pra-moh	bd07256bfd	[201911][procdockerstatsd] Fix unit conversion for docker stats (#7063) Bug exists in 201911 branch where unit conversion for docker stats is incorrect. Both the MiB and GiB to bytes conversions are incorrect. Example: admin@str-s6000-acs-10:/usr/bin$ docker stats --no-stream -a CONTAINER ID NAME CPU % MEM USAGE / LIMIT MEM % NET I/O BLOCK I/O PIDS e958c81d27a8 mgmt-framework 0.00% 0B / 0B 0.00% 0B / 0B 0B / 0B 0 9b6b7b4361d5 telemetry 3.13% 86.31MiB / 7.785GiB 1.08% 0B / 0B 0B / 106kB 30 e7fee0b617fe snmp 70.28% 57.03MiB / 7.785GiB 0.72% 0B / 0B 0B / 102kB 9 admin@str-s6000-acs-10:/usr/bin$ redis-cli -n 6 hgetall "DOCKER_STATS|e7fee0b617fe" "MEM%" "0.72" "MEM_LIMIT_BYTES" "8359080099840" "NAME" "snmp" "NET_OUT_BYTES" "0" "MEM_BYTES" "5980028928" "BLOCK_OUT_BYTES" "102000" "NET_IN_BYTES" "0" "BLOCK_IN_BYTES" "0" "PIDS" "9" "CPU%" "5.96"	2021-03-16 05:54:19 -07:00
abdosi	ab05a2f58a	Add support for BGP Monitors on multi asic SONiC platforms. (#6977 ) This PR is cherry-pick of master https://github.com/Azure/sonic-buildimage/pull/6920 Why I did it Add support for BGP Monitors on multi asic SONiC platforms. How I did it On multi ASIC SONiC platforms, BGP monitor session will be established from Backend ASIC. To achieve this following changes are done Add BGP monitor configuration on the backend ASIC. The BGP monitor configuration is present in the DPG of the device in minigraph.xml of multi-ASIC device, so this configuration will be added to the config_db of the host, when the minigraph is loaded. To add configuration for this in the Backend ASIC, a new class MultiAsicBgpMonCfg is added to the hostcfgd service to update the config_db of the backend ASIC when the BGP_MONITOR table of the host config_db is updated. This way incremental BGP_MONITOR configuration can also be handled. Changes to establish BGP session with bgp monitor. Add route in host main routing table to go to one of pre-define backend asic Add IP table rule on front asic to mark the BGP packets with destination as IPv4 Loopback. Add IP rule in front asic namespace to match mark BGP packet and lookup default table Program the default route in FrontEnd asic name space docker default table as part of start.sh of the BGP container. It need to be done as part of start.sh otherwise FRR default route will get over-written. How to verify it Signed-off-by: Abhishek Dosi <abdosi@microsoft.com> Co-authored-by: Arvind <arlakshm@microsoft.com>	2021-03-06 21:21:52 -08:00
Qi Luo	32e3cd9454	Revert "[monit] Periodically monitor VNET route consistency (#6819 )" (#6975 ) This reverts commit `2c6be7e0f5`. Reverts #6819	2021-03-06 06:56:26 -08:00
Volodymyr Samotiy	2c6be7e0f5	[monit] Periodically monitor VNET route consistency (#6819 ) To run VNET route consistency check periodically. For any failure, the monit will raise alert based on return code. The tool will log required details.	2021-03-05 13:15:19 -08:00
SuvarnaMeenakshi	9208dc507b	[multi-asic][vs]: Update topology script to retrieve hwsku from minigraph (#6219) Update topology script to retrieve hwsku from minigraph if hwsku information is not available in config_db. Fix clean up of interfaces in msft_multi_asic_vs hwsku topology script. - Why I did it When bringing up a multi-asic VS switch, the topology service is started during boot up. The topology service starts a shell script which runs the topology script present in /usr/share/sonic/device// directory. To invoke the hwsku-specific script, the topology script tries to retrieve hwsku information from config_db. During initial boot up config_db might not be populated. In order to start the topology service before config_db is updated, update the topology script to get hwsku information from minigraph.xml if it is available. This will be helpful to bring up a multi-asic VS testbed by loading minigraph and starting the topology service. - How I did it Update topology.sh script to retrieve hwsku information from minigraph.xml. Fix clean up function in msft_multi_asic_vs topology script. - How to verify it single-asic VS - no change; topology service is only enabled for multi-asic VS. multi-asic VS - Bring up multi-asic VS image, copy minigraph to vs image, start topology service. Topology service should be successful. To test the clean up function fix, start topology service - make sure interfaces are created and moved to the right namespaces. Stop topology service - make sure namespaces do not have any interface and all front end interfaces are present in default namespace.	2021-02-25 18:42:44 -08:00
abdosi	1064cd5cd0	[multi-asic] Enhanced iptable default rules (#6765 ) What I did:- For multi-asic platforms added iptable v4 rule to communicate on docker bridge ip For multi-asic platforms extend iptable v4 rule for iptable v6 also For multi-asic program made all internal rules applicable for all protocols (not filter based on tcp/udp). This is done to be consistent same as local host rule For multi-asic platforms made nat rule (to forward traffic from namespace to host) generic for all protocols and also use Source IP if present for matching	2021-02-25 18:39:43 -08:00
arlakshm	5822b42fdb	[sudoers]: add ipintutil in sudoer file (#6857) This PR is a port of #6845 for 201911. show ip interfaces was enhanced recently to support multi ASIC platforms in Azure/sonic-utilities#1437. The ipintutil script has to run as sudo user to get the ip interfaces from each namespace. Add this script to the sudoer file so that the show ip interface command is available for users with read-only permissions Signed-off-by: Arvindsrinivasan Lakshmi Narasimhan <arlakshm@microsoft.com>	2021-02-23 13:26:53 -08:00
arlakshm	a750f89630	[multi asic] add ip netns identify command to sudoer (#6591) Signed-off-by: Arvindsrinivasan Lakshmi Narasimhan <arlakshm@microsoft.com> - Why I did it The command sudo ip netns identify <pid> is used in the function get_current_namespace to check if the cli command is running in host context or within a namespace. This function is used for every CLI command, so the command sudo ip netns identify <pid> needs to be added to the sudoer files to allow users with RO access to run show cli commands. This problem is not there on single asic platforms. - How I did it Add ip netns identify [0-9]* to sudoers file.	2021-02-02 10:32:59 -08:00
arlakshm	3cd536bb45	[Multi Asic] support of swss.rec and sairedis.rec for multi asic (#6310 ) Signed-off-by: Arvindsrinivasan Lakshmi Narasimhan arlakshm@microsoft.com - Why I did it This PR has the changes to support having different swss.rec and sairedis.rec for each asic. The logrotate script is updated as well - How I did it Update the orchagent.sh script to use the logfile name options in these PRs(Azure/sonic-swss#1546 and Azure/sonic-sairedis#747) In multi asic platforms the record files will be different for each asic, with the format swss.asic{x}.rec and sairedis.asic{x}.rec Update the logrotate script for multiasic platform .	2021-01-27 17:12:32 -08:00
abdosi	9779560b63	[baseimage]: Updates for Ebtables and support for multi-asic (#6542) Following changes were done for ebtables: - Support for Multi-asic platforms. Ebtable filters are installed in the namespace for multi-asic and not on the host. On single asic they are installed on the host. - For Multi-asic platforms we don't want to install on the host, otherwise Namespace-to-Namespace communication does not happen since ARP Requests are not forwarded. - Updated to use a text file to restore ebtables rules rather than the binary format. Rules are restored as part of Database docker init instead of rc.local - Removed the ebtable service files for buster as they are not needed, since filters are restored/installed as part of database docker init. All the binaries are pre-installed, and the ebtables* binaries are the same as ebtables-legacy-* Signed-off-by: Abhishek Dosi <abdosi@microsoft.com>	2021-01-27 16:59:10 -08:00
arheneus@marvell.com	e9d3d96c69	[ebtables] Replace binary config file with text config file for ebtables (#5252) Issue: Binary ebtables config file is CPU arch dependent Fix: Load the text config during first boot and generate the binary persistent atomic file Signed-off-by: Antony Rheneus <arheneus@marvell.com>	2021-01-27 16:57:41 -08:00
Renuka Manavalan	b346a3a699	Take a copy of existing TACACS credentials and restore it during upgrade (#6285 ) In scenario where upgrade gets config from minigraph, it could miss tacacs credentials as they are not in minigraph. Hence restore explicitly upon load-minigraph, if present. - Why I did it Upon boot, when config migration is required, the switch could load config from minigraph. The config-load from minigraph would wipe off TACACS key and disable login via TACACS, which would disable all remote user access. This change, would re-configure the TACACS if there is a saved copy available. - How I did it When config is loaded from minigraph, look for a TACACS credentials back up (tacacs.json) under /etc/sonic/old_config. If present, load the credentials into running config, before config-save is called. - How to verify it Remove /etc/sonic/config_db.json and do an image update. Upon reboot, w/o this change, you would not be able ssh in as remote user. You may login as admin and check out, "show tacacs" & "show aaa" to verify that tacacs-key is missing and login is not enabled for tacacs. With this change applied, remove /etc/sonic/config_db.json, but save tacacs & aaa credentials as tacacs.json in /etc/sonic/. Upon reboot, you should see remote user access possible.	2021-01-09 08:13:52 -08:00
judyjoseph	1e4f09c860	Move frr logs from syslog to /var/log/frr/*.log (#5988 ) - Why I did it Move frr logs from syslog from the directory /var/log/quagga/.log to /var/log/frr/log - How I did it Updated the rsyslog config files. - How to verify it Verified the logs come into the file zebra.log and bgpd.log in the DIR /var/log/frr/log	2020-12-22 10:53:16 -08:00
Tamer Ahmed	b5bf5e3bce	[interfaces] Reduce Calls to SONiC Cfggen (#5174) Calls to sonic-cfggen are CPU expensive. This PR reduces calls to sonic-cfggen to one call during startup when running interfaces-config. Signed-off-by: Tamer Ahmed <tamer.ahmed@microsoft.com>	2020-12-22 09:51:54 -08:00
abdosi	35fc12c373	Telemetry Certificate Copy Across Image Upgrade. (#6252 ) To copy telemetry certificate during image upgrade from previous image to new image	2020-12-19 08:24:41 -08:00
arlakshm	7f76698b7d	[201911][hostcfgd]:wait updating the feature table till system init is done (#6234) - Why I did it The change is done to make sure the system initialization is done before hostcfgd sets the feature states. - How I did it This is a port of PR #6232. Since the systemctl version in 201911 doesn't support "--wait", added a function to check the output of systemctl is-system-running every second until the system is done booting up. For now this change is only applicable to multi asic platforms; based on the testing, this change will be extended to all platforms in a future PR. Signed-off-by: Arvindsrinivasan Lakshmi Narasimhan <arlakshm@microsoft.com>	2020-12-18 12:31:35 -08:00
abdosi	3a24e7f31f	[multi-asic] Enhancing monit process checker for multi-asic. (#6100 ) Added Support of process checker for work on multi-asic platforms.	2020-12-04 13:17:35 -08:00
abdosi	1d1898d8e2	Enhanced Feature table to support 'always_enabled' value for state and auto-restart fields. (#6000 ) Added new flag value 'always_enabled' for the state and auto-restart field of feature table init_cfg.json is updated to initialize state field of database/swss/syncd/teamd feature and auto-restart field of database feature as always_enabled Once the state/auto-restart value is initialized as "always_enabled" it is immutable and cannot be change via feature config commands. (config feature..) PR#Azure/sonic-utilities#1271 hostcfgd will not take any action if state field value is 'always_enabled' Since we have always_enabled field for auto-restart updated supervisor-proc-exit-listener not to have special check for database and always rely on value from Feature table.	2020-11-25 10:04:42 -08:00
Rajkumar-Marvell	17045f42d1	Set sock rx Buf size to 3MB. (#5566 ) * Set sock rx Buf size to 3MB.	2020-11-24 11:21:56 -08:00
Prince Sunny	1c2c30fccd	Set preference for forced mgmt routes (#5844 ) When forced mgmt routes are present, the issue fixed as part of #5754 is not complete. Added a preference(priority) field to forced mgmt route ip rules	2020-11-21 09:27:09 -08:00
pavel-shirshov	5f5ec04dda	[bgpcfgd]: Fixes for BBR (#5956 ) * Add explicit default state into the constants.yml * Enable/disable only peer-groups, available in the config * Retrieve updates from frr before using configuration Co-authored-by: Pavel Shirshov <pavel.contrib@gmail.com>	2020-11-19 10:42:42 -08:00
madhanmellanox	a79c3c219d	[201911][caclmgrd] Accommodating case insensitive rule props for Control plane ACLs (#5918) To make Control plane ACLs handle case insensitive ACL rules. Currently, it handles only upper case ACL rules. Co-authored-by: Madhan Babu <madhan@arc-build-server.mtr.labs.mlnx>	2020-11-13 11:41:05 -08:00
judyjoseph	ce86621399	[multi-ASIC] BGP internal neighbor table support (#5520 ) * Initial commit for BGP internal neighbor table support. > Add new template named "internal" for the internal BGP sessions > Add a new table in database "BGP_INTERNAL_NEIGHBOR" > The internal BGP sessions will be stored in this new table "BGP_INTERNAL_NEIGHBOR" * Changes in template generation tests with the introduction of internal neighbor template files.	2020-11-10 12:52:58 -08:00
arlakshm	431f97d11d	Add the vtysh command with newly added "-n" option for multi asic to the read_only_cmds (#5845 ) In multi asic platforms the "show ip bgp summary" commands is not available for user with read only privileges, so to fix this the vtysh command with the new "-n" option, added for multi asic platforms, needs to be added to the READ_ONLY_COMMANDS list in the sudoers files. Added the command vtysh -n [0-9] -c show * to list of READ_ONLY_COMMANDS in the sudoers files in this commit. Signed-off-by: Arvindsrinivasan Lakshmi Narasimhan <arlakshm@microsoft.com>	2020-11-10 12:30:32 -08:00
Stepan Blyshchak	dc68576bab	[hostcfgd] If feature state entry not in the cache, add a default state (#5777 ) Our use case is to register new features in runtime. The previous change which introduced the cache broke this capability and caused hostcfgd crash. Signed-off-by: Stepan Blyshchak <stepanb@nvidia.com>	2020-11-09 12:34:42 -08:00
lguohan	339d2aa6c8	[mgmt ip]: mvrf ip rule priority change to 32765 (#5754 ) Fix Azure/SONiC#551 When eth0 IP address is configured, an ip rule is getting added for eth0 IP address through the interfaces.j2 template. This eth0 ip rule creates an issue when VRF (data VRF or management VRF) is also created in the system. When any VRF (data VRF or management VRF) is created, a new rule is getting added automatically by kernel as "1000: from all lookup [l3mdev-table]". This l3mdev IP rule is never getting deleted even if VRF is deleted. Once if this l3mdev IP rule is added, if user configures IP address for the eth0 interface, interfaces.j2 adds an eth0 IP rule as "1000:from 100.104.47.74 lookup default ". Priority 1000 is automatically chosen by kernel and hence this rule gets higher priority than the already existing rule "1001:from all lookup local ". This results in an issue "ping from console to eth0 IP does not work once if VRF is created" as explained in Issue 551. More details and possible solutions are explained as comments in the Issue551. This PR is to resolve the issue by always fixing the low priority 32765 for the IP rule that is created for the eth0 IP address. Tested with various combinations of VRF creation, deletion and IP address configuration along with ping from console to eth0 IP address. Co-authored-by: Kannan KVS <kannan_kvs@dell.com>	2020-11-01 10:41:44 -08:00
Abhishek Dosi	65cb10714c	Revert "[mgmt ip]: mvrf ip rule priority change to 32765 (#5754 )" This reverts commit `28366cd0ce`.	2020-11-01 10:37:16 -08:00
Renuka Manavalan	ecd10b9d10	Load config after subscribe (#5740) - Why I did it The update_all_feature_states can run in the range of 20+ seconds to one minute. With the load of AAA & TACACS preceding it, any DB updates in AAA/TACACS during the long running feature updates would get missed. To avoid this, switch the order. - How I did it Do a load after updating all feature states. - How to verify it Not an easy one. Have a script that restarts hostcfgd, sleeps 2s, then runs a redis-cli/config command to update the AAA/TACACS table. Run the script above and watch the file /etc/pam.d/common-auth-sonic for a minute. - When it repros: The updates will not be reflected in /etc/pam.d/common-auth-sonic	2020-11-01 10:27:10 -08:00
abdosi	0fad6bdc7f	[monit] Adding patch to enhance syslog error message generation for monit alert action when status is failed. (#5720 ) Why/How I did: Make sure first error syslog is triggered based on FAULT TOLERANCE condition. Added support of repeat clause with alert action. This is used as trigger for generation of periodic syslog error messages if error is persistent Updated the monit conf files with repeat every x cycles for the alert action	2020-11-01 10:27:10 -08:00
lguohan	28366cd0ce	[mgmt ip]: mvrf ip rule priority change to 32765 (#5754 ) Fix Azure/SONiC#551 When eth0 IP address is configured, an ip rule is getting added for eth0 IP address through the interfaces.j2 template. This eth0 ip rule creates an issue when VRF (data VRF or management VRF) is also created in the system. When any VRF (data VRF or management VRF) is created, a new rule is getting added automatically by kernel as "1000: from all lookup [l3mdev-table]". This l3mdev IP rule is never getting deleted even if VRF is deleted. Once if this l3mdev IP rule is added, if user configures IP address for the eth0 interface, interfaces.j2 adds an eth0 IP rule as "1000:from 100.104.47.74 lookup default ". Priority 1000 is automatically chosen by kernel and hence this rule gets higher priority than the already existing rule "1001:from all lookup local ". This results in an issue "ping from console to eth0 IP does not work once if VRF is created" as explained in Issue 551. More details and possible solutions are explained as comments in the Issue551. This PR is to resolve the issue by always fixing the low priority 32765 for the IP rule that is created for the eth0 IP address. Tested with various combinations of VRF creation, deletion and IP address configuration along with ping from console to eth0 IP address. Co-authored-by: Kannan KVS <kannan_kvs@dell.com>	2020-11-01 10:27:10 -08:00
pavel-shirshov	2eec3b3254	[bgpcfgd]: Dynamic BBR support (#5626) - Why I did it To introduce dynamic support of BBR functionality into bgpcfgd. BBR is adding `neighbor PEER_GROUP allowas-in 1` for all BGP peer-groups which point to T0. Now we can add and remove this configuration based on a CONFIG_DB entry - How I did it I introduced a new CONFIG_DB entry: - table name: "BGP_BBR" - key value: "all". Currently only "all" is supported, which means that all peer-groups which point to T0s will be updated - data value: a dictionary: {"status": "status_value"}, where status_value could be either "enabled" or "disabled" Initially, when bgpcfgd starts, it reads initial BBR status values from the [constants.yml](https://github.com/Azure/sonic-buildimage/pull/5626/files#diff-e6f2fe13a6c276dc2f3b27a5bef79886f9c103194be4fcb28ce57375edf2c23cR34). Then you can control BBR status by changing the "BGP_BBR" table in the CONFIG_DB (see examples below). bgpcfgd knows what peer-groups to change from [constants.yml](https://github.com/Azure/sonic-buildimage/pull/5626/files#diff-e6f2fe13a6c276dc2f3b27a5bef79886f9c103194be4fcb28ce57375edf2c23cR39). The dictionary contains peer-group names as keys, and a list of address-families as values. So when bgpcfgd gets a request to change the BBR state, it changes the state only for peer-groups listed in the constants.yml dictionary (and only for address families from the peer-group value). - How to verify it Initially, when we start SONiC, FRR has BBR enabled for PEER_V4 and PEER_V6: ``` admin@str-s6100-acs-1:~$ vtysh -c 'show run' | egrep 'PEER_V.? allowas' neighbor PEER_V4 allowas-in 1 neighbor PEER_V6 allowas-in 1 ``` Then we apply the following configuration to the db: ``` admin@str-s6100-acs-1:~$ cat disable.json { "BGP_BBR": { "all": { "status": "disabled" } } } admin@str-s6100-acs-1:~$ sonic-cfggen -j disable.json -w ``` The log output is: ``` Oct 14 18:40:22.450322 str-s6100-acs-1 DEBUG bgp#bgpcfgd: Received message : '('all', 'SET', (('status', 'disabled'),))' Oct 14 18:40:22.450620 str-s6100-acs-1 DEBUG bgp#bgpcfgd: execute command '['vtysh', '-f', '/tmp/tmpmWTiuq']'. Oct 14 18:40:22.681084 str-s6100-acs-1 DEBUG bgp#bgpcfgd: execute command '['vtysh', '-c', 'clear bgp peer-group PEER_V4 soft in']'. Oct 14 18:40:22.904626 str-s6100-acs-1 DEBUG bgp#bgpcfgd: execute command '['vtysh', '-c', 'clear bgp peer-group PEER_V6 soft in']'. ``` Check the FRR configuration and see that no allowas parameters are there: ``` admin@str-s6100-acs-1:~$ vtysh -c 'show run' | egrep 'PEER_V.? allowas' admin@str-s6100-acs-1:~$ ``` Then we apply the enabling configuration back: ``` admin@str-s6100-acs-1:~$ cat enable.json { "BGP_BBR": { "all": { "status": "enabled" } } } admin@str-s6100-acs-1:~$ sonic-cfggen -j enable.json -w ``` The log output: ``` Oct 14 18:40:41.074720 str-s6100-acs-1 DEBUG bgp#bgpcfgd: Received message : '('all', 'SET', (('status', 'enabled'),))' Oct 14 18:40:41.074720 str-s6100-acs-1 DEBUG bgp#bgpcfgd: execute command '['vtysh', '-f', '/tmp/tmpDD6SKv']'. Oct 14 18:40:41.587257 str-s6100-acs-1 DEBUG bgp#bgpcfgd: execute command '['vtysh', '-c', 'clear bgp peer-group PEER_V4 soft in']'. Oct 14 18:40:42.042967 str-s6100-acs-1 DEBUG bgp#bgpcfgd: execute command '['vtysh', '-c', 'clear bgp peer-group PEER_V6 soft in']'. ``` Check the FRR configuration and see that the BBR configuration is back: ``` admin@str-s6100-acs-1:~$ vtysh -c 'show run' | egrep 'PEER_V.? allowas' neighbor PEER_V4 allowas-in 1 neighbor PEER_V6 allowas-in 1 ``` * The test coverage * Below is the test coverage ``` ---------- coverage: platform linux2, python 2.7.12-final-0 ---------- Name Stmts Miss Cover ---------------------------------------------------- bgpcfgd/__init__.py 0 0 100% bgpcfgd/__main__.py 3 3 0% bgpcfgd/config.py 78 41 47% bgpcfgd/directory.py 63 34 46% bgpcfgd/log.py 15 3 80% bgpcfgd/main.py 51 51 0% bgpcfgd/manager.py 41 23 44% bgpcfgd/managers_allow_list.py 385 21 95% bgpcfgd/managers_bbr.py 76 0 100% bgpcfgd/managers_bgp.py 193 193 0% bgpcfgd/managers_db.py 9 9 0% bgpcfgd/managers_intf.py 33 33 0% bgpcfgd/managers_setsrc.py 45 45 0% bgpcfgd/runner.py 39 39 0% bgpcfgd/template.py 64 11 83% bgpcfgd/utils.py 32 24 25% bgpcfgd/vars.py 1 0 100% ---------------------------------------------------- TOTAL 1128 530 53% ``` - Which release branch to backport (provide reason below if selected) - [ ] 201811 - [x] 201911 - [x] 202006	2020-10-30 08:58:27 -07:00
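
Note on commit bd07256bfd (docker stats unit conversion): the commit message quotes sizes such as 57.03MiB, 7.785GiB and 102kB next to the byte counts stored in Redis. The conversion it describes can be sketched as below. This is a minimal illustration, not the procdockerstatsd code: the function name `to_bytes` and the table `UNIT_FACTORS` are hypothetical, and the decimal MB/GB entries are an assumption extended from the kB values shown in the log.

```python
import re

# Hypothetical multiplier table. The binary suffixes (KiB/MiB/GiB) and the
# decimal kB suffix appear in the quoted docker stats output; MB and GB are
# assumed to follow the same decimal convention as kB.
UNIT_FACTORS = {
    "B": 1,
    "kB": 1000, "MB": 1000 ** 2, "GB": 1000 ** 3,
    "KiB": 1024, "MiB": 1024 ** 2, "GiB": 1024 ** 3,
}


def to_bytes(text):
    """Convert a docker-stats style size such as '57.03MiB' to whole bytes."""
    match = re.fullmatch(r"\s*([0-9]*\.?[0-9]+)\s*([A-Za-z]+)\s*", text)
    if match is None or match.group(2) not in UNIT_FACTORS:
        raise ValueError("unrecognised size: %r" % text)
    # Truncate to an integer byte count after applying the unit factor.
    return int(float(match.group(1)) * UNIT_FACTORS[match.group(2)])


if __name__ == "__main__":
    for size in ("57.03MiB", "7.785GiB", "102kB"):
        print(size, to_bytes(size))
```

With this sketch, 57.03MiB comes to roughly 59.8 million bytes and 7.785GiB to roughly 8.36 billion bytes, whereas the values quoted in the commit message (5980028928 and 8359080099840) are larger by factors of 100 and 1000. The 102kB block I/O figure maps to 102000, which matches the stored BLOCK_OUT_BYTES.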

334 Commits