sonic-buildimage

Archived

Author	SHA1	Message	Date
Renuka Manavalan	ecd10b9d10	Load config after subscribe (#5740 ) - Why I did it The update_all_feature_states can run in the range of 20+ seconds to one minute. With load of AAA & Tacacs preceding it, any DB updates in AAA/TACACS during the long running feature updates would get missed. To avoid this, switch the order. - How I did it Do a load after updating all feature states. - How to verify it Not an easy one. Have a script that restarts hostcfgd, sleeps 2s, then runs a redis-cli/config command to update the AAA/TACACS table. Run the script and watch the file /etc/pam.d/common-auth-sonic for a minute. - When it repros: The updates will not be reflected in /etc/pam.d/common-auth-sonic	2020-11-01 10:27:10 -08:00
abdosi	0fad6bdc7f	[monit] Adding patch to enhance syslog error message generation for monit alert action when status is failed. (#5720 ) Why/How I did: Make sure first error syslog is triggered based on FAULT TOLERANCE condition. Added support of repeat clause with alert action. This is used as trigger for generation of periodic syslog error messages if error is persistent Updated the monit conf files with repeat every x cycles for the alert action	2020-11-01 10:27:10 -08:00
lguohan	28366cd0ce	[mgmt ip]: mvrf ip rule priority change to 32765 (#5754 ) Fix Azure/SONiC#551 When eth0 IP address is configured, an ip rule is getting added for eth0 IP address through the interfaces.j2 template. This eth0 ip rule creates an issue when VRF (data VRF or management VRF) is also created in the system. When any VRF (data VRF or management VRF) is created, a new rule is getting added automatically by kernel as "1000: from all lookup [l3mdev-table]". This l3mdev IP rule is never getting deleted even if VRF is deleted. Once if this l3mdev IP rule is added, if user configures IP address for the eth0 interface, interfaces.j2 adds an eth0 IP rule as "1000:from 100.104.47.74 lookup default ". Priority 1000 is automatically chosen by kernel and hence this rule gets higher priority than the already existing rule "1001:from all lookup local ". This results in an issue "ping from console to eth0 IP does not work once if VRF is created" as explained in Issue 551. More details and possible solutions are explained as comments in the Issue551. This PR is to resolve the issue by always fixing the low priority 32765 for the IP rule that is created for the eth0 IP address. Tested with various combinations of VRF creation, deletion and IP address configuration along with ping from console to eth0 IP address. Co-authored-by: Kannan KVS <kannan_kvs@dell.com>	2020-11-01 10:27:10 -08:00
pavel-shirshov	2eec3b3254	[bgpcfgd]: Dynamic BBR support (#5626 ) - Why I did it To introduce dynamic support of BBR functionality into bgpcfgd. BBR is adding `neighbor PEER_GROUP allowas-in 1' for all BGP peer-groups which points to T0 Now we can add and remove this configuration based on CONFIG_DB entry - How I did it I introduced a new CONFIG_DB entry: - table name: "BGP_BBR" - key value: "all". Currently only "all" is supported, which means that all peer-groups which points to T0s will be updated - data value: a dictionary: {"status": "status_value"}, where status_value could be either "enabled" or "disabled" Initially, when bgpcfgd starts, it reads initial BBR status values from the [constants.yml](https://github.com/Azure/sonic-buildimage/pull/5626/files#diff-e6f2fe13a6c276dc2f3b27a5bef79886f9c103194be4fcb28ce57375edf2c23cR34). Then you can control BBR status by changing "BGP_BBR" table in the CONFIG_DB (see examples below). bgpcfgd knows what peer-groups to change fron [constants.yml](https://github.com/Azure/sonic-buildimage/pull/5626/files#diff-e6f2fe13a6c276dc2f3b27a5bef79886f9c103194be4fcb28ce57375edf2c23cR39). The dictionary contains peer-group names as keys, and a list of address-families as values. So when bgpcfgd got a request to change the BBR state, it changes the state only for peer-groups listed in the constants.yml dictionary (and only for address families from the peer-group value). - How to verify it Initially, when we start SONiC FRR has BBR enabled for PEER_V4 and PEER_V6: ``` admin@str-s6100-acs-1:~$ vtysh -c 'show run' \| egrep 'PEER_V.? 
allowas' neighbor PEER_V4 allowas-in 1 neighbor PEER_V6 allowas-in 1 ``` Then we apply following configuration to the db: ``` admin@str-s6100-acs-1:~$ cat disable.json { "BGP_BBR": { "all": { "status": "disabled" } } } admin@str-s6100-acs-1:~$ sonic-cfggen -j disable.json -w ``` The log output are: ``` Oct 14 18:40:22.450322 str-s6100-acs-1 DEBUG bgp#bgpcfgd: Received message : '('all', 'SET', (('status', 'disabled'),))' Oct 14 18:40:22.450620 str-s6100-acs-1 DEBUG bgp#bgpcfgd: execute command '['vtysh', '-f', '/tmp/tmpmWTiuq']'. Oct 14 18:40:22.681084 str-s6100-acs-1 DEBUG bgp#bgpcfgd: execute command '['vtysh', '-c', 'clear bgp peer-group PEER_V4 soft in']'. Oct 14 18:40:22.904626 str-s6100-acs-1 DEBUG bgp#bgpcfgd: execute command '['vtysh', '-c', 'clear bgp peer-group PEER_V6 soft in']'. ``` Check FRR configuraiton and see that no allowas parameters are there: ``` admin@str-s6100-acs-1:~$ vtysh -c 'show run' \| egrep 'PEER_V.? allowas' admin@str-s6100-acs-1:~$ ``` Then we apply enabling configuration back: ``` admin@str-s6100-acs-1:~$ cat enable.json { "BGP_BBR": { "all": { "status": "enabled" } } } admin@str-s6100-acs-1:~$ sonic-cfggen -j enable.json -w ``` The log output: ``` Oct 14 18:40:41.074720 str-s6100-acs-1 DEBUG bgp#bgpcfgd: Received message : '('all', 'SET', (('status', 'enabled'),))' Oct 14 18:40:41.074720 str-s6100-acs-1 DEBUG bgp#bgpcfgd: execute command '['vtysh', '-f', '/tmp/tmpDD6SKv']'. Oct 14 18:40:41.587257 str-s6100-acs-1 DEBUG bgp#bgpcfgd: execute command '['vtysh', '-c', 'clear bgp peer-group PEER_V4 soft in']'. Oct 14 18:40:42.042967 str-s6100-acs-1 DEBUG bgp#bgpcfgd: execute command '['vtysh', '-c', 'clear bgp peer-group PEER_V6 soft in']'. ``` Check FRR configuraiton and see that the BBR configuration is back: ``` admin@str-s6100-acs-1:~$ vtysh -c 'show run' \| egrep 'PEER_V.? 
allowas' neighbor PEER_V4 allowas-in 1 neighbor PEER_V6 allowas-in 1 ``` * The test coverage * Below is the test coverage ``` ---------- coverage: platform linux2, python 2.7.12-final-0 ---------- Name Stmts Miss Cover ---------------------------------------------------- bgpcfgd/__init__.py 0 0 100% bgpcfgd/__main__.py 3 3 0% bgpcfgd/config.py 78 41 47% bgpcfgd/directory.py 63 34 46% bgpcfgd/log.py 15 3 80% bgpcfgd/main.py 51 51 0% bgpcfgd/manager.py 41 23 44% bgpcfgd/managers_allow_list.py 385 21 95% bgpcfgd/managers_bbr.py 76 0 100% bgpcfgd/managers_bgp.py 193 193 0% bgpcfgd/managers_db.py 9 9 0% bgpcfgd/managers_intf.py 33 33 0% bgpcfgd/managers_setsrc.py 45 45 0% bgpcfgd/runner.py 39 39 0% bgpcfgd/template.py 64 11 83% bgpcfgd/utils.py 32 24 25% bgpcfgd/vars.py 1 0 100% ---------------------------------------------------- TOTAL 1128 530 53% ``` - Which release branch to backport (provide reason below if selected) - [ ] 201811 - [x] 201911 - [x] 202006	2020-10-30 08:58:27 -07:00
pavel-shirshov	bee6c87f90	[bgpcfgd]: Change prefix-list generation for "Allow prefix" feature (#5639 ) - Why I did it I was asked to change "Allow list" prefix-list generation rule. Previously we generated the rules using following method: ``` For each {prefix}/{masklen} we would generate the prefix-rule permit {prefix}/{masklen} ge {masklen}+1 Example: Prefix 1.2.3.4/24 would have following prefix-list entry generated permit 1.2.3.4/24 ge 23 ``` But we discovered the old rule doesn't work for all cases we have. So we introduced the new rule: ``` For ipv4 entry, For mask < 32 , we will add ‘le 32’ to cover all prefix masks to be sent by T0 For mask =32 , we will not add any ‘le mask’ For ipv6 entry, we will add le 128 to cover all the prefix mask to be sent by T0 For mask < 128 , we will add ‘le 128’ to cover all prefix masks to be sent by T0 For mask = 128 , we will not add any ‘le mask’ ``` - How I did it I change prefix-list entry generation function. Also I introduced a test for the changed function. - How to verify it 1. Build an image and put it on your dut. 2. Create a file test_schema.conf with the test configuration ``` { "BGP_ALLOWED_PREFIXES": { "DEPLOYMENT_ID\|0\|1010:1010": { "prefixes_v4": [ "10.20.0.0/16", "10.50.1.0/29" ], "prefixes_v6": [ "fc01:10::/64", "fc02:20::/64" ] }, "DEPLOYMENT_ID\|0": { "prefixes_v4": [ "10.20.0.0/16", "10.50.1.0/29" ], "prefixes_v6": [ "fc01:10::/64", "fc02:20::/64" ] } } } ``` 3. Apply the configuration by command ``` sonic-cfggen -j test_schema.conf --write-to-db ``` 4. 
Check that your bgp configuration has following prefix-list entries: ``` admin@str-s6100-acs-1:~$ show runningconfiguration bgp \| grep PL_ALLOW ip prefix-list PL_ALLOW_LIST_DEPLOYMENT_ID_0_COMMUNITY_1010:1010_V4 seq 10 deny 0.0.0.0/0 le 17 ip prefix-list PL_ALLOW_LIST_DEPLOYMENT_ID_0_COMMUNITY_1010:1010_V4 seq 20 permit 127.0.0.1/32 ip prefix-list PL_ALLOW_LIST_DEPLOYMENT_ID_0_COMMUNITY_1010:1010_V4 seq 30 permit 10.20.0.0/16 le 32 ip prefix-list PL_ALLOW_LIST_DEPLOYMENT_ID_0_COMMUNITY_1010:1010_V4 seq 40 permit 10.50.1.0/29 le 32 ip prefix-list PL_ALLOW_LIST_DEPLOYMENT_ID_0_COMMUNITY_empty_V4 seq 10 deny 0.0.0.0/0 le 17 ip prefix-list PL_ALLOW_LIST_DEPLOYMENT_ID_0_COMMUNITY_empty_V4 seq 20 permit 127.0.0.1/32 ip prefix-list PL_ALLOW_LIST_DEPLOYMENT_ID_0_COMMUNITY_empty_V4 seq 30 permit 10.20.0.0/16 le 32 ip prefix-list PL_ALLOW_LIST_DEPLOYMENT_ID_0_COMMUNITY_empty_V4 seq 40 permit 10.50.1.0/29 le 32 ipv6 prefix-list PL_ALLOW_LIST_DEPLOYMENT_ID_0_COMMUNITY_1010:1010_V6 seq 10 deny ::/0 le 59 ipv6 prefix-list PL_ALLOW_LIST_DEPLOYMENT_ID_0_COMMUNITY_1010:1010_V6 seq 20 deny ::/0 ge 65 ipv6 prefix-list PL_ALLOW_LIST_DEPLOYMENT_ID_0_COMMUNITY_1010:1010_V6 seq 30 permit fc01:10::/64 le 128 ipv6 prefix-list PL_ALLOW_LIST_DEPLOYMENT_ID_0_COMMUNITY_1010:1010_V6 seq 40 permit fc02:20::/64 le 128 ipv6 prefix-list PL_ALLOW_LIST_DEPLOYMENT_ID_0_COMMUNITY_empty_V6 seq 10 deny ::/0 le 59 ipv6 prefix-list PL_ALLOW_LIST_DEPLOYMENT_ID_0_COMMUNITY_empty_V6 seq 20 deny ::/0 ge 65 ipv6 prefix-list PL_ALLOW_LIST_DEPLOYMENT_ID_0_COMMUNITY_empty_V6 seq 30 permit fc01:10::/64 le 128 ipv6 prefix-list PL_ALLOW_LIST_DEPLOYMENT_ID_0_COMMUNITY_empty_V6 seq 40 permit fc02:20::/64 le 128 ``` Co-authored-by: Pavel Shirshov <pavel.contrib@gmail.com>	2020-10-30 08:56:52 -07:00
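The new rule described in the commit above can be sketched as follows. This is a minimal illustration, not the bgpcfgd code: the function name `allow_list_prefix_entry` is hypothetical, and it only produces the `permit ...` tail of a prefix-list entry.

```python
import ipaddress


def allow_list_prefix_entry(prefix: str) -> str:
    """Return the 'permit' clause for one allowed prefix (hypothetical helper).

    A mask shorter than the address-family maximum gets 'le <max>' appended;
    a host route (/32 for IPv4, /128 for IPv6) gets no 'le' clause.
    """
    # strict=False so that a prefix with host bits set is still accepted.
    network = ipaddress.ip_network(prefix, strict=False)
    max_len = 32 if network.version == 4 else 128
    if network.prefixlen < max_len:
        return f"permit {prefix} le {max_len}"
    return f"permit {prefix}"


if __name__ == "__main__":
    for p in ("10.20.0.0/16", "127.0.0.1/32", "fc01:10::/64"):
        print(allow_list_prefix_entry(p))
```

The `10.20.0.0/16` and `fc01:10::/64` cases correspond to the `seq 30` entries in the verification output above.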
bingwang-ms	7a015eacc1	Fix 'NoSuchProcess' exception in process_checker (#5716 ) The psutil library used in process_checker creates a cache for each process when calling process_iter. So there is some possibility that a process exists when process_iter is called but no longer exists when cmdline is called, which raises a NoSuchProcess exception. This commit fixes the issue. Signed-off-by: bingwang <bingwang@microsoft.com>	2020-10-30 08:56:10 -07:00
judyjoseph	5a802533b5	Fix to remove the import of APIClient (#5724 )	2020-10-27 08:32:37 -07:00
judyjoseph	963bd7fdc4	[docker-teamd]: Add teamd as a depedent service to swss (#5628 ) - Why I did it On teamd docker restart, the swss and syncd needs to be restarted as there are dependent resources present. - How I did it Add the teamd as a dependent service for swss Updated the docker-wait script to handle service and dependent services separately. Handle the case of warm-restart for the dependent service - How to verify it Verified the following scenario's with the following testbed VM1 ----------------------------[DUT 6100] -----------------------VM2, ping traffic continuous between VMs 1. Stop teamd docker alone > swss, syncd dockers seen going away > The LAG reference count error messages seen for a while till swss docker stops. > Dockers back up. 2. Enable WR mode for teamd. Stop teamd docker alone > swss, syncd dockers not removed. > The LAG reference count error messages not seen > Repeated stop teamd docker test - same result, no effect on swss/syncd. 3. Stop swss docker. > swss, teamd, syncd goes off - dockers comes back correctly, interfaces up 4. Enable WR mode for swss . Stop swss docker > swss goes off not affecting syncd/teamd dockers. 5. Config reload > no reference counter error seen, dockers comes back correctly, with interfaces up 6. Warm reboot, observations below > swss docker goes off first > teamd + syncd goes off to the end of WR process. > dockers comes back up fine. > ping traffic between VM's was NOT HIT 7. Fast reboot, observations below > teamd goes off first ( confirmed swss don't exit here ) > swss goes off next > syncd goes away at the end of the FR process > dockers comes back up fine. > there is a traffic HIT as per fast-reboot 8. Verified in multi-asic platform, the tests above other than WR/FB scenarios	2020-10-23 15:49:23 -07:00
yozhao101	d8ae2a0019	[hostcfgd] Enable/disable the container service only when the feature state was changed. (#5689 ) - Why I did it If we ran the CLI commands `sudo config feature autorestart snmp disabled/enabled` or `sudo config feature autorestart swss disabled/enabled`, then SNMP container will be stopped and started. This behavior was not expected since we updated the `auto_restart` field, not the `state` field, in the `FEATURE` table. The reason behind this issue is that whenever either the `state` field or the `auto_restart` field was updated, the function `update_feature_state(...)` was invoked, which then starts the snmp.timer service. The snmp.timer service will first stop snmp.service and later start snmp.service. In order to solve this issue, the function `update_feature_state(...)` will be invoked only if the `state` field in the `FEATURE` table was updated. - How I did it When the daemon `hostcfgd` is activated, the values of the `state` field in the `FEATURE` table for each container are cached. Each time the function `feature_state_handler(...)` is invoked, it determines whether the `state` field of a container was changed or not. If it was changed, the function `update_feature_state(...)` is invoked and the cached value is also updated. Otherwise, nothing is done. - How to verify it We can run the CLI commands `sudo config feature autorestart snmp disabled/enabled` or `sudo config feature autorestart swss disabled/enabled` to check whether SNMP container is stopped and started. We also can run the CLI commands `sudo config feature state snmp disabled/enabled` or `sudo config feature state swss disabled/enabled` to check whether the container is stopped and restarted. Signed-off-by: Yong Zhao <yozhao@microsoft.com>	2020-10-23 15:45:04 -07:00
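The caching approach described in the commit above might look roughly like the sketch below. It is an assumption-laden illustration, not the hostcfgd source: the class `FeatureStateCache` and its `handle_update` method are hypothetical names, and the callback stands in for `update_feature_state(...)`.

```python
class FeatureStateCache:
    """Invoke a callback only when a feature's `state` field changes.

    Hypothetical stand-in for the cache hostcfgd keeps; `on_state_change`
    plays the role of update_feature_state(...).
    """

    def __init__(self, initial_states, on_state_change):
        self._cached = dict(initial_states)
        self._on_state_change = on_state_change

    def handle_update(self, feature, row):
        """Process one FEATURE table update; return True if the state changed."""
        new_state = row.get("state")
        if new_state is None or self._cached.get(feature) == new_state:
            # Only another field (for example auto_restart) changed: do nothing.
            return False
        self._cached[feature] = new_state
        self._on_state_change(feature, new_state)
        return True


if __name__ == "__main__":
    cache = FeatureStateCache({"snmp": "enabled"}, lambda f, s: print(f, "->", s))
    cache.handle_update("snmp", {"state": "enabled", "auto_restart": "disabled"})
    cache.handle_update("snmp", {"state": "disabled", "auto_restart": "disabled"})
```

With this shape, an `auto_restart` change alone never reaches the callback, so the service is not stopped and started.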
Joe LeVeque	4dde7d00cf	[caclmgrd] Prevent unnecessary iptables updates (#5312 ) When a large number of changes occur to the ACL table of Config DB, caclmgrd will get flooded with notifications, and previously, it would regenerate and apply the iptables rules for each change, which is unnecessary, as the iptables rules should only get applied once after the last change notification is received. If the ACL table contains a large number of control plane ACL rules, this could cause a large delay in caclmgrd getting the rules applied. This patch causes caclmgrd to delay updating the iptables rules until it has not received a change notification for at least 0.5 seconds.	2020-10-21 12:15:04 -07:00
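The delay described in the commit above is a debounce: the rules are applied once, after notifications have stopped for a quiet period. A minimal sketch follows, assuming the caller supplies timestamps; the class `Debouncer` and its methods are hypothetical and do not come from caclmgrd.

```python
class Debouncer:
    """Report when an update is due after a quiet period with no notifications."""

    def __init__(self, quiet_period=0.5):
        self.quiet_period = quiet_period
        self._last_notification = None
        self._pending = False

    def notify(self, now):
        """Record a change notification received at time `now` (in seconds)."""
        self._last_notification = now
        self._pending = True

    def should_apply(self, now):
        """Return True once if changes are pending and the quiet period has passed."""
        if self._pending and now - self._last_notification >= self.quiet_period:
            self._pending = False
            return True
        return False


if __name__ == "__main__":
    d = Debouncer(quiet_period=0.5)
    d.notify(10.0)
    d.notify(10.2)
    print(d.should_apply(10.6))  # still inside the quiet period
    print(d.should_apply(10.8))  # quiet period elapsed, apply once
```

In a real daemon the timestamps would come from something like `time.monotonic()` inside the event loop; passing them in keeps the sketch easy to test.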
abdosi	c9e0b06009	Optimize ACL Table/Rule notification handling (#5621 ) * Optimize ACL Table/Rule notification handling to loop pop() until empty to consume all the data in a batch. This way we prevent multiple calls to iptables updates. Signed-off-by: Abhishek Dosi <abdosi@microsoft.com> * Address review comments Signed-off-by: Abhishek Dosi <abdosi@microsoft.com>	2020-10-14 08:08:23 -07:00
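The "pop until empty" idea from the commit above can be illustrated with a plain queue. This is only a sketch: a `collections.deque` stands in for the subscriber table, and `handle_notifications` and `apply_rules` are hypothetical names.

```python
from collections import deque


def drain(queue):
    """Pop every pending notification and return them as one batch."""
    batch = []
    while queue:
        batch.append(queue.popleft())
    return batch


def handle_notifications(queue, apply_rules):
    """Consume all pending notifications, then apply the rules a single time."""
    batch = drain(queue)
    if batch:
        apply_rules(batch)
    return len(batch)


if __name__ == "__main__":
    pending = deque(["rule1 SET", "rule2 SET", "rule3 DEL"])
    handle_notifications(pending, lambda batch: print("applying once for", len(batch), "changes"))
```

Three queued changes result in one call to `apply_rules` instead of three.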
abdosi	ccebd006b5	Optimized caclmgrd Notification handling. Previously (#5560 ) any event happening on ACL Rule Table (eg DATAACL rules programmed) caused control plane default action to be triggered. Now the Control Plane action will be triggered only for a) an ACL Rule belonging to a Control ACL Table Signed-off-by: Abhishek Dosi <abdosi@microsoft.com>	2020-10-08 11:57:04 -07:00
pavel-shirshov	437ad95646	[bgp] Add 'allow list' manager feature (#5513 ) implements a new feature: "BGP Allow list." This feature allows us to control which IP prefixes are going to be advertised via ebgp from the routes received from EBGP neighbors.	2020-10-06 11:15:19 -07:00
Ying Xie	bea968bb2b	[rc.local] separate configuration migration and grub installation logic (#5528 ) To address issue #5525 Explicitly control the grub installation requirement when it is needed. We have scenario where configuration migration happened but grub installation is not required. Signed-off-by: Ying Xie <ying.xie@microsoft.com>	2020-10-04 19:41:50 +00:00
Abhishek Dosi	04725bc030	Revert "[bgp] Add 'allow list' manager feature (#5309 )" This reverts commit `b5d33b39de`.	2020-09-29 15:39:04 +00:00
Tamer Ahmed	2cc98b4bac	[platform] Add Support For Environment Variable File (#5010 ) * [platform] Add Support For Environment Variable This PR adds the ability to read an environment file from /etc/sonic. The file contains immutable SONiC config attributes such as platform, hwsku, version, device_type. The aim is to minimize calls being made into sonic-cfggen during boot time. Signed-off-by: Tamer Ahmed <tamer.ahmed@microsoft.com>	2020-09-28 21:14:39 +00:00
pavel-shirshov	b5d33b39de	[bgp] Add 'allow list' manager feature (#5309 ) implements a new feature: "BGP Allow list." This feature allows us to control which IP prefixes are going to be advertised via ebgp from the routes received from EBGP neighbors.	2020-09-28 16:20:27 +00:00
bingwang-ms	0fabc906d1	Fix exception when attempting to write a datetime to db (#5467 ) redis-py 3.0 used in master branch only accepts user data as bytes, strings or numbers (ints, longs and floats). Attempting to specify a key or a value as any other type will raise a DataError exception. This PR addresses the issue by converting datetime to str	2020-09-28 16:18:24 +00:00
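The conversion described in the commit above can be sketched without a Redis connection. The helper name `stringify_datetimes` is hypothetical; it only shows the datetime-to-str step applied to a mapping before it is handed to a client that rejects datetime values.

```python
from datetime import datetime


def stringify_datetimes(fields):
    """Return a copy of `fields` with datetime values converted to str.

    Other values are passed through unchanged.
    """
    return {
        key: str(value) if isinstance(value, datetime) else value
        for key, value in fields.items()
    }


if __name__ == "__main__":
    row = {"name": "snmp", "last_update": datetime(2020, 9, 28, 16, 18, 24)}
    print(stringify_datetimes(row))
```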
judyjoseph	cff716f7a5	[Multi-Asic] Forward SNMP requests received on front panel interface to SNMP agent in host. (#5420 ) * [Multi-Asic] Forward SNMP requests destined to loopback IP, and coming in through the front panel interface present in the network namespace, to SNMP agent running in the linux host. * Updates based on comments * Further updates in docker_image_ctl.j2 and caclmgrd * Change the variable for net config file. * Updated the comments in the code. * No need to clean up the existing NAT rules if present, which could be created by some other process. * Delete our rule first and add it back, to take care of caclmgrd restart. Another benefit is that we delete only our rules, rather than earlier approach of "iptables -F" which cleans up all rules. * Keeping the original logic to clean the NAT entries, to revisit when NAT feature added in namespace. * Missing updates to log_info call.	2020-09-28 16:14:07 +00:00
yozhao101	7580c846ad	[201911][Monit] Unmonitor processes in disabled containers (#5462 ) We want to let Monit to unmonitor the processes in containers which are disabled in `FEATURE` table such that Monit will not generate false alerting messages into the syslog. - Backport of https://github.com/Azure/sonic-buildimage/pull/5153 to the 201911 branch Signed-off-by: Yong Zhao <yozhao@microsoft.com>	2020-09-25 00:30:41 -07:00
abdosi	73bd647e44	Enhanced Feature Table state enable/disable for multi-asic platforms. (#5358 ) * Enhanced Feature Table state enable/disable for multi-asic platforms. In Multi-asic for some features we can service per asic so we need to get list of all services. Also updated logic to return if any one of the systemctl commands returns failure and make sure the syslog of a feature getting enabled/disabled only comes when all commands are successful. Moved the service list get api from sonic-util to sonic-py-common Signed-off-by: Abhishek Dosi <abdosi@abdosi-ubuntu-vm0.nwp1qucpfg5ejooejenqshkj3e.cx.internal.cloudapp.net> * Make sure to return None for both service lists in case of error. Signed-off-by: Abhishek Dosi <abdosi@abdosi-ubuntu-vm0.nwp1qucpfg5ejooejenqshkj3e.cx.internal.cloudapp.net> * Return empty list as fail condition Signed-off-by: Abhishek Dosi <abdosi@abdosi-ubuntu-vm0.nwp1qucpfg5ejooejenqshkj3e.cx.internal.cloudapp.net> * Address Review Comments. Made init_cfg.json.j2 knowledgeable of whether a Feature service is global scope or per asic scope Signed-off-by: Abhishek Dosi <abdosi@abdosi-ubuntu-vm0.nwp1qucpfg5ejooejenqshkj3e.cx.internal.cloudapp.net> * Fix merge conflict * Address Review Comment. Signed-off-by: Abhishek Dosi <abdosi@abdosi-ubuntu-vm0.nwp1qucpfg5ejooejenqshkj3e.cx.internal.cloudapp.net> Co-authored-by: Abhishek Dosi <abdosi@abdosi-ubuntu-vm0.nwp1qucpfg5ejooejenqshkj3e.cx.internal.cloudapp.net>	2020-09-22 11:38:19 -07:00
Renuka Manavalan	7d6e5083ce	[monit] Periodically monitor route consistency (#5085 ) * Add route_check to monit. * Switched to units of cycles per comments * Added comments per Joe's comments. * Added more comments per Royal's comments.	2020-09-19 15:47:53 -07:00
Blueve	64e04f8542	[conf] append nos-config-part for s6100 (#5234 ) * [conf] append nos-config-part for s6100 * modify rc.local Signed-off-by: Guohan Lu <lguohan@gmail.com> * Update rc.local Co-authored-by: Blueve <jika@microsoft.com> Co-authored-by: Guohan Lu <lguohan@gmail.com> Co-authored-by: Ying Xie <yxieca@users.noreply.github.com>	2020-09-19 14:14:32 -07:00
noaOrMlnx	d4f6e080cb	Change update_feature_state call to pass False as default if feature has no 'has_timer' field (#5260 ) * Pass False as default if feature has no timer field * Update hostcfgd to fit the new changes merged New changes can be found in PR:5248	2020-09-19 14:07:53 -07:00
abdosi	e43521ab64	[Multi-Asic] Fix for multi-asic where we should allow docker local (#5364 ) communication on docker eth0 ip . Without this TCP Connection to Redis does not happen in namespace. Signed-off-by: Abhishek Dosi <abdosi@abdosi-ubuntu-vm0.nwp1qucpfg5ejooejenqshkj3e.cx.internal.cloudapp.net> Co-authored-by: Abhishek Dosi <abdosi@abdosi-ubuntu-vm0.nwp1qucpfg5ejooejenqshkj3e.cx.internal.cloudapp.net>	2020-09-19 14:04:56 -07:00
Joe LeVeque	05e5807b3f	[process-reboot-cause] Use Logger class from sonic-py-common package (#5384 ) Eliminate duplicate logging code by importing Logger class from sonic-py-common package.	2020-09-19 13:59:59 -07:00
Joe LeVeque	930526f6f8	[procdockerstatsd] Inherit DaemonBase class from sonic-py-common package (#5372 ) Eliminate duplicate logging code by inheriting from DaemonBase class in sonic-py-common package.	2020-09-19 13:55:06 -07:00
Joe LeVeque	a957ac6402	[caclmgrd] Inherit DaemonBase class from sonic-py-common package (#5373 ) Eliminate duplicate logging code by inheriting from DaemonBase class in sonic-py-common package.	2020-09-19 13:54:25 -07:00
Abhishek Dosi	13d44a4faf	Removed DB specific get api's from Selectable class (PR #378 ) on sonic-swss-common With the change as part of #378 caclmgrd need to be updated to use new client side Get API to access namespace. Signed-off-by: Abhishek Dosi <abdosi@microsoft.com>	2020-09-03 16:45:33 -07:00
Tamer Ahmed	826aaf51f6	[hostcfgd] Fix Boolean String Evaluation (#5248 ) New attribute 'has_timer' introduced to init_cfg.json does not evaluate as Bool, rather it evaluates as string. This PR fixes this issue. Also, this PR fixes an issue when there is system config unit (snmp, telemetry) that has no installation config (WantedBy=, RequiredBy=, Also=, Alias=) settings in the [Install] section. In the latter case, the .service should not be enabled. signed-off-by: Tamer Ahmed <tamer.ahmed@microsoft.com>	2020-08-27 08:04:29 -07:00
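The pitfall fixed in the commit above is that any non-empty string, including "False", is truthy in Python. A minimal sketch of an explicit conversion follows; the helper name `str_to_bool` is hypothetical and is not taken from hostcfgd.

```python
def str_to_bool(value):
    """Interpret a config value such as 'True' or 'false' as a real bool."""
    if isinstance(value, bool):
        return value
    return str(value).strip().lower() == "true"


if __name__ == "__main__":
    feature = {"has_timer": "False"}
    # bool() on a non-empty string is always True, which is the bug being avoided.
    print(bool(feature["has_timer"]), str_to_bool(feature["has_timer"]))
```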
Tamer Ahmed	dd3e7a6fa8	[hostcfgd] Handle Both Service And Timer Units (#5228 ) Commit `e484ae9dd` introduced systemd .timer unit to hostcfgd. However, when stopping service that has timer, there is possibility that timer is not running and the service would not be stopped. This PR address this situation by handling both .timer and .service units. signed-off-by: Tamer Ahmed <tamer.ahmed@microsoft.com>	2020-08-22 09:29:54 -07:00
Tamer Ahmed	9decadfed2	[services] Fix Delay Start of SNMP And Telemetry (#5211 ) SNMP and Telemetry services are not critical to switch startup. They also cause fast-reboot not to meet timing requirements. In order to delay start those service are associated with systemd timer units, however when hostcfgd initiate service start, it start the service and not the timer. This PR fixes this issue by starting the timer associated with systemd unit. signed-off-by: Tamer Ahmed <tamer.ahmed@microsoft.com>	2020-08-20 16:26:08 -07:00
abdosi	f785a0a270	[caclmgrd] Add support for multi-ASIC platforms (#5022 ) * Support for Control Plane ACL's for Multi-asic Platforms. Following changes were done: 1) Moved from using blocking listen() on Config DB to the select() model via python-swsscommon since we have to wait on event from multiple config db's 2) Since python-swsscommon is not available on host added libswsscommon and python-swsscommon and dependent packages in the base image (host environment) 3) Made iptables programmed in all namespace using ip netns exec Signed-off-by: Abhishek Dosi <abdosi@microsoft.com> * Address Review Comments Signed-off-by: Abhishek Dosi <abdosi@microsoft.com> * Fix Review Comments * Fix Comments * Added Change for Multi-asic to have iptables rules to accept internal docker tcp/udp traffic needed for syslog and redis-tcp connection. Signed-off-by: Abhishek Dosi <abdosi@microsoft.com> * Fix Review Comments * Added more comments on logic. * Fixed all warning/errors reported by http://pep8online.com/ other than line > 80 characters. * Fix Comment Signed-off-by: Abhishek Dosi <abdosi@microsoft.com> * Verified with swsscommon package. Fix issue for single asic platforms. * Moved to new python package * Address Review Comments. Signed-off-by: Abhishek Dosi <abdosi@microsoft.com> * Address Review Comments.	2020-08-20 16:01:12 -07:00
Joe LeVeque	309a098b21	[201911][Python] Migrate applications/scripts to import sonic-py-common package (#5132 ) As part of consolidating all common Python-based functionality into the new sonic-py-common package, this pull request: 1. Redirects all Python applications/scripts in sonic-buildimage repo which previously imported sonic_device_util or sonic_daemon_base to instead import sonic-py-common, which was added to the 201911 branch in https://github.com/Azure/sonic-buildimage/pull/5063 2. Replaces all calls to `sonic_device_util.get_platform_info()` to instead call `sonic_py_common.get_platform()` and removes any calls to `sonic_device_util.get_machine_info()` which are no longer necessary (i.e., those which were only used to pass the results to `sonic_device_util.get_platform_info()`. 3. Removes unused imports to the now-deprecated sonic-daemon-base package and sonic_device_util.py module This is a step toward resolving https://github.com/Azure/sonic-buildimage/issues/4999	2020-08-13 16:35:53 -07:00
lguohan	78c803851c	[build]: combine feature and container feature table (#5081 ) 1. remove container feature table 2. do not generate feature entry if the feature is not included in the image 3. rename ENABLE_* to INCLUDE_* for better clarity 4. rename feature status to feature state 5. [submodule]: update sonic-utilities * 9700e45 2020-08-03 \| [show/config]: combine feature and container feature cli (#1015) (HEAD, origin/master, origin/HEAD) [lguohan] * c9d3550 2020-08-03 \| [tests]: fix drops_group_test failure on second run (#1023) [lguohan] * dfaae69 2020-08-03 \| [lldpshow]: Fix input device is not a TTY error (#1016) [Arun Saravanan Balachandran] * 216688e 2020-08-02 \| [tests]: rename sonic-utilitie-tests to tests (#1022) [lguohan] Signed-off-by: Guohan Lu <lguohan@gmail.com>	2020-08-09 11:55:40 -07:00
Sujin Kang	ff6cb6c402	Add disabling HW watchdog during boot for fast-reboot and warm-reboot (#4927 ) * Add disabling HW watchdog during boot for fast-reboot and warm-reboot case * typo	2020-08-09 11:25:31 -07:00
rkdevi27	5ddfc13a75	[baseimage]: /host unmount timeout issue during reboot. (#5032 ) Fix for the host unmount issue through PR https://github.com/Azure/sonic-buildimage/pull/4558 and https://github.com/Azure/sonic-buildimage/pull/4865 creates the timeout of syslog.socket closure during reboot since the journald socket closure has been included in syslog.socket Removed the journal socket closure. The host unmount is fixed with just stopping the services which gets restarted only after /var/log unmount and not causing the unmount issues.	2020-08-09 10:38:33 -07:00
rkdevi27	652aa3b072	[baseimage]: /host unmount failed in VM during reboot (#4865 ) Added a check further to make the services to stop appropriately before unmount. Fix #4651	2020-08-09 10:37:12 -07:00
rkdevi27	f1bbda19f0	Fix "/host unmount failure" during reboot (#4558 )	2020-08-09 10:34:02 -07:00
Joe LeVeque	c96c3cd311	[caclmgrd] Always restart service upon process termination (#5065 )	2020-07-31 17:23:48 -07:00
madhanmellanox	130aeb4cc1	[caclmgrd] Log error message if IPv4 ACL table contains IPv6 rule and vice-versa (#4498 ) * Defect 2082949: Handling Control Plane ACLs so that IPv4 rules and IPv6 rules are not added to the same ACL table * Previous code review comments of coming up with functions for is_ipv4_rule and is_ipv6_rule is addressed and also raising Exceptions instead of simply aborting when the conflict occurs is handled * Addressed code review comment to replace duplicate code with already existing functions * removed raising Exception when rule conflict in Control plane ACLs are found * added code to remove the rule_props if it is conflicting ACL table versioning rule * addressed review comment to add ignoring rule in the error statement Co-authored-by: Madhan Babu <madhan@arc-build-server.mtr.labs.mlnx>	2020-07-26 11:16:30 -07:00
Joe LeVeque	4a2db8e216	[caclmgrd] remove default DROP rule on FORWARD chain (#5034 )	2020-07-26 11:07:42 -07:00
Joe LeVeque	3f3fcd3253	[caclmgrd] Filter DHCP packets based on dest port only (#4995 )	2020-07-21 10:13:17 +00:00
Joe LeVeque	52e45e823e	[201911][sudoers] Add `sonic_installer list` to read-only commands (#4997 ) `sonic_installer list` is a read-only command. Specify it as such in the sudoers file. This will also ensure the new `show boot` command, which calls `sudo sonic_installer list` under the hood doesn't fail due to permissions.	2020-07-17 20:13:42 -07:00
Joe LeVeque	0559b7d3b6	[caclmgrd] Improve code reuse (#4931 ) Improve code reuse in `generate_block_ip2me_traffic_iptables_commands()` function.	2020-07-11 09:48:10 -07:00
abdosi	4869fa7173	[sonic-buildimage] Changes to make network specific sysctl common for both host and docker namespace (#4838 ) * [sonic-buildimage] Changes to make network specific sysctl common for both host and docker namespace (in multi-npu). This change is triggered with issue found in multi-npu platforms where in docker namespace net.ipv6.conf.all.forwarding was 0 (should be 1) because of which RS/RA message were triggered and link-local router were learnt. Beside this there were some other sysctl.net.ipv6* params whose value in docker namespace is not same as host namespace. So to make we are always in sync in host and docker namespace created common file that list all sysctl.net.* params and used both by host and docker namespace. Any change will get applied to both namespace. Signed-off-by: Abhishek Dosi <abdosi@microsoft.com> * Address Review Comments and made sure to invoke augtool only one and do string concatenation of all set commands * Address Review Comments.	2020-07-05 15:32:30 -07:00
arlakshm	b6b1f3fac8	syslog changes Multi ASIC platforms (#4738 ) Add changes for syslog support for containers running in namespaces on multi ASIC platforms. On Multi ASIC platforms Rsyslog service is only running on the host. There is no rsyslog service running in each namespace. On multi ASIC platforms the rsyslog service on the host will be listening on the docker0 ip address instead of loopback address. The rsyslog.conf on the containers is modified to have omfwd target ip to be docker0 ipaddress instead of loopback ip Signed-off-by: Arvindsrinivasan Lakshmi Narasimhan <arlakshm@microsoft.com>	2020-07-05 15:19:22 -07:00
Joe LeVeque	0768bf7733	[hostcfgd] Synchronize all feature statuses once upon start (#4714 ) - Ensure all features (services) are in the configured state when hostcfgd starts - Better functionalization of code - Also replace calls to deprecated `has_key()` method in `tacacs_server_handler()` and `tacacs_global_handler()` with `in` keyword. This PR depends on https://github.com/Azure/sonic-utilities/pull/944, otherwise `config load_minigraph` will fail when trying to restart disabled services.	2020-06-28 07:28:33 -07:00
padmanarayana	7564b060e4	[DELL]: FTOS to SONiC fast conversion fixes (#4807 ) While migrating to SONiC 20181130, identified a couple of issues: 1. union-mount needs /host/machine.conf parameters for vendor specific checks : however, in case of migration, the /host/machine.conf is extracted from ONIE only in https://github.com/Azure/sonic-buildimage/blob/master/files/image_config/platform/rc.local#L127. 2. Since grub.cfg is updated to have net.ifnames=0 biosdevname=0, 70-persistent-net.rules changes are no longer required.	2020-06-20 08:15:05 -07:00
Joe LeVeque	d8886ba473	[caclmgrd] Don't limit connection tracking to TCP (#4796 ) Don't limit iptables connection tracking to TCP protocol; allow connection tracking for all protocols. This allows services like NTP, which is UDP-based, to receive replies from an NTP server even if the port is blocked, as long as it is in reply to a request sent from the device itself.	2020-06-20 08:13:11 -07:00
Ying Xie	aecebac86b	[ntp] disable ntp long jump (#4748 ) Found another syncd timing issue related to clock going backwards. To be safe disable the ntp long jump. Signed-off-by: Ying Xie <ying.xie@microsoft.com>	2020-06-16 08:15:00 -07:00
Joe LeVeque	ed0e6aed1c	[hostcfgd] Get service enable/disable feature working (#4676 ) Fix hostcfgd so that changes to the "FEATURE" table in ConfigDB are properly handled. Three changes here: 1. Fix indenting such that the handling of each key actually occurs in the for key in status_data.keys(): loop 2. Add calls to sudo systemctl mask and sudo systemctl unmask as appropriate to ensure changes persist across reboots 3. Substitute returns with continues so that even if one service fails, we still try to handle the others Note that the masking is persistent, even if the configuration is not saved. We may want to consider only calling systemctl enable/disable in hostcfgd when the DB table changes, and only call systemctl mask/unmask upon calling config save.	2020-06-16 08:13:32 -07:00
Olivier Singla	18bbbb3c02	[baseimage]: Run fsck filesystem check support prior mounting filesystem (#4431 ) * Run fsck filesystem check support prior mounting filesystem If the filesystem becomes non-clean ("dirty"), SONiC does not run fsck to repair it and mark it as clean again. This patch adds the functionality to run fsck on each boot, prior to the filesystem being mounted. This allows the filesystem to be repaired if needed. Note that if the filesystem is marked as clean, fsck does nothing and simply returns, so it is perfectly fine to call fsck every time prior to mounting the filesystem. How to verify this patch (using bash): Using an image without this patch: Make the filesystem "dirty" (not clean) [we are making the assumption that the filesystem is stored in /dev/sda3 - please adjust depending on the platform] [do this only on a test platform!] dd if=/dev/sda3 of=superblock bs=1 count=2048 printf "$(printf '\\x%02X' 2)" | dd of="superblock" bs=1 seek=1082 count=1 conv=notrunc &> /dev/null dd of=/dev/sda3 if=superblock bs=1 count=2048 Verify that the filesystem is not clean: tune2fs -l /dev/sda3 | grep "Filesystem state:" Reboot and verify that the filesystem is still not clean. Redo the same test with an image with this patch, and verify that at next reboot the filesystem is repaired and becomes clean. The fsck log is stored in syslog, using the string FSCK as markup.	2020-06-16 08:12:11 -07:00
Joe LeVeque	913d380f6b	[caclmgrd] Get first VLAN host IP address via next() (#4685 ) I found that with IPv4Network types, calling list(ip_ntwrk.hosts()) is reliable. However, when doing the same with an IPv6Network, I found that the conversion to a list can hang indefinitely. This appears to me to be a bug in the ipaddress.IPv6Network implementation. However, I could not find any other reports on the web. This patch changes the behavior to call next() on the ip_ntwrk.hosts() generator instead, which returns the IP address of the first host.	2020-06-03 15:38:11 -07:00
Joe LeVeque	f2c0ed8e21	[caclmgrd] Allow more ICMP types (#4625 )	2020-06-03 15:35:49 -07:00
Joe LeVeque	1e59be8941	[caclmgrd] Ignore keys in interface-related tables if no IP prefix is present (#4581 ) Since the introduction of VRF, interface-related tables in ConfigDB will have multiple entries, one of which only contains the interface name and no IP prefix. Thus, when iterating over the keys in the tables, we need to ignore the entries which do not contain IP prefixes.	2020-06-03 15:35:10 -07:00
Joe LeVeque	ac957a0c7a	[caclmgrd] Add some default ACCEPT rules and lastly drop all incoming packets (#4412 ) Modified caclmgrd behavior to enhance control plane security as follows: Upon starting or receiving notification of ACL table/rule changes in Config DB: 1. Add iptables/ip6tables commands to allow all incoming packets from established TCP sessions or new TCP sessions which are related to established TCP sessions 2. Add iptables/ip6tables commands to allow bidirectional ICMPv4 ping and traceroute 3. Add iptables/ip6tables commands to allow bidirectional ICMPv6 ping and traceroute 4. Add iptables/ip6tables commands to allow all incoming Neighbor Discovery Protocol (NDP) NS/NA/RS/RA messages 5. Add iptables/ip6tables commands to allow all incoming IPv4 DHCP packets 6. Add iptables/ip6tables commands to allow all incoming IPv6 DHCP packets 7. Add iptables/ip6tables commands to allow all incoming BGP traffic 8. Add iptables/ip6tables commands for all ACL rules for recognized services (currently SSH, SNMP, NTP) 9. For all services which we did not find configured ACL rules, add iptables/ip6tables commands to allow all incoming packets for those services (allows the device to accept SSH connections before the device is configured) 10. Add iptables rules to drop all packets destined for loopback interface IP addresses 11. Add iptables rules to drop all packets destined for management interface IP addresses 12. Add iptables rules to drop all packets destined for point-to-point interface IP addresses 13. Add iptables rules to drop all packets destined for our VLAN interface gateway IP addresses 14. Add iptables/ip6tables commands to allow all incoming packets with TTL of 0 or 1 (This allows the device to respond to tools like tcptraceroute) 15. If we found control plane ACLs in the configuration and applied them, we lastly add iptables/ip6tables commands to drop all other incoming packets	2020-06-03 09:41:52 -07:00
Ying Xie	14b3f0022b	[ntp] enable/disable NTP long jump according to reboot type (#4577 ) * [ntp] enable/disable NTP long jump according to reboot type - Enable NTP long jump after cold reboot. - Disable NTP long jump after warm/fast reboot. Signed-off-by: Ying Xie <ying.xie@microsoft.com> * fix typo * further refactoring * use sonic-db-cli instead	2020-05-20 22:44:14 -07:00
abdosi	bb60e2b670	Changes to support config-setup service for multi-npu (#4609 ) * Changes to support config-setup service for multi-npu platforms. For multi-npu we do not support config initialization and ZTP as of now. It will support creating config db from minigraph or using config db from previous file system Signed-off-by: Abhishek Dosi <abdosi@microsoft.com> * Address Review Comments. * Address Review comments * Address Review Comments of using Python-based config load_minigraph/config save/config reload from shell scripts so that we don't duplicate code. Also while running from shell we will skip stop/start services done by those commands. * Updated to use python command so no code duplication.	2020-05-20 22:44:14 -07:00
abdosi	508f6bfa02	Fix for issue where image is compiled with flag ENABLE_DHCP_GRAPH_SERVICE (#4573 ) and then, when we load the image and reboot, we look for the DHCP service even if there is an existing config_db.json. We should disable update_graph in such cases. This behaviour is similar to what we have in the 201811 image.	2020-05-20 07:53:23 -07:00
Santhosh Kumar T	1e3df476e5	[DellEMC] S6100 Last Reboot Reason Thermal Support (#3767 )	2020-05-09 18:37:31 -07:00
wangshengjun	18e51088a0	[ebtables]add the filter rule for ARP packets with vlan tag: (#3945 ) 1. ebtables -t filter -A FORWARD -p 802_1Q --vlan-encap 0806 -j DROP The ARP packet with vlan tag can't match the default rule. Signed-off-by: wangshengjun <wangshengjun@asterfusion.com>	2020-05-09 18:36:36 -07:00
Joe LeVeque	9bdd2ef014	[process-reboot-cause] If software reboot cause is unknown add note if first boot into new image (#4538 )	2020-05-09 18:17:31 -07:00
Dong Zhang	3faa4e936e	[MultiDB] use sonic-db-cli PING and fix wrong multiDB API in NAT (#4541 )	2020-05-09 18:16:48 -07:00
pavel-shirshov	2f44bcd071	[bgpcfgd]: Split one bgp mega-template to chunks. (#4143 ) The one big bgp configuration template was split into chunks. Currently we have three types of bgp neighbor peers: general bgp peers. They are represented by CONFIG_DB::BGP_NEIGHBOR table entries dynamic bgp peers. They are represented by CONFIG_DB::BGP_PEER_RANGE table entries monitors bgp peers. They are represented by CONFIG_DB::BGP_MONITORS table entries This PR introduces three templates for each peer type: bgp policies: represent policies that will be applied to the bgp peer-group (ip prefix-lists, route-maps, etc) bgp peer-group: represent bgp peer group which has common configuration for the bgp peer type and uses bgp routing policy from the previous item bgp peer-group instance: represent bgp configuration, which will be used to instantiate a bgp peer-group for the bgp peer-type. Usually this one is simple, consisting of the reference to the bgp peer-group, bgp peer description and bgp peer ip address. This PR redefined constant.yml file. Now this file has a setting for whether or not to use bgp_neighbor metadata. This file has more parameters for now, which are not used. They will be used in the next iteration of bgpcfgd. Currently all tests have been disabled. I'm going to create next PR with the tests right after this PR is merged. I'm going to introduce better bgpcfgd in a short time. It will include support of dynamic changes for the templates. FIX:: #4231	2020-04-25 09:41:28 +00:00
Renuka Manavalan	9b017a83b5	[baseimage]: Install Kubernetes packages if enabled in image (#4374 ) (#4432 ) Install kubeadm, which transparently installs kubelet & kubectl As well download required Kubernetes images required to run as kubernetes node. The kubelet service is intentionally kept in disabled state, as it would otherwise continuously restart wasting resources, until join to master.	2020-04-16 21:54:45 -07:00
SuvarnaMeenakshi	0099305475	Multi-ASIC implementation (#3888 ) Changes made to support multi-asic platform. Added multi-instance support for swss, syncd, database, bgp, teamd and lldp.	2020-04-15 13:08:34 -07:00
rajendra-dendukuri	a97b73e79c	Fix typo in config-setup service (#4388 )	2020-04-10 21:23:07 -07:00
Abhishek Dosi	249265ad99	Revert "Multi-ASIC implementation (#3888 )" This reverts commit `2e87a16941`.	2020-04-03 14:34:38 -07:00
SuvarnaMeenakshi	2e87a16941	Multi-ASIC implementation (#3888 ) Changes made to support multi-asic platform. Added multi-instance support for swss, syncd, database, bgp, teamd and lldp.	2020-04-01 23:21:49 -07:00
Garrick He	a059d7ec0e	[procdockerstatsd] Fix CMD field in dB (#4335 ) * Fix the CMD for the PROCESSSTATS entries so that there is a space between the command name and the arguments. Signed-off-by: Garrick He <garrick_he@dell.com>	2020-03-29 22:47:05 -07:00
Stepan Blyshchak	ee84dca683	[docker_image_ctl.j2] Share UTS namespace with host OS (#4169 ) Instead of updating hostname manually on Config DB hostname change, simply share the container's UTS namespace with host OS. Ideally, instead of setting `--uts=host` for every container in SONiC, this setting can be set per container if feature requires. One behaviour change is introduced in this commit, when `--privileged` or `--cap-add=CAP_SYS_ADMIN` and `--uts=host` are combined, container has privilege to change host OS and every other container hostname. Such privilege should be fixed by limiting container capabilities. Signed-off-by: Stepan Blyschak <stepanb@mellanox.com>	2020-03-22 23:04:02 -07:00
SuvarnaMeenakshi	7b4b1245bd	[ntp]: Add "tinker panic 0" in ntp.conf to avoid ntpd from panic (#4263 ) - What I did Add configuration to avoid ntpd from panic and exit if the drift between new time and current system time is large. - How I did it Added "tinker panic 0" in ntp.conf file. - How to verify it [this assumes that there is a valid NTP server IP in config_db/ntp.conf] Change the current system time to a bad time with a large drift from time in ntp server; drift should be greater than 1000s. Reboot the device. Before the fix: 3. upon reboot, ntp-config service comes up fine, ntp service goes to active(exited) state without any error message. This is because the offset between new time (from ntp server) and the current system time is very large, ntpd goes to panic mode and exits. The system continues to show the bad time. After the fix: 3. Upon reboot, ntp-config comes up fine, ntp service comes up fine and stays in active (running) state. The system clock gets synced with the ntp server time.	2020-03-22 23:00:40 -07:00
yozhao101	358570324b	[Monit] Delay start of monitoring for 5 minutes (#4281 )	2020-03-22 22:58:57 -07:00
Abhishek Dosi	cc2d497aa4	Fixing Bad Cherry-pick	2020-03-04 10:46:45 -08:00
rajendra-dendukuri	8581a52571	ZTP infrastructure changes to support DHCP discovery provisioning data (#3298 ) * ZTP infrastructure changes to support DHCP discovery provisioning data - Dynamically generate DHCP client configuration based on current ZTP state - Added support to request and process hostname when using DHCPv6 - Do not process graphservice url dhcp option if ZTP is enabled, ZTP service will process it - Generate /e/n/i file with all active interfaces seeking address assignment via DHCP. Only interfaces that are created in Linux will be added to /e/n/i. Also DHCP is started only on linked up in-band interfaces. Signed-off-by: Rajendra Dendukuri <rajendra.dendukuri@broadcom.com>	2020-03-03 22:23:59 -08:00
Joe LeVeque	f6d69aed49	[interfaces-config.sh] Do not bring 'lo' interface down and up (#4150 )	2020-02-24 10:23:35 -08:00
pra-moh	c70a7b877d	[procdockerstatsd] Fix incorrect case issue in service file (#4134 )	2020-02-13 16:06:30 -08:00
Stephen Sun	6143fdd54d	[process-reboot-cause]Clean up the process-reboot-cause as required in issue 3927 (#4128 )	2020-02-13 16:05:55 -08:00
pra-moh	e1946432ff	[procdockerstats]: Update file permission for procdockerstatsd (#4126 )	2020-02-13 16:05:36 -08:00
kannankvs	74ac9b02dc	modified down rules to pre-down rules to ensure that default route is… (#3853 ) * modified down rules to pre-down rules to ensure that default route is deleted just before interface is made down	2020-02-13 16:01:21 -08:00
kannankvs	a836ead688	mvrf_avoid_snmp_yml_config: made changes to pass SNMP config from con… (#4057 ) * mvrf_avoid_snmp_yml_config: made changes to pass SNMP config from ConfigDB to snmpd.conf without using snmp.yml * added a missing if condition	2020-02-03 15:38:38 -08:00
pra-moh	8e4a4caf79	[baseimage]: removing space from shebang in procdockerstatsd (#4051 )	2020-02-03 15:37:47 -08:00
Dong Zhang	42bffc1215	[MultiDB] (except ./src and ./dockers dirs): replace redis-cli with sonic-db-cli and use new DBConnector (#4035 ) * [MultiDB] (except ./src and ./dockers dirs): replace redis-cli with sonic-db-cli and use new DBConnector * update comment for a potential bug * update comment * add TODO marker as review requirement	2020-02-03 15:36:55 -08:00
Howard Persh	cc825ff2fe	[startup] Fixes issue with /var/platform directory not created (#4000 )	2020-02-03 15:34:34 -08:00
SuvarnaMeenakshi	abe7ef7e2e	[baseimage]: support building multi-asic component (#3856 ) - move single instance services into their own folder - generate Systemd templates for any multi-instance service files in slave.mk - detect single or multi-instance platform in systemd-sonic-generator based on asic.conf platform specific file. - update container hostname after creation instead of during creation (docker_image_ctl) - run Docker containers in a network namespace if specified - add a service to create a simulated multi-ASIC topology on the virtual switch platform Signed-off-by: Lawrence Lee <t-lale@microsoft.com> Signed-off-by: Suvarna Meenakshi <Suvarna.Meenaksh@microsoft.com>	2020-02-03 15:32:21 -08:00
Kiran Kumar Kella	a943e6ce45	Changes in sonic-buildimage to support the NAT feature (#3494 ) * Changes in sonic-buildimage for the NAT feature - Docker for NAT - installing the required tools iptables and conntrack for nat Signed-off-by: kiran.kella@broadcom.com * Add redis-tools dependencies in the docker nat compilation * Addressed review comments * add natsyncd to warm-boot finalizer list * addressed review comments * using swsscommon.DBConnector instead of swsssdk.SonicV2Connector * Enable NAT application in docker-sonic-vs	2020-02-03 15:30:39 -08:00
Joe LeVeque	ccdc097a8f	[caclmgrd] Fix application of IPv6 service ACL rules (part 2) (#4036 )	2020-01-21 10:53:16 -08:00
Sujin Kang	9deb8c15f3	[reboot cause]: Delay process-reboot-cause service until network connection is stable (#4003 )	2020-01-21 10:47:13 -08:00
yozhao101	82c2eee1e6	[Monit] Change the monitoring period from 120 seconds to 60 seconds. (#3974 ) * [Monit] Change the monitoring period of monit from 120 seconds to 60 seconds and also at the same time double the interval for existing sonic monit config file in host. Signed-off-by: Yong Zhao <yozhao@microsoft.com>	2020-01-21 10:44:36 -08:00
rajendra-dendukuri	bb34edf1af	[config-setup]: create a SONiC configuration management service (#3227 ) * Create a SONiC configuration management service * Perform config db migration after loading config_db.json to redis DB * Migrate config-setup post migration hooks on image upgrade config-setup post migration hooks help user to migrate configurations from old image to new image. If the installed hooks are user defined they will not be part of the newly installed image. So these hooks have to be migrated to new image and only then they can be executed when the new image is booting. The changes in this fix migrate config-setup post-migration hooks and ensure that any hooks with the same filename in newly installed image are not overwritten. It is expected that users install new hooks as per their requirement and not edit existing hooks. Any changes to existing hooks need to be done as part of new image and not post bootup.	2020-01-21 10:39:19 -08:00
Prabhu Sreenivasan	7ec2732387	SONiC Management Framework Release 1.0 (#3488 ) * Added sonic-mgmt-framework as submodule / docker * fix build issues * update sonic-mgmt-framework submodule branch to master * Merged changes 70007e6d2ba3a4c0b371cd693ccc63e0a8906e77..00d4fcfed6a759e40d7b92120ea0ee1f08300fc6 00d4fcfed6a759e40d7b92120ea0ee1f08300fc6 Modified environemnt variables * Changes to build sonic-mgmt-framework docker * bumped up sonic-mgmt-framework commit-id * version bump for sonic-mgmt-framework commit-it * bumped up sonic-mgmt-framework commit-id * Add python packages to docker * Build fix for docker with python packages * added libyang as dependent package * Allow building images on NFS-mounted clones Prior to this change, `build_debian.sh` would generate a Debian filesystem in `./fsroot`. This needs root permissions, and one of the tests that is performed is whether the user can create a character special file in the filesystem (using mknod). On most NFS deployments, `root` is the least privileged user, and cannot run mknod. Also, attempting to run commands like rm or mv as root would fail due to permission errors, since the root user gets mapped to an unprivileged user like `nobody`. This commit changes the location of the Debian filesystem to `/fsroot`, which is a tmpfs mount within the slave Docker. The default squashfs, docker tarball and zip files are also created within /tmp, before being copied back to /sonic as the regular user. The side effect of this change is that the contents of `/fsroot` are no longer available once the slave container exits, however they are available within the squashfs image. Signed-off-by: Nirenjan Krishnan <Nirenjan.Krishnan@dell.com> * bumped up sonc-mgmt-framework commit to include PR #18 * REST Server startup script is enahnced to read the settings from ConfigDB. Below table provides mapping of db field to command line argument name. 
============================================================ ConfigDB entry key Field name REST Server argument ============================================================ REST_SERVER\|default port -port REST_SERVER\|default client_auth -client_auth REST_SERVER\|default log_level -v DEVICE_METADATA\|x509 server_crt -cert DEVICE_METADATA\|x509 server_key -key DEVICE_METADATA\|x509 ca_crt -cacert ============================================================ * Replace src/telemetry as submodule to sonic-telemetry * Update telemetry commit HEAD * Update sonic-telemetry commit HEAD * libyang env path update * Add libyang dependency to telemetry * Add scripts to create JSON files for CLI backend Scripts to create /var/platform/syseeprom and /var/platform/system, which are back-end files for CLI, for system EEPROM and system information. Signed-off-by: Howard Persh <Howard_Persh@dell.com> * In startup script, create directory where CLI back-end files live Signed-off-by: Howard Persh <Howard_Persh@dell.com> * build dependency pkgs added to docker for build failure fix * Changes to fix build issue for mgmt framework * Fix exec path issue with telemetry * s5232[device] PSU detecttion and default led state support * Processing of first boot in rc.local should not have premature exit Signed-off-by: Howard Persh <Howard_Persh@dell.com> * docker mount options added for platform, system features * bumped up sonic-mgmt-framework commit id to pick 23rd July 2019 changes * Added mount options for telemetry docker to get access for system and platform info. * Update commit for sonic-utilities * [dell]: Corrected dport map and renamed config files for S5232F * Fix telemetry submodule commit * added support for sonic-cli console * [Dell S5232F, Z9264F] Harden FPGA driver kernel module For Dell S5232F and Z9264F platforms, be more strict when checking state in ISR of FPGA driver, to harden against spurious interrupts. 
Signed-off-by: Howard Persh <Howard_Persh@dell.com> * update mgmt-framework submodule to 27th Aug commit. * remove changes not related to mgmt-framework and sonic-telemetry * Revert "Replace src/telemetry as submodule to sonic-telemetry" This reverts commit `11c3192975`. * Revert "Replace src/telemetry as submodule to sonic-telemetry" This reverts commit `11c3192975`. * make submodule changes and remove a change not related to PR * more changes * Update .gitmodules * Update Dockerfile.j2 * Update .gitmodules * Update .gitmodules * Update .gitmodules reverting experimental change * Removed syspoll for release_1.0 Signed-off-by: Jeff Yin <29264773+jeff-yin@users.noreply.github.com> * Update docker-sonic-mgmt-framework.mk * Update sonic-mgmt-framework.mk * Update sonic-mgmt-framework.mk * Update docker-sonic-mgmt-framework.mk * Update docker-sonic-mgmt-framework.mk * Revert "Processing of first boot in rc.local should not have premature exit" This reverts commit `e99a91ffc2`. * Remove old telemetry directory * Update docker-sonic-mgmt-framework.mk * Resolving merge conflict with Azure * Reverting the wrong merge * Use CVL_SCHEMA_PATH instead of changing directory for telemetry startup * Add missing export * Add python mmh3 to slave dockerfile * Remove sonic-mgmt-framework build dep for telemetry, fix dialout startup issues * Provided flag to disable compiling mgmt-framework * Update sonic-utilites point latest commit id * Point sonic-utilities to Azure accepted SHA * Updating mgmt framework to right sha * Add sonic-telemetry submodule * Update the mgmt-framework commit id Co-authored-by: jghalam <joe.ghalam@gmail.com> Co-authored-by: Partha Dutta <51353699+dutta-partha@users.noreply.github.com> Co-authored-by: srideepDell <srideep_devireddy@dell.com> Co-authored-by: nirenjan <nirenjan@users.noreply.github.com> Co-authored-by: Sachin Holla <51310506+sachinholla@users.noreply.github.com> Co-authored-by: Eric Seifert <seiferteric@gmail.com> Co-authored-by: Howard Persh 
<hpersh@yahoo.com> Co-authored-by: Jeff Yin <29264773+jeff-yin@users.noreply.github.com> Co-authored-by: Arunsundar Kannan <31632515+arunsundark@users.noreply.github.com> Co-authored-by: rvasanthm <51932293+rvasanthm@users.noreply.github.com> Co-authored-by: Ashok Daparthi-Dell <Ashok_Daparthi@Dell.com> Co-authored-by: anand-kumar-subramanian <51383315+anand-kumar-subramanian@users.noreply.github.com>	2020-01-08 15:51:02 -08:00
Joe LeVeque	5e07b252ff	[monit] Build from source and patch to use MemAvailable value if available on system (#3875 )	2020-01-06 11:41:20 -08:00
Joe LeVeque	f0b7dfad7c	[caclmgrd] Fix application of IPv6 service ACL rules (#3917 )	2019-12-31 14:42:49 -08:00
Renuka Manavalan	2d079a15dd	corefile uploader: Updates per review comments offline (#3915 ) * Updates per review comments 1) core_uploader service waits for syslog.service 2) core_uploader service enabled for restart on failure 3) Use mtime instead of file size + ample time to be robust. * Avoid reloading already uploaded file, by marking the names with a prefix. * Updated failing path. 1) If rc file is missing or required data missing, it periodically logs error in forever loop. 2) If upload fails, retry every hour with a error log, forever. * Fix few bugs * The binary update_json.py will come from sonic-utilities.	2019-12-31 14:42:01 -08:00
Ying Xie	759bde3a43	[hostcfgd] avoid in place editing config file contents (#3904 ) In place editing (sed -i) seems having some issues with filesystem interaction. It could leave 0 size file or corrupted file behind. It would be safer to sed the file contents into a new file and switch new file with the old file. Signed-off-by: Ying Xie <ying.xie@microsoft.com>	2019-12-18 11:20:25 -08:00
Renuka Manavalan	14f7b8da2d	Corefile uploader service (#3887 ) * Corefile uploader service 1) A service is added to watch /var/core and upload to Azure storage 2) The service is disabled on boot. One may enable explicitly. 3) The .rc file to be updated with acct credentials and http proxy to use. 4) If service is enabled with no credentials, it would sleep, with periodic log messages 5) For any update in .rc, the service has to be restarted to take effect. * Remove rw permission for .rc file for group & others. * Changes per review comments. Re-ordered .rc file per JSON.dump order. Added a script to enable partial update of .rc, which HWProxy would use to add acct key. * Azure storage upload requires python module futures, hence added it to install list. * Removed trailing spaces. * A mistake in name corrected. Copy the .rc updater script to /usr/bin.	2019-12-18 11:19:25 -08:00
Stephen Sun	ba4f0f30c8	[process-reboot-cause]Address the issue: Incorrect reboot cause returned when warm reboot follows a hardware caused reboot (#3880 ) * [process-reboot-cause]Address the issue: Incorrect reboot cause returned when warm reboot follows a hardware caused reboot 1. check whether /proc/cmdline indicates warm/fast reboot. if yes the software reboot cause file will be treated as the reboot cause. finish 2. check whether platform api returns a reboot cause. if yes it is treated as the reboot cause. finish. 3. check whether /hosts/reboot-cause contains a cause. if yes it is treated as the cause otherwise return unknown. * [process-reboot-cause]Fix review comments * [process-reboot-cause]address comments 1. use "with" statement 2. update fast/warm reboot BOOT_ARG * [process-reboot-cause]address comments * refactor the code flow * Remove escape * Remove extra ':'	2019-12-18 11:17:17 -08:00
pra-moh	bfa96bbce3	Add daemon which periodically pushes process and docker stats to State DB (#3525 )	2019-11-27 15:35:41 -08:00
pra-moh	d3a1555f30	[hostcfgd] Add support to enable/disable optional features (#3653 )	2019-11-26 14:11:12 -08:00
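
A note on 913d380f6b (#4685) above: the message describes replacing list(ip_ntwrk.hosts()) with next() on the hosts() generator. The following is a minimal sketch of that idea using only the standard ipaddress module; it is not the caclmgrd code itself. The helper name first_host and the sample prefixes are hypothetical, chosen for illustration.

```python
import ipaddress


def first_host(prefix):
    """Return the first usable host address of the network containing `prefix`.

    Hypothetical helper for illustration only; not taken from caclmgrd.
    """
    # strict=False lets us pass an interface-style prefix such as 192.168.0.1/24
    ip_ntwrk = ipaddress.ip_network(prefix, strict=False)
    # hosts() is lazy for ordinary prefixes, so next() only produces one address.
    # Materialising it with list() on an IPv6 /64 would try to build 2**64 entries.
    # iter() is a defensive wrapper in case hosts() hands back a list rather than
    # a generator (assumed possible for single-address prefixes).
    return next(iter(ip_ntwrk.hosts()))


if __name__ == "__main__":
    print(first_host("192.168.0.1/24"))
    print(first_host("fc02:1000::1/64"))
```

Taking one element from the generator keeps the cost constant regardless of prefix length, which is consistent with the IPv6 behaviour reported in the commit message.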

338 Commits