Added source interface support for NTP.
Also made NTP start on Mgmt-VRF by default when configured.
**- How I did it**
1) Updated hostcfg to listen to global config NTP and NTP_SERVER tables and restart ntp when ever the configuration changes. NTP table includes source interface configuration.
2) The ntp script updated to by default start on Mgmt-VFT when configured.
Signed-off-by: Prabhu Sreenivasan <prabhu.sreenivasan@broadcom>
Certain platform specific packages sonic-platform-xyz, installs files onto rootfs, which would be placed on read-write mount path on /host/image-name/rw/...
when ntpd starts it tries to do read access on /usr/bin /usr/sbin/ /usr/local/bin , which inturn links further to the read-write mount path also.
Where ntpd would get below Apparmor Warning message
LOG:-
audit: type=1400 audit(1606226503.240:21): apparmor="DENIED" operation="open" profile="/usr/sbin/ntpd" name="/image-HEAD-dirty-20201111.173951/rw/usr/local/bin/" pid=3733 comm="ntpd" requested_mask="r" denied_mask="r" fsuid=0 ouid=0
audit: type=1400 audit(1606226503.240:22): apparmor="DENIED" operation="open" profile="/usr/sbin/ntpd" name="/image-HEAD-dirty-20201111.173951/rw/usr/sbin/" pid=3733 comm="ntpd" requested_mask="r" denied_mask="r" fsuid=0 ouid=0
audit: type=1400 audit(1606226503.240:23): apparmor="DENIED" operation="open" profile="/usr/sbin/ntpd" name="/image-HEAD-dirty-20201111.173951/rw/usr/bin/" pid=3733 comm="ntpd" requested_mask="r" denied_mask="r" fsuid=0 ouid=0
Fix:
Add rw/.. mount path similar to root path access provided for ntpd in /etc/apparmor.d/usr.sbin.ntpd
Signed-off-by: Antony Rheneus <arheneus@marvell.com>
Create new file to "sysctl.d" with desired panic conditions.
It will trigger a vmcore dump using kdump-tools on these situations.
Signed-off-by: Shlomi Bitton <shlomibi@nvidia.com>
The default /etc/default/kdump-tools file provided by the kdump-tools
package doesn't set a value for KDUMP_CMDLINE_APPEND.
The default kdump command line arguments need to be set in order
to extend them to use additional arguments required for SONiC
platforms.
Signed-off-by: Rajendra Dendukuri <rajendra.dendukuri@broadcom.com>
- Why I did it
Move frr logs from syslog from the directory /var/log/quagga/.log to /var/log/frr/log
- How I did it
Updated the rsyslog config files.
- How to verify it
Verified the logs come into the file zebra.log and bgpd.log in the DIR /var/log/frr/log
- Allow platform specific reboot script to be called after crash kernel has
finished copying the kernel vmcore
- Disable pcie advanced features when running crash kernel. This improves
reliability of the crash kernel to successfully create a vmcore and also
reboot
- Allow crash kernel to reboot if a panic is seen while it is generating a
vmcore
- Fix crash kernel to use the SONiC specific /usr/local/bin/reboot script
instead of the Linux reboot command /sbin/reboot
- Use sonic_platform as the kernel command line parameter to pass platform identifier string
Signed-off-by: Rajendra Dendukuri <rajendra.dendukuri@broadcom.com>
Make sure ntp-config service is executed before ntpd
Updated ntp-config service files to force dependency with ntp service. Also resolved circular dependency with --no-block flag. (needed as ntp-config service internally invokes systemd to restart ntp which in turn waits for ntp-config to complete)
Signed-off-by: Prabhu Sreenivasan <prabhu.sreenivasan@broadcom.com>
**- Why I did it**
Align style with slightly modified PEP8 standards (extend maximum line length to 120 chars). This will also help in the transition to Python 3, where it is more strict about whitespace, plus it helps unify style among the SONiC codebase. Will tackle other directories in separate PRs.
**- How I did it**
Using `autopep8 --in-place --max-line-length 120` and some manual tweaks.
* [bash.bashrc] Add reverse SSH script to bash.bashrc
* Fix command issue and add emptt line before EOF
* Add checks for SSH_TARGET_CONSOLE_LINE
Signed-off-by: Jing Kan jika@microsoft.com
- Why I did it
Add reboot history to State db so that can be used telemetry service
- How I did it
Split the process-reboot-cause service to determine-reboot-cause and process-reboot-cause
determine-reboot-cause to determine the reboot cause
process-reboot-cause to parse the reboot cause files and put the reboot history to state db
Moved to sonic-host-service* packages
- How to verify it
Performed unit test and tested on DUT
Fix 259 alerts reported by the LGTM tool:
- 245 for Unused import
- 7 for Testing equality to None
- 5 for Duplicate key in dict literal
- 1 for Module is imported more than once
- 1 for Unused local variable
- Why I did it
There is a issue for counters after warm-reboot:
If I clear counters by command "sonic-clear counters", then execute 'warm-reboot' and whenSONiC is restart, the counters showed with command "show interface counters" is still old counters before "sonic-clear". It is not the right counters because the counters file in '/tmp' is lost in warm-reboot process.
- How I did it
I fixed it by saving '/tmp/portstat-0' folders in '/host/' before executing 'warm-reboot' (in pull request Azure/sonic-utilities#1099 ), and restore the counters folders back to '/tmp/' after warm-reboot process is finished.
- How to verify it
Clear counters by command 'sonic-clear'
sonic-clear counters
sonic-clear dropcounters
sonic-clear pfccounters
sonic-clear queuecounters
sonic-clear rifcounters
Execute 'warm-reboot'
Use command ‘show interface counters’ to see if the counters is right.
* Add explicit default state into the constants.yml
* Enable/disable only peer-groups, available in the config
* Retrieve updates from frr before using configuration
Co-authored-by: Pavel Shirshov <pavel.contrib@gmail.com>
When forced mgmt routes are present, the issue fixed as part of #5754 is not complete.
Added a preference(priority) field to forced mgmt route ip rules
In multi asic platforms the "show ip bgp summary" commands is not available for user with read only privileges, so to fix this the vtysh command with the new "-n" option, added for multi asic platforms, needs to be added to the READ_ONLY_COMMANDS list in the sudoers files. Added the command vtysh -n [0-9] -c show * to list of READ_ONLY_COMMANDS in the sudoers files in this commit.
Signed-off-by: Arvindsrinivasan Lakshmi Narasimhan <arlakshm@microsoft.com>
To consolidate host services and install via packages instead of file-by-file, also as part of migrating all of SONiC to Python 3, as Python 2 is no longer supported.
Our use case is to register new features in runtime. The previous change which introduced the cache broke this capability and caused hostcfgd crash.
Signed-off-by: Stepan Blyshchak <stepanb@nvidia.com>
- Convert core_uploader.py script to Python 3
- Use logger from sonic-py-common for uniform logging
- Reorganize imports alphabetically per PEP8 standard
- Two blank lines precede functions per PEP8 standard
- Remove unnecessary global variable declarations
* This was a temporary fix for orchagent spamming log messages and causing rate limiting, leading to critical messages being dropped for the syslog. No longer needed since Azure/sonic-sairedis#680 was merged.
FixAzure/SONiC#551
When eth0 IP address is configured, an ip rule is getting added for eth0 IP address through the interfaces.j2 template.
This eth0 ip rule creates an issue when VRF (data VRF or management VRF) is also created in the system.
When any VRF (data VRF or management VRF) is created, a new rule is getting added automatically by kernel as "1000: from all lookup [l3mdev-table]".
This l3mdev IP rule is never getting deleted even if VRF is deleted.
Once if this l3mdev IP rule is added, if user configures IP address for the eth0 interface, interfaces.j2 adds an eth0 IP rule as "1000:from 100.104.47.74 lookup default ". Priority 1000 is automatically chosen by kernel and hence this rule gets higher priority than the already existing rule "1001:from all lookup local ".
This results in an issue "ping from console to eth0 IP does not work once if VRF is created" as explained in Issue 551.
More details and possible solutions are explained as comments in the Issue551.
This PR is to resolve the issue by always fixing the low priority 32765 for the IP rule that is created for the eth0 IP address.
Tested with various combinations of VRF creation, deletion and IP address configuration along with ping from console to eth0 IP address.
Co-authored-by: Kannan KVS <kannan_kvs@dell.com>
Why/How I did:
Make sure first error syslog is triggered based on FAULT TOLERANCE condition.
Added support of repeat clause with alert action. This is used as trigger
for generation of periodic syslog error messages if error is persistent
Updated the monit conf files with repeat every x cycles for the alert action
- Why I did it
The update_all_feature_states can run in the range of 20+ seconds to one minute. With load of AAA & Tacacs preceding it, any DB updates in AAA/TACACS during the long running feature updates would get missed. To avoid, switch the order.
- How I did it
Do a load after after updating all feature states.
- How to verify it
Not a easy one
Have a script that
restart hostcfgd
sleep 2s
run redis-cli/config command to update AAA/TACACS table
Run the script above and watch the file /etc/pam.d/common-auth-sonic for a minute.
- When it repro:
The updates will not reflect in /etc/pam.d/common-auth-sonic
To consolidate host services and install via packages instead of file-by-file, also as part of migrating all of SONiC to Python 3, as Python 2 is no longer supported, convert caclmgrd to Python 3 and add to sonic-host-services package
* Initial commit for BGP internal neighbor table support.
> Add new template named "internal" for the internal BGP sessions
> Add a new table in database "BGP_INTERNAL_NEIGHBOR"
> The internal BGP sessions will be stored in this new table "BGP_INTERNAL_NEIGHBOR"
* Changes in template generation tests with the introduction of internal neighbor template files.
The psutil library used in process_checker create a cache for each
process when calling process_iter. So, there is some possibility that
one process exists when calling process_iter, but not exists when
calling cmdline, which will raise a NoSuchProcess exception. This commit
fix the issue.
Signed-off-by: bingwang <bingwang@microsoft.com>
**- Why I did it**
Install all host services and their data files in package format rather than file-by-file
**- How I did it**
- Create sonic-host-services Python wheel package, currently including procdockerstatsd
- Also add the framework for unit tests by adding one simple procdockerstatsd test case
- Create sonic-host-services-data Debian package which is responsible for installing the related systemd unit files to control the services in the Python wheel. This package will also be responsible for installing any Jinja2 templates and other data files needed by the host services.
**- Why I did it**
On teamd docker restart, the swss and syncd needs to be restarted as there are dependent resources present.
**- How I did it**
Add the teamd as a dependent service for swss
Updated the docker-wait script to handle service and dependent services separately.
Handle the case of warm-restart for the dependent service
**- How to verify it**
Verified the following scenario's with the following testbed
VM1 ----------------------------[DUT 6100] -----------------------VM2, ping traffic continuous between VMs
1. Stop teamd docker alone
> swss, syncd dockers seen going away
> The LAG reference count error messages seen for a while till swss docker stops.
> Dockers back up.
2. Enable WR mode for teamd. Stop teamd docker alone
> swss, syncd dockers not removed.
> The LAG reference count error messages not seen
> Repeated stop teamd docker test - same result, no effect on swss/syncd.
3. Stop swss docker.
> swss, teamd, syncd goes off - dockers comes back correctly, interfaces up
4. Enable WR mode for swss . Stop swss docker
> swss goes off not affecting syncd/teamd dockers.
5. Config reload
> no reference counter error seen, dockers comes back correctly, with interfaces up
6. Warm reboot, observations below
> swss docker goes off first
> teamd + syncd goes off to the end of WR process.
> dockers comes back up fine.
> ping traffic between VM's was NOT HIT
7. Fast reboot, observations below
> teamd goes off first ( **confirmed swss don't exit here** )
> swss goes off next
> syncd goes away at the end of the FR process
> dockers comes back up fine.
> there is a traffic HIT as per fast-reboot
8. Verified in multi-asic platform, the tests above other than WR/FB scenarios
**- Why I did it**
If we ran the CLI commands `sudo config feature autorestart snmp disabled/enabled` or `sudo config feature autorestart swss disabled/enabled`, then SNMP container will be stopped and started. This behavior was not expected since we updated the `auto_restart` field not update `state` field in `FEATURE` table. The reason behind this issue is that either `state` field or `auto_restart` field was updated, the function `update_feature_state(...)` will be invoked which then starts snmp.timer service.
The snmp.timer service will first stop snmp.service and later start snmp.service.
In order to solve this issue, the function `update_feature_state(...)` will be only invoked if `state` field in `FEATURE` table was
updated.
**- How I did it**
When the demon `hostcfgd` was activated, all the values of `state` field in `FEATURE` table of each container will be
cached. Each time the function `feature_state_handler(...)` is invoked, it will determine whether the `state` field of a
container was changed or not. If it was changed, function `update_feature_state(...)` will be invoked and the cached
value will also be updated. Otherwise, nothing will be done.
**- How to verify it**
We can run the CLI commands `sudo config feature autorestart snmp disabled/enabled` or `sudo config feature autorestart swss disabled/enabled` to check whether SNMP container is stopped and started. We also can run the CLI commands `sudo config feature state snmp disabled/enabled` or `sudo config feature state swss disabled/enabled` to check whether the container is stopped and restarted.
Signed-off-by: Yong Zhao <yozhao@microsoft.com>
**- Why I did it**
To introduce dynamic support of BBR functionality into bgpcfgd.
BBR is adding `neighbor PEER_GROUP allowas-in 1' for all BGP peer-groups which points to T0
Now we can add and remove this configuration based on CONFIG_DB entry
**- How I did it**
I introduced a new CONFIG_DB entry:
- table name: "BGP_BBR"
- key value: "all". Currently only "all" is supported, which means that all peer-groups which points to T0s will be updated
- data value: a dictionary: {"status": "status_value"}, where status_value could be either "enabled" or "disabled"
Initially, when bgpcfgd starts, it reads initial BBR status values from the [constants.yml](https://github.com/Azure/sonic-buildimage/pull/5626/files#diff-e6f2fe13a6c276dc2f3b27a5bef79886f9c103194be4fcb28ce57375edf2c23cR34). Then you can control BBR status by changing "BGP_BBR" table in the CONFIG_DB (see examples below).
bgpcfgd knows what peer-groups to change fron [constants.yml](https://github.com/Azure/sonic-buildimage/pull/5626/files#diff-e6f2fe13a6c276dc2f3b27a5bef79886f9c103194be4fcb28ce57375edf2c23cR39). The dictionary contains peer-group names as keys, and a list of address-families as values. So when bgpcfgd got a request to change the BBR state, it changes the state only for peer-groups listed in the constants.yml dictionary (and only for address families from the peer-group value).
**- How to verify it**
Initially, when we start SONiC FRR has BBR enabled for PEER_V4 and PEER_V6:
```
admin@str-s6100-acs-1:~$ vtysh -c 'show run' | egrep 'PEER_V.? allowas'
neighbor PEER_V4 allowas-in 1
neighbor PEER_V6 allowas-in 1
```
Then we apply following configuration to the db:
```
admin@str-s6100-acs-1:~$ cat disable.json
{
"BGP_BBR": {
"all": {
"status": "disabled"
}
}
}
admin@str-s6100-acs-1:~$ sonic-cfggen -j disable.json -w
```
The log output are:
```
Oct 14 18:40:22.450322 str-s6100-acs-1 DEBUG bgp#bgpcfgd: Received message : '('all', 'SET', (('status', 'disabled'),))'
Oct 14 18:40:22.450620 str-s6100-acs-1 DEBUG bgp#bgpcfgd: execute command '['vtysh', '-f', '/tmp/tmpmWTiuq']'.
Oct 14 18:40:22.681084 str-s6100-acs-1 DEBUG bgp#bgpcfgd: execute command '['vtysh', '-c', 'clear bgp peer-group PEER_V4 soft in']'.
Oct 14 18:40:22.904626 str-s6100-acs-1 DEBUG bgp#bgpcfgd: execute command '['vtysh', '-c', 'clear bgp peer-group PEER_V6 soft in']'.
```
Check FRR configuraiton and see that no allowas parameters are there:
```
admin@str-s6100-acs-1:~$ vtysh -c 'show run' | egrep 'PEER_V.? allowas'
admin@str-s6100-acs-1:~$
```
Then we apply enabling configuration back:
```
admin@str-s6100-acs-1:~$ cat enable.json
{
"BGP_BBR": {
"all": {
"status": "enabled"
}
}
}
admin@str-s6100-acs-1:~$ sonic-cfggen -j enable.json -w
```
The log output:
```
Oct 14 18:40:41.074720 str-s6100-acs-1 DEBUG bgp#bgpcfgd: Received message : '('all', 'SET', (('status', 'enabled'),))'
Oct 14 18:40:41.074720 str-s6100-acs-1 DEBUG bgp#bgpcfgd: execute command '['vtysh', '-f', '/tmp/tmpDD6SKv']'.
Oct 14 18:40:41.587257 str-s6100-acs-1 DEBUG bgp#bgpcfgd: execute command '['vtysh', '-c', 'clear bgp peer-group PEER_V4 soft in']'.
Oct 14 18:40:42.042967 str-s6100-acs-1 DEBUG bgp#bgpcfgd: execute command '['vtysh', '-c', 'clear bgp peer-group PEER_V6 soft in']'.
```
Check FRR configuraiton and see that the BBR configuration is back:
```
admin@str-s6100-acs-1:~$ vtysh -c 'show run' | egrep 'PEER_V.? allowas'
neighbor PEER_V4 allowas-in 1
neighbor PEER_V6 allowas-in 1
```
*** The test coverage ***
Below is the test coverage
```
---------- coverage: platform linux2, python 2.7.12-final-0 ----------
Name Stmts Miss Cover
----------------------------------------------------
bgpcfgd/__init__.py 0 0 100%
bgpcfgd/__main__.py 3 3 0%
bgpcfgd/config.py 78 41 47%
bgpcfgd/directory.py 63 34 46%
bgpcfgd/log.py 15 3 80%
bgpcfgd/main.py 51 51 0%
bgpcfgd/manager.py 41 23 44%
bgpcfgd/managers_allow_list.py 385 21 95%
bgpcfgd/managers_bbr.py 76 0 100%
bgpcfgd/managers_bgp.py 193 193 0%
bgpcfgd/managers_db.py 9 9 0%
bgpcfgd/managers_intf.py 33 33 0%
bgpcfgd/managers_setsrc.py 45 45 0%
bgpcfgd/runner.py 39 39 0%
bgpcfgd/template.py 64 11 83%
bgpcfgd/utils.py 32 24 25%
bgpcfgd/vars.py 1 0 100%
----------------------------------------------------
TOTAL 1128 530 53%
```
**- Which release branch to backport (provide reason below if selected)**
- [ ] 201811
- [x] 201911
- [x] 202006
There is currently a bug where messages from swss with priority lower than the current log level are still being counted against the syslog rate limiting threshhold. This leads to rate-limiting in syslog when the rate-limiting conditions have not been met, which causes several sonic-mgmt tests to fail since they are dependent on LogAnalyzer. It also omits potentially useful information from the syslog. Only rate-limiting messages of level INFO and lower allows these tests to pass successfully.
Signed-off-by: Lawrence Lee <lawlee@microsoft.com>
**- Why I did it**
I was asked to change "Allow list" prefix-list generation rule.
Previously we generated the rules using following method:
```
For each {prefix}/{masklen} we would generate the prefix-rule
permit {prefix}/{masklen} ge {masklen}+1
Example:
Prefix 1.2.3.4/24 would have following prefix-list entry generated
permit 1.2.3.4/24 ge 23
```
But we discovered the old rule doesn't work for all cases we have.
So we introduced the new rule:
```
For ipv4 entry,
For mask < 32 , we will add ‘le 32’ to cover all prefix masks to be sent by T0
For mask =32 , we will not add any ‘le mask’
For ipv6 entry, we will add le 128 to cover all the prefix mask to be sent by T0
For mask < 128 , we will add ‘le 128’ to cover all prefix masks to be sent by T0
For mask = 128 , we will not add any ‘le mask’
```
**- How I did it**
I change prefix-list entry generation function. Also I introduced a test for the changed function.
**- How to verify it**
1. Build an image and put it on your dut.
2. Create a file test_schema.conf with the test configuration
```
{
"BGP_ALLOWED_PREFIXES": {
"DEPLOYMENT_ID|0|1010:1010": {
"prefixes_v4": [
"10.20.0.0/16",
"10.50.1.0/29"
],
"prefixes_v6": [
"fc01:10::/64",
"fc02:20::/64"
]
},
"DEPLOYMENT_ID|0": {
"prefixes_v4": [
"10.20.0.0/16",
"10.50.1.0/29"
],
"prefixes_v6": [
"fc01:10::/64",
"fc02:20::/64"
]
}
}
}
```
3. Apply the configuration by command
```
sonic-cfggen -j test_schema.conf --write-to-db
```
4. Check that your bgp configuration has following prefix-list entries:
```
admin@str-s6100-acs-1:~$ show runningconfiguration bgp | grep PL_ALLOW
ip prefix-list PL_ALLOW_LIST_DEPLOYMENT_ID_0_COMMUNITY_1010:1010_V4 seq 10 deny 0.0.0.0/0 le 17
ip prefix-list PL_ALLOW_LIST_DEPLOYMENT_ID_0_COMMUNITY_1010:1010_V4 seq 20 permit 127.0.0.1/32
ip prefix-list PL_ALLOW_LIST_DEPLOYMENT_ID_0_COMMUNITY_1010:1010_V4 seq 30 permit 10.20.0.0/16 le 32
ip prefix-list PL_ALLOW_LIST_DEPLOYMENT_ID_0_COMMUNITY_1010:1010_V4 seq 40 permit 10.50.1.0/29 le 32
ip prefix-list PL_ALLOW_LIST_DEPLOYMENT_ID_0_COMMUNITY_empty_V4 seq 10 deny 0.0.0.0/0 le 17
ip prefix-list PL_ALLOW_LIST_DEPLOYMENT_ID_0_COMMUNITY_empty_V4 seq 20 permit 127.0.0.1/32
ip prefix-list PL_ALLOW_LIST_DEPLOYMENT_ID_0_COMMUNITY_empty_V4 seq 30 permit 10.20.0.0/16 le 32
ip prefix-list PL_ALLOW_LIST_DEPLOYMENT_ID_0_COMMUNITY_empty_V4 seq 40 permit 10.50.1.0/29 le 32
ipv6 prefix-list PL_ALLOW_LIST_DEPLOYMENT_ID_0_COMMUNITY_1010:1010_V6 seq 10 deny ::/0 le 59
ipv6 prefix-list PL_ALLOW_LIST_DEPLOYMENT_ID_0_COMMUNITY_1010:1010_V6 seq 20 deny ::/0 ge 65
ipv6 prefix-list PL_ALLOW_LIST_DEPLOYMENT_ID_0_COMMUNITY_1010:1010_V6 seq 30 permit fc01:10::/64 le 128
ipv6 prefix-list PL_ALLOW_LIST_DEPLOYMENT_ID_0_COMMUNITY_1010:1010_V6 seq 40 permit fc02:20::/64 le 128
ipv6 prefix-list PL_ALLOW_LIST_DEPLOYMENT_ID_0_COMMUNITY_empty_V6 seq 10 deny ::/0 le 59
ipv6 prefix-list PL_ALLOW_LIST_DEPLOYMENT_ID_0_COMMUNITY_empty_V6 seq 20 deny ::/0 ge 65
ipv6 prefix-list PL_ALLOW_LIST_DEPLOYMENT_ID_0_COMMUNITY_empty_V6 seq 30 permit fc01:10::/64 le 128
ipv6 prefix-list PL_ALLOW_LIST_DEPLOYMENT_ID_0_COMMUNITY_empty_V6 seq 40 permit fc02:20::/64 le 128
```
Co-authored-by: Pavel Shirshov <pavel.contrib@gmail.com>
When a large number of changes occur to the ACL table of Config DB, caclmgrd will get flooded with notifications, and previously, it would regenerate and apply the iptables rules for each change, which is unnecessary, as the iptables rules should only get applied once after the last change notification is received. If the ACL table contains a large number of control plane ACL rules, this could cause a large delay in caclmgrd getting the rules applied.
This patch causes caclmgrd to delay updating the iptables rules until it has not received a change notification for at least 0.5 seconds.
bring up chassisdb service on sonic switch according to the design in
Distributed Forwarding in VoQ Arch HLD
Signed-off-by: Honggang Xu <hxu@arista.com>
**- Why I did it**
To bring up new ChassisDB service in sonic as designed in ['Distributed forwarding in a VOQ architecture HLD' ](90c1289eaf/doc/chassis/architecture.md).
**- How I did it**
Implement the section 2.3.1 Global DB Organization of the VOQ architecture HLD.
**- How to verify it**
ChassisDB service won't start without chassisdb.conf file on the existing platforms.
ChassisDB service is accessible with global.conf file in the distributed arichitecture.
Signed-off-by: Honggang Xu <hxu@arista.com>
* Optimze ACL Table/Rule notifcation handling
to loop pop() until empty to consume all the data in a batch
This wau we prevent multiple call to iptable updates
Signed-off-by: Abhishek Dosi <abdosi@microsoft.com>
* Address review comments
Signed-off-by: Abhishek Dosi <abdosi@microsoft.com>
* system health first commit
* system health daemon first commit
* Finish healthd
* Changes due to lower layer logic change
* Get ASIC temperature from TEMPERATURE_INFO table
* Add system health make rule and service files
* fix bugs found during manual test
* Change make file to install system-health library to host
* Set system LED to blink on bootup time
* Caught exceptions in system health checker to make it more robust
* fix issue that fan/psu presence will always be true
* fix issue for external checker
* move system-health service to right after rc-local service
* Set system-health service start after database service
* Get system up time via /proc/uptime
* Provide more information in stat for CLI to use
* fix typo
* Set default category to External for external checker
* If external checker reported OK, save it to stat too
* Trim string for external checker output
* fix issue: PSU voltage check always return OK
* Add unit test cases for system health library
* Fix LGTM warnings
* fix demo comments: 1. get boot up timeout from monit configuration file; 2. set system led in library instead of daemon
* Remove boot_timeout configuration because it will get from monit config file
* Fix argument miss
* fix unit test failure
* fix issue: summary status is not correct
* Fix format issues found in code review
* rename th to threshold to make it clearer
* Fix review comment: 1. add a .dep file for system health; 2. deprecated daemon_base and uses sonic-py-common instead
* Fix unit test failure
* Fix LGTM alert
* Fix LGTM alert
* Fix review comments
* Fix review comment
* 1. Add relevant comments for system health; 2. rename external_checker to user_define_checker
* Ignore check for unknown service type
* Fix unit test issue
* Rename user define checker to user defined checker
* Rename user_define_checkers to user_defined_checkers for configuration file
* Renmae file user_define_checker.py -> user_defined_checker.py
* Fix typo
* Adjust import order for config.py
Co-authored-by: Joe LeVeque <jleveque@users.noreply.github.com>
* Adjust import order for src/system-health/health_checker/hardware_checker.py
Co-authored-by: Joe LeVeque <jleveque@users.noreply.github.com>
* Adjust import order for src/system-health/scripts/healthd
Co-authored-by: Joe LeVeque <jleveque@users.noreply.github.com>
* Adjust import orders in src/system-health/tests/test_system_health.py
* Fix typo
* Add new line after import
* If system health configuration file not exist, healthd should exit
* Fix indent and enable pytest coverage
* Fix typo
* Fix typo
* Remove global logger and use log functions inherited from super class
* Change info level logger to notice level
Co-authored-by: Joe LeVeque <jleveque@users.noreply.github.com>
any event happening on ACL Rule Table (eg DATAACL rules
programmed) caused control plane default action to be triggered.
Now Control Plance ACTION will be trigger only
a) ACL Rule beloging to Control ACL Table
Signed-off-by: Abhishek Dosi <abdosi@microsoft.com>
To address issue #5525
Explicitly control the grub installation requirement when it is needed.
We have scenario where configuration migration happened but grub
installation is not required.
Signed-off-by: Ying Xie <ying.xie@microsoft.com>
implements a new feature: "BGP Allow list."
This feature allows us to control which IP prefixes are going to be advertised via ebgp from the routes received from EBGP neighbors.
implements a new feature: "BGP Allow list."
This feature allows us to control which IP prefixes are going to be advertised via ebgp from the routes received from EBGP neighbors.
* [Multi-Asic] Forward SNMP requests destined to loopback IP, and coming in through the front panel interface
present in the network namespace, to SNMP agent running in the linux host.
* Updates based on comments
* Further updates in docker_image_ctl.j2 and caclmgrd
* Change the variable for net config file.
* Updated the comments in the code.
* No need to clean up the exising NAT rules if present, which could be created by some other process.
* Delete our rule first and add it back, to take care of caclmgrd restart.
Another benefit is that we delete only our rules, rather than earlier approach of "iptables -F" which cleans up all rules.
* Keeping the original logic to clean the NAT entries, to revist when NAT feature added in namespace.
* Missing updates to log_info call.
redis-py 3.0 used in master branch only accepts user data as bytes,
strings or numbers (ints, longs and floats). Attempting to specify a key
or a value as any other type will raise a DataError exception.
This PR address the issue bt converting datetime to str
We want to let Monit to unmonitor the processes in containers which are disabled in `FEATURE` table such that
Monit will not generate false alerting messages into the syslog.
Signed-off-by: Yong Zhao <yozhao@microsoft.com>