communication on docker eth0 ip . Without this TCP Connection to Redis
does not happen in namespace.
Signed-off-by: Abhishek Dosi <abdosi@abdosi-ubuntu-vm0.nwp1qucpfg5ejooejenqshkj3e.cx.internal.cloudapp.net>
Co-authored-by: Abhishek Dosi <abdosi@abdosi-ubuntu-vm0.nwp1qucpfg5ejooejenqshkj3e.cx.internal.cloudapp.net>
management framework provides management plane services like rest and
CLI which is not needed right after boot, instead by delaying this
service we give some more CPU for data plane and control plane services
on fast/warm boot.
Signed-off-by: Stepan Blyschak <stepanb@nvidia.com>
sonic-cfggen is now using Unix Domain Socket for Redis DB. The socket
is created using root account. Subsequently, services that are started
as admin fails to start. This PR creates redis group and add admin
user to redis group. It also grants read/write access on redis.sock
for redis group members.
signed-off-by: Tamer Ahmed <tamer.ahmed@microsoft.com>
on sonic-swss-common
With the change as part of #378 caclmgrd need to be updated
to use new client side Get API to access namespace.
Signed-off-by: Abhishek Dosi <abdosi@microsoft.com>
New attribute 'has_timer' introduced to init_cfg.json does not evaluate
as Bool, rather it evaluates as string. This PR fixes this issue. Also,
this PR fixes an issue when there is system config unit (snmp, telemetry) that
has no installation config (WantedBy=, RequiredBy=, Also=, Alias=) settings
in the [Install] section. In the latter case, the .service should not be enabled.
signed-off-by: Tamer Ahmed <tamer.ahmed@microsoft.com>
The first partition starting point was changed to be 1M as part of this
commit: 6ba2f97f1e. On systems that are misaligned before conversion
(partition start is the first sector), the relica partition that is
left in the first MB can cause problems in Aboot and result in corruption
of the filesystem on the new aligned partition.
Zeroing this old relica makes sure that there is nothing left of the old
partition lying around. There won't be any risk of having Aboot corrupt
the new filesystem because of the old relica.
Signed-off-by: Baptiste Covolato <baptiste@arista.com>
Commit e484ae9dd introduced systemd .timer unit to hostcfgd.
However, when stopping service that has timer, there is possibility that
timer is not running and the service would not be stopped. This PR
address this situation by handling both .timer and .service units.
signed-off-by: Tamer Ahmed <tamer.ahmed@microsoft.com>
startup when doing redis PING since database_config.json getting
generated from jinja2 template is still not ready.
Signed-off-by: Abhishek Dosi <abdosi@microsoft.com>
SNMP and Telemetry services are not critical to switch startup.
They also cause fast-reboot not to meet timing requirements.
In order to delay start those service are associated with systemd
timer units, however when hostcfgd initiate service start, it start
the service and not the timer. This PR fixes this issue by
starting the timer associated with systemd unit.
signed-off-by: Tamer Ahmed <tamer.ahmed@microsoft.com>
* Support for Control Plane ACL's for Multi-asic Platforms.
Following changes were done:
1) Moved from using blocking listen() on Config DB to the select() model
via python-swsscommon since we have to wait on event from multiple
config db's
2) Since python-swsscommon is not available on host added libswsscommon and python-swsscommon
and dependent packages in the base image (host enviroment)
3) Made iptables programmed in all namespace using ip netns exec
Signed-off-by: Abhishek Dosi <abdosi@microsoft.com>
* Address Review Comments
Signed-off-by: Abhishek Dosi <abdosi@microsoft.com>
* Fix Review Comments
* Fix Comments
* Added Change for Multi-asic to have iptables
rules to accept internal docker tcp/udp traffic
needed for syslog and redis-tcp connection.
Signed-off-by: Abhishek Dosi <abdosi@microsoft.com>
* Fix Review Comments
* Added more comments on logic.
* Fixed all warning/errors reported by http://pep8online.com/
other than line > 80 characters.
* Fix Comment
Signed-off-by: Abhishek Dosi <abdosi@microsoft.com>
* Verified with swsscommon package. Fix issue for single asic platforms.
* Moved to new python package
* Address Review Comments.
Signed-off-by: Abhishek Dosi <abdosi@microsoft.com>
* Address Review Comments.
As part of consolidating all common Python-based functionality into the new sonic-py-common package, this pull request:
1. Redirects all Python applications/scripts in sonic-buildimage repo which previously imported sonic_device_util or sonic_daemon_base to instead import sonic-py-common, which was added to the 201911 branch in https://github.com/Azure/sonic-buildimage/pull/5063
2. Replaces all calls to `sonic_device_util.get_platform_info()` to instead call `sonic_py_common.get_platform()` and removes any calls to `sonic_device_util.get_machine_info()` which are no longer necessary (i.e., those which were only used to pass the results to `sonic_device_util.get_platform_info()`.
3. Removes unused imports to the now-deprecated sonic-daemon-base package and sonic_device_util.py module
This is a step toward resolving https://github.com/Azure/sonic-buildimage/issues/4999
that cherry-pick PR#5081 ICCPD feature got include din 201911
which is not needed so removed it back
Signed-off-by: Abhishek Dosi <abdosi@microsoft.com>
1. remove container feature table
2. do not generate feature entry if the feature is not included
in the image
3. rename ENABLE_* to INCLUDE_* for better clarity
4. rename feature status to feature state
5. [submodule]: update sonic-utilities
* 9700e45 2020-08-03 | [show/config]: combine feature and container feature cli (#1015) (HEAD, origin/master, origin/HEAD) [lguohan]
* c9d3550 2020-08-03 | [tests]: fix drops_group_test failure on second run (#1023) [lguohan]
* dfaae69 2020-08-03 | [lldpshow]: Fix input device is not a TTY error (#1016) [Arun Saravanan Balachandran]
* 216688e 2020-08-02 | [tests]: rename sonic-utilitie-tests to tests (#1022) [lguohan]
Signed-off-by: Guohan Lu <lguohan@gmail.com>
Removes installation of kube-proxy (117 MB) and flannel (53 MB) images from Kubernetes-enabled devices. These images are tested to be unnecessary for our use case, as we do not rely on ClusterIPs for Kubernetes Services or a CNI for pod networking.
Fix for the host unmount issue through PR https://github.com/Azure/sonic-buildimage/pull/4558 and https://github.com/Azure/sonic-buildimage/pull/4865 creates the timeout of syslog.socket closure during reboot since the journald socket closure has been included in syslog.socket
Removed the journal socket closure. The host unmount is fixed with just stopping the services which gets restarted only after /var/log unmount and not causing the unmount issues.
Consolidate common SONiC Python-language functionality into one shared package (sonic-py-common) and eliminate duplicate code.
The package currently includes four modules:
- daemon_base
- device_info
- logger
- task_base
NOTE: This is a combination of all changes from https://github.com/Azure/sonic-buildimage/pull/5003, https://github.com/Azure/sonic-buildimage/pull/5049 and some changes from https://github.com/Azure/sonic-buildimage/pull/5043 backported to align with the 201911 branch. As part of the 201911 port, I am not installing the Python 3 package in the base image or in the VS container, because we do not have pip3 installed, and we do not intend to migrate to Python 3 in 201911.
* Changes to add template support for copp.json.
This is needed so that we can install differnt type of
Traps based on Device Role (Tor/Leaf/Mgmt/etc...).
Initial use case is to install DHCP/DHCPv6 tarp only
for tor router.
Signed-off-by: Abhishek Dosi <abdosi@microsoft.com>
* Fixed based on review comments.
Signed-off-by: Abhishek Dosi <abdosi@microsoft.com>
* Fixed based on review comment.
* Defect 2082949: Handling Control Plane ACLs so that IPv4 rules and IPv6 rules are not added to the same ACL table
* Previous code review comments of coming up with functions for is_ipv4_rule and is_ipv6_rule is addressed and also raising Exceptions instead of simply aborting when the conflict occurs is handled
* Addressed code review comment to replace duplicate code with already existing functions
* removed raising Exception when rule conflict in Control plane ACLs are found
* added code to remove the rule_props if it is conflicting ACL table versioning rule
* addressed review comment to add ignoring rule in the error statement
Co-authored-by: Madhan Babu <madhan@arc-build-server.mtr.labs.mlnx>
Otherwise, it may cause issues for warm restarts, warm reboot.
Warm restart of swss will start nat which is not expected for warm
restart. Also it is observed that during warm-reboot script execution
nat container gets started after it was killed. This causes removal of
nat dump generated by nat previously:
A check [ -f /host/warmboot/nat/nat_entries.dump ] || echo "NAT dump
does not exists" was added right before kexec:
```
Fri Jul 17 10:47:16 UTC 2020 Prepare MLNX ASIC to fastfast-reboot:
install new FW if required
Fri Jul 17 10:47:18 UTC 2020 Pausing orchagent ...
Fri Jul 17 10:47:18 UTC 2020 Stopping nat ...
Fri Jul 17 10:47:18 UTC 2020 Stopped nat ...
Fri Jul 17 10:47:18 UTC 2020 Stopping radv ...
Fri Jul 17 10:47:19 UTC 2020 Stopping bgp ...
Fri Jul 17 10:47:19 UTC 2020 Stopped bgp ...
Fri Jul 17 10:47:21 UTC 2020 Initialize pre-shutdown ...
Fri Jul 17 10:47:21 UTC 2020 Requesting pre-shutdown ...
Fri Jul 17 10:47:22 UTC 2020 Waiting for pre-shutdown ...
Fri Jul 17 10:47:24 UTC 2020 Pre-shutdown succeeded ...
Fri Jul 17 10:47:24 UTC 2020 Backing up database ...
Fri Jul 17 10:47:25 UTC 2020 Stopping teamd ...
Fri Jul 17 10:47:25 UTC 2020 Stopped teamd ...
Fri Jul 17 10:47:25 UTC 2020 Stopping syncd ...
Fri Jul 17 10:47:35 UTC 2020 Stopped syncd ...
Fri Jul 17 10:47:35 UTC 2020 Stopping all remaining containers ...
Warning: Stopping telemetry.service, but it can still be activated by:
telemetry.timer
Fri Jul 17 10:47:37 UTC 2020 Stopped all remaining containers ...
NAT dump does not exists
Fri Jul 17 10:47:39 UTC 2020 Rebooting with /sbin/kexec -e to
SONiC-OS-201911.140-08245093 ...
```
With this change, executed warm-reboot 10 times without hitting this
issue, while without this change the issue is easily reproducible almost
every warm-reboot run.
Signed-off-by: Stepan Blyschak <stepanb@mellanox.com>
`sonic_installer list` is a read-only command. Specify it as such in the sudoers file.
This will also ensure the new `show boot` command, which calls `sudo sonic_installer list` under the hood doesn't fail due to permissions.
Signed-off-by: Arvindsrinivasan Lakshmi Narasimhan <arlakshm@microsoft.com>
This PR has changes to support accessing the bcmsh and bcmcmd utilities on multi ASIC devices
Changes done
- move the link of /var/run/sswsyncd from docker-syncd-brcm.mk to docker_image_ctl.j2
- update the bcmsh and bcmcmd scripts to take -n [ASIC_ID] as an argument on multi ASIC platforms
* [sonic-buildimage] Changes to make network specific sysctl
common for both host and docker namespace (in multi-npu).
This change is triggered with issue found in multi-npu platforms
where in docker namespace
net.ipv6.conf.all.forwarding was 0 (should be 1) because of
which RS/RA message were triggered and link-local router were learnt.
Beside this there were some other sysctl.net.ipv6* params whose value
in docker namespace is not same as host namespace.
So to make we are always in sync in host and docker namespace
created common file that list all sysctl.net.* params and used
both by host and docker namespace. Any change will get applied
to both namespace.
Signed-off-by: Abhishek Dosi <abdosi@microsoft.com>
* Address Review Comments and made sure to invoke augtool
only one and do string concatenation of all set commands
* Address Review Comments.
* [systemd-generator]: Fix the code to make sure that dependencies
of host services are generated correctly for multi-asic platforms.
Add code to make sure that systemd timer files are also modified
to add the correct service dependency for multi-asic platforms.
Signed-off-by: SuvarnaMeenakshi <sumeenak@microsoft.com>
* [systemd-generator]: Minor fix, remove debug code and
remove unused variable.
* Changes to make default route programming
correct in multi-asic platform where frr is not running
in host namespace. Change is to set correct administrative distance.
Also make NAMESPACE* enviroment variable available for all dockers
so that it can be used when needed.
Signed-off-by: Abhishek Dosi <abdosi@microsoft.com>
* Fix review comments
* Review comment to check to add default route
only if default route exist and delete is successful.
Add changes for syslog support for containers running in namespaces on multi ASIC platforms.
On Multi ASIC platforms
Rsyslog service is only running on the host. There is no rsyslog service running in each namespace.
On multi ASIC platforms the rsyslog service on the host will be listening on the docker0 ip address instead of loopback address.
The rsyslog.conf on the containers is modified to have omfwd target ip to be docker0 ipaddress instead of loopback ip
Signed-off-by: Arvindsrinivasan Lakshmi Narasimhan <arlakshm@microsoft.com>
While testing reboot case for 201911 facing error:
supervisor-proc-exit-listener FATAL command at '/usr/bin/supervisor-proc-exit-listener' is not executable
Signed-off-by: Roman Savchuk <romanx.savchuk@intel.com>
When building the SONiC image, used systemd to mask all services which are set to "disabled" in init_cfg.json.
This PR depends on https://github.com/Azure/sonic-utilities/pull/944, otherwise `config load_minigraph will fail when trying to restart disabled services.
- Ensure all features (services) are in the configured state when hostcfgd starts
- Better functionalization of code
- Also replace calls to deprecated `has_key()` method in `tacacs_server_handler()` and `tacacs_global_handler()` with `in` keyword.
This PR depends on https://github.com/Azure/sonic-utilities/pull/944, otherwise `config load_minigraph` will fail when trying to restart disabled services.
While migrating to SONiC 20181130, identified a couple of issues:
1. union-mount needs /host/machine.conf parameters for vendor specific checks : however, in case of migration, the /host/machine.conf is extracted from ONIE only in https://github.com/Azure/sonic-buildimage/blob/master/files/image_config/platform/rc.local#L127.
2. Since grub.cfg is updated to have net.ifnames=0 biosdevname=0, 70-persistent-net.rules changes are no longer required.
Don't limit iptables connection tracking to TCP protocol; allow connection tracking for all protocols. This allows services like NTP, which is UDP-based, to receive replies from an NTP server even if the port is blocked, as long as it is in reply to a request sent from the device itself.