We are moving toward building all Python packages for SONiC as wheel packages rather than Debian packages. This will also allow us to more easily transition to Python 3.
Python files are now packaged in "sonic-utilities" Pyhton wheel. Data files are now packaged in "sonic-utilities-data" Debian package.
**- How I did it**
- Build and install sonic-utilities as a Python package
- Remove explicit installation of wheel dependencies, as these will now get installed implicitly by pip when installing sonic-utilities as a wheel
- Build and install new sonic-utilities-data package to install data files required by sonic-utilities applications
- Update all references to sonic-utilities scripts/entrypoints to either reference the new /usr/local/bin/ location or remove absolute path entirely where applicable
Submodule updates:
* src/sonic-utilities aa27dd9...2244d7b (5):
> Support building sonic-utilities as a Python wheel package instead of a Debian package (#1122)
> [consutil] Display remote device name in show command (#1120)
> [vrf] fix check state_db error when vrf moving (#1119)
> [consutil] Fix issue where the ConfigDBConnector's reference is missing (#1117)
> Update to make config load/reload backward compatible. (#1115)
* src/sonic-ztp dd025bc...911d622 (1):
> Update paths to reflect new sonic-utilities install location, /usr/local/bin/ (#19)
This PR limited the number of calls to sonic-cfggen to one call
per iteration instead of current 3 calls per iteration.
The PR also installs jq on host for future scripts if needed.
signed-off-by: Tamer Ahmed <tamer.ahmed@microsoft.com>
communication on docker eth0 ip . Without this TCP Connection to Redis
does not happen in namespace.
Signed-off-by: Abhishek Dosi <abdosi@abdosi-ubuntu-vm0.nwp1qucpfg5ejooejenqshkj3e.cx.internal.cloudapp.net>
Co-authored-by: Abhishek Dosi <abdosi@abdosi-ubuntu-vm0.nwp1qucpfg5ejooejenqshkj3e.cx.internal.cloudapp.net>
Introduced a new build parameter 'SONIC_IMAGE_VERSION' that allows build
system users to build SONiC image with a specific version string. If
'SONIC_IMAGE_VERSION' was not passed by the user, SONIC_IMAGE_VERSION will be
set to the output of functions.sh:sonic_get_version function.
Signed-off-by: Stepan Blyschak <stepanb@nvidia.com>
As per the VOQ HLDs, internal networking between the linecards and supervisor is required within a chassis.
Allocating 127.X/16 subnets for private communication within a chassis is a good candidate.
It doesn't require any external IP allocation as well as ensure that the traffic will not leave the chassis.
References:
https://github.com/Azure/SONiC/pull/622https://github.com/Azure/SONiC/pull/639
**- How I did it**
Changed the `interfaces.j2` file to add `127.0.0.1/16` as the `lo` ip address.
Then once the interface is up, the post-up command removes the `127.0.0.1/8` ip address.
The order in which the netmask change is made matters for `127.0.0.1` to be reachable at all times.
**- How to verify it**
```
root@sonic:~# ip address show dev lo
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/16 scope host lo
valid_lft forever preferred_lft forever
inet6 ::1/128 scope host
valid_lft forever preferred_lft forever
```
Co-authored-by: Baptiste Covolato <baptiste@arista.com>
- Merge chassis codebase upstream
- Add support for Otterlake supervisor
- Add support for NorthFace and Camp chassis
- Add support for Eldridge, Dragonfly and Brooks fabrics
- Add support for Clearwater2 and Clearwater2Ms linecards
- Add new arista Cli to power on/off cards
- Add new arista show Cli to inspect supervisor, chassis, fabrics and linecards
Issue: Binary ebtables config file is CPU arch dependent
Fix: Load the text config during firsttime boot and
Generate the binary persistent atomic file
Signed-off-by: Antony Rheneus <arheneus@marvell.com>
sonic-cfggen is now using Unix Domain Socket for Redis DB. The socket
is created using root account. Subsequently, services that are started
as admin fails to start. This PR creates redis group and add admin
user to redis group. It also grants read/write access on redis.sock
for redis group members.
signed-off-by: Tamer Ahmed <tamer.ahmed@microsoft.com>
[schema] Make schema header support C project (#373)
Removed DB specific get api's from Selectable class (#378)
With the change as part of #378 caclmgrd need to be updated
to use new client side Get API to access namespace.
Signed-off-by: Abhishek Dosi <abdosi@microsoft.com>
The pcie-check.sh script was added in https://github.com/Azure/sonic-buildimage/pull/4771, but was not given executable permission. Therefore, we would see messages like:
```
Aug 26 22:54:05.536248 sonic ERR systemd[664]: pcie-check.service: Failed to execute command: Permission denied
Aug 26 22:54:05.536386 sonic ERR systemd[664]: pcie-check.service: Failed at step EXEC spawning /usr/bin/pcie-check.sh: Permission denied
Aug 26 22:54:05.536600 sonic WARNING systemd[1]: pcie-check.service: Failed with result 'exit-code'.
```
management framework provides management plane services like rest and
CLI which is not needed right after boot, instead by delaying this
service we give some more CPU for data plane and control plane services
on fast/warm boot.
Signed-off-by: Stepan Blyschak <stepanb@nvidia.com>
New attribute 'has_timer' introduced to init_cfg.json does not evaluate
as Bool, rather it evaluates as string. This PR fixes this issue. Also,
this PR fixes an issue when there is system config unit (snmp, telemetry) that
has no installation config (WantedBy=, RequiredBy=, Also=, Alias=) settings
in the [Install] section. In the latter case, the .service should not be enabled.
signed-off-by: Tamer Ahmed <tamer.ahmed@microsoft.com>
Add a master switch so that the sync/async mode can be configured.
Example usage of the switch:
1. Configure mode while building an image
`make ENABLE_SYNCHRONOUS_MODE=y <target>`
2. Configure when the device is running
Change CONFIG_DB with `sonic-cfggen -a '{"DEVICE_METADATA":{"localhost": {"synchronous_mode": "enable"}}}' --write-to-db`
Restart swss with `systemctl restart swss`
The first partition starting point was changed to be 1M as part of this
commit: 6ba2f97f1e. On systems that are misaligned before conversion
(partition start is the first sector), the relica partition that is
left in the first MB can cause problems in Aboot and result in corruption
of the filesystem on the new aligned partition.
Zeroing this old relica makes sure that there is nothing left of the old
partition lying around. There won't be any risk of having Aboot corrupt
the new filesystem because of the old relica.
Signed-off-by: Baptiste Covolato <baptiste@arista.com>
- Why I did it
When SONiC is configured with the management framework and/or telemetry services, the applications running inside those containers need to access some functionality on the host system. The following is a non-exhaustive list of such functionality:
Image management
Configuration save and load
ZTP enable/disable and status
Show tech support
- How I did it
The host service is a Python process that listens for requests via D-Bus. It will then service those requests and send a response back to the requestor.
This PR only introduces the host service infrastructure. Applications that need access to the host services must add applets that will register on D-Bus endpoints to service the appropriate functionality.
- How to verify it
- Description for the changelog
Add SONiC Host Service for container to execute select commands in host
Signed-off-by: Nirenjan Krishnan <Nirenjan.Krishnan@dell.com>
Commit e484ae9dd introduced systemd .timer unit to hostcfgd.
However, when stopping service that has timer, there is possibility that
timer is not running and the service would not be stopped. This PR
address this situation by handling both .timer and .service units.
signed-off-by: Tamer Ahmed <tamer.ahmed@microsoft.com>
startup when doing redis PING since database_config.json getting
generated from jinja2 template is still not ready.
Signed-off-by: Abhishek Dosi <abdosi@microsoft.com>
* Support for Control Plane ACL's for Multi-asic Platforms.
Following changes were done:
1) Moved from using blocking listen() on Config DB to the select() model
via python-swsscommon since we have to wait on event from multiple
config db's
2) Since python-swsscommon is not available on host added libswsscommon and python-swsscommon
and dependent packages in the base image (host enviroment)
3) Made iptables programmed in all namespace using ip netns exec
Signed-off-by: Abhishek Dosi <abdosi@microsoft.com>
* Address Review Comments
Signed-off-by: Abhishek Dosi <abdosi@microsoft.com>
* Fix Review Comments
* Fix Comments
* Added Change for Multi-asic to have iptables
rules to accept internal docker tcp/udp traffic
needed for syslog and redis-tcp connection.
Signed-off-by: Abhishek Dosi <abdosi@microsoft.com>
* Fix Review Comments
* Added more comments on logic.
* Fixed all warning/errors reported by http://pep8online.com/
other than line > 80 characters.
* Fix Comment
Signed-off-by: Abhishek Dosi <abdosi@microsoft.com>
* Verified with swsscommon package. Fix issue for single asic platforms.
* Moved to new python package
* Address Review Comments.
Signed-off-by: Abhishek Dosi <abdosi@microsoft.com>
* Address Review Comments.
SNMP and Telemetry services are not critical to switch startup.
They also cause fast-reboot not to meet timing requirements.
In order to delay start those service are associated with systemd
timer units, however when hostcfgd initiate service start, it start
the service and not the timer. This PR fixes this issue by
starting the timer associated with systemd unit.
signed-off-by: Tamer Ahmed <tamer.ahmed@microsoft.com>
Calls to sonic-cfggen is CPU expensive. This PR reduces calls to
sonic-cfggen to one call during startup when running interfaces-
config.
singed-off-by: Tamer Ahmed <tamer.ahmed@microsoft.com>
Removes installation of kube-proxy (117 MB) and flannel (53 MB) images from Kubernetes-enabled devices. These images are tested to be unnecessary for our use case, as we do not rely on ClusterIPs for Kubernetes Services or a CNI for pod networking.
1. remove container feature table
2. do not generate feature entry if the feature is not included
in the image
3. rename ENABLE_* to INCLUDE_* for better clarity
4. rename feature status to feature state
5. [submodule]: update sonic-utilities
* 9700e45 2020-08-03 | [show/config]: combine feature and container feature cli (#1015) (HEAD, origin/master, origin/HEAD) [lguohan]
* c9d3550 2020-08-03 | [tests]: fix drops_group_test failure on second run (#1023) [lguohan]
* dfaae69 2020-08-03 | [lldpshow]: Fix input device is not a TTY error (#1016) [Arun Saravanan Balachandran]
* 216688e 2020-08-02 | [tests]: rename sonic-utilitie-tests to tests (#1022) [lguohan]
Signed-off-by: Guohan Lu <lguohan@gmail.com>
As part of consolidating all common Python-based functionality into the new sonic-py-common package, this pull request:
1. Redirects all Python applications/scripts in sonic-buildimage repo which previously imported sonic_device_util or sonic_daemon_base to instead import sonic-py-common, which was added in https://github.com/Azure/sonic-buildimage/pull/5003
2. Replaces all calls to `sonic_device_util.get_platform_info()` to instead call `sonic_py_common.get_platform()` and removes any calls to `sonic_device_util.get_machine_info()` which are no longer necessary (i.e., those which were only used to pass the results to `sonic_device_util.get_platform_info()`.
3. Removes unused imports to the now-deprecated sonic-daemon-base package and sonic_device_util.py module
This is the next step toward resolving https://github.com/Azure/sonic-buildimage/issues/4999
Also reverted my previous change in which device_info.get_platform() would first try obtaining the platform ID string from Config DB and fall back to gathering it from machine.conf upon failure because this function is called by sonic-cfggen before the data is in the DB, in which case, the db_connect() call will hang indefinitely, which was not the behavior I expected. As of now, the function will always reference machine.conf.
* [platform] Add Support For Environment Variable
This PR adds the ability to read environment file from /etc/sonic.
the file contains immutable SONiC config attributes such as platform,
hwsku, version, device_type. The aim is to minimize calls being made
into sonic-cfggen during boot time.
singed-off-by: Tamer Ahmed <tamer.ahmed@microsoft.com>
* Changes to add template support for copp.json.
This is needed so that we can install differnt type of
Traps based on Device Role (Tor/Leaf/Mgmt/etc...).
Initial use case is to install DHCP/DHCPv6 tarp only
for tor router.
Signed-off-by: Abhishek Dosi <abdosi@microsoft.com>
* Fixed based on review comments.
Signed-off-by: Abhishek Dosi <abdosi@microsoft.com>
* Fixed based on review comment.
virtual-chassis test uses multiple vs instances to simulate a
modular switch and a redis-chassis service is required to run on
the vs instance that represents a supervisor card.
This change allows vs docker start redis-chassis service according
to external config file.
**- Why I did it**
To support virtual-chassis setup, so that we can test distributed forwarding feature in virtual sonic environment, see `Distributed forwarding in a VOQ architecture HLD` pull request at https://github.com/Azure/SONiC/pull/622
**- How I did it**
The sonic-vs start.sh is enhanced to start new redis_chassis service if external chassis config file found. The config file doesn't exist in current vs environment, start.sh will behave like before.
**- How to verify it**
The swss/test still pass. The chassis_db service is verified in virtual-chassis topology and tests which are in following PRs.
Signed-off-by: Honggang Xu <hxu@arista.com>
(cherry picked from commit c1d45cf81ce3238be2dcbccae98c0780944981ce)
Co-authored-by: Honggang Xu <hxu@arista.com>
Consolidate common SONiC Python-language functionality into one shared package (sonic-py-common) and eliminate duplicate code.
The package currently includes three modules:
- daemon_base
- device_info
- logger
Fix for the host unmount issue through PR https://github.com/Azure/sonic-buildimage/pull/4558 and https://github.com/Azure/sonic-buildimage/pull/4865 creates the timeout of syslog.socket closure during reboot since the journald socket closure has been included in syslog.socket
Removed the journal socket closure. The host unmount is fixed with just stopping the services which gets restarted only after /var/log unmount and not causing the unmount issues.
`sonic-installer list` is a read-only command. Specify it as such in the sudoers file.
This will also ensure the new `show boot` command, which calls `sudo sonic-installer list` under the hood doesn't fail due to permissions.
Otherwise, it may cause issues for warm restarts, warm reboot.
Warm restart of swss will start nat which is not expected for warm
restart. Also it is observed that during warm-reboot script execution
nat container gets started after it was killed. This causes removal of
nat dump generated by nat previously:
A check [ -f /host/warmboot/nat/nat_entries.dump ] || echo "NAT dump
does not exists" was added right before kexec:
```
Fri Jul 17 10:47:16 UTC 2020 Prepare MLNX ASIC to fastfast-reboot:
install new FW if required
Fri Jul 17 10:47:18 UTC 2020 Pausing orchagent ...
Fri Jul 17 10:47:18 UTC 2020 Stopping nat ...
Fri Jul 17 10:47:18 UTC 2020 Stopped nat ...
Fri Jul 17 10:47:18 UTC 2020 Stopping radv ...
Fri Jul 17 10:47:19 UTC 2020 Stopping bgp ...
Fri Jul 17 10:47:19 UTC 2020 Stopped bgp ...
Fri Jul 17 10:47:21 UTC 2020 Initialize pre-shutdown ...
Fri Jul 17 10:47:21 UTC 2020 Requesting pre-shutdown ...
Fri Jul 17 10:47:22 UTC 2020 Waiting for pre-shutdown ...
Fri Jul 17 10:47:24 UTC 2020 Pre-shutdown succeeded ...
Fri Jul 17 10:47:24 UTC 2020 Backing up database ...
Fri Jul 17 10:47:25 UTC 2020 Stopping teamd ...
Fri Jul 17 10:47:25 UTC 2020 Stopped teamd ...
Fri Jul 17 10:47:25 UTC 2020 Stopping syncd ...
Fri Jul 17 10:47:35 UTC 2020 Stopped syncd ...
Fri Jul 17 10:47:35 UTC 2020 Stopping all remaining containers ...
Warning: Stopping telemetry.service, but it can still be activated by:
telemetry.timer
Fri Jul 17 10:47:37 UTC 2020 Stopped all remaining containers ...
NAT dump does not exists
Fri Jul 17 10:47:39 UTC 2020 Rebooting with /sbin/kexec -e to
SONiC-OS-201911.140-08245093 ...
```
With this change, executed warm-reboot 10 times without hitting this
issue, while without this change the issue is easily reproducible almost
every warm-reboot run.
Signed-off-by: Stepan Blyschak <stepanb@mellanox.com>