Commit Graph

633 Commits

Author SHA1 Message Date
Lawrence Lee
5232647b33 [mux]: Make write_standby available on host
Signed-off-by: Lawrence Lee <lawlee@microsoft.com>

[write_standby]: Cleanup and fix build

Signed-off-by: Lawrence Lee <lawlee@microsoft.com>
2021-10-15 09:59:59 -07:00
Tamer Ahmed
b880f9d973 Merged PR 4813977: [mux] Update Service Install With SONiC Target
[mux] Update Service Install With SONiC Target

Recent PR grouped all SONiC service into sonic.taget. The install section
of mux.service was not update and this causes delays when using config
reload as the service failed state is not being reset.

signed-off-by: Tamer Ahmed <tamer.ahmed@microsoft.com>
2021-10-15 09:59:59 -07:00
Lawrence Lee
0295c832c2 Merged PR 4366316: [mux.service]: Bind to sonic.target
[mux.service]: Bind to sonic.target

Signed-off-by: Lawrence Lee <lawlee@microsoft.com>
2021-10-15 09:59:59 -07:00
Tamer Ahmed
bff785ec49 Merged PR 4234524: [mux] Start Mux on Only Dual-ToR Platform
[mux] Start Mux on Only Dual-ToR Platform

mux docker depends on the presence of mux cable hardware and is
supposed to run only Gemini ToRs. This PR change the mux feature
config in order to enable mux docker based on device configuration.

signed-off-by: Tamer Ahmed <tamer.ahmed@microsoft.com>
2021-10-15 09:59:59 -07:00
Tamer Ahmed
c9c2826520 Merged PR 3845699: [linkmgrd]: Introduce MUX cable linkmgrd
Linkmgrd monitors link status, mux status, and link state. Has
the link becomes unhealthy, linkmgrd will trigger mux switchover
on a standby ToR ensuring uninterrupted service to servers/blades.
This PR is initial implementation of linkmgrd.

Also, docker-mux container hold packages related to maintaining and managing
mux cable. It currently runs linkmgrd binary that monitor and switches
the mux if needed.
This PR also introduces mux-container and starts linkmgrd as startup when
build is configured with INCLUDE_MUX=y

Edit: linkmgrd PR will follow.

signed-off-by: Tamer Ahmed <tamer.ahmed@microsoft.com>

Related work items: #2315, #3146150
2021-10-15 09:59:59 -07:00
liuh-80
7d40384c58
[TACACS+] Add plugin support to bash. (#8660)
This pull request add plugin support library to bash.
    And we will create a TACACS+ plugin for bash in an other PR, which will bring per command authorization feature to bash.

Why I did it
    To support TACACS per command authorization, we check user command before execute it.

How I did it
    Add plugin support to bash.

How to verify it
    UT with CUnit under bash project cover all new code in plugin.c.
    Also pass all current UT.

Which release branch to backport (provide reason below if selected)
    N/A

Description for the changelog
    Add plugin support to bash.
2021-10-11 15:20:51 +08:00
Ashok Daparthi-Dell
6cbdf11e53
SONIC QOS YANG - Remove qos tables field value refernce format (#7752)
Depends on Azure/sonic-utilities#1626
Depends on Azure/sonic-swss#1754

QOS tables in config db used ABNF format i.e "[TABLE_NAME|name] to refer fieldvalue to other qos tables.

Example:
Config DB:
"Ethernet92|3": {
"scheduler": "[SCHEDULER|scheduler.1]",
"wred_profile": "[WRED_PROFILE|AZURE_LOSSLESS]"
},
"Ethernet0|0": {
"profile": "[BUFFER_PROFILE|ingress_lossy_profile]"
},
"Ethernet0": {
"dscp_to_tc_map": "[DSCP_TO_TC_MAP|AZURE]",
"pfc_enable": "3,4",
"pfc_to_queue_map": "[MAP_PFC_PRIORITY_TO_QUEUE|AZURE]",
"tc_to_pg_map": "[TC_TO_PRIORITY_GROUP_MAP|AZURE]",
"tc_to_queue_map": "[TC_TO_QUEUE_MAP|AZURE]"
},

This format is not consistent with other DB schema followed in sonic.
And also this reference in DB is not required, This is taken care by YANG "leafref".

Removed this format from all platform files to consistent with other sonic db schema.
Example:
"Ethernet92|3": {
"scheduler": "scheduler.1",
"wred_profile": "AZURE_LOSSLESS"
},

Dependent pull requests:
#7752 - To modify platfrom files
#7281 - Yang model
Azure/sonic-utilities#1626 - DB migration
Azure/sonic-swss#1754 - swss change to remove ABNF format
2021-09-28 09:21:24 -07:00
Vaibhav Hemant Dixit
ee9250e8cc
Save DB dump after warm/fast reboot (#8803)
As a part of warmboot, redis database is dumped:
c97fe546e5/scripts/fast-reboot (L269)
However, this dump file is deleted, after it is loaded back into db post reboot.
The DB dump can be useful for debugging purpose, hence taking a backup of it can be useful.
Instead of deleting the dump, rename and keep the dump.
2021-09-23 23:53:22 -07:00
kellyyeh
62a1f5eb19
Add CLI Support for IPv6 Helpers and DHCPv6 Relay Counters (#8593) 2021-09-23 22:01:26 -07:00
Stephen Sun
c895677507
Use predefined macro as vendor information (#8361)
#### Why I did it
Use a predefined variable to get vendor information when the swss docker container is created

#### How I did it
Use `{{ sonic_asic_platform }}` instead of `$SONIC_CFGGEN -y /etc/sonic/sonic_version.yml -v asic_type`

#### How to verify it
Manually test.
2021-08-16 00:36:48 -07:00
Saikrishna Arcot
c8b5daed27 Upgrade to ifupdown2 3.0.0 with a patch to fix using broadcast addresses
In version 3.0.0, If a broadcast address is specified in
/etc/network/interfaces, then when ifup is run, it will fail with an
error saying `'str' object has no attribute 'packed'`. This appears to
be because it expects all attributes for an interface to be "packable"
into a compact binary representation. However, it doesn't actually
convert the broadcast address into an IPNetwork object (other addresses
are handled).

Therefore, convert the broadcast address it reads in from a str to an
IPNetwork object.

Also explicitly specify the scope of the loopback address in
/etc/network/interfaces as host scope. Otherwise, it will get added as
global scope by default. As part of this, use JSON to parse ip's output
instead of text, for robustness.

Signed-off-by: Saikrishna Arcot <sarcot@microsoft.com>
2021-08-12 23:18:01 -07:00
Stepan Blyshchak
14da7a1663
[sonic_debian_extension.j2] export DOCKER_HOST so that clients can use it to connect to dockerd (#8398)
Use DOCKER_HOST. Every client including docker command and python docker API uses this environment variable to connect to dockerd.

Signed-off-by: Stepan Blyschak <stepanb@nvidia.com>
2021-08-10 11:11:45 -07:00
lguohan
cf73e22d52
[build]: add branch and release name in sonic_version.yml (#6356)
the branch refers the branch name that the commit is in,
for example master, 202012, 201911, ...
In case there is no branch, the name will be HEAD.

release is encoded in /etc/sonic/sonic_release file.
the file is only available for a release branch.
It is not available in master branch.

example for master branch
```
build_version: 'master.602-6efc0a88'
debian_version: '10.7'
kernel_version: '4.19.0-9-2-amd64'
asic_type: vs
commit_id: '6efc0a88'
branch: 'master'
release: 'none'
build_date: Tue Dec 29 06:54:02 UTC 2020
build_number: 602
built_by: johnar@jenkins-worker-23
```

example for 202012 release branch
```
build_version: '202012.602-6efc0a88'
debian_version: '10.7'
kernel_version: '4.19.0-9-2-amd64'
asic_type: vs
commit_id: '6efc0a88'
branch: '202012'
release: '202012'
build_date: Tue Dec 29 06:54:02 UTC 2020
build_number: 602
built_by: johnar@jenkins-worker-23
```

Signed-off-by: Guohan Lu <lguohan@gmail.com>
2021-08-08 20:44:02 -07:00
Guohan Lu
0b155c003e [build]: Fix docker pull on armhf platform
armhf build uses native dockerd

Signed-off-by: Guohan Lu <lguohan@gmail.com>
2021-08-06 23:33:40 -07:00
Longxiang Lyu
6283716e1d
[swss][arp_update] Send ipv6 pings over vlan sub interfaces (#8363)
#### Why I did it
* `arp_update` fails to ping those neighbors over vlan sub interfaces.

#### How I did it
* modify `arp_update_vars.j2` to get vlan sub interfaces with ipv6 addresses assigned.
* modify `arp_update` to send ipv6 pings over those retrieved vlan sub interfaces.

Signed-off-by: Longxiang Lyu <lolv@microsoft.com>
2021-08-06 21:14:18 -07:00
byu343
2fccf0661a
[gearbox] Add gbsyncd container for Credo gearbox chips (#8144)
This change is to add a gbsyncd container to accommodate the syncd process and the SAI libraries for the Credo gearbox chips.

How I did it
This container works similar to the existing Broadcom syncd container. Its main difference is that the SAI-related dynamic libraries are replaced by the ones for Credo gearbox chips, and the container only reacts to SAI events for the gearbox chips. The SAI libraries will be provided by the package libsai-credo_1.0_amd64.deb.

For the image build, the added container will be built and included in the Broadcom platform image, after $(LIBSAI_CREDO)_URL = is replaced to the correct value. For now, as $(LIBSAI_CREDO)_URL is empty, the container build is skipped in the image build.

After the container is included in the image, in the runtime, the container will begin with checking the existence of /usr/share/sonic/hwsku/gearbox_config.json; if that file is not provided, the container will exit by itself. Therefore, for platforms unrelated to the Credo chips, as long as they are not providing the file, they will not be affected by this change.
2021-08-04 16:05:53 -07:00
VenkatCisco
d294807b6b
[baseimage]: add j2cli to sonic_debian_extension.j2 (#8019)
j2cli provides access to jinja library. cisco platform.py requires j2cli to handle jinja template configuration files.
2021-08-03 18:06:59 -07:00
vdahiya12
498422968f
[pmon] create and mount firmware directory on PMON for firmware upgrade support on muxcable (#8283)
This PR creates a directory firmware on the HOST with the path /usr/share/sonic/firmware, as well as this is 
mounted on PMON container with the same path /usr/share/sonic/firmware. This is required for firmware 
upgrade support for muxcable as currently by design all Y-Cable API's are called by xcvrd. As such if CLI has 
to transfer a file to PMON we need to mount a directory from host to PMON just for getting the firmware files. 
Hence we require this change.

Signed-off-by: vaibhav-dahiya <vdahiya@microsoft.com>
2021-08-03 09:39:39 -07:00
mprabhu-nokia
3fd6e8d500
[systemd] ASIC status based service bringup on VOQ chassis (#7477)
Changes to allow starting per asic services like swss and syncd only if the platform vendor codedetects the asic is detected and notified. The systemd services ordering we want is database->database@->pmon->swss@->syncd@->teamd@->lldp@
There is also a requirement that management, telemetry, snmp dockers can start even if all asic services are not up.

Why I did it
For VOQ chassis, the fabric cards will have 1-N asics. Also, there could be multiple removable fabric cards. On the supervisor, swss and syncd containers need to be started only if the fabric-card is in Online state and respective asics are detected by the kernel. Using systemd, the dependent services can be in inactive state.

How I did it
Introduce a mechanism where all ASIC dependent service wait on its state to be published via PMON to REDIS. Once the subscription is received, the service proceeds to create respective dockers.
For fixed platforms, systemd is unchanged i.e. the service bring up and docker creation happens in the start()/ExecStartPre routine of the .sh scripts.
For VOQ chassis platform on supervisor, the service bringup skips docker creation in the start() routine, but does it in the wait()/ExecStart routine of the .sh scrips.
Management dockers are decoupled from ASIC docker creation.
2021-07-27 23:02:49 -07:00
賓少鈺
aa59bfeab7
[PDE]: introduce the SONiC Platform Development Env (#7510)
The PDE silicon test harness and platform test harness can be found in
src/sonic-platform-pdk-pde
2021-07-24 16:24:43 -07:00
Renuka Manavalan
3a96eb933e
Get Docker proxy info from config (#8205)
This helps not to hard code the docker proxy IP, but take it from config file during build time.
2021-07-19 21:17:47 -07:00
Renuka Manavalan
c5dff0c640
Revert "Revert "[Kubernetes]: The kube server could be used as http-proxy for docker (#7469)" (#8023)" (#8158)
This reverts commit 7236fa98e8.

Restore original PR #7469
2021-07-15 19:48:55 -07:00
Stepan Blyshchak
b3b6938fda
[dhcp-relay] make DHCP relay an extension (#6531)
- Why I did it
Make DHCP relay docker an extension. DHCP relay now carries dhcp relay commands CLI plugin and has a complete manifest.
It is installed as extension if INCLUDE_DHCP_REALY is set to y.

DEPENDS on #5939

- How I did it
Modify DHCP relay docker makefile and dockerfile. Make changes to sonic_debian_extension.j2 to install sonic packages.
I moved DHCP related CLI tests from sonic-utilities to DHCP relay docker.
This PR introduces a way to write a plugin as part of docker image and run the tests from cli-plugin-tests directory under docker directory.
The test result is available in target/docker-dhcp-relay.gz.log:

[ REASON ] :      target/docker-dhcp-relay.gz does not exist   NON-EXISTENT PREREQUISITES: docker-start target/docker-config-engine-buster.gz-load target/python-wheels/sonic_utilities-1.2-py3-none-any.whl-in
stall target/debs/buster/python3-swsscommon_1.0.0_amd64.deb-install
[ FLAGS  FILE    ] : []
[ FLAGS  DEPENDS ] : []
[ FLAGS  DIFF    ] : []
============================= test session starts ==============================
platform linux -- Python 3.7.3, pytest-3.10.1, py-1.7.0, pluggy-0.8.0 -- /usr/bin/python3
cachedir: .pytest_cache
rootdir: /sonic/dockers/docker-dhcp-relay/cli-plugin-tests, inifile:
plugins: cov-2.6.0
collecting ... collected 10 items

test_config_dhcp_relay.py::TestConfigVlanDhcpRelay::test_plugin_registration PASSED [ 10%]
test_config_dhcp_relay.py::TestConfigVlanDhcpRelay::test_config_vlan_add_dhcp_relay_with_nonexist_vlanid PASSED [ 20%]
test_config_dhcp_relay.py::TestConfigVlanDhcpRelay::test_config_vlan_add_dhcp_relay_with_invalid_vlanid PASSED [ 30%]
test_config_dhcp_relay.py::TestConfigVlanDhcpRelay::test_config_vlan_add_dhcp_relay_with_invalid_ip PASSED [ 40%]
test_config_dhcp_relay.py::TestConfigVlanDhcpRelay::test_config_vlan_add_dhcp_relay_with_exist_ip PASSED [ 50%]
test_config_dhcp_relay.py::TestConfigVlanDhcpRelay::test_config_vlan_add_del_dhcp_relay_dest PASSED [ 60%]
test_config_dhcp_relay.py::TestConfigVlanDhcpRelay::test_config_vlan_remove_nonexist_dhcp_relay_dest PASSED [ 70%]
test_config_dhcp_relay.py::TestConfigVlanDhcpRelay::test_config_vlan_remove_dhcp_relay_dest_with_nonexist_vlanid PASSED [ 80%]
test_show_dhcp_relay.py::TestVlanDhcpRelay::test_plugin_registration PASSED [ 90%]
test_show_dhcp_relay.py::TestVlanDhcpRelay::test_dhcp_relay_column_output PASSED [100%]

=============================== warnings summary ===============================
/usr/local/lib/python3.7/dist-packages/tabulate.py:7
  /usr/local/lib/python3.7/dist-packages/tabulate.py:7: DeprecationWarning: Using or importing the ABCs from 'collections' instead of from 'collections.abc' is deprecated, and in 3.8 it will stop working
    from collections import namedtuple, Iterable

-- Docs: https://docs.pytest.org/en/latest/warnings.html
==================== 10 passed, 1 warnings in 0.35 seconds =====================
2021-07-15 10:35:56 -07:00
Stepan Blyshchak
3a2b8c6ba5
[SONiC Application Extension] support warm/fast reboot for extension packages (#7286)
#### Why I did it

I made this change to support warm/fast reboot for SONiC extension packages as per HLD Azure/SONiC#682.

#### How I did it

I extended manifest.json.j2 with new warm/fast reboot related fields and also extended sonic_debian_extension.j2 script template to generate the shutdown order files for warm and fast reboot.
2021-07-11 06:58:05 -07:00
Stepan Blyshchak
a294bfb03b
[sonic_debian_extension] fix packages.json generation and make the build fail when packages.json is not generated (#8044)
After https://github.com/Azure/sonic-buildimage/pull/7598 the packages.json generation is broken. This change fixes it make the whole build fail in case generation failed.

Signed-off-by: Stepan Blyschak <stepanb@nvidia.com>
2021-07-09 12:29:33 -07:00
shlomibitton
776a446d76
[dhcp_relay] Disable dhcp_relay for ToRRouter switches type by the feature manager (#7789)
- Why I did it
Currently dhcp packets are disabled by the COPP manager for non ToRRouter type switches.
Even if the feature is enabled, DHCP packets wont hook to the CPU since the COPP manager will not trap this packets.
This change is to disable dhcp_relay by default for non ToRRouter switches from init_cfg.json.
With this approach, if the user want to enable the feature for non ToRRouter switches, manual enablement is required by the 'feature' configuration.
This is to keep the current approach for MSFT production issue with dhcp relay for non ToRRouter switched and allow the user to decide if to use it or not.

- How I did it
Configure dhcp_relay 'disabled' by default on init_cfg.json for non ToRRouter switches.
Remove the exclusion of dhcp packets on copp_cfg.json

- How to verify it
Enable dhcp_relay feature on a non ToRRouter switch.
Unit-tests modified so the default values on mocked CONFIG DB in 'test_vectors.py' for dhcp_relay will be 'disabled'.
This is by the change for 'init_cfg.json.j2'.
For ToRRouter the state will change from 'disabled' to 'enabled'.
Another test case added for a 'ToR' switch type, this is to test the state is 'enabled' if the user configured it to be so.
2021-07-08 09:10:46 +03:00
vganesan-nokia
ffca17da0b
[voqsystemlagid] Fix for timing issue in setting system lag id boundary (#7911)
The voq system lag id boundary is set in redis-chassis. Changes include
setting this from database-chassis container. This fixes a timing issue
in finding datbase_config.json file from redis directory which is
created from database container. Since database container usually
starts after database-chassis container the existence of this file is
unreliable while running the command. Running the command under
database-chassis container makes sure that the database_config.json form
redis-chassis directory is guaranteed to be available and hence fixes the
timing issue.

Signed-off-by: vedganes <vedavinayagam.ganesan@nokia.com>
2021-06-30 09:57:08 -07:00
Ying Xie
7236fa98e8
Revert "[Kubernetes]: The kube server could be used as http-proxy for docker (#7469)" (#8023)
This change causes nightly test to fail due to the fake proxy IP is not reachable.

Reverts #7469

This reverts commit f7ed82f44a.
2021-06-29 18:43:53 -07:00
Stepan Blyshchak
9de7e6860b
[sonic-app-ext] support app extensions installation during build (#7593)
Signed-off-by: Stepan Blyschak stepanb@mellanox.com

Why I did it
To support building DHCP relay as extension and installing it during build time.

How I did it
Created infrastructure. Users need to define their packages in rules/sonic-packages.mk

How to verify it
Together with #6531
2021-06-29 09:07:33 -07:00
Stepan Blyshchak
9ce7c6d9fe
[hostcfgd] Configure service auto-restart in hostcfgd. (#5744)
Before this change, a process running inside every SONiC container dealt with FEATURE table 'auto_restart' field and depending on the value decided whether a container has to be killed or not.
If killed service auto restart mechanism restarts the container.
This change moves the logic from container to the host daemon - hostcfgd.
The 'auto_restart' handling is kept in supervisor-proc-exit-listener but now it is not required for container that wants to support auto restart feature.

hostcfgd refactoring - move feature handling in another class.
override systemd service Restart= setting from hostcfgd.
remove default systemd Restart=always.
Signed-off-by: Stepan Blyshchak stepanb@nvidia.com

- Why I did it

Remove the need to deal with container orchestration logic from the container itself. Leave this logic to the orchestrator - host OS.

- How I did it

hostcfgd configures 'Restart=' value for systemd service.

- How to verify it

root@r-tigon-11:/home/admin# sudo config feature autorestart lldp enabled
root@r-tigon-11:/home/admin# show feature status | grep lldp
lldp            enabled   enabled
root@r-tigon-11:/home/admin# docker exec -it lldp pkill -9 lldpd
root@r-tigon-11:/home/admin# docker ps -a | grep lldp
65058396277c        docker-lldp:latest                   "/usr/bin/docker-lld…"   2 days ago          Exited (0) 20 seconds ago                       lldp
root@r-tigon-11:/home/admin# docker ps -a | grep lldp
65058396277c        docker-lldp:latest                   "/usr/bin/docker-lld…"   2 days ago          Up 5 seconds                            lldp
root@r-tigon-11:/home/admin# sudo config feature autorestart lldp disabled
root@r-tigon-11:/home/admin# docker exec -it lldp pkill -9 lldpd
root@r-tigon-11:/home/admin# docker ps -a | grep lldp
65058396277c        docker-lldp:latest                   "/usr/bin/docker-lld…"   2 days ago          Up 35 seconds                           lldp
root@r-tigon-11:/home/admin# docker ps -a | grep lldp
65058396277c        docker-lldp:latest                   "/usr/bin/docker-lld…"   2 days ago          Exited (0) 3 seconds ago                       lldp
root@r-tigon-11:/home/admin# docker ps -a | grep lldp
65058396277c        docker-lldp:latest                   "/usr/bin/docker-lld…"   2 days ago          Exited (0) 39 seconds ago                       lldp
root@r-tigon-11:/home/admin#
2021-06-29 09:06:21 -07:00
Santhosh Kumar T
f8eb5b0958
Flashrom refactoring for broadcom platforms (#7693)
#### Why I did it
- To build flashrom properly with dependency tracking.

#### How I did it
- Moved flashrom code from platform/broadcom/sonic-platform-modules-dell/tools directory to src/flashrom directory.
- At the end, flashrom_0.9.7_amd64.deb package is build which will be installed in the devices.
- Currently flashrom builds only for Dell S6100 platforms.
2021-06-22 15:29:21 -07:00
judyjoseph
3ad830eb49
New sonic-buildimage images for Broadcom DNX ASIC family. (#7598)
Introduce new sonic-buildimage images for Broadcom DNX ASIC family.

sonic-broadcom-dnx.bin
sonic-aboot-broadcom-dnx.swi

How I did it

NO CHANGE to existing make commands

make init; make configure PLATFORM=broadcom;  make target/sonic-aboot-broadcom.swi; make  target/sonic-broadcom.bin

The difference now is that it will result in new broadcom images for DNX asic family as well. 

sonic-broadcom.bin, sonic-broadcom-dnx.bin
sonic-aboot-broadcom.swi, sonic-aboot-broadcom-dnx.swi

Note: This PR also adds support for Broadcom SAI 5.0 (based on 1.8 SAI ) for DNX based platform + changes in platform x86_64-arista_7280cr3_32p4 bcm config files and platform_env.conf files
2021-06-22 11:12:22 -07:00
Kebo Liu
078e0e0410
mount 'mellanox' folder only instead of create each sub folder (#7830)
#### Why I did it

Following the discussion in another PR https://github.com/Azure/sonic-buildimage/pull/7708#discussion_r642933510 , since there will be multi subfolders under **/var/log/mellanox**, so we agreed to only mount this folder and the subfolders will be created afterward on demand.  

#### How I did it

during the syncd docker creation, only mount  folder **/var/log/mellanox**

#### How to verify it

build an Mellanox image and verify the related folder on the host and docker side.
2021-06-22 06:27:56 -07:00
Sudharsan Dhamal Gopalarathnam
c88c3c7ba5
Grouping delayed services under a target for config reload checks (#7846)
#### Why I did it
Create a target for delayed service timers. Few services in sonic have delayed to speed up the bring up of the system and essential services. However there is no way to track when they start. This will be a problem when executing config reload as config reload expects all services to be up. Hence grouped all the timers that trigger the delayed services under one target so that they could be tracked in 'config reload' command

#### How I did it
Created delay.target service and add created dependency on the delayed targets.
2021-06-21 11:55:02 -07:00
Ann Pokora
3d629233bf
[MPLS][libnl3] libnl patches for supporting MPLS
* New accessors in libnl3 for MPLS attributes
* contains patch files for bug fixes in libnl3 for MPLS attribute parsing
2021-06-16 15:08:23 -07:00
Renuka Manavalan
f7ed82f44a
[Kubernetes]: The kube server could be used as http-proxy for docker (#7469)
Why I did it
The SONiC switches get their docker images from local repo, populated during install with container images pre-built into SONiC FW. With the introduction of kubernetes, new docker images available in remote repo could be deployed. This requires dockerd to be able to pull images from remote repo.

Depending on the Switch network domain & config, it may or may not be able to reach the remote repo. In the case where remote repo is unreachable, we could potentially make Kubernetes server to also act as http-proxy.

How I did it
When admin explicitly enables, the kubernetes-server could be configured as docker-proxy. But any update to docker-proxy has to be via service-conf file environment variable, implying a "service restart docker" is required. But restart of dockerd is vey expensive, as it would restarts all dockers, including database docker.

To avoid dockerd restart, pre-configure an http_proxy using an unused IP. When k8s server is enabled to act as http-proxy, an IP table entry would be created to direct all traffic to the configured-unused-proxy-ip to the kubernetes-master IP. This way any update to Kubernetes master config would be just manipulating IPTables, which will be transparent to all modules, until dockerd needs to download from remote repo.

How to verify it
Configure a switch such that image repo is unreachable
Pre-configure dockerd with http_proxy.conf using an unused IP (e.g. 172.16.1.1)
Update ctrmgrd.service to invoke ctrmgrd.py with "-p" option.
Configure a k8s server, and deploy an image for feature with set_owner="kube"
Check if switch could successfully download the image or not.
2021-06-16 07:46:01 -07:00
arlakshm
4d07bbbec6
[Yang][cfggen] update sonic-cfggen to generate config_db from Yang data (#7712)
Why I did it
This PR adds changes in sonic-config-engine to consume configuration data in SONiC Yang schema and generate config_db entries

How I did it
Add a new file sonic_yang_cfg_generator .
This file has the functions to

parse yang data json and convert them in config_db json format.
Validate the converted config_db entries to make sure all the dependencies and constraints are met.
Add a new option -Y to the sonic-cfggen command for this purpose

Add unit tests

This capability is support only in sonic-config-engine Python3 package only
2021-06-10 12:03:33 -07:00
yozhao101
1a3cab43ac
[Monit] Deprecate the feature of monitoring the critical processes by Monit (#7676)
Signed-off-by: Yong Zhao yozhao@microsoft.com

Why I did it
Currently we leveraged the Supervisor to monitor the running status of critical processes in each container and it is more reliable and flexible than doing the monitoring by Monit. So we removed the functionality of monitoring the critical processes by Monit.

How I did it
I removed the script process_checker and corresponding Monit configuration entries of critical processes.

How to verify it
I verified this on the device str-7260cx3-acs-1.
2021-06-04 10:16:53 -07:00
Renuka Manavalan
73447efc31
Add service to restore TACACS from old config (#7560)
Why I did it
In upgrade scenarios, where config_db.json is not carry forwarded to new image, it could be left w/o TACACS credentials.
Added a service to trigger 5 minutes after boot and restore TACACS, if /etc/sonic/old_config/tacacs.json is present.

How I did it
By adding a service, that would fire 5 mins after boot.
This service apply tacacs if available.

How to verify it
Upgrade and watch status of tacacs.timer & tacacs.service
You may create /etc/sonic/old_config/tacacs.json, with updated credentials
(before 5mins after boot) and see that appears in config & persisted too.

Which release branch to backport (provide reason below if selected)
 201911
 202006
 202012
2021-06-03 20:07:17 -07:00
yozhao101
37863ac854
[Monit] Restart telemetry container if memory usage is beyond the threshold (#7645)
Signed-off-by: Yong Zhao yozhao@microsoft.com

Why I did it
This PR aims to monitor the memory usage of streaming telemetry container and restart streaming telemetry container if memory usage is larger than the pre-defined threshold.

How I did it
I borrowed the system tool Monit to run a script memory_checker which will periodically check the memory usage of streaming telemetry container. If the memory usage of telemetry container is larger than the pre-defined threshold for 10 times during 20 cycles, then an alerting message will be written into syslog and at the same time Monit will run the script restart_service to restart the streaming telemetry container.

How to verify it
I verified this implementation on device str-7260cx3-acs-1.
2021-05-28 11:13:44 -07:00
Stepan Blyshchak
d7b96dfdf1
[sonic-sdk] add sonic sdk and sonic sdk buildenv (#6712)
- Why I did it

To give SONiC Application Extension developers an environment to run and develop their apps.

- How I did it
Created sonic-sdk and sonic-sdk-buildenv dockers and their dbg versions.

- How to verify it
Build:

$ make -f slave target/sonic-sdk.gz target/sonic-sdk-buildenv.gz
2021-05-28 10:16:02 -07:00
Lawrence Lee
79914f5336
[swss.service]: Remove ordering with pmon (#7614)
Signed-off-by: Lawrence Lee <lawlee@microsoft.com>
2021-05-26 09:12:54 -07:00
Neetha John
3b06f44555
[qos]: modify dot1p to tc mapping (#7661)
Map priority 0 to TC 1 and priority 1 to TC 0

Send traffic on priority 0 and 1 and verified that it gets mapped correctly in hw

Signed-off-by: Neetha John <nejo@microsoft.com>
2021-05-20 10:36:39 -07:00
Ze Gan
8f883fee67
[macsec]: Bind macsec service to sonic.target (#7642)
MACsec service cannot be enabled by "sudo config feature state macsec enabled"

Signed-off-by: Ze Gan <ganze718@gmail.com>
2021-05-18 11:44:21 -07:00
Nazarii Hnydyn
6e264d8ac9
[swss_vars]: Add 'resource_type' attribute. (#7526)
Signed-off-by: Nazarii Hnydyn <nazariig@nvidia.com>
2021-05-06 12:14:21 -07:00
Stepan Blyshchak
cd2c86eab6
[dockers] label SONiC Docker with manifest (#5939)
Signed-off-by: Stepan Blyschak stepanb@nvidia.com

This PR is part of SONiC Application Extension

Depends on #5938

- Why I did it
To provide an infrastructure change in order to support SONiC Application Extension feature.

- How I did it
Label every installable SONiC Docker with a minimal required manifest and auto-generate packages.json file based on
installed SONiC images.

- How to verify it
Build an image, execute the following command:

admin@sonic:~$ docker inspect docker-snmp:1.0.0 | jq '.[0].Config.Labels["com.azure.sonic.manifest"]' -r | jq
Cat /var/lib/sonic-package-manager/packages.json file to verify all dockers are listed there.
2021-04-26 13:51:50 -07:00
Guohan Lu
27a635a15a Revert "Flashrom refactoring (#6922)"
This reverts commit 7dd9d1f3f2.
2021-04-25 11:51:35 -07:00
a-barboza
ec9101f9c5
RADIUS Management User Authentication Feature (#7284)
Why I did it
HLD: https://github.com/Azure/SONiC/blob/master/doc/aaa/radius_authentication.md
CLI: In a separate PR.

How I did it
How to verify it
UT: src/sonic-host-services/tests/hostcfgd/hostcfgd_radius_test.py
2021-04-23 19:09:41 -07:00
Stepan Blyshchak
ae339c95d2
[systemd] disable default systemd udev rules for interfaces (#7369)
Fix #7364

99-default.link - was always in SONiC, but previous systemd (<247) had an issue and it did not work due to issue systemd/systemd#3374. Now systemd 247 works.

However, such policy overrides teamd provided mac address which causes teamd netdev to use a random mac
address. Therefore, needs to be disabled.

Signed-off-by: Stepan Blyschak <stepanb@nvidia.com>
2021-04-21 17:50:00 -07:00
Santhosh Kumar T
7dd9d1f3f2
Flashrom refactoring (#6922)
#### Why I did it
To build flashrom properly with dependency tracking.

#### How I did it
Moved flashrom code from platform/broadcom/sonic-platform-modules-dell/tools directory to src/flashrom directory.
At the end, flashrom_0.9.7_amd64.deb package is build which will be installed in the devices.
2021-04-20 15:24:44 -07:00
guxianghong
6fe6d7394d
[arm] support compile sonic arm image on arm server (#7285)
- Support compile sonic arm image on arm server. If arm image compiling is executed on arm server instead of using qemu mode on x86 server, compile time can be saved significantly.
- Add kernel argument systemd.unified_cgroup_hierarchy=0 for upgrade systemd to version 247, according to #7228
- rename multiarch docker to sonic-slave-${distro}-march-${arch}

Co-authored-by: Xianghong Gu <xgu@centecnetworks.com>
Co-authored-by: Shi Lei <shil@centecnetworks.com>
2021-04-18 08:17:57 -07:00
Stepan Blyshchak
4369361894
[sonic_debian_extension.j2] fix systemd version not from buster-backports (#7322)
Install systemd explicitelly from backports and install libsystemd* packages from backports.

Signed-off-by: Stepan Blyschak <stepanb@nvidia.com>
2021-04-18 08:07:02 -07:00
vganesan-nokia
b313d4d092
[systemlag] Lag id boundary set for system lag (#6488)
Signed-off-by: vedganes <vedavinayagam.ganesan@nokia.com>

Changes for setting platfrom specific lag id boundary id in the chassis
app db. The platfrom specific lag id boundaries are supplied via
chassisdb.conf. The lag_id_start and lag_id_end boundary values sourced
from this file are set in chassis app db which will be used by lag id
allocator to allocate unique lag id in atomic fashion
2021-03-30 23:21:53 -07:00
Stepan Blyshchak
1f7d9e2698
[docker_img_ctl.j2] make tmpfs mounts optional and add ability to run container by image id (#6439)
- Why I did it
I made the docker_img_ctl.j2 applicable for more dockers (including application extensions dockers) by adding an option not to mount tmpfs on /tmp/ and /var/tmp/. In some applications /tmp/ is a different docker volume which can't be tmpfs.
Also, I added and ability to pass REPO[:TAG]|[@digest]/IMAGE_ID instead of just REPO name.

- How I did it
Modified docker_img_ctl.j2 and docker makefiles.

- How to verify it
Run it on the switch.
2021-03-16 17:03:12 +02:00
Stepan Blyshchak
2b8941e716
[sonic_debian_extension] add docker script to SONiC filesystem (#5935)
- Why I did it
To allow SONiC Package Migration during SONiC-2-SONiC upgrade we need to start docker daemon in chroot-ed environment in new SONiC filesystem.
Later this script will be used to start dockerd in chroot environment on SONiC

- How I did it
Install a docker service script into /usr/lib/docker/ in SONiC filesystem.

- How to verify it
Install SONiC image on the switch, mount squashfs to some directory, mount overlay rw layer over squashfs, mount procfs and sysfs, mount docker library. Start the docker using:
root@sonic:~$ /usr/lib/docker/docker.sh start

Signed-off-by: Stepan Blyshchak <stepanb@nvidia.com>
2021-03-14 14:15:42 +02:00
Renuka Manavalan
6f7cd8d772
Copy dummy flannel.conf to get around absence of CNI Network (#6985)
Why I did it
We skip install of CNI plugin, as we don't need. But this leaves node in "not ready" state, upon joining master.
To fix, we copy this dummy .conf file in /etc/cni/net.d

How I did it
Keep this file in /usr/share/sonic/templates and copy to /etc/cni/net.d upon joining k8s master.

How to verify it
Upon configuring master-IP and enable join, watch node join and move to ready state.
You may verify using kubectl get nodes command
2021-03-09 19:49:54 -08:00
Stepan Blyshchak
12c03c4f25
[sonic_debian_exntesion] install docker_image_ctl.j2 template in the image templates (#5937)
SONiC Package Manager will require to auto-generate the start script using that template. For that, we need this template to be recorded in SONiC filesystem.

Signed-off-by: Stepan Blyshchak <stepanb@nvidia.com>
2021-02-25 09:11:12 -08:00
Stepan Blyshchak
e179ec2fae
[services] introduce sonic.target (#5705)
- Why I did it
Group all SONiC services together and able to manage them together. Will be used in config reload command as much simpler and generic way to restart services.

- How I did it
Add services to sonic.target

- How to verify it
Together with Azure/sonic-utilities#1199
config reload -y

Signed-off-by: Stepan Blyshchak <stepanb@nvidia.com>
2021-02-25 14:26:24 +02:00
Ze Gan
4068944202
[MACsec]: Set MACsec feature to be auto-start (#6678)
1. Add supervisord as the entrypoint of docker-macsec
2. Add wpa_supplicant conf into docker-macsec
3. Set the macsecmgrd as the critical_process
4. Configure supervisor to monitor macsecmgrd
5. Set macsec in the features list
6. Add config variable `INCLUDE_MACSEC`
7. Add macsec.service

**- How to verify it**

Change the `/etc/sonic/config_db.json` as follow
```
{
    "PORT": {
        "Ethernet0": {
            ...
            "macsec": "test"
         }
    }
    ...
    "MACSEC_PROFILE": {
        "test": {
            "priority": 64,
            "cipher_suite": "GCM-AES-128",
            "primary_cak": "0123456789ABCDEF0123456789ABCDEF",
            "primary_ckn": "6162636465666768696A6B6C6D6E6F707172737475767778797A303132333435",
            "policy": "security"
        }
    }
}
```
To execute `sudo config reload -y`, We should find the following new items were inserted in app_db of redis
```
127.0.0.1:6379> keys *MAC*
1) "MACSEC_EGRESS_SC_TABLE:Ethernet0:72152375678227538"
2) "MACSEC_PORT_TABLE:Ethernet0"
127.0.0.1:6379> hgetall "MACSEC_EGRESS_SC_TABLE:Ethernet0:72152375678227538"
1) "ssci"
2) ""
3) "encoding_an"
4) "0"
127.0.0.1:6379> hgetall "MACSEC_PORT_TABLE:Ethernet0"
 1) "enable"
 2) "false"
 3) "cipher_suite"
 4) "GCM-AES-128"
 5) "enable_protect"
 6) "true"
 7) "enable_encrypt"
 8) "true"
 9) "enable_replay_protect"
10) "false"
11) "replay_window"
12) "0"
```

Signed-off-by: Ze Gan <ganze718@gmail.com>
2021-02-23 13:22:45 -08:00
Sujin Kang
d5238ae8dd
[pcie.yaml] Move pcie configuration file path to platform directory (#6475)
- Why I did it
The pcie configuration file location is under plugin directory not under platform directory.
#6437

- How I did it

Move all pcie.yaml configuration file from plugin to platform directory.
Remove unnecessary timer to start pcie-check.service
Move pcie-check.service to sonic-host-services
- How to verify it
Verify on the device
2021-02-21 08:27:37 -08:00
Lawrence Lee
97c605f1f7
[swss]: Clear MUX-related state DB tables on start (#6759)
* Add *MUX_CABLE_TABLE* to set of tables to clear on SWSS start, which
will clear HW_MUX_CABLE_TABLE and MUX_CABLE_TABLE
* Order swss to start before pmon to ensure that DBs are cleared before
xcvrd (running inside pmon) starts and re-populates the tables

Signed-off-by: Lawrence Lee <lawlee@microsoft.com>
2021-02-14 12:43:49 -08:00
dflynn-Nokia
88961f1339
[armhf build] Fix azure-storage dependency on cryptography package (#6780)
Fix marvell-armhf build break

The azure-storage package depends on the cryptography package. Newer
versions of cryptography require the rust compiler, the correct version
for which is not readily available in buster. Hence we pre-install an
older version here to satisfy the azure-storage dependency.
Note: This is not a problem for other architectures as pre-built versions
of cryptography are available for those. This sequence can be removed
after upgrading to debian bullseye.
2021-02-14 10:36:04 -08:00
Lior Avramov
6f8c31554f
[systemd] Increase syncd startup script timeout to support FW upgrade on init. (#6709)
**- Why I did it**
To support FW upgrade on init.

**- How I did it**
Change timeout value

**- How to verify it**
I manually changed ASIC and Gearbox FW followed by hard reset in order for FW upgrade to take place on init.

Signed-off-by: liora <liora@nvidia.com>
2021-02-11 12:53:36 +02:00
Arun Saravanan Balachandran
3015de1dd0
[sonic-host-service] Move to sonic-host-services package (#6273)
- Why I did it

To move ‘sonic-host-service’ which is currently built as a separate package to ‘sonic-host-services' package. 

- How I did it

- Moved 'sonic-host-server' to 'src/sonic-host-services' and included it as part of the python3 wheel.
- Other files were moved to 'src/sonic-host-services-data' and included as part of the deb package.
- Changed build option ‘INCLUDE_HOST_SERVICE’ to ‘ENABLE_HOST_SERVICE_ON_START’ for enabling sonic-hostservice at boot-up by default.
2021-02-08 19:35:08 -08:00
SuvarnaMeenakshi
62a599a5b3
[multi_asic][vs]: Add dependency in teamd service to start after topology service(#6594)
[multi_asic][vs]: Add dependency in teamd service to start after topology service.
- Why I did it
In multi-asic VS, topology service is run after database service to set up the internal asic topology.
swss and syncd have a dependency to start after topology service is run so that the interfaces are moved to right namespace and created in the right namespace. In case of multi-asic vs, during the initial boot up, when there is no configuration added, teamd service starts and swss/syncd do not start as topology service does not start. Upon loading configuration using config_db or minigraph, swss and sycnd start up , but teamd is not restarted as swss is not stopped and started. This causes teamd to be in a bad state and requires a reload of config.

- How I did it
Add dependency in teamd service to start after topology service is completed.

- How to verify it
No change in single asic vs or platform.
No change in multi-asic regular image.
Change only in multi-asic VS. Bring up a multi-asic VS image without any configration, teamd service will fail to start due to dependency failure. Load minigraph, start topology service, load configuration, ensure all services come up.
Signed-off-by: SuvarnaMeenakshi <sumeenak@microsoft.com>
2021-02-04 14:10:56 -08:00
abdosi
cfa8fbbf1a
[baseimage]: Updates for Ebtables and support for multi-asic (#6542)
Following changes were done for ebtables:

- Support for Multi-asic platforms. Ebtable filters are installed in namespace for multi-asic and not host. On Single asic installed on  host.

- For Multi-asic platforms we don't want to install on host otherwise Namespace-to-Namespace communication does not happens since ARP Request are not forwarded.

- Updated to use text file to restore ebtables rules then the binary format. Rules are restore as part of Database docker init instead of rc.local

- Removed the ebtable service files for buster as not needed as filters are restored/installed as part of database docker init.
   All the binaries are pre-installed with ebtables* binary are same as ebatbles-legacy-* 

Signed-off-by: Abhishek Dosi <abdosi@microsoft.com>
2021-01-27 08:36:10 -08:00
judyjoseph
46b3bd5503
[teamd]: Increase wait timeout for teamd docker stop to clean Port channels. (#6537)
The Portchannels were not getting cleaned up as the cleanup activity was taking more than 10 secs which is default docker timeout after which a SIGKILL will be send.
Fixes #6199
To check if it works out for this issue in 201911 ? #6503

This issue is significantly seen in master branch compared to 201911 because the Portchannel cleanup takes more time in master. Test on a DUT with 8 Port Channels.

master

    admin@str-s6000-acs-8:~$ time sudo systemctl stop teamd
    real    0m15.599s
    user    0m0.061s
    sys     0m0.038s
Sonic 201911.v58

    admin@str-s6000-acs-8:~$ time sudo systemctl stop teamd
    real    0m5.541s
    user    0m0.020s
    sys     0m0.028s
2021-01-23 20:57:52 -08:00
yozhao101
04cd1d61e8
[Monit] Monitoring the running status of containers. (#6251)
**- Why I did it**
This PR aims to monitor the running status of each container. Currently the auto-restart feature was enabled. If a critical process exited unexpected, the container will be restarted. If the container was restarted 3 times during 20 minutes, then it will not run anymore unless we cleared the flag using the command `sudo systemctl reset-failed <container_name>` manually. 

**- How I did it**
We will employ Monit to monitor a script. This script will generate the expected running container list and compare it with the current running containers. If there are containers which were expected to run but were not running, then an alerting message will be written into syslog.

**- How to verify it**
I tested this feature on a lab device `str-a7050-acs-3` which has single ASIC and `str2-n3164-acs-3` which has a Multi-ASIC. First I manually stopped a container by running the command `sudo systemctl stop <container_name>`, then I checked whether there was an alerting message in the syslog.

Signed-off-by: Yong Zhao <yozhao@microsoft.com>
2021-01-07 19:52:22 -08:00
Joe LeVeque
e52581e919
[PDDF] Build and install Python 3 package (#6286)
- Make PDDF code compliant with both Python 2 and Python 3
- Align code with PEP8 standards using autopep8
- Build and install both Python 2 and Python 3 PDDF packages
2021-01-07 10:03:29 -08:00
Joe LeVeque
566ea4f601
[system-health] Convert to Python 3 (#5886)
- Convert system-health scripts to Python 3
- Build and install system-health as a Python 3 wheel
- Also convert newlines from DOS to UNIX
2020-12-29 14:04:09 -08:00
Joe LeVeque
62662acbd5
No longer install some unnecessary Python 2 packages in host (#6301)
- No longer install Python 2 packages in host:
    - libpython2.7-dev
    - docker
    - ipaddress
    - netifaces
    - azure-storage
    - watchdog
    - futures

- Install Python 3 versions of the following packages in host:
    - docker
    - azure-storage
    - watchdog
    - redis
    - swsssdk (install unconditionally)
2020-12-29 13:02:11 -08:00
lguohan
162f0fdfe1
[init_cfg]: allow enable/disable swss/teamd/syncd services (#6291)
swss/teamd/syncd services were changed to always enabled
in commit fad481edc1 as a workaround
for not letting hostcfgd start service during the bootup process.

commit 317a4b3410 introduce
wait till full system bootup before updating feature states in hostcfgd.

Thus, workaround introduced in commit fad481ed can be removed

Signed-off-by: Guohan Lu <lguohan@gmail.com>
2020-12-28 10:33:46 -08:00
Prabhu Sreenivasan
df13245b9f
[CRM] Add support for snat, dnat and ipmc crm resources (#6012)
Signed-off-by: Prabhu Sreenivasan prabhu.sreenivasan@broadcom

What I did
Added support for snat, dnat and ipmc resources under CRM module.

How I did it
New feature NAT adds new resources snat_enty and dnat_entry that needs to be monitored. ipmc_entry tracks IP multicast resources used by switch.

How to verify it
sonic-utilities tests and crm spytest
2020-12-23 06:15:53 -08:00
lguohan
aa1cc848e2
[sonic-yang-mgmt-py2]: remove sonic-yang-mgmt py2 (#6262)
No longer needed as sonic-utilties has been moved python3

Signed-off-by: Guohan Lu <lguohan@gmail.com>
2020-12-22 21:05:33 -08:00
Renuka Manavalan
ba02209141
First cut image update for kubernetes support. (#5421)
* First cut image update for kubernetes support.
With this,
    1)  dockers dhcp_relay, lldp, pmon, radv, snmp, telemetry are enabled
        for kube management
        init_cfg.json configure set_owner as kube for these

    2)  Each docker's start.sh updated to call container_startup.py to register going up
          As part of this call, it registers the current owner as local/kube and its version
          The images are built with its version ingrained into image during build

    3)  Update all docker's bash script to call 'container start/stop/wait' instead of 'docker start/stop/wait'.
         For all locally managed containers, it calls docker commands, hence no change for locally managed.
        
    4)  Introduced a new ctrmgrd service, that helps with transition between owners as  kube & local and carry over any labels update from STATE-DB to API server

    5)  hostcfgd updated to handle owner change

    6) Reboot scripts are updatd to tag kube running images as local, so upon reboot they run the same image.

   7) Added kube_commands.py to handle all updates with Kubernetes API serrver -- dedicated for k8s interaction only.
2020-12-22 08:01:33 -08:00
Prabhu Sreenivasan
df2a4ded98
[ntp]: Source interface support for NTP (#6033)
Added source interface support for NTP.
Also made NTP start on Mgmt-VRF by default when configured.

**- How I did it**
1) Updated hostcfg to listen to global config NTP and NTP_SERVER tables and restart ntp when ever the configuration changes. NTP table includes source interface configuration.
2) The ntp script updated to by default start on Mgmt-VFT when configured.

Signed-off-by: Prabhu Sreenivasan <prabhu.sreenivasan@broadcom>
2020-12-21 05:34:13 -08:00
Lawrence Lee
03ad30d2ab
[build_templates]: Start SNMP timer after SWSS service (#6195)
Fixes #5663

- Why I did it
It's currently possible for the SNMP timer to conflict with config reload (specifically if the timer triggers while config reload is stopping the SWSS service). config reload triggers SWSS to shutdown, which causes SNMP to shutdown, which conflicts with the SNMP timer causing SNMP to startup. See the linked issue for more details.

- How I did it
Including the After ordering dependency forces the SNMP timer to wait until SWSS finishes stopping, preventing the conflict. If there is an ordering dependency between two units (e.g. one unit is ordered After another), if one unit is shutting down while the other is starting up, the shutdown will always be ordered before the startup. In this case, that means that the SNMP timer is forced to wait for the SWSS shutdown to complete. Only then can the SNMP timer proceed. See here for more details.

It's important to note that the After dependency will not cause SWSS to be started when the SNMP timer fires (assuming that SWSS has not yet been started). The existing Requisite dependency in the SNMP service will also not cause SWSS to be started, instead it will cause the SNMP service to fail if SWSS is not active.

Signed-off-by: Lawrence Lee <lawlee@microsoft.com>
2020-12-16 16:39:14 -08:00
Joe LeVeque
c829e6914a
Install 'wheel' package in host OS; upgrade pip and setuptools (#6187)
Install the 'wheel' package in host OS (along with python3 and python3-distutils which are also needed for building some Python packages) to eliminate error messages like the following:

```
  Running setup.py bdist_wheel for watchdog: started
  Running setup.py bdist_wheel for watchdog: finished with status 'error'
  Complete output from command /usr/bin/python -u -c "import setuptools, tokenize;__file__='/tmp/pip-install-Qd3K08/watchdog/setup.py';f=getattr(tokenize, 'open', open)(__file__);code=f.read().replace('\r\n', '\n');f.close();exec(compile(code, __file__, 'exec'))" bdist_wheel -d /tmp/pip-wheel-0AHpMe --python-tag cp27:
  usage: -c [global_opts] cmd1 [cmd1_opts] [cmd2 [cmd2_opts] ...]
     or: -c --help [cmd1 cmd2 ...]
     or: -c --help-commands
     or: -c cmd --help
  
  error: invalid command 'bdist_wheel'
  
  ----------------------------------------
  Failed building wheel for watchdog

```

These error messages appear to have no impact on the image build, because the Python package seems to still get installed successfully afterward, just the building of a wheel package fails. Therefore, this is more of a cosmetic fix than an actual bug.

This is an addendum to https://github.com/Azure/sonic-buildimage/pull/6182.

Also upgrade pip and install more recent version of setuptools package via PyPI.
2020-12-16 16:38:15 -08:00
Sabareesh-Kumar-Anandan
9f4ca01388
[sonic-config-engine] Adding dependent pkgs needed for arm compilation (#6186)
libxslt-dev and libz-dev are dependencies for lxml==4.6.1 which is required for pyangbind==0.8.1

lxml-4.6.2-cp37-cp37m-manylinux1_x86_64.whl is directly downloaded in amd64 whereas in arm this is built from lxml-4.6.2.tar.gz

Signed-off-by: Sabareesh Kumar Anandan <sanandan@marvell.com>
2020-12-15 08:44:46 -08:00
Stephen Sun
e010d83fc3
[Dynamic buffer calc] Support dynamic buffer calculation (#6194)
**- Why I did it**
To support dynamic buffer calculation.
This PR also depends on the following PRs for sub modules
- [sonic-swss: [buffermgr/bufferorch] Support dynamic buffer calculation #1338](https://github.com/Azure/sonic-swss/pull/1338)
- [sonic-swss-common: Dynamic buffer calculation #361](https://github.com/Azure/sonic-swss-common/pull/361)
- [sonic-utilities: Support dynamic buffer calculation #973](https://github.com/Azure/sonic-utilities/pull/973)

**- How I did it**
1. Introduce field `buffer_model` in `DEVICE_METADATA|localhost` to represent which buffer model is running in the system currently:
    - `dynamic` for the dynamic buffer calculation model
    - `traditional` for the traditional model in which the `pg_profile_lookup.ini` is used
2. Add the tables required for the feature:
   - ASIC_TABLE in platform/\<vendor\>/asic_table.j2
   - PERIPHERAL_TABLE in platform/\<vendor\>/peripheral_table.j2
   - PORT_PERIPHERAL_TABLE on a per-platform basis in device/\<vendor\>/\<platform\>/port_peripheral_config.j2 for each platform with gearbox installed.
   - DEFAULT_LOSSLESS_BUFFER_PARAMETER and LOSSLESS_TRAFFIC_PATTERN in files/build_templates/buffers_config.j2
   - Add lossless PGs (3-4) for each port in files/build_templates/buffers_config.j2
3. Copy the newly introduced j2 files into the image and rendering them when the system starts
4. Update the CLI options for buffermgrd so that it can start with dynamic mode
5. Fetches the ASIC vendor name in orchagent:
   - fetch the vendor name when creates the docker and pass it as a docker environment variable
   - `buffermgrd` can use this passed-in variable
6. Clear buffer related tables from STATE_DB when swss docker starts
7. Update the src/sonic-config-engine/tests/sample_output/buffers-dell6100.json according to the buffer_config.j2
8. Remove buffer pool sizes for ingress pools and egress_lossy_pool
   Update the buffer settings for dynamic buffer calculation
2020-12-13 11:35:39 -08:00
Junchao-Mellanox
51c77b179f
[Mellanox] Add python3 support for Mellanox platform API (#6175)
python2 is end of life and SONiC is going to support python3. This PR is going to support:

1. Mellanox SONiC platform API python3 support
2. Install both python2 and python3 verson of Mellanox SONiC platform API or pmon and host side
2020-12-11 10:51:31 -08:00
Prabhu Sreenivasan
77afb8e54d
[ntp]: ntp-systemd-wrapper file is getting overwritten (#6179)
ntp-systemd-wrapper file from files/image_config/ntp was not getting picked up. Added a line on sonic_debian_extension.j2 to copy over the file from files/image_config/ntp after installing the debian package.

Signed-off-by: Prabhu Sreenivasan <prabhu.sreenivasan@broadcom.com>
2020-12-10 23:20:41 -08:00
rajendra-dendukuri
31ce20ac38
[kdump]: Kdump usability and reliability improvements (#6113)
- Allow platform specific reboot script to be called after crash kernel has
finished copying the kernel vmcore
- Disable pcie advanced features when running crash kernel. This improves
reliability of the crash kernel to successfully create a vmcore and also
reboot
- Allow crash kernel to reboot if a panic is seen while it is generating a
vmcore
- Fix crash kernel to use the SONiC specific /usr/local/bin/reboot script
instead of the Linux reboot command /sbin/reboot
- Use sonic_platform as the kernel command line parameter to pass platform identifier string

Signed-off-by: Rajendra Dendukuri <rajendra.dendukuri@broadcom.com>
2020-12-10 01:32:37 -08:00
Samuel Angebault
8576911a57
[database-chassis]: Fix the way database-chassis start (#6099)
The service crash when the platform boots due to missing waits.
/usr/bin/database.sh tries to operate on a missing socket and fails.
We now wait for the chassis database to be ready the same way we do database.
2020-12-04 10:09:35 -08:00
Joe LeVeque
83f0d8240e
[pmon]: Install vanilla 'thrift' Python 2 and 3 packages for Barefoot in host and PMon (#6080)
Barefoot platform vendors' sonic_platform packages import the Python 'thrift' library. Previously, our custom-built package was being installed in the PMon container and host OS. However, we are only building a Python 2 version of that package, which was only intended for use with saithrift.

Fixes #6077
2020-12-04 08:41:17 -08:00
Garrick He
fc0e6af337
[sflow] Fix race-condition seen with mVRF configured (#6102)
Under certain conditions, the sFlow service can start before
interface configurations are sucessfully applied. This will
cause hsflowd to get a socket error.

This fix ensures all interface configurations are successfully
applied before the sFlow service (hsflowd) starts.

During testing we saw this error from hsflowd if interface configs were not successfully applied before hsflowd started.

    ERR sflow#hsflowd: socket sendto error: Network is unreachable

no FLOW samples can be seen. This can be consistently reproducible if you force sFlow service to start before interface-config.service.

Signed-off-by: Garrick He <garrick_he@dell.com>
2020-12-03 01:33:10 -08:00
lguohan
4812953468
[ntp]: build ntp with various fixes (#6037)
- NTP Bug 1970 (UNLINK_EXPR_SLIST empty list) Fix
- ENOBUFS log message level set to WARN
- Fix audit message seen on console apparmor
- add force-confold option when install ntp

Signed-off-by: Guohan Lu <lguohan@gmail.com>
Co-authored-by: Prabhu Sreenivasan <prabhu.sreenivasan@broadcom>
2020-12-02 15:02:50 -08:00
Joe LeVeque
7f4ab8fbd8
[sonic-utilities] Update submodule; Build and install as a Python 3 wheel (#5926)
Submodule updates include the following commits:

* src/sonic-utilities 9dc58ea...f9eb739 (18):
  > Remove unnecessary calls to str.encode() now that the package is Python 3; Fix deprecation warning (#1260)
  > [generate_dump] Ignoring file/directory not found Errors (#1201)
  > Fixed porstat rate and util issues (#1140)
  > fix error: interface counters is mismatch after warm-reboot (#1099)
  > Remove unnecessary calls to str.decode() now that the package is Python 3 (#1255)
  > [acl-loader] Make list sorting compliant with Python 3 (#1257)
  > Replace hard-coded fast-reboot with variable. And some typo corrections (#1254)
  > [configlet][portconfig] Remove calls to dict.has_key() which is not available in Python 3 (#1247)
  > Remove unnecessary conversions to list() and calls to dict.keys() (#1243)
  > Clean up LGTM alerts (#1239)
  > Add 'requests' as install dependency in setup.py (#1240)
  > Convert to Python 3 (#1128)
  > Fix mock SonicV2Connector in python3: use decode_responses mode so caller code will be the same as python2 (#1238)
  > [tests] Do not trim from PATH if we did not append to it; Clean up/fix shebangs in scripts (#1233)
  > Updates to bgp config and show commands with BGP_INTERNAL_NEIGHBOR table (#1224)
  > [cli]: NAT show commands newline issue after migrated to Python3 (#1204)
  > [doc]: Update Command-Reference.md (#1231)
  > Added 'import sys' in feature.py file (#1232)

* src/sonic-py-swsssdk 9d9f0c6...1664be9 (2):
  > Fix: no need to decode() after redis client scan, so it will work for both python2 and python3 (#96)
  > FieldValueMap `contains`(`in`)  will also work when migrated to libswsscommon(C++ with SWIG wrapper) (#94)

- Also fix Python 3-related issues:
    - Use integer (floor) division in config_samples.py (sonic-config-engine)
    - Replace print statement with print function in eeprom.py plugin for x86_64-kvm_x86_64-r0 platform
    - Update all platform plugins to be compatible with both Python 2 and Python 3
    - Remove shebangs from plugins files which are not intended to be executable
    - Replace tabs with spaces in Python plugin files and fix alignment, because Python 3 is more strict
    - Remove trailing whitespace from plugins files
2020-11-25 10:28:36 -08:00
abdosi
fad481edc1
Enhanced Feature table to support 'always_enabled' value for state and auto-restart fields. (#6000)
Added new flag value 'always_enabled' for the state and auto-restart field of feature table

init_cfg.json is updated to initialize state field of database/swss/syncd/teamd feature and auto-restart field of database feature
as always_enabled

Once the state/auto-restart value is initialized as "always_enabled" it is immutable and cannot be change via feature config commands. (config feature..) PR#Azure/sonic-utilities#1271

hostcfgd will not take any action if state field value is 'always_enabled'

Since we have always_enabled field for auto-restart updated supervisor-proc-exit-listener
not to have special check for database and always rely on value from Feature table.
2020-11-25 08:41:11 -08:00
Sudharsan Dhamal Gopalarathnam
98a434e8c1
Copp Manager Changes (#4861)
*Introduce CoPP Manager infrastructure
Copp service to generate initial copp config template file

Co-authored-by: dgsudharsan <sudharsan_gopalarat@dell.com>
2020-11-23 09:31:42 -08:00
Sujin Kang
5b31996f7b
[reboot-history] Add reboot history to state db (#5933)
- Why I did it
Add reboot history to State db so that can be used telemetry service
- How I did it
Split the process-reboot-cause service to determine-reboot-cause and process-reboot-cause
determine-reboot-cause to determine the reboot cause
process-reboot-cause to parse the reboot cause files and put the reboot history to state db
Moved to sonic-host-service* packages
- How to verify it
Performed unit test and tested on DUT
2020-11-20 20:08:18 -08:00
heidinet2007
7c17c58b83
Move teamd warm reboot code to service script (#5163)
Summary: Move teamd functions to a new service script

Motivation: To segregate teamd functions in one common place. fast-reboot script calls teamd functions that should ideally be replaced by a simple call to a service script.
 
Changes: New teamd service script and path modification from /usr/bin/teamd.sh to /usr/local/bin/teamd.sh
fast-reboot script (in sonic-utilities) modification (to use new teamd.sh to stop teamd) should follow soon after this change.

Verification: VS image tests.

Signed-off-by: Vaibhav Hemant Dixit <vaibhav.dixit@microsoft.com>

Co-authored-by: heidi.ou@alibaba-inc.com <heidi.ou@alibaba-inc.com>
Co-authored-by: Ying Xie <ying.xie@microsoft.com>
2020-11-13 13:34:18 -08:00
fk410167
a3dd3f55f9
Platform Driver Developement Framework (PDDF) (#4756)
This change introduces PDDF which is described here: https://github.com/Azure/SONiC/pull/536

Most of the platform bring up effort goes in developing the platform device drivers, SONiC platform APIs and validating them. Typically each platform vendor writes their own drivers and platform APIs which is very tailor made to that platform. This involves writing code, building, installing it on the target platform devices and testing. Many of the details of the platform are hard coded into these drivers, from the HW spec. They go through this cycle repetitively till everything works fine, and is validated before upstreaming the code.
PDDF aims to make this platform driver and platform APIs development process much simpler by providing a data driven development framework. This is enabled by:

JSON descriptor files for platform data
Generic data-driven drivers for various devices
Generic SONiC platform APIs
Vendor specific extensions for customisation and extensibility

Signed-off-by: Fuzail Khan <fuzail.khan@broadcom.com>
2020-11-12 10:22:38 -08:00
Lawrence Lee
ae69fdf312
[buffers_config.j2]: Use correct cable lengths for backend devices (#5905)
* Remove 'backend' from device type strings so that backend devices ('BackEndToRRouter' and 'BackEndLeafRouter') are given the same cable lengths as regular device types.

Signed-off-by: Lawrence Lee <lawlee@microsoft.com>
2020-11-12 09:03:59 -08:00
Lawrence Lee
d0f16c0d79
Make backend device checking more robust (#5730)
Treat devices that are ToRRouters (ToRRouters and BackEndToRRouters) the same when rendering templates
 Except for BackEndToRRouters belonging to a storage cluster, since these devices have extra sub-interfaces created
Treat devices that are LeafRouters (LeafRouters and BackEndLeafRouters) the same when rendering templates

Signed-off-by: Lawrence Lee <lawlee@microsoft.com>
2020-11-10 15:06:35 -08:00
abdosi
4f82463670
[multi-asic] Fixed the docker mount point check for multi-asic (#5848)
API getMount() API was not updated to handle multi-asic platforms
Updated API getMount() to return abspath() for Docker Mount Point
and use that one for mount point comparison

Signed-off-by: Abhishek Dosi <abdosi@microsoft.com>
2020-11-09 13:03:00 -08:00
Guohan Lu
ad2e18e856 [baseimage]: install psutil for python3
psutil is needed by process_checker which is using python3

Signed-off-by: Guohan Lu <lguohan@gmail.com>
2020-11-09 00:29:10 -08:00
Praveen Chaudhary
6156cb2805
[sonic-yang-mgmt] Build PY3 & PY2 packages (#5559)
Moving sonic-yang-mgmt to PY3 to support move of sonic-utilities to PY3.

Signed-off-by: Praveen Chaudhary<pchaudhary@linkedin.com>
2020-11-07 13:03:41 -08:00
Joe LeVeque
04d0e8ab00
[hostcfgd] Convert to Python 3; Add to sonic-host-services package (#5713)
To consolidate host services and install via packages instead of file-by-file, also as part of migrating all of SONiC to Python 3, as Python 2 is no longer supported.
2020-11-07 12:48:19 -08:00
lguohan
e6796da141
[init_cfg.json.j2]: only enable gbsyncd feature for vs platform (#5815)
currently only vs platform has gdbsyncd feature built

Signed-off-by: Guohan Lu <lguohan@gmail.com>
2020-11-07 00:46:18 -08:00
Joe LeVeque
d8045987a6
[core_uploader.py] Convert to Python 3; Use logger from sonic-py-common for uniform logging (#5790)
- Convert core_uploader.py script to Python 3
- Use logger from sonic-py-common for uniform logging
- Reorganize imports alphabetically per PEP8 standard
- Two blank lines precede functions per PEP8 standard
- Remove unnecessary global variable declarations
2020-11-05 11:19:26 -08:00
Blueve
698b5544c9
[openssh] Introduce custom openssh-server package for supporting reverse console SSH (#5717)
* Build and install openssh from source
* Copy openssh deb package to dest folder
* Update make rule
* Update sonic debian extension
* Append empty line before EOF
* Update openssh patch
* Add openssh-server to base image dependency
* Fix indent type
* Fix comments
* Use commit id instead of tag id and add comment

Signed-off-by: Jing Kan jika@microsoft.com
2020-11-02 10:31:15 +08:00
Joe LeVeque
6333bb73b0
Explicitly call pip2 rather than pip in locations where both pip2 and pip3 are installed (#5747)
As part of the transition from Python 2 to Python 3, we are installing both pip2 and pip3 in the slave and config-engine containers. This PR replaces calls to `pip` in these containers with an explicit call to `pip2` to ensure the proper version of pip is executed, no matter which version of pip is aliased to `pip`, as we no longer rely on that alias.

Also some other pip-related cleanup
2020-10-30 09:43:14 -07:00
Joe LeVeque
e111204206
[caclmgrd] Convert to Python 3; Add to sonic-host-services package (#5739)
To consolidate host services and install via packages instead of file-by-file, also as part of migrating all of SONiC to Python 3, as Python 2 is no longer supported, convert caclmgrd to Python 3 and add to sonic-host-services package
2020-10-29 16:29:12 -07:00
Shi Su
5ee5c13f32
Enable synchronous mode by default and add in minigraph parser (#5735) 2020-10-29 09:15:12 -07:00
lguohan
07748a939f
[gbsyncd]: add gbsyncd to FEATURE table (#5683)
remove syncd from critical process list because
gbsyncd process will exit for platform without
gearbox.

closes #5623

Signed-off-by: Guohan Lu <lguohan@gmail.com>
2020-10-27 11:40:23 -07:00
Joe LeVeque
9e34003136
[sonic-config-engine] Clean up dependencies, pin versions; install Python 3 package in Buster container (#5656)
To clean up the image build procedure, and let setuptools/pip[3] implicitly install Python dependencies. Also use ipaddress package instead of ipaddr.
2020-10-26 13:48:50 -07:00
Shi Su
67408c85aa
[synchronous-mode] Add template file for synchronous mode (#5644)
The orchagent and syncd need to have the same default synchronous mode configuration. This PR adds a template file to translate the default value in CONFIG_DB (empty field) to an explicit mode so that the orchagent and syncd could have the same default mode.
2020-10-23 13:08:35 -07:00
Joe LeVeque
3a4435eb53
Add sonic-host-services and sonic-host-services-data packages (#5694)
**- Why I did it**

Install all host services and their data files in package format rather than file-by-file

**- How I did it**

- Create sonic-host-services Python wheel package, currently including procdockerstatsd
  - Also add the framework for unit tests by adding one simple procdockerstatsd test case
- Create sonic-host-services-data Debian package which is responsible for installing the related systemd unit files to control the services in the Python wheel. This package will also be responsible for installing any Jinja2 templates and other data files needed by the host services.
2020-10-23 09:52:29 -07:00
BrynXu
29928c93a1
[chassis]: Use correct path for chassisdb.conf file (#5632)
use correct chassisdb.conf path while bringing up chassis_db service on VoQ modular switch.chassis_db service on VoQ modular switch.

resolves #5631

Signed-off-by: Honggang Xu <hxu@arista.com>
2020-10-21 01:40:04 -07:00
BrynXu
a2e3d2fcea
[ChassisDB]: bring up ChassisDB service (#5283)
bring up chassisdb service on sonic switch according to the design in
Distributed Forwarding in VoQ Arch HLD

Signed-off-by: Honggang Xu <hxu@arista.com>

**- Why I did it**
To bring up new ChassisDB service in sonic as designed in ['Distributed forwarding in a VOQ architecture HLD' ](90c1289eaf/doc/chassis/architecture.md). 

**- How I did it**
Implement the section 2.3.1 Global DB Organization of the VOQ architecture HLD.

**- How to verify it**
ChassisDB service won't start without chassisdb.conf file on the existing platforms.
ChassisDB service is accessible with global.conf file in the distributed arichitecture.

Signed-off-by: Honggang Xu <hxu@arista.com>
2020-10-14 15:15:24 -07:00
Joe LeVeque
88c1d66c27
[python-click] No longer build our own package, let pip/setuptools install vanilla (#5549)
We were building our own python-click package because we needed features/bug fixes available as of version 7.0.0, but the most recent version available from Debian was in the 6.x range.

"Click" is needed for building/testing and installing sonic-utilities. Now that we are building sonic-utilities as a wheel, with Click specified as a dependency in the setup.py file, setuptools will install a more recent version of Click in the sonic-slave-buster container when building the package, and pip will install a more recent version of Click in the host OS of SONiC when installing the sonic-utilities package. Also, we don't need to worry about installing the Python 2 or 3 version of the package, as the proper one will be installed as necessary.
2020-10-14 10:16:35 -07:00
Junchao-Mellanox
1c97a03b81
[system-health] Add support for monitoring system health (#4835)
* system health first commit

* system health daemon first commit

* Finish healthd

* Changes due to lower layer logic change

* Get ASIC temperature from TEMPERATURE_INFO table

* Add system health make rule and service files

* fix bugs found during manual test

* Change make file to install system-health library to host

* Set system LED to blink on bootup time

* Caught exceptions in system health checker to make it more robust

* fix issue that fan/psu presence will always be true

* fix issue for external checker

* move system-health service to right after rc-local service

* Set system-health service start after database service

* Get system up time via /proc/uptime

* Provide more information in stat for CLI to use

* fix typo

* Set default category to External for external checker

* If external checker reported OK, save it to stat too

* Trim string for external checker output

* fix issue: PSU voltage check always return OK

* Add unit test cases for system health library

* Fix LGTM warnings

* fix demo comments: 1. get boot up timeout from monit configuration file; 2. set system led in library instead of daemon

* Remove boot_timeout configuration because it will get from monit config file

* Fix argument miss

* fix unit test failure

* fix issue: summary status is not correct

* Fix format issues found in code review

* rename th to threshold to make it clearer

* Fix review comment: 1. add a .dep file for system health; 2. deprecated daemon_base and uses sonic-py-common instead

* Fix unit test failure

* Fix LGTM alert

* Fix LGTM alert

* Fix review comments

* Fix review comment

* 1. Add relevant comments for system health; 2. rename external_checker to user_define_checker

* Ignore check for unknown service type

* Fix unit test issue

* Rename user define checker to user defined checker

* Rename user_define_checkers to user_defined_checkers for configuration file

* Renmae file user_define_checker.py -> user_defined_checker.py

* Fix typo

* Adjust import order for config.py

Co-authored-by: Joe LeVeque <jleveque@users.noreply.github.com>

* Adjust import order for src/system-health/health_checker/hardware_checker.py

Co-authored-by: Joe LeVeque <jleveque@users.noreply.github.com>

* Adjust import order for src/system-health/scripts/healthd

Co-authored-by: Joe LeVeque <jleveque@users.noreply.github.com>

* Adjust import orders in src/system-health/tests/test_system_health.py

* Fix typo

* Add new line after import

* If system health configuration file not exist, healthd should exit

* Fix indent and enable pytest coverage

* Fix typo

* Fix typo

* Remove global logger and use log functions inherited from super class

* Change info level logger to notice level

Co-authored-by: Joe LeVeque <jleveque@users.noreply.github.com>
2020-10-12 11:12:49 +03:00
jon-nokia
d03de95e81
[build]: fix pip installation for sonic utilities whl package (#5498)
The problem was proxy was missing on "pip install". This is to fix the build behind the proxy.

Signed-off-by: Jon Goldberg <jon.goldberg@nokia.com>
2020-10-06 15:47:50 -07:00
Tamer Ahmed
110f7b7817 [cfggen] Build Python 2 And Python 3 Wheel Packages
This builds Python 2&3 wheel packages for sonic-cfggen script.

singed-off-by: Tamer Ahmed <tamer.ahmed@microsoft.com>
2020-09-30 07:07:43 -07:00
Volodymyr Boiko
d71a4efe3b
[sonic-platform-common] Install Python 3 package in host OS and PMon container (#5461)
Signed-off-by: Volodymyr Boyko <volodymyrx.boiko@intel.com>
2020-09-29 13:57:54 -07:00
judyjoseph
4006ce711f
[Multi-Asic] Forward SNMP requests received on front panel interface to SNMP agent in host. (#5420)
* [Multi-Asic] Forward SNMP requests destined to loopback IP, and coming in through the front panel interface
             present in the network namespace, to SNMP agent running in the linux host.

* Updates based on comments

* Further updates in docker_image_ctl.j2 and caclmgrd

* Change the variable for net config file.

* Updated the comments in the code.

* No need to clean up the exising NAT rules if present, which could be created by some other process.

* Delete our rule first and add it back, to take care of caclmgrd restart.
Another benefit is that we delete only our rules, rather than earlier approach of "iptables -F" which cleans up all rules.

* Keeping the original logic to clean the NAT entries, to revist when NAT feature added in namespace.

* Missing updates to log_info call.
2020-09-26 12:14:30 -07:00
Syd Logan
0311a4a037
Add gearbox phy device files and a new physyncd docker to support VS gearbox phy feature (#4851)
* buildimage: Add gearbox phy device files and a new physyncd docker to support VS gearbox phy feature

* scripts and configuration needed to support a second syncd docker (physyncd)
* physyncd supports gearbox device and phy SAI APIs and runs multiple instances of syncd, one per phy in the device
* support for VS target (sonic-sairedis vslib has been extended to support a virtual BCM81724 gearbox PHY).

HLD is located at b817a12fd8/doc/gearbox/gearbox_mgr_design.md

**- Why I did it**

This work is part of the gearbox phy joint effort between Microsoft and Broadcom, and is based
on multi-switch support in sonic-sairedis.

**- How I did it**

Overall feature was implemented across several projects. The collective pull requests (some in late stages of review at this point):

https://github.com/Azure/sonic-utilities/pull/931 - CLI (merged)
https://github.com/Azure/sonic-swss-common/pull/347 - Minor changes (merged)
https://github.com/Azure/sonic-swss/pull/1321 - gearsyncd, config parsers, changes to orchargent to create gearbox phy on supported systems
https://github.com/Azure/sonic-sairedis/pull/624 - physyncd, virtual BCM81724 gearbox phy added to vslib

**- How to verify it**

In a vslib build:

root@sonic:/home/admin# show gearbox interfaces status
  PHY Id    Interface        MAC Lanes    MAC Lane Speed        PHY Lanes    PHY Lane Speed    Line Lanes    Line Lane Speed    Oper    Admin
--------  -----------  ---------------  ----------------  ---------------  ----------------  ------------  -----------------  ------  -------
       1   Ethernet48  121,122,123,124               25G  200,201,202,203               25G       204,205                50G    down     down
       1   Ethernet49  125,126,127,128               25G  206,207,208,209               25G       210,211                50G    down     down
       1   Ethernet50      69,70,71,72               25G  212,213,214,215               25G           216               100G    down     down

In addition, docker ps | grep phy should show a physyncd docker running.

  Signed-off-by: syd.logan@broadcom.com
2020-09-25 08:32:44 -07:00
yozhao101
13cec4c486
[Monit] Unmonitor the processes in containers which are disabled. (#5153)
We want to let Monit to unmonitor the processes in containers which are disabled in `FEATURE` table such that
Monit will not generate false alerting messages into the syslog.

Signed-off-by: Yong Zhao <yozhao@microsoft.com>
2020-09-25 00:28:28 -07:00
abdosi
0483255e82
Fix the build issue when port2cable lenth define in (#5437)
buffer_default_*.j2 because of which internal cable length never gets
define and cause failure in test case test_multinpu_cfggen.py

Signed-off-by: Abhishek Dosi <abdosi@abdosi-ubuntu-vm0.nwp1qucpfg5ejooejenqshkj3e.cx.internal.cloudapp.net>

Co-authored-by: Abhishek Dosi <abdosi@abdosi-ubuntu-vm0.nwp1qucpfg5ejooejenqshkj3e.cx.internal.cloudapp.net>
2020-09-23 08:07:09 -07:00
abdosi
75e4258508
Enhanced Feature Table state enable/disable for multi-asic platforms. (#5358)
* Enhanced Feature Table state enable/disbale for multi-asic platforms.
In Multi-asic for some features we can service per asic so we need to
get list of all services.

Also updated logic to return if any one of systemctl command return failure
and make sure syslog of feature getting enable/disable only come when
all commads are sucessful.

Moved the service list get api from sonic-util to sonic-py-common

Signed-off-by: Abhishek Dosi <abdosi@abdosi-ubuntu-vm0.nwp1qucpfg5ejooejenqshkj3e.cx.internal.cloudapp.net>

* Make sure to retun None for both service list in case of error.

Signed-off-by: Abhishek Dosi <abdosi@abdosi-ubuntu-vm0.nwp1qucpfg5ejooejenqshkj3e.cx.internal.cloudapp.net>

* Return empty list as fail condition

Signed-off-by: Abhishek Dosi <abdosi@abdosi-ubuntu-vm0.nwp1qucpfg5ejooejenqshkj3e.cx.internal.cloudapp.net>

* Address Review Comments.

Made init_cfg.json.j2 knowledegable of Feature
service is global scope or per asic scope

Signed-off-by: Abhishek Dosi <abdosi@abdosi-ubuntu-vm0.nwp1qucpfg5ejooejenqshkj3e.cx.internal.cloudapp.net>

* Fix merge conflict

* Address Review Comment.

Signed-off-by: Abhishek Dosi <abdosi@abdosi-ubuntu-vm0.nwp1qucpfg5ejooejenqshkj3e.cx.internal.cloudapp.net>

Co-authored-by: Abhishek Dosi <abdosi@abdosi-ubuntu-vm0.nwp1qucpfg5ejooejenqshkj3e.cx.internal.cloudapp.net>
2020-09-22 08:34:02 -07:00
Joe LeVeque
3987cbd80a
[sonic-utilities] Build and install as a Python wheel package (#5409)
We are moving toward building all Python packages for SONiC as wheel packages rather than Debian packages. This will also allow us to more easily transition to Python 3.

Python files are now packaged in "sonic-utilities" Pyhton wheel. Data files are now packaged in "sonic-utilities-data" Debian package.

**- How I did it**
- Build and install sonic-utilities as a Python package
- Remove explicit installation of wheel dependencies, as these will now get installed implicitly by pip when installing sonic-utilities as a wheel
- Build and install new sonic-utilities-data package to install data files required by sonic-utilities applications
- Update all references to sonic-utilities scripts/entrypoints to either reference the new /usr/local/bin/ location or remove absolute path entirely where applicable

Submodule updates:

* src/sonic-utilities aa27dd9...2244d7b (5):
  > Support building sonic-utilities as a Python wheel package instead of a Debian package (#1122)
  > [consutil] Display remote device name in show command (#1120)
  > [vrf] fix check state_db error when vrf moving (#1119)
  > [consutil] Fix issue where the ConfigDBConnector's reference is missing (#1117)
  > Update to make config load/reload backward compatible. (#1115)

* src/sonic-ztp dd025bc...911d622 (1):
  > Update paths to reflect new sonic-utilities install location, /usr/local/bin/ (#19)
2020-09-20 20:16:42 -07:00
Tamer Ahmed
2de3afaf35
[swss] Enhance ARP Update to Call Sonic Cfggen Once (#5398)
This PR limited the number of calls to sonic-cfggen to one call
per iteration instead of current 3 calls per iteration.

The PR also installs jq on host for future scripts if needed.

signed-off-by: Tamer Ahmed <tamer.ahmed@microsoft.com>
2020-09-18 18:44:23 -07:00
Stepan Blyshchak
6de9390bb0
[build] Add a parameter to specify sonic version during build (#5278)
Introduced a new build parameter 'SONIC_IMAGE_VERSION' that allows build
system users to build SONiC image with a specific version string. If
'SONIC_IMAGE_VERSION' was not passed by the user, SONIC_IMAGE_VERSION will be
set to the output of functions.sh:sonic_get_version function.

Signed-off-by: Stepan Blyschak <stepanb@nvidia.com>
2020-09-16 10:47:26 -07:00
shi-su
339cfbf9af
Remove the configuration of synchronous mode from init_cfg.json (#5308)
Remove the configuration of synchronous mode from init_cfg.json
2020-09-10 01:26:10 -07:00
Tamer Ahmed
fdb9d028e9
[redis] Add redis Group And Grant Read/Write Access to Members (#5289)
sonic-cfggen is now using Unix Domain Socket for Redis DB. The socket
is created using root account. Subsequently, services that are started
as admin fails to start. This PR creates redis group and add admin
user to redis group. It also grants read/write access on redis.sock
for redis group members.

signed-off-by: Tamer Ahmed <tamer.ahmed@microsoft.com>
2020-09-02 23:40:22 -07:00
Stepan Blyshchak
b31050d60e
[services][mgmt-framework] delay mgmt-framework service on boot (#5226)
management framework provides management plane services like rest and
CLI which is not needed right after boot, instead by delaying this
service we give some more CPU for data plane and control plane services
on fast/warm boot.

Signed-off-by: Stepan Blyschak <stepanb@nvidia.com>
2020-08-27 21:53:58 +03:00
shi-su
f3feb56c8a
Add switch for synchronous mode (#5237)
Add a master switch so that the sync/async mode can be configured.
Example usage of the switch:
1.  Configure mode while building an image
    `make ENABLE_SYNCHRONOUS_MODE=y <target>`
2. Configure when the device is running 
    Change CONFIG_DB with `sonic-cfggen -a '{"DEVICE_METADATA":{"localhost": {"synchronous_mode": "enable"}}}' --write-to-db`
    Restart swss with `systemctl restart swss`
2020-08-24 14:04:10 -07:00
nirenjan
bb57ccecd4
[sonic-host-service]: Add SONiC Host Services infrastructure (#4840)
- Why I did it

When SONiC is configured with the management framework and/or telemetry services, the applications running inside those containers need to access some functionality on the host system. The following is a non-exhaustive list of such functionality:

Image management
Configuration save and load
ZTP enable/disable and status
Show tech support
- How I did it

The host service is a Python process that listens for requests via D-Bus. It will then service those requests and send a response back to the requestor.

This PR only introduces the host service infrastructure. Applications that need access to the host services must add applets that will register on D-Bus endpoints to service the appropriate functionality.

- How to verify it

- Description for the changelog

Add SONiC Host Service for container to execute select commands in host

Signed-off-by: Nirenjan Krishnan <Nirenjan.Krishnan@dell.com>
2020-08-21 15:34:14 -07:00
abdosi
1a805e7409
Fix unwanted python exception in syslog during database container (#5227)
startup when doing redis PING since database_config.json getting
generated from jinja2 template is still not ready.

Signed-off-by: Abhishek Dosi <abdosi@microsoft.com>
2020-08-21 07:33:19 -07:00
abdosi
74d8b4a6be
[caclmgrd] Add support for multi-ASIC platforms (#5022)
* Support for Control Plane ACL's for Multi-asic Platforms.
Following changes were done:
 1) Moved from using blocking listen() on Config DB to the select() model
 via python-swsscommon since we have to wait on event from multiple
 config db's
 2) Since  python-swsscommon is not available on host added libswsscommon and python-swsscommon
    and dependent packages in the base image (host enviroment)
 3) Made iptables programmed in all namespace using ip netns exec

Signed-off-by: Abhishek Dosi <abdosi@microsoft.com>

* Address Review Comments

Signed-off-by: Abhishek Dosi <abdosi@microsoft.com>

* Fix Review Comments

* Fix Comments

* Added Change for Multi-asic to have iptables
rules to accept internal docker tcp/udp traffic
needed for syslog and redis-tcp connection.

Signed-off-by: Abhishek Dosi <abdosi@microsoft.com>

* Fix Review Comments

* Added more comments on logic.

* Fixed all warning/errors reported by http://pep8online.com/
other than line > 80 characters.

* Fix Comment
Signed-off-by: Abhishek Dosi <abdosi@microsoft.com>

* Verified with swsscommon package. Fix issue for single asic platforms.

* Moved to new python package

* Address Review Comments.

Signed-off-by: Abhishek Dosi <abdosi@microsoft.com>

* Address Review Comments.
2020-08-20 15:11:42 -07:00
Tamer Ahmed
e484ae9dda
[services] Fix Delay Start of SNMP And Telemetry (#5211)
SNMP and Telemetry services are not critical to switch startup.
They also cause fast-reboot not to meet timing requirements.
In order to delay start those service are associated with systemd
timer units, however when hostcfgd initiate service start, it start
the service and not the timer. This PR fixes this issue by
starting the timer associated with systemd unit.

signed-off-by: Tamer Ahmed <tamer.ahmed@microsoft.com>
2020-08-19 19:27:59 -07:00
Vaibhav Hemant Dixit
9fdbaf0196
BGP Service script path and error fix (#5183)
* BGP service script path update and error fix

Co-authored-by: Vaibhav Hemant Dixit <vadixit@microsoft.com>
2020-08-15 12:09:10 -07:00
Prince Sunny
3fee094760
Enable restapi, update sonic-restapi (#5169)
* Enable restapi if included in image
* [Submodule update] sonic-restapi
2020-08-13 11:29:49 -07:00
Vaibhav Hemant Dixit
31d7f27b4b
Move RADV fastboot handling to a service script (#5108)
* New /usr/local/bin/ script to start/stop radv service
2020-08-11 13:13:13 -07:00
Joe LeVeque
2b5e418e2e
Remove sonic-daemon-base package (#5131)
sonic-daemon-base package has been deprecated in favor of the sonic-py-common package. All related functionality has been moved there.
2020-08-09 21:27:36 -07:00
isabelmsft
19a3452ddc
[Kubernetes Setup] Remove flannel, kube-proxy images (#5098)
Removes installation of kube-proxy (117 MB) and flannel (53 MB) images from Kubernetes-enabled devices. These images are tested to be unnecessary for our use case, as we do not rely on ClusterIPs for Kubernetes Services or a CNI for pod networking.
2020-08-06 18:23:27 -05:00
lguohan
082c26a27d
[build]: combine feature and container feature table (#5081)
1. remove container feature table
2. do not generate feature entry if the feature is not included
   in the image
3. rename ENABLE_* to INCLUDE_* for better clarity
4. rename feature status to feature state
5. [submodule]: update sonic-utilities

* 9700e45 2020-08-03 | [show/config]: combine feature and container feature cli (#1015) (HEAD, origin/master, origin/HEAD) [lguohan]
* c9d3550 2020-08-03 | [tests]: fix drops_group_test failure on second run (#1023) [lguohan]
* dfaae69 2020-08-03 | [lldpshow]: Fix input device is not a TTY error (#1016) [Arun Saravanan Balachandran]
* 216688e 2020-08-02 | [tests]: rename sonic-utilitie-tests to tests (#1022) [lguohan]

Signed-off-by: Guohan Lu <lguohan@gmail.com>
2020-08-05 13:23:12 -07:00
Joe LeVeque
3b89e5d467
[Python] Migrate applications/scripts to import sonic-py-common package (#5043)
As part of consolidating all common Python-based functionality into the new sonic-py-common package, this pull request:
1. Redirects all Python applications/scripts in sonic-buildimage repo which previously imported sonic_device_util or sonic_daemon_base to instead import sonic-py-common, which was added in https://github.com/Azure/sonic-buildimage/pull/5003
2. Replaces all calls to `sonic_device_util.get_platform_info()` to instead call `sonic_py_common.get_platform()` and removes any calls to `sonic_device_util.get_machine_info()` which are no longer necessary (i.e., those which were only used to pass the results to `sonic_device_util.get_platform_info()`.
3. Removes unused imports to the now-deprecated sonic-daemon-base package and sonic_device_util.py module

This is the next step toward resolving https://github.com/Azure/sonic-buildimage/issues/4999

Also reverted my previous change in which device_info.get_platform() would first try obtaining the platform ID string from Config DB and fall back to gathering it from machine.conf upon failure because this function is called by sonic-cfggen before the data is in the DB, in which case, the db_connect() call will hang indefinitely, which was not the behavior I expected. As of now, the function will always reference machine.conf.
2020-08-03 11:43:12 -07:00
Tamer Ahmed
7872b4e196
[platform] Add Support For Environment Variable File (#5010)
* [platform] Add Support For Environment Variable

This PR adds the ability to read environment file from /etc/sonic.
the file contains immutable SONiC config attributes such as platform,
hwsku, version, device_type. The aim is to minimize calls being made
into sonic-cfggen during boot time.

singed-off-by: Tamer Ahmed <tamer.ahmed@microsoft.com>
2020-07-31 17:59:09 -07:00
Nazarii Hnydyn
fe387289b1
[Mellanox] Update MFT to 4.15.0-104 (#5075)
* [Mellanox] Update MFT to 4.15.0-104.
* [Mellanox] Remove build system W/A.

Signed-off-by: Nazarii Hnydyn <nazariig@mellanox.com>
2020-07-30 22:59:57 -07:00
Mahesh Maddikayala
ee4197e9f8
+ Modified buffer config template to include internal ASIC with 5m ca… (#4959)
+ Modified buffer config template to include internal ASIC with 5m cable length
+ added unit test to verify the changes.
2020-07-27 10:57:07 -07:00
Joe LeVeque
c0d1616f89
Introduce sonic-py-common package (#5003)
Consolidate common SONiC Python-language functionality into one shared package (sonic-py-common) and eliminate duplicate code.

The package currently includes three modules:

- daemon_base
- device_info
- logger
2020-07-26 23:15:41 -07:00
Stepan Blyshchak
5d753b8d43
[services] remove swss from WantedBy for nat service (#4991)
Otherwise, it may cause issues for warm restarts, warm reboot.
Warm restart of swss will start nat which is not expected for warm
restart. Also it is observed that during warm-reboot script execution
nat container gets started after it was killed. This causes removal of
nat dump generated by nat previously:

A check [ -f /host/warmboot/nat/nat_entries.dump ] || echo "NAT dump
does not exists" was added right before kexec:

```
Fri Jul 17 10:47:16 UTC 2020 Prepare MLNX ASIC to fastfast-reboot:
install new FW if required
Fri Jul 17 10:47:18 UTC 2020 Pausing orchagent ...
Fri Jul 17 10:47:18 UTC 2020 Stopping nat ...
Fri Jul 17 10:47:18 UTC 2020 Stopped nat ...
Fri Jul 17 10:47:18 UTC 2020 Stopping radv ...
Fri Jul 17 10:47:19 UTC 2020 Stopping bgp ...
Fri Jul 17 10:47:19 UTC 2020 Stopped bgp ...
Fri Jul 17 10:47:21 UTC 2020 Initialize pre-shutdown ...
Fri Jul 17 10:47:21 UTC 2020 Requesting pre-shutdown ...
Fri Jul 17 10:47:22 UTC 2020 Waiting for pre-shutdown ...
Fri Jul 17 10:47:24 UTC 2020 Pre-shutdown succeeded ...
Fri Jul 17 10:47:24 UTC 2020 Backing up database ...
Fri Jul 17 10:47:25 UTC 2020 Stopping teamd ...
Fri Jul 17 10:47:25 UTC 2020 Stopped teamd ...
Fri Jul 17 10:47:25 UTC 2020 Stopping syncd ...
Fri Jul 17 10:47:35 UTC 2020 Stopped syncd ...
Fri Jul 17 10:47:35 UTC 2020 Stopping all remaining containers ...
Warning: Stopping telemetry.service, but it can still be activated by:
  telemetry.timer
Fri Jul 17 10:47:37 UTC 2020 Stopped all remaining containers ...
NAT dump does not exists
Fri Jul 17 10:47:39 UTC 2020 Rebooting with /sbin/kexec -e to
SONiC-OS-201911.140-08245093 ...
```

With this change, executed warm-reboot 10 times without hitting this
issue, while without this change the issue is easily reproducible almost
every warm-reboot run.

Signed-off-by: Stepan Blyschak <stepanb@mellanox.com>
2020-07-19 21:50:26 -07:00
yozhao101
83738fca2f
[dockers] Default container autorestart feature to "enabled" for all except database (#4853)
Set the default auto_restart state to "enabled" in init_cfg.json for all containers except database

Signed-off-by: Yong Zhao <yozhao@microsoft.com>
2020-07-15 11:49:14 -07:00
heidinet2007
de51b9e424
BGP warm reboot script to service (#3992)
* [sonic-buildimage] Move BGP warm reboot scripts into BGP service /usr/local/bin

* Revert "[sonic-buildimage] Move BGP warm reboot scripts into BGP service /usr/local/bin"

This reverts commit d16d163fc4.

* [sonic-buildimage] Move BGP warm reboot script to BGP service

* [sonic-buildimage] Move BGP warm reboot script to BGP service

* [sonic-buildimage] Move BGP warm reboot script to BGP service
- access DB correctly

* Address code review comments, also change file mode of bgp.sh (+x)

* Address code review comments, also change the file mode of bgp.sh (+x)

* BGP warm reboot script to service, also handle fast boot as indicated by flag saved in StateDB

* BGP warm reboot script to service, code review comments on space alignment

* BGP warm reboot script to service: remove uncesseary space

* BGP warm reboot script to service: replace tab with space

* Code review comments: -) use new multi-db api -) add ignore error from zebra in case it's not configured

* Integrate with multi-ASIC changes committed recently

Co-authored-by: heidi.ou@alibaba-inc.com <heidi.ou@alibaba-inc.com>
2020-07-15 11:46:23 -07:00
Sujin Kang
bf45e11d27
Add pcie-check service to check PCIe devices at boot (#4771)
* PCIe Monitor service

* Add rescan to pcie-mon.service when it fails to get all pcie devices

* space

* Clean up

* review comments

* update the pcie status in state db

* update the failed pcie status once at the end

* Update the pcie_status in STATE_DB and rename the service

* Add log to exit the service if the configuration file doesn't exist.

* fix the build failure

* Redo the pcie rescan for pcie-check failed case.

* review comments

* review comments

* review comments
2020-07-13 14:15:09 -07:00
Sujin Kang
b4452edb8a Add disabling HW watchdog during boot for fast-reboot and warm-reboot (#4927)
* Add disabling HW watchdog during boot for fast-reboot and warm-reboot case

* typo
2020-07-12 18:08:52 +00:00
arlakshm
a46f4c96e7 Add support for bcmsh and bcmcmd utlitites in multi ASIC devices (#4926)
Signed-off-by: Arvindsrinivasan Lakshmi Narasimhan <arlakshm@microsoft.com>
This PR has changes to support accessing the bcmsh and bcmcmd utilities on multi ASIC devices
Changes done
- move the link of /var/run/sswsyncd from docker-syncd-brcm.mk to docker_image_ctl.j2
- update the bcmsh and bcmcmd scripts to take -n [ASIC_ID] as an argument on multi ASIC platforms
2020-07-12 18:08:52 +00:00
abdosi
fc6bcff52b [sonic-buildimage] Changes to make network specific sysctl common for both host and docker namespace (#4838)
* [sonic-buildimage] Changes to make network specific sysctl
common for both host and docker namespace (in multi-npu).

This change is triggered with issue found in multi-npu platforms
where in docker namespace
net.ipv6.conf.all.forwarding was 0 (should be 1) because of
which RS/RA message were triggered and link-local router were learnt.

Beside this there were some other sysctl.net.ipv6* params whose value
in docker namespace is not same as host namespace.

So to make we are always in sync in host and docker namespace
created common file that list all sysctl.net.* params and used
both by host and docker namespace. Any change will get applied
to both namespace.

Signed-off-by: Abhishek Dosi <abdosi@microsoft.com>

* Address Review Comments and made sure to invoke augtool
only one and do string concatenation of all set commands

* Address Review Comments.
2020-07-12 18:08:51 +00:00
arlakshm
a8b99f77f3 syslog changes Multi ASIC platforms (#4738)
Add changes for syslog support for containers running in namespaces on multi ASIC platforms.
On Multi ASIC platforms

Rsyslog service is only running on the host. There is no rsyslog service running in each namespace.
On multi ASIC platforms the rsyslog service on the host will be listening on the docker0 ip address instead of loopback address.
The rsyslog.conf on the containers is modified to have omfwd target ip to be docker0 ipaddress instead of loopback ip

Signed-off-by: Arvindsrinivasan Lakshmi Narasimhan <arlakshm@microsoft.com>
2020-07-12 18:08:51 +00:00
abdosi
15440b6e43
Changes to make default route programming correct in multi-npu platforms (#4774)
* Changes to make default route programming
correct in multi-asic platform where frr is not running
in host namespace. Change is to set correct administrative distance.
Also make NAMESPACE* enviroment variable available for all dockers
so that it can be used when needed.

Signed-off-by: Abhishek Dosi <abdosi@microsoft.com>

* Fix review comments

* Review comment to check to add default route
only if default route exist and delete is successful.
2020-06-29 11:38:46 -07:00
SuvarnaMeenakshi
ab2177b4a9
[systemd-generator]: Fix dependency update for multi-asic platform (#4820)
* [systemd-generator]: Fix the code to make sure that dependencies
of host services are generated correctly for multi-asic platforms.
Add code to make sure that systemd timer files are also modified
to add the correct service dependency for multi-asic platforms.

Signed-off-by: SuvarnaMeenakshi <sumeenak@microsoft.com>

* [systemd-generator]: Minor fix, remove debug code and
remove unused variable.
2020-06-29 09:39:23 -07:00
Praveen Chaudhary
07930c39ba
[build] Add essential PY PKGs on host for sonic-utilities/config/config_mgmt.py (#4740)
Add essential PY PKGs on host by installing them in sonic_debian_extension.j2

Signed-off-by: Praveen Chaudhary pchaudhary@linkedin.com
2020-06-28 11:03:48 -07:00
Qi Luo
6849a0351c
[redis] Install vanilla redis packages for Buster and Stretch; upgrade Buster to 6.0.5 (#4732)
upgrade redis server to 5:6.0.5-1~bpo10+1
2020-06-27 01:17:20 -07:00
Joe LeVeque
63d2efbe03
[build][systemd] Mask disabled services by default (#4721)
When building the SONiC image, used systemd to mask all services which are set to "disabled" in init_cfg.json.

This PR depends on https://github.com/Azure/sonic-utilities/pull/944, otherwise `config load_minigraph will fail when trying to restart disabled services.
2020-06-24 15:25:16 -07:00
Joe LeVeque
1f8a78cef1
[build] No longer install Python 'click-default-group' package (#4811)
All dependencies upon the Python 'click-default-group' package have been removed from sonic-utilities as of https://github.com/Azure/sonic-utilities/pull/903. The submodule was updated to include this patch as of https://github.com/Azure/sonic-buildimage/pull/4601, therefore we no longer need to install this package in the SONiC image.
2020-06-19 10:54:10 -07:00
abdosi
30d7ce0004
[build] Ensure /usr/lib/systemd/system/ directory exists before referencing (#4788)
* Fix the Build on 201911 (Stretch) where the directory
/usr/lib/systemd/system/ does not exist so creating
manually. Change should not harm Master (buster) where
the directory is created by Linux

* Fix as per review comments
2020-06-17 09:16:58 -07:00
Renuka Manavalan
edeb40ffcf
[k8s]: switching to Flannel from Calico. (#4768)
Switching to Flannel from Calico which brings down the image size by around 500+MB.
2020-06-12 18:06:08 -07:00
Joe LeVeque
4e482c16ba
[build] Enable telemetry service by default (#4760)
**- Why I did it**
To ensure telemetry service is enabled by default after installing a fresh SONiC image

**- How I did it**
Set telemetry feature status to "enabled" when generating init_cfg.json file
2020-06-12 16:20:31 -07:00
Joe LeVeque
1e369b0998
[systemd] Relocate all SONiC unit files to /usr/lib/systemd/system (#4673)
This will allow us to disable services and have it persist across reboots by using the `systemctl mask` operation
2020-05-30 13:46:44 -07:00
Qi Luo
65e7a84509
[baseimage]: Build and install redis-dump-load Python 3 package in host image (#4661)
Fix #4656
2020-05-30 05:52:27 -07:00
taocy
4cd36175ce arm arch: 1. install required libraries; 2. umount /proc after dockerfs. 2020-05-25 13:15:19 +00:00
Praveen Chaudhary
0ccdd70671
[sonic-yang-mgmt]: sonic-yang-mgmt package for configuration validation. (#3861)
**- What I did**

#### wheel package Makefiles

- wheel package Makefiles for sonic-yang-mgmt package.

#### libyang Python APIs:
- python APIs based on libyang
- functions to load/merge yang models and Yang data files
- function to validate data trees based on Yang models
- functions to merge yang data files/trees
- add/set/delete node in schema and data trees
- find data/schema nodes from xpath from the Yang data/schema tree in memory
- find dependencies
- dump the data tree in json/xml

#### Extension of libyang Python APIs:
-- Cropping input config based on Yang Model.
-- Translate input config based on Yang Model.
-- rev Translate input config based on Yang Model.
-- Find xpath of port, portleaf and a yang list.
-- Find if node is key of a list while deletion if yes, then delete the parent.

Signed-off-by: Praveen Chaudhary pchaudhary@linkedin.com
Signed-off-by: Ping Mao pmao@linkedin.com
2020-05-21 16:27:57 -07:00
simonJi2018
0b6253baa1
[platform/nephos] Optimize the code to reduce changes due to the kernel upgrade (#4332)
- bug fix : Fixed an issue which the nps ko file was not loaded due to the wrong service file name
- Optimize the code to reduce changes due to the kernel upgrade
- Remove nephos ko file loaded in swss.service.j2 because it has loaded at syncd.service.j2
2020-05-21 02:21:07 -07:00
anand-kumar-subramanian
34586032dc
[mgmt-framework] removed requires dependency on swss (#4548)
fixes #4473
2020-05-20 20:47:09 -07:00
rkdevi27
32f58b5864
Fix "/host unmount failure" during reboot (#4558) 2020-05-20 11:18:11 -07:00
rajendra-dendukuri
9c7105b5f3
Install swsssdk-py3 in the base Debian image for python3 based apps (#4542)
Signed-off-by: Rajendra Dendukuri <rajendra.dendukuri@broadcom.com>
2020-05-19 11:15:05 -07:00
lguohan
1066f238ba
[baseimage]: pin down package version for azure-storage, watchdog and futures (#4575)
Signed-off-by: Guohan Lu <lguohan@gmail.com>
2020-05-11 23:17:47 -07:00
abdosi
a96f9ecee9
Changes for LLDP docker to support multi-npu platforms (#4530)
* Changes for LLDP for Multi NPU Platoforms:-
a) Enable LLDP for Host namespace for Management Port
b) Make sure Management IP is avaliable in per asic namespace
   needed for LLDP Chassis configuration
c) Make sure chassis mac-address is correct in per asic namespace
d) Do not run lldp on eth0 of per asic namespace and avoid chassis
   configuration for same
e) Use Linux hostname instead from Device Metadata for lldp chassis
   configuration since in multi-npu platforms device metadata hostname
   will be differnt

Signed-off-by: Abhishek Dosi <abdosi@microsoft.com>

* Address Review Comment with following changes:
a) Use Device Metadata hostname even in per namespace conatiner.
   updated minigraph parsing for same to have hostname as system
   hostname and add new key for asic name

b) Minigraph changes to have MGMT_INTERFACE Key in per asic/namespace
   config also as needed for LLDP for setting chassis management IP.

Signed-off-by: Abhishek Dosi <abdosi@microsoft.com>

* Address Review Comments
2020-05-11 11:05:44 -07:00
Neetha John
286aa35ac6
[qos]: Alpha and ECN settings change for Th (#4564)
Dynamic threshold setting changed to 0 and WRED profile green min threshold set to 250000 for Tomahawk devices

Changed the dynamic threshold settings in pg_profile_lookup.ini
Added a macro for WRED profiles in qos.json.j2 for Tomahawk devices
Necessary changes made in qos.config.j2 to use the macro if present

Signed-off-by: Neetha John <nejo@microsoft.com>
2020-05-09 11:21:18 -07:00
judyjoseph
acf465b43b
Multi DB with namespace support, Introducing the database_global.json… (#4477)
* Multi DB with namespace support, Introducing the database_global.json file
for supporting accessing DB's in other namespaces for service running in
linux host

* Updates based on comments

* Adding the j2 templates for database_config and database_global files.

* Updating to retrieve the redis DIR's to be mounted from database_global.json file.

* Additional check to see if asic.conf file exists before sourcing it.

* Updates based on PR comments discussion.

* Review comments update

* Updates to the argument "-n" for namespace used in both context of parsing minigraph and multi DB access.

* Update with the attribute "persistence_for_warm_boot" that was added to database_config.json file earlier.

* Removing the database_config.json file to avioid confusion in future.
We use the database_config.json.j2 file to generate database_config.json files dynamically.

* Update the comments for sudo usage in docker_image_ctrl.j2

* Update with the new logic in PING PONG tests using sonic-db-cli. With this we wait till the
PONG response is received when redis server is up.

* Similar changes in swss and syncd scripts for the PING tests with sonic-db-cli

* Updated with a missing , in the database_config.json.j2 file, Do pip install of j2cli in docker-base-buster.
2020-05-08 21:24:05 -07:00
Akhilesh Samineni
86627dfd35
[NAT] : Removed requires dependency on swss (#4551)
Signed-off-by: Akhilesh Samineni <akhilesh.samineni@broadcom.com>
2020-05-08 00:01:48 -07:00
Dong Zhang
340cf826a6
[MultiDB] use sonic-db-cli PING and fix wrong multiDB API in NAT (#4541) 2020-05-06 15:41:28 -07:00
lguohan
86bc8aec5f
[vs]: dynamically create front panel ports in vs docker (#4499)
currently, vs docker always create 32 front panel ports.

when vs docker starts, it first detects the peer links
in the namespace and then setup equal number of front panel
interfaces as the peer links.

Signed-off-by: Guohan Lu <lguohan@gmail.com>
2020-04-30 12:50:59 -07:00
arlakshm
3a82ade3ef
[docker]: Enabled ipv6 in dockers when using docker bridge network (#4426)
Signed-off-by: Arvindsrinivasan Lakshmi Narasimhan <arlakshm@microsoft.com>
2020-04-21 17:09:41 -07:00
Stephen
e95504fe72 [Mellanox]WA to avoid fsroot being corrupted by "dpkg --extract" 2020-04-17 04:51:51 +00:00
Guohan Lu
65dfe75903 [build]: umount target directory properly
Signed-off-by: Guohan Lu <lguohan@gmail.com>
2020-04-17 04:51:51 +00:00
Guohan Lu
fb12b0a621 [baseimage]: various fixes due to buster changes
do mount/umount in the chroot environment
install cron explicitly
install rasdaemon as a replacement for mcelog
switch python package docker-py to docker
2020-04-17 04:51:51 +00:00
Praveen Chaudhary
a02255e2f4
[sonic-yang-models]: First version of yang models for Port, VLan, Interface, PortChannel, loopback and ACL. (#3730)
[sonic-yang-models]: First version of yang models for Port, VLan, Interface, PortChannel, loopback and ACL.

YANG models as per Guidelines.

Guideline doc: https://github.com/Azure/SONiC/blob/master/doc/mgmt/SONiC_YANG_Model_Guidelines.md

[sonic-yang-models/tests]: YANG model test code and JSON input for testing.

[sonic-yang-models/setup.py]: Build infra for yang models.

**- What I did**
Created Yang model for Sonic.
Tables:  PORT, VLAN, VLAN_INTERFACE, VLAN_MEMBER, ACL_RULE, ACL_TABLE, INTERFACE.

Created build infra files using which a new package (sonic-yang-models) can be build and can be deployed on sonic switches. Yang models will be part of this new package.

**- How I did it**
Wrote yang models based on Guideline doc: 
https://github.com/Azure/SONiC/blob/master/doc/mgmt/SONiC_YANG_Model_Guidelines.md
and 
https://github.com/Azure/SONiC/wiki/Configuration.

Wrote python wheel Package infra which runs test for these Yang models using a json files which consists configuration as per yang models. These configs are for negative tests, which means we want to test that most must condition, pattern and when condition works as expected.

**- How to verify it**
Build Logs and testing:
———————————————————————————————————

```
/sonic/src/sonic-yang-models /sonic
running test
running egg_info
writing top-level names to sonic_yang_models.egg-info/top_level.txt
writing dependency_links to sonic_yang_models.egg-info/dependency_links.txt
writing sonic_yang_models.egg-info/PKG-INFO
reading manifest file 'sonic_yang_models.egg-info/SOURCES.txt'
writing manifest file 'sonic_yang_models.egg-info/SOURCES.txt'
running build_ext

----------------------------------------------------------------------
Ran 0 tests in 0.000s

OK
running bdist_wheel
running build
running build_py
(Reading database ... 155852 files and directories currently installed.)
Preparing to unpack .../libyang_1.0.73_amd64.deb ...
Unpacking libyang (1.0.73) over (1.0.73) ...
Setting up libyang (1.0.73) ...
Processing triggers for libc-bin (2.24-11+deb9u4) ...
Processing triggers for man-db (2.7.6.1-2) ...
(Reading database ... 155852 files and directories currently installed.)
Preparing to unpack .../libyang-cpp_1.0.73_amd64.deb ...
Unpacking libyang-cpp (1.0.73) over (1.0.73) ...
Setting up libyang-cpp (1.0.73) ...
Processing triggers for libc-bin (2.24-11+deb9u4) ...
(Reading database ... 155852 files and directories currently installed.)
Preparing to unpack .../python3-yang_1.0.73_amd64.deb ...
Unpacking python3-yang (1.0.73) over (1.0.73) ...
Setting up python3-yang (1.0.73) ...
INFO:YANG-TEST:module: sonic-vlan is loaded successfully
ERROR:YANG-TEST:Could not get module: sonic-head
INFO:YANG-TEST:module: sonic-portchannel is loaded successfully
INFO:YANG-TEST:module: sonic-acl is loaded successfully
INFO:YANG-TEST:module: sonic-loopback-interface is loaded successfully
ERROR:YANG-TEST:Could not get module: sonic-port
INFO:YANG-TEST:module: sonic-interface is loaded successfully
INFO:YANG-TEST:
------------------- Test 1: Configure a member port in VLAN_MEMBER table which does not exist.---------------------
libyang[0]: Leafref "/sonic-port:sonic-port/sonic-port:PORT/sonic-port:PORT_LIST/sonic-port:port_name" of value "Ethernet156" points to a non
-existing leaf. (path: /sonic-vlan:sonic-vlan/VLAN_MEMBER/VLAN_MEMBER_LIST[vlan_name='Vlan100'][port='Ethernet156']/port)
INFO:YANG-TEST:Configure a member port in VLAN_MEMBER table which does not exist. Passed

INFO:YANG-TEST:
------------------- Test 2: Configure non-existing ACL_TABLE in ACL_RULE.---------------------
libyang[0]: Leafref "/sonic-acl:sonic-acl/sonic-acl:ACL_TABLE/sonic-acl:ACL_TABLE_LIST/sonic-acl:ACL_TABLE_NAME" of value "NOT-EXIST" points
to a non-existing leaf. (path: /sonic-acl:sonic-acl/ACL_RULE/ACL_RULE_LIST[ACL_TABLE_NAME='NOT-EXIST'][RULE_NAME='Rule_20']/ACL_TABLE_NAME)
INFO:YANG-TEST:Configure non-existing ACL_TABLE in ACL_RULE. Passed

INFO:YANG-TEST:
------------------- Test 3: Configure IP_TYPE as ARP and ICMPV6_CODE in ACL_RULE.---------------------
libyang[0]: When condition "boolean(IP_TYPE[.='ANY' or .='IP' or .='IPV6' or .='IPv6ANY'])" not satisfied. (path: /sonic-acl:sonic-acl/ACL_RU
LE/ACL_RULE_LIST[ACL_TABLE_NAME='NO-NSW-PACL-V4'][RULE_NAME='Rule_40']/ICMPV6_CODE)
INFO:YANG-TEST:Configure IP_TYPE as ARP and ICMPV6_CODE in ACL_RULE. Passed
INFO:YANG-TEST:

INFO:YANG-TEST:
------------------- Test 4: Configure IP_TYPE as ipv4any and SRC_IPV6 in ACL_RULE.---------------------
libyang[0]: When condition "boolean(IP_TYPE[.='ANY' or .='IP' or .='IPV6' or .='IPv6ANY'])" not satisfied. (path: /sonic-acl:sonic-acl/ACL_RU
LE/ACL_RULE_LIST[ACL_TABLE_NAME='NO-NSW-PACL-V4'][RULE_NAME='Rule_20']/SRC_IPV6)
INFO:YANG-TEST:Configure IP_TYPE as ipv4any and SRC_IPV6 in ACL_RULE. Passed

------------------- Test 5: Configure l4_src_port_range as 99999-99999 in ACL_RULE---------------------
libyang[0]: Value "99999-99999" does not satisfy the constraint "([0-9]{1,4}|[0-5][0-9]{4}|[6][0-4][0-9]{3}|[6][5][0-2][0-9]{2}|[6][5][3][0-5]{2}|[6][5][3][6][0-5])-([0-9]{1,4}|[0-5][0-9]{4}|[6][0-4][0-9]{3}|[6][5][0-2][0-9]{2}|[6][5][3][0-5]{2}|[6][5][3][6][0-5])" (range, length, or pattern). (path: /sonic-acl:sonic-acl/ACL_RULE/ACL_RULE_LIST[ACL_TABLE_NAME='NO-NSW-PACL-V6'][RULE_NAME='Rule_20']/L4_SRC_PORT_RANGE)
INFO:YANG-TEST:Configure l4_src_port_range as 99999-99999 in ACL_RULE Passed

INFO:YANG-TEST:
------------------- Test 6: Configure empty string as ip-prefix in INTERFACE table.---------------------
libyang[0]: Invalid value "" in "ip-prefix" element. (path: /sonic-interface:sonic-interface/INTERFACE/INTERFACE_LIST[interface='Ethernet8'][ip-prefix='']/ip-prefix)
INFO:YANG-TEST:Configure empty string as ip-prefix in INTERFACE table. Passed

INFO:YANG-TEST:
------------------- Test 7: Configure Wrong family with ip-prefix for VLAN_Interface Table---------------------
libyang[0]: Must condition "(contains(../ip-prefix, ':') and current()='IPv6') or                               (contains(../ip-prefix, '.') and current()='IPv4')" not satisfied. (path: /sonic-vlan:sonic-vlan/VLAN_INTERFACE/VLAN_INTERFACE_LIST[vlanid='100'][ip-prefix='2a04:5555:66:7777::1/64']/family)
INFO:YANG-TEST:Configure Wrong family with ip-prefix for VLAN_Interface Table Passed

INFO:YANG-TEST:
------------------- Test 8: Configure IP_TYPE as ARP and DST_IPV6 in ACL_RULE.---------------------
libyang[0]: When condition "boolean(IP_TYPE[.='ANY' or .='IP' or .='IPV6' or .='IPV6ANY'])" not satisfied. (path: /sonic-acl:sonic-acl/ACL_RULE/ACL_RULE_LIST[ACL_TABLE_NAME='NO-NS
W-PACL-V6'][RULE_NAME='Rule_20']/DST_IPV6)
INFO:YANG-TEST:Configure IP_TYPE as ARP and DST_IPV6 in ACL_RULE. Passed

INFO:YANG-TEST:
------------------- Test 9: Configure INNER_ETHER_TYPE as 0x080C in ACL_RULE.---------------------
libyang[0]: Value "0x080C" does not satisfy the constraint "(0x88CC|0x8100|0x8915|0x0806|0x0800|0x86DD|0x8847)" (range, length, or pattern). (path: /sonic-acl:sonic-acl/ACL_RULE/ACL_RULE_LIST[ACL_TABLE_NAME='NO-NSW-PACL-V4'][RULE_NAME='Rule_40']/INNER_ETHER_TYPE)
INFO:YANG-TEST:Configure INNER_ETHER_TYPE as 0x080C in ACL_RULE. Passed

INFO:YANG-TEST:
------------------- Test 10: Add dhcp_server which is not in correct ip-prefix format.---------------------
libyang[0]: Invalid value "10.186.72.566" in "dhcp_servers" element. (path: /sonic-vlan:sonic-vlan/VLAN/VLAN_LIST/dhcp_servers[.='10.186.72.566'])
INFO:YANG-TEST:Add dhcp_server which is not in correct ip-prefix format. Passed

INFO:YANG-TEST:
------------------- Test 11: Configure undefined acl_table_type in ACL_TABLE table.---------------------
libyang[0]: Invalid value "LAYER3V4" in "type" element. (path: /sonic-acl:sonic-acl/ACL_TABLE/ACL_TABLE_LIST[ACL_TABLE_NAME='NO-NSW-PACL-V6']/type)
INFO:YANG-TEST:Configure undefined acl_table_type in ACL_TABLE table. Passed

INFO:YANG-TEST:
------------------- Test 12: Configure undefined packet_action in ACL_RULE table.---------------------
libyang[0]: Invalid value "SEND" in "PACKET_ACTION" element. (path: /sonic-acl:sonic-acl/ACL_RULE/ACL_RULE_LIST/PACKET_ACTION)
INFO:YANG-TEST:Configure undefined packet_action in ACL_RULE table. Passed

INFO:YANG-TEST:
------------------- Test 13: Configure wrong value for tagging_mode.---------------------
libyang[0]: Invalid value "non-tagged" in "tagging_mode" element. (path: /sonic-vlan:sonic-vlan/VLAN_MEMBER/VLAN_MEMBER_LIST/tagging_mode)
INFO:YANG-TEST:Configure wrong value for tagging_mode. Passed

INFO:YANG-TEST:
------------------- Test 14: Configure vlan-id in VLAN_MEMBER table which does not exist in VLAN  table.---------------------
libyang[0]: Leafref "../../../VLAN/VLAN_LIST/vlanid" of value "200" points to a non-existing leaf. (path: /sonic-vlan:sonic-vlan/VLAN_MEMBER/VLAN_MEMBER_LIST[vlanid='200'][port='Ethernet0']/vlanid)
libyang[0]: Leafref "../../../VLAN/VLAN_LIST/vlanid" of value "200" points to a non-existing leaf. (path: /sonic-vlan:sonic-vlan/VLAN_MEMBER/VLAN_MEMBER_LIST[vlanid='200'][port='Ethernet0']/vlanid)
INFO:YANG-TEST:Configure vlan-id in VLAN_MEMBER table which does not exist in VLAN  table. Passed

INFO:YANG-TEST:All Test Passed
../../target/debs/stretch/libyang0.16_0.16.105-1_amd64.deb installtion failed
../../target/debs/stretch/libyang-cpp0.16_0.16.105-1_amd64.deb installtion failed
../../target/debs/stretch/python2-yang_0.16.105-1_amd64.deb installtion failed
YANG Tests passed
Passed: pyang -f tree ./yang-models/*.yang > ./yang-models/sonic_yang_tree
copying tests/yangModelTesting.py -> build/lib/tests
copying tests/test_sonic_yang_models.py -> build/lib/tests
copying tests/__init__.py -> build/lib/tests
running egg_info
writing top-level names to sonic_yang_models.egg-info/top_level.txt
writing dependency_links to sonic_yang_models.egg-info/dependency_links.txt
writing sonic_yang_models.egg-info/PKG-INFO
reading manifest file 'sonic_yang_models.egg-info/SOURCES.txt'
writing manifest file 'sonic_yang_models.egg-info/SOURCES.txt'
installing to build/bdist.linux-x86_64/wheel
running install
running install_lib
creating build/bdist.linux-x86_64/wheel
creating build/bdist.linux-x86_64/wheel/tests
copying build/lib/tests/yangModelTesting.py -> build/bdist.linux-x86_64/wheel/tests
copying build/lib/tests/test_sonic_yang_models.py -> build/bdist.linux-x86_64/wheel/tests
copying build/lib/tests/__init__.py -> build/bdist.linux-x86_64/wheel/tests
running install_data
creating build/bdist.linux-x86_64/wheel/sonic_yang_models-1.0.data
creating build/bdist.linux-x86_64/wheel/sonic_yang_models-1.0.data/data
creating build/bdist.linux-x86_64/wheel/sonic_yang_models-1.0.data/data/yang-models
copying ./yang-models/sonic-head.yang -> build/bdist.linux-x86_64/wheel/sonic_yang_models-1.0.data/data/yang-models
copying ./yang-models/sonic-acl.yang -> build/bdist.linux-x86_64/wheel/sonic_yang_models-1.0.data/data/yang-models
copying ./yang-models/sonic-interface.yang -> build/bdist.linux-x86_64/wheel/sonic_yang_models-1.0.data/data/yang-models
copying ./yang-models/sonic-loopback-interface.yang -> build/bdist.linux-x86_64/wheel/sonic_yang_models-1.0.data/data/yang-models
copying ./yang-models/sonic-port.yang -> build/bdist.linux-x86_64/wheel/sonic_yang_models-1.0.data/data/yang-models
copying ./yang-models/sonic-portchannel.yang -> build/bdist.linux-x86_64/wheel/sonic_yang_models-1.0.data/data/yang-models
copying ./yang-models/sonic-vlan.yang -> build/bdist.linux-x86_64/wheel/sonic_yang_models-1.0.data/data/yang-models
```
2020-04-14 15:36:02 -07:00
Renuka Manavalan
f128153706
[baseimage]: Install Kubernetes packages if enabled in image (#4374)
* Install kubernetes worker node packages, if enabled.

* Minor updates

* Added some comments

* Updates per review comments.
Built a private image to test to work fine.

* Remove the removed file.

* Update per comments
Make a fix, as kubeadm no demands a higher version of kubelet & kubectl.
As kubeadm auto install kubectl & kubelet, removing explicit install is an easier/robust fix.

* Changes per review comments.

* Updates per comments.
1) Dropped helper & pod scripts
2) Made install verbose

* Drop creation of pods subdir, as this PR does not use them.

* From comments to 'n' per review comments.

* 1) kubeadm.conf is created as part of kubeadm package install. Hence dropped explicit copy.
2020-04-13 08:41:18 -07:00
Nazarii Hnydyn
1b8897eec0
[mellanox]: Add SSD FW update tool (#4351)
* [mellanox]: Add SSD FW update tool.

Signed-off-by: Nazarii Hnydyn <nazariig@mellanox.com>

* [mellanox]: Align Platform API.

Signed-off-by: Nazarii Hnydyn <nazariig@mellanox.com>

* [mellanox]: Fix firmware description.

Signed-off-by: Nazarii Hnydyn <nazariig@mellanox.com>

* [mellanox]: Update SSD tool.

Signed-off-by: Nazarii Hnydyn <nazariig@mellanox.com>
2020-04-13 18:13:19 +03:00
lguohan
296470de25
[docker-iccp]: do not mount kernel module into iccp container (#4372)
kernel module should be loaded outside container

Signed-off-by: Guohan Lu <lguohan@gmail.com>
2020-04-06 07:40:24 -07:00
shine4chen
524cf9e56a
MCLAG feature for SONIC (#2514)
* MCLAG feature for sonic

* MCLAG feature for sonic

* remove binary file

* remove unused dockerfile

update docker-iccpd to stretch-based container

Signed-off-by: shine.chen <shine.chen@nephosinc.com>

* minor fix for isolation port setting

Signed-off-by: shine.chen <shine.chen@nephosinc.com>

* iccpd docker would start on demand

Signed-off-by: shine.chen <shine.chen@nephosinc.com>

* Add x attribute on mclagdctl file

Signed-off-by: shine.chen <shine.chen@nephosinc.com>

* add warm-reboot support for MCLAG

Signed-off-by: shine.chen <shine.chen@nephosinc.com>

* merge to master branch and reformat iccpd file

Signed-off-by: shine.chen <shine.chen@nephosinc.com>

* fix some bugs and make peer-link configuration optional

Signed-off-by: shine.chen <shine.chen@nephosinc.com>

* refactor code per Brcm review

Signed-off-by: shine.chen <shine.chen@nephosinc.com>

* correct a typo

Signed-off-by: shine.chen <shine.chen@nephosinc.com>

* * optimize iccpd arp/mac sync process
* refine code according to brcm opinoin
* unify function return value

Signed-off-by: shine.chen <shine.chen@nephosinc.com>

* * optimize warm-reboot process
* estabish iccpd connection with configurated src-ip

Signed-off-by: shine.chen <shine.chen@nephosinc.com>

* fix a typo

Signed-off-by: shine.chen <shine.chen@nephosinc.com>

* optimize some code
* add some debug info
* optimize bridge mac setting
* fix vlan mac sync issue on standby node

Signed-off-by: shine.chen <shine.chen@mediatek.com>

* optimize some code

Signed-off-by: shine.chen <shine.chen@mediatek.com>

* fix some bugs for warm-reboot

Signed-off-by: shine.chen <shine.chen@mediatek.com>

* refine log level

Signed-off-by: shine.chen <shine.chen@mediatek.com>

* refine iccpd syslog & skip arp packet whose src ip is local ip

Signed-off-by: shine.chen <shine.chen@mediatek.com>

* remove iccpd dependency with teamd

Signed-off-by: shine.chen <shine.chen@mediatek.com>

* print log level when dump mclag status

Signed-off-by: shine.chen <shine.chen@mediatek.com>

* revise per community review

Signed-off-by: shine.chen <shine.chen@mediatek.com>

Co-authored-by: shine.chen <shine.chen@nephosinc.com>
Co-authored-by: shine.chen <shine.chen@mediatek.com>
2020-04-04 15:24:06 -07:00
SuvarnaMeenakshi
4b8067e913
Multi-ASIC implementation (#3888)
Changes made to support multi-asic platform. Added multi-instance support for swss, syncd, database, bgp, teamd and lldp.
2020-03-31 10:06:19 -07:00
Kebo Liu
0fe58af6d2
copy spc3 fw file to image (#4328) 2020-03-28 11:45:38 -07:00
lguohan
20260ceb1d
[build]: add SONIC_CONFIG_BUILD_LOG_TIMESTAMP to add timestamp in build log (#4269)
add timestamp in each job build log

example:

   [01:39:21] dh clean  --with autotools-dev
   [01:39:22]    dh_auto_clean
   [01:39:27]      make -j16 distclean

Signed-off-by: Guohan Lu <lguohan@gmail.com>
2020-03-21 14:21:26 -07:00
Andriy Kokhan
540cc78038
[Service] Added NAT entry into CONTAINER_FEATURE. Fixes #4247. (#4250)
* [Service] Added NAT entry into CONTAINER_FEATURE. Fixes #4247.

Signed-off-by: Andriy Kokhan <akokhan@barefootnetworks.com>
2020-03-12 16:11:15 -07:00
Stephen Sun
7d0570c517
[Mellanox]Take advantage of sdk variable to customize the location where sdk_socket exists. (#4223)
Take advantage of an SDK environment variable to customize the location where sdk_socket exists.
In the latest SDK sdk_socket has been moved from /tmp to /var/run which is a better place to contain this kind of file.
However, this prevents the subdirs under /var/run from being mapped to different volumes. To resolve this, we take advantage of an SDK variable to designate the location of sdk_socket.
This requires every process that requires to access sdk_socket have this environment variable defined. However, to define environment variable for each process is less scalable. We take advantage of the docker scope environment variable to avoid that.
It depends on PR 4227
2020-03-09 12:36:56 -07:00
Joe LeVeque
7c8da20516
[sonic-cfggen] Loading the configuration from init_cfg.json and then from config_db.json (#4148) 2020-03-05 15:35:35 -08:00
Joe LeVeque
64a6989d02
[Services] Restart NAT service upon unexpected critical process exit. (#4208) 2020-03-05 15:27:21 -08:00
yozhao101
23ff55a709
[Services] Restart BGP service upon unexpected critical process exit. (#4207) 2020-03-03 16:50:32 -08:00
Stepan Blyshchak
1ef740361c
[docker_image_ctl.j2] Share UTS namespace with host OS (#4169)
Instead of updating hostname manualy on Config DB hostname change,
simply share containers UTS namespace with host OS.
Ideally, instead of setting `--uts=host` for every container in SONiC,
this setting can be set per container if feature requires.
One behaviour change is introduced in this commit, when `--privileged`
or `--cap-add=CAP_SYS_ADMIN` and `--uts=host` are combined, container
has privilege to change host OS and every other container hostname.
Such privilege should be fixed by limiting containers capabilities.

Signed-off-by: Stepan Blyschak <stepanb@mellanox.com>
2020-02-26 10:56:54 +02:00
Stepan Blyshchak
ab78ee0232
[mgmt-framework] start after syncd (#4174)
every service starts after syncd to start the most critical parts first

Signed-off-by: Stepan Blyschak <stepanb@mellanox.com>
2020-02-20 14:49:28 -08:00
Prince Sunny
31fb631cd3
Fix service and container name to be same (#4151) 2020-02-14 11:08:57 -08:00
Olivier Singla
6a0dcb1b16
[kernel]: security kernel update to 4.9.189 (#3913)
This patch upgrade the kernel from version
4.9.0-9-2 (4.9.168-1+deb9u3) to 4.9.0-11-2 (4.9.189-3+deb9u2)

Co-authored-by: rajendra-dendukuri <47423477+rajendra-dendukuri@users.noreply.github.com>
2020-02-12 17:41:58 -08:00
Sumukha Tumkur Vani
a9f3619901
Start RestAPI container when sonic boots (#4140)
* Start RestAPI container when sonic boots
2020-02-12 16:38:45 -08:00
yozhao101
729f343f77
[Services] Restart database service upon unexpected critical process exit. (#4138)
* [database] Implement the auto-restart feature for database container.

Signed-off-by: Yong Zhao <yozhao@microsoft.com>

* [database] Remove the duplicate dependency in service files. Since we
already have updategraph ---> config_setup ---> database, we do not need
explicitly add database.service in all other container service files.

Signed-off-by: Yong Zhao <yozhao@microsoft.com>

* [event listener] Reorganize the line 73 in event listener script.

Signed-off-by: Yong Zhao <yozhao@microsoft.com>

* [database] update the file sflow.service.j2 to remove the duplicate
dependency.

Signed-off-by: Yong Zhao <yozhao@microsoft.com>

* [event listener] Add comments in event listener.

Signed-off-by: Yong Zhao <yozhao@microsoft.com>

* [event listener] Update the comments in line 56.

Signed-off-by: Yong Zhao <yozhao@microsoft.com>

* [event listener] Add parentheses for if statement in line 76 in event listener.

Signed-off-by: Yong Zhao <yozhao@microsoft.com>
2020-02-11 14:03:02 -08:00
yozhao101
41958aad52
[init_cfg.json] Add new FEATURE and CONTAINER_FEATURE tables (#4137)
* [init_cfg.json] Add a new table CONTAINER_FEATURE.

Signed-off-by: Yong Zhao <yozhao@microsoft.com>

* [init_cfg.json] Update the content of table CONTAINER_FEATURE.

Signed-off-by: Yong Zhao <yozhao@microsoft.com>

* [init_cfg.json] Use the template to generate the table
CONTAINER_FEATURE.

Signed-off-by: Yong Zhao <yozhao@microsoft.com>

* [init_cfg.json] Add a new table FEATURE.

Signed-off-by: Yong Zhao <yozhao@microsoft.com>

* [init_cfg.json] Change the order of container names according to
alphabetical order.

Signed-off-by: Yong Zhao <yozhao@microsoft.com>

* [init_cfg.json] Change the dhcp_relay container name and add rest-api.

Signed-off-by: Yong Zhao <yozhao@microsoft.com>
2020-02-11 11:05:21 -08:00
yozhao101
3bb61ab10c
[init_cfg.json] Maintain a separate init_cfg.json.j2 template file (#4092) 2020-02-07 12:35:35 -08:00
Kiran Kumar Kella
97165a0d69
Changes in sonic-buildimage to support the NAT feature (#3494)
* Changes in sonic-buildimage for the NAT feature
- Docker for NAT
- installing the required tools iptables and conntrack for nat

Signed-off-by: kiran.kella@broadcom.com

* Add redis-tools dependencies in the docker nat compilation

* Addressed review comments

* add natsyncd to warm-boot finalizer list

* addressed review comments

* using swsscommon.DBConnector instead of swsssdk.SonicV2Connector

* Enable NAT application in docker-sonic-vs
2020-01-29 17:40:43 -08:00
B S Rama krishna
1a7d822638
[kdump]: kdump support for arm, as the dependency with uboot, working on that. (#3962)
as the current kdump installation is searching for grub path, and ARM arch (marvell-armhf) are dependent on uboot, these changes has to be addressed. For now skipping kdump installation on ARM

Co-authored-by: lguohan <lguohan@gmail.com>
2020-01-28 22:12:52 -08:00
Stephen Sun
33e918f7ff
[Mellanox] platform api support firmware install (#3931)
support firmware install, including CPLD and BIOS.

CPLD: cpldupdate
BIOS: boot to onie and update BIOS in onie and then boot to SONiC
2020-01-28 21:55:50 -08:00
kannankvs
7cb63008d7
mvrf_avoid_snmp_yml_config: made changes to pass SNMP config from con… (#4057)
* mvrf_avoid_snmp_yml_config: made changes to pass SNMP config from confiDB to snmpd.conf without using snmp.yml
* added a missing if condition
2020-01-28 17:41:21 -08:00
SuvarnaMeenakshi
c9483796dc [baseimage]: support building multi-asic component (#3856)
- move single instance services into their own folder
- generate Systemd templates for any multi-instance service files in slave.mk
- detect single or multi-instance platform in systemd-sonic-generator based on asic.conf platform specific file.
- update container hostname after creation instead of during creation (docker_image_ctl)
- run Docker containers in a network namespace if specified
- add a service to create a simulated multi-ASIC topology on the virtual switch platform

Signed-off-by: Lawrence Lee <t-lale@microsoft.com>
Signed-off-by: Suvarna Meenakshi <Suvarna.Meenaksh@microsoft.com>
2020-01-26 13:56:42 -08:00
Dong Zhang
7aa0baf709 [MultiDB] (except ./src and ./dockers dirs): replace redis-cli with sonic-db-cli and use new DBConnector (#4035)
* [MultiDB] (except ./src and ./dockers dirs): replace redis-cli with sonic-db-cli and use new DBConnector
* update comment for a potential bug
* update comment
* add TODO maker as review reqirement
2020-01-22 11:26:23 -08:00
Kalimuthu-Velappan
6dcc08e36c [psud]: Fix for psud crash because of database connection reset (#3647)
When database service is down, psud daemon throws an error because of DB connection reset, this because pmon service has no dependency with database service.

To resolve this issue, added database service dependency to the pmon service.

Also, increased the net.core.somaxconn value to 512 to solve the connection failure on the scaled setup.
2020-01-10 13:26:04 -08:00
Sujin Kang
856b4b64eb [reboot cause]: Delay process-reboot-cause service until network connection is stable (#4003) 2020-01-10 09:47:13 -08:00
lguohan
483a5946a8
Revert "[MultiDB]except src and dockers : replace redis-cli with sonic-db-cli and use new DBConnector (#3928)" (#4002)
This reverts commit 0dae59ac30.
2020-01-10 08:27:34 -08:00
Qi Luo
c4755192b1
Fix bug: chroot command line (#3972) 2020-01-08 14:37:06 -08:00
Dong Zhang
0dae59ac30 [MultiDB]except src and dockers : replace redis-cli with sonic-db-cli and use new DBConnector (#3928)
* [MultiDB]except src and dockers : replace redis-cli with sonic-db-cli and use new DBConnector
* fix vs tests along with swss vs tests together
2020-01-02 14:46:25 -08:00
Joe LeVeque
24a0c46464
[monit] Build from source and patch to use MemAvailable value if available on system (#3875) 2019-12-30 18:25:57 -08:00
Renuka Manavalan
78db0804d3
corefile uploader: Updates per review comments offline (#3915)
* Updates per review comments
1) core_uploader service waits for syslog.service
2) core_uploader service enabled for restart on failure
3) Use mtime instead of file size + ample time to be robust.

* Avoid reloading already uploaded file, by marking the names with a prefix.

* Updated failing path.
1) If rc file is missing or required data missing, it periodically logs error in forever loop.
2) If upload fails, retry every hour with a error log, forever.

* Fix few bugs

* The binary update_json.py will come from sonic-utilities.
2019-12-30 13:01:03 -08:00
Prabhu Sreenivasan
87f70108cb SONiC Management Framework Release 1.0 (#3488)
* Added sonic-mgmt-framework as submodule / docker

* fix build issues

* update sonic-mgmt-framework submodule branch to master

* Merged changes 70007e6d2ba3a4c0b371cd693ccc63e0a8906e77..00d4fcfed6a759e40d7b92120ea0ee1f08300fc6

00d4fcfed6a759e40d7b92120ea0ee1f08300fc6 Modified environemnt variables

* Changes to build sonic-mgmt-framework docker

* bumped up sonic-mgmt-framework commit-id

* version bump for sonic-mgmt-framework commit-it

* bumped up sonic-mgmt-framework commit-id

* Add python packages to docker

* Build fix for docker with python packages

* added libyang as dependent package

* Allow building images on NFS-mounted clones

Prior to this change, `build_debian.sh` would generate a Debian
filesystem in `./fsroot`. This needs root permissions, and one of the
tests that is performed is whether the user can create a character
special file in the filesystem (using mknod).

On most NFS deployments, `root` is the least privileged user, and cannot
run mknod. Also, attempting to run commands like rm or mv as root would
fail due to permission errors, since the root user gets mapped to an
unprivileged user like `nobody`.

This commit changes the location of the Debian filesystem to `/fsroot`,
which is a tmpfs mount within the slave Docker. The default squashfs,
docker tarball and zip files are also created within /tmp, before being
copied back to /sonic as the regular user.

The side effect of this change is that the contents of `/fsroot` are no
longer available once the slave container exits, however they are
available within the squashfs image.

Signed-off-by: Nirenjan Krishnan <Nirenjan.Krishnan@dell.com>

* bumped up sonc-mgmt-framework commit to include PR #18

*     REST Server startup script is enahnced to read the settings from
    ConfigDB. Below table provides mapping of db field to command line
    argument name.

    ============================================================
    ConfigDB entry key      Field name      REST Server argument
    ============================================================
    REST_SERVER|default     port            -port
    REST_SERVER|default     client_auth     -client_auth
    REST_SERVER|default     log_level       -v
    DEVICE_METADATA|x509    server_crt      -cert
    DEVICE_METADATA|x509    server_key      -key
    DEVICE_METADATA|x509    ca_crt          -cacert
    ============================================================

* Replace src/telemetry as submodule to sonic-telemetry

* Update telemetry commit HEAD

* Update sonic-telemetry commit HEAD

* libyang env path update

* Add libyang dependency to telemetry

* Add scripts to create JSON files for CLI backend

Scripts to create /var/platform/syseeprom and /var/platform/system, which are back-end
files for CLI, for system EEPROM and system information.

Signed-off-by: Howard Persh <Howard_Persh@dell.com>

* In startup script, create directory where CLI back-end files live

Signed-off-by: Howard Persh <Howard_Persh@dell.com>

* build dependency pkgs added to docker for build failure fix

* Changes to fix build issue for mgmt framework

* Fix exec path issue with telemetry

* s5232[device] PSU detecttion and default led state support

* Processing of first boot in rc.local should not have premature exit

Signed-off-by: Howard Persh <Howard_Persh@dell.com>

*  docker mount options added for platform, system features

* bumped up sonic-mgmt-framework commit id to pick 23rd July 2019 changes

* Added mount options for telemetry docker to get access for system and platform info.

* Update commit for sonic-utilities

* [dell]: Corrected dport map and renamed config files for S5232F

* Fix telemetry submodule commit

* added support for sonic-cli console

* [Dell S5232F, Z9264F] Harden FPGA driver kernel module

For Dell S5232F and Z9264F platforms, be more strict when checking state
in ISR of FPGA driver, to harden against spurious interrupts.

Signed-off-by: Howard Persh <Howard_Persh@dell.com>

* update mgmt-framework submodule to 27th Aug commit.

* remove changes not related to mgmt-framework and sonic-telemetry

* Revert "Replace src/telemetry as submodule to sonic-telemetry"

This reverts commit 11c3192975.

* Revert "Replace src/telemetry as submodule to sonic-telemetry"

This reverts commit 11c3192975.

* make submodule changes and remove a change not related to PR

* more changes

* Update .gitmodules

* Update Dockerfile.j2

* Update .gitmodules

* Update .gitmodules

* Update .gitmodules

reverting experimental change

* Removed syspoll for release_1.0

Signed-off-by: Jeff Yin <29264773+jeff-yin@users.noreply.github.com>

* Update docker-sonic-mgmt-framework.mk

* Update sonic-mgmt-framework.mk

* Update sonic-mgmt-framework.mk

* Update docker-sonic-mgmt-framework.mk

* Update docker-sonic-mgmt-framework.mk

* Revert "Processing of first boot in rc.local should not have premature exit"

This reverts commit e99a91ffc2.

* Remove old telemetry directory

* Update docker-sonic-mgmt-framework.mk

* Resolving merge conflict with Azure

* Reverting the wrong merge

* Use CVL_SCHEMA_PATH instead of changing directory for telemetry startup

* Add missing export

* Add python mmh3 to slave dockerfile

* Remove sonic-mgmt-framework build dep for telemetry, fix dialout startup issues

* Provided flag to disable compiling mgmt-framework

* Update sonic-utilites point latest commit id

* Point sonic-utilities to Azure accepted SHA

* Updating mgmt framework to right sha

* Add sonic-telemetry submodule

* Update the mgmt-framework commit id

Co-authored-by: jghalam <joe.ghalam@gmail.com>
Co-authored-by: Partha Dutta <51353699+dutta-partha@users.noreply.github.com>
Co-authored-by: srideepDell <srideep_devireddy@dell.com>
Co-authored-by: nirenjan <nirenjan@users.noreply.github.com>
Co-authored-by: Sachin Holla <51310506+sachinholla@users.noreply.github.com>
Co-authored-by: Eric Seifert <seiferteric@gmail.com>
Co-authored-by: Howard Persh <hpersh@yahoo.com>
Co-authored-by: Jeff Yin <29264773+jeff-yin@users.noreply.github.com>
Co-authored-by: Arunsundar Kannan <31632515+arunsundark@users.noreply.github.com>
Co-authored-by: rvasanthm <51932293+rvasanthm@users.noreply.github.com>
Co-authored-by: Ashok Daparthi-Dell <Ashok_Daparthi@Dell.com>
Co-authored-by: anand-kumar-subramanian <51383315+anand-kumar-subramanian@users.noreply.github.com>
2019-12-23 21:47:16 -08:00
Stepan Blyshchak
4ba0ff25d2 [services] make snmp.timer work again and delay telemetry.service (#3742)
Delay CPU intensive services at boot

- How I did it
Made snmp.timer work and add telemetry.timer.
But this is not enough because it breaks the existing snmp dependency on swss.
So, in this solution snmp timer is a wanted by swss service, but since OnBootSec timer expires only once it will not trigger snmp service, so I added line "OnUnitActiveSec=0 sec" which will start snmp service based on the last time it was active. On boot only OnBootSec will expire, on swss start/restarts only second timer will expire immediately and trigger snmp service.
However, snmp service will not stop after "systemctl stop snmp" because of the second timer which will always expire when snmp service because unavailable.
So there is a conflict which will be handled by systemd if we add "Conflicts=" line to both snmp.service and snmp.timer.

So during boot:

snmp does not start by default
swss starts and starts snmp timer
OnUnitActiveSec=0 does not expire since there is no snmp active
OnBootSec expires and starts snmp service and snmp timer gets stopped
During "systemctl restart swss"

snmp stops because of Requisite on swss
snmp unblocks snmp timer from running
swss starts and starts snmp timer
OnUnitActiveSec=0 expires imidiately and start snmp which stops snmp timer
During "systemctl stop snmp"

stop of snmp service unblocks snmp timer but no one starts the timer so it is not started by "OnUnitActiveSec=0"
2019-12-16 09:07:05 -08:00
Renuka Manavalan
3ab4b71656
Corefile uploader service (#3887)
* Corefile uploader service

1) A service is added to watch /var/core and upload to Azure storage
2) The service is disabled on boot. One may enable explicitly.
3) The .rc file to be updated with acct credentials and http proxy to use.
4) If service is enabled with no credentials, it would sleep, with periodic log messages
5) For any update in .rc, the service has to be restarted to take effect.

* Remove rw permission for .rc file for group & others.

* Changes per review comments.
Re-ordered .rc file per JSON.dump order.
Added a script to enable partial update of .rc, which HWProxy would use to add acct key.

* Azure storage upload requires python module futures, hence added it to install list.

* Removed trailing spaces.

* A mistake in name corrected.
Copy the .rc updater script to /usr/bin.
2019-12-15 16:48:48 -08:00
rajendra-dendukuri
fec80293dd ZTP infrastructure changes to support DHCP discovery provisioning data (#3298)
* ZTP infrastructure changes to support DHCP discovery provisioning data

- Dynamically generate DHCP client configuration based on current ZTP state
- Added support to request and process hostname when using DHCPv6
- Do not process graphservice url dhcp option if ZTP is enabled, ZTP service
will process it
- Generate /e/n/i file with all active interfaces seeking address assignment
via DHCP. Only interfaces that are created in Linux will be added to /e/n/i.
Also DHCP is started only on linked up in-band interfaces.

Signed-off-by: Rajendra Dendukuri <rajendra.dendukuri@broadcom.com>
2019-12-10 08:16:56 -08:00
pavel-shirshov
1848fb262b [fast-reboot]: Save fast-reboot state into the db (#3741)
Put a flag for fast-reboot to the db using EXPIRE feature. Using this flag in other part of SONiC to start in Fast-reboot mode. If we reload a config, the state in the db will be removed.
2019-12-04 14:10:19 -08:00
rajendra-dendukuri
cda61290ac [config-setup]: create a SONiC configuration management service (#3227)
* Create a SONiC configuration management service
* Perform config db migration after loading config_db.json to redis DB
* Migrate config-setup post migration hooks on image upgrade

config-setup post migration hooks help user to migrate configurations from
old image to new image. If the installed hooks are user defined they will not
be part of the newly installed image. So these hooks have to be migrated to
new image and only then they can be executing when the new image is booting.

The changes in this fix migrate config-setup post-migration hooks and ensure
that any hooks with the same filename in newly installed image are not
overwritten.

It is expected that users install new hooks as per their requirement and
not edit existing hooks. Any changes to existing hooks need to be done as
part of new image and not post bootup.
2019-12-04 07:15:58 -08:00
rajendra-dendukuri
eec594adf2 [sonic-ztp]: Build sonic-ztp package (#3299)
* Build sonic-ztp package

- Add changes in make rules to conditionally include sonic-ztp package

Signed-off-by: Rajendra Dendukuri <rajendra.dendukuri@broadcom.com>
2019-12-04 04:50:56 -08:00
Joe LeVeque
100d67941a [services] sflow service sets swss service as Requisite=, not Requires= (#3819)
The sflow service should not start unless the swss service is started. However, if this service is not started, the sflow service should not attempt to start them, instead it should simply fail to start. Using Requisite=, we will achieve this behavior, whereas using Requires= will cause the required service to be started.
2019-12-03 09:50:49 -08:00
pra-moh
bfa96bbce3 Add daemon which periodically pushes process and docker stats to State DB (#3525) 2019-11-27 15:35:41 -08:00
Joe LeVeque
5e6f8adb22 [services] Remove explicit dependencies from dhcp_relay service file, control in swss.sh (#3823) 2019-11-26 16:59:45 -08:00
yozhao101
67fc68513e [Services] Restart Sflow service upon unexpected critical process exit. (#3751)
Signed-off-by: Yong Zhao <yozhao@microsoft.com>
2019-11-25 13:02:00 -08:00
yozhao101
df11b2b9f1 [Services] Restart Telemetry service upon unexpected critical process exit. (#3768)
Signed-off-by: Yong Zhao <yozhao@microsoft.com>
2019-11-18 16:56:44 -08:00
Joe LeVeque
85b0de3df1 [docker-syncd]: Restart SwSS, syncd and dependent services if a critical process in syncd container exits unexpectedly (#3534)
Add the same mechanism I developed for the SwSS service in #2845 to the syncd service. However, in order to cause the SwSS service to also exit and restart in this situation, I developed a docker-wait-any program which the SwSS service uses to wait for either the swss or syncd containers to exit.
2019-11-09 10:26:39 -08:00
Olivier Singla
c70d8bca9f [baseimage]: kdump support (#3722)
* In the event of a kernel crash, we need to gather as much information
as possible to understand and identify the root cause of the crash.
Currently, the kernel does not provide much information, which make
kernel crash investigation difficult and time consuming.

Fortunately, there is a way in the kernel to provide more information
in the case of a kernel crash. kdump is a feature of the Linux kernel
that creates crash dumps in the event of a kernel crash. This PR
will add kermel kdump support.

An extension to the CLI utilities config and show is provided to
configure and manage kdump:
 - enable / disable kdump functionality
 - configure kdump (how many kernel crash logs can be saved, memory
   allocated for capture kernel)
 - view kernel crash logs
2019-11-08 23:08:42 -08:00
Ying Xie
96fffd883d Revert "[services] make snmp.timer work again and delay telemetry.service (#3657)" (#3729)
This reverts commit d346cb3898.
2019-11-08 21:44:25 -08:00
pavel-shirshov
d5af096f41
[TSA]: Add community to the loopback prefix, when isolated (#3708)
* Rename asn/deployment_id_asn_map.yaml to constants/constants.yaml

* Fix bgp templates

* Add community for loopback when bgpd is isolated

* Use correct community value
2019-11-06 16:07:28 -08:00
Stepan Blyshchak
d346cb3898 [services] make snmp.timer work again and delay telemetry.service (#3657)
Signed-off-by: Stepan Blyschak <stepanb@mellanox.com>
2019-11-06 12:12:31 -08:00
yozhao101
a117b25446 [Services] Restart LLDP service upon unexpected critical process exit. (#3713)
Signed-off-by: Yong Zhao <yozhao@microsoft.com>
2019-11-06 11:02:57 -08:00
yozhao101
ed79f54569 [Services] Restart DHCP-Relay service upon unexpected critical process exit. (#3667)
Signed-off-by: Yong Zhao <yozhao@microsoft.com>
2019-11-05 18:32:14 -08:00
yozhao101
4c31ef3cd2 [Services] Restart Teamd service upon unexpected critical process exit. (#3703)
Signed-off-by: Yong Zhao <yozhao@microsoft.com>
2019-11-04 17:45:41 -08:00
yozhao101
4fa3a1e27e [Services] Restart Platform-monitor service upon unexpected critical process exit. (#3689)
Signed-off-by: Yong Zhao <yozhao@microsoft.com>
2019-11-04 17:44:01 -08:00
Stepan Blyshchak
8dbe13c4cc [services] improve startup time by changing startup order (#3656)
* [services] improve startup time by given precedence to critical services (syncd.service)

Signed-off-by: Stepan Blyschak <stepanb@mellanox.com>
2019-10-31 09:18:26 -07:00
yozhao101
cff30c59d0 [Services] Restart Router-advertiser service upon unexpected critical process exit (#3681)
Signed-off-by: Yong Zhao <yozhao@microsoft.com>
2019-10-30 16:41:55 -07:00
yozhao101
a0fbeeaca5 [Services] Restart SNMP service upon unexpected critical process exit. (#3650)
Signed-off-by: Yong Zhao <yozhao@microsoft.com>
2019-10-22 14:41:12 -07:00
Wenda Ni
be52977aca Revert "Configure buffer profile to all ports (#3561)" (#3628)
This reverts commit 8861cbe98e.
2019-10-18 09:14:39 -07:00
kannankvs
150ed36be2 [snmp]: changes to handle snmp configuration as per the modified CLI (#3586)
While doing CLI changes for SNMP configuration, few changes are made in backend to handle the modified CLI.

** Changes**

- "community" for "snmp trap" is also made as "configurable". snmpd_conf.j2 is modified to handle the same.

- Changed the snmp.yml file generation from postStartAction to preStartAction in docker_image_ctl.j2 specific to SNMP docker, to ensure that the snmp.yml is generated before sonic-cfggen generates the snmpd.conf.

- Changed to make the code common for management vrf and default vrf. Users can configure snmp trap and snmp listening IP for both management vrf and default vrf.
2019-10-10 09:24:18 -07:00
Wenda Ni
8861cbe98e
Configure buffer profile to all ports (#3561)
Signed-off-by: Wenda Ni <wenni@microsoft.com>
2019-10-04 11:20:57 -07:00
Wenda Ni
cf0465bf53
Adopt per-port buffer and qos profile (#3542)
Signed-off-by: Wenda Ni <wenni@microsoft.com>
2019-10-02 13:01:16 -07:00
Stepan Blyshchak
52e35a0f95 [docker_image_ctl.j2] skip hostname update if is up to date (#3529)
Signed-off-by: Stepan Blyschak <stepanb@mellanox.com>
2019-10-01 20:48:03 -07:00
Stephen Sun
7308d2eb97 [Mellanox] Stop pmon ahead of syncd (#3505)
Issue Overview
shutdown flow

For any shutdown flow, which means all dockers are stopped in order, pmon docker stops after syncd docker has stopped, causing pmon docker fail to release sx_core resources and leaving sx_core in a bad state. The related logs are like the following:

INFO syncd.sh[23597]: modprobe: FATAL: Module sx_core is in use.
INFO syncd.sh[23597]: Unloading sx_core[FAILED]
INFO syncd.sh[23597]: rmmod: ERROR: Module sx_core is in use
config reload & service swss.restart
In the flows like "config reload" and "service swss restart", the failure cause further consequences:

sx_core initialization error with error message like "sx_core: create EMAD sdq 0 failed. err: -16"
syncd fails to execute the create switch api with error message "syncd_main: Runtime error: :- processEvent: failed to execute api: create, key: SAI_OBJECT_TYPE_SWITCH:oid:0x21000000000000, status: SAI_STATUS_FAILURE"
swss fails to call SAI API "SAI_SWITCH_ATTR_INIT_SWITCH", which causes orchagent to restart. This will introduce an extra 1 or 2 minutes for the system to be available, failing related test cases.
reboot, warm-reboot & fast-reboot
In the reboot flows including "reboot", "fast-reboot" and "warm-reboot" this failure doesn't have further negative effects since the system has already rebooted. In addition, "warm-reboot" requires the system to be shutdown as soon as possible to meet the GR time restriction of both BGP and LACP. "fast-reboot" also requires to meet the GR time restriction of BGP which is longer than LACP. In this sense, any unnecessary steps should be avoided. It's better to keep those flows untouched.

summary
To summarize, we have to come up with a way to ensure:

shutdown pmon docker ahead of syncd for "config reload" or "service swss restart" flow;
don't shutdown pmon docker ahead of syncd for "fast-reboot" or "warm-reboot" flow in order to save time.
for "reboot" flow, either order is acceptable.
Solution
To solve the issue, pmon shoud be stopped ahead of syncd stopped for all flows except for the warm-reboot.

- How I did it

To stop pmon ahead of syncd stopped. This is done in /usr/local/bin/syncd.sh::stop() and for all shutdown sequence.
Now pmon stops ahead of syncd so there must be a way in which pmon can start after syncd started. Another point that should be taken consideration is that pmon starting should be deferred so that services which have the logic of graceful restart in fast-reboot and warm-reboot have sufficient CPU cycles to meet their deadline.
This is done by add "syncd.service" as "After" to pmon.service and startin /usr/local/bin/syncd.sh::wait()
To start pmon automatically after syncd started.
2019-09-27 10:15:46 +02:00
Stephen Sun
c34a4783e0 [build] install new platform api on host (#3282)
slave.mk: add SONIC_PLATFORM_API_PY2 as dependency of host
sonic_debian_extension.j2: install sonic_daemon_base and Mellanox-specific sonic_platform on host
mlnx-platform-api.mk: export mlnx_platform_api_py2_wheel_path for sonic_debian_extension.j2
sonic-daemon-base.mk: export daemon_base_py2_wheel_path for sonic_debian_extension.j2
daemon_base.py: hind unnecessary dependency of swss_common on host
2019-09-25 11:00:24 -07:00
Harish Venkatraman
9d2d617264 [SNMP] management VRF SNMP support (#2608)
* [SNMP] management VRF SNMP support

This commit adds SNMP support for Management VRF using l3mdev.
The patch included provides VRF support, there is no single
"listendevice" configuration, rather multiple agentaddress
config options can each have their own "interface" to bind to
using "ip%interface". The snmpd.conf file is accordingly
generated using the snmp.yml file and redis database info.

Adding below the comments of SNMP patch 1376
--------------------------------------------
Since the Linux kernel added support for Virtual Routing
and Forwarding (VRF) in version 4.3
(Note: these won't compile on non-linux platforms)

https://www.kernel.org/doc/Documentation/networking/vrf.txt

Linux users could not use snmpd in its current form to
bind specific listening IP addresses to specific VRF
devices. A simplified description of a VRF inteface
is an interface that is a master (a container of sorts)
that collects a set of physicalinterfaces to form a
routing table.

This set of two patches (one for V5-7-patches and one
for V5-8-patches branches) is almost identical to patch
single "listendevice" configuration. Rather, multiple
agentAddress config options can each have their own
"interface" to bind to using the <ip>%<interface>
syntax.</interface></ip>
-------------------------------------------

Signed-off-by: Harish Venkatraman <harish_venkatraman@dell.com>
2019-09-18 17:26:45 -07:00
padmanarayana
75104bb35d [sflow]: Build infrastructure changes to support sflow docker and utilities (#3251)
Introduce a new "sflow" container (if ENABLE_SFLOW is set). The new docker will include:
hsflowd : host-sflow based daemon is the sFlow agent
psample : Built from libpsample repository. Useful in debugging sampled packets/groups.
sflowtool : Locally dump sflow samples (e.g. with a in-unit collector)

In case of SONiC-VS, enable psample & act_sample kernel modules.

VS' syncd needs iproute2=4.20.0-2~bpo9+1 & libcap2-bin=1:2.25-1 to support tc-sample

tc-syncd is provided as a convenience tool for debugging (e.g. tc-syncd filter show ...)
2019-09-14 20:27:09 -07:00
Wenda Ni
81aef6b64c [Qos] use dot1p to tc mapping for backend switches (#3422)
* Use dot1p to tc mapping for backend switches

Signed-off-by: Wenda Ni <wenni@microsoft.com>

* Do not write DSCP to TC mapping into CONFIG_DB or config_db.json for
storage switches

Signed-off-by: Wenda Ni <wenni@microsoft.com>
2019-09-13 11:28:25 -07:00
lguohan
a1158c6c18
Revert "Use dot1p to tc mapping for backend switches (#3412)" (#3421)
This reverts commit ca43dad12f.
2019-09-09 14:44:46 -07:00
Wenda Ni
ca43dad12f Use dot1p to tc mapping for backend switches (#3412)
* Use dot1p to tc mapping for backend switches

Signed-off-by: Wenda Ni <wenni@microsoft.com>

* Do not write DSCP to TC mapping into CONFIG_DB or config_db.json for
storage switches

Signed-off-by: Wenda Ni <wenni@microsoft.com>
2019-09-06 11:59:47 -07:00
Dong Zhang
768beb79e1 create multiple Redis DB instances based on CONFIG at /etc/sonic/database_config.json (#2182)
this is the first step to moving different databases tables into different database instances

in this PR, only handle multiple database instances creation based on user configuration at /etc/sonic/database_config.json

we keep current method to create single database instance if no extra/new DATABASE configuration exist in database_config.json file.

if user try to configure more db instances at database_config.json , we create those new db instances along with the original db instance existing today.

The configuration is as below, later we can add more db related information if needed:
{
...
"DATABASE": {
"redis-db-01" : {
"port" : "6380",
"database": ["APPL_DB", "STATE_DB"]
},
"redis-db-02" : {
"port" : "6381",
"database":["ASIC_DB"]
},
}
...
}

The detail description is at design doc at Azure/SONiC#271

The main idea is : when database.sh started, we check the configuration and generate corresponding scripts.

rc.local service handle old_config copy when loading new images, there is no dependency between rc.local and database service today, for safety and make sure the copy operation are done before database try to read it, we make database service run after rc.local

Then database docker started, we check the configuration and generate corresponding scripts/.conf in database docker as well.

based on those conf, we create databases instances as required.

at last, we ping_pong check database are up and continue


Signed-off-by: Dong Zhang d.zhang@alibaba-inc.com
2019-08-28 11:15:10 -07:00