Commit Graph

205 Commits

Author SHA1 Message Date
Joe LeVeque
aca1a86856 [caclmgrd] Fix application of IPv6 service ACL rules (part 2) (#4036) 2020-01-17 17:33:31 -08:00
kannankvs
d150721fa1 modified down rules to pre-down rules to ensure that default route is… (#3853)
* modified down rules to pre-down rules to ensure that default route is deleted just before interface is made down
2020-01-16 19:36:49 -08:00
yozhao101
aa67921d06 [Monit] Change the monitoring period from 120 seconds to 60 seconds. (#3974)
* [Monit] Change the monitoring period of monit from 120 seconds to 60
seconds and also at the same time double the interval for existing sonic monit config file in
host.

Signed-off-by: Yong Zhao <yozhao@microsoft.com>
2020-01-10 13:01:24 -08:00
Sujin Kang
856b4b64eb [reboot cause]: Delay process-reboot-cause service until network connection is stable (#4003) 2020-01-10 09:47:13 -08:00
lguohan
483a5946a8
Revert "[MultiDB]except src and dockers : replace redis-cli with sonic-db-cli and use new DBConnector (#3928)" (#4002)
This reverts commit 0dae59ac30.
2020-01-10 08:27:34 -08:00
Dong Zhang
0dae59ac30 [MultiDB]except src and dockers : replace redis-cli with sonic-db-cli and use new DBConnector (#3928)
* [MultiDB]except src and dockers : replace redis-cli with sonic-db-cli and use new DBConnector
* fix vs tests along with swss vs tests together
2020-01-02 14:46:25 -08:00
Joe LeVeque
24a0c46464
[monit] Build from source and patch to use MemAvailable value if available on system (#3875) 2019-12-30 18:25:57 -08:00
Renuka Manavalan
78db0804d3
corefile uploader: Updates per review comments offline (#3915)
* Updates per review comments
1) core_uploader service waits for syslog.service
2) core_uploader service enabled for restart on failure
3) Use mtime instead of file size + ample time to be robust.

* Avoid reloading already uploaded file, by marking the names with a prefix.

* Updated failing path.
1) If rc file is missing or required data missing, it periodically logs error in forever loop.
2) If upload fails, retry every hour with a error log, forever.

* Fix few bugs

* The binary update_json.py will come from sonic-utilities.
2019-12-30 13:01:03 -08:00
Prabhu Sreenivasan
87f70108cb SONiC Management Framework Release 1.0 (#3488)
* Added sonic-mgmt-framework as submodule / docker

* fix build issues

* update sonic-mgmt-framework submodule branch to master

* Merged changes 70007e6d2ba3a4c0b371cd693ccc63e0a8906e77..00d4fcfed6a759e40d7b92120ea0ee1f08300fc6

00d4fcfed6a759e40d7b92120ea0ee1f08300fc6 Modified environemnt variables

* Changes to build sonic-mgmt-framework docker

* bumped up sonic-mgmt-framework commit-id

* version bump for sonic-mgmt-framework commit-it

* bumped up sonic-mgmt-framework commit-id

* Add python packages to docker

* Build fix for docker with python packages

* added libyang as dependent package

* Allow building images on NFS-mounted clones

Prior to this change, `build_debian.sh` would generate a Debian
filesystem in `./fsroot`. This needs root permissions, and one of the
tests that is performed is whether the user can create a character
special file in the filesystem (using mknod).

On most NFS deployments, `root` is the least privileged user, and cannot
run mknod. Also, attempting to run commands like rm or mv as root would
fail due to permission errors, since the root user gets mapped to an
unprivileged user like `nobody`.

This commit changes the location of the Debian filesystem to `/fsroot`,
which is a tmpfs mount within the slave Docker. The default squashfs,
docker tarball and zip files are also created within /tmp, before being
copied back to /sonic as the regular user.

The side effect of this change is that the contents of `/fsroot` are no
longer available once the slave container exits, however they are
available within the squashfs image.

Signed-off-by: Nirenjan Krishnan <Nirenjan.Krishnan@dell.com>

* bumped up sonc-mgmt-framework commit to include PR #18

*     REST Server startup script is enahnced to read the settings from
    ConfigDB. Below table provides mapping of db field to command line
    argument name.

    ============================================================
    ConfigDB entry key      Field name      REST Server argument
    ============================================================
    REST_SERVER|default     port            -port
    REST_SERVER|default     client_auth     -client_auth
    REST_SERVER|default     log_level       -v
    DEVICE_METADATA|x509    server_crt      -cert
    DEVICE_METADATA|x509    server_key      -key
    DEVICE_METADATA|x509    ca_crt          -cacert
    ============================================================

* Replace src/telemetry as submodule to sonic-telemetry

* Update telemetry commit HEAD

* Update sonic-telemetry commit HEAD

* libyang env path update

* Add libyang dependency to telemetry

* Add scripts to create JSON files for CLI backend

Scripts to create /var/platform/syseeprom and /var/platform/system, which are back-end
files for CLI, for system EEPROM and system information.

Signed-off-by: Howard Persh <Howard_Persh@dell.com>

* In startup script, create directory where CLI back-end files live

Signed-off-by: Howard Persh <Howard_Persh@dell.com>

* build dependency pkgs added to docker for build failure fix

* Changes to fix build issue for mgmt framework

* Fix exec path issue with telemetry

* s5232[device] PSU detecttion and default led state support

* Processing of first boot in rc.local should not have premature exit

Signed-off-by: Howard Persh <Howard_Persh@dell.com>

*  docker mount options added for platform, system features

* bumped up sonic-mgmt-framework commit id to pick 23rd July 2019 changes

* Added mount options for telemetry docker to get access for system and platform info.

* Update commit for sonic-utilities

* [dell]: Corrected dport map and renamed config files for S5232F

* Fix telemetry submodule commit

* added support for sonic-cli console

* [Dell S5232F, Z9264F] Harden FPGA driver kernel module

For Dell S5232F and Z9264F platforms, be more strict when checking state
in ISR of FPGA driver, to harden against spurious interrupts.

Signed-off-by: Howard Persh <Howard_Persh@dell.com>

* update mgmt-framework submodule to 27th Aug commit.

* remove changes not related to mgmt-framework and sonic-telemetry

* Revert "Replace src/telemetry as submodule to sonic-telemetry"

This reverts commit 11c3192975.

* Revert "Replace src/telemetry as submodule to sonic-telemetry"

This reverts commit 11c3192975.

* make submodule changes and remove a change not related to PR

* more changes

* Update .gitmodules

* Update Dockerfile.j2

* Update .gitmodules

* Update .gitmodules

* Update .gitmodules

reverting experimental change

* Removed syspoll for release_1.0

Signed-off-by: Jeff Yin <29264773+jeff-yin@users.noreply.github.com>

* Update docker-sonic-mgmt-framework.mk

* Update sonic-mgmt-framework.mk

* Update sonic-mgmt-framework.mk

* Update docker-sonic-mgmt-framework.mk

* Update docker-sonic-mgmt-framework.mk

* Revert "Processing of first boot in rc.local should not have premature exit"

This reverts commit e99a91ffc2.

* Remove old telemetry directory

* Update docker-sonic-mgmt-framework.mk

* Resolving merge conflict with Azure

* Reverting the wrong merge

* Use CVL_SCHEMA_PATH instead of changing directory for telemetry startup

* Add missing export

* Add python mmh3 to slave dockerfile

* Remove sonic-mgmt-framework build dep for telemetry, fix dialout startup issues

* Provided flag to disable compiling mgmt-framework

* Update sonic-utilites point latest commit id

* Point sonic-utilities to Azure accepted SHA

* Updating mgmt framework to right sha

* Add sonic-telemetry submodule

* Update the mgmt-framework commit id

Co-authored-by: jghalam <joe.ghalam@gmail.com>
Co-authored-by: Partha Dutta <51353699+dutta-partha@users.noreply.github.com>
Co-authored-by: srideepDell <srideep_devireddy@dell.com>
Co-authored-by: nirenjan <nirenjan@users.noreply.github.com>
Co-authored-by: Sachin Holla <51310506+sachinholla@users.noreply.github.com>
Co-authored-by: Eric Seifert <seiferteric@gmail.com>
Co-authored-by: Howard Persh <hpersh@yahoo.com>
Co-authored-by: Jeff Yin <29264773+jeff-yin@users.noreply.github.com>
Co-authored-by: Arunsundar Kannan <31632515+arunsundark@users.noreply.github.com>
Co-authored-by: rvasanthm <51932293+rvasanthm@users.noreply.github.com>
Co-authored-by: Ashok Daparthi-Dell <Ashok_Daparthi@Dell.com>
Co-authored-by: anand-kumar-subramanian <51383315+anand-kumar-subramanian@users.noreply.github.com>
2019-12-23 21:47:16 -08:00
Joe LeVeque
77d636256b
[caclmgrd] Fix application of IPv6 service ACL rules (#3917) 2019-12-19 07:15:27 -08:00
Renuka Manavalan
3ab4b71656
Corefile uploader service (#3887)
* Corefile uploader service

1) A service is added to watch /var/core and upload to Azure storage
2) The service is disabled on boot. One may enable explicitly.
3) The .rc file to be updated with acct credentials and http proxy to use.
4) If service is enabled with no credentials, it would sleep, with periodic log messages
5) For any update in .rc, the service has to be restarted to take effect.

* Remove rw permission for .rc file for group & others.

* Changes per review comments.
Re-ordered .rc file per JSON.dump order.
Added a script to enable partial update of .rc, which HWProxy would use to add acct key.

* Azure storage upload requires python module futures, hence added it to install list.

* Removed trailing spaces.

* A mistake in name corrected.
Copy the .rc updater script to /usr/bin.
2019-12-15 16:48:48 -08:00
Stephen Sun
80bb7fd15a [process-reboot-cause]Address the issue: Incorrect reboot cause returned when warm reboot follows a hardware caused reboot (#3880)
* [process-reboot-cause]Address the issue: Incorrect reboot cause returned when warm reboot follows a hardware caused reboot
1. check whether /proc/cmdline indicates warm/fast reboot.
   if yes the software reboot cause file will be treated as the reboot cause.
   finish
2. check whether platform api returns a reboot cause.
   if yes it is treated as the reboot cause.
   finish.
3. check whether /hosts/reboot-cause contains a cause.
   if yes it is treated as the cause otherwise return unknown.

* [process-reboot-cause]Fix review comments

* [process-reboot-cause]address comments
1. use "with" statement
2. update fast/warm reboot BOOT_ARG

* [process-reboot-cause]address comments

* refactor the code flow

* Remove escape

* Remove extra ':'
2019-12-14 09:41:48 -08:00
Ying Xie
eefa8455d7
[hostcfgd] avoid in place editing config file contents (#3904)
In place editing (sed -i) seems having some issues with filesystem
interaction. It could leave 0 size file or corrupted file behind.

It would be safer to sed the file contents into a new file and switch
new file with the old file.

Signed-off-by: Ying Xie <ying.xie@microsoft.com>
2019-12-13 19:26:39 -08:00
rajendra-dendukuri
fec80293dd ZTP infrastructure changes to support DHCP discovery provisioning data (#3298)
* ZTP infrastructure changes to support DHCP discovery provisioning data

- Dynamically generate DHCP client configuration based on current ZTP state
- Added support to request and process hostname when using DHCPv6
- Do not process graphservice url dhcp option if ZTP is enabled, ZTP service
will process it
- Generate /e/n/i file with all active interfaces seeking address assignment
via DHCP. Only interfaces that are created in Linux will be added to /e/n/i.
Also DHCP is started only on linked up in-band interfaces.

Signed-off-by: Rajendra Dendukuri <rajendra.dendukuri@broadcom.com>
2019-12-10 08:16:56 -08:00
rajendra-dendukuri
cda61290ac [config-setup]: create a SONiC configuration management service (#3227)
* Create a SONiC configuration management service
* Perform config db migration after loading config_db.json to redis DB
* Migrate config-setup post migration hooks on image upgrade

config-setup post migration hooks help user to migrate configurations from
old image to new image. If the installed hooks are user defined they will not
be part of the newly installed image. So these hooks have to be migrated to
new image and only then they can be executing when the new image is booting.

The changes in this fix migrate config-setup post-migration hooks and ensure
that any hooks with the same filename in newly installed image are not
overwritten.

It is expected that users install new hooks as per their requirement and
not edit existing hooks. Any changes to existing hooks need to be done as
part of new image and not post bootup.
2019-12-04 07:15:58 -08:00
pra-moh
bfa96bbce3 Add daemon which periodically pushes process and docker stats to State DB (#3525) 2019-11-27 15:35:41 -08:00
pra-moh
d3a1555f30 [hostcfgd] Add support to enable/disable optional features (#3653) 2019-11-26 14:11:12 -08:00
kannankvs
4007d9ba9c [ntp]: modified ntp script to hide the error related to cfggen (#3745)
This PR is to handle the issue 3527.
When device boots up, NTP throws a traceback as explained in the issue 3527.

- Traceback will be seen when MGMT_VRF_CONFIG does not exist in the database. Traceback is coming from the script “/etc/init.d/ntp”.

- Traceback does not affect the NTP functionality with/without management VRF. When MGMT_VRF_CONFIG does not exist or when MGMT_VRF_CONFIG’s mgmtVrfEnabled is configured to “false”, “NTP” will be started in the “default VRF” context, which is working fine even with this traceback.

- This traceback error will be hidden by redirecting the error to /dev/null without affecting functionality.
2019-11-14 00:06:54 -08:00
Joe LeVeque
c50c390eb4 [rsyslog] Add support for IPv6 remote addresses (#3754) 2019-11-14 00:00:55 -08:00
Tyler Li
c07ae3b16f Loopback ip addresses move to intfmgrd for supporting VRF 2019-11-10 02:27:33 -08:00
Joe LeVeque
85b0de3df1 [docker-syncd]: Restart SwSS, syncd and dependent services if a critical process in syncd container exits unexpectedly (#3534)
Add the same mechanism I developed for the SwSS service in #2845 to the syncd service. However, in order to cause the SwSS service to also exit and restart in this situation, I developed a docker-wait-any program which the SwSS service uses to wait for either the swss or syncd containers to exit.
2019-11-09 10:26:39 -08:00
lguohan
6d46badbdc
[aboot]: preserve snmp.yml and acl.json for eos to sonic fast reboot (#3716) 2019-11-06 20:18:31 -08:00
Neetha John
95466c3ab7 [pfcwd]: Do not start pfc watchdog on Management Tor (#3719)
Signed-off-by: Neetha John <nejo@microsoft.com>
2019-11-06 18:51:02 -08:00
pavel-shirshov
d5af096f41
[TSA]: Add community to the loopback prefix, when isolated (#3708)
* Rename asn/deployment_id_asn_map.yaml to constants/constants.yaml

* Fix bgp templates

* Add community for loopback when bgpd is isolated

* Use correct community value
2019-11-06 16:07:28 -08:00
Ying Xie
5961e031e1
[hostname-config] improve hostname-config process (#3676)
We noticed in tests/production that there is a low probability failure
where /etc/hosts could have some garbage characters before the entry for
local host name. The consequence is that all sudo command would be very
slow. In extreme cases it would prevent some services from starting
properly.

I suspect that the /etc/hosts file might be opened by some process causing
the issue. Editing contents with new file level and replace the whole file
should be safer.

Signed-off-by: Ying Xie <ying.xie@microsoft.com>
2019-10-29 08:30:27 -07:00
Danny Allen
63328814fc
[core_cleanup] Fix issue where core_cleanup job runs too frequently (#3659)
Signed-off-by: Danny Allen <daall@microsoft.com>
2019-10-23 15:55:47 -07:00
pavel-shirshov
9b8f5c9c9a [ntp]: Use loopback address when we don't have MGMT interface (#3566)
Added configuration to use Loopback ip if a switch doesn't have MGMT_PORT.
2019-10-07 07:49:25 -07:00
Ying Xie
cd85e2148b
[updategraph] enhance update graph handling (#3549)
- after reloading minigraph, write latest version string in the DB.
- if old config_db.json file exists, use it and migrate to latest version.
- only reload minigraph when config_db.json doesn't exist and minigraph
  exists.

Signed-off-by: Ying Xie <ying.xie@microsoft.com>
2019-10-02 13:58:44 -07:00
Ying Xie
d5262a3621
[first boot] sync file system after moving/copying files (#3550)
Signed-off-by: Ying Xie <ying.xie@microsoft.com>
2019-10-02 13:58:34 -07:00
Long Ou
b6a09999de [hostcfgd] hostcfgd will exit when set hostname in DEVICE_METADATA (#3394)
Signed-off-by: ouxiaolong <ouxiaolong@asterfusion.com>
2019-09-24 17:36:02 -07:00
Harish Venkatraman
9d2d617264 [SNMP] management VRF SNMP support (#2608)
* [SNMP] management VRF SNMP support

This commit adds SNMP support for Management VRF using l3mdev.
The patch included provides VRF support, there is no single
"listendevice" configuration, rather multiple agentaddress
config options can each have their own "interface" to bind to
using "ip%interface". The snmpd.conf file is accordingly
generated using the snmp.yml file and redis database info.

Adding below the comments of SNMP patch 1376
--------------------------------------------
Since the Linux kernel added support for Virtual Routing
and Forwarding (VRF) in version 4.3
(Note: these won't compile on non-linux platforms)

https://www.kernel.org/doc/Documentation/networking/vrf.txt

Linux users could not use snmpd in its current form to
bind specific listening IP addresses to specific VRF
devices. A simplified description of a VRF inteface
is an interface that is a master (a container of sorts)
that collects a set of physicalinterfaces to form a
routing table.

This set of two patches (one for V5-7-patches and one
for V5-8-patches branches) is almost identical to patch
single "listendevice" configuration. Rather, multiple
agentAddress config options can each have their own
"interface" to bind to using the <ip>%<interface>
syntax.</interface></ip>
-------------------------------------------

Signed-off-by: Harish Venkatraman <harish_venkatraman@dell.com>
2019-09-18 17:26:45 -07:00
Prince Sunny
8ca1eb289e
Install Iptables rules to set TCPMSS for 'lo' interface (#3452)
* Install Iptables rules to set TCPMSS for lo interface
* Moved implementation to hostcfgd to maintain at one place
2019-09-18 10:12:28 -07:00
sridhar-ravindran
3c0b56a709 [DELL] S6100 Support PowerCycle in Last Reboot Reason (#3403)
* [DELL] S6100 Support PowerCycle in Last Reboot Reason

* handle first time boot properly

* S6000 Last Reboot Reason Fix
2019-09-17 16:51:46 -07:00
Harish Venkatraman
31d1a76197 [baseimage]: Management vrf ntp support (#3204)
This commit adds NTP support for management VRF using L3mdev. Config vrf add
mgmt will enable management VRF, enslave the eth0 device to the master device
mgmt, stop ntp service in default, restart interfaces-configs and restart ntp
service in mgmt-vrf context. Requirement and design are covered in mgmt vrf
design document.

Signed-off-by: Harish Venkatraman <harish_venkatraman@dell.com>
2019-09-16 10:21:06 -07:00
Danny Allen
97c675c6d5 [cron.d] Add cron job to periodically clean-up core files (#3449)
* [cron.d] Create cron job to periodically clean-up core files
* Create script to scan /var/core and clean-up older core files
* Create cron job to run clean-up script

Signed-off-by: Danny Allen <daall@microsoft.com>

* Update interval for running cron job

* Respond to feedback

* Change syslog id
2019-09-13 10:50:31 -07:00
lguohan
95a72b4e39
[baseimage]: fix monit configuration (#3448)
- monit config broke by one monit upgrade
- abandon sed approach since it is suspestible to monit config changes
- use unixsocket instead of httpd due to a bug in 5.20.0
2019-09-12 22:48:40 -07:00
Joe LeVeque
a27f12773b [baseimage]: Log message containing SONiC version to syslog at boot (#3416) 2019-09-09 14:18:23 -07:00
Ying Xie
d6b4223bdd [control plane assistant] stop control plane assistant after warm reboot (#3337)
Delay saving configuration so that the control assistant configurations
won't be persisted.

Signed-off-by: Ying Xie <ying.xie@microsoft.com>
2019-08-15 00:45:54 -07:00
Renuka Manavalan
fcdf62f5f6
Fix to ensure that tacacs servers are ordered (reverse) by priority in pam.d's config. (#3322)
Present: Servers are listed in the same order as in redis-db
Fix: Save the sort o/p, hence use sorted list to write into pam.d's conf.
     As well convert priority to integer for use by sort.
2019-08-09 11:46:46 -07:00
arheneus@marvell.com
50fe458592 [build]: SONiC buildimage ARM arch support (#2980)
ARM Architecture support in SONIC

make configure platform=[ASIC_VENDOR_ARCH] PLATFORM_ARCH=[ARM_ARCH]
SONIC_ARCH: default amd64
armhf - arm32bit
arm64 - arm64bit

Signed-off-by: Antony Rheneus <arheneus@marvell.com>
2019-07-25 22:06:41 -07:00
Harish Venkatraman
3e69427ac0 [baseimage] management VRF support via l3mdev (#2585)
This commit adds support for New feature management VRF using L3mdev.  Added
commands to enable/disable management VRF. Config vrf add mgmt will enable
management VRF, enslave the eth0 device to the master device mgmt and restart
interfaces-configs in mgmt-vrf context.

management interface (eth0) can be configured using config interface eth0 ip
add command and removed using config interface eth0 ip remove command.

Requirement and design are covered in mgmt vrf design document.  Currently show
command displays linux command output; will update show command display in next
PR after concluding what would be the output for the show commands. Added
metric for default routes in dhcp and static, any changes for metric will be
addressed subsequently after discussing.

Signed-off-by: Harish Venkatraman <harish_venkatraman@dell.com>
2019-07-24 16:18:40 -07:00
Ying Xie
9d64ce761f
[warm reboot] save configuration after warm reboot (#3200)
* [warm reboot] save configuration after warm reboot

After warm reboot, save a copy of in memory database to config_db.json,
upgrade procedure might have removed config_db.json to force new image
to reload minigraph. However, reload minigraph is skipped during warm
reboot. Missing config_db.json would cause device to fault in next
non-upgrading cold/fast reboot.

Signed-off-by: Ying Xie <ying.xie@microsoft.com>

* Update finalize-warmboot.sh
2019-07-24 09:59:47 -07:00
zzhiyuan
e4c041b57f [baseimage]: Fix process-reboot-cause possibly throwing OSError (#3159)
In case of going from previous iteration of SONiC, and the last reboot
was hardware, REBOOT_CAUSE_FILE may not be present and the service may
throw an error.
2019-07-16 08:34:11 -07:00
Joe LeVeque
5e2ab9dd03
[process-reboot-cause] Handle case if platform does not yet have sonic_platform implementation (#3126) 2019-07-05 17:53:49 -07:00
Joe LeVeque
e5a2beb13b [reboot-cause]: Move reboot cause processing to its own service, 'process-reboot-cause' (#3102) 2019-07-03 10:38:20 -07:00
Joe LeVeque
319d854e46 [baseimage]: Increase TMOUT for serial port connections to 15 minutes (#3032)
Increase TMOUT value in order to close inactive serial console connections after 900 seconds (15 minutes) of inactivity
2019-06-19 00:16:01 -07:00
Prince Sunny
231d309b69
Generate interface table to have an entry designated to default VRF. (#2848)
* Generate default VRF table for router interfaces

* Updated jinja2 template to have prefix filter
2019-06-10 14:02:55 -07:00
Myron Sosyak
3ec95e17c8 [build_templates] [hostcfgd] Keep containers hostname up to date (#2924)
* Add updateHostName function to docker_image_ctl.j2
* Add hostname specification on container creating step
* Add listener for hostname changes in hostcfgd

Signed-off-by: Myron Sosyak <msosyak@barefootnetworks.com>
2019-06-06 00:41:30 -07:00
Joe LeVeque
3ec3e20e5a [logrotate] Enhance robustness (#2942)
* [logrotate] Decrease frequency to every 10 minutes; kill any lingering logrotate processes

* [logrotate] Delete all *.1.gz files as firstaction; Remove note about init-system-helpers < 1.47 workaround

However, continue to send SIGHUP directly to rsyslogd process
because 'service rsyslog rotate' still doesn't work properly with
init-system-helpers version 1.48
2019-05-25 18:00:18 -07:00
Ying Xie
222706120d [updategraph] set DB version after minigraph reload (#2917)
Signed-off-by: Ying Xie <ying.xie@microsoft.com>
2019-05-18 22:08:41 -07:00