Found another syncd timing issue related to clock going backwards.
To be safe disable the ntp long jump.
Signed-off-by: Ying Xie <ying.xie@microsoft.com>
I found that with IPv4Network types, calling list(ip_ntwrk.hosts()) is reliable. However, when doing the same with an IPv6Network, I found that the conversion to a list can hang indefinitely. This appears to me to be a bug in the ipaddress.IPv6Network implementation. However, I could not find any other reports on the web.
This patch changes the behavior to call next() on the ip_ntwrk.hosts() generator instead, which returns the IP address of the first host.
Fix hostcfgd so that changes to the "FEATURE" table in ConfigDB are properly handled. Three changes here:
1. Fix indenting such that the handling of each key actually occurs in the for key in status_data.keys(): loop
2. Add calls to sudo systemctl mask and sudo systemctl unmask as appropriate to ensure changes persist across reboots
3. Substitute returns with continues so that even if one service fails, we still try to handle the others
Note that the masking is persistent, even if the configuration is not saved. We may want to consider only calling systemctl enable/disable in hostcfgd when the DB table changes, and only call systemctl mask/unmask upon calling config save.
* Changes to support config-setup service for multi-npu
platforms. For Multi-npu we are not supporting as of
now config initializtion and ZTP. It will support creating
config db from minigraph or using config db from previous
file system
Signed-off-by: Abhishek Dosi <abdosi@microsoft.com>
* Address Review Comments.
* Address Review comments
* Address Review Comments of using pyhton based config load_minigraph/
config save/config reload from shell scripts so that we don't duplicate
code. Also while running from shell we will skip stop/start services
done by those commands.
* Updated to use python command so no code duplication.
* [ntp] enable/disable NTP long jump according to reboot type
- Enable NTP long jump after cold reboot.
- Disable NTP long jump after warrm/fast reboot.
Signed-off-by: Ying Xie <ying.xie@microsoft.com>
* fix typo
* further refactoring
* use sonic-db-cli instead
Since the introduction of VRF, interface-related tables in ConfigDB will have multiple entries, one of which only contains the interface name and no IP prefix. Thus, when iterating over the keys in the tables, we need to ignore the entries which do not contain IP prefixes.
and then we load image and reboot even if there was existing
config_db.json we will look for DHCP Service. we should disbale
update_graph in such cases. This behaviour is silimar to what we have in
201811 image.
Modified caclmgrd behavior to enhance control plane security as follows:
Upon starting or receiving notification of ACL table/rule changes in Config DB:
1. Add iptables/ip6tables commands to allow all incoming packets from established TCP sessions or new TCP sessions which are related to established TCP sessions
2. Add iptables/ip6tables commands to allow bidirectional ICMPv4 ping and traceroute
3. Add iptables/ip6tables commands to allow bidirectional ICMPv6 ping and traceroute
4. Add iptables/ip6tables commands to allow all incoming Neighbor Discovery Protocol (NDP) NS/NA/RS/RA messages
5. Add iptables/ip6tables commands to allow all incoming IPv4 DHCP packets
6. Add iptables/ip6tables commands to allow all incoming IPv6 DHCP packets
7. Add iptables/ip6tables commands to allow all incoming BGP traffic
8. Add iptables/ip6tables commands for all ACL rules for recognized services (currently SSH, SNMP, NTP)
9. For all services which we did not find configured ACL rules, add iptables/ip6tables commands to allow all incoming packets for those services (allows the device to accept SSH connections before the device is configured)
10. Add iptables rules to drop all packets destined for loopback interface IP addresses
11. Add iptables rules to drop all packets destined for management interface IP addresses
12. Add iptables rules to drop all packets destined for point-to-point interface IP addresses
13. Add iptables rules to drop all packets destined for our VLAN interface gateway IP addresses
14. Add iptables/ip6tables commands to allow all incoming packets with TTL of 0 or 1 (This allows the device to respond to tools like tcptraceroute)
15. If we found control plane ACLs in the configuration and applied them, we lastly add iptables/ip6tables commands to drop all other incoming packets
1. ebtables -t filter -A FORWARD -p 802_1Q --vlan-encap 0806 -j DROP
The ARP packet with vlan tag can't match the default rule.
Signed-off-by: wangshengjun <wangshengjun@asterfusion.com>
* Run fsck filesystem check support prior mounting filesystem
If the filesystem become non clean ("dirty"), SONiC does not run fsck to
repair and mark it as clean again.
This patch adds the functionality to run fsck on each boot, prior to the
filesystem being mounted. This allows the filesystem to be repaired if
needed.
Note that if the filesystem is maked as clean, fsck does nothing and simply
return so this is perfectly fine to call fsck every time prior to mount the
filesystem.
How to verify this patch (using bash):
Using an image without this patch:
Make the filesystem "dirty" (not clean)
[we are making the assumption that filesystem is stored in /dev/sda3 - Please adjust depending of the platform]
[do this only on a test platform!]
dd if=/dev/sda3 of=superblock bs=1 count=2048
printf "$(printf '\\x%02X' 2)" | dd of="superblock" bs=1 seek=1082 count=1 conv=notrunc &> /dev/null
dd of=/dev/sda3 if=superblock bs=1 count=2048
Verify that filesystem is not clean
tune2fs -l /dev/sda3 | grep "Filesystem state:"
reboot and verify that the filesystem is still not clean
Redo the same test with an image with this patch, and verify that at next reboot the filesystem is repaired and becomes clean.
fsck log is stored on syslog, using the string FSCK as markup.
The one big bgp configuration template was splitted into chunks.
Currently we have three types of bgp neighbor peers:
general bgp peers. They are represented by CONFIG_DB::BGP_NEIGHBOR table entries
dynamic bgp peers. They are represented by CONFIG_DB::BGP_PEER_RANGE table entries
monitors bgp peers. They are represented by CONFIG_DB::BGP_MONITORS table entries
This PR introduces three templates for each peer type:
bgp policies: represent policieas that will be applied to the bgp peer-group (ip prefix-lists, route-maps, etc)
bgp peer-group: represent bgp peer group which has common configuration for the bgp peer type and uses bgp routing policy from the previous item
bgp peer-group instance: represent bgp configuration, which will be used to instatiate a bgp peer-group for the bgp peer-type. Usually this one is simple, consist of the referral to the bgp peer-group, bgp peer description and bgp peer ip address.
This PR redefined constant.yml file. Now this file has a setting for to use or don't use bgp_neighbor metadata. This file has more parameters for now, which are not used. They will be used in the next iteration of bgpcfgd.
Currently all tests have been disabled. I'm going to create next PR with the tests right after this PR is merged.
I'm going to introduce better bgpcfgd in a short time. It will include support of dynamic changes for the templates.
FIX:: #4231
- create a file in files/image_config/ntp/ntp-systemd-wrapper to add mgmt vrf related start cmd for ntp service. So that the default /usr/lib/ntp/ntp-systemd-wrapper can be overriden during build time.
- modify build_debian.sh to cp files/image_config/ntp/ntp-systemd-wrapper to /usr/lib/ntp/ntp-systemd-wrapper during build time.
Co-authored-by: Bing Sun <Bing_Sun@dell.com>
do mount/umount in the chroot environment
install cron explicitly
install rasdaemon as a replacement for mcelog
switch python package docker-py to docker
* Install kubernetes worker node packages, if enabled.
* Minor updates
* Added some comments
* Updates per review comments.
Built a private image to test to work fine.
* Remove the removed file.
* Update per comments
Make a fix, as kubeadm no demands a higher version of kubelet & kubectl.
As kubeadm auto install kubectl & kubelet, removing explicit install is an easier/robust fix.
* Changes per review comments.
* Updates per comments.
1) Dropped helper & pod scripts
2) Made install verbose
* Drop creation of pods subdir, as this PR does not use them.
* From comments to 'n' per review comments.
* 1) kubeadm.conf is created as part of kubeadm package install. Hence dropped explicit copy.
* Fix the CMD for the PROCESSSTATS entries so that
there is a space between the command name and the
arguments.
Signed-off-by: Garrick He <garrick_he@dell.com>
- What I did
Add configuration to avoid ntpd from panic and exit if the drift between new time and current system time is large.
- How I did it
Added "tinker panic 0" in ntp.conf file.
- How to verify it
[this assumes that there is a valid NTP server IP in config_db/ntp.conf]
Change the current system time to a bad time with a large drift from time in ntp server; drift should be greater than 1000s.
Reboot the device.
Before the fix:
3. upon reboot, ntp-config service comes up fine, ntp service goes to active(exited) state without any error message. This is because the offset between new time (from ntp server) and the current system time is very large, ntpd goes to panic mode and exits. The system continues to show the bad time.
After the fix:
3. Upon reboot, ntp-config comes up fine, ntp services comes up from and stays in active (running) state. The system clock gets synced with the ntp server time.
Instead of updating hostname manualy on Config DB hostname change,
simply share containers UTS namespace with host OS.
Ideally, instead of setting `--uts=host` for every container in SONiC,
this setting can be set per container if feature requires.
One behaviour change is introduced in this commit, when `--privileged`
or `--cap-add=CAP_SYS_ADMIN` and `--uts=host` are combined, container
has privilege to change host OS and every other container hostname.
Such privilege should be fixed by limiting containers capabilities.
Signed-off-by: Stepan Blyschak <stepanb@mellanox.com>
* Changes in sonic-buildimage for the NAT feature
- Docker for NAT
- installing the required tools iptables and conntrack for nat
Signed-off-by: kiran.kella@broadcom.com
* Add redis-tools dependencies in the docker nat compilation
* Addressed review comments
* add natsyncd to warm-boot finalizer list
* addressed review comments
* using swsscommon.DBConnector instead of swsssdk.SonicV2Connector
* Enable NAT application in docker-sonic-vs
- move single instance services into their own folder
- generate Systemd templates for any multi-instance service files in slave.mk
- detect single or multi-instance platform in systemd-sonic-generator based on asic.conf platform specific file.
- update container hostname after creation instead of during creation (docker_image_ctl)
- run Docker containers in a network namespace if specified
- add a service to create a simulated multi-ASIC topology on the virtual switch platform
Signed-off-by: Lawrence Lee <t-lale@microsoft.com>
Signed-off-by: Suvarna Meenakshi <Suvarna.Meenaksh@microsoft.com>
* [MultiDB] (except ./src and ./dockers dirs): replace redis-cli with sonic-db-cli and use new DBConnector
* update comment for a potential bug
* update comment
* add TODO maker as review reqirement
* [Monit] Change the monitoring period of monit from 120 seconds to 60
seconds and also at the same time double the interval for existing sonic monit config file in
host.
Signed-off-by: Yong Zhao <yozhao@microsoft.com>
* Updates per review comments
1) core_uploader service waits for syslog.service
2) core_uploader service enabled for restart on failure
3) Use mtime instead of file size + ample time to be robust.
* Avoid reloading already uploaded file, by marking the names with a prefix.
* Updated failing path.
1) If rc file is missing or required data missing, it periodically logs error in forever loop.
2) If upload fails, retry every hour with a error log, forever.
* Fix few bugs
* The binary update_json.py will come from sonic-utilities.
* Added sonic-mgmt-framework as submodule / docker
* fix build issues
* update sonic-mgmt-framework submodule branch to master
* Merged changes 70007e6d2ba3a4c0b371cd693ccc63e0a8906e77..00d4fcfed6a759e40d7b92120ea0ee1f08300fc6
00d4fcfed6a759e40d7b92120ea0ee1f08300fc6 Modified environemnt variables
* Changes to build sonic-mgmt-framework docker
* bumped up sonic-mgmt-framework commit-id
* version bump for sonic-mgmt-framework commit-it
* bumped up sonic-mgmt-framework commit-id
* Add python packages to docker
* Build fix for docker with python packages
* added libyang as dependent package
* Allow building images on NFS-mounted clones
Prior to this change, `build_debian.sh` would generate a Debian
filesystem in `./fsroot`. This needs root permissions, and one of the
tests that is performed is whether the user can create a character
special file in the filesystem (using mknod).
On most NFS deployments, `root` is the least privileged user, and cannot
run mknod. Also, attempting to run commands like rm or mv as root would
fail due to permission errors, since the root user gets mapped to an
unprivileged user like `nobody`.
This commit changes the location of the Debian filesystem to `/fsroot`,
which is a tmpfs mount within the slave Docker. The default squashfs,
docker tarball and zip files are also created within /tmp, before being
copied back to /sonic as the regular user.
The side effect of this change is that the contents of `/fsroot` are no
longer available once the slave container exits, however they are
available within the squashfs image.
Signed-off-by: Nirenjan Krishnan <Nirenjan.Krishnan@dell.com>
* bumped up sonc-mgmt-framework commit to include PR #18
* REST Server startup script is enahnced to read the settings from
ConfigDB. Below table provides mapping of db field to command line
argument name.
============================================================
ConfigDB entry key Field name REST Server argument
============================================================
REST_SERVER|default port -port
REST_SERVER|default client_auth -client_auth
REST_SERVER|default log_level -v
DEVICE_METADATA|x509 server_crt -cert
DEVICE_METADATA|x509 server_key -key
DEVICE_METADATA|x509 ca_crt -cacert
============================================================
* Replace src/telemetry as submodule to sonic-telemetry
* Update telemetry commit HEAD
* Update sonic-telemetry commit HEAD
* libyang env path update
* Add libyang dependency to telemetry
* Add scripts to create JSON files for CLI backend
Scripts to create /var/platform/syseeprom and /var/platform/system, which are back-end
files for CLI, for system EEPROM and system information.
Signed-off-by: Howard Persh <Howard_Persh@dell.com>
* In startup script, create directory where CLI back-end files live
Signed-off-by: Howard Persh <Howard_Persh@dell.com>
* build dependency pkgs added to docker for build failure fix
* Changes to fix build issue for mgmt framework
* Fix exec path issue with telemetry
* s5232[device] PSU detecttion and default led state support
* Processing of first boot in rc.local should not have premature exit
Signed-off-by: Howard Persh <Howard_Persh@dell.com>
* docker mount options added for platform, system features
* bumped up sonic-mgmt-framework commit id to pick 23rd July 2019 changes
* Added mount options for telemetry docker to get access for system and platform info.
* Update commit for sonic-utilities
* [dell]: Corrected dport map and renamed config files for S5232F
* Fix telemetry submodule commit
* added support for sonic-cli console
* [Dell S5232F, Z9264F] Harden FPGA driver kernel module
For Dell S5232F and Z9264F platforms, be more strict when checking state
in ISR of FPGA driver, to harden against spurious interrupts.
Signed-off-by: Howard Persh <Howard_Persh@dell.com>
* update mgmt-framework submodule to 27th Aug commit.
* remove changes not related to mgmt-framework and sonic-telemetry
* Revert "Replace src/telemetry as submodule to sonic-telemetry"
This reverts commit 11c3192975.
* Revert "Replace src/telemetry as submodule to sonic-telemetry"
This reverts commit 11c3192975.
* make submodule changes and remove a change not related to PR
* more changes
* Update .gitmodules
* Update Dockerfile.j2
* Update .gitmodules
* Update .gitmodules
* Update .gitmodules
reverting experimental change
* Removed syspoll for release_1.0
Signed-off-by: Jeff Yin <29264773+jeff-yin@users.noreply.github.com>
* Update docker-sonic-mgmt-framework.mk
* Update sonic-mgmt-framework.mk
* Update sonic-mgmt-framework.mk
* Update docker-sonic-mgmt-framework.mk
* Update docker-sonic-mgmt-framework.mk
* Revert "Processing of first boot in rc.local should not have premature exit"
This reverts commit e99a91ffc2.
* Remove old telemetry directory
* Update docker-sonic-mgmt-framework.mk
* Resolving merge conflict with Azure
* Reverting the wrong merge
* Use CVL_SCHEMA_PATH instead of changing directory for telemetry startup
* Add missing export
* Add python mmh3 to slave dockerfile
* Remove sonic-mgmt-framework build dep for telemetry, fix dialout startup issues
* Provided flag to disable compiling mgmt-framework
* Update sonic-utilites point latest commit id
* Point sonic-utilities to Azure accepted SHA
* Updating mgmt framework to right sha
* Add sonic-telemetry submodule
* Update the mgmt-framework commit id
Co-authored-by: jghalam <joe.ghalam@gmail.com>
Co-authored-by: Partha Dutta <51353699+dutta-partha@users.noreply.github.com>
Co-authored-by: srideepDell <srideep_devireddy@dell.com>
Co-authored-by: nirenjan <nirenjan@users.noreply.github.com>
Co-authored-by: Sachin Holla <51310506+sachinholla@users.noreply.github.com>
Co-authored-by: Eric Seifert <seiferteric@gmail.com>
Co-authored-by: Howard Persh <hpersh@yahoo.com>
Co-authored-by: Jeff Yin <29264773+jeff-yin@users.noreply.github.com>
Co-authored-by: Arunsundar Kannan <31632515+arunsundark@users.noreply.github.com>
Co-authored-by: rvasanthm <51932293+rvasanthm@users.noreply.github.com>
Co-authored-by: Ashok Daparthi-Dell <Ashok_Daparthi@Dell.com>
Co-authored-by: anand-kumar-subramanian <51383315+anand-kumar-subramanian@users.noreply.github.com>
* Corefile uploader service
1) A service is added to watch /var/core and upload to Azure storage
2) The service is disabled on boot. One may enable explicitly.
3) The .rc file to be updated with acct credentials and http proxy to use.
4) If service is enabled with no credentials, it would sleep, with periodic log messages
5) For any update in .rc, the service has to be restarted to take effect.
* Remove rw permission for .rc file for group & others.
* Changes per review comments.
Re-ordered .rc file per JSON.dump order.
Added a script to enable partial update of .rc, which HWProxy would use to add acct key.
* Azure storage upload requires python module futures, hence added it to install list.
* Removed trailing spaces.
* A mistake in name corrected.
Copy the .rc updater script to /usr/bin.
* [process-reboot-cause]Address the issue: Incorrect reboot cause returned when warm reboot follows a hardware caused reboot
1. check whether /proc/cmdline indicates warm/fast reboot.
if yes the software reboot cause file will be treated as the reboot cause.
finish
2. check whether platform api returns a reboot cause.
if yes it is treated as the reboot cause.
finish.
3. check whether /hosts/reboot-cause contains a cause.
if yes it is treated as the cause otherwise return unknown.
* [process-reboot-cause]Fix review comments
* [process-reboot-cause]address comments
1. use "with" statement
2. update fast/warm reboot BOOT_ARG
* [process-reboot-cause]address comments
* refactor the code flow
* Remove escape
* Remove extra ':'
In place editing (sed -i) seems having some issues with filesystem
interaction. It could leave 0 size file or corrupted file behind.
It would be safer to sed the file contents into a new file and switch
new file with the old file.
Signed-off-by: Ying Xie <ying.xie@microsoft.com>
* ZTP infrastructure changes to support DHCP discovery provisioning data
- Dynamically generate DHCP client configuration based on current ZTP state
- Added support to request and process hostname when using DHCPv6
- Do not process graphservice url dhcp option if ZTP is enabled, ZTP service
will process it
- Generate /e/n/i file with all active interfaces seeking address assignment
via DHCP. Only interfaces that are created in Linux will be added to /e/n/i.
Also DHCP is started only on linked up in-band interfaces.
Signed-off-by: Rajendra Dendukuri <rajendra.dendukuri@broadcom.com>
* Create a SONiC configuration management service
* Perform config db migration after loading config_db.json to redis DB
* Migrate config-setup post migration hooks on image upgrade
config-setup post migration hooks help user to migrate configurations from
old image to new image. If the installed hooks are user defined they will not
be part of the newly installed image. So these hooks have to be migrated to
new image and only then they can be executing when the new image is booting.
The changes in this fix migrate config-setup post-migration hooks and ensure
that any hooks with the same filename in newly installed image are not
overwritten.
It is expected that users install new hooks as per their requirement and
not edit existing hooks. Any changes to existing hooks need to be done as
part of new image and not post bootup.
This PR is to handle the issue 3527.
When device boots up, NTP throws a traceback as explained in the issue 3527.
- Traceback will be seen when MGMT_VRF_CONFIG does not exist in the database. Traceback is coming from the script “/etc/init.d/ntp”.
- Traceback does not affect the NTP functionality with/without management VRF. When MGMT_VRF_CONFIG does not exist or when MGMT_VRF_CONFIG’s mgmtVrfEnabled is configured to “false”, “NTP” will be started in the “default VRF” context, which is working fine even with this traceback.
- This traceback error will be hidden by redirecting the error to /dev/null without affecting functionality.
Add the same mechanism I developed for the SwSS service in #2845 to the syncd service. However, in order to cause the SwSS service to also exit and restart in this situation, I developed a docker-wait-any program which the SwSS service uses to wait for either the swss or syncd containers to exit.
* Rename asn/deployment_id_asn_map.yaml to constants/constants.yaml
* Fix bgp templates
* Add community for loopback when bgpd is isolated
* Use correct community value
We noticed in tests/production that there is a low probability failure
where /etc/hosts could have some garbage characters before the entry for
local host name. The consequence is that all sudo command would be very
slow. In extreme cases it would prevent some services from starting
properly.
I suspect that the /etc/hosts file might be opened by some process causing
the issue. Editing contents with new file level and replace the whole file
should be safer.
Signed-off-by: Ying Xie <ying.xie@microsoft.com>
- after reloading minigraph, write latest version string in the DB.
- if old config_db.json file exists, use it and migrate to latest version.
- only reload minigraph when config_db.json doesn't exist and minigraph
exists.
Signed-off-by: Ying Xie <ying.xie@microsoft.com>
* [SNMP] management VRF SNMP support
This commit adds SNMP support for Management VRF using l3mdev.
The patch included provides VRF support, there is no single
"listendevice" configuration, rather multiple agentaddress
config options can each have their own "interface" to bind to
using "ip%interface". The snmpd.conf file is accordingly
generated using the snmp.yml file and redis database info.
Adding below the comments of SNMP patch 1376
--------------------------------------------
Since the Linux kernel added support for Virtual Routing
and Forwarding (VRF) in version 4.3
(Note: these won't compile on non-linux platforms)
https://www.kernel.org/doc/Documentation/networking/vrf.txt
Linux users could not use snmpd in its current form to
bind specific listening IP addresses to specific VRF
devices. A simplified description of a VRF inteface
is an interface that is a master (a container of sorts)
that collects a set of physicalinterfaces to form a
routing table.
This set of two patches (one for V5-7-patches and one
for V5-8-patches branches) is almost identical to patch
single "listendevice" configuration. Rather, multiple
agentAddress config options can each have their own
"interface" to bind to using the <ip>%<interface>
syntax.</interface></ip>
-------------------------------------------
Signed-off-by: Harish Venkatraman <harish_venkatraman@dell.com>
This commit adds NTP support for management VRF using L3mdev. Config vrf add
mgmt will enable management VRF, enslave the eth0 device to the master device
mgmt, stop ntp service in default, restart interfaces-configs and restart ntp
service in mgmt-vrf context. Requirement and design are covered in mgmt vrf
design document.
Signed-off-by: Harish Venkatraman <harish_venkatraman@dell.com>
* [cron.d] Create cron job to periodically clean-up core files
* Create script to scan /var/core and clean-up older core files
* Create cron job to run clean-up script
Signed-off-by: Danny Allen <daall@microsoft.com>
* Update interval for running cron job
* Respond to feedback
* Change syslog id
- monit config broke by one monit upgrade
- abandon sed approach since it is suspestible to monit config changes
- use unixsocket instead of httpd due to a bug in 5.20.0
Present: Servers are listed in the same order as in redis-db
Fix: Save the sort o/p, hence use sorted list to write into pam.d's conf.
As well convert priority to integer for use by sort.
ARM Architecture support in SONIC
make configure platform=[ASIC_VENDOR_ARCH] PLATFORM_ARCH=[ARM_ARCH]
SONIC_ARCH: default amd64
armhf - arm32bit
arm64 - arm64bit
Signed-off-by: Antony Rheneus <arheneus@marvell.com>
This commit adds support for New feature management VRF using L3mdev. Added
commands to enable/disable management VRF. Config vrf add mgmt will enable
management VRF, enslave the eth0 device to the master device mgmt and restart
interfaces-configs in mgmt-vrf context.
management interface (eth0) can be configured using config interface eth0 ip
add command and removed using config interface eth0 ip remove command.
Requirement and design are covered in mgmt vrf design document. Currently show
command displays linux command output; will update show command display in next
PR after concluding what would be the output for the show commands. Added
metric for default routes in dhcp and static, any changes for metric will be
addressed subsequently after discussing.
Signed-off-by: Harish Venkatraman <harish_venkatraman@dell.com>
* [warm reboot] save configuration after warm reboot
After warm reboot, save a copy of in memory database to config_db.json,
upgrade procedure might have removed config_db.json to force new image
to reload minigraph. However, reload minigraph is skipped during warm
reboot. Missing config_db.json would cause device to fault in next
non-upgrading cold/fast reboot.
Signed-off-by: Ying Xie <ying.xie@microsoft.com>
* Update finalize-warmboot.sh
In case of going from previous iteration of SONiC, and the last reboot
was hardware, REBOOT_CAUSE_FILE may not be present and the service may
throw an error.
* [logrotate] Decrease frequency to every 10 minutes; kill any lingering logrotate processes
* [logrotate] Delete all *.1.gz files as firstaction; Remove note about init-system-helpers < 1.47 workaround
However, continue to send SIGHUP directly to rsyslogd process
because 'service rsyslog rotate' still doesn't work properly with
init-system-helpers version 1.48
* Switch the nss look up order as "compat" followed by "tacplus".
This helps use the legacy passwd file for user info and go to tacacs only if not found.
This means, we never contact tacacs for local users like "admin".
This isolates local users from any issues with tacacs servers.
W/o this fix, the sudo commands by local users could take <count of servers> * <tacacs timeout> seconds, if the tacacs servers are unreachable.
* Skip tacacs server access for local non-tacacs users.
Revert the order of 'compat tacplus' to original 'tacplus compat' as tacplus
access is required for all tacacs users, who also get created locally.
- Add ebtables package, and install some filter rules:
1. ebtables -A FORWARD -d BGA -j DROP
2. ebtables -A FORWARD -p ARP -j DROP
Basically, we let the ARP packets in the VLAN being forwarded by the ASIC,
kernel gets a copy of these ARP packets and the forwarding from Kenerl gets
dropped. So there is always only one copy of ARP/response in the VLAN.
Signed-off-by: Ying Xie <ying.xie@microsoft.com>
* [service] Restart SwSS Docker container if orchagent exits unexpectedly
* Configure systemd to stop restarting swss if it attempts to restart more than 3 times in 20 minutes
* Move supervisor-proc-exit-listener script
* [docker-dhcp-relay] Enhance wait_for_intf.sh.j2 to utilize STATEDB
* Ensure dependent services stop/start/restart with SwSS
* Change 'StartLimitInterval' to 'StartLimitIntervalSec', as Stretch installs systemd 232 (>= v230)
* Also update journald.conf options
* Remove 'PartOf' option from unit files
* Add '$(SUPERVISOR_PROC_EXIT_LISTENER_SCRIPT)' to new shared docker-orchagent makefile
* Make supervisor-proc-exit-listener script read from 'critical_processes' file inside container
* Update critical_processes file for swss container
This service (weekly) will let SSD firmware to do the garbage collection
after file-system deleted files. It could avoid slowness or
even READ-ONLY error due to SSD not being able to free the pages
even though the file system thinks there was a lot of space left.
Signed-off-by: Zhenggen Xu <zxu@linkedin.com>
After warm reboot is done, we need to disable warm reboot flag and
tear down anything setup for warm reboot and persisted across.
Signed-off-by: Ying Xie <ying.xie@microsoft.com>
- Why it is required
since SONiC master switches ifupdown package to the new implementation (ifupdown2), it is required to change the configuration of a platform-specific interface for wedge100bf_32x and wedge100bf_65x platforms (bc of ifupdown2 doesn't support auto mode for inet6 protocol).
Also, need to make some refactoring and remove if platform == smth then.. from the system level scripts.
- What I did
removed customization of /usr/bin/interfaces-config.sh
explicitly created directory /etc/network/interfaces.d
added "source" to the /etc/network/interfaces generation template (to include platform-specific interfaces processing)
added platform-specific interfaces config itself (for wedge100bf_32x and wedge100bf_65x)
fixed testcase in sonic-config-engine
- How to verify it
build image for wedge100bf_32x
perform sudo config reload -y on new installation
check the correct configuration of usb0 interface
- Description for the changelog
Allow configuration of platform-specific interfaces
* Add a log message for each notification of add/del TACACS server.
Signed-off-by: Renuka Manavalan <remanava@microsoft.com>
* Moved another syslog message from DEBUG to INFO to be able to see those notifications.
All these changes are to help with a one-time-seen-bug, that hostcfgd did not act upon changes to redis for TACACS servers. We could not repro the bug.
Signed-off-by: Renuka Manavalan <remanava@microsoft.com>
* [updategraph] After system upgrade, restore files/directories with
original attributes etc.
Restore a few more files that was missed before.
Restore FRR configuration directory if exists on old system
Signed-off-by: Zhenggen Xu <zxu@linkedin.com>
* Removed deployment_id_asn_map.yml from copy list
Signed-off-by: Zhenggen Xu <zxu@linkedin.com>
* [reboot cause] Move reboot-cause files to /host directory so they persist across SONiC upgrades
* [sonic-utilities] Update submodule to include related changes
- What I did
This fix removes the possibility of 'localhost' entry getting removed from /etc/hosts file by hostname-config service.
Without this change, whenever we change the hostname from 'localhost' to any other name on the config_db.json and reload the config, /etc/hosts file will only have the new hostname on it. But there are multiple sonic utilities (eg: swssconfig) which relies on the hard coded 'localhost' name and they tend to stop working.
- How I did it
Added a new check on hostname-config.sh script to avid blindly deleting the line containing the old hostname from /etc/hosts file. Now it will delete the old hostname only if its not localhost or when the hostname is not changing.
- How to verify it
Bring up SONiC on a device with hostname as localhost
Edit /etc/sonic/config_db.json to update the 'hostname' filed under DEVICE_METADATA from "hostname" : "localhost" --> "hostname" : "sonic"
run config reload -y to reflect the hostname change done on config_db.json file.
cat /etc/hosts and check whether both 127.0.0.1 localhost and 127.0.0.1 sonic entry are present on the file.
ping localhost should work fine.
- Description for the changelog
Make hostname-config service more robust in handling SONiC hostname change from localhost to anything else.
* [update graph] adapt to warm reboot scenario
When migrating configuration, always copy config files from old_config
to /etc/sonic. But if warm reboot is detected, then skip configuration
operations.
Signed-off-by: Ying Xie <ying.xie@microsoft.com>
* log file copies and misses
This driver should be loaded by sonic service. If kernel tries to load
it, the driver would be loaded with default parameters, which is not
right for sonic.
Signed-off-by: Ying Xie <ying.xie@microsoft.com>
Auto negotiating console speed could cause sonic to lock on a wrong
speed under rare conditions. The only way to come out of the wrong
speed is to issue line break or restart console service with forced
speed, or reboot sonic.
Lock down the console speed to avoid these situations.
Signed-off-by: Ying Xie <ying.xie@microsoft.com>
Remove the teamd.j2 templates used for starting the teamd. Add
teammgrd instead to manage all port channel related configuration
changes. Remove front panel port related configurations in
interfaces.j2 templates as well.
Remove teamd.sh script and use teammgrd to start all the teamd
processes. Remove all the logics in the start.sh script as well.
Update the sonic-swss submodule.
Signed-off-by: Shu0T1an ChenG <shuche@microsoft.com>
* [updategraph] add support to use preset config instead of default minigraph
* Fix variable case
* Remove default minigraph case
* Remove default minigraphs and add default_sku files
- Move front panel ports and port channels MTU and IP configurations out of
the current /etc/network/interfaces file and store them in the configuration
database.
- The default MTU value for both front panel ports and the port channels is
9100. They are set via the minigraph or 9100 by default.
- Introduce portmgrd which will pick up the MTU configurations from the
configuration database.
- The updated intfmgrd will pick up IP address changes from the configuration
database.
- Update sonic-swss submodule
Signed-off-by: Shu0T1an ChenG <shuche@microsoft.com>
Previously use / to separate container name and program name.
However, in rsyslogd:
Precisely, the programname is terminated by either (whichever occurs first):
end of tag
nonprintable character
‘:’
‘[‘
‘/’
The above definition has been taken from the FreeBSD syslogd sources.
Signed-off-by: Guohan Lu <gulv@microsoft.com>
* Initial commit
* Add Ingrasys S9180-32X platform dirver.
Signed-off-by: Wade He <chihen.he@gmail.com>
* Add bfn.service for init barefoot.
Signed-off-by: Wade He <chihen.he@gmail.com>
* [Barefoot Beta] Add some functions and fixed some bugs.
1. Update sensors.conf.
2. Fixed IO expander init.
3. Fixed PSU EEPROM.
4. Fixed MB EEPROM.
5. Add fancontrol and fan init.
6. Add SYS LED control (sys, fan, fan tray).
7. 2.5V compute and setup max and min.
8. Fixed typo MB eeprom delete address.
9. Remove coretemp to BMC.
10. Add active CPLD.
11. Modify SFP+ GPIO slave address.
12. Modify tmp75 Near Port 32 slave address.
Signed-off-by: Wade He <chihen.he@gmail.com>
* Add bfn script in /etc/init.d/
Signed-off-by: Wade He <chihen.he@gmail.com>
* Add bfn service in debian
Signed-off-by: Wade He <chihen.he@gmail.com>
* Fixed CPLD switch LED behavior.
Signed-off-by: Wade He <chihen.he@gmail.com>
* [Barefoot Beta] Fixed sensors and hwmon order.
1. Fixed ignore sensors Vbat.
2. Reorg hwmon order.
Signed-off-by: Wade He <chihen.he@gmail.com>
* Fixed PSU1 and PSU2 EEPROM order.
Signed-off-by: Wade He <chihen.he@gmail.com>
* initial barefoot checkin october 2017
* update refpoint
* update refpoints
* update refpoints to bf-master
* update refpoint
* update refpoint to tested version
* change to platform from asic
* update refpoint for swss
* revert core creation setting
* update refpoints
* add telnet for debug shell
* update refpoints 11/17/17
* missed change in file on previous merge
* [CPLD] Fixed blink LED issue.
* Fixed blink LED mask set error.
Signed-off-by: Wade He <chihen.he@gmail.com>
* Update bf_kdrv.c for 6.0.2.39
* Update bf kernel driver
* Add bf_fun kernel module.
* Update bf_tun for fixed build error
* merge with Azure master (12/12/17)
* update swss refpoint
* update refpoint of swss
* library dependency for stack unroll
* update refpoint to bf-master
* [DHCP relay]: Fix circuit ID and remote ID bugs (#1248)
* [DHCP relay]: Fix circuit ID and remote ID bugs
* Set circuit_id_len after setting circuit_id_len to ip->name
* [Platform] Add Psuutil and update sensors.conf for S9100-32X, S8810-32Q and S9200-64X (#1272)
* Add I2C CPLD kernel module for psuutil.
* Support psuutil script.
* Add voltage min and max threshold.
* Update sensors.conf for tmp75.
Signed-off-by: Wade He <chihen.he@gmail.com>
* Allow multi platform support - infra (more changes to follow)
* update relative path to include platform for clarity
* [Platform] Add Ingrasys S9130-32X and S9230-64X with Nephos Switch ASIC for "branch 201712" (#1274)
- What I did
Add switch ASIC vendor: Nephos
Add Nephos platforms: Ingrasys S9130-32X, Ingrasys S9230-64X
- How I did it
Add platform/nephos files
Add platform/nephos/sonic-platform-modules-ingrasys submodule
Add device/ingrasys/x86_64-ingrasys_s9130_32x-r0 files
Add device/ingrasys/x86_64-ingrasys_s9230_64x-r0 files
Add SONiC to support Nephos platform
Update Head of submodule src/sonic-sairedis to "3b817bb"
- How to verify it
To build SONiC installer image and docker images, run the following commands:
make configure PLATFORM=nephos
make target/sonic-nephos.bin
Check system and network feature is worked as well
- Description for the changelog
Add switch ASIC vendor and platforms for Nephos
- A picture of a cute animal (not mandatory but encouraged)
Signed-off-by: Sam Yang <yang.kaiyu@gmail.com>
* change source of files to github (from dropbox), update sairedis refpoint
* update refpoint of sairedis
* [centec] support CENTEC SAI 1.0 on 201712 branch and update e582-48x6q board (#1269)
* [marvel]: Marvell's updates for SONiC.201712 & SAI v1.0 (#1287)
* update sairedis (fast-boot refpoint)
* fix syncd rpc make files
* update refpoint to handle Makefile change (no functional change)
* [Marvell]: Add support for SLM5401-54x device (#1307)
* Marvell's updates for SONiC.201712 & SAI v1.0
* [Platform] Add Marvell's SLM5401-54x for branch 201712
* [Broadcom]: Update Boradcom SAI package to 3.0.3.3-3 (#1312) (#1321)
- update Arista 7050-QX32S config.bcm file
- update Accton th-as771*-32x100G.config.bcm files
* update refpoint for Makefile chnage in sairedis
* update refpoint - sairedis
* update sairedis to older refpoint till we debug clean build
* export asic platform for build
* update refpoint for makefiles
* [PLATFORM] Centec update E582 driver fan/epprom/sensor (#1332)
* Upload wnc-osw1800
* Modify for Barefoot suggest
* Revert bfn-platform.mk
* Update bfn-platform-wnc.mk
Update parameter name
* Update parameter name
* initial support for WNC platform
* change switch name to "switch"
* Delete bf modules for rel_7_0
* Add Ingrasys S9180 platform
Signed-off-by: Wade He <chihen.he@gmail.com>
* Modify bfnsdk for Ingrasys S9180 platform
Signed-off-by: Wade He <chihen.he@gmail.com>
* Resolved the conflict.
* Resolved the conflict.
* Update submodule path and url.
* Delete unused file.
* Update PSU GPIO and EEPROM for psuutil.
* Add psuutil in S9180-32X
Signed-off-by: Wade He <chihen.he@gmail.com>
* update refpoint
* update refpoint
* change contact email, update refpoint
* cleanup and update kernel modules
* updates based on review
* update refpoint
* update refpoint
* fix typo in config script to check for platforms
* remove stale file
* resolve conflicts
* cleanup diffs with Azure repo and update SDK debs
* update refpoints to Azure
* address review comments
* revert refpoint of swss-common
* porting the build fix from master
* porting build fix from master
* Minor Fix
* Minor fix
* Temp to sde deb packages url
* Update sonic - sairedis,swss & swss-common refpoints
* Update git modules url path to bfn repo
* updated paths for swss, swss-common & sairedis
* Update refpoint for sonic-swss to local bfn repo
* Update URL for downloading sde debian packages
* porting fix links of debian git server from master
* porting fix links of debian git server from master
* [Ingrasys] Add platform support for S9280-64X with Barefoot ASIC
* Update ref points for swss, swss-common and sairedis repos
* Add sonic platform scripts for bfn montara/maverick
* Call sh scripts instead of calling py scripts
* Address upstream PR Comments (#10)
* Update bf-master with azure/master
* Undo changes to some files
* Revert "Address upstream PR Comments (#10)"
This reverts commit a7fddb83ca.
* Address upstream comments (#11)
* Remove all non bfn specific changes from upstream PR
* Revert "Address upstream comments (#11)"
This reverts commit 559132103e.
* Undo non bfn changes
* Little more cleanup
* Add back code removed in merge
* export CONFIGURED_PLATFORM
* Update sairedis and swss refpoints
* Address Upstream PR comment
* change deb pkg dependency from 3.16.0-4-amd64 to 3.16.0-5-amd64
* Set default tx queue len for usb0 interface to 64
* Update sairedis refpoint
* Update swss ref point
* Add bfn buffer cfg files for montara/maverick as per new design
* Update buffer cfg templates for bfn montara
* add non zero size to buffer profile
* add macro to generate port lists
* Update buffer cfg templates for bfn mavericks
* add non zero size for buffer profiles
* add port generation macro
* Add missing psmisc package
* BGP docker seems to be missing killall utility being used by fast-reboot script. This is causing non graceful termination of BGP sessions.
Adding psmisc to resolve this issue.
* Update swss ref point
* Update swss ref point
* Update sairedis refpoint
* Update sairedis refpoint
* Update sairedis refpoint
* Update sairedis refpoint
* Update refpoint for sairedis and swss
* sairedis to azure master
* swss to latest bfn bf-master
* Update gitmodules
Update url for sairedis to azure master
* Correct typo in bfn platform script
* Update swss and sairedis ref points
* Update swss ref point
* Address Review comments
* Update swws path in gitmodules to azure master
* update swss refpoint
* update base docker j2 file -remove psmisc package (could be a concern, would cause fast reboot to not work correctly will fix in another PR)
* Fix sairedis refpoint broken in by previous merge
* Remove psmisc from docker base image
* This will break fast reboot as killall is required for killing bgp process and initiating graceful termination of BGP session.
Will fix this in a seperate PR. Need this for SONIC upstreaming
* Address upstream comments
* Remove bmc interface from interface jinja template and sample output interfaces file
* Add bmc interface at boot time to network interfaces for bfn bmc based platforms
* Remove autogen ingrasys debian files
* Revert "Remove autogen ingrasys debian files"
* Buffer and qos config template fix for bfn platforms (#21)
SWI-1509 Buffer and qos config template fix for bfn platforms
* Fix qos config files for montara & mavericks (#22)
* Reference only ppg 3,4 in qos files as no profiles are attached to 0,1 in buffer configs
* Fix vs test (#23)
* Use MAC from EEPROM for PortChannels
Signed-off-by: Andriy Moroz <c_andriym@mellanox.com>
* Use MAC from EEPROM in DEVICE_METADATA
Will affect MAC for VLAN interfaces
Signed-off-by: Andriy Moroz <c_andriym@mellanox.com>
* Get MAC via decode-syseeprom
Signed-off-by: Andriy Moroz <c_andriym@mellanox.com>
* hw-management is now a service
Signed-off-by: Andriy Moroz <c_andriym@mellanox.com>
* Add error handling for MAC fetch process
Signed-off-by: Andriy Moroz <c_andriym@mellanox.com>
* [rc.local] Move all constants and functions to top of file; Unify style; Reword messages
* Add function to process reboot cause upon boot
* Simplify retrieval of SONIC_VERSION per comments
* Change wording
* [caclmgrd] Heuristically determine whether ACL is IPv4 or IPv6, use iptables/ip6tables accordingly
* Check all rules in table until we find one with a SRC_IP
* Revert "[serial watchdog] remove serial watchdog service dependency to rc.local (#1752)"
* Revert "[service] introducing serial port watchdog service (#1743)"
* [serial watchdog] remove serial watchdog service dependency to rc.local
When restarting this service in rc.local, the dependency causes an error
in syslog. Removing the dependency to mute the error log entry.
* remove lines with empty inputs
* [rc.local] refactor platform identification code to separate function
Signed-off-by: Ying Xie <ying.xie@microsoft.com>
* [rc.local] infrastructure to take action according to installer.conf
* [serial port watchdog] add service to watch serial port processes
Monitor serial port processes. Kill ones stuck for too long.
Signed-off-by: Ying Xie <ying.xie@microsoft.com>
* [rc.local] start watchdog on serial port specified by installer.conf
Signed-off-by: Ying Xie <ying.xie@microsoft.com>
* [fast-reboot]: support encoded & gzipped minigraph in fast reboot
Signed-off-by: Guohan Lu <gulv@microsoft.com>
* add acl.json and snmp.yml into fast-reboot
Signed-off-by: Guohan Lu <gulv@microsoft.com>
cfggen generates new eth0 configuration. Need to first
clean existing configuration on eth0 before bring up
new configuration on eth0. Thus, we need to first bring
down eth0 before putting new configuration into /etc/network/
interfaces
Signed-off-by: Guohan Lu <gulv@microsoft.com>
* SONiC system telemetry Support
Signed-off-by: Jipan Yang <jipan.yang@alibaba-inc.com>
* Update package name from telemetry to sonic-telemetry
Signed-off-by: Jipan Yang <jipan.yang@alibaba-inc.com>
commit 0965b33 added quote to build_version in /etc/sonic/sonic_version.yml,
e.g., sonic_version : '20170104.10'. scripts to use the $sonic_version need
to remove the quote.
Signed-off-by: Guohan Lu <gulv@microsoft.com>
- Move all minigraph-related action from rc.local to updategraph
- updategraph service is now after database. All feature services are now after and depending on updategraph
* Support OS9 -> SONiC fast-reboot migration
* Address review comments. Update NOS mac in EEPROM and net.rules for eth0
* Address review comments. Update sonic-platform-modules-dell to fac81d...
* Fix script for POSIX compliance
* [TACACS+]: Add configDB enforcer for TACACS+
* hostcfgd - configDB enforcer for TACACS+, listen configDB to
modify the pam configuration for Authentication in host
* Add a service script for hostcfgd
Signed-off-by: Chenchen Qi <chenchen.qcc@alibaba-inc.com>
* [TACACS+]: Generate conf file by template file
* Generate common-auth-sonic and tacplus_nss.conf by jinja2 template
Signed-off-by: Chenchen Qi <chenchen.qcc@alibaba-inc.com>
If device MAC is added to init_cfg.json, it has to be done using
intermediate file. We cannot redirect to same file while trying to read
from it because it will be truncated first.
Signed-off-by: marian-pritsak <marianp@mellanox.com>
* [init]: save the initial switch mac to config db
Save the initial switch mac to config db DEVICE_METADATA|localhost entry.
* update sonic-swss submodule
* Add support for vlanconfd and intfconfd
Signed-off-by: Jipan Yang <jipan.yang@alibaba-inc.com>
* Change name to vlanmgrd and intfmgrd
Signed-off-by: Jipan Yang <jipan.yang@alibaba-inc.com>
* Add missing vlan_members for parse_dpg result
Signed-off-by: Jipan Yang <jipan.yang@alibaba-inc.com>
* Remove cfgmgr debug CLI from image
Signed-off-by: Jipan Yang <jipan.yang@alibaba-inc.com>
* Update swss and swss-common submodules for VLAN trunk support
Signed-off-by: Jipan Yang <jipan.yang@alibaba-inc.com>
When updategraph service is enabled, a special value 'default'
from DHCP response will now initialize the system with an empty
configuration instead of existing minigraph.
A DHCP response without option 224 will remain the current behavior
of skipping graph update and use existing default minigraph.
Modify minigraph parser output format so it fit DB schema
Modify configuration templates to fit new schema
Systemd services dependencies are modified so database starts before any configuration consumer
* [rsyslog]: Use timegenerated instead of timestamp
This is useful when rsyslog is used to put markers generated on other machines.
This way all messages will have a timestamp from a single system.
* [rsyslog] Use subseconds from local machine
1. "make target/sonic-broadcom.raw" will create the compressed dd'able image.
2. This will also update the grub config files (device/dell/*/nos_to_sonic_grub.cfg) with the image versions.
- Force log rotation at size thresholds only (no longer also rotating logs daily), allowing for more consistent archived log size
- Eliminate remaining duplicate log messages
- Cron facility now only logs to cron.log (was also logging to syslog)
- Debug, mail, news and user log facilities only log to syslog; no longer creating separate log files for these facilities
- Cron job that calls logrotate every minute now uses the main /etc/logrotate.conf file so as to check/rotate all logs every minute, not just the logs specified in the rsyslog file. Also redirecting output of this command to /dev/null to prevent "(CRON) info (No MTA installed, discarding output)" messages in cron.log due to lack of a mail service
- Delete archive files based on remaining /var/log partition space. Note that this solution currently requires a minimum /var/log partition size of 32MB to function correctly
- Update sonic-sairedis and sonic-swss submodules to incorporate recording file name changes
- Add .screen file to .gitignore (unrelated)