Commit Graph

254 Commits

Author SHA1 Message Date
rkdevi27
80dc2b71d1 [baseimage]: /host unmount timeout issue during reboot. (#5032)
Fix for the host unmount issue through PR https://github.com/Azure/sonic-buildimage/pull/4558 and https://github.com/Azure/sonic-buildimage/pull/4865 creates the timeout of syslog.socket closure during reboot since the journald socket closure has been included in syslog.socket

Removed the journal socket closure. The host unmount is fixed with just stopping the services which gets restarted only after /var/log unmount and not causing the unmount issues.
2020-07-25 08:31:05 +00:00
Joe LeVeque
210dc90d0d [caclmgrd] Filter DHCP packets based on dest port only (#4995) 2020-07-21 10:15:08 +00:00
arlakshm
40e37f385e syslog changes Multi ASIC platforms (#4738)
Add changes for syslog support for containers running in namespaces on multi ASIC platforms.
On Multi ASIC platforms

Rsyslog service is only running on the host. There is no rsyslog service running in each namespace.
On multi ASIC platforms the rsyslog service on the host will be listening on the docker0 ip address instead of loopback address.
The rsyslog.conf on the containers is modified to have omfwd target ip to be docker0 ipaddress instead of loopback ip

Signed-off-by: Arvindsrinivasan Lakshmi Narasimhan <arlakshm@microsoft.com>
2020-07-12 18:16:44 +00:00
Joe LeVeque
4d2d95e8e6
[hostcfgd] Synchronize all feature statuses once upon start (#4714)
- Ensure all features (services) are in the configured state when hostcfgd starts
- Better functionalization of code
- Also replace calls to deprecated `has_key()` method in `tacacs_server_handler()` and `tacacs_global_handler()` with `in` keyword.

This PR depends on https://github.com/Azure/sonic-utilities/pull/944, otherwise `config load_minigraph` will fail when trying to restart disabled services.
2020-06-20 12:09:29 -07:00
padmanarayana
95e3cda5da
[DELL]: FTOS to SONiC fast conversion fixes (#4807)
While migrating to SONiC 20181130, identified a couple of issues:
1. union-mount needs /host/machine.conf parameters for vendor specific checks : however, in case of migration, the /host/machine.conf is extracted from ONIE only in https://github.com/Azure/sonic-buildimage/blob/master/files/image_config/platform/rc.local#L127. 
2. Since grub.cfg is updated to have net.ifnames=0 biosdevname=0, 70-persistent-net.rules changes are no longer required.
2020-06-19 11:02:08 -07:00
Joe LeVeque
6960477cc2
[caclmgrd] Don't limit connection tracking to TCP (#4796)
Don't limit iptables connection tracking to TCP protocol; allow connection tracking for all protocols. This allows services like NTP, which is UDP-based, to receive replies from an NTP server even if the port is blocked, as long as it is in reply to a request sent from the device itself.
2020-06-18 00:18:20 -07:00
xumia
76a395cdbf
[secure boot] Support rw files allowlist (#4585)
* Support rw files allowlist for Sonic Secure Boot
* Improve the performance
* fix bug
* Move the config description into a md file
* Change to use a simple way to remove the blank line
* Support chmod a-x in rw folder
* Change function name
* Change some unnecessary words
2020-06-13 00:10:13 -07:00
Ying Xie
ae7bf3db52
[ntp] disable ntp long jump (#4748)
Found another syncd timing issue related to clock going backwards.
To be safe disable the ntp long jump.

Signed-off-by: Ying Xie <ying.xie@microsoft.com>
2020-06-11 13:01:21 -07:00
Joe LeVeque
7b8037770d
[caclmgrd] Get first VLAN host IP address via next() (#4685)
I found that with IPv4Network types, calling list(ip_ntwrk.hosts()) is reliable. However, when doing the same with an IPv6Network, I found that the conversion to a list can hang indefinitely. This appears to me to be a bug in the ipaddress.IPv6Network implementation. However, I could not find any other reports on the web.

This patch changes the behavior to call next() on the ip_ntwrk.hosts() generator instead, which returns the IP address of the first host.
2020-06-02 02:11:21 -07:00
Joe LeVeque
eff8a89523
[hostcfgd] Get service enable/disable feature working (#4676)
Fix hostcfgd so that changes to the "FEATURE" table in ConfigDB are properly handled. Three changes here:

1. Fix indenting such that the handling of each key actually occurs in the for key in status_data.keys(): loop
2. Add calls to sudo systemctl mask and sudo systemctl unmask as appropriate to ensure changes persist across reboots
3. Substitute returns with continues so that even if one service fails, we still try to handle the others

Note that the masking is persistent, even if the configuration is not saved. We may want to consider only calling systemctl enable/disable in hostcfgd when the DB table changes, and only call systemctl mask/unmask upon calling config save.
2020-06-02 02:07:22 -07:00
taocy
ea2dd9541d change image apt source list from stretch to buster for arm 2020-05-25 13:15:19 +00:00
Joe LeVeque
bce42a7595
[caclmgrd] Allow more ICMP types (#4625) 2020-05-20 17:45:07 -07:00
abdosi
a44fc07e78
Changes to support config-setup service for multi-npu (#4609)
* Changes to support config-setup service for multi-npu
platforms. For Multi-npu we are not supporting as of
now config initializtion and ZTP. It will support creating
config db from minigraph or using  config db from previous
file system

Signed-off-by: Abhishek Dosi <abdosi@microsoft.com>

* Address Review Comments.

* Address Review comments

* Address Review Comments of using pyhton based config load_minigraph/
config save/config reload from shell scripts so that we don't duplicate
code. Also while running from shell we will skip stop/start services
done by those commands.

* Updated to use python command so no code duplication.
2020-05-20 16:32:33 -07:00
rkdevi27
32f58b5864
Fix "/host unmount failure" during reboot (#4558) 2020-05-20 11:18:11 -07:00
Ying Xie
cdfb1ced44
[ntp] enable/disable NTP long jump according to reboot type (#4577)
* [ntp] enable/disable NTP long jump according to reboot type

- Enable NTP long jump after cold reboot.
- Disable NTP long jump after warrm/fast reboot.

Signed-off-by: Ying Xie <ying.xie@microsoft.com>

* fix typo

* further refactoring

* use sonic-db-cli instead
2020-05-20 10:57:21 -07:00
Joe LeVeque
5150e7b655
[caclmgrd] Ignore keys in interface-related tables if no IP prefix is present (#4581)
Since the introduction of VRF, interface-related tables in ConfigDB will have multiple entries, one of which only contains the interface name and no IP prefix. Thus, when iterating over the keys in the tables, we need to ignore the entries which do not contain IP prefixes.
2020-05-12 18:16:55 -07:00
abdosi
5fe2216ea3
Fix for issue where image is compile with flag ENABLE_DHCP_GRAPH_SERVICE (#4573)
and then we load image and reboot even if there was existing
config_db.json we will look for DHCP Service. we should disbale
update_graph in such cases. This behaviour is silimar to what we have in
201811 image.
2020-05-12 14:49:56 -07:00
Joe LeVeque
5e8e0d76fc
[caclmgrd] Add some default ACCEPT rules and lastly drop all incoming packets (#4412)
Modified caclmgrd behavior to enhance control plane security as follows:

Upon starting or receiving notification of ACL table/rule changes in Config DB:
1. Add iptables/ip6tables commands to allow all incoming packets from established TCP sessions or new TCP sessions which are related to established TCP sessions
2. Add iptables/ip6tables commands to allow bidirectional ICMPv4 ping and traceroute
3. Add iptables/ip6tables commands to allow bidirectional ICMPv6 ping and traceroute
4. Add iptables/ip6tables commands to allow all incoming Neighbor Discovery Protocol (NDP) NS/NA/RS/RA messages
5. Add iptables/ip6tables commands to allow all incoming IPv4 DHCP packets
6. Add iptables/ip6tables commands to allow all incoming IPv6 DHCP packets
7. Add iptables/ip6tables commands to allow all incoming BGP traffic
8. Add iptables/ip6tables commands for all ACL rules for recognized services (currently SSH, SNMP, NTP)
9. For all services which we did not find configured ACL rules, add iptables/ip6tables commands to allow all incoming packets for those services (allows the device to accept SSH connections before the device is configured)
10. Add iptables rules to drop all packets destined for loopback interface IP addresses
11. Add iptables rules to drop all packets destined for management interface IP addresses
12. Add iptables rules to drop all packets destined for point-to-point interface IP addresses
13. Add iptables rules to drop all packets destined for our VLAN interface gateway IP addresses
14. Add iptables/ip6tables commands to allow all incoming packets with TTL of 0 or 1 (This allows the device to respond to tools like tcptraceroute)
15. If we found control plane ACLs in the configuration and applied them, we lastly add iptables/ip6tables commands to drop all other incoming packets
2020-05-11 12:36:47 -07:00
Joe LeVeque
dfdd94d8ad
[process-reboot-cause] If software reboot cause is unknown add note if first boot into new image (#4538) 2020-05-06 22:48:33 -07:00
wangshengjun
bed4a799df
[ebtables]add the filter rule for ARP packets with vlan tag: (#3945)
1. ebtables -t filter -A FORWARD -p 802_1Q --vlan-encap 0806 -j DROP
The ARP packet with vlan tag can't match the default rule.

Signed-off-by: wangshengjun <wangshengjun@asterfusion.com>
2020-05-06 20:03:09 -07:00
Dong Zhang
340cf826a6
[MultiDB] use sonic-db-cli PING and fix wrong multiDB API in NAT (#4541) 2020-05-06 15:41:28 -07:00
rkdevi27
4511216789
Ssd mitigation changes (#4214)
* ssd_mitigation_changes

* ssd_mitigation_changes

* ssd_mitigation_changes

* ssd_mitigation_changes
2020-04-30 22:58:09 -07:00
Olivier Singla
799f22d4c7
[baseimage]: Run fsck filesystem check support prior mounting filesystem (#4431)
* Run fsck filesystem check support prior mounting filesystem

If the filesystem become non clean ("dirty"), SONiC does not run fsck to
repair and mark it as clean again.

This patch adds the functionality to run fsck on each boot, prior to the
filesystem being mounted. This allows the filesystem to be repaired if
needed.

Note that if the filesystem is maked as clean, fsck does nothing and simply
return so this is perfectly fine to call fsck every time prior to mount the
filesystem.

How to verify this patch (using bash):

Using an image without this patch:

Make the filesystem "dirty" (not clean)
[we are making the assumption that filesystem is stored in /dev/sda3 - Please adjust depending of the platform]
[do this only on a test platform!]

dd if=/dev/sda3 of=superblock bs=1 count=2048
printf "$(printf '\\x%02X' 2)" | dd of="superblock" bs=1 seek=1082 count=1 conv=notrunc &> /dev/null
dd of=/dev/sda3 if=superblock bs=1 count=2048

Verify that filesystem is not clean
tune2fs -l /dev/sda3 | grep "Filesystem state:"

reboot and verify that the filesystem is still not clean
Redo the same test with an image with this patch, and verify that at next reboot the filesystem is repaired and becomes clean.

fsck log is stored on syslog, using the string FSCK as markup.
2020-04-30 00:33:20 -07:00
pavel-shirshov
057ced0391
[bgpcfgd]: Split one bgp mega-template to chunks. (#4143)
The one big bgp configuration template was splitted into chunks.

Currently we have three types of bgp neighbor peers:

general bgp peers. They are represented by CONFIG_DB::BGP_NEIGHBOR table entries
dynamic bgp peers. They are represented by CONFIG_DB::BGP_PEER_RANGE table entries
monitors bgp peers. They are represented by CONFIG_DB::BGP_MONITORS table entries
This PR introduces three templates for each peer type:

bgp policies: represent policieas that will be applied to the bgp peer-group (ip prefix-lists, route-maps, etc)
bgp peer-group: represent bgp peer group which has common configuration for the bgp peer type and uses bgp routing policy from the previous item
bgp peer-group instance: represent bgp configuration, which will be used to instatiate a bgp peer-group for the bgp peer-type. Usually this one is simple, consist of the referral to the bgp peer-group, bgp peer description and bgp peer ip address.
This PR redefined constant.yml file. Now this file has a setting for to use or don't use bgp_neighbor metadata. This file has more parameters for now, which are not used. They will be used in the next iteration of bgpcfgd.

Currently all tests have been disabled. I'm going to create next PR with the tests right after this PR is merged.

I'm going to introduce better bgpcfgd in a short time. It will include support of dynamic changes for the templates.

FIX:: #4231
2020-04-23 09:42:22 -07:00
bsun-sudo
012c832ce5 [ntp] add ntp support in buster with mgmt vrf (#55)
- create a file in files/image_config/ntp/ntp-systemd-wrapper to add mgmt vrf related start cmd for ntp service. So that the default /usr/lib/ntp/ntp-systemd-wrapper can be overriden during build time.

- modify build_debian.sh to cp files/image_config/ntp/ntp-systemd-wrapper to /usr/lib/ntp/ntp-systemd-wrapper during build time.

Co-authored-by: Bing Sun <Bing_Sun@dell.com>
2020-04-17 04:51:51 +00:00
bsun-sudo
2a237c57e6 [mgmt-vrf]: mgmt vrf related change for Buster (#53)
Co-authored-by: Bing Sun <Bing_Sun@dell.com>
2020-04-17 04:51:51 +00:00
Stephen Sun
aec51c8aaf [docker-wait-any] Use APIClient instead of Client according to API update
due to the upgrade from docker-py (1.6.0) to docker (4.1.0)
2020-04-17 04:51:51 +00:00
Guohan Lu
e479a56db3 [baseimage]: setup ebtables.service in buster image
ebtables is not enabled by default in buster

Signed-off-by: Guohan Lu <lguohan@gmail.com>
2020-04-17 04:51:51 +00:00
Guohan Lu
fb12b0a621 [baseimage]: various fixes due to buster changes
do mount/umount in the chroot environment
install cron explicitly
install rasdaemon as a replacement for mcelog
switch python package docker-py to docker
2020-04-17 04:51:51 +00:00
Renuka Manavalan
f128153706
[baseimage]: Install Kubernetes packages if enabled in image (#4374)
* Install kubernetes worker node packages, if enabled.

* Minor updates

* Added some comments

* Updates per review comments.
Built a private image to test to work fine.

* Remove the removed file.

* Update per comments
Make a fix, as kubeadm no demands a higher version of kubelet & kubectl.
As kubeadm auto install kubectl & kubelet, removing explicit install is an easier/robust fix.

* Changes per review comments.

* Updates per comments.
1) Dropped helper & pod scripts
2) Made install verbose

* Drop creation of pods subdir, as this PR does not use them.

* From comments to 'n' per review comments.

* 1) kubeadm.conf is created as part of kubeadm package install. Hence dropped explicit copy.
2020-04-13 08:41:18 -07:00
rajendra-dendukuri
de377ebccd
Fix typo in config-setup service (#4388) 2020-04-07 23:44:50 -07:00
SuvarnaMeenakshi
4b8067e913
Multi-ASIC implementation (#3888)
Changes made to support multi-asic platform. Added multi-instance support for swss, syncd, database, bgp, teamd and lldp.
2020-03-31 10:06:19 -07:00
Garrick He
d095d0bdbc
[procdockerstatsd] Fix CMD field in dB (#4335)
* Fix the CMD for the PROCESSSTATS entries so that
  there is a space between the command name and the
  arguments.

Signed-off-by: Garrick He <garrick_he@dell.com>
2020-03-28 11:43:48 -07:00
lguohan
3c6f23e7b7
[tacacs]: fix /etc/nsswitch.conf for buster image (#4303)
in buster image, default /etc/nsswitch.conf becomes

```
passwd:         files
```

when tacacs is enable, this files changes to

```
passwd:         tacplus files
```
2020-03-22 09:44:48 -07:00
SuvarnaMeenakshi
cfe754f665
[ntp]: Add "tinker panic 0" in ntp.conf to avoid ntpd from panic (#4263)
- What I did
Add configuration to avoid ntpd from panic and exit if the drift between new time and current system time is large.

- How I did it
Added "tinker panic 0" in ntp.conf file.

- How to verify it
[this assumes that there is a valid NTP server IP in config_db/ntp.conf]

Change the current system time to a bad time with a large drift from time in ntp server; drift should be greater than 1000s.
Reboot the device.
Before the fix:
3. upon reboot, ntp-config service comes up fine, ntp service goes to active(exited) state without any error message. This is because the offset between new time (from ntp server) and the current system time is very large, ntpd goes to panic mode and exits. The system continues to show the bad time.

After the fix:
3. Upon reboot, ntp-config comes up fine, ntp services comes up from and stays in active (running) state. The system clock gets synced with the ntp server time.
2020-03-21 18:50:12 -07:00
arheneus@marvell.com
94162679bb
[sonic-cfggen] MGMT Interface configuration (#4280)
update network and broadcast address in /etc/network/interfaces

Before:
root@sonic:/home/admin# ifconfig eth0
eth0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet 10.28.32.73  netmask 255.255.254.0  broadcast 0.0.0.0 <<<<<

After:
root@sonic:~# ifconfig eth0
eth0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet 10.28.32.73  netmask 255.255.254.0  broadcast 10.28.33.255 <<<<<

Signed-off-by: Antony Rheneus <arheneus@marvell.com>
2020-03-21 14:25:19 -07:00
yozhao101
560fd50262
[Monit] Delay start of monitoring for 5 minutes (#4281) 2020-03-19 14:14:47 -07:00
Stepan Blyshchak
1ef740361c
[docker_image_ctl.j2] Share UTS namespace with host OS (#4169)
Instead of updating hostname manualy on Config DB hostname change,
simply share containers UTS namespace with host OS.
Ideally, instead of setting `--uts=host` for every container in SONiC,
this setting can be set per container if feature requires.
One behaviour change is introduced in this commit, when `--privileged`
or `--cap-add=CAP_SYS_ADMIN` and `--uts=host` are combined, container
has privilege to change host OS and every other container hostname.
Such privilege should be fixed by limiting containers capabilities.

Signed-off-by: Stepan Blyschak <stepanb@mellanox.com>
2020-02-26 10:56:54 +02:00
Santhosh Kumar T
2626565afb
[DellEMC] S6100 Last Reboot Reason Thermal Support (#3767) 2020-02-18 00:02:33 -08:00
Joe LeVeque
8126916b46
[interfaces-config.sh] Do not bring 'lo' interface down and up (#4150) 2020-02-14 14:55:03 -08:00
Stephen Sun
af44856d5c
[process-reboot-cause]Clean up the process-reboot-cause as reqired in issue 3927 (#4128) 2020-02-11 09:54:12 -08:00
pra-moh
ab1a945cb9
[procdockerstatsd] Fix incorrect case issue in service file (#4134) 2020-02-10 11:08:42 -08:00
pra-moh
4338fbe12b
[procdockerstats]: Update file permission for procdockerstatsd (#4126) 2020-02-07 07:46:29 -08:00
Kiran Kumar Kella
97165a0d69
Changes in sonic-buildimage to support the NAT feature (#3494)
* Changes in sonic-buildimage for the NAT feature
- Docker for NAT
- installing the required tools iptables and conntrack for nat

Signed-off-by: kiran.kella@broadcom.com

* Add redis-tools dependencies in the docker nat compilation

* Addressed review comments

* add natsyncd to warm-boot finalizer list

* addressed review comments

* using swsscommon.DBConnector instead of swsssdk.SonicV2Connector

* Enable NAT application in docker-sonic-vs
2020-01-29 17:40:43 -08:00
kannankvs
7cb63008d7
mvrf_avoid_snmp_yml_config: made changes to pass SNMP config from con… (#4057)
* mvrf_avoid_snmp_yml_config: made changes to pass SNMP config from confiDB to snmpd.conf without using snmp.yml
* added a missing if condition
2020-01-28 17:41:21 -08:00
SuvarnaMeenakshi
c9483796dc [baseimage]: support building multi-asic component (#3856)
- move single instance services into their own folder
- generate Systemd templates for any multi-instance service files in slave.mk
- detect single or multi-instance platform in systemd-sonic-generator based on asic.conf platform specific file.
- update container hostname after creation instead of during creation (docker_image_ctl)
- run Docker containers in a network namespace if specified
- add a service to create a simulated multi-ASIC topology on the virtual switch platform

Signed-off-by: Lawrence Lee <t-lale@microsoft.com>
Signed-off-by: Suvarna Meenakshi <Suvarna.Meenaksh@microsoft.com>
2020-01-26 13:56:42 -08:00
pra-moh
e3475b81d7 [baseimage]: removing space from shebang in procdockerstatsd (#4051) 2020-01-23 17:49:41 -08:00
Dong Zhang
7aa0baf709 [MultiDB] (except ./src and ./dockers dirs): replace redis-cli with sonic-db-cli and use new DBConnector (#4035)
* [MultiDB] (except ./src and ./dockers dirs): replace redis-cli with sonic-db-cli and use new DBConnector
* update comment for a potential bug
* update comment
* add TODO maker as review reqirement
2020-01-22 11:26:23 -08:00
Howard Persh
44fa5efe00 [startup] Fixes issue with /var/platform directory not created (#4000) 2020-01-22 10:02:28 -08:00
Joe LeVeque
aca1a86856 [caclmgrd] Fix application of IPv6 service ACL rules (part 2) (#4036) 2020-01-17 17:33:31 -08:00