sonic-buildimage

Author	SHA1	Message	Date
Neetha John	596bec1b32	[qos]: Alpha and ECN settings change for Th (#4564 ) Dynamic threshold setting changed to 0 and WRED profile green min threshold set to 250000 for Tomahawk devices Changed the dynamic threshold settings in pg_profile_lookup.ini Added a macro for WRED profiles in qos.json.j2 for Tomahawk devices Necessary changes made in qos.config.j2 to use the macro if present Signed-off-by: Neetha John <nejo@microsoft.com>	2020-05-09 18:13:10 -07:00
arlakshm	542f722055	[docker]: Enabled ipv6 in dockers when using docker bridge network (#4426 ) Signed-off-by: Arvindsrinivasan Lakshmi Narasimhan <arlakshm@microsoft.com>	2020-04-27 08:50:23 -07:00
pavel-shirshov	2f44bcd071	[bgpcfgd]: Split one bgp mega-template to chunks. (#4143 ) The one big bgp configuration template was split into chunks. Currently we have three types of bgp neighbor peers: general bgp peers, represented by CONFIG_DB::BGP_NEIGHBOR table entries; dynamic bgp peers, represented by CONFIG_DB::BGP_PEER_RANGE table entries; and monitors bgp peers, represented by CONFIG_DB::BGP_MONITORS table entries. This PR introduces three templates for each peer type: bgp policies, which represent the policies that will be applied to the bgp peer-group (ip prefix-lists, route-maps, etc); bgp peer-group, which represents the bgp peer group that holds the common configuration for the bgp peer type and uses the bgp routing policy from the previous item; and bgp peer-group instance, which represents the bgp configuration used to instantiate a bgp peer-group for the bgp peer type. Usually this one is simple and consists of a reference to the bgp peer-group, the bgp peer description and the bgp peer ip address. This PR redefined the constant.yml file. This file now has a setting to use or not use bgp_neighbor metadata. It also has more parameters that are not used yet; they will be used in the next iteration of bgpcfgd. Currently all tests have been disabled. I'm going to create the next PR with the tests right after this PR is merged. I'm going to introduce a better bgpcfgd in a short time. It will include support for dynamic changes to the templates. FIX:: #4231	2020-04-25 09:41:28 +00:00
Renuka Manavalan	9b017a83b5	[baseimage]: Install Kubernetes packages if enabled in image (#4374 ) (#4432 ) Install kubeadm, which transparently installs kubelet & kubectl. Also download the Kubernetes images required to run as a Kubernetes node. The kubelet service is intentionally kept in a disabled state, as it would otherwise restart continuously, wasting resources, until the node joins a master.	2020-04-16 21:54:45 -07:00
SuvarnaMeenakshi	2f66b4c545	[sonic-netns-exec]: use "$@" to reflect all positional parameters as they were set initially (#4375 ) sonic-netns-exec fails to execute the command below in swss.sh: sonic-netns-exec "$NET_NS" sonic-db-cli $1 EVAL " local tables = {$2} for i = 1, table.getn(tables) do local matches = redis.call('KEYS', tables[i]) for j,name in ipairs(matches) do redis.call('DEL', name) end end" 0 This command fails with the error "redis.exceptions.ResponseError: value is not an integer or out of range". Root cause: When sonic-netns-exec executes the above command, the argument passed to sonic-db-cli is NOT passed as a single script. It is passed to sonic-db-cli as separate words, as below: ['EVAL', 'local', 'tables', '=', "{'PORT_TABLE'}", 'for', 'i', '=', '1,', 'table.getn(tables)', 'do', 'local', 'matches', '=', "redis.call('KEYS',", 'tables[i])', 'for', 'j,name', 'in', 'ipairs(matches)', 'do', "redis.call('DEL',", 'name)', 'end', 'end', '0'] - How I did it To make sure that the parameters are passed as they were set initially, fix sonic-netns-exec to use double quoted "$@", where "$@" is "$1" "$2" "$3" ... "${N}" After the fix, the argument passed to sonic-db-cli is as below: ['EVAL', "\n local tables = {'PORT_TABLE'}\n for i = 1, table.getn(tables) do\n local matches = redis.call('KEYS', tables[i])\n for j,name in ipairs(matches) do\n redis.call('DEL', name)\n end\n end", '0'] Signed-off-by: SuvarnaMeenakshi <sumeenak@microsoft.com>	2020-04-15 13:13:31 -07:00
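The quoting difference described in the commit above can be reproduced in isolation. The following is a minimal sketch, not the actual `sonic-netns-exec` script: `count_forwarded_args` is a hypothetical helper, it assumes a POSIX `sh` is available on the PATH, and the example arguments are a shortened stand-in for the real EVAL call. It runs a one-line wrapper that forwards its positional parameters either as `$@` or as `"$@"` and reports how many arguments the callee receives.

```python
import subprocess


def count_forwarded_args(args, quoted):
    """Return how many arguments a shell wrapper forwards to its callee.

    The wrapper is a one-line sh script that forwards its positional
    parameters to a shell function, either as "$@" (quoted) or $@ (unquoted).
    """
    forward = '"$@"' if quoted else "$@"
    script = "count() { echo $#; }; count " + forward
    # After the script, the first word becomes $0 and the rest become $1..$N.
    result = subprocess.run(
        ["sh", "-c", script, "wrapper"] + list(args),
        check=True,
        capture_output=True,
        text=True,
    )
    return int(result.stdout.strip())


# Hypothetical stand-in for the EVAL call: three arguments, one with spaces.
example_args = ["EVAL", "local tables = {x} return 1", "0"]

# Unquoted forwarding splits the script argument into separate words.
print(count_forwarded_args(example_args, quoted=False))
# Quoted forwarding keeps the script as a single argument.
print(count_forwarded_args(example_args, quoted=True))
```

With unquoted `$@` the script text is split on whitespace, so the callee sees the trailing `0` in the wrong position, which is consistent with the "value is not an integer" error quoted in the commit message.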
SuvarnaMeenakshi	0099305475	Multi-ASIC implementation (#3888 ) Changes made to support multi-asic platform. Added multi-instance support for swss, syncd, database, bgp, teamd and lldp.	2020-04-15 13:08:34 -07:00
Nazarii Hnydyn	0b35fcf3bf	[mellanox]: Add SSD FW update tool (#4351 ) * [mellanox]: Add SSD FW update tool. Signed-off-by: Nazarii Hnydyn <nazariig@mellanox.com> * [mellanox]: Align Platform API. Signed-off-by: Nazarii Hnydyn <nazariig@mellanox.com> * [mellanox]: Fix firmware description. Signed-off-by: Nazarii Hnydyn <nazariig@mellanox.com> * [mellanox]: Update SSD tool. Signed-off-by: Nazarii Hnydyn <nazariig@mellanox.com>	2020-04-15 13:02:36 -07:00
rajendra-dendukuri	a97b73e79c	Fix typo in config-setup service (#4388 )	2020-04-10 21:23:07 -07:00
Abhishek Dosi	249265ad99	Revert "Multi-ASIC implementation (#3888 )" This reverts commit `2e87a16941`.	2020-04-03 14:34:38 -07:00
Samuel Angebault	8819322210	[Arista] Update drivers submodules (#4353 ) * Update arista drivers submodules * Add device configs for 7060CX2-32S * Update boot0 and union-mount for 7060CX2-32S * Add 7170-32C and 7170-32CD support in boot0 * Sync after writing boot configs * Add 7170-32C and 7170-32CD device configurations Co-authored-by: Boyang Yu <byu@arista.com>	2020-04-01 23:26:42 -07:00
SuvarnaMeenakshi	2e87a16941	Multi-ASIC implementation (#3888 ) Changes made to support multi-asic platform. Added multi-instance support for swss, syncd, database, bgp, teamd and lldp.	2020-04-01 23:21:49 -07:00
Kebo Liu	2fd1641feb	copy spc3 fw file to image (#4328 )	2020-03-29 22:48:10 -07:00
Garrick He	a059d7ec0e	[procdockerstatsd] Fix CMD field in dB (#4335 ) * Fix the CMD for the PROCESSSTATS entries so that there is a space between the command name and the arguments. Signed-off-by: Garrick He <garrick_he@dell.com>	2020-03-29 22:47:05 -07:00
Stepan Blyshchak	ee84dca683	[docker_image_ctl.j2] Share UTS namespace with host OS (#4169 ) Instead of updating the hostname manually on a Config DB hostname change, simply share each container's UTS namespace with the host OS. Ideally, instead of setting `--uts=host` for every container in SONiC, this setting can be set per container if a feature requires it. One behaviour change is introduced in this commit: when `--privileged` or `--cap-add=CAP_SYS_ADMIN` and `--uts=host` are combined, the container has the privilege to change the hostname of the host OS and of every other container. Such privilege should be fixed by limiting container capabilities. Signed-off-by: Stepan Blyschak <stepanb@mellanox.com>	2020-03-22 23:04:02 -07:00
SuvarnaMeenakshi	7b4b1245bd	[ntp]: Add "tinker panic 0" in ntp.conf to prevent ntpd from panicking (#4263 ) - What I did Add configuration to prevent ntpd from panicking and exiting if the drift between the new time and the current system time is large. - How I did it Added "tinker panic 0" to the ntp.conf file. - How to verify it [this assumes that there is a valid NTP server IP in config_db/ntp.conf] 1. Change the current system time to a bad time with a large drift from the time on the ntp server; the drift should be greater than 1000s. 2. Reboot the device. Before the fix: 3. Upon reboot, the ntp-config service comes up fine, but the ntp service goes to the active (exited) state without any error message. This is because the offset between the new time (from the ntp server) and the current system time is very large, so ntpd goes into panic mode and exits. The system continues to show the bad time. After the fix: 3. Upon reboot, ntp-config comes up fine, and the ntp service comes up fine and stays in the active (running) state. The system clock gets synced with the ntp server time.	2020-03-22 23:00:40 -07:00
yozhao101	358570324b	[Monit] Delay start of monitoring for 5 minutes (#4281 )	2020-03-22 22:58:57 -07:00
Andriy Kokhan	39889a3c35	[Service] Added NAT entry into CONTAINER_FEATURE. Fixes #4247 . (#4250 ) * [Service] Added NAT entry into CONTAINER_FEATURE. Fixes #4247. Signed-off-by: Andriy Kokhan <akokhan@barefootnetworks.com>	2020-03-19 22:18:13 -07:00
Joe LeVeque	8e36068237	[sonic-cfggen] Loading the configuration from init_cfg.json and then from config_db.json (#4148 )	2020-03-15 08:54:05 -07:00
Olivier Singla	a8baca0d6e	[kernel]: security kernel update to 4.9.189 (#3913 ) This patch upgrades the kernel from version 4.9.0-9-2 (4.9.168-1+deb9u3) to 4.9.0-11-2 (4.9.189-3+deb9u2) Co-authored-by: rajendra-dendukuri <47423477+rajendra-dendukuri@users.noreply.github.com>	2020-03-15 08:52:29 -07:00
Joe LeVeque	102cb83097	[Services] Restart NAT service upon unexpected critical process exit. (#4208 )	2020-03-14 18:03:29 -07:00
Stephen Sun	c700127101	[Mellanox]Take advantage of sdk variable to customize the location where sdk_socket exists. (#4223 ) Take advantage of an SDK environment variable to customize the location where sdk_socket exists. In the latest SDK sdk_socket has been moved from /tmp to /var/run which is a better place to contain this kind of file. However, this prevents the subdirs under /var/run from being mapped to different volumes. To resolve this, we take advantage of an SDK variable to designate the location of sdk_socket. This requires every process that requires to access sdk_socket have this environment variable defined. However, to define environment variable for each process is less scalable. We take advantage of the docker scope environment variable to avoid that. It depends on PR 4227	2020-03-14 18:02:43 -07:00
byu343	950926a837	[arista]: Add support for Arista Lodoga (#4232 ) Backport the support of Arista Lodoga to 201911	2020-03-11 13:12:39 -07:00
Abhishek Dosi	cc2d497aa4	Fixing Bad Cherry-pick	2020-03-04 10:46:45 -08:00
rajendra-dendukuri	8581a52571	ZTP infrastructure changes to support DHCP discovery provisioning data (#3298 ) * ZTP infrastructure changes to support DHCP discovery provisioning data - Dynamically generate DHCP client configuration based on current ZTP state - Added support to request and process hostname when using DHCPv6 - Do not process graphservice url dhcp option if ZTP is enabled, ZTP service will process it - Generate /e/n/i file with all active interfaces seeking address assignment via DHCP. Only interfaces that are created in Linux will be added to /e/n/i. Also DHCP is started only on linked up in-band interfaces. Signed-off-by: Rajendra Dendukuri <rajendra.dendukuri@broadcom.com>	2020-03-03 22:23:59 -08:00
yozhao101	5c8c4b2a50	[Services] Restart BGP service upon unexpected critical process exit. (#4207 )	2020-03-03 19:19:44 -08:00
rajendra-dendukuri	1edb69647e	[sonic-ztp]: Build sonic-ztp package (#3299 ) * Build sonic-ztp package - Add changes in make rules to conditionally include sonic-ztp package Signed-off-by: Rajendra Dendukuri <rajendra.dendukuri@broadcom.com>	2020-02-24 14:27:24 -08:00
Stepan Blyshchak	398929c622	[mgmt-framework] start after syncd (#4174 ) every service starts after syncd to start the most critical parts first Signed-off-by: Stepan Blyschak <stepanb@mellanox.com>	2020-02-24 11:04:51 -08:00
Prince Sunny	20510d58d3	Sleep done before mismatch handler (#4165 ) * Sleep done before mismatch handler	2020-02-24 10:25:56 -08:00
Prince Sunny	6740b2d3df	Fix service and container name to be same (#4151 )	2020-02-24 10:24:11 -08:00
Joe LeVeque	f6d69aed49	[interfaces-config.sh] Do not bring 'lo' interface down and up (#4150 )	2020-02-24 10:23:35 -08:00
Sumukha Tumkur Vani	af4e84298a	Start RestAPI container when sonic boots (#4140 ) * Start RestAPI container when sonic boots	2020-02-24 10:16:02 -08:00
Stephen Sun	48f8a8d40e	[Mellanox] platform api support firmware install (#3931 ) support firmware install, including CPLD and BIOS. CPLD: cpldupdate BIOS: boot to onie and update BIOS in onie and then boot to SONiC	2020-02-24 10:14:52 -08:00
byu343	f197f0d2a9	[arista]: Fix convertfs condition for booting from EOS (#4139 ) Fix the issue of incorrectly skipping the convertfs hook when fast-reboot from EOS, by adding an extra kernel cmdline param "prev_os" to differentiate fast-reboot from EOS and from SONiC. This is because we still do disk conversion for fast reboot from eos to sonic, like format the disk.	2020-02-13 16:20:53 -08:00
yozhao101	3ac345922b	[Services] Restart database service upon unexpected critical process exit. (#4138 ) * [database] Implement the auto-restart feature for database container. Signed-off-by: Yong Zhao <yozhao@microsoft.com> * [database] Remove the duplicate dependency in service files. Since we already have updategraph ---> config_setup ---> database, we do not need explicitly add database.service in all other container service files. Signed-off-by: Yong Zhao <yozhao@microsoft.com> * [event listener] Reorganize the line 73 in event listener script. Signed-off-by: Yong Zhao <yozhao@microsoft.com> * [database] update the file sflow.service.j2 to remove the duplicate dependency. Signed-off-by: Yong Zhao <yozhao@microsoft.com> * [event listener] Add comments in event listener. Signed-off-by: Yong Zhao <yozhao@microsoft.com> * [event listener] Update the comments in line 56. Signed-off-by: Yong Zhao <yozhao@microsoft.com> * [event listener] Add parentheses for if statement in line 76 in event listener. Signed-off-by: Yong Zhao <yozhao@microsoft.com>	2020-02-13 16:20:38 -08:00
yozhao101	71225ea4cc	[Service] Enable/disable container auto-restart based on configuration. (#4073 )	2020-02-13 16:20:21 -08:00
yozhao101	984c43e01d	[init_cfg.json] Add new FEATURE and CONTAINER_FEATURE tables (#4137 ) * [init_cfg.json] Add a new table CONTAINER_FEATURE. Signed-off-by: Yong Zhao <yozhao@microsoft.com> * [init_cfg.json] Update the content of table CONTAINER_FEATURE. Signed-off-by: Yong Zhao <yozhao@microsoft.com> * [init_cfg.json] Use the template to generate the table CONTAINER_FEATURE. Signed-off-by: Yong Zhao <yozhao@microsoft.com> * [init_cfg.json] Add a new table FEATURE. Signed-off-by: Yong Zhao <yozhao@microsoft.com> * [init_cfg.json] Change the order of container names according to alphabetical order. Signed-off-by: Yong Zhao <yozhao@microsoft.com> * [init_cfg.json] Change the dhcp_relay container name and add rest-api. Signed-off-by: Yong Zhao <yozhao@microsoft.com>	2020-02-13 16:07:41 -08:00
yozhao101	f061353655	[init_cfg.json] Maintain a separate init_cfg.json.j2 template file (#4092 )	2020-02-13 16:07:23 -08:00
pra-moh	c70a7b877d	[procdockerstatsd] Fix incorrect case issue in service file (#4134 )	2020-02-13 16:06:30 -08:00
Stephen Sun	6143fdd54d	[process-reboot-cause]Clean up the process-reboot-cause as required in issue 3927 (#4128 )	2020-02-13 16:05:55 -08:00
pra-moh	e1946432ff	[procdockerstats]: Update file permission for procdockerstatsd (#4126 )	2020-02-13 16:05:36 -08:00
Prince Sunny	e87f27050b	Update arp_update to refresh neighbor entries from APP_DB (#4125 )	2020-02-13 16:05:19 -08:00
kannankvs	74ac9b02dc	modified down rules to pre-down rules to ensure that default route is… (#3853 ) * modified down rules to pre-down rules to ensure that default route is deleted just before interface is made down	2020-02-13 16:01:21 -08:00
kannankvs	a836ead688	mvrf_avoid_snmp_yml_config: made changes to pass SNMP config from con… (#4057 ) * mvrf_avoid_snmp_yml_config: made changes to pass SNMP config from ConfigDB to snmpd.conf without using snmp.yml * added a missing if condition	2020-02-03 15:38:38 -08:00
pra-moh	8e4a4caf79	[baseimage]: removing space from shebang in procdockerstatsd (#4051 )	2020-02-03 15:37:47 -08:00
Dong Zhang	42bffc1215	[MultiDB] (except ./src and ./dockers dirs): replace redis-cli with sonic-db-cli and use new DBConnector (#4035 ) * [MultiDB] (except ./src and ./dockers dirs): replace redis-cli with sonic-db-cli and use new DBConnector * update comment for a potential bug * update comment * add TODO marker as review requirement	2020-02-03 15:36:55 -08:00
Howard Persh	cc825ff2fe	[startup] Fixes issue with /var/platform directory not created (#4000 )	2020-02-03 15:34:34 -08:00
SuvarnaMeenakshi	abe7ef7e2e	[baseimage]: support building multi-asic component (#3856 ) - move single instance services into their own folder - generate Systemd templates for any multi-instance service files in slave.mk - detect single or multi-instance platform in systemd-sonic-generator based on asic.conf platform specific file. - update container hostname after creation instead of during creation (docker_image_ctl) - run Docker containers in a network namespace if specified - add a service to create a simulated multi-ASIC topology on the virtual switch platform Signed-off-by: Lawrence Lee <t-lale@microsoft.com> Signed-off-by: Suvarna Meenakshi <Suvarna.Meenaksh@microsoft.com>	2020-02-03 15:32:21 -08:00
Kiran Kumar Kella	a943e6ce45	Changes in sonic-buildimage to support the NAT feature (#3494 ) * Changes in sonic-buildimage for the NAT feature - Docker for NAT - installing the required tools iptables and conntrack for nat Signed-off-by: kiran.kella@broadcom.com * Add redis-tools dependencies in the docker nat compilation * Addressed review comments * add natsyncd to warm-boot finalizer list * addressed review comments * using swsscommon.DBConnector instead of swsssdk.SonicV2Connector * Enable NAT application in docker-sonic-vs	2020-02-03 15:30:39 -08:00
B S Rama krishna	5a4f19e04a	[kdump]: porting kdump installation skip on arm to 201911 (#4081 )	2020-01-29 09:07:12 -08:00
Joe LeVeque	ccdc097a8f	[caclmgrd] Fix application of IPv6 service ACL rules (part 2) (#4036 )	2020-01-21 10:53:16 -08:00
Sujin Kang	9deb8c15f3	[reboot cause]: Delay process-reboot-cause service until network connection is stable (#4003 )	2020-01-21 10:47:13 -08:00
yozhao101	82c2eee1e6	[Monit] Change the monitoring period from 120 seconds to 60 seconds. (#3974 ) * [Monit] Change the monitoring period of monit from 120 seconds to 60 seconds and also at the same time double the interval for existing sonic monit config file in host. Signed-off-by: Yong Zhao <yozhao@microsoft.com>	2020-01-21 10:44:36 -08:00
Joe LeVeque	aad6b9c034	[apt] Instruct apt-get to NOT check the "Valid Until" date in Release files (#3973 ) This is an addendum to #3958, which also instructs apt to ignore the "Valid Until" date in Release files inside the slave containers, making a complete solution, much like the previously abandoned PR #2609. This patch also unifies file names and contents. When the Debian team archives a repo, it stops updating the "Valid Until" date, thus apt-get will not apply updates for that repo unless we explicitly tell it to ignore the "Valid Until" date. Also, this has become an issue with active (i.e., non-archived) repos twice in the past year because the Debian folks seem to occasionally let the expiration lapse before updating the date. This will cause SONiC builds to fail with a message like E: Release file for http://debian-archive.trafficmanager.net/debian-security/dists/jessie/updates/InRelease is expired (invalid since 3d 3h 11min 20s). Updates for this repository will not be applied. until the dates have been updated and propagated to all mirrors. With this patch, SONiC should no longer be affected by lapsed "Valid Until" dates, whether they be accidental or purposeful.	2020-01-21 10:43:51 -08:00
rajendra-dendukuri	bb34edf1af	[config-setup]: create a SONiC configuration management service (#3227 ) * Create a SONiC configuration management service * Perform config db migration after loading config_db.json to redis DB * Migrate config-setup post migration hooks on image upgrade. config-setup post migration hooks help users migrate configurations from the old image to the new image. If the installed hooks are user defined, they will not be part of the newly installed image. So these hooks have to be migrated to the new image, and only then can they be executed when the new image is booting. The changes in this fix migrate config-setup post-migration hooks and ensure that any hooks with the same filename in the newly installed image are not overwritten. It is expected that users install new hooks as per their requirement and do not edit existing hooks. Any changes to existing hooks need to be done as part of the new image and not post bootup.	2020-01-21 10:39:19 -08:00
Prabhu Sreenivasan	7ec2732387	SONiC Management Framework Release 1.0 (#3488 ) * Added sonic-mgmt-framework as submodule / docker * fix build issues * update sonic-mgmt-framework submodule branch to master * Merged changes 70007e6d2ba3a4c0b371cd693ccc63e0a8906e77..00d4fcfed6a759e40d7b92120ea0ee1f08300fc6 00d4fcfed6a759e40d7b92120ea0ee1f08300fc6 Modified environment variables * Changes to build sonic-mgmt-framework docker * bumped up sonic-mgmt-framework commit-id * version bump for sonic-mgmt-framework commit-id * bumped up sonic-mgmt-framework commit-id * Add python packages to docker * Build fix for docker with python packages * added libyang as dependent package * Allow building images on NFS-mounted clones Prior to this change, `build_debian.sh` would generate a Debian filesystem in `./fsroot`. This needs root permissions, and one of the tests that is performed is whether the user can create a character special file in the filesystem (using mknod). On most NFS deployments, `root` is the least privileged user, and cannot run mknod. Also, attempting to run commands like rm or mv as root would fail due to permission errors, since the root user gets mapped to an unprivileged user like `nobody`. This commit changes the location of the Debian filesystem to `/fsroot`, which is a tmpfs mount within the slave Docker. The default squashfs, docker tarball and zip files are also created within /tmp, before being copied back to /sonic as the regular user. The side effect of this change is that the contents of `/fsroot` are no longer available once the slave container exits, however they are available within the squashfs image. Signed-off-by: Nirenjan Krishnan <Nirenjan.Krishnan@dell.com> * bumped up sonic-mgmt-framework commit to include PR #18 * REST Server startup script is enhanced to read the settings from ConfigDB. The table below maps each db field to its command line argument name. 
============================================================
ConfigDB entry key      Field name     REST Server argument
============================================================
REST_SERVER|default     port           -port
REST_SERVER|default     client_auth    -client_auth
REST_SERVER|default     log_level      -v
DEVICE_METADATA|x509    server_crt     -cert
DEVICE_METADATA|x509    server_key     -key
DEVICE_METADATA|x509    ca_crt         -cacert
============================================================
* Replace src/telemetry as submodule to sonic-telemetry * Update telemetry commit HEAD * Update sonic-telemetry commit HEAD * libyang env path update * Add libyang dependency to telemetry * Add scripts to create JSON files for CLI backend Scripts to create /var/platform/syseeprom and /var/platform/system, which are back-end files for CLI, for system EEPROM and system information. Signed-off-by: Howard Persh <Howard_Persh@dell.com> * In startup script, create directory where CLI back-end files live Signed-off-by: Howard Persh <Howard_Persh@dell.com> * build dependency pkgs added to docker for build failure fix * Changes to fix build issue for mgmt framework * Fix exec path issue with telemetry * s5232[device] PSU detection and default led state support * Processing of first boot in rc.local should not have premature exit Signed-off-by: Howard Persh <Howard_Persh@dell.com> * docker mount options added for platform, system features * bumped up sonic-mgmt-framework commit id to pick 23rd July 2019 changes * Added mount options for telemetry docker to get access for system and platform info. * Update commit for sonic-utilities * [dell]: Corrected dport map and renamed config files for S5232F * Fix telemetry submodule commit * added support for sonic-cli console * [Dell S5232F, Z9264F] Harden FPGA driver kernel module For Dell S5232F and Z9264F platforms, be more strict when checking state in ISR of FPGA driver, to harden against spurious interrupts. 
Signed-off-by: Howard Persh <Howard_Persh@dell.com> * update mgmt-framework submodule to 27th Aug commit. * remove changes not related to mgmt-framework and sonic-telemetry * Revert "Replace src/telemetry as submodule to sonic-telemetry" This reverts commit `11c3192975`. * Revert "Replace src/telemetry as submodule to sonic-telemetry" This reverts commit `11c3192975`. * make submodule changes and remove a change not related to PR * more changes * Update .gitmodules * Update Dockerfile.j2 * Update .gitmodules * Update .gitmodules * Update .gitmodules reverting experimental change * Removed syspoll for release_1.0 Signed-off-by: Jeff Yin <29264773+jeff-yin@users.noreply.github.com> * Update docker-sonic-mgmt-framework.mk * Update sonic-mgmt-framework.mk * Update sonic-mgmt-framework.mk * Update docker-sonic-mgmt-framework.mk * Update docker-sonic-mgmt-framework.mk * Revert "Processing of first boot in rc.local should not have premature exit" This reverts commit `e99a91ffc2`. * Remove old telemetry directory * Update docker-sonic-mgmt-framework.mk * Resolving merge conflict with Azure * Reverting the wrong merge * Use CVL_SCHEMA_PATH instead of changing directory for telemetry startup * Add missing export * Add python mmh3 to slave dockerfile * Remove sonic-mgmt-framework build dep for telemetry, fix dialout startup issues * Provided flag to disable compiling mgmt-framework * Update sonic-utilites point latest commit id * Point sonic-utilities to Azure accepted SHA * Updating mgmt framework to right sha * Add sonic-telemetry submodule * Update the mgmt-framework commit id Co-authored-by: jghalam <joe.ghalam@gmail.com> Co-authored-by: Partha Dutta <51353699+dutta-partha@users.noreply.github.com> Co-authored-by: srideepDell <srideep_devireddy@dell.com> Co-authored-by: nirenjan <nirenjan@users.noreply.github.com> Co-authored-by: Sachin Holla <51310506+sachinholla@users.noreply.github.com> Co-authored-by: Eric Seifert <seiferteric@gmail.com> Co-authored-by: Howard Persh 
<hpersh@yahoo.com> Co-authored-by: Jeff Yin <29264773+jeff-yin@users.noreply.github.com> Co-authored-by: Arunsundar Kannan <31632515+arunsundark@users.noreply.github.com> Co-authored-by: rvasanthm <51932293+rvasanthm@users.noreply.github.com> Co-authored-by: Ashok Daparthi-Dell <Ashok_Daparthi@Dell.com> Co-authored-by: anand-kumar-subramanian <51383315+anand-kumar-subramanian@users.noreply.github.com>	2020-01-08 15:51:02 -08:00
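The ConfigDB-to-argument mapping listed in the commit above can be sketched as a small lookup. This is an illustration under stated assumptions, not the actual REST server startup script: `build_rest_server_args` is a hypothetical helper, ConfigDB is modelled as a plain dictionary keyed by `TABLE|key`, the sample values are made up, and passing each value as a separate word after its flag is an assumption.

```python
# (ConfigDB entry key, field name, REST server flag), as listed in the table.
FIELD_TO_FLAG = [
    ("REST_SERVER|default", "port", "-port"),
    ("REST_SERVER|default", "client_auth", "-client_auth"),
    ("REST_SERVER|default", "log_level", "-v"),
    ("DEVICE_METADATA|x509", "server_crt", "-cert"),
    ("DEVICE_METADATA|x509", "server_key", "-key"),
    ("DEVICE_METADATA|x509", "ca_crt", "-cacert"),
]


def build_rest_server_args(config_db):
    """Translate ConfigDB fields into a flat argument list, skipping unset fields."""
    args = []
    for key, field, flag in FIELD_TO_FLAG:
        value = config_db.get(key, {}).get(field)
        if value is not None:
            args += [flag, str(value)]
    return args


# Hypothetical ConfigDB snapshot; the values are invented for the example.
sample = {
    "REST_SERVER|default": {"port": 8443, "log_level": 2},
    "DEVICE_METADATA|x509": {"server_crt": "/etc/sonic/server.crt"},
}
print(build_rest_server_args(sample))
```

Keeping the mapping in one table-like list means a new field only needs one extra row, which mirrors how the commit message documents it.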
Abhishek	6045e34650	Merge branch 'abdosi/master_201911_label_to_201911' into 201911. Cherry pick changes from master into 201911	2020-01-06 17:30:03 -08:00
Joe LeVeque	5e07b252ff	[monit] Build from source and patch to use MemAvailable value if available on system (#3875 )	2020-01-06 11:41:20 -08:00
Stepan Blyshchak	b834c9ff34	[services] make snmp.timer work again and delay telemetry.service (#3742 ) Delay CPU intensive services at boot - How I did it Made snmp.timer work and added telemetry.timer. But this is not enough because it breaks the existing snmp dependency on swss. So, in this solution the snmp timer is wanted by the swss service, but since the OnBootSec timer expires only once it will not trigger the snmp service again, so I added the line "OnUnitActiveSec=0 sec", which will start the snmp service based on the last time it was active. On boot only OnBootSec will expire; on swss start/restart only the second timer will expire immediately and trigger the snmp service. However, the snmp service will not stop after "systemctl stop snmp" because of the second timer, which will always expire when the snmp service becomes unavailable. So there is a conflict, which will be handled by systemd if we add a "Conflicts=" line to both snmp.service and snmp.timer. So during boot: snmp does not start by default; swss starts and starts the snmp timer; OnUnitActiveSec=0 does not expire since there is no active snmp; OnBootSec expires and starts the snmp service, and the snmp timer gets stopped. During "systemctl restart swss": snmp stops because of Requisite on swss; snmp unblocks the snmp timer from running; swss starts and starts the snmp timer; OnUnitActiveSec=0 expires immediately and starts snmp, which stops the snmp timer. During "systemctl stop snmp": the stop of the snmp service unblocks the snmp timer, but no one starts the timer, so it is not started by "OnUnitActiveSec=0".	2020-01-06 10:32:24 -08:00
pavel-shirshov	74b45be487	[fast-reboot]: Save fast-reboot state into the db (#3741 ) Put a flag for fast-reboot to the db using EXPIRE feature. Using this flag in other part of SONiC to start in Fast-reboot mode. If we reload a config, the state in the db will be removed.	2020-01-06 10:30:36 -08:00
lguohan	b2234a682d	[docker-base-stretch]: Do not check expire for stretch-backports repo (#3958 ) * [docker-base-stretch]: Do not check expire for stretch-backports repo Signed-off-by: Guohan Lu <gulv@microsoft.com>	2020-01-03 10:44:26 -08:00
Ying Xie	df81943ec5	Revert "[swss.sh] When starting, call 'systemctl restart' on dependents, not (#3807 )" (#3835 ) This reverts commit `351410ea8c`.	2020-01-02 14:35:55 -08:00
Joe LeVeque	fd3d8c23b2	[services] sflow service sets swss service as Requisite=, not Requires= (#3819 ) The sflow service should not start unless the swss service is started. However, if the swss service is not started, the sflow service should not attempt to start it; instead it should simply fail to start. Using Requisite=, we will achieve this behavior, whereas using Requires= will cause the required service to be started.	2020-01-02 14:29:11 -08:00
Stepan Blyshchak	3474e8fddd	[syncd.sh] remove chipdown on mellanox (#3926 ) ASIC reset events are captured by hw-mgmt, and hw-mgmt calls chipup/chipdown internally without OS interaction Signed-off-by: Stepan Blyschak <stepanb@mellanox.com>	2019-12-31 14:43:32 -08:00
Joe LeVeque	f0b7dfad7c	[caclmgrd] Fix application of IPv6 service ACL rules (#3917 )	2019-12-31 14:42:49 -08:00
Renuka Manavalan	2d079a15dd	corefile uploader: Updates per review comments offline (#3915 ) * Updates per review comments 1) core_uploader service waits for syslog.service 2) core_uploader service enabled for restart on failure 3) Use mtime instead of file size + ample time to be robust. * Avoid reloading an already uploaded file, by marking the names with a prefix. * Updated failing path. 1) If the rc file is missing or required data is missing, it periodically logs an error in a forever loop. 2) If an upload fails, retry every hour with an error log, forever. * Fix a few bugs * The binary update_json.py will come from sonic-utilities.	2019-12-31 14:42:01 -08:00
Ying Xie	2c7a01a421	[swss service] flush fast-reboot enabled flag upon swss stopping (#3908 ) If we need to stop swss during fast-reboot procedure on the boot up path, it means that something went wrong, like syncd/orchagent crashed already, we are stopping and restarting swss/syncd to re-initialize. In this case, we should proceed as if it is a cold reboot. Signed-off-by: Ying Xie <ying.xie@microsoft.com>	2019-12-18 11:20:45 -08:00
Ying Xie	759bde3a43	[hostcfgd] avoid in place editing config file contents (#3904 ) In place editing (sed -i) seems to have some issues with filesystem interaction. It could leave a 0 size file or a corrupted file behind. It would be safer to sed the file contents into a new file and then swap the new file in for the old file. Signed-off-by: Ying Xie <ying.xie@microsoft.com>	2019-12-18 11:20:25 -08:00
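The "write a new file, then swap" approach described in the commit above can be sketched as follows. This is a minimal illustration and not the actual hostcfgd change, which the commit describes in terms of sed: `rewrite_file` is a hypothetical helper, and the demo file name and contents are invented.

```python
import os
import re
import shutil
import tempfile


def rewrite_file(path, pattern, replacement):
    """Apply a regex substitution by writing a new file and swapping it in."""
    with open(path) as src:
        new_contents = re.sub(pattern, replacement, src.read(), flags=re.MULTILINE)

    # Create the temporary file next to the target so the final rename
    # stays on one filesystem.
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "w") as dst:
            dst.write(new_contents)
            dst.flush()
            os.fsync(dst.fileno())  # push the data to disk before the swap
        shutil.copymode(path, tmp_path)  # keep the original permission bits
        os.replace(tmp_path, path)  # swap the new file in for the old one
    except BaseException:
        os.unlink(tmp_path)
        raise


# Demo on a throwaway file; the file name and contents are hypothetical.
with tempfile.TemporaryDirectory() as workdir:
    demo = os.path.join(workdir, "demo.conf")
    with open(demo, "w") as handle:
        handle.write("auth = off\nport = 49\n")
    rewrite_file(demo, r"^auth = .*$", "auth = on")
    with open(demo) as handle:
        print(handle.read())
```

Because the old file is only replaced after the new one is fully written, a reader never observes an empty or half-written file at the target path.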
Renuka Manavalan	14f7b8da2d	Corefile uploader service (#3887 ) * Corefile uploader service 1) A service is added to watch /var/core and upload to Azure storage 2) The service is disabled on boot. One may enable explicitly. 3) The .rc file to be updated with acct credentials and http proxy to use. 4) If service is enabled with no credentials, it would sleep, with periodic log messages 5) For any update in .rc, the service has to be restarted to take effect. * Remove rw permission for .rc file for group & others. * Changes per review comments. Re-ordered .rc file per JSON.dump order. Added a script to enable partial update of .rc, which HWProxy would use to add acct key. * Azure storage upload requires python module futures, hence added it to install list. * Removed trailing spaces. * A mistake in name corrected. Copy the .rc updater script to /usr/bin.	2019-12-18 11:19:25 -08:00
Stephen Sun	ba4f0f30c8	[process-reboot-cause]Address the issue: Incorrect reboot cause returned when warm reboot follows a hardware caused reboot (#3880 ) * [process-reboot-cause]Address the issue: Incorrect reboot cause returned when warm reboot follows a hardware caused reboot 1. check whether /proc/cmdline indicates warm/fast reboot. if yes the software reboot cause file will be treated as the reboot cause. finish 2. check whether platform api returns a reboot cause. if yes it is treated as the reboot cause. finish. 3. check whether /hosts/reboot-cause contains a cause. if yes it is treated as the cause otherwise return unknown. * [process-reboot-cause]Fix review comments * [process-reboot-cause]address comments 1. use "with" statement 2. update fast/warm reboot BOOT_ARG * [process-reboot-cause]address comments * refactor the code flow * Remove escape * Remove extra ':'	2019-12-18 11:17:17 -08:00
pra-moh	bfa96bbce3	Add daemon which periodically pushes process and docker stats to State DB (#3525 )	2019-11-27 15:35:41 -08:00
Joe LeVeque	5e6f8adb22	[services] Remove explicit dependencies from dhcp_relay service file, control in swss.sh (#3823 )	2019-11-26 16:59:45 -08:00
pra-moh	d3a1555f30	[hostcfgd] Add support to enable/disable optional features (#3653 )	2019-11-26 14:11:12 -08:00
yozhao101	67fc68513e	[Services] Restart Sflow service upon unexpected critical process exit. (#3751 ) Signed-off-by: Yong Zhao <yozhao@microsoft.com>	2019-11-25 13:02:00 -08:00
Joe LeVeque	351410ea8c	[swss.sh] When starting, call 'systemctl restart' on dependents, not (#3807 ) 'systemctl start'	2019-11-22 20:39:09 -08:00
yozhao101	df11b2b9f1	[Services] Restart Telemetry service upon unexpected critical process exit. (#3768 ) Signed-off-by: Yong Zhao <yozhao@microsoft.com>	2019-11-18 16:56:44 -08:00
kannankvs	4007d9ba9c	[ntp]: modified ntp script to hide the error related to cfggen (#3745 ) This PR is to handle the issue 3527. When device boots up, NTP throws a traceback as explained in the issue 3527. - Traceback will be seen when MGMT_VRF_CONFIG does not exist in the database. Traceback is coming from the script “/etc/init.d/ntp”. - Traceback does not affect the NTP functionality with/without management VRF. When MGMT_VRF_CONFIG does not exist or when MGMT_VRF_CONFIG’s mgmtVrfEnabled is configured to “false”, “NTP” will be started in the “default VRF” context, which is working fine even with this traceback. - This traceback error will be hidden by redirecting the error to /dev/null without affecting functionality.	2019-11-14 00:06:54 -08:00
Joe LeVeque	c50c390eb4	[rsyslog] Add support for IPv6 remote addresses (#3754 )	2019-11-14 00:00:55 -08:00
Tyler Li	c07ae3b16f	Loopback ip addresses move to intfmgrd for supporting VRF	2019-11-10 02:27:33 -08:00
Joe LeVeque	85b0de3df1	[docker-syncd]: Restart SwSS, syncd and dependent services if a critical process in syncd container exits unexpectedly (#3534 ) Add the same mechanism I developed for the SwSS service in #2845 to the syncd service. However, in order to cause the SwSS service to also exit and restart in this situation, I developed a docker-wait-any program which the SwSS service uses to wait for either the swss or syncd containers to exit.	2019-11-09 10:26:39 -08:00
Olivier Singla	c70d8bca9f	[baseimage]: kdump support (#3722 ) * In the event of a kernel crash, we need to gather as much information as possible to understand and identify the root cause of the crash. Currently, the kernel does not provide much information, which makes kernel crash investigation difficult and time consuming. Fortunately, there is a way in the kernel to provide more information in the case of a kernel crash. kdump is a feature of the Linux kernel that creates crash dumps in the event of a kernel crash. This PR will add kernel kdump support. An extension to the CLI utilities config and show is provided to configure and manage kdump: - enable / disable kdump functionality - configure kdump (how many kernel crash logs can be saved, memory allocated for capture kernel) - view kernel crash logs	2019-11-08 23:08:42 -08:00
Ying Xie	96fffd883d	Revert "[services] make snmp.timer work again and delay telemetry.service (#3657 )" (#3729 ) This reverts commit `d346cb3898`.	2019-11-08 21:44:25 -08:00
lguohan	6d46badbdc	[aboot]: preserve snmp.yml and acl.json for eos to sonic fast reboot (#3716 )	2019-11-06 20:18:31 -08:00
Neetha John	95466c3ab7	[pfcwd]: Do not start pfc watchdog on Management Tor (#3719 ) Signed-off-by: Neetha John <nejo@microsoft.com>	2019-11-06 18:51:02 -08:00
pavel-shirshov	d5af096f41	[TSA]: Add community to the loopback prefix, when isolated (#3708 ) * Rename asn/deployment_id_asn_map.yaml to constants/constants.yaml * Fix bgp templates * Add community for loopback when bgpd is isolated * Use correct community value	2019-11-06 16:07:28 -08:00
Stepan Blyshchak	d346cb3898	[services] make snmp.timer work again and delay telemetry.service (#3657 ) Signed-off-by: Stepan Blyschak <stepanb@mellanox.com>	2019-11-06 12:12:31 -08:00
yozhao101	a117b25446	[Services] Restart LLDP service upon unexpected critical process exit. (#3713 ) Signed-off-by: Yong Zhao <yozhao@microsoft.com>	2019-11-06 11:02:57 -08:00
Samuel Angebault	05e659901f	[arista] Add support for more 7280CR3 variants (#3711 ) * Add extra Smartsville hwskus	2019-11-06 10:11:38 -08:00
yozhao101	ed79f54569	[Services] Restart DHCP-Relay service upon unexpected critical process exit. (#3667 ) Signed-off-by: Yong Zhao <yozhao@microsoft.com>	2019-11-05 18:32:14 -08:00
yozhao101	4c31ef3cd2	[Services] Restart Teamd service upon unexpected critical process exit. (#3703 ) Signed-off-by: Yong Zhao <yozhao@microsoft.com>	2019-11-04 17:45:41 -08:00
yozhao101	4fa3a1e27e	[Services] Restart Platform-monitor service upon unexpected critical process exit. (#3689 ) Signed-off-by: Yong Zhao <yozhao@microsoft.com>	2019-11-04 17:44:01 -08:00
Stepan Blyshchak	8dbe13c4cc	[services] improve startup time by changing startup order (#3656 ) * [services] improve startup time by given precedence to critical services (syncd.service) Signed-off-by: Stepan Blyschak <stepanb@mellanox.com>	2019-10-31 09:18:26 -07:00
yozhao101	cff30c59d0	[Services] Restart Router-advertiser service upon unexpected critical process exit (#3681 ) Signed-off-by: Yong Zhao <yozhao@microsoft.com>	2019-10-30 16:41:55 -07:00
Ying Xie	5961e031e1	[hostname-config] improve hostname-config process (#3676 ) We noticed in tests/production that there is a low-probability failure where /etc/hosts could have some garbage characters before the entry for the local host name. The consequence is that all sudo commands would be very slow. In extreme cases it would prevent some services from starting properly. I suspect that the /etc/hosts file might be opened by some process, causing the issue. Editing the contents in a new file and then replacing the whole file should be safer. Signed-off-by: Ying Xie <ying.xie@microsoft.com>	2019-10-29 08:30:27 -07:00
Danny Allen	63328814fc	[core_cleanup] Fix issue where core_cleanup job runs too frequently (#3659 ) Signed-off-by: Danny Allen <daall@microsoft.com>	2019-10-23 15:55:47 -07:00
yozhao101	a0fbeeaca5	[Services] Restart SNMP service upon unexpected critical process exit. (#3650 ) Signed-off-by: Yong Zhao <yozhao@microsoft.com>	2019-10-22 14:41:12 -07:00
Wenda Ni	be52977aca	Revert "Configure buffer profile to all ports (#3561 )" (#3628 ) This reverts commit `8861cbe98e`.	2019-10-18 09:14:39 -07:00
kannankvs	150ed36be2	[snmp]: changes to handle snmp configuration as per the modified CLI (#3586 ) While doing CLI changes for SNMP configuration, few changes are made in backend to handle the modified CLI. Changes - "community" for "snmp trap" is also made as "configurable". snmpd_conf.j2 is modified to handle the same. - Changed the snmp.yml file generation from postStartAction to preStartAction in docker_image_ctl.j2 specific to SNMP docker, to ensure that the snmp.yml is generated before sonic-cfggen generates the snmpd.conf. - Changed to make the code common for management vrf and default vrf. Users can configure snmp trap and snmp listening IP for both management vrf and default vrf.	2019-10-10 09:24:18 -07:00
pavel-shirshov	9b8f5c9c9a	[ntp]: Use loopback address when we don't have MGMT interface (#3566 ) Added configuration to use Loopback ip if a switch doesn't have MGMT_PORT.	2019-10-07 07:49:25 -07:00
Wenda Ni	8861cbe98e	Configure buffer profile to all ports (#3561 ) Signed-off-by: Wenda Ni <wenni@microsoft.com>	2019-10-04 11:20:57 -07:00
Ying Xie	cd85e2148b	[updategraph] enhance update graph handling (#3549 ) - after reloading minigraph, write latest version string in the DB. - if old config_db.json file exists, use it and migrate to latest version. - only reload minigraph when config_db.json doesn't exist and minigraph exists. Signed-off-by: Ying Xie <ying.xie@microsoft.com>	2019-10-02 13:58:44 -07:00
Ying Xie	d5262a3621	[first boot] sync file system after moving/copying files (#3550 ) Signed-off-by: Ying Xie <ying.xie@microsoft.com>	2019-10-02 13:58:34 -07:00
Wenda Ni	cf0465bf53	Adopt per-port buffer and qos profile (#3542 ) Signed-off-by: Wenda Ni <wenni@microsoft.com>	2019-10-02 13:01:16 -07:00
Stepan Blyshchak	52e35a0f95	[docker_image_ctl.j2] skip hostname update if is up to date (#3529 ) Signed-off-by: Stepan Blyschak <stepanb@mellanox.com>	2019-10-01 20:48:03 -07:00
Stephen Sun	7308d2eb97	[Mellanox] Stop pmon ahead of syncd (#3505 ) Issue Overview shutdown flow For any shutdown flow, which means all dockers are stopped in order, pmon docker stops after syncd docker has stopped, causing the pmon docker to fail to release sx_core resources and leaving sx_core in a bad state. The related logs are like the following: INFO syncd.sh[23597]: modprobe: FATAL: Module sx_core is in use. INFO syncd.sh[23597]: Unloading sx_core[FAILED] INFO syncd.sh[23597]: rmmod: ERROR: Module sx_core is in use config reload & service swss.restart In the flows like "config reload" and "service swss restart", the failure causes further consequences: sx_core initialization error with error message like "sx_core: create EMAD sdq 0 failed. err: -16" syncd fails to execute the create switch api with error message "syncd_main: Runtime error: :- processEvent: failed to execute api: create, key: SAI_OBJECT_TYPE_SWITCH:oid:0x21000000000000, status: SAI_STATUS_FAILURE" swss fails to call SAI API "SAI_SWITCH_ATTR_INIT_SWITCH", which causes orchagent to restart. This will introduce an extra 1 or 2 minutes for the system to be available, failing related test cases. reboot, warm-reboot & fast-reboot In the reboot flows including "reboot", "fast-reboot" and "warm-reboot" this failure doesn't have further negative effects since the system has already rebooted. In addition, "warm-reboot" requires the system to be shutdown as soon as possible to meet the GR time restriction of both BGP and LACP. "fast-reboot" also requires to meet the GR time restriction of BGP which is longer than LACP. In this sense, any unnecessary steps should be avoided. It's better to keep those flows untouched. summary To summarize, we have to come up with a way to ensure: shutdown pmon docker ahead of syncd for "config reload" or "service swss restart" flow; don't shutdown pmon docker ahead of syncd for "fast-reboot" or "warm-reboot" flow in order to save time. 
for "reboot" flow, either order is acceptable. Solution To solve the issue, pmon should be stopped ahead of syncd for all flows except for the warm-reboot. - How I did it To stop pmon ahead of syncd stopped. This is done in /usr/local/bin/syncd.sh::stop() and for all shutdown sequence. Now pmon stops ahead of syncd so there must be a way in which pmon can start after syncd started. Another point that should be taken into consideration is that pmon starting should be deferred so that services which have the logic of graceful restart in fast-reboot and warm-reboot have sufficient CPU cycles to meet their deadline. This is done by adding "syncd.service" as "After" to pmon.service and starting pmon in /usr/local/bin/syncd.sh::wait() To start pmon automatically after syncd started.	2019-09-27 10:15:46 +02:00
Stephen Sun	c34a4783e0	[build] install new platform api on host (#3282 ) slave.mk: add SONIC_PLATFORM_API_PY2 as dependency of host sonic_debian_extension.j2: install sonic_daemon_base and Mellanox-specific sonic_platform on host mlnx-platform-api.mk: export mlnx_platform_api_py2_wheel_path for sonic_debian_extension.j2 sonic-daemon-base.mk: export daemon_base_py2_wheel_path for sonic_debian_extension.j2 daemon_base.py: hide unnecessary dependency of swss_common on host	2019-09-25 11:00:24 -07:00
Long Ou	b6a09999de	[hostcfgd] hostcfgd will exit when set hostname in DEVICE_METADATA (#3394 ) Signed-off-by: ouxiaolong <ouxiaolong@asterfusion.com>	2019-09-24 17:36:02 -07:00
Harish Venkatraman	9d2d617264	[SNMP] management VRF SNMP support (#2608 ) * [SNMP] management VRF SNMP support This commit adds SNMP support for Management VRF using l3mdev. The patch included provides VRF support, there is no single "listendevice" configuration, rather multiple agentaddress config options can each have their own "interface" to bind to using "ip%interface". The snmpd.conf file is accordingly generated using the snmp.yml file and redis database info. Adding below the comments of SNMP patch 1376 -------------------------------------------- Since the Linux kernel added support for Virtual Routing and Forwarding (VRF) in version 4.3 (Note: these won't compile on non-linux platforms) https://www.kernel.org/doc/Documentation/networking/vrf.txt Linux users could not use snmpd in its current form to bind specific listening IP addresses to specific VRF devices. A simplified description of a VRF interface is an interface that is a master (a container of sorts) that collects a set of physical interfaces to form a routing table. This set of two patches (one for V5-7-patches and one for V5-8-patches branches) is almost identical to patch single "listendevice" configuration. Rather, multiple agentAddress config options can each have their own "interface" to bind to using the <ip>%<interface> syntax. ------------------------------------------- Signed-off-by: Harish Venkatraman <harish_venkatraman@dell.com>	2019-09-18 17:26:45 -07:00
Prince Sunny	8ca1eb289e	Install Iptables rules to set TCPMSS for 'lo' interface (#3452 ) * Install Iptables rules to set TCPMSS for lo interface * Moved implementation to hostcfgd to maintain at one place	2019-09-18 10:12:28 -07:00
sridhar-ravindran	3c0b56a709	[DELL] S6100 Support PowerCycle in Last Reboot Reason (#3403 ) * [DELL] S6100 Support PowerCycle in Last Reboot Reason * handle first time boot properly * S6000 Last Reboot Reason Fix	2019-09-17 16:51:46 -07:00
Harish Venkatraman	31d1a76197	[baseimage]: Management vrf ntp support (#3204 ) This commit adds NTP support for management VRF using L3mdev. Config vrf add mgmt will enable management VRF, enslave the eth0 device to the master device mgmt, stop ntp service in default, restart interfaces-configs and restart ntp service in mgmt-vrf context. Requirement and design are covered in mgmt vrf design document. Signed-off-by: Harish Venkatraman <harish_venkatraman@dell.com>	2019-09-16 10:21:06 -07:00
padmanarayana	75104bb35d	[sflow]: Build infrastructure changes to support sflow docker and utilities (#3251 ) Introduce a new "sflow" container (if ENABLE_SFLOW is set). The new docker will include: hsflowd : host-sflow based daemon is the sFlow agent psample : Built from libpsample repository. Useful in debugging sampled packets/groups. sflowtool : Locally dump sflow samples (e.g. with a in-unit collector) In case of SONiC-VS, enable psample & act_sample kernel modules. VS' syncd needs iproute2=4.20.0-2~bpo9+1 & libcap2-bin=1:2.25-1 to support tc-sample tc-syncd is provided as a convenience tool for debugging (e.g. tc-syncd filter show ...)	2019-09-14 20:27:09 -07:00
Wenda Ni	81aef6b64c	[Qos] use dot1p to tc mapping for backend switches (#3422 ) * Use dot1p to tc mapping for backend switches Signed-off-by: Wenda Ni <wenni@microsoft.com> * Do not write DSCP to TC mapping into CONFIG_DB or config_db.json for storage switches Signed-off-by: Wenda Ni <wenni@microsoft.com>	2019-09-13 11:28:25 -07:00
Danny Allen	97c675c6d5	[cron.d] Add cron job to periodically clean-up core files (#3449 ) * [cron.d] Create cron job to periodically clean-up core files * Create script to scan /var/core and clean-up older core files * Create cron job to run clean-up script Signed-off-by: Danny Allen <daall@microsoft.com> * Update interval for running cron job * Respond to feedback * Change syslog id	2019-09-13 10:50:31 -07:00
lguohan	95a72b4e39	[baseimage]: fix monit configuration (#3448 ) - monit config broken by one monit upgrade - abandon sed approach since it is susceptible to monit config changes - use unixsocket instead of httpd due to a bug in 5.20.0	2019-09-12 22:48:40 -07:00
lguohan	a1158c6c18	Revert "Use dot1p to tc mapping for backend switches (#3412 )" (#3421 ) This reverts commit `ca43dad12f`.	2019-09-09 14:44:46 -07:00
Joe LeVeque	a27f12773b	[baseimage]: Log message containing SONiC version to syslog at boot (#3416 )	2019-09-09 14:18:23 -07:00
Wenda Ni	ca43dad12f	Use dot1p to tc mapping for backend switches (#3412 ) * Use dot1p to tc mapping for backend switches Signed-off-by: Wenda Ni <wenni@microsoft.com> * Do not write DSCP to TC mapping into CONFIG_DB or config_db.json for storage switches Signed-off-by: Wenda Ni <wenni@microsoft.com>	2019-09-06 11:59:47 -07:00
Danny Allen	cfcf30570b	[build_debian] Include checksum of ASIC config files in SONiC filesystem (#3384 ) [build_debian] Generate checksum of ASIC config files * Adds script to generate checksums for ASIC config files * Adds step to build_debian that copies ASIC config checksum into SONiC filesystem Signed-off-by: Danny Allen <daall@microsoft.com>	2019-09-05 19:41:35 -07:00
Dong Zhang	768beb79e1	create multiple Redis DB instances based on CONFIG at /etc/sonic/database_config.json (#2182 ) this is the first step to moving different databases tables into different database instances in this PR, only handle multiple database instances creation based on user configuration at /etc/sonic/database_config.json we keep current method to create single database instance if no extra/new DATABASE configuration exist in database_config.json file. if user try to configure more db instances at database_config.json , we create those new db instances along with the original db instance existing today. The configuration is as below, later we can add more db related information if needed: { ... "DATABASE": { "redis-db-01" : { "port" : "6380", "database": ["APPL_DB", "STATE_DB"] }, "redis-db-02" : { "port" : "6381", "database":["ASIC_DB"] }, } ... } The detail description is at design doc at Azure/SONiC#271 The main idea is : when database.sh started, we check the configuration and generate corresponding scripts. rc.local service handle old_config copy when loading new images, there is no dependency between rc.local and database service today, for safety and make sure the copy operation are done before database try to read it, we make database service run after rc.local Then database docker started, we check the configuration and generate corresponding scripts/.conf in database docker as well. based on those conf, we create databases instances as required. at last, we ping_pong check database are up and continue  Signed-off-by: Dong Zhang d.zhang@alibaba-inc.com	2019-08-28 11:15:10 -07:00
pavel-shirshov	8facac9149	[Fast-Reboot]: FR mode is active only first 3 minutes after start. (#3352 ) * Fast reboot mode should be enabled only 3 minutes after restart * Advance sonic-quagga submodule	2019-08-19 16:05:20 -07:00
Ying Xie	84b667fbaf	[radv service] radv service should be a cold only dependent of swss (#3348 ) radv should be left alone during warm restart of swss. Otherwise it will announce departure and cause hosts to lose default gateway. Signed-off-by: Ying Xie <ying.xie@microsoft.com>	2019-08-16 12:08:46 -07:00
Ying Xie	d6b4223bdd	[control plane assistant] stop control plane assistant after warm reboot (#3337 ) Delay saving configuration so that the control assistant configurations won't be persisted. Signed-off-by: Ying Xie <ying.xie@microsoft.com>	2019-08-15 00:45:54 -07:00
Renuka Manavalan	fcdf62f5f6	Fix to ensure that tacacs servers are ordered (reverse) by priority in pam.d's config. (#3322 ) Present: Servers are listed in the same order as in redis-db Fix: Save the sort o/p, hence use sorted list to write into pam.d's conf. As well convert priority to integer for use by sort.	2019-08-09 11:46:46 -07:00
Ying Xie	a46df66d05	[service dependent] describe non-warm-reboot dependency outside systemd (#3311 ) * [service dependent] describe non-warm-reboot dependency outside systemctl When dependency was described with systemctl, it will kick in all the time, including under warm reboot/restart scenarios. This is not what we always want. For components that are capable of warm reboot/start, they need to describe dependency in service files. Signed-off-by: Ying Xie <ying.xie@microsoft.com> * [service] teamd service should not require swss service Adding require swss will cause teamd to be killed by systemctl when swss stops. This is not what we want in warm reboot. Signed-off-by: Ying Xie <ying.xie@microsoft.com> * refactoring code * rename functions to match other functions in the file	2019-08-08 15:45:17 -07:00
lguohan	2b28d55853	[build]: enable docker in ram option for small disk device (#3279 ) when device disk is small, do not unzip dockerfs.tar.gz on disk. keep the tar file on the disk, unzip to tmpfs in the initrd phase. enabled this for 7050-qx32 Signed-off-by: Guohan Lu <gulv@microsoft.com>	2019-08-06 23:04:00 -07:00
byu343	6add9445c8	[aboot-image]: Skip arista-hook and arista-convertfs for fast/warm-reboot (#3242 )	2019-07-31 14:20:17 -07:00
Lawrence Lee	7271fe598f	[build]: Move Systemd service start to systemd generator (#3172 ) - What I did Move the enabling of Systemd services from sonic_debian_extension to a new systemd generator - How I did it Create a new systemd generator to manually create symlinks to enable systemd services Add rules/Makefile to build generator Add services to be enabled to /etc/sonic/generated_services.conf to be read by the generator at boot time Signed-off-by: Lawrence Lee <t-lale@microsoft.com>	2019-07-29 15:52:15 -07:00
arheneus@marvell.com	50fe458592	[build]: SONiC buildimage ARM arch support (#2980 ) ARM Architecture support in SONIC make configure platform=[ASIC_VENDOR_ARCH] PLATFORM_ARCH=[ARM_ARCH] SONIC_ARCH: default amd64 armhf - arm32bit arm64 - arm64bit Signed-off-by: Antony Rheneus <arheneus@marvell.com>	2019-07-25 22:06:41 -07:00
Harish Venkatraman	3e69427ac0	[baseimage] management VRF support via l3mdev (#2585 ) This commit adds support for New feature management VRF using L3mdev. Added commands to enable/disable management VRF. Config vrf add mgmt will enable management VRF, enslave the eth0 device to the master device mgmt and restart interfaces-configs in mgmt-vrf context. management interface (eth0) can be configured using config interface eth0 ip add command and removed using config interface eth0 ip remove command. Requirement and design are covered in mgmt vrf design document. Currently show command displays linux command output; will update show command display in next PR after concluding what would be the output for the show commands. Added metric for default routes in dhcp and static, any changes for metric will be addressed subsequently after discussing. Signed-off-by: Harish Venkatraman <harish_venkatraman@dell.com>	2019-07-24 16:18:40 -07:00
Ying Xie	9d64ce761f	[warm reboot] save configuration after warm reboot (#3200 ) * [warm reboot] save configuration after warm reboot After warm reboot, save a copy of in memory database to config_db.json, upgrade procedure might have removed config_db.json to force new image to reload minigraph. However, reload minigraph is skipped during warm reboot. Missing config_db.json would cause device to fault in next non-upgrading cold/fast reboot. Signed-off-by: Ying Xie <ying.xie@microsoft.com> * Update finalize-warmboot.sh	2019-07-24 09:59:47 -07:00
Ying Xie	401f7042a2	Revert "[database] save configuration after DB migration (#3143 )" (#3199 ) This reverts commit `b5a4527cb0`.	2019-07-22 14:13:50 -07:00
rajendra-dendukuri	40c8bc14cd	[baseimage]: Upgrade ifupdown2 to version 1.2.8 (#3180 ) * Upgrade ifupdown2 to version 1.2.8 Required by ZTP to support ZTP over IPv6 transport Signed-off-by: Rajendra Dendukuri <rajendra.dendukuri@broadcom.com>	2019-07-19 23:09:14 -07:00
zzhiyuan	e4c041b57f	[baseimage]: Fix process-reboot-cause possibly throwing OSError (#3159 ) In case of going from previous iteration of SONiC, and the last reboot was hardware, REBOOT_CAUSE_FILE may not be present and the service may throw an error.	2019-07-16 08:34:11 -07:00
Ying Xie	b5a4527cb0	[database] save configuration after DB migration (#3143 ) - Make sure that migrated DB contents persisted for next boot - Make sure that db saved after warm reboot. Signed-off-by: Ying Xie <ying.xie@microsoft.com>	2019-07-15 20:21:02 -07:00
Stepan Blyshchak	59117d23f0	[swss.sh]: Cleanup LAG entries in STATE DB (#3114 ) Signed-off-by: Stepan Blyschak <stepanb@mellanox.com>	2019-07-08 17:29:57 -07:00
Joe LeVeque	5e2ab9dd03	[process-reboot-cause] Handle case if platform does not yet have sonic_platform implementation (#3126 )	2019-07-05 17:53:49 -07:00
Renuka Manavalan	76bf5a0bc4	[build]: Added debug symbols to many debug dockers. (#3098 ) * Added debug symbols to many debug dockers. * For debug images only: 1) Archive source files into debug image 2) Archived source is copied into /src 3) Created an empty dir /debug 4) Mount both /src as ro & /debug as rw into every docker 5) Login banner will give some details on /src & /debug 6) Devs can copy core file into /debug and view it from inside a container. 7) Dev may create all gdb logs and other data directly into /debug. * Dropped redundant REDIS_TOOLS per review comments. * Added debug symbols to frr package and hence FRR based BGP docker. * 1) Moved dbg_files.sh to scripts/ 2) Src directories to archive are now collected from individual Makefiles. 3) Added few more debug symbols 4) Added few more debug dockers. Here after no more changes except per review comments. To debug: Install required version of debug image in Switch or VM. Copy core file into /debug of host Get into Docker gdb /usr/bin/<daemon> -c /debug/<your core file> set directory /src/... <-- inside gdb to get the source For non-in-depth debugging: Download corresponding debug Docker image (docker-...-dbg.gz) to your VM Load the image Run image with entrypoint as 'bash' with dir containing core mapped in. Run gdb on the core.	2019-07-03 22:13:55 -07:00
Joe LeVeque	e5a2beb13b	[reboot-cause]: Move reboot cause processing to its own service, 'process-reboot-cause' (#3102 )	2019-07-03 10:38:20 -07:00
Michel Moriniaux	dc747247d1	[ARISTA] adding 7060_cs32s to eMMC exclusions (#2982 ) * [ARISTA] adding 7060_cs32s to eMMC exclusions Following PR 2774 we added the 7060-cx32s according to the guidelines of PR 2780 This adds the 7060-cx32s to the list of devices that mount /var/log as a tmpfs to mitigate eMMC wearout Signed-off-by: Michel Moriniaux <m.moriniaux@criteo.com>	2019-07-02 11:52:43 -07:00
Stepan Blyshchak	6961816dec	fix fast reboot compatibility (#3083 ) * fix fast reboot compatibility We should handle both cases for backward-compatible with 201803: - fast-reboot - SONIC_BOOT_TYPE=fast-reboot * handle review comments * add a comment that getBootType code snippet is shared between two files	2019-06-26 12:46:58 -07:00
Jipan Yang	9a1bebe496	[telemetry]: change the service dependency from swss to database (#3072 ) Signed-off-by: Jipan Yang <jipan.yang@alibaba-inc.com>	2019-06-24 12:36:16 -07:00
Joe LeVeque	319d854e46	[baseimage]: Increase TMOUT for serial port connections to 15 minutes (#3032 ) Increase TMOUT value in order to close inactive serial console connections after 900 seconds (15 minutes) of inactivity	2019-06-19 00:16:01 -07:00
Qi Luo	e7b1988638	[submodule] update sonic-linux-kernel (#2985 ) * [submodule] update sonic-linux-kernel * update linux kernel version * Fix many version strings * update mellanox components (built with new kernel) * [mlnx] add make files for SDK WJH libs * Update arista driver submodule (#8) Make the debian packaging point to a newer kernel version.	2019-06-18 10:00:16 -07:00
Kebo Liu	c927517355	[Mellanox] Inject SDK libs dependency to pmon on Mellanox platform (#3000 ) * inject sdk libs to pmon * fix wrong code	2019-06-14 17:38:24 -07:00
lguohan	8f6ae90cba	[docker]: get hostname from config db instead of minigraph (#3004 ) minigraph may not be always available on the some system configuration. Should use config db as the source of truth.	2019-06-13 22:24:09 -07:00
Renuka Manavalan	cdca062693	[build]: Build sonic-broadcom.bin using debug dockers for all stretch based dockers (#2833 ) * Updated Makefile infrastructure to build debug images. As a sample, platform/broadcom/docker-orchagent-brcm.mk is updated to add a docker-orchagent-brcm-dbg.gz target. Now "BLDENV=stretch make target/docker-orchagent-brcm-dbg.gz" will build the debug image. NOTE: If you don't specify NOSTRETCH=1, it implicitly calls "make stretch", which builds all stretch targets and that would include debug dockers too. This debug image can be used in any linux box to inspect core file. If your module's external dependency can be suitably mocked, you may even manually run it inside. "docker run -it --entrypoint=/bin/bash e47a8fb8ed38" You may map the core file path to this docker run. * Dropped the regular binary using DBG_PACKAGES and a small name change to help readability. * Tweaked the changes to retain the existing behavior w.r.t INSTALL_DEBUG_TOOLS=y. When this change ('building debug docker image transparently') is extended to all dockers, this flag would become redundant. Yet, there can be some test based use cases that rely on this flag. Until after all the dockers get their debug images by default and we switch all use cases of this flag to use the newly built debug images, we need to maintain the existing behavior. * 1) slave.mk - Dropped unused Docker build args 2) Debug template builder: renamed build_dbg_j2.sh to build_debug_docker_j2.sh 3) Dropped insignificant statement CMD from debug Docker file, as base docker has Entrypoint. * Reverted some changes, per review comments. "User, uid, guid, frr-uid & frr-guid" are required for all docker images, with exception of debug images. * Get in sync with the new update that filters out dockers to be built (SONIC_STRETCH_DOCKERS_FOR_INSTALLERS) and build debug-dockers only for those to be built and debug target is available. * Make a template for each target that can be shared by all platforms. 
Where needed a platform entry can override the template. This avoids duplication, hence easier to maintain. * A small change, that can fit better with other targets too. Just take the platform code and do the rest in template. * Extended debug to all stretch based docker images * 1) Combined all orchagent makefiles into one platform independent make under rules/docker-orchagent.mk 2) Extended debug image to all stretch dockers * Changes per review comments: 1) Dropped LIBSAIREDIS_DBG from database, teamd, router-advertiser, telemetry, and platform-monitor docker.mk files from _DBG_DEPENDS list 2) W.r.t docker make for syncd, moved DEPENDS from template to specific makefile and let the template have stuff that is applicable to all. 1) Corrected a copy/paste mistake * Fixed a copy/paste bug * The base syncd dockers follow a template, which defines the base docker as DOCKER_SYNCD_BASE instead of DOCKER_SYNCD_<platform code>. Fix the docker-syncd-<mlnx, bfn>.mk to use the new one. [Yet to be tested locally] * Fixed spelling mistake * Enable build of dbg-sonic-broadcom.bin, which uses dbg-dockers in place of regular dockers, for dockers that build debug version. For dockers that do not build debug version, it uses the regular docker. This debug bin is installable and usable in a DUT, just like a regular bin. * Per review comments: 1) Share a single rule for final image for normal & debug flavors (e.g. sonic-broadcom.bin & sonic-broadcom-dbg.bin) 2) Put dbg as suffix in final image name. 3) Compared target/sonic-broadcom.bin.logs with & w/o fix to verify integrity of sonic-broadcom.bin 4) Compared target/sonic-broadcom.bin.logs with sonic-broadcom-dbg.bin.log for verification This fix takes care of ONIE image only. The next PR will cover the rest. The next PR, will also make debug image conditional with flag. * Updated per comments. Now that debug dockers are available, do not need a way to install debug symbols in regular dockers. 
With this commit, when INSTALL_DEBUG_TOOLS=y is set, it builds debug dockers (for dockers that enable debug build) and the final image uses debug dockers. For dockers that do not enable debug build, regular dockers get used in the final image. Note: The debug dockers are explicitly named as <docker name>-dbg.gz. But there is no "-dbg" suffix for image. Hence if you make two runs with and w/o INSTALL_DEBUG_TOOLS=y, you have complete set of regular dockers + debug dockers. But the image gets overwritten. Hence if both regular & debug images are needed, make two runs, as one with INSTALL_DEBUG_TOOLS=y and one w/o. Make sure to copy/rename the final image, before making the second run.	2019-06-12 01:36:21 -07:00
Prince Sunny	231d309b69	Generate interface table to have an entry designated to default VRF. (#2848 ) * Generate default VRF table for router interfaces * Updated jinja2 template to have prefix filter	2019-06-10 14:02:55 -07:00
Myron Sosyak	3ec95e17c8	[build_templates] [hostcfgd] Keep containers hostname up to date (#2924 ) * Add updateHostName function to docker_image_ctl.j2 * Add hostname specification on container creating step * Add listener for hostname changes in hostcfgd Signed-off-by: Myron Sosyak <msosyak@barefootnetworks.com>	2019-06-06 00:41:30 -07:00
Kebo Liu	bd519322cb	[Mellanox] Expose SDK share buffer and unix socket from syncd (#2951 ) * expose SDK share buffer and unix socket from syncd * fix PR comments * fix community comments and add TODO	2019-06-05 11:19:56 -07:00
Nazarii Hnydyn	e041b15d10	[mellanox]: Fixed config reload race. (#2930 ) Signed-off-by: Nazarii Hnydyn <nazariig@mellanox.com>	2019-05-29 09:57:29 +03:00
lguohan	30b37ec6fb	[build]: make sonic-slave-stretch as the default build docker (#2921 ) Signed-off-by: Guohan Lu <gulv@microsoft.com>	2019-05-27 15:50:51 -07:00
Joe LeVeque	3ec3e20e5a	[logrotate] Enhance robustness (#2942 ) * [logrotate] Decrease frequency to every 10 minutes; kill any lingering logrotate processes * [logrotate] Delete all *.1.gz files as firstaction; Remove note about init-system-helpers < 1.47 workaround However, continue to send SIGHUP directly to rsyslogd process because 'service rsyslog rotate' still doesn't work properly with init-system-helpers version 1.48	2019-05-25 18:00:18 -07:00
Stepan Blyshchak	9523e64666	[swss.sh] flush FDB table during cold start (#2933 ) Signed-off-by: Stepan Blyschak <stepanb@mellanox.com>	2019-05-22 22:07:29 -07:00
Ying Xie	222706120d	[updategraph] set DB version after minigraph reload (#2917 ) Signed-off-by: Ying Xie <ying.xie@microsoft.com>	2019-05-18 22:08:41 -07:00
Samuel Angebault	aac0c24312	[device/Arista] Add support for the 7280CR3-32P4 (#2910 ) * Add boot0 support for the 7280CR3 * Add platform and plugins for 7280CR3 * Add port config for 7280CR3 * Add platform_reboot for 7280CR3 * Add support for 7280CR3-32D4 based on the 7280CR3-32P4 * Update arista driver submodules - Introduce new 7280CR3-32P4 - Improve to the led plugin for OSFP	2019-05-18 10:34:07 -07:00
Samuel Angebault	77cde50541	[device/Arista] Improvements to the boot of Arista devices. (#2898 ) * Fix showing systemd shutdown sequence when verbose is set * Fix creation of kernel-cmdline file Sometimes boot0 prints error "mv: can't preserve ownership of '/mnt/flash/image-arsonic.xxxx/kernel-cmdline': Operation not permitted" * Improve flash space usage during installation Some older systems only have 2GB of flash available. Installing a second image on these can prove to be challenging. The new installation process moves the installer swi to memory in order to free up space on the flash before uncompressing it there. It removes the flash space usage spike and also improves the IO since the installation is no longer reading from and writing to the flash at the same time. * Add support of 7060CX-32S-SSD * 7260CX3: use inventory powerCycle procedures * 7050QX-32S: use inventory powerCycle procedures * 7050QX-32: use inventory powerCycle procedures * platform: arista: add common platform_reboot Replace platform_reboot by a link to new common for devices already using a similar script. * 7060CX-32S: use inventory powerCycle procedures * Install python smbus in pmon Some platform plugins need the python smbus library to perform some actions. This installs the dependency.	2019-05-15 12:45:05 -07:00
Renuka Manavalan	a357693f52	[tacacs]: skip accessing tacacs servers for local non-tacacs users (#2843 ) * Switch the nss look up order as "compat" followed by "tacplus". This helps use the legacy passwd file for user info and go to tacacs only if not found. This means, we never contact tacacs for local users like "admin". This isolates local users from any issues with tacacs servers. W/o this fix, the sudo commands by local users could take <count of servers> * <tacacs timeout> seconds, if the tacacs servers are unreachable. * Skip tacacs server access for local non-tacacs users. Revert the order of 'compat tacplus' to original 'tacplus compat' as tacplus access is required for all tacacs users, who also get created locally.	2019-05-09 14:36:32 -07:00
Ying Xie	9efcf1759a	[ebtables] install ebtables in base image and install filter rules (#2805 ) - Add ebtables package, and install some filter rules: 1. ebtables -A FORWARD -d BGA -j DROP 2. ebtables -A FORWARD -p ARP -j DROP Basically, we let the ARP packets in the VLAN be forwarded by the ASIC; the kernel gets a copy of these ARP packets and the forwarding from the kernel gets dropped. So there is always only one copy of ARP/response in the VLAN. Signed-off-by: Ying Xie <ying.xie@microsoft.com>	2019-05-09 09:44:41 -07:00
lguohan	5fb185cd83	[docker-frr]: bring quagga docker features to frr docker (#2870 ) - use supervisord to manage process in frr docker - intro separated configuration mode for frr - bring quagga configuration template to frr. Signed-off-by: Guohan Lu <gulv@microsoft.com>	2019-05-08 23:00:49 -07:00
Joe LeVeque	6eca27e564	[services] Restart SwSS service upon unexpected critical process exit (#2845 ) * [service] Restart SwSS Docker container if orchagent exits unexpectedly * Configure systemd to stop restarting swss if it attempts to restart more than 3 times in 20 minutes * Move supervisor-proc-exit-listener script * [docker-dhcp-relay] Enhance wait_for_intf.sh.j2 to utilize STATEDB * Ensure dependent services stop/start/restart with SwSS * Change 'StartLimitInterval' to 'StartLimitIntervalSec', as Stretch installs systemd 232 (>= v230) * Also update journald.conf options * Remove 'PartOf' option from unit files * Add '$(SUPERVISOR_PROC_EXIT_LISTENER_SCRIPT)' to new shared docker-orchagent makefile * Make supervisor-proc-exit-listener script read from 'critical_processes' file inside container * Update critical_processes file for swss container	2019-05-01 08:02:38 -07:00
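The restart limit described in the entry above (stop restarting swss after more than 3 restart attempts in 20 minutes) can be illustrated with a small sliding-window model. This is only a sketch of the idea, not systemd's actual rate-limiting code; the class name `RestartLimiter` and its parameters are hypothetical and chosen to mirror the 3-in-20-minutes figure from the commit message.

```python
from collections import deque


class RestartLimiter:
    """Simplified model of a start limit: at most `burst` starts per window."""

    def __init__(self, burst=3, interval_sec=20 * 60):
        self.burst = burst
        self.interval_sec = interval_sec
        self._starts = deque()  # timestamps of recent permitted starts

    def allow(self, now):
        """Return True if a (re)start at time `now` (seconds) is permitted."""
        # Forget starts that have fallen out of the interval window.
        while self._starts and now - self._starts[0] >= self.interval_sec:
            self._starts.popleft()
        if len(self._starts) >= self.burst:
            return False  # limit hit: the service would stay down
        self._starts.append(now)
        return True
```

With the default values, three starts inside the window are allowed and a fourth is refused until the oldest one ages out.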
Joe LeVeque	2736da97c7	[sudoers] Add /usr/bin/teamshow to READ_ONLY_CMDS (#2846 )	2019-05-01 08:01:44 -07:00
Ying Xie	6431248243	[db migrator] migrate the DB to latest schema when needed (#2808 ) Signed-off-by: Ying Xie <ying.xie@microsoft.com>	2019-04-30 14:46:18 -07:00
Qi Luo	6b3a26f0cc	Remove unused packages in docker images and host (#2807 ) * Remove unneeded packages in docker images and host * Remove libpython3.6 from snmp docker image	2019-04-29 17:21:24 -07:00
Ying Xie	c7af19a4db	[teamd service] start teamd service after swss (#2829 ) SWSS clears DB tables, if teamd is not started after swss, there is a race condition that swss might clear vital teamd information. Signed-off-by: Ying Xie <ying.xie@microsoft.com>	2019-04-26 15:12:33 -07:00
Andriy Moroz	ca7924eb27	Increase syncd start timeout (#2776 ) * Increase syncd start timeout Signed-off-by: Andriy Moroz <c_andriym@mellanox.com> * Replace TimeoutSec to TimeoutStartSec Signed-off-by: Andriy Moroz <c_andriym@mellanox.com>	2019-04-24 17:51:26 +03:00
zhenggen-xu	75964ef243	[baseimage]: Add fstrim service and fstrim timer by default (#2804 ) This service (weekly) will let the SSD firmware do garbage collection after the file system deletes files. It could avoid slowness or even READ-ONLY error due to SSD not being able to free the pages even though the file system thinks there was a lot of space left. Signed-off-by: Zhenggen Xu <zxu@linkedin.com>	2019-04-21 14:21:16 -07:00
Stepan Blyshchak	6a4ffef1fd	[snmp.service] Make swss.service a requisite (#2790 )	2019-04-16 18:32:36 -07:00
Ying Xie	8bf9247c5e	[tmpfs var/log] mount /var/log as tmpfs for some platforms (#2780 ) SONiC is a heavy writer to /var/log partition, we noticed that this behavior causes certain flash drives to become read-only over time. To avoid this issue, we mount /var/log partition on these devices as tmpfs. - Mount /var/log as tmpfs - /var/log default size is 128M - Adjust size according to existing var-log.ext4 file size. - Adjust size to between 5% and 10% of total memory size. Signed-off-by: Ying Xie <ying.xie@microsoft.com>	2019-04-14 22:46:26 -07:00
Ying Xie	f583f57af6	[service] add warmboot finalizer service (#2715 ) After warm reboot is done, we need to disable warm reboot flag and tear down anything setup for warm reboot and persisted across. Signed-off-by: Ying Xie <ying.xie@microsoft.com>	2019-04-12 15:45:58 -07:00
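The sizing rules listed in the /var/log tmpfs entry above could be combined roughly as follows. This is a minimal sketch under stated assumptions, not the script from the commit: the function name `var_log_tmpfs_size`, the order in which the rules are applied, and the use of byte counts are all hypothetical.

```python
MIB = 1024 * 1024
DEFAULT_SIZE = 128 * MIB  # default /var/log size stated in the commit message


def var_log_tmpfs_size(total_mem_bytes, existing_ext4_bytes=None):
    """Return a tmpfs size in bytes for /var/log (hypothetical helper)."""
    # Start from the existing var-log.ext4 size if known, else the default.
    size = existing_ext4_bytes if existing_ext4_bytes else DEFAULT_SIZE
    # Keep the result between 5% and 10% of total memory.
    low = total_mem_bytes * 5 // 100
    high = total_mem_bytes * 10 // 100
    return max(low, min(size, high))
```

On a device with 4 GiB of memory and no existing var-log.ext4 file, the 128M default falls below the 5% floor, so this sketch would return the floor instead.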
Renuka Manavalan	6d7ecc426c	[hostcfgd] -- Fix the default for failthrough as false. This implies that by default, if TACACS is configured properly and it reported auth_err, then don't try fail through to traditional unix authentication through /etc/passwd. If this failthrough is intended, make it explicit through "sudo config aaa authentication failthrough enable" Removed an unused variable "aaa.fallback" Tested manually. Note the presence of 'auth_err=die' in all cases except when failthrough is explicitly enabled. admin@str-s6000-acs-13:~$ sudo config aaa authentication failthrough default; date Wed Apr 3 23:05:18 UTC 2019 admin@str-s6000-acs-13:~$ ls -lrt /etc/pam.d/common-auth-sonic ; grep 123 /etc/pam.d/common-auth-sonic -rw-r--r-- 1 root root 1316 Apr 3 23:05 /etc/pam.d/common-auth-sonic auth [success=done new_authtok_reqd=done default=ignore auth_err=die] pam_tacplus.so server=100.127.20.22:49 secret=testing123 login=login timeout=5 try_first_pass auth [success=done new_authtok_reqd=done default=ignore auth_err=die] pam_tacplus.so server=100.127.20.21:49 secret=testing123 login=login timeout=5 try_first_pass admin@str-s6000-acs-13:~$ sudo config aaa authentication failthrough enable; date ; h4 "AAA\|authentication" Wed Apr 3 23:06:37 UTC 2019 admin@str-s6000-acs-13:~$ ls -lrt /etc/pam.d/common-auth-sonic ; grep 123 /etc/pam.d/common-auth-sonic -rw-r--r-- 1 root root 1294 Apr 3 23:06 /etc/pam.d/common-auth-sonic auth [success=done new_authtok_reqd=done default=ignore] pam_tacplus.so server=100.127.20.22:49 secret=testing123 login=login timeout=5 try_first_pass auth [success=done new_authtok_reqd=done default=ignore] pam_tacplus.so server=100.127.20.21:49 secret=testing123 login=login timeout=5 try_first_pass admin@str-s6000-acs-13:~$ sudo config aaa authentication failthrough disable; date ; h4 "AAA\|authentication" Wed Apr 3 23:07:09 UTC 2019 admin@str-s6000-acs-13:~$ ls -lrt /etc/pam.d/common-auth-sonic ; grep 123 /etc/pam.d/common-auth-sonic -rw-r--r-- 1 
root root 1321 Apr 3 23:07 /etc/pam.d/common-auth-sonic auth [success=done new_authtok_reqd=done default=ignore auth_err=die] pam_tacplus.so server=100.127.20.22:49 secret=testing123 login=login timeout=5 try_first_pass auth [success=done new_authtok_reqd=done default=ignore auth_err=die] pam_tacplus.so server=100.127.20.21:49 secret=testing123 login=login timeout=5 try_first_pass	2019-04-03 23:16:56 +00:00
Ying Xie	00a0f22f38	Revert "[teamd service] teamd service should start after syncd (#2724 )" (#2733 ) This reverts commit `0d1efb131c`.	2019-04-03 08:20:44 -07:00
paavaanan	b56124bf48	removing dhcp-turn-off option from initrd (#2555 ) * removing dhcp changes from initrd * removing mgmt-intf-dhcp file	2019-04-02 15:48:04 -07:00
Ying Xie	0d1efb131c	[teamd service] teamd service should start after syncd (#2724 ) * [teamd service] teamd service should start after syncd Signed-off-by: Ying Xie <ying.xie@microsoft.com> * combine after lines	2019-04-01 15:40:22 -07:00
Qi Luo	9c83b5480d	[security] Do not generate ssh server keys for non RSA protocols (#2718 )	2019-03-29 15:27:33 -07:00
Ying Xie	698b248a13	[docker script] skip docker mount point checking for database container (#2683 ) database container doesn't mount hwsku folder. Signed-off-by: Ying Xie <ying.xie@microsoft.com>	2019-03-19 20:14:07 -07:00
Renuka Manavalan	ae05579c67	[baseos]: Install ipaddress python package that has deprecated current ipaddr. … (#2674 ) * Install ipaddress python package that has deprecated current ipaddr. ipaddress has backport to python2.7 * Install python ipaddress module as required by route_check.py sonic utility. BTW, ipaddress deprecates ipaddr and ipaddress has python2 backport * Revert the old change per review comments. Signed-off-by: Renuka Manavalan <remanava@microsoft.com>	2019-03-18 11:12:47 -07:00
Pavlo Yadvichuk	11c2e9ee3d	[barefoot]: Allow configuration of platform-specific interfaces used for internal purposes (#2631 ) - Why it is required since SONiC master switches ifupdown package to the new implementation (ifupdown2), it is required to change the configuration of a platform-specific interface for wedge100bf_32x and wedge100bf_65x platforms (because ifupdown2 doesn't support auto mode for inet6 protocol). Also, need to make some refactoring and remove if platform == smth then.. from the system level scripts. - What I did removed customization of /usr/bin/interfaces-config.sh explicitly created directory /etc/network/interfaces.d added "source" to the /etc/network/interfaces generation template (to include platform-specific interfaces processing) added platform-specific interfaces config itself (for wedge100bf_32x and wedge100bf_65x) fixed testcase in sonic-config-engine - How to verify it build image for wedge100bf_32x perform sudo config reload -y on new installation check the correct configuration of usb0 interface - Description for the changelog Allow configuration of platform-specific interfaces	2019-03-09 06:22:32 -08:00
Joe LeVeque	2bb5400948	[services] Services which start containers now use 'docker wait' instead of 'docker attach' (#2661 )	2019-03-08 10:59:41 -08:00
Wenda Ni	f9c9fa8ba1	[qos]: Map tc 1, 2, 5, and 6 back to pg 0 (#2650 ) Lossy traffic does not need to be mapped to different ingress PGs. They can all share the same ingress PG. Signed-off-by: Wenda Ni <wenni@microsoft.com>	2019-03-08 02:23:32 -08:00
Nazarii Hnydyn	b22fe37670	[mellanox]: Upgraded hw-management V.2.0.0160. (#2643 ) Signed-off-by: Nazarii Hnydyn <nazariig@mellanox.com>	2019-03-06 18:51:46 -08:00
Wenda Ni	784bf77a92	Add hook to allow customizing link cable lengths Signed-off-by: Wenda Ni <wenni@microsoft.com>	2019-03-05 22:06:00 +00:00
Ying Xie	66f5202b9f	[swss/syncd] cold start syncd service in swss in attach method (#2639 ) start() is called by service startPre method, which is blocking. Starting syncd service here is causing deadlock. attach() is called by service start method, which is non-blocking. Signed-off-by: Ying Xie <ying.xie@microsoft.com>	2019-03-04 16:46:55 -08:00
RAMA CHANDRA REDDY GADDAM	b9edb7153d	[aaa] Fix common-auth-sonic.j2 template issue (#2613 )	2019-03-02 15:36:35 -08:00
Joe LeVeque	5eb7872a07	[services] Ensure swss and syncd services start before dependent services (#2634 ) * [services] Ensure swss and syncd services start before dependent services * Add 'attach' functions to scripts which get installed to /usr/local/bin so that services only reference the one script each * Add 'After=swss.service' to syncd.service	2019-03-02 15:28:34 -08:00
yurypm	d632569a6a	Add initramfs hook for Arista devices (#2595 ) We are going to use initramfs hook for firmware upgrades To install Arista hook: - create folder /mnt/flash/<image dir>/platform/hooks/boot1/ from Aboot or /host/<image dir>/platform/hooks/boot1/ from Sonic - add executable script to created folder	2019-02-27 10:28:04 -08:00
Ying Xie	3086f4f391	Revert "[baseimage] Delay ntp-config service to start after 5 minutes (#2494 )" (#2590 ) This reverts commit `33fe8d298e`.	2019-02-21 10:04:54 -08:00
Nikos	1158277533	[frr]: staticd terminating due to inadequate permissions (#2580 ) Signed-off-by: nikos <ntriantafillis@gmail.com>	2019-02-19 21:50:19 -08:00
lguohan	572db1e0a9	[swss]: flush asic db in swss.sh for non warm-boot (#2582 ) need to flush asic db in swss.sh instead of syncd.sh. orchagent might have already started in swss.sh and put commands into asic db before asic db is flushed in syncd.sh. This causes race condition such as INIT_VIEW not passing to syncd. Signed-off-by: Guohan Lu <gulv@microsoft.com>	2019-02-19 21:48:43 -08:00
Jipan Yang	ff74daaf13	Move warm_restart enable/disable config to stateDB WARM_RESTART_ENABLE_TABLE (#2538 ) Signed-off-by: Jipan Yang <jipan.yang@alibaba-inc.com>	2019-02-19 17:06:56 -08:00
Renuka Manavalan	fa7c46611e	[hostcfgd]: Promote logs for update-notifications-from-DB from DEBUG to INFO (#2576 ) * Add a log message for each notification of add/del TACACS server. Signed-off-by: Renuka Manavalan <remanava@microsoft.com> * Moved another syslog message from DEBUG to INFO to be able to see those notifications. All these changes are to help with a one-time-seen-bug, that hostcfgd did not act upon changes to redis for TACACS servers. We could not repro the bug. Signed-off-by: Renuka Manavalan <remanava@microsoft.com>	2019-02-16 10:17:13 -08:00
Stepan Blyshchak	2dd769bf46	[syncd.sh] Don't stop sxdkernel during warm shutdown on Mellanox platform (#2572 ) /etc/init.d/sxdkernel stop may take up to 15 sec which has impact on control plane downtime Signed-off-by: Stepan Blyschak <stepanb@mellanox.com>	2019-02-15 16:08:08 -08:00
Nazarii Hnydyn	d53df059d4	[devices]: Added new SN3700/SN3700C Mellanox platforms (#2548 ) * [mlnx-msn3700]: Added MSN3700 platform. Signed-off-by: Nazarii Hnydyn <nazariig@mellanox.com> * [mlnx-msn3700]: Upgrade FW burn: use ASIC auto detect. Signed-off-by: Nazarii Hnydyn <nazariig@mellanox.com> * [mlnx-msn3700]: Updated HW-MGMT/FW/MFT/SAI/SDK. Signed-off-by: Nazarii Hnydyn <nazariig@mellanox.com> * [mlnx-msn3700]: Added MSN3700C platform. Signed-off-by: Nazarii Hnydyn <nazariig@mellanox.com>	2019-02-13 23:08:04 -08:00
Ying Xie	44551d0fb5	[swss/syncd] log swss/syncd service script activities (#2545 ) Signed-off-by: Ying Xie <ying.xie@microsoft.com>	2019-02-10 11:56:31 -08:00
zzhiyuan	6037707abc	[devices]: Add device data for Arista 7060PX/DX4-32 (#2534 ) * Add boot0 definition for Arista 7060PX4-32 and 7060DX4-32 * Add port configuration for Arista 7060PX4-32 * Add plugins for Arista 7060PX4-32 * Add platform_reboot for Arista 7060PX4-32 * Add Arista 7060DX4-32 as symlink of 7060PX4-32 * Add sensors configuration and fancontrol for Arista 7060PX4-32 * Update arista-driver submodules for barefoot/broadcom * Add platform_reboot script for Alhambra * Rook fancontrol CPLD rename	2019-02-08 22:02:01 -08:00
Nadiia Stetskovych	bb5a171ffc	[minigraph]: Do not fail for minigraphs which do not have neighbors listed in <Devices> section (#2522 ) Signed-off-by: Nadiya.Stetskovych <nstetskovych@barefootnetworks.com>	2019-02-04 22:43:08 -08:00
lguohan	f20665008c	[build]: put stretch debian packages under target/debs/stretch/ (#2519 ) * [build]: put stretch debian packages under target/debs/stretch/ * in stretch build phase, all debian packages built in that stage are placed under target/debs/stretch directory. * for python-based debian packages, since they are really the same for jessie and stretch, they are placed under target/python-debs directory. Signed-off-by: Guohan Lu <gulv@microsoft.com>	2019-02-04 22:06:37 -08:00
zhenggen-xu	982eddfaa4	[updategraph] After system upgrade, restore files/directories with original attributes etc. (#2368 ) * [updategraph] After system upgrade, restore files/directories with original attributes etc. Restore a few more files that was missed before. Restore FRR configuration directory if exists on old system Signed-off-by: Zhenggen Xu <zxu@linkedin.com> * Removed deployment_id_asn_map.yml from copy list Signed-off-by: Zhenggen Xu <zxu@linkedin.com>	2019-02-02 12:50:19 -08:00
lguohan	9c2d7240ea	[vs]: Force10-S6000 buffer settings for virtual switch (#2515 ) Signed-off-by: Guohan Lu <gulv@microsoft.com>	2019-02-01 11:18:02 -08:00
Prince Sunny	39e12a1d82	[swss]: Change VrfMgrd startup order, cleanup VRF_TABLE from state DB (#2510 )	2019-01-31 23:28:31 -08:00
Wenda Ni	58adf06cc0	[QoS]: Link pg 2 and 6 to lossy buffer profile (#2511 ) * Link pg 2 and 6 to lossy buffer profile Signed-off-by: Wenda <wenni@microsoft.com>	2019-01-31 23:27:58 -08:00

668 Commits