Delay CPU intensive services at boot
- How I did it
Made snmp.timer work and added telemetry.timer.
But this is not enough because it breaks the existing snmp dependency on swss.
So, in this solution the snmp timer is wanted by the swss service. However, since the OnBootSec timer expires only once, it will not trigger the snmp service again, so I added the line "OnUnitActiveSec=0 sec", which starts the snmp service based on the last time it was active. On boot only OnBootSec expires; on swss start/restart only the second timer expires, immediately, and triggers the snmp service.
However, the snmp service will not stop after "systemctl stop snmp" because of the second timer, which will always expire when the snmp service becomes unavailable.
So there is a conflict which will be handled by systemd if we add "Conflicts=" line to both snmp.service and snmp.timer.
So during boot:
snmp does not start by default
swss starts and starts snmp timer
OnUnitActiveSec=0 does not expire since there is no snmp active
OnBootSec expires and starts snmp service and snmp timer gets stopped
During "systemctl restart swss"
snmp stops because of Requisite on swss
snmp unblocks snmp timer from running
swss starts and starts snmp timer
OnUnitActiveSec=0 expires immediately and starts snmp, which stops the snmp timer
During "systemctl stop snmp"
stop of snmp service unblocks snmp timer but no one starts the timer so it is not started by "OnUnitActiveSec=0"
Put a flag for fast-reboot into the db using the EXPIRE feature. This flag is used in other parts of SONiC to start in fast-reboot mode. If we reload a config, the state in the db will be removed.
Fix the issue when an SFP module is plugged into a QSFP port via an adapter.
- How I did it
Originally the type of an SFP module is determined according to the SKU dictionary. However, it's possible that an SFP module is plugged into a QSFP port via an adapter. In this case, the EEPROM content will be parsed in the wrong format.
To address that we fetch the identifier value of an xSFP module and then get the type by parsing it.
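As a rough illustration of the idea (not the actual platform code), the sketch below reads the identifier byte at offset 0 of the module EEPROM and maps it to a module type using identifier values from SFF-8024. The function name `detect_module_type` and the `default_type` fallback (standing in for the SKU-derived type) are hypothetical.

```python
# Identifier values defined by SFF-8024 (byte 0 of the module EEPROM).
SFF8024_IDENTIFIER_TO_TYPE = {
    0x03: "SFP",   # SFP / SFP+ / SFP28
    0x0C: "QSFP",  # QSFP
    0x0D: "QSFP",  # QSFP+
    0x11: "QSFP",  # QSFP28
}


def detect_module_type(eeprom: bytes, default_type: str) -> str:
    """Return the module type implied by the identifier byte.

    `default_type` is a hypothetical stand-in for the type taken from the
    SKU dictionary; it is used when the EEPROM is empty or the identifier
    is not in the table above.
    """
    if not eeprom:
        return default_type
    return SFF8024_IDENTIFIER_TO_TYPE.get(eeprom[0], default_type)
```

With this approach an SFP module behind an adapter in a QSFP port reports identifier `0x03`, so its EEPROM would be parsed with the SFP layout even though the SKU says the port is QSFP.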
* [device][accton]: Update AS7326-56X to comply with the latest Broadcom SAI version 3.5.3.1m-26
* Update Accton-AS7326-56X to adapt xxx.config.bcm based on the latest update of Device-Specific File Directory Structure.
* Update Accton-AS7326-56X LED BIN to comply with the latest Broadcom SAI version 3.5.3.1m-26
Signed-off-by: polly_hsu@edge-core.com
* [device][accton]: Merge the SDK config with #3103 (Fix Accton as7326 port breakout)
Signed-off-by: Polly Hsu <pollyhsu2git@gmail.com>
The sflow service should not start unless the swss service is started. However, if swss is not started, the sflow service should not attempt to start it; instead it should simply fail to start. Using Requisite=, we will achieve this behavior, whereas using Requires= will cause the required service to be started.
ASIC reset events are captured by hw-mgmt, and hw-mgmt calls chipup/chipdown internally without OS interaction
Signed-off-by: Stepan Blyschak <stepanb@mellanox.com>
* Updates per review comments
1) core_uploader service waits for syslog.service
2) core_uploader service enabled for restart on failure
3) Use mtime instead of file size + ample time to be robust.
* Avoid reloading already uploaded file, by marking the names with a prefix.
* Updated failing path.
1) If the rc file is missing or required data is missing, it periodically logs an error in a forever loop.
2) If upload fails, retry every hour with an error log, forever.
* Fix a few bugs
* The binary update_json.py will come from sonic-utilities.
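The mtime check and the prefix marking described above could look roughly like the following. This is a minimal sketch under stated assumptions, not the uploader's actual code: the prefix `UPLOADED_`, the 60-second settle time, and both function names are hypothetical.

```python
import os
import time

UPLOADED_PREFIX = "UPLOADED_"  # hypothetical marker prefix
SETTLE_SECONDS = 60            # hypothetical "ample time" after last write


def is_ready_for_upload(path, now=None, settle_seconds=SETTLE_SECONDS):
    """Return True if the file is not yet marked and has stopped changing.

    A file counts as stable when its mtime is at least `settle_seconds`
    in the past, rather than by comparing file sizes.
    """
    if os.path.basename(path).startswith(UPLOADED_PREFIX):
        return False
    if now is None:
        now = time.time()
    return now - os.stat(path).st_mtime >= settle_seconds


def mark_uploaded(path):
    """Rename the file with the marker prefix so it is not uploaded again."""
    directory, name = os.path.split(path)
    new_path = os.path.join(directory, UPLOADED_PREFIX + name)
    os.rename(path, new_path)
    return new_path
```

Marking by rename keeps the state in the file name itself, so a restart of the service does not need any separate bookkeeping to know what was already uploaded.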
If we need to stop swss during the fast-reboot procedure on the boot up path,
it means that something went wrong (for example, syncd/orchagent crashed already)
and we are stopping and restarting swss/syncd to re-initialize. In this case,
we should proceed as if it is a cold reboot.
Signed-off-by: Ying Xie <ying.xie@microsoft.com>
In-place editing (sed -i) seems to have some issues with filesystem
interaction. It could leave a 0-size file or a corrupted file behind.
It would be safer to sed the file contents into a new file and then replace
the old file with the new file.
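The same write-then-switch pattern can be sketched in Python. This is an illustration only: the change itself uses sed and a file move, and the helper name `rewrite_file` is hypothetical. The new contents go to a temporary file in the same directory, which is then moved over the original with `os.replace`.

```python
import os
import re
import shutil
import tempfile


def rewrite_file(path, pattern, replacement):
    """Apply a regex substitution without editing the file in place."""
    with open(path) as src:
        new_contents = re.sub(pattern, replacement, src.read())

    # Create the temporary file next to the target so the final rename
    # stays on the same filesystem.
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "w") as tmp:
            tmp.write(new_contents)
            tmp.flush()
            os.fsync(tmp.fileno())
        # mkstemp creates the file with mode 0600; copy the original mode.
        shutil.copymode(path, tmp_path)
        os.replace(tmp_path, path)
    except BaseException:
        # Clean up the temporary file if anything failed before the switch.
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
```

If the process is interrupted before the final rename, the original file is still intact and only the temporary file is left incomplete.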
Signed-off-by: Ying Xie <ying.xie@microsoft.com>
* lldpctl: put a lock around some commands to avoid race conditions
* Read all notifications in lldpctl_recv
* lib: fix memory leak
* lib: fix memory leak when handling I/O
* Update series
* Corefile uploader service
1) A service is added to watch /var/core and upload to Azure storage
2) The service is disabled on boot. One may enable explicitly.
3) The .rc file to be updated with acct credentials and http proxy to use.
4) If service is enabled with no credentials, it would sleep, with periodic log messages
5) For any update in .rc, the service has to be restarted to take effect.
* Remove rw permission for .rc file for group & others.
* Changes per review comments.
Re-ordered .rc file per JSON.dump order.
Added a script to enable partial update of .rc, which HWProxy would use to add acct key.
* Azure storage upload requires python module futures, hence added it to install list.
* Removed trailing spaces.
* A mistake in name corrected.
Copy the .rc updater script to /usr/bin.
* [process-reboot-cause]Address the issue: Incorrect reboot cause returned when warm reboot follows a hardware caused reboot
1. check whether /proc/cmdline indicates warm/fast reboot.
if yes the software reboot cause file will be treated as the reboot cause.
finish
2. check whether platform api returns a reboot cause.
if yes it is treated as the reboot cause.
finish.
3. check whether /hosts/reboot-cause contains a cause.
if yes it is treated as the cause otherwise return unknown.
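The three steps above can be sketched as a single function. This is a simplified model rather than the actual process-reboot-cause script: the function name, its parameters, the `"Unknown"` string, and the behavior when the software cause is empty in step 1 are all assumptions. The kernel command-line markers for warm/fast reboot are passed in by the caller because their exact values are not given here.

```python
UNKNOWN_REBOOT_CAUSE = "Unknown"  # hypothetical spelling of "unknown"


def determine_reboot_cause(cmdline, boot_markers, software_cause, platform_cause):
    """Pick the reboot cause following the three-step order.

    cmdline        -- contents of /proc/cmdline
    boot_markers   -- command-line tokens that indicate warm/fast reboot
                      (values supplied by the caller, not defined here)
    software_cause -- contents of the software reboot cause file, or None
    platform_cause -- cause returned by the platform API, or None
    """
    # Step 1: warm/fast reboot recorded on the kernel command line wins.
    tokens = cmdline.split()
    if any(marker in tokens for marker in boot_markers):
        return software_cause or UNKNOWN_REBOOT_CAUSE

    # Step 2: otherwise trust a hardware cause reported by the platform API.
    if platform_cause:
        return platform_cause

    # Step 3: fall back to the software reboot cause file, else unknown.
    return software_cause or UNKNOWN_REBOOT_CAUSE
```

Checking the command line first is what prevents a stale hardware cause from being reported when a warm reboot follows a hardware-caused reboot.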
* [process-reboot-cause]Fix review comments
* [process-reboot-cause]address comments
1. use "with" statement
2. update fast/warm reboot BOOT_ARG
* [process-reboot-cause]address comments
* refactor the code flow
* Remove escape
* Remove extra ':'
* [Mellanox]Update hw-mgmt to V7.0000.2308
sonic-linux-kernel should be updated accordingly with necessary patches uploaded.
* [sub-module]Advance submodule head for sonic-linux-kernel
Patch isc-dhcp-relay in order to allow the relay agent to discover configured interfaces even if they are down.
Without this patch, the relay agent will not discover configured interfaces if they are down when the relay agent starts up. If the interface(s) then get brought up after the relay started, the relay will discard packets received on these interfaces and log the message "Discarding packet received on <iface_name> interface that has no IPv4 address assigned." This led to race conditions when starting SONiC (or loading configuration). To resolve this, the relay agent would need to be restarted with all configured interfaces up.
With this patch, the relay agent will discover all configured interfaces, whether or not they are up at the time the relay agent starts. Thus, the state of the configured interfaces can be down when the relay agent starts and brought up during the lifetime of the relay agent process, and the relay agent will relay packets as expected; it will not discard them.
Updated the l2 preset config generator to specify 'admin_status': 'up' for every port by default.
The use of setdefault() ensures that if port already has some admin_status set, the original value will not be overwritten.
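A small sketch of the `setdefault()` behavior described above. The function name `apply_default_admin_status` and the sample port table are hypothetical and do not come from the preset generator itself.

```python
def apply_default_admin_status(ports):
    """Set 'admin_status' to 'up' for every port that does not define it."""
    for attrs in ports.values():
        # setdefault() only writes the key when it is missing, so an
        # explicit 'down' (or any other value) is left untouched.
        attrs.setdefault("admin_status", "up")
    return ports


# Hypothetical sample data for illustration.
ports = {
    "Ethernet0": {"speed": "100000"},
    "Ethernet4": {"speed": "100000", "admin_status": "down"},
}
apply_default_admin_status(ports)
```

After the call, `Ethernet0` gains `admin_status: up`, while `Ethernet4` keeps its original `down` value.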
Signed-off-by: Mykola Faryma <mykolaf@mellanox.com>
Update multiDB changes in sonic-utilities, including earlier commits by others as well:
- [multiDB]: all application should use API to get redis_client (#753)
- [VRF]: submit vrf CLI #392 (#558)
- [show] Add 'features' subcommand to display status for optional features (#712)
- [neighbor_advertiser] Adds initial support for HTTPS to neighbor advertiser (#750)
After this update, we are able to update the sonic-py-swsssdk submodule without hitting the error as before.
Signed-off-by: Dong Zhang d.zhang@alibaba-inc.com
- Updated buffers config;
- Set eth2 as CPU port;
- Added systemd service file to load bf_fpga.ko
Signed-off-by: Andriy Kokhan <akokhan@barefootnetworks.com>
* adding quotes for string comparison with special characters
* Update dockers/docker-sonic-telemetry/telemetry.sh
Co-Authored-By: Joe LeVeque <jleveque@users.noreply.github.com>
* Update dockers/docker-sonic-telemetry/telemetry.sh
Co-Authored-By: Joe LeVeque <jleveque@users.noreply.github.com>
update multiDB changes in sonic-py-swsssdk, including:
* [multi-DB] Part 4: add sonic-db-cli to replace redis-cli (#54)
* [multi-DB] Part 3: Python API changes (#52)
* remove SonicV2Connector which is not used any more (#53)
* 5337490 2019-11-22 | Send port status notification when creating hostif interface (#535) [Kamil Cudnik]
Signed-off-by: Guohan Lu <gulv@microsoft.com>
- [syncd] Fix off-by-one error for attribute enum values query (#536)
- Add support for remove hostif when using tap device (#533)
Signed-off-by: Danny Allen <daall@microsoft.com>