Cherypick of #15685
MSFT ADO: 24274591
Why I did it
Two changes:
1 Fix a day1 issue, where check to wait until CONFIG_DB_INITIALIZED is incorrect.
There are multiple places where same incorrect logic is used.
Current logic (until [[ $($SONIC_DB_CLI CONFIG_DB GET "CONFIG_DB_INITIALIZED") ]];) will always result in pass, irrespective of the result of GET operation.
root@str2-7060cx-32s-29:~# sonic-db-cli CONFIG_DB GET "CONFIG_DB_INITIALIZED"
1
root@str2-7060cx-32s-29:~# until [[ $(sonic-db-cli CONFIG_DB GET "CONFIG_DB_INITIALIZED") ]]; do echo "entered here"; done
root@str2-7060cx-32s-29:~#
root@str2-7060cx-32s-29:~#
root@str2-7060cx-32s-29:~# sonic-db-cli CONFIG_DB GET "CONFIG_DB_INITIALIZED"
0
root@str2-7060cx-32s-29:~# until [[ $(sonic-db-cli CONFIG_DB GET "CONFIG_DB_INITIALIZED") ]]; do echo "entered here"; done
root@str2-7060cx-32s-29:~#
Fix this logic by checking for value of flag to be "1".
root@str2-7060cx-32s-29:~# until [[ $(sonic-db-cli CONFIG_DB GET "CONFIG_DB_INITIALIZED") -eq 1 ]]; do echo "entered here"; done
entered here
entered here
entered here
This gap in logic was highlighted when another fix was merged: #14933
The issue being fixed here caused warmboot-finalizer to not wait until config-db is initialized.
2 Set and unset CONFIG_DB_INITIALIZED for warm-reboot case
Currently, during warm shutdown CONFIG_DB_INITIALIZED's value is stored in redis db backup. This is restored back when the dump is loaded during warm-recovery.
So the value of CONFIG_DB_INITIALIZED does not depend on config db's state, however it remain what it was before reboot.
Fix this by setting CONFIG_DB_INITIALIZED to 0 as when the DB is loaded, and set it to 1 after db_migrator is done.
Work item tracking
Microsoft ADO (number only):
How I did it
How to verify it
Part of sonic-net/sonic-utilities#2760
Similar to #14295
- Why I did it
To clear teamd timer when fast-reboot is finalized to prevent any further affect.
- How I did it
Deleted teamd timer from config-db in fast-reboot finalizer.
config save call is moved to after clearing teamd-timer so it won't have any further affect as well.
- How to verify it
Verified manually that entry was deleted after fast-reboot was finailized.
- Why I did it
To solve an issue with upgrade with fast-reboot including FW upgrade which has been introduced since moving to fast-reboot over warm-reboot infrastructure.
As well, this introduces fast-reboot finalizing logic to determine fast-reboot is done.
- How I did it
Added logic to finalize-warmboot script to handle fast-reboot as well, this makes sense as using fast-reboot over warm-reboot this script will be invoked. The script will clear fast-reboot entry from state-db instead of previous implementation that relied on timer. The timer could expire in some scenarios between fast-reboot finished causing fallback to cold-reboot and possible crashes.
As well this PR updates all services/scripts reading fast-reboot state-db entry to look for the updated value representing fast-reboot is active.
- How to verify it
Run fast-reboot and check that fast-reboot entry exists in state-db right after startup and being cleared as warm-reboot is finalized and not due to a timer.
Why I did it
After warm reboot, show environment prints the following error:
failed to import plugin show.plugins.macsec: [Errno 13] Permission denied: '/tmp/cache/macsec'
How I did it
Set owner back to admin after restoring counters folder.
How to verify it
sudo warm-reboot, then ensure show environement does not print errors.
Signed-off-by: Oleksandr Kolomeiets <oleksandrx.kolomeiets@intel.com>
Spanning from sonic-net/sonic-linkmgrd#76, this PR is to update warm restart finalizer to wait for linkmgrd to be reconciled.
sign-off: Jing Zhang zhangjing@microsoft.com
Why I did it
To make sure finalizer save config after linkmgrd's reconciliation.
How I did it
Add linkmgrd to the reconciliation wait list of warmboot finalizer.
How to verify it
Verified on lab device, linkmgrd reconciled as expected.
A change in sonic-utilities makes all cache files be saved into a
/tmp/cache. On swss restart this cache has to be removed in case swss
starts in cold or fast mode. A related cache restoration in the warmboot
finalizer script is also updated to use new location.
- Why I did it
To fix#9817. Clear the cache directory on swss.sh except for warm start.
Also, adopted finalize-warmboot script to take the cache directory.
- How I did it
A change in sonic-utilities makes all cache files be saved into a /tmp/cache. On swss restart this cache has to be removed in case swss starts in cold or fast mode. A related cache restoration in the warmboot finalizer script is also updated to use new location.
- How to verify it
Run togather with Azure/sonic-utilities#2232. Verify counters cache is removed on config reload, cold/fast reboots, swss restart.
Signed-off-by: Stepan Blyschak <stepanb@nvidia.com>
#### Why I did it
I made this change to support warm/fast reboot for SONiC extension packages as per HLD Azure/SONiC#682.
#### How I did it
I extended manifest.json.j2 with new warm/fast reboot related fields and also extended sonic_debian_extension.j2 script template to generate the shutdown order files for warm and fast reboot.
* [warm boot finalizer] only wait for enabled components to reconcile
Define the component with its associated service. Only wait for components that have associated service enabled to reconcile during warm reboot.
Signed-off-by: Ying Xie <ying.xie@microsoft.com>
- Why I did it
There is a issue for counters after warm-reboot:
If I clear counters by command "sonic-clear counters", then execute 'warm-reboot' and whenSONiC is restart, the counters showed with command "show interface counters" is still old counters before "sonic-clear". It is not the right counters because the counters file in '/tmp' is lost in warm-reboot process.
- How I did it
I fixed it by saving '/tmp/portstat-0' folders in '/host/' before executing 'warm-reboot' (in pull request Azure/sonic-utilities#1099 ), and restore the counters folders back to '/tmp/' after warm-reboot process is finished.
- How to verify it
Clear counters by command 'sonic-clear'
sonic-clear counters
sonic-clear dropcounters
sonic-clear pfccounters
sonic-clear queuecounters
sonic-clear rifcounters
Execute 'warm-reboot'
Use command ‘show interface counters’ to see if the counters is right.
We are moving toward building all Python packages for SONiC as wheel packages rather than Debian packages. This will also allow us to more easily transition to Python 3.
Python files are now packaged in "sonic-utilities" Pyhton wheel. Data files are now packaged in "sonic-utilities-data" Debian package.
**- How I did it**
- Build and install sonic-utilities as a Python package
- Remove explicit installation of wheel dependencies, as these will now get installed implicitly by pip when installing sonic-utilities as a wheel
- Build and install new sonic-utilities-data package to install data files required by sonic-utilities applications
- Update all references to sonic-utilities scripts/entrypoints to either reference the new /usr/local/bin/ location or remove absolute path entirely where applicable
Submodule updates:
* src/sonic-utilities aa27dd9...2244d7b (5):
> Support building sonic-utilities as a Python wheel package instead of a Debian package (#1122)
> [consutil] Display remote device name in show command (#1120)
> [vrf] fix check state_db error when vrf moving (#1119)
> [consutil] Fix issue where the ConfigDBConnector's reference is missing (#1117)
> Update to make config load/reload backward compatible. (#1115)
* src/sonic-ztp dd025bc...911d622 (1):
> Update paths to reflect new sonic-utilities install location, /usr/local/bin/ (#19)
* Changes in sonic-buildimage for the NAT feature
- Docker for NAT
- installing the required tools iptables and conntrack for nat
Signed-off-by: kiran.kella@broadcom.com
* Add redis-tools dependencies in the docker nat compilation
* Addressed review comments
* add natsyncd to warm-boot finalizer list
* addressed review comments
* using swsscommon.DBConnector instead of swsssdk.SonicV2Connector
* Enable NAT application in docker-sonic-vs
* [MultiDB] (except ./src and ./dockers dirs): replace redis-cli with sonic-db-cli and use new DBConnector
* update comment for a potential bug
* update comment
* add TODO maker as review reqirement
* [warm reboot] save configuration after warm reboot
After warm reboot, save a copy of in memory database to config_db.json,
upgrade procedure might have removed config_db.json to force new image
to reload minigraph. However, reload minigraph is skipped during warm
reboot. Missing config_db.json would cause device to fault in next
non-upgrading cold/fast reboot.
Signed-off-by: Ying Xie <ying.xie@microsoft.com>
* Update finalize-warmboot.sh
After warm reboot is done, we need to disable warm reboot flag and
tear down anything setup for warm reboot and persisted across.
Signed-off-by: Ying Xie <ying.xie@microsoft.com>