lazy_re had an issue when importing sonic-cfggen in another application that
uses re.search(). There is no much improvement of lazy_re today after many
other good optimization work done for sonic-cfggen. It served as a quick
temporary solution.
Some quick test for fast-reboot and warm-reboot done on top of 201911 branch:
Fast-reboot: from ASIC reset to ports in up state:
with lazy_re: 18 sec
without lazy_re: 18 sec
Warm-reboot: LAG restoration time:
with lazy_re: 73 sec
without lazy_re: 72 sec
So, there is no real optimization since the number of sonic-cfggen calls is greatly
reduced in latest SONiC. This means it is time to revert this change.
Signed-off-by: Stepan Blyschak <stepanb@nvidia.com>
#### Why I did it
Restrict the min-links parameter in "config portchannel" to the range 1-1024.
FixesAzure/sonic-buildimage#6781 in conjunction with https://github.com/Azure/sonic-buildimage/pull/1630.
Align YANG model with limits in libteam and sonic-utilties.
#### How I did it
PR 1630 in sonic-utilities prevents CLI user from entering a value outside the allowed range. This PR does the following:
- Increases the maximum value of min-links from 128 to 1024.
- Provides validation in libteam, incorporating as a patch the code in https://git.kernel.org/pub/scm/linux/kernel/git/jpirko/libteam.git/commit/?id=69a7494bb77dc10bb27076add07b380dbd778592.
- Updates the Yang model upper limit from 128 to 1024 (was inconsistent with libteam value).
- Updates the Yang model lower limit from 1 to 0, since 0 is set as default in sonic-utilities which would fail its new range check otherwise.
- Added Yang tests for valid and invalid value.
#### How to verify it
config portchannel add PortChannel0004 --min-links 1024
Command should be accepted.
show interfaces portchannel
Output should show PortChannel0004, no errors on CLI.
config portchannel add PortChannel0005 --min-links 1025
Command should be rejected
show interfaces portchannel
Output should not show PortChannel0005 , no errors on CLI.
#### Which release branch to backport (provide reason below if selected)
#### Description for the changelog
Updates YANG model to allow up to 1024 min_links for portchannel. FixesAzure/sonic-buildimage#6781 in conjunction with https://github.com/Azure/sonic-buildimage/pull/1630.
#### Why I did it
Fixing https://github.com/Azure/sonic-buildimage/issues/8938
Fixing 1x10G DPB mode in Mellanox-SN2700-D40C8S8 SKU as it was causing sonic-cfggen to fail.
#### How I did it
Added correct mode format in hwksu.json in Mellanox-SN2700-D40C8S8 and updated platform.json for the new mode.
#### How to verify it
Using sonic-cfggen verify it works fine
Since database.service has been moved to execute after rc-local.service,
and determine-reboot-cause.service rely on database.service, we have to
specify that in "After=".
Signed-off-by: Xichen Lin <xichenlin@microsoft.com>
Co-authored-by: Xichen Lin <xichenlin@microsoft.com>
copp-config service needs to be started after sonic.target so that it could
render the copp-config with the latest information.
It also needs to be restarted when config reload or load_minigraph is invoked.
Signed-off-by: Ying Xie <ying.xie@microsoft.com>
the branch refers the branch name that the commit is in,
for example master, 202012, 201911, ...
In case there is no branch, the name will be HEAD.
release is encoded in /etc/sonic/sonic_release file.
the file is only available for a release branch.
It is not available in master branch.
example for master branch
```
build_version: 'master.602-6efc0a88'
debian_version: '10.7'
kernel_version: '4.19.0-9-2-amd64'
asic_type: vs
commit_id: '6efc0a88'
branch: 'master'
release: 'none'
build_date: Tue Dec 29 06:54:02 UTC 2020
build_number: 602
built_by: johnar@jenkins-worker-23
```
example for 202012 release branch
```
build_version: '202012.602-6efc0a88'
debian_version: '10.7'
kernel_version: '4.19.0-9-2-amd64'
asic_type: vs
commit_id: '6efc0a88'
branch: '202012'
release: '202012'
build_date: Tue Dec 29 06:54:02 UTC 2020
build_number: 602
built_by: johnar@jenkins-worker-23
```
Signed-off-by: Guohan Lu <lguohan@gmail.com>
d2baedb Remove exec from platform_reboot_plugin call to handle any hang issue. (#1879)
c54da70 [fstrim] limit smartctl execution time to 30 seconds (#1850)
cccf845 [multi-asic][cli][chassis-db] Avoid connecting to chassis db for cli commands executed from linecard (#1707)
sonic-swss
29a0483 Cache routes for single nexthop for faster retrieval (#1922)
sonic-utilities
9bc6500 Modified the 'show ipv6 link-local-mode' command to display all interfaces by default (#1797)
[202106 6e895ad] Revert "[buffer orch] Bugfix: Don't query counter SAI_BUFFER_POOL_STAT_XOFF_ROOM_WATERMARK_BYTES on a pool where it is not supported (#1857)" (#1945)
- Why I did it
Change thermal recover threshold from temp_trip_norm to temp_trip_high, so that thermal algorithm would set fan speed to minimum allowed earlier and save power.
- How I did it
Change thermal recover threshold from temp_trip_norm to temp_trip_high
- How to verify it
Manual test
How I did it
Added if multi npu check before invoking the load global config.
How to verify it
Restart caclmgrd after this change and check if no error log is thrown.
Why I did it
When feature state is set to always_enabled hostcfgd throws error message
Sep 21 22:30:55.135377 r-leopard-32 ERR /hostcfgd: Unexpected state value 'always_enabled' for feature bgp
Sep 21 22:30:55.420268 r-leopard-32 ERR /hostcfgd: Unexpected state value 'always_enabled' for feature database
Sep 21 22:30:58.672714 r-leopard-32 ERR /hostcfgd: Unexpected state value 'always_enabled' for feature swss
This is due to feature == always_enabled not handled properly.
How I did it
Handled the scenario when feature is always enabled
How to verify it
Restart hostcfgd with feature state configured as always_enabled and check if there are no errors.
Added UT to cover the scenario.
2f5588c Fix flex counters logic of converting poll interval to seconds from MS (#878)
8d57cfd [syncd][bcm] Start syncd by passing context configuration file (#858)
sonic-swss:
bb69ca2 [portsorch] Avoid orchagent crash when set invalid interface types to port (#1906)
6e1bacc [pfcwd] Fix the polling interval time granularity (#1912)
564785b [teammgrd]: Improve LAGs cleanup on shutdown: send SIGTERM directly to PID. (#1841)
7ee8d26 [tlm teamd] Add retry mechanism before logging the ERR in get_dumps. (#1629)
7f57d3d [fgnhgorch] Enable packet flow when no FG ECMP neighbors are resolved (#1900)
08d009f Mux state order change (#1902)
sonic-utilities:
1bc0f07 Provide support to install platform extensions (#1578)
968c781 [config reload] Removed job-mode for sonic.target restart (#1820)
What I did:
add platform components
How I did it:
In platform_components.json add chassis and empty component
How to verify it:
Run show platform firmware updates
As a part of warmboot, redis database is dumped:
c97fe546e5/scripts/fast-reboot (L269)
However, this dump file is deleted, after it is loaded back into db post reboot.
The DB dump can be useful for debugging purpose, hence taking a backup of it can be useful.
Instead of deleting the dump, rename and keep the dump.
Added logrotate file for wtmp and btmp to override default conf and set size cap as 100K as done in
PR: #865. For buster this is control by separate file wtmp and btmp.
Signed-off-by: Abhishek Dosi <abdosi@microsoft.com>
* [multi-asic][cli][chassis-db] Avoiding connecting to chassis db
Currently, for all the cli commands, we connect to all databases
mentioned in the database_config.json. The database_config.json also
includes the databases from chassis redis server from supervisor card.
It is unneccessary to connect to databases from chassis redis server
when cli commands are executed form linecard. But we need to allow
connection to chassis databases when the cli commands are executed from
supervisor card.
The changes in this PR fixes this problem. This PR requires that
asic.conf in supervisor card includes VOQ_SUPERVISOR with value 1 to
indentify the supervisor card. The connect_to_all_dbs_for_ns() is
changed to skip chassis databases form the list of collected databases
if the card is not supervisor card.
#### Why I did it
Remove the call to `SonicDBConfig.load_sonic_global_db_config()` in the multi asic functions.
The expection is the client calling this function will call `SonicDBConfig.load_sonic_global_db_config()`
This PR is dependent on the PR https://github.com/Azure/sonic-utilities/pull/1712
#### How to verify it
compile sonic-utilities
sonic-swss commits
5fbd113 [Flex-counters] Fix the delay of flex counters flow to prevent infinite loop (#1899)
dc685b6 [portsorch] Add an extra check before setting oper speed to APPL_DB (#1885)
ceef728 Update port_rates & rif_rates lua scripts to convert poll_interval to MS (#1855)
sonic-utilities commits
3160753 [ci]: Support PR coverage (#1806)
3316fdb fix wrong code indent in sfputil (#1808)
c33e3a8 [config reload] Fix config reload failure due to sonic.target job cancellation (#1814)
4f7e107 [portconfig] Validate duplicate speed value and interface type value (#1745)
59817e2 [warm-reboot] Add new preboot health check: verify database integrity (#1785)
bf2ff3c [portstat, intfstat] added rates and utilization (#1750)
3bf962c [show][platform] Revise chassis info fallback to only fall back on pmon crash (#1751)
*Removed execute permissions from the systemd copp-config.service file.
Without this we will get a warning: "Configuration file /lib/systemd/system/copp-config.service is marked executable. Please remove executable permission bits. Proceeding anyway."
Why I did it
fstrim has dependency on pmon docker.
How I did it
start fstrim timer after sonic.target.
How to verify it
local test and PR test.
Signed-off-by: Ying Xie ying.xie@microsoft.com
To Fix#8697 . The config load_minigraph initializes 'admin_status' to up when platform.json has DPB configs. This doesn't happen when using port_config.ini
The update minigraph has logic to initialize only the ports whose neighbors are defined or those belonging to portchannel
However, a change was introduced to have default admin status to be 'up' in portconfig.py when the minigraph was using platform.json
This will lead to sanity check failure in sonic-mgmt and thus no test cases could be run
* [Nokia ixs7215] Miscellaneous platform API fixes
This commit delivers the following fixes for the Nokia ixs7215 platform
- Fix bug in a fan API error path
- Add support for setting the fan drawer led
- Add support for getting/setting the front panel PSU status led
- Add support for getting the min/max observed temperature value
* [Nokia ixs7215] code review changes: temperature min/max values
- Why I did it
Removed 2x40G for SN3800. This mode is not supported by hardware.
- How I did it
Removing it from hwsku.json and platform.json
- How to verify it
Load it in the device and check supported modes
Why I did it
Fix an issue on the Clearwater2 linecard.
When the linecard is started with a fresh image without configuration, phys would not be initialized.
How I did it
Added default_sku for Clearwater2 which prevents config-setup from failing to create a default config_db.json.
Added some extra logic in the phy-credo-init script to run the phy_config.sh of the hwsku pointed by default_sku if the DEVICE_METADATA.localhost.hwsku information is not populated in CONFIG_DB.
How to verify it
Booting an image with this change and without configuration will lead to the phys being initialized using the phy_config.sh from default_sku.
sonic-swss
73f6f68 [Flex Counters] Delay flex counters even if tables are present in the DB (#1877)
5edb9e5 [buffer orch] Bugfix: Don't query counter SAI_BUFFER_POOL_STAT_XOFF_ROOM_WATERMARK_BYTES on a pool where it is not supported (#1857)
fce0c60 [crm] Fix for Issue Azure/sonic-buildimage#8036 (#1829)
sonic-utilities
2630ac1 [Fast-reboot] Set flex counters delay indicator to prevent flex counters enablement after fast-reboot (#1768)
606f1b1 [portstat pfcstat] Unify the packet number format in the output of portstat and pfcstat in all cases (#1755)
2c6a15e [ecnconfig] Fix exception seen during display and add unit tests (#1784)
9b1995e Fix logic in RIF counters print (#1732)
sonic-swss-comon
3e7b81f Add a new field for FLEX_COUNTER_TABLE to indicate delay for flex counters (#523)