Commit Graph

3516 Commits

Author SHA1 Message Date
Junchao-Mellanox
547ec0a905 Add a configuration to delay start xcvrd for fast-reboot (#5643) 2020-12-22 09:51:54 -08:00
Tamer Ahmed
31389aa778 [cfggen] Remove NatSorted (#5601)
Natural sorting of SONiC config gen output consumes lot of CPU cycles.
The sole use of natsorted was to make test comparison easier and so,
the natsorting logic is now relocated to the test suite. As a result
sonic-cfggen gained nearly 1 sec per call since we no longer import
natsorted module!

singed-off-by: Tamer Ahmed <tamer.ahmed@microsoft.com>
2020-12-22 09:51:54 -08:00
Tamer Ahmed
aa2d39c7e0 [cfggen] Allow Write To Redis DB With Template/Batch Mode (#5203)
Argument to write to config-db is not allowed when using template.
This PR allows cfggen to write to redis db when using template
mode.

signed-off-by: Tamer Ahmed <tamer.ahmed@microsoft.com>
2020-12-22 09:51:54 -08:00
Tamer Ahmed
b61621c5f0 [cfggen] Extend Template Argument to Support Batch Mode (#4941) (#5200)
Calls to cfggen take considerable time. With batch mode, we will have the ability
to reduce number of calls from services.

Example of the batch mode command:
sonic-cfggen -t template-1.j2 -t template-2.j2,config-db -t template-3.j2,config-db -t template-4.j2,file1 -t template-5.j2,file2 --write-to-db.

template-1.j2 will be rendered to stdout since it is missing the dest part. stdout is default
config-db is a special keyword that will inject the rendered template into internal data structure. The internal data structure gets written to redis-db with --write-to-db switch. In the case the user would like to write to a file named config-db, it could be given as /config-db or ./config-db

signed-off-by: Tamer Ahmed <tamer.ahmed@microsoft.com>
2020-12-22 09:51:54 -08:00
Tamer Ahmed
afc952535e [mgmt-framework] Call sonic-cfggen Once (#4937)
Optimizing number of calls made to sonic-cfggen during service
start up as it adds to total system boot up time.

***-Test 1***
there is an average saving of 1 to 1.5 sec between old script and new script
```
root@str-s6000-acs-14:/# time /usr/bin/rest-server-old.sh
Generating temporary TLS server certificate ...
2020/07/09 19:03:33 wrote cert.pem
2020/07/09 19:03:33 wrote key.pem
REST_SERVER_ARGS = -ui /rest_ui -logtostderr -cert /tmp/cert.pem -key /tmp/key.pem
/usr/sbin/rest_server -ui /rest_ui -logtostderr -cert /tmp/cert.pem -key /tmp/key.pem

real	0m8.790s
user	0m7.993s
sys	0m0.584s
root@str-s6000-acs-14:/# time /usr/bin/rest-server-new.sh
Generating temporary TLS server certificate ...
2020/07/09 19:03:45 wrote cert.pem
2020/07/09 19:03:45 wrote key.pem
REST_SERVER_ARGS = -ui /rest_ui -logtostderr -cert /tmp/cert.pem -key /tmp/key.pem
/usr/sbin/rest_server -ui /rest_ui -logtostderr -cert /tmp/cert.pem -key /tmp/key.pem

real	0m6.940s
user	0m5.670s
sys	0m0.386s
```
***-Test 2***
Built an image with this change and rest server is running with params as described in test 1 above
```
admin@str-s6000-acs-14:~$ ps -ef | grep rest_server
root      3301  2866  2 02:09 pts/0    00:00:10 /usr/sbin/rest_server -ui /rest_ui -logtostderr -cert /tmp/cert.pem -key /tmp/key.pem

```

signed-off-by: Tamer Ahmed <tamer.ahmed@microsoft.com>
2020-12-22 09:51:54 -08:00
Tamer Ahmed
2b3e18c0cc [swss] Reduce Calls to SONiC Cfggen (#5177)
Calls to sonic-cfggen is CPU expensive. This PR reduces calls to
sonic-cfggen to one call during startup when starting swss service.

singed-off-by: Tamer Ahmed <tamer.ahmed@microsoft.com>
2020-12-22 09:51:54 -08:00
Abhishek Dosi
0c1b686ced [Submodule Update] sonic-py-swsssdk
[configdb] Add Ability to Query/Update Redis Using
Pipelines

Signed-off-by: Abhishek Dosi <abdosi@microsoft.com>
2020-12-22 09:51:54 -08:00
Tamer Ahmed
fd3e0b4c58 [frr] Reduce Calls to SONiC Cfggen (#5176)
Calls to sonic-cfggen is CPU expensive. This PR reduces calls to
sonic-cfggen to two calls during startup when starting frr service.

singed-off-by: Tamer Ahmed <tamer.ahmed@microsoft.com>
2020-12-22 09:51:54 -08:00
Tamer Ahmed
c5f53f50b2 [radv] Reduce Calls to SONiC Cfggen (#5178)
Calls to sonic-cfggen is CPU expensive. This PR reduces calls to
sonic-cfggen to one call during startup when starting radv service.

singed-off-by: Tamer Ahmed <tamer.ahmed@microsoft.com>
2020-12-22 09:51:54 -08:00
Tamer Ahmed
687c971a52 [dhcp-relay] Reduce Calls to SONiC Cfggen (#5175)
Calls to sonic-cfggen is CPU expensive. This PR reduces calls to
sonic-cfggen to one call during startup when starting dhcp-relay
service.

singed-off-by: Tamer Ahmed <tamer.ahmed@microsoft.com>
2020-12-22 09:51:54 -08:00
Tamer Ahmed
b5bf5e3bce [interfaces] Reduce Calls to SONiC Cfggen (#5174)
Calls to sonic-cfggen is CPU expensive. This PR reduces calls to
sonic-cfggen to one call during startup when running interfaces-
config.

singed-off-by: Tamer Ahmed <tamer.ahmed@microsoft.com>
2020-12-22 09:51:54 -08:00
Tamer Ahmed
066a0b3b2b [snmp]: Reduce Calls to SONiC Cfggen (#5166)
Calls to sonic-cfggen is CPU expensive. This PR reduces calls to
sonic-cfggen to once calle during snmp startup

singed-off-by: Tamer Ahmed <tamer.ahmed@microsoft.com>
2020-12-22 09:51:54 -08:00
Tamer Ahmed
018d007750 [cfggen] Use Redis Pipeline (#5250)
This PR enables cfggen to readr/write from Redis DB using pipelines.
Pipelines enables batch read/write from/to Redis DB.

signed-off-by: Tamer Ahmed <tamer.ahmed@microsoft.com>
2020-12-22 09:51:54 -08:00
Tamer Ahmed
fae4c4bfcc [swss] Enhance ARP Update to Call Sonic Cfggen Once (#5398)
This PR limited the number of calls to sonic-cfggen to one call
per iteration instead of current 3 calls per iteration.

The PR also installs jq on host for future scripts if needed.

signed-off-by: Tamer Ahmed <tamer.ahmed@microsoft.com>
2020-12-22 09:51:54 -08:00
Abhishek Dosi
2140daa680 [submodule update] sonic-swss
Support ACL Table type Mirrorv6 for Innovium (#1528)
Enable v6 ACL rule based Mirroring for Innovium Platform

Signed-off-by: Abhishek Dosi <abdosi@microsoft.com>
2020-12-22 09:44:32 -08:00
Samuel Angebault
e75c15bfda
[aboot]: Better handle tmpfs management in boot0 (#6268)
To limit IO and space usage on the flash device the boot0 script makes sure the SWI is in memory.
Because SONiC maps /tmp on the flash, some logic is required to make sure of it.
However it is possible for some provisioning mechanism to already download the swi in a memory file system.
This was not properly handled by the boot0 script.
It now properly detect if the image is on a tmpfs or a ramfs and keep it there if that is the case.

- How I did it

Check the filesystem on which the SWI pointed by swipath lies.
If this filesystem is a ramfs or a tmpfs the move_swi_to_tmpfs becomes a no-op.
Made sure the cleanup logic would not behave unexpectedly.

- How to verify it

In SONiC:

Download the swi under /tmp and makes sure it gets moved to /tmp/tmp-swi which gets mounted for that purpose.
Make sure /tmp/tmp-swi gets unmounted once the install process is done.

Create a new mountpoint under /ram using either ramfs or tmpfs and download the swi there.
Install the swi using sonic-installer and makes sure the image doesn't get moved by looking at the logs.
2020-12-22 00:07:10 -08:00
Abhishek Dosi
c70b4cd63d [submodule update] sonic-utilities
fd3e0174971599fa7f9d73ff1a997583eb090fd5 (HEAD -> 201911, origin/201911) [Multi-asic] Enhanced Feature Table configuration for multi-asic platforms (#1152)
12f03b195609c07762d8c8efd80dc548ddd4fe78 Add FW dump with new SAI implementation (#1298)

Signed-off-by: Abhishek Dosi <abdosi@microsoft.com>
2020-12-20 16:47:53 -08:00
abdosi
35fc12c373 Telemetry Certificate Copy Across Image Upgrade. (#6252)
To copy telemetry certificate during image upgrade from previous image to new image
2020-12-19 08:24:41 -08:00
Abhishek Dosi
eb688c876b [Submodule update] sonic-swss
cea4468c91c448fb33fc8dda0dc44ec7c9b8f897 (HEAD -> 201911, origin/201911) [crm]: Typecast to unit64_t to avoid divide by 0 during overflow (#1550)

Signed-off-by: Abhishek Dosi <abdosi@microsoft.com>
2020-12-18 18:56:05 -08:00
Lawrence Lee
024330cb4b [minigraph.py]: Prefer parsing device type from <ElementType> (#6184)
* Parse device type from <ElementType> first in <PngDec>
* Fall back to <Device> type attribute if no <ElementType> is found

Signed-off-by: Lawrence Lee <lawlee@microsoft.com>
2020-12-18 18:56:05 -08:00
Prince Sunny
8d5cf6a8c3 [Submodule] Update for sonic-restapi (#6231)
b002455 - 2020-12-16 : Validate IP only if nexthop attribute is not null (#66) [Prince Sunny]
76592a9 - 2020-12-03 : Add License file (#62) [Prince Sunny]
2020-12-18 18:56:05 -08:00
arlakshm
7f76698b7d
[201911][hostcfgd]:wait updating the feature table till system init is done (#6234)
- Why I did it
The change is done to make sure the system initialization is done before the hostcfgd sets the feature states.

- How I did it
This is port of the PR #6232.
Since the systemctl version in 201911 doesn't support "--wait".
Added a function to check the output of systemctl is-system-running every second, till the command system is done booting up.

For now this change is only applicable to multi asic platforms based on the testing this change will be extended to all platforms in the future PR.

Signed-off-by: Arvindsrinivasan Lakshmi Narasimhan <arlakshm@microsoft.com>
2020-12-18 12:31:35 -08:00
Volodymyr Samotiy
78c44d1808
[Mellanox] Update SAI submodule (#6235)
To add VNET route diff tool (SAI/SDK part) to 201911 release

Signed-off-by: Volodymyr Samotiy <volodymyrs@nvidia.com>
2020-12-17 09:11:50 -08:00
Lawrence Lee
dae5c4c930 [build_templates]: Start SNMP timer after SWSS service (#6195)
Fixes #5663

- Why I did it
It's currently possible for the SNMP timer to conflict with config reload (specifically if the timer triggers while config reload is stopping the SWSS service). config reload triggers SWSS to shutdown, which causes SNMP to shutdown, which conflicts with the SNMP timer causing SNMP to startup. See the linked issue for more details.

- How I did it
Including the After ordering dependency forces the SNMP timer to wait until SWSS finishes stopping, preventing the conflict. If there is an ordering dependency between two units (e.g. one unit is ordered After another), if one unit is shutting down while the other is starting up, the shutdown will always be ordered before the startup. In this case, that means that the SNMP timer is forced to wait for the SWSS shutdown to complete. Only then can the SNMP timer proceed. See here for more details.

It's important to note that the After dependency will not cause SWSS to be started when the SNMP timer fires (assuming that SWSS has not yet been started). The existing Requisite dependency in the SNMP service will also not cause SWSS to be started, instead it will cause the SNMP service to fail if SWSS is not active.

Signed-off-by: Lawrence Lee <lawlee@microsoft.com>
2020-12-16 19:28:31 -08:00
Abhishek Dosi
70c6c0d9a0 [submodule update] sonic-swss
[201911] Fixes for NAT lgtm alerts (#1391)

Signed-off-by: Abhishek Dosi <abdosi@microsoft.com>
2020-12-16 14:25:21 -08:00
Shi Su
ff1a60fbc3 [L2 switch mode] Update l2switch.j2 template (#5981)
- Why I did it
The l2switch.j2 template does not include all fields for PORT. This could be incompatible with the 201911 image or later.

- How I did it
Update l2switch.j2 template and add a unit test.
2020-12-16 14:24:06 -08:00
shlomibitton
959035c854 [NVMe] Add NVMe SSD disc type support to installer.sh script (#6142)
In order to install a SONiC image on top of a NVMe SSD disc properly with ONIE we must configure it properly on the installer.sh script.

Signed-off-by: Shlomi Bitton <shlomibi@nvidia.com>
2020-12-16 14:19:07 -08:00
Abhishek Dosi
76d7c4beaf [submodule update] sonic-utilities
b909766aab63da5e9a51e05fd2bf79e80db75e5 (HEAD -> 201911, origin/201911) Fix show ip/v6 route summary non-multi-asic platform to interact with FRR directly (#1306)
057d2ee26586034975e21a5cacb1a00ca87f2857 Add support to collect tech support on multi ASIC platform (#1308)
38ab16d5835b917f7459044853276c9d4b53c98b [CLI][PFCWD] Fix issue with specifying ports in pfcwd start on masic platforms (#1203)

Signed-off-by: Abhishek Dosi <abdosi@microsoft.com>
2020-12-14 22:49:54 -08:00
Abhishek Dosi
adbf78816f [submodule update] sonic-sairedis
e98a7af95a9767093904d9e8fd320067163d5f87 (HEAD -> 201911, origin/201911) [syncd] Translate removed RIDs in fdb notification (#729)
3ceeae5371eee5b69064fa1af88f51e27caa2d36 [syncd] Process all cases fdb flush notification (#726)
115ba0783edf85658fd0329eb23796d758c309f5 fix compile error when compiling with g++-4.8.4 (#718)
a67f94d3d91325516069ef8c0d99bdec30bafbce Fix typo at SAI_ATTR_VALUE_TYPE_ACL_FIELD_DATA_UINT32 (#662)

Signed-off-by: Abhishek Dosi <abdosi@microsoft.com>
2020-12-14 22:46:24 -08:00
Abhishek Dosi
02004411b3 [submodule update] sonic-swss
7f50b9815e14d90c02d9dce63fd08d90e25cee3f (HEAD -> 201911, origin/201911) handled update() function of fdb orchagent for FDB FLUSH event (#1534)
17adc13b6ca21846fe27c94d6a16f9909c712d77 Add a check for warm-restart, and do a clear only when warm-restart is enable. (#1498)
d097260a5aa7bd611babd5062e220056374e23d8 Fixed compilation failure with debug option (#1518)

Signed-off-by: Abhishek Dosi <abdosi@microsoft.com>
2020-12-14 22:43:35 -08:00
pavel-shirshov
0931280466 [TSA]: Fix TSC. Avoid 'Not consistent' state (#5968) 2020-12-10 16:43:37 -08:00
shlomibitton
3b97d4375d [kernel] Change grub cmdline to set c-states to 0 for "Intel" CPUs (#6051)
Usually for a use case like networking - should not be configured to reach c6, the maximum used is c1e – due to the added latency getting in & out of states (bad for networking).

Following a recommendation by Intel, networking system should avoid getting in & out of states which introduce latency. The recommended state is c1e and no state change enabling.

In addition, c-state sole purpose is to save power and when inside a networking switch its really negligent being such a tiny consumer vs. the whole cluster.

Signed-off-by: Shlomi Bitton <shlomibi@nvidia.com>
2020-12-09 17:48:14 -08:00
Abhishek Dosi
371f82881b [Submoudle update] sonic-utilities
ccb52454a11e6906bb074d888740d279e4a3c8e3 (HEAD -> 201911, origin/201911) [fast-reboot] Fix fast-reboot when NDP entries are present (#1295)
d09667b86abb7d3cd31b92bedf6e4d4bdac4937f Multi-ASIC support for show ip(v6) route (201911 branch) (#1283)
28399bfcad2a40f1a85095bc679540531c4e673c [201911-Mellanox] SKU creator Tool (#1163) (#1250)

Signed-off-by: Abhishek Dosi <abdosi@microsoft.com>
2020-12-09 17:46:53 -08:00
Volodymyr Samotiy
39e1c27525
update SDK to 4.4.2112, FW to *.2008.2112, SAI to 1.18.0.1 (#6147)
Co-authored-by: keboliu <kebol@mellanox.com>
2020-12-08 07:54:50 +02:00
Junchao-Mellanox
fd05c2581d
Update submodule for PR [thermalctld][201911] Set led status after updating all other fan status (#6055)
Update submodule pointer for PR Azure/sonic-platform-daemons#126
2020-12-04 13:49:41 -08:00
Junchao-Mellanox
8f45bfa1be [Mellanox] Remove eeprom cache file when first time init eeprom object (#6071)
EEPROM cache file is not refreshed after install a new ONIE version even if the eeprom data is updated. The current Eeprom class always try to read from the cache file when the file exists. The PR is aimed to fix it.
2020-12-04 13:26:23 -08:00
abdosi
3a24e7f31f [multi-asic] Enhancing monit process checker for multi-asic. (#6100)
Added Support of process checker for work on multi-asic platforms.
2020-12-04 13:17:35 -08:00
Xin Wang
bf0ce16ebd [bgp]: Fix bgp crash after BGP allow list configuration is added (#6088)
The issue was a typo introduced in #6006. In that change, the BGP allow list
configuration manager was updated to use a method of common ConfigMgr
for restarting peer groups. However, the method name 'restart_peers' was
used instead of the correct 'restart_peer_groups'.

This change updated the managers_allow_list.py to use correct method
'restart_peer_groups' for restarting peer groups.

Signed-off-by: Xin Wang <xiwang5@microsoft.com>
2020-12-03 10:44:31 -08:00
Ying Xie
9345fffe8a [FRR] remove the whole block of outchannel properly (#6045)
- Why I did it
Fix issue #6043

- How I did it
We are disabling in container frr log. The log entries are sent to base image and are logged in /var/log/quagga/bgpd.log.

However, we need to remove the whole outchannel config block to avoid an error message raised by rsyslogd.

- How to verify it
Without the change, test_autorestart bgp container will fail on loganalyer errors. With the change, restarting bgp container is no longer generating error message and the test will pass.

The log generated by frr continued appearing in /var/log/quagga/bgpd.log
2020-11-26 17:04:22 -08:00
Abhishek Dosi
8c0df39c96 Revert "Advance SDK/SAI (#6004)"
This reverts commit 33a6e56833.
2020-11-26 11:55:52 -08:00
pavel-shirshov
9e0ea83cd9
[bgpcfgd]: Use peer commands for BBR, not peer-group (#6048)
* templates: Move 'allowas-in' command from peer-group to instance configuration

* Use peer itself, don't rely on peer-groups
2020-11-26 09:55:24 -08:00
Junchao-Mellanox
37eb088b74
[Mellanox] [201911] Fix issue: set fan led in certain order causes incorrect physical fan led color (#6019)
* Fix issue: fan led colo status

* Fix LGTM warning

* Support fan led management for non-swapable fan
2020-11-26 10:09:48 +02:00
Stephen Sun
33a6e56833
Advance SDK/SAI (#6004)
SDK 4.4.2018
FW XX_2008_2018
SAI 1.17.9

Signed-off-by: Stephen Sun <stephens@nvidia.com>
2020-11-26 09:43:50 +02:00
Abhishek Dosi
be0f82e09e [submodule update] sonic-utilities
49cd91dd0eb6d4b4d5fff388035a955feb8d242a (HEAD -> 201911, origin/201911) Feature table cli command update (#1271)
167d67a57a68c2499ef26e74f94cfb5b1c4eff73 [201911]  CRM show/config commands changes for multi-asic (#1127) (#1236)

Signed-off-by: Abhishek Dosi <abdosi@microsoft.com>
2020-11-25 17:47:28 -08:00
Abhishek Dosi
854642a1e0 Fix the build error
Signed-off-by: Abhishek Dosi <abdosi@microsoft.com>
2020-11-25 15:22:01 -08:00
Abhishek Dosi
37a1b05b79 Fix Merge Conflict
Signed-off-by: Abhishek Dosi <abdosi@microsoft.com>
2020-11-25 15:15:17 -08:00
pavel-shirshov
619256f446 [bgpcfg]: Batch bgp updates (#6006)
* [bgpcfgd]: Batch bgp updates.

vtysh -f command is slow. It is sometimes takes about 3 seconds.
When we need to run many vtysh -f commands that slows down the system.
Batch vtysh -f updates.

* Use correct file to import run_command
2020-11-25 15:11:28 -08:00
abdosi
1d1898d8e2 Enhanced Feature table to support 'always_enabled' value for state and auto-restart fields. (#6000)
Added new flag value 'always_enabled' for the state and auto-restart field of feature table

init_cfg.json is updated to initialize state field of database/swss/syncd/teamd feature and auto-restart field of database feature
as always_enabled

Once the state/auto-restart value is initialized as "always_enabled" it is immutable and cannot be change via feature config commands. (config feature..) PR#Azure/sonic-utilities#1271

hostcfgd will not take any action if state field value is 'always_enabled'

Since we have always_enabled field for auto-restart updated supervisor-proc-exit-listener
not to have special check for database and always rely on value from Feature table.
2020-11-25 10:04:42 -08:00
Rajkumar-Marvell
17045f42d1 Set sock rx Buf size to 3MB. (#5566)
* Set sock rx Buf size to 3MB.
2020-11-24 11:21:56 -08:00
Abhishek Dosi
ac5117f20b [submodule update] sonic-mgmt-framework
cc4c4db14439a2b91690df0189b62e011ec41f4c (HEAD -> 201911, origin/201911) Merge pull request #74 from project-arlo/fix_otel_dep_error
44df06e0d44bdf7ce49d4eb05ced34f06eb65133 Make sure redis library is checkout with correct commit ID
7ab88143fa4b89d2d7b8030c9ac7b5e6dba16251 Remove unsupported commands (#62)

Signed-off-by: Abhishek Dosi <abdosi@microsoft.com>
2020-11-23 23:49:29 -08:00