Commit Graph

3840 Commits

Author SHA1 Message Date
shlomibitton
959035c854 [NVMe] Add NVMe SSD disc type support to installer.sh script (#6142)
To install a SONiC image on an NVMe SSD with ONIE, the disk type must be configured properly in the installer.sh script.

Signed-off-by: Shlomi Bitton <shlomibi@nvidia.com>
2020-12-16 14:19:07 -08:00
Abhishek Dosi
76d7c4beaf [submodule update] sonic-utilities
b909766aab63da5e9a51e05fd2bf79e80db75e5 (HEAD -> 201911, origin/201911) Fix show ip/v6 route summary non-multi-asic platform to interact with FRR directly (#1306)
057d2ee26586034975e21a5cacb1a00ca87f2857 Add support to collect tech support on multi ASIC platform (#1308)
38ab16d5835b917f7459044853276c9d4b53c98b [CLI][PFCWD] Fix issue with specifying ports in pfcwd start on masic platforms (#1203)

Signed-off-by: Abhishek Dosi <abdosi@microsoft.com>
2020-12-14 22:49:54 -08:00
Abhishek Dosi
adbf78816f [submodule update] sonic-sairedis
e98a7af95a9767093904d9e8fd320067163d5f87 (HEAD -> 201911, origin/201911) [syncd] Translate removed RIDs in fdb notification (#729)
3ceeae5371eee5b69064fa1af88f51e27caa2d36 [syncd] Process all cases fdb flush notification (#726)
115ba0783edf85658fd0329eb23796d758c309f5 fix compile error when compiling with g++-4.8.4 (#718)
a67f94d3d91325516069ef8c0d99bdec30bafbce Fix typo at SAI_ATTR_VALUE_TYPE_ACL_FIELD_DATA_UINT32 (#662)

Signed-off-by: Abhishek Dosi <abdosi@microsoft.com>
2020-12-14 22:46:24 -08:00
Abhishek Dosi
02004411b3 [submodule update] sonic-swss
7f50b9815e14d90c02d9dce63fd08d90e25cee3f (HEAD -> 201911, origin/201911) handled update() function of fdb orchagent for FDB FLUSH event (#1534)
17adc13b6ca21846fe27c94d6a16f9909c712d77 Add a check for warm-restart, and do a clear only when warm-restart is enable. (#1498)
d097260a5aa7bd611babd5062e220056374e23d8 Fixed compilation failure with debug option (#1518)

Signed-off-by: Abhishek Dosi <abdosi@microsoft.com>
2020-12-14 22:43:35 -08:00
pavel-shirshov
0931280466 [TSA]: Fix TSC. Avoid 'Not consistent' state (#5968) 2020-12-10 16:43:37 -08:00
shlomibitton
3b97d4375d [kernel] Change grub cmdline to set c-states to 0 for "Intel" CPUs (#6051)
For a use case like networking, the CPU should usually not be configured to reach c6; the deepest state used is c1e, because getting in and out of deeper states adds latency (bad for networking).

Following a recommendation by Intel, networking systems should avoid getting in and out of states that introduce latency. The recommended state is c1e, with no state changes enabled.

In addition, the sole purpose of c-states is to save power, and inside a networking switch the saving is negligible, since the CPU is such a tiny consumer compared with the whole cluster.

Signed-off-by: Shlomi Bitton <shlomibi@nvidia.com>
2020-12-09 17:48:14 -08:00
Abhishek Dosi
371f82881b [Submodule update] sonic-utilities
ccb52454a11e6906bb074d888740d279e4a3c8e3 (HEAD -> 201911, origin/201911) [fast-reboot] Fix fast-reboot when NDP entries are present (#1295)
d09667b86abb7d3cd31b92bedf6e4d4bdac4937f Multi-ASIC support for show ip(v6) route (201911 branch) (#1283)
28399bfcad2a40f1a85095bc679540531c4e673c [201911-Mellanox] SKU creator Tool (#1163) (#1250)

Signed-off-by: Abhishek Dosi <abdosi@microsoft.com>
2020-12-09 17:46:53 -08:00
Volodymyr Samotiy
39e1c27525
update SDK to 4.4.2112, FW to *.2008.2112, SAI to 1.18.0.1 (#6147)
Co-authored-by: keboliu <kebol@mellanox.com>
2020-12-08 07:54:50 +02:00
Junchao-Mellanox
fd05c2581d
Update submodule for PR [thermalctld][201911] Set led status after updating all other fan status (#6055)
Update submodule pointer for PR Azure/sonic-platform-daemons#126
2020-12-04 13:49:41 -08:00
Junchao-Mellanox
8f45bfa1be [Mellanox] Remove eeprom cache file when first time init eeprom object (#6071)
The EEPROM cache file is not refreshed after installing a new ONIE version, even if the EEPROM data is updated, because the current Eeprom class always tries to read from the cache file when the file exists. This PR aims to fix that.
2020-12-04 13:26:23 -08:00
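A minimal sketch of the idea behind commit 8f45bfa1be, assuming a hypothetical class name, cache path, and hardware-read callable (none of these come from the Mellanox platform code): the cache file is removed the first time the EEPROM object is initialized, so the first read goes back to the hardware instead of a stale cache.

```python
import os
import tempfile


class EepromSketch:
    """Hypothetical stand-in for an EEPROM reader with an on-disk cache."""

    _first_init_done = False  # class-level flag shared by all instances

    def __init__(self, cache_path, read_hardware):
        self.cache_path = cache_path
        self.read_hardware = read_hardware  # callable returning the real EEPROM bytes
        if not EepromSketch._first_init_done:
            # First init: drop any cache left over from before the ONIE update.
            if os.path.exists(self.cache_path):
                os.remove(self.cache_path)
            EepromSketch._first_init_done = True

    def read(self):
        # Serve from the cache when present; otherwise read hardware and refresh it.
        if os.path.exists(self.cache_path):
            with open(self.cache_path, "rb") as f:
                return f.read()
        data = self.read_hardware()
        with open(self.cache_path, "wb") as f:
            f.write(data)
        return data


def demo():
    EepromSketch._first_init_done = False  # reset so the demo is repeatable
    with tempfile.TemporaryDirectory() as tmp:
        cache = os.path.join(tmp, "eeprom_cache")
        with open(cache, "wb") as f:
            f.write(b"stale")
        eeprom = EepromSketch(cache, lambda: b"fresh")
        return eeprom.read()
```

Later instances in the same process keep using the refreshed cache, which preserves the benefit of caching after the first read.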
abdosi
3a24e7f31f [multi-asic] Enhancing monit process checker for multi-asic. (#6100)
Added process checker support for multi-asic platforms.
2020-12-04 13:17:35 -08:00
Xin Wang
bf0ce16ebd [bgp]: Fix bgp crash after BGP allow list configuration is added (#6088)
The issue was a typo introduced in #6006. In that change, the BGP allow list
configuration manager was updated to use a method of common ConfigMgr
for restarting peer groups. However, the method name 'restart_peers' was
used instead of the correct 'restart_peer_groups'.

This change updates managers_allow_list.py to use the correct method,
'restart_peer_groups', for restarting peer groups.

Signed-off-by: Xin Wang <xiwang5@microsoft.com>
2020-12-03 10:44:31 -08:00
Ying Xie
9345fffe8a [FRR] remove the whole block of outchannel properly (#6045)
- Why I did it
Fix issue #6043

- How I did it
We are disabling in-container frr logging. The log entries are sent to the base image and are logged in /var/log/quagga/bgpd.log.

However, we need to remove the whole outchannel config block to avoid an error message raised by rsyslogd.

- How to verify it
Without the change, test_autorestart for the bgp container will fail on loganalyzer errors. With the change, restarting the bgp container no longer generates an error message and the test will pass.

The log generated by frr continued to appear in /var/log/quagga/bgpd.log.
2020-11-26 17:04:22 -08:00
Abhishek Dosi
8c0df39c96 Revert "Advance SDK/SAI (#6004)"
This reverts commit 33a6e56833.
2020-11-26 11:55:52 -08:00
pavel-shirshov
9e0ea83cd9
[bgpcfgd]: Use peer commands for BBR, not peer-group (#6048)
* templates: Move 'allowas-in' command from peer-group to instance configuration

* Use peer itself, don't rely on peer-groups
2020-11-26 09:55:24 -08:00
Junchao-Mellanox
37eb088b74
[Mellanox] [201911] Fix issue: set fan led in certain order causes incorrect physical fan led color (#6019)
* Fix issue: fan led color status

* Fix LGTM warning

* Support fan led management for non-swappable fan
2020-11-26 10:09:48 +02:00
Stephen Sun
33a6e56833
Advance SDK/SAI (#6004)
SDK 4.4.2018
FW XX_2008_2018
SAI 1.17.9

Signed-off-by: Stephen Sun <stephens@nvidia.com>
2020-11-26 09:43:50 +02:00
Abhishek Dosi
be0f82e09e [submodule update] sonic-utilities
49cd91dd0eb6d4b4d5fff388035a955feb8d242a (HEAD -> 201911, origin/201911) Feature table cli command update (#1271)
167d67a57a68c2499ef26e74f94cfb5b1c4eff73 [201911]  CRM show/config commands changes for multi-asic (#1127) (#1236)

Signed-off-by: Abhishek Dosi <abdosi@microsoft.com>
2020-11-25 17:47:28 -08:00
Abhishek Dosi
854642a1e0 Fix the build error
Signed-off-by: Abhishek Dosi <abdosi@microsoft.com>
2020-11-25 15:22:01 -08:00
Abhishek Dosi
37a1b05b79 Fix Merge Conflict
Signed-off-by: Abhishek Dosi <abdosi@microsoft.com>
2020-11-25 15:15:17 -08:00
pavel-shirshov
619256f446 [bgpcfg]: Batch bgp updates (#6006)
* [bgpcfgd]: Batch bgp updates.

The vtysh -f command is slow; it sometimes takes about 3 seconds.
Running many vtysh -f commands slows down the system.
Batch vtysh -f updates.

* Use correct file to import run_command
2020-11-25 15:11:28 -08:00
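The batching idea in commit 619256f446 can be sketched as follows. This is not the bgpcfgd implementation: the class name and the `apply_fn` callable (standing in for a single `vtysh -f` invocation on a list of configuration lines) are assumptions made for illustration.

```python
class BatchedConfigSketch:
    """Collects configuration lines and applies them with one call."""

    def __init__(self, apply_fn):
        # apply_fn is a hypothetical callable that would run one `vtysh -f`
        # on the given list of lines.
        self.apply_fn = apply_fn
        self.pending = []

    def push(self, command):
        # Queue the line instead of paying the slow call right away.
        self.pending.append(command)

    def commit(self):
        # Apply everything queued so far with a single invocation.
        if not self.pending:
            return False
        self.apply_fn(list(self.pending))
        self.pending.clear()
        return True


def demo():
    calls = []
    cfg = BatchedConfigSketch(calls.append)
    cfg.push("router bgp 65100")
    cfg.push(" neighbor 10.0.0.1 shutdown")
    cfg.commit()
    cfg.commit()  # nothing pending, so no second invocation
    return calls
```

With a cost of roughly 3 seconds per invocation, as the message reports, queuing N updates and committing once pays that cost once rather than N times.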
abdosi
1d1898d8e2 Enhanced Feature table to support 'always_enabled' value for state and auto-restart fields. (#6000)
Added new flag value 'always_enabled' for the state and auto-restart field of feature table

init_cfg.json is updated to initialize state field of database/swss/syncd/teamd feature and auto-restart field of database feature
as always_enabled

Once the state/auto-restart value is initialized as "always_enabled" it is immutable and cannot be changed via feature config commands. (config feature..) PR#Azure/sonic-utilities#1271

hostcfgd will not take any action if state field value is 'always_enabled'

Since auto-restart now supports the always_enabled value, supervisor-proc-exit-listener was updated
to drop its special check for database and always rely on the value from the Feature table.
2020-11-25 10:04:42 -08:00
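The immutability rule described in commit 1d1898d8e2 could look roughly like this. The function name and the dictionary layout are hypothetical, and the `auto_restart` key spelling is an assumption; only the 'always_enabled' value and its meaning come from the message.

```python
ALWAYS_ENABLED = "always_enabled"


def update_feature_field(feature_table, name, field, new_value):
    """Apply a config change unless the field is pinned to always_enabled."""
    entry = feature_table[name]
    if entry.get(field) == ALWAYS_ENABLED:
        # Pinned fields are immutable: ignore the request.
        return False
    entry[field] = new_value
    return True


def demo():
    table = {
        "database": {"state": ALWAYS_ENABLED, "auto_restart": ALWAYS_ENABLED},
        "snmp": {"state": "enabled", "auto_restart": "enabled"},
    }
    rejected = update_feature_field(table, "database", "state", "disabled")
    accepted = update_feature_field(table, "snmp", "state", "disabled")
    return rejected, accepted, table
```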
Rajkumar-Marvell
17045f42d1 Set sock rx Buf size to 3MB. (#5566)
* Set sock rx Buf size to 3MB.
2020-11-24 11:21:56 -08:00
Abhishek Dosi
ac5117f20b [submodule update] sonic-mgmt-framework
cc4c4db14439a2b91690df0189b62e011ec41f4c (HEAD -> 201911, origin/201911) Merge pull request #74 from project-arlo/fix_otel_dep_error
44df06e0d44bdf7ce49d4eb05ced34f06eb65133 Make sure redis library is checkout with correct commit ID
7ab88143fa4b89d2d7b8030c9ac7b5e6dba16251 Remove unsupported commands (#62)

Signed-off-by: Abhishek Dosi <abdosi@microsoft.com>
2020-11-23 23:49:29 -08:00
Junchao-Mellanox
ebc84bee94
Fix issue: fan.get_presence always return false (#5983) 2020-11-23 09:28:12 +02:00
Ying Xie
628cc2c11b [frr] remove frr rsyslog file outchannel (#5962)
- Why I did it
frr is creating /var/log/frr/frr.log inside the frr docker and letting it grow. It will eventually exhaust hard drive space.

This fixes issue #5965.

- How I did it
Remove rsyslog file outchannel so that frr won't generate /var/log/frr/frr.log inside the docker.

- How to verify it
Manually removed the outchannel and restart BGP docker, making sure that /var/log/frr/frr.log is no longer created inside the docker.

While restarting bgp docker, observed that base image /var/log/quagga/bgpd.log continued to grow and captured all FRR logs.
2020-11-21 09:49:48 -08:00
Prince Sunny
1c2c30fccd Set preference for forced mgmt routes (#5844)
When forced mgmt routes are present, the fix made as part of #5754 is incomplete.
Added a preference (priority) field to forced mgmt route ip rules.
2020-11-21 09:27:09 -08:00
Abhishek Dosi
fd73c84805 submodule update [sonic-swss]
756dd9c8123cd06dc581d9b2eb236334deee1850 (HEAD -> 201911, origin/201911) [201911 sonic-swss] Flushing FDB entries before removing BridgePort (#1516)
e3f22ea6685104a819440ecc0efe89c4bd3a0003 [201911/portsorch] Add correct stat list for port buffer drop counters (#1509)

Signed-off-by: Abhishek Dosi <abdosi@microsoft.com>
2020-11-21 09:00:52 -08:00
Abhishek Dosi
7045d4a5ef [submodule update] sonic-snmpagent
[RFC4292][Namespace][201911]: Fix implementation of RouteUpdater for multi-asic platform (#177)

Signed-off-by: Abhishek Dosi <abdosi@microsoft.com>
2020-11-19 10:42:42 -08:00
pavel-shirshov
5f5ec04dda [bgpcfgd]: Fixes for BBR (#5956)
* Add explicit default state into the constants.yml
* Enable/disable only peer-groups, available in the config
* Retrieve updates from frr before using configuration

Co-authored-by: Pavel Shirshov <pavel.contrib@gmail.com>
2020-11-19 10:42:42 -08:00
Junchao-Mellanox
500395c56e
[Mellanox] Support max/min speed for PSU fan (#5682) (#5801)
As the new hw-mgmt exposes the sysfs for PSU fan max speed, we need to support max/min speed for PSU fan in the Mellanox platform API.
Conflicts:
	platform/mellanox/mlnx-platform-api/sonic_platform/fan.py
2020-11-17 18:17:37 +02:00
shlomibitton
1b10f86554 [Mellanox] Fix for QSFP-DD channel status (#5900)
A wrong object init broke the API. Replace the object with the correct type.

Signed-off-by: Shlomi Bitton <shlomibi@nvidia.com>
2020-11-14 12:26:28 -08:00
shlomibitton
4088872bb5 [Mellanox] Enhance QSFP-DD DOM information (#5776)
The new driver supports fetching additional pages from the cable EEPROM.
There is additional information to parse now: RX/TX power, TX bias, TX fault and RX LOS.

Signed-off-by: Shlomi Bitton <shlomibi@nvidia.com>
2020-11-14 12:25:57 -08:00
Aravind Mani
add9752bff [devices]: DellEMC Z9264f buffer changes (#5429)
**- Why I did it**
Converted the two-SP model to a single-pool model and modified the buffer size.
**- How I did it**
Changed buffer_default settings for all the DellEMC Z9264f HWSKUs.
**- How to verify it**
Check SP register values in NPU shell.
**- Which release branch to backport (provide reason below if selected)**
Needs to be cherry-picked to the 201911 branch.
2020-11-14 12:25:19 -08:00
jostar-yang
8c2242000e [as5835-54x] Modify qsfp port reset to normal state (#5161)
HW sets the qsfp port to reset by default, so SW needs to set it to normal at boot.

1. Modify cpld driver to invert reset offset value
2. Set to normal at boot.
2020-11-14 12:24:32 -08:00
Abhishek Dosi
8efe97498c [submodule update] sonic-utilities
c0df6355deb8bc3685395f727983a5e9f3b06f61 (HEAD -> 201911, origin/201911) Updates to bgp config and show commands with BGP_INTERNAL_NEIGHBOR table (#1224) (#1237)
d683bb48604220942b9f6bdea90c0ea4ff4f72ef [CLI][show][platform] Added ASIC count in the output. (#1185) (#1227)
4585be10aa8e761ce1091ac4a20e562c2550970c [show] Fix 'show int neigh expected' (#1189)
29e4469d5e6c5058fe20c1ce71790f69b7193e7e [201911][fwutil]: Use logger from sonic-py-common (#1190)

Signed-off-by: Abhishek Dosi <abdosi@microsoft.com>
2020-11-14 08:42:39 -08:00
Lawrence Lee
2aa827f5b7 [buffers_config.j2]: Use correct cable lengths for backend devices (#5905)
* Remove 'backend' from device type strings so that backend devices ('BackEndToRRouter' and 'BackEndLeafRouter') are given the same cable lengths as regular device types.

Signed-off-by: Lawrence Lee <lawlee@microsoft.com>
2020-11-14 08:41:14 -08:00
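The normalization described in commit 2aa827f5b7 can be illustrated in Python, although the actual change is in the buffers_config.j2 template. The function names and the lookup table below are hypothetical, and the cable lengths are placeholder values, not the ones used by SONiC.

```python
# Placeholder lengths keyed by normalized device type; the real values live
# in buffers_config.j2 and are not reproduced here.
HYPOTHETICAL_CABLE_LENGTHS = {"torrouter": "5m", "leafrouter": "40m"}


def normalize_device_type(device_type):
    # Lower-case the type and strip 'backend' so that backend and regular
    # devices end up with the same key.
    return device_type.lower().replace("backend", "")


def cable_length_for(device_type):
    return HYPOTHETICAL_CABLE_LENGTHS.get(normalize_device_type(device_type))
```

With this normalization, `cable_length_for("BackEndToRRouter")` and `cable_length_for("ToRRouter")` resolve to the same table entry.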
Lawrence Lee
cb32b362f5 Make backend device checking more robust (#5730)
Treat devices that are ToRRouters (ToRRouters and BackEndToRRouters) the same when rendering templates
 Except for BackEndToRRouters belonging to a storage cluster, since these devices have extra sub-interfaces created
Treat devices that are LeafRouters (LeafRouters and BackEndLeafRouters) the same when rendering templates

Signed-off-by: Lawrence Lee <lawlee@microsoft.com>
2020-11-14 08:39:08 -08:00
pavel-shirshov
e9ff96d90e [bgp]: Update TSA functionality (#5906)
Fixed TSA bugs:
1. TSA didn't advertise Loopback ipv6 address
2. TSA and TSB changed BGP dynamic and BGP monitors sessions

**- How to verify it**
Build an image and run on your DUT.
```
admin@str-s6100-acs-1:~$ TSA
System Mode: Normal -> Maintenance
admin@str-s6100-acs-1:~$ vtysh -c 'show bgp ipv4 neighbors 10.0.0.1 advertised-routes'
BGP table version is 6, local router ID is 10.1.0.32, vrf id 0
Default local pref 100, local AS 64601
Status codes:  s suppressed, d damped, h history, * valid, > best, = multipath,
               i internal, r RIB-failure, S Stale, R Removed
Nexthop codes: @NNN nexthop's vrf id, < announce-nh-self
Origin codes:  i - IGP, e - EGP, ? - incomplete

   Network          Next Hop            Metric LocPrf Weight Path
*> 10.1.0.32/32     0.0.0.0                  0         32768 i

Total number of prefixes 1
admin@str-s6100-acs-1:~$ vtysh -c 'show bgp ipv6 neighbors fc00::a advertised-routes'
BGP table version is 6, local router ID is 10.1.0.32, vrf id 0
Default local pref 100, local AS 64601
Status codes:  s suppressed, d damped, h history, * valid, > best, = multipath,
               i internal, r RIB-failure, S Stale, R Removed
Nexthop codes: @NNN nexthop's vrf id, < announce-nh-self
Origin codes:  i - IGP, e - EGP, ? - incomplete

   Network          Next Hop            Metric LocPrf Weight Path
*> fc00:1::/64      ::                       0         32768 i

Total number of prefixes 1
admin@str-s6100-acs-1:~$ TSB
System Mode: Maintenance -> Normal
```

Co-authored-by: Pavel Shirshov <pavel.contrib@gmail.com>
2020-11-14 08:35:13 -08:00
madhanmellanox
a79c3c219d
[201911][caclmgrd] Accommodating case-insensitive rule props for Control plane ACLs (#5918)
Make control plane ACLs handle ACL rules case-insensitively. Currently, only upper-case ACL rules are handled.

Co-authored-by: Madhan Babu <madhan@arc-build-server.mtr.labs.mlnx>
2020-11-13 11:41:05 -08:00
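One way to get the case-insensitive behavior described in commit a79c3c219d is to normalize the rule property names before looking them up. This is a sketch under that assumption, with a hypothetical function name and sample properties; it is not the caclmgrd code.

```python
def normalize_rule_props(rule_props):
    """Return a copy of the rule properties with upper-cased keys."""
    return {key.upper(): value for key, value in rule_props.items()}


def demo():
    # Hypothetical rule with mixed-case property names.
    rule = {"src_ip": "10.0.0.0/8", "Packet_Action": "ACCEPT"}
    return normalize_rule_props(rule)
```

Code that previously expected only upper-case keys can then keep its lookups unchanged.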
Abhishek Dosi
c1feae8a80 [submodule update]
Schema update for BGP internal neighbor table (#389)

Signed-off-by: Abhishek Dosi <abdosi@microsoft.com>
2020-11-12 08:10:54 -08:00
Abhishek Dosi
1223775af9 [submodule update] sonic-platform-daemons
Remove log errors in single ASIC platforms with init Global config (#108)
2020-11-11 17:33:54 -08:00
judyjoseph
005702ba0e [multi-ASIC] util changes with the BGP_INTERNAL_NEIGHBOR table. (#5760)
- Why I did it
Update the routine is_bgp_session_internal() by checking the BGP_INTERNAL_NEIGHBOR table.
Additionally, to address the review comment on #5520,
add timer settings as well in the internal session templates and keep them minimal, as these sessions will always be up.
Updates to the internal tests data + add all of it to template tests.

- How I did it
Updated the APIs and the template files.

- How to verify it
Verified that the internal BGP sessions are displayed correctly by show commands using the is_bgp_session_internal() API.
2020-11-10 12:53:49 -08:00
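The table-based check from commit 005702ba0e might be sketched as below. The function name, its signature, and the dictionary standing in for the configuration database are assumptions; only the BGP_INTERNAL_NEIGHBOR table name comes from the message.

```python
def is_internal_session_sketch(config_db, neighbor_ip):
    """Treat a session as internal when its neighbor is in BGP_INTERNAL_NEIGHBOR."""
    return neighbor_ip in config_db.get("BGP_INTERNAL_NEIGHBOR", {})


def demo():
    # Hypothetical database contents with one internal and one external neighbor.
    config_db = {
        "BGP_INTERNAL_NEIGHBOR": {"10.1.0.1": {"asn": "65100"}},
        "BGP_NEIGHBOR": {"10.0.0.1": {"asn": "65200"}},
    }
    return (
        is_internal_session_sketch(config_db, "10.1.0.1"),
        is_internal_session_sketch(config_db, "10.0.0.1"),
    )
```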
judyjoseph
ce86621399 [multi-ASIC] BGP internal neighbor table support (#5520)
* Initial commit for BGP internal neighbor table support.
  > Add new template named "internal" for the internal BGP sessions
  > Add a new table in database "BGP_INTERNAL_NEIGHBOR"
  > The internal BGP sessions will be stored in this new table "BGP_INTERNAL_NEIGHBOR"

* Changes in template generation tests with the introduction of internal neighbor template files.
2020-11-10 12:52:58 -08:00
arlakshm
431f97d11d Add the vtysh command with newly added "-n" option for multi asic to the read_only_cmds (#5845)
On multi asic platforms the "show ip bgp summary" command is not available to users with read-only privileges. To fix this, the vtysh command with the new "-n" option, added for multi asic platforms, needs to be in the READ_ONLY_COMMANDS list in the sudoers files. This commit adds the command vtysh -n [0-9] -c show * to the list of READ_ONLY_COMMANDS in the sudoers files.

Signed-off-by: Arvindsrinivasan Lakshmi Narasimhan <arlakshm@microsoft.com>
2020-11-10 12:30:32 -08:00
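To see which invocations the pattern from commit 431f97d11d would cover, the sketch below uses Python's `fnmatch` as an approximation of the shell-style wildcard matching that sudoers applies to command arguments. The helper name and the sample commands are illustrative, and real sudoers entries may differ in details such as the full command path.

```python
from fnmatch import fnmatchcase

# Pattern quoted from the commit message.
PATTERN = "vtysh -n [0-9] -c show *"


def matches_read_only_pattern(command):
    # Case-sensitive shell-style match against the whole command line.
    return fnmatchcase(command, PATTERN)
```

Under this approximation, `[0-9]` matches exactly one digit, so namespace 0 through 9 is covered while a two-digit value such as `-n 12` is not, and non-show commands such as `configure terminal` are rejected.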
shlomibitton
ed186405dd Fix MSN4700 sensors labels (#5861)
Signed-off-by: Shlomi Bitton <shlomibi@nvidia.com>
2020-11-10 12:28:40 -08:00
abdosi
c453381aec [multi-asic] Fixed the docker mount point check for multi-asic (#5848)
The getMount() API was not updated to handle multi-asic platforms.
Updated getMount() to return abspath() for the Docker mount point
and use that for mount point comparison.

Signed-off-by: Abhishek Dosi <abdosi@microsoft.com>
2020-11-09 13:03:20 -08:00
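The reason for comparing absolute paths in commit c453381aec can be shown with a short sketch. The helper name and the example paths are hypothetical; the point is only that `os.path.abspath` normalizes different spellings of the same path before they are compared.

```python
import os


def same_mount_point(path_a, path_b):
    """Compare two mount point paths after normalizing them with abspath()."""
    return os.path.abspath(path_a) == os.path.abspath(path_b)
```

A plain string comparison would treat `/var/run/redis0/` and `/var/run/redis0` as different mount points, while the normalized comparison treats them as the same.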
Samuel Angebault
2284dd7a3c [led]: Skip ledinit if there is no led_proc_init.soc file for broadcom platform (#5483)
Some platforms don't leverage the brcm led coprocessor.
However, ledinit will try to load a nonexistent file and exit with an
error code.
This change is a cosmetic fix mostly.

- How to verify it

Boot a platform without the configuration and verify in the syslog that the exit status of ledinit is 0
Boot a platform with the configuration and verify in the syslog that the exit status of ledinit is 0 and the leds are working.
Verified by adding a dumb led_proc_init.soc on an Arista platform which usually doesn't use it.
2020-11-09 12:38:11 -08:00
Stepan Blyshchak
dc68576bab [hostcfgd] If feature state entry not in the cache, add a default state (#5777)
Our use case is to register new features at runtime. The previous change, which introduced the cache, broke this capability and caused hostcfgd to crash.

Signed-off-by: Stepan Blyshchak <stepanb@nvidia.com>
2020-11-09 12:34:42 -08:00
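The fix in commit dc68576bab amounts to seeding the cache for features it has not seen yet. A minimal sketch, assuming a plain dictionary as the cache and a hypothetical default of "disabled" (the actual default used by hostcfgd is not stated in the message):

```python
def handle_feature_update(cache, name, new_state, default_state="disabled"):
    """Record a feature state change, tolerating features missing from the cache."""
    # A feature registered at runtime has no cache entry yet; seed it with a
    # default instead of failing on the lookup.
    old_state = cache.setdefault(name, default_state)
    changed = old_state != new_state
    cache[name] = new_state
    return changed


def demo():
    cache = {"swss": "enabled"}
    changed = handle_feature_update(cache, "new_feature", "enabled")
    return changed, cache
```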
abdosi
65cc37cadf [multi-asic] teamdctl support for multi-asic (#5851)
Signed-off-by: Abhishek Dosi <abdosi@microsoft.com>
2020-11-09 12:33:41 -08:00