Commit Graph

3771 Commits

Author SHA1 Message Date
Abhishek Dosi
37a1b05b79 Fix Merge Conflict
Signed-off-by: Abhishek Dosi <abdosi@microsoft.com>
2020-11-25 15:15:17 -08:00
pavel-shirshov
619256f446 [bgpcfg]: Batch bgp updates (#6006)
* [bgpcfgd]: Batch bgp updates.

vtysh -f command is slow. It is sometimes takes about 3 seconds.
When we need to run many vtysh -f commands that slows down the system.
Batch vtysh -f updates.

* Use correct file to import run_command
2020-11-25 15:11:28 -08:00
abdosi
1d1898d8e2 Enhanced Feature table to support 'always_enabled' value for state and auto-restart fields. (#6000)
Added new flag value 'always_enabled' for the state and auto-restart field of feature table

init_cfg.json is updated to initialize state field of database/swss/syncd/teamd feature and auto-restart field of database feature
as always_enabled

Once the state/auto-restart value is initialized as "always_enabled" it is immutable and cannot be change via feature config commands. (config feature..) PR#Azure/sonic-utilities#1271

hostcfgd will not take any action if state field value is 'always_enabled'

Since we have always_enabled field for auto-restart updated supervisor-proc-exit-listener
not to have special check for database and always rely on value from Feature table.
2020-11-25 10:04:42 -08:00
Rajkumar-Marvell
17045f42d1 Set sock rx Buf size to 3MB. (#5566)
* Set sock rx Buf size to 3MB.
2020-11-24 11:21:56 -08:00
Abhishek Dosi
ac5117f20b [submodule update] sonic-mgmt-framework
cc4c4db14439a2b91690df0189b62e011ec41f4c (HEAD -> 201911, origin/201911) Merge pull request #74 from project-arlo/fix_otel_dep_error
44df06e0d44bdf7ce49d4eb05ced34f06eb65133 Make sure redis library is checkout with correct commit ID
7ab88143fa4b89d2d7b8030c9ac7b5e6dba16251 Remove unsupported commands (#62)

Signed-off-by: Abhishek Dosi <abdosi@microsoft.com>
2020-11-23 23:49:29 -08:00
Junchao-Mellanox
ebc84bee94
Fix issue: fan.get_presence always return false (#5983) 2020-11-23 09:28:12 +02:00
Ying Xie
628cc2c11b [frr] remove frr rsyslog file outchannel (#5962)
- Why I did it
frr is creating /var/log/frr/frr.log inside the frr docker and letting it grow. It will eventually exhaust hard drive space.

To fixe issue #5965

- How I did it
Remove rsyslog file outchannel so that frr won't generate /var/log/frr/frr.log inside the docker.

- How to verify it
Manually removed the outchannel and restart BGP docker, making sure that /var/log/frr/frr.log is no longer created inside the docker.

While restarting bgp docker, observed that base image /var/log/quagga/bgpd.log continued to grow and captured all FRR logs.
2020-11-21 09:49:48 -08:00
Prince Sunny
1c2c30fccd Set preference for forced mgmt routes (#5844)
When forced mgmt routes are present, the issue fixed as part of #5754 is not complete.
Added a preference(priority) field to forced mgmt route ip rules
2020-11-21 09:27:09 -08:00
Abhishek Dosi
fd73c84805 submodule update [sonic-swss]
756dd9c8123cd06dc581d9b2eb236334deee1850 (HEAD -> 201911, origin/201911)
[201911 sonic-swss] Flushing FDB entries before removing BridgePort (#1516)
e3f22ea6685104a819440ecc0efe89c4bd3a0003 [201911/portsorch] Add
correct stat list for port buffer drop counters (#1509)

Signed-off-by: Abhishek Dosi <abdosi@microsoft.com>
2020-11-21 09:00:52 -08:00
Abhishek Dosi
7045d4a5ef [submodule update] sonic-snmpagent
[RFC4292][Namespace][201911]: Fix implementation of RouteUpdater for
 multi-asic platform (#177)

Signed-off-by: Abhishek Dosi <abdosi@microsoft.com>
2020-11-19 10:42:42 -08:00
pavel-shirshov
5f5ec04dda [bgpcfgd]: Fixes for BBR (#5956)
* Add explicit default state into the constants.yml
* Enable/disable only peer-groups, available in the config
* Retrieve updates from frr before using configuration

Co-authored-by: Pavel Shirshov <pavel.contrib@gmail.com>
2020-11-19 10:42:42 -08:00
Junchao-Mellanox
500395c56e
[Mellanox] Support max/min speed for PSU fan (#5682) (#5801)
As new hw-mgmt expose the sysfs for PSU fan max speed, we need support max/min speed for PSU fan in mellanox platform API.
Conflicts:
	platform/mellanox/mlnx-platform-api/sonic_platform/fan.py
2020-11-17 18:17:37 +02:00
shlomibitton
1b10f86554 [Mellanox] Fix for QSFP-DD channel status (#5900)
Wrong object init broke the API. Replace object to the correct type.

Signed-off-by: Shlomi Bitton <shlomibi@nvidia.com>
2020-11-14 12:26:28 -08:00
shlomibitton
4088872bb5 [Mellanox] Enhance QSFP-DD DOM information (#5776)
New driver support fetching additional pages from the cable EEPROM.
There are additional information to parse now: RX/TX power, TX bias, TX fault and RX LOS.

Signed-off-by: Shlomi Bitton <shlomibi@nvidia.com>
2020-11-14 12:25:57 -08:00
Aravind Mani
add9752bff [devices]: DellEMC Z9264f buffer changes (#5429)
**- Why I did it**
Converted two SP model to single pool model and modified the buffer size.
**- How I did it**
Changed buffer_default settings for all the DellEMC Z9264f HWSKU's.
**- How to verify it**
Check SP register values in NPU shell.
**- Which release branch to backport (provide reason below if selected)**
Need to be cherry picked for 201911 branch.
2020-11-14 12:25:19 -08:00
jostar-yang
8c2242000e [as5835-54x] Modify qsfp port reset to normal state (#5161)
HW set qsfp port to reset at default. so need SW to set to normal when boot.

1. Modify cpld driver to invert reset offset value
2. Set to normal when boot.
2020-11-14 12:24:32 -08:00
Abhishek Dosi
8efe97498c [submodule update] sonic-utilities
c0df6355deb8bc3685395f727983a5e9f3b06f61 (HEAD -> 201911, origin/201911) Updates to bgp config and show commands with BGP_INTERNAL_NEIGHBOR table (#1224) (#1237)
d683bb48604220942b9f6bdea90c0ea4ff4f72ef [CLI][show][platform] Added ASIC count in the output. (#1185) (#1227)
4585be10aa8e761ce1091ac4a20e562c2550970c [show] Fix 'show int neigh expected' (#1189)
29e4469d5e6c5058fe20c1ce71790f69b7193e7e [201911][fwutil]: Use logger from sonic-py-common (#1190)

Signed-off-by: Abhishek Dosi <abdosi@microsoft.com>
2020-11-14 08:42:39 -08:00
Lawrence Lee
2aa827f5b7 [buffers_config.j2]: Use correct cable lengths for backend devices (#5905)
* Remove 'backend' from device type strings so that backend devices ('BackEndToRRouter' and 'BackEndLeafRouter') are given the same cable lengths as regular device types.

Signed-off-by: Lawrence Lee <lawlee@microsoft.com>
2020-11-14 08:41:14 -08:00
Lawrence Lee
cb32b362f5 Make backend device checking more robust (#5730)
Treat devices that are ToRRouters (ToRRouters and BackEndToRRouters) the same when rendering templates
 Except for BackEndToRRouters belonging to a storage cluster, since these devices have extra sub-interfaces created
Treat devices that are LeafRouters (LeafRouters and BackEndLeafRouters) the same when rendering templates

Signed-off-by: Lawrence Lee <lawlee@microsoft.com>
2020-11-14 08:39:08 -08:00
pavel-shirshov
e9ff96d90e [bgp]: Update TSA functionality (#5906)
Fixed TSA bugs:
1. TSA didn't advertise Loopback ipv6 address
2. TSA and TSB changed BGP dynamic and BGP monitors sessions

**- How to verify it**
Build an image and run on your DUT.
```
admin@str-s6100-acs-1:~$ TSA
System Mode: Normal -> Maintenance
admin@str-s6100-acs-1:~$ vtysh -c 'show bgp ipv4 neighbors 10.0.0.1 advertised-routes'
BGP table version is 6, local router ID is 10.1.0.32, vrf id 0
Default local pref 100, local AS 64601
Status codes:  s suppressed, d damped, h history, * valid, > best, = multipath,
               i internal, r RIB-failure, S Stale, R Removed
Nexthop codes: @NNN nexthop's vrf id, < announce-nh-self
Origin codes:  i - IGP, e - EGP, ? - incomplete

   Network          Next Hop            Metric LocPrf Weight Path
*> 10.1.0.32/32     0.0.0.0                  0         32768 i

Total number of prefixes 1
admin@str-s6100-acs-1:~$ vtysh -c 'show bgp ipv6 neighbors fc00::a advertised-routes'
BGP table version is 6, local router ID is 10.1.0.32, vrf id 0
Default local pref 100, local AS 64601
Status codes:  s suppressed, d damped, h history, * valid, > best, = multipath,
               i internal, r RIB-failure, S Stale, R Removed
Nexthop codes: @NNN nexthop's vrf id, < announce-nh-self
Origin codes:  i - IGP, e - EGP, ? - incomplete

   Network          Next Hop            Metric LocPrf Weight Path
*> fc00:1::/64      ::                       0         32768 i

Total number of prefixes 1
admin@str-s6100-acs-1:~$ TSB
System Mode: Maintenance -> Normal
```

Co-authored-by: Pavel Shirshov <pavel.contrib@gmail.com>
2020-11-14 08:35:13 -08:00
madhanmellanox
a79c3c219d
[201911][caclmgrd] Accomadating case insensitive rule props for Control plane ACLs (#5918)
To make Control plane ACLs handle case insensitive ACL rules. Currently, it handles only upper case ACL rules.

Co-authored-by: Madhan Babu <madhan@arc-build-server.mtr.labs.mlnx>
2020-11-13 11:41:05 -08:00
Abhishek Dosi
c1feae8a80 [submodule update]
Schema update for BGP internal neighbor table (#389)

Signed-off-by: Abhishek Dosi <abdosi@microsoft.com>
2020-11-12 08:10:54 -08:00
Abhishek Dosi
1223775af9 [submodule update] sonic-platform-daemons
Semove log errors in single ASIC platforms with init Global config
(#108)
2020-11-11 17:33:54 -08:00
judyjoseph
005702ba0e [multi-ASIC] util changes with the BGP_INTERNAL_NEIGHBOR table. (#5760)
- Why I did it
Update the routine is_bgp_session_internal() by checking the BGP_INTERNAL_NEIGHBOR table.
Additionally to address the review comment #5520 (comment)
Add timer settings as will in the internal session templates and keep it minimal as these sessions which will always be up.
Updates to the internal tests data + add all of it to template tests.

- How I did it
Updated the APIs and the template files.

- How to verify it
Verified the internal BGP sessions are displayed correctly with show commands with this API is_bgp_session_internal()
2020-11-10 12:53:49 -08:00
judyjoseph
ce86621399 [multi-ASIC] BGP internal neighbor table support (#5520)
* Initial commit for BGP internal neighbor table support.
  > Add new template named "internal" for the internal BGP sessions
  > Add a new table in database "BGP_INTERNAL_NEIGHBOR"
  > The internal BGP sessions will be stored in this new table "BGP_INTERNAL_NEIGHBOR"

* Changes in template generation tests with the introduction of internal neighbor template files.
2020-11-10 12:52:58 -08:00
arlakshm
431f97d11d Add the vtysh command with newly added "-n" option for multi asic to the read_only_cmds (#5845)
In multi asic platforms the "show ip bgp summary" commands is not available for user with read only privileges, so to fix this the vtysh command with the new "-n" option, added for multi asic platforms, needs to be added to the READ_ONLY_COMMANDS list in the sudoers files. Added the command vtysh -n [0-9] -c show * to list of READ_ONLY_COMMANDS in the sudoers files in this commit.

Signed-off-by: Arvindsrinivasan Lakshmi Narasimhan <arlakshm@microsoft.com>
2020-11-10 12:30:32 -08:00
shlomibitton
ed186405dd Fix MSN4700 sensors labels (#5861)
Signed-off-by: Shlomi Bitton <shlomibi@nvidia.com>
2020-11-10 12:28:40 -08:00
abdosi
c453381aec [multi-asic] Fixed the docker mount point check for multi-asic (#5848)
API getMount() API was not updated to handle multi-asic platforms
Updated API getMount() to return abspath() for Docker Mount Point
and use that one for mount point comparison

Signed-off-by: Abhishek Dosi <abdosi@microsoft.com>
2020-11-09 13:03:20 -08:00
Samuel Angebault
2284dd7a3c [led]: Skip ledinit if there is no led_proc_init.soc file for broadcom platform (#5483)
Some platforms don't leverage the brcm led coprocessor.
However ledinit will try to load a non existing file and exit with an
error code.
This change is a cosmetic fix mostly.

- How to verify it

Boot a platform without the configuration and verify in the syslog that the exit status of ledinit is 0
Boot a platform with the configuration and verify in the syslog that the exit status of ledinit is 0 and the leds are working.
Verified by adding a dumb led_proc_init.soc on an Arista platform which usually doesn't use it.
2020-11-09 12:38:11 -08:00
Stepan Blyshchak
dc68576bab [hostcfgd] If feature state entry not in the cache, add a default state (#5777)
Our use case is to register new features in runtime. The previous change which introduced the cache broke this capability and caused hostcfgd crash.

Signed-off-by: Stepan Blyshchak <stepanb@nvidia.com>
2020-11-09 12:34:42 -08:00
abdosi
65cc37cadf [multi-asic] teamdctl support for multi-asic (#5851)
Signed-off-by: Abhishek Dosi <abdosi@microsoft.com>
2020-11-09 12:33:41 -08:00
Junchao-Mellanox
c3ea7b4d91
[201911][sonic-platform-daemons] Update submodule pointer (#5802)
e030133 [thermalctld] Print exception using repr(e) to get more information (#103) (#109)
2020-11-06 12:47:12 -08:00
liat-grozovik
d72517d78e
[submodule] update sonic-swss submodule (#5824)
Including the following changes:
[bitmap_vnet] Remove BMTOR implementation (#1496)
[intfsorch] Init proxy_arp variable while adding router interface. (#1473)
[drop counters] Clarify log messages for initial counter setup (#1445)

Signed-off-by: Liat Grozovik <liatg@nvidia.com>
2020-11-05 18:32:32 -08:00
Aravind Mani
825e05ae95
DellEMC Z9264f: Fix show version error (#5808)
import os.path in eeprom.py to fix the issue
2020-11-05 00:19:51 -08:00
Samuel Angebault
8efc830718
[Arista] Update driver submodules (#5811)
- Improve SMBus performance
 - Introduce devel package currently not used by sonic build system
2020-11-04 17:06:30 -08:00
Joe LeVeque
4dbc6da85b
[201911][core_cleanup.py] Fix core file path; Improve code reuse (#5782)
- Fix bug: `CORE_FILE_DIR` previously was set to `os.path.basename(__file__)`, which would resolve to the script name. Fix this by hardcoding to `/var/core/`
- Remove locally-define logging functions; use Logger class from sonic-py-common instead
2020-11-03 11:15:49 -08:00
Nazarii Hnydyn
0b7518b7b5 [Mellanox] Update platform components config files. (#5685)
Signed-off-by: Nazarii Hnydyn <nazariig@nvidia.com>
2020-11-03 08:19:19 -08:00
Junchao-Mellanox
1070d024bc [thermalctld] Enlarge startretries value to avoid thermalctld not able to restart during regression test (#5633)
Increase startretires value from default of 10 to 50 to prevent supervisor from placing thermalctld in FATAL state during regression testing. Also ensures supervisord tries hard to get thermalctld running in production, as thermalctld is critical to prevent device from overheating.
2020-11-03 08:19:19 -08:00
shlomibitton
40190298ca [Mellanox] Add sensors labels for human readable output for MSN2010 (#5658)
Add sensors labels for human readable output for MSN2010

Signed-off-by: Shlomi Bitton <shlomibi@nvidia.com>
2020-11-03 08:19:19 -08:00
shlomibitton
86249769d3 [Mellanox] Add sensors labels for human readable output for MSN2100 (#5659)
Add sensors labels for human readable output for MSN2100

Signed-off-by: Shlomi Bitton <shlomibi@nvidia.com>
2020-11-03 08:19:19 -08:00
shlomibitton
00638f51bf [Mellanox] Add sensors labels for human readable output for MSN2410 (#5660)
Add sensors labels for human readable output for MSN2410
2020-11-03 08:19:19 -08:00
shlomibitton
fcce160bdc [Mellanox] Add sensors labels for human readable output for MSN2700 (#5661)
Add sensors labels for human readable output for MSN2700

Signed-off-by: Shlomi Bitton <shlomibi@nvidia.com>
2020-11-03 08:19:19 -08:00
shlomibitton
31109194c2 [Mellanox] Add sensors labels for human readable output for MSN2740 (#5662)
Add sensors labels for human readable output for MSN2740
2020-11-03 08:19:19 -08:00
shlomibitton
55f6ed8288 [Mellanox] Fixes sensors labels for human readable output for MSN3420 (#5664)
Fixes sensors labels for human readable output for MSN3420

Signed-off-by: Shlomi Bitton <shlomibi@nvidia.com>
2020-11-03 08:19:19 -08:00
Nazarii Hnydyn
781abed79e
[Mellanox] Update SAI to v.1.17.7. (#5766)
Signed-off-by: Nazarii Hnydyn <nazariig@nvidia.com>
2020-11-02 10:51:49 +02:00
Junchao-Mellanox
712d97f911
[Mellanox] Update SDK 4.4.1956 and FW *.2008.1956 for 201911 (#5769)
Update SDK 4.4.1956 and FW *.2008.1956

Bugs fixes:

1.	Link | Clear operational speed when link is not active
2.	Spectrum-2, SN3800 | On rare occasion, link flapping due to bad BER causes traffic loss
3.	Spectrum-3 | On rare occasion, link flapping due to bad BER causes traffic loss as a result of new PAM4 link maintenance flow on Spectrum-3 devices
4.	Shared Buffers | On rare occasion, modifying shared buffers on a system with split port while traffic is running may cause the firmware to get stuck
5.	Spectrum-3, SN4700 | Fence may fail while running 400GbE 8x port when modifying mirror session configurations under traffic
2020-11-01 23:20:27 -08:00
lguohan
339d2aa6c8 [mgmt ip]: mvrf ip rule priority change to 32765 (#5754)
Fix Azure/SONiC#551

When eth0 IP address is configured, an ip rule is getting added for eth0 IP address through the interfaces.j2 template.

This eth0 ip rule creates an issue when VRF (data VRF or management VRF) is also created in the system.
When any VRF (data VRF or management VRF) is created, a new rule is getting added automatically by kernel as "1000: from all lookup [l3mdev-table]".
This l3mdev IP rule is never getting deleted even if VRF is deleted.

Once if this l3mdev IP rule is added, if user configures IP address for the eth0 interface, interfaces.j2 adds an eth0 IP rule as "1000:from 100.104.47.74 lookup default ". Priority 1000 is automatically chosen by kernel and hence this rule gets higher priority than the already existing rule "1001:from all lookup local ".

This results in an issue "ping from console to eth0 IP does not work once if VRF is created" as explained in Issue 551.
More details and possible solutions are explained as comments in the Issue551.

This PR is to resolve the issue by always fixing the low priority 32765 for the IP rule that is created for the eth0 IP address.
Tested with various combinations of VRF creation, deletion and IP address configuration along with ping from console to eth0 IP address.

Co-authored-by: Kannan KVS <kannan_kvs@dell.com>
2020-11-01 10:41:44 -08:00
Abhishek Dosi
65cb10714c Revert "[mgmt ip]: mvrf ip rule priority change to 32765 (#5754)"
This reverts commit 28366cd0ce.
2020-11-01 10:37:16 -08:00
gechiang
55be531dd1 Added new method get_back_end_interface_set() to speed up back-end in… (#5731)
Added new MultiASIC util method "get_back_end_interface_set()" to speed up back-end interface check by allowing caller to cache the back-end intf into a set. This way the caller can use this set for all subsequent back-end interface check requests  instead of each time need to read from redis DB which become a scaling issue for cases such as checking for thousands of nexthop routes for filtering purpose.
2020-11-01 10:27:10 -08:00
Renuka Manavalan
ecd10b9d10 Load config after subscribe (#5740)
- Why I did it
The update_all_feature_states can run in the range of 20+ seconds to one minute. With load of AAA & Tacacs preceding it, any DB updates in AAA/TACACS during the long running feature updates would get missed. To avoid, switch the order.

- How I did it
Do a load after after updating all feature states.

- How to verify it
Not a easy one
Have a script that
restart hostcfgd
sleep 2s
run redis-cli/config command to update AAA/TACACS table

Run the script above and watch the file /etc/pam.d/common-auth-sonic for a minute.

- When it repro:
The updates will not reflect in /etc/pam.d/common-auth-sonic
2020-11-01 10:27:10 -08:00