6891 Commits

Author SHA1 Message Date
Stephen Sun
9f0dce0313
[Mellanox] Optimize SFP modules initialization (#7537)
Originally, SFP modules were always accessed from platform daemons, and a daemon can access arbitrary SFP modules. So all SFP modules were initialized in one shot as soon as one of the following chassis APIs was called:
- get_all_sfps
- get_num_sfps
- get_sfp

Recently, we noticed that SFP modules can also be accessed from the CLI, e.g. by the latest refactor of `sfputil`.

In this case, only one SFP module is accessed in the chassis object's life cycle.
Initializing all SFP modules in one shot is a waste of time and makes the CLI take much longer to finish.
So we would like to optimize the initialization flow by introducing a two-phase initialization approach:
- Partial initialization, which means the `chassis._sfp_list` has been initialized with proper length and all elements being `None`
- Full initialization, which means all elements in `chassis._sfp_list` are created

If the relevant function is called,
- `get_sfp`, only partial initialization will be done, and then the specific SFP module is initialized.
- `get_all_sfps` or `get_num_sfps`, full initialization will be done, which means all SFP modules are initialized.
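
The two-phase flow described above can be sketched roughly as follows. This is a toy illustration, not the actual Mellanox implementation: the helper names `_initialize_sfp_partial`, `_initialize_sfp_full` and `_create_sfp`, the `sfp_count` constructor argument, the `created` bookkeeping list and the 0-based index are all assumptions made for the example.

```python
class Chassis:
    """Toy chassis illustrating partial vs. full SFP initialization."""

    def __init__(self, sfp_count):
        self._sfp_count = sfp_count  # hypothetical: real code derives this from the platform
        self._sfp_list = None        # nothing initialized yet
        self.created = []            # records which indexes were actually built

    def _create_sfp(self, index):
        # Stand-in for the expensive per-module setup.
        self.created.append(index)
        return {"index": index}

    def _initialize_sfp_partial(self):
        # Phase 1: allocate the list with the proper length, all elements None.
        if self._sfp_list is None:
            self._sfp_list = [None] * self._sfp_count

    def _initialize_sfp_full(self):
        # Phase 2: create every module that has not been created yet.
        self._initialize_sfp_partial()
        for index, sfp in enumerate(self._sfp_list):
            if sfp is None:
                self._sfp_list[index] = self._create_sfp(index)

    def get_sfp(self, index):
        # Only the requested module is created.
        self._initialize_sfp_partial()
        if self._sfp_list[index] is None:
            self._sfp_list[index] = self._create_sfp(index)
        return self._sfp_list[index]

    def get_all_sfps(self):
        self._initialize_sfp_full()
        return self._sfp_list

    def get_num_sfps(self):
        self._initialize_sfp_full()
        return len(self._sfp_list)
```

With this shape, a CLI that calls `get_sfp` once pays for one module only, while a daemon that calls `get_all_sfps` still gets every module.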

Signed-off-by: Stephen Sun <stephens@nvidia.com>
2021-05-06 10:14:48 -07:00
Junchao-Mellanox
a795bc0b8e
[Mellanox] Support new sensor conf file for MSN4700 A1/A0 (#7535)
#### Why I did it

MSN4700 A1/A0 use a different sensor chip but keep the existing platform name *x86_64-mlnx_msn4700-r0*. This is a workaround to replace the sensor conf on MSN4700 A1/A0.

#### How I did it

Use a shell script to get the sensor conf path and copy that file to /etc/sensors.d/sensors.conf
2021-05-06 10:13:26 -07:00
shlomibitton
557483d0b7
Revert "[submodule]: Update sonic-swss (#7478)" (#7524)
This reverts commit 963e7f4c2c.
2021-05-06 17:40:08 +03:00
lguohan
15be15392d
[ci]: build swi on broadcom platform for pr (#7522)
Signed-off-by: Guohan Lu <lguohan@gmail.com>
2021-05-05 15:42:33 -07:00
Aravind Mani
659d078fd3
DellEMC: Z9332f SFP enhancements (#7457)
#### Why I did it
400G media EEPROM and DOM information are not populated properly in DellEMC Z9332f platform.

#### How I did it
Handled QSFP_DD, QSFP28/QSFP+, SFP+ accordingly based on media type detected.
2021-05-05 10:03:11 -07:00
Guohan Lu
a2d33a2a37
[ci]: enable sonic-slave scheduled build on 202012
Signed-off-by: Guohan Lu <lguohan@gmail.com>
2021-05-05 07:33:31 -07:00
Aravind Mani
e7db9fe46c
DellEMC: Z9332f media settings (#7485)
Changed DellEMC Z9332f media settings from Vendor Name + PN method to common method.
2021-05-05 06:50:24 -07:00
shlomibitton
2d3149d641
[Mellanox] Update FW to xx.2008.2526 (#7511)
- Why I did it
Updated FW to xx.2008.2526 version.

Fixed issues:
1. Spectrum-2, Spectrum-3 | sFlow | High CPU load and high on fully loaded switch.
2. Spectrum-2, Spectrum-3 | Fine grain LAG | in rare cases doesn’t update the right entry

- How I did it
Updated submodule pointer and version in a Makefile.

- How to verify it
Full regression and bugs validation

Signed-off-by: Shlomi Bitton <shlomibi@nvidia.com>
2021-05-05 09:47:57 +03:00
Alexander Allen
0bc0f98d48
[platform] Add serial number and model number to Mellanox PSU platform implementation (#7382)
#### Why I did it

We want to add the ability for the command `show platform psustatus` to show the serial number and part number of the PSU devices on Mellanox platforms. This will be useful for data-center management of field replaceable units (FRUs) on switches.

#### How I did it

I implemented the platform 2.0 functions `get_model()` and `get_serial()` for the PSU in the Mellanox platform API by referencing the sysfs nodes provided by the [hw-management](https://github.com/Azure/sonic-buildimage/tree/master/platform/mellanox/hw-management) module.
2021-05-04 13:07:00 -07:00
Andriy Yurkiv
e52fdcfd72
[Mellanox] Add support to VXLAN src port range setting via SAI profile for SN3800-D28C49S1 (#7500)
- Why I did it
Enable VXLAN src port range configuration via SAI profile for Mellanox-SN3800-D28C49S1 SKU

- How I did it
Added SAI_VXLAN_SRCPORT_RANGE_ENABLE=1 configuration to appropriate sai.profile

Signed-off-by: Andriy Yurkiv <ayurkiv@nvidia.com>
2021-05-04 17:34:33 +03:00
Stephen Sun
b2286a24dc
[Mellanox] Adopt single way to get fan direction for all ASIC types (#7386)
#### Why I did it
Adopt a single way to get fan direction for all ASIC types.
It depends on hw-mgmt V.7.0010.2000.2303. Depends on https://github.com/Azure/sonic-buildimage/pull/7419

#### How I did it
Originally, the get_direction was implemented by fetching and parsing `/var/run/hw-management/system/fan_dir` on the Spectrum-2 and the Spectrum-3 systems. It isn't supported on the Spectrum system.
Now, it is implemented by fetching `/var/run/hw-management/thermal/fanX_dir` for all the platforms.

Signed-off-by: Stephen Sun <stephens@nvidia.com>
2021-05-03 17:10:18 -07:00
Guohan Lu
853c214951
[ci]: set -ex for official build to exit on any build failures
Signed-off-by: Guohan Lu <lguohan@gmail.com>
2021-05-03 12:31:52 -07:00
Guohan Lu
aeb73ad529
[ci]: increase official build to 12 hours
10-hour limit is not enough to finish several jobs.

Signed-off-by: Guohan Lu <lguohan@gmail.com>
2021-05-03 12:31:31 -07:00
Kebo Liu
5ac048f7e7
[Mellanox] Enhance the platform.json with adding more platform device facts. (#7495)
#### Why I did it

The current platform.json lacks some peripheral-device facts, such as chassis/fan/psu/drawer/thermal/component names, numbers, etc.

#### How I did it

Add platform device facts to the platform.json file

Signed-off-by: Kebo Liu <kebol@nvidia.com>
2021-05-03 12:22:13 -07:00
trzhang-msft
4f2b54e735
dhcpmon: support dual tor in docker template (#7470)
2021-05-03 10:51:34 -07:00
trzhang-msft
6f85908d4f
dhcpmon: support dual tor scenario (#7471)
2021-05-03 10:51:26 -07:00
Wirut Getbamrung
cfda77b3de
[device/celestica]: Add thermalctld support on Haliburton platform APIs (#6493)
- Removed the old function for detecting a faulty fan.
- Removed the old function for detecting excess temperature.
- Implement thermal_manager APIs based on ThermalManagerBase
- Implement thermal_conditions APIs based on ThermalPolicyConditionBase
- Implement thermal_actions APIs based on ThermalPolicyActionBase
- Implement thermal_info APIs based on ThermalPolicyInfoBase
- Add thermal_policy.json
2021-05-03 09:14:35 -07:00
Guohan Lu
e1ff8b6ad6
[ci]: change timeout to 36 hours for armhf/arm64 build
Signed-off-by: Guohan Lu <lguohan@gmail.com>
2021-05-02 09:55:15 -07:00
Junchao-Mellanox
d9cdf9d14f
[Mellanox] Adjust PSU fan name to align with sysfs file name (#7490)
Change PSU fan name from psu_{psu_index}fan{fan_index} to psu{psu_index}_fan{fan_index}
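
As a small illustration of the new naming pattern, a helper like the one below would produce names in the `psu{psu_index}_fan{fan_index}` form. The function name `psu_fan_name` is hypothetical and not taken from the platform code.

```python
def psu_fan_name(psu_index, fan_index):
    """Build a PSU fan name in the psu{psu_index}_fan{fan_index} form."""
    return "psu{}_fan{}".format(psu_index, fan_index)
```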
2021-05-02 08:14:56 -07:00
guxianghong
be4cf09b5d
[Centec][arm64] support new board E530-48s4x and E530-24x2q (#7189)
1. support new board E530-48s4x E530-24x2q
2. optimize platform driver for Centec TsingMa board

Co-authored-by: shi lei <shil@centecnetworks.com>
2021-05-01 10:37:07 -07:00
a-barboza
78b45085e9
[radius] Management User Authentication Feature Issue (#7420) (#7503)
Fix invalid file name on Windows caused by the ':' character. #7420
2021-05-01 10:25:20 -07:00
Shilong Liu
a10542e894
[CI] Add Bldenv pipeline files (#7458)
Add 3 pipeline files:

- pipeline to build the docker images sonic-slave-[buster|jessie|stretch] for amd64/armhf/arm64 and push them to ACR(sonicdev-microsoft.com)
- pipeline to build the docker image sonic-mgmt and push it to ACR
- pipeline to clean dpkg cache entries created more than 30 days ago.

Co-authored-by: lguohan <lguohan@gmail.com>
2021-04-30 16:35:38 -07:00
Lawrence Lee
1b39424520
[docker-orchagent]: Increase ndppd kernel poll interval (#7456)
Why I did it
ndppd by default reads /proc/net/ipv6_route every 30 seconds. Since T1s advertise so many routes to ToRs, this file is extremely large, and reading it causes ndppd's CPU usage to spike every 30 seconds.

How I did it
Increase the delay for reading this file to the maximum possible value (max integer value), which will result in CPU spikes every ~24 days instead of every 30 seconds
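
The "~24 days" figure can be checked with a little arithmetic. This is a sketch under two assumptions that the commit message does not state explicitly: that the interval is expressed in milliseconds, and that "max integer value" means the signed 32-bit maximum.

```python
# Assumption: the interval is configured in milliseconds.
OLD_INTERVAL_MS = 30 * 1000          # the 30-second default from the text
# Assumption: "max integer value" is the signed 32-bit maximum.
NEW_INTERVAL_MS = 2**31 - 1

MS_PER_DAY = 24 * 60 * 60 * 1000

new_interval_days = NEW_INTERVAL_MS / MS_PER_DAY
print(round(new_interval_days, 2))   # a little under 25 days
```

Under those assumptions the new interval is roughly 24.86 days, which matches the figure quoted above.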

How to verify it
Start ndppd with the new config file, confirm that no CPU spikes are seen except at startup

Signed-off-by: Lawrence Lee <lawlee@microsoft.com>
2021-04-30 16:30:30 -07:00
Stepan Blyshchak
668a678e5e
[sonic-utilities] update sonic-utilities submodule (#7481)
08337aa [sonic-package-manager] first phase implementation of sonic-package-manager (#1527)
c166f66 [multi-asic] support show ip bgp neigh/network for multi asic (#1574)

Signed-off-by: Stepan Blyschak <stepanb@nvidia.com>
2021-04-30 15:45:04 -07:00
Ying Xie
5da0046755
[makefile] define a do-nothing target for config.user (#7483)
Why I did it
After PR #7344, 'make init' and/or 'make reset' will also build sonic slave dockers.

'-include rules/config.user' is supposed to be fine when the file is missing. However, when the file is missing, it generates a delayed error which later causes make init and make reset to try to build the sonic slave dockers.

How I did it
Define a do-nothing target for config.user to catch the config.user build, thereby preventing other builds from being triggered unexpectedly.

How to verify it
Ran make init; it now only does submodule init.
2021-04-30 13:04:15 -07:00
vmittal-msft
68dfa704b3
Updated Qos/MMU settings for Arista-7050CX3-32S-C32 & Arista-7050CX3-32S-D48C8 (#7068)
* TD3 Qos/MMU settings for Arista-7050CX3-32S-C32 & Arista-7050CX3-32S-D48C8
2021-04-30 10:02:08 -07:00
Wei Bai
3967c28a76
[docker-sonic-mgmt]: Upgrade Tgen version in SONiC mgmt docker (#7472)
2021-04-29 12:31:46 -07:00
Joe LeVeque
64c3d3a7bf
[caclmgrd] Remove sleep which allowed threads to progress (#7475)
Previously, a brief sleep was necessary in order to get Python threads to progress. The root cause of this has since been found and fixed in sonic-swss-common: Azure/sonic-swss-common#477. The submodule was updated here, so we can now safely remove this sleep.

This PR should also be cherry-picked to the 202012 branch once the submodule is updated there to also include the fix.
2021-04-29 11:07:04 -07:00
shlomibitton
963e7f4c2c
[submodule]: Update sonic-swss (#7478)
[flex-counters] Delay flex counters stats init for faster boot time (Azure/sonic-swss#1646)
[routeorch] Add support for blackhole routes (Azure/sonic-swss#1723)
Update pool sizes during initialization from timer only (Azure/sonic-swss#1708)

Signed-off-by: Shlomi Bitton <shlomibi@nvidia.com>
2021-04-29 11:06:06 -07:00
Andriy Yurkiv
21009be840
[devices][hwsku] add support to VXLAN src port range feature (#7394)
Enable VXLAN src port range configuration via SAI profile
2021-04-29 10:05:02 -07:00
Stephen Sun
b3a283366c
Fix issue: exception occurred during chassis object being destroyed (#7446)
The following error message is observed during chassis object being destroyed

"Exception ignored in: <function Chassis.__del__ at 0x7fd22165cd08>
Traceback (most recent call last):
  File "/usr/local/lib/python3.7/dist-packages/sonic_platform/chassis.py", line 83, in __del__
ImportError: sys.meta_path is None, Python is likely shutting down
The chassis tries to import deinitialize_sdk_handle while being destroyed in order to release the sdk_handle.
However, importing another module during shutdown can cause the error because some of the fundamental infrastructure is no longer available."

This error occurs when a chassis object is created and then destroyed in the Python shell.

- How I did it
To fix it, record the deinitialize_sdk_handle in the chassis object when sdk_handle is being initialized and call the deinitialize handler when the chassis object is being destroyed
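
The idea of the fix can be sketched as below. This is a simplified illustration rather than the actual chassis code: the method `initialize_sdk_handle` with its two callable arguments and the attribute `_deinit_sdk_handle` are hypothetical names. In the real code the deinitialize function would come from an import performed at initialization time, while the interpreter is still fully alive.

```python
class Chassis:
    """Toy chassis that releases its SDK handle without importing in __del__."""

    def __init__(self):
        self.sdk_handle = None
        self._deinit_sdk_handle = None  # recorded when the handle is created

    def initialize_sdk_handle(self, initialize, deinitialize):
        # Both callables are resolved here, so no import is needed later.
        self.sdk_handle = initialize()
        self._deinit_sdk_handle = deinitialize

    def __del__(self):
        # Use the recorded handler; nothing is imported during destruction.
        if self.sdk_handle is not None and self._deinit_sdk_handle is not None:
            self._deinit_sdk_handle(self.sdk_handle)
            self.sdk_handle = None
```

Because `__del__` only touches attributes already stored on the object, it no longer depends on the import machinery being available at interpreter shutdown.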

- How to verify it
Manually test.
2021-04-29 11:33:39 +03:00
Qi Luo
e3b2a040b2
[sonic-swss-common] Update submodule (#7467)
Includes commits
```
f3e1085 2021-04-22 | [swig] fix ConfigDBConnector.db_name (#483) [Qi Luo]
0e2f1c0 2021-04-21 | [swig] Implement SonicV2Connector.hmset() (#480) [Qi Luo]
d18ce28 2021-04-21 | [swss-common] Add MUX Metrics Table (#482) [Tamer Ahmed]
2e5a194 2021-04-20 | Support for in-band-mgmt via management VRF (#479) [Venkatesan Mahalingam]
3e5529f 2021-04-19 | [swig] Implement TableEntryPoppable.pops() (#478) [Qi Luo]
4a3903b 2021-04-19 | Support for in-band-mgmt via management VRF. (#476) [Venkatesan Mahalingam]
fc2c734 2021-04-19 | [swig] allow threads (#477) [Qi Luo]
```
2021-04-28 19:04:16 -07:00
Junchao-Mellanox
ccc7bd1315
[Mellanox] Upgrade hw-mgmt to 7.0100.2303 (#7419)
- Why I did it
Upgrade hw-mgmt to 7.0100.2303

Bug fixes

1. Fan direction feature fix for fixed FAN system (using shell instead of binutils/strings)
2. Remove cpld 4th link on systems with only 3 CPLDs
3. hw-mgmt: thermal: Add hardcoded critical trip point. Follow-up after patch "Removing critical thermal zones to prevent unexpected software system shutdown".
4. Fix sensor attribute mapping to be label based instead of index based to allow common handling of voltage regulator names independently of hardware changes.
5. Update 'lm-sensors' custom configuration file. Relevant only for users utilizing sensors.conf files coming along with hw-management package.
6. For full feature list please follow https://github.com/Mellanox/hw-mgmt/blob/V.7.0010.2300_BR/debian/Release.txt

- How I did it
Update hw-mgmt pointer
Remove unused patches
Fix existing patch to make sure it applies successfully

- How to verify it
Full platform regression on all Mellanox platforms
2021-04-28 16:21:55 +03:00
Junchao-Mellanox
8f1c8a4f19
[submodule] Update submodule pointer for sonic-linux-kernel (#7454)
99ad210 [Mellanox] backport kernel patches for hw-management 7.0100.2303 (#211)

- Why I did it
Update submodule pointer for sonic-linux-kernel to include kernel patches for hw-mgmt 7.0100.2303

- How I did it
Update submodule pointer for sonic-linux-kernel
2021-04-28 16:16:47 +03:00
Kamil Cudnik
a8be1f45b9
[submodule] Advance sonic-sairedis submodule (#7425)
9672423 [sairedis] Add missing SAI interface apis (#827)
bb13d59 (origin/202012) Update call git clean on debian/rules (#826)
ca7b115 [Mellanox] Add SAI template config support (#803)
2021-04-28 01:31:22 -07:00
LuiSzee
7c79b2654e
[build]: fix bug for compile sonic-platform-common caused by enable pytest (#7431)
Co-authored-by: Shi Lei <shil@centecnetworks.com>
2021-04-28 01:30:47 -07:00
vmittal-msft
701afa2e88
Updated bcmsai to 4.3.3.5-1 to include MMU fixes and others (#7440)
2021-04-27 16:33:41 -07:00
Junchao-Mellanox
e58348733d
[Mellanox] Fix platform json for MSN2100 (#7345)
2x40G is not supported on MSN2100, so remove it from platform.json
2021-04-27 16:26:42 -07:00
Dror Prital
22abec3c5d
[mellanox]: Integrate SAI version 1.18.3.2 into Master branch (#7428)
Changes in the new release:

Fix 10G and 50G speeds in SAI XML to support all interface types
Enable SMAC=DMAC and SMAC MC in tunnel debug counter
Add tunnel statistics
Add isolation group API implementation
Fix ACL ANY debug counter to correctly track ACL drops
Add VXLAN source port hard coded range, controlled by K/V
FW dump me now feature
Add mlxtrace to saidump
Speed lane setting and AN control
Implement query stats API
VNI miss part of tunnel decap drop reason
Align with SAI API v1.8.1

Signed-off-by: Dror Prital <drorp@nvidia.com>
2021-04-27 16:24:59 -07:00
Vivek Reddy
595b71aaf6
[submodule] Update submodule for sonic-swss (#7432)
d9f28b6 [SflowMgr] SamplingRate Update by Speed Change Added (#1721)
6c02acf [MACsec]: Set macsec to bypass by default (#1719)
9720f74 [Monitor Vlan] Fix a typo in hostif (#1722)

Signed-off-by: Vivek Reddy Karri <vkarri@nvidia.com>
2021-04-27 16:24:17 -07:00
Renuka Manavalan
2ff17b3fff
[submodule] update sonic-utilities (#7439)
* 9dba93f disk_check: Check & mount RO as RW using tmpfs (#1569)
* c3963c5 Fix remove ip rif (#1535)
* 41d8ddc [config][generic-update] Adding apply-patch, rollback, checkpoints commands (#1536)
* a3d37f1 [console] Display success message after line cleared (#1579)
* b10c157 RADIUS Management User Authentication Feature (#1521)
* 59ed6f3 platform pre-check for reboot in master branch (#1556)
* f5efe89 [acl] Use a list instead of a comma-separated string for ACL port list (#1519)
* e296a69 No more IP validation as it is more likely a URL (#1555)
* d5f5382 [CLI][queue counters] add JSON output option for queue counters (#1505)
* 176cc4a 1) Loopback interfaces with valid nexthop IP are not ignored/treated as loopback. (#1565)
* 149ccbd [techsupport] Update show ip interface command (#1562)
* 0e84418 Stop PMON docker before cold and soft reboots (#1514)
* eba5c04 Fix Multi-ASIC show specific recursive route by using common parsing function (#1560)
* e57e7f7 cache the bvid to vlan translations (#1523)
* 38f9f60 sonic-installer: fix py3 issues in bootloader.aboot (#1553)
* 02b263a [voq/inbandif] Voq inbandif port (#1363)
* 0539789 [load_minigraph]: Avoid starting PFCWD for EPMS devicetype (#1552)
* 030293c Use 'importlib' module in lieu of deprecated 'imp' module (#1450)
* 50e5c61 Fixed the possibility of using uninitialized variable in route_check.py (#1551)
2021-04-27 16:23:34 -07:00
Maxime Lorrillere
a92da83047
[chassis] VoQ configuration using minigraph.xml file (#5991)
This commit contains the following changes to support configuring a VoQ switch using a minigraph.xml file:
- Add support for system ports configuration to minigraph
- Add support for SwitchId, SwitchType and MaxCores to minigraph
- Add support for inband vlan configuration in minigraph
- `asic_name` is now a mandatory attribute in CONFIG_DB on VoQ switches

Co-authored-by: Maxime Lorrillere <mlorrillere@arista.com>
2021-04-27 12:18:45 -07:00
zzhiyuan
5f435f2296
[Arista] Add DPB for 7060CX-32S (#7413)
#### Why I did it
- To start support of dynamic port breakout as the norm for Arista platforms.
- Add a DPB hwsku for the 7060CX-32S

#### How I did it
- Expand platform.json for the 7060CX-32S
- Added a new hwsku specifically for DPB
- Added a flex Broadcom configuration

Co-authored-by: Zhi Yuan Carl Zhao <zyzhao@arista.com>
2021-04-27 11:03:20 -07:00
jostar-yang
93ceb3933e
[as7726-32x] Support PDDF (#7398)
Add PDDF support for Accton as7726-32x platform

Signed-off-by: Jostar Yang <jostar_yang@accton.com.tw>
2021-04-27 11:01:40 -07:00
Christian Svensson
186e1b9b57
[arista] Add DPB for Arista 7050 QX32 (#7342)
This change introduces dynamic port breakout (DPB) for Arista 7050 QX32 model by adding a new SKU suffixed with `-Flex`.

The breakout configuration allowed is the same as in mainline Arista EOS, i.e. the first 24 ports can be used in 4x10G mode in addition to the default 40G mode. The last 8 ports are fixed to 40G. This is due to an ASIC limitation of 104 ports in total.
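
The 104-port figure follows directly from the layout described above, as this short check shows. The constant names are illustrative only.

```python
BREAKOUT_CAPABLE_PORTS = 24   # first 24 front-panel ports, 4x10G allowed
FIXED_40G_PORTS = 8           # last 8 ports, 40G only
LANES_PER_BREAKOUT_PORT = 4   # 4x10G

# Worst case: every breakout-capable port is split into four.
max_logical_ports = (
    BREAKOUT_CAPABLE_PORTS * LANES_PER_BREAKOUT_PORT + FIXED_40G_PORTS
)
print(max_logical_ports)  # 104
```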

**NOTE**: As described in https://github.com/aristanetworks/sonic/issues/30#issuecomment-820584113 front panel LEDs are likely not working when operating in breakout mode. It is not clear if the LEDs work correctly in 40G mode as I have not had a chance to physically inspect the switch with this patch.

Signed-off-by: Christian Svensson <blue@cmd.nu>
2021-04-27 10:57:07 -07:00
Xin Wang
a7e1f7cbad
[docker-sonic-mgmt]: Install aiohttp package to sonic-mgmt docker (#7429)
The aiohttp package is required by azure.kusto.data, which is used by sonic-mgmt/test_reporting.
This change is to ensure that the dependent package is installed in the sonic-mgmt docker.

Signed-off-by: Xin Wang <xiwang5@microsoft.com>
2021-04-26 23:38:16 -07:00
Praveen Chaudhary
803e6d8b57
[sonic-slave-buster]: upgrade pyang version to 2.4.0 and install only using pip3. (#7441)
[sonic-slave-stretch]: upgrade pyang version to 2.4.0.

Signed-off-by: Praveen Chaudhary <pchaudhary@linkedin.com>
2021-04-26 21:53:23 -07:00
Stepan Blyshchak
cd2c86eab6
[dockers] label SONiC Docker with manifest (#5939)
Signed-off-by: Stepan Blyschak <stepanb@nvidia.com>

This PR is part of SONiC Application Extension

Depends on #5938

- Why I did it
To provide an infrastructure change in order to support SONiC Application Extension feature.

- How I did it
Label every installable SONiC Docker with a minimal required manifest and auto-generate the packages.json file based on the installed SONiC images.

- How to verify it
Build an image, execute the following command:

admin@sonic:~$ docker inspect docker-snmp:1.0.0 | jq '.[0].Config.Labels["com.azure.sonic.manifest"]' -r | jq
Then cat the /var/lib/sonic-package-manager/packages.json file to verify all dockers are listed there.
2021-04-26 13:51:50 -07:00
Guohan Lu
27a635a15a
Revert "Flashrom refactoring (#6922)"
This reverts commit 7dd9d1f3f2.
2021-04-25 11:51:35 -07:00
xumia
56bdd750ab
Support readonly vtysh for sudoers (#7383)
Why I did it
Support readonly version of the command vtysh

How I did it
Check whether the command starts with "show", and verify that the script is given only a single command.
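
The check described above could look roughly like this. It is only a sketch in Python: the commit message does not show the actual script, so the function name `is_readonly_vtysh_args` and the exact rules (exactly one `-c` option, no embedded newline, first word equal to `show`) are assumptions about what "single command" means here.

```python
def is_readonly_vtysh_args(args):
    """Return True only for a single `-c "show ..."` invocation.

    `args` is the argument list that would be passed on to vtysh.
    """
    # Assumption: exactly one -c option and nothing else is allowed.
    if len(args) != 2 or args[0] != "-c":
        return False
    command = args[1].strip()
    # Assumption: an embedded newline would smuggle in a second command.
    if "\n" in command:
        return False
    words = command.split()
    return bool(words) and words[0] == "show"
```

For example, `["-c", "show ip route"]` would be accepted, while `["-c", "configure terminal"]` or two `-c` options in a row would be rejected.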
2021-04-25 16:32:02 +08:00