Commit Graph

5114 Commits

Author SHA1 Message Date
pra-moh
b64a6402d0
Revert "[Telemetry docker] add memory and memory swap limits (#7062)" (#7582)
This reverts commit 0c59278168.
2021-05-11 17:11:54 -07:00
pra-moh
0c59278168
[Telemetry docker] add memory and memory swap limits (#7062)
#### Why I did it
Fix https://github.com/Azure/sonic-telemetry/issues/71

#### How I did it
Added memory limit for telemetry docker.
Historical docker memory usage shows telemetry docker consuming 150-200MB memory. Adding some extra buffer.
2021-05-11 14:51:56 -07:00
Ying Xie
b6c0dd652a
[utilities] advance utililies submodule head (#7580)
Include fullowing changes:

* 9fc630c 2021-05-11 | [sonic_installer] temporary fix: don't migrate packages on aboot platforms (#1607) (github/master) [Stepan Blyshchak]
* fde1d95 2021-05-11 | [config][vxlan] fix 'vxlan evpn_nvo add' command error (#1511) [shikenghua]
* 331c5a5 2021-05-10 | [config]: Use mod_entry when editing VLAN_INTERFACE (#1602) [Lawrence Lee]
* 8c2980a 2021-05-08 | [sonic-utilities] CLI support for port auto negotiation (#1568) [Junchao-Mellanox]
* a71ff02 2021-05-06 | [sfpshow] Gracefully handle improper 'specification_compliance' field (#1594) [Joe LeVeque]

Signed-off-by: Ying Xie <ying.xie@microsoft.com>
2021-05-11 13:21:48 -07:00
Aravind Mani
3dc879bbcf
DellEMC: Fix Z9332f xcvrd crash (#7544) 2021-05-10 14:40:06 -07:00
schobtr
da7c80da9d
[Haliburton-Celestica] Use psu driver dps200 from Linux kernel (#7247)
Why I did it
Fix issues below.
#7133
#6602

So, remove the dps200 driver from the platform-specific driver.
Then, add the dps200 module driver to the Linux kernel tree.

How I did it
Remove the dps200 driver from the platform-specific driver and add the dps200 module driver to the Linux kernel.

How to verify it
Build an image with Azure/sonic-linux-kernel#207
Then, install to the Haliburton.
2021-05-09 09:03:12 -07:00
Ying Xie
51b7c756e3
[linux kernel] advance linux kernel submodule head (#7559)
To include change:
* 12d8a0a 2021-05-06 | [dps200] Add dps200 PSU module driver (#207) (HEAD, origin/master, origin/HEAD) [schobtr]

Signed-off-by: Ying Xie <ying.xie@microsoft.com>
2021-05-09 09:02:49 -07:00
Junchao-Mellanox
bfae15fb83
[mellaonox]: No need enable thermal zones in thermal_manager.deinitialize since they are enabled by default (#7556)
No need enable thermal zones in thermal_manager.deinitialize since they are enabled by default. And removing this will faster thermalctld exit speed
2021-05-08 10:33:37 -07:00
Samuel Angebault
2803fafe04
[Arista] Update platform driver submodules (#7561)
- Fix sonic_platform api "memory leak" tied to recurring allocation. fixes #7515
- Add arista clis to read chassis eeproms
2021-05-08 10:32:33 -07:00
DavidZagury
0a8023bbfd
[DPB][YANG-models] Update portchannel yang model to have lacp_key (#7297)
To update the yang model to support the new key interduced in Azure/sonic-swss#1660
2021-05-07 15:35:58 -07:00
ArthiSivanantham
67f293912f
[yang]: modify Vlan Member to include PortChannel along with Port (#7407)
To include PortChannel as Vlan Member (in addition to the already existing physical port)

Signed-off-by: Arthi Sivanantham <arthi_sivanantham@dell.com>
2021-05-07 15:22:34 -07:00
judyjoseph
44f21a18ea
Port the fix to get the PCI_id from asic.conf file from 201911 --> master branch (#7513)
Port the fix to get the pci_device ID from asic.conf file and update the "asic_id" field in DEVICE_METADATA with the pci_device_Id.

Ref: https://github.com/Azure/sonic-buildimage/pull/4705
2021-05-07 09:47:26 -07:00
Kebo Liu
629f4459d7
[Mellanox] Align PSU fan name in platform.json with latest change in PR #7490 (#7557)
The PSU fan name convention was changed from "psu_{}_fan_{}" to "psu{}_fan{}" in PR #7490, platform.json need to be changed and aligned.
2021-05-07 09:42:40 -07:00
Renuka Manavalan
7a575b3d00
[container_checker] Use Feature table to get running containers (#7474)
Why I did it
Finding running containers through "docker ps" breaks when kubernetes deploys container, as the names are mangled.

How I did it
The data is is available from FEATURE table, which takes care of kubernetes deployment too.

How to verify it
Deploy a feature via kubernetes and don't expect error from container_check.
2021-05-07 08:42:15 -07:00
Joe LeVeque
3dddbf22fa
[brcm] Fix and simplify start_led.sh (#7548)
LED_PROC_INIT_SOC variable was incorrectly referenced as LED_SOC_INIT_SOC. Introduced in #5483

Rather than fixing the typo, I decided to simplify the script, removing the need for the conditional altogether by moving the bcmcmd call inside the conditional which checks for the presence of LED_SOC_INIT_SOC.
2021-05-07 01:55:40 -07:00
xumia
9daec6f20b
[build]: Fix build wrapper commands not cleanup issue (#7553)
cleanup the build commands after build finished.
2021-05-07 01:52:18 -07:00
Qi Luo
dee7d8f9ca
[sonic-utilities] submodule update (#7546)
Includes below commits

9a88cb6 2021-05-06 | [sonic_installer] dont fail package migration (#1591) [Stepan Blyshchak]
615e531 2021-05-05 | [show][config] Add new snmp commands (#1347) [Travis Van Duyn]
fff4051 2021-05-05 | Fixing serial number read to get from DB if it is populated (#1580) [Sudharsan Dhamal Gopalarathnam]
be974bf 2021-05-05 | [neighbor_advertiser] Use existing tunnel if present for creating tunnel mappings (#1589) [Sumukha Tumkur Vani]
9492eab 2021-05-04 | Use swsscommon instead of swsssdk (#1510) [Andriy Yurkiv]
0f4988b 2021-05-04 | Add pg-drop script to sonic filesystem (#1583) [Andriy Yurkiv]
cbe2159 2021-05-04 | [vnet] Add "vnet_route_check" script (#1300) [Volodymyr Samotiy]
9120766 2021-05-03 | Relax the install_requires, no need to exact version as long as there are no broken changes with future versions (#1530) [Qi Luo]
2e09b22 2021-05-03 | Handle the new db version which mellanox_buffer_migrator isn't interested (#1566) [Stephen Sun]
2021-05-07 01:51:44 -07:00
VenkatCisco
db3d353e77
[pmon]: add psmisc to bring fuser that dentifies processes that are using files or sockets (#7509)
fuser support is required since new cisco hardware watchdog plugin uses them to check anyone else use's /dev/watchdogX resource. The actual validation happens in the platform code, but the package is required for pmon container. Currently the /dev/watchdogX is being used by cisco platform-monitor service. Cisco chassis level watchdog plugin uses "fuser" to claim the watchdog release from platform-monitor service.
2021-05-06 22:24:07 -07:00
Kamil Cudnik
43995ab270
[sonic-slave]: Disable aspell on armhf (#7550)
Read more here: https://bugs.launchpad.net/qemu/+bug/1805913
2021-05-06 22:10:41 -07:00
pettershao-ragilenetworks
8ceec5c843
[ruijie] Fix show version error info (#7541)
Fix following crash in `show version`:

```
Traceback (most recent call last):
  File "/usr/local/bin/decode-syseeprom", line 32, in instantiate_eeprom_object
    eeprom = sonic_platform.platform.Platform().get_chassis().get_eeprom()
AttributeError: module 'sonic_platform' has no attribute 'platform'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/local/bin/decode-syseeprom", line 262, in <module>
    sys.exit(main())
  File "/usr/local/bin/decode-syseeprom", line 244, in main
    print_serial(use_db)
  File "/usr/local/bin/decode-syseeprom", line 169, in print_serial
    eeprom = instantiate_eeprom_object()
  File "/usr/local/bin/decode-syseeprom", line 34, in instantiate_eeprom_object
    log.log_error('Failed to obtain EEPROM object due to {}'.format(repr(e)))
NameError: name 'log' is not defined
```

Signed-off-by: pettershao-ragilenetworks <pettershao@ragilenetworks.com>
2021-05-06 14:53:20 -07:00
Sudharsan Dhamal Gopalarathnam
06377f0f9a
sonic-utilities submodule update (#7532)
9a88cb6 [sonic_installer] dont fail package migration (#1591)
615e531 [show][config] Add new snmp commands (#1347)
fff4051 Fixing serial number read to get from DB if it is populated (#1580)
be974bf [neighbor_advertiser] Use existing tunnel if present for creating tunnel mappings (#1589)
9492eab Use swsscommon instead of swsssdk (#1510)
0f4988b Add pg-drop script to sonic filesystem (#1583)
cbe2159 [vnet] Add "vnet_route_check" script (#1300)
9120766 Relax the install_requires, no need to exact version as long as there are no broken changes with future versions (#1530)
2e09b22 Handle the new db version which mellanox_buffer_migrator isn't interested (#1566)
2021-05-06 13:31:33 -07:00
Nazarii Hnydyn
6e264d8ac9
[swss_vars]: Add 'resource_type' attribute. (#7526)
Signed-off-by: Nazarii Hnydyn <nazariig@nvidia.com>
2021-05-06 12:14:21 -07:00
Samuel Angebault
e7c26fb0c9
[Arista] Update platform configurations and library (#7527)
Platform library changes
 - Fix the use of /proc/modules during testing, fixes #7463
 - Add `libsfp-eeprom.so` build to read/write xcvr eeproms in C
 - Add some more reboot-cause information
 - Write down temperature hw thresholds to the sensors
 - Report software thresholds through platform api
 - Writ `port_name sysfs` file of optoe`
 - Tests enhancements
 - Fix dependency issues for chassis provisioning

Platform configuration changes
 - Add `pcie.yaml` configuration for a few platforms
 - Mount `libsfp-eeprom.so` inside `pmon`
 - Fix `Arista-7050SX3-48C8` and `Arista-7050SX3-48YC8' platform and hwsku
 - Miscellaneous fixes

Co-authored-by: Boyang Yu <byu@arista.com>
Co-authored-by: Zhi Yuan Carl Zhao <zyzhao@arista.com>
2021-05-06 10:59:22 -07:00
lguohan
7124dbe56b
[sonic-yang-models]: fix unit test failure (#7436)
https://github.com/mbj4668/pyang/blob/master/pyang/repository.py#L93 throws an exception with pip 21.1

add ietf yang model explicitly to the build process fix the test failure.

tests/test_sonic_yang_models.py .F [ 66%]
tests/yang_model_tests/test_yang_model.py . [100%]
Failed: pyang -f tree ./yang-models/*.yang > ./yang-models/sonic_yang_tree
----------------------------- Captured stderr call -----------------------------
./yang-models/sonic-acl.yang:8: error: module "ietf-inet-types" not found in search path
./yang-models/sonic-device_metadata.yang:8: error: module "ietf-yang-types" not found in search path

Signed-off-by: Guohan Lu <lguohan@gmail.com>
2021-05-06 10:57:03 -07:00
Stephen Sun
9f0dce0313
[Mellanox] Optimize SFP modules initialization (#7537)
Originally, SFP modules were always accessed from platform daemons, and arbitrary SFP modules can be accessed in the daemon. So all SFP modules were initialized in one shot once one of the following chassis APIs called
- get_all_sfps
- get_sfp_numbers
- get_sfp

Recently, we noticed that SFP modules can also be accessed from CLI, eg. the latest refactor of `sfputil`.

In this case, only one SFP module is accessed in the chassis object's life cycle.
To initialize all SFP modules in one shot is waste of time and causes the CLI to take much more time to finish.
So we would like to optimize the initialization flow by introducing a two-phase initialization approach:
- Partial initialization, which means the `chassis._sfp_list` has been initialized with proper length and all elements being `None`
- Full initialization, which means all elements in `chassis._sfp_list` are created

If the relevant function is called,
- `get_sfp`, only partial initialization will be done, and then the specific SFP module is initialized.
- `get_all_sfps` or `get_num_sfps`, full initialization will be done, which means all SFP modules are initialized.

Signed-off-by: Stephen Sun <stephens@nvidia.com>
2021-05-06 10:14:48 -07:00
Junchao-Mellanox
a795bc0b8e
[Mellanox] Support new sensor conf file for MSN4700 A1/A0 (#7535)
#### Why I did it

MSN4700 A1/A0 used different sensor chip but keep the existing platform name *x86_64-mlnx_msn4700-r0*, this is a workaround to replace the sensor conf on MSN4700 A1/A0

#### How I did it

Use a shell script to get the sensor conf path and copy that files to /etc/sensors.d/sensors.conf
2021-05-06 10:13:26 -07:00
shlomibitton
557483d0b7
Revert "[submodule]: Update sonic-swss (#7478)" (#7524)
This reverts commit 963e7f4c2c.
2021-05-06 17:40:08 +03:00
lguohan
15be15392d
[ci]: build swi on broadcom platform for pr (#7522)
Signed-off-by: Guohan Lu <lguohan@gmail.com>
2021-05-05 15:42:33 -07:00
Aravind Mani
659d078fd3
DellEMC: Z9332f SFP enhancements (#7457)
#### Why I did it
400G media EEPROM and DOM information are not populated properly in DellEMC Z9332f platform.

#### How I did it
Handled QSFP_DD, QSFP28/QSFP+, SFP+ accordingly based on media type detected.
2021-05-05 10:03:11 -07:00
Guohan Lu
a2d33a2a37 [ci]: enable sonic-slave scheduled build on 202012
Signed-off-by: Guohan Lu <lguohan@gmail.com>
2021-05-05 07:33:31 -07:00
Aravind Mani
e7db9fe46c
DellEMC: Z9332f media settings (#7485)
Changed DellEMC Z9932f media settings from Vendor Name + PN method to common method.
2021-05-05 06:50:24 -07:00
shlomibitton
2d3149d641
[Mellanox] Update FW to xx.2008.2526 (#7511)
- Why I did it
Updated FW to xx.2008.2526 version.

Fixed issues:
1. Spectrum-2, Spectrum-3 | sFlow | High CPU load and high on fully loaded switch.
2. Spectrum-2, Spectrum-3 | Fine grain LAG | in rare cases doesn’t update the right entry

- How I did it
Updated submodule pointer and version in a Makefile.

- How to verify it
Full regression and bugs validation

Signed-off-by: Shlomi Bitton <shlomibi@nvidia.com>
2021-05-05 09:47:57 +03:00
Alexander Allen
0bc0f98d48
[platform] Add serial number and model number to Mellanox PSU platform implementation (#7382)
#### Why I did it

We want to add the ability for the command `show platform psustatus` to show the serial number and part number of the PSU devices on Mellanox platforms. This will be useful for data-center management of field replaceable units (FRUs) on switches.

#### How I did it

I implemented the platform 2.0 functions `get_model()` and `get_serial()` for the PSU in the mellanox platform API by referencing the sysfs nodes provided by the [hw-management](https://github.com/Azure/sonic-buildimage/tree/master/platform/mellanox/hw-management) module.
2021-05-04 13:07:00 -07:00
Andriy Yurkiv
e52fdcfd72
[Mellanox] Add support to VXLAN src port range setting via SAI profile for r SN3800-D28C49S1 (#7500)
- Why I did it
Enable VXLAN src port range configuration via SAI profile for Mellanox-SN3800-D28C49S1 SKU

- How I did it
Added SAI_VXLAN_SRCPORT_RANGE_ENABLE=1 configuration to appropriate sai.profile

Signed-off-by: Andriy Yurkiv <ayurkiv@nvidia.com>
2021-05-04 17:34:33 +03:00
Stephen Sun
b2286a24dc
[Mellanox] Adopt single way to get fan direction for all ASIC types (#7386)
#### Why I did it
Adopt a single way to get fan direction for all ASIC types.
It depends on hw-mgmt V.7.0010.2000.2303. Depends on https://github.com/Azure/sonic-buildimage/pull/7419

#### How I did it
Originally, the get_direction was implemented by fetching and parsing `/var/run/hw-management/system/fan_dir` on the Spectrum-2 and the Spectrum-3 systems. It isn't supported on the Spectrum system.
Now, it is implemented by fetching `/var/run/hw-management/thermal/fanX_dir` for all the platforms.

Signed-off-by: Stephen Sun <stephens@nvidia.com>
2021-05-03 17:10:18 -07:00
Guohan Lu
853c214951 [ci]: set -ex for official build to exit on any build failures
Signed-off-by: Guohan Lu <lguohan@gmail.com>
2021-05-03 12:31:52 -07:00
Guohan Lu
aeb73ad529 [ci]: increase official build to 12 hours
10-hour limit is not enough to finish several jobs.

Signed-off-by: Guohan Lu <lguohan@gmail.com>
2021-05-03 12:31:31 -07:00
Kebo Liu
5ac048f7e7
[Mellanox] Enhance the platform.json with adding more platform device facts. (#7495)
#### Why I did it

Current platform.json lacks some peripheral device related facts, like chassis/fan/pasu/drawer/thermal/components names, numbers, etc.

#### How I did it

Add platform device facts to the platform.json file

Signed-off-by: Kebo Liu <kebol@nvidia.com>
2021-05-03 12:22:13 -07:00
trzhang-msft
4f2b54e735
dhcpmon: support dual tor in docker template (#7470) 2021-05-03 10:51:34 -07:00
trzhang-msft
6f85908d4f
dhcpmon: support dual tor scenario (#7471) 2021-05-03 10:51:26 -07:00
Wirut Getbamrung
cfda77b3de
[device/celestica]: Add thermalctld support on Haliburton platform APIs (#6493)
- Removed the old function for detecting a faulty fan.
- Removed the old function for detecting excess temperature.
- Implement thermal_manager APIs based on ThermalManagerBase
- Implement thermal_conditions APIs based on ThermalPolicyConditionBase
- Implement thermal_actions APIs based on ThermalPolicyActionBase
- Implement thermal_info APIs based on ThermalPolicyInfoBase
- Add thermal_policy.json
2021-05-03 09:14:35 -07:00
Guohan Lu
e1ff8b6ad6 [ci]: change timeout to 36 hours for armhf/arm64 build
Signed-off-by: Guohan Lu <lguohan@gmail.com>
2021-05-02 09:55:15 -07:00
Junchao-Mellanox
d9cdf9d14f
[Mellanox] Adjust PSU fan name to align with sysfs file name (#7490)
Change PSU fan name from psu_{psu_index}fan{fan_index} to psu{psu_index}_fan{fan_index}
2021-05-02 08:14:56 -07:00
guxianghong
be4cf09b5d
[Centec][arm64] support new board E530-48s4x and E530-24x2q (#7189)
1. support new board E530-48s4x E530-24x2q
2. optimize platform driver for Centec TsingMa board

Co-authored-by: shi lei <shil@centecnetworks.com>
2021-05-01 10:37:07 -07:00
a-barboza
78b45085e9
[radius] Management User Authentication Feature Issue (#7420) (#7503)
Fix Invalid file name in windows, having ':' charactor. #7420
2021-05-01 10:25:20 -07:00
Shilong Liu
a10542e894
[CI] Add Bldenv pipeline files (#7458)
Add 3 pipeline files:

- pipeline for build docker image sonic-slave-[buster|jessie|stretch] for amd64/armhf/arm64, and push to ACR(sonicdev-microsoft.com)
- pipeline for build docker image sonic-mgmt, and push to ACR
- pipeline for cleaning dpkg cache which are created more than 30 days.

Co-authored-by: lguohan <lguohan@gmail.com>
2021-04-30 16:35:38 -07:00
Lawrence Lee
1b39424520
[docker-orchagent]: Increase ndppd kernel poll interval (#7456)
Why I did it
ndppd by default reads /proc/net/ipv6_route ever 30 seconds. Since T1s advertise so many routes to ToRs, this file is extremely large, and reading it causes ndppd's CPU usage to spike every 30 seconds

How I did it
Increase the delay for reading this file to the maximum possible value (max integer value), which will result in CPU spikes every ~24 days instead of every 30 seconds

How to verify it
Start ndppd with the new config file, confirm that no CPU spikes are seen except at startup

Signed-off-by: Lawrence Lee <lawlee@microsoft.com>
2021-04-30 16:30:30 -07:00
Stepan Blyshchak
668a678e5e
[sonic-utilities] update sonic-utilities submodule (#7481)
08337aa [sonic-package-manager] first phase implementation of sonic-package-manager (#1527)
c166f66 [multi-asic] support show ip bgp neigh/network for multi asic (#1574)

Signed-off-by: Stepan Blyschak <stepanb@nvidia.com>
2021-04-30 15:45:04 -07:00
Ying Xie
5da0046755
[makefile] define a do-nothing target for config.user (#7483)
Why I did it
After PR #7344, 'make init' and/or 'make reset' will also build sonic slave dockers.

'-include rules/config.user' is supposed to be fine when the file is missing. However, when the file is missing, it generates a delayed error which later causes make init and make reset trying to build the sonic slave dockers.

How I did it
Define a do-nothing target for config.user to catch config.user build therefore preventing other builds to be triggered unexpectedly.

How to verify it
did make init and it is now only doing submodule init.
2021-04-30 13:04:15 -07:00
vmittal-msft
68dfa704b3
Updated Qos/MMU settings for Arista-7050CX3-32S-C32 & Arista-7050CX3-32S-D48C8 (#7068)
* TD3 Qos/MMU settings for Arista-7050CX3-32S-C32 & Arista-7050CX3-32S-D48C8
2021-04-30 10:02:08 -07:00
Wei Bai
3967c28a76
[docker-sonic-mgmt]: Upgrade Tgen version in SONiC mgmt docker (#7472) 2021-04-29 12:31:46 -07:00