Commit Graph

4591 Commits

Author SHA1 Message Date
VenkatCisco
91b4ce649e [pmon]: add psmisc to bring fuser that identifies processes that are using files or sockets (#7509)
fuser support is required because the new Cisco hardware watchdog plugin uses it to check whether anything else is using the /dev/watchdogX resource. The actual validation happens in the platform code, but the package is required in the pmon container. Currently /dev/watchdogX is used by the Cisco platform-monitor service. The Cisco chassis-level watchdog plugin uses "fuser" to claim the watchdog release from the platform-monitor service.
2021-05-10 16:00:43 -07:00
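As a rough illustration of the kind of check commit 91b4ce649e describes, the sketch below asks `fuser` which processes hold a device node open. The helper names `parse_fuser_pids` and `pids_using` are hypothetical and are not taken from the Cisco plugin; the only behaviour relied on is that `fuser` prints PIDs to stdout and everything else to stderr.

```python
import shutil
import subprocess
from typing import List


def parse_fuser_pids(stdout: str) -> List[int]:
    """Extract PIDs from fuser's stdout (a whitespace-separated PID list)."""
    return [int(token) for token in stdout.split() if token.isdigit()]


def pids_using(path: str) -> List[int]:
    """Return PIDs holding `path` open; [] if none or if fuser is not installed."""
    if shutil.which("fuser") is None:  # psmisc is not available
        return []
    # fuser writes PIDs to stdout; the file name and access codes go to stderr.
    result = subprocess.run(["fuser", path], capture_output=True, text=True)
    return parse_fuser_pids(result.stdout)
```

A caller could run `pids_using("/dev/watchdog0")` (the device index here is an assumption) and treat a non-empty result as "still in use".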
Renuka Manavalan
99958304c1 [container_checker] Use Feature table to get running containers (#7474)
Why I did it
Finding running containers through "docker ps" breaks when Kubernetes deploys a container, as the names are mangled.

How I did it
The data is available from the FEATURE table, which also takes care of Kubernetes deployments.

How to verify it
Deploy a feature via Kubernetes and expect no error from container_check.
2021-05-10 15:59:57 -07:00
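One way to read commit 99958304c1 is that the set of expected containers is derived from table entries rather than from `docker ps` output. The sketch below uses a plain dictionary as a stand-in for the FEATURE table; the `state` field name, its `enabled` value, and the function name are assumptions for illustration, not the container_checker implementation.

```python
from typing import Dict, Set


def expected_running_containers(feature_table: Dict[str, Dict[str, str]]) -> Set[str]:
    """Return the feature names whose 'state' field is 'enabled'.

    `feature_table` stands in for FEATURE table contents; the field name
    and value checked here are assumptions made for this sketch.
    """
    return {
        name
        for name, fields in feature_table.items()
        if fields.get("state") == "enabled"
    }


# Illustrative data only, not read from a real device.
sample = {
    "swss": {"state": "enabled"},
    "telemetry": {"state": "disabled"},
    "snmp": {"state": "enabled"},
}
print(sorted(expected_running_containers(sample)))
```

Because the lookup is keyed by feature name, it does not depend on how the container runtime names the running instance.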
Joe LeVeque
e4a9e0c1b6 [caclmgrd] Remove sleep which allowed threads to progress (#7475)
Previously, a brief sleep was necessary in order to get Python threads to progress. The root cause of this has since been found and fixed in sonic-swss-common: Azure/sonic-swss-common#477. The submodule was updated here, so we can now safely remove this sleep.

This PR should also be cherry-picked to the 202012 branch once the submodule is updated there to also include the fix.
2021-05-10 15:59:23 -07:00
Ying Xie
61bb5c168c [makefile] define a do-nothing target for config.user (#7483)
Why I did it
After PR #7344, 'make init' and/or 'make reset' will also build sonic slave dockers.

'-include rules/config.user' is supposed to be fine when the file is missing. However, when the file is missing, it generates a delayed error which later causes make init and make reset to try to build the sonic slave dockers.

How I did it
Define a do-nothing target for config.user to catch the config.user build, thereby preventing other builds from being triggered unexpectedly.

How to verify it
Ran make init; it now only does submodule init.
2021-05-10 15:58:45 -07:00
Christian Svensson
446f9ed274 [build] Extend rules/config.user to more Makefiles (#7344)
rules/config.user allows overriding default properties without
touching tracked files. This change makes sure all properties
can be set and not just the ones used in slave.mk.

Signed-off-by: Christian Svensson <blue@cmd.nu>
2021-05-10 15:58:18 -07:00
Aravind Mani
37d4b9525b DellEMC: Z9332f media settings (#7485)
Changed DellEMC Z9332f media settings from Vendor Name + PN method to common method.
2021-05-10 15:57:19 -07:00
Aravind Mani
cf36209ae1 DellEMC: Fix Z9332f xcvrd crash (#7544) 2021-05-10 15:56:59 -07:00
Danny Allen
506c0e8e64
[202012][swss-common/utilities/platform-daemons] Update submodule (#7563)
* [202012][swss/swss-common/utilities/platform-daemons] Update submodule

sonic-swss
- [flex-counters] Delay flex counters stats init for faster boot time [202012] (#1736)

sonic-swss-common
- [swig] allow threads (#477)

sonic-utilities
- [sfpshow] Gracefully handle improper 'specification_compliance' field (#1594)

sonic-platform-daemons
- [xcvrd] Change the y_cable presence logic to use "mux_cable" table as identifier from Config DB (#176)
- [xcvrd] Enhance Media Settings (#177)

Signed-off-by: Danny Allen <daall@microsoft.com>
2021-05-10 15:53:43 -07:00
Junchao-Mellanox
6e12c40f40 [Mellanox] Support new sensor conf file for MSN4700 A1/A0 (#7535)
#### Why I did it

MSN4700 A1/A0 use a different sensor chip but keep the existing platform name *x86_64-mlnx_msn4700-r0*. This is a workaround to replace the sensor conf on MSN4700 A1/A0.

#### How I did it

Use a shell script to get the sensor conf path and copy that file to /etc/sensors.d/sensors.conf
2021-05-10 09:21:42 -07:00
mssonicbld
d634a85e44
[ci/build]: Upgrade SONiC package versions (#7567) 2021-05-09 14:45:28 +00:00
Kebo Liu
100c14007f
[Mellanox] [202012] Enhance the platform.json with adding more platform device facts. (#7496)
- Why I did it
The current platform.json lacks some peripheral device related facts, like chassis/fan/PSU/drawer/thermal/component names, numbers, etc.

- How I did it
Add platform device facts to the platform.json file

- How to verify it
Run sonic-mgmt platform API tests which depend on these facts.

Signed-off-by: Kebo Liu <kebol@nvidia.com>
2021-05-09 10:45:36 +03:00
mssonicbld
b95f86722c
[ci/build]: Upgrade SONiC package versions (#7540) 2021-05-09 04:44:39 +00:00
xumia
0a6149abda
[ci] Fix official builds timeout issue (#7516)
The default value of 600 minutes is not enough when building multiple images for a platform; change it to 720 minutes.
2021-05-07 14:22:24 -07:00
xumia
2dabad0b5e [build]: Fix build wrapper commands not being cleaned up (#7553)
Clean up the build commands after the build finishes.
2021-05-07 11:26:10 -07:00
Joe LeVeque
deb9e67838
[202012] Add SOC property to enable AN/LT on some platforms (#7547)
* [202012] Add SOC property to enable AN/LT on some platforms

Why I did it
To enable autonegotiation/link training on some Broadcom-based platforms (Arista 7060CX, 7260CX3, 7050CX3, Celestica DX010)

How I did it
Add appropriate SOC property for enabling the feature to the Broadcom config files of appropriate platforms
Also convert line endings to UNIX format for one Celestica file

* Add 'phy_an_lt_msft' to BCM config file permitted list
2021-05-06 22:21:43 -07:00
Kamil Cudnik
f88767b2ce
[sonic-slave]: Disable aspell for armhf (#7549)
Read more here: https://bugs.launchpad.net/qemu/+bug/1805913
2021-05-06 22:11:06 -07:00
Kamil Cudnik
95a1f7686f
Add aspell-en to main package list sonic-slave-buster docker file (#7521) 2021-05-06 21:48:58 +02:00
Aravind Mani
11aa05b1d3
[202012][DellEMC] Z9332f PSU data is not updated in state DB (#7543)
#### Why I did it

- PSU data is not loaded into state DB.
  The following errors are seen in syslogs:
  "Failed to update PSU data - '<=' not supported between instances of 'float' and 'str'"

- The issue is not seen in the master image, as the PSU API return type is different there.

#### How I did it

- Changed the return type in PSU API's.
2021-05-06 11:53:42 -07:00
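The syslog message quoted in commit 11aa05b1d3 is the standard Python `TypeError` raised when a float is compared with a string. The minimal sketch below reproduces that class of error and shows how returning a float avoids it; both getter names are hypothetical and do not come from the DellEMC PSU API.

```python
def get_voltage_as_str() -> str:
    """Hypothetical PSU getter that returns a string (the problematic shape)."""
    return "12.5"


def get_voltage() -> float:
    """Hypothetical corrected getter: the same value returned as a float."""
    return float(get_voltage_as_str())


threshold = 11.0

try:
    # Comparing a float with a str raises TypeError in Python 3.
    threshold <= get_voltage_as_str()
except TypeError as exc:
    print(exc)

# With a float return type the comparison is well defined.
print(threshold <= get_voltage())
```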
Stephen Sun
2fc748eade
[202012][utilities] Update submodule (#7542)
68ea9efc Add pg-drop script to sonic filesystem (#1583)
b216bf0a Fixing serial number read to get from DB if it is populated (#1580)
fa7230c6 Handle the new db version which mellanox_buffer_migrator isn't interested (#1566)

Signed-off-by: Stephen Sun <stephens@nvidia.com>
2021-05-06 19:37:24 +03:00
gechiang
c7540a8aba
[202012] BRCM SAI 4.3.3.5-1 Added Support for 100G special AN/LT mode (#7534) 2021-05-06 09:20:30 -07:00
Qi Luo
97066cbf46
[202012][submodule] Update sonic-utilities submodule (#7528)
Includes below commits:
```
3f8bc52 2021-05-05 | Relax the install_requires, no need to exact version as long as there are no broken changes with future versions (#1530) (#1592) [Qi Luo]
```
2021-05-05 18:39:31 -07:00
mssonicbld
b0460c7ce1
[ci/build]: Upgrade SONiC package versions (#7480)
Co-authored-by: mssonicbld <vsts@fv-az196-264.g5lmldmxpfterdw5iojffagm3c.gx.internal.cloudapp.net>
2021-05-06 06:50:02 +08:00
Aravind Mani
5dc2860d37 DellEMC: Z9332f SFP enhancements (#7457)
#### Why I did it
400G media EEPROM and DOM information are not populated properly in DellEMC Z9332f platform.

#### How I did it
Handled QSFP_DD, QSFP28/QSFP+, SFP+ accordingly based on media type detected.
2021-05-05 13:48:30 -07:00
vpsubramaniam
7d98a3fe47 DellEMC: Z9332F - Watchdog support, add platform.json, new platform API implementation and fixes (#6988)
Incorporate the below changes in DellEMC Z9332F platform:

- Implemented watchdog platform API support
- Implement ‘get_position_in_parent’, ‘is_replaceable’ methods for all device types
- Change return type of SFP methods to match specification in sonic_platform_common/sfp_base.py
- Added platform.json file in device directory.

Co-authored-by: V P Subramaniam <Subramaniam_Vellalap@dell.com>
2021-05-05 13:47:03 -07:00
shlomibitton
ad05c98d34 [Mellanox] Update FW to xx.2008.2526 (#7511)
- Why I did it
Updated FW to xx.2008.2526 version.

Fixed issues:
1. Spectrum-2, Spectrum-3 | sFlow | High CPU load and high on fully loaded switch.
2. Spectrum-2, Spectrum-3 | Fine grain LAG | in rare cases doesn’t update the right entry

- How I did it
Updated submodule pointer and version in a Makefile.

- How to verify it
Full regression and bugs validation

Signed-off-by: Shlomi Bitton <shlomibi@nvidia.com>
2021-05-05 09:36:14 -07:00
Andriy Yurkiv
684e0c508c [Mellanox] Add support to VXLAN src port range setting via SAI profile for SN3800-D28C49S1 (#7500)
- Why I did it
Enable VXLAN src port range configuration via SAI profile for Mellanox-SN3800-D28C49S1 SKU

- How I did it
Added SAI_VXLAN_SRCPORT_RANGE_ENABLE=1 configuration to appropriate sai.profile

Signed-off-by: Andriy Yurkiv <ayurkiv@nvidia.com>
2021-05-05 09:35:54 -07:00
Junchao-Mellanox
b9680e9e25 [Mellanox] Adjust PSU fan name to align with sysfs file name (#7490)
Change PSU fan name from psu_{psu_index}fan{fan_index} to psu{psu_index}_fan{fan_index}
2021-05-05 09:35:31 -07:00
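The rename in commit b9680e9e25 can be written out as two format helpers. The function names are hypothetical; the two patterns are copied from the commit message as it appears in this log.

```python
def old_psu_fan_name(psu_index: int, fan_index: int) -> str:
    """Name pattern before the change, as quoted in the commit message."""
    return f"psu_{psu_index}fan{fan_index}"


def new_psu_fan_name(psu_index: int, fan_index: int) -> str:
    """Name pattern after the change, as quoted in the commit message."""
    return f"psu{psu_index}_fan{fan_index}"


print(old_psu_fan_name(1, 1), "->", new_psu_fan_name(1, 1))
```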
trzhang-msft
1c2c1f50ba dhcpmon: support dual tor scenario (#7471) 2021-05-05 09:35:01 -07:00
trzhang-msft
d76206bae4 dhcpmon: support dual tor in docker template (#7470) 2021-05-05 09:34:42 -07:00
Stephen Sun
a554ddc91d [Mellanox] Adopt single way to get fan direction for all ASIC types (#7386)
#### Why I did it
Adopt a single way to get fan direction for all ASIC types.
It depends on hw-mgmt V.7.0010.2000.2303. Depends on https://github.com/Azure/sonic-buildimage/pull/7419

#### How I did it
Originally, the get_direction was implemented by fetching and parsing `/var/run/hw-management/system/fan_dir` on the Spectrum-2 and the Spectrum-3 systems. It isn't supported on the Spectrum system.
Now, it is implemented by fetching `/var/run/hw-management/thermal/fanX_dir` for all the platforms.

Signed-off-by: Stephen Sun <stephens@nvidia.com>
2021-05-05 09:34:25 -07:00
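A minimal sketch of the per-fan lookup that commit a554ddc91d describes: build the `fanX_dir` path for a given fan index and return its raw contents. The function name, the `base_dir` parameter, and the choice to return `None` on failure are assumptions for illustration; how the value maps to an intake or exhaust direction is not stated in the commit message and is left out.

```python
import os
from typing import Optional

# Directory named in the commit message.
THERMAL_DIR = "/var/run/hw-management/thermal"


def read_fan_dir(fan_index: int, base_dir: str = THERMAL_DIR) -> Optional[str]:
    """Return the raw contents of fan<index>_dir, or None if it cannot be read."""
    path = os.path.join(base_dir, f"fan{fan_index}_dir")
    try:
        with open(path) as handle:
            return handle.read().strip()
    except OSError:
        return None
```

Keeping `base_dir` as a parameter lets the helper be exercised against a temporary directory on a machine without hw-management installed.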
vmittal-msft
f766a1bccf Updated Qos/MMU settings for Arista-7050CX3-32S-C32 & Arista-7050CX3-32S-D48C8 (#7068)
* TD3 Qos/MMU settings for Arista-7050CX3-32S-C32 & Arista-7050CX3-32S-D48C8
2021-05-05 09:33:19 -07:00
Guohan Lu
9d5b899452 [ci]: enable sonic-slave build on 202012 branch
Signed-off-by: Guohan Lu <lguohan@gmail.com>
2021-05-05 07:31:27 -07:00
Shilong Liu
90bcc51203 [CI] Add Bldenv pipeline files (#7458)
Add 3 pipeline files:

- pipeline to build docker image sonic-slave-[buster|jessie|stretch] for amd64/armhf/arm64 and push it to ACR (sonicdev-microsoft.com)
- pipeline to build docker image sonic-mgmt and push it to ACR
- pipeline to clean dpkg cache files created more than 30 days ago.

Co-authored-by: lguohan <lguohan@gmail.com>
2021-05-05 07:30:39 -07:00
Guohan Lu
7841b7513c [ci]: change timeout to 36 hours for armhf/arm64 build
Signed-off-by: Guohan Lu <lguohan@gmail.com>
2021-05-05 07:30:07 -07:00
Guohan Lu
a0575a7790 [ci]: increase official build to 12 hours
10-hour limit is not enough to finish several jobs.

Signed-off-by: Guohan Lu <lguohan@gmail.com>
2021-05-05 07:11:59 -07:00
xumia
e27293f4df
Disable the reproducible build for arm64 and armhf temporarily (#7517)
Disable the reproducible build for arm64 and armhf temporarily.
The package version upgrade has been disabled for arm64 and armhf because of the long run time, so disable the reproducible build for the official builds as well. We will enable it when the arm64 Azure agent is ready.
2021-05-05 21:02:39 +08:00
Danny Allen
6060da3a36
[202012][swss/swss-common/utilities/kernel] Update submodule (#7487)
* [202012][swss/swss-common/utilities/kernel] Update submodule

sonic-swss:
- [Monitor Vlan] Fix a typo in hostif (#1722)
- Update pool sizes during initialization from timer only (#1708)
- [SflowMgr] SamplingRate Update by Speed Change Added (#1721)

sonic-swss-common:
- [swss-common] Add MUX Metrics Table (#482)
- [azp] Purge swss before installing the newly built deb package (#472)

sonic-utilities:
- disk_check: Check & mount RO as RW using tmpfs (#1569)
- No more IP validation as it is more likely a URL (#1555)
- Stop PMON docker before cold and soft reboots (#1514)
- Add soft-reboot reboot type (#1453)
- [acl] Use a list instead of a comma-separated string for ACL port list (#1519)
- sonic-installer: fix py3 issues in bootloader.aboot (#1553)
- Fix unsupported fs.squashfs extraction in sonic-installer (#1366)
- [show][config] cli support for firmware upgrade on Y-Cable (#1528) (#1558)

sonic-linux-kernel:
- [Mellanox] backport kernel patches for hw-management 7.0100.2303 (#211)

Signed-off-by: Danny Allen <daall@microsoft.com>

* Update utilities w/ build fix
2021-05-04 08:35:48 -07:00
Volodymyr Samotiy
4152c3e337
[Mellanox] [202012] Update SAI submodule pointer (#7499)
- Why I did it
To include below changes:
Set monitoring VLAN hostif up by default (for VNET ping tool)

- How I did it
Updated SAI submodule pointer

- How to verify it
Create VLAN hostif according to changes in PR: Azure/sonic-swss#1645
Verify it is admin up by default

Signed-off-by: Volodymyr Samotiy <volodymyrs@nvidia.com>
2021-05-04 17:29:47 +03:00
judyjoseph
8cea931cad Fixes for errors seen in staging devices (#7171)
With the latest 201911 image, the following error was seen on staging devices with the TSB command (for both single-ASIC and multi-ASIC). Although this error message doesn't affect the TSB functionality, it is worth fixing.

admin@STG01-0101-0102-01T1:~$ TSB
BGP0 : % Could not find route-map entry TO_TIER0_V4 20
line 1: Failure to communicate[13] to zebra, line: no route-map TO_TIER0_V4 permit 20
% Could not find route-map entry TO_TIER0_V4 30
line 2: Failure to communicate[13] to zebra, line: no route-map TO_TIER0_V4 deny 30

In addition, this PR fixes the message displayed to the user when there are no BGP neighbors configured on a BGP instance. On a multi-ASIC device there can be cases where no BGP neighbors are configured on a particular ASIC.
2021-05-03 13:19:29 -07:00
judyjoseph
7ae4a990e7 [docker-fpm-frr]: TSA/B/C changes for multi-asic (#6510)
- Introduced TS common file in docker as well and moved common functions.
- TSA/B/C scripts run only in BGP instances for front end ASICs.
  In addition, skip enforcing it on route maps used between internal BGP sessions.

admin@str--acs-1:~$ sudo /usr/bin/TSA
System Mode: Normal -> Maintenance

and in case of Multi-ASIC
admin@str--acs-1:~$ sudo /usr/bin/TSA
BGP0 : System Mode: Normal -> Maintenance
BGP1 : System Mode: Normal -> Maintenance
BGP2 : System Mode: Normal -> Maintenance
2021-05-03 13:19:17 -07:00
Nazarii Hnydyn
0e970582c1
[swss_vars]: Add 'resource_type' attribute. (#7188)
Signed-off-by: Nazarii Hnydyn <nazariig@nvidia.com>
2021-05-03 10:38:11 -07:00
guxianghong
a0fde3a626 [arm] support compile sonic arm image on arm server (#7285)
- Support compiling the SONiC ARM image on an ARM server. Compiling natively on an ARM server instead of using qemu mode on an x86 server saves significant compile time.
- Add kernel argument systemd.unified_cgroup_hierarchy=0 for the systemd upgrade to version 247, according to #7228
- Rename multiarch docker to sonic-slave-${distro}-march-${arch}

Co-authored-by: Xianghong Gu <xgu@centecnetworks.com>
Co-authored-by: Shi Lei <shil@centecnetworks.com>
2021-05-02 08:11:56 -07:00
Samuel Angebault
30cc959787 [Arista] Fix dockerd issue on Arista platforms (#7376)
Why I did it
Recent systemd upgrade from #7228 requires an extra cmdline parameter for dockerd to start properly.
Updating boot0 was missed as part of the systemd upgrade change.

How I did it
Just added the missing cmdline parameter in files/Aboot/boot0.j2
This change fixes #7372

How to verify it
Boot the image and dockerd should start normally.
2021-05-01 19:43:51 -07:00
Stepan Blyshchak
ae574ab000 [systemd] disable default systemd udev rules for interfaces (#7369)
Fix #7364

99-default.link has always been present in SONiC, but it did not work with earlier systemd versions (<247) because of systemd/systemd#3374. With systemd 247 it now takes effect.

However, this policy overrides the MAC address provided by teamd, which causes the teamd netdev to use a random MAC address. It therefore needs to be disabled.

Signed-off-by: Stepan Blyschak <stepanb@nvidia.com>
2021-05-01 19:43:41 -07:00
Guohan Lu
4f33a90a82 Revert "Revert "[build_debian.sh] fix systemd is not from backports buster (#7323)""
This reverts commit a253f2039a.
2021-05-01 19:43:15 -07:00
Guohan Lu
55f8f8647b Revert "Revert "[debian] install systemd version 247 from buster-backports (#7228)""
This reverts commit 86f9c7b7a6.
2021-05-01 19:43:04 -07:00
Andriy Yurkiv
c65a8a227f [devices][hwsku] add support to VXLAN src port range feature (#7394)
Enable VXLAN src port range configuration via SAI profile
2021-04-29 10:11:14 -07:00
Stephen Sun
48908b1c5a Fix issue: exception occurred during chassis object being destroyed (#7446)
The following error message is observed during chassis object being destroyed

"Exception ignored in: <function Chassis.__del__ at 0x7fd22165cd08>
Traceback (most recent call last):
  File "/usr/local/lib/python3.7/dist-packages/sonic_platform/chassis.py", line 83, in __del__
ImportError: sys.meta_path is None, Python is likely shutting down
The chassis tries to import deinitialize_sdk_handle during being destroyed for the purpose of releasing the sdk_handle.
However, importing another module during shutting down can cause the error because some of the fundamental infrastructures are no longer available."

This error occurs when a chassis object is created and then destroyed in the Python shell.

- How I did it
To fix it, record deinitialize_sdk_handle in the chassis object when sdk_handle is initialized, and call the recorded deinitialize handler when the chassis object is destroyed.

- How to verify it
Manually test.
2021-04-29 10:10:25 -07:00
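The fix described in commit 48908b1c5a follows a general Python pattern: capture the cleanup callable while the object is being set up, so that `__del__` never has to import anything during interpreter shutdown. The class below is a toy stand-in written for this illustration, not the Mellanox `Chassis` implementation; the method and attribute names are assumptions.

```python
class Chassis:
    """Toy stand-in for a platform chassis object (not the real class)."""

    def __init__(self):
        self.sdk_handle = None
        self._deinit_handler = None

    def initialize_sdk_handle(self, initialize, deinitialize):
        # Record the deinitializer now, while imports are still safe,
        # instead of importing it later inside __del__.
        self.sdk_handle = initialize()
        self._deinit_handler = deinitialize

    def __del__(self):
        # Only call the recorded handler; no imports happen here.
        if self.sdk_handle is not None and self._deinit_handler is not None:
            self._deinit_handler(self.sdk_handle)
            self.sdk_handle = None


released = []
chassis = Chassis()
chassis.initialize_sdk_handle(lambda: "handle-0", released.append)
del chassis  # in CPython, __del__ runs once the last reference is gone
print(released)
```

Passing the two callables in as arguments keeps the sketch self-contained; in real platform code they would be the SDK's own initialize and deinitialize functions.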
Junchao-Mellanox
d4e8c3f666 [Mellanox] Upgrade hw-mgmt to 7.0100.2303 (#7419)
- Why I did it
Upgrade hw-mgmt to 7.0100.2303

Bug fixes

1. Fan direction feature fix for fixed FAN system (using shell instead of binutils/strings)
2. Remove cpld 4th link on systems with only 3 CPLD's
3. hw-mgmt: thermal: Add hardcoded critical trip point. Follow-up after patch "Removing critical thermal zones to prevent unexpected software system shutdown".
4. Fix sensor attribute mapping to be label based instead of index based to allow common handling of voltage regulator names independently of hardware changes.
5. Update 'lm-sensors' custom configuration file. Relevant only for users utilizing sensors.conf files coming along with hw-management package.
6. For full feature list please follow https://github.com/Mellanox/hw-mgmt/blob/V.7.0010.2300_BR/debian/Release.txt

- How I did it
Update hw-mgmt pointer
Remove unused patches
Fix existing patch to make sure it applies successfully

- How to verify it
Full platform regression on all mellanox platforms
2021-04-29 10:09:58 -07:00
xumia
bdb23a0d94 Fix workflow permission issue when running in merge branch (#7417)
Fix the labeler workflow permission issue when merging from fork repo.
The issue prevents the labeler workflow from supporting auto-merge of package version upgrades on the 202012 branch. The current workaround is to add the label "automerge" to the PR sent by mssonicbld, after which the automerge workflow merges the PR.
2021-04-29 10:09:43 -07:00