- Why I did it
Current platform.json lacks some peripheral device related facts, like chassis/fan/pasu/drawer/thermal/components names, numbers, etc.
- How I did it
Add platform device facts to the platform.json file
- How to verify it
Run sonic-mgmt platform API tests which depend on these facts.
Signed-off-by: Kebo Liu <kebol@nvidia.com>
* [202012] Add SOC property to enable AN/LT on some platforms
Why I did it
To enable autonegotiation/link training on some Broadcom-based platforms (Arista 7060CX, 7260CX3, 7050cx3, Celestica DX010)
How I did it
Add appropriate SOC property for enabling the feature to the Broadcom config files of appropriate platforms
Also convert line endings to UNIX format for one Celestica file
* Add 'phy_an_lt_msft' to BCM config file permitted list
#### Why I did it
- PSU data is loaded into state DB.
Following errors are seen in syslogs:
"Failed to update PSU data - '<=' not supported between instances of 'float' and 'str'"
- Issue is not seen in master image as the PSU API return type is different.
#### How I did it
- Changed the return type in PSU API's.
68ea9efc Add pg-drop script to sonic filesystem (#1583)
b216bf0a Fixing serial number read to get from DB if it is populated (#1580)
fa7230c6 Handle the new db version which mellanox_buffer_migrator isn't interested (#1566)
Signed-off-by: Stephen Sun <stephens@nvidia.com>
Includes below commits:
```
3f8bc52 2021-05-05 | Relax the install_requires, no need to exact version as long as there are no broken changes with future versions (#1530) (#1592) [Qi Luo]
```
#### Why I did it
400G media EEPROM and DOM information are not populated properly in DellEMC Z9332f platform.
#### How I did it
Handled QSFP_DD, QSFP28/QSFP+, SFP+ accordingly based on media type detected.
Incorporate the below changes in DellEMC Z9332F platform:
- Implemented watchdog platform API support
- Implement ‘get_position_in_parent’, ‘is_replaceable’ methods for all device types
- Change return type of SFP methods to match specification in sonic_platform_common/sfp_base.py
- Added platform.json file in device directory.
Co-authored-by: V P Subramaniam <Subramaniam_Vellalap@dell.com>
- Why I did it
Updated FW to xx.2008.2526 version.
Fixed issues:
1. Spectrum-2, Spectrum-3 | sFlow | High CPU load and high on fully loaded switch.
2. Spectrum-2, Spectrum-3 | Fine grain LAG | in rare cases doesn’t update the right entry
- How I did it
Updated submodule pointer and version in a Makefile.
- How to verify it
Full regression and bugs validation
Signed-off-by: Shlomi Bitton <shlomibi@nvidia.com>
- Why I did it
Enable VXLAN src port range configuration via SAI profile for Mellanox-SN3800-D28C49S1 SKU
- How I did it
Added SAI_VXLAN_SRCPORT_RANGE_ENABLE=1 configuration to appropriate sai.profile
Signed-off-by: Andriy Yurkiv <ayurkiv@nvidia.com>
#### Why I did it
Adopt a single way to get fan direction for all ASIC types.
It depends on hw-mgmt V.7.0010.2000.2303. Depends on https://github.com/Azure/sonic-buildimage/pull/7419
#### How I did it
Originally, the get_direction was implemented by fetching and parsing `/var/run/hw-management/system/fan_dir` on the Spectrum-2 and the Spectrum-3 systems. It isn't supported on the Spectrum system.
Now, it is implemented by fetching `/var/run/hw-management/thermal/fanX_dir` for all the platforms.
Signed-off-by: Stephen Sun <stephens@nvidia.com>
Add 3 pipeline files:
- pipeline for build docker image sonic-slave-[buster|jessie|stretch] for amd64/armhf/arm64, and push to ACR(sonicdev-microsoft.com)
- pipeline for build docker image sonic-mgmt, and push to ACR
- pipeline for cleaning dpkg cache which are created more than 30 days.
Co-authored-by: lguohan <lguohan@gmail.com>
Disable the reproducible build for arm64 and armhf temporarily.
The package version upgrade has been disabled for armhf and armhf for long run time, so disable the reproducible build for the official builds as well. we will enable it when arm64 azure agent ready.
* [202012][swss/swss-common/utilities/kernel] Update submodule
sonic-swss:
- [Monitor Vlan] Fix a typo in hostif (#1722)
- Update pool sizes during initialization from timer only (#1708)
- [SflowMgr] SamplingRate Update by Speed Change Added (#1721)
sonic-swss-common:
- [swss-common] Add MUX Metrics Table (#482)
- [azp] Purge swss before installing the newly built deb package (#472)
sonic-utilities:
- disk_check: Check & mount RO as RW using tmpfs (#1569)
- No more IP validation as it is more likely a URL (#1555)
- Stop PMON docker before cold and soft reboots (#1514)
- Add soft-reboot reboot type (#1453)
- [acl] Use a list instead of a comma-separated string for ACL port list (#1519)
- sonic-installer: fix py3 issues in bootloader.aboot (#1553)
- Fix unsupported fs.squashfs extraction in sonic-installer (#1366)
- [show][config] cli support for firmware upgrade on Y-Cable (#1528) (#1558)
sonic-linux-kernel:
- [Mellanox] backport kernel patches for hw-management 7.0100.2303 (#211)
Signed-off-by: Danny Allen <daall@microsoft.com>
* Update utilities w/ build fix
- Why I did it
To include below changes:
Set monitoring VLAN hostif up dy default (for VNET ping tool)
- How I did it
Updated SAI submodule pointer
- How to verify it
Create VLAN hostif according to changes in PR: Azure/sonic-swss#1645
Verify it is admin up by default
Signed-off-by: Volodymyr Samotiy <volodymyrs@nvidia.com>
With the latest 201911 image, the following error was seen on staging devices with TSB command ( for both single asic, multi asic ). Though this err message doesn't affect the TSB functionality, it is good to fix.
admin@STG01-0101-0102-01T1:~$ TSB
BGP0 : % Could not find route-map entry TO_TIER0_V4 20
line 1: Failure to communicate[13] to zebra, line: no route-map TO_TIER0_V4 permit 20
% Could not find route-map entry TO_TIER0_V4 30
line 2: Failure to communicate[13] to zebra, line: no route-map TO_TIER0_V4 deny 30
In addition, in this PR I am fixing the message displayed to user when there are no BGP neighbors configured on that BGP instance. In multi-asic device there could be case where there are no BGP neighbors configured on a particular ASIC.
- Introduced TS common file in docker as well and moved common functions.
- TSA/B/C scripts run only in BGP instances for front end ASICs.
In addition skip enforcing it on route maps used between internal BGP sessions.
admin@str--acs-1:~$ sudo /usr/bin/TSA
System Mode: Normal -> Maintenance
and in case of Multi-ASIC
admin@str--acs-1:~$ sudo /usr/bin/TSA
BGP0 : System Mode: Normal -> Maintenance
BGP1 : System Mode: Normal -> Maintenance
BGP2 : System Mode: Normal -> Maintenance
- Support compile sonic arm image on arm server. If arm image compiling is executed on arm server instead of using qemu mode on x86 server, compile time can be saved significantly.
- Add kernel argument systemd.unified_cgroup_hierarchy=0 for upgrade systemd to version 247, according to #7228
- rename multiarch docker to sonic-slave-${distro}-march-${arch}
Co-authored-by: Xianghong Gu <xgu@centecnetworks.com>
Co-authored-by: Shi Lei <shil@centecnetworks.com>
Why I did it
Recent systemd upgrade from #7228 requires an extra cmdline parameter for dockerd to start properly.
Updating boot0 was missed as part of the systemd upgrade change.
How I did it
Just added the missing cmdline parameter in files/Aboot/boot0.j2
This change fixes#7372
How to verify it
Boot the image and dockerd should start normally.
Fix#7364
99-default.link - was always in SONiC, but previous systemd (<247) had an issue and it did not work due to issue systemd/systemd#3374. Now systemd 247 works.
However, such policy overrides teamd provided mac address which causes teamd netdev to use a random mac
address. Therefore, needs to be disabled.
Signed-off-by: Stepan Blyschak <stepanb@nvidia.com>
The following error message is observed during chassis object being destroyed
"Exception ignored in: <function Chassis.__del__ at 0x7fd22165cd08>
Traceback (most recent call last):
File "/usr/local/lib/python3.7/dist-packages/sonic_platform/chassis.py", line 83, in __del__
ImportError: sys.meta_path is None, Python is likely shutting down
The chassis tries to import deinitialize_sdk_handle during being destroyed for the purpose of releasing the sdk_handle.
However, importing another module during shutting down can cause the error because some of the fundamental infrastructures are no longer available."
This error occurs when a chassis object is created and then destroyed in the Python shell.
- How I did it
To fix it, record the deinitialize_sdk_handle in the chassis object when sdk_handle is being initialized and call the deinitialize handler when the chassis object is being destroyed
- How to verify it
Manually test.
- Why I did it
Upgrade hw-mgmt to 7.0100.2303
Bug fixes
1. Fan direction feature fix for fixed FAN system (using shell instead of binutils/strings)
2. Remove cpld 4th link on systems with only 3 CPLD's
3. hw-mgmt: thermal: Add hardcoded critical trip point. Follow-up after patch "Removing critical thermal zones to prevent unexpected software system shutdown".
4. Fix sensor attribute mapping to be label based instead of index based to allow common handling of voltage regulator names independently of hardware changes.
5. Update 'lm-sensors' custom configuration file. Relevant only for users utilizing sensors.conf files coming along with hw-management package.
6. For full feature list please follow https://github.com/Mellanox/hw-mgmt/blob/V.7.0010.2300_BR/debian/Release.txt
- How I did it
Update hw-mgmt pointer
Remove unused patches
Fix existing patch to make sure it apply successfully
- How to verify it
Full platform regression on all mellanox platforms
Fix the labeler workflow permission issue when merging from fork repo.
It impacts the labeler workflow to support auto-merge for package versions upgrade on 202012 branch. The current workaround is to add the label "automerge" on the PR sent by mssonicbld, then the automerge workflow will merge the PR.
Why I did it
Support readonly version of the command vtysh
How I did it
Check if the command starting with "show", and verify only contains single command in script.
Fix the generating version file failure issue caused by artifacts folder change.
When changing to use the same template for PR build, official build and packages version upgrade, the artifacts folder adding a "target" folder, the version upgrade task should be changed accordingly.
PR# 7249 introduced a new bit of logic _after_ the point where the qemu based
build environment for ARM is removed. Hence the new logic fails when building
for ARM. Builds for AMD64 were not affected.
This commit moves the new logic introduced by PR# 7249 to just _before_ the
point where the qemu based build environment for ARM is removed. A comment is
added to reduce the likelihood of this sort of ARM build break from happening
again.
Why I did it
Add the platform filter to ignore some platforms not ready to use
The platform centos-arm64 and the platform marvell-armhf are not ready to use now. We will add it when it is available.
- Why I did it
Changes in the new release:
1. Fix 10G and 50G speeds in SAI XML to support all interface types
2. Enable SMAC=DMAC and SMAC MC in tunnel debug counter
3. Add tunnel statistics
4. Add isolation group API implementation
5. Fix ACL ANY debug counter to correctly track ACL drops
6. Add VXLAN source port hard coded range, controlled by K/V
7. FW dump me now feature
8. Add mlxtrace to saidump
9. Speed lane setting and AN control
10. Implement query stats API
11. VNI miss part of tunnel decal drop reason
- How I did it
Update the version number in SAI make file, update the mlnx-sai submodule pointer.
- How to verify it
Run full regression tests on Mellanox platforms
Signed-off-by: Dror Prital <drorp@nvidia.com>
When submitting a new official build for broadcom, vs, it prompts a error message, which says the job is not defined.
It was caused by the default option "[]", which is not empty, it is used as the jobGroups parameter.