Commit Graph

5692 Commits

Author SHA1 Message Date
Junchao-Mellanox
ba4da1597a [Mellanox] Fix issue: thermal zone threshold value 0 causes fan speed stuck at 100% (#10057)
- Why I did it
In SONiC thermal control algorithm, it compares thermal zone temperature with thermal zone threshold. Previously, a thermal zone with no thermal sensor can still get its threshold. However, a recently driver patch changes this behavior: a thermal zone with no thermal sensor will return 0 for threshold. We need to ignore such thermal zone.

- How I did it
Ignore thermal zones whose temperature is 0.

- How to verify it
Added unit test case and Manual test
2022-02-27 20:45:45 -08:00
Junchao-Mellanox
00940941b2 [system-health] Fix file handle leak (#10059)
- Why I did it
swsscommon.ConfigDBConnector does not automatically close connection when the instance is recycled by python. So, it should not create this instance each time calling check_services. It will cause error like Failed to read from file /var/run/hw-management/led/led_status_capability - OSError(24, 'Too many open files')

- How I did it
Only connect DB once in init

- How to verify it
Manual test
2022-02-27 20:45:41 -08:00
arlakshm
b58a42a8d4 [chassis][bgp] create v4 and v6 peer group for VoQ internal neighbors (#9693)
Why I did it
In the recent minigraph changes we add separate BGP session configuration for V4 and V6 internal VoQ neighbors.
This PR is adding different Peer groups for V4 and V6 neighbors

How I did it
Add VOQ_CHASSIS_V4_PEER and VOQ_CHASSIS_V6_PEER groups
Add extra Unit tests

How to verify it

Signed-off-by: Arvindsrinivasan Lakshmi Narasimhan <arlakshm@microsoft.com>
2022-02-27 20:45:37 -08:00
Judy Joseph
c6774ad683 update sonic-sairedis
5331ecd [vslib]: Fix MACsec bug in SCI and XPN (#1003)
ac04509 Fix build issues on gcc-10 (#999)
1b8ce97 [pipeline] Download swss common artifact in a separated directory (#995)
7a2e096 Change sonic-buildimage.vs artifact source from CI build to official build. (#992)
d5866a3 [vslib]: fix create MACsec SA error (#986)
f36f7ce Added Support for enum query capability of Nexthop Group Type. (#989)
323b89b Support for MACsec statistics (#892)
26a8a12 Prevent other notification event storms to keep enqueue unchecked and drained all memory that leads to crashing the switch router (#968)
0cb253a Fix object availability conversion (#974)
2022-02-27 20:41:15 -08:00
Saikrishna Arcot
5b7e55dfac
Package debugging and hardening for dhcpmon and dhcp6relay (#9862) (#10063)
Enable dbgsym package for dhcpmon.

Allow CFLAGS and LDFLAGS from environment variables to be used
in the dhcp6relay build. This makes sure that the -O2 flag from
dpkg-buildflags gets used.

Finally, enable all hardening flags in dpkg-buildflags for
dhcp6relay and dhcpmon. The change from the default set of flags is that
during linking, immediate binding of symbols is done instead of lazy
binding.

Signed-off-by: Saikrishna Arcot <sarcot@microsoft.com>
2022-02-25 11:09:15 -08:00
xumia
5d19c74f1f [Security]: Upgrade urllib3 to fix CVE-2021-33503
See https://security.archlinux.org/CVE-2021-33503
2022-02-25 09:15:08 +00:00
Judy Joseph
38967a4ebd Update sonic-swss, sonic-utilites submodules
sonic-swss

1aa40f7 Remove port serdes object before removing port (#2152)
876d690 [doc] Updating Policer config in Configuration manual (#2144)

sonic-utilities
dfed952 show_platfom_info not run for simx (#2042)
71fdee7 [aclshow] fix aclshow when clear is called before counters are populated (#2037)
a48a027 [sonic-package-manager] implement blocking feature state change (#2035)
c51871d [ci] Fix python dependencies reference path. (#2060)
2022-02-21 23:53:01 -08:00
kellyyeh
5d4031918b [radv] Support multiple ipv6 prefixes per vlan interface (#9934)
Why I did it
Radvd.conf.j2 template creates two copies of the vlan interface when there are more than one ipv6 address assigned to a single vlan interface. Changed the format to add prefixes under the same vlan interface block.

How I did it
Modifies radvd.conf.j2 and added unit tests

How to verify it
Configure multiple ipv6 address to the same vlan, start radvd
Unit test will check if radvd.conf with multiple ipv6 addresses is formed correctly
2022-02-21 23:48:22 -08:00
Stepan Blyshchak
e5e1b70966 [rsyslog.j2] fix typo in VAR_LOG_SIZE_KB (#9954)
This issue causes negative threshold value and thus deleting log files even when there is enough space.

This issue causes negative threshold value and thus deleting log files even when there is enough space.

- Why I did it
To fix an issue when log files get deleted even if there is enough space.

- How I did it
Fixed an typo.

- How to verify it
Run the portion of the script that calculates threshold, see that the threshold is calculated correctly.

Signed-off-by: Stepan Blyschak <stepanb@nvidia.com>
2022-02-21 23:48:19 -08:00
kellyyeh
a0bb7a0485 [dhcp_relay] Check payload size to prevent buffer overflow in dhcpv6 option (#9740) 2022-02-21 23:48:15 -08:00
Ying Xie
53a8d99baa [dhcp6relay] a couple memory access protections (#9851)
Why I did it
the strcpy and buffer allocation is not safe, it corrupts 1 byte on the stack. Depending on the memory layout, it may or may not cause issue immediately.
message type is not validated before updating the counter. Which could cause segment fault.

How I did it
Remove the unsafe strcpy, use config->interface.c_str() instead.
Check message type before updating counters.

How to verify it
The issue (1) caused segment fault on a specific platform. The fix was validated there. Issue (2) was precautionary. Added log in case it triggers.
2022-02-21 23:48:06 -08:00
kellyyeh
4df0aa348c [dhcp6relay] Support relaying Relay-Forward message (#9887) 2022-02-21 23:48:02 -08:00
Dror Prital
a6eb72cbf1 [Mellanox] Upgrade ASIC FW tool to 4.18.1-16 (#9981)
- Why I did it
Update MFT to version 4.18.1-16 for bugs fixes and new SN2201 support

- How I did it
Advance to MFT tool version to 4.18.1-16

- How to verify it
Manually tested on all Mellanox platforms (ASIC FW Upgrade, link debug tools, CPLD upgrade, etc.)
2022-02-21 23:47:58 -08:00
Alexander Allen
1f566bf98e [pmon] Fix chassis_db_init exit not being expected (#9858)
- Why I did it
Error log was shown on switches during boot
pmon#supervisord 2021-12-22 04:27:16,709 INFO exited: chassis_db_init (exit status 0; not expected)

- How I did it
Add exit code zero as an expected exit code and also disable autorestart.

- How to verify it
Boot the switch and ensure the above log line does not appear.
2022-02-21 23:47:54 -08:00
Sudharsan Dhamal Gopalarathnam
48bf7526fe [yang]YANG model for policer table (#9948)
#### Why I did it
Added yang model for policer table
Fixes https://github.com/Azure/sonic-buildimage/issues/9742 and https://github.com/Azure/sonic-buildimage/issues/9743
#### How I did it
Creating yang model for policer

#### How to verify it
Added UT to verify the yang model

The configuration schema for policer is added in the pull request https://github.com/Azure/sonic-swss/pull/2144
2022-02-21 23:47:50 -08:00
xumia
dce0e7270c [Build]: Fix marvell sai package version parsing issue
Fix marvell sai package version parsing issue (#10009)
2022-02-19 04:25:08 +00:00
Sudharsan Dhamal Gopalarathnam
da5899b748
[yang] Adding not-provisioned to type field in DEVICE_METADATA table (#9951) (#9993)
Cherry-pick of https://github.com/Azure/sonic-buildimage/pull/9951 into 202111 branch
#### Why I did it
Fixing the issue https://github.com/Azure/sonic-buildimage/issues/9915


#### How I did it
Added 'not-provisioned' as a supported value for type field in DEVICE_METADATA type. This value is set during initial ZTP bring up

#### How to verify it
Added UT to verify it.
2022-02-16 15:20:39 -08:00
Judy Joseph
063256f7b3 Merge branch '202111' of https://github.com/Azure/sonic-buildimage into 202111 2022-02-15 11:34:59 -08:00
Judy Joseph
e6adf71c20 Update sonic-platform-daemons
cb3ddf5 [pmon][xcvrd] xcvrd process show backtrace on the internal port. Port PR233 (#236)
5b4c9e1 Fix python wheels path downloaded from vs official build. (#244)
2022-02-15 11:34:30 -08:00
Dror Prital
30b69ed621
[202111] Add sonic_release file in order to have "release 202111" on sonic_version.yml (#9985)
- Why I did it
Fix Issue 9972: Incorrect information about release version in sonic_version.yml

- How I did it
Add "sonic_release" file to /sonic-buildimage/files/image_config/

- How to verify it
Install the image and run: cat /etc/sonic/sonic_version.yml
Verify the following item on sonic_version.yml file: release: '202111'
2022-02-15 10:46:42 +02:00
byu343
9b5bb887c8 Support multi-asic on macsec container (#9921)
This change enables the support of running multiple macsec containers, each for one ASIC.
2022-02-13 22:46:33 -08:00
Judy Joseph
472f2ff772 Update sonic-platform-common submodule 2022-02-13 18:10:19 -08:00
Judy Joseph
97c2b2ff4b Update sonic-swss submodule 2022-02-13 18:09:30 -08:00
Junchao-Mellanox
b37470d45e [Mellanox] Fix issue: SN4600C has 4 CPU core temperature sensors (#9930)
- Why I did it
platform.json of 4600C only has 2 CPU core thermal sensors, but there are 4 actually

- How I did it
Added thermal sensors for CPU core 2 and core 3.

- How to verify it
Build.
2022-02-13 18:04:40 -08:00
Kebo Liu
8c5db31627 fix MSN4410 chassis name in platform_components.json (#9939)
- Why I did it
The chassis name in MSN4410 platform_components.json is not correct

- How I did it
Fix the chassis name

- How to verify it
Run relevant platform API test

Signed-off-by: Kebo Liu <kebol@nvidia.com>
2022-02-13 18:04:37 -08:00
Marty Y. Lok
af300f5e04 Update Nokia sonic-platform submodule (#9963)
ded0344 Return both 'vendor_rev' and 'hardware_rev' keys from get_transceiver_info to support both earlier and later versions of xcvrd
b3442cc Change log_error to log_info when transceiver module is transitioned
3d3a73c Fix problem introduced with new SFP caching paradigm
2915746 Add script to reboot all IMMs
17ad221 Fix the position_in_parent for the psu entity
207e731 No longer force read pages at SFP init time
b387921 Fixed the voltage and power with rounding to 2 digit decimals

Signed-off-by: mlok <marty.lok@nokia.com>
2022-02-13 18:04:32 -08:00
saksarav-nokia
977cb98479 [Nokia][platform] Modified the bcm config file for Nokia-IXR7250E-36x400G (#9622)
Why I did it
Updated the BCM config recommended by Broadcom for Nokia-IXR7250E-36x400G

How I did it
Updated the BCM config file

How to verify it
Verified running the image with this BCM config in Nokia-IXR7250E-36x400G and ensured that the syncd container was stable, ports were up and passing the traffic.

Signed-off-by: Sakthivadivu Saravanaraj <sakthivadivu.saravanaraj@nokia.com>
2022-02-13 18:04:28 -08:00
xumia
bf74a68cb9 [Build]: Fix hundreds of thousands lines of logs printed in marvell-armhf (#9980)
[Build]: Fix hundreds of thousands lines of logs printed in marvell-armhf
It is caused by the bad format of the marvell sai package mrvllibsai_armhf_1.7.1-6.deb, increasing the waiting time to reduce the logs, and reduce the waste of the CPU.
2022-02-13 18:04:23 -08:00
Andriy Yurkiv
b29a3a0d4c [Mellanox][vxlan] remove old static config for VXLAN src port range feature (#9956)
- Why I did it
Need to remove old static configs from sai.profile files.
New implementation: Azure/sonic-swss#1959
New configuration: #9658

- How I did it
Remove SAI_VXLAN_SRCPORT_RANGE_ENABLE=1 lines from files per HWSKU

- How to verify it
When static config is removed following test will fail (src port will be in range 0-255)

py.test vxlan/test_vnet_vxlan.py --inventory "../ansible/inventory, ../ansible/veos" --host-pattern (testbed)-t0 --module-path ../ansible/library/ --testbed (testbed)-t0 --testbed_file ../ansible/testbed.csv --allow_recover --assert plain --log-cli-level info --show-capture=no -ra --showlocals --disable_loganalyzer --skip_sanity --upper_bound_udp_port 65535 --lower_bound_udp_port 64128
2022-02-13 18:04:19 -08:00
Stephen Sun
192409a7db Fix issue: module id got from get_change_event is wrong (#9961)
Signed-off-by: Stephen Sun <stephens@nvidia.com>
2022-02-13 18:04:13 -08:00
Travis Van Duyn
05791708f1 updated jinja template for snmp contact python2 vs python3 issue (#9949) 2022-02-13 18:04:09 -08:00
Andriy Yurkiv
84a986bc27 [Mellanox][VXLAN] add params to vxlan.json file in order to configure VXLAN src port range feature (#9658)
- Why I did it
Remove obsolete parameter that enables static VXLAN src port range
provide functionality no generate json config file according to appropriate parameter in config_db
Done for
SN3800:
• Mellanox-SN3800-D28C50
• Mellanox-SN3800-C64
• Mellanox-SN3800-D28C49S1 (New 10G SKU)

SN2700:
• Mellanox-SN2700-D48C8

- How I did it
Remove SAI_VXLAN_SRCPORT_RANGE_ENABLE=1 from appropriate sai.profile files
Created vxlan.json file and added few params that depends on DEVICE_METADATA.localhost.vxlan_port_range

- How to verify it
File /etc/swss/config.d/vxlan.json should be generated inside swss docker when it restart
[
    {
        "SWITCH_TABLE:switch": {
            "vxlan_src": "0xFF00",
            "vxlan_mask": "8"
        },
        "OP": "SET"
    }
]
Signed-off-by: Andriy Yurkiv <ayurkiv@nvidia.com>
2022-02-13 18:01:06 -08:00
Alexander Allen
825fc9e2dd [Mellanox] Update mellanox hw-mgmt submodule and versions to V.7.0020.1300 (#9860)
- Why I did it
New version of mellanox platform management code available adding support for new platforms and fixing bugs.

- How I did it
1. Updated the submodule
2. Updated makefile version references
3. Regenerated SONiC patches
2022-02-13 18:01:02 -08:00
saksarav-nokia
86bd652009 [Nokia][platform] Modified Nokia device data to support midplane (#9914)
Added midplane_subnet in chassisdb.conf for interfaces-config.sh to create midplane interface in multi-asic namespaces.

Signed-off-by: Sakthivadivu Saravanaraj <sakthivadivu.saravanaraj@nokia.com>
2022-02-13 18:00:58 -08:00
Prince George
fa0de261aa Close console session due to user inactivity (#9890)
Signed-off-by: Prince George <prgeor@microsoft.com>
2022-02-13 18:00:54 -08:00
abdosi
c4254b53f0 Updated Internal BGP Templates for chassis packet (#9674)
Fixes: https://github.com/Azure/sonic-buildimage/issues/9610
2022-02-13 18:00:50 -08:00
Ashok Daparthi-Dell
741d047982 [yang] Fix for sonic-scheduler.yang name pattern (#9873)
#### Why I did it

PR9611 - sonic-scheduler.yang pattern issue

#### How I did it
Modified the scheduler name pattern string to accept any string 

#### How to verify it

Sonic yang tests
2022-02-13 18:00:43 -08:00
Alexander Allen
6b6b40c046 [pmon] Move smartctl from pmon to host (#9607)
Why I did it
Need to be able to run smartctl when pmon docker is not running.

How I did it
Removed the pmon dependency for pmon as well as the command wrapper and added it to the debian-extension.

How to verify it
Stop pmon
Run smartctl from the host and verify it runs without error
2022-02-13 17:54:58 -08:00
Alexander Allen
9264db4635
[submodule] Update linux-kernel submodule pointer (#9973)
Updates include the following changes in order to support new Mellanox platforms and drivers (Azure/sonic-linux-kernel#259)

10ef390 Update kconfig to support / enable newly backported mellanox patches.
6a949e1 Add backported patches for Mellanox hw-mgmt V.7.0020.1300
e1913f7 Rename and reformat patch headers
2022-02-13 17:12:21 +02:00
Judy Joseph
f08866b668 Update sonic-swss submodule
05c2c2e [voq] Neighbor entry impose encap index attribute deprecated (#2069)
2022-02-06 22:54:35 -08:00
Judy Joseph
9b4d80115a Update sonic-utilities submodule 2022-01-30 23:03:16 -08:00
Judy Joseph
29ccb603ae Update sonic-swss submodule 2022-01-30 23:02:18 -08:00
Mohamed Ghoneim
b704c6cc9a [yang] Adding sonic-bgp-monitor to setup.py (#9877)
#### Why I did it
Include sonic-bgp-monitor to setup.py so it gets included in /usr/local/yang-models when installing the package

#### How I did it

#### How to verify it
install the package

#### Which release branch to backport (provide reason below if selected)

<!--
- Note we only backport fixes to a release branch, *not* features!
- Please also provide a reason for the backporting below.
- e.g.
- [x] 202006
-->

- [ ] 201811
- [ ] 201911
- [ ] 202006
- [ ] 202012
- [ ] 202106

#### Description for the changelog
<!--
Write a short (one line) summary that describes the changes in this
pull request for inclusion in the changelog:
-->


#### A picture of a cute animal (not mandatory but encouraged)
2022-01-30 22:49:50 -08:00
Junchao-Mellanox
9070c441e8 Fix issue: 'sx_port_mapping_t' object has no attribute 'slot_id' (#9835)
- Why I did it
Fix issue: 'sx_port_mapping_t' object has no attribute 'slot_id'. sx_port_mapping_t only has attribute slot.

- How I did it
Change slot_id to slot.

- How to verify it
Manual test
2022-01-30 22:49:27 -08:00
Junchao-Mellanox
74df1494d0 [Mellanox] Fix select timeout in sfp event (#9795)
- Why I did it
Python select.select accept a optional timeout value in seconds, however, the value passes to it is a value in millisecond.

- How I did it
Transfer the value to millisecond.

- How to verify it
Manual test
2022-01-30 22:49:24 -08:00
Shi Su
a5afa2c15e Add openbfdd to ptf docker (#9488)
Why I did it
To enable test support for BFD-related features, the PTF docker needs to have the proper support for BFD. This PR aims to add BFD support in ptf docker.

How I did it
Clone and build OpenBFDD for PTF docker.

How to verify it
Build locally and verify BFD is supported.
2022-01-30 22:49:20 -08:00
Dror Prital
7351112a52 [Mellanox] Update SDK/FW to 4.5.1208/2010.1218 and SAI version to 1.20.2.5 (#9619)
- Why I did it
To include latest SDK fixes:
1.  On CMIS modules, after low power configuration, the firmware waited for the module state to be ModuleReady instead of ModuleLowPower causing delays.
2. When connecting SN4600C, 100GbE port with CWDM4 module (Gen 3.0), link up time is 30 seconds.

and to include SAI fixes \ changes:
1. Reduce verbosity for resource check vendor data not found
2. Fix metadata validation, check default value on conditions check
3. Add 100MB, 10MB to 2201 system
4. L3 VXLAN overlay ECMP
5. VXLAN srcport API implementation
6. Fix scheduler profile null (default values) when set on sub group scheduler group
7. Fix ACL binding restoration when port leaves a LAG
8. Fix route logic for set next hop/action and reference counter for ECMP overlay

- How I did it
1. Updated SDK/FW submodule and relevant makefiles with the required versions.
2. Update SAI submodule and relevant makefile with the required version.

- How to verify it
Build an image and run tests from "sonic-mgmt".
2022-01-30 22:49:06 -08:00
Samuel Angebault
fac0b11ebb Add platform.json configs for all denali SKUs (#9717) 2022-01-30 22:49:00 -08:00
Alexander Allen
e8418fd2da [Mellanox] Modified Platform API to support all firmware updates in single boot (#9608)
Why I did it
Requirements from Microsoft for fwutil update all state that all firmwares which support this upgrade flow must support upgrade within a single boot cycle. This conflicted with a number of Mellanox upgrade flows which have been revised to safely meet this requirement.

How I did it
Added --no-power-cycle flags to SSD and ONIE firmware scripts
Modified Platform API to call firmware upgrade flows with this new flag during fwutil update all
Added a script to our reboot plugin to handle installing firmwares in the correct order with prior to reboot
How to verify it
Populate platform_components.json with firmware for CPLD / BIOS / ONIE / SSD
Execute fwutil update all fw --boot cold
CPLD will burn / ONIE and BIOS images will stage / SSD will schedule for reboot
Reboot the switch
SSD will install / CPLD will refresh / switch will power cycle into ONIE
ONIE installer will upgrade ONIE and BIOS / switch will reboot back into SONiC
In SONiC run fwutil show status to check that all firmware upgrades were successful
2022-01-30 22:48:54 -08:00
Stepan Blyshchak
fcdb3f2f59
[202111][sonic-yang] fix the feature state type (#9587) (#9778)
- Why I did it
The feature state can be a jinja template, like in this file - https://github.com/Azure/sonic-buildimage/blob/master/files/build_templates/init_cfg.json.j2#L39.
Without this change it is not possible to validate a configuration file.

- How I did it
Relaxes the constraint on feature state. Feature state leaf can be any string.

- How to verify it
Run UT.

Signed-off-by: Stepan Blyschak <stepanb@nvidia.com>
2022-01-28 05:52:27 +02:00