- Why I did it
The SONiC thermal control algorithm compares each thermal zone's temperature with that zone's threshold. Previously, a thermal zone with no thermal sensor could still return its threshold. However, a recent driver patch changed this behavior: a thermal zone with no thermal sensor now returns 0 for its threshold. We need to ignore such thermal zones.
- How I did it
Ignore thermal zones whose temperature is 0.
- How to verify it
Added unit test case and Manual test
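The filtering described above could be sketched as follows. This is a minimal illustration, not the actual platform code: the function name, the `(temperature, threshold)` tuple layout, the zone names, and the "at or above threshold" comparison are all assumptions made for the example.

```python
from typing import Dict, List, Tuple


def zones_at_or_over_threshold(readings: Dict[str, Tuple[int, int]]) -> List[str]:
    """Return the names of zones whose temperature is at or above the threshold.

    `readings` maps a (hypothetical) zone name to (temperature, threshold).
    Zones that report a temperature of 0 are assumed to have no sensor.
    """
    hot = []
    for name, (temperature, threshold) in readings.items():
        if temperature == 0:
            # No sensor behind this zone: its threshold is 0 as well, so a
            # comparison would be meaningless. Skip the zone entirely.
            continue
        if temperature >= threshold:
            hot.append(name)
    return hot


readings = {
    "zone_asic": (45000, 75000),     # normal zone, below threshold
    "zone_module1": (0, 0),          # no sensor present, must be ignored
    "zone_module2": (80000, 75000),  # over threshold
}
print(zones_at_or_over_threshold(readings))
```

Without the early `continue`, the sensorless zone would satisfy `0 >= 0` and be reported as overheated.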
- Why I did it
swsscommon.ConfigDBConnector does not automatically close its connection when the instance is garbage-collected by Python, so a new instance should not be created on every call to check_services. Doing so leaks file descriptors and eventually causes errors like: Failed to read from file /var/run/hw-management/led/led_status_capability - OSError(24, 'Too many open files')
- How I did it
Only connect DB once in init
- How to verify it
Manual test
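The "connect once" pattern described above might look like the sketch below. `FakeConfigDBConnector` is a stand-in written for this example so that it runs without SONiC installed; it only mimics the `connect()` and `get_table()` calls of `swsscommon.ConfigDBConnector`. The `ServiceChecker` class layout and the `FEATURE` table lookup are assumptions, not the actual health-checker code.

```python
class FakeConfigDBConnector:
    """Stand-in for swsscommon.ConfigDBConnector that counts connections."""

    open_connections = 0

    def connect(self):
        # A real connector opens sockets here and does not close them when
        # the Python object is garbage-collected.
        FakeConfigDBConnector.open_connections += 1

    def get_table(self, name):
        return {}


class ServiceChecker:
    def __init__(self, connector_factory=FakeConfigDBConnector):
        # Connect once and keep the instance for the checker's lifetime.
        self.config_db = connector_factory()
        self.config_db.connect()

    def check_services(self):
        # Reuse the existing connection instead of creating a new connector
        # on every call.
        return self.config_db.get_table("FEATURE")


checker = ServiceChecker()
for _ in range(100):
    checker.check_services()
print(FakeConfigDBConnector.open_connections)
```

However many times `check_services` runs, only the single connection made in `__init__` is ever opened.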
Why I did it
The recent minigraph changes added separate BGP session configuration for V4 and V6 internal VoQ neighbors.
This PR adds separate peer groups for V4 and V6 neighbors.
How I did it
Add VOQ_CHASSIS_V4_PEER and VOQ_CHASSIS_V6_PEER groups
Add extra Unit tests
How to verify it
Signed-off-by: Arvindsrinivasan Lakshmi Narasimhan <arlakshm@microsoft.com>
5331ecd [vslib]: Fix MACsec bug in SCI and XPN (#1003)
ac04509 Fix build issues on gcc-10 (#999)
1b8ce97 [pipeline] Download swss common artifact in a separated directory (#995)
7a2e096 Change sonic-buildimage.vs artifact source from CI build to official build. (#992)
d5866a3 [vslib]: fix create MACsec SA error (#986)
f36f7ce Added Support for enum query capability of Nexthop Group Type. (#989)
323b89b Support for MACsec statistics (#892)
26a8a12 Prevent other notification event storms to keep enqueue unchecked and drained all memory that leads to crashing the switch router (#968)
0cb253a Fix object availability conversion (#974)
Enable dbgsym package for dhcpmon.
Allow CFLAGS and LDFLAGS from environment variables to be used
in the dhcp6relay build. This makes sure that the -O2 flag from
dpkg-buildflags gets used.
Finally, enable all hardening flags in dpkg-buildflags for
dhcp6relay and dhcpmon. The change from the default set of flags is that
during linking, immediate binding of symbols is done instead of lazy
binding.
Signed-off-by: Saikrishna Arcot <sarcot@microsoft.com>
sonic-swss
1aa40f7 Remove port serdes object before removing port (#2152)
876d690 [doc] Updating Policer config in Configuration manual (#2144)
sonic-utilities
dfed952 show_platfom_info not run for simx (#2042)
71fdee7 [aclshow] fix aclshow when clear is called before counters are populated (#2037)
a48a027 [sonic-package-manager] implement blocking feature state change (#2035)
c51871d [ci] Fix python dependencies reference path. (#2060)
Why I did it
The radvd.conf.j2 template creates two copies of the VLAN interface block when more than one IPv6 address is assigned to a single VLAN interface. Changed the format to add all prefixes under the same VLAN interface block.
How I did it
Modified radvd.conf.j2 and added unit tests
How to verify it
Configure multiple IPv6 addresses on the same VLAN, then start radvd
Unit test will check if radvd.conf with multiple ipv6 addresses is formed correctly
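The grouping behavior described above can be illustrated in plain Python. This is only a sketch of the idea (one interface block per VLAN, with every prefix nested inside it). The real change is in the Jinja2 template; the function name and the radvd options emitted here are illustrative assumptions, not the template's actual contents.

```python
from collections import OrderedDict


def render_radvd(vlan_prefixes):
    """Render one `interface` block per VLAN with all of its prefixes inside.

    `vlan_prefixes` is an iterable of (interface_name, ipv6_prefix) pairs.
    """
    # Group prefixes by interface first, preserving first-seen order.
    grouped = OrderedDict()
    for ifname, prefix in vlan_prefixes:
        grouped.setdefault(ifname, []).append(prefix)

    lines = []
    for ifname, prefixes in grouped.items():
        lines.append("interface %s" % ifname)
        lines.append("{")
        lines.append("    AdvSendAdvert on;")  # illustrative option only
        for prefix in prefixes:
            lines.append("    prefix %s" % prefix)
            lines.append("    {")
            lines.append("        AdvOnLink on;")  # illustrative option only
            lines.append("    };")
        lines.append("};")
    return "\n".join(lines)


conf = render_radvd([
    ("Vlan1000", "fc02:1000::/64"),
    ("Vlan1000", "fc02:2000::/64"),
])
print(conf)
```

With two addresses on `Vlan1000`, the output contains a single `interface Vlan1000` block holding both `prefix` entries.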
This issue causes negative threshold value and thus deleting log files even when there is enough space.
- Why I did it
To fix an issue when log files get deleted even if there is enough space.
- How I did it
Fixed a typo.
- How to verify it
Run the portion of the script that calculates threshold, see that the threshold is calculated correctly.
Signed-off-by: Stepan Blyschak <stepanb@nvidia.com>
Why I did it
The strcpy and buffer allocation are not safe: they corrupt 1 byte on the stack. Depending on the memory layout, this may or may not cause an issue immediately.
The message type is not validated before updating the counter, which could cause a segmentation fault.
How I did it
Remove the unsafe strcpy, use config->interface.c_str() instead.
Check message type before updating counters.
How to verify it
Issue (1) caused a segmentation fault on a specific platform, and the fix was validated there. The fix for issue (2) is precautionary; a log message was added in case it triggers.
- Why I did it
Update MFT to version 4.18.1-16 for bug fixes and new SN2201 support
- How I did it
Advance the MFT tool version to 4.18.1-16
- How to verify it
Manually tested on all Mellanox platforms (ASIC FW Upgrade, link debug tools, CPLD upgrade, etc.)
- Why I did it
Error log was shown on switches during boot
pmon#supervisord 2021-12-22 04:27:16,709 INFO exited: chassis_db_init (exit status 0; not expected)
- How I did it
Add exit code zero as an expected exit code and also disable autorestart.
- How to verify it
Boot the switch and ensure the above log line does not appear.
cb3ddf5 [pmon][xcvrd] xcvrd process show backtrace on the internal port. Port PR233 (#236)
5b4c9e1 Fix python wheels path downloaded from vs official build. (#244)
- Why I did it
Fix Issue 9972: Incorrect information about release version in sonic_version.yml
- How I did it
Add "sonic_release" file to /sonic-buildimage/files/image_config/
- How to verify it
Install the image and run: cat /etc/sonic/sonic_version.yml
Verify the following item on sonic_version.yml file: release: '202111'
- Why I did it
The platform.json of 4600C lists only 2 CPU core thermal sensors, but there are actually 4
- How I did it
Added thermal sensors for CPU core 2 and core 3.
- How to verify it
Build.
- Why I did it
The chassis name in MSN4410 platform_components.json is not correct
- How I did it
Fix the chassis name
- How to verify it
Run relevant platform API test
Signed-off-by: Kebo Liu <kebol@nvidia.com>
ded0344 Return both 'vendor_rev' and 'hardware_rev' keys from get_transceiver_info to support both earlier and later versions of xcvrd
b3442cc Change log_error to log_info when transceiver module is transitioned
3d3a73c Fix problem introduced with new SFP caching paradigm
2915746 Add script to reboot all IMMs
17ad221 Fix the position_in_parent for the psu entity
207e731 No longer force read pages at SFP init time
b387921 Fixed the voltage and power with rounding to 2 digit decimals
Signed-off-by: mlok <marty.lok@nokia.com>
Why I did it
Updated the BCM config recommended by Broadcom for Nokia-IXR7250E-36x400G
How I did it
Updated the BCM config file
How to verify it
Verified running the image with this BCM config in Nokia-IXR7250E-36x400G and ensured that the syncd container was stable, ports were up and passing the traffic.
Signed-off-by: Sakthivadivu Saravanaraj <sakthivadivu.saravanaraj@nokia.com>
[Build]: Fix hundreds of thousands of log lines printed in the marvell-armhf build
The log flood is caused by the bad format of the Marvell SAI package mrvllibsai_armhf_1.7.1-6.deb. Increase the waiting time to reduce the number of log lines and the wasted CPU.
- Why I did it
Need to remove old static configs from sai.profile files.
New implementation: Azure/sonic-swss#1959
New configuration: #9658
- How I did it
Remove SAI_VXLAN_SRCPORT_RANGE_ENABLE=1 lines from files per HWSKU
- How to verify it
When the static config is removed, the following test will fail (src port will be in range 0-255)
py.test vxlan/test_vnet_vxlan.py --inventory "../ansible/inventory, ../ansible/veos" --host-pattern (testbed)-t0 --module-path ../ansible/library/ --testbed (testbed)-t0 --testbed_file ../ansible/testbed.csv --allow_recover --assert plain --log-cli-level info --show-capture=no -ra --showlocals --disable_loganalyzer --skip_sanity --upper_bound_udp_port 65535 --lower_bound_udp_port 64128
- Why I did it
Remove the obsolete parameter that enables the static VXLAN src port range
Provide functionality to generate the JSON config file according to the appropriate parameter in config_db
Done for
SN3800:
• Mellanox-SN3800-D28C50
• Mellanox-SN3800-C64
• Mellanox-SN3800-D28C49S1 (New 10G SKU)
SN2700:
• Mellanox-SN2700-D48C8
- How I did it
Remove SAI_VXLAN_SRCPORT_RANGE_ENABLE=1 from appropriate sai.profile files
Created the vxlan.json file and added a few parameters that depend on DEVICE_METADATA.localhost.vxlan_port_range
- How to verify it
The file /etc/swss/config.d/vxlan.json should be generated inside the swss docker when it restarts
[
    {
        "SWITCH_TABLE:switch": {
            "vxlan_src": "0xFF00",
            "vxlan_mask": "8"
        },
        "OP": "SET"
    }
]
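The conditional generation of this file could be sketched as below. This is an illustration only: the real file is produced from a template inside the swss docker, the function name is hypothetical, and the value `"enable"` for `vxlan_port_range` is an assumption about how the flag is set in config_db. The `vxlan_src` and `vxlan_mask` values are copied from the expected output above.

```python
import json


def build_vxlan_config(device_metadata):
    """Return the vxlan.json content for the given DEVICE_METADATA table.

    The entry is emitted only when localhost.vxlan_port_range is enabled
    (the literal "enable" is an assumption made for this sketch).
    """
    localhost = device_metadata.get("localhost", {})
    if localhost.get("vxlan_port_range") != "enable":
        return []
    return [
        {
            "SWITCH_TABLE:switch": {
                "vxlan_src": "0xFF00",
                "vxlan_mask": "8",
            },
            "OP": "SET",
        }
    ]


enabled = build_vxlan_config({"localhost": {"vxlan_port_range": "enable"}})
disabled = build_vxlan_config({"localhost": {}})
print(json.dumps(enabled, indent=4))
```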
Signed-off-by: Andriy Yurkiv <ayurkiv@nvidia.com>
- Why I did it
A new version of the Mellanox platform management code is available, adding support for new platforms and fixing bugs.
- How I did it
1. Updated the submodule
2. Updated makefile version references
3. Regenerated SONiC patches
Added midplane_subnet in chassisdb.conf for interfaces-config.sh to create midplane interface in multi-asic namespaces.
Signed-off-by: Sakthivadivu Saravanaraj <sakthivadivu.saravanaraj@nokia.com>
#### Why I did it
PR9611 - sonic-scheduler.yang pattern issue
#### How I did it
Modified the scheduler name pattern string to accept any string
#### How to verify it
Sonic yang tests
Why I did it
Need to be able to run smartctl when pmon docker is not running.
How I did it
Removed the pmon dependency for pmon as well as the command wrapper and added it to the debian-extension.
How to verify it
Stop pmon
Run smartctl from the host and verify it runs without error
Updates include the following changes in order to support new Mellanox platforms and drivers (Azure/sonic-linux-kernel#259)
10ef390 Update kconfig to support / enable newly backported mellanox patches.
6a949e1 Add backported patches for Mellanox hw-mgmt V.7.0020.1300
e1913f7 Rename and reformat patch headers
#### Why I did it
Include sonic-bgp-monitor to setup.py so it gets included in /usr/local/yang-models when installing the package
#### How I did it
#### How to verify it
install the package
- Why I did it
Fix issue: 'sx_port_mapping_t' object has no attribute 'slot_id'. sx_port_mapping_t only has attribute slot.
- How I did it
Change slot_id to slot.
- How to verify it
Manual test
- Why I did it
Python's select.select accepts an optional timeout value in seconds; however, the value passed to it was in milliseconds.
- How I did it
Convert the value from milliseconds to seconds before passing it to select.select.
- How to verify it
Manual test
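The unit mismatch described above is easy to show with a small sketch. `select.select` takes its timeout in seconds (as a float), so a caller holding a millisecond value has to divide by 1000 first. The helper name `wait_readable` and the use of a socket pair are assumptions made for this example, not the platform code itself.

```python
import select
import socket


def wait_readable(sock, timeout_ms):
    """Return True if `sock` becomes readable within `timeout_ms` milliseconds."""
    # select.select expects seconds; passing the millisecond value directly
    # would make the call wait 1000 times longer than intended.
    timeout_s = timeout_ms / 1000.0
    readable, _, _ = select.select([sock], [], [], timeout_s)
    return bool(readable)


a, b = socket.socketpair()
timed_out = not wait_readable(a, 50)  # nothing sent yet: times out after 50 ms
b.send(b"x")
got_data = wait_readable(a, 1000)     # data pending: returns immediately
a.close()
b.close()
print(timed_out, got_data)
```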
Why I did it
To enable test support for BFD-related features, the PTF docker needs to have the proper support for BFD. This PR aims to add BFD support in ptf docker.
How I did it
Clone and build OpenBFDD for PTF docker.
How to verify it
Build locally and verify BFD is supported.
- Why I did it
To include latest SDK fixes:
1. On CMIS modules, after low power configuration, the firmware waited for the module state to be ModuleReady instead of ModuleLowPower causing delays.
2. When connecting SN4600C, 100GbE port with CWDM4 module (Gen 3.0), link up time is 30 seconds.
and to include SAI fixes/changes:
1. Reduce verbosity for resource check vendor data not found
2. Fix metadata validation, check default value on conditions check
3. Add 100MB, 10MB to 2201 system
4. L3 VXLAN overlay ECMP
5. VXLAN srcport API implementation
6. Fix scheduler profile null (default values) when set on sub group scheduler group
7. Fix ACL binding restoration when port leaves a LAG
8. Fix route logic for set next hop/action and reference counter for ECMP overlay
- How I did it
1. Updated SDK/FW submodule and relevant makefiles with the required versions.
2. Update SAI submodule and relevant makefile with the required version.
- How to verify it
Build an image and run tests from "sonic-mgmt".
Why I did it
Requirements from Microsoft for fwutil update all state that all firmwares which support this upgrade flow must support upgrade within a single boot cycle. This conflicted with a number of Mellanox upgrade flows which have been revised to safely meet this requirement.
How I did it
Added --no-power-cycle flags to SSD and ONIE firmware scripts
Modified Platform API to call firmware upgrade flows with this new flag during fwutil update all
Added a script to our reboot plugin to handle installing firmwares in the correct order prior to reboot
How to verify it
Populate platform_components.json with firmware for CPLD / BIOS / ONIE / SSD
Execute fwutil update all fw --boot cold
CPLD will burn / ONIE and BIOS images will stage / SSD will schedule for reboot
Reboot the switch
SSD will install / CPLD will refresh / switch will power cycle into ONIE
ONIE installer will upgrade ONIE and BIOS / switch will reboot back into SONiC
In SONiC run fwutil show status to check that all firmware upgrades were successful
- Why I did it
The feature state can be a jinja template, like in this file - https://github.com/Azure/sonic-buildimage/blob/master/files/build_templates/init_cfg.json.j2#L39.
Without this change it is not possible to validate a configuration file.
- How I did it
Relaxes the constraint on feature state. Feature state leaf can be any string.
- How to verify it
Run UT.
Signed-off-by: Stepan Blyschak <stepanb@nvidia.com>