c5be3ca4 [psud] Increase unit test coverage; Refactor mock platform (#154)
450b7d78 Bug fix: the fields that are not supported by vendor should be "N/A" in STATE_DB (#168)
Signed-off-by: Stephen Sun <stephens@nvidia.com>
Signed-off-by: Yong Zhao yozhao@microsoft.com
Why I did it
If device reboot was caused by kernel panic, then we need retrieve and store the key information into the symbol file previous-reboot-cause.json. The CLI show reboot-cause will read this file to get the reason of previous reboot.
This PR is related to PR in sonic-utilities repo: Azure/sonic-utilities#1486
How I did it
The string variable previous_reboot_cause will be parsed to check whether it contains the keyword Kernel Panic. If it did, then store the keyword and time information into a dictionary.
How to verify it
I verified this change on a virtual testbed.
admin@vlab-01:/host/reboot-cause$ more previous-reboot-cause.json
{"gen_time": "2021_03_24_23_22_35", "cause": "Kernel Panic", "user": "N/A", "time": "Wed 24 Mar 2021 11:22:03 PM UTC", "comment": "N/A"}
admin@vlab-01:/host/reboot-cause$ show reboot-cause
Kernel Panic [Time: Wed 24 Mar 2021 11:22:03 PM UTC]
Backport of https://github.com/Azure/sonic-buildimage/pull/7031 to the 202012 branch
#### Why I did it
To enable parsing the `AutoNegotiation` element from the LinkMetadata section of minigraph file
#### How I did it
Parse the value `AutoNegotiation` element from the `LinkMetadata` section of minigraph file. If the element is present, an `autoneg` key will be added to the port in the `PORT` table of Config DB with a value of either `0` or `1`
If an `autoneg` value is present in port_config.ini, the value from the minigraph will take precedence, overriding that value.
Also remove `AutoNegotiation` and `EnableAutoNegotiation` elements from the `DeviceInfo` section, as we will use this data in the `LinkMetadata` section to determine whether to enable auto-negotiation for a port.
The default bgp connect retry timer is 120 seconds. A reconnection will happen 120 seconds if the initial connection fails. This PR aims to allow a more frequent retry.
The psample module was not loaded on barefoot platform. The loading of this module is a prerequisite for testing SFlow.
* add `.gitignore` to the `barefoot` subdirectory to overwrite ignore "platform/**/debian/*" in the root directory
Integrate hw-management package V.7.0010.2002
Bug fixes:
Removing critical thermal zones to prevent unexpected software system shutdown:
*Kernel 4.9 -0071-mlxsw-core-Remove-critical-trip-point-from-thermal-z.patch
*Kernel 4.19 -076-mlxsw-core-Remove-critical-trip-point-from-thermal-z.patch
Removing redundant link for cpld3 for fixed systems (SN2100, SN2010).
Fix an issue with missed attribute for cpld3 (port CPLD) for SN2700, SN2410.
Signed-off-by: Stephen Sun <stephens@nvidia.com>
this PR updates the following commits in sonic-platform-daemons
260cf2d [xcvrd] change firmware information fields name inside MUX_CABLE_INFO table for Y cable (#165)
cfa600f [thermalctld] Initialize fan led in thermalctld for the first run (#167)
8509f43 [thermalctld] Refactor to allow for greater unit test coverage; Add more unit tests (#157)
70f4e7b [syseepromd] Update warning message to be more informative (#160)
Signed-off-by: vaibhav-dahiya <vdahiya@microsoft.com>
The file device/mellanox/x86_64-mlnx_msn4410-r0/plugins/sfputil.py is not a software link for device/mellanox/x86_64-mlnx_msn2700-r0/plugins/sfputil.py. And it is still using python2 syntex which causes some SFP CLI error. The PR is to change it to a softlink and add 4410 support in device/mellanox/x86_64-mlnx_msn2700-r0/plugins/sfputil.py.
Build Marvell kernel driver for prestera sai sdk
Builds interrupt and dma kernel driver
Removed the older method pre-compiled kernel module debian package and its makefile
Fix the following issues:
Spectrum-2, Spectrum-3 | Port | Fix link issue when using 25 GbE rate between two ports while one is on Spectrum-2-based system and the other is on Spectrum-3-based system
All | warmboot | fail to upgrade from earlier SONiC versions with official SDK/FW 4.4.2306 (was on SONiC 201911)
All | What-Just-Happened | When enabling or disabling WJH under high traffic load to the host CPU, in very specific and low probability conditions, an error could occur, that may result in loss of data, channel failure or in extreme cases SW failure
Signed-off-by: Volodymyr Samotiy <volodymyrs@nvidia.com>
To add latest SAI drop REL_4.3.3.3 to SONIC which addresses the following CSP cases:
CS00012058054: [4.3][IPinIP][TTL-PIPE] IPinIP TTL Pipe Mode is NOT working it is behaving UNIFORM mode even programed as PIPE mode
CS00011227466: [4.3] Warmboot support with tunnel encap
utilities:
* 83f068b 2021-03-22 | Handling error scenario of adding port to Vlan which is part of LAG (#1516) (HEAD -> 202012) [Sudharsan Dhamal Gopalarathnam]
* 470e8ce 2021-03-24 | Enable PFCWD only on ports where PFC is enabled (#1508) [Andriy Yurkiv]
* 09ef2e0 2021-03-22 | [show][config] add support for setting and displaying switching modes on Y cable (#1501) [vdahiya12]
* 0d17d37 2021-03-24 | Warmboot script improvements - timeout exec, disable swss autorestart, remove trap (#1495) [Vaibhav Hemant Dixit]
* 2718cd8 2021-03-24 | [show] Fix int status of LAGs, configured as Vlan members (#1478) [maksymbelei95]
* cc168fb 2021-03-22 | Fix bug: show vlan config for vlan with no members (#1503) [allas-nvidia]
swss:
* 5d8d1fb 2021-03-26 | Revert "Revert "[buffermgr] Support maximum port headroom checking (#1607)" (#1675)" (#1682) (HEAD -> 202012) [Prince Sunny]
* f8df1f8 2021-03-26 | [Dynamic Buffer Calc] Enhance the field checking in table handling (#1680) [Stephen Sun]
* 6328c9f 2021-03-22 | [MuxOrch] FDB ageout safety check (#1674) [Prince Sunny]
* e1d733e 2021-03-21 | reduce severity of log to info in case of flush on non-existing member (#1669) [allas-nvidia]
Signed-off-by: Ying Xie <ying.xie@microsoft.com>
- Why I did it
The existing Fan led and Psu led object initialize itself to green color in init method. However, there are multiple daemons calls sonic platform API and there could be a case that:
A PSU is removed from system
Reboot switch
psud detects that 1 PSU is missing and set PSU led to red
Other daemon just start up and call sonic platform API, the API set PSU led to green by call PsuLed.init
This PR is a partial fix for the issue. As we also need guarantee that the led is initialized with a correct value. I checked existing psud and thermalctld code. psud always initialize the PSU led color on boot up, thermalcltd need some changes to initialize led color on the first run
- How I did it
Remove the led color initialization code from FanLed.init and PsuLed.init
- How to verify it
Manual test
c20bf60 Qi Luo Mon Mar 15 14:28:31 2021 -0700 Implement rfc4363 FdbUpdater for lag inside vlan (#203)
292024a abdosi Mon Mar 15 12:15:21 2021 -0700 Updated lldpRemManAddrTable to use all the management ip address associated with interface. (#201)
9b83459 liushilongbuaa Fri Mar 12 14:35:23 2021 +0800 [CI] Setup dummy azure pipeline (#198)
Signed-off-by: Abhishek Dosi <abdosi@microsoft.com>
Why I did it
It was observed that on a multi-asic DUT bootup, the BGP internal sessions between ASIC's was taking more time to get ESTABLISHED than external BGP sessions. The internal sessions was coming up almost exactly 120 secs later.
In multi-asic platform the bgp dockers ( which is per ASIC ) on switch start are bring brought up around the same time and they try to make the bgp sessions with neighbors (in peer ASIC's) which may be not be completely up. This results in BGP connect fail and the retry happens after 120sec which is the default Connect Retry Timer
How I did it
Add the command to set the bgp neighboring session retry timer to 10sec for internal bgp neighbors.
Eliminate the need for `gbsyncd_start.sh`, which simply calls `exec "/usr/bin/gbsyncd_startup.py"`. The shell script is unnecessary.
Once this PR merges, we can remove `gbsyncd_start.sh` from the sonic-sairedis repo.
- Why I did it
Mellanox-SN4600C-D112C8 SKU is not configured properly.
It should have 112 50G interfaces and 8 100G interfaces as described on this PR.
- How I did it
Changed port_config.ini & sai profile.
- How to verify it
Apply this HwSKU to a MSN4600C Mellanox platform.
sonic-swss
* [nbrmgrd] added function to parse IP address from APP_DB (#1672)
* [MUX/PFCWD] Use in_ports for acls instead of seperate ACL table (#1670)
* [mux] VS test for neigh, route and fdb (#1656)
* [Dynamic buffer calc] Bug fix: Remove PGs from an administratively down port. (#1652)
* spell check fixes (#1630)
sonic-utilities
* [reboot]: Stop mux before reboot on dual ToR (#1500)
* [config] Disable/enable container monitoring when stopping/starting services (#1499)
* Add 'show' and 'clear' command for PG drop (#1461)
* [CLI][techsupport] Add NOOP option for commands that did not have that option (#1445)
* [202012][reload] Improve reload by using sonic.target (#1509)
Signed-off-by: Danny Allen <daall@microsoft.com>
sonic-swss
* Add table descriptions for dynamic buffer calculation to the documents (#1664)
* Remove vxlanmgrd dependency on orchagent (#1647)
sonic-utilities
* [show] Fix 'show mac' output, when FDB entry with Vlan 1 is present (#1368)
* [warm-reboot]: Check empty key before issuing redis hget (#1496)
* [generate-dump] Remove Arista specific logic (#1482)
* [warm-reboot]: added automated recover for ISSU file (#1466)
* [warm-reboot] Check if warm restart flag is set when issuing a warm-reboot (#1460)
* [show][config] fix for show/config muxcable hwmode model value; fix show/config muxcable return codes; (#1494)
sonic-linux-kernel
* [net] Disable prio and cls cgroups to make working cgroup2 sock matching (#198)
Signed-off-by: Danny Allen <daall@microsoft.com>
- Why I did it
To pick up new features and fix from SDK/FW and SAI
SDK/FW new Feature:
All | Added support for multiple modules and cable types. For full list contact Nvidia networking support
Spectrum-3 | SN46000C | Added support for up to 5W on ports 49 to 64 .
SDK/FW bugs' fix:
All | fast reboot | fast boot failure from latest 201811 to 201911 and above
Spectrum | 10GbE/1GbE Transceiver (FTLX8574D3BCV) stopped working after firmware upgrade
Spectrum-2 | When device is rebooted with locked Optical Transceivers in split mode, the firmware may get stuck
Spectrum-2 | SN3700 | When connecting at 200GbE to Ixia K400, Ixia receives CRC errors
Spectrum-2 | SN3800 | On rare occasions packets loss may be experienced due to signal integrity issues
Spectrum-2 | When the port is a member of a LAG, after a warmboot and port toggle on the peer-side, the port remains down
Spectrum-3 | SN4700 | While using Optic cable in Split 4x1 mode in PAM4, when two first ports are toggled, the other 2 ports go down
Spectrum-3 | SN4700 | When working in 400GbE, deleting the headroom configuration (changing buffer size to zero) on the fly may cause continual packet drops
SAI
All | sFlow | Use hardcoded value 1 as netlink group number ax expected by hsflowd
- How I did it
Update the related version number in the make files and update the submodule pointer accordingly.
- How to verify it
Run regression test and everything works good.
To have the following fixes:
* All | Port status remains down after warm boot and flapping the port on peer side
* All | LAG HASH | IPv6 SRC_IP is not accounted in LAG hashing [
* All | ASIC driver | Kernel crash observed when driver reload is initiated before it fully loaded
* Spectrum-3 | Buffer | In lossless configuration, headroom is been evicted only when the shared buffers is free
* All | prevent FW access during ISSU
Signed-off-by: Volodymyr Samotiy <volodymyrs@nvidia.com>
Features may be enabled/disabled for the same topology based on run-time
configuration. This PR adds the ability to enable/disable feature based
on config db data.
signed-off-by: Tamer Ahmed <tamer.ahmed@microsoft.com>
show ip interfaces is enhanced recently to support multi ASIC platforms in this PR- https://github.com/Azure/sonic-utilities/pull/1396 .
The ipintutil script as to run as sudo user, to get the ip interface from each namespace.
Add this script to the sudoer file so that show ip interface command is available for user with read-only permissions
Signed-off-by: Arvindsrinivasan Lakshmi Narasimhan <arlakshm@microsoft.com>
* [202012][submodule] Update sonic-utilities submodule
- [acl-loader] Improve input validation for acl_loader (#1479)
- [show] cli support for show muxcable cableinfo (#1448)
Signed-off-by: Danny Allen <daall@microsoft.com>
- Unset CONFIG_THERMAL_STATISTICS to prevent kernel crash (#199)
- [dni_dps460] Add attributes to retrieve PMBus status command codes (#197)
- [mellanox]: Backport new kernel patches (#195)
- [ci]: build amd64/armhf/arm64 for CI build (#196)
- Fix read and write failure to ‘fan1_target’ attribute of ‘dni_dps460’ driver. (#183)
Signed-off-by: Danny Allen <daall@microsoft.com>
* src/sonic-platform-daemons 068bccc...e5165b7 (7):
> [xcvrd] Fix crash: If 'dom_capability' not in port_info_dict, insert 'N/A' (#162)
> fix the muxcable state change notification received from other modules, omit the check inside hw_state table (#159)
> [xcvrd] Fix crash on platforms which support media settings with Python 3 (#158)
> [xcvrd] Save the dom_capability of transceiver into db (#72)
> [xcvrd] Fix xcvrd crash on other port prefixes (#123)
> [xcvrd] Make functions used for media setting python3 compatible (#153)
> [psud] Refactor unit tests; increase unit test coverage (#146)
Update FRR to 7.5.1. The following is a list of new commits.
```
df7ab485b FRRouting Release 7.5.1
f4ed841b8 Merge pull request #8187 from opensourcerouting/rpmfixes-75
86d5a20e3 Merge pull request #8193 from mjstapp/fix_signals_7_5
b339cc149 lib: avoid signal-handling race with event loop poll call
0f7b432c3 lib: add debug output for signal mask
c0290c86d lib: add sigevent_check api
7a5348665 doc: Fix CentOS 7 Documentation
2a8e69f48 Merge pull request #8064 from donaldsharp/foo
cf4d1a744 redhat: Fix changelog incorrect date format
b78dcb209 Merge pull request #8181 from idryzhov/7.5-zebra-blackhole
2032e7e72 zebra: don't use kernel nexthops for blackhole routes
e52003567 bgpd: When deleting a neighbor from a peer-group the PGNAME is optional
aa86a6a6f Merge pull request #8161 from mjstapp/fix_sa_7_5_backports
13a8efb4b Merge pull request #8156 from idryzhov/7.5-backports-2021-02-26
58911c6ed lib: Free memory leak in error path in clippy
556dfd211 lib: use right type for wconv() return val
bd9caa8f1 lib: fix some misc SA warnings
683b3fe3f lib: register dependency between control plane protocol and vrf nb nodes
b45248fb6 lib: add definitions for vrf xpaths
7b9f10d04 lib: add ability to register dependencies between northbound nodes
9c240815c bgpd: Bgp peer group issue
d1b43634b bgpd: upon bgp deletion, do not systematically ask to remove main bgp
f5d1dc55e bgpd: Fix crash when we don't have a nexthop
c2e463478 frr-reload: rpki context exiting uses exit and not end
f11db1698 bgpd: Blackhole nexthops are not reachable
c628e94ff staticd: fix vrf enabling
49b079ef1 staticd: fix nexthop creation and installation
0077038e9 staticd: fix nexthop validation
be3dfbbc7 zebra: use AF_INET for protocol family
```
- Why I did it
Update MFT tool version to 4.16.0
Bugs fixes:
mlxlink: Fixed an issue that caused the margin scan to fail with the following message: Eye scan not completed.
mlxcable: Cable firmware burning capability is not supported.
New features:
mlxlink: Enabled margin scan on Network links.
mlxlink: Added PRBS TX/RX polarity inversion using the following flags: --invert_tx_polarity / --invert_rx_polarity
- How I did it
Update MFT make file with new version number.
- How to verify it
Build image and test related functions on Mellanox platform