Why I did it
Converting sonic-vs.img.gz to vhdx with sonic-slave-buster also needs the reproducible build options.
Otherwise sonic-slave-buster is rebuilt because its tag differs.
Work item tracking
Microsoft ADO (number only): 25615544
How I did it
Add build options so that the same sonic-slave docker is used when generating the vhdx image.
How to verify it
Why I did it
To avoid orchagent crashes such as sonic-net/sonic-swss#2935, disable unsupported counters on SONiC management devices.
Work item tracking
Microsoft ADO (number only): 25437720
How I did it
Update the minigraph parser to disable unsupported counters on management devices.
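A minimal sketch of the idea, not the actual parser code. The device type names, the counter list, and the helper name disable_unsupported_counters are assumptions for illustration; only the FLEX_COUNTER_TABLE / FLEX_COUNTER_STATUS layout follows the usual CONFIG_DB convention.

```python
# Hypothetical values: the real parser may use different device types and counters.
MGMT_DEVICE_TYPES = {"MgmtToRRouter", "MgmtTsToR"}   # assumed management device types
UNSUPPORTED_COUNTERS = ["PFCWD", "QUEUE_WATERMARK"]  # assumed counter groups


def disable_unsupported_counters(config, device_type):
    """Return a copy of config with the listed counters disabled on management devices."""
    result = dict(config)
    if device_type not in MGMT_DEVICE_TYPES:
        return result  # non-management devices are left untouched
    table = dict(result.get("FLEX_COUNTER_TABLE", {}))
    for counter in UNSUPPORTED_COUNTERS:
        table[counter] = {"FLEX_COUNTER_STATUS": "disable"}
    result["FLEX_COUNTER_TABLE"] = table
    return result
```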
How to verify it
Verified by unit test.
Manually applied the patch to a DUT and ran config load_minigraph.
Co-authored-by: Zhijian Li <zhijianli@microsoft.com>
* [baseimage]: Update openssh to 1:8.4p1-5+deb11u2 (#16826)
Openssh in Debian Bullseye has been updated to 1:8.4p1-5+deb11u2 to fix CVE-2023-38408.
Since we're building openssh with some patches, we need to update our version as well.
Signed-off-by: Saikrishna Arcot <sarcot@microsoft.com>
* Remove main deb installation for derived deb build (#16859)
* Don't install dependencies of derived debs
When "building" a derived deb package, don't install the dependencies of
the package into the container. It's not needed at this stage.
* Re-add openssh-client and openssh-sftp-server as derived debs
Signed-off-by: Saikrishna Arcot <sarcot@microsoft.com>
---------
Signed-off-by: Saikrishna Arcot <sarcot@microsoft.com>
* Re-add missing dependency for derived debs. (#16896)
* Re-add missing dependency for derived debs.
My previous change removed the whole dependency on the main deb
existing, not just the installation of the main deb. Fix this by
re-adding a dependency on the main deb being built/pulled from cache.
Signed-off-by: Saikrishna Arcot <sarcot@microsoft.com>
* Add the kernel and initramfs as dependencies for RFS build
Signed-off-by: Saikrishna Arcot <sarcot@microsoft.com>
---------
Signed-off-by: Saikrishna Arcot <sarcot@microsoft.com>
---------
Signed-off-by: Saikrishna Arcot <sarcot@microsoft.com>
What I did:
Make sure internal iBGP peers are one hop away (directly connected) by using the Generalized TTL Security Mechanism (GTSM).
Why I did:
Without this change, on a packet chassis an i-BGP session can be established even if there is no direct connection. For example:
- Say we have 3 LCs (LC1/LC2/LC3), each having an i-BGP session with the others over Loopback4096.
- Each LC has a static route towards the other LCs' Loopback4096 to establish the i-BGP sessions.
- LC1 learns the default route 0.0.0.0/0 from its e-BGP peers and sends it to LC2 and LC3 over i-BGP.
- Now, for some reason, the static route on LC2 towards LC3 is removed/not-present/some-issue; we expect the i-BGP session between LC2 and LC3 to go down.
- However, the i-BGP session between LC2 and LC3 does not go down because of the feature ip nht-resolve-via-default, with which LC2 uses the default route to reach Loopback4096 of LC3. Because it uses the default route, BGP packets from LC2 towards LC3 are first routed to LC1 and go to LC3 from there.
The above scenario can result in packet mis-forwarding on the data plane.
How I fixed it:
To make sure BGP packets between i-BGP peers do not take an extra routing hop, enable the GTSM feature:
neighbor PEER ttl-security hops NUMBER
This command enforces Generalized TTL Security Mechanism (GTSM), as specified in RFC 5082. With this command, only neighbors that are the specified number of hops away will be allowed to become neighbors. This command is mutually exclusive with ebgp-multihop.
We set the hop count to 1, which makes FRR reject the BGP connection if a received BGP packet has TTL < 255. Setting this attribute also makes sure i-BGP frames are originated with an IP TTL of 255.
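The TTL check can be sketched as follows. This is an illustration of the GTSM acceptance rule only, not FRR code; the function names are made up for the example.

```python
MAX_TTL = 255  # GTSM senders originate packets with the maximum IP TTL


def gtsm_min_ttl(hops):
    """Lowest acceptable received TTL for a peer that is at most `hops` hops away."""
    if not 1 <= hops <= 254:
        raise ValueError("hops must be in 1..254")
    return MAX_TTL - hops + 1


def gtsm_accepts(received_ttl, hops=1):
    """True if a packet with this TTL passes the GTSM check."""
    return received_ttl >= gtsm_min_ttl(hops)
```

With hops set to 1, a packet that crossed one extra router arrives with TTL 254 and is rejected, which matches the LC2 -> LC1 -> LC3 detour described above.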
How I verify:
Manual verification of the above scenario. When BGP packets are received with IP TTL 254 (an additional routing hop), we see TCP FIN flags because BGP rejects the connection.
Signed-off-by: Abhishek Dosi <abdosi@microsoft.com>
Release Notes for Cisco 8102-32FH-O:
Fixed platform_test failures in test_component.py
IOFPGA_SJTAG label under ‘fwutil show status’ changed to ‘IOFPGA’
Validated auto FPD upgrade
Improve per-command authorization performance by reading passwd entries with getpwent.
This is a manual cherry-pick PR for #16460
Why I did it
Currently, per-command authorization checks whether the user is a remote user with the getpwnam API, which triggers tacplus-nss to authenticate with the TACACS server.
This is not necessary, because when the user logs in, the user information is already added to the local passwd file.
The getpwent API can read directly from the passwd file, which improves per-command authorization performance.
Add pre start check to ensure intfmgrd is running.
The check will run for 20 seconds at most.
Signed-off-by: Longxiang Lyu <lolv@microsoft.com>
Co-authored-by: Longxiang Lyu <35479537+lolyu@users.noreply.github.com>
src/linkmgrd
* d7ab364 - (HEAD -> 202205, origin/202205) [warmboot] config all interfaces back to `auto` if reconciliation times out (#220) (29 minutes ago) [Jing Zhang]
* Revert "[SNMP][IPv6]: Fix to use link local IPv6 address as snmp agentAddress (#16013) (#16102)"
This reverts commit 628e1ad981.
* Revert "[SNMP][IPv6]: Fix SNMP IPv6 reachability issue in certain scenarios (#15487) (#15826)"
This reverts commit 7cfb71bc18.
* [202205][Arista] Update arista platform submodules
- fix issue where platform debug info would no longer be in the dump
- fix issue in scd-xcvr where active low bits couldn't be set
- fix issue in scd-smbus where it performs an oob access
src/sonic-platform-common
* ade83aa - (HEAD -> 202205, origin/202205) [202205] Fix issue: should use 'Value' column to calculate the health percentage for Virtium SSD (#385) (4 weeks ago) [Junchao-Mellanox]
Previously, get_num_asics() returned the maximum number of ASICs. However, asic_count
should be the actual number of ASICs populated, which can be obtained from get_asic_presence_list().
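A small sketch of the difference. The two functions below are stand-ins with made-up return values, not the real implementations; only their names come from the description above.

```python
def get_num_asics():
    """Stand-in: maximum number of ASICs the platform can hold (made-up value)."""
    return 4


def get_asic_presence_list():
    """Stand-in: IDs of the ASICs that are actually populated (made-up values)."""
    return [0, 2]


def asic_count():
    """Count only the ASICs that are present, not the platform maximum."""
    return len(get_asic_presence_list())
```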
ADO: 25158825
Signed-off-by: Abhishek Dosi <abdosi@microsoft.com>
* [buffers] Add create_only_config_db_buffers.json for MLNX devices (not MSFT SKU), inject it at the start of the swss docker
Signed-off-by: vadymhlushko-mlnx <vadymh@nvidia.com>
* [buffers] Align the sonic-device_metadata.yang
Signed-off-by: vadymhlushko-mlnx <vadymh@nvidia.com>
---------
Signed-off-by: vadymhlushko-mlnx <vadymh@nvidia.com>
Upgrade the xgs SAI version to 7.1.62.4 to include the following changes:
7.1.62.4: ECMP CRM fix - CS00012312907
7.1.61.4: Includes nexthop group scaling fix - CS00012304075
7.1.60.4: CS00012302193 - SAI_SWITCH_ATTR_SWITCH_HARDWARE_INFO attribute value changed
7.1.59.4: [CS00012302400 CS00012302347]backport SONIC-76986 to SAI7.1: Fix the issue--"empty LAG can't be added to ACL entry"
7.1.57.4: [CSP CS00012296571] Backport SONIC-75371 jira on SAI 7.1 branch
7.1.56.4: [CSP CS00012302193] backport SONIC-72912 jira on SAI 7.1 branch
Signed-off-by: zitingguo-ms <zitingguo@microsoft.com>
Update SDK/FW to 4.5.4318/2010.4316 and SAI to 2205.25.1.2 in order to include listed below fixes.
SDK/FW
In some cases, when an ACL has two or more rules with a similar key, modifying/removing one of the rules may cause modification/removal of one of the similar-key rules, instead of the requested rule.
Using module SPQCELRCDFB when connected to a 3rd party switch, there may either be no link or a very long link up time (~2 minutes).
In some cases, warm boot from 201911 to 202205 might result in dataplane traffic loss.
When upgrading the SONiC version using warm boot from 201911/202012 to a newer version, then cold booting back to the older version and upgrading again to the newer one, the warm boot might fail.
SAI
Added support for dynamic ordered ECMP group (SAI_NEXT_HOP_GROUP_TYPE_DYNAMIC_ORDERED_ECMP)
"store and forward" KV was added
Added Support for IPV6 link local debug counters
---------
Signed-off-by: Volodymyr Samotiy <volodymyrs@nvidia.com>
Why I did it
Fixes for
MIGSMSFT-333 / SR 696141124 - Fix ORDERED ECMP NHG drop when route is added before members are added
MIGSMSFT-333 / SR 696141124 – Fix port handling of empty ecmp group to drop packets
Why I did it
The SONiC service determine-reboot-cause might run before the driver creates the reset cause files. In that case, the reset cause is reported as "Unknown". This PR introduces a mechanism to wait until the reset cause sysfs files are ready.
How I did it
The file /run/hw-management/config/reset_attr_ready indicates that all reset cause files are ready. The chassis.get_reboot_cause function waits for /run/hw-management/config/reset_attr_ready for up to 45 seconds.
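A minimal sketch of such a wait, assuming a simple polling loop. The helper name wait_for_file and the 1-second poll interval are assumptions, not taken from the platform code.

```python
import os
import time

RESET_ATTR_READY = "/run/hw-management/config/reset_attr_ready"


def wait_for_file(path, timeout=45.0, interval=1.0):
    """Poll until `path` exists; return False if it is still missing after `timeout` seconds."""
    deadline = time.monotonic() + timeout
    while True:
        if os.path.exists(path):
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)
```

A caller could fall back to the "Unknown" reboot cause when wait_for_file(RESET_ATTR_READY) returns False.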
How to verify it
Manual test on master/202211/202205
Microsoft ADO (25266920)
The sonic-mgmt xoff test was failing for [100g,120km]. The total headroom pool size needed to be updated for the case where a 100G line card is used as a T2 uplink.
This size was calculated assuming the 100G card is used for downlink, so the cable length was 2km, whereas it can also be used for uplink (cable length 120km). So the calculation needs to be based on 120km, not 2km. This wastes some headroom in the 2km scenario, but it covers both cases.
This fixes Nokia-ION/ndk#22
Note that this PR must be coupled with NDK version >= 22.9.13
Why I did it
To provide proper support for CMIS compliant transceiver module CDB operations (including FW related operations).
How I did it
Enhanced the transport subsystem so as to provide for up to 2k bytes of data to be passed to/from modules (as contrasted with the prior max of 128 bytes).
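To give a rough feel for the effect of the larger transfer size, the sketch below counts how many transfers a payload needs at each limit. It assumes "2k" means 2048 bytes and ignores any per-transfer protocol overhead; the 1 MiB image size is a made-up example, not a measured firmware size.

```python
def transfers_needed(payload_bytes, max_chunk_bytes):
    """Number of transfers required to move the payload (ceiling division)."""
    return -(-payload_bytes // max_chunk_bytes)


image_size = 1 << 20  # hypothetical 1 MiB firmware image
old_count = transfers_needed(image_size, 128)   # prior 128-byte limit
new_count = transfers_needed(image_size, 2048)  # assumed 2048-byte limit
```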
How to verify it
Ensure that new FW (firmware) can be programmed to CMIS compliant module(s) using the 'sfputil firmware ...' commands.
What I did:
Fixes: #16468
Why I did:
On some chassis there is no dedicated eth1-midplane interface on the supervisor for supervisor and LC communication; instead, the Linux bridge br1 is used. Because of this, the changes that were made to white-list traffic over eth1-midplane would not work.
How I did:
To fix this, we use the altname property of the ip link command to set eth1-midplane as an altname of the bridge interface. This keeps the design generic across chassis and between supervisor and LC. The iptables rules are updated to get the parent/base interface name of eth1-midplane.
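One way to resolve the base interface name is to parse the JSON output of `ip -j link show`, sketched below. This is an assumption about a possible approach, not the code from this change; the sample JSON is hand-written and trimmed to the two keys used here, and base_ifname is a hypothetical helper.

```python
import json


def base_ifname(ip_link_json, name):
    """Return the kernel interface name that `name` refers to, as ifname or altname."""
    for link in json.loads(ip_link_json):
        if link.get("ifname") == name or name in link.get("altnames", []):
            return link["ifname"]
    return None


# Hand-written sample: br1 carries eth1-midplane as an altname.
sample = '[{"ifname": "br1", "altnames": ["eth1-midplane"]}, {"ifname": "eth0"}]'
resolved = base_ifname(sample, "eth1-midplane")
```

On a chassis with a dedicated eth1-midplane interface, the same lookup would return eth1-midplane itself, so the rule generation would not need a special case.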
Signed-off-by: Abhishek Dosi <abdosi@microsoft.com>