7202 Commits

Author SHA1 Message Date
Saikrishna Arcot
3bbfaa1ee8
Upgrade docker-sonic-vs and docker-syncd-vs to Bullseye (#13294)
* Upgrade docker-sonic-vs and docker-syncd-vs to Bullseye

Signed-off-by: Saikrishna Arcot <sarcot@microsoft.com>

* iproute2: Force a new version and timestamp to be used for the package

There is an issue with Docker's overlay2 storage driver when not using
native diffs (and thus falling back to naive diff mode), which is the
case in the CI builds. The way the naive diff mode detects changes is by
comparing the file size and comparing the timestamps (specifically, I
believe it's the modification timestamp), and if there's a change there,
then it's considered a change that needs to be recorded as part of that
layer.

The problem is that with the code being added in the patch, the file
size remains the same, and the timestamp of binary files appears to be
the same timestamp as the changelog entry (likely for reproducible build
purposes). The file size remains the same likely due to extra padding
within the file introduced by relro. Because of this, Docker doesn't
detect this file has changed, and doesn't save the new file as part of
this layer.

To work around this, create a new changelog entry (with a new version as
well) with a new timestamp. This will result in the binary files having
a different timestamp, and thus will get saved by Docker as part of that
layer.
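
The detection gap described above can be illustrated with a minimal Python sketch. This is not Docker's implementation: `looks_changed` is a hypothetical helper that only mimics a comparison of file size and modification time, which is what this commit message believes naive diff mode relies on.

```python
import os
import tempfile


def looks_changed(old_path, new_path):
    """Hypothetical check: report a change only if size or mtime differs."""
    old, new = os.stat(old_path), os.stat(new_path)
    return old.st_size != new.st_size or int(old.st_mtime) != int(new.st_mtime)


def demo():
    with tempfile.TemporaryDirectory() as tmp:
        old_path = os.path.join(tmp, "old.bin")
        new_path = os.path.join(tmp, "new.bin")
        with open(old_path, "wb") as f:
            f.write(b"AAAA")
        with open(new_path, "wb") as f:
            f.write(b"BBBB")  # different content, same size
        # Both files carry the same timestamp, as with a shared changelog entry.
        os.utime(old_path, (1_000_000, 1_000_000))
        os.utime(new_path, (1_000_000, 1_000_000))
        missed = looks_changed(old_path, new_path)
        # A new changelog entry gives the rebuilt file a new timestamp.
        os.utime(new_path, (2_000_000, 2_000_000))
        detected = looks_changed(old_path, new_path)
        return missed, detected
```

With identical size and timestamp the content change goes unnoticed; once the timestamp differs, the same check reports a change.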

Signed-off-by: Saikrishna Arcot <sarcot@microsoft.com>

---------

Signed-off-by: Saikrishna Arcot <sarcot@microsoft.com>
2023-03-19 21:14:27 -07:00
mssonicbld
1e8e993a94 [ci/build]: Upgrade SONiC package versions 2023-03-20 09:00:28 +08:00
mssonicbld
89ebd43c81
[ci/build]: Upgrade SONiC package versions (#14311)
Upgrade SONiC Versions
2023-03-19 10:16:41 +08:00
jcaiMR
c0a02b1f82
advance dhcprelay to 1d221b0 (#14068) 2023-03-17 15:54:53 -07:00
Dev Ojha
de17f72d9a
[Buffer] Added cable length config to buffer config template for EdgeZoneAggregator (#14280)
Why I did it
SONiC currently does not identify 'EdgeZoneAggregator' neighbor. As a result, the buffer profile attached to those interfaces uses the default cable length which could cause ingress packet drops due to insufficient headroom. Hence, there is a need to update the buffer templates to identify such neighbors and assign the same cable length as used by the T1.

How I did it
Modified the buffer template to identify EdgeZoneAggregator as a neighbor device type and assign it the same cable length as a T1/leaf router.

How to verify it
Unit tests pass, and manually checked on a 7260 to see the changes take effect.

Signed-off-by: dojha <devojha@microsoft.com>
2023-03-17 11:01:17 -07:00
mssonicbld
96817c4357
[ci/build]: Upgrade SONiC package versions (#14102)
Upgrade SONiC Versions
2023-03-17 10:12:30 +08:00
lixiaoyuner
935f5dc5f0
Install kubernetes-cni for kubelet (#14163)
Why I did it
A new bug was found on the kubelet side. The kubernetes-cni plug-in installation was removed in #12997 because the plug-in is installed automatically when kubeadm is installed, and keeping the explicit install code caused an error. After the removal, however, the automatically installed version differs from the one installed before. This affects kubelet behavior in some scenarios that had not been seen previously, so the plug-in needs to be installed another way.

How I did it
Install kubernetes-cni==0.8.7-00 before installing kubeadm

How to verify it
The Flannel binary is installed under the /opt/cni/bin/ folder
2023-03-16 17:21:37 -07:00
Neetha John
f30fb6ec58
[storage_backend] Add backend acl service (#14229)
Why I did it
This PR addresses the issue mentioned above by loading the acl config as a service on a storage backend device

How I did it
The new acl service is a oneshot service which will start after swss and does some retries to ensure that the SWITCH_CAPABILITY info is present before attempting to load the acl rules. The service is also bound to sonic targets which ensures that it gets restarted during minigraph reload and config reload
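
The retry behaviour described above might look roughly like the following Python sketch. It is an illustration only: `capability_present` and `load_acl` are hypothetical callables standing in for the SWITCH_CAPABILITY lookup and the ACL loading step, and the retry count and delay are assumed values, not the ones used by the service.

```python
import time


def load_acl_when_ready(capability_present, load_acl, retries=5, delay_s=1.0,
                        sleep=time.sleep):
    """Hypothetical sketch: retry until the capability info exists, then load ACLs."""
    for attempt in range(retries):
        if capability_present():
            load_acl()
            return True
        # Wait before the next attempt, but not after the last one.
        if attempt < retries - 1:
            sleep(delay_s)
    return False
```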

How to verify it
Built an image with these changes and ran the following tests

Verified that acl is loaded successfully on a storage backend device after a switch boot up
Verified that acl is loaded successfully on a storage backend ToR after minigraph load and config reload
Verified that acl is not loaded if the device is not a storage backend ToR or the device does not have a DATAACL table

Signed-off-by: Neetha John <nejo@microsoft.com>
2023-03-16 14:18:28 -07:00
Neetha John
8e4ce44e5c
Update dynamic threshold for TD2 (#14224)
Why I did it
Update dynamic threshold to -1 to get optimal performance for RDMA traffic

How I did it
Modified pg_profile_lookup.ini to reflect the correct value

Signed-off-by: Neetha John <nejo@microsoft.com>
2023-03-16 10:06:46 -07:00
Vivek
f19c414176
[lldpmgrd] Don't log error message for outdated event (#14178)
- Why I did it
Fixes #14236

When a redis event quickly gets outdated during port breakout, error logs like this are seen

Mar  8 01:43:26.011724 r-leopard-56 INFO ConfigMgmt: Write in DB: {'PORT': {'Ethernet64': {'admin_status': 'down'}, 'Ethernet68': {'admin_status': 'down'}}}
Mar  8 01:43:26.012565 r-leopard-56 INFO ConfigMgmt: Writing in Config DB
Mar  8 01:43:26.013468 r-leopard-56 INFO ConfigMgmt: Write in DB: {'PORT': {'Ethernet64': None, 'Ethernet68': None}, 'INTERFACE': None}
Mar  8 01:43:26.018095 r-leopard-56 NOTICE swss#portmgrd: :- doTask: Configure Ethernet64 admin status to down
Mar  8 01:43:26.018309 r-leopard-56 NOTICE swss#portmgrd: :- doTask: Delete Port: Ethernet64
Mar  8 01:43:26.018641 r-leopard-56 NOTICE lldp#lldpmgrd[32]: :- pops: Miss table key PORT_TABLE:Ethernet64, possibly outdated
Mar  8 01:43:26.018654 r-leopard-56 ERR lldp#lldpmgrd[32]: unknown operation ''

- How I did it
Only log the error when the op is not empty and not one of ("SET" & "DEL" )
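
As a minimal sketch of that condition (the function name `should_log_unknown_op` is hypothetical and is not taken from lldpmgrd):

```python
def should_log_unknown_op(op):
    """Return True only for a non-empty op that is neither SET nor DEL."""
    return bool(op) and op not in ("SET", "DEL")
```

An outdated event arrives with an empty op, so it no longer produces the "unknown operation ''" error shown in the log above.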

Signed-off-by: Vivek Reddy Karri <vkarri@nvidia.com>
2023-03-16 18:15:50 +02:00
Vivek
4856c2f22d
[submodule] Advance sonic-dbsyncd pointer
fa8b709 Handled the error case of negative age (#57)
990f5b0 Use github code scanning instead of LGTM (#55)
a7992c5 Install libyang for swss-common. (#50)
244fa86 Update README.md

Signed-off-by: Vivek Reddy Karri <vkarri@nvidia.com>
2023-03-16 10:04:23 +02:00
dbarashinvd
06d6dafcf3
[Mellanox] fix for watchdog device not found, adding dependency on hw-management (#14182)
- Why I did it
Sometimes the Nvidia watchdog device isn't ready when the watchdog-control service comes up after the first installation from ONIE.
The watchdog-control service needs to be delayed until after hw-mgmt, which brings the devices up and makes them ready.

- How I did it
Delay the Nvidia watchdog-control service until hw-mgmt has started on the Mellanox platform, to avoid a missing or not-ready watchdog device.

- How to verify it
Ran ONIE image installation in a loop,
making sure the watchdog service is always up (not failed) after the first installation from ONIE.
2023-03-15 18:36:20 +02:00
Junchao-Mellanox
03cab99a7a
[system-health] Make check interval more accurate (#14085)
- Why I did it

Healthd checks system status every 60 seconds. However, running the checkers may take several seconds. If the checkers take X seconds, healthd takes (60 + X) seconds to finish one iteration. This makes the sonic-mgmt test case unstable because X is hard to predict and differs among platforms. This PR introduces an interval compensation mechanism to the healthd main loop.

- How I did it

Introduces an interval compensation mechanism to healthd main loop: healthd should wait (60 - X) seconds for next iteration
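
A minimal sketch of the (60 - X) compensation, assuming a simple loop; `compensated_wait` and `run_iterations` are hypothetical names, not healthd's actual functions:

```python
import time


def compensated_wait(interval_s, elapsed_s):
    """Seconds to wait so that one iteration takes interval_s in total."""
    return max(0.0, interval_s - elapsed_s)


def run_iterations(check, interval_s=60.0, iterations=1,
                   clock=time.monotonic, sleep=time.sleep):
    """Run check repeatedly, subtracting its runtime from each wait."""
    for _ in range(iterations):
        start = clock()
        check()
        sleep(compensated_wait(interval_s, clock() - start))
```

Clamping at zero covers the case where a check takes longer than the interval itself.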

- How to verify it

Manual test
Unit test
2023-03-15 07:21:00 +02:00
kellyyeh
7d585dc48d
Update dhcpv6-relay yang model (#14144)
Why I did it
Add interface-id in dhcpv6-relay yang model

How I did it
Add interface-id option and corresponding UT. Updated configuration.md

How to verify it
kellyyeh@kellyyeh:~/sonic-buildimage/src/sonic-yang-models$ pyang -Vf tree -p /usr/local/share/yang/modules/ietf ./yang-models/sonic-dhcpv6-relay.yang
2023-03-14 22:01:55 -07:00
FuzailBrcm
d3e3565a6c
Fix issue: enhancing PDDF common eeprom APIs to use caching (#13835) (#13848)
Why I did it
To enhance pddf_eeprom.py to use caching and fix #13835

How I did it
Utilising the in-built caching mechanism in the base class eeprom_base.py.
Adding a cache file to store the eeprom data.

How to verify it
By running 'decode-syseeprom' or 'show platform syseeprom' commands.
2023-03-14 17:54:58 -07:00
FuzailBrcm
f822373e53
Enabling FPGA device support in PDDF (#13477)
Why I did it
To enable FPGA support in PDDF.

How I did it
Added FPGAI2C and FPGAPCI in the build path for the PDDF debian package
Added the support for FPGA access APIs in the drivers of fan, xcvr, led etc.
Added the FPGA device creation support in PDDF utils and parsers

How to verify it
These changes can be verified on some platform using such FPGAs. For testing purpose, we took Dell S5232f platform and brought it up using PDDF. In doing so, FPGA devices are created using PDDF and optics eeproms were accessed using common FPGA drivers. Below are some of the logs.
2023-03-14 17:53:35 -07:00
Samuel Angebault
8bd6a8891c
[Arista] Update platform library submodules (#14037)
- Add chassis platform API reboot
- Add fwutil hooks for firmware updates
- Fix PikeZ i2c bus identification issue
- Fix testing issue
2023-03-14 09:36:25 -07:00
Prince George
1cbfc9ceb8
[yang]: Add Yang model support for adding Channel to PORT table (#14228)
Why I did it
Add 'channel' to the CONFIG_DB PORT table. This will be needed to support PORT breakout to multiple channel ports so that Xcvrd can understand which datapath or channel to initialize on the CMIS compliant optics

How I did it
Add 'channel' to the CONFIG_DB PORT table.

How to verify it
Added unit test for valid and invalid channel number
Channel 0 -> No breakout
Channel 1 to 8 -> Breakout channel 1,2, ..8
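
The channel ranges listed above can be expressed as a small Python check. This is only an illustration of the value ranges; `describe_channel` is a hypothetical helper and is not part of the YANG model or its unit tests.

```python
def describe_channel(channel):
    """Map a PORT 'channel' value to its meaning, per the ranges listed above."""
    if isinstance(channel, bool) or not isinstance(channel, int):
        raise ValueError(f"channel must be an integer: {channel!r}")
    if channel == 0:
        return "no breakout"
    if 1 <= channel <= 8:
        return f"breakout channel {channel}"
    raise ValueError(f"invalid channel: {channel}")
```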

Signed-off-by: Prince George <prgeor@microsoft.com>
2023-03-14 09:34:16 -07:00
Sudharsan Dhamal Gopalarathnam
43af6b925c
[submodule] Update sonic-swss submodule (#14177)
Update sonic-swss submodule pointer to include the following:
* 98a16cf [ACL] Write ACL table/rule creation status into STATE_DB ([#2662](https://github.com/Azure/sonic-swss/pull/2662))
* a2c9a61 [EVPN]Handling error scenarios during route programming and IMR add ([#2670](https://github.com/Azure/sonic-swss/pull/2670))
* 115efe8 [bfdorch] add default TOS value for BFD session ([#2689](https://github.com/Azure/sonic-swss/pull/2689))
* a198289 [orchagent, SRv6]: create seglist support to set sid list type ([#2406](https://github.com/Azure/sonic-swss/pull/2406))
2023-03-14 08:50:20 -07:00
Dror Prital
35f8101b50
Update SDK/FW to version 4.5.4206/4.5.4204 (#14164)
- Why I did it
To include latest fixes:

Fix traffic loss on all routed traffic when moving from 4.4.3372/XX_2008_3388 to 4.5.4118-012/XX_2010_4120-010. The issue occurred after the ISSU process, on Spectrum 1 only, when upgrading from an older version to a newer one: neighbor entries were overwritten.
Fix: when using a mirror session policer on SPC2/3, the actual CIR was 1.28 times more than the configured CIR value.
Fix: creation of a router interface of type bridge could occasionally fail if the create was performed immediately after a delete.
Fix: false errors could be seen in the syslog during SDK deinitialization.

- How I did it
Updated SDK submodule and relevant makefiles with the required versions.

- How to verify it
Build an image and run tests from "sonic-mgmt".
2023-03-14 16:27:38 +02:00
davidpil2002
8098bc4bf5
Add Secure Boot Support (#12692)
- Why I did it
Add Secure Boot support to SONiC OS.
Secure Boot (SB) is a verification mechanism for ensuring that code launched by a computer's UEFI firmware is trusted. It is designed to protect a system against malicious code being loaded and executed early in the boot process before the operating system has been loaded.

- How I did it
Added a signing process that signs the following components during the build, when the feature is enabled at build time according to the HLD explanations (the feature is disabled by default):
shim, grub, Linux kernel, and kernel modules.

- How to verify it
Each boot component is self-verified when building the image. In addition, an existing end-to-end test in the sonic-mgmt repo checks that the boot succeeds when loading a secure system (details below).

How to build a sonic image with secure boot feature: (more description in HLD)

Required to use the following build flags from rules/config:
SECURE_UPGRADE_MODE="dev"
SECURE_UPGRADE_DEV_SIGNING_KEY="/path/to/private/key.pem"
SECURE_UPGRADE_DEV_SIGNING_CERT="/path/to/cert/key.pem"
After setting those flags, build sonic-buildimage.
Before installing the image, prepare the setup (switch device) as follows:
check that the device supports UEFI
store the public keys in the UEFI DB
enable the Secure Boot flag in UEFI
How to run a test that verifies the Secure Boot flow:
The existing test "test_upgrade_path" under "sonic-mgmt/tests/upgrade_path/test_upgrade_path" is enough to validate proper boot.
You need to specify the following arguments:
Base_image_list your_secure_image
Target_image_list your_second_secure_image
Upgrade_type cold
Then run the test. The test installs the base image given in the parameter, upgrades to the target image with a cold reboot, and validates that all the services are up and working correctly.
2023-03-14 14:55:22 +02:00
zitingguo-ms
1cd67444e4
Upgrade SAI xgs version to 8.4.0.2 and migrate to DMZ (#14212)
Why I did it
Upgrade SAI XGS version to 8.4.0.2 and migrate to DMZ repo.

How I did it
Update SAI XGS version in sai.mk.

How to verify it
Run the SONiC and SAI test with the SAI pipeline.

Signed-off-by: zitingguo-ms <zitingguo@microsoft.com>
2023-03-14 14:09:30 +08:00
Ye Jianquan
5e85c01621
Add scandir into sonic-mgmt docker image (#14219)
Why I did it
TestbedV2 requires scandir python package

How I did it
Install scandir packages
2023-03-14 08:58:11 +08:00
Nazarii Hnydyn
a5c5e82116
[submodule]: Advance sonic-swss-common submodule. (#14207)
Update sonic-swss-common submodule pointer to include the following:

23df338 [ci] Continue on error when running test. (#757)
06ffb51 Define ACL_TABLE and ACL_RULE table in STATE_DB (#748)
1b369ab [ci] Fix apt-get install unable locate package issue. (#753)
619d4ec Improve unit test for go wrapper (#752)
2023-03-13 17:23:47 -07:00
jhli-cisco
838d76b52f
[sonci-slave]: update sonic-slave docker files to include cisco sdk dependencies (#14203)
cisco SDK dependencies needed
2023-03-13 14:32:34 -07:00
Samuel Angebault
1516ace9a5
[Arista] Add missing platform_components.json (#14067)
Provide platform-components.json for Clearwater2 and Wolverine

These files are needed for fwutil platform sonic-mgmt tests to pass.

Fix PikeZ platform_components.json

Co-authored-by: Patrick MacArthur <pmacarthur@arista.com>
Co-authored-by: Andy Wong <andywong@arista.com>
2023-03-13 12:18:42 -07:00
Marty Y. Lok
836d65d616
[EVERFLOW][ACL_ATBLE] Fix for everflow ACL_TABLE in config_db not having the routed ports when no -ASIC in the asic_port_name (#13532)
Why I did it
After the renaming of the asic_port_name in the port_config.ini file (PR: #13053), the asic_ifname in port_config.ini lost its '-ASIC<asic_id>' suffix and is now just the port name. Example: 'Eth0-ASIC0' became 'Eth0'.

However, with this change a config_db generated via config load_minigraph caused the EVERFLOW and EVERFLOWV6 tables under ACL_TABLE to have none of the non-LAG front panel interfaces. This was causing the EVERFLOW suite to fail.

How I did it
In parse_asic_external_neigbhors in minigraph.py there was a check that asic_name.lower() (such as asic0) is present in the port_alias_asic_map. However, with -ASIC removed from the asic_ifname, the port_alias_asic_map no longer contains the asic_name, so no non-LAG neighbor was included.

The fix was to ignore the asic name change, since the port_alias_asic_map already looks only for ports in the same asic as asic_name.

How to verify it
Execute "config load_minigraph" with the minigraph generated by the sonic-mgmt gen-minigraph script, and confirm that the non-LAG interfaces are present in the EVERFLOW tables in the config_dbs.

Signed-off-by: mlok <marty.lok@nokia.com>
2023-03-13 10:58:32 -07:00
xumia
5f4d063506
[Build] Fix the mirror gpg key expired issue (#14206)
Why I did it
[Build] Fix the mirror gpg key expired issue
See vs build: https://dev.azure.com/mssonic/build/_build/results?buildId=231680&view=logs&j=cef3d8a9-152e-5193-620b-567dc18af272&t=cf595088-5c84-5cf1-9d7e-03331f31d795

How I did it
Add the apt option that disables the Valid-Until check. The option was already set in the SONiC docker base image, but docker ptf was missing it.

Acquire::Check-Valid-Until "false";
How to verify it
The docker-ptf build succeeds after the fix.

2023-03-11T17:26:35.1801999Z [ building ] [ target/docker-ptf.gz ] 
2023-03-11T17:38:10.1608536Z [ finished ] [ target/docker-ptf.gz ]
2023-03-13 11:13:21 +08:00
dbarashinvd
3d9016050f
Revert "[submodule] Advance sonic-sairedis pointer (#14199)" (#14208)
Reverted because the submodule update PR needs to be merged together with PR #14200, which is not available yet due to some failures. Having only the sairedis PR will break fast-boot.
2023-03-12 19:44:08 +02:00
Dror Prital
d5ca0a5162
[submodule] Advance sonic-sairedis pointer (#14199)
Update sonic-sairedis submodule pointer to include the following:
* 4bd1dc5 Fast reboot finalizer ([#1213](https://github.com/sonic-net/sonic-sairedis/pull/1213))
* 749b393 [ci] Fix apt-get install unable locate package issue. ([#1212](https://github.com/sonic-net/sonic-sairedis/pull/1212))
* 886875b [Dual-ToR] update sai.profile with SAI_ADDITIONAL_MAC_ENABLED attribute if corresponding arg passed to syncd ([#1201](https://github.com/sonic-net/sonic-sairedis/pull/1201))
* c58d259 Use new value of STATE_DB FAST_REBOOT entry ([#1196](https://github.com/sonic-net/sonic-sairedis/pull/1196))
* 3808e4c Fix issue: bulk counter feature is disabled ([#1205](https://github.com/sonic-net/sonic-sairedis/pull/1205))

Signed-off-by: dprital <drorp@nvidia.com>
2023-03-12 14:16:29 +02:00
Saikrishna Arcot
3556e6c2eb
[submodule] Advance sonic-swss-common pointer (#14142)
Update sonic-swss-common submodule pointer to include the following:

565ad4b Fix common path issue (#751)
3352881 Prevent sonic-db-cli generate core dump (#749)
43cadec Add ProfileProvider class to support read profile config from PROFILE_DB. (#683)
8b09f90 Update path to sairedis tests (#747)
85f3776 Non recursive automake and Debian packaging changes (#700)
This is a reland of #13950, with the debug image build fix.
2023-03-12 14:12:02 +02:00
Liu Shilong
03c02e3946
[action] Update AutoMergeScan action to ignore Semgrep and rerun failed job. (#14118)
Why I did it
Semgrep check has some issues. Ignore it.
check automerge label.
Ignore Azure.sonic-buildimage sub test jobs. Only check final result.
How I did it
2023-03-10 14:14:51 +08:00
Zain Budhwani
30528f2317
Update sonic-gnmi submodule (#14112)
Why I did it

The update contains the following commits:

50123ef Zain Budhwani   Tue Feb 28 16:48:22 2023 -0800  Add logs for md5 checksum (sonic-net/sonic-gnmi#80)
a90f2b3 Zain Budhwani   Mon Feb 27 23:44:49 2023 -0800  Add get-update to azp yml (sonic-net/sonic-gnmi#79)
14fe6f4 Zain Budhwani   Tue Jan 31 14:11:27 2023 -0800  Add 202012 branch to pr checker (sonic-net/sonic-gnmi#72)
a792474 Zain Budhwani   Tue Jan 31 09:22:38 2023 -0800  Fix crash when retrieving cpu utilization (sonic-net/sonic-gnmi#70)

How I did it

Fetch new changes
2023-03-09 21:04:59 -08:00
Sambath Kumar Balasubramanian
71835385c1
sonic-buildimage Remove unused SAT port from arista configs. (#14167)
Why I did it
To fix aristanetworks/sonic#85

How I did it
Remove unnecessary SAT ports

How to verify it
Speed change from 400G to 100G works without any error.
2023-03-09 15:54:20 -08:00
kellyyeh
a45d7bf9d8
Update dhcpmon rx/tx packet filtering and fix server rx count (#13898)
Why I did it
Dhcpmon had an incorrect RX count for server-side packets. It did not raise any false alarms, but it could miss a server-side packet count mismatch between the snapshot and the current counter.

Add a debug mode which prints the counter to syslog

How I did it
Due to the dualtor inbound filter requirement, there are currently two filters, one each for listening to rx and tx packets.
Originally, we opened an rx/tx socket for each interface specified, which caused duplicate sockets. Now we initialize the sockets only once. Neither socket is bound to an interface, and we use a vlan to interface mapping to filter packets. For inbound uplinks, we use a portchannel to interface mapping.

Previous dhcpmon counter before dual tor change:
[ Agg-Vlan1000- Current rx/tx] Discover: 1/ 4, Offer: 1/ 1, Request: 3/ 12, ACK: 1/ 1
[ eth0- Current rx/tx] Discover: 0/ 0, Offer: 0/ 0, Request: 0/ 0, ACK: 0/ 0
[ eth0- Current rx/tx] Discover: 0/ 0, Offer: 0/ 0, Request: 0/ 0, ACK: 0/ 0
[ PortChannel104- Current rx/tx] Discover: 0/ 1, Offer: 0/ 0, Request: 0/ 3, ACK: 0/ 0
[ PortChannel103- Current rx/tx] Discover: 0/ 1, Offer: 0/ 0, Request: 0/ 3, ACK: 0/ 0
[ PortChannel102- Current rx/tx] Discover: 0/ 2, Offer: 1/ 0, Request: 0/ 6, ACK: 1/ 0
[ PortChannel101- Current rx/tx] Discover: 0/ 0, Offer: 0/ 0, Request: 0/ 0, ACK: 0/ 0
[ Vlan1000- Current rx/tx] Discover: 1/ 0, Offer: 0/ 1, Request: 3/ 0, ACK: 0/ 1
[ Agg-Vlan1000- Current rx/tx] Discover: 1/ 4, Offer: 1/ 1, Request: 3/ 12, ACK: 1/ 1

Dhcpmon counter after this PR:
[ PortChannel104- Current rx/tx] Discover: 0/ 1, Offer: 0/ 0, Request: 0/ 3, ACK: 0/ 0
[ PortChannel103- Current rx/tx] Discover: 0/ 1, Offer: 0/ 0, Request: 0/ 3, ACK: 0/ 0
[ PortChannel102- Current rx/tx] Discover: 0/ 2, Offer: 1/ 0, Request: 0/ 6, ACK: 1/ 0
[ PortChannel101- Current rx/tx] Discover: 0/ 0, Offer: 0/ 0, Request: 0/ 0, ACK: 0/ 0
[ Vlan1000- Current rx/tx] Discover: 1/ 0, Offer: 0/ 1, Request: 3/ 0, ACK: 0/ 1
[ Agg-Vlan1000- Current rx/tx] Discover: 1/ 4, Offer: 1/ 1, Request: 3/ 12, ACK: 1/ 1

How to verify it
Ran dhcp relay test to send all four packets in singles and batches on both single ToR and dual ToR. Counter was as expected.
2023-03-09 15:52:57 -08:00
NanQiSweeper
ae71988b9a
[yang]SONiC Yang model support for Telemetry_client. (#12483) (#13314)
Why I did it
Create SONiC Yang model for Telemetry_client
How I did it
Defined Yang models based on the guideline docs:
https://github.com/Azure/SONiC/blob/master/doc/mgmt/SONiC_YANG_Model_Guidelines.md
and
https://github.com/Azure/sonic-utilities/blob/master/doc/Command-Reference.md
How to verify it
Added test cases to verify it.
2023-03-09 10:13:32 -08:00
Junhua Zhai
c4c621c614
[gearbox] use credo sai v0.9.0 (#14149)
Update credo sai package to the latest v0.9.0.
2023-03-08 23:42:10 -08:00
Sudharsan Dhamal Gopalarathnam
8d82a86134
[Mellanox]Fix lpmode set when logical port is larger than 64 (#14138)
- Why I did it
In the sfplpm API, the number of logical ports is hardcoded as 64. When a system contains more ports than this, the SDK APIs fail with a syslog as below

Mar 7 03:53:58.105980 r-leopard-58 ERR syncd#SDK: [MGMT_LIB.ERR] Slot [0] Module [0] has logport [0x00010069] in enabled state
Mar 7 03:53:58.105980 r-leopard-58 ERR syncd#SDK: [SDK_MGMT_LIB.ERR] Failed in __sdk_mgmt_phy_module_pwr_attr_set, error: Internal Error
Mar 7 03:53:58.106118 r-leopard-58 ERR pmon#-c: Error occurred when setting power mode for SFP module 0, slot 0, error code 1

- How I did it
Removed the hardcoded value of 64 and obtained the number of logical ports from the SDK

- How to verify it
Manual testing
2023-03-09 00:02:55 +02:00
Volodymyr Samotiy
15bee3a2c0
[Mellanox] Update MFT to 4.22.1-15 (#14133)
Signed-off-by: Volodymyr Samotiy <volodymyrs@nvidia.com>
2023-03-08 10:21:58 +02:00
Liu Shilong
2ba2ff1398
[ci] Migrate test jobs to vmss agent pool to increase node limit. (#14127)
Why I did it
The original ubuntu-20.04 agent pool has a node limit of 35.
Use the vmss agent pool to get a higher node limit.

How I did it
2023-03-07 21:14:04 +08:00
Tejaswini Chadaga
ba30775d65
Add yang model definition for CHASSIS_MODULE table (#14007)
Why I did it
Add yang model definition for CHASSIS_MODULE, defined and implemented for SONiC chassis. The HLD for this configuration is included in https://github.com/sonic-net/SONiC/blob/master/doc/pmon/pmon-chassis-design.md#configuration

Fixes #12640

How I did it
Added yang model definition, unit tests, sample config and documentation for the table

How to verify it
Validated config tree generation using "pyang -Vf tree -p /usr/local/share/yang/modules/ietf ./yang-models/sonic-voq-inband-interface.yang"

Built the below python-wheels to validate unit tests and other changes
target/python-wheels/bullseye/sonic_yang_mgmt-1.0-py3-none-any.whl
target/python-wheels/bullseye/sonic_yang_models-1.0-py3-none-any.whl
target/python-wheels/bullseye/sonic_config_engine-1.0-py3-none-any.whl
2023-03-07 11:24:12 +08:00
Yaqiang Zhu
284ba61a86
[dhcp-relay] Add dhcp_relay show cli (#13614)
Why I did it
Currently the show and clear cli of dhcp_relay may cause confusion.

How I did it
Add doc for it: [doc] Add docs for dhcp_relay show/clear cli sonic-utilities#2649
Add dhcp_relay config cli and test cases.
show dhcp_relay ipv4 helper
show dhcp_relay ipv6 destination
show dhcp_relay ipv6 counters
sonic-clear dhcp_relay ipv6 counters

How to verify it
Unit test all passed
2023-03-06 10:48:25 -08:00
Stepan Blyshchak
f908dfe919
[Mellanox] Place FW binaries under platform directory instead of squashfs (#13837)
Fixes #13568

Upgrade from old image always requires squashfs mount to get the next image FW binary. This can be avoided if we put FW binary under platform directory which is easily accessible after installation:

admin@r-spider-05:~$ ls /host/image-fw-new-loc.0-dirty-20230208.193534/platform/fw-SPC.mfa
/host/image-fw-new-loc.0-dirty-20230208.193534/platform/fw-SPC.mfa
admin@r-spider-05:~$ ls -al /tmp/image-fw-new-loc.0-dirty-20230208.193534-fs/etc/mlnx/fw-SPC.mfa
lrwxrwxrwx 1 root root 66 Feb  8 17:57 /tmp/image-fw-new-loc.0-dirty-20230208.193534-fs/etc/mlnx/fw-SPC.mfa -> /host/image-fw-new-loc.0-dirty-20230208.193534/platform/fw-SPC.mfa

- Why I did it
202211 and above use a different squashfs compression type that the 201911 kernel cannot handle. Therefore, this change avoids mounting squashfs altogether.

- How I did it
Place the FW binary under /host/image-/platform/mlnx/; soft links in /etc/mlnx are created to avoid breaking existing scripts/automation.
/etc/mlnx/fw-SPCX.mfa is a soft link that always points to the FW that should be used in the current image.
mlnx-fw-upgrade.sh is updated to prefer the /host/image-/platform/mlnx location and fall back to /etc/mlnx in squashfs in case the new location does not exist. This is necessary for image downgrade.
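
The prefer-then-fall-back lookup can be sketched in Python as follows. mlnx-fw-upgrade.sh itself is a shell script, so this only illustrates the logic; `pick_fw_path` and its parameters are hypothetical, and the directories are passed in rather than hardcoded.

```python
import os


def pick_fw_path(platform_dir, squashfs_dir, fw_name, exists=os.path.exists):
    """Prefer the platform directory; fall back to the squashfs location."""
    preferred = os.path.join(platform_dir, fw_name)
    if exists(preferred):
        return preferred
    # Older images only ship the FW binary inside squashfs.
    return os.path.join(squashfs_dir, fw_name)
```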

- How to verify it
Upgrade from 201911 to master
master to 201911 downgrade
master -> master reboot
ONIE -> master boot (First FW burn)
2023-03-06 13:36:43 +02:00
mssonicbld
506f372533
[ci/build]: Upgrade SONiC package versions (#14072)
Upgrade SONiC Versions
2023-03-05 11:29:38 +08:00
ppikh
de84eb98c7
[ptf]: Added package "wireshark-common" into PTF docker (#14070)
This gives us the "mergecap" application, which can merge multiple .pcap files into a single .pcapng file and convert it to a .pcap file

Signed-off-by: Petro Pikh <petrop@nvidia.com>
2023-03-04 17:47:42 -08:00
StormLiangMS
ac14a3a587
[submodule advance] advance sonic-swss 309df59 #14076
Why I did it
submodule advance for master branch

309df59 - Revert "[aclorch] Fixed issue #2204.Support IN_PORTS qualifer in MIRRORV6 table. (#2668)" (#2687) (85 minutes ago) [StormLiangMS]
ebe8de7 - [FDB]Fixing FDB consolidated flush for Remote MACs (#2673) (2 days ago) [Sudharsan Dhamal Gopalarathnam]
c9ae6aa - Fix issue: there is no retry while creating a RIF which is in removing state (#2679) (2 days ago) [Junchao-Mellanox]
79afcb3 - [Dual-ToR] handle 'mux_tunnel_egress_acl' attrib in order to change ACL configuration (drop on ingress/egress) on standby ToR (#2646) (3 days ago) [Andriy Yurkiv]
c2b01ba - [orchagent]: Get bridge port ID from orchagent cache instead of SAI API (#2657) (3 days ago) [Lawrence Lee]
d8a1cb7 - [dualtor] Fix neighbor miss when mux is not ready (#2676) (3 days ago) [Longxiang Lyu]
1531dff - [ci] Fix pipeline error about team5 not found. (#2684) (4 days ago) [Liu Shilong]
cfcd40c - [aclorch] Fixed issue #2204.Support IN_PORTS qualifer in MIRRORV6 table. (#2668) (4 days ago) [Rajkumar-Marvell]
35a7ab0 - swss: Fix Invalid port oid messages generated because of voq counters. (#2653) (8 days ago) [Sambath Kumar Balasubramanian]
How I did it
How to verify it
run PR test
2023-03-05 09:09:23 +08:00
anamehra
4a93e4cfa4
Add support for platform syncd pre shutdown plugin (#13564)
Why I did it
A vendor platform may require running a platform-specific pre-shutdown routine before shutting down the syncd process, which runs the SAI and vendor SDK instance.

How I did it
Added a platform script hook which will be executed if the plugin script is provided by the platform in device//plugins/
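
The "run it only if the platform provides it" behaviour might be sketched in Python as below. This is an assumption-laden illustration, not the actual hook code: `run_pre_shutdown_hook` is a hypothetical name, and the hook path is supplied by the caller.

```python
import os
import subprocess


def run_pre_shutdown_hook(hook_path, is_file=os.path.isfile, runner=subprocess.run):
    """Run the platform hook if the platform provides one; otherwise do nothing."""
    if not is_file(hook_path):
        return False
    # check=False: a failing hook is not turned into an exception here.
    runner([hook_path], check=False)
    return True
```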
2023-03-03 15:53:33 -08:00
Vaibhav Hemant Dixit
860bc7492a
Add shellcheck and mock modules for running unit and linter test (#14062) 2023-03-03 19:24:26 +00:00
Tejaswini Chadaga
f80bf7783d
Fix VOQ_CHASSIS_V6_PEER route-map config (#14055)
* Fix typo in VOQ_CHASSIS_V6_PEER route-map config

* Updated UT files with the changed config
2023-03-03 09:28:57 -08:00
xumia
ead3d124e4
[Build] Support to use loosen version when failed to install python packages (#14013)
Why I did it
[Build] Support falling back to a looser version when installing python packages fails.
This fixes issue #14012.

How I did it
Try to use the installation command without constraint
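
A rough sketch of this fallback, assuming pip's `-c` constraint option is what pins the versions; `install_with_fallback` is a hypothetical helper and not the build system's actual wrapper:

```python
import subprocess
import sys


def install_with_fallback(package, constraint_file, run=subprocess.call):
    """Try a constrained install first; retry without the constraint on failure."""
    base = [sys.executable, "-m", "pip", "install", package]
    if run(base + ["-c", constraint_file]) == 0:
        return "constrained"
    # The pinned version could not be installed; let pip pick a version.
    if run(base) == 0:
        return "unconstrained"
    return "failed"
```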

How to verify it
2023-03-03 15:21:10 +08:00