Commit Graph

5595 Commits

Author SHA1 Message Date
Liu Shilong
de00b87161
[ci] Transfer organization from Azure to sonic-net for sonic-mgmt (#11559) (#11563)
Why I did it
Transfer organization from Azure to sonic-net for sonic-mgmt
2022-07-28 15:32:26 +08:00
Lior Avramov
a40aca43b9 [memory_checker] Do not check memory usage of containers if docker daemon is not running (#11476)
Fix in Monit memory_checker plugin. Skip fetching running containers if docker engine is down (can happen in deinit).
This PR fixes issue #11472.

Signed-off-by: liora liora@nvidia.com

Why I did it
When Monit runs during the deinit flow, the memory_checker plugin fetches the running containers without checking whether the Docker service is still running. This change adds that check.

How I did it
Use systemctl is-active to check if Docker engine is still running.
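
A minimal sketch of the check described above, not the actual plugin code. It assumes the Docker unit is named `docker`; the helper names `is_service_active` and `should_check_containers` and the injectable `run` parameter are hypothetical and exist only to make the example testable.

```python
import subprocess


def is_service_active(service, run=subprocess.run):
    """Return True when systemd reports the given unit as active."""
    # `systemctl is-active --quiet <unit>` exits with status 0 only when the unit is active.
    result = run(["systemctl", "is-active", "--quiet", service], check=False)
    return result.returncode == 0


def should_check_containers(run=subprocess.run):
    """Skip the container lookup entirely when the Docker engine is down."""
    # Assumption: the Docker engine runs as the systemd unit named "docker".
    return is_service_active("docker", run=run)
```

A caller would return early, before listing containers, whenever `should_check_containers()` is false.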

How to verify it
Use systemctl to stop the Docker engine and reload Monit; the log should show no errors and should contain the relevant message.

Which release branch to backport (provide reason below if selected)
The fix is required in 202205 and 202012 since the PR that introduced the issue was cherry picked to those branches (#11129).
2022-07-27 23:28:19 +00:00
xumia
14f67b130d [ci] Fix some not sai package removed issue (#11544)
Only replace file names that start with "cisco-".
2022-07-27 23:28:15 +00:00
Taylor Cai
c4927e0e68 [device/celestica]:Fix failed test case of Seastone snmp (#11430)
* Update psu.py
* Update thermal.py
2022-07-27 23:28:11 +00:00
kellyyeh
a2e0542356 [dhcpmon] Open different socket for dual tor to enable interface filtering (#11201)
2022-07-27 23:27:58 +00:00
tjchadaga
6c2f99a327 Add load_minigraph option to include traffic-shift-away during config migration (#11403)
2022-07-27 23:27:21 +00:00
Dror Prital
db37325f76
[202012][Mellanox] Update SAI version to 1.22.0.0 and SDK/FW to version 4.5.2318/2010_2318 (#11534)
- Why I did it
Update SAI version - 1.22.0.0
Update SDK/FW version - 4.5.2318/2010_2318

SAI Changes:
1. Port FEC fix for multiple speeds
2. Next hop group optimized bulk API
3. Support BFD remote-disc exchange in negotiation stage
4. Reduce verbosity of shared database already exists print

SDK/FW Fixes:
1. Cr space timeout on Hold and Release GW - at warmboot
2. SPC-1 Port in stuck PHY_UP after peer side rebooted
3. memory leak in sx_api_router_ecmp_update_set

- How I did it
Update pointer for the new SAI and SDK/FW

- How to verify it
Run regression tests
2022-07-26 21:01:36 +03:00
jhli-cisco
66d49231cf
Update cisco-8000.ini (#11522)
update cisco-8000 platform version to 202012-v0.107
2022-07-24 11:43:07 +08:00
anamehra
ee43011748
Update cisco-8000 submodule to v0.106 (#11505)
Signed-off-by: anamehra <anamehra@cisco.com>
2022-07-22 17:01:57 +08:00
VenkatCisco
e2042e2ad6
update cisco-8000 platform version to v106 (#11504)
2022-07-21 08:31:50 -07:00
Kebo Liu
c60bf90590
[202012] [Mellanox] Update hw-mgmt package to V.7.0010.2349 (#11421)
- Why I did it
New changes in this new HW-MGMT package:

1. hw-mgmt: chassis events: Fix voltmon address conflict on connecting
2. hw-mgmt: topology: Add COMEX BRDWL respin support
  a. Removed A2D sensor from all COMEX BRDWL boards
  b. Add COMEX BRDWL boards with register defined (config3)

- How I did it
Advance the hw-mgmt repo pointer and update the hw-mgmt version number

- How to verify it
Run platform-related regression test cases on the new testbed.

Signed-off-by: Kebo Liu <kebol@nvidia.com>
2022-07-20 09:00:17 +03:00
xumia
85c36c5c69 [Build] Cleanup the version deb preference file after build (#11414)
Why I did it
Cleanup the version deb preference file after build.
The version file is of no use after the build.

How I did it
Remove the unused version file.
2022-07-19 23:09:07 +00:00
bingwang-ms
c5eb031111
[202012] Add flag to control the generation of global level map (#11451)
Why I did it
This PR is to cherry-pick #11448 to 202012 branch after resolving conflicts.
There are conflicts in

files/build_templates/qos_config.j2
src/sonic-config-engine/tests/test_j2files.py
2022-07-15 09:44:45 -07:00
Neetha John
15cc046eda
[202012] Update MMU and ECN settings for Arista-7260CX3-D96C16 (#11427)
Signed-off-by: Neetha John <nejo@microsoft.com>

Why I did it
Missed this sku in the previous PR #11398

How I did it
Update the dynamic threshold to 0 and ECN settings as 2mb/10mb/5%

How to verify it
Updated unit tests to use the modified values for 7260 ecn settings.
2022-07-15 09:33:39 -07:00
Kebo Liu
aa4379ddbe
[202012] [Mellanox] Add sensor conf file for new version of MSN3700/3700C/4600C platforms (#11358)
- Why I did it
MSN3700/3700C/4600C have been respun. The new HW versions of these platforms have different sensors, so the correct sensor.conf needs to be applied to them.

- How I did it
Add new sensor.conf files for the respun platforms.
Enhance the logic of "get_sensors_conf_path" for the related platforms in order to load the correct sensor.conf for each version of platforms.

- How to verify it
Run the sensors test on the different versions of the platforms.
Signed-off-by: Kebo Liu <kebol@nvidia.com>
2022-07-14 08:59:10 +03:00
Jing Zhang
81f200fdec
[202012][sonic-linkmgrd] submodule update #11371

a2367d0 Jing Zhang Fri Jun 24 09:10:12 2022 -0700 Remove exception throwing when initializing missing loopback interface #90

sign-off: Jing Zhang zhangjing@microsoft.com
2022-07-12 09:38:11 -07:00
Neetha John
4de610af15
[202012] Update 7260 MMU and ECN settings (#11398)
Signed-off-by: Neetha John <nejo@microsoft.com>

Why I did it
Improve throughput and latency for 7260 deployments

How I did it
Update the dynamic threshold to 0 and ECN settings as 2mb/10mb/5%

How to verify it
Updated unit tests to use the modified values for 7260 ecn settings.
2022-07-12 08:46:44 -07:00
Neetha John
c6f9664b2e
[202012] Minigraph parser changes to select mmu profiles based on SonicQosProfile attribute (#11383)
Signed-off-by: Neetha John <nejo@microsoft.com>

Why I did it
There is a need to select different mmu profiles based on deployment type

How I did it
There will be separate subfolders (RDMA-CENTRIC, TCP-CENTRIC, BALANCED) in each hwsku folder, which contain deployment-specific mmu and qos settings. The SonicQosProfile attribute in the minigraph will be used to determine which settings to use. If that attribute is not present, the default settings that exist in the hwsku folder will be used.
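
The selection rule above can be sketched as follows. This is an illustration under stated assumptions, not the minigraph parser itself: `select_qos_dir` and `KNOWN_PROFILES` are hypothetical names, and falling back to the hwsku folder for an unknown value or a missing subfolder is an assumption (the commit only specifies the fallback when the attribute is absent).

```python
import os

# Subfolder names taken from the commit message.
KNOWN_PROFILES = ("RDMA-CENTRIC", "TCP-CENTRIC", "BALANCED")


def select_qos_dir(hwsku_dir, sonic_qos_profile=None):
    """Return the directory whose mmu and qos settings should be used."""
    if sonic_qos_profile in KNOWN_PROFILES:
        candidate = os.path.join(hwsku_dir, sonic_qos_profile)
        if os.path.isdir(candidate):
            return candidate
    # Attribute absent: use the default settings in the hwsku folder.
    # Assumption: an unknown value or a missing subfolder falls back the same way.
    return hwsku_dir
```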
2022-07-12 08:45:55 -07:00
mssonicbld
550ab26fc7
[ci/build]: Upgrade SONiC package versions (#11422)
2022-07-12 15:39:32 +00:00
Liu Shilong
ed728abb08
[ci] Fix test stage dependency issue. (#11386) (#11391)
Why I did it
When any test job fails in the Test stage, a rerun does not work: the Test stage is skipped automatically, so there is no chance to run it again. The test checks then stay in a failed status and block the PR from merging indefinitely.

This is likely caused by the condition on the Test stage, which should specify the status of the BuildVS stage.

How I did it
Fix stage dependency logic.
2022-07-12 17:55:01 +08:00
Dror Prital
bc935d4002
[202012][submodule] Advance sonic-linux-kernel pointer (#11406)
Update sonic-linux-kernel submodule pointer to include the following:
* [202012][patch] mlxsw: i2c: Prevent transaction execution for special chip states ([#282](https://github.com/Azure/sonic-linux-kernel/pull/282))

Signed-off-by: dprital <drorp@nvidia.com>
2022-07-11 21:40:26 +03:00
Neetha John
ec7cc16199
[202012] Submodule update for sonic-utilities (#11400)
Signed-off-by: Neetha John <nejo@microsoft.com>

This PR contains the following commits
5a54bd7 Added cisco config platform commands (Azure/sonic-utilities#2241)
62c1640 [config/load_mgmt_config] Support load IPv6 mgmt IP (Azure/sonic-utilities#2206)
c061a18 Fix header for the output table following 'show ipv6 interface' command (Azure/sonic-utilities#2219)
ecca18ff [202012] Update load minigraph to load backend acl (Azure/sonic-utilities#2235)
2022-07-11 09:10:02 -07:00
Junhua Zhai
9991b6ac5b [dhcp6relay] Check interface address is not NULL (#11359)
Why I did it
The dhcp6relay daemon may crash due to a null pointer access to the ifa_addr member of struct ifaddrs. An interface is not guaranteed to have a valid ifa_addr; some special virtual/pseudo interfaces do not have one.

How I did it
Check that the ifa_addr pointer is valid before accessing it.
2022-07-08 21:39:48 +00:00
xumia
de786ccd25 [Build] Fix the missing debian package for reproducible build issue (#11333)
Why I did it
Fix the missing debian package for reproducible build issue.

The gnupg2 should be added into the version file.
https://dev.azure.com/mssonic/build/_build/results?buildId=118139&view=logs&j=88ce9a53-729c-5fa9-7b6e-3d98f2488e3f&t=8d99be27-49d0-54d0-99b1-cfc0d47f0318

The following packages have unmet dependencies:
 gnupg2 : Depends: gnupg (>= 2.2.27-2+deb11u2) but 2.2.27-2+deb11u1 is to be installed
E: Unable to correct problems, you have held broken packages.

The issue was caused by the removal of gnupg2, which was not detected.
sonic-buildimage/build_debian.sh, line 250 in 4fb6cf0:

 sudo LANG=C chroot $FILESYSTEM_ROOT apt-get -y remove software-properties-common gnupg2 python3-gi

How I did it
Export the debian packages whenever any debian package is removed.
2022-07-08 21:39:44 +00:00
Neetha John
26ee4ae4a4 Add backend acl template (#11220)
Why I did it
Storage backend has all vlan members tagged. If untagged packets are received on those links, they are accounted as RX_DROPS which can lead to false alarms in monitoring tools. Using this acl to hide these drops.

How I did it
Created an acl template which will be loaded during minigraph load for backend. This template allows tagged vlan packets and drops untagged ones.

How to verify it
Unit tests

Signed-off-by: Neetha John <nejo@microsoft.com>
2022-07-08 21:39:39 +00:00
Alexander Allen
2c3dc47b2c [sonic-py-common] Add platform and chassis info methods to device_info (#7652)
#### Why I did it
These methods were added to make some convenient platform and chassis information methods accessible through sonic-py-common. These methods were refactored from sonic-utilities and are used in the `show platform summary` and `show version` commands. 

#### How I did it
There are two methods, one is `get_platform_info()` which simply calls local methods to collect useful platform information into a dictionary format, this came directly from sonic-utilities.
2022-07-08 21:39:33 +00:00
Neetha John
6fe583ed1c
[202012] Minigraph parser changes for storage backend acl (#11267)
Signed-off-by: Neetha John <nejo@microsoft.com>

Backport #11221

Why I did it
For storage backend, certain rules will be applied to the DATAACL table to allow only vlan tagged packets and drop untagged packets.

How I did it
Create DATAACL table if the device is a storage backend device
To avoid ACL resource issues, remove EVERFLOW related tables if the device is a storage backend device

How to verify it
Added the following unit tests

verify that EVERFLOW acl tables are removed and the DATAACL table is added for storage backend tor
verify that no DATAACL tables are created and EVERFLOW tables exist for storage backend leaf
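
The behaviour these two unit tests describe can be sketched as below. This is a simplified illustration and not the parser code: `adjust_acl_tables` is a hypothetical name, matching EVERFLOW tables by name prefix is an assumption, and the DATAACL entry is a placeholder rather than the real table definition.

```python
def adjust_acl_tables(acl_tables, is_storage_backend_tor):
    """Return a copy of the ACL tables adjusted for a storage backend ToR."""
    tables = dict(acl_tables)
    if is_storage_backend_tor:
        # Drop EVERFLOW related tables to avoid ACL resource issues.
        # Assumption: those tables are identified by an "EVERFLOW" name prefix.
        tables = {name: value for name, value in tables.items()
                  if not name.startswith("EVERFLOW")}
        # Placeholder entry; the real DATAACL table carries more fields.
        tables.setdefault("DATAACL", {"type": "L3"})
    return tables
```

For a storage backend leaf the input is returned unchanged, so EVERFLOW tables remain and no DATAACL table appears.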
2022-07-08 08:47:25 -07:00
Prince Sunny
0c6892776d
[Submodule] Update sonic-swss (#11369)
* e84a901 - 2022-07-06 : [vnetorch] fix use-after-free in removeBfdSession() (#2366) [Yakiv Huryk]
2022-07-07 22:36:46 -07:00
Ying Xie
1d55dca6d3 [Buffer] Separate buffer profile for Arista-7060CX-32S-Q24C8
Signed-off-by: Ying Xie <ying.xie@microsoft.com>
2022-07-07 14:09:01 -07:00
Ying Xie
17a9259c55 [7060] fix default port map
Signed-off-by: Ying Xie <ying.xie@microsoft.com>
2022-07-07 14:09:01 -07:00
Kevin Wang
41518aa825 [Buffer] Separate buffer profile for Arista-7260CX3-Q64
Signed-off-by: Kevin Wang <shengkaiwang@microsoft.com>
2022-07-07 14:09:01 -07:00
Kevin Wang
f53b2620db [Buffer] Separate buffer profile for Arista-7260CX3-D108C8
Signed-off-by: Kevin Wang <shengkaiwang@microsoft.com>
2022-07-07 14:09:01 -07:00
Kevin Wang
b625085b46 [Buffer] Separate buffer profile for Arista-7260CX3-C64
50G data is not accurate, needs further update.

Signed-off-by: Kevin Wang <shengkaiwang@microsoft.com>
2022-07-07 14:09:01 -07:00
Kevin Wang
5b42ba021b [Buffer] Separate buffer profile for Arista-7060CX-32S-C32
Signed-off-by: Kevin Wang <shengkaiwang@microsoft.com>
2022-07-07 14:09:01 -07:00
Kevin Wang
83780549c7 [Buffer] Separate buffer profile for Arista-7060CX-32S-D48C8
Signed-off-by: Kevin Wang <shengkaiwang@microsoft.com>
2022-07-07 14:09:01 -07:00
Kevin Wang
e8f04cd2e6 [Buffer] Separate buffer profile for Arista-7060CX-32S-Q32
Signed-off-by: Kevin Wang <shengkaiwang@microsoft.com>
2022-07-07 14:09:01 -07:00
Kevin Wang
2c81f02b13 [Buffer] Separate buffer profile for Celestica-DX010-D48C8
Signed-off-by: Kevin Wang <shengkaiwang@microsoft.com>
2022-07-07 14:09:01 -07:00
Kevin Wang
4dbdc8e0a0 [Buffer] Separate buffer profile for Force10-S6100
Signed-off-by: Kevin Wang <shengkaiwang@microsoft.com>
2022-07-07 14:09:01 -07:00
Ying Xie
2ed29da38d [buffer] create infrastructure to enable buffer/QoS profiles
Signed-off-by: Ying Xie <ying.xie@microsoft.com>
2022-07-07 14:09:01 -07:00
Vivek
729b4b4c40
Update sonic-swss submodule (#11310)

```
639d10e [PFC_WD] [202012] Avoid applying ZeroBuffer Profiles to ingress PG when a PFC storm is detected (https://github.com/Azure/sonic-buildimage/pull/2310)
475ae19 [202012] [cherry-pick] Apply `DSCP_TO_TC_MAP` from `PORT_QOS_MAP|global` to switch level (https://github.com/Azure/sonic-buildimage/issues/2328)
aa6f855 [ci] Change artifact reference pipeline to common lib pipeline. (https://github.com/Azure/sonic-buildimage/pull/2294)
752f8c5 [ci] Use correct branch when downloading artifact. (https://github.com/Azure/sonic-buildimage/pull/2292)
b3fcc5d [ci] Improve azp trigger settings to automaticlly support new release branch. (https://github.com/Azure/sonic-buildimage/pull/2289)
```
2022-07-06 17:43:35 -07:00
Zhijian Li
24b90d7556
[cherry-pick][202012] Fix issue where HLX module failed to do postinit (#11351)
* [HLX] Fix issue where HLX module failed to do postinit (#7274)

Signed-off-by: Jing Kan jika@microsoft.com
2022-07-06 17:27:29 +08:00
mssonicbld
9a86fa9264
[ci/build]: Upgrade SONiC package versions (#11074)
Upgrade SONiC Versions
2022-07-06 11:00:50 +08:00
Alexander Allen
851bd9bff8 [Mellanox] Add arch folder to SDK binary location (#11278)
- Why I did it
This is for the eventual support of multiple architectures for the mellanox platform.

- How I did it
Change the location of the binaries in Switch-SDK-drivers so that the path specifies the target architecture in addition to the target distribution that the debians are built for.

This is the most straightforward way to separate binaries built against different architectures and selectively target them for installation in the mellanox SONiC image.

- How to verify it
Build SONiC for mellanox and verify it compiles successfully.
2022-07-05 20:58:01 +00:00
yozhao101
c1ab4c6831 [tunnel_packet_handler] Add a whitespace in the warning syslog message. (#11232)
This PR adds a whitespace in the warning syslog message of the tunnel_packet_handler process.
Signed-off-by: Yong Zhao <yozhao@microsoft.com>
2022-07-05 20:57:57 +00:00
vmittal-msft
6ada55439d Updated buffer profile settings for TD3 based HWSKUs (Arista-7050CX3-32S-C32, Arista-7050CX3-32S-D48C8) (#11202)
2022-07-05 20:57:53 +00:00
xumia
32cda89f93 [Build]: Support to use symbol links for lazy installation targets to reduce the image size (#10923)
Why I did it
Support using symbolic links in the platform folder to reduce the image size.
The current solution copies each lazy installation target (xxx.deb file) to each of the folders in the platform folder. The size keeps growing as more packages are added to the platform folder. For cisco-8000, for example, the size reaches up to 2G, while most of the content is duplicate packages in the platform folder.

How I did it
Create a new folder, platform/common, and copy all the deb packages into it. Every other folder that uses the packages holds symbolic links to the common folder.
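
A minimal sketch of this layout, assuming each platform subfolder sits directly under the platform folder. The function `link_lazy_targets` and its `targets` mapping are hypothetical; the actual build performs this step in its own scripts.

```python
import os
import shutil


def link_lazy_targets(platform_dir, targets):
    """Store each .deb once in <platform_dir>/common and symlink it elsewhere.

    targets maps a platform subfolder name to a list of .deb source paths.
    """
    common = os.path.join(platform_dir, "common")
    os.makedirs(common, exist_ok=True)
    for folder, debs in targets.items():
        dest_dir = os.path.join(platform_dir, folder)
        os.makedirs(dest_dir, exist_ok=True)
        for deb in debs:
            name = os.path.basename(deb)
            stored = os.path.join(common, name)
            if not os.path.exists(stored):
                shutil.copy2(deb, stored)  # the single real copy
            link = os.path.join(dest_dir, name)
            if not os.path.lexists(link):
                # A relative link keeps working if the platform folder is moved.
                os.symlink(os.path.join("..", "common", name), link)
```

With this layout, a package used by several platform subfolders occupies disk space only once.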

Why platform.tar?
We have implemented a patch for it, see #10775, but the problem is that onie uses a really old unzip version, which cannot support symbolic links.
The current solution is similar to PR 10775, but it packs the platform folder into a tar package, which onie can support. During the installation, the package.tar will be extracted to the original folder and removed.
2022-07-05 20:57:49 +00:00
yozhao101
4487a962e3 [memory_checker] Do not check memory usage of containers which are not created (#11129)
Signed-off-by: Yong Zhao yozhao@microsoft.com

Why I did it
This PR aims to fix an issue (#10088) by enhancing the script memory_checker.

Specifically, if a container is not created successfully while the device is booting or rebooting, memory_checker does not need to check its memory usage.

How I did it
In the script memory_checker, a function is added to get names of running containers. If the specified container name is not in current running container list, then this script will exit without checking its memory usage.
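
The lookup described above can be sketched as follows. This is an illustration rather than the script's actual code: it assumes the names are obtained through the `docker ps` CLI, and `get_running_container_names`, `should_check_memory`, and the injectable `run` parameter are hypothetical.

```python
import subprocess


def get_running_container_names(run=subprocess.run):
    """Return the names of the currently running containers."""
    # `docker ps --format '{{.Names}}'` prints one running container name per line.
    result = run(["docker", "ps", "--format", "{{.Names}}"],
                 capture_output=True, text=True, check=False)
    return result.stdout.split()


def should_check_memory(container_name, running_names):
    """Only check memory usage for containers that are actually running."""
    return container_name in running_names
```

When `should_check_memory` returns false, the caller would log a message and exit without checking memory usage.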

How to verify it
I tested on a lab device by following the steps:

1. Stop the telemetry container with the command sudo systemctl stop telemetry.service
2. Remove the telemetry container with the command docker rm telemetry
3. Check whether the memory_checker script run by Monit generates the syslog message saying it will exit without checking the memory usage of telemetry.
2022-07-05 20:57:45 +00:00
Samuel Angebault
d15a484dfa
[202012][Arista] Fix cmdline generation during warm-reboot from 201811/201911 (#11161)
Issue fixed: when performing a warm-reboot or fast-reboot from 201811 or 201911 to 202012 the kernel command line contains duplicate information. This issue is related to a change that was made to make 202012 boot0 file more futureproof.
A cold reboot brings everything back to a clean slate, but that is not always desirable.

Changes done:
Added some logic to properly detect the end of the Aboot cmdline when cmdline-aboot-end delimiter is not set (clean case)
Added some logic to regenerate the Aboot cmdline when cmdline-aboot-end is set but duplicate parameters exist before it (dirty case). Reorganized some code to handle duplicate parameters in the allowlist.
2022-07-04 11:01:03 -07:00
Stephen Sun
fe6be5da92
[202012] Configure different map between uplink and downlink on t1 switch in dual ToR scenario (#11299)
- Why I did it
Configure different DSCP_TO_TC_MAP between uplink and downlink on T1 switch in dual ToR scenario
On T1 uplink, both DSCP 2/6 will be mapped to TC 1 for the purpose of avoiding such traffic occupying lossless buffers.
On T1 downlink, they will be mapped to TC 2/6 respectively. (unchanged)

- How I did it
Vendors who want to configure different DSCP_TO_TC_MAP between uplinks and downlinks on T1 should:
1. Define generate_dscp_to_tc_map macro in SKU's qos.json.j2 file
2. Define map AZURE for downlink and AZURE_UPLINK for uplink
3. Define jinja2 variable different_dscp_to_tc_map as True
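
The difference between the two maps can be illustrated with a short Python sketch. It is not the jinja2 template: `build_dscp_to_tc_maps` is a hypothetical helper, and the downlink map passed in is whatever the SKU already defines.

```python
def build_dscp_to_tc_maps(downlink_map):
    """Derive the uplink map from the downlink map for a T1 in dual ToR."""
    uplink_map = dict(downlink_map)
    # On T1 uplinks, DSCP 2 and 6 both go to TC 1 so that such traffic
    # does not occupy lossless buffers.
    for dscp in ("2", "6"):
        uplink_map[dscp] = "1"
    # Map names follow the commit message: AZURE for downlink, AZURE_UPLINK for uplink.
    return {"AZURE": dict(downlink_map), "AZURE_UPLINK": uplink_map}
```

Given a downlink map in which DSCP 2 and 6 map to TC 2 and 6, only those two entries differ in `AZURE_UPLINK`.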

Signed-off-by: Stephen Sun <stephens@nvidia.com>
2022-07-03 15:58:06 +03:00
xumia
d766e7022e
[Build] Add the missing debian security mirrors in slave images (#11304)
Why I did it
The build below was broken because one of the required debian mirrors was missing.
https://dev.azure.com/mssonic/build/_build/results?buildId=116719&view=logs&j=88ce9a53-729c-5fa9-7b6e-3d98f2488e3f&t=88f376cf-c35d-5783-0a48-9ad83a873284

 libpci-dev : Depends: libudev-dev (>= 196) but it is not going to be installed
 libsystemd-dev : Depends: libsystemd0 (= 232-25+deb9u14) but 232-25+deb9u13 is to be installed
How I did it
Add the missing mirrors for buster and stretch.
2022-07-01 21:17:03 +08:00