To include latest warmboot fixes:
[202012] [cherry-pick] Update db_migrator to support `PORT_QOS_MAP|gl (sonic-net/sonic-utilities#2215)
[202012] Migrate missed config on cross branch warm upgrade to 202012 (sonic-net/sonic-utilities#2277)
[202012] Add db_migrator_constants.py script to setup.py (sonic-net/sonic-utilities#2287)
Why I did it
This PR is to backport #11569 into 202012 branch.
This PR is to apply different DSCP_TO_TC_MAP to downlink and uplink ports on T1 in dualtor deployment.
For T1 downlink ports (To T0)
The DSCP_TO_TC_MAP is not changed. DSCP2 and DSCP6 are mapped to TC2 and TC6 respectively.
For T1 uplink ports (To T1)
A new DSCP_TO_TC_MAP|AZURE_UPLINK is defined and applied. DSCP2 and DSCP6 are mapped to TC1 to avoid mixing up lossy and lossless traffic from T2.
The extra lossy PG2 and PG6 added in PR #11157 are reverted as well, because no traffic from T2 is mapped to PG2 or PG6 now.
How I did it
Define a new map DSCP_TO_TC_MAP|AZURE_UPLINK for 7260 T1.
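The relationship between the two maps can be sketched as follows. This is a minimal illustration, not the template from the PR: only the DSCP 2 and DSCP 6 entries come from the description above, the names `DOWNLINK_DSCP_TO_TC` and `build_uplink_map` are hypothetical, and the real maps carry an entry for every DSCP value.

```python
# Hypothetical excerpt of the downlink map. Only the two entries named in
# the description are shown (DSCP 2 -> TC 2, DSCP 6 -> TC 6).
DOWNLINK_DSCP_TO_TC = {"2": "2", "6": "6"}


def build_uplink_map(downlink_map):
    """Return a copy of the map with DSCP 2 and DSCP 6 remapped to TC 1."""
    uplink_map = dict(downlink_map)
    for dscp in ("2", "6"):
        uplink_map[dscp] = "1"
    return uplink_map


# Stand-in for DSCP_TO_TC_MAP|AZURE_UPLINK; the downlink map is left untouched.
UPLINK_DSCP_TO_TC = build_uplink_map(DOWNLINK_DSCP_TO_TC)
```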
How to verify it
Verified by test case in test_j2files.py.
Why I did it
202012 PR test is failing due to some recent change in sonic-mgmt master branch.
How I did it
Use matching sonic-mgmt branch to run 202012 branch PR tests.
How to verify it
Run the tests on this PR.
Signed-off-by: Ying Xie <ying.xie@microsoft.com>
What I did:
Added bgp as a dependent of swss
Why I did it:
bgp container was not restarting on swss crash. When swss crashes, linkmgrd
doesn't initiate a switchover because it cannot access the default route from
orchagent. Bringing down bgp with swss will isolate the ToR, causing linkmgrd
to initiate a switchover to the peer ToR avoiding significant packet loss.
Signed-off-by: Nikola Dancejic <ndancejic@microsoft.com>
- Why I did it
The new SN4410 A1 system has a different sensor layout from the A0 system and needs a new sensor conf file to support it.
- How I did it
Since the SN4410 A1 system uses exactly the same sensor layout as the SN4700 A1 system, add a symbolic link to the SN4700 A1 sensor conf file to reuse it.
- How to verify it
Run sensor test against the SN4410 A1 system;
Run platform related regression test against the SN4410 A1 system
- Why I did it
Update sonic-linux-kernel submodule pointer to include the following:
Add the new kernel patch that comes with hw-mgmt 7.0010.2348 (#285)
- How I did it
Update sonic-linux-kernel submodule pointer
Signed-off-by: dprital <drorp@nvidia.com>
Update sonic-platform-daemons submodule pointer to include the following:
* [chassisd] Add script to initialize chassis info in STATE_DB ([#183](https://github.com/Azure/sonic-platform-daemons/pull/183))
Signed-off-by: dprital <drorp@nvidia.com>
Fix in Monit memory_checker plugin. Skip fetching running containers if docker engine is down (can happen in deinit).
This PR fixes issue #11472.
Signed-off-by: liora <liora@nvidia.com>
Why I did it
When Monit runs during the deinit flow, the memory_checker plugin fetches the running containers without checking whether the Docker service is still running. This PR adds that check.
How I did it
Use systemctl is-active to check if Docker engine is still running.
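A minimal sketch of such a check is shown below. It is not the code from the PR: the function names, the injectable `run` parameter, and the unit name `docker` are assumptions made for illustration. `systemctl is-active --quiet` exits with status 0 only when the unit is active.

```python
import subprocess


def is_docker_active(run=subprocess.run):
    """Return True when `systemctl is-active docker` reports an active unit.

    `run` is injectable so the check can be exercised without systemd.
    The unit name "docker" is an assumption.
    """
    try:
        result = run(["systemctl", "is-active", "--quiet", "docker"], check=False)
    except OSError:
        # systemctl itself is unavailable; treat the engine as not running.
        return False
    return result.returncode == 0


def get_running_containers(list_containers, run=subprocess.run):
    """Skip the container query entirely when the Docker engine is down.

    `list_containers` is a hypothetical callable that queries Docker.
    """
    if not is_docker_active(run):
        return []
    return list_containers()
```

Returning an empty list when the engine is down keeps the caller's logic unchanged; the actual plugin may handle this case differently.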
How to verify it
Use systemctl to stop the Docker engine and reload Monit; verify that no errors appear in the log and that the relevant message is printed.
Which release branch to backport (provide reason below if selected)
The fix is required in 202205 and 202012 since the PR that introduced the issue was cherry picked to those branches (#11129).
- Why I did it
Update SAI version - 1.22.0.0
Update SDK/FW version - 4.5.2318/2010_2318
SAI Changes:
1. Port FEC fix for multiple speeds
2. Next hop group optimized bulk API
3. Support BFD remote-disc exchange in negotiation stage
4. Reduce verbosity of shared database already exists print
SDK/FW Fixes:
1. Cr space timeout on Hold and Release GW - at warmboot
2. SPC-1 Port in stuck PHY_UP after peer side rebooted
3. memory leak in sx_api_router_ecmp_update_set
- How I did it
Update pointer for the new SAI and SDK/FW
- How to verify it
Run regression tests
- Why I did it
New changes in this new HW-MGMT package:
1. hw-mgmt: chassis events: Fix voltmon address conflict on connecting
2. hw-mgmt: topology: Add COMEX BRDWL respin support
a. Removed A2D sensor from all COMEX BRDWL boards
b. Add COMEX BRDWL boards with register defined (config3)
- How I did it
Advance the hw-mgmt repo pointer and update the hw-mgmt version number
- How to verify it
Run platform-related regression test cases on the new testbed.
Signed-off-by: Kebo Liu <kebol@nvidia.com>
Why I did it
This PR is to cherry-pick #11448 to 202012 branch after resolving conflicts.
There are conflicts in
files/build_templates/qos_config.j2
src/sonic-config-engine/tests/test_j2files.py
Signed-off-by: Neetha John <nejo@microsoft.com>
Why I did it
Missed this sku in the previous PR #11398
How I did it
Update the dynamic threshold to 0 and ECN settings as 2mb/10mb/5%
How to verify it
Updated unit tests to use the modified values for 7260 ecn settings.
- Why I did it
MSN3700/3700C/4600C have been respun. The new HW versions of these platforms have different sensors, so the correct sensor.conf must be applied to each of them.
- How I did it
Add new sensor.conf files for the new re-spined platforms.
Enhance the logic of "get_sensors_conf_path" for the related platforms in order to load the correct sensor.conf for each version of platforms.
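The selection idea can be sketched in Python as follows. This only illustrates choosing a file per hardware version; it is not the actual "get_sensors_conf_path" logic, and the revision key, file names, and function name are all hypothetical.

```python
from pathlib import Path

# Hypothetical mapping from a hardware revision label to a sensor conf file.
SENSOR_CONF_BY_REVISION = {
    "respin": "sensors.conf.respin",
}
# Hypothetical default file used for the original hardware version.
DEFAULT_SENSOR_CONF = "sensors.conf"


def select_sensors_conf(platform_dir, hw_revision):
    """Pick the revision-specific file when one is known, else the default."""
    name = SENSOR_CONF_BY_REVISION.get(hw_revision, DEFAULT_SENSOR_CONF)
    return Path(platform_dir) / name
```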
- How to verify it
run sensors test on different versions of platforms
Signed-off-by: Kebo Liu <kebol@nvidia.com>
Signed-off-by: Neetha John <nejo@microsoft.com>
Why I did it
Improve throughput and latency for 7260 deployments
How I did it
Update the dynamic threshold to 0 and ECN settings as 2mb/10mb/5%
How to verify it
Updated unit tests to use the modified values for 7260 ecn settings.
Signed-off-by: Neetha John <nejo@microsoft.com>
Why I did it
There is a need to select different mmu profiles based on deployment type
How I did it
There will be separate subfolders (RDMA-CENTRIC, TCP-CENTRIC, BALANCED) in each hwsku folder which contains deployment specific mmu and qos settings. SonicQosProfile attribute in the minigraph will be used to determine which settings to use. If that attribute is not present, the default settings that exist in the hwsku folder will be used
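The folder selection described above could look roughly like this. It is a sketch under stated assumptions, not the config engine's implementation: the function name `select_qos_dir` is hypothetical, and falling back to the hwsku folder when the profile is unknown or its subfolder is missing is an assumption beyond what the description states.

```python
from pathlib import Path

# Subfolder names taken from the description above.
KNOWN_PROFILES = ("RDMA-CENTRIC", "TCP-CENTRIC", "BALANCED")


def select_qos_dir(hwsku_dir, sonic_qos_profile=None):
    """Return the folder holding the mmu and qos settings to use.

    `sonic_qos_profile` stands for the SonicQosProfile value read from the
    minigraph, or None when the attribute is absent.
    """
    hwsku_dir = Path(hwsku_dir)
    if sonic_qos_profile in KNOWN_PROFILES:
        candidate = hwsku_dir / sonic_qos_profile
        if candidate.is_dir():
            return candidate
    # Attribute absent (or, by assumption, unusable): use the default settings.
    return hwsku_dir
```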
Why I did it
When any test job fails in the Test stage, a rerun does not work: the Test stage is skipped automatically, so there is no chance to rerun it. The test checks then stay in failed status and block the PR from merging indefinitely.
This is likely caused by the condition on the Test stage; the condition should specify the status of the BuildVS stage.
How I did it
Fix stage dependency logic.
Update sonic-linux-kernel submodule pointer to include the following:
* [202012][patch] mlxsw: i2c: Prevent transaction execution for special chip states ([#282](https://github.com/Azure/sonic-linux-kernel/pull/282))
Signed-off-by: dprital <drorp@nvidia.com>
Signed-off-by: Neetha John <nejo@microsoft.com>
This PR contains the following commits
5a54bd7 Added cisco config platform commands (Azure/sonic-utilities#2241)
62c1640 [config/load_mgmt_config] Support load IPv6 mgmt IP (Azure/sonic-utilities#2206)
c061a18 Fix header for the output table following 'show ipv6 interface' command (Azure/sonic-utilities#2219)
ecca18ff [202012] Update load minigraph to load backend acl (Azure/sonic-utilities#2235)
Why I did it
The dhcp6relay daemon may crash due to a null pointer access to the ifa_addr member of struct ifaddrs. An interface is not guaranteed to have a valid ifa_addr; some special virtual/pseudo interfaces have none.
How I did it
Check that the ifa_addr pointer is valid before accessing it.
Why I did it
Fix the missing debian package for reproducible build issue.
The gnupg2 should be added into the version file.
https://dev.azure.com/mssonic/build/_build/results?buildId=118139&view=logs&j=88ce9a53-729c-5fa9-7b6e-3d98f2488e3f&t=8d99be27-49d0-54d0-99b1-cfc0d47f0318
The following packages have unmet dependencies:
gnupg2 : Depends: gnupg (>= 2.2.27-2+deb11u2) but 2.2.27-2+deb11u1 is to be installed
E: Unable to correct problems, you have held broken packages.
The issue was caused by the removal of gnupg2, which went undetected.
sonic-buildimage/build_debian.sh
Line 250 in 4fb6cf0
sudo LANG=C chroot $FILESYSTEM_ROOT apt-get -y remove software-properties-common gnupg2 python3-gi
How I did it
Export the Debian packages whenever any Debian package is removed.
Why I did it
Storage backend has all VLAN members tagged. If untagged packets are received on those links, they are counted as RX_DROPS, which can lead to false alarms in monitoring tools. This ACL is used to hide these drops.
How I did it
Created an ACL template that is loaded during minigraph load for backend devices. The template allows tagged VLAN packets and drops untagged packets.
How to verify it
Unit tests
Signed-off-by: Neetha John <nejo@microsoft.com>
#### Why I did it
These methods were added to make some convenient platform and chassis information methods accessible through sonic-py-common. These methods were refactored from sonic-utilities and are used in the `show platform summary` and `show version` commands.
#### How I did it
There are two methods. One is `get_platform_info()`, which simply calls local methods to collect useful platform information into a dictionary; it came directly from sonic-utilities.
Signed-off-by: Neetha John <nejo@microsoft.com>
Backport #11221
Why I did it
For storage backend, certain rules will be applied to the DATAACL table to allow only vlan tagged packets and drop untagged packets.
How I did it
Create DATAACL table if the device is a storage backend device
To avoid ACL resource issues, remove EVERFLOW related tables if the device is a storage backend device
How to verify it
Added the following unit tests
verify that EVERFLOW acl tables are removed and DATAACL table is added for storage backend tor
verify that no DATAACL tables are created and EVERFLOW tables exist for storage backend leaf
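The behaviour these unit tests check can be sketched as below. This is an illustration only, not the minigraph parser code: the function name, the `is_storage_backend_tor` flag, matching EVERFLOW tables by name prefix, and the attributes given to the DATAACL table are all assumptions.

```python
import copy


def adjust_acl_tables(acl_tables, is_storage_backend_tor):
    """Return a copy of the ACL tables adjusted for a storage backend ToR.

    For a storage backend ToR, EVERFLOW-related tables are dropped and a
    DATAACL table is added. Other devices get the tables back unchanged.
    """
    tables = copy.deepcopy(acl_tables)
    if not is_storage_backend_tor:
        return tables
    # Assumption: EVERFLOW-related tables are identified by their name prefix.
    tables = {
        name: table
        for name, table in tables.items()
        if not name.startswith("EVERFLOW")
    }
    # Assumption: placeholder attributes; the real table definition differs.
    tables.setdefault("DATAACL", {"type": "L3", "stage": "ingress"})
    return tables
```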