Previously, we set timeout in each step such as Lock testbed, Prepare testbed, Run test and KVM dump. When some issue suck like retry happens in one step, it will cause timeout error, but actually, it only needs more time to success. In this pr, we remove the timeout limit in each step and control the timeout outside in each job. When the job runs more than four hours, it will be cancelled.
Why I did it
Previously, we set timeout in each step such as Lock testbed, Prepare testbed, Run test and KVM dump. When some issue suck like retry happens in one step, it will cause timeout error, but actually, it only needs more time to success. In this pr, we remove the timeout limit in each step and control the timeout outside in each job. When the job runs more than four hours, it will be cancelled.
How I did it
Remove the timeout parameter in each step, and control the timeout outside in each job.
How to verify it
Set the timeout of one job to 4 hours, and when timeout happens, azure pipeline will cancel this job.
Previously, we hard code the min and max numbers of instance in a plan. In this pr, we support passing the instance numbers of a testplan.
Why I did it
Previously, we hard code the min and max numbers of instance in a plan. In this pr, we support passing the instance numbers of a testplan.
How I did it
Use a variable to set the instance number.
* [SAI PTF] SAI PTF docker support sai-ptf v2
Publish the sai-ptf docker.
Take part of the change from previous PR #11610 (already reverted as some cache issue)
Cause in #11610, added two new target in it, one is sai-ptf another one is syncd-rpc with sai-ptf v2, to make the upgrade with more clear target, use this one take the sai-ptf one.
Test one:
NOSTRETCH=y NOJESSIE=y make configure PLATFORM=vs
NOSTRETCH=y NOJESSIE=y NOBULLSEYE=y SAITHRIFT_V2=y make target/docker-ptf-sai.gz
Signed-off-by: richardyu-ms <richard.yu@microsoft.com>
* remove useless change
Signed-off-by: richardyu-ms <richard.yu@microsoft.com>
* remove useless parameters
Signed-off-by: richardyu-ms <richard.yu@microsoft.com>
* remove useless change
Signed-off-by: richardyu-ms <richard.yu@microsoft.com>
* Update azure-pipelines-build.yml
remove a useless option
Signed-off-by: richardyu-ms <richard.yu@microsoft.com>
Migrate multi-asic test jobs to TestbedV2.
Why I did it
Migrate multi-asic test jobs to TestbedV2.
How I did it
Add one parameter num_asic to create testplan.
Modify azure-pipelines.yml to run multi-asic on tbv2.
Signed-off-by: Yutong Zhang <yutongzhang@microsoft.com>
Migrate t0-sonic test jobs to TestbedV2.
Why I did it
Migrate t0-sonic test jobs to TestbedV2.
How I did it
Add two parameters to create testplan.
Modify azure-pipelines.yml to run t0-sonic on tbv2.
Signed-off-by: Yutong Zhang <yutongzhang@microsoft.com>
Add dualtor test using TestbedV2 in buildimage repo.
Why I did it
Add dualtor test using TestbedV2 in buildimage repo.
How I did it
Add dualtor test using TestbedV2 in buildimage repo.
Signed-off-by: Yutong Zhang <yutongzhang@microsoft.com>
Why I did it
Fix the issue that test plan can't be canceled by KVM dump stage
How I did it
Fix the issue that test plan can't be canceled by KVM dump stage
co-authorized by: jianquanye@microsoft.com
Migrate the t0 and t1-lag test jobs in buildimage repo to TestbedV2.
Why I did it
Migrate the t0 and t1-lag test jobs in buildimage repo to TestbedV2.
How I did it
Migrate the t0 and t1-lag test jobs in buildimage repo to TestbedV2.
* Define whether a test is required by code
Why I did it
Define whether a test job is required before merging by code.
Let the failure of multi-asic and t0-sonic don't block pr merge.
The 'required' configuration can be modified by the owner in the future.
How I did it
Required:
t1-lag, t0
Not required:
multi-asic, t0-sonic
How to verify it
AZP itself verifies it.
Signed-off-by: jianquanye@microsoft.com
Why I did it
Fix the official build not triggered correctly issue, caused by the azp template path not existing.
How I did it
Change the azp template path.
With the Broadcom syncd containers getting upgraded to Bullseye, the DNX
RPC container is no longer automatically built. Explicitly add a make
command to build it.
Signed-off-by: Saikrishna Arcot <sarcot@microsoft.com>
Signed-off-by: Saikrishna Arcot <sarcot@microsoft.com>
Why I did it
Require multi-asic VS tests to be run during PR checks and merges in master branch.
How I did it
Add job to run multi-asic VS tests in sonic-buildimage pipeline. Currently pipeline will run basic bgp fact test to ensure the testbed comes up, load minigraph works and bgp sessions are up.
Use new multi-asic VS agent pool sonictest-ma in official-multi-asic-vs pipeline
Make multi-asic VS test optional for now. There are two known issues:
Announce route failure during refresh-dut in setup testbed stage.
bgp sessions not getting established.
How to verify it
Tested using test azure pipelines.
The asan-enabled docker image will be used in other CIs that run DVS tests.
Why I did it
To add a possibility to run DVS tests with ASAN for other CIs (e.g. swss).
How I did it
Added the 'asan_image' flag to the vs job group.
How to verify it
Run the CI and check the docker-sonic-vs-asan.gz artifact.
Why I did it
The t0-sonic pool has been fixed, so add it back to azp checker.
How I did it
Remove continueOnError in run-test-template.yml.
Signed-off-by: Ze Gan <ganze718@gmail.com>
example:
├── folderA
│ ├── fileA (skip vstest)
│ ├── fileB
│ └── fileC
If we want to skip vstest when changing /folderA/fileA, and not skip vstest when changing fileB or fileC.
vstest-include:
^folderA/fileA
vstest-exclude:
^folderA
Why I did it
The docker storage driver vfs is not a good option for build, it uses the “deep copy” when building a new layer, leads to lower performance and more space used on disk than other storage drivers.
A better docker storage driver is the default one overlay2, it is a modern union filesystem.
Why I did it
Fix the target directory not empty issue when publishing artifacts.
Some of the artifacts are published to $(Build.ArtifactStagingDirectory)/target/ before source code checked out.
Why I did it
It is not necessary to trigger the publish pipeline when build is failed.
How I did it
Remove the condition in the azp task, change to use template condition.
Why I did it
Support to trigger a pipeline to download and publish artifacts to storage and container registry.
Support to specify the patterns which docker images to upload.
How I did it
Pass the pipeline information and the artifact information by pipeline parameters to the pipeline which will be triggered a new build. It is to decouple the artifacts generation and the publish logic, how and where the artifacts/docker images will be published, depends on the triggered pipeline.
How to verify it
Why I did it
Exclude the innovium build in upgrading version build, currently, the builds are always failed, exclude the build temporarily.
Increase the broadcom build timeout.
Fix the generating version file failure issue caused by artifacts folder change.
When changing to use the same template for PR build, official build and packages version upgrade, the artifacts folder adding a "target" folder, the version upgrade task should be changed accordingly.
Why I did it
docker hub will limit the pull rate.
Use ACR instead to pull debian related docker image.
How I did it
Set DEFAULT_CONTAINER_REGISTRY in pipeline.
Why I did it
Fix ENABLE_DOCKER_BASE_PULL not working issue in armhf/arm64
For build in native armhf/arm64, the expected container registry repo name is sonic-slave-<stretch|buster|bullseye>
How I did it
Publish the slave image to sonic-slave-<stretch|buster|bullseye>.
Why I did it
[Build]: Enable marvell-armhf PR check
Improve the azp dependencies, make the Test stage only depended on BuildVS stage. The Test stage will be triggered once the BuildVS stage finished, reduce the waiting time.
Why I did it
Introduce 2 sub jobs for kvmtest t0 job in sonic-mgmt repo in PR Azure/sonic-mgmt#4861
But in sonic-buildimage repo, because section parameter is null, it always run the part 2 test scripts in kvmtest t0 job.
It missed the part 1 test scripts in kvmtest.sh.
How I did it
Split kvmtest t0 job into two sub jobs such as sonic-mgmt repo and run them in parallel to save time.
How to verify it
Submit PR will trigger the pipeline to run.
Signed-off-by: Zhaohui Sun <zhaohuisun@microsoft.com>
#### Why I did it
1. Fix Build exception [example](https://dev.azure.com/mssonic/build/_build/results?buildId=73911&view=logs&jobId=88ce9a53-729c-5fa9-7b6e-3d98f2488e3f&j=cef3d8a9-152e-5193-620b-567dc18af272&t=ac3bce9f-b126-5a26-3fee-28ce0ec1679d)
```
2022-02-19T01:54:23.4200556Z ImportError: cannot import name 'soft_unicode' from 'markupsafe' (/usr/local/lib/python3.8/dist-packages/markupsafe/__init__.py)
```
This is because Jinja2 uses MarkupSafe without specifying an upper limit to the version, MarkupSafe version that was released today removed 'soft_unicode'. So now Jinja2 is complaining.
Related issues:
https://github.com/pallets/jinja/issues/1591https://github.com/aws/aws-sam-cli/issues/3661
2. Reverts #9136
Fixing build failures in SONiC utils [example](https://dev.azure.com/mssonic/build/_build/results?buildId=73784&view=logs&jobId=83516c17-6666-5250-abde-63983ce72a49&j=83516c17-6666-5250-abde-63983ce72a49&t=6177235f-d4f1-5f72-835a-90ebb93a1784)
One of the errors:
```
TestPathAddressing.test_find_ref_paths__ref_is_the_whole_key__returns_ref_paths
self = <tests.generic_config_updater.gu_common_test.TestPathAddressing testMethod=test_find_ref_paths__ref_is_the_whole_key__returns_ref_paths>
def test_find_ref_paths__ref_is_the_whole_key__returns_ref_paths(self):
# Arrange
path = "/PORT/Ethernet0"
expected = [
"/ACL_TABLE/NO-NSW-PACL-V4/ports/0",
"/VLAN_MEMBER/Vlan1000|Ethernet0",
]
# Act
actual = self.path_addressing.find_ref_paths(path, Files.CROPPED_CONFIG_DB_AS_JSON)
# Assert
> self.assertEqual(expected, actual)
E AssertionError: Lists differ: ['/ACL_TABLE/NO-NSW-PACL-V4/ports/0', '/VLAN_MEMBER/Vlan1000|Ethernet0'] != ['/ACL_TABLE/NO-NSW-PACL-V4/ports/0']
E
E First list contains 1 additional elements.
E First extra element 1:
E '/VLAN_MEMBER/Vlan1000|Ethernet0'
E
E - ['/ACL_TABLE/NO-NSW-PACL-V4/ports/0', '/VLAN_MEMBER/Vlan1000|Ethernet0']
E + ['/ACL_TABLE/NO-NSW-PACL-V4/ports/0']
```
The VLAN_MEMBER backlink (can be called referrer link or ref link) is not found.
Issue introduced by https://github.com/Azure/sonic-buildimage/pull/9136
I don't know how this PR passed the build system, it should have failed.
Known YANG issue https://github.com/Azure/sonic-buildimage/issues/9312
#### How I did it
The import to `sonic-vlan` is breaking the build
```
import sonic-vlan {
prefix vlan;
}
```
I am not sure if that's the only issue, so I think reverting the whole PR should be the safer option.
#### How to verify it
Ran sonic-utils tests locally.
Fix the nodesource.list cannot read issue, it is cased by the full path not used.
```
2021-12-03T06:59:26.0019306Z Removing intermediate container 77cfe980cd36
2021-12-03T06:59:26.0020872Z ---> 528fd40e60f6
2021-12-03T06:59:26.0021457Z Step 81/81 : RUN post_run_buildinfo
2021-12-03T06:59:26.0841136Z ---> Running in d804bd7e1b06
2021-12-03T06:59:29.1626594Z [91mDEPRECATION: Python 2.7 reached the end of its life on January 1st, 2020. Please upgrade your Python as Python 2.7 is no longer maintained. pip 21.0 will drop support for Python 2.7 in January 2021. More details about Python 2 support in pip can be found at https://pip.pypa.io/en/latest/development/release-process/#python-2-support pip 21.0 will remove support for this functionality.
2021-12-03T06:59:34.2960105Z [0m[91m/usr/bin/sed: can't read nodesource.list: No such file or directory
2021-12-03T06:59:34.5094880Z [0mThe command '/bin/sh -c post_run_buildinfo' returned a non-zero code: 2
```
Co-authored-by: Ubuntu <xumia@xumia-vm1.jqzc3g5pdlluxln0vevsg3s20h.xx.internal.cloudapp.net>
With a Bullseye upgrade, a change that requires everything to get
rebuilt (including the slave containers) takes about 12 hours (the vs
job that builds the virtual switch image as well as the PTF container
sometimes times out towards the end). Part of this is because the kernel
is now built after all of the sonic containers (kernel is built in a
Bullseye slave, the docker containers are built in a Buster slave).
Another part is because during the ptf container build, for some reason,
all of the docker containers are rebuilt.
Therefore, to make sure PRs don't time out after Bullseye gets merged
in, bump up the timeout from 12 hours to 15 hours. This should be enough
for the builds to complete.
Signed-off-by: Saikrishna Arcot <sarcot@microsoft.com>
Why I did it
To be able to run VS test on official multi asic VS image.
How I did it
Add a new script to build multi-asic VS image by passing NUM_ASIC build parameter.
Rung multi-asic t1-lag test cases with the built image.
UserID is different with the image built by Jenkins if we build docker-sonic-mgmt in sonicbld pool.
So we build this image in sonictest pool. There is a sonictmp user with UserID 1001.
This will make docker image same as the image built in Jenkins
Why I did it
To enable syncd-rpc for Barefoot build
How I did it
Set the flag
How to verify it
ENABLE_SYNCD_RPC=y make configure PLATFORM=barefoot
ENABLE_SYNCD_RPC=y make all
Add 3 pipeline files:
- pipeline for build docker image sonic-slave-[buster|jessie|stretch] for amd64/armhf/arm64, and push to ACR(sonicdev-microsoft.com)
- pipeline for build docker image sonic-mgmt, and push to ACR
- pipeline for cleaning dpkg cache which are created more than 30 days.
Co-authored-by: lguohan <lguohan@gmail.com>
When submitting a new official build for broadcom, vs, it prompts a error message, which says the job is not defined.
It was caused by the default option "[]", which is not empty, it is used as the jobGroups parameter.
Why I did it
Improve the version of the Pull Request build by changing the local branch name.
How I did it
Change the default branch name merge to [target_branch_name]-[pullrequestid].
How to verify it
For official build, the version is not changed.
For pull request build, the version as below:
Why I did it
Fix the boolean value case sensitive issue in Azure Pipelines
When passing parameters to a template, the "true" or "false" will have case sensitive issue, it should be a type casting issue.
To fix it, we change the true/false to yes/no, to escape the trap.
Support to override the job groups in the template, so PR build has chance to use different build parameters, only build simple targets. For example, for broadcom, we only build target/sonic-broadcom.bin, the other images, such as swi, debug bin, etc, will not be built.