sonic-buildimage/.azure-pipelines/run-test-scheduler-template.yml
Yutong Zhang cb354a5af2
[TestbedV2][master] Remove timeout in each step. (#12915)
Previously, we set timeout in each step such as Lock testbed, Prepare testbed, Run test and KVM dump. When some issue suck like retry happens in one step, it will cause timeout error, but actually, it only needs more time to success. In this pr, we remove the timeout limit in each step and control the timeout outside in each job. When the job runs more than four hours, it will be cancelled.

Why I did it
Previously, we set timeout in each step such as Lock testbed, Prepare testbed, Run test and KVM dump. When some issue suck like retry happens in one step, it will cause timeout error, but actually, it only needs more time to success. In this pr, we remove the timeout limit in each step and control the timeout outside in each job. When the job runs more than four hours, it will be cancelled.

How I did it
Remove the timeout parameter in each step, and control the timeout outside in each job.

How to verify it
Set the timeout of one job to 4 hours, and when timeout happens, azure pipeline will cancel this job.
2022-12-03 22:30:03 -08:00

136 lines
5.3 KiB
YAML

parameters:
- name: TOPOLOGY
type: string
- name: POLL_INTERVAL
type: number
default: 10
- name: POLL_TIMEOUT
type: number
default: 36000
- name: MIN_WORKER
type: string
default: 1
- name: MAX_WORKER
type: string
default: 1
- name: TEST_SET
type: string
default: ""
- name: DEPLOY_MG_EXTRA_PARAMS
type: string
default: ""
- name: COMMON_EXTRA_PARAMS
type: string
default: ""
- name: VM_TYPE
type: string
default: "ceos"
- name: SPECIFIED_PARAMS
type: string
default: "{}"
- name: MGMT_BRANCH
type: string
default: master
- name: NUM_ASIC
type: number
default: 1
steps:
- script: |
set -ex
wget -O ./.azure-pipelines/test_plan.py https://raw.githubusercontent.com/sonic-net/sonic-mgmt/master/.azure-pipelines/test_plan.py
wget -O ./.azure-pipelines/pr_test_scripts.yaml https://raw.githubusercontent.com/sonic-net/sonic-mgmt/master/.azure-pipelines/pr_test_scripts.yaml
displayName: Download TestbedV2 scripts
- script: |
set -ex
pip install PyYAML
rm -f new_test_plan_id.txt
python ./.azure-pipelines/test_plan.py create -t ${{ parameters.TOPOLOGY }} -o new_test_plan_id.txt \
--min-worker ${{ parameters.MIN_WORKER }} --max-worker ${{ parameters.MAX_WORKER }} \
--test-set ${{ parameters.TEST_SET }} --kvm-build-id $(KVM_BUILD_ID) \
--deploy-mg-extra-params "${{ parameters.DEPLOY_MG_EXTRA_PARAMS }}" --common-extra-params "${{ parameters.COMMON_EXTRA_PARAMS }}" \
--mgmt-branch ${{ parameters.MGMT_BRANCH }} --vm-type ${{ parameters.VM_TYPE }} --specified-params "${{ parameters.SPECIFIED_PARAMS }}" \
--num-asic ${{ parameters.NUM_ASIC }}
TEST_PLAN_ID=`cat new_test_plan_id.txt`
echo "Created test plan $TEST_PLAN_ID"
echo "Check https://www.testbed-tools.org/scheduler/testplan/$TEST_PLAN_ID for test plan status"
echo "##vso[task.setvariable variable=TEST_PLAN_ID]$TEST_PLAN_ID"
env:
TESTBED_TOOLS_URL: $(TESTBED_TOOLS_URL)
TENANT_ID: $(TESTBED_TOOLS_MSAL_TENANT_ID)
CLIENT_ID: $(TESTBED_TOOLS_MSAL_CLIENT_ID)
CLIENT_SECRET: $(TESTBED_TOOLS_MSAL_CLIENT_SECRET)
displayName: Trigger test
- script: |
set -ex
echo "Lock testbed"
echo "TestbedV2 is just online and might not be stable enough, for any issue, please send email to sonictestbedtools@microsoft.com"
echo "Runtime detailed progress at https://www.testbed-tools.org/scheduler/testplan/$TEST_PLAN_ID"
# When "LOCK_TESTBED" finish, it changes into "PREPARE_TESTBED"
python ./.azure-pipelines/test_plan.py poll -i "$(TEST_PLAN_ID)" --expected-states PREPARE_TESTBED EXECUTING KVMDUMP FINISHED CANCELLED FAILED
env:
TESTBED_TOOLS_URL: $(TESTBED_TOOLS_URL)
displayName: Lock testbed
- script: |
set -ex
echo "Prepare testbed"
echo "Preparing the testbed(add-topo, deploy-mg) may take 15-30 minutes. Before the testbed is ready, the progress of the test plan keeps displayed as 0, please be patient(We will improve the indication in a short time)"
echo "If the progress keeps as 0 for more than 1 hour, please cancel and retry this pipeline"
echo "TestbedV2 is just online and might not be stable enough, for any issue, please send email to sonictestbedtools@microsoft.com"
echo "Runtime detailed progress at https://www.testbed-tools.org/scheduler/testplan/$TEST_PLAN_ID"
# When "PREPARE_TESTBED" finish, it changes into "EXECUTING"
python ./.azure-pipelines/test_plan.py poll -i "$(TEST_PLAN_ID)" --expected-states EXECUTING KVMDUMP FINISHED CANCELLED FAILED
env:
TESTBED_TOOLS_URL: $(TESTBED_TOOLS_URL)
displayName: Prepare testbed
- script: |
set -ex
echo "Run test"
echo "TestbedV2 is just online and might not be stable enough, for any issue, please send email to sonictestbedtools@microsoft.com"
echo "Runtime detailed progress at https://www.testbed-tools.org/scheduler/testplan/$TEST_PLAN_ID"
# When "EXECUTING" finish, it changes into "KVMDUMP", "FAILED", "CANCELLED" or "FINISHED"
python ./.azure-pipelines/test_plan.py poll -i "$(TEST_PLAN_ID)" --expected-states KVMDUMP FINISHED CANCELLED FAILED
env:
TESTBED_TOOLS_URL: $(TESTBED_TOOLS_URL)
displayName: Run test
- script: |
set -ex
echo "KVM dump"
echo "TestbedV2 is just online and might not be stable enough, for any issue, please send email to sonictestbedtools@microsoft.com"
echo "Runtime detailed progress at https://www.testbed-tools.org/scheduler/testplan/$TEST_PLAN_ID"
# When "KVMDUMP" finish, it changes into "FAILED", "CANCELLED" or "FINISHED"
python ./.azure-pipelines/test_plan.py poll -i "$(TEST_PLAN_ID)" --expected-states FINISHED CANCELLED FAILED
condition: succeededOrFailed()
env:
TESTBED_TOOLS_URL: $(TESTBED_TOOLS_URL)
displayName: KVM dump
- script: |
set -ex
echo "Try to cancel test plan $TEST_PLAN_ID, cancelling finished test plan has no effect."
python ./.azure-pipelines/test_plan.py cancel -i "$(TEST_PLAN_ID)"
condition: always()
env:
TESTBED_TOOLS_URL: $(TESTBED_TOOLS_URL)
TENANT_ID: $(TESTBED_TOOLS_MSAL_TENANT_ID)
CLIENT_ID: $(TESTBED_TOOLS_MSAL_CLIENT_ID)
CLIENT_SECRET: $(TESTBED_TOOLS_MSAL_CLIENT_SECRET)
displayName: Finalize running test plan