Commit Graph

6848 Commits

Author SHA1 Message Date
Dror Prital
54b146f56c
[Mellanox] Update SDK/FW to version 4.5.2320/2010.2320 (#11990)
- Why I did it
Update SDK/FW version - 4.5.2320/2010_2320 in order to have the following fixes:
• Spectrum-3 | PCI calibration changes from a static to a dynamic mechanism.
• [VxLAN] TTL was set to 0 for non IP traffic (such as ARP)

- How I did it
Update pointer for the SDK/FW

- How to verify it
Run regression tests
2022-09-14 20:43:38 +03:00
Junchao-Mellanox
9750cb48c6
[Mellanox] Redirect ethtool stderr to subprocess for better error log (#12038)
- Why I did it
ethtool print error logs when EEPROM of a SFP is not available. It prints error like this:

INFO pmon#/supervisord: xcvrd Cannot get module EEPROM information: Input/output error
INFO pmon#/supervisord: xcvrd Cannot get Module EEPROM data: Invalid argument
However, this log does not contain the relevant SFP index which is hard for developer/qa to find the exactly SFP.

- How I did it
Redirect ethtool stderr to subprocess and log it better

- How to verify it
Manual test
2022-09-14 20:41:43 +03:00
kannankvs
c674b3c634
[doc]: Updated PR Template for a comment to add label/tag for the feature raised. (#12058)
Signed-off-by: kannankvs <kannan_kvs@dell.com>
2022-09-13 21:49:05 -07:00
Samuel Angebault
055fbf5aaa
[Arista] Update platform submodules (#12020) 2022-09-13 19:39:49 -07:00
Mai Bui
7d1b99a886
Replace unsafe functions in iccpd (#11694)
Why I did it
Replace unsafe functions in iccpd
How I did it
Replace memset() by zero initialization
Replace strtok() by strtok_r()
Signed-off-by: maipbui <maibui@microsoft.com>
2022-09-13 09:52:17 -04:00
jcaiMR
b34d94be1f
yang model table DEVICE_NEIGHBOR_METADATA creation (#11894)
* yang mode support for neighbor metadata

* add description in leaf node

* modify description
2022-09-13 10:07:17 +08:00
xumia
a8076e303b
Upgrade the sonic-fips packages to 0.3 (#12040)
Why I did it
Upgrade the sonic-fips packages to release 0.3
Fix the package timestamp not correct issue
2022-09-12 20:31:29 +08:00
Hasan Naqvi
c53972b348
Update submodule to FRR 8.2.2 (#11502)
*The sonic-frr was upgraded to FRR 8.2.2 as part of PR #10691. However, sonic-frr/frr submodule was still referring to previous 7.5 version. Update the sonic-frr/frr submodule to 8.2.2 commit id. Fixes issue #11484.
2022-09-09 17:05:48 -07:00
FuzailBrcm
0a8dd3f4f8
Adding support for get/set low power mode for QSFPs in PDDF common APIs (#11786)
* Adding support for get/set low pwer mode for QSFPs in PDDF common APIs

* Adding support for get/set low pwer mode for QSFPs in PDDF common APIs - Review comments
2022-09-09 14:33:12 -07:00
Zain Budhwani
966fe0d210
Update gnmi submodule (#11988)
* Update gnmi submodule

* Update gnmi pointer again
2022-09-09 14:23:45 -07:00
anamehra
714b1807b6
Fix radv.conf traceback when VLAN_INTERFACE is not defined (#12034)
*Fix the if block scope to prevent traceback due to undefined vlan_list when  VLAN_INTERFACE is not defined.
2022-09-09 12:54:05 -07:00
Ying Xie
a226239439
[zebra] ignore route from default table (#12018)
Signed-off-by: Ying Xie <ying.xie@microsoft.com>

Signed-off-by: Ying Xie <ying.xie@microsoft.com>
2022-09-09 08:41:41 -07:00
ShiyanWangMS
acc17db0c8
Revert PR#11831 (#12035) "Upgrade docker-sonic-mgmt base image from Ubuntu18.04 to 20.04" 2022-09-09 22:18:13 +08:00
Oleksandr Ivantsiv
549bb3d483
[services] Update "WantedBy=" section for tacacs-config.timer. (#11893)
The timer execution may fail if triggered during a config reload
(when the sonic.target is stopped). This might happen in a rare
situation if config reload is executed after reboot in a small
time slot (for 0 to 30 seconds) before the tacacs-config timer
is triggered. To ensure that timer execution will be resumed after
a config reload the WantedBy section of the systemd service is updated
to describe relation to sonic.target.

Signed-off-by: Oleksandr Ivantsiv <oivantsiv@nvidia.com>

Signed-off-by: Oleksandr Ivantsiv <oivantsiv@nvidia.com>
2022-09-08 15:16:11 -07:00
xumia
19155df148
Fix dbus-run-session command not found issue when install dbus-python (#12009) 2022-09-08 09:33:01 -07:00
bingwang-ms
dc9eaa53fb
Map TC6 to Queue 1 for regular traffic (#11904)
Why I did it
This PR is to update TC_TO_QUEUE_MAP|AZURE for SKU Arista-7050CX3-32S-D48C8 and Arista-7260CX3 T0.

The change is only to align the TC_TO_QUEUE_MAP for regular traffic and bounced traffic. It has no impact on business because we have no traffic being mapped to TC2 or TC6.

How I did it
Update TC_TO_QUEUE_MAP|AZURE , and test cases as well.

How to verify it
Verified by running test case test_j2files.py

/sonic/src/sonic-config-engine$ python3 setup.py test -s tests/test_j2files.py
running test
......
----------------------------------------------------------------------
Ran 29 tests in 25.390s

OK
2022-09-08 09:18:26 -07:00
Ze Gan
016f671857
[docker-macsec]: Add dependencies of MACsec (#11770)
Why I did it
If the SWSS services was restarted, the MACsec service should also be restarted. Otherwise the data in wpa_supplicant and orchagent will not be consistent.

How I did it
Add dependency in docker-macsec.mk.

How to verify it
Manually check by 'sudo service swss restart'.

The MACsec container should be started after swss, the syslog will look like


Sep  8 14:36:29.562953 sonic INFO swss.sh[9661]: Starting existing swss container with HWSKU Force10-S6000
Sep  8 14:36:30.024399 sonic DEBUG container: container_start: BEGIN
...
Sep  8 14:36:33.391706 sonic INFO systemd[1]: Starting macsec container...
Sep  8 14:36:33.392925 sonic INFO systemd[1]: Starting Management Framework container...


Signed-off-by: Ze Gan <ganze718@gmail.com>
2022-09-08 23:45:06 +08:00
Liu Shilong
98d6357ae7
[actions] Remove approve step in label action. (#12015)
Why I did it
Approve step needs special permission settings.
We already added permission setting to enable bypass merging PR.
So, approve step is not necessary.
2022-09-08 17:23:29 +08:00
UmaMaven
38cc35f6da
support for static-route yang model (#11932)
*[Yang] support for static-route yang model #11932
2022-09-06 21:55:59 -07:00
Muhammad Danish
3b9bbf7d28
[doc]: Update README.md (#11960)
* Remove faulty pipeline URLs
* Miscellaneous minor fixes
2022-09-07 12:25:44 +08:00
Ze Gan
5efd6f9748
[macsec]: Add MACsec clear CLI support (#11731)
Why I did it
To support clear MACsec counters by sonic-clear macsec

How I did it
Add macsec sub-command in sonic-clear to cache the current macsec stats, and in the show macsec command to check the cache and return the diff with cache file.

How to verify it

admin@vlab-02:~$ show macsec  Ethernet0
MACsec port(Ethernet0)
---------------------  -----------
cipher_suite           GCM-AES-128
enable                 true
enable_encrypt         true
enable_protect         true
enable_replay_protect  false
replay_window          0
send_sci               true
---------------------  -----------
        MACsec Egress SC (52540067daa70001)
        -----------  -
        encoding_an  0
        -----------  -
                MACsec Egress SA (0)
                -------------------------------------  --------------------------------
                auth_key                               9DDD4C69220A1FA9B6763F229B75CB6F
                next_pn                                1
                sak                                    BA86574D054FCF48B9CD7CF54F21304A
                salt                                   000000000000000000000000
                ssci                                   0
                SAI_MACSEC_SA_ATTR_CURRENT_XPN         52
                SAI_MACSEC_SA_STAT_OCTETS_ENCRYPTED    0
                SAI_MACSEC_SA_STAT_OCTETS_PROTECTED    0
                SAI_MACSEC_SA_STAT_OUT_PKTS_ENCRYPTED  0
                SAI_MACSEC_SA_STAT_OUT_PKTS_PROTECTED  0
                -------------------------------------  --------------------------------
        MACsec Ingress SC (525400d4fd3f0001)
                MACsec Ingress SA (0)
                ---------------------------------------  --------------------------------
                active                                   true
                auth_key                                 9DDD4C69220A1FA9B6763F229B75CB6F
                lowest_acceptable_pn                     1
                sak                                      BA86574D054FCF48B9CD7CF54F21304A
                salt                                     000000000000000000000000
                ssci                                     0
                SAI_MACSEC_SA_ATTR_CURRENT_XPN           56
                SAI_MACSEC_SA_STAT_IN_PKTS_DELAYED       0
                SAI_MACSEC_SA_STAT_IN_PKTS_INVALID       0
                SAI_MACSEC_SA_STAT_IN_PKTS_LATE          0
                SAI_MACSEC_SA_STAT_IN_PKTS_NOT_USING_SA  0
                SAI_MACSEC_SA_STAT_IN_PKTS_NOT_VALID     0
                SAI_MACSEC_SA_STAT_IN_PKTS_OK            0
                SAI_MACSEC_SA_STAT_IN_PKTS_UNCHECKED     0
                SAI_MACSEC_SA_STAT_IN_PKTS_UNUSED_SA     0
                SAI_MACSEC_SA_STAT_OCTETS_ENCRYPTED      0
                SAI_MACSEC_SA_STAT_OCTETS_PROTECTED      0
                ---------------------------------------  --------------------------------

admin@vlab-02:~$ sonic-clear macsec
Clear MACsec counters

admin@vlab-02:~$ show macsec  Ethernet0
MACsec port(Ethernet0)
---------------------  -----------
cipher_suite           GCM-AES-128
enable                 true
enable_encrypt         true
enable_protect         true
enable_replay_protect  false
replay_window          0
send_sci               true
---------------------  -----------
        MACsec Egress SC (52540067daa70001)
        -----------  -
        encoding_an  0
        -----------  -
                MACsec Egress SA (0)
                -------------------------------------  --------------------------------
                auth_key                               9DDD4C69220A1FA9B6763F229B75CB6F
                next_pn                                1
                sak                                    BA86574D054FCF48B9CD7CF54F21304A
                salt                                   000000000000000000000000
                ssci                                   0
                SAI_MACSEC_SA_ATTR_CURRENT_XPN         52
                SAI_MACSEC_SA_STAT_OCTETS_ENCRYPTED    0
                SAI_MACSEC_SA_STAT_OCTETS_PROTECTED    0
                SAI_MACSEC_SA_STAT_OUT_PKTS_ENCRYPTED  0
                SAI_MACSEC_SA_STAT_OUT_PKTS_PROTECTED  0
                -------------------------------------  --------------------------------
        MACsec Ingress SC (525400d4fd3f0001)
                MACsec Ingress SA (0)
                ---------------------------------------  --------------------------------
                active                                   true
                auth_key                                 9DDD4C69220A1FA9B6763F229B75CB6F
                lowest_acceptable_pn                     1
                sak                                      BA86574D054FCF48B9CD7CF54F21304A
                salt                                     000000000000000000000000
                ssci                                     0
                SAI_MACSEC_SA_ATTR_CURRENT_XPN           0 <---this counters was cleared.
                SAI_MACSEC_SA_STAT_IN_PKTS_DELAYED       0
                SAI_MACSEC_SA_STAT_IN_PKTS_INVALID       0
                SAI_MACSEC_SA_STAT_IN_PKTS_LATE          0
                SAI_MACSEC_SA_STAT_IN_PKTS_NOT_USING_SA  0
                SAI_MACSEC_SA_STAT_IN_PKTS_NOT_VALID     0
                SAI_MACSEC_SA_STAT_IN_PKTS_OK            0
                SAI_MACSEC_SA_STAT_IN_PKTS_UNCHECKED     0
                SAI_MACSEC_SA_STAT_IN_PKTS_UNUSED_SA     0
                SAI_MACSEC_SA_STAT_OCTETS_ENCRYPTED      0
                SAI_MACSEC_SA_STAT_OCTETS_PROTECTED      0
                ---------------------------------------  --------------------------------


Signed-off-by: Ze Gan <ganze718@gmail.com>
Co-authored-by: Judy Joseph <jujoseph@microsoft.com>
2022-09-07 08:16:23 +08:00
Renuka Manavalan
31e750ee0b
Fix PR build failure (#11973)
Some PR builds fails to find this file. Remove it temporarily until we root cause it
2022-09-06 15:13:05 -07:00
Stepan Blyshchak
a8b2a538a5
[docker-wait-any] immediately start to wait (#11595)
It could happen that a container has already crashed but docker-wait-any
will wait forever till it starts. It should, however, immediately exit
to make the serivce restart.

#### Why I did it

It is observed in some circumstances that the auto-restart mechanism does not work. Specifically for ```swss.service```, ```orchagent``` had crashed before ```docker-wait-any``` started in ```swss.sh```. This led ```docker-wait-any``` wait forever for ```swss``` to be in ```"Running"``` state and it results in:

```
CONTAINER ID   IMAGE                                COMMAND                  CREATED        STATUS                    PORTS     NAMES
1abef1ecebff   bcbca2b74df6                         "/usr/local/bin/supe…"   22 hours ago   Up 22 hours                         what-just-happened
3c924d405cd5   docker-lldp:latest                   "/usr/bin/docker-lld…"   22 hours ago   Up 22 hours                         lldp
eb2b12a98c13   docker-router-advertiser:latest      "/usr/bin/docker-ini…"   22 hours ago   Up 22 hours                         radv
d6aac4a46974   docker-sonic-mgmt-framework:latest   "/usr/local/bin/supe…"   22 hours ago   Up 22 hours                         mgmt-framework
d880fd07aab9   docker-platform-monitor:latest       "/usr/bin/docker_ini…"   22 hours ago   Up 22 hours                         pmon
75f9e22d4fdd   docker-snmp:latest                   "/usr/local/bin/supe…"   22 hours ago   Up 22 hours                         snmp
76d570a4bd1c   docker-sonic-telemetry:latest        "/usr/local/bin/supe…"   22 hours ago   Up 22 hours                         telemetry
ee49f50344b3   docker-syncd-mlnx:latest             "/usr/local/bin/supe…"   22 hours ago   Up 22 hours                         syncd
1f0b0bab3687   docker-teamd:latest                  "/usr/local/bin/supe…"   22 hours ago   Up 22 hours                         teamd
917aeeaf9722   docker-orchagent:latest              "/usr/bin/docker-ini…"   22 hours ago   Exited (0) 22 hours ago             swss
81a4d3e820e8   docker-fpm-frr:latest                "/usr/bin/docker_ini…"   22 hours ago   Up 22 hours                         bgp
f6eee8be282c   docker-database:latest               "/usr/local/bin/dock…"   22 hours ago   Up 22 hours                         database
```

The check for ```"Running"``` state is not needed because for cold boot case we do ```start_peer_and_dependent_services``` and for warm boot case the loop will retry to wait for container if this container is doing warm boot:
d01a91a569/files/image_config/misc/docker-wait-any (L56)

#### How I did it

Removed the check for ```"Running"```.

#### How to verify it

Kill swss before ```docker-wait-any``` is reached and verify auto restart will restart swss serivce.
2022-09-06 09:26:54 -07:00
Dror Prital
1b5d07f665
[submodule] Advance sonic-platform-daemons pointer (#11882)
- Why I did it
Update sonic-platform-daemons submodule pointer to include the following:

[ycabled] enable telemetry for 'active-active'; fix gRPC portid ordering (#284)
[ycabled] remove some spurious logs (#282)
Correct the peer forwarding state table (#281)
add psu input voltage and current (#276)
[ycabled] add capability to enable/disable telemetry (#279)

Signed-off-by: dprital <drorp@nvidia.com>
2022-09-04 11:03:36 +03:00
Dror Prital
4e18510fc9
[submodule]] Advance sonic-utilities pointer (#11897)
Update sonic-utilities submodule pointer to include the following:

Replace cmp in acl_loader with operator.eq (#2328)
Subinterface vrf bind issue fix (#2211)
[VRF]Adding CLI checks to ensure Vrf is valid in interface bind and static route commands (#2333)
[doc]: Add MACsec CLI doc (#2334)
[sonic-package-manager] Drop 'expires_in' (#2002)
Handle non-front-panel ports in is_rj45_port (#2327)
[service_mgmt]: Fix fetch MULTI_INST_DEPENDENT bug in service_mgmt.sh.j2 (#2319)
correct an error by changing "show bgp summary" to "show bfd summary" (#2324)
Update VRF unbind command (#2331)
Fix issue: port_type is referenced before initialized (#2323)
Fix issue: exception in is_rj45_port in multi ASIC env (#2313)
Delete .DS_Store (#2244)
Fix bug with checking VRF's routes in route_check.py (#2301)
[decode-syseeprom] Fix setting use_db based on support_eeprom_db (#2270)
Fix vrf UT failed issue (#2309)
add lacp_rate to portchannel (#2036)
2022-09-04 10:54:16 +03:00
Kebo Liu
78eeeb7670
[SN2201] remove extra empty lines in the pg_profile_lookup.ini (#11923)
- Why I did it
Remove extra empty lines in the SN2201 pg_profile_lookup.ini to make it aligned with other platforms.
This extra empty line could confuse some test cases which need to parse this file.

Signed-off-by: Kebo Liu <kebol@nvidia.com>
2022-09-04 10:53:07 +03:00
Zain Budhwani
6a54bc439a
Streaming structured events implementation (#11848)
With this PR in, you flap BGP and use events_tool to see the published events.
With telemetry PR #111 in and corresponding submodule update done in buildimage, one could run gnmi_cli to capture BGP flap events.
2022-09-03 07:33:25 -07:00
Renuka Manavalan
71d63a7be7
New commits in swss-common: (#11954)
* 651f52b (HEAD, origin/master, origin/HEAD, master) Change syslog level of Event Publish (#677)
* aca253a Add routing rule table for DASH (#668)
2022-09-02 21:52:24 -07:00
Lawrence Lee
e96ec5a74c
[kernel]: Submodule update (#11947)
Include following commit:

- 443253f [patch]: Add accpt_untracked_na kernel param (#292)

Signed-off-by: Lawrence Lee <lawlee@microsoft.com>
2022-09-02 16:06:10 -07:00
Ye Jianquan
750e1b3017
Define whether a test is required by code (#11921)
* Define whether a test is required by code

Why I did it
Define whether a test job is required before merging by code.
Let the failure of multi-asic and t0-sonic don't block pr merge.
The 'required' configuration can be modified by the owner in the future.

How I did it
Required:
t1-lag, t0
Not required:
multi-asic, t0-sonic

How to verify it
AZP itself verifies it.

Signed-off-by: jianquanye@microsoft.com
2022-09-03 06:53:54 +08:00
Ying Xie
a6843927d9
[mux] skip mux operations during warm shutdown (#11937)
* [mux] skip mux operations during warm shutdown

- Enhance write_standby.py script to skip actions during warm shutdown.
- Expand the support to BGP service.
- MuX support was added by a previous PR.
- don't skip action during warm recovery

Signed-off-by: Ying Xie <ying.xie@microsoft.com>
2022-09-02 13:50:42 -07:00
Lawrence Lee
a762b35cbc
[arp_update]: Set failed IPv6 neighbors to incomplete (#11919)
After pinging any failed IPv6 neighbor entries, set the remaining failed/incomplete entries to a permanent INCOMPLETE state. This manual setting to INCOMPLETE prevents these entries from automatically transitioning to FAILED state, and since they are now incomplete any subsequent NA messages for these neighbors is able to resolve the entry in the cache.

Signed-off-by: Lawrence Lee <lawlee@microsoft.com>
2022-09-02 13:40:40 -07:00
Liu Shilong
030de9f26d
[actions] Add github context env in label action. (#11926) 2022-09-02 14:07:48 +08:00
arunlk-dell
f82c1fd8ae
Z9432F kernel dependency of platform module (#11941)
Why I did it
Z9432F Update the kernel dependency of platform module

How I did it
Modified the kernel version to current latest 5.10.0-12-2
2022-09-01 16:55:41 -07:00
Jing Zhang
fdd9130ecf
[YANG] Add MUX_CABLE yang model (#11797)
Why I did it
Address issue #10970

sign-off: Jing Zhang zhangjing@microsoft.com

How I did it
Add sonic-mux-cable.yang and unit tests.

How to verify it
Compile Compile target/python-wheels/sonic_yang_mgmt-1.0-py3-none-any.whl and target/python-wheels/sonic_yang_models-1.0-py3-none-any.whl.
Pass sonic-config-engine unit test.
Which release branch to backport (provide reason below if selected)
 201811
 201911
 202006
 202012
 202106
 202111
 202205
Description for the changelog
Link to config_db schema for YANG module changes
f8fe41a023/src/sonic-yang-models/doc/Configuration.md (mux_cable)
2022-09-01 11:52:02 -07:00
Zhaohui Sun
88191b063b
Add python-is-python3 package for bullseye base docker (#11895)
Why I did it
In latest syncd container, it is installed bullseye, can't find command '/usr/bin/python'.
Some scripts such as test_copp still calls /usr/bin/python in syncd.
Submitted the change in #11807 for syncd docker, but it's better to add it in bullseye base docker.

How I did it
Install python-is-python3 package in bullseye base docker to resolve this issue, whatever run python or python3, it will run /usr/bin/python3, will not cause the error of can't find command '/usr/bin/python'

How to verify it
run python in syncd container.

Signed-off-by: Zhaohui Sun <zhaohuisun@microsoft.com>
2022-09-01 08:13:24 +08:00
Jing Kan
353b2742b2
[YANG] Create YANG Model for Console (#11806)
How I did it
Create YANG Model for SONiC console related features.

How to verify it
Add tests

Signed-off-by: Jing Kan jika@microsoft.com
2022-09-01 07:58:16 +08:00
Longxiang Lyu
6e878a36da
[mux] Exit to write standby state to active-active ports (#11821)
[mux] Exit to write standby state to `active-active` ports

Signed-off-by: Longxiang Lyu <lolv@microsoft.com>
2022-08-31 13:10:22 -07:00
Junchao-Mellanox
46292d71be
Add linux perf tool to sonic image (#11906) 2022-08-31 13:09:36 -07:00
Dev Ojha
c601f24139
[Arista7050cx3] TD3 SKU changes for pg headroom value after interop testing with cisco 8102 (#11901)
Why I did it
After PFC interop testing between 8102 and 7050cx3, data packet losses were observed on the Rx ports of the 7050cx3 (inflow from 8102) during testing. This was primarily due to the slower response times to react to PFC pause packets for the 8102, when receiving such frames from neighboring devices. To solve for the packet drops, the 7050cx3 pg headroom size has to be increased to 160kB.

How I did it
Modified the xoff threshold value to 160kB in the pg_profile file to allow for the buffer manager to read that value when building the image, and configuring the device

How to verify it
run "mmuconfig -l" once image is built


Signed-off-by: dojha <devojha@microsoft.com>
2022-08-31 11:08:32 -07:00
abdosi
4027147238
Align API get_device_runtime_metadata() for python version < 3.9 (#11900)
Why I did it:
API get_device_runtime_metadata() added by #11795 uses merge operator for dict but that is supported only for python version >=3.9. This API will be be used by scrips eg:hostcfgd which is still build for buster which does not have python 3.9 support.
2022-08-31 10:48:15 -07:00
saksarav-nokia
1e75abc274
[Nokia][Nokia-IXR7250E-36x100G & Nokia-IXR7250E-36x400G] Update BCM (#11577)
config to support ERSPAN egress mirror and also set flag to preserve ECN
2022-08-30 20:23:17 -07:00
Arun Saravanan Balachandran
092e0394b5
DellEMC: Z9332f - Graceful platform reboot (#10240)
Why I did it
To gracefully unmount filesystems and stop containers while performing a cold reboot.
Unmount ONIE-BOOT if mounted during fast/soft/warm reboot
How I did it
Override systemd-reboot service to perform a cold reboot.
Unmount ONIE-BOOT if mounted using fast/soft/warm-reboot plugins.
How to verify it
On reboot, verify that the container stop and filesystem unmount services have completed execution before the platform reboot.
2022-08-30 11:23:52 -07:00
Liu Shilong
cf69206d02
[ci] Fix bug involved by PR 11810 which affect official build pipeline (#11891)
Why I did it
Fix the official build not triggered correctly issue, caused by the azp template path not existing.

How I did it
Change the azp template path.
2022-08-30 14:23:09 +08:00
xumia
4733053c53
Fix vs check install login timeout issue (#11727)
Why I did it
Fix a build not stable issue: #11620
The vs vm has started successfully, but failed to wait for the message "sonic login:".

There were 55 builds failed caused by the issue in the last 30 days.

AzurePipelineBuildLogs
| where startTime > ago(30d)
| where type =~ "task"
| where result =~ "failed"
| where name =~ "Build sonic image"
| where content contains "Timeout exceeded"
| where content contains "re.compile('sonic login:')"
| project-away content
| extend branchName=case(reason=~"pullRequest", tostring(todynamic(parameters)['system.pullRequest.targetBranch']),
              replace("refs/heads/", "", sourceBranch))
| summarize FailedCount=dcount(buildId) by branchName

branchName	FailedCount
master	37
202012	9
202106	4
202111	2
202205	1
201911	1
It is caused by the login message mixed with the output message of the /etc/rc.local, one of the examples as below: (see the message rc.local[307]: sonic+ onie_disco_subnet=255.255.255.0 login: )
The check_install.py was waiting for the message "sonic login:", and Linux console was waiting for the username input (the login message has already printed in the console).
https://dev.azure.com/mssonic/build/_build/results?buildId=123294&view=logs&j=cef3d8a9-152e-5193-620b-567dc18af272&t=359769c4-8b5e-5976-a793-85da132e0a6f

2022-07-17T15:00:58.9198877Z [   25.493855] rc.local[307]: + onie_disco_opt53=05
2022-07-17T15:00:58.9199330Z [   25.595054] rc.local[307]: + onie_disco_router=10.0.2.2
2022-07-17T15:00:58.9199781Z [   25.699409] rc.local[307]: + onie_disco_serverid=10.0.2.2
2022-07-17T15:00:58.9200252Z [   25.789891] rc.local[307]: + onie_disco_siaddr=10.0.2.2
2022-07-17T15:00:58.9200622Z [   25.880920] 
2022-07-17T15:00:58.9200745Z 
2022-07-17T15:00:58.9201019Z Debian GNU/Linux 10 sonic ttyS0
2022-07-17T15:00:58.9201201Z 
2022-07-17T15:00:58.9201542Z rc.local[307]: sonic+ onie_disco_subnet=255.255.255.0 login: 
2022-07-17T15:00:58.9202309Z [   26.079767] rc.local[307]: + onie_exec_url=file://dev/vdb/onie-installer.bin

How I did it
Input a newline when finished to run the script /etc/rc.local.
If entering a newline, the message "sonic login:" will prompt again.
2022-08-30 09:19:58 +08:00
Saikrishna Arcot
de54eece46
Fix error handling when failing to install a deb package (#11846)
The current error handling code for when a deb package fails to be
installed currently has a chain of commands linked together by && and
ends with `exit 1`. The assumption is that the commands would succeed,
and the last `exit 1` would end it with a non-zero return code, thus
fully failing the target and causing the build to stop because of bash's
-e flag.

However, if one of the commands prior to `exit 1` returns a non-zero
return code, then bash won't actually treat it as a terminating error.
From bash's man page:

```
-e      Exit immediately if a pipeline (which may consist of a single simple
	command), a list, or a compound command (see SHELL GRAMMAR above),
        exits with a non-zero status.  The shell does not exit if the
        command that fails is part of the  command  list  immediately
        following a while or until keyword, part of the test following the
        if or elif reserved words, part of any command executed in a && or
        || list except the command following the final && or ||, any
        command in a pipeline but the last, or if the command's return
        value is being inverted with !.  If a compound command other than a
        subshell returns a non-zero status because a command failed while
        -e was being ignored, the shell does not exit.
```

The part `part of any command executed in a && or || list except the
command following the final && or ||` says that if the failing command
is not the `exit 1` that we have at the end, then bash doesn't treat it
as an error and exit immediately. Additionally, since this is a compound
command, but isn't in a subshell (subshell are marked by `(` and `)`,
whereas `{` and `}` just tells bash to run the commands in the current
environment), bash doesn't exist. The result of this is that in the
deb-install target, if a package installation fails, it may be
infinitely stuck in that while-loop.

There are two fixes for this: change to using a subshell, or use `;`
instead of `&&`. Using a subshell would, I think, require exporting any
shell variables used in the subshell, so I chose to change the `&&` to
`;`. In addition, at the start of the subshell, `set +e` is added in,
which removes the exit-on-error handling of bash. This makes sure that
all commands are run (the output of which may help for debugging) and
that it still exits with 1, which will then fully fail the target.

Signed-off-by: Saikrishna Arcot <sarcot@microsoft.com>

Signed-off-by: Saikrishna Arcot <sarcot@microsoft.com>
2022-08-29 11:35:07 -07:00
Saikrishna Arcot
186568a21d
Update sensor names for msn4600c for the 5.10 kernel (#11491)
* Update sensor names for msn4600c for the 5.10 kernel

Looks like a sensor was removed in the 5.10 kernel for the tps53679
sensor, so the names/indexing has changed.

Related to Azure/sonic-mgmt#4513.

Signed-off-by: Saikrishna Arcot <sarcot@microsoft.com>

* Update sensors file

Signed-off-by: Saikrishna Arcot <sarcot@microsoft.com>

Signed-off-by: Saikrishna Arcot <sarcot@microsoft.com>
2022-08-29 11:34:23 -07:00
xumia
a1d3d99457
[Build] Increase the size of the installer image (#11869)
#### Why I did it
Fix the build failure caused by the installer image size too small. The installer image is only used during the build, not impact the final images.
See https://dev.azure.com/mssonic/build/_build/results?buildId=139926&view=logs&j=cef3d8a9-152e-5193-620b-567dc18af272&t=359769c4-8b5e-5976-a793-85da132e0a6f

```
+ fallocate -l 2048M ./sonic-installer.img
+ mkfs.vfat ./sonic-installer.img
mkfs.fat 4.2 (2021-01-31)
++ mktemp -d
+ tmpdir=/tmp/tmp.TqdDSc00Cn
+ mount -o loop ./sonic-installer.img /tmp/tmp.TqdDSc00Cn
+ cp target/sonic-vs.bin /tmp/tmp.TqdDSc00Cn/onie-installer.bin
cp: error writing '/tmp/tmp.TqdDSc00Cn/onie-installer.bin': No space left on device
[  FAIL LOG END  ] [ target/sonic-vs.img.gz ]
```

#### How I did it
Increase the size from 2048M to 4096M.
Why not increase to 16G like qcow2 image?
The qcow2 supports the sparse disk, although a big disk size allocated, but it will not consume the real disk size. The falocate does not support the sparse disk. We do not want to allocate a very big disk, but no use at all. It will require more space to build.
2022-08-29 11:15:42 -07:00
abdosi
3bf1abb2dc
Address Review Comment to define SONIC_GLOBAL_DB_CLI in gbsyncd.sh (#11857)
As part of PR #11754

    Change was added to use variable SONIC_DB_NS_CLI for
    namespace but that will not work since ./files/scripts/syncd_common.sh
    uses SONIC_DB_CLI. So revert back to use SONIC_DB_CLI and define new
    variable for SONIC_GLOBAL_DB_CLI for global/host db cli access

   Also fixed DB_CLI not working for namespace.
2022-08-29 08:19:28 -07:00
Liu Shilong
35945c9015
[ci] Update reproducible build related pipeline. (#11810) 2022-08-29 11:26:15 +08:00