Commit Graph

4883 Commits

Author SHA1 Message Date
mssonicbld
98dd76c485
[ci/build]: Upgrade SONiC package versions (#8561) 2021-08-24 14:53:16 +00:00
mssonicbld
8f604998b4
[ci/build]: Upgrade SONiC package versions (#8556) 2021-08-23 17:32:59 +00:00
Kostiantyn Yarovyi
387ae82c5d [Pcied] run by python 3
Why I did it
Pcied running by python 2.

How I did it
dropped python2 support and add python3 support for pcied in file docker-pmon.supervisord.conf.j2

How to verify it
docker exec pmon supervisorctl status
2021-08-23 03:34:48 +00:00
Alexander Allen
196fcffb6f [Mellanox] Upgrade Mellanox firmware tools to 4.17.0 (#8299)
- Why I did it
New release of MFT has the following changelog / RN
 Fixed an issue that resulted in getting MVPD read errors from the mlxfwmanager during fast reboot.
 Fixed mlxuptime sometimes generating a time less than previous due the wrong frequency calculation

- How I did it
Update makefile pointer to new version.

- How to verify it
Manually tested on all Mellanox platforms.
2021-08-23 03:05:20 +00:00
Volodymyr Samotiy
8365245122 [monit] Periodically monitor VNET route consistency (#8266)
*To run VNET route consistency check periodically.
*For any failure, the monit will raise alert based on return code.
Signed-off-by: Volodymyr Samotiy <volodymyrs@nvidia.com>
2021-08-23 03:05:16 +00:00
Guohan Lu
7fc11cf441 [ci]: fix artifact download syntax error for vstest (#8547)
Signed-off-by: Guohan Lu <lguohan@gmail.com>
2021-08-21 14:31:22 -07:00
Samuel Angebault
6a2d9e177c
[202012][Arista] Update platform library submodules (#8530)
Fix Chassis.get_name to return the same value than what's in platform.json
Fix Chassis.get_system_eeprom_info when running from within pmon.
Fix Watchdog.get_remaining_time (fixes [202012 platform_tests] TestWatchdogApi::test_remaining_time failure on vms20-t1-7050cx3-3.1 #8440 and [ 202012 platform_tests ] TestWatchdogApi::test_arm_disarm_states failure on vms20-t1-7050cx3-3.1 #8439)
Implement missing thermal infos and conditions (fixes [202012 platform_tests] test_platform_info.py::test_thermal_control_psu_absence error #8453)
Fix Chassis.set_status_led return value (fixes [2020 platform_tests] TestChassisApi::test_status_led failure on vms20-t0-7050cx3-1  #8464)
2021-08-20 10:30:12 -07:00
mssonicbld
733c851fc9
[ci/build]: Upgrade SONiC package versions (#8527) 2021-08-19 18:49:15 +00:00
Shi Su
7d5ef0d9fe
[202012] [sonic-swss] Update submodule (#8522)
Why I did it
Update the sonic-swss submodule for the 202012 branch. The following is the new commit in the submodule.

c1cb2ca [202012] Backport SAI failure handling to 202012 branch (#1880)

How I did it
Update the sonic-swss submodule pointer for the 202012 branch.
2021-08-18 17:52:01 -07:00
Stepan Blyshchak
996ff82c49 [libteam][warm-reboot] fix issue in teamd warm-reboot that teamd starts (#8227)
with state of tdport from previous warm-reboot.

In case LAG was down before reboot, lacp->wr is not cleared.
In lacp_event_watch_port_flush_data we incremented nr_of_tdports and add
tdport to lacp->wr.state. In case lacp->wr.state already had this tdport
we do not set new state for tdport but appened a new item in
lacp->wr.state. In case we preformed warm-reboot and PortChannel member
was down, after reboot PortChannel member became up next warm-reboot
will initialize teamd with PortChannel member in down state.

Fix this issue by calling stop_wr_mode() when LAG was down. This was probably intended but missed.

#### Why I did it

To fix an issue seen in warm-reboot-sad test cases.

#### How I did it

I fixed it in SONiC libteam patch that adds warm-reboot support. Details in commit description.

#### How to verify it

Run warm-reboot-sad test on t0-56 topology.
2021-08-18 07:34:48 +00:00
vdahiya12
78909de378
[sonic-platform-common] submodule update (#8491)
ef4b3ec [Y-Cable] add the definition inside setup.py to include sonic_y_cable.credo as a package (#211)
7d81488 [Y-Cable][Credo] Credo implementation of YCable class which inherits from YCableBase required for Y-Cable API's in sonic-platform-daemons (#203)
3efb093 [sonic_y_cable] add abstract class YCableBase required for Y-cable API support for multiple vendors (#186)

Signed-off-by: vaibhav-dahiya <vdahiya@microsoft.com>
2021-08-17 13:22:23 -07:00
trzhang-msft
feab18a41b
caclmgrd: monitor mux_cable_table in state_db to update dhcp acl (#8477)
caclmgrd: monitor mux_cable_table in state_db to update dhcp acl

- if the state changes to 'standby', add acl to block dhcp packets based on ingress interfaces
- if the state changes to 'active', delete acl
- if the state changes to 'unknown', also delete acl to avoid potential disconnect
- both addition and deletion follow checking the existence of the rules

The change has been verified on a virtual switch based testbed.

Port to 202012 branch from #8222
2021-08-16 15:55:51 -07:00
mssonicbld
aa5d05ed2c
[ci/build]: Upgrade SONiC package versions (#8385) 2021-08-16 10:48:24 +00:00
Stephen Sun
d599450052 Use predefined macro as vendor information (#8361)
#### Why I did it
Use a predefined variable to get vendor information when the swss docker container is created

#### How I did it
Use `{{ sonic_asic_platform }}` instead of `$SONIC_CFGGEN -y /etc/sonic/sonic_version.yml -v asic_type`

#### How to verify it
Manually test.
2021-08-16 07:51:01 +00:00
Sudharsan Dhamal Gopalarathnam
ba2284c4c0 Grouping delayed services under a target for config reload checks (#7846)
#### Why I did it
Create a target for delayed service timers. Few services in sonic have delayed to speed up the bring up of the system and essential services. However there is no way to track when they start. This will be a problem when executing config reload as config reload expects all services to be up. Hence grouped all the timers that trigger the delayed services under one target so that they could be tracked in 'config reload' command

#### How I did it
Created delay.target service and add created dependency on the delayed targets.
2021-08-16 07:50:56 +00:00
Wirut Getbamrung
347d7262a1
[202012][device/celestica]: Fix failed test cases of Haliburton platform API (#8297)
To fix failed test cases of Haliburton platform APIs that found on platform_tests script
- How I did it
- Add device/celestica/x86_64-cel_e1031-r0/platform.json
- Update functions to support python3.7
- Add more functions follow latest sonic_platform_base
- Fix the bug

Signed-off-by: Wirut Getbamrung [wgetbumr@celestica.com]
2021-08-15 00:00:08 -07:00
Christian Svensson
e241337be0 [frrcfgd]: Fix constant casing for prefix-list mode (#8454)
Fix #6943

This change makes the template match PREFIX_SET and the documentation.

Signed-off-by: Christian Svensson <blue@cmd.nu>
2021-08-14 17:25:07 -07:00
Ying Xie
92fb9c94bd [aboot] use ram partition for /var/log for devices with 3.7G disks (#8400)
Master/202012 image size grew quite a bit. 3.7G harddrive can no longer hold one image and safely upgrade to another image. Every bit of harddrive space is precious to save now.

Also sh syntax seemingly changed, [ condition ] && action was a legit syntax in 201911 branch but it is an error when condition not met with 202012 or later images. Change the syntax to if statement to avoid the issue.

Signed-off-by: Ying Xie ying.xie@microsoft.com
2021-08-14 17:22:01 -07:00
gechiang
5ed6b64c99 Reapply the fix to address setting MTU > 1500 causing portmgrd crash on BRCM platforms (#8472) 2021-08-14 17:15:21 -07:00
novikauanton
aae4e8dc7c
[build]: Fix bfn package version for reproducible build (#8468)
Barefoot pipeline is broken, because version has not been update by ci build yet.
2021-08-14 14:27:45 -07:00
lguohan
28d6ea9547 [openssh]: move build dep installation to sonic-slave-buster (#8381)
install build dep causes dpkg lock issue in parallel build

Signed-off-by: Guohan Lu <lguohan@gmail.com>
2021-08-13 22:03:07 -07:00
xumia
b1c2659044
Support to build armhf/arm64 platforms on arm based system (#7731) (#8458)
Why I did it
Support to build armhf/arm64 platforms on arm based system without qemu simulator.
When building the armhf/arm64 on arm based system, it is not necessary to use qemu simulator.

How I did it
Build armhf on armhf system, or build arm64 on arm64 system, by default, qemu simulator will not be used.
When building armhf on arm64, and you have enabled armhf docker, then it will build images without simulator automatically. It is based how the docker service is run.

Docker base image change:
For amd64, change from debian:to amd64/debian:
For arm64, change from multiarch/debian-debootstrap:arm64- to arm64v8/debian:
For armhf, change from multiarch/debian-debootstrap:armhf- to arm32v7/debian:
See https://github.com/docker-library/official-images#architectures-other-than-amd64
The mapping relations:
arm32v6 --- armel
arm32v7 --- armhf
arm64v8 --- arm64

Docker image armhf deprecated info: https://hub.docker.com/r/armhf/debian, using arm32v7 instead.
2021-08-13 19:33:08 +08:00
carl-nokia
03ef275314 [Nokia ixs7215] sfputil support + component tests (#8445)
Deliver sfputil support for sfputil show eeprom and sfputil reset along with some component test case fixes

Co-authored-by: Carl Keene <keene@nokia.com>
2021-08-13 03:27:55 -07:00
Shi Su
e71eeb7de9 [FRR]: Upgrade FRR to frr-7.5.1-s1 tag (#8443)
Update FRR 7.5.1 head.
2021-08-12 23:30:17 -07:00
Vladyslav Morokhovych
754378f1d8
[swss] Fix arp_update script (#8412)
Fix #7968

Issue is detected on SONiC.20201231.11

In test_static_route.py::test_static_route_ecmp static routes are configured, but neighbors are not resolved after config reload even after 10 minutes.
It looks like the arp_update script is starting to ping when Vlan1000 is not fully configured.
When issue is reproduced, stuck ping6 process is observed in swss container :

USER         PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
root         180  0.1  0.0   6296  1272 pts/0    S    17:03   0:03 ping6 -I Vlan1000 -n -q -i 0 -c 1 -W 0 ff02::1
And when arp_update script successfully resolves neighbors, we observe sleep 300 instead of ping process
2021-08-12 23:25:28 -07:00
carl-nokia
8fe11740f0 [Nokia] Add hwsku.json for the Nokia-7215 (#8372)
* add hwsku.json for the Nokia-7215
* added required default_brkout_mode to hwsku as its not optional
* remove tabs from the file so spacing consistent

Co-authored-by: Carl Keene <keene@nokia.com>
2021-08-12 10:44:17 -07:00
madhanmellanox
c1e31a52cb
[202012]:Adding Mellanox-SN3800-D100C12S2 SKU (#8444)
*To create a new SKU Mellanox-SN3800-D100C12S2
Co-authored-by: Madhan Babu <madhan@l-csi-0241l.mtl.labs.mlnx>
2021-08-12 10:14:22 -07:00
richardyu
36ab000557 PTF adds unittest-xml-reporting (#8417)
Co-authored-by: richardyu-ms <richard.yu@microsoft.com>
2021-08-12 07:09:58 +00:00
jerseyang
133caf6db2 enable the emc2305 fan controller and NCP power controller 30ms timeout mechanism (#8138)
Why I did it
fix the dx010 system eeprom unavailable issue

How I did it
enable the i2c slave 30ms timeout mechanism

How to verify it
i2cstress test in DX010 iSMT controller bus

Co-authored-by: nicwu-cel <nicwu@celestica.com>
2021-08-12 07:09:53 +00:00
carl-nokia
3b9b9462be [sonic-device-data]: add port_type to OPTIONAL_PORT_ATTRIBUTES (#8370)
enable automated test suites to selectively run relevant tests ( or not run tests ) based upon a new port_type identifier in hwsku.json

How I did it
Modified the valid optional fields in validity check for hwsku.json per recommendation from Joe in
https://github.com/Azure/sonic-mgmt/pull/2654/files

Co-authored-by: Carl Keene <keene@nokia.com>
2021-08-12 07:09:49 +00:00
dflynn-Nokia
1fc4cb1d48 [Nokia ixs7215] Watchdog timer support (#8377) 2021-08-12 07:09:44 +00:00
Rajkumar-Marvell
01e223513d [reboot-cause] Fixed determine-reboot-cause.service failure. (#8210)
Signed-off-by: Rajkumar Pennadam Ramamoorthy rpennadamram@marvell.com

Why I did it
Install sonic image from ONIE. Once system is up, execute "config reload" command.

Root cause is that "determine-reboot-cause.service" was in failed state.
root@sonic:/host/reboot-cause# systemctl list-units --failed
UNIT LOAD ACTIVE SUB DESCRIPTION
● determine-reboot-cause.service loaded failed failed Reboot cause determination service

How I did it
Fixed the issue by setting default reason to "REBOOT_CAUSE_UNKNOWN" instead of "None".

How to verify it
Check " determine-reboot-cause.service' loaded successfully post image installation from ONIE.
Verify "reboot-cause.txt" file is created and config reload succeeds.
2021-08-12 07:09:23 +00:00
Shilong Liu
e5428ecb2d Reproducible build add docker image debian* to white list. (#8330)
#### Why I did it
1. Add version control for debian* docker image to white list.
2. Always record docker image sha256 value, regardless of white list.
2021-08-12 07:09:01 +00:00
anamehra
491ab3d977
Platform/cisco-8000 module for sonic-buildimage (8172) (#8399)
Update Makefile, so it does the following:
For a given platform, verify if platform/checkout/.ini exists and
hence run the platform/checkout/template.j2. This allows platform
code to be checked out during the 'make configure' stage.
2021-08-11 23:04:36 -07:00
gechiang
8915e488b7
[202012] BRCM Disable ACL Drop counted towards interface RX_DRP counters (#8383)
* [202012] BRCM Disable ACL Drop counted towards interface RX_DRP counters
2021-08-11 09:10:17 -07:00
tjchadaga
76def5c3a0 Fix TH3 Warm-reboot failure due to Tunnel termination SAI failure (#8395) 2021-08-11 04:12:46 -07:00
vmittal-msft
b0ea180fd4 Updated PGHeadroom settings for 400G speed (DellEMC-Z9332f-M-O16C64 & DellEMC-Z9332f-O32) (#8420)
Updated pg_profile_lookup.ini for both HWSKU to match with BRCM recommendation
2021-08-11 04:10:03 -07:00
Junchao-Mellanox
8285cf2329
[Mellanox] [202012] Upgrade hw-mgmt to 7.0100.2344 (#8408)
To support new PSU fan on Mellanox platforms
2021-08-11 02:04:55 -07:00
Aravind Mani
4629c302c0
<202012> Dell S6100: Monitor serial getty service (#8407)
Why I did it
serial-getty service exited in Dell S6100 device randomly.

How I did it
Added serial-getty to monit services.

How to verify it
Stop serial-getty in ssh session and check whether the service restarts or not.
2021-08-10 11:23:22 -07:00
roman_savchuk
a06cd18dbe
[BFN]: update bfnsdk package (#8350)
Signed-off-by: Roman Savchuk <romanx.savchuk@intel.com>
2021-08-10 08:19:37 -07:00
shlomibitton
e62ae02fb2
[hostcfgd] [202012] Delay hostcfgd for faster boot time (#8117)
#### Why I did it
hostcfgd is starting at the same time as 'create_switch' method is called on orchagent process.
This introduce a degradation on the function execution time which eventually cause the fast-boot flow and a boot scenario in general to run slower (~6 seconds).
This change will delay the start time of this daemon.
90 seconds determined as the maximum allowed downtime for control plane to come back up on fast-boot flow.

#### How I did it
Add a timer for hostcfgd service in order to delay the startup of this service.

#### How to verify it
Install an image with this change and observe the daemon start 90 seconds after the system boot.
2021-08-10 05:52:08 -07:00
lguohan
8b4f6943a9
[submodule]: update sonic-swss (#8374)
* 41dfaad 2021-08-02 | Bridge mac setting, fix statedb time format (#1844) (HEAD, origin/202012) [Prince Sunny]

Signed-off-by: Guohan Lu <lguohan@gmail.com>
2021-08-08 20:42:46 -07:00
lguohan
09b4e04f7e [build]: add debug info for dpkg frontend lock (#8375)
print out the process that hold the dpkg frontend lock.

Signed-off-by: Guohan Lu <lguohan@gmail.com>
2021-08-08 20:41:44 -07:00
mssonicbld
1c30a8b0c1
[ci/build]: Upgrade SONiC package versions (#8376) 2021-08-08 19:05:13 +00:00
Sujin Kang
ae7fa32691
[pmon]: Enable Autorestart of the daemons in PMON for unexpected exit (#8358)
Enable Autorestart of the daemons in PMON for unexpected exit
Remove the daemon list from the critical_process which prevent the PMON
from restarting when the individual daemon crashes.
2021-08-07 22:43:38 -07:00
Guohan Lu
ceab083fc5 [build]: add sonic_release 202012
Signed-off-by: Guohan Lu <lguohan@gmail.com>
2021-08-07 18:04:28 -07:00
Guohan Lu
e38cc58bbc [build]: add branch and release name in sonic_version.yml
the branch refers the branch name that the commit is in,
for example master, 202012, 201911, ...
In case there is no branch, the name will be HEAD.

release is encoded in /etc/sonic/sonic_release file.
the file is only available for a release branch.
It is not available in master branch.

Signed-off-by: Guohan Lu <lguohan@gmail.com>
2021-08-07 18:04:28 -07:00
Longxiang Lyu
25f53289eb [swss][arp_update] Send ipv6 pings over vlan sub interfaces (#8363)
#### Why I did it
* `arp_update` fails to ping those neighbors over vlan sub interfaces.

#### How I did it
* modify `arp_update_vars.j2` to get vlan sub interfaces with ipv6 addresses assigned.
* modify `arp_update` to send ipv6 pings over those retrieved vlan sub interfaces.

Signed-off-by: Longxiang Lyu <lolv@microsoft.com>
2021-08-07 12:43:51 +00:00
Neetha John
66c8934d84 Revert "Revert "Update default cable len to 0m for TD2"" (#8354)
* Update default cable len to 0m for TD2 (#8298)
* Update sonic-cfggen tests with the correct cable len

Signed-off-by: Neetha John <nejo@microsoft.com>

As part of the buffer reclamation efforts for TD2, setting the default cable len to 0m which means unused ports will have a cable len of 0m.

Why I did it
To align with the changes in Azure/sonic-swss#1830

How to verify it
- With the default cable len set to 0m and the associated changes in swss, CABLE_LENGTH table had '0m' set for unused ports and accordingly more space was reserved for the shared pool
- Cfggen tests passed with the cable len update
2021-08-07 12:43:46 +00:00
Aravind Mani
7be487bcc8 DellEMC: Z9332f platform API changes (#8258)
Why I did it
platform test suite failed for few API's in DellEMC Z9332f platform.

How I did it
Modified the API's to return the expected values in the script.

How to verify it
Run platform test suite after making the changes.
2021-08-07 12:43:40 +00:00