Commit Graph

5535 Commits

Author SHA1 Message Date
Stepan Blyshchak
fa1e364f54
[services] kill container on stop in warm/fast mode (#10511)
To optimize stop on warm boot, added kill for containers

Use service "kill" in the shutdown path for fast and warm reboot. For all other reload methods, service "stop" is used.
This saves time in the shutdown path and reduces the overall time spent in warm and fast reload.

How - Use service_mgmt.sh to trigger common logic to initiate kill (fast/warm) or stop (cold) for database.sh, radv.sh, snmp.sh, telemetry.sh, mgmt-framework.sh
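
As a rough illustration of the kill-versus-stop choice described above, here is a minimal Python sketch. It is not the actual service_mgmt.sh (which is a shell script); the function names, the reboot-type labels, and the assumption that the container name equals the service name are all made up for this example.

```python
# Hypothetical sketch: choose the container shutdown action by reboot type.
# The real logic lives in service_mgmt.sh; every name below is illustrative.
from typing import List

FAST_PATH_REBOOTS = {"fast", "warm"}  # assumed labels for fast/warm reboot


def shutdown_action(reboot_type: str) -> str:
    """Return "kill" for fast/warm reboot and "stop" for anything else."""
    return "kill" if reboot_type in FAST_PATH_REBOOTS else "stop"


def shutdown_command(service: str, reboot_type: str) -> List[str]:
    """Build the docker command line for the given service.

    Assumes the container is named after the service.
    """
    return ["docker", shutdown_action(reboot_type), service]


if __name__ == "__main__":
    for reboot in ("warm", "fast", "cold"):
        print(reboot, shutdown_command("snmp", reboot))
```

Keeping the decision in a single helper mirrors the intent of the commit: each service script delegates to common logic instead of repeating the check.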

Signed-off-by: Stepan Blyschak <stepanb@nvidia.com>, Vaibhav H D <vaibhav.dixit@microsoft.com>
2022-04-18 14:27:48 -07:00
Vivek R
85447401c7
[202012] [submodule] Advance sonic-snmpagent pointer (#10584)
414692f LLDPLocalSystemDataUpdater Exception Log Handled (#249)

Signed-off-by: Vivek Reddy Karri <vkarri@nvidia.com>
2022-04-18 10:42:05 +03:00
Ying Xie
6af3de4372
[202012][copp cfg] enable dhcp trap for a couple more devices (#10582)
* [copp cfg] enable copp trap for a couple more devices

Signed-off-by: Ying Xie <ying.xie@microsoft.com>
2022-04-15 11:47:02 -07:00
Jing Zhang
9fd75ffd9d
[202012][sonic-linkmgrd] Submodule Update (#10345)
[202012][sonic-linkmgrd]Submodule update

8507629 Jing Zhang      Mon Apr 4 10:25:22 2022 -0700   Lower unsolicited MUX state change notification log level to WARNING #57
17d217d Longxiang Lyu   Mon Mar 21 12:15:19 2022 +0800  Enhance clang format (#46)
c72fa2a Jing Zhang      Fri Apr 1 12:23:29 2022 -0700   Disable the feature that decreases link probe interval for measuring switch overhead #49 (#54)
256b01b Jing Zhang      Thu Mar 31 16:20:00 2022 -0700  Update link prober metrics posting logics #50 #53
dfd48d0 Jing Zhang      Wed Mar 23 16:27:45 2022 -0700  Decrease link probing interval after switchover to better determine the overhead of a toggle #43 (#48)

sign-off: Jing Zhang zhangjing@microsoft.com
2022-04-14 11:42:22 -07:00
Richard.Yu
6ccc458d2b
[CG-Fix-CVE-2021-44906] Patching on thrift.0.13.0 for package minimist (#10554)
* [CG-Fix-CVE-2021-44906] Patching on thrift.0.13.0 for package minimist

Signed-off-by: richardyu-ms <richard.yu@microsoft.com>

* add more information in patch

Signed-off-by: richardyu-ms <richard.yu@microsoft.com>
2022-04-14 06:46:19 -07:00
Saikrishna Arcot
29b6f62902
[202012] Run tune2fs during initramfs instead of image install (#10558)
If it is run during image install, it's not guaranteed that the
installation environment will have tune2fs available. Therefore, run it
during initramfs instead.

Signed-off-by: Saikrishna Arcot <sarcot@microsoft.com>
2022-04-12 19:59:24 -07:00
kellyyeh
6e17ef311a [dhcp_relay] Remove dhcp6mon (#10467) 2022-04-12 18:39:19 +00:00
Sudharsan Dhamal Gopalarathnam
234d0ab241 [containerd]Fixing container commands when mode is local and state is disabled (#9986)
Why I did it
During warm-reboot and fast-reboot the below error logs appear
Feb 3 22:05:15.187408 r-lionfish-13 ERR container: docker cmd: kill for nat failed with 404 Client Error for http+docker://localhost/v1.41/containers/nat/json: Not Found ("No such container: nat")

When called in local mode, the container command doesn't check whether the feature is enabled before calling docker kill, which produces the above error.
b6ca76b482/scripts/fast-reboot (L699)

How I did it
Check the feature state in local mode and return an error exit code along with a valid debug message.
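
The check could look roughly like the sketch below. This is only an illustration under stated assumptions, not the code from the PR: the `owner` and `state` keys, the return codes, and the injected `docker_kill` callable are hypothetical.

```python
# Hypothetical sketch of the guard described above; not the actual script.
# The "owner"/"state" keys and the return codes are assumptions.
import sys
from typing import Callable, Dict


def container_kill(feature: str,
                   feature_cfg: Dict[str, str],
                   docker_kill: Callable[[str], None]) -> int:
    """Kill the container unless the feature is local and disabled.

    Returns 0 when the kill was issued and 1 when it was skipped.
    """
    owner = feature_cfg.get("owner", "local")
    state = feature_cfg.get("state", "disabled")
    if owner == "local" and state == "disabled":
        # Skip the docker call so no "No such container" error is logged.
        print(f"{feature} is disabled; skipping docker kill", file=sys.stderr)
        return 1
    docker_kill(feature)
    return 0
```

Passing the kill function in as a parameter keeps the sketch testable without a Docker daemon.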

How to verify it
Manually tested with warm-reboot and fast-reboot
Added UT to verify it.
2022-04-12 18:39:13 +00:00
Sudharsan Dhamal Gopalarathnam
d27df5d145
[202012] [submodule] Advance sonic-swss pointer (#10540)
Includes the below commits
f3b2873 [BFD]Retry create BFD with different source UDP port on failure

Signed-off-by: Sudharsan Dhamal Gopalarathnam <sudharsand@nvidia.com>
2022-04-12 10:23:09 +03:00
Qi Luo
3fa538e58c
Revert "[ci] Set default ACR in UpgrateVersion/PR/official pipeline. (#10341)" (#10535)
This reverts commit f4bbcd1cf1. The original commit was missing one file, ".azure-pipelines/azure-pipelines-repd-build-variables.yml", and broke the Azure pipeline.
2022-04-11 23:53:42 -07:00
xumia
fc727f0538 [Ci]: check if there is a sonic dirty version issue (#10445)
Why I did it
If there is a dirty version issue in a PR build, the build fails.
2022-04-11 23:10:06 +00:00
Rajkumar-Marvell
589234a48c
[Marvell] Marvell armhf SAI debian. (#10526)
Fixed IPv6 route issue resulting in orchagent crash.
Signed-off-by: Rajkumar Pennadam Ramamoorthy <rpennadamram@marvell.com>
2022-04-11 14:00:46 +08:00
Kevin Wang
a65916449b
Update cisco-8000 ref to release: 202012-v0.97 (#10522)
Signed-off-by: Kevin(Shengkai) Wang <shengkaiwang@microsoft.com>
2022-04-11 08:59:56 +08:00
mssonicbld
e0fa07307a
[ci/build]: Upgrade SONiC package versions (#10395)
2022-04-10 17:00:00 +08:00
Kebo Liu
1b42dbfdd2
[submodule] [202012] Advance sonic-platform-common pointer (#10502)
Update sonic-platform-common submodule to pick up new commits:

cd623fa [202012] Backport Enhance ssd_generic with more error handling to avoid python crash (#273)
e9a4a81 [y_cable][Broadcom] update the BRCM y_cable driver to release 2.0 (#263)
2022-04-08 12:56:59 +03:00
Shilong Liu
f4bbcd1cf1 [ci] Set default ACR in UpgrateVersion/PR/official pipeline. (#10341)
Why I did it
Docker Hub limits the pull rate.
Use ACR instead to pull Debian-related Docker images.

How I did it
Set DEFAULT_CONTAINER_REGISTRY in pipeline.
2022-04-08 11:19:10 +08:00
Stepan Blyshchak
721a53b9a0 [scapy] update scapy to 2.4.5 and patch it (#10457)
Why I did it
Running warm-reboot in a loop 500 times leads to this error on the 318th iteration:

Apr  2 15:56:27.346747 sonic INFO swss#/supervisord: restore_neighbors Traceback (most recent call last):
Apr  2 15:56:27.346747 sonic INFO swss#/supervisord: restore_neighbors   File "/usr/bin/restore_neighbors.py", line 24, in <module>
Apr  2 15:56:27.346747 sonic INFO swss#/supervisord: restore_neighbors     from scapy.all import conf, in6_getnsma, inet_pton, inet_ntop, in6_getnsmac, get_if_hwaddr, Ether, ARP, IPv6, ICMPv6ND_NS, ICMPv6NDOptSrcLLAddr
Apr  2 15:56:27.346795 sonic INFO swss#/supervisord: restore_neighbors   File "/usr/local/lib/python3.7/dist-packages/scapy/all.py", line 25, in <module>
Apr  2 15:56:27.346956 sonic INFO swss#/supervisord: restore_neighbors     from scapy.route import *
Apr  2 15:56:27.346995 sonic INFO swss#/supervisord: restore_neighbors   File "/usr/local/lib/python3.7/dist-packages/scapy/route.py", line 205, in <module>
Apr  2 15:56:27.347089 sonic INFO swss#/supervisord: restore_neighbors     conf.iface = get_working_if()
Apr  2 15:56:27.347129 sonic INFO swss#/supervisord: restore_neighbors   File "/usr/local/lib/python3.7/dist-packages/scapy/arch/linux.py", line 128, in get_working_if
Apr  2 15:56:27.347213 sonic INFO swss#/supervisord: restore_neighbors     ifflags = struct.unpack("16xH14x", get_if(i, SIOCGIFFLAGS))[0]
Apr  2 15:56:27.347250 sonic INFO swss#/supervisord: restore_neighbors   File "/usr/local/lib/python3.7/dist-packages/scapy/arch/common.py", line 31, in get_if
Apr  2 15:56:27.347345 sonic INFO swss#/supervisord: restore_neighbors     return ioctl(sck, cmd, struct.pack("16s16x", iff.encode("utf8")))
Apr  2 15:56:27.347365 sonic INFO swss#/supervisord: restore_neighbors OSError: [Errno 19] No such device
The issue was reported to the scapy developers as secdev/scapy#3369 and fixed in secdev/scapy#3371. However, no released scapy version contains this fix yet, so scapy v2.4.5 is built from source with the fix applied as a patch.

Signed-off-by: Stepan Blyschak <stepanb@nvidia.com>
2022-04-07 22:57:47 +00:00
kellyyeh
0e6f1833e0 Update docker-router-advertiser.supervisord.conf.j2 (#10375) 2022-04-07 22:57:37 +00:00
Volodymyr Samotiy
2d21756c5a
[202012] [submodule] Advance sonic-utilities pointer (#10480)
* 3fc6e27 Fix issues in clear_qos (#2122)

Signed-off-by: Volodymyr Samotiy <volodymyrs@nvidia.com>
2022-04-07 10:43:18 +03:00
Volodymyr Samotiy
143dccf706
[202012] [submodule] Advance sonic-platform-common pointer (#10481)
* 427f624 [ssd] Allow individual vendor parsers to handle errors (#252)

Signed-off-by: Volodymyr Samotiy <volodymyrs@nvidia.com>
2022-04-07 10:42:32 +03:00
kellyyeh
b68f4dd74c
Enable dhcp copp trap for EPMS and MgmtTsToR (#10439) 2022-04-06 09:46:08 -07:00
Stepan Blyshchak
4f1b109ea0
[202012] [sonic-swss] update submodule pointer (#10456)
```
323dc34 [neighsyncd] increase neighsyncd timeout (#2209)
7f99941 Remove redundant and problematic code to skip "pool" field in buffer profile handling (#2197)
f3a0feb [Vxlanmgr] vnet netdev cleanup during config reload fix (#2191)
13ccaba Fix issue: sometimes PFC WD unable to create zero buffer pool (#2185)
```

Signed-off-by: Stepan Blyschak <stepanb@nvidia.com>
2022-04-06 08:51:45 +03:00
tjchadaga
35b630655b
[202012][sonic-sairedis] Update submodule (#10460) 2022-04-05 14:09:14 -07:00
Alexander Allen
01acd72314
Update kernel submodule (#10411)
Update the sonic-linux-kernel submodule to the updated 202012 branch. This brings in the following commits:

```
e97f9fc [202012] Add upstreamed patches which backport support for registers for CPLD PNs (#275)
58abcdc Merge pull request #267 from Staphylo/202012-log-buf-len
3f16f4f Merge pull request #268 from Staphylo/202012-emmc-fixes
a120ae7 Apply kernel patches to fix emmc unreliability
5f4a3f3 Increase log_buf_len to 1M for all architecture
```
2022-04-03 11:42:41 -07:00
Lawrence Lee
a081d04807 [azp]: Fix type in slave container cleanup (#10424)
Signed-off-by: Lawrence Lee <lawlee@microsoft.com>
2022-04-03 18:20:04 +00:00
Guohan Lu
8c2e04690e Revert "DellEMC: Z9332f - Component API Fixes (#10187)"
This reverts commit 8a38da94d5.
2022-04-02 14:08:28 -07:00
Kevin Wang
f7596844b7
Update cisco-8000 ref to release: 202012-v0.96 (#10443)
Signed-off-by: Kevin(Shengkai) Wang <shengkaiwang@microsoft.com>
2022-04-02 09:34:19 +08:00
vdahiya12
bd63760457
[202012][sonic-platform-daemons] submodule update (#10392)
Signed-off-by: vaibhav-dahiya vdahiya@microsoft.com
This PR updates the following commits in sonic-platform-daemons

af39d75 [ycable] fix the logic to update cable_info values when ycable is not present; fix read side logic for ycable (#249)
2022-03-31 12:54:35 -07:00
Junchao-Mellanox
c5c047ac3c
[202012] [submodule] Update sonic-utilities submodule (#10393)
Include fix of "Stop PMON before stopping BGP while doing warmboot/fastboot"

4f1400f [202012] Stop PMON before stopping BGP while doing warmboot/fastboot (Azure/sonic-utilities#2101)
2022-03-31 18:10:53 +03:00
noaOrMlnx
c4c0cac35b
[202012][submodule] Update sonic-swss pointer (#10416)
Update submodule pointer to get the following:
08675c5 2022-03-31 Enable CoPP UT (#2176)

Signed-off-by: Noa Or <noaor@nvidia.com>
2022-03-31 16:04:05 +03:00
Shilong Liu
1c24f2e9df [ci] Add azure pipeline to build common libs. (#10367)
Why I did it
To remove references to Azure.sonic-buildimage artifacts.
Azure.sonic-buildimage has a higher failure rate.
2022-03-31 20:39:49 +08:00
Shilong Liu
5b22cba8ca [build] Fix issues found in reproducible build. (#10407) 2022-03-31 20:39:49 +08:00
Shilong Liu
615b0d8d1b [ci] Use template from master branch in UpgrateVersion/sonic-slave pipeline (#10380) 2022-03-31 20:39:49 +08:00
Shilong Liu
f0000efff1 [ci] Fix remove sonic-slave-* docker image issue when building sonic-slave* (#10296) 2022-03-31 20:39:49 +08:00
xumia
8d0aea0f1b [build][Bug]: Fix the command set_reproducible_mirrors not found issue (#10398)
Why I did it
Fix the command set_reproducible_mirrors not found issue during the build.
2022-03-31 06:12:20 +00:00
Lawrence Lee
5b0f0c1d99 [tun_pkt]: Wait for AsyncSniffer to init fully (#10346)
Fix for the tunnel packet handler, which can crash at system startup.
Signed-off-by: Lawrence Lee <lawlee@microsoft.com>
2022-03-30 21:16:18 +00:00
Lior Avramov
07c170fa04
Remove quagga from SONiC (#10384)
Quagga is no longer being used in SONiC. Cherry-pick from master PR #7898

Co-authored-by: liora <liora@nvidia.com>
2022-03-30 13:57:34 -07:00
Devesh Pathak
11701938cf
[202012] - sonic-swss submodule update (#10391)
* [202012] - sonic-swss submodule update to include following commits:

fca407a (HEAD) [VNET]Fixing nexthop group delete during route change (#2198)
a9b6b47 [vxlan] Remove tunnel map objects on VNET tunnel removal (#2208)
74e9b9f [FdbOrch] SAI_FDB_EVENT_MOVE generates update with empty update.entry.port_name (#2201)
0a99445 [202012][BFD]Registering BFD state change callback during session creation (#2203)
aebe4a1 [VS test] skip dpb flaky test (#2195) (#2207)
2022-03-30 13:47:23 -07:00
Shilong Liu
491562b74b
[ci] Fix remove sonic-slave-* docker image issue when building sonic-slave* (#10296) (#10378) 2022-03-30 16:08:32 +08:00
Saikrishna Arcot
e9db38594d
Image disk space reduction (#10172) (#10371)
Reduce the disk space taken up during bootup and runtime.

1. Remove python package cache from the base image and from the containers.
2. During bootup, if logs are to be stored in memory, then don't create the `var-log.ext4` file just to delete it later during bootup.
3. For the partition containing `/host`, don't reserve any blocks for just the root user. This just makes sure all disk space is available for all users, if needed during upgrades (for example).

* Remove pip2 and pip3 caches from some containers

Only containers which appeared to have a significant pip cache size are
included here.

Signed-off-by: Saikrishna Arcot <sarcot@microsoft.com>

* Don't create var-log.ext4 if we're storing logs in memory

Signed-off-by: Saikrishna Arcot <sarcot@microsoft.com>

* Run tune2fs on the device containing /host to not reserve any blocks for just the root user

Signed-off-by: Saikrishna Arcot <sarcot@microsoft.com>
(cherry picked from commit 5617b1ae3e)
2022-03-29 10:11:28 -07:00
Yang Wang
308ecbb89c
[202012][sonic-sairedis]Update sonic-sairedis submodule (#10348) 2022-03-29 13:21:19 +08:00
xumia
674dc6e4c4 [Build]: Fix host image debian package version issue (#10358)
Why I did it
Fix host image debian package version issue.
Package dependencies can break when some Debian packages in the base image are upgraded. For example, libc is installed in the base image, but if the mirror has a newer version, running "apt-get upgrade" upgrades the package unexpectedly. To avoid this, the versions need to be added when building the host image.

How I did it
The package versions of host-image should include those of host-base-image.
2022-03-29 04:36:43 +00:00
mssonicbld
873689ef6e
[ci/build]: Upgrade SONiC package versions (#10373) 2022-03-28 23:08:38 +00:00
pavannaregundi
9184f975a2 [Marvell-armhf] Setting u-boot ftd_high to resolve kernel hung (#10204)
Why I did it
A kernel hang during early boot is caused by the device tree being overwritten while the kernel is uncompressed. Added fdt_high, which gives a safe offset from the kernel location.

How I did it
Setting uboot environment variable fdt_high.

How to verify it
Successful boot of bullseye kernel on Marvell Armada 380/385.

Change-Id: I3e2521780f5ecdb3bdf6cbb6542250814ca11959
Signed-off-by: Pavan Naregundi <pnaregundi@marvell.com>
2022-03-25 21:54:03 +00:00
pavannaregundi
f93266293f [Marvell-armhf] Fixing issues related to partition label (#10203)
Why I did it
1) Removing incorrect check in plt setup for fw_env config: This check was added earlier to compare 2 different types of disk. The check is now redundant and no longer required, as the transition is complete.
2) Removing legacy_volume_label in create_partition: legacy_volume_label is not used in armhf install files. With legacy_volume_label initialized to NULL, the current code always returns true for the check if demo_part exists.

How I did it
Change is about removing the redundant/incorrect code explained above.

How to verify it
uboot fw_printenv and fw_setenv were tested.
onie-nos-install has been verified.

Signed-off-by: Pavan Naregundi <pnaregundi@marvell.com>
2022-03-25 21:53:59 +00:00
Saikrishna Arcot
1277c7ed3b Check to see that the py2 and py3 version files exist before trying to sort them (#10325)
For Bullseye, Python 2 isn't present at all. This means that in certain
build cases (such as building something only for Bullseye), the version
file may not exist, and so the sort command would fail.

For most normal build commands, this probably won't be an issue, because
the SONiC build will start with Buster (which has both Python 2 and
Python 3 wheels built), and so the py2 and py3 files will be present
even during the Bullseye builds.

Signed-off-by: Saikrishna Arcot <sarcot@microsoft.com>
2022-03-25 21:53:55 +00:00
Eric Zhu
8a246b88b1 Fix issue of partially parsing syseeprom value (#10020) (#10276)
Why I did it
The current code assumes that the value part does not contain whitespace, so everything after the first whitespace is ignored. As a result, the syseeprom values returned from the platform API do not match the output of "show platform syseeprom" on dx010 and e1031 devices.

How I did it
This change improves the regular expression for parsing syseeprom values to accommodate whitespace in the value.
PR 10021 provides the solution, but it was committed to the wrong place for dx010 and e1031.
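
To illustrate the kind of change involved, the sketch below contrasts a value pattern that stops at the first whitespace with one that captures the rest of the line. The line layout (name, hex code, length, value) and both patterns are assumptions made for this example, not the expressions from the actual platform code.

```python
# Hypothetical sketch: parse one syseeprom TLV line of the assumed form
#   "<name>  <0xNN code>  <length>  <value>"
# The patterns are illustrative and are not taken from the PR.
import re

# Value captured with \S+ stops at the first whitespace.
NARROW = re.compile(r"^(.+?)\s+(0x[0-9A-Fa-f]{2})\s+(\d+)\s+(\S+)")

# Value captured up to the end of the line, trailing whitespace trimmed.
WIDE = re.compile(r"^(.+?)\s+(0x[0-9A-Fa-f]{2})\s+(\d+)\s+(.+?)\s*$")


def parse_value(line: str, pattern: "re.Pattern[str]") -> str:
    """Return the value field of a TLV line, or "" if it does not match."""
    match = pattern.match(line)
    return match.group(4) if match else ""


if __name__ == "__main__":
    sample = "Product Name         0x21  12 DX010 Rev 02"
    print(repr(parse_value(sample, NARROW)))
    print(repr(parse_value(sample, WIDE)))
```

With the narrow pattern, only the first token of the value survives; the wide pattern keeps the full value, including embedded spaces.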

How to verify it
Compile the sonic_platform wheel for dx010, then upload to device and install the wheel, verify the platform eeprom API.

Signed-off-by: Eric Zhu <erzhu@celestica.com>
2022-03-25 21:53:45 +00:00
tjchadaga
527232122e
[202012][sonic-sairedis] Update submodule (#10347) 2022-03-25 09:25:18 -07:00
mssonicbld
e71c14502d
[ci/build]: Upgrade SONiC package versions (#10331)
Upgrade SONiC Versions
2022-03-25 15:09:37 +08:00
Saikrishna Arcot
aafb3d00e2
Start haveged before systemd-random-seed (#10328)
The haveged service file in Debian Buster specifies that haveged should
start after systemd-random-seed starts (this was removed in Bullseye
after systemd changes caused a bootloop). This is a bit
counterproductive, since haveged is meant to be used in environments
with minimal sources of entropy, but one of the checks that
systemd-random-seed does is to verify that entropy is present.

Therefore, override the default .service file for haveged with one that moves
systemd-random-seed to the Before list, allowing haveged to start before
systemd-random-seed checks the system entropy level. (systemd doesn't
allow removing items from dependency/ordering entries such as After= and
Before=, so the entire .service file has to be overwritten.)

Note that despite this, haveged takes up to two seconds to actually
start working, so systemd-random-seed may still block for about two
seconds. However, this still allows other work (such as running
rc.local) to proceed a bit sooner.

Signed-off-by: Saikrishna Arcot <sarcot@microsoft.com>
2022-03-24 14:28:42 -07:00