904 Commits

Author SHA1 Message Date
Lawrence Lee
b027e87ffb [mux.service]: Remove pmon dependency (#9211)
Signed-off-by: Lawrence Lee <lawlee@microsoft.com>
2021-11-11 02:56:27 +00:00
Lawrence Lee
f317d93cb0 Merged PR 4679112: [write_standby]: Ignore non-auto interfaces
[write_standby]: Ignore non-auto interfaces

* When `write_standby.py` is used to automatically switch over interfaces after linkmgrd or bgp crashes, ignore any interfaces that are not configured to auto-switch.

Signed-off-by: Lawrence Lee <lawlee@microsoft.com>
2021-11-10 18:54:33 -08:00
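
The filtering idea in the commit above (PR 4679112) can be illustrated with a minimal sketch. This is not the actual `write_standby.py` code: the mapping layout, the `state` key, and the `auto` value are assumptions made for illustration only.

```python
def interfaces_to_switch(mux_config):
    """Return the interfaces that may be switched to standby automatically.

    mux_config is a hypothetical mapping of interface name -> settings dict.
    The "state" key and the "auto" value are assumed, not taken from the source.
    """
    return sorted(
        name
        for name, settings in mux_config.items()
        if settings.get("state") == "auto"
    )


# Hypothetical configuration: only two interfaces are set to auto-switch.
example = {
    "Ethernet0": {"state": "auto"},
    "Ethernet4": {"state": "manual"},
    "Ethernet8": {"state": "auto"},
    "Ethernet12": {},
}
print(interfaces_to_switch(example))
```

Interfaces with any other state, or with no state at all, are skipped rather than treated as an error, which matches the "ignore" wording of the commit message.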
Lawrence Lee
57ad50cfd9 Merged PR 4559560: [bgp]: Switch to standby if BGP container exits
[bgp]: Switch mux to standby if BGP container exits

Signed-off-by: Lawrence Lee <lawlee@microsoft.com>
2021-11-10 18:54:33 -08:00
Lawrence Lee
6a9c709336 [write_standby]: Improve logging
Signed-off-by: Lawrence Lee <lawlee@microsoft.com>
2021-11-10 18:54:33 -08:00
Lawrence Lee
77378b4364 [mux]: Call write_standby from host only
Signed-off-by: Lawrence Lee <lawlee@microsoft.com>
2021-11-10 18:54:33 -08:00
Lawrence Lee
25712c712e [mux]: Make write_standby available on host
Signed-off-by: Lawrence Lee <lawlee@microsoft.com>

[write_standby]: Cleanup and fix build

Signed-off-by: Lawrence Lee <lawlee@microsoft.com>
2021-11-10 18:54:33 -08:00
Tamer Ahmed
18d1f65339 Merged PR 4813977: [mux] Update Service Install With SONiC Target
[mux] Update Service Install With SONiC Target

A recent PR grouped all SONiC services under sonic.target. The install section
of mux.service was not updated, which causes delays when using config
reload because the service's failed state is not being reset.

signed-off-by: Tamer Ahmed <tamer.ahmed@microsoft.com>
2021-11-10 18:54:33 -08:00
Lawrence Lee
70fbd6826c Merged PR 4366316: [mux.service]: Bind to sonic.target
[mux.service]: Bind to sonic.target

Signed-off-by: Lawrence Lee <lawlee@microsoft.com>
2021-11-10 18:54:33 -08:00
Tamer Ahmed
b42aef68f3 Merged PR 4234524: [mux] Start Mux on Only Dual-ToR Platform
[mux] Start Mux on Only Dual-ToR Platform

mux docker depends on the presence of mux cable hardware and is
supposed to run only on Gemini ToRs. This PR changes the mux feature
config in order to enable mux docker based on device configuration.

signed-off-by: Tamer Ahmed <tamer.ahmed@microsoft.com>
2021-11-10 18:54:33 -08:00
Tamer Ahmed
b8f70f8986 Merged PR 3845699: [linkmgrd]: Introduce MUX cable linkmgrd
Linkmgrd monitors link status, mux status, and link state. If
the link becomes unhealthy, linkmgrd will trigger a mux switchover
on a standby ToR, ensuring uninterrupted service to servers/blades.
This PR is the initial implementation of linkmgrd.

Also, the docker-mux container holds packages related to maintaining and managing
the mux cable. It currently runs the linkmgrd binary, which monitors and switches
the mux if needed.
This PR also introduces mux-container and starts linkmgrd at startup when
the build is configured with INCLUDE_MUX=y

Edit: linkmgrd PR will follow.

signed-off-by: Tamer Ahmed <tamer.ahmed@microsoft.com>

Related work items: #2315, #3146150
2021-11-10 18:54:33 -08:00
tjchadaga
9a1b1bc44e Fix for additional intf flap during fast-reboot (#9166) 2021-11-09 23:20:06 +00:00
mssonicbld
c15bae7c84
[ci/build]: Upgrade SONiC package versions (#9128) 2021-11-09 22:52:26 +00:00
gechiang
400e40f255
[202012] BRCM SAI 4.3.5.1-6 Picked up fixes for CS00012213351, CS00012182162, and CS00012210826 (#9158)
This picks up BRCM SAI 4.3.5.1-6, which contains the following fixes:

1.  CS00012213351 SONIC-50679: [TH, TH2] Warm-reboot from 3.5 to 4.3 fails due to null objects discovered
2.  CS00012182162: SONIC-49805 TD3 MMU config profile optimization changes 
3.  CS00012210826:SONIC-50205/760c60fc: Should read MMU_INTFI_MMU_PORT_TO_MMU_QUEUES_FC_BKP for TH3

Preliminary tests look fine: BGP neighbors were all up with proper routes programmed,
and interfaces are all up.
Manually ran the following test cases on 7050CX3 (TD3) T0 DUT and all passed:
```
     fib/test_fib.py
     vxlan/test_vxlan_decap.py
     fdb/test_fdb.py
     decap/test_decap.py
     ipfwd/test_dip_sip.py 
     ipfwd/test_dir_bcast.py
     acl/test_acl.py
     vlan/test_vlan.py
     platform_tests/test_reboot.py
```
2021-11-03 07:24:33 -07:00
Sumukha Tumkur Vani
65626c8925
Flush RESTAPI DB upon config reload (#9093) 2021-10-28 09:31:38 -07:00
Nazarii Hnydyn
0cbda8d362 [teamd]: Send USR1/USR2 only to subscribers. (#8856)
To fix teamd signal handling, without which Process 'tlm_teamd' exited unexpectedly
2021-10-27 03:54:58 +00:00
mssonicbld
1c86196411
[ci/build]: Upgrade SONiC package versions (#9050) 2021-10-25 17:09:12 +00:00
gechiang
c95178157d
[202012]BRCM SAI 4.5.3.1-5 picked up SAI fixes for several CSP cases (#9003) 2021-10-19 14:08:31 -07:00
Ying Xie
f1d5aaced0 [copp] bind copp-config.service to sonic.target (#8969)
copp-config service needs to be started after sonic.target so that it could
render the copp-config with the latest information.

It also needs to be restarted when config reload or load_minigraph is invoked.

Signed-off-by: Ying Xie <ying.xie@microsoft.com>
2021-10-15 00:40:05 +00:00
gechiang
eca9020a48
[202012] BRCM SAI 4.5.3.1-4 Fixes dscp-uniform mode, th3 debug counter bmp crash (#8968)
* [202012] BRCM SAI 4.5.3.1-4 Fixes dscp-uniform mode, th3 debug counter bmp crash
2021-10-13 08:25:44 -07:00
mssonicbld
b11d6cf5ee
[ci/build]: Upgrade SONiC package versions (#8919) 2021-10-09 19:12:09 +00:00
mssonicbld
0f48239167
[ci/build]: Upgrade SONiC package versions (#8894) 2021-10-02 19:01:40 +00:00
mssonicbld
d790caecbc
[ci/build]: Upgrade SONiC package versions (#8867) 2021-09-29 17:11:32 +00:00
Vaibhav Hemant Dixit
636870d86f Save DB dump after warm/fast reboot (#8803)
As a part of warmboot, redis database is dumped:
c97fe546e5/scripts/fast-reboot (L269)
However, this dump file is deleted after it is loaded back into the DB post reboot.
The DB dump can be useful for debugging, so it is worth keeping a backup of it.
Instead of deleting the dump, rename it and keep it.
2021-09-27 02:29:12 +00:00
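
The rename-instead-of-delete approach described in the commit above (#8803) might look roughly like the sketch below. The actual change lives in the reboot scripts; the function name `keep_dump`, the `.bak` suffix, and the file names here are hypothetical.

```python
import os
import tempfile


def keep_dump(dump_path, suffix=".bak"):
    """Rename the dump instead of deleting it and return the new path.

    The suffix is a hypothetical naming choice for this sketch.
    """
    backup_path = dump_path + suffix
    # os.replace overwrites an older backup, so only the latest dump is kept.
    os.replace(dump_path, backup_path)
    return backup_path


# Demonstration on a throwaway file in a temporary directory.
with tempfile.TemporaryDirectory() as workdir:
    dump = os.path.join(workdir, "dump.rdb")
    with open(dump, "wb") as handle:
        handle.write(b"example")
    saved = keep_dump(dump)
    print(os.path.exists(dump), os.path.exists(saved))
```

Keeping a single renamed copy bounds the disk usage, at the cost of losing the dump from any earlier reboot.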
gechiang
ac9feadbf1
[202012] BRCMSAI 4.3.5.1-3 fix CS00012203600, CS00012202255, CS00012208537 (#8840) 2021-09-25 17:09:34 -07:00
mssonicbld
667fe3702c
[ci/build]: Upgrade SONiC package versions (#8829) 2021-09-23 17:34:56 +00:00
mssonicbld
c988a7766c
[ci/build]: Upgrade SONiC package versions (#8800) 2021-09-20 12:48:20 +00:00
mssonicbld
7ce529ea35
[ci/build]: Upgrade SONiC package versions (#8795) 2021-09-19 15:26:49 +00:00
mssonicbld
f716745d76
[ci/build]: Upgrade SONiC package versions (#8637) 2021-09-17 16:40:09 +00:00
abdosi
7732fa95bb [baseimage]: Logrotate for wtmp and btmp files. (#8743)
Added a logrotate file for wtmp and btmp to override the default conf and set the size cap to 100K, as done in
PR: #865. For buster this is controlled by separate wtmp and btmp files.

Signed-off-by: Abhishek Dosi <abdosi@microsoft.com>
2021-09-17 08:24:10 +00:00
Sudharsan Dhamal Gopalarathnam
9c5917d8dd Removing execute permission from copp config file (#8680)
*Removed execute permissions from the systemd copp-config.service file. 
Without this we will get a warning: "Configuration file /lib/systemd/system/copp-config.service is marked executable. Please remove executable permission bits. Proceeding anyway."
2021-09-14 08:59:21 +00:00
Ying Xie
e8b8012818 [202012][fstrim] delay fstrim timer after sonic.target (#8737)
Why I did it
fstrim has dependency on pmon docker.

How I did it
start fstrim timer after sonic.target.

How to verify it
local test and PR test.

Signed-off-by: Ying Xie <ying.xie@microsoft.com>
2021-09-14 08:59:17 +00:00
gechiang
84b5659372
[202012] BRCM SAI 4.3.5.1-2 Fix BRCM SAI regression due to ACL Egress Mirroring Action capability (#8682) 2021-09-06 22:12:59 -07:00
Samuel Angebault
96f2eaaadb [Arista] Fix flash size computation for Lodoga (#8622)
The Lodoga platform also matched crow, which was hardcoding the flash
size to 3700. This change enables autodetect on Clearlake, which in turn
allows autodetect for Lodoga.

The threshold was bumped from 3700 to 4000 because size computation can
differ slightly and report slightly above 3700.
2021-09-01 01:40:45 +00:00
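
The reasoning behind the threshold bump in the commit above (#8622) can be sketched as a simple comparison. The function name, the direction of the comparison, and the sample value 3712 are assumptions for illustration; the commit message only states that the threshold moved from 3700 to 4000 because computed sizes can land slightly above 3700.

```python
OLD_THRESHOLD = 3700  # previous value named in the commit message
NEW_THRESHOLD = 4000  # bumped value named in the commit message


def is_small_flash(detected_size, threshold=NEW_THRESHOLD):
    """Return True when the detected flash size is at or below the threshold."""
    return detected_size <= threshold


# Hypothetical autodetected value that lands slightly above 3700.
reported = 3712
print(is_small_flash(reported, OLD_THRESHOLD), is_small_flash(reported))
```

With the old threshold, a device reporting slightly more than 3700 would fall outside the small-flash category; the wider margin keeps it inside.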
mssonicbld
7eb4a345fa
[ci/build]: Upgrade SONiC package versions (#8584)
Co-authored-by: mssonicbld <vsts@fv-az232-326.x3jni0md3anuvcz2px3t3ecixa.bx.internal.cloudapp.net>
2021-08-30 16:24:18 +08:00
Samuel Angebault
01117d58b5 [Arista] Rely on automatic flash size detection for Lodoga (#8608)
Lodoga actually has an 8GB storage device.
LodogaSsd variant has a 30GB SSD drive.
However, in boot0 both were mishandled and assigned 4GB for legacy reasons.

Remove the hardcoding of the flash size and let boot0 autodetect the available space.
2021-08-27 02:27:15 +00:00
dflynn-Nokia
2c91efcd15 [Nokia ixs7215] Add support for changing the console baud rate (#8595)
This commit adds support for changing the default console baud rate configured
within the U-Boot bootloader. That default baud rate is exposed via the value
of the U-Boot 'baudrate' environment variable. This commit removes logic that
hardcoded the console baud rate to 115200 and instead ensures that the U-Boot
'baudrate' variable is always used when constructing the Linux kernel boot
arguments used when booting SONiC.

A change is also made to rc.local to ensure that the specified baud rate is set
correctly in the serial getty service.
2021-08-27 02:27:06 +00:00
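
As a rough illustration of the idea in the commit above (#8595), the sketch below builds a kernel `console=` argument from a U-Boot style environment instead of a hardcoded rate. The function name, the `ttyS0` device, and the fallback behaviour when `baudrate` is missing are assumptions, not details from the commit.

```python
def console_bootarg(uboot_env, device="ttyS0", fallback=115200):
    """Build a kernel console argument from the U-Boot 'baudrate' variable.

    uboot_env is a hypothetical dict of U-Boot environment variables.
    The fallback is an assumption of this sketch; the commit states that
    the 'baudrate' variable is always used.
    """
    baud = int(uboot_env.get("baudrate", fallback))
    # "n8" means no parity and 8 data bits.
    return f"console={device},{baud}n8"


print(console_bootarg({"baudrate": "9600"}))
print(console_bootarg({}))
```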
gechiang
fcdd63835b
[202012]BRCM SAI 4.3.5.1-1 Fix configurable drop counter out of resource (#8601)
* [202012]BRCM SAI 4.3.5.1 Fix for configurable drop counter out of resource
2021-08-26 14:30:22 -07:00
mssonicbld
98dd76c485
[ci/build]: Upgrade SONiC package versions (#8561) 2021-08-24 14:53:16 +00:00
mssonicbld
8f604998b4
[ci/build]: Upgrade SONiC package versions (#8556) 2021-08-23 17:32:59 +00:00
Volodymyr Samotiy
8365245122 [monit] Periodically monitor VNET route consistency (#8266)
*To run VNET route consistency check periodically.
*For any failure, the monit will raise alert based on return code.
Signed-off-by: Volodymyr Samotiy <volodymyrs@nvidia.com>
2021-08-23 03:05:16 +00:00
mssonicbld
733c851fc9
[ci/build]: Upgrade SONiC package versions (#8527) 2021-08-19 18:49:15 +00:00
mssonicbld
aa5d05ed2c
[ci/build]: Upgrade SONiC package versions (#8385) 2021-08-16 10:48:24 +00:00
Stephen Sun
d599450052 Use predefined macro as vendor information (#8361)
#### Why I did it
Use a predefined variable to get vendor information when the swss docker container is created

#### How I did it
Use `{{ sonic_asic_platform }}` instead of `$SONIC_CFGGEN -y /etc/sonic/sonic_version.yml -v asic_type`

#### How to verify it
Manually test.
2021-08-16 07:51:01 +00:00
Sudharsan Dhamal Gopalarathnam
ba2284c4c0 Grouping delayed services under a target for config reload checks (#7846)
#### Why I did it
Create a target for delayed service timers. A few services in SONiC are delayed to speed up the bring-up of the system and essential services. However, there is no way to track when they start. This is a problem when executing config reload, as config reload expects all services to be up. Hence, all the timers that trigger the delayed services are grouped under one target so that they can be tracked by the 'config reload' command.

#### How I did it
Created the delay.target service and added a dependency on the delayed targets.
2021-08-16 07:50:56 +00:00
Ying Xie
92fb9c94bd [aboot] use ram partition for /var/log for devices with 3.7G disks (#8400)
The Master/202012 image size grew quite a bit. A 3.7G hard drive can no longer hold one image and safely upgrade to another image, so every bit of hard drive space saved is precious now.

Also, sh behavior seemingly changed: `[ condition ] && action` was legitimate syntax in the 201911 branch, but it is an error when the condition is not met with 202012 or later images. Change the syntax to an if statement to avoid the issue.

Signed-off-by: Ying Xie <ying.xie@microsoft.com>
2021-08-14 17:22:01 -07:00
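
One difference between the two shell forms mentioned in the commit above (#8400) can be observed with the small sketch below, which runs each form through `sh` and compares exit statuses. It assumes `sh` is on the PATH, and it does not claim to reproduce the exact failure the author hit; it only shows that the short-circuit form yields a non-zero status when the condition is false, while the if form does not.

```python
import subprocess


def exit_status(script):
    """Run a one-line script with sh and return its exit status."""
    return subprocess.run(["sh", "-c", script], check=False).returncode


# The condition is deliberately false in both scripts.
short_circuit = exit_status("[ 1 -eq 2 ] && echo matched")
if_form = exit_status("if [ 1 -eq 2 ]; then echo matched; fi")
print(short_circuit, if_form)
```

A non-zero status from the short-circuit form can matter when it is the last command of a script or function and the caller treats that status as a failure, which is why the if form is the safer choice.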
novikauanton
aae4e8dc7c
[build]: Fix bfn package version for reproducible build (#8468)
The Barefoot pipeline is broken because the version has not been updated by the ci build yet.
2021-08-14 14:27:45 -07:00
Vladyslav Morokhovych
754378f1d8
[swss] Fix arp_update script (#8412)
Fix #7968

Issue is detected on SONiC.20201231.11

In test_static_route.py::test_static_route_ecmp static routes are configured, but neighbors are not resolved after config reload even after 10 minutes.
It looks like the arp_update script is starting to ping when Vlan1000 is not fully configured.
When the issue is reproduced, a stuck ping6 process is observed in the swss container:

USER         PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
root         180  0.1  0.0   6296  1272 pts/0    S    17:03   0:03 ping6 -I Vlan1000 -n -q -i 0 -c 1 -W 0 ff02::1
When the arp_update script successfully resolves neighbors, we observe `sleep 300` instead of the ping process.
2021-08-12 23:25:28 -07:00
mssonicbld
1c30a8b0c1
[ci/build]: Upgrade SONiC package versions (#8376) 2021-08-08 19:05:13 +00:00
Guohan Lu
ceab083fc5 [build]: add sonic_release 202012
Signed-off-by: Guohan Lu <lguohan@gmail.com>
2021-08-07 18:04:28 -07:00
Longxiang Lyu
25f53289eb [swss][arp_update] Send ipv6 pings over vlan sub interfaces (#8363)
#### Why I did it
* `arp_update` fails to ping those neighbors over vlan sub interfaces.

#### How I did it
* modify `arp_update_vars.j2` to get vlan sub interfaces with ipv6 addresses assigned.
* modify `arp_update` to send ipv6 pings over those retrieved vlan sub interfaces.

Signed-off-by: Longxiang Lyu <lolv@microsoft.com>
2021-08-07 12:43:51 +00:00