Commit Graph

1088 Commits

Author SHA1 Message Date
Jing Zhang
78f249be38
change default to be on (#13495)
Changing the default config knob value to be True for killing radv, due to the reasons below:

Killing RADV is to prevent sending the "cease to be advertising interface" protocol packet.
RFC 4861 says this ceasing packet as "should" instead of "must", considering that it's fatal to not do this.
In active-active scenario, host side might have difficulty distinguish if the "cease to be advertising interface" is for the last interface leaving.
6.2.5. Ceasing To Be an Advertising Interface

shutting down the system.
In such cases, the router SHOULD transmit one or more (but not more
than MAX_FINAL_RTR_ADVERTISEMENTS) final multicast Router
Advertisements on the interface with a Router Lifetime field of zero.
In the case of a router becoming a host, the system SHOULD also
depart from the all-routers IP multicast group on all interfaces on
which the router supports IP multicast (whether or not they had been
advertising interfaces). In addition, the host MUST ensure that
subsequent Neighbor Advertisement messages sent from the interface
have the Router flag set to zero.

sign-off: Jing Zhang zhangjing@microsoft.com
2023-01-24 23:59:54 +00:00
Zain Budhwani
c9a33cb00e
Fix segfault issue inside memory_checker (#13066)
#### Why I did it

Segfault was occuring when running memory_checker

#### How I did it

Deinit publisher immediately after publishing

#### How to verify it

Manual testing
2023-01-24 15:30:41 -08:00
Jing Zhang
260a2ec3e7
[dualtor][active-active]Killing radv instead of stopping on active-active dualtor if config knob is on (#13408)
How I did it
radv sends a good-bye packet when the service is stopped, which causes a IPv6 route update on SoC side. And this update leads to an interface bouncing and causes traffic disruption even though the ToR device might already be isolated.

This PR is to mitigate the traffic disruption issue during planned maintenance, by killing radv instead of stopping. So the cease packet won't be sent.

How to verify it
Verified on dev clusters:

Traffic disruption was no longer reproducible.
radv took the killing path
if knob was off, radv would take the stopping path

sign-off: Jing Zhang zhangjing@microsoft.com
2023-01-20 15:34:34 -08:00
Graham Hayes
e077b5362c
[Arista] Rely on automatic flash size detection for Raven (#13277)
Many of these switches have had flash upgraded beyond 2G however, in
boot0 both were assigned 2GB for legacy reasons.

Remove the hardcoding of the flash size and let boot0 autodetect the available space.

Signed-off-by: Graham Hayes <gr@ham.ie>

Signed-off-by: Graham Hayes <gr@ham.ie>
2023-01-12 23:52:40 -08:00
xumia
e6a01ca5eb
[Bug] Fix SONiC installation failure caused by pip/pip3 not found (#13284)
The main issue is the pip/pip3 command cannot be found when the package is being installed by apt-get.
When using the dpkg install, the searching path is PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
When using the apt-get install, the searching path is PATH=/usr/sbin:/usr/bin:/sbin:/bin
But the pip/pip3 default path is at /usr/local/bin, so dpkg works, but apt-get not work.

How I did it
Export the path /usr/local/bin for pip/pip3.
Make the deb packages can be installed by apt-get.
2023-01-11 08:54:24 -08:00
centecqianj
4b933bd566
[Centec arm64] Solve the abnormal console speed of centec-arm64 switch board (#13126)
The console of the centec-arm64 board is ttyAMA0.The current regular expression cannot be correctly parsed.

Signed-off-by: centecqianj <qianj@centec.com>
2023-01-07 21:10:03 -08:00
abdosi
9ecd27ddbb
During build time mask only those feature/services that are disabled excplicitly (#13283)
What I did:
Fix : #13117

How I did:
During build time mask only those feature/services that are disabled explicitly. Some of the features ((eg: teamd/bgp/dhcp-relay/mux/etc..)) state is determine run-time so for those feature by default service will be up and running and then later hostcfgd will mask them if needed.

So Default behavior will be

init_cfg.json.j2 during build time make state as disabled then mask the service
init_cfg.json.j2 during build time make state as another jinja2 template render string than do no mask the service
init_cfg.json.j2 during build time make state as enabled then do not mask the service

How I verify:
Manual Verification.

Signed-off-by: Abhishek Dosi <abdosi@microsoft.com>
2023-01-07 02:36:37 +00:00
Arvindsrinivasan Lakshmi Narasimhan
a57fa16839
[Chassis][Voq]update to add buffer_queue config on system ports (#12156)
Why I did it
In the voq chassis the buffer_queue configuration needs to be applied on system_port instead of the sonic port.
This PR has the change to do this.

How I did it
Modify buffer_config.j2 to generate buffer_queue configuration on system_ports if the device is Voq Chassis

How to verify it
Verify the buffer_queue configuration is generated properly using sonic-cfggen

Signed-off-by: Arvindsrinivasan Lakshmi Narasimhan <arlakshm@microsoft.com>
2022-12-31 23:59:54 -08:00
Oleksandr Ivantsiv
127d60f9b8
[build] Adjust teamd and radv features configuration according to the compilation options. (#13139)
- Why I did it
The followup to #12920 PR.
If the feature compilation is disabled its configuration should not be included into init_cfg.json.

- How I did it
Update init_cfg.json.j2 template to include teamd and radv features configuration only if their compilation is enabled.

- How to verify it
The default behavior is preserved. To verify the changes compile the image without overriding INCLUDE_TEAMD and INCLUDE_ROUTER_ADVERTISER options. The generated /etc/sonic/init_cfg.json should remain with no changes. Install the image and verify that both teamd and radv containers are present and running. Verify that feature state returned by show feature status command is enabled.
Change the INCLUDE_TEAMD or INCLUDE_ROUTER_ADVERTISER value to "n". Compile and install the image. Verify that feature configuration is not included in generated /etc/sonic/init_cfg.json file. Verify that show feature status output doesn't include the feature.
2022-12-27 13:55:37 +02:00
Stepan Blyshchak
661669c805
[swss/syncd] remove dependency on interfaces-config.service (#13084)
- Why I did it
Remove dependency on interfaces-config.service to speed up boot, because interfaces-config.service takes a lot of time on boot.

- How I did it
Changed service files for swss, syncd.

- How to verify it
Boot and check swss/syncd start time comparing to interfaces-config

Signed-off-by: Stepan Blyschak <stepanb@nvidia.com>
2022-12-26 09:20:45 +02:00
Junchao-Mellanox
2126def04e
[infra] Support syslog rate limit configuration (#12490)
- Why I did it
Support syslog rate limit configuration feature

- How I did it
Remove unused rsyslog.conf from containers
Modify docker startup script to generate rsyslog.conf from template files
Add metadata/init data for syslog rate limit configuration

- How to verify it
Manual test
New sonic-mgmt regression cases
2022-12-20 10:53:58 +02:00
Konstantin Vasin
8a3fad2891
[Build] mount cgroup2 in chroot to fix build on ubuntu 22.04 (#13030)
Why I did it
Ubuntu 22.04 uses cgroup2 by default, but docker.sh doesn't mount it.
As a result we get an error when trying to run docker info in chroot env:

ERROR: Cannot connect to the Docker daemon at unix:///var/run/docker.sock. Is the docker daemon running?

How I did it
mount cgroup2 in chroot if all enabled kernel cgroup controllers are currently not in use by cgroup1

So we need to mount cgroup in chroot environment on /sys/fs/cgroup.
Because inside chroot we don't know which cgroup version is used by the host we have two possible solutions:

cgroup tree for chroot is mounted by the host (it was my 1st version of this fix)
cgroup tree is mounted inside chroot based on info from /proc/cgroups (it's current version of this fix)
My 2nd version based on this code from systemd: 5c6c587ce2/src/shared/cgroup-setup.c (L35-L74)

We parse info from /proc/cgroups
Skip header line started from #
Skip controller if it's disabled (4th column = 0)
Count number of controllers with non-zero of hierarchy_id (2nd column)
If this number is not zero then we assume some of controllers are used by host system and the host system uses hybrid or legacy cgroup tree. In this case we can't use unified cgroup tree inside chroot and mount old cgroup tree (v1).
If this number is zero then we assume host system uses unified cgroup tree and we need to mount cgroup2 inside chroot.

Signed-off-by: Konstantin Vasin <k.vasin@yadro.com>
2022-12-17 12:16:45 -08:00
Oleksandr Ivantsiv
9988ff888b
[build] Add the possibility to disable compilation of teamd and radv containers. (#12920)
- Why I did it
This optimization is needed for DPU SONiC. DPU SONiC runs a limited set of containers and teamd and radv containers are not part of them. Unlike the other containers, there was no possibility to disable teamd and radv containers compilation.
To reduce DPU SONiC compilation time and reduce the image size this commit adds the possibility to disable their compilation.

- How I did it
Two new configuration options are added to rules/config file:

INCLUDE_TEAMD
INCLUDE_ROUTER_ADVERTISER
By default to preserve the existing behavior both options are enabled. There are two ways to override them:

To change option value to "n" in rules/config file.
To override their value using SONIC_OVERRIDE_BUILD_VARS env variable:
SONIC_OVERRIDE_BUILD_VARS="SONIC_INCLUDE_TEAMD=y SONIC_INCLUDE_ROUTER_ADVERTISER=n"

- How to verify it
The default behavior is preserved. To verify it compile the image without overriding new options. Install the image and verify that both teamd and radv containers are present and running.
To verify the new options override them with "n" value. Compile and install image. Verify that no docker containers are present. Verify that SWSS can start without errors.
2022-12-13 12:06:30 +02:00
Saikrishna Arcot
00b11ec4e2
Replace logrotate cron file with (adapted) systemd timer file (#12921)
Debian is shipping a systemd timer unit for logrotate, but we're also
packaging in a cron job, which means both of them will run, potentially
at the same time. Remove our cron file, and add an override to the
shipped timer file to have it be run every 10 minutes.

Fixes #12392.

Signed-off-by: Saikrishna Arcot <sarcot@microsoft.com>

Signed-off-by: Saikrishna Arcot <sarcot@microsoft.com>
2022-12-08 14:13:11 -08:00
Junchao-Mellanox
3b3837a636
[containercfgd] Add containercfgd and syslog rate limit configuration support (#12489)
* [containercfgd] Add containercfgd and syslog rate limit configuration support

* Fix build issue

* Fix checker issue

* Fix review comment

* Fix review comment

* Update containercfgd.py
2022-12-08 08:58:35 -08:00
Arvindsrinivasan Lakshmi Narasimhan
7db272556e
[chassis] update the asic_status.py to read from CHASSIS_FABRIC_ASIC_INFO_TABLE (#12576)
Signed-off-by: Arvindsrinivasan Lakshmi Narasimhan arlakshm@microsoft.com

Why I did it
Fixes #12575 and #12575

How I did it
In the PR sonic-net/sonic-platform-daemons#311 chassisd updates to CHASSIS_FABRIC_ASIC_INFO with the fabric asic info.
Updating the asic_status.py to read from the correct table.

How to verify it
test on chassis

Signed-off-by: Arvindsrinivasan Lakshmi Narasimhan <arlakshm@microsoft.com>
2022-12-07 21:53:47 -08:00
Michael Li
50b962b4a8
Limit reload BCM SDK kmods on syncd start to PikeZ platform (#12971)
Why I did it
Limiting #12804 changes to PikeZ platform only (Arista-720DT-48S). Note that this is a short term workaround for this platform until SDK investigation on SDK init failure on docker syncd restart due to DMA issues is resolved.

How I did it
Retrieve platform name from /host/machine.conf and only reload SDK kmods on Arista-720DT-48S platform.

Signed-off-by: Michael Li <michael.li@broadcom.com>
2022-12-07 09:53:21 +08:00
Stepan Blyshchak
8ca0530920
[swss.sh] optimize macsec feature state query (#12946)
- Why I did it
There's a slowdown in bootup related to the execution of a show command during startup of swss service. show is a pretty heavy command and takes long time to execute ~2 sec.

- How I did it
I replaced show with sonic-db-cli which takes a ms to run.

- How to verify it
Boot the switch and verify swss is active.

Signed-off-by: Stepan Blyschak <stepanb@nvidia.com>
2022-12-06 11:23:46 +02:00
Kalimuthu-Velappan
aaeafa8411
02.Version cache - docker cache build framework (#12001)
During docker build, host files can be passed to the docker build through
docker context files. But there is no straightforward way to transfer
the files from docker build to host.

This feature provides a tricky way to pass the cache contents from docker
build to host. It tar's the cached content and encodes them as base64 format
and passes it through a log file with a special tag as 'VCSTART and VCENT'.

Slave.mk in the host, it extracts the cache contents from the log and stores them
in the cache folder. Cache contents are encoded as base64 format for
easy passing.

<!--
     Please make sure you've read and understood our contributing guidelines:
     https://github.com/Azure/SONiC/blob/gh-pages/CONTRIBUTING.md

     ** Make sure all your commits include a signature generated with `git commit -s` **

     If this is a bug fix, make sure your description includes "fixes #xxxx", or
     "closes #xxxx" or "resolves #xxxx"

     Please provide the following information:
-->

#### Why I did it

#### How I did it

#### How to verify it
2022-12-02 08:28:45 +08:00
Michael Li
f725b83bd6
Reload BCM SDK kmods on syncd start to handle syncd restart issues (#12804)
Why I did it
There is an issue on the Arista PikeZ platform (using T3.X2: BCM56274) while running SONiC. If the 'syncd' container in SONiC is restarted, the expected behaviour is that syncd will automatically restart/recover; however it does not and always fails at create_switch due to BCM SDK kmod DMA operation cancellation getting stuck.

Sep 16 22:19:44.855125 pkz208 ERR syncd#syncd: [none] SAI_API_SWITCH:platform_process_command:428 Platform command "init soc" failed, rc = -1. Sep 16 22:19:44.855206 pkz208 INFO syncd#supervisord: syncd CMIC_CMC0_PKTDMA_CH4_DESC_COUNT_REQ:0x33#015 Sep 16 22:19:44.855264 pkz208 CRIT syncd#syncd: [none] SAI_API_SWITCH:platformInit:1909 initialization command "init soc" failed, rc = -1 (Internal error). Sep 16 22:19:44.855403 pkz208 CRIT syncd#syncd: [none] SAI_API_SWITCH:sai_driver_init:642 Error initializing driver, rc = -1. ... Sep 16 22:19:44.855891 pkz208 CRIT syncd#syncd: [none] SAI_API_SWITCH:brcm_sai_create_switch:1173 initializing SDK failed with error Operation failed (0xfffffff5).

Reloading the BCM SDK kmods allows the switch init to continue properly.

How I did it
If BCM SDK kmods are loaded, unload and load them again on syncd docker start script.

How to verify it
Steps to reproduce:

In SONiC, run 'docker ps' to see current running containers; 'syncd' should be present.
Run 'docker stop syncd'
Wait ~1 minute.
Run 'docker ps' to see that syncd is missing.
Check logs to see messages similar to the above.

Signed-off-by: Michael Li <michael.li@broadcom.com>
2022-11-30 16:16:30 +08:00
Kebo Liu
36a100083f
[Mellanox] Add support to Mellanox Spectrum-4 ASIC Firmware compiling and upgrade (#12844)
- Why I did it
Add support for compiling Spectrum-4 ASIC firmware to the SONiC image
Add support for Spectrum-4 ASIC firmware upgrade

- How I did it
Update Mellanox fw make files to include Spectrum-4 ASIC firmware binaries.
Update firmware upgrade scripts to be able to detect Spectrum-4 ASIC.

- How to verify it
Run regression tests

Signed-off-by: Kebo Liu <kebol@nvidia.com>
2022-11-29 16:38:41 +02:00
Zain Budhwani
4b001e5115
Change value type of params in memory_checker (#12797)
Fix error when calling events API, required value is string, passing float
2022-11-23 17:37:28 -08:00
bingwang-ms
f402e6b5c6
Apply separated DSCP_TO_TC_MAP and TC_TO_QUEUE_MAP to uplink ports on dualtor (#12730)
Why I did it
The PR is to apply separated DSCP_TO_TC_MAP and TC_TO_QUEUE_MAP to uplink ports on dualtor.
The traffic with DSCP 2 and DSCP 6 from T1 is treated as lossless traffic.

DSCP    TC    Queue
2      2     2
6      6     6
Traffic with DSCP 2 or DSCP 6 from downlink is still treated as lossy traffic as before.

How I did it
Define DSCP_TO_TC_MAP|AZURE_UPLINK and TC_TO_QUEUE_MAP|AZURE_UPLINK.

How to verify it
Verified by UT
Verified by coping the new template to a testbed, and rendering a config_db.json
2022-11-21 11:42:28 -08:00
Konstantin Vasin
6448afd338
[Build] set apt Acquire::Retries to 3 for bullseye (#12758)
Why I did it
There were some changes in apt source code in version 2.1.9.
As a result apt used in bullseye (2.2.4) is intolerant to network issues.
This was fixed in 10631550f1 Already fixed version is used in bookworm (2.5.4)
And not yet affected version is used in buster (1.8.2.3)

How I did it
Set Acquire::Retries to 3 for sonic-slave-bullseye, docker-base-bullseye and final Debian image.

Ref: https://bugs.launchpad.net/ubuntu/+source/apt/+bug/1876035

Signed-off-by: Konstantin Vasin k.vasin@yadro.com
2022-11-21 08:05:16 +08:00
Lorne Long
7e525d96b3
[Build] Use apt-get to predictably support dependency ordered configuration of lazy packages (#12164)
Why I did it
The current lazy installer relies on a filename sort for both unpack and configuration steps. When systemd services are configured [started] by multiple packages the order is by filename not by the declared package dependencies. This can cause the start order of services to differ between first-boot and subsequent boots. Declared systemd service dependencies further exacerbate the issue (e.g. blocking the first-boot script).

The current installer leaves packages un-configured if the package dependency order does not match the filename order.

This also fixes a trivial bug in [Build]: Support to use symbol links for lazy installation targets to reduce the image size #10923 where externally downloaded dependencies are duplicated across lazy package device directories.

How I did it
Changed the staging and first-boot scripts to use apt-get:

dpkg -i /host/image-$SONIC_VERSION/platform/$platform/*.deb

becomes

apt-get -y install /host/image-$SONIC_VERSION/platform/$platform/*.deb

when dependencies are detected during image staging.

How to verify it
Apt-get critical rules

Add a Depends= to the control information of a package. Grep the syslog for rc.local between images and observe the configuration order of packages change.
2022-11-17 11:20:42 +08:00
abdosi
668485aac5
Added Support to runtime render bgp and teamd feature state and lldp has_asic_scope flag (#11796)
Added Support to runtime render bgp and teamd feature `state` and lldp `has_asic_scope`  flag
Needed for SONiC on chassis.

Signed-off-by: Abhishek Dosi <abdosi@microsoft.com>
Co-authored-by: mlok <marty.lok@nokia.com>
2022-11-15 16:20:14 -08:00
abdosi
bd348c5264
[chassis-packet] fix the issue of internal ip arp not getting resolved. (#12127)
Fix the issue where arp_update will not ping some of the ip's even
though they are in failed state since grep of that ip on ip neigh show
command does not do exact word match and can return multiple match.
2022-11-14 10:15:17 -08:00
Zain Budhwani
98ace33b0f
Add rsyslog plugin regex for select operation failure (#12659)
Added events for select op, alpm parity error, moved dhcp events from host to container
2022-11-13 21:41:33 -08:00
Jing Kan
111752957f
[dhcp_relay] Enable DHCP Relay for BmcMgmtToRRouter in init_cfg (#12648)
Why I did it
DHCP relay feature needs to be enabled for BmcMgmtToRRouter by default

How I did it
Update device type list
2022-11-10 13:37:02 +08:00
Devesh Pathak
0ea4f4d00e
Clear /etc/resolv.conf before building image (#12592)
Why I did it
nameserver and domain entries from build system fsroot gets into sonic image.

How I did it
Clear /etc/resolv.conf before building image

How to verify it
Built image with it and verified with install that /etc/resolv.conf is empty
2022-11-09 16:54:56 -08:00
xumia
ac5d89c6ac
[Build] Support j2 template for debian sources (#12557)
Why I did it
Unify the Debian mirror sources
Make easy to upgrade to the next Debian release, not source url code change required.
Support to customize the Debian mirror sources during the build
Relative issue: #12523
2022-11-09 08:09:53 +08:00
judyjoseph
c259c996b4
Use the macsec_enabled flag in platform to enable macsec feature state (#11998)
* Use the macsec_enabled flag in platform to enable macesc feature state
* Add macsec supported metadata in DEVICE_RUNTIME_METADATA
2022-11-08 11:03:38 -08:00
Sudharsan Dhamal Gopalarathnam
e6a0fba9ea
[logrotate]Fix logrotate firstaction script to reflect correct size (#12599)
- Why I did it
Fix logrotate firstaction script to reflect correct size. The size was modified to change dynamically based on disk size. However this variable was not updated
#9504

- How I did it
Updated the variable based on disk size

- How to verify it
Verify in the generated rsyslog file if the variable is correctly generated from jinja template
2022-11-08 13:38:14 +02:00
Lawrence Lee
ddf16c9d8c
[arp_update]: Fix hardcoded vlan (#12566)
Typo in prior PR #11919 hardcodes Vlan name. Change command to use the $vlan variable instead

Signed-off-by: Lawrence Lee <lawlee@microsoft.com>
2022-11-07 12:10:00 -08:00
Zain Budhwani
8f48773fd1
Publish additional events (#12563)
Add event_publish code or regex for rsyslog plugin for additional events
2022-11-07 09:57:57 -08:00
Mai Bui
61a085e55e
Replace os.system and remove subprocess with shell=True (#12177)
Signed-off-by: maipbui <maibui@microsoft.com>
#### Why I did it
`subprocess` is used with `shell=True`, which is very dangerous for shell injection.
`os` - not secure against maliciously constructed input and dangerous if used to evaluate dynamic content
#### How I did it
remove `shell=True`, use `shell=False`
Replace `os` by `subprocess`
2022-11-04 10:48:51 -04:00
bingwang-ms
6169ae3ee3
Add lossy scheduler for queue 7 (#12596)
* Add lossy scheduler for queue 7
2022-11-04 08:12:00 +08:00
ntoorchi
45d174663a
Enable P4RT at build time and disable at startup (#10499)
#### Why I did it
Currently at the Azure build system, the P4RT container is disabled by default at the build time. Here the goal is to include the P4RT container at the build time while disabling it at the runtime. The user can enable/disable the p4rt app through the config based on the preference. 

#### How I did it
Changed the config in rules/config and init-cfg.json.j2
2022-10-31 16:18:42 -07:00
Devesh Pathak
85e3a81f47
Fix to improve hostname handling (#12064)
* Fix to improve hostname handling
If config_db.json is missing hostname entry, hostname-config.sh ends
up deleting existing entry too and hostname changes to default 'localhost'

* default hostname to 'sonic` if missing in config file
2022-10-25 14:51:02 -07:00
Samuel Angebault
f39c2adc04
Fix extraction of platform.tar.gz for firsttime (#11935) 2022-10-21 18:27:32 -07:00
Samuel Angebault
9cdd78788f
Add support for UpperlakeElite (#12280)
Signed-off-by: Samuel Angebault <staphylo@arista.com>

Signed-off-by: Samuel Angebault <staphylo@arista.com>
2022-10-21 18:26:43 -07:00
Mariusz Stachura
9f88d03c2b
[QoS] Support dynamic headroom calculation for Barefoot platforms (#11708)
Signed-off-by: Mariusz Stachura <mariusz.stachura@intel.com>

What I did
Adding the dynamic headroom calculation support for Barefoot platforms.

Why I did it
Enabling dynamic mode for barefoot case.

How I verified it
The community tests are adjusted and pass.
2022-10-19 09:36:56 -07:00
cytsao1
9ef8464964
[pmon] Add smartmontools to pmon docker (#11837)
* Add smartmontools to pmon docker

* Set smartmontools to install version 7.2-1 in pmon to match host; clean up smartmontools build files

* Add comments on smartmontools version for both host and pmon
2022-10-17 13:26:31 -07:00
Ying Xie
bc684fef0b
[BGP] starting BGP service after swss (#12381)
Why I did it
BGP service has always been starting after interface-config. However, recently we discovered an issue where some BGP sessions are unable to establish due to BGP daemon not able to read the interface IP.

This issue was clearly observed after upgrading to FRR 8.2.2. See more details in #12380.

How I did it
Delaying starting BGP seems to be a workaround for this issue.

However, caution is that this delay might impact warm reboot timing and other timing sequences.

This workaround is reducing the probability of hitting the issue by close to 100X. However, this workaround is not bulletproof as test shows. It is still preferrable to have a proper FRR fix and revert this change in the future.

How to verify it
Continuously issuing config reload and check BGP session status afterwards.

Signed-off-by: Ying Xie <ying.xie@microsoft.com>
2022-10-13 09:24:06 -07:00
Hua Liu
257cc96d7c
Remove swsssdk from sonic OS image and docker container image (#12323)
Remove swsssdk from sonic OS image and docker image

#### Why I did it
swsssdk is deprecated, so need remove from image.

#### How I did it
Update config file to remove swsssdk from image.

#### How to verify it
Pass all test case.

#### Which release branch to backport (provide reason below if selected)

<!--
- Note we only backport fixes to a release branch, *not* features!
- Please also provide a reason for the backporting below.
- e.g.
- [x] 202006
-->

- [ ] 201811
- [ ] 201911
- [ ] 202006
- [ ] 202012
- [ ] 202106
- [ ] 202111
- [ ] 202205

#### Description for the changelog
Remove swsssdk from sonic OS image and docker image

#### Ensure to add label/tag for the feature raised. example - PR#2174 under sonic-utilities repo. where, Generic Config and Update feature has been labelled as GCU.

#### Link to config_db schema for YANG module changes
<!--
Provide a link to config_db schema for the table for which YANG model
is defined
Link should point to correct section on https://github.com/Azure/sonic-buildimage/blob/master/src/sonic-yang-models/doc/Configuration.md
-->

#### A picture of a cute animal (not mandatory but encouraged)
2022-10-12 13:04:14 +08:00
Zain Budhwani
09fe3f467f
Add Structured Events w/ YANG Models (#12270)
Add events for dhcp-relay, bgp, syncd, & kernel.
2022-10-09 20:23:31 -07:00
Prince George
ac1d392d4c
Disable brackted-paste mode off by default (#12285)
* Disable brackted-paste mode off by default

* address review comment
2022-10-06 07:55:09 -07:00
Saikrishna Arcot
9251d4ba8b
[docker-wait-any]: Exit worker thread if main thread is expected to exit (#12255)
There's an odd crash that intermittently happens after the teamd container
exits, and a signal is raised to the main thread to exit. This thread (watching
teamd) continues execution because it's in a `while True`. The subsequent wait
call on the teamd container very likely returns immediately, and it calls
`is_warm_restart_enabled` and `is_fast_reboot_enabled`. In either of these
cases, sometimes, there is a crash in the transition from C code to Python code
(after the function gets executed).  Python sees that this thread got a signal
to exit, because the main thread is exiting, and tells pthread to exit the
thread.  However, during the stack unwinding, _something_ is telling the
unwinder to call `std::terminate`.  The reason is unknown.

This then results in a python3 SIGABRT, and systemd then doesn't call the stop
script to actually stop the container (possibly because the main process exited
with a SIGABRT, so it's a hard crash). This means that the container doesn't
actually get stopped or restarted, resulting in an inconsistent state
afterwards.

The workaround appears to be that if we know the main thread needs to exit,
just return here, and don't continue execution. This at least tries to avoid it
from getting into the problematic code path. However, it's still feasible to
get a SIGABRT, depending on thread/process timings (i.e. teamd exits, signals
the main thread to exit, and then syncd exits, and syncd calls one of the two C
functions, potentially hitting the issue).

Signed-off-by: Saikrishna Arcot <sarcot@microsoft.com>

Signed-off-by: Saikrishna Arcot <sarcot@microsoft.com>
2022-10-05 18:14:10 -07:00
Muhammad Danish
8c10851c2a
Update azure.github.io links to sonic-net.github.io (#12209)
Why I did it
azure.github.io/SONiC/ no longer works and returns 404 Not Found. Updated it to the correct sonic-net.github.io/SONiC/
2022-10-02 14:02:10 +08:00
Aryeh Feigin
2c10ebb4fe
Use warm-boot infrastructure for fast-boot (#11594)
This PR should be merged together with the sonic-utilities PR (sonic-net/sonic-utilities#2286) and sonic-sairedis PR (sonic-net/sonic-sairedis#1100).

Use redis contents from dump file in fast-reboot.

Improve fast-reboot flow by utilizing the warm-reboot infrastructure.
This followes https://github.com/sonic-net/SONiC/blob/master/doc/fast-reboot/Fast-reboot_Flow_Improvements_HLD.md
2022-09-26 09:01:49 -07:00