Commit Graph

1175 Commits

Author SHA1 Message Date
Vadym Hlushko
73999dddca
[syncd.sh] Clear semaphore before updating firmware (#15819)
Signed-off-by: Vadym Hlushko <vadymh@nvidia.com>
2023-08-06 22:31:05 -07:00
lixiaoyuner
87a8de4816
Move k8s script to docker-config-engine (#14788) (#15767)
Why I did it
To reduce the container's dependency from host system

Work item tracking
Microsoft ADO (number only):
17713469
How I did it
Move the k8s container startup script to config engine container, other than mount it from host.

How to verify it
Check file path(/usr/share/sonic/scripts/container_startup.py) inside config engine container.

Signed-off-by: Yun Li <yunli1@microsoft.com>
Co-authored-by: Qi Luo <qiluo-msft@users.noreply.github.com>
2023-07-10 08:55:45 -07:00
mssonicbld
77aeb9bf73
Revert "Revert "Fix for fast/cold-boot: call db_migrator only after old config is loaded (#14933)" (#15464)" (#15684) (#15745) 2023-07-08 06:54:44 +08:00
mssonicbld
37224030dc
[arp_update]: Fix IPv6 neighbor race condition (#15583) (#15693) 2023-07-01 09:13:06 +08:00
mssonicbld
3d99bf71b6
[mlnx-ffb.sh] Update issu-version location (#14925) (#15674) 2023-06-30 08:10:20 +08:00
Junchao-Mellanox
143814c00f Fix issue: systemctl daemon-reload would sporadically cause udev handler fail (#15253)
#### Why I did it

A workaround to back port the fix for a systemd issue.

The systemd issue: https://github.com/systemd/systemd/issues/24668
The systemd PR to fix the issue: https://github.com/systemd/systemd/pull/24673/files

The formal solution should upgrade systemd to a version that contains the fix. But, systemd is a very basic service, upgrading systemd requires heavy test. 

#### How I did it
Copy the correct systemd-udevd.service file in build time 

#### Tested branch (Please provide the tested image version)

- [x] 202211
- [ ] <!-- image version 2 -->

```
SONiC Software Version: SONiC.fix-udev.3-b65c7bdec_Internal
SONiC OS Version: 11
Distribution: Debian 11.7
Kernel: 5.10.0-18-2-amd64
Build commit: b65c7bdec
Build date: Mon Jun 19 10:54:50 UTC 2023
Built by: sw-r2d2-bot@r-build-sonic-ci02-241

Platform: x86_64-mlnx_msn4700-r0
HwSKU: ACS-MSN4700
ASIC: mellanox
ASIC Count: 1
Serial Number: MT2022X08597
Model Number: MSN4700-WS2FO
Hardware Revision: A1
Uptime: 08:10:11 up 1 min,  1 user,  load average: 1.81, 0.67, 0.24
Date: Sun 25 Jun 2023 08:10:11

Docker images:
REPOSITORY                    TAG                             IMAGE ID       SIZE
docker-fpm-frr                fix-udev.3-b65c7bdec_Internal   a7b911e7cb6f   346MB
docker-fpm-frr                latest                          a7b911e7cb6f   346MB
docker-platform-monitor       fix-udev.3-b65c7bdec_Internal   94c5178cf80b   731MB
docker-platform-monitor       latest                          94c5178cf80b   731MB
docker-orchagent              fix-udev.3-b65c7bdec_Internal   46b393e0ace8   328MB
docker-orchagent              latest                          46b393e0ace8   328MB
docker-syncd-mlnx             fix-udev.3-b65c7bdec_Internal   1f5c6c23e33a   734MB
docker-syncd-mlnx             latest                          1f5c6c23e33a   734MB
docker-sflow                  fix-udev.3-b65c7bdec_Internal   7e45992c8c59   317MB
docker-sflow                  latest                          7e45992c8c59   317MB
docker-teamd                  fix-udev.3-b65c7bdec_Internal   e4d905592cda   316MB
docker-teamd                  latest                          e4d905592cda   316MB
docker-nat                    fix-udev.3-b65c7bdec_Internal   7fe799367580   319MB
docker-nat                    latest                          7fe799367580   319MB
docker-macsec                 latest                          d702a5554171   318MB
docker-snmp                   fix-udev.3-b65c7bdec_Internal   3bce8fcf71cd   338MB
docker-snmp                   latest                          3bce8fcf71cd   338MB
docker-sonic-telemetry        fix-udev.3-b65c7bdec_Internal   f13949cbc817   597MB
docker-sonic-telemetry        latest                          f13949cbc817   597MB
docker-dhcp-relay             latest                          153d9072805d   306MB
docker-router-advertiser      fix-udev.3-b65c7bdec_Internal   aed642b9a6bc   299MB
docker-router-advertiser      latest                          aed642b9a6bc   299MB
docker-sonic-p4rt             fix-udev.3-b65c7bdec_Internal   a3cae5ca65a7   870MB
docker-sonic-p4rt             latest                          a3cae5ca65a7   870MB
docker-mux                    fix-udev.3-b65c7bdec_Internal   b81f0401b9a8   347MB
docker-mux                    latest                          b81f0401b9a8   347MB
docker-eventd                 fix-udev.3-b65c7bdec_Internal   c5917d0e801f   298MB
docker-eventd                 latest                          c5917d0e801f   298MB
docker-lldp                   fix-udev.3-b65c7bdec_Internal   fd5dc14a7976   341MB
docker-lldp                   latest                          fd5dc14a7976   341MB
docker-database               fix-udev.3-b65c7bdec_Internal   438c2715a1dd   299MB
docker-database               latest                          438c2715a1dd   299MB
docker-sonic-mgmt-framework   fix-udev.3-b65c7bdec_Internal   5c50b115fbcd   414MB
docker-sonic-mgmt-framework   latest  
```
2023-06-30 02:37:38 +08:00
Vaibhav Hemant Dixit
b62231566b Revert "Fix for fast/cold-boot: call db_migrator only after old config is loaded (#14933)" (#15464)
This reverts commit 02b17839c3.

Reverts #14933

The earlier commit caused a race condition that particularly broke cross branch warm upgrade.

Issue happens when db_migrator is still migrating the DB and finalizer is checking DB for list of components to reconcile.

If migration is not complete, finalizer get an empty list to wait for. Due to this, finalizer concludes warmboot (deletes system wide warmboot flag) and cause all the services to do cold restart.

ADO: 24274591
2023-06-17 14:32:23 +08:00
Saikrishna Arcot
8195e33120 Re-add 127.0.0.1/8 when bringing down the interfaces (#15080)
* Re-add 127.0.0.1/8 when bringing down the interfaces

With #5353, 127.0.0.1/16 was added to the lo interface, and then
127.0.0.1/8 was removed. However, when bringing down the lo interface,
like during a config reload, 127.0.0.1/16 gets removed, but 127.0.0.1/8
isn't added back to the interface. This means that there's a period of
time where 127.0.0.1 is not available at all, and services that need to
connect to 127.0.01 (such as for redis DB) will fail.

To fix this, when going down, add 127.0.0.1/8. Add this address before
the existing configuration gets removed, so that 127.0.0.1 is available
at all times.

Note that running `ifdown lo` doesn't actually bring down the loopback
interface; the interface always stays "physically" up.

Signed-off-by: Saikrishna Arcot <sarcot@microsoft.com>
2023-06-16 14:30:34 +08:00
siqbal1986
baa5175819 Added VNET_MONITOR_TABLE,BFD_SESSION_TABLE,VNET_ROUTE_TUNNEL_TABLE to the list (#14992)
* The 3 tables in state DB need to be cleaned up after SWSS restart for have consistant state.
2023-06-16 09:54:58 +08:00
Liping Xu
deb94af61b allow docker_inram to kernel cmd list (#15374)
Why I did it
After docker_inram is enabled, the docker folder's default max size is 1.5G.
It's not big enough for some tests which need to install additional docker images or install extra packages.

Work item tracking
Microsoft ADO 24199761:
How I did it
add docker_inram into cmdline_allowlist

How to verify it
sudo sh -c 'echo "docker_inram_size=3000M" >> kernel-cmdline-append'
sudo reboot and check the docker folder size
2023-06-15 14:33:58 +08:00
Sudharsan Dhamal Gopalarathnam
78977ddbce
[202211][config reload]Config Reload Enhancement (#15334)
Backporting #13969

Why I did it
Implementing code changes for sonic-net/SONiC#1203

Work item tracking
Microsoft ADO (number only):
How I did it
Removed the timers and delayed target since the delayed services would start based on event driven approach.
Cleared port table during config reload and cold reboot scenario.
Modified yang model, init_cfg.json to change has_timer to delayed

How to verify it
Added UT to verify
2023-06-12 13:22:16 +08:00
mssonicbld
5f4b54a9cd
[ci/build]: Upgrade SONiC package versions (#15361) 2023-06-06 19:46:12 +08:00
mssonicbld
e4d8355976
[ci/build]: Upgrade SONiC package versions (#15329) 2023-06-04 18:12:12 +08:00
mssonicbld
4e9569ee3b
[ci/build]: Upgrade SONiC package versions (#15165) 2023-06-03 17:22:05 +08:00
mssonicbld
084564bdde
Fix for fast/cold-boot: call db_migrator only after old config is loaded (#14933) (#15317) 2023-06-03 09:16:42 +08:00
Anish Narsian
71ecd727ac [arp_update] Resolve neighbors from config_db (#15006)
* To resolve NEIGH table entries present in CONFIG_DB. Without this change arp/ndp entries which we wish to resolve, and configured via CONFIG_DB are not resolved.
2023-05-18 09:46:56 +08:00
Nazarii Hnydyn
ba54e1e1ae
Revert "[swss/syncd] remove dependency on interfaces-config.service (#13084) (#14341)" (#15094)
This reverts commit 499f57a7f7.
2023-05-17 15:59:55 +08:00
DavidZagury
4fd2a6297f
[Secure Boot] Add Secure Boot Support (#12692) (#14963)
- Why I did it
Add Secure Boot support to SONiC OS.
Secure Boot (SB) is a verification mechanism for ensuring that code launched by a computer's UEFI firmware is trusted. It is designed to protect a system against malicious code being loaded and executed early in the boot process before the operating system has been loaded.

- How I did it
Added a signing process to sign the following components:
shim, grub, Linux kernel, and kernel modules when doing the build, and when feature is enabled in build time according to the HLD explanations (the feature is disabled by default).

- How to verify it
There are self-verifications of each boot component when building the image, in addition, there is an existing end-to-end test in sonic-mgmt repo that checks that the boot succeeds when loading a secure system (details below).

How to build a sonic image with secure boot feature: (more description in HLD)

Required to use the following build flags from rules/config:
SECURE_UPGRADE_MODE="dev"
SECURE_UPGRADE_DEV_SIGNING_KEY="/path/to/private/key.pem"
SECURE_UPGRADE_DEV_SIGNING_CERT="/path/to/cert/key.pem"
After setting those flags should build the sonic-buildimage.
Before installing the image, should prepared the setup (switch device) with the follow:
check that the device support UEFI
stored pub keys in UEFI DB

enabled Secure Boot flag in UEFI
How to run a test that verify the Secure Boot flow:
The existing test "test_upgrade_path" under "sonic-mgmt/tests/upgrade_path/test_upgrade_path", is enough to validate proper boot
You need to specify the following arguments:
Base_image_list your_secure_image
Taget_image_list your_second_secure_image
Upgrade_type cold
And run the test, basically the test will install the base image given in the parameter and then upgrade to target image by doing cold reboot and validates all the services are up and working correctly

Co-authored-by: davidpil2002 <91657985+davidpil2002@users.noreply.github.com>
2023-05-15 10:13:26 +08:00
Ying Xie
f6216435c8
Revert "Clear DNS configuration received from DHCP during networking reconfiguration in Linux. (#13516)" (#14901)
This reverts commit 5ef488f808.
2023-05-01 16:49:08 -07:00
mssonicbld
776937c48f
[ci/build]: Upgrade SONiC package versions (#14894) 2023-04-30 18:37:57 +08:00
mssonicbld
4479245996
[ci/build]: Upgrade SONiC package versions (#14889) 2023-04-29 18:57:14 +08:00
mssonicbld
99d6003717
Changes to support TSA from supervisor (#14691) (#14878) 2023-04-28 21:11:55 +08:00
mssonicbld
5ac1051f8f
Temporary WA for the issue that asic_table.json can not be rendered (#13888) (#14857) 2023-04-27 02:57:10 +08:00
mssonicbld
e0ef5b9808
[write standby] force DB connections to use unix socket to connect (#14524) (#14773) 2023-04-24 01:49:42 +08:00
mssonicbld
f53c7b66cd
[Fast-boot] Clear teamd-timer when finalizing fast-reboot (#14583) (#14774) 2023-04-23 21:07:26 +08:00
mssonicbld
70cfef252f
Delay mux/sflow/snmp timer after interface-config service (#14506) (#14771) 2023-04-23 20:52:06 +08:00
mssonicbld
72776df8ba [ci/build]: Upgrade SONiC package versions 2023-04-23 20:46:40 +08:00
Stephen Sun
1d3fa0b03c Enhance the error message output mechanism (#14384)
#### Why I did it

Enhance the error message output mechanism during swss docker creating

#### How I did it

Capture the output to stderr of `sonic-cfggen` and output it using `echo` to make sure the error message will be logged in syslog.

#### How to verify it

Manually test
2023-04-23 18:32:40 +08:00
mssonicbld
e9daace147
[ci/build]: Upgrade SONiC package versions (#14800) 2023-04-22 18:35:55 +08:00
mssonicbld
8e1bbab07d
[image_config] add rasdaemon.timer (#14300) (#14762) 2023-04-22 00:18:05 +08:00
Hua Liu
bee30fdfb9 Improve sudo cat command for RO user. (#14428)
Improve sudo cat command for RO user.

#### Why I did it
RO user can use sudo command show none syslog files.

#### How I did it
Improve sudo cat command for RO user.

#### How to verify it
Pass all UT.
Manually check fixed code work correctly.

#### Description for the changelog
Improve sudo cat command for RO user.
2023-04-21 06:32:24 +08:00
mssonicbld
aea1980b14
[ci/build]: Upgrade SONiC package versions (#14720) 2023-04-19 19:30:56 +08:00
mssonicbld
cc22d69fd3
[ci/build]: Upgrade SONiC package versions (#14680) 2023-04-16 18:59:28 +08:00
mssonicbld
b4dafae65d
[ci/build]: Upgrade SONiC package versions (#14673) 2023-04-15 20:37:33 +08:00
xumia
5dbf512cda
Support to add SONiC OS Version in device info (#14601) (#14623)
Why I did it
Cherry-pick #14601, for code conflict.
Support to add SONiC OS Version in device info.
It will be used to display the version info in the SONiC command "show version". The version is used to do the FIPS certification. We do not do the FIPS certification on a specific release, but on the SONiC OS Version.

SONiC Software Version: SONiC.master-13812.218661-7d94c0c28
SONiC OS Version: 11
Distribution: Debian 11.6
Kernel: 5.10.0-18-2-amd64
Work item tracking
Microsoft ADO (number only): 17894593
How I did it
How to verify it
2023-04-13 19:28:03 +08:00
mssonicbld
46af37f77d
[ci/build]: Upgrade SONiC package versions (#14629) 2023-04-12 19:19:12 +08:00
anamehra
e107549942 chassis-packet: resolve the missing static routes (#14593)
Why I did it
Fixes #14179
chassis-packet: missing arp entries for static routes causing high orchagent cpu usage

It is observed that some sonic-mgmt test case calls sonic-clear arp, which clears the static arp entries as well. Orchagent or arp_update process does not try to resolve the missing arp entries after clear.

How I did it
arp_update should resolve the missing arp/ndp static route
entries. Added code to check for missing entries and try ping if any
found to resolve it.

How to verify it
After boot or config reload, check ipv4 and ipv4 neigh entries to make sure all static route entries are present
manual validation:
Use sonic-clear arp and sonic-clear ndp to clear all neighbor entries
run arp_update
Check for neigh entries. All entries should be present.
Testing on T0 setup route/for test_static_route.py

The test set the STATIC_ROUTE entry in conifg db without ifname:
sonic-db-cli CONFIG_DB hmset 'STATIC_ROUTE|2.2.2.0/24' nexthop 192.168.0.18,192.168.0.25,192.168.0.23

"STATIC_ROUTE": {
    "2.2.2.0/24": {
        "nexthop": "192.168.0.18,192.168.0.25,192.168.0.23"
    }
},
Validate that the arp_update gets the proper ARP_UPDATE_VARDS using arp_update_vars.j2 template from config db and does not crash:

{ "switch_type": "", "interface": "", "pc_interface" : "PortChannel101 PortChannel102 PortChannel103 PortChannel104 ", "vlan_sub_interface": "", "vlan" : "Vlan1000", "static_route_nexthops": "192.168.0.18 192.168.0.25 192.168.0.23 ", "static_route_ifnames": "" }

validate route/test_static_route.py testcase pass.
2023-04-12 18:32:47 +08:00
mssonicbld
73766c2fa1
Finalize fast-reboot in warmboot finalizer (#14238) (#14608) 2023-04-11 22:54:56 +08:00
mssonicbld
4d0f1c1972
[ci/build]: Upgrade SONiC package versions (#14578) 2023-04-09 19:17:25 +08:00
mssonicbld
05a9ce9628
[ci/build]: Upgrade SONiC package versions (#14572) 2023-04-08 19:08:35 +08:00
mssonicbld
a3951c2041
Increase wait_for_tunnel() timeout to 90s (#14279) (#14563) 2023-04-07 16:02:01 +08:00
mssonicbld
483b9867e9
[ci/build]: Upgrade SONiC package versions (#14529) 2023-04-05 19:02:12 +08:00
mssonicbld
8863910bc8
[ci/build]: Upgrade SONiC package versions (#14492) 2023-04-02 19:28:22 +08:00
mssonicbld
f3b6860076
[ci/build]: Upgrade SONiC package versions (#14488) 2023-04-01 19:35:15 +08:00
mssonicbld
5b028dc60f
[ci/build]: Upgrade SONiC package versions (#14478) 2023-04-01 03:16:16 +08:00
mssonicbld
fe1e2b16f7
[ci/build]: Upgrade SONiC package versions (#14382) 2023-03-22 19:59:24 +08:00
xumia
0a7037641c
[Security] Fix some of vulnerability issue relative python packages (#14269) (#14352)
Why I did it
Fix some of vulnerability issue relative python packages #14269
Pillow: [CVE-2021-27921]
Wheel: [CVE-2022-40898]
lxml: [CVE-2022-2309]

How I did it
How to verify it
2023-03-22 15:42:29 +08:00
Dev Ojha
24c53a5d34 [Buffer] Added cable length config to buffer config template for EdgeZoneAggregator (#14280)
Why I did it
SONiC currently does not identify 'EdgeZoneAggregator' neighbor. As a result, the buffer profile attached to those interfaces uses the default cable length which could cause ingress packet drops due to insufficient headroom. Hence, there is a need to update the buffer templates to identify such neighbors and assign the same cable length as used by the T1.

How I did it
Modified the buffer template to identify EdgeZoneAggregator as a neighbor device type and assign it the same cable length as a T1/leaf router.

How to verify it
Unit tests pass, and manually checked on a 7260 to see the changes take effect.

Signed-off-by: dojha <devojha@microsoft.com>
2023-03-20 22:36:33 +08:00
mssonicbld
499f57a7f7
[swss/syncd] remove dependency on interfaces-config.service (#13084) (#14341) 2023-03-19 22:32:37 +08:00
Neetha John
0aacc4531a [storage_backend] Add backend acl service (#14229)
Why I did it
This PR addresses the issue mentioned above by loading the acl config as a service on a storage backend device

How I did it
The new acl service is a oneshot service which will start after swss and does some retries to ensure that the SWITCH_CAPABILITY info is present before attempting to load the acl rules. The service is also bound to sonic targets which ensures that it gets restarted during minigraph reload and config reload

How to verify it
Build an image with the following changes and did the following tests

Verified that acl is loaded successfully on a storage backend device after a switch boot up
Verified that acl is loaded successfully on a storage backend ToR after minigraph load and config reload
Verified that acl is not loaded if the device is not a storage backend ToR or the device does not have a DATAACL table

Signed-off-by: Neetha John <nejo@microsoft.com>
2023-03-19 22:32:22 +08:00