Commit Graph

633 Commits

Author SHA1 Message Date
ganglv
9a6d6137a3
Remove UpdateGraphService feature (#18330)
### Why I did it
Remove UpdateGraphService feature from sonic image. The goal is to simplify the bootup process.

### How I did it
Remove updategraph service and updategraph script.
Update all related services, replace updategraph.service with config-setup.service.

#### How to verify it
Build and install new image, load minigraph and check all the services.
2024-03-14 13:12:26 -07:00
Xincun Li
f886328897
[sn2700]: Add CPLD update. (#17376)
Why I did it
Porting #12173 to master, this will ensure all above 201911 version will have CPLD update files.

Microsoft ADO 25846069:

How I did it
Added Mellanox CPLD burn/refresh vme bundle for SN2700 platforms

How to verify it
Using update_firmware script to install private image that contains CPLD VME files with UPDATE_MLNX_CPLD_FW parameter.

Before update, the CPLD version was 15
admin@str2-msn2700-spy-1:~$ sudo fwutil show status
Chassis    Module    Component    Version                Description
---------  --------  -----------  ---------------------  ----------------------------------------
MSN2700    N/A       ONIE         2016.11-5.1.0012-9600  ONIE - Open Network Install Environment
                     SSD          0115-000               SSD - Solid-State Drive
                     BIOS         0ABZS017_01.01.213     BIOS - Basic Input/Output System
                     CPLD1        CPLD000085_REV1501     CPLD - Complex Programmable Logic Device
                     CPLD2        CPLD000043_REV0400     CPLD - Complex Programmable Logic Device
                     CPLD3        CPLD000000_REV0100     CPLD - Complex Programmable Logic Device
Do Update
admin@str2-msn2700-spy-1:/tmp$ sudo ./update_firmware sonic-mellanox-xincun-cpld.bin UPDATE_MLNX_CPLD_FW=1
Available space: 8101 MB
Warning: 'sonic_installer' command is deprecated and will be removed in the future
Please use 'sonic-installer' instead
Current FW version: SONiC-OS-20201231.110
Target FW version number: add-cpld-2.83464431-a0237f7aef
Target FW version: SONiC-OS-add-cpld-2.83464431-a0237f7aef
expr: non-integer argument
NOTICE: Reset Drop caches to index 1
Warning: 'sonic_installer' command is deprecated and will be removed in the future
Please use 'sonic-installer' instead
Image SONiC-OS-add-cpld-2.83464431-a0237f7aef is already installed. Setting it as default...
Command: grub-set-default --boot-directory=/host 0

Command: sync;sync;sync

Command: sleep 3

Done
NOTICE: sonic_installer install successfully
Mellanox platform is detected: x86_64-mlnx_msn2700-r0
Mellanox ASIC maintenance...
Mellanox ASIC firmware is up to date
Mellanox CPLD maintenance...
NOTICE: Copy Mellanox firmware upgrade utility
'/tmp/image-add-cpld-2.83464431-a0237f7aef-fs//usr/bin/mlnx-fw-upgrade.sh' -> '/usr/bin/mlnx-fw-upgrade.sh'
NOTICE: Copy Mellanox cpldupdate utility
'/tmp/image-add-cpld-2.83464431-a0237f7aef-fs//usr/bin/cpldupdate' -> '/usr/bin/cpldupdate'
Mellanox CPLD firmware upgrade is required. Installing compatible version...
Current CPLD firmware version: 15
Target CPLD firmware version: 20
NOTICE: Upgrade MLNX CPLD FW from 15 to 20
CPLD burn firmware file: /tmp/tmp.42DXmW1pQS/FUI000193_Burn_Panther_CPLD000085_REV2000_CPLD000128_REV0600_CPLD000130_REV0300.vme
CPLD refresh firmware file: /tmp/tmp.42DXmW1pQS/FUI000193_Refresh_Panther_CPLD000085_REV2000_CPLD000128_REV0600_CPLD000130_REV0300.vme
[/] CPLD update...                 Lattice Semiconductor Corp.

             ispVME(tm) V12.2 Copyright 1998-2012.

               Customized for Mellanox products.

Processing virtual machine file (/tmp/tmp.42DXmW1pQS/FUI000193_Burn_Panther_CPLD000085_REV2000_CPLD000128_REV0600_CPLD000130_REV0300.vme)......

Diamond Deployment Tool 3.12
CREATION DATE: Tue Sep 20 09:41:49 2022


[|] CPLD update...+=======+
| PASS! |
+=======+
Power cycle the device, then check CPLD version, it has changed to 20.
admin@str2-msn2700-spy-1:~$ sudo fwutil show status
Chassis    Module    Component    Version                Description
---------  --------  -----------  ---------------------  ----------------------------------------
MSN2700    N/A       ONIE         2016.11-5.1.0012-9600  ONIE - Open Network Install Environment
                     SSD          0115-000               SSD - Solid-State Drive
                     BIOS         0ABZS017_01.01.213     BIOS - Basic Input/Output System
                     CPLD1        CPLD000085_REV2000     CPLD - Complex Programmable Logic Device
                     CPLD2        CPLD000128_REV0600     CPLD - Complex Programmable Logic Device
                     CPLD3        CPLD000000_REV0000     CPLD - Complex Programmable Logic Device
2024-03-06 07:39:00 -08:00
lixiaoyuner
9fef78c4f0
Install and upgrade packages for k8s master image (#18159)
### Why I did it
- Currently inside k8s master image we are going to use AAD to do authentication related stuff with python language, we need to pre-install several azure key-vault related python packages.
- Need to upgrade cri-dockerd to 0.3.10 to support bookworm
- Need to change netcat package name to netcat-openbsd for bookworm
- Remove the unnecessary apt-get update
##### Work item tracking
- Microsoft ADO **(number only)**: 26435886

#### How I did it
- pip3 install azure-keyvault-secrets
- apt-get -y install netcat-openbsd
- upgrade the cri-dockerd version for bookworm
#### How to verify it
- pip3 list to check if azure-keyvault-secrets is installed inside image
- dpkg -l to check if netcat-openbsd is installed inside image
- systemctl status cri-dockerd.service to check if it's running well
2024-02-27 18:09:12 -08:00
rajib-dutta1
4753953ed0
Ipmitool bookworm: Fix and patch enterprise-numbers URL (#17878)
### Why I did it

ipmitool utility is used to access various HW sensors. Some platforms use "ipmitool raw " to read specific addresses. 

ipmitool_1.8.19-4_amd64.deb, that is part of bookworm has a defect. The package is missing file enterprise.txt that is expected by the "raw read" code path. 
It is so because the file the .deb tries to download at the build time does not have the necessary extension as it is available on remote server: https://www.iana.org/assignments/enterprise-numbers.txt

### How I did it

The defect had been fixed using coding changes in next unstable version of Linux. It is expected to be available in future stable version of the OS. Hence to keep the changes to minimal, the .dsc file is downloaded and only the Makefile is modified to download the correct file. To make is work as patch necessary changes are made.

#### How to verify it
Build log is attached and installation of the file is noted line #2274
When using vanilla bookworm on platforms like 5212 or 5224:
-------------------------------------------------------------------
root@sonic:~# ipmitool raw 0x04 0x2d 0x31
IANA PEN registry open failed: No such file or directory
00 c0 01 80

When fixed we should not see the above error:
--------------------------------------------------
root@sonic:/home/admin# ipmitool raw 0x04 0x2d 0x31
 00 c0 00 80

### Description for the changelog

This change is to address ipmitool raw read issue. This patch must be removed once it is available in next stable Linux release that contains the fix. 

1edb0e27e4
2024-02-26 17:49:06 -08:00
Prince George
0564ce48c9
[baseimage]: Update smartmontool version >= v7.4 (#17635)
Why I did it
Update smartmontool verson to 7.4. This is done to prevent smartmontools service to exit with non-zero exit status on platform that does not have a SSD/disk to be monitored.

Until Debian Bullseye (which had smartmontools 7.2), Debian had a patch applied that changed the default quit mode to never exit. A bug report was filed on Debian, saying that the source code patch isn't needed and could just be done via command line options, and also that smartmontools 7.3 has a new built-in option to exit with 0 if there are no monitorable devices found (which prevents systemd from treating it as a service failure). Because of that, Debian Bookworm (which also upgraded to 7.3) removed the patch and restored the default behavior of exiting with exit code 17 if there are no devices found.

Smartmontools v7.3 has this issue, because of which smartd exits with non-zero exit status even with "-q" option.

How I did it
Update the smartmontools to version 7.4 which has the fix for exiting gracefully if no monitoring device is found
Added smartd option "-q nodev0" to allow smartd to exit with status 0 if no monitoring device found
2024-02-12 09:37:12 -08:00
Stepan Blyshchak
cac73d80ca
[bootchart] enable command line recording (#17778)
Signed-off-by: Stepan Blyschak <stepanb@nvidia.com>
2024-02-12 08:36:44 -08:00
Zain Budhwani
c8439cdd4b
Disable eventd and rsyslog plugin in slim images (#17905)
### Why I did it

Disable eventd at buildtime for slim images

##### Work item tracking
- Microsoft ADO **(number only)**:26386286

#### How I did it

Add flags for disabling eventd and only copy rsyslog conf files when eventd is included and not slim image

#### How to verify it

Manual testing
2024-01-30 22:14:23 -08:00
Kevin Wang
5516381d7e
[qos] change the template keyword from Compute-AI to ComputeAI (#17902)
Why I did it
Align the keywords to make qos configuration take effect

Work item tracking
Microsoft ADO (number only):
How I did it
Change the keyword to ComputeAI

How to verify it
reload minigraph and check the qos configuration
2024-01-29 10:10:54 +08:00
Nazarii Hnydyn
e173987a56
[swss/syncd]: Remove dependency on interfaces-config.service (#17739)
Signed-off-by: Nazarii Hnydyn <nazariig@nvidia.com>
Co-authored-by: Stepan Blyshchak <38952541+stepanblyschak@users.noreply.github.com>
2024-01-18 08:04:00 -08:00
Liping Xu
d6e0bf66a6
disable restapi for leafRouter in slim image (#17713)
Why I did it
For some devices with small memory, after upgrading to the latest image, the available memory is not enough.

Work item tracking
Microsoft ADO (number only):
26324242
How I did it
Disable restapi feature for LeafRouter which with slim image.

How to verify it
verified on 7050qx T1 (slim image), restapi disabled
verified on 7050qx T0 (slim image), restapi enabled
verified on 7260 T1 (normal image), restapi enabled
2024-01-12 15:26:06 +08:00
prabhataravind
c20abb9e28
[docker_image_ctl.j2]: swss docker initialization improvements (#17628)
* [docker_image_ctl.j2]: swss docker initialization improvements

This commit attempts to address the following:
 * Make sure swss container is indeed up and running before running any commands
   on it. In case where swss container is not fully up when swss.sh attempts to
   create swss:/ready file using "docker exec swss$DEV touch", the command can
   fail silently and can cause swssconfig to wait forever leading to missing IP
   decap configuration among other things. Add a wait so that docker commands
   are run only after swss container status is "Running"
*  Add a log when swss:/ready file is created or if the file creation fails so
   that it becomes easier to debug such scenarios in the future
* [docker_image_ctl.j2]: Use swss$DEV to accommodate multi ASIC platforms as well

Signed-off-by: Prabhat Aravind <paravind@microsoft.com>
2024-01-03 17:44:22 -08:00
Junchao-Mellanox
f3f2972512
Optimize syslog rate limit feature for fast and warm boot (#17458)
- Why I did it
Optimize syslog rate limit feature for fast and warm boot

- How I did it
Optimize redis start time
Don't render rsyslog.conf in container startup script
Disable containercfgd by default. There is a new CLI to enable it (in another PR)

- How to verify it
Manual test
Regression test
2023-12-20 09:12:03 +02:00
Yevhen Fastiuk
5efb123ede
[NTP] Add NTP extended configuration (#15058)
hld [#1296](https://github.com/sonic-net/SONiC/pull/1296)
closes [#1254](https://github.com/sonic-net/SONiC/issues/1254)
depends-on [#60](https://github.com/sonic-net/sonic-host-services/pull/60), [#781](https://github.com/sonic-net/sonic-swss-common/pull/781), [#2835](https://github.com/sonic-net/sonic-utilities/pull/2835), [#10749](https://github.com/sonic-net/sonic-mgmt/pull/10749)

#### Why I did it
To cover the next AIs:
* Configure NTP global parameters
* Add/remove new NTP servers
* Change the configuration for NTP servers
* Show NTP status
* Show NTP configuration

### How I did it
* Add YANG model for a new configuration
* Extend configuration templates to support new knobs

### Description for the changelog
* Add ability to configure NTP global parameters such as authentication, dhcp, admin state
* Change the configuration for NTP servers
* Add an ability to show NTP configuration

#### Link to config_db schema for YANG module changes
[NTP configuration](https://github.com/sonic-net/sonic-buildimage/blob/master/src/sonic-yang-models/doc/Configuration.md#ntp-and-syslog-servers)
2023-12-11 13:31:35 -08:00
Stepan Blyshchak
b61528bee9
Revert "[swss/syncd] remove dependency on interfaces-config.service (#13084) (#14341)" (#15094) (#17367)
This reverts commit 499f57a7f7.

Co-authored-by: Nazarii Hnydyn <nazariig@nvidia.com>
2023-12-07 15:20:39 -08:00
Ashwin Hiranniah
ada7c6a72e
Add pensando platform (#15978)
This commit adds support for pensando asic called ELBA. ELBA is used in pci based cards and in smartswitches.

#### Why I did it
This commit introduces pensando platform which is based on ELBA ASIC.
##### Work item tracking
- Microsoft ADO **(number only)**:

#### How I did it
Created platform/pensando folder and created makefiles specific to pensando.
This mainly creates pensando docker (which OEM's need to download before building an image) which has all the userspace to initialize and use the DPU (ELBA ASIC).
Output of the build process creates two images which can be used from ONIE and goldfw.
Recommendation is use to use ONIE.
#### How to verify it
Load the SONiC image via ONIE or goldfw and make sure the interfaces are UP.

##### Description for the changelog
Add pensando platform support.
2023-12-04 14:41:52 -08:00
Vivek
4727185648
[lldp] Clean up service start logic owing to port init start optimization (#17268)
Signed-off-by: Vivek Reddy <vkarri@nvidia.com>
2023-11-27 09:56:54 -08:00
Saikrishna Arcot
862bd794ee Fix container down event not sending out a notification
systemd changed the log message syntax for a container going down.
Update the regex for the new format.

Signed-off-by: Saikrishna Arcot <sarcot@microsoft.com>
2023-11-21 18:53:15 -08:00
Saikrishna Arcot
cae42998dd Fix PAM module configuration issue
pam-auth-update doesn't store local configuration, and it's meant to be
used by packages only. Because libpam-systemd was getting uninstalled
afterwards, this caused tacplus to get re-enabled.

Signed-off-by: Saikrishna Arcot <sarcot@microsoft.com>
2023-11-21 18:53:15 -08:00
Saikrishna Arcot
73605a98ef Modify rasdaemon service on amd64 only
Rasdaemon is not installed on armhf or arm64

Signed-off-by: Saikrishna Arcot <sarcot@microsoft.com>
2023-11-21 18:53:15 -08:00
Saikrishna Arcot
ed5176107b Update Debian build script for Bookworm
Notable changes:
* Use j2cli from Debian repos instead of pip
* Use setuptools from Debian repos instead of pip
* Use wheel from Debian repos instead of pip
* Update grpcio and grpcio-tools python packages to match version in
  Bookworm
* Use m2crypto from Debian repos instead of pip

Signed-off-by: Saikrishna Arcot <sarcot@microsoft.com>
2023-11-21 18:53:15 -08:00
Saikrishna Arcot
34a1ac1a0f Migrate from ntp to ntpsec
Debian Bookworm no longer uses NTP, and instead uses NTPsec. Modify our
files to update/replace the NTPsec files instead.

Signed-off-by: Saikrishna Arcot <sarcot@microsoft.com>
2023-11-21 18:53:15 -08:00
Ze Gan
9f08f88a0d
[dpu]: Add DPU database service (#17161)
Sub PRs:

sonic-net/sonic-host-services#84
#17191

Why I did it
According to the design, the database instances of DPU will be kept in the NPU host.

Microsoft ADO (number only): 25072889

How I did it
To follow the multiple ASIC design, I assume a new platform environment variable NUM_DPU will be defined in the /usr/share/sonic/device/$PLATFORM/platform_env.conf. Based on this number, NPU host will launch a corresponding number of instances for the DPU database.

Signed-off-by: Ze Gan <ganze718@gmail.com>
2023-11-17 09:10:03 -08:00
ganglv
c71fb3a30f
Share image for gnmi and telemetry (#16863)
Why I did it
Share docker image to support gnmi container and telemetry container

Work item tracking
Microsoft ADO 25423918:
How I did it
Create telemetry image from gnmi docker image.
Enable gnmi container and disable telemetry container by default.

How to verify it
Run end to end test.
2023-11-08 08:54:36 +08:00
Kebo Liu
31451295d5
Add special rsyslog filter for MSN2700 platform (#16684)
- Why I did it
Mellanox MSN2700 platforms have a non-functional error log: "ERR pmon#sensord: Error getting sensor data: dps460/#10: Can't read". This error is because of a firmware issue with some PSU, we are not able to upgrade the FW online. Since there is no functional impact, this error log can be ignored safely.

- How I did it
Add a new rsyslog rule to the rsyslog-container.conf.j2, if the docker name is pmon and the platform name matches, the new rule will be inserted into the docker rsyslogd.conf

- How to verify it
run regression on the MSN2700 platform to make the error log will not be printed to the syslog.

Signed-off-by: Kebo Liu <kebol@nvidia.com>
2023-10-24 17:54:44 +03:00
Samuel Angebault
e4a497183a
Add build option to reduce final image size (#16729)
* Reduce SONiC image filesystem size

Add a build option to reduce the image size.
The image reduction process is affecting the builds in 2 ways:
 - change some packages that are installed in the rootfs
 - apply a rootfs reduction script

The script itself will perform a few steps:
 - remove file duplication by leveraging hardlinks
   - under /usr/share/sonic since the symlinks under the device folder are lost during the build.
   - under /var/lib/docker since the files there will only be mounted ro
 - remove some extra files (man, docs, licenses, ...)
 - some image specific space reduction (only for aboot images currently)

The script can later be improved but for now it's reducing the rootfs
size by ~30%.

* restore fully featured vim package
2023-10-24 10:01:58 +08:00
Longxiang Lyu
072eaed2e3
[snmp] Check intfmgrd running before start (#16588)
Add pre start check to ensure intfmgrd is running.
The check will run for 20 seconds at most.

Signed-off-by: Longxiang Lyu <lolv@microsoft.com>
2023-10-13 16:00:51 -07:00
Saikrishna Arcot
469aed2cf7
[baseimage]: Update openssh to 1:8.4p1-5+deb11u2 (#16826)
Openssh in Debian Bullseye has been updated to 1:8.4p1-5+deb11u2 to fix CVE-2023-38408. 
Since we're building openssh with some patches, we need to update our version as well.

Signed-off-by: Saikrishna Arcot <sarcot@microsoft.com>
2023-10-11 10:42:20 -07:00
Vadym Hlushko
3bd396043e
[buffers] Add 'create_only_config_db_buffers.json' file for the Mellanox devices (not MSFT SKU) (#16233)
* [buffers] Add create_only_config_db_buffers.json for MLNX devices (not MSFT SKU), inject it at the start of the swss docker

Signed-off-by: vadymhlushko-mlnx <vadymh@nvidia.com>

* [buffers] Align the sonic-device_metadata.yang

Signed-off-by: vadymhlushko-mlnx <vadymh@nvidia.com>

---------

Signed-off-by: vadymhlushko-mlnx <vadymh@nvidia.com>
2023-10-03 08:35:57 -07:00
Vaibhav Hemant Dixit
07f8507911
[fast-reboot] Fix regression: set FAST_REBOOT state_db flag to support fast-reboot from older images (#16733)
Why I did it
Fix: #16699

Fast reboot is failing from old OS versions (eg., 201911 image) to latest (eg., master branch) after PR #15685

The system wide flag for FAST_REBOOT is still required when the base OS version does not support the new fast-reboot reconciliation logic (no db dump)
2023-09-28 09:37:21 -07:00
abdosi
2ce04ab91d
[chassisd]: Add alternate to the bridge interface created on chassis supervisor. (#16505)
Add alternate name eth1-midplane to Linux bridge br1 created on supervisor on some chassis platforms.
See description here: #16504

Signed-off-by: Abhishek Dosi <abdosi@microsoft.com>
2023-09-23 00:27:21 -07:00
Zain Budhwani
382d68fe42
Move syncd events to syncd.conf (#15950)
### Why I did it

syncd events should have tag sonic-events-syncd, not sonic-events-host. Created a new conf file which will have syncd events

##### Work item tracking
- Microsoft ADO **(number only)**:17747466

#### How I did it

Code change

#### How to verify it

Pipeline
2023-09-19 17:29:49 -07:00
Zain Budhwani
337a9dbcf4
Add rsyslog plugin support for frr log (#16192)
### Why I did it

Currently there is only rsyslog plugin support for /var/log/syslog, meaning we do not detect events that occur in frr logs such as BGP Hold Timer Expiry that appears in frr/bgpd.log. 

##### Work item tracking
- Microsoft ADO **(number only)**: 13366345

#### How I did it

Add omprog action to frr/bgpd.log and frr/zebra.log. Add appropriate regex for both events.

#### How to verify it

sonic-mgmt test case
2023-09-12 16:53:45 -07:00
Yaqiang Zhu
76b7cb8b64
[dhcp_server] Add dhcp_server container (#14031)
Why I did it
Add dhcp_server ipv4 feature to SONiC.
HLD: sonic-net/SONiC#1282

How I did it
To be clarify: This container is disabled by INCLUDE_DHCP_SERVER = n for now, which would cause container not build.

Add INCLUDE_DHCP_SERVER to indicate whether to build dhcp_server container
Add docker file for dhcp_server, build and install kea-dhcp4 inside container
Add template file for dhcp_server container services.
Add entry for dhcp_server to FEATURE table in config_db.
How to verify it
Build image with INCLUDE_DHCP_SERVER = y to verify:

Image can be install successfully without crush.
By config feature state dhcp_server enabled to enable dhcp_server.
2023-09-11 09:15:56 -07:00
Aman Singhal
e22136dd9f
[cisco]: Enable Kdump config by default for cisco-8000 (#16224)
Why I did it
Enabling kdump by default for cisco-8000 by setting crashkernel cmdline arg in device installer.conf.
After bootup, sonic-kdump-config wipes crashkernel arg from /host/grub/grub.cfg, and resets USE_KDUMP in /etc/default/kdump-tools, so kdump will not be enabled on subsequent reboot.

How I did it
Setting kdump enable config as part of init_cfg.json for cisco-8000 platforms.

How to verify it
Install SONiC image with kdump enabled by default (device/hwsku/installer.conf), then reboot.
Kdump config should persist on subsequent reboots and kdump loaded during bootup

Signed-off-by: Aman Singhal <amans@cisco.com>
2023-09-07 01:30:24 -07:00
lixiaoyuner
410e6ff406
Install pyOpenSSL package for k8s master (#16361)
### Why I did it
Need a tool to check certificate's detail of information.
##### Work item tracking
- Microsoft ADO **(number only)**: 25020260
#### How I did it
Install pyOpenSSL package for k8s master
#### How to verify it
Pip3 list to check whether it's installed when include_kubernetes_master=y
2023-08-31 22:26:24 -07:00
Alpesh Patel
cabdac17a5
qos template change for backend compute-ai deployment (#16150)
#### Why I did it

To enable qos config for a certain backend deployment mode, for resource-type "Compute-AI".
This deployment has the following requirement:

- Config below enabled if DEVICE_TYPE as one of backend_device_types
- Config below enabled if ResourceType is 'Compute-AI'
- 2 lossless TCs' (2, 3)
- 2 lossy TCs' (0,1)
- DSCP to TC map uses 4 DSCP code points and maps to the TCs' as follows:
   "DSCP_TO_TC_MAP": {
        "AZURE": {
             "48" : "0",
            "46" : "1",
            "3"  : "3",
            "4"  : "4"
        }
    }

- WRED profile has green {min/max/mark%} as {2M/10M/5%}

This required template change <as in the PR> in addition to the vendor qos.json.j2 file (not included here).

### How I did it

#### How to verify it
- with the above change and the vendor config change, generated the qos.json file and verified that the objective stated in "Why I did it" was met

- verified no error

### Description for the changelog
Update qos_config.j2 for Comptue-AI deployment on one of backend device type roles
2023-08-31 11:30:20 -07:00
abdosi
b6edc374ba
[build]: Added flag in sonic_version.yml to see if image is secured or non-secured (#16191)
What I did:

Added flag in sonic_version.yml to see if compiled image is secured or non-secured. This is done using build/compile time environmental variable SECURE_UPGRADE_MODE as define in HLD: https://github.com/sonic-net/SONiC/blob/master/doc/secure_boot/hld_secure_boot.md

This flag does not provide the runtime status of whether the image has booted securely or not. It's possible that compile time signed image (secured image) can boot on non secure platform.

Why I did:
Flag can be used for manual check or by the test case.

ADO: 24319390

How I verify:
Manual Verification

---
build_version: 'master-16191.346262-cdc5e72a3'
debian_version: '11.7'
kernel_version: '5.10.0-18-2-amd64'
asic_type: broadcom
asic_subtype: 'broadcom'
commit_id: 'cdc5e72a3'
branch: 'master-16191'
release: 'none'
build_date: Fri Aug 25 03:15:45 UTC 2023
build_number: 346262
built_by: AzDevOps@vmss-soni001UR5
libswsscommon: 1.0.0
sonic_utilities: 1.2
sonic_os_version: 11
secure_boot_image: 'no'

Signed-off-by: Abhishek Dosi <abdosi@microsoft.com>
2023-08-29 13:41:01 -07:00
Vivek
0652991eb8
Run db_migrator for non first-time reboots (#16116)
- Why I did it
The recent change #15685 (comment) removed the db migration for non first reboots.
This is problematic for many deployments which doesn't rely on ZTP and push a custom config_db.json
Port to older branches after #15685 is ported back

- How I did it
Re-introduce the logic to run the db_migrator on non-first boots

- How to verify it
Verified reboot and warm-reboot cases

Signed-off-by: Vivek Reddy Karri <vkarri@nvidia.com>
2023-08-22 18:36:38 +03:00
Kebo Liu
1626e198a8
[Mellanox] Update SDK/FW/SAI to 4.6.1020/2012.1020/SAIBuild2305.25.0.3 (#16096)
SONiC changes:
1. Support Spectrum4 ASIC FW binary building.
2. Support new SDK sx-obj-desc lib building since new SAI need it.
3. Remove SX_SCEW debian package from Mellanox SDK build since we are no longer using it (we use libxml2 instead).
4. Update SAI, SDK, FW to version 4.6.1020/2012.1020/SAIBuild2305.25.0.3

SDK/FW bug fixes
1. In SPC-1 platforms: Fastboot mode is not operational for Split port with Force mode in 50G speed
SFP modules are kept in disabled state after set LPM (low power mode) on/off for at least 3 minutes.
2. When preforming fast boot from an old SDK version (currently installed) to a newer one (target version), and the system was initially loaded with a new SDK version (past version), and the system has not been wiped, under specific conditions, the fast boot would use the past version's data and may fail.

SDK/FW Features
1. On SN2700 all ports can support y cable by credo

SAI bug Fixes
1. When creating an ACL rule with SAI_ACL_ENTRY_ATTR_FIELD_SRC_IP/SAI_ACL_ENTRY_ATTR_FIELD_DST_IP enabled, and then disabling the field by setting enable=false, a match on L3_type=IPv4 will remain programmed for the rule Issue resolved after the fix
2. Allow the max scale of virtual routers to be configure for SPC-1, SPC-2, SPC-3 when fastboot enable 
3. Remove default hash key of SRC_MAC, DST_MAC and ETH_TYPE

SAI features
1. Port init profile

- How I did it
Update SDK/FW/SAI make files

- How to verify it
Run full sonic-mgmt regression on Mellanox platform

Signed-off-by: Kebo Liu <kebol@nvidia.com>
2023-08-15 15:32:52 +03:00
Vaibhav Hemant Dixit
e127701660
Fix CONFIG_DB_INITIALIZED flag check logic and set/reset flag for warmboot (#15685)
* Fix CONFIG_DB_INITIALIZED flag check logic and set/reset flag for warm-reboot
* Fix db-cli usage
* Handle same image warm-reboot and generalize handling of INIT flag
* Cover boot from ONIE case: set config init flag when minigraph, config_db are missing
* Handle case: first boot of SONiC
* Check for config init flag
* Simplify logic, and do not call db_migrator for same image reboot
2023-08-04 16:00:26 -07:00
ganglv
5c4ab7a7f4
Use DNS j2 for default DNS configuration (#15901)
Why I did it
Support default DNS configuration

How I did it
Use j2 template to generate default DNS configuration.

How to verify it
Run sonic-config-engine unit test.
2023-07-31 15:43:00 -07:00
xumia
a0b3ec2df6
Support FIPS DB configuration (#15632)
Why I did it
Support FIPS DB configuration
Design Doc: sonic-net/SONiC#1372

Work item tracking
Microsoft ADO (number only): 24411148
How I did it
Add the FIPS Yang model to make FIPS configurable in ConfigDB.

How to verify it
See TestPlan: sonic-net/sonic-mgmt#9092
Build the image and run the tests: sonic-net/sonic-mgmt#9091
2023-07-28 16:54:02 +08:00
lixiaoyuner
10b65d9826
Add k8s master code new (#15716)
Why I did it
Currently, k8s master image is generated from a separate branch which we created by ourselves, not release ones. We need to commit these k8s master related code to master branch for a better way to do k8s master image build out.

Work item tracking
Microsoft ADO (number only):
19998138
How I did it
Install k8s dashboard docker images
Install geneva mds and mdsd and fluentd docker images and tag them as latest, tagging latest will help create container always with the latest version
Install azure-storage-blob and azure-identity, this will help do etcd backup and restore.
Install kubernetes python client packages, this will help read worker and container state, we can send these metric to Geneva.
Remove mdm debian package, will replace it with the mdm docker image
Add k8s master entrance script, this script will be called by rc-local service when system startup. we have some master systemd services in compute-move repo, when VMM service create master VM, VMM will copy all master service files inside VM, the entrance script will setup all services according to the service files.
When the entrance script content changed, the PR build will set include_kubernetes_master=y to help do validation for k8s master related code change. The default value of include_kubernetes_master should be always n for public master branch. We will generate master image from internal master branch
How to verify it
Build with INCLUDE_KUBERNETES_MASTER = y
2023-07-25 07:44:59 +08:00
Junchao-Mellanox
05f9c5c297
Fix issue: set delayed attribute to true for platform monitor service (#15816)
There is a redundant line in init_cfg.json.j2. It would cause pmon service always has "delayed=False". However, we know that PMON has a timer now. So, I try to fix it here.
2023-07-24 08:30:35 -07:00
vmittal-msft
fea10546f2
Update WRED profile on system ports (#15612)
* Update WRED profile on system ports
2023-07-19 15:00:39 -07:00
Mohammedz93
28b9299445
Support Reset factory (#14105)
#### Why I did it
Support reset factory in Sonic OS
[Reset Factory HLD](https://github.com/sonic-net/SONiC/pull/1231)
[Sonic-mgmt tests](https://github.com/sonic-net/sonic-mgmt/pull/7652)

#### How I did it
- Added new script "/usr/bin/reset-factory"
   * It generates a new config_db.json files with factory configurations
   * It clears system files and logs
   * It removes all docker containers on system except database
   * It clears non-default users and restores default users password
- Dump the default users info to a new file during build "/etc/sonic/default_users.json"
- Supported new type "Keep-basic" in "config-setup factory"
- Add new conf file for config-setup "/etc/config-setup/config-setup.conf

#### How to verify it
- Run reset-factory script with all types: < none | keep-all-config | only-config | keep-basic >
- Run config-setup factory with parameters < none | keep-basic >

#### Description for the changelog
Support reset factory in Sonic OS

#### Ensure to add label/tag for the feature raised. example - PR#2174 under sonic-utilities repo. where, Generic Config and Update feature has been labelled as GCU.
2023-07-11 16:14:17 -07:00
Vaibhav Hemant Dixit
ddb3086620
Revert "Revert "Fix for fast/cold-boot: call db_migrator only after old config is loaded (#14933)" (#15464)" (#15684)
This reverts commit 9649a44470.
2023-07-06 17:34:35 -07:00
lixiaoyuner
ca29197184
Move k8s script to docker-config-engine (#14788)
Why I did it
To reduce the container's dependency from host system

Work item tracking
Microsoft ADO (number only):
17713469
How I did it
Move the k8s container startup script to config engine container, other than mount it from host.

How to verify it
Check file path(/usr/share/sonic/scripts/container_startup.py) inside config engine container.

Signed-off-by: Yun Li <yunli1@microsoft.com>
Co-authored-by: Qi Luo <qiluo-msft@users.noreply.github.com>
2023-07-05 14:44:48 -07:00
Stepan Blyshchak
1ebdcda9e3
[nvidia] make sure shared storage with syncd is cleared on restarts (#14547)
Why I did it
Sharing the storage of syncd with other proprietary application extensions allows them to communicate with syncd in differnt ways.
If one container wants to pass some information to syncd then shared storage can be used. However, today the shared storage isn't cleaned on restarts making it possible for syncd to read out-of-date information generated in the past.

NOTE: No plans to use it for standard SONIC dockers and we are working on removing the SDK dependency from PMON docker

How I did it
Implemented new service to clean the shared storage.

How to verify it
Do reboot/fast-reboot/warm-reboot/config-reload/systemctl restart swss and verify /tmp/ is cleaned after each restart in syncd container.

Signed-off-by: Stepan Blyschak <stepanb@nvidia.com>
2023-06-28 15:26:49 -07:00
Oleksandr Ivantsiv
475fe27c0b
[dns] Add support for static DNS configuration. (#14549)
- Why I did it
Add support for static DNS configuration. According to sonic-net/SONiC#1262 HLD.

- How I did it
Add a new resolv-config.service that is responsible for transferring configuration from Config DB into /etc/resolv.conf file that is consumed by various subsystems in Linux to resolve domain names into IP addresses.

- How to verify it
Run the image compilation. Each component related to the static DNS feature is covered with the unit tests.
Run sonic-mgmt tests. Static DNS feature will be covered with the system tests.
Install the image and run manual tests.
2023-06-22 19:12:30 +03:00