Commit Graph

317 Commits

Author SHA1 Message Date
Xincun Li
f886328897
[sn2700]: Add CPLD update. (#17376)
Why I did it
Porting #12173 to master, this will ensure all above 201911 version will have CPLD update files.

Microsoft ADO 25846069:

How I did it
Added Mellanox CPLD burn/refresh vme bundle for SN2700 platforms

How to verify it
Using update_firmware script to install private image that contains CPLD VME files with UPDATE_MLNX_CPLD_FW parameter.

Before update, the CPLD version was 15
admin@str2-msn2700-spy-1:~$ sudo fwutil show status
Chassis    Module    Component    Version                Description
---------  --------  -----------  ---------------------  ----------------------------------------
MSN2700    N/A       ONIE         2016.11-5.1.0012-9600  ONIE - Open Network Install Environment
                     SSD          0115-000               SSD - Solid-State Drive
                     BIOS         0ABZS017_01.01.213     BIOS - Basic Input/Output System
                     CPLD1        CPLD000085_REV1501     CPLD - Complex Programmable Logic Device
                     CPLD2        CPLD000043_REV0400     CPLD - Complex Programmable Logic Device
                     CPLD3        CPLD000000_REV0100     CPLD - Complex Programmable Logic Device
Do Update
admin@str2-msn2700-spy-1:/tmp$ sudo ./update_firmware sonic-mellanox-xincun-cpld.bin UPDATE_MLNX_CPLD_FW=1
Available space: 8101 MB
Warning: 'sonic_installer' command is deprecated and will be removed in the future
Please use 'sonic-installer' instead
Current FW version: SONiC-OS-20201231.110
Target FW version number: add-cpld-2.83464431-a0237f7aef
Target FW version: SONiC-OS-add-cpld-2.83464431-a0237f7aef
expr: non-integer argument
NOTICE: Reset Drop caches to index 1
Warning: 'sonic_installer' command is deprecated and will be removed in the future
Please use 'sonic-installer' instead
Image SONiC-OS-add-cpld-2.83464431-a0237f7aef is already installed. Setting it as default...
Command: grub-set-default --boot-directory=/host 0

Command: sync;sync;sync

Command: sleep 3

Done
NOTICE: sonic_installer install successfully
Mellanox platform is detected: x86_64-mlnx_msn2700-r0
Mellanox ASIC maintenance...
Mellanox ASIC firmware is up to date
Mellanox CPLD maintenance...
NOTICE: Copy Mellanox firmware upgrade utility
'/tmp/image-add-cpld-2.83464431-a0237f7aef-fs//usr/bin/mlnx-fw-upgrade.sh' -> '/usr/bin/mlnx-fw-upgrade.sh'
NOTICE: Copy Mellanox cpldupdate utility
'/tmp/image-add-cpld-2.83464431-a0237f7aef-fs//usr/bin/cpldupdate' -> '/usr/bin/cpldupdate'
Mellanox CPLD firmware upgrade is required. Installing compatible version...
Current CPLD firmware version: 15
Target CPLD firmware version: 20
NOTICE: Upgrade MLNX CPLD FW from 15 to 20
CPLD burn firmware file: /tmp/tmp.42DXmW1pQS/FUI000193_Burn_Panther_CPLD000085_REV2000_CPLD000128_REV0600_CPLD000130_REV0300.vme
CPLD refresh firmware file: /tmp/tmp.42DXmW1pQS/FUI000193_Refresh_Panther_CPLD000085_REV2000_CPLD000128_REV0600_CPLD000130_REV0300.vme
[/] CPLD update...                 Lattice Semiconductor Corp.

             ispVME(tm) V12.2 Copyright 1998-2012.

               Customized for Mellanox products.

Processing virtual machine file (/tmp/tmp.42DXmW1pQS/FUI000193_Burn_Panther_CPLD000085_REV2000_CPLD000128_REV0600_CPLD000130_REV0300.vme)......

Diamond Deployment Tool 3.12
CREATION DATE: Tue Sep 20 09:41:49 2022


[|] CPLD update...+=======+
| PASS! |
+=======+
Power cycle the device, then check CPLD version, it has changed to 20.
admin@str2-msn2700-spy-1:~$ sudo fwutil show status
Chassis    Module    Component    Version                Description
---------  --------  -----------  ---------------------  ----------------------------------------
MSN2700    N/A       ONIE         2016.11-5.1.0012-9600  ONIE - Open Network Install Environment
                     SSD          0115-000               SSD - Solid-State Drive
                     BIOS         0ABZS017_01.01.213     BIOS - Basic Input/Output System
                     CPLD1        CPLD000085_REV2000     CPLD - Complex Programmable Logic Device
                     CPLD2        CPLD000128_REV0600     CPLD - Complex Programmable Logic Device
                     CPLD3        CPLD000000_REV0000     CPLD - Complex Programmable Logic Device
2024-03-06 07:39:00 -08:00
lixiaoyuner
9fef78c4f0
Install and upgrade packages for k8s master image (#18159)
### Why I did it
- Currently inside k8s master image we are going to use AAD to do authentication related stuff with python language, we need to pre-install several azure key-vault related python packages.
- Need to upgrade cri-dockerd to 0.3.10 to support bookworm
- Need to change netcat package name to netcat-openbsd for bookworm
- Remove the unnecessary apt-get update
##### Work item tracking
- Microsoft ADO **(number only)**: 26435886

#### How I did it
- pip3 install azure-keyvault-secrets
- apt-get -y install netcat-openbsd
- upgrade the cri-dockerd version for bookworm
#### How to verify it
- pip3 list to check if azure-keyvault-secrets is installed inside image
- dpkg -l to check if netcat-openbsd is installed inside image
- systemctl status cri-dockerd.service to check if it's running well
2024-02-27 18:09:12 -08:00
rajib-dutta1
4753953ed0
Ipmitool bookworm: Fix and patch enterprise-numbers URL (#17878)
### Why I did it

ipmitool utility is used to access various HW sensors. Some platforms use "ipmitool raw " to read specific addresses. 

ipmitool_1.8.19-4_amd64.deb, that is part of bookworm has a defect. The package is missing file enterprise.txt that is expected by the "raw read" code path. 
It is so because the file the .deb tries to download at the build time does not have the necessary extension as it is available on remote server: https://www.iana.org/assignments/enterprise-numbers.txt

### How I did it

The defect had been fixed using coding changes in next unstable version of Linux. It is expected to be available in future stable version of the OS. Hence to keep the changes to minimal, the .dsc file is downloaded and only the Makefile is modified to download the correct file. To make is work as patch necessary changes are made.

#### How to verify it
Build log is attached and installation of the file is noted line #2274
When using vanilla bookworm on platforms like 5212 or 5224:
-------------------------------------------------------------------
root@sonic:~# ipmitool raw 0x04 0x2d 0x31
IANA PEN registry open failed: No such file or directory
00 c0 01 80

When fixed we should not see the above error:
--------------------------------------------------
root@sonic:/home/admin# ipmitool raw 0x04 0x2d 0x31
 00 c0 00 80

### Description for the changelog

This change is to address ipmitool raw read issue. This patch must be removed once it is available in next stable Linux release that contains the fix. 

1edb0e27e4
2024-02-26 17:49:06 -08:00
Prince George
0564ce48c9
[baseimage]: Update smartmontool version >= v7.4 (#17635)
Why I did it
Update smartmontool verson to 7.4. This is done to prevent smartmontools service to exit with non-zero exit status on platform that does not have a SSD/disk to be monitored.

Until Debian Bullseye (which had smartmontools 7.2), Debian had a patch applied that changed the default quit mode to never exit. A bug report was filed on Debian, saying that the source code patch isn't needed and could just be done via command line options, and also that smartmontools 7.3 has a new built-in option to exit with 0 if there are no monitorable devices found (which prevents systemd from treating it as a service failure). Because of that, Debian Bookworm (which also upgraded to 7.3) removed the patch and restored the default behavior of exiting with exit code 17 if there are no devices found.

Smartmontools v7.3 has this issue, because of which smartd exits with non-zero exit status even with "-q" option.

How I did it
Update the smartmontools to version 7.4 which has the fix for exiting gracefully if no monitoring device is found
Added smartd option "-q nodev0" to allow smartd to exit with status 0 if no monitoring device found
2024-02-12 09:37:12 -08:00
Stepan Blyshchak
cac73d80ca
[bootchart] enable command line recording (#17778)
Signed-off-by: Stepan Blyschak <stepanb@nvidia.com>
2024-02-12 08:36:44 -08:00
Zain Budhwani
c8439cdd4b
Disable eventd and rsyslog plugin in slim images (#17905)
### Why I did it

Disable eventd at buildtime for slim images

##### Work item tracking
- Microsoft ADO **(number only)**:26386286

#### How I did it

Add flags for disabling eventd and only copy rsyslog conf files when eventd is included and not slim image

#### How to verify it

Manual testing
2024-01-30 22:14:23 -08:00
Junchao-Mellanox
f3f2972512
Optimize syslog rate limit feature for fast and warm boot (#17458)
- Why I did it
Optimize syslog rate limit feature for fast and warm boot

- How I did it
Optimize redis start time
Don't render rsyslog.conf in container startup script
Disable containercfgd by default. There is a new CLI to enable it (in another PR)

- How to verify it
Manual test
Regression test
2023-12-20 09:12:03 +02:00
Yevhen Fastiuk
5efb123ede
[NTP] Add NTP extended configuration (#15058)
hld [#1296](https://github.com/sonic-net/SONiC/pull/1296)
closes [#1254](https://github.com/sonic-net/SONiC/issues/1254)
depends-on [#60](https://github.com/sonic-net/sonic-host-services/pull/60), [#781](https://github.com/sonic-net/sonic-swss-common/pull/781), [#2835](https://github.com/sonic-net/sonic-utilities/pull/2835), [#10749](https://github.com/sonic-net/sonic-mgmt/pull/10749)

#### Why I did it
To cover the next AIs:
* Configure NTP global parameters
* Add/remove new NTP servers
* Change the configuration for NTP servers
* Show NTP status
* Show NTP configuration

### How I did it
* Add YANG model for a new configuration
* Extend configuration templates to support new knobs

### Description for the changelog
* Add ability to configure NTP global parameters such as authentication, dhcp, admin state
* Change the configuration for NTP servers
* Add an ability to show NTP configuration

#### Link to config_db schema for YANG module changes
[NTP configuration](https://github.com/sonic-net/sonic-buildimage/blob/master/src/sonic-yang-models/doc/Configuration.md#ntp-and-syslog-servers)
2023-12-11 13:31:35 -08:00
Ashwin Hiranniah
ada7c6a72e
Add pensando platform (#15978)
This commit adds support for pensando asic called ELBA. ELBA is used in pci based cards and in smartswitches.

#### Why I did it
This commit introduces pensando platform which is based on ELBA ASIC.
##### Work item tracking
- Microsoft ADO **(number only)**:

#### How I did it
Created platform/pensando folder and created makefiles specific to pensando.
This mainly creates pensando docker (which OEM's need to download before building an image) which has all the userspace to initialize and use the DPU (ELBA ASIC).
Output of the build process creates two images which can be used from ONIE and goldfw.
Recommendation is use to use ONIE.
#### How to verify it
Load the SONiC image via ONIE or goldfw and make sure the interfaces are UP.

##### Description for the changelog
Add pensando platform support.
2023-12-04 14:41:52 -08:00
Saikrishna Arcot
cae42998dd Fix PAM module configuration issue
pam-auth-update doesn't store local configuration, and it's meant to be
used by packages only. Because libpam-systemd was getting uninstalled
afterwards, this caused tacplus to get re-enabled.

Signed-off-by: Saikrishna Arcot <sarcot@microsoft.com>
2023-11-21 18:53:15 -08:00
Saikrishna Arcot
73605a98ef Modify rasdaemon service on amd64 only
Rasdaemon is not installed on armhf or arm64

Signed-off-by: Saikrishna Arcot <sarcot@microsoft.com>
2023-11-21 18:53:15 -08:00
Saikrishna Arcot
ed5176107b Update Debian build script for Bookworm
Notable changes:
* Use j2cli from Debian repos instead of pip
* Use setuptools from Debian repos instead of pip
* Use wheel from Debian repos instead of pip
* Update grpcio and grpcio-tools python packages to match version in
  Bookworm
* Use m2crypto from Debian repos instead of pip

Signed-off-by: Saikrishna Arcot <sarcot@microsoft.com>
2023-11-21 18:53:15 -08:00
Saikrishna Arcot
34a1ac1a0f Migrate from ntp to ntpsec
Debian Bookworm no longer uses NTP, and instead uses NTPsec. Modify our
files to update/replace the NTPsec files instead.

Signed-off-by: Saikrishna Arcot <sarcot@microsoft.com>
2023-11-21 18:53:15 -08:00
ganglv
c71fb3a30f
Share image for gnmi and telemetry (#16863)
Why I did it
Share docker image to support gnmi container and telemetry container

Work item tracking
Microsoft ADO 25423918:
How I did it
Create telemetry image from gnmi docker image.
Enable gnmi container and disable telemetry container by default.

How to verify it
Run end to end test.
2023-11-08 08:54:36 +08:00
Samuel Angebault
e4a497183a
Add build option to reduce final image size (#16729)
* Reduce SONiC image filesystem size

Add a build option to reduce the image size.
The image reduction process is affecting the builds in 2 ways:
 - change some packages that are installed in the rootfs
 - apply a rootfs reduction script

The script itself will perform a few steps:
 - remove file duplication by leveraging hardlinks
   - under /usr/share/sonic since the symlinks under the device folder are lost during the build.
   - under /var/lib/docker since the files there will only be mounted ro
 - remove some extra files (man, docs, licenses, ...)
 - some image specific space reduction (only for aboot images currently)

The script can later be improved but for now it's reducing the rootfs
size by ~30%.

* restore fully featured vim package
2023-10-24 10:01:58 +08:00
Saikrishna Arcot
469aed2cf7
[baseimage]: Update openssh to 1:8.4p1-5+deb11u2 (#16826)
Openssh in Debian Bullseye has been updated to 1:8.4p1-5+deb11u2 to fix CVE-2023-38408. 
Since we're building openssh with some patches, we need to update our version as well.

Signed-off-by: Saikrishna Arcot <sarcot@microsoft.com>
2023-10-11 10:42:20 -07:00
Zain Budhwani
382d68fe42
Move syncd events to syncd.conf (#15950)
### Why I did it

syncd events should have tag sonic-events-syncd, not sonic-events-host. Created a new conf file which will have syncd events

##### Work item tracking
- Microsoft ADO **(number only)**:17747466

#### How I did it

Code change

#### How to verify it

Pipeline
2023-09-19 17:29:49 -07:00
Zain Budhwani
337a9dbcf4
Add rsyslog plugin support for frr log (#16192)
### Why I did it

Currently there is only rsyslog plugin support for /var/log/syslog, meaning we do not detect events that occur in frr logs such as BGP Hold Timer Expiry that appears in frr/bgpd.log. 

##### Work item tracking
- Microsoft ADO **(number only)**: 13366345

#### How I did it

Add omprog action to frr/bgpd.log and frr/zebra.log. Add appropriate regex for both events.

#### How to verify it

sonic-mgmt test case
2023-09-12 16:53:45 -07:00
lixiaoyuner
410e6ff406
Install pyOpenSSL package for k8s master (#16361)
### Why I did it
Need a tool to check certificate's detail of information.
##### Work item tracking
- Microsoft ADO **(number only)**: 25020260
#### How I did it
Install pyOpenSSL package for k8s master
#### How to verify it
Pip3 list to check whether it's installed when include_kubernetes_master=y
2023-08-31 22:26:24 -07:00
Kebo Liu
1626e198a8
[Mellanox] Update SDK/FW/SAI to 4.6.1020/2012.1020/SAIBuild2305.25.0.3 (#16096)
SONiC changes:
1. Support Spectrum4 ASIC FW binary building.
2. Support new SDK sx-obj-desc lib building since new SAI need it.
3. Remove SX_SCEW debian package from Mellanox SDK build since we are no longer using it (we use libxml2 instead).
4. Update SAI, SDK, FW to version 4.6.1020/2012.1020/SAIBuild2305.25.0.3

SDK/FW bug fixes
1. In SPC-1 platforms: Fastboot mode is not operational for Split port with Force mode in 50G speed
SFP modules are kept in disabled state after set LPM (low power mode) on/off for at least 3 minutes.
2. When preforming fast boot from an old SDK version (currently installed) to a newer one (target version), and the system was initially loaded with a new SDK version (past version), and the system has not been wiped, under specific conditions, the fast boot would use the past version's data and may fail.

SDK/FW Features
1. On SN2700 all ports can support y cable by credo

SAI bug Fixes
1. When creating an ACL rule with SAI_ACL_ENTRY_ATTR_FIELD_SRC_IP/SAI_ACL_ENTRY_ATTR_FIELD_DST_IP enabled, and then disabling the field by setting enable=false, a match on L3_type=IPv4 will remain programmed for the rule Issue resolved after the fix
2. Allow the max scale of virtual routers to be configure for SPC-1, SPC-2, SPC-3 when fastboot enable 
3. Remove default hash key of SRC_MAC, DST_MAC and ETH_TYPE

SAI features
1. Port init profile

- How I did it
Update SDK/FW/SAI make files

- How to verify it
Run full sonic-mgmt regression on Mellanox platform

Signed-off-by: Kebo Liu <kebol@nvidia.com>
2023-08-15 15:32:52 +03:00
ganglv
5c4ab7a7f4
Use DNS j2 for default DNS configuration (#15901)
Why I did it
Support default DNS configuration

How I did it
Use j2 template to generate default DNS configuration.

How to verify it
Run sonic-config-engine unit test.
2023-07-31 15:43:00 -07:00
lixiaoyuner
10b65d9826
Add k8s master code new (#15716)
Why I did it
Currently, k8s master image is generated from a separate branch which we created by ourselves, not release ones. We need to commit these k8s master related code to master branch for a better way to do k8s master image build out.

Work item tracking
Microsoft ADO (number only):
19998138
How I did it
Install k8s dashboard docker images
Install geneva mds and mdsd and fluentd docker images and tag them as latest, tagging latest will help create container always with the latest version
Install azure-storage-blob and azure-identity, this will help do etcd backup and restore.
Install kubernetes python client packages, this will help read worker and container state, we can send these metric to Geneva.
Remove mdm debian package, will replace it with the mdm docker image
Add k8s master entrance script, this script will be called by rc-local service when system startup. we have some master systemd services in compute-move repo, when VMM service create master VM, VMM will copy all master service files inside VM, the entrance script will setup all services according to the service files.
When the entrance script content changed, the PR build will set include_kubernetes_master=y to help do validation for k8s master related code change. The default value of include_kubernetes_master should be always n for public master branch. We will generate master image from internal master branch
How to verify it
Build with INCLUDE_KUBERNETES_MASTER = y
2023-07-25 07:44:59 +08:00
Mohammedz93
28b9299445
Support Reset factory (#14105)
#### Why I did it
Support reset factory in Sonic OS
[Reset Factory HLD](https://github.com/sonic-net/SONiC/pull/1231)
[Sonic-mgmt tests](https://github.com/sonic-net/sonic-mgmt/pull/7652)

#### How I did it
- Added new script "/usr/bin/reset-factory"
   * It generates a new config_db.json files with factory configurations
   * It clears system files and logs
   * It removes all docker containers on system except database
   * It clears non-default users and restores default users password
- Dump the default users info to a new file during build "/etc/sonic/default_users.json"
- Supported new type "Keep-basic" in "config-setup factory"
- Add new conf file for config-setup "/etc/config-setup/config-setup.conf

#### How to verify it
- Run reset-factory script with all types: < none | keep-all-config | only-config | keep-basic >
- Run config-setup factory with parameters < none | keep-basic >

#### Description for the changelog
Support reset factory in Sonic OS

#### Ensure to add label/tag for the feature raised. example - PR#2174 under sonic-utilities repo. where, Generic Config and Update feature has been labelled as GCU.
2023-07-11 16:14:17 -07:00
lixiaoyuner
ca29197184
Move k8s script to docker-config-engine (#14788)
Why I did it
To reduce the container's dependency from host system

Work item tracking
Microsoft ADO (number only):
17713469
How I did it
Move the k8s container startup script to config engine container, other than mount it from host.

How to verify it
Check file path(/usr/share/sonic/scripts/container_startup.py) inside config engine container.

Signed-off-by: Yun Li <yunli1@microsoft.com>
Co-authored-by: Qi Luo <qiluo-msft@users.noreply.github.com>
2023-07-05 14:44:48 -07:00
Stepan Blyshchak
1ebdcda9e3
[nvidia] make sure shared storage with syncd is cleared on restarts (#14547)
Why I did it
Sharing the storage of syncd with other proprietary application extensions allows them to communicate with syncd in differnt ways.
If one container wants to pass some information to syncd then shared storage can be used. However, today the shared storage isn't cleaned on restarts making it possible for syncd to read out-of-date information generated in the past.

NOTE: No plans to use it for standard SONIC dockers and we are working on removing the SDK dependency from PMON docker

How I did it
Implemented new service to clean the shared storage.

How to verify it
Do reboot/fast-reboot/warm-reboot/config-reload/systemctl restart swss and verify /tmp/ is cleaned after each restart in syncd container.

Signed-off-by: Stepan Blyschak <stepanb@nvidia.com>
2023-06-28 15:26:49 -07:00
Oleksandr Ivantsiv
475fe27c0b
[dns] Add support for static DNS configuration. (#14549)
- Why I did it
Add support for static DNS configuration. According to sonic-net/SONiC#1262 HLD.

- How I did it
Add a new resolv-config.service that is responsible for transferring configuration from Config DB into /etc/resolv.conf file that is consumed by various subsystems in Linux to resolve domain names into IP addresses.

- How to verify it
Run the image compilation. Each component related to the static DNS feature is covered with the unit tests.
Run sonic-mgmt tests. Static DNS feature will be covered with the system tests.
Install the image and run manual tests.
2023-06-22 19:12:30 +03:00
Stepan Blyshchak
e2e5b77f16
[mlnx-ffb.sh] Update issu-version location (#14925)
#### Why I did it

ISSU version check fails due to inability to mount squashfs from 202211 on 201911

#### How I did it

Put ISSU version file under platform directory

#### How to verify it

Warm-upgrade matrix:
- 201911 (with https://github.com/sonic-net/sonic-buildimage/pull/14928) to master
- 201911 (with https://github.com/sonic-net/sonic-buildimage/pull/14928) to 202211
- 202012 (with https://github.com/sonic-net/sonic-buildimage/pull/14927) to master
- 202205 (with this change cherry-picked) to master
2023-06-15 15:14:52 -07:00
Arvindsrinivasan Lakshmi Narasimhan
3f4b959d3f
[chassis] add libffi-dev for sonic-utilities (#15218)
In the PR sonic-net/sonic-utilities#2850 , for support remote access of linecards paramiko package is installed in sonic-utilities. libffi-dev needs to installed to be able to compile for armhf image

Signed-off-by: Arvindsrinivasan Lakshmi Narasimhan <arlakshm@microsoft.com>
2023-06-03 14:36:50 -07:00
Stepan Blyshchak
d73c810e86
[image_config] add rasdaemon.timer (#14300)
rasdaemon is a tool to log hardware errors. It takes 100% CPU during
boot for a few seconds. It impacts fast/warm boot by delaying control
plane restoration for 5 sec on some platforms.

Signed-off-by: Stepan Blyschak <stepanb@nvidia.com>
2023-04-17 08:58:45 -07:00
Sudharsan Dhamal Gopalarathnam
2804998766
[config reload]Config Reload Enhancement (#13969)
#### Why I did it
Implementing code changes for https://github.com/sonic-net/SONiC/pull/1203

#### How I did it
Removed the timers and delayed target since the delayed services would start based on event driven approach.
Cleared port table during config reload and cold reboot scenario.
Modified yang model, init_cfg.json to change has_timer to delayed

#### How to verify it
Running regression
2023-04-12 11:20:03 -07:00
Neetha John
f30fb6ec58
[storage_backend] Add backend acl service (#14229)
Why I did it
This PR addresses the issue mentioned above by loading the acl config as a service on a storage backend device

How I did it
The new acl service is a oneshot service which will start after swss and does some retries to ensure that the SWITCH_CAPABILITY info is present before attempting to load the acl rules. The service is also bound to sonic targets which ensures that it gets restarted during minigraph reload and config reload

How to verify it
Build an image with the following changes and did the following tests

Verified that acl is loaded successfully on a storage backend device after a switch boot up
Verified that acl is loaded successfully on a storage backend ToR after minigraph load and config reload
Verified that acl is not loaded if the device is not a storage backend ToR or the device does not have a DATAACL table

Signed-off-by: Neetha John <nejo@microsoft.com>
2023-03-16 14:18:28 -07:00
davidpil2002
8098bc4bf5
Add Secure Boot Support (#12692)
- Why I did it
Add Secure Boot support to SONiC OS.
Secure Boot (SB) is a verification mechanism for ensuring that code launched by a computer's UEFI firmware is trusted. It is designed to protect a system against malicious code being loaded and executed early in the boot process before the operating system has been loaded.

- How I did it
Added a signing process to sign the following components:
shim, grub, Linux kernel, and kernel modules when doing the build, and when feature is enabled in build time according to the HLD explanations (the feature is disabled by default).

- How to verify it
There are self-verifications of each boot component when building the image, in addition, there is an existing end-to-end test in sonic-mgmt repo that checks that the boot succeeds when loading a secure system (details below).

How to build a sonic image with secure boot feature: (more description in HLD)

Required to use the following build flags from rules/config:
SECURE_UPGRADE_MODE="dev"
SECURE_UPGRADE_DEV_SIGNING_KEY="/path/to/private/key.pem"
SECURE_UPGRADE_DEV_SIGNING_CERT="/path/to/cert/key.pem"
After setting those flags should build the sonic-buildimage.
Before installing the image, should prepared the setup (switch device) with the follow:
check that the device support UEFI
stored pub keys in UEFI DB

enabled Secure Boot flag in UEFI
How to run a test that verify the Secure Boot flow:
The existing test "test_upgrade_path" under "sonic-mgmt/tests/upgrade_path/test_upgrade_path", is enough to validate proper boot
You need to specify the following arguments:
Base_image_list your_secure_image
Taget_image_list your_second_secure_image
Upgrade_type cold
And run the test, basically the test will install the base image given in the parameter and then upgrade to target image by doing cold reboot and validates all the services are up and working correctly
2023-03-14 14:55:22 +02:00
Stepan Blyshchak
f908dfe919
[Mellanox] Place FW binaries under platform directory instead of squashfs (#13837)
Fixes #13568

Upgrade from old image always requires squashfs mount to get the next image FW binary. This can be avoided if we put FW binary under platform directory which is easily accessible after installation:

admin@r-spider-05:~$ ls /host/image-fw-new-loc.0-dirty-20230208.193534/platform/fw-SPC.mfa
/host/image-fw-new-loc.0-dirty-20230208.193534/platform/fw-SPC.mfa
admin@r-spider-05:~$ ls -al /tmp/image-fw-new-loc.0-dirty-20230208.193534-fs/etc/mlnx/fw-SPC.mfa
lrwxrwxrwx 1 root root 66 Feb  8 17:57 /tmp/image-fw-new-loc.0-dirty-20230208.193534-fs/etc/mlnx/fw-SPC.mfa -> /host/image-fw-new-loc.0-dirty-20230208.193534/platform/fw-SPC.mfa

- Why I did it
202211 and above uses different squashfs compression type that 201911 kernel can not handle. Therefore, we avoid mounting squashfs altogether with this change.

- How I did it
Place FW binary under /host/image-/platform/mlnx/, soft links in /etc/mlnx are created to avoid breaking existing scripts/automation.
/etc/mlnx/fw-SPCX.mfa is a soft link always pointing to the FW that should be used in current image
mlnx-fw-upgrade.sh is updated to prefer /host/image-/platform/mlnx location and fallback to /etc/mlnx in squashfs in case new location does not exist. This is necessary to do image downgrade.

- How to verify it
Upgrade from 201911 to master
master to 201911 downgrade
master -> master reboot
ONIE -> master boot (First FW burn)
Which release branch to backport (provide reason below if selected)
2023-03-06 13:36:43 +02:00
DavidZagury
ee1b6b3751
Remove support to Mellanox SPC4 ASIC (#13932)
- Why I did it
FW for Spectrum-4 ASIC not yet available

- How I did it
Remove in Mellanox fw make files to Spectrum-4 ASIC firmware binaries.
Remove from firmware upgrade scripts to be able Spectrum-4 ASIC.

- How to verify it
Run regression test
2023-02-23 08:25:34 +02:00
anamehra
26af468a99
Add support for platform topology configuration service (#12066)
* Add support for platform topology configuration service

    This service invokes the platform plugin for platform specific topology
    configuration.
    The path for platform plugin script is:
    /usr/share/sonic/device/$PLATFORM/plugins/config-topology.sh
    If the platform plugin is not available, this service does nothing.

Signed-off-by: anamehra <anamehra@cisco.com>
2023-02-01 12:53:45 -08:00
Saikrishna Arcot
00b11ec4e2
Replace logrotate cron file with (adapted) systemd timer file (#12921)
Debian is shipping a systemd timer unit for logrotate, but we're also
packaging in a cron job, which means both of them will run, potentially
at the same time. Remove our cron file, and add an override to the
shipped timer file to have it be run every 10 minutes.

Fixes #12392.

Signed-off-by: Saikrishna Arcot <sarcot@microsoft.com>

Signed-off-by: Saikrishna Arcot <sarcot@microsoft.com>
2022-12-08 14:13:11 -08:00
Kebo Liu
36a100083f
[Mellanox] Add support to Mellanox Spectrum-4 ASIC Firmware compiling and upgrade (#12844)
- Why I did it
Add support for compiling Spectrum-4 ASIC firmware to the SONiC image
Add support for Spectrum-4 ASIC firmware upgrade

- How I did it
Update Mellanox fw make files to include Spectrum-4 ASIC firmware binaries.
Update firmware upgrade scripts to be able to detect Spectrum-4 ASIC.

- How to verify it
Run regression tests

Signed-off-by: Kebo Liu <kebol@nvidia.com>
2022-11-29 16:38:41 +02:00
Lorne Long
7e525d96b3
[Build] Use apt-get to predictably support dependency ordered configuration of lazy packages (#12164)
Why I did it
The current lazy installer relies on a filename sort for both unpack and configuration steps. When systemd services are configured [started] by multiple packages the order is by filename not by the declared package dependencies. This can cause the start order of services to differ between first-boot and subsequent boots. Declared systemd service dependencies further exacerbate the issue (e.g. blocking the first-boot script).

The current installer leaves packages un-configured if the package dependency order does not match the filename order.

This also fixes a trivial bug in [Build]: Support to use symbol links for lazy installation targets to reduce the image size #10923 where externally downloaded dependencies are duplicated across lazy package device directories.

How I did it
Changed the staging and first-boot scripts to use apt-get:

dpkg -i /host/image-$SONIC_VERSION/platform/$platform/*.deb

becomes

apt-get -y install /host/image-$SONIC_VERSION/platform/$platform/*.deb

when dependencies are detected during image staging.

How to verify it
Apt-get critical rules

Add a Depends= to the control information of a package. Grep the syslog for rc.local between images and observe the configuration order of packages change.
2022-11-17 11:20:42 +08:00
Zain Budhwani
98ace33b0f
Add rsyslog plugin regex for select operation failure (#12659)
Added events for select op, alpm parity error, moved dhcp events from host to container
2022-11-13 21:41:33 -08:00
Mariusz Stachura
9f88d03c2b
[QoS] Support dynamic headroom calculation for Barefoot platforms (#11708)
Signed-off-by: Mariusz Stachura <mariusz.stachura@intel.com>

What I did
Adding the dynamic headroom calculation support for Barefoot platforms.

Why I did it
Enabling dynamic mode for barefoot case.

How I verified it
The community tests are adjusted and pass.
2022-10-19 09:36:56 -07:00
cytsao1
9ef8464964
[pmon] Add smartmontools to pmon docker (#11837)
* Add smartmontools to pmon docker

* Set smartmontools to install version 7.2-1 in pmon to match host; clean up smartmontools build files

* Add comments on smartmontools version for both host and pmon
2022-10-17 13:26:31 -07:00
Hua Liu
257cc96d7c
Remove swsssdk from sonic OS image and docker container image (#12323)
Remove swsssdk from sonic OS image and docker image

#### Why I did it
swsssdk is deprecated, so need remove from image.

#### How I did it
Update config file to remove swsssdk from image.

#### How to verify it
Pass all test case.

#### Which release branch to backport (provide reason below if selected)

<!--
- Note we only backport fixes to a release branch, *not* features!
- Please also provide a reason for the backporting below.
- e.g.
- [x] 202006
-->

- [ ] 201811
- [ ] 201911
- [ ] 202006
- [ ] 202012
- [ ] 202106
- [ ] 202111
- [ ] 202205

#### Description for the changelog
Remove swsssdk from sonic OS image and docker image

#### Ensure to add label/tag for the feature raised. example - PR#2174 under sonic-utilities repo. where, Generic Config and Update feature has been labelled as GCU.

#### Link to config_db schema for YANG module changes
<!--
Provide a link to config_db schema for the table for which YANG model
is defined
Link should point to correct section on https://github.com/Azure/sonic-buildimage/blob/master/src/sonic-yang-models/doc/Configuration.md
-->

#### A picture of a cute animal (not mandatory but encouraged)
2022-10-12 13:04:14 +08:00
Zain Budhwani
09fe3f467f
Add Structured Events w/ YANG Models (#12270)
Add events for dhcp-relay, bgp, syncd, & kernel.
2022-10-09 20:23:31 -07:00
Prince George
ac1d392d4c
Disable brackted-paste mode off by default (#12285)
* Disable brackted-paste mode off by default

* address review comment
2022-10-06 07:55:09 -07:00
Zain Budhwani
fd6a1b0ce2
Add events to host and create rsyslog_plugin deb pkg (#12059)
Why I did it

Create rsyslog plugin deb for other containers/host to install
Add events for bgp and host events
2022-09-21 09:20:53 -07:00
Stepan Blyshchak
e662008f72
[services] kill container on stop in warm/fast mode (#10510)
- Why I did it
To optimize stop on warm boot.

- How I did it
Added kill for containers
2022-09-19 19:34:33 +03:00
lixiaoyuner
a1b50cac41
Make client indentity by AME cert (#11946)
* Make client indentity by AME cert

* Join k8s cluster by ipv6

* Change join test cases

* Test case bug fix

* Improve read node label func

* Configure kubelet and change test cases

* For kubernetes version 1.22.2

* Fix undefine issue

Signed-off-by: Yun Li <yunli1@microsoft.com>
2022-09-16 13:13:39 +08:00
Renuka Manavalan
31e750ee0b
Fix PR build failure (#11973)
Some PR builds fails to find this file. Remove it temporarily until we root cause it
2022-09-06 15:13:05 -07:00
Zain Budhwani
6a54bc439a
Streaming structured events implementation (#11848)
With this PR in, you flap BGP and use events_tool to see the published events.
With telemetry PR #111 in and corresponding submodule update done in buildimage, one could run gnmi_cli to capture BGP flap events.
2022-09-03 07:33:25 -07:00
lixiaoyuner
8d6431e754
Add k8s master feature (#11637)
* Add k8s master feature

Signed-off-by: Yun Li <yunli1@microsoft.com>

* Update kubernetes version mistake and make variable passing clear

Signed-off-by: Yun Li <yunli1@microsoft.com>

* Add CRI-dockerd package

Signed-off-by: Yun Li <yunli1@microsoft.com>

* Update version variable passing logic

Signed-off-by: Yun Li <yunli1@microsoft.com>

* Upgrade the worker kubernetes version

Signed-off-by: Yun Li <yunli1@microsoft.com>

* Install xml file parse tool

Signed-off-by: Yun Li <yunli1@microsoft.com>

Signed-off-by: Yun Li <yunli1@microsoft.com>
2022-08-13 23:01:35 +08:00