Commit Graph

6758 Commits

Author SHA1 Message Date
arlakshm
288d667c66
[yang][multi-asic]bgp internal neighbor yang model (#10632)
closes #10157

Why I did it
Add yang model for the bgp_internal_neighbor table in config_db

How I did it
Add new yang model file and unit tests

How to verify it
UT and compile sonic_yang_models-1.0-py3-none-any.whl and sonic_yang_mgmt-1.0-py3-none-any.whl

Signed-off-by: Arvindsrinivasan Lakshmi Narasimhan <arlakshm@microsoft.com>
2022-05-04 15:05:44 -07:00
Lior Avramov
6284d136ec
[LLDP] Enhance lldmgrd Redis events handling (#10593)
Why I did it
When lldpmgrd handled events of other tables besides PORT_TABLE, error message was printed to log.

How I did it
Handle event according to its file descriptor instead of looping all registered selectables for each coming event.

How to verify it
I verified same events are being handled by printing events key and operation, before and after the change.
Also, before the change, in init flow after config reload, when lldpmgrd handled events of other tables besides PORT_TABLE, error messages were printed to log, this issue is solved now.
2022-05-04 08:21:02 -07:00
DavidZagury
799d0c0313
[YANG] Update range of supported port speeds to support 800G ports (#10687)
- Why I did it
To add support for 800G speed for port in the yang.

- How I did it
Change limitation from 400G to 800G.

- How to verify it
Set a port speed to 800G and run the yang DB validation. e.g. by using dynamic port breakout.
2022-05-04 16:56:21 +03:00
kellyyeh
243d0c73f9
[dhcp6relay] Add retry mechanism for binding socket to interface ipv6 addresses (#10712) 2022-05-02 17:14:13 -07:00
Polly Hsu
a1e76d25b7
[as7816-64x] Update installer.conf (#10418)
Co-authored-by: ecsonic <ecsonic@edge-core.com>
Why I did it
The customer report of the PCIe Bus Errors upon the SDK initialization of as7816-64x.

How I did it
Based on the internal info and discussion, update "pcie_aspm=off" into ONIE_PLATFORM_EXTRA_CMDLINE_LINUX of installer.conf to resolve it.
2022-05-02 11:09:16 -07:00
Andriy Yurkiv
0a6bb3f6f0
[yang] add yang options for Context object (#10359)
#### Why I did it
Need to pass LY_CTX_DISABLE_SEARCHDIR_CWD to Context in order to disable automatically searching for schemas in current working directory (which is by default searched automatically)

#### How I did it
add additional attribute into YANG context

#### How to verify it
Create some invalid link on switch :
1) **ln -s /usr/abc xxx**
2) run **spm list**
--> There should not be these messages:
```
libyang[1]: Unable to get information about "xxx" file in "/tmp" when searching for (sub)modules (No such file or directory)
libyang[1]: Unable to get information about "xxx" file in "/tmp" when searching for (sub)modules (No such file or directory)
libyang[1]: Unable to get information about "xxx" file in "/tmp" when searching for (sub)modules (No such file or directory)
libyang[1]: Unable to get information about "xxx" file in "/tmp" when searching for (sub)modules (No such file or directory)
```
2022-05-02 09:51:30 -07:00
Song Yuan
a9d5858da1
Fix buffer template for Arista SKU. (#10663)
Why I did it
The buffer pool & profile setting in buffer template was not correct and caused the errors like the following:

ERR swss#orchagent: :- parseReference: malformed reference:[BUFFER_PROFILE|ingress_lossless_profile]. Must not be surrounded by [ ]

How I did it
Fix the buffer pool & profile setting by removing "[]".

How to verify it
Loaded image with this fix in a switch and made sure the error was not seen anymore.
2022-05-02 09:49:42 -07:00
shlomibitton
4ec3af86af
[Fastboot] Delay PMON service for better fastboot performance (#10567)
- Why I did it
Profiling the system state on init after fast-reboot during create_switch function execution, it is possible to see few python scripts running at the same time.
This parallel execution consume CPU time and the duration of create_switch is longer than it should be.
Following this finding, and the motivation to ensure these services will not interfere in the future, PMON is delayed in 90 seconds until the system finish the init flow after fastboot.

- How I did it
Add a timer for PMON service.
Exclude for MLNX platform the start trigger of PMON when SYNCD starts in case of fastboot.
Copy the timer file to the host bin image.

- How to verify it
Run fast-reboot on MLNX platform and observe faster create_switch execution time.
2022-05-02 10:44:17 +03:00
Sumukha Tumkur Vani
80f5d36a5b
[SWSS] Update submodule (#10719)
Add the following commits:

- [orchagent, crm]: Reset crm threshold exceed count when threshold type changed 5ba6a54786c0fd9b155bb9ea2a7ed724a58aab74
- [pbh] [aclorch] Fixed a bug causes by updating the flow-counter value for the PBH rule 841f00389b338e91ddc4de460ace4ff96adfa796
- [ACL]Avoid incrementing crm count when ACL rule create fails 3d3364f9715fa05fbdf2d09b08676c3055903b84
- set remote vtep the netdev down before delete 7f53db782aed2973f4ff6807911b5a549461f3c7
- Removing Vnet with scope default 2ea8581da4ba6f97bebde4845a234d7c810e5515
2022-04-30 10:39:13 -07:00
Alexander Allen
53e5fe6a93
[Mellanox] Upgrade mellanox SDK to 4.5.1500 and mlnx-sai to 1.21.1.1 (#10675)
Update SDK/FW to 4.5.1500/2010.1500 and SAI version to 1.21.1.1

SDK/FW features:
1. Added support for Finisar DR4 (FTCD4523E2PCM) on Spectrum-2 and Spectrum-3 systems.

SAI Features:
1. ECMP overlay support for IPv6
2. BFD offloading / 4K scale
3. Host interface user traps + improved trap registration (table entry)
4. gcc11 compilation fixes
5. Read support for ACL redirect action
6. Optimize ECMP DB size
7. Buffer descriptors new defaults
8. Updated port mapping for SN2201

SAI Fixes:
1. Debug counter removal when configured with all drop reasons

- Why I did it
Upgrade Mellanox SDK and SAI versions to latest

- How I did it
Updated submodule pointers

- How to verify it
Regression tested
2022-04-29 20:50:59 +03:00
Shilong Liu
d258db8aa2
[CG] Fix CG alert about underscore version. (#10705) 2022-04-29 13:40:33 +08:00
vmittal-msft
ede1e0e889
Adjustment to ingress pool size to accomodate brcm sai (#10694) 2022-04-28 20:39:56 -07:00
Mohamed Ghoneim
e0f5333d9c
[SY] Adding exceptlionList to validation exception (#10699)
#### Why I did it
Adding exceptlionList to validation exception

#### How I did it
Check code.

#### How to verify it
Ran manually.
- Run full config validation from a KVM
- Print the thrown exception

**Before**
```
Error: Data Loading Failed
All Keys are not parsed in FEATURE
dict_keys(['telemetry'])
```
**After**
```
Error: Data Loading Failed
All Keys are not parsed in FEATURE
dict_keys(['telemetry'])
exceptionList:["'status'"]
```

#### Which release branch to backport (provide reason below if selected)

<!--
- Note we only backport fixes to a release branch, *not* features!
- Please also provide a reason for the backporting below.
- e.g.
- [x] 202006
-->

- [ ] 201811
- [ ] 201911
- [ ] 202006
- [ ] 202012
- [ ] 202106
- [ ] 202111

#### Description for the changelog
<!--
Write a short (one line) summary that describes the changes in this
pull request for inclusion in the changelog:
-->

#### Link to config_db schema for YANG module changes
<!--
Provide a link to config_db schema for the table for which YANG model
is defined
Link should point to correct section on https://github.com/Azure/SONiC/wiki/Configuration.
-->

#### A picture of a cute animal (not mandatory but encouraged)
2022-04-28 17:29:56 -07:00
xumia
6e88f05a45
Fix the build target error when building sonic-rest-api (#10693)
Why I did it
Fix target target/debs/bullseye/sonic-rest-api_1.0.1_arm64.deb not existing issue, the correct target is target/debs/bullseye/sonic-rest-api_1.0.1_armhf.deb.
Fix issue: #9896

[ FAIL LOG START ] [ target/debs/stretch/sonic-rest-api_1.0.1_amd64.deb ]
[ REASON ] :      target/debs/stretch/sonic-rest-api_1.0.1_amd64.deb does not exist   NON-EXISTENT PREREQUISITES: 
[ FLAGS  FILE    ] : []
2022-04-29 07:17:52 +08:00
shlomibitton
1d84e0d7df
[Fastboot] Delay LLDP service for better fastboot performance (#10568)
- Why I did it
Profiling the system state on init after fast-reboot during create_switch function execution, it is possible to see few python scripts running at the same time.
This parallel execution consume CPU time and the duration of create_switch is longer than it should be.
Following this finding, and the motivation to ensure these services will not interfere in the future, LLDP is delayed in 90 seconds until the system finish the init flow after fastboot.

- How I did it
Add a timer for LLDP service.
Copy the timer file to the host bin image.

- How to verify it
Run fast-reboot on MLNX platform and observe faster create_switch execution time.
This PR is dependent on PR: #10567
2022-04-28 10:35:14 +03:00
Kalimuthu-Velappan
bc30528341
Parallel building of sonic dockers using native dockerd(dood). (#10352)
Currently, the build dockers are created as a user dockers(docker-base-stretch-<user>, etc) that are
specific to each user. But the sonic dockers (docker-database, docker-swss, etc) are
created with a fixed docker name and common to all the users.

    docker-database:latest
    docker-swss:latest

When multiple builds are triggered on the same build server that creates parallel building issue because
all the build jobs are trying to create the same docker with latest tag.
This happens only when sonic dockers are built using native host dockerd for sonic docker image creation.

This patch creates all sonic dockers as user sonic dockers and then, while
saving and loading the user sonic dockers, it rename the user sonic
dockers into correct sonic dockers with tag as latest.

	docker-database:latest <== SAVE/LOAD ==> docker-database-<user>:tag

The user sonic docker names are derived from 'DOCKER_USERNAME and DOCKER_USERTAG' make env
variable and using Jinja template, it replaces the FROM docker name with correct user sonic docker name for
loading and saving the docker image.
2022-04-28 08:39:37 +08:00
Saikrishna Arcot
313cced32b
Update Docker to 20.10.14 (#10677)
* Upgrade docker version from 20.10.7 to 20.10.14, and pin containerd.io

Update the Docker engine version from 20.10.7 to 20.10.14. This brings
in some CVE and bug fixes.

Additionally, pin the version of containerd.io to a specific version,
mainly for consistency/reproducibility.

Signed-off-by: Saikrishna Arcot <sarcot@microsoft.com>

* Remove the containerd ordering change to docker.service

This appears to be already present in the current docker.service.

Signed-off-by: Saikrishna Arcot <sarcot@microsoft.com>

* Remove use of apt-key

apt-key is considered deprecated, and the current practice is to just
add the key into /etc/apt/trusted.gpg.d/.

Signed-off-by: Saikrishna Arcot <sarcot@microsoft.com>

* Upgrade docker container in Bullseye slave to 20.10.14

Signed-off-by: Saikrishna Arcot <sarcot@microsoft.com>
2022-04-27 10:20:42 -07:00
jingwenxie
850e45601b
Revert "[sonic-cfggen] make minigraph parser fail when speed and lanes are not in PORT table (#10228)" (#10683)
This reverts commit cd330f0e70.
2022-04-27 08:26:44 +08:00
ganglv
9d7387a18e
[sonic-host-services]: Fix import and invalid path (#10660)
Why I did it
Can not start sonic-hostservice

How I did it
Install python3-dbus and systemd-python, and replace invalid path

How to verify it
Start the service with below commands:
sudo systemctl start sonic-hostservice
sudo systemctl status sonic-hostservice

Signed-off-by: Gang Lv ganglv@microsoft.com
2022-04-27 07:14:51 +08:00
xumia
a06f5493b2
[Submodule]: update submodule for sonic-restapi (#10680)
Why I did it
Update submodule sonic-restapi
e83e0e8 Fix Ctype_char larger than address space issue in 32-bit armhf (#107)
2022-04-26 21:12:36 +08:00
Zhaohui Sun
cc30771f6b
Add python3 virtual environment for docker-ptf (#10599)
Why I did it
Migrate ptftests script to python3, in order to do an incremental migration, add python virtual environment firstly, install all required python packages in virtual env as well.
Then migrate ptftests scripts from python2 to python3 one by one avoid impacting non-changed scripts.

Signed-off-by: Zhaohui Sun zhaohuisun@microsoft.com

How I did it
Add python3 virtual environment for docker-ptf.
Add submodule ptf-py3 and install patched ptf 0.9.3 into virtual environment as well, two ptf issues were reported here:
p4lang/ptf#173
p4lang/ptf#174

Signed-off-by: Zhaohui Sun <zhaohuisun@microsoft.com>
2022-04-26 09:13:26 +08:00
Song Yuan
aa62e33339
[chassis] Do not configure LLDP on recirc ports (#7909)
Why I did it
Recirc port is used to only forward traffic from one asic to another asic. So it's not required to configure LLDP on it.

How I did it
Add interface prefix helper for recirc port. Similar to skip configuring LLDP on inband port, add check in lldpmgrd to skip recirc port by checking interface prefix.
2022-04-25 13:14:17 -07:00
Maxime Lorrillere
0606add017
[chassis] Get asic PCI ID from CHASSIS_STATE_DB and update asic_id in CONFIG_DB (#9681)
Asic PCI ID (PCI address) is collected by chassisd (inside pmon -
Azure/sonic-platform-daemons#175) and saved in CHASSIS_STATE_DB (in
redis_chassis). CHASSIS_STATE_DB is accessible by swss containers.

At docker-init.sh (script is called after swss container is created and before
anything that could run in swss like orchagent...), we wait until asic PCI ID
of the corresponding asic is populated by chassisd. We then update asic_id in
CONFIG_DB of asic's database.

A system supporting dynamic asic PCI ID identification requires to have a file
(empty) use_pci_id_chassis in its platform dir.

When orchagent runs, it has correct asic PCI ID in its CONFIG_DB.

Together with this PR:

Azure/sonic-platform-daemons#175
Azure/sonic-platform-common#185

Signed-off-by: Maxime Lorrillere <mlorrillere@arista.com>

Co-authored-by: Maxime Lorrillere <mlorrillere@arista.com>
2022-04-25 13:09:42 -07:00
Stephen Sun
9237950a0c
Align threshold mode of zero buffer profile of egress_lossless_pool (#10627)
On vs platform, egress_lossless_pool's mode is static.
So the corresponding profile should be of static_th as well.

Signed-off-by: Stephen Sun <stephens@nvidia.com>
2022-04-25 21:56:05 +03:00
Saikrishna Arcot
64187a1b15
Remove SSH host keys after installing the custom version of sshd (#10633)
* Remove SSH host keys after installing the custom version of sshd

Signed-off-by: Saikrishna Arcot <sarcot@microsoft.com>

* Use an override for for sshd instead of overwriting the service file

Don't overwrite upstream's .service file, and instead use an override
file for making sure the host key(s) are generated.

Signed-off-by: Saikrishna Arcot <sarcot@microsoft.com>
2022-04-25 10:38:52 -07:00
Shilong Liu
672db8d416
[submodule] Update submodule for sonic-mgmt-common (#10664)
submodule update, includes:

ec32690 CVE-2020-25614: Update xmlquery, jsonquery and xpath packages. (#58)
5156527 Showtech sonic mgmt framework: Add Management Framework functionality for "show tech-support" (#49)
2022-04-25 08:16:29 -07:00
zzhiyuan
d19a953e13
[Arista] Add 1x100G over 4 lanes configuration for 7060DX4 (#10655)
Co-authored-by: Zhi Yuan (Carl) Zhao <zyzhao@arista.com>
2022-04-25 08:11:32 -07:00
bingwang-ms
3fc3259a35
Define qos map AZURE_TUNNEL for QoS remapping of tunnel traffic (#10565)
* Add AZURE_TUNNEL map

Signed-off-by: bingwang <wang.bing@microsoft.com>
2022-04-25 15:06:10 +08:00
Longxiang Lyu
d8c27b6ed2
[YANG][vlan-sub-intf] Enforce Linux interface name length (#10646)
Why I did it
Allow portchannel vlan sub intf long name format as long as it follows Linux interface name length limit(<16).

How I did it
Modify the leaf name check.

How to verify it
Test case passes.
2022-04-25 14:44:40 +08:00
SuvarnaMeenakshi
5cd6bc4ce2
[portconfig]: Remove try block for db config initialization (#10581)
Why I did it
Provide fix for comment: https://github.com/Azure/sonic-buildimage/pull/10475/files#r847753187;
How I did it
Try exception is not required in this scenario, so remove and modify to initial db config according to single or multi-asic platforms.
How to verify it
Verified on multi-asic device.
2022-04-22 16:25:29 -07:00
Eric Zhu
869ac1d1f2
sonic-platform-modules-cel dx010: speed up dx010 platform init script (#10313)
* Optimize dx010 sonic platform init script to speed up init process
* Merge issue #10152: [warm-upgrade][202012] Slow Celestica platform init
in rc.local causes lacp-teardown fix into master branch

Signed-off-by: Eric Zhu <erzhu@celestica.com>
2022-04-22 20:36:17 +08:00
Taylor Cai
988a687182
Fix issue test_crm and test_fib (#10585)
Why I did it
Fix issue (https://github.com/Azure/sonic-buildimage/issues/9171) and (https://github.com/Azure/sonic-buildimage/issues/9236)

How I did it
Add flag in config file for get correct count of IPv6 entry.
Add init config file to set IPv4 ECMP hash on L4.

How to verify it
Compile the sonic_platform wheel for e1031, then upload to device and install the wheel, verify using testbed.
2022-04-22 20:35:52 +08:00
Yakiv Huryk
3abf383d3d
[asan] add address sanitizer support to docker-sonic-vs (#10470)
- Why I did it
To support docker-sonic-vs image with ASAN.

- How I did it
1. Made the supervisord.conf a template
2. Added the 'log_path' environment variable for ASAN-enabled daemons
3. Added supervisord.conf.j2 generation and ASAN lib to the docker-sonic-vs/Dockerfile.j2

- How to verify it
1. Made a build with ENABLE_ASAN=y
2. Run the tests, checked ASAN reports

Signed-off-by: Yakiv Huryk <yhuryk@nvidia.com>
2022-04-22 11:07:07 +03:00
Junchao-Mellanox
af5e5c4c94
[Mellanox] Adjust PSU voltage WA (#10619)
- Why I did it
InvalidPsuVolWA.run might raise exception if user power off PSU when it is running. This exception is not caught and will be raised to psud which causes psud failed to update PSU data to DB.

- How I did it
1. Change the log level when WA does not work. This could happen when user power off PSU, hence changing the log level from error to warning is better
2. Change the wait time from 5 to 1 to avoid introduce too much delay in psud. 1 second is usually enough per my test
3. Give a default return value for function get_voltage_low_threshold and get_voltage_high_threshold to avoid exception reach to psud

- How to verify it
Manual test.
Run sonic-mgmt regression
2022-04-22 11:02:30 +03:00
Richard.Yu
37debbeb38
[CG-Fix-CVE-2021-44906] Patching on thrift.0.14.1 for package minimist (#10555)
* [CG-Fix-CVE-2021-44906] Patching on thrift.0.14.1 for package minimist

Signed-off-by: richardyu-ms <richard.yu@microsoft.com>

* add more information in patch

Signed-off-by: richardyu-ms <richard.yu@microsoft.com>

* Update 0003-Remove-minimist-packages.patch

* change the thrift 0.14.1 to package download

Signed-off-by: richardyu-ms <richard.yu@microsoft.com>

* use the series file for patching

* fix a code defect
2022-04-22 09:43:16 +08:00
Samuel Angebault
ea38864235
[Arista] Update platform submodules (#10561)
Update PikeZ platform definition
Improve powercycle behavior on chassis
2022-04-21 13:14:59 -07:00
xumia
508dda6ad3
[Ci]: Support to sign image for cisco-8000 uefi secure boot (#10616)
Why I did it
[Ci]: Support to sign image for cisco-8000 uefi secure boot
2022-04-21 12:13:09 +08:00
Alexander Allen
37e2848b3f
Update sonic-sairedis submodule (#10607)
[sairedis submodule] commits:
c7cbfe8 Update SAI submodule to support python 3.7 (#1035)
2022-04-20 19:45:02 -07:00
yozhao101
e24fe9bc60
[Monit] Fix the issue which shows Monit can not reset its counter. (#10288)
Signed-off-by: Yong Zhao <yozhao@microsoft.com>

Why I did it
This PR aims to fix the Monit issue which shows Monit can't reset its counter when monitoring memory usage of telemetry container.

Specifically the Monit configuration file related to monitoring memory usage of telemetry container is as following:

  check program container_memory_telemetry with path "/usr/bin/memory_checker telemetry 419430400"
      if status == 3 for 10 times within 20 cycles then exec "/usr/bin/restart_service telemetry"
If memory usage of telemetry container is larger than 400MB for 10 times within 20 cycles (minutes), then it will be restarted.
Recently we observed, after telemetry container was restarted, its memory usage continuously increased from 400MB to 11GB within 1 hour, but it was not restarted anymore during this 1 hour sliding window.

The reason is Monit can't reset its counter to count again and Monit can reset its counter if and only if the status of monitored service was changed from Status failed to Status ok. However, during this 1 hour sliding window, the status of monitored service was not changed from Status failed to Status ok.

Currently for each service monitored by Monit, there will be an entry showing the monitoring status, monitoring mode etc. For example, the following output from command sudo monit status shows the status of monitored service to monitor memory usage of telemetry:

    Program 'container_memory_telemetry'
         status                             Status ok
         monitoring status          Monitored
         monitoring mode          active
         on reboot                      start
         last exit value                0
         last output                    -
         data collected               Sat, 19 Mar 2022 19:56:26
Every 1 minute, Monit will run the script to check the memory usage of telemetry and update the counter if memory usage is larger than 400MB. If Monit checked the counter and found memory usage of telemetry is larger than 400MB for 10 times
within 20 minutes, then telemetry container was restarted. Following is an example status of monitored service:

    Program 'container_memory_telemetry'
         status                             Status failed
         monitoring status          Monitored
         monitoring mode          active
         on reboot                      start
         last exit value                0
         last output                    -
         data collected               Tue, 01 Feb 2022 22:52:55
After telemetry container was restarted. we found memory usage of telemetry increased rapidly from around 100MB to more than 400MB during 1 minute and status of monitored service did not have a chance to be changed from Status failed to Status ok.

How I did it
In order to provide a workaround for this issue, Monit recently introduced another syntax format repeat every <n> cycles related to exec. This new syntax format will enable Monit repeat executing the background script if the error persists for a given number of cycles.

How to verify it
I verified this change on lab device str-s6000-acs-12. Another pytest PR (Azure/sonic-mgmt#5492) is submitted in sonic-mgmt repo for review.
2022-04-20 18:08:06 -07:00
Ze Gan
926e698f0a
[docker-macsec]: Fix the missing dependency of macsecmgrd in swss (#10618)
Why I did it
Missing the dependency of macsecmgrd in swss so that the MACsec feature cannot be enabled.

How I did it
Add SWSS dependency in docker-macsec.mk

How to verify it
Check the Azp of sonic-mgmt
2022-04-21 09:00:53 +08:00
Junchao-Mellanox
651ac2c840
[submodule] Update submodule for sonic-swss (#10623)
Swss Commit update:

1fd1dbf Add support for route flow counter (https://github.com/Azure/sonic-buildimage/pull/2094)
d8fadc6 [QoS] Resolve an issue in the sequence where a referenced object removed and then the referencing object deleting and then re-adding (https://github.com/Azure/sonic-buildimage/pull/2210)
eaf7264 [macsecorch]: MACsec with pfc (https://github.com/Azure/sonic-buildimage/pull/2095)
a32b611 [azp]: Reduce diff coverage to 50% threshhold (https://github.com/Azure/sonic-buildimage/issues/2227)
6301db7 [Code owner] Set owners for auto reviews (https://github.com/Azure/sonic-buildimage/issues/2229)
d1fb3dd [BFD]Retry create BFD with different source UDP port on failure (https://github.com/Azure/sonic-buildimage/pull/2225)
53620f3 [orchagent] add & remove port counters dynamically each time port was added or removed (https://github.com/Azure/sonic-buildimage/pull/2019)
cf216be Change ERR to Notice for tunnel term create fail (https://github.com/Azure/sonic-buildimage/pull/2219)
2022-04-20 17:12:28 -07:00
FuzailBrcm
122cb90695
Fix AS7726 not showing serial number in 'show platform summary' (#10489) (#10509) 2022-04-20 13:06:22 -07:00
Samuel Angebault
fb147764b5
[Arista] Fix arista-net initramfs hook (#10624)
The interface renaming logic fails if one interface is missing.
Because of the `set -e` the whole initramfs hook would abort early on
error.
This change fixes the current behavior to make sure missing interfaces
are properly skipped and ensure existing interface are renamed.
2022-04-20 10:03:05 -07:00
Junhua Zhai
128d762af3
[gearbox] Add peer gbsyncd for swss if gearbox exists (#10504)
Fix the issues #10501 and #9733

If having gearbox, we need:
    * add gbsyncd as a peer since swss also has dependency on gbsyncd
    * add service gbsyncd to FEATURE table if it is missing
2022-04-20 19:02:49 +08:00
bingwang-ms
d853c9c747
Update submodule sonic-swss-common (#10611)
Signed-off-by: bingwang <wang.bing@microsoft.com>
2022-04-20 08:25:35 +08:00
Qi Luo
936d93cbcd
Fix tagged VlanInterface if attached to multiple vlan as untagged member (#8927)
#### Why I did it
Fix several bugs:
1. If one vlan member belongs to multiple vlans, and if any of the vlans is "Tagged" type, we respect the tagged type
2. If one vlan member belongs to multiple vlans, and all of the vlans have no "Tagged" type, we override it to be a tagged member
3. make sure `vlantype_name` is assigned correctly in each iteration

#### How to verify it
1. Test the command line to parse a minigraph and make sure the output does not change.
```
./sonic-cfggen -m minigraph.mlnx20.xml
```
The minigraph is for HwSKU Mellanox-SN2700-D40C8S8.

2. Test on a DUT with HwSKU Mellanox-SN2700-D40C8S8
```
sudo config load_minigraph
show vlan brief
```
Checked the "Port Tagging" column in the output.
2022-04-19 15:47:07 -07:00
Jing Zhang
8b5d908c92
Upgrade mux container to Bullseye (#10498)
sign-off: Jing Zhang zhangjing@microsoft.com

#### Why I did it
As part of the process moving containers from buster to bullseye.

#### How I did it
1. change base image from buster to bullseye. 
2. remove unused addition to orchagent run options 

#### How to verify it
Tested building locally.
2022-04-19 09:27:45 -07:00
Saikrishna Arcot
330777e795
Image build time improvements (#10104)
* [build]: Patch debootstrap to not unmount the host's /proc filesystem

Currently, when the final image is being built (sonic-vs.img.gz,
sonic-broadcom.bin, or similar), each invocation of sudo in the
build_debian.sh script takes 0.8 seconds to run and execute the actual
command. This is because the /proc filesystem in the slave container has
been unmounted somehow. This is happening when debootstrap is running,
and it incorrectly unmounts the host's (in our case, the slave
container's) /proc filesystem because in the new image being built,
/proc is a symlink to the host's (the slave container's) /proc. Because
of that, /proc is gone, and each invocation of sudo adds 0.8 seconds
overhead. As a side effect, docker exec into the slave container during
this time will fail, because /proc/self/fd doesn't exist anymore, and
docker exec assumes that that exists.

Debootstrap has fixed this in 1.0.124 and newer, so backport the patch
that fixes this into the version that Bullseye has.

Signed-off-by: Saikrishna Arcot <sarcot@microsoft.com>

* [build_debian.sh]: Use eatmydata to speed up deb package installations

During package installations, dpkg calls fsync multiples times (for each
package) to ensure that tht efiles are written to disk, so that if
there's some system crash during package installation, then it is in at
least a somewhat recoverable state. For our use case though, we're
installing packages in a chroot in fsroot-* from a slave container and
then packaging it into an image. If there were a system crash (or even
if docker crashed), the fsroot-* directory would first be removed, and
the process would get restarted. This means that the fsync calls aren't
really needed for our use case.

The eatmydata package includes a library that will block/suppress the
use of fsync (and similar) system calls from applications and will
instead just return success, so that the application is not blocked on
disk writes, which can instead happen in the background instead as
necessary. If dpkg is run with this library, then the fsync calls that
it does will have no effect.

Therefore, install the eatmydata package at the beginning of
build_debian.sh and have dpkg be run under eatmydata for almost all
package installations/removals. At the end of the installation, remove
it, so that the final image uses dpkg as normal.

In my testing, this saves about 2-3 minutes from the image build time.

Signed-off-by: Saikrishna Arcot <sarcot@microsoft.com>

* Change ln syntax to use chroot

Signed-off-by: Saikrishna Arcot <sarcot@microsoft.com>
2022-04-19 09:22:16 -07:00
Sachin Naik
598ab99469
secureboot: Enable signing SONiC kernel (#10557)
Why I did it
To sign SONiC kernel image and allow secure boot based system to verify SONiC image before loading into the system.

How I did it
Pass following parameter to rules/config.user
Ex:
SONIC_ENABLE_SECUREBOOT_SIGNATURE := y
SIGNING_KEY := /path/to/key/private.key
SIGNING_CERT := /path/to/public/public.cert

How to verify it
Secure boot enabled system enrolled with right public key of the, image in the platform UEFI database will able to verify image before load.

Alternatively one can verify with offline sbsign tool as below.

export SBSIGN_KEY=/abc/bcd/xyz/
sbverify --cert $SBSIGN_KEY/public_cert.cert fsroot-platform-XYZ/boot/vmlinuz-5.10.0-8-2-amd64 mage

O/P:
Signature verification OK
2022-04-19 13:23:15 +08:00
Shilong Liu
0a99f87bec
[ci] Update common lib pipeline to build more distribute packages. (#10576) 2022-04-19 10:32:01 +08:00