Commit Graph

5848 Commits

Author SHA1 Message Date
mssonicbld
bb0c71246d
[ci/build]: Upgrade SONiC package versions (#10906) 2022-05-23 21:40:47 +00:00
Volodymyr Samotiy
1944f309de
[202111] [Mellanox] Update SAI to 1.21.1.1 and SDK/FW to 4.5.2262/xx.2010.2262 (#10881)
- Why I did it
To include latest fixes:
1. Warmboot | When trying to reconfigure the Flex Parser header and Flex transition parameters after ISSU, the switch returned an error even if the configuration was identical to the one applied before performing the ISSU.
2. Link Up | When toggling many ports of the Spectrum devices while raising 10GbE link up and link maintenance is enabled, the switch may get stuck and may need to be rebooted.
3. Shared buffer | While moving from lossless to lossy while shared headroom was used, reduction of the shared headroom can only be done prior to pool type change and when shared headroom is not utilized.
4. Added support for Finisar DR4 (FTCD4523E2PCM) on Spectrum-2 and Spectrum-3 systems.

SAI
1. ECMP overlay support for IPv4 and IPv6
2. BFD offloading / 4K scale

SAI fixes
1. Reduce the verbosity of the print when a packet ingresses on an invalid port
2. Added support for Host table entry removal API to remove registration of a trap to a channel

- How I did it
Updated SAI & SDK submodules along with the relevant Makefiles

- How to verify it
Build an image and run tests from "sonic-mgmt".

Signed-off-by: Volodymyr Samotiy <volodymyrs@nvidia.com>
2022-05-22 09:51:48 +03:00
Stephen Sun
2aa1c2f437
Fix issue: error message from system-health daemon is observed during system starting (#10843)
- Why I did it
Error message: "ERR healthd: Failed to read from file /var/run/hw-management/led/led_status_capability" is observed during system starting
The system-health daemon waits for 5 minutes before it starts to run.
During this time, the only thing it does is set the LED.
However, the corresponding sysfs file is not yet ready when it is read, which causes the error message.

- How I did it
Defer system-health daemon until hw-management service starts

- How to verify it
Run regression test

Signed-off-by: Stephen Sun <stephens@nvidia.com>
2022-05-21 10:13:49 +03:00
ganglv
f58fec5d95
[sonic-cfggen]: Fix broken UT (#10863)
Why I did it
UT for sonic-config-engine is broken.

How I did it
Remove yang validation.

How to verify it
Run UT for sonic-config-engine.

Signed-off-by: Gang Lv <ganglv@microsoft.com>
2022-05-18 16:44:50 +08:00
Judy Joseph
9eadb98341 Update the submodules
sonic-utilities
0225195 Accept 0 for queue and dscp (#2162)
282faf0 [show][vrf]Fixing show vrf to include vlan subinterface (#2158)
f3f1b11 Validate destination port is not LAG (#2053)

sonic-platform-common
0f6cccd [sonic_ssd] Nokia-7215: "show platform ssdhealth" not showing health percent (#279)
2022-05-15 23:26:43 -07:00
ganglv
6a9ef8c1de [sonic-cfggen]: Update UT to run yang validation (#9700)
Why I did it
Config db schema generated by minigraph should run yang validation.

How I did it
Modify run_script to add yang validation.

How to verify it
Run sonic-config-engine unit test.

Signed-off-by: Gang Lv <ganglv@microsoft.com>
2022-05-15 23:13:12 -07:00
xumia
77ae1e8198 [Ci] Support to trigger a pipeline to download and publish artifacts to storage (#10820)
Why I did it
Support triggering a pipeline that downloads and publishes artifacts to storage and a container registry.
Support specifying patterns that select which docker images to upload.

How I did it
Pass the pipeline information and the artifact information as pipeline parameters to the pipeline that will be triggered. This decouples artifact generation from the publish logic: how and where the artifacts and docker images are published depends on the triggered pipeline.

How to verify it
2022-05-15 23:13:01 -07:00
Vivek R
d628430148 Removed platform specific reboot files for mellanox simx platforms (#10806)
- Why I did it
The platform_reboot files for simx don't do anything apart from calling /sbin/reboot, which is done anyway in the /usr/local/bin/reboot script, i.e. the parent script that calls the platform specific reboot scripts if present.

Moreover, /sbin/reboot invoked in the platform specific reboot script is a non-blocking call, so it returns to the original script (although /sbin/reboot does its job in the background) and we see messages like this.

Signed-off-by: Vivek Reddy Karri <vkarri@nvidia.com>
2022-05-15 23:12:55 -07:00
kellyyeh
d03ede7ba0 [dhcp_relay] Remove dhcp6mon (#10467) 2022-05-15 23:12:48 -07:00
Saikrishna Arcot
bad29e535b Fix calculation of $(1)_DEP_PKGS_SHA in Makefile.cache (#10764)
In Makefile.cache, for $(1)_DEP_PKGS_SHA, the intention is to include
the DEP_MOD_SHA and MOD_HASH of each of the current package's
dependencies. However, there's a level of dereferencing missing; instead
of grabbing the value of $(dfile)_DEP_MOD_SHA, it is literally using the
variable name $(dfile)_DEP_MOD_SHA. This means that the value of this
variable will not change when some dependency changes.

The impact of this is in transitive dependencies. For a specific
example, if there is some change in sairedis, then sairedis will be
rebuilt (because there's a change within that component), and swss will
be rebuilt (because it's a direct dependency), but
docker-swss-layer-buster will not get rebuilt, because only the direct
dependencies are effectively being checked, and those aren't changing.
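
The missing level of dereferencing can be illustrated outside of make. The following is a minimal Python sketch, not the Makefile.cache logic itself: the `variables` dictionaries and the two `dep_sha_*` helpers are hypothetical stand-ins for make variables and for the hash computation.

```python
import hashlib


def sha(text):
    """Return the SHA-1 hex digest of a string."""
    return hashlib.sha1(text.encode()).hexdigest()


def dep_sha_by_name(deps):
    # Buggy variant: hashes the variable *names*, which never change,
    # so a change in a dependency is invisible.
    return sha(" ".join(f"{dep}_DEP_MOD_SHA" for dep in deps))


def dep_sha_by_value(deps, variables):
    # Fixed variant: dereferences each name to its current value.
    return sha(" ".join(variables[f"{dep}_DEP_MOD_SHA"] for dep in deps))


# Hypothetical state before and after a change reaches swss.
before = {"swss_DEP_MOD_SHA": "aaa"}
after = {"swss_DEP_MOD_SHA": "bbb"}

print(dep_sha_by_name(["swss"]) == dep_sha_by_name(["swss"]))
print(dep_sha_by_value(["swss"], before) == dep_sha_by_value(["swss"], after))
```

With the name-based hash, a package that depends on swss sees the same value before and after the change, which is why it would not be rebuilt.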

Signed-off-by: Saikrishna Arcot <sarcot@microsoft.com>
2022-05-15 23:11:05 -07:00
Marty Y. Lok
b1c3ab73ca [VoQ][config] Multiasic Supervisor card fails to load config_db#.json in chassis when system is reboot (#10106)
Supervisor card fails to load config_db#.json in chassis when the system reboots.
This is an intermittent issue. Fixes #10105
2022-05-15 23:11:01 -07:00
Sudharsan Dhamal Gopalarathnam
df660f20be [caclmgrd]Added logic to allow BFD port numbers (#10735)
* [caclmgrd]Added logic to allow BFD port numbers
2022-05-15 23:10:57 -07:00
Dror Prital
51d675c604
[202111][submodule] Update sonic-utilities submodule (#10816)
Revert "[scripts/fast-reboot] Shutdown remaining containers through systemd (Azure/sonic-utilities#2133)" (Azure/sonic-utilities#2166)
2022-05-15 14:01:41 +03:00
Shilong Liu
339e68e1dd
[ci] Support multi tags when pushing docker image (#10771) (#10789) 2022-05-11 14:08:50 +08:00
Junchao-Mellanox
d0e7d9a01d
[YANG] Fix issue: Non compliant leaf list in config_db schema (#10291) (#10768)
Fix issue: Non compliant leaf list in config_db schema: https://github.com/Azure/sonic-buildimage/issues/9801

The basic flow of DPB is like:
1.	Transfer config db json value to YANG json value, name it “yangIn”
2.	Validate “yangIn” by libyang
3.	Generate a YANG json value to represent the target configuration, name it “yangTarget”
4.	Do diff between “yangIn” and “yangTarget”
5.	Apply the diff to CONFIG DB json and save it back to DB

The fix:
•	For step #1, if the value of a leaf-list field is a string, convert it to a list by splitting it on “,”; the purpose is to make step #2 pass. We also need to save <table_name>.<key>.<field_name> to a set named “leaf_list_with_string_value_set”.
•	For step #5, loop over “leaf_list_with_string_value_set” and change those fields back to strings.
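
The two bullets above can be sketched as follows. This is a minimal illustration and not the code in sonic_yang_ext.py: the function names, the `leaf_list_fields` argument and the sample field `adv_speeds` are assumptions, and the set stores tuples rather than dotted strings so that keys containing "." stay unambiguous.

```python
def to_yang_input(config, leaf_list_fields):
    """Split string-valued leaf-list fields into lists (step #1).

    leaf_list_fields is a hypothetical set of (table, field) pairs that
    the YANG model declares as leaf-lists.
    """
    converted = {}
    leaf_list_with_string_value_set = set()
    for table, entries in config.items():
        converted[table] = {}
        for key, fields in entries.items():
            converted[table][key] = {}
            for name, value in fields.items():
                if (table, name) in leaf_list_fields and isinstance(value, str):
                    value = value.split(",")
                    leaf_list_with_string_value_set.add((table, key, name))
                converted[table][key][name] = value
    return converted, leaf_list_with_string_value_set


def restore_strings(config, leaf_list_with_string_value_set):
    """Join the recorded fields back into strings before saving (step #5)."""
    for table, key, name in leaf_list_with_string_value_set:
        value = config.get(table, {}).get(key, {}).get(name)
        if isinstance(value, list):
            config[table][key][name] = ",".join(value)
    return config


sample = {"PORT": {"Ethernet0": {"adv_speeds": "10000,25000", "mtu": "9100"}}}
yang_in, marked = to_yang_input(sample, {("PORT", "adv_speeds")})
print(yang_in["PORT"]["Ethernet0"]["adv_speeds"])
print(restore_strings(yang_in, marked) == sample)
```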

1. Manual test
2. Changed sample config DB and unit test passed

Conflicts:
	src/sonic-yang-mgmt/sonic_yang_ext.py
2022-05-09 07:41:27 -07:00
mssonicbld
a4283019cd
[ci/build]: Upgrade SONiC package versions (#10724) 2022-05-08 23:13:04 +00:00
Judy Joseph
375e20338a Update submodules sonic-snmpagent, sonic-swss
079f80a (HEAD -> 202111, origin/202111) Fix: if routestr does not exist, skip (#257)
8fd0fe1 Fix: not to use blocking get_all() after keys() (#255)
981107a Add VoQ Recirc interface (i.e., Ethernet-Rec) to interface maps for S… (#244)

f4ecfb6 (HEAD -> 202111, origin/202111) Removing Vnet with scope default (#2239)
2022-05-07 23:28:17 -07:00
Junchao-Mellanox
4dabc46d82 Fix race condition between networking service and interface-config service (#10573)
Why I did it
The PR aims to fix a bug where the mgmt port eth0 may lose its IP even if the user configured a static IP for eth0. The issue is not always reproducible; the reproducing flow is:

1. Systemd starts the networking service, which runs a DHCP-based configuration and is assigned an IP from DHCP.
2. Systemd starts the interface-config service, which depends on the networking service.
3. The interface-config service runs the command “ifdown --force eth0”, but the networking service is still running, so this line fails with the error: “error: Another instance of this program is already running.”. This error is printed by the ifupdown2 lib, which is the main process of the networking service. So ifdown does not actually work here, and the IP of eth0 is not brought down.
4. The interface-config service updates /etc/networking/interface to the static configuration.
5. The interface-config service runs the command “systemctl restart networking”. This command kills the previous networking related processes (log: networking.service: Main process exited, code=killed, status=15/TERM) and tries to reconfigure the IP address with the static configuration. But it detects that the configured IP and the existing IP are the same, and it does not really configure the IP in the kernel. Hence, the IP is still the one obtained from DHCP. (This could be a bug in ifupdown2: the previous IP is from DHCP, the new IP is a static IP, and it treats them as the same instead of re-configuring the IP.)
6. When the lease of the IP expires, the IP of eth0 is removed by the kernel and the issue reproduces.

The issue is not always reproducible because the networking service usually runs fast enough that it won't hit step #3.

How I did it
Check the networking service state before running "ifdown --force eth0", and wait for it to finish if it is activating.
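
The wait can be sketched in Python as below. This is an illustration under assumptions rather than the change itself: the function names, the 60-second timeout and the 1-second polling interval are hypothetical. The sketch relies only on `systemctl is-active`, which prints the unit state (for example "active" or "activating").

```python
import subprocess
import time


def networking_state():
    """Return the state printed by `systemctl is-active networking`."""
    result = subprocess.run(
        ["systemctl", "is-active", "networking"],
        capture_output=True,
        text=True,
    )
    return result.stdout.strip()


def wait_while_activating(get_state=networking_state, timeout=60.0,
                          interval=1.0, sleep=time.sleep):
    """Poll until the service leaves the "activating" state.

    Returns True once the state is anything other than "activating",
    or False if the (hypothetical) timeout elapses first.
    """
    waited = 0.0
    while get_state() == "activating":
        if waited >= timeout:
            return False
        sleep(interval)
        waited += interval
    return True
```

A caller would run `wait_while_activating()` first and only then invoke ifdown. The state getter and the sleep function are parameters so the loop can be exercised without systemd.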

How to verify it
Manual test.
2022-05-07 23:17:07 -07:00
kellyyeh
0a3217004c [dhcp6relay] Add dhcpv6 option check (#10486) 2022-05-07 23:17:03 -07:00
Aravind Mani
f907f19064 DellEMC: S6000,S6100 SFP refactor (#9016)
* DellEMC: S6000,S6100 SFP refactor
2022-05-07 23:16:57 -07:00
Lior Avramov
3d3eb1fb53 [LLDP] Enhance lldpmgrd Redis events handling (#10593)
Why I did it
When lldpmgrd handled events of other tables besides PORT_TABLE, an error message was printed to the log.

How I did it
Handle each event according to its file descriptor instead of looping over all registered selectables for each incoming event.
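
Dispatching by file descriptor can be sketched as follows. This is a generic illustration that uses Python's standard `select` module on pipes as a stand-in; it is not the lldpmgrd code, and the handler names and table labels are hypothetical.

```python
import os
import select


def dispatch_ready(handlers, timeout=1.0):
    """Call only the handler registered for each readable descriptor.

    handlers maps a file descriptor to a callable that takes that
    descriptor. No handler runs for a descriptor that has no event.
    """
    ready, _, _ = select.select(list(handlers), [], [], timeout)
    return [handlers[fd](fd) for fd in ready]


# Two pipes stand in for two subscribed tables.
port_r, port_w = os.pipe()
other_r, other_w = os.pipe()

handlers = {
    port_r: lambda fd: ("PORT_TABLE", os.read(fd, 16)),
    other_r: lambda fd: ("OTHER_TABLE", os.read(fd, 16)),
}

os.write(other_w, b"SET")
print(dispatch_ready(handlers, timeout=0.1))
```

Only the handler for the second pipe runs here, so an event on one table is never passed to the handler of another table.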

How to verify it
I verified that the same events are handled by printing the event key and operation before and after the change.
Also, before the change, in the init flow after config reload, error messages were printed to the log when lldpmgrd handled events of tables other than PORT_TABLE; this issue is now solved.
2022-05-07 23:16:48 -07:00
kellyyeh
78031eb863 [dhcp6relay] Add retry mechanism for binding socket to interface ipv6 addresses (#10712) 2022-05-07 23:16:44 -07:00
shlomibitton
d3d6d0fb52 [Fastboot] Delay PMON service for better fastboot performance (#10567)
- Why I did it
Profiling the system state on init after fast-reboot during create_switch function execution shows a few python scripts running at the same time.
This parallel execution consumes CPU time, and the duration of create_switch is longer than it should be.
Following this finding, and to ensure these services will not interfere in the future, PMON is delayed by 90 seconds until the system finishes the init flow after fastboot.

- How I did it
Add a timer for PMON service.
For the MLNX platform, exclude the trigger that starts PMON when SYNCD starts in case of fastboot.
Copy the timer file to the host bin image.

- How to verify it
Run fast-reboot on MLNX platform and observe faster create_switch execution time.
2022-05-07 23:16:41 -07:00
Andriy Yurkiv
1334c0447f [yang] add yang options for Context object (#10359)
#### Why I did it
The LY_CTX_DISABLE_SEARCHDIR_CWD option needs to be passed to the Context to disable the automatic search for schemas in the current working directory, which is searched by default.

#### How I did it
Add an additional attribute to the YANG context

#### How to verify it
Create some invalid link on switch :
1) **ln -s /usr/abc xxx**
2) run **spm list**
--> There should not be these messages:
```
libyang[1]: Unable to get information about "xxx" file in "/tmp" when searching for (sub)modules (No such file or directory)
libyang[1]: Unable to get information about "xxx" file in "/tmp" when searching for (sub)modules (No such file or directory)
libyang[1]: Unable to get information about "xxx" file in "/tmp" when searching for (sub)modules (No such file or directory)
libyang[1]: Unable to get information about "xxx" file in "/tmp" when searching for (sub)modules (No such file or directory)
```
2022-05-07 23:16:37 -07:00
Judy Joseph
b3f42feeaf Update sonic-swss, sonic-utilities submodules
swss
f71c57e [ACL]Avoid incrementing crm count when ACL rule create fails (#2238)

utilities
8a93fde Allow fw update for other boot type against on the previous "none" boot fw update (#2040)
5837559 [show] fix get routing stack routine (#2137)
c888f29 [techsupport] improve robustness (#2117)
2022-05-01 23:20:41 -07:00
Shilong Liu
482f45e28f [build] docker-sonic-mgmt replace USER by whoami (#9702) 2022-05-01 23:16:26 -07:00
xumia
a55ba095db Fix the build target error when building sonic-rest-api (#10693)
Why I did it
Fix the issue where the target target/debs/bullseye/sonic-rest-api_1.0.1_arm64.deb does not exist; the correct target is target/debs/bullseye/sonic-rest-api_1.0.1_armhf.deb.
Fix issue: #9896

[ FAIL LOG START ] [ target/debs/stretch/sonic-rest-api_1.0.1_amd64.deb ]
[ REASON ] :      target/debs/stretch/sonic-rest-api_1.0.1_amd64.deb does not exist   NON-EXISTENT PREREQUISITES: 
[ FLAGS  FILE    ] : []
2022-05-01 23:16:22 -07:00
shlomibitton
94f271c667 [Fastboot] Delay LLDP service for better fastboot performance (#10568)
- Why I did it
Profiling the system state on init after fast-reboot during create_switch function execution shows a few python scripts running at the same time.
This parallel execution consumes CPU time, and the duration of create_switch is longer than it should be.
Following this finding, and to ensure these services will not interfere in the future, LLDP is delayed by 90 seconds until the system finishes the init flow after fastboot.

- How I did it
Add a timer for LLDP service.
Copy the timer file to the host bin image.

- How to verify it
Run fast-reboot on MLNX platform and observe faster create_switch execution time.
This PR is dependent on PR: #10567
2022-05-01 23:16:18 -07:00
Saikrishna Arcot
f1ec7107cb Remove SSH host keys after installing the custom version of sshd (#10633)
* Remove SSH host keys after installing the custom version of sshd

Signed-off-by: Saikrishna Arcot <sarcot@microsoft.com>

* Use an override for sshd instead of overwriting the service file

Don't overwrite upstream's .service file, and instead use an override
file for making sure the host key(s) are generated.

Signed-off-by: Saikrishna Arcot <sarcot@microsoft.com>
2022-05-01 23:16:14 -07:00
mssonicbld
e2a2b30676
[ci/build]: Upgrade SONiC package versions (#10722) 2022-05-01 22:40:25 +00:00
mssonicbld
ff48ad4e9b
[ci/build]: Upgrade SONiC package versions (#10658) 2022-04-29 22:37:55 +00:00
Shilong Liu
0d8b769bae
[CG] Fix CG alert about underscore version. (#10706) 2022-04-29 13:40:13 +08:00
Shilong Liu
89d84704a9
[submodule] Update submodule for sonic-mgmt-common (#10671) 2022-04-28 13:40:59 +08:00
Stephen Sun
a68eaebe9d
[202111] [submodule] Advance sonic-platform-daemons pointer (#10673)
e46b243b Fix checkReplyType failed issue via recreating xcvr_table_helper on forking subprocess (#255) (#256)

Signed-off-by: Stephen Sun <stephens@nvidia.com>
2022-04-27 19:37:34 +03:00
Petro Bratash
8f0519b4b3
[BFN] Update configuration files (#10588) 2022-04-27 09:21:47 -07:00
xumia
e950fac740
[Submodule]: update submodule for sonic-restapi (#10682)
Why I did it
Update submodule sonic-restapi
e83e0e8 Fix Ctype_char larger than address space issue in 32-bit armhf (#107)
2022-04-27 07:15:41 +08:00
Judy Joseph
7e18f4c4a1 Update sonic-swss submodule
fc29641 [pbh] [aclorch] Fixed a bug caused by updating the flow-counter value for the PBH rule (#2226)
6c38ef7 [QoS] Resolve an issue in the sequence where a referenced object is removed and then the referencing object is deleted and re-added (#2210)
2022-04-24 21:19:17 -07:00
Junchao-Mellanox
ccc8c58c9c [Mellanox] Adjust PSU voltage WA (#10619)
- Why I did it
InvalidPsuVolWA.run might raise an exception if the user powers off the PSU while it is running. This exception is not caught and propagates to psud, which causes psud to fail to update PSU data to the DB.

- How I did it
1. Change the log level when the WA does not work. This can happen when the user powers off the PSU, so changing the log level from error to warning is better
2. Change the wait time from 5 to 1 to avoid introducing too much delay in psud. 1 second is usually enough per my test
3. Give a default return value for the functions get_voltage_low_threshold and get_voltage_high_threshold to prevent the exception from reaching psud
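
Items 1 and 3 follow a common pattern that can be sketched as below. This is an assumption-laden illustration, not the Mellanox platform code: `read_threshold`, its `default` argument and the logger name are hypothetical, and the actual default value used by the fix is not stated here.

```python
import logging

logger = logging.getLogger("psu_sketch")


def read_threshold(read_raw, default=None):
    """Return a threshold reading, or a default if reading fails.

    read_raw is any callable that may raise, for example because the
    PSU was powered off mid-read. The failure is logged as a warning
    instead of an error and never reaches the caller.
    """
    try:
        return read_raw()
    except Exception as err:
        logger.warning("PSU threshold unavailable: %s", err)
        return default


def failing_read():
    raise IOError("PSU powered off")


print(read_threshold(failing_read))
print(read_threshold(lambda: 11.4))
```

Catching the exception at the lowest level keeps a daemon loop such as psud running and able to update the remaining PSU data.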

- How to verify it
Manual test.
Run sonic-mgmt regression
2022-04-24 21:14:04 -07:00
Shilong Liu
0dff46b6bc Fix docker-sonic-mgmt reproducible related issue. (#9647)
Reproducible build script breaks docker-sonic-mgmt build.
2022-04-24 21:13:55 -07:00
xumia
03f1b87e65 [Ci]: Support to sign image for cisco-8000 uefi secure boot (#10616)
Why I did it
[Ci]: Support to sign image for cisco-8000 uefi secure boot
2022-04-24 21:13:41 -07:00
Vadym Hlushko
c62251b658 [install.sh] Fixed the sed pattern to match the current image revision (#9813)
#### Why I did it
The test plan described in the `How to verify it` section caused an issue when 3 images (instead of 2) were present when executing `show boot` or `sonic-installer list` commands:
```
root@sonic:/home/admin# show boot
Current: SONiC-OS-master.0-dirty-20220118.165941
Next: SONiC-OS-master.0-dirty-20220118.165941
Available: 
SONiC-OS-master.0-dirty-20220118.165941
SONiC-OS-202012.201-a0376a6e5_Internal
SONiC-OS-202012.201-a0376a6e5_Internal_RPC
```
#### How I did it
Fixed the `sed` pattern to match the current image revision in the `install.sh` script.

#### How to verify it
Test plan:
1. Install the `imageA` by using ONIE
2. Install the `imageA-rpc` by using `sonic-installer`
3. Reboot the switch
4. Swap to the `imageA` - `sonic-installer set-default imageA`
5. Reboot the switch
6. Install the `imageB` by using `sonic-installer`
7. Check the installed images - `show boot`
8. Reboot the switch
9. Check the installed images - `show boot`
2022-04-24 21:13:37 -07:00
Sachin Naik
749bbb6ea4 secureboot: Enable signing SONiC kernel (#10557)
Why I did it
To sign the SONiC kernel image and allow a secure boot based system to verify the SONiC image before loading it into the system.

How I did it
Pass the following parameters in rules/config.user
Ex:
SONIC_ENABLE_SECUREBOOT_SIGNATURE := y
SIGNING_KEY := /path/to/key/private.key
SIGNING_CERT := /path/to/public/public.cert

How to verify it
A secure boot enabled system enrolled with the right public key of the image in the platform UEFI database will be able to verify the image before loading it.

Alternatively, one can verify offline with the sbsign tools as below.

export SBSIGN_KEY=/abc/bcd/xyz/
sbverify --cert $SBSIGN_KEY/public_cert.cert fsroot-platform-XYZ/boot/vmlinuz-5.10.0-8-2-amd64 mage

O/P:
Signature verification OK
2022-04-24 21:13:32 -07:00
Shilong Liu
4280a2365d
[CG-Fix-CVE-2021-44906] Patching on thrift.0.14.1 for package minimist (#10555) (#10650)
* [CG-Fix-CVE-2021-44906] Patching on thrift.0.14.1 for package minimist

Signed-off-by: richardyu-ms <richard.yu@microsoft.com>

* add more information in patch

Signed-off-by: richardyu-ms <richard.yu@microsoft.com>

* Update 0003-Remove-minimist-packages.patch

* change the thrift 0.14.1 to package download

Signed-off-by: richardyu-ms <richard.yu@microsoft.com>

* use the series file for patching

* fix a code defect

Co-authored-by: Richard.Yu <richard.yu@microsoft.com>
2022-04-23 13:55:52 +08:00
mssonicbld
b7d77c7193
[ci/build]: Upgrade SONiC package versions (#10653) 2022-04-23 00:43:12 +00:00
ganglv
a4597396c2
[yang]: Update yang models to support 'cluster' (#10597)
Why I did it
Minigraph parser added a new field 'cluster' to device_metadata, which blocks yang validation.

How I did it
Add 'cluster' to device_metadata yang models.

How to verify it
Run UT for sonic-yang-models.
Use minigraph parser to generate ConfigDB schema and run yang validation.

Signed-off-by: Gang Lv <ganglv@microsoft.com>
2022-04-21 17:20:36 +08:00
ganglv
0526ff98f2
[yang] Update device_metadata to add dhcp_server (#10598)
Why I did it
dhcp_server is introduced, and the yang model needs to be updated.

How I did it
Update yang models and add unit test.

How to verify it
Run unit test for sonic-yang-models.

Signed-off-by: Gang Lv <ganglv@microsoft.com>
2022-04-21 08:39:35 +08:00
Samuel Angebault
df8eaa0544 [Arista] Fix arista-net initramfs hook
The interface renaming logic fails if one interface is missing.
Because of the `set -e` the whole initramfs hook would abort early on
error.
This change fixes the current behavior to make sure missing interfaces
are properly skipped and existing interfaces are renamed.
2022-04-20 10:04:21 -07:00
Samuel Angebault
eaf9a0bde8 [Arista] rename management interface in initrd (#9856)
On some products the pci enumeration adds randomness into which nic gets
initialized first.
Because SONiC doesn't use deterministic interface naming but instead old
style interface naming, this leads to eth0 not always being the
management port.
To make sure eth0 is always the management port (SONiC expectation)
rename the interfaces in the initramfs for Arista products.
2022-04-20 10:04:21 -07:00
mssonicbld
ae6caab040
[ci/build]: Upgrade SONiC package versions (#10521)
Upgrade SONiC Versions
2022-04-20 12:19:56 +08:00
Judy Joseph
9f52080516 Update sonic-utilities submodule 2022-04-18 08:39:07 -07:00