Fix issue: Non compliant leaf list in config_db schema: https://github.com/Azure/sonic-buildimage/issues/9801
The basic flow of DPB is:
1. Convert the config DB JSON value to a YANG JSON value, named “yangIn”
2. Validate “yangIn” with libyang
3. Generate a YANG JSON value that represents the target configuration, named “yangTarget”
4. Compute the diff between “yangIn” and “yangTarget”
5. Apply the diff to the CONFIG DB JSON and save it back to the DB
The fix:
• For step #1, if the value of a leaf-list field is a string, convert it to a list by splitting it on “,” so that step #2 passes validation. Also save <table_name>.<key>.<field_name> to a set named “leaf_list_with_string_value_set”.
• For step #5, loop over “leaf_list_with_string_value_set” and change those fields back to strings.
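A minimal sketch of the split-and-restore idea for steps #1 and #5, assuming the config is a nested dict of table, key and field. The function names and the `leaf_list_fields` argument are hypothetical (real code would derive the leaf-list fields from the YANG model), and the set stores tuples rather than dotted strings for simplicity:

```python
def split_leaf_list_strings(config, leaf_list_fields, sep=","):
    """Step 1: turn separator-joined strings into lists and remember which ones."""
    leaf_list_with_string_value_set = set()
    for table, entries in config.items():
        for key, fields in entries.items():
            for field, value in fields.items():
                # Only touch fields known to be leaf-lists that hold a plain string.
                if (table, field) in leaf_list_fields and isinstance(value, str):
                    fields[field] = value.split(sep)
                    leaf_list_with_string_value_set.add((table, key, field))
    return leaf_list_with_string_value_set


def restore_leaf_list_strings(config, leaf_list_with_string_value_set, sep=","):
    """Step 5: join the recorded fields back into strings before saving."""
    for table, key, field in leaf_list_with_string_value_set:
        value = config.get(table, {}).get(key, {}).get(field)
        # The entry may have been removed by the diff, so check before joining.
        if isinstance(value, list):
            config[table][key][field] = sep.join(value)
```

Recording only the fields that were actually converted means fields that were already lists in the input are left as lists when the result is written back.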
1. Manual test
2. Changed sample config DB and unit test passed
Conflicts:
src/sonic-yang-mgmt/sonic_yang_ext.py
079f80a (HEAD -> 202111, origin/202111) Fix: if routestr does not exist, skip (#257)
8fd0fe1 Fix: not to use blocking get_all() after keys() (#255)
981107a Add VoQ Recirc interface (i.e., Ethernet-Rec) to interface maps for S… (#244)
f4ecfb6 (HEAD -> 202111, origin/202111) Removing Vnet with scope default (#2239)
Why I did it
This PR fixes a bug where the management port eth0 may lose its IP address even though the user configured a static IP for eth0. The issue is not always reproducible. The flow that reproduces it is:
1. Systemd starts the networking service, which runs a DHCP-based configuration and assigns eth0 an IP from DHCP.
2. Systemd starts the interface-config service, which depends on the networking service.
3. The interface-config service runs “ifdown --force eth0”, but the networking service is still running, so the command fails with “error: Another instance of this program is already running.”. This error is printed by the ifupdown2 library, which is the main process of the networking service. So ifdown has no effect here and the IP of eth0 is not brought down.
4. The interface-config service updates /etc/network/interfaces to the static configuration.
5. The interface-config service runs “systemctl restart networking”. This command kills the previous networking-related processes (log: networking.service: Main process exited, code=killed, status=15/TERM) and tries to reconfigure the IP address with the static configuration. But it detects that the configured IP and the existing IP are the same, so it does not actually configure the IP in the kernel. Hence the IP is still the one obtained from DHCP. (This could be a bug in ifupdown2: the previous IP came from DHCP and the new IP is static, yet it treats them as the same instead of reconfiguring the IP.)
6. When the DHCP lease expires, the kernel removes the IP of eth0 and the issue reproduces.
The issue is not always reproducible because the networking service usually finishes quickly enough that step #3 does not fail.
How I did it
Check the networking service state before running "ifdown --force eth0", and wait for it to finish if it is activating.
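The wait described above can be sketched as a small polling loop. This is written in Python for illustration only and the actual change may take a different form. The function names, timeout and interval are assumptions; the only external command used is `systemctl is-active`, which prints the unit's current state:

```python
import subprocess
import time


def networking_state():
    """Return the state printed by `systemctl is-active networking`."""
    result = subprocess.run(
        ["systemctl", "is-active", "networking"],
        capture_output=True, text=True, check=False,
    )
    return result.stdout.strip()


def wait_while_activating(get_state=networking_state, timeout=60.0, interval=1.0):
    """Poll until the service leaves the 'activating' state or the timeout expires."""
    deadline = time.monotonic() + timeout
    while get_state() == "activating":
        if time.monotonic() >= deadline:
            return False  # still activating; the caller decides what to do next
        time.sleep(interval)
    return True
```

Only once `wait_while_activating()` returns would the caller go on to run `ifdown --force eth0`, so that ifdown does not race with the still-running networking service.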
How to verify it
Manual test.
Why I did it
When lldpmgrd handled events of tables other than PORT_TABLE, an error message was printed to the log.
How I did it
Handle each event according to its file descriptor instead of looping over all registered selectables for every incoming event.
How to verify it
I verified that the same events are handled before and after the change by printing each event's key and operation.
Also, before the change, error messages were printed to the log during the init flow after config reload when lldpmgrd handled events of tables other than PORT_TABLE. This no longer happens.
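The dispatch-by-descriptor idea can be illustrated with Python's standard `selectors` module. This is only an analogy under stated assumptions: lldpmgrd uses the swsscommon select mechanism, not `selectors`, and the handler names and table labels below are hypothetical:

```python
import os
import selectors


def make_reader(name):
    """Build a handler that reads one descriptor and tags the result with a name."""
    def handler(fd):
        return name, os.read(fd, 1024)
    return handler


def dispatch_ready(sel, timeout=0):
    """Call only the handler registered for each descriptor that is ready."""
    handled = []
    for key, _events in sel.select(timeout):
        handler = key.data  # the handler stored when the descriptor was registered
        handled.append(handler(key.fd))
    return handled
```

Because each ready descriptor maps directly to its own handler, an event on one table never reaches the handler for another table, which is the behaviour the fix aims for.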
- Why I did it
Profiling the system state on init after fast-reboot, during execution of the create_switch function, shows a few Python scripts running at the same time.
This parallel execution consumes CPU time, so create_switch takes longer than it should.
Following this finding, and to ensure these services do not interfere in the future, PMON is delayed by 90 seconds until the system finishes the init flow after fastboot.
- How I did it
Add a timer for PMON service.
On the MLNX platform, exclude the trigger that starts PMON when SYNCD starts in case of fastboot.
Copy the timer file to the host bin image.
- How to verify it
Run fast-reboot on MLNX platform and observe faster create_switch execution time.
#### Why I did it
Need to pass LY_CTX_DISABLE_SEARCHDIR_CWD to the Context to disable searching for schemas in the current working directory, which is searched automatically by default.
#### How I did it
Add an additional attribute to the YANG context.
#### How to verify it
Create an invalid link on the switch:
1) **ln -s /usr/abc xxx**
2) run **spm list**
--> These messages should not appear:
```
libyang[1]: Unable to get information about "xxx" file in "/tmp" when searching for (sub)modules (No such file or directory)
libyang[1]: Unable to get information about "xxx" file in "/tmp" when searching for (sub)modules (No such file or directory)
libyang[1]: Unable to get information about "xxx" file in "/tmp" when searching for (sub)modules (No such file or directory)
libyang[1]: Unable to get information about "xxx" file in "/tmp" when searching for (sub)modules (No such file or directory)
```
- Why I did it
Profiling the system state on init after fast-reboot, during execution of the create_switch function, shows a few Python scripts running at the same time.
This parallel execution consumes CPU time, so create_switch takes longer than it should.
Following this finding, and to ensure these services do not interfere in the future, LLDP is delayed by 90 seconds until the system finishes the init flow after fastboot.
- How I did it
Add a timer for LLDP service.
Copy the timer file to the host bin image.
- How to verify it
Run fast-reboot on MLNX platform and observe faster create_switch execution time.
This PR is dependent on PR: #10567
* Remove SSH host keys after installing the custom version of sshd
Signed-off-by: Saikrishna Arcot <sarcot@microsoft.com>
* Use an override for sshd instead of overwriting the service file
Don't overwrite upstream's .service file, and instead use an override
file for making sure the host key(s) are generated.
Signed-off-by: Saikrishna Arcot <sarcot@microsoft.com>
e46b243b Fix checkReplyType failed issue via recreating xcvr_table_helper on forking subprocess (#255) (#256)
Signed-off-by: Stephen Sun <stephens@nvidia.com>
fc29641 [pbh] [aclorch] Fixed a bug causes by updating the flow-counter value for the PBH rule (#2226)
6c38ef7 [QoS] Resolve an issue in the sequence where a referenced object removed and then the referencing object deleting and then re-adding (#2210)
- Why I did it
InvalidPsuVolWA.run might raise an exception if the user powers off the PSU while it is running. This exception is not caught and propagates to psud, which causes psud to fail to update PSU data in the DB.
- How I did it
1. Change the log level when the WA does not work. This can happen when the user powers off the PSU, so warning is more appropriate than error
2. Change the wait time from 5 to 1 to avoid introducing too much delay in psud. 1 second is usually enough in my tests
3. Give the functions get_voltage_low_threshold and get_voltage_high_threshold a default return value so that exceptions do not reach psud
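Item 3 can be sketched as follows. The `Psu` class and `read_threshold` callable are hypothetical stand-ins for the platform code, and returning `None` as the default is an assumption, since the description does not state which default value is used:

```python
class Psu:
    """Hypothetical PSU wrapper; `read_threshold` stands in for the sysfs access."""

    def __init__(self, read_threshold):
        self._read_threshold = read_threshold

    def _safe_read(self, name, default=None):
        # Any failure (for example, the PSU was powered off mid-read) yields
        # the default instead of propagating to the caller.
        try:
            return self._read_threshold(name)
        except Exception:
            return default

    def get_voltage_low_threshold(self):
        return self._safe_read("low")

    def get_voltage_high_threshold(self):
        return self._safe_read("high")
```

With this shape, a daemon that calls the getters in a loop keeps running and can treat the default as "value not available" for that iteration.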
- How to verify it
Manual test.
Run sonic-mgmt regression
#### Why I did it
Following the test plan in the `How to verify it` section left 3 images (instead of 2) listed by the `show boot` or `sonic-installer list` commands:
```
root@sonic:/home/admin# show boot
Current: SONiC-OS-master.0-dirty-20220118.165941
Next: SONiC-OS-master.0-dirty-20220118.165941
Available:
SONiC-OS-master.0-dirty-20220118.165941
SONiC-OS-202012.201-a0376a6e5_Internal
SONiC-OS-202012.201-a0376a6e5_Internal_RPC
```
#### How I did it
Fixed the `sed` pattern to match the current image revision in the `install.sh` script.
#### How to verify it
Test plan:
1. Install the `imageA` by using ONIE
2. Install the `imageA-rpc` by using `sonic-installer`
3. Reboot the switch
4. Swap to the `imageA` - `sonic-installer set-default imageA`
5. Reboot the switch
6. Install the `imageB` by using `sonic-installer`
7. Check the installed images - `show boot`
8. Reboot the switch
9. Check the installed images - `show boot`
Why I did it
To sign the SONiC kernel image so that a secure-boot-based system can verify the SONiC image before loading it.
How I did it
Pass the following parameters in rules/config.user
Example:
SONIC_ENABLE_SECUREBOOT_SIGNATURE := y
SIGNING_KEY := /path/to/key/private.key
SIGNING_CERT := /path/to/public/public.cert
How to verify it
A secure-boot-enabled system with the right public key for the image enrolled in the platform UEFI database will be able to verify the image before loading it.
Alternatively, one can verify the image offline with the sbverify tool, as below.
export SBSIGN_KEY=/abc/bcd/xyz/
sbverify --cert $SBSIGN_KEY/public_cert.cert fsroot-platform-XYZ/boot/vmlinuz-5.10.0-8-2-amd64
Output:
Signature verification OK
* [CG-Fix-CVE-2021-44906] Patching on thrift.0.14.1 for package minimist
Signed-off-by: richardyu-ms <richard.yu@microsoft.com>
* add more information in patch
Signed-off-by: richardyu-ms <richard.yu@microsoft.com>
* Update 0003-Remove-minimist-packages.patch
* change the thrift 0.14.1 to package download
Signed-off-by: richardyu-ms <richard.yu@microsoft.com>
* use the series file for patching
* fix a code defect
Co-authored-by: Richard.Yu <richard.yu@microsoft.com>
Why I did it
The minigraph parser added a new field 'cluster' to device_metadata, which blocks YANG validation.
How I did it
Add 'cluster' to the device_metadata YANG models.
How to verify it
Run UT for sonic-yang-models.
Use the minigraph parser to generate the ConfigDB schema and run YANG validation.
Signed-off-by: Gang Lv <ganglv@microsoft.com>
Why I did it
dhcp_server is introduced, so the YANG model needs to be updated.
How I did it
Update the YANG models and add a unit test.
How to verify it
Run unit test for sonic-yang-models.
Signed-off-by: Gang Lv <ganglv@microsoft.com>
The interface renaming logic fails if one interface is missing.
Because of the `set -e` the whole initramfs hook would abort early on
error.
This change fixes the current behavior to make sure missing interfaces
are properly skipped and existing interfaces are renamed.
On some products the PCI enumeration adds randomness into which NIC gets
initialized first.
Because SONiC doesn't use deterministic interface naming but instead old
style interface naming, this leads to eth0 not always being the
management port.
To make sure eth0 is always the management port (SONiC expectation)
rename the interfaces in the initramfs for Arista products.
- Why I did it
There is a hardware bug where the PSU voltage threshold sysfs returns an incorrect value. The workaround is to call "sensors -s" to refresh it.
- How I did it
Call "sensors -s" when the threshold value is incorrect and the PSU is "DELTA 1100"
- How to verify it
Unit test and Manual test
Why I did it
The minigraph parser has introduced a new type.
How I did it
Update the YANG models to support BmcMgmtToRRouter.
How to verify it
Run unit test for sonic-yang-models
Signed-off-by: Gang Lv <ganglv@microsoft.com>
Why I did it
Need to run YANG validation for the sonic-cfggen unit tests, and many unit tests do not provide a speed for the port table.
How I did it
Update the minigraph XML.
How to verify it
Run sonic-cfggen unit test.
Signed-off-by: Gang Lv <ganglv@microsoft.com>
Why I did it
The valid ASN range is 1 to 4294967295, so invalid ASNs need to be removed.
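As a quick illustration of the range stated above, a minimal validity check might look like this. The constant and function names are hypothetical and not taken from sonic-config-engine:

```python
ASN_MIN = 1
ASN_MAX = 4294967295  # 2**32 - 1


def is_valid_asn(value):
    """Return True if value parses as an integer within the ASN range above."""
    try:
        asn = int(value)
    except (TypeError, ValueError):
        return False
    return ASN_MIN <= asn <= ASN_MAX
```

Under this check ASN 0, which the unit test data previously used, is rejected.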
How I did it
Update the unit tests and replace ASN 0.
How to verify it
Run unit test for sonic-config-engine.
Signed-off-by: Gang Lv <ganglv@microsoft.com>
Why I did it
The sonic-config-engine unit test uses an invalid switch_type.
How I did it
Update the XML with the correct switch_type.
How to verify it
Run UT for sonic-config-engine
Signed-off-by: Gang Lv <ganglv@microsoft.com>
Why I did it
Need to run YANG validation for the sonic-cfggen unit tests, and many unit tests do not provide lanes for the port table.
How I did it
Update the port config file.
How to verify it
Run sonic-cfggen unit test.
Use the PR below to verify:
#10228
Signed-off-by: Gang Lv <ganglv@microsoft.com>
Why I did it
The config DB schema generated from the minigraph can’t pass YANG validation because deployment_id can’t be None.
How I did it
Update minigraph.py to skip deployment_id when its value is None.
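The skip can be sketched as below. The function name and the reduced DEVICE_METADATA layout are assumptions for illustration and do not reproduce the actual minigraph.py code:

```python
def build_device_metadata(hostname, deployment_id=None):
    """Build a DEVICE_METADATA-style dict, omitting deployment_id when unset."""
    localhost = {"hostname": hostname}
    # Leave the key out entirely rather than writing a None value,
    # which would not pass YANG validation.
    if deployment_id is not None:
        localhost["deployment_id"] = deployment_id
    return {"DEVICE_METADATA": {"localhost": localhost}}
```

For example, `build_device_metadata("asic3")` yields a `localhost` entry with only `hostname`, while passing `deployment_id="1"` adds the field.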
How to verify it
Run UT for sonic-config-engine.
Run command 'sonic-cfggen -m tests/multi_npu_data/sample-minigraph-noportchannel.xml -p tests/multi_npu_data/sample_port_config-3.ini -n asic3 --print-data'.
Signed-off-by: Gang Lv <ganglv@microsoft.com>
Why I did it
Multi-asic platforms add asic_port_name and role to the PORT table, and the port_index range has changed.
How I did it
Update sonic-port.yang: add asic_port_name and role, and remove the range limitation.
How to verify it
Run UT for sonic-yang-models.
Signed-off-by: Gang Lv <ganglv@microsoft.com>
Why I did it
end2end test is blocked by Yang model for BGP_PEER_RANGE.
How I did it
Add new yang models.
How to verify it
Run UT for sonic-yang-models.
Signed-off-by: Gang Lv <ganglv@microsoft.com>
Signed-off-by: Gang Lv <ganglv@microsoft.com>
#### Why I did it
end2end test is blocked by Yang model for AAA login pattern.
#### How I did it
Add pattern to AAA yang models.
#### How to verify it
Run UT for sonic-yang-models.
#### Which release branch to backport (provide reason below if selected)
- [ ] 201811
- [ ] 201911
- [ ] 202006
- [ ] 202012
- [ ] 202106
#### Description for the changelog
Fix #9713
* Ported platform from master
Signed-off-by: Petro Bratash <petrox.bratash@intel.com>
* [BFN] Updated x86_64-accton_as9516_32d-r0/platform.json
* [BFN] Refactoring and adding some functions of Thermal class (set and
get thresholds and etc.)
* [BFN] Fix exception when fwutil run without sudo
* Revert "[BFN] syncd-rpc build with thrift 0.14.1 (#9884)"
This reverts commit bec35267cb.
* [BFN] Updated SDK to 20220127_sai_1.9.1 (#9870)
Signed-off-by: Andriy Kokhan <andriyx.kokhan@intel.com>
* [BFN] updated SDE packages for BFN platforms (#10512)
Updated SDE packages for bfn platform
- introduced X6 profile
- fixes for drop counters
- fixes for platform part
Co-authored-by: Andriy Kokhan <AndriyX.Kokhan@intel.com>
Co-authored-by: roman_savchuk <romanx.savchuk@intel.com>