Commit Graph

5936 Commits

Author SHA1 Message Date
xumia
f1d6d7ccce
[Build]: Fix the bin image generated from raw image issue (#10083)
Why I did it
It is to fix the issue #10048
When building .raw image, for instance, target/sonic-broadcom.raw, it will generate a .bin image, target/sonic-broadcom.bin, as the intermediate file. The intermediate file is a build target which may contains different dependencies with the raw one.
2022-03-08 21:15:43 +08:00
ganglv
29f6b01be6
[sonic-cfggen]: Fix generated deployment_id (#10154)
Why I did it
Config db schema generated by minigraph can’t pass yang validation, deployment_id can’t be none for yang validation.

How I did it
Update minigraph.py, skip deployment_id with None value

How to verify it
Run UT for sonic-config-enginue.
Run command 'sonic-cfggen -m tests/multi_npu_data/sample-minigraph-noportchannel.xml -p tests/multi_npu_data/sample_port_config-3.ini -n asic3 --print-data'.

Signed-off-by: Gang Lv ganglv@microsoft.com
2022-03-08 15:48:04 +08:00
xumia
14921e39d1
[Build][Ci]: Support to use the cisco sai packages built by azp (#10102)
Why I did it
Support to use the cisco sai packages built by azp
2022-03-08 10:15:52 +08:00
Renuka Manavalan
d9a61b07a7
send log to /var/log/syslog; Add user info the message (#10033)
Why I did it
Desired the log message destination to be syslog and it misses the critical info.

How I did it
Non logical code changes only.
Logging update, just for one message only
a) The log message is directed to /var/log/syslog, instead of /var/log/auth.log
b) Include user alias in the message

How to verify it
Pick a user alias that has not logged into the switch yet
Add this alias to /etc/tacplus_user
Attempt to login as that user
Look for the error message in /var/log/syslog
e.g. "Feb 18 19:16:41.592191 sonic ERR sshd[5233]: auth fail: Password incorrect. user: user_xyz"
2022-03-07 15:01:31 -08:00
Kebo Liu
fe0a7693f4
[smartmontools] Install smartmontools with apt-get and upgrade it to 7.2-1 (#10087)
Why I did it
Smartmontools 6.6 has an issue with reading SMART info of nvme SSD
Smartmontools can be installed with apt-get, no need to build and install

How I did it
Use apt-get to install smartmontools 7.2-1
Remove previous make files for smartmontools 6.6

How to verify it
verify with "smartctl" can read out correct SMART info on NVME ssd.
verify "show platform ssdhealth" can still work

Signed-off-by: Kebo Liu <kebol@nvidia.com>
2022-03-07 09:39:33 -08:00
ganglv
78e867a794
[YANG]: Update port Yang models to support multi-asic platform (#10113)
Why I did it
Multi-asic platform add aisc_port_name and role to PORT table, and port_index range is changed.

How I did it
Update sonic-port.yang, add asic_port_name and role, and remove range limitation.

How to verify it
Run UT for sonic-yang-models.

Signed-off-by: Gang Lv ganglv@microsoft.com
2022-03-07 15:54:05 +08:00
jingwenxie
eec49a2e09
[yang] support acl MIRROR_ACTION (#10100)
Why I did it
ACL doesn't have mirror related action

How I did it
Add 'MIRROR_INGRESS_ACTION' and 'MIRROR_EGRESS_ACTION' to sonic-acl.yang.j2

How to verify it
Run the YANG model unit tests
2022-03-07 14:04:18 +08:00
ganglv
2ef9d65525
[yang]: AAA login pattern (#9805)
Signed-off-by: Gang Lv ganglv@microsoft.com

<!--
     Please make sure you've read and understood our contributing guidelines:
     https://github.com/Azure/SONiC/blob/gh-pages/CONTRIBUTING.md

     ** Make sure all your commits include a signature generated with `git commit -s` **

     If this is a bug fix, make sure your description includes "fixes #xxxx", or
     "closes #xxxx" or "resolves #xxxx"

     Please provide the following information:
-->

#### Why I did it
end2end test is blocked by Yang model for AAA login pattern.

#### How I did it
Add pattern to AAA yang models.

#### How to verify it
Run UT for sonc-yang-models.

#### Which release branch to backport (provide reason below if selected)

<!--
- Note we only backport fixes to a release branch, *not* features!
- Please also provide a reason for the backporting below.
- e.g.
- [x] 202006
-->

- [ ] 201811
- [ ] 201911
- [ ] 202006
- [ ] 202012
- [ ] 202106

#### Description for the changelog
<!--
Write a short (one line) summary that describes the changes in this
pull request for inclusion in the changelog:
-->
Fix #9713 

#### A picture of a cute animal (not mandatory but encouraged)
2022-03-07 13:05:46 +08:00
arunlk-dell
b2409be2f2
DellEMC: N3248TE Platform API 2.0 changes (#9735)
Why I did it
N3248TE - Platform API 2.0 changes

How I did it
Implemented the functional API's needed for Platform API 2.0

How to verify it
Used the API 2.0 test suite to validate the test cases.
2022-03-04 17:53:35 -08:00
jostar-yang
d959c4adcd
[AS4630-54PE] Fix led drv and i2c bus order (#9170)
Signed-off-by: Jostar Yang jostar_yang@accton.com.tw

Why I did it
Fix led drv because CPLD SPEC is updated.
Fix i2c bus order
How I did it
Fix led drv. Set blacklist to i801 and ismt. Let accton util to modprobe i801 and ismt.

How to verify it
Test led and sensors cmd. Results are fine.
2022-03-04 17:40:27 -08:00
jostar-yang
85976cbca3
[AS5835-54X] Fix I2C bus order (#9146)
Why I did it
To fix I2C bus order to meet with HW SPEC. Let i801 use bus-0 and ismt use bus-1

How I did it
Modprobe i801 and then do ismt. So i801 will use bus-0 and ismt will use bus-1.

How to verify it
Test show cmd and sensors work well

Co-authored-by: Jostar Yang <jostar_yang@accton.com.tw>
2022-03-04 17:33:50 -08:00
StormLiangMS
55a0722a33
support BGP_ALLOWED_PREFIXES (#10142) 2022-03-05 09:20:38 +08:00
Marty Y. Lok
c40f04f0e2
[chassis][supervisor]monit container-checker failed due to unexpected "database-chassis" docker running #9042 (#9043)
Why I did it
Fixed the monit container_checker fails due to unexpected "database-chassis" docker running on Supervisor card in the VOQ chassis. fixes #9042

How I did it
Added database-chassis to the always running docker list if platform is supervisor card.

How to verify it
Execute the CLI command "sudo monit status container_checker"


Signed-off-by: mlok <marty.lok@nokia.com>
2022-03-03 17:56:08 -08:00
Alexander Allen
d0ff8b5f48
[pmon] Clean up supervisord chassis_db_init entry and fix startsecs (#10071)
Why I did it
Code review was still in progress when #9858 was merged and upon further testing I have arrived at a better solution.

How I did it
Modified supervisord configuration j2 template for pmon to require no minimum uptime for chassisd_db_init and to remove the redundant exit_codes directive

How to verify it
Boot switch and verify in syslog that there are no errors related to chassis_db_init
2022-03-03 17:10:15 -08:00
Jing Zhang
622962a213
[linkmgrd]: update linkmgrd submodule (#10117)
ce72b0d Longxiang Lyu Thu Feb 24 06:05:12 2022 Put handler member functions as virtual in base (#30)
ef59e4f Jing Zhang Fri Feb 25 11:38:28 2022 Incrementing tolerance on mux state inconsistency (#27)
2d12892 Longxiang Lyu Wed Feb 16 03:32:06 2022 Rename LinkManagerStateMachine to ActiveStandbyStateMachine (#26)
f38634c Jing Zhang Thu Feb 17 17:23:56 2022 Update log level for mux probing and mux state chance (#23)
a8434dd Jing Zhang Thu Feb 17 17:21:01 2022 Handle xcvrd crashing scenarios (#22)
2ebdb2b Longxiang Lyu Mon Feb 14 13:26:07 2022 [make] Enable make extra includes (#24)
2022-03-03 16:22:31 -08:00
Rajkumar-Marvell
b400a64823
[Marvell] Update armhf driver/sai deb version (#10126)
Fixed Marvell SAI deb version naming issue reported in Marvell-switching/sonic-marvell-binaries#62

Signed-off-by: Rajkumar Pennadam Ramamoorthy <rpennadamram@marvell.com>
2022-03-03 16:21:18 -08:00
jostar-yang
a3c10515f4
[as7326-56x] Modify to check eeprom by pre_pddf_init.sh (#7841)
Modify to check eeprom by pre_pddf_init.sh

Signed-off-by: Jostar Yang <jostar_yang@accton.com.tw>
2022-03-03 15:59:12 -08:00
FuzailBrcm
482ff1ca50
[pddf]: Support for idle_state device parameter is required for muxes using i2c_mux_pca954x driver (#10060)
As per linux kernel 5.10, 'force_deselct_on_exit' parameter used for driver i2c_mux_pca954x is no longer valid. Instead an attribute 'idle_state' is added per MUX device. This needs to be set to
-1 : For leaving the mux state as is
-2 : For deselecting the channel upon exit
: To always set a channel upon exit

This needs to be accommodated inside the PDDF JSON parser as well.
2022-03-03 15:58:34 -08:00
jostar-yang
34a4817ad0
[AS7712/PDDF] Add idle_state=-2 for pca954x deselect (#10079)
Signed-off-by: Jostar Yang jostar_yang@accton.com.tw

Why I did it
Linux kernel 5.10, 'force_deselct_on_exit' parameter used for driver i2c_mux_pca954x is no longer valid. Instead an attribute 'idle_state' is added per MUX device. So set idle_state=-2 will let do deselect to pca954 when device channel exit . To avoid cause another device channel access i2c fail.

How I did it
Remove force_deselect_on_exit because not use this parameter.
Add "idle_state":"-2" to each "virt_bus"
How to verify it
Test all sysfs are fine.
2022-03-03 15:50:41 -08:00
xumia
582ea7cfc6
[Unit Test]: Fix sonic config engine test not stable issue(#10147)
Co-authored-by: azureuser <azureuser@contoso.com>
2022-03-03 09:22:15 -08:00
Vadym Hlushko
e104247950
[nvgre] Added YANG model and tests (#10095)
- Why I did it
NVGRE Tunnel feature extends the Config DB with new tables. These tables require a new YANG model.

- How I did it
Added a new YANG model sonic-nvgre-tunnel.yang

- How to verify it
Added YANG test cases.

Signed-off-by: Vadym Hlushko <vadymh@nvidia.com>
2022-03-03 15:58:17 +02:00
Sudharsan Dhamal Gopalarathnam
14de0a1548
[containerd]Fixing container commands when mode is local and state is disabled (#9986)
Why I did it
During warm-reboot and fast-reboot the below error logs appear
Feb 3 22:05:15.187408 r-lionfish-13 ERR container: docker cmd: kill for nat failed with 404 Client Error for http+docker://localhost/v1.41/containers/nat/json: Not Found ("No such container: nat")

The container command when called for local mode doesn't check if it is enabled before calling docker kill which throws the above errors.
b6ca76b482/scripts/fast-reboot (L699)

How I did it
Checking feature state if local mode and returning error exit code along with valid debug message.

How to verify it
Manually tested with warm-reboot and fast-reboot
Added UT to verify it.
2022-03-02 19:08:06 -08:00
Lawrence Lee
4d2a55d373
[swss]: Wait for vlan intf to start ndppd (#10119)
- Use the `wait_for_link.sh` script to delay ndppd start until after the VLAN interface is ready
- Avoids issue where ndppd tries to change interface attributes before the interface is ready
2022-03-02 16:23:56 -08:00
Aravind Mani
1740beb1f2
[sonic-cfggen]: Fix sonic-cfggen build failures for armhf (#10132)
Why I did it
amrhf build fails while building sonic-config-engine whl package
https://dev.azure.com/mssonic/be1b070f-be15-4154-aade-b1d3bfb17054/_apis/build/builds/77089/logs/9

The reason for the failure is due to the fact that there is a new line generated at the top of the file in buffer config test cases while building for broadcom based platform and this issue is not seen in Marvell based platforms.

How I did it
Removed the new line for all the buffer test cases as there is no need to add it and accordingly changed the buffer_config.j2 where the new line is generated.
2022-03-02 13:06:20 -08:00
Maxime Lorrillere
7891760fd0
[yang-models] Add chassis fields to device_metadata (#10006)
This change is adding asic_name, switch_id, switch_type and max_cores to sonic-device_metadata.yang
This should fix issue #9575

Co-authored-by: Maxime Lorrillere <mlorrillere@arista.com>
2022-03-02 16:10:04 +08:00
ganglv
3bb87c03a1
[yang]: Add yang models for BGP_PEER_RANGE table (#10082)
Why I did it
end2end test is blocked by Yang model for BGP_PEER_RANGE.

How I did it
Add new yang models.

How to verify it
Run UT for sonc-yang-models.

Signed-off-by: Gang Lv ganglv@microsoft.com
2022-03-02 10:09:41 +08:00
jostar-yang
76363cfa16
[AS7326-5X] Fix code bug for led drv (#8555)
Signed-off-by: Jostar Yang <jostar_yang@accton.com.tw>
2022-03-01 13:33:42 -08:00
jostar-yang
74e790c89f
[as7816-64x]Modify to check specific DUT (#7826)
AS7816 support AT or non-AT DUT. They use different pmbus i2c bus. So use "pre_pddf_init.sh" to check this case.

Signed-off-by: Jostar Yang <jostar_yang@accton.com.tw>
2022-03-01 13:09:17 -08:00
jostar-yang
b617ffd88c
[AS9716-32d] Modify check eeprom via pre_pddf sh (#7827)
Modify to use pre_pddf_init.sh to check eeprom is 0x57 or 0x56.

Signed-off-by: Jostar Yang <jostar_yang@accton.com.tw>
2022-03-01 13:08:03 -08:00
Lawrence Lee
47d9b26063
Revert "[swss]: Wait for vlan intf to start ndppd (#10036)" (#10085)
This reverts commit 91204879df.

#10036 breaks ndppd functionality
2022-02-28 15:42:02 -08:00
Aravind Mani
6c31fc65ff
Dell: S6100 fix xcvrd crash (#10062) 2022-02-28 13:13:52 -08:00
Saikrishna Arcot
afa18e2856
[build_debian.sh]: Fix /var/log having 0750 permissions instead of 0755 (#10031)
PR #9481 changed auditd's log directory to be /var/log instead of
/var/log/audit, because SONiC mounts a disk image at /var/log during
runtime, and so the /var/log/audit directory might not exist (since it
would've been created during package installation, mounting another
partition at /var/log will hide it). However, for security reasons,
auditd changes the log directory to have 0750 permissions, so that not
everyone knows about the audit logs or read them.

To fix this, revert the change to auditd's log directory, and tell
systemd to create the audit log directory at runtime if it doesn't
exist. Because the disk image gets mounted during initramfs (before
systemd starts), systemd will make sure that the /var/log/audit
directory will exist.

Fixes #9548 and #10015

Signed-off-by: Saikrishna Arcot <sarcot@microsoft.com>
2022-02-28 11:46:50 -08:00
Alexander Allen
7c4fbf0455
[Mellanox] Add patch to hw-mgmt to prevent loading of non-existent kernel modules (#10073)
- Why I did it
The latest upgrade of Mellanox hw-mgmt V7.0020.1300 introduced a couple new kernel modules for new Mellanox platforms that have yet to be upstreamed to the linux kernel.

As these new platforms do not have SONiC support we elected not to upstream these new drivers to sonic-linux-kernel but hw-mgmt expects them to exist which is causing a non-functional error on switch boot.

Feb 15 00:09:55.374130 r-leopard-simx-74 ERR systemd-modules-load[269]: Failed to find module 'emc2305'
Feb 15 00:09:55.374141 r-leopard-simx-74 ERR systemd-modules-load[269]: Failed to find module 'ads1015'
To resolve this we can patch hw-mgmt to no longer attempt to load these modules by default.

- How I did it
Added a SONiC patch to Mellanox hw-mgmt in order to remove the unused kernel modules which were not upstreamed to sonic-linux-kernel

- How to verify it
Boot switch and verify there are no error logs regarding kernel modules failing to load.
2022-02-28 08:08:19 +02:00
Rajkumar-Marvell
5daf482a95
[Marevell] Fix armhf build failure (#9875)
Signed-off-by: Rajkumar Pennadam Ramamoorthy <rpennadamram@marvell.com>
2022-02-28 13:40:58 +08:00
Yang Wang
b8fa5e0d8d
install xmlrunner python3 version (#10086) 2022-02-28 11:21:04 +08:00
Junchao-Mellanox
47870cecfc
Stop PMON before swss during warm reboot (#10046)
- Why I did it
Stopping swss and syncd causes some driver module unloading. Those driver modules are depended by PMON. This could trigger ERROR logs in syslog.

- How I did it
Adjust warmboot shutdown order in make file

- How to verify it
Manual test
2022-02-27 11:47:15 +02:00
saksarav-nokia
5e1acf0225
[Nokia][platform]Modify BCM config & platform_reboot for Nokia-IXR7250E-36x400G (#9990)
Signed-off-by: Sakthivadivu Saravanaraj <sakthivadivu.saravanaraj@nokia.com>
2022-02-25 19:57:38 -08:00
Alexander Allen
8dc00ef4e1
[mellanox] Fix DPB supported breakout modes (#10072) 2022-02-25 18:33:35 +05:30
Lawrence Lee
91204879df
[swss]: Wait for vlan intf to start ndppd (#10036)
- Use the `wait_for_link.sh` script to delay ndppd start until after the VLAN interface is ready
- Avoids issue where ndppd tries to change interface attributes before the interface is ready

Signed-off-by: Lawrence Lee <lawlee@microsoft.com>
2022-02-24 17:54:45 -08:00
Qi Luo
c9cf4d9ff0
sonic-slave-buster pins the versions of Jinja2 and MarkupSafe in py3 (#10043)
#### Why I did it
Upstream breaking change, ref discussion https://github.com/pallets/markupsafe/issues/282
2022-02-24 17:00:13 -08:00
xumia
b101b023d3
[Security]: Upgrade urllib3 to fix CVE-2021-33503
See https://security.archlinux.org/CVE-2021-33503
2022-02-25 08:59:57 +08:00
Lawrence Lee
a50d1f1fc8
[write_standby]: Increase timeout to 60s (#10065)
- Avoid scenarios where script times out before orchagent can establish IPinIP tunnel

Signed-off-by: Lawrence Lee <lawlee@microsoft.com>
2022-02-24 14:55:45 -08:00
arlakshm
fd22635de0
[chassis][bgp] create v4 and v6 peer group for VoQ internal neighbors (#9693)
Why I did it
In the recent minigraph changes we add separate BGP session configuration for V4 and V6 internal VoQ neighbors.
This PR is adding different Peer groups for V4 and V6 neighbors

How I did it
Add VOQ_CHASSIS_V4_PEER and VOQ_CHASSIS_V6_PEER groups
Add extra Unit tests

How to verify it

Signed-off-by: Arvindsrinivasan Lakshmi Narasimhan <arlakshm@microsoft.com>
2022-02-24 11:21:26 -08:00
abdosi
2bfad16ae1
Fix Headroom value for 7260C64 SKU (#10075)
Updated the Headroom value for (100G,5m) in 7260C64 SKU.
2022-02-24 10:06:43 -08:00
Junchao-Mellanox
fe59e0f2c0
[Mellanox] Fix issue: thermal zone threshold value 0 causes fan speed stuck at 100% (#10057)
- Why I did it
In SONiC thermal control algorithm, it compares thermal zone temperature with thermal zone threshold. Previously, a thermal zone with no thermal sensor can still get its threshold. However, a recently driver patch changes this behavior: a thermal zone with no thermal sensor will return 0 for threshold. We need to ignore such thermal zone.

- How I did it
Ignore thermal zones whose temperature is 0.

- How to verify it
Added unit test case and Manual test
2022-02-24 12:05:56 +02:00
Junchao-Mellanox
d82eafd8ae
[system-health] Fix file handle leak (#10059)
- Why I did it
swsscommon.ConfigDBConnector does not automatically close connection when the instance is recycled by python. So, it should not create this instance each time calling check_services. It will cause error like Failed to read from file /var/run/hw-management/led/led_status_capability - OSError(24, 'Too many open files')

- How I did it
Only connect DB once in init

- How to verify it
Manual test
2022-02-24 11:29:59 +02:00
Zhaohui Sun
44028723ef
Split kvmtest t0 job into two jobs and run in parallel (#10044)
Why I did it
Introduce 2 sub jobs for kvmtest t0 job in sonic-mgmt repo in PR Azure/sonic-mgmt#4861
But in sonic-buildimage repo, because section parameter is null, it always run the part 2 test scripts in kvmtest t0 job.
It missed the part 1 test scripts in kvmtest.sh.

How I did it
Split kvmtest t0 job into two sub jobs such as sonic-mgmt repo and run them in parallel to save time.

How to verify it
Submit PR will trigger the pipeline to run.

Signed-off-by: Zhaohui Sun <zhaohuisun@microsoft.com>
2022-02-24 10:20:24 +08:00
wenyiz2021
2d0b063191
Update container_checker for multi-asic devices when state is 'always_enabled' (#10067)
* Update container_checker for multi-asic devices 

Update container_checker for multi-asic devices to add database containers in always_running_containers. 
Previous change was made for single-asic, and that database containers were not considered as feature when writing to state_db.

* Update container_checker

Update an indent
2022-02-23 18:06:30 -08:00
Junchao-Mellanox
72477bcac8
[submodule] Update submodule for sonic-swss-common (#10012)
*9eac0ae Add support for route flow counter (#576)
*2262c01 [VS] Increase test timout to 360min (#582)
*a2b8161 [ci] pipeline fixes for VS test (#581)
2022-02-23 17:55:37 -08:00
vmittal-msft
bc1dfea619
Updated traffic scheduler settings for HWSKUs : DellEMC-Z9332f-O32 and DellEMC-Z9332f-M-O16C64 (#9828) 2022-02-23 17:22:41 -08:00