Commit Graph

6221 Commits

Author SHA1 Message Date
jingwenxie
fdc65d7600
Remove minigraph loading in updategraph script (#11146)
Why I did it
Minigraph will be deprecated in the future. So minigraph related reload should be deleted.

How I did it
Remove unused load_minigraph
2022-06-21 08:57:57 +08:00
Vadym Hlushko
87425a5b2b
[sflow + dropmon] added the ENABLE_SFLOW_DROPMON build flag. Added patches for sflow repo. (#10370)
* [sflow + dropmon] added INCLUDE_SFLOW_DROPMON flag, added patches for hsflowd
*Added a capability of monitoring dropped packets for the sFlow daemon in order to improve network - monitoring, diagnostic, and troubleshooting. The drop monitor service allows the sFlow daemon to export another type of sample - dropped packets as Discard samples alongside Counter samples and Packet Flow samples.

Signed-off-by: Vadym Hlushko <vadymh@nvidia.com>
2022-06-20 17:07:02 -07:00
Lior Avramov
467e005b4c
Add IP interface loopback action related content to YANG models. (#11012)
*Add IP interface loopback action related content to the required YANG models.
2022-06-20 11:45:08 -07:00
Stepan Blyshchak
42576d2664
[auto-ts] add memory check (#10433)
#### Why I did it

To support automatic techsupport invokation in case memory usage is too high.

#### How I did it

Implemented according to https://github.com/Azure/SONiC/pull/939

#### How to verify it

UT, manual test on the switch.

*DEPENDS* on https://github.com/Azure/sonic-utilities/pull/2116
2022-06-20 09:39:05 -07:00
Shilong Liu
ad1e20fbc7
[build] Add version files to docker image dependencies (#11179)
* [build] Add version files to docker image dependencies
2022-06-20 18:09:00 +08:00
Andriy Yurkiv
ed99ce0ae0
[Mellanox] Install MFT package on platform monitor (pmon) container (#10932)
- Why I did it
Need to execute mlxreg inside pmon docker

- How I did it
Add MFT package to pmon Makefile

- How to verify it
Install image, go to pmon : docker exec -it pmon bash, exec mlxreg
Verifiy warm, fast and cold reboot while MFT is being called in pmon constantly 

Signed-off-by: Andriy Yurkiv <ayurkiv@nvidia.com>
2022-06-19 14:16:09 +03:00
Stepan Blyshchak
f4a9f7edaa
[sonic-feature.yang] fix check_up_status field type (#10986)
- Why I did it
An issue is encountered when a value "False" is written for a feature in "check_up_status" field, which does not pass YANG validation.

- How I did it
We usually use stypes::boolean_type for such fields, even in this YANG model. This custom type, supports "False" value.

- How to verify it
Write "False" in "check_up_status" field and see if YANG validation passes.

Signed-off-by: Stepan Blyschak <stepanb@nvidia.com>
2022-06-19 14:14:01 +03:00
saksarav-nokia
d42a95ce0b
Updated Nokia device BCM and platform config (#11106) 2022-06-18 10:47:49 -07:00
saksarav-nokia
875e20f99c
Update platform/broadcom/sonic-platform-modules-nokia (#11107) 2022-06-18 10:47:13 -07:00
Junhua Zhai
04ea32b0c2
[macsec] CLI Supports display of gearbox macsec counter (#11113)
Why I did it
To support gearbox macsec counter display, following Azure/sonic-swss-common#622.

How I did it
Use swsscommon CounterTable API
2022-06-18 19:17:05 +08:00
byu343
c1ba71b251
[Arista] Add ASIC configs for blackhawktd4 (#10885)
Why I did it
Add ASIC configs for blackhawktd4

How to verify it
Verified that 400G ports of 400GBASE-CR8 are up and traffic can pass
2022-06-17 12:50:47 -07:00
yozhao101
241f4454b4
[memory_checker] Do not check memory usage of containers which are not created (#11129)
Signed-off-by: Yong Zhao yozhao@microsoft.com

Why I did it
This PR aims to fix an issue (#10088) by enhancing the script memory_checker.

Specifically, if container is not created successfully during device is booted/rebooted, then memory_checker do not need check its memory usage.

How I did it
In the script memory_checker, a function is added to get names of running containers. If the specified container name is not in current running container list, then this script will exit without checking its memory usage.

How to verify it
I tested on a lab device by following the steps:

Stops telemetry container with command sudo systemctl stop telemetry.service

Removes telemetry container with command docker rm telemetry

Checks whether the script memory_checker ran by Monit will generate the syslog message saying it will exit without checking memory usage of telemetry.
2022-06-17 12:13:18 -07:00
tjchadaga
25bb471539
[sonic-swss] Advance submodule (#11170) 2022-06-17 10:00:46 -07:00
Shilong Liu
2851884c8b
[ci] Support to skip vstest using include/exclude config file. (#11086)
example:
├── folderA
│  ├──  fileA (skip vstest)
│  ├──  fileB
│  └──  fileC
If we want to skip vstest when changing /folderA/fileA, and not skip vstest when changing fileB or fileC.

vstest-include:
^folderA/fileA

vstest-exclude:
^folderA
2022-06-17 15:39:41 +08:00
bingwang-ms
83f23e26ff
Generate switch level dscp_to_tc_map entry from qos_config template (#11087)
* Generate switch level dscp_to_tc_map

Signed-off-by: bingwang <wang.bing@microsoft.com>
2022-06-17 08:44:30 +08:00
Santhosh Kumar T
faecf38417
[DellEMC] S5212F and S5224F 2.0 API changes (#10315)
Why I did it
S5212F - Platform API 2.0 changes
S5224F - Platform API 2.0 changes
How I did it
Implemented the functional API's needed for Platform API 2.0
Added media_settings.json, pcie.yaml, platform.json, system_health_monitoring_config.json files.
How to verify it
Used the API 2.0 test suite to validate the test cases.
2022-06-16 16:50:11 -07:00
youshcentec
1d68efe671
[centec]: fix docker syncd rpc compile(#11097)
Change makefile to reference to new SAI dev for docker-syncd-centec-rpc compile

Co-authored-by: yoush <yoush@centec.com>
2022-06-16 15:27:57 -07:00
byu343
89020f53e4
[Arista] Add support support for 7060dx5_64s and 7060px5_64s (#10888)
Why I did it
This change adds the support for Arista 7060dx5_64s and 7060px5_64s

How I did it
How to verify it
We verified the platform driver is working and the ports are up on 7060dx5_64s and 7060px5_64s.
2022-06-16 09:51:42 -07:00
Samuel Angebault
30bfed92fd
[Arista] Add configuration files for 7050X4-32S platform (#10799)
Add most configuration files for the DCS-7050PX4-32S and DCS-7050DX4-32S.
This review only contains platform configuration files, dataplane ones will follow in future change.

Co-authored-by: Zhi Yuan (Carl) Zhao <zyzhao@arista.com>
2022-06-16 09:42:10 -07:00
shlomibitton
1474ad76d8
[Mellanox] [pmon] Fix for PMON service not starting when restarting SWSS service after fast/warm reboot (#10901)
- Why I did it
Recent change to delay PMON service in case of fast/warm reboot introduce an issue when restarting only SWSS service after fast/warm reboot for Nvidia platform.
Since the timer is triggered only when the system boot, in a scenario when the system is after a fast/warm reboot and the user restart SWSS service, as part of syncd.sh script, PMON service will stop but the timer will not start again.

- How I did it
On syncd.sh script, in case of fast/warm indication, check if pmon.timer is running.
If it is running it means we are at the first boot and continue normally.
If it is not running, meaning the service was restarted, start the timer to keep the system behavior consistent.

- How to verify it
Run fast/warm reboot.
service swss restart.
Observe PMON service starting.

Signed-off-by: Shlomi Bitton <shlomibi@nvidia.com>
2022-06-16 12:15:09 +03:00
xumia
b6811a58cf
[Build] Improve docker build performance (#11111)
Why I did it
The docker storage driver vfs is not a good option for build, it uses the “deep copy” when building a new layer, leads to lower performance and more space used on disk than other storage drivers.
A better docker storage driver is the default one overlay2, it is a modern union filesystem.
2022-06-16 14:13:01 +08:00
Hua Liu
1dbfc66cef
[SSHD] Enable SSHD keepalive timeout feature (#11115)
#### Why I did it
SSHD keepalive timeout feature not enabled on sonic.

#### How I did it
Enable SSHD keepalive timeout feature by set ClientAliveCountMax to 1.

#### How to verify it
Pass All E2E test case.
Manually test with following steps:

1. Change config and restart sshd
2. Connect a ssh with -vvv option to show debug message
3. Get running ssh by command and stop it:

```
azureuser@liuh-dev-vm-02:~$ ps -auxww | grep vvv
azureus+ 1614153  0.0  0.0  12244  6004 pts/1S+   15:48   0:00 ssh admin@10.250.0.101 -vvv
azureus+ 1615570  0.0  0.0   8168  2424 pts/3S+   15:49   0:00 grep --color=auto vvv
azureuser@liuh-dev-vm-02:~$ kill -Stop 1614153
```

4. Check TCP status from server side with ss command:
https://man7.org/linux/man-pages/man8/ss.8.html

```
admin@vlab-01:~$ ss | grep -i ssh
tcp   ESTAB  0  010.250.0.101:ssh 10.250.0.1:58150
tcp   FIN-WAIT-2 0  010.250.0.101:ssh 10.250.0.1:58164
tcp   ESTAB  0  010.250.0.101:ssh 10.250.0.1:57978
```

FIN-WAIT-2 means server already terminate the connection and wait for client response:
https://kb.iu.edu/d/ajmi
.  FIN-WAIT-2  <-- <SEQ=300><ACK=101><CTL=ACK>  <-- CLOSE-WAIT

5. Check again later will show the session been complete closed:

```
admin@vlab-01:~$ ss | grep -i ssh
tcp   ESTAB  0  010.250.0.101:ssh 10.250.0.1:58150
tcp   ESTAB  0  010.250.0.101:ssh 10.250.0.1:57978
```
2022-06-15 22:27:07 -07:00
bingwang-ms
a035b2c3cd
Update YANG for PORT_QOS_MAP to support switch level mapping (#11089)
Signed-off-by: bingwang <wang.bing@microsoft.com>

Co-authored-by: Neetha John <nejo@microsoft.com>
2022-06-16 11:29:38 +08:00
zitingguo-ms
e2078627c7
[AN/LT][Fix bug]:enable phy_an_lt_msft attribute on some platforms (#11147) 2022-06-15 17:29:45 -07:00
Kebo Liu
b2bc90e34b
add flag skip_xcvrd_cmis_mgr to skip cmis task on Nvidia platform (#11120)
Signed-off-by: Kebo Liu <kebol@nvidia.com>
2022-06-15 16:33:08 -07:00
Taras Keryk
762fad624f
[BFN] Fixed settings for BMC interface (#10940)
* Fixed ipv6 settings for BMC interface

Signed-off-by: Taras Keryk <tarasx.keryk@intel.com>

* Added ip6 setting

Signed-off-by: Taras Keryk <tarasx.keryk@intel.com>
2022-06-15 16:31:40 -07:00
arunlk-dell
756de913cb
DellEMC: Initial commit for Z9432F platform (#10640)
Why I did it
Added support for the device Z9432F

How I did it
Implemented the support for the platform Z9432F

Switch Vendor: DellEMC
Switch SKU: Z9432F-ON
ASIC Vendor: Broadcom
SONiC Image: sonic-broadcom.bin
2022-06-15 09:39:41 -07:00
Jing Zhang
cfdb3d5978
[sonic-linkmgrd][master] submodule updates #11145
[sonic-linkmgrd][master] submodule updates

e7e00f2 Jing Zhang      Tue Jun 14 17:27:22 2022 -0700  Fix IP header checksum in handleSendSwitchCommand (#88)
127fd3d Jing Zhang      Tue Jun 7 09:33:03 2022 -0700   Check self's mux mode before switching peer to standby & add support for `detach` mode (#79)

sign-off: Jing Zhang zhangjing@microsoft.com
2022-06-15 09:19:43 -07:00
Neetha John
770d59e064
[sonic-config-engine] Generate expected output with different cable len (#11092)
Why I did it
To address internal build failures where the cable len for some of the skus is set to 300m for all tiers.

How I did it
For the buffers test, generate a new output file based off the original expected output with CABLE_LENGTH table updated to use 300m. In the comparison logic, compare against each of the expected output files and if any matches, the testcase is set to pass

Signed-off-by: Neetha John <nejo@microsoft.com>
2022-06-15 02:16:36 -07:00
Guohan Lu
8d299cab2b [submodule]: update sonic-sairedis
* 48cccb4 2022-06-13 | do not use sai_query_api_version if vendor sai does not support in VendorSai.cpp (#1064) (HEAD, origin/master, origin/HEAD) [Guohan Lu]
* 9b0f773 2022-06-13 | [vslib]: Fixbug in cleanup MACsec device (#1059) [Ze Gan]
* cdf9427 2022-06-11 | No sai api version check if vendor sai does not support (#1063) (HEAD, origin/master, origin/HEAD) [Guohan Lu]
* 3964cf1 2022-06-09 | [counter] Fix port flex counter  (#1052) [Junhua Zhai]
* 2231b7a 2022-06-03 | Purge package sonic-db-cli which depends on libswsscommon (#1057) [Qi Luo]
* 7aa09b9 2022-06-01 | Set PR diff code coverage threshold to 80% (#1039) [Kamil Cudnik]
* 66a29bc 2022-05-18 | [syncd] Use vendor SAI instead of direct SAI api (#1042) [Kamil Cudnik]
* 564bea7 2022-05-18 | [ci] Paralize azure pipeline (#1040) [Shilong Liu]
* 57ed180 2022-05-17 | [configure.ac] implement SAI API version check (#1000) [Stepan Blyshchak]
* 8894dc7 2022-05-17 | vslib: add support for read-only port capabilities (#1038) [Dante (Kuo-Jung) Su]
* 42af975 2022-04-29 | [vslib]: Update packet number of MACsec SA at runtime (#1007) [Ze Gan]

Signed-off-by: Guohan Lu <lguohan@gmail.com>
2022-06-14 15:38:10 -07:00
Guohan Lu
957285236a
[github]: update pr template to include 202205 backport option (#11139)
Signed-off-by: Guohan Lu <lguohan@gmail.com>
2022-06-14 11:48:59 -07:00
Jon Goldberg
b655cd49a3
[Nokia ixs7215] change var/log size to 4GB (#11122)
This makes use of #11121 to add support for configuration of VAR_LOG_SIZE on Nokia IXS7215
2022-06-14 08:57:33 -07:00
Jon Goldberg
9cdde8576c
[installer]: fix armhf for installer.conf usage (#11121)
This fixes the build for armhf to be able to use '/device///installer.conf' files. Specifically, armhf needs support to be able to change the size of /var/log/ directory. It is hardcoded to 512 bytes on all armhf platforms currently. This change will allow any armhf platform to be able to use an installer.conf file to customize the installed image.
2022-06-14 08:56:27 -07:00
Junhua Zhai
63c7c5b4e6
[submodule]: update sonic-swss-common (#11127)
2022-06-14 92f5411: [counter] Add counter table (Azure/sonic-swss-common#622) (Junhua Zhai)
2022-06-03 38c04dd: Purge package sonic-db-cli which depends on libswsscommon (Azure/sonic-swss-common#628) (Qi Luo)
2022-06-14 08:52:40 -07:00
jingwenxie
cca3b5be5b
Reduce logic in updategraph (#11010)
Why I did it
The dhcp_graph_url used by internal service is always set as "N/A". So we can make the updategraph logic short.

How I did it
Shorten 'if statement' logic for /tmp/dhcp_graph_url
2022-06-14 22:18:47 +08:00
jingwenxie
6a4105ad17
[sonic-utilities] update submodule (#11090)
29503ab [portchannel] Added ACL/PBH binding checks to the port before getting added to portchannel (#2151)
ac89489 Modify override testcase to cover PORT admin_status (#2165)
d7953d2 [GCU] Validate peer_group_range ip_range are correct (#2145)
aa81b97 [auto-ts] add memory check (#2116)
b370290 support new interface types CR8/SR8/KR8/LR8 which are brougnt by SAI V.1.10.2 (#2167)
87fc0a4 [scripts/fast-reboot] Add option to include ssd-upgrader-part boot option with SONiC partition (#2150)
90abc07 [config reload] Fix invalid rstrip. (#2157)
fac1769 Accept 0 for queue and dscp (#2162)
2022-06-14 06:56:18 +08:00
Prince George
201792f3cd
Yang model for xcvr tx power and frequency configuration (#11053)
* Yang model for xcvr tx power and frequency configuration

* Add unit test cases

* Addressed review comments
2022-06-12 09:14:35 -07:00
Richard.Yu
356b51f4d6
[Tunnel PFC][Fix bug] Fix bug and Tests for adding property 'sai_remap_prio_on_tnl_egress' (#11027)
* [Tunnel PFC] Tests for adding property 'sai_remap_prio_on_tnl_egress'

Add tests for adding property 'sai_remap_prio_on_tnl_egress', this
property should only be added in dual tor environment.

Test done:
Run test test_j2files.py

Co-authored-by: richardyu <richardyu@contoso.com>
2022-06-10 11:14:45 -07:00
Guohan Lu
245884a97f
[submodule]: update sonic-swss (#11094)
* b12af41 2022-06-09 | [fpmsyncd] don't manipulate route weight (#2320) (HEAD, origin/master, origin/HEAD) [Ying Xie]
* a3f4fbb 2022-06-09 | Combine PGs in buffermgrd (#2281) [bingwang-ms]

Signed-off-by: Guohan Lu <lguohan@gmail.com>
2022-06-10 11:03:36 -07:00
Saikrishna Arcot
ead106eda8
Add ping to swss-layer docker (#11093)
Signed-off-by: Saikrishna Arcot <sarcot@microsoft.com>
2022-06-10 07:40:37 -07:00
judyjoseph
0b1ae9c43c
Cleanup macsec stateDB tables on restart (#11066)
Clean macsec tables in STATE_DB on start
2022-06-09 15:32:24 -07:00
Ying Xie
26436e34de
[submodule][swss] advance swss submodule head (#11015)
To included:
* ad8f5e4 2022-06-08 | Revert "[Counters] Improve performance by polling only configured ports buffer queue/pg counters (#2143)" (#2315) (HEAD -> master, origin/master, origin/HEAD) [Ying Xie]
* 2ff763f 2022-06-08 | Fix test_warm_reboot issues blocking PR merge (#2309) [Vaibhav Hemant Dixit]
* 05d19ea 2022-06-02 | Purge package sonic-db-cli which depends on libswsscommon (#2308) [Qi Luo]
* a0c3238 2022-06-03 | Add port counter sanity check (#2300) [Junhua Zhai]
* 4944f0f 2022-06-03 | Revert "[portsorch]: Prevent LAG member configuration when port has active ACL binding (#2165)" (#2306) [bingwang-ms]
* eba212d 2022-05-31 | [Counters] Improve performance by polling only configured ports buffer queue/pg counters (#2143) [shlomibitton]
* 9999dae 2022-05-28 | [counter] Support gearbox counters (#2218) [Junhua Zhai]
* c73cf10 2022-05-28 | Support mock_test infra for dynamic buffer manager and fix issues found during mock test (#2234) [Stephen Sun]

Signed-off-by: Ying Xie <ying.xie@microsoft.com>
2022-06-09 12:43:07 -07:00
SuvarnaMeenakshi
77ad3ff19d
[portconfig]: Remove try exception during config_db initialization. (#10960)
Why I did it
Provide fix for comment: https://github.com/Azure/sonic-buildimage/pull/10475/files#r847753187;
Move laoding database config to application code instead of portconfig as portconfig is used as a library.
#10581 was raised for this fix, but had to be reverted due to issue with multi-asic platform.

How I did it
Remove try exception handing from portconfig.py during config_db intialization.
Move loading of database config to application that uses portconfig.py.

How to verify it
unit-test passes.
Verified that it does not cause issue during boot up of multi-asic VS image.
Verified that config_db generation was successful in multi-asic VS.
2022-06-09 11:17:53 -07:00
Aravind Mani
a07765ffea
[DellEMC] Fix S5248f platform issues (#11076)
* [DellEMC] Fix S5248f platform issues

* update files

Co-authored-by: Aravind Mani <aravind.m1@dell.com>
2022-06-09 09:38:13 -07:00
Ying Xie
67e89b837b
[makefile] remove all fsroot folders (#11030)
Why I did it
Make reset didn't clean-up all fsroot folders.

How I did it
Remove all fsroot folders used during build.

How to verify it
Run local build and local make reset:

sudo mkdir fsroot-test
sudo touch fsroot-test/foo
make reset
(Without this change, make reset cannot remove fsroot-foo, with the change, the repo become clean after make reset.)

Signed-off-by: Ying Xie ying.xie@microsoft.com
2022-06-08 18:00:37 -07:00
xumia
36cdaa0c66
[Bug]: fix the version file name issue (#11072)
Why I did it
[Bug]: fix the version file name issue
2022-06-09 06:49:49 +08:00
Jing Zhang
6b70b448d2
[sonic-linkmgrd][master] Submodule Update (#11035)
[sonic-linkmgrd][master] Submodule Update

9c8a16e Jing Zhang Sun Jun 5 08:27:07 2022 -0700 Separate I2C mux state probing and gRPC forwarding state probing (#86)
491c4ee Longxiang Lyu Sun Jun 5 23:26:19 2022 +0800 Fix peer mux wait back off factor (#84)
a0b6b14 Jing Zhang Wed Jun 1 10:33:12 2022 -0700 Revert "Update log level for mux probing and mux state chance (#23)" (#85)
3c2b546 Jing Zhang Tue May 31 10:14:42 2022 -0700 Add default route support to active-active state machine (#78)
6fa892e Longxiang Lyu Fri May 27 09:15:06 2022 +0800 Degrade LinkProberStateMachineBase virtual function logging level (#80)
7b695ca Longxiang Lyu Fri May 27 09:14:02 2022 +0800 Fix mux wait timer and peer mux wait timer (#81)
2022-06-07 07:52:13 -07:00
Junchao-Mellanox
f135f37a50
[Mellanox] optimize platform API import time (#10815)
- Why I did it
"import sonic_platform" takes about 600ms ~ 1000ms, it is kind of slow. After this optimization, the time is about 100ms. The benefit is that those CLIs which does not need the slow import sentence would be faster than before.

- How I did it
Find slow import and call them when need.

- How to verify it
Measure the import time.
2022-06-07 15:13:16 +03:00
Volodymyr Samotiy
5c869492bd
[Mellanox] Update SDK/FW to 4.5.2262/xx.2010.2262 (#10882)
- Why I did it
To include latest fixes:
1. Warmboot | When trying to reconfigure the Flex Parser header and Flex transition parameters after ISSU, the switch will returned an error even if the configuration was identical to that done before performing the ISSU.
2. Link Up | When toggling many ports of the Spectrum devices while raising 10GbE link up and link maintenance is enabled, the switch may get stuck and may need to be rebooted.
3. Shared buffer | While moving from lossless to lossy while shared headroom was used, reduction of the shared headroom can only be done prior to pool type change and when shared headroom is not utilized.

- How I did it
Updated SDK submodule along with the relevant Makefiles

- How to verify it
Build an image and run tests from "sonic-mgmt".

Signed-off-by: Volodymyr Samotiy <volodymyrs@nvidia.com>
2022-06-07 15:11:25 +03:00
Kebo Liu
a3895c3116
[Mellanox] Update SN2201 sai profile and platform reboot script (#10978)
- Why I did it
1. SN2201 sai profile needs to be updated according to the latest hardware.
2. In the reboot script, need to use the common symbol link of the power_cycle sysfs instead of directly accessing it due to SN2201 sysfs is different than other platforms.
3. echo 1 > $SYSFS_PWR_CYCLE will trigger the reboot immediately, the following sleep 3 and echo 0 > $SYSFS_PWR_CYCLE will never be executed, can be removed.

- How I did it
1. Replace the SN2201 sai profile with the latest one.
2. In the platform_reboot script, replace the direct sysfs path with the symbol link path.
3. Remove the redundant code from platform_reboot

- How to verify it
Perform reboot on all the Nvidia platforms, and check all can be rebooted successfully.

Signed-off-by: Kebo Liu <kebol@nvidia.com>
2022-06-07 15:05:54 +03:00