#### Why I did it
SSHD keepalive timeout feature not enabled on sonic.
#### How I did it
Enable SSHD keepalive timeout feature by set ClientAliveCountMax to 1.
#### How to verify it
Pass All E2E test case.
Manually test with following steps:
1. Change config and restart sshd
2. Connect a ssh with -vvv option to show debug message
3. Get running ssh by command and stop it:
```
azureuser@liuh-dev-vm-02:~$ ps -auxww | grep vvv
azureus+ 1614153 0.0 0.0 12244 6004 pts/1S+ 15:48 0:00 ssh admin@10.250.0.101 -vvv
azureus+ 1615570 0.0 0.0 8168 2424 pts/3S+ 15:49 0:00 grep --color=auto vvv
azureuser@liuh-dev-vm-02:~$ kill -Stop 1614153
```
4. Check TCP status from server side with ss command:
https://man7.org/linux/man-pages/man8/ss.8.html
```
admin@vlab-01:~$ ss | grep -i ssh
tcp ESTAB 0 010.250.0.101:ssh 10.250.0.1:58150
tcp FIN-WAIT-2 0 010.250.0.101:ssh 10.250.0.1:58164
tcp ESTAB 0 010.250.0.101:ssh 10.250.0.1:57978
```
FIN-WAIT-2 means server already terminate the connection and wait for client response:
https://kb.iu.edu/d/ajmi
. FIN-WAIT-2 <-- <SEQ=300><ACK=101><CTL=ACK> <-- CLOSE-WAIT
5. Check again later will show the session been complete closed:
```
admin@vlab-01:~$ ss | grep -i ssh
tcp ESTAB 0 010.250.0.101:ssh 10.250.0.1:58150
tcp ESTAB 0 010.250.0.101:ssh 10.250.0.1:57978
```
Why I did it
Added support for the device Z9432F
How I did it
Implemented the support for the platform Z9432F
Switch Vendor: DellEMC
Switch SKU: Z9432F-ON
ASIC Vendor: Broadcom
SONiC Image: sonic-broadcom.bin
Why I did it
To address internal build failures where the cable len for some of the skus is set to 300m for all tiers.
How I did it
For the buffers test, generate a new output file based off the original expected output with CABLE_LENGTH table updated to use 300m. In the comparison logic, compare against each of the expected output files and if any matches, the testcase is set to pass
Signed-off-by: Neetha John <nejo@microsoft.com>
* 48cccb4 2022-06-13 | do not use sai_query_api_version if vendor sai does not support in VendorSai.cpp (#1064) (HEAD, origin/master, origin/HEAD) [Guohan Lu]
* 9b0f773 2022-06-13 | [vslib]: Fixbug in cleanup MACsec device (#1059) [Ze Gan]
* cdf9427 2022-06-11 | No sai api version check if vendor sai does not support (#1063) (HEAD, origin/master, origin/HEAD) [Guohan Lu]
* 3964cf1 2022-06-09 | [counter] Fix port flex counter (#1052) [Junhua Zhai]
* 2231b7a 2022-06-03 | Purge package sonic-db-cli which depends on libswsscommon (#1057) [Qi Luo]
* 7aa09b9 2022-06-01 | Set PR diff code coverage threshold to 80% (#1039) [Kamil Cudnik]
* 66a29bc 2022-05-18 | [syncd] Use vendor SAI instead of direct SAI api (#1042) [Kamil Cudnik]
* 564bea7 2022-05-18 | [ci] Paralize azure pipeline (#1040) [Shilong Liu]
* 57ed180 2022-05-17 | [configure.ac] implement SAI API version check (#1000) [Stepan Blyshchak]
* 8894dc7 2022-05-17 | vslib: add support for read-only port capabilities (#1038) [Dante (Kuo-Jung) Su]
* 42af975 2022-04-29 | [vslib]: Update packet number of MACsec SA at runtime (#1007) [Ze Gan]
Signed-off-by: Guohan Lu <lguohan@gmail.com>
This fixes the build for armhf to be able to use '/device///installer.conf' files. Specifically, armhf needs support to be able to change the size of /var/log/ directory. It is hardcoded to 512 bytes on all armhf platforms currently. This change will allow any armhf platform to be able to use an installer.conf file to customize the installed image.
Why I did it
The dhcp_graph_url used by internal service is always set as "N/A". So we can make the updategraph logic short.
How I did it
Shorten 'if statement' logic for /tmp/dhcp_graph_url
29503ab [portchannel] Added ACL/PBH binding checks to the port before getting added to portchannel (#2151)
ac89489 Modify override testcase to cover PORT admin_status (#2165)
d7953d2 [GCU] Validate peer_group_range ip_range are correct (#2145)
aa81b97 [auto-ts] add memory check (#2116)
b370290 support new interface types CR8/SR8/KR8/LR8 which are brougnt by SAI V.1.10.2 (#2167)
87fc0a4 [scripts/fast-reboot] Add option to include ssd-upgrader-part boot option with SONiC partition (#2150)
90abc07 [config reload] Fix invalid rstrip. (#2157)
fac1769 Accept 0 for queue and dscp (#2162)
* [Tunnel PFC] Tests for adding property 'sai_remap_prio_on_tnl_egress'
Add tests for adding property 'sai_remap_prio_on_tnl_egress', this
property should only be added in dual tor environment.
Test done:
Run test test_j2files.py
Co-authored-by: richardyu <richardyu@contoso.com>
Why I did it
Provide fix for comment: https://github.com/Azure/sonic-buildimage/pull/10475/files#r847753187;
Move laoding database config to application code instead of portconfig as portconfig is used as a library.
#10581 was raised for this fix, but had to be reverted due to issue with multi-asic platform.
How I did it
Remove try exception handing from portconfig.py during config_db intialization.
Move loading of database config to application that uses portconfig.py.
How to verify it
unit-test passes.
Verified that it does not cause issue during boot up of multi-asic VS image.
Verified that config_db generation was successful in multi-asic VS.
Why I did it
Make reset didn't clean-up all fsroot folders.
How I did it
Remove all fsroot folders used during build.
How to verify it
Run local build and local make reset:
sudo mkdir fsroot-test
sudo touch fsroot-test/foo
make reset
(Without this change, make reset cannot remove fsroot-foo, with the change, the repo become clean after make reset.)
Signed-off-by: Ying Xie ying.xie@microsoft.com
- Why I did it
"import sonic_platform" takes about 600ms ~ 1000ms, it is kind of slow. After this optimization, the time is about 100ms. The benefit is that those CLIs which does not need the slow import sentence would be faster than before.
- How I did it
Find slow import and call them when need.
- How to verify it
Measure the import time.
- Why I did it
To include latest fixes:
1. Warmboot | When trying to reconfigure the Flex Parser header and Flex transition parameters after ISSU, the switch will returned an error even if the configuration was identical to that done before performing the ISSU.
2. Link Up | When toggling many ports of the Spectrum devices while raising 10GbE link up and link maintenance is enabled, the switch may get stuck and may need to be rebooted.
3. Shared buffer | While moving from lossless to lossy while shared headroom was used, reduction of the shared headroom can only be done prior to pool type change and when shared headroom is not utilized.
- How I did it
Updated SDK submodule along with the relevant Makefiles
- How to verify it
Build an image and run tests from "sonic-mgmt".
Signed-off-by: Volodymyr Samotiy <volodymyrs@nvidia.com>
- Why I did it
1. SN2201 sai profile needs to be updated according to the latest hardware.
2. In the reboot script, need to use the common symbol link of the power_cycle sysfs instead of directly accessing it due to SN2201 sysfs is different than other platforms.
3. echo 1 > $SYSFS_PWR_CYCLE will trigger the reboot immediately, the following sleep 3 and echo 0 > $SYSFS_PWR_CYCLE will never be executed, can be removed.
- How I did it
1. Replace the SN2201 sai profile with the latest one.
2. In the platform_reboot script, replace the direct sysfs path with the symbol link path.
3. Remove the redundant code from platform_reboot
- How to verify it
Perform reboot on all the Nvidia platforms, and check all can be rebooted successfully.
Signed-off-by: Kebo Liu <kebol@nvidia.com>
* [BGP]Adding configuration knob to allow advertise Loopback ipv6 /128 prefix
By default when IPv6 address is configured with /128 as subnet mask in Loopback0 interface, it will be advertised as prefix with /64 subnet.
To control this behavior a new field 'bgp_adv_lo_prefix_as_128' is introduced in DEVICE_METADATA table which when set to true will advertise prefix with /128 subnet as it is.
Why I did it
As part of PCBB changes, we need to enable 2 extra lossless queues. The changes in this PR are done to adjust only the reserved sizes on Th2 for the additional 2 lossless queues
Calculations are done based on 40 downlinks for T1 and 16 uplinks for dual ToR
How to verify it
Verified that the rendering works fine on Th2 dut
Unit tests have been updated to reflect the modified buffer sizes when pcbb is enabled. There are existing testcases that will test the original buffer sizes when pcbb is disabled. With these changes, was able to build sonic-config-engine wheel successfully
Signed-off-by: Neetha John <nejo@microsoft.com>
Why I did it
Some PR test are timing out on T1-lag kvm test.
How I did it
Increase the timeout to 5 hours.
How to verify it
Test on this PR.
Signed-off-by: Ying Xie ying.xie@microsoft.com
* [Tunnel PFC] Add property for tunnel PFC
Replace the config.bcm file with j2 template file
- Add 'sai_remap_prio_on_tnl_egress=1' property when device metadata local
- Host subtype is 'dualtor'
- Change sai.profile foe the new config.bcm.j2
To not try to build python2 bindings for sairedis for bullseye. The same solution was done for swss-common package.
Releated changes Azure/sonic-sairedis#1050
Signed-off-by: Stepan Blyschak <stepanb@nvidia.com>
#### Why I did it
Switch py-common from swsssdk to swsscommon.
#### How I did it
Change code and make file to use swsscommon.
#### How to verify it
Pass all UT and E2E test.
#### Which release branch to backport (provide reason below if selected)
<!--
- Note we only backport fixes to a release branch, *not* features!
- Please also provide a reason for the backporting below.
- e.g.
- [x] 202006
-->
- [ ] 201811
- [ ] 201911
- [ ] 202006
- [ ] 202012
- [ ] 202106
- [ ] 202111
#### Description for the changelog
<!--
Write a short (one line) summary that describes the changes in this
pull request for inclusion in the changelog:
-->
#### Link to config_db schema for YANG module changes
<!--
Provide a link to config_db schema for the table for which YANG model
is defined
Link should point to correct section on https://github.com/Azure/SONiC/wiki/Configuration.
-->
#### A picture of a cute animal (not mandatory but encouraged)
Why I did it
Recently the nightly testing pipeline found that the autorestart test case was failed when it was run against master image. The reason is Restart= field in each container's systemd configuration file was set to Restart=no even the value of auto_restart field in FEATURE table of CONFIG_DB is enabled.
This issue introduced by #10168 can be reproduced by the following steps:
Issues the config command to disable the auto-restart feature of a container
Runs command config reload or config reload minigraph to enable auto-restart of the container
Checks Restart= field in the container's systemd config file mentioned in step 1 by running the command
sudo systemctl cat <container_name>.service
Initially this PR (#10168) wants to revert the changes proposed by this: #8861. However, it did not fully revert all the changes.
How I did it
When hostcfgd started or was restarted, the Restart= field in each container's systemd configuration file should be initialized according to the value of auto_restart field in FEATURE table of CONFIG_DB.
How to verify it
I verified this change by running auto-restart test case against newly built master image and also ran the unittest:
Signed-off-by: bingwang <bingwang@microsoft.com>
Why I did it
This PR is to add two extra lossless queues for bounced back traffic.
HLD sonic-net/SONiC#950
SKUs include
Arista-7050CX3-32S-C32
Arista-7050CX3-32S-D48C8
Arista-7260CX3-D108C8
Arista-7260CX3-C64
Arista-7260CX3-Q64
How I did it
Update the buffers.json.j2 template and buffers_config.j2 template to generate new BUFFER_QUEUE table.
For T1 devices, queue 2 and queue 6 are set as lossless queues on T0 facing ports.
For T0 devices, queue 2 and queue 6 are set as lossless queues on T1 facing ports.
Queue 7 is added as a new lossy queue as DSCP 48 is mapped to TC 7, and then mapped into Queue 7
How to verify it
Verified by UT
Verified by coping the new template and generate buffer config with sonic-cfggen
#### Why I did it
Fix sonic-db-cli high CPU usage on SONiC startup issue: https://github.com/Azure/sonic-buildimage/issues/10218
ETA of this issue will be 2022/05/31
#### How I did it
Re-write sonic-cli with c++ in sonic-swss-common: https://github.com/Azure/sonic-swss-common/pull/607
Modify swss-common rules and slave.mk to install c++ version sonic-db-cli.
#### How to verify it
Pass all E2E test scenario.
#### Which release branch to backport (provide reason below if selected)
<!--
- Note we only backport fixes to a release branch, *not* features!
- Please also provide a reason for the backporting below.
- e.g.
- [x] 202006
-->
- [ ] 201811
- [ ] 201911
- [ ] 202006
- [ ] 202012
- [ ] 202106
- [ ] 202111
#### Description for the changelog
Build and install c++ version sonic-db-cli from swss-common.
#### Link to config_db schema for YANG module changes
<!--
Provide a link to config_db schema for the table for which YANG model
is defined
Link should point to correct section on https://github.com/Azure/SONiC/wiki/Configuration.
-->
#### A picture of a cute animal (not mandatory but encouraged)
#### Why I did it
No logs currently exist for sonic-save-X containers which makes it difficult to debug.
#### How I did it
Altered Makefile.work to create logs in the sonic-slave-X folder while still displaying the log to the screen to prevent interfering with any existing tooling.
#### How to verify it
Do `make configure` and verify that logs show up in `sonic-slave-buster/` and `sonic-slave-bullseye/`
#### Description for the changelog
Add logging for slave container builds
#### A picture of a cute animal (not mandatory but encouraged)
TBD
* When reloading config after crashes, VTEP interfaces are sometimes not created since the tunnel still exists in the STATE_DB.
* Adding VXLAN_TUNNEL_TABLE to the list of tables to be cleaned in swss.sh fixes the problem.
Currently, the build with ASAN_ENABLE=y reuses the packages built with
ASAN_ENABLE=n (and vice versa). To address this issue, ASAN_ENABLE is added to DEP_FLAGS for asan-enabled packages (docker-syncd-mlnx, syncd, docker-orchagent, swss).
- Why I did it
To make dpkg cache use/rebuild the packages for ASAN_ENABLE=y/n.
- How I did it
Added ASAN_ENABLE to the DEP_FLAGS for asan-enabled packages.
- How to verify it
Built with ASAN_ENABLE=y/n and checked the .flags .log files.
Signed-off-by: Yakiv Huryk <yhuryk@nvidia.com>
This is to improve the readability of ASAN reports. The debug package adds function names and source code references to the backtrace (currently, there are only binary addresses of functions)
Another way to address this issue is to build the image with "INSTALL_DEBUG_TOOLS=y". The downside of this approach is that the image size and compilation time are unnecessarily big. Also, the idea is to make the "ENABLE_ASAN" self-sufficient, which would not be the case for this approach.
- Why I did it
To improve the readability of asan logs.
- How I did it
Added SYNCD_DBG and SWSS_DBG to corresponding docker images for ASAN_ENABLE=y build
- How to verify it
Add artificial memory leak
Build with ASAN_ENABLE=y
Test the image and check the ASAN report
Signed-off-by: Yakiv Huryk <yhuryk@nvidia.com>