#### Why I did it
I made this change to support warm/fast reboot for SONiC extension packages as per HLD Azure/SONiC#682.
#### How I did it
I extended manifest.json.j2 with new warm/fast reboot related fields and also extended sonic_debian_extension.j2 script template to generate the shutdown order files for warm and fast reboot.
Why I did it
Static route configuration should not depend on BGP_ASN. Remove the dependency on BGP_ASN for StaticRouteMgr.
Fix#8027
How I did it
Check if BGP_ASN field before configuring static route redistribution and wait until BGP_ASN is available to enable static route redistribution.
How to verify it
Add unit test to cover the scenario and verify the functionality on a virtual switch.
After https://github.com/Azure/sonic-buildimage/pull/7598 the packages.json generation is broken. This change fixes it make the whole build fail in case generation failed.
Signed-off-by: Stepan Blyschak <stepanb@nvidia.com>
#### Why I did it
Restrict the min-links parameter in "config portchannel" to the range 1-1024.
FixesAzure/sonic-buildimage#6781 in conjunction with https://github.com/Azure/sonic-buildimage/pull/1630.
Align YANG model with limits in libteam and sonic-utilties.
#### How I did it
PR 1630 in sonic-utilities prevents CLI user from entering a value outside the allowed range. This PR does the following:
- Increases the maximum value of min-links from 128 to 1024.
- Provides validation in libteam, incorporating as a patch the code in https://git.kernel.org/pub/scm/linux/kernel/git/jpirko/libteam.git/commit/?id=69a7494bb77dc10bb27076add07b380dbd778592.
- Updates the Yang model upper limit from 128 to 1024 (was inconsistent with libteam value).
- Updates the Yang model lower limit from 1 to 0, since 0 is set as default in sonic-utilities which would fail its new range check otherwise.
- Added Yang tests for valid and invalid value.
#### How to verify it
config portchannel add PortChannel0004 --min-links 1024
Command should be accepted.
show interfaces portchannel
Output should show PortChannel0004, no errors on CLI.
config portchannel add PortChannel0005 --min-links 1025
Command should be rejected
show interfaces portchannel
Output should not show PortChannel0005 , no errors on CLI.
#### Which release branch to backport (provide reason below if selected)
#### Description for the changelog
Updates YANG model to allow up to 1024 min_links for portchannel. FixesAzure/sonic-buildimage#6781 in conjunction with https://github.com/Azure/sonic-buildimage/pull/1630.
- Why I did it
Currently dhcp packets are disabled by the COPP manager for non ToRRouter type switches.
Even if the feature is enabled, DHCP packets wont hook to the CPU since the COPP manager will not trap this packets.
This change is to disable dhcp_relay by default for non ToRRouter switches from init_cfg.json.
With this approach, if the user want to enable the feature for non ToRRouter switches, manual enablement is required by the 'feature' configuration.
This is to keep the current approach for MSFT production issue with dhcp relay for non ToRRouter switched and allow the user to decide if to use it or not.
- How I did it
Configure dhcp_relay 'disabled' by default on init_cfg.json for non ToRRouter switches.
Remove the exclusion of dhcp packets on copp_cfg.json
- How to verify it
Enable dhcp_relay feature on a non ToRRouter switch.
Unit-tests modified so the default values on mocked CONFIG DB in 'test_vectors.py' for dhcp_relay will be 'disabled'.
This is by the change for 'init_cfg.json.j2'.
For ToRRouter the state will change from 'disabled' to 'enabled'.
Another test case added for a 'ToR' switch type, this is to test the state is 'enabled' if the user configured it to be so.
Why I did it
systemd-sonic-generator limits multi-asic unit file instances to 10 (single digit instance number 0 - 10). This limitation needs to be removed to handle more than 10 asics.
MAX_NUM_TARGETS and MAX_NUM_INSTALL_LINES limits to 15 which is not sufficient for systems with more than 15 asics.
Inside get_unit_files(), strcmp produce incorrect results due to non null terminated string being compared.
Added build UT support for systemd-sonic-generator
Changes:
3c485e5 [recorder] Fix incorrect attribute enum value capability query (#843)
677ebca [sairedis] Client/Server support zmq configuration file (#845)
7c70e34 [sairedis] Add support for bulk api in client/server (#844)
76d28a6 [pyext] Use SAI autogenerated saiswig.i (#837)
9949c48 [vslib] implement query for SAI_DEBUG_COUNTER_TYPE enum values (#842)
e385212 [MPLS] Minor tweaks to VS for MPLS support for CRM polling of MPLS In-segments and NHs.
d819f97 [meta] Add support for ignored attributes names (#836)
c163238 Add cisco-8000 checks to syncd_init_common (#839)
9aed2ff [sairedis] Add support for client server architecture (#838)
Signed-off-by: Stepan Blyschak <stepanb@nvidia.com>
NOTE: This is cherry-pick from 1911/2012 to master.
- Why I did it
To fix LAG IP configuration race
- How I did it
Extended timeout for teammgrd
- How to verify it
Add >80 router LAGs. Do config reload
Signed-off-by: Nazarii Hnydyn <nazariig@nvidia.com>
Updates:
888701b [Mellanox] Remove mstdump from Mellanoxs collect dump script ([Azure/sonic-utilities#1706])
4818360 [sonic-package-manager] support warm/fast reboot for extension packages ([Azure/sonic-utilities#1554])
793b847 [show priority-group drop counters] Remove backup with cached PG drop counters after 'config reload' ([Azure/sonic-utilities#1679])
24fe1ac [show][config] support for interface alias for muxcable commands ([Azure/sonic-utilities#1699])
Update FW version to 2008.3218, fixing the following issues:
- 50G/100G links that are operationally down before warm-reboot are not coming up after warm-reboot
- 50G/100G links with admin shut / no shut commands are not coming up after warm-reboot
Signed-off-by: Dror Prital <drorp@nvidia.com>
- Why I did it
* For SAI - Advance to adopt the following fixes:
1. Better handle not implement object type for resource availability
2. Fix ext dump when saidump is triggered from 2nd process (saidump utility) other than main adapter host (syncd in SONiC)
* For SDK\FW:
- Changes and new features:
1. Added support in SN4600C systems for new module Finisar ET7402-CWDM4 (100G CWDM4 QSFP28 1310nm SM 2KM).
2. Added support for new module MMS1W50-HM (2km transceiver FR4) for 200GbE
3. Improved performance of "per-port-buffer" counters
4. Added support for Kernel 5.10
- Bugs fixes:
On rare occasions (0.5%), in SN4600C systems, when using 100GbE NRZ mode and Fastboot flow, the link up time may take up to 10 seconds
Signed-off-by: Dror Prital <drorp@nvidia.com>
Why I did it
Currently hostcfgd is implemented in a way each feature which is enabled/disabled triggering execution of systemctl enable/unmask commands which eventually trigger 'systemctl daemon-reload' command.
Each call like this cost 0.6s and overall add a overhead of ~12 seconds of CPU time.
This change will verify the desired state of a feature and the current state of this feature on systemd and trigger a system call only when must.
How I did it
Check each feature status on systemd before executing a system call to enable and reload the systemctl daemon.
How to verify it
Build an image with this change and observe less system calls are executed.
Signed-off-by: Shlomi Bitton <shlomibi@nvidia.com>
dash doesn't support += operation to append to a variable's value. Use KDUMP_CMDLINE_APPEND="${KDUMP_CMDLINE_APPEND} " instead
The below error message is seen when a reboot is issued.
[ 342.439096] kdump-tools[13655]: /etc/init.d/kdump-tools: 117: /etc/default/kdump-tools: KDUMP_CMDLINE_APPEND+= panic=10 debug hpet=disable pcie_port=compat pci=nommconf sonic_platform=x86_64-accton_as7326_56x-r0: not found
#### Why I did it
The process of config generation (sonic-cfggen) fails, but the services continue to run with invalid config
#### How I did it
* add exit with error on errors in start.sh script (because supervisord relies on start.sh return code).
* fix jinja template. Jinja use common python expressions under the hood and `has_key` method was removed from dict in py3, so use check by `in` operator as it is supported by both py2 and py3.
#### How to verify it
* compile sonic with enabled iccp.
* add mclag config to CONFIG_DB.
```
'MC_LAG|1' => {
"local_ip": "10.0.0.2",
"peer_ip": "10.0.0.3",
"peer_link": "Ethernet8",
"mclag_interface": "Ethernet12"
}
* unmaks, enable and start swss and iccpd services in sonic.
* log in into the iccpd container and check the config file `/etc/iccpd/iccpd.conf`
* expected config:
```
mclag_id:1
local_ip:10.0.0.2
peer_ip:10.0.0.3
peer_link:Ethernet8
mclag_interface:Ethernet12
system_mac:YOUR_SYSTEM_MAC
#### Description for the changelog
Fixed initial iccpd startup configuration.
186d8513 Pcieutil to load the platform api first instead of using common api (#1672)
7a82c069 [Mellanox] Update mellanox dump generation to include SDK dumps (#1640)
38f8c068 [sfputil] Expose error status fetched from STATE_DB or platform API to CLI (#1658)
c5d00ae4 [pfcwd] Fix the return code in invalid case (#1691)
57dc4032 [ci]: Fix config prompt question issue (#1693)
Signed-off-by: Stephen Sun <stephens@nvidia.com>
The voq system lag id boundary is set in redis-chassis. Changes include
setting this from database-chassis container. This fixes a timing issue
in finding datbase_config.json file from redis directory which is
created from database container. Since database container usually
starts after database-chassis container the existence of this file is
unreliable while running the command. Running the command under
database-chassis container makes sure that the database_config.json form
redis-chassis directory is guaranteed to be available and hence fixes the
timing issue.
Signed-off-by: vedganes <vedavinayagam.ganesan@nokia.com>
#### Why I did it
ethtool can be used to query and change settings such as speed, auto- negotiation and checksum offload on many network devices, especially Ethernet devices.
#### How I did it
add package extension to docker-platform-monitor/Dockerfile.j2
#### Why I did it
The libpci library provides portable access to configuration registers of devices connected to the PCI bus.
#### How I did it
update dockers/docker-platform-monitor/Dockerfile.j2
Why I did it
Multiple build failed in 202012 branch
It is caused by the disorder of the package urls retrieved from the command "apt-get download --print-urls "
Why I did it
We hit an issue recently in the chassis bringup where the linux bde attach failed with the following ioctl error.
[ 9058.585960] linux-user-bde (897363): Error: Invalid ioctl (00004c1d)
[ 9105.668237] linux-user-bde (901002): Error: Invalid ioctl (00004c1d)
Debugged with Broadcom team, who suggested to use this flag BCM_INSTANCE_SUPPORT to support multi-instance scenarios ( platforms with more than one asic where there are separate sai/syncd docker instances running controller each asic instance).
This flag was introduced since SDK-6.5.21 and need to be present in SAI and SAI GPL kernel module makefile.
How I did it
Add the flag in this flag BCM_INSTANCE_SUPPORT in gpl modules
Why I did it
To determine the revision of the pcie.yaml to be used based on BIOS version in DellEMC S6100 platform.
Depends on: Azure/sonic-platform-common#195
How I did it
Added two revisions of pcie.yaml pcie_1.yaml and pcie_2.yaml
Included a platform-specific Pcie class to provide the revision of the pcie.yaml to be used by pcieutil/pcied.
How to verify it
Execute pcieutil check (Azure/sonic-utilities#1672) command and verify the list of PCIe devices displayed.
Logs: UT_logs.txt
Signed-off-by: Stepan Blyschak stepanb@mellanox.com
Why I did it
To support building DHCP relay as extension and installing it during build time.
How I did it
Created infrastructure. Users need to define their packages in rules/sonic-packages.mk
How to verify it
Together with #6531
Before this change, a process running inside every SONiC container dealt with FEATURE table 'auto_restart' field and depending on the value decided whether a container has to be killed or not.
If killed service auto restart mechanism restarts the container.
This change moves the logic from container to the host daemon - hostcfgd.
The 'auto_restart' handling is kept in supervisor-proc-exit-listener but now it is not required for container that wants to support auto restart feature.
hostcfgd refactoring - move feature handling in another class.
override systemd service Restart= setting from hostcfgd.
remove default systemd Restart=always.
Signed-off-by: Stepan Blyshchak stepanb@nvidia.com
- Why I did it
Remove the need to deal with container orchestration logic from the container itself. Leave this logic to the orchestrator - host OS.
- How I did it
hostcfgd configures 'Restart=' value for systemd service.
- How to verify it
root@r-tigon-11:/home/admin# sudo config feature autorestart lldp enabled
root@r-tigon-11:/home/admin# show feature status | grep lldp
lldp enabled enabled
root@r-tigon-11:/home/admin# docker exec -it lldp pkill -9 lldpd
root@r-tigon-11:/home/admin# docker ps -a | grep lldp
65058396277c docker-lldp:latest "/usr/bin/docker-lld…" 2 days ago Exited (0) 20 seconds ago lldp
root@r-tigon-11:/home/admin# docker ps -a | grep lldp
65058396277c docker-lldp:latest "/usr/bin/docker-lld…" 2 days ago Up 5 seconds lldp
root@r-tigon-11:/home/admin# sudo config feature autorestart lldp disabled
root@r-tigon-11:/home/admin# docker exec -it lldp pkill -9 lldpd
root@r-tigon-11:/home/admin# docker ps -a | grep lldp
65058396277c docker-lldp:latest "/usr/bin/docker-lld…" 2 days ago Up 35 seconds lldp
root@r-tigon-11:/home/admin# docker ps -a | grep lldp
65058396277c docker-lldp:latest "/usr/bin/docker-lld…" 2 days ago Exited (0) 3 seconds ago lldp
root@r-tigon-11:/home/admin# docker ps -a | grep lldp
65058396277c docker-lldp:latest "/usr/bin/docker-lld…" 2 days ago Exited (0) 39 seconds ago lldp
root@r-tigon-11:/home/admin#
Advance submodule head for sonic-swss
32261636 [BufferOrch] Don't call SAI API for BUFFER_POOL/PROFILE handling in case the op is DEL and the SAI OID is NULL (Azure/sonic-swss#1786)
6c88e47a [Dynamic Buffer Calc][Mellanox] Bug fixes and enhancements for the lua plugins for buffer pool calculation and headroom checking (Azure/sonic-swss#1781)
e86b900d [MPLS] sonic-swss changes for MPLS (Azure/sonic-swss#1686)
4c8e2b53 [Dynamic Buffer Calc] Avoid creating lossy PG for admin down ports during initialization (Azure/sonic-swss#1776)
36021246 [VS test stability] Skip flaky test for DPB (Azure/sonic-swss#1807)
c37cc1c5 Support for in-band-mgmt via management VRF (Azure/sonic-swss#1726)
1e3a532d Fix config prompt question issue (Azure/sonic-swss#1799)
Signed-off-by: Stephen Sun <stephens@nvidia.com>
A recent version of contextlib2 (https://pypi.org/project/contextlib2/21.6.0/#history) has broken Python2 compatibility, so the version picked up by netaddr when using Python2 must be specified, or else builds fail
Co-authored-by: Tom Zhu <tom.zhu@metaswitch.com>
#### Why I did it
Support API 2.0 for S5248F platform
#### How I did it
Making changes to S5248F platform specific directory
Co-authored-by: Arun LK <Arun_L_K@dell.com>
#### Why I did it
To ensure any environment variables which are configured in the build/test environment do not influence the behavior of sonic-py-common during unit tests. For example, variables which might be set by continuous integration pipelines.
#### How I did it
Add class-scoped pytest fixture to `TestDeviceInfo` class which stashes the current environment variables, clears them and yields. Once all the test cases in the class finish, the fixture will restore the original environment variables.
Also remove unnecessary unittest-style setup and teardown functions from interface_test.py
Advance submodule update with the following changes:
4475750 Config reload fix (#29)
cf60d5e [ci]: add proper azp (#26)
f0fbfe7 [CI] Set up CI with Azure Pipelines (#25)
879d7bd Include port default fec configuration to be included in ZTP configuration (#24)
a6ae955 Add a pre-defined plugin to download a list of files (#23)
6f0305b [MultiDB] Add multidb support to sonic-ztp (#16)
Why I did it
MMU configuration for DellEMC Z9332 systems in T0/T1 topology
How I did it
Updated config.bcm, QoS/Buffer pool and lossy/lossless profile settings
How to verify it
Verified that Dell systems are booting up fine and basic test cases passing.
Discussion and requirement in Chassis discussion forum to NOT make the asic-id field in the DEVICE_METADATA mandatory. If this field "asic-id" is not present the orchagent will be started without the -i <asic_id> parameter
Ref: https://github.com/Azure/sonic-buildimage/blob/master/dockers/docker-orchagent/orchagent.sh#L39
How I did it
Made the check to see if the asic-id is valid and update the asic-id field in the DEVICE_METADATA
5708497 [show] fix show version (#1686)
9041ba0 [config] Adding sanity checks for config reload (#1664)
2cdadb5 [config]: Create portchannel with LACP key (#1473)
6f74ba5 [vnet_route_check] Fix logic for getting VNET routes from ASIC DB (#1653)
54fee0f Add range check on portchannel min-links (#1630)