bf7214c (HEAD -> 202012, origin/202012) [y_cable] add support for manual/standby mode in xcvrd for muxcable; fix download firmware version retrieval logic while download firmware in progress (#220)
fc6a41e (HEAD -> 202012, origin/202012) Fix typo in the simulated y_cable driver (#226)
Signed-off-by: vaibhav-dahiya <vdahiya@microsoft.com>
Fix the check used to wait for interfaces to come up. The group name in
the supervisor config files has changed from isc-dhcp-relay to
dhcp-relay.
Also, in the wait script, wait 10 additional seconds after the vlans,
port channels, and any interfaces are up. This is because dhcrelay
listens on all interfaces (in addition to port channels and vlans), and
to ensure that it stays in a clean state during runtime, wait some extra
time to make sure that those interfaces are created as well.
Signed-off-by: Saikrishna Arcot <sarcot@microsoft.com>
This PR updates the following commits
0770dec Add retry reading/setting mux status to simulated y-cable driver (#221)
Signed-off-by: vaibhav-dahiya <vdahiya@microsoft.com>
Redis 4.0.0b1 has been uploaded to pip as a prerelease version. This
version drops support for Python 2 and only supports Python 3. Because
setup.py is being run, it will use the latest version of a package and
not the latest stable version (which is still 3.5.3).
Therefore, pin the redis package to version 3.5.3, so that it will work
for both Python 2 and 3.
#### How to verify it
Make sure that redis-dump-load for Python 2 builds today.
Since database.service has been moved to execute after rc-local.service,
and determine-reboot-cause.service relies on database.service, we have to
specify that in "After=".
Signed-off-by: Xichen Lin <xichenlin@microsoft.com>
Co-authored-by: Xichen Lin <xichenlin@microsoft.com>
Include the following commits:
f9bbed3cb86a3bab9a07745096835dbdbe5a4db6
Convert Unit Tests from unittest framework to pytest framework
e842c5ff317c67919dcbcab3358143cb9a16c9dd
Generate code coverage for Unit Tests
A new config option `sai_verify_incoming_chksum` was added to control the value of IPV4_INCR_CHECKSUM_ORIGINAL_VALUE_VERIFY in the EGR_FLEX_CONFIG control register (this prevents checksums of 0xffff from being propagated to other devices)
Signed-off-by: Lawrence Lee <lawlee@microsoft.com>
How I did it
Added a multi-NPU check before invoking the global config load.
How to verify it
Restart caclmgrd after this change and check that no error log is thrown.
0b5f90b (HEAD -> 202012, origin/202012) [show techsupport] fix bash errors in generate_dump script (#1844)
388c50c [202012][warmboot] Add new preboot health check: verify db integrity (#1839)
d73dc98 [config] support for configuring muxcable to standby mode of operation (#1837)
Signed-off-by: vaibhav-dahiya <vdahiya@microsoft.com>
* [202012][sonic-platform-daemons] submodule update
3d14066 [xcvrd][y_cable] refactor xcvrd to listen to port probe without locks; fix the get_firmware_version API to sync with download_firmware (#216)
Signed-off-by: vaibhav-dahiya <vdahiya@microsoft.com>
* update sonic-platform-common
93244bf [Y-Cable][Broadcom] upgrade to support Broadcom Y-Cable API to release 1.2 (#217)
Signed-off-by: vaibhav-dahiya <vdahiya@microsoft.com>
Fix a sonic-cfggen exception when reading the platform string during a fresh install and first start of SONiC on multi-asic platforms, where /var/run/redisX/ is created only after the database docker has started.
* c3691d3 [202012][pfcwd] Convert polling interval from ms to us in LUA scripts (#1909)
* 549c804 Mux state order change (#1902)
* 6b0b2c4 Update acl type check logic (#1886)
Signed-off-by: Volodymyr Samotiy <volodymyrs@nvidia.com>
* 2a8957d 2021-09-14 | [202012][sonic-utilities] CLI support for port auto negotiation (#1817) (HEAD, origin/202012) [vdahiya12]
Signed-off-by: Guohan Lu <lguohan@gmail.com>
* d03ba4f [202012] [portstat, intfstat] added rates and utilization (#1812)
* 499ad3f [config reload] Fix config reload failure due to sonic.target job cancellation (#1814)
* 96d658c [202012][sonic installer] Add swap setup support (#1815)
* a9c6970 platform pre-check for reboot in 202012 branch (#1788)
* 0e0478b Unify the number format in the ourput of portstat and pfcstat in all cases (#1795)
* 2d1e00e [ecnconfig] Fix exception seen during display and add unit tests (#1784) (#1789)
Signed-off-by: Volodymyr Samotiy <volodymyrs@nvidia.com>
* 0323d5e noaOrMlnx Fix flex counters logic of converting poll interval to seconds from MS (#878)
Signed-off-by: Volodymyr Samotiy <volodymyrs@nvidia.com>
This PR updates the following commits in sonic-platform-common
9d2e7d5 Add y-cable driver for simulated mux (#213)
e3e8f09 [Y-Cable][Broadcom] Broadcom implementation of YCable class which inherits from YCableBase required for Y-Cable API's in sonic-platform-daemons (#208)
This PR updates the following commits in sonic-platform-daemons
ebc4f3f [Y-Cable] create unknown entries for mux_cable when there is a cable present but module definition is not present/invalid module
b10c417 [xcvrd] initial support for integrating vendor specfic class objects for calling Y-Cable API's inside xcvrd (#197) (#213)
f3fc1ea [y-cable] fix for logging the xcvrd metrics before writing the state to the State-DB (#208)
Signed-off-by: vaibhav-dahiya <vdahiya@microsoft.com>
Why I did it
Update the sonic-swss submodule for the 202012 branch. The following is the new commit in the submodule.
c1cb2ca [202012] Backport SAI failure handling to 202012 branch (#1880)
How I did it
Update the sonic-swss submodule pointer for the 202012 branch.
with state of tdport from previous warm-reboot.
In case LAG was down before reboot, lacp->wr is not cleared.
In lacp_event_watch_port_flush_data we increment nr_of_tdports and add
tdport to lacp->wr.state. In case lacp->wr.state already had this tdport
we do not set the new state for tdport but append a new item to
lacp->wr.state. If we performed a warm-reboot while a PortChannel member
was down and the member came up after the reboot, the next warm-reboot
will initialize teamd with the PortChannel member in down state.
Fix this issue by calling stop_wr_mode() when LAG was down. This was probably intended but missed.
#### Why I did it
To fix an issue seen in warm-reboot-sad test cases.
#### How I did it
I fixed it in SONiC libteam patch that adds warm-reboot support. Details in commit description.
#### How to verify it
Run warm-reboot-sad test on t0-56 topology.
ef4b3ec [Y-Cable] add the definition inside setup.py to include sonic_y_cable.credo as a package (#211)
7d81488 [Y-Cable][Credo] Credo implementation of YCable class which inherits from YCableBase required for Y-Cable API's in sonic-platform-daemons (#203)
3efb093 [sonic_y_cable] add abstract class YCableBase required for Y-cable API support for multiple vendors (#186)
Signed-off-by: vaibhav-dahiya <vdahiya@microsoft.com>
caclmgrd: monitor mux_cable_table in state_db to update dhcp acl
- if the state changes to 'standby', add acl to block dhcp packets based on ingress interfaces
- if the state changes to 'active', delete acl
- if the state changes to 'unknown', also delete acl to avoid potential disconnect
- both addition and deletion follow checking the existence of the rules
The change has been verified on a virtual switch based testbed.
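The state handling listed above can be sketched as a small decision function. This is only an illustration of the described rules, not the caclmgrd code: the function name `plan_dhcp_acl_update` and the `rule_exists` flag are hypothetical.

```python
def plan_dhcp_acl_update(mux_state, rule_exists):
    """Decide what to do with the DHCP-blocking ACL rule of one ingress interface.

    mux_state:   state read from mux_cable_table ('standby', 'active', 'unknown')
    rule_exists: whether the blocking rule is currently installed (hypothetical input)
    Returns 'add', 'delete', or None when nothing needs to change.
    """
    if mux_state == "standby":
        # Block DHCP packets, but only if the rule is not already present.
        return None if rule_exists else "add"
    if mux_state in ("active", "unknown"):
        # 'unknown' is treated like 'active' to avoid a potential disconnect.
        return "delete" if rule_exists else None
    # Any other value is ignored in this sketch.
    return None


for state in ("standby", "active", "unknown"):
    print(state, plan_dhcp_acl_update(state, rule_exists=False),
          plan_dhcp_acl_update(state, rule_exists=True))
```

Checking for the rule first keeps both addition and deletion idempotent, which matches the last bullet above.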
Port to 202012 branch from #8222
Enable automated test suites to selectively run relevant tests (or not run tests) based upon a new port_type identifier in hwsku.json
How I did it
Modified the valid optional fields in validity check for hwsku.json per recommendation from Joe in
https://github.com/Azure/sonic-mgmt/pull/2654/files
Co-authored-by: Carl Keene <keene@nokia.com>
Signed-off-by: Rajkumar Pennadam Ramamoorthy <rpennadamram@marvell.com>
Why I did it
Install sonic image from ONIE. Once system is up, execute "config reload" command.
Root cause is that "determine-reboot-cause.service" was in failed state.
root@sonic:/host/reboot-cause# systemctl list-units --failed
UNIT LOAD ACTIVE SUB DESCRIPTION
● determine-reboot-cause.service loaded failed failed Reboot cause determination service
How I did it
Fixed the issue by setting default reason to "REBOOT_CAUSE_UNKNOWN" instead of "None".
How to verify it
Check that "determine-reboot-cause.service" loaded successfully post image installation from ONIE.
Verify "reboot-cause.txt" file is created and config reload succeeds.
#### Why I did it
hostcfgd starts at the same time as the 'create_switch' method is called in the orchagent process.
This degrades the execution time of that function, which eventually causes the fast-boot flow, and boot scenarios in general, to run slower (~6 seconds).
This change delays the start time of this daemon.
90 seconds was determined as the maximum allowed downtime for the control plane to come back up in the fast-boot flow.
#### How I did it
Add a timer for hostcfgd service in order to delay the startup of this service.
#### How to verify it
Install an image with this change and observe the daemon start 90 seconds after the system boot.
* 41dfaad 2021-08-02 | Bridge mac setting, fix statedb time format (#1844) (HEAD, origin/202012) [Prince Sunny]
Signed-off-by: Guohan Lu <lguohan@gmail.com>
* Update default cable len to 0m for TD2 (#8298)
* Update sonic-cfggen tests with the correct cable len
Signed-off-by: Neetha John <nejo@microsoft.com>
As part of the buffer reclamation efforts for TD2, setting the default cable len to 0m which means unused ports will have a cable len of 0m.
Why I did it
To align with the changes in Azure/sonic-swss#1830
How to verify it
- With the default cable len set to 0m and the associated changes in swss, CABLE_LENGTH table had '0m' set for unused ports and accordingly more space was reserved for the shared pool
- Cfggen tests passed with the cable len update
To include following changes:
* d84a8cc 2021-08-05 | [fast-reboot] revert the change of disabling counter polling before fast-reboot (#1744) (HEAD -> 202012, github/202012) [Ying Xie]
* e900bc5 2021-08-04 | Add script null_route_helper (#1718) [bingwang-ms]
* 85f14e1 2021-08-02 | disk_check updates: (#1736) [Renuka Manavalan]
* d68ac1c 2021-05-27 | [console][show] Force refresh all lines status during show line (#1641) [Blueve]
* a0e417f 2021-04-25 | [console] Display success message after line cleared (#1579) [Blueve]
* 0c6bb27 2021-04-07 | [console] Include Flow Control status in show line result (#1549) [Blueve]
I have been seeing intermittent (~40%) build failures with the same error described in PR https://github.com/Azure/sonic-buildimage/pull/6592, even with that fix present
```
/usr/bin/ld: mibgroup/ip-forward-mib/ipCidrRouteTable/.libs/ipCidrRouteTable_interface.o: file not recognized: file truncated
...
libtool: error: 'mibgroup/ip-forward-mib/inetCidrRouteTable/inetCidrRouteTable_interface.lo' is not a valid libtool object
make[5]: *** [Makefile:1020: libnetsnmpmibs.la] Error 1
make[5]: *** Waiting for unfinished jobs....
```
#### How I did it
Use `-j1` for the libsnmp build regardless of the value of `$(MULTIARCH_QEMU_ENVIRON)`
#### How to verify it
Performed 10 builds of the libsnmp target (`target/debs/buster/libsnmp-base_5.7.3+dfsg-5_all.deb`) with and without this change. Without the change, hit the error 40% of the time. With the change did not see the error at all
Signed-off-by: Justin Sherman <jusherma@cisco.com>
Update submodule for swss
f54b7d0b [Dynamic Buffer Calc][202012]Bug fix: Don't create lossless buffer profile for active ports without speed configured (Azure/sonic-swss#1820)
ac7f5cff Td2: Reclaim buffer from unused ports (Azure/sonic-swss#1830)
04105a4b [debugcounterorch] check if counter type is supported before querying (Azure/sonic-swss#1789)
a67d8af6 [202012][portsorch] fix errors when moving port from one lag to another. (Azure/sonic-swss#1819)
This PR updates the following commits
a9606fb [show] fix show muxcable metrics <port> for sorted output (#1731)
7355016 [minigraph][port_config] Use imported config.main and add conditional patch (#1728)
cc1d6e4 [configlet] Python3 compatible syntax for extracting a key from the dict (#1721)
Signed-off-by: vaibhav-dahiya <vdahiya@microsoft.com>
It can be that service is not enabled but UnitFilePreset=enabled (case
for Application Extension):
```
Loaded: loaded (/lib/systemd/system/cpu-report.service; disabled; vendor preset: enabled)
```
This makes existing logic skip enabling the service.
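A minimal sketch of the distinction, assuming the decision is made from `systemctl show` style `Key=Value` output. The helper names are hypothetical and the actual fix may be implemented differently; the point is only that the preset must not be mistaken for the current state.

```python
def parse_properties(text):
    """Parse 'Key=Value' lines, as printed by `systemctl show`, into a dict."""
    props = {}
    for line in text.splitlines():
        key, sep, value = line.partition("=")
        if sep:
            props[key.strip()] = value.strip()
    return props


def needs_enable(props):
    """Decide from UnitFileState only; UnitFilePreset is just the vendor default."""
    return props.get("UnitFileState") != "enabled"


# Sample mirroring the 'disabled; vendor preset: enabled' case shown above.
sample = "UnitFileState=disabled\nUnitFilePreset=enabled\n"
print(needs_enable(parse_properties(sample)))
```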
Signed-off-by: Stepan Blyschak <stepanb@nvidia.com>
Backport #7944
#### Why I did it
The current logic generates 'VLAN_SUB_INTERFACE' table if the device type is backend and cluster name contains 'str'. This is not a reliable method to determine a storage backend device
#### How I did it
Updated the logic to generate 'VLAN_SUB_INTERFACE' table if any of the following conditions hold true
1. device is of type backend and ResourceType attribute is None
2. device is of type backend and ResourceType attribute contains "Storage"
3. device is of type backend and graph contains "Subinterface" section
Also updated the logic to set "is_storage_device" to True
1. for Backend, if any of the above conditions hold true
2. for Frontend, if ResourceType attribute contains "Storage"
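The conditions above can be written down as two predicates. This is a sketch under stated assumptions, not the minigraph parser code: the function names and the boolean `is_backend` input are hypothetical, and any device that is not a backend is treated as a frontend here.

```python
def should_generate_vlan_sub_interface(is_backend, resource_type, has_subinterface_section):
    """True if the 'VLAN_SUB_INTERFACE' table should be generated."""
    if not is_backend:
        return False
    return (
        resource_type is None                 # condition 1
        or "Storage" in resource_type         # condition 2
        or has_subinterface_section           # condition 3
    )


def is_storage_device(is_backend, resource_type, has_subinterface_section):
    """Value for the "is_storage_device" flag under the rules listed above."""
    if is_backend:
        return should_generate_vlan_sub_interface(
            is_backend, resource_type, has_subinterface_section)
    # Frontend: only the ResourceType attribute matters.
    return resource_type is not None and "Storage" in resource_type


print(should_generate_vlan_sub_interface(True, None, False))
print(is_storage_device(False, "Storage", False))
```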
#### How to verify it
Added new tests to verify the code changes and built sonic_config_engine-1.0-py3-none-any.whl successfully
Update:
> 2ca493b 2021-07-13 create sniffer folder if not exist (Azure/sonic-utilities#1659)
> 1695104 2021-07-07 [show priority-group drop counters] Remove backup with cached PG drop counters after 'config reload' (Azure/sonic-utilities#1679)
> e99a3c5 2021-07-07 [show][config] support for interface alias for muxcable commands (Azure/sonic-utilities#1699)
Currently hostcfgd is implemented so that each feature that is enabled or disabled triggers systemctl enable/unmask commands, which eventually trigger a 'systemctl daemon-reload' command.
Each such call costs 0.6s, and overall they add an overhead of ~12 seconds of CPU time.
This change compares the desired state of a feature with its current state in systemd and triggers a system call only when needed.
What is changed: Check each feature's status in systemd before executing a system call to enable it and reload the systemctl daemon.
How to verify: Build an image with this change and observe that fewer system calls are executed.
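The idea of comparing desired and current state before calling systemctl can be sketched as follows. This is not the hostcfgd implementation: the function name is hypothetical, the current state is passed in rather than queried, and the disable/mask branch is an assumption added for symmetry.

```python
def plan_systemctl_calls(feature, desired_state, current_unit_state):
    """Return the systemctl commands needed to reach desired_state, possibly none.

    desired_state:      'enabled' or 'disabled' (from the feature configuration)
    current_unit_state: state reported by systemd, e.g. 'enabled', 'disabled', 'masked'
    """
    unit = feature + ".service"
    if desired_state == "enabled":
        if current_unit_state == "enabled":
            return []  # already in the desired state: no system call, no daemon-reload
        return [["systemctl", "unmask", unit], ["systemctl", "enable", unit]]
    if desired_state == "disabled":
        if current_unit_state == "masked":
            return []
        # Assumption: disabling a feature disables and masks its unit.
        return [["systemctl", "disable", unit], ["systemctl", "mask", unit]]
    return []


print(plan_systemctl_calls("snmp", "enabled", "enabled"))
print(plan_systemctl_calls("snmp", "enabled", "masked"))
```

With this shape, a boot where every feature is already in its desired state issues no systemctl calls at all.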
Why I did it
Static route configuration should not depend on BGP_ASN. Remove the dependency on BGP_ASN for StaticRouteMgr.
Fix #8027
How I did it
Check for the BGP_ASN field before configuring static route redistribution, and wait until BGP_ASN is available to enable static route redistribution.
How to verify it
Add unit test to cover the scenario and verify the functionality on a virtual switch.
Update submodule for sonic-utilities to include the following PR:
[202012] [pfcwd] Fix the return code in invalid case (#1698)
Signed-off-by: Dror Prital <drorp@nvidia.com>
Advance submodule head for sonic-swss on 202012
bb383be2 [Dynamic Buffer Calc][Mellanox] Bug fixes and enhancements for the lua plugins for buffer pool calculation and headroom checking (Azure/sonic-swss#1781)
f949dfe9 [Dynamic Buffer Calc] Avoid creating lossy PG for admin down ports during initialization (Azure/sonic-swss#1776)
def0a914 Fix config prompt question issue (Azure/sonic-swss#1799)
21f97506 [ci]: Merge azure pipelines from master to 202012 branch (Azure/sonic-swss#1764)
a83a2a42 [vstest]: add dvs_route fixture
849bdf9c [Mux] Add support for mux metrics to State DB (Azure/sonic-swss#1757)
386de717 [qosorch] Dot1p map list initialization fix (Azure/sonic-swss#1746)
f99abdca [sub intf] Port object reference count update (Azure/sonic-swss#1712)
4a00042d [vstest/nhg]: use dvs_route fixture to make test_nhg more robust
Signed-off-by: Stephen Sun <stephens@nvidia.com>
Update submodule for sonic-utilities to include the following PRs:
80b7b54 Pcieutil to load the platform api first instead of using common api (sonic-utilities#1672)
3d5e93d [Mellanox] Update mellanox dump generation to include SDK dumps (sonic-utilities#1640)
a805efc [ci]: Fix config prompt question issue (sonic-utilities#1693)
33c6d79 [vnet_route_check] Fix logic for getting VNET routes from ASIC DB (sonic-utilities#1653)
Update submodule for sonic-sairedis to include the following PRs:
74b5808 [Mellanox] Update mellanoxs dump generation to include SDK dumps (sonic-sairedis#833)
5ff9305 Fix azure-pipelines branch reference error, change master to 202012 (sonic-sairedis#834)
#### Why I did it
To ensure any environment variables which are configured in the build/test environment do not influence the behavior of sonic-py-common during unit tests. For example, variables which might be set by continuous integration pipelines.
#### How I did it
Add class-scoped pytest fixture to `TestDeviceInfo` class which stashes the current environment variables, clears them and yields. Once all the test cases in the class finish, the fixture will restore the original environment variables.
Also remove unnecessary unittest-style setup and teardown functions from interface_test.py
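The stash, clear and restore pattern described above can be sketched with a standard-library context manager. The name `cleared_environment` is hypothetical, and this sketch deliberately avoids pytest so it runs standalone; in the test suite the same body would sit inside a fixture declared with `@pytest.fixture(scope="class")`.

```python
import os
from contextlib import contextmanager


@contextmanager
def cleared_environment():
    """Stash all environment variables, clear them, and restore them on exit."""
    saved = dict(os.environ)
    os.environ.clear()
    try:
        yield
    finally:
        # Restore exactly what was there before, dropping anything set meanwhile.
        os.environ.clear()
        os.environ.update(saved)


os.environ["EXAMPLE_CI_VARIABLE"] = "1"  # stands in for a variable set by a CI pipeline
with cleared_environment():
    print("EXAMPLE_CI_VARIABLE" in os.environ)
print(os.environ["EXAMPLE_CI_VARIABLE"])
```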
A recent version of contextlib2 (https://pypi.org/project/contextlib2/21.6.0/#history) has broken Python2 compatibility,
so the version picked up by netaddr when using Python2 must be specified, or else builds fail
Co-authored-by: Tom Zhu <tom.zhu@metaswitch.com>
Update submodule for sonic-utilities to include the following PRs:
Make the soft-reboot available in the SONiC image on master (#1681)
[config]: Update environment file during config reload (#1673)
Enable pr checker form 202012 (#1649)
[config] support for configuring muxcable to manual mode of operation (#1642)
[config] Fix config int add incorrect ip (#1414)
Signed-off-by: Dror Prital <drorp@nvidia.com>
b0dad8c Add to check pcie configuration revision to get the right configuration. (#195)
f66ffc3 [eeprom_tlv_info] Optimize EEPROM data process by using visitor pattern (#193)
* Revert "fix"
This reverts commit 93585b0a0a.
* Revert "Version control git (#6562)"
This reverts commit 52b87753db.
* Revert "Revert "[files/build/versions]: support reproduceable build for git (#5774)""
This reverts commit 1cb8daf585.
* Revert "[files/build/versions]: support reproduceable build for git (#5774)"
This reverts commit 547aa9b2c7.
Add the PG_DROP yang model and check this field in the yang model unit test
How to verify it
Firstly try to do DPB (2x50G) for Ethernet0 port:
sudo config interface breakout Ethernet0 2x50G -f
After that try to do DPB (1x100G[40G]) for Ethernet0 port:
sudo config interface breakout Ethernet0 1x100G[40G] -f
Both commands should work correctly.
Signed-off-by: Mykola Gerasymenko <mykolax.gerasymenko@intel.com>
#### Why I did it
Recently, the build started failing with messages like
```
2021-06-16T16:55:02.8675603Z tests/hostcfgd/hostcfgd_test.py:5: in <module>
2021-06-16T16:55:02.8676208Z from parameterized import parameterized
2021-06-16T16:55:02.8677145Z E ModuleNotFoundError: No module named 'parameterized'
```
Unit tests for hostcfgd depend on the `parameterized` Python package, but it was never added as a dependency to the setup.py file. This dependency was added ~3 months ago. I'm not sure why we only started seeing this failure recently.
#### How I did it
Add 'parameterized' package as a test dependency in setup.py for sonic-host-services package
Why I did it
The SONiC switches get their docker images from local repo, populated during install with container images pre-built into SONiC FW. With the introduction of kubernetes, new docker images available in remote repo could be deployed. This requires dockerd to be able to pull images from remote repo.
Depending on the switch's network domain and config, it may or may not be able to reach the remote repo. If the remote repo is unreachable, we could potentially make the Kubernetes server also act as an http-proxy.
How I did it
When the admin explicitly enables it, the kubernetes-server can be configured as docker-proxy. But any update to docker-proxy has to go through an environment variable in the service-conf file, implying that a "service restart docker" is required. A restart of dockerd is very expensive, as it restarts all dockers, including the database docker.
To avoid a dockerd restart, pre-configure an http_proxy using an unused IP. When the k8s server is enabled to act as http-proxy, an IP table entry is created to direct all traffic for the configured unused proxy IP to the kubernetes-master IP. This way any update to the Kubernetes master config only manipulates IPTables, which is transparent to all modules until dockerd needs to download from the remote repo.
How to verify it
Configure a switch such that image repo is unreachable
Pre-configure dockerd with http_proxy.conf using an unused IP (e.g. 172.16.1.1)
Update ctrmgrd.service to invoke ctrmgrd.py with "-p" option.
Configure a k8s server, and deploy an image for feature with set_owner="kube"
Check if switch could successfully download the image or not.
Why I did it
Enable redistribution of static routes
How I did it
Enable redistribution of static routes when the first route is added to STATIC_ROUTE table of Config_DB and disable the redistribution when the last route is removed from STATIC_ROUTE table.
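A minimal sketch of the first-route/last-route toggle, assuming the manager only needs to track the set of STATIC_ROUTE keys. The class name is hypothetical and the command strings are placeholders for whatever is actually pushed to FRR.

```python
class StaticRouteRedistribution:
    """Track STATIC_ROUTE entries and toggle redistribution on first add / last remove."""

    def __init__(self):
        self.routes = set()
        self.commands = []  # commands that would be sent to the routing daemon

    def add_route(self, key):
        was_empty = not self.routes
        self.routes.add(key)
        if was_empty:
            # First route in the table: turn redistribution on.
            self.commands.append("redistribute static")

    def remove_route(self, key):
        if key not in self.routes:
            return
        self.routes.remove(key)
        if not self.routes:
            # Last route removed: turn redistribution off again.
            self.commands.append("no redistribute static")


mgr = StaticRouteRedistribution()
mgr.add_route("10.0.0.0/24")
mgr.add_route("10.0.1.0/24")
mgr.remove_route("10.0.0.0/24")
mgr.remove_route("10.0.1.0/24")
print(mgr.commands)
```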
Advance submodule head for sonic-utilities
b894c5b5 Fix build test failure caused by error module name (Azure/sonic-utilities#1662)
5a7c06a0 [config]][tacacs+] Change tacacs+ minimum timeout value base on spec (Azure/sonic-utilities#1631)
080a689c [202012] [db_migrator] fix old 1911 feature config migration to a new one. (Azure/sonic-utilities#1636)
43fff88c Change to use rvtysh when calling the show commands (Azure/sonic-utilities#1646)
88a823f0 [db_migrator][Mellanox] Update Mellanox buffer migrator with 2km-cable supported (Azure/sonic-utilities#1564)
d096ff78 [config]Static routes to config_db (1534)
a68d8d09 route_check: Updates (Azure/sonic-utilities#1645)
Includes below commits:
```
fcf7cdc [patch] add patch "net: sch_generic: fix the missing new qdisc assignment bug" (#213)
```
#### Why I did it
To bring the fix "net: sch_generic: fix the missing new qdisc assignment bug".
#### How I did it
Updated submodule.
#### How to verify it
Build and run.
Verify that flapping a LAG member port does not lead to this member being stuck in disabled state.
Why I did it
ndppd by default reads /proc/net/ipv6_route every 30 seconds. Since T1s advertise so many routes to ToRs, this file is extremely large, and reading it causes ndppd's CPU usage to spike every 30 seconds
How I did it
Increase the delay for reading this file to the maximum possible value (max integer value), which will result in CPU spikes every ~24 days instead of every 30 seconds
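The "~24 days" figure can be checked with a short calculation, assuming the delay is expressed in milliseconds and the maximum is a signed 32-bit integer (both are assumptions, chosen because they reproduce the numbers quoted above).

```python
MAX_INT32 = 2**31 - 1               # assumed maximum value of the setting
MS_PER_DAY = 24 * 60 * 60 * 1000    # milliseconds in one day

days_between_reads = MAX_INT32 / MS_PER_DAY
print(round(days_between_reads, 2))  # 24.86
```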
How to verify it
Start ndppd with the new config file, confirm that no CPU spikes are seen except at startup
Signed-off-by: Lawrence Lee <lawlee@microsoft.com>
Why I did it
The current code skips parsing bandwidth for DeviceMgmtLinks. We have a use case to set the speed for this type of link based on the bandwidth attribute in the minigraph
How to verify it
Ran sonic-cfggen on a minigraph and verified that interface of type DeviceMgmtLink has speed set in the PORT table from the bandwidth attribute in the minigraph
sonic-utilities:
* 8b98d45 2021-05-25 | [show] support for show muxcable firmware version of only active banks (#1629) (HEAD -> 202012) [vdahiya12]
* afd0975 2021-05-20 | [show] add support for muxcable metrics (#1615) [vdahiya12]
sonic-swss
* 7611df5 2021-05-27 | [tunneldecaporch] Set default MTU for the overlay loopback interface (#1756) (HEAD -> 202012) [Volodymyr Samotiy]
* 22fbb5c 2021-05-27 | [202012] Resolve neighbor when nexthop does not exist (#1759) (github/202012) [Shi Su]
* ec7710c 2021-05-27 | [Bulk mode] Limit the size of bulker (#1760) [Shi Su]
Signed-off-by: Ying Xie <ying.xie@microsoft.com>
Why I did it
k8s handles hostnames in lower case, so the code ensures that it uses the hostname in all lower case
How I did it
Add a wrapper for device_info.get_hostname that returns the hostname in lower case. This wrapper is used in all places that need the hostname for kubectl commands.
How to verify it
Device joins successfully.
Why I did it
Currently, there is a bug in the ntp.conf jinja2 template where it will ignore the src_intf directive in CONFIG_DB if there are multiple IP addresses associated with an interface. This code change fixes that bug and allows the template to select the correct source interface for NTP.
How I did it
I did this by modifying the macro in ntp.conf.j2 which determines if there is an ip address associated with an interface to set a state variable when it detects a valid interface entry in CONFIG_DB instead of outputting "true" directly (which could result in multiple "trues" outputted for interfaces with multiple valid IP addresses).
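The difference between the two macro behaviours can be modelled in plain Python. This is only an illustration, not the ntp.conf.j2 template: the function names and the list-of-tuples input are hypothetical, and the comparison against the exact string "true" is an assumption about how the macro result is consumed.

```python
def has_ip_old(interface, ip_entries):
    """Old behaviour: emit "true" once per matching address entry."""
    return "".join("true" for name, _addr in ip_entries if name == interface)


def has_ip_new(interface, ip_entries):
    """New behaviour: set a state flag while looping, emit "true" at most once."""
    found = False
    for name, _addr in ip_entries:
        if name == interface:
            found = True
    return "true" if found else ""


entries = [("Ethernet1", "10.0.0.1/24"), ("Ethernet1", "10.0.1.1/24")]
print(has_ip_old("Ethernet1", entries) == "true")  # False, the output is "truetrue"
print(has_ip_new("Ethernet1", entries) == "true")  # True
```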
How to verify it
Add two ipv4 addresses to an interface in SONiC
Add the following configuration to config_db.json
{
    "NTP": {
        "global": {
            "src_intf": "Ethernet1"
        }
    }
}
Replace Ethernet1 with the interface name of the one you assigned the IP addresses to.
Run sudo config reload -y
Open /etc/ntp.conf and verify that the following line exists
...
interface listen Ethernet1
...
The interface specified should be the one set in the previous steps.
Description for the changelog
[ntp] Fix ntp.conf template to allow setting of source port in CONFIG_DB
Signed-off-by: Neetha John <nejo@microsoft.com>
Fixes #7531
Why I did it
To enable bgp sessions to be established over subinterfaces
How I did it
Listen to VLAN_SUB_INTERFACE table in config db
How to verify it
Bgp sessions were established successfully over subinterface
When FECDisabled is set to true in minigraph.py, push 'fec' 'none' explicitly to config_db. When 'fec' is defined in port_config.ini do not override it with 'rs' for 100G
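The precedence described above can be sketched as follows. This is not the minigraph.py code: the function name and the port dictionary layout are hypothetical, and the speed is assumed to be stored as the string "100000" for 100G.

```python
def resolve_fec(port, fec_disabled):
    """Return a copy of the port entry with the 'fec' field resolved.

    port:         dict built from port_config.ini (may already contain 'fec')
    fec_disabled: value of FECDisabled for this port in the minigraph
    """
    resolved = dict(port)
    if fec_disabled:
        # Push 'none' explicitly instead of leaving the field out.
        resolved["fec"] = "none"
    elif resolved.get("speed") == "100000" and "fec" not in resolved:
        # 100G default, applied only when port_config.ini did not define 'fec'.
        resolved["fec"] = "rs"
    return resolved


print(resolve_fec({"speed": "100000"}, fec_disabled=False))
print(resolve_fec({"speed": "100000", "fec": "fc"}, fec_disabled=False))
print(resolve_fec({"speed": "100000"}, fec_disabled=True))
```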
Backport of #7667 to 202012 branch.
Why I did it
Skip the web proxy when the packages are already in the proxy server.
For sai packages or other packages, we will upload them to the proxy server directly; the reproducible build will skip checking the site, so it is not necessary to change the version files.
Why I did it
Add bgpcfgd support for static routes.
How I did it
Add bgpcfgd support to subscribe to changes in the STATIC_ROUTE table in CONFIG_DB and program them via vtysh. The key of the STATIC_ROUTE table is formatted as STATIC_ROUTE|vrf|ip_prefix, where the vrf is optional. It is treated the same as "default" if no vrf is given.
Add unit tests.
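The key format with an optional vrf can be illustrated with a small parser. This is a sketch, not the bgpcfgd code; the function name `parse_static_route_key` is hypothetical.

```python
def parse_static_route_key(key):
    """Split 'STATIC_ROUTE|vrf|ip_prefix' (vrf optional) into (vrf, ip_prefix)."""
    parts = key.split("|")
    if parts[0] != "STATIC_ROUTE":
        raise ValueError("not a STATIC_ROUTE key: %r" % key)
    if len(parts) == 2:
        # No vrf given: treated the same as "default".
        return "default", parts[1]
    if len(parts) == 3:
        return parts[1], parts[2]
    raise ValueError("unexpected number of fields in key: %r" % key)


print(parse_static_route_key("STATIC_ROUTE|10.1.0.0/24"))
print(parse_static_route_key("STATIC_ROUTE|Vrf1|fc00::/64"))
```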
Make sure Everflow always gets classified as Mirror table and not as Control Plane on multi-asic platforms.
Why I did:
In Multi-asic platforms we generate Everflow acl table data from minigraph for both host and namespace.
In a multi-asic minigraph with no external port-channel (only Router Port IP interfaces), the Everflow table may have no bound interface in the host and get classified as a Control Plane ACL, while in the namespace it gets classified as a Mirror table.
For ACL rule generation we read the global db as the source of truth for acl table information, so if the table gets classified as Control Plane during Everflow rule generation, we can generate rules with an invalid action, causing orchagent to throw a runtime error.
How I did:
If the table is attached to an erspan interface in the minigraph, it always gets classified as a Mirror table.
1. Made the command next-hop-self force applicable only on back-end asic bgp. This is done so that the BGPL iBGP session running on the backend can send the e-BGP learnt nexthop. Back-end asic FRR is able to recursively resolve the eBGP nexthop in its routing table since it knows about all the connected routes advertised from the front-end asic.
2. Made all front-end asic bgp use the global loopback ip (Loopback0) as router id, and back-end asic bgp use Loopback4096 as router-id and originator id for Route-Reflector. This is done so that routes learnt by an external peer do not show Loopback4096 as router id in show ip bgp <route-prefix> output.
3. To handle the above change, Loopback4096 needs to be passed from the BGP manager for jinja2 template generation. This was missing, and this change/fix is also needed for https://github.com/Azure/sonic-buildimage/blob/master/dockers/docker-fpm-frr/frr/bgpd/templates/dynamic/instance.conf.j2#L27
4. Enhancement to add multi_asic-specific bgpd template generation unit test cases.
Enable BBR config allowas-in 1 for internal peers
Why I did:
To advertise BBR routes learnt via e-BGP peer in one asic/namespace to another iBGP asic/namespace via Route Reflector.
What I did:-
For multi-asic platforms added iptable v4 rule to communicate on docker bridge ip
For multi-asic platforms extend iptable v4 rule for iptable v6 also
For multi-asic platforms made all internal rules applicable for all protocols (not filtered based on tcp/udp). This is done to be consistent with the local host rule
For multi-asic platforms made nat rule (to forward traffic from namespace to host) generic for all protocols and also use Source IP if present for matching
https://github.com/mbj4668/pyang/blob/master/pyang/repository.py#L93 throws an exception with pip 21.1
Add the ietf yang models explicitly to the build process to fix the test failure.
tests/test_sonic_yang_models.py .F [ 66%]
tests/yang_model_tests/test_yang_model.py . [100%]
Failed: pyang -f tree ./yang-models/*.yang > ./yang-models/sonic_yang_tree
----------------------------- Captured stderr call -----------------------------
./yang-models/sonic-acl.yang:8: error: module "ietf-inet-types" not found in search path
./yang-models/sonic-device_metadata.yang:8: error: module "ietf-yang-types" not found in search path
Signed-off-by: Guohan Lu <lguohan@gmail.com>
Previously, a brief sleep was necessary in order to get Python threads to progress. The root cause of this has since been found and fixed in sonic-swss-common: Azure/sonic-swss-common#477. The submodule was updated here, so we can now safely remove this sleep.
This PR should also be cherry-picked to the 202012 branch once the submodule is updated there to also include the fix.
* [202012] Add SOC property to enable AN/LT on some platforms
Why I did it
To enable autonegotiation/link training on some Broadcom-based platforms (Arista 7060CX, 7260CX3, 7050cx3, Celestica DX010)
How I did it
Add appropriate SOC property for enabling the feature to the Broadcom config files of appropriate platforms
Also convert line endings to UNIX format for one Celestica file
* Add 'phy_an_lt_msft' to BCM config file permitted list
68ea9efc Add pg-drop script to sonic filesystem (#1583)
b216bf0a Fixing serial number read to get from DB if it is populated (#1580)
fa7230c6 Handle the new db version which mellanox_buffer_migrator isn't interested (#1566)
Signed-off-by: Stephen Sun <stephens@nvidia.com>
Includes below commits:
```
3f8bc52 2021-05-05 | Relax the install_requires, no need to exact version as long as there are no broken changes with future versions (#1530) (#1592) [Qi Luo]
```
* [202012][swss/swss-common/utilities/kernel] Update submodule
sonic-swss:
- [Monitor Vlan] Fix a typo in hostif (#1722)
- Update pool sizes during initialization from timer only (#1708)
- [SflowMgr] SamplingRate Update by Speed Change Added (#1721)
sonic-swss-common:
- [swss-common] Add MUX Metrics Table (#482)
- [azp] Purge swss before installing the newly built deb package (#472)
sonic-utilities:
- disk_check: Check & mount RO as RW using tmpfs (#1569)
- No more IP validation as it is more likely a URL (#1555)
- Stop PMON docker before cold and soft reboots (#1514)
- Add soft-reboot reboot type (#1453)
- [acl] Use a list instead of a comma-separated string for ACL port list (#1519)
- sonic-installer: fix py3 issues in bootloader.aboot (#1553)
- Fix unsupported fs.squashfs extraction in sonic-installer (#1366)
- [show][config] cli support for firmware upgrade on Y-Cable (#1528) (#1558)
sonic-linux-kernel:
- [Mellanox] backport kernel patches for hw-management 7.0100.2303 (#211)
Signed-off-by: Danny Allen <daall@microsoft.com>
* Update utilities w/ build fix
- Support compiling the SONiC ARM image on an ARM server. Building natively on an ARM server instead of using qemu mode on an x86 server saves significant compile time.
- Add the kernel argument systemd.unified_cgroup_hierarchy=0 to accommodate the systemd upgrade to version 247, per #7228
- Rename the multiarch docker to sonic-slave-${distro}-march-${arch}
Co-authored-by: Xianghong Gu <xgu@centecnetworks.com>
Co-authored-by: Shi Lei <shil@centecnetworks.com>
adf5ab58 [vstest/subintf] Add vs test case to validate processing sequence of APPL DB keys (#1663)
8a732726 [intfsorch] Create subport with the entry contains necessary attributes (#1650)
7ba813b2 [vstest/subintf] Update vs tests to validate physical port host interface vlan tag attribute (#1634)
ed32e333 [portsorch] Configure hostif tagging for subports (#1573)
b5209c43 Handle IPv6 and ECMP routes to be programmed to ASIC (#1711)
515cc1a7 [Dynamic buffer calc][Mellanox] Fix bug: buffer over subscription in buffer pool size calculation (#1706)
0ad524b2 [202012] Allowing the first time FEC and AN configuration to be pushed to SAI (#1710)
Signed-off-by: Stephen Sun <stephens@nvidia.com>
1) Dropped the unneeded IP update in admin.conf, as all masters use the VIP only (#7288)
2) Don't clear VERSION during stop, as that would overwrite the pending new version.
3) For subprocess calls, take the result from the process return value instead of inferring failure from the presence of data in stderr.
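Item 3 can be illustrated with a minimal Python sketch. The helper name `run_cmd` and the two sample commands are hypothetical and not taken from the change itself; the point is only that success is decided by the exit status, even when the command writes to stderr.

```python
import subprocess
import sys


def run_cmd(args):
    """Run a command; report success from the exit status, not from stderr.

    Hypothetical helper for illustration only.
    """
    proc = subprocess.run(args, stdout=subprocess.PIPE, stderr=subprocess.PIPE,
                          universal_newlines=True)
    # A process may legitimately write warnings to stderr and still succeed,
    # so only the return code decides the outcome.
    ok = (proc.returncode == 0)
    return ok, proc.stdout, proc.stderr


# A command that writes to stderr but exits with status 0.
noisy_ok = [sys.executable, "-c", "import sys; sys.stderr.write('warning\\n')"]
# A command that is silent but exits with status 3.
silent_fail = [sys.executable, "-c", "import sys; sys.exit(3)"]
```

With this helper, `run_cmd(noisy_ok)` is treated as a success despite the stderr output, while `run_cmd(silent_fail)` is treated as a failure.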
Compiling ethtool from source is causing ethtool unit tests to fail on ARM Platforms.
These tests are failing: (By default netlink-interface is enabled while compiling ethtool)
Link: ([Test File Link](https://salsa.debian.org/kernel-team/ethtool/-/blob/debian/1%255.9-1/test-cmdline.c#L28))
```
FAIL: test-cmdline
==================
E: ethtool 16_char_devname! returns 1
E: ethtool 127_char_devname0123456789abcdef0123456789abcdef0123456789abcdef0123456789abcdef0123456789abcdef0123456789abcdef0123456789abcde returns 1
E: ethtool --change devname xcvr external returns 0
E: ethtool --change devname speed 100 duplex half port tp autoneg on advertise 0x1 phyad 1 xcvr external wol p sopass 01:23:45:67:89:ab msglvl 1 returns 0
FAIL test-cmdline (exit status: 1)
```
Tested this on Local ARM Emulated Container:
```
(Docker Container Emulating ARM)
vkarri@3a03c70eed35:/tmp/ethtool$ ./ethtool 16_char_devname!
netlink interface initialization failed, device name longer than 15 not supported
vkarri@3a03c70eed35:/tmp/ethtool$ echo $?
1 (Expected 0)
vkarri@3a03c70eed35:~/ethtool$ ./ethtool 16_char_devnameee
netlink interface initialization failed, device name longer than 15 not supported
```
Checked for dependencies: (all are present)
```
vkarri@3a03c70eed35:~/ethtool$ apt-cache policy libmnl0
libmnl0:
Installed: 1.0.4-2
Candidate: 1.0.4-2
Version table:
*** 1.0.4-2 500
500 http://deb.debian.org/debian buster/main armhf Packages
500 http://packages.trafficmanager.net/debian/debian buster/main armhf Packages
100 /var/lib/dpkg/status
vkarri@3a03c70eed35:~/ethtool$ apt-cache policy libc6
libc6:
Installed: 2.28-10
Candidate: 2.28-10
Version table:
*** 2.28-10 500
500 http://deb.debian.org/debian buster/main armhf Packages
500 http://packages.trafficmanager.net/debian/debian buster/main armhf Packages
100 /var/lib/dpkg/status
```
#### How I did it
Disabled netlink-interface for ethtool.
Even though Netlink is not available, this does not seem to affect what ethtool is supposed to do. In fact, the older version that was in use before PR [#5725](https://github.com/Azure/sonic-buildimage/pull/5725) did not have netlink support, and everything seemed to work well.
Article on Netlink-Support for ethtool: https://lwn.net/Articles/783633/
Signed-off-by: Vivek Reddy Karri <vkarri@nvidia.com>
- Add peer_switch field to DEVICE_METADATA table
- In PORT table:
- Set used ports to admin status up
- Set mux_cable to true for downlinks in use
- In MUX_CABLE table:
- Only add entry if the downlink is in use
Signed-off-by: Lawrence Lee <lawlee@microsoft.com>
sonic-swss
* Don't update pools when ingress_lossless_pool is created but the initialization hasn't finished yet (#1685)
* Fix dynamic buffer bug occuring in rare condition (#1678)
sonic-utilities
* [load_minigraph]: Avoid starting PFCWD for EPMS devicetype (#1552)
Signed-off-by: Danny Allen <daall@microsoft.com>
- [pyext] Fix pyext/py2 library (#820)
- Added --purge of base docker image packages before installing new ones. (#819) <--- branch point
Signed-off-by: Danny Allen <daall@microsoft.com>
- [xcvrd] refactor Y-Cable firmware information to conform with all vendors (#171)
- [thermalctld] No need exit thermalcltd when loading invalid policy file (#172) <--- branch point
Signed-off-by: Danny Allen <daall@microsoft.com>
c4d4790 [xcvrd] refactor Y-Cable firmware information to conform with all vendors (#171)
be7f4e1 [voqinband]Support for inband port as regular port (#145)
Signed-off-by: vaibhav-dahiya <vdahiya@microsoft.com>
eff5c1c [thermalctld] No need exit thermalcltd when loading invalid policy file (#172)
5b6d9c0 [syseepromd] Add unit tests; Refactor to allow for greater unit test coverage (#156)
Changes:
ac4596a [intfmgrd] reach reconciled state at start when there are no interfaces configuration to process (#1703)
Signed-off-by: Stepan Blyschak <stepanb@nvidia.com>
1c3f75e (HEAD -> master, origin/master, origin/HEAD) pindown the version of github.com/openconfig/gnoi (#76)
33acd5b [ci]: setup proper azp (#75)
5d82051 [CI] Set up CI with Azure Pipelines (#72)
0688cdb Remove go get commands from Makefile to prevent go.mod file from chan… (#66)
872f0a3 [Y-Cable] refactor get_firmware_version to comply with all vendors (#182)
cc162d6 [sonic_y_cable]: Decorate all method for mux simulator (#181)
fa02416 Change import order in Ycable helper and EEPROM read bytearray change in SFP plugin (#177)
0b60982 [thermal_base] Add setter functions for critical thresholds (#180)
10dc16f [y_cable] add support for enable/disable autoswitch feature on Y cable (#176)
c6c81a8 [fan_drawer_base.py] Fix FanDrawer get_status_led interface (#175)
Signed-off-by: vaibhav-dahiya <vdahiya@microsoft.com>
Feb 17 Fix tests failing due to duplicate vxlan tunnel creation (#75)
Mar 11 Update route api to specify limitation (#77)
Apr 01 Add host_ifname field while adding entry in VLAN table (#80)
#### Why I did it
To eliminate the need to write duplicate code in order to import a Python module from a source file.
#### How I did it
Add `general` module to sonic-py-common, which contains a `load_module_from_source()` function which supports both Python 2 and 3.
Call this new function in:
- sonic-ctrmgrd/tests/container_test.py
- sonic-ctrmgrd/tests/ctrmgr_tools_test.py
- sonic-host-services/tests/determine-reboot-cause_test.py
- sonic-host-services/tests/hostcfgd/hostcfgd_test.py
- sonic-host-services/tests/procdockerstatsd_test.py
- sonic-py-common/sonic_py_common/daemon_base.py
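A minimal sketch of what such a helper might look like is shown below. It covers only the Python 3 path and is an illustration under that assumption, not the actual sonic-py-common implementation (which also handles Python 2). The file and module names in the usage part are placeholders.

```python
import importlib.machinery
import importlib.util
import os
import sys
import tempfile


def load_module_from_source(module_name, file_path):
    """Load a Python source file as a module (Python 3 path only).

    Illustrative sketch; the real helper in sonic-py-common may differ.
    """
    loader = importlib.machinery.SourceFileLoader(module_name, file_path)
    spec = importlib.util.spec_from_loader(loader.name, loader)
    module = importlib.util.module_from_spec(spec)
    loader.exec_module(module)
    # Register the module so later imports by name find the same object.
    sys.modules[module_name] = module
    return module


# Usage: load a script whose file name is not a valid module name.
with tempfile.TemporaryDirectory() as tmp_dir:
    script_path = os.path.join(tmp_dir, "determine-reboot-cause")
    with open(script_path, "w") as f:
        f.write("ANSWER = 42\n")
    loaded = load_module_from_source("determine_reboot_cause", script_path)
```

Loading by explicit path is what lets tests import scripts that have dashes in their names or no `.py` extension.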
- When generating L2 preset, check for dual ToR setting from CLI option `-a '{"is_dualtor": true}'`
- When dual ToR is specified, add subtype field to DEVICE_METADATA table
- When dual ToR is specified, add MUX_CABLE, TUNNEL, LOOPBACK_INTERFACE, and PEER_SWITCH tables
#### Why I did it
Plexus-utils before 3.0.16 is vulnerable to command injection because it does not correctly process the contents of double quoted strings.
#### How I did it
Upgrade to 3.0.16
sonic-swss
- [SFlowMgr] Sflow Crash on 200G ports handled (#1683)
- Stablize the test case (#1679)
- Remove PGs from an administratively down port. (#1677)
sonic-swss-common
- fix getting hash from redis db (#465)
- [dbconnector] Initialize redisContext (#464)
sonic-utilities
- route_check: Fix hanging & logging level (#1520)
- Add self timeout and crash if exceeded. (#1502)
- [reboot] User-friendly reboot cause message for kernel panic (#1486)
- [acl-loader]: do not add default deny rule for egress acl (#1531)
Signed-off-by: Danny Allen <daall@microsoft.com>
c5be3ca4 [psud] Increase unit test coverage; Refactor mock platform (#154)
450b7d78 Bug fix: the fields that are not supported by vendor should be "N/A" in STATE_DB (#168)
Signed-off-by: Stephen Sun <stephens@nvidia.com>
Signed-off-by: Yong Zhao <yozhao@microsoft.com>
Why I did it
If the device reboot was caused by a kernel panic, we need to retrieve the key information and store it in the symlink file previous-reboot-cause.json. The CLI command show reboot-cause reads this file to get the reason for the previous reboot.
This PR is related to PR in sonic-utilities repo: Azure/sonic-utilities#1486
How I did it
The string variable previous_reboot_cause is parsed to check whether it contains the keyword Kernel Panic. If it does, the keyword and time information are stored in a dictionary.
How to verify it
I verified this change on a virtual testbed.
admin@vlab-01:/host/reboot-cause$ more previous-reboot-cause.json
{"gen_time": "2021_03_24_23_22_35", "cause": "Kernel Panic", "user": "N/A", "time": "Wed 24 Mar 2021 11:22:03 PM UTC", "comment": "N/A"}
admin@vlab-01:/host/reboot-cause$ show reboot-cause
Kernel Panic [Time: Wed 24 Mar 2021 11:22:03 PM UTC]
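The parsing step can be sketched roughly as follows. The function name `parse_kernel_panic` and the assumed layout of the cause string (modelled on the CLI output above) are illustrative assumptions, not the actual code from the PR.

```python
import re

# Assumed layout of the cause string; the real string may differ.
KERNEL_PANIC_PATTERN = re.compile(r"(Kernel Panic)\s*\[Time:\s*(?P<time>[^\]]+)\]")


def parse_kernel_panic(previous_reboot_cause):
    """Return a dict with cause and time if the string reports a kernel panic."""
    match = KERNEL_PANIC_PATTERN.search(previous_reboot_cause)
    if match is None:
        return None
    return {
        "cause": match.group(1),
        "time": match.group("time").strip(),
        "user": "N/A",
        "comment": "N/A",
    }


info = parse_kernel_panic("Kernel Panic [Time: Wed 24 Mar 2021 11:22:03 PM UTC]")
```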
Backport of https://github.com/Azure/sonic-buildimage/pull/7031 to the 202012 branch
#### Why I did it
To enable parsing the `AutoNegotiation` element from the LinkMetadata section of minigraph file
#### How I did it
Parse the value of the `AutoNegotiation` element from the `LinkMetadata` section of the minigraph file. If the element is present, an `autoneg` key is added to the port in the `PORT` table of Config DB with a value of either `0` or `1`.
If an `autoneg` value is present in port_config.ini, the value from the minigraph will take precedence, overriding that value.
Also remove `AutoNegotiation` and `EnableAutoNegotiation` elements from the `DeviceInfo` section, as we will use this data in the `LinkMetadata` section to determine whether to enable auto-negotiation for a port.
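The precedence rule can be sketched as below. The function name `merge_autoneg` and the dictionary shapes are hypothetical and only serve to show that a minigraph value overrides an `autoneg` value from port_config.ini, while ports without a minigraph value keep whatever port_config.ini provides.

```python
def merge_autoneg(ports_from_ini, autoneg_from_minigraph):
    """Apply minigraph AutoNegotiation values on top of port_config.ini values.

    ports_from_ini: {port_name: {field: value}} (hypothetical shape)
    autoneg_from_minigraph: {port_name: bool}, only for ports whose
        LinkMetadata carries an AutoNegotiation element
    """
    merged = {name: dict(fields) for name, fields in ports_from_ini.items()}
    for name, enabled in autoneg_from_minigraph.items():
        if name in merged:
            # The minigraph value wins over any autoneg set in port_config.ini.
            merged[name]["autoneg"] = "1" if enabled else "0"
    return merged


ini_ports = {
    "Ethernet0": {"speed": "100000", "autoneg": "0"},
    "Ethernet4": {"speed": "100000"},
    "Ethernet8": {"speed": "100000", "autoneg": "1"},
}
minigraph_autoneg = {"Ethernet0": True, "Ethernet4": False}
port_table = merge_autoneg(ini_ports, minigraph_autoneg)
```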
The default BGP connect retry timer is 120 seconds, so a reconnection attempt happens 120 seconds after the initial connection fails. This PR aims to allow a more frequent retry.
This PR updates the following commits in sonic-platform-daemons
260cf2d [xcvrd] change firmware information fields name inside MUX_CABLE_INFO table for Y cable (#165)
cfa600f [thermalctld] Initialize fan led in thermalctld for the first run (#167)
8509f43 [thermalctld] Refactor to allow for greater unit test coverage; Add more unit tests (#157)
70f4e7b [syseepromd] Update warning message to be more informative (#160)
Signed-off-by: vaibhav-dahiya <vdahiya@microsoft.com>
utilities:
* 83f068b 2021-03-22 | Handling error scenario of adding port to Vlan which is part of LAG (#1516) (HEAD -> 202012) [Sudharsan Dhamal Gopalarathnam]
* 470e8ce 2021-03-24 | Enable PFCWD only on ports where PFC is enabled (#1508) [Andriy Yurkiv]
* 09ef2e0 2021-03-22 | [show][config] add support for setting and displaying switching modes on Y cable (#1501) [vdahiya12]
* 0d17d37 2021-03-24 | Warmboot script improvements - timeout exec, disable swss autorestart, remove trap (#1495) [Vaibhav Hemant Dixit]
* 2718cd8 2021-03-24 | [show] Fix int status of LAGs, configured as Vlan members (#1478) [maksymbelei95]
* cc168fb 2021-03-22 | Fix bug: show vlan config for vlan with no members (#1503) [allas-nvidia]
swss:
* 5d8d1fb 2021-03-26 | Revert "Revert "[buffermgr] Support maximum port headroom checking (#1607)" (#1675)" (#1682) (HEAD -> 202012) [Prince Sunny]
* f8df1f8 2021-03-26 | [Dynamic Buffer Calc] Enhance the field checking in table handling (#1680) [Stephen Sun]
* 6328c9f 2021-03-22 | [MuxOrch] FDB ageout safety check (#1674) [Prince Sunny]
* e1d733e 2021-03-21 | reduce severity of log to info in case of flush on non-existing member (#1669) [allas-nvidia]
Signed-off-by: Ying Xie <ying.xie@microsoft.com>
c20bf60 Qi Luo Mon Mar 15 14:28:31 2021 -0700 Implement rfc4363 FdbUpdater for lag inside vlan (#203)
292024a abdosi Mon Mar 15 12:15:21 2021 -0700 Updated lldpRemManAddrTable to use all the management ip address associated with interface. (#201)
9b83459 liushilongbuaa Fri Mar 12 14:35:23 2021 +0800 [CI] Setup dummy azure pipeline (#198)
Signed-off-by: Abhishek Dosi <abdosi@microsoft.com>
Why I did it
It was observed that on a multi-ASIC DUT bootup, the internal BGP sessions between ASICs took longer to reach ESTABLISHED than the external BGP sessions. The internal sessions came up almost exactly 120 seconds later.
On a multi-ASIC platform, the per-ASIC bgp dockers are brought up at around the same time when the switch starts, and they try to establish BGP sessions with neighbors (in peer ASICs) that may not be completely up yet. This results in a BGP connect failure, and the retry happens after 120 seconds, which is the default Connect Retry Timer.
How I did it
Add the command to set the BGP connect retry timer to 10 seconds for internal BGP neighbors.
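As a rough illustration, the resulting FRR configuration lines could be generated as in the sketch below. The ASN and neighbor addresses are placeholders, the helper name is hypothetical, and the actual change is made in the SONiC BGP templates rather than in code like this. The `neighbor ... timers connect` command is the standard FRR per-neighbor connect timer.

```python
def render_connect_retry(asn, internal_neighbors, retry_seconds=10):
    """Build FRR config lines that shorten the connect retry timer.

    The ASN and neighbor addresses are placeholders for illustration.
    """
    lines = ["router bgp {}".format(asn)]
    for neighbor in internal_neighbors:
        lines.append(" neighbor {} timers connect {}".format(neighbor, retry_seconds))
    return lines


config_lines = render_connect_retry(65100, ["10.0.0.1", "10.0.0.5"])
```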
sonic-swss
* [nbrmgrd] added function to parse IP address from APP_DB (#1672)
* [MUX/PFCWD] Use in_ports for acls instead of seperate ACL table (#1670)
* [mux] VS test for neigh, route and fdb (#1656)
* [Dynamic buffer calc] Bug fix: Remove PGs from an administratively down port. (#1652)
* spell check fixes (#1630)
sonic-utilities
* [reboot]: Stop mux before reboot on dual ToR (#1500)
* [config] Disable/enable container monitoring when stopping/starting services (#1499)
* Add 'show' and 'clear' command for PG drop (#1461)
* [CLI][techsupport] Add NOOP option for commands that did not have that option (#1445)
* [202012][reload] Improve reload by using sonic.target (#1509)
Signed-off-by: Danny Allen <daall@microsoft.com>
sonic-swss
* Add table descriptions for dynamic buffer calculation to the documents (#1664)
* Remove vxlanmgrd dependency on orchagent (#1647)
sonic-utilities
* [show] Fix 'show mac' output, when FDB entry with Vlan 1 is present (#1368)
* [warm-reboot]: Check empty key before issuing redis hget (#1496)
* [generate-dump] Remove Arista specific logic (#1482)
* [warm-reboot]: added automated recover for ISSU file (#1466)
* [warm-reboot] Check if warm restart flag is set when issuing a warm-reboot (#1460)
* [show][config] fix for show/config muxcable hwmode model value; fix show/config muxcable return codes; (#1494)
sonic-linux-kernel
* [net] Disable prio and cls cgroups to make working cgroup2 sock matching (#198)
Signed-off-by: Danny Allen <daall@microsoft.com>
Features may be enabled/disabled for the same topology based on run-time
configuration. This PR adds the ability to enable/disable feature based
on config db data.
signed-off-by: Tamer Ahmed <tamer.ahmed@microsoft.com>
* [202012][submodule] Update sonic-utilities submodule
- [acl-loader] Improve input validation for acl_loader (#1479)
- [show] cli support for show muxcable cableinfo (#1448)
Signed-off-by: Danny Allen <daall@microsoft.com>
- Unset CONFIG_THERMAL_STATISTICS to prevent kernel crash (#199)
- [dni_dps460] Add attributes to retrieve PMBus status command codes (#197)
- [mellanox]: Backport new kernel patches (#195)
- [ci]: build amd64/armhf/arm64 for CI build (#196)
- Fix read and write failure to ‘fan1_target’ attribute of ‘dni_dps460’ driver. (#183)
Signed-off-by: Danny Allen <daall@microsoft.com>
* src/sonic-platform-daemons 068bccc...e5165b7 (7):
> [xcvrd] Fix crash: If 'dom_capability' not in port_info_dict, insert 'N/A' (#162)
> fix the muxcable state change notification received from other modules, omit the check inside hw_state table (#159)
> [xcvrd] Fix crash on platforms which support media settings with Python 3 (#158)
> [xcvrd] Save the dom_capability of transceiver into db (#72)
> [xcvrd] Fix xcvrd crash on other port prefixes (#123)
> [xcvrd] Make functions used for media setting python3 compatible (#153)
> [psud] Refactor unit tests; increase unit test coverage (#146)
Update FRR to 7.5.1. The following is a list of new commits.
```
df7ab485b FRRouting Release 7.5.1
f4ed841b8 Merge pull request #8187 from opensourcerouting/rpmfixes-75
86d5a20e3 Merge pull request #8193 from mjstapp/fix_signals_7_5
b339cc149 lib: avoid signal-handling race with event loop poll call
0f7b432c3 lib: add debug output for signal mask
c0290c86d lib: add sigevent_check api
7a5348665 doc: Fix CentOS 7 Documentation
2a8e69f48 Merge pull request #8064 from donaldsharp/foo
cf4d1a744 redhat: Fix changelog incorrect date format
b78dcb209 Merge pull request #8181 from idryzhov/7.5-zebra-blackhole
2032e7e72 zebra: don't use kernel nexthops for blackhole routes
e52003567 bgpd: When deleting a neighbor from a peer-group the PGNAME is optional
aa86a6a6f Merge pull request #8161 from mjstapp/fix_sa_7_5_backports
13a8efb4b Merge pull request #8156 from idryzhov/7.5-backports-2021-02-26
58911c6ed lib: Free memory leak in error path in clippy
556dfd211 lib: use right type for wconv() return val
bd9caa8f1 lib: fix some misc SA warnings
683b3fe3f lib: register dependency between control plane protocol and vrf nb nodes
b45248fb6 lib: add definitions for vrf xpaths
7b9f10d04 lib: add ability to register dependencies between northbound nodes
9c240815c bgpd: Bgp peer group issue
d1b43634b bgpd: upon bgp deletion, do not systematically ask to remove main bgp
f5d1dc55e bgpd: Fix crash when we don't have a nexthop
c2e463478 frr-reload: rpki context exiting uses exit and not end
f11db1698 bgpd: Blackhole nexthops are not reachable
c628e94ff staticd: fix vrf enabling
49b079ef1 staticd: fix nexthop creation and installation
0077038e9 staticd: fix nexthop validation
be3dfbbc7 zebra: use AF_INET for protocol family
```
Update the sonic-swss-common submodule. The following are the commits in the submodule.
f01fede [debian/control] libswsscommon-dev depends on libbost-dev (#458)
607a8ce Convert return value of get_all function in SonicV2Connector to dict (#462)
Closes issue #6982.
The issue was root-caused to our use of the unix socket as the default mechanism for reading from the DB (#5250). The redis unix socket is created as follows.
admin@str--acs-1:~$ ls -lrt /var/run/redis/redis.sock
srwxrw---- 1 root redis 0 Mar 6 01:57 /var/run/redis/redis.sock
So it used to work fine for the user "root" or for any user in the redis group (admin was made part of the redis group by default).
The fix: check whether the user has sudo permissions and, if so, use the redis unix socket; otherwise fall back to the TCP socket.
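A minimal sketch of this selection logic follows. The helper name, the use of the effective UID as the "sudo" test, and the TCP host and port are assumptions made for illustration; only the socket path comes from the listing above.

```python
import os


def choose_redis_transport(euid=None):
    """Pick the Redis transport based on the effective user ID.

    Hypothetical helper: root (euid 0, e.g. under sudo) may open the unix
    socket; any other user falls back to TCP.
    """
    if euid is None:
        euid = os.geteuid()
    if euid == 0:
        return {"unix_socket_path": "/var/run/redis/redis.sock"}
    # Host and port are assumed defaults, not taken from the change.
    return {"host": "127.0.0.1", "port": 6379}
```

The returned dictionary is meant to be passed as keyword arguments to whichever Redis client the caller uses.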
Why I did it
We skip installing the CNI plugin, as we don't need it. But this leaves the node in the "not ready" state upon joining the master.
To fix this, we copy a dummy .conf file into /etc/cni/net.d
How I did it
Keep this file in /usr/share/sonic/templates and copy it to /etc/cni/net.d upon joining the k8s master.
How to verify it
After configuring the master IP and enabling join, watch the node join and move to the ready state.
You may verify this using the kubectl get nodes command
Includes the following commits:
1673d25 [y_cable] refactor upgrade firmware API's; Fix vendor and part number API's read size for read_eeprom (#174)
ed93a15 [sonic_platform_base] Proper use of class and instance attributes (#173)
691de92 [sonic_y_cable] add stub function for upgrade firmware of Y cable and split the get_part_number and get_vendor API's (#171)
Signed-off-by: vaibhav-dahiya <vdahiya@microsoft.com>
To adjust the config db generated via minigraph for the match mode changes in fine grained ECMP. The changes allow nexthop IP based filtering to determine which routes require Fine Grained ECMP. Previously, the only mode was to filter on the IP prefix of the route; with this match mode change, nexthop IP based filtering is used instead.
Azure/SONiC#727
How I did it
The change modifies the config db entry created for FG_NHG to include 'match_mode': 'nexthop-based' so that nexthop IP based filtering can determine which routes require Fine Grained ECMP. It also removes the FG_NHG_PREFIX entry, since it is not needed under match mode nexthop-based.
#### Why I did it
It is possible to have a DHCP relay configuration with no servers/
helpers, which results in the DHCP container crashing. This PR fixes this
issue by not starting DHCP relay for vlans with no DHCP helpers.
resolves: #6931
closes: #6931
#### How I did it
Do not add a program group for DHCP relay when there are no DHCP helpers
#### How to verify it
Unit test
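The filtering idea can be sketched as follows. The function name and the table shape are assumptions for illustration; the actual change lives in the supervisor config template of the DHCP relay container.

```python
def vlans_needing_dhcp_relay(vlan_table):
    """Return the VLAN names that have at least one DHCP helper configured.

    vlan_table mirrors a VLAN table with an optional 'dhcp_servers' list;
    the exact field name is an assumption made for this sketch.
    """
    return sorted(
        name for name, fields in vlan_table.items()
        if fields.get("dhcp_servers")
    )


sample_vlans = {
    "Vlan1000": {"dhcp_servers": ["192.0.2.1"]},
    "Vlan2000": {"dhcp_servers": []},
    "Vlan3000": {},
}
relay_vlans = vlans_needing_dhcp_relay(sample_vlans)
```

Only VLANs in the returned list would get a relay program entry, so an empty or missing helper list no longer starts a process that crashes.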
#### Why I did it
Some platforms have difficulty attaching an egress ACL to a vlan.
#### How I did it
When an egress ACL is attached to a vlan, bind it to the vlan's member ports instead.
#### How to verify it
Unit test
Tested in DUT
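A minimal sketch of the expansion is shown below. The function name and the shape of `vlan_members` are hypothetical; the sketch only shows a vlan bind point being replaced by its member ports while other ports pass through unchanged.

```python
def expand_egress_acl_ports(ports, vlan_members):
    """Replace each VLAN in an egress ACL port list with its member ports.

    ports: list of bind points, e.g. ["Vlan1000", "Ethernet8"]
    vlan_members: {vlan_name: [member_port, ...]} (hypothetical shape)
    """
    expanded = []
    for port in ports:
        # A non-VLAN port is kept as-is; a VLAN is replaced by its members.
        members = vlan_members.get(port, [port])
        for member in members:
            if member not in expanded:  # keep order, drop duplicates
                expanded.append(member)
    return expanded


bind_points = expand_egress_acl_ports(
    ["Vlan1000", "Ethernet8"],
    {"Vlan1000": ["Ethernet0", "Ethernet4", "Ethernet8"]},
)
```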
30d09be fix the muxcable state change notification received from other modules, omit the check inside hw_state table (#159)
32ec23c [xcvrd] Fix crash on platforms which support media settings with Python 3 (#158)
47bcf90 [xcvrd] Save the dom_capability of transceiver into db (#72)
b9381a5 [xcvrd] Fix xcvrd crash on other port prefixes (#123)
c3c1a59 [xcvrd] Make functions used for media setting python3 compatible (#153)
e179ffc [psud] Refactor unit tests; increase unit test coverage (#146)
Signed-off-by: vaibhav-dahiya <vdahiya@microsoft.com>
An interface may be attached to multiple vlans, so the VlanInterface should be in tagged mode.
Signed-off-by: Qi Luo <qiluo-msft@users.noreply.github.com>
Fix a strange bug introduced by https://github.com/Azure/sonic-buildimage/pull/6832 which would only occur in environments with both Python 2 and Python 3 installed (e.g., the PMon container). Error messages such as the following would be seen:
```
ERR pmon#ledd[29]: Failed to load ledutil: module 'importlib' has no attribute 'machinery'
```
This is very odd, and it seems like the Python 2 version of importlib, which is basically just a stub, is taking precedence over the Python 3 version. I found that this occurs when calling `import importlib`. However, calling `import importlib.machinery` and `import importlib.util` causes the proper package to be referenced, and the `machinery` and `util` modules are loaded successfully. This is how it is specified in examples in the official documentation, however there is nothing mentioned regarding that it *should* be done this way or that `import importlib` is unreliable.
Also, since sonic-py-common is still used in environments with Python 2 installed we should maintain support for both Python 2 and 3 until we completely deprecate Python 2, so I have added this back in.
There is a bug in how pyangbind translates yang models into python bindings. The model always sets integer values to 0 by default, so there is no way to check if a user has provided a value that is equal to 0. This is problematic for ICMP and VLAN (among others) because 0 is a valid input value.
This change converts ICMP and VLAN fields to union types so that acl-loader will treat them as null values unless a user explicitly adds an integer value.
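The ambiguity can be illustrated without pyangbind. In the sketch below, the function and field names are illustrative only: an integer field that defaults to 0 cannot tell an explicit 0 from a missing value, whereas a nullable field can.

```python
def build_match_fields(icmp_type=None, vlan_id=None):
    """Emit only the fields the user actually provided.

    None stands for "not provided"; 0 is kept because it is a valid value.
    Field names are illustrative, not the acl-loader schema.
    """
    fields = {}
    if icmp_type is not None:
        fields["ICMP_TYPE"] = icmp_type
    if vlan_id is not None:
        fields["VLAN_ID"] = vlan_id
    return fields


def build_match_fields_buggy(icmp_type=0, vlan_id=0):
    """Integer defaults of 0 make an explicit 0 look like 'not provided'."""
    fields = {}
    if icmp_type:
        fields["ICMP_TYPE"] = icmp_type
    if vlan_id:
        fields["VLAN_ID"] = vlan_id
    return fields
```

With the nullable version, a rule that matches ICMP type 0 keeps its match field, while the integer-default version silently drops it.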
Signed-off-by: Danny Allen <daall@microsoft.com>
Migrate from using the `imp` module to using the `importlib` module. As of Python 3.4, the `imp` module has been deprecated in favor of the `importlib` module.
- Why I did it
Group all SONiC services together so that they can be managed together. This will be used in the config reload command as a much simpler and more generic way to restart services.
- How I did it
Add services to sonic.target
- How to verify it
Together with Azure/sonic-utilities#1199
config reload -y
Signed-off-by: Stepan Blyshchak <stepanb@nvidia.com>
183162f Fix issue: expect redis pubsub data to be str type instead of bytes type (#196)
#### Why I did it
Update the submodule for snmpagent on the 202012 branch, since there is no 202012 branch for snmpagent
#### How I did it
Update submodule pointer
#### How to verify it
Run build
Change in this update:
b75aab7 [swss-common] Add LINKMGR CFG and MUX LINKMGR state table names (#421)
4a77d1c [ci]: add vstest (#459)
07258a6 [ci]: use build template (#457)
ddcae3e runRedisScript api to process integer returned by script run in the redis (#447)
33d89c7 [systemlag] Schema defs for system lag (#448)
af01f37 spell check fixes (#456)
7afd43d Update to make getNamespaces() API at par with the get_ns_list() swssdk-py API. (#455)
signed-off-by: Tamer Ahmed <tamer.ahmed@microsoft.com>
This PR includes the following commits in sonic-platform-daemons
068bccc [xcvrd] Store mux_cable telemetry data in State DB (#148)
93cac0a [ci]: download from sonic-buildimage.vs artifact (#152)
d651e9b [GitHub] Add pull request template (#151)
bd7830b [pcied] Remove unnecessary message and move the configuration path (#144)
9080fda [ci] Call pip2/3 using sudo (#150)
de60784 [ci] Test and build packages using Azure Pipelines (#149)
8bf0fd1 [ledd] Refactor to allow for more thorough unit testing; Increase unit test coverage (#147)
26bdc9e Set up CI with Azure Pipelines
1fcaa57 [pcied] Add PCIe AER stats collection (#100)
Signed-off-by: vaibhav-dahiya <vdahiya@microsoft.com>
Update FRR to the 7.5 head. The following is a list of new commits.
```
e2f17ae47ad047e66923c2ff1e84c9ba10d4ad38 Merge pull request #8096 from idryzhov/7.5-backports-2021-02-16
380341362ced8e317c18b7395acb012de1f23acd ospf6d: Don't send hellos on loopback interface
7fa78b659f8e720466e0df62689327ea4b9ff867 bgpd: send correct BMP down message when nht fails
385faf6c079a41def1e6eb882cbfd50047559644 [filter]: change return code for errors
d9a0e9a2934f2f75c64496fe4c724a18aa581fcb bfdd: fix session lookup
08afa0a75311a4e8cb2a18116384b603f7f2d751 ospf6d : fix issue in ecmp inter area route
2299afa1a9128d87d5169742b993c0ada575eb83 ospfd: Prevent duplicate packet read in certain vrf situations
ff42a28af659ee61c0efb877b10738a5812f4bc2 vrf: use wrappers to change VRF_CONFIGURED flag
2bdc59ca21da2d67b77ec70a2fadffbca60690cd vrf: mark vrf as configured when entering vrf node
b9611f65a71adc0b8fa14a5a4d1a8f44e04dcd85 ospf6d: Fix LSA formatting out-of-bounds access
610ebf56913fa56167b0a2a127b07afe020a1efe bfdd: Prevent use after free ( again )
35b0cd5d753dda9aa70ea1c06db61a8d4b8671e3 *: Fix usage of bfd_adj_event
95b8915d0f4de3eae5438632ecd0827061ef48e8 ospf6d: Fix LSA formatting inconsistent retvals
49d73d8be84dbd23d767697474019165e511786c pimd: SGRpt prune received during prune didn't override holdtime
1d0d19afa9bb7cd4bc476d00c887876bc04eee95 eigrpd: Correctly set the mtu for eigrp packets sent
bbb08db69f8eb554d23b4920c1c1e3982d8d2a91 zebra: Prevent sending of unininted data
0813d650a8120458ab7d9317061f3864dbc6f2f7 ospf6d: prevent use after free
2f2e981d967b36b240fca82fea8a961d927ef43c lib: Prevent unininted usage of data
6171becdb391ea5b88916a3a28b04b555e1fc518 bfdd: Prevent storage of ifp pointer that has been deleted
9ebb41cf4bb51e0872796530bf8c7a4d819053db bfdd: Prevent unininited data transmittal
72e16db6fea3629111537f9eb10c86f2d275adcb eigrpd: Prevent uninitialized value from being used
72b61a5bb09d59c3cc0d1d401d51de96949dff52 zebra: disallow resolution to duplicate nexthops
1083bae40b00c0ed2c9f3521ae1ab9675a87202e bgpd: Initialize bgp_notify.raw_data before passing to bgp_notify_receive()
31df7314310416f10c133dcfe9c4586edadf3fbb doc: ebgp-requires-policy requires manuall session clearing
ecc8ec678d2d8a1c3d1d50a22732f9fc4bad689c watchfrr: fix SA warning
9d9365d161979a031de817c1fbcab6508dfee013 watchfrr: fix crash on missing optional argument
907e600d63c1c5b6bda40b0a08344a72533b1787 pimd: Prevent use after free
b47374f0e95d99c93bfe2d14afe55219a9fda455 doc: Update bgp doc for more rfc-8212 talk
4fbeef60cc8dc5362ff84fc91d1a4e343e4e32c7 docker: centos 7, 8 yang bump and repo fixes
808e6d731f330df4a91fdfd6df6a3c8dce1651a6 docker: prefer alpine:latest for building
91b3c471f1c48818370a0f218add917f0d46aa47 Merge pull request #8092 from donaldsharp/7.5_track
60be43c0bf63c16ca42008fa802d0a2050f3fce2 Merge pull request #8090 from ton31337/fix/static_network_vrf_7.5
1f6785aa60cc57a5c8d5de98c9c09a344a0c9262 ospf6d: Track wait_timer and disable when needed
c89e326be91312bed066eb2447ea8944e25a225e bgpd: Check for peer->su_remote if not NULL when handling IPv6 nexthop
15e070f6448870c98c030b6b5013ad8750d8918b Merge pull request #8047 from pguibert6WIND/nhrp_shortcut_routes_75
912994efec94082ae7d8c5e014c410964bea19f4 Merge pull request #8034 from qlyoung/fix-gnu-readline-bracketed-paste-7.5.1
9f50536993f1eb900fbfbe98d21b8c072bbd9c15 nhrpd: replace nhrp route nexthop with onlink route when prefix=nh
8c185008246db31c34574d7b79358001ac411f84 nhrpd: shortcut routes installed with nexthop.
c46c87d19758040bc3f3902ab8e4a0f1bb908721 vtysh: disable bracketed paste in readline
20b35e4c3386de798f3b0cb9f2a7e6b04d995485 Merge pull request #8018 from ton31337/fix/drop_aggregate_as_attribute_if_malformed_7.5
fa25d7327fd64613cc7530aba2edfcde038da074 bgpd: Unset only aggregator flag when AGGREGATOR_AS is 0
3ee9a3726fe1a526d946c1978487a4509fe98f29 bgpd: Drop aggregator_as attribute if malformed in case of BGP_AS_ZERO
be88595c6a2011f0e882bfa663baa61c86ede14e Merge pull request #8005 from opensourcerouting/snap-libyang1-fix-75
fd840ad37f2e836b210c6e60fc6325a4c3e495ce snapcraft: Update rtrlib to 0.7.0
3d00552fa9aedb96acd7ea773bc14fd2b77e7e0f snapcraft: Fix passthrough path for Libyang 1.x
```
This PR updates the following commits
c6b642b [ci]: download from sonic-buildimage.vs artifact (#168)
e76ecc6 [sonic_y_cable] add support for retrieving firmware info for Y cable, internal and nic temperature and voltage (#162)
f9cf8c9 [GitHub] Add pull request template (#167)
c31636e [ci] Call pip2/3 using sudo (#166)
5521f67 [ci] Test and build packages using Azure Pipelines (#164)
faca35c [ci]: Set up CI with Azure Pipelines
Signed-off-by: vaibhav-dahiya <vdahiya@microsoft.com>
Submodule commits included:
* src/sonic-platform-common 6ad0004...bd4dc03 (1):
> [sonic_sfp/qsfp_dd.py] Update DOM capability method name to align with other drivers (#163)
Also align all calling function names to match.
Check the uid before modifying the local user account.
When running sudo, the process that invokes the nss_tacplus library
does not have the privilege to modify the user profile, and
generates the error messages below:
user_rw@sonic:~$ sudo bash
usermod: Permission denied.
usermod: cannot lock /etc/passwd; try again later.
usermod: Permission denied.
usermod: cannot lock /etc/passwd; try again later.
Signed-off-by: Guohan Lu <lguohan@gmail.com>
sonic-swss:
- [Mux] Route handling based on mux status, kernel tunnel support (#1615)
- Reduce noise during frequent route update (#1624)
- Changed Error log to Notice log during FDB flush notification after VLAN delete (#1618)
- [PortsOrch] Add reference counting to ports for ACL bindings (#1614)
- [crm]: Ignore unsupported/non-implemented switch attributes (#1613)
- [Mux] Fix repeating logs in case of tunnel creation fail (#1610)
sonic-utilities:
- [config reload]: Restart mux container (#1401)
- [storyteller] Enhance the storyteller utility (#1400)
- [show] Fix int status when portchannel is in the system (#1376)
- [config][show] cli support for retrieving ber, eye-info and configuring prbs, loopback on Y-cable (#1386)
- Skip route check for tun0 interfaces (#1399)
- do not parse stderr to get correct routing stack (#1398)
- [storyteller] allow storyteller to work on downloaded logs (#1388)
- [show] Run fwutil with sudo (#1364)
Signed-off-by: Danny Allen <daall@microsoft.com>
The Portchannels were not getting cleaned up because the cleanup took more than 10 seconds, which is the default docker timeout after which a SIGKILL is sent.
Fixes #6199
To check if it works out for this issue in 201911 ? #6503
This issue is seen significantly more often in the master branch than in 201911 because the Portchannel cleanup takes more time in master. Tested on a DUT with 8 Port Channels.
master
admin@str-s6000-acs-8:~$ time sudo systemctl stop teamd
real 0m15.599s
user 0m0.061s
sys 0m0.038s
Sonic 201911.v58
admin@str-s6000-acs-8:~$ time sudo systemctl stop teamd
real 0m5.541s
user 0m0.020s
sys 0m0.028s
When we add an allow-list key with an action, the above route-map gets updated. For example, if we add the deny action, the above template switches to the no-export community. The issue is that when we then delete the key, we still keep no-export and do not move back to the drop community.
This PR fixes this issue by rolling the default route-map community value back to the constants.yml default action.
This PR updates the following commits in sonic-platform-common
6ad0004 [component] add auto_update_firmware() to support the auto update. (#106)
49076a9 [sonic_y_cable] Add support for measuring BER and EYE scan and running Loopback, PRBS modes on the Y cable (#158)
6b12b4c [sfp] Add parsing the dom_capability to sff8472 (#102)
7fc76b9 [sonic_pcie] Add get_pcie_aer_stats and its common implementation (#144)
Signed-off-by: vaibhav-dahiya <vdahiya@microsoft.com>
Update minigraph parser to retrieve kubernetes server info from minigraph.xml and update "KUBERNETES_MASTER|SERVER" in running config.
Update minigraph parser to include clusterName from minigraph.xml into "DEVICE_METADATA|localhost"
Compiling snmpd always fails with a "file truncated" error on the ARM64 architecture. The error log looks like "/usr/bin/ld: mibgroup/ip-forward-mib/inetCidrRouteTable/.libs/inetCidrRouteTable_interface.o: file not recognized: file truncated"
Co-authored-by: Xianghong Gu <xgu@centecnetworks.com>
**- Why I did it**
In thermalctld, when the speed of a fan exceeds the threshold, the fan status is saved as "bad". So in system health, it is better to check fan speed before fan status. That way, if the fan speed exceeds the threshold, we get more detailed information.
**- How I did it**
Move fan speed check logic before fan status check
**- How to verify it**
Manual test
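The reordering can be illustrated with a small sketch. It is not the system health code: `check_fan`, its parameters, and the message texts are assumptions chosen for the example.

```python
def check_fan(name, speed, target, tolerance_pct, status_ok):
    """Return a health message for one fan, or None if it is healthy."""
    low = target * (1 - tolerance_pct / 100)
    high = target * (1 + tolerance_pct / 100)
    # Speed is checked first: an out-of-range speed is the more specific
    # diagnosis, and it would otherwise be hidden behind a "bad" status.
    if not (low <= speed <= high):
        return f"{name} speed is out of range, speed={speed}"
    if not status_ok:
        return f"{name} is broken"
    return None
```

With the checks in the opposite order, a fan whose speed exceeded the threshold would only ever report the generic status message.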
This update includes the following changes
> [syncd armhf] Fix syncd crash when running community test suites (#777)
> Revert "[tests]:Add unittest for MACsec on p2p establishment (#771)"
> [tests]:Add unittest for MACsec on p2p establishment (#771)
> [tests] Enable azure pipeline make check to respect unittests (#760)
* c2fb282 2021-01-29 | [ecnconfig] Allow ecn unit test to run without sudo (#1390) [Neetha John]
* 6cc635b 2021-01-29 | [sonic-installer] Add information to syslog (#1369) [Dmytro]
* 7a8024a 2021-01-27 | Prevent user from adding more then a single untagged VLAN to an interface (#1382) [Eran Dahan]
* 41e62c6 2021-01-26 | [pcieutil] Add 'pcie-aer' sub-command to display AER stats (#1169) [Arun Saravanan Balachandran]
* 47f412b 2021-01-26 | Improve robustness of consutil plugin loading (#1353) [Samuel Angebault]
* 64aa1b8 2021-01-25 | [show] Fix warnings, related to gearbox, while show commands execution (#1343) [maksymbelei95]
* ff226d0 2021-01-25 | Prevent configuring IP interface on a port which is a member of VLAN (#1374) [Eran Dahan]
* f1522b9 2021-01-21 | [config_mgmt.py]: Set leaf-list to empty list while port breakout. (#1268) [Praveen Chaudhary]
* 99c05d5 2021-01-21 | add vlan_intf_object only if there are ipv4 or ipv6 mappings (#1377) [Sumukha Tumkur Vani]
* b082684 2021-01-21 | [ecn] Add tests for ecnconfig command (#1372) [Neetha John]
* 23e0920 2021-01-21 | [sfpshow] Enhance QSFP-DD DOM information (#1207) [shlomibitton]
* f4edba1 2021-01-20 | [ecnconfig] handle backend port names when extracting port I/F ID from the port name (#1361) [Mahesh Maddikayala]
Signed-off-by: Danny Allen <daall@microsoft.com>
- Why I did it
Initially, we used Monit to monitor critical processes in each container. If one of the critical processes was not running
or crashed for some reason, Monit would periodically write an alerting message into syslog. If we added a new process
to a container, the corresponding Monit configuration file also needed to be updated, which made maintenance harder.
We now use the event listener of Supervisord to do this monitoring. Since the processes in each container are managed by
Supervisord, we only need to focus on the monitoring logic.
- How I did it
We use the event listener of Supervisord to monitor critical processes in containers. The event listener takes the
following steps if it is notified that one of the critical processes exited unexpectedly:
The event listener first checks whether the auto-restart mechanism is enabled for this container. If it is, the event listener kills the Supervisord process, which should cause the container to exit and subsequently get restarted.
If the auto-restart mechanism is not enabled for this container, the event listener enters a loop that first sleeps for 1 minute and then checks whether the process is running. If it is, the event listener exits. If not, an alerting message is written into syslog.
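The two branches above can be sketched as one function. This is a simplified illustration, not the actual listener: the function name and all callbacks (`is_running`, `kill_supervisord`, `log_alert`) are hypothetical and are injected so the logic can be exercised without Supervisord.

```python
import time


def handle_critical_exit(autorestart_enabled, is_running, kill_supervisord,
                         log_alert, sleep=time.sleep, interval=60):
    """React to an unexpected exit of a critical process (illustrative only)."""
    if autorestart_enabled:
        # Killing Supervisord makes the container exit so it gets restarted.
        kill_supervisord()
        return
    while True:
        sleep(interval)  # wait 1 minute between checks
        if is_running():
            return  # the process is back, so the listener stops alerting
        log_alert("critical process is not running")
```

Passing `sleep` as a parameter is a design choice for the sketch only; it lets the loop be driven by a fake clock.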
- How to verify it
First, we need to check whether the auto-restart mechanism of a container is enabled by running the command show feature status. If it is enabled, one critical process should be selected and killed manually; then we need to check whether the container gets restarted.
Second, we can disable the auto-restart mechanism, if it was enabled at step 1, by running the command sudo config feature autorestart <container_name> disabled. Then one critical process should be selected and killed. After that, the alerting message should appear in syslog every 1 minute.
- Which release branch to backport (provide reason below if selected)
201811
201911
[x ] 202006
* Fix exception in bgpmon caused by duplicate keys
It is possible that BGP neighbors in the IPv4 and IPv6 address families
share the same name (such as the bgp monitor). However, this case is not
handled in bgpmon, and an exception is raised. This commit
addresses the issue by using a set instead of a list to avoid duplicate keys.
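A minimal sketch of the idea, assuming the neighbor names arrive as two separate sequences; `neighbor_keys` and the sample names are hypothetical and not taken from bgpmon.

```python
def neighbor_keys(ipv4_neighbors, ipv6_neighbors):
    """Collect neighbor keys from both address families without duplicates."""
    keys = set()
    keys.update(ipv4_neighbors)
    keys.update(ipv6_neighbors)  # a name seen in both families is kept once
    return keys
```

A list built by concatenating both families would contain the shared name twice, which is the duplicate-key situation the commit avoids.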
Recent changes introduced the L2 vlan concept. L2 vlans do not have DHCP
clients behind them, so DHCP relay is not required. Also,
dhcpmon fails to launch on those vlans as their interfaces
lack IP addresses. This PR limits the launch of both DHCP relay
and dhcpmon to L3 vlans only.
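One way to picture the L3-only filter is to keep just the vlans that carry an IP prefix. The sketch below assumes interface keys of the form "Vlan1000" (no address) or "Vlan1000|192.168.0.1/21"; that key layout and the `l3_vlans` helper are assumptions for illustration, not the template logic of this PR.

```python
def l3_vlans(vlan_interface_keys):
    """Return the names of vlans that have at least one IP prefix."""
    result = set()
    for key in vlan_interface_keys:
        name, sep, prefix = key.partition("|")
        if sep and prefix:  # only keys that include an address count as L3
            result.add(name)
    return sorted(result)
```

DHCP relay and dhcpmon would then be started only for the names this function returns.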
Signed-off-by: Tamer Ahmed <tamer.ahmed@microsoft.com>
- Support for non-template based FRR configurations (BGP, route-map, OSPF, static route..etc) using config DB schema.
- Support for save & restore: config DB data is read and applied to FRR during startup using Jinja templates
**- How I did it**
- add frrcfgd service
- when frr_mgmt_framework_config is set, frrcfgd starts in the bgp container
- when the user changes BGP or other related table entries in config DB, frrcfgd runs the corresponding VTYSH commands to program FRR
- add a jinja template to generate the FRR config file used by the FRR daemons when the bgp container is restarted
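The translation step can be sketched as a function that turns one table entry into VTYSH command lines. This is not the frrcfgd implementation: the function names and the attribute names (`asn`, `name`) are assumptions, and only a tiny subset of BGP neighbor configuration is shown.

```python
def neighbor_commands(local_asn, neighbor_ip, attrs, deleted=False):
    """Build vtysh command lines for one BGP neighbor entry (illustrative subset)."""
    cmds = ["configure terminal", f"router bgp {local_asn}"]
    if deleted:
        cmds.append(f"no neighbor {neighbor_ip}")
        return cmds
    if "asn" in attrs:
        cmds.append(f"neighbor {neighbor_ip} remote-as {attrs['asn']}")
    if "name" in attrs:
        cmds.append(f"neighbor {neighbor_ip} description {attrs['name']}")
    return cmds


def as_vtysh_argv(cmds):
    """Turn command lines into an argv list using vtysh's repeated -c option."""
    argv = ["vtysh"]
    for cmd in cmds:
        argv += ["-c", cmd]
    return argv
```

The resulting argv could be handed to `subprocess.run` on a system where vtysh is installed; the sketch stops short of executing anything.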
**- How to verify it**
1. Add/delete data on config DB and then run VTYSH "show running-config" command to check if FRR configuration changed.
1. Restart bgp container and check if generated FRR config file is correct and run VTYSH "show running-config" command to check if FRR configuration is consistent with attributes in config DB
Co-authored-by: Zhenhong Zhao <zhenhong.zhao@dell.com>
**- Why I did it**
For now, `hwsku.json` and `platform.json` don't support optional fields. For example, there is no way to add a `fec` or `autoneg` field using `platform.json` and `hwsku.json`.
**- How I did it**
Added parsing of optional fields from hwsku.json.
**- How to verify it**
Add an optional field to `hwsku.json`. A new `config_db.json` is generated after first boot, or you can generate it using the `sonic-cfggen` command. This file must contain the optional field from `hwsku.json`. You can also check using the command `redis-cli hgetall PORT_TABLE:Ethernet0`
Example of `hwsku.json`, that must be parsed:
```
{
"interfaces": {
"Ethernet0": {
"default_brkout_mode": "1x100G[40G]",
"fec": "rs",
"autoneg": "0"
},
...
}
```
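The parsing step can be sketched as a merge of the per-interface fields from `hwsku.json` into each port entry. This is a simplified illustration rather than the `sonic-cfggen` code; `merge_optional_fields` and the `NON_PORT_KEYS` set are assumptions for the example.

```python
import json

# Assumption for this sketch: keys that are not copied into the port entry.
NON_PORT_KEYS = {"default_brkout_mode"}


def merge_optional_fields(hwsku_text, ports):
    """Copy optional per-interface fields from hwsku.json into port entries."""
    hwsku = json.loads(hwsku_text)
    interfaces = hwsku.get("interfaces", {})
    merged = {}
    for name, base in ports.items():
        entry = dict(base)  # leave the caller's data untouched
        for key, value in interfaces.get(name, {}).items():
            if key not in NON_PORT_KEYS:
                entry[key] = value
        merged[name] = entry
    return merged
```

Given an interface entry like the one above, the merged port entry gains `fec` and `autoneg` while keeping its existing fields such as `alias` and `speed`.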
Example of generated `config_db.json`:
```
"PORT": {
"Ethernet0": {
"alias": "Ethernet0",
"lanes": "0,1,2,3",
"speed": "100000",
"index": "1",
"admin_status": "up",
"fec": "rs",
"autoneg": "0",
"mtu": "9100"
},
```
So, we can see these entries in redis db:
```
admin@sonic:~$ redis-cli hgetall PORT_TABLE:Ethernet0
1) "alias"
2) "Ethernet0"
3) "lanes"
4) "0,1,2,3"
5) "speed"
6) "100000"
7) "index"
8) "1"
9) "admin_status"
10) "up"
11) "fec"
12) "rs"
13) "autoneg"
14) "0"
15) "mtu"
16) "9100"
17) "description"
18) ""
19) "oper_status"
20) "up"
```
This is also a way to fix the `FEC` field in `show interfaces status`, but it requires adding the `fec` field to `hwsku.json`.
Before:
```
admin@sonic:~$ show interfaces status
  Interface            Lanes    Speed    MTU    FEC        Alias    Vlan    Oper    Admin             Type    Asym PFC
-----------  ---------------  -------  -----  -----  -----------  ------  ------  -------  ---------------  ----------
  Ethernet0          0,1,2,3     100G   9100    N/A    Ethernet0  routed      up       up  QSFP28 or later         N/A
```
After:
```
admin@sonic:~$ show interfaces status
  Interface            Lanes    Speed    MTU    FEC        Alias    Vlan    Oper    Admin             Type    Asym PFC
-----------  ---------------  -------  -----  -----  -----------  ------  ------  -------  ---------------  ----------
  Ethernet0          0,1,2,3     100G   9100     rs    Ethernet0  routed      up       up  QSFP28 or later         N/A
```
**- Why I did it**
Prior to SONiC using Debian Buster, we needed to build Python 3.5 or newer from source for installation in the SNMP container, because it wasn't available from the Debian repository for Jessie or Stretch. Now that all containers are based on Buster, we simply install Python 3.7 from the Debian repository in the host as well as all containers. We are no longer building Python 3 from source, so the Makefile is unused and we no longer need to install build dependencies in the slave containers.
**- How I did it**
- Remove Python 3 makefile
- No longer install Python 3 build dependencies in the slave containers.
Submodule changes to be committed:
* src/sonic-platform-daemons 81318f7...e72f6cd (3):
> [ledd] Minor refactor; add unit tests (#143)
> [thermalctld] Report unit test coverage (#141)
> [psud] Increase unit test coverage (#140)