Why I did it
get_system_mac was returning 'None' mac for system without eeprom.
get_system_mac for marvell platform checks for mac in eeprom, profile.ini(hwsku file) and eth0. Check for valid mac returned by syseeprom was incorrect. Which was resulting in bypassing mac get from profile.ini and eth0.
How I did it
get_system_mac already has a logic to get first valid mac.
Removed null check for mac returned by eeprom.
Corrected the check for profile.ini file by checking if file exist.
How to verify it
Executed sonic-cfggen to check valid mac address is getting configured in config_db.json with/without profile.ini.
Signed-off-by: Pavan Naregundi <pnaregundi@marvell.com>
This submodule update needs to be manually done due to build changes
done in the sairedis submodule. Specifically, Debian build profiles are
now being used instead of dpkg build targets, and dbgsym packages are
being used instead of dbg packages. Because of this, there needs to be
changes on the sonic-buildimage side for this.
This submodule update brings in the following changes:
ce8f642 [vs] Use boost join to concatenate switch types in config (#1266)
d6055a2 [vslib]: Temporaily map DPU switch type to NVDA_MBF2H536C (#1259)
e1cdb4d [CodeQL]: Use dependencies with relevant versions in azp template. (#1262)
c08f9a2 [CI]: Fix collect log error in azp template. (#1260)
eed856c [CodeQL]: Fix syncd compilation in azp template. (#1261)
a3f1f1a Reland 'Make changes to building and packaging sairedis (#1116)' (#1194)
Signed-off-by: Saikrishna Arcot <sarcot@microsoft.com>
#### Why I did it
src/sonic-platform-daemons
```
* 76baca3 - (HEAD -> master, origin/master, origin/HEAD) Fixes for the issues uncovered by sonic-pcied unit tests (#389) (32 hours ago) [Ashwin Srinivasan]
```
#### How I did it
#### How to verify it
#### Description for the changelog
Why I did it
For route registry service, in order to block hijacked routes, IBGP session needs to be set up from BGP sentinel service to SONiC, and BGP sentinel service advertise the same route with higher local-preference and no export community. So that SONiC takes the route from BGP sentinel as the best path and does not advertise the route to EBGP peers.
In order to do that, new route-maps are needed. So this change adds a new set of templates, keeping BGPSentinel peers out of the other templates.
Work item tracking
Microsoft ADO (number only): 24451346
How I did it
Add sentinel_community in constants.yml, route from BGPSentinel do not match this community will be denied.
Add support to convert BGPSentinel related configuration in the BGPPeerPassive element of the minigraph to a new BGP_SENTINELS table in CONFIG_DB
Add a new set of "sentinels" templates to docker-fpm-frr
Add a new BGP peer manager to bgpcfgd, to add neighbors from the BGP_SENTINELS table using the "sentinels" templates
Add a test case for minigraph.py, making sure the BGPSentinel and BGPSentinelV6 elements create BGP_SENTINELS DB entry.
Add a set of test cases for the new sentinels templates in sonic-bgpcfgd tests.
Add sonic-bgp-sentinel.yang and a set of testcases for the yang file.
How to verify it
Testcases and UT newly added would pass.
Setup IPv4 and IPv6 BGPSentinel services in minigraph, and load minigraph, show CONFIG_DB and "show runningconfig bgp", configuration would be loaded successfully.
Using t1-lag topo and setup IBGP session from BGPSentinel to SONiC loopback address, IBGP session would up.
Advertise route from BGPSentinel to T1 with sentinel_community, higher local-preference and no-export communiyt. In T1, show bgp route, the result is "Not advertise to any EBGP peer".
Withdraw the route in BGPSentinel, in T1, route would advertise to EBGP peers.
Advertise route from T1 that does not match sentinel_community, in T1, would not see the route in show bgp route.
#### Why I did it
src/sonic-gnmi
```
* 610509b - (HEAD -> master, origin/master, origin/HEAD) Install necessary debs instead of entire artifact in azp (#137) (2 hours ago) [Zain Budhwani]
```
#### How I did it
#### How to verify it
#### Description for the changelog
#### Why I did it
src/sonic-swss
```
* cb1b3f40 - (HEAD -> master, origin/master, origin/HEAD) Remove system neighbor DEL operation in m_toSync if SET operation for (#2853) (7 hours ago) [Song Yuan]
```
#### How I did it
#### How to verify it
#### Description for the changelog
#### Why I did it
src/linkmgrd
```
* 6e5cfda - (HEAD -> master, origin/master, origin/HEAD) Change common_libs dependencies from buster to bullseye (#212) (2 days ago) [Ze Gan]
```
#### How I did it
#### How to verify it
#### Description for the changelog
#### Why I did it
event yang models for usage currently use int as type for usage leaf, needs to be of type decimal64
##### Work item tracking
- Microsoft ADO **(number only)**:17747466
#### How I did it
Update yang models and UT
#### How to verify it
UT
#### Why I did it
src/sonic-swss
```
* 5b27c209 - (HEAD -> master, origin/master, origin/HEAD) Refactor Orch class to separate recorder implementation (#2837) (8 hours ago) [Vivek]
```
#### How I did it
#### How to verify it
#### Description for the changelog
#### Why I did it
src/sonic-platform-daemons
```
* 94242c2 - (HEAD -> master, origin/master, origin/HEAD) Use vendor customizable fan speed threshold checks (#378) (3 hours ago) [spilkey-cisco]
* db6e340 - Fix index out of range in the error log of invalid media lane mask received (#386) (8 hours ago) [MichaelWangSmci]
```
#### How I did it
#### How to verify it
#### Description for the changelog
- Why I did it
Adjust PSU power threshold logic in system health.
- How I did it
Update the description message in PSU power threshold checking
power of PSU x (xx w) exceeds threshold (xx w) => System power exceeds xx threshold (xx w)
- How to verify it
Manual test and unit test
Why I did it
When sonic is managed by k8s, the sonic container is managed by k8s daemonset, daemonset identifies its members by labels. Currently when restarting a sonic service by systemctl, if the service's container is already managed by k8s, systemd script stops the container by removing the feature label to make it disjoin from k8s daemonset, and then starts it by adding the label to make it join k8s daemonset again.
This behavior would cause problem during k8s container upgrade. Containers in daemonset are upgraded in a rolling fashion, that means the daemonset version is updated first, then rollout the new version to containers with precheck/postcheck one by one. However, if a sonic device joins a daemonset, k8s will directly deploy a pod with the current version of daemonset, it is expected when a device joins k8s cluster at first time.
But for a device which has already joined k8s cluster, the re-joining daemonset will cause the container upgraded to new version without precheck, so if a systemd service is restarted during daemonset upgrade, the container may be upgraded without precheck and break rolling update policy. To fix it, we need to remove the logic about dropping k8s label in systemd service stop script for kube mode.
Work item tracking
Microsoft ADO (number only): 24304563
How I did it
Don't drop label in systemd service stop script when feature's set_owner is kube. Only drop label when feature's set_owner is local.
How to verify it
The label feature_enabled should be always true if the feature's set owner is kube.
#### Why I did it
src/sonic-utilities
```
* 51c7a43c - (HEAD -> master, origin/master, origin/HEAD) [show][muxcable] update `show mux config` to print out `soc_ipv6` as well (#2909) (6 hours ago) [Jing Zhang]
* fd497755 - [route_check][dualtor] Ignore vlan neighbor route miss (#2888) (18 hours ago) [Longxiang Lyu]
* 81c0ed4e - [show][muxcable] update `show mux tunnel-route` to check soc_ipv6 as well (33 hours ago) [Jing Zhang]
* 1ee73668 - [db_migrator] Migrate DNS configuratuion (#2893) (2 days ago) [ganglv]
* 553a3432 - [dualtor][route_check] filter out `soc_ipv6` (#2899) (2 days ago) [Jing Zhang]
```
#### How I did it
#### How to verify it
#### Description for the changelog
Why I did it
When do clean up container images, current code has two bugs need to be fixed. And some variables' name maybe cause confused, change the variables' name.
Work item tracking
Microsoft ADO (number only): 24502294
How I did it
We do clean up after tag latest successfully. But currently tag latest function only return 0 and 1, 0 means succeed and 1 means failed, when we get 1, we will retry, when we get 0, we will do clean up. Actually the code 0 includes another case we don't need to do clean up. The case is that when we are doing tag latest, the container image we want to tag maybe not running, so we can not tag latest and don't need to cleanup, we need to separate this case from 0, return -1 now.
When local mode(v1) -> kube mode(v2) happens, one problem is how to handle the local image, there are two cases. one case is that there was one kube v1 container dry-run(cause we don't relace the local if kube version = local version), we will remove the kube v1 image and tag the local version with ACR prefix and remove local v1 local tag. Another case is that there was no kube v1 container dry-run, we remove the local v1 image directly, cause the local v1 image should not be the last desire version.
About the docker_id variable, it may cause confused, it's actually docker image id, so rename the variable. About the two dicts and the list, rename them to be more readable.
How to verify it
Check tag latest and image clean up result.
Why I did it
During the upgrade process via k8s, the feature's systemd service will restart as well, all of the feature systemd service has restart number limit, and the limit number is too small, only three times. if fallback happens when upgrade, the start count will be 2, just once again, the systemd service will be down. So, need to bypass this. This restart function will be called when do local -> kube, kube -> kube, kube ->local, each time call this function, we indeed need to restart successfully, so do reset-failed every time we do restart.
When need to go back to local mode, we do systemd restart immediately without waiting the default restart interval time so that we can reduce the container down time.
Work item tracking
Microsoft ADO (number only):
24172368
How I did it
Before every restart for upgrade, do reset feature's restart number. The restart number will be reset to 0 to bypass the restart limit.
When need to go back to local mode, we do systemd restart immediately.
How to verify it
Feature's systemd service can be always restarted successfully during upgrade process via k8s.
#### Why I did it
src/sonic-platform-daemons
```
* d73808c - (HEAD -> master, origin/master, origin/HEAD) Added PCIe transaction check for all peripherals on the bus (#331) (9 hours ago) [Ashwin Srinivasan]
* 432602a - Update active application selected code in transceiver_info table aft… (#381) (13 hours ago) [Michael Wang - TW]
```
#### How I did it
#### How to verify it
#### Description for the changelog
#### Why I did it
src/sonic-swss
```
* c7e1308e - (HEAD -> master, origin/master, origin/HEAD) Remove redundant updateFabricPortState (#2850) (2 hours ago) [kenneth-arista]
```
#### How I did it
#### How to verify it
#### Description for the changelog
#### Why I did it
src/sonic-linux-kernel
```
* d070cae - (HEAD -> master, origin/master, origin/HEAD) arm64: dts: marvell: Add Nokia 7215-IXS-A1 board (#321) (34 hours ago) [Pavan-Nokia]
```
#### How I did it
#### How to verify it
#### Description for the changelog
#### Why I did it
src/sonic-platform-common
```
* 465f95e - (HEAD -> master, origin/master, origin/HEAD) Default implementation of under/over speed checks (#382) (9 hours ago) [spilkey-cisco]
```
#### How I did it
#### How to verify it
#### Description for the changelog
#### Why I did it
src/sonic-utilities
```
* 7ca31477 - (HEAD -> master, origin/master, origin/HEAD) [db_migrator] Set docker_routing_config_mode to the value obtained from minigraph parser (#2890) (10 hours ago) [Vaibhav Hemant Dixit]
```
#### How I did it
#### How to verify it
#### Description for the changelog
#### Why I did it
src/sonic-swss
```
* 776af62c - (HEAD -> master, origin/master, origin/HEAD) [CodeQL]: Use dependencies with relevant versions in azp template. (#2845) (4 hours ago) [Nazarii Hnydyn]
```
#### How I did it
#### How to verify it
#### Description for the changelog
#### Why I did it
After k8s upgrade a container, k8s can only know the container is running, don't know the service's status inside container. So we need a probe inside container, k8s will call the probe to check whether the container is really ready.
##### Work item tracking
- Microsoft ADO **(number only)**: 22453004
#### How I did it
Add a health check probe inside config engine container, the probe will check whether the start service exit normally or not if the start service exists and call the python script to do container self-related specific checks if the script is there. The python script should be implemented by feature owner if it's needed.
more details: [design doc](https://github.com/sonic-net/SONiC/blob/master/doc/kubernetes/health-check.md)
#### How to verify it
Check path /usr/bin/readiness_probe.sh inside container.
#### Which release branch to backport (provide reason below if selected)
- [ ] 201811
- [ ] 201911
- [ ] 202006
- [ ] 202012
- [ ] 202106
- [ ] 202111
- [x] 202205
- [x] 202211
#### Tested branch (Please provide the tested image version)
- [x] 20220531.28
* Add an ability to configure remote syslog servers
* Add an initial configuration for remote syslog
* Extend YANG module and add unit tests
#### Why I did it
Adding the following functionality to rsyslog feature:
- Configure remote syslog servers: protocol, filter, severity level
- Update global syslog configuration: severity level, message format
#### How I did it
added parameters to syslog server and global configuration.
#### How to verify it
create syslog server using CLI/adding to Redis-DB
verify server is added to file /etc/rsyslog.conf and server is functional.
#### Description for the changelog
extend rsyslog capabilities, added server and global configuration parameters.
#### Link to config_db schema for YANG module changes
https://github.com/iavraham/sonic-buildimage/blob/master/src/sonic-yang-models/yang-models/sonic-syslog.yang
- Why I did it
Implemented ssh configurations
- How I did it
Added ssh config table in configDB, once changed - hostcfgd will change the relevant OS files (sshd_config)
- How to verify it
Tests in sonic-host-services. Change relevant configs in configDB such as ports, and see sshd port was modified
*use lower case for IPv6 address as internal key and bfd session key. fixes#15764
Why I did it
*staticroutebfd uses the IPv6 address string as a key to create bfd session and cache the bfd sessions using it as a key.
When the IPv6 address string has uppercase letter in the static route nexthop list, the string with uppercase letter key is stored in the cache, but the BFD STATE_DB uses lowercase for IPv6 address, so when the staticroutebfd get the bfd state event, it cannot find the bfd session in its local cache because of the letter case.
#### Why I did it
We should not modify minigraph schema.
#### How I did it
Update minigraph.py and remove unit test.
#### How to verify it
Run sonic-config-engine unit test.
#### Why I did it
src/sonic-mgmt-common
```
* 341fd73 - (HEAD -> master, origin/master, origin/HEAD) Remove invalid db type definitions: ERROR_DB, USER_DB (#94) (3 days ago) [Sachin Holla]
```
#### How I did it
#### How to verify it
#### Description for the changelog
*What I did:
Enable BFD for Static Route for chassis-packet. This will trigger the use of the feature as defined in here: #13789
Signed-off-by: Abhishek Dosi <abdosi@microsoft.com>
fix static route uninstall issue when all nexthops are not reachable.
the feature was working but the bug was introduced when support dynamic bfd enable/disable. Added UT testcase to guard this.
#### Why I did it
src/sonic-dash-api/sonic-dash-api
```
* 3f728d1 - (HEAD -> master, origin/master, origin/HEAD) Update vnet_direct in route.proto (#4) (11 days ago) [Ze Gan]
```
#### How I did it
#### How to verify it
#### Description for the changelog
Why I did it
To reduce the container's dependency from host system
Work item tracking
Microsoft ADO (number only):
17713469
How I did it
Move the k8s container startup script to config engine container, other than mount it from host.
How to verify it
Check file path(/usr/share/sonic/scripts/container_startup.py) inside config engine container.
Signed-off-by: Yun Li <yunli1@microsoft.com>
Co-authored-by: Qi Luo <qiluo-msft@users.noreply.github.com>
Why I did it
Tacacs nss library uses popen to execute useradd and usermod commands. Popen executes using a shell (/bin/sh) which is passed the command string with "-c". This means that if untrusted user input is supplied, unexpected shell escapes can occur. In this case the username supplied can be untrusted user input when logging in via ssh or other methods when tacacs is enabled. Debian has very little limitation on usernames and as such characters such as quotes, braces, $, >, | etc are all allowed. Since the nss library is run by root, any shell escape will be ran as root.
In the current community version of tacacs nss library, the issue is mitigated by the fact that the useradd command is only ran if the user is found to exist on the tacacs server, so the bad username would have to already exists there which is unlikely. However, internally (at Dell) we had to modify this behavior to support other tacacs servers that do not allow authorization messages to verify user existence prior to a successful authentication. These servers include Cisco ISE and Aruba ClearPass. In order to support these tacacs+ servers, we have to create a temporary user immediately, which means this would be a much bigger issue.
I also plan to supply the patch to support ISE and ClearPass and as such, I would suggest taking this patch to remediate this issue first.
How I did it
Replace call to popen with fork/execl of the useradd/usermod binary directly.
How to verify it
Install patched version of libnss-tacplus and verify that tacacs+ user login still works as expected.
Why I did it
For the DASH scenario, the APP_DB will be optimized by protobuf message for less memory consumption.
How I did it
Download the Debian package of protobuf 3.21.12 and create a corresponding rule for building it.
Add a submodule of sonic-dash-api and generated its Debian package which includes C++ library and Python library
How to verify it
Check artifacts of Azp that the protobuf-related and dash-api deb packages should be generated.
Signed-off-by: Ze Gan <ganze718@gmail.com>
#### Why I did it
src/sonic-platform-common
```
* 10af810 - (HEAD -> master, origin/master, origin/HEAD) More prevention of fatal exception caused by VDM dictionary missing fields when a transceiver has just been pulled (#376) (5 hours ago) [snider-nokia]
```
#### How I did it
#### How to verify it
#### Description for the changelog
#### Why I did it
src/sonic-host-services
```
* bc08806 - (HEAD -> master, origin/master, origin/HEAD) Implemented ssh configurations (#32) (14 hours ago) [ycoheNvidia]
```
#### How I did it
#### How to verify it
#### Description for the changelog
* [sonic-pit] Add PIT(Platform Integration Test) feature, second part, add 6 test cases.
Signed-off-by: Li Hua <guizhao.lh@alibaba-inc.com>
* Add missing test case configuration and platform configuration.
Signed-off-by: Li Hua <guizhao.lh@alibaba-inc.com>
* Remove unsed comment, replace duplicated function with import from other moduls.
---------
Signed-off-by: Li Hua <guizhao.lh@alibaba-inc.com>
#### Why I did it
src/dhcpmon
```
* 824a144 - (HEAD -> master, origin/master, origin/HEAD) replace atoi with strtol (#6) (3 hours ago) [Mai Bui]
* 32c0c3f - Fix libswsscommon package installation for non-amd64 (#7) (6 hours ago) [Saikrishna Arcot]
```
#### How I did it
#### How to verify it
#### Description for the changelog
#### Why I did it
src/sonic-swss
```
* a67f684f - (HEAD -> master, origin/master, origin/HEAD) [hash]: Implement GH backend (#2598) (3 hours ago) [Nazarii Hnydyn]
```
#### How I did it
#### How to verify it
#### Description for the changelog
#### Why I did it
src/sonic-host-services
```
* eab4a9e - (HEAD -> master, origin/master, origin/HEAD) [hostcfgd][dns] Subscribe to DNS_NAMESERVER table to react to static DNS configuration changes. (#49) (2 days ago) [Oleksandr Ivantsiv]
```
#### How I did it
#### How to verify it
#### Description for the changelog
#### Why I did it
Avoid 'sscanf()' for number conversions. Its use can lead to undefined behavior, slow processing, and integer overflows. Instead prefer the 'strto*()' family of functions.
#### How I did it
replace sscanf with strtol
#### How to verify it
Manual test
- Why I did it
Add support for static DNS configuration. According to sonic-net/SONiC#1262 HLD.
- How I did it
Add a new resolv-config.service that is responsible for transferring configuration from Config DB into /etc/resolv.conf file that is consumed by various subsystems in Linux to resolve domain names into IP addresses.
- How to verify it
Run the image compilation. Each component related to the static DNS feature is covered with the unit tests.
Run sonic-mgmt tests. Static DNS feature will be covered with the system tests.
Install the image and run manual tests.
#### Why I did it
src/sonic-sairedis
```
* 14a863a - (HEAD -> master, origin/master, origin/HEAD) [warmboot] Add workaround for `INIT_VIEW` failure (#1252) (5 hours ago) [Jing Zhang]
* abb02a5 - [actions] Support Semgrep by Github Actions (#1254) (2 days ago) [Mai Bui]
```
#### How I did it
#### How to verify it
#### Description for the changelog
#### Why I did it
src/sonic-host-services
```
* 508d642 - (HEAD -> master, origin/master, origin/HEAD) [actions] Support Semgrep by Github Actions (#67) (31 hours ago) [Mai Bui]
```
#### How I did it
#### How to verify it
#### Description for the changelog
Why I did it
Graceful restart is a key event for bgpd, related log print is debug level. To change it to info level to get more visibilities when this kind of event is triggered.
Work item tracking
Microsoft ADO (13875291):
How I did it
To create patch file to change from debug level to info level.
How to verify it
To run PR test and capture the print.
- Why I did it
To fix hiredis compilation
- How I did it
Changed package version: 0.14.0-3~bpo9+1 -> 0.14.1-1
- How to verify it
make configure PLATFORM=mellanox
make target/sonic-mellanox.bin
Signed-off-by: Nazarii Hnydyn <nazariig@nvidia.com>
#### Why I did it
src/dhcprelay
```
* c36b8e3 - (HEAD -> master, origin/master, origin/HEAD) [actions] Support Semgrep by Github Actions (#39) (7 hours ago) [Mai Bui]
```
#### How I did it
#### How to verify it
#### Description for the changelog
#### Why I did it
src/linkmgrd
```
* 4bda49b - (HEAD -> master, origin/master, origin/HEAD) [actions] Support Semgrep by Github Actions (#210) (7 hours ago) [Mai Bui]
```
#### How I did it
#### How to verify it
#### Description for the changelog
#### Why I did it
src/sonic-dbsyncd
```
* e4ac906 - (HEAD -> master, origin/master, origin/HEAD) [actions] Support Semgrep by Github Actions (#59) (7 hours ago) [Mai Bui]
```
#### How I did it
#### How to verify it
#### Description for the changelog
#### Why I did it
src/sonic-mgmt-framework
```
* 4a2ff41 - (HEAD -> master, origin/master, origin/HEAD) [actions] Support Semgrep by Github Actions (#116) (5 hours ago) [Mai Bui]
```
#### How I did it
#### How to verify it
#### Description for the changelog
Why I did it
Update the definition of acl table type BMCDATA and BMCDATAV6 in minigraph parser.
Work item tracking
Microsoft ADO (number only): 24101023
How I did it
Update the definition of acl table type BMCDATA and BMCDATAV6 in minigraph parser.
How to verify it
Ran unittest to verify this update:
Added support data for fabric monitoring in CONFIG_DB
The CONFIG_DB now has the FABRIC_MONITOR|FABRIC_MONITOR_DATA table for default value for fabric port monitoring. An example output of getting this table is:
sonic-db-cli CONFIG_DB hgetall "FABRIC_MONITOR|FABRIC_MONITOR_DATA"
{'monErrThreshCrcCells': '1', 'monErrThreshRxCells': '61035156', 'monPollThreshIsolation': '1', 'monPollThreshRecovery': '8'}
The CONFIG_DB now also has a table for each fabric port for its isolate status.
An example output of getting this table is:
sonic-db-cli CONFIG_DB hgetall "FABRIC_PORT|Fabric20"
{'alias': 'Fabric20', 'isolateStatus': 'False', 'lanes': '20'}
#### Why I did it
src/sonic-swss
```
* 87e0b08 - (HEAD -> master, origin/master, origin/HEAD) [portsorch]: Enhancing SWSS OA logs to capture host_tx_ready change events (#2822) (11 hours ago) [mihirpat1]
* c7e52a0 - [subinterface]: Fix admin state handling. (#2806) (34 hours ago) [Nazarii Hnydyn]
* ebfda13 - [aclorch] Fix TODO: use SAI object API to query capabilities (#2743) (2 days ago) [Stepan Blyshchak]
```
#### How I did it
#### How to verify it
#### Description for the changelog
#### Why I did it
src/sonic-gnmi
```
* a600dc9 - (HEAD -> master, origin/master, origin/HEAD) Fix threading issues in Event Client (#121) (9 hours ago) [Zain Budhwani]
```
#### How I did it
#### How to verify it
#### Description for the changelog
#### Why I did it
src/sonic-swss-common
```
* 2320ddc - (HEAD -> master, origin/master, origin/HEAD) Add ZMQ port for orchagent (#795) (19 hours ago) [Hua Liu]
```
#### How I did it
#### How to verify it
#### Description for the changelog
* Re-add 127.0.0.1/8 when bringing down the interfaces
With #5353, 127.0.0.1/16 was added to the lo interface, and then
127.0.0.1/8 was removed. However, when bringing down the lo interface,
like during a config reload, 127.0.0.1/16 gets removed, but 127.0.0.1/8
isn't added back to the interface. This means that there's a period of
time where 127.0.0.1 is not available at all, and services that need to
connect to 127.0.01 (such as for redis DB) will fail.
To fix this, when going down, add 127.0.0.1/8. Add this address before
the existing configuration gets removed, so that 127.0.0.1 is available
at all times.
Note that running `ifdown lo` doesn't actually bring down the loopback
interface; the interface always stays "physically" up.
Signed-off-by: Saikrishna Arcot <sarcot@microsoft.com>
#### Why I did it
The asn 0 in BGP_MONITOR is invalid by YANG definition. However, the asn 0 in BGP_MONITOR is found in many devices.
It was introduced by minigraph where its value is set to 0.
To unblock Config Updater test, the short term fix is to accept the asn 0 in BGP_MONITOR.
We can revert this after NGS team make all the ASN change in minigraph.
##### Work item tracking
- Microsoft ADO **(24186140)**:
#### How I did it
Change the range
#### How to verify it
Unit test.
Why I did it
This is to add support for specifying custom retry counts for LACP sessions. This is to make warmboot easier on low-storage and low-memory platforms, by allowing more than 90 seconds of downtime.
How I did it
How to verify it
Tested manually with these cases:
Verify that changing the retry count using teamdctl PortChannel101 state item set runner.retry_count 5 takes effect
Verify that the retry count change actually affects when the LAG goes down by forcefully killing teamd on one side (i.e. setting the retry count to 5 causes the LAG to go down after 150 seconds)
Verify that the retry count gets reset to 3 after the LAG goes down for whatever reason
Verify that the retry count gets reset to 3 after some period of time (30 seconds * retry count)
Test cases are in sonic-net/sonic-mgmt#7961 and sonic-net/sonic-mgmt#8152.
Signed-off-by: Saikrishna Arcot <sarcot@microsoft.com>
What I did:
Workaround for the issue seen here : FRRouting/frr#13682
It seems there is timing issue where there are multiple recursive lookup needed to resolve nexthop of the route it's possible that it does not happen correctly causing route to remain in inactive state
Issue is seen on chassis-packet as there 2 level of recursive lookup needed for a given e-BGP learnt route
- Level1 to resolve e-BGP peer (connected route via bgp ) over Loopback4096 (i-BGP peering)
- Level 2 Loopback4096 over backend port-channels next-hops
For VOQ chassis there is no e-BGP peer (connected route via bgp ) resolution as route is added as Static route by orchagent over Ethernet-IB.
Also as part of this remove route-map policy from instance.conf.j2 as same is define in peer-group.j2.
Microsoft ADO: https://msazure.visualstudio.com/One/_workitems/edit/24198507
How I verify:
Functional Verification manually
Updated UT.
We will be adding sanity check in sonic-mgmt to make sure none of route are in inactive state.
Signed-off-by: Abhishek Dosi <abdosi@microsoft.com>
During build, lots of pip packages are getting installed through pip
install command.
This feature adds support for caching all the pip packages into local
cache path, so that subsequent build always loads from the cache.
When a package is referenced from the web through wget command,
it downloads the package for every build.
This feature caches all the packages that are being downloaded from the
web, so that subsequent build always loads the cache instead of from
web.
* [202205] Update SOC properties for DLR_INIT based pfcwd recovery (#15217)
Why I did it
Update soc properties for certain roles that need to use pfcwd dlr init based recovery mechanism
How to verify it
Updated the templates on a 7050cx3 dual tor and 7260 T1 which satisfies these conditions and validated pfcwd recovery which uses DLR_INIT based mechanism. Also validated that this mechanism is not used on 7050cx3 single tor with the updated templates
Signed-off-by: Neetha John <nejo@microsoft.com>
* AclInterface and Management Interfaces are parsed on finding first valid node for it.
Above logic works for multi-asic scenarios where ACL Interface and Management Interfaces are present in DPG order {Host, Asicx, Asicy} but not when DPG is in {Asicx, Asicy, Host} order.
* [static_route][staticroutebfd]fix an issue on deleting a non-bfd static route
Fix an issue for deleting a non-bfd static route also remove the staticroutebfd from critical_processes list and make it auto restart in the case of crash.
What I did:
Added change to add 'peerType' as element in NEIGH_STATE_TABLE.
'peerType' can be i-BGP vs e-BGP determined based on local and remote AS number.
Why I did:
This is useful to filter neighbors in SONiC as internal vs external in chassis use-case (example: telemetry)
Verification:
Manual Verification
127.0.0.1:6379[6]> hgetall "NEIGH_STATE_TABLE|10.0.0.5"
1) "state"
2) "Established"
3) "peerType"
4) "e-BGP"
127.0.0.1:6379[6]> hgetall "NEIGH_STATE_TABLE|2603:10e2:400::4"
1) "state"
2) "Established"
3) "peerType"
4) "i-BGP"
Also sonic-mgmt test case test_bgp_fact.py is enhanced: Enhanced bgp_fact to validate NEIGH_STATE_TABLE element 'peerType' sonic-mgmt#8462
Stop authorization after user being rejected by server.
#### Why I did it
Fix nss_tacplus bug: after user being rejected by one TACACS+ server, nss_tacplus will try with next TACACS+ server.
##### Work item tracking
- Microsoft ADO :15276692
#### How I did it
Check authorization result, stop authorization after user being rejected by server.
#### How to verify it
Pass all E2E test.
Create new UT: https://github.com/sonic-net/sonic-mgmt/pull/8345
#### Description for the changelog
Stop authorization after user being rejected by server.
#### Ensure to add label/tag for the feature raised. example - PR#2174 under sonic-utilities repo. where, Generic Config and Update feature has been labelled as GCU.
- Why I did it
If you enable feature and then disable it, System Ready status change to Not Ready
A disabled feature should not affect the system ready status.
- How I did it
During the disable flow of dhcp_relay, it entered the dnsrvs_name list, which caused the SYSTEM_STATE key to be set to DOWN. Right after that, the dhcp_relay service was removed from the full service list, however, but, when it was removed from the dnsrvs_name, there was no flow to reset the system state back to UP even though there was no more services in down state.
- How to verify it
root@qa-eth-vt01-2-3700v:/home/admin# config feature state dhcp_relay enabled
root@qa-eth-vt01-2-3700v:/home/admin# show system-health sysready-status
root@qa-eth-vt01-2-3700v:/home/admin# config feature state dhcp_relay disabled
root@qa-eth-vt01-2-3700v:/home/admin# show system-health sysready-status
Should see
System is ready
- Why I did it
interfaces-config service restarts networking service, which in-turn results in loopback interface address is being removed and reassigned back
If the system-health happens to start during that instance expections and logs like this are seen:
Apr 15 18:14:49.357869 r-panther-20 ERR healthd: update system status exception:Unable to connect to redis: Cannot assign requested address
Apr 15 18:14:49.429778 r-panther-20 ERR healthd: subscribe_statedb exited- Unable to connect to redis: Cannot assign requested address
Apr 15 18:14:52.218594 r-panther-20 ERR healthd: system_service_Map_base::at
Apr 15 18:14:52.219714 r-panther-20 ERR healthd: system_service_Map_base::at
Apr 15 18:14:55.218636 r-panther-20 ERR healthd: system_service_Map_base::at
Apr 15 18:14:55.218722 r-panther-20 ERR healthd: system_service_Map_base::at
- How I did it
use unix socket path
Signed-off-by: Vivek Reddy Karri <vkarri@nvidia.com>
Why I did it
Introduce a new valid neighbor element type to YANG.
Work item tracking
Microsoft ADO (number only): 23994521
How I did it
Add MgmtLeafRouter to element network type list.
How to verify it
Passes UTs
- Run pre-commit tox profile to trim all trailing blanks
- Use several commits with a per-folder based strategy
to ease their merge
Issue #15114
Signed-off-by: Guillaume Lambert <guillaume.lambert@orange.com>
Why I did it
We need to store information of power shelf in config_db for SONiC MX switch. Current minigraph parser cannot parse rack_mgmt_map field.
Work item tracking
Microsoft ADO (number only): 22179645
How I did it
Add support for parsing rack_mgmt_map.
What I did:
Updated Static Route Attribute in Minigraph. NGS Minigraph has define semantics of static route differently.
See below for differences:-
Microsoft ADO: 17956325
Before
<AssociatedTo>8.0.0.1/32</AssociatedTo>
<Address>192.168.1.2,192.168.2.2</Address>
<AttachTo>PortChannel40,PortChannel50</AttachTo>
Now:
<Address>8.0.0.1</Address>
<AttachTo>PortChannel40,192.168.1.2;PortChannel50,192.168.2.2</AttachTo>
Signed-off-by: Abhishek Dosi <abdosi@microsoft.com>
* Update PG headroom settings ports based on port speed/cable length
* Updated XOFF settings to use chip level numbers than core
* Updated PG headroom based on uplink/downlink side
* fix for sonic-config-gen tests
* More fixes for unit test cases
* more test fixes
* Merged multiple functions into one
Add new Nokia build target and establish an arm64 build:
Platform: arm64-nokia_ixs7215_52xb-r0
HwSKU: Nokia-7215-A1
ASIC: marvell
Port Config: 48x1G + 4x10G
How I did it
- Change make files for saiserver and syncd to use Bulleseye kernel
- Change Marvell SAI version to 1.11.0-1
- Add Prestera make files to build kernel, Flattened Device Tree blob and ramdisk for arm64 platforms
- Provide device and platform related files for new platform support (arm64-nokia_ixs7215_52xb-r0).
Following changes are done:
Added Support where if asic configuration is not present in minigraph sonic-cfggen do not error out but instead process it gracefully.
Use Case: In Supervisor we have number of asic are define as max possible but in minigraph configuration of only valid/available asics only are present. Without this change load_minigraph fails.
Microsoft ADO: 17956325
Signed-off-by: Abhishek Dosi <abdosi@microsoft.com>
Why I did it
Our k8s feature will pull new version container images for each upgrade, the container images inside sonic will be more and more, but for now we don’t have a way to clean up the old version container images, the disk may be filled up. Need to add cleaning up the old version container images logic.
Work item tracking
Microsoft ADO (number only):
17979809
How I did it
Remove the old version container images besides the feature's current version and last version image, last version image is saved for supporting fallback.
How to verify it
Check whether the old version images are removed
Why I did it
Fix#15000
isc-dhcp 4.4.1-2.3+deb11u1 is no longer available in debian repository
How I did it
update isc-dhcp to new version 4.4.1-2.3+deb11u2
Why I did it
replace yaml.load to yaml.safe_load because yaml.safe_load is more secure
Work item tracking
Microsoft ADO (number only): 15022050
How I did it
How to verify it
Verified in DUT 201911 which yaml version < 5.1
- Why I did it
Add fan direction check to system health, all fans should be in the same direction
- How I did it
Add fan direction check to system health, all fans should be in the same direction
- How to verify it
Manual test
Unit test
Added sonic-mgmt test case to verify
Why I did it
To support BGPMon sessions from each T2 linecard ASIC
Work item tracking
Microsoft ADO (number only): 17873174
How I did it
Added change in BGPMon configuration to use Loopback4096 as source interface, since this has a unique IP per ASIC.
How to verify it
Tested by manually setting up BGPMon session on T2 LC and verified that Loopback4096 could be used as source
What I did:
In FRR command update source <interface-name> is not at address-family level. Because of this
internal peer route-map for ipv6 were getting applied to ipv4 address family. As a result
TSA over iBGP for Ipv6 was not getting applied.
How I verify:
Manual Verification of TSA over both ipv4 and ipv6 after fix works fine.
Updated UT for this.
Added sonic-mgmt test gap: sonic-net/sonic-mgmt#8170
Signed-off-by: Abhishek Dosi <abdosi@microsoft.com>
There are chassis-packet and Single asic platforms which support this 400G to 100G/40G speed change via config.
Enabling this feature for all platforms which can support this. Keeping it enabled for all does not affect the platforms
which do not support this feature yet.
Signed-off-by: anamehra anamehra@cisco.com
Why I did it
Do not require leafref as part of yang. Only need string to compare whether string received from event matches what is possible for ifname.
How I did it
How to verify it
Run UT
Why I did it
Update ECN settings for T2 chassis
How I did it
Updated qos config file to load these settings during switch bootup
How to verify it
Verified on line card on T2 chassis
Why I did it
Support for SONIC chassis isolation using TSA and un-isolation using TSB from supervisor module
Work item tracking
Microsoft ADO (number only): 17826134
How I did it
When TSA is run on the supervisor, it triggers TSA on each of the linecards using the secure rexec infrastructure introduced in sonic-net/sonic-utilities#2701. User password is requested to allow secure login to linecards through ssh, before execution of TSA/TSB on the linecards
TSA of the chassis withdraws routes from all the external BGP neighbors on each linecard, in order to isolate the entire chassis. No route withdrawal is done from the internal BGP sessions between the linecards to prevent transient drops during internal route deletion. With these changes, complete isolation of a single linecard using TSA will not be possible (a separate CLI/script option will be introduced at a later time to achieve this)
Changes also include no-stats option with TSC for quick retrieval of the current system isolation state
This PR also reverts changes in #11403
How to verify it
These changes have a dependency on sonic-net/sonic-utilities#2701 for testing
Run TSA from supervisor module and ensure transition to Maintenance mode on each linecard
Verify that all routes are withdrawn from eBGP neighbors on all linecards
Run TSB from supervisor module and ensure transition to Normal mode on each linecard
Verify that all routes are re-advertised from eBGP neighbors on all linecards
Run TSC no-stats from supervisor and verify that just the system maintenance state is returned from all linecards
Why I did it
systemd stop event on service with timers can sometime delete the state_db entry for the corresponding service.
Note: This won't be observed on the latest master label since the dependency on timer was removed with the recent config reload enhancement. However, it is better to have the fix since there might be some systemd services added to system health daemon in the future which may contain timers
root@qa-eth-vt01-4-3700c:/home/admin# systemctl stop snmp
root@qa-eth-vt01-4-3700c:/home/admin# show system-health sysready-status
System is not ready - one or more services are not up
Service-Name Service-Status App-Ready-Status Down-Reason
---------------------- ---------------- ------------------ -------------
<Truncated>
ssh OK OK -
swss OK OK -
syncd OK OK -
sysstat OK OK -
teamd OK OK -
telemetry OK OK -
what-just-happened OK OK -
ztp OK OK -
<Truncated>
Expected
Should see a Down entry for SNMP instead of the entry being deleted from the STATE_DB
root@qa-eth-vt01-4-3700c:/home/admin# show system-health sysready-status
System is not ready - one or more services are not up
Service-Name Service-Status App-Ready-Status Down-Reason
---------------------- ---------------- ------------------ -------------
<Truncated>
snmp Down Down Inactive
ssh OK OK -
swss OK OK -
syncd OK OK -
sysstat OK OK -
teamd OK OK -
telemetry OK OK -
what-just-happened OK OK -
ztp OK OK -
<Truncated>
How I did it
Happens because the timer is usually a PartOf service and thus a stop on service is propagated to timer. Fixed the logic to handle this
Apr 18 02:06:47.711252 r-lionfish-16 DEBUG healthd: Main process- received event:snmp.service from source:sysbus time:2023-04-17 23:06:47
Apr 18 02:06:47.711347 r-lionfish-16 INFO healthd: check_unit_status for [ snmp.service ]
Apr 18 02:06:47.722363 r-lionfish-16 INFO healthd: snmp.service service state changed to [inactive/dead]
Apr 18 02:06:47.723230 r-lionfish-16 DEBUG healthd: Main process- received event:snmp.timer from source:sysbus time:2023-04-17 23:06:47
Apr 18 02:06:47.723328 r-lionfish-16 INFO healthd: check_unit_status for [ snmp.timer ]
Signed-off-by: Vivek Reddy Karri <vkarri@nvidia.com>
* Support ACL interface type BmcData in minigraph parser
* Support ACL interface type BmcData in minigraph parser
* add unittest
* Add a global dict for storing the defination of custom acl tables
- Why I did it
Extend the CRM YANG model with DASH attributes.
- How I did it
Add new attributes to the existing CRM YANG model.
Implement tests for DASH CRM attributes.
- How to verify it
Build sonic_yang_models packages. The tests will be run automatically.
Fix per-command authorization failed issue when a command with wildcard match more than hundred files.
#### Why I did it
When user enable TACACS per-command authorization, and run a command with wildcard , if the command match more than hundreds of files, the per-command authorization will failed with following message:
*** authorize failed by TACACS+ with given arguments, not executing
The root cause of this issue is because bash will match files with wildcard and replace with wildcard args with matched files. when there are too many files, TACACS plugin will generate a big authorization request, which will be reject by server side.
##### Work item tracking
- Microsoft ADO **(number only)**: 18074861
#### How I did it
Fix bash patch file, use original user inputs as authorization parameters.
#### How to verify it
Pass all UT.
Create new UT to validate the TACACS authorization request are using original command arguments.
UT PR: https://github.com/sonic-net/sonic-mgmt/pull/8115
#### Which release branch to backport (provide reason below if selected)
- [ ] 201811
- [ ] 201911
- [ ] 202006
- [ ] 202012
- [ ] 202106
- [ ] 202111
- [X] 202205
- [X] 202211
#### Tested branch (Please provide the tested image version)
- [x] 202205.258490-412b83d0f
- [x] 202211.71966120-1b971c54b5
#### Description for the changelog
Fix per-command authorization failed issue when a command with wildcard match more than hundred files.
#### Why I did it
sonic-config-engine unit test needs to detect missing yang models.
#### How I did it
Update unit test, return error for missing yang models.
#### How to verify it
Run unit test for sonic-config-engine
#### Why I did it
Yang model for NEIGH table was missing
Fixed https://github.com/sonic-net/sonic-buildimage/issues/13971
#### How I did it
added sonic-neigh.yang model
#### How to verify it
make buildimage
#### Description for the changelog
Adding NEIGH yang model
Signed-off-by: Stepan Blyschak stepanb@nvidia.com
DEPENDS: #12852
Why I did it
To support BGP pending FIB suppression.
How I did it
I backported patches from FRR 8.4 feature that allows communicating ASIC route status back to FRR.
Also, added a new field in DEVICE_METADATA YANG model table. Added UT for YANG model changes.
How to verify it
Run on the switch.
Open
[minigraph] add support for changing T1 ports speed from 400G to 100G and vice-versa
#14505
vdahiya12 wants to merge 9 commits into sonic-net:master from vdahiya12:dev/vdahiya/minigraph_parser
Conversation 10
Commits 9
Checks 18
Files changed 5
Conversation
vdahiya12
@vdahiya12 vdahiya12 commented 2 weeks ago •
On SONiC T1 cisco 8101 HwSku, the speed changes are done from 400G to 100G needs to be supported on 400G ports.
To enable this, along with speed change the port lanes need to be changed. This PR has the changes to update the port lanes when such speed change happens.
Basically if Banwidth in minigraph.xml intends to enable a 100G speed on a 400G port, then the appropriate lane change and speed change needs to be invoked in mingraph parser
Example if port_config.ini dicatates the speed to be 400G and minigraph has 100G speed, then this changeneeds to be accommodated
# name lanes alias index speed channel
Ethernet96 1536,1537,1538,1539,1540,1541,1542,1543 etp12 12 400000 0
<DeviceLinkBase>
<ElementType>DeviceInterfaceLink</ElementType>
<EndDevice>ARISTA01T2</EndDevice>
<EndPort>Ethernet1</EndPort>
<StartDevice>Device-8101-01</StartDevice>
<StartPort>etp12</StartPort>
<Bandwidth>100000</Bandwidth>
</DeviceLinkBase>
These platforms today have 400g port with 8 serdes lines, and 100g will operate with 4 serdes lane. When the port speed changes from 400G to 100G the first 4 lanes will be used for 100G port.
Signed-off-by: vaibhav-dahiya <vdahiya@microsoft.com>
* Fix backend port channels and routes being displayed
In `show interface portchannel` and `show ip route`, backend port
channels and routes were being displayed. This is due to changes in #13660.
Fix these issues by switching to reading from PORTCHANNEL_MEMBERS table
instead.
Fixes#14459.
* Replace table name with constant
Signed-off-by: Saikrishna Arcot <sarcot@microsoft.com>
#### Why I did it
Implementing code changes for https://github.com/sonic-net/SONiC/pull/1203
#### How I did it
Removed the timers and delayed target since the delayed services would start based on event driven approach.
Cleared port table during config reload and cold reboot scenario.
Modified yang model, init_cfg.json to change has_timer to delayed
#### How to verify it
Running regression
Why I did it
Optimize the version control for Debian packages.
Fix sonic-slave-buster/sources.list.amd64 not found display issue, need to generate the file before running the shell command to evaluate the sonic image tag.
When using the snapshot mirror, it is not necessary to update the version file based on the base image. It will reduce the version dependency issue, when an image is not run when freezing the version.
How I did it
Not to update the version file when snapshot mirror enabled.
How to verify it
Why I did it
refine reproducible build.
How I did it
Fix reset map variable in bash.
Ignore empty web file md5sum value.
If web file didn't backup in azure storage, use file on web.
How to verify i
- Why I did it
To solve an issue with upgrade with fast-reboot including FW upgrade which has been introduced since moving to fast-reboot over warm-reboot infrastructure.
As well, this introduces fast-reboot finalizing logic to determine fast-reboot is done.
- How I did it
Added logic to finalize-warmboot script to handle fast-reboot as well, this makes sense as using fast-reboot over warm-reboot this script will be invoked. The script will clear fast-reboot entry from state-db instead of previous implementation that relied on timer. The timer could expire in some scenarios between fast-reboot finished causing fallback to cold-reboot and possible crashes.
As well this PR updates all services/scripts reading fast-reboot state-db entry to look for the updated value representing fast-reboot is active.
- How to verify it
Run fast-reboot and check that fast-reboot entry exists in state-db right after startup and being cleared as warm-reboot is finalized and not due to a timer.
#### Why I did it
When removing port from LAG while traffic is running thorough LAG there is traffic disruption of 60 seconds.
Fix issue https://github.com/sonic-net/sonic-buildimage/issues/14381
#### How I did it
The patch I added introduces "port_removing" op and call it right before Kernel is asked to remove the port.
Implement the op in LACP runner to disable the port which leads to proper LACPDU send.
#### How to verify it
Set LAG between 2 switches.
Set LAGs to be router port and set ip address.
In switch A send ping to ip address of LAG in switch B.
In switch B, while ping is running remove port from LAG.
Verify ping is not stopping.
#### Why I did it
Fix issue: wrong teamd link watch state after warm reboot due to TEAM_ATTR_PORT_CHANGED lost
The flag TEAM_ATTR_PORT_CHANGED is maintained by kernel team driver:
- a flag "changed" is maintained in struct team_port struct
- the flag is set by __team_port_change_send once relevant information is updated, including port linkup (together with speed, duplex), adding or removing
- the flag is cleared by team_nl_fill_one_port_get once the updated information has been notified to user space via RTNL
In the userspace, the change flag is maintained by libteam in struct team_port.
The team daemon calls port_priv_change_handler_func on receiving port change event.
The logic in port_priv_change_handler_func
1. creates the port if it did not exist, which triggers port add event and eventually calls lacp_port_added callback.
2. triggers port change event if team_port->changed is true, which eventually calls lw_ethtool_event_watch_port_changed to update port state for link watch ethtool.
3. removes the port if team_port->removed is removed
In lacp_port_added, it calls team_refresh to refresh ifinfo, port info, and option info from the kernel via RTNL.
In this step, port_priv_change_handler_func is called recursively.
- In the inner call, it won't get TEAM_ATTR_PORT_CHANGED flag because kernel has already notified that.
- As a result, team_port->changed flag is cleared in the libteam.
- The port change event won't be triggered from either inner or outer call of port_priv_change_handler_func.
If the port has been up when the port is being added to the team device, the "port up" information is carried in the outer call but will be lost.
In case the flag TEAM_ATTR_PORT_CHANGED is set only in the inner call, function port_priv_change_handler_func can be called in the inner call.
However, it will fail to fetch "enable" options because option_list_init has not be called.
Signed-off-by: Stephen Sun <stephens@nvidia.com>
#### How I did it
Fix:
Do not call check_call_change_handlers when parsing RTNL function is called from another check_call_change_handlers recursively.
#### How to verify it
- Manually test
- Regression test
- warm reboot
- warm reboot sad lag
- warm reboot sad lag member
- warm reboot sad (partial)
#### Why I did it
Fixes#14277.
Fixes the inconsistent fallback behaviour for RADIUS authentication when AAA authentication is configured as "radius, local".
#### How I did it
Modified common-auth-sonic.j2 template to make sure that when no RADIUS servers are configured (with AAA authentication login method set to radius, local), the system falls back to local authentication successfully.
#### How to verify it
1. Configure authentication based on RADIUS and local.
config aaa authentication login radius local
2. Configure an unreachable RADIUS server.
config radius add 6.6.6.6
3. Try to login to switch with existing admin user credentials. This is successful.
4. Remove RADIUS server configuration.
config radius delete 6.6.6.6
5. Try to login to switch with admin user credentials. This is successful.
Based on the port breakout HLD, we are now using subport instead of channel in the CONFIG_DB PORT table to handle port breakout. The yang schema needs to be modified accordingly to handle the corresponding change.
The corresponding code changes have been merged through sonic-net/sonic-platform-daemons/pull/342 merged
Signed-off-by: Mihir Patel <patelmi@microsoft.com>
Why I did it
Support Egress Mirroring on supported Arista platforms
How I did it
Add necessary soc properties for egress mirroring recycle ports to be created
Signed-off-by: Nathan Wolfe <nwolfe@arista.com>
Change references to use bullseye instead of buster
Why I did it
Almost all daemons in 202211 and master uses bullseye, and NAT seems easy to migrate.
How I did it
Replaced the references, built with 202211 branch.
How to verify it
Not sure, it builds and tests pass as far as I can tell but I don't use the feature myself.
Signed-off-by: Christian Svensson <blue@cmd.nu>
Why I did it
We found a bug when pilot, the tag function doesn't remove the ACR domain when do tag, it makes the latest tag not work. And in the original tag function, it calls os.system and os.popen which are not recommend, need to refactor.
How I did it
Do a split("/") when get image_rep to fix the acr domain bug
Refactor the tag function code and add test cases
How to verify it
Check whether container images are tagged as latest when in kube mode.
Why I did it
832ef9c4 - Fix bug in GCU vlanintf_validator ([Bcm SAI] ugprade Broadcom SAI to version 3.3.5.4m-1 #2765) (5 minutes ago) [jingwenxie]
53f611b7 - Revert "Convert IPv6 addresses to lowercase in apply-patch (Add Pegatron project to branch 201807 #2299)" (Add note for running out of disk space in /var/lib/docker to README.md #2758) (20 hours ago) [jingwenxie]
79a21cef - Revert frr route check ([mlnx] fix url inconsistency in fw.mk #2761) (8 minutes ago) [StormLiangMS]
824680ed - Resolved rc!=0 problem by replacing fgrep with awk. Added ipv4 filtering to get only v4 peers in case of show ip bgp neighbors (Improve eeprom access reliability #2756) (30 hours ago) [saurabh17g]
10f31ea6 - Revert "Replace pickle by json (Add autoneg to 7170-Q59S20 #2636)" ([hostcfgd] Default value of fallthrough for authentication set to be False. #2746) (7 days ago) [Mai Bui]
05fa7513 - Fix the show interface counters throwing exception on device with no external interfaces ([docker-platform-monitor]: Add smartmontools 6.6-1 #2703) (11 days ago) [abdosi]
f27dea0c - [route_check] remove check-frr_patch mock ([minigraph]: Mark both ERSPAN and ERSPANv6 as mirror ACL tables #2732) (11 days ago) [Stepan Blyshchak]
2d95529d - Revert "Update load minigraph to load backend acl (mlnx msn2010: default config_db.json generation with sonic-cfggen is not working #2236)" (swss stretch update broke restore_neighbors.py for neigh service #2735) (12 days ago) [Neetha John]
c869c970 - (master) Update the ref guide to reflect the vlan brief output ([teamd] update teamd docker to stretch and fix teamd_init failure #2731) (2 weeks ago) [Vivek]
76457141 - Fix fast-reboot DB migration ([teamd]: update teamd docker to stretch #2734) (2 weeks ago) [Aryeh Feigin]
f7f783bc - Enhance the logic to wait for all buffer tables to be removed in _clear_qos ([sfputil] Not able to read out values of voltage/temp/power on some cables #2720) (2 weeks ago) [Stephen Sun]
e6179afa - Remove timer from FAST_REBOOT STATE_DB entry and use finalizer (Rollback kernel submodule update. #2621) (3 weeks ago) [Aryeh Feigin]
ff688323 - [route_check] fix IPv6 address handling ([docker pmon] install fancontrol & sensord #2722) (3 weeks ago) [Stepan Blyshchak]
7a604c51 - update fast-reboot ([201811][sairedis][swss] advance sub module head of sairedis and swss #2728) (3 weeks ago) [jhli-cisco]
9f83ace9 - [GCU] Add vlanintf-validator (Revert "[device/celestica] blacklist gpio_ich kernel module on haliburton" #2697) (3 weeks ago) [jingwenxie]
338d1c05 - Check SONiC dependencies before installation. ([sonic-slave]: Add iproute2 dependencies in stretch docker #2716) (3 weeks ago) [Liu Shilong]
64d2efd2 - Improve show acl commands ([sonic-utilities] update submodule #2667) (3 weeks ago) [bingwang-ms]
2ef5b31e - [GCU] Add PFC_WD RDMA validator ([sub module] advance sonic-utilities sub module for 201811 branch #2619) (3 weeks ago) [isabelmsft]
c7aa8416 - [show][muxcable] increase timeout for displaying HW_STATUS (Fixing get_transceiver_change_event #2712) (3 weeks ago) [vdahiya12]
2fc2b826 - YANG validation for ConfigDB Updates: MIRROR_SESSION use case ([mellanox] Update SDK to 4.3.0132 #2430) (3 weeks ago) [isabelmsft]
e16bdaae - Fix non-zero status exit on non secure boot system ([service] add warmboot finializer service #2715) (3 weeks ago) [kellyyeh]
90d70152 - [route_check] implement a check for FRR routes not marked offloaded (Feature to run an option platform specific script on the first boot #2531) (3 weeks ago) [Stepan Blyshchak]
c2bc150a - [warm/fast-reboot] Backup logs from tmpfs to disk during fast/warm shutdown ([swss]: update swss docker to stretch #2714) (3 weeks ago) [Vaibhav Hemant Dixit]
a015834d - [db_migrator] Add missing attribute 'weight' to route entries in APPL DB ([device/celestica] blacklist gpio_ich kernel module on seastone #2691) (4 weeks ago) [Vaibhav Hemant Dixit]
cd519aac - [ci] Fix pipeline issue caused by sonic-slave-* change. ([201803] Modify Debian apt repos to reflect changes made by maintainers #2709) (4 weeks ago) [Liu Shilong]
2680e6f3 - [dhcp_relay] Fix dhcp_relay restart error while add/del vlan ([thrift] add a patch to revert THRIFT-3650 #2688) (4 weeks ago) [Yaqiang Zhu]
How I did it
How to verify it
Why I did it
This PR is to update the check of IP_TYPE from sonic-acl.yang.
It's because if the ACL rule is added by loading a json file with acl-loader, there is no IP_TYPE for ACL rule. If such rule exists in ACL_RULE table, the GCU (generic config updater) refuses to update any ACL rules because the existing one is invalid.
This PR updates the yang model for ACL. If the IP_TYPE leaf doesn't exist, then we don't check the field.
How I did it
Accept the rule if IP_TYPE is absent.
How to verify it
The change is verified by UT.
Why I did it
We don't need to install rsync in every docker container if vcache is disabled.
How I did it
Install rsync in pre_run_buildinfo script only if vcache is enabled.
How to verify it
Why I did it
Change static route expiry timer max timeout value from 1800 to 172800.
To keep same value range as defined in sonic-restapi/sonic_api.yaml
How I did it
How to verify it
apply change to bgpcfd, restart bgp container see if the value take action.
Update sonic-py-common, add missing dependency to redis-dump-load.
#### Why I did it
The script sonic_db_dump_load.py in sonic-py-common is depends on redis-dump-load, however the dependency is missing.
#### How I did it
Add redis-dump-load dependency.
#### How to verify it
Pass all E2E test case.
#### Description for the changelog
Update sonic-py-common, add missing dependency to redis-dump-load.
Why I did it
For better accounting purposes, updating the ingress lossy traffic profile to use static threshold. This change is only intended for Th devices using RDMA-CENTRIC profiles
How I did it
Update the buffer templates for Th devices in RDMA-CENTRIC folder to use the correct threshold
How to verify it
Verified the changes manually on a Th device.
Existing unit tests render Th template from the RDMA-CENTRIC folder. Updated the expected output to use the correct threshold
- Why I did it
Update VxLAN yang model to include IPv6 source in VxLAN tunnel. The src_ip field can include both ipv4 as well as ipv6 address
- How I did it
Updated yang model.
- How to verify it
Added UT to verify
* Upgrade docker-sonic-vs and docker-syncd-vs to Bullseye
Signed-off-by: Saikrishna Arcot <sarcot@microsoft.com>
* iproute2: Force a new version and timestamp to be used for the package
There is an issue with Docker's overlay2 storage driver when not using
native diffs (and thus falling back to naive diff mode), which is the
case in the CI builds. The way the naive diff mode detects changes is by
comparing the file size and comparing the timestamps (specifically, I
believe it's the modification timestamp), and if there's a change there,
then it's considered a change that needs to be recorded as part of that
layer.
The problem is that with the code being added in the patch, the file
size remains the same, and the timestamp of binary files appear to be
the same timestamp as the changelog entry (likely for reproducible build
purposes). The file size remains the same likely due to extra padding
within the file introduced by relro. Because of this, Docker doesn't
detect this file has changed, and doesn't save the new file as part of
this layer.
To work around this, create a new changelog entry (with a new version as
well) with a new timestamp. This will result in the binary files having
a different timestamp, and thus will get saved by Docker as part of that
layer.
Signed-off-by: Saikrishna Arcot <sarcot@microsoft.com>
---------
Signed-off-by: Saikrishna Arcot <sarcot@microsoft.com>
Why I did it
SONiC currently does not identify 'EdgeZoneAggregator' neighbor. As a result, the buffer profile attached to those interfaces uses the default cable length which could cause ingress packet drops due to insufficient headroom. Hence, there is a need to update the buffer templates to identify such neighbors and assign the same cable length as used by the T1.
How I did it
Modified the buffer template to identify EdgeZoneAggregator as a neighbor device type and assign it the same cable length as a T1/leaf router.
How to verify it
Unit tests pass, and manually checked on a 7260 to see the changes take effect.
Signed-off-by: dojha <devojha@microsoft.com>
fa8b709 Handled the error case of negative age (#57)
990f5b0 Use github code scanning instead of LGTM (#55)
a7992c5 Install libyang for swss-common. (#50)
244fa86 Update README.md
Signed-off-by: Vivek Reddy Karri <vkarri@nvidia.com>
- Why I did it
Healthd check system status every 60 seconds. However, running checker may take several seconds. Say checker takes X seconds, healthd takes (60 + X) seconds to finish one iteration. This implementation makes sonic-mgmt test case not so stable because the value X is hard to predict and different among different platforms. This PR introduces an interval
compensation mechanism to healthd main loop.
- How I did it
Introduces an interval compensation mechanism to healthd main loop: healthd should wait (60 - X) seconds for next iteration
- How to verify it
Manual test
Unit test
Why I did it
Add interface-id in dhcpv6-relay yang model
How I did it
Add interface-id option and corresponding UT. Updated configuration.md
How to verify it
kellyyeh@kellyyeh:~/sonic-buildimage/src/sonic-yang-models$ pyang -Vf tree -p /usr/local/share/yang/modules/ietf ./yang-models/sonic-dhcpv6-relay.yang
Why I did it
Add 'channel' to the CONFIG_DB PORT table. This will be needed to support PORT breakout to multiple channel ports so that Xcvrd can understand which datapath or channel to initialize on the CMIS compliant optics
How I did it
Add 'channel' to the CONFIG_DB PORT table.
How to verify it
Added unit test for valid and invalid channel number
Channel 0 -> No breakout
Channel 1 to 8 -> Breakout channel 1,2, ..8
Signed-off-by: Prince George <prgeor@microsoft.com>
Update sonic-swss-common submodule pointer to include the following:
23df338 [ci] Continue on error when running test. (#757)
06ffb51 Define ACL_TABLE and ACL_RULE table in STATE_DB (#748)
1b369ab [ci] Fix apt-get install unable locate package issue. (#753)
619d4ec Improve unit test for go wrapper (#752)
Why I did it
After the renaming of the asic_port_name in port_config.ini file (PR: #13053 ), the asic_ifname in port_config.ini is changed from '-ASIC<asic_id>' to just port. Example: 'Eth0-ASIC0' to 'Eth0'.
However, with this change a config_db generated via config load_minigraph would cause the EVERFLOW and EVERFLOWV6 tables under ACL_TABLE to not have any of non-LAG front panel interfaces. This was causing the EVERFLOW suite to fail.
How I did it
In parse_asic_external_neigbhors in minigraph.py there was a check that the asic_name.lower() (like asic0) is present in the port_alias_asic_map. However with -ASIC removed from the asic_ifname, the port_alias_asic_map would not have the asic_name and thus any non-LAG neighbor would not be included.
Fix was the ignore the asic name change as the port_alias_asic_map is already only looking for ports in just the same asic as asic_name.
How to verify it
Execute "config load_minigraph" with the mingraph which is generated by sonic-mgmt gen-minigraph script. And confirm ono-lag interface are present in the Everfloe table in the config_dbs.
Signed-off-by: mlok <marty.lok@nokia.com>
reverted because the submodule update PR needs to be merged with the following PR
#14200 but the PR is not available due to some failures and having only sairedis PR will break fast-boot
Update sonic-swss-common submodule pointer to include the following:
565ad4b Fix common path issue (#751)
3352881 Prevent sonic-db-cli generate core dump (#749)
43cadec Add ProfileProvider class to support read profile config from PROFILE_DB. (#683)
8b09f90 Update path to sairedis tests (#747)
85f3776 Non recursive automake and Debian packaging changes (#700)
This is a reland of #13950, with the debug image build fix.