Why I did it
get_system_mac was returning 'None' mac for system without eeprom.
get_system_mac for marvell platform checks for mac in eeprom, profile.ini(hwsku file) and eth0. Check for valid mac returned by syseeprom was incorrect. Which was resulting in bypassing mac get from profile.ini and eth0.
How I did it
get_system_mac already has a logic to get first valid mac.
Removed null check for mac returned by eeprom.
Corrected the check for profile.ini file by checking if file exist.
How to verify it
Executed sonic-cfggen to check valid mac address is getting configured in config_db.json with/without profile.ini.
Signed-off-by: Pavan Naregundi <pnaregundi@marvell.com>
This submodule update needs to be manually done due to build changes
done in the sairedis submodule. Specifically, Debian build profiles are
now being used instead of dpkg build targets, and dbgsym packages are
being used instead of dbg packages. Because of this, there needs to be
changes on the sonic-buildimage side for this.
This submodule update brings in the following changes:
ce8f642 [vs] Use boost join to concatenate switch types in config (#1266)
d6055a2 [vslib]: Temporaily map DPU switch type to NVDA_MBF2H536C (#1259)
e1cdb4d [CodeQL]: Use dependencies with relevant versions in azp template. (#1262)
c08f9a2 [CI]: Fix collect log error in azp template. (#1260)
eed856c [CodeQL]: Fix syncd compilation in azp template. (#1261)
a3f1f1a Reland 'Make changes to building and packaging sairedis (#1116)' (#1194)
Signed-off-by: Saikrishna Arcot <sarcot@microsoft.com>
#### Why I did it
src/sonic-platform-daemons
```
* 76baca3 - (HEAD -> master, origin/master, origin/HEAD) Fixes for the issues uncovered by sonic-pcied unit tests (#389) (32 hours ago) [Ashwin Srinivasan]
```
#### How I did it
#### How to verify it
#### Description for the changelog
Why I did it
For route registry service, in order to block hijacked routes, IBGP session needs to be set up from BGP sentinel service to SONiC, and BGP sentinel service advertise the same route with higher local-preference and no export community. So that SONiC takes the route from BGP sentinel as the best path and does not advertise the route to EBGP peers.
In order to do that, new route-maps are needed. So this change adds a new set of templates, keeping BGPSentinel peers out of the other templates.
Work item tracking
Microsoft ADO (number only): 24451346
How I did it
Add sentinel_community in constants.yml, route from BGPSentinel do not match this community will be denied.
Add support to convert BGPSentinel related configuration in the BGPPeerPassive element of the minigraph to a new BGP_SENTINELS table in CONFIG_DB
Add a new set of "sentinels" templates to docker-fpm-frr
Add a new BGP peer manager to bgpcfgd, to add neighbors from the BGP_SENTINELS table using the "sentinels" templates
Add a test case for minigraph.py, making sure the BGPSentinel and BGPSentinelV6 elements create BGP_SENTINELS DB entry.
Add a set of test cases for the new sentinels templates in sonic-bgpcfgd tests.
Add sonic-bgp-sentinel.yang and a set of testcases for the yang file.
How to verify it
Testcases and UT newly added would pass.
Setup IPv4 and IPv6 BGPSentinel services in minigraph, and load minigraph, show CONFIG_DB and "show runningconfig bgp", configuration would be loaded successfully.
Using t1-lag topo and setup IBGP session from BGPSentinel to SONiC loopback address, IBGP session would up.
Advertise route from BGPSentinel to T1 with sentinel_community, higher local-preference and no-export communiyt. In T1, show bgp route, the result is "Not advertise to any EBGP peer".
Withdraw the route in BGPSentinel, in T1, route would advertise to EBGP peers.
Advertise route from T1 that does not match sentinel_community, in T1, would not see the route in show bgp route.
#### Why I did it
src/sonic-gnmi
```
* 610509b - (HEAD -> master, origin/master, origin/HEAD) Install necessary debs instead of entire artifact in azp (#137) (2 hours ago) [Zain Budhwani]
```
#### How I did it
#### How to verify it
#### Description for the changelog
#### Why I did it
src/sonic-swss
```
* cb1b3f40 - (HEAD -> master, origin/master, origin/HEAD) Remove system neighbor DEL operation in m_toSync if SET operation for (#2853) (7 hours ago) [Song Yuan]
```
#### How I did it
#### How to verify it
#### Description for the changelog
#### Why I did it
src/linkmgrd
```
* 6e5cfda - (HEAD -> master, origin/master, origin/HEAD) Change common_libs dependencies from buster to bullseye (#212) (2 days ago) [Ze Gan]
```
#### How I did it
#### How to verify it
#### Description for the changelog
#### Why I did it
event yang models for usage currently use int as type for usage leaf, needs to be of type decimal64
##### Work item tracking
- Microsoft ADO **(number only)**:17747466
#### How I did it
Update yang models and UT
#### How to verify it
UT
#### Why I did it
src/sonic-swss
```
* 5b27c209 - (HEAD -> master, origin/master, origin/HEAD) Refactor Orch class to separate recorder implementation (#2837) (8 hours ago) [Vivek]
```
#### How I did it
#### How to verify it
#### Description for the changelog
#### Why I did it
src/sonic-platform-daemons
```
* 94242c2 - (HEAD -> master, origin/master, origin/HEAD) Use vendor customizable fan speed threshold checks (#378) (3 hours ago) [spilkey-cisco]
* db6e340 - Fix index out of range in the error log of invalid media lane mask received (#386) (8 hours ago) [MichaelWangSmci]
```
#### How I did it
#### How to verify it
#### Description for the changelog
- Why I did it
Adjust PSU power threshold logic in system health.
- How I did it
Update the description message in PSU power threshold checking
power of PSU x (xx w) exceeds threshold (xx w) => System power exceeds xx threshold (xx w)
- How to verify it
Manual test and unit test
Why I did it
When sonic is managed by k8s, the sonic container is managed by k8s daemonset, daemonset identifies its members by labels. Currently when restarting a sonic service by systemctl, if the service's container is already managed by k8s, systemd script stops the container by removing the feature label to make it disjoin from k8s daemonset, and then starts it by adding the label to make it join k8s daemonset again.
This behavior would cause problem during k8s container upgrade. Containers in daemonset are upgraded in a rolling fashion, that means the daemonset version is updated first, then rollout the new version to containers with precheck/postcheck one by one. However, if a sonic device joins a daemonset, k8s will directly deploy a pod with the current version of daemonset, it is expected when a device joins k8s cluster at first time.
But for a device which has already joined k8s cluster, the re-joining daemonset will cause the container upgraded to new version without precheck, so if a systemd service is restarted during daemonset upgrade, the container may be upgraded without precheck and break rolling update policy. To fix it, we need to remove the logic about dropping k8s label in systemd service stop script for kube mode.
Work item tracking
Microsoft ADO (number only): 24304563
How I did it
Don't drop label in systemd service stop script when feature's set_owner is kube. Only drop label when feature's set_owner is local.
How to verify it
The label feature_enabled should be always true if the feature's set owner is kube.
#### Why I did it
src/sonic-utilities
```
* 51c7a43c - (HEAD -> master, origin/master, origin/HEAD) [show][muxcable] update `show mux config` to print out `soc_ipv6` as well (#2909) (6 hours ago) [Jing Zhang]
* fd497755 - [route_check][dualtor] Ignore vlan neighbor route miss (#2888) (18 hours ago) [Longxiang Lyu]
* 81c0ed4e - [show][muxcable] update `show mux tunnel-route` to check soc_ipv6 as well (33 hours ago) [Jing Zhang]
* 1ee73668 - [db_migrator] Migrate DNS configuratuion (#2893) (2 days ago) [ganglv]
* 553a3432 - [dualtor][route_check] filter out `soc_ipv6` (#2899) (2 days ago) [Jing Zhang]
```
#### How I did it
#### How to verify it
#### Description for the changelog
Why I did it
When do clean up container images, current code has two bugs need to be fixed. And some variables' name maybe cause confused, change the variables' name.
Work item tracking
Microsoft ADO (number only): 24502294
How I did it
We do clean up after tag latest successfully. But currently tag latest function only return 0 and 1, 0 means succeed and 1 means failed, when we get 1, we will retry, when we get 0, we will do clean up. Actually the code 0 includes another case we don't need to do clean up. The case is that when we are doing tag latest, the container image we want to tag maybe not running, so we can not tag latest and don't need to cleanup, we need to separate this case from 0, return -1 now.
When local mode(v1) -> kube mode(v2) happens, one problem is how to handle the local image, there are two cases. one case is that there was one kube v1 container dry-run(cause we don't relace the local if kube version = local version), we will remove the kube v1 image and tag the local version with ACR prefix and remove local v1 local tag. Another case is that there was no kube v1 container dry-run, we remove the local v1 image directly, cause the local v1 image should not be the last desire version.
About the docker_id variable, it may cause confused, it's actually docker image id, so rename the variable. About the two dicts and the list, rename them to be more readable.
How to verify it
Check tag latest and image clean up result.
Why I did it
During the upgrade process via k8s, the feature's systemd service will restart as well, all of the feature systemd service has restart number limit, and the limit number is too small, only three times. if fallback happens when upgrade, the start count will be 2, just once again, the systemd service will be down. So, need to bypass this. This restart function will be called when do local -> kube, kube -> kube, kube ->local, each time call this function, we indeed need to restart successfully, so do reset-failed every time we do restart.
When need to go back to local mode, we do systemd restart immediately without waiting the default restart interval time so that we can reduce the container down time.
Work item tracking
Microsoft ADO (number only):
24172368
How I did it
Before every restart for upgrade, do reset feature's restart number. The restart number will be reset to 0 to bypass the restart limit.
When need to go back to local mode, we do systemd restart immediately.
How to verify it
Feature's systemd service can be always restarted successfully during upgrade process via k8s.
#### Why I did it
src/sonic-platform-daemons
```
* d73808c - (HEAD -> master, origin/master, origin/HEAD) Added PCIe transaction check for all peripherals on the bus (#331) (9 hours ago) [Ashwin Srinivasan]
* 432602a - Update active application selected code in transceiver_info table aft… (#381) (13 hours ago) [Michael Wang - TW]
```
#### How I did it
#### How to verify it
#### Description for the changelog
#### Why I did it
src/sonic-swss
```
* c7e1308e - (HEAD -> master, origin/master, origin/HEAD) Remove redundant updateFabricPortState (#2850) (2 hours ago) [kenneth-arista]
```
#### How I did it
#### How to verify it
#### Description for the changelog
#### Why I did it
src/sonic-linux-kernel
```
* d070cae - (HEAD -> master, origin/master, origin/HEAD) arm64: dts: marvell: Add Nokia 7215-IXS-A1 board (#321) (34 hours ago) [Pavan-Nokia]
```
#### How I did it
#### How to verify it
#### Description for the changelog
#### Why I did it
src/sonic-platform-common
```
* 465f95e - (HEAD -> master, origin/master, origin/HEAD) Default implementation of under/over speed checks (#382) (9 hours ago) [spilkey-cisco]
```
#### How I did it
#### How to verify it
#### Description for the changelog
#### Why I did it
src/sonic-utilities
```
* 7ca31477 - (HEAD -> master, origin/master, origin/HEAD) [db_migrator] Set docker_routing_config_mode to the value obtained from minigraph parser (#2890) (10 hours ago) [Vaibhav Hemant Dixit]
```
#### How I did it
#### How to verify it
#### Description for the changelog
#### Why I did it
src/sonic-swss
```
* 776af62c - (HEAD -> master, origin/master, origin/HEAD) [CodeQL]: Use dependencies with relevant versions in azp template. (#2845) (4 hours ago) [Nazarii Hnydyn]
```
#### How I did it
#### How to verify it
#### Description for the changelog
#### Why I did it
After k8s upgrade a container, k8s can only know the container is running, don't know the service's status inside container. So we need a probe inside container, k8s will call the probe to check whether the container is really ready.
##### Work item tracking
- Microsoft ADO **(number only)**: 22453004
#### How I did it
Add a health check probe inside config engine container, the probe will check whether the start service exit normally or not if the start service exists and call the python script to do container self-related specific checks if the script is there. The python script should be implemented by feature owner if it's needed.
more details: [design doc](https://github.com/sonic-net/SONiC/blob/master/doc/kubernetes/health-check.md)
#### How to verify it
Check path /usr/bin/readiness_probe.sh inside container.
#### Which release branch to backport (provide reason below if selected)
- [ ] 201811
- [ ] 201911
- [ ] 202006
- [ ] 202012
- [ ] 202106
- [ ] 202111
- [x] 202205
- [x] 202211
#### Tested branch (Please provide the tested image version)
- [x] 20220531.28
* Add an ability to configure remote syslog servers
* Add an initial configuration for remote syslog
* Extend YANG module and add unit tests
#### Why I did it
Adding the following functionality to rsyslog feature:
- Configure remote syslog servers: protocol, filter, severity level
- Update global syslog configuration: severity level, message format
#### How I did it
added parameters to syslog server and global configuration.
#### How to verify it
create syslog server using CLI/adding to Redis-DB
verify server is added to file /etc/rsyslog.conf and server is functional.
#### Description for the changelog
extend rsyslog capabilities, added server and global configuration parameters.
#### Link to config_db schema for YANG module changes
https://github.com/iavraham/sonic-buildimage/blob/master/src/sonic-yang-models/yang-models/sonic-syslog.yang
- Why I did it
Implemented ssh configurations
- How I did it
Added ssh config table in configDB, once changed - hostcfgd will change the relevant OS files (sshd_config)
- How to verify it
Tests in sonic-host-services. Change relevant configs in configDB such as ports, and see sshd port was modified
*use lower case for IPv6 address as internal key and bfd session key. fixes#15764
Why I did it
*staticroutebfd uses the IPv6 address string as a key to create bfd session and cache the bfd sessions using it as a key.
When the IPv6 address string has uppercase letter in the static route nexthop list, the string with uppercase letter key is stored in the cache, but the BFD STATE_DB uses lowercase for IPv6 address, so when the staticroutebfd get the bfd state event, it cannot find the bfd session in its local cache because of the letter case.
#### Why I did it
We should not modify minigraph schema.
#### How I did it
Update minigraph.py and remove unit test.
#### How to verify it
Run sonic-config-engine unit test.
#### Why I did it
src/sonic-mgmt-common
```
* 341fd73 - (HEAD -> master, origin/master, origin/HEAD) Remove invalid db type definitions: ERROR_DB, USER_DB (#94) (3 days ago) [Sachin Holla]
```
#### How I did it
#### How to verify it
#### Description for the changelog
*What I did:
Enable BFD for Static Route for chassis-packet. This will trigger the use of the feature as defined in here: #13789
Signed-off-by: Abhishek Dosi <abdosi@microsoft.com>
fix static route uninstall issue when all nexthops are not reachable.
the feature was working but the bug was introduced when support dynamic bfd enable/disable. Added UT testcase to guard this.
#### Why I did it
src/sonic-dash-api/sonic-dash-api
```
* 3f728d1 - (HEAD -> master, origin/master, origin/HEAD) Update vnet_direct in route.proto (#4) (11 days ago) [Ze Gan]
```
#### How I did it
#### How to verify it
#### Description for the changelog
Why I did it
To reduce the container's dependency from host system
Work item tracking
Microsoft ADO (number only):
17713469
How I did it
Move the k8s container startup script to config engine container, other than mount it from host.
How to verify it
Check file path(/usr/share/sonic/scripts/container_startup.py) inside config engine container.
Signed-off-by: Yun Li <yunli1@microsoft.com>
Co-authored-by: Qi Luo <qiluo-msft@users.noreply.github.com>
Why I did it
Tacacs nss library uses popen to execute useradd and usermod commands. Popen executes using a shell (/bin/sh) which is passed the command string with "-c". This means that if untrusted user input is supplied, unexpected shell escapes can occur. In this case the username supplied can be untrusted user input when logging in via ssh or other methods when tacacs is enabled. Debian has very little limitation on usernames and as such characters such as quotes, braces, $, >, | etc are all allowed. Since the nss library is run by root, any shell escape will be ran as root.
In the current community version of tacacs nss library, the issue is mitigated by the fact that the useradd command is only ran if the user is found to exist on the tacacs server, so the bad username would have to already exists there which is unlikely. However, internally (at Dell) we had to modify this behavior to support other tacacs servers that do not allow authorization messages to verify user existence prior to a successful authentication. These servers include Cisco ISE and Aruba ClearPass. In order to support these tacacs+ servers, we have to create a temporary user immediately, which means this would be a much bigger issue.
I also plan to supply the patch to support ISE and ClearPass and as such, I would suggest taking this patch to remediate this issue first.
How I did it
Replace call to popen with fork/execl of the useradd/usermod binary directly.
How to verify it
Install patched version of libnss-tacplus and verify that tacacs+ user login still works as expected.
Why I did it
For the DASH scenario, the APP_DB will be optimized by protobuf message for less memory consumption.
How I did it
Download the Debian package of protobuf 3.21.12 and create a corresponding rule for building it.
Add a submodule of sonic-dash-api and generated its Debian package which includes C++ library and Python library
How to verify it
Check artifacts of Azp that the protobuf-related and dash-api deb packages should be generated.
Signed-off-by: Ze Gan <ganze718@gmail.com>
#### Why I did it
src/sonic-platform-common
```
* 10af810 - (HEAD -> master, origin/master, origin/HEAD) More prevention of fatal exception caused by VDM dictionary missing fields when a transceiver has just been pulled (#376) (5 hours ago) [snider-nokia]
```
#### How I did it
#### How to verify it
#### Description for the changelog
#### Why I did it
src/sonic-host-services
```
* bc08806 - (HEAD -> master, origin/master, origin/HEAD) Implemented ssh configurations (#32) (14 hours ago) [ycoheNvidia]
```
#### How I did it
#### How to verify it
#### Description for the changelog
* [sonic-pit] Add PIT(Platform Integration Test) feature, second part, add 6 test cases.
Signed-off-by: Li Hua <guizhao.lh@alibaba-inc.com>
* Add missing test case configuration and platform configuration.
Signed-off-by: Li Hua <guizhao.lh@alibaba-inc.com>
* Remove unsed comment, replace duplicated function with import from other moduls.
---------
Signed-off-by: Li Hua <guizhao.lh@alibaba-inc.com>
#### Why I did it
src/dhcpmon
```
* 824a144 - (HEAD -> master, origin/master, origin/HEAD) replace atoi with strtol (#6) (3 hours ago) [Mai Bui]
* 32c0c3f - Fix libswsscommon package installation for non-amd64 (#7) (6 hours ago) [Saikrishna Arcot]
```
#### How I did it
#### How to verify it
#### Description for the changelog
#### Why I did it
src/sonic-swss
```
* a67f684f - (HEAD -> master, origin/master, origin/HEAD) [hash]: Implement GH backend (#2598) (3 hours ago) [Nazarii Hnydyn]
```
#### How I did it
#### How to verify it
#### Description for the changelog
#### Why I did it
src/sonic-host-services
```
* eab4a9e - (HEAD -> master, origin/master, origin/HEAD) [hostcfgd][dns] Subscribe to DNS_NAMESERVER table to react to static DNS configuration changes. (#49) (2 days ago) [Oleksandr Ivantsiv]
```
#### How I did it
#### How to verify it
#### Description for the changelog
#### Why I did it
Avoid 'sscanf()' for number conversions. Its use can lead to undefined behavior, slow processing, and integer overflows. Instead prefer the 'strto*()' family of functions.
#### How I did it
replace sscanf with strtol
#### How to verify it
Manual test
- Why I did it
Add support for static DNS configuration. According to sonic-net/SONiC#1262 HLD.
- How I did it
Add a new resolv-config.service that is responsible for transferring configuration from Config DB into /etc/resolv.conf file that is consumed by various subsystems in Linux to resolve domain names into IP addresses.
- How to verify it
Run the image compilation. Each component related to the static DNS feature is covered with the unit tests.
Run sonic-mgmt tests. Static DNS feature will be covered with the system tests.
Install the image and run manual tests.
#### Why I did it
src/sonic-sairedis
```
* 14a863a - (HEAD -> master, origin/master, origin/HEAD) [warmboot] Add workaround for `INIT_VIEW` failure (#1252) (5 hours ago) [Jing Zhang]
* abb02a5 - [actions] Support Semgrep by Github Actions (#1254) (2 days ago) [Mai Bui]
```
#### How I did it
#### How to verify it
#### Description for the changelog
#### Why I did it
src/sonic-host-services
```
* 508d642 - (HEAD -> master, origin/master, origin/HEAD) [actions] Support Semgrep by Github Actions (#67) (31 hours ago) [Mai Bui]
```
#### How I did it
#### How to verify it
#### Description for the changelog
Why I did it
Graceful restart is a key event for bgpd, related log print is debug level. To change it to info level to get more visibilities when this kind of event is triggered.
Work item tracking
Microsoft ADO (13875291):
How I did it
To create patch file to change from debug level to info level.
How to verify it
To run PR test and capture the print.
- Why I did it
To fix hiredis compilation
- How I did it
Changed package version: 0.14.0-3~bpo9+1 -> 0.14.1-1
- How to verify it
make configure PLATFORM=mellanox
make target/sonic-mellanox.bin
Signed-off-by: Nazarii Hnydyn <nazariig@nvidia.com>
#### Why I did it
src/dhcprelay
```
* c36b8e3 - (HEAD -> master, origin/master, origin/HEAD) [actions] Support Semgrep by Github Actions (#39) (7 hours ago) [Mai Bui]
```
#### How I did it
#### How to verify it
#### Description for the changelog
#### Why I did it
src/linkmgrd
```
* 4bda49b - (HEAD -> master, origin/master, origin/HEAD) [actions] Support Semgrep by Github Actions (#210) (7 hours ago) [Mai Bui]
```
#### How I did it
#### How to verify it
#### Description for the changelog
#### Why I did it
src/sonic-dbsyncd
```
* e4ac906 - (HEAD -> master, origin/master, origin/HEAD) [actions] Support Semgrep by Github Actions (#59) (7 hours ago) [Mai Bui]
```
#### How I did it
#### How to verify it
#### Description for the changelog
#### Why I did it
src/sonic-mgmt-framework
```
* 4a2ff41 - (HEAD -> master, origin/master, origin/HEAD) [actions] Support Semgrep by Github Actions (#116) (5 hours ago) [Mai Bui]
```
#### How I did it
#### How to verify it
#### Description for the changelog
Why I did it
Update the definition of acl table type BMCDATA and BMCDATAV6 in minigraph parser.
Work item tracking
Microsoft ADO (number only): 24101023
How I did it
Update the definition of acl table type BMCDATA and BMCDATAV6 in minigraph parser.
How to verify it
Ran unittest to verify this update:
Added support data for fabric monitoring in CONFIG_DB
The CONFIG_DB now has the FABRIC_MONITOR|FABRIC_MONITOR_DATA table for default value for fabric port monitoring. An example output of getting this table is:
sonic-db-cli CONFIG_DB hgetall "FABRIC_MONITOR|FABRIC_MONITOR_DATA"
{'monErrThreshCrcCells': '1', 'monErrThreshRxCells': '61035156', 'monPollThreshIsolation': '1', 'monPollThreshRecovery': '8'}
The CONFIG_DB now also has a table for each fabric port for its isolate status.
An example output of getting this table is:
sonic-db-cli CONFIG_DB hgetall "FABRIC_PORT|Fabric20"
{'alias': 'Fabric20', 'isolateStatus': 'False', 'lanes': '20'}
#### Why I did it
src/sonic-swss
```
* 87e0b08 - (HEAD -> master, origin/master, origin/HEAD) [portsorch]: Enhancing SWSS OA logs to capture host_tx_ready change events (#2822) (11 hours ago) [mihirpat1]
* c7e52a0 - [subinterface]: Fix admin state handling. (#2806) (34 hours ago) [Nazarii Hnydyn]
* ebfda13 - [aclorch] Fix TODO: use SAI object API to query capabilities (#2743) (2 days ago) [Stepan Blyshchak]
```
#### How I did it
#### How to verify it
#### Description for the changelog
#### Why I did it
src/sonic-gnmi
```
* a600dc9 - (HEAD -> master, origin/master, origin/HEAD) Fix threading issues in Event Client (#121) (9 hours ago) [Zain Budhwani]
```
#### How I did it
#### How to verify it
#### Description for the changelog
#### Why I did it
src/sonic-swss-common
```
* 2320ddc - (HEAD -> master, origin/master, origin/HEAD) Add ZMQ port for orchagent (#795) (19 hours ago) [Hua Liu]
```
#### How I did it
#### How to verify it
#### Description for the changelog
* Re-add 127.0.0.1/8 when bringing down the interfaces
With #5353, 127.0.0.1/16 was added to the lo interface, and then
127.0.0.1/8 was removed. However, when bringing down the lo interface,
like during a config reload, 127.0.0.1/16 gets removed, but 127.0.0.1/8
isn't added back to the interface. This means that there's a period of
time where 127.0.0.1 is not available at all, and services that need to
connect to 127.0.01 (such as for redis DB) will fail.
To fix this, when going down, add 127.0.0.1/8. Add this address before
the existing configuration gets removed, so that 127.0.0.1 is available
at all times.
Note that running `ifdown lo` doesn't actually bring down the loopback
interface; the interface always stays "physically" up.
Signed-off-by: Saikrishna Arcot <sarcot@microsoft.com>
#### Why I did it
The asn 0 in BGP_MONITOR is invalid by YANG definition. However, the asn 0 in BGP_MONITOR is found in many devices.
It was introduced by minigraph where its value is set to 0.
To unblock Config Updater test, the short term fix is to accept the asn 0 in BGP_MONITOR.
We can revert this after NGS team make all the ASN change in minigraph.
##### Work item tracking
- Microsoft ADO **(24186140)**:
#### How I did it
Change the range
#### How to verify it
Unit test.
Why I did it
This is to add support for specifying custom retry counts for LACP sessions. This is to make warmboot easier on low-storage and low-memory platforms, by allowing more than 90 seconds of downtime.
How I did it
How to verify it
Tested manually with these cases:
Verify that changing the retry count using teamdctl PortChannel101 state item set runner.retry_count 5 takes effect
Verify that the retry count change actually affects when the LAG goes down by forcefully killing teamd on one side (i.e. setting the retry count to 5 causes the LAG to go down after 150 seconds)
Verify that the retry count gets reset to 3 after the LAG goes down for whatever reason
Verify that the retry count gets reset to 3 after some period of time (30 seconds * retry count)
Test cases are in sonic-net/sonic-mgmt#7961 and sonic-net/sonic-mgmt#8152.
Signed-off-by: Saikrishna Arcot <sarcot@microsoft.com>
What I did:
Workaround for the issue seen here : FRRouting/frr#13682
It seems there is timing issue where there are multiple recursive lookup needed to resolve nexthop of the route it's possible that it does not happen correctly causing route to remain in inactive state
Issue is seen on chassis-packet as there 2 level of recursive lookup needed for a given e-BGP learnt route
- Level1 to resolve e-BGP peer (connected route via bgp ) over Loopback4096 (i-BGP peering)
- Level 2 Loopback4096 over backend port-channels next-hops
For VOQ chassis there is no e-BGP peer (connected route via bgp ) resolution as route is added as Static route by orchagent over Ethernet-IB.
Also as part of this remove route-map policy from instance.conf.j2 as same is define in peer-group.j2.
Microsoft ADO: https://msazure.visualstudio.com/One/_workitems/edit/24198507
How I verify:
Functional Verification manually
Updated UT.
We will be adding sanity check in sonic-mgmt to make sure none of route are in inactive state.
Signed-off-by: Abhishek Dosi <abdosi@microsoft.com>
During build, lots of pip packages are getting installed through pip
install command.
This feature adds support for caching all the pip packages into local
cache path, so that subsequent build always loads from the cache.
When a package is referenced from the web through wget command,
it downloads the package for every build.
This feature caches all the packages that are being downloaded from the
web, so that subsequent build always loads the cache instead of from
web.
* [202205] Update SOC properties for DLR_INIT based pfcwd recovery (#15217)
Why I did it
Update soc properties for certain roles that need to use pfcwd dlr init based recovery mechanism
How to verify it
Updated the templates on a 7050cx3 dual tor and 7260 T1 which satisfies these conditions and validated pfcwd recovery which uses DLR_INIT based mechanism. Also validated that this mechanism is not used on 7050cx3 single tor with the updated templates
Signed-off-by: Neetha John <nejo@microsoft.com>
* AclInterface and Management Interfaces are parsed on finding first valid node for it.
Above logic works for multi-asic scenarios where ACL Interface and Management Interfaces are present in DPG order {Host, Asicx, Asicy} but not when DPG is in {Asicx, Asicy, Host} order.
* [static_route][staticroutebfd]fix an issue on deleting a non-bfd static route
Fix an issue for deleting a non-bfd static route also remove the staticroutebfd from critical_processes list and make it auto restart in the case of crash.
What I did:
Added change to add 'peerType' as element in NEIGH_STATE_TABLE.
'peerType' can be i-BGP vs e-BGP determined based on local and remote AS number.
Why I did:
This is useful to filter neighbors in SONiC as internal vs external in chassis use-case (example: telemetry)
Verification:
Manual Verification
127.0.0.1:6379[6]> hgetall "NEIGH_STATE_TABLE|10.0.0.5"
1) "state"
2) "Established"
3) "peerType"
4) "e-BGP"
127.0.0.1:6379[6]> hgetall "NEIGH_STATE_TABLE|2603:10e2:400::4"
1) "state"
2) "Established"
3) "peerType"
4) "i-BGP"
Also sonic-mgmt test case test_bgp_fact.py is enhanced: Enhanced bgp_fact to validate NEIGH_STATE_TABLE element 'peerType' sonic-mgmt#8462
Stop authorization after user being rejected by server.
#### Why I did it
Fix nss_tacplus bug: after user being rejected by one TACACS+ server, nss_tacplus will try with next TACACS+ server.
##### Work item tracking
- Microsoft ADO :15276692
#### How I did it
Check authorization result, stop authorization after user being rejected by server.
#### How to verify it
Pass all E2E test.
Create new UT: https://github.com/sonic-net/sonic-mgmt/pull/8345
#### Description for the changelog
Stop authorization after user being rejected by server.
#### Ensure to add label/tag for the feature raised. example - PR#2174 under sonic-utilities repo. where, Generic Config and Update feature has been labelled as GCU.
- Why I did it
If you enable feature and then disable it, System Ready status change to Not Ready
A disabled feature should not affect the system ready status.
- How I did it
During the disable flow of dhcp_relay, it entered the dnsrvs_name list, which caused the SYSTEM_STATE key to be set to DOWN. Right after that, the dhcp_relay service was removed from the full service list, however, but, when it was removed from the dnsrvs_name, there was no flow to reset the system state back to UP even though there was no more services in down state.
- How to verify it
root@qa-eth-vt01-2-3700v:/home/admin# config feature state dhcp_relay enabled
root@qa-eth-vt01-2-3700v:/home/admin# show system-health sysready-status
root@qa-eth-vt01-2-3700v:/home/admin# config feature state dhcp_relay disabled
root@qa-eth-vt01-2-3700v:/home/admin# show system-health sysready-status
Should see
System is ready