Why I did it
Graceful restart is a key event for bgpd, related log print is debug level. To change it to info level to get more visibilities when this kind of event is triggered.
Work item tracking
Microsoft ADO (13875291):
How I did it
To create patch file to change from debug level to info level.
How to verify it
To run PR test and capture the print.
- Why I did it
To fix hiredis compilation
- How I did it
Changed package version: 0.14.0-3~bpo9+1 -> 0.14.1-1
- How to verify it
make configure PLATFORM=mellanox
make target/sonic-mellanox.bin
Signed-off-by: Nazarii Hnydyn <nazariig@nvidia.com>
#### Why I did it
src/dhcprelay
```
* c36b8e3 - (HEAD -> master, origin/master, origin/HEAD) [actions] Support Semgrep by Github Actions (#39) (7 hours ago) [Mai Bui]
```
#### How I did it
#### How to verify it
#### Description for the changelog
#### Why I did it
src/linkmgrd
```
* 4bda49b - (HEAD -> master, origin/master, origin/HEAD) [actions] Support Semgrep by Github Actions (#210) (7 hours ago) [Mai Bui]
```
#### How I did it
#### How to verify it
#### Description for the changelog
#### Why I did it
src/sonic-dbsyncd
```
* e4ac906 - (HEAD -> master, origin/master, origin/HEAD) [actions] Support Semgrep by Github Actions (#59) (7 hours ago) [Mai Bui]
```
#### How I did it
#### How to verify it
#### Description for the changelog
#### Why I did it
src/sonic-mgmt-framework
```
* 4a2ff41 - (HEAD -> master, origin/master, origin/HEAD) [actions] Support Semgrep by Github Actions (#116) (5 hours ago) [Mai Bui]
```
#### How I did it
#### How to verify it
#### Description for the changelog
Why I did it
Downgrade the symcrypt version, use the SymCrypt version v103.0.1 for certification.
Work item tracking
Microsoft ADO (number only): 24222567
How I did it
How to verify it
Why I did it
Update the definition of acl table type BMCDATA and BMCDATAV6 in minigraph parser.
Work item tracking
Microsoft ADO (number only): 24101023
How I did it
Update the definition of acl table type BMCDATA and BMCDATAV6 in minigraph parser.
How to verify it
Ran unittest to verify this update:
Why I did it
To include 202305 in the PR template.
Work item tracking
Microsoft ADO (21450860):
How I did it
Change the temlate.
How to verify it
pass PR test.
Added support data for fabric monitoring in CONFIG_DB
The CONFIG_DB now has the FABRIC_MONITOR|FABRIC_MONITOR_DATA table for default value for fabric port monitoring. An example output of getting this table is:
sonic-db-cli CONFIG_DB hgetall "FABRIC_MONITOR|FABRIC_MONITOR_DATA"
{'monErrThreshCrcCells': '1', 'monErrThreshRxCells': '61035156', 'monPollThreshIsolation': '1', 'monPollThreshRecovery': '8'}
The CONFIG_DB now also has a table for each fabric port for its isolate status.
An example output of getting this table is:
sonic-db-cli CONFIG_DB hgetall "FABRIC_PORT|Fabric20"
{'alias': 'Fabric20', 'isolateStatus': 'False', 'lanes': '20'}
This reverts commit 02b17839c3.
Reverts #14933
The earlier commit caused a race condition that particularly broke cross branch warm upgrade.
Issue happens when db_migrator is still migrating the DB and finalizer is checking DB for list of components to reconcile.
If migration is not complete, finalizer get an empty list to wait for. Due to this, finalizer concludes warmboot (deletes system wide warmboot flag) and cause all the services to do cold restart.
ADO: 24274591
#### Why I did it
src/sonic-swss
```
* 87e0b08 - (HEAD -> master, origin/master, origin/HEAD) [portsorch]: Enhancing SWSS OA logs to capture host_tx_ready change events (#2822) (11 hours ago) [mihirpat1]
* c7e52a0 - [subinterface]: Fix admin state handling. (#2806) (34 hours ago) [Nazarii Hnydyn]
* ebfda13 - [aclorch] Fix TODO: use SAI object API to query capabilities (#2743) (2 days ago) [Stepan Blyshchak]
```
#### How I did it
#### How to verify it
#### Description for the changelog
#### Why I did it
src/sonic-gnmi
```
* a600dc9 - (HEAD -> master, origin/master, origin/HEAD) Fix threading issues in Event Client (#121) (9 hours ago) [Zain Budhwani]
```
#### How I did it
#### How to verify it
#### Description for the changelog
#### Why I did it
src/sonic-swss-common
```
* 2320ddc - (HEAD -> master, origin/master, origin/HEAD) Add ZMQ port for orchagent (#795) (19 hours ago) [Hua Liu]
```
#### How I did it
#### How to verify it
#### Description for the changelog
Why I did it
sonic-mgmt is failing tests due to invalid test data in platform.json
Fwutil is upset the chassis name in the platform_component.json of the 7060CX-32S
How I did it
Fixed the aforementioned issues
* Re-add 127.0.0.1/8 when bringing down the interfaces
With #5353, 127.0.0.1/16 was added to the lo interface, and then
127.0.0.1/8 was removed. However, when bringing down the lo interface,
like during a config reload, 127.0.0.1/16 gets removed, but 127.0.0.1/8
isn't added back to the interface. This means that there's a period of
time where 127.0.0.1 is not available at all, and services that need to
connect to 127.0.01 (such as for redis DB) will fail.
To fix this, when going down, add 127.0.0.1/8. Add this address before
the existing configuration gets removed, so that 127.0.0.1 is available
at all times.
Note that running `ifdown lo` doesn't actually bring down the loopback
interface; the interface always stays "physically" up.
Signed-off-by: Saikrishna Arcot <sarcot@microsoft.com>
- Why I did it
SDK patches for iproute2 were added to SONiC tree as a temporary solution.
Now that SDK with the patches is available, I have removed the patches from SONiC tree and we consume them from SDK github during compilation.
- How I did it
During build we download SDK iproute2 patches from SDK github (or from the URL provided by user if compiling SDK from sources) and apply them before compilation.
- How to verify it
Compile and load on switch, verify interfaces network devices created successfully.
Verify LLDP shows connections to neighbors.
Verify ping between 2 hosts over 2 router ports is successful.
- Why I did it
Adjust the warning threshold implementation according to the latest algorithm update
- How I did it
Modify power warning and critical thresholds methods
- How to verify it
Unit test updated to cover the change
Signed-off-by: Stephen Sun <stephens@nvidia.com>
#### Why I did it
The asn 0 in BGP_MONITOR is invalid by YANG definition. However, the asn 0 in BGP_MONITOR is found in many devices.
It was introduced by minigraph where its value is set to 0.
To unblock Config Updater test, the short term fix is to accept the asn 0 in BGP_MONITOR.
We can revert this after NGS team make all the ASN change in minigraph.
##### Work item tracking
- Microsoft ADO **(24186140)**:
#### How I did it
Change the range
#### How to verify it
Unit test.
Add watchdog mechanism to swss service and generate alert when swss have issue.
**Work item tracking**
Microsoft ADO (number only): 16578912
**What I did**
Add orchagent watchdog to monitor and alert orchagent stuck issue.
**Why I did it**
Currently SONiC monit system only monit orchagent process exist or not. If orchagent process stuck and stop processing, current monit can't find and report it.
**How I verified it**
Pass all UT.
Manually test process_monitoring/test_critical_process_monitoring.py can pass.
Add new UT https://github.com/sonic-net/sonic-mgmt/pull/8306 to check watchdog works correctly.
Manually test, after pause orchagent with 'kill -STOP <pid>', check there are warning message exist in log:
Apr 28 23:36:41.504923 vlab-01 ERR swss#supervisor-proc-watchdog-listener: Process 'orchagent' is stuck in namespace 'host' (1.0 minutes).
**Details if related**
Heartbeat message PR: https://github.com/sonic-net/sonic-swss/pull/2737
UT PR: https://github.com/sonic-net/sonic-mgmt/pull/8306
For T2 systems using packet mode, the backplane interfaces (Ethernet-BP#) and the fabric card ethernet interfaces are not visible as neighbor interfaces.
In packet mode, these interfaces needs qos and buffer config as well.
This fix addresses that issue and adds the backplane interfaces to the PORTS_ACTIVE list