To include the following fixes:
DNX:
CS00012287482 - Support for 1024 LAGs on DNX (Added back fix reverted in [202205] Update Broadcom DNX SAI version to 7.1.54.4 #15850)
CS00012302400 - New SAI 7.1.50.4 caused regression in sonic-mgmt ACL test & ACL entry creation failing with SAI_STATUS_INVALID_PORT_NUMBER in SAI 7.1.50.4 (CS00012302347)
CS00012302163 - SAI_API_BRIDGE:_brcm_sai_bridge_port_learn_flag:1620 sai bridge lag port list get. failed with error -7.
CS00012296571 - LACP packets are queued to Queue 0 instead of Queue 7
CS00012301919 - The traffic is queued to VOQ 8 sometimes instead of destination port's VOQ
CS00012297160 - [SONIC] [J2C+] Traffic to unknown destination route getting enqueued on VOQ 10
CS00012298730 - [7.x][J2/J2C+] : Treat Q=0 as lowest priority and Q=7 as highest priority in Strict Priority Scheduling
Also includes -
XGS:
Port SONIC-62323 to SAI 7.1, Use single NH instead of ecmp
[SAI_BRANCH rel_ocp_sai_7_1] ECMP group expansion fail due to no resources
Fix capability for Hostif queue on SAI version 7.1
CS00012302193 - SAI_SWITCH_ATTR_SWITCH_HARDWARE_INFO attribute value changed
- Why I did it
watchdogutil uses the platform API watchdog instance to control and query watchdog status. The Nvidia watchdog implementation caches the "armed" status in an object member, "WatchdogImplBase.armed". This does not work for the CLI infrastructure, because each CLI invocation creates a new watchdog instance and the status cached in the previous instance is lost. Consider the following commands:
admin@sonic:~$ sudo watchdogutil arm -s 100 =====> watchdog instance1, armed=True
Watchdog armed for 100 seconds
admin@sonic:~$ sudo watchdogutil status ======> watchdog instance2, armed=False
Status: Unarmed
admin@sonic:~$ sudo watchdogutil disarm =======> watchdog instance3, armed=False
Failed to disarm Watchdog
- How I did it
Use sysfs to query watchdog status
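A minimal sketch of the idea, not the actual Mellanox platform code: instead of caching "armed" in an object member, every call reads the state from sysfs, so a fresh watchdog instance sees the same answer as the previous one. The directory and the `state` attribute follow the generic Linux watchdog sysfs layout, which is an assumption here; the real platform may read different files. The function name `is_armed` is hypothetical.

```python
import os

# Assumption: generic Linux watchdog sysfs layout; the real platform may differ.
DEFAULT_WATCHDOG_DIR = "/sys/class/watchdog/watchdog0"


def is_armed(watchdog_dir=DEFAULT_WATCHDOG_DIR):
    """Return True if sysfs reports the watchdog as active.

    Nothing is cached in the process, so each new CLI invocation
    observes the real state.
    """
    state_path = os.path.join(watchdog_dir, "state")
    try:
        with open(state_path) as f:
            return f.read().strip() == "active"
    except OSError:
        # Missing or unreadable attribute: report unarmed.
        return False
```

Because the state lives outside the process, `watchdogutil status` and `watchdogutil disarm` no longer depend on which instance armed the watchdog.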
- How to verify it
Manual test
Unit test
Conflicts:
platform/mellanox/mlnx-platform-api/sonic_platform/watchdog.py
platform/mellanox/mlnx-platform-api/tests/test_watchdog.py
Release Notes for Cisco T0 and 8102-64H.
• Fix for PSUD crash when PSUs are inserted in an operational system
• Fix for VxLAN counters not incrementing in 'show vxlan counter' and 'show platform npu vxlan counters'
• Fix for continuous error messages reported by thermalctld
• Fix for dshell client enable/disable causing syncd crash
• Support for 9100 TPID for Cisco fanout.
• Caveat: Drop counters for packets with invalid VLAN tag are counted twice.
Release Notes for Cisco 8101-32FH:
• Aikido FPD 1.89 Upgrade
Update SAI xgs version to 7.1.54.4-3 to include the following XGS changes:
7.1.54.3-1: Port SONIC-62323 to SAI 7.1, Use single NH instead of ecmp
7.1.54.3-2: [SAI_BRANCH rel_ocp_sai_7_1] ECMP group expansion fail due to no resources
7.1.54.3-3: Fix capability for Hostif queue on SAI version 7.1
Signed-off-by: zitingguo-ms <zitingguo@microsoft.com>
Why I did it
Update the iSMART_64 tool to support the latest Debian releases.
How I did it
modified: platform/broadcom/sonic-platform-modules-dell/s6100/scripts/iSMART_64
How to verify it
In s6100, run the iSMART_64 tool.
md5sum - 24725730d7649769c7ba50971c1f2955
Co-authored-by: Santhosh Kumar T <53558409+santhosh-kt@users.noreply.github.com>
Why I did it
[E1031] Fix occasional pca9548 initialization failures in stress tests.
When the failure happens, the iSMT I2C bus hangs and a power cycle is
needed to recover it.
How I did it
Add a 0.5 s delay between setting up and configuring the pca9548 I2C mux.
How to verify it
Reboot stress test at least 100 times without failure.
Co-authored-by: Ikki Zhu <79439153+qnos@users.noreply.github.com>
To pick the below fixes:
DNX fixes:
Temporarily revert fix for CS00012287482 - support for 1024 LAGs on DNX
CS00012297599 - [J2C+] sonic-mgmt failure in test_copp.py (test_no_policer[BGP])
CS00012293560 - ECN remark issue in SONiC
CS00012302371 - SONiC: V6 packets were mapped to wrong TC queue
CS00012288540 - Available ACL Entry and Counter is incorrect after removing ACL rules
Other changes (XGS fixes)
SID - L3 multicast packet drop due to wrong VFI derivation - SDK-350470
SID - SIGSEGV in linkscan callback delivery - SDK-287578
SID - Repeated VXLAN calls deletes vlan translation action profile SDK-313980
SER - error in IS_TDM_CALENDAR0/1 can cause traffic hit in TH
SID - L2_ENTRY Table Lookups May Miss
[CSP CS00012275452] sai_object_type_get_availability failed with SAI_STATUS_INVALID_PARAMETER
[CSP CS00012253527] sai_query_attribute_capability for obj type SAI_OBJECT_TYPE_SWITCH
Why I did it
Fix incorrectly specified table name in the extra queues and extra pgs j2 files for 8101-32FH-O
How I did it
Update platform module to 202205.2.2.7
Update SAI xgs version to 7.1.50.4 to include the following changes:
patch fix from CSP CS00012282080 needed to support speed change from 400g to 100g on chassis linecards.
Backport SONIC-71507 VSQF/VSQE are not created after port creation. JIRA# SONIC-71507
Backport JIRA SONIC-70704 to rel_ocp_sai_7_1. JIRA# SONIC-70704
SID - L3 multicast packet drop due to wrong VFI derivation - SDK-350470
SID - SIGSEGV in linkscan callback delivery - SDK-287578
SID - Repeated VXLAN calls deletes vlan translation action profile SDK-313980
SER - error in IS_TDM_CALENDAR0/1 can cause traffic hit in TH
SID - L2_ENTRY Table Lookups May Miss
[CSP CS00012275452] sai_object_type_get_availability failed with SAI_STATUS_INVALID_PARAMETER
Signed-off-by: zitingguo-ms <zitingguo@microsoft.com>
DNX fixes:
CS00012287482 - support for 1024 LAGs on DNX
Other changes (XGS fixes)
SID - L3 multicast packet drop due to wrong VFI derivation - SDK-350470
SID - SIGSEGV in linkscan callback delivery - SDK-287578
SID - Repeated VXLAN calls deletes vlan translation action profile SDK-313980
SER - error in IS_TDM_CALENDAR0/1 can cause traffic hit in TH
SID - L2_ENTRY Table Lookups May Miss
[CSP CS00012275452] sai_object_type_get_availability failed with SAI_STATUS_INVALID_PARAMETER
Why I did it
In a rare condition, the EMC2305 holds the SMBus, causing the SMBus completion wait to time out.
How I did it
Enable the EMC2305 SMBus timeout feature; a 30 ms period of inactivity resets the interface.
How to verify it
Use 'i2cget -y -f 23 0x4d 0x20 b' to read the EMC2305 configuration register and check that the DIS_TO bit is not set.
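The register check can be scripted. The sketch below parses the hex byte printed by i2cget and tests the DIS_TO bit. The bit position (bit 6, mask 0x40) is an assumption that should be confirmed against the EMC2305 datasheet, and the function name is hypothetical.

```python
# Assumption: DIS_TO is bit 6 of the configuration register (0x20).
# Confirm the bit position against the EMC2305 datasheet before relying on it.
DIS_TO_MASK = 0x40


def smbus_timeout_enabled(i2cget_output):
    """Return True if DIS_TO is clear in the byte printed by i2cget.

    i2cget prints the register value as a hex string such as "0x00".
    """
    value = int(i2cget_output.strip(), 16)
    return (value & DIS_TO_MASK) == 0
```

For example, a reading of `0x00` would mean the timeout feature is enabled, while `0x40` would mean it is still disabled.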
Signed-off-by: Eric Zhu <erzhu@celestica.com>
Why I did it
Release Notes for Cisco 8101-32FH-O:
• Fixed a FEC lane related error message
• Implemented 'show platform npu mac-state -i ' CLI for NPU MAC information and save-state dump.
• Fixed a Mac Port SerDes credit mismatch error message
• Configurable drop counter support for SAI_DEBUG_COUNTER_ATTR_TYPE (MIGSMSFT-197)
• Removed eth1-midplane creation rule that is not needed for this platform.
• Fix to move control packets from queue 0 to queue 7
How I did it
Update platform version to 202205.2.2.5
Why I did it
ptf_nn_agent failed to start in dnx rpc syncd because module afpacket was not installed.
Please see issue sonic-net/sonic-mgmt#7822
How I did it
Download the ptf afpacket module in the Dockerfile.
How to verify it
Verified that ptf_nn_agent was started successfully in dnx rpc syncd with the change.
[S6100] Improve S6100 serial-getty monitor, wait and re-check when getty not running to avoid false alert.
#### Why I did it
On S6100, systemd sometimes fails to restart the serial-getty service automatically, so a monit unit checks the serial-getty service status and restarts it.
However, this monit unit reports false alerts, because in most cases when serial-getty is not running, systemd restarts it successfully.
To avoid the false alerts, improve the monitor to wait and re-check.
Steps to reproduce this issue:
1. Log in to the device via console and keep the connection open.
2. Log in to the device via SSH and check that the serial-getty@ttyS1.service service is running.
3. Run 'monit reload' from the SSH connection.
4. Check syslog 1 minute later; it contains a false alert: ' 'serial-getty' process is not running'
#### How I did it
Add a check-getty.sh script that re-checks later when the getty service is not running.
Update the monit unit to check the serial-getty service status with this script to avoid the false alert.
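The wait-and-re-check logic can be sketched as follows. This is an illustration in Python, not the contents of check-getty.sh; the function names, retry count, and delay are hypothetical. Only `systemctl is-active --quiet` is a real command.

```python
import subprocess
import time


def getty_running(unit="serial-getty@ttyS1.service"):
    """Return True if systemd reports the unit as active."""
    result = subprocess.run(["systemctl", "is-active", "--quiet", unit])
    return result.returncode == 0


def service_ok(is_running, retries=3, delay=1.0, sleep=time.sleep):
    """Return True if is_running() succeeds now or within `retries` re-checks.

    Waiting before re-checking gives systemd time to restart the service,
    so a short gap does not raise an alert.
    """
    if is_running():
        return True
    for _ in range(retries):
        sleep(delay)
        if is_running():
            return True
    return False
```

A monitor script would exit 0 when `service_ok(getty_running)` returns True and 1 otherwise, so monit only restarts the service after the re-checks have also failed.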
#### How to verify it
Pass all UT.
Manually check that the fixed code works correctly:
```
admin@***:~$ sudo systemctl stop serial-getty@ttyS1.service
admin@***:~$ sudo /usr/local/bin/check-getty.sh
admin@***:~$ echo $?
1
admin@***:~$ sudo systemctl status serial-getty@ttyS1.service
● serial-getty@ttyS1.service - Serial Getty on ttyS1
Loaded: loaded (/lib/systemd/system/serial-getty@.service; enabled-runtime; vendor preset: enabled)
Active: inactive (dead) since Tue 2023-03-28 07:15:21 UTC; 1min 13s ago
admin@***:~$ sudo /usr/local/bin/check-getty.sh
admin@***:~$ echo $?
0
admin@***:~$ sudo systemctl status serial-getty@ttyS1.service
● serial-getty@ttyS1.service - Serial Getty on ttyS1
Loaded: loaded (/lib/systemd/system/serial-getty@.service; enabled-runtime; vendor preset: enabled)
```
syslog:
```
Mar 28 07:10:37.597458 *** INFO systemd[1]: serial-getty@ttyS1.service: Succeeded.
Mar 28 07:12:43.010550 *** ERR monit[593]: 'serial-getty' status failed (1) -- no output
Mar 28 07:12:43.010744 *** INFO monit[593]: 'serial-getty' trying to restart
Mar 28 07:12:43.010846 *** INFO monit[593]: 'serial-getty' stop: '/bin/systemctl stop serial-getty@ttyS1.service'
Mar 28 07:12:43.132172 *** INFO monit[593]: 'serial-getty' start: '/bin/systemctl start serial-getty@ttyS1.service'
Mar 28 07:13:43.286276 *** INFO monit[593]: 'serial-getty' status succeeded (0) -- no output
```
#### Description for the changelog
[S6100] Improve S6100 serial-getty monitor.
Why I did it
Update the sonic-platform submodule for the Nokia-7250IXRE platform. This requires NDK 22.9.8 or later.
How I did it
Update submodule sonic-platform for Nokia-7250IXRE platform.
c9f316e Disparate process and thread-safe protection for MDIPC transport, and refactored presence logic to better align with SfpStateUpdateTask operation
a3486cc Added _get_module_bulk_info() and cache the info for 5 seconds to optimize the chassisd update.
4b2e729 Fixed the nokia_cmd show qfpga help display
7b87049 Fixed the nokia_cmd show midplane helper display.
83eabea Add "nokia_cmd set ndk-monitor-action" and "nokia_cmd set ndk-log-level" commands
8aad7de Add nokia_cmd show ndk-version
d2c55e3 Modify the psu.py and module.py to optimize the psud running time
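Commit a3486cc above caches the module bulk info for 5 seconds. A generic time-based cache of that kind might look like the sketch below. It is not the code in the submodule; the class name `TimedCache` and its parameters are hypothetical.

```python
import time


class TimedCache:
    """Cache the result of fetch() and refresh it once ttl seconds have passed."""

    def __init__(self, fetch, ttl=5.0, clock=time.monotonic):
        self._fetch = fetch
        self._ttl = ttl
        self._clock = clock
        self._value = None
        self._expires = None  # None means nothing has been fetched yet

    def get(self):
        now = self._clock()
        if self._expires is None or now >= self._expires:
            # First call or cache expired: fetch again and restart the timer.
            self._value = self._fetch()
            self._expires = now + self._ttl
        return self._value
```

Repeated callers within the 5-second window share one expensive query, which is the kind of saving the chassisd update optimization describes.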
Signed-off-by: mlok <marty.lok@nokia.com>
1. SONIC 20220531.25 OC Failure: Everflow testcases failing due to SAI orchagent crash
2. SONIC 20220531.25 OC Failure: ACL IPv6 testcases.
3. TPID support
Signed-off-by: rajkumar38 <rpennadamram@marvell.com>
Fix watchdog reboot cause for wolverine linecard
Fix PSU fan speed of 0% by adding max RPM to most psu descriptions
Add product DCS-7060DX5-64
Add product DCS-7060DX5-32
Why I did it
Release Notes for Cisco 8101-32FH-O:
· Enable CMIS
· Enable Subport for Static Breakout
How I did it
Update platform version to 202205.2.2.2 (equivalent to 202205-v0.16)
Why I did it
Support to add SONiC OS Version in device info.
It is displayed in the output of the SONiC command "show version". The version is used for FIPS certification, which is done against the SONiC OS Version rather than against a specific release.
SONiC Software Version: SONiC.master-13812.218661-7d94c0c28
SONiC OS Version: 11
Distribution: Debian 11.6
Kernel: 5.10.0-18-2-amd64
How I did it
Why I did it
Add support for SFP refactor on Nokia-7215 Marvell armhf platform.
Platform: armhf-nokia_ixs7215_52x-r0
HwSKU: Nokia-7215
ASIC: marvell
Port Config: 48x1G + 4x10G (SFP+)
How I did it
Modify sfp.py to support SFP refactor optoe driver and platform.json to facilitate proper OC test completion.
How to verify it
Build armhf target for Nokia-7215 and verify proper Xcvrd and SFP refactor operation.
To include the following DNX changes:
Revert patch and add official SDK/SAI fix for the below CSPs
a. CS00012282080 : syncd crashes after a speed change due to "cosq src vsqs gport get" failure
b. CS00012281200 : J2C+ : Scope of config.bcm SOC property bcm_stat_interval
Fixes for:
a. CS00012278343: SONiC J2c+ Macsec: Shutting down LAG members which have macsec cause remaining active LAG members to go down
b. CS00012279717: Instance_id printed in SAI syslog messages are truncated to 9 bytes
Why I did it
Release Notes for Cisco 8808 platform:
Thermal sensor driver support
Release Notes for Cisco 8101-32FH-O:
Support for unwanted Alarm Suppression
How I did it
Update platform version to 202205.2.2.1 (v0.15)
How to verify it
Why I did it
Fixed syncd syslog errors reported in issue sonic-net/SONiC#1248
Fixes sonic-net/SONiC#1248
How I did it
Compile and load the SAI debian in SONiC 202205 image.
How to verify it
Verify PTF autorestart testcase passes.
Signed-off-by: rajkumar38 <rpennadamram@marvell.com>
Why I did it
Updated the port_alias name as per the SONiC HLD [b70bb0c7da/doc/sonic-port-name.md]
Updated channel number to breakout interfaces
How I did it
update platform version to 0.13
Why I did it
Update SAI xgs version to 7.1.36.4 to include the following changes.
JIRA# SONIC-69731 (7.1.33.4)
Issue Summary: SAI_SWITCH_ATTR_SWITCH_HARDWARE_INFO brcm_sai_get_switch_attribute returns null.
Root Cause: Not implemented.
Fix Description: Get support for SAI switch attr SAI_SWITCH_ATTR_SWITCH_HARDWARE_INFO added
JIRA# SONIC-70737 (7.1.34.4)
Issue Summary: ECN being marked as CE even without congestion
Root Cause: ecn_thresh was set to very low value and packets were 100% marked.
Fix Description: ecn_thresh set to correct value
backport SONIC-70081 to SAI7.1 (7.1.35.4)
egress lossy queue PFC Rx fix:ignore PFC signals from egress
Update git submodules (7.1.36.4)
Update sdk-src/hsdk_6.5.24_SAI_7.1.0_GA from branch 'hsdk_6.5.24_SAI_7.1.0_GA'
to 57d0e360269c4ab659c4790ae471aa4dba2532b4
[SAI_BRANCH rel_ocp_sai_7_1] Broadcom image build failed with SAI 7.1 in DMZ repo (on bullseye)
How I did it
Update SAI xgs code.
How to verify it
Run the SONiC and SAI test with the 7.1 SAI pipeline.
Signed-off-by: zitingguo-ms <zitingguo@microsoft.com>
* Remove apt package lists and make macro to clean up apt and python cache
Remove the apt package lists (`/var/lib/apt/lists`) from the docker
containers. This saves about 100MB.
Also, make a macro to clean up the apt and python cache that can then be
used in all of the containers. This keeps the cleanup consistent across
all containers.
Signed-off-by: Saikrishna Arcot <sarcot@microsoft.com>
Why I did it
Chassis:
Fixed multiple orchagent crashes due to BULK create failure
Added new line card support
Addressed T2 snmp test failures
Addressed reboot issues
Chassis and Fixed:
Addressed dshell client issues
HWSKU name update from Cisco-8101-T32/Cisco-8101-C48T8 to Cisco-8101-O32/Cisco-8101-C48O8
How I did it
update to cisco version 0.12