sonic-buildimage/files/scripts
Michael Li f725b83bd6
Reload BCM SDK kmods on syncd start to handle syncd restart issues (#12804)
Why I did it
On the Arista PikeZ platform (using T3.X2: BCM56274) running SONiC, restarting the 'syncd' container should cause syncd to restart and recover automatically. Instead, it always fails at create_switch because the BCM SDK kmod DMA operation cancellation gets stuck.

Sep 16 22:19:44.855125 pkz208 ERR syncd#syncd: [none] SAI_API_SWITCH:platform_process_command:428 Platform command "init soc" failed, rc = -1.
Sep 16 22:19:44.855206 pkz208 INFO syncd#supervisord: syncd CMIC_CMC0_PKTDMA_CH4_DESC_COUNT_REQ:0x33#015
Sep 16 22:19:44.855264 pkz208 CRIT syncd#syncd: [none] SAI_API_SWITCH:platformInit:1909 initialization command "init soc" failed, rc = -1 (Internal error).
Sep 16 22:19:44.855403 pkz208 CRIT syncd#syncd: [none] SAI_API_SWITCH:sai_driver_init:642 Error initializing driver, rc = -1.
...
Sep 16 22:19:44.855891 pkz208 CRIT syncd#syncd: [none] SAI_API_SWITCH:brcm_sai_create_switch:1173 initializing SDK failed with error Operation failed (0xfffffff5).

Reloading the BCM SDK kmods allows the switch init to continue properly.

How I did it
If the BCM SDK kmods are loaded, unload them and load them again in the syncd docker start script.
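
The reload step can be illustrated with a minimal sketch. This is not the code from the commit: the module names (example_kernel_bde, example_user_bde, example_knet), the helper names, and the DRY_RUN and PROC_MODULES variables are all hypothetical placeholders chosen for illustration. It only shows the general pattern of "if loaded, unload in reverse order, then load again".

```shell
#!/bin/sh
# Sketch only: the module names and helper names are hypothetical placeholders.

# Modules in load order; they are unloaded in the reverse order.
KMODS="example_kernel_bde example_user_bde example_knet"

# Succeed if the named module appears in the kernel's module list.
is_loaded() {
    grep -q "^$1 " "${PROC_MODULES:-/proc/modules}" 2>/dev/null
}

# Print the command when DRY_RUN=1 (the default); otherwise execute it.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

reload_kmods() {
    # Nothing to do unless the base (first) module is already loaded.
    set -- $KMODS
    is_loaded "$1" || return 0

    # Build the reversed list so dependent modules are removed first.
    reversed=""
    for mod in $KMODS; do
        reversed="$mod $reversed"
    done
    for mod in $reversed; do
        if is_loaded "$mod"; then
            run rmmod "$mod"
        fi
    done
    # Load everything again in the original order.
    for mod in $KMODS; do
        run modprobe "$mod"
    done
}
```

With the default DRY_RUN=1, calling reload_kmods only prints the rmmod and modprobe commands it would run; setting DRY_RUN=0 (as root) would execute them. A real start script would more likely call the platform's own module init script than hard-code module names.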

How to verify it
Steps to reproduce:

In SONiC, run 'docker ps' to see the currently running containers; 'syncd' should be present.
Run 'docker stop syncd'
Wait ~1 minute.
Run 'docker ps' to see that syncd is missing.
Check logs to see messages similar to the above.
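
The steps above could be scripted roughly as follows. This is a sketch under stated assumptions: the helper names are hypothetical, and the log location /var/log/syslog is an assumption about where these messages end up on the device, so adjust it as needed.

```shell
#!/bin/sh
# Sketch only: helper names and the default syslog path are assumptions.

# Succeed if the container name given as $1 is one of the lines on stdin.
has_container() {
    grep -qx "$1"
}

# Print the log lines on stdin that report the SDK "init soc" failure.
init_soc_failures() {
    grep 'Platform command "init soc" failed'
}

# Stop syncd, wait about a minute, then report whether it came back.
check_syncd_recovery() {
    docker stop syncd
    sleep 60
    if docker ps --format '{{.Names}}' | has_container syncd; then
        echo "syncd recovered"
    else
        echo "syncd did not recover"
        init_soc_failures < "${SYSLOG:-/var/log/syslog}"
    fi
}
```

Running check_syncd_recovery on an affected device without the fix would be expected to report that syncd did not recover and to print log lines like those quoted above.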

Signed-off-by: Michael Li <michael.li@broadcom.com>
2022-11-30 16:16:30 +08:00
arp_update [chassis-packet] fix the issue of internal ip arp not getting resolved. (#12127) 2022-11-14 10:15:17 -08:00
asic_status.py [systemd] ASIC status based service bringup on VOQ chassis (#7477) 2021-07-27 23:02:49 -07:00
asic_status.sh [systemd] ASIC status based service bringup on VOQ chassis (#7477) 2021-07-27 23:02:49 -07:00
bgp.sh BGP Service script path and error fix (#5183) 2020-08-15 12:09:10 -07:00
configdb-load.sh [MultiDB] use sonic-db-cli PING and fix wrong multiDB API in NAT (#4541) 2020-05-06 15:41:28 -07:00
core_cleanup.py [Python] Align files in root dir, dockers/ and files/ with PEP8 standards (#6109) 2020-12-03 15:57:50 -08:00
database.sh [services] kill container on stop in warm/fast mode (#10510) 2022-09-19 19:34:33 +03:00
gbsyncd-platform.sh [gearbox] provide common gbsyncd.service.j2 to start for platform specific gbsyncd docker (#9332) 2021-11-23 10:44:29 -08:00
gbsyncd.sh Address Review Comment to define SONIC_GLOBAL_DB_CLI in gbsyncd.sh (#11857) 2022-08-29 08:19:28 -07:00
lldp.sh [systemd] ASIC status based service bringup on VOQ chassis (#7477) 2021-07-27 23:02:49 -07:00
mark_dhcp_packet.py Replace os.system and remove subprocess with shell=True (#12177) 2022-11-04 10:48:51 -04:00
mgmt-framework.sh [services] kill container on stop in warm/fast mode (#10510) 2022-09-19 19:34:33 +03:00
radv.sh [services] kill container on stop in warm/fast mode (#10510) 2022-09-19 19:34:33 +03:00
service_mgmt.sh [services] kill container on stop in warm/fast mode (#10510) 2022-09-19 19:34:33 +03:00
snmp.sh [services] kill container on stop in warm/fast mode (#10510) 2022-09-19 19:34:33 +03:00
sonic-netns-exec [sonic-netns-exec]: use "$@" to reflects all positional parameters as they were set initially (#4375) 2020-04-07 00:05:47 -07:00
supervisor-proc-exit-listener Publish additional events (#12563) 2022-11-07 09:57:57 -08:00
swss.sh Added Support to runtime render bgp and teamd feature state and lldp has_asic_scope flag (#11796) 2022-11-15 16:20:14 -08:00
syncd_common.sh [swss.sh/syncd.sh] Trap only on EXIT (#11590) 2022-08-10 20:57:07 -07:00
syncd.sh Reload BCM SDK kmods on syncd start to handle syncd restart issues (#12804) 2022-11-30 16:16:30 +08:00
teamd.sh [teamd.sh] kill teamd docker on warm shutdown for faster shutdown (#10219) 2022-03-15 09:20:36 +02:00
telemetry.sh [services] kill container on stop in warm/fast mode (#10510) 2022-09-19 19:34:33 +03:00
update_chassisdb_config In modular chassis, add CHASSIS_STATE_DB on control card (#5624) 2020-12-15 17:15:00 -08:00
write_standby.py [mux] skip mux operations during warm shutdown (#11937) 2022-09-02 13:50:42 -07:00