Add the same mechanism I developed for the SwSS service in #2845 to the syncd service. However, in order to cause the SwSS service to also exit and restart in this situation, I developed a docker-wait-any program which the SwSS service uses to wait for either the swss or syncd containers to exit.
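The "wait for either container to exit" idea can be sketched as follows. This is a minimal illustration, not the actual docker-wait-any program: the function names `docker_wait` and `wait_any` are hypothetical, and the sketch assumes the `docker wait` CLI command is available on the host.

```python
import subprocess
import threading


def docker_wait(name):
    """Block until the named container exits (assumes the `docker wait` CLI)."""
    subprocess.run(["docker", "wait", name], check=False,
                   stdout=subprocess.DEVNULL)


def wait_any(names, wait_fn=docker_wait):
    """Return the name of the first container whose wait call returns."""
    done = threading.Event()
    lock = threading.Lock()
    first = []

    def watch(name):
        wait_fn(name)
        with lock:
            if not first:
                first.append(name)  # record only the first container to exit
        done.set()

    # One daemon thread per container; daemon threads do not block process exit.
    for name in names:
        threading.Thread(target=watch, args=(name,), daemon=True).start()

    done.wait()
    return first[0]
```

A service script could call `wait_any(["swss", "syncd"])` and then exit, so that the service manager restarts the service regardless of which container stopped first.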
AS7326-56X and AS7726-56X use the same design so both devices have the same problem.
The description below uses AS7326-56X as the example.
Original implementation:
- In platform/broadcom/sonic-platform-modules-accton/as7326-56x/service/as7326-platform-handle_mac.service,
it executes the script file "accton_handle_idt.sh".
- In "accton_handle_idt.sh", it modifies the content of the script file "/etc/init.d/opennsl-modules"
to insert the lines to execute "idt_init.sh" before the command to load broadcom linux kernel module "linux-kernel-bde.ko".
- The script "idt_init.sh" cannot be executed at the first boot of SONiC after installing SONiC under ONIE. This is why none of the ports work.
New implementation:
- Let "as7326-platform-handle_mac.service" execute "idt_init.sh".
- Change the content of "as7326-platform-handle_mac.service" to define the service type as "oneshot". Add the settings to ensure "as7326-platform-handle_mac.service" is executed before "opennsl-modules.service".
By setting the service type to "oneshot", it is guaranteed that "opennsl-modules.service" is started only after the forked process that executes the script file "idt_init.sh" has terminated.
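A unit file along these lines would express the ordering described above. It is only a sketch: the `Description` text and the `ExecStart` path are assumptions, since the source does not give the install location of "idt_init.sh".

```ini
[Unit]
Description=AS7326-56X platform MAC handling (illustrative)
# Ordering only: this unit must finish starting before opennsl-modules begins.
Before=opennsl-modules.service

[Service]
# A oneshot unit counts as started only once its ExecStart process has exited.
Type=oneshot
# Hypothetical path; adjust to where idt_init.sh is actually installed.
ExecStart=/usr/local/bin/idt_init.sh
```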
Signed-off-by: charlie_chen <charlie_chen@edge-core.com>
When the LLDP tx-interval parameter was modified, no PDU was sent immediately to update the peer with the latest values. As a result, the peer was updated only when the next PDU was sent, which could cause a delay of up to 30 seconds (the default value).
* In the event of a kernel crash, we need to gather as much information
as possible to understand and identify the root cause of the crash.
Currently, the kernel does not provide much information, which makes
kernel crash investigation difficult and time-consuming.
Fortunately, there is a way in the kernel to provide more information
in the case of a kernel crash. kdump is a feature of the Linux kernel
that creates crash dumps in the event of a kernel crash. This PR
will add kernel kdump support.
An extension to the CLI utilities config and show is provided to
configure and manage kdump:
- enable / disable kdump functionality
- configure kdump (how many kernel crash logs can be saved, memory
allocated for capture kernel)
- view kernel crash logs
Added python-libpcap for use by the arp_responder.py utility. This is needed to set conf.use_pcap, which makes sure that L2pcapListenSocket uses libpcap instead of Linux PF_PACKET sockets. With libpcap, the VLAN field is not removed when the application receives the packet.
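To show what the VLAN field looks like in a received frame, here is a small standalone sketch that reads the 802.1Q tag from raw Ethernet bytes. It does not use scapy or libpcap; the helper name `vlan_id` is hypothetical and the frames are built by hand for illustration.

```python
import struct

TPID_8021Q = 0x8100  # EtherType value that marks an 802.1Q VLAN tag


def vlan_id(frame):
    """Return the VLAN ID of a raw Ethernet frame, or None if it is untagged."""
    if len(frame) < 16:
        return None
    # Bytes 12-13 hold the EtherType/TPID; bytes 14-15 hold the tag control info.
    tpid, tci = struct.unpack("!HH", frame[12:16])
    if tpid != TPID_8021Q:
        return None
    return tci & 0x0FFF  # the low 12 bits of the TCI are the VLAN ID


# Hand-built example frames: broadcast destination, arbitrary source MAC.
_dst = b"\xff" * 6
_src = b"\x00\x11\x22\x33\x44\x55"
_arp = struct.pack("!H", 0x0806) + bytes(28)  # ARP EtherType plus dummy payload

tagged_frame = _dst + _src + struct.pack("!HH", TPID_8021Q, 100) + _arp
untagged_frame = _dst + _src + _arp
```

If the tag has been stripped before the frame reaches the application, a check like `vlan_id(frame)` returns `None` and the responder cannot tell which VLAN the request arrived on.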
Implement Watchdog platform2.0 API for DellEMC S6100 platform.
- Added new file watchdog.py in sonic_platform directory.
- Added API support to enable/disable the watchdog.
Implement part of the Chassis and Fan related APIs.
- Chassis APIs
get_base_mac()
get_serial_number()
get_system_eeprom_info()
get_reboot_cause()
- Fan APIs
get_direction()
get_speed()
get_target_speed()
get_speed_tolerance()
set_speed()
set_status_led()
- Fan APIs based on Device API
get_name()
get_presence()
get_model()
get_serial()
get_status()
Signed-off-by: Wirut Getbamrung <wgetbumr@celestica.com>
* Rename asn/deployment_id_asn_map.yaml to constants/constants.yaml
* Fix bgp templates
* Add community for loopback when bgpd is isolated
* Use correct community value
* Preemphasis values for various optics
This patch adds the preemphasis values for the various supported
optics for qfx5210 platform
Signed-off-by: Ciju Rajan K <crajank@juniper.net>
Fixed the FPGA crash that occurs within 15-20 minutes after onie-install. The crash is caused by accessing a stale i2c transfer message buffer. The message buffer becomes stale due to a race between the i2c transfer and the FPGA interrupt handler.
The new state STATE_STOP is not exposed to the wake-up call until all ISRs of the previous transfer have completed successfully.
Fixed the FRR Makefile. After #3589 we had the following issues:
- When you want to rebuild frr with new changes, you get the error "branch frr/7.1 is already exist".
- When your patch list is empty, stg undo gives an error.
* [sonic-cfggen] optimize execution time
Rendering many templates makes the switch take longer to start because jinja2
needs to parse them. Introduce RedisBytecodeCache to store parsed buckets of
internal template bytecode, which speeds up rendering of the same template during start.
* [sonic-cfggen] do lazy regexp compilation to speed up sonic-cfggen
* [sonic-cfggen] address pep8 related comments
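Lazy regexp compilation, as mentioned above, can be sketched like this. The source does not show how sonic-cfggen implements it, so the `LazyPattern` class and the `PORT_NAME` pattern below are hypothetical and only illustrate the general technique of deferring `re.compile` until first use.

```python
import re


class LazyPattern:
    """Compile a regular expression on first use instead of at import time."""

    def __init__(self, pattern, flags=0):
        self._pattern = pattern
        self._flags = flags
        self._compiled = None  # filled in by the first call that needs it

    def _get(self):
        if self._compiled is None:
            self._compiled = re.compile(self._pattern, self._flags)
        return self._compiled

    def match(self, text):
        return self._get().match(text)


# Hypothetical pattern, for illustration only.
PORT_NAME = LazyPattern(r"Ethernet(\d+)$")
```

Runs that never touch `PORT_NAME` skip its compilation cost entirely, which matters when a short-lived tool is invoked many times during startup.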
Signed-off-by: Stepan Blyschak <stepanb@mellanox.com>
We noticed in tests/production that there is a low probability failure
where /etc/hosts could have some garbage characters before the entry for
local host name. The consequence is that all sudo commands would be very
slow. In extreme cases it would prevent some services from starting
properly.
I suspect that the /etc/hosts file might be opened by some process, causing
the issue. Editing the contents in a new file and then replacing the whole
file should be safer.
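The "write a new file, then replace the whole file" approach can be sketched as below. This is an illustration of the general technique, not the actual change in this commit; the function name `replace_file_atomically` is hypothetical.

```python
import os
import shutil
import tempfile


def replace_file_atomically(path, new_text):
    """Write new_text to a temp file in the same directory, then swap it in."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".tmp-")
    try:
        with os.fdopen(fd, "w") as tmp:
            tmp.write(new_text)
            tmp.flush()
            os.fsync(tmp.fileno())  # make sure the data is on disk first
        if os.path.exists(path):
            shutil.copymode(path, tmp_path)  # keep the original permissions
        # os.replace is a rename, so readers see either the old file or the
        # new one, never a partially written mix.
        os.replace(tmp_path, path)
    except BaseException:
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
```

The temporary file is created in the same directory as the target so that the final rename stays on one filesystem.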
Signed-off-by: Ying Xie <ying.xie@microsoft.com>