Haliburton needed watchdog daemon to monitor the basic health of a machine. If something goes wrong, such as a crashing program overloading the CPU, or no more free memory on the system, watchdog can safely reboot the machine,
* [SAI] Add PFC pause duration counters in microseconds
**- Why I did it**
To add PFC pause duration counters in microseconds
**- How I did it**
Updated SAI to version 1.14.3
**- How to verify it**
**- Description for the changelog**
[Mellanox] Update SAI to version 1.14.3
This PR adds the changes in #6198 to 201811 branch to support warm-reboot image upgrade for kvm images.
The sai.profile file in kvm images overrides the warmboot file with path /var/cache/sai_warmboot.bin. Since the directory /var/cache is not mounted in syncd, it will be cleared in an image upgrade, the warm-reboot image upgrade will fail if the file is put in the directory.
Remove the path that overrides the default path. The warmboot file path will then be the default value /var/warmboot/sai-warmboot.bin. Since /var/warmboot/ is mounted by /host/warmboot/ in the host, it could survive an image upgrade.
Since DOCKER_SYNCD_VS is no longer being used, the mount option does not properly mount the warmboot file directory. Fix the mount option so that the directory is properly mounted.
During platform deinitialization, dell_ich is not removed properly and when we do initialize s6100 platform, ICH driver sysfs attributes are not attached. Because of this, get_transceiver_change_event returns error and this leads xcvrd to crash.
- What I did
Ported a fix from libteam master to our master.
Fixes#4070Fixes#3649
- How I did it
Applied patch jpirko/libteam@c723737 from upstream.
- How to verify it
Build image for your DUT and warm-reboot your DUT 10 times. Check that all PortChannels are up and no error messages in teamd.log
To address issue #5525
Explicitly control the grub installation requirement when it is needed.
We have scenario where configuration migration happened but grub
installation is not required.
Signed-off-by: Ying Xie <ying.xie@microsoft.com>
* [platform/cel-dx010]: add gpio init for fan direction
* [platform/cel-dx010]: remove invalid code on fancontrol service
* [platform/cel-dx010]: modify fancontrol service permission
* [platform/cel-dx010]: install fancontrol in pmon
Printing both snapshot and current counter sets will make it easier to pinpoint
which message type(s) is/are not being relayed. This PR prints both counter sets.
Also, this PR defines gnu11 as a C standard to compile with in order to avoid
making changes when porting to 201811 branch.
singed-off-by: Tamer Ahmed <tamer.ahmed@microsoft.com>
When BGP routes are missing, DHCP packets get relayed over mgmt
interface. This results in dhcpmon alerting that DHCP packets are
not being relayed. This is PR include mgmt interface as uplink
device, and so, if DHCP packet gets relayed over mgmt interface,
regular dhcpmon alert will not be issues. Instead, dhcpmon will
check the mgmt interface counts and issue a separate alert regarding
packets travelling through mgmt network.
In addition, this PR includes the following enhancements:
1. Add SIGUSR1 handler that prints out current packet counts
2. Increase alert grace window to 3 minutes from currently 2 minutes
3. Time is now computed more accurately
4. Print vlan name before counters
signed-off-by: Tamer Ahmed <tamer.ahmed@microsoft.com>
* [201811][swss-common] advance swss-common sub module head
- Fix SubscriberStateTable::hasCachedData formula for a timing risk (#379)
Signed-off-by: Ying Xie <ying.xie@microsoft.com>
* Fix build of the unit test of SubscriberStateTable (#383)
When BGP routes are missing, DHCP packets get relayed over mgmt
interface. This results in dhcpmon alerting that DHCP packets are
not being relayed. This is PR include mgmt interface as uplink
device, and so, if DHCP packet gets relayed over mgmt interface,
regular dhcpmon alert will not be issues. Instead, dhcpmon will
check the mgmt interface counts and issue a separate alert regarding
packets travelling through mgmt network.
In addition, this PR includes the following enhancements:
1. Add SIGUSR1 handler that prints out current packet counts
2. Increase alert grace window to 3 minutes from currently 2 minutes
3. Time is now computed more accurately
4. Print vlan name before counters
signed-off-by: Tamer Ahmed <tamer.ahmed@microsoft.com>
The first partition starting point was changed to be 1M as part of this
commit: 6ba2f97f1e. On systems that are misaligned before conversion
(partition start is the first sector), the relica partition that is
left in the first MB can cause problems in Aboot and result in corruption
of the filesystem on the new aligned partition.
Zeroing this old relica makes sure that there is nothing left of the old
partition lying around. There won't be any risk of having Aboot corrupt
the new filesystem because of the old relica.
Signed-off-by: Baptiste Covolato <baptiste@arista.com>
Fix an error which causes bgpcfgd crash on invalid ip address. Before the fix we had an issue here. When either loopback ipv4 or ipv6 addresses were already set and bgpcfgd received another "SET" message for already set ip loopback address, bgpcfgd will send syslog message about ambiguous ip address (despite the fact that the address is good) and crash of bgpcfgd. With this change this behavior is changed: if we receive ip address and this ip address is already set, bgpcfgd will send this message to the syslog and return from the handler.
If a device had a master or 201911 image then installed a 201811 image, it could result in a prefdl that was not properly processed by 201811 Arista code.
This is a commit that was on 201911 and master branch.
Co-authored-by: Zhi Yuan Carl Zhao <zyzhao@arista.com>
sonic-quagga using utility from master branch of sonic-buildimage. I had to create 201811 branch in sonic-quagga which could work with 201811 branch of sonic-buildimage.
Fix the method get_transceiver_change_event to abide by the function description, return True status and use timeout in milliseconds.
Co-authored-by: Zhi Yuan Carl Zhao <zyzhao@arista.com>
[aclorch] Use IPv6 Next Header internally for protocol number on MLNX platform (#1343)
Add/Del lag_name_map item according to lag adding and removing (#1124)
Signed-off-by: Ying Xie <ying.xie@microsoft.com>
Update fancontrol service for Seastone-DX010/E1031 device to support hysteresis temperature threshold and difference config for each unit fan direction type (B2F/F2B); follow master branch
FDB/ARP/Default routes files are deleted after swssconfig. This
makes debugging/validation of device conversion hard. This PR
saves those files in order to facilitate debugging of device conversion.
signed-off-by: Tamer Ahmed <tamer.ahmed@microsoft.com>
While migrating to SONiC 20181130, identified a couple of issues:
1. union-mount needs /host/machine.conf parameters for vendor specific checks : however, in case of migration, the /host/machine.conf is extracted from ONIE only in https://github.com/Azure/sonic-buildimage/blob/master/files/image_config/platform/rc.local#L127.
2. Since grub.cfg is updated to have net.ifnames=0 biosdevname=0, 70-persistent-net.rules changes are no longer required.
Don't limit iptables connection tracking to TCP protocol; allow connection tracking for all protocols. This allows services like NTP, which is UDP-based, to receive replies from an NTP server even if the port is blocked, as long as it is in reply to a request sent from the device itself.