Commit Graph

2444 Commits

Author SHA1 Message Date
Aravind Mani
ca76c05890
[201811] Dell: Fix S6000 reboot issue (#6688)
#### Why I did it
Reboot command in Dell S6000 failed to reboot the switch.

#### How I did it
Added retry mechanism and CPU reset.
2021-02-16 15:37:44 -08:00
Volodymyr Samotiy
7d0c9f4b3b
[Mellanox] Update SDK repo pointer for kernel package v4.3.1646 with kernel v4.9.0-14 (#6719)
To update rebuilt Mellanox SDK kernel package v4.3.1646 with kernel v4.9.0-14.

Signed-off-by: Volodymyr Samotiy <volodymyrs@nvidia.com>
2021-02-08 08:33:44 -08:00
Volodymyr Samotiy
3d7f4fc45d
[Mellanox] Update SDK kernel package v4.3.1646 with kernel v4.9.0-14 (#6647)
Updated commit hash pointer in the relevant Makefile for the repository which contains SDK packages.

Signed-off-by: Volodymyr Samotiy <volodymyrs@nvidia.com>
2021-02-02 16:35:19 -08:00
Guohan Lu
db81349bae [ci]: reset the repo
Signed-off-by: Guohan Lu <lguohan@gmail.com>
2021-01-30 05:07:48 -08:00
Guohan Lu
ee4fa7b909 [pip2]: pin down pip to 20.3.3
With the release of pip 21.0 (https://pypi.org/project/pip/#history), the stretch build on branch
201811 is failing with the error logs below:

As per https://pypi.org/project/pip/, pip 21.0 no longer supports Python 2
as of January 2021. To fix this, pin pip to version 20.3.3, which was the last version in use
and works fine.

Signed-off-by: Guohan Lu <lguohan@gmail.com>
2021-01-30 05:07:48 -08:00
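The pin described in this commit amounts to rejecting any pip release from 21.0 onward on a Python 2 build. A minimal sketch of that check follows; `parse_version`, `supports_python2`, `pin_requirement` and `PINNED_PIP` are hypothetical names used for illustration and are not taken from the build scripts.

```python
# Hypothetical helpers: decide whether a pip release can still run on Python 2.
# Assumption: versions are plain dotted integers such as "20.3.3" or "21.0".

PINNED_PIP = "20.3.3"  # version named in the commit message


def parse_version(version):
    """Turn '20.3.3' into (20, 3, 3) so versions compare numerically."""
    return tuple(int(part) for part in version.split("."))


def supports_python2(pip_version):
    """pip dropped Python 2 support starting with release 21.0."""
    return parse_version(pip_version) < (21, 0)


def pin_requirement(version=PINNED_PIP):
    """Build the requirement string to hand to `pip install`."""
    return "pip=={}".format(version)
```

Passing the result of `pin_requirement()` to `pip install` keeps the build on the last release known to work with Python 2.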
arlakshm
ddbfe0631d [baseimage]: add docker ps to the sudoer file (#6604)
fixes Azure/sonic-utilities#1389

With the recent changes to the sudoers file, the show commands fail for read-only users.
The problem is that 'docker ps' fails in the function [get_routing_stack()](8a1109ed30/show/main.py (L54)), so all the CLI commands fail.

Signed-off-by: Arvindsrinivasan Lakshmi Narasimhan <arlakshm@microsoft.com>
2021-01-29 08:38:47 -08:00
Qi Luo
a6295f82be
Cleanup sudoers file (#6523)
Same as https://github.com/Azure/sonic-buildimage/pull/6518
For 201811 branch
2021-01-21 14:42:10 -08:00
Guohan Lu
d40a82bf66 [ci]: add azure pipeline yaml
Signed-off-by: Guohan Lu <lguohan@gmail.com>
2021-01-20 23:06:13 -08:00
lguohan
ca74fe22fc [build]: setup -t option in docker run correctly (#6320)
use bash -t test flag to check if input device is tty or not

Signed-off-by: Guohan Lu <lguohan@gmail.com>
2021-01-20 23:06:13 -08:00
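The idea behind this commit, passing `-t` to `docker run` only when standard input is a terminal, can be sketched in Python. The commit itself uses the shell `-t` test, so `docker_run_args` and the `-i` flag below are assumptions made for this illustration only.

```python
import sys


def docker_run_args(image, command, stdin=None):
    """Build a `docker run` argument list, adding -t only for a terminal.

    Hypothetical helper; the image and command are supplied by the caller.
    """
    stdin = sys.stdin if stdin is None else stdin
    args = ["docker", "run", "-i"]
    if stdin.isatty():
        # Same idea as the shell test `[ -t 0 ]`: fd 0 is a terminal.
        args.append("-t")
    args.append(image)
    args.extend(command)
    return args
```

When the build is started from a CI job or a pipe, `isatty()` returns False and `-t` is omitted, so docker is never asked to allocate a terminal that does not exist.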
Guohan Lu
1cca3ded45 [build]: fix dpkg uninstall bug
fix a bug when there are multiple debian packages to be uninstalled

Signed-off-by: Guohan Lu <lguohan@gmail.com>
2021-01-20 23:06:13 -08:00
lguohan
63a044a87c [build]: fix dpkg admindir corruption issue in parallel build (#6408)
Fix #119

When parallel build is enabled, multiple dpkg-buildpackage
instances run at the same time. /var/lib/dpkg is shared
by all instances, so /var/lib/dpkg/updates can be corrupted,
causing the build to fail.

The fix is to use overlay fs to mount a separate /var/lib/dpkg
for each dpkg-buildpackage instance so that they do not affect
each other.

Signed-off-by: Guohan Lu <lguohan@gmail.com>
2021-01-20 23:06:13 -08:00
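A rough sketch of the overlay approach from this commit: each build job gets its own upper and work directories layered over the shared /var/lib/dpkg. The scratch location `/tmp/dpkg-overlay`, the per-job directory layout and the function name are assumptions for illustration; the actual build rules may arrange this differently.

```python
import posixpath


def overlay_mount_command(build_id, lower="/var/lib/dpkg",
                          scratch_root="/tmp/dpkg-overlay"):
    """Return a mount command giving one build job a private dpkg admindir.

    The scratch layout (upper/work directories under scratch_root) is an
    assumption of this sketch, not something stated in the commit.
    """
    job_dir = posixpath.join(scratch_root, str(build_id))
    upper = posixpath.join(job_dir, "upper")
    work = posixpath.join(job_dir, "work")
    options = "lowerdir={},upperdir={},workdir={}".format(lower, upper, work)
    # Writes made by this job land in `upper`; the shared lower directory is
    # only read. How each job gets its own view of the mount is outside this
    # sketch.
    return ["mount", "-t", "overlay", "overlay", "-o", options, lower]
```

Because every job writes to a different upper directory, concurrent dpkg-buildpackage runs no longer touch the same /var/lib/dpkg/updates files.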
lguohan
7e282b89a2 [build]: wait for conflicts package to be uninstalled (#5039)
When parallel build is enabled, both docker-fpm-frr and docker-syncd-brcm
are built at the same time. docker-fpm-frr requires swss, which requires
installing libsaivs-dev. docker-syncd-brcm requires the syncd package, which requires
installing libsaibcm-dev.

Since libsaivs-dev and libsaibcm-dev install the SAI header in the same
location, these two packages cannot be installed at the same time. Therefore,
we need to serialize the build between these two packages. Simply uninstalling
the conflicting package is not enough to solve this issue. The correct solution
is to have one package wait for the other package to be uninstalled.

For example, if syncd is built first, then it will install libsaibcm-dev.
Meanwhile, if the swss build job starts and tries to install libsaivs-dev,
it will first query whether libsaibcm-dev is installed. If it is
installed, then it will wait until libsaibcm-dev is uninstalled. After the syncd
job is finished, it will uninstall libsaibcm-dev and the swss build job will be
unblocked.

To solve this issue, _UNINSTALLS is introduced to uninstall a package that
is no longer needed and to allow a blocked job to continue.

Signed-off-by: Guohan Lu <lguohan@gmail.com>
2021-01-20 23:06:13 -08:00
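The wait-then-install behavior described in this commit can be modeled with a condition variable. This is an in-process stand-in, assuming threads in place of the separate build jobs; `PackageGate` and `demo` are hypothetical names, and the real build coordinates jobs through its own mechanism.

```python
import threading


class PackageGate:
    """Tracks installed packages and lets a job wait for a conflict to clear."""

    def __init__(self):
        self._installed = set()
        self._cond = threading.Condition()

    def install(self, package, conflicts=()):
        with self._cond:
            # Block while any conflicting package is still installed.
            self._cond.wait_for(
                lambda: not self._installed.intersection(conflicts))
            self._installed.add(package)

    def uninstall(self, package):
        with self._cond:
            self._installed.discard(package)
            # Wake every waiting job so it can re-check its conflicts.
            self._cond.notify_all()


def demo():
    gate = PackageGate()
    order = []
    gate.install("libsaibcm-dev")  # the syncd job starts first
    order.append("syncd installed libsaibcm-dev")

    def swss_job():
        gate.install("libsaivs-dev", conflicts={"libsaibcm-dev"})
        order.append("swss installed libsaivs-dev")

    worker = threading.Thread(target=swss_job)
    worker.start()
    order.append("syncd finished")
    gate.uninstall("libsaibcm-dev")  # unblocks the swss job
    worker.join()
    return order
```

In `demo()`, the swss thread cannot record its install until the syncd side has uninstalled libsaibcm-dev, which mirrors the serialization the commit message describes.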
lguohan
0510ecedcd [build]: change user name to lower case when used in sonic-slave tag (#6319)
sonic-slave tag only allows all lower case. In case the user
name is mixed case, we need to change user name to all lower case.

Signed-off-by: Guohan Lu <lguohan@gmail.com>
2021-01-19 09:57:57 -08:00
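As an illustration of the normalization in this commit, the sketch below lower-cases the user name before it becomes part of an image name. The `<base>-<user>` layout and both function names are assumptions for the example, not the exact naming used by the build.

```python
def slave_tag_user(user):
    """Lower-case a user name for use in a sonic-slave image tag."""
    return user.lower()


def slave_image_name(user, base="sonic-slave"):
    # The "<base>-<user>" layout is an assumption made for this example.
    return "{}-{}".format(base, slave_tag_user(user))
```

A mixed-case user such as `JohnDoe` would then map to `sonic-slave-johndoe`, which satisfies the all-lower-case rule mentioned above.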
Guohan Lu
74807be779 [submodule]: update sonic-swss
87e1a36 2020-04-16 | Do not set PG to Buffer porfile mapping again if already exist. (#1261) (HEAD -> 201811, origin/201811) [abdosi]

Signed-off-by: Guohan Lu <lguohan@gmail.com>
2021-01-16 21:59:48 -08:00
Volodymyr Samotiy
312a596e5f
[submodule] Update swss submodules (#6394)
To add updated PFC storm detection logic for Mellanox platforms

swss commits

a1b6e5e [pfcwd] Update PFC storm detection logic for Mellanox platforms (#1523)
4999565 [acl] Remove Ethertype from L3V6 qualifiers (#1433)

Signed-off-by: Volodymyr Samotiy <volodymyrs@nvidia.com>
2021-01-16 21:58:08 -08:00
lguohan
4e08f4dcc3
[docker-orchagent]: make build depends only on sairedis package (#6467)
backport c4b5b002c3

Make the swss build depend only on libsairedis instead of syncd. This allows swss to be built without depending
on the vendor SAI library.

Currently, the libsairedis build also builds syncd, which requires the vendor SAI lib. This makes it difficult to build
the swss docker in buster while still keeping the syncd docker in stretch, as swss requires libsairedis, which also
builds syncd and requires the vendor to provide SAI for buster. Since the swss docker does not actually contain the syncd
binary, it is not necessary to build syncd for the swss docker.

[submodule]: update sonic-sairedis

* 9a66890 2020-06-28 | [build]: add option to build without syncd (HEAD -> 201811, origin/201811) [Guohan Lu]

Signed-off-by: Guohan Lu <lguohan@gmail.com>
2021-01-16 13:12:14 -08:00
lguohan
bcda39f394
[sonic-linux-kernel]: update kernel to 4.9.246 (#6461)
kernel ABI from 4.9.0-12 -> 4.9.0-14

Signed-off-by: Guohan Lu <lguohan@gmail.com>
Co-authored-by: Samuel Angebault <angebault.samuel@gmail.com>
2021-01-16 12:33:23 -08:00
Renuka Manavalan
b2e3ba800e
[tacacs]: Restore from TACACS backup if present, upon load-minigraph during update-graph action. (#6407)
- Why I did it
During an upgrade, if the config is loaded from minigraph, the TACACS credentials are missed. This leads to the device losing remote user accessibility.

- How I did it
During update-graph, when the config is loaded from minigraph, look for a TACACS credentials backup and load it if available.

- How to verify it
Remove /etc/sonic/config-db.json, save TACACS credentials in /etc/sonic/tacacs.json, then do an image upgrade and boot into the new image. Verify that remote user access is available.

NOTE: This change is available in master via PR #6285
2021-01-11 13:57:20 -08:00
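The restore step described in this commit boils down to "if a backup file exists, merge it into the freshly loaded config". A minimal sketch, assuming the backup is a JSON object mapping table names to their contents; `load_tacacs_backup` is a hypothetical name and not the function used by the actual scripts.

```python
import json
import os


def load_tacacs_backup(config, backup_path="/etc/sonic/tacacs.json"):
    """Merge backed-up TACACS tables into `config` if the backup file exists.

    Returns True when a backup was applied. The assumption here is that the
    backup is a JSON object mapping table names to their contents.
    """
    if not os.path.isfile(backup_path):
        return False
    with open(backup_path) as handle:
        backup = json.load(handle)
    config.update(backup)
    return True
```

When no backup is present the function leaves the config untouched, so a device that never had TACACS configured is unaffected.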
Ying Xie
abdbda9435
[201811][bcm SAI] upgrade Broadcom SAI to version 3.5.3.6-2 (#6400)
- Rebase to Broadcom release 3.5.3.6.
- Taking fixes for: CS00011229318, CS00010775359, CS00011331832, CS00011444035, CS00011222060 and CS00010318905
- Taking CS00011581499 patch from Broadcom.

Signed-off-by: Ying Xie <ying.xie@microsoft.com>
2021-01-08 16:29:57 -08:00
Wirut Getbamrung
a416f49676
[platform/cel-haliburton]: add watchdog service (#6259)
Haliburton needs a watchdog daemon to monitor the basic health of the machine. If something goes wrong, such as a crashing program overloading the CPU or the system running out of free memory, watchdog can safely reboot the machine.
2020-12-26 03:04:21 -08:00
Volodymyr Samotiy
d609b406be
[Mellanox] Update SAI to version 1.14.3 (#6156)
* [SAI] Add PFC pause duration counters in microseconds
**- Why I did it**
To add PFC pause duration counters in microseconds
**- How I did it**
Updated SAI to version 1.14.3
**- How to verify it**

**- Description for the changelog**
[Mellanox] Update SAI to version 1.14.3
2020-12-14 23:51:14 -08:00
Shi Su
0574db2760
[201811][warm-reboot] Remove warmboot file path that overrides the default path (#6201)
This PR adds the changes in #6198 to 201811 branch to support warm-reboot image upgrade for kvm images.

The sai.profile file in kvm images overrides the warmboot file with the path /var/cache/sai_warmboot.bin. Since the directory /var/cache is not mounted in syncd, it is cleared during an image upgrade, so the warm-reboot image upgrade fails if the file is placed in that directory.

Remove the path that overrides the default path. The warmboot file path will then be the default value /var/warmboot/sai-warmboot.bin. Since /var/warmboot/ is mounted by /host/warmboot/ in the host, it could survive an image upgrade.
2020-12-13 22:48:29 -08:00
Shi Su
40bd77c915
[201811][syncd] Fix directory mount for vs syncd docker (#6200)
Since DOCKER_SYNCD_VS is no longer being used, the mount option does not properly mount the warmboot file directory. Fix the mount option so that the directory is properly mounted.
2020-12-13 22:42:02 -08:00
Sumukha Tumkur Vani
0eb8f773f4
Potential fix for Celestica E1031 device hang
set CPU max_cstate to 0
2020-12-04 12:39:47 -08:00
Ying Xie
93302d1810
[bcm SAI] Upgrade Broadcom SAI to version 3.5.3.5-3 (#5734)
- Include change to CS00011229318.

Signed-off-by: Ying Xie <ying.xie@microsoft.com>
2020-10-29 08:53:29 -07:00
Neetha John
6b96b8e4ac
[knet] Address Tx drop No DMA resource issue (#5727)
Signed-off-by: Neetha John <nejo@microsoft.com>
2020-10-28 08:29:16 -07:00
Aravind Mani
701a304b6d [Dell S6100] Properly release memory upon ICH driver deinit (#5561)
During platform deinitialization, dell_ich is not removed properly, so when the s6100 platform is initialized again, the ICH driver sysfs attributes are not attached. As a result, get_transceiver_change_event returns an error, which causes xcvrd to crash.
2020-10-14 18:48:45 +00:00
pavel-shirshov
6c2801b846 teamd: fix possible race in master ifname callback (#4109)
- What I did
Ported a fix from libteam master to our master.
Fixes #4070
Fixes #3649

- How I did it
Applied patch jpirko/libteam@c723737 from upstream.

- How to verify it
Build an image for your DUT and warm-reboot the DUT 10 times. Check that all PortChannels are up and that there are no error messages in teamd.log
2020-10-09 15:59:56 +00:00
Ying Xie
9ea38c417c [rc.local] separate configuration migration and grub installation logic (#5528)
To address issue #5525

Explicitly control the grub installation so that it runs only when it is needed.
There are scenarios where configuration migration happens but grub
installation is not required.

Signed-off-by: Ying Xie <ying.xie@microsoft.com>
2020-10-05 15:11:35 +00:00
Ying Xie
4c80d996ba
Revert "[201811][platform/cel]: Port fancontrol changes for dx010/e1031 to 201811 branch (#4867)" (#5496)
This reverts commit c9d86f0587.
2020-09-29 18:02:47 -07:00
Ying Xie
81e9ec6be6
Revert "[201811][platform-cel]: Fix dx010 FSC error (#4949)" (#5485)
This reverts commit ec07d10748.
2020-09-29 08:01:02 -07:00
Aravind Mani
7e6fa15784
Dell S6100 fix mux log issue (#5413)
IOM completion log was not seen in syslog.
2020-09-21 12:19:07 -07:00
Aravind Mani
bee516e370
Dell S6100- Fix PCA MUX attachment issue (#5401)
* Dell S6100- Fix PCA MUX attachment issue

* Update s6100_i2c_enumeration.sh

* Update s6100_i2c_enumeration.sh
2020-09-20 20:05:53 -07:00
Ying Xie
f041345e4e
[201811][bcm SAI] upgrade Broadcom SAI to 3.5.3.5-2 (#5405)
Including following Broadcom patches:
- CS00010869953, CS00010914668(KB29456), CS00010503275(KB0029315), CS00010914673(KB0029442)

Signed-off-by: Ying Xie <ying.xie@microsoft.com>
2020-09-18 14:54:54 -07:00
Wirut Getbamrung
ec07d10748
[201811][platform-cel]: Fix dx010 FSC error (#4949)
* [platform/cel-dx010]: add gpio init for fan direction

* [platform/cel-dx010]: remove invalid code on fancontrol service

* [platform/cel-dx010]: modify fancontrol service permission

* [platform/cel-dx010]: install fancontrol in pmon
2020-09-17 15:30:15 -07:00
Tamer Ahmed
b903c8e198 [dhcpmon] Print Both Snapshot And Current Counters (#5374)
Printing both snapshot and current counter sets will make it easier to pinpoint
which message type(s) is/are not being relayed. This PR prints both counter sets.
Also, this PR defines gnu11 as a C standard to compile with in order to avoid
making changes when porting to 201811 branch.

signed-off-by: Tamer Ahmed <tamer.ahmed@microsoft.com>
2020-09-16 09:57:36 -07:00
Tamer Ahmed
949bdee24e [dhcpmon] Monitor Mgmt Interface For DHCP Packets (#5317)
When BGP routes are missing, DHCP packets get relayed over mgmt
interface. This results in dhcpmon alerting that DHCP packets are
not being relayed. This PR includes the mgmt interface as an uplink
device, and so, if a DHCP packet gets relayed over the mgmt interface,
the regular dhcpmon alert will not be issued. Instead, dhcpmon will
check the mgmt interface counts and issue a separate alert regarding
packets travelling through mgmt network.

In addition, this PR includes the following enhancements:
1. Add SIGUSR1 handler that prints out current packet counts
2. Increase the alert grace window from the current 2 minutes to 3 minutes
3. Time is now computed more accurately
4. Print vlan name before counters

signed-off-by: Tamer Ahmed <tamer.ahmed@microsoft.com>
2020-09-16 09:57:36 -07:00
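The SIGUSR1 enhancement listed in this commit can be illustrated with a small Python handler that dumps the current counters on demand. dhcpmon itself is written in C; the counter names, the `Vlan1000` label and the function names here are assumptions for illustration only, and the sketch assumes a POSIX system where SIGUSR1 exists.

```python
import signal

# Hypothetical counters; the real daemon tracks its own message types.
packet_counts = {"discover": 0, "offer": 0, "request": 0, "ack": 0}
printed = []  # keeps emitted lines so they can be inspected later


def format_counts(vlan, counts):
    """Render the counters with the vlan name first, as in enhancement 4."""
    body = ", ".join("{}: {}".format(k, counts[k]) for k in sorted(counts))
    return "[{}] {}".format(vlan, body)


def on_sigusr1(signum, frame):
    """Signal handler: print the current packet counts."""
    line = format_counts("Vlan1000", packet_counts)
    printed.append(line)
    print(line)


# Register the handler so that `kill -USR1 <pid>` triggers a dump.
signal.signal(signal.SIGUSR1, on_sigusr1)
```

Sending SIGUSR1 to the running process then prints a snapshot of the counters without interrupting normal monitoring.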
Ying Xie
db1ef65102
[201811][swss-common] advance swss-common sub module head (#5369)
* [201811][swss-common] advance swss-common sub module head

- Fix SubscriberStateTable::hasCachedData formula for a timing risk (#379)

Signed-off-by: Ying Xie <ying.xie@microsoft.com>

* Fix build of the unit test of SubscriberStateTable (#383)
2020-09-15 09:10:19 -07:00
Ying Xie
6c4914b62c Revert "[dhcpmon] Monitor Mgmt Interface For DHCP Packets (#5317)"
This reverts commit 44d6e03df3.
2020-09-14 22:03:55 +00:00
Tamer Ahmed
44d6e03df3 [dhcpmon] Monitor Mgmt Interface For DHCP Packets (#5317)
When BGP routes are missing, DHCP packets get relayed over mgmt
interface. This results in dhcpmon alerting that DHCP packets are
not being relayed. This PR includes the mgmt interface as an uplink
device, and so, if a DHCP packet gets relayed over the mgmt interface,
the regular dhcpmon alert will not be issued. Instead, dhcpmon will
check the mgmt interface counts and issue a separate alert regarding
packets travelling through mgmt network.

In addition, this PR includes the following enhancements:
1. Add SIGUSR1 handler that prints out current packet counts
2. Increase the alert grace window from the current 2 minutes to 3 minutes
3. Time is now computed more accurately
4. Print vlan name before counters

signed-off-by: Tamer Ahmed <tamer.ahmed@microsoft.com>
2020-09-14 16:56:01 +00:00
Guohan Lu
083607f4d1 [submodule]: update sonic-utilities
* 4d69425 2020-09-12 | [utilities] Define Explicit Dependency On Ipaddress Package (#1113) (HEAD, origin/201811) [Guohan Lu]

Signed-off-by: Guohan Lu <lguohan@gmail.com>
2020-09-12 23:08:39 -07:00
Ying Xie
6597bd8dea
[201811][utilities] advance utilities sub module head (#5339)
- [filter-fdb] Call Filter FDB Main From Within Test Code #1051 and #1059 (#1086)

Signed-off-by: Ying Xie <ying.xie@microsoft.com>
2020-09-08 14:35:22 -07:00
Blueve
55d2d15e4e [conf] append nos-config-part for s6100 (#5234)
* [conf] append nos-config-part for s6100

* modify rc.local

Signed-off-by: Guohan Lu <lguohan@gmail.com>

* Update rc.local

Co-authored-by: Blueve <jika@microsoft.com>
Co-authored-by: Guohan Lu <lguohan@gmail.com>
Co-authored-by: Ying Xie <yxieca@users.noreply.github.com>
2020-09-08 19:30:33 +00:00
Ying Xie
6b75059b1d
[201811][kernel][utilities][sairedis] advance submodule heads (#5288)
- Kernel: [201811] Fix I2C ISMT DMA buffer alignment issue (#158)
- utilities: Fix pfcwd stats crash with invalid queue name (#1077)
- sairedis: [syncd] Fix notification on switch shutdown request (#638)

Signed-off-by: Ying Xie <ying.xie@microsoft.com>
2020-09-02 12:15:47 -07:00
zhenggen-xu
a99026acdc
[Build] pin down setuptools for build issues (#5280)
See: https://github.com/Azure/sonic-buildimage/issues/5279

Signed-off-by: Zhenggen Xu <zxu@linkedin.com>
2020-09-01 16:29:31 -07:00
Joe LeVeque
c909422abc [caclmgrd] Always restart service upon process termination (#5065)
2020-08-31 20:31:13 +00:00
Joe LeVeque
4547ea022d [caclmgrd] Improve code reuse (#4931)
Improve code reuse in `generate_block_ip2me_traffic_iptables_commands()` function.
2020-08-31 20:30:54 +00:00
Baptiste Covolato
c706a1079f [arista/aboot]: Zero out 1st MB before repartitioning (#5220)
The first partition starting point was changed to be 1M as part of this
commit: 6ba2f97f1e. On systems that are misaligned before conversion
(partition start is the first sector), the relic partition that is
left in the first MB can cause problems in Aboot and result in corruption
of the filesystem on the new aligned partition.

Zeroing this old relic makes sure that there is nothing left of the old
partition lying around. There won't be any risk of having Aboot corrupt
the new filesystem because of the old relic.

Signed-off-by: Baptiste Covolato <baptiste@arista.com>
2020-08-22 18:48:10 -07:00
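The zeroing step from this commit can be sketched as overwriting the first 1 MiB of a device or image file. This is an illustration only: `zero_first_mib` is a hypothetical name, the actual Aboot scripts may use a different tool, and running it against a real disk is destructive.

```python
import os

ONE_MIB = 1024 * 1024


def zero_first_mib(device_path, length=ONE_MIB):
    """Overwrite the first `length` bytes of an existing file or block device."""
    with open(device_path, "r+b") as device:
        device.write(b"\x00" * length)
        device.flush()
        # Make sure the zeros reach the underlying storage before returning.
        os.fsync(device.fileno())
```

Opening with `r+b` keeps the rest of the file intact, so only the region before the new 1M-aligned partition start is cleared.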
Santhosh Kumar T
a2cb92056a
Dell S6100 Port I2C changes to 201811 branch (#5150)
* Dell S6100 Port I2C changes to 201811 branch

* Update s6100_i2c_enumeration.sh
2020-08-18 14:38:28 -07:00
zhenggen-xu
e1e97199e3
[201811 Monit] Enable monitoring of SWSS daemons (#5144)
Signed-off-by: Zhenggen Xu <zxu@linkedin.com>
2020-08-13 20:42:06 -07:00