Commit Graph

4331 Commits

Author SHA1 Message Date
gechiang
6d9b05c032 Anchor the libprotobuf-dev version based on a fixed version by using debian control dependency (#6420) 2021-01-15 08:15:26 -08:00
Lawrence Lee
063a485982 [minigraph.py]: Force /32 prefix for mux cable server IPv4 loopbacks (#6418)
Server IPv4 loopbacks do not always arrive with /32 prefix, which is a requirement for the MUX_CABLE table in config DB

Signed-off-by: Lawrence Lee <lawlee@microsoft.com>
2021-01-12 06:23:13 -08:00
Qi Luo
b6de4943fc [sonic-slave]: Upgrade python lxml library version to 4.6.2 (#6404) 2021-01-12 06:23:06 -08:00
lguohan
45b724fe76 [build]: fix dpkg admindir corruption issue in parallel build (#6408)
Fix #119

when parallel build is enable, multiple dpkg-buildpackage
instances are running at the same time. /var/lib/dpkg is shared
by all instances and the /var/lib/dpkg/updates could be corrupted
and cause the build failure.

the fix is to use overlay fs to mount separate /var/lib/dpkg
for each dpkg-buildpackage instance so that they are not affecting
each other.

Signed-off-by: Guohan Lu <lguohan@gmail.com>
2021-01-12 06:22:51 -08:00
Lawrence Lee
264ecb181c [minigraph.py]: Add peer switch hostname to device metadata (#6405)
To make the peer switch hostname easily accessible from config DB. Add peer_switch field to DEVICE_METADATA table

Signed-off-by: Lawrence Lee <lawlee@microsoft.com>
2021-01-11 10:48:13 -08:00
Ying Xie
9da8b0faf9 [utilities] advance utilities submodule head (#6402)
- (HEAD, github/master) [storyteller] adding a grep wrapper with predefined scenarios (#1349)
- Adding global-timeout, individual command timeout, log files collection (#1249)
- Add FW dump with new SAI implementation (#1338)
- [unit test][pfcwd] Fix tests that require sudo access (#1340)

Signed-off-by: Ying Xie <ying.xie@microsoft.com>
2021-01-11 10:48:04 -08:00
guxianghong
2ae182623c [Centec ARM64]Upgrade Centec syncd docker to buster and Enable Telemetry on ARM64 (#6386)
* Enable telemetry for ARM64 by default

* [Centec]Upgrade Centec syncd docker to buster; libjemalloc2 have been installed in docker-base-buster, remove libjemalloc1 from docker-syncd-centec's Dockerfile.j2

Co-authored-by: Gu Xianghong <xgu@centecnetworks.com>
2021-01-09 08:29:36 -08:00
dependabot[bot]
3e8142fba0 Bump lxml from 4.6.1 to 4.6.2 in /src/sonic-config-engine (#6385)
Bumps [lxml](https://github.com/lxml/lxml) from 4.6.1 to 4.6.2.
- [Release notes](https://github.com/lxml/lxml/releases)
- [Changelog](https://github.com/lxml/lxml/blob/master/CHANGES.txt)
- [Commits](https://github.com/lxml/lxml/compare/lxml-4.6.1...lxml-4.6.2)

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2021-01-09 08:29:31 -08:00
Qi Luo
88ace50da8 [sonic-swss-common] Update submodule (#6382)
Includes sonic-swss-common commits:
```
71dc350 2021-01-07 | Lower the log level for outdated key for SubscriberStateTable notification (#441) [Qi Luo]
7e40582 2021-01-08 | Add boost dependencies (#442) [Ze Gan]
30a8ddf 2021-01-05 | Change DBConnector::hgetall return type from map to unordered_map (#440) [Qi Luo]
021108d 2021-01-02 | MCLAG Enhancements per HLD https://github.com/Azure/SONiC/pull/596 (#405) [Praveen-Brcm]
54996fc 2021-01-02 | Implement ConfigDBConnector and ConfigDBPipeConnector in C++ (#437) [Qi Luo]
8286525 2020-12-27 | Simply refactor DBConnector hgetall() [Qi Luo]
6d1d33b 2020-12-27 | Fix RedisTransactioner: handle empty deque [Qi Luo]
624e0b8 2020-12-26 | Move complex class constructor as explicit, and fix several mistaken copy constructor usage [Qi Luo]
3b983f9 2020-12-30 | [ci]: add timeout to 180 minutes for arm build (#439) [lguohan]
f2e4210 2020-12-29 | Add utility for string and redis (#434) [Ze Gan]
7a885fd 2020-12-29 | [build]: add build check for arm64 and armhf (#436) [lguohan]
47bccc4 2020-12-24 | Add missed vector header to rediscommand.h (#435) [Ze Gan]
```
2021-01-09 08:29:25 -08:00
pavel-shirshov
03391f20c5 [bgpcfgd]: Support default action for "Allow prefix" feature (#6370)
* Use 20 and 30 route-map entries instead of 2 and 3 for TSA

* Added support for dynamic "Allow list" default action.

Co-authored-by: Pavel Shirshov <pavel.contrib@gmail.com>
2021-01-09 08:29:19 -08:00
yozhao101
bfec282a82 [Monit] Monitoring the running status of containers. (#6251)
**- Why I did it**
This PR aims to monitor the running status of each container. Currently the auto-restart feature was enabled. If a critical process exited unexpected, the container will be restarted. If the container was restarted 3 times during 20 minutes, then it will not run anymore unless we cleared the flag using the command `sudo systemctl reset-failed <container_name>` manually. 

**- How I did it**
We will employ Monit to monitor a script. This script will generate the expected running container list and compare it with the current running containers. If there are containers which were expected to run but were not running, then an alerting message will be written into syslog.

**- How to verify it**
I tested this feature on a lab device `str-a7050-acs-3` which has single ASIC and `str2-n3164-acs-3` which has a Multi-ASIC. First I manually stopped a container by running the command `sudo systemctl stop <container_name>`, then I checked whether there was an alerting message in the syslog.

Signed-off-by: Yong Zhao <yozhao@microsoft.com>
2021-01-09 08:27:53 -08:00
Renuka Manavalan
1bdefd16fa Take a copy of existing TACACS credentials and restore it during upgrade (#6285)
In scenario where upgrade gets config from minigraph, it could miss tacacs credentials as they are not in minigraph. Hence restore explicitly upon load-minigraph, if present.

- Why I did it
Upon boot, when config migration is required, the switch could load config from minigraph. The config-load from minigraph would wipe off TACACS key and disable login via TACACS, which would disable all remote user access. This change, would re-configure the TACACS if there is a saved copy available.

- How I did it
When config is loaded from minigraph, look for a TACACS credentials back up (tacacs.json) under /etc/sonic/old_config. If present, load the credentials into running config, before config-save is called.

- How to verify it
Remove /etc/sonic/config_db.json and do an image update. Upon reboot, w/o this change, you would not be able ssh in as remote user. You may login as admin and check out, "show tacacs" & "show aaa" to verify that tacacs-key is missing and login is not enabled for tacacs.
With this change applied, remove /etc/sonic/config_db.json, but save tacacs & aaa credentials as tacacs.json in /etc/sonic/. Upon reboot, you should see remote user access possible.
2021-01-09 08:27:41 -08:00
Joe LeVeque
984c833e4c [system-health] Make run_command() Python 3-compliant (#6371)
Pass universal_newlines=True parameter to subprocess.Popen(); no longer use .encode('utf-8') on resulting stdout.
This was missed in #5886

Note: I would prefer to use text=True instead of universal_newlines=True, as the former is an alias only available in Python 3 and is more understandable than the latter. However, Even though the setup.py file for this package only specifies Python 3, the LGTM tool finds other Python 2 code in the repo and validates the code as Python 2 code and alerts that text=True is an invalid parameter. Will stick with universal_newlines=True for now. Once all Python code in the repo has been converted to Python 3, I will change all universal_newlines=True to text=True.
2021-01-09 08:27:25 -08:00
gechiang
7462850ba4 [brcm]: BRCM SAI 4.2.1.5-9 Fix _brcm_sai_indexed_data_get () with unexpected queue causing _brcm_sai_switch_assert () after warm reboot (#6374)
Starting from build (master) 176 the warm reboot on BRCM Platform started to experience syncd crash. Upon further debug by Ying it was determined that the crash was related to the following new change:
[Dynamic buffer calc] Support dynamic buffer calculation (#1338)

Ying also debugged further and found The crash was caused by buffer pool profile setting operation SAI_BUFFER_PROFILE_ATTR_SHARED_DYNAMIC_TH

A case has filed with BRCM while a potential fix was tried by Ying that seems to have addressed this issue and we are making this change available in master branch so that it will allow further feature validation/testing especially in the warm reboot area.
Once an official fix is provided by BRCM, we will then remove this in house fix and apply the official fix.

- How to verify it
Just perform warm reboot with any master code 175 or above you should see this issue or issue the following cmd will also cause the crash: "mmuconfig -p egress_lossy_profile -a 0"
2021-01-09 08:27:16 -08:00
abdosi
be82cdbad5 Updated imfile configuration for supervisord logs (#6368)
Updated imfile configuration for supervisord logs for stretch and buster.
2021-01-09 08:27:08 -08:00
carl-nokia
941c27ce2a [Nokia]: Enable Telemetry for armhf and provide required qos files (#6364)
* [platform][Nokia]: Add buffers and qos files for config qos reload

   - providing required files

* [platform][armhf]: remove hardcoded disable for Telemetry on armhf

Co-authored-by: Carl Keene <keene@nokia.com>
2021-01-09 08:26:52 -08:00
Joe LeVeque
0182c5f5ca [sonic-platform-common][sonic-platform-daemons] Update submodules (#6352)
src/sonic-platform-common 9935fca...8664efc (2):

Make sonic_sfp Python2 and Python3 compatible (#157)
[sffbase.py] Fix to make Python 3-compatible (#156)

src/sonic-platform-daemons e6c786b...81318f7 (1):

[psud] Fix issue where PSU Fan info is not updated in State DB (#137)

Fixes #6341
2021-01-09 08:26:30 -08:00
sudhanshukumar22
c111a68e74 [docker-lldp]: sonic advertise meaningful SysDescription instead of debian (#6114)
Sonic devices advertise meaningful system description along with Debian package information.

before the fix:

-------------
admin@sonic:~$ show lldp neighbors
-------------------------------------------------------------------------------
LLDP neighbors:
-------------------------------------------------------------------------------
Interface: Ethernet0, via: LLDP, RID: 3, Time: 0 day, 16:36:30
SysName: sonic
SysDescr: Debian GNU/Linux 9 (stretch) Linux 4.9.0-11-2-amd64 #1 SMP Debian 4.9.189-3+deb9u2 (2019-11-11) x86_64
-------------------------------------------------------------------------------

After the fix:

root@sonic:~# show lldp neighbors Ethernet16
-------------------------------------------------------------------------------
LLDP neighbors:
-------------------------------------------------------------------------------
Interface:    Ethernet16, via: LLDP, RID: 10, Time: 0 day, 00:01:00
    SysName:      sonic
    SysDescr:     SONiC Software Version: SONiC.sonic_upstream_1.0_daily_201130_1501_62-dirty-20201130.203529 - HwSku: Accton-AS7816-64X - Distribution: Debian 10.6 - Kernel: 4.19.0-9-2-amd64
-------------------------------------------------------------------------------

Signed-off-by: sudhanshukumar22 <sudhanshu.kumar@broadcom.com>
2021-01-09 08:26:21 -08:00
Aravind Mani
7305b55c80 DellEMC: Z9332f change SFP detection logic (#6261)
- Dynamically change EEPROM driver based on media type.
- Otherwise, EEPROM INFO and DOM INFO might not be fetched properly and will result in erroneous output.
2021-01-09 08:25:58 -08:00
Samuel Angebault
c6f14c9927
[202012][Arista] Update driver submodules (#6397)
- Cleanup and Refactor of library internals, logic mostly unchanged.
 - Enhance debugability with `arista dump` and `arista diag` commands.
 - Fix power supply detection issue.
2021-01-09 08:05:54 -08:00
Ying Xie
a37aa08504 [sairedis] advance sairedis submodule head (#6365)
To incldue following changes:

- [ci]: add build for arm64 and armhf (#757)
- Use template hgetall, because we will tune the return types of library functions (#759)
- [syncd] Fix bulk multi attrs for same key db update (#761)

Signed-off-by: Ying Xie <ying.xie@microsoft.com>
2021-01-06 06:19:58 -08:00
abdosi
dd25f774b5 [rsyslog]: Explicitly set the notify mode for rsyslog imfile module (#6351)
Enable the notify mode of rsyslogd imfile module used for supervisord
logs in docker container. 

Setup the mode="inotify" when loading imfile, made sure we are are getting 
supervisord logs in host immediately.

Signed-off-by: Abhishek Dosi <abdosi@microsoft.com>
2021-01-06 06:19:50 -08:00
Vaibhav Hemant Dixit
5becc2ac7e [determine-reboot-cause] Ignore non-hardware reboot cause (#6349)
- Why I did it - Reboot cause prints "Non-Hardware (N/A)" instead of showing the software reboot cause.
The issue is mishandling of hardware reboot cause in determine-reboot-cause script.

- How I did it
Fixed the handling for Non Hardware reboot cause. Ignore if Non-Hardware is present in the hardware_reboot_cause output. Added some code refactoring for simplicity.

- How to verify it - With fix, the hardware reboot cause is ignored (if it is non hw):
2021-01-06 06:19:41 -08:00
Junchao-Mellanox
d00dc06945 [xcvrd] Remove dependency on SONIC_PLATFORM_API_PY2 and SONIC_PLATFORM_API_PY3 (#6344)
Remove the build time dependency on SONIC_PLATFORM_API_PY2 and SONIC_PLATFORM_API_PY3 from xcvrd make rule
2021-01-06 06:19:32 -08:00
Akhilesh Samineni
46c2bf0ed4 After first bootup, the FEATURE table is not present in CONFIG_DB (#5911)
Fix the After first bootup(onie-install), the FEATURE table is not present in CONFIG_DB. 
Fix is done by calling config reload.
2021-01-06 06:19:21 -08:00
xumia
a79eb88052 Install the latest version of the sonic build hooks in slave container (#6348) 2021-01-06 06:18:27 -08:00
sudhanshukumar22
d4f3318337 [lldp] lldp upstream patches (#6118)
The details are as follows:

    1. 0010-Ported-fix-for-length-exceeded-from-lldp-community.patch

    Patch taken from 78243478dc

    lib: remove limit on system description length

    The limit was introduced in 9c49ced while fixing a memory leak.
    The state data is used to ensure we don't interleave operations. We
    need to handle the case where the value is truncated because it is
    larger than the allocated size.

    Fix issue https://github.com/lldpd/lldpd/issues/408

    2. 0011-fix-med-location-len.patch
    
    Patch taken from 5c3479463a

    lib: fix LLDP-MED location parsing in liblldpctl

    Some bounds were not checked correctly when parsing LLDP-MED civic
    location fields. This triggers out-of-bound reads (no write) in
    lldpcli, ultimately leading to a crash.

    Fix https://github.com/lldpd/lldpd/pull/420

Signed-off-by: sudhanshukumar22 <sudhanshu.kumar@broadcom.com>
2021-01-04 23:25:21 -08:00
lguohan
dc94373a86 [docker-sonic-vs]: reduce the build steps for docker-sonic-vs (#6350)
combine multiple same operation into one operation to reduce
the build steps. this is to avoid max depth exceeded issue
in the build.

Signed-off-by: Guohan Lu <lguohan@gmail.com>
2021-01-04 23:25:15 -08:00
Arun Saravanan Balachandran
1b049421aa [DellEMC] Add platform-modules as prerequisite for determine-reboot-cause (#6322)
Add a systemd dependency to make platform-modules service as a prerequisite for determine-reboot-cause service to ensure platform initialization is complete before determine-reboot-cause.service executes.
2021-01-04 23:25:07 -08:00
Myron Sosyak
f7bb635be8 [BFN] Convert platform modules to python 3 (#6347)
Fix syntax errors during xcvrd start with Python 3 daemons
2021-01-04 23:24:19 -08:00
Sabareesh-Kumar-Anandan
08ab27506b [libyang1] Adding LFS support for arm32 (#6346)
In the emulated armhf environment, the function readdir()returns NULL on a ext4 file system directory. When running the libyang1 test cases, it will require to load the plugins from the files (such as metadata.so), because the readdir() is failing, the plugins can’t be loaded in the emulated armhf environment, so it causes libyang1 test error. This error is a combination of the following reasons.
• Emulation of a 32-bit target from a 64-bit host –> qemu from x86_64 to armhf
• Glibc version > 2.27 – Debian buster is using glibc 2.28

- How I did it
Enabled large file support by setting _FILE_OFFSET_BITS=64 for libyang1.

Signed-off-by: Sabareesh Kumar Anandan <sanandan@marvell.com>
2021-01-04 23:24:08 -08:00
Denys Petryshyn
42fa096883 [BFN] Upgrade docker-syncd-bfn to buster (#6345)
* Add changes to allow migration of bfn syncd to buster

* Update BFN packages for Debian 10

Signed-off-by: Denys Petryshyn <denysx.petryshyn@intel.com>
2021-01-04 23:23:46 -08:00
lguohan
711b7f8079 [frr]: change frr debug package to extra to avoid build break for dbg image (#6340)
build frr dbg image force to install frr in the build process
which breaks the current build and is uneccessary.

Signed-off-by: Guohan Lu <lguohan@gmail.com>
2021-01-04 23:23:28 -08:00
lguohan
07b9282456 [broadcom]: match the brcm sai filename version to control file version (#6339)
the control file version is 4.2.1.5-7

sonic$ sudo dpkg -i target/debs/buster/libsaibcm-dev_4.2.1.5-8_amd64.deb
(Reading database ... 175880 files and directories currently installed.)
Preparing to unpack .../libsaibcm-dev_4.2.1.5-8_amd64.deb ...
Unpacking libsaibcm-dev (4.2.1.5-7) over (4.2.1.5-7) ...
Setting up libsaibcm-dev (4.2.1.5-7) ...
lgh@491d842369cf:/sonic$ sudo dpkg -i target/debs/buster/libsaibcm_4.2.1.5-8_amd64.deb
(Reading database ... 175880 files and directories currently installed.)
Preparing to unpack .../libsaibcm_4.2.1.5-8_amd64.deb ...
Unpacking libsaibcm (4.2.1.5-7) over (4.2.1.5-7) ...
Setting up libsaibcm (4.2.1.5-7) ...
Processing triggers for libc-bin (2.28-10) ...

Signed-off-by: Guohan Lu <lguohan@gmail.com>
2021-01-03 08:47:41 -08:00
Mahesh Maddikayala
25b0f43d9f [sonic-py-common] Added an API to get file path containing SONiC version (#6309)
* [sonic-py-common] add an API to get file path containing SONiC version so that the API can be mocked for unit tests.
2021-01-03 08:47:34 -08:00
xumia
0ac81e3c96 Fix the hostimage version path permission issue (#6337) 2021-01-03 06:04:05 -08:00
Guohan Lu
9b2274a4f1 [build]: add artifical dependency between libyang and frr
frr build requires libyang 1.0.184 which conflicts with
libyang 1.0.73. Solution here is to compile frr and libyang 1.0.184
first, and then uninstall libyang 1.0.184 after frr build.
Then, compile libyang 1.0.73 and all packages depend on it later.

Signed-off-by: Guohan Lu <lguohan@gmail.com>
2021-01-03 06:04:00 -08:00
Guohan Lu
91bf30422c [frr]: remove dependency betwee frr and frr-snmp
no instalation of frr during the build process

Signed-off-by: Guohan Lu <lguohan@gmail.com>
2021-01-03 06:03:56 -08:00
Guohan Lu
68086fc297 [build]: check if package with given version is installed or not
in case conflicting packages have same name but different version,
the install step needs to wait till package with the conflicting
version is uninstalled

Signed-off-by: Guohan Lu <lguohan@gmail.com>
2021-01-03 06:03:52 -08:00
Guohan Lu
83781b9459 [build]: fix dpkg uninstall bug
fix a bug when there are multiple debian packages to be uninstalled

Signed-off-by: Guohan Lu <lguohan@gmail.com>
2021-01-03 06:03:47 -08:00
Kebo Liu
3acf7006ed
[mellanox]: Update Mellanox SDK to 4.4.2208 FW to *.2008.2208 (#6333)
Features:
    Spectrum-3 | Systems | Added GA-level support for SN4700 A0 system
    All | Shared headroom | Added GA-level support for Shared headroom between PGs

  Bugs fixes:
    All | Counters | Sent traffic in certain size is wrongly increase to a smaller size counter, because port extended counter has a counter for sent traffic per packet-size range
    All | Shared buffer | Configuring shared buffer on the fly may, on occasion, cause the chip to get stuck
    Spectrum-2 | Modules | On occasion, link down is experienced with INPHI COLORZ PAM4 100G optic cables on SN3700 systems
2020-12-31 17:44:02 -08:00
Tony Titus
70ba59109d
[sonic-swss-common] Update submodule (#6317)
Including commits in sonic-swss-common repo:

b423b9c Add support for hexists call (#432) [Tony Titus]
0982996 Remove extension of tableNameSeparatorMap (#430) [Qi Luo]
d16cc76 [build]: add azure pipeline build badge (#429) [lguohan]
f2aaf55 Set up CI with Azure Pipelines (#428) [lguohan]
2020-12-31 17:43:09 -08:00
abdosi
ef0088c29f
Enable the notify mode of rsyslogd imfile module used for supervisord (#6298)
Enable the notify mode of rsyslogd imfile module used for supervisord logs in docker container
2020-12-31 17:01:57 -08:00
Guohan Lu
a165e632e3 [build]: fix syntax error when DOCKER_BASE_PULL is enabled
Signed-off-by: Guohan Lu <lguohan@gmail.com>
2020-12-31 00:46:21 -08:00
Vaibhav Hemant Dixit
40f69f0aac
[determine-reboot-cause] Skip invoking platform code for unit tests for hardware_reboot_cause (#6325)
What: Modify unit test to not call any platform dependent api in test_find_hardware_reboot_cause.

- Why I did it
MELLANOX build is failing for the recent PRs. The errors are due to platform library being invoked in a unit test for determine-reboot-cause script.

Verified by running unit tests and a successful Mellanox build.

Co-authored-by: Vaibhav Hemant Dixit <vadixit@microsoft.com>
2020-12-30 17:48:56 -08:00
kktheballer
ba92a081ce
Minigraph ECMP parsing support (cleaner format) (#4985)
Why I did it
To support FG_ECMP  scenarios
- How I did it
Modified minigraph parser to parse ECMP fields in the case they are present in minigraph
- How to verify it
Loaded ensuing config_db file on a DUT to verify the fields are parsed and configure device correctly
2020-12-30 15:18:21 -08:00
carl-nokia
a1fe203788
[Nokia]: EEPROM platform API Python3 compliance changes (#6318)
- Why I did it
Make EEPROM platform APIs Python3 compliant in Nokia platform.

- How I did it
Handle bytearray type returned by read_eeprom and read_eeprom_bytes methods.

- How to verify it
Boot Nokia ixs7215 and verify PMON docker running and show platform syseeprom

Co-authored-by: Carl Keene <keene@nokia.com>
2020-12-30 07:34:57 -08:00
Danny Allen
a64994ec29
[sysctl] Increase hung_task_timeout_secs to 300 (#6312)
Depending on the performance characteristics of a given hardware platform, it's possible to exceed the default 120 second kernel timeout during I/O intensive operations like image installation. This can cause a kernel panic like so:

kernel:[ 852.441781] Kernel panic - not syncing: hung_task: blocked tasks

If this happens during image installation, it's possible for the install to become corrupted and leave the device in an unreachable state that requires a power cycle to resolve. This risk increases as image size continues to increase. So, we need to increase the timeout so that we don't encounter kernel panics on devices with lower disk throughput.

Signed-off-by: Danny Allen <daall@microsoft.com>
2020-12-30 05:00:16 -08:00
Nazarii Hnydyn
119fd7f577
[buildsystem] Fix syntax error: unexpected end of file in Makefile.work (#6315)
Signed-off-by: Nazarii Hnydyn <nazariig@nvidia.com>
2020-12-30 04:59:16 -08:00
lguohan
de4a3c8f2f
[build]: change user name to lower case when used in sonic-slave tag (#6319)
sonic-slave tag only allows all lower case. In case the user
name is mixed case, we need to change user name to all lower case.

Signed-off-by: Guohan Lu <lguohan@gmail.com>
2020-12-30 04:58:20 -08:00