Commit Graph

124 Commits

Author SHA1 Message Date
abdosi
dddf96933c
[monit] Adding patch to enhance syslog error message generation for monit alert action when status is failed. (#5720)
Why/How I did:

Make sure first error syslog is triggered based on FAULT TOLERANCE condition.

Added support of repeat clause with alert action. This is used as trigger
for generation of periodic syslog error messages if error is persistent

Updated the monit conf files with repeat every x cycles for the alert action
2020-10-31 17:29:49 -07:00
Samuel Angebault
12911ba619
[Arista] Update arista driver submodules (#5736)
- Change `/run/arista` mount to pmon by `/var/run/platform_cache`
 - Python3 by default for Arista platform initialisation
 - Fix outstanding py2/3 compatibility issues (eeprom mostly)
 - Use pytest for unit testing
 - Miscellaneous modular fixes
2020-10-30 04:17:30 -07:00
Samuel Angebault
5bfe37ca42
[Arista] Update driver submodules (#5686)
- Enable thermalctld support for our platforms
 - Fix Chassis.get_num_sfp which had an off by one
 - Implement read_eeprom and write_eeprom in SfpBase
 - Refactor of Psus and PsuSlots. Psus they are now detected and metadata reported
 - Improvements to modular support

Co-authored-by: Zhi Yuan Carl Zhao <zyzhao@arista.com>
2020-10-23 12:28:36 -07:00
Petro Bratash
7895513813
[BFN] Rename variable in Montara platform debian/rules (#5675)
Platforms .deb-package is not installed into the right directory, because variable PLATFORM was defined previously and variable in the makefile is ignored.
https://www.gnu.org/software/make/manual/html_node/Overriding.html
2020-10-22 10:34:18 -07:00
Joe LeVeque
8011edc307
[platform] Remove references to deprecated get_serial_number() method in Chassis class (#5649)
The `get_serial_number()` method in the ChassisBase and ModuleBase classes was redundant, as the `get_serial()` method is inherited from the DeviceBase class. This method was removed from the base classes in sonic-platform-common and the submodule was updated in https://github.com/Azure/sonic-buildimage/pull/5625.

This PR aligns the existing vendor platform API implementations to remove the `get_serial_number()` methods and ensure the `get_serial()` methods are implemented, if they weren't previously.

Note that this PR does not modify the Dell platform API implementations, as this will be handled as part of https://github.com/Azure/sonic-buildimage/pull/5609
2020-10-17 22:00:14 -07:00
Volodymyr Boiko
8b135afb52
[barefoot][platform] Platform API fixups (#5613)
Signed-off-by: Volodymyr Boyko <volodymyrx.boiko@intel.com>
2020-10-14 11:35:36 -07:00
Andriy Kokhan
8e0e316cf8
[BFN] Updated SDK packages to 20201009 (#5576)
Signed-off-by: Andriy Kokhan <akokhan@barefootnetworks.com>
2020-10-10 20:20:41 -07:00
Volodymyr Boiko
f61ff95e26
[barefoot] Platform API 2.0 fixups (#5539)
Fixes for bfn platform api

Signed-off-by: Volodymyr Boyko <volodymyrx.boiko@intel.com>
2020-10-05 10:50:03 -07:00
Volodymyr Boiko
dbea3bbfd7
BFN platform API 2.0 support (#4766)
Added barefoot platform api 2.0 support

Signed-off-by: Volodymyr Boyko <volodymyrx.boiko@intel.com>
2020-10-03 13:46:21 -07:00
yozhao101
13cec4c486
[Monit] Unmonitor the processes in containers which are disabled. (#5153)
We want to let Monit to unmonitor the processes in containers which are disabled in `FEATURE` table such that
Monit will not generate false alerting messages into the syslog.

Signed-off-by: Yong Zhao <yozhao@microsoft.com>
2020-09-25 00:28:28 -07:00
Vitaliy Senchyshyn
3397e41aa8
[barefoot][platform] Update BFN platforms (#5356)
1. Added support of BFN newport new platform name.
2. Updated debian version for montara and mavericks platforms.
2020-09-16 10:34:49 -07:00
Samuel Angebault
0b4191fe2a
[Arista] Updating driver submodules (#5352)
- Merge chassis codebase upstream
 - Add support for Otterlake supervisor
 - Add support for NorthFace and Camp chassis
 - Add support for Eldridge, Dragonfly and Brooks fabrics
 - Add support for Clearwater2 and Clearwater2Ms linecards
 - Add new arista Cli to power on/off cards
 - Add new arista show Cli to inspect supervisor, chassis, fabrics and linecards
2020-09-10 01:34:38 -07:00
Samuel Angebault
41dcdc6f49
[Arista] Update driver submodules (#5296)
- Finish the refactor of the smbus backend (huge performance enhencement)
 - Split scd-hwmon kernel module in multiple source files
 - Add new scd-uart driver to manage linecard consoles within a chassis
 - Bootstrap of thermal platform API implementation
 - Fix psud led error message
 - Fix fan led color on Smartsville
2020-09-09 11:41:35 -07:00
Joe LeVeque
5b3b4804ad
[dockers][supervisor] Increase event buffer size for dependent-startup (#5247)
When stopping the swss, pmon or bgp containers, log messages like the following can be seen:

```
Aug 23 22:50:43.789760 sonic-dut INFO swss#supervisord 2020-08-23 22:50:10,061 ERRO pool dependent-startup event buffer overflowed, discarding event 34
Aug 23 22:50:43.789760 sonic-dut INFO swss#supervisord 2020-08-23 22:50:10,063 ERRO pool dependent-startup event buffer overflowed, discarding event 35
Aug 23 22:50:43.789760 sonic-dut INFO swss#supervisord 2020-08-23 22:50:10,064 ERRO pool dependent-startup event buffer overflowed, discarding event 36
Aug 23 22:50:43.789760 sonic-dut INFO swss#supervisord 2020-08-23 22:50:10,066 ERRO pool dependent-startup event buffer overflowed, discarding event 37
```

This is due to the number of programs in the container managed by supervisor, all generating events at the same time. The default event queue buffer size in supervisor is 10. This patch increases that value in all containers in order to eliminate these errors. As more programs are added to the containers, we may need to further adjust these values. I increased all buffer sizes to 25 except for containers with more programs or templated supervisor.conf files which allow for a variable number of programs. In these cases I increased the buffer size to 50. One final exception is the swss container, where the buffer fills up to ~50, so I increased this buffer to 100.

Resolves https://github.com/Azure/sonic-buildimage/issues/5241
2020-09-08 23:36:38 -07:00
Samuel Angebault
6f4ef03b29
[arista] Update driver submodules (#5147)
- fix watchdog timeout units
- fix import path for thermal_manager
- remove arista bind mounts for docker-snmp
- improve arista bind mounts for pmon
2020-08-21 10:38:35 -07:00
Petro Bratash
52b349442d
[BFN] Add support pcied daemon for Montara and Newport (#5199)
Signed-off-by: Petro Bratash <petrox.bratash@intel.com>
2020-08-18 14:32:48 -07:00
Samuel Angebault
5a3d504e61
[arista] Update arista driver submodules (#5116)
- Fix some import issues with sonic_platform
 - Fix for xcvr with newer calls in platform API
 - Miscellaneous improvements
2020-08-07 19:55:51 -07:00
lguohan
e1ac3cfc6a
[build]: wait for conflicts package to be uninstalled (#5039)
when parallel build is enabled, both docker-fpm-frr and docker-syncd-brcm
is built at the same time, docker-fpm-frr requires swss which requires to
install libsaivs-dev. docker-syncd-brcm requires syncd package which requires
to install libsaibcm-dev.

since libsaivs-dev and libsaibcm-dev install the sai header in the same
location, these two packages cannot be installed at the same time. Therefore,
we need to serialize the build between these two packages. Simply uninstall
the conflict package is not enough to solve this issue. The correct solution
is to have one package wait for another package to be uninstalled.

For example, if syncd is built first, then it will install libsaibcm-dev.
Meanwhile, if the swss build job starts and tries to install libsaivs-dev,
it will first try to query if libsaibcm-dev is installed or not. if it is
installed, then it will wait until libsaibcm-dev is uninstalled. After syncd
job is finished, it will uninstall libsaibcm-dev and swss build job will be
unblocked.

To solve this issue, _UNINSTALLS is introduced to uninstall a package that
is no longer needed and to allow blocked job to continue.

Signed-off-by: Guohan Lu <lguohan@gmail.com>
2020-07-27 10:46:20 -07:00
Andriy Kokhan
c4748e8fb7
[bfn] Updated SDK packages to SAI v1.6.4 (#5021)
* Update SDK packages to SAI 1.6.4
* Add SAI_SWITCH_TYPE_NPU attribute support

Signed-off-by: Andriy Kokhan <akokhan@barefootnetworks.com>
2020-07-23 12:47:02 -07:00
Volodymyr Boiko
39941f0ae2 [BFN] Update SAI and platform packages to 20200710 (#4942)
Barefoot, updated SAI and platform packages to 20200710

Signed-off-by: Volodymyr Boyko <volodymyrx.boiko@intel.com>
2020-07-12 18:08:52 +00:00
Samuel Angebault
c213bcf23a [arista]: Update arista driver submodules (#4922)
- Add more reboot cause reporting
 - Fix backward compatibility issue with older reboot cause format
 - Miscellaneous improvements
2020-07-12 18:08:52 +00:00
lguohan
1dcf8ec04f [kernel]: upgrade linux kernel to 4.9.118 (#4897)
upgrade kernel to latest maintenance version 4.9.118

Signed-off-by: Guohan Lu <lguohan@gmail.com>
2020-07-12 18:08:51 +00:00
Volodymyr Boiko
15748a50ae [barefoot][SAI v1.6.3] Update SAI and platform packages to 20200701 (#4890)
Signed-off-by: Volodymyr Boyko <volodymyrx.boiko@intel.com>
2020-07-12 18:08:51 +00:00
lguohan
58632e6e83 [docker-orchagent]: make build depends only on sairedis package (#4880)
make swss build depends only on libsairedis instead of syncd. This allows to build swss without depending
on vendor sai library.

Currently, libsairedis build also buils syncd which requires vendor SAI lib. This makes difficult to build
swss docker in buster while still keeping syncd docker in stretch, as swss requires libsairedis which also
build syncd and requires vendor to provide SAI for buster. As swss docker does not really contain syncd
binary, so it is not necessary to build syncd for swss docker.

* [submodule]: update sonic-sairedis

* ccbb3bc 2020-06-28 | add option to build without syncd (HEAD, origin/master, origin/HEAD) [Guohan Lu]
* 4247481 2020-06-28 | install saidiscovery into syncd package [Guohan Lu]
* 61b8e8e 2020-06-26 | Revert "sonic-sairedis: Add support to sonic-sairedis for gearbox phys (#624)" (#630) [Danny Allen]
* 85e543c 2020-06-26 | add a README to tests directory to describe how to run 'make check' (#629) [Syd Logan]
* 2772f15 2020-06-26 | sonic-sairedis: Add support to sonic-sairedis for gearbox phys (#624) [Syd Logan]

Signed-off-by: Guohan Lu <lguohan@gmail.com>
2020-07-12 18:08:51 +00:00
yozhao101
4fa81b4f8d
[dockers] Update critical_processes file syntax (#4831)
**- Why I did it**
Initially, the critical_processes file contains either the name of critical process or the name of group.
For example, the critical_processes file in the dhcp_relay container contains a single group name
`isc-dhcp-relay`. When testing the autorestart feature of each container, we need get all the critical
processes and test whether a  container can be restarted correctly if one of its critical processes is
killed. However, it will be difficult to differentiate whether the names in the critical_processes file are
the critical processes or group names. At the same time, changing the syntax in this file will separate the individual process from the groups and also makes it clear to the user.

Right now the critical_processes file contains two different kind of entries. One is "program:xxx" which indicates a critical process. Another is "group:xxx" which indicates a group of critical processes
managed by supervisord using the name "xxx". At the same time, I also updated the logic to
parse the file critical_processes in supervisor-proc-event-listener script.

**- How to verify it**
We can first enable the autorestart feature of a specified container for example `dhcp_relay` by running the comman `sudo config container feature autorestart dhcp_relay enabled` on DUT. Then we can select a critical process from the command `docker top dhcp_relay` and use the command `sudo kill -SIGKILL <pid>` to kill that critical process. Final step is to check whether the container is restarted correctly or not.
2020-06-25 21:18:21 -07:00
yozhao101
b8ad0ed4e4
[Monit] Use the string "/usr/bin/syncd\s" to monitor the syncd process (#4706)
**- Why I did it**
After discussed with Joe, we use the string "/usr/bin/syncd\s" in Monit configuration file to monitor 
syncd process on Broadcom and Mellanox. Due to my careless, I did not find this bug during the 
previous testing. If we use the string "/usr/bin/syncd" in Monit configuration file to monitor the 
syncd process, Monit will not detect whether syncd process is running or not. 

If we ran the command  `sudo monit procmactch “/usr/bin/syncd”` on Broadcom, there will be three 
processes in syncd container which matched this "/usr/bin/syncd": `/bin/bash /usr/bin/syncd.sh
wait`, `/usr/bin/dsserve /usr/bin/syncd –diag -u -p /etc/sai.d/sai.profile` and `/usr/bin/syncd –diag -
u -p /etc/sai.d/said.profile`. Monit will select the processes with the highest uptime (at there 
`/bin/bash /usr/bin/syncd.sh wait`) to match and did not select `/usr/bin/syncd –diag -u -p
/etc/sai.d/said.profile` to match. 

Similarly, On Mellanox Monit will also select the process with the highest uptime (at there 
`/bin/bash /usr/bin/syncd.sh wait`) to match and did not select `/usr/bin/syncd –diag -u -p
/etc/sai.d/said.profile` to match.

That is why Monit is unable to detect whether syncd process is running or not if we use the string “/usr/bin/syncd” in Monit configuration file. If we use the string "/usr/bin/syncd\s" in Monit configuration file, Monit can filter out the process `/bin/bash /usr/bin/syncd.sh wait` and thus can correctly monitor the syncd process.

**- How I did it**

**- How to verify it**

Signed-off-by: Yong Zhao <yozhao@microsoft.com>
2020-06-25 17:03:14 -07:00
Olivier Singla
68576bc2f9
[kerne]: kernel update from 4.19.0-6 to 4.19.0-6-2 (#4711) 2020-06-21 06:41:23 -07:00
Volodymyr Boiko
603b2955e6
[BFN] Update SAI and platform packages to 20200618 (#4817)
Signed-off-by: Volodymyr Boyko <volodymyrx.boiko@intel.com>
2020-06-19 11:00:44 -07:00
Andriy Kokhan
6acd64d005
[BFN] Updated SDK packages to SAI v1.6.1 (#4744)
Signed-off-by: Andriy Kokhan <akokhan@barefootnetworks.com>
2020-06-11 23:24:49 -07:00
YaoTien
e2ebe99665
[devices]: Fixed OSW1800 build problem (#4647)
* Enable to build osw1800
* Modify wnc-eeprom which base on debian buster's eeprom.c

Co-authored-by: Brand.huang <brand.huang@wnc.com.tw>
2020-06-09 09:19:45 -07:00
joyas-joseph
9505bdb910
[docker-syncd-vs]: Convert syncd-vs docker to buster (#4726)
Signed-off-by: Joyas Joseph <joyas_joseph@dell.com>
2020-06-09 09:07:25 -07:00
Samuel Angebault
c8bd640ae5
[arista] Update drivers submodules (#4693)
- Sensor and Fan information added to primary platforms for thermal API.
- Refactors involving better abstractions, code reuse and dead code removal.
- Improvements to the diag capabilities
- Pylintrc added to improve code quality. Will become fatal at a later time.

Co-authored-by: Baptiste Covolato <baptiste@arista.com>
2020-06-04 13:33:33 -07:00
Guohan Lu
82dc8ace12 [docker-syncd-bfn]: use service dependency in supervisord to start services 2020-05-22 11:01:28 -07:00
Andriy Kokhan
6889fd14ac
[BFN] Fixed Barefoot platform image build. (#4565)
- Updated bf_tun kernel driver;
- Temporarily disabled WNC modules build;

Signed-off-by: Andriy Kokhan <akokhan@barefootnetworks.com>
2020-05-11 23:23:34 -07:00
Andriy Kokhan
40ed75c2cb
[BFN] Updated Barefoot SDK to 2020-05-07 (#4566)
Signed-off-by: Andriy Kokhan <akokhan@barefootnetworks.com>
2020-05-11 23:21:05 -07:00
lguohan
c55603f494
[build]: add docker-ptf-* as stretch docker targets (#4516)
Signed-off-by: Guohan Lu <lguohan@gmail.com>
2020-05-01 11:20:33 -07:00
Samuel Angebault
80a025a7c2
[arista] update platform driver submodules (#4512)
- Add Makefile rules to build debug containers for SWI images
- Fix some platform API implementation for xcvrd and thermalctld
- Improvements to arista diag command
- Miscellaneous refactors

Co-authored-by: Maxime Lorrillere <mlorrillere@arista.com>
2020-04-30 12:06:19 -07:00
lguohan
c52b8c4e75
[platform-modules]: set debian control depends on unsigned kernel package (#4466)
Signed-off-by: Guohan Lu <lguohan@gmail.com>
2020-04-22 19:41:48 -07:00
byu343
0095763780 [arista]: Update driver submodules to support buster kernel (#57) 2020-04-20 07:34:43 +00:00
Guohan Lu
e5e26d9a43 [platform-modules]: set debian control file to depend on 4.19.0-6
change depends to linux-image-4.19.0-6-amd64 in debian/control

Signed-off-by: Guohan Lu <lguohan@gmail.com>
2020-04-17 04:51:51 +00:00
Guohan Lu
01cb7934b0 [build]: add buster docker as the last step of the build proces
- build SONIC_STRETCH_DOCKERS in sonic-slave-stretch docker
- build image related module in sonic-slave-buster docker.
  This includes all kernels modules and some packages

Signed-off-by: Guohan Lu <lguohan@gmail.com>
2020-04-16 10:26:18 +00:00
Myron Sosyak
530c9fc427
Update Barefoot kdrv (#4355) 2020-04-14 09:43:35 -07:00
Andriy Kokhan
36752b8775
[BFN] Update Barefoot SDK packages (#4397)
Signed-off-by: Andriy Kokhan <akokhan@barefootnetworks.com>
2020-04-08 22:49:45 -07:00
Samuel Angebault
5b0ec7afe6
[Arista] Update drivers submodules (#4317)
* Update arista drivers submodules

* Add device configs for 7060CX2-32S

* Update boot0 and union-mount for 7060CX2-32S

* Add 7170-32C and 7170-32CD support in boot0

* Sync after writting boot configs

* Add 7170-32C and 7170-32CD device configurations

Co-authored-by: Boyang Yu <byu@arista.com>
2020-03-27 17:28:27 -07:00
Myron Sosyak
0b618ab19b
[BFN] Fix syncd RPC compilation (#4258)
* Fix BFN syncd RPC compilation
* Remove empty line
2020-03-19 10:40:14 -07:00
Mykola F
bfe690b739
[syncd-rpc.mk] install ptf dependancy (#4279)
Signed-off-by: Mykola Faryma <mykolaf@mellanox.com>
2020-03-18 11:59:39 -07:00
Samuel Angebault
a3bbdc9929
[arista] Update platform drivers submodules (#4200)
* Build Arista drivers using DPKG_DEB
* Update arista platform submodule
2020-03-17 10:14:59 -07:00
Samuel Angebault
75181702d3
[arista] Update drivers submodule (#4164)
- Refactor of the cli
 - Refactor of the logging mechanism
 - Add some more utilities
 - Miscelaneous fixes and improvements
2020-02-25 09:14:56 -08:00
Samuel Angebault
1e40c598a3
[arista] Update drivers submodules (#4147)
Update the linux-kernel dependency to 4.9.0-11-2
2020-02-13 18:16:10 -08:00
Olivier Singla
6a0dcb1b16
[kernel]: security kernel update to 4.9.189 (#3913)
This patch upgrade the kernel from version
4.9.0-9-2 (4.9.168-1+deb9u3) to 4.9.0-11-2 (4.9.189-3+deb9u2)

Co-authored-by: rajendra-dendukuri <47423477+rajendra-dendukuri@users.noreply.github.com>
2020-02-12 17:41:58 -08:00