Commit Graph

4168 Commits

Author SHA1 Message Date
Samuel Angebault
c213bcf23a [arista]: Update arista driver submodules (#4922)
- Add more reboot cause reporting
- Fix backward compatibility issue with older reboot cause format
- Miscellaneous improvements
2020-07-12 18:08:52 +00:00
Ying Xie
d499a266c0 [mgmt docker] move pycryptodome installation to the end of the docker building (#4917)
* [mgmt docker] move pycryptodome installation to the end of the docker building

Signed-off-by: Ying Xie <ying.xie@microsoft.com>

* pin down the version to current: 3.9.8

* comment
2020-07-12 18:08:52 +00:00
Akhilesh Samineni
525029e3d8 [NAT]: Update the conntrack entries timeout to Max value after warmboot (#4596)
Signed-off-by: Akhilesh Samineni <akhilesh.samineni@broadcom.com>

All new NAT conntrack entries are added to the kernel with the maximum entry timeout of 432000 seconds (5 days). This change sets the same timeout after a system warm reboot as well.
2020-07-12 18:08:52 +00:00
joyas-joseph
7a6fca2f98 [docker-sflow]: upgrade docker-sflow on buster (#4904) 2020-07-12 18:08:52 +00:00
Tamer Ahmed
f4eae5dabd [telemetry] Call sonic-cfggen Once (#4901)
Each sonic-cfggen call is slow, and these calls take place during the
SONiC boot-up process. The change uses templates to assemble all required
vars into a single template file. With this change, telemetry now calls
sonic-cfggen only once.

signed-off-by: Tamer Ahmed <tamer.ahmed@microsoft.com>
2020-07-12 18:08:52 +00:00
Stephen Sun
153f880e6b [mellanox]: Support warm reboot on MSN4700 (#4910) 2020-07-12 18:08:52 +00:00
shlomibitton
e666bf8490 [Mellanox] Add a new SKU Mellanox-SN4600C-D112C8 (#4833)
Add related files to the device folder:

buffer config templates
pg lookup profile
port_config.ini
sai profile
sensor conf
plugins

Co-authored-by: Stephen Sun <stephens@mellanox.com>
2020-07-12 18:08:52 +00:00
Qi Luo
7707185aaf [build]: Fix make clean for redis-tools (#4903)
Fixed #4898
2020-07-12 18:08:52 +00:00
Ying Xie
6f11833ffa Revert "[sonic mgmt docker] lock pycryptodome version to 3.9.7 (#4913)" (#4915)
This reverts commit f427d2eecf.
2020-07-12 18:08:51 +00:00
Ying Xie
caa3323e9d [sonic mgmt docker] lock pycryptodome version to 3.9.7 (#4913)
Signed-off-by: Ying Xie <ying.xie@microsoft.com>
2020-07-12 18:08:51 +00:00
Venkatesan Mahalingam
7d003c3518 [TACACS+]: Add support to specify source address for TACACS+ (#4610)
This pull request was cherry picked from "#1238" to resolve the conflicts.

- Why I did it
Add support to specify source address for TACACS+
- How I did it
Add patches for libpam-tacplus and libnss-tacplus. The patches parse the new option 'src_ip' and store the converted addrinfo. Then the addrinfo is used for TACACS+ connection.
Add an attribute 'src_ip' for table "TACPLUS|global" in configDB
Add some code to adapt to the attribute 'src_ip'.
- How to verify it
Config command for source address PR in sonic-utilities
config tacacs src_ip <ip_address>
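
As a rough illustration of the new attribute, the sketch below validates a source address and builds the entry that would sit under the "TACPLUS|global" key. The helper name `build_tacplus_global` and the dictionary layout are assumptions made for illustration; they are not taken from the patches or from sonic-utilities.

```python
import ipaddress


def build_tacplus_global(src_ip):
    """Return a hypothetical ConfigDB-style entry for "TACPLUS|global".

    Raises ValueError if src_ip is not a plain IPv4 or IPv6 address.
    """
    # ip_address() rejects malformed strings and prefixes such as "10.0.0.1/24".
    addr = ipaddress.ip_address(src_ip)
    return {"TACPLUS|global": {"src_ip": str(addr)}}


if __name__ == "__main__":
    print(build_tacplus_global("10.1.1.1"))
```

Validating before writing keeps an unusable address out of the configuration, which matters here because the value is later converted to an addrinfo for the outgoing connection.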

- Description for the changelog
Add patches to specify source address for the TACACS+ outgoing packets.

**UT logs: **

UT_tacacs_source_intf.txt
2020-07-12 18:08:51 +00:00
lguohan
1dcf8ec04f [kernel]: upgrade linux kernel to 4.9.118 (#4897)
upgrade kernel to latest maintenance version 4.9.118

Signed-off-by: Guohan Lu <lguohan@gmail.com>
2020-07-12 18:08:51 +00:00
arlakshm
97fa2c087b [config]: Multi ASIC loopback changes (#4895)
Resubmitting the changes for (#4825) with fixes for sonic-bgpcfgd test failures
Signed-off-by: Arvindsrinivasan Lakshmi Narasimhan <arlakshm@microsoft.com>
2020-07-12 18:08:51 +00:00
zzhiyuan
2c426a8290 Skip thermalctld for arista platforms (#4893)
thermalctld is throwing error messages because it is not yet fully configured, so it is disabled for now on Arista platforms.

Co-authored-by: Zhi Yuan Carl Zhao <zyzhao@arista.com>
2020-07-12 18:08:51 +00:00
xumia
68e7cdb5ed Fix dpkg cache hash value relative to file path issue (#4894) 2020-07-12 18:08:51 +00:00
Volodymyr Boiko
15748a50ae [barefoot][SAI v1.6.3] Update SAI and platform packages to 20200701 (#4890)
Signed-off-by: Volodymyr Boyko <volodymyrx.boiko@intel.com>
2020-07-12 18:08:51 +00:00
lguohan
e2e57d32d6 [docker-orchagent]: upgrade docker-orchagent to buster (#4889)
also update submodule

* 01f810f 2020-07-02 | fix compiling issue for gcc8.3 (#1339) [lguohan]
* 9b13120 2020-07-03 | Fix in script to avoid orchagent crash when port down followed by fdb delete (#1340) [rupesh-k]
* 9b01844 2020-07-01 | [qosorch] Update QoS scheduler params for shaping features (#1296) [Michael Li]
* 86b5e99 2020-07-02 | [mirrororch] Port Mirroring implementation (#1314) [rupesh-k]
* c05601c 2020-06-24 | [portsyncd]: add debug message if a port cannot be found in port table (#1328) [lguohan]
* a0b6412 2020-06-23 | COPP_DEL_fix: DEL for one trap group from SONIC is resetting all the trap IDs (#1273) [SinghMinu]

Signed-off-by: Guohan Lu <lguohan@gmail.com>
2020-07-12 18:08:51 +00:00
Joe LeVeque
856f117fd3 [sonic-platform-daemons] Update submodule (#4887)
* src/sonic-platform-daemons abe115e...9b8bfa1 (1):
  > [xcvrd] Update key names in 'get_media_settings_value()' (#63)
2020-07-12 18:08:51 +00:00
paavaanan
cba801ae89 [Dell]: DellEMC S6100 disable Pericom/Xilinx chipset (#4868)
- The Xilinx/Pericom peripherals are not actively used in the DellEMC S6100 switch.
- On some units these peripherals throw PCIe corrected-error messages that fill the syslog.
- Since they are not used, they are disabled at startup.
2020-07-12 18:08:51 +00:00
lguohan
58632e6e83 [docker-orchagent]: make build depends only on sairedis package (#4880)
Make the swss build depend only on libsairedis instead of syncd. This allows swss to be built without
depending on the vendor SAI library.

Currently, the libsairedis build also builds syncd, which requires the vendor SAI library. This makes it
difficult to build the swss docker on buster while keeping the syncd docker on stretch, because swss requires
libsairedis, which in turn builds syncd and requires the vendor to provide SAI for buster. Since the swss
docker does not actually contain the syncd binary, it is not necessary to build syncd for the swss docker.

* [submodule]: update sonic-sairedis

* ccbb3bc 2020-06-28 | add option to build without syncd (HEAD, origin/master, origin/HEAD) [Guohan Lu]
* 4247481 2020-06-28 | install saidiscovery into syncd package [Guohan Lu]
* 61b8e8e 2020-06-26 | Revert "sonic-sairedis: Add support to sonic-sairedis for gearbox phys (#624)" (#630) [Danny Allen]
* 85e543c 2020-06-26 | add a README to tests directory to describe how to run 'make check' (#629) [Syd Logan]
* 2772f15 2020-06-26 | sonic-sairedis: Add support to sonic-sairedis for gearbox phys (#624) [Syd Logan]

Signed-off-by: Guohan Lu <lguohan@gmail.com>
2020-07-12 18:08:51 +00:00
Guohan Lu
f8da3e4c69 Revert "[config]: Loopback Interface changes for multi ASIC devices (#4825)"
This reverts commit cae65a451c.
2020-07-12 18:08:51 +00:00
Renuka Manavalan
7c30949758 Added new pip packages, required by kube.py (kubernetes CLI). (#4884) 2020-07-12 18:08:51 +00:00
arlakshm
002335a3d5 [config]: Loopback Interface changes for multi ASIC devices (#4825)
* Loopback IP changes for multi ASIC devices
Multi ASIC devices will have 2 Loopback interfaces:

- Loopback0 has a globally unique IP address, which is advertised by the multi ASIC device to its peers.
This way all the external devices will see this device as a single device.
- Loopback4096 is assigned an IP address whose scope is within the device. Each ASIC has a different IP address for Loopback4096. This IP address is used as the Router-Id by the BGP instance on multi ASIC devices.

This PR implements this change for multi ASIC devices

Signed-off-by: Arvindsrinivasan Lakshmi Narasimhan <arlakshm@microsoft.com>
2020-07-12 18:08:51 +00:00
pavel-shirshov
2b137fb540 Tests of FRR templates which are rendered by sonic-cfggen (#4875)
2020-07-12 18:08:51 +00:00
pavel-shirshov
7d0ea7383d [pfx_filter]: Add a prefix mask by default in pfx_filter when there is none (#4860)
If a table with a list of tuples (interface name, ip prefix) has IP prefixes without a mask length, it causes issues in SONiC. For example, Quagga and FRR treat an IPv4 address without a mask as classful, so the address "10.20.30.40" is treated as "10.0.0.0/8", which is dangerous.

The fix is that when pfx_filter gets a tuple (interface name, ip prefix) whose IP prefix has no mask length, it adds a default mask: /32 for IPv4 addresses, /128 for IPv6 addresses.
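
The defaulting rule can be sketched as follows. This is a minimal illustration of the behaviour described in the message, not the actual pfx_filter code; the function names `add_default_mask` and `normalize` are hypothetical.

```python
import ipaddress


def add_default_mask(prefix):
    """Append /32 (IPv4) or /128 (IPv6) when the prefix has no mask length."""
    if "/" in prefix:
        return prefix  # a mask length is already present, leave it alone
    addr = ipaddress.ip_address(prefix)
    # max_prefixlen is 32 for IPv4 addresses and 128 for IPv6 addresses.
    return f"{prefix}/{addr.max_prefixlen}"


def normalize(table):
    """Apply add_default_mask to every (interface name, ip prefix) tuple."""
    return [(ifname, add_default_mask(pfx)) for ifname, pfx in table]


if __name__ == "__main__":
    print(normalize([("Loopback0", "10.20.30.40"), ("Loopback0", "fc00::1")]))
```

Using a host mask as the default is the conservative choice: the entry then covers exactly the one address that was written, rather than a whole classful network.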

Co-authored-by: Pavel Shirshov <pavel.contrib@gmail.com>
2020-07-12 18:08:51 +00:00
abdosi
fc6bcff52b [sonic-buildimage] Changes to make network specific sysctl common for both host and docker namespace (#4838)
* [sonic-buildimage] Changes to make network specific sysctl
common for both host and docker namespace (in multi-npu).

This change was triggered by an issue found on multi-npu platforms,
where net.ipv6.conf.all.forwarding was 0 (it should be 1) in the
docker namespace. As a result, RS/RA messages were triggered and
link-local routers were learnt.

Besides this, some other sysctl.net.ipv6* params had values in the
docker namespace that differed from the host namespace.

To keep the host and docker namespaces in sync, a common file now
lists all sysctl.net.* params and is used by both the host and the
docker namespace. Any change is applied to both namespaces.

Signed-off-by: Abhishek Dosi <abdosi@microsoft.com>

* Address review comments and make sure to invoke augtool
only once, concatenating all set commands into one string

* Address Review Comments.
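
One way to spot the kind of host/namespace drift described above is to compare two sets of `key = value` sysctl settings. The sketch below is an assumption-laden illustration: the function names are hypothetical and the input is plain text in sysctl.conf style, not the common file this change introduces.

```python
def parse_sysctl(text):
    """Parse 'key = value' lines, skipping blank lines and comments."""
    params = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith(("#", ";")):
            continue
        key, sep, value = line.partition("=")
        if sep:
            params[key.strip()] = value.strip()
    return params


def diff_net_params(host, namespace):
    """Return {key: (host_value, namespace_value)} for net.* keys that differ."""
    keys = {k for k in set(host) | set(namespace) if k.startswith("net.")}
    return {
        k: (host.get(k), namespace.get(k))
        for k in sorted(keys)
        if host.get(k) != namespace.get(k)
    }


if __name__ == "__main__":
    host = parse_sysctl("net.ipv6.conf.all.forwarding = 1\n")
    ns = parse_sysctl("net.ipv6.conf.all.forwarding = 0\n")
    print(diff_net_params(host, ns))
```

A key missing on one side shows up with `None` for that side, so a parameter set only on the host is reported as well.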
2020-07-12 18:08:51 +00:00
Mahesh Maddikayala
5eabae1ede Fix in libsaibcm for high CPU utilization of syncd (#4874) 2020-07-12 18:08:51 +00:00
Akhilesh Samineni
eed16e9618 [docker-nat]: Updated the NAT iptables patch for 4.19 buster (#4843)
Updated the NAT iptables patch for 4.19 buster

Depends on PR : Azure/sonic-linux-kernel#147

One known issue:

With both NAT patch files for the 4.19 buster kernel, there is one display issue in iptables, explained below.

On the NAT docker the supported iptables version is 1.6.0, and on the base OS it is 1.8.2. As a result, the fullcone option is not shown in the iptables 1.8.2 output. Functionality is not affected.

Display issue, for comparison:

NAT Docker:
root@sonic:/home/admin# docker exec -it nat bash
root@sonic:/# iptables -t nat -nvL
Chain PREROUTING (policy ACCEPT 0 packets, 0 bytes)
pkts bytes target prot opt in out source destination
0 0 DNAT all -- * * 0.0.0.0/0 0.0.0.0/0 to:1.1.1.1 fullcone

Chain INPUT (policy ACCEPT 0 packets, 0 bytes)
pkts bytes target prot opt in out source destination

Chain OUTPUT (policy ACCEPT 30 packets, 2749 bytes)
pkts bytes target prot opt in out source destination

Chain POSTROUTING (policy ACCEPT 30 packets, 2749 bytes)
pkts bytes target prot opt in out source destination
root@sonic:/#

Base OS:
root@sonic:/home/admin# iptables-legacy -t nat -nvL
Chain PREROUTING (policy ACCEPT 0 packets, 0 bytes)
pkts bytes target prot opt in out source destination
1 36 DNAT all -- * * 0.0.0.0/0 0.0.0.0/0 to:1.1.1.1

Chain INPUT (policy ACCEPT 1 packets, 36 bytes)
pkts bytes target prot opt in out source destination

Chain OUTPUT (policy ACCEPT 41 packets, 3572 bytes)
pkts bytes target prot opt in out source destination

Chain POSTROUTING (policy ACCEPT 41 packets, 3572 bytes)
pkts bytes target prot opt in out source destination
root@sonic:/home/admin#

To fix this issue, iptables needs to be updated from 1.6.0 to 1.8.2 and the NAT docker needs to be updated from stretch to buster. A new PR will be raised for this.

Signed-off-by: Akhilesh Samineni <akhilesh.samineni@broadcom.com>
2020-07-12 18:08:51 +00:00
Mahesh Maddikayala
74389cb402 [sonic-sairedis] sonic-sairedis submodule update (#4847)
* sonic-sairedis submodule update
* Update BRCM SAI to 3.7.5.1
2020-07-12 18:08:51 +00:00
judyjoseph
1af68b3aa6 Support for connecting to DB in namespace via TCP port in multi-asic platform. (#4779)
* Support for connecting to DB in namespace via IP:port (using the docker bridge network) for applications on multi-asic platforms.

* Added the default IP of 127.0.0.1 if deriving the IP address from the interface fails.
Moved the localhost loopback IP binding logic into the supervisor.j2 file.
2020-07-12 18:08:51 +00:00
Kebo Liu
0921e3d6ff [mellanox]: Update SAI to 1.16.5 (#4873)
1.  Upgrade SAI headers to v1.6.3
2.  Fix traffic loss during FFB related to buffer config + optimize buffer config timing for FB
3.  Add ACL fields BTH, IP flags
4.  Add ACL infrastructure of different fields per ASIC type
2020-07-12 18:08:51 +00:00
Volodymyr Boiko
77a1bc25de [sonic-platform-common] Update submodule (#4871)
* src/sonic-platform-common 82bbeab...42781ff (1):
  > [SfpBase] Fix key name typo in docstring (#99)

Signed-off-by: Volodymyr Boyko <volodymyrx.boiko@intel.com>
2020-07-12 18:08:51 +00:00
arlakshm
a8b99f77f3 syslog changes Multi ASIC platforms (#4738)
Add syslog support for containers running in namespaces on multi ASIC platforms.

On multi ASIC platforms:

- The rsyslog service runs only on the host. There is no rsyslog service running in each namespace.
- The rsyslog service on the host listens on the docker0 IP address instead of the loopback address.
- The rsyslog.conf in the containers is modified so that the omfwd target IP is the docker0 IP address instead of the loopback IP.

Signed-off-by: Arvindsrinivasan Lakshmi Narasimhan <arlakshm@microsoft.com>
2020-07-12 18:08:51 +00:00
Praveen Chaudhary
0f4460e7ad
[rules/sonic-utilities.mk]: Add sonic_yang_models as dep to sonic utils (#4869)
Since we cannot refer to a directory in sonic-buildimage during Jenkins testing of sonic-utilities,
we need to create a build dependency on the sonic_yang_models package too.

Signed-off-by: Praveen Chaudhary pchaudhary@linkedin.com
2020-06-29 14:44:52 -07:00
abdosi
15440b6e43
Changes to make default route programming correct in multi-npu platforms (#4774)
* Changes to make default route programming
correct in multi-asic platform where frr is not running
in host namespace. Change is to set correct administrative distance.
Also make NAMESPACE* environment variable available for all dockers
so that it can be used when needed.

Signed-off-by: Abhishek Dosi <abdosi@microsoft.com>

* Fix review comments

* Review comment to check to add default route
only if default route exist and delete is successful.
2020-06-29 11:38:46 -07:00
SuvarnaMeenakshi
ab2177b4a9
[systemd-generator]: Fix dependency update for multi-asic platform (#4820)
* [systemd-generator]: Fix the code to make sure that dependencies
of host services are generated correctly for multi-asic platforms.
Add code to make sure that systemd timer files are also modified
to add the correct service dependency for multi-asic platforms.

Signed-off-by: SuvarnaMeenakshi <sumeenak@microsoft.com>

* [systemd-generator]: Minor fix, remove debug code and
remove unused variable.
2020-06-29 09:39:23 -07:00
Junchao-Mellanox
ce391645f2
[Mellanox] add ASIC temperature support to platform API (#4828)
**- Why I did it**

The system health feature needs to read the ASIC temperature and threshold from the platform API.

**- How I did it**

Implement Chassis.get_asic_temperature and Chassis.get_asic_temperature_threshold by reading the values from system fs.
2020-06-28 17:54:28 -07:00
ciju-juniper
dd4cf912a6
[Juniper][QFX5210] Fixing a few platform issues (#4857)
This patch addresses the following issues:
 1) Platform drivers were not loading in the latest images. Fixed
    the initialization script to make sure that all the drivers are
    loaded.
 2) Getting rid of "pstore: crypto_comp_decompress failed, ret = -22!"
    messages during the kernel boot, after moving to 4.19 kernel. The
    solution is to remove the files under '/sys/fs/pstore' directory.

Signed-off-by: Ciju Rajan K <crajank@juniper.net>
2020-06-28 11:11:34 -07:00
yozhao101
1c32933c7d
[docker] Correct the lldp-syncd program name in critical_processes file. (#4862)
The program name in critical_processes file must match the program name defined in supervisord.conf file.

Signed-off-by: Yong Zhao <yozhao@microsoft.com>
2020-06-28 11:08:30 -07:00
Praveen Chaudhary
07930c39ba
[build] Add essential PY PKGs on host for sonic-utilities/config/config_mgmt.py (#4740)
Add essential PY PKGs on host by installing them in sonic_debian_extension.j2

Signed-off-by: Praveen Chaudhary pchaudhary@linkedin.com
2020-06-28 11:03:48 -07:00
Joe LeVeque
ced0f7ba3d
[sonic-platform-common][sonic-platform-daemons][sonic-utilities] Update submodules (#4852)
* src/sonic-platform-common 75698a8...82bbeab (9):
  > [sfputil] Make SfpUtilHelper.get_physical_to_logical noexcept as in SfpUtilBase (#96)
  > [sfp_base] Update return value documentation of channel-specific methods (#98)
  > [sfp] Tweak key names of some transceiver info fields (#97)
  > fix typo:  portconfig.ini to port_config.ini (#94)
  > [chassis_base] Add platform API support for system LED (#91)
  > Add PCIe check command (#64)
  > [sfputilbase.py] Don't try to print EEPROM sysfs file name if we failed to read from it (#81)
  > [sfputilbase | sfputilhelper] Add support of platform.json (#72)
  > [eeprom] Add try-except to catch the IOError (#85)

* src/sonic-platform-daemons 0f4fd83...abe115e (2):
  > [xcvrd] Tweak some transceiver info key names (#62)
  > [psud][thermalctld] Always get fan/PSU LED status from platform API to avoid status inconsistencies (#59)

* src/sonic-utilities fd7781b...16a33f2 (9):
  > [config] Fix syntax error (#966)
  > [config] Fix indentation level in _get_disabled_services_list() (#965)
  > a4e64d1 [sonic_installer] Refactor sonic_installer code (#953)
  > 90efd62 [Show | Command Reference] Add Port breakout Show Command (#859)
  > [sfpshow][mock_state_db] Tweak key names of some transceiver info fields (#958)
  > [show] Add missing verbose option to "show line" (#961)
  > [filter-fdb] Check VLAN Presence When Filter FDB (#957)
  > [master]fix #4716 show ipv6 interfaces neighbor_ip is N/A issue (#948)
  > Fix for command. show interface transceiver eeprom -d Ethernet (#955)

Note: sonic-utilities update fixes #4716
2020-06-27 22:57:26 -07:00
Aravind Mani
0c3ec0e644
[DellEMC] S52xx fix SFP reset in 1.0 API (#4858)
Issue: Port with AOC cable does not come up when "sfputil reset <port_name>" is executed.

Modified the incorrect mask used in reset API to resolve the issue.
2020-06-27 12:02:53 -07:00
pavel-shirshov
1eb3dfe541
[docker-teamd]: Introducing tlm_teamd: telemetry for teamd (#4824)
**- What I did**
1. Updated submodule sonic-swss to bring tlm_teamd to the buildimage.
2. Updated supervisord for the teamd
3. Updated critical process list (not sure that tlm_teamd is critical for now)

**- How to verify it**
Build an image and run it. Check that tlm_teamd is running and that STATE_DB has information in LAG_INTERFACE and LAG_MEMBER_INTERFACE:
```
admin@sonic:~$ redis-cli -n 6 hgetall 'LAG_TABLE|PortChannel16'
 1) "state"
 2) "ok"
 3) "team_device.ifinfo.dev_addr"
 4) "4c:76:25:f5:48:80"
 5) "setup.kernel_team_mode_name"
 6) "loadbalance"
 7) "team_device.ifinfo.ifindex"
 8) "6"
 9) "runner.fast_rate"
10) "false"
11) "runner.active"
12) "true"
13) "setup.pid"
14) "35"
15) "runner.fallback"
16) "false"
```

```
admin@sonic:~$ redis-cli -n 6 hgetall 'LAG_MEMBER_TABLE|PortChannel16|Ethernet16'
 1) "runner.selected"
 2) "true"
 3) "runner.aggregator.selected"
 4) "true"
 5) "runner.aggregator.id"
 6) "26"
 7) "runner.actor_lacpdu_info.state"
 8) "61"
 9) "runner.state"
10) "current"
11) "runner.actor_lacpdu_info.system"
12) "4c:76:25:f5:48:80"
13) "runner.partner_lacpdu_info.state"
14) "61"
15) "link.up"
16) "true"
17) "ifinfo.dev_addr"
18) "4c:76:25:f5:48:80"
19) "ifinfo.ifindex"
20) "26"
21) "link_watches.list.link_watch_0.up"
22) "true"
23) "runner.actor_lacpdu_info.port"
24) "17"
25) "runner.partner_lacpdu_info.port"
26) "1"
27) "runner.partner_lacpdu_info.system"
28) "52:54:00:ff:34:1b"
```
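
The numbered `redis-cli` output above alternates field names and values. If that output is collected as a flat list, it can be folded into a dictionary as sketched below. The helper names and the specific health check (`state` equal to "ok" and `runner.active` equal to "true") are assumptions for illustration only, not part of tlm_teamd.

```python
def pair_fields(flat):
    """Turn [field1, value1, field2, value2, ...] into a dict."""
    if len(flat) % 2:
        raise ValueError("expected an even number of items")
    # Even positions hold field names, odd positions hold their values.
    return dict(zip(flat[0::2], flat[1::2]))


def lag_looks_healthy(flat):
    """Hypothetical check on a LAG entry: state is ok and the runner is active."""
    fields = pair_fields(flat)
    return fields.get("state") == "ok" and fields.get("runner.active") == "true"


if __name__ == "__main__":
    sample = ["state", "ok", "runner.fast_rate", "false", "runner.active", "true"]
    print(pair_fields(sample))
    print(lag_looks_healthy(sample))
```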
2020-06-27 01:22:23 -07:00
Qi Luo
6849a0351c
[redis] Install vanilla redis packages for Buster and Stretch; upgrade Buster to 6.0.5 (#4732)
upgrade redis server to 5:6.0.5-1~bpo10+1
2020-06-27 01:17:20 -07:00
lguohan
c79783003d
[submodule]: update sonic-linux-kernel (#4856)
* c60b1f4 2020-06-26 | e1000: Do not perform reset in reset_task if we are already down (#148) (HEAD -> master, origin/master, origin/HEAD) [lguohan]
* c6aeedd 2020-06-25 | Updated NAT kernel patch for 4.19 buster (#147) [Akhilesh Samineni]

Signed-off-by: Guohan Lu <lguohan@gmail.com>
2020-06-26 23:59:22 -07:00
Praveen Chaudhary
94d448e9bd
[slave.mk]: Adding support to specify debs dependencies for python-debs package. (#4849)
**- How I did it**
Added the line below:
$$(addsuffix -install,$$(addprefix $(DEBS_PATH)/,$$($$*_DEBS_DEPENDS))) \

**- How to verify it**
Added the dependencies below in sonic-utils
```
SONIC_UTILS = python-sonic-utilities_1.2-1_all.deb
$(SONIC_UTILS)_SRC_PATH = $(SRC_PATH)/sonic-utilities
$(SONIC_UTILS)_DEBS_DEPENDS = $(LIBYANG) $(LIBYANG_CPP) $(LIBYANG_PY2) \  <<<<<<<<<<<
                                     $(LIBYANG_PY3)
$(SONIC_UTILS)_WHEEL_DEPENDS = $(SONIC_CONFIG_ENGINE) $(SONIC_YANG_MGMT_PY)
SONIC_PYTHON_STDEB_DEBS += $(SONIC_UTILS)
```
Build the PKGs successfully.

Signed-off-by: Praveen Chaudhary pchaudhary@linkedin.com
2020-06-26 11:32:35 -07:00
Kebo Liu
88bbcbf246
[Mellanox] Update SDK to 4.4.0952, FW to *.2007.1280 (#4842) 2020-06-26 13:44:21 +03:00
yozhao101
4fa81b4f8d
[dockers] Update critical_processes file syntax (#4831)
**- Why I did it**
Initially, the critical_processes file contained either the name of a critical process or the name of a group.
For example, the critical_processes file in the dhcp_relay container contains a single group name,
`isc-dhcp-relay`. When testing the autorestart feature of each container, we need to get all the critical
processes and test whether a container is restarted correctly if one of its critical processes is
killed. However, it is difficult to tell whether a name in the critical_processes file is
a critical process or a group name. Changing the syntax of this file separates individual processes from groups and also makes the distinction clear to the user.

The critical_processes file now contains two kinds of entries. One is "program:xxx", which indicates a critical process. The other is "group:xxx", which indicates a group of critical processes
managed by supervisord under the name "xxx". I also updated the logic that
parses the critical_processes file in the supervisor-proc-event-listener script.
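
A parser for the two entry kinds might look like the sketch below. It is a minimal illustration under the assumption of one entry per line; it is not the code in supervisor-proc-event-listener, and the function name `parse_critical_processes` is hypothetical.

```python
def parse_critical_processes(text):
    """Split 'program:<name>' and 'group:<name>' entries into two lists.

    Returns (programs, groups). Raises ValueError on any other line format.
    """
    programs, groups = [], []
    for lineno, raw in enumerate(text.splitlines(), start=1):
        line = raw.strip()
        if not line:
            continue  # ignore blank lines
        kind, sep, name = line.partition(":")
        name = name.strip()
        if not sep or not name:
            raise ValueError(f"line {lineno}: expected 'program:<name>' or 'group:<name>'")
        if kind == "program":
            programs.append(name)
        elif kind == "group":
            groups.append(name)
        else:
            raise ValueError(f"line {lineno}: unknown entry type {kind!r}")
    return programs, groups


if __name__ == "__main__":
    print(parse_critical_processes("group:isc-dhcp-relay\n"))
```

Rejecting lines without a recognised prefix makes an old-style file fail loudly instead of being silently treated as a list of processes.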

**- How to verify it**
We can first enable the autorestart feature of a specified container, for example `dhcp_relay`, by running the command `sudo config container feature autorestart dhcp_relay enabled` on the DUT. Then we can select a critical process from the output of `docker top dhcp_relay` and kill it with `sudo kill -SIGKILL <pid>`. The final step is to check whether the container is restarted correctly.
2020-06-25 21:18:21 -07:00
Shuba Viswanathan
921d132a32
[sonic-mgmt]: Support for pytest-html to control logs better (#4791)
The current stdout file, which also includes the DUT logs, is very verbose and noisy.

We have manually installed pytest-html in the sonic-mgmt docker in our organization and tuned the pytest settings to produce very helpful and concise logs.

pytest-html plugins can be used to post-process the output in various ways based on our different and unique organizational needs.

Hence proposing to add this package to the docker file.
2020-06-25 17:45:16 -07:00
yozhao101
b8ad0ed4e4
[Monit] Use the string "/usr/bin/syncd\s" to monitor the syncd process (#4706)
**- Why I did it**
After discussion with Joe, we use the string "/usr/bin/syncd\s" in the Monit configuration file to monitor the syncd process on Broadcom and Mellanox. Due to my carelessness, I did not find this bug during the previous testing. If we use the string "/usr/bin/syncd" in the Monit configuration file to monitor the syncd process, Monit cannot detect whether the syncd process is running.

If we run the command `sudo monit procmatch "/usr/bin/syncd"` on Broadcom, three processes in the syncd container match "/usr/bin/syncd": `/bin/bash /usr/bin/syncd.sh wait`, `/usr/bin/dsserve /usr/bin/syncd --diag -u -p /etc/sai.d/sai.profile` and `/usr/bin/syncd --diag -u -p /etc/sai.d/sai.profile`. Monit selects the process with the highest uptime (here `/bin/bash /usr/bin/syncd.sh wait`) and does not select `/usr/bin/syncd --diag -u -p /etc/sai.d/sai.profile`.

Similarly, on Mellanox Monit also selects the process with the highest uptime (here `/bin/bash /usr/bin/syncd.sh wait`) and does not select `/usr/bin/syncd --diag -u -p /etc/sai.d/sai.profile`.

That is why Monit is unable to detect whether the syncd process is running if we use the string "/usr/bin/syncd" in the Monit configuration file. If we use the string "/usr/bin/syncd\s" instead, Monit filters out the process `/bin/bash /usr/bin/syncd.sh wait` and can therefore monitor the syncd process correctly.
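
The effect of the trailing `\s` can be checked against the three command lines quoted above. The sketch uses Python's `re` module as a stand-in for Monit's own pattern matching, which is an assumption; it only shows which command lines each pattern matches, not how Monit picks among them.

```python
import re

# Command lines as quoted in the description above.
COMMAND_LINES = [
    "/bin/bash /usr/bin/syncd.sh wait",
    "/usr/bin/dsserve /usr/bin/syncd --diag -u -p /etc/sai.d/sai.profile",
    "/usr/bin/syncd --diag -u -p /etc/sai.d/sai.profile",
]


def matching(pattern):
    """Return the command lines in which the pattern is found anywhere."""
    regex = re.compile(pattern)
    return [cmd for cmd in COMMAND_LINES if regex.search(cmd)]


if __name__ == "__main__":
    for pattern in (r"/usr/bin/syncd", r"/usr/bin/syncd\s"):
        print(pattern, "->", matching(pattern))
```

With the plain string all three command lines match, including the long-lived wrapper script. Requiring whitespace right after `syncd` drops `/usr/bin/syncd.sh`, because the character following `syncd` there is a dot.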

**- How I did it**

**- How to verify it**

Signed-off-by: Yong Zhao <yozhao@microsoft.com>
2020-06-25 17:03:14 -07:00