Commit Graph

5264 Commits

Author SHA1 Message Date
Kevin Wang
3ef3e3c56f
Update cisco-8000 ref to release: 202012-v0.8-nopatches (#9763)
Signed-off-by: Kevin(Shengkai) Wang <shengkaiwang@microsoft.com>
2022-01-15 09:49:34 +08:00
SuvarnaMeenakshi
d2ee7a5bef [docker-snmp]: Modify log level of snmpd (#9734)
#### Why I did it
resolves https://github.com/Azure/sonic-buildimage/issues/8779
snmpd writes the below error message in syslog :
snmp#snmpd[27]: truncating integer value > 32 bits
This message is written to syslog when hrSystemUptime (1.3.6.1.2.1.25.1.1.0, system uptime) or sysUpTime (1.3.6.1.2.1.1.3, uptime of the network management portion, i.e. snmpd) is queried after either counter has overflowed a 32-bit value. This happens when the device uptime or snmpd uptime exceeds 497 days.
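
The 497-day figure follows from the counter width. A minimal worked calculation, assuming the uptime is reported as SNMP TimeTicks (hundredths of a second) in a 32-bit counter; the function name is illustrative only:

```python
TICKS_PER_SECOND = 100   # SNMP TimeTicks count hundredths of a second
SECONDS_PER_DAY = 86400


def rollover_days(counter_bits: int = 32) -> float:
    """Days of uptime after which a TimeTicks counter of this width overflows."""
    max_ticks = 2 ** counter_bits
    return max_ticks / TICKS_PER_SECOND / SECONDS_PER_DAY


if __name__ == "__main__":
    print(f"{rollover_days():.1f} days")  # 497.1 days
```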

#### How I did it
Reference: https://access.redhat.com/solutions/367093 and https://linux.die.net/man/1/snmpcmd

To avoid seeing this message when the counter overflows, the snmpd log level is changed so that only LOG_EMERG, LOG_ALERT, LOG_CRIT, and LOG_DEBUG are logged.

Without this change, LOG_ERR and LOG_WARNING would also be logged in syslog.

#### How to verify it
On a device that has been up for more than 497 days, modify supervisord.conf with the change and restart snmp.
Query 1.3.6.1.2.1.1.3 and verify that log message is not seen.
2022-01-14 23:01:19 +00:00
gechiang
068ff9ddbd
[202012][BRCM TH3] Add SOC properties to prevent FDB events during warmboot (#9761) 2022-01-14 14:44:43 -08:00
Stepan Blyshchak
31065ccb93
[Mellanox] [202012] fail the build when hw-mgmt patches do not apply (#9566)
Taken from https://github.com/Azure/sonic-buildimage/pull/9539

####  Why I did it
To fix an issue where hw-mgmt patches were not applied. One patch was already in the upstream hw-mgmt package, so applying it again caused an error and no other patches were applied. I also improved the Makefile so that make fails if patches fail to apply.

####  How I did it
Removed obsolete patch, made applying patches a hard failure in the build.
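
The "hard failure" idea can be sketched as follows. This is an illustration in Python, not the Makefile change from the PR: the function name, the use of `patch -p1 -i`, and the injectable `run` parameter are all assumptions made for the example.

```python
import subprocess
from typing import Callable, Iterable, List


def apply_patches(patches: Iterable[str],
                  run: Callable = subprocess.run) -> List[str]:
    """Apply patches in order and stop at the first one that fails.

    Hypothetical helper: `patch -p1 -i <file>` is an assumed command,
    not the one used by the hw-mgmt Makefile.
    """
    applied = []
    for patch_file in patches:
        # check=True raises CalledProcessError on a non-zero exit status,
        # so a failing patch aborts the run instead of being skipped silently.
        run(["patch", "-p1", "-i", patch_file], check=True)
        applied.append(patch_file)
    return applied
```

Letting the exception propagate is the point: the caller (here, the build) fails visibly rather than continuing with a partially patched tree.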

####  How to verify it
Run the make and verify patches are applied.
2022-01-13 15:08:27 -08:00
gechiang
bdc7ce86de
[202012] BRCM SAI 4.3.5.2 Fixes CS00012205357, CS00012214196, CS00012213974 (#9754) 2022-01-13 11:40:43 -08:00
Kebo Liu
75bd97e176 [Mellanox] Add sensors conf for MSN4600C A1 platform (#9706)
- Why I did it
Add sensor conf for MSN4600C A1 platform

- How I did it
Add a new sensor conf file and relevant scripts to support two different versions of the platform

- How to verify it
Run "sensors" cmd to check the output on the A1 platform to see whether it's as expected.
Signed-off-by: Kebo Liu <kebol@nvidia.com>
2022-01-13 07:01:26 +00:00
DavidZagury
57abd5914e [Mellanox] Upgrade Mellanox firmware tools to 4.17.2-12 (#8978)
- Why I did it
Bug fix:
bad_param request due to missing parser rest command while running mlxlink

- How I did it
Advance to MFT tool version to 4.17.2-12.

- How to verify it
Manually tested on all mellanox platforms.
2022-01-12 22:36:11 +00:00
mssonicbld
a0376a6e59
[ci/build]: Upgrade SONiC package versions (#9680) 2022-01-07 22:12:12 +00:00
Richard.Yu
dec52bb32f
Add thrift 0.13.0 (#9678)
* Add thrift 0.13.0  (#8307)

#### Why I did it
To bump thrift version to 0.13.0, to fix some dependencies issues.

#### How I did it
Thrift and the saithrift server (bf3630316c/test/saithrift), which syncd-rpc uses, depend on each other, so updating the thrift version also requires changes in the saithrift server; then the SAI ref point must be updated in sairedis, and then the sairedis ref point must be updated too. That is too many changes, so I decided to add thrift 0.13.0 as a separate target so that I can work on and test further changes in saithrift. Once the appropriate changes are merged to SAI and the ref points are updated, I will squash this and the old thrift target. I was not able to build the thrift deb package with the original rules, so I copied the `debian` folder from the old version and tuned it for the newer one.

#### How to verify it
```
make init
make configure PLATFORM=vs
make target/debs/buster/libthrift_0.13.0_amd64.deb
```


* Correct the pkg name for thrift.0.13.0

Correct the dependent package name for thrift 0.13.0.
In the previous code, the buildout target was named PYTHON3_THRIFT_0_13_0,
but when the package was added to LIBTHRIFT_0_13_0 it was mistyped as PYTHON_THRIFT_0_13_0.

Co-authored-by: Myron Sosyak <myronx.sosyak@intel.com>
2022-01-05 18:00:58 -08:00
Dror Prital
1eec2bc25e
[202012] [submodule] Update sonic-utilities submodule (#9661)
Update submodule sonic-utilities that contains the following commits:

Revert "[202012] [generate_dump] allow to extend dump with plugin scripts (#1945)" (#1993)
[soft-reboot] Add support for platforms based on Device Tree (#1963)
[Reclaiming buffer][202012] Database migrator for reclaiming buffer (#1898)
[202012] [generate_dump] allow to extend dump with plugin scripts (#1945)

Signed-off-by: dprital <drorp@nvidia.com>
2022-01-02 16:51:04 +02:00
Kebo Liu
16a3929159
[202012][Mellanox] Update hw-mgmt package to V.7.0010.2347 (#9594)
- Why I did it
Update hw-mgmt to a new version to pick up support for the SN4600C A1 system.

- How I did it
Update the pointer of the hw-mgmt submodule
Update the hw-mgmt version number
Remove the stale code patch to the hw-mgmt userspace code.

- How to verify it
Run platform regression on Mellanox platforms.

Signed-off-by: Kebo Liu <kebol@nvidia.com>
2021-12-28 09:40:58 +02:00
mssonicbld
9b1a3971bd
[ci/build]: Upgrade SONiC package versions (#9645) 2021-12-26 23:30:40 +00:00
mssonicbld
813a6387c5
[ci/build]: Upgrade SONiC package versions (#9543) 2021-12-24 17:05:45 +00:00
Stephen Sun
b36ee67bc7 Fix typo and missing files in SN3800 and SN4600C's buffer templates (#9537)
Why I did it
Fix typo and missing files in SN3800 and SN4600C's buffer templates

How I did it
ingress_lossless_xoff_size => ingress_lossless_pool_xoff
Add missing files for SN4600C-D100C12S2

How to verify it
Deploy the fix and verify whether the device can be up.

Signed-off-by: Stephen Sun <stephens@nvidia.com>
2021-12-23 03:28:43 +00:00
anamehra
b3ca681279
Update cisco-8000 ref to release: 202012-v0.8 (#9528)
Update cisco-8000 ref to release: 202012-v0.8
Signed-off-by: Anand Mehra anamehra@cisco.com
2021-12-21 22:44:38 -08:00
gechiang
6faeea6bf2
[202012] Submodule Update on sonic-sairedis (#9533)
#### Why I did it
To pick up fixes from submodule sonic-sairedis which include the following fixes:
```
commit 1027eef3a331e84560827c7584ee8009baf434d5 (HEAD -> 202012, origin/202012)
Author: gechiang <62408185+gechiang@users.noreply.github.com>
Date:   Wed Dec 8 03:13:34 2021 -0800

    [202012] Prevent other notification event storms to keep enqueue unchecked and drained all memory that leads to crashing the switch router (#976)

commit 94455e50d3444dcd60093b7a39c7f427337a94d2
Author: VenkatCisco <77468614+VenkatCisco@users.noreply.github.com>
Date:   Tue Jun 15 03:23:20 2021 -0700

    Add cisco-8000 checks to syncd_init_common (#839)

commit 2df539483ed68519c3c9c6df958d3ed2f31dd629
Author: Kamil Cudnik <kcudnik@gmail.com>
Date:   Mon Dec 6 20:50:23 2021 +0100

    [lgtm] Add gmock libs to lgtm (#979)

```
2021-12-20 23:45:01 -08:00
Stephen Sun
507cb8b91c
[202012][swss-common][swss] Submodule update (#9531)
#### Why I did it

Update sonic-swss-common

54879741 [202012][schema] Add vnet route tunnel and advertise network tables for state_db (Azure/sonic-swss-common#563)
a5394f9d Update for BFD, default route table (Azure/sonic-swss-common#550)

Update sonic-swss

fbbe5bcc [202012][pfc_detect] fix RedisReply errors (Azure/sonic-swss#2078)
5762b0c2 [Reclaim buffer][202012] Reclaim unused buffer for dynamic buffer model (Azure/sonic-swss#1985)
33e9bd19 [Document][202012] Supply the missing ingress/egress port profile list in document (Azure/sonic-swss#2066)
1b6ffba1 [Reclaiming buffer][202012] Support reclaiming buffer in traditional buffer model (Azure/sonic-swss#2063)
afb33f16 [202012] Update default route status to state DB (Azure/sonic-swss#2009) (Azure/sonic-swss#2067)
b9c44f75 Common code update for reclaiming buffer (backport community PR Azure/sonic-swss#1996 to 202106/202012) (Azure/sonic-swss#2061)
cf5182d8 [request parser] Allow request parser to parse multiple values
2021-12-20 23:43:42 -08:00
Vadym Hlushko
d8ee1e6a63
[Mellanox] [SN4410] [202012] Fixed port_config.ini (#9542)
#### Why I did it
The capability files were incorrect in comparison to the marketing spec of the SN4410 platform.

#### How I did it
Aligned the capability files according to the marketing spec.

#### How to verify it
Did basic manual sanity checks:
- Check if critical docker containers were UP
- Check if interfaces were created and were UP
- Check if interfaces created in the syncd docker container by executing – sx_api_ports_dump.py script
- Check the logs from the start of the switch – everything was OK
- Verified the port breakout
2021-12-20 23:42:34 -08:00
Sumukha Tumkur Vani
bbb88c4a18 [RESTAPI] Update submodule (#9547) 2021-12-20 19:26:14 +00:00
Qi Luo
9fe3e1732a [sonic-slave]: Upgrade python lxml library version to 4.6.5 (#9529)
Bumps lxml from 4.6.5.
2021-12-20 19:26:09 +00:00
Stepan Blyshchak
bdf31a6556 [Mellanox][SDK] Build SDK with PRM sniffer support (#9500)
- Why I did it
To have an ability to use PRM sniffer.

- How I did it
Enabled the option in configure flags.

- How to verify it
Built and ran on switch. Enabled the feature in runtime and checked the sniffer recording.
Signed-off-by: Stepan Blyschak <stepanb@nvidia.com>
2021-12-20 19:25:52 +00:00
Shi Su
60ac485f96 Reduce route selection deferral timer for bgp graceful restart (#7533)
Why I did it
There are scenarios in which End-of-RIB from some of the peers arrives after reconciliation. In such scenarios, if the route selection deferral timer has the default value of 360 seconds, FRR would not set up routes and all routes would be removed after reconciliation. This PR reduces the route selection deferral timer so that at least the routes to some of the peers are restored at the point of reconciliation.

Fix #7488

How I did it
Reduce route selection deferral timer for bgp graceful restart to 15 seconds.
2021-12-20 19:24:58 +00:00
kellyyeh
006582b3e2
[dhcp_relay] Update DHCPv6 counter on relayed messages (#9283) (#9578)
(cherry picked from commit f2ee94d201)
2021-12-17 11:56:01 -08:00
Lawrence Lee
a41c15a329 [swss]: Listen for undeliverable tunnel packets (#9348)
- Create a script in the orchagent docker container which listens for these encapsulated packets which are trapped to CPU (indicating that they cannot be routed/no neighbor info exists for the inner packet). When such a packet is received, the script will issue a ping command to the packet's inner destination IP to start the neighbor learning process.
- This script is also resilient to portchannel status changes (i.e. interface going up or down). An interface going down does not affect traffic sniffing on interfaces which are still up. When an interface comes back up, we restart the sniffer to start capturing traffic on that interface again.
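
The core step of such a listener, extracting the inner destination IP from a trapped packet, can be sketched as below. This is a minimal illustration and not the script from the PR: it assumes IPv4-in-IPv4 encapsulation (IP protocol 4) and raw IP bytes as input, and both function names and the `ping` options are hypothetical.

```python
import ipaddress
from typing import List, Optional

IPPROTO_IPIP = 4  # IP protocol number for IPv4-in-IPv4 encapsulation


def inner_destination(packet: bytes) -> Optional[str]:
    """Return the inner destination IP of an IPv4-in-IPv4 packet, else None."""
    if len(packet) < 20 or packet[0] >> 4 != 4:
        return None
    outer_header_len = (packet[0] & 0x0F) * 4  # IHL is in 32-bit words
    if packet[9] != IPPROTO_IPIP:
        return None
    inner = packet[outer_header_len:]
    if len(inner) < 20 or inner[0] >> 4 != 4:
        return None
    # Bytes 16..19 of an IPv4 header hold the destination address.
    return str(ipaddress.IPv4Address(inner[16:20]))


def ping_command(ip: str) -> List[str]:
    """Build a one-shot ping command (assumed options) to trigger neighbor learning."""
    return ["ping", "-c", "1", ip]
```

A caller would pass each trapped packet to `inner_destination` and, when it returns an address, run the command from `ping_command` to start neighbor resolution.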
2021-12-16 11:59:34 -08:00
vmittal-msft
724037ebc3
BRCM SAI 4.3.5.1-9 for enabling SAI_SWITCH_ATTR_QOS_DSCP_TO_TC_MAP capability (#9463) 2021-12-14 09:56:21 -08:00
Devesh Pathak
c6f38d9815
[202012][sonic-swss] Updated submodule for sonic-swss (#9422) 2021-12-14 00:03:09 -08:00
Junchao-Mellanox
0197855d5d
[Mellanox] [202012] Allow user to set LED to orange (#9514)
Backport https://github.com/Azure/sonic-buildimage/pull/9259 to 202012

#### Why I did it

The Nvidia platform API does not support setting the LED to orange.

#### How I did it

Allow user to set LED to orange

#### How to verify it

Manual test
2021-12-13 16:04:06 -08:00
Travis Van Duyn
0226140e9c [snmp]: updated to support snmp config from redis configdb (#6134)
**- Why I did it**
I'm updating the jinja2 template to support getting SNMP information from the redis configdb. 
I'm using the format approved here: 
https://github.com/Azure/SONiC/pull/718

This will pave the way for us to deprecate the use of snmp.yml in the future.
Right now we will still be using both the snmp.yml and configdb to get variable information in order to create the snmpd.conf via the sonic-cfggen tool. 

**- How I did it**
I first updated the SNMP Schema in PR #718 to get that approved as a standardized format. 
Then I verified I could add snmp configs to the configdb using this standard schema.  Once the configs were added to the configdb then I updated the snmpd.conf.j2 file to support the updates via the configdb while still using the variables in the snmp.yml file in parallel.  This way we will have backward compatibility until we can fully migrate to the configdb only. 

By updating the snmpd.conf.j2 template and running the sonic-cfggen tool the snmpd.conf gets generated with using the values in both the configdb and snmp.yml file. 

Co-authored-by: trvanduy <trvanduy@microsoft.com>
2021-12-13 17:42:48 +00:00
Stephen Sun
8836b6bcd2 [Mellanox] Adjust buffer parameters with 2km cable supported for 4600C non-generic SKUs (#9215)
- Why I did it
Also recalculated all parameters with the latest algorithm with per-speed peer response time taken into account

- How I did it
Detailed information of each SKU:

C64:
t0: 32 100G downlinks and 32 100G uplinks
t1: 56 100G downlinks and 8 100G uplinks with 2km-cable supported
D112C8: 112 50G downlinks and 8 100G uplinks.
D48C40: 48 50G downlinks, 32 100G downlinks, and 8 100G uplinks
D100C12S2: 4 100G downlinks, 2 10G downlinks, 100 50G downlinks, and 8 100G uplinks
2km cable is supported for C64 on t1 only

- How to verify it
Run regression test (QoS)

Signed-off-by: Stephen Sun <stephens@nvidia.com>
2021-12-12 01:36:53 +00:00
zzhiyuan
4d18fb8377
[202012] [Arista] Update db with eeprom info through syseepromd (#9109)
Why I did it
Arista did not update db with eeprom info. Previous PR had issues that were reverted.

How I did it
Had Arista eeprom class inherit the class that has method to update db. Updated platform API methods for Arista 202012.

How to verify it
In redis-cli the keys and values can be seen. Can use sonic-mgmt testing to verify behavior, and see the chassis platform API methods have not regressed.
2021-12-10 08:24:24 -08:00
Stephen Sun
acac848858
[Reclaim buffer][202012] Reclaim unused buffers by applying zero buffer profiles (#9063)
- Why I did it
Support zero buffer profiles

1. Add buffer profiles and pool definition for zero buffer profiles
2. Support applying zero profiles on INACTIVE PORTS
3. Enable dynamic buffer manager to load zero pools and profiles from a JSON file

- How I did it
Add buffer profiles and pool definition for zero buffer profiles

If the buffer model is static:
 * Apply normal buffer profiles to admin-up ports
 * Apply zero buffer profiles to admin-down ports
If the buffer model is dynamic:
 * Apply normal buffer profiles to all ports
 * buffer manager will take care when a port is shut down
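
The selection rule above can be sketched as a small function. This is only an illustration of the described behavior: the real logic lives in the buffers_config.j2 templates and the buffer manager, and the function name, the `"normal"`/`"zero"` labels, and the input shape are assumptions.

```python
from typing import Dict


def select_profiles(port_admin_status: Dict[str, str],
                    buffer_model: str) -> Dict[str, str]:
    """Map each port to "normal" or "zero" buffer profiles.

    port_admin_status maps a port name to "up" or "down" (assumed shape).
    """
    if buffer_model not in ("static", "dynamic"):
        raise ValueError(f"unknown buffer model: {buffer_model}")
    selected = {}
    for port, admin in port_admin_status.items():
        if buffer_model == "dynamic" or admin == "up":
            # Dynamic model: every port gets normal profiles here; the
            # buffer manager handles ports that are shut down later.
            selected[port] = "normal"
        else:
            # Static model: admin-down ports get zero profiles.
            selected[port] = "zero"
    return selected
```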

Update buffers_config.j2 to support INACTIVE PORTS by extending the existing macros to generate the various buffer objects, including PGs, queues, ingress/egress profile lists

Originally, all the macros to generate the above buffer objects took active ports only as an argument.
Now that buffer items need to be generated on inactive ports as well, an extra argument representing the inactive ports needs to be added.
To be backward compatible, a new series of macros is introduced that takes both active and inactive ports as arguments.
The original version (with active ports only) will be checked first. If it is not defined, then the extended version will be called.
Only vendors who support zero profiles need to change their buffer templates
Enable buffer manager to load zero pools and profiles from a JSON file:

The JSON file is provided on a per-platform basis
It is copied from the platform/<vendor> folder to the /usr/share/sonic/templates folder at compile time and rendered when the swss container is created.
To make code clean and reduce redundant code, extract common macros from buffer_defaults_t{0,1}.j2 of all SKUs to two common files:
One in Mellanox-SN2700-D48C8 for single ingress pool mode
The other in ACS-MSN2700 for double ingress pool mode
The corresponding files of all other SKUs will be symbolic links to the above files

Update sonic-cfggen test accordingly:
 * Adjust example output file of JSON template for unit test
 * Add unit tests for Mellanox's new buffer templates.

- How to verify it
Regression test.
Unit test in sonic-cfggen
Run regression test and manually test.

Signed-off-by: stephens <stephens@nvidia.com>
2021-12-09 17:34:56 +02:00
xumia
6a6512246d
[Bug][Build]: fix the file not found issue caused by the relative path (#9450)
#### Why I did it
Merged from master branch: https://github.com/Azure/sonic-buildimage/pull/9443
Fix the issue that nodesource.list cannot be read; it is caused by not using the full path.

```
2021-12-03T06:59:26.0019306Z Removing intermediate container 77cfe980cd36
2021-12-03T06:59:26.0020872Z  ---> 528fd40e60f6
2021-12-03T06:59:26.0021457Z Step 81/81 : RUN post_run_buildinfo
2021-12-03T06:59:26.0841136Z  ---> Running in d804bd7e1b06
2021-12-03T06:59:29.1626594Z DEPRECATION: Python 2.7 reached the end of its life on January 1st, 2020. Please upgrade your Python as Python 2.7 is no longer maintained. pip 21.0 will drop support for Python 2.7 in January 2021. More details about Python 2 support in pip can be found at https://pip.pypa.io/en/latest/development/release-process/#python-2-support pip 21.0 will remove support for this functionality.
2021-12-03T06:59:34.2960105Z /usr/bin/sed: can't read nodesource.list: No such file or directory
2021-12-03T06:59:34.5094880Z The command '/bin/sh -c post_run_buildinfo' returned a non-zero code: 2
```
2021-12-07 23:42:33 -08:00
Jing Zhang
f93d1f64a2
[submodule]: Update submodule sonic-linkmgrd (#9421)
6c6151b Fix unstable unit tests (state change handler wasn't invoked) (#8)
2f7dc0a support code diff coverage (#5)
83f0002 Force mux state switch to standby if triggered from Cli (#6)

signed-off-by: Jing Zhang zhangjing@microsoft.com
2021-12-06 16:41:33 -08:00
Neetha John
8f15d39bae
[202012] [submodule] Update sonic-utilities pointer (#9410)
Contains the following commits
239cb5c  [flex counter] Flex counter threads consume too much CPU resources (Azure/sonic-utilities#1925)
8a3b41a [load_minigraph] Delay pfcwd start until the buffer templates are rendered  (Azure/sonic-utilities#1937)
2021-12-06 13:44:00 -08:00
kellyyeh
2019ccaa2a [radv] Run radv on MgmtToRRouter (#9424)
* Allow radv to run on mgmt tor and EPMS
2021-12-06 21:32:33 +00:00
xumia
5670f25416 [Build]: Fix tmpfs space not enough issue when building vs image (#9438)
Fix no space left on device issue in tmpfs.
2021-12-01T06:30:40.1651742Z cp: write error: No space left on device
2021-12-01T06:30:40.1652225Z Failure: local_fs_run():/dev/vdb Unable to copy /tmp/tmp.gl4Sgp/onie-installer.bin to tmpfs
2021-12-06 21:32:29 +00:00
xumia
a3f1769a34 [Build]: Cleanup the reproducible mirrors when build complete (#9132)
Why I did it
The reproducible build mirrors are only used during the build, the mirrors can be removed after that.
2021-12-06 21:32:24 +00:00
Volodymyr Samotiy
0831635b1c
[Mellanox] Update SDK to v4.4.3360 and FW to v2008.3358 (#9403)
- Why I did it
To include latest fixes.

1. On CMIS modules, after low power configuration, the firmware waited for the module state to be ModuleReady instead of ModuleLowPower causing delays.
2. When connecting Spectrum devices with optical transceivers that support RXLOS, remote side port down might cause the switch firmware to get stuck and cause unexpected switch behavior.
3. On rare occasions, when working with port rates of 1GbE or 10GbE and congestion occurs, packets may get stuck in the chip and may cause switch to hang.
4. When ECMP has high amount of next-hops based on VLAN interfaces, in some rare cases, packets will get a wrong VLAN tag and will be dropped.
5. Using SN4600C with copper or optics loopback cables in NRZ speeds, link-up may take a long time (up to 70 seconds).
6. When connecting SN4600C to SN4600C after Fastboot in 50GbE No_FEC mode with a copper cable, the link up time may take ~20 seconds.

- How I did it
Updated SDK submodule and relevant makefiles with the required versions.

- How to verify it
Build an image and run tests from "sonic-mgmt".

Signed-off-by: Volodymyr Samotiy <volodymyrs@nvidia.com>
2021-12-06 11:01:43 +02:00
xumia
05351731d5
[Build]: Enable arm pr check #9362
Support marvell-armhf dpkg cache and the azp check.
Waiting for merging PR #9381 to 202012 branch, so only azp template change in this PR.

Move the VS build to a new stage, BuildVS, and make the Test stage depend only on BuildVS, so that BuildVS and the other platforms' builds run in parallel. The Test stage has no dependency on the marvell-armhf build, which keeps the longer marvell-armhf build from adding to the overall build time.
2021-12-04 09:31:53 +08:00
xumia
1e8fe7dd6c Fix armhf version issue (#9382)
Why I did it
Fix the issue that some version files are not used.
For example, the version file version-py3-all-armhf is expected to be used when building marvell-armhf, but it is not.
2021-12-01 02:29:07 +00:00
Wirut Getbamrung
933454dc29 [device/celestica]: add controllable config to platform.json of e1031 (#9183) 2021-12-01 02:29:02 +00:00
Lawrence Lee
b3a3aa0c38 [mux]: Fix mark_dhcp_packet (#9373)
- Consolidate the two [Service] sections by moving the ExecStartPre line for mark_dhcp_packet.py to the first section and removing the second.
- Make the mark_dhcp_packet.py file executable
- Also clean up mark_dhcp_packet.py
    - Remove unused imports
    - Fix spacing and line lengths to conform to PEP8
Signed-off-by: Lawrence Lee <lawlee@microsoft.com>
2021-12-01 02:28:56 +00:00
arlakshm
9f0fc89cff remove staticd.conf.j2 (#9182)
Why I did it
resolves #8979 and #9055

How I did it
Remove the file staticd.conf.j2, which adds the default route on eth0, from the bgp docker

Signed-off-by: Arvindsrinivasan Lakshmi Narasimhan <arlakshm@microsoft.com>
2021-12-01 02:28:51 +00:00
Stephen Sun
fafd5327bd [Reclaim buffer] Common infrastructure update for reclaiming buffer (#9133)
- Why I did it
This is to update the common sonic-buildimage infra for reclaiming buffer.

- How I did it
Render zero_profiles.j2 to zero_profiles.json for vendors that support reclaiming buffer
The zero profiles will be referenced in PR [Reclaim buffer] Reclaim unused buffers by applying zero buffer profiles #8768 on Mellanox platforms and there will be test cases to verify the behavior there.
Rendering is done here for passing azure pipeline.
Load zero_profiles.json when the dynamic buffer manager starts
Generate inactive port list to reclaim buffer

Signed-off-by: Stephen Sun <stephens@nvidia.com>
2021-12-01 02:28:46 +00:00
Junchao-Mellanox
227f2f8aec [Mellanox] Fan speed should not be 100% when PSU is powered off (#9258)
- Why I did it
When PSU is powered off, the PSU is still on the switch and the air flow is still the same. In this case, it is not necessary to set FAN speed to 100%.

- How I did it
When the PSU is powered off, don't treat it as absent.

- How to verify it
Adjust existing unit test case
Add new case in sonic-mgmt
2021-12-01 02:28:37 +00:00
xumia
d9fd39538b Support dpkg cache for marvell-armhf (#9381)
Why I did it
Support marvell-armhf dpkg cache
2021-11-30 13:11:12 +00:00
Junchao-Mellanox
b1162682cb
[system-health] [202012] No longer check critical process/service status via monit (#9367)
Backport https://github.com/Azure/sonic-buildimage/pull/9068 to 202012

#### Why I did it

The command `monit summary -B` can no longer display the status of each critical process, so system-health should not depend on it and needs another way to monitor critical processes. This PR addresses that. monit is still used by system-health for the file system check and for customized checks.

#### How I did it

1.	Get container names from FEATURE table
2.	For each container, collect critical process names from file critical_processes
3.	Use `docker exec -it <container_name> bash -c 'supervisorctl status'` to get the process status inside each container, parse the output, and check whether any critical process has exited
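
The parsing step in item 3 can be sketched as follows. This is a minimal illustration, not the system-health implementation: it assumes the usual `supervisorctl status` layout (process name, then state, then details) and treats any critical process not reported as RUNNING as unhealthy; the function name is hypothetical.

```python
from typing import Iterable, List


def unhealthy_critical_processes(status_output: str,
                                 critical: Iterable[str]) -> List[str]:
    """Return critical processes that supervisorctl does not report as RUNNING."""
    running = set()
    for line in status_output.splitlines():
        fields = line.split()
        # Expected layout: "<name> <STATE> <details...>"
        if len(fields) >= 2 and fields[1] == "RUNNING":
            running.add(fields[0])
    return sorted(set(critical) - running)
```

Given the text captured from one container and the names read from its critical_processes file, an empty result means every critical process is running.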

#### How to verify it

1. Add unit test case to cover it
2. Adjust sonic-mgmt cases to cover it
3. Manual test
2021-11-24 15:36:14 -08:00
Jing Zhang
3e6cdfa3a6
[sonic-linkmgrd] submodule update (#9343)
Submodule update for sonic-linkmgrd
Incorporates:

c11a576 (2021-11-22 09:38:46) [ci]: show code coverage in azure pipeline (#4)
4ceb01d (2021-11-18 20:24:20) Fix MUX toggling issue (#1)
d640527 (2021-11-12 22:31:44) [ci]: fix artifact download
b9f247d (2021-11-12 22:31:44) [ci]: use native arm64/armhf build
3059122 (2021-09-27 11:32:23) [linkgrd] Add Missing Apache License Header

signed-off-by: Jing Zhang zhangjing@microsoft.com
2021-11-24 11:12:22 -08:00
tjchadaga
d3a5c5ccd0
[202012][sonic-sairedis] update submodule (#9364)
Update sonic-sairedis submodule to get the below fixes:

7389704 [202012] Add ACL_TABLE object to break before make list (Azure/sonic-sairedis#971)
f334349 Fix hung issue when installing linux kernel modules (Azure/sonic-sairedis#969)
2021-11-24 11:10:50 -08:00
xumia
415fd17689 [Build]: Fix the version not found issue (#9331)
Currently, when we update a SAI package that is downloaded from a remote server, we need to update the version file as well. However, the reproducible build feature is not enabled in master, so a missing version is only detected when the code is merged into release branches such as 202106 and 202012.
The reproducible build feature is meant to reduce build failures, so there is no need to break the build when a version is not specified. If the version is not specified, the best choice is to accept the version from the remote server.

Co-authored-by: Ubuntu <xumia@xumia-vm1.jqzc3g5pdlluxln0vevsg3s20h.xx.internal.cloudapp.net>
2021-11-24 01:16:37 +00:00