Commit Graph

6677 Commits

Author SHA1 Message Date
Renuka Manavalan
31e750ee0b
Fix PR build failure (#11973)
Some PR builds fails to find this file. Remove it temporarily until we root cause it
2022-09-06 15:13:05 -07:00
Stepan Blyshchak
a8b2a538a5
[docker-wait-any] immediately start to wait (#11595)
It could happen that a container has already crashed but docker-wait-any
will wait forever till it starts. It should, however, immediately exit
to make the serivce restart.

#### Why I did it

It is observed in some circumstances that the auto-restart mechanism does not work. Specifically for ```swss.service```, ```orchagent``` had crashed before ```docker-wait-any``` started in ```swss.sh```. This led ```docker-wait-any``` wait forever for ```swss``` to be in ```"Running"``` state and it results in:

```
CONTAINER ID   IMAGE                                COMMAND                  CREATED        STATUS                    PORTS     NAMES
1abef1ecebff   bcbca2b74df6                         "/usr/local/bin/supe…"   22 hours ago   Up 22 hours                         what-just-happened
3c924d405cd5   docker-lldp:latest                   "/usr/bin/docker-lld…"   22 hours ago   Up 22 hours                         lldp
eb2b12a98c13   docker-router-advertiser:latest      "/usr/bin/docker-ini…"   22 hours ago   Up 22 hours                         radv
d6aac4a46974   docker-sonic-mgmt-framework:latest   "/usr/local/bin/supe…"   22 hours ago   Up 22 hours                         mgmt-framework
d880fd07aab9   docker-platform-monitor:latest       "/usr/bin/docker_ini…"   22 hours ago   Up 22 hours                         pmon
75f9e22d4fdd   docker-snmp:latest                   "/usr/local/bin/supe…"   22 hours ago   Up 22 hours                         snmp
76d570a4bd1c   docker-sonic-telemetry:latest        "/usr/local/bin/supe…"   22 hours ago   Up 22 hours                         telemetry
ee49f50344b3   docker-syncd-mlnx:latest             "/usr/local/bin/supe…"   22 hours ago   Up 22 hours                         syncd
1f0b0bab3687   docker-teamd:latest                  "/usr/local/bin/supe…"   22 hours ago   Up 22 hours                         teamd
917aeeaf9722   docker-orchagent:latest              "/usr/bin/docker-ini…"   22 hours ago   Exited (0) 22 hours ago             swss
81a4d3e820e8   docker-fpm-frr:latest                "/usr/bin/docker_ini…"   22 hours ago   Up 22 hours                         bgp
f6eee8be282c   docker-database:latest               "/usr/local/bin/dock…"   22 hours ago   Up 22 hours                         database
```

The check for ```"Running"``` state is not needed because for cold boot case we do ```start_peer_and_dependent_services``` and for warm boot case the loop will retry to wait for container if this container is doing warm boot:
d01a91a569/files/image_config/misc/docker-wait-any (L56)

#### How I did it

Removed the check for ```"Running"```.

#### How to verify it

Kill swss before ```docker-wait-any``` is reached and verify auto restart will restart swss serivce.
2022-09-06 09:26:54 -07:00
Dror Prital
1b5d07f665
[submodule] Advance sonic-platform-daemons pointer (#11882)
- Why I did it
Update sonic-platform-daemons submodule pointer to include the following:

[ycabled] enable telemetry for 'active-active'; fix gRPC portid ordering (#284)
[ycabled] remove some spurious logs (#282)
Correct the peer forwarding state table (#281)
add psu input voltage and current (#276)
[ycabled] add capability to enable/disable telemetry (#279)

Signed-off-by: dprital <drorp@nvidia.com>
2022-09-04 11:03:36 +03:00
Dror Prital
4e18510fc9
[submodule]] Advance sonic-utilities pointer (#11897)
Update sonic-utilities submodule pointer to include the following:

Replace cmp in acl_loader with operator.eq (#2328)
Subinterface vrf bind issue fix (#2211)
[VRF]Adding CLI checks to ensure Vrf is valid in interface bind and static route commands (#2333)
[doc]: Add MACsec CLI doc (#2334)
[sonic-package-manager] Drop 'expires_in' (#2002)
Handle non-front-panel ports in is_rj45_port (#2327)
[service_mgmt]: Fix fetch MULTI_INST_DEPENDENT bug in service_mgmt.sh.j2 (#2319)
correct an error by changing "show bgp summary" to "show bfd summary" (#2324)
Update VRF unbind command (#2331)
Fix issue: port_type is referenced before initialized (#2323)
Fix issue: exception in is_rj45_port in multi ASIC env (#2313)
Delete .DS_Store (#2244)
Fix bug with checking VRF's routes in route_check.py (#2301)
[decode-syseeprom] Fix setting use_db based on support_eeprom_db (#2270)
Fix vrf UT failed issue (#2309)
add lacp_rate to portchannel (#2036)
2022-09-04 10:54:16 +03:00
Kebo Liu
78eeeb7670
[SN2201] remove extra empty lines in the pg_profile_lookup.ini (#11923)
- Why I did it
Remove extra empty lines in the SN2201 pg_profile_lookup.ini to make it aligned with other platforms.
This extra empty line could confuse some test cases which need to parse this file.

Signed-off-by: Kebo Liu <kebol@nvidia.com>
2022-09-04 10:53:07 +03:00
Zain Budhwani
6a54bc439a
Streaming structured events implementation (#11848)
With this PR in, you flap BGP and use events_tool to see the published events.
With telemetry PR #111 in and corresponding submodule update done in buildimage, one could run gnmi_cli to capture BGP flap events.
2022-09-03 07:33:25 -07:00
Renuka Manavalan
71d63a7be7
New commits in swss-common: (#11954)
* 651f52b (HEAD, origin/master, origin/HEAD, master) Change syslog level of Event Publish (#677)
* aca253a Add routing rule table for DASH (#668)
2022-09-02 21:52:24 -07:00
Lawrence Lee
e96ec5a74c
[kernel]: Submodule update (#11947)
Include following commit:

- 443253f [patch]: Add accpt_untracked_na kernel param (#292)

Signed-off-by: Lawrence Lee <lawlee@microsoft.com>
2022-09-02 16:06:10 -07:00
Ye Jianquan
750e1b3017
Define whether a test is required by code (#11921)
* Define whether a test is required by code

Why I did it
Define whether a test job is required before merging by code.
Let the failure of multi-asic and t0-sonic don't block pr merge.
The 'required' configuration can be modified by the owner in the future.

How I did it
Required:
t1-lag, t0
Not required:
multi-asic, t0-sonic

How to verify it
AZP itself verifies it.

Signed-off-by: jianquanye@microsoft.com
2022-09-03 06:53:54 +08:00
Ying Xie
a6843927d9
[mux] skip mux operations during warm shutdown (#11937)
* [mux] skip mux operations during warm shutdown

- Enhance write_standby.py script to skip actions during warm shutdown.
- Expand the support to BGP service.
- MuX support was added by a previous PR.
- don't skip action during warm recovery

Signed-off-by: Ying Xie <ying.xie@microsoft.com>
2022-09-02 13:50:42 -07:00
Lawrence Lee
a762b35cbc
[arp_update]: Set failed IPv6 neighbors to incomplete (#11919)
After pinging any failed IPv6 neighbor entries, set the remaining failed/incomplete entries to a permanent INCOMPLETE state. This manual setting to INCOMPLETE prevents these entries from automatically transitioning to FAILED state, and since they are now incomplete any subsequent NA messages for these neighbors is able to resolve the entry in the cache.

Signed-off-by: Lawrence Lee <lawlee@microsoft.com>
2022-09-02 13:40:40 -07:00
Liu Shilong
030de9f26d
[actions] Add github context env in label action. (#11926) 2022-09-02 14:07:48 +08:00
arunlk-dell
f82c1fd8ae
Z9432F kernel dependency of platform module (#11941)
Why I did it
Z9432F Update the kernel dependency of platform module

How I did it
Modified the kernel version to current latest 5.10.0-12-2
2022-09-01 16:55:41 -07:00
Jing Zhang
fdd9130ecf
[YANG] Add MUX_CABLE yang model (#11797)
Why I did it
Address issue #10970

sign-off: Jing Zhang zhangjing@microsoft.com

How I did it
Add sonic-mux-cable.yang and unit tests.

How to verify it
Compile Compile target/python-wheels/sonic_yang_mgmt-1.0-py3-none-any.whl and target/python-wheels/sonic_yang_models-1.0-py3-none-any.whl.
Pass sonic-config-engine unit test.
Which release branch to backport (provide reason below if selected)
 201811
 201911
 202006
 202012
 202106
 202111
 202205
Description for the changelog
Link to config_db schema for YANG module changes
f8fe41a023/src/sonic-yang-models/doc/Configuration.md (mux_cable)
2022-09-01 11:52:02 -07:00
Zhaohui Sun
88191b063b
Add python-is-python3 package for bullseye base docker (#11895)
Why I did it
In latest syncd container, it is installed bullseye, can't find command '/usr/bin/python'.
Some scripts such as test_copp still calls /usr/bin/python in syncd.
Submitted the change in #11807 for syncd docker, but it's better to add it in bullseye base docker.

How I did it
Install python-is-python3 package in bullseye base docker to resolve this issue, whatever run python or python3, it will run /usr/bin/python3, will not cause the error of can't find command '/usr/bin/python'

How to verify it
run python in syncd container.

Signed-off-by: Zhaohui Sun <zhaohuisun@microsoft.com>
2022-09-01 08:13:24 +08:00
Jing Kan
353b2742b2
[YANG] Create YANG Model for Console (#11806)
How I did it
Create YANG Model for SONiC console related features.

How to verify it
Add tests

Signed-off-by: Jing Kan jika@microsoft.com
2022-09-01 07:58:16 +08:00
Longxiang Lyu
6e878a36da
[mux] Exit to write standby state to active-active ports (#11821)
[mux] Exit to write standby state to `active-active` ports

Signed-off-by: Longxiang Lyu <lolv@microsoft.com>
2022-08-31 13:10:22 -07:00
Junchao-Mellanox
46292d71be
Add linux perf tool to sonic image (#11906) 2022-08-31 13:09:36 -07:00
Dev Ojha
c601f24139
[Arista7050cx3] TD3 SKU changes for pg headroom value after interop testing with cisco 8102 (#11901)
Why I did it
After PFC interop testing between 8102 and 7050cx3, data packet losses were observed on the Rx ports of the 7050cx3 (inflow from 8102) during testing. This was primarily due to the slower response times to react to PFC pause packets for the 8102, when receiving such frames from neighboring devices. To solve for the packet drops, the 7050cx3 pg headroom size has to be increased to 160kB.

How I did it
Modified the xoff threshold value to 160kB in the pg_profile file to allow for the buffer manager to read that value when building the image, and configuring the device

How to verify it
run "mmuconfig -l" once image is built


Signed-off-by: dojha <devojha@microsoft.com>
2022-08-31 11:08:32 -07:00
abdosi
4027147238
Align API get_device_runtime_metadata() for python version < 3.9 (#11900)
Why I did it:
API get_device_runtime_metadata() added by #11795 uses merge operator for dict but that is supported only for python version >=3.9. This API will be be used by scrips eg:hostcfgd which is still build for buster which does not have python 3.9 support.
2022-08-31 10:48:15 -07:00
saksarav-nokia
1e75abc274
[Nokia][Nokia-IXR7250E-36x100G & Nokia-IXR7250E-36x400G] Update BCM (#11577)
config to support ERSPAN egress mirror and also set flag to preserve ECN
2022-08-30 20:23:17 -07:00
Arun Saravanan Balachandran
092e0394b5
DellEMC: Z9332f - Graceful platform reboot (#10240)
Why I did it
To gracefully unmount filesystems and stop containers while performing a cold reboot.
Unmount ONIE-BOOT if mounted during fast/soft/warm reboot
How I did it
Override systemd-reboot service to perform a cold reboot.
Unmount ONIE-BOOT if mounted using fast/soft/warm-reboot plugins.
How to verify it
On reboot, verify that the container stop and filesystem unmount services have completed execution before the platform reboot.
2022-08-30 11:23:52 -07:00
Liu Shilong
cf69206d02
[ci] Fix bug involved by PR 11810 which affect official build pipeline (#11891)
Why I did it
Fix the official build not triggered correctly issue, caused by the azp template path not existing.

How I did it
Change the azp template path.
2022-08-30 14:23:09 +08:00
xumia
4733053c53
Fix vs check install login timeout issue (#11727)
Why I did it
Fix a build not stable issue: #11620
The vs vm has started successfully, but failed to wait for the message "sonic login:".

There were 55 builds failed caused by the issue in the last 30 days.

AzurePipelineBuildLogs
| where startTime > ago(30d)
| where type =~ "task"
| where result =~ "failed"
| where name =~ "Build sonic image"
| where content contains "Timeout exceeded"
| where content contains "re.compile('sonic login:')"
| project-away content
| extend branchName=case(reason=~"pullRequest", tostring(todynamic(parameters)['system.pullRequest.targetBranch']),
              replace("refs/heads/", "", sourceBranch))
| summarize FailedCount=dcount(buildId) by branchName

branchName	FailedCount
master	37
202012	9
202106	4
202111	2
202205	1
201911	1
It is caused by the login message mixed with the output message of the /etc/rc.local, one of the examples as below: (see the message rc.local[307]: sonic+ onie_disco_subnet=255.255.255.0 login: )
The check_install.py was waiting for the message "sonic login:", and Linux console was waiting for the username input (the login message has already printed in the console).
https://dev.azure.com/mssonic/build/_build/results?buildId=123294&view=logs&j=cef3d8a9-152e-5193-620b-567dc18af272&t=359769c4-8b5e-5976-a793-85da132e0a6f

2022-07-17T15:00:58.9198877Z [   25.493855] rc.local[307]: + onie_disco_opt53=05
2022-07-17T15:00:58.9199330Z [   25.595054] rc.local[307]: + onie_disco_router=10.0.2.2
2022-07-17T15:00:58.9199781Z [   25.699409] rc.local[307]: + onie_disco_serverid=10.0.2.2
2022-07-17T15:00:58.9200252Z [   25.789891] rc.local[307]: + onie_disco_siaddr=10.0.2.2
2022-07-17T15:00:58.9200622Z [   25.880920] 
2022-07-17T15:00:58.9200745Z 
2022-07-17T15:00:58.9201019Z Debian GNU/Linux 10 sonic ttyS0
2022-07-17T15:00:58.9201201Z 
2022-07-17T15:00:58.9201542Z rc.local[307]: sonic+ onie_disco_subnet=255.255.255.0 login: 
2022-07-17T15:00:58.9202309Z [   26.079767] rc.local[307]: + onie_exec_url=file://dev/vdb/onie-installer.bin

How I did it
Input a newline when finished to run the script /etc/rc.local.
If entering a newline, the message "sonic login:" will prompt again.
2022-08-30 09:19:58 +08:00
Saikrishna Arcot
de54eece46
Fix error handling when failing to install a deb package (#11846)
The current error handling code for when a deb package fails to be
installed currently has a chain of commands linked together by && and
ends with `exit 1`. The assumption is that the commands would succeed,
and the last `exit 1` would end it with a non-zero return code, thus
fully failing the target and causing the build to stop because of bash's
-e flag.

However, if one of the commands prior to `exit 1` returns a non-zero
return code, then bash won't actually treat it as a terminating error.
From bash's man page:

```
-e      Exit immediately if a pipeline (which may consist of a single simple
	command), a list, or a compound command (see SHELL GRAMMAR above),
        exits with a non-zero status.  The shell does not exit if the
        command that fails is part of the  command  list  immediately
        following a while or until keyword, part of the test following the
        if or elif reserved words, part of any command executed in a && or
        || list except the command following the final && or ||, any
        command in a pipeline but the last, or if the command's return
        value is being inverted with !.  If a compound command other than a
        subshell returns a non-zero status because a command failed while
        -e was being ignored, the shell does not exit.
```

The part `part of any command executed in a && or || list except the
command following the final && or ||` says that if the failing command
is not the `exit 1` that we have at the end, then bash doesn't treat it
as an error and exit immediately. Additionally, since this is a compound
command, but isn't in a subshell (subshell are marked by `(` and `)`,
whereas `{` and `}` just tells bash to run the commands in the current
environment), bash doesn't exist. The result of this is that in the
deb-install target, if a package installation fails, it may be
infinitely stuck in that while-loop.

There are two fixes for this: change to using a subshell, or use `;`
instead of `&&`. Using a subshell would, I think, require exporting any
shell variables used in the subshell, so I chose to change the `&&` to
`;`. In addition, at the start of the subshell, `set +e` is added in,
which removes the exit-on-error handling of bash. This makes sure that
all commands are run (the output of which may help for debugging) and
that it still exits with 1, which will then fully fail the target.

Signed-off-by: Saikrishna Arcot <sarcot@microsoft.com>

Signed-off-by: Saikrishna Arcot <sarcot@microsoft.com>
2022-08-29 11:35:07 -07:00
Saikrishna Arcot
186568a21d
Update sensor names for msn4600c for the 5.10 kernel (#11491)
* Update sensor names for msn4600c for the 5.10 kernel

Looks like a sensor was removed in the 5.10 kernel for the tps53679
sensor, so the names/indexing has changed.

Related to Azure/sonic-mgmt#4513.

Signed-off-by: Saikrishna Arcot <sarcot@microsoft.com>

* Update sensors file

Signed-off-by: Saikrishna Arcot <sarcot@microsoft.com>

Signed-off-by: Saikrishna Arcot <sarcot@microsoft.com>
2022-08-29 11:34:23 -07:00
xumia
a1d3d99457
[Build] Increase the size of the installer image (#11869)
#### Why I did it
Fix the build failure caused by the installer image size too small. The installer image is only used during the build, not impact the final images.
See https://dev.azure.com/mssonic/build/_build/results?buildId=139926&view=logs&j=cef3d8a9-152e-5193-620b-567dc18af272&t=359769c4-8b5e-5976-a793-85da132e0a6f

```
+ fallocate -l 2048M ./sonic-installer.img
+ mkfs.vfat ./sonic-installer.img
mkfs.fat 4.2 (2021-01-31)
++ mktemp -d
+ tmpdir=/tmp/tmp.TqdDSc00Cn
+ mount -o loop ./sonic-installer.img /tmp/tmp.TqdDSc00Cn
+ cp target/sonic-vs.bin /tmp/tmp.TqdDSc00Cn/onie-installer.bin
cp: error writing '/tmp/tmp.TqdDSc00Cn/onie-installer.bin': No space left on device
[  FAIL LOG END  ] [ target/sonic-vs.img.gz ]
```

#### How I did it
Increase the size from 2048M to 4096M.
Why not increase to 16G like qcow2 image?
The qcow2 supports the sparse disk, although a big disk size allocated, but it will not consume the real disk size. The falocate does not support the sparse disk. We do not want to allocate a very big disk, but no use at all. It will require more space to build.
2022-08-29 11:15:42 -07:00
abdosi
3bf1abb2dc
Address Review Comment to define SONIC_GLOBAL_DB_CLI in gbsyncd.sh (#11857)
As part of PR #11754

    Change was added to use variable SONIC_DB_NS_CLI for
    namespace but that will not work since ./files/scripts/syncd_common.sh
    uses SONIC_DB_CLI. So revert back to use SONIC_DB_CLI and define new
    variable for SONIC_GLOBAL_DB_CLI for global/host db cli access

   Also fixed DB_CLI not working for namespace.
2022-08-29 08:19:28 -07:00
Liu Shilong
35945c9015
[ci] Update reproducible build related pipeline. (#11810) 2022-08-29 11:26:15 +08:00
Liu Shilong
4b4e311c14
[actions] Update github actions label and automerge. (#11736)
1. Add auto approve step when adding label to version upgrading PR.
2. Use mssonicbld TOKEN to merge version upgrading PR instead of 'github actions'
2022-08-29 11:24:57 +08:00
Samuel Angebault
f2c9a3584d
[Arista] Fix content of platform.json for DCS-720DT-48S (#11855)
Why I did it
Content of platform.json was outdated and some platform_tests/api of sonic-mgmt were failing.

How I did it
Added the necessary values to platform.json

How to verify it
Running platform_tests/api of sonic-mgmt should yield 100% passrate.
2022-08-29 10:45:24 +08:00
Zain Budhwani
178a30bc3b
Update swss common submodule for events api (#11858)
#### Why I did it

Structured events code like eventd, rsyslogplugin, requires changes made in swss-common

Submodule adds these newest commits:

56b0f18 (HEAD, origin/master, origin/HEAD, master) Events: APIs to set/get global options (#672)
5467c89 Add changes to yml file to improve pytest (#674)

#### How I did it

Updated git submodule

#### How to verify it

Check new commit pointer
2022-08-28 12:28:58 -07:00
Junhua Zhai
bd55b342d9
Advance submodule sonic-sairedis (#11704)
2022-07-28 854d54e: Add support of mdio IPC server class using sai switch api and unix socket (sonic-net/sonic-sairedis#1080) (Jiahua Wang)
2022-07-27 513cb2a: [FlexCounter] Refactor FlexCounter class (sonic-net/sonic-sairedis#1073) (Junchao-Mellanox)
2022-08-28 12:48:22 +08:00
Samuel Angebault
8ccae96bfe
[Arista] Update platform submodule (#11853) 2022-08-26 09:38:45 -07:00
abdosi
0ed53bbc1b
New API to support runtime metadata needed for Feature Table field jinja rendering. (#11795)
Added new API to return runtime metadata dictionary as needed during Feature Table field rendering by hostcfgd.
2022-08-26 08:04:00 -07:00
Junhua Zhai
0c820a1826
Mount directory warmboot in docker gbsyncd (#11852)
Why I did it
The directory /var/warmboot as top directory for warmboot feature is also needed in docker gbsyncd. Some vendor SAI might save data under it. Without it, the SAI init/creation API failure has happened on PikeZ platform.

How I did it
Mount host directory /host/warmboot as /var/warmboot in docker gbsyncd, which is same as what it has done on docker syncd.
2022-08-26 22:00:45 +08:00
Richard.Yu
a1eae940d5
[SAIServer] support saiserver v2 in bullseye (#11849)
Upgrade libboost-atomic1.71 to libboost-atomic1.74

Signed-off-by: richardyu-ms <richard.yu@microsoft.com>

Signed-off-by: richardyu-ms <richard.yu@microsoft.com>
2022-08-25 22:51:53 -07:00
andywongarista
1b83e418f8
Enable AN for Ethernet24-47 (#11839)
Enable port AN ON explicitly and then port will become (oper status) UP. Somehow those ports AN are not default ON in bcm sdk.
2022-08-26 09:30:11 +08:00
arunlk-dell
13bd63e73a
DellEMC: S5296F Platform API 2.0 changes (#11162)
Why I did it
S5296F - Platform API 2.0 changes

How I did it
Implemented the functional API's needed for Platform API 2.0

How to verify it
Used the API 2.0 test suite to validate the test cases.
2022-08-25 17:07:23 -07:00
SuvarnaMeenakshi
8d06de37ae
Add support to get fabric asic namespaces list. (#11793)
Why I did it
VoQ chassis supervisor will have Fabric asics and the sub_role for fabric asics will be "Fabric".
The fabric asics namespaces are not being returned in get_all_namespaces() and is required in caclmgrd to add right cacl to allow internal docker traffic from fabric asic namespaces.
test_cacl_application fails on VoQ chassis Supervisor with the error:
Failed: Missing expected iptables rules: set(['-A INPUT -s 240.127.1.1/32 -d 240.127.1.1/32 -j ACCEPT', '-A INPUT -s 240.127.1.3/32 -d 240.127.1.1/32 -j ACCEPT', '-A INPUT -s 240.127.1.2/32 -d 240.127.1.1/32 -j ACCEPT'])

How I did it
Update get_all_namespaces to return fabric namespaces list.

How to verify it
Verified on VoQ chassis.
2022-08-25 12:39:52 -07:00
Jing Zhang
0cdef2ebc6
[YANG] add peer switch model (#11828)
Why I did it
Address issue #10966

sign-off: Jing Zhang zhangjing@microsoft.com

How I did it
Add sonic-peer-switch.yang and unit tests.

How to verify it
Compile Compile target/python-wheels/sonic_yang_mgmt-1.0-py3-none-any.whl and target/python-wheels/sonic_yang_models-1.0-py3-none-any.whl.

Which release branch to backport (provide reason below if selected)
 201811
 201911
 202006
 202012
 202106
 202111
 202205
Description for the changelog
Link to config_db schema for YANG module changes
b721ff87b9/src/sonic-yang-models/doc/Configuration.md (peer-switch)
2022-08-25 08:46:45 -07:00
ShiyanWangMS
83704d9955
Upgrade docker-sonic-mgmt base image from Ubuntu18.04 to 20.04 (#11831)
Update base image from ubuntu18.04 to ubuntu20.04
Fix necessary dependencies.
After upgrade, Py2 is 2.7.18, Py3 is 3.8.10.
2022-08-25 15:55:01 +08:00
Sangita Maity
30e79a1a3f
[snmpd] Fixed snmpd memory leak issue (#11812)
#### Why I did it
This fixed memory leak in ETHERLIKE-MIB. The fix is not part of net-snmp(5.7.3 version). This PR includes the patch to fix memory leak issue.

```
ke->name in stdup-ed at line 297: n->name = strdup(RTA_DATA(tb[IFLA_IFNAME]));
```
#### How I did it
patched the fix.
[net-snmp] upstream fix link -> [snmpd}upstream link](ed4e48b5fa)


#### How to verify it
**Before The fix**

used valgrind to find memory leak.
```
root@lnos-x1-a-csw06:/# grep "definitely lost" valgrind-out.txt
==493== 4 bytes in 1 blocks are definitely lost in loss record 1 of 333
==493== 16 bytes in 1 blocks are definitely lost in loss record 25 of 333
==493== 757 bytes in 71 blocks are definitely lost in loss record 214 of 333
==493== 1,168 (32 direct, 1,136 indirect) bytes in 1 blocks are definitely lost in loss record 293 of 333
==493== 1,168 (32 direct, 1,136 indirect) bytes in 1 blocks are definitely lost in loss record 294 of 333
==493== 1,168 (32 direct, 1,136 indirect) bytes in 1 blocks are definitely lost in loss record 295 of 333
==493== 1,168 (32 direct, 1,136 indirect) bytes in 1 blocks are definitely lost in loss record 296 of 333
==493==    definitely lost: 905 bytes in 77 blocks
```
_we can see  the memory leak see in stack trace._
-> dot3stats_linux -> get_nlmsg -> strdup
https://github.com/net-snmp/net-snmp/blob/v5.7.3/agent/mibgroup/etherlike-mib/data_access/dot3stats_linux.c
https://github.com/net-snmp/net-snmp/blob/v5.7.3/agent/mibgroup/etherlike-mib/data_access/dot3stats_linux.c#L277

```
n = malloc(sizeof(*n));
    memset(n, 0, sizeof(*n));

    n->ifindex = ifi->ifi_index;
    n->name = strdup(RTA_DATA(tb[IFLA_IFNAME]));
    memcpy(&n->stats, RTA_DATA(tb[IFLA_STATS]), sizeof(n->stats));
    n->next = kern_db;
    kern_db = n;
    return 0;
```
we were not freeing space for EtherLike-MIB.AS interface mib queries were getting increased, we see memory increment.

```
        kern_db = ke->next;
        free(ke);
```
https://github.com/net-snmp/net-snmp/blob/v5.7.3/agent/mibgroup/etherlike-mib/data_access/dot3stats_linux.c#L467

```
==55== 757 bytes in 71 blocks are definitely lost in loss record 186 of 299
==55==    at 0x483577F: malloc (in /usr/lib/x86_64-linux-gnu/valgrind/vgpreload_memcheck-amd64-linux.so)
==55==    by 0x4EB6E49: strdup (strdup.c:42)
==55==    by 0x493F278: get_nlmsg (dot3stats_linux.c:299)
==55==    by 0x493F529: rtnl_dump_filter_l.constprop.3 (dot3stats_linux.c:370)
==55==    by 0x493FD7A: rtnl_dump_filter (dot3stats_linux.c:401)
==55==    by 0x493FD7A: _dot3Stats_netlink_get_errorcntrs (dot3stats_linux.c:424)
==55==    by 0x494009F: interface_dot3stats_get_errorcounters (dot3stats_linux.c:530)
==55==    by 0x48F6FDA: dot3StatsTable_container_load (dot3StatsTable_data_access.c:330)
==55==    by 0x485E76B: _cache_load (cache_handler.c:700)
==55==    by 0x485FA37: netsnmp_cache_helper_handler (cache_handler.c:638)
==55==    by 0x48720BC: netsnmp_call_handler (agent_handler.c:526)
==55==    by 0x48720BC: netsnmp_call_next_handler (agent_handler.c:640)
==55==    by 0x4865F75: table_helper_handler (table.c:717)
==55==    by 0x4871B66: netsnmp_call_handler (agent_handler.c:526)
==55==    by 0x4871B66: netsnmp_call_handlers (agent_handler.c:611)

757 bytes in 71 blocks are definitely lost in loss record 214 of 333
==493==    at 0x483577F: malloc (in /usr/lib/x86_64-linux-gnu/valgrind/vgpreload_memcheck-amd64-linux.so)
==493==    by 0x4EB6E49: strdup (strdup.c:42)
==493==    by 0x493F278: ??? (in /usr/lib/x86_64-linux-gnu/libnetsnmpmibs.so.30.0.3)
==493==    by 0x493F529: ??? (in /usr/lib/x86_64-linux-gnu/libnetsnmpmibs.so.30.0.3)
==493==    by 0x493FD7A: _dot3Stats_netlink_get_errorcntrs (in /usr/lib/x86_64-linux-gnu/libnetsnmpmibs.so.30.0.3)
==493==    by 0x494009F: interface_dot3stats_get_errorcounters (in /usr/lib/x86_64-linux-gnu/libnetsnmpmibs.so.30.0.3)
==493==    by 0x48F6FDA: dot3StatsTable_container_load (in /usr/lib/x86_64-linux-gnu/libnetsnmpmibs.so.30.0.3)
==493==    by 0x485E76B: _cache_load (cache_handler.c:700)
==493==    by 0x485FA37: netsnmp_cache_helper_handler (cache_handler.c:638)
==493==    by 0x48720BC: netsnmp_call_handler (agent_handler.c:526)
==493==    by 0x48720BC: netsnmp_call_next_handler (agent_handler.c:640)
==493==    by 0x4865F75: table_helper_handler (table.c:717)
==493==    by 0x4871B66: netsnmp_call_handler (agent_handler.c:526)
==493==    by 0x4871B66: netsnmp_call_handlers (agent_handler.c:611)
```
```
**After The fix**
no  memory leak in valgrind stack trace related to etherlike MIB.
```
2022-08-25 00:24:53 -07:00
Hua Liu
214e394ac0
Remove swsssdk from rules and image. (#11469)
#### Why I did it
To deprecate swsssdk, remove all dependency to it. 

#### How I did it
Remove swsssdk from rules and build image scripts.

#### How to verify it
Pass all UT and E2E test case

#### Which release branch to backport (provide reason below if selected)

<!--
- Note we only backport fixes to a release branch, *not* features!
- Please also provide a reason for the backporting below.
- e.g.
- [x] 202006
-->

- [ ] 201811
- [ ] 201911
- [ ] 202006
- [ ] 202012
- [ ] 202106
- [ ] 202111
- [ ] 202205

#### Description for the changelog
Remove swsssdk from rules and build image scripts.

#### Link to config_db schema for YANG module changes
<!--
Provide a link to config_db schema for the table for which YANG model
is defined
Link should point to correct section on https://github.com/Azure/sonic-buildimage/blob/master/src/sonic-yang-models/doc/Configuration.md
-->

#### A picture of a cute animal (not mandatory but encouraged)
2022-08-25 08:35:51 +08:00
rajendra-dendukuri
e740573892
Fix TARGET_BOOTLOADER variable assignment (#11722)
- Pass TARGET_BOOTLOADER variable value to slave build infra

#### Why I did it
The TARGET_BOOTLOADER is always blank when referred to in the Makefiles which are executed inside the slave build container.

#### How I did it
Pass it on the make command invoking slave.mk explicitly similar to other environment variables.

#### How to verify it
kdump-tools package is installed on sonic-broadcom.bin image.
2022-08-24 16:42:01 -07:00
Saikrishna Arcot
adffbd4643
Build the Broadcom DNX RPC container as part of the official build (#11829)
With the Broadcom syncd containers getting upgraded to Bullseye, the DNX
RPC container is no longer automatically built. Explicitly add a make
command to build it.

Signed-off-by: Saikrishna Arcot <sarcot@microsoft.com>

Signed-off-by: Saikrishna Arcot <sarcot@microsoft.com>
2022-08-24 11:42:15 -07:00
Vivek
f3c1c14d22
[Mellanox] [4700] Update platform capability file to support new breakout mode (#11614)
- Why I did it
This new breakout mode is required when a QSFP cable is used on the QSFP-DD supported 4700 port. since QSFP only uses the first 4 lanes, this mode is required to restrict the child ports to only use the first four lanes

- How I did it
Updated the platfrom.json file with the extended data

- How to verify it
Tested on one port:

root@msn-4700:/home/admin# show int status
  Interface                            Lanes    Speed    MTU    FEC    Alias    Vlan    Oper    Admin                                             Type    Asym PFC
-----------  -------------------------------  -------  -----  -----  -------  ------  ------  -------  -----------------------------------------------  ----------
  Ethernet0                                0      25G   9100    N/A    etp1a  routed      up       up                                  QSFP28 or later         N/A
  Ethernet1                                1      25G   9100    N/A    etp1b  routed    down       up                                              N/A         N/A
  Ethernet2                                2      25G   9100    N/A    etp1c  routed    down       up                                              N/A         N/A
  Ethernet3                                3      25G   9100    N/A    etp1d  routed    down       up                                              N/A         N/A

Signed-off-by: Vivek Reddy Karri <vkarri@nvidia.com>
2022-08-24 11:55:33 +03:00
Yakiv Huryk
6462f45433
[bullseye] add dependencies for saithriftv2 build (#11666)
- Why I did it
To support saithriftv2 build for bullseye dockers

- How I did it
Added the dependencies documented in the SAI docs and used in sonic-slave-buster

- How to verify it
Build saithriftv2 in the sonic-slave-bullseye

Signed-off-by: Yakiv Huryk <yhuryk@nvidia.com>
2022-08-24 11:53:54 +03:00
Vivek
0193c8e90c
[sonic_py_common] Cache Static Information in device_info to speed up CLI response (#11696)
- Why I did it
Profiled the execution for the following cmd intfutil -c status

- How I did it
Cached the following information:
1. get_sonic_version_info()
2. get_platform_info()
None of the API exposed to the user libraries (for eg: sonic-utilities) has been modified
These methods involve reading text files or from redis. Thus, caching helped to improve the execution time

- How to verify it
Added UT's.
Verified on the device

Signed-off-by: Vivek Reddy Karri <vkarri@nvidia.com>
2022-08-24 11:53:11 +03:00
Kebo Liu
d9f3e6ce4f
update Mellanox SDK/FW to 4.5.2318 2010.2318 (#11773)
- Why I did it
Update SDK/FW version - 4.5.2318/2010_2318 to pick up new fixes:
1. Cr space timeout on Hold and Release GW - at warm boot
2. Spectrum Port in stuck PHY_UP after peer side rebooted
3. Memory leak in sx_api_router_ecmp_update_set

- How I did it
Update the make file with the new version number
Update submodule Switch-SDK-drivers pointer

- How to verify it
Run sonic regression

Signed-off-by: Kebo Liu <kebol@nvidia.com>
2022-08-24 11:45:16 +03:00