- Why I did it
Updated FW to xx.2008.2526 version.
Fixed issues:
1. Spectrum-2, Spectrum-3 | sFlow | High CPU load and high on fully loaded switch.
2. Spectrum-2, Spectrum-3 | Fine grain LAG | in rare cases doesn’t update the right entry
- How I did it
Updated submodule pointer and version in a Makefile.
- How to verify it
Full regression and bugs validation
Signed-off-by: Shlomi Bitton <shlomibi@nvidia.com>
- Why I did it
Enable VXLAN src port range configuration via SAI profile for Mellanox-SN3800-D28C49S1 SKU
- How I did it
Added SAI_VXLAN_SRCPORT_RANGE_ENABLE=1 configuration to appropriate sai.profile
Signed-off-by: Andriy Yurkiv <ayurkiv@nvidia.com>
When submitting a new official build for broadcom, vs, it prompts a error message, which says the job is not defined.
It was caused by the default option "[]", which is not empty, it is used as the jobGroups parameter.
Why I did it
Improve the version of the Pull Request build by changing the local branch name.
How I did it
Change the default branch name merge to [target_branch_name]-[pullrequestid].
How to verify it
For official build, the version is not changed.
For pull request build, the version as below:
Why I did it
Fix the boolean value case sensitive issue in Azure Pipelines
When passing parameters to a template, the "true" or "false" will have case sensitive issue, it should be a type casting issue.
To fix it, we change the true/false to yes/no, to escape the trap.
Support to override the job groups in the template, so PR build has chance to use different build parameters, only build simple targets. For example, for broadcom, we only build target/sonic-broadcom.bin, the other images, such as swi, debug bin, etc, will not be built.
Why I did it
Failed to build the centec-arm64 for no space in docker data root.
How I did it
Change to use the data root to the folder under /data.
See detail info about DOCKER_DATA_ROOT_FOR_MULTIARCH in the file Makefile.work.
How to verify it
Set the environment variable, then the /data used as docker root.
Why I did it
Fix the wrong build options and improve display name for some tasks
1. Fix the wrong build architecture for arm64 and armhf
2. Fix the build timeout parameter
Signed-off-by: Yong Zhao yozhao@microsoft.com
Why I did it
The service file generate_monit_config.service is used to generate the Monit configuration file from template. I also should install this service file and enable it.
How I did it
I appended this service file name at the end of /etc/sonic/generated_services.conf.
How to verify it
I verified this on the device str2-7260cx3-acs-1.
Which release branch to backport (provide reason below if selected)
201811
[x ] 201911
202006
202012
#### Why I did it
MSN4700 A1/A0 used different sensor chip but keep the existing platform name *x86_64-mlnx_msn4700-r0*, this is a workaround to replace the sensor conf on MSN4700 A1/A0
#### How I did it
Use a shell script to get the sensor conf path and copy that files to /etc/sensors.d/sensors.conf
Why I did it
Added soft-reboot plugin support.
Added SSD version s16425cq check
Added error message to display in console/SSH in case reboot is called in faulty/non-upgraded devices.
Recently, we found on some of our testbeds the entropy collecting process finishes more than 60 seconds after system started.
This results in swss not able to start sporadically.
To install haveged can accelerate the entropy collect process.
Signed-off-by: Stephen Sun <stephens@nvidia.com>
Signed-off-by: Yong Zhao yozhao@microsoft.com
Why I did it
This PR aims to monitor the critical processes in PMon container by Monit in 201911 branch.
How I did it
I created a template configuration file of Monit and it will be rendered to generate Monit configuration file of PMon container
by a service generate_monit_config.service.
How to verify it
I verified this on a Mellanox device str-msn2700-03 and an Arista device str-a7050-acs-1.
Which release branch to backport (provide reason below if selected)
201811
[x ] 201911
202006
202012
New features and fixes in the new SDK/FW:
SN4600C | AN/LT support
SN2700 | AN/LT bugs fixes
WJH | FID_MISS support
Signed-off-by: Kebo Liu <kebol@nvidia.com>
Issue is get_pip.py is moved to pip 21.1 (https://github.com/pypa/get-pip/commits/main) which is not compatible with 3.6.
Issue of pip itself is fixed as part of 21.1.1 in pip community (pypa/pip#9835).
However get-pip.py is still not updated to latest pip. Also get.pip.py does not support python 3.6 version explicitly (pypa/get-pip#88)
Step 15/29 : RUN curl https://bootstrap.pypa.io/get-pip.py | python3.6
---> Running in bece31f49267
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
0 0 0 0 0 0 0 0 --:--:-- --:--:-- --:--:-- 0
100 1891k 100 1891k 0 0 9564k 0 --:--:-- --:--:-- --:--:-- 9600k
Traceback (most recent call last):
File "<stdin>", line 24298, in <module>
File "<stdin>", line 139, in main
File "<stdin>", line 115, in bootstrap
File "<stdin>", line 96, in monkeypatch_for_cert
File "/tmp/tmp5fnxrz0a/pip.zip/pip/_internal/commands/__init__.py", line 9, in <module>
File "/tmp/tmp5fnxrz0a/pip.zip/pip/_internal/cli/base_command.py", line 12, in <module>
File "/tmp/tmp5fnxrz0a/pip.zip/pip/_internal/cli/cmdoptions.py", line 30, in <module>
File "/tmp/tmp5fnxrz0a/pip.zip/pip/_internal/utils/hashes.py", line 2, in <module>
ImportError: cannot import name 'NoReturn'
The command '/bin/sh -c curl https://bootstrap.pypa.io/get-pip.py | python3.6' returned a non-zero code: 1
How I did:
Got the file from https://github.com/pypa/get-pip/tree/21.0 and added to the buildimage
pin pip to the previous release 21.0.1. (Similar is done in other public repos eg: grpc/grpc-java#8115)
Signed-off-by: Abhishek Dosi <abdosi@microsoft.com>
a364614 2021-04-22 | [201911][acl] Use a list instead of a comma-separated string for ACL port list (#1576) [Danny Allen]
391e524 2021-04-15 | [201911] Fix Multi-ASIC show specific resursive route (#1563) [gechiang]
#### Why I did it
Since we will have multiple `dhcrelay` processes if there exists different VLANs in the table `VLAN_INTERFACE` of `CONIFG_DB`,
we should use unique service name for each `dhcrelay` process in Monit configuration file. Otherwise, Monit service will fail to work.
#### How I did it
I append the VLAN name to the end of each service name such that they are unique.
Signed-off-by: Yong Zhao <yozhao@microsoft.com>
#### Why I did it
- xcvrd crash was seen in latest 201811 images.
- For Dell S6100,API 2.0 uses poll mode while 1.0 was still using interrupt mode.
#### How I did it
- Modified get_transceiver_change_event in 1.0 to poll mode in all the related branches.
Backport of https://github.com/Azure/sonic-buildimage/pull/7309 to the 201911 branch
Signed-off-by: Yong Zhao yozhao@microsoft.com
Why I did it
This PR aims to monitor critical processes in router advertiser and dhcp_relay containers by Monit.
How I did it
Router advertiser container only ran on T0 device and the T0 device should have at least one VLAN interface
which was configured an IPv6 address. At the same time, router advertiser container will not run on devices of which
the deployment type is 8.
As such, I created a service which will dynamically generate Monit configuration file of router advertiser from a
template.
Similarly Monit configuration file of dhcp_relay was also generated from a template since the number of dhcrelay process in dhcp_relay container is depended on number of VLANs.
How to verify it
I verified this implementation on a DuT.
see below error:
+ sudo https_proxy= LANG=C chroot ./fsroot easy_install pip==20.3.3
Searching for pip==20.3.3
Reading https://pypi.python.org/simple/pip/
Couldn't find index page for 'pip' (maybe misspelled?)
Scanning index of all packages (this may take a while)
Reading https://pypi.python.org/simple/
No local packages or working download links found for pip==20.3.3
error: Could not find suitable distribution for Requirement.parse('pip==20.3.3')
How I fix:
Install python-pip via apt-get
Pin the version to 20.3.3
Master has same changes.
Signed-off-by: Abhishek Dosi <abdosi@microsoft.com>
With the latest 201911 image, the following error was seen on staging devices with TSB command ( for both single asic, multi asic ). Though this err message doesn't affect the TSB functionality, it is good to fix.
admin@STG01-0101-0102-01T1:~$ TSB
BGP0 : % Could not find route-map entry TO_TIER0_V4 20
line 1: Failure to communicate[13] to zebra, line: no route-map TO_TIER0_V4 permit 20
% Could not find route-map entry TO_TIER0_V4 30
line 2: Failure to communicate[13] to zebra, line: no route-map TO_TIER0_V4 deny 30
In addition, in this PR I am fixing the message displayed to user when there are no BGP neighbors configured on that BGP instance. In multi-asic device there could be case where there are no BGP neighbors configured on a particular ASIC.