3862 Commits

Author SHA1 Message Date
Kevin Wang
b5d06a8ae2 [Buffer] Separate buffer profile for Arista-7060CX-32S-C32
Signed-off-by: Kevin Wang <shengkaiwang@microsoft.com>
2022-06-23 15:22:02 -07:00
Kevin Wang
917eb50328 [Buffer] Separate buffer profile for Arista-7060CX-32S-D48C8
Signed-off-by: Kevin Wang <shengkaiwang@microsoft.com>
2022-06-23 15:22:02 -07:00
Kevin Wang
32459b990a [Buffer] Separate buffer profile for Arista-7060CX-32S-Q32
Signed-off-by: Kevin Wang <shengkaiwang@microsoft.com>
2022-06-23 15:22:02 -07:00
Kevin Wang
9544e1ffac [Buffer] Separate buffer profile for Celestica-DX010-D48C8
Signed-off-by: Kevin Wang <shengkaiwang@microsoft.com>
2022-06-23 15:22:02 -07:00
Kevin Wang
c89dbe2721 [Buffer] Separate buffer profile for Force10-S6100
Signed-off-by: Kevin Wang <shengkaiwang@microsoft.com>
2022-06-23 15:22:02 -07:00
Ying Xie
b593d802d5 [buffer] create infrastructure to enable buffer/QoS profiles
Signed-off-by: Ying Xie <ying.xie@microsoft.com>
2022-06-23 15:22:02 -07:00
Andriy Yurkiv
38eef912e8
Enable PG drop counters by default, set default values only on the first start (#10935)
Signed-off-by: Andriy Yurkiv <ayurkiv@nvidia.com>

Backport from master
Appropriate PR on master: #7735
Appropriate PR on master: #6444

Why I did it
PG drop counters should be enabled by default (merge from master)
After "config reload" or "docker swss restart" all counters were enabled even if they were disabled before

How I did it
1) Add the PG drop counter enable option to dockers/docker-orchagent/enable_counters.py
2) Check whether an entry already exists before setting default values

How to verify it
- Install the image and run the counterpoll show CLI command; PG_STAT_DROP should be enabled
- Disable a few counters
    counterpoll pg-drop disable
    counterpoll port disable
- Save and reload
   config save
   config reload
- Check enable status
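
The "set default values only on the first start" behaviour described under "How I did it" can be sketched as follows. This is a minimal illustration, not the code in enable_counters.py: a plain dict stands in for the counter table in the configuration DB, and the function name, counter names, and field name are assumptions made for the example.

```python
# Hypothetical stand-in for the counter table kept in the configuration DB.
# Counter names and the status field name are assumed for illustration.
DEFAULT_COUNTERS = ("PORT", "QUEUE", "PG_DROP")


def apply_default_counters(table, defaults=DEFAULT_COUNTERS):
    """Enable each default counter only if no entry exists for it yet."""
    for name in defaults:
        if name not in table:
            # First start: nothing stored, so write the default.
            table[name] = {"FLEX_COUNTER_STATUS": "enable"}
        # Otherwise keep whatever the user saved (enabled or disabled).
    return table


# First start: the table is empty, so every default is enabled.
first_start = apply_default_counters({})

# After "config save" and "config reload": the user's choice is preserved.
saved = {"PG_DROP": {"FLEX_COUNTER_STATUS": "disable"}}
after_reload = apply_default_counters(saved)
```

With this check in place, a counter disabled before a reload stays disabled, while counters that have no entry still receive the default.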
2022-06-22 09:43:02 -07:00
Liu Shilong
02e0aff5e7
[ci] Set default ACR in UpgrateVersion/PR/official pipeline. (#11002)
* [ci] Set default ACR in UpgrateVersion/PR/official pipeline. (#10341)

Why I did it
Docker Hub limits the pull rate.
Use ACR instead to pull Debian-related Docker images.

How I did it
Set DEFAULT_CONTAINER_REGISTRY in pipeline.

* Add a config variable to override default container registry instead of dockerhub. (#10166)
* Add variable to reset default docker registry
* fix bug in docker version control
2022-06-22 17:33:20 +08:00
Nazarii Hnydyn
530125311e
[Mellanox]: Advance SAI submodule. (#11149)
- To fix tunnel underlay configuration

Signed-off-by: Nazarii Hnydyn <nazariig@nvidia.com>
2022-06-15 12:30:14 -07:00
shlomibitton
5c5c13a536
Add a new patch to set PSU led to green on init by Nvidia hw-mgmt package (#10912)
Signed-off-by: Shlomi Bitton <shlomibi@nvidia.com>
2022-05-24 19:19:00 -07:00
Abhishek Dosi
a17f2e50ce [Submodule update] sonic-restapi
c3d9d8f2bcd364dc81cd4d9bec02666cef648b10 (HEAD -> 201911, origin/201911) API for getting all members from all VLANs (#106)

Signed-off-by: Abhishek Dosi <abdosi@microsoft.com>
2022-05-19 06:30:06 -07:00
abdosi
252168d1cd
[2019111] Added Support for BGP allow list feature to have route-map action of setting tag (#10869)
What I did:
Added support to create a route-map set tag action
when the allow prefix list matches. The tag can be defined by the user in
constants.yml.

Why I did:
Because the Allow List feature calls the allow-list route-map from the base route-map, a set tag option gives the base route-map a way to match on the tag and take further action if needed. The tag provides metadata that the base route-map can use.
2022-05-18 23:01:59 -07:00
Abhishek Dosi
be4dbb1c63 [Submodule update] sonic-swss
[201911][pfcwd] Avoid ingress drop by not attaching zero profiles when pfc storm is detected (#2279)

Signed-off-by: Abhishek Dosi <abdosi@microsoft.com>
2022-05-18 18:20:59 -07:00
StormLiangMS
94ef122f1f [bgpcfgd] to support removal part of configuration of bgp allowed prefix list (#10165)
* fix allow list issue

Signed-off-by: stormliang <stormliang@microsoft.com>

* add the ipaddress in the install list

* add unit test

Co-authored-by: Ubuntu <azureuser@SONIC-SH-STORM-02.5pu3m0fajw1edcfltykk1gauxa.gx.internal.cloudapp.net>

Why I did it
Failed to remove part of configuration of bgp allowed prefix list. The details in #10141

How I did it
There are two issues:

1. In FRR the IPv6 default route is ::/0, but in the configuration it is 0::/0, so a string comparison of the two returns false. That raises the question of why removal from the allowed prefix list failed for IPv4 but worked for IPv6; the second issue answers it.

2. The current managers_allow_list doesn’t support removing part of the prefix list. Removal worked for IPv6 only because of the default-route comparison bug in 1: the update was applied regardless of the operation. (The code compares the prefix list in FRR with the one in the configuration DB; if every entry in the DB is present in FRR it does nothing, otherwise it updates the prefix list from the DB.)
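
The two issues above can be illustrated with Python's standard ipaddress module. This is only a sketch of the idea, not the bgpcfgd code: the helper names `same_prefix` and `prefixes_to_remove` are hypothetical, and the inputs are plain lists of prefix strings.

```python
import ipaddress


def same_prefix(a, b):
    """Compare two prefixes by value instead of by their string spelling."""
    return ipaddress.ip_network(a, strict=False) == ipaddress.ip_network(b, strict=False)


def prefixes_to_remove(frr_prefixes, db_prefixes):
    """Return the prefixes present in FRR that the configuration no longer lists."""
    wanted = {ipaddress.ip_network(p, strict=False) for p in db_prefixes}
    return [p for p in frr_prefixes
            if ipaddress.ip_network(p, strict=False) not in wanted]


# The two spellings differ as strings but denote the same network.
strings_equal = ("0::/0" == "::/0")
networks_equal = same_prefix("0::/0", "::/0")

# Partial removal: only the prefix dropped from the configuration is returned.
stale = prefixes_to_remove(["10.0.0.0/8", "192.168.0.0/16"], ["10.0.0.0/8"])
```

Comparing parsed networks avoids the spelling mismatch, and computing the difference in the FRR-to-DB direction is what makes a partial removal visible.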

How to verify it
Follow the step in #10141
2022-05-18 18:20:49 -07:00
abdosi
cd28f30969
Updated Broadcom SAI version to 3.7.6.1-1 (#10859)
Updated BRCM SAI Version to 3.7.6.1-1
2022-05-17 22:39:10 -07:00
Abhishek Dosi
b56fbc5ca8 [Submodule update] sonic-snmpagent
f91a9e6e07a43cae531cda019935de3221e0bb09 (HEAD -> 201911, origin/201911) Fix: not to use blocking get_all() after keys() (#255)

Signed-off-by: Abhishek Dosi <abdosi@microsoft.com>
2022-05-17 15:52:29 -07:00
Abhishek Dosi
443158c95e [Submodule Update] sonic-utilities
988d8e172e7140174bfa21d3b86c9685c7127a14 (HEAD -> 201911, origin/201911) [warm-reboot]: added automated recover for ISSU file (#1466)
913df4e2faf6e70e0aebb01d81f79694a6d8ee20 [201911] Warmboot script improvements - timeout in exec and disable service-autorestart (#2149)

Signed-off-by: Abhishek Dosi <abdosi@microsoft.com>
2022-05-17 15:51:04 -07:00
Vaibhav Hemant Dixit
2c104c8d56
[201911] Advance sonic-snmpagent submodule to latest (#10829)
Why I did it
Advance sonic-snmpagent submodule to include:
[201911] Fix snmp subagent errors in shutdown path Azure/sonic-snmpagent#259
2022-05-16 08:04:28 -07:00
Volodymyr Samotiy
ce7bf08144
[Mellanox] [201911] Update FW to v2008.3382 (#10798)
- Why I did it
To include the fix for an issue where modifying shared headroom on the fly can result in negative occupancy, which leads to PFC being sent from the switch continuously.

- How I did it
Updated submodule pointer and version in relevant Makefile.

- How to verify it
Build an image and run tests from sonic-mgmt.

Signed-off-by: Volodymyr Samotiy <volodymyrs@nvidia.com>
2022-05-11 08:39:01 +03:00
Santhosh Kumar T
9093feb113
[DellEMC][201911] S6100 CPLD upgrade support in 201911 branch porting changes (#10686)
Why I did it
Porting changes from DellEMC: S6100 CPLD upgrade #4299 and DellEMC S6100 CPLD upgrade support #3834 to 201911 branch
Added CPLD upgrade support for DellEMC S6100 platform.
2022-04-28 09:23:38 -07:00
Santhosh Kumar T
ac35a62747
[DellEMC][201911] S6100 S6000 - Show techsupport enhancement (#10690)
2022-04-27 09:17:35 -07:00
Abhishek Dosi
b8689d71f0 Fix the build error created by cherry-pick of PR:
[bgp] Enable BGP Graceful Restart based on device role (#9486)

Signed-off-by: Abhishek Dosi <abdosi@microsoft.com>
2022-04-02 10:47:44 -07:00
Abhishek Dosi
7b0ef0ed55 [Submodule update] sonic-py-swssdk
9ce4d19d5a199cffe2933d80e343a80ded398b4a (HEAD -> 201911, origin/201911) With the changes in PR: https://github.com/Azure/sonic-buildimage/pull/5289 access to the redis unix socket is given to the redis group members. Many sonic-util commands (especially in the multi-asic case) use the redis unix socket to connect to the DB, and thus those commands fail without sudo. This PR is a continuation of PR: https://github.com/Azure/sonic-buildimage/pull/7002 where we default to using TCP for Redis if the user is not root

Signed-off-by: Abhishek Dosi <abdosi@microsoft.com>
2022-04-01 10:50:11 -07:00
Abhishek Dosi
843ac556a7 [Submodule update] sonic-swss
1c12a4050fecabd88245c7aa64a61259bc00db3b (HEAD -> 201911, origin/201911)Allowing the first time FEC and AN configuration to be pushed to SAI (#1705) (#2196)
Signed-off-by: Abhishek Dosi <abdosi@microsoft.com>
2022-04-01 09:46:09 -07:00
abdosi
9138423b3e [bgp] Enable BGP Graceful Restart based on device role (#9486)
What I did:
Updated the Jinja template to enable BGP Graceful Restart based on device role. By default it will be enabled only if the device role type is TorRouter.

Why I did:-
By default FRR is configured in Graceful Helper mode. Graceful Restart is needed only on T0/TorRouter, since that device can go through warm-reboot. T1/LeafRouter needs to be in Helper mode only.
2022-04-01 09:43:53 -07:00
Ying Xie
db5b9ee834 [warm boot finalizer] only wait for enabled components to reconcile (#6454)
* [warm boot finalizer] only wait for enabled components to reconcile

Define the component with its associated service. Only wait for components that have associated service enabled to reconcile during warm reboot.

Signed-off-by: Ying Xie <ying.xie@microsoft.com>
2022-03-31 12:01:25 -07:00
Sudharsan Dhamal Gopalarathnam
17c9648c9c
[202012] FEC none config through minigraph (#7670) (#10338)
When FECDisabled is set to true in minigraph.py, push 'fec' 'none' explicitly to config_db. When 'fec' is defined in port_config.ini, do not override it with 'rs' for 100G.

Backport of #7667 to 202012 branch.
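
The FEC selection described in this commit message can be sketched as a small decision function. This is an illustration under stated assumptions, not the minigraph.py code: the function name `resolve_fec` and its parameters are hypothetical, speeds are given in Mb/s, and the precedence of FECDisabled over a port_config.ini value is assumed rather than taken from the source.

```python
def resolve_fec(speed, port_config_fec=None, fec_disabled=False):
    """Pick the 'fec' value to write for a port, or None to write nothing.

    speed            -- port speed in Mb/s (100000 means 100G)
    port_config_fec  -- value from port_config.ini, if one is defined
    fec_disabled     -- True when FECDisabled is set for the port
    """
    if fec_disabled:
        # Push 'none' explicitly instead of leaving the field unset.
        return "none"
    if port_config_fec is not None:
        # An explicit port_config.ini value is not overridden.
        return port_config_fec
    if speed == 100000:
        # Default for 100G ports when nothing else is specified.
        return "rs"
    return None
```

For example, `resolve_fec(100000, port_config_fec="fc")` keeps the configured value, while `resolve_fec(100000)` falls back to 'rs'.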
2022-03-29 10:28:26 -07:00
abdosi
74305f8a56
Backport FRR patch to FRR-7.2 to handle pthread race in peer notify message handling (#10324)
What I did:

Backport FRR patch FRRouting/frr#8220 on FRR 7.2. Fixes the Issue FRRouting/frr#8213

Why I did:-

Because of this race-condition we saw GR getting triggered even though BGP shut is given on peer device.

How I verify:

After patching this fix GR is not triggered on doing BGP shut on peer.
2022-03-23 08:58:10 -07:00
xumia
218a310eeb
[Submodule] Update src/sonic-restapi (#10263)
af30fec Fix urllib3 CVE-2021-33503 issue (#104) (#105)
2022-03-20 17:33:58 -07:00
zzhiyuan
d3c881858c [Arista] Increase switch PCIe timeout for 7060-cx32s (#9248)
Co-authored-by: Zhi Yuan (Carl) Zhao <zyzhao@arista.com>
Why I did it
The Arista 7060 platform has a rare and unreproducible PCIe timeout that could possibly be solved by increasing the switch PCIe timeout value. To do this we'll call a script for this platform to increase the PCIe timeout on boot-up.

No issues would be expected from the setpci command. From the PCIe spec:

"Software is permitted to change the value in this field at any
time. For Requests already pending when the Completion
Timeout Value is changed, hardware is permitted to use either
the new or the old value for the outstanding Requests, and is
permitted to base the start time for each Request either on when
this value was changed or on when each request was issued. "

How I did it
Add "platform-init" support in swss docker similar to how "hwsku-init" is called, only this would be for any device belonging to a platform. Then the script would reside in device data folder.

Additionally, add pciutils dependency to docker-orchagent so it can run the setpci commands.

How to verify it
After bootup of an Arista 7060, execute:
lspci -vv -s 01:00.0 | grep -i "devctl2"
to check that the timeout has changed.
2022-03-02 07:44:24 -08:00
Abhishek Dosi
a90ebfba51 [submodule update] sonic-snmpagent
[rfc2737]: Handle unicode error when parsing transceiver (#235)

Signed-off-by: Abhishek Dosi <abdosi@microsoft.com>
2022-03-01 16:17:03 -08:00
xumia
9bc3e9992e [Security]: Upgrade urllib3 to fix CVE-2021-33503
See https://security.archlinux.org/CVE-2021-33503
2022-02-25 01:29:39 +00:00
Renuka Manavalan
e3958afa2c
Manually cherry-picked PR #9123 (#10041)
Identify the bad password set by sshd and fail auth before sending it to
the AAA server, and hence avoid possible user lockout by AAA.
For more details, please refer to the parent/original PR #9123
2022-02-23 17:41:21 -08:00
kellyyeh
017547dad3
[201911][radv] Support multiple ipv6 prefixes per vlan interface and change radv interval to 3min (#10016)
* [radv] Support multiple ipv6 prefixes per vlan interface (#9934)
* The radvd.conf.j2 template created two copies of the vlan interface when more than one IPv6 address was assigned to a single vlan interface. Changed the format to add the prefixes under the same vlan interface block.
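
The change in output shape can be sketched in Python: group the prefixes by interface first, then emit one interface block per vlan. This is not the Jinja template from the commit; the function names and the example addresses are hypothetical, and all radvd options inside the blocks are omitted.

```python
from collections import OrderedDict


def group_prefixes(vlan_addresses):
    """Group (interface, prefix) pairs so each interface appears once."""
    grouped = OrderedDict()
    for iface, prefix in vlan_addresses:
        grouped.setdefault(iface, []).append(prefix)
    return grouped


def render_radvd(vlan_addresses):
    """Render one interface block per vlan, with all its prefixes inside."""
    blocks = []
    for iface, prefixes in group_prefixes(vlan_addresses).items():
        lines = ["interface %s" % iface, "{"]
        for prefix in prefixes:
            lines += ["    prefix %s" % prefix, "    {", "    };"]
        lines.append("};")
        blocks.append("\n".join(lines))
    return "\n".join(blocks)


# Two IPv6 prefixes on the same vlan interface (example values).
conf = render_radvd([
    ("Vlan1000", "fc02:1000::/64"),
    ("Vlan1000", "fc02:2000::/64"),
])
```

Grouping before rendering is what keeps a second address on the same vlan from producing a second interface block.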
2022-02-18 07:40:55 -08:00
Abhishek Dosi
c71320132f [Submodule update] sonic-swss
[fpmsyncd] Skip routes to eth0 or docker0 (#1606)

Signed-off-by: Abhishek Dosi <abdosi@microsoft.com>
2022-02-17 18:34:37 -08:00
arlakshm
1d84ff5bd9
remove staticd.conf (#9657)
resolves #8979 and #9055

How I did it
Remove the file static.conf.j2, which adds the default route on eth0, from the frr docker

Signed-off-by: Arvindsrinivasan Lakshmi Narasimhan <arlakshm@microsoft.com>
2022-02-17 18:31:30 -08:00
Abhishek Dosi
9f55979696 [submodule update] sonic-restapi
a1830c1761087bdc1f7433ebbb8d0bdc419da0d3 (HEAD -> 201911, origin/master, origin/HEAD, origin/201911) Fix OpenAPI spec to be readable by autorest (#101)
94805a39ac0712219f7dc08faa2cfdbf371dd177 Identify and report Vnet GUID for conflicting VNI (#99)
4832dfd677de72edc44d4eb8c1b60cfad79a3355 Static route expiry if not specified as persistent  (#98)
5cc4358fb67b9e2a0da9a6691064e41f97ebebc2 (master) Add support for overlay ECMP (#96)
6822a46197daef060b4d00dba5153b04b163c43f [CI] Set diff cover threshold to 50% (#97)
dcc826a1503060b9a07e4510b4f48331c49e87dd Add PR diff coverage (#95)
e842c5ff317c67919dcbcab3358143cb9a16c9dd Generate code coverage for Unit Tests (#94)
f9bbed3cb86a3bab9a07745096835dbdbe5a4db6 Convert Unit Tests from unittest framework to pytest framework (#93)

Signed-off-by: Abhishek Dosi <abdosi@microsoft.com>
2022-02-17 11:14:14 -08:00
Abhishek Dosi
81175f27b8 [submodule update] sonic-sairedis
[201911] Prevent other notification event storms from enqueueing
unchecked and draining all memory, which leads to crashing the switch
router (#981)

Signed-off-by: Abhishek Dosi <abdosi@microsoft.com>
2022-02-17 11:07:14 -08:00
Samuel Angebault
520a13ba72
[201911][Arista] Add emmc quirks for Upperlake (#9971)
Why I did it
Fix some unreliability seen on emmc device with some AMD CPUs

How I did it
Added a kernel parameter to add quirks to
It depends on a sonic-linux-kernel change to work properly but will be a no-op without it.

Description for the changelog
Add emmc quirks for Upperlake
2022-02-11 13:28:11 -08:00
xumia
9c0a09241c
[submodule]: update sonic-utilities (#9775)
update sonic-utilities
2022-01-19 18:46:39 +08:00
Junchao-Mellanox
2ffc9d572f
[Mellanox] [201911] Optimize thermal policies (#9664)
- Why I did it
Optimize thermal control policies to simplify the logic and add more protection code in policies to make sure it works even if kernel algorithm does not work.

- How I did it
Reduce unused thermal policies
Add timely ASIC temperature check in thermal policy to make sure ASIC temperature and fan speed are coordinated
Minimum allowed fan speed now is calculated by max of the expected fan speed among all policies
Move some logic from fan.py to thermal.py to make it more readable
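
The rule for the minimum allowed fan speed can be written out in a couple of lines. This is a sketch only, not the Mellanox platform code: the function name, the policy names, and the percentages are made up for the example.

```python
def minimum_allowed_fan_speed(expected_by_policy):
    """Return the highest fan speed (in percent) expected by any policy."""
    return max(expected_by_policy.values())


# Hypothetical policies and the fan speed each one expects, in percent.
expected = {"psu_absent": 80, "asic_temperature": 70, "default": 60}
floor = minimum_allowed_fan_speed(expected)
```

Taking the maximum means the most demanding policy always sets the floor, so no policy's expectation is undercut by another.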

- How to verify it
1. Manual test
2. Regression
2022-01-19 11:42:09 +02:00
Arun Saravanan Balachandran
33ef26d97b
[201911] DellEMC: S6000, S6100 - Enable thermalctld, Platform API changes (#9384)
Why I did it
To incorporate the below changes in DellEMC S6100, S6000 platforms.

Enable thermalctld
Backport Platform API changes from master branch.
How I did it
Remove 'skip_thermalctld:true' in pmon_daemon_control.json
Implement the platform API methods in the respective device files
How to verify it
Verified that platform data is displayed by show platform fan and show platform temperature commands.
2021-12-10 12:23:22 -08:00
Samuel Angebault
dfa77a54d5
[201911][Arista] Backport logrotate configuration (#9455)
Backport logrotate configuration for arista*.log files
2021-12-08 19:11:04 -08:00
Elvis Tsai
a8fed0a85e
[201911][Innovium] Update Wistron platform definition
Why I did it
Cannot retrieve and display the reboot-cause.

How I did it
Correct the platform initialization definition.

How to verify it
Manual reboot and then 'show reboot-cause'
2021-12-08 19:09:53 -08:00
Junchao-Mellanox
2b4c8ee330
[Mellanox] Fan speed should not be 100% when PSU is powered off (#9258) (#9380)
Backport #9258 to 201911

Why I did it
When PSU is powered off, the PSU is still on the switch and the air flow is still the same. In this case, it is not necessary to set FAN speed to 100%.

How I did it
When the PSU is powered off, don't treat it as absent.

How to verify it
Adjust existing unit test case
Add new case in sonic-mgmt
Conflicts:
platform/mellanox/mlnx-platform-api/sonic_platform/thermal_infos.py
2021-12-07 18:22:51 -08:00
Volodymyr Samotiy
690f8e6919
[Mellanox] Update SDK to v4.4.3360 and FW to v2008.3358 (#9402)
- Why I did it
To include latest fixes.

1. On CMIS modules, after low power configuration, the firmware waited for the module state to be ModuleReady instead of ModuleLowPower causing delays.
2. When connecting Spectrum devices with optical transceivers that support RXLOS, remote side port down might cause the switch firmware to get stuck and cause unexpected switch behavior.
3. On rare occasions, when working with port rates of 1GbE or 10GbE and congestion occurs, packets may get stuck in the chip and may cause switch to hang.
4. When ECMP has high amount of next-hops based on VLAN interfaces, in some rare cases, packets will get a wrong VLAN tag and will be dropped.
5. When using SN4600C with copper or optical loopback cables at NRZ speeds, the link may take a long time to come up (up to 70 seconds).
6. When connecting SN4600C to SN4600C after Fastboot in 50GbE No_FEC mode with a copper cable, the link up time may take ~20 seconds.

- How I did it
Updated SDK submodule and relevant makefiles with the required versions.

- How to verify it
Build an image and run tests from "sonic-mgmt".
Signed-off-by: Volodymyr Samotiy <volodymyrs@nvidia.com>
2021-12-05 09:17:17 +02:00
abdosi
cff7fbb5c1 Added 40G {300/40/5m} pg lookup profile for 7260 100G SKU (#9249)
What I did:
Added 40G {300/40/5m} profile for 7260 100G SKU
2021-11-24 18:57:01 -08:00
Abhishek Dosi
89fc705c6f [submodule] sonic-swss
ea9c6690cafb959d28c90fd5d01fce5bbb5f899b (HEAD -> 201911, origin/201911) Revert "[fpmsyncd] Skip routes to eth0 or docker0 (#1606)"
Signed-off-by: Abhishek Dosi <abdosi@microsoft.com>
2021-11-24 15:38:56 -08:00
Abhishek Dosi
1f1818450e [Submodule update] sonic-utilities
3ce811960f19c514a6ca0b1c611b2c453eb3a0a3 (HEAD -> 201911, origin/201911) [201911][port2alias]: Fix to get right number of return values (#1907)
e648290b51fa4ec4d465efe55aa4d27d16edb249 disk_Check: Scan & mount as RW when disk turns into Read-only (#1872)

Signed-off-by: Abhishek Dosi <abdosi@microsoft.com>
2021-11-24 15:27:58 -08:00
Nazarii Hnydyn
6cdf81419a
[submodule]: Advance sonic-utilities. (#9307)
Commits on Oct 26, 2021
Remove exec from platform_reboot call to prevent reboot hang (#1881) 066b5adf6d737a5bd174123d4d00dab4b6110cf6
  
Commits on Nov 17, 2021
[fdbshow]: Handle FDB cleanup gracefully. (#1918) c80321c98d0741f340d2900108bad7fed76c80cd
2021-11-23 10:21:44 -08:00