Commit Graph

5689 Commits

Author SHA1 Message Date
Lawrence Lee
fad5ec47b4 [mux]: Call write_standby from host only
Signed-off-by: Lawrence Lee <lawlee@microsoft.com>
2021-10-15 09:59:59 -07:00
Lawrence Lee
5232647b33 [mux]: Make write_standby available on host
Signed-off-by: Lawrence Lee <lawlee@microsoft.com>

[write_standby]: Cleanup and fix build

Signed-off-by: Lawrence Lee <lawlee@microsoft.com>
2021-10-15 09:59:59 -07:00
Lawrence Lee
14403c61d2 [mux]: Initialize all mux ports as standby
Signed-off-by: Lawrence Lee <lawlee@microsoft.com>
2021-10-15 09:59:59 -07:00
Tamer Ahmed
b880f9d973 Merged PR 4813977: [mux] Update Service Install With SONiC Target
[mux] Update Service Install With SONiC Target

A recent PR grouped all SONiC services into sonic.target. The install section
of mux.service was not updated, and this causes delays when using config
reload because the service failed state is not being reset.

signed-off-by: Tamer Ahmed <tamer.ahmed@microsoft.com>
2021-10-15 09:59:59 -07:00
Lawrence Lee
0295c832c2 Merged PR 4366316: [mux.service]: Bind to sonic.target
[mux.service]: Bind to sonic.target

Signed-off-by: Lawrence Lee <lawlee@microsoft.com>
2021-10-15 09:59:59 -07:00
Tamer Ahmed
bff785ec49 Merged PR 4234524: [mux] Start Mux on Only Dual-ToR Platform
[mux] Start Mux on Only Dual-ToR Platform

The mux docker depends on the presence of mux cable hardware and is
supposed to run only on Gemini ToRs. This PR changes the mux feature
config in order to enable mux docker based on device configuration.

signed-off-by: Tamer Ahmed <tamer.ahmed@microsoft.com>
2021-10-15 09:59:59 -07:00
Tamer Ahmed
56d4c34be7 [linkmgrd] Relocate Linkmgrd to Github
This PR deletes local-to-buildimage linkmgrd and creates new submodule
pointing to github repo of sonic-linkmgrd.

signed-off-by: Tamer Ahmed <tamer.ahmed@microsoft.com>
2021-10-15 09:59:59 -07:00
Tamer Ahmed
29e9b775c1 [mux] Add New Package Vars
Adding new packaging variable to mux docker

signed-off-by: Tamer Ahmed <tamer.ahmed@microsoft.com>
2021-10-15 09:59:59 -07:00
Tamer Ahmed
f0711494f3 [linkmgrd] Enhance Init And Switch State When Config Is Active
During warm reboot, linkmgrd would go away and so heartbeats will
be lost. This would result in the standby link on the peer ToR pulling the
link active. This is undesirable since we would not create a tunnel
from the ToR that is being rebooted to the peer ToR. This PR
implicitly locks the state of the mux if config is not set to auto.

Also, orchagent does not initialize MUX to its hardware state; rather,
it initializes MUX to Unknown state. linkmgrd will detect this situation
and probe MUX state to correct orchagent state.

There is also a fix for the case when the state of a switched MUX is delayed. The
PR will poll the MUX for the new state. This is required to update
the state ds and hence create/tear down the tunnel.

signed-off-by: Tamer Ahmed <tamer.ahmed@microsoft.com>
2021-10-15 09:59:59 -07:00
Tamer Ahmed
c9c2826520 Merged PR 3845699: [linkmgrd]: Introduce MUX cable linkmgrd
Linkmgrd monitors link status, mux status, and link state. If
the link becomes unhealthy, linkmgrd will trigger a mux switchover
on a standby ToR, ensuring uninterrupted service to servers/blades.
This PR is the initial implementation of linkmgrd.

Also, the docker-mux container holds packages related to maintaining and managing
the mux cable. It currently runs the linkmgrd binary, which monitors and switches
the mux if needed.
This PR also introduces mux-container and starts linkmgrd at startup when
the build is configured with INCLUDE_MUX=y

Edit: linkmgrd PR will follow.

signed-off-by: Tamer Ahmed <tamer.ahmed@microsoft.com>

Related work items: #2315, #3146150
2021-10-15 09:59:59 -07:00
Marty Y. Lok
c37470544a
[Nokia][port]Modify the Nokia-IXR7250E-36x400G device data (#8875)
-- Based on the new BCM configuration, Modify the portcoreid for the front panel port in the port_config.ini for line card Nokia-IXR7250e-36x400G
 -- Correct the pcie.yaml file
 -- Update the platform_ndk.json with new field "update-asic-pvt"
 -- Add chassis-internal-intf to chassisdb.conf
 -- update platform_reboot

Signed-off-by: mlok <marty.lok@nokia.com>
2021-10-14 13:47:12 -07:00
Rajkumar-Marvell
669dfaa207
[Marvell] Update amd64 SAI version (#8868)
Move Marvell SAI deb version to 1.8.1-1 for amd64 platform

Signed-off-by: Rajkumar Pennadam Ramamoorthy <rpennadamram@marvell.com>
2021-10-14 11:55:59 -07:00
Ze Gan
f4f6955e43
[devices]: Add new SKU for SONiC VM (#8971)
The default ethernet port naming style is Ethernet0, Ethernet4...Ethernet(i*4), which isn't compatible with EOS's style Ethernet1, Ethernet2...Ethernet(i+1).
SONiC-mgmt usually uses EOS as neighbor devices. To relieve the compatibility issue when SONiC is used as a neighbor device, this PR introduces a new SKU, SONiC VM.
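
The difference between the two naming styles described above can be sketched in a few lines of Python. The function names and the `lanes_per_port` parameter are illustrative assumptions and are not taken from the PR.

```python
def sonic_default_names(num_ports, lanes_per_port=4):
    """Default SONiC style: Ethernet0, Ethernet4, ... Ethernet(i*4).

    lanes_per_port is a hypothetical parameter; 4 matches the i*4 pattern
    quoted in the commit message.
    """
    return [f"Ethernet{i * lanes_per_port}" for i in range(num_ports)]


def eos_style_names(num_ports):
    """EOS style: Ethernet1, Ethernet2, ... Ethernet(i+1)."""
    return [f"Ethernet{i + 1}" for i in range(num_ports)]


if __name__ == "__main__":
    # Print the two schemes side by side for a small number of ports.
    for sonic, eos in zip(sonic_default_names(4), eos_style_names(4)):
        print(sonic, eos)
```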

Signed-off-by: Ze Gan <ganze718@gmail.com>
2021-10-14 10:47:40 -07:00
xumia
b9366f3f8e
Fix failed to download cisco artifacts issue (#8942)
Why I did it
Fix the failure to download cisco artifacts issue
2021-10-14 14:14:27 +08:00
Ying Xie
638c287837
[copp] bind copp-config.service to sonic.target (#8969)
copp-config service needs to be started after sonic.target so that it could
render the copp-config with the latest information.

It also needs to be restarted when config reload or load_minigraph is invoked.

Signed-off-by: Ying Xie <ying.xie@microsoft.com>
2021-10-13 21:07:44 -07:00
liuh-80
a4ac69e4f0
[TACACS+]: Extract tacacs support functions into library and fix memory leak issue. (#8659)
This pull request extracts the tacacs support functions into a library to share the TACACS config file parsing code with other projects. It also fixes a memory leak in the config parsing code.

Why I did it
    To support TACACS per-command authorization, we need to reuse the TACACS config file parsing code in the bash plugin project.

How I did it
    Add libtacsupport.pc.in to extract tacacs support functions into library.
    Fix memory leak issue in TACACS config parse code by converting dynamic memory allocation to static memory allocation.

How to verify it
    Pass all current UT.
    Check shared library generated manually.

Which release branch to backport (provide reason below if selected)
    N/A

Description for the changelog
    Extract tacacs support functions into library, this will share TACACS config file parse code with other project.
    Also fix memory leak issue in parse config code.
2021-10-14 10:04:58 +08:00
Xichen96
4654f72f1c
[determine-reboot-cause] delay execution (#8935)
Since database.service has been moved to execute after rc-local.service,
and determine-reboot-cause.service relies on database.service, we have to
specify that in "After=".

Signed-off-by: Xichen Lin <xichenlin@microsoft.com>

Co-authored-by: Xichen Lin <xichenlin@microsoft.com>
2021-10-14 08:33:19 +08:00
Dmytro
e8adee2c83
[frrcfgd][bgpcfgd] Add portchannel support (#8911)
* To add portchannel support in frrcfgd and bgpcfgd
* Update is_zero_ip() to handle portchannel name
Signed-off-by: d-dashkov <Dmytro_Dashkov@Jabil.com>
2021-10-12 18:54:37 -07:00
Sudharsan Dhamal Gopalarathnam
434a641026
[DPB][Mellanox]Fixing DPB modes in Mellanox-SN2700-D40C8S8 (#8953)
#### Why I did it
Fixing https://github.com/Azure/sonic-buildimage/issues/8938
Fixing 1x10G DPB mode in Mellanox-SN2700-D40C8S8 SKU as it was causing sonic-cfggen to fail.


#### How I did it
Added correct mode format in hwsku.json in Mellanox-SN2700-D40C8S8 and updated platform.json for the new mode.


#### How to verify it
Using sonic-cfggen verify it works fine
2021-10-12 18:16:13 -07:00
Volodymyr Samotiy
ce7abad3ba
[Mellanox] Update SAI to v1.19.4 (#8929)
- Why I did it
Advance to new Nvidia SAI release with the following changes:
New features:
- Align with new SDK/FW version 4.5.1006 and above and in parallel to existing used SDK/FW bundle
- Implement timestamp and egress_queue_index hostif packet attributes.

Bug fixes:
- Fix compilation issues with gcc10
- Fix return code for buffer overflow for query enum values and query statistics capabilities
- Reduce verbosity of print in case packet ingress on invalid port
- Fix mirror Qos settings

- How I did it
Updated SAI version and submodule pointer

- How to verify it
Run regression tests from sonic-mgmt

Signed-off-by: Volodymyr Samotiy <volodymyrs@nvidia.com>
2021-10-12 10:33:35 +03:00
gechiang
f52f97cfa5
[sonic-utilities] Update sonic-utilities submodule to pick set of new fixes (#8947) 2021-10-11 23:07:27 -07:00
liuh-80
7d40384c58
[TACACS+] Add plugin support to bash. (#8660)
This pull request adds a plugin support library to bash.
    We will create a TACACS+ plugin for bash in another PR, which will bring the per-command authorization feature to bash.

Why I did it
    To support TACACS per-command authorization, we check the user command before executing it.

How I did it
    Add plugin support to bash.

How to verify it
    UT with CUnit under bash project cover all new code in plugin.c.
    Also pass all current UT.

Which release branch to backport (provide reason below if selected)
    N/A

Description for the changelog
    Add plugin support to bash.
2021-10-11 15:20:51 +08:00
Sumukha Tumkur Vani
32e73b0872
[RESTAPI] Update Submodule (#8931)
Include the following commits:

f9bbed3cb86a3bab9a07745096835dbdbe5a4db6
Convert Unit Tests from unittest framework to pytest framework

e842c5ff317c67919dcbcab3358143cb9a16c9dd
Generate code coverage for Unit Tests
2021-10-09 15:41:22 -07:00
Qi Luo
add9b651b6
Add platform_asic file to each platform folder in sonic-device-data based package (#8542)
#### Why I did it
Add platform_asic file to each platform folder in sonic-device-data package. The file content will be used as the ground truth of mapping from PLATFORM_STRING to switch ASIC family.

One use case of the mapping is to prevent installing a wrong image, one that targets other ASIC platforms. For example, currently we have several ONIE images named sonic-*.bin, so it's easy to mistakenly install the wrong image. With this mapping built into the image, we could fetch the ONIE platform string, figure out which ASIC it is using, and check that we are installing the correct image.

After this PR merged, each platform vendor has to add one mandatory text file  `device/PLATFORM_VENDOR/PLATFORM_STRING/platform_asic`, with the content of the platform's switch ASIC family.
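
The install-time check described above might look roughly like the following Python sketch. The helper names and the idea of passing the platform folder as an argument are assumptions made for illustration; the PR itself only defines the `platform_asic` file and its content.

```python
from pathlib import Path


def read_platform_asic(platform_dir):
    """Return the ASIC family stored in <platform_dir>/platform_asic.

    platform_dir is a hypothetical argument pointing at one platform folder,
    for example device/PLATFORM_VENDOR/PLATFORM_STRING in the source tree.
    """
    return (Path(platform_dir) / "platform_asic").read_text().strip()


def image_matches_platform(image_asic, platform_dir):
    """True when the image's ASIC family equals the platform's ASIC family."""
    return read_platform_asic(platform_dir) == image_asic
```

An installer could call `image_matches_platform("broadcom", platform_dir)` before proceeding and refuse to continue on a mismatch.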

I will update https://github.com/Azure/SONiC/wiki/Porting-Guide after this PR is merged.

You can get a list of the ASIC platforms by `ls -b platform | cat`. Currently the options are
```
barefoot
broadcom
cavium
centec
centec-arm64
generic
innovium
marvell
marvell-arm64
marvell-armhf
mellanox
nephos
p4
vs
```

Also support
```
broadcom-dnx
```

#### How I did it

#### How to verify it
Test one image on DUT. And check the folders under `/usr/share/sonic/device`
2021-10-08 19:27:48 -07:00
Vaibhav Hemant Dixit
0780aea966
[master] Submodule advance sonic-swss (#8915)
[Submodule advance sonic-swss]
Include below commits to master image:

Cache routes for single nexthop for faster retrieval Azure/sonic-swss#1922
Reduce route count for route perf test (Azure/sonic-swss#1928)
[pytest]: Re-use DVS container when possible (Azure/sonic-swss#1816)
[PORTSYNCD] when no ports on config db on init, continue and set Port… (Azure/sonic-swss#1861)
[gearbox] Add gearbox unit test (Azure/sonic-swss#1920)
Reverted skipped test_buffer_dynamic test cases (Azure/sonic-swss#1937)
Revert "[buffer orch] Bugfix: Don't query counter SAI_BUFFER_POOL_STA… (Azure/sonic-swss#1945)
2021-10-08 17:29:23 -07:00
xumia
3855ce2849
[ci]: Support azp for cisco 8000 (#8654)
Why I did it
Setup Azure pipeline for cisco 8000.
2021-10-08 15:31:49 +08:00
Aravind Mani
77b6bc39be
DellEMC: Fix z9332f low power mode issue (#8693) 2021-10-08 11:17:36 +05:30
Wirut Getbamrung
800de696db
[Celestica/sonic_platform]: Fixed failed test cases in Haliburton platform testing (#8815)
* [device/celestica-e1031]: fix APIs to follow latest spec
* [device/celestica-e1031]: fix lgtm (#261)
2021-10-08 11:10:05 +08:00
arlakshm
34267393b3
[yang] Feature yang changes (#7955)
Why I did it
Add yang model for Feature configuration

How I did it
Add feature.yang and unit tests

Signed-off-by: Arvindsrinivasan Lakshmi Narasimhan <arlakshm@microsoft.com>
2021-10-05 15:44:24 -07:00
Aravind Mani
b7d49b0a0c
Dell S6000: PCIe Gen1 settings (#8663)
Why I did it
PCIe Gen1 settings was needed for Dell S6000 device.

How I did it
Modified from Gen2 to Gen1 speed for Dell S6000 PCIe devices

How to verify it
Check lspci output, verify the syslogs
2021-10-05 10:17:26 -07:00
byu343
677f31dac3
[arista] Add asic and phy configs for clearwater2ms (#8174)
* Add ASIC configs for clearwater2ms
* Add 100G gearbox configs for clearwater2ms
2021-10-04 19:11:57 -07:00
Junchao-Mellanox
552963ab0e
[Mellanox] Change thermal recover threshold from temp_trip_norm to temp_trip_high (#8792)
- Why I did it
Change thermal recover threshold from temp_trip_norm to temp_trip_high, so that thermal algorithm would set fan speed to minimum allowed earlier and save power.

- How I did it
Change thermal recover threshold from temp_trip_norm to temp_trip_high

- How to verify it
Manual test
2021-10-04 20:20:33 +03:00
kellyyeh
df6361f50c
Change radv interval to 3min (#8882) 2021-10-01 15:00:16 -07:00
Ying Xie
3e397ce8b2
[Nokia 7215] Rename alias column with etpN normination (#8879)
also add hwsku alias Nokia-M0-7215

Signed-off-by: Ying Xie <ying.xie@microsoft.com>
2021-09-30 15:27:54 -07:00
Praveen Chaudhary
83108d9c9a
[YANG MGMT]: Support Grouping translation in YANG Models. (#8318)
Changes:
 -- Pre-process Grouping section from all yang models, so it can be used
from any yang model.
-- add jsondiff in setup.py, it is useful for test debugging in case of
failures.
-- use 'stypes' instead of head.
-- pass config DB table name in _createLeafDict().
-- added test config for grouping.
-- white spaces changes.

Note: Changes are done in the way that we can add support for other
Generic YANG statement easily for translation.

Signed-off-by: Praveen Chaudhary pchaudhary@linkedin.com
2021-09-30 12:53:34 -07:00
Sudharsan Dhamal Gopalarathnam
1e35915dcf
Load global config in caclmgrd only in multi asic NPU (#8812)
How I did it
Added if multi npu check before invoking the load global config.

How to verify it
Restart caclmgrd after this change and check if no error log is thrown.
2021-09-30 12:45:51 -07:00
Sumukha Tumkur Vani
33e64a4e7e
[RESTAPI] submodule update (#8859)
Rejection of incorrect CIDR addresses while configuring routes.
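
As an illustration of what rejecting an incorrect CIDR address can involve, here is a minimal Python sketch built on the standard `ipaddress` module. It is not the RESTAPI implementation; the function name and the exact rules (a prefix length is required, host bits must be zero) are assumptions.

```python
import ipaddress


def is_valid_cidr(prefix):
    """Return True only for a well-formed network prefix such as 10.0.0.0/24.

    Assumed rules for this sketch: the string must contain a prefix length,
    and the address must have no host bits set (strict=True).
    """
    if "/" not in prefix:
        return False
    try:
        ipaddress.ip_network(prefix, strict=True)
    except ValueError:
        return False
    return True
```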
2021-09-29 15:43:04 -07:00
Sumukha Tumkur Vani
67db126333
Reduce logging level for RESTAPI from trace to info (#8858) 2021-09-29 15:41:47 -07:00
Alexander Allen
57ad1ed680
Add Mellanox-SN4600C-D100C12S2 SKU (#8832)
*Add Mellanox-SN4600C-D100C12S2 SKU
2021-09-29 14:14:11 -07:00
Ashok Daparthi-Dell
6cbdf11e53
SONIC QOS YANG - Remove qos tables field value reference format (#7752)
Depends on Azure/sonic-utilities#1626
Depends on Azure/sonic-swss#1754

QOS tables in config db used ABNF format, i.e. "[TABLE_NAME|name]", to refer a field value to other qos tables.

Example:
Config DB:
"Ethernet92|3": {
"scheduler": "[SCHEDULER|scheduler.1]",
"wred_profile": "[WRED_PROFILE|AZURE_LOSSLESS]"
},
"Ethernet0|0": {
"profile": "[BUFFER_PROFILE|ingress_lossy_profile]"
},
"Ethernet0": {
"dscp_to_tc_map": "[DSCP_TO_TC_MAP|AZURE]",
"pfc_enable": "3,4",
"pfc_to_queue_map": "[MAP_PFC_PRIORITY_TO_QUEUE|AZURE]",
"tc_to_pg_map": "[TC_TO_PRIORITY_GROUP_MAP|AZURE]",
"tc_to_queue_map": "[TC_TO_QUEUE_MAP|AZURE]"
},

This format is not consistent with the other DB schemas followed in sonic.
Also, this reference in the DB is not required; it is taken care of by YANG "leafref".

Removed this format from all platform files to be consistent with the other sonic db schemas.
Example:
"Ethernet92|3": {
"scheduler": "scheduler.1",
"wred_profile": "AZURE_LOSSLESS"
},
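
The conversion from the old reference format to the plain value shown above can be sketched in Python. This is an illustration only, not the DB migration code from Azure/sonic-utilities#1626; the function names and the regular expression are assumptions.

```python
import re

# Matches the old "[TABLE_NAME|name]" form and captures the name part.
_REFERENCE = re.compile(r"^\[[^|\]]+\|([^\]]+)\]$")


def strip_reference(value):
    """Turn "[SCHEDULER|scheduler.1]" into "scheduler.1"; leave other values alone."""
    match = _REFERENCE.match(value)
    return match.group(1) if match else value


def migrate_entry(entry):
    """Apply strip_reference to every field value of one config DB entry."""
    return {field: strip_reference(value) for field, value in entry.items()}
```

Values that are not references, such as `"pfc_enable": "3,4"`, pass through unchanged.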

Dependent pull requests:
#7752 - To modify platform files
#7281 - Yang model
Azure/sonic-utilities#1626 - DB migration
Azure/sonic-swss#1754 - swss change to remove ABNF format
2021-09-28 09:21:24 -07:00
Sudharsan Dhamal Gopalarathnam
b2659dcdbc
Handle feature flow when state is always_enabled (#8811)
Why I did it
When feature state is set to always_enabled hostcfgd throws error message
Sep 21 22:30:55.135377 r-leopard-32 ERR /hostcfgd: Unexpected state value 'always_enabled' for feature bgp
Sep 21 22:30:55.420268 r-leopard-32 ERR /hostcfgd: Unexpected state value 'always_enabled' for feature database
Sep 21 22:30:58.672714 r-leopard-32 ERR /hostcfgd: Unexpected state value 'always_enabled' for feature swss
This is due to feature == always_enabled not handled properly.

How I did it
Handled the scenario when feature is always enabled

How to verify it
Restart hostcfgd with feature state configured as always_enabled and check if there are no errors.
Added UT to cover the scenario.
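
A rough Python sketch of the kind of state handling involved is shown below. It is not the hostcfgd code: the function name, the returned action strings, and the choice to treat always_enabled the same as enabled are assumptions for illustration.

```python
def resolve_feature_action(feature_name, state):
    """Map a feature state to an action, accepting always_enabled.

    Assumption for this sketch: always_enabled is handled like enabled
    instead of falling through to the "unexpected state" error path.
    """
    if state in ("enabled", "always_enabled"):
        return "start"
    if state == "disabled":
        return "stop"
    raise ValueError(
        f"Unexpected state value '{state}' for feature {feature_name}"
    )
```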
2021-09-28 08:52:03 -07:00
ArthiSivanantham
ada8043ed3
SONiC Yang for Warm Restart (#7698)
Why I did it
SONiC YANG model support for warm restart.

How I did it
Defined warm restart YANG containers and lists based on config-DB schema.

How to verify it
Successful build of the following packages:
make target/python-wheels/sonic_yang_models-1.0-py3-none-any.whl
make target/python-wheels/sonic_yang_mgmt-1.0-py3-none-any.whl

Signed-off-by: Arthi Sivanantham <arthi_sivanantham@dell.com>
2021-09-28 08:51:26 -07:00
Junchao-Mellanox
c770375b3f
[submodule] Update submodule for sonic-utilities (#8752)
0d538d3 [ci]: Support code diff coverage (#1834)
48887d1 [config] support for configuring muxcable to standby mode of operation (#1837)
2088a9a Provide support to install platform extensions (#1578)
c97fe54 Add check_db_integrity script to setup.py (#1828)
c0b9917 [debug dump util] COPP Module Added (#1670)
826311c [techsupport] Removed interactive option for docker commands and Improved Error Reporting (#1723)
ce11545 [config reload] Removed job-mode for sonic.target restart (#1820)
f76f672 [fdbshow]: Fix typo in comment (#1809)
17208a0 [ci]: Support PR coverage (#1806)
c2c2354 fix wrong code indent in sfputil (#1808)
47a9a0f [portconfig] Validate duplicate speed value and interface type value (#1745)
f1086ee [sonic_installer]Add --skip-platform-check option for sonic_installer when image mismatch (#1791)
c007d65 [warm-reboot] Add new preboot health check: verify database integrity (#1785)
41e31e8 Fix PatchApplier init order (#1762)
2416175 [config reload] Fix config reload failure due to sonic.target job cancellation (#1814)
2b12aad [portstat, intfstat] added rates and utilization (#1750)
26e700a [debug dump util] Techsupport addition (#1669)
9f2326e [debug dump util] Base Skeleton and Click Class added (#1668)
2021-09-28 02:52:04 -07:00
Junhua Zhai
144b9f158d
[docker-sonic-vs] Make the ID numbers of "GB_ASIC_DB", "GB_COUNTERS_DB" and "GB_FLEX_COUNTER_DB" same as the ones in swss-common (#8845) 2021-09-28 13:54:12 +08:00
Ashok Daparthi-Dell
6c40fe4f44
[Submodule] update for swss (#8839)
*[Submodule] update for swss with following commits:
a3fdaf4 QOS fieldvalue reference ABNF format to string changes (#1754)
a8fcadf Add sleep to ensure starting route perf test after the vs is stable (#1929)
a89d1f8 Fix failing DPB LAG tests (#1919)
86b4ede [portsorch] Avoid orchagent crash when set invalid interface types to port (#1906)
025032f [VS] Skip failing test - test_recirc_port (#1918)
d338bd0 [pfcwd] Fix the polling interval time granularity (#1912)
14c937e Enabling copp tests (#1914)
fbdcaae [teammgrd]: Improve LAGs cleanup on shutdown: send SIGTERM directly to PID. (#1841)
002bb1d [tlm teamd] Add retry mechanism before logging the ERR in get_dumps. (#1629)
57d21e7 [pfcwd] Convert polling interval from ms to us in LUA scripts (#1908)
d01524d [fgnhgorch] Enable packet flow when no FG ECMP neighbors are resolved (#1900)
8cf355d Mux state order change (#1902)
2021-09-27 18:20:08 -07:00
arunlk-dell
c668f2ab5e
DellEMC: Initial commit for S5224F platform support (#8717)
Why I did it
Added support for the device S5224F

How I did it
Implemented the support for the platform S5224F
Switch Vendor: DellEMC
Switch SKU: S5224F-ON
ASIC Vendor: Broadcom
SONiC Image: sonic-broadcom.bin

How to verify it
Verified the show platform/interface commands
2021-09-26 09:34:16 -07:00
arunlk-dell
b0b0ba828a
DellEMC: N3248PXE Initial platform commit (#8562)
Why I did it
Added support for the device N3248PXE

How I did it
Implemented the support for the platform N3248PXE
n3248pxe_unit_test_log.txt

Switch Vendor: DellEMC
* Switch SKU: N3248PXE
* ASIC Vendor: Broadcom
* SONiC Image: sonic-broadcom.bin

How to verify it
Verified the show platform commands
2021-09-25 15:35:16 -07:00
SuvarnaMeenakshi
5324ce8a4d
[azure-pipeline][multi-asic]: Add azure pipeline script to generate multi asic vs image (#8215)
Why I did it
To be able to run VS test on official multi asic VS image.

How I did it
Add a new script to build multi-asic VS image by passing NUM_ASIC build parameter.
Run multi-asic t1-lag test cases with the built image.
2021-09-24 11:14:43 -07:00
Ye Jianquan
38500fa92e
Add gdb and pyrasite to ptf image (#8816) 2021-09-24 17:10:48 +08:00
Vaibhav Hemant Dixit
ee9250e8cc
Save DB dump after warm/fast reboot (#8803)
As a part of warmboot, redis database is dumped:
c97fe546e5/scripts/fast-reboot (L269)
However, this dump file is deleted, after it is loaded back into db post reboot.
The DB dump can be useful for debugging purpose, hence taking a backup of it can be useful.
Instead of deleting the dump, rename and keep the dump.
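
The rename-instead-of-delete idea can be sketched in Python as follows. The function name, the `.bak` suffix, and the path handling are hypothetical; the commit message does not state the file names used.

```python
import os


def preserve_dump(dump_path, suffix=".bak"):
    """Rename the loaded DB dump instead of deleting it.

    Returns the backup path, or None when there is no dump to keep.
    Both the suffix and the return convention are assumptions of this sketch.
    """
    if not os.path.exists(dump_path):
        return None
    backup_path = dump_path + suffix
    # os.replace overwrites an older backup, so only the latest dump is kept.
    os.replace(dump_path, backup_path)
    return backup_path
```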
2021-09-23 23:53:22 -07:00