Commit Graph

5062 Commits

Author SHA1 Message Date
Song Yuan
c9e01cf6ef [chassis] Set LAG Id range for 7800 chassis (#8052)
Configure LAG Id range in chassisdb.conf for 7800 chassis.
2021-09-02 15:38:28 -07:00
richardyu
1808d5894e [202012][saiserver docker]adds saiserver dependences (#8447)
Co-authored-by: richardyu-ms <richard.yu@microsoft.com>
2021-09-02 15:38:17 -07:00
Samuel Angebault
7ba0d3497f [Arista] Rely on automatic flash size detection for Lodoga (#8608)
Lodoga actually has a 8GB storage device.
LodogaSsd variant has a 30GB SSD drive.
However, in boot0 both were mishandled and assigned 4GB for legacy reasons.

Remove the hardcoding of the flash size and let boot0 autodetect the available space.
2021-09-02 15:38:04 -07:00
dflynn-Nokia
7a950cf49c [Nokia ixs7215] Add support for changing the console baud rate (#8595)
This commit adds support for changing the default console baud rate configured
within the U-Boot bootloader. That default baud rate is exposed via the value
of the U-Boot 'baudrate' environment variable. This commit removes logic that
hardcoded the console baud rate to 115200 and instead ensures that the U-Boot
'baudrate' variable is always used when constructing the Linux kernel boot
arguments used when booting Sonic.

A change is also made to rc.local to ensure that the specified baud rate is set
correctly in the serial getty service.
2021-09-02 15:37:54 -07:00
Shilong Liu
90e7f6b201 [build] Fix reproducible build issues (#8548)
* [build] Fix reproducible build issues
2021-09-02 15:37:42 -07:00
gechiang
8527e3fb18 BRCM Disable ACL Drop counted towards interface RX_DRP counters part II (#8596) 2021-09-02 15:37:17 -07:00
Samuel Angebault
6c6bfde24e
[202106][Arista] Update platform library submodules (#8643)
Update 202106 release with the Arista drivers from master.
This update is mostly targeted at chassis and is deemed stable.

Most notable chassis improvements:
 - fix psu reporting in platform api
 - powercycle fabrics on supervisor reboot
 - improve card powercycle reliability
 - fix led plugin when dealing with `Ethernet-Rec` and `Ethernet-IB`
 - fix system-health cli reporting
 - unset provisioning mode once the linecard has started
 - fix `show version` when running as `admin`
 - fix race between loading the eeprom module and sysfs file availablity
 - implement `get_all_asics` platform API
2021-09-01 23:23:11 -07:00
Judy Joseph
d7f5dded12 Revert "[bgpcfgd][voq] Fix for unit test failure in bgp config for voq switch (#8278)"
This reverts commit 2353c7decd.
2021-08-26 09:33:23 -07:00
Judy Joseph
dbcb55be66 Update sonic-swss submodule with
88a38f7 Ignore ALREADY_EXIST error in FDB creation (#1815)
b1c23f3 Change rif_rates.lua and port_rates.lua scripts to calculate rates correct (#1848)

Update sonic-utilities submodule with

cbc25d6 [config reload] Call systemctl reset-failed for snmp,telemetry,mgmt-framework services (#1773)
04dcd07 Improve config error handling on version_info (#1760)
e567a60 Load the database global_db. (#1752)
c15fb8f [sfputil] Gracefully handle improper 'specification_compliance' field (#1741)
39350f8 [dhcp_relay] Update CLI reference document and add a new API for ip address type (#1717)
18f13c6 [sonic-package-manager] switch from poetry-semver to semantic_version due to bugs found in poetry-semver (#1710)
b16724a [voq][chassis] VOQ cli show commands implementation (#1689)
9427cd6 [debug dump util] Match Infrastructure (#1666)
d9fb39b [route_check] Filter out VNET routes (#1612)
2021-08-25 22:52:58 -07:00
Judy Joseph
043e02e606 Update the sonic-linux-kernel submodule 2021-08-25 14:51:48 -07:00
byu343
4830ec99ae [Arista] Update phy-credo service for config load (#8005)
phy-credo.service will be restarted when running 'config reload'

Signed-off-by: Boyang Yu <byu@arista.com>
2021-08-25 14:41:36 -07:00
shlomibitton
edd6f4086c [dhcp_relay] Adapt config/show CLI commands to support DHCPv6 relay (#8211)
#### Why I did it
- Adapt config/show CLI commands to support DHCPv6 relay
- Support multiple dhcp servers assignment in one command
- Fix IP validation
- Adapt UT and add new UT cases

#### How I did it
- Modify config/show dhcp relay files
- Modify config/show UT files

#### How to verify it
This PR has a dependency on PR https://github.com/Azure/sonic-utilities/pull/1717
Build an image with the dependent PR and this PR
Use config/show DHCPv6 relay commands.
2021-08-25 12:45:03 -07:00
Christian Svensson
c7d4f5b8d8 [mgmt-framework]: Fix typo in mgmt_vars.j2 (#8475)
Signed-off-by: Christian Svensson <blue@cmd.nu>
2021-08-25 12:44:41 -07:00
vganesan-nokia
2353c7decd [bgpcfgd][voq] Fix for unit test failure in bgp config for voq switch (#8278)
The unit test failure was due to missing bgp graceful restart select
defer time configuration in voq_chassis.conf. Modified sample output
data file voq_chassis.conf to include this configuration.

Signed-off-by: vedganes <vedavinayagam.ganesan@nokia.com>
2021-08-25 12:44:23 -07:00
Vivek Reddy
b0f8633604 [Mellanox][master][SKU] sonic interface names are aligned to 4 instead of 8 for 4600/4600C platforms (#8155)
*Edited platform.json for 4600 & 4600C
*Edited hwsku.json and port_config.ini files for all the SKU's present under these platforms
Signed-off-by: Vivek Reddy Karri <vkarri@nvidia.com>
2021-08-25 12:44:06 -07:00
Vivek Reddy
9a711645da [Mellanox] [master] Added D48C40 SKU for 4600C platform (#8201)
*Added new SKU for SN4600C Platform: Mellanox-SN4600C-D48C40
Signed-off-by: Vivek Reddy Karri <vkarri@nvidia.com>
2021-08-25 12:43:47 -07:00
madhanmellanox
3057266ff6 Adding SKU Mellanox-SN3800-D100C12S2 (#8441)
* Adding SKU Mellanox-SN3800-D100C12S2

Co-authored-by: Madhan Babu <madhan@l-csi-0241l.mtl.labs.mlnx>
2021-08-25 12:43:16 -07:00
Kostiantyn Yarovyi
a83480f749 [Pcied] run by python 3
Why I did it
Pcied running by python 2.

How I did it
dropped python2 support and add python3 support for pcied in file docker-pmon.supervisord.conf.j2

How to verify it
docker exec pmon supervisorctl status
2021-08-25 12:38:14 -07:00
Alexander Allen
9a3e35ec83 [Mellanox] Upgrade Mellanox firmware tools to 4.17.0 (#8299)
- Why I did it
New release of MFT has the following changelog / RN
 Fixed an issue that resulted in getting MVPD read errors from the mlxfwmanager during fast reboot.
 Fixed mlxuptime sometimes generating a time less than previous due the wrong frequency calculation

- How I did it
Update makefile pointer to new version.

- How to verify it
Manually tested on all Mellanox platforms.
2021-08-25 12:37:36 -07:00
Volodymyr Samotiy
055d497799 [monit] Periodically monitor VNET route consistency (#8266)
*To run VNET route consistency check periodically.
*For any failure, the monit will raise alert based on return code.
Signed-off-by: Volodymyr Samotiy <volodymyrs@nvidia.com>
2021-08-25 12:37:19 -07:00
Alexander Allen
50dcc82ee0 [Mellanox] Add 2x100G and 4x50G breakout modes to MSN4410 platform (#8419)
- Why I did it
The MSN4410 platform was missing 2x100G and 4x50G supported breakout modes in platform.json

- How I did it
Added the aforementioned breakout modes to platform.json

- How to verify it
Run show interface breakout on a MSN4410 and verify that 2x100G and 4x50G are listed in the supported breakout modes for ports 1-24.
2021-08-25 12:36:31 -07:00
Samuel Angebault
8de67e0051 [Arista] Add VOQ information for Clearwater2 (#8508)
This change introduces 3 columns in the port_config.ini file.
These are coreId, corePortId and numVoq.
The ports for inband and recirc were also renamed properly.
2021-08-25 12:25:22 -07:00
Marty Y. Lok
f33b659aff Added Nokia IXR7250E support (#7809)
Why I did it
Support Nokia ixr7250E IMM and Supervisor cards

How I did it
Added modules x86_64-nokia_ixr7250e_sup-r0 and x86_64-nokia_ixr7250e_36x400g-r0 ../device/nokia directory.
Modified the platform/broadcom/one-image.mk to include NOKIA_IXR7250_PLATFORM_MODULE
Modified the platform/broadcom/rule.mk to include the platform-module-nokia.mk
2021-08-25 12:21:39 -07:00
abdosi
f9a2c096ab Enable sysctl fib_multipath_use_neigh (#8502)
Enable fib_multipath_use_neigh for v4
https://www.kernel.org/doc/Documentation/networking/ip-sysctl.txt

Why I did:
This is helpful if the neighbor are not directly connected then Kernel forward to unreachable neighbor option. With this option forwarding using neighbor state to be valid.
2021-08-25 12:21:20 -07:00
Stepan Blyshchak
9e9e2abdf7 [libteam][warm-reboot] fix issue in teamd warm-reboot that teamd starts (#8227)
with state of tdport from previous warm-reboot.

In case LAG was down before reboot, lacp->wr is not cleared.
In lacp_event_watch_port_flush_data we incremented nr_of_tdports and add
tdport to lacp->wr.state. In case lacp->wr.state already had this tdport
we do not set new state for tdport but appened a new item in
lacp->wr.state. In case we preformed warm-reboot and PortChannel member
was down, after reboot PortChannel member became up next warm-reboot
will initialize teamd with PortChannel member in down state.

Fix this issue by calling stop_wr_mode() when LAG was down. This was probably intended but missed.

#### Why I did it

To fix an issue seen in warm-reboot-sad test cases.

#### How I did it

I fixed it in SONiC libteam patch that adds warm-reboot support. Details in commit description.

#### How to verify it

Run warm-reboot-sad test on t0-56 topology.
2021-08-25 12:20:57 -07:00
Qi Luo
65c135a1ae Replace swsssdk with swsscommon in sonic-host-services (#8034)
#### Why I did it
swsssdk will be deprecated. Use swsscommon instead.

#### How to verify it
Unit test
2021-08-25 12:20:32 -07:00
minionatwork
1a41e76c45 [multi-asic] Fix for sonic-cfggen exception during platform string read (#8229)
Fix for sonic-cfggen exception during platform string read during fresh install and start of sonic in multi asic, /var/run/redisX/ is created after database docker is started.
2021-08-25 12:20:09 -07:00
Stephen Sun
288c49a085 Use predefined macro as vendor information (#8361)
#### Why I did it
Use a predefined variable to get vendor information when the swss docker container is created

#### How I did it
Use `{{ sonic_asic_platform }}` instead of `$SONIC_CFGGEN -y /etc/sonic/sonic_version.yml -v asic_type`

#### How to verify it
Manually test.
2021-08-25 12:19:09 -07:00
tomer-israel
69e32890e2 [Mellanox] fix syseeprom info values on mellanox simulator platforms msn4700 and msn4800 (#8387)
#### Why I did it
The values of the syseeprom were not valid

#### How I did it
I took the correct hexdump values from a real switch and created this hex file again

#### How to verify it
decode-syseeprom will display the new values
2021-08-25 12:18:37 -07:00
Ying Xie
34487eef5d [aboot] use ram partition for /var/log for devices with 3.7G disks (#8400)
Master/202012 image size grew quite a bit. 3.7G harddrive can no longer hold one image and safely upgrade to another image. Every bit of harddrive space is precious to save now.

Also sh syntax seemingly changed, [ condition ] && action was a legit syntax in 201911 branch but it is an error when condition not met with 202012 or later images. Change the syntax to if statement to avoid the issue.

Signed-off-by: Ying Xie ying.xie@microsoft.com
2021-08-25 12:18:19 -07:00
gechiang
c679ebf931 Reapply the fix to address setting MTU > 1500 causing portmgrd crash on BRCM platforms (#8472) 2021-08-25 12:17:56 -07:00
Kebo Liu
0108c7de58 [Mellanox] Upgrade hw-mgmt to 7.0100.2344 (#8463)
To pick up new PSU fan support from new hw-mgmt release

Signed-off-by: Kebo Liu <kebol@nvidia.com>
2021-08-25 12:17:36 -07:00
carl-nokia
aef7c85695 [Nokia ixs7215] sfputil support + component tests (#8445)
Deliver sfputil support for sfputil show eeprom and sfputil reset along with some component test case fixes

Co-authored-by: Carl Keene <keene@nokia.com>
2021-08-25 12:17:07 -07:00
Vladyslav Morokhovych
47496ec8a4 [swss] Fix arp_update script (#8412)
Fix #7968

Issue is detected on SONiC.20201231.11

In test_static_route.py::test_static_route_ecmp static routes are configured, but neighbors are not resolved after config reload even after 10 minutes.
It looks like the arp_update script is starting to ping when Vlan1000 is not fully configured.
When issue is reproduced, stuck ping6 process is observed in swss container :

USER         PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
root         180  0.1  0.0   6296  1272 pts/0    S    17:03   0:03 ping6 -I Vlan1000 -n -q -i 0 -c 1 -W 0 ff02::1
And when arp_update script successfully resolves neighbors, we observe sleep 300 instead of ping process
2021-08-25 12:16:42 -07:00
Shi Su
499ad9141b [FRR]: Upgrade FRR to frr-7.5.1-s1 tag (#8443)
Update FRR 7.5.1 head.
2021-08-25 12:16:19 -07:00
carl-nokia
61dfcae4e0 [Nokia] Add hwsku.json for the Nokia-7215 (#8372)
* add hwsku.json for the Nokia-7215
* added required default_brkout_mode to hwsku as its not optional
* remove tabs from the file so spacing consistent

Co-authored-by: Carl Keene <keene@nokia.com>
2021-08-25 12:16:01 -07:00
shlomibitton
cbb825cb4b [hostcfgd] Delay hostcfgd and aaastatsd for faster boot time (#7965)
#### Why I did it
hostcfgd is starting at the same time as 'create_switch' method is called on orchagent process.
This introduce a degradation on the function execution time which eventually cause the fast-boot flow and a boot scenarion in general to run slower (~6 seconds).
This change will delay the start time of this daemon.
The aaastatsd will delay as well since it has a dependency on hostcfgd, so it is required to delay both.
90 seconds determined as the maximum allowed downtime for control plane to come back up on fast-boot flow.

#### How I did it
Add two timers for hostcfgd and aaastatsd  services in order to delay the startup of these services.

#### How to verify it
Install an image with this change and observe the daemons start 90 seconds after the system boot.
2021-08-25 12:15:16 -07:00
jerseyang
418ee0cd38 enable the emc2305 fan controller and NCP power controller 30ms timeout mechanism (#8138)
Why I did it
fix the dx010 system eeprom unavailable issue

How I did it
enable the i2c slave 30ms timeout mechanism

How to verify it
i2cstress test in DX010 iSMT controller bus

Co-authored-by: nicwu-cel <nicwu@celestica.com>
2021-08-25 12:14:59 -07:00
carl-nokia
43fa47d486 [sonic-device-data]: add port_type to OPTIONAL_PORT_ATTRIBUTES (#8370)
enable automated test suites to selectively run relevant tests ( or not run tests ) based upon a new port_type identifier in hwsku.json

How I did it
Modified the valid optional fields in validity check for hwsku.json per recommendation from Joe in
https://github.com/Azure/sonic-mgmt/pull/2654/files

Co-authored-by: Carl Keene <keene@nokia.com>
2021-08-25 12:14:17 -07:00
dflynn-Nokia
fb072d84cb [Nokia ixs7215] Watchdog timer support (#8377) 2021-08-25 12:13:44 -07:00
Rajkumar-Marvell
31f4154787 [reboot-cause] Fixed determine-reboot-cause.service failure. (#8210)
Signed-off-by: Rajkumar Pennadam Ramamoorthy rpennadamram@marvell.com

Why I did it
Install sonic image from ONIE. Once system is up, execute "config reload" command.

Root cause is that "determine-reboot-cause.service" was in failed state.
root@sonic:/host/reboot-cause# systemctl list-units --failed
UNIT LOAD ACTIVE SUB DESCRIPTION
● determine-reboot-cause.service loaded failed failed Reboot cause determination service

How I did it
Fixed the issue by setting default reason to "REBOOT_CAUSE_UNKNOWN" instead of "None".

How to verify it
Check " determine-reboot-cause.service' loaded successfully post image installation from ONIE.
Verify "reboot-cause.txt" file is created and config reload succeeds.
2021-08-25 12:13:15 -07:00
Shilong Liu
9fb60721f8 Reproducible build add docker image debian* to white list. (#8330)
#### Why I did it
1. Add version control for debian* docker image to white list.
2. Always record docker image sha256 value, regardless of white list.
2021-08-25 12:12:42 -07:00
Kebo Liu
dd9a9ba4c3 [Mellanox] Add new sensor conf to support SN4410 A1 system (#8379)
#### Why I did it

New SN410 A1 system has a different sensor layout with A0 system, needs a new sensor conf file to support it.

#### How I did it

Since the SN4410 A1 system use exactly the same sensor layout as the SN4700 A1 system, so add a symbol link linking to the SN4700 A1 sensor conf file to reuse.

#### How to verify it

Run sensor test against the SN4410 A1 system;
Run platform related regression test against the SN4410 A1 system
2021-08-25 12:12:18 -07:00
tjchadaga
8b780d68a9 Fix TH3 Warm-reboot failure due to Tunnel termination SAI failure (#8395) 2021-08-25 12:12:00 -07:00
gechiang
280df2ee46 BRCM Disable ACL Drop counted towards interface RX_DRP counters (#8382)
* BRCM Disable ACL Drop counted towards interface RX_DRP counters
2021-08-25 12:11:23 -07:00
judyjoseph
6bbfafb045 [build]: Update the make cache mode for opennsl-module-dnx (#8391)
Fix warning shown during compilation

[ DPKG ] Cache is not enabled for opennsl-modules-dnx_5.0.0.4_amd64.deb package
2021-08-25 12:10:59 -07:00
Longxiang Lyu
9cc4b7b406 [swss][arp_update] Send ipv6 pings over vlan sub interfaces (#8363)
#### Why I did it
* `arp_update` fails to ping those neighbors over vlan sub interfaces.

#### How I did it
* modify `arp_update_vars.j2` to get vlan sub interfaces with ipv6 addresses assigned.
* modify `arp_update` to send ipv6 pings over those retrieved vlan sub interfaces.

Signed-off-by: Longxiang Lyu <lolv@microsoft.com>
2021-08-25 12:10:33 -07:00
Blueve
02bce90933 [ARM] Fix issue whre the ping6 tool is missing from orchagent docker (#8345)
Signed-off-by: Jing Kan jika@microsoft.com
2021-08-25 12:10:06 -07:00
Guohan Lu
52a59f827e [ci]: fix artifact download syntax error for vstest (#8547)
Signed-off-by: Guohan Lu <lguohan@gmail.com>
2021-08-21 14:31:49 -07:00
lguohan
b9d6eb0678 [openssh]: move build dep installation to sonic-slave-buster (#8381)
install build dep causes dpkg lock issue in parallel build

Signed-off-by: Guohan Lu <lguohan@gmail.com>
2021-08-20 16:07:02 +08:00