7642 Commits

Author SHA1 Message Date
judyjoseph
efeae03ea3
Add override_config to load_minigraph in config-setup service (#14834)
This PR overrides the minigraph config with the golden_config_db.json file if it is present in the backup location.
2023-05-10 11:54:33 -07:00
Junchao-Mellanox
9deca05f9d
[Mellanox] get LED capability from capability file (#14584)
- Why I did it
Currently, the LED sysfs path is hardcoded, so the LED code has to change whenever a new LED color is supported on a new platform. This PR improves this by deducing the LED sysfs path from the LED capability file.

- How I did it
Improve LED management on Nvidia platform:
get LED capability from capability file and deduce sysfs name according to the capability

- How to verify it
Unit test
Manual test
2023-05-10 20:53:50 +03:00
Yakiv Huryk
fa02411750
[Mellanox][asan] disable fast_unwind_on_malloc for mlnx syncd (#14858)
- Why I did it
To improve ASAN backtrace output when the call stack contains code that is not compiled with -fno-omit-frame-pointer.

- How I did it
Added fast_unwind_on_malloc=0 to the ASAN_OPTIONS
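
As an illustration only (not the change made in this PR), setting a single key in a colon-separated ASAN_OPTIONS string could be sketched as below. The helper name set_asan_option and the starting value detect_leaks=1 are assumptions made for the example.

```python
def set_asan_option(asan_options, key, value):
    """Return a colon-separated ASAN_OPTIONS string with `key` set to `value`.

    Hypothetical helper: any existing setting for `key` is replaced,
    all other options are kept in their original order.
    """
    parts = [p for p in asan_options.split(":") if p]
    kept = [p for p in parts if p.split("=", 1)[0] != key]
    kept.append(f"{key}={value}")
    return ":".join(kept)


# Assumed starting value, for illustration only.
print(set_asan_option("detect_leaks=1", "fast_unwind_on_malloc", "0"))
```

Replacing an existing entry rather than appending blindly keeps the string free of conflicting duplicates.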

- How to verify it
Build and test docker-syncd-mlnx.gz with ENABLE_ASAN=y

Signed-off-by: Yakiv Huryk <yhuryk@nvidia.com>
2023-05-10 20:50:42 +03:00
DavidZagury
a10c1951d6
[Mellanox] Update SN5600 SAI XML file (#14947)
- Why I did it
Update SAI xml file to align with the default SKU

- How I did it
Update the SN5600 SAI xml file

- How to verify it
Install image on SN5600 device
2023-05-10 20:43:27 +03:00
Junchao-Mellanox
5e893666df
[system-health] Add fan direction check for system health (#14509)
- Why I did it
Add a fan direction check to system health: all fans should be in the same direction.

- How I did it
Add a fan direction check to system health: all fans should be in the same direction.

- How to verify it
Manual test
Unit test
Added sonic-mgmt test case to verify
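
A minimal sketch of such a check, assuming each fan reports a direction string (the values "intake" and "exhaust" and the function name are assumptions for the example). Treating the most common direction as the expected one is a choice made here, not necessarily what the PR implements.

```python
from collections import Counter


def check_fan_direction(fan_directions):
    """Return (ok, mismatched) for a mapping of fan name -> direction.

    Hypothetical helper: fans reporting None (direction unknown) are ignored.
    """
    known = {name: d for name, d in fan_directions.items() if d is not None}
    if not known:
        return True, []
    # Assumption: the most common direction is treated as the expected one.
    expected, _ = Counter(known.values()).most_common(1)[0]
    mismatched = sorted(name for name, d in known.items() if d != expected)
    return not mismatched, mismatched


ok, bad = check_fan_direction(
    {"fan1": "intake", "fan2": "intake", "fan3": "exhaust"}
)
print(ok, bad)  # False ['fan3']
```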
2023-05-10 20:38:20 +03:00
Ze Gan
aeaaebbafc
[Update-submodule] sonic-wpa-supplicant (#14986)
a24412c25 (HEAD, origin/master, origin/HEAD, master) [mka]: Fix unexpected cleanup (#73)
26d1da0bc [mka]: Fix re-establishment by reset MI (#72)

Signed-off-by: Ze Gan <ganze718@gmail.com>
2023-05-10 10:19:47 -07:00
Konstantin Vasin
bba4fb86c7
[Build] update python package docker in host image to 6.1.1 (#14993)
Fix #14974
Refs: https://github.com/docker/docker-py/pull/3116
2023-05-10 07:43:35 -07:00
mssonicbld
be8c36e256 [submodule] Update submodule sonic-gnmi to the latest HEAD automatically 2023-05-10 16:32:58 +08:00
mssonicbld
c1e2e7f4eb
[submodule] Update submodule wpasupplicant/sonic-wpa-supplicant to the latest HEAD automatically (#14998) 2023-05-10 15:09:33 +08:00
Tejaswini Chadaga
4e60f0d563
Template change for BGP monitors on T2 (#14844)
Why I did it
To support BGPMon sessions from each T2 linecard ASIC

Work item tracking
Microsoft ADO (number only): 17873174
How I did it
Added change in BGPMon configuration to use Loopback4096 as source interface, since this has a unique IP per ASIC.

How to verify it
Tested by manually setting up BGPMon session on T2 LC and verified that Loopback4096 could be used as source
2023-05-09 13:40:00 -07:00
mssonicbld
b6b31df339 [submodule] Update submodule sonic-swss-common to the latest HEAD automatically 2023-05-10 04:32:26 +08:00
mssonicbld
faed3c6231 [submodule] Update submodule sonic-platform-daemons to the latest HEAD automatically 2023-05-09 18:33:07 +08:00
Zain Budhwani
a738c39328
Add fix to monit_regex.json for catching mem_usage and cpu_usage (#14954)
Why I did it
The current regex is not able to capture the logs; modify the regex to capture the syslog messages.

Work item tracking
Microsoft ADO (number only): 13366345
How I did it
Code change

How to verify it
sonic-mgmt test case
2023-05-08 11:48:17 -07:00
mssonicbld
ab5fd22a62 [submodule] Update submodule sonic-gnmi to the latest HEAD automatically 2023-05-07 16:32:24 +08:00
mssonicbld
5bbda67b81 [submodule] Update submodule sonic-platform-common to the latest HEAD automatically 2023-05-06 18:32:37 +08:00
mssonicbld
7c51e92610 [submodule] Update submodule sonic-mgmt-framework to the latest HEAD automatically 2023-05-06 18:32:31 +08:00
mssonicbld
4b033deb77
[submodule] Update submodule sonic-swss to the latest HEAD automatically (#14910) 2023-05-06 16:44:58 +08:00
abdosi
9b8b4e6e4d
[bgp/TSA]: Fixed the internal peer route-map policy (#14804)
What I did:
In FRR, the command update source <interface-name> is not at the address-family level. Because of this,
the internal peer route-map for ipv6 was getting applied to the ipv4 address family. As a result,
TSA over iBGP for ipv6 was not getting applied.

How I verify:

Manual Verification of TSA over both ipv4 and ipv6 after fix works fine.
Updated UT for this.

Added sonic-mgmt test gap: sonic-net/sonic-mgmt#8170

Signed-off-by: Abhishek Dosi <abdosi@microsoft.com>
2023-05-05 13:55:05 -07:00
anamehra
ab7bcb43b7
[minigraph.py]: Enable 400G to 100G/40G speed change via minigraph for all platforms (#14736)
There are chassis-packet and Single asic platforms which support this 400G to 100G/40G speed change via config.
Enabling this feature for all platforms which can support this. Keeping it enabled for all does not affect the platforms
which do not support this feature yet.

Signed-off-by: anamehra <anamehra@cisco.com>
2023-05-05 13:52:40 -07:00
xumia
50fb266c16
[Ci] Support marvell/marvell-arm64 build (#14875)
Why I did it
Support marvell/marvell-arm64 build

Work item tracking
Microsoft ADO (number only): 19995559
How I did it
2023-05-05 12:40:35 +08:00
Zain Budhwani
f239a8388c
[yang] Change swss-event, dhcp-relay-event leafref to string (#13326)
Why I did it
A leafref is not required in the yang model. A string is enough to check whether the value received from an event matches a possible ifname.

How I did it
How to verify it
Run UT
2023-05-04 16:48:54 -07:00
Zain Budhwani
4974b5c49c
Add idle conn duration config to telemetry.sh (#14903)
Why I did it
Supports new field in sonic-net/sonic-gnmi@258b887

Work item tracking
Microsoft ADO (number only): 13468195
How I did it
Add new field in telemetry.sh

How to verify it
Pipeline
2023-05-04 16:47:02 -07:00
Akhilesh Samineni
9e2b181fdc
SONiC Yang model support for IPv6 link local (#14757)
SONiC Yang model support for IPv6 link local

What I did
Created SONiC Yang model for IPv6 link local

How I did it
Defined Yang models for IPv6 link local based on https://github.com/sonic-net/SONiC/blob/master/doc/ipv6/ipv6_link_local.md

How to verify it
Added enable test case.
2023-05-04 10:39:41 -07:00
vmittal-msft
5fc85f3274
Updated default ECN settings for T2 chassis (#14388)
Why I did it
Update ECN settings for T2 chassis

How I did it
Updated qos config file to load these settings during switch bootup

How to verify it
Verified on line card on T2 chassis
2023-05-04 10:01:09 -07:00
Dror Prital
7dcd55ca18
Support pulling sonic-slave-docker image from path at REGISTRY_SERVER (#14907)
- Why I did it
To reduce sonic build time, the sonic slave docker image(s) can be acquired from an artifact server (reducing sonic make configure time).
The current implementation supports only the convention:

<REGISTRY_SERVER>:<REGISTRY_PORT>/<SLAVE_BASE_IMAGE>:<SLAVE_BASE_TAG>

If the SLAVE_BASE_IMAGE sits under an internal path on the server, the convention should be:

<REGISTRY_SERVER>:<REGISTRY_PORT><REGISTRY_SERVER_PATH>/<SLAVE_BASE_IMAGE>:<SLAVE_BASE_TAG>

REGISTRY_SERVER_PATH (set in rules/config) has to start with "/".

If REGISTRY_SERVER_PATH is not set, the behavior remains the same as it is today.

- How I did it
Add ability to set REGISTRY_SERVER_PATH and update the code for docker image tag and docker image pull accordingly
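
The naming convention described above can be sketched as follows. This is only an illustration of how the image reference is composed, not the build system's code; the function name, host name, image name, and tag are hypothetical.

```python
def slave_image_ref(server, port, image, tag, server_path=""):
    """Compose <server>:<port><server_path>/<image>:<tag>.

    Hypothetical helper: an empty server_path gives the original convention
    without an internal path.
    """
    if server_path and not server_path.startswith("/"):
        raise ValueError('REGISTRY_SERVER_PATH must start with "/"')
    return f"{server}:{port}{server_path}/{image}:{tag}"


# Hypothetical values, for illustration only.
print(slave_image_ref("registry.example.com", 5000, "sonic-slave", "abc123"))
print(slave_image_ref("registry.example.com", 5000, "sonic-slave", "abc123",
                      server_path="/internal/folder"))
```

Leaving server_path empty reproduces the original two-part reference, which matches the statement that behavior is unchanged when the path is not set.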

- How to verify it
Use a sonic slave docker image from an artifact server where the image is kept in an internal folder, and make sure the build consumes it.
2023-05-04 11:41:10 +03:00
Jon Goldberg
0692e8aa43
[armhf][Nokia-7215] changes fstrim.timer to daily (#14723)
Using timer-override.conf, we modify the fstrim.timer service.

For armhf, Nokia-7215 platform, we modify fstrim.timer to run daily
instead of weekly.  This is required because the size of the SSD on
this platform is 16GB, which on average is nearly 10 times smaller than
most other sonic platforms.  With smaller disk and the ever increasing
level of logging done by sonic, this change is required to prevent
the SSD from entering a read-only state due to inadequate free blocks.
2023-05-03 10:26:41 -07:00
Samuel Angebault
205e60ea9e
[Arista] Update platform library submodules (#14827)
- Fix watchdog reboot cause for wolverine linecard
- Fix PSU fan speed of 0% by adding max RPM to most psu descriptions
- Add product DCS-7060DX5-64
- Add product DCS-7060DX5-32
2023-05-03 10:19:38 -07:00
Yevhen Fastiuk
46615f5563
[Profiler] Add ability to print elapsed build time for target packages (#14778)
- Why I did it
To be able to see how much time was spent building a specific target.
The newly added code does the following:

1. Print build start time for target
2. Print build end time for target
3. Print elapsed time for target

- How I did it
Add a macro to record the time
Add macros to print end time and elapsed time
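
The PR implements this with build macros. Purely as an illustration of the same start/end/elapsed bookkeeping, a sketch in Python might look like this (all names are hypothetical, and the log format is not the one the build produces):

```python
import time
from contextlib import contextmanager


def format_elapsed(seconds):
    """Format a duration in whole seconds as HH:MM:SS."""
    minutes, secs = divmod(int(seconds), 60)
    hours, minutes = divmod(minutes, 60)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}"


@contextmanager
def timed_target(target, log=print, clock=time.time):
    """Log start time, end time, and elapsed time around a build step."""
    start = clock()
    log(f"[{target}] build start: {start:.0f}")
    try:
        yield
    finally:
        end = clock()
        log(f"[{target}] build end: {end:.0f}")
        log(f"[{target}] elapsed: {format_elapsed(end - start)}")


with timed_target("example-target"):
    time.sleep(0.01)  # stand-in for the real build step
```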

- How to verify it
Just build an image and check any *.log file

Signed-off-by: Yevhen Fastiuk <yfastiuk@nvidia.com>
2023-05-03 13:26:22 +03:00
Yevhen Fastiuk
ef2d08b2ef
Enable missed cache for local debs (#14794)
- Why I did it
To be able to cache, and then retrieve cached "copied" debs

- How I did it
Add the missing caching and cache retrieval steps

- How to verify it
Build with cache and then clean and rebuild again. The targets added to SONIC_COPY_DEBS should be taken from a cache.

Signed-off-by: Yevhen Fastiuk <yfastiuk@nvidia.com>
2023-05-03 13:24:42 +03:00
mssonicbld
9f680cb67c [submodule] Update submodule dhcpmon to the latest HEAD automatically 2023-05-03 16:32:13 +08:00
Kebo Liu
14a5f21c08
[Mellanox] Update SN5600 sensors.conf and pcie.yaml files (#14883)
- Why I did it
Update the sensors.conf and pcie.yaml according to the real hardware.

- How I did it
Update the sensors.conf and pcie.yaml

- How to verify it
run relevant sonic-mgmt test cases.

Signed-off-by: Kebo Liu <kebol@nvidia.com>
2023-05-02 10:36:57 +03:00
Lior Avramov
97cdb6af5c
[Mellanox] Add copyright header to ECMP calculator files (#14825)
- Why I did it
Add NVIDIA Copyright header to NVIDIA files added lately

- How I did it
Add NVIDIA Copyright header for the relevant files

- How to verify it
N/A (only commented text was added).
2023-05-02 10:35:16 +03:00
DavidZagury
2d0a12af6d
Fix issue with prod script not found, change the prod signing to work with flags to align to the dev script (#14580)
- Why I did it
Fix an issue where the signing tool did not run because it was called with the path from the host instead of the path where it is mounted inside the docker-slave

- How I did it
Modified the path on the SECURE_UPGRADE_PROD_SIGNING_TOOL flag to the path where it is mounted inside the slave docker

- How to verify it
Build SONiC using your own prod script
2023-05-02 09:13:16 +03:00
Ying Xie
72c52bc677
Revert "Clear DNS configuration received from DHCP during networking reconfiguration in Linux. (#13516)" (#14902)
This reverts commit c7ecd92c54.
2023-05-01 17:12:38 -07:00
Lawrence Lee
865605ef76
[README] Update link for moving docker directory (#14668)
The previous link to instructions for moving the docker directory is outdated.

Signed-off-by: Lawrence Lee <lawlee@microsoft.com>
2023-05-01 14:24:32 -07:00
Ravi [Marvell]
147e99ed9b
Support a new ACL table type called L3V4V6. (#14803)
This table supports both v4 and v6 Match types.

HLD: sonic-net/SONiC#1267

Signed-off-by: Ravi(Marvell) <rck@innovium.com>
2023-05-01 13:14:56 -07:00
Andrew Sapronov
59178e3636
[devices]: Netberg Aurora 610 reduce kernel module output (#13704)
Measuring i2c calls is normally not needed.
Also switched to timespec64_sub() to ensure the time delta is normalized

Co-authored-by: Kostiantyn Yarovyi <kostiantynx.yarovyi@intel.com>
2023-05-01 10:48:08 -07:00
mssonicbld
80c5ab4a4a
[ci/build]: Upgrade SONiC package versions (#14896) 2023-05-01 18:10:48 +08:00
Lior Avramov
2922f26b6c
[Mellanox] Replace iproute2 supplied by SDK to iproute2 downloaded from Debian repository (#14726)
- Why I did it
Mellanox syncd container will be based on Debian iproute2 plus patches instead of Nvidia internal version of iproute2

- How I did it
Download iproute2 from Debian repository, apply patches and compile to create a new target.
The target is then deployed in syncd container of Mellanox switches only.
The new target is called IPROUTE2_MLNX.

- How to verify it
Compile and load on a switch, and verify that the interfaces' network devices are created successfully.
Verify LLDP shows connections to neighbors.
Verify ping between 2 hosts over 2 router ports is successful.
2023-04-30 12:30:09 +03:00
mssonicbld
967c198a44 [submodule] Update submodule linkmgrd to the latest HEAD automatically 2023-04-30 16:32:27 +08:00
mssonicbld
55062201b3
[submodule] Update submodule sonic-swss to the latest HEAD automatically (#14892) 2023-04-30 15:56:11 +08:00
mssonicbld
0d709a3655
[ci/build]: Upgrade SONiC package versions (#14888) 2023-04-29 17:42:19 +08:00
mssonicbld
18740e7921 [submodule] Update submodule sonic-gnmi to the latest HEAD automatically 2023-04-29 16:32:11 +08:00
mssonicbld
05323b0c48
[submodule] Update submodule sonic-sairedis to the latest HEAD automatically (#14885) 2023-04-29 15:45:38 +08:00
mssonicbld
3c68cba9a9
[submodule] Update submodule sonic-swss to the latest HEAD automatically (#14886) 2023-04-29 15:35:38 +08:00
Tejaswini Chadaga
ca224863cb
Changes to support TSA from supervisor (#14691)
Why I did it
Support for SONIC chassis isolation using TSA and un-isolation using TSB from supervisor module

Work item tracking
Microsoft ADO (number only): 17826134
How I did it
When TSA is run on the supervisor, it triggers TSA on each of the linecards using the secure rexec infrastructure introduced in sonic-net/sonic-utilities#2701. User password is requested to allow secure login to linecards through ssh, before execution of TSA/TSB on the linecards

TSA of the chassis withdraws routes from all the external BGP neighbors on each linecard, in order to isolate the entire chassis. No route withdrawal is done from the internal BGP sessions between the linecards to prevent transient drops during internal route deletion. With these changes, complete isolation of a single linecard using TSA will not be possible (a separate CLI/script option will be introduced at a later time to achieve this)

Changes also include no-stats option with TSC for quick retrieval of the current system isolation state

This PR also reverts changes in #11403

How to verify it
These changes have a dependency on sonic-net/sonic-utilities#2701 for testing

Run TSA from supervisor module and ensure transition to Maintenance mode on each linecard
Verify that all routes are withdrawn from eBGP neighbors on all linecards
Run TSB from supervisor module and ensure transition to Normal mode on each linecard
Verify that all routes are re-advertised from eBGP neighbors on all linecards
Run TSC no-stats from supervisor and verify that just the system maintenance state is returned from all linecards
2023-04-28 16:28:06 +08:00
mssonicbld
7d3f785c5f [submodule] Update submodule sonic-gnmi to the latest HEAD automatically 2023-04-28 14:34:16 +08:00
Song Yuan
48ed53cbf2
[chassis/arista]: Increase LAG Ids to 1024 (#10519)
Why I did it
Today at most 128 LAGs are supported. This is not sufficient if there are many LAGs with just a few ports.

How I did it
Increase LAG Ids to 1024 for DNX device.
2023-04-27 11:28:23 -07:00
Vivek
22b4aac432
[Sys Mon] Fix the service entry delete in state_db because of timer job (#14702)
Why I did it
A systemd stop event on a service with timers can sometimes delete the state_db entry for the corresponding service.

Note: This won't be observed on the latest master label since the dependency on timer was removed with the recent config reload enhancement. However, it is better to have the fix since there might be some systemd services added to system health daemon in the future which may contain timers

root@qa-eth-vt01-4-3700c:/home/admin# systemctl stop snmp
root@qa-eth-vt01-4-3700c:/home/admin# show system-health sysready-status 
System is not ready - one or more services are not up

Service-Name            Service-Status    App-Ready-Status    Down-Reason
----------------------  ----------------  ------------------  -------------
<Truncated>
ssh                     OK                OK                  -
swss                    OK                OK                  -
syncd                   OK                OK                  -
sysstat                 OK                OK                  -
teamd                   OK                OK                  -
telemetry               OK                OK                  -
what-just-happened      OK                OK                  -
ztp                     OK                OK                  -
<Truncated>
Expected

Should see a Down entry for SNMP instead of the entry being deleted from the STATE_DB

root@qa-eth-vt01-4-3700c:/home/admin# show system-health sysready-status 
System is not ready - one or more services are not up

Service-Name            Service-Status    App-Ready-Status    Down-Reason
----------------------  ----------------  ------------------  -------------
<Truncated>
snmp                    Down              Down                Inactive
ssh                     OK                OK                  -
swss                    OK                OK                  -
syncd                   OK                OK                  -
sysstat                 OK                OK                  -
teamd                   OK                OK                  -
telemetry               OK                OK                  -
what-just-happened      OK                OK                  -
ztp                     OK                OK                  -
<Truncated>
How I did it
This happens because the timer is usually PartOf the service, so a stop on the service is propagated to the timer. Fixed the logic to handle this

Apr 18 02:06:47.711252 r-lionfish-16 DEBUG healthd: Main process- received event:snmp.service from source:sysbus time:2023-04-17 23:06:47
Apr 18 02:06:47.711347 r-lionfish-16 INFO healthd: check_unit_status for [ snmp.service ] 
Apr 18 02:06:47.722363 r-lionfish-16 INFO healthd: snmp.service service state changed to [inactive/dead]

Apr 18 02:06:47.723230 r-lionfish-16 DEBUG healthd: Main process- received event:snmp.timer from source:sysbus time:2023-04-17 23:06:47
Apr 18 02:06:47.723328 r-lionfish-16 INFO healthd: check_unit_status for [ snmp.timer ] 
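
The expected behavior above can be modeled with a small sketch. This is not the healthd code and not necessarily how the fix is implemented; the function name, the status strings, and the dictionary standing in for the state_db table are all assumptions.

```python
def handle_unit_event(state, unit, active):
    """Update a service-status table from a systemd unit event.

    Hypothetical model: `state` maps service name -> status string.
    """
    name, _, kind = unit.rpartition(".")
    if kind == "timer":
        # A stop propagated to the timer must not remove the entry that
        # belongs to the corresponding service, so it is ignored here.
        return state
    if kind == "service":
        state[name] = "OK" if active else "Down"
    return state


state = {"snmp": "OK"}
handle_unit_event(state, "snmp.service", active=False)
handle_unit_event(state, "snmp.timer", active=False)
print(state)  # {'snmp': 'Down'}
```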

Signed-off-by: Vivek Reddy Karri <vkarri@nvidia.com>
2023-04-27 09:02:13 -07:00
Marty Y. Lok
a9cc1fb11d
[Nokia][device-data] Modify the Nokia-7250IXRE platform specific reboot script (#14568)
Why I did it

When rebooting the chassis by issuing "sudo reboot" on the Supervisor card, the internal midplane communication interface xe0 should be shut down to avoid a double reboot on the linecard.
Added a udev link rule to disable the autoneg on AMD xgbe port Xe0 and Xe1 and make the setting in sync with the peer Broadcom greyhound ports.

How I did it

Modify the Nokia-7250IXRE specific reboot script on the Supervisor card to shut down the internal interface xe0. Also move the reboot linecard code to the top of the script to make sure the notification has been sent to the Linecard before shutting down the xe0 interface.
Introduced a new rule 80-net-by-driver.link to disable the autoneg on the AMD side. This change requires the latest NDK which contains the change to set the autoneg on the xe0 and xe1 port on the Greyhound.

Signed-off-by: mlok <marty.lok@nokia.com>
2023-04-27 08:53:16 -07:00