Commit Graph

6962 Commits

Author SHA1 Message Date
Lior Avramov
e6b1ed366b [Mellanox] [ECMP calculator] Add script usage and more information to script description in help option (#13493)
Add script usage and more information to script description being printed in help option.

- Why I did it
Missing information in script description in help option.

- How I did it
Expand script description and add script usage.

- How to verify it
Run the script with -h option.
2023-02-16 18:36:36 +08:00
Oleksandr Ivantsiv
5ef488f808 Clear DNS configuration received from DHCP during networking reconfiguration in Linux. (#13516)
- Why I did it
fixes #12907

When the management interface IP address configuration changes from dynamic to static, the DNS configuration (retrieved from the DHCP server) in /etc/resolv.conf is not cleared. This leaves the DNS configuration pointing to the wrong nameserver. To make the behavior well defined, the DNS configuration received from DHCP should be cleared.

- How I did it
Use the resolvconf package to manage the DNS configuration. It tracks the source of each DNS configuration and puts the configuration retrieved from DHCP servers into a separate file. This makes it possible to clean up the DHCP-provided DNS configuration during networking reconfiguration.

- How to verify it
Ensure that the management interface has no static configuration.
Check that /etc/resolv.conf has DNS configuration.
Configure a static IP address on the management interface.
Verify that /etc/resolv.conf has no DNS configuration.
Remove the static IP address from the management interface.
Verify that /etc/resolv.conf has DNS configuration retrieved from the DHCP server.
2023-02-16 18:36:33 +08:00
Yaqiang Zhu
30e4369255
[dhcp_relay] Remove exist check while adding dhcpv6 relay (#13826) 2023-02-16 11:08:34 +08:00
mssonicbld
a34892efdf
[ci/build]: Upgrade SONiC package versions (#13816) 2023-02-15 19:29:52 +08:00
Richard.Yu
fe1fc4cf6a
[broadcom]: Set default SYNCD_SHM_SIZE for Broadcom XGS devices (#13297) (#13807)
After the upgrade to brcmsai 8.1, the recommended minimum memory sizes for the SDK running environment (container) are as follows:

TH4/TD4 (ltsw) use 512 MB
TH3 uses 300 MB
Helix4/TD2/TD3/TH/TH2 use 256 MB
Based on this requirement, adjust the default syncd shared memory size and set the memory size for specific ASICs in the platform_env.conf file for the different types of Broadcom ASICs.

How I did it
Add the platform_env.conf file for Broadcom platforms that do not have one (based on the platform_asic file)
Add 'SYNCD_SHM_SIZE' and set the value

for ltsw (TD4/TH4) devices set to at least 512M (update the platform_env.conf)
for TD2/TH2/TH devices set to 256M
for TH3 set to 300M
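The size selection described above can be sketched as a lookup table plus a unit conversion. This is only an illustration: the sizes are taken from this commit message, while the table layout and the function names `default_syncd_shm_size` and `shm_size_to_bytes` are hypothetical and do not come from the actual syncd.sh change.

```python
# Hypothetical sketch; the sizes come from the commit message, the names do not.
SHM_SIZE_BY_FAMILY = {
    "TD4": "512m", "TH4": "512m",                    # ltsw devices
    "TH3": "300m",
    "Helix4": "256m", "TD2": "256m", "TD3": "256m",
    "TH": "256m", "TH2": "256m",
}

_UNITS = {"k": 1024, "m": 1024 ** 2, "g": 1024 ** 3}


def default_syncd_shm_size(asic_family):
    """Return the SYNCD_SHM_SIZE string for a family, or None if it is unknown."""
    return SHM_SIZE_BY_FAMILY.get(asic_family)


def shm_size_to_bytes(size):
    """Convert a size such as '300m' to the byte count shown by docker inspect."""
    size = size.strip().lower()
    if size and size[-1] in _UNITS:
        return int(size[:-1]) * _UNITS[size[-1]]
    return int(size)
```

With this conversion, "256m" corresponds to 268435456 bytes and "300m" to 314572800 bytes, which are the ShmSize values shown by docker inspect in the verification steps of this commit.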

How to verify it
verify the image with code fix
Check with UT
Check on lab devices

On a problematic device which cannot start successfully
Run with the command
$ cat /proc/linux-kernel-bde
Broadcom Device Enumerator (linux-kernel-bde)
Module parameters:
        maxpayload=128
        usemsi=0
        dmasize=32M
        himem=(null)
        himemaddr=(null)
DMA Memory (kernel): 33554432 bytes, 0 used, 33554432 free, local mmap
No devices found
$ docker rm -f syncd
syncd
$ sudo /usr/bin/syncd.sh start
Cannot get Broadcom Chip Id. Skip set SYNCD_SHM_SIZE.
Creating new syncd container with HWSKU Force10-S6000
a4862129a7fea04f00ed71a88715eac65a41cdae51c3158f9cdd7de3ccc3dd31
$ docker inspect syncd | grep -i shm
            "ShmSize": 67108864,
                "Tag": "fix_8.1_shm_issue.67873427-9f7ca60a0e",
On a normal device
$ docker inspect syncd | grep -i shm
            "ShmSize": 268435456,
                "Tag": "fix_8.1_shm_issue.67873427-9f7ca60a0e"
change the config syncd_shm.ini to b85=128m

$ docker rm -f syncd
syncd
$ sudo /usr/bin/syncd.sh start
Creating new syncd container with HWSKU Force10-S6000
3209ffc1e5a7224b99640eb9a286c4c7aa66a2e6a322be32fb7fe2113bb9524c
$  docker inspect syncd | grep -i shm
            "ShmSize": 134217728,
                "Tag": "fix_8.1_shm_issue.67873427-9f7ca60a0e",
change the config under
/usr/share/sonic/device/x86_64-dell_s6000_s1220-r0/Force10-S6000/platform_env.conf
and run command

$ cat /usr/share/sonic/device/x86_64-dell_s6000_s1220-r0/platform_env.conf
SYNCD_SHM_SIZE=300m

$ sudo /usr/bin/syncd.sh start
Creating new syncd container with HWSKU Force10-S6000
897f6fcde1f669ad2caab7da4326079abd7e811bf73f018c6dacc24cf24bfda5
$  docker inspect syncd | grep -i shm
            "ShmSize": 314572800,
                "Tag": "fix_8.1_shm_issue.67873427-9f7ca60a0e",

Signed-off-by: richardyu-ms <richard.yu@microsoft.com>
2023-02-15 15:58:49 +08:00
StormLiangMS
fd0e614a66
[submodules] advance sonic-sairedis for 202211 #13799
Why I did it
sonic-sairedis

53488e9 - [sai_failure_dump] Invoking dump during SAI failure (#1198) (15 hours ago) [Sudharsan Dhamal Gopalarathnam]
85921af - [Mellanox] Enable DSCP remapping by using SAI attribute (#1188) (15 hours ago) [Stephen Sun]
82f2cd7 - Switch to using stock gcovr 5.2 (#1174) (15 hours ago) [Saikrishna Arcot]
3a6c60d - [ppi]: Enable bulk API. (#1171) (15 hours ago) [Nazarii Hnydyn]
f1303cb - Use github code scanning instead of LGTM (#1160) (15 hours ago) [Liu Shilong]
b1972d9 - Fix for [EVPN] When MAC moves from remote end point to local, ASIC DB fields are not updated properly for the mac #11503: Update NotificationProcessor.cpp (#1118) (15 hours ago) [anilkpan]
How I did it
How to verify it
2023-02-15 08:33:43 +08:00
xumia
ff57447ec9 [Build] Change the default mirror version config file (#13786)
Why I did it
Change the mirror config file
Use the files/build/versions/default/versions-mirror only when reproducible build is enabled.
The config in files/build/versions is only for reproducible build, while snapshot mirror feature does not have the dependency on the reproducible build.

How I did it
Skip the mirror config in files/build/versions/default/versions-mirror if reproducible build is not enabled.

How to verify it
2023-02-15 00:44:47 +08:00
StormLiangMS
d70e8e1f6c
[submodule advance][202211] advance sonic-platform-daemons to 7219b56 #13693
Why I did it
advance sonic-platform-daemons

7219b56 - [Xcvrd]: Fix optics insertion/removal not detected (#333) (3 days ago) [Prince George]
9b15ccf - add data for telemetry enhancement for 'active-active' cable type (#332) (3 days ago) [vdahiya12]
1c7dba6 - Fix bug where transceiver info is missing after port breakout change (#329) (3 days ago) [Tal Berlowitz]
07b8f3c - Xcvrd should restart if any child thread crashes (#326) (3 days ago) [mihirpat1]
How I did it
How to verify it
2023-02-14 15:17:57 +08:00
StormLiangMS
91ff5d0358
[submodule advance][202211] advance sonic-platform-common to 2dbc0ea #13692
Why I did it
advance sonic-platform-common

2dbc0ea - (HEAD, origin/202211) Change get_tx_bias return type to list (#342) (2 days ago) [mihirpat1]
How I did it
How to verify it
2023-02-14 15:11:07 +08:00
mssonicbld
d1de964ec1
[Mellanox] Support DSCP remapping in dual ToR topo on T0 switch (#12605) (#13787) 2023-02-14 14:59:59 +08:00
Jing Zhang
1a95fcd08f
change default to be on (#13495) (#13796)
Changing the default config knob value to be True for killing radv, due to the reasons below:

Killing RADV is to prevent sending the "cease to be advertising interface" protocol packet.
RFC 4861 specifies sending this ceasing packet as "SHOULD" rather than "MUST", so it is not fatal to skip it.
In the active-active scenario, the host side might have difficulty distinguishing whether the "cease to be advertising interface" packet is for the last interface leaving.
6.2.5. Ceasing To Be an Advertising Interface

shutting down the system.
In such cases, the router SHOULD transmit one or more (but not more
than MAX_FINAL_RTR_ADVERTISEMENTS) final multicast Router
Advertisements on the interface with a Router Lifetime field of zero.
In the case of a router becoming a host, the system SHOULD also
depart from the all-routers IP multicast group on all interfaces on
which the router supports IP multicast (whether or not they had been
advertising interfaces). In addition, the host MUST ensure that
subsequent Neighbor Advertisement messages sent from the interface
have the Router flag set to zero.

sign-off: Jing Zhang zhangjing@microsoft.com
2023-02-14 09:48:46 +08:00
mssonicbld
a01fb7ad71
[build] Check if patches are applied before applying patches. (#13566) (#13690) 2023-02-13 03:09:50 +08:00
mssonicbld
48e6a829fc
Add explicit dependency on sonic_platform_common (#13446) (#13680) 2023-02-13 01:53:16 +08:00
mssonicbld
b3cf657129
[chassis] Fixed critical process not correct for database-chassis docker (#13445) (#13679) 2023-02-13 01:22:50 +08:00
mssonicbld
3e619d4385
During build time mask only those feature/services that are disabled explicitly (#13283) (#13651) 2023-02-13 01:15:26 +08:00
mssonicbld
8832ddd60b
[Mellanox] Improve FW upgrade logging (#13465) (#13681) 2023-02-12 23:53:33 +08:00
Richard.Yu
422978c158
[202211][submodule]Advance sairedis head (#13712)
Why I did it
include changes from sairedis submodule
102d20b | [202211][submodule][SAI]Advance header include 0031470 | improve enum values integration check (#1727) (#1737)
04d3c41 | [Submodule][upgrade]Upgrade SAI submodule (#1204)

updates from SAI
7710e24 | [cherry-pick][202211]Enhance the check enum lock script (#1741) (#1742)
0031470 | improve enum values integration check (#1727) (#1737)
4f11c7e | Enable github code scanning to replace LGTM. (#1709)

How I did it
How to verify it
2023-02-12 05:34:22 +00:00
mssonicbld
f595eb8ecd
[dualtor][active-active]Killing radv instead of stopping on active-active dualtor if config knob is on (#13408) (#13657) 2023-02-11 14:14:34 +08:00
mssonicbld
956173856c
[sflow]: Unblocked psample_*() function calls in BRCM ESW platforms for proper functionality of sflow feature (#12918) (#13691) 2023-02-11 12:35:41 +08:00
Kalimuthu-Velappan
70763e20e7 02.Version cache - docker cache build framework (#12001)
During docker build, host files can be passed to the docker build through
docker context files, but there is no straightforward way to transfer
files from the docker build back to the host.

This feature provides a workaround for passing the cache contents from the docker
build to the host. It tars the cached content, encodes it as base64,
and passes it through a log file between the special tags 'VCSTART' and 'VCENT'.

On the host, slave.mk extracts the cache contents from the log and stores them
in the cache folder. The cache contents are base64-encoded so that they are
easy to pass through the log.
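The round trip described above can be sketched as follows. This is a minimal illustration, not the code from this commit: only the 'VCSTART' and 'VCENT' tags and the tar plus base64 idea come from the commit message, while the function names, the gzip compression, and the one-tag-per-line layout are assumptions.

```python
import base64
import io
import tarfile

START_TAG = "VCSTART"   # tag names as given in the commit message
END_TAG = "VCENT"


def encode_cache(files):
    """Pack {name: bytes} into a tar archive and return tagged base64 log text."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w:gz") as tar:
        for name, data in files.items():
            info = tarfile.TarInfo(name=name)
            info.size = len(data)
            tar.addfile(info, io.BytesIO(data))
    payload = base64.b64encode(buf.getvalue()).decode("ascii")
    # Assumption: each tag sits on its own line around a single payload line.
    return f"{START_TAG}\n{payload}\n{END_TAG}"


def extract_cache(log_text):
    """Find the tagged block in a build log and return {name: bytes}."""
    payload_lines, inside = [], False
    for line in log_text.splitlines():
        if line == START_TAG:
            inside = True
        elif line == END_TAG:
            break
        elif inside:
            payload_lines.append(line)
    raw = base64.b64decode("".join(payload_lines))
    files = {}
    with tarfile.open(fileobj=io.BytesIO(raw), mode="r:gz") as tar:
        for member in tar.getmembers():
            files[member.name] = tar.extractfile(member).read()
    return files
```

Matching the tags as whole lines keeps the extraction from tripping over the letters "VCENT" appearing by chance inside the base64 payload.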

#### Why I did it

#### How I did it

#### How to verify it
2023-02-11 06:33:57 +08:00
kenneth-arista
67610d7e4f [device/arista] Reduce SDK stat polling freq in DNX devices (#13429)
Earlier, the SDK stat polling was erroneously set to once every msec,
which is far more frequent than required by SWSS. The new setting, which
is consistent with other vendor SKUs, is once a second. The net result
is reduced CPU usage by syncd.
2023-02-11 02:38:01 +08:00
Ikki Zhu
8a8c0b5ea2 [Celestica DX010] fix fan drawer and watchdog platform testcase issues (#13426)
Why I did it
fix DX010 fan drawer and watchdog platform test case issues

How I did it
1. Add fan_drawer get_maximum_consumed_power support
2. Adjust maximum watchdog timeout value check

How to verify it
Run test_fan_drawer and test_watchdog test cases.
2023-02-11 02:37:47 +08:00
xumia
a6c64c9d35
[Security][202211] Upgrade the openssl version to 1.1.1n-0+deb11u4+fips #13737 (#13763)
* [Security] Upgrade the openssl version to 1.1.1n-0+deb11u4+fips (#13737)

Why I did it
[Security] Upgrade the openssl version to 1.1.1n-0+deb11u4+fips

f6df7303d8 Update expired certs.
84540b59c1 CVE-2022-2068
f763d8a93e Prepare 1.1.1n-0+deb11u2
576562cebe CVE-2022-1292
How I did it
Upgrade the OpenSSL version

* [Security] Upgrade OpenSSL version for armhf
2023-02-10 21:50:57 +08:00
Jing Zhang
5b64d825de [sudoers] add /usr/local/bin/storyteller to READ_ONLY_CMDS (#13422)
Add /usr/local/bin/storyteller to READ_ONLY_CMDS so that neither write access nor a password prompt is needed to run storyteller.

Tested on 202205 clusters: a user who had not requested write access was able to grep logs using storyteller.

sign-off: Jing Zhang zhangjing@microsoft.com
2023-02-07 20:54:03 +08:00
bingwang-ms
f9d0f25c66 Support both port name and alias in ACL table AttachTo attribute (#13444)
Why I did it
This PR is an enhancement of PR #13105
Because an input string of AttachTo for an ACL table can appear in both the port name group and the port alias group, I added logic to determine whether the string should be treated as a port name or a port alias.

If all the input strings belong to the port name group, we treat all of them as port names.
If all the input strings belong to the port alias group, we treat all of them as port aliases.
If all the input strings belong to both the port alias group and the port name group, we prefer port alias. The behavior is as before.
How I did it
Walk through all port names/alias in the input to make a decision.
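The decision rules listed above can be sketched like this. The function name `classify_attach_to` and its return values are hypothetical, and returning None for mixed or unknown input is an assumption, since the commit message does not say how that case is handled.

```python
def classify_attach_to(ports, port_names, port_aliases):
    """Decide how to interpret the AttachTo strings of an ACL table.

    Returns "alias", "name", or None (mixed or unknown input; the None
    result is an assumption made for this sketch).
    """
    all_alias = all(p in port_aliases for p in ports)
    all_name = all(p in port_names for p in ports)
    if all_alias:
        # Also covers strings present in both groups: alias is preferred,
        # which keeps the previous behavior.
        return "alias"
    if all_name:
        return "name"
    return None
```

Checking the alias group first is what gives alias its priority when every string belongs to both groups.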

How to verify it
Verified by adding UT.
2023-02-07 20:53:56 +08:00
ganglv
00a8df68a6
Enable host service. (#13544)
#### Why I did it
Back port GNMI to 202211 branch

#### How I did it
Update rules/config to enable host service

#### How to verify it
Run GNMI end2end test
2023-02-06 20:52:13 -08:00
Junhua Zhai
200342261a [gearbox] use credo sai v0.8.2 (#13565)
Update credo sai package to the latest v0.8.2, which also has the fix for aristanetworks/sonic#52.
2023-02-07 04:32:28 +08:00
Liu Shilong
fa5f03bb33 [build] Check if patches are applied before applying patches (#13386)
Why I did it
If make fails, we can't rerun the make process, because the already-applied patches can't be applied again.

How I did it
Check whether the patches are already applied. If so, don't apply them again.

How to verify it
2023-02-06 16:37:03 +08:00
Tomer Shalvi
55822424bc Moving multiprocessing.Manager to the correct sub-process (#13377)
Why I did it
There is a queue in sysmonitor.py that is created based on an object of multiprocessing.Manager.
After performing fast-reboot, the system health monitor is shut down, which causes this Manager to be shut down as well, since it is a child process of healthd.
That's why I moved the creation of this Manager from the top of the file to the function Sysmonitor.system_service() (the only place it is used), to make the Manager a child process of Sysmonitor instead of healthd. This way both the queue (the Manager) and the processes that use this queue are child processes of the same process, and the problematic scenario of sysmonitor sending messages to a dead queue is no longer possible.

How I did it
Removed the definition of manager as global and moved it to system_service() function

How to verify it
Perform a fast reboot and verify the traceback issue is fixed
2023-02-06 14:37:36 +08:00
Jing Kan
1f9ff1ca3d [Arista 720DT] Create SKU alias Arista-720DT-G48S4 (#12905) 2023-02-06 12:36:59 +08:00
Vivek
ee7724e74d Fix dependency of dhcp-mon on VLAN with only v6 (#13006)
Signed-off-by: Vivek Reddy Karri <vkarri@nvidia.com>
2023-02-06 12:36:55 +08:00
xumia
81dd4b8f7b [Build] Support j2 template for debian sources for docker ptf (#13198)
Change to use the sources.list from the file generated from the j2 template
2023-02-06 12:36:51 +08:00
mssonicbld
7fc672c3e8
Use tmpfs for /var/log for Arista 7260 (#13587) (#13667) 2023-02-05 23:08:51 +08:00
mssonicbld
6f3f7f30b2
[build] Adjust teamd and radv features configuration according to the compilation options. (#13139) (#13644) 2023-02-05 04:44:02 +08:00
mssonicbld
d699d32553
[BugFix] Fix the bug that it gets error system-mac of centec platform (#12721) (#13625) 2023-02-05 02:17:56 +08:00
mssonicbld
d9b15aea0d
[Seastone] Enhancement fix for PR12200 syseeprom issue (#13344) (#13664) 2023-02-05 01:22:04 +08:00
Oleksandr Ivantsiv
a754c753bb [build] Add the possibility to disable compilation of teamd and radv containers. (#12920)
- Why I did it
This optimization is needed for DPU SONiC. DPU SONiC runs a limited set of containers and teamd and radv containers are not part of them. Unlike the other containers, there was no possibility to disable teamd and radv containers compilation.
To reduce DPU SONiC compilation time and reduce the image size this commit adds the possibility to disable their compilation.

- How I did it
Two new configuration options are added to rules/config file:

INCLUDE_TEAMD
INCLUDE_ROUTER_ADVERTISER
By default, to preserve the existing behavior, both options are enabled. There are two ways to override them:

Change the option value to "n" in the rules/config file.
Override the value using the SONIC_OVERRIDE_BUILD_VARS env variable:
SONIC_OVERRIDE_BUILD_VARS="SONIC_INCLUDE_TEAMD=y SONIC_INCLUDE_ROUTER_ADVERTISER=n"
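How such an override string is layered over the defaults can be sketched as below. This is an illustration only: the real handling lives in the SONiC makefiles, and the `apply_overrides` helper and the `DEFAULTS` table are hypothetical. The SONIC_-prefixed keys follow the override example in this commit message.

```python
import shlex

# Both options default to enabled, as stated in the commit message.
DEFAULTS = {
    "SONIC_INCLUDE_TEAMD": "y",
    "SONIC_INCLUDE_ROUTER_ADVERTISER": "y",
}


def apply_overrides(defaults, override_string):
    """Return a copy of defaults updated from space-separated KEY=VALUE pairs."""
    result = dict(defaults)
    for token in shlex.split(override_string):
        key, sep, value = token.partition("=")
        if sep:  # silently skip tokens without '=' (an assumption of this sketch)
            result[key] = value
    return result
```

Applied to the string from the example above, the teamd option stays "y" and the router advertiser option becomes "n".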

- How to verify it
The default behavior is preserved. To verify it compile the image without overriding new options. Install the image and verify that both teamd and radv containers are present and running.
To verify the new options, override them with the "n" value. Compile and install the image. Verify that the teamd and radv docker containers are not present. Verify that SWSS can start without errors.
2023-02-04 10:48:18 +08:00
byu343
2f27120c8a [Arista]: Add hwSku Arista-7260CX3-D108C10 (#13242)
* [Arista]: Add hwSku Arista-7260CX3-D108C10

* Add buffer-related config for Arista-7260CX3-D108C10
2023-02-04 10:48:14 +08:00
kenneth-arista
e3790d3044 [device/arista] Disabled polled_irq_mode for DNX SKUs (#13349)
Disabled polled_irq_mode for all Arista DNX devices as this mode
leads to excessive use of the CPU via an unneeded interrupt
polling thread.
2023-02-04 10:48:10 +08:00
Ikki Zhu
2ab45b1127 [Celestica Seastone] fix multi sonic platform issues (#13356)
Why I did it
Fix the following issues for Seastone platform:

- system-health issue: show system-health detail will not complete #9530, Celestica Seastone DX010-C32: show system-health detail fails with 'Chassis' object has no attribute 'initizalize_system_led' #11322
- show platform firmware updates issue: Celestica Seastone DX010-C32: show platform firmware updates #11317
- other platform optimization

How I did it
Modify and optimize the platform implementation.

How to verify it
Manually run the test commands described in these issues.
2023-02-04 10:48:05 +08:00
Sudharsan Dhamal Gopalarathnam
ce8ffb6812 [yang] Add collector_vrf to sflow yang model (#12897)
- Why I did it
Fixed sflow yang model to include collector_vrf field.

- How I did it
Added leaf for collector_vrf under sflow_collector. Additionally aligned the configuration guide

- How to verify it
Added UT to verify.
2023-02-04 09:54:17 +08:00
Saikrishna Arcot
2e760823c1 Replace logrotate cron file with (adapted) systemd timer file (#12921)
Debian is shipping a systemd timer unit for logrotate, but we're also
packaging in a cron job, which means both of them will run, potentially
at the same time. Remove our cron file, and add an override to the
shipped timer file to have it be run every 10 minutes.

Fixes #12392.

Signed-off-by: Saikrishna Arcot <sarcot@microsoft.com>
2023-02-04 09:54:12 +08:00
Liu Shilong
56c2c65372 [build]: increase raw image disk size to 4GB (#12958)
3GB disk size is not enough for broadcom raw image.
2023-02-04 09:54:08 +08:00
Longxiang Lyu
918e2d11f8 [dualtor] Let T0 delay 10 seconds before sending BGP updates (#12996)
Why I did it
To ensure that, after BGP startup, a dualtor T0 receives BGP updates before sending out BGP updates.
Please refer to sonic-net/SONiC#1161 for more details.

How I did it
add coalesce-time 10000 to the frr bgp startup config.

Signed-off-by: Longxiang Lyu <lolv@microsoft.com>
2023-02-04 09:54:05 +08:00
lixiaoyuner
7161ff46ca Add k8s support feature set and Add platform label for scheduler usage (#12997)
Why I did it
We plan to pilot the k8s feature and need to fix several bugs, including enabling the telemetry feature and adding the platform label.

How I did it
Add support feature set, only enable telemetry container upgrade for now
Add platform label for scheduler usage
Remove CNI installation code; CNI is installed automatically when kubeadm is installed
How to verify it
After the SONiC device joins the k8s cluster, show the node labels to check that the platform label is visible.

Signed-off-by: Yun Li yunli1@microsoft.com
2023-02-04 09:54:01 +08:00
Zain Budhwani
24be87504f Change bgp notification leaf name and mem_usage leaf type (#13012)
#### Why I did it

Improve naming convention for bgp notification events and change type of leaf for sonic-events-host mem usage from uint64 to decimal64

#### How I did it

Replace "-" with "_"

Replace uint64 with decimal64

#### How to verify it

Run yang model unit tests

#### Description for the changelog

Change YANG model leaf naming convention for bgp notification
2023-02-04 09:53:57 +08:00
kellyyeh
f4ae6219bf [dhcpmon] Fix dhcpmon socket filter and tx count issue (#13065)
Why I did it
Fix issue caused by dualtor support PR [dhcpmon] Open different socket for dual tor to enable interface filtering #11201
Improve code
How I did it
On single ToR, packets received count was duplicated due to socket filter set to "inbound"
Tx count not increasing due to filter set to "inbound". Added an outbound socket to count tx packets
Added vlan member interface mapping for Ethernet interface to vlan interface lookup in reference to PR Fix multiple vlan issue sonic-dhcp-relay#27
Exit when socket fails to initialize to allow dhcp_relay docker to restart
How to verify it
Tested on vstestbed single ToR and dual ToR: sent packets and verified that the printed dhcpmon rx and tx counters are correct

Correct number of tx increases
Tx does not increase when ToR is on standby
2023-02-04 09:53:53 +08:00
Zain Budhwani
b4e22e2752 Fix segfault issue inside memory_checker (#13066)
#### Why I did it

Segfault was occurring when running memory_checker

#### How I did it

Deinit publisher immediately after publishing

#### How to verify it

Manual testing
2023-02-04 09:53:49 +08:00
Ikki Zhu
e182d03f57 Seastone add platform capability enhancement config (#13079) 2023-02-04 09:53:45 +08:00
andywongarista
19e94dfbfc [Arista] Update ip packet checksum when set to 0xffff on 720DT-48S (#13088)
Why I did it
This is to fix test_forward_ip_packet_with_0xffff_chksum_tolerant test failure on 720DT-48S. IP packets with checksum set to 0xffff will be forwarded with the same checksum on this platform, instead of updating to the correct value.

How I did it
Add bcm config sai_verify_incoming_chksum=0 so that checksum is updated instead of being left unchanged when checksum is 0xffff. Note that packets with invalid checksum are still dropped with this config.
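For background on why 0xffff is a special checksum value: the IPv4 header checksum uses ones' complement arithmetic, where 0x0000 and 0xffff both represent zero. The sketch below illustrates that general property only. It does not model the ASIC or the sai_verify_incoming_chksum setting, and the function names and the sample header are made up for illustration.

```python
import struct


def ones_complement_sum(header):
    """Sum the header as 16-bit big-endian words with end-around carry."""
    total = 0
    for (word,) in struct.iter_unpack("!H", header):
        total += word
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return total


def ipv4_checksum(header):
    """Compute the checksum with the checksum field (bytes 10-11) zeroed."""
    zeroed = header[:10] + b"\x00\x00" + header[12:]
    return ~ones_complement_sum(zeroed) & 0xFFFF


def checksum_is_valid(header):
    """A header verifies when all its words, checksum included, sum to 0xffff."""
    return ones_complement_sum(header) == 0xFFFF


# Sample 20-byte header chosen so that the computed checksum is 0x0000.
computed_zero = bytes.fromhex("45000073b861400040110000c0a80001c0a800c7")
carries_ffff = computed_zero[:10] + b"\xff\xff" + computed_zero[12:]
```

Both `computed_zero` and `carries_ffff` pass `checksum_is_valid`, which is why a forwarded packet can legitimately carry 0xffff where a from-scratch computation would give 0x0000.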
2023-02-04 09:53:41 +08:00