Commit Graph

8214 Commits

Author SHA1 Message Date
Mai Bui
6c96b29484
[docker-teamd] limit privileged flag for teamd container (#15829)
Signed-off-by: Mai Bui <maibui@microsoft.com>
2023-08-17 09:48:57 -07:00
Saikrishna Arcot
5723ba29e4
Remove depot_tools repo (#16114)
It appears that this was initially added to provide the git-retry
command (which doesn't appear to be used today). However, this repo is
now also providing bazel (which is actually used in our build today),
and this command (along with git-retry) expects some vpython3 binary to
be set up/installed.

Rather than going through that, just get rid of this repo.
2023-08-16 14:18:50 -07:00
Vivek
d4923615d6
[Mellanox] [SN4410] Support new breakout modes for PAM4 (#15668)
- Why I did it
Add new breakout modes to be used in PAM4 supported cables

- How I did it

- How to verify it
Verified the 50G per lane breakout modes are applied properly on the switch

Signed-off-by: Vivek Reddy Karri <vkarri@nvidia.com>
2023-08-16 08:30:33 +03:00
Mai Bui
030c57200d
[docker-lldp] limit privileged flag for lldp container (#15830)
#### Why I did it
HLD implementation: Container Hardening (https://github.com/sonic-net/SONiC/pull/1364)
##### Work item tracking
- Microsoft ADO **(number only)**: 14807420

#### How I did it
Reduce linux capabilities in privileged flag, retain NET_ADMIN capability
2023-08-15 11:27:12 -07:00
Andriy Dobush
cf72683f12
Remove privileged flag for database and snmp docker (#13783)
#### Why I did it
Reduce docker privilege 
This is part of HLD https://github.com/sonic-net/SONiC/pull/1364

#### How I did it
Remove flag --privileged
#### How to verify it
docker exec -it database bash
root@0048b82b460b:/# ip link add dummy0 type dummy
RTNETLINK answers: Operation not permitted
2023-08-15 11:18:50 -07:00
Kebo Liu
1626e198a8
[Mellanox] Update SDK/FW/SAI to 4.6.1020/2012.1020/SAIBuild2305.25.0.3 (#16096)
SONiC changes:
1. Support Spectrum4 ASIC FW binary building.
2. Support new SDK sx-obj-desc lib building since new SAI need it.
3. Remove SX_SCEW debian package from Mellanox SDK build since we are no longer using it (we use libxml2 instead).
4. Update SAI, SDK, FW to version 4.6.1020/2012.1020/SAIBuild2305.25.0.3

SDK/FW bug fixes
1. In SPC-1 platforms: Fastboot mode is not operational for Split port with Force mode in 50G speed
SFP modules are kept in disabled state after set LPM (low power mode) on/off for at least 3 minutes.
2. When preforming fast boot from an old SDK version (currently installed) to a newer one (target version), and the system was initially loaded with a new SDK version (past version), and the system has not been wiped, under specific conditions, the fast boot would use the past version's data and may fail.

SDK/FW Features
1. On SN2700 all ports can support y cable by credo

SAI bug Fixes
1. When creating an ACL rule with SAI_ACL_ENTRY_ATTR_FIELD_SRC_IP/SAI_ACL_ENTRY_ATTR_FIELD_DST_IP enabled, and then disabling the field by setting enable=false, a match on L3_type=IPv4 will remain programmed for the rule Issue resolved after the fix
2. Allow the max scale of virtual routers to be configure for SPC-1, SPC-2, SPC-3 when fastboot enable 
3. Remove default hash key of SRC_MAC, DST_MAC and ETH_TYPE

SAI features
1. Port init profile

- How I did it
Update SDK/FW/SAI make files

- How to verify it
Run full sonic-mgmt regression on Mellanox platform

Signed-off-by: Kebo Liu <kebol@nvidia.com>
2023-08-15 15:32:52 +03:00
mssonicbld
4acaaf8179
[submodule] Update submodule sonic-swss to the latest HEAD automatically (#16157) 2023-08-15 15:07:17 +08:00
Kebo Liu
5aa2417c71
[Mellanox] Update MFT to newer version 4.25.0-62 (#16149)
- Why I did it
Update Mellanox MFT tool to version 4.25.0-62

- How I did it
Update the MFT tool make file

- How to verify it
Run full sonic-mgmt regression.

Signed-off-by: Kebo Liu <kebol@nvidia.com>
2023-08-15 09:49:19 +03:00
Zhaohui Sun
286ec3edbf
Change orchagent pop batch size from 8192 to 1024 (#16125)
### Why I did it
Background running lua script may cause redis-server quite busy if batch size is 8192.
If handling time exceeded default 5s, the redis-server will not response to other process and will cause syncd crash.

```
Aug  9 07:46:29.512326 str-s6100-acs-5 INFO database#supervisord: redis 68:M 09 Aug 2023 07:46:29.511 # Lua slow script detected: still in execution after 5186 milliseconds. You can try killing the script using the SCRIPT KILL command. Script SHA1 is: 88270a7c5c90583e56425aca8af8a4b8c39fe757
Aug  9 07:46:29.523716 str-s6100-acs-5 ERR syncd#syncd: :- checkReplyType: Expected to get redis type 5 got type 6, err: BUSY Redis is busy running a script. You can only call SCRIPT KILL or SHUTDOWN NOSAVE.
Aug  9 07:46:29.524818 str-s6100-acs-5 INFO syncd#supervisord: syncd terminate called after throwing an instance of '
Aug  9 07:46:29.525268 str-s6100-acs-5 ERR pmon#CCmisApi: :- checkReplyType: Expected to get redis type 5 got type 6, err: BUSY Redis is busy running a script. You can only call SCRIPT KILL or SHUTDOWN NOSAVE.
Aug  9 07:46:29.526148 str-s6100-acs-5 INFO syncd#supervisord: syncd std::system_error'
Aug  9 07:46:29.528308 str-s6100-acs-5 ERR pmon#psud[32]: :- checkReplyType: Expected to get redis type 5 got type 6, err: BUSY Redis is busy running a script. You can only call SCRIPT KILL or SHUTDOWN NOSAVE.
Aug  9 07:46:29.529048 str-s6100-acs-5 ERR lldp#python3: :- guard: RedisReply catches system_error: command: *2#015#012$3#015#012DEL#015#012$27#015#012LLDP_ENTRY_TABLE:Ethernet37#015#012, reason: BUSY Redis is busy running a script. You can only call SCRIPT KILL or SHUTDOWN NOSAVE.: Input/output error
Aug  9 07:46:29.529720 str-s6100-acs-5 ERR snmp#python3: :- guard: RedisReply catches system_error: command: *2#015#012$7#015#012HGETALL#015#012$28#015#012COUNTERS:oid:0x100000000000a#015#012, reason: BUSY Redis is busy running a script. You can only call SCRIPT KILL or SHUTDOWN NOSAVE.: Input/output error
```

88270a7c5c90583e56425aca8af8a4b8c39fe757 is /usr/share/swss/consumer_state_table_pops.lua
##### Work item tracking
- Microsoft ADO **24741990**:

#### How I did it
Change batch size from 8192 to1024.
#### How to verify it
Run all test cases in sonic-mgmt to verify the system stability.

### Tested branch (Please provide the tested image version)

- [x] 20220531.36
2023-08-14 17:49:49 -07:00
Nonodark Huang
1acafa4873
[Ufispace][PDDF] Add PDDF support on S9110-32X, S8901-54XC, S7801-54XS and S6301-56ST (#16017)
Why I did it
Add PDDF support on following Ufispace platforms with Broadcom ASIC

S9110-32X
S8901-54XC
S7801-54XS
S6301-56ST
How I did it
Add PDDF configuration files, scripts and python files

How to verify it
Run pddf commands and show commands.

Signed-off-by: nonodark <ef67891@yahoo.com.tw>
2023-08-14 15:56:03 -07:00
Saikrishna Arcot
dfe5ea6e52
Fix the clean target reporting "Is a directory" error (#16029)
### Why I did it

Since directories are being removed, the `-r` flag is required.

Fixes #15922

##### Work item tracking
- Microsoft ADO **(number only)**: 24752770
2023-08-14 10:00:30 -07:00
mssonicbld
7bea886f1d
[submodule] Update submodule sonic-utilities to the latest HEAD automatically (#16123)
#### Why I did it
src/sonic-utilities
```
* 5b492d54 - (HEAD -> master, origin/master, origin/HEAD) [chassis][voq] clear: Fix clear queuecounters to also clear VOQ counters (#2878) (2 days ago) [Patrick MacArthur]
```
#### How I did it
#### How to verify it
#### Description for the changelog
2023-08-14 18:32:40 +08:00
Zhijian Li
ab7c4ee661
[Celestica-E1031] Enable CPU watchdog (#16083)
Enable CPU watchdog on Celestica-E1031.
2023-08-13 21:33:19 -07:00
mssonicbld
34bad34495
[submodule] Update submodule sonic-platform-common to the latest HEAD automatically (#16122) 2023-08-13 14:59:45 +08:00
mssonicbld
2547968d3c
[submodule] Update submodule sonic-swss to the latest HEAD automatically (#16080) 2023-08-13 14:54:22 +08:00
mssonicbld
ae48f7db6b
[submodule] Update submodule linkmgrd to the latest HEAD automatically (#16121) 2023-08-12 14:42:38 +08:00
mssonicbld
388f5c51fe
[submodule] Update submodule sonic-sairedis to the latest HEAD automatically (#16004)
#### Why I did it
src/sonic-sairedis
```
* eb24302 - (HEAD -> master, origin/master, origin/HEAD) Build both the regular and RPC version when the RPC profile is enabled (#1273) (28 hours ago) [Saikrishna Arcot]
* 9e855c2 - [FEC] Adding support for vs testing for SAI_PORT_ATTR_AUTO_NEG_FEC_MODE_OVERRIDE (#1271) (2 days ago) [Sudharsan Dhamal Gopalarathnam]
* 4dbdb21 - Fix RPC package build failure due to shell syntax issue (#1268) (10 days ago) [Saikrishna Arcot]
* 588d596 - Make sure new binaries replace existing binaries in docker-sonic-vs (#1269) (11 days ago) [Saikrishna Arcot]
* ce8f642 - [vs] Use boost join to concatenate switch types in config (#1266) (3 weeks ago) [Kamil Cudnik]
* d6055a2 - [vslib]: Temporaily map DPU switch type to NVDA_MBF2H536C (#1259) (4 weeks ago) [prabhataravind]
* e1cdb4d - [CodeQL]: Use dependencies with relevant versions in azp template. (#1262) (5 weeks ago) [Nazarii Hnydyn]
* c08f9a2 - [CI]: Fix collect log error in azp template. (#1260) (5 weeks ago) [Nazarii Hnydyn]
* eed856c - [CodeQL]: Fix syncd compilation in azp template. (#1261) (5 weeks ago) [Nazarii Hnydyn]
* a3f1f1a - Reland 'Make changes to building and packaging sairedis (#1116)' (#1194) (6 weeks ago) [Saikrishna Arcot]
```
#### How I did it
#### How to verify it
#### Description for the changelog
2023-08-12 14:32:27 +08:00
Ze Gan
055fe90d3f
[build]: Remove uselses proto package (#16093)
Why I did it
The protoc-dev isn't used by SONiC, but it was added to the derived package.

Work item tracking
Microsoft ADO (number only): 17417902

How I did it
Remove protoc-dev from protobuf.mk

Signed-off-by: Ze Gan <ganze718@gmail.com>
2023-08-11 11:52:24 -07:00
bingwang-ms
d50ae1fd09
[arista]: Always set sai_tunnel_support on Arista-7260cx3 (#16097)
Why I did it
To overwrite the default DSCP_TO_TC_MAP for tunnel traffic, the attribute sai_tunnel_support must be set to 1.
Before this change, the attribute is set only on dual-tor platform when remap is enabled.
This PR is to set the attribute on all Arista-7260cx3 devices.

Work item tracking
Microsoft ADO 24785776

How I did it
Update the config.bcm template for Arista-7260cx3 devices.

How to verify it
The change is verified by manually rendering the j2 on a T1 testbed.
2023-08-11 11:51:25 -07:00
FuzailBrcm
bb8ce50cbe
Adding support for extra GPIO chips in the common PDDF driver (#16082) 2023-08-11 09:31:18 -07:00
Liu Shilong
3500f69fdb
Revert "[Ufispace][PDDF] Add PDDF support on S9180-32X (#14909)" (#16092)
This reverts commit d2b5d774c5.
2023-08-11 09:13:53 -07:00
Saikrishna Arcot
519a1e4a91
Update sairedis submodule (#16072)
* Update sairedis submodule

This submodule update needs to be manually done due to build changes
done in the sairedis submodule. Specifically, Debian build profiles are
now being used instead of dpkg build targets, and dbgsym packages are
being used instead of dbg packages. Because of this, there needs to be
changes on the sonic-buildimage side for this.

This is a reland of #15720, which was reverted in #15995 due to the RPC
package build failing. That failure has since been fixed, and the
PR pipeline has been updated to build the RPC package so that this is
checked at the PR stage.

This submodule update brings in the following changes:

```
4dbdb21 Fix RPC package build failure due to shell syntax issue (#1268)
588d596 Make sure new binaries replace existing binaries in docker-sonic-vs (#1269)
ce8f642 [vs] Use boost join to concatenate switch types in config (#1266)
d6055a2 [vslib]: Temporaily map DPU switch type to NVDA_MBF2H536C (#1259)
e1cdb4d [CodeQL]: Use dependencies with relevant versions in azp template. (#1262)
c08f9a2 [CI]: Fix collect log error in azp template. (#1260)
eed856c [CodeQL]: Fix syncd compilation in azp template. (#1261)
a3f1f1a Reland 'Make changes to building and packaging sairedis (#1116)' (#1194)
```

Signed-off-by: Saikrishna Arcot <sarcot@microsoft.com>

* Update sairedis submodule with the fix for the RPC package build

Signed-off-by: Saikrishna Arcot <sarcot@microsoft.com>

---------

Signed-off-by: Saikrishna Arcot <sarcot@microsoft.com>
2023-08-11 09:00:46 -07:00
mssonicbld
0269e60a36
[submodule] Update submodule sonic-platform-common to the latest HEAD automatically (#16106)
#### Why I did it
src/sonic-platform-common
```
* ab70e66 - (HEAD -> master, origin/master, origin/HEAD) Add new SSD type support (#390) (21 hours ago) [Junchao-Mellanox]
```
#### How I did it
#### How to verify it
#### Description for the changelog
2023-08-11 16:32:43 +08:00
Aaron Payment
eedaa2adbf
sonic-buildimage: Fix SAI_API_TUNNEL SAI_STATUS_NOT_SUPPORTED error (#13874)
Syncd will abort in handleSaiCreateStatus with
'Encountered failure in create operation, exiting orchagent,SAI API: SAI_API_TUNNEL, status: SAI_STATUS_NOT_SUPPORTED'

The fix is to add the following brcm config to prevent the error:
sai_tunnel_global_sip_mask_enable=1
bcm_tunnel_term_compatible_mode=1

Signed-off-by: Aaron Payment <aaronp@arista.com>
2023-08-11 13:36:18 +08:00
vmittal-msft
12d24d572a
Updated PG headroom settings for 40g port speed (#16038) 2023-08-10 17:35:43 -07:00
Arun LK
97113bae61
Dell: E3224F platform onboarding (#16002)
* Dell: E3224F platform onboarding

* Dell: E3224F platform onboarding
2023-08-10 17:27:30 -07:00
mssonicbld
a86eb95005
[submodule] Update submodule sonic-platform-common to the latest HEAD automatically (#16078)
#### Why I did it
src/sonic-platform-common
```
* 537095c - (HEAD -> master, origin/master, origin/HEAD) Added new RegBitsFields (#391) (32 hours ago) [Prince George]
```
#### How I did it
#### How to verify it
#### Description for the changelog
2023-08-10 17:22:28 +08:00
mssonicbld
51761149cc
[submodule] Update submodule sonic-platform-daemons to the latest HEAD automatically (#16079)
#### Why I did it
src/sonic-platform-daemons
```
* f3c2631 - (HEAD -> master, origin/master, origin/HEAD) Revert pcied enhancements (#392) (28 hours ago) [Ashwin Srinivasan]
```
#### How I did it
#### How to verify it
#### Description for the changelog
2023-08-10 17:22:23 +08:00
Sachin Holla
04ffd67fda
Ensure sonic yangs wheel is built before sonic-mgmt-common (#15226)
* Enhanced slave.mk to accept python wheels as dependency for a deb
  target. Dependent wheel names should be specified through the new
  {deb_name}_WHEEL_DEPENDS variable in the deb's make rules. The wheel
  will be built and installed in the slave docker before starting the
  deb build.

* Added sonic_yang_models-1.0-py3-none-any.whl as dependency for
  sonic-mgmt-common.deb. This is required for using the sonic yangs in
  UMF

Signed-off-by: Sachin Holla <sachin.holla@broadcom.com>
2023-08-09 11:40:00 -07:00
Ze Gan
96757a335c
Remove temporary files and import dash_api to python3 env (#16033)
1. Remove useless temporary protobuf deb packages
2. Import dash_api to python3 env

### Why I did it
1. There are some temporary Debian packages,protobuf packages, needs to be deleted
2. The dash-api was installed in the system folder that cannot be imported by the virtual python3 environment. But the testcases of DASH in sonic-mgmt are executed in virtual python3 environment.

##### Work item tracking
- Microsoft ADO **(number only)**: 17417902

#### How I did it
1. Add missed `&&` so that all protobuf debian packaged can be downloaded to the /tmp folder
2. Add ` --system-site-packages ` to env-python so that the system library can be accessed by virtual environment

#### How to verify it
Check the dash_api can be imported in env-python3
```
AzDevOps@46a900cf8477:~$ source env-python3/bin/activate
(env-python3) zegan@46a900cf8477:~$ ls
bin  env-python3
(env-python3) zegan@46a900cf8477:~$ python3
Python 3.8.10 (default, May 26 2023, 14:05:08)
[GCC 9.4.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import dash_api
>>>

```
2023-08-08 21:04:54 -07:00
FuzailBrcm
8524e563d3
PDDF: Supporting extra system fans in the common PDDF drivers (#15956) 2023-08-08 14:59:36 -07:00
SuvarnaMeenakshi
803c71c86a
[SNMP][IPv6]: Fix to use link local IPv6 address as snmp agentAddress (#16013)
<!--
     Please make sure you've read and understood our contributing guidelines:
     https://github.com/Azure/SONiC/blob/gh-pages/CONTRIBUTING.md

     ** Make sure all your commits include a signature generated with `git commit -s` **

     If this is a bug fix, make sure your description includes "fixes #xxxx", or
     "closes #xxxx" or "resolves #xxxx"

     Please provide the following information:
-->

#### Why I did it
fixes: https://github.com/sonic-net/sonic-buildimage/issues/16001
Caused by: https://github.com/sonic-net/sonic-buildimage/pull/15487

The above PR introduced change to use Management and Loopback Ipv4 and ipv6 addresses as snmpagent address in snmpd.conf file.
With this change, if Link local IP address is configured as management or Loopback IPv6 address, then snmpd tries to open socket on that ipv6 address and fails with the below error:
```
Error opening specified endpoint "udp6:[fe80::5054:ff:fe6f:16f0]:161"
Server Exiting with code 1
```
From RFC4007, if we need to specify non-global ipv6 address without ambiguity, we need to use zone id along with the ipv6 address: <address>%<zone_id>
Reference: https://datatracker.ietf.org/doc/html/rfc4007

##### Work item tracking
- Microsoft ADO **(number only)**:

#### How I did it
Modify snmpd.conf file to use the %zone_id representation for ipv6 address.
#### How to verify it
In VS testbed, modify config_db to use link local ipv6 address as management address:
    "MGMT_INTERFACE": {
        "eth0|10.250.0.101/24": {
            "forced_mgmt_routes": [
                "172.17.0.1/24"
            ],
            "gwaddr": "10.250.0.1"
        },
        "eth0|fe80::5054:ff:fe6f:16f0/64": {
            "gwaddr": "fe80::1"
        }
    },

Execute config_reload after the above change.
snmpd comes up and check if snmpd is listening on ipv4 and ipv6 addresses:
```
admin@vlab-01:~$ sudo netstat -tulnp | grep 161
tcp        0      0 127.0.0.1:3161          0.0.0.0:*               LISTEN      274060/snmpd        
udp        0      0 10.1.0.32:161           0.0.0.0:*                           274060/snmpd        
udp        0      0 10.250.0.101:161        0.0.0.0:*                           274060/snmpd        
udp6       0      0 fc00:1::32:161          :::*                                274060/snmpd        
udp6       0      0 fe80::5054:ff:fe6f::161 :::*                                274060/snmpd      -- Link local 
 
admin@vlab-01:~$ sudo ifconfig eth0
eth0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet 10.250.0.101  netmask 255.255.255.0  broadcast 10.250.0.255
        inet6 fe80::5054:ff:fe6f:16f0  prefixlen 64  scopeid 0x20<link>
        ether 52:54:00:6f:16:f0  txqueuelen 1000  (Ethernet)
        RX packets 36384  bytes 22878123 (21.8 MiB)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 261265  bytes 46585948 (44.4 MiB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

admin@vlab-01:~$ docker exec -it snmp snmpget -v2c -c public fe80::5054:ff:fe6f:16f0 1.3.6.1.2.1.1.1.0
iso.3.6.1.2.1.1.1.0 = STRING: "SONiC Software Version: SONiC.master.327516-04a6031b2 - HwSku: Force10-S6000 - Distribution: Debian 11.7 - Kernel: 5.10.0-18-2-amd64"
```
Logs from snmpd:
```
Turning on AgentX master support.
NET-SNMP version 5.9
Connection from UDP/IPv6: [fe80::5054:ff:fe6f:16f0%eth0]:44308
```
Ran test_snmp_loopback test to check if loopback ipv4 and ipv6 works:
```
./run_tests.sh -n vms-kvm-t0 -d vlab-01 -c snmp/test_snmp_loopback.py  -f vtestbed.yaml -i ../ansible/veos_vtb -e "--skip_sanity --disable_loganalyzer" -u
=== Running tests in groups ===
Running: pytest snmp/test_snmp_loopback.py --inventory ../ansible/veos_vtb --host-pattern vlab-01 --testbed vms-kvm-t0 --testbed_file vtestbed.yaml --log-cli-level warning --log-file-level debug --kube_master unset --showlocals --assert plain --show-capture no -rav --allow_recover --ignore=ptftests --ignore=acstests --ignore=saitests --ignore=scripts --ignore=k8s --ignore=sai_qualify --junit-xml=logs/tr.xml --log-file=logs/test.log --skip_sanity --disable_loganalyzer
..                                                                        

snmp/test_snmp_loopback.py::test_snmp_loopback[vlab-01] PASSED 
```
<!--
If PR needs to be backported, then the PR must be tested against the base branch and the earliest backport release branch and provide tested image version on these two branches. For example, if the PR is requested for master, 202211 and 202012, then the requester needs to provide test results on master and 202012.
-->

#### Which release branch to backport (provide reason below if selected)

<!--
- Note we only backport fixes to a release branch, *not* features!
- Please also provide a reason for the backporting below.
- e.g.
- [x] 202006
-->

- [ ] 201811
- [ ] 201911
- [ ] 202006
- [x] 202012
- [x] 202106
- [x] 202111
- [x] 202205
- [x] 202211
- [x] 202305

#### Tested branch (Please provide the tested image version)

<!--
- Please provide tested image version
- e.g.
- [x] 20201231.100
-->

- [ ] <!-- image version 1 -->
- [ ] <!-- image version 2 -->

#### Description for the changelog
<!--
Write a short (one line) summary that describes the changes in this
pull request for inclusion in the changelog:
-->

<!--
 Ensure to add label/tag for the feature raised. example - PR#2174 under sonic-utilities repo. where, Generic Config and Update feature has been labelled as GCU.
-->

#### Link to config_db schema for YANG module changes
<!--
Provide a link to config_db schema for the table for which YANG model
is defined
Link should point to correct section on https://github.com/Azure/sonic-buildimage/blob/master/src/sonic-yang-models/doc/Configuration.md
-->

#### A picture of a cute animal (not mandatory but encouraged)
2023-08-08 14:47:33 -07:00
mssonicbld
345b5e2000
[submodule] Update submodule sonic-swss to the latest HEAD automatically (#16073)
#### Why I did it
src/sonic-swss
```
* 23cb2e50 - (HEAD -> master, origin/master, origin/HEAD) [ASAN] Fix Indirect Mem Leaks in Orchagent (#2869) (10 hours ago) [Vivek]
```
#### How I did it
#### How to verify it
#### Description for the changelog
2023-08-08 15:32:55 +08:00
Arvindsrinivasan Lakshmi Narasimhan
46817036fd
[chassis]: removed dependency for bgp and swss for chassis supervisor (#15734)
Fixes #15667 and #13293

Work item tracking
Microsoft ADO 24472854:

How I did it
On chassis supervisor bgp feature is disabled in hostcfgd. The dependency between swss and bgp causes the bgp containers to start even though the feature is disabled.

How to verify it
Tests on chassis supervisor and LC
2023-08-07 09:52:48 -07:00
shdasari
d9393b0149
[radius]: Use execl instead of popen in RADIUS NSS code to fix vulnerability. (#15512)
Why I did it
#15284 fixes a case of shell escape exploit for TACACS+. This applies to RADIUS as well. RADIUS creates an unconfirmed user locally on the switch while attempting authentication. popen() is used to execute useradd,usermod and userdel commands. This exposes a vulnerability where a tactically designed username (which could contain explicit linux commands) can lead to getting executed as root.

An example of such a username could be "asd";echo>remoteRCE2;#". This leads to remoteRCE2 getting created in "/".

How I did it
All calls to popen() used to execute useradd, usermod and userdel are replaced with fork()/execl().

How to verify it
Prior to the fix, following is the behavior:

[s@i vm] ssh "asd";echo>remoteRCE2;#"@1.1.1.1
asd";echo>remoteRCE2;#@1.1.1.1's password:
Permission denied, please try again.

On the SONiC switch,

root@sonic:/# ls
accton_as7816_monitor.log home lib64 remoteRCE2 sys
bin host libx32 root tmp
boot initrd.img media run usr
cache.tgz initrd.img.old mnt sbin var
dev lib opt sonic vmlinuz
etc lib32 proc srv vmlinuz.old
root@sonic:/# ls -l

With the fix:

[s@i vm] ssh "asd";echo>remoteRCE2;#"@1.1.1.1
asd";echo>remoteRCE2;#@1.1.1.1's password:
Permission denied, please try again.

root@sonic:/# ls
accton_as7816_monitor.log etc lib mnt sbin usr
bin home lib32 opt sonic var
boot host lib64 proc srv vmlinuz
cache.tgz initrd.img libx32 root sys vmlinuz.old
dev initrd.img.old media run tmp

Verified that RADIUS authentication works as expected for valid users as well.
2023-08-07 09:48:18 -07:00
Sudharsan Dhamal Gopalarathnam
7bdd0d8011
[frr]: FRR 8.5.1 integration changes (#15965)
Why I did it
Upgrading FRR 8.5.1 to include latest fixes.

New patches that were added:

Patch	FRR Pull request	Issue fixed
0012-zebra-Rename-vrf_lookup_by_tableid-to-zebra_vrf_look.patch	FRRouting/frr#13396	#14866
0013-zebra-Move-protodown_r_bit-to-a-better-spot.patch	FRRouting/frr#13396	#14866
0014-zebra-Remove-unused-dplane_intf_delete.patch	FRRouting/frr#13396	#14866
0015-zebra-Remove-unused-add-variable.patch	FRRouting/frr#13396	#14866
0016-zebra-Remove-duplicate-function-for-netlink-interfac.patch	FRRouting/frr#13396	#14866
0017-zebra-Add-code-to-get-set-interface-to-pass-up-from-.patch	FRRouting/frr#13396	#14866
0018-zebra-Use-zebra-dplane-for-RTM-link-and-addr.patch	FRRouting/frr#13396	#14866
0019-zebra-Abstract-dplane_ctx_route_init-to-init-route-w.patch	FRRouting/frr#13757	FRRouting/frr#13754
00020-zebra-Fix-crash-when-dplane_fpm_nl-fails-to-process-.patch	FRRouting/frr#13757	FRRouting/frr#13754

Removed patches:

Patch	Upstream FRR commit that is present in 8.5.1
0001-Add-support-of-bgp-tcp-DSCP-value.patch	FRRouting/frr@425bd64
0010-zebra-Note-when-the-netlink-DUMP-command-is-interrup.patch	FRRouting/frr@2f71996
0011-bgpd-enhanced-capability-is-always-turned-on-for-int.patch	FRRouting/frr@8e89adc
0012-Ensure-ospf_apiclient_lsa_originate-cannot-accidently-write-into-stack.patch	FRRouting/frr@d2aeac3 , FRRouting/frr@49efc80, FRRouting/frr@ff6db10
0013-zebra-fix-dplane-fpm-nl-to-allow-for-fast-configuration.patch	FRRouting/frr@551fa8c
0014-bgpd-Allow-network-XXX-to-work-with-bgp-suppress-fib.patch	FRRouting/frr@4801fc4
0015-zebra-Return-statements-do-not-use-paranthesis.patch	FRRouting/frr@871a16c
0016-zebra-Add-zrouter.asic_notification_nexthop_control.patch	FRRouting/frr@06525c4
0017-zebra-Re-arrange-fpm_read-to-reduce-code-duplication.patch	FRRouting/frr@7d83e13
0018-zebra-Add-dplane_ctx_get-set_flags.patch	FRRouting/frr@10388e9
0019-zebra-Rearrange-dplane_ctx_route_init.patch	FRRouting/frr@f935122
0020-zebra-Add-ctx-to-netlink-message-parsing.patch	FRRouting/frr@45f0a10
0021-zebra-Read-from-the-dplane_fpm_nl-a-route-update.patch	FRRouting/frr@a0e1173
0022-zebra-Fix-code-because-missing-backport.patch	FRRouting/frr@07fd1f7
0024-zebra-continue-fpm-read-when-we-decide-a-netlink-message-is-not-needed.patch	FRRouting/frr@c0275ab
0025-zebra-Send-nht-resolved-entry-up-to-concerned-protoc.patch	FRRouting/frr@8ce0e51
0027-bgpd-Ensure-FRR-has-enough-data-to-read-in-peek_for_as4_capability-and-bgp_open_option_parse.patch	FRRouting/frr@3e46b43
0028-bgpd-Ensure-that-bgp-open-message-stream-has-enough-data-to-read.patch	FRRouting/frr@766eec1

Realigned patches:

Old Patch	New patch
0002-Reduce-severity-of-Vty-connected-from-message.patch	0001-Reduce-severity-of-Vty-connected-from-message.patch
0004-Allow-BGP-attr-NEXT_HOP-to-be-0.0.0.0-due-to-allevia.patch	0002-Allow-BGP-attr-NEXT_HOP-to-be-0.0.0.0-due-to-allevia.patch
0005-nexthops-compare-vrf-only-if-ip-type.patch	0003-nexthops-compare-vrf-only-if-ip-type.patch
0006-frr-remove-frr-log-outchannel-to-var-log-frr.log.patch	0004-frr-remove-frr-log-outchannel-to-var-log-frr.log.patch
0007-Add-support-of-bgp-l3vni-evpn.patch	0005-Add-support-of-bgp-l3vni-evpn.patch
0008-Link-local-scope-was-not-set-while-binding-socket-for-bgp-ipv6-link-local-neighbors.patch	0006-Link-local-scope-was-not-set-while-binding-socket-for-bgp-ipv6-link-local-neighbors.patch
0009-ignore-route-from-default-table.patch	0007-ignore-route-from-default-table.patch
0009-ignore-route-from-default-table.patch	0007-ignore-route-from-default-table.patch
0023-Use-vrf_id-for-vrf-not-tabled_id.patch	0008-Use-vrf_id-for-vrf-not-tabled_id.patch
0026-bgpd-Ensure-suppress-fib-pending-works-with-network-.patch	0009-bgpd-Ensure-suppress-fib-pending-works-with-network-.patch
0029-bgpd-Change-log-level-for-graceful-restart-events.patch	0010-bgpd-Change-log-level-for-graceful-restart-events.patch
0030-zebra-Static-routes-async-notification-do-not-need-t.patch	0011-zebra-Static-routes-async-notification-do-not-need-t.patch

How I did it
Upgrade FRR submodule. Align the patches. Integrate new patches to fix issues.

How to verify it
Run sonic-mgmt regression to verify
2023-08-07 09:45:13 -07:00
abdosi
c6d1dae741
Fix the Loopback0 IPv6 address of LC's in chassis not reachable from (#16026)
What I did:
Fix the Loopback0 IPv6 address of LC's in chassis not reachable from peer devices.

Why I did:
For Ipv6 Loopback0 address we only advertise /64 subnet to the peer devices. However, in case of chassis each LC will have it own /128 address of that /64 subnet . Since this /128 address does not get advertised peer devices can-not ping/reach the LC's loopback0.

How I fix:
Advertise /128 Loopback0 Ipv6 address only between i-BGP peers. This way even though /64 is advertised to e-BGP peer devices when packet reaches any of LC's it can reach the appropriate LC's.

How I verify:
Manual verification
UT added for same.

Signed-off-by: Abhishek Dosi <abdosi@microsoft.com>
2023-08-06 22:36:33 -07:00
Vadym Hlushko
9fba98ce6d
[syncd.sh] Clear semaphore before updating firmware (#15818)
Why I did it
The hw resources should be released before updating firmware.

How I did it
Added logic to release hw resources in syncd.sh script

Signed-off-by: Vadym Hlushko <vadymh@nvidia.com>
2023-08-06 22:30:33 -07:00
mssonicbld
642350c524
[submodule] Update submodule sonic-swss-common to the latest HEAD automatically (#16031)
#### Why I did it
src/sonic-swss-common
```
* be425ed - (HEAD -> master, origin/master, origin/HEAD) [redisCommand]: Not store the error return code of redisFormat (#809) (2 days ago) [Ze Gan]
* 5966d8b - Fix binary serializer can't deserialize protopuf buffer content issue (#810) (3 days ago) [Hua Liu]
```
#### How I did it
#### How to verify it
#### Description for the changelog
2023-08-06 16:32:34 +08:00
andywongarista
96fa513690
[Arista] Add support for DCS-7060DX5-32 (#14793)
* Add asic support for blackhawkth4dd

* Add bfd feature to BlackhawkTh4Dd

* Add platform data for blackhawkth4

* Add Qos settings for Blackhawk-TH4

* Add pg and queue settings for Blackhawk-TH4

* Add buffers_defaults_t0.j2

* Add blackhawkth4 to boot0

* Update 7060dx5 config.bcm

* Fix build error

---------

Co-authored-by: Boyang Yu <byu@arista.com>
Co-authored-by: David Meggy <davidm@arista.com>
2023-08-05 22:11:45 +08:00
mssonicbld
14c8ce282f
[submodule] Update submodule sonic-host-services to the latest HEAD automatically (#15992)
#### Why I did it
src/sonic-host-services
```
* 6767bc7 - (HEAD -> master, origin/master, origin/HEAD) [FeatureD] Move the Feature related config from Hostcfgd into a new daemon (#71) (6 days ago) [Vivek]
```
#### How I did it
#### How to verify it
#### Description for the changelog
2023-08-05 14:32:40 +08:00
Vaibhav Hemant Dixit
e127701660
Fix CONFIG_DB_INITIALIZED flag check logic and set/reset flag for warmboot (#15685)
* Fix CONFIG_DB_INITIALIZED flag check logic and set/reset flag for warm-reboot
* Fix db-cli usage
* Handle same image warm-reboot and generalize handling of INIT flag
* Cover boot from ONIE case: set config init flag when minigraph, config_db are missing
* Handle case: first boot of SONiC
* Check for config init flag
* Simplify logic, and do not call db_migrator for same image reboot
2023-08-04 16:00:26 -07:00
vdahiya12
f41aad9226
[minigraph] remove number of lanes check for changing speed from 400G to 100G and set speed setting before lane reconfiguration (#15721)
8111 800G interface, split to 2x400G (each has 4 lanes) fails to change interface speed from 400G to 100G during deploy mg. In minigraph.xml, the interface speed configuration is good, but fails to generate the right value to config_db.json.

In order to support this SKU the speed transitioning should support both 4 lanes and 8 lanes in the port_config.ini.

Why I did it

before this change for a 400G to 100G transition, in all cases except when lanes are 8, we would continue and the line
ports.setdefault(port_name, {})['speed'] = port_speed_png[port_name]
would not be executed, hence the default speed will never be set for a case and config_db will not be updated,
where speed is transitioning from 400G to 100G or 40G, but lanes are not equal to 8.

In order for those cases to pass where lanes are not specifically 8, we need the change

Work item tracking
24242657

Signed-off-by: vaibhav-dahiya <vdahiya@microsoft.com>
2023-08-04 14:53:49 -07:00
Stephen Sun
97a091abd2
[Mellanox] Use Debian reboot in Nvidia platform reboot when it is invoked from kdump capture boot (#15701)
#### Why I did it

When a kernel crash occurs, the system will reboot to the kdump capture kernel if kdump is enabled (`config kdump enable`). In the kdump capture boot, it only stores the crash information, and then reboot the system to a normal boot.
In this boot, no SONiC service is started but it invokes `reboot` which is actually the SONiC reboot that depends on SONiC services. There is a logic to skip all SONiC stuff and invoke platform reboot in SONiC reboot to avoid issues.
However, on Nvidia platforms, the platform reboot still depends on SONiC services, which can cause issues.
So, the Debian reboot is called directly in platform reboot if it is invoked from the kdump capture boot.

#### How I did it

Manual test
2023-08-04 13:24:38 -07:00
Vivek
f1a4fbb1ad
[FeatureD] Add featured systemd files in host-services and update submodule (#15815)
### Why I did it

- Hostcfgd is handling a lot of tasks and Feature table is by itself an important and big task which can benefit from separation into a new daemon
- Currently, Hostcfgd handles feature table first before other tables an thus other taska such as Aaa, Ntp are delayed. With the split, they can run in paralell
- After the recent config-reload enhancements, Hostcfgd uses a multi-threading approach to listen to PortInitDone. BY splitting the daemon into two, we can avoid having a separate thread by using SubscriberStateTable and Select,.

#### Note: 

Depends on host-services PR : https://github.com/sonic-net/sonic-host-services/pull/71
Once the host-services is merged, updating the submodule along with this PR should fix the CI problem

#### How I did it

Refactor the feature related tasks from hostcfgd into a seperate daemon.

#### How to verify it

UT's and Tested on DUT

```
admin@r-tigris-22:~$ show logging -f | grep featured
Jun 28 22:13:33.870021 r-tigris-22 INFO featured: ConfigDB connect success
Jun 28 22:14:05.638063 r-tigris-22 INFO featured: Updating feature 'radv' systemd config file related to auto-restart ...
Jun 28 22:14:06.169184 r-tigris-22 INFO featured: Feature radv is enabled and started
Jun 28 22:14:06.172343 r-tigris-22 INFO featured: Updating feature 'sflow' systemd config file related to auto-restart ...
Jun 28 22:14:06.844322 r-tigris-22 INFO featured: Feature sflow is stopped and disabled
Jun 28 22:14:06.846761 r-tigris-22 INFO featured: Updating feature 'snmp' systemd config file related to auto-restart ...
Jun 28 22:14:07.129090 r-tigris-22 INFO featured: Feature is snmp delayed for port init
Jun 28 22:14:07.132052 r-tigris-22 INFO featured: Updating feature 'swss' systemd config file related to auto-restart ...
Jun 28 22:14:08.368948 r-tigris-22 INFO featured: Feature swss is enabled and started
Jun 28 22:14:08.369240 r-tigris-22 INFO featured: Updating feature 'syncd' systemd config file related to auto-restart ...
Jun 28 22:14:08.718357 r-tigris-22 INFO featured: Feature syncd is enabled and started
Jun 28 22:14:08.721496 r-tigris-22 INFO featured: Updating feature 'teamd' systemd config file related to auto-restart ...
Jun 28 22:14:09.042495 r-tigris-22 INFO featured: Feature teamd is enabled and started
Jun 28 22:14:09.045441 r-tigris-22 INFO featured: Updating feature 'telemetry' systemd config file related to auto-restart ...
Jun 28 22:14:09.359831 r-tigris-22 INFO featured: Feature is telemetry delayed for port init
Jun 28 22:14:30.740499 r-tigris-22 INFO featured: Updating delayed features after port initialization
Jun 28 22:14:33.914178 r-tigris-22 INFO featured: Feature lldp is enabled and started
Jun 28 22:14:35.536264 r-tigris-22 INFO featured: Feature mgmt-framework is enabled and started
Jun 28 22:14:38.098571 r-tigris-22 INFO featured: Feature snmp is enabled and started
Jun 28 22:14:39.555727 r-tigris-22 INFO featured: Feature telemetry is enabled and started


Jun 28 22:13:33.977011 r-tigris-22 INFO hostcfgd: ConfigDB connect success
Jun 28 22:13:33.993878 r-tigris-22 INFO hostcfgd: Waiting for systemctl to finish initialization
Jun 28 22:13:34.274818 r-tigris-22 INFO hostcfgd: systemctl has finished initialization -- proceeding ...
Jun 28 22:13:34.391623 r-tigris-22 INFO hostcfgd: file size check pass: /etc/pam.d/sshd size is (2139) bytes
Jun 28 22:13:34.427273 r-tigris-22 INFO hostcfgd: file size check pass: /etc/pam.d/login size is (4132) bytes
Jun 28 22:13:34.433390 r-tigris-22 INFO hostcfgd: file size check pass: /etc/nsswitch.conf size is (494) bytes
Jun 28 22:13:34.455110 r-tigris-22 INFO hostcfgd: file size check pass: /etc/nsswitch.conf size is (494) bytes
Jun 28 22:13:34.478882 r-tigris-22 INFO hostcfgd: Found audisp-tacplus PID: 442
Jun 28 22:13:34.482365 r-tigris-22 INFO hostcfgd: cmd - ['service', 'aaastatsd', 'stop']
Jun 28 22:13:36.108569 r-tigris-22 INFO hostcfgd: NtpCfg load ...
Jun 28 22:13:36.108699 r-tigris-22 INFO hostcfgd: ntp server update key 0
Jun 28 22:13:36.108763 r-tigris-22 INFO hostcfgd: ntp server update, restarting ntp-config, ntp servers configured set()
Jun 28 22:14:06.691693 r-tigris-22 INFO hostcfgd: KdumpCfg init ...
Jun 28 22:14:06.691771 r-tigris-22 DEBUG hostcfgd: passw_policies_update - key: POLICIES
Jun 28 22:14:06.691832 r-tigris-22 DEBUG hostcfgd: passw_policies_update - data: {'digits_class': 'true', 'expiration': '180', 'expiration_warning': '15', 'history_cnt': '10', 'len_min': '8', 'lower_class': 'true', 'reject_user_passw_match': 'true', 'special_class': 'true', 'state': 'disabled', 'upper_class': 'true'}
Jun 28 22:14:06.691891 r-tigris-22 DEBUG hostcfgd: modify_conf_file: passw_policies - {'digits_class': True, 'expiration': '180', 'expiration_warning': '15', 'history_cnt': '10', 'len_min': '8', 'lower_class': True, 'reject_user_passw_match': True, 'special_class': True, 'state': 'disabled', 'upper_class': True}
Jun 28 22:14:06.701982 r-tigris-22 DEBUG hostcfgd: Initial hostname: r-tigris-22
Jun 28 22:14:06.702075 r-tigris-22 DEBUG hostcfgd: Initial mgmt interface conf: {('eth0', '10.210.24.108/22'): {'gwaddr': '10.210.24.1'}}
Jun 28 22:14:06.702115 r-tigris-22 DEBUG hostcfgd: Initial mgmt VRF state: 
Jun 28 22:14:06.702177 r-tigris-22 INFO hostcfgd: RSyslogCfg: Initial config: {'config': {'GLOBAL': {'rate_limit_burst': '0', 'rate_limit_interval': '0'}}, 'servers': {}}
Jun 28 22:14:06.709455 r-tigris-22 INFO hostcfgd[39326]: Failed to restart resolv-config.service: Unit resolv-config.service not found.
Jun 28 22:14:06.709560 r-tigris-22 ERR hostcfgd: ['systemctl', 'restart', 'resolv-config'] - failed: return code - 5, output:#012None
admin@r-tigris-22:~$ Connection to r-tigris-22 closed by remote host.
```
2023-08-04 13:00:54 -07:00
pettershao-ragilenetworks
abccdaeb6c
[Ragile]Adapt kernel 5.10 for broadcom on RA-B6510-48V8C (#14809)
* Adapt kernel 5.10 for broadcom on RA-B6510-48V4C

Signed-off-by: pettershao-ragilenetworks <pettershao@ragilenetworks.com>

* update

Signed-off-by: pettershao-ragilenetworks <pettershao@ragilenetworks.com>

* update

Signed-off-by: pettershao-ragilenetworks <pettershao@ragilenetworks.com>

* update

Signed-off-by: pettershao-ragilenetworks <pettershao@ragilenetworks.com>

* update

Signed-off-by: pettershao-ragilenetworks <pettershao@ragilenetworks.com>

* modify one-image.mk file

Signed-off-by: pettershao-ragilenetworks <pettershao@ragilenetworks.com>

* modify debian/rule.mk

Signed-off-by: pettershao-ragilenetworks <pettershao@ragilenetworks.com>

* Add platform.json file

Signed-off-by: pettershao-ragilenetworks <pettershao@ragilenetworks.com>

---------

Signed-off-by: pettershao-ragilenetworks <pettershao@ragilenetworks.com>
2023-08-04 12:01:49 -07:00
mssonicbld
b11c6d47ea
[submodule] Update submodule sonic-utilities to the latest HEAD automatically (#16032) 2023-08-04 15:15:04 +08:00
Junchao-Mellanox
91f3da018e
[Mellanox] Add more unit test coverage for platform API (#15842)
- Why I did it
Increase UT coverage for Nvidia platform API code

Work item tracking
Microsoft ADO (number only):

- How I did it
Focus on low coverage file:
1. component.py
2. watchdog.py
3. pcie.py

- How to verify it
Run the unit test, the coverage has been changed from 70% to 90%
2023-08-03 13:54:31 +03:00
Kebo Liu
380898f3a1
[Mellanox] Remove unnecessary file manipulation in the SAI Make file (#15993)
Signed-off-by: Kebo Liu <kebol@nvidia.com>
2023-08-03 13:39:27 +03:00
Vadym Hlushko
521a86b2de
[Mellanox] Add mlxtrace to techsupport (#15961)
- Why I did it
Added the fwtrace config files in order to be able to call the mlxstrace utility during the show techsupport dump.

Work item tracking
Microsoft ADO (number only):

- How I did it
Added fwtrace config files. Added path to these files to sai.profile for each mlnx device.

- How to verify it
Execute the show techsupport command and check if mlxstrace output is in system dump.

Signed-off-by: vadymhlushko-mlnx <vadymh@nvidia.com>
2023-08-03 11:36:58 +03:00