Commit Graph

7985 Commits

Author SHA1 Message Date
mssonicbld
0ed4e6c30b
[submodule] Update submodule sonic-platform-common to the latest HEAD automatically (#18069)
#### Why I did it
src/sonic-platform-common
```
* b6f8a8d - (HEAD -> 202305, origin/202305) Fix memory map parsing issue (#427) (22 hours ago) [Stephen Sun]
```
#### How I did it
#### How to verify it
#### Description for the changelog
2024-02-08 16:32:46 +08:00
Hua Liu
b62eb94ded Fix IPV6 forced-mgmt-route not work issue (#17299)
ix IPV6 forced-mgmt-route not work issue

Why I did it
IPV6 forced-mgmt-route not work

When add a IPV6 route, should use 'ip -6 rule add pref 32764 address' command, but currently in the template the '-6' parameter are missing, so the IPV6 route been add to IPV4 route table.

Also this PR depends on #17281 , which will fix the IPV6 'default' route table missing in IPV6 route lookup issue. 

Microsoft ADO (number only):24719238
2024-02-07 16:32:34 +08:00
mssonicbld
7cc5ee2507
[submodule] Update submodule sonic-utilities to the latest HEAD automatically (#18060)
#### Why I did it
src/sonic-utilities
```
* c5f53423 - (HEAD -> 202305, origin/202305) Fix `sudo config load_mgmt_config` fails with error "File /var/run/dhclient.eth0.pid does not exist" (#3149) (16 hours ago) [Mai Bui]
```
#### How I did it
#### How to verify it
#### Description for the changelog
2024-02-07 16:32:30 +08:00
anamehra
01110422da
Revert "[staticroutebfd] fix ipv6 letter case issue (#15765)" (#18027)
This reverts commit 630d93e2fd.
2024-02-06 18:41:37 +08:00
mssonicbld
87a0cb43f6
[submodule] Update submodule sonic-platform-common to the latest HEAD automatically (#18036)
#### Why I did it
src/sonic-platform-common
```
* a64276a - (HEAD -> 202305, origin/202305) Tx/Rx power values should be rounded up to 3 decimal places (#432) (22 hours ago) [mihirpat1]
```
#### How I did it
#### How to verify it
#### Description for the changelog
2024-02-05 16:32:33 +08:00
StormLiangMS
4d2a6bef33 fix the compile issue for slim image (#18015)
Why I did it
The PR introduced a bug for slim image build, #17905, by which the sonic_asic_platform is missing when build docker image for slim image.

[ building ] [ target/docker-dhcp-relay.gz ]
/sonic/dockers/docker-dhcp-relay/cli-plugin-tests /sonic
/sonic
Traceback (most recent call last):
  File "/usr/local/bin/j2", line 8, in <module>
    sys.exit(main())
  File "/usr/local/lib/python3.9/dist-packages/j2cli/cli.py", line 202, in main
    output = render_command(
  File "/usr/local/lib/python3.9/dist-packages/j2cli/cli.py", line 186, in render_command
    result = renderer.render(args.template, context)
  File "/usr/local/lib/python3.9/dist-packages/j2cli/cli.py", line 85, in render
    return self._env \
  File "/usr/lib/python3/dist-packages/jinja2/environment.py", line 1090, in render
    self.environment.handle_exception()
  File "/usr/lib/python3/dist-packages/jinja2/environment.py", line 832, in handle_exception
    reraise(*rewrite_traceback_stack(source=source))
  File "/usr/lib/python3/dist-packages/jinja2/_compat.py", line 28, in reraise
    raise value.with_traceback(tb)
  File "/sonic/dockers/docker-dhcp-relay/Dockerfile.j2", line 48, in top-level template code
    {% if build_reduce_image_size != "y" or sonic_asic_platform != "broadcom" %}
jinja2.exceptions.UndefinedError: 'sonic_asic_platform' is undefined
make: *** [slave.mk:1072: target/docker-dhcp-relay.gz] Error 1
make: *** Waiting for unfinished jobs....
[ finished ] [ target/docker-swss-layer-bullseye.gz ]
[ finished ] [ target/docker-syncd-brcm-dnx.gz ]
make[1]: *** [Makefile.work:608: target/sonic-broadcom.bin] Error 2
make[1]: Leaving directory '/data/work/1/s'
make: *** [Makefile:41: target/sonic-broadcom.bin] Error 2
And why it slipped the PR test? PR test doesn't compile with slim option, it won't check sonic_asic_platform != "broadcom" for PR build.

Work item tracking
Microsoft ADO (number only):
How I did it
Export sonic_asic_platform for docker build in slave.mk

How to verify it
build with slim image option.
2024-02-05 14:33:01 +08:00
Baorong Liu
630d93e2fd [staticroutebfd] fix ipv6 letter case issue (#15765)
*use lower case for IPv6 address as internal key and bfd session key. fixes #15764

Why I did it
*staticroutebfd uses the IPv6 address string as a key to create bfd session and cache the bfd sessions using it as a key.
When the IPv6 address string has uppercase letter in the static route nexthop list, the string with uppercase letter key is stored in the cache, but the BFD STATE_DB uses lowercase for IPv6 address, so when the staticroutebfd get the bfd state event, it cannot find the bfd session in its local cache because of the letter case.
2024-02-03 14:32:21 +08:00
Baorong Liu
cfebad6a11
[staticroutebfd]double commit for PR 15765 (#17973)
* [staticroutebfd]double commit pr 15765

* fix ut, change to single hop in ut
2024-02-03 08:35:41 +08:00
zitingguo-ms
6514cfa262
upgrade xgs SAI version to 8.4.41.1 (#17976)
Why I did it
Upgrade the xgs SAI version to 8.4.41.1 to include the following fix:

8.4.41.1: Cherry-pick from SAI 4.3: CS00012288297: Fix TX queue for control packets
Work item tracking
Microsoft ADO (number only): 26626208
How I did it
Upgrade xgs SAI version in sai.mk file.

How to verify it
run test_bgp_queue.py test on 7050qx T1: https://dev.azure.com/mssonic/internal/_build/results?buildId=467287&view=results
2024-02-01 15:01:27 +08:00
Yaqiang Zhu
de3a1e54b1
[202305] Advance dhcpmon submodule (#17952)
Why I did it
Advance dhcpmon submodule head

Work item tracking
Microsoft ADO (number only): 26270786
How I did it
fc20a97 Yaqiang Zhu Wed Jan 10 09:11:25 2024 +0800 [202311][counter] Clear counter table when dhcpmon init (#14)
bace2e0 Yaqiang Zhu Fri Jan 5 11:29:21 2024 +0800 [counter] Clear counter table when dhcpmon init (#14)

How to verify it
2024-02-01 14:10:17 +08:00
Zain Budhwani
7c88542d6e
Disable eventd and rsyslog plugin in slim images (#17905) (#17972)
Disable eventd at buildtime for slim images

- Microsoft ADO **(number only)**:26386286

Add flags for disabling eventd and only copy rsyslog conf files when eventd is included and not slim image

Manual testing
2024-02-01 10:14:20 +08:00
Volodymyr Samotiy
15a0f912bd [Mellanox] Disable SSD NCQ on Mellanox platforms (#17567)
- Why I did it
Based on some research some products might experience an occasional IO failures in the communication between CPU and SSD because of NCQ.
There seems to be a problem between some kernel versions and some SATA controllers.

Syslog error message examples:

Error "ata1: SError: { UnrecovData Handshk }" - "failed command: WRITE FPDMA QUEUED".
Error "ata1: SError: { RecovComm HostInt PHYRdyChg CommWake 10B8B DevExch }" - "failed command: READ FPDMA QUEUED".
Some vendors already disabled NCQ on their platforms in SONiC due to similar issue:

[Arista] Disable ATA NCQ for a few products #13739 [Arista] Disable ATA NCQ for a few products
[Arista] Disable SSD NCQ on DCS-7050CX3-32S #13964 [Arista] Disable SSD NCQ on DCS-7050CX3-32S
Also there are other discussions on Debian/Ubuntu forums about similar issues and it was suggested to disable NCQ:

https://askubuntu.com/questions/133946/are-these-sata-errors-dangerous

- How I did it
Add a kernel parameter to tell libata to disable NCQ

- How to verify it
Use FIO tool - fio --direct=1 --rw=randrw --bs=64k --ioengine=libaio --iodepth=64 --runtime=120 --numjobs=4
2024-02-01 04:32:27 +08:00
Baorong Liu
2a8181c88d [staticroutebfd] fix an error in error logging (#17043)
Why I did it
Fix an error in the log_err call.
this error can be triggered by an invalid static route key. usually the code cannot go here with normal config file. but hit this issue with an invalid key by manual testing with redis-cli directly. the file is scanned by Python lint to prevent such errors.

Work item tracking
Microsoft ADO ():26250268

How I did it
fix the format error.

How to verify it
1, ran pylint to check the design, make sure no such error in the design file.
2, wrote a separate python program to verify the log call.
In the current logging related testing, usually use patch/mock for logging. for this specific error, could not trigger it if we call mock function instead the real function in the design. so need to do lint checking for code change.
2024-02-01 04:32:23 +08:00
Sudharsan Dhamal Gopalarathnam
4efd510546
[202305][Mellanox]Update SDK/FW to 4.6.2202/2012.2202 (#17946)
Why I did it
Update SDK/FW version to 4.6.2202/2012.2202
Fixed issues

On Spectrum-3 systems, ports' toggling while sending traffic on 400G speed ports, might result in stuck FW.
In Spectrum-1 switch systems, 50G SR2 speed mode is not supported when AutoNeg is enabled. In this case although the max interface speed is 50G for SR2 or SR4 or SR, the actual max interface speed negotiated between the loopback is 25G.
On Spectrum-2 and Spectrum-3, Switch create in fastboot might take more than 40 seconds in case there are no active links.
When performing warmboot from version prior to 202205 to 202205 and above , no aging and mac move take place
Work item tracking
Microsoft ADO (number only):
How I did it
Updating make files.

How to verify it
Running regression
2024-01-31 13:33:36 +08:00
zitingguo-ms
77d2b28957
[202305] Upgrade xgs SAI version to 8.4.41.0 (#17937)
Why I did it
Upgrade the xgs SAI version to 8.4.41.0 to include the following fix:

8.4.39.3: Revert "Merged PR 4452: Update SAI version to 8.4.39.2 to include fix capability for Hostif queue"
8.4.40.0: [sbumodule upgrade][CS00012330252] ACL entry programming takes longer in SAI version 8.4 compared to SAI version 7.1
8.4.41.0: [CS00012330251]Extra buffer Profiles created internally are seen as regular profiles in Get calls
Work item tracking
Microsoft ADO (number only): 26609411
How I did it
Upgrade xgs SAI version in sai.mk file.

How to verify it
Run basic SONiC test using SAI release pipeline, all cases passed.

8.4.40.0: https://dev.azure.com/mssonic/internal/_build/results?buildId=465899&view=results
8.4.41.0: https://dev.azure.com/mssonic/internal/_build/results?buildId=466690&view=results
2024-01-31 10:35:29 +08:00
Hua Liu
c5086a0ee0 [TACACS] Fix when set TACACS to "tacacs+, local" user can run blocked command with local permission issue. (#17749)
Fix when set TACACS to "tacacs+, local" user can run blocked command with local permission issue.

#### Why I did it
When set TACACS to "tacacs+, local", user still can run a blocked command with local permission.

##### Work item tracking
- Microsoft ADO: 26399545

#### How I did it
Fix code to reject command when authorized failed from TACACS server side.

#### How to verify it
Pass all UT.

### Description for the changelog
Fix when set TACACS to "tacacs+, local" user can run blocked command with local permission issue.
2024-01-31 02:32:07 +08:00
Longxiang Lyu
06c27e5058 [dualtor] Disable zebra link-detect for vlan interfaces (#17784)
* [dualtor] Disable zebra link-detect for vlan interfaces

Signed-off-by: Longxiang Lyu <lolv@microsoft.com>
2024-01-30 16:32:42 +08:00
mssonicbld
2a9391ad8f
[submodule] Update submodule sonic-utilities to the latest HEAD automatically (#17938)
#### Why I did it
src/sonic-utilities
```
* c5e30e38 - (HEAD -> 202305, origin/202305) [202305] Enhanced route_check.py for multi_asic platforms (#3112) (21 hours ago) [Deepak Singhal]
```
#### How I did it
#### How to verify it
#### Description for the changelog
2024-01-30 16:32:39 +08:00
abdosi
a118a5ba43
[chassis] Added support of isolating given LC in Chassis with TSA mode (#16732) (#17860)
What I did:
Added support when TSA is done on Line Card make sure it's completely
isolated from all e-BGP peer devices from this LC or remote LC

Why I did:
Currently when TSA is executed on LC routes are withdrawn from it's connected e-BGP peers only. e-BGP peers on remote LC can/will (via i-BGP) still have route pointing/attracting traffic towards this isolated LC.

How I did:

When TSA is applied on LC all the routes that are advertised via i-BGP are set with community tag of no-export so that when remote LC received these routes it does not send over to it's connected e-BGP peers.

Also once we receive the route with no-export  over iBGP match on it and and set the local preference of that route to lower value (80) so that we remove that route from the forwarding database. Below scenario explains why we do this:

- LC1 advertise R1 to LC3
- LC2 advertise R1 to LC3
- On LC3 we have multi-path/ECMP over both LC1 and LC2
- On LC3 R1 received from LC1 is consider best route over R1 over received from LC2 and is send to LC3 e-BGP peers
- Now we do TSA on LC2
- LC3 will receive R1 from LC2 with community no-export and from LC1 same as earlier (no change)
- LC3 will still get traffic for R1 since it is still advertised to e-BGP peers (since R1 from LC1 is best route)
- LC3 will forward to both LC1 and LC2 (ecmp) and this causes issue as LC2 is in TSA mode and should not receive traffic

To fix above scenario we change the preference to lower value of R1 received from LC2 so that it is removed from Multi-path/ECMP group.

How I verfiy:

UT has been added to make sure Template generation is correct
Manual Verification of the functionality
sonic-mgmt test case will be updated accordingly.
Please note this PR is on top of this :#16714 which needs to be merged first.

Signed-off-by: Abhishek Dosi <abdosi@microsoft.com>
2024-01-29 22:54:18 +08:00
Junchao-Mellanox
18fbca839f Fix error log while creating PSU thermal object (#17789)
- Why I did it
If a PSU is not present, there could be error log while restarting psud or thermalctld:

Jan  8 17:15:52.689616 sonic ERR pmon#psud: Thermal sysfs /run/hw-management/thermal/psu2_temp1_max does not exist

Jan  8 17:15:57.747723 sonic ERR pmon#thermalctld: Thermal sysfs /run/hw-management/thermal/psu2_temp1 does not exist

- How I did it
if a PSU is not present, we should not check the PSU temperature sysfs.
2024-01-29 18:32:33 +08:00
ganglv
6a76a73b0f Change tcp port range to support telemetry and gnmi (#17907)
* Reserve tcp port for telemetry and gnmi

* Use ip_local_port_range instead

* Fix sysctl config
2024-01-29 14:32:34 +08:00
Kevin Wang
1e758d4538 [qos] change the template keyword from Compute-AI to ComputeAI (#17902)
Why I did it
Align the keywords to make qos configuration take effect

Work item tracking
Microsoft ADO (number only):
How I did it
Change the keyword to ComputeAI

How to verify it
reload minigraph and check the qos configuration
2024-01-29 14:32:31 +08:00
mssonicbld
335ebaafc2
[submodule] Update submodule sonic-platform-daemons to the latest HEAD automatically (#17891)
#### Why I did it
src/sonic-platform-daemons
```
* 824c20a - (HEAD -> 202305, origin/202305) Support 800G ifname in xcvrd (#420) (20 hours ago) [Anoop Kamath]
```
#### How I did it
#### How to verify it
#### Description for the changelog
2024-01-24 16:35:00 +08:00
mssonicbld
78e211cfd0
[submodule] Update submodule sonic-utilities to the latest HEAD automatically (#17868)
#### Why I did it
src/sonic-utilities
```
* 83a548de - (HEAD -> 202305, origin/202305) Disable Key Validation feature during sonic-installation for Cisco Platforms (#3115) (22 hours ago) [selvipal]
```
#### How I did it
#### How to verify it
#### Description for the changelog
2024-01-21 16:32:35 +08:00
mssonicbld
5d30151268
[submodule] Update submodule sonic-snmpagent to the latest HEAD automatically (#17863)
#### Why I did it
src/sonic-snmpagent
```
* 6f59d29 - (HEAD -> 202305, origin/202305) Fix SNMP dropping some of the queue counter when create_only_config_db_buffers is set to true (#303) (#309) (33 minutes ago) [mssonicbld]
```
#### How I did it
#### How to verify it
#### Description for the changelog
2024-01-20 16:32:45 +08:00
mssonicbld
208c1705f5
[submodule] Update submodule sonic-snmpagent to the latest HEAD automatically (#17859)
#### Why I did it
src/sonic-snmpagent
```
* 2efaf2e - (HEAD -> 202305, origin/202305) Revert "[action] [PR:303] Fix SNMP dropping some of the queue counter when create_only_config_db_buffers is set to true (#303)" (#308) (4 minutes ago) [StormLiangMS]
```
#### How I did it
#### How to verify it
#### Description for the changelog
2024-01-20 04:32:37 +08:00
Saikrishna Arcot
306175be2c
dhcrelay: Don't look up the ifindex for the fallback interface (#17797) (#17840)
Currently, whenever isc-dhcp-relay forwards a packet upstream,
internally, it will try to send it on a "fallback" interface. My
understanding is that this isn't meant to be a real interface, but
instead is basically saying to use Linux's regular routing stack to
route the packet appropriately (rather than having isc-dhcp-relay
specify specifically which interface to use).

The problem is that on systems with a weak CPU, a large number of
interfaces, and many upstream servers specified, this can introduce a
noticeable delay in packets getting sent. The delay comes from trying to
get the ifindex of the fallback interface. In one test case, it got to
the point that only 2 packets could be processed per second. Because of
this, dhcrelay will easily get backlogged and likely get to a point
where packets get dropped in the kernel.

Fix this by adding a check saying if we're using the fallback interface,
then don't try to get the ifindex of this interface. We're never going
to have an interface named this in SONiC.

Signed-off-by: Saikrishna Arcot <sarcot@microsoft.com>
2024-01-19 19:16:04 +08:00
mssonicbld
fc7a1ac02f
[submodule] Update submodule sonic-snmpagent to the latest HEAD automatically (#17853)
#### Why I did it
src/sonic-snmpagent
```
* b0a4bcc - (HEAD -> 202305, origin/202305) Set the execute bit on sysDescr_pass.py (#306) (22 hours ago) [Andre Kostur]
```
#### How I did it
#### How to verify it
#### Description for the changelog
2024-01-19 18:36:00 +08:00
mssonicbld
30b1f05f0c
[submodule] Update submodule linkmgrd to the latest HEAD automatically (#17851)
#### Why I did it
src/linkmgrd
```
* f5e9b54 - (HEAD -> 202305, origin/202305) [CodeQL] fix unmet build dependency (#222) (10 hours ago) [Jing Zhang]
* 2282cc5 - [active-standby] Probe the link in suspend timeout (#235) (22 hours ago) [Longxiang Lyu]
```
#### How I did it
#### How to verify it
#### Description for the changelog
2024-01-19 16:34:55 +08:00
mssonicbld
66ce739340
[submodule] Update submodule sonic-utilities to the latest HEAD automatically (#17855)
#### Why I did it
src/sonic-utilities
```
* 93c42272 - (HEAD -> 202305, origin/202305) [chassis]: Support show ip bgp summary to display without error when no external neighbors are configured on chassis LC (#3099) (22 hours ago) [Arvindsrinivasan Lakshmi Narasimhan]
```
#### How I did it
#### How to verify it
#### Description for the changelog
2024-01-19 16:34:46 +08:00
zitingguo-ms
076b6c3d25
upgrade xgs SAI to 8.4.39.2 (#17842)
Why I did it
Upgrade the xgs SAI version to 8.4.39.2 to include the following fix:

8.4.36.0: [submodule upgrade] [SAI_BRANCH rel_ocp_sai_8_4] SID: SDK-381039 Cosq control dynamic type changes
8.4.37.0: SID: MMU cosq control configuration with Dynamic Type Check
8.4.38.0: [sbumodule upgrade] [CSP 0001232212][SAI_BRANCH rel_ocp_sai_8_4]back-porting SONIC-82415 to SAI 8.4
8.4.39.0: [CSP CS00012320979] Port SONIC-81867 sai spec compliance for get SAI_SWITCH_ATTR_SWITCH_HARDWARE_INFO
8.4.39.1: changes for phy-re-init of 40G ports for TH platforms CS00012327470
8.4.39.2: fix capability for Hostif queue by change SET operation of SAI_HOSTIF_ATTR_QUEUE to be true
Work item tracking
Microsoft ADO (number only): 26491005
How I did it
Upgrade xgs SAI version in sai.mk file.

How to verify it
Run basic SONiC test using SAI release pipeline, all cases passed.
https://dev.azure.com/mssonic/internal/_build/results?buildId=457869&view=results
2024-01-19 14:56:01 +08:00
abdosi
6d767e549d
[chassis] Support advertisement of Loopback0 of all LC's across all e-BGP peers in TSA mode (#16714) (#17837)
What I did:
In Chassis TSA mode Loopback0 Ip's of each LC's should be advertise through e-BGP peers of each remote LC's

How I did:

- Route-map policy to Advertise own/self Loopback IP to other internal iBGP peers with a community internal_community as define in constants.yml
- Route-map policy to match on above internal_community when route is received from internal iBGP peers and set a internal tag as define in constants.yml and also delete the internal_community so we don't send to any of e-BGP peers
- In TSA new route-map match on above internal tag and permit the route (Loopback0 IP's of remote LC's) and set the community to traffic_shift_community.
- In TSB delete the above new route-map.

How I verify:

Manual Verification

UT updated.
sonic-mgmt PR: sonic-net/sonic-mgmt#10239

Signed-off-by: Abhishek Dosi <abdosi@microsoft.com>
2024-01-19 12:53:26 +08:00
Baorong Liu
898f1263c8
[staticroutebfd]double commit PR-16450, change bfd to singlehop (#17549)
Why I did it
double commit PR-16450 because of cherry pick conflict for PR#202305

Work item tracking
Microsoft ADO (number only):
How I did it
How to verify it
2024-01-18 20:08:29 +08:00
ganglv
d81780baa3
Fix host service for 202305 branch (#17781)
Why I did it
When we disable telemetry.service, sonic-hostservice will not start. And root cause is sonic-hostservice is only wanted by telemetry.service.

Work item tracking
Microsoft ADO (number only):
How I did it
Add dependency for gnmi.service.

How to verify it
Disable telemetry.service and build new image, and then check sonic-hostservice with new image.
2024-01-18 20:07:40 +08:00
Liu Shilong
b99c46e3f8
[build] Add gpg keys for sonic-slave-bullseye in arm64 cross build on amd64. (#17182) (#17828)
Fix #16204

Microsoft ADO (number only): 25746782

How I did it
multiarch/debian-debootstrap:arm64-bullseye is too old.
It needs to add some gpg keys before 'apt-get update'
2024-01-18 20:05:04 +08:00
mssonicbld
f460347a34
[submodule] Update submodule sonic-swss to the latest HEAD automatically (#17821)
#### Why I did it
src/sonic-swss
```
* ac94f0b7 - (HEAD -> 202305, origin/202305) [202305][routeorch] Fixing bug with multiple routes pointing to nhg (#3002) (2 hours ago) [Nikola Dancejic]
```
#### How I did it
#### How to verify it
#### Description for the changelog
2024-01-18 16:34:00 +08:00
abdosi
ed71c12e05 Enable BFD for Static Route for chassis-packet. (#15383)
*What I did:
Enable BFD for Static Route for chassis-packet. This will trigger the use of the feature as defined in here: #13789

Signed-off-by: Abhishek Dosi <abdosi@microsoft.com>
2024-01-18 14:36:19 +08:00
abdosi
aa01630841 [build]: Added flag in sonic_version.yml to see if image is secured or non-secured (#16191)
What I did:

Added flag in sonic_version.yml to see if compiled image is secured or non-secured. This is done using build/compile time environmental variable SECURE_UPGRADE_MODE as define in HLD: https://github.com/sonic-net/SONiC/blob/master/doc/secure_boot/hld_secure_boot.md

This flag does not provide the runtime status of whether the image has booted securely or not. It's possible that compile time signed image (secured image) can boot on non secure platform.

Why I did:
Flag can be used for manual check or by the test case.

ADO: 24319390

How I verify:
Manual Verification

---
build_version: 'master-16191.346262-cdc5e72a3'
debian_version: '11.7'
kernel_version: '5.10.0-18-2-amd64'
asic_type: broadcom
asic_subtype: 'broadcom'
commit_id: 'cdc5e72a3'
branch: 'master-16191'
release: 'none'
build_date: Fri Aug 25 03:15:45 UTC 2023
build_number: 346262
built_by: AzDevOps@vmss-soni001UR5
libswsscommon: 1.0.0
sonic_utilities: 1.2
sonic_os_version: 11
secure_boot_image: 'no'

Signed-off-by: Abhishek Dosi <abdosi@microsoft.com>
2024-01-18 14:36:15 +08:00
snider-nokia
9f81d9d1bf [Nokia][sonic-platform] Update Nokia sonic-platform submodule and device data (#17378)
These changes, in conjunction with NDK version >= 22.9.17 address the thermal logging issues discussed at Nokia-ION/ndk#27. While the changes contained at this PR do not require coupling to NDK version >= 22.9.17, thermal logging enhancements will not be available without updated NDK >= 22.9.17. Thus, coupling with NDK >=22.9.17 is preferred and recommended.

Why I did it
To address thermal logging deficiencies.

Work item tracking
Microsoft ADO (number only): 26365734
How I did it
The following changes are included:

Threshold configuration values are provided in the associated device data .json files. There is also a change included to better handle the condition where an SFP module read fails.

Modify the module.py reboot to support reboot linecard from Supervisor

 - Modify reboot to call _reboot_imm for single IMM card reboot
 - Add log to the ndk_cmd to log the operation of "reboot-linecard" and "shutdown/satrtup the sfm"
Add new nokia_cmd set command and modify show ndk-status output

 - Add a new function reboot_imm() to nokia_common.py to support reboot a single IMM slot from CPM
 - Added new command: nokia_cmd set reboot-linecard <slot> [forece] for CPM
 - Append a new column "RebootStatus" at the end of output of "nokia_cmd show ndk-status"
 - Provide ability for IMM to disable all transceiver module TX at reboot time
 - Remove defunct xcvr-resync service
2024-01-18 14:36:12 +08:00
Marty Y. Lok
2f8630c85f [Nokia-IXR7250E] Modify the platform_reboot on the IXR7250E for PMON API reboot and Disable all SFPs (#17483)
Why I did it
When Supervisor card is rebooted by using PMON API, it takes about 90 seconds to trigger the shutdown in down path. At this time linecards have been up. This delays linecards database initialization which is trying to PING/PONG the database-chassis. To address this issue, we modified the NDK to use the system call with "sudo reboot" when the request is from PMON API on Supervisor case. The NDK version is 22.9.20 and greater. This new NDK requires this modifcaiton of platform_reboot to work with.

Work item tracking
Microsoft ADO (number only): 26365734
How I did it
Modify the platform_reboot In Supervisor not to reboot all IMMs since it has been done in the function reboot() in module.py. Also handle the reboot-cause.txt for on the Supervisor when the reboot is request from PMON API.
Modify the Nokia platform specific platform_reboot in linecard to disable all SPFs.
This PR works with NDK version 22.9.20 and above

Signed-off-by: mlok <marty.lok@nokia.com>
2024-01-18 14:36:08 +08:00
Liu Shilong
6926171e92 [build] Fix a bash script some times called by sh issue. (#17761)
Why I did it
Fix a bug that sometimes the script runs in sh not bash.

Work item tracking
Microsoft ADO (number only): 26297955
How I did it
2024-01-18 14:36:04 +08:00
vdahiya12
a262306822 [Arista] Update config.bcm of 7060_cx32s for handling 40g optics with unreliable los settings (#17768)
For 40G optics there is SAI handling of T0 facing ports to be set with SR4 type and unreliable los set for a fixed set of ports. For this property to be invoked the requirement is set
phy_unlos_msft=1 in config.bcm.
This change is to meet the requirement and once this property is set, the los/interface type settings is applied by SAI on the required ports.

Why I did it
For Arista-7060CX-32S-Q32 T1, 40G ports RX_ERR minimalization during connected device reboot
can be achieved by turning on Unreliable LOS and SR4 media_type for all ports which are connected to T0.

The property phy_unlos_msft=1 is to exclusively enable this property.

Microsoft ADO: 25941176

How I did it
Changes in SAI and turning on property

How to verify it
Ran the changes on a testbed and verified configurations are as intended.

with property

admin@sonic2:~$ bcmcmd "phy diag xe8 dsc config" | grep -C 2 "LOS"
Brdfe_on                    = 0
Media Type                  = 2
Unreliable LOS              = 1
Scrambling Disable          = 0
Lane Config from PCS        = 0

without property

admin@sonic:~$ bcmcmd "phy diag xe8 dsc config" | grep -C 2 "LOS"
Brdfe_on                    = 0
Media Type                  = 0
Unreliable LOS              = 0
Scrambling Disable          = 0
Lane Config from PCS        = 0

Signed-off-by: vaibhav-dahiya <vdahiya@microsoft.com>
2024-01-18 04:32:39 +08:00
Feng-msft
245f337d14 Fix dialout build flag issue. (#17715)
### Why I did it
Fix ENABLE_DIALOUT flag issue.

##### Work item tracking
- Microsoft ADO **(number only)**: 21326000

#### How I did it
Update Makefile.work and add debug string.

#### How to verify it
![image](https://github.com/sonic-net/sonic-buildimage/assets/97083744/960d75d1-618c-4734-acb5-7a32a28c262b)
2024-01-17 02:34:03 +08:00
Liping Xu
b53a8908b0 disable restapi for leafRouter in slim image (#17713)
Why I did it
For some devices with small memory, after upgrading to the latest image, the available memory is not enough.

Work item tracking
Microsoft ADO (number only):
26324242
How I did it
Disable restapi feature for LeafRouter which with slim image.

How to verify it
verified on 7050qx T1 (slim image), restapi disabled
verified on 7050qx T0 (slim image), restapi enabled
verified on 7260 T1 (normal image), restapi enabled
2024-01-12 20:47:37 +08:00
mssonicbld
8561fa6db7
[submodule] Update submodule sonic-utilities to the latest HEAD automatically (#17757)
#### Why I did it
src/sonic-utilities
```
* 651a80b1 - (HEAD -> 202305, origin/202305) Modify teamd retry count script to base BGP status on default BGP status (#3069) (22 hours ago) [Saikrishna Arcot]
```
#### How I did it
#### How to verify it
#### Description for the changelog
2024-01-12 16:34:35 +08:00
Lawrence Lee
9f66a7dbf6 add timeout to ping6 command (#17729)
Signed-off-by: Lawrence Lee <lawlee@microsoft.com>
2024-01-11 12:34:22 +08:00
Nazarii Hnydyn
13aa19acf4
[swss/syncd]: Remove dependency on interfaces-config.service (#17649)
Signed-off-by: Nazarii Hnydyn nazariig@nvidia.com

This improvement does not bring any warm-reboot degradation, since the database availability (tcp/ip access over the loopback interface) was fixed by these PRs:

Re-add 127.0.0.1/8 when bringing down the interfaces #15080
Fix potentially not having any loopback address on lo interface #16490
Why I did it
Removed dependency on interfaces-config.service to speed up the boot, because interfaces-config.service takes a lot of time on init
Work item tracking
N/A
How I did it
Changed service files for swss/syncd
How to verify it
Boot and check swss/syncd start time comparing to interfaces-config
2024-01-10 21:15:55 +08:00
mssonicbld
f4c15b8322
[submodule] Update submodule linkmgrd to the latest HEAD automatically (#17731)
#### Why I did it
src/linkmgrd
```
* 2f5971f - (HEAD -> 202305, origin/202305) [warmboot] use config_db connector to update mux mode config instead of CLI (#223) (4 hours ago) [Jing Zhang]
```
#### How I did it
#### How to verify it
#### Description for the changelog
2024-01-10 16:34:10 +08:00
Nazarii Hnydyn
a3b5ea93eb
[sonic-cfggen]: Optimize template rendering and database access. (#17650)
Signed-off-by: Nazarii Hnydyn nazariig@nvidia.com

Why I did it
Improved switch init time
Work item tracking
N/A
How I did it
Replaced: sonic-cfggen -> sonic-db-cli
Aggregated template list for sonic-cfggen
How to verify it
Run warm-reboot
2024-01-10 08:56:20 +08:00
mssonicbld
44bc291ba5
[submodule] Update submodule linkmgrd to the latest HEAD automatically (#17721)
#### Why I did it
src/linkmgrd
```
* 2089ab6 - (HEAD -> 202305, origin/202305) Exclude DbInterface in PR coverage check (#224) (3 hours ago) [Jing Zhang]
```
#### How I did it
#### How to verify it
#### Description for the changelog
2024-01-10 04:32:40 +08:00