Commit Graph

6973 Commits

Author SHA1 Message Date
Volodymyr Samotiy
86ad8edb8b
[Mellanox] Update SAI to v2205.22.1.19 and SDK/FW to v4.5.3168/v2010.3070 (#12206)
- Why I did it
To include latest fixes and new functionality

SAI fixes and new features
fix #3205239, incorrect object type returned for SG child list
Fix VRF-VNI map entries remove issue
ECC health event and logging
[Port Buffers] restore default queue and pg configuration when all user pools are deleted
Fix EVPN type3 error on removal of uc/bc flood group
Fix EVPN type2 MAC move from local to remote results in SAI failure
Fix Disable learning on VXLAN tunnel
Fix error on VXLAN v6 tunnel removal
Fix port cannot apply schedule group when it is a lag member
Fix BFD add more detailed message on BFD packet not related to any existing session
gcc10 compilation fixes
Disable learning on VXLAN tunnel
Support BFD remote-disc exchange in negotiation stage
Tunnel Loopback packet action attribute implementation (for Dual TOR)
Add KVD resources MIN/MAX functionality (pending CRM issue with MIN only)
Support for CRC2 hash algorithm
Bulk counter support for PGs, queues
Support mirror sample rate attribute (SPC2+)
[Functional] [QoS] | Unable to remove SCHEDULE profile table even if there is no object referencing it
Next hop group optimized bulk API
Reduce verbosity of shared database already exists print
Span mirror policer (SPC2+), optimize pipeline for acl mirror action with policer on SPC2+
use same size descriptor pool for rx/tx
fix bfd - notify Sonic for admin-down event
2201 - empty list for supported fec for RJ45 ports
Fix don't disable used tunnel underlay interfaces

SDK fixes
100GbE FCI DAC (10137628-4050LF/HPE PN: 845408-B21) was recognized by mistake as supporting "cable burning", which caused the switch firmware to read page 0x9f (which is unsupported in the cable) and to report this cable as having "bad eeprom".
Added remote peer UDP port information in BFD packet event.
After editing an ECMP, the resilient ECMP next-hop counter may not count correctly.
Fixed potential memory leaks in some APIs related to LPM
If TTL_CMD_COPY is used in Encap direction for a packet with no TTL, then the value passed in the ttl data structure will be used if non-zero (default 255 if zero).
In SN2201: When configuring Force mode, user should configure Speed and FEC on both sides
In Flex Tunnel encapsulation flow, if the encapsulation is with an IPv6 header, the flow label field may not be updated as expected.
In some cases, when changing speed to 400GbE over 8 lanes, the first few packets would be dropped.
In some traffic patterns involving small packets, the PortRcvErrors counter may mistakenly count events of local physical errors due to an internal flow in the hardware that involves link packets.
On Spectrum systems, sometimes during link failure, not all previous firmware indications cleared properly, potentially affecting the next link up attempt.
On the NVIDIA Spectrum-2 switch, when receiving a packet with Symbol Errors on ports that are configured to cut-through mode, a pipeline might get stuck.
PCI calibration changes from a static to a dynamic mechanism.
SDK debug dump shows "Unknown" Counter in RFC3635 Counter Group.
SDK debug dump shows "Unknown" Counter in the PPCNT Traffic Class Counter Group.
SDK Dump missing column headers in some GC tables may result in difficulty understanding the dump.
SLL configuration is missing in SDK dump.
Spectrum-2 systems do not support 1GbE on supported 40GbE modules.
When binding a UDP port which is already in use for BFD TX session, the error message appears incorrectly.
When Flex Tunnel was used, Flex Modifier sometimes experienced a brief mis-configuration during ISSU.
When many ports are active (e.g. 70 ports up), and the configuration of shared buffer is applied on the fly, occasionally, the firmware might get stuck.
When running 1GbE speeds on SN4600 system, the port remained active while peer side was closed.
When toggling many ports of the Spectrum devices while raising 10GbE link up and link maintenance is enabled, the switch may get stuck and may need to be rebooted.
When trying to reconfigure the Flex Parser header and Flex transition parameters after ISSU, the switch returned an error even if the configuration was identical to that done before performing the ISSU.
While toggling the cable, and the low power mode is set to ON, an unexpected PMPE event error is received.

- How I did it
Updated SDK/SAI submodule and relevant makefiles with the required versions.

- How to verify it
Build an image and run tests from "sonic-mgmt".

Signed-off-by: Volodymyr Samotiy <volodymyrs@nvidia.com>
2022-09-30 09:39:12 +03:00
mssonicbld
a7d088c47c
[ci/build]: Upgrade SONiC package versions (#12191) 2022-09-28 23:25:55 +08:00
Ying Xie
1667a089d7
[202205][linkmgrd][swss][platform-daemon] advance submodule head (#12186)
linkmgrd:
* a0834e4 2022-09-22 | [Active-Active] server side admin forwarding state sync up (#133) (HEAD -> 202205) [Jing Zhang]
* ea56020 2022-09-21 | Post switchover reasons to STATE DB (#131) [Jing Zhang]

swss:
* d9cbf44 2022-09-24 | [orchagent] Fix issue: ip prefix shall be inited even if VRF/VNET is not ready (#2461) (HEAD -> 202205, github/202205) [Junchao-Mellanox]
* 3dcd6ff 2022-09-24 | [macsec]: Set MTU for MACsec (#2398) (#2466) [Ze Gan]

platform-daemon:
* 48b6239 2022-09-27 | [ycabled] add support for getting grpc secrets via shared file (#298) (HEAD -> 202205) [vdahiya12]

Signed-off-by: Ying Xie <ying.xie@microsoft.com>
2022-09-28 07:50:34 -07:00
Stephen Sun
6dab2ffb46
[202205] [Mellanox] Fix typo in platform API (#12139)
- Why I did it
Fix a typo in chassis platform API which causes the following error

>>> import sonic_platform as P
>>> c = P.platform.Platform().get_chassis()
>>> sl = c.get_all_sfps()
>>> sl[0].get_lpmode()
Sep 28 07:48:33 INFO    LOG: Initializing SX log with STDOUT as output file.
False
>>> del c
Exception ignored in: <function Chassis.__del__ at 0x7f1d166ef8b0>
Traceback (most recent call last):
  File "/usr/local/lib/python3.9/dist-packages/sonic_platform/chassis.py", line 126, in __del__
    self.sfp_module.deinitialize_sdk_handle(sfp_module.SFP.shared_sdk_handle)
NameError: name 'sfp_module' is not defined

- How I did it
Use self (self.sfp_module) when referencing the SDK handle
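
The failure mode in the traceback above can be reproduced in isolation. The following is a minimal sketch, not the actual sonic_platform code: FakeSfpModule, BrokenChassis, and FixedChassis are hypothetical stand-ins, and the cleanup is put in an ordinary close() method rather than __del__ so that the NameError propagates instead of being printed as "Exception ignored".

```python
class FakeSfpModule:
    """Hypothetical stand-in for the real sfp module, for illustration only."""

    class SFP:
        shared_sdk_handle = "handle-0"

    released = []

    @classmethod
    def deinitialize_sdk_handle(cls, handle):
        # Record which handle was released so the behaviour can be checked.
        cls.released.append(handle)


class BrokenChassis:
    def __init__(self):
        # The module is stored on the instance; no global name `sfp_module` exists.
        self.sfp_module = FakeSfpModule

    def close(self):
        # Bug: the argument uses the bare name `sfp_module`, which is undefined here.
        self.sfp_module.deinitialize_sdk_handle(sfp_module.SFP.shared_sdk_handle)


class FixedChassis(BrokenChassis):
    def close(self):
        # Fix: reach the module through self in both places.
        self.sfp_module.deinitialize_sdk_handle(self.sfp_module.SFP.shared_sdk_handle)
```

Calling BrokenChassis().close() raises NameError before the handle is released, while FixedChassis().close() releases it.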

- How to verify it
Manual test

Signed-off-by: Stephen Sun <stephens@nvidia.com>
2022-09-28 11:09:46 +03:00
Junchao-Mellanox
ef84a41048
Revert "[Mellanox] Redirect ethtool stderr to subprocess for better error log (#12038)" (#12184)
This reverts commit 9750cb4.

There is a PR to handle master branch revert: #12183

- Why I did it
The PR to be reverted introduced a flood of notice logs, emitted every minute, when an SFP is not plugged in:

Cannot get module EEPROM information: Input/output error
Before the "bad" PR, the message format is like this:

INFO pmon#supervisord: xcvrd Cannot get module EEPROM information: Input/output error
It was truncated by rsyslog because every message is the same. However, the "bad" PR adds the SFP index to the message:

NOTICE pmon#xcvrd: Failed to get EEPROM data for sfp 39: Cannot get module EEPROM information: Input/output error
Rsyslog no longer truncates such logs, so many such messages flood syslog.
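
Why the SFP index matters can be shown with a rough model of repeated-message collapsing. This is only a sketch of the idea, not rsyslog's implementation: collapse_repeats is a hypothetical helper that merges runs of identical consecutive lines, and the "message repeated" wording is an assumption.

```python
def collapse_repeats(lines):
    """Collapse runs of identical consecutive lines into one line plus a summary."""
    out = []
    prev, count = None, 0
    for line in lines:
        if line == prev:
            count += 1  # identical to the previous line: count it, do not emit it
            continue
        if count:
            out.append(f"message repeated {count} times: [{prev}]")
        out.append(line)
        prev, count = line, 0
    if count:
        out.append(f"message repeated {count} times: [{prev}]")
    return out


# Old format: every line is identical, so the run collapses.
old = ["xcvrd Cannot get module EEPROM information: Input/output error"] * 4

# New format: the SFP index makes every line unique, so nothing collapses.
new = [
    f"Failed to get EEPROM data for sfp {i}: "
    "Cannot get module EEPROM information: Input/output error"
    for i in range(4)
]

print(len(collapse_repeats(old)), len(collapse_repeats(new)))
```

With four unplugged ports, the old format produces two output lines and the new format produces four, which is the flooding described above.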

- How I did it
Revert the PR

- How to verify it
Manual test
2022-09-28 10:14:50 +03:00
mssonicbld
1c5abca0a6
[ci/build]: Upgrade SONiC package versions (#12187) 2022-09-27 08:41:31 +08:00
gechiang
f5540c4cfa
[202205] [submodule] Advance sonic-swss pointer (#12095)
* [sonic-swss submodule update] pick up latest Fixes from sonic-swss repo

* Pick up latest
2022-09-26 15:53:26 -07:00
Sudharsan Dhamal Gopalarathnam
6527570203
[202205][submodule] Update sonic-utilities submodule (#12162)
Update sonic-utilities submodule pointer to include the following:
* 99ed8ea [link-local]Modify RIF check to include link-local enabled interfaces ([#2394](https://github.com/Azure/sonic-utilities/pull/2394))

Signed-off-by: dgsudharsan <sudharsand@nvidia.com>
2022-09-26 07:55:14 -07:00
mssonicbld
99f9c53d19
[ci/build]: Upgrade SONiC package versions (#12142) 2022-09-25 21:57:18 +08:00
Dror Prital
9191779b13
remove JINJA2_CACHE (#12155) 2022-09-23 07:20:24 -07:00
Dror Prital
d5bd2dd6bf
[202205] [Mellanox] update NVIDIA copyright header for added files (#12126)
- Why I did it
Add NVIDIA Copyright header for new "NVIDIA" files

- How I did it
Add the copyright header as remark at the head of the file
2022-09-22 19:08:08 +03:00
Ying Xie
f0d4997023
[202205][linkmgrd][utilities][swss][sairedis][platform-daemon] advance submodule head (#12149)
linkmgrd:
* 05e5f4c 2022-09-20 | [Active-Active] flaky LinkmgrdBootupSequence unit tests (#134) (HEAD -> 202205) [Jing Zhang]
* 16fcadf 2022-09-13 | [active-standby] update warmboot reconciliation logic (#129) [Jing Zhang]
* e656a87 2022-09-09 | [active-active] shutdown link prober when starting as isolated (#130) [Jing Zhang]

utilities:
yinxi@ying-dev-vm-01:~/src/sonic-202205/src/sonic-utilities$ git hist github/202205..HEAD
* 562188f 2022-09-14 | Use 'default' VRF when VRF name is not provided (#2368) (HEAD -> 202205) [Sumukha Tumkur Vani]
* c50ba4f 2022-09-20 | [minigraph] add option to specify golden path in load_minigraph (#2350) [jingwenxie]
* cec5ab2 2022-09-20 | [GCU]Remove GCU unique lane check for duplicate lanes platforms (#2343) [jingwenxie]
* 8d20771 2022-09-15 | Vnet_route_check Vxlan tunnel route update. (#2281) [siqbal1986]

swss:
* 88371f7 2022-09-21 | [ci] Only when test stage succeeded or succeededwithissues, PR run Gcov (#2460) (HEAD -> 202205) [Liu Shilong]
* c11dbd7 2022-09-15 | [QoS] Enforce drop probability only for colors whose WRED are enabled (#2422) [Stephen Sun]

sairedis:
* 80928dd 2022-09-06 | [lgtm] Add uuid library (#1119) (HEAD -> 202205, github/202205) [Kamil Cudnik]
* c147dd0 2022-09-16 | [202205][vslib]: Add SAI_PORT_ATTR_OPER_SPEED get #1123 [Ze Gan]

platform-daemon:
* 9cf8adf 2022-09-21 | [ycabled] add notification for gRPC connection state transitions to  IDLE/TRANSIENT_FAILURE (#295) (HEAD -> 202205) [vdahiya12]
* 1e07ae3 2022-09-20 | Use get() to fetch default value from dictionary for port admin_status #286 [anamehra]
* 157f483 2022-09-15 | [Xcvrd] Soak duplicate events and process only updated interested events (#285) [Prince George]

Signed-off-by: Ying Xie <ying.xie@microsoft.com>
2022-09-22 16:29:18 +03:00
Junhua Zhai
5479892d1b [PikeZ] Update port alias in Arista-720DT-48S (#12086)
Fix #12037, by following HLD https://github.com/sonic-net/SONiC/blob/master/doc/sonic-port-name.md.
2022-09-21 21:19:14 +00:00
xumia
89ba8149c7 Upgrade the sonic-fips packages to 0.3 (#12040)
Why I did it
Upgrade the sonic-fips packages to release 0.3
Fix the incorrect package timestamp issue
2022-09-21 21:19:05 +00:00
Junchao-Mellanox
5ecb93f7e8 [Mellanox] Redirect ethtool stderr to subprocess for better error log (#12038)
- Why I did it
ethtool prints error logs when the EEPROM of an SFP is not available. It prints errors like this:

INFO pmon#/supervisord: xcvrd Cannot get module EEPROM information: Input/output error
INFO pmon#/supervisord: xcvrd Cannot get Module EEPROM data: Invalid argument
However, this log does not contain the relevant SFP index, which makes it hard for developers/QA to find the exact SFP.

- How I did it
Redirect ethtool stderr to subprocess and log it better

- How to verify it
Manual test
2022-09-21 21:18:55 +00:00
anamehra
ab0edff4cb Fix radv.conf traceback when VLAN_INTERFACE is not defined (#12034)
* Fix the if block scope to prevent a traceback due to undefined vlan_list when VLAN_INTERFACE is not defined.
2022-09-21 21:18:45 +00:00
ganglv
a98b34af29 Fix dhcp option buffer issue (#12033)
Why I did it
The current isc-dhcp uses the code below to remove a DHCP option:
memmove(sp, op, op[1] + 2);
sp += op[1] + 2;

sp points to the option to be stripped; call it option S.
op points to the option that follows option S; call it option O.
A DHCP option is a typical type-length-value structure: the first byte is the type, the second byte is the length, and the remaining bytes are the value.
In this case, option O is longer than option S (by more than 2 bytes), so the memmove copies option O over option S and over the start of option O itself.

Both option S and option O are now overwritten. op[1] was the length of option O, and its value has changed after the memmove.
The current implementation still uses op[1] as the length when updating sp (sp += op[1] + 2), so sp ends up wrong.

How I did it
Create patch from https://github.com/isc-projects/dhcp
The new implementation uses mlen to store the length of option O before the memmove, which is how it fixes the bug.
size_t mlen = op[1] + 2;
memmove(sp, op, mlen);
sp += mlen;
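
The effect of reading op[1] after the move can be simulated on a byte buffer. The following Python sketch mirrors the two C snippets above under stated assumptions: strip_buggy and strip_fixed are hypothetical helper names, sp and op are indexes instead of pointers, and the option bytes are made up for illustration. Slice assignment copies the source first, so it behaves like memmove on overlapping ranges.

```python
def strip_buggy(buf, sp, op):
    """Mirror of the old code: op[1] is re-read after the move."""
    n = buf[op + 1] + 2
    buf[sp:sp + n] = buf[op:op + n]      # memmove(sp, op, op[1] + 2)
    return sp + buf[op + 1] + 2          # sp += op[1] + 2 (op[1] may be overwritten)


def strip_fixed(buf, sp, op):
    """Mirror of the patched code: the length is saved before the move."""
    mlen = buf[op + 1] + 2               # size_t mlen = op[1] + 2
    buf[sp:sp + mlen] = buf[op:op + mlen]
    return sp + mlen                     # sp += mlen


# Option S: type 82, length 2. Option O: type 60, length 6. Then an end marker.
packet = bytes([82, 2, 0xAA, 0xBB, 60, 6, 1, 2, 3, 40, 5, 6, 255])

print(strip_buggy(bytearray(packet), 0, 4))
print(strip_fixed(bytearray(packet), 0, 4))
```

In this made-up packet the buggy version advances sp by a value taken from option O's payload, while the fixed version advances it by the 8 bytes that were actually moved.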

How to verify it
I have a PR for sonic-mgmt to cover this issue:
sonic-net/sonic-mgmt#6330

Signed-off-by: Gang Lv ganglv@microsoft.com
2022-09-21 21:18:27 +00:00
Dror Prital
4e3aff882f [Mellanox] Update SDK/FW to version 4.5.2320/2010.2320 (#11990)
- Why I did it
Update SDK/FW version - 4.5.2320/2010_2320 in order to have the following fixes:
• Spectrum-3 | PCI calibration changes from a static to a dynamic mechanism.
• [VxLAN] TTL was set to 0 for non IP traffic (such as ARP)

- How I did it
Update pointer for the SDK/FW

- How to verify it
Run regression tests
2022-09-21 21:16:20 +00:00
Samuel Angebault
366ded2936 Implement ssd_util plugin for Arista products (#11981)
Why I did it
Some Arista products do not have an SSD but use an eMMC instead.
The SsdUtil plugin is therefore extended to support both.

How I did it
Implemented ssd_util.py platform plugin loaded by ssdutil.
This plugin falls back to the default SONiC implementation if the Arista one cannot be found.
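
The fallback behaviour described above can be sketched with importlib. This is an illustration only, not the code from the PR: load_first_available is a hypothetical helper, and the module names passed to it are placeholders (a stdlib module stands in for the default implementation so the snippet runs anywhere).

```python
import importlib


def load_first_available(candidates):
    """Return the first module in `candidates` that can be imported."""
    for name in candidates:
        try:
            return importlib.import_module(name)
        except ImportError:
            # Vendor implementation missing: try the next, more generic one.
            continue
    raise ImportError(f"none of {candidates!r} could be imported")


# Placeholder names: the first does not exist, so the second is used.
module = load_first_available(["no_such_vendor_ssd_plugin", "json"])
print(module.__name__)
```

Ordering the candidates from most specific to most generic keeps the vendor path preferred while still giving every platform a working implementation.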

How to verify it
Run show platform ssdhealth on a product with an eMMC
2022-09-21 21:16:03 +00:00
Volodymyr Boiko
3d620370f7 [bgp][service] Start bgp service after interfaces-config service (#11827)
- Why I did it
The interfaces-config service restarts the networking service. During the restart the loopback interface address is removed and then reassigned, leaving loopback without an IPv4 address for a while.
On SONiC startup and config reload, the interfaces-config and bgp services start in parallel, and sometimes
fpmsyncd in bgp attempts to bind to loopback while it has no address, fails with the log
Exception "Cannot assign requested address" had been thrown in daemon
and exits with rc 0.

root@sonic:/# supervisorctl status
fpmsyncd                         EXITED    Jul 20 05:04 AM
zebra                            RUNNING   pid 35, uptime 6:15:05
zsocket                          EXITED    Jul 20 05:04 AM
docker logs bgp
INFO exited: fpmsyncd (exit status 0; expected)
With fpmsyncd dead, configured routes do not appear in the database.

- How I did it
Added ordering dependency on interfaces-config service into bgp.config

- How to verify it
The issue itself reproduces quite rarely, but one can widen the time interval between networking down and networking up in interfaces-config.sh like this:

diff --git a/files/image_config/interfaces/interfaces-config.sh b/files/image_config/interfaces/interfaces-config.sh
index f6aa4147a..87caceeff 100755
--- a/files/image_config/interfaces/interfaces-config.sh
+++ b/files/image_config/interfaces/interfaces-config.sh
@@ -63,7 +63,11 @@ done
 # Read sysctl conf files again
 sysctl -p /etc/sysctl.d/90-dhcp6-systcl.conf

-systemctl restart networking
+# systemctl restart networking
+
+systemctl start networking
+sleep 10
+systemctl stop networking

 # Clean-up created files
 rm -f /tmp/ztp_input.json /tmp/ztp_port_data.json
With this change the issue reproduces on every config reload.

Signed-off-by: Volodymyr Boyko <volodymyrx.boiko@intel.com>
2022-09-21 21:15:08 +00:00
Mai Bui
a63af72142 [device/ruijie] Mitigation for security vulnerability #11779
Signed-off-by: maipbui maibui@microsoft.com
Why I did it
The xml.etree.ElementTree module is not secure against maliciously constructed data.
How I did it
Remove xml. Use the lxml XML parser package, which prevents potentially malicious operations.
2022-09-21 21:14:14 +00:00
Maxime Lorrillere
458b12b4af [Chassis][Voq]Configure midplane network on supervisor (#11725)
Multi-asic Docker instances are created behind Docker's default bridge
which doesn't allow talking to other Docker instances that are in the
host network (like database-chassis).

On linecards, we configure midplane interfaces to let per-asic docker
containers talk to CHASSIS_DB on the supervisor through internal chassis
network.

On the supervisor we don't need to use chassis internal network, but we
still need a similar setup in order to allow fabric containers to talk
to database-chassis
2022-09-21 21:12:40 +00:00
Ravindranath C K
a700ffdb3d [innovium]: Enable syncd container autorestart for Innovium platforms (#11497)
Why I did it
Enable syncd container autorestart for Innovium platforms

How I did it
Add critical_process file and supervisord.conf entry

How to verify it
Tested with autorestart/test_container_autorestart.py::test_containers_autorestart

PASSED autorestart/test_container_autorestart.py::test_containers_autorestart[sonic-xxx-dut-sonic-xxx-dut|syncd]

Signed-off-by: rck-innovium rck@innovium.com
2022-09-21 21:10:23 +00:00
Sudharsan Dhamal Gopalarathnam
962349ff77
[202205][submodule] Update sonic-swss submodule (#12137)
Update sonic-swss submodule pointer to include the following:
* 8eea92e [202205][counters] Revert PR #2432 for the buffer queue/pg counters improvement ([#2462](https://github.com/Azure/sonic-swss/pull/2462))
* 5d8636a [202205] Enhance orchagent and buffer manager in error handling (#2414) ([#2449](https://github.com/Azure/sonic-swss/pull/2449))
* aa22237 [Everflow/ERSPAN] Set correct destination port and mac address when the nexthop is updated for ERSPAN mirror destination (#2392) ([#2455](https://github.com/Azure/sonic-swss/pull/2455))
* 04ce7be check state_db for po before sending ARP/ND pkts (#2444) ([#2450](https://github.com/Azure/sonic-swss/pull/2450))
* f0138a2 [portmgr] Fixed the orchagent crash due to late arrival of notif (#2431) ([#2451](https://github.com/Azure/sonic-swss/pull/2451))
* 7cfde48 Change the log messages in addKernelNeigh/Route from ERROR to INFO ([#2437](https://github.com/Azure/sonic-swss/pull/2437))
* 2c5116e [202205][counters] Improve performance by polling only configured ports buffer queue/pg counters ([#2432](https://github.com/Azure/sonic-swss/pull/2432))
2022-09-21 09:32:03 -07:00
mssonicbld
77b469d7c8
[ci/build]: Upgrade SONiC package versions (#12121) 2022-09-20 21:24:25 +08:00
Kebo Liu
ff8296ddfe
[202205] [submodule] Advance sonic-swss-common submodule pointer (#12111)
- Why I did it
To pick up a new commit from the sonic-swss-common submodule:

afd2382 [202205] Fix sonic-db-cli dictionary output format not backward compatible issue. sonic-net/sonic-swss-common#690

- How I did it
Advance the sonic-swss-common submodule pointer

Signed-off-by: Kebo Liu <kebol@nvidia.com>
2022-09-20 10:14:59 +03:00
Oleksandr Ivantsiv
c9ba827773
[202205] [services] Update "WantedBy=" section for tacacs-config.timer. (#11893) (#12080)
Manually cherry-picking #11893

- Why I did it
The timer execution may fail if triggered during a config reload (when the sonic.target is stopped). This can happen in the rare case where config reload is executed after reboot within the short window (0 to 30 seconds) before the tacacs-config timer is triggered:

systemctl status tacacs-config.timer
tacacs-config.timer - Delays tacacs apply until SONiC has started
Loaded: loaded (/lib/systemd/system/tacacs-config.timer; enabled-runtime; vendor preset: enabled)
Active: failed (Result: resources) since Mon 2022-08-29 15:53:03 IDT; 1min 28s ago
Trigger: n/a
Triggers: tacacs-config.service

Aug 29 15:47:53 r-boxer-sw01 systemd[1]: Started Delays tacacs apply until SONiC has started.
Aug 29 15:53:03 r-boxer-sw01 systemd[1]: tacacs-config.timer: Failed to queue unit startup job: Transaction for tacacs-config.service/start is destructive (mgmt-framework.timer has 's>
Aug 29 15:53:03 r-boxer-sw01 systemd[1]: tacacs-config.timer: Failed with result 'resources'.

- How I did it
To ensure that timer execution is resumed after a config reload, the WantedBy section of the systemd service is updated to describe its relation to sonic.target.

- How to verify it
Reboot the system
After reboot monitor tacacs-config.timer status. 30 seconds before timer activation run "config reload -y" command.
Check system status.

Signed-off-by: Oleksandr Ivantsiv <oivantsiv@nvidia.com>
2022-09-19 09:20:10 +03:00
mssonicbld
f361c029c5
[ci/build]: Upgrade SONiC package versions (#11980) 2022-09-19 12:31:16 +08:00
gechiang
570f1b0888
[202205][sonic-utilities submodule update] pick up latest needed fix (#12090) 2022-09-15 23:41:18 -07:00
Hua Liu
e51368d789
Revert "Fix docker database flush_unused_database failed issue (#11600) (#11677)" (#12084)
This reverts commit 7e4883e71f.
2022-09-15 14:19:56 -07:00
Sudharsan Dhamal Gopalarathnam
1e034d5e86
[202205][submodule] Update sonic-utilities submodule (#12076)
Update sonic-utilities submodule pointer to include the following:
* b739efc [subinterface]Added additional checks in portchannel and subinterface commands (#2345) ([#2371](https://github.com/Azure/sonic-utilities/pull/2371))
* d01153a Use warm-boot infrastructure for fast-boot ([#2365](https://github.com/Azure/sonic-utilities/pull/2365))
2022-09-15 10:14:58 -07:00
Dror Prital
e129f4198d
[202205][submodule] Advance sonic-sairedis pointer (#12078)
Update sonic-sairedis submodule pointer to include the following:
[202205] Use warm-boot infrastructure for fast-boot (#1121)
2022-09-15 10:06:39 -07:00
Dror Prital
9444176014
[202205][submodule] Advance sonic-utilities pointer (#12079)
Update sonic-utilities submodule pointer to include the following:
[202205][subinterface]Added additional checks in portchannel and subinterface commands (#2371)
[202205] Use warm-boot infrastructure for fast-boot (#2365)
2022-09-15 10:06:12 -07:00
Sudharsan Dhamal Gopalarathnam
79ca9767d3
Update submodule to FRR 8.2.2 (#11502) (#12074)
* sonic-frr was upgraded to FRR 8.2.2 as part of PR #10691. However, the sonic-frr/frr submodule was still referring to the previous 7.5 version. Update the sonic-frr/frr submodule to the 8.2.2 commit id. Fixes issue #11484.

Co-authored-by: Hasan Naqvi <56742004+hasan-brcm@users.noreply.github.com>
2022-09-14 19:25:16 -07:00
Aryeh Feigin
b8c6e2a45d
Use warm-boot infrastructure for fast-boot (#12026) 2022-09-14 21:23:34 +03:00
Samuel Angebault
9f351aecd7
[202205][Arista] Update platform submodules (#12021) 2022-09-13 19:40:16 -07:00
Ying Xie
6812cf0cca
[202205][linkmgrd][utilities][platform-daemons][platform-common][linux-kernel] advance submodule head (#12025)
linkmgrd:
* ab5b2c1 2022-09-02 | Fix mux config (#128) (HEAD -> 202205, github/202205) [Longxiang Lyu]

utilities:
* 7de9305 2022-09-07 | [generate dump]Added error message when saisdkdump fails (#2356) (HEAD -> 202205, github/202205) [Sudharsan Dhamal Gopalarathnam]
* c5b0a6d 2022-09-07 | [counterpoll]Fixing counterpoll show for tunnel and acl stats (#2355) [Sudharsan Dhamal Gopalarathnam]
* 1452b44 2022-09-05 | [GCU] Fix missing backend in dry run (#2347) [jingwenxie]
* bc7b845 2022-09-04 | Add Password Hardening CLI support (#2338) [davidpil2002]
* 55e8948 2022-09-06 | [fast-reboot]Avoid stopping masked services during fast-reboot (#2335) [Sudharsan Dhamal Gopalarathnam]
* f7d69d4 2022-08-30 | Replace cmp in acl_loader with operator.eq (#2328) [Zhaohui Sun]
* 4054ebb 2022-09-05 | Add verification for override (#2305) [jingwenxie]
* 729d811 2022-05-30 | Fix sonic-installer and 'show version' command crash when database docker not running issue. (#2183) [Hua Liu]

platform-daemons:
* 36ba7c0 2022-09-07 | [ycable] cleanup logic for creating grpc future ready (#289) (HEAD -> 202205) [vdahiya12]
* 2a9db73 2022-09-01 | [ycabled] fix insert events from xcvrd;cleanup some mux toggle logic (#287) [vdahiya12]

platform-common:
* d7c990d 2022-09-03 | [CMIS] 'get_transceiver_info' should return 'None' when CMIS cable EEPROM is not ready  (#305) (HEAD -> 202205) [Kebo Liu]

linux-kernel:
* 25ea052 2022-08-31 | [patch]: Add accpt_untracked_na kernel param (#292) (HEAD -> 202205) [Lawrence Lee]

Signed-off-by: Ying Xie <ying.xie@microsoft.com>
2022-09-10 05:04:53 -07:00
kellyyeh
281ede963a [dhcp_relay] Add "vlan missing ip helper" dhcp relay unittest (#10654) 2022-09-09 20:53:02 +00:00
Dror Prital
612326d655
[202205][submodule] Advance sonic-sairedis pointer (#11881)
Update sonic-sairedis submodule pointer to include the following:
* [202205][SAI] advance SAI version to 1.10 to pick up saiserver and syncd-rpc support ([#1115](https://github.com/sonic-net/sonic-sairedis/pull/1115))

Signed-off-by: dprital <drorp@nvidia.com>
2022-09-08 21:16:10 -07:00
Saikrishna Arcot
f1243bad1b
Pin version of bazelisk to v1.13.0 (#12027)
* Pin version of bazelisk to v1.13.0

This tries to avoid builds failures due to the latest version of
bazelisk changing and causing hash mismatches.

Signed-off-by: Saikrishna Arcot <sarcot@microsoft.com>
2022-09-08 21:15:35 -07:00
Ying Xie
ee40402ab7 Revert "[build] Fix version of bazelist which is lost acccidently (#12012)"
This reverts commit 36c5787daf.
2022-09-09 04:14:59 +00:00
Liu Shilong
36c5787daf
[build] Fix version of bazelist which is lost acccidently (#12012)
Why I did it
The bazelisk package with hash value 1227b24db77557d552701f6add122edc was deleted from the GitHub release.
The reproducible build cached only the hash value; the package file was not cached because the two are in different pipelines.
Use the latest package hash instead.
2022-09-09 07:24:44 +08:00
xumia
7c5cb343e3 Fix dbus-run-session command not found issue when install dbus-python (#12009) 2022-09-08 16:34:38 +00:00
bingwang-ms
96588d20e0 Map TC6 to Queue 1 for regular traffic (#11904)
Why I did it
This PR is to update TC_TO_QUEUE_MAP|AZURE for SKU Arista-7050CX3-32S-D48C8 and Arista-7260CX3 T0.

The change is only to align the TC_TO_QUEUE_MAP for regular traffic and bounced traffic. It has no impact on business because we have no traffic being mapped to TC2 or TC6.

How I did it
Update TC_TO_QUEUE_MAP|AZURE, and the test cases as well.

How to verify it
Verified by running test case test_j2files.py

/sonic/src/sonic-config-engine$ python3 setup.py test -s tests/test_j2files.py
running test
......
----------------------------------------------------------------------
Ran 29 tests in 25.390s

OK
2022-09-08 16:34:31 +00:00
Ze Gan
0a54c46a0d [docker-macsec]: Add dependencies of MACsec (#11770)
Why I did it
If the SWSS service is restarted, the MACsec service should also be restarted. Otherwise the data in wpa_supplicant and orchagent will not be consistent.

How I did it
Add dependency in docker-macsec.mk.

How to verify it
Manually check by 'sudo service swss restart'.

The MACsec container should be started after swss, the syslog will look like


Sep  8 14:36:29.562953 sonic INFO swss.sh[9661]: Starting existing swss container with HWSKU Force10-S6000
Sep  8 14:36:30.024399 sonic DEBUG container: container_start: BEGIN
...
Sep  8 14:36:33.391706 sonic INFO systemd[1]: Starting macsec container...
Sep  8 14:36:33.392925 sonic INFO systemd[1]: Starting Management Framework container...


Signed-off-by: Ze Gan <ganze718@gmail.com>
2022-09-08 15:50:06 +00:00
Ying Xie
b4bf4aca3f [mux] skip mux operations during warm shutdown (#11937)
* [mux] skip mux operations during warm shutdown

- Enhance write_standby.py script to skip actions during warm shutdown.
- Expand the support to BGP service.
- Mux support was added by a previous PR.
- don't skip action during warm recovery

Signed-off-by: Ying Xie <ying.xie@microsoft.com>
2022-09-08 15:48:56 +00:00
Kebo Liu
67d9acda39 [SN2201] remove extra empty lines in the pg_profile_lookup.ini (#11923)
- Why I did it
Remove extra empty lines in the SN2201 pg_profile_lookup.ini to make it aligned with other platforms.
This extra empty line could confuse some test cases which need to parse this file.

Signed-off-by: Kebo Liu <kebol@nvidia.com>
2022-09-08 15:48:45 +00:00
Lawrence Lee
12e6b89d80 [arp_update]: Set failed IPv6 neighbors to incomplete (#11919)
After pinging any failed IPv6 neighbor entries, set the remaining failed/incomplete entries to a permanent INCOMPLETE state. This manual setting to INCOMPLETE prevents these entries from automatically transitioning to the FAILED state, and since they are now incomplete, any subsequent NA messages for these neighbors are able to resolve the entry in the cache.

Signed-off-by: Lawrence Lee <lawlee@microsoft.com>
2022-09-08 15:48:05 +00:00
Ze Gan
3b128ec7e8 [macsec]: Add MACsec clear CLI support (#11731)
Why I did it
To support clearing MACsec counters with sonic-clear macsec

How I did it
Add a macsec sub-command to sonic-clear that caches the current MACsec stats, and make the show macsec command check the cache and return the difference from the cached file.
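
The cache-and-diff approach can be sketched as follows. This is a minimal illustration and not the sonic-utilities code: save_baseline and counters_since_clear are hypothetical helpers, the JSON cache file format is an assumption, and the counter values are made up.

```python
import json
import os
import tempfile


def save_baseline(path, counters):
    """'Clear' step: remember the current counter values in a cache file."""
    with open(path, "w") as f:
        json.dump(counters, f)


def counters_since_clear(path, counters):
    """'Show' step: report each counter relative to the cached baseline."""
    try:
        with open(path) as f:
            baseline = json.load(f)
    except FileNotFoundError:
        baseline = {}  # never cleared: show raw values
    return {name: value - baseline.get(name, 0) for name, value in counters.items()}


cache = os.path.join(tempfile.mkdtemp(), "macsec_cache.json")  # assumed location
before = {"SAI_MACSEC_SA_STAT_IN_PKTS_OK": 56}
save_baseline(cache, before)

after = {"SAI_MACSEC_SA_STAT_IN_PKTS_OK": 60}
print(counters_since_clear(cache, after))
```

The hardware counters are never reset in this scheme; only the displayed values are offset by the baseline, so removing the cache file restores the raw numbers.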

How to verify it

admin@vlab-02:~$ show macsec  Ethernet0
MACsec port(Ethernet0)
---------------------  -----------
cipher_suite           GCM-AES-128
enable                 true
enable_encrypt         true
enable_protect         true
enable_replay_protect  false
replay_window          0
send_sci               true
---------------------  -----------
        MACsec Egress SC (52540067daa70001)
        -----------  -
        encoding_an  0
        -----------  -
                MACsec Egress SA (0)
                -------------------------------------  --------------------------------
                auth_key                               9DDD4C69220A1FA9B6763F229B75CB6F
                next_pn                                1
                sak                                    BA86574D054FCF48B9CD7CF54F21304A
                salt                                   000000000000000000000000
                ssci                                   0
                SAI_MACSEC_SA_ATTR_CURRENT_XPN         52
                SAI_MACSEC_SA_STAT_OCTETS_ENCRYPTED    0
                SAI_MACSEC_SA_STAT_OCTETS_PROTECTED    0
                SAI_MACSEC_SA_STAT_OUT_PKTS_ENCRYPTED  0
                SAI_MACSEC_SA_STAT_OUT_PKTS_PROTECTED  0
                -------------------------------------  --------------------------------
        MACsec Ingress SC (525400d4fd3f0001)
                MACsec Ingress SA (0)
                ---------------------------------------  --------------------------------
                active                                   true
                auth_key                                 9DDD4C69220A1FA9B6763F229B75CB6F
                lowest_acceptable_pn                     1
                sak                                      BA86574D054FCF48B9CD7CF54F21304A
                salt                                     000000000000000000000000
                ssci                                     0
                SAI_MACSEC_SA_ATTR_CURRENT_XPN           56
                SAI_MACSEC_SA_STAT_IN_PKTS_DELAYED       0
                SAI_MACSEC_SA_STAT_IN_PKTS_INVALID       0
                SAI_MACSEC_SA_STAT_IN_PKTS_LATE          0
                SAI_MACSEC_SA_STAT_IN_PKTS_NOT_USING_SA  0
                SAI_MACSEC_SA_STAT_IN_PKTS_NOT_VALID     0
                SAI_MACSEC_SA_STAT_IN_PKTS_OK            0
                SAI_MACSEC_SA_STAT_IN_PKTS_UNCHECKED     0
                SAI_MACSEC_SA_STAT_IN_PKTS_UNUSED_SA     0
                SAI_MACSEC_SA_STAT_OCTETS_ENCRYPTED      0
                SAI_MACSEC_SA_STAT_OCTETS_PROTECTED      0
                ---------------------------------------  --------------------------------

admin@vlab-02:~$ sonic-clear macsec
Clear MACsec counters

admin@vlab-02:~$ show macsec  Ethernet0
MACsec port(Ethernet0)
---------------------  -----------
cipher_suite           GCM-AES-128
enable                 true
enable_encrypt         true
enable_protect         true
enable_replay_protect  false
replay_window          0
send_sci               true
---------------------  -----------
        MACsec Egress SC (52540067daa70001)
        -----------  -
        encoding_an  0
        -----------  -
                MACsec Egress SA (0)
                -------------------------------------  --------------------------------
                auth_key                               9DDD4C69220A1FA9B6763F229B75CB6F
                next_pn                                1
                sak                                    BA86574D054FCF48B9CD7CF54F21304A
                salt                                   000000000000000000000000
                ssci                                   0
                SAI_MACSEC_SA_ATTR_CURRENT_XPN         52
                SAI_MACSEC_SA_STAT_OCTETS_ENCRYPTED    0
                SAI_MACSEC_SA_STAT_OCTETS_PROTECTED    0
                SAI_MACSEC_SA_STAT_OUT_PKTS_ENCRYPTED  0
                SAI_MACSEC_SA_STAT_OUT_PKTS_PROTECTED  0
                -------------------------------------  --------------------------------
        MACsec Ingress SC (525400d4fd3f0001)
                MACsec Ingress SA (0)
                ---------------------------------------  --------------------------------
                active                                   true
                auth_key                                 9DDD4C69220A1FA9B6763F229B75CB6F
                lowest_acceptable_pn                     1
                sak                                      BA86574D054FCF48B9CD7CF54F21304A
                salt                                     000000000000000000000000
                ssci                                     0
                SAI_MACSEC_SA_ATTR_CURRENT_XPN           0 <--- this counter was cleared.
                SAI_MACSEC_SA_STAT_IN_PKTS_DELAYED       0
                SAI_MACSEC_SA_STAT_IN_PKTS_INVALID       0
                SAI_MACSEC_SA_STAT_IN_PKTS_LATE          0
                SAI_MACSEC_SA_STAT_IN_PKTS_NOT_USING_SA  0
                SAI_MACSEC_SA_STAT_IN_PKTS_NOT_VALID     0
                SAI_MACSEC_SA_STAT_IN_PKTS_OK            0
                SAI_MACSEC_SA_STAT_IN_PKTS_UNCHECKED     0
                SAI_MACSEC_SA_STAT_IN_PKTS_UNUSED_SA     0
                SAI_MACSEC_SA_STAT_OCTETS_ENCRYPTED      0
                SAI_MACSEC_SA_STAT_OCTETS_PROTECTED      0
                ---------------------------------------  --------------------------------


Signed-off-by: Ze Gan <ganze718@gmail.com>
Co-authored-by: Judy Joseph <jujoseph@microsoft.com>
2022-09-08 15:47:49 +00:00
Stepan Blyshchak
8431d3ab36 [docker-wait-any] immediately start to wait (#11595)
A container may already have crashed by the time docker-wait-any starts,
in which case docker-wait-any waits forever for it to start. It should
instead exit immediately so that the service restarts.

#### Why I did it

In some circumstances the auto-restart mechanism does not work. Specifically for ```swss.service```, ```orchagent``` had crashed before ```docker-wait-any``` started in ```swss.sh```. This led ```docker-wait-any``` to wait forever for ```swss``` to reach the ```"Running"``` state, which results in:

```
CONTAINER ID   IMAGE                                COMMAND                  CREATED        STATUS                    PORTS     NAMES
1abef1ecebff   bcbca2b74df6                         "/usr/local/bin/supe…"   22 hours ago   Up 22 hours                         what-just-happened
3c924d405cd5   docker-lldp:latest                   "/usr/bin/docker-lld…"   22 hours ago   Up 22 hours                         lldp
eb2b12a98c13   docker-router-advertiser:latest      "/usr/bin/docker-ini…"   22 hours ago   Up 22 hours                         radv
d6aac4a46974   docker-sonic-mgmt-framework:latest   "/usr/local/bin/supe…"   22 hours ago   Up 22 hours                         mgmt-framework
d880fd07aab9   docker-platform-monitor:latest       "/usr/bin/docker_ini…"   22 hours ago   Up 22 hours                         pmon
75f9e22d4fdd   docker-snmp:latest                   "/usr/local/bin/supe…"   22 hours ago   Up 22 hours                         snmp
76d570a4bd1c   docker-sonic-telemetry:latest        "/usr/local/bin/supe…"   22 hours ago   Up 22 hours                         telemetry
ee49f50344b3   docker-syncd-mlnx:latest             "/usr/local/bin/supe…"   22 hours ago   Up 22 hours                         syncd
1f0b0bab3687   docker-teamd:latest                  "/usr/local/bin/supe…"   22 hours ago   Up 22 hours                         teamd
917aeeaf9722   docker-orchagent:latest              "/usr/bin/docker-ini…"   22 hours ago   Exited (0) 22 hours ago             swss
81a4d3e820e8   docker-fpm-frr:latest                "/usr/bin/docker_ini…"   22 hours ago   Up 22 hours                         bgp
f6eee8be282c   docker-database:latest               "/usr/local/bin/dock…"   22 hours ago   Up 22 hours                         database
```

The check for the ```"Running"``` state is not needed: in the cold boot case we call ```start_peer_and_dependent_services```, and in the warm boot case the loop retries the wait if the container is doing a warm boot:
d01a91a569/files/image_config/misc/docker-wait-any (L56)

#### How I did it

Removed the check for ```"Running"```.
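
The effect of the change can be illustrated with a minimal, self-contained sketch. This is not the actual ```docker-wait-any``` code: ```wait_any``` and ```fake_wait``` are hypothetical names, and ```fake_wait``` stands in for a blocking "wait for container exit" call that returns at once when the container has already exited. With no preliminary ```"Running"``` check, the already-exited container is reported immediately:

```python
import threading


def wait_any(names, wait_fn):
    """Return the name of the first container whose wait_fn(name) returns.

    wait_fn is assumed to block until the container exits and to return
    immediately if the container has already exited.
    """
    done = threading.Event()
    finished = []
    lock = threading.Lock()

    def worker(name):
        # No "is it running yet?" check here: start waiting right away.
        wait_fn(name)
        with lock:
            finished.append(name)
        done.set()

    for name in names:
        threading.Thread(target=worker, args=(name,), daemon=True).start()

    done.wait()
    with lock:
        return finished[0]


# Simulation (assumption): "swss" has already exited, the others keep running
# until the stop event is set.
stop = threading.Event()


def fake_wait(name):
    if name != "swss":
        stop.wait()


first_exited = wait_any(["syncd", "swss", "teamd"], fake_wait)
stop.set()  # release the simulated long-running containers
print(first_exited)  # swss
```

In this sketch the caller would then exit, which lets the service manager restart the service.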

#### How to verify it

Kill swss before ```docker-wait-any``` is reached and verify that auto-restart restarts the swss service.
2022-09-08 15:47:27 +00:00