6457 Commits

Author SHA1 Message Date
Lior Avramov
ff3ad9ddd1 [memory_checker] Do not check memory usage of containers if docker daemon is not running (#11476)
Fix in Monit memory_checker plugin. Skip fetching running containers if docker engine is down (can happen in deinit).
This PR fixes issue #11472.

Signed-off-by: liora <liora@nvidia.com>

Why I did it
When Monit runs during the deinit flow, the memory_checker plugin fetches the running containers without checking whether the Docker service is still running. I added this check.

How I did it
Use systemctl is-active to check if Docker engine is still running.
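
A minimal sketch of such a guard follows. It is not the plugin's actual code: the function names and the injectable `run` parameter are hypothetical and only there so the check can be exercised without systemd. The only behaviour relied on is that `systemctl is-active` exits with status 0 when the unit is active.

```python
import subprocess


def is_service_active(name, run=subprocess.run):
    """Return True if `systemctl is-active` reports the unit as active.

    `run` is an injectable stand-in for subprocess.run (hypothetical
    parameter, used here to keep the sketch testable).
    """
    result = run(["systemctl", "is-active", "--quiet", name], check=False)
    # `systemctl is-active` exits with 0 only when the unit is active.
    return result.returncode == 0


def get_running_containers(list_containers, run=subprocess.run):
    """Skip the container query entirely when the Docker engine is down."""
    if not is_service_active("docker", run=run):
        print("Docker engine is not running; skipping container memory check")
        return []
    return list_containers()
```

Here `list_containers` stands for whatever callable the plugin uses to query Docker; it is only invoked once the engine is known to be up.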

How to verify it
Use systemctl to stop the Docker engine and reload Monit. No errors should appear in the log, and the relevant message should be printed there.

Which release branch to backport (provide reason below if selected)
The fix is required in 202205 and 202012 since the PR that introduced the issue was cherry picked to those branches (#11129).
2022-07-28 20:37:22 +00:00
Ze Gan
7a502a25c1 [iproute2]: Enhance iproute2 to update PN for XPN (#11474)
Why I did it
The ip command cannot update the packet number if the cipher is XPN.

How I did it
Specify the SSCI when updating the packet number, and ignore the SSCI value if the action is an update.

Signed-off-by: Ze Gan <ganze718@gmail.com>
2022-07-28 20:37:12 +00:00
abdosi
eb56dc8b90 Enable ARP Update Script for Packet based chassis. (#11465)
What I did:

    Following changes done for packet based chassis:-
    1> Run arp_update on LC's to resolve static route nexthops over backend
    port-channel interfaces.
    2> On Supervisor make sure arp_update exits gracefully
2022-07-28 20:36:54 +00:00
Stephen Sun
e317af0e9a Fix chassis test issue (#11460)
Signed-off-by: Stephen Sun <stephens@nvidia.com>
2022-07-28 20:36:44 +00:00
abdosi
1d32553a91 Added Support for deployment_id parsing for Device Asic metadata (#11454)
What I did:
Added Support for deployment_id parsing for Device Asic metadata.

Why I did:-
Deployment Id is used in BGP docker for FRR template generation. For multi-asic platforms running in a namespace without deployment id as a key in DEVICE_METADATA, FRR template generation fails. This change is needed after #10154, where if deployment_id is none we don't update the DEVICE_METADATA dictionary.

How I verify:-
Added unit-test.
2022-07-28 20:36:34 +00:00
tjchadaga
4bc1192dcd Log message fix for TSB (#11441) 2022-07-28 20:36:21 +00:00
jusherma
e00cd53caf [build] don't require passwordless sudo #11417
Why I did it
Not all build environments have passwordless sudo enabled for all users

How I did it
Instead of using sudo to delete fsroot directories, mount them in a small, temporary docker container and delete them from there
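
A rough sketch of how such a cleanup command could be assembled is shown below. The helper name, the `/work` mount point, and the `debian:bullseye` image are assumptions for illustration, not what the build scripts actually use. The idea is that the process inside the container runs as root by default, so it can delete root-owned files without the build user needing sudo.

```python
import os


def build_cleanup_command(fsroot_path, image="debian:bullseye"):
    """Build a `docker run` argv that deletes fsroot_path from inside a container.

    The image name and the /work mount point are illustrative assumptions.
    """
    # Mount the parent directory so the fsroot directory itself can be removed.
    parent, name = os.path.split(os.path.abspath(fsroot_path))
    return [
        "docker", "run", "--rm",      # throwaway container, removed on exit
        "-v", f"{parent}:/work",      # bind-mount the parent directory
        image,
        "rm", "-rf", f"/work/{name}",
    ]
```

The resulting list can be passed to `subprocess.run(cmd, check=True)`; the build user only needs permission to talk to the Docker daemon.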

How to verify it
Build in an environment where the build user does not have passwordless sudo enabled and confirm that no sudo password prompts are seen
2022-07-28 20:36:01 +00:00
tjchadaga
0c7f0aa9b7 Add load_minigraph option to include traffic-shift-away during config migration (#11403) 2022-07-28 20:34:39 +00:00
Marty Y. Lok
948c932cee [Nokia][IXR7250E] Add Nokia platform Nokia-IXR7250E-36x100GE 100G line card device dat (#11382)
Signed-off-by: mlok <marty.lok@nokia.com>
2022-07-28 20:34:05 +00:00
tjchadaga
f56963603b Add bgp_device_global yang model (#11343) 2022-07-28 20:31:36 +00:00
Stephen Sun
94df2c4b86 [Mellanox] Support new platform API get_port_or_cage_type for RJ45 ports (#11336)
- Why I did it
Support get_port_or_cage_type for RJ45 ports

- How I did it
Implement the new platform API get_port_or_cage_type
Fix the issue: unable to import SFP when chassis object is destructed

- How to verify it
Manually test and regression test

Signed-off-by: Stephen Sun <stephens@nvidia.com>
2022-07-28 20:31:20 +00:00
Junhua Zhai
eafaf08780 [macsec] cli multi-namespace support (#11285)
Enable multi-asic platform support for macsec cli
2022-07-28 20:30:15 +00:00
Stephen Sun
b4d8ee3fec [Mellanox] Support Mellanox-SN4600C-C64 as T1 switch in dual-ToR scenario (#11261)
- Why I did it
Support Mellanox-SN4600C-C64 as T1 switch in dual-ToR scenario
This is to port #11032 and #11299 from 202012 to master.

Support additional queue and PG in buffer templates, including both traditional and dynamic model
Support mapping DSCP 2/6 to lossless traffic in the QoS template.
Add macros to generate additional lossless PG in the dynamic model
Adjust the order in which the generic/dedicated (with additional lossless queues) macros are checked and called to generate buffer tables in common template buffers_config.j2
Buffer tables are rendered using macros.
Both generic and dedicated macros are defined on our platform. Currently, the generic one is called as long as it is defined, which causes the generic one to always be called on our platform. To avoid this, the dedicated macro is checked and called first, and then the generic ones.
Support MAP_PFC_PRIORITY_TO_PRIORITY_GROUP on ports with additional lossless queues.
On Mellanox-SN4600C-C64, buffer configuration for t1 is calculated as:

40 * 100G downlink ports with 4 lossless PGs/queues, 1 lossy PG, and 3 lossy queues
16 * 100G uplink ports with 2 lossless PGs/queues, 1 lossy PG, and 5 lossy queues

Signed-off-by: Stephen Sun <stephens@nvidia.com>
2022-07-28 20:30:00 +00:00
tjchadaga
fc93871881 Changes to persist TSA/B state across reloads (#11257) 2022-07-28 20:29:45 +00:00
kellyyeh
4abfd37a8d [dhcpmon] Open different socket for dual tor to enable interface filtering (#11201) 2022-07-28 20:28:29 +00:00
andywongarista
f377636747 Add gbsyncd container for broncos (#11154)
* Add docker-gbsyncd-broncos support
* Address review comments
* Add socket to gbsyncd
* Upgrade gbsyncd-broncos to bullseye
2022-07-28 20:27:21 +00:00
Kebo Liu
67e46e1004 add flag skip_xcvrd_cmis_mgr to skip cmis task on Nvidia platform (#11120)
Signed-off-by: Kebo Liu <kebol@nvidia.com>
2022-07-28 20:26:25 +00:00
Prince George
60c00195d4 Skip CMIS manager (#10907)
* Removed unwanted changes

* Fix j2 compilation error

* Address review comment

* Add newline
2022-07-28 20:25:36 +00:00
Kebo Liu
2f59460fc4 [Mellanox] Enhance Platform API to support SN2201 - RJ45 ports and new components mgmt. (#10377)
* Support new platform SN2201 and RJ45 port

Signed-off-by: Kebo Liu <kebol@nvidia.com>

* remove unused import and redundant function

Signed-off-by: Kebo Liu <kebol@nvidia.com>

* fix error introduced by rebase

Signed-off-by: Kebo Liu <kebol@nvidia.com>

* Revert the special handling of RJ45 ports (#56)

* Revert the special handling of RJ45 ports

sfp.py
sfp_event.py
chassis.py

Signed-off-by: Stephen Sun <stephens@nvidia.com>

* Remove deadcode

Signed-off-by: Stephen Sun <stephens@nvidia.com>

* Support CPLD update for SN2201

A new class is introduced, deriving from ComponentCPLD and overloading _install_firmware
Change _install_firmware from private (starting with __) to protected, making it overloadable

Signed-off-by: Stephen Sun <stephens@nvidia.com>

* Initialize component BIOS/CPLD

Signed-off-by: Stephen Sun <stephens@nvidia.com>

* Remove swb_amb, which doesn't exist on the DVT board any more

Signed-off-by: Stephen Sun <stephens@nvidia.com>

* Remove the nonexistent sensor - switch board ambient - from platform.json

Signed-off-by: Stephen Sun <stephens@nvidia.com>

* Do not report error on receiving unknown status on RJ45 ports

Translate it to disconnect for RJ45 ports
Report error for xSFP ports

Signed-off-by: Stephen Sun <stephens@nvidia.com>

* Add reinit for RJ45 to avoid exception

Signed-off-by: Stephen Sun <stephens@nvidia.com>

Co-authored-by: Stephen Sun <5379172+stephenxs@users.noreply.github.com>
Co-authored-by: Stephen Sun <stephens@nvidia.com>
2022-07-28 20:24:49 +00:00
Liu Shilong
71f47ed15b
[ci] Transfer organization from Azure to sonic-net for sonic-mgmt (#11559) (#11560)
Why I did it
Transfer organization from Azure to sonic-net for sonic-mgmt
2022-07-28 15:31:02 +08:00
Ying Xie
f96f0e464f
[202205][sairedis][platform-daemon][linkmgrd][utilities][swss-common] advance submodule head (#11518)
sairedis:
* 38c0bb1 2022-07-21 | [sairedis] Fix reopen recoding file (#1087) (HEAD -> 202205, github/202205) [Kamil Cudnik]

platform-daemon:
* 17587b6 2022-07-22 | [ycabled] add secure channel support for grpc dualtor active-active connectivity  (#275) (HEAD -> 202205, github/202205) [vdahiya12]

linkmgrd:
* c911ec7 2022-07-21 | Avoid unnecessary error logs from `handleGetServerMacAddressNotification` (#96) (HEAD -> 202205) [Jing Zhang]
* bbae81d 2022-07-18 | Add support for reconciliation after warm restart  (#76) [Jing Zhang]

utilities:
* bcc1206 2022-07-20 | Change db_migrator major version on master branch from version 2 to 3 (#2272) (HEAD -> 202205) [Vaibhav Hemant Dixit]
* ad40697 2022-07-21 | Fix test for pfcwd_sw_enable in db_migrator_test (#2253) [bingwang-ms]
* 886f612 2022-07-22 | Revert "show commands for SYSTEM READY (#1851) (#2261)" (#2274) (github/202205) [Ying Xie]
* a6404b7 2022-07-17 | show commands for SYSTEM READY (#1851) (#2261) [Senthil Kumar Guruswamy]

swss-common:
* 509b265 2022-07-06 | Add device global table definition (#645) (HEAD -> 202205) [tjchadaga]

Signed-off-by: Ying Xie <ying.xie@microsoft.com>
2022-07-23 00:22:15 -07:00
Samuel Angebault
8ae03c994d [Arista] Update platform library (#10922)
- Implement Pcie plugin for chassis
- Implement set_admin_status for chassis modules
- Fix phy declaration for phy-credo
2022-07-22 22:15:34 +00:00
Neetha John
f92e3e8262 Update 7260 MMU and ECN settings (#11449)
Signed-off-by: Neetha John <nejo@microsoft.com>

Why I did it
Improve throughput and latency for 7260 deployments

How I did it
Update the dynamic threshold to 0 and ECN settings as 2mb/10mb/5%

How to verify it
Updated unit tests to use the modified values for 7260 ecn settings.
2022-07-22 22:14:41 +00:00
zitingguo-ms
e13df585ee [bcm sai]upgrade Broadcom SAI to 7.1.0.0-6 (#11410)
- By default, do not report single-bit ECC correctable events, to avoid the need to set SOC properties.

Signed-off-by: zitingguo <zitingguo@microsoft.com>
2022-07-22 22:14:28 +00:00
Ying Xie
aee974269f [minigraph] allow LibraPeeringLink to be dualtor indication as well (#11492)
Signed-off-by: Ying Xie <ying.xie@microsoft.com>
2022-07-21 15:26:11 +00:00
vdahiya12
3829faf2c9
[caclmgrd][dualtor] add iptables rule for dualtor gRPC to allow packets getting forwarded from loopback IP (#11458)
This PR is required for changing the L3 IP forwarding behavior to the SoC in active-active topology. For a packet to be forwarded to the SoC IP in active-active topology, it must use the Loopback 3 IP of the SONiC device as the source IP. By default, if the ToR wants to send a packet to the SoC, it picks the Vlan IP, since that is the IP in the subnet. However, because there are firewalls inside the SoC, IP packets with the Vlan IP as the source IP in the IP header are dropped. To overcome this limitation, an iptables NAT rule is installed in the kernel so that all packets with the SoC IP as the destination IP use the Loopback 3 IP as the source in the IP header.

How I did it
check the config DB if the ToR is a DualToR and has an SoC IP assigned.
put an iptable rule
iptables -t nat -A POSTROUTING --destination -j SNAT --to-source "
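
As an illustration only, the rule described above could be assembled as in the sketch below. The helper name and the example addresses are hypothetical, and placing the SoC IP after `--destination` and the Loopback 3 IP after `--to-source` follows the prose description rather than the exact text of the PR.

```python
import ipaddress


def build_soc_snat_rule(soc_ip, loopback3_ip):
    """Return an iptables argv that rewrites the source of packets sent to the SoC.

    Both arguments are validated as IPv4 addresses; this sketch does not
    cover IPv6.
    """
    dst = ipaddress.IPv4Address(soc_ip)        # raises ValueError if malformed
    src = ipaddress.IPv4Address(loopback3_ip)
    return [
        "iptables", "-t", "nat", "-A", "POSTROUTING",
        "--destination", str(dst),
        "-j", "SNAT", "--to-source", str(src),
    ]


# Example with made-up documentation addresses:
rule = build_soc_snat_rule("192.0.2.3", "198.51.100.1")
print(" ".join(rule))
```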
Signed-off-by: vaibhav-dahiya <vdahiya@microsoft.com>
2022-07-20 09:00:28 -07:00
Ying Xie
3e9c1d16c1
[202205][platform-daemon] move submodule head (#11475)
platform-daemon:
* 17f886d 2022-07-18 | [ycabled] remove some redundant logging for active-active cable type (#274) (HEAD -> 202205, github/202205) [vdahiya12]

Signed-off-by: Ying Xie <ying.xie@microsoft.com>
2022-07-18 17:58:32 -07:00
Ying Xie
10cafd5490
[202205][swss][sairedis] advance submodule head (#11463)
swss:
* 7841930 2022-07-15 | [vxlan]Fixing L2MC vlan member caching issue (#2378) (HEAD -> 202205) [Sudharsan Dhamal Gopalarathnam]
* b8cd435 2022-07-14 | [muxorch] Always use direct link for SoC IPs (#2369) [Longxiang Lyu]
* 6158d5c 2022-07-08 | Add BGP profile to Vnet routes (#2337) [Prince Sunny]
* bdb7ffd 2022-07-06 | [teammgr]: Waiting MACsec ready before doLagMemberTask (#2286) [Ze Gan]

sairedis:
* 58359d4 2022-06-30 | [sairedis] Perform log rotate on request (#1058) (HEAD -> 202205, github/202205) [Kamil Cudnik]
* cad0268 2022-07-13 | Enable cisco debug shell by default (#1078) [VenkatCisco]

Signed-off-by: Ying Xie <ying.xie@microsoft.com>
2022-07-18 10:44:20 -07:00
bingwang-ms
f7cc66ad4c Add flag to control the generation of PORT_QOS_MAP|global entry (#11448)
Why I did it
This PR is to add a flag to control whether to generate PORT_QOS_MAP|global entry or not.
It's because for some HWSKU, such as BackEndToRRouter and BackEndLeafRouter, there is no DSCP_TO_TC_MAP defined.
Hence, if the PORT_QOS_MAP|global entry is generated, OA will report some error because the DSCP_TO_TC_MAP map AZURE can not be found.

Jul 14 00:24:40.286767 str2-7050qx-32s-acs-03 ERR swss#orchagent: :- saiObjectTypeQuery: invalid object id oid:0x7fddb43605d0
Jul 14 00:24:40.286767 str2-7050qx-32s-acs-03 ERR swss#orchagent: :- meta_generic_validation_objlist: SAI_SWITCH_ATTR_QOS_DSCP_TO_TC_MAP:SAI_ATTR_VALUE_TYPE_OBJECT_ID object on list [0] oid 0x7fddb43605d0 is not valid, returned null object id
Jul 14 00:24:40.286767 str2-7050qx-32s-acs-03 ERR swss#orchagent: :- applyDscpToTcMapToSwitch: Failed to apply DSCP_TO_TC QoS map to switch rv:-5
Jul 14 00:24:40.286767 str2-7050qx-32s-acs-03 ERR swss#orchagent: :- doTask: Failed to process QOS task, drop it
This PR is to address the issue.

How I did it
Add a flag require_global_dscp_to_tc_map to control whether to generate the PORT_QOS_MAP|global entry. The default value for require_global_dscp_to_tc_map is true. If the device type is storage backend, the value is changed to false. Then the PORT_QOS_MAP|global entry is not generated.
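
The decision logic can be sketched roughly as follows. This is not the template code from the PR: the Python function names are hypothetical, the set of storage backend device types is taken from the two examples named above, and the `dscp_to_tc_map` field name is an assumption.

```python
# Device types treated as storage backend (taken from the examples above).
STORAGE_BACKEND_TYPES = {"BackEndToRRouter", "BackEndLeafRouter"}


def require_global_dscp_to_tc_map(device_type):
    """Default is True; storage backend devices switch the flag to False."""
    return device_type not in STORAGE_BACKEND_TYPES


def render_port_qos_map(device_type):
    """Return the PORT_QOS_MAP entries controlled by the flag."""
    table = {}
    if require_global_dscp_to_tc_map(device_type):
        # Only emitted when a DSCP_TO_TC_MAP named AZURE is expected to exist.
        table["global"] = {"dscp_to_tc_map": "AZURE"}
    return table
```

With this shape, a storage backend device gets no `global` entry, so orchagent never tries to resolve a DSCP_TO_TC_MAP that was not defined.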

How to verify it
Update the current test_qos_dscp_remapping_render_template to cover storage backend.
2022-07-17 03:20:20 +00:00
Neetha John
aa63d3101d Minigraph parser changes to select mmu profiles based on SonicQosProfile attribute (#11429)
Signed-off-by: Neetha John <nejo@microsoft.com>

Why I did it
There is a need to select different mmu profiles based on deployment type

How I did it
There will be separate subfolders (RDMA-CENTRIC, TCP-CENTRIC, BALANCED) in each hwsku folder, which contain deployment-specific mmu and qos settings. The SonicQosProfile attribute in the minigraph will be used to determine which settings to use. If that attribute is not present, the default settings that exist in the hwsku folder will be used.
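
A small sketch of the selection rule, under stated assumptions: the function name is hypothetical, and falling back to the hwsku folder when the named subfolder does not exist is an assumption beyond what the description says (it only covers the case where the attribute is absent).

```python
import os

# Subfolder names listed in the description above.
QOS_PROFILES = ("RDMA-CENTRIC", "TCP-CENTRIC", "BALANCED")


def select_settings_dir(hwsku_dir, sonic_qos_profile=None):
    """Pick the directory holding the mmu/qos settings for a deployment."""
    if sonic_qos_profile in QOS_PROFILES:
        candidate = os.path.join(hwsku_dir, sonic_qos_profile)
        if os.path.isdir(candidate):
            return candidate
    # Attribute missing, unknown, or subfolder absent: use the defaults.
    return hwsku_dir
```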
2022-07-17 03:20:07 +00:00
xumia
3f0c82c831 [Build] Cleanup the version deb preference file after build (#11414)
Why I did it
Clean up the version deb preference file after the build.
The version file is of no use after the build.

How I did it
Remove the unused version file.
2022-07-17 03:19:54 +00:00
SuvarnaMeenakshi
40b47e96ce [caclmgrd]: Add infrastructure to support adding feature specific acls (#11367)
Why I did it
Add infrastructure to support adding feature-specific ACLs.
If feature-specific ACLs have to be added:

if feature_name in self.feature_present and self.feature_present.get('feature_name'):
    add_feature_specific_acls()
How I did it
Add function to get features present in feature table.
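
One way such a lookup could be shaped is sketched below. The function names, the `acl_builders` mapping, and the rule that a feature counts as present simply because it has an entry in the feature table are all assumptions for illustration, not the daemon's actual code.

```python
def get_features_present(feature_table):
    """Map each feature name found in the feature table to True.

    `feature_table` is assumed to be a dict keyed by feature name.
    """
    return {name: True for name in feature_table}


def collect_feature_acls(feature_table, acl_builders):
    """Run the ACL builder of every feature that is present.

    `acl_builders` (hypothetical) maps a feature name to a callable
    returning a list of rules for that feature.
    """
    present = get_features_present(feature_table)
    rules = []
    for feature_name, build_rules in acl_builders.items():
        if present.get(feature_name):
            rules.extend(build_rules())
    return rules
```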

How to verify it
unit-test passes.
2022-07-17 03:17:28 +00:00
Stepan Blyshchak
3607686fd1 [teamd] Stop teamd after stopping swss in fast-reboot (#11210)
- Why I did it
To optimize fast-reboot. Teamd can be stopped after bgp and swss are stopped, because the last LACP packet can still be sent while syncd is running. This saves 15 sec on shutdown.

- How I did it
Defined in the manifest for teamd to be stopped after swss

- How to verify it
Run it on the switch.

Signed-off-by: Stepan Blyschak <stepanb@nvidia.com>
2022-07-17 03:16:54 +00:00
Lawrence Lee
669687385b [device]: Add SAI checksum verify to TD3 config (#8857)
* [device]: Add SAI checksum verify to TD3 config
* A new config option was added to control the value of IPV4_INCR_CHECKSUM_ORIGINAL_VALUE_VERIFY in the EGR_FLEX_CONFIG control register (this prevents checksums of 0xffff from being propagated to other devices)
2022-07-17 03:11:54 +00:00
Liu Shilong
80ae988be4
[ci] Enable reproducible build in PR checks. (#11411)
* [ci] Enable reproducible build in PR checks.
2022-07-13 16:58:37 +08:00
xumia
d3ae762db5
[Build] Add the missing debian security mirrors in slave images (#11305) (#11314)
Why I did it
Cherry-pick from #11305

The build below was broken because one of the required Debian mirrors was missing.
https://dev.azure.com/mssonic/build/_build/results?buildId=116719&view=logs&j=88ce9a53-729c-5fa9-7b6e-3d98f2488e3f&t=88f376cf-c35d-5783-0a48-9ad83a873284

 libpci-dev : Depends: libudev-dev (>= 196) but it is not going to be installed
 libsystemd-dev : Depends: libsystemd0 (= 232-25+deb9u14) but 232-25+deb9u13 is to be installed
How I did it
Add the missing mirrors for buster and stretch.
2022-07-13 11:08:11 +08:00
mssonicbld
63a3631d98
[ci/build]: Upgrade SONiC package versions (#11425)
Upgrade SONiC Versions
2022-07-13 07:08:33 +08:00
Ying Xie
cd991bb8e1 [Buffer] Separate buffer profile for Arista-7060CX-32S-Q24C8
Signed-off-by: Ying Xie <ying.xie@microsoft.com>
2022-07-12 15:05:19 -07:00
Ying Xie
43b2f15286 [7060] fix default port map
Signed-off-by: Ying Xie <ying.xie@microsoft.com>
2022-07-12 15:05:19 -07:00
zzhiyuan
a0265de430 [Arista] [201811] Add Arista-7260CX3-D96C16 HWSKU (#10034)
Why I did it
This was an ask by Microsoft to provide:
7260 config.bcm file for hardware sku Arista-7260CX3-D92C16 (Named Arista-7260CX3-D96C16).

There are 16 100G uplinks:
Ethernet13-20/1
Ethernet45-52/1

All other ports are breakout to 2 50G ports.

How I did it
Copied existing Arista-7260CX3-D108C8 HWSKU and altered the bcm.config and port_config.ini files.

How to verify it
The new 100G ports do come up with a 201811 image using this HWSKU.

Co-authored-by: Zhi Yuan (Carl) Zhao <zyzhao@arista.com>
2022-07-12 15:05:19 -07:00
Kevin Wang
7566032403 [Buffer] Separate buffer profile for Arista-7260CX3-Q64
Signed-off-by: Kevin Wang <shengkaiwang@microsoft.com>
2022-07-12 15:05:19 -07:00
Kevin Wang
a62af01e9e [Buffer] Separate buffer profile for Arista-7260CX3-D108C8
Signed-off-by: Kevin Wang <shengkaiwang@microsoft.com>
2022-07-12 15:05:19 -07:00
Kevin Wang
8b9f1fb65d [Buffer] Separate buffer profile for Arista-7260CX3-C64
50G data is not accurate, needs further update.

Signed-off-by: Kevin Wang <shengkaiwang@microsoft.com>
2022-07-12 15:05:19 -07:00
Kevin Wang
b929929f0c [Buffer] Separate buffer profile for Arista-7060CX-32S-C32
Signed-off-by: Kevin Wang <shengkaiwang@microsoft.com>
2022-07-12 15:05:19 -07:00
Kevin Wang
a0c38a9c81 [Buffer] Separate buffer profile for Arista-7060CX-32S-D48C8
Signed-off-by: Kevin Wang <shengkaiwang@microsoft.com>
2022-07-12 15:05:19 -07:00
Kevin Wang
634b4ee1f2 [Buffer] Separate buffer profile for Arista-7060CX-32S-Q32
Signed-off-by: Kevin Wang <shengkaiwang@microsoft.com>
2022-07-12 15:05:19 -07:00
Kevin Wang
b245ee7860 [Buffer] Separate buffer profile for Celestica-DX010-D48C8
Signed-off-by: Kevin Wang <shengkaiwang@microsoft.com>
2022-07-12 15:05:19 -07:00
Kevin Wang
0c94b72c2a [Buffer] Separate buffer profile for Force10-S6100
Signed-off-by: Kevin Wang <shengkaiwang@microsoft.com>
2022-07-12 15:05:19 -07:00
Ying Xie
9494b724e3 [buffer] create infrastructure to enable buffer/QoS profiles
Signed-off-by: Ying Xie <ying.xie@microsoft.com>
2022-07-12 15:05:19 -07:00
xumia
4000453af3
Upgrade openssh to 8.4p1-5+deb11u1 (#11408)
Why I did it
Cherry-pick: #11405
Fix the openssh build issue, upgrade from 8.4p1-5 to 8.4p1-5+deb11u1.

https://dev.azure.com/mssonic/build/_build/results?buildId=120209&view=logs&j=88ce9a53-729c-5fa9-7b6e-3d98f2488e3f&t=8d99be27-49d0-54d0-99b1-cfc0d47f0318

+ sudo dpkg --root=./fsroot-broadcom -i target/debs/bullseye/openssh-server_8.4p1-5_amd64.deb
dpkg: warning: downgrading openssh-server from 1:8.4p1-5+deb11u1 to 1:8.4p1-5
(Reading database ... 44818 files and directories currently installed.)
Preparing to unpack .../openssh-server_8.4p1-5_amd64.deb ...
Unpacking openssh-server (1:8.4p1-5) over (1:8.4p1-5+deb11u1) ...
dpkg: dependency problems prevent configuration of openssh-server:
 openssh-server depends on openssh-client (= 1:8.4p1-5); however:
  Version of openssh-client on system is 1:8.4p1-5+deb11u1.

dpkg: error processing package openssh-server (--install):
 dependency problems - leaving unconfigured
Errors were encountered while processing:
 openssh-server
+ clean_sys

How I did it
Upgrade openssh from 8.4p1-5 to 8.4p1-5+deb11u1.
2022-07-11 22:30:00 +08:00