7091 Commits

Author SHA1 Message Date
Liu Shilong
1ccad8f0b3
[ci] Remove duplicated code, which is to avoid docker hanging. (#14064)
Why I did it
Remove the duplicated code for the docker hanging issue.
This code is already included in the daemon step.

How I did it
Remove the duplicated code.
2023-03-08 09:49:33 -08:00
Marty Y. Lok
37b31c5916
[reboot-cause] Porting PR to fix a broken symlink of previous-reboot-cause file removal issue (sonic-host-services #46) (#14106)
Why I did it
Porting/cherry-pick PR sonic-net/sonic-host-services#46
"show reboot-cause history" shows an empty history. When previous-reboot-cause is a broken symlink, rebooting the system cannot generate a new symlink for the new previous-reboot-cause.

admin@sonic:~$ show reboot-cause history 
Name    Cause    Time    User    Comment
------  -------  ------  ------  ---------
How I did it
When the symlink file /host/reboot-cause/previous-reboot-cause is broken (its destination file doesn't exist in this case), the current condition check "if os.path.exists(PREVIOUS_REBOOT_CAUSE_FILE)" returns False in the determine-reboot-cause script. Hence, the current previous-reboot-cause is not removed and the recreation of the new previous-reboot-cause fails. To handle the case where previous-reboot-cause is a broken symlink, add the condition os.path.islink(PREVIOUS_REBOOT_CAUSE) to the check so that the remove operation can happen.
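
The difference between the two checks can be reproduced in isolation. The following is a minimal sketch, not the determine-reboot-cause script itself: the helper name remove_stale_link and the temporary file names are made up for illustration, and only os.path.exists, os.path.islink, os.path.lexists, os.symlink and os.remove are real standard-library calls.

```python
import os
import tempfile


def remove_stale_link(path):
    """Remove path if it exists or is a (possibly broken) symlink.

    Hypothetical helper: it mirrors the combined condition described
    in the commit message, not the actual script code.
    """
    if os.path.exists(path) or os.path.islink(path):
        os.remove(path)
        return True
    return False


with tempfile.TemporaryDirectory() as tmp:
    target = os.path.join(tmp, "reboot-cause.json")
    link = os.path.join(tmp, "previous-reboot-cause")

    open(target, "w").close()
    os.symlink(target, link)
    os.remove(target)  # the symlink now points at a missing file

    exists_says = os.path.exists(link)   # follows the link, so False
    islink_says = os.path.islink(link)   # inspects the link itself, so True
    removed = remove_stale_link(link)
    still_there = os.path.lexists(link)  # lexists does not follow links
```

With only the os.path.exists check, the broken link would be left in place and a later os.symlink to the same path would fail because the name is still taken.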

How to verify it
Manually make /host/reboot-cause/previous-reboot-cause a broken symlink by removing its destination file.
Reboot the system. "show reboot-cause history" should show the correct info.

Signed-off-by: mlok <marty.lok@nokia.com>
2023-03-08 09:48:49 -08:00
mssonicbld
75ad7b046a [submodule] Update submodule to the latest HEAD automatically 2023-03-08 20:55:48 +08:00
Dror Prital
c4e34452af
[Mellanox] Install python2 on syncd (#14151)
- Why I did it
Some scripts on syncd require Python2 support.

- How I did it
Add Python2 to syncd docker

- How to verify it
Run manually python scripts under Nvidia SDK debug to ensure they are working
2023-03-08 13:41:41 +02:00
Ying Xie
086c7f5871
[202205][linkmgrd][utilities][platform-daemon] advance submodule head (#14120)
linkmgrd:
* 046bdd0 2023-03-06 | [active-active] add state transition handler for (LinkProber: Unknown, MuxState: Active, LinkState: Down) (#179) (HEAD -> 202205) [Jing Zhang]
* 15ba715 2023-03-06 | loose link down swithcover condition (#178) [Jing Zhang]

utilities:
* 51d9c9f6 2023-03-06 | [warm/fast-reboot] Backup logs from tmpfs to disk during fast/warm shutdown (#2714) (HEAD -> 202205) [Vaibhav Hemant Dixit]
* 03aa77b3 2023-03-02 | [ci] Fix pipeline issue caused by sonic-slave-* change. (#2709) [Liu Shilong]
* 4bd7d4f1 2023-03-03 | [db_migrator] Add missing attribute 'weight' to route entries in APPL DB (#2691) [Vaibhav Hemant Dixit]
* 69a60397 2023-03-01 | removed duplicates and resolved conflicts (#2674) (github/202205) [kannankvs]

platform-daemon:
* 10bc119 2023-03-06 | [ycable] add changes for correcting telemetry values for 'active-active' (#341) (HEAD -> 202205) [vdahiya12]

Signed-off-by: Ying Xie <ying.xie@microsoft.com>
2023-03-07 18:59:12 -08:00
Volodymyr Samotiy
9fa28a2533
[202205] [Mellanox] Update SAI 2205.24.0.2 & SDK/FW 4.5.4206/2010.4204 (#14136)
Signed-off-by: Volodymyr Samotiy <volodymyrs@nvidia.com>
2023-03-07 16:48:55 -08:00
Marty Y. Lok
432c4f9222 [Chassis][multiasic] Fix the sonic-db-cli core files issue on multiasic platform after the c++ implementation of sonic-db-cli (#13207)
Fixes #12047. After the C++ implementation of sonic-db-cli, the sonic-db-cli PING command tries to initialize the global database for all instances when the database is starting. If the database-config.json files of all instances are not ready yet, it will crash and generate a core file. PR sonic-net/sonic-swss-common#701 only fixes the crash and the process abort.

Signed-off-by: mlok <marty.lok@nokia.com>
2023-03-07 14:39:25 +08:00
Ying Xie
5ac0bc3a0f
[202205][swss] update submodule head (#14110)
swss:
* 143cd44 2023-03-04 | Revert "[aclorch] Fixed issue #2204.Support IN_PORTS qualifer in MIRRORV6 table. (#2668)" (#2687) (HEAD -> 202205, github/202205) [StormLiangMS]
* 25812f8 2023-02-06 | [test_mux] add sleep in test_NH (#2648) [Nikola Dancejic]

Signed-off-by: Ying Xie <ying.xie@microsoft.com>
2023-03-07 05:37:47 +00:00
Yaqiang Zhu
9d2ec1dc56 [dhcp-relay] Add dhcp_relay show cli (#13614)
Why I did it
Currently the show and clear CLI of dhcp_relay may cause confusion.

How I did it
Add doc for it: [doc] Add docs for dhcp_relay show/clear cli sonic-utilities#2649
Add dhcp_relay config cli and test cases.
show dhcp_relay ipv4 helper
show dhcp_relay ipv6 destination
show dhcp_relay ipv6 counters
sonic-clear dhcp_relay ipv6 counters

How to verify it
Unit test all passed
2023-03-07 10:12:59 +08:00
xumia
850a044726 [Build] Support to use loosen version when failed to install python packages (#14013)
Why I did it
[Build] Support to use loosen version when failed to install python packages
It is to fix the issue #14012

How I did it
Try to use the installation command without constraint
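
The retry idea can be sketched as follows. This is an illustration under stated assumptions, not the build system's actual code: install_with_fallback, the run callable and the file name constraints.txt are hypothetical, and the only real CLI detail used is pip's -c (constraint file) option.

```python
def install_with_fallback(package, constraint_file, run):
    """Try a constrained install first, then retry without the constraint.

    `run` is a hypothetical callable that takes an argv list and returns
    an exit code; in a real build it could wrap subprocess.call.
    """
    constrained = ["pip", "install", "-c", constraint_file, package]
    if run(constrained) == 0:
        return constrained

    # The pinned version could not be installed; loosen the version.
    relaxed = ["pip", "install", package]
    if run(relaxed) == 0:
        return relaxed

    raise RuntimeError(f"could not install {package}")


def fake_run(argv):
    """Stand-in runner: the constrained command fails, the relaxed one succeeds."""
    return 1 if "-c" in argv else 0


chosen = install_with_fallback("example-pkg", "constraints.txt", fake_run)
```

Passing the runner in as a parameter keeps the sketch testable without touching the network or the local Python environment.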

How to verify it
2023-03-07 10:12:53 +08:00
Yakiv Huryk
1098f6ed4c
[Mellanox] update sdk/fw build procedure (#14104)
* sdk debs are now downloaded as Spectrum-SDK-Drivers-SONiC-Bins release
* sx kernel is downloaded as zip from Spectrum-SDK-Drivers

Signed-off-by: Yakiv Huryk <yhuryk@nvidia.com>
2023-03-06 10:33:06 -08:00
Zain Budhwani
a1f9ee5684 Remove dialout as critical process (#14006)
#### Why I did it

Remove dialout as a critical process, as it is no longer used in prod. As part of future work, dialout can be removed completely.

#### How I did it

Remove from critical process list
2023-03-06 18:37:19 +08:00
Ikki Zhu
3a8305d4b0 [Seastone] fix dx010 qsfp eeprom data write issue (#13930)
Why I did it
Platform cases test_tx_disable, test_tx_disable_channel, test_power_override failed in dx010.

How I did it
Add i2c access algorithm for CPLD i2c adapters.

How to verify it
Verify it with platform_tests/api/test_sfp.py::TestSfpApi test cases.
2023-03-06 16:39:26 +08:00
mssonicbld
cf5b888534
[ci/build]: Upgrade SONiC package versions (#14083) 2023-03-05 18:57:50 +08:00
mssonicbld
b40cccafa4
[ci/build]: Upgrade SONiC package versions (#14079) 2023-03-04 22:15:09 +08:00
mssonicbld
ecb528face
Fix VOQ_CHASSIS_V6_PEER route-map config (#14055) (#14074) 2023-03-04 05:08:02 +08:00
mssonicbld
bab3243d93
[Arista] Disable SSD NCQ on Lodoga (#13964) (#14073) 2023-03-04 04:41:19 +08:00
Tejaswini Chadaga
a0c127bb54
Update DNX SAI to version 7.1.36.4-1 (#14060) 2023-03-03 09:29:45 -08:00
kenneth-arista
812c1aeecf
sonic-buildimage Make changes to arista config.bcm files to support max cores (#13831) (#14033)
To support 64 cores on arista skus. Fixes aristanetworks/sonic#77
Remapped recycle ports to lower core port IDs and set appl_param_nof_ports_per_modid to 64.

Co-authored-by: Sambath Kumar Balasubramanian <63021927+skbarista@users.noreply.github.com>
2023-03-02 13:08:00 -08:00
xumia
ad41268dc2 [Build] Fix the docker image docker-dhcp-relay:latest not found issue (#13048)
Why I did it
It fixes the Broadcom build failure, which is caused by the build image docker-dhcp-relay:latest not being found.

2022-12-14T00:09:57.5464893Z [ FAIL LOG START ] [ target/docker-dhcp-relay.gz-load ]
2022-12-14T00:09:57.5466036Z Attempting docker image lock for docker-dhcp-relay load
2022-12-14T00:09:57.5467113Z Obtained docker image lock for docker-dhcp-relay load
2022-12-14T00:09:57.5468206Z Loading docker image target/docker-dhcp-relay.gz
2022-12-14T00:09:57.5469361Z Loaded image: docker-dhcp-relay:internal.65852159-11ad82a07a
2022-12-14T00:09:57.5470686Z Tagging docker image docker-dhcp-relay:latest as docker-dhcp-relay-sonic:latest
2022-12-14T00:09:57.5471997Z Error response from daemon: No such image: docker-dhcp-relay:latest
2022-12-14T00:09:57.5473122Z [  FAIL LOG END  ] [ target/docker-dhcp-relay.gz-load ]
2022-12-14T00:09:57.5539792Z make: *** [slave.mk:1180: target/docker-dhcp-relay.gz-load] Error 1
2022-12-14T00:09:57.5540958Z make: *** Waiting for unfinished jobs....
The image had been built successfully:

2022-12-13T17:01:59.9046935Z [ finished ] [ target/docker-eventd.gz ] 
2022-12-13T17:02:00.4947165Z [ building ] [ target/docker-dhcp-relay.gz ] 
2022-12-13T17:02:00.6688627Z /sonic/dockers/docker-dhcp-relay/cli-plugin-tests /sonic
2022-12-13T17:02:41.1123955Z /sonic
2022-12-13T17:07:04.1786069Z [ finished ] [ target/docker-dhcp-relay.gz ] 
But it was tagged by another value:

Obtained docker image lock for docker-dhcp-relay save
Tagging docker image docker-dhcp-relay-sonic:latest as docker-dhcp-relay:internal.65852159-11ad82a07a
Saving docker image docker-dhcp-relay:internal.65852159-11ad82a07a
Released docker image lock for docker-dhcp-relay save
Removing docker image docker-dhcp-relay-sonic:latest
Untagged: docker-dhcp-relay-sonic:latest
target/docker-dhcp-relay.gz
File /dpkg_cache/docker-dhcp-relay.gz-2ddfa01a109ca69b7621f1a-450bae36026d9dee62646f2.tgz saved in cache 
[ CACHE::SAVED ] /dpkg_cache/docker-dhcp-relay.gz-2ddfa01a109ca69b7621f1a-450bae36026d9dee62646f2.tgz
How I did it
When the feature SONIC_CONFIG_USE_NATIVE_DOCKERD_FOR_BUILD is not enabled, always save the image with the latest tag instead of the specific version.
The version is dynamic: it changes whenever a new commit is checked in, but the docker-dhcp-relay image does not necessarily change.
2023-03-02 16:39:25 +08:00
mssonicbld
ab6b3cde4e
Add QOS profiles for Arista SKUs (#13829) (#14040) 2023-03-02 14:51:41 +08:00
abdosi
9b2aa9591c
Added IP Table rule to allow eth1-midplane traffic for chassis (#13946)
What I did:
Added IP Table rule to make sure we do not drop chassis internal traffic on eth1-midplane when Control Plane ACLs are installed.

Why I did:
When Control Plane ACLs are installed, a default catch-all rule is added to drop all traffic that is not explicitly white-listed: https://github.com/sonic-net/sonic-host-services/blob/master/scripts/caclmgrd#L735. In this case, internal traffic between the Supervisor and LC will be dropped. To fix this, an explicit rule was added to allow all traffic coming from eth1-midplane.
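
Because iptables evaluates rules in order, the accept rule has to appear before the catch-all drop. The sketch below only illustrates that ordering; it is not the caclmgrd code. The function build_input_rules and the sample SSH rule are hypothetical, while -A, -i, -p, --dport and -j are standard iptables options.

```python
def build_input_rules(allowed_rules, midplane_iface="eth1-midplane"):
    """Return INPUT chain commands with the midplane accept rule first
    and the catch-all drop last (hypothetical helper for illustration)."""
    rules = [
        ["iptables", "-A", "INPUT", "-i", midplane_iface, "-j", "ACCEPT"],
    ]
    rules.extend(allowed_rules)
    # Catch-all: anything not accepted above is dropped.
    rules.append(["iptables", "-A", "INPUT", "-j", "DROP"])
    return rules


ssh_rule = ["iptables", "-A", "INPUT", "-p", "tcp", "--dport", "22", "-j", "ACCEPT"]
rules = build_input_rules([ssh_rule])
midplane_index = next(i for i, r in enumerate(rules) if "eth1-midplane" in r)
drop_index = next(i for i, r in enumerate(rules) if r[-1] == "DROP")
```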
2023-03-02 05:08:04 +00:00
Samuel Angebault
15916670d7
[202205][Arista] Update platform library submodules (#14039) 2023-03-01 17:43:56 -08:00
mssonicbld
9d6457a2ff
[ci/build]: Upgrade SONiC package versions (#14046) 2023-03-02 05:37:16 +08:00
Junchao-Mellanox
740ae962b5
[Mellanox] [202205] Add sodimm sensor (#13841) 2023-03-01 12:14:53 +02:00
xumia
b8ef3c07df
Bump lxml from 4.6.5 to 4.9.1 in /src/sonic-config-engine (#14011)
Why I did it
It is to fix the security alert CVE-2022-2309, see https://security-tracker.debian.org/tracker/CVE-2022-2309
The fix has already been merged in master; see details in PR #11366.

How I did it
Upgrade version to 4.9.1

How to verify it
2023-03-01 08:21:57 +00:00
Ying Xie
1f6456a601
[202205][utilities][swss-common][sairedis][platform-common][swss] advance submodule head (#14029)
utilities:
* a4f141f1 2023-01-10 | [sfputil] Firmware download/upgrade CLI support for QSFP-DD (#1947) (#2349) (HEAD -> 202205) [CliveNi]

swss-common:
* 41fcad8 2023-01-30 | Increase the netlink buffer size from 3MB to 16MB. (#739) (HEAD -> 202205) [KISHORE KUNAL]

sairedis:
* 5ce9990 2023-02-27 | [Dual-ToR] update sai.profile with SAI_ADDITIONAL_MAC_ENABLED attribute if corresponding arg passed to syncd (#1201) (HEAD -> 202205) [Andriy Yurkiv]
* 3c2e0c5 2023-02-23 | Use new value of STATE_DB FAST_REBOOT entry (#1196) [Aryeh Feigin]
* fe7756f 2023-02-28 | [submodule][SAI]Advance SAI head (#1210) (github/202205) [Richard.Yu]

platform-common:
* 321a8e7 2022-09-23 | Cdb fw upgrade (#308) (HEAD -> 202205) [CliveNi]

swss:
* ceea558 2023-02-28 | [orchagent]: Get bridge port ID from orchagent cache instead of SAI API (#2657) (HEAD -> 202205) [Lawrence Lee]
* bd04e24 2023-03-01 | [dualtor] Fix neighbor miss when mux is not ready (#2676) (HEAD -> 202205) [Longxiang Lyu]
* 7d87a90 2023-02-28 | [ci] Fix pipeline error about team5 not found. (#2684) [Liu Shilong]
* 93a924c 2023-02-27 | [aclorch] Fixed issue #2204.Support IN_PORTS qualifer in MIRRORV6 table. (#2668) [Rajkumar-Marvell]
* 9d87ec4 2023-02-23 | swss: Fix Invalid port oid messages generated because of voq counters. (#2653) [Sambath Kumar Balasubramanian]

Signed-off-by: Ying Xie <ying.xie@microsoft.com>
2023-03-01 06:24:34 +00:00
Saikrishna Arcot
052cf9a0a6
Don't create the members@ array in config_db for PC when reading from minigraph (#13660) (#14028)
Fixes #11873.

When loading from minigraph, for port channels, don't create the members@ array in config_db in the PORTCHANNEL table. This is no longer needed or used.

In addition, when adding a port channel member from the CLI, that member doesn't get added into the members@ array, resulting in a bit of inconsistency. This gets rid of that inconsistency.
2023-03-01 04:52:03 +00:00
Stephen Sun
76a5c75b82 [Mellanox] Advance hw-mgmt to v.7.0020.4104 (#13372)
- Why I did it
Advance hw-mgmt service to V.7.0020.4100
Add missing thermal sensors that are supported by hw-mgmt package
Delay system health service until hw-mgmt has started on the Mellanox platform in order to avoid reading some sensors before they are ready.
Depends on sonic-net/sonic-linux-kernel#305

- How I did it
1. Update hw mgmt version
2. Add missing sensors
3. Delay service 

- How to verify it
Regression test.

Signed-off-by: Stephen Sun <stephens@nvidia.com>
2023-03-01 12:38:50 +08:00
mssonicbld
73f572948f
[netlink] Increase netlink buffer size from 3MB to 16MB (#13965) (#14027) 2023-03-01 11:47:29 +08:00
spilkey-cisco
b2e124cf00
Add asic presence filtering for container checking in system-health (#13497) (#13966)
Why I did it
On a supervisor card in a chassis, syncd/teamd/swss/lldp etc dockers are created for each Switch Fabric card. However, not all chassis would have all the switch fabric cards present. In this case, only dockers for Switch Fabrics present would be created.

system-health indicates errors in this scenario as it is expecting dockers for all Switch Fabrics (based on NUM_ASIC defined in asic.conf file).

system-health process error messages were also altered to indicate which container had the issue; multiple containers may run processes with the same name, which can result in identical system-health error messages, causing ambiguity.

How I did it
Port container_checker logic from #11442 into service_checker for system-health.
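
The filtering described above can be sketched as follows. This is a simplified illustration, not the service_checker implementation: the function names are hypothetical, and it assumes per-ASIC containers are named by appending the ASIC index to a base name (for example syncd0).

```python
def expected_containers(base_names, num_asics, present_asics):
    """Per-ASIC container names, limited to ASICs that are present.

    Assumption: a container for ASIC n is named '<base><n>'.
    """
    present = set(present_asics)
    return {
        f"{name}{asic}"
        for name in base_names
        for asic in range(num_asics)
        if asic in present
    }


def missing_containers(base_names, num_asics, present_asics, running):
    """Expected containers that are not in the running list, sorted."""
    expected = expected_containers(base_names, num_asics, present_asics)
    return sorted(expected - set(running))


# Fabric card 1 is absent, so no containers were created for it.
running = ["syncd0", "swss0", "syncd2", "swss2"]
unfiltered = missing_containers(["syncd", "swss"], 3, range(3), running)
filtered = missing_containers(["syncd", "swss"], 3, [0, 2], running)
```

Without the presence filter, the check expects containers for all NUM_ASIC instances and reports the ones for the absent card as missing; with the filter, nothing is reported.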

How to verify it
Bring up a Supervisor card with one or more missing fabric cards. Execute 'show system-health summary'. The command should not report failure due to missing dockers for the asics on the fabric cards which are not present.
2023-02-28 15:40:21 -08:00
Arvindsrinivasan Lakshmi Narasimhan
3e96341049
400 to 100 speed change (#14024)
Signed-off-by: Arvindsrinivasan Lakshmi Narasimhan <arlakshm@microsoft.com>
2023-02-28 14:19:35 -08:00
jhli-cisco
a100e250f6
Update cisco-8000.ini (#13794)
Why I did it
Code drop with incremental fixes for Cisco 8808 and Cisco 8101-32FH-O platforms
Support for ordered ECMP capability
Support for BFD Serviceability CLI
Support for VxLAN Serviceability CLI
Support for 200G Fabric port speed for Fabric Card 8808-FC0 and Line Card 88-LC0-36FH-M/88-LC0-36FH-MO
PFC support for Q200 ASIC based line cards
Support for Single-TOR and Dual-TOR on Cisco 8102-64H-O Platform

How I did it
update cisco 8000 tag

How to verify it
2023-02-27 14:41:25 +08:00
mssonicbld
7f4afadd1a
[ci/build]: Upgrade SONiC package versions (#13995) 2023-02-26 20:02:25 +08:00
mssonicbld
3170ef9b50
[ci/build]: Upgrade SONiC package versions (#13991) 2023-02-25 20:14:59 +08:00
mssonicbld
668774db9f
[ci/build]: Upgrade SONiC package versions (#13975) 2023-02-24 17:32:54 +08:00
Ying Xie
7d49a38e05
[202205][linkmgrd] advance submodule head (#13968)
linkmgrd:
* a2e8391 2023-02-23 | [active-active] link operational down didn't trigger toggle to standby if `MuxUnknown` event arrives first.  (#175) (HEAD -> 202205) [Jing Zhang]

Signed-off-by: Ying Xie <ying.xie@microsoft.com>
2023-02-24 04:14:40 +00:00
Ying Xie
ade145f58a
[202205][linkmgrd] advance submodule head (#13939)
linkmgrd:
* d2227d8 2023-02-22 | [active-standby] Toggle to standby if link down and config auto (#173) (HEAD -> 202205) [Longxiang Lyu]

Signed-off-by: Ying Xie <ying.xie@microsoft.com>
2023-02-23 11:38:09 -08:00
Dror Prital
d0d59e3f51
[202205][submodule] Advance sonic-platform-common pointer (#13948)
Update sonic-platform-common submodule pointer to include the following:
* d353e7f Update host electrical interface for 2x100G AOC ([#346](https://github.com/sonic-net/sonic-platform-common/pull/346))

Signed-off-by: dprital <drorp@nvidia.com>
2023-02-23 15:39:40 +02:00
Dror Prital
72b57c0913
[202205][submodule] Advance sonic-platform-daemons pointer (#13947)
Update sonic-platform-daemons submodule pointer to include the following:
* 7ee7805 Update CMIS module types for 2x100G AOC support ([#339](https://github.com/sonic-net/sonic-platform-daemons/pull/339))

Signed-off-by: dprital <drorp@nvidia.com>
2023-02-23 15:39:32 +02:00
Dror Prital
cf5a145a69
[202205][submodule] Advance sonic-swss pointer (#13949)
Update sonic-swss submodule pointer to include the following:
* 3aeb4be Align watermark flow with port configuration ([#2672](https://github.com/sonic-net/sonic-swss/pull/2672))

Signed-off-by: dprital <drorp@nvidia.com>
2023-02-23 15:02:46 +02:00
Liu Shilong
b23f79f463
[ci] Fix docker hang issue and change template reference branch (#13894) (#13921)
Why I did it
Cherry pick PR(#13894)
Azure pipeline change.
Use common template to make it easy to change common steps. Fix docker hang issue.

How I did it
How to verify
2023-02-23 14:19:56 +08:00
Stepan Blyshchak
0095ad57f2
[systemd-sonic-generator] Fix overlapping strings being passed to strcpy/strcat (#13889)
Why I did it
Fix an issue that services do not start automatically on first boot and start only after hostcfgd enables them.
This is due to a bug in systemd-sonic-generator:

Signed-off-by: Stepan Blyschak <stepanb@nvidia.com>
2023-02-22 17:04:57 -08:00
mathieulaunay
bda91a19e3
build: add an env var to run make reset unattended (#13820)
- previously "make reset" was expecting user input from the terminal
    to do its job
  - setting UNATTENDED to any non-zero string will allow "make reset" to
    run without interactive confirmation

Signed-off-by: Mathieu Launay <m.launay@criteo.com>
2023-02-23 00:42:51 +00:00
mssonicbld
35a4410f86
[Ci] Support to use the same snapshot for all platform builds (#13913) (#13938) 2023-02-23 08:04:43 +08:00
Tejaswini Chadaga
4345a556eb
Update DNX BRCM SAI version to 7.1.35.4 (#13907) 2023-02-22 14:43:39 -08:00
mssonicbld
e842241f71
add psu fans status led available config (#13926) (#13936) 2023-02-23 06:29:38 +08:00
Stepan Blyshchak
70e2ea1e87
[202205][Mellanox] Place FW binaries under platform directory instead of squashfs (#13838)
Fixes #13568
Backport of #13837

Upgrade from old image always requires squashfs mount to get the next image FW binary. This can be avoided if we put FW binary under platform directory which is easily accessible after installation:

admin@r-spider-05:~$ ls /host/image-fw-new-loc.0-dirty-20230208.193534/platform/fw-SPC.mfa
/host/image-fw-new-loc.0-dirty-20230208.193534/platform/fw-SPC.mfa
admin@r-spider-05:~$ ls -al /tmp/image-fw-new-loc.0-dirty-20230208.193534-fs/etc/mlnx/fw-SPC.mfa
lrwxrwxrwx 1 root root 66 Feb  8 17:57 /tmp/image-fw-new-loc.0-dirty-20230208.193534-fs/etc/mlnx/fw-SPC.mfa -> /host/image-fw-new-loc.0-dirty-20230208.193534/platform/fw-SPC.mfa

- Why I did it
202211 and above use a different squashfs compression type that the 201911 kernel cannot handle. Therefore, we avoid mounting squashfs altogether with this change.

- How I did it
Place the FW binary under /host/image-/platform/mlnx/; soft links in /etc/mlnx are created to avoid breaking existing scripts/automation.
/etc/mlnx/fw-SPCX.mfa is a soft link that always points to the FW that should be used in the current image.
mlnx-fw-upgrade.sh is updated to prefer the /host/image-/platform/mlnx location and fall back to /etc/mlnx in squashfs in case the new location does not exist. This is necessary for image downgrade.
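
The prefer-then-fall-back lookup can be illustrated with a short sketch. The real logic lives in the shell script mlnx-fw-upgrade.sh; the Python function pick_fw_path and the temporary directory layout below are hypothetical and only show the selection order.

```python
import os
import tempfile


def pick_fw_path(platform_dir, squashfs_root, fw_name):
    """Prefer the FW binary in the platform directory; otherwise fall
    back to etc/mlnx under the mounted squashfs root (hypothetical helper)."""
    preferred = os.path.join(platform_dir, fw_name)
    if os.path.exists(preferred):
        return preferred
    return os.path.join(squashfs_root, "etc", "mlnx", fw_name)


with tempfile.TemporaryDirectory() as tmp:
    new_dir = os.path.join(tmp, "platform", "mlnx")
    old_root = os.path.join(tmp, "fs")
    os.makedirs(new_dir)

    # Older image: nothing in the new location yet, so the fallback is used.
    before = pick_fw_path(new_dir, old_root, "fw-SPC.mfa")
    before_is_fallback = before == os.path.join(old_root, "etc", "mlnx", "fw-SPC.mfa")

    # Newer image: the binary exists in the platform directory.
    open(os.path.join(new_dir, "fw-SPC.mfa"), "w").close()
    after = pick_fw_path(new_dir, old_root, "fw-SPC.mfa")
    after_is_preferred = after == os.path.join(new_dir, "fw-SPC.mfa")
```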

- How to verify it
Upgrade from 201911 to 202205
202205 to 201911 downgrade
202205 -> 202205 reboot
ONIE -> 202205 boot (First FW burn)

Signed-off-by: Stepan Blyschak <stepanb@nvidia.com>
2023-02-22 17:40:50 +02:00
Judy Joseph
3c81038bc2 To fix the test failure when porting PR to 202205 2023-02-22 16:37:36 +08:00
judyjoseph
bc8b34c49a Voq Chassis: Add the Recirc ports to the INTERFACES table to make it routed intf (#13779)
* VOQ: Add the Recirc ports to the INTERFACES table to make it routed intf

* Add a test to cover Recirc port generation in INTERFACE table
2023-02-22 16:37:36 +08:00