3928 Commits

Author SHA1 Message Date
Liu Shilong
ef1dbdd0d5
[build] Use public storage for public resources. (#18038) (#18205)
* [build] Use public storage for public resources. (#18038)

* fix

* fix
2024-02-29 08:49:18 -08:00
Stepan Blyshchak
43c82ebfb0
[201911][nvidia] Fix broken FW links (#16721)
Signed-off-by: Stepan Blyschak <stepanb@nvidia.com>
2023-11-20 10:09:02 -08:00
Hua Liu
185c6544ec
Update sonic-snmpagent submodule (#17014)
8b9cab7 2023-10-26 [201911] Fix IfHighSpeed UT issue on 201911 (#299) 
622b771 2023-10-13 | Fix backup port rfc2863 UT to 202012 branch issue (#298) [Hua Liu]
fa94798 2023-10-11 | Add ifhighspeed UT (#296) [Hua Liu]
41789ca 2023-09-14 | Support interface speed for PortChannels (#262) [Lukas Stockner]
2023-11-02 22:19:54 -07:00
Feng-msft
a5043bfc84
Fix monit false alarm issue, which locates in process_checker and it (#16907)
Fix a monit false alarm in process_checker, which missed the "disk-sleep" status check, so some 201911 SONiC boxes occasionally report a "pmon|sensord" error.

#### Why I did it
Currently the psutil library returns the following process statuses:
running: The process is currently running.
sleeping: The process is sleeping or waiting for an event to occur.
disk-sleep: The process is waiting for I/O operations to complete.
stopped: The process has been stopped (e.g. via the SIGSTOP signal).
zombie: The process has terminated but is still listed in the process table.
dead: The process has terminated and has been removed from the process table.

We should regard running/sleeping/disk-sleep as the normal case and not raise an alert from the monit process check.
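
The classification described above can be sketched as follows. This is a minimal illustration that assumes the status arrives as one of the psutil status strings listed above; the names `HEALTHY_STATUSES` and `is_process_alive` are hypothetical and are not taken from process_checker.

```python
# Statuses that should count as "the process is up", per the list above.
# The set and function names are illustrative, not the script's own.
HEALTHY_STATUSES = frozenset({"running", "sleeping", "disk-sleep"})


def is_process_alive(status):
    """Return True when a psutil-style status string means the process is healthy."""
    return status in HEALTHY_STATUSES


if __name__ == "__main__":
    for status in ("running", "sleeping", "disk-sleep", "stopped", "zombie", "dead"):
        print(status, is_process_alive(status))
```

Keeping the accepted statuses in one set makes it easy to see that "disk-sleep" is treated the same way as "running" and "sleeping".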

Currently, when disk-sleep occurs during a monit cycle, the syslog messages below are raised and paged, so this change also gets rid of that syslog output.

syslog.2.gz:Feb 24 06:12:17.394619 MEL23-0101-0301-04T1 ERR monit[6040]: 'pmon|sensord' status failed (1) -- '/usr/sbin/sensord -f daemon' is not running in host
syslog.2.gz:Feb 24 06:13:17.932531 MEL23-0101-0301-04T1 ERR monit[6040]: 'pmon|sensord' status failed (1) -- '/usr/sbin/sensord -f daemon' is not running in host
syslog.2.gz:Feb 24 06:14:18.502505 MEL23-0101-0301-04T1 ERR monit[6040]: 'pmon|sensord' status failed (1) -- '/usr/sbin/sensord -f daemon' is not running in host

Then I tried to reproduce the issue by triggering process_checker for sensord frequently and observed it's under "disk-sleep" status once the alert is raised.

##### Work item tracking
- Microsoft ADO **(number only)**:17663589

#### How I did it
Fix the process_checker script by adding handling for the "disk-sleep" status.

#### How to verify it
Verified in local DUT.
2023-10-26 18:23:24 -07:00
Liping Xu
36e65035ba
add check connection between zebra and bgp (#16675)
Why I did it
Back port #6478 and #6519 to 201911 branch.

Work item tracking
Microsoft ADO (number only):
24978836
How I did it
Add checking the connection between zebra and bgp during bgpd start.
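
A readiness check of this kind can be sketched as a polling loop with a timeout. The sketch below assumes the daemon is reachable over TCP; the function name `wait_for_listener`, the host and port arguments, and the default timeout are all assumptions for illustration, and the actual start script may use a different socket type or tool.

```python
import socket
import time


def wait_for_listener(host, port, timeout=30.0, interval=0.5):
    """Poll until something accepts TCP connections on (host, port).

    Returns the number of seconds waited, or None if the timeout expired.
    """
    start = time.monotonic()
    while time.monotonic() - start < timeout:
        try:
            # A successful connect means the peer is ready to accept connections.
            with socket.create_connection((host, port), timeout=interval):
                return time.monotonic() - start
        except OSError:
            # Not listening yet (or unreachable): wait briefly and retry.
            time.sleep(interval)
    return None
```

A caller would start the dependent daemon either way once the function returns, logging the elapsed time on success or an error when the result is None.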

How to verify it
Modify start.h, add debug log and check the syslog

  Sep 22 02:41:29.716356 str-a7060cx-acs-10 INFO bgp#root: ####: start zebra
  Sep 22 02:41:30.815341 str-a7060cx-acs-10 INFO bgp#root: ####: start check connection
  Sep 22 02:41:30.868784 str-a7060cx-acs-10 INFO bgp#root: ####: It took 0.029979 seconds to wait for zebra to be ready to accept connections
  Sep 22 02:41:30.873685 str-a7060cx-acs-10 INFO bgp#root: ####: start bgpd
  Sep 22 02:41:35.270569 str-a7060cx-acs-10 INFO bgp#root: ####: done

  Sep 22 03:28:02.423438 str-a7060cx-acs-10 INFO bgp#root: ####: start zebra
  Sep 22 03:28:03.731320 str-a7060cx-acs-10 INFO bgp#root: ####: start check connection
  Sep 22 03:28:33.749152 str-a7060cx-acs-10 INFO bgp#root: ####: Error: zebra is not ready to accept connections
  Sep 22 03:28:33.752490 str-a7060cx-acs-10 INFO bgp#root: ####: start bgpd
  Sep 22 03:28:34.259735 str-a7060cx-acs-10 INFO bgp#root: ####: start bgpd done
  Sep 22 03:28:34.755538 str-a7060cx-acs-10 INFO bgp#root: ####: start bgpcfgd
  Sep 22 03:28:35.800906 str-a7060cx-acs-10 INFO bgp#root: ####: done
2023-09-25 23:29:40 +08:00
Tejaswini Chadaga
42597806a9
[201911][multi-asic] Monit changes to enable internal link monitoring script (#16393)
Monit changes to enable script to monitor SAI_PORT_STAT_IF_IN_ERRORS & SAI_PORT_STAT_IF_OUT_ERRORS on internal (backend) ports of multi-asic device.
2023-09-12 15:57:13 -07:00
Tejaswini Chadaga
71084e7b47
[201911][sonic-utilities] Submodule Update (#16487)
ef2a0cd0 [201911] [multi_asic] Script to monitor errors on internal links (#2971)
1252e31b Changes to separate UT data for internal link monitor (#2976)
3e6654e [201911] [multi-asic] Unit test fix for internal link monitoring (#2977)
2023-09-12 15:54:25 -07:00
Liu Shilong
51d6b2c33d
Fix faketime package downloading issue. (#16263)
Why I did it
Fix: #16086
faketime package url expired. It breaks 201911 build.
Update package url.

Work item tracking
Microsoft ADO (number only): 24930879
2023-08-25 07:40:35 +08:00
Stepan Blyshchak
fd153e0584
[sonic-cfggen] store jinja2 cache in log level db. (#5646)
This PR makes two changes:
    - Store Jinja2 cache in LOGLEVEL DB instead of STATE DB
    - Store bytecode cache encoded in base64

Tested with the following command: "redis-dump -d 3 -k JINJA2_CACHE"
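
The second change, storing binary bytecode as base64 text, can be illustrated with a small sketch. A plain dictionary stands in for the database here, and the class name `DictBackedBytecodeStore` and its methods are hypothetical; they are not the sonic-cfggen implementation.

```python
import base64


class DictBackedBytecodeStore:
    """Stores binary blobs as base64 text in a string-only key/value store."""

    def __init__(self):
        # Stand-in for a database that is expected to hold text values.
        self._db = {}

    def save(self, key, bytecode):
        # Encode raw bytes into ASCII-safe text before writing.
        self._db[key] = base64.b64encode(bytecode).decode("ascii")

    def load(self, key):
        # Decode back to the original bytes; None signals a cache miss.
        encoded = self._db.get(key)
        return None if encoded is None else base64.b64decode(encoded)
```

Encoding to base64 avoids problems with arbitrary byte values when the stored value is read back as a string.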

Signed-off-by: Stepan Blyschak <stepanb@nvidia.com>
2023-07-14 00:06:47 +00:00
Stepan Blyshchak
73647be598
[201911][mlnx-ffb.sh] Update issu-version location (#14928)
ISSU version check fails due to inability to mount squashfs from 202211 on 201911
2023-06-21 11:00:21 -07:00
xumia
7aeb5d46ce
[Build][201911] Fix the stretch/jessie mirror removed issue (#15083)
[Build] Fix the stretch/jessie mirror removed issue.
2023-05-17 22:52:26 -07:00
abdosi
9d8c082415
Updated Broadcom SAI3.7 Debian package. (#14689)
Upgrade BRCM SAI to Debian package SAI 3.7.6.1-3.
2023-04-17 14:39:25 -07:00
Stepan Blyshchak
16cce80e20
[201911][Mellanox] Place FW binaries under platform directory instead of squashfs (#14270)
202211 and above use a different squashfs compression type that the 201911 kernel cannot handle. Therefore, this change avoids mounting squashfs altogether.
2023-04-10 19:00:12 -07:00
Hua Liu
e5c6c2f9b9
Improve sudo cat command for RO user. (#14428) (#14437)
Improve sudo cat command for RO user.
Manually cherry-pick for #14428
2023-04-05 15:29:58 -07:00
Samuel Angebault
a13d460bbb
[201911][Arista] Disable ATA NCQ for a few products (#14468)
Why I did it
Some products might experience an occasional I/O failure in the communication between the CPU and the SSD.
Based on some research, this could be attributable to some devices not handling ATA NCQ (Native Command Queuing).

This issue currently affects the following products:

DCS-7170-32C*
DCS-7170-64C
DCS-7060DX4-32
DCS-7260CX3-64
DCS-7050CX3-32S

How I did it
This change disables NCQ on the affected drive for a small set of products.

How to verify it
When the fix is applied, these 2 patterns can be found in the dmesg.
ata[0-9]+.00: FORCE: horkage modified (noncq)
NCQ (not used)
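
The verification above can be automated with a short check over the dmesg text. This is a sketch only: the function name `ncq_disabled` is hypothetical, and the literal dot in the first pattern is escaped here, which is slightly stricter than the pattern as written above.

```python
import re

# The two patterns from the verification step, written as regular expressions.
NONCQ_FORCED = re.compile(r"ata[0-9]+\.00: FORCE: horkage modified \(noncq\)")
NCQ_UNUSED = re.compile(r"NCQ \(not used\)")


def ncq_disabled(dmesg_text):
    """Return True when both patterns appear somewhere in the dmesg text."""
    return bool(NONCQ_FORCED.search(dmesg_text)) and bool(NCQ_UNUSED.search(dmesg_text))
```

On a device, the input could come from `subprocess.run(["dmesg"], capture_output=True, text=True).stdout`.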

Test results using: fio --direct=1 --rw=randrw --bs=64k --ioengine=libaio --iodepth=64 --runtime=120 --numjobs=4

with NCQ (ata1.00: 61865984 sectors, multi 1: LBA48 NCQ (depth 32), AA)

   READ: bw=33.9MiB/s (35.6MB/s), 33.9MiB/s-33.9MiB/s (35.6MB/s-35.6MB/s), io=4073MiB (4270MB), run=120078-120078msec
  WRITE: bw=34.1MiB/s (35.8MB/s), 34.1MiB/s-34.1MiB/s (35.8MB/s-35.8MB/s), io=4100MiB (4300MB), run=120078-120078msec
without NCQ (ata1.00: 61865984 sectors, multi 1: LBA48 NCQ (not used))

   READ: bw=31.7MiB/s (33.3MB/s), 31.7MiB/s-31.7MiB/s (33.3MB/s-33.3MB/s), io=3808MiB (3993MB), run=120083-120083msec
  WRITE: bw=31.9MiB/s (33.4MB/s), 31.9MiB/s-31.9MiB/s (33.4MB/s-33.4MB/s), io=3830MiB (4016MB), run=120083-120083msec
2023-04-02 14:03:21 -07:00
xumia
5db2dc558c
[Build][201911] Fix the jessie mirror removed issue (#14476)
Change to use the snapshot mirror http://packages.trafficmanager.net/snapshot.

Warning: The Jessie distribution is EOL; please avoid using it if you can. The snapshot mirror will also be removed in the near future.
2023-03-31 10:16:40 -07:00
jingwenxie
9b3d8b7f81
[201911][sonic-utilities] submodule update (#13844)
2023-03-06 10:47:11 -08:00
Liu Shilong
ef0c6f34ba
[build] Fix issues caused by docker.com gpg key update. (#14063)
Why I did it
docker.com's gpg key took effect on 2023-02-23, while debian.org's gpg key expired in 2022-11.
We used a workaround for the security check of the Debian gpg keys. Now we need to exclude docker.com's gpg key from it.

How I did it
Update docker.com's gpg key without faketime.
Update others' gpg key with faketime '2022-11'

How to verify it
2023-03-06 10:18:29 +08:00
Prince Sunny
fb0751bc84
[201911] Fix a typo (#14050)
*Fix a typo introduced as part of #13403
2023-03-03 10:23:08 -08:00
Joe LeVeque
56e00d666f
[build_debian.sh] Configure sshd to listen for IPv6 connections (#7719)
#### Why I did it

To allow SSH connections from IPv6 addresses

Resolves https://github.com/Azure/sonic-buildimage/issues/7668

#### How I did it

In build_debian.sh, modify sshd_config file so as to enable listening for IPv6 connections
2023-02-15 21:33:25 +00:00
Nazarii Hnydyn
29ef4ee83e
[201911] [Mellanox] Add BIOS upgrade infra (#13623)
- Why I did it
Added BIOS upgrade infra

- How I did it
Added new make target

- How to verify it
Copy msn3800_bios.tar.gz to platform/mellanox/bios
make configure PLATFORM=mellanox
make target/files/stretch/msn3800_bios.tar.gz

Signed-off-by: Nazarii Hnydyn <nazariig@nvidia.com>
2023-02-13 19:49:23 +02:00
Prince Sunny
5bd7761481
[201911] Create Vxlan and Vnet default configs (#13403)
* Create Vxlan and Vnet default configs
2023-02-01 18:36:29 -08:00
Qi Luo
9ce9ba4fb7
[201911] Fix tagged VlanInterface if attached to multiple vlan as untagged member (#13534)
Backport https://github.com/sonic-net/sonic-buildimage/pull/8927
2023-01-30 15:49:02 -08:00
Saikrishna Arcot
bc615c0689
Fix build break for jessie apt key expiration. (#13328)
#### Why I did it

The GPG key used for Jessie's official repos has since expired, which means building 201911 images no longer works.

#### How I did it

Fake the time to be before the expiry date.
2023-01-30 12:04:05 -08:00
Devesh Pathak
4dba276094
Fix to improve hostname handling (#12064)
* Fix to improve hostname handling
If config_db.json is missing hostname entry, hostname-config.sh ends
up deleting existing entry too and hostname changes to default 'localhost'

* default hostname to 'sonic' if missing in config file
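
The fallback behavior can be sketched as below. The key path `DEVICE_METADATA` / `localhost` / `hostname` and the function name `hostname_from_config` are assumptions made for illustration; hostname-config.sh itself is a shell script and may read the value differently.

```python
import json

DEFAULT_HOSTNAME = "sonic"


def hostname_from_config(config_text):
    """Return the configured hostname, or the default when the entry is missing."""
    config = json.loads(config_text)
    # Assumed layout: {"DEVICE_METADATA": {"localhost": {"hostname": "..."}}}
    hostname = config.get("DEVICE_METADATA", {}).get("localhost", {}).get("hostname")
    # Treat an absent or empty value as missing and fall back to the default.
    return hostname if hostname else DEFAULT_HOSTNAME
```

Falling back to a fixed default keeps a missing entry from turning into an empty or deleted hostname.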
2023-01-30 18:39:09 +00:00
Prince George
cd2bb08545
Close console session due to user inactivity (#9890)
Signed-off-by: Prince George <prgeor@microsoft.com>
2023-01-30 18:36:11 +00:00
Sudharsan Dhamal Gopalarathnam
2a8153a0b2
[201911][mellanox]Fix CPLD upgrade script (#13240)
Modified the skip check to be "greater than or equal to" instead of the previous "equal to".
2023-01-06 11:54:06 -08:00
kellyyeh
757130c027
[201911] Add dhcp6relay as dhcprelay submodule (#12052)
2022-11-07 15:27:52 -08:00
nicwu-cel
7769540f42
[Celestica] Add Celestica Silverstone-X platform deb dependency files (#12158)
* Add Celestica Silverstone-X platform deb dependency files
* Optimized Celestica Silverstone-X platform deb dependency files indentation
2022-11-07 09:14:41 +08:00
Hua Liu
90c9811ab2
[201911] [Systemd] Upgrade systemd to fix timer elapsed issue (#12485)
Upgrade systemd to fix timer elapsed issue.

#### Why I did it
On the 201911 release, snmp.timer enters the elapsed state and snmp.service is never triggered by snmp.timer:

● snmp.service - SNMP container
Loaded: loaded (/usr/lib/systemd/system/snmp.service; static; vendor preset: enabled)
Active: inactive (dead)
● snmp.timer - Delays snmp container until SONiC has started
Loaded: loaded (/usr/lib/systemd/system/snmp.timer; enabled; vendor preset: enabled)
Active: active (elapsed) since Wed 2022-08-03 18:12:59 UTC; 2 months 17 days ago

This issue is caused by a systemd bug: https://github.com/systemd/systemd/pull/10778/files

This issue can be reproduced with the following steps:
1. Reboot the system.
2. Continuously run the following commands until the timer elapses:
systemctl status snmp.timer
sudo systemctl daemon-reload

#### How I did it
Install the latest systemd version from the official backport source.

#### How to verify it
Passed all test cases.
Manually ran the reproduction steps and verified the issue is fixed.

#### Description for the changelog
Upgrade systemd to fix timer elapsed issue.
2022-10-31 10:25:54 +08:00
ganglv
f9dddfb106
[cherry-pick][201911] Fix dhcp option buffer issue (#12520)
Why I did it
#12033

How I did it
How to verify it
2022-10-28 14:27:47 +08:00
arheneus@marvell.com
fc1295bdcc
[ntp][apparmor] Allow apparmor read permission for ntpd under rw mount path of rootfs (#6040)
Certain platform-specific packages (sonic-platform-xyz) install files onto the rootfs, which places them on the read-write mount path under /host/image-name/rw/...
When ntpd starts, it tries to read /usr/bin, /usr/sbin/, and /usr/local/bin, which in turn link to the read-write mount path as well.
As a result, ntpd triggers the AppArmor warning messages below.

LOG:-
audit: type=1400 audit(1606226503.240:21): apparmor="DENIED" operation="open" profile="/usr/sbin/ntpd" name="/image-HEAD-dirty-20201111.173951/rw/usr/local/bin/" pid=3733 comm="ntpd" requested_mask="r" denied_mask="r" fsuid=0 ouid=0
audit: type=1400 audit(1606226503.240:22): apparmor="DENIED" operation="open" profile="/usr/sbin/ntpd" name="/image-HEAD-dirty-20201111.173951/rw/usr/sbin/" pid=3733 comm="ntpd" requested_mask="r" denied_mask="r" fsuid=0 ouid=0
audit: type=1400 audit(1606226503.240:23): apparmor="DENIED" operation="open" profile="/usr/sbin/ntpd" name="/image-HEAD-dirty-20201111.173951/rw/usr/bin/" pid=3733 comm="ntpd" requested_mask="r" denied_mask="r" fsuid=0 ouid=0

Fix:
Add rw/.. mount path similar to root path access provided for ntpd in /etc/apparmor.d/usr.sbin.ntpd

Signed-off-by: Antony Rheneus <arheneus@marvell.com>
2022-10-16 05:42:35 +00:00
Sudharsan Dhamal Gopalarathnam
434ce42b2c
[201911][Mellanox] Adding SKU Mellanox-SN2700-D44C10 (#12395)
To add new SKU Mellanox-SN2700-D44C10 with following requirements:
2022-10-15 11:15:54 -07:00
xumia
37fa1014ad
[201911] Change submodule path from Azure to sonic-net (#12313)
Why I did it
Change the path of sonic submodules that point to "Azure" to point to "sonic-net"

How I did it
Replace "Azure" with "sonic-net" on all relevant paths of sonic submodules
2022-10-12 21:07:22 +08:00
Sudharsan Dhamal Gopalarathnam
43096c58b6
[201911][mellanox] Extend Mellanox FW utils with CPLD update (#12172)
Porting https://github.com/sonic-net/sonic-buildimage/pull/3723 to 201911

#### Why I did it
Extend Mellanox FW utils with CPLD update feature
Added support for CPLD upgrade to Mellanox FW utility


#### How I did it
Updated Mellanox FW utility

#### How to verify it
mlnx-fw-upgrade.sh --upgrade --cpld # Regular CPLD update flow
UPDATE_MLNX_CPLD_FW=1 mlnx-fw-upgrade.sh --upgrade # Force CPLD refresh only
2022-09-26 09:10:08 -07:00
Sudharsan Dhamal Gopalarathnam
e716b453b3
[201911][mellanox] Add CPLD update for SN2700 (#12173)
* [mellanox]: Add CPLD update for SN2700 (#3570)

* [mellanox]: Add CPLD update for SN2700.

Signed-off-by: Nazarii Hnydyn <nazariig@mellanox.com>

* Updating cpld file

* Updating file path for cpld

* Updating archive

Signed-off-by: Nazarii Hnydyn <nazariig@mellanox.com>
Co-authored-by: Nazarii Hnydyn <nazariig@mellanox.com>
2022-09-26 07:58:28 -07:00
abdosi
69f18cfdbb
Cherry-pick the commit from master where in multi-asic platforms bgp (#12081)
Cherry-pick the commit from master for the case where, on multi-asic platforms, BGP template rendering fails because it needs the Loopback4096 IP address. The issue is caused by a timing/race condition in which the peer gets added first and the Loopback4096 notification reaches bgpcfgd afterwards.
2022-09-15 10:12:57 -07:00
Moshe Moshe
8c302e6217
[201911][Mellanox] Update FW to version 2008.3388 (#11978)
* [201911][Mellanox] Update FW to version 2008_3388
2022-09-12 10:26:13 -07:00
Vivek
7781399bb6
[201911][Mellanox] Collect MST dump before syncd restart on shutdown notification (#11742)
- Why I did it
Collecting MST dump before syncd restart on shutdown notification during a SAI failure

Dump can be found under:
root@sonic:/home/admin# ls -l /var/dump/mstdump/
total 10684
-rw-r--r-- 1 root root 5460332 Aug 15 18:41 mstdump_20220815_184143.tar.gz
-rw-r--r-- 1 root root 5473253 Aug 15 21:46 mstdump_20220815_214642.tar.gz

root@sonic:/home/admin# tar -xvzf /var/dump/mstdump/mstdump_20220815_214642.tar.gz
├── ir-gdb
│   └── core
└── mstdump
    ├── mstdump1
    ├── mstdump2
    ├── mstdump3
    └── mststatus

- How I did it
Checked for shutdown notification log in sairedis and used it to determine whether the shutdown is normal or due to SAI failure

- How to verify it
Simulated a SAI failure event and verified it. Also verified that the dump is not generated in different reboot and config reload scenarios.

Signed-off-by: Vivek Reddy <vkarri@nvidia.com>
2022-08-29 16:09:26 +03:00
Sujin Kang
61a34fcf22
[201911] Add hardware reboot cause when software reboot failed (#11753)
Why I did it
Add the hardware reboot cause when the previous software reboot failed

How I did it
Check both hardware reboot cause and software reboot cause.
Add the hardware reboot as actual reboot cause
if any hardware reboot cause is available for any software reboot.
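
The selection rule described above can be sketched as follows. The function name `resolve_reboot_cause`, the use of `None` for "no cause recorded", and the "Unknown" fallback are hypothetical choices for illustration and are not taken from the actual reboot-cause code.

```python
def resolve_reboot_cause(hardware_cause, software_cause):
    """Pick the reboot cause to report from the two recorded causes.

    A recorded hardware cause wins over a software cause, since it means the
    requested software reboot did not complete on its own.
    """
    if hardware_cause:
        return hardware_cause
    if software_cause:
        return software_cause
    return "Unknown"
```

Checking the hardware cause first is what lets a failed software reboot surface as a hardware reboot in the reported cause.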

How to verify it
Perform reboots and verify the reboot-cause
2022-08-25 12:30:53 -07:00
nicwu-cel
0a7570000c
Add Celestica Silverstone-x platform (#11533)
Why I did it
Add Celestica Silverstone-x platform

How I did it
Add Celestica Silverstone-x platform

How to verify it
verified by SONiC tested platform APIs
verified by SONiC APIs including:
psuutil
psushow(show platform psustatus)
sfputil
sfpshow
tempershow(show platform temperature)
fanshow(show platform fan)
watchdogutil
fwutil(show platform firmware status)
decode-syseeprom -d(show platform syseeprom)
show platform ssdhealth
show platform summary
show interfaces status
2022-08-11 08:47:54 -07:00
Vivek
0b06e280aa
[201911] [libteam] Backport Missing update to libteam WR patch (#11583)
Why I did it
LAG Flaps are seen on Sad Warm reboot tests because of this.

How I did it
backport 8a2ba14677

Signed-off-by: vkarri <vkarri@contoso.com>
2022-08-08 15:02:56 +03:00
Liu Shilong
ee966125d4
[ci] Update azp reference to support transferring organization from Azure to sonic-net (#11606)
2022-08-02 16:14:38 +08:00
anish-n
7c83bb69e0
Minigraph resource type changes (#5198)
* Parse sub_role from minigraph into DEVICE_METADATA
* Change minigraph sub_role to resource_type
2022-08-01 17:18:08 +00:00
Abhishek Dosi
b6e8d38fc4
Revert "Minigraph resource type changes (#5198)"
This reverts commit f42f325f09.
2022-08-01 17:13:53 +00:00
Abhishek Dosi
5ff90316be
[Submodule update] sonic-swss
[neighsyncd] increase neighsyncd timeout (#2209)

Signed-off-by: Abhishek Dosi <abdosi@microsoft.com>
2022-07-29 21:56:16 +00:00
Abhishek Dosi
f3370dd414
[Submodule update] sonic-restapi
Change error message when conflicting Vlan ID is used (#119)

Signed-off-by: Abhishek Dosi <abdosi@microsoft.com>
2022-07-29 21:53:42 +00:00
anish-n
f42f325f09
Minigraph resource type changes (#5198)
* Parse sub_role from minigraph into DEVICE_METADATA
* Change minigraph sub_role to resource_type
2022-07-29 17:02:34 +00:00
Abhishek Dosi
36ca9c52aa
[Submodule update] sonic-restapi
commit 5f7cb77230fceb1b7fd30c57a70d0cd05cd6dd95 (HEAD -> 201911, origin/201911)
Author: Sumukha Tumkur Vani <sumukhatv@outlook.com>
Date:   Wed Jul 27 16:51:38 2022 -0700

    Use 201911/stretch dependencies for build (#118)

commit e3809523050df75ec18bf31a5dc3a2e595d58a14
Author: Sumukha Tumkur Vani <sumukhatv@outlook.com>
Date:   Wed Jul 27 15:17:12 2022 -0700

    Change response message for conflicting VNI (#117)

    Ref: https://github.com/Azure/sonic-restapi/pull/99

Signed-off-by: Abhishek Dosi <abdosi@microsoft.com>
2022-07-29 16:57:27 +00:00
Abhishek Dosi
11473cf736
[Submodule update] sonic-swss
Handle delete case for object not found (#2391)

Signed-off-by: Abhishek Dosi <abdosi@microsoft.com>
2022-07-29 16:55:27 +00:00