Commit Graph

1180 Commits

Author SHA1 Message Date
mssonicbld
0469a2a02f
[ci/build]: Upgrade SONiC package versions (#13881) 2023-02-19 18:47:51 +08:00
mssonicbld
7521705bb8
[ci/build]: Upgrade SONiC package versions (#13877) 2023-02-18 19:09:33 +08:00
mssonicbld
e170a4b8a1
[ci/build]: Upgrade SONiC package versions (#13840) 2023-02-17 07:40:37 +08:00
mssonicbld
df0685fb19
[Arista] Add emmc quirks in boot0 to improve reliability (#10013) (#13743)
Why I did it
Fix some unreliability seen on emmc device with some AMD CPUs

How I did it
Added a kernel parameter to add quirks to
It depends on a sonic-linux-kernel change to work properly but will be a no-op without it.
The quirk added is SDHCI_QUIRK2_BROKEN_HS200 used to downgrade the link speed for the eMMC.

Co-authored-by: Samuel Angebault <staphylo@arista.com>
2023-02-10 09:28:24 -08:00
mssonicbld
f5656d1aad
[Mellanox][sai_failure_dump]Added platform specific script to be invoked during SAI failure dump (#13533) (#13749)
- Why I did it
Added platform specific script to be invoked during SAI failure dump. Added some generic changes to mount /var/log/sai_failure_dump as read write in the syncd docker

- How I did it
Added script in docker-syncd of mellanox and copied it to /usr/bin

- How to verify it
Manual UT and new sonic-mgmt tests

Co-authored-by: Sudharsan Dhamal Gopalarathnam <dgsudharsan@users.noreply.github.com>
2023-02-10 09:23:10 -08:00
mssonicbld
d623dd2fca
Increase PikeZ varlog size (#13550) (#13750)
Why I did it
To address error sometimes seen when running sonic-mgmt test_stress_routes.py::test_announce_withdraw_route on 720DT-48S

How I did it
Update boot0 logic to set platform specific varlog size for 720DT-48S

How to verify it
Verified that /var/log size increased and error is no longer observed when running test

Co-authored-by: andywongarista <78833093+andywongarista@users.noreply.github.com>
2023-02-10 09:20:36 -08:00
mssonicbld
06aa8aa11b
[Mellanox] Support DSCP remapping in dual ToR topo on T0 switch (#12605) (#13745)
- Why I did it
Support DSCP remapping in dual ToR topo on T0 switch for SKU Mellanox-SN4600c-C64, Mellanox-SN4600c-D48C40, Mellanox-SN2700, Mellanox-SN2700-D48C8.

- How I did it
Regarding buffer settings, originally, there are two lossless PGs and queues 3, 4. In dual ToR scenario, the lossless traffic from the leaf switch to the uplink of the ToR switch can be bounced back.
To avoid PFC deadlock, we need to map the bounce-back lossless traffic to different PGs and queues. Therefore, 2 additional lossless PGs and queues are allocated on uplink ports on ToR switches.

On uplink ports, map DSCP 2/6 to TC 2/6 respectively
On downlink ports, both DSCP 2/6 are still mapped to TC 1
Buffer adjusted according to the ports information:
Mellanox-SN4600c-C64:
56 downlinks 50G + 8 uplinks 100G
Mellanox-SN4600c-D48C40, Mellanox-SN2700, Mellanox-SN2700-D48C8:
24 downlinks 50G + 8 uplinks 100G

- How to verify it
Unit test.

Signed-off-by: Stephen Sun <stephens@nvidia.com>
Co-authored-by: Stephen Sun <5379172+stephenxs@users.noreply.github.com>
2023-02-10 09:16:56 -08:00
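The port-role-dependent mapping described in 06aa8aa11b can be illustrated with a small lookup. This is only a sketch of the rule stated in the commit message (uplink: DSCP 2/6 to TC 2/6; downlink: both to TC 1); the function name and the `port_role` strings are hypothetical and do not come from the SONiC templates.

```python
REMAPPED_DSCPS = (2, 6)  # the two DSCP values named in the commit message


def tc_for_dscp(dscp: int, port_role: str) -> int:
    """Return the traffic class for DSCP 2 or 6 on a dual-ToR T0 port."""
    if dscp not in REMAPPED_DSCPS:
        raise ValueError("this sketch only covers DSCP 2 and 6")
    if port_role == "uplink":
        # Uplink ports: DSCP 2 -> TC 2, DSCP 6 -> TC 6
        return dscp
    if port_role == "downlink":
        # Downlink ports: both DSCP values stay on TC 1
        return 1
    raise ValueError("port_role must be 'uplink' or 'downlink'")
```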
Oleksandr Ivantsiv
d1fa414f1b
Clear DNS configuration received from DHCP during networking reconfiguration in Linux. (#13516) (#13695)
- Why I did it
fixes #12907

When the management interface IP address configuration changes from dynamic to static, the DNS configuration (retrieved from the DHCP server) in /etc/resolv.conf remains uncleared. This leads to a DNS configuration pointing to the wrong nameserver. To make the behavior clear, the DNS configuration received from DHCP should be cleared.

- How I did it
Use the resolvconf package for managing DNS configuration. It is capable of tracking the source of DNS configuration and puts the configuration retrieved from the DHCP servers into a separate file. This makes it possible to clean up the DNS configuration retrieved from DHCP during networking reconfiguration.

- How to verify it
Ensure that the management interface has no static configuration.
Check that /etc/resolv.conf has DNS configuration.
Configure a static IP address on the management interface.
Verify that /etc/resolv.conf has no DNS configuration.
Remove the static IP address from the management interface.
Verify that /etc/resolv.conf has DNS configuration retrieved from the DHCP server.
2023-02-10 09:11:05 -08:00
xumia
7642f4c07f
[Security][202205] Upgrade the openssl version to 1.1.1n-0+deb11u4+fips #13737 (#13759)
* [Security] Upgrade the openssl version to 1.1.1n-0+deb11u4+fips (#13737)

Why I did it
[Security] Upgrade the openssl version to 1.1.1n-0+deb11u4+fips

f6df7303d8 Update expired certs.
84540b59c1 CVE-2022-2068
f763d8a93e Prepare 1.1.1n-0+deb11u2
576562cebe CVE-2022-1292

How I did it
Upgrade the OpenSSL version

* [Security] Upgrade OpenSSL version for armhf
2023-02-10 12:01:22 +00:00
mssonicbld
6620871fff
Add support for platform topology configuration service (#12066) (#13605) 2023-02-03 06:34:17 +08:00
Saikrishna Arcot
78ed216167 Use tmpfs for /var/log for Arista 7260 (#13587)
This is to reduce writes to disk, which can cause the SSD to wear out
faster.

Signed-off-by: Saikrishna Arcot <sarcot@microsoft.com>
2023-02-03 04:32:29 +08:00
mssonicbld
3591f6b8a3
rsyslog to start after interfaces-config (#13503) (#13529) 2023-01-27 16:10:20 +08:00
mssonicbld
3860186ec2
[sudoers] add /usr/local/bin/storyteller to READ_ONLY_CMDS (#13422) (#13530) 2023-01-27 16:08:57 +08:00
mssonicbld
5d29448f41
change default to be on (#13495) (#13498)
Changing the default config knob value to be True for killing radv, due to the reasons below:

Killing RADV is to prevent sending the "cease to be advertising interface" protocol packet.
RFC 4861 says this ceasing packet as "should" instead of "must", considering that it's fatal to not do this.
In the active-active scenario, the host side might have difficulty distinguishing whether the "cease to be advertising interface" packet is for the last interface leaving.
6.2.5. Ceasing To Be an Advertising Interface

shutting down the system.
In such cases, the router SHOULD transmit one or more (but not more
than MAX_FINAL_RTR_ADVERTISEMENTS) final multicast Router
Advertisements on the interface with a Router Lifetime field of zero.
In the case of a router becoming a host, the system SHOULD also
depart from the all-routers IP multicast group on all interfaces on
which the router supports IP multicast (whether or not they had been
advertising interfaces). In addition, the host MUST ensure that
subsequent Neighbor Advertisement messages sent from the interface
have the Router flag set to zero.

sign-off: Jing Zhang zhangjing@microsoft.com

Co-authored-by: Jing Zhang <zhangjing@microsoft.com>
2023-01-25 09:58:53 -08:00
mssonicbld
9d4c5f1ad5
[ci/build]: Upgrade SONiC package versions (#13492) 2023-01-24 22:55:46 +08:00
mssonicbld
a80c8b4c01
[ci/build]: Upgrade SONiC package versions (#13466) 2023-01-22 22:50:02 +08:00
mssonicbld
8e52922c4e
[ci/build]: Upgrade SONiC package versions (#13463) 2023-01-21 22:31:16 +08:00
mssonicbld
a67ed33ec1
[dualtor][active-active]Killing radv instead of stopping on active-active dualtor if config knob is on (#13408) (#13458) 2023-01-21 12:55:10 +08:00
Arvindsrinivasan Lakshmi Narasimhan
73c0deb810
Revert "[Chassis][Voq]update to add buffer_queue config on system ports (#12156)" (#13421)
This reverts commit 1cffbc7b07.

Why I did it
This PR reverts the changes done in #12156 in 202205.
The dependent swss changes in PR sonic-net/sonic-swss#2618 are not merged in 202205 yet.

This revert is to avoid issues on 202205 till the sonic-net/sonic-swss#2618 is merged in.

Once sonic-net/sonic-swss#2618 changes are merged, this change will be added back.
2023-01-18 16:38:05 -08:00
mssonicbld
d7ac478415
[ci/build]: Upgrade SONiC package versions (#13393) 2023-01-17 22:56:10 +08:00
mssonicbld
da8bc0bb12
[ci/build]: Upgrade SONiC package versions (#13369) 2023-01-15 23:11:45 +08:00
mssonicbld
242e5d8db8
[ci/build]: Upgrade SONiC package versions (#13365) 2023-01-14 22:50:30 +08:00
mssonicbld
3c86702d15
[Bug] Fix SONiC installation failure caused by pip/pip3 not found (#13284) (#13352) 2023-01-13 09:57:35 +08:00
mssonicbld
991c98e560
During build time mask only those feature/services that are disabled explicitly (#13283) (#13296)
What I did:
Fix : #13117

How I did:
During build time mask only those feature/services that are disabled explicitly. For some features (e.g. teamd, bgp, dhcp-relay, mux) the state is determined at run time, so for those features the service is up and running by default, and hostcfgd masks it later if needed.

So the default behavior will be:

If init_cfg.json.j2 sets the state to disabled at build time, mask the service.
If init_cfg.json.j2 sets the state to another Jinja2 template string to be rendered, do not mask the service.
If init_cfg.json.j2 sets the state to enabled at build time, do not mask the service.

How I verify:
Manual Verification.

Signed-off-by: Abhishek Dosi <abdosi@microsoft.com>
Co-authored-by: abdosi <58047199+abdosi@users.noreply.github.com>
2023-01-09 10:28:03 -08:00
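The three build-time cases listed in 991c98e560 can be sketched as a small decision function. This is a hypothetical illustration rather than the actual build script: the function name and the way a Jinja2 expression is detected (looking for `{%` or `{{`) are assumptions made for the example.

```python
def should_mask_at_build(state: str) -> bool:
    """Return True only when the feature state is a literal 'disabled'."""
    value = state.strip()
    # A state that is still a Jinja2 expression is resolved at run time,
    # so the service is left unmasked at build time.
    if "{%" in value or "{{" in value:
        return False
    # Literal 'enabled' (or anything else) is not masked either.
    return value == "disabled"
```

Under these assumptions, only the first of the three cases leads to masking at build time; the run-time cases are left for hostcfgd.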
Arvindsrinivasan Lakshmi Narasimhan
1cffbc7b07 [Chassis][Voq]update to add buffer_queue config on system ports (#12156)
Why I did it
In the voq chassis the buffer_queue configuration needs to be applied on system_port instead of the sonic port.
This PR has the change to do this.

How I did it
Modify buffer_config.j2 to generate buffer_queue configuration on system_ports if the device is Voq Chassis

How to verify it
Verify the buffer_queue configuration is generated properly using sonic-cfggen

Signed-off-by: Arvindsrinivasan Lakshmi Narasimhan <arlakshm@microsoft.com>
2023-01-05 12:37:16 +08:00
mssonicbld
f5974090ad
[ci/build]: Upgrade SONiC package versions (#13221) 2023-01-03 22:46:44 +08:00
mssonicbld
bf05eeada6
[ci/build]: Upgrade SONiC package versions (#13209) 2022-12-31 22:30:03 +08:00
xumia
8395de69d3
[Build] Support j2 template for debian sources (#12557) (#13185)
Why I did it
Unify the Debian mirror sources
Make it easy to upgrade to the next Debian release, with no source URL code change required. Support customizing the Debian mirror sources during the build
Related issue: #12523

How I did it
How to verify it
2022-12-30 09:47:33 +08:00
mssonicbld
8358cffe2d
[ci/build]: Upgrade SONiC package versions (#13169) 2022-12-25 22:12:53 +08:00
mssonicbld
7722833311
[ci/build]: Upgrade SONiC package versions (#13159) 2022-12-24 22:37:23 +08:00
mssonicbld
4b5d019bbe
[ci/build]: Upgrade SONiC package versions (#13147) 2022-12-23 06:25:41 +08:00
mssonicbld
b193ba4d9f
[ci/build]: Upgrade SONiC package versions (#13095) 2022-12-18 22:36:59 +08:00
mssonicbld
a8382a3245
[ci/build]: Upgrade SONiC package versions (#13093) 2022-12-17 23:56:08 +08:00
mssonicbld
faaaea0464
[ci/build]: Upgrade SONiC package versions (#13036) 2022-12-13 23:15:57 +08:00
Saikrishna Arcot
c725dfb975 Replace logrotate cron file with (adapted) systemd timer file (#12921)
Debian is shipping a systemd timer unit for logrotate, but we're also
packaging in a cron job, which means both of them will run, potentially
at the same time. Remove our cron file, and add an override to the
shipped timer file to have it be run every 10 minutes.

Fixes #12392.

Signed-off-by: Saikrishna Arcot <sarcot@microsoft.com>
2022-12-13 06:34:28 +08:00
mssonicbld
6891aa915a
[ci/build]: Upgrade SONiC package versions (#13017) 2022-12-11 22:19:51 +08:00
mssonicbld
e428afae01
[ci/build]: Upgrade SONiC package versions (#13015) 2022-12-10 22:16:58 +08:00
lixiaoyuner
b0c9013ea1
Add k8s master feature (#11637) (#12984)
Signed-off-by: Yun Li <yunli1@microsoft.com>

* Add k8s master feature

* Update kubernetes version mistake and make variable passing clear

* Add CRI-dockerd package

* Update version variable passing logic

* Upgrade the worker kubernetes version

* Install xml file parse tool
2022-12-09 10:43:54 +08:00
Stepan Blyshchak
7ed1cd0d68 [services] kill container on stop in warm/fast mode (#10510)
- Why I did it
To optimize stop on warm boot.

- How I did it
Added kill for containers
2022-12-08 17:19:16 +00:00
Michael Li
41858170d8 Limit reload BCM SDK kmods on syncd start to PikeZ platform (#12971)
Why I did it
Limiting #12804 changes to PikeZ platform only (Arista-720DT-48S). Note that this is a short term workaround for this platform until SDK investigation on SDK init failure on docker syncd restart due to DMA issues is resolved.

How I did it
Retrieve platform name from /host/machine.conf and only reload SDK kmods on Arista-720DT-48S platform.

Signed-off-by: Michael Li <michael.li@broadcom.com>
2022-12-08 17:18:00 +00:00
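The platform check described in 41858170d8 amounts to reading a key from /host/machine.conf and comparing it with one platform name. The sketch below shows that idea on an in-memory string; the key name `aboot_platform` and the platform string `x86_64-arista_720dt_48s` are assumptions for illustration and are not taken from the commit.

```python
def parse_machine_conf(text: str) -> dict:
    """Parse simple key=value lines, skipping blanks and comments."""
    conf = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        conf[key.strip()] = value.strip()
    return conf


def should_reload_kmods(conf: dict,
                        platform_key: str = "aboot_platform",
                        target: str = "x86_64-arista_720dt_48s") -> bool:
    """Reload the SDK kmods only on the one targeted platform (assumed names)."""
    return conf.get(platform_key) == target
```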
Ying Xie
7da66c2943 Revert "Revert "Reload BCM SDK kmods on syncd start to handle syncd restart issues (#12804)""
This reverts commit 7e910aecad.
2022-12-08 17:17:41 +00:00
Stepan Blyshchak
699800bdf1 [swss.sh] optimize macsec feature state query (#12946)
- Why I did it
There's a slowdown in bootup related to the execution of a show command during startup of the swss service. show is a pretty heavy command and takes a long time (~2 sec) to execute.

- How I did it
I replaced show with sonic-db-cli which takes a ms to run.

- How to verify it
Boot the switch and verify swss is active.

Signed-off-by: Stepan Blyschak <stepanb@nvidia.com>
2022-12-08 04:32:54 +08:00
mssonicbld
7152e84277
Make client identity by AME cert (#11946) (#12908) 2022-12-02 13:13:26 +08:00
Ying Xie
7e910aecad Revert "Reload BCM SDK kmods on syncd start to handle syncd restart issues (#12804)"
This reverts commit 132c6e934a.
2022-12-01 19:47:33 +00:00
Michael Li
132c6e934a Reload BCM SDK kmods on syncd start to handle syncd restart issues (#12804)
Why I did it
There is an issue on the Arista PikeZ platform (using T3.X2: BCM56274) while running SONiC. If the 'syncd' container in SONiC is restarted, the expected behaviour is that syncd will automatically restart/recover; however it does not and always fails at create_switch due to BCM SDK kmod DMA operation cancellation getting stuck.

Sep 16 22:19:44.855125 pkz208 ERR syncd#syncd: [none] SAI_API_SWITCH:platform_process_command:428 Platform command "init soc" failed, rc = -1.
Sep 16 22:19:44.855206 pkz208 INFO syncd#supervisord: syncd CMIC_CMC0_PKTDMA_CH4_DESC_COUNT_REQ:0x33#015
Sep 16 22:19:44.855264 pkz208 CRIT syncd#syncd: [none] SAI_API_SWITCH:platformInit:1909 initialization command "init soc" failed, rc = -1 (Internal error).
Sep 16 22:19:44.855403 pkz208 CRIT syncd#syncd: [none] SAI_API_SWITCH:sai_driver_init:642 Error initializing driver, rc = -1.
...
Sep 16 22:19:44.855891 pkz208 CRIT syncd#syncd: [none] SAI_API_SWITCH:brcm_sai_create_switch:1173 initializing SDK failed with error Operation failed (0xfffffff5).

Reloading the BCM SDK kmods allows the switch init to continue properly.

How I did it
If BCM SDK kmods are loaded, unload and load them again on syncd docker start script.

How to verify it
Steps to reproduce:

In SONiC, run 'docker ps' to see current running containers; 'syncd' should be present.
Run 'docker stop syncd'
Wait ~1 minute.
Run 'docker ps' to see that syncd is missing.
Check logs to see messages similar to the above.

Signed-off-by: Michael Li <michael.li@broadcom.com>
2022-12-01 01:36:18 +00:00
abdosi
81fe1d9c1a
Added Support to runtime render bgp and teamd feature state and lldp has_asic_scope flag (#11796) (#12856)
Added Support to runtime render bgp and teamd feature state and lldp has_asic_scope flag
2022-11-29 13:47:37 -08:00
bingwang-ms
4f7a0b4705 Apply separated DSCP_TO_TC_MAP and TC_TO_QUEUE_MAP to uplink ports on dualtor (#12730)
Why I did it
The PR is to apply separated DSCP_TO_TC_MAP and TC_TO_QUEUE_MAP to uplink ports on dualtor.
The traffic with DSCP 2 and DSCP 6 from T1 is treated as lossless traffic.

DSCP    TC    Queue
2      2     2
6      6     6
Traffic with DSCP 2 or DSCP 6 from downlink is still treated as lossy traffic as before.

How I did it
Define DSCP_TO_TC_MAP|AZURE_UPLINK and TC_TO_QUEUE_MAP|AZURE_UPLINK.

How to verify it
Verified by UT
Verified by copying the new template to a testbed, and rendering a config_db.json
2022-11-28 18:51:04 +00:00
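The table in 4f7a0b4705 chains two lookups on uplink ports: DSCP to TC, then TC to queue. A minimal sketch of that chain follows, containing only the two entries the commit message lists; the dictionary and function names are hypothetical and the real maps in the template hold more entries.

```python
# Only the entries shown in the commit message; other DSCP values are omitted.
UPLINK_DSCP_TO_TC = {2: 2, 6: 6}
UPLINK_TC_TO_QUEUE = {2: 2, 6: 6}


def uplink_queue_for_dscp(dscp: int):
    """Resolve DSCP -> TC -> queue; return None for entries not in this sketch."""
    tc = UPLINK_DSCP_TO_TC.get(dscp)
    if tc is None:
        return None
    return UPLINK_TC_TO_QUEUE.get(tc)
```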
Lorne Long
5a4efe211c [Build] Use apt-get to predictably support dependency ordered configuration of lazy packages (#12164)
Why I did it
The current lazy installer relies on a filename sort for both unpack and configuration steps. When systemd services are configured [started] by multiple packages the order is by filename not by the declared package dependencies. This can cause the start order of services to differ between first-boot and subsequent boots. Declared systemd service dependencies further exacerbate the issue (e.g. blocking the first-boot script).

The current installer leaves packages un-configured if the package dependency order does not match the filename order.

This also fixes a trivial bug in [Build]: Support to use symbol links for lazy installation targets to reduce the image size #10923 where externally downloaded dependencies are duplicated across lazy package device directories.

How I did it
Changed the staging and first-boot scripts to use apt-get:

dpkg -i /host/image-$SONIC_VERSION/platform/$platform/*.deb

becomes

apt-get -y install /host/image-$SONIC_VERSION/platform/$platform/*.deb

when dependencies are detected during image staging.

How to verify it
Apt-get critical rules

Add a Depends= to the control information of a package. Grep the syslog for rc.local between images and observe the configuration order of packages change.
2022-11-28 18:48:36 +00:00
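The ordering problem described in 5a4efe211c can be shown with two made-up packages, where the one that sorts first by filename depends on the one that sorts last. The sketch uses Python's standard `graphlib` (Python 3.9+) as a stand-in for dependency resolution; it is not how apt-get works internally, and the package names are hypothetical.

```python
from graphlib import TopologicalSorter

# Hypothetical packages: "a-service" declares a dependency on "z-base".
DEPENDS = {"a-service": {"z-base"}, "z-base": set()}


def filename_order(depends: dict) -> list:
    """Order used by a plain filename sort, which ignores dependencies."""
    return sorted(depends)


def dependency_order(depends: dict) -> list:
    """Order in which every package comes after the packages it depends on."""
    return list(TopologicalSorter(depends).static_order())
```

With these inputs the filename sort configures `a-service` before its dependency, while the dependency order puts `z-base` first.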
abdosi
88bb83e859 [chassis-packet] fix the issue of internal ip arp not getting resolved. (#12127)
Fix the issue where arp_update will not ping some of the IPs even
though they are in failed state, because grepping for that IP in the output
of ip neigh show does not do an exact word match and can return multiple matches.
2022-11-28 18:48:36 +00:00
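The matching problem fixed in 88bb83e859 is easy to reproduce: searching for one IP as a substring also matches longer addresses that start with it. The sketch below contrasts a substring match with a whole-token match; the sample neighbor lines are invented, and the regular expression is one possible way to get exact matching, not necessarily what the script uses.

```python
import re
from typing import List


def match_substring(output: str, ip: str) -> List[str]:
    """Lines containing the IP anywhere, which over-matches."""
    return [line for line in output.splitlines() if ip in line]


def match_exact(output: str, ip: str) -> List[str]:
    """Lines where the IP appears as a whole whitespace-delimited token."""
    pattern = re.compile(r"(?<!\S)" + re.escape(ip) + r"(?!\S)")
    return [line for line in output.splitlines() if pattern.search(line)]


# Invented sample in the general shape of neighbor-table output.
SAMPLE = (
    "10.0.0.1 dev eth0 FAILED\n"
    "10.0.0.10 dev eth0 lladdr 00:11:22:33:44:55 REACHABLE"
)
```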
arlakshm
b86b3b0d7d
[202205][chassis] update the asic_status.py to read from CHASSIS_FABRIC_ASIC_INFO_TABLE (#12780)
Signed-off-by: Arvindsrinivasan Lakshmi Narasimhan <arlakshm@microsoft.com>
2022-11-26 20:27:30 -08:00
mssonicbld
66ba3285ac
[ci/build]: Upgrade SONiC package versions (#12830) 2022-11-25 21:59:36 +08:00
mssonicbld
4424937611
[ci/build]: Upgrade SONiC package versions (#12812) 2022-11-23 21:35:21 +08:00
mssonicbld
2d2305091f
[ci/build]: Upgrade SONiC package versions (#12772) 2022-11-20 22:50:40 +08:00
mssonicbld
13b8078555
[ci/build]: Upgrade SONiC package versions (#12759) 2022-11-19 04:22:29 +08:00
mssonicbld
f4bace99f1
[ci/build]: Upgrade SONiC package versions (#12726) 2022-11-17 02:52:28 +08:00
mssonicbld
1be9baa1c0
[ci/build]: Upgrade SONiC package versions (#12691) 2022-11-13 22:33:45 +08:00
mssonicbld
2b641e0505
[ci/build]: Upgrade SONiC package versions (#12656) 2022-11-11 23:34:54 +08:00
Jing Kan
b2d3e2cf2e [dhcp_relay] Enable DHCP Relay for BmcMgmtToRRouter in init_cfg (#12648)
Why I did it
DHCP relay feature needs to be enabled for BmcMgmtToRRouter by default

How I did it
Update device type list
2022-11-10 18:16:15 +00:00
Sudharsan Dhamal Gopalarathnam
1ea37e2723 [logrotate]Fix logrotate firstaction script to reflect correct size (#12599)
- Why I did it
Fix logrotate firstaction script to reflect correct size. The size was modified to change dynamically based on disk size. However this variable was not updated
#9504

- How I did it
Updated the variable based on disk size

- How to verify it
Verify in the generated rsyslog file if the variable is correctly generated from jinja template
2022-11-10 18:15:10 +00:00
bingwang-ms
d824846928 Add lossy scheduler for queue 7 (#12596)
* Add lossy scheduler for queue 7
2022-11-10 18:14:55 +00:00
Devesh Pathak
c7ce62154b Clear /etc/resolv.conf before building image (#12592)
Why I did it
nameserver and domain entries from the build system fsroot get into the SONiC image.

How I did it
Clear /etc/resolv.conf before building image

How to verify it
Built image with it and verified with install that /etc/resolv.conf is empty
2022-11-10 18:14:10 +00:00
Lawrence Lee
f60e22a5c3 [arp_update]: Fix hardcoded vlan (#12566)
Typo in prior PR #11919 hardcodes Vlan name. Change command to use the $vlan variable instead

Signed-off-by: Lawrence Lee <lawlee@microsoft.com>
2022-11-10 18:12:02 +00:00
judyjoseph
ab713dcfb6 Use the macsec_enabled flag in platform to enable macsec feature state (#11998)
* Use the macsec_enabled flag in platform to enable macsec feature state
* Add macsec supported metadata in DEVICE_RUNTIME_METADATA
2022-11-10 18:08:42 +00:00
mssonicbld
584aaa7058
[ci/build]: Upgrade SONiC package versions (#12612) 2022-11-06 22:25:30 +08:00
mssonicbld
98c3e24770
[ci/build]: Upgrade SONiC package versions (#12606) 2022-11-05 00:36:02 +08:00
mssonicbld
1463af1227
[ci/build]: Upgrade SONiC package versions (#12584) 2022-11-03 00:12:56 +08:00
mssonicbld
fe62175aa6
[ci/build]: Upgrade SONiC package versions (#12571) 2022-11-02 01:18:10 +08:00
mssonicbld
ae681eabb8
[ci/build]: Upgrade SONiC package versions (#12556) 2022-11-01 03:49:12 +08:00
mssonicbld
483257d88c
[ci/build]: Upgrade SONiC package versions (#12543) 2022-10-28 23:15:39 +08:00
Samuel Angebault
8e44292d74
[202205][Arista] Fix cmdline generation during warm-reboot from 201811/201911 (#12371)
* [202012][Arista] Fix cmdline generation during warm-reboot from 201811/201911 (#11161)

Issue fixed: when performing a warm-reboot or fast-reboot from 201811 or 201911 to 202012, the kernel command line contains duplicate information. This issue is related to a change that was made to make the 202012 boot0 file more futureproof.
A cold reboot brings everything back to a clean slate, though that is not always desirable.

Changes done:
Added some logic to properly detect the end of the Aboot cmdline when cmdline-aboot-end delimiter is not set (clean case)
Added some logic to regenerate the Aboot cmdline when cmdline-aboot-end is set but duplicate parameters exist before it (dirty case). Reorganized some code to handle duplicate parameter handling in the allowlist.

* Fix cmdline generation due to sonic_fips
2022-10-27 10:14:26 -07:00
Samuel Angebault
b1c0d8d5e4
Add emmc quirks to boot0 (#9989) (#12373)
Why I did it
Fix some unreliability seen on emmc device with some AMD CPUs

How I did it
Added a kernel parameter to add quirks to
It depends on a sonic-linux-kernel change to work properly but will be a no-op without it.

Description for the changelog
Add emmc quirks for Upperlake
2022-10-27 07:09:03 -07:00
Devesh Pathak
17c213a264 Fix to improve hostname handling (#12064)
* Fix to improve hostname handling
If config_db.json is missing hostname entry, hostname-config.sh ends
up deleting existing entry too and hostname changes to default 'localhost'

* default hostname to 'sonic' if missing in config file
2022-10-25 21:52:42 +00:00
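The fallback introduced in 17c213a264 can be sketched as a lookup with a default. This is an illustration only: the real logic lives in the shell script hostname-config.sh, and the function name here is hypothetical. The `DEVICE_METADATA` / `localhost` / `hostname` path is assumed to be where config_db.json keeps the hostname.

```python
def resolve_hostname(config_db: dict, default: str = "sonic") -> str:
    """Return the configured hostname, or the default when it is missing or empty."""
    metadata = config_db.get("DEVICE_METADATA", {}).get("localhost", {})
    return metadata.get("hostname") or default
```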
Samuel Angebault
94c8107f5e Fix extraction of platform.tar.gz for firsttime (#11935) 2022-10-25 20:43:32 +00:00
cytsao1
8930d70972 [pmon] Add smartmontools to pmon docker (#11837)
* Add smartmontools to pmon docker

* Set smartmontools to install version 7.2-1 in pmon to match host; clean up smartmontools build files

* Add comments on smartmontools version for both host and pmon
2022-10-25 20:41:26 +00:00
xumia
db2128564b
[202205] Change submodule path from Azure to sonic-net (#12308)
Why I did it
Change the path of sonic submodules that point to "Azure" to point to "sonic-net"

How I did it
Replace "Azure" with "sonic-net" on all relevant paths of sonic submodules
2022-10-24 13:13:14 +08:00
mssonicbld
abc92c6248
[ci/build]: Upgrade SONiC package versions (#12452) 2022-10-20 03:23:45 +08:00
mssonicbld
5d2db5068c
[ci/build]: Upgrade SONiC package versions (#12437) 2022-10-18 22:19:35 +08:00
mssonicbld
cfc9af71ef
[ci/build]: Upgrade SONiC package versions (#12418) 2022-10-16 22:24:10 +08:00
mssonicbld
b4e6a06d1a
[ci/build]: Upgrade SONiC package versions (#12409) 2022-10-14 23:51:03 +08:00
Ying Xie
a1365b44c3 [BGP] starting BGP service after swss (#12381)
Why I did it
BGP service has always been starting after interface-config. However, recently we discovered an issue where some BGP sessions are unable to establish due to BGP daemon not able to read the interface IP.

This issue was clearly observed after upgrading to FRR 8.2.2. See more details in #12380.

How I did it
Delaying starting BGP seems to be a workaround for this issue.

However, caution is that this delay might impact warm reboot timing and other timing sequences.

This workaround reduces the probability of hitting the issue by close to 100X. However, as testing shows, it is not bulletproof. It is still preferable to have a proper FRR fix and revert this change in the future.

How to verify it
Continuously issue config reload and check BGP session status afterwards.

Signed-off-by: Ying Xie <ying.xie@microsoft.com>
2022-10-13 16:34:10 +00:00
mssonicbld
3435a8a305
[ci/build]: Upgrade SONiC package versions (#12372) 2022-10-13 02:58:26 +08:00
mssonicbld
1b5d61246a
[ci/build]: Upgrade SONiC package versions (#12324) 2022-10-09 21:44:14 +08:00
Stepan Blyshchak
06f8b1f98a
[auto-ts] add memory check (#10433) (#12291)
#### Why I did it

To support automatic techsupport invocation in case memory usage is too high.

#### How I did it

Implemented according to https://github.com/Azure/SONiC/pull/939

#### How to verify it

UT, manual test on the switch.

*DEPENDS* on https://github.com/Azure/sonic-utilities/pull/2116
2022-10-06 08:06:46 -07:00
Prince George
fab37239dd Disable bracketed-paste mode off by default (#12285)
* Disable bracketed-paste mode off by default

* address review comment
2022-10-06 14:58:46 +00:00
Saikrishna Arcot
ac19e2a8ba [docker-wait-any]: Exit worker thread if main thread is expected to exit (#12255)
There's an odd crash that intermittently happens after the teamd container
exits, and a signal is raised to the main thread to exit. This thread (watching
teamd) continues execution because it's in a `while True`. The subsequent wait
call on the teamd container very likely returns immediately, and it calls
`is_warm_restart_enabled` and `is_fast_reboot_enabled`. In either of these
cases, sometimes, there is a crash in the transition from C code to Python code
(after the function gets executed).  Python sees that this thread got a signal
to exit, because the main thread is exiting, and tells pthread to exit the
thread.  However, during the stack unwinding, _something_ is telling the
unwinder to call `std::terminate`.  The reason is unknown.

This then results in a python3 SIGABRT, and systemd then doesn't call the stop
script to actually stop the container (possibly because the main process exited
with a SIGABRT, so it's a hard crash). This means that the container doesn't
actually get stopped or restarted, resulting in an inconsistent state
afterwards.

The workaround appears to be that if we know the main thread needs to exit,
just return here, and don't continue execution. This at least tries to avoid it
from getting into the problematic code path. However, it's still feasible to
get a SIGABRT, depending on thread/process timings (i.e. teamd exits, signals
the main thread to exit, and then syncd exits, and syncd calls one of the two C
functions, potentially hitting the issue).

Signed-off-by: Saikrishna Arcot <sarcot@microsoft.com>
2022-10-06 14:57:53 +00:00
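The workaround described in ac19e2a8ba is an early return in the worker loop once the main thread is known to be exiting. A minimal sketch of that pattern follows; `main_thread_exiting`, `watch_container` and the two callables are hypothetical stand-ins and not the names used in docker-wait-any.

```python
import threading

# Hypothetical flag that is set once the main thread has been told to exit.
main_thread_exiting = threading.Event()


def watch_container(wait_for_exit, handle_exit) -> None:
    """Worker loop: block until the container exits, then react to it."""
    while True:
        wait_for_exit()
        if main_thread_exiting.is_set():
            # The process is already shutting down, so return here
            # instead of running any further code in this thread.
            return
        handle_exit()


def demo() -> dict:
    """Drive the loop once with stub callables and count the calls."""
    calls = {"wait": 0, "handle": 0}
    main_thread_exiting.clear()

    def wait_stub():
        calls["wait"] += 1

    def handle_stub():
        calls["handle"] += 1
        main_thread_exiting.set()  # simulate signalling the main thread

    watch_container(wait_stub, handle_stub)
    return calls
```

In `demo()`, the second pass through the loop sees the flag and returns without calling the handler again, which is the behavior the commit message describes.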
mssonicbld
204cf58221
[ci/build]: Upgrade SONiC package versions (#12278) 2022-10-05 20:38:20 +08:00
Ying Xie
76f7d7fa53 Revert "[auto-ts] add memory check (#10433)"
This reverts commit a2cd0f5d4c.
2022-10-04 21:53:45 +00:00
mssonicbld
1a08069d40
[ci/build]: Upgrade SONiC package versions (#12268) 2022-10-04 21:09:24 +08:00
Stepan Blyshchak
a2cd0f5d4c [auto-ts] add memory check (#10433)
#### Why I did it

To support automatic techsupport invocation in case memory usage is too high.

#### How I did it

Implemented according to https://github.com/Azure/SONiC/pull/939

#### How to verify it

UT, manual test on the switch.

*DEPENDS* on https://github.com/Azure/sonic-utilities/pull/2116
2022-10-03 18:58:38 +00:00
mssonicbld
89643d4717
[ci/build]: Upgrade SONiC package versions (#12245) 2022-10-02 21:13:07 +08:00
mssonicbld
a7d088c47c
[ci/build]: Upgrade SONiC package versions (#12191) 2022-09-28 23:25:55 +08:00
mssonicbld
1c5abca0a6
[ci/build]: Upgrade SONiC package versions (#12187) 2022-09-27 08:41:31 +08:00
mssonicbld
99f9c53d19
[ci/build]: Upgrade SONiC package versions (#12142) 2022-09-25 21:57:18 +08:00
Volodymyr Boiko
3d620370f7 [bgp][service] Start bgp service after interfaces-config service (#11827)
- Why I did it
The interfaces-config service restarts the networking service. During the restart the loopback interface address is removed and reassigned, leaving loopback without an IPv4 address for a while.
On SONiC startup and config reload, the interfaces-config and bgp services start in parallel, and sometimes
fpmsyncd in bgp attempts to bind to loopback while it does not have an address, fails with the log
Exception "Cannot assign requested address" had been thrown in daemon
and exits with rc 0.

root@sonic:/# supervisorctl status
fpmsyncd                         EXITED    Jul 20 05:04 AM
zebra                            RUNNING   pid 35, uptime 6:15:05
zsocket                          EXITED    Jul 20 05:04 AM
docker logs bgp
INFO exited: fpmsyncd (exit status 0; expected)
With fpmsyncd dead, configured routes do not appear in the database.

- How I did it
Added ordering dependency on interfaces-config service into bgp.config

- How to verify it
The issue itself reproduces quite rarely, but one can widen the time interval between networking down and networking up in interfaces-config.sh like this:

diff --git a/files/image_config/interfaces/interfaces-config.sh b/files/image_config/interfaces/interfaces-config.sh
index f6aa4147a..87caceeff 100755
--- a/files/image_config/interfaces/interfaces-config.sh
+++ b/files/image_config/interfaces/interfaces-config.sh
@@ -63,7 +63,11 @@ done
 # Read sysctl conf files again
 sysctl -p /etc/sysctl.d/90-dhcp6-systcl.conf

-systemctl restart networking
+# systemctl restart networking
+
+systemctl start networking
+sleep 10
+systemctl stop networking

 # Clean-up created files
 rm -f /tmp/ztp_input.json /tmp/ztp_port_data.json
with this change the issue reproduces on every config reload.

Signed-off-by: Volodymyr Boyko <volodymyrx.boiko@intel.com>
2022-09-21 21:15:08 +00:00
Maxime Lorrillere
458b12b4af [Chassis][Voq]Configure midplane network on supervisor (#11725)
Multi-asic Docker instances are created behind Docker's default bridge
which doesn't allow talking to other Docker instances that are in the
host network (like database-chassis).

On linecards, we configure midplane interfaces to let per-asic docker
containers talk to CHASSIS_DB on the supervisor through internal chassis
network.

On the supervisor we don't need to use chassis internal network, but we
still need a similar setup in order to allow fabric containers to talk
to database-chassis
2022-09-21 21:12:40 +00:00
mssonicbld
77b469d7c8
[ci/build]: Upgrade SONiC package versions (#12121) 2022-09-20 21:24:25 +08:00
Oleksandr Ivantsiv
c9ba827773
[202205] [services] Update "WantedBy=" section for tacacs-config.timer. (#11893) (#12080)
Manually cherry-picking #11893

- Why I did it
The timer execution may fail if triggered during a config reload (when the sonic.target is stopped). This might happen in a rare situation if config reload is executed after reboot in a small time slot (for 0 to 30 seconds) before the tacacs-config timer is triggered:

systemctl status tacacs-config.timer
tacacs-config.timer - Delays tacacs apply until SONiC has started
Loaded: loaded (/lib/systemd/system/tacacs-config.timer; enabled-runtime; vendor preset: enabled)
Active: failed (Result: resources) since Mon 2022-08-29 15:53:03 IDT; 1min 28s ago
Trigger: n/a
Triggers: tacacs-config.service

Aug 29 15:47:53 r-boxer-sw01 systemd[1]: Started Delays tacacs apply until SONiC has started.
Aug 29 15:53:03 r-boxer-sw01 systemd[1]: tacacs-config.timer: Failed to queue unit startup job: Transaction for tacacs-config.service/start is destructive (mgmt-framework.timer has 's>
Aug 29 15:53:03 r-boxer-sw01 systemd[1]: tacacs-config.timer: Failed with result 'resources'.

- How I did it
To ensure that timer execution will be resumed after a config reload the WantedBy section of the systemd service is updated to describe relation to sonic.target.

- How to verify it
Reboot the system
After reboot monitor tacacs-config.timer status. 30 seconds before timer activation run "config reload -y" command.
Check system status.

Signed-off-by: Oleksandr Ivantsiv <oivantsiv@nvidia.com>
2022-09-19 09:20:10 +03:00
mssonicbld
f361c029c5
[ci/build]: Upgrade SONiC package versions (#11980) 2022-09-19 12:31:16 +08:00
Aryeh Feigin
b8c6e2a45d
Use warm-boot infrastructure for fast-boot (#12026) 2022-09-14 21:23:34 +03:00
Saikrishna Arcot
f1243bad1b
Pin version of bazelisk to v1.13.0 (#12027)
* Pin version of bazelisk to v1.13.0

This tries to avoid build failures due to the latest version of
bazelisk changing and causing hash mismatches.

Signed-off-by: Saikrishna Arcot <sarcot@microsoft.com>
2022-09-08 21:15:35 -07:00