Commit Graph

7607 Commits

Author SHA1 Message Date
mssonicbld
ead7b975f8 [submodule] Update submodule sonic-sairedis to the latest HEAD automatically 2023-04-20 16:34:05 +08:00
mssonicbld
34d1c860b0 [submodule] Update submodule sonic-utilities to the latest HEAD automatically 2023-04-20 16:34:00 +08:00
Hua Liu
a14cc76879
Install python-redis package to docker containers (#14632)
Install python-redis package to docker containers

#### Why I did it
This this bug: https://github.com/sonic-net/sonic-buildimage/issues/14531
The 'flush_unused_database' is part of docker-database, and docker-database does not install python-redis package by itself. it's using redis installed by sonic-py-swsssdk.
So after remove sonic-py-swsssdk from container, this script break.

To this this bug and avoid similer bug happen again, install python-redis to docker containers which removed sonic-py-swsssdk .

#### How I did it
Install python-redis to containers.

#### How to verify it
Pass all UT.
Create new UT to cover this scenario: https://github.com/sonic-net/sonic-mgmt/pull/8032

#### Description for the changelog
Improve sudo cat command for RO user.
2023-04-19 18:14:48 -07:00
mssonicbld
d006219e2d
[ci/build]: Upgrade SONiC package versions (#14718) 2023-04-19 18:59:16 +08:00
mssonicbld
864a254a50 [submodule] Update submodule sonic-swss to the latest HEAD automatically 2023-04-19 16:34:33 +08:00
mssonicbld
6556288ac2 [submodule] Update submodule sonic-utilities to the latest HEAD automatically 2023-04-19 16:34:27 +08:00
vdahiya12
9e2d457a42
[minigraph] add support for changing T1 ports speed from 400G to 100G and vice-versa (#14505)
Open
[minigraph] add support for changing T1 ports speed from 400G to 100G and vice-versa
#14505
vdahiya12 wants to merge 9 commits into sonic-net:master from vdahiya12:dev/vdahiya/minigraph_parser
Conversation 10
Commits 9
Checks 18
Files changed 5
Conversation
vdahiya12
@vdahiya12 vdahiya12 commented 2 weeks ago • 
On SONiC T1 cisco 8101 HwSku, the speed changes are done from 400G to 100G needs to be supported on 400G ports.
To enable this, along with speed change the port lanes need to be changed. This PR has the changes to update the port lanes when such speed change happens.
Basically if Banwidth in minigraph.xml intends to enable a 100G speed on a 400G port, then the appropriate lane change and speed change needs to be invoked in mingraph parser
Example if port_config.ini dicatates the speed to be 400G and minigraph has 100G speed, then this changeneeds to be accommodated

# name         lanes                                      alias   index  speed    channel
Ethernet96     1536,1537,1538,1539,1540,1541,1542,1543    etp12    12       400000     0
 <DeviceLinkBase>
        <ElementType>DeviceInterfaceLink</ElementType>
        <EndDevice>ARISTA01T2</EndDevice>
        <EndPort>Ethernet1</EndPort>
        <StartDevice>Device-8101-01</StartDevice>
        <StartPort>etp12</StartPort>
        <Bandwidth>100000</Bandwidth>
      </DeviceLinkBase>
These platforms today have 400g port with 8 serdes lines, and 100g will operate with 4 serdes lane. When the port speed changes from 400G to 100G the first 4 lanes will be used for 100G port.

Signed-off-by: vaibhav-dahiya <vdahiya@microsoft.com>
2023-04-19 01:23:19 -07:00
DavidZagury
03bf4ff549
Remove default value from SECURE_UPGRADE_DEV_SIGNING_KEY (#14582)
This is done because when there is a default value, we mount to this path, and this creates this folder on the host.

#### Why I did it
Fix issue that running without overwriting SECURE_UPGRADE_DEV_SIGNING_KEY and SECURE_UPGRADE_DEV_SIGNING_CERT dummy folders are being created on the host.

#### How I did it
Removed the default assignment to SECURE_UPGRADE_DEV_SIGNING_KEY and SECURE_UPGRADE_DEV_SIGNING_CERT

#### How to verify it
Build SONiC using your own prod script
2023-04-18 15:48:47 -07:00
Zain Budhwani
e9a9c9e31f
Update telemetry.sh with threshold config (#14615)
#### Why I did it

Threshold is a new config field passed to telelemetry.go as parameter

#### How I did it

Add check for threshold

#### How to verify it

Modify telemetry.sh, systemctl restart telemetry, telemetry process has threshold of 100
2023-04-18 14:29:30 -07:00
mssonicbld
802a5cff19 [submodule] Update submodule sonic-gnmi to the latest HEAD automatically 2023-04-18 16:34:04 +08:00
mssonicbld
fe8530e692
[submodule] Update submodule sonic-platform-common to the latest HEAD automatically (#14678) 2023-04-18 15:25:37 +08:00
Aryeh Feigin
039a9c998a
[Fast-boot] Clear teamd-timer when finalizing fast-reboot (#14583)
Part of sonic-net/sonic-utilities#2760
Similar to #14295

- Why I did it
To clear teamd timer when fast-reboot is finalized to prevent any further affect.

- How I did it
Deleted teamd timer from config-db in fast-reboot finalizer.
config save call is moved to after clearing teamd-timer so it won't have any further affect as well.

- How to verify it
Verified manually that entry was deleted after fast-reboot was finailized.
2023-04-18 09:15:42 +03:00
Stepan Blyshchak
d73c810e86
[image_config] add rasdaemon.timer (#14300)
rasdaemon is a tool to log hardware errors. It takes 100% CPU during
boot for a few seconds. It impacts fast/warm boot by delaying control
plane restoration for 5 sec on some platforms.

Signed-off-by: Stepan Blyschak <stepanb@nvidia.com>
2023-04-17 08:58:45 -07:00
mssonicbld
7f262d71da
[ci/build]: Upgrade SONiC package versions (#14685) 2023-04-17 19:58:43 +08:00
Feng-msft
183d0f2be7
Update golang version for telemetry build in sonic-slave-buster to fix CVE-2021-33195 (#14637)
Update golang version for telemetry build in sonic-slave-buster to fix https://security-tracker.debian.org/tracker/CVE-2021-33195, this PR will be merged into 202012 branch finally.

#### Why I did it
Go before 1.15.13 and 1.16.x before 1.16.5 has functions for DNS lookups that do not validate replies from DNS servers, and thus a return value may contain an unsafe injection (e.g., XSS) that does not conform to the RFC1035 format. Now in 201911 and 202012 branch we're using 1.14.2

##### Work item tracking
- Microsoft ADO **(number only)**:17727291

#### How I did it
Bump golang version into 1.15.15 which contains corresponding fix.

#### How to verify it
unit test to do sanity check.
2023-04-16 23:47:42 -07:00
Feng-msft
46c0d073a5
Update golang version for telemetry build in sonic-slave-buster to fix (#14636)
Update golang version for telemetry build in sonic-slave-jessie to fix CVE-2021-33195, this PR will be merged into 201911 branch finally.

#### Why I did it
Go before 1.15.13 and 1.16.x before 1.16.5 has functions for DNS lookups that do not validate replies from DNS servers, and thus a return value may contain an unsafe injection (e.g., XSS) that does not conform to the RFC1035 format. Now in 201911 and 202012 branch we're using 1.14.2

##### Work item tracking
- Microsoft ADO **(number only)**:17727291

#### How I did it
Bump golang version into 1.15.15 which contains corresponding fix.

#### How to verify it
unit test to do sanity check.
2023-04-16 23:44:11 -07:00
mssonicbld
7931abd527
[submodule] Update submodule sonic-host-services to the latest HEAD automatically (#14670) 2023-04-16 15:04:19 +08:00
mssonicbld
49dbaeb649
[ci/build]: Upgrade SONiC package versions (#14672) 2023-04-15 18:21:50 +08:00
mssonicbld
98ed13b978
[submodule] Update submodule sonic-swss to the latest HEAD automatically (#14648) 2023-04-15 15:04:14 +08:00
Saikrishna Arcot
070a64af89
Fix backend port channels and routes being displayed (#14479)
* Fix backend port channels and routes being displayed
In `show interface portchannel` and `show ip route`, backend port
channels and routes were being displayed. This is due to changes in #13660.
Fix these issues by switching to reading from PORTCHANNEL_MEMBERS table
instead.
Fixes #14459.
* Replace table name with constant

Signed-off-by: Saikrishna Arcot <sarcot@microsoft.com>
2023-04-14 19:54:02 -07:00
mssonicbld
d014b03849
[submodule] Update submodule sonic-utilities to the latest HEAD automatically (#14649) 2023-04-14 15:10:53 +08:00
Ravi [Marvell]
fa48caf39d
Add debug shell packages for Marvell Innovium platforms (#11845)
- Why I did it
Package Marvell/Innovium CLI shell.

- How I did it
Include shell packages.

- How to verify it
Platform specific shell commands.

Signed-off-by: rck-innovium rck@innovium.com
2023-04-13 22:04:36 +03:00
Dror Prital
d8f75c932b
[submodule] Advance sonic-platform-daemons pointer (#14595)
Update sonic-platform-daemons submodule pointer to include the following:
* d1203ef Update xcvrd to use new STATE_DB FAST_REBOOT entry ([#335](https://github.com/sonic-net/sonic-platform-daemons/pull/335))

Signed-off-by: dprital <drorp@nvidia.com>
2023-04-13 18:21:32 +03:00
mssonicbld
3e529cbab3 [submodule] Update submodule to the latest HEAD automatically 2023-04-13 20:51:28 +08:00
Vivek
397908aa59
[Mellanox] Facilitate automatic integration of new hw-mgmt (#14594)
- Why I did it
Facilitate Automatic integration of new hw-mgmt version into SONiC.

Inputs to the Script:

MLNX_HW_MANAGEMENT_VERSION Eg: 7.0040.5202
CREATE_BRANCH: (y|n) Creates a branch instead of a commit (optional, default: n)
BRANCH_SONIC: Only relevant when CREATE_BRANCH is y. Default: master.
Note: These should be provided through SONIC_OVERRIDE_BUILD_VARS  parameter

Output:

Script creates a commit (in each of sonic-buildimage, sonic-linux-kernel) with all the changes required for upgrading the hw-management version to a version provided by MLNX_HW_MANAGEMENT_VERSION
Brief Summary of the changes made:

MLNX_HW_MANAGEMENT_VERSION flag in the hw-management.mk file
hw-mgmt submodule is updated to the corresponding version
Updates are made to non-upstream-patches/patches and series.patch file
series, kconfig-inclusion and kconfig-exclusion files can be updated in the sonic-linux-kernel repo
sonic-linux-kernel/patches folder is updated with the corresponding upstream patches
Based on the inputs, there could be a branch seen in the local for each of the repo's. Branch is named as <branch>_<parent_commit>_integrate_<hw_mgmt_version>

- How I did it
Added a new make target which can be invoked by calling make integrate-mlnx-hw-mgmt
user@server:/sonic-buildimage$ git rev-parse --abbrev-ref HEAD
master_23193446a_integrate_7.0020.5052
user@server:/sonic-buildimage$ git log --oneline -n 2
f66e01867 (HEAD -> master_23193446a_integrate_V.7.0020.5052, show) Intgerate HW-MGMT V.7.0020.5052 Changes
23193446a (master_intg_hw_mgmt) Update logic

user@server:/sonic-buildimage/src/sonic-linux-kernel$ git rev-parse --abbrev-ref HEAD
master_6847319_integrate_7.0020.4104
user@server:/sonic-buildimage/src/sonic-linux-kernel$ git log --oneline -n 2
6094f71 (HEAD -> master_6847319_integrate_V.7.0020.5052) Intgerate HW-MGMT V.7.0020.5052 Changes
6847319 (origin/master, origin/HEAD) Read ID register for optoe1 to find pageable bit in optoe driver  (#308)
Changes made will be summarized under sonic-buildimage/integrate-mlnx-hw-mgmt_user.out file. Debugging and troubleshooting output is written to sonic-buildimage/integrate-mlnx-hw-mgmt.log files

User output file & stdout file:

log_files.tar.gz

Limitations:
Assumes the changes would only work for amd64
Assumes the non-upstream patches in mellanox only belong to hw-mgmt

- How to verify it
Build the Kernel

Signed-off-by: Vivek Reddy Karri <vkarri@nvidia.com>
2023-04-13 14:18:09 +03:00
Sudharsan Dhamal Gopalarathnam
2804998766
[config reload]Config Reload Enhancement (#13969)
#### Why I did it
Implementing code changes for https://github.com/sonic-net/SONiC/pull/1203

#### How I did it
Removed the timers and delayed target since the delayed services would start based on event driven approach.
Cleared port table during config reload and cold reboot scenario.
Modified yang model, init_cfg.json to change has_timer to delayed

#### How to verify it
Running regression
2023-04-12 11:20:03 -07:00
Christian Svensson
97c29a45bd
[build] Do not ignore well-known debian files (#14565)
Includes the common debian files that we always want to include.

This mitigates but does not fully solve #7683 as it
could be more files that are ignored by this default rule.

Signed-off-by: Christian Svensson <blue@cmd.nu>
2023-04-12 09:10:22 -07:00
mssonicbld
f9eb849d75
[ci/build]: Upgrade SONiC package versions (#14620) 2023-04-12 20:05:30 +08:00
anamehra
f34360f101
chassis-packet: resolve the missing static routes (#14593)
Why I did it
Fixes #14179
chassis-packet: missing arp entries for static routes causing high orchagent cpu usage

It is observed that some sonic-mgmt test case calls sonic-clear arp, which clears the static arp entries as well. Orchagent or arp_update process does not try to resolve the missing arp entries after clear.

How I did it
arp_update should resolve the missing arp/ndp static route
entries. Added code to check for missing entries and try ping if any
found to resolve it.

How to verify it
After boot or config reload, check ipv4 and ipv4 neigh entries to make sure all static route entries are present
manual validation:
Use sonic-clear arp and sonic-clear ndp to clear all neighbor entries
run arp_update
Check for neigh entries. All entries should be present.
Testing on T0 setup route/for test_static_route.py

The test set the STATIC_ROUTE entry in conifg db without ifname:
sonic-db-cli CONFIG_DB hmset 'STATIC_ROUTE|2.2.2.0/24' nexthop 192.168.0.18,192.168.0.25,192.168.0.23

"STATIC_ROUTE": {
    "2.2.2.0/24": {
        "nexthop": "192.168.0.18,192.168.0.25,192.168.0.23"
    }
},
Validate that the arp_update gets the proper ARP_UPDATE_VARDS using arp_update_vars.j2 template from config db and does not crash:

{ "switch_type": "", "interface": "", "pc_interface" : "PortChannel101 PortChannel102 PortChannel103 PortChannel104 ", "vlan_sub_interface": "", "vlan" : "Vlan1000", "static_route_nexthops": "192.168.0.18 192.168.0.25 192.168.0.23 ", "static_route_ifnames": "" }

validate route/test_static_route.py testcase pass.
2023-04-12 15:07:42 +08:00
xumia
f1fd42558a
Support to add SONiC OS Version in device info (#14601)
Why I did it
Support to add SONiC OS Version in device info.
It will be used to display the version info in the SONiC command "show version". The version is used to do the FIPS certification. We do not do the FIPS certification on a specific release, but on the SONiC OS Version.

SONiC Software Version: SONiC.master-13812.218661-7d94c0c28
SONiC OS Version: 11
Distribution: Debian 11.6
Kernel: 5.10.0-18-2-amd64
How I did it
2023-04-12 09:20:08 +08:00
Qi Luo
38f0ec6563
Update pull request template for test evidence, and work item trackers (#14552)
Update pull request template for test evidence, and work item trackers
2023-04-11 15:16:35 -07:00
xumia
ad162ae0e8
[Build] Optimize the version control for Debian packages (#14557)
Why I did it
Optimize the version control for Debian packages.
Fix sonic-slave-buster/sources.list.amd64 not found display issue, need to generate the file before running the shell command to evaluate the sonic image tag.
When using the snapshot mirror, it is not necessary to update the version file based on the base image. It will reduce the version dependency issue, when an image is not run when freezing the version.

How I did it
Not to update the version file when snapshot mirror enabled.

How to verify it
2023-04-11 17:07:26 +08:00
Konstantin Vasin
d7d6445abf
[Build] disable DOCKER_BUILDKIT explicitly (#14405)
Why I did it
Fix #14081
By default DOCKER_BUILDKIT is enabled after docker version 23.0.0
So we need to disable it explicitly if SONIC_USE_DOCKER_BUILDKIT is not set.
Otherwise it will produce larger installable images.

How I did it
set DOCKER_BUILDKIT=0 in slave.mk

How to verify it
2023-04-11 08:06:07 +00:00
Liu Shilong
3d32008e49
[build] Fix reproducible build version issue when failed to download web file (#14587)
Why I did it
refine reproducible build.

How I did it
Fix reset map variable in bash.
Ignore empty web file md5sum value.
If web file didn't backup in azure storage, use file on web.
How to verify i
2023-04-11 10:47:30 +08:00
Vivek
0df155b014
Made non-upstream patch design order aware (#14434)
- Why I did it

Currently, non upstream patches are applied only after upstream patches.

Depends on sonic-net/sonic-linux-kernel#313. Can be merged in any order, preferably together

- What I did it

Non upstream Patches that reside in the sonic repo will not be saved in a tar file bur rather in a folder pointed out by EXTERNAL_KERNEL_PATCH_LOC. This is to make changes to the non upstream patches easily traceable.
The build variable name is also updated to INCLUDE_EXTERNAL_PATCHES
Files/folders expected under EXTERNAL_KERNEL_PATCH_LOC
EXTERNAL_KERNEL_PATCH_LOC/
       ├──── patches/
             ├── 0001-xxxxx.patch
             ├── 0001-yyyyyyyy.patch
             ├── .............
       ├──── series.patch
series.patch should contain a diff that is applied on the sonic-linux-kernel/patch/series file. The diff should include all the non-upstream patches.
How to verify it

Build the Kernel and verified if all the patches are applied properly

Signed-off-by: Vivek Reddy Karri <vkarri@nvidia.com>
2023-04-10 19:48:27 +03:00
mssonicbld
4e5c8988b1
[ci/build]: Upgrade SONiC package versions (#14586) 2023-04-10 18:10:37 +08:00
mssonicbld
4ff784a489
[submodule] Update submodule to the latest HEAD automatically (#14585) 2023-04-10 15:00:12 +08:00
xumia
09bd333b63
[Build] Fix the reproducible build variable display error in the slave container (#14543)
Why I did it
Enable the reproducible build for PR build for master branch

Fix the reproducible build variable display error in the slave container.
The below config is none, although the config is set and takes effect.

"SONIC_VERSION_CONTROL_COMPONENTS": "none"
How I did it
Passing the variable through the slave container command line.
The variable has been passed to the slave container and the other docker container by a config file, it is only used to display the value during the build.

How to verify it
See https://dev.azure.com/mssonic/build/_build/results?buildId=247960&view=logs&j=88ce9a53-729c-5fa9-7b6e-3d98f2488e3f&t=88f376cf-c35d-5783-0a48-9ad83a873284

"SONIC_VERSION_CONTROL_COMPONENTS": "deb,py2,py3,web,git,docker"
2023-04-10 14:56:30 +08:00
Konstantin Vasin
1bf50a5566
[Build] use snapshots of debian mirrors for sonic-slave containers #14400
Why I did it
We don't use snapshots of debian mirrors for sonic-slave containers even if MIRROR_SNAPSHOT is enabled.

How I did it
Export MIRROR_SNAPSHOT in Makefile.work to generate sources.list for sonic-slave containers using debian snapshot mirror

How to verify it
2023-04-10 09:15:10 +08:00
Aryeh Feigin
41a9813018
Finalize fast-reboot in warmboot finalizer (#14238)
- Why I did it
To solve an issue with upgrade with fast-reboot including FW upgrade which has been introduced since moving to fast-reboot over warm-reboot infrastructure.
As well, this introduces fast-reboot finalizing logic to determine fast-reboot is done.

- How I did it
Added logic to finalize-warmboot script to handle fast-reboot as well, this makes sense as using fast-reboot over warm-reboot this script will be invoked. The script will clear fast-reboot entry from state-db instead of previous implementation that relied on timer. The timer could expire in some scenarios between fast-reboot finished causing fallback to cold-reboot and possible crashes.

As well this PR updates all services/scripts reading fast-reboot state-db entry to look for the updated value representing fast-reboot is active.

- How to verify it
Run fast-reboot and check that fast-reboot entry exists in state-db right after startup and being cleared as warm-reboot is finalized and not due to a timer.
2023-04-09 16:59:15 +03:00
mssonicbld
e32624d362
[ci/build]: Upgrade SONiC package versions (#14571) 2023-04-08 18:00:30 +08:00
mssonicbld
95fb9ee637
[submodule] Update submodule to the latest HEAD automatically (#14525) 2023-04-08 17:05:31 +08:00
Stephen Sun
152148fb81
Enhance the error message output mechanism (#14384)
#### Why I did it

Enhance the error message output mechanism during swss docker creating

#### How I did it

Capture the output to stderr of `sonic-cfggen` and output it using `echo` to make sure the error message will be logged in syslog.

#### How to verify it

Manually test
2023-04-07 14:23:35 -07:00
Lior Avramov
71f2a6a3a9
Add teamd patches to solve traffic loss issue when removing port from LAG (#14002)
#### Why I did it
When removing port from LAG while traffic is running thorough LAG there is traffic disruption of 60 seconds.
Fix issue https://github.com/sonic-net/sonic-buildimage/issues/14381

#### How I did it
The patch I added introduces "port_removing" op and call it right before Kernel is asked to remove the port. 
Implement the op in LACP runner to disable the port which leads to proper LACPDU send.

#### How to verify it
Set LAG between 2 switches.
Set LAGs to be router port and set ip address.
In switch A send ping to ip address of LAG in switch B.
In switch B, while ping is running remove port from LAG.
Verify ping is not stopping.
2023-04-07 14:15:19 -07:00
Stephen Sun
3b5871f7f8
Fix issue: wrong teamd link watch state after warm reboot (#14084)
#### Why I did it

Fix issue: wrong teamd link watch state after warm reboot due to TEAM_ATTR_PORT_CHANGED lost

The flag TEAM_ATTR_PORT_CHANGED is maintained by kernel team driver:
- a flag "changed" is maintained in struct team_port struct
- the flag is set by __team_port_change_send once relevant information is updated, including port linkup (together with speed, duplex), adding or removing
- the flag is cleared by team_nl_fill_one_port_get once the updated information has been notified to user space via RTNL

In the userspace, the change flag is maintained by libteam in struct team_port.
The team daemon calls port_priv_change_handler_func on receiving port change event.
The logic in port_priv_change_handler_func
1. creates the port if it did not exist, which triggers port add event and eventually calls lacp_port_added callback.
2. triggers port change event if team_port->changed is true, which eventually calls lw_ethtool_event_watch_port_changed to update port state for link watch ethtool.
3. removes the port if team_port->removed is removed

In lacp_port_added, it calls team_refresh to refresh ifinfo, port info, and option info from the kernel via RTNL.
In this step, port_priv_change_handler_func is called recursively.
- In the inner call, it won't get TEAM_ATTR_PORT_CHANGED flag because kernel has already notified that.
- As a result, team_port->changed flag is cleared in the libteam.
- The port change event won't be triggered from either inner or outer call of port_priv_change_handler_func.

If the port has been up when the port is being added to the team device, the "port up" information is carried in the outer call but will be lost.

In case the flag TEAM_ATTR_PORT_CHANGED is set only in the inner call, function port_priv_change_handler_func can be called in the inner call.
However, it will fail to fetch "enable" options because option_list_init has not be called.

Signed-off-by: Stephen Sun <stephens@nvidia.com>

#### How I did it

Fix:
Do not call check_call_change_handlers when parsing RTNL function is called from another check_call_change_handlers recursively.

#### How to verify it

- Manually test
- Regression test
  - warm reboot
  - warm reboot sad lag
  - warm reboot sad lag member
  - warm reboot sad (partial)
2023-04-07 14:13:33 -07:00
Devesh Pathak
d74055e12c
Increase wait_for_tunnel() timeout to 90s (#14279)
Why I did it
Orchagent sometimes take additional time to execute Tunnel tasks. This cause write_standby script to error out and mux state machines are not initialized. It results in show mux status missing some ports in output.

Mar 13 20:36:52.337051 m64-tor-0-yy41 INFO systemd[1]: Starting MUX Cable Container...
Mar 13 20:37:52.480322 m64-tor-0-yy41 ERR write_standby: Timed out waiting for tunnel MuxTunnel0, mux state will not be written
Mar 13 20:37:58.983412 m64-tor-0-yy41 NOTICE swss#orchagent: :- doTask: Tunnel(s) added to ASIC_DB.
How I did it
Increase timeout from 60s to 90s

How to verify it
Verified that mux state machine is initialized and show mux status has all needed ports in it.
2023-04-07 11:30:58 +08:00
xumia
6e43b5c515
[Build] Support to use the snapshot mirror for debian base image (#14474)
Why I did it
[Build] Support to use the snapshot mirror for debian base image

How I did it
If the MIRROR_SNAPSHOT=n, then use the default mirror http://deb.debian.org/debian
If the MIRROR_SNAPSHOT=y, then use the snapshot mirror, for instance http://packages.trafficmanager.net/snapshot/debian/20230330T000330Z/.

How to verify it
+ scripts/build_debian_base_system.sh amd64 bullseye ./fsroot-vs
I: Target architecture can be executed
I: Retrieving InRelease 
I: Checking Release signature
I: Valid Release signature (key id A4285295FC7B1A81600062A9605C66F00D6C9793)
I: Retrieving Packages 
I: Validating Packages 
I: Resolving dependencies of required packages...
I: Resolving dependencies of base packages...
I: Checking component main on http://packages.trafficmanager.net/snapshot/debian/20230331T000125Z...
I: Retrieving libacl1 2.2.53-10
2023-04-07 11:05:51 +08:00
xumia
46cb2ad03d
[Ci] Fix the wrong SONIC_BUILD_JOBS build variable used issue in Azp (#14071)
Why I did it
[Ci] Fix the no parallel jobs in some of the platforms issue
We observed some of the pipelines running more time than expected. The issue is the SONIC_BUILD_JOBS using the wrong value 1. It is caused by the runtime variable issue, there is additional single quota mark character added in the make command line.

make 'SONIC_BUILD_JOBS=$(nproc)' targe/xxxx
Need to change to

make SONIC_BUILD_JOBS=$(nproc) targe/xxxx
It is to improve the build performance for some of the platforms using the variable SONIC_BUILD_JOBS=1.
Good one vs: https://dev.azure.com/mssonic/build/_build/results?buildId=227986&view=logs&j=cef3d8a9-152e-5193-620b-567dc18af272&t=cf595088-5c84-5cf1-9d7e-03331f31d795

"SONIC_BUILD_JOBS"                : "8"
Bad one barefoot: https://dev.azure.com/mssonic/build/_build/results?buildId=227379&view=logs&j=993d6e22-aeec-5c03-fa19-35ecba587dd9&t=7be0d2ec-661f-5569-462c-2d9b7ca4ca5d

"SONIC_BUILD_JOBS"                : "1"
How I did it
Expand the BUILD_OPTIONS variable for all platforms.
2023-04-07 09:35:02 +08:00
Ying Xie
737d0e57ad
[write standby] force DB connections to use unix socket to connect (#14524)
Why I did it
At service start up time, there are chances that the networking service is being restarted by interface-config service. When that happens, write_standby could fail to make DB connections due to loopback interface is being reconfigured.

How I did it
Force the db connector to use unix socket to avoid loopback reconfig timing window.

How to verify it
Run config reload test 20+ times and no issue encountered.
Signed-off-by: Ying Xie <ying.xie@microsoft.com>

* use unix socket instead

Signed-off-by: Ying Xie <ying.xie@microsoft.com>
2023-04-06 13:54:56 -07:00
Kuanyu Chen
cffd87a627
Add monit_snmp file to monitor memory usage (#14464)
#### Why I did it

When CPU is busy, the sonic_ax_impl may not have sufficient speed to handle the notification message sent from REDIS.
Thus, the message will keep stacking in the memory space of sonic_ax_impl.

If the condition continues, the memory usage will keep increasing.

#### How I did it

Add a monit file to check if the SNMP container where sonic_ax_impl resides in use more than 4GB memory.
If yes, restart the sonic_ax_impl process.

#### How to verify it

Run a lot of this command: `while true; do ret=$(redis-cli -n 0 set LLDP_ENTRY_TABLE:test1 test1); sleep 0.1; done;`
And check the memory used by sonic_ax_impl keeps increasing.

After a period, make sure the sonic_ax_impl is restarted when the memory usage reaches the 4GB threshold.
And verify the memory usage of sonic_ax_impl drops down from 4GB.
2023-04-06 12:19:11 -07:00