Commit Graph

690 Commits

Author SHA1 Message Date
Stepan Blyshchak
43c82ebfb0
[201911][nvidia] Fix broken FW links (#16721)
Signed-off-by: Stepan Blyschak <stepanb@nvidia.com>
2023-11-20 10:09:02 -08:00
Feng-msft
a5043bfc84
Fix monit false alarm issue, which locates in process_checker and it (#16907)
Fix a monit false alarm in process_checker, which missed the "disk-sleep" status check, so some 201911 SONiC boxes occasionally reported a "pmon|sensord" error.

#### Why I did it
Currently the psutil library returns the following detailed process statuses:
running: The process is currently running.
sleeping: The process is sleeping or waiting for an event to occur.
disk-sleep: The process is waiting for I/O operations to complete.
stopped: The process has been stopped (e.g. via the SIGSTOP signal).
zombie: The process has terminated but is still listed in the process table.
dead: The process has terminated and has been removed from the process table.

We should regard running/sleeping/disk-sleep as normal cases and not raise an alert from monit for them.

Currently, whenever disk-sleep occurs during a monit cycle, the syslog errors below are raised and trigger a page, so this change also gets rid of that syslog output.

syslog.2.gz:Feb 24 06:12:17.394619 MEL23-0101-0301-04T1 ERR monit[6040]: 'pmon|sensord' status failed (1) -- '/usr/sbin/sensord -f daemon' is not running in host
syslog.2.gz:Feb 24 06:13:17.932531 MEL23-0101-0301-04T1 ERR monit[6040]: 'pmon|sensord' status failed (1) -- '/usr/sbin/sensord -f daemon' is not running in host
syslog.2.gz:Feb 24 06:14:18.502505 MEL23-0101-0301-04T1 ERR monit[6040]: 'pmon|sensord' status failed (1) -- '/usr/sbin/sensord -f daemon' is not running in host

Then I tried to reproduce the issue by triggering process_checker for sensord frequently and observed it's under "disk-sleep" status once the alert is raised.

##### Work item tracking
- Microsoft ADO **(number only)**:17663589

#### How I did it
Fix the process_checker script by adding handling for the "disk-sleep" case.
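
The status check described above can be sketched roughly as follows. This is a minimal illustration, not the actual process_checker code: the helper name `is_process_healthy` is hypothetical, and it assumes the status strings are the ones psutil reports (for example via `psutil.Process(pid).status()`), as listed in this commit message.

```python
# Statuses that this commit treats as "normal" for a monitored process.
# "disk-sleep" is the case that was previously missing.
HEALTHY_STATUSES = {"running", "sleeping", "disk-sleep"}


def is_process_healthy(status):
    """Return True if a psutil-style status string should not raise an alert.

    Hypothetical helper for illustration; the real script may be structured
    differently.
    """
    return status in HEALTHY_STATUSES
```

With this check, a process that is briefly waiting on I/O during a monit cycle is no longer reported as "not running", while stopped, zombie and dead processes still are.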

#### How to verify it
Verified in local DUT.
2023-10-26 18:23:24 -07:00
Tejaswini Chadaga
42597806a9
[201911][multi-asic] Monit changes to enable internal link monitoring script (#16393)
Monit changes to enable script to monitor SAI_PORT_STAT_IF_IN_ERRORS & SAI_PORT_STAT_IF_OUT_ERRORS on internal (backend) ports of multi-asic device.
2023-09-12 15:57:13 -07:00
Stepan Blyshchak
73647be598
[201911][mlnx-ffb.sh] Update issu-version location (#14928)
ISSU version check fails due to inability to mount squashfs from 202211 on 201911
2023-06-21 11:00:21 -07:00
xumia
7aeb5d46ce
[Build][201911] Fix the stretch/jessie mirror removed issue (#15083)
[Build] Fix the stretch/jessie mirror removed issue.
2023-05-17 22:52:26 -07:00
Stepan Blyshchak
16cce80e20
[201911][Mellanox] Place FW binaries under platform directory instead of squashfs (#14270)
202211 and above use a different squashfs compression type that the 201911 kernel cannot handle. Therefore, we avoid mounting squashfs altogether with this change.
2023-04-10 19:00:12 -07:00
Hua Liu
e5c6c2f9b9
Improve sudo cat command for RO user. (#14428) (#14437)
Improve sudo cat command for RO user.
Manually cherry-pick for #14428
2023-04-05 15:29:58 -07:00
Samuel Angebault
a13d460bbb
[201911][Arista] Disable ATA NCQ for a few products (#14468)
Why I did it
Some products might experience an occasional IO failure in the communication between the CPU and the SSD.
Based on some research, it could be attributable to some devices not handling ATA NCQ (Native Command Queue).

This issue currently affects the following products:

DCS-7170-32C*
DCS-7170-64C
DCS-7060DX4-32
DCS-7260CX3-64
DCS-7050CX3-32S

How I did it
This change disables NCQ on the affected drive for a small set of products.

How to verify it
When the fix is applied, these 2 patterns can be found in the dmesg.
ata[0-9]+.00: FORCE: horkage modified (noncq)
NCQ (not used)

Test results using: fio --direct=1 --rw=randrw --bs=64k --ioengine=libaio --iodepth=64 --runtime=120 --numjobs=4

with NCQ (ata1.00: 61865984 sectors, multi 1: LBA48 NCQ (depth 32), AA)

   READ: bw=33.9MiB/s (35.6MB/s), 33.9MiB/s-33.9MiB/s (35.6MB/s-35.6MB/s), io=4073MiB (4270MB), run=120078-120078msec
  WRITE: bw=34.1MiB/s (35.8MB/s), 34.1MiB/s-34.1MiB/s (35.8MB/s-35.8MB/s), io=4100MiB (4300MB), run=120078-120078msec
without NCQ (ata1.00: 61865984 sectors, multi 1: LBA48 NCQ (not used))

   READ: bw=31.7MiB/s (33.3MB/s), 31.7MiB/s-31.7MiB/s (33.3MB/s-33.3MB/s), io=3808MiB (3993MB), run=120083-120083msec
  WRITE: bw=31.9MiB/s (33.4MB/s), 31.9MiB/s-31.9MiB/s (33.4MB/s-33.4MB/s), io=3830MiB (4016MB), run=120083-120083msec
2023-04-02 14:03:21 -07:00
Nazarii Hnydyn
29ef4ee83e
[201911] [Mellanox] Add BIOS upgrade infra (#13623)
- Why I did it
Added BIOS upgrade infra

- How I did it
Added new make target

- How to verify it
Copy msn3800_bios.tar.gz to platform/mellanox/bios
make configure PLATFORM=mellanox
make target/files/stretch/msn3800_bios.tar.gz

Signed-off-by: Nazarii Hnydyn <nazariig@nvidia.com>
2023-02-13 19:49:23 +02:00
Devesh Pathak
4dba276094
Fix to improve hostname handling (#12064)
* Fix to improve hostname handling
If config_db.json is missing the hostname entry, hostname-config.sh ends
up deleting the existing entry too and the hostname changes to the default 'localhost'

* default hostname to 'sonic' if missing in config file
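
A minimal sketch of the fallback described above, assuming the hostname is stored under `DEVICE_METADATA` / `localhost` / `hostname` in config_db.json. The function name `resolve_hostname` is hypothetical; the actual change lives in the hostname-config.sh shell script.

```python
def resolve_hostname(config_db, default="sonic"):
    """Pick the hostname from a parsed config_db.json dict.

    Falls back to `default` when the entry is missing or empty, instead of
    dropping the existing hostname. The key layout is an assumption.
    """
    metadata = config_db.get("DEVICE_METADATA", {}).get("localhost", {})
    hostname = metadata.get("hostname")
    return hostname if hostname else default
```

The point of the default is that a config file without a hostname entry no longer leaves the box named 'localhost'.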
2023-01-30 18:39:09 +00:00
Prince George
cd2bb08545
Close console session due to user inactivity (#9890)
Signed-off-by: Prince George <prgeor@microsoft.com>
2023-01-30 18:36:11 +00:00
arheneus@marvell.com
fc1295bdcc
[ntp][apparmor] Allow apparmor read permission for ntpd under rw mount path of rootfs (#6040)
Certain platform-specific packages (sonic-platform-xyz) install files onto the rootfs, which are placed on the read-write mount path under /host/image-name/rw/...
When ntpd starts, it tries to read /usr/bin, /usr/sbin/ and /usr/local/bin, which in turn link to the read-write mount path as well.
As a result, ntpd triggers the AppArmor warning messages below.

LOG:-
audit: type=1400 audit(1606226503.240:21): apparmor="DENIED" operation="open" profile="/usr/sbin/ntpd" name="/image-HEAD-dirty-20201111.173951/rw/usr/local/bin/" pid=3733 comm="ntpd" requested_mask="r" denied_mask="r" fsuid=0 ouid=0
audit: type=1400 audit(1606226503.240:22): apparmor="DENIED" operation="open" profile="/usr/sbin/ntpd" name="/image-HEAD-dirty-20201111.173951/rw/usr/sbin/" pid=3733 comm="ntpd" requested_mask="r" denied_mask="r" fsuid=0 ouid=0
audit: type=1400 audit(1606226503.240:23): apparmor="DENIED" operation="open" profile="/usr/sbin/ntpd" name="/image-HEAD-dirty-20201111.173951/rw/usr/bin/" pid=3733 comm="ntpd" requested_mask="r" denied_mask="r" fsuid=0 ouid=0

Fix:
Add rw/.. mount path similar to root path access provided for ntpd in /etc/apparmor.d/usr.sbin.ntpd

Signed-off-by: Antony Rheneus <arheneus@marvell.com>
2022-10-16 05:42:35 +00:00
xumia
37fa1014ad
[201911] Change submodule path from Azure to sonic-net (#12313)
Why I did it
Change the path of sonic submodules that point to "Azure" to point to "sonic-net"

How I did it
Replace "Azure" with "sonic-net" on all relevant paths of sonic submodules
2022-10-12 21:07:22 +08:00
Sudharsan Dhamal Gopalarathnam
e716b453b3
[201911][mellanox] Add CPLD update for SN2700 (#12173)
* [mellanox]: Add CPLD update for SN2700 (#3570)

* [mellanox]: Add CPLD update for SN2700.

Signed-off-by: Nazarii Hnydyn <nazariig@mellanox.com>

* Updating cpld file

* Updating file path for cpld

* Updating archive

Signed-off-by: Nazarii Hnydyn <nazariig@mellanox.com>
Co-authored-by: Nazarii Hnydyn <nazariig@mellanox.com>
2022-09-26 07:58:28 -07:00
Vivek
7781399bb6
[201911][Mellanox] Collect MST dump before syncd restart on shutdown notification (#11742)
- Why I did it
Collecting MST dump before syncd restart on shutdown notification during a SAI failure

Dump can be found under:
root@sonic:/home/admin# ls -l /var/dump/mstdump/
total 10684
-rw-r--r-- 1 root root 5460332 Aug 15 18:41 mstdump_20220815_184143.tar.gz
-rw-r--r-- 1 root root 5473253 Aug 15 21:46 mstdump_20220815_214642.tar.gz

root@sonic:/home/admin# tar -xvzf /var/dump/mstdump/mstdump_20220815_214642.tar.gz
├── ir-gdb
│   └── core
└── mstdump
    ├── mstdump1
    ├── mstdump2
    ├── mstdump3
    └── mststatus

- How I did it
Checked for shutdown notification log in sairedis and used it to determine whether the shutdown is normal or due to SAI failure

- How to verify it
Simulated a SAI failure event and verified that the dump is generated. Also verified that the dump is not generated in different reboot and config reload scenarios.

Signed-off-by: Vivek Reddy <vkarri@nvidia.com>
2022-08-29 16:09:26 +03:00
Sujin Kang
61a34fcf22
[201911] Add hardware reboot cause when software reboot failed (#11753)
Why I did it
Add the hardware reboot cause when the previous software reboot failed

How I did it
Check both hardware reboot cause and software reboot cause.
Add the hardware reboot as actual reboot cause
if any hardware reboot cause is available for any software reboot.
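
The selection rule above can be sketched as below. This is only an illustration under stated assumptions: the function name and the use of `None` for "no cause available" are hypothetical, and the real reboot-cause code may represent causes differently.

```python
def actual_reboot_cause(hardware_cause, software_cause):
    """Choose which reboot cause to report.

    If a hardware reboot cause is available, it is reported as the actual
    cause even when a software reboot was requested (the software reboot
    presumably failed). Otherwise the software cause is reported.
    `None` means "no cause available" in this sketch.
    """
    if hardware_cause is not None:
        return hardware_cause
    return software_cause
```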

How to verify it
Perform reboots and verify the reboot-cause
2022-08-25 12:30:53 -07:00
Ying Xie
db5b9ee834
[warm boot finalizer] only wait for enabled components to reconcile (#6454)
* [warm boot finalizer] only wait for enabled components to reconcile

Define the component with its associated service. Only wait for components that have associated service enabled to reconcile during warm reboot.
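
As a rough sketch of that idea: each component is mapped to its service, and only components whose service is enabled are waited for. The function name and the component and service names in the usage note are hypothetical, and the actual finalizer is a shell script.

```python
def components_to_wait_for(component_services, enabled_services):
    """Return the components whose associated service is enabled.

    component_services: dict mapping component name -> service name
    enabled_services:   set of service names that are enabled on the device
    """
    return [
        component
        for component, service in component_services.items()
        if service in enabled_services
    ]
```

For example, with a mapping of `{"orchagent": "swss", "natsyncd": "nat"}` and only `swss` enabled, the finalizer would wait for `orchagent` alone rather than timing out on a component that never starts.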

Signed-off-by: Ying Xie <ying.xie@microsoft.com>
2022-03-31 12:01:25 -07:00
Samuel Angebault
520a13ba72
[201911][Arista] Add emmc quirks for Upperlake (#9971)
Why I did it
Fix some unreliability seen on emmc device with some AMD CPUs

How I did it
Added a kernel parameter to add quirks to
It depends on a sonic-linux-kernel change to work properly but will be a no-op without it.

Description for the changelog
Add emmc quirks for Upperlake
2022-02-11 13:28:11 -08:00
Renuka Manavalan
eda84d2209
Invoke disk check periodically (#7374)
Helps with periodic scan of disk for RO state.
If found, this script makes a transient fix and raises an error message.
2021-11-19 16:45:21 -08:00
Sumukha Tumkur Vani
94577ba2ed
Flush RESTAPI DB upon config reload (#9092)
2021-10-28 16:06:28 -07:00
abdosi
d1f659689e
Logrotate for wtmp and btmp files to fix size getting too large. (#8744)
Signed-off-by: Abhishek Dosi <abdosi@microsoft.com>
2021-09-14 17:39:02 -07:00
Stephen Sun
17948d0e4c
[docker-orchagent][201911] Pass ASIC vendor information to swss docker as docker level environment variable (#8274)
#### Why I did it
Recently, the reserved buffer of admin-down ports is going to be reclaimed.
However, the way to do this differs among vendors.
We need to find a way to pass vendor information to swss docker.

#### How I did it
Fetch the ASIC vendor information when the docker is created and pass it to the docker as environment variable `ASIC_VENDOR`.
2021-09-13 01:47:56 -07:00
Renuka Manavalan
8cd6714ef4
hostcfgd: Handle missed tacacs updates between load & listen (#8223)
Why I did it
The time gap between the last config load & db-listen seems to have increased.
Any config updates that occur in this gap get missed by db-listen.
This could miss updating /etc/pam.d/common-auth-sonic

How I did it
Add a one shot timer, just before db-listen. The timer will fire after the subscribe is done
When the timer fires, reload tacacs & aaa
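
A minimal sketch of the one-shot timer pattern described above, using Python's standard `threading.Timer`. The helper name, the delay value, and the `reload_fn` callback are hypothetical stand-ins; the actual hostcfgd change may wire this up differently.

```python
import threading


def schedule_one_shot_reload(reload_fn, delay_seconds):
    """Start a timer that calls reload_fn exactly once after delay_seconds.

    The timer is meant to be armed just before entering the blocking
    db-listen loop, so that it fires after the subscription is in place and
    picks up any updates that arrived between config load and listen.
    """
    timer = threading.Timer(delay_seconds, reload_fn)
    timer.daemon = True  # do not keep the process alive just for the timer
    timer.start()
    return timer
```

In use, `reload_fn` would re-read the TACACS and AAA tables and regenerate the PAM configuration, after which normal event handling continues.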
2021-08-06 10:38:37 -07:00
xumia
e4a4cfed98
Fix vtysh shell-ingestion security issue (#8022)
Why I did it
Fix vtysh shell-ingestion security issue
Only expose the limited parameters of the command vtysh show.
2021-06-30 19:34:55 +08:00
Renuka Manavalan
3ea38a9788
Add service to restore TACACS from old config (#7560) (#7865)
In upgrade scenarios where config_db.json is not carried forward to the new image, the device could be left without TACACS credentials.
Added a service to trigger 5 minutes after boot and restore TACACS, if /etc/sonic/old_config/tacacs.json is present.

How I did it
By adding a service that fires 5 minutes after boot.
This service applies TACACS if available.

How to verify it
Upgrade and watch status of tacacs.timer & tacacs.service
You may create /etc/sonic/old_config/tacacs.json with updated credentials
(within 5 minutes after boot) and see that they appear in the config and are persisted too.
2021-06-15 10:52:31 -07:00
Kuanyu Chen
c4f8cf9371
[config-setup]: Fix a bug in checking if updategraph is enabled (#7093)
Encounter error during "config-setup boot" if the updategraph is enabled.

How I did it
Correct the code inside the config-setup script.
Remove the spaces around the assignment operator.

How to verify it
Remove the /etc/sonic/config_db.json and reboot the device.
Originally, it returned the following error after boot up:
rv: command not found
After modification, it can correctly parse the status of updategraph without error.
2021-05-31 08:11:08 -07:00
xumia
7aa8a021ea
Support readonly vtysh for sudoers (#7383) (#7572)
* Support readonly vtysh for sudoers (#7383)

Why I did it
Support readonly version of the command vtysh

How I did it
Check that the command starts with "show", and verify that the script is given only a single command.

* Fix the type issue in rvtysh
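
The read-only check can be sketched roughly as follows. This is an illustration only: the real rvtysh is a script whose exact logic is not shown here, the function name is hypothetical, and the assumption is that commands are passed with vtysh's `-c` option.

```python
def is_readonly_vtysh(args):
    """Return True if the argument list is a single 'show ...' command.

    args is the list of arguments that would be passed to vtysh, for example
    ["-c", "show ip bgp summary"]. Anything with zero or multiple -c commands,
    or a command that does not start with "show", is rejected.
    """
    commands = [args[i + 1] for i, arg in enumerate(args[:-1]) if arg == "-c"]
    if len(commands) != 1:
        return False
    return commands[0].strip().startswith("show")
```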
2021-05-19 09:02:16 +08:00
yozhao101
24e1cde1e6
[201911][Monit] Restart telemetry container if memory usage is beyond the threshold (#7618)
This PR aims to monitor the memory usage of streaming telemetry container and restart streaming telemetry container if memory usage is larger than the pre-defined threshold.
2021-05-17 16:51:13 -07:00
yozhao101
f0bbd9d1e9
[Monit] Install the unit of generate_monit_config.service. (#7558)
Signed-off-by: Yong Zhao yozhao@microsoft.com

Why I did it
The service file generate_monit_config.service is used to generate the Monit configuration file from template. I also should install this service file and enable it.

How I did it
I appended this service file name at the end of /etc/sonic/generated_services.conf.

How to verify it
I verified this on the device str2-7260cx3-acs-1.

Which release branch to backport (provide reason below if selected)
[ ] 201811
[x] 201911
[ ] 202006
[ ] 202012
2021-05-07 13:37:09 -07:00
yozhao101
a8d2d0b5cd
[201911][Monit] Monitor critical processes in PMon container. (#7438)
Signed-off-by: Yong Zhao yozhao@microsoft.com

Why I did it
This PR aims to monitor the critical processes in PMon container by Monit in 201911 branch.

How I did it
I created a template configuration file of Monit and it will be rendered to generate Monit configuration file of PMon container
by a service generate_monit_config.service.

How to verify it
I verified this on a Mellanox device str-msn2700-03 and an Arista device str-a7050-acs-1.

Which release branch to backport (provide reason below if selected)
[ ] 201811
[x] 201911
[ ] 202006
[ ] 202012
2021-04-28 17:12:21 -07:00
yozhao101
528543bc6a
[201911][Monit] Monitor critical processes in radv and dhcp_relay containers. (#7340)
Signed-off-by: Yong Zhao yozhao@microsoft.com

Why I did it
This PR aims to monitor critical processes in router advertiser and dhcp_relay containers by Monit.

How I did it
The router advertiser container only runs on T0 devices, and the T0 device must have at least one VLAN interface configured with an IPv6 address. In addition, the router advertiser container does not run on devices whose deployment type is 8.

As such, I created a service which dynamically generates the Monit configuration file of the router advertiser from a template.

Similarly, the Monit configuration file of dhcp_relay is also generated from a template, since the number of dhcrelay processes in the dhcp_relay container depends on the number of VLANs.
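
The conditions under which the router advertiser is expected to run can be sketched as below. The function name, the argument shapes, and the use of "ToRRouter" as the device type string for a T0 device are assumptions made for illustration; the actual service renders a Monit template rather than calling a function like this.

```python
import ipaddress


def should_monitor_radv(device_type, deployment_id, vlan_interfaces):
    """Decide whether Monit should monitor the router advertiser.

    device_type:     device type string ("ToRRouter" assumed to mean T0)
    deployment_id:   deployment type as a string
    vlan_interfaces: dict mapping VLAN interface name -> list of "addr/prefix"
    """
    has_ipv6_vlan = any(
        ipaddress.ip_interface(address).version == 6
        for addresses in vlan_interfaces.values()
        for address in addresses
    )
    return device_type == "ToRRouter" and deployment_id != "8" and has_ipv6_vlan
```

Generating the Monit configuration only when this condition holds avoids alerts for a container that is not supposed to be running.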

How to verify it
I verified this implementation on a DuT.
2021-04-16 08:40:06 -07:00
pra-moh
e1eb1bda59
[201911][procdockerstatsd] fix typo for variable name (#7183)
2021-03-29 19:22:03 -07:00
pra-moh
afe548b61a
[201911][procdockerstatsd] Add missing unit conversion (#7157)
Fixing the same issue in 201911 as mentioned in #7151
2021-03-26 10:24:02 -07:00
Volodymyr Samotiy
fd22b3bcee
[monit] Periodically monitor VNET route consistency (#7078)
To run VNET route consistency check periodically.

For any failure, the monit will raise alert based on return code.
The tool will log required details.
2021-03-25 07:24:59 -07:00
pra-moh
5f5644bb93
[201911][procdockerstatsd] Fix bug in procdockerstatsd (#7073)
Fix incorrect variable name
2021-03-16 18:41:45 -07:00
pra-moh
bd07256bfd
[201911][procdockerstatsd] Fix unit conversion for docker stats (#7063)
A bug exists in the 201911 branch where unit conversion for docker stats is incorrect. Both the MiB and GiB to bytes conversions are incorrect.
Example:
admin@str-s6000-acs-10:/usr/bin$ docker stats --no-stream -a
CONTAINER ID   NAME             CPU %    MEM USAGE / LIMIT     MEM %   NET I/O   BLOCK I/O    PIDS
e958c81d27a8   mgmt-framework   0.00%    0B / 0B               0.00%   0B / 0B   0B / 0B      0
9b6b7b4361d5   telemetry        3.13%    86.31MiB / 7.785GiB   1.08%   0B / 0B   0B / 106kB   30
e7fee0b617fe   snmp             70.28%   57.03MiB / 7.785GiB   0.72%   0B / 0B   0B / 102kB   9

admin@str-s6000-acs-10:/usr/bin$ redis-cli -n 6 hgetall "DOCKER_STATS|e7fee0b617fe"

"MEM%"
"0.72"
"MEM_LIMIT_BYTES"
"8359080099840"
"NAME"
"snmp"
"NET_OUT_BYTES"
"0"
"MEM_BYTES"
"5980028928"
"BLOCK_OUT_BYTES"
"102000"
"NET_IN_BYTES"
"0"
"BLOCK_IN_BYTES"
"0"
"PIDS"
"9"
"CPU%"
"5.96"
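
For comparison, a correct conversion of the values above can be sketched as follows. This is not the procdockerstatsd code; the function name is hypothetical, and the unit table assumes that docker stats prints memory in binary units (KiB/MiB/GiB) and I/O in decimal units (kB/MB/GB). With it, 57.03MiB comes out as roughly 59.8 million bytes, whereas the stored MEM_BYTES value above is 100 times larger.

```python
import re

# Multipliers for the unit suffixes (assumed set of suffixes).
_UNIT_FACTORS = {
    "B": 1,
    "kB": 1000, "MB": 1000 ** 2, "GB": 1000 ** 3,
    "KiB": 1024, "MiB": 1024 ** 2, "GiB": 1024 ** 3,
}


def to_bytes(value):
    """Convert a docker-stats style size such as '57.03MiB' to whole bytes."""
    match = re.fullmatch(r"([0-9]*\.?[0-9]+)\s*([A-Za-z]+)", value.strip())
    if match is None:
        raise ValueError("unrecognized size: %r" % value)
    number, unit = match.groups()
    if unit not in _UNIT_FACTORS:
        raise ValueError("unknown unit: %r" % unit)
    # Multiply the decimal number by the unit factor, then drop the fraction.
    return int(float(number) * _UNIT_FACTORS[unit])
```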
2021-03-16 05:54:19 -07:00
abdosi
ab05a2f58a
Add support for BGP Monitors on multi asic SONiC platforms. (#6977)
This PR is cherry-pick of master
https://github.com/Azure/sonic-buildimage/pull/6920

Why I did it
Add support for BGP Monitors on multi asic SONiC platforms.

How I did it
On multi ASIC SONiC platforms, BGP monitor session will be established from Backend ASIC.
To achieve this, the following changes were made:

Add BGP monitor configuration on the backend ASIC.
The BGP monitor configuration is present in the DPG of the device in minigraph.xml of multi-ASIC device, so this configuration will be added to the config_db of the host, when the minigraph is loaded.
To add configuration for this in the Backend ASIC, a new class MultiAsicBgpMonCfg is added to the hostcfgd service to update the config_db of the backend ASIC when the BGP_MONITOR table of the host config_db is updated.
This way incremental BGP_MONITOR configuration can also be handled.

Changes to establish BGP session with bgp monitor.

Add a route in the host main routing table to go to one of the pre-defined backend ASICs.
Add an iptables rule on the front ASIC to mark BGP packets whose destination is the IPv4 Loopback.
Add an IP rule in the front ASIC namespace to match marked BGP packets and look up the default table.
Program the default route in the frontend ASIC namespace docker default table as part of start.sh of the BGP container.
It needs to be done as part of start.sh; otherwise the FRR default route will get overwritten.
How to verify it

Signed-off-by: Abhishek Dosi <abdosi@microsoft.com>
Co-authored-by: Arvind <arlakshm@microsoft.com>
2021-03-06 21:21:52 -08:00
Qi Luo
32e3cd9454
Revert "[monit] Periodically monitor VNET route consistency (#6819)" (#6975)
This reverts commit 2c6be7e0f5.
Reverts #6819
2021-03-06 06:56:26 -08:00
Volodymyr Samotiy
2c6be7e0f5
[monit] Periodically monitor VNET route consistency (#6819)
To run VNET route consistency check periodically.

For any failure, the monit will raise alert based on return code.
The tool will log required details.
2021-03-05 13:15:19 -08:00
judyjoseph
b05a4f1c30
Port fix for https://github.com/Azure/sonic-buildimage/pull/6537 in 201911 (#6648)
The Portchannels were not getting cleaned up because the cleanup activity was taking more than 10 secs, which is the default docker timeout after which a SIGKILL is sent.

Fix Issue #6537
2021-02-26 17:16:33 -08:00
SuvarnaMeenakshi
9208dc507b
[multi-asic][vs]: Update topology script to retrieve hwsku from minigraph (#6219)
Update topology script to retrieve hwsku from minigraph
if hwsku information is not available in config_db.
Fix clean up of interfaces in msft_multi_asic_vs hwsku
topology script.
- Why I did it
When bringing up multi-asic VS switch, topology service is started during boot up.
Topology service starts a shell script which runs the topology script present in /usr/share/sonic/device// directory. To invoke hwsku specific script, the topology script tries to retrieve hwsku information from config_db.
During initial boot up config_db might not be populated. In order to start topology service before config_db is updated,
update topology script to get hwsku information from minigraph.xml if it is available.
This will be helpful to bring up multi-asic VS testbed by loading minigraph and starting topology service.
- How I did it
Update topology.sh script to retrieve hwsku information from minigraph.xml.
Fix clean up function in msft_multi_asic_vs topology script.
- How to verify it
single-asic VS - no change; topology service is only enabled for multi-asic VS.
multi-asic VS - Bring up multi-asic VS image, copy minigraph to vs image, start topology service. Topology service should be successful.
to test clean up function fix, start topology service - make sure interfaces are created and moved to the right namespaces.
stop topology service - make sure namespace do not have any interface and all front end interfaces are present in default namespace.
2021-02-25 18:42:44 -08:00
abdosi
1064cd5cd0
[multi-asic] Enhanced iptable default rules (#6765)
What I did:-

For multi-asic platforms, added an iptables v4 rule to communicate on the docker bridge IP.
For multi-asic platforms, extended the iptables v4 rule to iptables v6 as well.
For multi-asic platforms, made all internal rules applicable to all protocols (not filtered on tcp/udp). This is done to be consistent with the local host rule.
For multi-asic platforms, made the NAT rule (to forward traffic from namespace to host) generic for all protocols, and also use the source IP, if present, for matching.
2021-02-25 18:39:43 -08:00
arlakshm
5822b42fdb
[sudoers]: add ipintutil in sudoer file (#6857)
This PR is port of #6845 for 201911

show ip interfaces was recently enhanced to support multi-ASIC platforms in Azure/sonic-utilities#1437. The ipintutil script has to run as sudo user to get the IP interfaces from each namespace.
Add this script to the sudoer file so that show ip interface command is available for user with read-only permissions

Signed-off-by: Arvindsrinivasan Lakshmi Narasimhan <arlakshm@microsoft.com>
2021-02-23 13:26:53 -08:00
arlakshm
daecc34180
[201911][baseimage] Install pyroute2 for sonic-utilities (#6792)
Signed-off-by: Arvindsrinivasan Lakshmi Narasimhan <arlakshm@microsoft.com>

Install pyroute2 for sonic-utilities. This change is needed for Azure/sonic-utilities#1437
2021-02-22 23:30:27 -08:00
SuvarnaMeenakshi
9e777e90a0
[multi_asic][vs]: Add dependency in teamd service to start after topology service (#6594)
[multi_asic][vs]: Add dependency in teamd service to start after topology service.
- Why I did it
In multi-asic VS, topology service is run after database service to set up the internal asic topology.
swss and syncd have a dependency to start after the topology service has run, so that the interfaces are created in and moved to the right namespace. In the case of multi-asic VS, during the initial boot up, when no configuration has been added, the teamd service starts while swss/syncd do not start, because the topology service does not start. Upon loading configuration using config_db or minigraph, swss and syncd start up, but teamd is not restarted, as swss is not stopped and started. This leaves teamd in a bad state and requires a reload of config.

- How I did it
Add dependency in teamd service to start after topology service is completed.

- How to verify it
No change in single asic vs or platform.
No change in multi-asic regular image.
Change only in multi-asic VS. Bring up a multi-asic VS image without any configuration; the teamd service will fail to start due to dependency failure. Load minigraph, start topology service, load configuration, and ensure all services come up.
Signed-off-by: SuvarnaMeenakshi <sumeenak@microsoft.com>
2021-02-18 18:05:10 -08:00
shlomibitton
4a1742e839
Stop teamd service before syncd (#6756)
When a large number of port channels (more than 64) is configured, a config reload command might end with not all port channels configured and up. Further debugging shows that unloading the port channels in the ASIC driver takes a lot of time.
With this change, all port channels are deleted before the syncd restart, which frees resources sooner, lets the ASIC driver unload all netdevs quickly, and allows the operation to complete properly.
2021-02-18 15:48:11 -08:00
arlakshm
a750f89630
[multi asic] add ip netns identify command to sudoer (#6591)
Signed-off-by: Arvindsrinivasan Lakshmi Narasimhan <arlakshm@microsoft.com>

- Why I did it
The command sudo ip netns identify <pid> is used in the function get_current_namespace
to check whether the CLI command is running in host context or within a namespace.

This function is used for every CLI command and command sudo ip netns identify <pid> needs to be added in sudoer files to allow users with RO access to run show cli commands

This problem is not there on single asic platforms.

- How I did it
Add ip netns identify [0-9]* to sudoers file.
2021-02-02 10:32:59 -08:00
lguohan
fcf93dda12
[sonic-linux-kernel]: kernel security update to 4.9.246 (#6545)
* [sonic-linux-kernel]: kernel security update to 4.9.246
* [Arista] Update driver submodule (#60)
     Update kernel dependency to 4.9.0-14-2

Signed-off-by: Guohan Lu <lguohan@gmail.com>
Co-authored-by: Samuel Angebault <angebault.samuel@gmail.com>
2021-01-28 08:46:07 -08:00
arlakshm
3cd536bb45
[Multi Asic] support of swss.rec and sairedis.rec for multi asic (#6310)
Signed-off-by: Arvindsrinivasan Lakshmi Narasimhan arlakshm@microsoft.com

- Why I did it
This PR has the changes to support having different swss.rec and sairedis.rec for each asic.
The logrotate script is updated as well

- How I did it

Update the orchagent.sh script to use the logfile name options in these PRs(Azure/sonic-swss#1546 and Azure/sonic-sairedis#747)
In multi asic platforms the record files will be different for each asic, with the format swss.asic{x}.rec and sairedis.asic{x}.rec

Update the logrotate script for multiasic platform .
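
The naming scheme above can be illustrated with a small sketch. The helper name is hypothetical, and using `None` to mean a single-asic platform is an assumption of this example; the actual change is made in the orchagent.sh shell script.

```python
def record_file_names(asic_id=None):
    """Return (swss record file, sairedis record file) for one asic.

    On single-asic platforms (asic_id is None) the default names are kept.
    On multi-asic platforms each asic gets its own pair of files, following
    the swss.asic{x}.rec / sairedis.asic{x}.rec format.
    """
    if asic_id is None:
        return ("swss.rec", "sairedis.rec")
    return ("swss.asic%d.rec" % asic_id, "sairedis.asic%d.rec" % asic_id)
```

Keeping a separate pair of files per asic is also what the logrotate update has to account for, since each file needs to be rotated.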
2021-01-27 17:12:32 -08:00
abdosi
9779560b63
[baseimage]: Updates for Ebtables and support for multi-asic (#6542)
Following changes were done for ebtables:

- Support for multi-asic platforms. Ebtables filters are installed in the namespaces on multi-asic platforms, not on the host. On single-asic platforms they are installed on the host.

- On multi-asic platforms we don't want to install the filters on the host; otherwise namespace-to-namespace communication does not happen, since ARP requests are not forwarded.

- Updated to restore ebtables rules from a text file rather than the binary format. Rules are restored as part of database docker init instead of rc.local.

- Removed the ebtables service files for buster, as they are not needed now that filters are restored/installed as part of database docker init.
   All the binaries are pre-installed, and the ebtables* binaries are the same as ebtables-legacy-*.

Signed-off-by: Abhishek Dosi <abdosi@microsoft.com>
2021-01-27 16:59:10 -08:00