Commit Graph

393 Commits

Author SHA1 Message Date
Jing Zhang
ffd9e190e1 Update WARM START FINALIZER to wait for linkmgrd to reconcile (#11477)
Spanning from sonic-net/sonic-linkmgrd#76, this PR is to update warm restart finalizer to wait for linkmgrd to be reconciled.

sign-off: Jing Zhang zhangjing@microsoft.com

Why I did it
To make sure finalizer save config after linkmgrd's reconciliation.

How I did it
Add linkmgrd to the reconciliation wait list of warmboot finalizer.

How to verify it
Verified on lab device, linkmgrd reconciled as expected.
2022-08-09 21:05:12 +00:00
tjchadaga
6d66d9b8fc
Revert "Add load_minigraph option to include traffic-shift-away during config migration (#11403)" (#11625)
This reverts commit 6c2f99a327.
2022-08-06 10:05:45 +05:30
Lior Avramov
a40aca43b9 [memory_checker] Do not check memory usage of containers if docker daemon is not running (#11476)
Fix in Monit memory_checker plugin. Skip fetching running containers if docker engine is down (can happen in deinit).
This PR fixes issue #11472.

Signed-off-by: liora liora@nvidia.com

Why I did it
In the case where Monit runs during deinit flow, memory_checker plugin is fetching the running containers without checking if Docker service is still running. I added this check.

How I did it
Use systemctl is-active to check if Docker engine is still running.

How to verify it
Use systemctl to stop docker engine and reload Monit, no errors in log and relevant print appears in log.

Which release branch to backport (provide reason below if selected)
The fix is required in 202205 and 202012 since the PR that introduced the issue was cherry picked to those branches (#11129).
2022-07-27 23:28:19 +00:00
tjchadaga
6c2f99a327 Add load_minigraph option to include traffic-shift-away during config migration (#11403) 2022-07-27 23:27:21 +00:00
yozhao101
4487a962e3 [memory_checker] Do not check memory usage of containers which are not created (#11129)
Signed-off-by: Yong Zhao yozhao@microsoft.com

Why I did it
This PR aims to fix an issue (#10088) by enhancing the script memory_checker.

Specifically, if container is not created successfully during device is booted/rebooted, then memory_checker do not need check its memory usage.

How I did it
In the script memory_checker, a function is added to get names of running containers. If the specified container name is not in current running container list, then this script will exit without checking its memory usage.

How to verify it
I tested on a lab device by following the steps:

Stops telemetry container with command sudo systemctl stop telemetry.service

Removes telemetry container with command docker rm telemetry

Checks whether the script memory_checker ran by Monit will generate the syslog message saying it will exit without checking memory usage of telemetry.
2022-07-05 20:57:45 +00:00
Junchao-Mellanox
4f326e8779
Fix race condition between networking service and interface-config service (#10573) (#10766)
Backport https://github.com/Azure/sonic-buildimage/pull/10573 to 202012.

#### Why I did it

The PR is aimed to fix a bug that mgmt port eth0 may loss IP even if user configured static IP of eth0. This is not a always reproduceable issue, the reproducing flow is like:

1.	Systemd starts networking service, which runs a dhcp based configuration and assigned an ip from dhcp.
2.	Systemd starts interface-config service who depends on networking service
3.	Interface-config service runs command  “ifdown –force eth0”, check [line](16717d2dc5/files/image_config/interfaces/interfaces-config.sh (L4)). but networking service is still running so that this [line](ac32bec0e2/ifupdown2/ifupdown/main.py (L74)) failed with error: “error: Another instance of this program is already running.”. This error is printed by ifupdown2 lib who is the main process of networking service. So, ifdown actually does not work here, the ip of eth0 is not down.
4.	Interface-config service updates /etc/networking/interface to static configuration.
5.	Interface-config service runs command “systemctl restart networking”. This command kills the previous networking related processes (log: networking.service: Main process exited, code=killed, status=15/TERM), and try to reconfigure the ip address with static configuration. But it detects that the configured IP and the existing IP are the same, and it does not really configure the ip to kernel. Hence, the ip is still getting from dhcp. (this could be a bug of ifupdown2: previous ip is from dhcp, new ip is a static ip, it treats them as same instead of re-configuring the IP)
6.	When the lease of the ip expires, the ip of eth0 is removed by kernel and the issue reproduces.

The issue is not always reproduceable because networking service usually runs fast so that it won't hit step#3.

#### How I did it

Check networking service state before running "ifdown –force eth0", wait for it done if it is activating.

#### How to verify it

Manual test.
2022-05-14 14:58:24 -07:00
Samuel Angebault
705d3c0804 [Arista] Remove arista.log from rsyslog default logrotate (#9731)
Why I did it
In parallel of this change Arista added a custom logrotate configuration as part of its driver library.
Having 2 logrotate configuration for the same log file triggers an issue.

Fixes aristanetworks/sonic#38

How I did it
Arista merged a few changes in sonic-buildimage which added a logrotate configuration aristanetworks/sonic@e43c797
It is therefore the right path to remove the arista.log line from the logrotate.d/rsyslog configuration.

How to verify it
Logrotate works without any error message, arista log rotation happens and arista daemons still append logs once file was truncated.
2022-04-28 23:58:41 +00:00
yozhao101
e6c18fa6dd [Monit] Fix the issue which shows Monit can not reset its counter. (#10288)
Signed-off-by: Yong Zhao <yozhao@microsoft.com>

Why I did it
This PR aims to fix the Monit issue which shows Monit can't reset its counter when monitoring memory usage of telemetry container.

Specifically the Monit configuration file related to monitoring memory usage of telemetry container is as following:

  check program container_memory_telemetry with path "/usr/bin/memory_checker telemetry 419430400"
      if status == 3 for 10 times within 20 cycles then exec "/usr/bin/restart_service telemetry"
If memory usage of telemetry container is larger than 400MB for 10 times within 20 cycles (minutes), then it will be restarted.
Recently we observed, after telemetry container was restarted, its memory usage continuously increased from 400MB to 11GB within 1 hour, but it was not restarted anymore during this 1 hour sliding window.

The reason is Monit can't reset its counter to count again and Monit can reset its counter if and only if the status of monitored service was changed from Status failed to Status ok. However, during this 1 hour sliding window, the status of monitored service was not changed from Status failed to Status ok.

Currently for each service monitored by Monit, there will be an entry showing the monitoring status, monitoring mode etc. For example, the following output from command sudo monit status shows the status of monitored service to monitor memory usage of telemetry:

    Program 'container_memory_telemetry'
         status                             Status ok
         monitoring status          Monitored
         monitoring mode          active
         on reboot                      start
         last exit value                0
         last output                    -
         data collected               Sat, 19 Mar 2022 19:56:26
Every 1 minute, Monit will run the script to check the memory usage of telemetry and update the counter if memory usage is larger than 400MB. If Monit checked the counter and found memory usage of telemetry is larger than 400MB for 10 times
within 20 minutes, then telemetry container was restarted. Following is an example status of monitored service:

    Program 'container_memory_telemetry'
         status                             Status failed
         monitoring status          Monitored
         monitoring mode          active
         on reboot                      start
         last exit value                0
         last output                    -
         data collected               Tue, 01 Feb 2022 22:52:55
After telemetry container was restarted. we found memory usage of telemetry increased rapidly from around 100MB to more than 400MB during 1 minute and status of monitored service did not have a chance to be changed from Status failed to Status ok.

How I did it
In order to provide a workaround for this issue, Monit recently introduced another syntax format repeat every <n> cycles related to exec. This new syntax format will enable Monit repeat executing the background script if the error persists for a given number of cycles.

How to verify it
I verified this change on lab device str-s6000-acs-12. Another pytest PR (Azure/sonic-mgmt#5492) is submitted in sonic-mgmt repo for review.
2022-04-21 22:00:42 +00:00
Jing Kan
4ee75f490e
[202012][copp_cfg] Enable dhcp trap for BmcMgmtToRRouter (#10596)
Signed-off-by: Jing Kan jika@microsoft.com
2022-04-19 15:59:20 +08:00
Ying Xie
6af3de4372
[202012][copp cfg] enable dhcp trap for a couple more devices (#10582)
* [copp cfg] enable copp trap for a couple more devices

Signed-off-by: Ying Xie <ying.xie@microsoft.com>
2022-04-15 11:47:02 -07:00
kellyyeh
b68f4dd74c
Enable dhcp copp trap for EPMS and MgmtTsToR (#10439) 2022-04-06 09:46:08 -07:00
Saikrishna Arcot
aafb3d00e2
Start haveged before systemd-random-seed (#10328)
The haveged service file in Debian Buster specifies that haveged should
start after systemd-random-seed starts (this was removed in Bullseye
after systemd changes caused a bootloop). This is a bit
counterproductive, since haveged is meant to be used in environments
with minimal sources of entropy, but one of the checks that
systemd-random-seed does is to verify that entropy is present.

Therefore, override the default .service file for haveged that moves
systemd-random-seed to the Before list, allowing it to start before
systemd-random-seed checks the system entropy level. (systemd doesn't
allow removing items from dependency/ordering entries such as After= and
Before=, so the entire .service file has to be overwritten.)

Note that despite this, haveged takes up to two seconds to actually
start working, so systemd-random-seed may still block for about two
seconds. However, this still allows other work (such as running
rc.local) to proceed a bit sooner.

Signed-off-by: Saikrishna Arcot <sarcot@microsoft.com>
2022-03-24 14:28:42 -07:00
noaOrMlnx
4f021c44c2
Update docker-sonic-vs infrastructure in order to run CoPP UT (#10230)
*Changes to run CoPP UT in docker-sonic-vs
2022-03-21 21:55:24 -07:00
xumia
67312ff635
[Build]: Use one debian mirror config (#10281)
Why I did it
Use one debian mirror config.
The empty config in https://github.com/Azure/sonic-buildimage/blob/master/files/image_config/apt/sources.list overrides the file https://github.com/Azure/sonic-buildimage/blob/master/files/apt/sources.list.amd64 (armhf/arm64), it does not make sense.
All the content in files/image_config/apt is no use, any one wants to add mirror config, please add in files/apt.

How I did it
Remove files/image_config/apt and the reference.
2022-03-21 17:04:19 +08:00
wenyiz2021
5878cfdb06 Update container_checker for multi-asic devices when state is 'always_enabled' (#10067)
* Update container_checker for multi-asic devices 

Update container_checker for multi-asic devices to add database containers in always_running_containers. 
Previous change was made for single-asic, and that database containers were not considered as feature when writing to state_db.

* Update container_checker

Update an indent
2022-03-14 23:01:43 +00:00
Santhosh Kumar T
e83955599d
[202012] Refactoring DELL platform init to reduce rc.local processing time (#10171)
Why I did it
To reduce the processing time of rc.local, refactoring s6100 platform initialization.
Fixing [warm-upgrade][202012] Slow DELL platform init in rc.local causes lacp-teardown #10150
How I did it
On branch 202012-s6100-rclocalChanges to be committed:  (use "git restore --staged <file>..." to unstage)
        modified:   ../../../../files/image_config/platform/rc.local        
	modified:   ../debian/platform-modules-s6100.install        
	modified:   scripts/fast-reboot_plugin
        modified:   scripts/s6100_platform.sh
        renamed:    scripts/s6100_i2c_enumeration.sh -> scripts/s6100_platform_startup.sh
        renamed:    systemd/s6100-i2c-enumerate.service -> systemd/s6100-platform-startup.service
2022-03-10 18:51:07 -08:00
noaOrMlnx
7a35504ff7
[202012] [CoPP] Add always_enabled field (#9999)
Add the "always_enabled" field to copp_cfg.j2 file, in order to allow traps without an entry in features table, to be installed automatically.

This is a cherry-pick of https://github.com/Azure/sonic-buildimage/pull/9302

- Why I did it
In order to allow traps without an entry in features table, to be installed automatically.

- How I did it
Add always_enabled field to traps without a feature
2022-02-20 12:42:39 +02:00
Prince George
c1a0871fe9 Close console session due to user inactivity (#9890)
Signed-off-by: Prince George <prgeor@microsoft.com>
2022-02-08 19:07:29 +00:00
dflynn-Nokia
c715bdbf56 [firsttime boot] suppress error message on platforms not supporting kdump (#9521)
Why I did it
Eliminate benign firsttime boot error reported when running on platforms that do not support kdump.

How I did it
Change rc.local to check for presence of the file /etc/default/kdump-tools before referencing it.

How to verify it
Install a new image on an armhf or arm64 platform and check for a failed reference to /etc/default/kdump-tools on firsttime boot.
2022-01-21 02:39:17 +00:00
Renuka Manavalan
6cb7af73d9 add arista.log to logrotate (#9245) 2021-11-15 21:32:03 +00:00
Ying Xie
f1d5aaced0 [copp] bind copp-config.service to sonic.target (#8969)
copp-config service needs to be started after sonic.target so that it could
render the copp-config with the latest information.

It also needs to be restarted when config reload or load_minigraph is invoked.

Signed-off-by: Ying Xie <ying.xie@microsoft.com>
2021-10-15 00:40:05 +00:00
abdosi
7732fa95bb [baseimage]: Logrotate for wtmp and btmp files. (#8743)
Added logrotate file for wtmp and btmp to override default conf and set size cap as 100K as done in 
PR: #865. For buster this is control by separate file wtmp and btmp.

Signed-off-by: Abhishek Dosi <abdosi@microsoft.com>
2021-09-17 08:24:10 +00:00
Sudharsan Dhamal Gopalarathnam
9c5917d8dd Removing execute permission from copp config file (#8680)
*Removed execute permissions from the systemd copp-config.service file. 
Without this we will get a warning: "Configuration file /lib/systemd/system/copp-config.service is marked executable. Please remove executable permission bits. Proceeding anyway."
2021-09-14 08:59:21 +00:00
Ying Xie
e8b8012818 [202012][fstrim] delay fstrim timer after sonic.target (#8737)
Why I did it
fstrim has dependency on pmon docker.

How I did it
start fstrim timer after sonic.target.

How to verify it
local test and PR test.

Signed-off-by: Ying Xie ying.xie@microsoft.com
2021-09-14 08:59:17 +00:00
dflynn-Nokia
2c91efcd15 [Nokia ixs7215] Add support for changing the console baud rate (#8595)
This commit adds support for changing the default console baud rate configured
within the U-Boot bootloader. That default baud rate is exposed via the value
of the U-Boot 'baudrate' environment variable. This commit removes logic that
hardcoded the console baud rate to 115200 and instead ensures that the U-Boot
'baudrate' variable is always used when constructing the Linux kernel boot
arguments used when booting Sonic.

A change is also made to rc.local to ensure that the specified baud rate is set
correctly in the serial getty service.
2021-08-27 02:27:06 +00:00
Volodymyr Samotiy
8365245122 [monit] Periodically monitor VNET route consistency (#8266)
*To run VNET route consistency check periodically.
*For any failure, the monit will raise alert based on return code.
Signed-off-by: Volodymyr Samotiy <volodymyrs@nvidia.com>
2021-08-23 03:05:16 +00:00
Guohan Lu
ceab083fc5 [build]: add sonic_release 202012
Signed-off-by: Guohan Lu <lguohan@gmail.com>
2021-08-07 18:04:28 -07:00
Blueve
762847c2cf [port_config] Introduce ad-hoc mport_config.json file (#8066)
Signed-off-by: Jing Kan jika@microsoft.com
2021-07-15 12:06:47 +00:00
rajendra-dendukuri
049a23e5a4 [kdump] Fix kdump error message when a reboot is issued (#7985)
dash doesn't support += operation to append to a variable's value. Use KDUMP_CMDLINE_APPEND="${KDUMP_CMDLINE_APPEND} " instead

The below error message is seen when a reboot is issued.

[ 342.439096] kdump-tools[13655]: /etc/init.d/kdump-tools: 117: /etc/default/kdump-tools: KDUMP_CMDLINE_APPEND+= panic=10 debug hpet=disable pcie_port=compat pci=nommconf sonic_platform=x86_64-accton_as7326_56x-r0: not found
2021-07-07 09:40:16 +00:00
xumia
c3018773d7 Fix vtysh shell-ingestion security issue (#7759)
Fix vtysh shell-ingestion security issue
Only expose the limited parameters of the command vtysh show.
2021-06-28 09:38:00 +00:00
Sujin Kang
d67a5b887f Support multiple pcie configuration file and change the pcie status table name to match with pcied changes (#7886)
Why I did it
Support multiple pcie configuration file and change the pcie status table name
This is to match with below two PRs.
Azure/sonic-platform-common#195
Azure/sonic-platform-daemons#189

How I did it
Check pcie configuration file with wild card and change the device status table name

How to verify it
Restart with changes and see if the pcie check works as expected.
2021-06-17 07:09:50 +00:00
yozhao101
fb2c995f53
[202012][Monit] Deprecate the feature of monitoring the critical processes by Monit (#7823)
Signed-off-by: Yong Zhao yozhao@microsoft.com

Why I did it
Currently we leveraged the Supervisor to monitor the running status of critical processes in each container and it is more reliable and flexible than doing the monitoring by Monit. So we removed the functionality of monitoring the critical processes by Monit.

How I did it
I removed the script process_checker and corresponding Monit configuration entries of critical processes.

How to verify it
I verified this on the device str-7260cx3-acs-1.
2021-06-09 09:04:22 -07:00
Renuka Manavalan
32e5137ab7 Add service to restore TACACS from old config (#7560)
Why I did it
In upgrade scenarios, where config_db.json is not carry forwarded to new image, it could be left w/o TACACS credentials.
Added a service to trigger 5 minutes after boot and restore TACACS, if /etc/sonic/old_config/tacacs.json is present.

How I did it
By adding a service, that would fire 5 mins after boot.
This service apply tacacs if available.

How to verify it
Upgrade and watch status of tacacs.timer & tacacs.service
You may create /etc/sonic/old_config/tacacs.json, with updated credentials
(before 5mins after boot) and see that appears in config & persisted too.

Which release branch to backport (provide reason below if selected)
 201911
 202006
 202012
2021-06-07 06:02:32 +00:00
yozhao101
3af05fdffe [Monit] Restart telemetry container if memory usage is beyond the threshold (#7645)
Signed-off-by: Yong Zhao yozhao@microsoft.com

Why I did it
This PR aims to monitor the memory usage of streaming telemetry container and restart streaming telemetry container if memory usage is larger than the pre-defined threshold.

How I did it
I borrowed the system tool Monit to run a script memory_checker which will periodically check the memory usage of streaming telemetry container. If the memory usage of telemetry container is larger than the pre-defined threshold for 10 times during 20 cycles, then an alerting message will be written into syslog and at the same time Monit will run the script restart_service to restart the streaming telemetry container.

How to verify it
I verified this implementation on device str-7260cx3-acs-1.
2021-05-31 04:38:18 +00:00
Alexander Allen
bd6096a018 [ntp] Fix ntp.conf template to allow setting of source port in CONFIG_DB (#7586)
Why I did it
Currently, there is a bug in the ntp.conf jinja2 template where it will ignore the src_intf directive in CONFIG_DB if there are multiple IP addresses associated with an interface. This code change fixes that bug and allows the template to select the correct source interface for NTP.

How I did it
I did this by modifying the macro in ntp.conf.j2 which determines if there is an ip address associated with an interface to set a state variable when it detects a valid interface entry in CONFIG_DB instead of outputting "true" directly (which could result in multiple "trues" outputted for interfaces with multiple valid IP addresses).

How to verify it
Add two ipv4 addresses to an interface in SONiC

Add the following configuration to config_db.json

{
"NTP": {
    "global": {
        "src_intf": "Ethernet1"
        }
    }
}
Replace Ethernet1 with the interface name of the one you assigned the IP addresses to.

Run sudo config reload -y

Open /etc/ntp.conf and verify that the following line exists

...
interface listen Ethernet1
...
The interface specified should be the one set in the previous steps.

Description for the changelog
[ntp] Fix ntp.conf template to allow setting of source port in CONFIG_DB
2021-05-27 22:29:01 +00:00
Renuka Manavalan
53b3d378c7 Invoke disk check periodically. (#7374)
Why I did it
Helps with periodic scan of disk for RO state.
If found, this script makes transient fix and raise error message.
2021-05-27 22:28:44 +00:00
shlomibitton
c53f58e488 Remove 'vm.panic_on_oom=1' (#7678)
#### Why I did it
If a process limits using nodes by mempolicy/cpusets, and those nodes become memory exhaustion status, one process may be killed by oom-killer.
No panic occurs in this case, because other node's memory may be free.
This means system total status may be not fatal yet.

#### How I did it
Remove 'vm.panic_on_oom=1' kernel flag from 'vmcore-sysctl.conf '
2021-05-26 02:41:02 +00:00
Sujin Kang
d1043e3c91 add config-setup.service as dependency for pcie-check.service (#7599)
Why I did it
start pcie-check.service after config-setup.service since pcie_util depends on device_info which is available with config db metadata.

How I did it
Add config-setup.service as a dependency of pcie-check.service

How to verify it
Upon reboot, check if the pcie-check.sh throws the platform api error which is dependent on DEVICE_METADATA
2021-05-19 18:14:19 +00:00
Renuka Manavalan
99958304c1 [container_checker] Use Feature table to get running containers (#7474)
Why I did it
Finding running containers through "docker ps" breaks when kubernetes deploys container, as the names are mangled.

How I did it
The data is is available from FEATURE table, which takes care of kubernetes deployment too.

How to verify it
Deploy a feature via kubernetes and don't expect error from container_check.
2021-05-10 15:59:57 -07:00
Stepan Blyshchak
ae574ab000 [systemd] disable default systemd udev rules for interfaces (#7369)
Fix #7364

99-default.link - was always in SONiC, but previous systemd (<247) had an issue and it did not work due to issue systemd/systemd#3374. Now systemd 247 works.

However, such policy overrides teamd provided mac address which causes teamd netdev to use a random mac
address. Therefore, needs to be disabled.

Signed-off-by: Stepan Blyschak <stepanb@nvidia.com>
2021-05-01 19:43:41 -07:00
xumia
1b05982727 Support readonly vtysh for sudoers (#7383)
Why I did it
Support readonly version of the command vtysh

How I did it
Check if the command starting with "show", and verify only contains single command in script.
2021-04-29 10:08:55 -07:00
Kuanyu Chen
66dedf38c2 [config-setup]: Fix a bug in checking if updategraph is enabled (#7093)
Encounter error during "config-setup boot" if the updategraph is enabled.

How I did it
Correct the code inside the config-setup script.
Remove the space between the assignment operator.

How to verify it
Remove the /etc/sonic/config_db.json and reboot the device.
Originally, it will return following error after boot up.
rv: command not found
After modification, it can correctly parse the status of updategraph without error.
2021-04-21 13:58:03 -07:00
yozhao101
c63b59698c [container_checker] Exclude the 'always_disabled' container from expected running container list (#7217)
Signed-off-by: Yong Zhao yozhao@microsoft.com

Why I did it
Since we introduced a new value always_disabled for the state field in FEATURE table, the expected running container list
should exclude the always_diabled containers. This bug was found by nightly test and posted at here: issue. This PR fixes #7210.

How I did it
I added a logic condition to decide whether the value of state field of a container was always_disabled or not.

How to verify it
I verified this on the device str-dx010-acs-1.

Which release branch to backport (provide reason below if selected)
 201811
 201911
 202006
[ x] 202012
2021-04-02 11:52:35 -07:00
arlakshm
cc6e521b40 [baseimage] add ipintutil in sudoer file (#6845)
show ip interfaces is enhanced recently to support multi ASIC platforms in this PR- https://github.com/Azure/sonic-utilities/pull/1396 .
The ipintutil script as to run as sudo user, to get the ip interface from each namespace.
Add this script to the sudoer file so that show ip interface command is available for user with read-only permissions

Signed-off-by: Arvindsrinivasan Lakshmi Narasimhan <arlakshm@microsoft.com>
2021-03-13 23:29:24 -08:00
Sujin Kang
15aed52ef2 [pcie.yaml] Move pcie configuration file path to platform directory (#6475)
- Why I did it
The pcie configuration file location is under plugin directory not under platform directory.
#6437

- How I did it

Move all pcie.yaml configuration file from plugin to platform directory.
Remove unnecessary timer to start pcie-check.service
Move pcie-check.service to sonic-host-services
- How to verify it
Verify on the device
2021-03-04 21:23:05 +00:00
Stepan Blyshchak
7fb5a72d23 [services] introduce sonic.target (#5705)
- Why I did it
Group all SONiC services together and able to manage them together. Will be used in config reload command as much simpler and generic way to restart services.

- How I did it
Add services to sonic.target

- How to verify it
Together with Azure/sonic-utilities#1199
config reload -y

Signed-off-by: Stepan Blyshchak <stepanb@nvidia.com>
2021-03-04 21:23:05 +00:00
arlakshm
d7be5a021a [Multi Asic] support of swss.rec and sairedis.rec for multi asic (#6310)
Signed-off-by: Arvindsrinivasan Lakshmi Narasimhan arlakshm@microsoft.com

- Why I did it
This PR has the changes to support having different swss.rec and sairedis.rec for each asic.
The logrotate script is updated as well

- How I did it

Update the orchagent.sh script to use the logfile name options in these PRs(Azure/sonic-swss#1546 and Azure/sonic-sairedis#747)
In multi asic platforms the record files will be different for each asic, with the format swss.asic{x}.rec and sairedis.asic{x}.rec

Update the logrotate script for multiasic platform .
2021-02-23 23:56:01 +00:00
Joe LeVeque
57a6fb9f39 [pcie-check] Update underlying pcieutil command and add to sudoers file (#6682)
- Why I did it

As of Azure/sonic-utilities#1297, subcommands of pcieutil have changed to remove the redundant pcie- prefix. This PR adapts calling applications (pcie-check) to the new syntax.

Resolves #6676

- How I did it

Remove pcie- prefix from pcieutil subcommands in calling applications
Also add pcieutil * to sudoers file, as pcieutil requires elevated permissions
2021-02-05 15:47:58 -08:00
arlakshm
24d785a64d [baseimage]: add docker ps to the sudoer file (#6604)
fixes Azure/sonic-utilities#1389

With the recent changes in sudoer files. The  show commands fails for the read-only users. 
The problem here is the 'docker ps' is failing in the function [get_routing_stack()](8a1109ed30/show/main.py (L54)) therefore all the CLI commands are failing.

Signed-off-by: Arvindsrinivasan Lakshmi Narasimhan <arlakshm@microsoft.com>
2021-02-03 10:39:00 -08:00
arlakshm
197f75a246 [multi asic] add ip netns identify command to sudoer (#6591)
Signed-off-by: Arvindsrinivasan Lakshmi Narasimhan <arlakshm@microsoft.com>

- Why I did it
The command sudo ip netns identify <pid> is used in function get_current_namespace
to check in the cli command is running in host context or within a namespace.

This function is used for every CLI command and command sudo ip netns identify <pid> needs to be added in sudoer files to allow users with RO access to run show cli commands

This problem is not there on single asic platforms.

- How I did it
Add ip netns identify [0-9]* to sudoers file.
2021-02-03 10:38:24 -08:00