Why I did it
Fixed the monit container_checker fails due to unexpected "database-chassis" docker running on Supervisor card in the VOQ chassis. fixes#9042
How I did it
Added database-chassis to the always running docker list if platform is supervisor card.
How to verify it
Execute the CLI command "sudo monit status container_checker"
Signed-off-by: mlok <marty.lok@nokia.com>
Why I did it
amrhf build fails while building sonic-config-engine whl package
https://dev.azure.com/mssonic/be1b070f-be15-4154-aade-b1d3bfb17054/_apis/build/builds/77089/logs/9
The reason for the failure is due to the fact that there is a new line generated at the top of the file in buffer config test cases while building for broadcom based platform and this issue is not seen in Marvell based platforms.
How I did it
Removed the new line for all the buffer test cases as there is no need to add it and accordingly changed the buffer_config.j2 where the new line is generated.
* Update container_checker for multi-asic devices
Update container_checker for multi-asic devices to add database containers in always_running_containers.
Previous change was made for single-asic, and that database containers were not considered as feature when writing to state_db.
* Update container_checker
Update an indent
This issue causes negative threshold value and thus deleting log files even when there is enough space.
This issue causes negative threshold value and thus deleting log files even when there is enough space.
- Why I did it
To fix an issue when log files get deleted even if there is enough space.
- How I did it
Fixed an typo.
- How to verify it
Run the portion of the script that calculates threshold, see that the threshold is calculated correctly.
Signed-off-by: Stepan Blyschak <stepanb@nvidia.com>
Implement infrastructure that allows enabling address sanitizer
for docker containers. Enable address sanitizer for SWSS container.
- Why I did it
To add a possibility to compile SONiC applications with address sanitizer (ASAN).
ASAN is a memory error detector for C/C++. It finds:
1. Use after free (dangling pointer dereference)
2. Heap buffer overflow
3. Stack buffer overflow
4. Global buffer overflow
5. Use after return
6. Use after the scope
7. Initialization order bugs
8. Memory leaks
- How I did it
By adding new ENABLE_ASAN configuration option.
- How to verify it
By default ASAN is disabled and the SONiC image is not affected.
When ASAN is enabled it inspects all allocation, deallocation, and memory usage that the application does in run time. To verify whether the application has memory errors tests that trigger memory usage of the application should be run. Ideally, the whole regression tests should be run. Memory leaks reports will be placed in /var/log/asan/ directory of SONiC host OS.
Signed-off-by: Oleksandr Ivantsiv <oivantsiv@nvidia.com>
Why I did it
Fixes#8980 partly.
The corresponding changes in sonic-sairedis is here :
Azure/sonic-sairedis#975
How I did it
Include changes from both repos and build an image for verification.
How to verify it
Trigger fast-reboot with the changes, see the attribute SAI_SWITCH_ATTR_UNINIT_DATA_PLANE_ON_REMOVAL being set at the SAI level.
Signed-off-by: Thushar Gowda <24815472+tbgowda@users.noreply.github.com>
Why I did it
Requirements from Microsoft for fwutil update all state that all firmwares which support this upgrade flow must support upgrade within a single boot cycle. This conflicted with a number of Mellanox upgrade flows which have been revised to safely meet this requirement.
How I did it
Added --no-power-cycle flags to SSD and ONIE firmware scripts
Modified Platform API to call firmware upgrade flows with this new flag during fwutil update all
Added a script to our reboot plugin to handle installing firmwares in the correct order with prior to reboot
How to verify it
Populate platform_components.json with firmware for CPLD / BIOS / ONIE / SSD
Execute fwutil update all fw --boot cold
CPLD will burn / ONIE and BIOS images will stage / SSD will schedule for reboot
Reboot the switch
SSD will install / CPLD will refresh / switch will power cycle into ONIE
ONIE installer will upgrade ONIE and BIOS / switch will reboot back into SONiC
In SONiC run fwutil show status to check that all firmware upgrades were successful
Why I did it
Eliminate benign firsttime boot error reported when running on platforms that do not support kdump.
How I did it
Change rc.local to check for presence of the file /etc/default/kdump-tools before referencing it.
How to verify it
Install a new image on an armhf or arm64 platform and check for a failed reference to /etc/default/kdump-tools on firsttime boot.
- External PHY is managed via gearbox (gbsybcd docker container) in SONiC
- Enhanced 'External PHY management' from SONiC's single-ASIC environment to multi-ASIC
- Enhanced gbsyncd docker container from single Namespace to multi-Namspace mode
- Added gbsyncd.service.j2 on per_namespace basis.
- Each namepace/ASIC now to have its unique gbsyncd<ASIC#> docker container with its
own Gearbox table, redis-DB
Signed-off-by: Shyam Kumar <shyakuma@cisco.com>
Why I did it
Need to be able to run smartctl when pmon docker is not running.
How I did it
Removed the pmon dependency for pmon as well as the command wrapper and added it to the debian-extension.
How to verify it
Stop pmon
Run smartctl from the host and verify it runs without error
[image]: Prevent radius passkey and snmp community string into syslog. (#9727)
#### Why I did it
Prevent radius passkey and snmp community string into syslog.
#### How I did it
Add radius and snmp config command to PASSWD_CMDS
#### How to verify it
Run and pass all UTs.
#### Which release branch to backport (provide reason below if selected)
<!--
- Note we only backport fixes to a release branch, *not* features!
- Please also provide a reason for the backporting below.
- e.g.
- [x] 202006
-->
- [ ] 201811
- [ ] 201911
- [ ] 202006
- [ ] 202012
- [ ] 202106
#### Description for the changelog
Add radius and snmp config command to PASSWD_CMDS to prevent radius passkey and snmp community string into syslog.
#### A picture of a cute animal (not mandatory but encouraged)
Why I did it
The existing log file size in sonic is 1 Mb. Over a period of time this leads to huge number of log files which becomes difficult for monitoring applications to handle.
Instead of large number of small files, the size of the log file is not set to 16 Mb which reduces the number of files over a period of time.
How I did it
Changed the size parameter and related macros in logrotate config for rsyslog
How to verify it
Execute logrotate manually and verify the limit when the file gets rotated.
Signed-off-by: Sudharsan Dhamal Gopalarathnam <sudharsand@nvidia.com>
Why I did it
database.sh failed to create the database for namespace in multiasic platform.
The latest code Docker version 20.10.x, command "docker create" no longer takes optional "NET=" with empty value. Syntax error show with current docker create command in database.sh. Issue #9503
How I did it
Modify the docker_image_ctl.j2 to set default network setting NET="bridge" instead of empty for namespace database.
- Use SfpOptoeBase by default to leverage new `sonic_xcvr` refactor
- Add support for `Woodleaf` product
- Move `libsfp-eeprom.so` to a different `.deb` package
- Add new logrotate configuration for arista logs
- Improve logging mechanism for the drivers (IO loglevel, fix syslog duplicates)
- Initialize chassis cards in parallel
- Refactor of `get_change_event` to fix interrupts treated as presence change
- Add INCLUDE_PINS to config to enable/disable container
- Add Docker files and supporting resources
- Add sonic-pins submodule and associated make files
Submission containing materials of a third party:
Copyright Google LLC; Licensed under Apache 2.0
#### Why I did it
Adds P4RT container to SONiC for PINS
The P4RT app is covered by this HLD:
https://github.com/pins/SONiC/blob/master/doc/pins/p4rt_app_hld.md
#### How I did it
Followed the pattern and templates used for other SONiC applications
#### How to verify it
Build SONiC with INCLUDE_P4RT set to "y".
Verify that the resulting build has a container called "p4rt" running.
You can verify that the service is up by running the following command on the SONiC switch:
```bash
sudo netstat -lpnt | grep p4rt
```
You should see the service listening on TCP port 9559.
#### Which release branch to backport (provide reason below if selected)
None
#### Description for the changelog
Build P4RT container for PINS
Why I did it
Fix for issue #8389
How I did it
The /etc/rsyslog.conf is empty file which cause the FATAL of the process rsyslogd in the global instance database container. The function updateSyslogConf() should only generate the rsyslog.conf for containers in the namespace. it should not do it for the containers in the global instance. Instead, default rsyslog.conf should be used. Especially for database container, updateSyslogConf() is called before the database container is created. The result cause the sonic-cfggen failed to generate the rsyslog.conf.Why I did it
Fix for issue #8389
How I did it
The /etc/rsyslog.conf is empty file which cause the FATAL of the process rsyslogd in the global instance database container. The function updateSyslogConf() should only generate the rsyslog.conf for containers in the namespace. it should not do it for the containers in the global instance. Instead, default rsyslog.conf should be used. Especially for database container, updateSyslogConf() is called before the database container is created. The result cause the sonic-cfggen failed to generate the rsyslog.conf.
Signed-off-by: mlok <marty.lok@nokia.com>
This pull request integrate audisp-tacplus to SONiC for per-command accounting.
#### Why I did it
To support TACACS per-command accounting, we integrate audisp-tacplus project to sonic.
#### How I did it
1. Add auditd service to SONiC
2. Port and patch audisp-tacplus to SONiC
#### How to verify it
UT with CUnit to cover all new code in usersecret-filter.c
Also pass all current UT.
#### Which release branch to backport (provide reason below if selected)
N/A
#### Description for the changelog
Add audisp-tacplus for per-command accounting.
#### A picture of a cute animal (not mandatory but encouraged)
Why I did it
Add bgpcfgd support to advertise routes.
How I did it
Make bgpcfgd subscribe to the ADVERTISE_NETWORK table in STATE_DB and configure route advertisement accordingly.
How to verify it
Added unit tests in bgpcfgd and verify on KVM about route advertisement.
- Consolidate the two [Service] sections by moving the ExecStartPre line for mark_dhcp_packet.py to the first section and removing the second.
- Make the mark_dhcp_packet.py file executable
- Also clean up mark_dhcp_packet.py
- Remove unused imports
- Fix spacing and line lengths to conform to PEP8
Signed-off-by: Lawrence Lee <lawlee@microsoft.com>
- Why I did it
This is to update the common sonic-buildimage infra for reclaiming buffer.
- How I did it
Render zero_profiles.j2 to zero_profiles.json for vendors that support reclaiming buffer
The zero profiles will be referenced in PR [Reclaim buffer] Reclaim unused buffers by applying zero buffer profiles #8768 on Mellanox platforms and there will be test cases to verify the behavior there.
Rendering is done here for passing azure pipeline.
Load zero_profiles.json when the dynamic buffer manager starts
Generate inactive port list to reclaim buffer
Signed-off-by: Stephen Sun <stephens@nvidia.com>
Why I did it
Fix#9059. It provides common gbsyncd.service.j2 to start for platform specific gbsyncd docker, which must be named 'gbsyncd'.
How I did it
All of platform specific gbsyncd dockers use a common name 'gbsyncd'
Use a unique systemd service template gbsyncd.service.j2 for gbsyncd docker
Why I did it
Fix#9059. It provides common gbsyncd.service.j2 to start for platform specific gbsyncd docker, which must be named 'gbsyncd'.
How I did it
All of platform specific gbsyncd dockers use a common name 'gbsyncd'
Use a unique systemd service template gbsyncd.service.j2 for gbsyncd docker
#### Why I did it
Changes required for feature "Event Driven TechSupport Invocation & CoreDump Mgmt". [HLD](https://github.com/Azure/SONiC/pull/818 )
Requires: https://github.com/Azure/sonic-utilities/pull/1796.
Merging in any order would be fine.
Summary of the changes:
- Added the YANG Models for the new tables introduces as a part of this feature.
- Enhanced init_cfg.json with the default config required
- Added a compile Time flag which enables/disables the config required for this feature inside the init_cfg.json
- Enhanced the supervisor-proc-exit-listener script to populate `<feature>:<critical_proc> = <comm>:<pid>` info in the STATE_DB when it observes an proc exit notification for the critical processes running inside the docker.
This pull request add a bash plugin for TACACS+ per-command authorization
#### Why I did it
1. To support TACACS per command authorization, we check user command before execute it.
2. Fix libtacsupport.so can't parse tacplus_nss.conf correctly issue:
Support debug=on setting.
Support put server address and secret in same row.
3. Fix the parse_config_file method not reset server list before parse config file issue.
#### How I did it
The bash plugin will be called before every user command, and check user command with remote TACACS+ server for per-command authorization.
#### How to verify it
UT with CUnit cover all code in this plugin.
Also pass all current UT.
#### Which release branch to backport (provide reason below if selected)
N/A
#### Description for the changelog
Add Bash TACACS+ plugin.
#### A picture of a cute animal (not mandatory but encouraged)
* [ACL] enable ACL FC when genereting config from minigraph but disable by default
Why I did it
To support ACL counters on Flex Counter Infrastructure.
How I did it
Enable ACL FC in init_cfg and minigraph. Disable when genereting configuration from preset.
How to verify it
Together with depends PRs. Run ACL/Everflow test suite.
Signed-off-by: Stepan Blyshchak <stepanb@nvidia.com>
Allow mellanox platform to build and successfully switch packets in
Debian 11
Upgraded
* Mellanox SDK
* Mellanox Hardware Management
* Mellanox Firmware
* Mellanox Kernel Patches
Adjusted build system to support host system running bullseye and
dockers running buster.
1. Fix build for armhf and arm64
2. upgrade centec tsingma bsp support to 5.10 kernel
3. modify centec platform driver for linux 5.10
Co-authored-by: Shi Lei <shil@centecnetworks.com>
ISSU will likely be broken. As of right now, the issu-version file is
not being generated during build.
Signed-off-by: Saikrishna Arcot <sarcot@microsoft.com>
The common startup script used for SONiC containers is calling an inline
python command that uses Python 2 syntax, and thus errors out when run
with Python 3. Make this work with Python 3.
Signed-off-by: Saikrishna Arcot <sarcot@microsoft.com>
There appears to be some network issue in the pipeline builds when
downloading packages from our mirror. Change the source to be from the
main debian repos to try to get around this issue.
Signed-off-by: Saikrishna Arcot <sarcot@microsoft.com>
Remove Python 2 package installation from the base image. For container
builds, reference Python 2 packages only if we're not building for
Bullseye.
For libyang, don't build Python 2 bindings at all, since they don't seem
to be used.
Signed-off-by: Saikrishna Arcot <sarcot@microsoft.com>
Python 2 is no longer available, so remove those packages, and remove
the pip2 commands. For picocom and systemd, just install from the
regular repo, since there's no backports yet.
Signed-off-by: Saikrishna Arcot <sarcot@microsoft.com>
- Why I did it
In case an app.ext requires a dependency syncd^1.0.0, the RPC version of syncd will not satisfy this constraint, since 1.0.0-rpc < 1.0.0. This is not correct to put 'rpc' as a prerelease identifier. Instead put 'rpc' as build metadata in the version: 1.0.0+rpc which satisfies the constraint ^1.0.0.
- How I did it
Changed the way how to version in RPC and DBG images are constructed.
- How to verify it
Install app.ext with syncd^1.0.0 dependency on a switch with RPC syncd docker.
Signed-off-by: Stepan Blyshchak <stepanb@nvidia.com>