Using timer-override.conf, we modify the fstrim.timer service.
For armhf, Nokia-7215 platform, we modify fstrim.timer to run daily
instead of weekly. This is required because the size of the SSD on
this platform is 16GB, which on average is nearly 10 times smaller than
most other sonic platforms. With smaller disk and the ever increasing
level of logging done by sonic, this change is required to prevent
the SSD from entering a read-only state due to inadequate free blocks.
- Fix watchdog reboot cause for wolverine linecard
- Fix PSU fan speed of 0% by adding max RPM to most psu descriptions
- Add product DCS-7060DX5-64
- Add product DCS-7060DX5-32
- Why I did it
To be able to see how much time was consumed to build a specific target.
A newly added code does those things:
1. Print build start time for target
2. Print build end time for target
3. Print elapsed time for target
- How I did it
Add a macro to record the time
Add macros to print end time and elapsed time
- How to verify it
Just build an image and check any *.log file
Signed-off-by: Yevhen Fastiuk <yfastiuk@nvidia.com>
- Why I did it
To be able to cache, and then retrieve cached "copied" debs
- How I did it
Add missed caching and cache retrieval steps
- How to verify it
Build with cache and then clean and rebuild again. The targets added to SONIC_COPY_DEBS should be taken from a cache.
Signed-off-by: Yevhen Fastiuk <yfastiuk@nvidia.com>
- Why I did it
Update the sensors.conf and pcie.yaml according to the real hardware.
- How I did it
Update the sensors.conf and pcie.yaml
- How to verify it
run relevant sonic-mgmt test cases.
Signed-off-by: Kebo Liu <kebol@nvidia.com>
- Why I did it
Add NVIDIA Copyright header to NVIDIA files added lately
- How I did it
Add NVIDIA Copyright header for the relevant files
- How to verify it
N/A (only commented text was added).
- Why I did it
Fix issue with signing tool not running due to being call with the path from the host and not the path it is mounted on inside the docker-slave
- How I did it
Modified the path on the SECURE_UPGRADE_PROD_SIGNING_TOOL flag to the path where it is mounted inside the slave docker
- How to verify it
Build SONiC using your own prod script
Normally doesn't need to measure i2c calls.
Also switched to use timespec64_sub() to ensure time delta normalized
Co-authored-by: Kostiantyn Yarovyi <kostiantynx.yarovyi@intel.com>
- Why I did it
Mellanox syncd container will be based on Debian iproute2 plus patches instead of Nvidia internal version of iproute2
- How I did it
Download iproute2 from Debian repository, apply patches and compile to create a new target.
The target is then deployed in syncd container of Mellanox switches only.
The new target is called IPROUTE2_MLNX.
- How to verify it
Compile and load on switch, verify interfaces network devices created successfully.
Verify LLDP shows connections to neighbors.
Verify ping between 2 hosts over 2 router ports is successful.
Why I did it
Support for SONIC chassis isolation using TSA and un-isolation using TSB from supervisor module
Work item tracking
Microsoft ADO (number only): 17826134
How I did it
When TSA is run on the supervisor, it triggers TSA on each of the linecards using the secure rexec infrastructure introduced in sonic-net/sonic-utilities#2701. User password is requested to allow secure login to linecards through ssh, before execution of TSA/TSB on the linecards
TSA of the chassis withdraws routes from all the external BGP neighbors on each linecard, in order to isolate the entire chassis. No route withdrawal is done from the internal BGP sessions between the linecards to prevent transient drops during internal route deletion. With these changes, complete isolation of a single linecard using TSA will not be possible (a separate CLI/script option will be introduced at a later time to achieve this)
Changes also include no-stats option with TSC for quick retrieval of the current system isolation state
This PR also reverts changes in #11403
How to verify it
These changes have a dependency on sonic-net/sonic-utilities#2701 for testing
Run TSA from supervisor module and ensure transition to Maintenance mode on each linecard
Verify that all routes are withdrawn from eBGP neighbors on all linecards
Run TSB from supervisor module and ensure transition to Normal mode on each linecard
Verify that all routes are re-advertised from eBGP neighbors on all linecards
Run TSC no-stats from supervisor and verify that just the system maintenance state is returned from all linecards
Why I did it
Today at most 128 LAGs are supported. This is not sufficient if there are many LAGs with just few ports.
How I did it
Increase LAG Ids to 1024 for DNX device.
Why I did it
systemd stop event on service with timers can sometime delete the state_db entry for the corresponding service.
Note: This won't be observed on the latest master label since the dependency on timer was removed with the recent config reload enhancement. However, it is better to have the fix since there might be some systemd services added to system health daemon in the future which may contain timers
root@qa-eth-vt01-4-3700c:/home/admin# systemctl stop snmp
root@qa-eth-vt01-4-3700c:/home/admin# show system-health sysready-status
System is not ready - one or more services are not up
Service-Name Service-Status App-Ready-Status Down-Reason
---------------------- ---------------- ------------------ -------------
<Truncated>
ssh OK OK -
swss OK OK -
syncd OK OK -
sysstat OK OK -
teamd OK OK -
telemetry OK OK -
what-just-happened OK OK -
ztp OK OK -
<Truncated>
Expected
Should see a Down entry for SNMP instead of the entry being deleted from the STATE_DB
root@qa-eth-vt01-4-3700c:/home/admin# show system-health sysready-status
System is not ready - one or more services are not up
Service-Name Service-Status App-Ready-Status Down-Reason
---------------------- ---------------- ------------------ -------------
<Truncated>
snmp Down Down Inactive
ssh OK OK -
swss OK OK -
syncd OK OK -
sysstat OK OK -
teamd OK OK -
telemetry OK OK -
what-just-happened OK OK -
ztp OK OK -
<Truncated>
How I did it
Happens because the timer is usually a PartOf service and thus a stop on service is propagated to timer. Fixed the logic to handle this
Apr 18 02:06:47.711252 r-lionfish-16 DEBUG healthd: Main process- received event:snmp.service from source:sysbus time:2023-04-17 23:06:47
Apr 18 02:06:47.711347 r-lionfish-16 INFO healthd: check_unit_status for [ snmp.service ]
Apr 18 02:06:47.722363 r-lionfish-16 INFO healthd: snmp.service service state changed to [inactive/dead]
Apr 18 02:06:47.723230 r-lionfish-16 DEBUG healthd: Main process- received event:snmp.timer from source:sysbus time:2023-04-17 23:06:47
Apr 18 02:06:47.723328 r-lionfish-16 INFO healthd: check_unit_status for [ snmp.timer ]
Signed-off-by: Vivek Reddy Karri <vkarri@nvidia.com>
Why I did it
When reboot the chassis by issuing "sudo reboot" on Supervisor card. The internal midplane communication xe0 should be shutdown to avoid double reboot on the linecard.
Added a udev link rule to disable the autoneg on AMD xgbe port Xe0 and Xe1 and make the setting in sync with the peer Broadcom greyhound ports.
How I did it
Modify the Nokia-7250IXRE specific reboot script on the Supervisor card to shutdown the internal interface xe0. Also move reboot linecard code to the top of the script to make sure the notification has been send to Linecard before shutdown the xe0 interface.
Introduced a new rule 80-net-by-driver.link to disable the autoneg on the AMD size. This change requires the latest NDK which contains the change to set the autoneg on the xe0 and xe1 port on the Greyhound.
Signed-off-by: mlok <marty.lok@nokia.com>
Why I did it
Update sonic-platform submodule for Nokia-7250IXRE Platform. This requires the new NDK 22.9.8 and above
How I did it
Update submodule sonic-platform for Nokia-7250IXRE platform.
c9f316e Disparate process and thread-safe protection for MDIPC transport, and refactored presence logic to better align with SfpStateUpdateTask operation
a3486cc Added _get_module_bulk_info() and cache the info for 5 seconds to optimize the chassisd update.
4b2e729 Fixed the nokia_cmd show qfpga help display
7b87049 Fixed the nokia_cmd show midplane helper dispaly.
83eabea Add "nokia_cmd set ndk-monitor-action" and "nokia_cmd set ndk-log-level" commands
8aad7de Add nokia_cmd show ndk-version
d2c55e3 Modify the psu.py and module.py to optimize the psud running time
Signed-off-by: mlok <marty.lok@nokia.com>
This PR is to add the following
Add a new options "--profile" to the show macsec command, to show all profiles in device
Update the currentl show macsec command, to show profile in each interface o/p. This will tell which macsec profile the interface is attached to.
#### Why I did it
Remove dbus when telemetry does not use it.
##### Work item tracking
- Microsoft ADO **(number only)**: 17852550
#### How I did it
Use INCLUDE_SYSTEM_GNMI to determine if telemetry needs dbus.
#### How to verify it
Build image and check telemetry container.
* Support ACL interface type BmcData in minigraph parser
* Support ACL interface type BmcData in minigraph parser
* add unittest
* Add a global dict for storing the defination of custom acl tables
- Why I did it
Extend the CRM YANG model with DASH attributes.
- How I did it
Add new attributes to the existing CRM YANG model.
Implement tests for DASH CRM attributes.
- How to verify it
Build sonic_yang_models packages. The tests will be run automatically.
Depends on https://github.com/sonic-net/sonic-linux-kernel/pull/315
#### Why I did it
The name SECURE_UPGRADE_DEV_SIGNING_CERT is misleading, this flag is relevant to both to dev and prod signing.
#### How I did it
Rename all mentions of name SECURE_UPGRADE_DEV_SIGNING_CERT to SECURE_UPGRADE_SIGNING_CERT - this is also done with PR in sonic-linux-kernel repository
#### How to verify it
Build SONiC using your own prod script
Why I did it
Import defusedxml packet to fix semgrep error "using defusedxml instead of xml"
How I did it
Add "pip3 install defusedxml" in build_debian.sh
Signed-off-by: pettershao-ragilenetworks <pettershao@ragilenetworks.com>
- Why I did it
We suspect the issue #13791 is caused by redis server being temporarily unavailable during system initialization so we do not use -d in sonic-cfggen, for now, to avoid accessing redis server
- How I did it
Provide a string containing required json data when calling sonic-cfggen
- How to verify it
Manually test it
Signed-off-by: Stephen Sun <stephens@nvidia.com>
Fix per-command authorization failed issue when a command with wildcard match more than hundred files.
#### Why I did it
When user enable TACACS per-command authorization, and run a command with wildcard , if the command match more than hundreds of files, the per-command authorization will failed with following message:
*** authorize failed by TACACS+ with given arguments, not executing
The root cause of this issue is because bash will match files with wildcard and replace with wildcard args with matched files. when there are too many files, TACACS plugin will generate a big authorization request, which will be reject by server side.
##### Work item tracking
- Microsoft ADO **(number only)**: 18074861
#### How I did it
Fix bash patch file, use original user inputs as authorization parameters.
#### How to verify it
Pass all UT.
Create new UT to validate the TACACS authorization request are using original command arguments.
UT PR: https://github.com/sonic-net/sonic-mgmt/pull/8115
#### Which release branch to backport (provide reason below if selected)
- [ ] 201811
- [ ] 201911
- [ ] 202006
- [ ] 202012
- [ ] 202106
- [ ] 202111
- [X] 202205
- [X] 202211
#### Tested branch (Please provide the tested image version)
- [x] 202205.258490-412b83d0f
- [x] 202211.71966120-1b971c54b5
#### Description for the changelog
Fix per-command authorization failed issue when a command with wildcard match more than hundred files.