Improve the Linux kernel build cache hit rate.
Current the the hit rate is around 85.8% (based on the last 3 month, 3479 PR builds totally, 494 PR build not hit).
We can improve the hit rate up to 95% or better.
The Linux kernel build will take really long time, most of the PRs are nothing to do with the kernel change. The remaining cache options should be enough to detect the Linux kernel cache status (dirty or not).
Why I did it
The existing log file size in sonic is 1 Mb. Over a period of time this leads to huge number of log files which becomes difficult for monitoring applications to handle.
Instead of large number of small files, the size of the log file is not set to 16 Mb which reduces the number of files over a period of time.
How I did it
Changed the size parameter and related macros in logrotate config for rsyslog
How to verify it
Execute logrotate manually and verify the limit when the file gets rotated.
Signed-off-by: Sudharsan Dhamal Gopalarathnam <sudharsand@nvidia.com>
[Submodule update] sonic-swss
c78aa1b81a3a9001669746067ebbe40b4485f71d (HEAD -> master, origin/master, origin/HEAD) OA changes to support Ordered ECMP and DVS test for same. (#2092)
b4b00031378a6ce303b779159e718d6d20790c11 Handling Invalid CRM configuration gracefully (#2109)
d240cb2d356ec17baa464455f37f88ac5dbc441a [Mellanox] '_8lane' not added to Mellanox 5xxx models with 800G (#2090)
8fd6e488d2a3696b9cfe352a9119c86f0f33e6dc [pfcwd] Add vs test infrastructure (#2077)
b96ee5438b8bf08980846ee84ff69e7ba267b0dc [vnetorch] Advertise vnet tunnel routes (#2058)
[submodule update] sonic-sairedis
d5866a3dccfb3bc50853d740d54203b5cae61eed (HEAD -> master, origin/master, origin/HEAD) [vslib]: fix create MACsec SA error (#986)
f36f7ce6236ae97526e15f00e7688ccced7c0454 Added Support for enum query capability of Nexthop Group Type. (#989)
323b89b14995a84bd6539c8a1df00b77d251f99e Support for MACsec statistics (#892)
26a8a1204e873109537c81462ad1457cf38c2f9e Prevent other notification event storms to keep enqueue unchecked and drained all memory that leads to crashing the switch router (#968)
0cb253a42cd0a641b8e0a3c6a4a54e5397dd8c2d Fix object availability conversion (#974)
- Why I did it
Add sensor conf for MSN4600C A1 platform
- How I did it
Add a new sensor conf file and relevant scripts to support two different versions of the platform
- How to verify it
Run "sensors" cmd to check the output on the A1 platform to see whether it's as expected.
Signed-off-by: Kebo Liu <kebol@nvidia.com>
457e94d51 [macsec_linux]: Fixbug cannot dump the PN due to type error (#42)
f7c073323 Disable P2P module (#41)
7b3b777e2 [ci]: use native arm64 and armhf build pool (#40)
d4e91d66c [sonic_operator]: Increase wait timeout (#39)
43611ef88e [sonic_operators]: Add log in sonic operators (#43)
Signed-off-by: Ze Gan <ganze718@gmail.com>
What I did:-
Enhanced minigraph parser to parse interface name associated with static route nexthop
Why I did:-
One of the use case to support interface name is Chassis Packet. For Chassis Packet we have Static Routes configured to route traffic across line-card. If the FRR programs static route without the interface name then in case if the ip interface that is associated with the nexthop goes down FRR resolves static route nexthop over the default route as we have FRR config ip nht-resolve-via-default which causes undesired behavior. Having interface name with Static Route prevents recursive lookup on default route.
How I verify:
Updated unit-test cases
Manual verification
#### Why I did it
resolves https://github.com/Azure/sonic-buildimage/issues/8779
snmpd writes the below error message in syslog :
snmp#snmpd[27]: truncating integer value > 32 bits
This message is written in syslog when the hrSystemUptime(1.3.6.1.2.1.25.1.1.0 / system uptime) or sysUpTime(1.3.6.1.2.1.1.3 network management portion or snmpd uptime) is queried when either of these counters overflow beyond 32 bit value. This happens the device uptime or snmpd uptime is more than 497 days.
#### How I did it
Reference: https://access.redhat.com/solutions/367093 and https://linux.die.net/man/1/snmpcmd
To avoid seeing this message if the counter grows, the snmpd error log level is changed to display LOG_EMERG, LOG_ALERT, LOG_CRIT, and LOG_DEBUG.
Without this change, LOG_ERR and LOG_WARNING would also be logged in syslog.
#### How to verify it
On a device which is up for more than 497 days, modify supervisord.conf with the change and restart snmp.
Query 1.3.6.1.2.1.1.3 and verify that log message is not seen.
Enable pca954x idle_disconnect to avoid possible I2C device address conflict.
How I did it
Change pca954x device_attr idle_state to -2 (MUX_IDLE_DISCONNECT).
How to verify it
Cat pca954x device_attr idle_state and confirm the value is -2.
Signed-off-by: Sean Wu <sean_wu@edge-core.com>
#### Why I did it
AAA yang model is not up to date.
#### How I did it
Add fallback and trace field, and replace boolean_type
#### How to verify it
Run UT for sonic_yang_models.
Follow the steps from #9710
Why I did it
Config db schema generated by minigraph can’t pass yang validation, bgp_asn must not be None.
How I did it
Update sampe-voq-graph.xml to add bgp_asn.
How to verify it
Build sonic-config-engine.
Run command 'sonic-cfggen -m tests/sample-voq-graph.xml -p tests/voq-sample-port-config.ini --print-data', and check bgp_asn.
Signed-off-by: Gang Lv ganglv@microsoft.com
Why I did it
Update the sonic-swss submodule. The following are new commits in the submodule.
6dae6b8 Add initial value for weight in overlay nexthops (#2096)
d3cd402 [p4rt-tests] Bind response consumer to appl_state_db (#2105)
43e54e5 Fix armhf buildimage artifacts not found issue (#2107)
How I did it
Update the swss submodule pointer.
Provide the changes required for supporting the "show-techsupport" command via the SONiC Management Framework front end mechanisms (CLI, REST, and gNOI). The Management Framework functionality implemented by this PR improves on the the capabilities currently provided by the SONiC Click CLI interface via the "show techsupport" command by providing the following additional features:
- User-friendly "help" information describing command syntax details for CLI invocation.
- Ability to invoke the command via REST and gNOI mechanisms.
Unit test results are attached to this PR.
Fixes#9561Fixes#9570Fixes#9563
Partial fix for #9556
#### Why I did it
- Attributes for dual ToR configs lack YANG model support
#### How I did it
- Extend YANG tests to cover dual ToR use cases
- Extend YANG model to cover dual ToR use cases
- Reduce the default log level to warning so only test failures are printed
#### How to verify it
- Run the YANG model unit tests
Why I did it
Add YANG model file for table VLAN_SUB_INTERFACE
How I did it
Add YANG model file sonic-vlan-sub-interface.yang to describe data structure
modify existing unit-test to cover vlan sub interface
How to verify it
Build sonic-yang-models and sonic-yang-mgmt without errors
Correct thrift.0.13.0 dependent package name.
In previous code, the buildout target was named as PYTHON3_THRIFT_0_13_0
But when add the prackage to LIBTHRIFT_0_13_0, it typo as PYTHON_THRIFT_0_13_0
Signed-off-by: Roman Savchuk romanx.savchuk@intel.com
Why I did it
Update SDK with fixed for warmboot tofino2
Added platform interface extension
How I did it
Create new package for Y2, Y1, X2, X1 profiles
How to verify it
Run test suites from sonic-mgmt repo
#### Why I did it
It should be handled by `ConfigDBConnector.typed_to_raw()`.
This is a bug for `sonic-cfggen -m --print-data` only
```
"PORTCHANNEL_MEMBER": {
"PortChannel0001|Ethernet112": {
"NULL": "NULL"
},
"PortChannel0002|Ethernet116": {
"NULL": "NULL"
},
"PortChannel0003|Ethernet120": {
"NULL": "NULL"
},
"PortChannel0004|Ethernet124": {
"NULL": "NULL"
}
},
```
But not appears in `sonic-cfgen -d --print-data`.
```
"PORTCHANNEL_MEMBER": {
"PortChannel0001|Ethernet112": {},
"PortChannel0002|Ethernet116": {},
"PortChannel0003|Ethernet120": {},
"PortChannel0004|Ethernet124": {}
},
```
Tested in a T0 KVM.
On a multi-asic Supervisor card, running commands like
'show interface counter' opens a confid_db connection per
namespace per interface which results in many duplicate connections
exceeding the allowed open file handles. This causes the command to fail.
Caching the connections to prevent duplicate handles.
When the package name with special characters, such as +, the package name may be encoded as %2b, the package url will not be found when reproducible build enabled.
Since ++ is treated as a regex by apt-get, make sure the development
headers and the Java bindings for grpc don't get installed as well.
Signed-off-by: Saikrishna Arcot <sarcot@microsoft.com>
The SAI credo libraries don't depend on that library. This saves about
36MB of disk storage space.
Signed-off-by: Saikrishna Arcot <sarcot@microsoft.com>
Why I did it
Config db schema generated by minigraph can’t pass yang validation, portchannel_member has invalid port.
How I did it
Update test minigraph to remove invalid port channel.
How to verify it
Build sonic-config-engine.
Run command 'sonic-cfggen -m xxx.xml --print-data', and check port channel member.
Signed-off-by: Gang Lv ganglv@microsoft.com
Add docker-syncd-centec-rpc and docker-saiserver-centec to buster docker image list. These 2 docker images are based on docker-config-engine-buster.
Co-authored-by: Xianghong Gu <xgu@centecnetworks.com>
Why I did it
Config db schema generated by minigraph can’t pass yang validation, and there's no 'alias' field in yang model.
Minigraph parser supports 'alias' field for VLAN.
How I did it
Add 'alias' field to sonic-vlan.yang
How to verify it
Build sonic-yang-models.
Run command 'sonic-cfggen -m xxx.xml --print-data', and run yang validation.
Signed-off-by: Gang Lv ganglv@microsoft.com
Why I did it
Recently additional sensors that were needed only for specific system added to all systems and caused errors.
How I did it
* Include CPU board and switch board sensors only on SN2201 system
* Fix issue in test_chassis_thermal, now it skips non existing thermals.
How to verify it
Run show platform temperature
Signed-off-by: liora <liora@nvidia.com>
Why I did it
Config db schema generated by minigraph can’t pass yang validation, there's no Vlan31 in 'VLAN' table.
How I did it
Update test minigraph to add vlan interface.
How to verify it
Build sonic-yang-models.
Run command 'sonic-cfggen -m tests/fg-ecmp-sample-minigraph.xml -p tests/mellanox-sample-port-config.ini --print-data', and run yang validation.
Signed-off-by: Gang Lv ganglv@microsoft.com
Why I did it
Reboot cause is not working in S52xx platforms.
How I did it
Modified platform API's.
How to verify it
Check "show reboot-cause" to verify the reboot reason
#### Why I did it
Fixes https://github.com/Azure/sonic-utilities/issues/1871
From [generic-config-updater](https://github.com/Azure/sonic-utilities/tree/master/generic_config_updater) we call `sonic-yang-mgmt` multiple times in order to check a certain change to ConfigDb is valid or not. It is expected for some changes to be invalid, so always printing errors from `sonic-yang-mgmt` makes the output hard to read.
In this PR, we are adding a way to control if logs should be printed or not.
#### How I did it
- Added `print_log_enabled` flag to sonic_yang ctor
- Converted all `print` statements to `sysLog(..., doPrint=True)`
#### How to verify it
unit-test passing means the change did not break logs.
#### Info about libyang logging
libyang provides an extensive logging logic which can support a lot of scenarios:
- ly_log_level: setting logging level
- LY_LLERR
- LY_LLWRN
- ...
- ly_set_log_clb: setting log callback to customize the default behavior which is printing the msgs
- ly_log_options: setting logging options
- LY_LOLOG: If callback is set use it, otherwise just print. If flag is not set, do nothing.
- ...
For more info refer to:
- https://netopeer.liberouter.org/doc/libyang/devel/html/group__logopts.html#gaff80501597ed76344a679be2b90a1d0a
- https://netopeer.liberouter.org/doc/libyang/devel/html/group__log.html#gac88b78694dfe9efe0450a69603f7eceb
#### What's next?
Consume the new flag `print_log_enabled` in [generic-config-updater](https://github.com/Azure/sonic-utilities/tree/master/generic_config_updater) to reduce the logging clutter.
#### Which release branch to backport (provide reason below if selected)
<!--
- Note we only backport fixes to a release branch, *not* features!
- Please also provide a reason for the backporting below.
- e.g.
- [x] 202006
-->
- [ ] 201811
- [ ] 201911
- [ ] 202006
- [ ] 202012
- [ ] 202106
#### Description for the changelog
<!--
Write a short (one line) summary that describes the changes in this
pull request for inclusion in the changelog:
-->
#### A picture of a cute animal (not mandatory but encouraged)
Why I did it
To incorporate the below changes in DellEMC S6100, S6000 platforms.
S6100, S6000:
Implement 'get_revision' method for Chassis
Implement 'get_maximum_consumed_power' method for FanDrawer
Implement 'get_revision', 'get_maximum_supplied_power' methods for PSU
Implement 'get_error_description' method for SFP.
S6100:
Implement 'get_module_index' method for Chassis
Implement 'get_description', 'get_maximum_consumed_power', 'get_oper_status', 'get_slot' methods for Module
Update component names in platform.json
How I did it
Implement the platform API methods in the respective device files
How to verify it
Verified that the respective sonic-mgmt platform API test cases report success.
Why I did it
To include newer Fan LED, thermal capabilities fields in platform.json of DellEMC S6000, S6100, Z9332f platforms.
How I did it
Add the capabilities fields in each platform's respective platform.json.
How to verify it
Ran sonic-mgmt platform api test cases that use capabilities fields and verified that the results are as expected.
Why I did it
Some platforms need to run few steps before the PDDF service is actually started.
* Adding pre_pddf_init script in the service file
* Raising exception for get_target_speed() for PSU-fan in PDDF (#8129)
- Why I did it
PDDF utils were python2 compliant and they needed to be migrated to Python3 (as per Bullseye)
PDDF common platform APIs file name changed as the name was already in use
Indentation issues
Dead/redundant code needed to be removed
- How I did it
Made files Python3 compliant
Indentation corrected
Redundant code removed
- How to verify it
AS7326 Accton platform uses PDDF. PDDF utils were run on this platform to verify.