Why I did it
Eliminate benign firsttime boot error reported when running on platforms that do not support kdump.
How I did it
Change rc.local to check for presence of the file /etc/default/kdump-tools before referencing it.
How to verify it
Install a new image on an armhf or arm64 platform and check for a failed reference to /etc/default/kdump-tools on firsttime boot.
- External PHY is managed via gearbox (gbsybcd docker container) in SONiC
- Enhanced 'External PHY management' from SONiC's single-ASIC environment to multi-ASIC
- Enhanced gbsyncd docker container from single Namespace to multi-Namspace mode
- Added gbsyncd.service.j2 on per_namespace basis.
- Each namepace/ASIC now to have its unique gbsyncd<ASIC#> docker container with its
own Gearbox table, redis-DB
Signed-off-by: Shyam Kumar <shyakuma@cisco.com>
Why I did it
ConfigDB schema generated by minigraph parser can't pass yang validation.
How I did it
Modify minigraph.py, and use 'state' to replace 'status'.
How to verify it
Run UT for sonic-config-engine.
Use minigraph parser to generate ConfigDB schema, and run yang validation.
Signed-off-by: Gang Lv ganglv@microsoft.com
Why I did it
end2end test is blocked by Yang model for BGP monitor.
How I did it
Create new yang files for BGP monitor, and add UT.
How to verify it
Follow the steps in #9711.
Run UT for sonic-yang-models.
Signed-off-by: Gang Lv ganglv@microsoft.com
Why I did it
Need to be able to run smartctl when pmon docker is not running.
How I did it
Removed the pmon dependency for pmon as well as the command wrapper and added it to the debian-extension.
How to verify it
Stop pmon
Run smartctl from the host and verify it runs without error
c4127c2 [psud] Fix PSU log issue (#235)
07542cb [pmon][xcvrd]xcvrd process show backtrace on the internal port. (#233)
3e432e7 [Y-Cable] Increased unit test coverage of y_cable_helper.py (#229)
7c363f5 [ledd] prevent led crash on recirc port event (#232)
e9ccd82 [sonic-platform-daemons] fix dependency issue on py2 wheels by correcting the path (#234)
2b0acfb [sfp-refactoring] xcvrd: add initial support for CMIS application initialization (#217)
- Why I did it
Optimize thermal control policies to simplify the logic and add more protection code in policies to make sure it works even if kernel algorithm does not work.
- How I did it
Reduce unused thermal policies
Add timely ASIC temperature check in thermal policy to make sure ASIC temperature and fan speed is coordinated
Minimum allowed fan speed now is calculated by max of the expected fan speed among all policies
Move some logic from fan.py to thermal.py to make it more readable
- How to verify it
1. Manual test
2. Regression
Tested on a Celestica Seastone2 DX030 switch
Testing scenarios:
- Various QSFP ports in both normal and breakout config.
- 100G and 40G link speed show different colors.
- SFP1 port works.
Signed-off-by: Christian Svensson <blue@cmd.nu>
- Why I did it
For MSN4410/MSN4600/MSN4700 now they can support fetching PSU voltage threshold, no need to skip the psu voltage check in system health monitoring, so update the system health monitoring configuration file for these platforms.
- How I did it
remove skip PSU change config from the system_health_monitoring_config.json file
- How to verify it
Build image run on these platforms, system health monitoring will not report error against PSU voltage
Signed-off-by: Kebo Liu <kebol@nvidia.com>
```
d9f3afe [fdbshow] Adding more options for fdbshow and show mac (#1982)
902e14f Revert "Revert "[Barefoot] Added CLI to list/set P4 profile (#1951)"" (#2019)
5cc9dd5 Revert "Revert "[sonic-package-manager] support sonic-cli-gen and packages with YANG model (#1650)" (#1972)" (#1994)
```
As part of this, update the isc-dhcp package to match the Bullseye
version (this fixes some compile errors related to BIND), clean up some
of the build dependencies and runtime dependencies for debian packaging,
and use the default Boost version to compile against instead of
explicitly saying using 1.74.
Signed-off-by: Saikrishna Arcot <sarcot@microsoft.com>
- Why I did it
MSN4700 platform has 8 lanes per port and thus can support 2x40G with each lane running at 10G
- How I did it
Added 40G to 2x200G breakout mode in platform.json
- How to verify it
Run config int break Ethernet0 2x40G[200G,100G,50G,25G,10G,1G]
And verify the command runs successfully and the port speed was set to 40G with a 2x breakout.
* Description: Currently IPv4 routes with IPv6 link local next hops are
not properly installed in FPM.
Reason is the netlink decoding truncates the ipv6 LL address to 4 byte
ipv4 address.
Ex : fe80:: is directly converted to ipv4 and it results in 254.128.0.0
as next hop for below routes
show ip route
Codes: K - kernel route, C - connected, S - static, R - RIP,
O - OSPF, I - IS-IS, B - BGP, E - EIGRP, N - NHRP,
T - Table, v - VNC, V - VNC-Direct, A - Babel, D - SHARP,
F - PBR, f - OpenFabric,
> - selected route, * - FIB route, q - queued, r - rejected, b - backup
B>* 2.1.0.0/16 [200/0] via fe80::268a:7ff:fed0:d40, Ethernet0, weight 1,
02:22:26
B>* 5.1.0.0/16 [200/0] via fe80::268a:7ff:fed0:d40, Ethernet0, weight 1,
02:22:26
B>* 10.1.0.2/32 [200/0] via fe80::268a:7ff:fed0:d40, Ethernet0, weight
1, 02:22:26
Hence this fix converts the ipv6-LL address to ipv4-LL (169.254.0.1)
address before sending it to FPM. This is inline with how these types of
routes are currently programmed into kernel.
Signed-off-by: Nikhil Kelapure <nikhil.kelapure@broadcom.com>
[image]: Prevent radius passkey and snmp community string into syslog. (#9727)
#### Why I did it
Prevent radius passkey and snmp community string into syslog.
#### How I did it
Add radius and snmp config command to PASSWD_CMDS
#### How to verify it
Run and pass all UTs.
#### Which release branch to backport (provide reason below if selected)
<!--
- Note we only backport fixes to a release branch, *not* features!
- Please also provide a reason for the backporting below.
- e.g.
- [x] 202006
-->
- [ ] 201811
- [ ] 201911
- [ ] 202006
- [ ] 202012
- [ ] 202106
#### Description for the changelog
Add radius and snmp config command to PASSWD_CMDS to prevent radius passkey and snmp community string into syslog.
#### A picture of a cute animal (not mandatory but encouraged)
#### Why I did it
Build failed.
Due to the error message looks like build failed because `pyversions` utility was not found.
`pyversions` utility is a part of `python2-minimal` package and it wasn't installed.
#### How I did it
To avoid installing python2 just specify explicit python version `--with python3` and use build system for py3 `--buildsystem=pybuild`
#### How to verify it
Run build
* Add intel_iommu=off to installer.conf
* This solve flooding DMAR err msg: "handling fault status reg 2"
Signed-off-by: Sean Wu <sean_wu@edge-core.com>
* remove customized at24 driver
* Use kernel 5.10.46 upstream at24 driver directly. The ADDR16 issue on
old driver has gone.
Signed-off-by: Sean Wu <sean_wu@edge-core.com>
* pin I2C-0/I2C-1 bus order
* otherwise, sometimes I2C-0/I2C-1 will be assigned to the undesired one.
Signed-off-by: Sean Wu <sean_wu@edge-core.com>
* fix i2c bus num for fan driver
Signed-off-by: Sean Wu <sean_wu@edge-core.com>
* backward compatible with R0A/R0B HW
Signed-off-by: Sean Wu <sean_wu@edge-core.com>
* fix workdir for seastone2
Signed-off-by: Viktor Ekmark <viktor@ekmark.se>
* seastone2: Add I2C SFP definition for SFP1
Signed-off-by: Christian Svensson <blue@cmd.nu>
* [device/cel_seastone_2] sfputil logic for SFP1
Earlier logic resulted in the name of SFP1 being SFP33 which is not
correct. The cannonical source is seastone2_fpga module and it calls it
SFP1, so ensure the logic does as well.
Signed-off-by: Christian Svensson <blue@cmd.nu>
* [device/cel_seastone_2] sysfs paths for SFP1
Various changes that plumbs the correct port presence and DOM decoding
for the SFP1 port.
Signed-off-by: Christian Svensson <blue@cmd.nu>
Co-authored-by: Christian Svensson <blue@cmd.nu>
Improve the Linux kernel build cache hit rate.
Current the the hit rate is around 85.8% (based on the last 3 month, 3479 PR builds totally, 494 PR build not hit).
We can improve the hit rate up to 95% or better.
The Linux kernel build will take really long time, most of the PRs are nothing to do with the kernel change. The remaining cache options should be enough to detect the Linux kernel cache status (dirty or not).
Why I did it
The existing log file size in sonic is 1 Mb. Over a period of time this leads to huge number of log files which becomes difficult for monitoring applications to handle.
Instead of large number of small files, the size of the log file is not set to 16 Mb which reduces the number of files over a period of time.
How I did it
Changed the size parameter and related macros in logrotate config for rsyslog
How to verify it
Execute logrotate manually and verify the limit when the file gets rotated.
Signed-off-by: Sudharsan Dhamal Gopalarathnam <sudharsand@nvidia.com>
[Submodule update] sonic-swss
c78aa1b81a3a9001669746067ebbe40b4485f71d (HEAD -> master, origin/master, origin/HEAD) OA changes to support Ordered ECMP and DVS test for same. (#2092)
b4b00031378a6ce303b779159e718d6d20790c11 Handling Invalid CRM configuration gracefully (#2109)
d240cb2d356ec17baa464455f37f88ac5dbc441a [Mellanox] '_8lane' not added to Mellanox 5xxx models with 800G (#2090)
8fd6e488d2a3696b9cfe352a9119c86f0f33e6dc [pfcwd] Add vs test infrastructure (#2077)
b96ee5438b8bf08980846ee84ff69e7ba267b0dc [vnetorch] Advertise vnet tunnel routes (#2058)
[submodule update] sonic-sairedis
d5866a3dccfb3bc50853d740d54203b5cae61eed (HEAD -> master, origin/master, origin/HEAD) [vslib]: fix create MACsec SA error (#986)
f36f7ce6236ae97526e15f00e7688ccced7c0454 Added Support for enum query capability of Nexthop Group Type. (#989)
323b89b14995a84bd6539c8a1df00b77d251f99e Support for MACsec statistics (#892)
26a8a1204e873109537c81462ad1457cf38c2f9e Prevent other notification event storms to keep enqueue unchecked and drained all memory that leads to crashing the switch router (#968)
0cb253a42cd0a641b8e0a3c6a4a54e5397dd8c2d Fix object availability conversion (#974)
- Why I did it
Add sensor conf for MSN4600C A1 platform
- How I did it
Add a new sensor conf file and relevant scripts to support two different versions of the platform
- How to verify it
Run "sensors" cmd to check the output on the A1 platform to see whether it's as expected.
Signed-off-by: Kebo Liu <kebol@nvidia.com>
457e94d51 [macsec_linux]: Fixbug cannot dump the PN due to type error (#42)
f7c073323 Disable P2P module (#41)
7b3b777e2 [ci]: use native arm64 and armhf build pool (#40)
d4e91d66c [sonic_operator]: Increase wait timeout (#39)
43611ef88e [sonic_operators]: Add log in sonic operators (#43)
Signed-off-by: Ze Gan <ganze718@gmail.com>
What I did:-
Enhanced minigraph parser to parse interface name associated with static route nexthop
Why I did:-
One of the use case to support interface name is Chassis Packet. For Chassis Packet we have Static Routes configured to route traffic across line-card. If the FRR programs static route without the interface name then in case if the ip interface that is associated with the nexthop goes down FRR resolves static route nexthop over the default route as we have FRR config ip nht-resolve-via-default which causes undesired behavior. Having interface name with Static Route prevents recursive lookup on default route.
How I verify:
Updated unit-test cases
Manual verification
#### Why I did it
resolves https://github.com/Azure/sonic-buildimage/issues/8779
snmpd writes the below error message in syslog :
snmp#snmpd[27]: truncating integer value > 32 bits
This message is written in syslog when the hrSystemUptime(1.3.6.1.2.1.25.1.1.0 / system uptime) or sysUpTime(1.3.6.1.2.1.1.3 network management portion or snmpd uptime) is queried when either of these counters overflow beyond 32 bit value. This happens the device uptime or snmpd uptime is more than 497 days.
#### How I did it
Reference: https://access.redhat.com/solutions/367093 and https://linux.die.net/man/1/snmpcmd
To avoid seeing this message if the counter grows, the snmpd error log level is changed to display LOG_EMERG, LOG_ALERT, LOG_CRIT, and LOG_DEBUG.
Without this change, LOG_ERR and LOG_WARNING would also be logged in syslog.
#### How to verify it
On a device which is up for more than 497 days, modify supervisord.conf with the change and restart snmp.
Query 1.3.6.1.2.1.1.3 and verify that log message is not seen.
Enable pca954x idle_disconnect to avoid possible I2C device address conflict.
How I did it
Change pca954x device_attr idle_state to -2 (MUX_IDLE_DISCONNECT).
How to verify it
Cat pca954x device_attr idle_state and confirm the value is -2.
Signed-off-by: Sean Wu <sean_wu@edge-core.com>
#### Why I did it
AAA yang model is not up to date.
#### How I did it
Add fallback and trace field, and replace boolean_type
#### How to verify it
Run UT for sonic_yang_models.
Follow the steps from #9710
Why I did it
Config db schema generated by minigraph can’t pass yang validation, bgp_asn must not be None.
How I did it
Update sampe-voq-graph.xml to add bgp_asn.
How to verify it
Build sonic-config-engine.
Run command 'sonic-cfggen -m tests/sample-voq-graph.xml -p tests/voq-sample-port-config.ini --print-data', and check bgp_asn.
Signed-off-by: Gang Lv ganglv@microsoft.com
Why I did it
Update the sonic-swss submodule. The following are new commits in the submodule.
6dae6b8 Add initial value for weight in overlay nexthops (#2096)
d3cd402 [p4rt-tests] Bind response consumer to appl_state_db (#2105)
43e54e5 Fix armhf buildimage artifacts not found issue (#2107)
How I did it
Update the swss submodule pointer.
Provide the changes required for supporting the "show-techsupport" command via the SONiC Management Framework front end mechanisms (CLI, REST, and gNOI). The Management Framework functionality implemented by this PR improves on the the capabilities currently provided by the SONiC Click CLI interface via the "show techsupport" command by providing the following additional features:
- User-friendly "help" information describing command syntax details for CLI invocation.
- Ability to invoke the command via REST and gNOI mechanisms.
Unit test results are attached to this PR.
Fixes#9561Fixes#9570Fixes#9563
Partial fix for #9556
#### Why I did it
- Attributes for dual ToR configs lack YANG model support
#### How I did it
- Extend YANG tests to cover dual ToR use cases
- Extend YANG model to cover dual ToR use cases
- Reduce the default log level to warning so only test failures are printed
#### How to verify it
- Run the YANG model unit tests
Why I did it
Add YANG model file for table VLAN_SUB_INTERFACE
How I did it
Add YANG model file sonic-vlan-sub-interface.yang to describe data structure
modify existing unit-test to cover vlan sub interface
How to verify it
Build sonic-yang-models and sonic-yang-mgmt without errors