Signed-off-by: Yong Zhao <yozhao@microsoft.com>
Why I did it
This PR monitors the memory usage of the streaming telemetry container and restarts the container if its memory usage exceeds the pre-defined threshold.
How I did it
I used the system tool Monit to run a script, memory_checker, which periodically checks the memory usage of the streaming telemetry container. If the memory usage of the telemetry container exceeds the pre-defined threshold 10 times within 20 cycles, an alert message is written to syslog and Monit runs the script restart_service to restart the streaming telemetry container.
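The "10 times during 20 cycles" rule can be pictured as a sliding window over the most recent checks. The following is a minimal Python sketch of that counting logic only. In the PR, Monit itself tracks the cycles, and the class name `ThresholdWindow`, its parameters, and the sample values are hypothetical.

```python
from collections import deque


class ThresholdWindow:
    """Track the last `cycles` checks and report when enough of them failed.

    Hypothetical helper for illustration; Monit does this bookkeeping itself.
    """

    def __init__(self, threshold_bytes, failures_needed=10, cycles=20):
        self.threshold_bytes = threshold_bytes
        self.failures_needed = failures_needed
        # A deque with maxlen drops the oldest result once the window is full.
        self.results = deque(maxlen=cycles)

    def record(self, usage_bytes):
        """Record one check; return True if a restart should be triggered."""
        self.results.append(usage_bytes > self.threshold_bytes)
        return sum(self.results) >= self.failures_needed


if __name__ == "__main__":
    # Assumed threshold of 400 MiB, chosen only for the example.
    window = ThresholdWindow(threshold_bytes=400 * 1024 * 1024)
    # Alternate between usage above and below the threshold.
    samples = [500 * 1024 * 1024, 100 * 1024 * 1024] * 10
    decisions = [window.record(sample) for sample in samples]
    print("restart triggered:", any(decisions))
```

Keeping only the last 20 results means that old threshold violations age out, so a container that was briefly over the limit long ago is not restarted.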
How to verify it
I verified this implementation on device str-7260cx3-acs-1.
- Why I did it
To give SONiC Application Extension developers an environment to run and develop their apps.
- How I did it
Created sonic-sdk and sonic-sdk-buildenv dockers and their dbg versions.
- How to verify it
Build:
$ make -f slave target/sonic-sdk.gz target/sonic-sdk-buildenv.gz
Why I did it
Arista-7260CX3-Q64 is missing T1 MMU configuration.
How I did it
Define T1 MMU configuration for Arista-7260CX3-Q64.
Signed-off-by: Ying Xie <ying.xie@microsoft.com>
- Why I did it
Pick up fix from new hw-management package:
Fix the gearbox thermal zone name, which lacked the thermal zone number suffix
- How I did it
Update the hw-management version number in the make file
Update hw-management submodule pointer
- How to verify it
Run platform related test cases on Mellanox platform
- Fix `thermal.get_position_in_parent` issue which causes test_snmp_phy_entity failure
- Add support for xcvr thermal info so that thermalctld can incorporate it into the cooling algorithm (QSFP and OSFP/QSFP-DD modules only)
- Add improvements to `arista` CLI
#### Why I did it
To fix the following:
```
# psuutil status
Traceback (most recent call last):
  File "/usr/local/bin/psuutil", line 8, in <module>
    sys.exit(cli())
  File "/usr/local/lib/python3.7/dist-packages/click/core.py", line 764, in __call__
    return self.main(*args, **kwargs)
  File "/usr/local/lib/python3.7/dist-packages/click/core.py", line 717, in main
    rv = self.invoke(ctx)
  File "/usr/local/lib/python3.7/dist-packages/click/core.py", line 1137, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/usr/local/lib/python3.7/dist-packages/click/core.py", line 956, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/usr/local/lib/python3.7/dist-packages/click/core.py", line 555, in invoke
    return callback(*args, **kwargs)
  File "/usr/local/lib/python3.7/dist-packages/psuutil/main.py", line 93, in status
    psu_name = psu.get_name()
  File "/usr/local/lib/python3.7/dist-packages/sonic_platform_base/device_base.py", line 28, in get_name
    raise NotImplementedError
NotImplementedError
```
#### How I did it
Implemented get_name
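As a rough illustration of why the traceback occurs and how overriding `get_name` resolves it, here is a minimal sketch. The `DeviceBase` class below is a simplified stand-in for the base class named in the traceback, and the `Psu` class and its "PSU <index>" naming scheme are assumptions, not the code from this PR.

```python
class DeviceBase:
    """Simplified stand-in for the platform base class in the traceback."""

    def get_name(self):
        # The base class leaves this unimplemented, which is what psuutil hit.
        raise NotImplementedError


class Psu(DeviceBase):
    """Hypothetical vendor PSU class that provides the missing method."""

    def __init__(self, index):
        self.index = index

    def get_name(self):
        # Assumed naming scheme; the real implementation may differ.
        return "PSU {}".format(self.index)


if __name__ == "__main__":
    for psu in (Psu(1), Psu(2)):
        print(psu.get_name())
```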
Signed-off-by: Volodymyr Boyko <volodymyrx.boiko@intel.com>
#### Why I did it
To get rid of obsolete code
#### How I did it
Removed plugins folder from device/barefoot
Signed-off-by: Volodymyr Boyko <volodymyrx.boiko@intel.com>
Why I did it
k8s handles hostnames in lower case, so the code ensures that it uses the hostname in all lower case
How I did it
Added a wrapper for device_info.get_hostname that returns the hostname in lower case. This wrapper is used everywhere a hostname is needed for kubectl commands.
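The wrapper idea can be sketched as follows. This is an illustration under stated assumptions: `fake_get_hostname` stands in for `device_info.get_hostname` so the example runs outside a SONiC device, and `get_hostname_lower`, `build_label_command`, and the label value are hypothetical names that do not come from the PR.

```python
def get_hostname_lower(get_hostname):
    """Return the hostname from `get_hostname()` converted to lower case."""
    return get_hostname().lower()


def fake_get_hostname():
    # Stand-in for device_info.get_hostname, which is only available on SONiC.
    return "STR-7260CX3-ACS-1"


def build_label_command(hostname_getter, label):
    """Build a kubectl argument list that always uses the lower-case name."""
    host = get_hostname_lower(hostname_getter)
    return ["kubectl", "label", "nodes", host, label]


if __name__ == "__main__":
    # "example_key=example_value" is a placeholder label for illustration.
    print(build_label_command(fake_get_hostname, "example_key=example_value"))
```

Routing every kubectl call through one wrapper keeps the lower-casing in a single place instead of repeating `.lower()` at each call site.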
How to verify it
Device joins successfully.
#### Why I did it
The label for PSU related sensors on the Spectrum-2 platform is not aligned with the physical location of the PSU.
#### How I did it
Update the label in the sensor conf file for those relevant platforms
Signed-off-by: Kebo Liu <kebol@nvidia.com>
When FECDisabled is set to true in minigraph.py, push 'fec' 'none' explicitly to config_db. When 'fec' is defined in port_config.ini, do not override it with 'rs' for 100G.
#### Why I did it
If a process restricts the nodes it uses via mempolicy/cpusets and those nodes run out of memory, one process may be killed by the oom-killer.
No panic occurs in this case, because memory on other nodes may still be free.
This means the overall system state may not be fatal yet.
#### How I did it
Remove the 'vm.panic_on_oom=1' kernel flag from 'vmcore-sysctl.conf'
Signed-off-by: Neetha John <nejo@microsoft.com>
Why I did it
PG profile settings need to be aligned with Arista-7050-QX-32S
How I did it
Copy over the current settings from Arista-7050-QX-32S and define params for 10G and 1G speeds as well
#### Why I did it
Update sonic-snmpagent submodule to pick up new commits:
> Extend rfc3433.py to support more Physical Entity Sensor MIB entries 28b9dfd3a2
#### How I did it
Update the submodule pointer to include the new commits
#### How to verify it
Run the community SNMP test.
#### Why I did it
According to the thermalctld HLD, each fan must belong to a fan drawer; if the fan drawer does not physically exist, the fan is put into a virtual fan drawer. This PR removes fans from chassis._fan_list.
#### How I did it
1. Don't put fans into chassis._fan_list
2. Always query fans from fan_drawer
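The two steps above can be sketched as a small object model. The classes below are simplified stand-ins written for illustration, not the Mellanox platform code; the method names `get_all_fans` and `get_all_fan_drawers` mirror the platform API, while `iter_fans`, the `is_virtual` flag, and the drawer and fan names are hypothetical.

```python
class Fan:
    def __init__(self, name):
        self.name = name


class FanDrawer:
    """A fan drawer; `is_virtual` marks drawers that do not physically exist."""

    def __init__(self, name, fans, is_virtual=False):
        self.name = name
        self.is_virtual = is_virtual
        self._fans = list(fans)

    def get_all_fans(self):
        return list(self._fans)


class Chassis:
    def __init__(self, fan_drawers):
        # Step 1: fans are never added to the chassis-level list.
        self._fan_list = []
        self._fan_drawer_list = list(fan_drawers)

    def get_all_fan_drawers(self):
        return list(self._fan_drawer_list)

    def iter_fans(self):
        # Step 2: fans are always reached through their drawer.
        for drawer in self._fan_drawer_list:
            yield from drawer.get_all_fans()


if __name__ == "__main__":
    chassis = Chassis([
        FanDrawer("drawer1", [Fan("fan1"), Fan("fan2")]),
        FanDrawer("virtual_drawer", [Fan("fan3")], is_virtual=True),
    ])
    print([fan.name for fan in chassis.iter_fans()])
```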
#### Why I did it
This pull request allows calls to be made through the platform 2.0 API that retrieve the PSU and Chassis hardware revision on Mellanox platforms. Access to these values will aid customers in determining their hardware revisions for debugging and technical support. These values are intended to be eventually exposed through the CLI.
#### How I did it
For the PSU hardware revision I used the existing VPD function calls implemented in https://github.com/Azure/sonic-buildimage/pull/7382
For the Chassis hardware revision I parsed the SMBIOS / DMI type 2 information to retrieve the information.
Signed-off-by: Neetha John <nejo@microsoft.com>
Why I did it
Need proper MMU and QoS settings for Arista-7050QX-32S-S4Q31
How I did it
Updated the settings based on Arista-7050-QX-32S
#### Why I did it
xcvrd crashes when the application advertisement capability flag is not seen for a few transceivers.
#### How I did it
Initialize the additional application capability in `__init__`
Why I did it
Currently, there is a bug in the ntp.conf jinja2 template where it will ignore the src_intf directive in CONFIG_DB if there are multiple IP addresses associated with an interface. This code change fixes that bug and allows the template to select the correct source interface for NTP.
How I did it
I modified the macro in ntp.conf.j2 that determines whether an IP address is associated with an interface. Instead of outputting "true" directly each time it detects a valid interface entry in CONFIG_DB (which could output "true" multiple times for interfaces with multiple valid IP addresses), the macro now sets a state variable.
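The difference between the two approaches can be shown with a plain Python analogue. The real macro is written in Jinja2 and is not reproduced here; the function names, the `(interface, prefix)` entry format, and the assumption that the caller compares the macro output against the string "true" are all illustrative.

```python
def has_ip_buggy(interface, entries):
    """Mimic the old behaviour: emit "true" once per matching address."""
    return "".join("true" for name, _prefix in entries if name == interface)


def has_ip_fixed(interface, entries):
    """Mimic the fix: set a flag while looping, then emit "true" at most once."""
    found = False
    for name, _prefix in entries:
        if name == interface:
            found = True
    return "true" if found else ""


if __name__ == "__main__":
    # Two addresses on the same interface (example values only).
    entries = [("Ethernet1", "10.0.0.1/24"), ("Ethernet1", "10.0.1.1/24")]
    print(has_ip_buggy("Ethernet1", entries) == "true")
    print(has_ip_fixed("Ethernet1", entries) == "true")
```

With two addresses, the old approach yields a repeated string that no longer equals "true", which is consistent with the src_intf directive being ignored.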
How to verify it
Add two ipv4 addresses to an interface in SONiC
Add the following configuration to config_db.json
{
    "NTP": {
        "global": {
            "src_intf": "Ethernet1"
        }
    }
}
Replace Ethernet1 with the interface name of the one you assigned the IP addresses to.
Run sudo config reload -y
Open /etc/ntp.conf and verify that the following line exists
...
interface listen Ethernet1
...
The interface specified should be the one set in the previous steps.
Description for the changelog
[ntp] Fix ntp.conf template to allow setting of source port in CONFIG_DB
#### Why I did it
To avoid the following logs
```
Mar 15 15:52:04.599302 igk-dut-04 INFO database#/supervisord: flushdb /bin/bash: /usr/local/bin/flush_unused_database: /usr/bin/python: bad interpreter: No such file or directory
Mar 15 15:52:04.599947 igk-dut-04 INFO database#supervisord 2021-03-15 15:52:04,599 INFO exited: flushdb (exit status 126; not expected)
```
#### How I did it
Fix shebang
#### How to verify it
Check the logs
Map priority 0 to TC 1 and priority 1 to TC 0
Sent traffic on priority 0 and 1 and verified that it gets mapped correctly in hw
Signed-off-by: Neetha John <nejo@microsoft.com>
Signed-off-by: Neetha John <nejo@microsoft.com>
Fixes #7531
Why I did it
To enable BGP sessions to be established over subinterfaces
How I did it
Listen to VLAN_SUB_INTERFACE table in config db
How to verify it
BGP sessions were established successfully over a subinterface
Why I did it
Start pcie-check.service after config-setup.service, since pcie_util depends on device_info, which becomes available with the config DB metadata.
How I did it
Add config-setup.service as a dependency of pcie-check.service
How to verify it
Upon reboot, check whether pcie-check.sh throws the platform API error that depends on DEVICE_METADATA
Why I did it
Enable redistribution of static routes
How I did it
Enable redistribution of static routes when the first route is added to STATIC_ROUTE table of Config_DB and disable the redistribution when the last route is removed from STATIC_ROUTE table.
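The enable-on-first, disable-on-last behaviour can be sketched as follows. This is a minimal illustration, not the manager code from the PR: the class name, the `commands` list, and the command strings recorded in it are assumptions.

```python
class StaticRouteRedistribution:
    """Track STATIC_ROUTE entries and toggle redistribution at the edges."""

    def __init__(self):
        self.routes = set()
        # Commands that would be pushed to the routing stack (illustrative).
        self.commands = []

    def add_route(self, prefix):
        was_empty = not self.routes
        self.routes.add(prefix)
        if was_empty:
            # First route added: turn redistribution on.
            self.commands.append("redistribute static")

    def remove_route(self, prefix):
        if prefix not in self.routes:
            return
        self.routes.discard(prefix)
        if not self.routes:
            # Last route removed: turn redistribution off.
            self.commands.append("no redistribute static")


if __name__ == "__main__":
    mgr = StaticRouteRedistribution()
    mgr.add_route("10.1.0.0/24")
    mgr.add_route("10.2.0.0/24")
    mgr.remove_route("10.1.0.0/24")
    mgr.remove_route("10.2.0.0/24")
    print(mgr.commands)
```

Only the transitions between an empty and a non-empty table produce a command, so adding or removing intermediate routes does not toggle redistribution repeatedly.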
#### Why I did it
Add initial support of SN4800 platform for Mellanox ASIC simulation device.
NOTE: This is work in progress and not full support of the platform.
#### How I did it
Add new folders for SN4800 with zero ports based on SN4700 Spectrum-3 switch.
This PR updates the following commits in sonic-platform-daemons
e60804c [xcvrd] add support for logging mux_metrics events into state DB (#185)
807b304 [psud] Add PSU Hardware Revision to Redis STATE_DB (#179)
d0be634 [muxcable] Remove Xcvrd Sleep (#174)
cc3803f [thermalctld] Enable stopping thermal manager (#180)
665fcd9 [xcvrd] Fix crash for QSFP DD media (#181)
cdabd09 [xcvrd] Change the y_cable presence logic to use "mux_cable" table as identifier from Config DB (#176)
4be4306 [xcvrd] Enhance Media Settings (#177)
Signed-off-by: vaibhav-dahiya <vdahiya@microsoft.com>
Why I did it
Skip using the web proxy when the packages are already on the proxy server.
For SAI packages or other packages, we will upload them to the proxy server directly; the reproducible build will skip checking the site, so it is not necessary to change the version files.
[config]Static routes to config_db (#1534)
[DPB]: Shut down interface before dynamic port breakout (#1303)
[vlan] remove dhcp-relay as dhcp-relay commands will come as a plugin (#1378)
Add 'default' option for sFlow. (#1606)
[Command-Reference.md] Document new SNMP show and config commands (#1600)
Signed-off-by: Shlomi Bitton <shlomibi@nvidia.com>
1213d61 [thermal_manager_base] Add a stop function to thermal manager (#187)
a95834b [DeviceBase] Added hardware revision number to generic device properties (#184)
f4901a0 [voqinbandif]To support inband port as front panel port (#159)
#### Why I did it
Improve readability of `show environment` output.
#### How I did it
In every sensors.conf, assign customized labels according to the HW specifications of each model.
Signed-off-by: Sean Wu <sean_wu@edge-core.com>
- Why I did it
Add initial support of SN4800 platform.
NOTE: This is work in progress and not full support of the platform.
- How I did it
Add new folders for SN4800 with zero ports based on SN4700 Spectrum-3 switch.
- How to verify it
Simulator device was tested. See #7448