- What I did
Added a daemon to log LPC bus degradation on Intel C2000 processors. Some Intel Rangeley C2000 processors with revision less than or equal to 2 have an issue where the LPC bus degrades over time. The daemon detects the problem and writes a log message when it occurs.
- How I did it
Added a daemon which validates the CPLD scratch (0x102) and SMF scratch (0x202) registers by writing a value and reading it back at a regular polling interval (300 seconds). If the value read differs from the value written, a critical log message is emitted.
- How to verify it
Verified by simulating the issue: the register value is modified between the write and the read, and the log is checked for the expected message.
- Description for the changelog
Added a daemon that identifies the LPC bus degradation issue and reports it via syslog on Dell S6100 and Z9100 platforms. The daemon runs only on processors with revision less than or equal to 2.
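The write-then-read-back check described above can be sketched as follows. This is only an illustration, not the daemon's actual code: the `write_reg` and `read_reg` callables, the logger name and the `0xA5` test pattern are assumptions; only the two register offsets come from the description.

```python
import logging
from typing import Callable

# Register offsets quoted in the description above.
CPLD_SCRATCH = 0x102
SMF_SCRATCH = 0x202

log = logging.getLogger("lpc_monitor")  # hypothetical logger name


def check_scratch(offset: int,
                  pattern: int,
                  write_reg: Callable[[int, int], None],
                  read_reg: Callable[[int], int]) -> bool:
    """Write `pattern` to the scratch register at `offset`, read it back,
    and log a critical message if the two values differ."""
    write_reg(offset, pattern)
    value = read_reg(offset)
    if value != pattern:
        log.critical("LPC bus check failed at 0x%x: wrote 0x%02x, read 0x%02x",
                     offset, pattern, value)
        return False
    return True


def poll_once(write_reg: Callable[[int, int], None],
              read_reg: Callable[[int], int],
              pattern: int = 0xA5) -> bool:
    """Run one polling cycle over both scratch registers.
    Both registers are always checked, even if the first one fails."""
    results = [check_scratch(offset, pattern, write_reg, read_reg)
               for offset in (CPLD_SCRATCH, SMF_SCRATCH)]
    return all(results)
```

A daemon built on this sketch would call `poll_once` once per polling interval (300 seconds in this change) with platform-specific register accessors.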
* Update sonic-quagga submodule
* Port some patches from sonic-quagga
* Fix Makefile
* Another patch
* Uncomment bgp test
* Downport Nikos's patch
* Add a patch to alleviate the vendor issue
* use patch instead of stg
to include update in mellanox PFCWD lua script
matching new SAI
sonic-swss:
407d048 [mellanox] convert logic to use quanta in pfc_detect_mellanox.lua (#930)
67c0940 [test]: Skip test_clear in test_watermark (#937)
c72c34f Enable Vnet/Vxlan VS test (#935)
4c771d0 add incCrmAclUsedCounter and decCrmAclUsedCounter for SAI_ACL_BIND_POINT_TYPE_SWITCH case. (#899)
825c0cb [vs]: Fix bitmap VNET virtual switch test (#936)
4577b40 Add buffer pool watermark support (#853)
4a67378 Add support of VXLAN tunnel removal (#931)
Signed-off-by: Stepan Blyschak <stepanb@mellanox.com>
* [build]: wait 60 seconds for docker engine to start
On some platforms, it can take more than 1 second for docker
engine to start.
Signed-off-by: Guohan Lu <gulv@microsoft.com>
- create a dockerfile-macros.j2 file with all common operations
written as j2 macros
- use a single dockerfile instruction for COPY and RUN commands
when possible to improve build time
- reorganize dockerfile instructions to make them more cache friendly
(in case someday we remove --no-cache when building docker images)
Signed-off-by: Stepan Blyschak <stepanb@mellanox.com>
Some kernels are built with overlayfs as a builtin and not a module.
For these the check via lsmod currently fails.
This improvement now checks the kernel configuration for the
CONFIG_OVERLAY_FS entry. Depending on the OS and kernel version the
build configuration can be in multiple places.
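A minimal sketch of such a configuration check, for illustration only. The candidate paths below are assumptions about where a kernel build configuration commonly lives, not the list used by this change; both `=y` (builtin) and `=m` (module) are treated as enabled.

```python
import gzip
import os
import platform
from typing import Iterable, Optional


def overlay_enabled(config_text: str) -> bool:
    """Return True if the kernel config text enables overlayfs,
    either built in (=y) or as a module (=m)."""
    for line in config_text.splitlines():
        if line.strip() in ("CONFIG_OVERLAY_FS=y", "CONFIG_OVERLAY_FS=m"):
            return True
    return False


def read_kernel_config(paths: Optional[Iterable[str]] = None) -> Optional[str]:
    """Return the text of the first kernel config file found, or None.
    The default candidate paths are assumptions, not taken from the change."""
    if paths is None:
        release = platform.release()
        paths = ["/proc/config.gz",
                 "/boot/config-" + release,
                 "/lib/modules/" + release + "/build/.config"]
    for path in paths:
        if not os.path.isfile(path):
            continue
        # /proc/config.gz is gzip-compressed; the other locations are plain text.
        opener = gzip.open if path.endswith(".gz") else open
        with opener(path, "rt") as handle:
            return handle.read()
    return None
```

A caller would combine the two, for example `text = read_kernel_config()` followed by `overlay_enabled(text)` when `text` is not `None`.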
- What I did
During boot/reload, wait in a loop and check for bcm initialization.
Break out of the loop once the sdk is ready to process the 'bcmcmd' request or the loop count reaches the maximum value.
- How I did it
In the existing implementation, the syncd start process sleeps for a fixed time (3 secs)
to let sdk initialization happen. But the time taken for sdk initialization varies between platforms.
To fix this issue, the syncd start process now waits in a loop and checks whether the sdk is ready to process the 'bcmcmd' command.
- How to verify it
Check for syncd process status and interface status.
Check the syslogs; no failures related to syncd should be present.
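The bounded wait loop described above can be sketched as follows. This is a generic illustration, not the script from this change: `is_ready` stands in for whatever probe reports that the sdk accepts 'bcmcmd' requests, and the default try count and interval are assumed values.

```python
import time
from typing import Callable


def wait_for_sdk(is_ready: Callable[[], bool],
                 max_tries: int = 30,
                 interval: float = 1.0,
                 sleep: Callable[[float], None] = time.sleep) -> bool:
    """Poll `is_ready` until it returns True or `max_tries` polls have been made.
    Returns True if the sdk became ready, False if the loop count ran out."""
    for attempt in range(max_tries):
        if is_ready():
            return True
        # No need to sleep after the final unsuccessful poll.
        if attempt < max_tries - 1:
            sleep(interval)
    return False
```

Compared with a fixed sleep, this returns as soon as the probe succeeds on fast platforms and still gives slow platforms up to `max_tries * interval` seconds.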
This is a follow-up of sonic-snmpagent PR 92.
Now that licensing issues have been solved, FRR is distributed with SNMP
support compiled in. This PR adds the last bits of configuration to get
the frr-snmp debian packages added to the docker container and the
config bits to enable the snmp module in FRR.
This PR brings the functionality of being able to poll bgpd for routes
and peer status.
Signed-off-by: Michel Moriniaux <m.moriniaux@criteo.com>
* src/iproute2/Makefile
* src/python3/Makefile
These Makefiles do not properly clean out the src build subdirectory
prior to downloading the source code contents. This causes an error
during a rebuild following a 'make clean'.
Signed-off-by: Greg Paussa <greg.paussa@broadcom.com>
* [submodule] update sonic-linux-kernel
* update linux kernel version
* Fix many version strings
* update mellanox components (built with new kernel)
* [mlnx] add make files for SDK WJH libs
* Update arista driver submodule (#8)
Make the debian packaging point to a newer kernel version.
* Set the default mac ageing time to 300 seconds
MAC ageing was previously disabled, which could cause the MAC address
table to grow over time and lead to resource and performance issues.
Signed-off-by: Zhenggen Xu <zxu@linkedin.com>
* Update the default HW ageing timer to be 600 seconds.
This is to be on the safe side, since the ARP update interval
is 300 seconds and SONiC does not flood when ARP is aged out.
Signed-off-by: Zhenggen Xu <zxu@linkedin.com>
sonic-swss:
[vnet]: Extend Bitmap VNET test with "remove" flows (#900)
[vxlanorch] Ambiguous return code for removeNextHopTunnel (#880)
Address review comment: remove data member m_entriesCreated, which is introduced for dependancy resolution purpose. (#839)
Set LAG mtu value based on kernel netlink msg (#922)
[orchagent]: Remove try/catch for correct coredump file (#790)
[aclorch] unittest by gtest (#924)
[orchagent]: Added support of PFC WD for BFN platform (#823)
[vnetorch]: Fix tunnel route removal flow for bitmap VNET (#912)
pkill -9 zebra for frr warm restart VS test fix (#927)
swss-orchagent: add new orch for vnet routes/tunnel routes tables in CONFIG_DB (#907)
[debian]: Do not build test when building with real SAI (#932)
sonic-swss-common:
Add schema for dot1p to tc mapping config table (#274)
Fix MIRROR_SESSION table macro name (#264)
[schema] Add VNET Route tables in config_db (#279)
[debian] increment debian compatibility to 10 to enable parallel package build (#280)
White-list clear_stats op from orchagent to syncd (#281)
Correct comment (#282)
sonic-sairedis:
[debian]: Change build order in target binary (#452)
[debian] increment debian compatibility to 10 to enable parallel package build (#461)
Full sleep wait flex counter polling thread when POLL_COUNTER_STATUS is disable (#462)
add support for SAI_ATTR_VALUE_TYPE_ACL_CAPABILITY (#460)
Check if port VID exists in db on flex counter query (#464)
Full sleep wait change for PFC watchdog (#465)
Add synchronous clear_stats operation path (#463)
Modify sai_create_port to breakout a port for virtual switch (#454)
Fix typo (#467)
Signed-off-by: Shu0T1an ChenG <shuche@microsoft.com>
* Updated Makefile infrastructure to build debug images.
As a sample, platform/broadcom/docker-orchagent-brcm.mk is updated to add a docker-orchagent-brcm-dbg.gz target.
Now "BLDENV=stretch make target/docker-orchagent-brcm-dbg.gz" will build the debug image.
NOTE: If you don't specify NOSTRETCH=1, it implicitly calls "make stretch", which builds all stretch targets, and that would include debug dockers too.
This debug image can be used on any linux box to inspect a core file. If your module's external dependencies can be suitably mocked, you may even run it manually inside.
"docker run -it --entrypoint=/bin/bash e47a8fb8ed38"
You may map the core file path to this docker run.
* Dropped the regular binary using DBG_PACKAGES and a small name change to help readability.
* Tweaked the changes to retain the existing behavior w.r.t INSTALL_DEBUG_TOOLS=y.
When this change ('building debug docker image transparently') is extended to all dockers, this flag would become redundant. Yet, there can be some test based use cases that rely on this flag.
Until all the dockers get their debug images by default and we switch all use cases of this flag to the newly built debug images, we need to maintain the existing behavior.
* 1) slave.mk - Dropped unused Docker build args
2) Debug template builder: renamed build_dbg_j2.sh to build_debug_docker_j2.sh
3) Dropped the insignificant CMD statement from the debug Docker file, as the base docker has an Entrypoint.
* Reverted some changes, per review comments.
"User, uid, guid, frr-uid & frr-guid" are required for all docker images, with exception of debug images.
* Get in sync with the new update that filters out dockers to be built (SONIC_STRETCH_DOCKERS_FOR_INSTALLERS) and build debug dockers only for those that are to be built and have a debug target available.
* Make a template for each target that can be shared by all platforms.
Where needed a platform entry can override the template.
This avoids duplication, hence easier to maintain.
* A small change, that can fit better with other targets too.
Just take the platform code and do the rest in template.
* Extended debug to all stretch based docker images
* 1) Combined all orchagent makefiles into one platform independent make under rules/docker-orchagent.mk
2) Extended debug image to all stretch dockers
* Changes per review comments:
1) Dropped LIBSAIREDIS_DBG from database, teamd, router-advertiser, telemetry, and platform-monitor docker*.mk files from _DBG_DEPENDS list
2) W.r.t. the docker make for syncd, moved DEPENDS from the template to the specific makefile and left in the template only what is applicable to all.
* 1) Corrected a copy/paste mistake
* Fixed a copy/paste bug
* The base syncd dockers follow a template, which defines the base docker as DOCKER_SYNCD_BASE instead of DOCKER_SYNCD_<platform code>. Fix the docker-syncd-<mlnx, bfn>.mk to use the new one.
[Yet to be tested locally]
* Fixed spelling mistake
* Enable build of dbg-sonic-broadcom.bin, which uses dbg-dockers in place of regular dockers, for dockers that build debug version. For dockers that do not build debug version, it uses the regular docker.
This debug bin is installable and usable in a DUT, just like a regular bin.
* Per review comments:
1) Share a single rule for final image for normal & debug flavors (e.g. sonic-broadcom.bin & sonic-broadcom-dbg.bin)
2) Put dbg as suffix in final image name.
3) Compared target/sonic-broadcom.bin.logs with & w/o fix to verify integrity of sonic-broadcom.bin
4) Compared target/sonic-broadcom.bin.logs with sonic-broadcom-dbg.bin.log for verification
This fix takes care of ONIE image only. The next PR will cover the rest.
The next PR will also make the debug image conditional on a flag.
* Updated per comments.
Now that debug dockers are available, there is no need for a way to install debug symbols in regular dockers.
With this commit, when INSTALL_DEBUG_TOOLS=y is set, it builds debug dockers (for dockers that enable debug build) and the final image uses debug dockers. For dockers that do not enable debug build, regular dockers get used in the final image.
Note:
The debug dockers are explicitly named as <docker name>-dbg.gz. But there is no "-dbg" suffix for image.
Hence if you make two runs, with and w/o INSTALL_DEBUG_TOOLS=y, you have a complete set of regular dockers + debug dockers. But the image gets overwritten.
So if both regular & debug images are needed, make two runs, one with INSTALL_DEBUG_TOOLS=y and one w/o. Make sure to copy/rename the final image before making the second run.
- What I did
Currently, when the system is under memory pressure, the OOM killer kicks in and kills a rogue process. Killing a rogue process can leave the device unhealthy, leading to blackholing of traffic.
To avoid this, configure the kernel to panic on OOM, which will cause the device to reboot and come back up healthy.
- How I did it
Added the sysctl variable panic_on_oom and set the value to 2.
Setting it to 2 ensures that an OOM condition always results in a kernel panic.
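One way to confirm the setting is to parse the sysctl configuration text and check the resulting value. The helper below is a hypothetical sketch, not part of this change; it assumes the variable appears under its full key `vm.panic_on_oom` in `key = value` form.

```python
from typing import Optional


def panic_on_oom_setting(sysctl_text: str) -> Optional[int]:
    """Return the last vm.panic_on_oom value set in sysctl-style text, or None."""
    value = None
    for raw in sysctl_text.splitlines():
        line = raw.strip()
        # Skip blank lines and comments ('#' or ';' in sysctl.conf syntax).
        if not line or line.startswith(("#", ";")):
            continue
        key, sep, val = line.partition("=")
        if sep and key.strip() == "vm.panic_on_oom":
            value = int(val.strip())
    return value
```

On a running system the effective value can also be read from `/proc/sys/vm/panic_on_oom`.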
These patches add support for the Broadcom XMC card (XLR/GTS). At this moment
only the Tomahawk switch (BCM956960K) is supported. Add the
device/broadcom/x86_64-bcm_xlr-r0 and
platform/broadcom/sonic-platform-modules-brcm-xlr-gts files.
Advancing sub module pointers to dynamic transceiver support feature commit.
- src/sonic-swss
f437f9f..d616764
[policerorch]: Add PolicerOrch to bundle with mirror session (889)
Fix MIRROR_SESSION table macro name (802)
Ignore neighbor entry with BCAST MAC, check SAI status exists (914)
[vstest]: Update the mirror session state table name (917)
[test]: Skip tests under investigation (919)
[debian] increment debian compatibility to 10 to enable parallel package build (911)
[aclorch]: Add MIRROR_DSCP table type (906)
[test]: Mark some VLAN tests as Stretch only (903)
[warm restart assist] assume vector values could be reordered (921)
Suppress storm detect counter increment for ongoing pfc storm case during a warm reboot (869)
Fix vlan incremental config and add vs test cases (799)
Remove *_LEFT fields to allow PFC watchdog to enter fresh into the (897)
add dynamic transceiver tuning support (821)
- src/sonic-platform-common
92b54b1..7f95a2a
Enhance new platform API (19)
Add .gitignore file (28)
[sonic_platform_base] Add sonic_sfp and sonic_eeprom to sonic_platform_base (27)
Added type abbrev name to be used in media_settings.json for Dynamictransceiver tuning (32)
- src/sonic-platform-daemons
c8931f3..366ac0e
Fixed xcvrd shutdown flow. (23)
Add .gitignore file (27)
Dynamic transceiver tuning support (26)
* [sfputil] Remove the dependency on sysfs for sfputil, mainly get_presence and port_to_eeprom_mapping
Remove the dependency on sysfs, including:
1. rewrite get_presence by using ethtool;
2. remove interface port_to_eeprom_mapping which is no longer referenced;
3. remove code that references port_to_eeprom_mapping and _port_to_eeprom_mapping;
4. remove private member qsfp_sysfs_path which is no longer referenced.
* [sfputil.py]
minor adjustment: move the presence=False to the beginning of get_presence.
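An ethtool-based presence check might look like the sketch below. It is not the sfputil code from this change: the function signature, the injectable `run` helper and the rule "exit status 0 means present" are assumptions made for illustration. Only `ethtool -m` (dump module EEPROM) is a real ethtool option.

```python
import subprocess
from typing import Callable, Sequence


def _run(cmd: Sequence[str]) -> int:
    """Run a command quietly and return its exit status (127 if it is not installed)."""
    try:
        return subprocess.run(cmd, stdout=subprocess.DEVNULL,
                              stderr=subprocess.DEVNULL).returncode
    except FileNotFoundError:
        return 127


def get_presence(ifname: str,
                 run: Callable[[Sequence[str]], int] = _run) -> bool:
    """Return True if `ethtool -m <ifname>` can read the module EEPROM."""
    presence = False  # default is set first, as in the follow-up adjustment
    if run(["ethtool", "-m", ifname]) == 0:
        presence = True
    return presence
```

Passing `run` as a parameter keeps the check testable without real hardware; the interface name given to `get_presence` is platform specific.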