**- Why I did it**
Initially, the critical_processes file contains either the name of critical process or the name of group.
For example, the critical_processes file in the dhcp_relay container contains a single group name
`isc-dhcp-relay`. When testing the autorestart feature of each container, we need get all the critical
processes and test whether a container can be restarted correctly if one of its critical processes is
killed. However, it will be difficult to differentiate whether the names in the critical_processes file are
the critical processes or group names. At the same time, changing the syntax in this file will separate the individual process from the groups and also makes it clear to the user.
Right now the critical_processes file contains two different kind of entries. One is "program:xxx" which indicates a critical process. Another is "group:xxx" which indicates a group of critical processes
managed by supervisord using the name "xxx". At the same time, I also updated the logic to
parse the file critical_processes in supervisor-proc-event-listener script.
**- How to verify it**
We can first enable the autorestart feature of a specified container for example `dhcp_relay` by running the comman `sudo config container feature autorestart dhcp_relay enabled` on DUT. Then we can select a critical process from the command `docker top dhcp_relay` and use the command `sudo kill -SIGKILL <pid>` to kill that critical process. Final step is to check whether the container is restarted correctly or not.
**- Why I did it**
After discussed with Joe, we use the string "/usr/bin/syncd\s" in Monit configuration file to monitor
syncd process on Broadcom and Mellanox. Due to my careless, I did not find this bug during the
previous testing. If we use the string "/usr/bin/syncd" in Monit configuration file to monitor the
syncd process, Monit will not detect whether syncd process is running or not.
If we ran the command `sudo monit procmactch “/usr/bin/syncd”` on Broadcom, there will be three
processes in syncd container which matched this "/usr/bin/syncd": `/bin/bash /usr/bin/syncd.sh
wait`, `/usr/bin/dsserve /usr/bin/syncd –diag -u -p /etc/sai.d/sai.profile` and `/usr/bin/syncd –diag -
u -p /etc/sai.d/said.profile`. Monit will select the processes with the highest uptime (at there
`/bin/bash /usr/bin/syncd.sh wait`) to match and did not select `/usr/bin/syncd –diag -u -p
/etc/sai.d/said.profile` to match.
Similarly, On Mellanox Monit will also select the process with the highest uptime (at there
`/bin/bash /usr/bin/syncd.sh wait`) to match and did not select `/usr/bin/syncd –diag -u -p
/etc/sai.d/said.profile` to match.
That is why Monit is unable to detect whether syncd process is running or not if we use the string “/usr/bin/syncd” in Monit configuration file. If we use the string "/usr/bin/syncd\s" in Monit configuration file, Monit can filter out the process `/bin/bash /usr/bin/syncd.sh wait` and thus can correctly monitor the syncd process.
**- How I did it**
**- How to verify it**
Signed-off-by: Yong Zhao <yozhao@microsoft.com>
* Change port index in port_config.ini to 1-based
* Add default port index to port_config.ini, change platform plugins to accept 1-based port index
* fix port index in sfp_event.py
**- Why I did it**
The tx_disable function isn't work for the accton_ax5835-54x device.
**- How I did it**
Fix the incorrect path of the sfp node path inside the util file.
**- How to verify it**
Test with
"sudo accton_as5835_54x_util.py show"
"sudo accton_as5835_54x_util.py set sfp"
There should see correct value for module_present and module_tx_disable. And should able to set it.
Signed-off-by: kuanyu_chen <kuanyu_chen@edge-core.com>
The `-sv2` suffix was used to differentiate SNMP Dockers when we transitioned from "SONiCv1" to "SONiCv2", about four years ago. The old Docker materials were removed long ago; there is no need to keep this suffix. Removing it aligns the name with all the other Dockers.
Also edit Monit configuration to detect proper snmp-subagent command line in Buster, and make snmpd command line matching more robust.
- Add .gitignore files in each subdirectory of src/, so as to reduce the size of the .gitignore file in the project root, and also make it easier to maintain (i.e., if a directory in src/ is removed, there will not be outdated entries in the root .gitignore file.
- Also add missing .gitignore entries and remove outdated entries and duplicates.
The -sv2 suffix was used to differentiate SNMP Dockers when we transitioned from "SONiCv1" to "SONiCv2", about four years ago. The old Docker materials were removed long ago; there is no need to keep this suffix. Removing it aligns the name with all the other Dockers.
**- Why I did it**
For decoding system EEPROM of S6000 based on Dell offset format and S6000-ON’s system EEPROM in ONIE TLV format.
**- How I did it**
- Differentiate between S6000 and S6000-ON using the product name available in ‘dmi’ ( “/sys/class/dmi/id/product_name” )
- For decoding S6000 system EEPROM in Dell offset format and updating the redis DB with the EEPROM contents, added a new class ‘EepromS6000’ in eeprom.py,
- Renamed certain methods in both Eeprom, EepromS6000 classes to accommodate the plugin-specific methods.
**- How to verify it**
- Use 'decode-syseeprom' command to list the system EEPROM details.
- Wrote a python script to load chassis class and call the appropriate methods.
UT Logs: [S6000_eeprom_logs.txt](https://github.com/Azure/sonic-buildimage/files/4735515/S6000_eeprom_logs.txt), [S6000-ON_eeprom_logs.txt](https://github.com/Azure/sonic-buildimage/files/4735461/S6000-ON_eeprom_logs.txt)
Test script: [eeprom_test_py.txt](https://github.com/Azure/sonic-buildimage/files/4735509/eeprom_test_py.txt)
- Sensor and Fan information added to primary platforms for thermal API.
- Refactors involving better abstractions, code reuse and dead code removal.
- Improvements to the diag capabilities
- Pylintrc added to improve code quality. Will become fatal at a later time.
Co-authored-by: Baptiste Covolato <baptiste@arista.com>
Cleanup description string
First port (management port) are excluded from general port naming scheme.
Management port are excluded from general port naming scheme.
before:
|on GNS3 |in SONiC |
|---------|---------|
|Ethernet0|eth0 |
|Ethernet1|Ethernet0|
|Ethernet2|Ethernet4|
|Ethernet3|Ethernet8|
after:
|on GNS3 |in SONiC |
|---------|---------|
|eth0 |eth0 |
|Ethernet0|Ethernet0|
|Ethernet1|Ethernet4|
|Ethernet2|Ethernet8|
Signed-off-by: Masaru OKI <masaru.oki@gmail.com>
* Update sonic-sairedis (sairedis with SAI 1.6 headers)
* Update SAIBCM to 3.7.4.2, which is built upon SAI1.6 headers
* missed updating BRCM_SAI variable, fixed it
* Update SAIBCM to 3.7.4.2, updated link to libsaibcm
* [Mellanox] Update SAI (release:v1.16.3; API:v1.6)
Signed-off-by: Volodymyr Samotiy <volodymyrs@mellanox.com>
* Update sonic-sairedis pointer to include SAI1.6 headers
* [Mellanox] Update SDK to 4.4.0914 and FW to xx.2007.1112 to match SAI 1.16.3 (API:v1.6)
Signed-off-by: Volodymyr Samotiy <volodymyrs@mellanox.com>
* ensure the veth link is up in docker VS container
* ensure the veth link is up in docker VS container
* [Mellanox] Update SAI (release:v1.16.3.2; API:v1.6)
Signed-off-by: Volodymyr Samotiy <volodymyrs@mellanox.com>
* use 'config interface startup' instead of using ifconfig command, also undid the previous change'
Co-authored-by: Volodymyr Samotiy <volodymyrs@mellanox.com>
- Skip thermalctld in DellEMC S6000, S6100, Z9100 and Z9264 platforms.
- Change the return type of thermal Platform APIs in DellEMC S6000, S6100, Z9100 and Z9264 platforms to 'float'.
* [platform]: Add a new supported platform, Delta-agc032
Switch Vendor: Delta
Switch SKU: Delta-agc032
CPU: BROADWELL-DE
ASIC Vendor: Broadcom
Switch ASIC: Tomahawk3, BCM56980
Port Configuration: 32x400G + 2x10G
- What I did
Add a new Delta platform Delta-agc032.
- How I did it
Add files by following SONiC Porting Guide.
- How to verify it
1. decode-syseeprom
2. sensors
3. psuutil
4. sfputil
5. show interface status
6. bcmcmd
Signed-off-by: zoe-kuan <ZOE.KUAN@deltaww.com>
3y Power YPEB-1200am PSU doens't support read fan_dir from pmbus register
Check with vendor this PUS type only support F2B fan direction. So add to show "F2B"
when red psu_fan_dir sysfs.
* Trigger thermal action log only if thermal condition changes
* test file existence before read file content
* fix error for set psu fan speed
* Remove logs because it print too frequently
- bug fix : Fixed an issue which the nps ko file was not loaded due to the wrong service file name
- Optimize the code to reduce changes due to the kernel upgrade
- Remove nephos ko file loaded in swss.service.j2 because it has loaded at syncd.service.j2
For detecting transceiver change events through xcvrd in DellEMC S6000, S6100 and Z9100 platforms.
- In S6000, rename 'get_transceiver_change_event' in chassis.py to 'get_change_event' and return appropriate values.
- In S6100, implement 'get_change_event' through polling method (poll interval = 1 second) in chassis.py (Transceiver insertion/removal does not generate interrupts due to a CPLD bug)
- In Z9100, implement 'get_change_event' through interrupt method using select.epoll().
This patchset implement the following:
- Setting the FAN frequency
- Corrections to the EM policy with respect to platform
defined temperature / fan values
- Updates to the platform monitorng script logging
- Fixes to platform initialization script
Signed-off-by: Ciju Rajan K <crajank@juniper.net>
Update SAI/SDK/FW and MSN4700 device files to support 8 lanes 400G
Update SAI to 1.16.3
Update SDK to 4.4.0914
Update FW to *.2007.1112
Update MSN4700 device files to support 8 lanes 400G
currently, vs docker always create 32 front panel ports.
when vs docker starts, it first detects the peer links
in the namespace and then setup equal number of front panel
interfaces as the peer links.
Signed-off-by: Guohan Lu <lguohan@gmail.com>
- Add Makefile rules to build debug containers for SWI images
- Fix some platform API implementation for xcvrd and thermalctld
- Improvements to arista diag command
- Miscellaneous refactors
Co-authored-by: Maxime Lorrillere <mlorrillere@arista.com>
* New SKU support for MSN3420
Signed-off-by: Shlomi Bitton <shlomibi@mellanox.com>
Conflicts:
device/mellanox/x86_64-mlnx_msn2700-r0/plugins/sfputil.py
* Add CPLD's
* Symlink fixes and semantics
* Adding new platform at end of lines
This patch set implements the following:
- Fixes the conflicts in chassis.py / platform.py in sonic_platfrom
- Consolidating the common library files in sonic_platform
- Moving QFX5210 specific drivers to qfx5210/modules
- Moving Juniper common fpga drivers to common/modules
- Cleaning up the platform driver files
- Bug fixes in QFX5210 platform monitor & initialiazation script
- Fixing the bugs in QFX5210 eeprom parsing
Signed-off-by: Ciju Rajan K <crajank@juniper.net>
[baseimage]: upgrade base image to debian buster
bring up the base image to debian buster 4.19 kernel.
using the merge commits to preserve the individual commits to better track the history.
Currently we port SONiC to buster in a way that base image is on buster and
other dockers based on stretch. The benefit is that tasks can be carried out
simultaneously.
The build procedure can be treated as 2 stages.
The first stage is to build the stretch-based debs and dockers and the second
stage is to build the buster-based ones.
One thing we have to pay attention to is some debs depend on kernel should not
be built at stretch stage because the kernel isn't available at that time.
The idea is to move that kind of debs out of SONIC_STRETCH_DEBS. Meanwhile,
any dependency explicitly put on the stretch based dockers on kernel should be
removed.
1. undefine led_classdev_register as it is defined in leds.h
2. header file location change
a. linux/i2c/pmbus.h -> linux/pmbus.h
b. linux/i2c-mux-gpio.h -> linux/platform_data/i2c-mux-gpio.h
c. linux/i2c/pca954x.h -> linux/platform_data/pca954x.h
- build SONIC_STRETCH_DOCKERS in sonic-slave-stretch docker
- build image related module in sonic-slave-buster docker.
This includes all kernels modules and some packages
Signed-off-by: Guohan Lu <lguohan@gmail.com>
This is a 1RU switch with 32 QSFP28 (40G/100G) ports on
Broadcom Tomahawk I chipset. CPU used in QFX5200-32C-S
is Intel Ivy Bridge. The machine has Redundant and
hot-swappable Power Supply (1+1) and also has Redundant
and hot swappable fans (5).
Signed-off-by: Ciju Rajan K <crajank@juniper.net>
Signed-off-by: Ashish Bhensdadia <bashish@juniper.net>
FPGA driver crash fix for stale buffer in i2c transfer
LED firmware load issue fix.
10G port swapfix
psu/sfp bug fixes to report correct states/status of hw
libpython2.7, libdaemon0, libdbus-1-3, libjansson4 are common
across different containers. move them into docker-base-stretch
Signed-off-by: Guohan Lu <lguohan@gmail.com>
* [thermal control] Fix pmon docker stop issue on 3800
* [thermal fix] Fix QA test issue
* [thermal fix] change psu._get_power_available_status to psu.get_power_available_status
* [thermal fix] adjust log for PSU absence and power absence
* [thermal fix] add unit test for loading thermal policy file with duplicate conditions in different policies
* [thermal] fix fan.get_presence for non-removable SKU
* [thermal fix] fix issue: fan direction is based on drawer
* Fix issue: when fan is not present, should not read fan direction from sysfs but directly return N/A
* [thermal fix] add unit test for get_direction for absent FAN
* Unplugable PSU has no FAN, no need add a FAN object for this PSU
* Update submodules
Co-authored-by: Stephen Sun <5379172+stephenxs@users.noreply.github.com>
* add MSN4700 device files
* update ACS-MSN4700 sai profile
* update buffer pool size, headroom, sensor conf, port config and reboot scripts
* fix ident
* update sensor conf and buffer pool
* [sn4700] add sku 4700 to chassis.py
* [Mellanox-4700] Add 4700 info to psu and thermal platform API
* update buffer config file template to the latest.
update SAI profile to use 100G X 4lanes for now
update port_config.ini according to the SAI profile
* [Mellanox]Update the buffer configurations for 4700
* fix alignment in pg_profile_lookup.ini
* add platform components file for new sku
* Update device/mellanox/x86_64-mlnx_msn4700-r0/ACS-MSN4700/pg_profile_lookup.ini
Co-Authored-By: Nazarii Hnydyn <nazariig@mellanox.com>
* remove redundant line
* [Mellanox]Correct type, buffer size
Co-authored-by: Nazarii Hnydyn <nazariig@mellanox.com>
Co-authored-by: junchao <junchao@mellanox.com>
Co-authored-by: Stephen Sun <stephens@mellanox.com>
In the file "files/build_templates/docker_image_ctl.j2", it adds the option
"--net" to the docker create command through the commit
"abe7ef7e2e2e1215c97cee19a83aab0b130cebe5" (#3856).
Remove the "--net" option in "docker-syncd-brcm-rpc.mk" to avoid
specifying duplicated "--net" options.
Signed-off-by: charlie_chen <charlie_chen@edge-core.com>
DPKG caching framework provides the infrastructure to cache the sonic module/target .deb files into a local cache by tracking the target dependency files.SONIC build infrastructure is designed as a plugin framework where any new source code can be easily integrated into sonic as a module and that generates output as a .deb file. The source code compilation of a module is completely independent of other modules compilation. Inter module dependency is resolved through build artifacts like header files, libraries, and binaries in the form of Debian packages. For example module A depends on module B. While module A is being built, it uses B's .deb file to install it in the build docker.
The DPKG caching framework provides an infrastructure that caches a module's deb package and restores it back to the build directory if its dependency files are not modified. When a module is compiled for the first time, the generated deb package is stored at the DPKG cache location. On the subsequent build, first, it checks the module dependency file modification. If none of the dependent files is changed, it copies the deb package from the cache location, otherwise, it goes for local compilation and generates the deb package. The modified files should be checked-in to get the newer cache deb package.
This provides a huge improvement in build time and also supports the true incremental build by tracking the dependency files.
- How I did it
It takes two global arguments to enable the DPKG caching, the first one indicates the caching method and the second one describes the location of the cache.
SONIC_DPKG_CACHE_METHOD=cache
SONIC_DPKG_CACHE_SOURCE=
where SONIC_DPKG_CACHE_METHOD - Default method is 'cache' for deb package caching
none: no caching
cache: cache from local directory
Dependency file tracking:
Dependency files are tracked for each target in two levels.
1. Common make infrastructure files - rules/config, rules/functions, slave.mk etc.
2. Per module files - files which are specific to modules, Makefile, debian/rules, patch files, etc.
For example: dependency files for Linux Kernel - src/sonic-linux-kernel,
SPATH := $($(LINUX_HEADERS_COMMON)_SRC_PATH)
DEP_FILES := $(SONIC_COMMON_FILES_LIST) rules/linux-kernel.mk rules/linux-kernel.dep
DEP_FILES += $(SONIC_COMMON_BASE_FILES_LIST)
SMDEP_FILES := $(addprefix $(SPATH)/,$(shell cd $(SPATH) && git ls-files))
DEP_FLAGS := $(SONIC_COMMON_FLAGS_LIST) \
$(KERNEL_PROCURE_METHOD) $(KERNEL_CACHE_PATH)
$(LINUX_HEADERS_COMMON)_CACHE_MODE := GIT_CONTENT_SHA
$(LINUX_HEADERS_COMMON)_DEP_FLAGS := $(DEP_FLAGS)
$(LINUX_HEADERS_COMMON)_DEP_FILES := $(DEP_FILES)
$(LINUX_HEADERS_COMMON)_SMDEP_FILES := $(SMDEP_FILES)
$(LINUX_HEADERS_COMMON)_SMDEP_PATHS := $(SPATH)
Cache file tracking:
The Cache file is a compressed TAR ball of a module's target DEB file and its derived-target DEB files.
The cache filename is formed with the following format
FORMAT:
<module deb filename>.<24 byte of DEP SHA hash >-<24 byte of MOD SHA hash>.tgz
Eg:
linux-headers-4.9.0-9-2-common_4.9.168-1+deb9u3_all.deb-23658712fd21bb776fa16f47-c0b63ef593d4a32643bca228.tgz
< 24-byte DEP SHA value > - the SHA value is derived from all the dependent packages.
< 24-byte MOD SHA value > - the SHA value is derived from either of the following.
GIT_COMMIT_SHA - SHA value of the last git commit ID if it is a submodule
GIT_CONTENT_SHA - SHA value is generated from the content of the target dependency files.
Target Specific rules:
Caching can be enabled/disabled on a global level and also on the per-target level.
$(addprefix $(DEBS_PATH)/, $(SONIC_DPKG_DEBS)) : $(DEBS_PATH)/% : .platform $$(addsuffix -install,$$(addprefix $(DEBS_PATH)/,$$($$*_DEPENDS))) \
$(call dpkg_depend,$(DEBS_PATH)/%.dep )
$(HEADER)
# Load the target deb from DPKG cache
$(call LOAD_CACHE,$*,$@)
# Skip building the target if it is already loaded from cache
if [ -z '$($*_CACHE_LOADED)' ] ; then
.....
# Rules for Generating the target DEB file.
.....
# Save the target deb into DPKG cache
$(call SAVE_CACHE,$*,$@)
fi
$(FOOTER)
The make rule-'$(call dpkg_depend,$(DEBS_PATH)/%.dep )' checks for target dependency file modification. If it is newer than the target, it will go for re-generation of that target.
Two main macros 'LOAD_CACHE' and 'SAVE_CACHE' are used for loading and storing the cache contents.
The 'LOAD_CACHE' macro is used to load the cache file from cache storage and extracts them into the target folder. It is done only if target dependency files are not modified by checking the GIT file status, otherwise, cache loading is skipped and full compilation is performed.
It also updates the target-specific variable to indicate the cache is loaded or not.
The 'SAVE_CACHE' macro generates the compressed tarball of the cache file and saves them into cache storage. Saving into the cache storage is protected with a lock.
- How to verify it
The caching functionality is verified by enabling it in Linux kernel submodule.
It uses the cache directory as 'target/cache' where Linux cache file gets stored on the first-time build and it is picked from the cache location during the subsequent clean build.
- Description for the changelog
The DPKG caching framework provides the infrastructure to save the module-specific deb file to be cached by tracking the module's dependency files.
If the module's dependency files are not changed, it restores the module deb files from the cache storage.
- Description for the changelog
- A picture of a cute animal (not mandatory but encouraged)
DOCUMENT PR:
https://github.com/Azure/SONiC/pull/559
Take advantage of an SDK environment variable to customize the location where sdk_socket exists.
In the latest SDK sdk_socket has been moved from /tmp to /var/run which is a better place to contain this kind of file.
However, this prevents the subdirs under /var/run from being mapped to different volumes. To resolve this, we take advantage of an SDK variable to designate the location of sdk_socket.
This requires every process that requires to access sdk_socket have this environment variable defined. However, to define environment variable for each process is less scalable. We take advantage of the docker scope environment variable to avoid that.
It depends on PR 4227
* [Mellanox]Integrate hw-mgmt 7.0000.3012
* [sonic-linux-kernel]Advance the submodule head
Advance the sonic-linux-kernel
[sFlow]: Patch to fix skb_over_panic in psample driver (#120)
Added support in the kernel for fullcone 3-tuple unique nat. (#100)
Adding support to compile ARM architecture (#102)
[ixgbe] Support bcm54616s external phy in ixgbe (#122)
Fix i2c ISMT DMA buffer alignment issue (#123)
[mellanox]: Add SN4700 patches. (#126)
* Add new CIG device CS6436-54P and CS5435-54P, also update code for CS6436-56P
* security kernel update to 4.9.189 for CIG devices
* security kernel update to 4.9.189 for CIG devices
* Update rules
Update rule file
update FW to xx_2000_3298
update SAI to 1.16.0
update Spectrum-1 and Spectrum-2 buffer pool size according to the new SDK default config change.
modified: ../../device/mellanox/x86_64-mlnx_msn2700-r0/ACS-MSN2700/buffers_defaults_t0.j2
modified: ../../device/mellanox/x86_64-mlnx_msn2700-r0/ACS-MSN2700/buffers_defaults_t1.j2
modified: ../../device/mellanox/x86_64-mlnx_msn3700-r0/ACS-MSN3700/buffers_defaults_t0.j2
modified: ../../device/mellanox/x86_64-mlnx_msn3700-r0/ACS-MSN3700/buffers_defaults_t1.j2
modified: fw.mk
modified: mlnx-sai.mk
modified: mlnx-sai/SAI-Implementation
modified: sdk-src/sx-kernel/Switch-SDK-drivers
modified: sdk.mk
signed-off by kebol@mellanox.com
Due to linux create hwmonN node number, N isn't fixed. So modify to use "" to let thermal.py can correct sysfs. For example, use /sys/bus/i2c/devices/55-0048/hwmon/hwmon/temp1_input to instead of /sys/bus/i2c/devices/55-0048/hwmon/hwmon4/temp1_input
This patch upgrade the kernel from version
4.9.0-9-2 (4.9.168-1+deb9u3) to 4.9.0-11-2 (4.9.189-3+deb9u2)
Co-authored-by: rajendra-dendukuri <47423477+rajendra-dendukuri@users.noreply.github.com>
- Fix for Azure/sonic-buildimage#4095
- Exit status from failed make command(action) didnt reached parent target because the make command is inside the "for" loop.
- Only the exit status of the last command in the last iteration of the for loop is read by parent target.
- This is the reason why dpkg-buildpackage ignored the make error.
- Fixed the issue with help of "set -e".
- Added support for Thermal event in Last Reboot Reason "show reboot-cause" command.
- Added support for sending log message in case of thermal shutdown.
sonic NOTICE root: Shutting down due to over temperature (40 degree, 30 degree, 34 degree)