Why I did it
Change the path of sonic submodules that point to "Azure" to point to "sonic-net"
How I did it
Replace "Azure" with "sonic-net" on all relevant paths of sonic submodules
Why/How I did:
Make sure first error syslog is triggered based on FAULT TOLERANCE condition.
Added support of repeat clause with alert action. This is used as trigger
for generation of periodic syslog error messages if error is persistent
Updated the monit conf files with repeat every x cycles for the alert action
Signed-off-by: Guohan Lu <lguohan@gmail.com>
We want to let Monit to unmonitor the processes in containers which are disabled in `FEATURE` table such that
Monit will not generate false alerting messages into the syslog.
Signed-off-by: Yong Zhao <yozhao@microsoft.com>
* [redis] Use redis-server and redis-tools in blob storage to prevent
upstream link broken
* Use curl instead of wget
* Explicitly install dependencies
The first partition starting point was changed to be 1M as part of this
commit: 6ba2f97f1e. On systems that are misaligned before conversion
(partition start is the first sector), the relica partition that is
left in the first MB can cause problems in Aboot and result in corruption
of the filesystem on the new aligned partition.
Zeroing this old relica makes sure that there is nothing left of the old
partition lying around. There won't be any risk of having Aboot corrupt
the new filesystem because of the old relica.
Signed-off-by: Baptiste Covolato <baptiste@arista.com>
Fix for the host unmount issue through PR https://github.com/Azure/sonic-buildimage/pull/4558 and https://github.com/Azure/sonic-buildimage/pull/4865 creates the timeout of syslog.socket closure during reboot since the journald socket closure has been included in syslog.socket
Removed the journal socket closure. The host unmount is fixed with just stopping the services which gets restarted only after /var/log unmount and not causing the unmount issues.
* Support for connecting to DB in namespace via IP:port ( using docker bridge network ) for applications in multi-asic platform.
* Added the default IP as 127.0.0.1 if the IPaddress derivation from interface fails.
Moved the localhost loopback IP binding logic into the supervisor.j2 file.
1. Upgrade SAI headers to v1.6.3
2. Fix traffic lost during FFB related to buffer config + optimize buffer config timing for FB
3. Add ACL fields BTH, IP flags
4. Add ACL infrastructure of different fields per ASIC type
Add changes for syslog support for containers running in namespaces on multi ASIC platforms.
On Multi ASIC platforms
Rsyslog service is only running on the host. There is no rsyslog service running in each namespace.
On multi ASIC platforms the rsyslog service on the host will be listening on the docker0 ip address instead of loopback address.
The rsyslog.conf on the containers is modified to have omfwd target ip to be docker0 ipaddress instead of loopback ip
Signed-off-by: Arvindsrinivasan Lakshmi Narasimhan <arlakshm@microsoft.com>
Since we can not refer a dir in sonic-buildimage while jenkins testing of sonic-utilities.
We need to create build dependency on sonic_yang_models PKG too.
Signed-off-by: Praveen Chaudhary pchaudhary@linkedin.com
* Changes to make default route programming
correct in multi-asic platform where frr is not running
in host namespace. Change is to set correct administrative distance.
Also make NAMESPACE* enviroment variable available for all dockers
so that it can be used when needed.
Signed-off-by: Abhishek Dosi <abdosi@microsoft.com>
* Fix review comments
* Review comment to check to add default route
only if default route exist and delete is successful.
* [systemd-generator]: Fix the code to make sure that dependencies
of host services are generated correctly for multi-asic platforms.
Add code to make sure that systemd timer files are also modified
to add the correct service dependency for multi-asic platforms.
Signed-off-by: SuvarnaMeenakshi <sumeenak@microsoft.com>
* [systemd-generator]: Minor fix, remove debug code and
remove unused variable.
**- Why I did it**
System health feature requires to read ASIC temperature and threshold from platform API
**- How I did it**
Implement Chassis.get_asic_temperature and Chassis.get_asic_temperature_threshold by getting value from system fs.
This patch addresses the following issues:
1) Platform drivers were not loading in the latest images. Fixed
the intialization script to make sure that all the drivers are
loaded.
2) Getting rid of "pstore: crypto_comp_decompress failed, ret = -22!"
messages during the kernel boot, after moving to 4.19 kernel. The
solution is to remove the files under '/sys/fs/pstore' directory.
Signed-off-by: Ciju Rajan K <crajank@juniper.net>
The program name in critical_processes file must match the program name defined in supervisord.conf file.
Signed-off-by: Yong Zhao <yozhao@microsoft.com>
* src/sonic-platform-common 75698a8...82bbeab (9):
> [sfputil] Make SfpUtilHelper.get_physical_to_logical noexcept as in SfpUtilBase (#96)
> [sfp_base] Update return value documentation of channel-specific methods (#98)
> [sfp] Tweak key names of some transceiver info fields (#97)
> fix typo: portconfig.ini to port_config.ini (#94)
> [chassis_base] Add platform API support for system LED (#91)
> Add PCIe check commad (#64)
> [sfputilbase.py] Don't try to print EEPROM sysfs file name if we failed to read from it (#81)
> [sfputilbase | sfputilhelper] Add support of platform.json (#72)
> [eeprom] Add try-except to catch the IOError (#85)
* src/sonic-platform-daemons 0f4fd83...abe115e (2):
> [xcvrd] Tweak some transceiver info key names (#62)
> [psud][thermalctld] Always get fan/PSU LED status from platform API to avoid status inconsistencies (#59)
* src/sonic-utilities fd7781b...16a33f2 (9):
> [config] Fix syntax error (#966)
> [config] Fix indentation level in _get_disabled_services_list() (#965)
> a4e64d1 [sonic_installer] Refactor sonic_installer code (#953)
> 90efd62 [Show | Command Reference] Add Port breakout Show Command (#859)
> [sfpshow][mock_state_db] Tweak key names of some transceiver info fields (#958)
> [show] Add missing verbose option to "show line" (#961)
> [filter-fdb] Check VLAN Presence When Filter FDB (#957)
> [master]fix #4716 show ipv6 interfaces neighbor_ip is N/A issue (#948)
> Fix for command. show interface transceiver eeprom -d Ethernet (#955)
Note: sonic-utilities update fixes#4716
Issue: Port with AOC cable does not come up when "sfputil reset <port_name>" is executed.
Modified the incorrect mask used in reset API to resolve the issue.
* c60b1f4 2020-06-26 | e1000: Do not perform reset in reset_task if we are already down (#148) (HEAD -> master, origin/master, origin/HEAD) [lguohan]
* c6aeedd 2020-06-25 | Updated NAT kernel patch for 4.19 buster (#147) [Akhilesh Samineni]
Signed-off-by: Guohan Lu <lguohan@gmail.com>
**- Why I did it**
Initially, the critical_processes file contains either the name of critical process or the name of group.
For example, the critical_processes file in the dhcp_relay container contains a single group name
`isc-dhcp-relay`. When testing the autorestart feature of each container, we need get all the critical
processes and test whether a container can be restarted correctly if one of its critical processes is
killed. However, it will be difficult to differentiate whether the names in the critical_processes file are
the critical processes or group names. At the same time, changing the syntax in this file will separate the individual process from the groups and also makes it clear to the user.
Right now the critical_processes file contains two different kind of entries. One is "program:xxx" which indicates a critical process. Another is "group:xxx" which indicates a group of critical processes
managed by supervisord using the name "xxx". At the same time, I also updated the logic to
parse the file critical_processes in supervisor-proc-event-listener script.
**- How to verify it**
We can first enable the autorestart feature of a specified container for example `dhcp_relay` by running the comman `sudo config container feature autorestart dhcp_relay enabled` on DUT. Then we can select a critical process from the command `docker top dhcp_relay` and use the command `sudo kill -SIGKILL <pid>` to kill that critical process. Final step is to check whether the container is restarted correctly or not.
The current stdout file which also includes the dut logs are very verbose and noisy.
We have manually installed it in the sonic-mgmt docker in our organization and tuned the pytest settings to produce very helpful and concise logs.
pytest-html plugins can be used to post-process the output in various ways based on our different and unique organizational needs.
Hence proposing to add this pkt to the docker file
**- Why I did it**
After discussed with Joe, we use the string "/usr/bin/syncd\s" in Monit configuration file to monitor
syncd process on Broadcom and Mellanox. Due to my careless, I did not find this bug during the
previous testing. If we use the string "/usr/bin/syncd" in Monit configuration file to monitor the
syncd process, Monit will not detect whether syncd process is running or not.
If we ran the command `sudo monit procmactch “/usr/bin/syncd”` on Broadcom, there will be three
processes in syncd container which matched this "/usr/bin/syncd": `/bin/bash /usr/bin/syncd.sh
wait`, `/usr/bin/dsserve /usr/bin/syncd –diag -u -p /etc/sai.d/sai.profile` and `/usr/bin/syncd –diag -
u -p /etc/sai.d/said.profile`. Monit will select the processes with the highest uptime (at there
`/bin/bash /usr/bin/syncd.sh wait`) to match and did not select `/usr/bin/syncd –diag -u -p
/etc/sai.d/said.profile` to match.
Similarly, On Mellanox Monit will also select the process with the highest uptime (at there
`/bin/bash /usr/bin/syncd.sh wait`) to match and did not select `/usr/bin/syncd –diag -u -p
/etc/sai.d/said.profile` to match.
That is why Monit is unable to detect whether syncd process is running or not if we use the string “/usr/bin/syncd” in Monit configuration file. If we use the string "/usr/bin/syncd\s" in Monit configuration file, Monit can filter out the process `/bin/bash /usr/bin/syncd.sh wait` and thus can correctly monitor the syncd process.
**- How I did it**
**- How to verify it**
Signed-off-by: Yong Zhao <yozhao@microsoft.com>
sonic-utils has sonic-yang-mgmt as build time deps, which inturn installs libyang.
libyang is needed to run newly added test.
If sonic-yang-mgmt is already built then libyang will not be installed in slave docker
without this PR and test will not run.
Signed-off-by: Praveen Chaudhary pchaudhary@linkedin.com
Add build flag TELEMETRY_WRITABLE. When set to "y" it will add a go build flag in the telemetry build that will enable telemetry write mode to allow configuration via gNMI Set RPC as well as operations via the gNOI RPC's. The default for TELEMETRY_WRITABLE is unset in which case telemetry is read-only. In read-only mode the Set RPC and all gNOI RPC's are disabled and will return an "Unsupported" error when called.
authored-by: Eric Seifert <eric@seifert.casa>
when portsyncd starts, it first enumerates all front panel ports
and marks them as old interfaces. Then, for new front panel ports
it checks if their indexes exist in previous sets. If yes, it will
treats them as old interfaces and ignore them.
The reason we have this check is because broadcom SAI only removes
front panel ports after sai switch init.
So, if portsyncd starts after orchagent, new interfaces could be
created before portsyncd and treated as old interface.
Signed-off-by: Guohan Lu <lguohan@gmail.com>
When building the SONiC image, used systemd to mask all services which are set to "disabled" in init_cfg.json.
This PR depends on https://github.com/Azure/sonic-utilities/pull/944, otherwise `config load_minigraph will fail when trying to restart disabled services.
* Change port index in port_config.ini to 1-based
* Add default port index to port_config.ini, change platform plugins to accept 1-based port index
* fix port index in sfp_event.py
To enable tagged vlan support by minigraph parser. This enables us to generate a config_db file that will enable SONiC device to operate using tagged and untagged vlan.
- Why I did it
New repo sonic-mgmt-common is introduced for the common translib related code. This commit adds build rules for this new repo.
- How I did it
Added sonic-mgmt-common submodule
Added build rules for the new sonic-mgmt-common repo. It creates two deb packages -- sonic-mgmt-common_1.0.0_{arch}.deb and sonic-mgmt-common-codegen_1.0.0_{arch}.deb. Package cache is enabled.
Added dependency on sonic-mgmt-common for mgmt-framework and telemetry debs and dockers.
- How to verify it
Full build and incremental builds
Basic ACL and interface opreations through REST, KLISH CLI and gNMI
- Description for the changelog
Git submodule and build rules for the new sonic-mgmt-common repo.