Optimizing number of calls made to sonic-cfggen during service
start up as it adds to total system boot up time.
***-Test 1***
there is an average saving of 1 to 1.5 sec between old script and new script
```
root@str-s6000-acs-14:/# time /usr/bin/rest-server-old.sh
Generating temporary TLS server certificate ...
2020/07/09 19:03:33 wrote cert.pem
2020/07/09 19:03:33 wrote key.pem
REST_SERVER_ARGS = -ui /rest_ui -logtostderr -cert /tmp/cert.pem -key /tmp/key.pem
/usr/sbin/rest_server -ui /rest_ui -logtostderr -cert /tmp/cert.pem -key /tmp/key.pem
real 0m8.790s
user 0m7.993s
sys 0m0.584s
root@str-s6000-acs-14:/# time /usr/bin/rest-server-new.sh
Generating temporary TLS server certificate ...
2020/07/09 19:03:45 wrote cert.pem
2020/07/09 19:03:45 wrote key.pem
REST_SERVER_ARGS = -ui /rest_ui -logtostderr -cert /tmp/cert.pem -key /tmp/key.pem
/usr/sbin/rest_server -ui /rest_ui -logtostderr -cert /tmp/cert.pem -key /tmp/key.pem
real 0m6.940s
user 0m5.670s
sys 0m0.386s
```
***-Test 2***
Built an image with this change and rest server is running with params as described in test 1 above
```
admin@str-s6000-acs-14:~$ ps -ef | grep rest_server
root 3301 2866 2 02:09 pts/0 00:00:10 /usr/sbin/rest_server -ui /rest_ui -logtostderr -cert /tmp/cert.pem -key /tmp/key.pem
```
signed-off-by: Tamer Ahmed <tamer.ahmed@microsoft.com>
For telemetry regression test we need gnmi client to be present on ptfdocker. Gnmi-server will be present on SONiC DuT. Further, we can access gnmi_get from ptfdocker inside pytest to verify gnmi server streaming data successfully or not.
The template is referenced relative to the script path and this could
results in errors in case script is run from root. Add explicit
path to the template file name.
Also, moving telemetry_var template to template dir.
And remove double quotes from around json dict.
signed-off-by: Tamer Ahmed <tamer.ahmed@microsoft.com>
* [mgmt docker] move pycryptodome installation to the end of the docker building
Signed-off-by: Ying Xie <ying.xie@microsoft.com>
* pin down the version to current: 3.9.8
* comment
Signed-off-by: Akhilesh Samineni <akhilesh.samineni@broadcom.com>
All new NAT conntrack entries are added to kernel with max entry timeout of 432000 and setting the same timeout during system warm reboot also
sonic-cfggen call is slow and this is taking place in the SONiC
boot up process. The change uses templates to assemble all required
vars into single template file. With this change, telemetry now calls
once into sonic-cfggen.
signed-off-by: Tamer Ahmed <tamer.ahmed@microsoft.com>
Resubmitting the changes for (#4825) with fixes for sonic-bgpcdgd test failures
Signed-off-by: Arvindsrinivasan Lakshmi Narasimhan <arlakshm@microsoft.com>
also update submodule
* 01f810f 2020-07-02 | fix compiling issue for gcc8.3 (#1339) [lguohan]
* 9b13120 2020-07-03 | Fix in script to avoid orchagent crash when port down followed by fdb delete (#1340) [rupesh-k]
* 9b01844 2020-07-01 | [qosorch] Update QoS scheduler params for shaping features (#1296) [Michael Li]
* 86b5e99 2020-07-02 | [mirrororch] Port Mirroring implementation (#1314) [rupesh-k]
* c05601c 2020-06-24 | [portsyncd]: add debug message if a port cannot be found in port able (#1328) [lguohan]
* a0b6412 2020-06-23 | COPP_DEL_fix: DEL for one trap group from SONIC is resetting all the trap IDs (#1273) [SinghMinu]
Signed-off-by: Guohan Lu <lguohan@gmail.com>
* Loopback IP changes for multi ASIC devices
multi ASIC will have 2 Loopback Interfaces
- Loopback0 has globally unique IP address, which is advertised by the multi ASIC device to its peers.
This way all the external devices will see this device as a single device.
- Loopback4096 is assigned an IP address which has a scope is within the device. Each ASIC has a different ip address for Loopback4096. This ip address will be used as Router-Id by the bgp instance on multi ASIC devices.
This PR implements this change for multi ASIC devices
Signed-off-by: Arvindsrinivasan Lakshmi Narasimhan <arlakshm@microsoft.com>
* [sonic-buildimage] Changes to make network specific sysctl
common for both host and docker namespace (in multi-npu).
This change is triggered with issue found in multi-npu platforms
where in docker namespace
net.ipv6.conf.all.forwarding was 0 (should be 1) because of
which RS/RA message were triggered and link-local router were learnt.
Beside this there were some other sysctl.net.ipv6* params whose value
in docker namespace is not same as host namespace.
So to make we are always in sync in host and docker namespace
created common file that list all sysctl.net.* params and used
both by host and docker namespace. Any change will get applied
to both namespace.
Signed-off-by: Abhishek Dosi <abdosi@microsoft.com>
* Address Review Comments and made sure to invoke augtool
only one and do string concatenation of all set commands
* Address Review Comments.
* Support for connecting to DB in namespace via IP:port ( using docker bridge network ) for applications in multi-asic platform.
* Added the default IP as 127.0.0.1 if the IPaddress derivation from interface fails.
Moved the localhost loopback IP binding logic into the supervisor.j2 file.
* Changes to make default route programming
correct in multi-asic platform where frr is not running
in host namespace. Change is to set correct administrative distance.
Also make NAMESPACE* enviroment variable available for all dockers
so that it can be used when needed.
Signed-off-by: Abhishek Dosi <abdosi@microsoft.com>
* Fix review comments
* Review comment to check to add default route
only if default route exist and delete is successful.
The program name in critical_processes file must match the program name defined in supervisord.conf file.
Signed-off-by: Yong Zhao <yozhao@microsoft.com>
**- Why I did it**
Initially, the critical_processes file contains either the name of critical process or the name of group.
For example, the critical_processes file in the dhcp_relay container contains a single group name
`isc-dhcp-relay`. When testing the autorestart feature of each container, we need get all the critical
processes and test whether a container can be restarted correctly if one of its critical processes is
killed. However, it will be difficult to differentiate whether the names in the critical_processes file are
the critical processes or group names. At the same time, changing the syntax in this file will separate the individual process from the groups and also makes it clear to the user.
Right now the critical_processes file contains two different kind of entries. One is "program:xxx" which indicates a critical process. Another is "group:xxx" which indicates a group of critical processes
managed by supervisord using the name "xxx". At the same time, I also updated the logic to
parse the file critical_processes in supervisor-proc-event-listener script.
**- How to verify it**
We can first enable the autorestart feature of a specified container for example `dhcp_relay` by running the comman `sudo config container feature autorestart dhcp_relay enabled` on DUT. Then we can select a critical process from the command `docker top dhcp_relay` and use the command `sudo kill -SIGKILL <pid>` to kill that critical process. Final step is to check whether the container is restarted correctly or not.
The current stdout file which also includes the dut logs are very verbose and noisy.
We have manually installed it in the sonic-mgmt docker in our organization and tuned the pytest settings to produce very helpful and concise logs.
pytest-html plugins can be used to post-process the output in various ways based on our different and unique organizational needs.
Hence proposing to add this pkt to the docker file
when portsyncd starts, it first enumerates all front panel ports
and marks them as old interfaces. Then, for new front panel ports
it checks if their indexes exist in previous sets. If yes, it will
treats them as old interfaces and ignore them.
The reason we have this check is because broadcom SAI only removes
front panel ports after sai switch init.
So, if portsyncd starts after orchagent, new interfaces could be
created before portsyncd and treated as old interface.
Signed-off-by: Guohan Lu <lguohan@gmail.com>
FDB/ARP/Default routes files are deleted after swssconfig. This
makes debugging/validation of device conversion hard. This PR
saves those files in order to facilitate debugging of device conversion.
signed-off-by: Tamer Ahmed <tamer.ahmed@microsoft.com>
REST and telemetry servers were using "DEVICE_METADATA|x509" table for
server certificate configurations. This table has been deprecated now.
Enhanced REST server startup script to read server certificate file
path configurations from REST_SERVER table. Three more attributes -
server_crt, server_key and ca_crt are introduced as described in
https://github.com/Azure/SONiC/pull/550.
For backard compatibility, certificate configurations are read from
old "DEVICE_METADATA|x509" table if they (server_crt, server_key and
ca_crt) are not present in REST_SERVER table.
Fixes bug https://github.com/Azure/sonic-buildimage/issues/4291
Signed-off-by: Sachin Holla <sachin.holla@broadcom.com>
The `-sv2` suffix was used to differentiate SNMP Dockers when we transitioned from "SONiCv1" to "SONiCv2", about four years ago. The old Docker materials were removed long ago; there is no need to keep this suffix. Removing it aligns the name with all the other Dockers.
Also edit Monit configuration to detect proper snmp-subagent command line in Buster, and make snmpd command line matching more robust.
The -sv2 suffix was used to differentiate SNMP Dockers when we transitioned from "SONiCv1" to "SONiCv2", about four years ago. The old Docker materials were removed long ago; there is no need to keep this suffix. Removing it aligns the name with all the other Dockers.
**- Why I did it**
We need RIF counters to be enabled by default. Flex Counter does probe for supported counters. If a platform does not support RIF counters, SAI will return NOT_SUPPORTED and Flex Counter will stop polling the counter.
**- How to verify it**
After fresh install rif counter gropup is enabled by default:
$ counterpoll show
Type Interval (in ms) Status
-------------------- ------------------ --------
QUEUE_STAT default (10000) enable
PORT_STAT default (1000) enable
RIF_STAT default (1000) enable
QUEUE_WATERMARK_STAT default (10000) enable
PG_WATERMARK_STAT default (10000) enable
Signed-off-by: Mykola Faryma <mykolaf@mellanox.com>