Commit Graph

985 Commits

Author SHA1 Message Date
Ze Gan
87036c34ec
[macsec]: Upgrade docker-macsec to bullseye (#10574)
Following the patch from : https://packages.debian.org/bullseye/wpasupplicant, to upgrade sonic-wpa-supplicant for supporting bullseye and upgrade docker-macsec.mk as a bullseye component.
2022-04-17 20:32:51 +08:00
Zhaohui Sun
44cf773a96
Revert "[docker-ptf]: Upgrade scapy to 2.4.5 in docker-ptf (#10507)" (#10537)
It upgraded scapy to 2.4.5 in docker-ptf container, after this upgrade, all scripts under ansible/roles/test/files/ptftests will import scapy 2.4.5, some test cases will fail because they are not upgraded accordingly.

Reverts #10507 to avoid breaking regression test.

This reverts commit 92efc01270.
2022-04-13 08:57:10 +08:00
kellyyeh
396a92cb2e
[dhcp_relay] Remove dhcp6mon (#10467) 2022-04-12 10:44:17 -07:00
Ashwin Srinivasan
b4f8f1dd22
Removed python2 dependency for sonic-pcied in sonic-platform-daemons (#10421)
Removed python2 support for sonic-platform-daemons that was causing unit
test errors in sonic_pcied.
* Removed config from docker supervisord jinja templates per VD review comment
* Removed space and python3 per QL comments
2022-04-09 13:16:50 -07:00
Ze Gan
92efc01270
[docker-ptf]: Upgrade scapy to 2.4.5 in docker-ptf (#10507)
Why I did it
Existing dataplane tests cannot be tested under MACsec environment due to the traffic under MACsec link is encrypted. So, I will override the dp_poll of ptf to MACsec dp_poll to decrypt the MACsec packets on injected ports (PR: Azure/sonic-mgmt#5490). MACsec decryption library depends on scapy 2.4.5.

How I did it
Upgrade scapy library to 2.4.5 by pip.

How to verify it
Check the scapy version in docker-ptf by

python -c "import scapy; print(scapy.__version__)"
2.4.5

Signed-off-by: Ze Gan <ganze718@gmail.com>
2022-04-08 22:26:40 -07:00
kellyyeh
330d11a128
Add EPMS and MgmtTsToR (#10478) 2022-04-07 21:49:42 -07:00
Stepan Blyshchak
4426f7715f
[scapy] update scapy to 2.4.5 and patch it (#10457)
Why I did it
Running warm-reboot in a loop for 500 times leads to this error on 318-th iteration:

Apr  2 15:56:27.346747 sonic INFO swss#/supervisord: restore_neighbors Traceback (most recent call last):
Apr  2 15:56:27.346747 sonic INFO swss#/supervisord: restore_neighbors   File "/usr/bin/restore_neighbors.py", line 24, in <module>
Apr  2 15:56:27.346747 sonic INFO swss#/supervisord: restore_neighbors     from scapy.all import conf, in6_getnsma, inet_pton, inet_ntop, in6_getnsmac, get_if_hwaddr, Ether, ARP, IPv6, ICMPv6ND_NS, ICMPv6NDOptSrcLLAddr
Apr  2 15:56:27.346795 sonic INFO swss#/supervisord: restore_neighbors   File "/usr/local/lib/python3.7/dist-packages/scapy/all.py", line 25, in <module>
Apr  2 15:56:27.346956 sonic INFO swss#/supervisord: restore_neighbors     from scapy.route import *
Apr  2 15:56:27.346995 sonic INFO swss#/supervisord: restore_neighbors   File "/usr/local/lib/python3.7/dist-packages/scapy/route.py", line 205, in <module>
Apr  2 15:56:27.347089 sonic INFO swss#/supervisord: restore_neighbors     conf.iface = get_working_if()
Apr  2 15:56:27.347129 sonic INFO swss#/supervisord: restore_neighbors   File "/usr/local/lib/python3.7/dist-packages/scapy/arch/linux.py", line 128, in get_working_if
Apr  2 15:56:27.347213 sonic INFO swss#/supervisord: restore_neighbors     ifflags = struct.unpack("16xH14x", get_if(i, SIOCGIFFLAGS))[0]
Apr  2 15:56:27.347250 sonic INFO swss#/supervisord: restore_neighbors   File "/usr/local/lib/python3.7/dist-packages/scapy/arch/common.py", line 31, in get_if
Apr  2 15:56:27.347345 sonic INFO swss#/supervisord: restore_neighbors     return ioctl(sck, cmd, struct.pack("16s16x", iff.encode("utf8")))
Apr  2 15:56:27.347365 sonic INFO swss#/supervisord: restore_neighbors OSError: [Errno 19] No such device
The issue was reported to scapy devs secdev/scapy#3369, the fix is secdev/scapy#3371, however there is no released scapy version with this fix right now, thus decided to build scapy v2.4.5 from sources and apply the fix in a form of a patch.

Signed-off-by: Stepan Blyschak <stepanb@nvidia.com>
2022-04-07 14:23:35 +03:00
kellyyeh
8cd346d80b
Update docker-router-advertiser.supervisord.conf.j2 (#10375) 2022-04-06 09:44:21 -07:00
Saikrishna Arcot
588ed0b760
Upgrade router-advertiser container to Bullseye (#10374)
Change the base image from `docker-config-engine-buster` to
`docker-config-engine-bullseye`, and remove the hardcoded
`radvd` version from the Dockerfile.

Signed-off-by: Saikrishna Arcot <sarcot@microsoft.com>
2022-04-01 16:12:43 -07:00
Junchao-Mellanox
106fac5f09
[counter] Fix issue: non default counters will be delayed forever after fastboot (#10413)
- Why I did it
Fastboot will delay all counters in CONFIG DB, it relies on enable_counters.py to recover the delayed counters. However, enable_counters.py does not recover those non-default counters.

- How I did it
For non-default counters, if it is in CONFIG DB, put delay status to false after the waiting.

- How to verify it
Manual test
2022-03-31 15:23:57 +03:00
Lawrence Lee
b31df59c7c
[tun_pkt]: Wait for AsyncSniffer to init fully (#10346)
Fix for Tunnel packet handler can crash at system startup 
Signed-off-by: Lawrence Lee <lawlee@microsoft.com>
2022-03-30 14:03:29 -07:00
judyjoseph
8e642848c2
Introduce the asic_subtype field for adding the sub platform variants. (#10235)
* Introduce the asic_subtype field for adding the sub platform variants. 
   It uses the value of TARGET_MACHINE variable in slave.mk.
2022-03-28 11:22:32 -07:00
tomer-israel
cc938e73a3
Dynamic port configuration - solve lldp issues when adding/removing ports (#9386)
#### Why I did it
when adding and removing ports after init stage we saw two issues:

first:
In several cases, after removing a port, lldpmgr is continuing to try to add a port to lldp with lldpcli command. the execution of this command is continuing to fail since the port is not existing anymore.

second:
after adding a port, we sometimes see this warning messgae:
"Command failed 'lldpcli configure ports Ethernet18 lldp portidsubtype local etp5b': 2021-07-27T14:16:54 [WARN/lldpctl] cannot find port Ethernet18"

we added these changes in order to solve it.
#### How I did it
port create events are taken from app db only.
lldpcli command is executed only when linux port is up.

when delete port event is received we remove this command from  pending_cmds dictionary

#### How to verify it
manual tests and running lldp tests


#### Description for the changelog
Dynamic port configuration - solve lldp issues when adding/removing ports
2022-03-25 17:47:24 -07:00
Robert J. Halstead
147d631065
[PINS] update sonic-p4rt docker to bullseye (#10182)
#### Why I did it
SONiC is migrating to bullseye. This will update the sonic-pins container to bullseye.

#### How I did it
The [sonic-pins code](https://github.com/Azure/sonic-buildimage/blob/master/rules/p4rt.mk) isn't dependent on any architecture so it will already build successfully for bullseye. This PR updates the docker to use bullseye.

#### How to verify it
Today we cannot build the docker-sonic-p4rt.gz target (e.g. Issue #9885). With this change the docker will build successfully. The P4RT executable will not run, because of a missing runtime library, libgmpxx, which I'll address in a followup PR.

#### Description for the changelog
Update docker-sonic-p4rt.gz target to build with Bullseye instead of Buster.
2022-03-23 17:21:36 -07:00
Saikrishna Arcot
4a5e75e45e
[restapi]: Don't use python/python2 for restapi start scripts (#10285)
Python 2 isn't installed by default in Buster and Bullseye containers,
and the scripts/modules can be used with Python 3, so make sure Python 3
is used.

Why I did it
After the Buster and Bullseye upgrade for the restapi container, processes will no longer start because supervisord is trying to call python and python2, both of which are unavailable.

Signed-off-by: Saikrishna Arcot <sarcot@microsoft.com>
2022-03-22 18:34:42 -07:00
Oleksandr Kozodoi
c5849c9650
Add scapy support for python3 virtual environment in the sonic-mgmt docker container (#10234)
Why I did it
Migration of sonic-mgmt codebase from Python 2 to Python 3

How I did it
Added scapy dependencies to the env-python3 virtual environment.

How to verify it
Run test case:
py.test --testbed=testbed-t0 --inventory=../ansible/lab --testbed_file=../ansible/testbed.csv --host-pattern=testbed-t0 -- module-path=../ansible/library lldp

Signed-off-by: Oleksandr Kozodoi <oleksandrx.kozodoi@intel.com>
2022-03-16 12:00:51 +08:00
Saikrishna Arcot
5617b1ae3e
Image disk space reduction (#10172)
# Why I did it

Reduce the disk space taken up during bootup and runtime.

# How I did it

1. Remove python package cache from the base image and from the containers.
2. During bootup, if logs are to be stored in memory, then don't create the `var-log.ext4` file just to delete it later during bootup.
3. For the partition containing `/host`, don't reserve any blocks for just the root user. This just makes sure all disk space is available for all users, if needed during upgrades (for example).


* Remove pip2 and pip3 caches from some containers

Only containers which appeared to have a significant pip cache size are
included here.

Signed-off-by: Saikrishna Arcot <sarcot@microsoft.com>

* Don't create var-log.ext4 if we're storing logs in memory

Signed-off-by: Saikrishna Arcot <sarcot@microsoft.com>

* Run tune2fs on the device containing /host to not reserve any blocks for just the root user

Signed-off-by: Saikrishna Arcot <sarcot@microsoft.com>
2022-03-15 18:12:49 -07:00
Shilong Liu
3fa627f290
Add a config variable to override default container registry instead of dockerhub. (#10166)
* Add variable to reset default docker registry
* fix bug in docker version control
2022-03-14 18:09:20 +08:00
xwjiang2021
b73da484c4
Install the allure-pytest package globally in sonic-mgmt docker (#10216)
Why I did it
This fix is to address issue: Azure/sonic-mgmt#5280

In the sonic-mgmt Dockerfile, python package allure-pytest is installed after ENV USER $user.
Consequently the package is installed to path /home/$user/.local and is only available to the $user
account. If we use root account in sonic-mgmt docker container to run tests, any script importing
the allure package will fail with ImportError. We need to install the allure-pytest package to global
directory instead of user local directory.

How I did it
Update the sonic-mgmt Dockerfile to ensure that the allure-pytest package is installed to global directory

How to verify it
Build a new sonic-mgmt docker image based on the changes.
Use sonic-mgmt docker container of the newly built image to run test scripts that depend on the
allure-pytest package. No ImportError is raised.
2022-03-12 20:18:12 +08:00
Oleksandr Kozodoi
3fa18d18d4
Add necessary changes for python3 virtual environment of sonic-mgmt docker container (#9277)
This PR includes necessary changes for the setup of the Python3 virtual environment in the sonic-mgmt docker container.

How to activate Python3 virtual environment?
Connect to the sonic-mgmt container
$ docker exec -ti sonic-mgmt bash
Activate the virtual environment
$ source /var/user/env-python3/bin/activate

Why I did it
Migration of sonic-mgmt codebase from Python 2 to Python 3

How I did it
Added all necessary dependencies to the env-python3 virtual environment.

Signed-off-by: Oleksandr Kozodoi <oleksandrx.kozodoi@intel.com>
2022-03-09 12:28:01 +08:00
Alexander Allen
d0ff8b5f48
[pmon] Clean up supervisord chassis_db_init entry and fix startsecs (#10071)
Why I did it
Code review was still in progress when #9858 was merged and upon further testing I have arrived at a better solution.

How I did it
Modified supervisord configuration j2 template for pmon to require no minimum uptime for chassisd_db_init and to remove the redundant exit_codes directive

How to verify it
Boot switch and verify in syslog that there are no errors related to chassis_db_init
2022-03-03 17:10:15 -08:00
Lawrence Lee
4d2a55d373
[swss]: Wait for vlan intf to start ndppd (#10119)
- Use the `wait_for_link.sh` script to delay ndppd start until after the VLAN interface is ready
- Avoids issue where ndppd tries to change interface attributes before the interface is ready
2022-03-02 16:23:56 -08:00
Lawrence Lee
47d9b26063
Revert "[swss]: Wait for vlan intf to start ndppd (#10036)" (#10085)
This reverts commit 91204879df.

#10036 breaks ndppd functionality
2022-02-28 15:42:02 -08:00
Yang Wang
b8fa5e0d8d
install xmlrunner python3 version (#10086) 2022-02-28 11:21:04 +08:00
Lawrence Lee
91204879df
[swss]: Wait for vlan intf to start ndppd (#10036)
- Use the `wait_for_link.sh` script to delay ndppd start until after the VLAN interface is ready
- Avoids issue where ndppd tries to change interface attributes before the interface is ready

Signed-off-by: Lawrence Lee <lawlee@microsoft.com>
2022-02-24 17:54:45 -08:00
xumia
b101b023d3
[Security]: Upgrade urllib3 to fix CVE-2021-33503
See https://security.archlinux.org/CVE-2021-33503
2022-02-25 08:59:57 +08:00
arlakshm
fd22635de0
[chassis][bgp] create v4 and v6 peer group for VoQ internal neighbors (#9693)
Why I did it
In the recent minigraph changes we add separate BGP session configuration for V4 and V6 internal VoQ neighbors.
This PR is adding different Peer groups for V4 and V6 neighbors

How I did it
Add VOQ_CHASSIS_V4_PEER and VOQ_CHASSIS_V6_PEER groups
Add extra Unit tests

How to verify it

Signed-off-by: Arvindsrinivasan Lakshmi Narasimhan <arlakshm@microsoft.com>
2022-02-24 11:21:26 -08:00
Richard.Yu
2210c82ef8
[PTF-SAIv2]Add ptf docker for sai-ptf (saiv2) (#9729)
* [PTF-SAIv2]Add ptf dockre for sai-ptf (saiv2)

Base on current ptf docker create a new docker for sai-ptf(saiv2)
upgrade related package
use the latest ptf and install it

test done:
NOJESSIE=1 NOSTRETCH=1 NOBULLSEYE=1 ENABLE_SYNCD_RPC=y make target/docker-ptf-sai.gz
BLDENV=buster make -f Makefile.work target/docker-ptf-sai.gz

* upgrade the thrift to 014
2022-02-18 01:48:50 -08:00
kellyyeh
f136c53d19
[radv] Support multiple ipv6 prefixes per vlan interface (#9934)
Why I did it
Radvd.conf.j2 template creates two copies of the vlan interface when there are more than one ipv6 address assigned to a single vlan interface. Changed the format to add prefixes under the same vlan interface block.

How I did it
Modifies radvd.conf.j2 and added unit tests

How to verify it
Configure multiple ipv6 address to the same vlan, start radvd
Unit test will check if radvd.conf with multiple ipv6 addresses is formed correctly
2022-02-16 14:17:26 -08:00
Jason Lyu
b023c29a1e
[redis] Upgrade redis version (#9757)
#### Why I did it

The current redis version of SONiC is `6.0.6`, which contains many high-risky security issues like CVEs that are fixed in the latest version. The Redis release notes also highly recommend to upgrade with SECURITY urgency.

```
================================================================================
Redis 6.0.16 Released Mon Oct 4 12:00:00 IDT 2021
================================================================================

Upgrade urgency: SECURITY, contains fixes to security issues.

Security Fixes:
* (CVE-2021-41099) Integer to heap buffer overflow handling certain string
  commands and network payloads, when proto-max-bulk-len is manually configured
  to a non-default, very large value [reported by yiyuaner].
* (CVE-2021-32762) Integer to heap buffer overflow issue in redis-cli and
  redis-sentinel parsing large multi-bulk replies on some older and less common
  platforms [reported by Microsoft Vulnerability Research].
* (CVE-2021-32687) Integer to heap buffer overflow with intsets, when
  set-max-intset-entries is manually configured to a non-default, very large
  value [reported by Pawel Wieczorkiewicz, AWS].
* (CVE-2021-32675) Denial Of Service when processing RESP request payloads with
  a large number of elements on many connections.
* (CVE-2021-32672) Random heap reading issue with Lua Debugger [reported by
  Meir Shpilraien].
* (CVE-2021-32628) Integer to heap buffer overflow handling ziplist-encoded
  data types, when configuring a large, non-default value for
  hash-max-ziplist-entries, hash-max-ziplist-value, zset-max-ziplist-entries
  or zset-max-ziplist-value [reported by sundb].
* (CVE-2021-32627) Integer to heap buffer overflow issue with streams, when
  configuring a non-default, large value for proto-max-bulk-len and
  client-query-buffer-limit [reported by sundb].
* (CVE-2021-32626) Specially crafted Lua scripts may result with Heap buffer
  overflow [reported by Meir Shpilraien].

Other bug fixes:
* Fix appendfsync to always guarantee fsync before reply, on MacOS and FreeBSD (kqueue) (#9416)
* Fix the wrong mis-detection of sync_file_range system call, affecting performance (#9371)
* Fix replication issues when repl-diskless-load is used (#9280)
```

#### How I did it

Edit `Dockerfile.j2` file

#### How to verify it

Check redis version

#### Description for the changelog
This PR will upgrade redis-server version to `6.0.16`.
2022-02-15 16:43:01 -08:00
Alexander Allen
9677401f4a
[pmon] Fix chassis_db_init exit not being expected (#9858)
- Why I did it
Error log was shown on switches during boot
pmon#supervisord 2021-12-22 04:27:16,709 INFO exited: chassis_db_init (exit status 0; not expected)

- How I did it
Add exit code zero as an expected exit code and also disable autorestart.

- How to verify it
Boot the switch and ensure the above log line does not appear.
2022-02-15 10:51:24 +02:00
Travis Van Duyn
62934ad4c4
updated jinja template for snmp contact python2 vs python3 issue (#9949) 2022-02-10 09:01:46 -08:00
Oleksandr Ivantsiv
25a0ce5eb1
[asan] Add address sanitizer support. (#9857)
Implement infrastructure that allows enabling address sanitizer
for docker containers. Enable address sanitizer for SWSS container.

- Why I did it
To add a possibility to compile SONiC applications with address sanitizer (ASAN).
ASAN is a memory error detector for C/C++. It finds:
1. Use after free (dangling pointer dereference)
2. Heap buffer overflow
3. Stack buffer overflow
4. Global buffer overflow
5. Use after return
6. Use after the scope
7. Initialization order bugs
8. Memory leaks

- How I did it
By adding new ENABLE_ASAN configuration option.

- How to verify it
By default ASAN is disabled and the SONiC image is not affected.
When ASAN is enabled it inspects all allocation, deallocation, and memory usage that the application does in run time. To verify whether the application has memory errors tests that trigger memory usage of the application should be run. Ideally, the whole regression tests should be run. Memory leaks reports will be placed in /var/log/asan/ directory of SONiC host OS.

Signed-off-by: Oleksandr Ivantsiv <oivantsiv@nvidia.com>
2022-02-09 13:29:18 +02:00
abdosi
e44a40cc3b
Updated Internal BGP Templates for chassis packet (#9674)
Fixes: https://github.com/Azure/sonic-buildimage/issues/9610
2022-02-08 09:36:32 -08:00
Lawrence Lee
eff80f750f
[swss]: Reduce tunnel_packet_handler memory usage (#9762)
* Configure scapy to not store sniffed packets

Signed-off-by: Lawrence Lee <lawlee@microsoft.com>
2022-02-07 11:55:48 -08:00
Christian Svensson
660c0cbe7b
docker-dhcp-relay: Fix test call to MockConfigDb (#9903)
*docker-dhcp-relay: Fix test call to MockConfigDb

Signed-off-by: Christian Svensson <blue@cmd.nu>
2022-02-01 18:52:52 -08:00
Andriy Yurkiv
cb3b9416a6
[Mellanox][VXLAN] add params to vxlan.json file in order to configure VXLAN src port range feature (#9658)
- Why I did it
Remove obsolete parameter that enables static VXLAN src port range
provide functionality no generate json config file according to appropriate parameter in config_db
Done for
SN3800:
• Mellanox-SN3800-D28C50
• Mellanox-SN3800-C64
• Mellanox-SN3800-D28C49S1 (New 10G SKU)

SN2700:
• Mellanox-SN2700-D48C8

- How I did it
Remove SAI_VXLAN_SRCPORT_RANGE_ENABLE=1 from appropriate sai.profile files
Created vxlan.json file and added few params that depends on DEVICE_METADATA.localhost.vxlan_port_range

- How to verify it
File /etc/swss/config.d/vxlan.json should be generated inside swss docker when it restart
[
    {
        "SWITCH_TABLE:switch": {
            "vxlan_src": "0xFF00",
            "vxlan_mask": "8"
        },
        "OP": "SET"
    }
]
Signed-off-by: Andriy Yurkiv <ayurkiv@nvidia.com>
2022-01-31 15:57:30 +02:00
vdahiya12
61e9a7683c
[y_cable] Support for initialization of new daemon ycable to support ycables (#9125)
* [y_cable] Support for initialization of new Daemon ycable to support
ycables
This PR also adds the commit in sonic-platform-daemons

94fa239 [y_cable] refactor y_cable to a seperate logic and new daemon from xcvrd (#219)

Why I did it
This PR separates the logic of Y-Cable from xcvrd. Before this change we were utilizing xcvrd daemon to control all aspects of Y-Cable right from initialization to processing requests from other entities like orch,linkmgr.
Now we would have another daemon ycabled which will serve this purpose.
Logically everything still remains the same from the perspective of other daemons.
it also take care aspects like init/delete daemon from Y-Cable perspective.

How I did it
To serve the purpose we build a new wheel sonic_ycabled-1.0-py3-none-any.whl and install it inside pmon.
We also initalize the daemon ycabled which serves our purpose for refactor inside pmon

How to verify it
Ran the changes with an image for dualtor tests on a 7050cx3 platform

Signed-off-by: vaibhav-dahiya <vdahiya@microsoft.com>
2022-01-25 11:10:25 -08:00
Longxiang Lyu
49a036e90c
Add dualtor TSA/B/C support (#9726)
Why I did it
Add TSA/B/C dualtor support

Signed-off-by: Longxiang Lyu lolv@microsoft.com

How I did it
For TSA, toggle all the mux to standby if the device type is dualtor and there are active mux ports.
For TSC, add mux status output.

How to verify it
Run TSA/B/C on a dualtor setup
2022-01-25 10:50:29 +08:00
DINESH KUMAR SELLAPPAN
d9b1577675
Support for Statistics Python Module in sonic-mgmt docker image (#9682)
This PR includes the support for statistics module in sonic-mgmt docker image
2022-01-25 10:32:22 +08:00
xumia
7a226ffd0d
Support bullseye for docker-sonic-restapi docker-sonic-telemetry (#9791)
Support bullseye for docker-sonic-restapi docker-sonic-telemetry
Upgrade to bullseye and Golang-1.15 to support FIPS.
2022-01-21 08:41:39 +08:00
kellyyeh
3e263fa6a8
[dhcp_relay] Remove dhcpv6 servers from VlanBrief (#9718) 2022-01-19 07:47:08 -08:00
Saikrishna Arcot
bb3362760d
[docker-dhcprelay]: Update to Bullseye (#9736)
As part of this, update the isc-dhcp package to match the Bullseye
version (this fixes some compile errors related to BIND), clean up some
of the build dependencies and runtime dependencies for debian packaging,
and use the default Boost version to compile against instead of
explicitly saying using 1.74.

Signed-off-by: Saikrishna Arcot <sarcot@microsoft.com>
2022-01-18 15:11:36 -08:00
shlomibitton
eaa888d948
Fix import error for DHCP relay CLI (#9691)
Signed-off-by: Shlomi Bitton <shlomibi@nvidia.com>
2022-01-16 08:08:01 +02:00
SuvarnaMeenakshi
945278f6d8
[docker-snmp]: Modify log level of snmpd (#9734)
#### Why I did it
resolves https://github.com/Azure/sonic-buildimage/issues/8779
snmpd writes the below error message in syslog :
snmp#snmpd[27]: truncating integer value > 32 bits
This message is written in syslog when the hrSystemUptime(1.3.6.1.2.1.25.1.1.0 / system uptime) or sysUpTime(1.3.6.1.2.1.1.3 network management portion or snmpd uptime) is queried when either of these counters overflow beyond 32 bit value. This happens the device uptime or snmpd uptime is more than 497 days.

#### How I did it
Reference: https://access.redhat.com/solutions/367093 and https://linux.die.net/man/1/snmpcmd

To avoid seeing this message if the counter grows, the snmpd error log level is changed to display  LOG_EMERG, LOG_ALERT, LOG_CRIT, and LOG_DEBUG.

Without this change, LOG_ERR and LOG_WARNING would also be logged in syslog.

#### How to verify it
On a device which is up for more than 497 days, modify supervisord.conf  with the change and restart snmp.
Query 1.3.6.1.2.1.1.3 and verify that log message is not seen.
2022-01-12 14:40:01 -08:00
Saikrishna Arcot
fee2441717
Create docker-base-bullseye and docker-config-engine-bullseye (#9666)
* [slave-bullseye]: Remove Python 2

It shouldn't be needed anymore.

Signed-off-by: Saikrishna Arcot <sarcot@microsoft.com>

* [dockers]: Add docker-base-bullseye and docker-config-engine-bullseye

Also upgrade socat from 1.7.3.1 to 1.7.4.1

Signed-off-by: Saikrishna Arcot <sarcot@microsoft.com>
2022-01-11 09:23:42 -08:00
abdosi
6c507329b7
Enable/Disable Order ECMP feature. (#9651)
Updated Jinja2 Template in switch.json.j2 for enabling/disabling Order ECMP feature based on device role.
Changes as per design: Azure/SONiC#896
2022-01-06 16:40:50 -08:00
Saikrishna Arcot
bd479cad29 Create a docker-swss-layer that holds the swss package.
This is to save about 50MB of disk space, since 6 containers
individually install this package.

Signed-off-by: Saikrishna Arcot <sarcot@microsoft.com>
2022-01-06 09:26:55 -08:00
Saikrishna Arcot
b09b845225 [docker-platform-monitor]: Remove Python 2
Python 2 doesn't appear to be required any more.
2022-01-06 09:26:55 -08:00
Shilong Liu
36d866002a
[build] Fix docker-sonic-mgmt pylint dependency lazy-object-proxy version (#9596) 2021-12-24 10:42:37 +08:00
zzhiyuan
a6d0a27a18
[Arista] Increase switch PCIe timeout for 7060-cx32s (#9248)
Co-authored-by: Zhi Yuan (Carl) Zhao <zyzhao@arista.com>
Why I did it
Arista 7060 platform has a rare and unreproduceable PCIe timeout that could possibly be solved with increasing the switch PCIe timeout value. To do this we'll call a script for this platform to increase the PCIe timeout on boot-up.

No issues would be expected from the setpci command. From the PCIe spec:

"Software is permitted to change the value in this field at any
time. For Requests already pending when the Completion
Timeout Value is changed, hardware is permitted to use either
the new or the old value for the outstanding Requests, and is
permitted to base the start time for each Request either on when
this value was changed or on when each request was issued. "

How I did it
Add "platform-init" support in swss docker similar to how "hwsku-init" is called, only this would be for any device belonging to a platform. Then the script would reside in device data folder.

Additionally, add pciutils dependency to docker-orchagent so it can run the setpci commands.

How to verify it
On bootup of an Arista 7060, can execute:
lspci -vv -s 01:00.0 | grep -i "devctl2"
In order to check that the timeout has changed.
2021-12-17 08:43:25 -08:00
Lawrence Lee
7bd0a2ad11
[swss]: Listen for undeliverable tunnel packets (#9348)
- Create a script in the orchagent docker container which listens for these encapsulated packets which are trapped to CPU (indicating that they cannot be routed/no neighbor info exists for the inner packet). When such a packet is received, the script will issue a ping command to the packet's inner destination IP to start the neighbor learning process.
- This script is also resilient to portchannel status changes (i.e. interface going up or down). An interface going down does not affect traffic sniffing on interfaces which are still up. When an interface comes back up, we restart the sniffer to start capturing traffic on that interface again.
2021-12-14 14:45:23 -08:00
Shi Su
f2774b635d
Add openbfdd to ptf docker (#9488)
Why I did it
To enable test support for BFD-related features, the PTF docker needs to have the proper support for BFD. This PR aims to add BFD support in ptf docker.

How I did it
Clone and build OpenBFDD for PTF docker.

How to verify it
Build locally and verify BFD is supported.
2021-12-14 11:46:48 -08:00
abdosi
6c0da4bcf0
[bgp] Enable BGP Graceful Restart based on device role (#9486)
What I did:
Updated Jinja Template to enable BGP Graceful Restart based on device role. By default it will be enable only if the device role type is TorRouter.

Why I did:-
By default FRR is configured in Graceful Helper mode. Graceful Restart is needed on T0/TorRouter only since the device can go for warm-reboot. For T1/LeafRouter it need to be in Helper mode only
2021-12-13 10:14:50 -08:00
novikauanton
969cea07aa
add platform to iccpd's env (#8945) 2021-12-08 09:21:44 -08:00
Brian O'Connor
46bcda359c
[PINS] Build P4RT container for PINS (#9083)
- Add INCLUDE_PINS to config to enable/disable container
- Add Docker files and supporting resources
- Add sonic-pins submodule and associated make files

Submission containing materials of a third party:
    Copyright Google LLC; Licensed under Apache 2.0

#### Why I did it

Adds P4RT container to SONiC for PINS

The P4RT app is covered by this HLD:
https://github.com/pins/SONiC/blob/master/doc/pins/p4rt_app_hld.md

#### How I did it

Followed the pattern and templates used for other SONiC applications

#### How to verify it

Build SONiC with INCLUDE_P4RT set to "y".
Verify that the resulting build has a container called "p4rt" running.
You can verify that the service is up by running the following command on the SONiC switch:
```bash
sudo netstat -lpnt | grep p4rt
```
You should see the service listening on TCP port 9559.

#### Which release branch to backport (provide reason below if selected)

None

#### Description for the changelog

Build P4RT container for PINS
2021-12-07 11:11:25 -08:00
abdosi
f501311f11
Updated BGP Template for Chassis/Multi-asic (#9291)
Updated BGP Template for the case:
    
   1. For Packet Chassis do not advertise Loopback4096 address into BGP as there is Static Route for same. 
       Having this route in BGP causes two level of recursion in Zebra and cause assert in Zebra 
       when there are many nexthop involved
 
   2. Advertise only P2P Connected IP's into BGP (External Peers). For Packet chassis we have backend IP Interface subnet and if 
        they get advertised into BGP then it also causes recursion
2021-12-06 09:36:24 -08:00
kellyyeh
d11207d4f4
[radv] Run radv on MgmtToRRouter (#9424)
* Allow radv to run on mgmt tor and EPMS
2021-12-03 09:45:06 -08:00
kellyyeh
f2ee94d201
[dhcp_relay] Update DHCPv6 counter on relayed messages (#9283) 2021-11-30 20:15:30 -08:00
vganesan-nokia
78de10713c
[voq-chassis][bgpcfg] VOQ_BGP_CHASSIS_NEIGHBORS timers default (#8455)
The BGP_VOQ_CHASSIS_NEIGHBOR keepalive and holdtime timers are
configured similar to general neighbors. Changes are done to configure
BGP_VOQ_CHASSIS_NEIGHBOR timers similar to BGP_INTENAL_NEIGBOR since voq
chassis bgp neighbors are similar to bgp internal neighbors in
multi-asic. As it is done for bgp internal neighbors, the keepalive and
holdtime timers are set to 3 and 10 seconds respectively. Also similar
to bgp internal neighbors, connection retry timer is also configured for
voq chassis bgp neighbors.

Signed-off-by: vedganes <vedavinayagam.ganesan@nokia.com>
2021-11-30 12:10:27 -08:00
Saikrishna Arcot
fdd8236864
[docker-mgmt-framework]: Don't overwrite /etc/passwd and /etc/group with symlinks (#9375)
Fixes #9376

Because /etc/passwd and /etc/group have been overwritten with symlinks
to /host_etc/passwd and /host_etc/group, the debug container build
fails. This is because the debug container is built without /etc being
mounted at /host_etc in the container (which does happen at runtime).
Because of that, /etc/passwd and /etc/group don't exist, which causes
some package installation errors when openssh-client tries to create a
group.

This is a partial revert of 1347f29178.

Signed-off-by: Saikrishna Arcot <sarcot@microsoft.com>
2021-11-24 23:51:59 -08:00
arlakshm
5830852832
remove staticd.conf.j2 (#9182)
Why I did it
resolves #8979 and #9055

How I did it
Remove the file static.conf.j2,which adds the default route on eth0 from bgp docker

Signed-off-by: Arvindsrinivasan Lakshmi Narasimhan <arlakshm@microsoft.com>
2021-11-24 15:32:16 -08:00
Junchao-Mellanox
554b04f312
Add trap flow counter support (#8940)
*Add trap flow counter support
2021-11-24 15:26:52 -08:00
Brian O'Connor
002827f08e
[PINS] Add APPL_STATE_DB and response path log (#9082)
- Add APPL_STATE_DB to database_config.json
- Clear APPL_STATE_DB during SwSS container restarts
- Add response path log file to logrotate config: responsepublisher.rec

Co-authored-by: PINS Working Group <sonic-pins-subgroup@googlegroups.com>
2021-11-24 10:31:06 -08:00
Stephen Sun
b3ccef9c08
[Reclaim buffer] Common infrastructure update for reclaiming buffer (#9133)
- Why I did it
This is to update the common sonic-buildimage infra for reclaiming buffer.

- How I did it
Render zero_profiles.j2 to zero_profiles.json for vendors that support reclaiming buffer
The zero profiles will be referenced in PR [Reclaim buffer] Reclaim unused buffers by applying zero buffer profiles #8768 on Mellanox platforms and there will be test cases to verify the behavior there.
Rendering is done here for passing azure pipeline.
Load zero_profiles.json when the dynamic buffer manager starts
Generate inactive port list to reclaim buffer

Signed-off-by: Stephen Sun <stephens@nvidia.com>
2021-11-24 15:00:23 +02:00
Ze Gan
79b8ff52b0
[sonic-mgmt]: Upgrade scapy (#8554)
* Upgrade scapy

Signed-off-by: Ze Gan <ganze718@gmail.com>

* Add scapy version

Signed-off-by: Ze Gan <ganze718@gmail.com>
2021-11-24 10:07:50 +08:00
Stepan Blyshchak
a2c2d67098
[ACL] enable ACL FC when genereting config from minigraph but disable by default (#8908)
* [ACL] enable ACL FC when genereting config from minigraph but disable by default
Why I did it
To support ACL counters on Flex Counter Infrastructure.

How I did it
Enable ACL FC in init_cfg and minigraph. Disable when genereting configuration from preset.

How to verify it
Together with depends PRs. Run ACL/Everflow test suite.

Signed-off-by: Stepan Blyshchak <stepanb@nvidia.com>
2021-11-11 09:07:54 +08:00
tjchadaga
8544147a70
Fix for additional intf flap during fast-reboot (#9166) 2021-11-08 15:21:11 -08:00
Lawrence Lee
7c0507b6db
[swss]: Start ndppd after vlanmgrd (#9155)
Why I did it
During swss container startup, if ndppd starts up before/with vlanmgrd, ndppd will be pinned at nearly 100% CPU usage.

How I did it
Only start ndppd after vlanmgrd is running. Also, call ndppd directly instead of through bash for improved logging and to prevent orphaned processes.

Signed-off-by: Lawrence Lee <lawlee@microsoft.com>
2021-11-03 11:03:01 -07:00
Sudharsan Dhamal Gopalarathnam
fcff3f3d09
VxLAN Tunnel Counters and Rates implementation (#8369)
* Enable flex counters for Vxlan tunnel
2021-11-01 10:42:21 -07:00
abdosi
919b3e5cdf
[chassis-packet] Fixed BGP Internal Peer template (#9106)
What I did:

Fix the typo in Internal Peer Group template for Packet-based Chassis.
Address Review comments of PR: [chassis-packet] minigraph parsing and BGP template changes #8966
- Static Route Parsing for Host
- Formatting of chassis port_config.ini
2021-10-29 11:02:38 -07:00
Junhua Zhai
7de673cb5b
[gearbox] Use separator ':' for GB_ASIC_DB, GB_COUNTERS_DB and GB_FLEX_COUNTER_DB (#9100)
Keep GB_ASIC_DB, etc consistent with the ones in sonic-swss-common/common/database_config.json
2021-10-28 10:27:52 -07:00
Marty Y. Lok
b91190d82d
[Nokia] Add protobuf and grpc C++ and python lib to support Nokia IXR7250E platform (#8366)
#### Why I did it
Nokia IXR7250E platform requires grpcio, grpcio-tools python library, and libprotobuf-dev, libgrpc++ library  

#### How I did it
Modified the build_debian.sh install libprotobuf-dev and libgrpc++ to support nokia ndk
Modified the sonic_debian_extension.j2 to install the grpcio and grpcio-tools in the host
Modified the docker-platform-monitor/Dockerfile.js to install grpcio and grpcio-tools for the pmon container.

#### How to verify it
Image running success.
2021-10-26 18:09:32 -07:00
Kebo Liu
9c4a7c2fed
[PMON] Skip chassis_db_init task on Mellanox simx platform (#9017)
Why I did it
"chassis_db_init" task of PMON should be skipped on Mellanox simx platform, since the hardware info which this task is trying to access is not available on simx platforms, It will introduce some error log.

How I did it
Add the capability for "chassis_db_init" in the template for it can be skipped by adding configuration in "pmon_daemon_control.json".
add "skip_chassis_db_init" configuration for simx platforms.
use symbol link for "pmon_daemon_control.json" since all the simx platforms share the same configuration
How to verify it
Build an image and install it on simx platform to check whether "chassis_db_init" task is skipped.

Signed-off-by: Kebo Liu <kebol@nvidia.com>
2021-10-24 09:10:41 -07:00
Saikrishna Arcot
c1d5e0682f
docker-dhcp-relay: Fix waiting for interfaces to get set up (#9034)
Fix the check used to wait for interfaces to come up. The group name in
the supervisor config files has changed from isc-dhcp-relay to
dhcp-relay.

Also, in the wait script, wait 10 additional seconds after the vlans,
port channels, and any interfaces are up. This is because dhcrelay
listens on all interfaces (in addition to port channels and vlans), and
to ensure that it stays in a clean state during runtime, wait some extra
time to make sure that those interfaces are created as well.

Signed-off-by: Saikrishna Arcot <sarcot@microsoft.com>
2021-10-21 18:45:00 -07:00
shlomibitton
546340bf7b
[dhcp_relay] Fix import for dhcp_counters on clear_dhcp6relay_counter.py (#8991)
#### Why I did it

**Import issue will cause:**
root@sonic:/# sudo sonic-clear arp
failed to import plugin clear.plugins.dhcprelay: No module named 'show_dhcp_relay'

#### How I did it

Fix the import.

#### How to verify it

run sudo sonic-clear arp
2021-10-19 03:10:36 -07:00
abdosi
3bb248bd67
[chassis-packet] minigraph parsing and BGP template changes (#8966)
1. Changes for Generation LC-Graph for packet-based chassis.
2. Added Support Ipv6 Peering on Loopback4096 for voq also
3. Updated asic topology yml files to be offset of slot
4. Made slot_num to take string slot<number> instead of number
5. Consolidated template_dpg_voq_asic.j2 into dpg_asic.j2
6. Remove Loopback4096 from asic topology and parse as dut invertory for
   multi-asic
7. Updated topo_facts parsing for asic topology_
8. Internal BGP Session rename from <VoqChassisInternal> to <ChassisInternal> and take switch_type as value.
Signed-off-by: Abhishek Dosi <abdosi@microsoft.com>
2021-10-18 18:44:24 -07:00
Lawrence Lee
fad5ec47b4 [mux]: Call write_standby from host only
Signed-off-by: Lawrence Lee <lawlee@microsoft.com>
2021-10-15 09:59:59 -07:00
Lawrence Lee
5232647b33 [mux]: Make write_standby available on host
Signed-off-by: Lawrence Lee <lawlee@microsoft.com>

[write_standby]: Cleanup and fix build

Signed-off-by: Lawrence Lee <lawlee@microsoft.com>
2021-10-15 09:59:59 -07:00
Lawrence Lee
14403c61d2 [mux]: Initialize all mux ports as standby
Signed-off-by: Lawrence Lee <lawlee@microsoft.com>
2021-10-15 09:59:59 -07:00
Tamer Ahmed
c9c2826520 Merged PR 3845699: [linkmgrd]: Introduce MUX cable linkmgrd
Linkmgrd monitors link status, mux status, and link state. Has
the link becomes unhealthy, linkmgrd will trigger mux switchover
on a standby ToR ensuring uninterrupted service to servers/blades.
This PR is initial implementation of linkmgrd.

Also, docker-mux container hold packages related to maintaining and managing
mux cable. It currently runs linkmgrd binary that monitor and switches
the mux if needed.
This PR also introduces mux-container and starts linkmgrd as startup when
build is configured with INCLUDE_MUX=y

Edit: linkmgrd PR will follow.

signed-off-by: Tamer Ahmed <tamer.ahmed@microsoft.com>

Related work items: #2315, #3146150
2021-10-15 09:59:59 -07:00
kellyyeh
df6361f50c
Change radv interval to 3min (#8882) 2021-10-01 15:00:16 -07:00
Ye Jianquan
38500fa92e
Add gdb and pyrasite to ptf image (#8816) 2021-09-24 17:10:48 +08:00
kellyyeh
62a1f5eb19
Add CLI Support for IPv6 Helpers and DHCPv6 Relay Counters (#8593) 2021-09-23 22:01:26 -07:00
DINESH KUMAR SELLAPPAN
31a647a72d
[docker-sonic-mgmt]: Snappi version to 0.5.11 (#8790) 2021-09-23 02:12:12 -07:00
kellyyeh
bc06c6fcb5
Incorporate DHCPv6 Relay Agent into dhcp-relay docker (#8321) 2021-09-22 16:05:03 -07:00
shlomibitton
112fda7877
[Flex Counters] Reset flex counters delay flag on config DB when enable_counters script is called (#8500)
#### Why I did it
Reset flex counters delay flag on config DB when enable_counters script is called to allow enablement of flex counters in orchagent.

#### How I did it
Push to config DB 'false' value for delay indication when enable_counters script is called before enabling the counters.

#### How to verify it
Observe counters are created when enable_counters script is called.
2021-09-01 21:17:36 -07:00
richardyu
479f61404b
Add thrift in the docker-sonic-mgmt (#8623)
Co-authored-by: richardyu-ms <richard.yu@microsoft.com>
2021-08-31 19:20:06 -07:00
shlomibitton
56533ceb9e
[dhcp_relay] Adapt config/show CLI commands to support DHCPv6 relay (#8211)
#### Why I did it
- Adapt config/show CLI commands to support DHCPv6 relay
- Support multiple dhcp servers assignment in one command
- Fix IP validation
- Adapt UT and add new UT cases

#### How I did it
- Modify config/show dhcp relay files
- Modify config/show UT files

#### How to verify it
This PR has a dependency on PR https://github.com/Azure/sonic-utilities/pull/1717
Build an image with the dependent PR and this PR
Use config/show DHCPv6 relay commands.
2021-08-25 00:48:39 -07:00
Christian Svensson
f7de685be2
[mgmt-framework]: Fix typo in mgmt_vars.j2 (#8475)
Signed-off-by: Christian Svensson <blue@cmd.nu>
2021-08-24 10:54:13 -07:00
Kostiantyn Yarovyi
6530f93881 [Pcied] run by python 3
Why I did it
Pcied running by python 2.

How I did it
dropped python2 support and add python3 support for pcied in file docker-pmon.supervisord.conf.j2

How to verify it
docker exec pmon supervisorctl status
2021-08-23 03:30:12 +00:00
Myron Sosyak
4d03526311
[docker-ptf] Upgrade to buster (#8254)
Co-authored-by: Your Name <you@example.com>
2021-08-18 10:42:03 -07:00
xumia
a4405f09ed
Support to build armhf/arm64 platforms on arm based system (#7731)
Why I did it
Support to build armhf/arm64 platforms on arm based system without qemu simulator.
When building the armhf/arm64 on arm based system, it is not necessary to use qemu simulator.

How I did it
Build armhf on armhf system, or build arm64 on arm64 system, by default, qemu simulator will not be used.
When building armhf on arm64, and you have enabled armhf docker, then it will build images without simulator automatically. It is based how the docker service is run.

Docker base image change:
For amd64, change from debian:to amd64/debian:
For arm64, change from multiarch/debian-debootstrap:arm64- to arm64v8/debian:
For armhf, change from multiarch/debian-debootstrap:armhf- to arm32v7/debian:
See https://github.com/docker-library/official-images#architectures-other-than-amd64
The mapping relations:
arm32v6 --- armel
arm32v7 --- armhf
arm64v8 --- arm64

Docker image armhf deprecated info: https://hub.docker.com/r/armhf/debian, using arm32v7 instead.
2021-08-12 22:24:37 +08:00
richardyu
9417fe9303
PTF adds unittest-xml-reporting (#8417)
Co-authored-by: richardyu-ms <richard.yu@microsoft.com>
2021-08-11 20:55:21 -07:00
Blueve
aa01315f60
[ARM] Fix issue whre the ping6 tool is missing from orchagent docker (#8345)
Signed-off-by: Jing Kan jika@microsoft.com
2021-08-05 22:00:50 +08:00
Sujin Kang
447f0c64da
[pmon]: Enable Autorestart of the daemons in PMON for unexpected exit cases (#8326)
Remove the daemon list from the critical_process which prevent the PMON
from restarting when the individual daemon crashes.
2021-08-04 09:57:54 -07:00
VenkatCisco
0803f7bf34
[pmon]: add python3-jsonschema pmon (#8018)
jsonschema is an implementation of JSON Schema for Python .

Signed-off-by: Venkat Garigipati <venkatg@cisco.com>
2021-08-03 18:08:09 -07:00
vganesan-nokia
f9231723f9
[multiasic][voq][bgpconf] Fix for the issue of same BGP router id in all asics (#8049)
For multiasic, the back end asics use ip addresss of Loopback4096 for BGP router id. In VOQ multi-asic chassis there are no back end asics. All the asics are front end and the iBGP connections are established via Ethernet-IB of asics. Since these asics are not designated as BackEnd, the ip address of interface Loopback0 is used as BGP router id. Since the ip address of Loopback0 is same for all the asics in the line card, same router id is used for voq iBGP configurations and hence the iBGP connections are not established. Changes are done to fix this
2021-07-26 12:54:52 -07:00
Shi Su
8a48be9b74
Reduce route selection deferral timer for bgp graceful restart (#7533)
Why I did it
There are scenarios that End-of-RIB comes from a part of the peers arrives after reconciliation. In such scenarios, if the route selection deferral timer has the default value of 360 seconds, FRR would not set up routes and all routes would be removed after reconciliation. This PR reduces the route selection deferral timer so that at least routes to parts of the peers get restored at the point of reconciliation.

Fix #7488

How I did it
Reduce route selection deferral timer for bgp graceful restart to 15 seconds.
2021-07-26 10:16:19 -07:00
賓少鈺
aa59bfeab7
[PDE]: introduce the SONiC Platform Development Env (#7510)
The PDE silicon test harness and platform test harness can be found in
src/sonic-platform-pdk-pde
2021-07-24 16:24:43 -07:00