sonic-swss:
- [Mux] Route handling based on mux status, kernel tunnel support (#1615)
- Reduce noise during frequent route update (#1624)
- Changed Error log to Notice log during FDB flush notification after VLAN delete (#1618)
- [PortsOrch] Add reference counting to ports for ACL bindings (#1614)
- [crm]: Ignore unsupported/non-implemented switch attributes (#1613)
- [Mux] Fix repeating logs in case of tunnel creation fail (#1610)
sonic-utilities:
- [config reload]: Restart mux container (#1401)
- [storyteller] Enhance the storyteller utility (#1400)
- [show] Fix int status when portchannel is in the system (#1376)
- [config][show] cli support for retrieving ber, eye-info and configuring prbs, loopback on Y-cable (#1386)
- Skip route check for tun0 interfaces (#1399)
- do not parse stderr to get correct routing stack (#1398)
- [storyteller] allow storyteller to work on downloaded logs (#1388)
- [show] Run fwutil with sudo (#1364)
Signed-off-by: Danny Allen <daall@microsoft.com>
The Portchannels were not getting cleaned up as the cleanup activity was taking more than 10 secs which is default docker timeout after which a SIGKILL will be send.
Fixes#6199
To check if it works out for this issue in 201911 ? #6503
This issue is significantly seen in master branch compared to 201911 because the Portchannel cleanup takes more time in master. Test on a DUT with 8 Port Channels.
master
admin@str-s6000-acs-8:~$ time sudo systemctl stop teamd
real 0m15.599s
user 0m0.061s
sys 0m0.038s
Sonic 201911.v58
admin@str-s6000-acs-8:~$ time sudo systemctl stop teamd
real 0m5.541s
user 0m0.020s
sys 0m0.028s
When we add allow-list key with action above route-map gets updated . For eg if we add deny action above template will become to no-export community. Now if we delete the key Issue is we still keep the no-export and do not move back to drop community.
This PR fixes this issue by rolling back default route-map community value back to constants.yml default action.
This PR updates the following commits in sonic-platform-common
6ad0004 [component] add auto_update_firmware() to support the auto update. (#106)
49076a9 [sonic_y_cable] Add support for measuring BER and EYE scan and running Loopback, PRBS modes on the Y cable (#158)
6b12b4c [sfp] Add parsing the dom_capability to sff8472 (#102)
7fc76b9 [sonic_pcie] Add get_pcie_aer_stats and its common implementation (#144)
Signed-off-by: vaibhav-dahiya <vdahiya@microsoft.com>
Update minigraph parser to retrieve kubernetes server info from minigraph.xml and update "KUBERNETES_MASTER|SERVER" in running config.
Update minigraph parser to include clusterName from minigraph.xml into "DEVICE_METADATA|localhost"
snmpd's compile is always failed with file truncated on ARM64 arch, the error log is like "/usr/bin/ld: mibgroup/ip-forward-mib/inetCidrRouteTable/.libs/inetCidrRouteTable_interface.o: file not recognized: file truncated"
Co-authored-by: Xianghong Gu <xgu@centecnetworks.com>
**- Why I did it**
In thermalctd, when speed of fan exceeds threshold, the fan status will be saved as "bad". So in system health, it is better to check fan speed before fan status. In this case, if fan speed exceeds threshold, we get more detailed information.
**- How I did it**
Move fan speed check logic before fan status check
**- How to verify it**
Manual test
This update includes the following changes
> [syncd armhf] Fix syncd crash when running community test suites (#777)
> Revert "[tests]:Add unittest for MACsec on p2p establishment (#771)"
> [tests]:Add unittest for MACsec on p2p establishment (#771)
> [tests] Enable azure pipeline make check to respect unittests (#760)
* c2fb282 2021-01-29 | [ecnconfig] Allow ecn unit test to run without sudo (#1390) [Neetha John]
* 6cc635b 2021-01-29 | [sonic-installer] Add information to syslog (#1369) [Dmytro]
* 7a8024a 2021-01-27 | Prevent user from adding more then a single untagged VLAN to an interface (#1382) [Eran Dahan]
* 41e62c6 2021-01-26 | [pcieutil] Add 'pcie-aer' sub-command to display AER stats (#1169) [Arun Saravanan Balachandran]
* 47f412b 2021-01-26 | Improve robustness of consutil plugin loading (#1353) [Samuel Angebault]
* 64aa1b8 2021-01-25 | [show] Fix warnings, related to gearbox, while show commands execution (#1343) [maksymbelei95]
* ff226d0 2021-01-25 | Prevent configuring IP interface on a port which is a member of VLAN (#1374) [Eran Dahan]
* f1522b9 2021-01-21 | [config_mgmt.py]: Set leaf-list to empty list while port breakout. (#1268) [Praveen Chaudhary]
* 99c05d5 2021-01-21 | add vlan_intf_object only if there are ipv4 or ipv6 mappings (#1377) [Sumukha Tumkur Vani]
* b082684 2021-01-21 | [ecn] Add tests for ecnconfig command (#1372) [Neetha John]
* 23e0920 2021-01-21 | [sfpshow] Enhance QSFP-DD DOM information (#1207) [shlomibitton]
* f4edba1 2021-01-20 | [ecnconfig] handle backend port names when extracting port I/F ID from the port name (#1361) [Mahesh Maddikayala]
Signed-off-by: Danny Allen <daall@microsoft.com>
- Why I did it
Initially, we used Monit to monitor critical processes in each container. If one of critical processes was not running
or crashed due to some reasons, then Monit will write an alerting message into syslog periodically. If we add a new process
in a container, the corresponding Monti configuration file will also need to update. It is a little hard for maintenance.
Currently we employed event listener of Supervisod to do this monitoring. Since processes in each container are managed by
Supervisord, we can only focus on the logic of monitoring.
- How I did it
We borrowed the event listener of Supervisord to monitor critical processes in containers. The event listener will take
following steps if it was notified one of critical processes exited unexpectedly:
The event listener will first check whether the auto-restart mechanism was enabled for this container or not. If auto-restart mechanism was enabled, event listener will kill the Supervisord process, which should cause the container to exit and subsequently get restarted.
If auto-restart mechanism was not enabled for this contianer, the event listener will enter a loop which will first sleep 1 minute and then check whether the process is running. If yes, the event listener exits. If no, an alerting message will be written into syslog.
- How to verify it
First, we need checked whether the auto-restart mechanism of a container was enabled or not by running the command show feature status. If enabled, one critical process should be selected and killed manually, then we need check whether the container will be restarted or not.
Second, we can disable the auto-restart mechanism if it was enabled at step 1 by running the commnad sudo config feature autorestart <container_name> disabled. Then one critical process should be selected and killed. After that, we will see the alerting message which will appear in the syslog every 1 minute.
- Which release branch to backport (provide reason below if selected)
201811
201911
[x ] 202006
* Fix exception in bgpmon caused by duplicate keys
It is possible that BGP neighbors in IPv4 and IPv6 address families
share the same name (such as bgp monitor). However, such case is not
handled in bgpmon, and an Exception will be raised. This commit will
address the issue by Using set instead of list to avoid duplicate keys.
Recent changes brought l2 vlan concept which do not have DHCP
clients behind them and so DHCP relay is not required. Also,
dhcpmon fails to launch on those vlans as their interfaces
lack IP addresses. This PR limit launch of both DHCP relay
and dhcpmon to L3 vlans only.
singed-off-by: Tamer Ahmed <tamer.ahmed@microsoft.com>
- Support for non-template based FRR configurations (BGP, route-map, OSPF, static route..etc) using config DB schema.
- Support for save & restore - Jinja template based config-DB data read and apply to FRR during startup
**- How I did it**
- add frrcfgd service
- when frr_mgmg_framework_config is set, frrcfgd starts in bgp container
- when user changed the BGP or other related table entries in config DB, frrcfgd will run corresponding VTYSH commands to program on FRR.
- add jinja template to generate FRR config file to be used by FRR daemons while bgp container restarted
**- How to verify it**
1. Add/delete data on config DB and then run VTYSH "show running-config" command to check if FRR configuration changed.
1. Restart bgp container and check if generated FRR config file is correct and run VTYSH "show running-config" command to check if FRR configuration is consistent with attributes in config DB
Co-authored-by: Zhenhong Zhao <zhenhong.zhao@dell.com>
**- Why I did it**
For now `hwsku.json` and `platform.json` dont support optional fields. For example no way to add `fec` or `autoneg` field using `platform.json` and `hwsku.json`.
**- How I did it**
Added parsing of optional fields from hwsku.json.
**- How to verify it**
Add optional field to `hwsku.json`. After first boot will be generated new `config_db.json` or you can generate it using `sonic-cfggen` command. In this file must be optional field from `hwsku.json` or check using command `redis-cli hgetall PORT_TABLE:Ethernet0`
Example of `hwsku.json`, that must be parsed:
```
{
"interfaces": {
"Ethernet0": {
"default_brkout_mode": "1x100G[40G]",
"fec": "rs",
"autoneg": "0"
},
...
}
```
Example of generated `config_db.json`:
```
"PORT": {
"Ethernet0": {
"alias": "Ethernet0",
"lanes": "0,1,2,3",
"speed": "100000",
"index": "1",
"admin_status": "up",
"fec": "rs",
"autoneg": "0",
"mtu": "9100"
},
```
So, we can see this entries in redis db:
```
admin@sonic:~$ redis-cli hgetall PORT_TABLE:Ethernet0
1) "alias"
2) "Ethernet0"
3) "lanes"
4) "0,1,2,3"
5) "speed"
6) "100000"
7) "index"
8) "1"
9) "admin_status"
10) "up"
11) "fec"
12) "rs"
13) "autoneg"
14) "0"
15) "mtu"
16) "9100"
17) "description"
18) ""
19) "oper_status"
20) "up"
```
Also its way to fix `show interface status`, `FEC` field but also need add `FEC` field to `hwsku.json`.
Before:
```
admin@sonic:~$ show interfaces status
Interface Lanes Speed MTU FEC Alias Vlan Oper Admin Type Asym PFC
----------- --------------- ------- ----- ----- ----------- ------ ------ ------- --------------- ----------
Ethernet0 0,1,2,3 100G 9100 N/A Ethernet0 routed up up QSFP28 or later N/A
```
After:
```
admin@sonic:~$ show interfaces status
Interface Lanes Speed MTU FEC Alias Vlan Oper Admin Type Asym PFC
----------- --------------- ------- ----- ----- ----------- ------ ------ ------- --------------- ----------
Ethernet0 0,1,2,3 100G 9100 rs Ethernet0 routed up up QSFP28 or later N/A
```
**- Why I did it**
Prior to SONiC using Debian Buster, we needed to build Python 3.5 or newer from source for installation in the SNMP container, becuase it wasn't available from the Debian repository for Jessie or Stretch. Now that all containers are based on Buster, we simply install Python 3.7 from the Debian repository in the host as well as all containers. We are no longer building Python 3 from source, so the Makefile is unused and we no longer need to install build dependencies in the slave containers.
**- How I did it**
- Remove Python 3 makefile
- No longer install Python 3 build dependencies in the slave containers.
Submodule changes to be committed:
* src/sonic-platform-daemons 81318f7...e72f6cd (3):
> [ledd] Minor refactor; add unit tests (#143)
> [thermalctld] Report unit test coverage (#141)
> [psud] Increase unit test coverage (#140)
Meet the requirement for the MUX_CABLE table that IPv6 loopbacks have a /128 prefix
Note that this change only affects the MUX_CABLE table, all other tables continue to use the loopback address provided in minigraph.
Signed-off-by: Lawrence Lee <lawlee@microsoft.com>
Changes in this update:
37695c8 [show]: Use TCP Connection For Muxcable Commands (#1371)
8119ba2 Validations checks while creating and deleting a Portchannel (#1326)
3df267e [config] Fix Breakout mode option and BREAKOUT_CFG table check method (#1270)
9bd709b [show] Fix show arp in case with FDB entries, linked to default VLAN (#1357)
bc2d27e [generate_dump]: fix syntax error
signed-of-by: Tamer Ahmed <tamer.ahmed@microsoft.com>
Currently FRR is send Prefix with VNI information to FPMSYNCD. This PR allows FRR to send RMAC with EVPN Type5 prefix to fpmsyncd. This is a temp fix. This patch will be removed once neighorch is ready to handle the Prefix and ARP (containing RMAC) separately.
[ci]: download artifacts from master branch (#768)
Do not create fabric port if mapping is not available (#769)
[syncd] Comparison logic log also current attr value on set operation (#763)
Add fabric port test to vslib (#737)
[ci]: use sonicbld pool (#766)
[tests] Remove exit command blocking all tests to run (#765)
[vslib]: adapt macsec sai 1.7.1 (#755)
Add support for SAI_SWITCH_ATTR_AVAILABLE_IPMC_ENTRY needed by CRM (#756)
Signed-off-by: Danny Allen <daall@microsoft.com>
Avoid sonic-cfggen crashing when a server does not have a configured loopback address in the minigraph
Signed-off-by: Lawrence Lee <lawlee@microsoft.com>
Update sonic-linux-kernel pointer to pick up new commits:
- Backport patches to increase critical threshold for ASIC and validate transceiver temperature a7c1af7c44edde90dff49d672071139043bcdb65 548e8e0be4
- [ci]: Set up CI with Azure Pipelines 548e8e0be49692050ea4071d5e9945816bc5aacc a7c1af7c44
Signed-off-by: Kebo Liu <kebol@nvidia.com>
* Fix py3 version changed even version control enabled issue
* Add some comments and simplify the script
* Add the comment to explain how to get the not hooked command
Server IPv4 loopbacks do not always arrive with /32 prefix, which is a requirement for the MUX_CABLE table in config DB
Signed-off-by: Lawrence Lee <lawlee@microsoft.com>
Fix#119
when parallel build is enable, multiple dpkg-buildpackage
instances are running at the same time. /var/lib/dpkg is shared
by all instances and the /var/lib/dpkg/updates could be corrupted
and cause the build failure.
the fix is to use overlay fs to mount separate /var/lib/dpkg
for each dpkg-buildpackage instance so that they are not affecting
each other.
Signed-off-by: Guohan Lu <lguohan@gmail.com>
To make the peer switch hostname easily accessible from config DB. Add peer_switch field to DEVICE_METADATA table
Signed-off-by: Lawrence Lee <lawlee@microsoft.com>