Bugs fixes:
All | Kernel | During system reload when CPU is loaded with heavy traffic, a Kernel Panic may occur.
All | Modules, Port split | FW stuck when device rebooted with locked Optical Transceivers in split mode
Spectrum-3 | PFC | On Spectrum-3 systems, slow reaction time to Rx pause packets on 40GbE ports may lead to buffer overflow on servers.
Spectrum-3 | SN4700, Port Split | On rare occasion SN4700, conducting 100G split (4x25G) in NRZ when splitter port 1 or 2 are down, ports 3 and 4 will also go down.
Enahncments:
All | Kernel | new notification on ISSU start, so other kernel drivers can disable any interface to ASIC
Signed-off-by: Kebo Liu <kebol@nvidia.com>
Recent changes brought l2 vlan concept which do not have DHCP
clients behind them and so DHCP relay is not required. Also,
dhcpmon fails to launch on those vlans as their interfaces
lack IP addresses. This PR limit launch of both DHCP relay
and dhcpmon to L3 vlans only.
singed-off-by: Tamer Ahmed <tamer.ahmed@microsoft.com>
Fix#119
when parallel build is enable, multiple dpkg-buildpackage
instances are running at the same time. /var/lib/dpkg is shared
by all instances and the /var/lib/dpkg/updates could be corrupted
and cause the build failure.
the fix is to use overlay fs to mount separate /var/lib/dpkg
for each dpkg-buildpackage instance so that they are not affecting
each other.
Signed-off-by: Guohan Lu <lguohan@gmail.com>
sonic-slave tag only allows all lower case. In case the user
name is mixed case, we need to change user name to all lower case.
Signed-off-by: Guohan Lu <lguohan@gmail.com>
With the release of pip21.0 (https://pypi.org/project/pip/#history) on branch 201911 stretch build is failing with below error logs:
As per https://pypi.org/project/pip/ pip21.0 does not not support python2 from Jan 2021. To fix this tag the pip to 20.3.3 version which was being used last and is working fine.
Signed-off-by: Abhishek Dosi <abdosi@microsoft.com>
**- Why I did it**
Earlier today we found a bug in the SONiC TSA implementation.
TSC shows incorrect output (see below) in case we have a route-map which contains TSA route-map as a prefix.
```
admin@str-s6100-acs-1:~$ TSC
Traffic Shift Check:
System Mode: Not consistent
```
The reason is that TSC implementation has too loose regexps in TSA utilities, which match wrong route-map entries:
For example, current TSC matches following
```
route-map TO_BGP_PEER_V4 permit 200
route-map TO_BGP_PEER_V6 permit 200
```
But it should match only
```
route-map TO_BGP_PEER_V4 permit 20
route-map TO_BGP_PEER_V4 deny 30
route-map TO_BGP_PEER_V6 permit 20
route-map TO_BGP_PEER_V6 deny 30
```
**- How I did it**
I fixed it by using egrep with `^` and `$` regexp markers which match begin and end of the line.
**- How to verify it**
1. Add follwing entry to FRR config:
```
str-s6100-acs-1#
str-s6100-acs-1# conf t
str-s6100-acs-1(config)# route-map TO_BGP_PEER_V4 permit 200
str-s6100-acs-1(config-route-map)# end
```
2. Use the TSC command and check output. It should show normal.
```
admin@str-s6100-acs-1:~$ TSC
Traffic Shift Check:
System Mode: Normal```
[Submodule update] sonic-utilities
- [db_migrator][201911] Support shared headroom in db_migrator on Mellanox platform (#1261)
- Multi-ASIC support show ip/v6 route additional parameters (#1333)
Signed-off-by: Stephen Sun <stephens@nvidia.com>
To pick up new commits from sonic-linux-kernel repo:
[201911] Backport patches to increase critical threshold for ASIC and validate transceiver temperature 2f173b45da29f3643212d6c9111db321797453ec Azure/sonic-linux-kernel@2f173b4
Signed-off-by: Kebo Liu <kebol@nvidia.com>
b8f0c3a [snmpagent] [201911] Fix hardcoded qsfp lane count by reading sensor status from DB (#183)
**- Why I did it**
Update submodule pointer for snmpagent to include fix for hardcoded qsfp lane count
**- How I did it**
Update snmpagent submodule
**- How to verify it**
Run build.
In scenario where upgrade gets config from minigraph, it could miss tacacs credentials as they are not in minigraph. Hence restore explicitly upon load-minigraph, if present.
- Why I did it
Upon boot, when config migration is required, the switch could load config from minigraph. The config-load from minigraph would wipe off TACACS key and disable login via TACACS, which would disable all remote user access. This change, would re-configure the TACACS if there is a saved copy available.
- How I did it
When config is loaded from minigraph, look for a TACACS credentials back up (tacacs.json) under /etc/sonic/old_config. If present, load the credentials into running config, before config-save is called.
- How to verify it
Remove /etc/sonic/config_db.json and do an image update. Upon reboot, w/o this change, you would not be able ssh in as remote user. You may login as admin and check out, "show tacacs" & "show aaa" to verify that tacacs-key is missing and login is not enabled for tacacs.
With this change applied, remove /etc/sonic/config_db.json, but save tacacs & aaa credentials as tacacs.json in /etc/sonic/. Upon reboot, you should see remote user access possible.
* Use 20 and 30 route-map entries instead of 2 and 3 for TSA
* Added support for dynamic "Allow list" default action.
Co-authored-by: Pavel Shirshov <pavel.contrib@gmail.com>
Updating submodule for sonic-swss to get the changes to Azure 201911. The following were the commits that were part of this submodule.
[201911-SWSS]flushing FDB entries per VLAN when deleting VLAN (PR#Azure/sonic-swss#1575) 9519fead3fc63972131de9cb8963a5aeacf7b23d
- Why I did it
Fix setting PSU led to 'green' or 'red' states.
Fix return False if unsupported color request.
Remove 'off' option for PSU led API since it is not supported in Mellanox.
- How I did it
Fix import missing information.
Return 'False' when unsupported led color is requested, preventing an exception.
- How to verify it
Try to set PSU LED to different status with Mellanox platform device.
Try to set PSU LED color to unsupported color with Mellanox platform device.
- Why I did it
Move frr logs from syslog from the directory /var/log/quagga/.log to /var/log/frr/log
- How I did it
Updated the rsyslog config files.
- How to verify it
Verified the logs come into the file zebra.log and bgpd.log in the DIR /var/log/frr/log
Natural sorting of SONiC config gen output consumes lot of CPU cycles.
The sole use of natsorted was to make test comparison easier and so,
the natsorting logic is now relocated to the test suite. As a result
sonic-cfggen gained nearly 1 sec per call since we no longer import
natsorted module!
singed-off-by: Tamer Ahmed <tamer.ahmed@microsoft.com>
Argument to write to config-db is not allowed when using template.
This PR allows cfggen to write to redis db when using template
mode.
signed-off-by: Tamer Ahmed <tamer.ahmed@microsoft.com>
Calls to cfggen take considerable time. With batch mode, we will have the ability
to reduce number of calls from services.
Example of the batch mode command:
sonic-cfggen -t template-1.j2 -t template-2.j2,config-db -t template-3.j2,config-db -t template-4.j2,file1 -t template-5.j2,file2 --write-to-db.
template-1.j2 will be rendered to stdout since it is missing the dest part. stdout is default
config-db is a special keyword that will inject the rendered template into internal data structure. The internal data structure gets written to redis-db with --write-to-db switch. In the case the user would like to write to a file named config-db, it could be given as /config-db or ./config-db
signed-off-by: Tamer Ahmed <tamer.ahmed@microsoft.com>
Optimizing number of calls made to sonic-cfggen during service
start up as it adds to total system boot up time.
***-Test 1***
there is an average saving of 1 to 1.5 sec between old script and new script
```
root@str-s6000-acs-14:/# time /usr/bin/rest-server-old.sh
Generating temporary TLS server certificate ...
2020/07/09 19:03:33 wrote cert.pem
2020/07/09 19:03:33 wrote key.pem
REST_SERVER_ARGS = -ui /rest_ui -logtostderr -cert /tmp/cert.pem -key /tmp/key.pem
/usr/sbin/rest_server -ui /rest_ui -logtostderr -cert /tmp/cert.pem -key /tmp/key.pem
real 0m8.790s
user 0m7.993s
sys 0m0.584s
root@str-s6000-acs-14:/# time /usr/bin/rest-server-new.sh
Generating temporary TLS server certificate ...
2020/07/09 19:03:45 wrote cert.pem
2020/07/09 19:03:45 wrote key.pem
REST_SERVER_ARGS = -ui /rest_ui -logtostderr -cert /tmp/cert.pem -key /tmp/key.pem
/usr/sbin/rest_server -ui /rest_ui -logtostderr -cert /tmp/cert.pem -key /tmp/key.pem
real 0m6.940s
user 0m5.670s
sys 0m0.386s
```
***-Test 2***
Built an image with this change and rest server is running with params as described in test 1 above
```
admin@str-s6000-acs-14:~$ ps -ef | grep rest_server
root 3301 2866 2 02:09 pts/0 00:00:10 /usr/sbin/rest_server -ui /rest_ui -logtostderr -cert /tmp/cert.pem -key /tmp/key.pem
```
signed-off-by: Tamer Ahmed <tamer.ahmed@microsoft.com>
Calls to sonic-cfggen is CPU expensive. This PR reduces calls to
sonic-cfggen to one call during startup when starting swss service.
singed-off-by: Tamer Ahmed <tamer.ahmed@microsoft.com>
Calls to sonic-cfggen is CPU expensive. This PR reduces calls to
sonic-cfggen to two calls during startup when starting frr service.
singed-off-by: Tamer Ahmed <tamer.ahmed@microsoft.com>
Calls to sonic-cfggen is CPU expensive. This PR reduces calls to
sonic-cfggen to one call during startup when starting radv service.
singed-off-by: Tamer Ahmed <tamer.ahmed@microsoft.com>
Calls to sonic-cfggen is CPU expensive. This PR reduces calls to
sonic-cfggen to one call during startup when starting dhcp-relay
service.
singed-off-by: Tamer Ahmed <tamer.ahmed@microsoft.com>
Calls to sonic-cfggen is CPU expensive. This PR reduces calls to
sonic-cfggen to one call during startup when running interfaces-
config.
singed-off-by: Tamer Ahmed <tamer.ahmed@microsoft.com>
Calls to sonic-cfggen is CPU expensive. This PR reduces calls to
sonic-cfggen to once calle during snmp startup
singed-off-by: Tamer Ahmed <tamer.ahmed@microsoft.com>
This PR enables cfggen to readr/write from Redis DB using pipelines.
Pipelines enables batch read/write from/to Redis DB.
signed-off-by: Tamer Ahmed <tamer.ahmed@microsoft.com>
This PR limited the number of calls to sonic-cfggen to one call
per iteration instead of current 3 calls per iteration.
The PR also installs jq on host for future scripts if needed.
signed-off-by: Tamer Ahmed <tamer.ahmed@microsoft.com>
Support ACL Table type Mirrorv6 for Innovium (#1528)
Enable v6 ACL rule based Mirroring for Innovium Platform
Signed-off-by: Abhishek Dosi <abdosi@microsoft.com>
To limit IO and space usage on the flash device the boot0 script makes sure the SWI is in memory.
Because SONiC maps /tmp on the flash, some logic is required to make sure of it.
However it is possible for some provisioning mechanism to already download the swi in a memory file system.
This was not properly handled by the boot0 script.
It now properly detect if the image is on a tmpfs or a ramfs and keep it there if that is the case.
- How I did it
Check the filesystem on which the SWI pointed by swipath lies.
If this filesystem is a ramfs or a tmpfs the move_swi_to_tmpfs becomes a no-op.
Made sure the cleanup logic would not behave unexpectedly.
- How to verify it
In SONiC:
Download the swi under /tmp and makes sure it gets moved to /tmp/tmp-swi which gets mounted for that purpose.
Make sure /tmp/tmp-swi gets unmounted once the install process is done.
Create a new mountpoint under /ram using either ramfs or tmpfs and download the swi there.
Install the swi using sonic-installer and makes sure the image doesn't get moved by looking at the logs.