Problem:
By default, groupadd for redis takes gid 1000. This forces the subsequently created admin group to get gid 1001.
As all TACACS users are created with 1000 as their gid, they end up in the redis group.
Fix:
Create redis group *after* admin group is created
Add a check that admin group id is 1000
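The gid check can be sketched as follows. This is a minimal illustration in Python, not the build script's actual code: `find_gid`, `admin_gid_is_expected` and `EXPECTED_ADMIN_GID` are hypothetical names, and the sample text stands in for the contents of `/etc/group`.

```python
EXPECTED_ADMIN_GID = 1000  # gid that TACACS users are created with


def find_gid(group_file_text, group_name):
    """Return the numeric gid of group_name from /etc/group-style text, or None."""
    for line in group_file_text.splitlines():
        fields = line.strip().split(":")
        # /etc/group format: name:password:gid:member,member
        if len(fields) >= 3 and fields[0] == group_name:
            return int(fields[2])
    return None


def admin_gid_is_expected(group_file_text):
    """True only when the admin group exists and owns gid 1000."""
    return find_gid(group_file_text, "admin") == EXPECTED_ADMIN_GID


# Group order after the fix: admin is created first, redis afterwards.
sample = "root:x:0:\nadmin:x:1000:\nredis:x:1001:\n"
```

A build step could fail early when `admin_gid_is_expected` returns False, rather than letting TACACS users silently land in the wrong group.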
#### Why I did it
Plexus-utils before 3.0.16 is vulnerable to command injection because it does not correctly process the contents of double quoted strings.
#### How I did it
Upgrade to 3.0.16
The motivation of these changes is to fix (#6051):
- Why I did it
To fix CPU cstates configuration
- How I did it
Updated code to be POSIX compatible
- How to verify it
root@sonic:/home/admin# sonic_installer install sonic-mellanox.bin
Signed-off-by: Nazarii Hnydyn <nazariig@nvidia.com>
Feb 17 Fix tests failing due to duplicate vxlan tunnel creation (#75)
Mar 11 Update route api to specify limitation (#77)
Apr 01 Add host_ifname field while adding entry in VLAN table (#80)
Fix the following issues:
Spectrum-2, Spectrum-3 | Port | Fix link issue when using 25 GbE rate between two ports while one is on Spectrum-2-based system and the other is on Spectrum-3-based system
All | warmboot | fail to upgrade from earlier SONiC versions with official SDK/FW 4.4.2306 (was on SONiC 201911)
All | What-Just-Happened | When enabling or disabling WJH under high traffic load to the host CPU, in very specific and low probability conditions, an error could occur, that may result in loss of data, channel failure or in extreme cases SW failure
Signed-off-by: Volodymyr Samotiy <volodymyrs@nvidia.com>
Make sure Everflow always gets classified as Mirror table and not as Control Plane on multi-asic platforms.
Why I did:
On multi-asic platforms we generate Everflow ACL table data from the minigraph for both the host and the namespaces.
In a multi-asic minigraph with no external port-channels (only router port IP interfaces), the Everflow table has no bound interface on the host and gets classified as a Control Plane ACL, while in the namespace it gets classified as a Mirror table.
ACL rule generation reads the global DB as the source of truth for ACL table information, so if the Everflow table is classified as Control Plane there, we can generate rules with an invalid action, causing orchagent to throw a runtime error.
How I did:
If the table is attached to an erspan interface in the minigraph, it is always classified as a mirror table.
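The classification rule can be sketched as below. This is an illustrative simplification, not the minigraph parser's code: `classify_acl_table` and its parameters are hypothetical, and the final fallback type is a placeholder for whatever the parser would otherwise choose.

```python
def classify_acl_table(bound_interfaces, attached_to_erspan):
    """Pick an ACL table type for one table parsed from the minigraph.

    bound_interfaces: list of interface names the table is bound to.
    attached_to_erspan: True when the minigraph attaches the table to erspan.
    """
    # The fix: an erspan attachment wins, even with no bound interfaces.
    if attached_to_erspan:
        return "MIRROR"
    # Behavior described above: no bound interface means control plane.
    if not bound_interfaces:
        return "CTRLPLANE"
    # Placeholder for the remaining data-plane table types (assumption).
    return "L3"
```

With this ordering, the host and the namespaces agree on the Everflow table type even when the host has no interfaces bound to it.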
ecc1f9b1bb0ad18843e0f969fe8564cf37bf2080 (HEAD -> 201911, origin/201911)
[acl_loader]: add iptype match to the rules for dataplane acl
Signed-off-by: Abhishek Dosi <abdosi@microsoft.com>
ad9022ebf9c13b59ef8dc47aaa1f89628e64315e (HEAD -> 201911, origin/201911) Reduce time taken by show commands on multi-asic platforms (#1544)
4993a3644bff689701aac2ee2b10c351a9d241ef [fast-reboot]: Fix fail to execute fast-reboot problem (#1047)
Signed-off-by: Abhishek Dosi <abdosi@microsoft.com>
On S6000 devices the cold reboot is abrupt and is likely to cause issues that land the device in the EFI shell. Hence the platform reboot now happens after a graceful unmount of all the filesystems, as on the S6100.
Bug fixes
- Removing critical thermal zones to prevent unexpected software system shutdown:
  Kernel 4.9 - 0071-mlxsw-core-Remove-critical-trip-point-from-thermal-z.patch
  Kernel 4.19 - 076-mlxsw-core-Remove-critical-trip-point-from-thermal-z.patch
- hw-mgmt: thermal: Add hardcoded critical trip point
- Removing redundant link for cpld3 for fixed systems (SN2100, SN2010).
- Fix an issue with a missed attribute for cpld3 (port CPLD) for SN2700, SN2410.
Signed-off-by: Stephen Sun <stephens@nvidia.com>
To run VNET route consistency check periodically.
For any failure, the monit will raise alert based on return code.
The tool will log required details.
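A check of this shape might look like the sketch below. It assumes the consistency check compares the routes the application expects against the routes actually programmed; the function name, the two input collections and the exit codes are all hypothetical, and the real tool's data sources are not described here.

```python
def check_vnet_routes(expected_routes, programmed_routes):
    """Compare two route collections and return (return_code, report).

    return_code is 0 when both sides match and 1 otherwise, so a
    monitor such as monit can alert on a non-zero exit status.
    """
    expected = set(expected_routes)
    programmed = set(programmed_routes)
    report = {
        "missing": sorted(expected - programmed),      # expected but not programmed
        "unexpected": sorted(programmed - expected),   # programmed but not expected
    }
    return_code = 0 if not (report["missing"] or report["unexpected"]) else 1
    return return_code, report
```

The report dictionary is what a tool could write to its log before exiting with the return code.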
#### Why I did it
The speed configuration in the SAI profile files has the wrong bitmap value for the 10/50G speed option.
#### How I did it
Fix to the correct value for all SPC1 devices.
#### How to verify it
With this fix applied, configure ports on these platforms with 10/50G speed.
Backport of https://github.com/Azure/sonic-buildimage/pull/7031 to the 201911 branch
#### Why I did it
To enable parsing the `AutoNegotiation` element from the LinkMetadata section of minigraph file
#### How I did it
Parse the value of the `AutoNegotiation` element from the `LinkMetadata` section of the minigraph file. If the element is present, an `autoneg` key will be added to the port in the `PORT` table of Config DB with a value of either `0` or `1`.
If an `autoneg` value is present in port_config.ini, the value from the minigraph will take precedence, overriding that value.
Also remove `AutoNegotiation` and `EnableAutoNegotiation` elements from the `DeviceInfo` section, as we will use this data in the `LinkMetadata` section to determine whether to enable auto-negotiation for a port.
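The precedence between port_config.ini and the minigraph can be sketched as below. This is a simplified illustration, not the parser's code: `apply_autoneg` is a hypothetical helper, the port entry is a plain dict standing in for a `PORT` table row, and the minigraph value is assumed to have already been parsed into `True`, `False`, or `None` (element absent).

```python
def apply_autoneg(port_entry, minigraph_autoneg):
    """Return a copy of port_entry with the minigraph autoneg setting applied.

    port_entry: dict for one port, possibly carrying 'autoneg' from port_config.ini.
    minigraph_autoneg: True/False when LinkMetadata has the element, else None.
    """
    entry = dict(port_entry)
    if minigraph_autoneg is not None:
        # The minigraph value overrides anything read from port_config.ini.
        entry["autoneg"] = "1" if minigraph_autoneg else "0"
    return entry
```

When the element is absent, the entry is returned unchanged, so a value from port_config.ini survives and a port with no setting gets no `autoneg` key.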
Why I did it
It was observed that on multi-asic DUT bootup, the internal BGP sessions between ASICs were taking longer to reach ESTABLISHED than the external BGP sessions. The internal sessions were coming up almost exactly 120 secs later.
On a multi-asic platform the bgp dockers (one per ASIC) are brought up at around the same time on switch start, and they try to establish bgp sessions with neighbors (in peer ASICs) which may not be completely up. This results in a BGP connect failure, and the retry happens after 120 sec, which is the default Connect Retry Timer.
How I did it
Add the command to set the bgp neighboring session retry timer to 10sec for internal bgp neighbors.
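As a rough illustration, the per-neighbor configuration lines could be generated as below. The sketch assumes the FRR `neighbor <peer> timers connect <seconds>` form; the helper name, the constant and the example addresses are hypothetical, and the actual change may be made in a configuration template instead.

```python
INTERNAL_CONNECT_RETRY_SECS = 10  # replaces the 120 sec default for internal peers


def internal_neighbor_timer_lines(internal_neighbors, retry_secs=INTERNAL_CONNECT_RETRY_SECS):
    """Build one connect-retry line per internal BGP neighbor address."""
    return [
        " neighbor {} timers connect {}".format(address, retry_secs)
        for address in internal_neighbors
    ]


# Example addresses are made up for illustration.
lines = internal_neighbor_timer_lines(["10.1.0.1", "10.1.0.3"])
```

External neighbors are left out of the list, so they keep the default timer.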
Included commits in sonic-py-swsssdk
```
63c75c1 2021-03-14 | Workaround Mellanox default vlan has no SAI_VLAN_ATTR_VLAN_ID attribute (#103) [Qi Luo]
```
Included commits in sonic-snmpagent
```
a8c6e36 2021-03-15 | Implement rfc4363 FdbUpdater for lag inside vlan (#204) [Qi Luo]
```
It is possible to have a DHCP relay configuration with no servers/helpers,
which causes the DHCP container to crash. This PR fixes this
issue by not starting DHCP relay for vlans with no DHCP helpers.
resolves: #6931
closes: #6931
Do not add program group for dhcp relay with no dhcp helpers
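The filtering idea can be sketched as below. This is not the container's template code: `vlans_needing_relay` is a hypothetical helper, and the VLAN table is modelled as a plain dict in which each vlan may carry a `dhcp_servers` list.

```python
def vlans_needing_relay(vlan_table):
    """Return the vlan names that have at least one DHCP helper configured."""
    return sorted(
        name
        for name, attrs in vlan_table.items()
        # A missing key and an empty list both mean "no helpers".
        if attrs.get("dhcp_servers")
    )


vlan_table = {
    "Vlan1000": {"vlanid": "1000", "dhcp_servers": ["192.0.2.1", "192.0.2.2"]},
    "Vlan2000": {"vlanid": "2000", "dhcp_servers": []},
    "Vlan3000": {"vlanid": "3000"},
}
```

Only the vlans returned here would get a relay program entry; the others are skipped so no relay process starts with an empty helper list.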
Unit test
d81828c6740f2d4fca59fe3ec1d0adb1088a9dbb (HEAD -> 201911, origin/201911) Updated lldpRemManAddrTable to use all the management ip address associated with interface. (#201)
093a3c2c5bc688ddc5e5362dc657f19175e12ce8 Fix fdb_vlanmac() on corner cases (#193)
Signed-off-by: Abhishek Dosi <abdosi@microsoft.com>
- Why I did it
To pick up new features and fixes from SDK/FW and SAI
SDK/FW new features:
All | Added support for multiple modules and cable types. For full list contact Nvidia networking support
Spectrum-3 | SN46000C | Added support for up to 5W on ports 49 to 64.
SDK/FW bug fixes:
All | fast reboot | fast boot failure from latest 201811 to 201911 and above
Spectrum | 10GbE/1GbE Transceiver (FTLX8574D3BCV) stopped working after firmware upgrade
Spectrum-2 | When device is rebooted with locked Optical Transceivers in split mode, the firmware may get stuck
Spectrum-2 | SN3700 | When connecting at 200GbE to Ixia K400, Ixia receives CRC errors
Spectrum-2 | SN3800 | On rare occasions packet loss may be experienced due to signal integrity issues
Spectrum-2 | When the port is a member of a LAG, after a warmboot and port toggle on the peer-side, the port remains down
Spectrum-3 | SN4700 | While using Optic cable in Split 4x1 mode in PAM4, when two first ports are toggled, the other 2 ports go down
Spectrum-3 | SN4700 | When working in 400GbE, deleting the headroom configuration (changing buffer size to zero) on the fly may cause continual packet drops
SAI
All | Counters | Update tunnel decap counter to capture VNI miss
- How I did it
Update the related version number in the make files and update the submodule pointer accordingly.
- How to verify it
Run the regression tests and confirm that everything passes.
Included commits in sonic-py-swsssdk repo
```
4e0c561 2019-11-19 | read portchannel name from LAG_NAME_MAP_TABLE in COUNTERS_DB (#51) [anilkpandey]
```
Included commits in sonic-snmpagent repo
```
02dc2ce 2021-03-12 | add mock tables for LAG_NAME_MAP_TABLE in COUNTERS_DB (#202) [Qi Luo]
```
Why I did it
To monitor the SSD health condition on the DellEMC S6100 platform post upgrade.
A daemon is introduced to monitor the SSD every hour.
To check for SSD status at boot time and at the time of cold-reboot.
All these changes are supported only for newer SSD firmware.
Added a platform_reboot_pre_check script to prevent cold-reboot based on SSD status.
Depends on Azure/sonic-utilities#1472
DO NOT MERGE UNTIL ABOVE PR IS MERGED
Closes issue #6982.
The issue was root-caused to our using the unix socket as the default mechanism for reading from the DB (#5250). The redis unix socket is created as follows.
admin@str--acs-1:~$ ls -lrt /var/run/redis/redis.sock
srwxrw---- 1 root redis 0 Mar 6 01:57 /var/run/redis/redis.sock
So it used to work fine for the user "root" or for a user who is part of the redis group (admin was made part of the redis group by default).
If the user has sudo permissions, use the redis unix socket; otherwise fall back to the tcp socket.
This PR is a cherry-pick from master
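The selection logic can be sketched as follows. This is an illustration only: it assumes that "sudo permissions" is detected as an effective uid of 0, and the function name, the returned keys and the TCP host and port (Redis's usual defaults) are assumptions rather than the library's actual interface.

```python
import os

REDIS_UNIX_SOCKET = "/var/run/redis/redis.sock"  # owned by root:redis, mode srwxrw----


def redis_connection_params(euid=None, tcp_host="127.0.0.1", tcp_port=6379):
    """Choose how to reach redis based on the caller's effective uid."""
    if euid is None:
        euid = os.geteuid()
    if euid == 0:
        # Root (including commands run through sudo) can open the unix socket.
        return {"unix_socket_path": REDIS_UNIX_SOCKET}
    # Other users may not be able to open the socket, so fall back to TCP.
    return {"host": tcp_host, "port": tcp_port}
```

Passing `euid` explicitly keeps the function testable without changing the process's privileges.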
https://github.com/Azure/sonic-buildimage/pull/6920
Why I did it
Add support for BGP Monitors on multi asic SONiC platforms.
How I did it
On multi ASIC SONiC platforms, BGP monitor session will be established from Backend ASIC.
To achieve this, the following changes are made:
Add BGP monitor configuration on the backend ASIC.
The BGP monitor configuration is present in the DPG of the device in minigraph.xml of multi-ASIC device, so this configuration will be added to the config_db of the host, when the minigraph is loaded.
To add configuration for this in the Backend ASIC, a new class MultiAsicBgpMonCfg is added to the hostcfgd service to update the config_db of the backend ASIC when the BGP_MONITOR table of the host config_db is updated.
This way incremental BGP_MONITOR configuration can also be handled.
Changes to establish the BGP session with the BGP monitor:
Add a route in the host main routing table to go to one of the pre-defined backend asics.
Add an IP table rule on the front asic to mark the BGP packets with destination as the IPv4 Loopback.
Add an IP rule in the front asic namespace to match the marked BGP packets and look up the default table.
Program the default route in the default table of the FrontEnd asic namespace docker as part of start.sh of the BGP container.
This needs to be done as part of start.sh; otherwise the FRR default route will get overwritten.
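The hostcfgd step described above, where host BGP_MONITOR updates are propagated to the backend ASIC, can be sketched with in-memory dictionaries. `BgpMonMirror` is a hypothetical stand-in, not the `MultiAsicBgpMonCfg` class itself; the namespace name and the monitor fields are made up, and real code would write to each backend ASIC's config_db rather than to a dict.

```python
class BgpMonMirror:
    """Mirror host BGP_MONITOR table changes into backend ASIC databases."""

    TABLE = "BGP_MONITOR"

    def __init__(self, backend_dbs):
        # backend_dbs maps a namespace name to a dict of table -> key -> fields.
        self.backend_dbs = backend_dbs

    def on_update(self, key, data):
        """Apply one host-side change; data=None means the entry was deleted."""
        for db in self.backend_dbs.values():
            table = db.setdefault(self.TABLE, {})
            if data is None:
                table.pop(key, None)
            else:
                table[key] = dict(data)


backend = {"asic4": {}}  # hypothetical backend namespace
mirror = BgpMonMirror(backend)
mirror.on_update("203.0.113.10", {"name": "BGPMonitor", "asn": "65100"})
```

Because each change is applied as it arrives, incremental additions and deletions are handled the same way as the initial load.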
How to verify it
Signed-off-by: Abhishek Dosi <abdosi@microsoft.com>
Co-authored-by: Arvind <arlakshm@microsoft.com>