Commit Graph

4871 Commits

Author SHA1 Message Date
Kebo Liu
33cb83cbd1 [Mellanox] Align PSU name convention returned from psu.get_name platform API (#7783)
Make PSU name returned from platform API aligned with the convention "PSU {X}" instead of "PSU{X}".
2021-06-07 06:02:32 +00:00
Renuka Manavalan
32e5137ab7 Add service to restore TACACS from old config (#7560)
Why I did it
In upgrade scenarios, where config_db.json is not carry forwarded to new image, it could be left w/o TACACS credentials.
Added a service to trigger 5 minutes after boot and restore TACACS, if /etc/sonic/old_config/tacacs.json is present.

How I did it
By adding a service, that would fire 5 mins after boot.
This service apply tacacs if available.

How to verify it
Upgrade and watch status of tacacs.timer & tacacs.service
You may create /etc/sonic/old_config/tacacs.json, with updated credentials
(before 5mins after boot) and see that appears in config & persisted too.

Which release branch to backport (provide reason below if selected)
 201911
 202006
 202012
2021-06-07 06:02:32 +00:00
mssonicbld
1e9cb30008
[ci/build]: Upgrade SONiC package versions (#7770) 2021-06-05 14:03:04 +00:00
Volodymyr Samotiy
754e4fea17
[Mellanox] Update SDK to 4.4.3106 and FW to xx.2008.3106 (#7785)
Signed-off-by: Volodymyr Samotiy <volodymyrs@nvidia.com>
2021-06-03 19:18:07 -07:00
Myron Sosyak
b65b59c1b5 [BFN] Enable syncd-rpc build (#7646)
Why I did it
To enable syncd-rpc for Barefoot build

How I did it
Set the flag

How to verify it
ENABLE_SYNCD_RPC=y make configure PLATFORM=barefoot
ENABLE_SYNCD_RPC=y make all
2021-06-03 12:13:42 +00:00
Volodymyr Boiko
6a90c09cc7 [barefoot][sonic-platform] Fix Fan.set_speed (#7763)
Fixed typo

Signed-off-by: Volodymyr Boyko <volodymyrx.boiko@intel.com>
2021-06-03 12:13:42 +00:00
Volodymyr Boiko
202c31ebbe [barefoot][platform] Support fans and thermal (#7004)
Add support for fans and thermals to sonic-platform package for Montara platform

Signed-off-by: Volodymyr Boyko <volodymyrx.boiko@intel.com>
2021-06-03 12:13:42 +00:00
Myron Sosyak
e7009513da [docker-database] Fix Python3 issue (#7700)
#### Why I did it
To avoid the following error
```
Traceback (most recent call last):
  File "/usr/local/bin/flush_unused_database", line 10, in <module>
    if 'PONG' in output:
TypeError: a bytes-like object is required, not 'str'
```
`communicate` method returns the strings if streams were opened in text mode; otherwise, bytes.
In our case text arg  in Popen is not true and that means that `communicate` return the bytes
#### How I did it
Set `text=True` to get strings instead of bytes
#### How to verify it
run `/usr/local/bin/flush_unused_database` inside database container
2021-06-02 02:39:31 +00:00
bingwang-ms
eb8c05c306 Fix lldpmgrd syntax issue (#7742)
Signed-off-by: bingwang <bingwang@microsoft.com>
2021-06-02 02:39:31 +00:00
Lawrence Lee
6a0e9078d4 [docker-orchagent]: Increase ndppd kernel poll interval (#7456)
Why I did it
ndppd by default reads /proc/net/ipv6_route ever 30 seconds. Since T1s advertise so many routes to ToRs, this file is extremely large, and reading it causes ndppd's CPU usage to spike every 30 seconds

How I did it
Increase the delay for reading this file to the maximum possible value (max integer value), which will result in CPU spikes every ~24 days instead of every 30 seconds

How to verify it
Start ndppd with the new config file, confirm that no CPU spikes are seen except at startup

Signed-off-by: Lawrence Lee <lawlee@microsoft.com>
2021-06-02 02:38:54 +00:00
novikauanton
556bb62db4
added barefoot to job filters (#7771)
Build failed with error "Failed to verify the package"
The reason of the fail is version verification
Version manager didn't find dependencies for the barefoot platform
2021-06-02 09:18:00 +08:00
mssonicbld
eddce4d58b
[ci/build]: Upgrade SONiC package versions (#7755) 2021-05-31 13:02:58 +00:00
Stephen Sun
d387d75420 [Mellanox] Support buffer configuration for 2km cables (#7337)
#### Why I did it
Support 2km cables for Microsoft SKUs

#### How I did it
1. Update pg_profile_lookup.ini with 2000m cable supported
2. Update buffer configuration for t1 with uplink cable 2000m
  - For SN3800 platform:
    - C64:
      - t0: 32 100G down links and 32 100G up links.
      - t1: 56 100G down links and 8 100G up links with 2 km cable.
    - D112C8: 112 50G down links and 8 100G up links.
    - D24C52: 24 50G down links, 20 100G down links, and 32 100G up links.
    - D28C50: 28 50G down links, 18 100G down links, and 32 100G up links.
  - For SN2700 platform:
    - D48C8: 48 50G down links and 8 100G up links.
    - C32:
      - t0: 16 100G down links and 16 100G up links.
      - t1: 24 100G down links and 8 100G up links with 2 km cable.
  - For SN4600C platform:
    - D112C8: 112 50G down links and 8 100G up links.

#### How to verify it
Run regression test
2021-05-31 04:39:59 +00:00
Neetha John
b0f3ecb5cf Rename AristaQX-32S skus (#7751)
This PR contains the following changes
Original Arista-7050-QX-32S sku (32x40G ports) has been renamed to Arista-7050QX32S-Q32
Arista-7050-QX-32S is symlinked to Arista-7050QX-32S-S4Q31 (4x10G, 31x40G ports)

Signed-off-by: Neetha John <nejo@microsoft.com>
2021-05-31 04:38:19 +00:00
Neetha John
20b7654389 [minigraph] Parse bandwidth for DeviceMgmtLinks (#7744)
Why I did it
The current code skips parsing bandwidth for DeviceMgmtLinks. We have a use case to set the speed for these type of links based on the bandwidth attribute in the minigraph

How to verify it
Ran sonic-cfggen on a minigraph and verified that interface of type DeviceMgmtLink has speed set in the PORT table from the bandwidth attribute in the minigraph
2021-05-31 04:38:18 +00:00
Wirut Getbamrung
85fccfe8bf [device/celestica]: Fix remaining failed test cases of Seastone-DX010 platform API (#7743)
**- Why I did it**
- To fix failed test cases of Seastone-DX010 platform APIs that found on [platform_tests](https://github.com/Azure/sonic-mgmt/tree/master/tests/platform_tests/api) script

**- How I did it**
1. Add device/celestica/x86_64-cel_seastone-r0/platform.json 
2. Update functions to support python3.7
3. Add more functions follow latest sonic_platform_base
4. Fix the bug
2021-05-31 04:38:18 +00:00
yozhao101
3af05fdffe [Monit] Restart telemetry container if memory usage is beyond the threshold (#7645)
Signed-off-by: Yong Zhao yozhao@microsoft.com

Why I did it
This PR aims to monitor the memory usage of streaming telemetry container and restart streaming telemetry container if memory usage is larger than the pre-defined threshold.

How I did it
I borrowed the system tool Monit to run a script memory_checker which will periodically check the memory usage of streaming telemetry container. If the memory usage of telemetry container is larger than the pre-defined threshold for 10 times during 20 cycles, then an alerting message will be written into syslog and at the same time Monit will run the script restart_service to restart the streaming telemetry container.

How to verify it
I verified this implementation on device str-7260cx3-acs-1.
2021-05-31 04:38:18 +00:00
Junchao-Mellanox
74216f8710 [Mellanox] clear fan from chassis._fan_list (#7682)
#### Why I did it

According to thermalctld hld, each fan must belong to a fan drawer, if the fan drawer does not physically exist, put fan into a virtual fan drawer. This PR is to clear fan from chassis._fan_list

#### How I did it

1. Don't put fan to chassis._fan_list
2. Always query fan from fan_drawer
2021-05-31 04:32:40 +00:00
mssonicbld
71d4b17ad0
[ci/build]: Upgrade SONiC package versions (#7746) 2021-05-29 14:22:37 +00:00
zzhiyuan
39a4cd9c3a
[202012][Arista] Update Arista submodule to include pmbus fix (#7737)
Why I did it
Microsoft reported occasional daemon crashes on devices running 201911. On close inspection it was due to PMBus reads failing on IOError on very rare occasions.

This is the fix for 202012 branch.

How I did it
Add try/except block on performing reads on PMBus GPIOs.

How to verify it
Which release branch to backport (provide reason below if selected)

Co-authored-by: Zhi Yuan (Carl) Zhao <zyzhao@arista.com>
2021-05-28 15:43:30 -07:00
jostar-yang
bedcc44cb7
[as7726-32x] Support API2.0 (#7729)
Add platform API 2.0 support for as7726-32x platform

Signed-off-by: Jostar Yang <jostar_yang@accton.com.tw>
2021-05-28 12:23:20 -07:00
Ying Xie
8fc68f9781
[202012][swss][utilities] advance submodule heads (#7739)
sonic-utilities:
* 8b98d45 2021-05-25 | [show] support for show muxcable firmware version of only active banks (#1629) (HEAD -> 202012) [vdahiya12]
* afd0975 2021-05-20 | [show] add support for muxcable metrics (#1615) [vdahiya12]

sonic-swss
* 7611df5 2021-05-27 | [tunneldecaporch] Set default MTU for the overlay loopback interface (#1756) (HEAD -> 202012) [Volodymyr Samotiy]
* 22fbb5c 2021-05-27 | [202012] Resolve neighbor when nexthop does not exist (#1759) (github/202012) [Shi Su]
* ec7710c 2021-05-27 | [Bulk mode] Limit the size of bulker (#1760) [Shi Su]

Signed-off-by: Ying Xie <ying.xie@microsoft.com>
2021-05-28 08:41:09 -07:00
Kebo Liu
babaaaad6b [Mellanox] Add support for MSN4600 A1 system (#7732)
Add new sensor conf for MSN4600 A1 system
Add a Mellanox hw-management patch to support MSN4600 A1 system
2021-05-27 22:30:39 +00:00
Kebo Liu
7ca8b458ee update hw-mgmt version to 2304 (#7725)
- Why I did it
Pick up fix from new hw-management package:
Fix gearbox thermal zone name, which was lack suffix thermal zone number

- How I did it
Update the hw-management version number in the make file
Update hw-management submodule pointer

- How to verify it
Run platform related test cases on Mellanox platform
2021-05-27 22:30:33 +00:00
Ying Xie
aa445bec9a [MMU] define T1 MMU configuratino for Arista-7260CX3-Q64 (#7718)
Why I did it
Arista-7260CX3-Q64 is missing T1 MMU configuration.

How I did it
Define T1 MMU configuration for Arista-7260CX3-Q64.

Signed-off-by: Ying Xie <ying.xie@microsoft.com>
2021-05-27 22:30:15 +00:00
Prince Sunny
7a816ed5fa [Mux] Do not clean-up HW_MUX_CABLE_TABLE from State DB (#7710)
Co-authored-by: Ubuntu <prsunny@prince-vm.vzw1i4tqyeburcdz5lrgulxi2c.yx.internal.cloudapp.net>
2021-05-27 22:30:06 +00:00
Renuka Manavalan
cd7d04d08a K8S handles hostname in lower case (#7694)
Why I did it
k8s handles in lower case, so the code ensures that it uses hostname in all lower case

How I did it
Wrapper for device_info.get_hostname that returns in lower case. This wrapper is used in all places that require hostname to use in kubectl commands.

How to verify it
Device joins successfully.
2021-05-27 22:29:55 +00:00
bingwang-ms
c5d27750f2 Fix supervisor-proc-exit-listener startup issue in restapi (#7681)
* Fix supervisor-proc-exit-listener startup issue in restapi

Signed-off-by: bingwang <bingwang@microsoft.com>
2021-05-27 22:29:42 +00:00
Lawrence Lee
cb3a9eec58 [swss.service]: Remove ordering with pmon (#7614)
Signed-off-by: Lawrence Lee <lawlee@microsoft.com>
2021-05-27 22:29:35 +00:00
Alexander Allen
bd6096a018 [ntp] Fix ntp.conf template to allow setting of source port in CONFIG_DB (#7586)
Why I did it
Currently, there is a bug in the ntp.conf jinja2 template where it will ignore the src_intf directive in CONFIG_DB if there are multiple IP addresses associated with an interface. This code change fixes that bug and allows the template to select the correct source interface for NTP.

How I did it
I did this by modifying the macro in ntp.conf.j2 which determines if there is an ip address associated with an interface to set a state variable when it detects a valid interface entry in CONFIG_DB instead of outputting "true" directly (which could result in multiple "trues" outputted for interfaces with multiple valid IP addresses).

How to verify it
Add two ipv4 addresses to an interface in SONiC

Add the following configuration to config_db.json

{
"NTP": {
    "global": {
        "src_intf": "Ethernet1"
        }
    }
}
Replace Ethernet1 with the interface name of the one you assigned the IP addresses to.

Run sudo config reload -y

Open /etc/ntp.conf and verify that the following line exists

...
interface listen Ethernet1
...
The interface specified should be the one set in the previous steps.

Description for the changelog
[ntp] Fix ntp.conf template to allow setting of source port in CONFIG_DB
2021-05-27 22:29:01 +00:00
Renuka Manavalan
53b3d378c7 Invoke disk check periodically. (#7374)
Why I did it
Helps with periodic scan of disk for RO state.
If found, this script makes transient fix and raise error message.
2021-05-27 22:28:44 +00:00
gechiang
5d29a2c2a6
[202012] BRCM SAI 4.3.3.5-3 Enable VFP-based subintf on td2 (#7728)
Why I did it
This SAI change is to Enable TD2 platforms to also be able to handle VFP-based sub-interfaces.
This is the feature that is also needed for TD2 platforms.

How to verify it
Guohan and Team has validated this feature change on TD2 platforms
2021-05-27 15:11:03 -07:00
Joe LeVeque
c67ead6558
[202012][sonic-platform-common] Update submodule (#7695) 2021-05-27 08:19:29 -07:00
Neetha John
9e3e769b89
[bgpcfgd] Enable BGP sessions over subinterfaces (#7654) (#7696)
Signed-off-by: Neetha John nejo@microsoft.com

Fixes #7531

Why I did it
To enable bgp sessions to be established over subinterfaces

How I did it
Listen to VLAN_SUB_INTERFACE table in config db

How to verify it
Bgp sessions were established successfully over subinterface
2021-05-27 08:19:09 -07:00
mssonicbld
5b5131f26d
[ci/build]: Upgrade SONiC package versions (#7701) 2021-05-26 13:58:58 +00:00
Kebo Liu
ef7ac729cc [Mellanox] Update the Spectrum-2 platform PSU sensor's label in the sensor conf file (#7706)
#### Why I did it
The label for PSU related sensors on the Spectrum-2 platform is not aligned with the physical location of the PSU. 

#### How I did it
Update the label in the sensor conf file for those relevant platforms

Signed-off-by: Kebo Liu <kebol@nvidia.com>
2021-05-26 02:41:16 +00:00
Volodymyr Boiko
0fa3404a62 [barefoot][device] Support pcied on Mavericks (#7705)
Add pcie.yaml to enable pcied on Mavericks platform

Signed-off-by: Volodymyr Boyko <volodymyrx.boiko@intel.com>
2021-05-26 02:41:08 +00:00
shlomibitton
c53f58e488 Remove 'vm.panic_on_oom=1' (#7678)
#### Why I did it
If a process limits using nodes by mempolicy/cpusets, and those nodes become memory exhaustion status, one process may be killed by oom-killer.
No panic occurs in this case, because other node's memory may be free.
This means system total status may be not fatal yet.

#### How I did it
Remove 'vm.panic_on_oom=1' kernel flag from 'vmcore-sysctl.conf '
2021-05-26 02:41:02 +00:00
Neetha John
55c798adf3 Update PG profile settings for Arista-7050QX-32S-S4Q31 (#7673)
Signed-off-by: Neetha John <nejo@microsoft.com>

Why I did it
PG profile settings need to be aligned with Arista-7050-QX-32S

How I did it
Copy over the current settings from Arista-7050-QX-32S and define params for 10G and 1G speeds as well
2021-05-26 02:40:44 +00:00
Neetha John
908e1a7524 [bgpcfgd] Enable BGP sessions over subinterfaces (#7654)
Signed-off-by: Neetha John nejo@microsoft.com

Fixes #7531

Why I did it
To enable bgp sessions to be established over subinterfaces

How I did it
Listen to VLAN_SUB_INTERFACE table in config db

How to verify it
Bgp sessions were established successfully over subinterface
2021-05-26 02:40:09 +00:00
LuiSzee
bc5b367d37 [radv] fix bug for radv can't startup if DEVICE_METADATA.localhost.type is NULL (#7651)
Co-authored-by: Shi Lei <shil@centecnetworks.com>
2021-05-26 02:39:02 +00:00
Sudharsan Dhamal Gopalarathnam
7a8e77d8be
[202012] FEC none config through minigraph (#7670)
When FECDisabled is set to true in minigraph.py, push 'fec' 'none' explicitly to config_db. When 'fec' is defined in port_config.ini do not override it with 'rs' for 100G

Backport of #7667 to 202012 branch.
2021-05-25 09:35:25 -07:00
Lior Avramov
38a29254a6
[sonic-snmpagent] submodule update (#7636)
Includes below commits:
748b545 2021-04-05 [SNMP] Update description of entPhysicalDescr mib in case interface is not configured.
2021-05-24 16:23:32 -07:00
Neetha John
ff1e5ef8b7 Update MMU and QOS settings for Arista-7050QX-32S-S4Q31 (#7672)
Signed-off-by: Neetha John <nejo@microsoft.com>

Why I did it
Need proper MMU and Qos settings for Arista-7050QX-32S-S4Q31

How I did it
Updated the settings based on Arista-7050-QX-32S
2021-05-24 22:27:27 +00:00
Aravind Mani
63354c0daa DellEMC: Z9332f fix SFP issues (#7677)
#### Why I did it
xcvrd crashes when the application advertisement capability flag is not seen for few transceivers.

#### How I did it
Initialize the additional application capability in dunder init
2021-05-24 22:27:26 +00:00
Myron Sosyak
4ec13fd556 Fix python version (#7658)
#### Why I did it
To avoid the following logs 
```
Mar 15 15:52:04.599302 igk-dut-04 INFO database#/supervisord: flushdb /bin/bash: /usr/local/bin/flush_unused_database: /usr/bin/python: bad interpreter: No such file or directory
Mar 15 15:52:04.599947 igk-dut-04 INFO database#supervisord 2021-03-15 15:52:04,599 INFO exited: flushdb (exit status 126; not expected)
```

#### How I did it
Fix  shebang
#### How to verify it
Check the logs
2021-05-24 22:25:47 +00:00
Neetha John
cb930c30cd [qos]: modify dot1p to tc mapping (#7661)
Map priority 0 to TC 1 and priority 1 to TC 0

Send traffic on priority 0 and 1 and verified that it gets mapped correctly in hw

Signed-off-by: Neetha John <nejo@microsoft.com>
2021-05-24 22:25:47 +00:00
xumia
2973a63f1d Fix the type issue in rvtysh (#7648)
Why I did it
Change the type issue in the command rvtysh
change PARA/para to PARAM/param
2021-05-24 22:25:47 +00:00
sudhanshukumar22
0a5551aabf docker-lldp:intermittent DB errors will result in Client termination (#6119)
This PR allows listen to hostname changes and mgmt ip changes.
2021-05-24 22:21:29 +00:00
xumia
dc44da4e0f Skip the web proxy when downloading from proxy endpoint (#7592)
Why I did it
Skip to use the web proxy when the packages have been in the proxy server.
For sai packages or the other packages, we will upload the the proxy server directly, the reproducible will skip to check the site, not necessary to change the version files.
2021-05-24 22:18:00 +00:00