Commit Graph

5098 Commits

Author SHA1 Message Date
tjchadaga
76def5c3a0 Fix TH3 Warm-reboot failure due to Tunnel termination SAI failure (#8395) 2021-08-11 04:12:46 -07:00
vmittal-msft
b0ea180fd4 Updated PGHeadroom settings for 400G speed (DellEMC-Z9332f-M-O16C64 & DellEMC-Z9332f-O32) (#8420)
Updated pg_profile_lookup.ini for both HWSKU to match with BRCM recommendation
2021-08-11 04:10:03 -07:00
Junchao-Mellanox
8285cf2329
[Mellanox] [202012] Upgrade hw-mgmt to 7.0100.2344 (#8408)
To support new PSU fan on Mellanox platforms
2021-08-11 02:04:55 -07:00
Aravind Mani
4629c302c0
<202012> Dell S6100: Monitor serial getty service (#8407)
Why I did it
The serial-getty service exited randomly on the Dell S6100 device.

How I did it
Added serial-getty to monit services.

How to verify it
Stop serial-getty in ssh session and check whether the service restarts or not.
2021-08-10 11:23:22 -07:00
roman_savchuk
a06cd18dbe
[BFN]: update bfnsdk package (#8350)
Signed-off-by: Roman Savchuk <romanx.savchuk@intel.com>
2021-08-10 08:19:37 -07:00
shlomibitton
e62ae02fb2
[hostcfgd] [202012] Delay hostcfgd for faster boot time (#8117)
#### Why I did it
hostcfgd starts at the same time as the 'create_switch' method is called in the orchagent process.
This degrades the execution time of that function, which eventually causes the fast-boot flow, and boot in general, to run slower (~6 seconds).
This change delays the start time of the daemon.
90 seconds was determined to be the maximum allowed downtime for the control plane to come back up in the fast-boot flow.

#### How I did it
Add a timer for hostcfgd service in order to delay the startup of this service.

#### How to verify it
Install an image with this change and observe the daemon start 90 seconds after the system boot.
2021-08-10 05:52:08 -07:00
lguohan
8b4f6943a9
[submodule]: update sonic-swss (#8374)
* 41dfaad 2021-08-02 | Bridge mac setting, fix statedb time format (#1844) (HEAD, origin/202012) [Prince Sunny]

Signed-off-by: Guohan Lu <lguohan@gmail.com>
2021-08-08 20:42:46 -07:00
lguohan
09b4e04f7e [build]: add debug info for dpkg frontend lock (#8375)
Print out the process that holds the dpkg frontend lock.

Signed-off-by: Guohan Lu <lguohan@gmail.com>
2021-08-08 20:41:44 -07:00
mssonicbld
1c30a8b0c1
[ci/build]: Upgrade SONiC package versions (#8376) 2021-08-08 19:05:13 +00:00
Sujin Kang
ae7fa32691
[pmon]: Enable Autorestart of the daemons in PMON for unexpected exit (#8358)
Enable Autorestart of the daemons in PMON for unexpected exit
Remove the daemon list from critical_process so that PMON does not
restart when an individual daemon crashes.
2021-08-07 22:43:38 -07:00
Guohan Lu
ceab083fc5 [build]: add sonic_release 202012
Signed-off-by: Guohan Lu <lguohan@gmail.com>
2021-08-07 18:04:28 -07:00
Guohan Lu
e38cc58bbc [build]: add branch and release name in sonic_version.yml
The branch refers to the name of the branch that the commit is in,
for example master, 202012, 201911, ...
In case there is no branch, the name will be HEAD.

The release is encoded in the /etc/sonic/sonic_release file.
The file is only available for a release branch.
It is not available in master branch.
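
A minimal sketch of how a consumer might read the release name, assuming the file holds just the release string (for example `202012`). The helper name `read_release` is hypothetical and not part of this commit; only the file path comes from the message above.

```python
def read_release(path="/etc/sonic/sonic_release"):
    """Return the release name, or None when the file is absent.

    Assumption: the file contains only the release string. On a master
    build the file does not exist, so None is returned.
    """
    try:
        with open(path) as release_file:
            return release_file.read().strip() or None
    except FileNotFoundError:
        return None


if __name__ == "__main__":
    print(read_release())
```

Returning `None` rather than raising lets callers treat "no release file" as the normal case for non-release branches.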

Signed-off-by: Guohan Lu <lguohan@gmail.com>
2021-08-07 18:04:28 -07:00
Longxiang Lyu
25f53289eb [swss][arp_update] Send ipv6 pings over vlan sub interfaces (#8363)
#### Why I did it
* `arp_update` fails to ping those neighbors over vlan sub interfaces.

#### How I did it
* modify `arp_update_vars.j2` to get vlan sub interfaces with ipv6 addresses assigned.
* modify `arp_update` to send ipv6 pings over those retrieved vlan sub interfaces.

Signed-off-by: Longxiang Lyu <lolv@microsoft.com>
2021-08-07 12:43:51 +00:00
Neetha John
66c8934d84 Revert "Revert "Update default cable len to 0m for TD2"" (#8354)
* Update default cable len to 0m for TD2 (#8298)
* Update sonic-cfggen tests with the correct cable len

Signed-off-by: Neetha John <nejo@microsoft.com>

As part of the buffer reclamation efforts for TD2, setting the default cable len to 0m which means unused ports will have a cable len of 0m.

Why I did it
To align with the changes in Azure/sonic-swss#1830

How to verify it
- With the default cable len set to 0m and the associated changes in swss, CABLE_LENGTH table had '0m' set for unused ports and accordingly more space was reserved for the shared pool
- Cfggen tests passed with the cable len update
2021-08-07 12:43:46 +00:00
Aravind Mani
7be487bcc8 DellEMC: Z9332f platform API changes (#8258)
Why I did it
The platform test suite failed for a few APIs on the DellEMC Z9332f platform.

How I did it
Modified the APIs to return the expected values in the script.

How to verify it
Run platform test suite after making the changes.
2021-08-07 12:43:40 +00:00
VenkatCisco
99d3e2767e Support L1 & L3 Config generation in SONiC (#7637)
Why I did it
This provides support for: PR #7074.

How I did it
Extend sonic-config-engine/config_samples.py to provide support for l1 & l3
2021-08-07 12:40:06 +00:00
Guohan Lu
db8cc247e0 [build]: Fix docker pull on armhf platform
armhf build uses native dockerd

Signed-off-by: Guohan Lu <lguohan@gmail.com>
2021-08-06 23:35:25 -07:00
Qi Luo
0162cee667
[sonic-swss-common][sonic-snmpagent] Update submodule (#8357)
Include below commits

sonic-swss-common
```
83d3351 2021-04-22 | [swig] fix ConfigDBConnector.db_name (#483) [Qi Luo]
fdf296f 2021-04-09 | Fix: ConfigDBConnector call super init with proper parameter name (#470) [Qi Luo]
4f580e3 2021-03-26 | [swig] translate SonicV2Connector::keys return type from C++ vector<string> to Python list (#468) [Qi Luo]
```

sonic-snmpagent
```
c160c2b 2021-08-04 | CPU Spike because of redundant and flooded keyspace notifis handled (#230) [Vivek Reddy]
a4dd3bf 2021-08-03 | Non-block reading counters to tolerate corrupted/delayed counters in COUNTERS_DB (#231) [Qi Luo]
```
2021-08-06 03:41:36 -07:00
Ying Xie
a21fdeb58c
[202012][utilities] advance sonic-utilities submodule head (#8355)
To include following changes:

* d84a8cc 2021-08-05 | [fast-reboot] revert the change of disabling counter polling before fast-reboot (#1744) (HEAD -> 202012, github/202012) [Ying Xie]
* e900bc5 2021-08-04 | Add script null_route_helper (#1718) [bingwang-ms]
* 85f14e1 2021-08-02 | disk_check updates: (#1736) [Renuka Manavalan]
* d68ac1c 2021-05-27 | [console][show] Force refresh all lines status during show line (#1641) [Blueve]
* a0e417f 2021-04-25 | [console] Display success message after line cleared (#1579) [Blueve]
* 0c6bb27 2021-04-07 | [console] Include Flow Control status in show line result (#1549) [Blueve]
2021-08-06 03:40:24 -07:00
gechiang
0f3f0c2a1a
[202012] BRCM SAI 4.3.5.1 Fix for TH3 FDB Flush Timeout (#8342)
This is to pick up BRCM SAI 4.3.5.1 which contains the following fix:
CS00012201406: [4.3.3.9] SAI_STATUS_FAILURE on FDB flush after all ports flapped

Preliminary tests look fine. BGP neighbors were all up with proper routes programmed,
and interfaces are all up.
Manually ran the following test cases on z9332f (TH3) T0 DUT and all passed:
```
     ipfwd/test_dir_bcast.py
     fib/test_fib.py
```
Manually ran the following test cases on S6100 (TH) and all passed:
```
     ipfwd/test_dir_bcast.py
     fdb/test_fdb.py
```
2021-08-05 19:03:06 -07:00
Arun Saravanan Balachandran
9b50d631ff DellEMC: Add pcie.yaml for Z9332f (#8329)
Why I did it
To support "pcied" and "pcieutil" commands in DellEMC Z9332f.

How I did it
Add 'pcie.yaml' in device/dell/[PLATFORM]/ directory.

How to verify it
Execute "pcieutil check" command.
Logs: UT_logs.txt
2021-08-06 02:00:04 +00:00
vmittal-msft
886846f719 Dell Z9332 systems optimized MMU settings for T0/T1 topology (#8341) 2021-08-06 01:59:59 +00:00
Samuel Angebault
99efd5346e
[202012][Arista] Update platform library submodules (#8339)
This PR only contains backports from master

Fix leak discovered on master; although 202012 is not affected, it is better to have the fix (fixes [master] thermalctld leak on Arista devices makes them unreachable when memory is exhausted #7515)
Fix EepromDecoder implementation in the platform API (fixes syseepromd crashing repeatedly on SONiC.20201231.02 #8263)
Fix Mineral platform definition and configuration
Fix build issues in environments where /proc is not mounted/restricted (fixes PLATFORM=broadcom fails arista "ReloadCauseManagerTest" first time #7800)
Fix some pytest issues
Add sfp-eeprom C API and also mount it in pmon
2021-08-05 18:35:31 -07:00
Blueve
d2f2a07c7c [ARM] Fix issue where the ping6 tool is missing from orchagent docker (#8345)
Signed-off-by: Jing Kan jika@microsoft.com
2021-08-05 15:25:53 +00:00
jusherma
4c0daa80c0 [build] Always use -j1 for libsnmp to avoid race condition (#8324)
I have been seeing intermittent (~40%) build failures with the same error described in PR https://github.com/Azure/sonic-buildimage/pull/6592, even with that fix present

```
/usr/bin/ld: mibgroup/ip-forward-mib/ipCidrRouteTable/.libs/ipCidrRouteTable_interface.o: file not recognized: file truncated
...
libtool:   error: 'mibgroup/ip-forward-mib/inetCidrRouteTable/inetCidrRouteTable_interface.lo' is not a valid libtool object
make[5]: *** [Makefile:1020: libnetsnmpmibs.la] Error 1
make[5]: *** Waiting for unfinished jobs....
```

#### How I did it

Use `-j1` for the libsnmp build regardless of the value of `$(MULTIARCH_QEMU_ENVIRON)`

#### How to verify it

Performed 10 builds of the libsnmp target (`target/debs/buster/libsnmp-base_5.7.3+dfsg-5_all.deb`) with and without this change. Without the change, hit the error 40% of the time. With the change did not see the error at all

Signed-off-by: Justin Sherman <jusherma@cisco.com>
2021-08-05 15:23:18 +00:00
VenkatCisco
3aed7eab8f [pmon]: add python3-jsonschema pmon (#8018)
jsonschema is an implementation of JSON Schema for Python.

Signed-off-by: Venkat Garigipati <venkatg@cisco.com>
2021-08-05 15:23:06 +00:00
VenkatCisco
cb8ff6dba1 [baseimage]: add j2cli to sonic_debian_extension.j2 (#8019)
j2cli provides access to the Jinja library. The Cisco platform.py requires j2cli to handle Jinja template configuration files.
2021-08-05 15:22:57 +00:00
DavidZagury
0551fed754 [Mellanox][Pcie] Fix issue on pcied with an id that contains only decimal digits was treated as a decimal number (#8309)
A device id that contains only decimal digits was mistakenly treated as a decimal integer, resulting in a failure to find it in the id-to-bus map.
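
The failure mode can be sketched as follows. This is an illustration only, not the code from the commit: the map contents, `normalize_device_id`, and `find_bus` are hypothetical, and it assumes device ids are 4-character hex strings used as dictionary keys.

```python
# Hypothetical id-to-bus map keyed by device id strings (illustrative values).
ID_TO_BUS = {"1021": "03", "cf6a": "07"}


def normalize_device_id(raw_id):
    """Return the device id as a lowercase, 4-character string.

    Assumption: an int input is a hex id whose digits were misread as a
    decimal number (1021 for "1021"), so its digits are kept unchanged.
    """
    return str(raw_id).lower().zfill(4)


def find_bus(raw_id):
    """Look up the bus for a device id, whatever type it arrived as."""
    return ID_TO_BUS.get(normalize_device_id(raw_id))


if __name__ == "__main__":
    print(ID_TO_BUS.get(1021))  # int key never matches the string key
    print(find_bus(1021))       # normalized lookup succeeds
```

Normalizing at the lookup boundary keeps the map itself keyed by strings, so ids such as `cf6a` and digits-only ids are handled the same way.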
2021-08-05 15:22:48 +00:00
vdahiya12
5e594043ce [pmon] create and mount firmware directory on PMON for firmware upgrade support on muxcable (#8283)
This PR creates a firmware directory on the host at /usr/share/sonic/firmware and mounts it in the
PMON container at the same path. This is required for firmware upgrade support for muxcable because,
by design, all Y-Cable APIs are currently called by xcvrd. For the CLI to transfer a file to PMON,
a directory must be mounted from the host into PMON to hold the firmware files.

Signed-off-by: vaibhav-dahiya <vdahiya@microsoft.com>
2021-08-05 15:22:41 +00:00
novikauanton
08dc00f817 [iccpd][docker] fix initial startup configuration (#7982)
#### Why I did it
The process of config generation (sonic-cfggen) fails, but the services continue to run with invalid config

#### How I did it
* exit with an error when the start.sh script hits an error (because supervisord relies on the start.sh return code).
* fix the jinja template. Jinja uses Python expressions under the hood, and the `has_key` method was removed from dict in py3, so check with the `in` operator, which is supported by both py2 and py3.
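
The `has_key` versus `in` difference can be sketched in plain Python. The dictionary below reuses values from the sample config in this message; `has_field` is a hypothetical helper, not a function from the PR.

```python
mclag_cfg = {"local_ip": "10.0.0.2", "peer_ip": "10.0.0.3"}


def has_field(cfg, name):
    """Membership test that works on both Python 2 and Python 3."""
    return name in cfg


if __name__ == "__main__":
    # dict.has_key() exists only on Python 2; on Python 3 this prints False.
    print(hasattr(mclag_cfg, "has_key"))
    print(has_field(mclag_cfg, "peer_ip"))
    print(has_field(mclag_cfg, "peer_link"))
```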
#### How to verify it
* compile sonic with enabled iccp. 
* add mclag config to CONFIG_DB. 
    ```
    'MC_LAG|1' => {
        "local_ip": "10.0.0.2",
        "peer_ip": "10.0.0.3",
        "peer_link": "Ethernet8",
        "mclag_interface": "Ethernet12"
    }
    ```
* unmask, enable and start swss and iccpd services in sonic.
* log in to the iccpd container and check the config file `/etc/iccpd/iccpd.conf`
* expected config:
    ```
    mclag_id:1
        local_ip:10.0.0.2
        peer_ip:10.0.0.3
        peer_link:Ethernet8
        mclag_interface:Ethernet12
    system_mac:YOUR_SYSTEM_MAC
    ```

#### Description for the changelog
Fixed initial iccpd startup configuration.
2021-08-05 15:21:33 +00:00
Shilong Liu
7aad616832
[build]: Enable reproducible build for git docker (#8331) 2021-08-04 09:59:02 -07:00
Guohan Lu
fa239270c1 Revert "Update default cable len to 0m for TD2 (#8298)"
This reverts commit af2024e567.
2021-08-04 08:40:36 -07:00
lguohan
cb50baeec2 [k8s]: disable http_proxy for docker by default (#8328)
Disable http_proxy for docker by default; we should not use a proxy unless one is configured.

Signed-off-by: Guohan Lu <lguohan@gmail.com>
2021-08-04 00:31:27 -07:00
Neetha John
af2024e567 Update default cable len to 0m for TD2 (#8298)
Signed-off-by: Neetha John <nejo@microsoft.com>

As part of the buffer reclamation efforts for TD2, setting the default cable len to 0m which means unused ports will have a cable len of 0m.

Why I did it
To align with the changes in Azure/sonic-swss#1830

How to verify it
With the default cable len set to 0m and the associated changes in swss, CABLE_LENGTH table had '0m' set for unused ports and accordingly more space was reserved for the shared pool
2021-08-03 09:58:46 +00:00
Vivek Reddy
1eaa951966 [Mellanox] [SKU] Fix the shared headroom for 4600C-C64 SKU (#8242)
Removed ingress_lossy_pool from the BUFFER_POOL list
Fix the egress_lossless_pool_size value

Signed-off-by: Vivek Reddy Karri <vkarri@nvidia.com>
2021-08-03 09:58:40 +00:00
mssonicbld
2a41adcdc1
[ci/build]: Upgrade SONiC package versions (#8303) 2021-08-02 02:21:04 +00:00
Neetha John
cb49c6522f
[submodule]: Advance swss submodule pointer (#8288)
This PR includes the following commits

a67d8af [202012][portsorch] fix errors when moving port from one lag to another (Azure/sonic-swss#1819)
04105a4 [debugcounterorch] check if counter type is supported before querying (Azure/sonic-swss#1789)
ac7f5cff Td2: Reclaim buffer from unused ports (Azure/sonic-swss#1830)
f54b7d0 [Dynamic Buffer Calc][202012]Bug fix: Don't create lossless buffer profile for active ports without speed configured  (Azure/sonic-swss#1820)

Signed-off-by: Neetha John <nejo@microsoft.com>
2021-07-30 17:09:35 -07:00
Nazarii Hnydyn
ada56abe6e
[submodule]: Advance sonic-swss & sonic-sairedis. (#8296)
sairedis:
*[recorder] Fix incorrect attribute enum value capability query (#843) d86b051
*[syncd] Fix fdb flood queue size limit check (#863) 3a2af76
*[vslib] implement query for SAI_DEBUG_COUNTER_TYPE enum values (#842) 575dcb4 

swss:
*[portsorch] fix errors when moving port from one lag to anoth… a67d8af
*[debugcounterorch] check if counter type is supported before querying… ( 04105a4
*Td2: Reclaim buffer from unused ports (#1830) ac7f5cf
*[Dynamic Buffer Calc][202012]Bug fix: Don't create lossless buffer pr… f54b7d0 

Signed-off-by: Nazarii Hnydyn <nazariig@nvidia.com>
2021-07-30 16:24:35 -07:00
Stephen Sun
f1c8a6ab96
[sonic-swss][202012] Update submodule (#8286)
Update submodule for swss

f54b7d0b [Dynamic Buffer Calc][202012]Bug fix: Don't create lossless buffer profile for active ports without speed configured (Azure/sonic-swss#1820)
ac7f5cff Td2: Reclaim buffer from unused ports (Azure/sonic-swss#1830)
04105a4b [debugcounterorch] check if counter type is supported before querying (Azure/sonic-swss#1789)
a67d8af6 [202012][portsorch] fix errors when moving port from one lag to another. (Azure/sonic-swss#1819)
2021-07-30 02:49:40 -07:00
vdahiya12
5651977f65
[sonic-utilities] submodule update (#8284)
This PR updates the following commits

a9606fb [show] fix show muxcable metrics <port> for sorted output (#1731)
7355016 [minigraph][port_config] Use imported config.main and add conditional patch (#1728)
cc1d6e4 [configlet] Python3 compatible syntax for extracting a key from the dict (#1721)

Signed-off-by: vaibhav-dahiya <vdahiya@microsoft.com>
2021-07-29 07:58:07 -07:00
mssonicbld
cdbcf7da47
[ci/build]: Upgrade SONiC package versions (#8256) 2021-07-28 08:44:24 +00:00
Qi Luo
9642ea0bb9
[sonic-swss-common] Update submodule (#8272)
Includes below commits
```
bf8c832 2021-07-22 | Fix DBInterface blocking operations (#505) (HEAD -> 202012, origin/202012) [Qi Luo]
0e9385f 2021-04-21 | [swig] Implement SonicV2Connector.hmset() (#480) [Qi Luo]
76be49f 2021-07-25 | Modify the hardcode separator ":" to variable. (#473) [PJHsieh]
142ae3c 2021-06-23 | Fix config prompt question issue (#500) [xumia]
e7bebe1 2021-06-14 | Fix repo reference issue (#487) [xumia]
```
2021-07-28 01:27:15 -07:00
Sujin Kang
a8a6d40a14
[Submodule update:202012] sonic-platform-common, sonic-platform-daemons (#8262)
sonic-platform-daemons:
* c90bb29 [202012] Refactor Pcied and add unittest (Azure/sonic-platform-daemons#199)

sonic-platform-common:
* 9a59e19 Revert "Unifying the platform api for get_pcie_aer_stats with PcieBase (Azure/sonic-platform-common#197)" (Azure/sonic-platform-common#207)
* d35960b Revert "Revert "Unifying the platform api for get_pcie_aer_stats with PcieBase (Azure/sonic-platform-common#197)" (Azure/sonic-platform-common#207)" (Azure/sonic-platform-common#210)
2021-07-27 03:49:34 -07:00
DavidZagury
45e100b61b [Mellanox][pcied] Ignore bus on pcie.yaml for Mellanox switches (#8063)
Why I did it
In rare cases, a BIOS upgrade cannot guarantee that the bus value remains the same across BIOS releases. This field is ignored so that pcied does not fail, while the device id is still verified in a different way. The solution is future-proof and will not require code changes when a new BIOS version is available.

How I did it
Since bus is not a fixed value (it is determined by the BIOS version), we ignore this field and instead check whether there is a device that matches on all other fields and, in addition, has a matching device id.
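
The matching rule described above can be sketched as below. This is not the Mellanox implementation: the function names are hypothetical, and the field names (`bus`, `dev`, `fn`, `id`) are assumed to follow the usual pcie.yaml entry layout.

```python
# Fields compared when matching; "bus" is deliberately left out.
MATCH_FIELDS = ("dev", "fn", "id")


def matches_ignoring_bus(expected, actual):
    """True when two device entries agree on every field except bus."""
    return all(
        str(expected[field]).lower() == str(actual[field]).lower()
        for field in MATCH_FIELDS
    )


def missing_devices(expected_devices, present_devices):
    """Return the expected entries that no present device matches."""
    return [
        entry
        for entry in expected_devices
        if not any(matches_ignoring_bus(entry, dev) for dev in present_devices)
    ]


if __name__ == "__main__":
    expected = [{"bus": "03", "dev": "00", "fn": "0", "id": "cf6a"}]
    present = [{"bus": "04", "dev": "00", "fn": "0", "id": "cf6a"}]
    print(missing_devices(expected, present))  # bus differs, still a match
```

Comparing values as lowercase strings avoids a mismatch when one side was parsed as a number, at the cost of not validating the field formats.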

How to verify it
Verify that there are no errors or failures in pcied on different BIOS versions with the same code base.
2021-07-27 10:46:31 +00:00
Christian Svensson
84dcc9d086 [DellEmc] Fix port lanes for 10G ports on alternative S5232 SKUs (#8208)
Backport the fix (444cede11) that was made for the default SKU to the alternative SKUs.

Signed-off-by: Christian Svensson <blue@cmd.nu>
2021-07-27 05:14:33 +00:00
Vivek Reddy
67202cc2bb autorestart inside restapi docker is disabled (#8006)
Fix issue with critical process in the restapi docker restarting immediately after getting killed
Signed-off-by: Vivek Reddy Karri <vkarri@nvidia.com>
2021-07-27 05:14:28 +00:00
Dror Prital
be6cd44ddf Update SDK\FW to version 4.4.3222\2008.3224 (#8247)
*Update SDK\FW Version to 4.4.3222\2008.3224.

Signed-off-by: Dror Prital <drorp@nvidia.com>
2021-07-26 11:05:29 -07:00
jostar-yang
4eab1514ec
[AS5835-54X] Support system-health and remove extra code (#8137)
Co-authored-by: Jostar Yang <jostar_yang@accton.com.tw>
2021-07-24 18:35:06 -07:00
mssonicbld
814865b2c5
[ci/build]: Upgrade SONiC package versions (#8206) 2021-07-24 12:27:11 +00:00
Wirut Getbamrung
d191dcd2e9
[device/celestica]: Add thermalctld support on Haliburton platform APIs (#6493) (#8217)
- Why I did it

The thermalctld daemon on the Pmon docker requires support from the thermal manager API
- How I did it
Cherry picked from : cfda77b
Removed the old function for detecting a faulty fan.
Removed the old function for detecting excess temperature.
Implement thermal_manager APIs based on ThermalManagerBase
Implement thermal_conditions APIs based on ThermalPolicyConditionBase
Implement thermal_actions APIs based on ThermalPolicyActionBase
Implement thermal_info APIs based on ThermalPolicyInfoBase
Add thermal_policy.json

- How to verify it
Check the fan speed during temperature changes.
Examine events that will occur after the temperature has exceeded the threshold.
Check for events that will occur after the fan is removed or the fan is not working properly.

- Which release branch to backport (provide reason below if selected)
 202012
2021-07-23 06:16:22 -07:00