Commit Graph

1573 Commits

Author SHA1 Message Date
carl-nokia
3b9b9462be [sonic-device-data]: add port_type to OPTIONAL_PORT_ATTRIBUTES (#8370)
Enable automated test suites to selectively run (or skip) relevant tests based on a new port_type identifier in hwsku.json.

How I did it
Modified the valid optional fields in validity check for hwsku.json per recommendation from Joe in
https://github.com/Azure/sonic-mgmt/pull/2654/files

Co-authored-by: Carl Keene <keene@nokia.com>
2021-08-12 07:09:49 +00:00
Rajkumar-Marvell
01e223513d [reboot-cause] Fixed determine-reboot-cause.service failure. (#8210)
Signed-off-by: Rajkumar Pennadam Ramamoorthy <rpennadamram@marvell.com>

Why I did it
Install sonic image from ONIE. Once system is up, execute "config reload" command.

Root cause is that "determine-reboot-cause.service" was in failed state.
root@sonic:/host/reboot-cause# systemctl list-units --failed
UNIT LOAD ACTIVE SUB DESCRIPTION
● determine-reboot-cause.service loaded failed failed Reboot cause determination service

How I did it
Fixed the issue by setting default reason to "REBOOT_CAUSE_UNKNOWN" instead of "None".
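The idea of falling back to a named default rather than `None` can be sketched as follows. This is a minimal illustration, not the actual service code: the function name, its parameters, and the string value assigned to `REBOOT_CAUSE_UNKNOWN` are assumptions.

```python
# Hypothetical sketch; the real determine-reboot-cause script may differ.
# The string value below is an assumption made for illustration.
REBOOT_CAUSE_UNKNOWN = "Unknown"


def pick_reboot_cause(hardware_cause=None, software_cause=None):
    """Return the first known cause, falling back to a string, never None."""
    for cause in (hardware_cause, software_cause):
        if cause:
            return cause
    # With no cause available, return the named default instead of None.
    return REBOOT_CAUSE_UNKNOWN
```

With a string default, callers that format or store the cause always receive a value of the same type.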

How to verify it
Check that "determine-reboot-cause.service" loaded successfully after image installation from ONIE.
Verify "reboot-cause.txt" file is created and config reload succeeds.
2021-08-12 07:09:23 +00:00
tjchadaga
76def5c3a0 Fix TH3 Warm-reboot failure due to Tunnel termination SAI failure (#8395) 2021-08-11 04:12:46 -07:00
shlomibitton
e62ae02fb2
[hostcfgd] [202012] Delay hostcfgd for faster boot time (#8117)
#### Why I did it
hostcfgd starts at the same time as the 'create_switch' method is called in the orchagent process.
This degrades the execution time of that function, which eventually causes the fast-boot flow, and boot in general, to run slower (~6 seconds).
This change delays the start time of the daemon.
90 seconds was determined as the maximum allowed downtime for the control plane to come back up in the fast-boot flow.

#### How I did it
Add a timer for hostcfgd service in order to delay the startup of this service.

#### How to verify it
Install an image with this change and observe the daemon start 90 seconds after the system boot.
2021-08-10 05:52:08 -07:00
lguohan
8b4f6943a9
[submodule]: update sonic-swss (#8374)
* 41dfaad 2021-08-02 | Bridge mac setting, fix statedb time format (#1844) (HEAD, origin/202012) [Prince Sunny]

Signed-off-by: Guohan Lu <lguohan@gmail.com>
2021-08-08 20:42:46 -07:00
Neetha John
66c8934d84 Revert "Revert "Update default cable len to 0m for TD2"" (#8354)
* Update default cable len to 0m for TD2 (#8298)
* Update sonic-cfggen tests with the correct cable len

Signed-off-by: Neetha John <nejo@microsoft.com>

As part of the buffer reclamation efforts for TD2, setting the default cable len to 0m which means unused ports will have a cable len of 0m.

Why I did it
To align with the changes in Azure/sonic-swss#1830

How to verify it
- With the default cable len set to 0m and the associated changes in swss, CABLE_LENGTH table had '0m' set for unused ports and accordingly more space was reserved for the shared pool
- Cfggen tests passed with the cable len update
2021-08-07 12:43:46 +00:00
VenkatCisco
99d3e2767e Support L1 & L3 Config generation in SONiC (#7637)
Why I did it
This provides support for: PR #7074.

How I did it
Extend sonic-config-engine/config_samples.py to provide support for l1 & l3
2021-08-07 12:40:06 +00:00
Qi Luo
0162cee667
[sonic-swss-common][sonic-snmpagent] Update submodule (#8357)
Include below commits

sonic-swss-common
```
83d3351 2021-04-22 | [swig] fix ConfigDBConnector.db_name (#483) [Qi Luo]
fdf296f 2021-04-09 | Fix: ConfigDBConnector call super init with proper parameter name (#470) [Qi Luo]
4f580e3 2021-03-26 | [swig] translate SonicV2Connector::keys return type from C++ vector<string> to Python list (#468) [Qi Luo]
```

sonic-snmpagent
```
c160c2b 2021-08-04 | CPU Spike because of redundant and flooded keyspace notifis handled (#230) [Vivek Reddy]
a4dd3bf 2021-08-03 | Non-block reading counters to tolerate corrupted/delayed counters in COUNTERS_DB (#231) [Qi Luo]
```
2021-08-06 03:41:36 -07:00
Ying Xie
a21fdeb58c
[202012][utilities] advance sonic-utilities submodule head (#8355)
To include following changes:

* d84a8cc 2021-08-05 | [fast-reboot] revert the change of disabling counter polling before fast-reboot (#1744) (HEAD -> 202012, github/202012) [Ying Xie]
* e900bc5 2021-08-04 | Add script null_route_helper (#1718) [bingwang-ms]
* 85f14e1 2021-08-02 | disk_check updates: (#1736) [Renuka Manavalan]
* d68ac1c 2021-05-27 | [console][show] Force refresh all lines status during show line (#1641) [Blueve]
* a0e417f 2021-04-25 | [console] Display success message after line cleared (#1579) [Blueve]
* 0c6bb27 2021-04-07 | [console] Include Flow Control status in show line result (#1549) [Blueve]
2021-08-06 03:40:24 -07:00
jusherma
4c0daa80c0 [build] Always use -j1 for libsnmp to avoid race condition (#8324)
I have been seeing intermittent (~40%) build failures with the same error described in PR https://github.com/Azure/sonic-buildimage/pull/6592, even with that fix present

```
/usr/bin/ld: mibgroup/ip-forward-mib/ipCidrRouteTable/.libs/ipCidrRouteTable_interface.o: file not recognized: file truncated
...
libtool:   error: 'mibgroup/ip-forward-mib/inetCidrRouteTable/inetCidrRouteTable_interface.lo' is not a valid libtool object
make[5]: *** [Makefile:1020: libnetsnmpmibs.la] Error 1
make[5]: *** Waiting for unfinished jobs....
```

#### How I did it

Use `-j1` for the libsnmp build regardless of the value of `$(MULTIARCH_QEMU_ENVIRON)`

#### How to verify it

Performed 10 builds of the libsnmp target (`target/debs/buster/libsnmp-base_5.7.3+dfsg-5_all.deb`) with and without this change. Without the change, the error occurred 40% of the time. With the change, the error did not occur at all.

Signed-off-by: Justin Sherman <jusherma@cisco.com>
2021-08-05 15:23:18 +00:00
lguohan
cb50baeec2 [k8s]: disable http_proxy for docker by default (#8328)
Disable http_proxy for docker by default; a proxy should not be used unless explicitly configured.

Signed-off-by: Guohan Lu <lguohan@gmail.com>
2021-08-04 00:31:27 -07:00
Nazarii Hnydyn
ada56abe6e
[submodule]: Advance sonic-swss & sonic-sairedis. (#8296)
sairedis:
* [recorder] Fix incorrect attribute enum value capability query (#843) d86b051
* [syncd] Fix fdb flood queue size limit check (#863) 3a2af76
* [vslib] implement query for SAI_DEBUG_COUNTER_TYPE enum values (#842) 575dcb4

swss:
* [portsorch] fix errors when moving port from one lag to anoth… a67d8af
* [debugcounterorch] check if counter type is supported before querying… ( 04105a4
* Td2: Reclaim buffer from unused ports (#1830) ac7f5cf
* [Dynamic Buffer Calc][202012]Bug fix: Don't create lossless buffer pr… f54b7d0

Signed-off-by: Nazarii Hnydyn <nazariig@nvidia.com>
2021-07-30 16:24:35 -07:00
Stephen Sun
f1c8a6ab96
[sonic-swss][202012] Update submodule (#8286)
Update submodule for swss

f54b7d0b [Dynamic Buffer Calc][202012]Bug fix: Don't create lossless buffer profile for active ports without speed configured (Azure/sonic-swss#1820)
ac7f5cff Td2: Reclaim buffer from unused ports (Azure/sonic-swss#1830)
04105a4b [debugcounterorch] check if counter type is supported before querying (Azure/sonic-swss#1789)
a67d8af6 [202012][portsorch] fix errors when moving port from one lag to another. (Azure/sonic-swss#1819)
2021-07-30 02:49:40 -07:00
vdahiya12
5651977f65
[sonic-utilities] submodule update (#8284)
This PR updates the following commits

a9606fb [show] fix show muxcable metrics <port> for sorted output (#1731)
7355016 [minigraph][port_config] Use imported config.main and add conditional patch (#1728)
cc1d6e4 [configlet] Python3 compatible syntax for extracting a key from the dict (#1721)

Signed-off-by: vaibhav-dahiya <vdahiya@microsoft.com>
2021-07-29 07:58:07 -07:00
Qi Luo
9642ea0bb9
[sonic-swss-common] Update submodule (#8272)
Includes below commits
```
bf8c832 2021-07-22 | Fix DBInterface blocking operations (#505) (HEAD -> 202012, origin/202012) [Qi Luo]
0e9385f 2021-04-21 | [swig] Implement SonicV2Connector.hmset() (#480) [Qi Luo]
76be49f 2021-07-25 | Modify the hardcode separator ":" to variable. (#473) [PJHsieh]
142ae3c 2021-06-23 | Fix config prompt question issue (#500) [xumia]
e7bebe1 2021-06-14 | Fix repo reference issue (#487) [xumia]
```
2021-07-28 01:27:15 -07:00
Sujin Kang
a8a6d40a14
[Submodule update:202012] sonic-platform-common, sonic-platform-daemons (#8262)
sonic-platform-daemons:
* c90bb29 [202012] Refactor Pcied and add unittest (Azure/sonic-platform-daemons#199)

sonic-platform-common:
* 9a59e19 Revert "Unifying the platform api for get_pcie_aer_stats with PcieBase (Azure/sonic-platform-common#197)" (Azure/sonic-platform-common#207)
* d35960b Revert "Revert "Unifying the platform api for get_pcie_aer_stats with PcieBase (Azure/sonic-platform-common#197)" (Azure/sonic-platform-common#207)" (Azure/sonic-platform-common#210)
2021-07-27 03:49:34 -07:00
Stepan Blyshchak
7eb6abdc7b
[hostcfgd] differentiate between UnitFileState and UnitFilePreset (#8169) (#8228)
A service may not be enabled even though UnitFilePreset=enabled (the case
for Application Extension):

```
    Loaded: loaded (/lib/systemd/system/cpu-report.service; disabled; vendor preset: enabled)
```

This makes existing logic skip enabling the service.
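The distinction can be sketched by reading the two properties separately. The helper names below are hypothetical and the actual hostcfgd logic is not shown in this commit message; the input is assumed to be `KEY=VALUE` lines such as those printed by `systemctl show <unit> --property=UnitFileState,UnitFilePreset`.

```python
def parse_unit_properties(show_output):
    """Turn KEY=VALUE lines into a dict, ignoring lines without '='."""
    props = {}
    for line in show_output.splitlines():
        key, sep, value = line.partition("=")
        if sep:
            props[key.strip()] = value.strip()
    return props


def needs_enable(show_output):
    """True when the unit itself is not enabled, whatever its preset says."""
    props = parse_unit_properties(show_output)
    # Decide on UnitFileState only; UnitFilePreset is deliberately ignored.
    return props.get("UnitFileState") != "enabled"
```

For the `cpu-report.service` case above (state `disabled`, preset `enabled`), `needs_enable` returns `True`, so the service would still be enabled.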

Signed-off-by: Stepan Blyschak <stepanb@nvidia.com>
2021-07-21 01:13:30 -07:00
Renuka Manavalan
91f611157a
cherry-pick PR #8158 & PR #8205 into 202012 (#8235) 2021-07-20 20:52:33 -07:00
Qi Luo
01efb5454e
[sonic-utilities] Update submodule (#8196)
Includes below commits
```
d19829c 2021-07-16 | Revert "[minigraph][port_config] Consume port_config.json while reloading minigraph (#1705)" [Guohan Lu]
cd1f6e6 2021-07-15 | Reworked IP validation in "config interface ip add/remove" command (#1709) [Andriy Kokhan]
66c34c0 2021-07-15 | [minigraph][port_config] Consume port_config.json while reloading minigraph (#1705) [Blueve]
```
2021-07-17 04:19:01 -07:00
Neetha John
8acb206778
[202012] [minigraph] Update parsing logic for Storage backend devices (#8004)
Backport #7944 

#### Why I did it
The current logic generates 'VLAN_SUB_INTERFACE' table if the device type is backend and cluster name contains 'str'. This is not a reliable method to determine a storage backend device

#### How I did it
Updated the logic to generate 'VLAN_SUB_INTERFACE' table if any of the following conditions hold true
  1. device is of type backend and ResourceType attribute is None
  2. device is of type backend and ResourceType attribute contains "Storage"
  3. device is of type backend and graph contains "Subinterface" section

Also updated the logic to set "is_storage_device" to True
  1. for Backend, if any of the above conditions hold true
  2. for Frontend, if ResourceType attribute contains "Storage"
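The conditions above can be sketched as two predicates. This is an illustration only: the function names, the `"backend"`/`"frontend"` strings, and the boolean `has_subinterface_section` are assumptions, not the identifiers used in the minigraph parser.

```python
def should_generate_vlan_sub_interface(device_type, resource_type,
                                       has_subinterface_section):
    """Apply the three backend conditions listed in the commit message."""
    if device_type != "backend":
        return False
    return (
        resource_type is None                # condition 1
        or "Storage" in resource_type        # condition 2
        or has_subinterface_section          # condition 3
    )


def is_storage_device(device_type, resource_type, has_subinterface_section):
    """Backend: any condition above holds. Frontend: ResourceType has 'Storage'."""
    if device_type == "backend":
        return should_generate_vlan_sub_interface(
            device_type, resource_type, has_subinterface_section)
    if device_type == "frontend":
        return resource_type is not None and "Storage" in resource_type
    return False
```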

#### How to verify it
Added new tests to verify the code changes and built sonic_config_engine-1.0-py3-none-any.whl successfully
2021-07-15 17:33:07 -07:00
lguohan
1d3939b7fe
[submodule]: update sonic-platform-common (#8178)
* 063e915 2021-06-15 | [CI] sonic-config-engine now depends on SONiC YANG packages (#198) (HEAD, origin/202012) [Joe LeVeque]
* 2d36a79 2021-07-13 | Fix Xcvrd crash due to invalid key access in type_of_media_interface, host_electrical_interface, connector_dict (#206) [Prince George]
* 67b8a77 2021-06-18 | Fix decode error when parsing EEPROM fields (#199) [Aravind Mani]
* 238d76b 2021-06-17 | Unifying the platform api for get_pcie_aer_stats with PcieBase (#197) [Sujin Kang]

Signed-off-by: Guohan Lu <lguohan@gmail.com>
2021-07-14 11:34:46 -07:00
lguohan
90c0dcf9e0
[submodule]: update sonic-platform-daemons (#8180)
* 664f0e2 2021-07-14 | [xrcvd]: Removed undefined symbol 'sfp_status_helper' (#204) (HEAD, origin/202012) [Prince George]
* 1b2d016 2021-06-16 | [CI] sonic-config-engine now depends on SONiC YANG packages (#194) [Joe LeVeque]
* 1cf5996 2021-07-14 | Introduce mgmtinit delay after transceiver module insertion (#201) [Prince George]

Signed-off-by: Guohan Lu <lguohan@gmail.com>
2021-07-14 08:07:37 -07:00
Guohan Lu
faf2cc2dac [submodule]: update sonic-linux-kernel
* deb716f 2021-07-14 | [Marvell] CPU1 failure on continuous reboot  (#228) (HEAD, origin/202012) [Rajkumar-Marvell]

Signed-off-by: Guohan Lu <lguohan@gmail.com>
2021-07-14 00:32:23 -07:00
DavidZagury
cf33a50c57
[sonic-utilities] Update submodule (#8168)
Update:
> 2ca493b 2021-07-13 create sniffer folder if not exist (Azure/sonic-utilities#1659) 
> 1695104 2021-07-07 [show priority-group drop counters] Remove backup with cached PG drop counters after 'config reload' (Azure/sonic-utilities#1679) 
> e99a3c5 2021-07-07 [show][config] support for interface alias for muxcable commands (Azure/sonic-utilities#1699)
2021-07-14 00:11:26 -07:00
Qi Luo
30e9c0e7d3
[sonic-snmpagent]: Advance submodule (#8176)
Includes below commits
```
946e5cf 2021-07-12 | Fix: SonicV2Connector behavior change: get_all will return empty dict if (#226) [Qi Luo]
```
2021-07-14 00:07:15 -07:00
shlomibitton
da7f596a55
[hostcfgd] [202012] Enhance hostcfgd to check feature state and run less system calls (#8157)
Currently, hostcfgd is implemented such that each feature being enabled/disabled triggers execution of systemctl enable/unmask commands, which eventually trigger a 'systemctl daemon-reload' command.
Each such call costs 0.6s, adding an overall overhead of ~12 seconds of CPU time.
This change compares the desired state of a feature with its current state in systemd and triggers a system call only when needed.
What is changed: Check each feature's status in systemd before executing a system call to enable it and reload the systemctl daemon.
How to verify: Build an image with this change and observe that fewer system calls are executed.
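A minimal sketch of the "compare first, call only on mismatch" idea follows. The function name, its parameters, and the injected `run` callable are hypothetical; the real hostcfgd code also handles unmasking and daemon reload, which this sketch omits.

```python
def sync_feature(name, desired_state, current_state, run):
    """Invoke systemctl only when systemd's state differs from the desired one.

    `run` is a caller-supplied callable that receives the command as a list
    (an assumption made so the sketch stays testable without systemd).
    Returns True if a command was issued, False if the call was skipped.
    """
    if desired_state == current_state:
        return False  # already in the desired state; skip the costly call
    action = "enable" if desired_state == "enabled" else "disable"
    run(["systemctl", action, name + ".service"])
    return True
```

Passing `subprocess.check_call` as `run` would execute the command for real, while passing `list.append` records it for inspection.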
2021-07-13 14:57:17 -07:00
Shi Su
c857f64c00 [bgpcfgd] Remove unnecessary dependency for StaticRouteMgr (#8037)
Why I did it
Static route configuration should not depend on BGP_ASN. Remove the dependency on BGP_ASN for StaticRouteMgr.
Fix #8027

How I did it
Check for the BGP_ASN field before configuring static route redistribution, and wait until BGP_ASN is available before enabling static route redistribution.
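The deferral described here can be sketched as below. The class, its methods, and the command strings are illustrative placeholders, not the bgpcfgd implementation: static routes are recorded immediately, while redistribution waits for an ASN.

```python
class StaticRouteSketch:
    """Hypothetical stand-in for a manager that defers redistribution."""

    def __init__(self):
        self.bgp_asn = None
        self.routes = []
        self.redistribution_enabled = False
        self.commands = []  # illustrative command strings, recorded in order

    def add_route(self, prefix, nexthop):
        # Static routes do not depend on BGP_ASN, so configure them right away.
        self.routes.append((prefix, nexthop))
        self.commands.append(f"ip route {prefix} {nexthop}")
        self._maybe_enable_redistribution()

    def on_bgp_asn(self, asn):
        # Called once the ASN becomes available.
        self.bgp_asn = asn
        self._maybe_enable_redistribution()

    def _maybe_enable_redistribution(self):
        if self.redistribution_enabled or self.bgp_asn is None or not self.routes:
            return
        self.commands.append(f"router bgp {self.bgp_asn}")
        self.commands.append(" redistribute static")
        self.redistribution_enabled = True
```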

How to verify it
Add unit test to cover the scenario and verify the functionality on a virtual switch.
2021-07-13 05:14:10 +00:00
Shilong Liu
7811e7eef1 Bug fix for reproducible build (#8061) 2021-07-07 09:41:29 +00:00
shlomibitton
adbc657722
[sonic-swss][202012] submodule update (#8058)
[flex-counters] [202012] Delay flex counters stats init for faster boot time (Azure/sonic-swss#1804)
Signed-off-by: Shlomi Bitton <shlomibi@nvidia.com>
2021-07-05 09:05:49 +03:00
Guohan Lu
d3e2983188 Revert "[Kubernetes]: The kube server could be used as http-proxy for docker (#7469)"
This reverts commit e851a42db7.
2021-07-01 18:41:21 -07:00
gechiang
e784c2607c
[202012] Add BRCM SOC Property to not count ACL drops towards interface RX_DRP for DualToR platforms (#8000) 2021-07-01 16:45:07 -07:00
Dror Prital
22565d03a3
[sonic-utilities][202012] submodule update (#8029)
Update submodule for sonic-utilities to include the following PR:
[202012] [pfcwd] Fix the return code in invalid case (#1698)

Signed-off-by: Dror Prital <drorp@nvidia.com>
2021-06-30 20:39:27 +03:00
Shilong Liu
d9b1089a3c
Enable reproducible build for git in 202012 branch. (redo 6562) (#7994) 2021-06-30 10:54:54 +08:00
Stephen Sun
1e0af83f6a
[sonic-swss][202012] submodule update (#8011)
Advance submodule head for sonic-swss on 202012

bb383be2 [Dynamic Buffer Calc][Mellanox] Bug fixes and enhancements for the lua plugins for buffer pool calculation and headroom checking (Azure/sonic-swss#1781)
f949dfe9 [Dynamic Buffer Calc] Avoid creating lossy PG for admin down ports during initialization (Azure/sonic-swss#1776)
def0a914 Fix config prompt question issue (Azure/sonic-swss#1799)
21f97506 [ci]: Merge azure pipelines from master to 202012 branch (Azure/sonic-swss#1764)
a83a2a42 [vstest]: add dvs_route fixture
849bdf9c [Mux] Add support for mux metrics to State DB (Azure/sonic-swss#1757)
386de717 [qosorch] Dot1p map list initialization fix (Azure/sonic-swss#1746)
f99abdca [sub intf] Port object reference count update (Azure/sonic-swss#1712)
4a00042d [vstest/nhg]: use dvs_route fixture to make test_nhg more robust

Signed-off-by: Stephen Sun <stephens@nvidia.com>
2021-06-29 22:49:58 +03:00
DavidZagury
93a9dcb587
[sonic-sairedis][sonic-utilities][202012] submodule update (#7998)
Update submodule for sonic-utilities to include the following PRs:

80b7b54 Pcieutil to load the platform api first instead of using common api (sonic-utilities#1672)
3d5e93d [Mellanox] Update mellanox dump generation to include SDK dumps (sonic-utilities#1640)
a805efc [ci]: Fix config prompt question issue (sonic-utilities#1693)
33c6d79 [vnet_route_check] Fix logic for getting VNET routes from ASIC DB (sonic-utilities#1653)

Update submodule for sonic-sairedis to include the following PRs:

74b5808 [Mellanox] Update mellanoxs dump generation to include SDK dumps (sonic-sairedis#833)
5ff9305 Fix azure-pipelines branch reference error, change master to 202012 (sonic-sairedis#834)
2021-06-29 21:37:11 +03:00
Joe LeVeque
b7344e8ac3 [sonic-py-common] Clear environment variables before running device_info tests (#7273)
#### Why I did it

To ensure any environment variables which are configured in the build/test environment do not influence the behavior of sonic-py-common during unit tests. For example, variables which might be set by continuous integration pipelines.

#### How I did it

Add class-scoped pytest fixture to `TestDeviceInfo` class which stashes the current environment variables, clears them and yields. Once all the test cases in the class finish, the fixture will restore the original environment variables.
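The stash, clear, and restore sequence can be sketched with a plain context manager. The name `cleared_environment` is hypothetical, and this is not the fixture from the commit; in pytest the same body could sit under `@pytest.fixture(scope="class", autouse=True)`.

```python
import os
from contextlib import contextmanager


@contextmanager
def cleared_environment():
    """Stash os.environ, clear it for the duration, then restore it."""
    stashed = dict(os.environ)
    os.environ.clear()
    try:
        yield
    finally:
        # Drop anything set while cleared, then put the originals back.
        os.environ.clear()
        os.environ.update(stashed)
```

Restoring in a `finally` clause means the original variables come back even if the code inside the block raises.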

Also remove unnecessary unittest-style setup and teardown functions from interface_test.py
2021-06-29 07:19:46 +00:00
thomas.cappleman@metaswitch.com
1d3e7ab161 [build]: Fix sonic-cfggen contextlib err (#7996)
A recent version of contextlib2 (https://pypi.org/project/contextlib2/21.6.0/#history) has broken Python2 compatibility,
so the version picked up by netaddr when using Python2 must be specified, or else builds fail

Co-authored-by: Tom Zhu <tom.zhu@metaswitch.com>
2021-06-28 17:18:45 -07:00
Dror Prital
6d4bbdca92
[sonic-utilities][202012] submodule update (#7981)
Update submodule for sonic-utilities to include the following PRs:

Make the soft-reboot available in the SONiC image on master (#1681)
[config]: Update environment file during config reload (#1673)
Enable pr checker form 202012 (#1649)
[config] support for configuring muxcable to manual mode of operation (#1642)
[config] Fix config int add incorrect ip (#1414)

Signed-off-by: Dror Prital <drorp@nvidia.com>
2021-06-26 12:08:04 +03:00
Alexander Allen
b128bdc246
Bump sonic-platform-daemons (#7936)
3ab5a04 [xcvrd] Force cleanup of chassis global variable on deinit (#193)
2021-06-22 16:33:35 +03:00
Junchao-Mellanox
873f6a8b92
[202012] [sonic-platform-common] Update submodule (#7929)
b0dad8c Add to check pcie configuration revision to get the right configuration. (#195)
f66ffc3 [eeprom_tlv_info] Optimize EEPROM data process by using visitor pattern (#193)
2021-06-21 09:15:42 -07:00
Rajkumar-Marvell
c6282294b9 [Marvell] Fix system MAC parsing logic for Marvell platform. (#7914)
Fixed parsing logic in file "src/sonic-py-common/sonic_py_common/device_info.py"

Signed-off-by: Rajkumar Pennadam Ramamoorthy <rpennadamram@marvell.com>
2021-06-21 09:56:02 +00:00
Qi Luo
8b7091ff51 Revert some mistakenly merged/pushed code
* Revert "fix"

This reverts commit 93585b0a0a.

* Revert "Version control git (#6562)"

This reverts commit 52b87753db.

* Revert "Revert "[files/build/versions]: support reproduceable build for git (#5774)""

This reverts commit 1cb8daf585.

* Revert "[files/build/versions]: support reproduceable build for git (#5774)"

This reverts commit 547aa9b2c7.
2021-06-21 06:35:54 +00:00
liushilongbuaa
52b87753db Version control git (#6562)
* support reproduceable build for git clone

Signed-off-by: shilongliu <shilongliu@microsoft.com>

* fix

* bug-fix

Signed-off-by: shilongliu <shilongliu@microsoft.com>

* bug-fix

Signed-off-by: shilongliu <shilongliu@microsoft.com>

Co-authored-by: shilongliu <shilongliu@microsoft.com>
2021-06-18 13:32:32 +08:00
Guohan Lu
1cb8daf585 Revert "[files/build/versions]: support reproduceable build for git (#5774)"
This reverts commit d75c290f00.
2021-06-18 13:32:27 +08:00
liushilongbuaa
547aa9b2c7 [files/build/versions]: support reproduceable build for git (#5774)
* support reproduceable build for git clone

Signed-off-by: shilongliu <shilongliu@microsoft.com>

* fix

Co-authored-by: shilongliu <shilongliu@microsoft.com>
2021-06-18 13:32:20 +08:00
Mykola Gerasymenko
c406d42a26
Add PG_DROP yang model (#7899)
Add the PG_DROP yang model and add a check for this field in the yang model unit test.

How to verify it
Firstly try to do DPB (2x50G) for Ethernet0 port:
sudo config interface breakout Ethernet0 2x50G -f
After that try to do DPB (1x100G[40G]) for Ethernet0 port:
sudo config interface breakout Ethernet0 1x100G[40G] -f
Both commands should work correctly.

Signed-off-by: Mykola Gerasymenko <mykolax.gerasymenko@intel.com>
2021-06-17 10:32:45 -07:00
Joe LeVeque
c46bf41ea5 [sonic-host-services] Add 'parameterized' package as a test dependency (#7900)
#### Why I did it

Recently, the build started failing with messages like

```
2021-06-16T16:55:02.8675603Z tests/hostcfgd/hostcfgd_test.py:5: in <module>
2021-06-16T16:55:02.8676208Z     from parameterized import parameterized
2021-06-16T16:55:02.8677145Z E   ModuleNotFoundError: No module named 'parameterized'
```

Unit tests for hostcfgd depend on the `parameterized` Python package, but it was never added as a dependency to the setup.py file. This dependency was added ~3 months ago. I'm not sure why we only started seeing this failure recently.

#### How I did it

Add 'parameterized' package as a test dependency in setup.py for sonic-host-services package
2021-06-17 07:09:50 +00:00
Renuka Manavalan
e851a42db7 [Kubernetes]: The kube server could be used as http-proxy for docker (#7469)
Why I did it
The SONiC switches get their docker images from local repo, populated during install with container images pre-built into SONiC FW. With the introduction of kubernetes, new docker images available in remote repo could be deployed. This requires dockerd to be able to pull images from remote repo.

Depending on the switch's network domain and config, it may or may not be able to reach the remote repo. In the case where the remote repo is unreachable, we could potentially make the Kubernetes server also act as an http-proxy.

How I did it
When the admin explicitly enables it, the kubernetes-server could be configured as docker-proxy. However, any update to docker-proxy has to go through an environment variable in the service-conf file, implying that a "service restart docker" is required. Restarting dockerd is very expensive, as it would restart all dockers, including the database docker.

To avoid a dockerd restart, pre-configure an http_proxy using an unused IP. When the k8s server is enabled to act as http-proxy, an iptables entry is created to redirect all traffic destined for the configured unused proxy IP to the kubernetes-master IP. This way, any update to the Kubernetes master config only manipulates iptables, which is transparent to all modules until dockerd needs to download from the remote repo.

How to verify it
Configure a switch such that image repo is unreachable
Pre-configure dockerd with http_proxy.conf using an unused IP (e.g. 172.16.1.1)
Update ctrmgrd.service to invoke ctrmgrd.py with "-p" option.
Configure a k8s server, and deploy an image for feature with set_owner="kube"
Check if switch could successfully download the image or not.
2021-06-17 07:09:50 +00:00
Sudharsan Dhamal Gopalarathnam
199c75f36b
[202012][sonic-utilities] submodule update (#7891)
d86d765 [202012]Fixing db_migrator for Feature table (#1676)
440b0f4 [config] Sort Config Db When Saving (#1623) (#1651)
2021-06-16 18:33:41 +03:00
Blueve
4cbf7e975b [console][minigraph] Avoid generate config for self console port (#7817)
Signed-off-by: Jing Kan <jika@microsoft.com>
2021-06-16 12:46:25 +00:00