Why I did it
End goal: To have azure pipeline job to run multi-asic VS tests.
Intermediate goal: Require multi-asic KVM image so that the test can be run.
The difference between single asic and multi-asic KVM image is asic.conf file which has different NUM_ASIC values.
Idea behind the approach in this PR to attain the intermediate goal above:
Ease of building multi-asic KVM image so that any user or azure pipeline can use a simple make command to generate single or multi-asic KVM images as required.
Use a single onie installer image and multiple KVM images for single and multi-asic images.
Current scenario:
For VS platform, sonic-vs.bin is generated which is the onie installer image and sonic-vs.img.gz is generated which is the KVM iamge.
Scenario to be achieved:
sonic-vs.bin - which will include single asic platform, 4 asic platform and 6 asic platform.
sonic-vs.img.gz - single asic KVM image
sonic-4asic-vs.img.gz - 4 asic KVM image
sonic-6asic-vs.img.gz - 6 asic KVM image
In this PR, 2 new platforms are added for 4-asic and 6-asic VS.
How I did it
Create 4-asic and 6-asic device directories with the required files and hwsku files.
Add onie-recovery image information in vs platform.
How to verify it
After this PR change, no build change.
sonic-vs.bin onie installer image should include information of new multi-asic vs platforms.
This reverts commit a557dbd97e.
Reverting this PR as it is not required currently for multi-asic VS.
multi-asic VS will come up with multiple instances of swss and syncd. syncd will use default hwinfo string, same as in single asic VS.
hwskus.
Why I did it
For multi-asic platforms, orchagent process in swss docker is started by passing device_ids(or asic_ids).
Each swss docker starts orchagent with a different device_id. This device_id is passed as Hardware info to syncd. For syncd to start with the right hwinfo, context_config.json is passed as an argument. context_config.json file is looked up to get the hwinfo information.
sonic-sairedis PRs required for this diff to be used to bring up multi-asic VS:
Azure/sonic-sairedis#830Azure/sonic-sairedis#832
How I did it
Add context_config.json for each asic in the same structure as provided here: https://github.com/Azure/sonic-sairedis/blob/master/lib/src/context_config.json
Each asic context_config.json will have different hwinfo string.
hwinfo string will be same as device id retrieved from asic.conf file.
Signed-off-by: Suvarna Meenakshi <sumeenak@microsoft.com>
- Why I did it
Current mutli-asic vs hwsku consists of 6 asics with each asic having 32 interfaces. When bringing this up, below issue was seen:
When all 32 interfaces(sonic interfaces and linux interface) are set to 9100 mtu, DMA error is seen "DMA: Out of SW-IOMMU space for 4096 bytes at device 0000:06:03.0" which can be fixed by updating swiotlb=65536 in /host/grub/grub.cfg .In order to keep multi-asic VS lighter and easier to bring up and test, new hwsku 'msft_four_asic_vs' is added to represent 4-asic hwsku with 2 frontend asics and 2 backend asics and each asic having 8 interfaces interconnected by port-channels.
- How I did it
Add msft_four_asic_hwsku directory to have the right number of directories (4) and update port_config.ini and lanemap.ini files to include 8 ports information.
Add topology.sh script to create the internal asic-asic connectivity.
- How to verify it
Update asic.conf with the 4 asic information as below and build sonic-vs.img:
NUM_ASIC=4
DEV_ID_ASIC_0=0
DEV_ID_ASIC_1=1
DEV_ID_ASIC_2=2
DEV_ID_ASIC_3=3
Modify sonic_multiasic.xml to have 8 front panel interfaces.
create virtual switch using "sudo virsh sonic_mutliasic.xml" command.
Start topology service and Load config_db files for switch and each asic.
Ensure that that all internal interfaces and port_channels are coming up.
multi-asic vs testbed:
Bring up mutli-asic VS testbed with a multi-asic image(asic.conf updated to 4 asics) and using t1-lag topology.
./testbed-cli.sh -t vtestbed.csv -m veos_vtb -k ceos add-topo vms-kvm-four-asic-t1-lag password.txt
Load minigraph/config_dbs.
Ensure all internal and external interfaces come up.
No change on single asic vs.
Update topology script to retrieve hwsku from minigraph
if hwsku information is not available in config_db.
Fix clean up of interfaces in msft_multi_asic_vs hwsku
topology script.
- Why I did it
When bringing up multi-asic VS switch, topology service is started during boot up.
Topology service starts a shell script which runs the topology script present in /usr/share/sonic/device// directory. To invoke hwsku specific script, the topology script tries to retrieve hwsku information from config_db.
During initial boot up config_db might not be populated. In order to start topology service before config_db is updated,
update topology script to get hwsku information from minigraph.xml if it is available.
This will be helpful to bring up multi-asic VS testbed by loading minigraph and starting topology service.
- How I did it
Update topology.sh script to retrieve hwsku information from minigraph.xml.
Fix clean up function on msft_multi_asic_vs toplogy script.
- How to verify it
single-asic VS - no change; topology service is only enabled for multi-asic VS.
multi-asic VS - Bring up multi-asic VS image, copy minigraph to vs image, start topology service. Topology service should be successful.
to test clean up function fix, start topology service - make sure interfaces are created and moved to the right namespaces.
stop topology service - make sure namespace do not have any interface and all front end interfaces are present in default namespace.
Current mutli-asic vs hwsku consists of 6 asics with each asic having 32 interfaces.
When bringing this up, below issue was seen:
When all 32 interfaces in each namespace (sonic interfaces and linux interface) is set to 9100 mtu, DMA error is seen "DMA: Out of SW-IOMMU space for 4096 bytes at device 0000:06:03.0" which can be fixed by updating swiotlb=65536 in /host/grub/grub.cfg .
Signed-off-by: SuvarnaMeenakshi <sumeenak@microsoft.com>
- Why I did it
The sai.profile file in kvm images overrides the warmboot file with path /var/cache/sai_warmboot.bin. Since the directory /var/cache is not mounted in syncd, it will be cleared in an image upgrade, the warm-reboot image upgrade will fail if the file is put in the directory.
Fix#6183
- How I did it
Remove the path that overrides the default path. The warmboot file path will then be the default value /var/warmboot/sai-warmboot.bin. Since /var/warmboot/ is mounted by /host/warmboot/ in the host, it could survive an image upgrade.
- How to verify it
Tested warm reboot upgrading kvm image locally.
Submodule updates include the following commits:
* src/sonic-utilities 9dc58ea...f9eb739 (18):
> Remove unnecessary calls to str.encode() now that the package is Python 3; Fix deprecation warning (#1260)
> [generate_dump] Ignoring file/directory not found Errors (#1201)
> Fixed porstat rate and util issues (#1140)
> fix error: interface counters is mismatch after warm-reboot (#1099)
> Remove unnecessary calls to str.decode() now that the package is Python 3 (#1255)
> [acl-loader] Make list sorting compliant with Python 3 (#1257)
> Replace hard-coded fast-reboot with variable. And some typo corrections (#1254)
> [configlet][portconfig] Remove calls to dict.has_key() which is not available in Python 3 (#1247)
> Remove unnecessary conversions to list() and calls to dict.keys() (#1243)
> Clean up LGTM alerts (#1239)
> Add 'requests' as install dependency in setup.py (#1240)
> Convert to Python 3 (#1128)
> Fix mock SonicV2Connector in python3: use decode_responses mode so caller code will be the same as python2 (#1238)
> [tests] Do not trim from PATH if we did not append to it; Clean up/fix shebangs in scripts (#1233)
> Updates to bgp config and show commands with BGP_INTERNAL_NEIGHBOR table (#1224)
> [cli]: NAT show commands newline issue after migrated to Python3 (#1204)
> [doc]: Update Command-Reference.md (#1231)
> Added 'import sys' in feature.py file (#1232)
* src/sonic-py-swsssdk 9d9f0c6...1664be9 (2):
> Fix: no need to decode() after redis client scan, so it will work for both python2 and python3 (#96)
> FieldValueMap `contains`(`in`) will also work when migrated to libswsscommon(C++ with SWIG wrapper) (#94)
- Also fix Python 3-related issues:
- Use integer (floor) division in config_samples.py (sonic-config-engine)
- Replace print statement with print function in eeprom.py plugin for x86_64-kvm_x86_64-r0 platform
- Update all platform plugins to be compatible with both Python 2 and Python 3
- Remove shebangs from plugins files which are not intended to be executable
- Replace tabs with spaces in Python plugin files and fix alignment, because Python 3 is more strict
- Remove trailing whitespace from plugins files
**- Why I did it**
We were building a custom version of Supervisor because I had added patches to prevent hangs and crashes if the system clock ever rolled backward. Those changes were merged into the upstream Supervisor repo as of version 3.4.0 (http://supervisord.org/changes.html#id9), therefore, we should be able to simply install the vanilla package via pip. This will also allow us to easily move to Python 3, as Python 3 support was added in version 4.0.0.
**- How I did it**
- Remove Makefiles and patches for building supervisor package from source
- Install Python 3 supervisor package version 4.2.1 in Buster base container
- Also install Python 3 version of supervisord-dependent-startup in Buster base container
- Debian package installed binary in `/usr/bin/`, but pip package installs in `/usr/local/bin/`, so rather than update all absolute paths, I changed all references to simply call `supervisord` and let the system PATH find the executable to prevent future need for changes just in case we ever need to switch back to build a Debian package, then we won't need to modify these again.
- Install Python 2 supervisor package >= 3.4.0 in Stretch and Jessie base containers
* buildimage: Add gearbox phy device files and a new physyncd docker to support VS gearbox phy feature
* scripts and configuration needed to support a second syncd docker (physyncd)
* physyncd supports gearbox device and phy SAI APIs and runs multiple instances of syncd, one per phy in the device
* support for VS target (sonic-sairedis vslib has been extended to support a virtual BCM81724 gearbox PHY).
HLD is located at b817a12fd8/doc/gearbox/gearbox_mgr_design.md
**- Why I did it**
This work is part of the gearbox phy joint effort between Microsoft and Broadcom, and is based
on multi-switch support in sonic-sairedis.
**- How I did it**
Overall feature was implemented across several projects. The collective pull requests (some in late stages of review at this point):
https://github.com/Azure/sonic-utilities/pull/931 - CLI (merged)
https://github.com/Azure/sonic-swss-common/pull/347 - Minor changes (merged)
https://github.com/Azure/sonic-swss/pull/1321 - gearsyncd, config parsers, changes to orchargent to create gearbox phy on supported systems
https://github.com/Azure/sonic-sairedis/pull/624 - physyncd, virtual BCM81724 gearbox phy added to vslib
**- How to verify it**
In a vslib build:
root@sonic:/home/admin# show gearbox interfaces status
PHY Id Interface MAC Lanes MAC Lane Speed PHY Lanes PHY Lane Speed Line Lanes Line Lane Speed Oper Admin
-------- ----------- --------------- ---------------- --------------- ---------------- ------------ ----------------- ------ -------
1 Ethernet48 121,122,123,124 25G 200,201,202,203 25G 204,205 50G down down
1 Ethernet49 125,126,127,128 25G 206,207,208,209 25G 210,211 50G down down
1 Ethernet50 69,70,71,72 25G 212,213,214,215 25G 216 100G down down
In addition, docker ps | grep phy should show a physyncd docker running.
Signed-off-by: syd.logan@broadcom.com
as needed by SAI 3.7 and above. Without this change
Warmboot fails from 3.5 to 3.7 as Braodcoam Datastructure
gets corrupted after warm-boot.
Signed-off-by: Abhishek Dosi <abdosi@microsoft.com>
Modify port_config.ini files multi-asic vs platform. Changes done:
- Add new columns: index, asic_port_name, role(Int/Ext)
- Modify alias of interface names. Alias should match the interface names present in minigraph file.
Signed-off-by: SuvarnaMeenakshi <sumeenak@microsoft.com>
hardware daemons are not supported in kvm vs platform now
admin@vlab-01:/usr/share/sonic/device/x86_64-kvm_x86_64-r0$ docker exec -it pmon bash
root@vlab-01:/# supervisorctl status
fancontrol STOPPED Not started
lm-sensors STOPPED Not started
rsyslogd RUNNING pid 23, uptime 0:03:09
start.sh EXITED Apr 22 09:07 AM
supervisor-proc-exit-listener RUNNING pid 17, uptime 0:03:10
Signed-off-by: Guohan Lu <lguohan@gmail.com>
- move single instance services into their own folder
- generate Systemd templates for any multi-instance service files in slave.mk
- detect single or multi-instance platform in systemd-sonic-generator based on asic.conf platform specific file.
- update container hostname after creation instead of during creation (docker_image_ctl)
- run Docker containers in a network namespace if specified
- add a service to create a simulated multi-ASIC topology on the virtual switch platform
Signed-off-by: Lawrence Lee <t-lale@microsoft.com>
Signed-off-by: Suvarna Meenakshi <Suvarna.Meenaksh@microsoft.com>
* syncd changes to disk and add e1000 driver to sonic vm
* add pg_profile_lookup.ini
Signed-off-by: Guohan Lu <gulv@microsoft.com>
* update swss and sairedis
sairedis:
* d146572 2018-11-22 | Fix interface name used on link message using lane map (#386)
swss:
* c74dc60 2018-11-22 | [vstest]: use eth1~32 as physical interface name in vs docker (#700) (HEAD -> master, origin/master, origin/HEAD) [lguohan]
* 6007e7f 2018-11-22 | [portmgrd]: Fix setting default port admin status and MTU (#691) [stepanblyschak]
* 6c70f6d 2018-11-22 | [portsorch] Fix port queue index init bug (#505) [yangbashuang]
* 70ac79b 2018-11-21 | [gitignore]: Update all binary names in the ignore list (#698) [Shuotian Cheng]
* 2a3626c 2018-11-21 | [test]: Remove duplicate legacy ACL tests (#699) [Shuotian Cheng]
* 8099811 2018-11-20 | [aclorch]: Remove unnecessary warning message (#696) [Shuotian Cheng]
* 63d8ebc 2018-11-18 | [portsorch]: Remove duplicate local variables - port (#690) [Shuotian Cheng]
* 28dc042 2018-11-18 | Remove default docker name value of swss. (#692) [Jipan Yang]
Signed-off-by: Guohan Lu <gulv@microsoft.com>