**- Why I did it**
Reduce the time it takes for the ASIC FW burn as part of the automatic FW upgrade procedure.
**- How I did it**
Add -d option to mlxfwmanager tool to use the faster MST device and not the default one which is not the fastest one.
**- How to verify it**
I manually changed ASIC FW followed by reboot command in order for FW upgrade to take place on deinit.
I manually changed ASIC FW followed by hard reset in order for FW upgrade to take place on init.
Signed-off-by: liora <liora@nvidia.com>
* 28d358f 2021-02-01 | [show] Run fwutil with sudo (#1364) (HEAD) [Volodymyr Boiko]
* a50b7a2 2021-01-29 | [ecnconfig] Allow ecn unit test to run without sudo (#1390) [Neetha John]
* 8a1109e 2021-01-29 | [sonic-installer] Add information to syslog (#1369) [Dmytro]
* c7c01e4 2021-01-27 | [show] fix "show interfaces breakout" command (#1198) [Dmytro Shevchuk]
* 7a8024a 2021-01-27 | Prevent user from adding more then a single untagged VLAN to an interface (#1382) [Eran Dahan]
* 41e62c6 2021-01-26 | [pcieutil] Add 'pcie-aer' sub-command to display AER stats (#1169) [Arun Saravanan Balachandran]
* 47f412b 2021-01-25 | Improve robustness of consutil plugin loading (#1353) [Samuel Angebault]
* 64aa1b8 2021-01-26 | [show] Fix warnings, related to gearbox, while show commands execution (#1343) [maksymbelei95]
* ff226d0 2021-01-25 | Prevent configuring IP interface on a port which is a member of VLAN (#1374) [Eran Dahan]
* f1522b9 2021-01-21 | [config_mgmt.py]: Set leaf-list to empty list while port breakout. (#1268) [Praveen Chaudhary]
* 99c05d5 2021-01-21 | add vlan_intf_object only if there are ipv4 or ipv6 mappings (#1377) [Sumukha Tumkur Vani]
* b082684 2021-01-21 | [ecn] Add tests for ecnconfig command (#1372) [Neetha John]
* 23e0920 2021-01-21 | [sfpshow] Enhance QSFP-DD DOM information (#1207) [shlomibitton]
* f4edba1 2021-01-20 | [ecnconfig] handle backend port names when extracting port I/F ID from the port name (#1361) [Mahesh Maddikayala]
Signed-off-by: Guohan Lu <lguohan@gmail.com>
Current mutli-asic vs hwsku consists of 6 asics with each asic having 32 interfaces.
When bringing this up, below issue was seen:
When all 32 interfaces in each namespace (sonic interfaces and linux interface) is set to 9100 mtu, DMA error is seen "DMA: Out of SW-IOMMU space for 4096 bytes at device 0000:06:03.0" which can be fixed by updating swiotlb=65536 in /host/grub/grub.cfg .
Signed-off-by: SuvarnaMeenakshi <sumeenak@microsoft.com>
When we add allow-list key with action above route-map gets updated . For eg if we add deny action above template will become to no-export community. Now if we delete the key Issue is we still keep the no-export and do not move back to drop community.
This PR fixes this issue by rolling back default route-map community value back to constants.yml default action.
**Why I did it**
Disable SDK extended dump due to issue found
**How I did it**
Update SAI submodule
**How to verify it**
Verify the SDK extended dump is not called.
Signed-off-by: Eran Dahan <erand@nvidia.com>
* Bump version number to 2.0.32-1 to include a fix for a memory-leak
found during testing. A wrong API is used to free the cJSON
data-structure, which only frees the first pointed-to structure.
The proper API should recursively free all structures.
Signed-off-by: Garrick He <garrick_he@dell.com>
This PR updates the following commits in sonic-platform-common
6ad0004 [component] add auto_update_firmware() to support the auto update. (#106)
49076a9 [sonic_y_cable] Add support for measuring BER and EYE scan and running Loopback, PRBS modes on the Y cable (#158)
6b12b4c [sfp] Add parsing the dom_capability to sff8472 (#102)
7fc76b9 [sonic_pcie] Add get_pcie_aer_stats and its common implementation (#144)
Signed-off-by: vaibhav-dahiya <vdahiya@microsoft.com>
Archive compiled Debian packages and Python wheels so that the artifacts can be downloaded and used by other pipelines.
Also archive related log files.
- Why I did it
SONiC design requires sonic_platform package to be installed in SONiC host environment, not only in docker containers.
- How I did it
For now, sonic_platform python wheel package, that is used by pmon, is provided via device-specific platform modules deb packages that unpacks the wheel package file into specific device's directory on lazy-install.
The PR makes deb packages' postinst script also install these unpacked wheel packages to host.
Signed-off-by: Volodymyr Boyko <volodymyrx.boiko@intel.com>
port_config.ini for HWSKU Arista-7050CX3-32S-C32 has missing speed column and duplicated lanes.
The incorrect speed causes issues in Orchagent RESTARTCHECK as the below task remains as the remaining item during swss shutdown.
Update minigraph parser to retrieve kubernetes server info from minigraph.xml and update "KUBERNETES_MASTER|SERVER" in running config.
Update minigraph parser to include clusterName from minigraph.xml into "DEVICE_METADATA|localhost"
A few issues where discovered with crashkernel on Arista platforms.
1) platforms using `docker_inram=on` would end up OOM in kdump environment.
This happens because the same initramfs is used by SONiC and the crashkernel.
With `docker_inram=on` the `dockerfs.tar.gz` is extracted in a `tmpfs` created for the occasion.
Since `dockerfs.tar.gz` weights more than 1.5G, it doesn't fit into the kdump environment and ends up OOM.
This OOM event can in turn trigger a panic.
2) Arista platforms with `secureboot` enabled would fail to load the crashkernel because the kernel parameter would be discarded on boot.
This happens because the `boot0` in secureboot mode is strict about kernel parameter injection.
3) The secureboot path allowlist would remove kernel crash reports.
4) The kdump service would fail on Arista products since `/boot/` is empty in `secureboot`
**- How I did it**
1) To prevent an OOM event in the crashkernel the fix is to avoid the codepaths in `union-mount` that create tmpfs and populate them. Some more codepath specific to Arista devices are also skipped to make the kdump process faster.
This relies on detecting that the initramfs is starting in a kdump environment and skipping some initialization.
The `/usr/sbin/kdump-config` tool appends a few kernel cmdline arguments when loading the crashkernel.
The most unique one is `systemd.unit=kdump-tools.service` which is used in a few initramfs hooks to set `in_kdump`.
2) To allow `kdump` to work in `secureboot` environment the cmdline generation in boot0 was slightly modified.
The codepath to load kernel parameters changed by SONiC is now running for booting in secure mode.
It was altered to prevent an append only behavior which would grow the `kernel-cmdline` at every reboot.
This ever growing behavior would lead `kexec` to fail to load the kernel due to a too long cmdline.
3) To get the kernel crash under /var/crash this path has to be added to `allowlist_paths`
4) The `/host/image-XXX/boot` folder is now populated in `secureboot` mode but not used.
**- How to verify it**
Regular boot:
- enable kdump
- enable docker_inram=on via kernel-params
- reboot
- generate a crash `echo c > /proc/sysrq-trigger`
- before: witness OOM events on the console
- after: crash kernel works and crash available under /var/crash
Secure boot:
- enable kdump
- reboot
- generate a crash `echo c > /proc/sysrq-trigger`
- before: witness no kdump
- after: crash kernel works and crash available under /var/crash
Co-authored-by: Boyang Yu <byu@arista.com>
snmpd's compile is always failed with file truncated on ARM64 arch, the error log is like "/usr/bin/ld: mibgroup/ip-forward-mib/inetCidrRouteTable/.libs/inetCidrRouteTable_interface.o: file not recognized: file truncated"
Co-authored-by: Xianghong Gu <xgu@centecnetworks.com>
1. BRCM SAI Debian build need not have any Kernel version dependency - Starting with 4.3 BRCM made changes in SAI so that this dependency has been cleaned up. We can now remove the Kernel Version dependency from Azure Pipeline build script.
2. Bypass PEER_MODE p2mp setting causing SYNCd crash on non-TD3 SKUs - Temporarily patch BRCM SAI code to not cause SYNCd crash when Orchagent program SAI_TUNNEL_ATTR_PEER_MODE: SAI_TUNNEL_PEER_MODE_P2MP on Non-TD3 SKUs. Will remove this when BRCM provide proper fix to address this issue.
- Why I did it
Fix issue: ptf_nn_agent isn't able to start in syncd-rpc docker on buster.
- How I did it
The issue is fixed by installing python-dev, cffi and nnpy for python 2 explicitly.
- How to verify it
Run copp test on RPC image.
**- Why I did it**
In thermalctd, when speed of fan exceeds threshold, the fan status will be saved as "bad". So in system health, it is better to check fan speed before fan status. In this case, if fan speed exceeds threshold, we get more detailed information.
**- How I did it**
Move fan speed check logic before fan status check
**- How to verify it**
Manual test
BRCM SDK 6.5.21 includes firmware updates (premier cancun) for TD3 platforms. The firmware update is required on TD3 platforms, which is packaged with BCMSAI 4.3.0.10.
**- How I did it**
Updated BCM config with a new variable that specifies the firmware package path. SDK uses this path to locate firmware packages and load during cold boot.
**- How to verify it**
bsv
BRCM SAI ver: [4.3.0.10], OCP SAI ver: [1.7.1], SDK ver: [sdk-6.5.21] CANCUN ver: [5.3.3]
drivshell>
admin@str2-7050cx3-acs-02:~$ bcmsh
Press Enter to show prompt.
Press Ctrl+C to exit.
NOTICE: Only one bcmsh or bcmcmd can connect to the shell at same time.
drivshell>cancun stat
cancun stat
UNIT0 CANCUN:
CIH: LOADED
Ver: 06.06.01
CMH: LOADED
Ver: 06.06.01
SDK Ver: 06.05.21
CCH: LOADED
Ver: 06.06.01
SDK Ver: 06.05.21
CEH: LOADED
Ver: 06.06.01
SDK Ver: 06.05.21
drivshell>
Starting with BRCM SAI 4.3.1.5 we see the following :ethtool not fount" error in syslog during boot up:
```
Jan 27 07:36:14.712472 str-s6100-acs-1 INFO syncd#/supervisord: syncd sh: 1:
Jan 27 07:36:14.712844 str-s6100-acs-1 INFO syncd#/supervisord: syncd ethtool: not found
Jan 27 07:36:14.713228 str-s6100-acs-1 INFO syncd#/supervisord: syncd #015
Jan 27 07:36:14.713840 str-s6100-acs-1 INFO syncd#syncd: [0] SAI_API_HOSTIF:_brcm_sai_hostif_speed_set:11894 cmd ethtool -s Ethernet39 speed 40000 rc:32512
Jan 27 07:36:14.717204 str-s6100-acs-1 NOTICE swss#orchagent: :- setHostIntfsOperStatus: Set operation status DOWN to host interface Ethernet39
Jan 27 07:36:14.717204 str-s6100-acs-1 NOTICE swss#orchagent: :- initPort: Initialized port Ethernet39
Jan 27 07:36:14.717204 str-s6100-acs-1 NOTICE swss#orchagent: :- initializePort: Initializing port alias:Ethernet36 pid:1000000000040
Jan 27 07:36:14.726793 str-s6100-acs-1 NOTICE swss#portsyncd: :- onMsg: nlmsg type:16 key:Ethernet36 admin:0 oper:0 addr:4c:76:25:f5:48:80 ifindex:75 master:0
Jan 27 07:36:14.727967 str-s6100-acs-1 NOTICE swss#portsyncd: :- onMsg: Publish Ethernet36(ok) to state db
Jan 27 07:36:14.729331 str-s6100-acs-1 NOTICE swss#orchagent: :- addHostIntfs: Create host interface for port Ethernet36
Jan 27 07:36:14.752398 str-s6100-acs-1 INFO syncd#/supervisord: syncd sh: 1: ethtool: not found#015
Jan 27 07:36:14.752689 str-s6100-acs-1 INFO syncd#syncd: [0] SAI_API_HOSTIF:_brcm_sai_hostif_speed_set:11894 cmd ethtool -s Ethernet36 speed 40000 rc:32512
Jan 27 07:36:14.756050 str-s6100-acs-1 NOTICE swss#orchagent: :- setHostIntfsOperStatus: Set operation status DOWN to host interface Ethernet36
Jan 27 07:36:14.757585 str-s6100-acs-1 NOTICE swss#orchagent: :- initPort: Initialized port Ethernet36
```
It seems that starting with BRCM SAI 4.2.1.5 syncd is using ethtool to set the host interface speed and since this ethtool was not part of the syncd Docker, we observe these "ethtool not found" issue.
This update includes the following changes
> [syncd armhf] Fix syncd crash when running community test suites (#777)
> Revert "[tests]:Add unittest for MACsec on p2p establishment (#771)"
> [tests]:Add unittest for MACsec on p2p establishment (#771)
> [tests] Enable azure pipeline make check to respect unittests (#760)
**- Why I did it**
As per https://pypi.org/project/pip/ pip 21.0 does not not support Python 2 from Jan 2021. Most places in the codebase have already been pinned, but this one was missed.
**- How I did it**
Pin pip2 < version 21 in build_debian.sh
fixesAzure/sonic-utilities#1389
With the recent changes in sudoer files. The show commands fails for the read-only users.
The problem here is the 'docker ps' is failing in the function [get_routing_stack()](8a1109ed30/show/main.py (L54)) therefore all the CLI commands are failing.
Signed-off-by: Arvindsrinivasan Lakshmi Narasimhan <arlakshm@microsoft.com>
Signed-off-by: Arvindsrinivasan Lakshmi Narasimhan <arlakshm@microsoft.com>
- Why I did it
The command sudo ip netns identify <pid> is used in function get_current_namespace
to check in the cli command is running in host context or within a namespace.
This function is used for every CLI command and command sudo ip netns identify <pid> needs to be added in sudoer files to allow users with RO access to run show cli commands
This problem is not there on single asic platforms.
- How I did it
Add ip netns identify [0-9]* to sudoers file.
**- Why I did it**
sonic-utilities will become dependent upon sonic-platform-common as of https://github.com/Azure/sonic-utilities/pull/1386.
**- How I did it**
- Add sonic-platform-common as a dependency in docker-sonic-vs.mk
- Additionally, no longer install Python 2 packages of swsssdk and sonic-py-common, as they should no longer be needed.