Found another syncd timing issue related to the clock going backwards.
To be safe, disable the NTP long jump.
Signed-off-by: Ying Xie <ying.xie@microsoft.com>
- Add .gitignore files in each subdirectory of src/ to reduce the size of the .gitignore file in the project root and make it easier to maintain (i.e., if a directory in src/ is removed, there will not be outdated entries in the root .gitignore file).
- Also add missing .gitignore entries and remove outdated entries and duplicates.
**Why I did it**
- Added support for S6000 new HWSKU-Q24S32
**How I did it**
- Modified port_config.ini and the TD2 settings to bring the ports up.
**How to verify it**
- Check LLDP neighbors, the LLDP table, interface status, EEPROM and other show commands.
- Run OIR, LED and traffic tests.
Update the AS7312-54X, AS7312-54XS and AS7315-27XB config.bcm files to eliminate the following error message:
configuration: format error in /usr/share/sonic/hwsku/th-as7312-48x25G+6x100G.config.bcm on line 110 (ignored)#15
The -sv2 suffix was used to differentiate SNMP Dockers when we transitioned from "SONiCv1" to "SONiCv2", about four years ago. The old Docker materials were removed long ago; there is no need to keep this suffix. Removing it aligns the name with all the other Dockers.
Update sonic-snmpagent submodule with PRs:
89b7b2c [Multi-asic]: Namespace support for LLDP and Sensor tables (#131)
fcb8955 Simplify test code (#132)
a677876 [Multi-asic]: Support multi-asic platform (#126)
Update sonic-py-swsssdk submodule with PRs:
132f8d5 [MultiDB]: use python class composition to avoid confusion in base class (#74)
Signed-off-by: SuvarnaMeenakshi <sumeenak@microsoft.com>
**- Why I did it**
To decode the S6000 system EEPROM, which uses the Dell offset format, and the S6000-ON system EEPROM, which uses the ONIE TLV format.
**- How I did it**
- Differentiate between S6000 and S6000-ON using the product name available in DMI ("/sys/class/dmi/id/product_name").
- To decode the S6000 system EEPROM in Dell offset format and update the Redis DB with the EEPROM contents, added a new class 'EepromS6000' in eeprom.py.
- Renamed certain methods in both the Eeprom and EepromS6000 classes to accommodate the plugin-specific methods.
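
The platform detection described above can be sketched roughly as follows. This is an illustration only, not the code in eeprom.py: the function names, the return labels and the assumption that the S6000-ON product name contains the string "S6000-ON" are all hypothetical.

```python
from pathlib import Path

# Path taken from the description above.
DMI_PRODUCT_NAME = Path("/sys/class/dmi/id/product_name")


def read_product_name(path=DMI_PRODUCT_NAME):
    """Return the DMI product name, or an empty string if it cannot be read."""
    try:
        return Path(path).read_text().strip()
    except OSError:
        return ""


def eeprom_format(product_name):
    """Pick an EEPROM format from the product name.

    Assumption: the S6000-ON reports a product name containing "S6000-ON";
    anything else is treated as the original S6000.
    """
    if "S6000-ON" in product_name:
        return "onie-tlv"      # presumably handled by the Eeprom class
    return "dell-offset"       # handled by the new EepromS6000 class
```

A caller would pass `read_product_name()` to `eeprom_format()` and instantiate the matching class.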
**- How to verify it**
- Use 'decode-syseeprom' command to list the system EEPROM details.
- Wrote a python script to load chassis class and call the appropriate methods.
UT Logs: [S6000_eeprom_logs.txt](https://github.com/Azure/sonic-buildimage/files/4735515/S6000_eeprom_logs.txt), [S6000-ON_eeprom_logs.txt](https://github.com/Azure/sonic-buildimage/files/4735461/S6000-ON_eeprom_logs.txt)
Test script: [eeprom_test_py.txt](https://github.com/Azure/sonic-buildimage/files/4735509/eeprom_test_py.txt)
- What I did
To allow the SONiC community to check in the platform capability file (platform.json) directly under the device folder, we need a test that makes sure the contents of this file are compliant with the platform capability design specified in the DPB HLD doc.
- How I did it
Added platformJson_checker.py file in Test folder.
Signed-off-by: Sangita Maity <sangitamaity0211@gmail.com>
**- Why I did it**
When I tested the auto-restart feature of the swss container by manually killing one of its critical processes, swss was stopped. The syncd container, as the peer container, should then also be stopped. However, I found that the syncd container was sometimes stopped and sometimes not. When it is not stopped, the cause is that the process (/usr/local/bin/syncd.sh stop) executing the stop() function gets stuck between lines 164–167. Systemd waits for 90 seconds and then kills this process.
164 # wait until syncd quit gracefully
165 while docker top syncd$DEV | grep -q /usr/bin/syncd; do
166 sleep 0.1
167 done
The first thing I did was to profile how long this while loop spins when the syncd container is stopped normally after the swss container is stopped. The result is 5 or 6 seconds. When the syncd container is stopped normally, two messages are written to syslog:
str-a7050-acs-3 NOTICE syncd#dsserve: child /usr/bin/syncd exited status: 134
str-a7050-acs-3 INFO syncd#supervisord: syncd [5] child /usr/bin/syncd exited status: 134
The second thing I did was to add a timer to the condition of the while loop so that the loop is forced to exit after 20 seconds.
After that, testing showed that the syncd container is stopped normally if swss is stopped first. One more thing worth mentioning: if the syncd container stops within 5 or 6 seconds, the two log messages still appear in syslog. However, if the while loop runs longer than 20 seconds and is forced to exit, the syncd container is stopped but I did not see these two messages in syslog. Further, although I observed that the auto-restart feature of the swss container now works correctly, I cannot be sure that the issue of the syncd container failing to stop will not recur in the future.
**- How I did it**
I added a timer around the while loop in the stop() function. The loop exits after spinning for 20 seconds.
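
The bounded-wait pattern can be sketched as below. Note that this is Python, not the shell used in syncd.sh, and it is not the actual change: `wait_until` and its parameters are hypothetical names chosen for illustration. The 20 second limit and 0.1 second poll interval come from the description above.

```python
import time


def wait_until(predicate, timeout=20.0, interval=0.1,
               clock=time.monotonic, sleep=time.sleep):
    """Poll predicate() until it returns True or `timeout` seconds elapse.

    Returns True if the predicate became true, False on timeout.
    """
    deadline = clock() + timeout
    while not predicate():
        if clock() >= deadline:
            return False  # give up instead of spinning forever
        sleep(interval)
    return True
```

In the stop() case the predicate would correspond to "the syncd process no longer appears in the container", and a False result means the loop was forced to exit.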
Signed-off-by: Yong Zhao <yozhao@microsoft.com>
- Sensor and Fan information added to primary platforms for thermal API.
- Refactors involving better abstractions, code reuse and dead code removal.
- Improvements to the diag capabilities
- Pylintrc added to improve code quality. Will become fatal at a later time.
Co-authored-by: Baptiste Covolato <baptiste@arista.com>
Cleanup description string
The first port (the management port) is excluded from the general port naming scheme.
before:
|on GNS3 |in SONiC |
|---------|---------|
|Ethernet0|eth0 |
|Ethernet1|Ethernet0|
|Ethernet2|Ethernet4|
|Ethernet3|Ethernet8|
after:
|on GNS3 |in SONiC |
|---------|---------|
|eth0 |eth0 |
|Ethernet0|Ethernet0|
|Ethernet1|Ethernet4|
|Ethernet2|Ethernet8|
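
The "after" mapping can be expressed as a small function. This is a sketch inferred from the three rows of the table only: the stride of 4 and the name `sonic_port_name` are assumptions, not taken from the change itself.

```python
def sonic_port_name(gns3_name, lanes_per_port=4):
    """Map a GNS3 interface name to the SONiC name in the "after" table.

    Assumption: SONiC port numbers advance by 4 per port, which is
    inferred from the rows shown in the table and not from the code.
    """
    if gns3_name == "eth0":
        return "eth0"  # management port keeps its own name
    prefix = "Ethernet"
    if not gns3_name.startswith(prefix):
        raise ValueError(f"unexpected interface name: {gns3_name!r}")
    index = int(gns3_name[len(prefix):])
    return f"{prefix}{index * lanes_per_port}"
```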
Signed-off-by: Masaru OKI <masaru.oki@gmail.com>
**- Why I did it**
We need RIF counters to be enabled by default. Flex Counter does probe for supported counters. If a platform does not support RIF counters, SAI will return NOT_SUPPORTED and Flex Counter will stop polling the counter.
**- How to verify it**
After a fresh install, the RIF counter group is enabled by default:
$ counterpoll show
Type Interval (in ms) Status
-------------------- ------------------ --------
QUEUE_STAT default (10000) enable
PORT_STAT default (1000) enable
RIF_STAT default (1000) enable
QUEUE_WATERMARK_STAT default (10000) enable
PG_WATERMARK_STAT default (10000) enable
Signed-off-by: Mykola Faryma <mykolaf@mellanox.com>
I found that with IPv4Network types, calling list(ip_ntwrk.hosts()) is reliable. However, when doing the same with an IPv6Network, I found that the conversion to a list can hang indefinitely. This appears to me to be a bug in the ipaddress.IPv6Network implementation. However, I could not find any other reports on the web.
This patch changes the behavior to call next() on the ip_ntwrk.hosts() generator instead, which returns the IP address of the first host.
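
A minimal sketch of the difference, using only the standard `ipaddress` module. The helper name `first_host` and the example prefixes are illustrative. One possible explanation for the hang, which the investigation above does not confirm, is simply size: a /64 holds 2^64 addresses, so building a list of all hosts may never finish.

```python
import ipaddress


def first_host(network):
    """Return the first usable host address without building a full list."""
    ip_ntwrk = ipaddress.ip_network(network)
    # hosts() is lazy; next() pulls only the first address from it.
    return next(ip_ntwrk.hosts())


print(first_host("10.0.0.0/24"))   # 10.0.0.1
print(first_host("fc00::/64"))     # fc00::1
print(ipaddress.ip_network("fc00::/64").num_addresses)  # 18446744073709551616
```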
Fix hostcfgd so that changes to the "FEATURE" table in ConfigDB are properly handled. Three changes here:
1. Fix indenting such that the handling of each key actually occurs in the `for key in status_data.keys():` loop
2. Add calls to `sudo systemctl mask` and `sudo systemctl unmask` as appropriate to ensure changes persist across reboots
3. Substitute `return`s with `continue`s so that even if one service fails, we still try to handle the others
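
The three changes together have roughly the following shape. This is a sketch, not the hostcfgd code: the function names, the `status` field with `enabled`/`disabled` values, the `<feature>.service` unit naming and the exact command order are assumptions made for illustration.

```python
import subprocess


def run(cmd):
    """Run a command, returning True on success."""
    return subprocess.call(cmd) == 0


def apply_feature_states(status_data, runner=run):
    """Apply each FEATURE entry; return (applied, failed) feature names."""
    applied, failed = [], []
    for key in status_data.keys():
        service = f"{key}.service"  # assumption: feature name maps to unit name
        state = status_data[key].get("status")
        if state == "enabled":
            cmds = [["sudo", "systemctl", "unmask", service],
                    ["sudo", "systemctl", "enable", service],
                    ["sudo", "systemctl", "start", service]]
        elif state == "disabled":
            cmds = [["sudo", "systemctl", "stop", service],
                    ["sudo", "systemctl", "disable", service],
                    ["sudo", "systemctl", "mask", service]]
        else:
            failed.append(key)
            continue  # unknown state: skip this feature, keep going
        if not all(runner(cmd) for cmd in cmds):
            failed.append(key)
            continue  # one failing service must not block the others
        applied.append(key)
    return applied, failed
```

Passing a fake `runner` makes the loop testable without touching systemd.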
Note that the masking is persistent, even if the configuration is not saved. We may want to consider only calling systemctl enable/disable in hostcfgd when the DB table changes, and only call systemctl mask/unmask upon calling config save.
This change allows the recursive `git clean` and `git reset` commands to continue even if they encounter an error in one of the submodules. Previously, if an error was encountered, the operation would terminate with a message similar to the following:
Stopping at 'src/sonic-mgmt-framework'; script returned non-zero status.
* Update sonic-sairedis (sairedis with SAI 1.6 headers)
* Update SAIBCM to 3.7.4.2, which is built upon SAI1.6 headers
* missed updating BRCM_SAI variable, fixed it
* Update SAIBCM to 3.7.4.2, updated link to libsaibcm
* [Mellanox] Update SAI (release:v1.16.3; API:v1.6)
Signed-off-by: Volodymyr Samotiy <volodymyrs@mellanox.com>
* Update sonic-sairedis pointer to include SAI1.6 headers
* [Mellanox] Update SDK to 4.4.0914 and FW to xx.2007.1112 to match SAI 1.16.3 (API:v1.6)
Signed-off-by: Volodymyr Samotiy <volodymyrs@mellanox.com>
* ensure the veth link is up in docker VS container
* [Mellanox] Update SAI (release:v1.16.3.2; API:v1.6)
Signed-off-by: Volodymyr Samotiy <volodymyrs@mellanox.com>
* use 'config interface startup' instead of using ifconfig command, also undid the previous change
Co-authored-by: Volodymyr Samotiy <volodymyrs@mellanox.com>
This change is needed to support parallel build jobs.
Made LIBYANG_PY2 and LIBYANG_PY3 depend on LIBYANG and LIBYANG_CPP. Also LIBYANG_CPP depends on LIBYANG.
Signed-off-by: Praveen Chaudhary <pchaudhary@linkedin.com>
Images built from master branch and installed on devices where we mount /var/lib/docker in RAM (because the HDD is small) were failing to boot properly. The Docker service failed to start because /var/lib/docker was filled to 100%. This is due to the increase in total number of containers in the image.
As of today, /var/lib/docker contains 1.3 GB of data. Therefore, this PR increases the size of the ramdisk to 1.5 GB to accommodate all the containers. Example output below from an Arista-7050-QX32 SKU:
```
admin@sonic:~$ df -h
Filesystem Size Used Avail Use% Mounted on
...
tmpfs 1.5G 1.3G 172M 89% /var/lib/docker
...
```
Changes:
-- Removing the part where build dependencies are installed in setup.py.
-- Adding build dependencies in corresponding rules\..*.mk file.
Signed-off-by: Praveen Chaudhary <pchaudhary@linkedin.com>
- Skip thermalctld in DellEMC S6000, S6100, Z9100 and Z9264 platforms.
- Change the return type of thermal Platform APIs in DellEMC S6000, S6100, Z9100 and Z9264 platforms to 'float'.
* [platform]: Add a new supported platform, Delta-agc032
Switch Vendor: Delta
Switch SKU: Delta-agc032
CPU: BROADWELL-DE
ASIC Vendor: Broadcom
Switch ASIC: Tomahawk3, BCM56980
Port Configuration: 32x400G + 2x10G
- What I did
Add a new Delta platform Delta-agc032.
- How I did it
Add files by following SONiC Porting Guide.
- How to verify it
1. decode-syseeprom
2. sensors
3. psuutil
4. sfputil
5. show interface status
6. bcmcmd
Signed-off-by: zoe-kuan <ZOE.KUAN@deltaww.com>
**- Why I did it**
Advance sonic-py-swsssdk submodule to fix #4632
**- How I did it**
In Python 3, responses from the Redis connector are encoded as byte arrays. They
need to be decoded before they can be accessed as strings.
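
A minimal sketch of the kind of decoding involved. This is not the code in the submodule: `decode_reply` is a hypothetical helper, and UTF-8 is an assumed encoding.

```python
def decode_reply(value, encoding="utf-8"):
    """Recursively decode bytes in a Redis-style reply into str."""
    if isinstance(value, bytes):
        return value.decode(encoding)
    if isinstance(value, dict):
        return {decode_reply(k, encoding): decode_reply(v, encoding)
                for k, v in value.items()}
    if isinstance(value, (list, tuple, set)):
        return type(value)(decode_reply(v, encoding) for v in value)
    return value  # already a str, int, None, ...
```

For example, a reply of `{b"admin_status": b"up"}` becomes `{"admin_status": "up"}`. With redis-py, constructing the client with `decode_responses=True` is another way to get strings back directly.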
**- How to verify it**
sonic-db-cli CONFIG_DB "keys *"
sonic-db-cli CONFIG_DB "hget PORT|Ethernet0 admin_status"
Signed-off-by: Rajendra Dendukuri <rajendra.dendukuri@broadcom.com>