Commit Graph

523 Commits

Author SHA1 Message Date
Tamer Ahmed
2cc98b4bac [platform] Add Support For Environment Variable File (#5010)
* [platform] Add Support For Environment Variable

This PR adds the ability to read environment file from /etc/sonic.
the file contains immutable SONiC config attributes such as platform,
hwsku, version, device_type. The aim is to minimize calls being made
into sonic-cfggen during boot time.

singed-off-by: Tamer Ahmed <tamer.ahmed@microsoft.com>
2020-09-28 21:14:39 +00:00
Joe LeVeque
b70c6f72b2 [dockers][supervisor] Increase event buffer size for dependent-startup (#5247)
When stopping the swss, pmon or bgp containers, log messages like the following can be seen:

```
Aug 23 22:50:43.789760 sonic-dut INFO swss#supervisord 2020-08-23 22:50:10,061 ERRO pool dependent-startup event buffer overflowed, discarding event 34
Aug 23 22:50:43.789760 sonic-dut INFO swss#supervisord 2020-08-23 22:50:10,063 ERRO pool dependent-startup event buffer overflowed, discarding event 35
Aug 23 22:50:43.789760 sonic-dut INFO swss#supervisord 2020-08-23 22:50:10,064 ERRO pool dependent-startup event buffer overflowed, discarding event 36
Aug 23 22:50:43.789760 sonic-dut INFO swss#supervisord 2020-08-23 22:50:10,066 ERRO pool dependent-startup event buffer overflowed, discarding event 37
```

This is due to the number of programs in the container managed by supervisor, all generating events at the same time. The default event queue buffer size in supervisor is 10. This patch increases that value in all containers in order to eliminate these errors. As more programs are added to the containers, we may need to further adjust these values. I increased all buffer sizes to 25 except for containers with more programs or templated supervisor.conf files which allow for a variable number of programs. In these cases I increased the buffer size to 50. One final exception is the swss container, where the buffer fills up to ~50, so I increased this buffer to 100.

Resolves https://github.com/Azure/sonic-buildimage/issues/5241
2020-09-28 16:12:53 +00:00
yozhao101
7580c846ad
[201911][Monit] Unmonitor processes in disabled containers (#5462)
We want to let Monit to unmonitor the processes in containers which are disabled in `FEATURE` table such that
Monit will not generate false alerting messages into the syslog.

- Backport of https://github.com/Azure/sonic-buildimage/pull/5153 to the 201911 branch

Signed-off-by: Yong Zhao <yozhao@microsoft.com>
2020-09-25 00:30:41 -07:00
Wirut Getbamrung
10534a39eb
[device/celestica]: Update DX010 platform APIS on 201911 branch (#5416)
* [device/celestica]: DX010 platform API update (#4608)

- Fix fancontrol.service path
- Fix return temp format in thermal API
- Improve init time in chassis API
- Upgrade sfp API

* [device/celestica]: Update DX010 reboot cause API (#4678)

- Add more cases support in DX010 reboot cause API
    - Add Thermal Overload reboot cause support
    - Add new Watchdog reboot cause support

* [device/celestica]: using sonic-py-common package
2020-09-24 10:20:15 -07:00
Aravind Mani
a637c7f9b7
[201911] Dell S6100 fix mux issue (#5415)
- Why I did it
For fixing PCA MUX attachment issue in Dell S6100 platform.

- How I did it
Wait till IOM MUX powered up properly and start I2C enumeration.
2020-09-22 15:29:21 -07:00
Arun Saravanan Balachandran
cb55779937 DellEMC S6100: Log HW reboot reason registers (#5361) 2020-09-19 14:09:31 -07:00
Samuel Angebault
4a185fc09b
[Arista] Update driver submodules. (#5408)
- Fix show platform firmware platform plugin error
- Fix import behavior for arista's sonic_platform implementation
- Fix fan led color and detection on Smartsville
2020-09-18 21:40:49 -07:00
judyjoseph
ef89cb96a0
[201911] Broadcom SAI 3.7.5.2 (#5330)
* Broadcom SAI 3.7.5.2, with the fixes for following CSP's 

e5e06f4 Fix for CS00010914668(KB0029456/SDK-218585) and CS00010503275(KB0029315/SDK-213475)
cf4f8da Solution for CS00010775359 in 3.7
0348f03 Patch for CS00010897814
a2d2fdd Patch for CS00010817763
4d362e8 Patch for CS00010636736
557ddc6 Solution for CS00010443542
0f122f1 Port SDK SER fix for dynamic tables (SDK-175398 / SDK-221245) to SAI 3.7
37e5c5e Fix for CS00010790550
64daf8a Fix for CS00010726597
e7f000e Fix for CS00010697761
44b7ab3 Solution for CS00010617498.
1475c24 CSP10503275 request to pull KB0029314 into 3.7
2020-09-18 17:17:30 -07:00
Samuel Angebault
cde4e88f3a
[201911][Arista] Update arista drivers submodules (#5290)
- fix watchdog timeout units
 - remove arista bind mounts for docker-snmp
 - add python3 mounts for pmon
2020-09-01 20:19:13 -07:00
Aravind Mani
fef4626804
Dell S6100 Port I2C changes to 201911 branch (#5148) 2020-08-18 14:39:36 -07:00
Guohan Lu
da7e36e869 [docker-syncd-brcm]: use service dependency in supervisord to start services 2020-08-15 22:31:23 -07:00
Joe LeVeque
309a098b21
[201911][Python] Migrate applications/scripts to import sonic-py-common package (#5132)
As part of consolidating all common Python-based functionality into the new sonic-py-common package, this pull request:
1. Redirects all Python applications/scripts in sonic-buildimage repo which previously imported sonic_device_util or sonic_daemon_base to instead import sonic-py-common, which was added to the 201911 branch in https://github.com/Azure/sonic-buildimage/pull/5063
2. Replaces all calls to `sonic_device_util.get_platform_info()` to instead call `sonic_py_common.get_platform()` and removes any calls to `sonic_device_util.get_machine_info()` which are no longer necessary (i.e., those which were only used to pass the results to `sonic_device_util.get_platform_info()`.
3. Removes unused imports to the now-deprecated sonic-daemon-base package and sonic_device_util.py module

This is a step toward resolving https://github.com/Azure/sonic-buildimage/issues/4999
2020-08-13 16:35:53 -07:00
abdosi
ea1b25a210
[libsaibcm] updated libsaibcm debian package for (#5138)
3.7.5.1-3 in 201911

Signed-off-by: Abhishek Dosi <abdosi@microsoft.com>
2020-08-11 08:55:43 -07:00
Joe LeVeque
840be7732c
[201911][devices] Update SFP keys to align with new standard (#4976)
Align SFP key names with new standard defined in https://github.com/Azure/sonic-platform-common/pull/97

- hardwarerev -> hardware_rev
- serialnum -> serial
- manufacturename -> manufacturer
- modelname -> model
- Connector -> connector
2020-07-16 11:09:47 -07:00
Samuel Angebault
41ba95ee3f
[arista] update Arista drivers submodules (#4967)
Merge most of the changes that recently made it to master.
This will be the last such merge operation and future commits will only cherry-pick fixes and targeted features.

Major fixes and features,
- reboot cause enhancement with more hardware reboot cause reporting
- fix reboot cause parsing issue with 201811 release
- fix get_change_event logic
- fix error message on missing sysfs entry by our plugins
- final piece of the platform refactors for fan and sensor reporting through the platform API
2020-07-16 10:36:07 -07:00
arlakshm
7c699df654 Add support for bcmsh and bcmcmd utlitites in multi ASIC devices (#4926)
Signed-off-by: Arvindsrinivasan Lakshmi Narasimhan <arlakshm@microsoft.com>
This PR has changes to support accessing the bcmsh and bcmcmd utilities on multi ASIC devices
Changes done
- move the link of /var/run/sswsyncd from docker-syncd-brcm.mk to docker_image_ctl.j2
- update the bcmsh and bcmcmd scripts to take -n [ASIC_ID] as an argument on multi ASIC platforms
2020-07-11 09:47:24 -07:00
abdosi
43bb028ed6
[brcmsai]: Updated BRCM SAI Debina package to 3.7.5.1-2 (#4916)
Fix for Copp Rules not having Policer Rate-Limit applied.

Signed-off-by: Abhishek Dosi <abdosi@microsoft.com>
2020-07-08 06:47:33 -07:00
paavaanan
369bf887d4 [Dell]: DellEMC S6100 disable pericom/xlinx chipset (#4868)
- Xilinx/pericom peripherals are not actively used in DellEMC S6100 switch.
- These peripherals are throwing PCIE corrected messages in some of the units and filling syslog.
- Since it is not usable disabling it at startup.
2020-07-05 15:37:38 -07:00
yozhao101
bbcd4c6235 [Monit] Use the string "/usr/bin/syncd\s" to monitor the syncd process (#4706)
**- Why I did it**
After discussed with Joe, we use the string "/usr/bin/syncd\s" in Monit configuration file to monitor 
syncd process on Broadcom and Mellanox. Due to my careless, I did not find this bug during the 
previous testing. If we use the string "/usr/bin/syncd" in Monit configuration file to monitor the 
syncd process, Monit will not detect whether syncd process is running or not. 

If we ran the command  `sudo monit procmactch “/usr/bin/syncd”` on Broadcom, there will be three 
processes in syncd container which matched this "/usr/bin/syncd": `/bin/bash /usr/bin/syncd.sh
wait`, `/usr/bin/dsserve /usr/bin/syncd –diag -u -p /etc/sai.d/sai.profile` and `/usr/bin/syncd –diag -
u -p /etc/sai.d/said.profile`. Monit will select the processes with the highest uptime (at there 
`/bin/bash /usr/bin/syncd.sh wait`) to match and did not select `/usr/bin/syncd –diag -u -p
/etc/sai.d/said.profile` to match. 

Similarly, On Mellanox Monit will also select the process with the highest uptime (at there 
`/bin/bash /usr/bin/syncd.sh wait`) to match and did not select `/usr/bin/syncd –diag -u -p
/etc/sai.d/said.profile` to match.

That is why Monit is unable to detect whether syncd process is running or not if we use the string “/usr/bin/syncd” in Monit configuration file. If we use the string "/usr/bin/syncd\s" in Monit configuration file, Monit can filter out the process `/bin/bash /usr/bin/syncd.sh wait` and thus can correctly monitor the syncd process.

**- How I did it**

**- How to verify it**

Signed-off-by: Yong Zhao <yozhao@microsoft.com>
2020-06-28 07:29:59 -07:00
Wirut Getbamrung
3d0126baeb [platform-celestica]: Update fancontrol service for Seastone-DX010 device (#3690)
* [platform/cel]: add fancontrol service support for dx010

* [device/celestica]: add hysteresis temp to dx010 fancontrol configuration
2020-06-28 07:27:20 -07:00
abdosi
a84b534ed7
[broadcom-sai]: Updated broadcom SAI to fix High CPU on TH/Th2 platform. (#4859)
Verified after loading on TH platforms cpu usage gone down:

Previous:
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
2521 root 20 0 1512860 360452 63540 S 144.6 4.4 8:00.03 syncd

After Fix:
7500 root 20 0 1592420 350912 64184 S 45.4 4.3 3:50.99 syncd
2020-06-27 01:10:41 -07:00
yozhao101
c2364cf03e
[201911][dockers] Update critical_processes file syntax (#4854)
Backport of https://github.com/Azure/sonic-buildimage/pull/4831 to the 201911 branch
2020-06-26 11:37:05 -07:00
Arun Saravanan Balachandran
030570de81 [DellEMC]: EEPROM decoder for S6000, S6000-ON (#4718)
**- Why I did it**

For decoding system EEPROM of S6000 based on Dell offset format and S6000-ON’s system EEPROM in ONIE TLV format.

**- How I did it**

- Differentiate between S6000 and S6000-ON using the product name available in ‘dmi’  ( “/sys/class/dmi/id/product_name” )
- For decoding S6000 system EEPROM in Dell offset format and updating the redis DB with the EEPROM contents, added a new class ‘EepromS6000’ in eeprom.py, 
- Renamed certain methods in both Eeprom, EepromS6000 classes to accommodate the plugin-specific methods.

**- How to verify it**

- Use 'decode-syseeprom' command to list the system EEPROM details.
- Wrote a python script to load chassis class and call the appropriate methods.

UT Logs: [S6000_eeprom_logs.txt](https://github.com/Azure/sonic-buildimage/files/4735515/S6000_eeprom_logs.txt), [S6000-ON_eeprom_logs.txt](https://github.com/Azure/sonic-buildimage/files/4735461/S6000-ON_eeprom_logs.txt)
Test script: [eeprom_test_py.txt](https://github.com/Azure/sonic-buildimage/files/4735509/eeprom_test_py.txt)
2020-06-16 08:15:28 -07:00
Arun Saravanan Balachandran
093d7731ab
[201911] DellEMC: Skip thermalctld and thermal platform API changes (#4752)
**- Why I did it**

- Skip thermalctld in DellEMC S6000, S6100, Z9100 and Z9264 platforms.
- Change the return type of thermal Platform APIs in DellEMC S6000, S6100 and Z9100 platforms to 'float'.

**- How I did it**

- Add 'skip_thermalctld:true' in pmon_daemon_control.json for DellEMC S6000, S6100, Z9100 and Z9264 platforms.
- Made changes in thermal.py, for 'get_temperature', 'get_high_threshold' and 'get_low_threshold' to return 'float' value.

**- How to verify it**

- Check thermalctld is not running in 'pmon'.
- Wrote a python script to load Chassis class and then call the APIs accordingly and verify the return type.
2020-06-11 10:48:27 -07:00
judyjoseph
ccf12d2ff7
SAI 3.7.5.1 (#4710) 2020-06-07 20:45:12 -07:00
Arun Saravanan Balachandran
98b8d1eee1 DellEMC: get_change_event Platform API implementation for S6000, S6100 and Z9100 (#4593)
For detecting transceiver change events through xcvrd in DellEMC S6000, S6100 and Z9100 platforms.

- In S6000, rename 'get_transceiver_change_event' in chassis.py to 'get_change_event' and return appropriate values.
- In S6100, implement 'get_change_event' through polling method (poll interval = 1 second) in chassis.py (Transceiver insertion/removal does not generate interrupts due to a CPLD bug)
- In Z9100, implement 'get_change_event' through interrupt method using select.epoll().
2020-05-27 18:00:45 -07:00
Santhosh Kumar T
045d5e6f23 [DellEMC] S6000 Disable Low power mode by default (#4592) 2020-05-20 07:55:16 -07:00
Santhosh Kumar T
1e3df476e5 [DellEMC] S6100 Last Reboot Reason Thermal Support (#3767) 2020-05-09 18:37:31 -07:00
Samuel Angebault
8456aeba98
Update arista drivers submodules (#4532) 2020-05-04 20:10:50 -07:00
rajendra-dendukuri
e843d994d0 [brcm-th-svk]: Fix errors in BCM956960K switch (#4390)
Fix Broadcom TH SVK boot up crash
2020-04-30 22:10:16 -07:00
Wataru Ishida
674f72e2ce [broadcom]: respect the current network namespace when creating netdev (#3896)
https://github.com/Broadcom-Switch/OpenNSL/issues/26

Signed-off-by: Wataru Ishida <ishida@nel-america.com>
2020-04-27 08:50:23 -07:00
Santhosh Kumar T
214541b75b [DellEMC] S6000 - Thermal support - Last Reboot Reason (#4097)
- Added support for Thermal event in Last Reboot Reason "show reboot-cause" command.
- Added support for sending log message in case of thermal shutdown.

sonic NOTICE root: Shutting down due to over temperature (40 degree, 30 degree, 34 degree)
2020-04-27 08:50:23 -07:00
abdosi
5839a01abd [sonic-buildimage] libsaibcm Debian package update (#4439)
from 3.7.3.3-3 to 3.7.3.3-4
Fixes for PFC WD

Signed-off-by: Abhishek Dosi <abdosi@microsoft.com>
2020-04-19 16:29:32 -07:00
Joe LeVeque
d09fba379f
[201911][Juniper QFX5210] Fix Python errors (#4413) 2020-04-11 16:42:56 -07:00
Srideep
c080e80165 [DellEMC] S5232 platform updates (#4360)
FPGA driver crash fix for stale buffer in i2c transfer
LED firmware load issue fix.
10G port swapfix
psu/sfp bug fixes to report correct states/status of hw
2020-04-10 21:22:48 -07:00
Samuel Angebault
8819322210
[Arista] Update drivers submodules (#4353)
* Update arista drivers submodules

* Add device configs for 7060CX2-32S

* Update boot0 and union-mount for 7060CX2-32S

* Add 7170-32C and 7170-32CD support in boot0

* Sync after writting boot configs

* Add 7170-32C and 7170-32CD device configurations

Co-authored-by: Boyang Yu <byu@arista.com>

Co-authored-by: Boyang Yu <byu@arista.com>
2020-04-01 23:26:42 -07:00
Aravind Mani
9385e0201f [DellEMC] Fix Z9100 port index issue (#4309) 2020-03-24 15:15:14 -07:00
Samuel Angebault
fcb5b41e33 [arista] Update platform drivers submodules (#4200)
* Build Arista drivers using DPKG_DEB
* Update arista platform submodule
2020-03-22 23:02:54 -07:00
Mykola F
444c450aa3 [syncd-rpc.mk] install ptf dependancy (#4279)
Signed-off-by: Mykola Faryma <mykolaf@mellanox.com>
2020-03-19 22:18:49 -07:00
Olivier Singla
a8baca0d6e [kernel]: security kernel update to 4.9.189 (#3913)
This patch upgrade the kernel from version
4.9.0-9-2 (4.9.168-1+deb9u3) to 4.9.0-11-2 (4.9.189-3+deb9u2)

Co-authored-by: rajendra-dendukuri <47423477+rajendra-dendukuri@users.noreply.github.com>
2020-03-15 08:52:29 -07:00
byu343
950926a837
[arista]: Add support for Arista Lodoga (#4232)
Backport the support of Arista Lodoga to 201911
2020-03-11 13:12:39 -07:00
abdosi
c495dc3050 [broadcom]: Updated BRCM SAI Debian package revision number to 3.7.3.3-2. (#4182)
This has patch to fix enable/disable attribute for lag member. It's on top of vanilla 3.7.3.3
2020-03-03 13:20:51 -08:00
abdosi
fe9baadc54 Updated the file permission mode to include +x (#4183)
This to avoid git status reporting dirty.

Signed-off-by: Abhishek Dosi <abdosi@microsoft.com>
2020-02-24 11:05:24 -08:00
yozhao101
71225ea4cc [Service] Enable/disable container auto-restart based on configuration. (#4073) 2020-02-13 16:20:21 -08:00
Wirut Getbamrung
68f664beb6 [platform/cel]: Remove afulnx_64 (#3900)
remove afulnx_64 install script
2020-02-13 16:02:07 -08:00
paavaanan
e32d22c0b9 [devices]: DellEMC S6000 fancontrol support (#4048)
- Implemented fancontrol service to monitor S6000 fans and adjust fan speed w.r.t temperature.
- fancontrol.service starts the fancontrol script at startup.
- This script takes the average temperature by reading three sensors and configure FANS to appropritate RPM against the temperature.
- When the temperature is adjusted script will log in syslog for future reference.
- Also, script checks for faulty fans and report the status in syslog.
2020-02-03 15:37:26 -08:00
Dong Zhang
42bffc1215 [MultiDB] (except ./src and ./dockers dirs): replace redis-cli with sonic-db-cli and use new DBConnector (#4035)
* [MultiDB] (except ./src and ./dockers dirs): replace redis-cli with sonic-db-cli and use new DBConnector
* update comment for a potential bug
* update comment
* add TODO maker as review reqirement
2020-02-03 15:36:55 -08:00
yozhao101
d6aee4cc65 [Monit] Change the full process name of syncd in the monit config file. (#4033)
Since the syncd process running on different platforms will have the different full path names, we
change the full path name of process syncd in the monit config file such that it will be universal and is not for a specific vendor.

Signed-off-by: Yong Zhao <yozhao@microsoft.com>
2020-02-03 15:36:24 -08:00
Arun Saravanan Balachandran
c841693007 DellEMC: Platform2.0 API enhancements in DellEMC S6000 and other API changes (#3956) 2020-02-03 15:34:12 -08:00
Arun Saravanan Balachandran
939de3d5cc DellEMC : Platform2.0 API Implementation [S6100, S6000, Z9100] (#3740) 2020-02-03 15:31:39 -08:00