Commit Graph

4056 Commits

Author SHA1 Message Date
mprabhu-nokia
00cea080af
Chassisd to monitor cards in a modular chassis (#5523)
HLD: Azure/SONiC#646

Introducing chassisd process to monitor status of the control, line and fabric cards in a modular chassis.

- Why I did it
Modular Chassis has control-cards, line-cards and fabric-cards along with other peripherals. Chassisd will be a central entity that has visibility of the entire chassis.

- How I did it
Chassisd process will monitor cards in the main thread. Another configuation_handling_task is created to listen to CONFIG_DB for admin_status up/down events. The monitored status is persisted in REDIS-DB.
2020-12-15 16:28:58 -08:00
zhenggen-xu
182a809dc3
[docker-vs][docker-orchagent] install python3 dependent packages for restore_neighbors.py (#6207)
Install the necessary python3 dependent packages to convert restore_neighbor.py 
to support python3 as python2 is EOL. See: Azure/sonic-swss#1542

Signed-off-by: Zhenggen Xu <zxu@linkedin.com>
2020-12-15 11:06:30 -08:00
Lawrence Lee
290f66bbb8
[minigraph.py]: Prefer parsing device type from <ElementType> (#6184)
* Parse device type from <ElementType> first in <PngDec>
* Fall back to <Device> type attribute if no <ElementType> is found

Signed-off-by: Lawrence Lee <lawlee@microsoft.com>
2020-12-15 10:20:44 -08:00
shlomibitton
a6aaffd2ad
[kdump] Add more kernel panic conditions for vmcore dump (#6095)
Create new file to "sysctl.d" with desired panic conditions.
It will trigger a vmcore dump using kdump-tools on these situations.

Signed-off-by: Shlomi Bitton <shlomibi@nvidia.com>
2020-12-15 08:54:13 -08:00
rajendra-dendukuri
b60448a006
kdump: Add default kdump command line arguments (#6180)
The default /etc/default/kdump-tools file provided by the kdump-tools
package doesn't set a value for KDUMP_CMDLINE_APPEND.

The default kdump command line arguments need to be set in order
to extend them to use additional arguments required for SONiC
platforms.

Signed-off-by: Rajendra Dendukuri <rajendra.dendukuri@broadcom.com>
2020-12-15 08:52:23 -08:00
Sabareesh-Kumar-Anandan
9f4ca01388
[sonic-config-engine] Adding dependent pkgs needed for arm compilation (#6186)
libxslt-dev and libz-dev are dependencies for lxml==4.6.1 which is required for pyangbind==0.8.1

lxml-4.6.2-cp37-cp37m-manylinux1_x86_64.whl is directly downloaded in amd64 whereas in arm this is built from lxml-4.6.2.tar.gz

Signed-off-by: Sabareesh Kumar Anandan <sanandan@marvell.com>
2020-12-15 08:44:46 -08:00
Sabareesh-Kumar-Anandan
3cd70b88b7
[marvell][platform] Checking file presence before access (#6208)
In marvell_et644m platform scripts, I have added a check to confirm the file availability before accessing it.

Signed-off-by: Sabareesh Kumar Anandan <sanandan@marvell.com>
2020-12-15 08:43:12 -08:00
Wirut Getbamrung
4257c792a2
[device/celestica]: Add xcvrd event support for Seastone-DX010 (#5896)
- Add sysfs interrupt to notify userspace app of external interrupt
- Implement get_change_event() in chassis api.
2020-12-14 10:22:56 -08:00
Shi Su
9580b0407f
[warm-reboot] Remove warmboot file path that overrides the default path (#6198)
- Why I did it
The sai.profile file in kvm images overrides the warmboot file with path /var/cache/sai_warmboot.bin. Since the directory /var/cache is not mounted in syncd, it will be cleared in an image upgrade, the warm-reboot image upgrade will fail if the file is put in the directory.

Fix #6183

- How I did it
Remove the path that overrides the default path. The warmboot file path will then be the default value /var/warmboot/sai-warmboot.bin. Since /var/warmboot/ is mounted by /host/warmboot/ in the host, it could survive an image upgrade.

- How to verify it
Tested warm reboot upgrading kvm image locally.
2020-12-13 22:17:33 -08:00
Stephen Sun
e010d83fc3
[Dynamic buffer calc] Support dynamic buffer calculation (#6194)
**- Why I did it**
To support dynamic buffer calculation.
This PR also depends on the following PRs for sub modules
- [sonic-swss: [buffermgr/bufferorch] Support dynamic buffer calculation #1338](https://github.com/Azure/sonic-swss/pull/1338)
- [sonic-swss-common: Dynamic buffer calculation #361](https://github.com/Azure/sonic-swss-common/pull/361)
- [sonic-utilities: Support dynamic buffer calculation #973](https://github.com/Azure/sonic-utilities/pull/973)

**- How I did it**
1. Introduce field `buffer_model` in `DEVICE_METADATA|localhost` to represent which buffer model is running in the system currently:
    - `dynamic` for the dynamic buffer calculation model
    - `traditional` for the traditional model in which the `pg_profile_lookup.ini` is used
2. Add the tables required for the feature:
   - ASIC_TABLE in platform/\<vendor\>/asic_table.j2
   - PERIPHERAL_TABLE in platform/\<vendor\>/peripheral_table.j2
   - PORT_PERIPHERAL_TABLE on a per-platform basis in device/\<vendor\>/\<platform\>/port_peripheral_config.j2 for each platform with gearbox installed.
   - DEFAULT_LOSSLESS_BUFFER_PARAMETER and LOSSLESS_TRAFFIC_PATTERN in files/build_templates/buffers_config.j2
   - Add lossless PGs (3-4) for each port in files/build_templates/buffers_config.j2
3. Copy the newly introduced j2 files into the image and rendering them when the system starts
4. Update the CLI options for buffermgrd so that it can start with dynamic mode
5. Fetches the ASIC vendor name in orchagent:
   - fetch the vendor name when creates the docker and pass it as a docker environment variable
   - `buffermgrd` can use this passed-in variable
6. Clear buffer related tables from STATE_DB when swss docker starts
7. Update the src/sonic-config-engine/tests/sample_output/buffers-dell6100.json according to the buffer_config.j2
8. Remove buffer pool sizes for ingress pools and egress_lossy_pool
   Update the buffer settings for dynamic buffer calculation
2020-12-13 11:35:39 -08:00
Qi Luo
6db01243d9
[sonic-swss-common] Update submodule (#6197)
Including commits in sonic-swss-common repo:
```
40e695d 2020-12-12 | [pyext] Implement DBConnector::pubsub() to emulate python Redis.pubsub() (#427) [Qi Luo]
9a2bf6c 2020-12-12 | Add fdb_flush.v2.lua script to makefile (#425) [Kamil Cudnik]
7d60e7e 2020-12-07 | Handle exception in Select (#424) [Qi Luo]
```
2020-12-13 11:34:15 -08:00
Qi Luo
25826626aa
[baseimage]: No need to apt-mark manual
It makes no difference during build.
2020-12-12 11:24:51 -08:00
Tamer Ahmed
cbbda09599
[relay]: Prevent Buffer Overrun Of Malformed DHCP Packet (#6057)
[dhcp-relay]: Prevent Buffer Overrun Of Malformed DHCP Packet

The add/strip relay agent options does not take into account the buffer
length and so it is possible to overrun the buffer. The issue will
result in contents from previous packet being added to the current one.

signed-off-by: Tamer Ahmed <tamer.ahmed@microsoft.com>
2020-12-11 16:28:05 -08:00
Dong Zhang
6999ce5282
[sonic-utilities] update submodule for supportting MultiDB warmboot backing up database (#6181)
update sonic-utilites submodule, adding MultiDB warmboot backing up database support:

*    [MultiDB] add MultiDB warmboot support - backing up database (#1205) 
    * lua script to backup all database into one database and create rdb file - following earlier multiDB warmboot design at https://github.com/Azure/SONiC/blob/master/doc/database/multi_database_instances.md
    * copy this rdb file as before to WARM_DIR
    * restoring database part is in another PR at sonic-buildimage  https://github.com/Azure/sonic-buildimage/pull/5773, they depend on each other
2020-12-11 11:30:50 -08:00
Junchao-Mellanox
51c77b179f
[Mellanox] Add python3 support for Mellanox platform API (#6175)
python2 is end of life and SONiC is going to support python3. This PR is going to support:

1. Mellanox SONiC platform API python3 support
2. Install both python2 and python3 verson of Mellanox SONiC platform API or pmon and host side
2020-12-11 10:51:31 -08:00
carl-nokia
2cd236cece
[nokia]: IXS7215:fix for dynamic changes reading DOM and LED (#6136)
clean up corner condition when SDK reset and SFP's move

Co-authored-by: Carl Keene <keene@nokia.com>
2020-12-11 07:35:24 -08:00
Prabhu Sreenivasan
77afb8e54d
[ntp]: ntp-systemd-wrapper file is getting overwritten (#6179)
ntp-systemd-wrapper file from files/image_config/ntp was not getting picked up. Added a line on sonic_debian_extension.j2 to copy over the file from files/image_config/ntp after installing the debian package.

Signed-off-by: Prabhu Sreenivasan <prabhu.sreenivasan@broadcom.com>
2020-12-10 23:20:41 -08:00
Joe LeVeque
641b610115
[build]: Install 'wheel' package in sonic-slave-buster (#6182)
Install the 'wheel' package in sonic-slave-buster container to eliminate error messages like the following:

```
  Running setup.py bdist_wheel for watchdog: started
  Running setup.py bdist_wheel for watchdog: finished with status 'error'
  Complete output from command /usr/bin/python -u -c "import setuptools, tokenize;__file__='/tmp/pip-install-Qd3K08/watchdog/setup.py';f=getattr(tokenize, 'open', open)(__file__);code=f.read().replace('\r\n', '\n');f.close();exec(compile(code, __file__, 'exec'))" bdist_wheel -d /tmp/pip-wheel-0AHpMe --python-tag cp27:
  usage: -c [global_opts] cmd1 [cmd1_opts] [cmd2 [cmd2_opts] ...]
     or: -c --help [cmd1 cmd2 ...]
     or: -c --help-commands
     or: -c cmd --help
  
  error: invalid command 'bdist_wheel'
  
  ----------------------------------------
  Failed building wheel for watchdog

```

These error messages appear to have no impact on the image build, because the Python package seems to still get installed successfully afterward, just the building of a wheel package fails. Therefore, this is more of a cosmetic fix than an actual bug.
2020-12-10 23:18:58 -08:00
Lawrence Lee
fd4433d836
[minigraph.py]: Remove prefix length from peer switch loopback address (#6174)
* PEER_SWITCH table in config DB expects a standalone IP address w/o a prefix length

Signed-off-by: Lawrence Lee <lawlee@microsoft.com>
2020-12-10 14:43:39 -08:00
Dong Zhang
b2a3de5f4f
[MultiDB] add mutidb warmboot support - restoring database (#5773)
* restoring each database  with all data before warmboot and then flush unused data in each instance, following the multiDB warmboot design at https://github.com/Azure/SONiC/blob/master/doc/database/multi_database_instances.md
  * restore needs to be done in database docker since we need to know the database_config.json in new version
  * copy all data rdb file into each instance restoration location andthen flush unused database
  * other logic is the same as before
*  backing up database part is in another PR at sonic-utilities https://github.com/Azure/sonic-utilities/pull/1205, they depend on each other
2020-12-10 11:06:19 -08:00
Srideep
3c9a7ec623
[DellEMC Z9332f] Platform API 2.0 Support and bug fixing (#5958)
- Add platform infra to support 2.0 API
- Bug fixing for 9332 known issues
2020-12-10 10:30:44 -08:00
judyjoseph
6d9ecbcfd8
Move frr logs from syslog to /var/log/frr/*.log (#5988)
- Why I did it
Move frr logs from syslog from the directory /var/log/quagga/.log to /var/log/frr/log

- How I did it
Updated the rsyslog config files.

- How to verify it
Verified the logs come into the file zebra.log and bgpd.log in the DIR /var/log/frr/log
2020-12-10 08:44:34 -08:00
rajendra-dendukuri
31ce20ac38
[kdump]: Kdump usability and reliability improvements (#6113)
- Allow platform specific reboot script to be called after crash kernel has
finished copying the kernel vmcore
- Disable pcie advanced features when running crash kernel. This improves
reliability of the crash kernel to successfully create a vmcore and also
reboot
- Allow crash kernel to reboot if a panic is seen while it is generating a
vmcore
- Fix crash kernel to use the SONiC specific /usr/local/bin/reboot script
instead of the Linux reboot command /sbin/reboot
- Use sonic_platform as the kernel command line parameter to pass platform identifier string

Signed-off-by: Rajendra Dendukuri <rajendra.dendukuri@broadcom.com>
2020-12-10 01:32:37 -08:00
Sabareesh-Kumar-Anandan
8d00b57959
[sonic-utilities] Update sonic-utilities submodule (#6139)
[show] Moving some py imports inside specific cmds (#1202)
Fix python3 issue in ecnconfig (#1286)

Signed-off-by: Sabareesh Kumar Anandan <sanandan@marvell.com>
2020-12-10 01:29:30 -08:00
Qi Luo
b0fdeff173
[baseimage]: No need to mark packages as auto since all debootstrap installed (#6159)
Originally this line is used to mark all previously installed packages (deboostrap installed) as auto, so later if no other packages depend on anyone of them, it will be auto removed. Seems we gained little from this line, so let's remove it.
2020-12-10 01:05:21 -08:00
Ying Xie
d9b9e4f7db
[swss] advance swss submodule head (#6157)
- Why I did it
Advance swss submodule to pick up latest changes.

- How I did it
Including folowing changes:

[portsorch] adjust port initialized event back to notice (#1532)
Signed-off-by: Ying Xie ying.xie@microsoft.com
2020-12-10 00:28:14 -08:00
Blueve
3d22019802
[sonic-config-engine/minigraph] Enable console mgmt feature for console device (#6166)
* Introduced a list console_device_types which contains the device types that support console management feature
* Inject CONSOLE_SWITCH:console_mgmt table with enabled:yes or enabled:no

Signed-off-by: Jing Kan jika@microsoft.com
2020-12-10 15:42:11 +08:00
shlomibitton
6762f526d9
[NVMe] Add NVMe SSD disc type support to installer.sh script (#6142)
In order to install a SONiC image on top of a NVMe SSD disc properly with ONIE we must configure it properly on the installer.sh script.

Signed-off-by: Shlomi Bitton <shlomibi@nvidia.com>
2020-12-09 19:03:27 -08:00
trzhang-msft
d4d90a8963
Support for dual tor option in dhcp docker template (#6152) 2020-12-09 18:10:00 -08:00
Qi Luo
0e554e09ce
[makefile] Remove python-netsnmp deb package from makefile (#6161)
Because no one is using it in buildimage repo
2020-12-09 17:40:07 -08:00
Ying Xie
68f1352159
[makefile] re-organize make file so init only execute once (#6160)
- Why I did it
make init executed 3 times, which is unnecessary.

- How I did it
reorganize the makefile so that init only executed once.

- How to verify it
make reset
make init

Signed-off-by: Ying Xie <ying.xie@microsoft.com>
2020-12-09 14:13:33 -08:00
gechiang
0ffadf357e
Upgrade to SAI BCM 4.2.1.5-6 to pick up the fib hash patch (CS00011388674) and the fdb invalid port patch (CS00011298546) (#6165) 2020-12-09 13:53:27 -08:00
Joe LeVeque
762dfd965b
[sonic-platform-daemons] Update submodule (#6158)
* src/sonic-platform-daemons 73e6ddd...4da0bfc (3):
  > Align style with PEP8 standards (#128)
  > Support python3 for xcvrd, psud, thermalctld and syseepromd (#132)
  > Import mock psu object for testing LED (#119)
2020-12-09 12:04:56 -08:00
Srideep
fb83550e92
[DellEMC Z9332F] New Sku support (16x400G+64x100G) (#6091)
* Add New port map for TH3
* Add LED and other npu related files
2020-12-08 18:03:38 -08:00
Joe LeVeque
68febe6228
[sonic-slave]: Pin version of m2crypto Python package to 0.36.0 in slave containers (#6163)
The maintainers of the m2crypto Python package pushed two new versions of the package to PyPI today, version 0.37.0 followed a few hours later by 0.37.1 (https://pypi.org/project/M2Crypto/0.37.1/#history). It appears as though these packages are failing to build/install properly in our image.

The problem was noticed in the Jessie container, where we were not previously explicitly installing the Debian m2crypto package. As part of this PR, I install m2crypto via pip in the Jessie container and pin down the version. I also modified the Stretch and Buster Dockerfiles to install the package vi pip in the same fashion for consistency.
2020-12-08 16:25:54 -08:00
Kamil Cudnik
275c5cfb07
[submodule] sonic-swss update: Add zmq sync mode flag (#1505) (#6144)
To support communication mode flag on orchagent
2020-12-08 11:37:04 -08:00
Samuel Angebault
44f4c2ed66
[Arista] Update driver submodules (#6151)
- Enhance eeprom parsing robustness on corrupted fields
 - Add chassis provisioning service
 - Disable CPU sleep state on some systems
 - Complete refactor for FanSlots
 - Fix module unload while still in use
2020-12-08 11:17:28 -08:00
zzhiyuan
4d4b489a5f
Add platform.json to Arista platforms (#6150)
platform.json is needed for sonic-mgmt testing. Also in the future it will be used as part of dynamic port breakout.

Also removed the folder symlink for BlackhawkDD because it has a different platform.json than BlackhawkO.

Co-authored-by: Zhi Yuan Carl Zhao <zyzhao@arista.com>
2020-12-08 10:20:13 -08:00
Vadym Hlushko
09ff334965
[DPB] added capability files for SN4700 platform (#6014)
* [DPB] added capability files for SN4700 platform

Signed-off-by: Vadym Hlushko <vadymh@nvidia.com>

* [DPB] fixed platform.json and hwsku.json for SN4700

Signed-off-by: Vadym Hlushko <vadymh@nvidia.com>

* [DPB] fixed wrong mode 4x100G[50G] -> 4x100G

Signed-off-by: Vadym Hlushko <vadymh@nvidia.com>
2020-12-08 16:47:51 +02:00
dflynn-Nokia
98a8753c11
[sonic-mgmt-common] Update submodule (#6129)
This update brings in the following commits.

86c1108 Enable arm architecture to build in addition to amd64 (#37)
4acb2c3 fix bugs and enhance Transformer (#35)
49e5a22 ygot related enhancements and fixes (#34)
51224de Fix ietf yang search path for cvl schema builds (#32)
3c6cdb3 CVL Changes #8: 'must' and 'when' expression evaluation (#31)
dabf231 CVL Changes #7: 'leafref' evaluation (#28)
6f9535f CVL Changes #6: Customized Xpath Engine integration (#27)
5e2466b DB-Layer fixes/enhancements (#26)
9a27302 CVL Changes #4: Implementation of new CVL APIs (#22)
dbf1093 Translib support for authorization, yang versioning and Delete flag (#21)
80f369e CVL Changes #5: YParser enhancement (#23)
904ce18 CVL Changes #3: Multi-db instance support (#20)
9d24a34 CVL Changes #2:  YValidator infra changes for evaluating xpath expression (#19)
f3fc40f CVL Changes #1: Initial CVL code reorganization and common infra changes (#18)
4922601 Bulk and RPC API support in translib (#16)
1d730df RFC7895 yang module library implementation (#15)
2020-12-08 04:31:16 -08:00
shlomibitton
43a32e60fc
[kernel] Change grub cmdline to set c-states to 0 for "Intel" CPUs (#6051)
Usually for a use case like networking - should not be configured to reach c6, the maximum used is c1e – due to the added latency getting in & out of states (bad for networking).

Following a recommendation by Intel, networking system should avoid getting in & out of states which introduce latency. The recommended state is c1e and no state change enabling.

In addition, c-state sole purpose is to save power and when inside a networking switch its really negligent being such a tiny consumer vs. the whole cluster.

Signed-off-by: Shlomi Bitton <shlomibi@nvidia.com>
2020-12-07 12:30:50 -08:00
Myron Sosyak
3b04da97fb
[sonic-platform-common] Update submodule (#6141)
Update sonic-platform-common submodule:
* Make eeprom_tlvinfo.py Python3 compatible

**- Why I did it**
To get the latest changes which fix some python2 -> python3 migration errors.
2020-12-07 10:04:48 -08:00
yozhao101
3e717210e9
[sonic-telemetry] Update the submodule sonic-telemetry. (#6132)
[dataset] Add dataset "system uptime" into non-db client. (#52)
Adding new data set to query Sonic OS version. (#50)
[gnmi_server] Disregard EOF status for STREAM subs (#48)

Signed-off-by: Yong Zhao <yozhao@microsoft.com>
2020-12-06 06:01:46 -08:00
Kamil Cudnik
af357f3ec6
[sairedis] Advance sairedis pointer to support cmd zmq flag (#6064)
[sairedis] Advance sairedis pointer to support cmd zmq flag

[meta] Use memcpy instead of cast to prevent strict-aliasing error (#723)
[vslib]Add MACsec forward and filters to HostInterfaceInfo (#719)
[vslib] Add StateBase function for MACsec (#717)
Add support for default zmq synchronous mode flag (#711)
[syncd] Code clean (#720)
[sairedis] Remove custom bulk fdb methods (#710)
[vslib]Add MACsec Filters (#713)
[vslib]Add MACsec Forwarder (#714)
[vslib]Add MACsec Manager (#715)
Add helper functions, findObjects and dumpObject (#716)
Code clean refactor (#712)
[vslib] Fix CorePortIndexMap log line (#708)
[meta] Use custom hash in SaiObjectCollection (#709)
Fix LGTM localtime function warnings (#707)
[vs] VoQ Switch objects initialization - Local Port OID mapping to System Ports (#703)
Code style refactor (#705)
[vs] Initialization of VOQ switch objects (#702)
[vs] SAI support for VOQ switches - Switch State Initialization (#701)
Add MACsec meta methods (#704)
[vs] SAI support for VOQ switches (#698)
[vs] SAI support for VOQ switches - Core Port Index Map File parser (#700)
[vs] SAI support for VoQ switch - Core Port Index Map Container (#699)
[syncd][sairedis] Change pub/sub model to push/pull in zmq notification (#695)
[syncd] Use lua script to update db when using bulk api (#690)
[syncd] Fix bulk api object type for next hop group members (#685)
Add FlexCounter for MACsec SA (#684)
2020-12-05 12:13:34 +01:00
bingwang-ms
49804c64b6
[celestica/dx010]: Fix incorrect path for voltage, current and power. (#6128)
Fixes #6126.

There is a bug in getting the path of voltage, current and power. The
list object is directly converted to string to format the file path. As
a result, read_txt_file will get None value and a WARNING will be
recorded. This commit fix the issue.

Signed-off-by: bingwang <bingwang@microsoft.com>
2020-12-04 13:41:20 -08:00
Dmytro Shevchuk
026f0ec3fb
[Barefoot]: fix unresolved SFP type on Newport/Montara (#6063)
Fix `show interface status` and `sfpshow eeprom` commands showing incorrect information on Newport/Montara platforms
2020-12-04 11:00:03 -08:00
abdosi
59c1e3a78a
[multi-asic] Enhancing monit process checker for multi-asic. (#6100)
Added Support of process checker for work on multi-asic platforms.
2020-12-04 10:39:43 -08:00
rajendra-dendukuri
8a04487499
[kdump]: Implement Kdump configuration handler (#6122)
- Kdump configurations stored and manipulated in ConfigDB are now processed
by hostcfgd and applied asynchronously

Signed-off-by: Rajendra Dendukuri <rajendra.dendukuri@broadcom.com>
2020-12-04 10:15:30 -08:00
Samuel Angebault
468aac92b7
[Arista] Update platform configurations for 7060DX4 and 7060PX4 (#6084)
Current support for the 7060PX4-32 and 7060DX4 was broken.
With this change, ports are now linking fine.

Co-authored-by: Zhi Yuan Carl Zhao <zyzhao@arista.com>
2020-12-04 10:11:06 -08:00
Samuel Angebault
8576911a57
[database-chassis]: Fix the way database-chassis start (#6099)
The service crash when the platform boots due to missing waits.
/usr/bin/database.sh tries to operate on a missing socket and fails.
We now wait for the chassis database to be ready the same way we do database.
2020-12-04 10:09:35 -08:00