This PR is part of the following HLD:
Persistent loglevel HLD: sonic-net/SONiC#1041
- Why I did it
After the Logger tables were moved from the LOGLEVEL_DB to the CONFIG_DB and the jinja2_cache was deleted, the LOGLEVEL_DB is no longer in use.
- How I did it
Removed the LOGLEVEL_DB from the SONiC code
- How to verify it
All tests passed
Signed-off-by: maipbui <maibui@microsoft.com>
#### Why I did it
`subprocess` is used with `shell=True`, which is very dangerous for shell injection.
`os` is not secure against maliciously constructed input and is dangerous when used to evaluate dynamic content
`yaml.load` can create arbitrary Python objects
#### How I did it
Replace `os` with `subprocess` and remove `shell=True`
Use `yaml.safe_load()`
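A minimal sketch of the safer pattern, assuming illustrative commands and file names:

```
# Hedged sketch: argument lists instead of shell strings, and
# yaml.safe_load instead of yaml.load (paths here are examples only).
import subprocess
import yaml

# No shell interprets the arguments, so metacharacters in user-supplied
# values cannot be used for command injection.
result = subprocess.run(["ls", "-l", "/etc/sonic"],
                        capture_output=True, text=True, check=True)

# safe_load constructs only plain Python types, never arbitrary objects.
with open("config.yaml") as f:
    data = yaml.safe_load(f)
```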
#### How to verify it
Pass UT
- Why I did it
Fixes #11431
- How I did it
dhcp6relay binds to the IPv6 addresses configured on these VLAN interfaces.
Thus, check that they are ready before launching dhcp6relay.
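A minimal sketch of such a readiness check, assuming a hypothetical helper, interface name, and timeout:

```
# Hedged sketch: poll until the VLAN interface has a global IPv6 address
# before starting dhcp6relay (helper name and timeout are assumptions).
import subprocess
import time

def wait_for_ipv6(ifname, timeout=60):
    deadline = time.time() + timeout
    while time.time() < deadline:
        out = subprocess.run(
            ["ip", "-6", "addr", "show", "dev", ifname, "scope", "global"],
            capture_output=True, text=True).stdout
        if "inet6" in out:      # a global IPv6 address is configured
            return True
        time.sleep(1)
    return False

wait_for_ipv6("Vlan1000")
```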
- How to verify it
Unit Tests
Tested on a live device
Signed-off-by: Vivek Reddy Karri <vkarri@nvidia.com>
Signed-off-by: Neetha John <nejo@microsoft.com>
Why I did it
SLB and BGP monitor peers are not needed for storage backend. These neighbors are present in the minigraph.
How I did it
After minigraph parsing, remove these neighbors if the device is a storage backend device
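A minimal sketch of the post-parse cleanup, assuming hypothetical table names and a hypothetical storage-backend check:

```
# Hedged sketch: drop SLB and BGP monitor peers on storage backend devices.
# The table names and the storage-backend test are assumptions.
def remove_unneeded_peers(results, device_type, subtype):
    if device_type == 'BackEndToRRouter' and subtype == 'Storage':
        results.pop('BGP_PEER_RANGE', None)   # SLB peers
        results.pop('BGP_MONITORS', None)     # BGP monitor peers
    return results
```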
How to verify it
Unit tests
Verified on the device that once these tables are removed, these peers don't show up in the "show runningconfiguration bgp" output
- Why I did it
As part of the Persistent log level HLD, the LOGLEVEL_DB content is moved to the CONFIG_DB.
In addition, it was decided to remove the jinja2_cache, which currently lives in the LOGLEVEL_DB.
This cache was added to speed up template rendering in start scripts. Many templates are rendered during system start, which caused a delay in the warm boot LAG restore time. It was tested and verified that, with and without the cache, we no longer see any difference in this timing, probably due to the many other optimizations made to sonic-cfggen. Since the j2 cache no longer provides a noticeable improvement, it is safe to remove it.
- How I did it
Remove the redis_bcc.py file and remove the bytecode cache from sonic-cfggen
- How to verify it
Warm boot was tested with and without the jinja2_cache and there is no difference in performance
Why I did it
This PR is to update TC_TO_QUEUE_MAP|AZURE for SKU Arista-7050CX3-32S-D48C8 and Arista-7260CX3 T0.
The change is only to align the TC_TO_QUEUE_MAP for regular traffic and bounced traffic. It has no impact on business because we have no traffic being mapped to TC2 or TC6.
How I did it
Update TC_TO_QUEUE_MAP|AZURE and the test cases as well.
How to verify it
Verified by running test case test_j2files.py
```
/sonic/src/sonic-config-engine$ python3 setup.py test -s tests/test_j2files.py
running test
......
----------------------------------------------------------------------
Ran 29 tests in 25.390s
OK
```
Signed-off-by: Arvindsrinivasan Lakshmi Narasimhan <arlakshm@microsoft.com>
Why I did it
Generate the port configuration required for 400G ZR ports from the minigraph.
How I did it
Add parsing logic to get tx_power and laser_freq from the LinkMetadata section of the minigraph.
Add UT for packet chassis and VoQ chassis
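A minimal sketch of what the parsing could look like; the tag and property names are assumptions based on the description, not the actual minigraph schema:

```
# Hedged sketch: pull TxPower / Laser_freq properties out of a parsed
# LinkMetadata element (an xml.etree.ElementTree.Element).
def parse_zr_properties(link_metadata, ns=''):
    entry = {}
    for prop in link_metadata.findall(ns + 'Properties/' + ns + 'DeviceProperty'):
        name = prop.findtext(ns + 'Name')
        value = prop.findtext(ns + 'Value')
        if name == 'TxPower':
            entry['tx_power'] = value
        elif name == 'Laser_freq':
            entry['laser_freq'] = value
    return entry
```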
How to verify it
UT
Signed-off-by: Arvindsrinivasan Lakshmi Narasimhan <arlakshm@microsoft.com>
Why I did it
This PR backports PR #11056 and PR #11045 into the master branch.
This PR enables tunnel_qos_remap on T1 and T0 in a DualToR deployment.
On T1, we check the property DownstreamRedundancyTypes. On T0, we check the property RedundancyType.
tunnel_qos_remap is set to enabled if gemini is in DownstreamRedundancyTypes (on T1) or RedundancyType (on T0).
How I did it
The change is implemented in minigraph.py.
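A minimal sketch of the decision described above; the parameter names and device-type strings are assumptions:

```
# Hedged sketch: enable tunnel_qos_remap when 'gemini' appears in the
# relevant redundancy property for the device's role.
def tunnel_qos_remap_enabled(device_type, redundancy_type,
                             downstream_redundancy_types):
    if device_type == 'LeafRouter':    # T1: check the downstream property
        return 'gemini' in (downstream_redundancy_types or '').lower()
    if device_type == 'ToRRouter':     # T0: check its own redundancy type
        return 'gemini' in (redundancy_type or '').lower()
    return False
```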
How to verify it
Verified by `test_minigraph_case.py` and `test_j2files.py`.
- Why I did it
Support Mellanox-SN4600C-C64 as a T1 switch in the dual-ToR scenario
This is to port #11032 and #11299 from 202012 to master.
Support additional queue and PG in buffer templates, including both traditional and dynamic model
Support mapping DSCP 2/6 to lossless traffic in the QoS template.
Add macros to generate additional lossless PG in the dynamic model
Adjust the order in which the generic/dedicated (with additional lossless queues) macros are checked and called to generate buffer tables in the common template buffers_config.j2.
Buffer tables are rendered using macros.
Both generic and dedicated macros are defined on our platform. Currently, the generic macro is called as long as it is defined, which means the generic one is always called on our platform. To avoid this, the dedicated macro is checked and called first, and only then the generic one.
Support MAP_PFC_PRIORITY_TO_PRIORITY_GROUP on ports with additional lossless queues.
On Mellanox-SN4600C-C64, the buffer configuration for T1 is calculated as:
40 * 100G downlink ports with 4 lossless PGs/queues, 1 lossy PG, and 3 lossy queues
16 * 100G uplink ports with 2 lossless PGs/queues, 1 lossy PG, and 5 lossy queues
Signed-off-by: Stephen Sun <stephens@nvidia.com>
Signed-off-by: Neetha John <nejo@microsoft.com>
Why I did it
Improve throughput and latency for 7260 deployments
How I did it
Update the dynamic threshold to 0 and the ECN settings to 2mb/10mb/5%
How to verify it
Updated unit tests to use the modified values for 7260 ecn settings.
What I did:
Added support for deployment_id parsing in the device ASIC metadata.
Why I did:-
The deployment ID is used in the BGP docker for FRR template generation. For multi-asic platforms running in namespaces, FRR template generation fails when deployment_id is not a key in DEVICE_METADATA. This change is needed after #10154, where, if deployment_id is None, we don't update the DEVICE_METADATA dictionary.
How I verified:-
Added unit-test.
Why I did it
This PR is to add a flag to control whether to generate PORT_QOS_MAP|global entry or not.
This is because for some HWSKUs, such as BackEndToRRouter and BackEndLeafRouter, there is no DSCP_TO_TC_MAP defined.
Hence, if the PORT_QOS_MAP|global entry is generated, OA reports errors because the DSCP_TO_TC_MAP map AZURE cannot be found:
```
Jul 14 00:24:40.286767 str2-7050qx-32s-acs-03 ERR swss#orchagent: :- saiObjectTypeQuery: invalid object id oid:0x7fddb43605d0
Jul 14 00:24:40.286767 str2-7050qx-32s-acs-03 ERR swss#orchagent: :- meta_generic_validation_objlist: SAI_SWITCH_ATTR_QOS_DSCP_TO_TC_MAP:SAI_ATTR_VALUE_TYPE_OBJECT_ID object on list [0] oid 0x7fddb43605d0 is not valid, returned null object id
Jul 14 00:24:40.286767 str2-7050qx-32s-acs-03 ERR swss#orchagent: :- applyDscpToTcMapToSwitch: Failed to apply DSCP_TO_TC QoS map to switch rv:-5
Jul 14 00:24:40.286767 str2-7050qx-32s-acs-03 ERR swss#orchagent: :- doTask: Failed to process QOS task, drop it
```
This PR is to address the issue.
How I did it
Add a flag require_global_dscp_to_tc_map to control whether to generate the PORT_QOS_MAP|global entry. The default value for require_global_dscp_to_tc_map is true. If the device type is storage backend, the value is changed to false. Then the PORT_QOS_MAP|global entry is not generated.
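A minimal sketch of the flag's effect; the real change lives in the qos template, and the `is_storage_backend` predicate here is an assumption:

```
# Hedged sketch: only emit the PORT_QOS_MAP|global entry when a
# DSCP_TO_TC_MAP (e.g. AZURE) exists to reference.
def render_port_qos_map(is_storage_backend):
    # The flag defaults to true; storage backend devices turn it off.
    require_global_dscp_to_tc_map = not is_storage_backend
    port_qos_map = {}
    if require_global_dscp_to_tc_map:
        port_qos_map['global'] = {'dscp_to_tc_map': 'AZURE'}
    return port_qos_map
```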
How to verify it
Update the current test_qos_dscp_remapping_render_template to cover storage backend.
Signed-off-by: Neetha John <nejo@microsoft.com>
Why I did it
There is a need to select different mmu profiles based on deployment type
How I did it
There will be separate subfolders (RDMA-CENTRIC, TCP-CENTRIC, BALANCED) in each hwsku folder containing deployment-specific mmu and qos settings. The SonicQosProfile attribute in the minigraph determines which settings to use. If that attribute is not present, the default settings in the hwsku folder are used.
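A minimal sketch of the folder selection, assuming the attribute arrives as a plain string and the helper name is hypothetical:

```
# Hedged sketch: pick the deployment-specific subfolder if SonicQosProfile
# names one; otherwise fall back to the hwsku defaults.
import os

def qos_settings_dir(hwsku_dir, sonic_qos_profile=None):
    if sonic_qos_profile in ('RDMA-CENTRIC', 'TCP-CENTRIC', 'BALANCED'):
        candidate = os.path.join(hwsku_dir, sonic_qos_profile)
        if os.path.isdir(candidate):
            return candidate
    return hwsku_dir
```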
Why I did it
Currently interfaces.j2 hardcodes eth0 even when there are multiple interfaces in MGMT_INTERFACE. This change adds support for generating /e/n/i when there are multiple interfaces in MGMT_INTERFACE.
How I did it
By removing hardcoded eth0 when looping through MGMT_INTERFACE.
How to verify it
Verified through unit test.
Signed-off-by: Neetha John <nejo@microsoft.com>
Why I did it
For storage backend, certain rules will be applied to the DATAACL table to allow only VLAN-tagged packets and drop untagged packets.
How I did it
Create DATAACL table if the device is a storage backend device
To avoid ACL resource issues, remove EVERFLOW related tables if the device is a storage backend device
How to verify it
Added the following unit tests
- verify that EVERFLOW ACL tables are removed and the DATAACL table is added for storage backend ToR
- verify that no DATAACL tables are created and EVERFLOW tables exist for storage backend leaf
Why I did it
Storage backend has all VLAN members tagged. If untagged packets are received on those links, they are counted as RX_DROPS, which can lead to false alarms in monitoring tools. This ACL is used to hide these drops.
How I did it
Created an ACL template which is loaded during minigraph load for storage backends. This template allows tagged VLAN packets and drops untagged ones.
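A minimal sketch of the two-rule intent; the rule names, priorities, VLAN ID, and config_db layout are all assumptions for illustration:

```
# Hedged sketch: forward packets tagged with the backend VLAN, drop the rest.
acl_rules = {
    "ACL_RULE|DATAACL|RULE_1000": {
        "PRIORITY": "9999",
        "VLAN_ID": "1000",            # allow packets tagged with this VLAN
        "PACKET_ACTION": "FORWARD",
    },
    "ACL_RULE|DATAACL|DEFAULT_RULE": {
        "PRIORITY": "1",              # catch-all: untagged traffic is dropped
        "PACKET_ACTION": "DROP",
    },
}
```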
How to verify it
Unit tests
Signed-off-by: Neetha John <nejo@microsoft.com>
Why I did it
To further support parsing soc_ipv4 and soc_ipv6 out of Dpg:
```
<DeviceDataPlaneInfo>
<IPSecTunnels />
<LoopbackIPInterfaces xmlns:a="http://schemas.datacontract.org/2004/07/Microsoft.Search.Autopilot.Evolution">
<a:LoopbackIPInterface>
<ElementType>LoopbackInterface</ElementType>
<Name>HostIP</Name>
<AttachTo>Loopback0</AttachTo>
<a:Prefix xmlns:b="Microsoft.Search.Autopilot.NetMux">
<b:IPPrefix>10.10.10.2/32</b:IPPrefix>
</a:Prefix>
<a:PrefixStr>10.10.10.2/32</a:PrefixStr>
</a:LoopbackIPInterface>
<a:LoopbackIPInterface>
<ElementType>LoopbackInterface</ElementType>
<Name>HostIP1</Name>
<AttachTo>Loopback0</AttachTo>
<a:Prefix xmlns:b="Microsoft.Search.Autopilot.NetMux">
<b:IPPrefix>fe80::0002/128</b:IPPrefix>
</a:Prefix>
<a:PrefixStr>fe80::0002/128</a:PrefixStr>
</a:LoopbackIPInterface>
<a:LoopbackIPInterface>
<ElementType>LoopbackInterface</ElementType>
<Name>SoCHostIP0</Name>
<AttachTo>server2SOC</AttachTo>
<a:Prefix xmlns:b="Microsoft.Search.Autopilot.NetMux">
<b:IPPrefix>10.10.10.3/32</b:IPPrefix>
</a:Prefix>
<a:PrefixStr>10.10.10.3/32</a:PrefixStr>
</a:LoopbackIPInterface>
<a:LoopbackIPInterface>
<ElementType>LoopbackInterface</ElementType>
<Name>SoCHostIP1</Name>
<AttachTo>server2SOC</AttachTo>
<a:Prefix xmlns:b="Microsoft.Search.Autopilot.NetMux">
<b:IPPrefix>fe80::0003/128</b:IPPrefix>
</a:Prefix>
<a:PrefixStr>fe80::0003/128</a:PrefixStr>
</a:LoopbackIPInterface>
</LoopbackIPInterfaces>
</DeviceDataPlaneInfo>
```
Signed-off-by: Longxiang Lyu <lolv@microsoft.com>
How I did it
For server loopback definitions in Dpg, if they contain a LoopbackIPInterface whose AttachTo tag has a value of the form `<server_name>SOC`, the address is regarded as a SoC IP, and sonic-cfggen now treats the port connected to the server as active-active if the redundancy_type is either Libra or Mixed.
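A minimal sketch of the suffix check and the resulting cable type; the helper names are hypothetical:

```
# Hedged sketch: an AttachTo value like "server2SOC" marks a SoC address.
def is_soc_address(attach_to, server_name):
    return attach_to == server_name + 'SOC'

def cable_type(redundancy_type):
    # active-active only for the Libra / Mixed redundancy types
    if redundancy_type in ('Libra', 'Mixed'):
        return 'active-active'
    return 'active-standby'
```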
How to verify it
Pass the unittest.
Signed-off-by: Longxiang Lyu <lolv@microsoft.com>
#### Why I did it
The tests that are available for interfaces.j2 only cover the case when ZTP is disabled and MGMT_INTERFACE is defined. This change adds unit tests for:
1) When ZTP is enabled, with combination of (ip enabled/disabled, inband enabled/disabled)
2) When ZTP is disabled, and MGMT_INTERFACE is not defined, with mgmtVrfEnabled set to true/false
#### How I did it
I created multiple mock-up files to:
1) Enable/disable ZTP, and subconditions under ZTP
2) Create a graph file that has no management interface defined
#### How to verify it
Compared output with interfaces.j2 template to ensure the output is expected.
Signed-off-by: bingwang <wang.bing@microsoft.com>
Why I did it
This PR brings two changes
Add a lossy PG profile for PG2 and PG6 on T1 for ports between T1 and T2.
After PR #10176 (Update qos config to clear queues for bounced back traffic), the DSCP_TO_TC_MAP and TC_TO_PG_MAP are updated when remapping is enabled:
DSCP_TO_TC_MAP

| Before | After | Why do this change |
|--------|-------|--------------------|
| "2" : "1" | "2" : "2" | Only change for leaf router, to map DSCP 2 to TC 2, as TC 2 will be used for lossless traffic |
| "6" : "1" | "6" : "6" | Only change for leaf router, to map DSCP 6 to TC 6, as TC 6 will be used for lossless traffic |

TC_TO_PRIORITY_GROUP_MAP

| Before | After | Why do this change |
|--------|-------|--------------------|
| "2" : "0" | "2" : "2" | Only change for leaf router, to map TC 2 to PG 2, as PG 2 will be used for lossless PG |
| "6" : "0" | "6" : "6" | Only change for leaf router, to map TC 6 to PG 6, as PG 6 will be used for lossless PG |
So, we have two new lossy PGs (2 and 6) for the T2-facing ports on T1, and two new lossless PGs (2 and 6) for the T0-facing ports on T1.
However, there is no lossy PG profile for the T2-facing ports on T1. The lossless PGs for ports between T1 and T0 are already handled by buffermgrd. Therefore, we need to add lossy PG profiles for the T2-facing ports on T1.
We don't have this issue on T0 because PG 2 and PG 6 are lossless PGs there, and no lossy traffic is mapped to PG 2 and PG 6.
Map port-level TC7 to PG0
Before the PCBB change, DSCP 48 -> TC 6 -> PG 0.
After the PCBB change, DSCP 48 -> TC 7 -> PG 7.
We can instead map TC 7 to PG 0 to save a lossy PG.
How I did it
Update the qos and buffer template.
How to verify it
Verified by UT.
This reverts commit 90a849ea85.
#### Why I did it
The interfaces unit tests did not cover some of the conditions in interfaces.j2 that were changed in #11204. Therefore, revert the change and add the tests before changing interfaces.j2.
#### How I did it
Git revert.
#### How to verify it
* [Interfaces] Modify template to support multiple management interfaces
* Modify minigraph to process interfaces in sorted order
Signed-off-by: Ubuntu <gechen@gechen-sonic-dev.d0r25nej54guppclip4gpy5b5a.jx.internal.cloudapp.net>
* Add UT minigraph
Signed-off-by: Ubuntu <gechen@gechen-sonic-dev.d0r25nej54guppclip4gpy5b5a.jx.internal.cloudapp.net>
* Make case-insensitive comparison
Signed-off-by: George Chen <gechen@microsoft.com>
* Use natural sort
Signed-off-by: George Chen <gechen@microsoft.com>
Co-authored-by: Ubuntu <gechen@gechen-sonic-dev.d0r25nej54guppclip4gpy5b5a.jx.internal.cloudapp.net>
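A minimal sketch of the natural-sort, case-insensitive key these commits describe; the helper name is illustrative:

```
# Hedged sketch: natural sort so eth2 orders before eth10, ignoring case.
import re

def natural_key(name):
    return [int(tok) if tok.isdigit() else tok.lower()
            for tok in re.split(r'(\d+)', name)]

print(sorted(["eth10", "ETH2", "eth1"], key=natural_key))
# ['eth1', 'ETH2', 'eth10']
```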
Why I did it
To address internal build failures where the cable length for some of the SKUs is set to 300m for all tiers.
How I did it
For the buffers test, generate a new expected output file based on the original, with the CABLE_LENGTH table updated to use 300m. In the comparison logic, compare against each of the expected output files; if any matches, the test case passes.
Signed-off-by: Neetha John <nejo@microsoft.com>
* [Tunnel PFC] Tests for adding property 'sai_remap_prio_on_tnl_egress'
Add tests for adding the property 'sai_remap_prio_on_tnl_egress'; this property should only be added in a dual ToR environment.
Test done:
Run test test_j2files.py
Co-authored-by: richardyu <richardyu@contoso.com>
Why I did it
Provide a fix for the review comment https://github.com/Azure/sonic-buildimage/pull/10475/files#r847753187.
Move loading of the database config to application code instead of portconfig, as portconfig is used as a library.
#10581 was raised for this fix, but had to be reverted due to an issue with multi-asic platforms.
How I did it
Remove the try/except handling from portconfig.py during config_db initialization.
Move loading of database config to application that uses portconfig.py.
How to verify it
Unit tests pass.
Verified that it does not cause issue during boot up of multi-asic VS image.
Verified that config_db generation was successful in multi-asic VS.
Why I did it
As part of the PCBB changes, we need to enable 2 extra lossless queues. The changes in this PR adjust only the reserved sizes on Th2 for the 2 additional lossless queues.
Calculations are done based on 40 downlinks for T1 and 16 uplinks for dual ToR
How to verify it
Verified that the rendering works fine on Th2 dut
Unit tests have been updated to reflect the modified buffer sizes when PCBB is enabled. Existing test cases cover the original buffer sizes when PCBB is disabled. With these changes, the sonic-config-engine wheel builds successfully.
Signed-off-by: Neetha John <nejo@microsoft.com>
Signed-off-by: bingwang <bingwang@microsoft.com>
Why I did it
This PR is to add two extra lossless queues for bounced back traffic.
HLD sonic-net/SONiC#950
SKUs include
Arista-7050CX3-32S-C32
Arista-7050CX3-32S-D48C8
Arista-7260CX3-D108C8
Arista-7260CX3-C64
Arista-7260CX3-Q64
How I did it
Update the buffers.json.j2 template and buffers_config.j2 template to generate new BUFFER_QUEUE table.
For T1 devices, queue 2 and queue 6 are set as lossless queues on T0 facing ports.
For T0 devices, queue 2 and queue 6 are set as lossless queues on T1 facing ports.
Queue 7 is added as a new lossy queue, as DSCP 48 is mapped to TC 7 and then mapped into queue 7.
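A minimal sketch of the resulting BUFFER_QUEUE entries for one T0-facing port on a T1; the port name, queue grouping, and profile names are illustrative:

```
# Hedged sketch of the generated table: queues 2 and 6 become lossless,
# queue 7 is the new lossy queue for DSCP 48 / TC 7.
buffer_queue = {
    "Ethernet0|0-1": {"profile": "egress_lossy_profile"},
    "Ethernet0|2-4": {"profile": "egress_lossless_profile"},
    "Ethernet0|5":   {"profile": "egress_lossy_profile"},
    "Ethernet0|6":   {"profile": "egress_lossless_profile"},
    "Ethernet0|7":   {"profile": "egress_lossy_profile"},
}
```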
How to verify it
Verified by UT
Verified by copying the new template and generating the buffer config with sonic-cfggen
Signed-off-by: Neetha John <nejo@microsoft.com>
Why I did it
There was a typo in the hwsku specified as part of #10889
How I did it
Replaced with the correct hwsku
How to verify it
test_cfggen.py is passing
#### Why I did it
To ensure that some internal testcases do not break due to external changes
#### How to verify it
Ran test_cfggen.py with the changes and it passed
Signed-off-by: Arvindsrinivasan Lakshmi Narasimhan <arlakshm@microsoft.com>
Why I did it
Resolves #10761.
For VOQ chassis, the recirc port, which was added for Everflow, stays admin down after load_minigraph.
This PR adds a fix to bring the recirc port admin up.
How I did it
The PR adds a change in minigraph.py: if a port has the role Rec, set the port's admin status to up.
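A minimal sketch of that change, assuming the usual port-dictionary shape produced by minigraph parsing:

```
# Hedged sketch: after parsing, bring recirculation ports admin up.
def mark_recirc_ports_up(ports):
    # ports: the PORT table dict from minigraph parsing (assumed shape)
    for port in ports.values():
        if port.get('role') == 'Rec':   # recirculation port on a VOQ chassis
            port['admin_status'] = 'up'
    return ports
```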
How to verify it
UT
Signed-off-by: Arvindsrinivasan Lakshmi Narasimhan <arlakshm@microsoft.com>
Why I did it
To further add the cable_type and soc_ipv4 fields to the MUX_CABLE table, this PR parses minigraph content like the following:
```
<Device i:type="SmartCable">
<ElementType>SmartCable</ElementType>
<SubType>active-active</SubType>
<Address xmlns:d5p1="Microsoft.Search.Autopilot.NetMux">
<d5p1:IPPrefix>192.168.0.3/21</d5p1:IPPrefix>
</Address>
<AddressV6 xmlns:d5p1="Microsoft.Search.Autopilot.NetMux">
<d5p1:IPPrefix>::/0</d5p1:IPPrefix>
</AddressV6>
<ManagementAddress xmlns:d5p1="Microsoft.Search.Autopilot.NetMux">
<d5p1:IPPrefix>0.0.0.0/0</d5p1:IPPrefix>
</ManagementAddress>
<ManagementAddressV6 xmlns:d5p1="Microsoft.Search.Autopilot.NetMux">
<d5p1:IPPrefix>::/0</d5p1:IPPrefix>
</ManagementAddressV6>
<SerialNumber i:nil="true" />
<Hostname>svcstr-7050-acs-1-Servers0-SC</Hostname>
</Device>
<Device i:type="Server">
<ElementType>Server</ElementType>
<Address xmlns:d5p1="Microsoft.Search.Autopilot.NetMux">
<d5p1:IPPrefix>192.168.0.2/21</d5p1:IPPrefix>
</Address>
<AddressV6 xmlns:d5p1="Microsoft.Search.Autopilot.NetMux">
<d5p1:IPPrefix>fc02:1000::2/64</d5p1:IPPrefix>
</AddressV6>
<ManagementAddress xmlns:d5p1="Microsoft.Search.Autopilot.NetMux">
<d5p1:IPPrefix>0.0.0.0/0</d5p1:IPPrefix>
</ManagementAddress>
<Hostname>Servers0</Hostname>
</Device>
```
Signed-off-by: Longxiang Lyu <lolv@microsoft.com>
How I did it
get_mux_cable_entries tries to get the mux cable device from the devices list, and reads the cable type and SoC IP addresses from the device definition.
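A minimal sketch of that flow; the dictionary shapes and key names are assumptions:

```
# Hedged sketch: derive cable_type and SoC addresses from the parsed devices.
def get_mux_cable_entries(mux_cable_ports, devices):
    entries = {}
    for port, server_name in mux_cable_ports.items():
        entry = {'state': 'auto'}
        cable = devices.get(server_name, {})
        if cable.get('subtype') == 'active-active':
            entry['cable_type'] = 'active-active'
            # SoC addresses come from the device's address definitions
            if cable.get('soc_ipv4'):
                entry['soc_ipv4'] = cable['soc_ipv4']
            if cable.get('soc_ipv6'):
                entry['soc_ipv6'] = cable['soc_ipv6']
        entries[port] = entry
    return entries
```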
How to verify it
Pass the unit-test