**- Why I did it**
I was asked to change the "Allow list" prefix-list generation rule.
Previously we generated the rules using the following method:
```
For each {prefix}/{masklen} we would generate the prefix-rule
permit {prefix}/{masklen} ge {masklen}+1
Example:
Prefix 1.2.3.4/24 would have the following prefix-list entry generated
permit 1.2.3.4/24 ge 25
```
But we discovered that the old rule doesn't work for all the cases we have,
so we introduced a new rule:
```
For an IPv4 entry:
For mask < 32, we will add 'le 32' to cover all prefix masks to be sent by T0
For mask = 32, we will not add any 'le mask'
For an IPv6 entry:
For mask < 128, we will add 'le 128' to cover all prefix masks to be sent by T0
For mask = 128, we will not add any 'le mask'
```
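The new rule can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the actual generator code: the function name `allow_list_entry` is hypothetical, and it relies only on the standard `ipaddress` module.

```python
import ipaddress


def allow_list_entry(prefix):
    """Return the 'permit' clause for one allowed prefix (hypothetical helper)."""
    # strict=False accepts prefixes with host bits set, e.g. 1.2.3.4/24
    net = ipaddress.ip_network(prefix, strict=False)
    max_len = net.max_prefixlen  # 32 for IPv4, 128 for IPv6
    if net.prefixlen < max_len:
        # Cover every more-specific mask that T0 may send
        return "permit {} le {}".format(prefix, max_len)
    # Host route: nothing more specific exists, so no 'le' is added
    return "permit {}".format(prefix)
```

For example, `allow_list_entry("10.20.0.0/16")` yields `permit 10.20.0.0/16 le 32`, while a /32 or /128 entry gets no `le` suffix.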
**- How I did it**
I changed the prefix-list entry generation function and added a test for the changed function.
**- How to verify it**
1. Build an image and put it on your dut.
2. Create a file test_schema.conf with the test configuration
```
{
"BGP_ALLOWED_PREFIXES": {
"DEPLOYMENT_ID|0|1010:1010": {
"prefixes_v4": [
"10.20.0.0/16",
"10.50.1.0/29"
],
"prefixes_v6": [
"fc01:10::/64",
"fc02:20::/64"
]
},
"DEPLOYMENT_ID|0": {
"prefixes_v4": [
"10.20.0.0/16",
"10.50.1.0/29"
],
"prefixes_v6": [
"fc01:10::/64",
"fc02:20::/64"
]
}
}
}
```
3. Apply the configuration by command
```
sonic-cfggen -j test_schema.conf --write-to-db
```
4. Check that your BGP configuration has the following prefix-list entries:
```
admin@str-s6100-acs-1:~$ show runningconfiguration bgp | grep PL_ALLOW
ip prefix-list PL_ALLOW_LIST_DEPLOYMENT_ID_0_COMMUNITY_1010:1010_V4 seq 10 deny 0.0.0.0/0 le 17
ip prefix-list PL_ALLOW_LIST_DEPLOYMENT_ID_0_COMMUNITY_1010:1010_V4 seq 20 permit 127.0.0.1/32
ip prefix-list PL_ALLOW_LIST_DEPLOYMENT_ID_0_COMMUNITY_1010:1010_V4 seq 30 permit 10.20.0.0/16 le 32
ip prefix-list PL_ALLOW_LIST_DEPLOYMENT_ID_0_COMMUNITY_1010:1010_V4 seq 40 permit 10.50.1.0/29 le 32
ip prefix-list PL_ALLOW_LIST_DEPLOYMENT_ID_0_COMMUNITY_empty_V4 seq 10 deny 0.0.0.0/0 le 17
ip prefix-list PL_ALLOW_LIST_DEPLOYMENT_ID_0_COMMUNITY_empty_V4 seq 20 permit 127.0.0.1/32
ip prefix-list PL_ALLOW_LIST_DEPLOYMENT_ID_0_COMMUNITY_empty_V4 seq 30 permit 10.20.0.0/16 le 32
ip prefix-list PL_ALLOW_LIST_DEPLOYMENT_ID_0_COMMUNITY_empty_V4 seq 40 permit 10.50.1.0/29 le 32
ipv6 prefix-list PL_ALLOW_LIST_DEPLOYMENT_ID_0_COMMUNITY_1010:1010_V6 seq 10 deny ::/0 le 59
ipv6 prefix-list PL_ALLOW_LIST_DEPLOYMENT_ID_0_COMMUNITY_1010:1010_V6 seq 20 deny ::/0 ge 65
ipv6 prefix-list PL_ALLOW_LIST_DEPLOYMENT_ID_0_COMMUNITY_1010:1010_V6 seq 30 permit fc01:10::/64 le 128
ipv6 prefix-list PL_ALLOW_LIST_DEPLOYMENT_ID_0_COMMUNITY_1010:1010_V6 seq 40 permit fc02:20::/64 le 128
ipv6 prefix-list PL_ALLOW_LIST_DEPLOYMENT_ID_0_COMMUNITY_empty_V6 seq 10 deny ::/0 le 59
ipv6 prefix-list PL_ALLOW_LIST_DEPLOYMENT_ID_0_COMMUNITY_empty_V6 seq 20 deny ::/0 ge 65
ipv6 prefix-list PL_ALLOW_LIST_DEPLOYMENT_ID_0_COMMUNITY_empty_V6 seq 30 permit fc01:10::/64 le 128
ipv6 prefix-list PL_ALLOW_LIST_DEPLOYMENT_ID_0_COMMUNITY_empty_V6 seq 40 permit fc02:20::/64 le 128
```
Co-authored-by: Pavel Shirshov <pavel.contrib@gmail.com>
**- Why I did it**
On teamd docker restart, swss and syncd need to be restarted because dependent resources are present.
**- How I did it**
Added teamd as a dependent service for swss.
Updated the docker-wait script to handle the service and its dependent services separately.
Handled the warm-restart case for the dependent service.
**- How to verify it**
Verified the following scenarios with the following testbed:
VM1 ----------------------------[DUT 6100] -----------------------VM2, ping traffic continuous between VMs
1. Stop teamd docker alone
> swss, syncd dockers seen going away
> The LAG reference count error messages seen for a while till swss docker stops.
> Dockers back up.
2. Enable WR mode for teamd. Stop teamd docker alone
> swss, syncd dockers not removed.
> The LAG reference count error messages not seen
> Repeated stop teamd docker test - same result, no effect on swss/syncd.
3. Stop swss docker.
> swss, teamd, syncd go off - dockers come back correctly, interfaces up
4. Enable WR mode for swss . Stop swss docker
> swss goes off not affecting syncd/teamd dockers.
5. Config reload
> no reference counter error seen, dockers come back correctly, with interfaces up
6. Warm reboot, observations below
> swss docker goes off first
> teamd + syncd go off at the end of the WR process.
> dockers come back up fine.
> ping traffic between VMs was NOT HIT
7. Fast reboot, observations below
> teamd goes off first ( **confirmed swss doesn't exit here** )
> swss goes off next
> syncd goes away at the end of the FR process
> dockers come back up fine.
> there is a traffic HIT as per fast-reboot
8. Verified the tests above, other than the WR/FB scenarios, on a multi-asic platform
The issue was that we were relying on the port_alias_asic_map dictionary,
but that dictionary can't be used because the alias name format has changed.
Fix the port alias mapping as needed.
Signed-off-by: Abhishek Dosi <abdosi@microsoft.com>
With python 2.7, importing the yaml module resulted in a huge heap memory allocation per process. As an interim fix, move the yaml import into the function that actually uses the module. This helps reduce the memory footprint of the pmon docker, as it doesn't use the APIs that need yaml processing.
This issue is not seen when importing yaml with python3 and needs further analysis; hence this fix goes into 201911, where we continue to use python2.7.
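The deferred-import pattern described here can be sketched as below. The function names `parse_yaml_text` and `device_name` are hypothetical and only illustrate the idea; `yaml.safe_load` is the standard PyYAML call.

```python
def parse_yaml_text(text):
    """Parse YAML text; yaml is imported only when this function is called."""
    import yaml  # deferred import: loaded on first call, not at module import
    return yaml.safe_load(text)


def device_name(raw):
    """Hypothetical helper that never needs yaml, so callers never load it."""
    return raw.strip().lower()
```

A process that only calls helpers like `device_name` never pays the cost of loading yaml, which is the effect the interim fix relies on.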
**- Why I did it**
FRR introduced [next hop tracking](http://docs.frrouting.org/projects/dev-guide/en/latest/next-hop-tracking.html) functionality.
That functionality requires resolving BGP neighbors before establishing a BGP connection (or an explicit ebgp-multihop command). Sometimes (BGP MONITORS) our neighbors are not directly connected and the sessions are IBGP. In this case the current configuration prevents FRR from establishing BGP connections; the reported reason is "waiting for NHT". To fix that we need to either add static routes for each not-directly-connected IBGP neighbor, or enable the command `ip nht resolve-via-default`.
**- How I did it**
Put `ip nht resolve-via-default` into the config
**- How to verify it**
Build an image. Enable a BGP_MONITOR entry and check that the session is Established or Connecting in FRR.
Co-authored-by: Pavel Shirshov <pavel.contrib@gmail.com>
Signed-off-by: Abhishek Dosi <abdosi@microsoft.com>
* Calculate ECMP hash seed based on ASIC ID on multi ASIC platform. Each ASIC will have a unique ECMP hash seed value.
Signed-off-by: Abhishek Dosi <abdosi@microsoft.com>
**- Why I did it**
BGP_MONITORS sessions don't have corresponding DEVICE_NEIGHBOR_METADATA CONFIG_DB entries in the minigraphs. Prevent bgpcfgd from waiting on such entries for BGP_MONITORS sessions.
**- How I did it**
Set the constructor argument to False, which means: don't wait for device neighbor metadata for BGP_MONITORS.
**- How to verify it**
Build an image, write it to your device, and use a minigraph with BGP_MONITORS sessions. Check that the sessions are populated in the config.
implements a new feature: "BGP Allow list."
This feature allows us to control which IP prefixes are going to be advertised via ebgp from the routes received from EBGP neighbors.
* Fix generate_l2_config: don't override hostname because sonic-cfggen may not read
from Redis. Fix test_l2switch_template test case to test preset l2
feature
* Improve test script: compare json files with sort_keys
* Revert changes on sample_output
* Remove members field in VLAN section. Fix test assertTrue statement.
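Comparing JSON files with `sort_keys`, as mentioned in the list above, can be sketched as follows. The helper name `json_equal` is hypothetical; the actual test script may be structured differently.

```python
import json


def json_equal(text_a, text_b):
    """Compare two JSON documents independent of key order (hypothetical helper)."""
    # Re-serialize with sorted keys so both sides share one canonical form
    a = json.dumps(json.loads(text_a), sort_keys=True)
    b = json.dumps(json.loads(text_b), sort_keys=True)
    return a == b
```

The canonical strings also produce stable diffs when a comparison fails, which makes a mismatch easier to read than a raw dictionary comparison.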
Fix load minigraph on 201911 branch. (#1124)
Fixed config load_minigraph not working for Multi-asic platforms.
(#1123)
Signed-off-by: Abhishek Dosi <abdosi@microsoft.com>
[Namespace]: Fix SAI_ID key used in cpfcIfTable and
csqIfQosGroupStatsTable implementation (#138)
Implementation changes for CiscoBgp4MIB (#158)
[ciscoSwitchQosMIB]: Remove invocation of update_data function
during (#161)
Signed-off-by: Abhishek Dosi <abdosi@microsoft.com>
implements a new feature: "BGP Allow list."
This feature allows us to control which IP prefixes are going to be advertised via ebgp from the routes received from EBGP neighbors.
When stopping the swss, pmon or bgp containers, log messages like the following can be seen:
```
Aug 23 22:50:43.789760 sonic-dut INFO swss#supervisord 2020-08-23 22:50:10,061 ERRO pool dependent-startup event buffer overflowed, discarding event 34
Aug 23 22:50:43.789760 sonic-dut INFO swss#supervisord 2020-08-23 22:50:10,063 ERRO pool dependent-startup event buffer overflowed, discarding event 35
Aug 23 22:50:43.789760 sonic-dut INFO swss#supervisord 2020-08-23 22:50:10,064 ERRO pool dependent-startup event buffer overflowed, discarding event 36
Aug 23 22:50:43.789760 sonic-dut INFO swss#supervisord 2020-08-23 22:50:10,066 ERRO pool dependent-startup event buffer overflowed, discarding event 37
```
This is due to the number of programs in the container managed by supervisor, all generating events at the same time. The default event queue buffer size in supervisor is 10. This patch increases that value in all containers in order to eliminate these errors. As more programs are added to the containers, we may need to further adjust these values. I increased all buffer sizes to 25 except for containers with more programs or templated supervisor.conf files which allow for a variable number of programs. In these cases I increased the buffer size to 50. One final exception is the swss container, where the buffer fills up to ~50, so I increased this buffer to 100.
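One way to check the resulting buffer sizes is sketched below. It is an illustration only: the sample configuration text, the `other-listener` section, and the helper name `listener_buffer_sizes` are assumptions, while `buffer_size` is the supervisor `[eventlistener:x]` setting whose default of 10 is mentioned above.

```python
import configparser

# Illustrative sample only; real supervisord.conf files contain more settings
SAMPLE_CONF = """
[supervisord]
nodaemon=true

[eventlistener:dependent-startup]
events=PROCESS_STATE
buffer_size=100

[eventlistener:other-listener]
events=PROCESS_STATE
"""


def listener_buffer_sizes(conf_text, default=10):
    """Map each event listener name to its effective buffer_size."""
    # interpolation=None keeps '%' sequences in supervisor configs untouched
    parser = configparser.ConfigParser(interpolation=None, strict=False)
    parser.read_string(conf_text)
    sizes = {}
    for section in parser.sections():
        if section.startswith("eventlistener:"):
            name = section.split(":", 1)[1]
            # Fall back to the supervisor default when buffer_size is absent
            sizes[name] = parser.getint(section, "buffer_size", fallback=default)
    return sizes
```

Running the helper over each container's supervisor configuration makes it easy to spot listeners still using the default size.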
Resolves https://github.com/Azure/sonic-buildimage/issues/5241
Revert "Revert " [201911]show interface counters for multi ASIC devices
(#1104)""
Revert "Revert "Pfcstat (#1097)""
[show] Fix 'show int neighbor expected' (#1106)
Update argument for docker exec it->i (#1118)
Update to make config load/reload backward compatible. (#1115)
Handling deletion of Port Channel before deletion of its members
(#1062)
Skip default route present in ASIC-DB but not in APP-DB. (#1107)
[CLI][PFCWD][Multi-ASIC] Added multi ASIC support to 'pfcwd' CLI
(#1102)
[201911] Multi asic platform config interface portchannel, show
transceiver (#1087)
[drop counters] Fix configuration for counters with lowercase
names (#1103)
Signed-off-by: Abhishek Dosi <abdosi@microsoft.com>
[xcvrd] Don't log unnecessary messages upon empty transceiver change
event (#53)
[thermalctld] Optimize the thermal policy loop to make it execute
every 60 seconds (#77)
[thermalctld] Fix issue: fan status should not be True when fan is
absent (#92)
Signed-off-by: Abhishek Dosi <abdosi@microsoft.com>
Avoid adding loopback interface (ip link add) when setting nat zone on
loopback interface (#1434)
[acl] Remove Ethertype from L3V6 qualifiers (#1433)
Sflow fixes during DEL processing (#1427)
Fix #3971 by skipping create-only SAI attributes when modifying
buffer pools or profiles in orchagent (#1430)
Fix issue: bufferorch only pass the first attribute to sai when
setting attribute (#1442)
Signed-off-by: Abhishek Dosi <abdosi@microsoft.com>
Fix SubscriberStateTable::hasCachedData formula for a timing risk
(#379)
Add restapi DB (#386)
Fix swss::exec return value (#368)
Signed-off-by: Abhishek Dosi <abdosi@microsoft.com>
Printing both snapshot and current counter sets will make it easier to pinpoint
which message type(s) is/are not being relayed. This PR prints both counter sets.
Also, this PR defines gnu11 as a C standard to compile with in order to avoid
making changes when porting to 201811 branch.
signed-off-by: Tamer Ahmed <tamer.ahmed@microsoft.com>
When BGP routes are missing, DHCP packets get relayed over mgmt
interface. This results in dhcpmon alerting that DHCP packets are
not being relayed. This PR includes the mgmt interface as an uplink
device, so if a DHCP packet gets relayed over the mgmt interface, the
regular dhcpmon alert will not be issued. Instead, dhcpmon will
check the mgmt interface counts and issue a separate alert regarding
packets travelling through mgmt network.
In addition, this PR includes the following enhancements:
1. Add SIGUSR1 handler that prints out current packet counts
2. Increase the alert grace window from the current 2 minutes to 3 minutes
3. Time is now computed more accurately
4. Print vlan name before counters
signed-off-by: Tamer Ahmed <tamer.ahmed@microsoft.com>
update the environment variable in the teardown (#1101)
Fix for show interface portchannel not working on 201911 (#1105)
Revert "Pfcstat (#1097)"
Revert " [201911]show interface counters for multi ASIC devices
(#1104)"
Signed-off-by: Abhishek Dosi <abdosi@microsoft.com>
update the environment variable in the teardown (#1101)
Fix for show interface portchannel not working on 201911 (#1105)
[201911]show interface counters for multi ASIC devices (#1104)
Signed-off-by: Abhishek Dosi <abdosi@microsoft.com>
Parse quagga output without knowledge about hostname, so robust
against hostname changes or mismatch (#124)
Signed-off-by: Abhishek Dosi <abdosi@microsoft.com>
Add ip_prefix len based on proxy_arp status (#1096)
[sonic-cfggen][QoS][multi ASIC] Multi ASIC QoS and Buffer config
generation support, merge from master (#1095)
Signed-off-by: Abhishek Dosi <abdosi@microsoft.com>
Update transceiver info DB key names (#146)
[Multiasic]: Provide namespace support for ipNetToMediaPhysAddress (#129)
[LLDP]: Modify OID index of LLDPRemTableUpdater MIB (#155)
[Namespace]: Simplify sync_d functions to use higher order (#154)
[Namespace]: Fix interface counters in RFC 1213 (#145)
Signed-off-by: Abhishek Dosi <abdosi@microsoft.com>