Commit Graph

1142 Commits

Author SHA1 Message Date
abdosi
acb2e94475
[chassis] Added support of isolating given LC in Chassis with TSA mode (#16732)
What I did:
Added support for fully isolating a Line Card (LC) when TSA is applied on it,
so that it is isolated from all e-BGP peer devices, whether on this LC or on a remote LC.

Why I did:
Currently, when TSA is executed on an LC, routes are withdrawn only from its directly connected e-BGP peers. e-BGP peers on a remote LC can still learn these routes (via i-BGP) and attract traffic towards the isolated LC.

How I did:

When TSA is applied on an LC, all routes that it advertises via i-BGP are tagged with the no-export community, so that a remote LC receiving these routes does not send them to its connected e-BGP peers.

Also, when a route with no-export is received over iBGP, we match on it and set the local preference of that route to a lower value (80) so that the route is removed from the forwarding database. The scenario below explains why we do this:

- LC1 advertises R1 to LC3
- LC2 advertises R1 to LC3
- On LC3 we have multi-path/ECMP over both LC1 and LC2
- On LC3, R1 received from LC1 is considered the best route over R1 received from LC2 and is sent to LC3's e-BGP peers
- Now we do TSA on LC2
- LC3 will receive R1 from LC2 with community no-export and from LC1 same as earlier (no change)
- LC3 will still get traffic for R1 since it is still advertised to e-BGP peers (since R1 from LC1 is the best route)
- LC3 will forward to both LC1 and LC2 (ECMP), which is a problem because LC2 is in TSA mode and should not receive traffic

To fix the above scenario, we lower the local preference of R1 received from LC2 so that it is removed from the multi-path/ECMP group.
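
The effect of lowering the local preference can be illustrated with a small Python sketch. This is not FRR code: the `Path` type, the helper names, and the default local preference of 100 are assumptions made for illustration, while the value 80 and the no-export community come from the description above.

```python
from dataclasses import dataclass, field

NO_EXPORT = "no-export"
TSA_LOCAL_PREF = 80       # value set on routes received with no-export
DEFAULT_LOCAL_PREF = 100  # assumed default for iBGP-learned routes


@dataclass
class Path:
    prefix: str
    next_hop: str
    communities: set = field(default_factory=set)
    local_pref: int = DEFAULT_LOCAL_PREF


def apply_inbound_policy(path: Path) -> Path:
    """Mimic the inbound route-map: lower local-pref when no-export is set."""
    if NO_EXPORT in path.communities:
        path.local_pref = TSA_LOCAL_PREF
    return path


def ecmp_group(paths):
    """Only paths sharing the highest local preference are multipath candidates."""
    best = max(p.local_pref for p in paths)
    return sorted(p.next_hop for p in paths if p.local_pref == best)


# LC2 is in TSA mode, so its copy of R1 arrives on LC3 tagged no-export.
paths = [
    apply_inbound_policy(Path("R1", "LC1")),
    apply_inbound_policy(Path("R1", "LC2", {NO_EXPORT})),
]
print(ecmp_group(paths))  # ['LC1']
```

Without the local-preference change, both paths would keep the same preference and LC2 would stay in the ECMP group.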

How I verify:

UT has been added to make sure Template generation is correct
Manual Verification of the functionality
sonic-mgmt test case will be updated accordingly.
Please note this PR is on top of #16714, which needs to be merged first.

Signed-off-by: Abhishek Dosi <abdosi@microsoft.com>
2024-01-05 12:24:31 -08:00
Dev Ojha
a0e2082efd
Update Dockerfile.j2 (#17663) 2024-01-04 09:05:42 -08:00
Xichen96
ffe292a021
[dhcp_server] add config dhcp server enable (#17605)
* add config dhcp_server enable

* fix bug

* fix bug

* fix bug

* fix bug
2024-01-03 13:39:39 -08:00
Xichen96
7011e00eba
[dhcp_server] add dhcp server show option (#17469)
* add show dhcp_server option

* Option to option id
2024-01-03 13:39:03 -08:00
Ze Gan
0e58512353
[docker-databse]: Revise database_global schema for dpu (#17443)
### Why I did it
Revise DPU's database_global.json schema to achieve a more general design

### How I did it
1. Remove databse_type
2. Add a new field databse_name
2023-12-29 14:20:52 -08:00
Xichen96
08666100fc
[dhcp_server] add config dhcp server del (#17603)
* add config dhcp server del
2023-12-22 09:07:24 -08:00
Xichen96
13a16cf87f
[dhcp_server] add config dhcp_server add (#17489)
* dhcp_server add
* add test dup gw nm
2023-12-20 09:09:46 -08:00
Xichen96
1e92ba24ec
[dhcp_server] add show dhcp server info (#17468)
* add show dhcp server info
2023-12-20 09:07:32 -08:00
Junchao-Mellanox
f3f2972512
Optimize syslog rate limit feature for fast and warm boot (#17458)
- Why I did it
Optimize syslog rate limit feature for fast and warm boot

- How I did it
Optimize redis start time
Don't render rsyslog.conf in container startup script
Disable containercfgd by default. There is a new CLI to enable it (in another PR)

- How to verify it
Manual test
Regression test
2023-12-20 09:12:03 +02:00
Yaqiang Zhu
728df4e89d
[dhcp_relay] Optimize j2 file in dhcp_relay container (#17506) 2023-12-15 15:47:40 -08:00
Ze Gan
b21f33b8b1
[Azp]: Fix azp on building ubuntu20.04 and sonic-mgmt (#17439)
The Azp build failed for ubuntu20.04 and sonic-mgmt because of the sonic-dash-api update.

Signed-off-by: Ze Gan <ganze718@gmail.com>
2023-12-13 22:49:04 -08:00
Xichen96
5992765d94
[dhcp_server] add show range cli (#17262)
* add show range

* add support for single ip
2023-12-07 14:50:38 -08:00
Ying Xie
2e072beb41
Revert "[pmon] update gRPC version to 1.57.0 (#16257)" (#17401)
This reverts commit 45a852233b.
2023-12-07 11:01:47 -08:00
SuvarnaMeenakshi
90dc254656
[SNMP]: Modify minigraph parser to update SNMP_AGENT_ADDRESS_CONFIG table (#17045)
#### Why I did it
SNMP queries over IPv6 do not work because of an issue in net-snmp where IPv6 queries fail in a multi-NIC environment.
As a workaround, the issue is not seen if snmpd listens on a specific IPv4 or IPv6 address.
We plan to configure the Management IP and Loopback IP from minigraph.xml as SNMP_AGENT_ADDRESS in config_db, based on the changes discussed in https://github.com/sonic-net/SONiC/pull/1457.

##### Work item tracking
- Microsoft ADO **(number only)**:26091228

#### How I did it
Modify minigraph parser to update SNMP_AGENT_ADDRESS_CONFIG with management and Loopback0 IP addresses.
Modify snmpd.conf.j2 to use the SNMP_AGENT_ADDRESS_CONFIG table if it is present in config_db; if not, listen on any IP.
Main change:
1. If minigraph.xml is used to configure the device, snmpd will listen on the mgmt and loopback IP addresses.
2. If config_db is used to configure the device, snmpd will listen on the IPs present in SNMP_AGENT_ADDRESS_CONFIG if that table is present; if the table is not present, snmpd will listen on any IP.
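
The selection logic can be sketched in Python as follows. This is only an illustration and not the actual snmpd.conf.j2 template: the function name, the exact `agentAddress` line format, and the `ip|port|vrf` key layout are assumptions inferred from the sample config_db output in this message.

```python
import ipaddress


def agent_addresses(config_db: dict) -> list:
    """Return snmpd agentAddress lines derived from SNMP_AGENT_ADDRESS_CONFIG."""
    table = config_db.get("SNMP_AGENT_ADDRESS_CONFIG")
    if not table:
        # Table absent or empty: listen on any IP (assumed fallback lines).
        return ["agentAddress udp:161", "agentAddress udp6:161"]

    lines = []
    for key in table:
        # Keys are assumed to look like "<ip>|<port>|<vrf>", with vrf possibly empty.
        ip, port, vrf = key.split("|")
        addr = ipaddress.ip_address(ip)
        if addr.version == 6:
            line = f"agentAddress udp6:[{addr}]:{port}"
        else:
            line = f"agentAddress udp:{addr}:{port}"
        if vrf:
            line += f"%{vrf}"
        lines.append(line)
    return lines


sample = {"SNMP_AGENT_ADDRESS_CONFIG": {"10.1.0.32|161|": {}, "FC00:1::32|161|": {}}}
for line in agent_addresses(sample):
    print(line)
```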
#### How to verify it
config_db.json created from minigraph.xml for single asic VS image with mgmt and Loopback IP addresses.
```
    "SNMP_AGENT_ADDRESS_CONFIG": {
        "10.1.0.32|161|": {},
        "10.250.0.101|161|": {},
        "FC00:1::32|161|": {},
        "fec0::ffff:afa:1|161|": {}
    },
 .....
 
 snmpd listening on the above IP addresses:
 admin@vlab-01:~$ sudo netstat -tulnp | grep 161
tcp        0      0 127.0.0.1:3161          0.0.0.0:*               LISTEN      71522/snmpd         
udp        0      0 10.250.0.101:161        0.0.0.0:*                           71522/snmpd         
udp        0      0 10.1.0.32:161           0.0.0.0:*                           71522/snmpd         
udp6       0      0 fec0::ffff:afa:1:161    :::*                                71522/snmpd         
udp6       0      0 fc00:1::32:161          :::*                                71522/snmpd  
```
2023-12-06 13:23:02 -08:00
Nazarii Hnydyn
1ff27db42f
[frr]: Force disable next hop group support. (#17344)
Signed-off-by: Nazarii Hnydyn <nazariig@nvidia.com>

Closes #17345

This W/A was proposed by Nvidia FRR team before the long term solution is ready.

Why I did it
A W/A to fix default route installation during LAG member flap
Work item tracking
N/A
How I did it
Disabled FRR next hop group support
How to verify it
Do LAG member flap
2023-12-06 11:09:54 +08:00
Ashwin Hiranniah
ada7c6a72e
Add pensando platform (#15978)
This commit adds support for pensando asic called ELBA. ELBA is used in pci based cards and in smartswitches.

#### Why I did it
This commit introduces pensando platform which is based on ELBA ASIC.
##### Work item tracking
- Microsoft ADO **(number only)**:

#### How I did it
Created platform/pensando folder and created makefiles specific to pensando.
This mainly creates the pensando docker (which OEMs need to download before building an image), which has all the userspace needed to initialize and use the DPU (ELBA ASIC).
The build process produces two images, which can be used from ONIE and goldfw.
The recommendation is to use ONIE.
#### How to verify it
Load the SONiC image via ONIE or goldfw and make sure the interfaces are UP.

##### Description for the changelog
Add pensando platform support.
2023-12-04 14:41:52 -08:00
Dev Ojha
19b216457e
[Snappi] Update snappi module on sonic-mgmt docker (#17269)
* Update snappi module on Dockerfile.j2

* Update snappi module on Dockerfile.j2

* Update snappi module for py2 and venv
2023-11-30 21:33:00 -08:00
ShiyanWangMS
eba6ef0aa9
Remove Python3 venv in Python3-only sonic-mgmt-docker (#17337)
How I did it
Remove Python3 venv in Python3-only sonic-mgmt-docker

How to verify it
There is no impact to sonic-mgmt-docker:latest tag.
Build sonic-mgmt-docker with LEGACY_SONIC_MGMT_DOCKER=y, see python3 venv is there.
Build sonic-mgmt-docker with LEGACY_SONIC_MGMT_DOCKER=n, see python3 venv is NOT included.
2023-11-30 09:23:25 +08:00
Yaqiang Zhu
da80593ecb
[dhcp_relay] Use dhcprelayd to manage critical processes (#17236)
1. Modify j2 template files in docker-dhcp-relay. Add dhcprelayd to group dhcp-relay instead of isc-dhcp-relay-VlanXXX, which makes dhcprelayd a critical process.
2. In dhcprelayd, subscribe to the FEATURE table to check whether the dhcp_server feature is enabled.
2.1 If the dhcp_server feature is disabled, the original dhcp_relay functionality is needed and dhcprelayd does nothing. Because the dhcrelay/dhcpmon configuration is generated in the supervisord configuration, they run automatically.
2.2 If the dhcp_server feature is enabled, dhcprelayd stops the dhcpmon/dhcrelay processes started by supervisord and subscribes to dhcp_server related tables in config_db to start dhcpmon/dhcrelay processes.
2.3 While dhcprelayd is running, it regularly checks the feature status (by default every 5s) and can encounter the following 4 state changes of the dhcp_server feature:
A) disabled -> enabled
In this scenario, dhcprelayd subscribes to dhcp_server related tables, stops the dhcpmon/dhcrelay processes started by supervisord, and starts a new pair of dhcpmon/dhcrelay processes. After this, the dhcpmon/dhcrelay processes are fully managed by dhcprelayd.
B) enabled -> enabled
In this scenario, dhcprelayd monitors db changes in dhcp_server related tables to determine whether to restart the dhcpmon/dhcrelay processes.
C) enabled -> disabled
In this scenario, dhcprelayd unsubscribes from dhcp_server related tables and kills the dhcpmon/dhcrelay processes it started itself. Then dhcprelayd starts the dhcpmon/dhcrelay processes via supervisorctl.
D) disabled -> disabled
dhcprelayd checks whether the running status of the dhcrelay processes is consistent with the supervisord configuration file. If it is not, dhcprelayd kills itself, and the dhcp_relay container then stops because dhcprelayd is a critical process.
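
The four transitions above can be summarized as a small dispatch function. This is a hedged sketch rather than the dhcprelayd implementation: the function name and the action strings are hypothetical labels for the steps described in this message.

```python
def on_feature_check(was_enabled: bool, is_enabled: bool, consistent: bool = True) -> list:
    """Return the (hypothetically named) actions for one periodic feature check."""
    if not was_enabled and is_enabled:
        # A) disabled -> enabled
        return [
            "subscribe_dhcp_server_tables",
            "stop_supervisord_processes",
            "start_own_processes",
        ]
    if was_enabled and is_enabled:
        # B) enabled -> enabled
        return ["monitor_db_changes"]
    if was_enabled and not is_enabled:
        # C) enabled -> disabled
        return [
            "unsubscribe_dhcp_server_tables",
            "kill_own_processes",
            "start_processes_via_supervisorctl",
        ]
    # D) disabled -> disabled: exit only if dhcrelay differs from supervisord config
    return [] if consistent else ["exit_dhcprelayd"]


print(on_feature_check(False, True))
```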
2023-11-27 09:30:01 -08:00
Shashanka Balakuntala
8b192a1151
[dhcp-relay]: Modify dhcp relay to pick primary address (#17012)
This change is part of the HLD sonic-net/SONiC#1470 and is a follow-up to PR #16827. In docker-dhcp, we pick the primary gateway of the interface from the VLAN_Interface table, for interfaces that have the "secondary" flag set in config_db.

Microsoft ADO (number only): 16784946

How did I do it
- Changes in the j2 file to add a new "-pg" parameter in dhcpv4-relay.agents.j2. The IP is retrieved from config_db's vlan_interface table, such that the interfaces picked are those that have the secondary field set.

- Changes in isc-dhcp to re-order the addresses of the discovered interface that has the IP passed in the parameter.
2023-11-22 15:05:32 -08:00
Saikrishna Arcot
730faa152a
Fix docker-sonic-mgmt-framework for armhf
Signed-off-by: Saikrishna Arcot <sarcot@microsoft.com>
2023-11-21 18:53:15 -08:00
Ze Gan
a87cddc6c9
[docker-database-init.sh]: Fix wrong creating of database_global.json in multi asic platform (#17221)
Fix bug: #17161 (comment)
On multi-asic platforms, it will never go to the else part, as DATABASE_TYPE is always "".


Microsoft ADO (number only): 25072889

Move the check NAMESPACE_ID == "" back

Signed-off-by: Ze Gan <ganze718@gmail.com>
2023-11-21 09:41:20 -08:00
Xichen96
ee38e2447d
[dhcp_server] Add show dhcp_server ipv4 lease (#17125)
* Add show dhcp_server ipv4 lease
* add ut for show dhcp_server ipv4 lease
2023-11-21 08:42:07 -08:00
abdosi
4a7aa2634f
[chassis] Support advertisement of Loopback0 of all LC's across all e-BGP peers in TSA mode (#16714)
What I did:
In Chassis TSA mode, the Loopback0 IPs of each LC should be advertised through the e-BGP peers of each remote LC.

How I did:

- Route-map policy to advertise the LC's own Loopback IP to other internal iBGP peers with the community internal_community, as defined in constants.yml
- Route-map policy to match on the above internal_community when a route is received from internal iBGP peers, set an internal tag as defined in constants.yml, and delete the internal_community so we don't send it to any e-BGP peers
- In TSA, a new route-map matches on the above internal tag, permits the route (Loopback0 IPs of remote LCs), and sets the community to traffic_shift_community.
- In TSB, the above new route-map is deleted.

How I verify:

Manual Verification

UT updated.
sonic-mgmt PR: sonic-net/sonic-mgmt#10239


Signed-off-by: Abhishek Dosi <abdosi@microsoft.com>
2023-11-20 09:42:02 -08:00
abdosi
e37b4f3cfa
Revert iBGP GTSM feature for VOQ Chassis (#17037)
What I did:

Revert the GTSM feature for VOQ iBGP session done as part of #16777.

Why I did:
On a VOQ chassis, BGP packets go over the Recycle Port and then through Ingress Pipeline Routing, which makes the TTL 254 and fails the single-hop check.

Signed-off-by: Abhishek Dosi <abdosi@microsoft.com>
2023-11-17 17:03:37 -08:00
Ze Gan
9f08f88a0d
[dpu]: Add DPU database service (#17161)
Sub PRs:

sonic-net/sonic-host-services#84
#17191

Why I did it
According to the design, the database instances of DPU will be kept in the NPU host.

Microsoft ADO (number only): 25072889

How I did it
To follow the multiple ASIC design, I assume a new platform environment variable NUM_DPU will be defined in the /usr/share/sonic/device/$PLATFORM/platform_env.conf. Based on this number, NPU host will launch a corresponding number of instances for the DPU database.
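
As a rough illustration of this approach, the sketch below reads NUM_DPU from the text of a platform_env.conf file and derives one database instance per DPU. The `KEY=VALUE` file format and the `database@dpuN` instance naming are assumptions for illustration only; the actual service names may differ.

```python
def parse_num_dpu(conf_text: str) -> int:
    """Return NUM_DPU from KEY=VALUE text, or 0 when it is not defined."""
    for raw in conf_text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        if key.strip() == "NUM_DPU":
            return int(value.strip())
    return 0


def dpu_database_instances(conf_text: str) -> list:
    """One database instance name per DPU (hypothetical naming scheme)."""
    return [f"database@dpu{i}" for i in range(parse_num_dpu(conf_text))]


print(dpu_database_instances("NUM_ASIC=1\nNUM_DPU=2\n"))
```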

Signed-off-by: Ze Gan <ganze718@gmail.com>
2023-11-17 09:10:03 -08:00
Yaqiang Zhu
3223ca0156
[dhcp_server] Add config_db monitor and customize options for dhcpservd (#17051)
Why I did it
Add config_db monitor and customize options for dhcpservd. HLD: sonic-net/SONiC#1282

Work item tracking
Microsoft ADO (number only): 25600859
How I did it
Add support to customize unassigned DHCP options. Currently supported types: binary, boolean, ipv4-address, string, uint8, uint16, uint32
Add db config change monitor for dhcpservd
How to verify it
Unit tests in sonic-dhcp-server all passed
2023-11-16 08:56:50 -08:00
Lawrence Lee
04b30fc378
[tph]: Detect LAG flaps from APPL_DB (#16879)
Why I did it
A race condition exists while the TPH is processing a netlink message - if a second netlink message arrives during processing it will be missed since TPH is not listening for other messages.
Another bug was found where TPH was unnecessarily restarting since it was checking admin status instead of operational status of portchannels.

How I did it
Subscribe to APPL_DB for updates on LAG operational state
Track currently sniffed interfaces

How to verify it
Send tunnel packets with destination IP of an unresolved neighbor, verify that ping commands are run
Shut down a portchannel interface, verify that sniffer does not restart
Send tunnel packets, verify ping commands are still run
Bring up portchannel interface, verify that sniffer restarts

Signed-off-by: Lawrence Lee <lawlee@microsoft.com>
2023-11-09 16:01:59 -08:00
JunhongMao
4da5099919
[VOQ][saidump] Install rdbtools into the docker base related containers. (#16466)
Fix #13561

The existing saidump uses the https://github.com/sonic-net/sonic-swss-common/blob/master/common/table_dump.lua script, which loops over the ASIC_DB for more than 5 seconds and blocks other processes' access.

This solution uses the Redis SAVE command to save the snapshot of DB each time and recover later, instead of looping through each entry in the table.

Related PRs:
sonic-net/sonic-utilities#2972
sonic-net/sonic-sairedis#1288
sonic-net/sonic-sairedis#1298

How did I do it?
Use the Redis SAVE option to save a snapshot of the DB each time and recover it later, instead of looping through each entry in the table and saving it.

1. Updated dockers/docker-base-bullseye/Dockerfile.j2 to install the Python library rdbtools into all the docker-base-bullseye containers.

2. Updated sonic-buildimage/src/sonic-sairedis/saidump/saidump.cpp to add a new option -r, which updates the format of the JSON files output by rdbtools.

3. Added a new script file, syncd/scripts/saidump.sh, to the sairedis repo. This shell script does the following steps:

  For each ASIC, such as ASIC0,

  3.1. Configure the Redis persistence directory.
  redis-cli -h $hostname -p $port CONFIG SET dir $redis_dir > /dev/null

  3.2. Save the Redis data.
  redis-cli -h $hostname -p $port SAVE > /dev/null

  3.3. Run rdb command to convert the dump files into JSON files
    rdb --command json $redis_dir/dump.rdb | tee $redis_dir/dump.json > /dev/null

  3.4. Run saidump -r to convert the JSON files to the same format as the previous saidump output,
       so that the saidump result is available on standard output.
       saidump -r $redis_dir/dump.json -m 100

  3.5. Clear the temporary files.
   rm -f $redis_dir/dump.rdb
   rm -f $redis_dir/dump.json

4. Updated sonic-buildimage/src/sonic-utilities/scripts/generate_dump to check the ASIC DB size: if it is larger than ROUTE_TAB_LIMIT_DIRECT_ITERATION entries (default value 24000), use Redis SAVE; otherwise, use the old method of looping through each entry of the Redis DB.
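
The size check in step 4 amounts to a simple threshold decision, sketched below in Python. The real logic lives in the generate_dump shell script; the function name and the returned labels here are hypothetical, and only the constant name and its default of 24000 come from this message.

```python
ROUTE_TAB_LIMIT_DIRECT_ITERATION = 24000  # default value quoted in this message


def choose_dump_method(asic_db_entries: int,
                       limit: int = ROUTE_TAB_LIMIT_DIRECT_ITERATION) -> str:
    """Pick the dump strategy based on the number of ASIC DB entries."""
    if asic_db_entries > limit:
        # Large DB: snapshot with Redis SAVE and convert the RDB file offline.
        return "redis_save"
    # Small DB: iterate over each entry directly, as before.
    return "direct_iteration"


print(choose_dump_method(96000))
print(choose_dump_method(1000))
```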

How to verify it
On T2 setup with more than 96K routes, execute CLI command -- generate_dump
No error should be shown
Download the generate_dump result and verify the saidump file after unpacking it.
2023-11-08 11:57:25 -08:00
ganglv
c71fb3a30f
Share image for gnmi and telemetry (#16863)
Why I did it
Share docker image to support gnmi container and telemetry container

Work item tracking
Microsoft ADO: 25423918
How I did it
Create telemetry image from gnmi docker image.
Enable gnmi container and disable telemetry container by default.

How to verify it
Run end to end test.
2023-11-08 08:54:36 +08:00
ShiyanWangMS
7013b05899
Add new docker-sonic-mgmt makefile flag: LEGACY_SONIC_MGMT_DOCKER (#17070)
Why I did it
This is part of the Python3 migration project. This PR adds a new makefile flag: LEGACY_SONIC_MGMT_DOCKER.
By default, LEGACY_SONIC_MGMT_DOCKER = y builds sonic-mgmt-docker with Python2 and Python3.
LEGACY_SONIC_MGMT_DOCKER = n builds sonic-mgmt-docker with Python3 only.

Work item tracking
Microsoft ADO (number only): 25254349

How I did it
Add makefile flag: LEGACY_SONIC_MGMT_DOCKER

How to verify it
By default, sonic-mgmt-docker is built with Python2 and Python3. No change compared to before.
Setting LEGACY_SONIC_MGMT_DOCKER=n builds sonic-mgmt-docker with Python3 only.
2023-11-03 09:04:01 +08:00
Yaqiang Zhu
274d320443
[dhcp_server] Add dhcprelayd for dhcp_server feature (#16947)
Add support in dhcp_relay container for dhcp_server_ipv4 feature. HLD: sonic-net/SONiC#1282
2023-11-02 08:09:01 -07:00
ShiyanWangMS
fe735e35c6
Upgrade Ansible to 6.7.0 and make Python3 as the default interpreter in sonic-mgmt-docker (#17021)
Why I did it
This PR is part of sonic-mgmt-docker Python3 migration project.

Work item tracking
Microsoft ADO (number only): 24397943

How I did it
Upgrade Ansible to 6.7.0
Make Python3 the default interpreter. python is a soft link to python3. If you want to use python2, use the command python2 explicitly.
Upgrade some pip packages to higher versions in order to meet security requirements.

How to verify it
Build a private sonic-mgmt-docker successfully.
Verify python is python3.
Verify python2 is working with 202012 and 202205 branch.
Verify python3 is working with master branch.
Verify with github PR test.
2023-10-31 09:44:55 +08:00
Saikrishna Arcot
87f137be25
Upgrade paramiko in docker-ptf to 2.9.5 (#16897)
With Debian Bookworm, Paramiko 2.9 or newer will need to be used to be
able to connect to devices running that version of Debian
(specifically, to those running OpenSSH 9.2).

Paramiko is currently on 3.3.1. For now, upgrade to 2.9.5.

Signed-off-by: Saikrishna Arcot <sarcot@microsoft.com>
2023-10-24 22:51:16 -07:00
Yaqiang Zhu
73dd38a5ce
[dhcp_server] Add dhcpservd to dhcp_server container (#16560)
Why I did it
Part implementation of dhcp_server. HLD: sonic-net/SONiC#1282.
Add dhcpservd to dhcp_server container.

How I did it
Add installing required pkg (psutil) in Dockerfile.
Add copying required file to container in Dockerfile (kea-dhcp related and dhcpservd related)
Add critical_process and supervisor config.
Add support for generating kea config (only in dhcpservd.py) and updating lease table (in dhcpservd.py and lease_update.sh)

How to verify it
Build the image with INCLUDE_DHCP_SERVER set to y and enable the dhcp_server feature after installing the image; the container starts as expected.
Enter the container and confirm that all processes defined in the supervisor configuration are running as expected.
Kill processes defined in critical_processes; the container exits.
2023-10-20 09:52:05 -07:00
Stepan Blyshchak
7ab27c1b90
[frr] fix default zebra config not inserted into empty zebra.conf (#16747)
Signed-off-by: Stepan Blyschak <stepanb@nvidia.com>
2023-10-19 08:47:24 +08:00
abdosi
7059f42385
[chassis/multi-asic] Make sure iBGP session established as directly connected (#16777)
What I did:
Make sure that internal iBGP peers are one hop away (directly connected) by using the Generalized TTL Security Mechanism (GTSM).

Why I did:
Without this change, on a packet chassis i-BGP can be established even if there is no direct connection. Below is an example:

- Let's say we have 3 LCs (LC1/LC2/LC3), each having an i-BGP session with the others over Loopback4096
- Each LC has a static route towards the other LCs' Loopback4096 to establish the i-BGP session
- LC1 learns the default route 0.0.0.0/0 from its e-BGP peers and sends it to LC2 and LC3 over i-BGP
- Now, if for some reason the static route on LC2 towards LC3 is removed or not present, we expect the i-BGP session between LC2 and LC3 to go down
- However, i-BGP between LC2 and LC3 does not go down because of the feature ip nht-resolve-via-default, where LC2 uses the default route to reach Loopback4096 of LC3. As it is using the default route, BGP packets from LC2 towards LC3 are first routed to LC1 and then go to LC3 from there.

The above scenario can result in packet mis-forwarding on the data plane.

How I fixed it:

To make sure BGP packets between i-BGP peers do not take an extra routing hop, enable the GTSM feature:

neighbor PEER ttl-security hops NUMBER

This command enforces Generalized TTL Security Mechanism (GTSM), as specified in RFC 5082. With this command, only neighbors that are the specified number of hops away will be allowed to become neighbors. This command is mutually exclusive with ebgp-multihop.

We set the hop count to 1, which makes FRR reject the BGP connection if it receives BGP packets with TTL < 255. Setting this attribute also makes sure i-BGP frames are originated with an IP TTL of 255.
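
The TTL check implied by this setting can be written out as a short Python sketch. It only models the arithmetic of GTSM (packets are sent with TTL 255 and each routing hop decrements it by one); the function names are illustrative and are not FRR APIs.

```python
MAX_TTL = 255  # GTSM senders originate packets with the maximum TTL


def gtsm_min_ttl(hops: int) -> int:
    """Lowest received TTL that is still accepted for a peer `hops` away."""
    if not 1 <= hops <= 254:
        raise ValueError("hops must be in the range 1..254")
    return MAX_TTL - hops + 1


def gtsm_accepts(received_ttl: int, hops: int = 1) -> bool:
    """True if a packet with this TTL passes the GTSM check."""
    return received_ttl >= gtsm_min_ttl(hops)


print(gtsm_accepts(255))  # directly connected peer
print(gtsm_accepts(254))  # one extra routing hop, e.g. LC2 -> LC1 -> LC3
```

With a hop count of 1 the minimum accepted TTL is 255, so a packet that was routed through LC1 on its way to LC3 arrives with TTL 254 and is rejected.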

How I verify:

Manual verification of the above scenario: when BGP packets are received with IP TTL 254 (an additional routing hop), we see TCP FIN flags because BGP is rejecting the connection.

Signed-off-by: Abhishek Dosi <abdosi@microsoft.com>
2023-10-10 11:51:40 -07:00
Hua Liu
6e3260098f
Enable ZMQ between GNMI and Orchagent (#16661)
Enable ZMQ on gnmi and orchagent

#### Why I did it
Improve GNMI API performance for Dash resources

#### How I did it
Modify gnmi and orchagent service start script, add ZMQ parameter.

#### How to verify it
Pass all UT & E2E test
Manually verify by creating Dash resources via the gnmi API.
2023-10-09 14:22:50 -07:00
vikumarks
b45ee0980b
[sonic-mgmt]: Added required python packages to run MSFT hero Test cases (#15883)
Added required python packages to run MSFT hero Test cases

dpugen==0.1.1
ctypesgen
pandas
PyYAML
ixload

Co-authored-by: Guohan Lu <lguohan@gmail.com>
2023-09-27 18:13:44 -07:00
Bob Chu
7e6790ab6b
[Telemetry] enable default service config if no config from DB (#16683)
#### Why I did it
Fix issue #16533: the telemetry service exits in the master and 202305 branches because there are no telemetry configs in redis DB.

#### How I did it
Enable default config if no TELEMETRY configs from redis DB.

#### How to verify it
After the fix, the telemetry service works in the following two scenarios:
1. With TELEMETRY config in redis DB, load service configs from DB.
2. No TELEMETRY config in redis DB, use default service configs.
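
The fallback behaviour in these two scenarios can be sketched as below. This is a minimal illustration, not the telemetry service code: the function name, the flat table layout, and the default values are placeholders and should not be read as the real SONiC defaults.

```python
# Placeholder defaults for illustration only; not the values shipped in SONiC.
DEFAULT_SERVICE_CONFIG = {"port": "8080", "client_auth": "true"}


def load_service_config(config_db: dict) -> dict:
    """Use TELEMETRY config from the DB when present, else fall back to defaults."""
    telemetry = config_db.get("TELEMETRY")
    if telemetry:
        return dict(telemetry)
    return dict(DEFAULT_SERVICE_CONFIG)


print(load_service_config({"TELEMETRY": {"port": "50051"}}))
print(load_service_config({}))
```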
2023-09-27 17:20:18 -07:00
Zain Budhwani
d89dde3b6d
Fix regex and process name (#16647)
### Why I did it

### How I did it

Fix the regex so that the dhcp bind failure event is detected, and fix the process name, since the dhcp relay processes that need to be detected are dhcprelay6 and dhcrelay.

#### How to verify it

Manual testing and nightly test event
2023-09-26 16:15:27 -07:00
abdosi
8b7b2a7f7c
[chassis/multi-asic] Enable Sending BGP Community over internal neighbors over iBGP Session (#16705)
What I did:
Enable Sending BGP Community over internal neighbors over iBGP Session

Microsoft ADO: 25268695

Why I did:
Without this change, BGP communities sent by e-BGP peers are not carried forward to other e-BGP peers.


str2-xxxx-lc1-2# show bgp ipv6  20c0:a801::/64
BGP routing table entry for 20c0:a801::/64, version 52141
Paths: (1 available, best #1, table default)
  Not advertised to any peer
  65000 65500
    2603:10e2:400::6 from 2603:10e2:400::6 (3.3.3.6)
      Origin IGP, localpref 100, valid, internal, best (First path received)
      Last update: Tue Sep 26 16:08:26 2023
str2-xxxx-lc1-2# show ip bgp 192.168.35.128/25
BGP routing table entry for 192.168.35.128/25, version 52688
Paths: (1 available, best #1, table default)
  Not advertised to any peer
  65000 65502
    3.3.3.6 from 3.3.3.6 (3.3.3.6)
      Origin IGP, localpref 100, valid, internal, best (First path received)
      Last update: Tue Sep 26 15:45:51 2023

After the change

str2-xxxx-lc2-2(config)# router bgp 65100
str2-xxxx-lc2-2(config-router)# address-family ipv4
str2-xxxx-lc2-2(config-router-af)# neighbor INTERNAL_PEER_V4 send-community
str2-xxxx-lc2-2(config-router-af)# exit
str2-xxxx-lc2-2(config-router)# address-family ipv6
str2-xxxx-lc2-2(config-router-af)# neighbor INTERNAL_PEER_V6 send-community
str2-xxxx-lc1-2# show bgp ipv6  20c0:a801::/64
BGP routing table entry for 20c0:a801::/64, version 52400
Paths: (1 available, best #1, table default)
  Not advertised to any peer
  65000 65500
    2603:10e2:400::6 from 2603:10e2:400::6 (3.3.3.6)
      Origin IGP, localpref 100, valid, internal, best (First path received)
      **Community: 1111:1111**
      Last update: Tue Sep 26 16:10:19 2023
str2-xxxx-lc1-2# show ip bgp 192.168.35.128/25
BGP routing table entry for 192.168.35.128/25, version 52947
Paths: (1 available, best #1, table default)
  Not advertised to any peer
  65000 65502
    3.3.3.6 from 3.3.3.6 (3.3.3.6)
      Origin IGP, localpref 100, valid, internal, best (First path received)
      **Community: 1111:1111**
      Last update: Tue Sep 26 16:10:09 2023

Signed-off-by: Abhishek Dosi <abdosi@microsoft.com>
2023-09-26 13:34:38 -07:00
Yevhen Fastiuk
52f6dd65a3
Improve remote fetch (#12795)
### Why I did it
To fix the following errors:
First:
```
Connecting to urm.nvidia.com (urm.nvidia.com)|*.*.*.*|:443... connected.
GnuTLS: Error in the pull function.
Unable to establish SSL connection.
Error 4
make[1]: Leaving directory '/sonic/src/smartmontools'
[ target/debs/bullseye/smartmontools_6.6-1_amd64.deb ]
```
Second:
```
Get:90 https://debian-mirror-url buster/main amd64 librrd-dev amd64 1.7.1-2 [284 kB]
Get:91 https://debian-mirror-url buster/main amd64 psmisc amd64 23.2-1+deb10u1 [126 kB]
Get:92 https://debian-mirror-url buster/main amd64 python-smbus amd64 4.1-1 [12.2 kB]
Get:93 https://debian-mirror-url buster/main amd64 python3.7-dev amd64 3.7.3-2+deb10u3 [510 kB]
Get:94 https://debian-mirror-url buster/main amd64 python3-dev amd64 3.7.3-1 [1264 B]
Get:95 https://debian-mirror-url buster/main amd64 python3-smbus amd64 4.1-1 [12.5 kB]
Get:96 https://debian-mirror-url buster/main amd64 rrdtool amd64 1.7.1-2 [485 kB]
Fetched 122 MB in 12s (9976 kB/s)
E: Failed to fetch https://debian-mirror-url/pool/main/p/python-defaults/python2-minimal_2.7.16-1_amd64.deb  500  Internal Server Error [IP: *.*.*.* 443]
E: Failed to fetch https://debian-mirror-url/pool/main/f/fontconfig/fontconfig-config_2.13.1-2_all.deb  500  Internal Server Error [IP: *.*.*.* 443]
E: Unable to fetch some archives, maybe run apt-get update or try with --fix-missing?
The command '/bin/sh -c apt-get update &&       apt-get install -y          build-essential         python3-dev             ipmitool                librrd8                 librrd-dev              rrdtool                 python-smbus            python3-smbus           dmidecode               i2c-tools               psmisc                  libpci3' returned a non-zero code: 100
[ target/docker-platform-monitor.gz ]
Error 1
```

#### How I did it
Add retry mechanism to apt, wget, and curl hooks
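
The general idea of such a retry wrapper is sketched below in Python. The actual change is made in the build's apt, wget, and curl hooks; the function name, parameters, and backoff policy here are assumptions chosen for illustration.

```python
import time


def retry(func, attempts=3, delay=1.0, backoff=2.0, sleep=time.sleep):
    """Call func until it succeeds or the attempts are exhausted."""
    last_error = None
    for attempt in range(1, attempts + 1):
        try:
            return func()
        except Exception as exc:  # a real hook would check the exit code instead
            last_error = exc
            if attempt < attempts:
                sleep(delay)       # wait before the next try
                delay *= backoff   # grow the wait after each failure
    raise last_error


# Example: a fetch that fails twice with a transient error, then succeeds.
calls = {"count": 0}


def flaky_fetch():
    calls["count"] += 1
    if calls["count"] < 3:
        raise ConnectionError("500 Internal Server Error")
    return "fetched"


print(retry(flaky_fetch, delay=0.0))
```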
2023-09-23 18:07:04 -07:00
Ze Gan
83d67d4c8a
[build]: Polish protobuf build (#16119)
- Use dget to download the protobuf source code
- Add official link in sonic-mgmt Dockerfile for protobuf

Signed-off-by: Ze Gan <ganze718@gmail.com>
2023-09-23 00:25:43 -07:00
prasannam2302
c3cf42124b
[bgp]: exclude bgpmon for frrcfgd. (#16582)
Boot up a switch. If frrcfgd is enabled with frr_mgmt_framework_config set to "true",
the "bgpmon" process should not be running after this change. bgpmon should be
running when bgpcfgd is enabled with frr_mgmt_framework_config set to "false".
2023-09-23 00:17:36 -07:00
Liu Shilong
bfa05c8349
[build] Fix build issue in docker-ptf-sai caused by setuptools_scm new release (#16636)
docker-ptf-sai build fails on setuptools_scm's new release on 09/20/2023.
Use old version instead.
2023-09-21 10:38:08 -07:00
Zain Budhwani
2dfdeb94d6
Load generic omprog in all dockers for rsyslog plugin support (#16601)
### Why I did it

##### Work item tracking
- Microsoft ADO **(number only)**:13366345

#### How I did it

Add a generic omprog file in all dockers for rsyslog plugin support. The file is added to docker-config-engine-bullseye, so there is no need to add it to each docker individually.

#### How to verify it

UT/Pipeline
2023-09-20 16:27:42 -07:00
vdahiya12
45a852233b
[pmon] update gRPC version to 1.57.0 (#16257)
Signed-off-by: vaibhav-dahiya <vdahiya@microsoft.com>
2023-09-15 16:41:51 -07:00
anamehra
78981d93b8
Chassis: fix pmon docker failure when DEVICE_METADATA is not available (#16527)
Signed-off-by: anamehra <anamehra@cisco.com>

Added a check for DEVICE_METADATA before accessing the data. This prevents the j2 failure when var is not available.
2023-09-13 14:10:56 -07:00
ShiyanWangMS
42126ccf7d
Revert "Upgrade Ansible to 6.7.0 and make Python3 as the default interpreter in sonic-mgmt-docker (#15836)" (#16537)
This reverts commit 51fb6d7d9f.

The new sonic-mgmt docker image has ansible upgraded. We encountered some issues that are too hard to debug for a quick fix, so let's revert the change for now. The new sonic-mgmt docker image was kept for further debugging and fixing. After all the issues are fixed, we'll need to apply this change again.
2023-09-13 16:20:17 +08:00