If the parser encounters a line without RequiredBy or WantedBy, the code passes an uninitialized pointer to get_install_targets_from_line(), where it can either fail with a segfault or silently pass, depending on what the pointer happens to contain.
- Why I did it
An uninitialized target_suffix is passed to get_install_targets_from_line() when other fields are present in the [Install] section, like this:
root@sonic:/home/admin# systemctl cat ntpsec
...
[Install]
Alias=ntp.service
Alias=ntpd.service
WantedBy=multi-user.target
- How I did it
Initialized target_suffix to NULL and added an assert in get_install_targets_from_line(). Extended the test to cover this scenario.
- How to verify it
Unit tests and manual verification on the switch.
Signed-off-by: Stepan Blyschak <stepanb@nvidia.com>
#### Why I did it
src/sonic-utilities
```
* 9d5dacab - (HEAD -> 202311, origin/202311) CLI to skip polling for periodic information for a port in DomInfoUpdateTask thread (#3187) (4 hours ago) [mihirpat1]
```
#### How I did it
#### How to verify it
#### Description for the changelog
#### Why I did it
src/sonic-swss
```
* dd1432a2 - (HEAD -> 202311, origin/202311) [ci] Allow partially success build artifact in PR checker pipeline. #2986 (10 hours ago) [Liu Shilong]
```
#### How I did it
#### How to verify it
#### Description for the changelog
#### Why I did it
src/linkmgrd
```
* 1f5fcfd - (HEAD -> 202311, origin/202311) Exclude DbInterface in PR coverage check (#224) (21 hours ago) [Jing Zhang]
```
#### How I did it
#### How to verify it
#### Description for the changelog
#### Why I did it
src/sonic-platform-daemons
```
* 83e5106 - (HEAD -> 202311, origin/202311) Updated supported CMIS module types in xcvrd to include new module for SPC4 (#440) (4 hours ago) [Tomer Shalvi]
* f390d8d - Mark sub-port interfaces as invalid ports in xcvrd (#412) (21 hours ago) [mihirpat1]
```
#### How I did it
#### How to verify it
#### Description for the changelog
#### Why I did it
src/sonic-swss
```
* c4fd095e - (HEAD -> 202311, origin/202311) Fix multi VLAN neighbor learning (#3049) (#3064) (65 minutes ago) [Lawrence Lee]
```
#### How I did it
#### How to verify it
#### Description for the changelog
#### Why I did it
src/sonic-platform-common
```
* 4dfc01f - (HEAD -> 202311, origin/202311) Certain VDM fields not populating after encountering KeyError on 400ZR optics (#442) (28 hours ago) [mihirpat1]
```
#### How I did it
#### How to verify it
#### Description for the changelog
#### Why I did it
src/sonic-swss
```
* 64d5fdd9 - (HEAD -> 202311, origin/202311) [intfsorch] Enable ipv6 proxy ndp along with proxy arp (#3045) (2 days ago) [Nikola Dancejic]
```
#### How I did it
#### How to verify it
#### Description for the changelog
#### Why I did it
src/sonic-utilities
```
* b2bea12c - (HEAD -> 202311, origin/202311) CLI enhancements to revtrieve data from TRANSCEIVER_FIRMWARE_INFO table (#3177) (4 hours ago) [mihirpat1]
* 02ae33f3 - Modify transceiver PM CLI to handle N/A value for DOM threshold (#3174) (28 hours ago) [mihirpat1]
```
#### How I did it
#### How to verify it
#### Description for the changelog
#### Why I did it
src/sonic-sairedis
```
* edb2b17 - (HEAD -> 202311, origin/202311) Add new functionality to syncd_init_common.sh, to use common sai.profile (#1352) (22 hours ago) [noaOrMlnx]
```
#### How I did it
#### How to verify it
#### Description for the changelog
#### Why I did it
src/sonic-platform-common
```
* 5430f6f - (HEAD -> 202311, origin/202311) Change get_transceiver_info_firmware_versions return type to dict (#440) (2 days ago) [mihirpat1]
```
#### How I did it
#### How to verify it
#### Description for the changelog
#### Why I did it
src/sonic-utilities
```
* c711b061 - (HEAD -> 202311, origin/202311) [Mellanox buffer migrator] Do not touch the buffer model on generic SKUs if the buffer configuration is empty (#3114) (2 days ago) [Stephen Sun]
```
#### How I did it
#### How to verify it
#### Description for the changelog
### Why I did it
Fix flakiness of eventd UT - run sub after capture service starts
##### Work item tracking
- Microsoft ADO **(number only)**:25650744
#### How I did it
Run sub socket after capture socket is initialized
#### How to verify it
Pipeline
#### Why I did it
src/sonic-platform-daemons
```
* 7792838 - (HEAD -> 202311, origin/202311) Move firmware version fields to TRANSCEIVER_FIRMWARE_INFO table (#435) (22 hours ago) [mihirpat1]
```
#### How I did it
#### How to verify it
#### Description for the changelog
#### Why I did it
src/sonic-platform-daemons
```
* 121b338 - (HEAD -> 202311, origin/202311) Unable to retrieve media settings with just Vendor name (#419) (10 hours ago) [mihirpat1]
```
#### How I did it
#### How to verify it
#### Description for the changelog
#### Why I did it
src/sonic-swss
```
* 2910b0e3 - (HEAD -> 202311, origin/202311) Fix the Orchagent crash seen during Port channel OC test cases. (#3042) (7 days ago) [saksarav-nokia]
```
#### How I did it
#### How to verify it
#### Description for the changelog
#### Why I did it
src/sonic-utilities
```
* b2125761 - (HEAD -> 202311, origin/202311) [chassis] fix show bgp summary when no neighbors are present on one ASIC (#3158) (2 days ago) [Arvindsrinivasan Lakshmi Narasimhan]
* 54595c1e - [202311]Fix the sfputil treats page number as decimal instead of hexadecimal (#3153) (#3160) (5 days ago) [Sudharsan Dhamal Gopalarathnam]
```
#### How I did it
#### How to verify it
#### Description for the changelog
#### Why I did it
src/sonic-sairedis
```
* 23481f0 - (HEAD -> 202311, origin/202311) Skip FABRIC PORT Attributes from sairedis logging (#1339) (2 days ago) [saksarav-nokia]
* 682e860 - Revert "add if statement for module control mode support" (#1341) (4 days ago) [dbarashinvd]
* 3621a18 - SAI submodule update to pick the sai-thrift support added to read VOQ counters (#1332) (4 days ago) [saksarav-nokia]
* 52cd15b - Fix code coverage and ASAN not being enabled (#1338) (5 days ago) [Saikrishna Arcot]
```
#### How I did it
#### How to verify it
#### Description for the changelog
Fix IPv6 forced-mgmt-route not working.
Why I did it
IPv6 forced-mgmt-route does not work.
An IPv6 rule must be added with the 'ip -6 rule add pref 32764 address' command, but the template currently omits the '-6' parameter, so the IPv6 route is added to the IPv4 route table.
This PR also depends on #17281, which fixes the IPv6 'default' route table missing from IPv6 route lookup.
Microsoft ADO (number only):24719238
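The address-family selection described above can be sketched in Python as follows. This is a minimal illustration, not the actual template: the function name `build_forced_mgmt_rule` is hypothetical, and the `to <prefix> lookup <table>` rule form is an assumption modelled on typical `ip rule list` output.

```python
import ipaddress


def build_forced_mgmt_rule(prefix, pref=32764, table="default"):
    """Return the `ip rule add` argv for one forced management route.

    Hypothetical helper: the real change lives in a template. The point
    shown here is that the address family flag must follow the prefix.
    """
    network = ipaddress.ip_network(prefix, strict=False)
    # Without "-6", iproute2 would treat the rule as an IPv4 rule.
    family_flag = "-6" if network.version == 6 else "-4"
    return ["ip", family_flag, "rule", "add", "pref", str(pref),
            "to", str(network), "lookup", table]


if __name__ == "__main__":
    for example in ("fc00:2::/64", "172.17.0.1/24"):
        print(" ".join(build_forced_mgmt_rule(example)))
```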
#### Why I did it
src/sonic-utilities
```
* 31a6584c - (HEAD -> 202311, origin/202311) Fix `sudo config load_mgmt_config` fails with error "File /var/run/dhclient.eth0.pid does not exist" (#3149) (16 hours ago) [Mai Bui]
```
#### How I did it
#### How to verify it
#### Description for the changelog
#### Why I did it
src/linkmgrd
```
* 70b6d15 - (HEAD -> 202311, origin/202311) [active-standby] Fix `show mux status` inconsistency introduced by orchagent rollback (#225) (3 days ago) [Jing Zhang]
```
#### How I did it
#### How to verify it
#### Description for the changelog
#### Why I did it
src/sonic-host-services
```
* 054aa7a - (HEAD -> 202311, origin/202311) Fixed ip6table internal_docker_ip_traffic rule command for multi-asic (#94) (3 days ago) [anamehra]
```
#### How I did it
#### How to verify it
#### Description for the changelog
#### Why I did it
src/sonic-platform-common
```
* 9bf5a17 - (HEAD -> 202311, origin/202311) Implementing set_optoe_write_timeout API (#422) (3 days ago) [mihirpat1]
* c8617b8 - APIs to help in finding NPU SI settings (#410) (3 days ago) [mihirpat1]
```
#### How I did it
#### How to verify it
#### Description for the changelog
#### Why I did it
src/sonic-platform-daemons
```
* dbaa079 - (HEAD -> 202311, origin/202311) Support 800G ifname in xcvrd (#416) (2 days ago) [Anoop Kamath]
* e4272c1 - 400ZR not linking up with latest SONiC master image (#410) (3 days ago) [mihirpat1]
```
#### How I did it
#### How to verify it
#### Description for the changelog
#### Why I did it
src/sonic-snmpagent
```
* 5d5cfe5 - (HEAD -> 202311, origin/202311) Set the execute bit on sysDescr_pass.py (#306) (3 days ago) [Andre Kostur]
```
#### How I did it
#### How to verify it
#### Description for the changelog
#### Why I did it
src/sonic-swss
```
* 55d53b79 - (HEAD -> 202311, origin/202311) [copporch] Add safeguard during policer attribute update (#2977) (3 days ago) [Vivek]
```
#### How I did it
#### How to verify it
#### Description for the changelog
#### Why I did it
src/sonic-utilities
```
* 2046e66c - (HEAD -> 202311, origin/202311) Reduce generate_dump mem usage for cores (#3052) (3 days ago) [davidm-arista]
* fbd6c916 - Disable Key Validation feature during sonic-installation for Cisco Platforms (#3115) (3 days ago) [selvipal]
* 88c027f0 - [Techsupport]Adding more FRR and BGP dumps (#3118) (3 days ago) [Sudharsan Dhamal Gopalarathnam]
* 555ecf64 - [chassis]: Support show ip bgp summary to display without error when no external neighbors are configured on chassis LC (#3099) (3 days ago) [Arvindsrinivasan Lakshmi Narasimhan]
* 1515edcb - [db_migrator]Remove route migration (#3068) (3 days ago) [Sudharsan Dhamal Gopalarathnam]
* 8862c114 - Modify teamd retry count script to base BGP status on default BGP status (#3069) (3 days ago) [Saikrishna Arcot]
* f4b5ef21 - Add all SKUs to the generic config update list (#3131) (3 days ago) [Stephen Sun]
```
#### How I did it
#### How to verify it
#### Description for the changelog
#### Why I did it
src/sonic-utilities
```
* be6224a3 - (HEAD -> 202311, origin/202311) [202311] Migrate GNMI table (#3138) (10 hours ago) [ganglv]
```
#### How I did it
#### How to verify it
#### Description for the changelog
Why I did it
An ICM was reported for "BGPMon Process exited", which was caused by an exception during JSON load.
Work item tracking
Microsoft ADO (number only):
25916773
How I did it
Add exception handling around the JSON load.
How to verify it
Verified locally: added a debug change that modifies the command output string so that it is no longer valid JSON, then checked the syslog.
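A guarded JSON load of this kind might look like the sketch below. The function name `load_bgp_summary` and the logger name are hypothetical; the description does not show the actual code.

```python
import json
import logging

logger = logging.getLogger("bgpmon_sketch")  # hypothetical logger name


def load_bgp_summary(raw_output):
    """Parse command output as JSON; return None instead of raising.

    raw_output is assumed to be a string. Returning None lets the caller
    skip this polling cycle rather than letting the process exit.
    """
    try:
        return json.loads(raw_output)
    except ValueError as err:  # json.JSONDecodeError is a ValueError subclass
        logger.warning("Output is not valid JSON, skipping: %s", err)
        return None
```

With this shape, malformed output produces a log entry and the monitoring loop continues.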
#### Why I did it
src/sonic-platform-common
```
* 7c2ad66 - (HEAD -> 202311, origin/202311) Tx/Rx power values should be rounded up to 3 decimal places (#432) (4 hours ago) [mihirpat1]
```
#### How I did it
#### How to verify it
#### Description for the changelog
What I did:
Added support so that when TSA is done on a Line Card (LC), it is completely
isolated from all e-BGP peer devices, whether connected to this LC or to a remote LC
Why I did:
Currently, when TSA is executed on an LC, routes are withdrawn from its connected e-BGP peers only. e-BGP peers on a remote LC can/will (via i-BGP) still have routes pointing/attracting traffic towards this isolated LC.
How I did:
When TSA is applied on an LC, all routes advertised via i-BGP are tagged with the no-export community, so that when a remote LC receives these routes it does not send them on to its connected e-BGP peers.
Also, once a route with no-export is received over i-BGP, we match on it and set the local preference of that route to a lower value (80) so that the route is removed from the forwarding database. The scenario below explains why:
- LC1 advertises R1 to LC3
- LC2 advertises R1 to LC3
- On LC3 we have multi-path/ECMP over both LC1 and LC2
- On LC3, R1 received from LC1 is considered the best route over R1 received from LC2 and is sent to LC3's e-BGP peers
- Now we do TSA on LC2
- LC3 will receive R1 from LC2 with community no-export, and from LC1 the same as earlier (no change)
- LC3 will still get traffic for R1 since it is still advertised to e-BGP peers (since R1 from LC1 is the best route)
- LC3 will forward to both LC1 and LC2 (ECMP), which causes an issue because LC2 is in TSA mode and should not receive traffic
To fix the above scenario, we lower the preference of R1 received from LC2 so that it is removed from the multi-path/ECMP group.
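The effect of lowering the local preference can be illustrated with a toy model. This sketch is a deliberate simplification of BGP path selection (it compares local preference only), the default local preference of 100 is an assumption, and all names are hypothetical.

```python
DEFAULT_LOCAL_PREF = 100  # assumed default for paths not under TSA
TSA_LOCAL_PREF = 80       # lowered value mentioned in the description


def apply_tsa_policy(path):
    """Lower local preference for i-BGP paths tagged with no-export."""
    if "no-export" in path["communities"]:
        return dict(path, local_pref=TSA_LOCAL_PREF)
    return path


def multipath_group(paths):
    """Return next hops that remain in the ECMP group after the policy.

    Toy model: only paths sharing the highest local preference are kept.
    """
    processed = [apply_tsa_policy(p) for p in paths]
    best = max(p["local_pref"] for p in processed)
    return sorted(p["next_hop"] for p in processed if p["local_pref"] == best)


# R1 as seen on LC3 after TSA is applied on LC2.
r1_paths = [
    {"next_hop": "LC1", "local_pref": DEFAULT_LOCAL_PREF, "communities": set()},
    {"next_hop": "LC2", "local_pref": DEFAULT_LOCAL_PREF, "communities": {"no-export"}},
]

if __name__ == "__main__":
    print(multipath_group(r1_paths))
```

In this model, the path via LC2 drops out of the group because its local preference no longer matches the best path's.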
How I verify:
UT has been added to make sure Template generation is correct
Manual Verification of the functionality
sonic-mgmt test case will be updated accordingly.
Please note this PR is built on top of #16714, which needs to be merged first.
Signed-off-by: Abhishek Dosi <abdosi@microsoft.com>
Why I did it
Fan tolerance checking is done through new APIs, is_under_speed and is_over_speed, which populate corresponding fields into the database. speed_tolerance is no longer used and was removed, but system-health was not updated and indicates failures:
ADO: 25279165
root@sonic/# show system-health summary
System status summary
System status LED red_blink
Services:
Status: OK
Hardware:
Status: Not OK
Reasons: Failed to get speed tolerance for fantray5.fan1
Failed to get speed tolerance for fantray5.fan0
Failed to get speed tolerance for fantray4.fan1
Failed to get speed tolerance for fantray4.fan0
Failed to get speed tolerance for fantray3.fan1
Failed to get speed tolerance for fantray3.fan0
Failed to get speed tolerance for fantray2.fan1
Failed to get speed tolerance for fantray2.fan0
Failed to get speed tolerance for fantray1.fan1
Failed to get speed tolerance for fantray1.fan0
Failed to get speed tolerance for fantray0.fan1
Failed to get speed tolerance for fantray0.fan0
Failed to get speed tolerance for PSU1.fan0
Failed to get speed tolerance for PSU0.fan0
How I did it
Updated hardware_checker.py in system-health to consume new is_under_speed and is_over_speed database entries instead of speed_tolerance and hard-coded calculations.
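A check based on the new fields could be sketched as below. This is not the actual hardware_checker.py code: the function name, the message wording, and the assumption that the database stores the flags as the strings "True"/"False" are all hypothetical.

```python
def check_fan_speed(fan_name, fan_entry):
    """Return a list of problems for one fan's database entry.

    fan_entry is assumed to be a dict of string fields read from the
    database; is_under_speed / is_over_speed are the fields named in the
    description. A missing field is reported as a failure here.
    """
    problems = []
    for field, label in (("is_under_speed", "under"), ("is_over_speed", "over")):
        value = fan_entry.get(field)
        if value is None:
            problems.append(f"Failed to get {field} for {fan_name}")
        elif value.strip().lower() == "true":
            problems.append(f"{fan_name} is {label} speed")
    return problems
```

The checker no longer computes a tolerance window itself; it only reads the two flags that the platform side has already populated.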
How to verify it
root@sonic:/# show system-health summary
System status summary
System status LED green
Services:
Status: OK
Hardware:
Status: OK
### Why I did it
GitHub issue: https://github.com/sonic-net/sonic-buildimage/issues/16356. The YANG definition breaks the GCU feature.
We can either update sonic_yang and GCU's search algorithm to handle the same-key-count case, or simply update the YANG model to solve the issue.
The advantage of updating the YANG model is that it solves the issue directly, and we don't need to handle a complicated search algorithm in sonic_yang and GCU. This is the only YANG model that has this issue.
### How I did it
Combine the two lists into one. The previous YANG validation unit tests are still applicable.
#### How to verify it
Unit test and E2E test
Why I did it
Fix an error in the log_err call.
This error can be triggered by an invalid static route key. Normally the code cannot reach this path with a normal config file, but the issue was hit with an invalid key during manual testing with redis-cli directly. The file is scanned with a Python linter to prevent such errors.
Work item tracking
Microsoft ADO ():26250268
How I did it
Fix the format error.
How to verify it
1. Ran pylint on the design file to make sure it contains no such error.
2. Wrote a separate Python program to verify the log call.
The current logging-related tests usually patch/mock the logging functions. This specific error cannot be triggered if the mock function is called instead of the real function, so lint checking is needed for code changes.
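Why a mocked logger can hide this class of error is easy to demonstrate. The sketch below is hypothetical: the description does not show the faulty call, so the `log_err` stand-in (one message argument) and the buggy call shape are assumptions chosen only to illustrate the point.

```python
from unittest import mock


def log_err(msg):
    """Stand-in for a logging helper that takes exactly one message string."""
    return "ERR: " + msg


def report_bad_key_buggy(key, log=log_err):
    # Hypothetical bug: the argument is passed separately, but the helper
    # accepts a single, already formatted string.
    log("Invalid static route key '%s'", key)


def report_bad_key_fixed(key, log=log_err):
    # Format first, then pass one string.
    log("Invalid static route key '%s'" % key)


if __name__ == "__main__":
    # A Mock accepts any call signature, so the bug goes unnoticed.
    report_bad_key_buggy("bad|key", log=mock.Mock())
    try:
        report_bad_key_buggy("bad|key")  # the real helper raises
    except TypeError as err:
        print("real function raised:", err)
```

A linter that checks call signatures against the real definition catches what the mocked unit test cannot.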
#### Why I did it
SNMP queries over IPv6 do not work because of an issue in net-snmp where IPv6 queries fail in a multi-NIC environment.
As a workaround, the issue is not seen if snmpd listens on a specific IPv4 or IPv6 address.
We plan to configure the management IP and Loopback IP from minigraph.xml as SNMP_AGENT_ADDRESS_CONFIG in config_db, based on the changes discussed in https://github.com/sonic-net/SONiC/pull/1457.
##### Work item tracking
- Microsoft ADO **(number only)**:26091228
#### How I did it
Modify the minigraph parser to update SNMP_AGENT_ADDRESS_CONFIG with the management and Loopback0 IP addresses.
Modify snmpd.conf.j2 to use the SNMP_AGENT_ADDRESS_CONFIG table if it is present in config_db; if it is not, listen on any IP.
Main change:
1. If minigraph.xml is used to configure the device, snmpd will listen on the mgmt and loopback IP addresses.
2. If config_db is used to configure the device, snmpd will listen on the IPs present in SNMP_AGENT_ADDRESS_CONFIG if that table is present; if the table is not present, snmpd will listen on any IP.
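The selection logic above can be sketched in Python as follows. This is not the snmpd.conf.j2 template itself: the function name is hypothetical, the `ip|port|vrf` key layout is inferred from the example keys in this description, the VRF part is ignored for brevity, and the generated lines use net-snmp's `agentAddress` directive.

```python
def render_agent_addresses(config_db):
    """Return snmpd.conf agentAddress lines for a config_db-like dict.

    Falls back to listening on any address when the
    SNMP_AGENT_ADDRESS_CONFIG table is absent or empty.
    """
    table = config_db.get("SNMP_AGENT_ADDRESS_CONFIG")
    if not table:
        return ["agentAddress udp:161", "agentAddress udp6:161"]

    lines = []
    for key in table:
        ip, port, _vrf = key.split("|")  # assumed key layout: ip|port|vrf
        if ":" in ip:
            # IPv6 literals need brackets so the port can be told apart.
            lines.append(f"agentAddress udp6:[{ip}]:{port}")
        else:
            lines.append(f"agentAddress udp:{ip}:{port}")
    return lines


if __name__ == "__main__":
    sample = {"SNMP_AGENT_ADDRESS_CONFIG": {"10.1.0.32|161|": {}, "FC00:1::32|161|": {}}}
    print("\n".join(render_agent_addresses(sample)))
```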
#### How to verify it
config_db.json created from minigraph.xml for single asic VS image with mgmt and Loopback IP addresses.
```
"SNMP_AGENT_ADDRESS_CONFIG": {
"10.1.0.32|161|": {},
"10.250.0.101|161|": {},
"FC00:1::32|161|": {},
"fec0::ffff:afa:1|161|": {}
},
.....
```
snmpd listening on the above IP addresses:
```
admin@vlab-01:~$ sudo netstat -tulnp | grep 161
tcp 0 0 127.0.0.1:3161 0.0.0.0:* LISTEN 71522/snmpd
udp 0 0 10.250.0.101:161 0.0.0.0:* 71522/snmpd
udp 0 0 10.1.0.32:161 0.0.0.0:* 71522/snmpd
udp6 0 0 fec0::ffff:afa:1:161 :::* 71522/snmpd
udp6 0 0 fc00:1::32:161 :::* 71522/snmpd
```
Fix being unable to access an IPv6 address via the management interface because the 'default' route table is not added to route lookup.
#### Why I did it
When the device is configured with an IPv6 TACACS server address and all BGP sessions are shut down, the device can't connect to the TACACS server via the management interface.
After investigation, I found that the IPv6 'default' route table is not added to route lookup:
admin@vlab-01:~$ ip -6 rule list
1001: from all lookup local
32765: from fec0::ffff:afa:1 lookup default
32766: from all lookup main
admin@vlab-01:~$
For comparison:
admin@vlab-01:~$ ip -4 rule list
1001: from all lookup local
32764: from all to 172.17.0.1/24 lookup default
32765: from 10.250.0.101 lookup default
32766: from all lookup main
32767: from all lookup default <== 'default' route table exist in IPV4 route lookup
The issue is fixed by adding the 'default' route table to route lookup with the following command:
admin@vlab-01:~$ sudo ip -6 rule add pref 32767 lookup default
admin@vlab-01:~$ ip -6 rule list
1001: from all lookup local
32765: from fec0::ffff:afa:1 lookup default
32766: from all lookup main
32767: from all lookup default <== 'default' route table been added to IPV6 route lookup
admin@vlab-01:~$
##### Work item tracking
- Microsoft ADO: 25798732
#### How I did it
When the management interface uses the 'default' route table, add the 'default' route table to IPv6 route lookup.
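A check for whether the rule is already present could be sketched as below. The helper name is hypothetical and the parsing assumes the `ip -6 rule list` output format shown in this description; the command itself is the one quoted above.

```python
# Command quoted in the description for adding the missing rule.
ADD_DEFAULT_RULE_CMD = ["ip", "-6", "rule", "add", "pref", "32767", "lookup", "default"]


def needs_default_table_rule(rule_list_output, pref=32767):
    """Return True if no rule at `pref` looks up the 'default' table.

    rule_list_output is the text printed by `ip -6 rule list`.
    """
    for line in rule_list_output.splitlines():
        priority, _, rule = line.partition(":")
        if priority.strip() == str(pref) and rule.split()[-2:] == ["lookup", "default"]:
            return False
    return True


BEFORE = """1001: from all lookup local
32765: from fec0::ffff:afa:1 lookup default
32766: from all lookup main"""

AFTER = BEFORE + "\n32767: from all lookup default"

if __name__ == "__main__":
    print(needs_default_table_rule(BEFORE), needs_default_table_rule(AFTER))
```

Checking first keeps the operation idempotent, since adding the same rule twice would create a duplicate entry.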
#### How to verify it
Pass all UT.
Add new UT to cover this change.
Manually verify issue fixed:
### Tested branch (Please provide the tested image version)
- [x] master-17281.417570-2133d58fa
#### Description for the changelog
Fix being unable to access an IPv6 address via the management interface because the 'default' route table is not added to route lookup.
For 40G optics, SAI has handling that sets T0-facing ports to the SR4 type and enables unreliable LOS on a fixed set of ports. To invoke this handling, phy_unlos_msft=1 must be set in config.bcm.
This change meets that requirement; once the property is set, SAI applies the LOS/interface type settings on the required ports.
Why I did it
For Arista-7060CX-32S-Q32 T1, RX_ERR on 40G ports during a connected device reboot
can be minimized by turning on Unreliable LOS and the SR4 media_type for all ports that are connected to T0.
The property phy_unlos_msft=1 exclusively enables this behavior.
Microsoft ADO: 25941176
How I did it
Made the changes in SAI and turned on the property.
How to verify it
Ran the changes on a testbed and verified configurations are as intended.
with property
admin@sonic2:~$ bcmcmd "phy diag xe8 dsc config" | grep -C 2 "LOS"
Brdfe_on = 0
Media Type = 2
Unreliable LOS = 1
Scrambling Disable = 0
Lane Config from PCS = 0
without property
admin@sonic:~$ bcmcmd "phy diag xe8 dsc config" | grep -C 2 "LOS"
Brdfe_on = 0
Media Type = 0
Unreliable LOS = 0
Scrambling Disable = 0
Lane Config from PCS = 0
Signed-off-by: vaibhav-dahiya <vdahiya@microsoft.com>
- Why I did it
Enhance the feature to support disabling password expiration, as Linux supports:
-1: expiration will never occur
0: expiration will occur immediately
Opened bug:
#17427
- How I did it
Added support for the -1 value in hostcfgd; this value is propagated to the relevant Linux files.
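The value semantics can be sketched as follows. This is an illustration only: the function names are hypothetical, and the description does not name the Linux files involved, so rendering a `PASS_MAX_DAYS` line (a real /etc/login.defs key) is an assumption about one possible target.

```python
def describe_expiration(days):
    """Explain an expiration value using the semantics from the description."""
    if days == -1:
        return "expiration will never occur"
    if days == 0:
        return "expiration will occur immediately"
    if days > 0:
        return f"password expires after {days} days"
    raise ValueError(f"unsupported expiration value: {days}")


def render_pass_max_days(days):
    """Render a login.defs-style line (assumed target, see lead-in)."""
    describe_expiration(days)  # raises ValueError for values below -1
    return f"PASS_MAX_DAYS\t{days}"
```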
- How to verify it
Please see the details in the bug description linked above.
Fix an issue where, when TACACS is set to "tacacs+, local", a user can run a blocked command with local permission.
#### Why I did it
When TACACS is set to "tacacs+, local", a user can still run a blocked command with local permission.
##### Work item tracking
- Microsoft ADO: 26399545
#### How I did it
Fix the code to reject the command when authorization fails on the TACACS server side.
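The intended decision logic might be sketched as below. All names are hypothetical, and the fallback to local authorization when the server cannot be reached is an assumption about how "tacacs+, local" is meant to behave; the description only states that a server-side rejection must reject the command.

```python
from enum import Enum


class TacacsResult(Enum):
    ACCEPT = "accept"
    REJECT = "reject"
    UNREACHABLE = "unreachable"  # assumed: no server answered


def authorize_command(tacacs_result, local_allows):
    """Decide whether a command may run under "tacacs+, local".

    A rejection from the TACACS server is final and must not fall
    through to local permission.
    """
    if tacacs_result is TacacsResult.ACCEPT:
        return True
    if tacacs_result is TacacsResult.REJECT:
        return False
    # Assumed fallback: consult local permission only when the server
    # could not be reached at all.
    return local_allows
```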
#### How to verify it
Pass all UT.
### Description for the changelog
Fix an issue where, when TACACS is set to "tacacs+, local", a user can run a blocked command with local permission.
Fix zebra leaking memory with fib suppress enabled. Porting the fix from
FRRouting/frr#14983
While running test_stress_route.py, systems with lower memory started to throw low memory logs. On further investigation, a memory leak has been found in zebra which was fixed in the FRR community.