Why I did it
For better accounting purposes, updating the ingress lossy traffic profile to use static threshold. This change is only intended for Th devices using RDMA-CENTRIC profiles
How I did it
Update the buffer templates for Th devices in RDMA-CENTRIC folder to use the correct threshold
How to verify it
Verified the changes manually on a Th device.
Existing unit tests render Th template from the RDMA-CENTRIC folder. Updated the expected output to use the correct threshold
Why I did it
Our k8s feature will pull new version container images for each upgrade, the container images inside sonic will be more and more, but for now we don’t have a way to clean up the old version container images, the disk may be filled up. Need to add cleaning up the old version container images logic.
Work item tracking
Microsoft ADO (number only):
17979809
How I did it
Remove the old version container images besides the feature's current version and last version image, last version image is saved for supporting fallback.
How to verify it
Check whether the old version images are removed
Why I did it
Introduce a new valid neighbor element type to YANG.
Work item tracking
Microsoft ADO (number only): 23994521
How I did it
Add MgmtLeafRouter to element network type list.
How to verify it
Passes UTs
Why I did it
We need to store information of power shelf in config_db for SONiC MX switch. Current minigraph parser cannot parse rack_mgmt_map field.
Work item tracking
Microsoft ADO (number only): 22179645
How I did it
Add support for parsing rack_mgmt_map.
Why I did it
Manually cherry-pick and resolve conflicts of this PR: #15109
Extend device_metadata yang model.
Work item tracking
Microsoft ADO (number only): 22912178
How I did it
Add rack_mgmt_map field in yang model.
How to verify it
Build image.
Why I did it
systemd stop event on service with timers can sometime delete the state_db entry for the corresponding service.
Note: This won't be observed on the latest master label since the dependency on timer was removed with the recent config reload enhancement. However, it is better to have the fix since there might be some systemd services added to system health daemon in the future which may contain timers
root@qa-eth-vt01-4-3700c:/home/admin# systemctl stop snmp
root@qa-eth-vt01-4-3700c:/home/admin# show system-health sysready-status
System is not ready - one or more services are not up
Service-Name Service-Status App-Ready-Status Down-Reason
---------------------- ---------------- ------------------ -------------
<Truncated>
ssh OK OK -
swss OK OK -
syncd OK OK -
sysstat OK OK -
teamd OK OK -
telemetry OK OK -
what-just-happened OK OK -
ztp OK OK -
<Truncated>
Expected
Should see a Down entry for SNMP instead of the entry being deleted from the STATE_DB
root@qa-eth-vt01-4-3700c:/home/admin# show system-health sysready-status
System is not ready - one or more services are not up
Service-Name Service-Status App-Ready-Status Down-Reason
---------------------- ---------------- ------------------ -------------
<Truncated>
snmp Down Down Inactive
ssh OK OK -
swss OK OK -
syncd OK OK -
sysstat OK OK -
teamd OK OK -
telemetry OK OK -
what-just-happened OK OK -
ztp OK OK -
<Truncated>
How I did it
Happens because the timer is usually a PartOf service and thus a stop on service is propagated to timer. Fixed the logic to handle this
Apr 18 02:06:47.711252 r-lionfish-16 DEBUG healthd: Main process- received event:snmp.service from source:sysbus time:2023-04-17 23:06:47
Apr 18 02:06:47.711347 r-lionfish-16 INFO healthd: check_unit_status for [ snmp.service ]
Apr 18 02:06:47.722363 r-lionfish-16 INFO healthd: snmp.service service state changed to [inactive/dead]
Apr 18 02:06:47.723230 r-lionfish-16 DEBUG healthd: Main process- received event:snmp.timer from source:sysbus time:2023-04-17 23:06:47
Apr 18 02:06:47.723328 r-lionfish-16 INFO healthd: check_unit_status for [ snmp.timer ]
Signed-off-by: Vivek Reddy Karri <vkarri@nvidia.com>
Fix per-command authorization failed issue when a command with wildcard match more than hundred files.
#### Why I did it
When user enable TACACS per-command authorization, and run a command with wildcard , if the command match more than hundreds of files, the per-command authorization will failed with following message:
*** authorize failed by TACACS+ with given arguments, not executing
The root cause of this issue is because bash will match files with wildcard and replace with wildcard args with matched files. when there are too many files, TACACS plugin will generate a big authorization request, which will be reject by server side.
##### Work item tracking
- Microsoft ADO **(number only)**: 18074861
#### How I did it
Fix bash patch file, use original user inputs as authorization parameters.
#### How to verify it
Pass all UT.
Create new UT to validate the TACACS authorization request are using original command arguments.
UT PR: https://github.com/sonic-net/sonic-mgmt/pull/8115
#### Which release branch to backport (provide reason below if selected)
- [ ] 201811
- [ ] 201911
- [ ] 202006
- [ ] 202012
- [ ] 202106
- [ ] 202111
- [X] 202205
- [X] 202211
#### Tested branch (Please provide the tested image version)
- [x] 202205.258490-412b83d0f
- [x] 202211.71966120-1b971c54b5
#### Description for the changelog
Fix per-command authorization failed issue when a command with wildcard match more than hundred files.
Why I did it
src/sonic-linux-kernel
* 3909870 - (HEAD -> 202211, origin/202211) Change SECURE_UPGRADE_DEV_SIGNING_CERT to SECURE_UPGRADE_SIGNING_CERT (#315) (4 days ago) [DavidZagury]
* baaa137 - [202211] Add Secure Boot Kernel configuration backport (#316) (4 days ago) [DavidZagury]
How I did it
How to verify it
Update sonic-sairedis submodule pointer to include the following:
* 61cf1ce Revert Ignore removing switch for mellanox platform due to known limitation (1216) ([#1232](https://github.com/sonic-net/sonic-sairedis/pull/1232))
Signed-off-by: dprital <drorp@nvidia.com>
Why I did it
[Submodule][202211] Advance sonic-restapi pointer
The branch 202012 has already updated to commit 47e4b53.
4f6f979 Fix the redis security issue CVE-2023-28858 and CVE-2023-28859 (#139)
47e4b53 Fix adv_pfx len for ipv6 (#135)
44121be Support ipv6 prefix lenght greater than 64 and check for adv_prefix (#134)
99c467d Add API support for adv prefix and custom monitoring (#133)
347684a Use github code scanning instead of LGTM (#132)
86543d0 Updates to route PATCH API (#129)
a1af82c Install libyang to azure pipeline (#128)
2007c4c Increase coverage threshold (#126)
Work item tracking
Microsoft ADO (number only): 17705422
How I did it
How to verify it
Why I did it
Optimize the version control for Debian packages.
Fix sonic-slave-buster/sources.list.amd64 not found display issue, need to generate the file before running the shell command to evaluate the sonic image tag.
When using the snapshot mirror, it is not necessary to update the version file based on the base image. It will reduce the version dependency issue, when an image is not run when freezing the version.
How I did it
Not to update the version file when snapshot mirror enabled.
How to verify it
Update sonic-py-common, add missing dependency to redis-dump-load.
This is manually cherry-pick PR for https://github.com/sonic-net/sonic-buildimage/pull/14347
After 202211, the redis-dump-load been patched by sonic, so can't cherry-pick master branch PR to 202211 branch.
#### Why I did it
The script sonic_db_dump_load.py in sonic-py-common is depends on redis-dump-load, however the dependency is missing.
#### How I did it
Add redis-dump-load dependency.
#### How to verify it
Pass all E2E test case.
#### Description for the changelog
Update sonic-py-common, add missing dependency to redis-dump-load.
Why I did it
SONiC currently does not identify 'EdgeZoneAggregator' neighbor. As a result, the buffer profile attached to those interfaces uses the default cable length which could cause ingress packet drops due to insufficient headroom. Hence, there is a need to update the buffer templates to identify such neighbors and assign the same cable length as used by the T1.
How I did it
Modified the buffer template to identify EdgeZoneAggregator as a neighbor device type and assign it the same cable length as a T1/leaf router.
How to verify it
Unit tests pass, and manually checked on a 7260 to see the changes take effect.
Signed-off-by: dojha <devojha@microsoft.com>
Fixes#11873.
#### Why I did it
When loading from minigraph, for port channels, don't create the members@ array in config_db in the PORTCHANNEL table. This is no longer needed or used.
In addition, when adding a port channel member from the CLI, that member doesn't get added into the members@ array, resulting in a bit of inconsistency. This gets rid of that inconsistency.
Why I did it
Dhcpmon had incorrect RX count for server side packets. It does not raise any false alarms, but could miss catching server side packet count mismatch between snapshot and current counter.
Add debug mode which prints counter to syslog
How I did it
Due to dualtor inbound filter requirement, there are currently two filters, each for listening to rx / tx packets.
Originally, we opened up an rx/tx socket for each interface specified, which causes duplicate socket. Now we initialize the sockets only once. Both sockets are not binded to an interface, and we use vlan to interface mapping to filter packets. For inbound uplinks, we use a portchannel to interface mapping.
Previous dhcpmon counter before dual tor change:
[ Agg-Vlan1000- Current rx/tx] Discover: 1/ 4, Offer: 1/ 1, Request: 3/ 12, ACK: 1/ 1
[ eth0- Current rx/tx] Discover: 0/ 0, Offer: 0/ 0, Request: 0/ 0, ACK: 0/ 0
[ eth0- Current rx/tx] Discover: 0/ 0, Offer: 0/ 0, Request: 0/ 0, ACK: 0/ 0
[ PortChannel104- Current rx/tx] Discover: 0/ 1, Offer: 0/ 0, Request: 0/ 3, ACK: 0/ 0
[ PortChannel103- Current rx/tx] Discover: 0/ 1, Offer: 0/ 0, Request: 0/ 3, ACK: 0/ 0
[ PortChannel102- Current rx/tx] Discover: 0/ 2, Offer: 1/ 0, Request: 0/ 6, ACK: 1/ 0
[ PortChannel101- Current rx/tx] Discover: 0/ 0, Offer: 0/ 0, Request: 0/ 0, ACK: 0/ 0
[ Vlan1000- Current rx/tx] Discover: 1/ 0, Offer: 0/ 1, Request: 3/ 0, ACK: 0/ 1
[ Agg-Vlan1000- Current rx/tx] Discover: 1/ 4, Offer: 1/ 1, Request: 3/ 12, ACK: 1/ 1
Dhcpmon counter after this PR:
[ PortChannel104- Current rx/tx] Discover: 0/ 1, Offer: 0/ 0, Request: 0/ 3, ACK: 0/ 0
[ PortChannel103- Current rx/tx] Discover: 0/ 1, Offer: 0/ 0, Request: 0/ 3, ACK: 0/ 0
[ PortChannel102- Current rx/tx] Discover: 0/ 2, Offer: 1/ 0, Request: 0/ 6, ACK: 1/ 0
[ PortChannel101- Current rx/tx] Discover: 0/ 0, Offer: 0/ 0, Request: 0/ 0, ACK: 0/ 0
[ Vlan1000- Current rx/tx] Discover: 1/ 0, Offer: 0/ 1, Request: 3/ 0, ACK: 0/ 1
[ Agg-Vlan1000- Current rx/tx] Discover: 1/ 4, Offer: 1/ 1, Request: 3/ 12, ACK: 1/ 1
How to verify it
Ran dhcp relay test to send all four packets in singles and batches on both single ToR and dual ToR. Counter was as expected.
On SONiC VoQ chassis, the speed changes are done from 400G to 100G needs to be supported on 400G linecards.
To enable this, along with speed change the port lanes need to be changed. This PR has the changes to update the port lanes when such speed change happens.
This PR is intended only for VoQ chassis linecards. These platforms today have 400g port with 8 serdes lines, and 100g will operate with 4 serdes lane. When the port speed changes from 400G to 100G the first 4 lanes will be used for 100G port.
Platforms which support 2x50g PAM4 or support 100G PAM4 serdes or other combinations are not handled in the PR.
Signed-off-by: Arvindsrinivasan Lakshmi Narasimhan <arlakshm@microsoft.com>
- Why I did it
Healthd check system status every 60 seconds. However, running checker may take several seconds. Say checker takes X seconds, healthd takes (60 + X) seconds to finish one iteration. This implementation makes sonic-mgmt test case not so stable because the value X is hard to predict and different among different platforms. This PR introduces an interval
compensation mechanism to healthd main loop.
- How I did it
Introduces an interval compensation mechanism to healthd main loop: healthd should wait (60 - X) seconds for next iteration
- How to verify it
Manual test
Unit test
Why I did it
Add interface-id in dhcpv6-relay yang model
How I did it
Add interface-id option and corresponding UT. Updated configuration.md
How to verify it
kellyyeh@kellyyeh:~/sonic-buildimage/src/sonic-yang-models$ pyang -Vf tree -p /usr/local/share/yang/modules/ietf ./yang-models/sonic-dhcpv6-relay.yang
fa8b709 Handled the error case of negative age (#57)
990f5b0 Use github code scanning instead of LGTM (#55)
a7992c5 Install libyang for swss-common. (#50)
244fa86 Update README.md
Signed-off-by: Vivek Reddy <vkarri@nvidia.com>
Manual cherry-pick of #14045
Why I did it
Fixing issue #13983 Added Missing fields in sonic-portchannel yang model. "fallback" and "fast_rate" fields are present in configuration schema but not in yang model. This leads to traceback when yang is validated
sonic_yang(3):All Keys are not parsed in PORTCHANNEL dict_keys(['PortChannel100'])
sonic_yang(3):exceptionList:["'fast_rate'"]
sonic_yang(3):Data Loading Failed:All Keys are not parsed in PORTCHANNEL dict_keys(['PortChannel100'])
exceptionList:["'fast_rate'"]
Data Loading Failed
All Keys are not parsed in PORTCHANNEL
dict_keys(['PortChannel100'])
exceptionList:["'fast_rate'"]
ConfigMgmt Class creation failed
Failed to break out Port. Error: Failed to load the config. Error: ConfigMgmtDPB Class creation failed
How I did it
Updated yang model
How to verify it
Added tests to verify
- Why I did it
Need to add the possibility to choose between dropping packets (using ACL) on ingress or egress in Dual ToR scenario
- How I did it
Add new attribute "mux_tunnel_ingress_acl" to SYSTEM_DEFAULTS table
- How to verify it
check that new attribute exists in redis:
admin@sonic:~$ redis-cli -n 4
127.0.0.1:6379[4]> HGETALL SYSTEM_DEFAULTS|mux_tunnel_ingress_acl
1."state"
2."false"
Signed-off-by: Andriy Yurkiv <ayurkiv@nvidia.com>
Why I did it
8c7ddf56 - [warm/fast-reboot] Backup logs from tmpfs to disk during fast/warm shutdown ([swss]: update swss docker to stretch #2714) (3 hours ago) [Vaibhav Hemant Dixit]
f2a31b30 - [ci] Fix pipeline issue caused by sonic-slave-* change. ([201803] Modify Debian apt repos to reflect changes made by maintainers #2709) (3 hours ago) [Liu Shilong]
586ecf0e - [dhcp_relay] Fix dhcp_relay restart error while add/del vlan ([thrift] add a patch to revert THRIFT-3650 #2688) (3 hours ago) [Yaqiang Zhu]
07b0ef4c - [portstat CLI] don't print reminder if use json format ([devices] add new accton platform minipack. #2670) (3 hours ago) [wenyiz2021]
48d3d3ef - [show][muxcable] add some new commands health, reset-cause, queue_info support for muxcable (DUT takes more than 7 seconds to finish update ip v6 neighbor #2414) (3 hours ago) [vdahiya12]
How I did it
How to verify it
Why I did it
submodule advance
b085b5f - [ci] Fix pipeline error about team5 not found. (Core dump in orchagent when assigning router interface to a vlan with untagged mode #2684) (3 hours ago) [Liu Shilong]
4549b4c - Fix issue: there is no retry while creating a RIF which is in removing state ([201811 sub-module] advance sub-modules: utilities, swss, swss-common #2679) (3 hours ago) [Junchao-Mellanox]
980a45b - [FDB]Fixing FDB consolidated flush for Remote MACs (pmon to stretch #2673) (3 hours ago) [Sudharsan Dhamal Gopalarathnam]
c646607 - Do not allow to add port to .1Q bridge while router port deletion is not completed (Update SDK, FW and SAI #2669) (3 hours ago) [Lior Avramov]
4a321f0 - [orchagent]: Get bridge port ID from orchagent cache instead of SAI API ([201811 sub module] advance sairedis sub module #2657) (3 hours ago) [Lawrence Lee]
f4b88f3 - [Dual-ToR] handle 'mux_tunnel_egress_acl' attrib in order to change ACL configuration (drop on ingress/egress) on standby ToR (lm75 doesn't support written alarm to syslog. #2646) (3 hours ago) [Andriy Yurkiv]
a4f29c1 - [Workaround] EvpnRemoteVnip2pOrch warmboot check failure ([teamd]: wait for swss db flush done before starting teamd container #2626) (3 hours ago) [jcaiMR]
53ee0a8 - Support for tc-dot1p and tc-dscp qosmap ([201803] [router-advertiser] Add templated script to wait for pertinent interfaces to be ready before starting radvd #2559) (3 hours ago) [Divya Mukundan]
b953866 - [dual-tor] add missing SAI attribte in order to create IPNIP tunnel (Config reload/load_minigraph not clearing State DB #2503) (3 hours ago) [Andriy Yurkiv]
How I did it
How to verify it
Why I did it
cf9a66b - Fix issue: bulk counter feature is disabled ([Broadcom]: Update Broadcom SDK/SAI package #1205) (4 hours ago) [Lior Avramov]
8b1583b - [Dual-ToR] update sai.profile with SAI_ADDITIONAL_MAC_ENABLED attribute if corresponding arg passed to syncd ([Makefile]: variable ENABLE_SYNCD_RPC is always empty string #1201) (4 hours ago) [Andriy Yurkiv]
50d8e21 - [syncd]: Enable port bulk API ([platform] Accton AS7712-32X. Update for sensors and sfputil. #1197) (4 hours ago) [Nazarii Hnydyn]
a72438a - Use new value of STATE_DB FAST_REBOOT entry ([device/accton]: Update Accton-AS5712_54X #1196) (4 hours ago) [Aryeh Feigin]
d78ce86 - validation support for SAI_ATTR_VALUE_TYPE_JSON ([installer] FIX. ONIE installer error issue: #1152) (4 hours ago) [svshah-intel]
How I did it
How to verify it
Why I did it
9ccaaa5 - Update host electrical interface for 2x100G AOC ([platform]: add dell s6100 into one image #346) (4 hours ago) [mihirpat1]
d7016a4 - [ssd_generic] Get health status from Remaining_Life_Left field for virtium SSD ([docker]: Update docker-orchagent start.sh to combine td2 qos/buffers… #344) (4 hours ago) [Junchao-Mellanox]
How I did it
How to verify it
Why I did it
6391de0 - [ycable] add changes for correcting telemetry values for 'active-active' (Add default dhcp_relay.yml file to OneImage build #341) (4 hours ago) [vdahiya12]
2cb31c4 - Update CMIS module types for 2x100G AOC support ([kernel]: update linux kernel to support z9100 #339) (4 hours ago) [mihirpat1]
2ea9cf2 - [ycabled] add more coverage to ycabled; add minor name change for vendor API CLI return key-values pairs ([Makefile]: Automatically rebuild sonic-slave #338) (4 hours ago) [vdahiya12]
How I did it
How to verify it
Why I did it
e732ed0 - Prevent sonic-db-cli generate core dump (Update submodule: sairedis #749) (4 minutes ago) [Hua Liu]
28adcb4 - Support for TC-DOT1p qos map (Update submodules: sonic-swss-common, sonic-sairedis #721) (5 minutes ago) [Divya Mukundan]
How I did it
How to verify it
Why I did it
[Build] Support to use loosen version when failed to install python packages
It is to fix the issue #14012
How I did it
Try to use the installation command without constraint
How to verify it
Why I did it
Support to upgrade packages, do better cleanup after the build.
How I did it
Remove the no use preference version control file after the build.
How to verify it
Why I did it
On a supervisor card in a chassis, syncd/teamd/swss/lldp etc dockers are created for each Switch Fabric card. However, not all chassis would have all the switch fabric cards present. In this case, only dockers for Switch Fabrics present would be created.
system-health indicates errors in this scenario as it is expecting dockers for all Switch Fabrics (based on NUM_ASIC defined in asic.conf file).
system-health process error messages were also altered to indicate which container had the issue; multiple containers may run processes with the same name, which can result in identical system-health error messages, causing ambiguity.
How I did it
Port container_checker logic from #11442 into service_checker for system-health.
How to verify it
Bringup Supervisor card with one or more missing fabric cards. Execute 'show system-health summary'. The command should not report failure due to missing dockers for the asics on the fabric cards which are not present.
Why I did it
Submodule advances:
sonic-utilities
8e8e6088 - [202211][dhcp_relay] Remove add field of vlanid to DHCP_RELAY table while adding vlan ([201811 sub-module] advance sub-modules: utilities, swss, swss-common #2679) (16 hours ago) [Yaqiang Zhu]
1400fb94 - [GCU] Ignore bgpraw in GCU applier (Fix sfputil indexing for 7170-Q59S20 #2623) (15 hours ago) [jingwenxie]
f76a6364 - [vlan] Refresh dhcpv6_relay config while adding/deleting a vlan ([sonic-py-swsssdk] Update submodule #2660) (15 hours ago) [Yaqiang Zhu]
7849e18d - [db_migrator] make LOG_LEVEL_DB migration more robust (Mellanox platform: attach queues 2 and 6 to lossy profile using generic buffer template #2651) (16 hours ago) [Stepan Blyshchak]
c7df6dfa - Fixed a bug in "show vnet routes all" causing screen overrun. (Add hook to allow customizing link cable lengths #2644) (16 hours ago) [siqbal1986]
a5505f02 - show logging CLI support for logs stored in tmpfs (Traceback error seen while issuing show interface commands with if_names #2641) (16 hours ago) [mihirpat1]
bbacb91a - [system-health] Fix issue: show system-health CLI crashes (Updating deb package for platform and sai #2635) (16 hours ago) [Junchao-Mellanox]
8d724024 - [sai_failure_dump]Invoking dump during SAI failure ([dockers]: Upgrade LLDP docker to stretch build #2633) (16 hours ago) [Sudharsan Dhamal Gopalarathnam]
3c3be526 - Add transceiver info CLI support to show output from TRANSCEIVER_INFO for ZR ([submodule]: Update sonic-sairedis pointer #2630) (16 hours ago) [mihirpat1]
37f41666 - [show] add support for gRPC show commands for active-active ([bitmap-vnet]: Bitmap vnet test image [DO NOT MERGE] #2629) (16 hours ago) [vdahiya12]
b06d7fe4 - [show_bfd] add local discriminator in show bfd command ([Pmon] Selectively load pmon container daemons #2625) (16 hours ago) [Baorong Liu]
6adcd3e8 - [GCU] Ignore bgpraw table in GCU operation ([Mellanox] Fix SAI version #2628) (16 hours ago) [jingwenxie]
c65bdc35 - [muxcable][config] Add support to enable/disable ceasing to be an advertisement interface when radv service is stopped (Add knob in ConfigDB to enable/disable telemetry container #2622) (16 hours ago) [Jing Zhang]
91e9457f - Add Transceiver PM basic CLI support to show output from TRANSCEIVER_PM table for ZR ([201803] Restart SwSS, syncd and dependent services if a critical process in syncd container exits #2615) (16 hours ago) [longhuan-cisco]
54cc8c5a - Remove TODO comment which is no longer relevant (Warm-reboot: teamd warm restart caused neighbor deleted and learned again. #2600) (16 hours ago) [Lior Avramov]
6891b4fb - Making 'show feature autorestart' more resilient to missing auto_restart config in CONFIG_DB ([submodule] update mellanox hw-mgmgt pointer (V.2.0.0061) #2592) (16 hours ago) [kartik-arista]
1e8bea37 - [storyteller] add link prober state change to story teller ([sonic-buildimage] New feature managementVRF(L3mdev) #2585) (16 hours ago) [Jing Zhang]
7481a20f - Extend fast-reboot STATE_DB entry timer ([submodule]: update sonic-swss-common, sonic-py-swsssdk, sonic-snmpagent #2577) (16 hours ago) [Aryeh Feigin]
0e08701c - [sonic_installer] use /etc/resolv.conf from the host when migrating packages (Set a rate limit on syslog messages from all Docker containers #2573) (16 hours ago) [Stepan Blyshchak]
06096780 - Fixed admin state config CLI for Backport interfaces (Prior to install a new ONIE SONiC image, delete all partitions except EFI/ONIE #2557) (16 hours ago) [anamehra]
9f1f13e4 - [show] Add bgpraw to show run all (Fixed typo on paragraph #40#2537) (16 hours ago) [jingwenxie]
98bc8bd2 - [chassis][voq] Add "show fabric reachability" command. ([ntp]: Build 4.2.6 locally. #2528) (16 hours ago) [jfeng-arista]
3a50b63f - Preserve copp tables through DB migration ([docker-radvd]: upgrade docker radvd to stretch based #2524) (16 hours ago) [Aryeh Feigin]
28f6b127 - [masic] 'show interfaces counters' reminds to use '-d all' option to check for internal links (solve dependency issue #2466) (16 hours ago) [wenyiz2021]
15026e14 - suppport multi asic for show queue counter ([dockers] Prevent old supervisord messages from gettting re-logged to syslog #2439) (16 hours ago) [zhixzhu]
2d773e17 - [masic support] 'show run bgp' support for multi-asic (lo address not synced to the asic #2427) (16 hours ago) [wenyiz2021]
sonic-swss
4f304bc - [EVPN]Handling race condition when remote VNI arrives before tunnel map entry ([sonic-quagga] Function defect, do NOT cancel route while connect IP down #2642) (15 hours ago) [Sudharsan Dhamal Gopalarathnam]
34fc615 - [sai_failure_dump]Invoking dump during SAI failure (Add hook to allow customizing link cable lengths #2644) (15 hours ago) [Sudharsan Dhamal Gopalarathnam]
b817695 - [autoneg]Fixing adv interface types to be set when AN is disabled (Fix issue with platform file path name #2638) (15 hours ago) [Sudharsan Dhamal Gopalarathnam]
ab36bd4 - [bfdorch] add local discriminator to state DB ([bitmap-vnet]: Bitmap vnet test image [DO NOT MERGE] #2629) (15 hours ago) [Baorong Liu]
6343471 - Remove TODO comments that are no longer relevant (Add knob in ConfigDB to enable/disable telemetry container #2622) (15 hours ago) [Lior Avramov]
2b1869c - [refactor]Refactoring sai handle status (Rollback kernel submodule update. #2621) (15 hours ago) [Sudharsan Dhamal Gopalarathnam]
c41a1b7 - Fix issue ARP entry is out of sync between kernel and APPL_DB after warm reboot if the ARP entry is updated more than once during warm reboot in PFC watchdog warm reboot test #13341 ARP entry can be out of sync between kernel and APPL_DB if multiple updates are received from RTNL ([sub module] advance sonic-utilities sub module for 201811 branch #2619) (15 hours ago) [Stephen Sun]
da0cf7a - Changed the BFD default detect multiplier to 10x ("failed to load plugin io.containerd.snapshotter..." seen during linux boot up #2614) (15 hours ago) [siqbal1986]
13b5adf - [vstest] Only collect stdout of orchagent_restart_check in vstest ([submodules] update swss and utilities pointers #2597) (15 hours ago) [bingwang-ms]
2b9d94d - Avoid aborting orchagent when setting TUNNEL attributes (build failing for PLATFORM=p4 #2591) (15 hours ago) [Stephen Sun]
99b7d3b - Only collect stdout of orchagent_restart_check in vstest ( [saibcm-modules]: import new bcm modules #2578) (15 hours ago) [bingwang-ms]
5209c42 - dereg acl-rule counters during acl-table del ([201803] Set a rate limit on syslog messages from all Docker containers #2574) (15 hours ago) [Vivek]
ae68054 - Fixed set mtu for deleted subintf due to late notification ([vs]: Add option to specify platform name for DVS orchagent #2571) (15 hours ago) [EdenGri]
ab13dfa - Remove TODO comments which are no longer needed (support set timezone in ConfigDB #2568) (15 hours ago) [Junchao-Mellanox]
a3545cf - Modify coppmgr mergeConfig to support preserving copp tables through reboot. (Added new SN3700/SN3700C Mellanox platforms #2548) (15 hours ago) [Aryeh Feigin]
be16e79 - Use github code scanning instead of LGTM ([201803] [services] Restart SwSS service upon unexpected critical process exit #2546) (15 hours ago) [Liu Shilong]
63c0234 - Updated handling of VRF_VNI mapping and VLAN_VNI mapping for same VNI ID (Move warm_restart enable/disable config to stateDB WARM_RESTART_ENABL… #2538) (15 hours ago) [Tapash Das]
4844111 - Fix potential risks ([mlnx] Fix sai xml path for boxer platform #2516) (15 hours ago) [Liran-Ar]
6420808 - [p4orch]: PINS Extension tables support ([build] When generating image version, handle case where current commit has no reachable tags #2506) (15 hours ago) [svshah-intel]
sonic-swss-common
1badd46 - Increase the netlink buffer size from 3MB to 16MB. (arp_update doesn't sleep 300 between each execution #739) (14 hours ago) [KISHORE KUNAL]
6555057 - Refactor eventpublisher deinit ([acl] Add default deny rule for l3 table #734) (14 hours ago) [Zain Budhwani]
f4d6de7 - Use github code scanning instead of LGTM ([sonic-quagga]:update submodule #718) (14 hours ago) [Liu Shilong]
sonic-linux-kernel
74f9a8f - Update linux kernel for hw-mgmt V.7.0020.4104 (Move template files to /usr/share/sonic/templates #305) (14 hours ago) [Stephen Sun]
6365701 - Fixes for emmc unreliability ([build_debian.sh]: Integrate system dump script #270) (14 hours ago) [Samuel Angebault]
How I did it
How to verify it
Why I did it
sonic-sairedis
53488e9 - [sai_failure_dump]Invoking dump during SAI failure (Update Mellanox buffer profiles config #1198) (15 hours ago) [Sudharsan Dhamal Gopalarathnam]
85921af - [Mellanox] Enable DSCP remapping by using SAI attribute ([Nephos] Updating download link for SAI and SDK #1188) (15 hours ago) [Stephen Sun]
82f2cd7 - Switch to using stock gcovr 5.2 (Add service to config hostname based on configdb #1174) (15 hours ago) [Saikrishna Arcot]
3a6c60d - [ppi]: Enable bulk API. ([Aboot] Declare flash_size for all platform #1171) (15 hours ago) [Nazarii Hnydyn]
f1303cb - Use github code scanning instead of LGTM (#1160) (15 hours ago) [Liu Shilong]
b1972d9 - Fix for [EVPN] When MAC moves from remote end point to local, ASIC DB fields are not updated properly for the mac #11503Update NotificationProcessor.cpp ([libteam] Add fallback support for single-member-port LAG #1118) (15 hours ago) [anilkpan]
How I did it
How to verify it
Why I did it
advance sonic-platform-daemons
7219b56 - [Xcvrd]: Fix optics insertion/removal not detected (Add Ingrasys S9100 platform submodule #333) (3 days ago) [Prince George]
9b15ccf - add data for telemtery enhancement for 'active-active' cable type ([platform]: add support for Force10-Z9100 32x100G #332) (3 days ago) [vdahiya12]
1c7dba6 - Fix bug where transceiver info is missing after port breakout change ([teamd] Fix a bug in #305 that will break teamd #329) (3 days ago) [Tal Berlowitz]
07b8f3c - Xcvrd should restart if any child thread crashes (Update Mellanox SAI git reference #326) (3 days ago) [mihirpat1]
How I did it
How to verify it
Why I did it
advance sonic-platform-common
2dbc0ea - (HEAD, origin/202211) Change get_tx_bias return type to list ([platform]: add eeprom/sfputil support for z9100 #342) (2 days ago) [mihirpat1]
How I did it
How to verify it
Why I did it
include changes from sairedis submodule
102d20b | [202211][submodule][SAI]Advance header include 0031470 | improve enum values integration check (#1727) (#1737)
04d3c41 | [Submodule][upgrade]Upgrade SAI submodule (#1204)
updates from SAI
7710e24 | [cherry-pick][202211]Enhance the check enum lock script (#1741) (#1742)
0031470 | improve enum values integration check (#1727) (#1737)
4f11c7e | Enable github code scanning to replace LGTM. (#1709)
How I did it
How to verify it
During docker build, host files can be passed to the docker build through
docker context files. But there is no straightforward way to transfer
the files from docker build to host.
This feature provides a tricky way to pass the cache contents from docker
build to host. It tar's the cached content and encodes them as base64 format
and passes it through a log file with a special tag as 'VCSTART and VCENT'.
Slave.mk in the host, it extracts the cache contents from the log and stores them
in the cache folder. Cache contents are encoded as base64 format for
easy passing.
<!--
Please make sure you've read and understood our contributing guidelines:
https://github.com/Azure/SONiC/blob/gh-pages/CONTRIBUTING.md
** Make sure all your commits include a signature generated with `git commit -s` **
If this is a bug fix, make sure your description includes "fixes #xxxx", or
"closes #xxxx" or "resolves #xxxx"
Please provide the following information:
-->
#### Why I did it
#### How I did it
#### How to verify it
Why I did it
This PR is an enhancement of PR #13105
Because the input string of AttachTo for ACL table can appear in both port name group and port alias group, I added a logic to determine whether the string should be port name or port alias
If all the input strings belong to port name group, then we treat all of them as port name
If all the input strings belong to port alias, then we treat all of them as port alias
If all the input string belongs to both port alias group and port name group, we prefer port alias. The behavior is as before.
How I did it
Walk through all port names/alias in the input to make a decision.
How to verify it
Verified by adding UT.
Why I did it
There is a queue in sysmonitor.py that is created based on an object of multiprocessing.Manager.
After performing fast-reboot, system health monitor is being shut down, what causes this Manager to be shut down as well, since it is a child-process of healthd.
That's why I moved the creation of this Manager from the top of the file to the function Sysmonitor.system_service() (The only place it is used), to make Manager a child-process of Sysmonitor, instead of Healthd. This way both the queue (the Manager) and the processes that uses this queue will be child-processes of the same process, and the problematic scenario of sysmonitor sending messages to a dead queue will not be possible.
How I did it
Removed the definition of manager as global and moved it to system_service() function
How to verify it
Perform a fast reboot and verify the traceback issue is fixed
- Why I did it
Fixed sflow yang model to include collector_vrf field.
- How I did it
Added leaf for collector_vrf under sflow_collector. Additionally aligned the configuration guide
- How to verify it
Added UT to verify.
Why I did it
To ensure, that after a BGP startup, dualtor T0 receives BGP updates before sending out BGP updates.
Please refer to sonic-net/SONiC#1161 for more details.
How I did it
add coalesce-time 10000 to the frr bgp startup config.
Signed-off-by: Longxiang Lyu <lolv@microsoft.com>
Why I did it
We plan to pilot k8s feature, need to fix several bugs including enable telemetry feature and add platform label.
How I did it
Add support feature set, only enable telemetry container upgrade for now
Add platform label for scheduler usage
Remove CNI installation code, it would be auto installed when install kubeadm
How to verify it
After sonic device join k8s cluster, show node labels to check if platform label is visible.
Signed-off-by: Yun Li yunli1@microsoft.com
#### Why I did it
Improve naming convention for bgp notification events and change type of leaf for sonic-events-host mem usage from uint64 to decimal64
#### How I did it
Replace "-" with "_"
Replace uint64 with decimal64
#### How to verify it
Run yang model unit tests
#### Description for the changelog
Change YANG model leaf naming convention for bgp notification
Why I did it
Fix issue caused by dualtor support PR [dhcpmon] Open different socket for dual tor to enable interface filtering #11201
Improve code
How I did it
On single ToR, packets received count was duplicated due to socket filter set to "inbound"
Tx count not increasing due to filter set to "inbound". Added an outbound socket to count tx packets
Added vlan member interface mapping for Ethernet interface to vlan interface lookup in reference to PR Fix multiple vlan issue sonic-dhcp-relay#27
Exit when socket fails to initialize to allow dhcp_relay docker to restart
How to verify it
Tested on vstestbed single tor and dual tor, sent packets and verify printed out dhcpmon rx and tx counters is correct
Correct number of tx increases
Tx does not increase when ToR is on standby
#### Why I did it
Segfault was occuring when running memory_checker
#### How I did it
Deinit publisher immediately after publishing
#### How to verify it
Manual testing
Why I did it
This PR is to update minigraph.py to support both port alias and port name as input of AttachTo attribute of ACL table.
Before this change, only port alias is supported.
How I did it
Add a global variable to store port names
Search both port names and port alias wheh parsing the value of AttachTo.
How to verify it
Verified by a new unit test case test_minigraph_acl_attach_to_ports
Verified by copying the new minigraph.py to a testbed and run conflg load_minigraph.