Commit Graph

844 Commits

Author SHA1 Message Date
mssonicbld
763fcd7eeb
[ci/build]: Upgrade SONiC package versions (#8199) 2021-07-16 14:34:02 +00:00
mssonicbld
b1728187be
[ci/build]: Upgrade SONiC package versions (#8163) 2021-07-15 14:57:14 +00:00
Blueve
762847c2cf [port_config] Introduce ad-hoc mport_config.json file (#8066)
Signed-off-by: Jing Kan jika@microsoft.com
2021-07-15 12:06:47 +00:00
gechiang
514f760793
[202012] BRCM SAI 4.3.3.9 Changes for ISSU support and Dual ToR fixes (#8179) 2021-07-14 10:36:15 -07:00
Kebo Liu
86d64d2fef mount 'mellanox' folder only instead of create each sub folder (#7830)
#### Why I did it

Following the discussion in another PR https://github.com/Azure/sonic-buildimage/pull/7708#discussion_r642933510 , since there will be multi subfolders under **/var/log/mellanox**, so we agreed to only mount this folder and the subfolders will be created afterward on demand.  

#### How I did it

during the syncd docker creation, only mount  folder **/var/log/mellanox**

#### How to verify it

build an Mellanox image and verify the related folder on the host and docker side.
2021-07-13 11:36:56 +00:00
mssonicbld
18ae4a37d8
[ci/build]: Upgrade SONiC package versions (#8156) 2021-07-12 14:01:55 +00:00
mssonicbld
a5fe47858a
[ci/build]: Upgrade SONiC package versions (#8152) 2021-07-11 14:18:53 +00:00
mssonicbld
58b8d5502b
[ci/build]: Upgrade SONiC package versions (#8148) 2021-07-10 14:37:53 +00:00
mssonicbld
cd5d950c98
[ci/build]: Upgrade SONiC package versions (#8127) 2021-07-09 13:25:21 +00:00
mssonicbld
3ded393093
[ci/build]: Upgrade SONiC package versions (#8062) 2021-07-07 13:26:21 +00:00
rajendra-dendukuri
049a23e5a4 [kdump] Fix kdump error message when a reboot is issued (#7985)
dash doesn't support += operation to append to a variable's value. Use KDUMP_CMDLINE_APPEND="${KDUMP_CMDLINE_APPEND} " instead

The below error message is seen when a reboot is issued.

[ 342.439096] kdump-tools[13655]: /etc/init.d/kdump-tools: 117: /etc/default/kdump-tools: KDUMP_CMDLINE_APPEND+= panic=10 debug hpet=disable pcie_port=compat pci=nommconf sonic_platform=x86_64-accton_as7326_56x-r0: not found
2021-07-07 09:40:16 +00:00
mssonicbld
de46dc53a3
[ci/build]: Upgrade SONiC package versions (#8054) 2021-07-04 13:45:42 +00:00
mssonicbld
f44d6cef8b
[ci/build]: Upgrade SONiC package versions (#8051) 2021-07-03 13:53:50 +00:00
mssonicbld
70cb258b7e
[ci/build]: Upgrade SONiC package versions (#8041) 2021-07-02 13:52:41 +00:00
Guohan Lu
d3e2983188 Revert "[Kubernetes]: The kube server could be used as http-proxy for docker (#7469)"
This reverts commit e851a42db7.
2021-07-01 18:41:21 -07:00
mssonicbld
e611f50b48
[ci/build]: Upgrade SONiC package versions (#7858)
Upgrade SONiC Versions
2021-07-01 21:42:47 +08:00
xumia
c3018773d7 Fix vtysh shell-ingestion security issue (#7759)
Fix vtysh shell-ingestion security issue
Only expose the limited parameters of the command vtysh show.
2021-06-28 09:38:00 +00:00
gechiang
efb6c1d9cb
[202012][BRCMSAI]Fix two crash issues introduced by SAI 4.3.3.8 (#7979)
Why I did it
There were two regression issues introduced by BRCM SAI 4.3.3.8:

CS00012196056 [4.3.3.8][WARMBOOT] syncd[2584]: segfault at 5616ad6c3d80 ip 00007f61e0c6bc65 sp 00007fff0c5a7a90 error 4 in libsai.so.1.0[7f61e0a95000+3cd8000]
CS00012195956 [4.3.3.8] [TD3]Syncd Crash at brcm_sai_tnl_mp_create_tunnel()
How I did it
Patch for CS00012195956 from BRCM was validated to have addressed the tunnel creation issue.
Temporary worked around the issue by commenting out a portion of questionable code in BRCM SAI that seems to be the root cause of CS00012196056 .
How to verify it
See the BRCM cases for details.
2021-06-25 08:06:45 -07:00
gechiang
7e4d42eb88
[202012] Pick up BRCM SAI 4.3.3.8 Changes that fixed several issues (#7918) 2021-06-20 22:50:32 -07:00
Sujin Kang
d67a5b887f Support multiple pcie configuration file and change the pcie status table name to match with pcied changes (#7886)
Why I did it
Support multiple pcie configuration file and change the pcie status table name
This is to match with below two PRs.
Azure/sonic-platform-common#195
Azure/sonic-platform-daemons#189

How I did it
Check pcie configuration file with wild card and change the device status table name

How to verify it
Restart with changes and see if the pcie check works as expected.
2021-06-17 07:09:50 +00:00
Renuka Manavalan
e851a42db7 [Kubernetes]: The kube server could be used as http-proxy for docker (#7469)
Why I did it
The SONiC switches get their docker images from local repo, populated during install with container images pre-built into SONiC FW. With the introduction of kubernetes, new docker images available in remote repo could be deployed. This requires dockerd to be able to pull images from remote repo.

Depending on the Switch network domain & config, it may or may not be able to reach the remote repo. In the case where remote repo is unreachable, we could potentially make Kubernetes server to also act as http-proxy.

How I did it
When admin explicitly enables, the kubernetes-server could be configured as docker-proxy. But any update to docker-proxy has to be via service-conf file environment variable, implying a "service restart docker" is required. But restart of dockerd is vey expensive, as it would restarts all dockers, including database docker.

To avoid dockerd restart, pre-configure an http_proxy using an unused IP. When k8s server is enabled to act as http-proxy, an IP table entry would be created to direct all traffic to the configured-unused-proxy-ip to the kubernetes-master IP. This way any update to Kubernetes master config would be just manipulating IPTables, which will be transparent to all modules, until dockerd needs to download from remote repo.

How to verify it
Configure a switch such that image repo is unreachable
Pre-configure dockerd with http_proxy.conf using an unused IP (e.g. 172.16.1.1)
Update ctrmgrd.service to invoke ctrmgrd.py with "-p" option.
Configure a k8s server, and deploy an image for feature with set_owner="kube"
Check if switch could successfully download the image or not.
2021-06-17 07:09:50 +00:00
gechiang
341e15b620
[202012] Bring in BRCM SAI changes from SAI 4.3.3.7 (#7850) 2021-06-14 17:52:35 -07:00
mssonicbld
99b03cff45
[ci/build]: Upgrade SONiC package versions (#7856) 2021-06-12 14:22:14 +00:00
mssonicbld
b5551f044e
[ci/build]: Upgrade SONiC package versions (#7805) 2021-06-11 12:58:22 +00:00
yozhao101
fb2c995f53
[202012][Monit] Deprecate the feature of monitoring the critical processes by Monit (#7823)
Signed-off-by: Yong Zhao yozhao@microsoft.com

Why I did it
Currently we leveraged the Supervisor to monitor the running status of critical processes in each container and it is more reliable and flexible than doing the monitoring by Monit. So we removed the functionality of monitoring the critical processes by Monit.

How I did it
I removed the script process_checker and corresponding Monit configuration entries of critical processes.

How to verify it
I verified this on the device str-7260cx3-acs-1.
2021-06-09 09:04:22 -07:00
Renuka Manavalan
32e5137ab7 Add service to restore TACACS from old config (#7560)
Why I did it
In upgrade scenarios, where config_db.json is not carry forwarded to new image, it could be left w/o TACACS credentials.
Added a service to trigger 5 minutes after boot and restore TACACS, if /etc/sonic/old_config/tacacs.json is present.

How I did it
By adding a service, that would fire 5 mins after boot.
This service apply tacacs if available.

How to verify it
Upgrade and watch status of tacacs.timer & tacacs.service
You may create /etc/sonic/old_config/tacacs.json, with updated credentials
(before 5mins after boot) and see that appears in config & persisted too.

Which release branch to backport (provide reason below if selected)
 201911
 202006
 202012
2021-06-07 06:02:32 +00:00
mssonicbld
1e9cb30008
[ci/build]: Upgrade SONiC package versions (#7770) 2021-06-05 14:03:04 +00:00
mssonicbld
eddce4d58b
[ci/build]: Upgrade SONiC package versions (#7755) 2021-05-31 13:02:58 +00:00
yozhao101
3af05fdffe [Monit] Restart telemetry container if memory usage is beyond the threshold (#7645)
Signed-off-by: Yong Zhao yozhao@microsoft.com

Why I did it
This PR aims to monitor the memory usage of streaming telemetry container and restart streaming telemetry container if memory usage is larger than the pre-defined threshold.

How I did it
I borrowed the system tool Monit to run a script memory_checker which will periodically check the memory usage of streaming telemetry container. If the memory usage of telemetry container is larger than the pre-defined threshold for 10 times during 20 cycles, then an alerting message will be written into syslog and at the same time Monit will run the script restart_service to restart the streaming telemetry container.

How to verify it
I verified this implementation on device str-7260cx3-acs-1.
2021-05-31 04:38:18 +00:00
mssonicbld
71d4b17ad0
[ci/build]: Upgrade SONiC package versions (#7746) 2021-05-29 14:22:37 +00:00
Prince Sunny
7a816ed5fa [Mux] Do not clean-up HW_MUX_CABLE_TABLE from State DB (#7710)
Co-authored-by: Ubuntu <prsunny@prince-vm.vzw1i4tqyeburcdz5lrgulxi2c.yx.internal.cloudapp.net>
2021-05-27 22:30:06 +00:00
Lawrence Lee
cb3a9eec58 [swss.service]: Remove ordering with pmon (#7614)
Signed-off-by: Lawrence Lee <lawlee@microsoft.com>
2021-05-27 22:29:35 +00:00
Alexander Allen
bd6096a018 [ntp] Fix ntp.conf template to allow setting of source port in CONFIG_DB (#7586)
Why I did it
Currently, there is a bug in the ntp.conf jinja2 template where it will ignore the src_intf directive in CONFIG_DB if there are multiple IP addresses associated with an interface. This code change fixes that bug and allows the template to select the correct source interface for NTP.

How I did it
I did this by modifying the macro in ntp.conf.j2 which determines if there is an ip address associated with an interface to set a state variable when it detects a valid interface entry in CONFIG_DB instead of outputting "true" directly (which could result in multiple "trues" outputted for interfaces with multiple valid IP addresses).

How to verify it
Add two ipv4 addresses to an interface in SONiC

Add the following configuration to config_db.json

{
"NTP": {
    "global": {
        "src_intf": "Ethernet1"
        }
    }
}
Replace Ethernet1 with the interface name of the one you assigned the IP addresses to.

Run sudo config reload -y

Open /etc/ntp.conf and verify that the following line exists

...
interface listen Ethernet1
...
The interface specified should be the one set in the previous steps.

Description for the changelog
[ntp] Fix ntp.conf template to allow setting of source port in CONFIG_DB
2021-05-27 22:29:01 +00:00
Renuka Manavalan
53b3d378c7 Invoke disk check periodically. (#7374)
Why I did it
Helps with periodic scan of disk for RO state.
If found, this script makes transient fix and raise error message.
2021-05-27 22:28:44 +00:00
gechiang
5d29a2c2a6
[202012] BRCM SAI 4.3.3.5-3 Enable VFP-based subintf on td2 (#7728)
Why I did it
This SAI change is to Enable TD2 platforms to also be able to handle VFP-based sub-interfaces.
This is the feature that is also needed for TD2 platforms.

How to verify it
Guohan and Team has validated this feature change on TD2 platforms
2021-05-27 15:11:03 -07:00
mssonicbld
5b5131f26d
[ci/build]: Upgrade SONiC package versions (#7701) 2021-05-26 13:58:58 +00:00
shlomibitton
c53f58e488 Remove 'vm.panic_on_oom=1' (#7678)
#### Why I did it
If a process limits using nodes by mempolicy/cpusets, and those nodes become memory exhaustion status, one process may be killed by oom-killer.
No panic occurs in this case, because other node's memory may be free.
This means system total status may be not fatal yet.

#### How I did it
Remove 'vm.panic_on_oom=1' kernel flag from 'vmcore-sysctl.conf '
2021-05-26 02:41:02 +00:00
Neetha John
cb930c30cd [qos]: modify dot1p to tc mapping (#7661)
Map priority 0 to TC 1 and priority 1 to TC 0

Send traffic on priority 0 and 1 and verified that it gets mapped correctly in hw

Signed-off-by: Neetha John <nejo@microsoft.com>
2021-05-24 22:25:47 +00:00
mssonicbld
a9af05d1a7
[ci/build]: Upgrade SONiC package versions (#7686) 2021-05-24 13:02:24 +00:00
mssonicbld
460e150187
[ci/build]: Upgrade SONiC package versions (#7664) 2021-05-22 13:14:06 +00:00
mssonicbld
839527def8
[ci/build]: Upgrade SONiC package versions (#7659) 2021-05-20 14:05:56 +00:00
Sujin Kang
d1043e3c91 add config-setup.service as dependency for pcie-check.service (#7599)
Why I did it
start pcie-check.service after config-setup.service since pcie_util depends on device_info which is available with config db metadata.

How I did it
Add config-setup.service as a dependency of pcie-check.service

How to verify it
Upon reboot, check if the pcie-check.sh throws the platform api error which is dependent on DEVICE_METADATA
2021-05-19 18:14:19 +00:00
mssonicbld
1461c9f98f
[ci/build]: Upgrade SONiC package versions (#7613) 2021-05-19 14:02:10 +00:00
gechiang
3eaadec21c
[202012] BRCM SAI 4.3.3.5-2 picked up BRCM hsdk-6.5.21-p1.patch and CS00012097141 fix (#7581)
This is to pick up BRCM SAI 4.3.3.5-2 which contains 2 main changes:
1.  hsdk-6.5.21-p1.patch (to address some field problems related to SDK issue.
2. Fix for CS00012097141 (remove some unnecessary debug setting that was causing non functional impacting problem at boot time)

Preliminary tests looks fine. BGP neighbors were all up with proper routes programmed
interfaces are all up
Manually ran the following test cases on S6100 DUT and all passed:
```
     ipfwd/test_dir_bcast.py
     fib/test_fib.py
     vxlan/test_vxlan_decap.py
     decap/test_decap.py
     fdb/test_fdb.py
```
2021-05-12 07:56:46 -07:00
mssonicbld
c8ee27319b
[ci/build]: Upgrade SONiC package versions (#7589) 2021-05-12 14:11:06 +00:00
mssonicbld
aac5c04d91
[ci/build]: Upgrade SONiC package versions (#7583) 2021-05-12 07:19:54 +00:00
Qi Luo
262d0948f1
[build]: Update the netifaces pip3 package version (#7574)
202012 branch is using reproducible build. The versions-py3 file must contains correct pip3 package version. Otherwise we will get a build error

```
Step 31/74 : RUN pip3 install          scapy==2.4.4          pyroute2==0.5.14          netifaces==0.10.9
 ---> Running in d7a2401dd21d
Collecting scapy==2.4.4
  Downloading scapy-2.4.4.tar.gz (1.0 MB)
Collecting pyroute2==0.5.14
  Downloading pyroute2-0.5.14.tar.gz (873 kB)
ERROR: Cannot install netifaces==0.10.9 because these package versions have conflicting dependencies.

The conflict is caused by:
    The user requested netifaces==0.10.9
    The user requested (constraint) netifaces==0.10.7

To fix this you could try to:
1. loosen the range of package versions you've specified
2. remove package versions to allow pip attempt to solve the dependency conflict

ERROR: ResolutionImpossible: for help visit https://pip.pypa.io/en/latest/user_guide/#fixing-conflicting-dependencies
The command '/bin/sh -c pip3 install          scapy==2.4.4          pyroute2==0.5.14          netifaces==0.10.9' returned a non-zero code: 1
[  FAIL LOG END  ] [ target/docker-sonic-vs.gz ]
```
2021-05-11 09:19:52 -07:00
mssonicbld
a3980f948a
[ci/build]: Upgrade SONiC package versions (#7576) 2021-05-11 14:35:57 +00:00
Renuka Manavalan
99958304c1 [container_checker] Use Feature table to get running containers (#7474)
Why I did it
Finding running containers through "docker ps" breaks when kubernetes deploys container, as the names are mangled.

How I did it
The data is is available from FEATURE table, which takes care of kubernetes deployment too.

How to verify it
Deploy a feature via kubernetes and don't expect error from container_check.
2021-05-10 15:59:57 -07:00
mssonicbld
b95f86722c
[ci/build]: Upgrade SONiC package versions (#7540) 2021-05-09 04:44:39 +00:00