ONAP Deployment FAQ
Manually mount a volume
Persistence: manually add the volume section to the deployment (NFS mode).
spec:
  containers:
  - image: hub.baidubce.com/duanshuaixing/tools:v3
    imagePullPolicy: IfNotPresent
    name: test-volume
    resources: {}
    terminationMessagePath: /dev/termination-log
    terminationMessagePolicy: File
    volumeMounts:
    - mountPath: /root/
      name: nfs-test
  dnsPolicy: ClusterFirst
  restartPolicy: Always
  schedulerName: default-scheduler
  securityContext: {}
  terminationGracePeriodSeconds: 30
  volumes:
  - name: nfs-test
    nfs:
      path: /dockerdata-nfs/test-volume/
      server: 10.0.0.7
Restart the node to check the nfs automount
Restart the node and check whether the NFS client auto-mounts the share; if it does not, mount it manually.
df -Th |grep nfs
sudo mount $MASTER_IP:/dockerdata-nfs /dockerdata-nfs/
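To make the mount survive reboots, an /etc/fstab entry along these lines can be added (a sketch based on the command above; replace $MASTER_IP with the actual NFS server address):
echo "$MASTER_IP:/dockerdata-nfs /dockerdata-nfs nfs auto,nofail,noatime,nolock,intr,tcp,actimeo=1800 0 0" | sudo tee -a /etc/fstab
sudo mount -a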
Reinstall One Project
1. Delete a module (take so as an example):
helm delete dev-so --purge
2. If the delete fails, manually delete the leftover pvc, pv, deployment, configmap, statefulset and job resources (see the sketch after this list).
3. Install the module:
cd oom/kubernetes
make so
make onap
helm install local/so --namespace onap --name dev-so
or, when using a docker proxy repository:
helm install local/so --namespace onap --name dev-so --set global.repository=172.30.1.66:10001
To use a proxy repository and also define an image pull policy for the module:
helm install local/so --namespace onap --name dev-so --set global.repository=172.30.1.66:10001 --set so.pullPolicy=IfNotPresent
4. Clear the /dockerdata-nfs/dev-so directory (it can be moved to a /bak directory instead).
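A sketch of the manual cleanup in step 2, assuming the resources carry the OOM release label (labels and resource names may differ per chart):
kubectl -n onap delete deployment,statefulset,job,configmap,service,pvc -l release=dev-so
kubectl delete pv -l release=dev-so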
Helm has no deploy parameter
If helm does not recognize the deploy parameter, copy the OOM helm plugins into the helm home directory:
cp -R ~/oom/kubernetes/helm/plugins/ ~/.helm/
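A quick check that the plugin is now available (helm 2.x):
helm plugin list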
Helm list shows no release
cp /root/oom/kubernetes/onap/values.yaml /root/integration-override.yaml
helm deploy dev local/onap -f /root/oom/kubernetes/onap/resources/environments/public-cloud.yaml -f /root/integration-override.yaml --namespace onap --verbose
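After the deploy finishes, the releases should be listed again:
helm list | grep dev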
Force delete all pods
kubectl delete pod $(kubectl get pod -n onap | awk '{print $1}') -n onap --grace-period=0 --force
Copy file to pod
When copying a file from local to a pod, there is a problem with specifying the destination path.
This can be worked around temporarily by installing the lrzsz tool, or by running the docker cp command on the node where the pod is located.
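A sketch of the docker cp workaround (the pod/container names and paths are examples):
kubectl get pod -n onap -o wide | grep uui-server          # find the node hosting the pod
docker ps | grep uui-server                                # on that node, find the container ID
docker cp ./local-file.txt <container_id>:/tmp/            # copy the file into the container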
Check the port exposed by the pod
1. Find the node where the pod runs:
kubectl get pod -n onap -o wide | grep uui-server
2. Check the type of the pod's controller (a ReplicaSet corresponds to a deployment, a StatefulSet to a statefulset):
kubectl -n onap describe pod dev-uui-uui-server-67fc49b6d9-szr7t | grep Controlled
3. Check the service corresponding to the pod:
kubectl get svc -n onap | grep uui-server
4. Access the pod through the floating IP of its node and the 30000+ NodePort (see the example after this list).
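For example, supposing the service output shows PORT(S) like 8443:30283/TCP, 30283 is the NodePort (the port and address here are hypothetical):
kubectl get svc -n onap | grep uui-server
curl -k https://<node_floating_ip>:30283/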
Check pod through the port
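Given a 30000+ port, the owning service and its pods can be found roughly like this (the port and service name are examples):
kubectl get svc -n onap | grep 30283                 # find the service exposing the NodePort
kubectl get pod -n onap -o wide | grep uui-server    # then list the pods behind that service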
Can't start ansible-server
The ansible-server problem is caused by DNS names not being resolved; it can be solved by deploying the DNS configmap.
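A sketch of what kube-dns-configmap.yaml might contain, assuming the fix is pointing kube-dns at an upstream nameserver (the address is an example):
cat <<'EOF' > kube-dns-configmap.yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: kube-dns
  namespace: kube-system
data:
  upstreamNameservers: |
    ["10.0.0.2"]
EOF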
kubectl replace -f kube-dns-configmap.yaml
Close the health check to avoid restarting
Delete or comment out the probe code in the deployment or statefulset; the pod will restart after the operation (see the sketch below).
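The probe sections to remove usually look something like this (a sketch; the deployment name, ports and timings are examples that vary per chart):
kubectl -n onap edit deployment dev-uui-uui-server
# then delete or comment out the probe blocks, e.g.:
#   livenessProbe:
#     tcpSocket:
#       port: 8080
#     initialDelaySeconds: 120
#     periodSeconds: 10
#   readinessProbe:
#     tcpSocket:
#       port: 8080
#     initialDelaySeconds: 10
#     periodSeconds: 10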
Restart the container on the node to check whether a new file is lost when the pod health check is enabled/disabled
a. With the health check enabled in the deployment, add a test file to the pod and restart the container on the node.
Conclusion: after the container is restarted on the node, a new container is created and the original test file in the pod is lost.
b. With the health check disabled in the deployment, add a test file to the pod and restart the container on the node.
Conclusion: when the container is restarted, stopped and started, the data in the pod is not lost.
500 error when SDC distributes a package
Try to restart/reinstall dmaap. Before restarting or reinstalling, delete the dev-dmaap directory in NFS. If the error still happens, try to restart/reinstall SDC.
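A sketch of the dmaap reinstall, following the same pattern as the reinstall section above (release name dev-dmaap assumed):
helm delete dev-dmaap --purge
rm -rf /dockerdata-nfs/dev-dmaap
cd oom/kubernetes && make dmaap && make onap
helm install local/dmaap --namespace onap --name dev-dmaap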
SDC pod can't start
There are dependencies between the SDC pods. The pod that ultimately affects the other pods is dev-sdc-sdc-cs.
If SDC is redeployed, manually remove /dockerdata-nfs/dev-sdc/
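To see which SDC pod is blocking the others (dev-sdc-sdc-cs must be up before the rest can start):
kubectl -n onap get pod | grep sdc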
Sdnc-dgbuilder pod can't start
holmes doesn't install automatically
dmaap restart sequence
Start dmaap, zookeeper, kafka and message router in sequence, with an interval of about 1 minute between each.
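A sketch of restarting the components in order by deleting their pods so the controllers recreate them (the pod names are placeholders to look up first):
kubectl -n onap get pod | grep -E 'dmaap|message-router'
kubectl -n onap delete pod <dmaap-pod> ; sleep 60
kubectl -n onap delete pod <zookeeper-pod> ; sleep 60
kubectl -n onap delete pod <kafka-pod> ; sleep 60
kubectl -n onap delete pod <message-router-pod>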
dev-consul-consul takes up a lot of disk space
Problem: Node disk alarm
Troubleshooting: checking /var/lib/docker/ disk usage with du -hs *, the problem turned out to be the relatively large disk usage under this directory:
/var/lib/docker/aufs/diff/b759b23cb79cff6cecdf0e44f7d9a1fb03db018f0c5c48696edcf7e23e2d045b/home/consul/.kube/http-cache/.diskv-temp/
With kubectl -n onap get pod -o wide | grep consul, confirm the pod is dev-consul-consul-6d7675f5b5-sxrmq, and double-check by kubectl exec into this pod.
Solution: delete all the files under /home/consul/.kube/http-cache/.diskv-temp/ in the pod.
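For example (pod name taken from above; the availability of sh in the container is an assumption):
kubectl -n onap exec dev-consul-consul-6d7675f5b5-sxrmq -- sh -c 'rm -rf /home/consul/.kube/http-cache/.diskv-temp/*'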
Can't delete statefulset
If the kubectl version is 1.8.0, it needs to be upgraded to kubectl 1.9.0 or above.
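To check the client version in use:
kubectl version --short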
Rollback after image update
Update the image versions in oom using the docker-manifest.csv from the integration repo.
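A rough sketch of the manual flow (the so component and file locations are examples; the manifest path may differ per branch):
grep 'onap/so' docker-manifest.csv        # in the integration repo, look up the released image tag
vi ~/oom/kubernetes/so/values.yaml        # set the chart's image tag back to that version
# then rebuild and redeploy the component as in the "Reinstall One Project" section above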
Delete ONAP
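A hedged sketch of a full teardown, assuming the release name dev and the NFS path used in the examples above:
helm delete dev --purge                 # or helm undeploy dev --purge with the OOM plugin
kubectl delete namespace onap
rm -rf /dockerdata-nfs/*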
Missing svc or yaml configuration file
Calling multiple k8s API objects at one time causes a stuck problem
Problem: when creating 60 services at a time, the kubectl command gets stuck during execution.
Reason: the Java process in the rancher server hits an OOM (out of memory) problem.
Temporary solution: manipulate the API objects in batches.
Permanent solution: increase the memory limit in the Java parameters.
Specify the Xmx value when installing rancher server; the default is 4096M and it can be increased to 8192M.
docker run -d --restart=unless-stopped -e JAVA_OPTS="-Xmx8192m" -p 8080:8080 --name rancher_server rancher/server:v$RANCHER_VERSION
Filter image version
Filter the image versions in oom/kubernetes (take VFC as an example)
grep -r -E 'image|Image:' ~/oom/kubernetes/|awk '{print $2}'|grep onap|grep vfc
Service Port configuration