
Scripted undercloud(Helm/Kubernetes/Docker) and ONAP install - Single VM

ONAP on deployed by or RKE managed by on






VMs: Amazon AWS, Microsoft Azure, Google Compute, OpenStack

Managed: Amazon EKS, AKS

Sponsor

Amazon (384G/m - 201801 to 201808) - thank you

Michael O'Brien (201705-201905)

Amdocs - 201903+
michael - 201905+




Microsoft (201801+)

Amdocs

Intel/Windriver

(2017-)

This is a private page under daily continuous modification to keep it relevant as a live reference (don't edit it unless something is really wrong)

https://twitter.com/_mikeobrien | https://www.linkedin.com/in/michaelobrien-developer/
http://wiki.obrienlabs.cloud/display/DEV/Architecture

For general support, consult the official documentation at http://onap.readthedocs.io/en/latest/submodules/oom.git/docs/oom_quickstart_guide.html and https://onap.readthedocs.io/en/beijing/submodules/oom.git/docs/oom_cloud_setup_guide.html, and raise DOC JIRAs for any modifications they require.

This page details deployment of ONAP on any environment that supports Kubernetes based containers.

Chat:  http://onap-integration.eastus.cloudapp.azure.com:3000/group/onap-integration

Separate namespaces - to avoid the 1MB configmap limit - or just helm install/delete everything (no helm upgrade)

OOM Helm (un)Deploy plugins

https://kubernetes.slack.com/messages/C09NXKJKA/?

https://d1.awsstatic.com/whitepapers/architecture/AWS_Well-Architected_Framework.pdf

Deployment Profile

28 pods, 196 pods including vvp without the filebeat sidecars - 20181130 - this number is when all replicaSets and DaemonSets are set to 1 - which is 241 instances in the clustered case 

Docker images currently total up to 75G as of 20181230

After a docker_prepull.sh

/dev/sda1      389255816 77322824 311916608  20% /
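
The df line above reports 77322824 used 1K-blocks, which is roughly 73.7 GiB and in line with the 75G image figure. A minimal sketch of that conversion follows; the function names kb_to_gib and used_gib are illustrative and not part of any ONAP script.

```shell
#!/usr/bin/env bash
# Convert a 1K-block count (as printed by df) to GiB with one decimal place.
kb_to_gib() {
  awk -v kb="$1" 'BEGIN { printf "%.1f\n", kb / 1048576 }'
}

# Print the used space of a mount point in GiB (defaults to /).
used_gib() {
  local mount="${1:-/}"
  # df -P prints one line per filesystem; with -k column 3 is "Used" in 1K blocks.
  kb_to_gib "$(df -Pk "$mount" | awk 'NR == 2 { print $3 }')"
}
```

Running `used_gib /` before and after docker_prepull.sh gives a quick view of how much disk the pulled images consume.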


Type: Full Cluster (14 + 1) - recommended
VMs: 15
Total RAM / vCores / HD: 224G / 112 vC / 100G/VM
VM Flavor: 16G, 8 vCores (C5.2xLarge)
Deployed Total RAM: 187Gb
Deployed ONAP RAM: 102Gb
Pods: 28
Containers: 248 total, 241 onap, 217 up, 0 error, 24 config
vCores (max/idle): 18
HD/VM: 6+G master, 14 to 50 +G slave(s)
HD NFS only: 8.1G
Date: 20181106
Cost: $1.20 US/hour using the spot market
Branch: C

Type: Single VM (possible - not recommended)
VMs: 1
Total RAM / vCores / HD: 432G / 64 vC / 180G
VM Flavor: 256G+, 32+ vCores
K8S/Rancher Idle RAM: Rancher: 13G, Kubernetes: 8G, Top: 10G
Deployed Total RAM: 165Gb (after 24h)
Deployed ONAP RAM: 141Gb
Pods: 28
Containers: 240 total, 233 onap, 200 up, 6 error, 38 config (196 if RS and DS are set to 1)
vCores (max/idle): 55 / 22
HD/VM: 131G (including 75G dockers)
HD NFS only: n/a
IOPS: Max: 550/sec, Idle: 220/sec
Date: 20181105, 20180101
Branch: C
Notes (deployment post 75min): Tested on 432G/64vCore azure VM - R 1.6.22 K8S 1.11, updated 20190101

Type: Developer 1-n pods
VMs: 1
Total RAM / vCores / HD: 16G / 4 vC / 100G
VM Flavor: 16/32G, 4-16 vCores
Deployed Total RAM: 14Gb
Deployed ONAP RAM: 10Gb
Pods: 3+
HD/VM: 120+G
HD NFS only: n/a
Branch: C
Notes (deployment post 75min): AAI+robot only

Security

The VM should be open with no CIDR rules - but lock down 10249-10255 with RBAC

If connecting to your rancher server fails with "dial tcp 127.0.0.1:8880: getsockopt: connection refused", the cause is usually security related. This line is the first to fail, for example:

https://git.onap.org/logging-analytics/tree/deploy/rancher/oom_rancher_setup.sh#n117 

Check the server first. If "helm version" hangs on "Server", the ports have an issue: run with all TCP/UDP ports open to 0.0.0.0/0 and ::/0, and lock down the API on 10249-10255 via GitHub OAuth security from the rancher console to keep out crypto miners.

Example 15 node (1 master + 14 nodes) OOM Deployment

Rancher 1.6.25, Kubernetes 1.11.5, Docker 17.03, Helm 2.9.1

empty

With ONAP deployed

Throughput and Volumetrics

Cloudwatch CPU Average

Specific to logging: any VM that contains AAI has a problem, because the logstash container is being saturated there. See the VM at 30+ percent CPU (LOG-376).

NFS Throughput for /dockerdata-nfs

Cloudwatch Network In Max

Cost

Using the spot market on AWS, we ran up a bill of $10 for 8 hours of 15 C5.2xLarge VMs (includes EBS but not DNS or EFS/NFS)
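
As a worked check of the bill above: $10 over 8 hours is $1.25 per hour for the whole cluster, close to the $1.20 US/hour listed in the deployment profile, and about $0.083 per VM-hour across 15 VMs. A small sketch of the arithmetic follows; hourly_cost is an illustrative helper name, not an existing tool.

```shell
#!/usr/bin/env bash
# usage: hourly_cost TOTAL_USD HOURS [VM_COUNT]
# Prints the cost per hour, or per VM-hour when VM_COUNT is given.
hourly_cost() {
  awk -v total="$1" -v hours="$2" -v vms="${3:-1}" \
    'BEGIN { printf "%.3f\n", total / hours / vms }'
}
```

For example, `hourly_cost 10 8` gives the cluster rate and `hourly_cost 10 8 15` gives the per-VM rate.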


Details: 20181106:1800 EDT master

ubuntu@ip-172-31-40-250:~$ kubectl get pods --all-namespaces | wc -l
248
ubuntu@ip-172-31-40-250:~$ kubectl get pods --all-namespaces | grep onap | wc -l
241
ubuntu@ip-172-31-40-250:~$ kubectl get pods --all-namespaces | grep onap | grep -E '1/1|2/2' | wc -l
217
ubuntu@ip-172-31-40-250:~$ kubectl get pods --all-namespaces | grep onap | grep -E '0/|1/2' | wc -l
24
ubuntu@ip-172-31-40-250:~$ kubectl get pods --all-namespaces -o wide | grep onap | grep -E '0/|1/2' 
onap          onap-aaf-aaf-sms-preload-lvqx9                                 0/1       Completed          0          4h        10.42.75.71     ip-172-31-37-59.us-east-2.compute.internal    <none>
onap          onap-aaf-aaf-sshsm-distcenter-ql5f8                            0/1       Completed          0          4h        10.42.75.223    ip-172-31-34-207.us-east-2.compute.internal   <none>
onap          onap-aaf-aaf-sshsm-testca-7rzcd                                0/1       Completed          0          4h        10.42.18.37     ip-172-31-34-111.us-east-2.compute.internal   <none>
onap          onap-aai-aai-graphadmin-create-db-schema-26pfs                 0/1       Completed          0          4h        10.42.14.14     ip-172-31-37-59.us-east-2.compute.internal    <none>
onap          onap-aai-aai-traversal-update-query-data-qlk7w                 0/1       Completed          0          4h        10.42.88.122    ip-172-31-36-163.us-east-2.compute.internal   <none>
onap          onap-contrib-netbox-app-provisioning-gmmvj                     0/1       Completed          0          4h        10.42.111.99    ip-172-31-41-229.us-east-2.compute.internal   <none>
onap          onap-contrib-netbox-app-provisioning-n6fw4                     0/1       Error              0          4h        10.42.21.12     ip-172-31-36-163.us-east-2.compute.internal   <none>
onap          onap-contrib-netbox-app-provisioning-nc8ww                     0/1       Error              0          4h        10.42.109.156   ip-172-31-41-110.us-east-2.compute.internal   <none>
onap          onap-contrib-netbox-app-provisioning-xcxds                     0/1       Error              0          4h        10.42.152.223   ip-172-31-39-138.us-east-2.compute.internal   <none>
onap          onap-dmaap-dmaap-dr-node-6496d8f55b-jfvrm                      0/1       Init:0/1           28         4h        10.42.95.32     ip-172-31-38-194.us-east-2.compute.internal   <none>
onap          onap-dmaap-dmaap-dr-prov-86f79c47f9-tldsp                      0/1       CrashLoopBackOff   59         4h        10.42.76.248    ip-172-31-34-207.us-east-2.compute.internal   <none>
onap          onap-oof-music-cassandra-job-config-7mb5f                      0/1       Completed          0          4h        10.42.38.249    ip-172-31-41-110.us-east-2.compute.internal   <none>
onap          onap-oof-oof-has-healthcheck-rpst7                             0/1       Completed          0          4h        10.42.241.223   ip-172-31-39-138.us-east-2.compute.internal   <none>
onap          onap-oof-oof-has-onboard-5bd2l                                 0/1       Completed          0          4h        10.42.205.75    ip-172-31-38-194.us-east-2.compute.internal   <none>
onap          onap-portal-portal-db-config-qshzn                             0/2       Completed          0          4h        10.42.112.46    ip-172-31-45-152.us-east-2.compute.internal   <none>
onap          onap-portal-portal-db-config-rk4m2                             0/2       Init:Error         0          4h        10.42.57.79     ip-172-31-38-194.us-east-2.compute.internal   <none>
onap          onap-sdc-sdc-be-config-backend-2vw2q                           0/1       Completed          0          4h        10.42.87.181    ip-172-31-39-138.us-east-2.compute.internal   <none>
onap          onap-sdc-sdc-be-config-backend-k57lh                           0/1       Init:Error         0          4h        10.42.148.79    ip-172-31-45-152.us-east-2.compute.internal   <none>
onap          onap-sdc-sdc-cs-config-cassandra-vgnz2                         0/1       Completed          0          4h        10.42.111.187   ip-172-31-34-111.us-east-2.compute.internal   <none>
onap          onap-sdc-sdc-es-config-elasticsearch-lkb9m                     0/1       Completed          0          4h        10.42.20.202    ip-172-31-39-138.us-east-2.compute.internal   <none>
onap          onap-sdc-sdc-onboarding-be-cassandra-init-7zv5j                0/1       Completed          0          4h        10.42.218.1     ip-172-31-41-229.us-east-2.compute.internal   <none>
onap          onap-sdc-sdc-wfd-be-workflow-init-q8t7z                        0/1       Completed          0          4h        10.42.255.91    ip-172-31-41-30.us-east-2.compute.internal    <none>
onap          onap-vid-vid-galera-config-4f274                               0/1       Completed          0          4h        10.42.80.200    ip-172-31-33-223.us-east-2.compute.internal   <none>
onap          onap-vnfsdk-vnfsdk-init-postgres-lf659                         0/1       Completed          0          4h        10.42.238.204   ip-172-31-38-194.us-east-2.compute.internal   <none>

ubuntu@ip-172-31-40-250:~$ kubectl get nodes -o wide
NAME                                          STATUS    ROLES     AGE       VERSION            INTERNAL-IP      EXTERNAL-IP      OS-IMAGE             KERNEL-VERSION   CONTAINER-RUNTIME
ip-172-31-33-223.us-east-2.compute.internal   Ready     <none>    5h        v1.11.2-rancher1   18.222.148.116   18.222.148.116   Ubuntu 16.04.1 LTS   4.4.0-1049-aws   docker://17.3.2
ip-172-31-34-111.us-east-2.compute.internal   Ready     <none>    5h        v1.11.2-rancher1   3.16.37.170      3.16.37.170      Ubuntu 16.04.1 LTS   4.4.0-1049-aws   docker://17.3.2
ip-172-31-34-207.us-east-2.compute.internal   Ready     <none>    5h        v1.11.2-rancher1   18.225.32.201    18.225.32.201    Ubuntu 16.04.1 LTS   4.4.0-1049-aws   docker://17.3.2
ip-172-31-36-163.us-east-2.compute.internal   Ready     <none>    5h        v1.11.2-rancher1   13.58.189.251    13.58.189.251    Ubuntu 16.04.1 LTS   4.4.0-1049-aws   docker://17.3.2
ip-172-31-37-24.us-east-2.compute.internal    Ready     <none>    5h        v1.11.2-rancher1   18.224.180.26    18.224.180.26    Ubuntu 16.04.1 LTS   4.4.0-1049-aws   docker://17.3.2
ip-172-31-37-59.us-east-2.compute.internal    Ready     <none>    5h        v1.11.2-rancher1   18.191.248.14    18.191.248.14    Ubuntu 16.04.1 LTS   4.4.0-1049-aws   docker://17.3.2
ip-172-31-38-194.us-east-2.compute.internal   Ready     <none>    4h        v1.11.2-rancher1   18.217.45.91     18.217.45.91     Ubuntu 16.04.1 LTS   4.4.0-1049-aws   docker://17.3.2
ip-172-31-38-95.us-east-2.compute.internal    Ready     <none>    4h        v1.11.2-rancher1   52.15.39.21      52.15.39.21      Ubuntu 16.04.1 LTS   4.4.0-1049-aws   docker://17.3.2
ip-172-31-39-138.us-east-2.compute.internal   Ready     <none>    4h        v1.11.2-rancher1   18.224.199.40    18.224.199.40    Ubuntu 16.04.1 LTS   4.4.0-1049-aws   docker://17.3.2
ip-172-31-41-110.us-east-2.compute.internal   Ready     <none>    4h        v1.11.2-rancher1   18.223.151.180   18.223.151.180   Ubuntu 16.04.1 LTS   4.4.0-1049-aws   docker://17.3.2
ip-172-31-41-229.us-east-2.compute.internal   Ready     <none>    5h        v1.11.2-rancher1   18.218.252.13    18.218.252.13    Ubuntu 16.04.1 LTS   4.4.0-1049-aws   docker://17.3.2
ip-172-31-41-30.us-east-2.compute.internal    Ready     <none>    4h        v1.11.2-rancher1   3.16.113.3       3.16.113.3       Ubuntu 16.04.1 LTS   4.4.0-1049-aws   docker://17.3.2
ip-172-31-42-33.us-east-2.compute.internal    Ready     <none>    5h        v1.11.2-rancher1   13.59.2.86       13.59.2.86       Ubuntu 16.04.1 LTS   4.4.0-1049-aws   docker://17.3.2
ip-172-31-45-152.us-east-2.compute.internal   Ready     <none>    4h        v1.11.2-rancher1   18.219.56.50     18.219.56.50     Ubuntu 16.04.1 LTS   4.4.0-1049-aws   docker://17.3.2
ubuntu@ip-172-31-40-250:~$ kubectl top nodes
NAME                                          CPU(cores)   CPU%      MEMORY(bytes)   MEMORY%   
ip-172-31-33-223.us-east-2.compute.internal   852m         10%       13923Mi         90%       
ip-172-31-34-111.us-east-2.compute.internal   1160m        14%       11643Mi         75%       
ip-172-31-34-207.us-east-2.compute.internal   1101m        13%       7981Mi          51%      
ip-172-31-36-163.us-east-2.compute.internal   656m         8%        13377Mi         87%       
ip-172-31-37-24.us-east-2.compute.internal    401m         5%        8543Mi          55%       
ip-172-31-37-59.us-east-2.compute.internal    711m         8%        10873Mi         70%       
ip-172-31-38-194.us-east-2.compute.internal   1136m        14%       8195Mi          53%       
ip-172-31-38-95.us-east-2.compute.internal    1195m        14%       9127Mi          59%       
ip-172-31-39-138.us-east-2.compute.internal   296m         3%        10870Mi         70%       
ip-172-31-41-110.us-east-2.compute.internal   2586m        32%       10950Mi         71%       
ip-172-31-41-229.us-east-2.compute.internal   159m         1%        9138Mi          59%       
ip-172-31-41-30.us-east-2.compute.internal    180m         2%        9862Mi          64%       
ip-172-31-42-33.us-east-2.compute.internal    1573m        19%       6352Mi          41%       
ip-172-31-45-152.us-east-2.compute.internal   1579m        19%       10633Mi         69%  
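
The pod counts in the transcript above rely on grep patterns such as '1/1|2/2', which miss pods with three or more containers. A hedged alternative is to compare the two halves of the READY column directly. The sketch below assumes the default `kubectl get pods --all-namespaces` column layout (namespace first, READY third); summarize_pods is an illustrative name.

```shell
#!/usr/bin/env bash
# Reads `kubectl get pods --all-namespaces` output on stdin and prints
# "total=<n> up=<n> not_ready=<n>" for pods in the given namespace (default: onap).
# Completed jobs show READY 0/1 and are therefore counted as not_ready.
summarize_pods() {
  local ns="${1:-onap}"
  awk -v ns="$ns" '
    $1 == ns {
      total++
      split($3, ready, "/")          # READY column, e.g. "1/2"
      if (ready[1] == ready[2]) up++
    }
    END { printf "total=%d up=%d not_ready=%d\n", total, up, total - up }
  '
}
```

Usage: `kubectl get pods --all-namespaces | summarize_pods onap`. Unlike `grep onap`, this matches on the namespace column only, so pods in other namespaces whose names contain "onap" are not counted.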


Quickstart

Undercloud Install - Rancher/Kubernetes/Helm/Docker

Ubuntu 16.04 Host VM Configuration

key | value



Redhat 7.6 Host VM Configuration

see https://gerrit.onap.org/r/#/c/77850/

key | value
firewalld off | systemctl disable firewalld
git, make, python | yum install git; yum groupinstall 'Development Tools'
IPv4 forwarding | add to /etc/sysctl.conf: net.ipv4.ip_forward = 1
Networking enabled | sudo vi /etc/sysconfig/network-scripts/ifcfg-ens33 with ONBOOT=yes

https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux_atomic_host/7/html-single/getting_started_with_kubernetes/index
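
The Redhat settings above can be collected into one script. The following is a sketch only, not the script from the gerrit change: the function names, the DRY_RUN switch, the extra `systemctl stop` and the `-y` flags are assumptions added for unattended use. The ONBOOT=yes edit is left out because the interface name (ifcfg-ens33) is host specific.

```shell
#!/usr/bin/env bash
# Set DRY_RUN=1 to print the commands instead of executing them.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

prepare_rhel_host() {
  # firewalld off (stop is an addition here; the table lists only disable)
  run sudo systemctl stop firewalld
  run sudo systemctl disable firewalld
  # git, make, python
  run sudo yum install -y git
  run sudo yum groupinstall -y "Development Tools"
  # IPv4 forwarding: append to /etc/sysctl.conf only if not already set, then reload
  run sudo sh -c 'grep -q "^net.ipv4.ip_forward" /etc/sysctl.conf || echo "net.ipv4.ip_forward = 1" >> /etc/sysctl.conf'
  run sudo sysctl -p
}
```

Running `DRY_RUN=1 prepare_rhel_host` first shows exactly what would be changed on the host.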

General Host VM Configuration

Follow https://git.onap.org/logging-analytics/tree/deploy/rancher/oom_rancher_setup.sh

Run the following script on a clean Ubuntu 16.04 or Redhat RHEL 7.x (7.6) VM anywhere - it will provision and register your kubernetes system as a collocated master/host.

Ideally, install a clustered set of hosts separate from the master VM. You can do this by deleting the host from the cluster after it is installed below, then installing docker, nfs and the rancher agent docker on each host.

vm.max_map_count 64 to 256kb limit

The cd.sh script fixes your VM for this limitation, first found in LOG-334. If you don't run the cd.sh script, run the following command manually on each VM so that any elasticsearch container comes up properly. This is a base OS issue.

https://git.onap.org/logging-analytics/tree/deploy/cd.sh#n49

# fix virtual memory for onap-log:elasticsearch under Rancher 1.6.11 - OOM-431
sudo sysctl -w vm.max_map_count=262144
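
Note that `sysctl -w` does not survive a reboot. A minimal sketch for checking the current value and persisting the setting follows, assuming a host that reads /etc/sysctl.d at boot (as Ubuntu 16.04 and RHEL 7 do). The function names and the file name 99-onap-elasticsearch.conf are hypothetical choices, not taken from cd.sh.

```shell
#!/usr/bin/env bash
REQUIRED_MAP_COUNT=262144

# Succeeds when the given value (default: the live kernel setting) is high enough.
map_count_ok() {
  local current="${1:-$(sysctl -n vm.max_map_count)}"
  [ "$current" -ge "$REQUIRED_MAP_COUNT" ]
}

# Apply the setting now and persist it across reboots.
# The file name below is a hypothetical choice.
fix_map_count() {
  sudo sysctl -w vm.max_map_count="$REQUIRED_MAP_COUNT"
  echo "vm.max_map_count=$REQUIRED_MAP_COUNT" \
    | sudo tee /etc/sysctl.d/99-onap-elasticsearch.conf >/dev/null
}
```

Usage on each VM: `map_count_ok || fix_map_count`.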

Scripted RKE Kubernetes Cluster install

OOM RKE Kubernetes Deployment

Scripted undercloud(Helm/Kubernetes/Docker) and ONAP install - Single VM

Prerequisites

Create a single VM - 256G+

See recommended cluster configurations on ONAP Deployment Specification for Finance and Operations#AmazonAWS

Create a 0.0.0.0/0 ::/0 open security group

Use GitHub OAuth to authenticate your cluster just after installing it.

Last test 20190305 using 3.0.1-ONAP

ONAP Development#Change max-pods from default 110 pod limit

# 0 - verify the security group has all protocols (TCP/UDP) for 0.0.0.0/0 and ::/0
# to be safe, edit /etc/hosts to make sure dns resolution is set up to the host
ubuntu@ld:~$ sudo cat /etc/hosts
127.0.0.1 cd.onap.info


# 1 - configure combined master/host VM - 26 min
sudo git clone https://gerrit.onap.org/r/logging-analytics
sudo cp logging-analytics/deploy/rancher/oom_rancher_setup.sh .
sudo ./oom_rancher_setup.sh -b master -s <your domain/ip> -e onap


# to deploy more than 110 pods per vm
# before the environment (1a7) is created from the kubernetes template (1pt2) - at the waiting 3 min mark - edit it via https://wiki.onap.org/display/DW/ONAP+Development#ONAPDevelopment-Changemax-podsfromdefault110podlimit

--max-pods=900
https://lists.onap.org/g/onap-discuss/topic/oom_110_kubernetes_pod/25213556?p=,,,20,0,0,0::recentpostdate%2Fsticky,,,20,2,0,25213556


in "additional kubelet flags"
--max-pods=500
# on a 244G R4.8xlarge vm - 26 min later k8s cluster is up
NAMESPACE     NAME                                    READY     STATUS    RESTARTS   AGE
kube-system   heapster-6cfb49f776-5pq45               1/1       Running   0          10m
kube-system   kube-dns-75c8cb4ccb-7dlsh               3/3       Running   0          10m
kube-system   kubernetes-dashboard-6f4c8b9cd5-v625c   1/1       Running   0          10m
kube-system   monitoring-grafana-76f5b489d5-zhrjc     1/1       Running   0          10m
kube-system   monitoring-influxdb-6fc88bd58d-9494h    1/1       Running   0          10m
kube-system   tiller-deploy-8b6c5d4fb-52zmt           1/1       Running   0          2m

# 3 - secure via github oauth the master - immediately to lock out crypto miners
http://cd.onap.info:8880

# check the master cluster
ubuntu@ip-172-31-14-89:~$ kubectl top nodes
NAME                                         CPU(cores)   CPU%      MEMORY(bytes)   MEMORY%   
ip-172-31-8-245.us-east-2.compute.internal   179m         2%        2494Mi          4%        
ubuntu@ip-172-31-14-89:~$ kubectl get nodes -o wide
NAME                                         STATUS    ROLES     AGE       VERSION            EXTERNAL-IP   OS-IMAGE             KERNEL-VERSION   CONTAINER-RUNTIME
ip-172-31-8-245.us-east-2.compute.internal   Ready     <none>    13d       v1.10.3-rancher1   172.17.0.1    Ubuntu 16.04.1 LTS   4.4.0-1049-aws   docker://17.3.2

# 7 - after cluster is up - run cd.sh script to get onap up - customize your values.yaml - the 2nd time you run the script - a clean install - will clone new oom repo
# get the dev.yaml and set any pods you want up to true as well as fill out the openstack parameters
sudo wget https://git.onap.org/oom/plain/kubernetes/onap/resources/environments/dev.yaml
sudo cp dev.yaml dev0.yaml
sudo vi dev0.yaml 
sudo cp dev0.yaml dev1.yaml
sudo cp logging-analytics/deploy/cd.sh .

# this does a prepull (-p), clones 3.0.0-ONAP, managed install -f true
sudo ./cd.sh -b 3.0.0-ONAP -e onap -p true -n nexus3.onap.org:10001 -f true -s 300 -c true -d true -w false -r false
# check around 55 min (on a 256G single node - with 32 vCores)
pods/failed/up @ min and ram
161/13/153 @ 50m 107g
@55 min
ubuntu@ip-172-31-20-218:~$ kubectl get pods --all-namespaces | grep onap | grep -E '1/1|2/2' | wc -l
152
ubuntu@ip-172-31-20-218:~$ kubectl get pods --all-namespaces | grep -E '0/|1/2' 
onap          dep-deployment-handler-5789b89d4b-s6fzw                 1/2       Running                 0          8m
onap          dep-service-change-handler-76dcd99f84-fchxd             0/1       ContainerCreating       0          3m
onap          onap-aai-champ-68ff644d85-rv7tr                         0/1       Running                 0          53m
onap          onap-aai-gizmo-856f86d664-q5pvg                         1/2       CrashLoopBackOff        9          53m
onap          onap-oof-85864d6586-zcsz5                               0/1       ImagePullBackOff        0          53m
onap          onap-pomba-kibana-d76b6dd4c-sfbl6                       0/1       Init:CrashLoopBackOff   7          53m
onap          onap-pomba-networkdiscovery-85d76975b7-mfk92            1/2       CrashLoopBackOff        9          53m
onap          onap-pomba-networkdiscoveryctxbuilder-c89786dfc-qnlx9   1/2       CrashLoopBackOff        9          53m
onap          onap-vid-84c88db589-8cpgr                               1/2       CrashLoopBackOff        7          52m

Note: DCAE has 2 sets of orchestration after the initial k8s orchestration - another at 57 min
ubuntu@ip-172-31-20-218:~$ kubectl get pods --all-namespaces | grep -E '0/|1/2' 
onap          dep-dcae-prh-6b5c6ff445-pr547                           0/2       ContainerCreating       0          2m
onap          dep-dcae-tca-analytics-7dbd46d5b5-bgrn9                 0/2       ContainerCreating       0          1m
onap          dep-dcae-ves-collector-59d4ff58f7-94rpq                 0/2       ContainerCreating       0          1m
onap          onap-aai-champ-68ff644d85-rv7tr                         0/1       Running                 0          57m
onap          onap-aai-gizmo-856f86d664-q5pvg                         1/2       CrashLoopBackOff        10         57m
onap          onap-oof-85864d6586-zcsz5                               0/1       ImagePullBackOff        0          57m
onap          onap-pomba-kibana-d76b6dd4c-sfbl6                       0/1       Init:CrashLoopBackOff   8          57m
onap          onap-pomba-networkdiscovery-85d76975b7-mfk92            1/2       CrashLoopBackOff        11         57m
onap          onap-pomba-networkdiscoveryctxbuilder-c89786dfc-qnlx9   1/2       Error                   10         57m
onap          onap-vid-84c88db589-8cpgr                               1/2       CrashLoopBackOff        9          57m

at 1 hour
ubuntu@ip-172-31-20-218:~$ free
              total        used        free      shared  buff/cache   available
Mem:      251754696   111586672    45000724      193628    95167300   137158588
ubuntu@ip-172-31-20-218:~$ kubectl get pods --all-namespaces | grep onap | wc -l
164
ubuntu@ip-172-31-20-218:~$ kubectl get pods --all-namespaces | grep onap | grep -E '1/1|2/2' | wc -l
155
ubuntu@ip-172-31-20-218:~$ kubectl get pods --all-namespaces | grep -E '0/|1/2' | wc -l
8
ubuntu@ip-172-31-20-218:~$ kubectl get pods --all-namespaces | grep -E '0/|1/2' 
onap          dep-dcae-ves-collector-59d4ff58f7-94rpq                 1/2       Running                 0          4m
onap          onap-aai-champ-68ff644d85-rv7tr                         0/1       Running                 0          59m
onap          onap-aai-gizmo-856f86d664-q5pvg                         1/2       CrashLoopBackOff        10         59m
onap          onap-oof-85864d6586-zcsz5                               0/1       ImagePullBackOff        0          59m
onap          onap-pomba-kibana-d76b6dd4c-sfbl6                       0/1       Init:CrashLoopBackOff   8          59m
onap          onap-pomba-networkdiscovery-85d76975b7-mfk92            1/2       CrashLoopBackOff        11         59m
onap          onap-pomba-networkdiscoveryctxbuilder-c89786dfc-qnlx9   1/2       CrashLoopBackOff        10         59m
onap          onap-vid-84c88db589-8cpgr                               1/2       CrashLoopBackOff        9          59m


ubuntu@ip-172-31-20-218:~$ df
Filesystem     1K-blocks     Used Available Use% Mounted on
udev           125869392        0 125869392   0% /dev
tmpfs           25175472    54680  25120792   1% /run
/dev/xvda1     121914320 91698036  30199900  76% /
tmpfs          125877348    30312 125847036   1% /dev/shm
tmpfs               5120        0      5120   0% /run/lock
tmpfs          125877348        0 125877348   0% /sys/fs/cgroup
tmpfs           25175472        0  25175472   0% /run/user/1000
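
The transcript above re-runs the same pod checks by hand at 50, 55, 57 and 60 minutes. A polling loop can do this unattended. The sketch below is an assumption-laden illustration, not part of cd.sh: the function names are invented, and it treats any pod whose READY column is not n/n (including Completed jobs) as not ready, so the threshold should allow for those.

```shell
#!/usr/bin/env bash
# Count pods whose READY column is not n/n (this includes Completed jobs).
count_not_ready() {
  kubectl get pods --all-namespaces --no-headers \
    | awk '{ split($3, r, "/"); if (r[1] != r[2]) n++ } END { print n + 0 }'
}

# usage: wait_for_pods MAX_NOT_READY TIMEOUT_SECONDS [INTERVAL_SECONDS]
# Returns 0 once the not-ready count is at or below MAX_NOT_READY,
# or 1 when TIMEOUT_SECONDS has elapsed first.
wait_for_pods() {
  local max="$1" timeout="$2" interval="${3:-60}" waited=0 pending
  while :; do
    pending="$(count_not_ready)"
    echo "not ready: ${pending} after ${waited}s"
    [ "$pending" -le "$max" ] && return 0
    [ "$waited" -ge "$timeout" ] && return 1
    sleep "$interval"
    waited=$((waited + interval))
  done
}
```

For the run shown above, where 8 pods were still not ready at the one hour mark, something like `wait_for_pods 10 4500 60` would poll once a minute for up to 75 minutes.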

todo: verify the release is there after a helm install - as the configMap size issue is breaking the release for now


Prerequisites

Create a single VM - 256G+

20181015

ubuntu@a-onap-dmz-nodelete:~$ ./oom_deployment.sh -b master -s att.onap.cloud -e onap -r a_ONAP_CD_master -t _arm_deploy_onap_cd.json -p _arm_deploy_onap_cd_z_parameters.json
# register the IP to DNS with route53 for att.onap.info - using this for the ONAP academic summit on the 22nd
13.68.113.104 = att.onap.cloud


Scripted undercloud(Helm/Kubernetes/Docker) and ONAP install - clustered

Prerequisites

Add an NFS (EFS on AWS) share

Create a 1 + N cluster

See recommended cluster configurations on ONAP Deployment Specification for Finance and Operations#AmazonAWS

Create a 0.0.0.0/0 ::/0 open security group

Use GitHub OAuth to authenticate your cluster just after installing it.

Last tested on ld.onap.info 20181029

# 0 - verify the security group has all protocols (TCP/UDP) for 0.0.0.0/0 and ::/0
# 1 - configure master - 15 min
sudo git clone https://gerrit.onap.org/r/logging-analytics
sudo logging-analytics/deploy/rancher/oom_rancher_setup.sh -b master -s <your domain/ip> -e onap
# on a 64G R4.2xlarge vm - 23 min later k8s cluster is up
kubectl get pods --all-namespaces
kube-system   heapster-76b8cd7b5-g7p6n               1/1       Running   0          8m
kube-system   kube-dns-5d7b4487c9-jjgvg              3/3       Running   0          8m
kube-system   kubernetes-dashboard-f9577fffd-qldrw   1/1       Running   0          8m
kube-system   monitoring-grafana-997796fcf-g6tr7     1/1       Running   0          8m
kube-system   monitoring-influxdb-56fdcd96b-x2kvd    1/1       Running   0          8m
kube-system   tiller-deploy-54bcc55dd5-756gn         1/1       Running   0          2m

# 2 - secure via github oauth the master - immediately to lock out crypto miners
http://ld.onap.info:8880

# 3 - delete the master from the hosts in rancher
http://ld.onap.info:8880

# 4 - create NFS share on master
https://us-east-2.console.aws.amazon.com/efs/home?region=us-east-2#/filesystems/fs-92xxxxx
# add -h 1.2.10 (if upgrading from 1.6.14 to 1.6.18 of rancher)
sudo logging-analytics/deploy/aws/oom_cluster_host_install.sh -n false -s <your domain/ip> -e fs-nnnnnn1b -r us-west-1 -t 371AEDC88zYAZdBXPM  -c true -v true

# 5 - create NFS share and register each node - do this for all nodes
sudo git clone https://gerrit.onap.org/r/logging-analytics
# add -h 1.2.10 (if upgrading from 1.6.14 to 1.6.18 of rancher)
sudo logging-analytics/deploy/aws/oom_cluster_host_install.sh -n true -s <your domain/ip> -e fs-nnnnnn1b -r us-west-1 -t 371AEDC88zYAZdBXPM  -c true -v true

# it takes about 1 min to run the script and 1 minute for the etcd and healthcheck containers to go green on each host
# check the master cluster
kubectl top nodes
NAME                                          CPU(cores)   CPU%      MEMORY(bytes)   MEMORY%   
ip-172-31-19-9.us-east-2.compute.internal     9036m        56%       53266Mi         43%       
ip-172-31-21-129.us-east-2.compute.internal   6840m        42%       47654Mi         38%       
ip-172-31-18-85.us-east-2.compute.internal    6334m        39%       49545Mi         40%       
ip-172-31-26-114.us-east-2.compute.internal   3605m        22%       25816Mi         21%  
# fix helm on the master after adding nodes to the master - only if the server helm version is less than the client helm version (rancher 1.6.18 does not have this issue)

ubuntu@ip-172-31-14-89:~$ sudo helm version
Client: &version.Version{SemVer:"v2.9.1", GitCommit:"20adb27c7c5868466912eebdf6664e7390ebe710", GitTreeState:"clean"}
Server: &version.Version{SemVer:"v2.8.2", GitCommit:"a80231648a1473929271764b920a8e346f6de844", GitTreeState:"clean"}
ubuntu@ip-172-31-14-89:~$ sudo helm init --upgrade
$HELM_HOME has been configured at /home/ubuntu/.helm.
Tiller (the Helm server-side component) has been upgraded to the current version.
ubuntu@ip-172-31-14-89:~$ sudo helm version
Client: &version.Version{SemVer:"v2.9.1", GitCommit:"20adb27c7c5868466912eebdf6664e7390ebe710", GitTreeState:"clean"}
Server: &version.Version{SemVer:"v2.9.1", GitCommit:"20adb27c7c5868466912eebdf6664e7390ebe710", GitTreeState:"clean"}
# 7a - manual: follow the helm plugin page
# https://wiki.onap.org/display/DW/OOM+Helm+%28un%29Deploy+plugins
sudo git clone https://gerrit.onap.org/r/oom
sudo cp -R ~/oom/kubernetes/helm/plugins/ ~/.helm
cd oom/kubernetes
sudo helm serve &
sudo make all
sudo make onap
sudo helm deploy onap local/onap --namespace onap 
fetching local/onap
release "onap" deployed
release "onap-aaf" deployed
release "onap-aai" deployed
release "onap-appc" deployed
release "onap-clamp" deployed
release "onap-cli" deployed
release "onap-consul" deployed
release "onap-contrib" deployed
release "onap-dcaegen2" deployed
release "onap-dmaap" deployed
release "onap-esr" deployed
release "onap-log" deployed
release "onap-msb" deployed
release "onap-multicloud" deployed
release "onap-nbi" deployed
release "onap-oof" deployed
release "onap-policy" deployed
release "onap-pomba" deployed
release "onap-portal" deployed
release "onap-robot" deployed
release "onap-sdc" deployed
release "onap-sdnc" deployed
release "onap-sniro-emulator" deployed
release "onap-so" deployed
release "onap-uui" deployed
release "onap-vfc" deployed
release "onap-vid" deployed
release "onap-vnfsdk" deployed
# 7b - automated: after cluster is up - run cd.sh script to get onap up - customize your values.yaml - the 2nd time you run the script
# clean install - will clone new oom repo
# get the dev.yaml and set any pods you want up to true as well as fill out the openstack parameters
sudo wget https://git.onap.org/oom/plain/kubernetes/onap/resources/environments/dev.yaml
sudo cp logging-analytics/deploy/cd.sh .
sudo ./cd.sh -b master -e onap -c true -d true -w true
# rerun install - no delete of oom repo
sudo ./cd.sh -b master -e onap -c false -d true -w true
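
The Helm fix in the steps above applies only when the server (Tiller) version is lower than the client version. That comparison can be scripted against the Helm 2 `helm version` output format shown in the transcript. The sketch below is illustrative: the function names are invented, and it relies on GNU `sort -V` for version ordering.

```shell
#!/usr/bin/env bash
# Extract the SemVer of the "Client" or "Server" line from Helm 2
# `helm version` output supplied on stdin.
helm_semver() {
  sed -n "s/^$1: .*SemVer:\"v\([^\"]*\)\".*/\1/p"
}

# Succeeds when the server (Tiller) version is older than the client version.
# usage: helm_server_older "$(sudo helm version)"
helm_server_older() {
  local client server lowest
  client="$(printf '%s\n' "$1" | helm_semver Client)"
  server="$(printf '%s\n' "$1" | helm_semver Server)"
  [ "$client" != "$server" ] || return 1
  # sort -V orders version strings; the first line is the lower version.
  lowest="$(printf '%s\n%s\n' "$client" "$server" | sort -V | head -n 1)"
  [ "$lowest" = "$server" ]
}
```

Usage on the master: `helm_server_older "$(sudo helm version)" && sudo helm init --upgrade`.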


Deployment Integrity based on Pod Dependencies

20181213 running 3.0.0-ONAP

LOG-899

LOG-898

OOM-1547

OOM-1543

Patches

Windriver openstack heat template 1+13 vms
https://gerrit.onap.org/r/#/c/74781/

docker prepull script - run before cd.sh - https://git.onap.org/logging-analytics/plain/deploy/docker_prepull.sh
https://gerrit.onap.org/r/#/c/74780/

Not merged with the heat template until the following nexus3 slowdown is addressed
https://lists.onap.org/g/onap-discuss/topic/nexus3_slowdown_10x_docker/28789709?p=,,,20,0,0,0::recentpostdate%2Fsticky,,,20,2,0,28789709
https://jira.onap.org/browse/TSC-79

Base Platform First

Bring up dmaap and aaf first, then the rest of the pods in the following order.

Every 2.0s: helm list                                                                                                                                                                             Fri Dec 14 15:19:49 2018
NAME            REVISION        UPDATED                         STATUS          CHART           NAMESPACE
onap            2               Fri Dec 14 15:10:56 2018        DEPLOYED        onap-3.0.0      onap
onap-aaf        1               Fri Dec 14 15:10:57 2018        DEPLOYED        aaf-3.0.0       onap
onap-dmaap      2               Fri Dec 14 15:11:00 2018        DEPLOYED        dmaap-3.0.0     onap

onap          onap-aaf-aaf-cm-5c65c9dc55-snhlj                       1/1       Running     0          10m
onap          onap-aaf-aaf-cs-7dff4b9c44-85zg2                       1/1       Running     0          10m
onap          onap-aaf-aaf-fs-ff6779b94-gz682                        1/1       Running     0          10m
onap          onap-aaf-aaf-gui-76cfcc8b74-wn8b8                      1/1       Running     0          10m
onap          onap-aaf-aaf-hello-5d45dd698c-xhc2v                    1/1       Running     0          10m
onap          onap-aaf-aaf-locate-8587d8f4-l4k7v                     1/1       Running     0          10m
onap          onap-aaf-aaf-oauth-d759586f6-bmz2l                     1/1       Running     0          10m
onap          onap-aaf-aaf-service-546f66b756-cjppd                  1/1       Running     0          10m
onap          onap-aaf-aaf-sms-7497c9bfcc-j892g                      1/1       Running     0          10m
onap          onap-aaf-aaf-sms-preload-vhbbd                         0/1       Completed   0          10m
onap          onap-aaf-aaf-sms-quorumclient-0                        1/1       Running     0          10m
onap          onap-aaf-aaf-sms-quorumclient-1                        1/1       Running     0          8m
onap          onap-aaf-aaf-sms-quorumclient-2                        1/1       Running     0          6m
onap          onap-aaf-aaf-sms-vault-0                               2/2       Running     1          10m
onap          onap-aaf-aaf-sshsm-distcenter-27ql7                    0/1       Completed   0          10m
onap          onap-aaf-aaf-sshsm-testca-mw95p                        0/1       Completed   0          10m
onap          onap-dmaap-dbc-pg-0                                    1/1       Running     0          17m
onap          onap-dmaap-dbc-pg-1                                    1/1       Running     0          15m
onap          onap-dmaap-dbc-pgpool-c5f8498-fn9cn                    1/1       Running     0          17m
onap          onap-dmaap-dbc-pgpool-c5f8498-t9s27                    1/1       Running     0          17m
onap          onap-dmaap-dmaap-bus-controller-59c96d6b8f-9xsxg       1/1       Running     0          17m
onap          onap-dmaap-dmaap-dr-db-557c66dc9d-gvb9f                1/1       Running     0          17m
onap          onap-dmaap-dmaap-dr-node-6496d8f55b-ffgfr              1/1       Running     0          17m
onap          onap-dmaap-dmaap-dr-prov-86f79c47f9-zb8p7              1/1       Running     0          17m
onap          onap-dmaap-message-router-5fb78875f4-lvsg6             1/1       Running     0          17m
onap          onap-dmaap-message-router-kafka-7964db7c49-n8prg       1/1       Running     0          17m
onap          onap-dmaap-message-router-zookeeper-5cdfb67f4c-5w4vw   1/1       Running     0          17m

onap-msb        2               Fri Dec 14 15:31:12 2018        DEPLOYED        msb-3.0.0       onap
onap          onap-msb-kube2msb-5c79ddd89f-dqhm6                     1/1       Running     0          4m
onap          onap-msb-msb-consul-6949bd46f4-jk6jw                   1/1       Running     0          4m
onap          onap-msb-msb-discovery-86c7b945f9-bc4zq                2/2       Running     0          4m
onap          onap-msb-msb-eag-5f86f89c4f-fgc76                      2/2       Running     0          4m
onap          onap-msb-msb-iag-56cdd4c87b-jsfr8                      2/2       Running     0          4m

onap-aai        1               Fri Dec 14 15:30:59 2018        DEPLOYED        aai-3.0.0       onap
onap          onap-aai-aai-54b7bf7779-bfbmg                          1/1       Running     0          2m
onap          onap-aai-aai-babel-6bbbcf5d5c-sp676                    2/2       Running     0          13m
onap          onap-aai-aai-cassandra-0                               1/1       Running     0          13m
onap          onap-aai-aai-cassandra-1                               1/1       Running     0          12m
onap          onap-aai-aai-cassandra-2                               1/1       Running     0          9m
onap          onap-aai-aai-champ-54f7986b6b-wql2b                    2/2       Running     0          13m
onap          onap-aai-aai-data-router-f5f75c9bd-l6ww7               2/2       Running     0          13m
onap          onap-aai-aai-elasticsearch-c9bf9dbf6-fnj8r             1/1       Running     0          13m
onap          onap-aai-aai-gizmo-5f8bf54f6f-chg85                    2/2       Running     0          13m
onap          onap-aai-aai-graphadmin-9b956d4c-k9fhk                 2/2       Running     0          13m
onap          onap-aai-aai-graphadmin-create-db-schema-s2nnw         0/1       Completed   0          13m
onap          onap-aai-aai-modelloader-644b46df55-vt4gk              2/2       Running     0          13m
onap          onap-aai-aai-resources-745b6b4f5b-rj7lm                2/2       Running     0          13m
onap          onap-aai-aai-search-data-559b8dbc7f-l6cqq              2/2       Running     0          13m
onap          onap-aai-aai-sparky-be-75658695f5-z2xv4                2/2       Running     0          13m
onap          onap-aai-aai-spike-6778948986-7h7br                    2/2       Running     0          13m
onap          onap-aai-aai-traversal-58b97f689f-jlblx                2/2       Running     0          13m
onap          onap-aai-aai-traversal-update-query-data-7sqt5         0/1       Completed   0          13m

onap-msb        5               Fri Dec 14 15:51:42 2018        DEPLOYED        msb-3.0.0               onap
onap          onap-msb-kube2msb-5c79ddd89f-dqhm6                     1/1       Running     0          18m
onap          onap-msb-msb-consul-6949bd46f4-jk6jw                   1/1       Running     0          18m
onap          onap-msb-msb-discovery-86c7b945f9-bc4zq                2/2       Running     0          18m
onap          onap-msb-msb-eag-5f86f89c4f-fgc76                      2/2       Running     0          18m
onap          onap-msb-msb-iag-56cdd4c87b-jsfr8                      2/2       Running     0          18m

onap-esr        3               Fri Dec 14 15:51:40 2018        DEPLOYED        esr-3.0.0       onap
onap          onap-esr-esr-gui-6c5ccd59d6-6brcx                      1/1       Running     0          2m
onap          onap-esr-esr-server-5f967d4767-ctwp6                   2/2       Running     0          2m
onap-robot      2               Fri Dec 14 15:51:48 2018        DEPLOYED        robot-3.0.0             onap
onap          onap-robot-robot-ddd948476-n9szh                        1/1       Running             0          11m

onap-multicloud 1               Fri Dec 14 15:51:43 2018        DEPLOYED        multicloud-3.0.0        onap
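
The ordering above amounts to waiting until the pods of one release are up before starting the next. A minimal sketch of that check follows, assuming pod listings in the `kubectl get pods --all-namespaces` column layout shown above. The function name `count_pending_pods` is hypothetical and is not part of OOM or cd.sh.

```shell
# Count pods that are neither fully ready nor Completed.
# Input (stdin): rows in `kubectl get pods --all-namespaces` layout:
#   NAMESPACE NAME READY STATUS RESTARTS AGE
# count_pending_pods is a hypothetical helper name, not an OOM function.
count_pending_pods() {
  awk '
    $1 == "NAMESPACE" { next }     # skip the header row
    NF < 4            { next }     # skip blank or short lines
    $4 == "Completed" { next }     # finished jobs (0/1 Completed) are fine
    {
      split($3, r, "/")            # READY column, e.g. 2/2
      if ($4 != "Running" || r[1] != r[2]) pending++
    }
    END { print pending + 0 }
  '
}

# Example against a live cluster (not run here):
#   kubectl get pods --all-namespaces | grep onap-aaf | count_pending_pods
```

A result of 0 for a release such as onap-aaf suggests it is safe to start the next one in the list.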


Tiller requires wait states between deployments

A patch going into 3.0.1 delays deployments by 3+ seconds so that Tiller is not overloaded.

sudo cp -R ~/oom/kubernetes/helm/plugins/ ~/.helm
sudo vi ~/.helm/plugins/deploy/deploy.sh 

Use public-cloud.yaml override

Note: your HD/SSD, RAM and CPU configuration will drastically affect deployment. For example, if you are CPU starved, the idle load of ONAP will delay pods as more come in. Additionally, the network bandwidth needed to pull docker containers is significant, and PV creation is sensitive to filesystem throughput/lag.

Some of the internal pod timings are optimized for certain Azure deployments.

https://git.onap.org/oom/tree/kubernetes/onap/resources/environments/public-cloud.yaml

Optimizing Docker Image Pulls

https://lists.onap.org/g/onap-discuss/topic/onap_helpdesk_65794_nexus3/28794221?p=,,,20,0,0,0::recentpostdate%2Fsticky,,,20,2,0,28794221

Verify whether the integration docker csv manifest or the oom repo values.yaml is the source of truth (no override required?)

TSC-86

https://lists.onap.org/g/onap-discuss/topic/oom_onap_deployment/28883609?p=,,,20,0,0,0::recentpostdate%2Fsticky,,,20,2,0,28883609

Nexus Proxy

Alain Soleil pointed out the proxy page (which was using commercial nexus3): ONAP OOM Beijing - Hosting docker images locally. I had about 4 JIRAs on this and forgot about them.

20190121: 

Answered John Lotoski for EKS and his other post on nexus3 proxy failures - looks like an issue with a double proxy between dockerhub - or an issue specific to the dockerhub/registry:2 container - https://lists.onap.org/g/onap-discuss/topic/registry_issue_few_images/29285134?p=,,,20,0,0,0::recentpostdate%2Fsticky,,,20,2,0,29285134


Running

LOG-355

nexus3.onap.info:5000 - my private AWS nexus3 proxy of nexus3.onap.org:10001

nexus3.onap.cloud:5000 - azure public proxy - filled with casablanca (will retire after Jan 2)

nexus4.onap.cloud:5000 - azure public proxy - filled with master - and later casablanca

nexus3windriver.onap.cloud:5000 - windriver/openstack lab inside the firewall to use only for the lab - access to public is throttled

Nexus3 proxy setup - host
# from a clean ubuntu 16.04 VM
# install docker
sudo curl https://releases.rancher.com/install-docker/17.03.sh | sh
sudo usermod -aG docker ubuntu
# install nexus
mkdir -p certs
openssl req -newkey rsa:4096 -nodes -sha256 -keyout certs/domain.key -x509 -days 365 -out certs/domain.crt
Common Name (e.g. server FQDN or YOUR name) []:nexus3.onap.info

sudo nano /etc/hosts
sudo docker run -d  --restart=unless-stopped  --name registry  -v `pwd`/certs:/certs  -e REGISTRY_HTTP_ADDR=0.0.0.0:5000  -e REGISTRY_HTTP_TLS_CERTIFICATE=/certs/domain.crt  -e REGISTRY_HTTP_TLS_KEY=/certs/domain.key  -e REGISTRY_PROXY_REMOTEURL=https://nexus3.onap.org:10001  -p 5000:5000  registry:2
sudo docker ps
CONTAINER ID        IMAGE               COMMAND                  CREATED             STATUS              PORTS                    NAMES
7f9b0e97eb7f        registry:2          "/entrypoint.sh /e..."   8 seconds ago       Up 7 seconds        0.0.0.0:5000->5000/tcp   registry
# test it
sudo docker login -u docker -p docker nexus3.onap.info:5000
Login Succeeded
# get images from https://git.onap.org/integration/plain/version-manifest/src/main/resources/docker-manifest.csv?h=casablanca
# use for example the first line onap/aaf/aaf_agent,2.1.8
# or the prepull script in https://git.onap.org/logging-analytics/plain/deploy/docker_prepull.sh

sudo docker pull nexus3.onap.info:5000/onap/aaf/aaf_agent:2.1.8
2.1.8: Pulling from onap/aaf/aaf_agent
18d680d61657: Pulling fs layer 
819d6de9e493: Downloading [======================================>            ] 770.7 kB/1.012 MB

# list
sudo docker images
REPOSITORY          TAG                 IMAGE ID            CREATED             SIZE
registry            2                   2e2f252f3c88        3 months ago        33.3 MB

# prepull to cache images on the server - in this case casablanca branch
sudo wget https://git.onap.org/logging-analytics/plain/deploy/docker_prepull.sh
sudo chmod 777 docker_prepull.sh

# prep - same as client vms - the cert
sudo mkdir /etc/docker/certs.d
sudo mkdir /etc/docker/certs.d/nexus3.onap.cloud:5000
sudo cp certs/domain.crt /etc/docker/certs.d/nexus3.onap.cloud:5000/ca.crt
sudo systemctl restart docker
sudo docker login -u docker -p docker nexus3.onap.cloud:5000

# prepull
sudo nohup ./docker_prepull.sh -b casablanca -s nexus3.onap.cloud:5000 &
Nexus3 proxy usage per cluster node

Cert is on TSC-79

# on each host
# Cert is on TSC-79
sudo wget https://jira.onap.org/secure/attachment/13127/domain_nexus3_onap_cloud.crt

# or if you already have it
scp domain_nexus3_onap_cloud.crt ubuntu@ld3.onap.cloud:~/   
    # to avoid
    sudo docker login -u docker -p docker nexus3.onap.cloud:5000
        Error response from daemon: Get https://nexus3.onap.cloud:5000/v1/users/: x509: certificate signed by unknown authority

# cp cert
sudo mkdir /etc/docker/certs.d
sudo mkdir /etc/docker/certs.d/nexus3.onap.cloud:5000
sudo cp domain_nexus3_onap_cloud.crt /etc/docker/certs.d/nexus3.onap.cloud:5000/ca.crt
sudo systemctl restart docker
sudo docker login -u docker -p docker nexus3.onap.cloud:5000
Login Succeeded

# testing
# vm with the image existing - 2 sec
ubuntu@ip-172-31-33-46:~$ sudo docker pull nexus3.onap.cloud:5000/onap/aaf/aaf_agent:2.1.8
2.1.8: Pulling from onap/aaf/aaf_agent
Digest: sha256:71781f3cfa51066abb1a4a35267af37beec01b6bb75817fdfae056582839290c
Status: Downloaded newer image for nexus3.onap.cloud:5000/onap/aaf/aaf_agent:2.1.8


# vm with layers existing except for last 5 - 5 sec
ubuntu@a-cd-master:~$ sudo docker pull nexus3.onap.cloud:5000/onap/aaf/aaf_agent:2.1.8
2.1.8: Pulling from onap/aaf/aaf_agent
18d680d61657: Already exists 
.. 20
49e90af50c7d: Already exists 
....
acb05d09ff6e: Pull complete 
Digest: sha256:71781f3cfa51066abb1a4a35267af37beec01b6bb75817fdfae056582839290c
Status: Downloaded newer image for nexus3.onap.cloud:5000/onap/aaf/aaf_agent:2.1.8

# clean AWS VM (clean install of docker) - no pulls yet - 45 sec for everything
ubuntu@ip-172-31-14-34:~$ sudo docker pull nexus3.onap.cloud:5000/onap/aaf/aaf_agent:2.1.8
2.1.8: Pulling from onap/aaf/aaf_agent
18d680d61657: Pulling fs layer 
0addb6fece63: Pulling fs layer 
78e58219b215: Pulling fs layer 
eb6959a66df2: Pulling fs layer 
321bd3fd2d0e: Pull complete  
...
acb05d09ff6e: Pull complete 
Digest: sha256:71781f3cfa51066abb1a4a35267af37beec01b6bb75817fdfae056582839290c
Status: Downloaded newer image for nexus3.onap.cloud:5000/onap/aaf/aaf_agent:2.1.8
ubuntu@ip-172-31-14-34:~$ sudo docker images
REPOSITORY                                 TAG                 IMAGE ID            CREATED             SIZE
nexus3.onap.cloud:5000/onap/aaf/aaf_agent   2.1.8               090b326a7f11        5 weeks ago         1.14 GB

# going to test a same size image directly from the LF - with minimal common layers
nexus3.onap.org:10001/onap/testsuite                    1.3.2                c4b58baa95e8        3 weeks ago         1.13 GB
# 5 min in we are still at 3% - numbers below are a min old 
ubuntu@ip-172-31-14-34:~$ sudo docker pull nexus3.onap.org:10001/onap/testsuite:1.3.2
1.3.2: Pulling from onap/testsuite
32802c0cfa4d: Downloading [=============>                                     ] 8.416 MB/32.1 MB
da1315cffa03: Download complete 
fa83472a3562: Download complete 
f85999a86bef: Download complete 
3eca7452fe93: Downloading [=======================>                           ] 8.517 MB/17.79 MB
9f002f13a564: Downloading [=========================================>         ] 8.528 MB/10.24 MB
02682cf43e5c: Waiting 
....
754645df4601: Waiting 

# in 5 min we get 3% (35 of 1130 MB) - which comes out to 162 min for 1.13 GB from .org as opposed to 45 sec from .info - which is a 200x slowdown - some of this is due to the fact that my nexus3.onap.info is on the same VPC as my test VM - testing on openlab next


# openlab - 2 min 40 sec, which is 3.6 times slower than in AWS, as expected (25 min pulls in AWS vs 90 min in openlab) - this makes nexus3.onap.org 60 times slower in openlab than a proxy running from AWS (2 vCore/16G/ssd VM)
ubuntu@onap-oom-obrien-rancher-e4:~$ sudo docker pull nexus3.onap.info:5000/onap/aaf/aaf_agent:2.1.8
2.1.8: Pulling from onap/aaf/aaf_agent

18d680d61657: Pull complete 
...
acb05d09ff6e: Pull complete 
Digest: sha256:71781f3cfa51066abb1a4a35267af37beec01b6bb75817fdfae056582839290c
Status: Downloaded newer image for nexus3.onap.info:5000/onap/aaf/aaf_agent:2.1.8

# pulling a smaller image from nexus3.onap.org - 2 min 20 for 36 MB = 0.23 MB/sec - extrapolated to the 1.13 GB image above is 5022 sec or 83 min - half the rough calculation above
ubuntu@onap-oom-obrien-rancher-e4:~$ sudo docker pull nexus3.onap.org:10001/onap/aaf/sms:3.0.1
3.0.1: Pulling from onap/aaf/sms
c67f3896b22c: Pull complete 
...
76eeb922b789: Pull complete 
Digest: sha256:d5b64947edb93848acacaa9820234aa29e58217db9f878886b7bafae00fdb436
Status: Downloaded newer image for nexus3.onap.org:10001/onap/aaf/sms:3.0.1

# conclusion - nexus3.onap.org is experiencing a routing issue outbound from their DC, causing an 80-100x slowdown compared to a nexus3 proxy - since 20181217 - as local jenkins.onap.org builds complete faster
# workaround is to use one of the nexus3 proxies above


and adding to values.yaml

global:
  #repository: nexus3.onap.org:10001
  repository: nexus3.onap.cloud:5000
  repositoryCred:
    user: docker
    password: docker
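
The slowdown figures quoted in the pull tests above are simple extrapolations, and can be reproduced with a short sketch. The function names `estimate_total_minutes` and `slowdown_factor` are hypothetical helpers; the input numbers are the ones from the test notes (35 MB of 1130 MB after 5 min, and 162 min versus 45 sec).

```shell
# Extrapolate the total pull time in minutes from a partial download.
# Args: pulled_mb total_mb elapsed_min
# (hypothetical helper, not part of any ONAP script)
estimate_total_minutes() {
  awk -v p="$1" -v t="$2" -v m="$3" 'BEGIN { printf "%.1f\n", m * t / p }'
}

# Ratio between a slow pull and a fast pull, both given in seconds.
slowdown_factor() {
  awk -v s="$1" -v f="$2" 'BEGIN { printf "%.0f\n", s / f }'
}

estimate_total_minutes 35 1130 5   # 35 MB of 1130 MB after 5 min
slowdown_factor 9720 45            # 162 min (9720 sec) vs 45 sec
```

This prints 161.4 and 216, in line with the roughly 162 min and 200x figures in the notes.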

The Windriver lab also has a network issue. For example, pulling 1.1G from nexus3.onap.cloud:5000 (Azure) into an AWS EC2 instance takes 45 sec, while the same pull into an openlab VM takes on the order of 10+ min. You therefore need a local nexus3 proxy if you are inside the openstack lab. I have registered nexus3windriver.onap.cloud:5000 to a nexus3 proxy in my logging tenant (cert above).
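
The per-node cert steps shown earlier are the same for every proxy, so they can be wrapped in one function. This is a sketch only: `install_registry_cert` and the `DOCKER_CERTS_ROOT` override are hypothetical names introduced here so the copy can be tried in a scratch directory before touching /etc/docker.

```shell
# Copy a registry CA cert to where the docker daemon looks for it:
#   <certs root>/<registry host:port>/ca.crt
# Args: registry host:port, path to the .crt file
# DOCKER_CERTS_ROOT is a hypothetical override (default /etc/docker/certs.d).
install_registry_cert() {
  registry="$1"
  cert="$2"
  root="${DOCKER_CERTS_ROOT:-/etc/docker/certs.d}"
  # refuse to run without a registry name or an existing cert file
  [ -n "$registry" ] && [ -f "$cert" ] || return 1
  mkdir -p "$root/$registry" && cp "$cert" "$root/$registry/ca.crt"
}

# Example on a cluster node (needs root for the default path, not run here):
#   install_registry_cert nexus3windriver.onap.cloud:5000 domain_nexus3_onap_cloud.crt
#   sudo systemctl restart docker
```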

Docker Prepull

https://git.onap.org/logging-analytics/plain/deploy/docker_prepull.sh

using

https://git.onap.org/integration/tree/version-manifest/src/main/resources/docker-manifest.csv?h=casablanca

via

https://gerrit.onap.org/r/#/c/74780/

LOG-905

git clone ssh://michaelobrien@gerrit.onap.org:29418/logging-analytics
cd logging-analytics 
git pull ssh://michaelobrien@gerrit.onap.org:29418/logging-analytics refs/changes/80/74780/1
ubuntu@onap-oom-obrien-rancher-e0:~$ sudo nohup ./docker_prepull.sh & 
[1] 14488
ubuntu@onap-oom-obrien-rancher-e0:~$ nohup: ignoring input and appending output to 'nohup.out'


POD redeployment/undeploy/deploy

If you need to redeploy a pod due to a job timeout or failure, or to pick up a config/code change, first delete its /dockerdata-nfs subdirectory (for example /dockerdata-nfs/onap-aai), so that a db restart, for example, does not run into existing data issues.

sudo chmod -R 777 /dockerdata-nfs
sudo rm -rf /dockerdata-nfs/onap-aai
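
The cleanup above can be generalized to any component. A minimal sketch follows, assuming the `<release>-<component>` directory naming seen in `/dockerdata-nfs/onap-aai`; `clean_component_data` and the `NFS_ROOT` override are hypothetical names, and the guard against empty arguments is there because an empty name would widen the `rm -rf`.

```shell
# Remove the persisted NFS data of one component before redeploying it.
# Args: release name (e.g. onap), component name (e.g. aai)
# NFS_ROOT is a hypothetical override; the default matches /dockerdata-nfs.
clean_component_data() {
  release="$1"
  component="$2"
  root="${NFS_ROOT:-/dockerdata-nfs}"
  # refuse to run with an empty argument, which would widen the rm
  [ -n "$release" ] && [ -n "$component" ] || return 1
  rm -rf "${root:?}/${release}-${component}"
}

# Example (not run here; needs the same permissions as the commands above):
#   clean_component_data onap aai
```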


Casablanca Deployment Examples

Deploy to 13+1 cluster

Deploy as one with deploy.sh delays and public-cloud.yaml - single 500G server AWS

sudo helm deploy onap local/onap --namespace $ENVIRON -f ../../dev.yaml -f onap/resources/environments/public-cloud.yaml 
where dev.yaml is the same as in resources but with all components turned on and IfNotPresent instead of Always

Deploy in sequence with validation on previous pod before proceeding - single 500G server AWS

We are not using the public-cloud.yaml override here, in order to verify just the timing between deploys. Each pod waits for the previous one to complete, so resources are not in contention.

see update to 

https://git.onap.org/logging-analytics/tree/deploy/cd.sh

https://gerrit.onap.org/r/#/c/75422

          DEPLOY_ORDER_POD_NAME_ARRAY=('robot consul aaf dmaap dcaegen2 msb aai esr multicloud oof so sdc sdnc vid policy portal log vfc uui vnfsdk appc clamp cli pomba vvp contrib sniro-emulator')
          # don't count completed pods
          DEPLOY_NUMBER_PODS_DESIRED_ARRAY=(1 4 13 11 13 5 15 2 6 17 10 12 11 2 8 6 3 18 2 5 5 5 1 11 11 3 1)
          # account for pods that have varying deploy times or replicaset sizes
          # don't count the 0/1 completed pods - and skip most of the ReplicaSet instances except 1
          # dcae bootstrap is problematic
          DEPLOY_NUMBER_PODS_PARTIAL_ARRAY=(1 2 11 9 13 5 11 2 6 16 10 12 11 2 8 6 3 18 2 5 5 5 1 9 11 3 1)
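
The three arrays above only work if they line up index by index, so a quick consistency check is useful before editing them. The sketch below copies the values from the excerpt into plain space-separated strings (an assumption made here so it runs in any POSIX shell; cd.sh itself uses bash arrays). The helper names `count_words` and `desired_for` are hypothetical.

```shell
# Component order and pod counts copied from the cd.sh excerpt above,
# held as space-separated strings so the sketch runs in any POSIX shell.
NAMES='robot consul aaf dmaap dcaegen2 msb aai esr multicloud oof so sdc sdnc vid policy portal log vfc uui vnfsdk appc clamp cli pomba vvp contrib sniro-emulator'
DESIRED='1 4 13 11 13 5 15 2 6 17 10 12 11 2 8 6 3 18 2 5 5 5 1 11 11 3 1'
PARTIAL='1 2 11 9 13 5 11 2 6 16 10 12 11 2 8 6 3 18 2 5 5 5 1 9 11 3 1'

# Count whitespace-separated words (word splitting is intentional).
count_words() {
  set -- $1
  echo "$#"
}

# Print the desired pod count for one component name.
desired_for() {
  i=1
  for name in $NAMES; do
    if [ "$name" = "$1" ]; then
      echo "$DESIRED" | cut -d' ' -f"$i"
      return 0
    fi
    i=$((i + 1))
  done
  return 1
}

# The three lists must have the same length.
n=$(count_words "$NAMES")
if [ "$n" -eq "$(count_words "$DESIRED")" ] && [ "$n" -eq "$(count_words "$PARTIAL")" ]; then
  echo "ok: $n components"
else
  echo "length mismatch" >&2
fi
```

With the values as listed, all three lists have 27 entries, and `desired_for aai` prints 15.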

Deployment in sequence to Windriver Lab

Note: the Windriver Openstack lab requires that host registration occurs against the private network 10.0.0.0/16, not the 10.12.0.0/16 public network. Registering against the public IP is fine in Azure/AWS but not in openstack.

The docs will be adjusted: OOM-1550

This is bad - public IP based cluster

This is good - private IP based cluster
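
Since each openlab VM has one address on each network, the private address to register with can be picked out mechanically. A minimal sketch, assuming the private network is 10.0.0.0/16 as stated above; `pick_private_ip` is a hypothetical helper name.

```shell
# Print the first address that falls in the private 10.0.0.0/16 range.
# Args: addresses separated by spaces and/or commas
# (hypothetical helper; the range is the openlab private network above)
pick_private_ip() {
  for ip in $(echo "$*" | tr ',' ' '); do
    case "$ip" in
      10.0.*) echo "$ip"; return 0 ;;
    esac
  done
  return 1
}

# Example on a cluster node (not run here), assuming `hostname -I`
# lists both interfaces:
#   CATTLE_AGENT_IP=$(pick_private_ip $(hostname -I))
```

The function returns a non-zero status when no private address is found, so a registration script can stop instead of falling back to the public IP.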

Openstack/Windriver HEAT template for 13+1 kubernetes cluster

https://jira.onap.org/secure/attachment/13010/logging_openstack_13_16g.yaml

LOG-324

see

https://gerrit.onap.org/r/74781

obrienbiometrics:onap_oom-714_heat michaelobrien$ openstack stack create -t logging_openstack_13_16g.yaml -e logging_openstack_oom.env OOM20181216-13
+---------------------+-----------------------------------------+
| Field               | Value                                   |
+---------------------+-----------------------------------------+
| id                  | ed6aa689-2e2a-4e75-8868-9db29607c3ba    |
| stack_name          | OOM20181216-13                          |
| description         | Heat template to install OOM components |
| creation_time       | 2018-12-16T19:42:27Z                    |
| updated_time        | 2018-12-16T19:42:27Z                    |
| stack_status        | CREATE_IN_PROGRESS                      |
| stack_status_reason | Stack CREATE started                    |
+---------------------+-----------------------------------------+
obrienbiometrics:onap_oom-714_heat michaelobrien$ openstack server list
+--------------------------------------+-----------------------------+--------+--------------------------------------+--------------------------+
| ID                                   | Name                        | Status | Networks                             | Image Name               |
+--------------------------------------+-----------------------------+--------+--------------------------------------+--------------------------+
| 7695cf14-513e-4fea-8b00-6c2a25df85d3 | onap-oom-obrien-rancher-e13 | ACTIVE | oam_onap_RNa3=10.0.0.23, 10.12.7.14  | ubuntu-16-04-cloud-amd64 |
| 1b70f179-007c-4975-8e4a-314a57754684 | onap-oom-obrien-rancher-e7  | ACTIVE | oam_onap_RNa3=10.0.0.10, 10.12.7.36  | ubuntu-16-04-cloud-amd64 |
| 17c77bd5-0a0a-45ec-a9c7-98022d0f62fe | onap-oom-obrien-rancher-e2  | ACTIVE | oam_onap_RNa3=10.0.0.9, 10.12.6.180  | ubuntu-16-04-cloud-amd64 |
| f85e075f-e981-4bf8-af3f-e439b7b72ad2 | onap-oom-obrien-rancher-e9  | ACTIVE | oam_onap_RNa3=10.0.0.6, 10.12.5.136  | ubuntu-16-04-cloud-amd64 |
| 58c404d0-8bae-4889-ab0f-6c74461c6b90 | onap-oom-obrien-rancher-e6  | ACTIVE | oam_onap_RNa3=10.0.0.19, 10.12.5.68  | ubuntu-16-04-cloud-amd64 |
| b91ff9b4-01fe-4c34-ad66-6ffccc9572c1 | onap-oom-obrien-rancher-e4  | ACTIVE | oam_onap_RNa3=10.0.0.11, 10.12.7.35  | ubuntu-16-04-cloud-amd64 |
| d9be8b3d-2ef2-4a00-9752-b935d6dd2dba | onap-oom-obrien-rancher-e0  | ACTIVE | oam_onap_RNa3=10.0.16.1, 10.12.7.13  | ubuntu-16-04-cloud-amd64 |
| da0b1be6-ec2b-43e6-bb3f-1f0626dcc88b | onap-oom-obrien-rancher-e1  | ACTIVE | oam_onap_RNa3=10.0.0.16, 10.12.5.10  | ubuntu-16-04-cloud-amd64 |
| 0ffec4d0-bd6f-40f9-ab2e-f71aa5b9fbda | onap-oom-obrien-rancher-e5  | ACTIVE | oam_onap_RNa3=10.0.0.7, 10.12.6.248  | ubuntu-16-04-cloud-amd64 |
| 125620e0-2aa6-47cf-b422-d4cbb66a7876 | onap-oom-obrien-rancher-e8  | ACTIVE | oam_onap_RNa3=10.0.0.8, 10.12.6.249  | ubuntu-16-04-cloud-amd64 |
| 1efe102a-d310-48d2-9190-c442eaec3f80 | onap-oom-obrien-rancher-e12 | ACTIVE | oam_onap_RNa3=10.0.0.5, 10.12.5.167  | ubuntu-16-04-cloud-amd64 |
| 7c248d1d-193a-415f-868b-a94939a6e393 | onap-oom-obrien-rancher-e3  | ACTIVE | oam_onap_RNa3=10.0.0.3, 10.12.5.173  | ubuntu-16-04-cloud-amd64 |
| 98dc0aa1-e42d-459c-8dde-1a9378aa644d | onap-oom-obrien-rancher-e11 | ACTIVE | oam_onap_RNa3=10.0.0.12, 10.12.6.179 | ubuntu-16-04-cloud-amd64 |
| 6799037c-31b5-42bd-aebf-1ce7aa583673 | onap-oom-obrien-rancher-e10 | ACTIVE | oam_onap_RNa3=10.0.0.13, 10.12.6.167 | ubuntu-16-04-cloud-amd64 |
+--------------------------------------+-----------------------------+--------+--------------------------------------+--------------------------+
# 13+1 vms on openlab available as of 20181216 - running 2 separate clusters
# 13+1 all 16g VMs
# 4+1 all 32g VMs 
# master undercloud
sudo git clone https://gerrit.onap.org/r/logging-analytics
sudo cp logging-analytics/deploy/rancher/oom_rancher_setup.sh .
sudo ./oom_rancher_setup.sh -b master -s 10.12.7.13 -e onap
# master nfs
sudo wget https://jira.onap.org/secure/attachment/12887/master_nfs_node.sh
sudo chmod 777 master_nfs_node.sh 
sudo ./master_nfs_node.sh 10.12.5.10 10.12.6.180 10.12.5.173 10.12.7.35 10.12.6.248 10.12.5.68 10.12.7.36 10.12.6.249 10.12.5.136 10.12.6.167 10.12.6.179 10.12.5.167 10.12.7.14
#sudo ./master_nfs_node.sh 10.12.5.162 10.12.5.198 10.12.5.102 10.12.5.4

# slaves nfs
sudo wget https://jira.onap.org/secure/attachment/12888/slave_nfs_node.sh
sudo chmod 777 slave_nfs_node.sh 
sudo ./slave_nfs_node.sh 10.12.7.13
#sudo ./slave_nfs_node.sh 10.12.6.125
# test it
ubuntu@onap-oom-obrien-rancher-e4:~$ sudo ls /dockerdata-nfs/
test.sh

# remove client from master node
ubuntu@onap-oom-obrien-rancher-e0:~$ kubectl get nodes
NAME                         STATUS    ROLES     AGE       VERSION
onap-oom-obrien-rancher-e0   Ready     <none>    5m        v1.11.5-rancher1
ubuntu@onap-oom-obrien-rancher-e0:~$ kubectl get pods --all-namespaces
NAMESPACE     NAME                                   READY     STATUS    RESTARTS   AGE
kube-system   heapster-7b48b696fc-2z47t              1/1       Running   0          5m
kube-system   kube-dns-6655f78c68-gn2ds              3/3       Running   0          5m
kube-system   kubernetes-dashboard-6f54f7c4b-sfvjc   1/1       Running   0          5m
kube-system   monitoring-grafana-7877679464-872zv    1/1       Running   0          5m
kube-system   monitoring-influxdb-64664c6cf5-rs5ms   1/1       Running   0          5m
kube-system   tiller-deploy-6f4745cbcf-zmsrm         1/1       Running   0          5m
# after master removal from hosts - expected no nodes
ubuntu@onap-oom-obrien-rancher-e0:~$ kubectl get nodes
error: the server doesn't have a resource type "nodes"

# slaves rancher client - 1st node
# register on the private network not the public IP
# notice the CATTLE_AGENT
sudo docker run -e CATTLE_AGENT_IP="10.0.0.7"  --rm --privileged -v /var/run/docker.sock:/var/run/docker.sock -v /var/lib/rancher:/var/lib/rancher rancher/agent:v1.2.11 http://10.0.16.1:8880/v1/scripts/5A5E4F6388A4C0A0F104:1514678400000:9zpsWeGOsKVmWtOtoixAUWjPJs
ubuntu@onap-oom-obrien-rancher-e0:~$ kubectl get nodes
NAME                         STATUS    ROLES     AGE       VERSION
onap-oom-obrien-rancher-e1   Ready     <none>    0s        v1.11.5-rancher1
# add the other nodes
# the 4 node 32g = 128g cluster
ubuntu@onap-oom-obrien-rancher-e0:~$ kubectl get nodes
NAME                         STATUS    ROLES     AGE       VERSION
onap-oom-obrien-rancher-e1   Ready     <none>    1h        v1.11.5-rancher1
onap-oom-obrien-rancher-e2   Ready     <none>    4m        v1.11.5-rancher1
onap-oom-obrien-rancher-e3   Ready     <none>    5m        v1.11.5-rancher1
onap-oom-obrien-rancher-e4   Ready     <none>    3m        v1.11.5-rancher1

# the 13 node 16g = 208g cluster
ubuntu@onap-oom-obrien-rancher-e0:~$ kubectl top nodes
NAME                          CPU(cores)   CPU%      MEMORY(bytes)   MEMORY%   
onap-oom-obrien-rancher-e1    208m         2%        2693Mi          16%       
onap-oom-obrien-rancher-e10   38m          0%        1083Mi          6%        
onap-oom-obrien-rancher-e11   36m          0%        1104Mi          6%        
onap-oom-obrien-rancher-e12   57m          0%        1070Mi          6%        
onap-oom-obrien-rancher-e13   116m         1%        1017Mi          6%        
onap-oom-obrien-rancher-e2    73m          0%        1361Mi          8%        
onap-oom-obrien-rancher-e3    62m          0%        1099Mi          6%        
onap-oom-obrien-rancher-e4    74m          0%        1370Mi          8%        
onap-oom-obrien-rancher-e5    37m          0%        1104Mi          6%        
onap-oom-obrien-rancher-e6    55m          0%        1125Mi          7%        
onap-oom-obrien-rancher-e7    42m          0%        1102Mi          6%        
onap-oom-obrien-rancher-e8    53m          0%        1090Mi          6%        
onap-oom-obrien-rancher-e9    52m          0%        1072Mi          6%  
Installing ONAP via cd.sh

The cluster hosting kubernetes is up with 13+1 nodes and 2 network interfaces (the private 10.0.0.0/16 subnet and the 10.12.0.0/16 public subnet)


Verify kubernetes hosts are ready

ubuntu@onap-oom-obrien-rancher-e0:~$ kubectl get nodes
NAME                          STATUS    ROLES     AGE       VERSION
onap-oom-obrien-rancher-e1    Ready     <none>    2h        v1.11.5-rancher1
onap-oom-obrien-rancher-e10   Ready     <none>    25m       v1.11.5-rancher1
onap-oom-obrien-rancher-e11   Ready     <none>    20m       v1.11.5-rancher1
onap-oom-obrien-rancher-e12   Ready     <none>    5m        v1.11.5-rancher1
onap-oom-obrien-rancher-e13   Ready     <none>    1m        v1.11.5-rancher1
onap-oom-obrien-rancher-e2    Ready     <none>    2h        v1.11.5-rancher1
onap-oom-obrien-rancher-e3    Ready     <none>    1h        v1.11.5-rancher1
onap-oom-obrien-rancher-e4    Ready     <none>    1h        v1.11.5-rancher1
onap-oom-obrien-rancher-e5    Ready     <none>    1h        v1.11.5-rancher1
onap-oom-obrien-rancher-e6    Ready     <none>    46m       v1.11.5-rancher1
onap-oom-obrien-rancher-e7    Ready     <none>    40m       v1.11.5-rancher1
onap-oom-obrien-rancher-e8    Ready     <none>    37m       v1.11.5-rancher1
onap-oom-obrien-rancher-e9    Ready     <none>    26m       v1.11.5-rancher1

Openstack parameter overrides

# manually check out 3.0.0-ONAP (script is written for branches like casablanca)
sudo git clone -b 3.0.0-ONAP http://gerrit.onap.org/r/oom
sudo cp -R ~/oom/kubernetes/helm/plugins/ ~/.helm
# fix tiller bug
sudo nano ~/.helm/plugins/deploy/deploy.sh
# modify dev.yaml with logging-rc file openstack parameters - appc, sdnc and 
sudo cp logging-analytics/deploy/cd.sh .
sudo cp oom/kubernetes/onap/resources/environments/dev.yaml .
sudo nano dev.yaml 
ubuntu@onap-oom-obrien-rancher-0:~/oom/kubernetes/so/resources/config/mso$ echo -n "Whq..jCLj" | openssl aes-128-ecb -e -K `cat encryption.key` -nosalt | xxd -c 256 -p
bdaee....c60d3e09
  # so server configuration
  config:
    openStackUserName: "michael_o_brien"
    openStackRegion: "RegionOne"
    openStackKeyStoneUrl: "http://10.12.25.2:5000"
    openStackServiceTenantName: "service"
    openStackEncryptedPasswordHere: "bdaee....c60d3e09"


Deploy all or a subset of ONAP

# copy dev.yaml to dev0.yaml
# bring up all onap in sequence or adjust the list for a subset specific for the vFW - assumes you already cloned oom
sudo nohup ./cd.sh -b 3.0.0-ONAP -e onap -p false -n nexus3.onap.org:10001 -f true -s 900 -c false -d true -w false -r false &
#sudo helm deploy onap local/onap --namespace $ENVIRON -f ../../dev.yaml -f onap/resources/environments/public-cloud.yaml 

The load is distributed across the cluster even for individual pods like dmaap

Verify the ONAP installation

See Accessing the portal.



vFW  vFirewall Workarounds

From Alexis Chiarello currently verifying

20190125 - these are for the heat environment - not the kubernetes one - following Casablanca Stability Testing Instructions currently

20181213 - thank you Alexis and Beejal Shah

Something else I forgot to mention, I did change the heat templates to adapt for our Ubuntu images in our env (to enable additional NICs, eth2 / eth3) and also disable gateway by default on the 2 additional subnets created.

See attached for the modified files.

Cheers,

Alexis.

I reran the vFWCL use case in my re-installed Casablanca lab and here is what I had to manually do post-install :

- fix Robot "robot-eteshare-configmap" config map and adjust values that did not match my env (onap_private_subnet_id, sec_group, dcae_collector_ip, Ubuntu image names, etc...).
- fix DEFAULT_KEYSTONE entry in identity_services in SO catalog DB for proper identity_url, mso_id, mso_pass; note that those are populated based on parsing the "so-openstack-adapter-app-configmap" config map, however it seems the config map is not populated with the entries from the kubernetes/onap/values.yaml file. It might be something I do wrong when installing, though I followed steps from Wiki.

Other than that, for closed-loop to work, policies need to be pushed :

- make sure to push the policies from pap (PRELOAD_POLICIES=true then run config/push-policies.sh from /tmp/policy-install folder)

(the following are for heat not kubernetes)

For the Robot execution :

- ran "demo.sh <namespace> init"
- ran "ete-k8s.sh [namespace] instantiateDemoVFWCL"

Finally, for Policy to actually parse the proper model ID from the AAI response on the named-query, policy-engine needs to be restarted manually; the robot script fails at doing this, so you need to do it manually after the Robot test ends (I did not investigate the robot part, but it basically looks like an ssh is attempted and fails)

docker exec -t -u policy drools bash -c "source /opt/app/policy/etc/profile.d/env.sh; policy stop"
docker exec -t -u policy drools bash -c "source /opt/app/policy/etc/profile.d/env.sh; policy start"

That's it. In my case, with the above, the vFWCL closed loop works just fine and I am able to see APP-C processing the modifyConfig event and changing the number of streams using netconf to the packet generator.

Cheers,

Alexis.




Full Entrypoint Install

Two choices: run the single oom_deployment.sh via your ARM, CloudFormation or Heat template wrapper as a one-click install, or use the 2-step procedure above.

entrypoint

aws/azure/openstack

Ubuntu 16

rancher install

oom deployment

CD script






Remove a Deployment

https://git.onap.org/logging-analytics/tree/deploy/cd.sh#n57

see also OOM-1463

Required for a couple of pods that leave leftover resources and for the secondary Cloudify out-of-band orchestration in DCAEGEN2

OOM-1089

DCAEGEN2-1067

DCAEGEN2-1068

sudo helm undeploy $ENVIRON --purge
kubectl delete namespace onap
sudo helm delete --purge onap
kubectl delete pv --all
kubectl delete pvc --all
kubectl delete secrets --all
kubectl delete clusterrolebinding --all


sudo rm -rf /dockerdata-nfs/onap-<pod>

# or for a single pod 
kubectl delete pod $ENVIRON-aaf-sms-vault-0 -n $ENVIRON --grace-period=0 --force
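
The removal sequence above can be sketched as a dry run. This is a minimal sketch, assuming the helm release and the namespace share one name (onap in this section). The function name onap_teardown_plan is hypothetical and not part of cd.sh; it only prints the commands so they can be reviewed first. The /dockerdata-nfs cleanup is left out on purpose because the path depends on your share.

```shell
#!/bin/sh
# Hypothetical helper (not part of cd.sh): print the teardown commands from
# this section for one environment instead of running them.
# Assumption: the release name and the namespace are the same value.
onap_teardown_plan() {
  plan_env="${1:-onap}"
  echo "sudo helm undeploy $plan_env --purge"
  echo "kubectl delete namespace $plan_env"
  echo "sudo helm delete --purge $plan_env"
  echo "kubectl delete pv --all"
  echo "kubectl delete pvc --all"
  echo "kubectl delete secrets --all"
  echo "kubectl delete clusterrolebinding --all"
}

# Review the plan first; pipe it to sh only once it looks right:
#   onap_teardown_plan onap
#   onap_teardown_plan onap | sh
```

Note that pv and clusterrolebinding are cluster-scoped, so the --all deletes are not limited to ONAP resources; only run them on a cluster dedicated to ONAP.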

Using ONAP

Accessing the portal

Access the ONAP portal via the 8989 LoadBalancer that Mandeep Khinda merged in for OOM-633, documented at http://onap.readthedocs.io/en/latest/submodules/oom.git/docs/oom_user_guide.html#accessing-the-onap-portal-using-oom-and-a-kubernetes-cluster

ubuntu@a-onap-devopscd:~$ kubectl -n onap get services|grep "portal-app"
portal-app                         LoadBalancer   10.43.145.94    13.68.113.105                          8989:30215/TCP,8006:30213/TCP,8010:30214/TCP,8443:30225/TCP   20h
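
The two values read off the service line above (the external IP and the NodePort behind 8989) can be pulled out with awk. A minimal sketch, assuming the default column layout of kubectl -n onap get services shown above; the function names are hypothetical.

```shell
#!/bin/sh
# Both helpers read `kubectl -n onap get services` output on stdin.
# Assumed columns: NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE

# Print the EXTERNAL-IP (4th column) of the portal-app service.
portal_external_ip() {
  awk '$1 == "portal-app" { print $4 }'
}

# Print the NodePort mapped to service port 8989 (e.g. 8989:30215/TCP -> 30215).
portal_node_port() {
  awk '$1 == "portal-app" {
    n = split($5, specs, ",")
    for (i = 1; i <= n; i++) {
      if (split(specs[i], parts, "[:/]") == 3 && parts[1] == "8989") print parts[2]
    }
  }'
}

# Usage:
#   kubectl -n onap get services | portal_external_ip
#   kubectl -n onap get services | portal_node_port
```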


When connecting to openlab through the VPN from your Mac, you need the 2nd number, which will be something like 10.0.0.12. Use the public IP that corresponds to this private network IP; for this case only, that is the e1 instance with 10.12.7.7 as the external routable IP.

Add the following entries, prefixed with the IP above, to your client's /etc/hosts

 in this case I am using the public 13... ip (elastic or generated public ip) - AWS in this example
 13.68.113.105 portal.api.simpledemo.onap.org
 13.68.113.105 vid.api.simpledemo.onap.org
 13.68.113.105 sdc.api.fe.simpledemo.onap.org
 13.68.113.105 portal-sdk.simpledemo.onap.org
 13.68.113.105 policy.api.simpledemo.onap.org
 13.68.113.105 aai.api.sparky.simpledemo.onap.org
 13.68.113.105 cli.api.simpledemo.onap.org
 13.68.113.105 msb.api.discovery.simpledemo.onap.org
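
The host entries above differ only in the IP, so they can be generated. A minimal sketch; the function name onap_hosts_entries is hypothetical and the hostname list is copied from the entries above.

```shell
#!/bin/sh
# Print one /etc/hosts line per ONAP portal hostname for the given IP.
onap_hosts_entries() {
  hosts_ip="$1"
  for hosts_name in \
    portal.api.simpledemo.onap.org \
    vid.api.simpledemo.onap.org \
    sdc.api.fe.simpledemo.onap.org \
    portal-sdk.simpledemo.onap.org \
    policy.api.simpledemo.onap.org \
    aai.api.sparky.simpledemo.onap.org \
    cli.api.simpledemo.onap.org \
    msb.api.discovery.simpledemo.onap.org
  do
    printf '%s %s\n' "$hosts_ip" "$hosts_name"
  done
}

# Usage (appends to the client's hosts file):
#   onap_hosts_entries 13.68.113.105 | sudo tee -a /etc/hosts
```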

launch

http://portal.api.simpledemo.onap.org:8989/ONAPPORTAL/login.htm

login with demo user

Accessing MariaDB portal container

kubectl -n onap exec -it dev-portal-portal-db-b8db58679-q9pjq -- mysql -D mysql -h localhost -e 'select * from user'

see

PORTAL-399 and PORTAL-498

Running the vFirewall

Casablanca Stability Testing Instructions

# verifying on ld.onap.cloud  20190126
oom/kubernetes/robot/demo-k8s.sh onap init


Initialize Customer And Models                                        | FAIL |
ConnectionError: HTTPConnectionPool(host='1.2.3.4', port=5000): Max retries exceeded with url: /v2.0/tokens (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7efd0f8a4ad0>: Failed to establish a new connection: [Errno 110] Connection timed out',))



# push sample vFWCL policies
PAP_POD=$(kubectl --namespace onap get pods | grep policy-pap | sed 's/ .*//')
kubectl exec -it $PAP_POD -n onap -c pap -- bash -c 'export PRELOAD_POLICIES=true; /tmp/policy-install/config/push-policies.sh'
# ete instantiateDemoVFWCL
/root/oom/kubernetes/robot/ete-k8s.sh onap instantiateDemoVFWCL
# restart drools
kubectl delete pod dev-policy-drools-0 -n onap
# wait for policy to kick in
sleep 20m
# demo vfwclosedloop
/root/oom/kubernetes/robot/demo-k8s.sh onap vfwclosedloop $PNG_IP
# check the sink on 667
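
The fixed sleep 20m above could be bounded by a poll instead. A minimal sketch, assuming that the drools pod reporting all containers ready is a usable signal; the source does not confirm that readiness means the policies are active, so treat this as a lower bound on the wait. wait_until, pod_line_ready and drools_ready are hypothetical names.

```shell
#!/bin/sh
# wait_until TIMEOUT_SECONDS INTERVAL_SECONDS COMMAND [ARGS...]
# Re-run COMMAND until it succeeds; give up once TIMEOUT_SECONDS have elapsed.
wait_until() {
  wait_timeout="$1"; wait_interval="$2"; shift 2
  wait_elapsed=0
  while ! "$@"; do
    if [ "$wait_elapsed" -ge "$wait_timeout" ]; then
      return 1
    fi
    sleep "$wait_interval"
    wait_elapsed=$((wait_elapsed + wait_interval))
  done
  return 0
}

# Read one `kubectl get pod --no-headers` line on stdin (NAME READY STATUS ...);
# succeed only if the READY column shows all containers ready (e.g. 1/1).
pod_line_ready() {
  awk '{ split($2, r, "/"); ok = (r[2] > 0 && r[1] == r[2]) } END { exit ok ? 0 : 1 }'
}

# Assumption: the pod name matches the one restarted above.
drools_ready() {
  kubectl get pod dev-policy-drools-0 -n onap --no-headers | pod_line_ready
}

# Usage: poll every 15 sec for up to 20 min
#   wait_until 1200 15 drools_ready
```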


Deployment Profile

For a view of the system see Log Streaming Compliance and API

Minimum Single VM Deployment

A single 122g R4.4xlarge VM in progress

see also LOG-630

helm install will bring up everything without the configmap failure - but the release is busted - pods come up though

ubuntu@ip-172-31-27-63:~$ sudo helm install local/onap -n onap --namespace onap -f onap/resources/environments/disable-allcharts.yaml --set aai.enabled=true --set dmaap.enabled=true --set log.enabled=true --set policy.enabled=true --set portal.enabled=true --set robot.enabled=true --set sdc.enabled=true --set sdnc.enabled=true --set so.enabled=true --set vid.enabled=true
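
The long run of --set flags in the command above follows one pattern per component, so it can be generated from a component list. A minimal sketch; helm_enable_flags is a hypothetical helper name and the component names are the ones used above.

```shell
#!/bin/sh
# Print one "--set <component>.enabled=true" flag per argument, one per line.
helm_enable_flags() {
  for flag_component in "$@"; do
    printf '%s %s.enabled=true\n' "--set" "$flag_component"
  done
}

# Usage (unquoted substitution so the shell splits the flags into words):
#   sudo helm install local/onap -n onap --namespace onap \
#     -f onap/resources/environments/disable-allcharts.yaml \
#     $(helm_enable_flags aai dmaap log policy portal robot sdc sdnc so vid)
```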
deployment | containers
minimum (no vfwCL)

medium (vfwCL)

full


Container Issues

20180901

amdocs@ubuntu:~/_dev/oom/kubernetes$ kubectl get pods --all-namespaces | grep 0/1
onap          onap-aai-champ-68ff644d85-mpkb9                         0/1       Running                 0          1d
onap          onap-pomba-kibana-d76b6dd4c-j4q9m                       0/1       Init:CrashLoopBackOff   472        1d
amdocs@ubuntu:~/_dev/oom/kubernetes$ kubectl get pods --all-namespaces | grep 1/2
onap          onap-aai-gizmo-856f86d664-mf587                         1/2       CrashLoopBackOff        568        1d
onap          onap-pomba-networkdiscovery-85d76975b7-w9sjl            1/2       CrashLoopBackOff        573        1d
onap          onap-pomba-networkdiscoveryctxbuilder-c89786dfc-rtdqc   1/2       CrashLoopBackOff        569        1d
onap          onap-vid-84c88db589-vbfht                               1/2       CrashLoopBackOff        616        1d
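
The grep 0/1 and grep 1/2 filters above match fixed strings, so they miss pods such as 0/2 and would also match a healthy 10/10 pod. A minimal sketch that compares the two numbers of the READY column instead; pods_not_ready is a hypothetical name.

```shell
#!/bin/sh
# Read `kubectl get pods` output on stdin (with or without --all-namespaces)
# and print every line whose READY column (ready/total) has ready < total.
pods_not_ready() {
  awk '{
    for (i = 1; i <= NF; i++) {
      if ($i ~ /^[0-9]+\/[0-9]+$/) {       # first field shaped like n/m is READY
        split($i, r, "/")
        if (r[1] + 0 < r[2] + 0) print
        break
      }
    }
  }'
}

# Usage:
#   kubectl get pods --all-namespaces | pods_not_ready
```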

with clamp and pomba enabled (ran clamp first)
amdocs@ubuntu:~/_dev/oom/kubernetes$ sudo helm upgrade -i onap local/onap --namespace onap -f dev.yaml 
Error: UPGRADE FAILED: failed to create resource: Service "pomba-kibana" is invalid: spec.ports[0].nodePort: Invalid value: 30234: provided port is already allocated
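
To find which service already holds the nodePort named in the error above, the PORT(S) column of the service list can be searched. A minimal sketch, assuming the default kubectl get services --all-namespaces column layout; services_using_nodeport is a hypothetical name.

```shell
#!/bin/sh
# Read `kubectl get services --all-namespaces` output on stdin and print
# namespace/name for every service exposing the given nodePort.
# Assumed columns: NAMESPACE NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
services_using_nodeport() {
  awk -v port="$1" '{
    n = split($6, specs, ",")
    for (i = 1; i <= n; i++) {
      # 8989:30215/TCP splits into 3 parts; 8080/TCP (no nodePort) into 2
      if (split(specs[i], parts, "[:/]") == 3 && parts[2] == port) print $1 "/" $2
    }
  }'
}

# Usage:
#   kubectl get services --all-namespaces | services_using_nodeport 30234
```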


Full ONAP Cluster

see the AWS cluster install below

Requirements

Hardware Requirements

VMs: 1
RAM: 55-70G at startup
HD: 40G per host min (30G for dockers); 100G after a week; 5G min per NFS; 4GBPS peak
(need to reduce 152 pods to 110)
vCores: 8 min; 60 peak at startup; recommended 16-64 vCores
Ports: see list on PortProfile; recommend 0.0.0.0/0 (all open) inside VPC; block 10249-10255 outside; secure 8888 with oauth
Network: 170 MB/sec; peak 1200

VMs: 3+
RAM: 85G; recommend min 3 x 64G class VMs; try for 4
HD: master: 40G; hosts: 80G (30G of dockers); NFS: 5G
vCores: 24 to 64

This is a snapshot of the CD system running on Amazon AWS at http://jenkins.onap.info/job/oom-cd-master/

It is a 1 + 4 node cluster composed of four 64G/8vCore R4.2xLarge VMs






Amazon AWS

Account Provider: (2) Robin of Amazon and Michael O'Brien of Amdocs

Amazon has donated an allocation large enough for 512G of VM space (a large 4 x 122G/16vCore cluster and a secondary 9 x 16G cluster) to run CD systems since Dec 2017, at a cost savings of at least $500/month. Thank you very much to Amazon for supporting ONAP.

See example max/med allocations for IT/Finance in ONAP Deployment Specification for Finance and Operations#AmazonAWS

Amazon AWS is currently hosting our RI for ONAP Continuous Deployment - this is a joint Proof Of Concept between Amazon and ONAP.

Auto Continuous Deployment via Jenkins and Kibana

AWS CLI Installation

Install the AWS CLI on the bastion VM

https://docs.aws.amazon.com/cli/latest/userguide/cli-install-macos.html

OSX

obrien:obrienlabs amdocs$ pip --version
pip 9.0.1 from /Library/Python/2.7/site-packages/pip-9.0.1-py2.7.egg (python 2.7)
obrien:obrienlabs amdocs$ curl -O https://bootstrap.pypa.io/get-pip.py
obrien:obrienlabs amdocs$ python3 get-pip.py --user
Requirement already up-to-date: pip in /Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages
obrien:obrienlabs amdocs$ pip3 install awscli --upgrade --user
Successfully installed awscli-1.14.41 botocore-1.8.45 pyasn1-0.4.2 s3transfer-0.1.13

Ubuntu

obrien:obrienlabs amdocs$ ssh ubuntu@<your domain/ip>
$ sudo apt install python-pip
$ pip install awscli --upgrade --user
$ aws --version
aws-cli/1.14.41 Python/2.7.12 Linux/4.4.0-1041-aws botocore/1.8.45

Windows Powershell


Configure Access Keys for your Account

$ aws configure
AWS Access Key ID [None]: AK....Q
AWS Secret Access Key [None]: Dl....l
Default region name [None]: us-east-1
Default output format [None]: json
$ aws ec2 describe-regions --output table
||  ec2.ca-central-1.amazonaws.com   |  ca-central-1    ||
....

Option 0: Deploy OOM Kubernetes to a spot VM

Peak Performance Metrics

We hit a peak of 44 cores during startup, with an external network peak of 1.2Gbps (throttled nexus servers at ONAP), a peak SSD write rate of 4Gbps and 55G ram on a 64 vCore/256G VM on AWS Spot.

Kubernetes Installation via CLI

Allocate an EIP static public IP (one-time)

https://docs.aws.amazon.com/cli/latest/reference/ec2/allocate-address.html

$ aws ec2 allocate-address
{    "PublicIp": "35.172..",     "Domain": "vpc",     "AllocationId": "eipalloc-2f743..."}

Create a Route53 Record Set - Type A (one-time)

$ cat route53-a-record-change-set.json 
{"Comment": "comment","Changes": [
    { "Action": "CREATE",
      "ResourceRecordSet": {
        "Name": "amazon.onap.cloud",
        "Type": "A", "TTL": 300,
        "ResourceRecords": [
          { "Value": "35.172.36.." }]}}]}
$ aws route53 change-resource-record-sets --hosted-zone-id Z...7 --change-batch file://route53-a-record-change-set.json
{    "ChangeInfo": {        "Status": "PENDING",         "Comment": "comment", 
       "SubmittedAt": "2018-02-17T15:02:46.512Z",         "Id": "/change/C2QUNYTDVF453x"    }}


$ dig amazon.onap.cloud
; <<>> DiG 9.9.7-P3 <<>> amazon.onap.cloud
amazon.onap.cloud.	300	IN	A	35.172.36..
onap.cloud.		172800	IN	NS	ns-1392.awsdns-46.org.
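
The change-set file above can be generated from the record name and the EIP rather than edited by hand. A minimal sketch; route53_a_record_batch is a hypothetical name, and the JSON layout is copied from the file shown above.

```shell
#!/bin/sh
# Print a Route53 change batch that creates one A record.
#   $1 = record name, $2 = IPv4 address (the allocated EIP)
route53_a_record_batch() {
  cat <<EOF
{"Comment": "comment","Changes": [
    { "Action": "CREATE",
      "ResourceRecordSet": {
        "Name": "$1",
        "Type": "A", "TTL": 300,
        "ResourceRecords": [
          { "Value": "$2" }]}}]}
EOF
}

# Usage (the IP here is a documentation placeholder, not a real EIP):
#   route53_a_record_batch amazon.onap.cloud 203.0.113.10 > route53-a-record-change-set.json
```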

Request a spot EC2 Instance

# request the usually cheapest $0.13 spot 64G EBS instance at AWS
aws ec2 request-spot-instances --spot-price "0.25" --instance-count 1 --type "one-time" --launch-specification file://aws_ec2_spot_cli.json

# don't pass in the following - it will be generated for the EBS volume
            "SnapshotId": "snap-0cfc17b071e696816"
launch specification json
{      "ImageId": "ami-c0ddd64ba",
      "InstanceType": "r4.2xlarge",
      "KeyName": "obrien_systems_aws_2015",
      "BlockDeviceMappings": [
        {"DeviceName": "/dev/sda1",
          "Ebs": {
            "DeleteOnTermination": true,
            "VolumeType": "gp2",
            "VolumeSize": 120
          }}],
      "SecurityGroupIds": [ "sg-322c4nnn42" ]}
# results
{    "SpotInstanceRequests": [{   "Status": {
                "Message": "Your Spot request has been submitted for review, and is pending evaluation.", 
                "Code": "pending-evaluation", 

Get EC2 instanceId after creation

aws ec2 describe-spot-instance-requests --spot-instance-request-ids sir-1tyr5etg
            "InstanceId": "i-02a653592cb748e2x",

Associate EIP with EC2 Instance

Can be done separately as long as it is in the first 30 sec during initialization and before rancher starts on the instance.

$ aws ec2 associate-address --instance-id i-02a653592cb748e2x --allocation-id eipalloc-375c1d0x
{    "AssociationId": "eipassoc-a4b5a29x"}
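
Because the association has to happen within the first 30 sec, it helps to pull the InstanceId out of the describe-spot-instance-requests output without copying it by hand. A minimal sketch using sed on the JSON text; extract_instance_id is a hypothetical name. The AWS CLI's own --query 'SpotInstanceRequests[0].InstanceId' --output text options are an alternative that avoids text parsing.

```shell
#!/bin/sh
# Read `aws ec2 describe-spot-instance-requests` JSON on stdin and print the
# first InstanceId value found.
extract_instance_id() {
  sed -n 's/.*"InstanceId": *"\([^"]*\)".*/\1/p' | head -n 1
}

# Usage (request id taken from the example above):
#   aws ec2 describe-spot-instance-requests --spot-instance-request-ids sir-1tyr5etg | extract_instance_id
```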

Reboot EC2 Instance to apply DNS change to Rancher in AMI

$ aws ec2 reboot-instances --instance-ids i-02a653592cb748e2x

Clustered Deployment

look at https://github.com/kubernetes-incubator/external-storage

EC2 Cluster Creation

EFS share for shared NFS


"From the NFS wizard"

Setting up your EC2 instance

  1. Using the Amazon EC2 console, associate your EC2 instance with a VPC security group that enables access to your mount target. For example, if you assigned the "default" security group to your mount target, you should assign the "default" security group to your EC2 instance. Learn more
  2. Open an SSH client and connect to your EC2 instance. (Find out how to connect)

  3. If you're not using the EFS mount helper, install the NFS client on your EC2 instance:
    • On an Ubuntu instance:
      sudo apt-get install nfs-common

Mounting your file system

  1. Open an SSH client and connect to your EC2 instance. (Find out how to connect)
  2. Create a new directory on your EC2 instance, such as "efs".
    • sudo mkdir efs
  3. Mount your file system. If you require encryption of data in transit, use the EFS mount helper and the TLS mount option. Mounting considerations
    • Using the EFS mount helper:
      sudo mount -t efs fs-43b2763a:/ efs
    • Using the EFS mount helper and encryption of data in transit:
      sudo mount -t efs -o tls fs-43b2763a:/ efs
    • Using the NFS client:
      sudo mount -t nfs4 -o nfsvers=4.1,rsize=1048576,wsize=1048576,hard,timeo=600,retrans=2 fs-43b2763a.efs.us-east-2.amazonaws.com:/ efs

If you are unable to connect, see our troubleshooting documentation.

https://docs.aws.amazon.com/efs/latest/ug/mounting-fs.html
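
The NFS client mount line above only varies in the file system ID, the region and the mount point. A minimal sketch that builds it from those three values, using the DNS name pattern and mount options from the wizard text above; the function names are hypothetical and the command is printed, not run.

```shell
#!/bin/sh
# Build the EFS DNS name: <fs-id>.efs.<region>.amazonaws.com
efs_dns_name() {
  printf '%s.efs.%s.amazonaws.com\n' "$1" "$2"
}

# Print the NFS client mount command for an EFS share.
#   $1 = file system id, $2 = region, $3 = local mount point
efs_mount_command() {
  printf 'sudo mount -t nfs4 -o nfsvers=4.1,rsize=1048576,wsize=1048576,hard,timeo=600,retrans=2 %s:/ %s\n' \
    "$(efs_dns_name "$1" "$2")" "$3"
}

# Usage:
#   efs_mount_command fs-43b2763a us-east-2 /dockerdata-nfs
```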

EFS/NFS Provisioning Script for AWS

https://git.onap.org/logging-analytics/tree/deploy/aws/oom_cluster_host_install.sh

ubuntu@ip-172-31-19-239:~$ sudo git clone https://gerrit.onap.org/r/logging-analytics
Cloning into 'logging-analytics'...
ubuntu@ip-172-31-19-239:~$ sudo cp logging-analytics/deploy/aws/oom_cluster_host_install.sh .
ubuntu@ip-172-31-19-239:~$ sudo ./oom_cluster_host_install.sh -n true -s <your domain/ip> -e fs-0000001b -r us-west-1 -t 5EA8A:15000:MWcEyoKw -c true -v
# fix helm after adding nodes to the master
ubuntu@ip-172-31-31-219:~$ sudo helm init --upgrade
$HELM_HOME has been configured at /home/ubuntu/.helm.
Tiller (the Helm server-side component) has been upgraded to the current version.
ubuntu@ip-172-31-31-219:~$ sudo helm repo add local http://127.0.0.1:8879
"local" has been added to your repositories
ubuntu@ip-172-31-31-219:~$ sudo helm repo list
NAME  	URL                                             
stable	https://kubernetes-charts.storage.googleapis.com
local 	http://127.0.0.1:8879  

4 Node Kubernetes Cluster on AWS

Notice that we are vCore bound. Ideally we need 64 vCores for a minimal production system.

Client Install

# setup the master 
sudo git clone https://gerrit.onap.org/r/logging-analytics
sudo logging-analytics/deploy/rancher/oom_rancher_setup.sh -b master -s <your domain/ip> -e onap
# manually delete the host that was installed on the master - in the rancher gui for now

# run without a client on the master
sudo logging-analytics/deploy/aws/oom_cluster_host_install.sh -n false -s <your domain/ip> -e fs-nnnnnn1b -r us-west-1 -t 371AEDC88zYAZdBXPM  -c true -v true
ls /dockerdata-nfs/
onap  test.sh

# run the script from git on each cluster node
sudo git clone https://gerrit.onap.org/r/logging-analytics
sudo logging-analytics/deploy/aws/oom_cluster_host_install.sh -n true -s <your domain/ip> -e fs-nnnnnn1b -r us-west-1 -t 371AEDC88zYAZdBXPM  -c true -v true

# check a node
ls /dockerdata-nfs/
onap  test.sh
sudo docker ps
CONTAINER ID        IMAGE                             COMMAND                  CREATED             STATUS                  PORTS               NAMES
6e4a57e19c39        rancher/healthcheck:v0.3.3        "/.r/r /rancher-en..."   1 second ago        Up Less than a second                       r-healthcheck-healthcheck-5-f0a8f5e8
f9bffc6d9b3e        rancher/network-manager:v0.7.19   "/rancher-entrypoi..."   1 second ago        Up 1 second                                 r-network-services-network-manager-5-103f6104
460f31281e98        rancher/net:holder                "/.r/r /rancher-en..."   4 seconds ago       Up 4 seconds                                r-ipsec-ipsec-5-2e22f370
3e30b0cf91bb        rancher/agent:v1.2.9              "/run.sh run"            17 seconds ago      Up 16 seconds                               rancher-agent

# On the master - fix helm after adding nodes to the master
sudo helm init --upgrade
$HELM_HOME has been configured at /home/ubuntu/.helm.
Tiller (the Helm server-side component) has been upgraded to the current version.
sudo helm repo add local http://127.0.0.1:8879
# check the cluster on the master
kubectl top nodes
NAME                                          CPU(cores)   CPU%      MEMORY(bytes)   MEMORY%   
ip-172-31-16-85.us-west-1.compute.internal    129m         3%        1805Mi          5%        
ip-172-31-25-15.us-west-1.compute.internal    43m          1%        1065Mi          3%        
ip-172-31-28-145.us-west-1.compute.internal   40m          1%        1049Mi          3%        
ip-172-31-21-240.us-west-1.compute.internal   30m          0%        965Mi           3% 

# important: secure your rancher cluster by adding an oauth github account - to keep out crypto miners
http://cluster.onap.info:8880/admin/access/github
# now back to master to install onap
# get the dev.yaml and set any pods you want up to true as well as fill out the openstack parameters
sudo wget https://git.onap.org/oom/plain/kubernetes/onap/resources/environments/dev.yaml
sudo cp logging-analytics/deploy/cd.sh .
sudo ./cd.sh -b master -e onap -c true -d true -w false -r false

136 pending > 0 at the 1th 15 sec interval
ubuntu@ip-172-31-28-152:~$ kubectl get pods -n onap | grep -E '1/1|2/2' | wc -l
20
120 pending > 0 at the 39th 15 sec interval
ubuntu@ip-172-31-28-152:~$ kubectl get pods -n onap | grep -E '1/1|2/2' | wc -l
47
99 pending > 0 at the 93th 15 sec interval


after an hour most of the 136 containers should be up
kubectl get pods --all-namespaces | grep -E '0/|1/2'
onap          onap-aaf-cs-59954bd86f-vdvhx                    0/1       CrashLoopBackOff   7          37m
onap          onap-aaf-oauth-57474c586c-f9tzc                 0/1       Init:1/2           2          37m
onap          onap-aai-champ-7d55cbb956-j5zvn                 0/1       Running            0          37m
onap          onap-drools-0                                   0/1       Init:0/1           0          1h
onap          onap-nexus-54ddfc9497-h74m2                     0/1       CrashLoopBackOff   17         1h
onap          onap-sdc-be-777759bcb9-ng7zw                    1/2       Running            0          1h
onap          onap-sdc-es-66ffbcd8fd-v8j7g                    0/1       Running            0          1h
onap          onap-sdc-fe-75fb4965bd-bfb4l                    0/2       Init:1/2           6          1h

# cpu bound - a small cluster has 4x4 cores - try to run with 4x16 cores
ubuntu@ip-172-31-28-152:~$ kubectl top nodes
NAME                                          CPU(cores)   CPU%      MEMORY(bytes)   MEMORY%   
ip-172-31-28-145.us-west-1.compute.internal   3699m        92%       26034Mi         85%       
ip-172-31-21-240.us-west-1.compute.internal   3741m        93%       3872Mi          12%       
ip-172-31-16-85.us-west-1.compute.internal    3997m        99%       23160Mi         75%       
ip-172-31-25-15.us-west-1.compute.internal    3998m        99%       27076Mi         88%     
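
The cpu-bound state shown above can be summarized from the kubectl top nodes output. A minimal sketch that totals the millicores in use and counts nodes at or above a CPU% threshold; both function names are hypothetical.

```shell
#!/bin/sh
# Both helpers read `kubectl top nodes` output on stdin.
# Assumed columns: NAME CPU(cores) CPU% MEMORY(bytes) MEMORY%

# Sum the CPU(cores) column (values like 3699m) across all nodes.
total_millicores() {
  awk '$2 ~ /^[0-9]+m$/ { sub(/m$/, "", $2); total += $2 } END { print total + 0 }'
}

# Count nodes whose CPU% is at or above the given threshold.
nodes_at_or_over_cpu() {
  awk -v limit="$1" '$3 ~ /^[0-9]+%$/ { sub(/%$/, "", $3); if ($3 + 0 >= limit + 0) n++ } END { print n + 0 }'
}

# Usage:
#   kubectl top nodes | total_millicores
#   kubectl top nodes | nodes_at_or_over_cpu 90
```

For the 4 node output above this gives 15435 millicores in use out of roughly 16000 available, with all 4 nodes over 90%.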

13 Node Kubernetes Cluster on AWS

Node: R4.large (2 cores, 16g)


Notice that we are vCore bound. Ideally we need 64 vCores for a minimal production system - this runs with 12 x 4 vCores = 48

30 min after helm install start - DCAE containers come up at 55


ssh ubuntu@ld.onap.info

# setup the master 
sudo git clone https://gerrit.onap.org/r/logging-analytics
sudo logging-analytics/deploy/rancher/oom_rancher_setup.sh -b master -s <your domain/ip> -e onap
# manually delete the host that was installed on the master - in the rancher gui for now
# get the token for use with the EFS/NFS share
ubuntu@ip-172-31-8-245:~$ cat ~/.kube/config | grep token
    token: "QmFzaWMgTVVORk4wRkdNalF3UXpNNE9E.........RtNWxlbXBCU0hGTE1reEJVamxWTjJ0Tk5sWlVjZz09"
# run without a client on the master
ubuntu@ip-172-31-8-245:~$ sudo logging-analytics/deploy/aws/oom_cluster_host_install.sh -n false -s ld.onap.info -e fs-....eb -r us-east-2 -t QmFzaWMgTVVORk4wRkdNalF3UX..........aU1dGSllUVkozU0RSTmRtNWxlbXBCU0hGTE1reEJVamxWTjJ0Tk5sWlVjZz09 -c true -v true
ls /dockerdata-nfs/
onap  test.sh

# run the script from git on each cluster node
sudo git clone https://gerrit.onap.org/r/logging-analytics
sudo logging-analytics/deploy/aws/oom_cluster_host_install.sh -n true -s <your domain/ip> -e fs-nnnnnn1b -r us-west-1 -t 371AEDC88zYAZdBXPM  -c true -v true

ubuntu@ip-172-31-8-245:~$ kubectl top nodes
NAME                                          CPU(cores)   CPU%      MEMORY(bytes)   MEMORY%   
ip-172-31-14-254.us-east-2.compute.internal   45m          1%        1160Mi          7%        
ip-172-31-3-195.us-east-2.compute.internal    29m          0%        1023Mi          6%        
ip-172-31-2-105.us-east-2.compute.internal    31m          0%        1004Mi          6%        
ip-172-31-0-159.us-east-2.compute.internal    30m          0%        1018Mi          6%        
ip-172-31-12-122.us-east-2.compute.internal   34m          0%        1002Mi          6%        
ip-172-31-0-197.us-east-2.compute.internal    30m          0%        1015Mi          6%        
ip-172-31-2-244.us-east-2.compute.internal    123m         3%        2032Mi          13%       
ip-172-31-11-30.us-east-2.compute.internal    38m          0%        1142Mi          7%        
ip-172-31-9-203.us-east-2.compute.internal    33m          0%        998Mi           6%        
ip-172-31-1-101.us-east-2.compute.internal    32m          0%        996Mi           6%        
ip-172-31-9-128.us-east-2.compute.internal    31m          0%        1037Mi          6%        
ip-172-31-3-141.us-east-2.compute.internal    30m          0%        1011Mi          6%  

# now back to master to install onap
# get the dev.yaml and set any pods you want up to true as well as fill out the openstack parameters
sudo wget https://git.onap.org/oom/plain/kubernetes/onap/resources/environments/dev.yaml
sudo cp logging-analytics/deploy/cd.sh .
sudo ./cd.sh -b master -e onap -c true -d true -w false -r false

after an hour most of the 136 containers should be up
kubectl get pods --all-namespaces | grep -E '0/|1/2'
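
In the walkthrough above the token is grepped from ~/.kube/config and then pasted into the -t argument. A minimal sketch that extracts just the value so it can be passed in a variable; kube_token is a hypothetical name, and it assumes the token: "..." line format shown above.

```shell
#!/bin/sh
# Read a kubeconfig on stdin and print the first token value without quotes.
kube_token() {
  awk '$1 == "token:" { gsub(/"/, "", $2); print $2; exit }'
}

# Usage (flags as in the walkthrough above; placeholders left as-is):
#   TOKEN="$(kube_token < ~/.kube/config)"
#   sudo logging-analytics/deploy/aws/oom_cluster_host_install.sh -n false -s <your domain/ip> -e <your efs id> -r <region> -t "$TOKEN" -c true -v true
```

Keeping the token in a variable also avoids leaving it in pasted transcripts; it is a cluster credential.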

Amazon EKS Cluster for ONAP Deployment

LOG-554

LOG-939

follow

https://docs.aws.amazon.com/eks/latest/userguide/getting-started.html

https://aws.amazon.com/getting-started/projects/deploy-kubernetes-app-amazon-eks/

follow the VPC CNI plugin - https://aws.amazon.com/blogs/opensource/vpc-cni-plugin-v1-1-available/

and 20190121 work with John Lotoski on https://lists.onap.org/g/onap-discuss/topic/aws_efs_nfs_and_rancher_2_2/29382184?p=,,,20,0,0,0::recentpostdate%2Fsticky,,,20,2,0,29382184

https://docs.newrelic.com/docs/integrations/kubernetes-integration/installation/kubernetes-installation-configuration#install-amazon-eks

Network Diagram

Standard ELB and public/private VPC

Create EKS cluster


Provision access to EKS cluster


Kubernetes Installation via CloudFormation

ONAP Installation

SSH and upload OOM

oom_rancher_install.sh is in OOM-715 under https://gerrit.onap.org/r/#/c/32019/


Run OOM

see OOM-710

cd.sh in OOM-716 under https://gerrit.onap.org/r/#/c/32653/


Scenario: installing Rancher on clean Ubuntu 16.04 64g VM (single collocated server/host) and the master branch of onap via OOM deployment (2 scripts)

1 hour video of automated installation on an AWS EC2 spot instance


Run Healthcheck

Run Automated Robot parts of vFirewall VNF

Report Results

Stop Spot Instance

$ aws ec2 terminate-instances --instance-ids i-0040425ac8c0d8f6x
{    "TerminatingInstances": [        {
            "InstanceId": "i-0040425ac8c0d8f63", 
            "CurrentState": {
                "Code": 32, 
                "Name": "shutting-down"           }, 
            "PreviousState": {
                "Code": 16, 
                "Name": "running"
            }        }    ]}


Verify Instance stopped


Video on Installing and Running the ONAP Demos#ONAPDeploymentVideos

We can run ONAP on an AWS EC2 instance for $0.17/hour as opposed to Rackspace at $1.12/hour for a 64G Ubuntu host VM.

I have created an AMI on Amazon AWS under the following ID that has a reference 20170825 tag of ONAP 1.0 running on top of Rancher

ami-b8f3f3c3 : onap-oom-k8s-10

https://console.aws.amazon.com/ec2/v2/home?region=us-east-1#Images:visibility=public-images;search=ami-b8f3f3c3;sort=name

EIP 34.233.240.214 maps to http://dev.onap.info:8880/env/1a7/infra/hosts

A D2.2xlarge with 61G ram on the spot market https://console.aws.amazon.com/ec2sp/v1/spot/launch-wizard?region=us-east-1 at $0.16/hour for all of ONAP


It may take up to 3-8 min for kubernetes pods to initialize as long as you preload the docker images (see OOM-328)


Workaround for the disk space error - even though we are running with a 1.9 TB NVMe SSD

https://github.com/kubernetes/kubernetes/issues/48703


Use a flavor that uses EBS like M4.4xLarge which is OK - except for AAI right now

Expected Monthly Billing

r4.2xlarge is the smallest and most cost effective 64g min instance to use for full ONAP deployment - it requires EBS stores.  This is assuming 1 instance up at all times and a couple ad-hoc instances up a couple hours for testing/experimentation.
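
A rough monthly figure for the always-on instance can be worked out from the hourly rates quoted on this page ($0.13 and $0.17/hour). A minimal sketch, assuming 730 hours per month (8760 / 12) and ignoring EBS storage, data transfer and spot price changes; monthly_cost is a hypothetical name.

```shell
#!/bin/sh
# Print hourly rate x hours as a 2-decimal dollar amount.
#   $1 = hourly rate in dollars, $2 = hours (default 730, an assumed month)
monthly_cost() {
  awk -v rate="$1" -v hours="${2:-730}" 'BEGIN { printf "%.2f\n", rate * hours }'
}

# Usage:
#   monthly_cost 0.17        # one instance up all month
#   monthly_cost 0.17 10     # an ad-hoc instance up for 10 hours
```

At $0.17/hour this comes to 124.10 for a full month, and at the $0.13 spot rate to 94.90.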

Option 1: Migrating Heat to CloudFormation

Resource Correspondence

ID | Type | Parent | AWS | Openstack
















Using the CloudFormationDesigner

https://console.aws.amazon.com/cloudformation/designer/home?region=us-east-1#

Decoupling and Abstracting Southbound Orchestration via Plugins

Part of getting another infrastructure provider like AWS to work with ONAP will be in identifying and decoupling southbound logic from any particular cloud provider using an extensible plugin architecture on the SBI interface.

see Multi VIM/Cloud (5/11/17), VID project (5/17/17), Service Orchestrator (5/14/17), ONAP Operations Manager (5/10/17), ONAP Operations Manager / ONAP on Containers


Design Issues

DI 1: Refactor nested orchestration in DCAE

Replace the DCAE Controller

DI 2: Elastic IP allocation

DI 3: Investigate Cloudify plugin for AWS

Cloudify is Tosca based - https://github.com/cloudify-cosmo/cloudify-aws-plugin

DI 4: 20180803 Investigate ISTIO service mesh

https://istio.io/docs/setup/kubernetes/quick-start/

LOG-592



Links

Waiting for the EC2 C5 instance types under the C620 chipset to arrive at AWS so we can experiment under EC2 Spot - http://technewshunter.com/cpus/intel-launches-xeon-w-cpus-for-workstations-skylake-sp-ecc-for-lga2066-41771/ https://aws.amazon.com/about-aws/whats-new/2016/11/coming-soon-amazon-ec2-c5-instances-the-next-generation-of-compute-optimized-instances/

http://docs.aws.amazon.com/cli/latest/userguide/cli-install-macos.html

use

curl "https://s3.amazonaws.com/aws-cli/awscli-bundle.zip" -o "awscli-bundle.zip"
unzip awscli-bundle.zip
sudo ./awscli-bundle/install -i /usr/local/aws -b /usr/local/bin/aws
aws --version
aws-cli/1.11.170 Python/2.7.13 Darwin/16.7.0 botocore/1.7.28





EC2 VMs

AWS Clustered Deployment

AWS EC2 Cluster Creation

AWS EFS share for shared NFS

You need an NFS share between the VM's in your Kubernetes cluster - an Elastic File System share will wrap NFS

Automated

Manual

ubuntu@ip-10-0-0-66:~$ sudo apt-get install nfs-common
ubuntu@ip-10-0-0-66:~$ cd /
ubuntu@ip-10-0-0-66:~$ sudo mkdir /dockerdata-nfs
root@ip-10-0-0-19:/# sudo mount -t nfs4 -o nfsvers=4.1,rsize=1048576,wsize=1048576,hard,timeo=600,retrans=2 fs-43b2763a.efs.us-east-2.amazonaws.com:/ /dockerdata-nfs
# write something on one vm - and verify it shows on another
ubuntu@ip-10-0-0-8:~$ ls /dockerdata-nfs/
test.sh


Microsoft Azure

Subscription Sponsor: (1) Microsoft

VMs

Deliverables are deployment scripts, arm/cli templates for various deployment scenarios (single, multiple, federated servers)

In review OOM-711

Quickstart

Single collocated VM

Automation is currently only written for a single VM that hosts both the rancher server and the deployed onap pods. Use the ARM template below to deploy your VM and provision it (adjust your config parameters)

Two choices: run the single oom_deployment.sh ARM wrapper, or use it to bring up an empty vm and run oom_entrypoint.sh manually. Once the VM comes up, the oom_entrypoint.sh script runs; it downloads the oom_rancher_setup.sh script to set up docker, rancher, kubernetes and helm, then runs the cd.sh script to bring up onap based on your values.yaml config by running helm install on it.

# login to az cli, wget the deployment script, arm template and parameters file - edit the parameters file (dns, ssh key ...) and run the arm template
wget https://git.onap.org/logging-analytics/plain/deploy/azure/oom_deployment.sh
wget https://git.onap.org/logging-analytics/plain/deploy/azure/_arm_deploy_onap_cd.json
wget https://git.onap.org/logging-analytics/plain/deploy/azure/_arm_deploy_onap_cd_z_parameters.json
# either run the entrypoint which creates a resource template and runs the stack - or do those two commands manually
./oom_deployment.sh -b master -s azure.onap.cloud -e onap -r a_auto-youruserid_20180421 -t _arm_deploy_onap_cd.json -p _arm_deploy_onap_cd_z_parameters.json
# wait for the VM to finish in about 75 min or watch progress by ssh'ing into the vm and doing
root@ons-auto-201803181110z: sudo tail -f /var/lib/waagent/custom-script/download/0/stdout

# if you wish to run the oom_entrypoint script yourself - edit/break the cloud init section at the end of the arm template and do it yourself below
# download and edit values.yaml with your onap preferences and openstack tenant config
wget https://jira.onap.org/secure/attachment/11414/values.yaml
# download and run the bootstrap and onap install script, the -s server name can be an IP, FQDN or hostname
wget https://git.onap.org/logging-analytics/plain/deploy/rancher/oom_entrypoint.sh
chmod 777 oom_entrypoint.sh
sudo ./oom_entrypoint.sh -b master -s devops.onap.info -e onap
# wait 15 min for rancher to finish, then 30-90 min for onap to come up


#20181015 - delete the deployment, recreate the onap environment in rancher with the template adjusted for more than the default 110 container limit - by adding
--max-pods=500
# then redo the helm install


OOM-714 see https://jira.onap.org/secure/attachment/11455/oom_openstack.yaml and https://jira.onap.org/secure/attachment/11454/oom_openstack_oom.env

LOG-320 see https://git.onap.org/logging-analytics/tree/deploy/rancher/oom_entrypoint.sh

customize your template (true/false for any components, docker overrides etc...)

https://jira.onap.org/secure/attachment/11414/values.yaml

Run oom_entrypoint.sh after you have verified values.yaml - it will run both scripts below for you. A single node kubernetes setup running what you configured in values.yaml will be up in 50-90 min. If you only want to configure your VM without bringing up ONAP, comment out the cd.sh line and run cd.sh separately later.

LOG-325 - see https://git.onap.org/logging-analytics/plain/deploy/rancher/oom_rancher_setup.sh

LOG-326 - see https://git.onap.org/logging-analytics/plain/deploy/cd.sh

Verify your system is up by doing a kubectl get pods --all-namespaces and checking the 8880 port to bring up the rancher or kubernetes gui.
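The pod check can be scripted rather than read by eye. The following is a minimal sketch, assuming the default column layout of kubectl get pods --all-namespaces (NAMESPACE NAME READY STATUS ...). The function name count_unready_pods is hypothetical and not part of the ONAP scripts; it reads the listing on stdin so it can be tried without a cluster.

```shell
#!/usr/bin/env bash
# Count pods that are not fully up in `kubectl get pods --all-namespaces` output.
# Reads the listing on stdin; assumes columns NAMESPACE NAME READY STATUS ...
count_unready_pods() {
  awk 'NR > 1 && NF >= 4 {
         split($3, r, "/")                      # READY column, e.g. 2/3
         if ($4 == "Completed") next            # finished jobs are not a problem
         if ($4 != "Running" || r[1] != r[2]) n++
       }
       END { print n + 0 }'
}

# Usage on a live system (requires kubectl on the PATH):
#   kubectl get pods --all-namespaces | count_unready_pods
```

A result of 0 means every pod that is not a completed job reports Running with all of its containers ready.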

Login to Azure CLI

https://portal.azure.com/#blade/HubsExtension/Resources/resourceType/Microsoft.Resources%2Fresources

Download Azure ONAP ARM template

see OOM-711

Edit Azure ARM template environment parameters
Create Resource Group
az group create --name onap_eastus --location eastus

Run ARM template
az group deployment create --resource-group onap_eastus --template-file oom_azure_arm_deploy.json --parameters @oom_azure_arm_deploy_parameters.json
Wait for Rancher/Kubernetes install

The oom_entrypoint.sh script will be run as a cloud-init script on the VM - see LOG-320 - which runs oom_rancher_setup.sh (LOG-325)

Wait for OOM ONAP install

see LOG-326 (cd.sh)

Verify ONAP installation 
kubectl get pods --all-namespaces
# raise/lower onap components from the installed directory if using the oneclick arm template
# amsterdam only
root@ons-auto-master-201803191429z:/var/lib/waagent/custom-script/download/0/oom/kubernetes/oneclick# ./createAll.bash -n onap


Azure CLI Installation

Requirements

Azure subscription

OSX

https://docs.microsoft.com/en-us/cli/azure/install-azure-cli?view=azure-cli-latest

Install homebrew first (reinstall if you are on the latest OSX 10.13.2 https://github.com/Homebrew/install because of 3718)

Will install Python 3.6

$ brew update
$ brew install azure-cli

https://docs.microsoft.com/en-us/cli/azure/get-started-with-azure-cli?view=azure-cli-latest

$ az login
To sign in, use a web browser to open the page https://aka.ms/devicelogin and enter the code E..D to authenticate.
[ {
    "cloudName": "AzureCloud",
    "id": "f4...b",
    "isDefault": true,
    "name": "Pay-As-You-Go",
    "state": "Enabled",
    "tenantId": "bcb.....f",
    "user": {
      "name": "michael@....org",
      "type": "user"
    }}]
Bastion/Jumphost VM in Azure

https://docs.microsoft.com/en-us/cli/azure/install-azure-cli-apt?view=azure-cli-latest

# as root
AZ_REPO=$(lsb_release -cs)
echo "deb [arch=amd64] https://packages.microsoft.com/repos/azure-cli/ $AZ_REPO main" |      sudo tee /etc/apt/sources.list.d/azure-cli.list
apt-key adv --keyserver packages.microsoft.com --recv-keys 52E16F86FEE04B979B07E28DB02C46DF417A0893
apt-get install apt-transport-https
apt-get update && sudo apt-get install azure-cli
az login


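# verify (az --version is the direct check; the ps match below is only the Hyper-V VSS daemon, whose path contains "azure")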
root@ons-dmz:~# ps -ef | grep az
root       1427      1  0 Mar17 ?        00:00:00 /usr/lib/linux-tools/4.13.0-1011-azure/hv_vss_daemon -n

Windows Powershell

https://docs.microsoft.com/en-us/cli/azure/install-azure-cli-windows?view=azure-cli-latest

ARM Template

Follow https://docs.microsoft.com/en-us/azure/azure-resource-manager/resource-manager-create-first-template

Create a Storage Account
$ az login
To sign in, use a web browser to open the page https://aka.ms/devicelogin and enter the code E...Z to authenticate.
$ az group create --name examplegroup --location "South Central US"
{
  "id": "/subscriptions/f4b...e8b/resourceGroups/examplegroup",
  "location": "southcentralus",
  "managedBy": null,
  "name": "examplegroup",
  "properties": {
    "provisioningState": "Succeeded"
  },
  "tags": null
}
obrien:obrienlabs amdocs$ vi azuredeploy_storageaccount.json
obrien:obrienlabs amdocs$ az group deployment create --resource-group examplegroup --template-file azuredeploy_storageaccount.json
{
  "id": "/subscriptions/f4...e8b/resourceGroups/examplegroup/providers/Microsoft.Resources/deployments/azuredeploy_storageaccount",
  "name": "azuredeploy_storageaccount",
  "properties": {
    "additionalProperties": {
      "duration": "PT32.9822642S",
      "outputResources": [
        {
          "id": "/subscriptions/f4..e8b/resourceGroups/examplegroup/providers/Microsoft.Storage/storageAccounts/storagekj6....kk2w",
          "resourceGroup": "examplegroup"
        }],
      "templateHash": "11440483235727994285"},
    "correlationId": "41a0f79..90c291",
    "debugSetting": null,
    "dependencies": [],
    "mode": "Incremental",
    "outputs": {},
    "parameters": {},
    "parametersLink": null,
    "providers": [
      {
        "id": null,
        "namespace": "Microsoft.Storage",
        "registrationState": null,
        "resourceTypes": [
          {
            "aliases": null,
            "apiVersions": null,
            "locations": [
              "southcentralus"
            ],
            "properties": null,
            "resourceType": "storageAccounts"
          }]}],
    "provisioningState": "Succeeded",
    "template": null,
    "templateLink": null,
    "timestamp": "2018-02-17T16:15:11.562170+00:00"
  },
  "resourceGroup": "examplegroup"}
Pick a region
az account list-locations
# for example: northcentralus
Create a resource group
# create a resource group if not already there
az group create --name obrien_jenkins_b_westus2 --location westus2

Create a VM

We need a 128G VM with at least 8 vCores (peak is 60) and a 100+GB drive. The sizes are detailed on https://docs.microsoft.com/en-ca/azure/virtual-machines/windows/sizes-memory - we will use the Standard_D32s_v3 type (32 vCPUs, 128 GiB).

We need an "all open 0.0.0.0/0" security group and a reassociated data drive as boot drive - see the arm template in LOG-321
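The sizing requirement can be turned into a small preflight check before starting the install. This is a sketch only: the thresholds come from the sizing statement above, and the function name check_onap_capacity and the variable names are hypothetical. It takes plain numbers so it can be exercised anywhere.

```shell
#!/usr/bin/env bash
# Thresholds taken from the sizing statement above; adjust to your deployment profile.
MIN_RAM_GB=128
MIN_VCORES=8
MIN_DISK_GB=100

# check_onap_capacity RAM_GB VCORES DISK_GB
# Prints one line per shortfall and returns non-zero if any threshold is missed.
check_onap_capacity() {
  local ram_gb="$1" vcores="$2" disk_gb="$3" rc=0
  if [ "$ram_gb" -lt "$MIN_RAM_GB" ]; then
    echo "RAM ${ram_gb}G is below ${MIN_RAM_GB}G"; rc=1
  fi
  if [ "$vcores" -lt "$MIN_VCORES" ]; then
    echo "vCores ${vcores} is below ${MIN_VCORES}"; rc=1
  fi
  if [ "$disk_gb" -lt "$MIN_DISK_GB" ]; then
    echo "disk ${disk_gb}G is below ${MIN_DISK_GB}G"; rc=1
  fi
  return "$rc"
}

# Example with the nominal figures of a Standard_D32s_v3 and a 255G OS disk:
#   check_onap_capacity 128 32 255 && echo "capacity ok"
```

If you feed the check from /proc/meminfo instead of nominal sizes, lower MIN_RAM_GB first: the kernel reports less than the nominal VM size (the free output later on this page shows 132018560 kB, roughly 125 GiB, on the 128G VM).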

Get the ARM template

see open review in OOM-711


"ubuntuOSVersion": "16.04.0-LTS"
"imagePublisher": "Canonical",
"imageOffer": "UbuntuServer",
"vmSize": "Standard_E8s_v3"
"osDisk": {"createOption": "FromImage"},"dataDisks": [{"diskSizeGB": 511,"lun": 0, "createOption": "Empty" }]

Follow

https://github.com/Azure/azure-quickstart-templates/tree/master/101-acs-kubernetes

https://docs.microsoft.com/en-us/azure/azure-resource-manager/resource-group-template-deploy

https://docs.microsoft.com/en-us/cli/azure/group/deployment?view=azure-cli-latest#az_group_deployment_create

https://github.com/Azure/azure-quickstart-templates/tree/master/101-vm-simple-linux

It needs a security group https://docs.microsoft.com/en-us/azure/virtual-network/virtual-networks-create-nsg-arm-template

{
"apiVersion": "2017-03-01",
"type": "Microsoft.Network/networkSecurityGroups",
"name": "[variables('networkSecurityGroupName')]",
"location": "[resourceGroup().location]",
"tags": { "displayName": "NSG - Front End" },
"properties": {
  "securityRules": [
    {
      "name": "in-rule",
      "properties": {
        "description": "All in",
        "protocol": "Tcp",
        "sourcePortRange": "*",
        "destinationPortRange": "*",
        "sourceAddressPrefix": "Internet",
        "destinationAddressPrefix": "*",
        "access": "Allow",
        "priority": 100,
        "direction": "Inbound"
      }
    },
    {
      "name": "out-rule",
      "properties": {
        "description": "All out",
        "protocol": "Tcp",
        "sourcePortRange": "*",
        "destinationPortRange": "*",
        "sourceAddressPrefix": "*",
        "destinationAddressPrefix": "Internet",
        "access": "Allow",
        "priority": 101,
        "direction": "Outbound"
      }
    }
  ]
}
}
, 
{
      "apiVersion": "2017-04-01",
      "type": "Microsoft.Network/virtualNetworks",
      "name": "[variables('virtualNetworkName')]",
      "location": "[resourceGroup().location]",
      "dependsOn": [
        "[concat('Microsoft.Network/networkSecurityGroups/', variables('networkSecurityGroupName'))]"
      ],
      "properties": {
        "addressSpace": {
          "addressPrefixes": [
            "[variables('addressPrefix')]"
          ]
        },
        "subnets": [
          {
            "name": "[variables('subnetName')]",
            "properties": {
              "addressPrefix": "[variables('subnetPrefix')]",
              "networkSecurityGroup": {
                 "id": "[resourceId('Microsoft.Network/networkSecurityGroups', variables('networkSecurityGroupName'))]"
              }
            }
          }
        ]
      }
    },
# validate first (validate instead of create)
az group deployment create --resource-group obrien_jenkins_b_westus2 --template-file oom_azure_arm_deploy.json --parameters @oom_azure_arm_cd_amsterdam_deploy_parameters.json
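The "validate first" step can be chained with the create so a bad template never starts a deployment. A minimal sketch, assuming the az group deployment validate subcommand of the CLI generation used on this page (newer CLI releases moved these commands under az deployment group). The wrapper name validate_then_create and the AZ override variable are hypothetical.

```shell
#!/usr/bin/env bash
# AZ can be pointed at a stub (for example AZ=echo) for a dry run; default is the real CLI.
AZ=${AZ:-az}

# validate_then_create RESOURCE_GROUP TEMPLATE_FILE PARAMETERS_FILE
# Runs `az group deployment validate` and only calls `create` if validation succeeds.
validate_then_create() {
  local rg="$1" template="$2" params="$3"
  "$AZ" group deployment validate --resource-group "$rg" \
      --template-file "$template" --parameters "@$params" || return 1
  "$AZ" group deployment create --resource-group "$rg" \
      --template-file "$template" --parameters "@$params"
}

# Usage with the files from this section:
#   validate_then_create obrien_jenkins_b_westus2 oom_azure_arm_deploy.json \
#       oom_azure_arm_cd_amsterdam_deploy_parameters.json
```

Running it once with AZ=echo prints the two commands without contacting Azure, which is a cheap way to check the arguments.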

SSH into your VM and run the Kubernetes and OOM installation scripts

Use the entrypoint script in OOM-710

# clone the logging-analytics repo to get the deployment scripts
sudo git clone https://gerrit.onap.org/r/logging-analytics
# run the Rancher RI installation (to install kubernetes)
sudo logging-analytics/deploy/rancher/oom_rancher_install.sh -b master -s 192.168.240.32 -e onap
# run the oom deployment script
# get a copy of onap-parameters.yaml and place in this folder
logging-analytics/deploy/cd.sh -b master -s 192.168.240.32 -e onap

oom_rancher_install.sh is in OOM-715 under https://gerrit.onap.org/r/#/c/32019/

cd.sh is in OOM-716 under https://gerrit.onap.org/r/#/c/32653/

Delete the VM and resource group
# delete the vm and resources
az group deployment delete --resource-group ONAPAMDOCS --name oom_azure_arm_deploy
# the above deletion will not delete the actual resources - only a delete of the group or each individual resource works
# optionally delete the resource group
az group delete --name ONAPAMDOCS -y

Azure devops
create static IP

az network public-ip create --name onap-argon --resource-group a_ONAP_argon_prod_donotdelete --location eastus --allocation-method Static

ONAP on Azure Container Service

AKS Installation

Follow https://docs.microsoft.com/en-us/azure/aks/tutorial-kubernetes-deploy-cluster

Register for AKS preview via az cli
obrienbiometrics:obrienlabs michaelobrien$ az provider register -n Microsoft.ContainerService
Registering is still on-going. You can monitor using 'az provider show -n Microsoft.ContainerService'
Create an AKS resource group

Raise your AKS vCPU quota - optional

https://docs.microsoft.com/en-us/azure/azure-subscription-service-limits#container-service-aks-limits

http://aka.ms/corequotaincrease

https://portal.azure.com/#blade/Microsoft_Azure_Support/HelpAndSupportBlade/newsupportrequest

Deployment failed. Correlation ID: 4b4707a7-2244-4557-855e-11bcced556de. Provisioning of resource(s) for container service onapAKSCluster in resource group onapAKS failed. Message: Operation results in exceeding quota limits of Core. Maximum allowed: 10, Current in use: 10, Additional requested: 1. Please read more about quota increase at http://aka.ms/corequotaincrease.. Details: 

Create AKS cluster
obrienbiometrics:obrienlabs michaelobrien$ az aks create --resource-group onapAKS --name onapAKSCluster --node-count 1 --generate-ssh-keys
 - Running ..
 "fqdn": "onapaksclu-onapaks-f4....3.hcp.eastus.azmk8s.io",
AKS cluster VM granularity

The cluster will start with a 3.5G VM before scaling

Resources for your AKS cluster



Bring up AAI only for now


Design Issues

Resource Group

A resource group makes it easier to package and remove everything for a deployment - essentially making the deployment stateless

Network Security Group

Global or local to the resource group?

Follow CSEC guidelines https://www.cse-cst.gc.ca/en/system/files/pdf_documents/itsg-22-eng.pdf


Static public IP

Register a CNAME for an existing domain and use the same IP address every time the deployment comes up

Entrypoint cloud init script

How to attach the cloud init script to provision the VM

ARM template chaining

passing derived variables into the next ARM template - for example when bringing up an entire federated set in one or more DCs

see script attached to 

Troubleshooting

DNS propagation and caching

It takes about 2 min for a changed A record to propagate. For example, the following IP/DNS association took 2 min to appear in dig.

obrienbiometrics:onap_oom_711_azure michaelobrien$ dig azure.onap.info
; <<>> DiG 9.9.7-P3 <<>> azure.onap.info
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 10599
;; flags: qr rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 1
;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 512
;; QUESTION SECTION:
;azure.onap.info.		IN	A
;; ANSWER SECTION:
azure.onap.info.	251	IN	A	52.224.233.230
;; Query time: 68 msec
;; SERVER: 8.8.8.8#53(8.8.8.8)
;; WHEN: Tue Feb 20 10:26:59 EST 2018
;; MSG SIZE  rcvd: 60

obrienbiometrics:onap_oom_711_azure michaelobrien$ dig azure.onap.info
; <<>> DiG 9.9.7-P3 <<>> azure.onap.info
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 30447
;; flags: qr rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 1
;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 512
;; QUESTION SECTION:
;azure.onap.info.		IN	A
;; ANSWER SECTION:
azure.onap.info.	299	IN	A	13.92.225.167
;; Query time: 84 msec
;; SERVER: 8.8.8.8#53(8.8.8.8)
;; WHEN: Tue Feb 20 10:27:04 EST 2018
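Rather than rerunning dig by hand, a deployment script can poll until the record points at the new address. A minimal sketch, assuming dig is installed. The function name wait_for_dns and the DIG override variable are hypothetical. Note that the second column of the answer section (251 and 299 above) is the remaining TTL in seconds, so a caching resolver may keep serving the old address until it expires.

```shell
#!/usr/bin/env bash
# DIG can be replaced with a stub for testing; the default is the real dig binary.
DIG=${DIG:-dig}

# wait_for_dns NAME EXPECTED_IP [TRIES] [SLEEP_SECONDS]
# Polls the A record until it matches EXPECTED_IP; returns 1 after TRIES attempts.
wait_for_dns() {
  local name="$1" want="$2" tries="${3:-30}" pause="${4:-10}" i=1
  while [ "$i" -le "$tries" ]; do
    if "$DIG" +short "$name" A | grep -qxF "$want"; then
      echo "resolved after $i attempt(s)"
      return 0
    fi
    sleep "$pause"
    i=$((i + 1))
  done
  echo "still not resolving to $want after $tries attempts" >&2
  return 1
}

# Usage with the record from the output above:
#   wait_for_dns azure.onap.info 13.92.225.167
```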
Corporate Firewall Access
Inside the corporate firewall - avoid it

PS C:\> az login
Please ensure you have network connection. Error detail: HTTPSConnectionPool(host='login.microsoftonline.com', port=443)
: Max retries exceeded with url: /common/oauth2/devicecode?api-version=1.0 (Caused by NewConnectionError('<urllib3.conne
ction.VerifiedHTTPSConnection object at 0x04D18730>: Failed to establish a new connection: [Errno 11001] getaddrinfo fai
led',))

at home or cell hotspot

PS C:\> az login
To sign in, use a web browser to open the page https://aka.ms/devicelogin and enter the code E...2W to authenticate.
[  {    "cloudName": "AzureCloud",    "id": "4...da1",    "isDefault": true,    "name": "Microsoft Azure Internal Consumption",    "state": "Enabled",    "tenantId": "72f98....47",    "user": {      "name": "fran...ocs.com",      "type": "user"    }  }]

On a corporate account you need a permissions bump to be able to create a resource group prior to running an ARM template
https://wiki.onap.org/display/DW/ONAP+on+Kubernetes+on+Microsoft+Azure#ONAPonKubernetesonMicrosoftAzure-ARMTemplate
PS C:\> az group create --name onapKubernetes --location eastus
The client 'fra...s.com' with object id '08f98c7e-...ed' does not have authorization to per
form action 'Microsoft.Resources/subscriptions/resourcegroups/write' over scope '/subscriptions/42e...8
7da1/resourcegroups/onapKubernetes'.

Trying my personal account instead = OK
PS C:\> az login
To sign in, use a web browser to open the page https://aka.ms/devicelogin and enter the code EE...ULR to authenticate.
Terminate batch job (Y/N)? y
# hangs on the first login from a new PC - cancel and rerun
PS C:\> az login
To sign in, use a web browser to open the page https://aka.ms/devicelogin and enter the code E.PBKS to authenticate.
[  {    "cloudName": "AzureCloud",    "id": "f4b...b",    "isDefault": true,    "name": "Pay-As-You-Go",    "state": "Enabled",   "tenantId": "bcb...f4f",   "user": {      "name": "michael@obrien...org",    "type": "user"    }  }]
PS C:\> az group create --name onapKubernetes2 --location eastus
{  "id": "/subscriptions/f4b....b/resourceGroups/onapKubernetes2",  "location": "eastus",  "managedBy": null,  "name": "onapKubernetes2",  "properties": {    "provisioningState": "Succeeded"  },  "tags": null}

Design Issues

20180228: Deployment delete does not delete resources without a resourceGroup delete

Deleting a deployment removes the deployment record but not the actual resources. The workaround is to delete the resource group - but in some constrained subscriptions the cli user may not have permission to create a resource group, and hence cannot delete it either.

see

https://github.com/Azure/azure-sdk-for-java/issues/1167

Deleting the resources manually is the workaround for now if you cannot create/delete resource groups.

# delete the vm and resources
az group deployment delete --resource-group ONAPAMDOCS --name oom_azure_arm_deploy
# the above deletion will not delete the actual resources - only a delete of the group or each individual resource works
# optionally delete the resource group
az group delete --name ONAPAMDOCS -y
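For the case where the resource group itself cannot be deleted, the manual cleanup can be scripted by listing the resources in the group and deleting them one by one. A minimal sketch using az resource list and az resource delete. The function name delete_group_resources and the AZ override variable are hypothetical. Deletion order is not handled here, so a resource that is still referenced by another one (a NIC attached to a VM, for example) may fail on the first pass and need a rerun.

```shell
#!/usr/bin/env bash
# AZ can be pointed at a stub for a dry run; default is the real CLI.
AZ=${AZ:-az}

# delete_group_resources RESOURCE_GROUP
# Lists every resource id in the group and deletes them one at a time.
delete_group_resources() {
  local rg="$1" id
  "$AZ" resource list --resource-group "$rg" --query "[].id" --output tsv |
  while IFS= read -r id; do
    [ -n "$id" ] || continue
    echo "deleting $id"
    # stdin is redirected so the delete cannot swallow the remaining ids
    "$AZ" resource delete --ids "$id" </dev/null
  done
}

# Usage with the group from this section:
#   delete_group_resources ONAPAMDOCS
```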

However modifying the template to add resources works well.  For example adding a reference to a network security group

20180228: Resize the OS disk

ONAP requires at least 75g - the issue is that in most VM templates on Azure the OS disk is 30g - we need to either switch to the data disk or resize the OS disk.

# add diskSizeGB to the template
          "osDisk": {
                "diskSizeGB": 255,
                "createOption": "FromImage"
            },
ubuntu@oom-auto-deploy:~$ df
Filesystem     1K-blocks    Used Available Use% Mounted on
udev            65989400       0  65989400   0% /dev
tmpfs           13201856    8848  13193008   1% /run
/dev/sda1      259142960 1339056 257787520   1% /
tmpfs           66009280       0  66009280   0% /dev/shm
tmpfs               5120       0      5120   0% /run/lock
tmpfs           66009280       0  66009280   0% /sys/fs/cgroup
none                  64       0        64   0% /etc/network/interfaces.dynamic.d
/dev/sdb1      264091588   60508 250592980   1% /mnt
tmpfs           13201856       0  13201856   0% /run/user/1000
ubuntu@oom-auto-deploy:~$ free
              total        used        free      shared  buff/cache   available
Mem:      132018560      392336   131242164        8876      384060   131012328
20180301: Add oom_entrypoint.sh bootstrap script to install rancher and onap

in review under OOM-715 

https://jira.onap.org/secure/attachment/11206/oom_entrypoint.sh

If using amsterdam - swap out the onap-parameters.yaml  (the curl is hardcoded to a master branch version)

20180303: cloud storage access on OSX via Azure Storage Explorer

use this method instead of installing az cli directly - for certain corporate oauth configurations

https://azure.microsoft.com/en-us/features/storage-explorer/

Install Azure Storage Explorer and connect using the name and access key of a storage account created manually or via the az cli in the browser

20180318: add oom_entrypoint.sh to cloud-init on the arm template

See https://docs.microsoft.com/en-us/azure/templates/microsoft.compute/virtualmachines/extensions - it looks like Azure has a similar setup to AWS ebextensions

Targeting the extension "type" property:

type | string | Required: No | Specifies the type of the extension; an example is "CustomScriptExtension".


https://docs.microsoft.com/en-us/azure/virtual-machines/linux/extensions-customscript

deprecated
 {
    "apiVersion": "2015-06-15",
    "type": "Microsoft.Compute/virtualMachines/extensions",
    "name": "[concat(parameters('vmName'),'/onap')]",
    "location": "[resourceGroup().location]",
    "dependsOn": ["[concat('Microsoft.Compute/virtualMachines/', parameters('vmName'))]"],
    "properties": {
        "publisher": "Microsoft.Azure.Extensions",
        "type": "CustomScript",
        "typeHandlerVersion": "1.9",
        "autoUpgradeMinorVersion": true,
        "settings": {
            "fileUris": [ "https://jira.onap.org/secure/attachment/11263/oom_entrypoint.sh" ],
            "commandToExecute": "[concat('./' , parameters('scriptName'), ' -b master -s dns/pub/pri-ip -e onap' )]"
        }
    }
 }


use
    {
    "apiVersion": "2017-12-01",
    "type": "Microsoft.Compute/virtualMachines/extensions",
    "name": "[concat(parameters('vmName'),'/onap')]",
    "location": "[resourceGroup().location]",
    "dependsOn": ["[concat('Microsoft.Compute/virtualMachines/', parameters('vmName'))]"],
    "properties": {
        "publisher": "Microsoft.Azure.Extensions",
        "type": "CustomScript",
        "typeHandlerVersion": "2.0",
        "autoUpgradeMinorVersion": true,
        "settings": {
            "fileUris": [ "https://jira.onap.org/secure/attachment/11281/oom_entrypoint.sh" ],
            "commandToExecute": "[concat('./' , parameters('scriptName'), ' -b master ', ' -s ', 'ons-auto-201803181110z', ' -e onap' )]"
           }
        }
     }


ubuntu@ons-dmz:~$ ./oom_deployment.sh 

Deployment template validation failed: 'The template resource 'entrypoint' for type 'Microsoft.Compute/virtualMachines/extensions' at line '1' and column '6182' has incorrect segment lengths. A nested resource type must have identical number of segments as its resource name. A root resource type must have segment length one greater than its resource name. Please see https://aka.ms/arm-template/#resources for usage details.'.

ubuntu@ons-dmz:~$ ./oom_deployment.sh 

Deployment failed. Correlation ID: 532b9a9b-e0e8-4184-9e46-6c2e7c15e7c7. {
  "error": {
    "code": "ParentResourceNotFound",
    "message": "Can not perform requested operation on nested resource. Parent resource '[concat(parameters('vmName'),'' not found."
  }
}

fixed 20180318:1600

Install runs - but I need visibility - checking /var/lib/waagent/custom-script/download/0/

progress

./oom_deployment.sh


# 7 min to delete old deployment
ubuntu@ons-dmz:~$ az vm extension list -g a_ONAP_auto_201803181110z --vm-name ons-auto-201803181110z
..
    "provisioningState": "Creating",
 "settings": {
      "commandToExecute": "./oom_entrypoint.sh -b master  -s ons-auto-201803181110zons-auto-201803181110z.eastus.cloudapp.azure.com -e onap",
      "fileUris": [
        "https://jira.onap.org/secure/attachment/11263/oom_entrypoint.sh"  


ubuntu@ons-auto-201803181110z:~$ sudo su -
root@ons-auto-201803181110z:~# docker ps
CONTAINER ID        IMAGE                    COMMAND                  CREATED             STATUS              PORTS                              NAMES
83458596d7a6        rancher/server:v1.6.14   "/usr/bin/entry /u..."   3 minutes ago       Up 3 minutes        3306/tcp, 0.0.0.0:8880->8080/tcp   rancher_server

root@ons-auto-201803181110z:~# tail -f /var/log/azure/custom-script/handler.log
time=2018-03-18T22:51:59Z version=v2.0.6/git@1008306-clean operation=enable seq=0 file=0 event="download complete" output=/var/lib/waagent/custom-script/download/0
time=2018-03-18T22:51:59Z version=v2.0.6/git@1008306-clean operation=enable seq=0 event="executing command" output=/var/lib/waagent/custom-script/download/0
time=2018-03-18T22:51:59Z version=v2.0.6/git@1008306-clean operation=enable seq=0 event="executing public commandToExecute" output=/var/lib/waagent/custom-script/download/0
root@ons-auto-201803181110z:~# docker ps
CONTAINER ID        IMAGE                    COMMAND                  CREATED             STATUS              PORTS                              NAMES
539733f24c01        rancher/agent:v1.2.9     "/run.sh run"            13 seconds ago      Up 13 seconds                                          rancher-agent
83458596d7a6        rancher/server:v1.6.14   "/usr/bin/entry /u..."   5 minutes ago       Up 5 minutes        3306/tcp, 0.0.0.0:8880->8080/tcp   rancher_server
root@ons-auto-201803181110z:~# ls -la /var/lib/waagent/custom-script/download/0/
total 31616
-rw-r--r-- 1 root   root   16325186 Aug 31  2017 helm-v2.6.1-linux-amd64.tar.gz
-rw-r--r-- 1 root   root          4 Mar 18 22:55 kube_env_id.json
drwxrwxr-x 2 ubuntu ubuntu     4096 Mar 18 22:53 linux-amd64
-r-x------ 1 root   root       2822 Mar 18 22:51 oom_entrypoint.sh
-rwxrwxrwx 1 root   root       7288 Mar 18 22:52 oom_rancher_setup.sh
-rwxr-xr-x 1 root   root   12213376 Mar 18 22:53 rancher
-rw-r--r-- 1 root   root    3736787 Dec 20 19:41 rancher-linux-amd64-v0.6.7.tar.gz
drwxr-xr-x 2 root   root       4096 Dec 20 19:39 rancher-v0.6.7

testing via http://jenkins.onap.cloud/job/oom_azure_deployment/

Need the ip address and not the domain name - via linked template

https://docs.microsoft.com/en-ca/azure/azure-resource-manager/resource-group-linked-templates#return-state-from-a-template

or

https://docs.microsoft.com/en-us/azure/templates/microsoft.network/publicipaddresses

https://github.com/Azure/azure-quickstart-templates/issues/583

Arm templates cannot specify a static ip - without a private subnet

reference(variables('publicIPAddressName')).ipAddress

for

reference(variables('nicName')).ipConfigurations[0].properties.privateIPAddress

Using the hostname instead of the private/public ip works (verify /etc/hosts though)
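The /etc/hosts check can be done with a short lookup before the hostname is passed to the -s flag. A minimal sketch, assuming the standard hosts file layout (address first, then one or more names). The function name hosts_entry is hypothetical.

```shell
#!/usr/bin/env bash
# hosts_entry NAME [HOSTS_FILE]
# Prints the first address mapped to NAME in the hosts file; returns 1 if none.
hosts_entry() {
  local name="$1" file="${2:-/etc/hosts}"
  awk -v n="$name" '
    /^[[:space:]]*#/ { next }                  # skip comment lines
    {
      for (i = 2; i <= NF; i++) {
        if ($i ~ /^#/) break                   # ignore trailing comments
        if ($i == n) { print $1; found = 1; exit }
      }
    }
    END { exit !found }' "$file"
}

# Usage on the VM:
#   hosts_entry "$(hostname)" || echo "hostname is not in /etc/hosts"
```

An empty result does not necessarily mean the name is unresolvable, since the VM may resolve it through DNS instead; it only reports what the hosts file contains.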

obrienbiometrics:oom michaelobrien$ ssh ubuntu@13.99.207.60
ubuntu@ons-auto-201803181110z:~$ sudo su -
root@ons-auto-201803181110z:/var/lib/waagent/custom-script/download/0# cat stdout
INFO: Running Agent Registration Process, CATTLE_URL=http://ons-auto-201803181110z:8880/v1
INFO: Attempting to connect to: http://ons-auto-201803181110z:8880/v1
INFO: http://ons-auto-201803181110z:8880/v1 is accessible
INFO: Inspecting host capabilities
INFO: Boot2Docker: false
INFO: Host writable: true
INFO: Token: xxxxxxxx
INFO: Running registration
INFO: Printing Environment
INFO: ENV: CATTLE_ACCESS_KEY=9B0FA1695A3E3CFD07DB
INFO: ENV: CATTLE_HOME=/var/lib/cattle
INFO: ENV: CATTLE_REGISTRATION_ACCESS_KEY=registrationToken
INFO: ENV: CATTLE_REGISTRATION_SECRET_KEY=xxxxxxx
INFO: ENV: CATTLE_SECRET_KEY=xxxxxxx
INFO: ENV: CATTLE_URL=http://ons-auto-201803181110z:8880/v1
INFO: ENV: DETECTED_CATTLE_AGENT_IP=172.17.0.1
INFO: ENV: RANCHER_AGENT_IMAGE=rancher/agent:v1.2.9
INFO: Launched Rancher Agent: b44bd62fd21c961f32f642f7c3b24438fc4129eabbd1f91e1cf58b0ed30b5876
waiting 7 min for host registration to finish
1 more min
KUBECTL_TOKEN base64 encoded: QmFzaWMgUWpBNE5EWkdRlRNN.....Ukc1d2MwWTJRZz09
run the following if you installed a higher kubectl version than the server
helm init --upgrade
Verify all pods up on the kubernetes system - will return localhost:8080 until a host is added
kubectl get pods --all-namespaces
NAMESPACE     NAME                                   READY     STATUS    RESTARTS   AGE
kube-system   heapster-76b8cd7b5-v5jrd               1/1       Running   0          5m
kube-system   kube-dns-5d7b4487c9-9bwk5              3/3       Running   0          5m
kube-system   kubernetes-dashboard-f9577fffd-cpwv7   1/1       Running   0          5m
kube-system   monitoring-grafana-997796fcf-s4sjm     1/1       Running   0          5m
kube-system   monitoring-influxdb-56fdcd96b-2mn6r    1/1       Running   0          5m
kube-system   tiller-deploy-cc96d4f6b-fll4t          1/1       Running   0          5m

20180318: Create VM image without destroying running VM

In AWS we can select the "no reboot" option and create an image from a running VM as-is with no effect on the running system.

Having issues with the Azure image creator - it asks for the ubuntu password even though I only use key-based access

20180319: New Relic Monitoring
20180319: document devops flow

aka: travellers guide

20180319: Document Virtual Network Topology
20180429: Helm repo n/a after reboot - rerun helm serve

If make all fails after a reboot, your local helm repo server is probably not running

# rerun
helm serve &
helm repo add local http://127.0.0.1:8879
20180516: Clustered NFS share via Azure Files

Need a cloud native NFS wrapper like EFS (AWS) - looking at Azure Files

Training

(Links below from Microsoft - thank you)

General Azure Documentation

Azure Site http://azure.microsoft.com

Azure Documentation Site https://docs.microsoft.com/en-us/azure/

Azure Training Courses https://azure.microsoft.com/en-us/training/free-online-courses/

Azure Portal http://portal.azure.com

Developer Documentation

Azure AD Authentication Libraries https://docs.microsoft.com/en-us/azure/active-directory/develop/active-directory-authentication-libraries

Java Overview on Azure https://azure.microsoft.com/en-us/develop/java/

Java Docs for Azure https://docs.microsoft.com/en-us/java/azure/

Java SDK on GitHub https://github.com/Azure/azure-sdk-for-java

Python Overview on Azure https://azure.microsoft.com/en-us/develop/python/

Python Docs for Azure https://docs.microsoft.com/en-us/python/azure/

Python SDK on GitHub https://github.com/Azure/azure-sdk-for-python

REST Api and CLI Documentation

REST API Documentation https://docs.microsoft.com/en-us/rest/api/

CLI Documentation https://docs.microsoft.com/en-us/cli/azure/index

Other Documentation

Using Automation for VM shutdown & startup https://docs.microsoft.com/en-us/azure/automation/automation-solution-vm-management

Azure Resource Manager (ARM) QuickStart Templates https://github.com/Azure/azure-quickstart-templates

Known Forks

The code in this GitHub repo has 2-month-old copies of cd.sh and oom_rancher_install.sh

https://github.com/taranki/onap-azure

Use the official ONAP code in

https://gerrit.onap.org/r/logging-analytics

The original seed source from 2017 below is deprecated -  use onap links above

https://github.com/obrienlabs/onap-root

Links

https://azure.microsoft.com/en-us/services/container-service/

https://docs.microsoft.com/en-us/azure/templates/microsoft.compute/virtualmachines

https://docs.microsoft.com/en-us/azure/container-service/kubernetes/container-service-kubernetes-helm

https://kubernetes.io/docs/concepts/containers/images/#using-azure-container-registry-acr

https://azure.microsoft.com/en-us/features/storage-explorer/

https://docs.microsoft.com/en-ca/azure/virtual-machines/linux/capture-image







AKS

Google GCE

Account Provider: Michael O'Brien of Amdocs

OOM Installation on a GCE VM

The purpose of this page is to detail getting ONAP on Kubernetes (OOM) setup on a GCE VM.

I recommend using the Amazon EC2 Spot API described in ONAP on Kubernetes on Amazon EC2 - it runs around $0.12-0.25/hr at 75% off instead of the $0.60 below (33% off for reserved instances). This page is here so we can support GCE and also work with the kubernetes open source project in the space it was originally designed in at Google.

Login to your google account and start creating a 128g Ubuntu 16.04 VM

Install google command line tools 

┌─────────────────────────────────────────────────────────────────────────────────────────────────────────────┐
│                                                  Components                                                 │
├───────────────┬──────────────────────────────────────────────────────┬──────────────────────────┬───────────┤
│     Status    │                         Name                         │            ID            │    Size   │
├───────────────┼──────────────────────────────────────────────────────┼──────────────────────────┼───────────┤
│ Not Installed │ App Engine Go Extensions                             │ app-engine-go            │  97.7 MiB │
│ Not Installed │ Cloud Bigtable Command Line Tool                     │ cbt                      │   4.0 MiB │
│ Not Installed │ Cloud Bigtable Emulator                              │ bigtable                 │   3.5 MiB │
│ Not Installed │ Cloud Datalab Command Line Tool                      │ datalab                  │   < 1 MiB │
│ Not Installed │ Cloud Datastore Emulator                             │ cloud-datastore-emulator │  17.7 MiB │
│ Not Installed │ Cloud Datastore Emulator (Legacy)                    │ gcd-emulator             │  38.1 MiB │
│ Not Installed │ Cloud Pub/Sub Emulator                               │ pubsub-emulator          │  33.2 MiB │
│ Not Installed │ Emulator Reverse Proxy                               │ emulator-reverse-proxy   │  14.5 MiB │
│ Not Installed │ Google Container Local Builder                       │ container-builder-local  │   3.7 MiB │
│ Not Installed │ Google Container Registry's Docker credential helper │ docker-credential-gcr    │   2.2 MiB │
│ Not Installed │ gcloud Alpha Commands                                │ alpha                    │   < 1 MiB │
│ Not Installed │ gcloud Beta Commands                                 │ beta                     │   < 1 MiB │
│ Not Installed │ gcloud app Java Extensions                           │ app-engine-java          │ 116.0 MiB │
│ Not Installed │ gcloud app PHP Extensions                            │ app-engine-php           │  21.9 MiB │
│ Not Installed │ gcloud app Python Extensions                         │ app-engine-python        │   6.2 MiB │
│ Not Installed │ kubectl                                              │ kubectl                  │  15.9 MiB │
│ Installed     │ BigQuery Command Line Tool                           │ bq                       │   < 1 MiB │
│ Installed     │ Cloud SDK Core Libraries                             │ core                     │   5.9 MiB │
│ Installed     │ Cloud Storage Command Line Tool                      │ gsutil                   │   3.3 MiB │
└───────────────┴──────────────────────────────────────────────────────┴──────────────────────────┴───────────┘

==> Source [/Users/michaelobrien/gce/google-cloud-sdk/completion.bash.inc] in your profile to enable shell command completion for gcloud.
==> Source [/Users/michaelobrien/gce/google-cloud-sdk/path.bash.inc] in your profile to add the Google Cloud SDK command line tools to your $PATH.

gcloud init

obrienbiometrics:google-cloud-sdk michaelobrien$ source ~/.bash_profile
obrienbiometrics:google-cloud-sdk michaelobrien$ gcloud components update

All components are up to date.

Connect to your VM by getting a dynamic SSH key

 
obrienbiometrics:google-cloud-sdk michaelobrien$ gcloud compute ssh instance-1
WARNING: The public SSH key file for gcloud does not exist.
WARNING: The private SSH key file for gcloud does not exist.
WARNING: You do not have an SSH key for gcloud.
WARNING: SSH keygen will be executed to generate a key.
Generating public/private rsa key pair.
Enter passphrase (empty for no passphrase): 
Enter same passphrase again: 
Your identification has been saved in /Users/michaelobrien/.ssh/google_compute_engine.
Your public key has been saved in /Users/michaelobrien/.ssh/google_compute_engine.pub.
The key fingerprint is:
SHA256:kvS8ZIE1egbY+bEpY1RGN45ruICBo1WH8fLWqO435+Y michaelobrien@obrienbiometrics.local
The key's randomart image is:
+---[RSA 2048]----+
|    o=o+* o      |
| . .oo+*.= .     |
|o o ..=.=+.      |
|.o o ++X+o       |
|. . ..BoS        |
|     + * .       |
|    . . .        |
|   .  o o        |
|   .o. *E        |
+----[SHA256]-----+
Updating project ssh metadata.../Updated [https://www.googleapis.com/compute/v1/projects/onap-184300].                                                                                                                           
Updating project ssh metadata...done.                                                                                                                                                                                            
Waiting for SSH key to propagate.
Warning: Permanently added 'compute.2865548946042680113' (ECDSA) to the list of known hosts.
Welcome to Ubuntu 16.04.3 LTS (GNU/Linux 4.10.0-37-generic x86_64)
 * Documentation:  https://help.ubuntu.com
 * Management:     https://landscape.canonical.com
 * Support:        https://ubuntu.com/advantage
  Get cloud support with Ubuntu Advantage Cloud Guest:
    http://www.ubuntu.com/business/services/cloud
0 packages can be updated.
0 updates are security updates.
michaelobrien@instance-1:~$ 

Open up firewall rules or the entire VM

We need at least port 8880 for rancher

obrienbiometrics:20171027_log_doc michaelobrien$ gcloud compute firewall-rules create open8880 --allow tcp:8880 --source-tags=instance-1 --source-ranges=0.0.0.0/0 --description="8880"
Creating firewall...|Created [https://www.googleapis.com/compute/v1/projects/onap-184300/global/firewalls/open8880].                                                                                                             
Creating firewall...done.                                                                                                                                                                                                        
NAME      NETWORK  DIRECTION  PRIORITY  ALLOW     DENY
open8880  default  INGRESS    1000      tcp:8880

It is better to edit the existing internal firewall rule and set its source CIDR to 0.0.0.0/0
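
Where 0.0.0.0/0 is wider than needed, the same rule can be limited to a single source range. The following is a minimal sketch, not a tested procedure: print_firewall_rule is a hypothetical helper that only prints the command so it can be reviewed first, the flags are the ones already used in the transcript above, and 203.0.113.10/32 is a placeholder address.

```shell
# Print (do not run) a gcloud command that opens one TCP port to one source CIDR.
# print_firewall_rule is a hypothetical helper; the flags match the transcript above.
print_firewall_rule() {
  rule_name="$1"; port="$2"; cidr="$3"
  printf 'gcloud compute firewall-rules create %s --allow tcp:%s --source-ranges=%s --description="%s"\n' \
    "$rule_name" "$port" "$cidr" "$port"
}

# Example: limit the Rancher port 8880 to a single placeholder workstation address.
print_firewall_rule open8880 8880 203.0.113.10/32
```

Review the printed line and paste it into the shell once the CIDR is correct.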

Continue with ONAP on Kubernetes

ONAP on Kubernetes#QuickstartInstallation




Kubernetes

Kubernetes API

follow https://kubernetes.io/docs/reference/kubectl/jsonpath/

Take the server and token from ~/.kube/config and retrofit them into a REST call like the curl below, where $K8S_SERVER holds the server URL including the 6443 port

curl -k -H "Authorization: Bearer $TOKEN" -H 'Accept: application/json' "$K8S_SERVER/api/v1/pods" | jq -r '.items[0].metadata.name'
heapster-7b48b696fc-67qv6
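
The server and token can also be read out of the kubeconfig instead of being copied by hand. This is a sketch under assumptions: kubeconfig_value and first_pod_name are hypothetical helpers, and the awk lookup assumes a simple kubeconfig with one cluster and one user whose token is stored inline on a "token:" line.

```shell
# Read the first value for a key such as "server" or "token" from a kubeconfig file.
# Assumes a simple layout: one cluster, one user, values on the same line as the key.
kubeconfig_value() {
  key="$1"; file="$2"
  awk -v k="$key:" '$1 == k { gsub(/"/, "", $2); print $2; exit }' "$file"
}

# Issue the same REST call as the curl above and print the name of the first pod.
first_pod_name() {
  file="${1:-$HOME/.kube/config}"
  server=$(kubeconfig_value server "$file")
  token=$(kubeconfig_value token "$file")
  curl -sk -H "Authorization: Bearer $token" -H 'Accept: application/json' \
    "$server/api/v1/pods" | jq -r '.items[0].metadata.name'
}
```

Run first_pod_name with no arguments to use ~/.kube/config, or pass another kubeconfig path as the first argument.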


Kubernetes v1.11 curl examples

Use these to validate raw Kubernetes API calls: take the server and token from .kube/config and create a curl call with optional JSON parsing, like below
ubuntu@ip-172-31-30-96:~$ curl -k -H "Authorization: Bearer QmFzaWMgUVV........YW5SdGFrNHhNdz09" -H 'Accept: application/json' https://o...fo:8880/r/projects/1a7/kubernetes:6443/api/v1/pods | jq -r .items[0].spec.containers[0]
{
  "name": "heapster",
  "image": "docker.io/rancher/heapster-amd64:v1.5.2",
  "command": [
    "/heapster",
    "--source=kubernetes:https://$KUBERNETES_SERVICE_HOST:443?inClusterConfig=true&useServiceAccount=true",
    "--sink=influxdb:http://monitoring-influxdb.kube-system.svc.cluster.local:8086?retention=0s",
    "--v=2"
  ],
  "resources": {},
  "volumeMounts": [
    {
      "name": "io-rancher-system-token-wf6d4",
      "readOnly": true,
      "mountPath": "/var/run/secrets/kubernetes.io/serviceaccount"
    }
  ],
  "terminationMessagePath": "/dev/termination-log",
  "terminationMessagePolicy": "File",
  "imagePullPolicy": "IfNotPresent"
} 


Kubernetes Best Practices


Local nexus proxy

in progress - needs values.yaml global override

From gary https://lists.onap.org/g/onap-discuss/message/11909?p=,,,20,0,0,0::Created,,local+nexus,20,2,0,24568685

ubuntu@a-onap-devopscd:~$ docker run -d -p 5000:5000 --restart=unless-stopped --name registry -e REGISTRY_PROXY_REMOTEURL=https://nexus3.onap.org:10001 registry:2
Unable to find image 'registry:2' locally
2: Pulling from library/registry
Status: Downloaded newer image for registry:2
bd216e444f133b30681dab8b144a212d84e1c231cc12353586b7010b3ae9d24b
ubuntu@a-onap-devopscd:~$ sudo docker ps | grep registry
bd216e444f13        registry:2                                                                                                                                                 "/entrypoint.sh /e..."    2 minutes ago       Up About a minute   0.0.0.0:5000->5000/tcp             registry
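
Until the values.yaml global override is in place, one way to exercise the proxy is to rewrite image references by hand. A minimal sketch, assuming the proxy is reachable as localhost:5000 (matching the -p 5000:5000 mapping above): proxied_image is a hypothetical helper and onap/example:1.0.0 is a placeholder image name.

```shell
# Rewrite an image reference so it is pulled through the local proxy instead of nexus3.
# proxied_image is a hypothetical helper; references to other registries pass through unchanged.
proxied_image() {
  echo "$1" | sed 's#^nexus3\.onap\.org:10001/#localhost:5000/#'
}

# Placeholder image name, for illustration only.
proxied_image nexus3.onap.org:10001/onap/example:1.0.0
```

The rewritten reference can then be passed to docker pull to confirm that the proxy is serving the image.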


Verify your Kubernetes cluster is functioning properly - Tiller is up

Check the dashboard

http://dev.onap.info:8880/r/projects/1a7/kubernetes-dashboard:9090/#!/pod?namespace=_all

Check kubectl

Check that the tiller container itself is in state Running, not just the tiller-deploy pod

ubuntu@a-onap-devops:~$ kubectl get pods --all-namespaces
NAMESPACE     NAME                                       READY     STATUS    RESTARTS   AGE
kube-system   heapster-6cfb49f776-9lqt2                  1/1       Running   0          20d
kube-system   kube-dns-75c8cb4ccb-tw992                  3/3       Running   0          20d
kube-system   kubernetes-dashboard-6f4c8b9cd5-rcbp2      1/1       Running   0          20d
kube-system   monitoring-grafana-76f5b489d5-r99rh        1/1       Running   0          20d
kube-system   monitoring-influxdb-6fc88bd58d-h875w       1/1       Running   0          20d
kube-system   tiller-deploy-645bd55c5d-bmxs7             1/1       Running   0          20d
onap          logdemonode-logdemonode-5c8bffb468-phbzd   2/2       Running   0          20d
onap          onap-log-elasticsearch-7557486bc4-72vpw    1/1       Running   0          20d
onap          onap-log-kibana-fc88b6b79-d88r7            1/1       Running   0          20d
onap          onap-log-logstash-9jlf2                    1/1       Running   0          20d
onap          onap-portal-app-8486dc7ff8-tssd2           2/2       Running   0          5d
onap          onap-portal-cassandra-8588fbd698-dksq5     1/1       Running   0          5d
onap          onap-portal-db-7d6b95cd94-66474            1/1       Running   0          5d
onap          onap-portal-sdk-77cd558c98-6rsvq           2/2       Running   0          5d
onap          onap-portal-widget-6469f4bc56-hms24        1/1       Running   0          5d
onap          onap-portal-zookeeper-5d8c598c4c-hck2d     1/1       Running   0          5d
onap          onap-robot-6f99cb989f-kpwdr                1/1       Running   0          20d
ubuntu@a-onap-devops:~$ kubectl describe pod tiller-deploy-645bd55c5d-bmxs7 -n kube-system
Name:           tiller-deploy-645bd55c5d-bmxs7
Namespace:      kube-system
Node:           a-onap-devops/172.17.0.1
Start Time:     Mon, 30 Jul 2018 22:20:09 +0000
Labels:         app=helm
                name=tiller
                pod-template-hash=2016811718
Annotations:    <none>
Status:         Running
IP:             10.42.0.5
Controlled By:  ReplicaSet/tiller-deploy-645bd55c5d
Containers:
  tiller:
    Container ID:  docker://a26420061a01a5791401c2519974c3190bf9f53fce5a9157abe7890f1f08146a
    Image:         gcr.io/kubernetes-helm/tiller:v2.8.2
    Image ID:      docker-pullable://gcr.io/kubernetes-helm/tiller@sha256:9b373c71ea2dfdb7d42a6c6dada769cf93be682df7cfabb717748bdaef27d10a
    Port:          44134/TCP
    Command:
      /tiller
      --v=2
    State:          Running
      Started:      Mon, 30 Jul 2018 22:20:14 +0000
    Ready:          True


Logs

Helm Deploy plugin logs

These logs are needed to triage helm deploys that do not show up in helm list, that is, deploys that hit errors before the deployment was marked as failed

Also use --verbose

ubuntu@a-ld0:~$ sudo ls ~/.helm/plugins/deploy/cache/onap/logs/onap-
onap-aaf.log             onap-cli.log             onap-dmaap.log           onap-multicloud.log      onap-portal.log          onap-sniro-emulator.log  onap-vid.log
onap-aai.log             onap-consul.log          onap-esr.log             onap-oof.log             onap-robot.log           onap-so.log              onap-vnfsdk.log
onap-appc.log            onap-contrib.log         onap-log.log             onap-policy.log          onap-sdc.log             onap-uui.log             onap-vvp.log
onap-clamp.log           onap-dcaegen2.log        onap-msb.log             onap-pomba.log           onap-sdnc.log            onap-vfc.log             
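
To narrow down which component to look at first, the log directory can be scanned for error lines. This is a sketch only: failed_deploy_logs is a hypothetical helper, and the assumption that a failed deploy leaves the word "error" in its log has not been checked against every chart.

```shell
# List the deploy-plugin log files that contain the word "error" (case-insensitive).
# The default directory is the one listed above; pass another as the first argument.
failed_deploy_logs() {
  log_dir="${1:-$HOME/.helm/plugins/deploy/cache/onap/logs}"
  grep -il 'error' "$log_dir"/*.log 2>/dev/null
}
```

Call failed_deploy_logs with no arguments for the default location. If the logs are only readable as root, as the sudo ls above suggests, run it from a root shell.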


Monitoring

Grafana Dashboards

There is a built-in Grafana dashboard (thanks to Mandeep Khinda and James MacNider) that, once enabled, shows more detail about the cluster you are running. To reach it, expose the nodeport and target the VM that the pod is running on.

The one for the CD system is running at http://master3.onap.info:32628/dashboard/db/cluster?orgId=1&from=now-12h&to=now

# expose the nodeport
kubectl expose -n kube-system deployment monitoring-grafana --type=LoadBalancer --name monitoring-grafana-client
service "monitoring-grafana-client" exposed
# get the nodeport pod is running on
kubectl get services --all-namespaces -o wide | grep graf
kube-system   monitoring-grafana          ClusterIP      10.43.44.197    <none>                                 80/TCP                                                                       7d        k8s-app=grafana
kube-system   monitoring-grafana-client   LoadBalancer   10.43.251.214   18.222.4.161                           3000:32628/TCP                                                               15s       k8s-app=grafana,task=monitoring
# get the cluster vm DNS name
ubuntu@ip-10-0-0-169:~$ kubectl get pods --all-namespaces -o wide | grep graf
kube-system   monitoring-grafana-997796fcf-7kkl4                                1/1       Running            0          5d        10.42.84.138    ip-10-0-0-80.us-east-2.compute.internal
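
The two lookups above combine into a URL: the node name from the pods listing and the nodeport from the PORT(S) column of the services listing. A small sketch with hypothetical helper names (node_port, node_port_url), using the values from the transcript above:

```shell
# Extract the nodeport from a PORT(S) value such as "3000:32628/TCP".
# Prints nothing for a ClusterIP value such as "80/TCP", which has no nodeport.
node_port() {
  echo "$1" | sed -n 's#^[0-9]*:\([0-9]*\)/.*$#\1#p'
}

# Combine a node DNS name and a PORT(S) value into a URL.
node_port_url() {
  echo "http://$1:$(node_port "$2")"
}

node_port_url ip-10-0-0-80.us-east-2.compute.internal 3000:32628/TCP
```

The internal DNS name only resolves inside the cluster network. From outside, substitute the public name or IP of the same VM.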


See also MSB-209

Kubernetes DevOps

ONAP Development#KubernetesDevOps

Additional Tools

https://github.com/jonmosco/kube-ps1

https://github.com/ahmetb/kubectx

https://medium.com/@thisiskj/quickly-change-clusters-and-namespaces-in-kubernetes-6a5adca05615

https://kubernetes.io/docs/tasks/access-application-cluster/configure-access-multiple-clusters/

brew install kube-ps1
brew install kubectx


Openstack

Windriver Intel Lab

See OOM-714

Windriver/Openstack Lab Network Topology

Openlab VNC and CLI

The following is missing some sections and is a bit out of date (v2 is deprecated in favor of v3)

Get an openlab account - Integration / Developer Lab Access

Stephen Gooch provides excellent/fast service - raise a JIRA like the following

OPENLABS-75

Install openVPN - Using Lab POD-ONAP-01 Environment

For OSX both Viscosity and TunnelBlick work fine

Login to Openstack

Install the openstack command line tools - see Tutorial: Configuring and Starting Up the Base ONAP Stack#InstallPythonvirtualenvTools(optional,butrecommended)
Get your v3 rc file

verify your openstack cli access (or just use the jumpbox)
obrienbiometrics:aws michaelobrien$ source logging-openrc.sh 
obrienbiometrics:aws michaelobrien$ openstack server list
+--------------------------------------+---------+--------+-------------------------------+------------+
| ID                                   | Name    | Status | Networks                      | Image Name |
+--------------------------------------+---------+--------+-------------------------------+------------+
| 1ed28213-62dd-4ef6-bdde-6307e0b42c8c | jenkins | ACTIVE | admin-private-mgmt=10.10.2.34 |            |
+--------------------------------------+---------+--------+-------------------------------+------------+
Get some floating IPs (the OpenStack equivalent of elastic IPs)

You may need to release unused IPs from other tenants - as we have 4 pools of 50

fill in your stack env parameters

To fill in your config (mso) settings in values.yaml, follow the section "To generate openStackEncryptedPasswordHere" in https://onap.readthedocs.io/en/beijing/submodules/oom.git/docs/oom_quickstart_guide.html

example

ubuntu@ip-172-31-54-73:~/_dev/log-137-57171/oom/kubernetes/so/resources/config/mso$ cat encryption.key 

aa3871669d893c7fb8abbcda31b88b4f

ubuntu@ip-172-31-54-73:~/_dev/log-137-57171/oom/kubernetes/so/resources/config/mso$ echo -n "55" | openssl aes-128-ecb -e -K aa3871669d893c7fb8abbcda31b88b4f -nosalt | xxd -c 256 -p

a355b08d52c73762ad9915d98736b23b

Run the HEAT stack to create the kubernetes undercloud VMs
[michaelobrien@obrienbiometrics onap_log-324_heat(keystone_michael_o_brien)]$ openstack stack list
+--------------------------------------+--------------------------+-----------------+----------------------+----------------------+
| ID                                   | Stack Name               | Stack Status    | Creation Time        | Updated Time         |
+--------------------------------------+--------------------------+-----------------+----------------------+----------------------+
| d6371a95-dc3d-4103-978e-bab1f378573a | OOM-obrien-20181223-13-0 | CREATE_COMPLETE | 2018-12-23T14:55:10Z | 2018-12-23T14:55:10Z |
| 7f821906-2216-4a6e-8ef0-d46a97adf3fc | obrien-nexus3            | CREATE_COMPLETE | 2018-12-20T02:41:38Z | 2018-12-20T02:41:38Z |
| 9c4d3ebb-b7c9-4428-9e44-7ef5fba08940 | OOM20181216              | CREATE_COMPLETE | 2018-12-16T18:28:21Z | 2018-12-16T18:28:21Z |
| 52379aea-d0a9-48db-a13e-35ca00876768 | dcae                     | DELETE_FAILED   | 2018-03-04T22:02:12Z | 2018-12-16T05:05:19Z |
+--------------------------------------+--------------------------+-----------------+----------------------+----------------------+

[michaelobrien@obrienbiometrics onap_log-324_heat(keystone_michael_o_brien)]$  openstack stack create -t logging_openstack_13_16g.yaml -e logging_openstack_oom.env OOM-obrien-20181223-13-0
+---------------------+-----------------------------------------+
| Field               | Value                                   |
+---------------------+-----------------------------------------+
| id                  | d6371a95-dc3d-4103-978e-bab1f378573a    |
| stack_name          | OOM-obrien-20181223-13-0                |
| description         | Heat template to install OOM components |
| creation_time       | 2018-12-23T14:55:10Z                    |
| updated_time        | 2018-12-23T14:55:10Z                    |
| stack_status        | CREATE_IN_PROGRESS                      |
| stack_status_reason | Stack CREATE started                    |
+---------------------+-----------------------------------------+
ssh in

see clusters in Logging DevOps Infrastructure

obrienbiometrics:onap_log-324_heat michaelobrien$ ssh ubuntu@10.12.6.151
ubuntu@onap-oom-obrien-rancher:~$ docker version
Client:
 Version:      17.03.2-ce
 API version:  1.27
install Kubernetes stack (rancher, k8s, helm)

LOG-325

sudo git clone https://gerrit.onap.org/r/logging-analytics
cp logging-analytics/deploy/rancher/oom_rancher_setup.sh .
# 20190105 - master, casablanca and 3.0.0-ONAP are all at the same Rancher 1.6.25, Kubernetes 1.11.5, Helm 2.9.1 and docker 17.03 levels
# ignore the docker warning - as the cloud init script in the heat template already installed docker and prepulled images
sudo nohup ./oom_rancher_setup.sh -b master -s 10.0.16.1 -n onap &
# wait 90 min
kubectl get pods --all-namespaces
kubectl get pods --all-namespaces | grep 0/
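
The grep 0/ check above misses pods that are only partly up, such as 1/2. A sketch of a stricter filter, assuming the default column layout of kubectl get pods --all-namespaces (NAMESPACE NAME READY STATUS ...); not_ready_pods is a hypothetical helper name.

```shell
# Print namespace/name for pods whose READY column shows fewer ready containers
# than expected (for example 0/1 or 1/2). Completed job pods are ignored.
# Reads `kubectl get pods --all-namespaces` output on stdin; the header line is skipped.
not_ready_pods() {
  awk 'NR > 1 && $4 != "Completed" { split($3, r, "/"); if (r[1] != r[2]) print $1 "/" $2 }'
}
```

Usage: kubectl get pods --all-namespaces | not_ready_pods. An empty result means every container in every listed pod is ready.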
create the NFS share

https://onap.readthedocs.io/en/latest/submodules/oom.git/docs/oom_setup_kubernetes_rancher.html#onap-on-kubernetes-with-rancher

Scripts from above 20181207

https://jira.onap.org/secure/attachment/12887/master_nfs_node.sh

https://jira.onap.org/secure/attachment/12888/slave_nfs_node.sh

#master
ubuntu@onap-oom-obrien-rancher-0:~$ sudo ./master_nfs_node.sh 10.12.5.99 10.12.5.86 10.12.5.136 10.12.6.179 10.12.5.102 10.12.5.4
ubuntu@onap-oom-obrien-rancher-0:~$ sudo ls /dockerdata-nfs/
test.sh

#slaves
ubuntu@onap-oom-obrien-rancher-1:~$ sudo ./slave_nfs_node.sh 10.12.5.68
ubuntu@onap-oom-obrien-rancher-1:~$ sudo ls /dockerdata-nfs/
test.sh
deploy onap
# note this will saturate your 64g vm unless you run a cluster or turn off parts of onap
sudo vi oom/kubernetes/onap/values.yaml 
# rerun cd.sh
# or 
# get the dev.yaml and set any pods you want up to true as well as fill out the openstack parameters
sudo wget https://git.onap.org/oom/plain/kubernetes/onap/resources/environments/dev.yaml
sudo cp logging-analytics/deploy/cd.sh .
sudo ./cd.sh -b master -e onap -c true -d true -w false -r false

ONAP Usage

Accessing an external Node Port

Elasticsearch port example

# get pod names and the actual VM that any pod is on
ubuntu@ip-10-0-0-169:~$ kubectl get pods --all-namespaces -o wide | grep log-
onap          onap-log-elasticsearch-756cfb559b-wk8c6                           1/1       Running            0          2h        10.42.207.254   ip-10-0-0-227.us-east-2.compute.internal
onap          onap-log-kibana-6bb55fc66b-kxtg6                                  0/1       Running            16         1h        10.42.54.76     ip-10-0-0-111.us-east-2.compute.internal
onap          onap-log-logstash-689ccb995c-7zmcq                                1/1       Running            0          2h        10.42.166.241   ip-10-0-0-111.us-east-2.compute.internal
onap          onap-vfc-catalog-5fbdfc7b6c-xc84b                                 2/2       Running            0          2h        10.42.206.141   ip-10-0-0-227.us-east-2.compute.internal
# get nodeport
ubuntu@ip-10-0-0-169:~$ kubectl get services --all-namespaces -o wide | grep log-
onap          log-es                     NodePort       10.43.82.53     <none>                                 9200:30254/TCP                                                               2h        app=log-elasticsearch,release=onap
onap          log-es-tcp                 ClusterIP      10.43.90.198    <none>                                 9300/TCP                                                                     2h        app=log-elasticsearch,release=onap
onap          log-kibana                 NodePort       10.43.167.146   <none>                                 5601:30253/TCP                                                               2h        app=log-kibana,release=onap
onap          log-ls                     NodePort       10.43.250.182   <none>                                 5044:30255/TCP                                                               2h        app=log-logstash,release=onap
onap          log-ls-http                ClusterIP      10.43.81.173    <none>                                 9600/TCP                                                                     2h        app=log-logstash,release=onap
# check nodeport outside container
ubuntu@ip-10-0-0-169:~$ curl ip-10-0-0-111.us-east-2.compute.internal:30254
{
  "name" : "-pEf9q9",
  "cluster_name" : "onap-log",
  "cluster_uuid" : "ferqW-rdR_-Ys9EkWw82rw",
  "version" : {
    "number" : "5.5.0",
    "build_hash" : "260387d",
    "build_date" : "2017-06-30T23:16:05.735Z",
    "build_snapshot" : false,
    "lucene_version" : "6.6.0"
  }, "tagline" : "You Know, for Search"
}
# check inside docker container - for reference
ubuntu@ip-10-0-0-169:~$ kubectl exec -it -n onap onap-log-elasticsearch-756cfb559b-wk8c6 bash
[elasticsearch@onap-log-elasticsearch-756cfb559b-wk8c6 ~]$ curl http://127.0.0.1:9200   
{
  "name" : "-pEf9q9",

ONAP Deployment Specification

Resiliency

Longest lived deployment so far 

NAMESPACE     NAME                                    READY     STATUS    RESTARTS   AGE
kube-system   heapster-6cfb49f776-479mx               1/1       Running   7          59d
kube-system   kube-dns-75c8cb4ccb-sqxbr               3/3       Running   45         59d
kube-system   kubernetes-dashboard-6f4c8b9cd5-w5xr2   1/1       Running   8          59d
kube-system   monitoring-grafana-76f5b489d5-sj9tl     1/1       Running   6          59d
kube-system   monitoring-influxdb-6fc88bd58d-22vg2    1/1       Running   6          59d
kube-system   tiller-deploy-8b6c5d4fb-4rbb4           1/1       Running   7          19d


Performance

Cluster Performance

ONAP runs best on a large cluster.  As of 20180508 there are 152 pods (above the default limit of 110 pods per VM).  ONAP is also vCPU bound - therefore run with a minimum of 24 vCores, ideally 32 to 64.

Even though most replicaSets are set at 3, try to have at least 4 nodes so that the cluster can survive a node failure and still run all the pods.  The memory profile is currently around 85g.
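
The pod-count side of this sizing can be written as simple arithmetic. This is a rough sketch that looks only at the per-VM pod limit and ignores vCPU and memory; min_nodes is a hypothetical helper, and keeping one spare node for a failure is an assumption of the sketch.

```shell
# Minimum node count = ceil(pods / pods_per_node) + spare nodes kept free for a failure.
# 110 is the per-VM pod limit quoted above; one spare node is an assumption of this sketch.
min_nodes() {
  pods="$1"; per_node="${2:-110}"; spare="${3:-1}"
  echo $(( (pods + per_node - 1) / per_node + spare ))
}

min_nodes 152   # prints 3
min_nodes 241   # prints 4
```

The second call uses the 241 instance count quoted for the clustered case in the Deployment Profile at the top of this page, which lines up with the 4 node recommendation.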

Security Profile

To deploy, ONAP requires certain ports to be open by CIDR to several static domain names; these are defined in a security group.  At runtime the list is reduced.

Ideally these are all inside a private network.

It looks like we will need a standard public/private network locked down behind a combined ACL/SG for an AWS VPC, or an NSG for Azure, where we expose only what we need outside the private network.

The list of ports is still being worked on, but none of them should need to be exposed if we use a bastion/jumpbox + NAT combination inside the network.

Known Security Vulnerabilities

https://medium.com/handy-tech/analysis-of-a-kubernetes-hack-backdooring-through-kubelet-823be5c3d67c

https://github.com/kubernetes/kubernetes/pull/59666 fixed in Kubernetes 1.10

ONAP Port Profile

On deployment, ONAP requires the following incoming and outgoing ports.  Note: within ONAP, REST calls between components are handled inside the Kubernetes namespace by the DNS server running as part of K8S.

port | protocol | incoming/outgoing | application | source | destination | Notes
22 | ssh | | ssh | developer vm | host |
443 | | | tiller | client | host |
8880 | http | | rancher | client | host |
9090 | http | | kubernetes | | host |
10001 | https | | nexus3 | | nexus3.onap.org |
10003 | https | | nexus3 | | nexus3.onap.org |
 | https | | nexus | | nexus.onap.org |
 | https, ssh | | git | | git.onap.org |
30200-30399 | http/https | | REST api | developer vm | host |
32628 | http | | grafana | | | dashboard for the kubernetes cluster - must be enabled
5005 | tcp | | java debug port | developer vm | host |

Lockdown ports

port | protocol | incoming/outgoing | application | source | destination | Notes
8080 | | outgoing | | | |
10250-10255 | | in/out | | | | Lock these down via VPC or a source CIDR that equals only the server/client IP list - see https://medium.com/handy-tech/analysis-of-a-kubernetes-hack-backdooring-through-kubelet-823be5c3d67c


Azure Security Group

AWS VPC + Security Group


OOM Deployment Specification - 20180507 Beijing/master




The generated host registration docker call is the same as the one generated by the wiki - minus server IP (currently single node cluster)











Cluster Stability

Kubernetes cluster stability

OOM-1520

Long Duration Clusters

Single Node Deployments

A 31 day Azure deployment eventually hits the 80% FS saturation barrier - fix: LOG-853

onap          onap-vnfsdk-vnfsdk-postgres-0                                  1/1       Running            0          30d
onap          onap-vnfsdk-vnfsdk-postgres-1                                  1/1       Running            0          30d
ubuntu@a-osn-cd:~$ df
Filesystem     1K-blocks      Used Available Use% Mounted on
udev           222891708         0 222891708   0% /dev
tmpfs           44580468   4295720  40284748  10% /run
/dev/sda1      129029904 125279332   3734188  98% /
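
A long-running node can be checked against the 80% barrier before it fills up. A minimal sketch, assuming the standard six-column df layout shown above; full_filesystems is a hypothetical helper name.

```shell
# Print "mount use%" for filesystems whose Use% is at or above a threshold (default 80).
# Reads `df` output on stdin; the header line is skipped.
full_filesystems() {
  limit="${1:-80}"
  awk -v limit="$limit" 'NR > 1 { use = $5; sub(/%/, "", use); if (use + 0 >= limit + 0) print $6 " " $5 }'
}
```

Usage: df | full_filesystems 80. Against the listing above this reports only the root filesystem at 98%.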


TODO

https://docs.microsoft.com/en-us/windows/wsl/about

Links

https://kubernetes.io/docs/user-guide/kubectl-cheatsheet/

ONAP on Kubernetes#QuickstartInstallation

https://kubernetes.io/docs/tasks/run-application/run-replicated-stateful-application/

https://kubernetes.io/docs/tasks/job/fine-parallel-processing-work-queue/

