Setting up Closed Loop for K8S vFW - initial pass
- 1 Recap of the vFW closed loop test
- 2 Notes on the environment used for the investigation
- 3 Setting up a Service for the Packet Generator netconf mount
- 4 Stop the automatic vFW packet generator test
- 5 Configure the Firewall to send VES events to DCAE
- 6 VES Events from the vFirewall
- 7 Add a vserver object to AAI
- 8 Configure the vFW Closed Loop Policy
- 9 Update DCAE Consul
- 10 What Happened
- 11 The Fix - workaround
This page documents the steps taken to set up the vFW Closed Loop test as part of an investigation to identify changes needed to support the same.
The general method used was to review the operation of the integration Robot 'instantiatevFWCL' and 'vfwclosedloop' tests and replicate the steps manually for an instance of vFW deployed to a K8S cloud region.
Recap of the vFW closed loop test
The following things are needed to run the closed loop test:
APPC must be able to perform a netconf mount to the VPP honeycomb component of the Packet Generator so that the number of packet streams produced by the packet generator can be configured in response to policy reacting to a threshold event. In the case of the K8S vFW, this requires that a Service be added which exposes a NodePort (at least, this is the approach taken for this investigation).
DCAE must be able to receive VES events sent by the vFirewall component. In the K8S vFW instance, this was possible to do, although the DCAE IP and Port values were not passed in, so the VES sending application needed to be restarted with the correct values.
The statistics reported by the vSink component need to be exposed via a Service (already present in the K8S vFW helm charts). This is convenient for monitoring the behavior of the vFW and its reaction to configuration changes - whether made manually or by Policy.
Notes on the environment used for the investigation
This was done on a Dublin installation running in the Intel ONAP Integration Lab. The K8S KUD cloud region was a single-node cluster running in a VM on a separate single-server Titanium Cloud system, which has external network connectivity to the ONAP Integration system (e.g. on the 10.12.x.x network).
The K8S vFW instance was deployed per the steps described here: Deploying vFw and EdgeXFoundry Services on Kubernets Cluster with ONAP and as presented here: https://wiki.lfnetworking.org/download/attachments/15630468/ONAP_Dublin_SO_Multicloud.pdf?version=1&modificationDate=1560495527000&api=v2
Per the issue MULTICLOUD-718 (Outdated container in the multicloud-k8s-4.0.0 chart), multicloud was deleted, multicloud-k8s was updated to version 1.4.0, and multicloud was then redeployed. This was done before onboarding and distributing the K8S VF and Service.
root@onap-rancher:~/oom# git diff
diff --git a/kubernetes/multicloud/values.yaml b/kubernetes/multicloud/values.yaml
index bff78caf..00fd8c33 100644
--- a/kubernetes/multicloud/values.yaml
+++ b/kubernetes/multicloud/values.yaml
@@ -20,7 +20,7 @@ global:
nodePortPrefix: 302
loggingRepository: docker.elastic.co
loggingImage: beats/filebeat:5.5.0
- artifactImage: onap/multicloud/framework-artifactbroker:1.3.3
+ artifactImage: onap/multicloud/framework-artifactbroker:1.4.0
prometheus:
enabled: false
@@ -29,7 +29,7 @@ global:
#################################################################
# application image
repository: nexus3.onap.org:10001
-image: onap/multicloud/framework:1.3.3
+image: onap/multicloud/framework:1.4.0
pullPolicy: Always
#Istio sidecar injection policy
Setting up a Service for the Packet Generator netconf mount
In the K8S cloud region, a Service needed to be added to expose the netconf port.
For example: kubectl create -f pgservice.yaml, where the file contains:
pgservice.yaml
apiVersion: v1
kind: Service
metadata:
  name: packetgen-service
  labels:
    app: packetgen
    chart: packetgen
    release: profile1
spec:
  selector:
    app: packetgen
    release: profile1
  ports:
  - port: 2831
    nodePort: 30831
    protocol: TCP
    targetPort: 2831
  type: NodePort
This exposes the packet generator honeycomb port 2831 to the K8S cloud region nodes as NodePort 30831.
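Before attempting the netconf mount, it can help to confirm that the NodePort is actually reachable from the ONAP side. The following is a minimal sketch using only the Python standard library; the function name nodeport_reachable is hypothetical, and it only checks that a TCP connection can be opened, not that honeycomb is answering netconf correctly.

```python
import socket


def nodeport_reachable(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False


# Example call (not executed here): the KUD host IP and the NodePort
# from pgservice.yaml would be passed in, e.g.
#   nodeport_reachable("10.12.17.12", 30831)
```

A False result usually means the Service selector does not match the packet generator pod labels, or that a firewall sits between the two labs.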
On the ONAP side, the APPC netconf mount was then created using a Postman command:
PUT http://{{AAI1_PUB_IP}}:30230/restconf/config/network-topology:network-topology/topology/topology-netconf/node/ef4aa32c-0eb9-46c0-b6b0-c6a35184f07b
with a body of:
APPC Netconf Mount for vFW
<node xmlns="urn:TBD:params:xml:ns:yang:network-topology">
<node-id>ef4aa32c-0eb9-46c0-b6b0-c6a35184f07b</node-id>
<host xmlns="urn:opendaylight:netconf-node-topology">10.12.17.12</host>
<port xmlns="urn:opendaylight:netconf-node-topology">30831</port>
<username xmlns="urn:opendaylight:netconf-node-topology">admin</username>
<password xmlns="urn:opendaylight:netconf-node-topology">admin</password>
<tcp-only xmlns="urn:opendaylight:netconf-node-topology">false</tcp-only>
<!-- non-mandatory fields with default values, you can safely remove these if you do not wish to override any of these values-->
<reconnect-on-changed-schema xmlns="urn:opendaylight:netconf-node-topology">false</reconnect-on-changed-schema>
<connection-timeout-millis xmlns="urn:opendaylight:netconf-node-topology">20000</connection-timeout-millis>
<max-connection-attempts xmlns="urn:opendaylight:netconf-node-topology">0</max-connection-attempts>
<between-attempts-timeout-millis xmlns="urn:opendaylight:netconf-node-topology">2000</between-attempts-timeout-millis>
<sleep-factor xmlns="urn:opendaylight:netconf-node-topology">1.5</sleep-factor>
<!-- keepalive-delay set to 0 turns off keepalives-->
<keepalive-delay xmlns="urn:opendaylight:netconf-node-topology">120</keepalive-delay>
</node>
The node-id 'ef4aa32c-0eb9-46c0-b6b0-c6a35184f07b' is the generic-vnf-id of the deployed K8S vFW VNF.
The host IP '10.12.17.12' is the host IP of the K8S KUD cluster and the port '30831' is the exposed node port as described above.
Once this netconf mount is executed, the K8s vFW packet generator should show up in the APPC list of Mounted Resources - and the packet generator stream-count can be controlled from APPC.
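The same mount request could also be scripted instead of being issued from Postman. The sketch below only builds the PUT request and does not send it. The function name build_mount_request is hypothetical, the Content-Type header is an assumption, and only the mandatory fields of the body are included. The APPC credentials are not given on this page, so authentication is left to the caller.

```python
import urllib.request
import xml.etree.ElementTree as ET

TOPOLOGY_NS = "urn:TBD:params:xml:ns:yang:network-topology"
NETCONF_NS = "urn:opendaylight:netconf-node-topology"


def build_mount_request(appc_host, node_id, pg_host, pg_port, appc_port=30230):
    """Build (but do not send) the PUT request that mounts the packet generator."""
    url = (
        f"http://{appc_host}:{appc_port}/restconf/config/"
        "network-topology:network-topology/topology/topology-netconf/"
        f"node/{node_id}"
    )
    # Mandatory fields only; the optional fields keep their defaults.
    body = f"""<node xmlns="{TOPOLOGY_NS}">
  <node-id>{node_id}</node-id>
  <host xmlns="{NETCONF_NS}">{pg_host}</host>
  <port xmlns="{NETCONF_NS}">{pg_port}</port>
  <username xmlns="{NETCONF_NS}">admin</username>
  <password xmlns="{NETCONF_NS}">admin</password>
  <tcp-only xmlns="{NETCONF_NS}">false</tcp-only>
</node>"""
    ET.fromstring(body)  # fail early if the body is not well-formed XML
    return urllib.request.Request(
        url,
        data=body.encode("utf-8"),
        method="PUT",
        headers={"Content-Type": "application/xml"},  # assumed header
    )
```

The returned request can be passed to urllib.request.urlopen once an Authorization header for the APPC restconf interface has been added.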
Stop the automatic vFW packet generator test
Note - after it is deployed, the packet generator automatically starts a script called run_traffic_fw_demo.sh.
This script alternates between running 1 and 10 packet streams (e.g. 100 or 1000 packets per second).
Because it changes the stream count on its own, it interferes with the closed loop test and should be terminated before the test is run.
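One way to stop the script is to look it up in the process list and send it SIGTERM. The sketch below assumes that Python and the ps command are available inside the packet generator, which this page does not confirm; the helper names find_pids and stop_script are hypothetical.

```python
import os
import signal
import subprocess


def find_pids(ps_output, script_name):
    """Return PIDs from `ps -eo pid,args` output whose command line mentions script_name."""
    pids = []
    for line in ps_output.splitlines():
        parts = line.split(None, 1)
        # The header row ("PID COMMAND") is skipped by the isdigit check.
        if len(parts) == 2 and parts[0].isdigit() and script_name in parts[1]:
            pids.append(int(parts[0]))
    return pids


def stop_script(script_name="run_traffic_fw_demo.sh"):
    """Send SIGTERM to every process running the given script."""
    result = subprocess.run(
        ["ps", "-eo", "pid,args"], capture_output=True, text=True, check=True
    )
    for pid in find_pids(result.stdout, script_name):
        os.kill(pid, signal.SIGTERM)
```

Calling stop_script() inside the packet generator would terminate the traffic script; the stream count it last configured stays in effect until it is changed through APPC.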
Configure the Firewall to send VES events to DCAE
The vFirewall component sends VES using the 'vpp_measurement_reporter' program found in the directory '/opt/VES/evel/evel-library/code/VESreporting'.
By default this will be running following deployment, but it will be using the wrong parameters. To fix, do the following:
Edit the files /opt/config/dcae_collector_ip.txt and /opt/config/dcae_collector_port.txt and place in them the IP and port for the DCAE collector of the ONAP.
For example:
vFirewall DCAE collector configuration
Where the address '10.12.5.63' is a Host IP of one of the ONAP cluster nodes and port '30235' is the port of the DCAE VES collector service.
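As a sketch of the edit described above, the two files could be written as follows. The helper name write_collector_config is hypothetical, and writing each value on a single line followed by a newline is an assumption about what the VES reporting scripts expect.

```python
from pathlib import Path


def write_collector_config(ip, port, config_dir="/opt/config"):
    """Write the DCAE collector IP and port into the vFirewall config files."""
    cfg = Path(config_dir)
    # One value per file, newline-terminated (assumed format).
    (cfg / "dcae_collector_ip.txt").write_text(f"{ip}\n")
    (cfg / "dcae_collector_port.txt").write_text(f"{port}\n")


# Example call (not executed here), using the values from this page:
#   write_collector_config("10.12.5.63", 30235)
```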
Terminate the 'vpp_measurement_reporter' process if it is currently running and restart it with the new configuration by running the 'go-client.sh' script, which is also found in the directory '/opt/VES/evel/evel-library/code/VESreporting'
go-client.sh
VES Events from the vFirewall
The current K8S vFirewall sends out a VES event that looks like this:
K8S vFW VES Event
There are a few key values in this event. One of these is the 'sourceName' field. The 'sourceName' is used as the vserver name in AAI to correlate the event with a vserver.
In the current K8S vFW demo, the sourceName is 'k8s-testing'. This will need to be made instance specific in the future.
NOTE: Further investigation reveals that the vFW obtained the sourceName of 'k8s-testing' by making an OpenStack metadata service query and using the name returned in the response. 'k8s-testing' was the OpenStack instance name of the VM in which the KUD cloud region was running. Adding a route in the vFW to reject the network used for OpenStack metadata (e.g. 169.254.0.0/16) causes the vFW VES code to default to the vFW hostname - which is the name of the vFW pod (e.g. profile1-firewall-6558957c88-2rxdh).
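The exact route command used is not recorded on this page. The sketch below shows one possible form, using the iproute2 'unreachable' route type, and checks that the well-known OpenStack metadata address 169.254.169.254 falls inside the rejected network. The helper names are hypothetical, and the command would need to be run as root inside the vFW.

```python
import ipaddress

METADATA_NET = ipaddress.ip_network("169.254.0.0/16")


def is_metadata_address(addr):
    """Return True if addr lies in the link-local network used for metadata."""
    return ipaddress.ip_address(addr) in METADATA_NET


def reject_route_command(network=METADATA_NET):
    """Return an `ip route` argument list that makes the network unreachable."""
    # Assumed form; a net-tools `route add -net ... reject` would be equivalent.
    return ["ip", "route", "add", "unreachable", str(network)]
```

The returned list can be handed to subprocess.run; once the route is in place, the metadata query fails and the VES code falls back to the pod hostname.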
Add a vserver object to AAI
The Robot tests run 'Heatbridge' to update AAI with details about the deployed VNF. See AAI Update after Resource Instantiation for more information about Heatbridge and AAI update.
At this time, there is no heatbridge or AAI code for the K8S vFW deployments. So, in support of handling the AAI enrichment process by looking up via vserver, the following AAI object is added to AAI manually.
NOTE: As mentioned just above, the sourceName in this example happened to be 'k8s-testing', but the suggested approach is to use the pod name for the vserver-name.
PUT https://{{AAI1_PUB_IP}}:{{AAI1_PUB_PORT}}/aai/v11/bulkadd
with body:
K8S vFW vserver AAI example
Configure the vFW Closed Loop Policy
Once the above is done, all that is needed is to update the vFWCL policy to match this service.
Note - this sequence was created by closely reviewing the way the Robot vFWCL instantiation modifies the closed loop policy configuration, and then copying that sequence with appropriate modifications for the K8S vFW deployment.
All of the following curl commands are done from inside the ONAP robot pod (probably doesn't have to be robot specifically, but that one worked).
Create vFirewall Monitoring Policy
First check the health.
Then :
where the body is:
newpolicytype.json
The modification in the above body is that the two occurrences of 'closedLoopControlName' are set to "ControlLoop-vFirewall-fb6f9541-32e6-44df-a312-9a67320c0b08" where fb6f9541-32e6-44df-a312-9a67320c0b08 is the Model Invariant ID of the VNF model for the K8S vFW. This can be found (for example) from the "Model ID" line on the VID for the VNF.
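The naming rule described above can be sketched as follows. The helper names are hypothetical, and the recursive setter works on any parsed JSON structure; it makes no assumption about the actual layout of newpolicytype.json, which is not reproduced here.

```python
def closed_loop_control_name(model_invariant_id):
    """Build the control loop name from the VNF model invariant ID."""
    return f"ControlLoop-vFirewall-{model_invariant_id}"


def set_control_loop_name(obj, name):
    """Set every 'closedLoopControlName' key in a parsed JSON structure.

    Returns the number of values that were replaced.
    """
    count = 0
    if isinstance(obj, dict):
        for key, value in obj.items():
            if key == "closedLoopControlName":
                obj[key] = name
                count += 1
            else:
                count += set_control_loop_name(value, name)
    elif isinstance(obj, list):
        for item in obj:
            count += set_control_loop_name(item, name)
    return count
```

For the monitoring policy body the returned count should be 2, matching the two occurrences mentioned above.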
The response to the POST was:
Create vFWCL Operational Policy
Now issue this command:
where the body is:
newoppolicy.json
Notice again that the VNF invariant model ID of fb6f9541-32e6-44df-a312-9a67320c0b08 has been placed in the body twice - as part of the 'controlLoopName' and as the 'resourceID' value.
The response from the command is:
Push vFirewall Policies to PDP Group
Now issue the following command:
Where the body is:
Note here that the first digit of the 'policy-version' of "4.0.0" was taken from the "policy-version": "4" that was returned in the previous step where the Operational policy was posted.
The response to this command is just: "{}"
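The relationship between the 'policy-version' returned when the Operational policy is created and the value used in the push body can be written down as a one-line rule. This is a sketch of that rule only; the function name pdp_policy_version is hypothetical.

```python
def pdp_policy_version(returned_version):
    """Turn the returned policy-version (e.g. "4") into the pushed form (e.g. "4.0.0")."""
    major = int(str(returned_version).strip())
    return f"{major}.0.0"
```

With the values seen in this investigation, a returned "4" gives "4.0.0", and a returned "1" gives "1.0.0".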
Validate the vFWCL Policy
This is a query that the Robot test does after the above steps.
Where the response in this case was:
Update DCAE Consul
A setting in Consul needs to be updated as well. Read how to do it here: https://onap.readthedocs.io/en/latest/submodules/integration.git/docs/docs_vfw.html#preconditions
For this example, the two occurrences of 'closedLoopControlName' were changed to ControlLoop-vFirewall-fb6f9541-32e6-44df-a312-9a67320c0b08 (same as in the above commands).
What Happened
After making the above changes to the vFW components and the policy configuration, it was time to try testing the closed loop operation.
Using the APPC to change the packet generator stream count to either '1' or '10' - it was observed that events were being received by the VES collector and forwarded on.
AAI enrichment by vserver query was successful. However, the policy did not work.
Looking at the drools network.log file, there were lots of events being handled, even for vservers that did not exist - i.e. they were standard vFWCL instances that had been generated previously by robot on Openstack.
The K8S vFW policy appeared to be in an already 'locked' state.
The Fix - workaround
The behavior was confusing; the workaround that resolved it was to:
go through a helm delete and redeploy sequence for the policy component
e.g. from the rancher node, in ~/oom/kubernetes, do the following:
After the policy pods were all running again, the above 3 policy configuration steps were executed again. The only difference this time was in step 3: the 'policy-version' was '1.0.0', since '1' was returned for 'policy-version' in step 2.
After this was done, the VES events were started up again on the K8S vFW (they had been stopped). Then, using APPC, the packet generator was configured to either 1 or 10 streams (it had been manually set to 5 before testing started, so that the test began within the policy thresholds).
Now, the policy worked and APPC was used automatically to set the streams back to 5.
The following picture shows how the stream count was set several times over an hour and how policy set it back to 5 each time.
In the following example, the vFW virtlet