

(warning) Content is being re-arranged and cleaned up.


General Background

A broad set of transformations is taking place:

  • Business transformation: OTT services, faster TTM, Monetization
  • Technical transformation: QoE, ULL, SDN/NFV/OMEC integration, Edge Analytics, Big data, Virtualization, Automation, C->E, R->E
  • Architectural transformation: 4 views “NORMA-like” Cloud, ECOMP, Flexible architecture (RAN, Core, CDN, Application delivery, Automation, IoT, fog,..)
  • Industrial transformation: ICT&E

To efficiently and effectively deploy a 5G network that supports ultra-low latency and high bandwidth, we need to deploy a variety of applications and workloads at the edge, close to the mobile end-user devices (UE or IoT). These include various virtualized RAN and core network elements, content (video), and applications such as AR/VR, industrial automation, and connected cars. We might also deploy near-real-time network optimization and customer experience / UE performance enhancement applications at the edge. The edge cloud must support deployment of third-party applications (e.g. value-added optional services, marketing, advertising). We must deploy mechanisms to collect real-time radio network information, process it in real time (e.g. geolocation data), summarize and anonymize it, and make it available to third-party applications deployed at the edge, at a central location, or outside the service provider environment. Edge data collection could also be used for training machine learning models, and fully trained models can be deployed at the edge to support network optimization.

The need

End users, other devices, and cyber-physical systems will benefit from a broad set of context information that can enhance and enrich the delivery of a wide range of applications.

Service Deployment Goal

Deliver Application SLAs while minimizing TCO.

Application Profiles


No | Application Classification (based on required RTT) | Application Examples | Network / Service Behavior Type | Deployment Component / APIs | ONAP Managed | Edge Deployment Hard / Soft Constraint (based on RTT) | Potential Application Provider | Casablanca Candidate | Additional Information
1 | Real-time (20 ms - 100 ms) | In service path optimization applications which run in the open CU-CP platform (also known as RAN Intelligent Controller, or SD-RAN controller). | Real-Time Network State Control | Open 5G CU-CP (CU - Control Plane) – VNFC | Yes | Hard | NF Vendor / Service Provider / 3rd Party | Yes | These applications include load balancing, link set-up, policies for L1-3 functions, and admission control, and leverage the standard interface defined by oRAN / xRAN between the network information base (or context database) and third party applications. Data collection is through B1 and implemented using x technology.
2 | Near-real-time (500 ms and above) | Slice monitoring, performance analysis, fault analysis, root cause analysis, SON applications, Optimization (SON Drive Test Minimization etc.), ML methodologies for various apps. | Network Analytics & Optimization | DCAE | Yes | Soft | NF Vendor / Service Provider | Yes |
3 | Near-real-time (500 ms and above) | Video Analytics, Video Optimization, Customer geolocation information, Anonymized customer data etc. | Workload Analytics, Optimization & Context processing | Cloud Edge or Cloud Central | No | Soft | 3rd Party | NA. Out of scope for ONAP. | The apps are OTT and the service provider is offering its infrastructure as a service to OTT providers.
4 | Real-time (10-20 ms) | Third party applications that directly interact with the UEs, like AR/VR, factory automation, drone control, etc. | Workload Automation / AR-VR / Content, etc. | UE or Cloud Edge | No | Hard | 3rd Party | NA. Out of scope for ONAP. | These are third party applications, developed by enterprise customers (e.g. factory automation) or content creators (AR/VR applications). In this case, messages, requests, or measurements go directly from the UE (via UPF or GWs) to the applications, and the applications respond back.
5 | same as 3) | same as 3) | Value Added Services + same as 3) | same as 3) + MEC/Cloud APIs (Note 1) | Yes | same as 3) | same as 3) | Stretch | Service Provider could be offering video surveillance (video analytics/optimization apps etc.) as a service to enterprises.
6 | same as 4) | same as 4) | Value Added Services + same as 4) | same as 4) + MEC/Cloud APIs (Note 1) | Yes | same as 4) | same as 4) | Stretch | Service Provider could be offering factory automation as a service to enterprises.

Note 1:  API Details

  • e.g., MEC APIs - Location info, Radio control info etc.
  • e.g., Cloud APIs - IaaS/PaaS + Context Awareness (time, places, activity, weather etc.)  
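
As a rough illustration of the application profile table, the sketch below maps a required RTT to the classification and the hard/soft edge deployment constraint. The function name `classify_by_rtt` is hypothetical and not part of any ONAP API. Two choices are assumptions: values under 10 ms are treated as real-time, and the range between 100 ms and 500 ms, which the table does not cover, is reported as unclassified.

```python
from typing import Tuple


def classify_by_rtt(rtt_ms: float) -> Tuple[str, str]:
    """Map a required RTT (ms) to (classification, edge constraint).

    Thresholds follow the application profile table: up to 100 ms is
    real-time with a hard edge constraint, 500 ms and above is
    near-real-time with a soft constraint.
    """
    if rtt_ms <= 0:
        raise ValueError("required RTT must be positive")
    if rtt_ms <= 100:
        return ("real-time", "hard")
    if rtt_ms >= 500:
        return ("near-real-time", "soft")
    # Assumption: the table leaves 100-500 ms undefined, so flag it
    # instead of guessing a class.
    return ("unclassified", "unknown")


if __name__ == "__main__":
    for rtt in (15, 80, 250, 600):
        print(rtt, classify_by_rtt(rtt))
```

A hard constraint here means the workload cannot meet its RTT unless it is placed at the edge; a soft constraint means edge placement is preferred but a central location remains feasible.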

Edge Infrastructure

This diverse workload will require a somewhat heterogeneous cloud environment, including Graphics Processing Units (GPUs) and highly programmable network accelerators, in addition to traditional compute and storage.

To support edge deployment, we need:

1)     Rich information / data model to discover and capture hardware resources deployed at the edge and to request the right type of resource to meet unique application needs.

2)     Must support workload deployment options such as VMs and containers (e.g. Kubernetes) on VMs or bare metal

3)     Must support anything from a very small footprint to an edge location serving a metropolitan area with a variety of workload deployments

4)     Edge cloud could be on customer premises (e.g. factory automation)

5)     Must provide efficient network infrastructure that supports slicing and QoS configuration options to meet the needs of various mobility services

6)     Must support policy-driven auto recovery and scale up / scale down


Edge Infrastructure Profiles

(warning) Example based on Akraino Edge Stack; needs to be generalized.

Profiles | Workloads | Compute | Networking | Storage | Control | Security | Edge Application Infrastructure
Large

Support for VMs and containers.


Commentary:

  • VNFs from Operators and Edge applications from customers of Operators.
  • Number of tenants to be supported???

>50 Compute Servers

Accelerators:

SRIOV based QAT for Crypto and Compression acceleration.

ML/DL Accelerators

Compute profiles: Fixed number of profiles are expected to be supported. (Will add profiles)


SRIOV Networking for high-performance Data plane VNFs.

vSwitch (OVS-DPDK) based networking for all other workloads

Multiple leaf switches and two spine switches

WAN - Underlay :

  • L3VPN Support (BGPVPN)
  • L2VPN support (E-VPN, PBB VPN, VPLS?)

Underlay realization options

  • PE at the Edge (MPLS/BGP start at the Edge) as physical appliance
  • PE at the edge as virtual appliance
  • CE- Physical at the edge
  • CE - Virtual at the edge

Overlay realization options

  • GENEVE based networks (for workload migration, redundancy and scalability)

IPv4 and IPv6 support

NAT44 with LSN (Large Scale NAT) support by providers.

Support for dedicated public IP addresses

Commentary: Network sharing among container and VM workloads will need to be supported. DVR (Distributed Virtual Routing) forwards packets locally among vSwitch-based networks. Leaf/spine switches forward traffic among SRIOV-based networks and between vSwitch-based and SRIOV-based networks.

A few fixed profiles for the following:

  • Local network profiles
  • Fabric topology profiles
  • WAN connectivity profiles

Block device support using Ceph

Dedicated nodes for storage ( 3 nodes )

Storage profiles capture whether nodes are dedicated to storage or compute nodes are used for storage, the number of storage nodes, etc.

Is support for Object storage required in Edges?

Dedicated nodes for control stack

Automation Offload Platform (Offloading ONAP) at the Edge.

A few control profiles

  • Profile 1:
    • Openstack for VM workloads
    • K8S for Container workloads
    • Dedicated nodes for VMs and containers.
  • Profile 2:
    • K8S control for both VMs and containers. No need to dedicate the computes.

Automation Offload Platform profiles consist of the following:

    • VNF Life Cycle management
    • Fabric Control
    • WAN Control
    • Analytics Offload


Transport : TLS 1.2 and above between ONAP and Edge Services

Infra Security: TPM 2.0/SGX for private key security and secret/password protection, Remote attestation to detect any software tampering of compute, storage and control nodes.



MEC Platform as a VNF to provide contextual information to Edge applications.



Medium | Same as above

Same as above.

Number of compute nodes is > 10 and < 50

Same as above | Same as above, except that no nodes are dedicated to the Ceph cluster

Same as above with respect to control, but Automation Offload Platform is not part of the Edge. No dedicated control nodes. Control functionality is shared with compute nodes.

Support for K8S profile as it can support both VMs and containers

Same as above | Same as above
Small | Same as above, but may support very few tenants | Same as above. Number of compute nodes is < 10 | Same as above, but no PE and CE at the Edge. Fabric itself acts as CE. | Same as above, no nodes dedicated to the Ceph cluster

No control at the Edge

No Automation offload platform at the Edge

Regional sites are expected to provide control and AOP services.

Support for K8S based control.

Same as above | Same as above
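
The profile table can be read as a simple lookup keyed on the number of compute nodes. The sketch below is illustrative only: `EdgeProfile` and `profile_for` are hypothetical names. The table does not say which profile applies at exactly 10 or 50 nodes, so assigning those boundary values to the smaller profile is an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EdgeProfile:
    name: str
    dedicated_storage_nodes: bool  # nodes dedicated to the Ceph cluster
    control: str                   # "dedicated", "shared" or "regional"
    aop_at_edge: bool              # Automation Offload Platform at the edge


# Attribute values taken from the Large / Medium / Small rows of the table.
LARGE = EdgeProfile("Large", True, "dedicated", True)
MEDIUM = EdgeProfile("Medium", False, "shared", False)
SMALL = EdgeProfile("Small", False, "regional", False)


def profile_for(compute_nodes: int) -> EdgeProfile:
    """Pick an infrastructure profile from the compute node count."""
    if compute_nodes < 1:
        raise ValueError("an edge site needs at least one compute node")
    if compute_nodes > 50:
        return LARGE
    # Assumption: exactly 50 nodes falls into Medium, exactly 10 into Small.
    if compute_nodes > 10:
        return MEDIUM
    return SMALL


if __name__ == "__main__":
    for nodes in (4, 25, 60):
        print(nodes, profile_for(nodes))
```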

Edge Infrastructure Profile Summary

ONAP Activity Goal #1: ONAP requires IaaS/PaaS attributes (see ongoing work – Distributed Edge Cloud Infrastructure Enablement in ONAP, 5G Items for Casablanca) from Cloud providers for Infrastructure profiles that are Distributed, Highly-secure, Config/Cloud-diverse, Capacity-constrained and Performance/Isolation-aware:

  • Distributed
    • 1000's of edge locations of varying capacity
    • Casablanca - Implementation
      • 10-100 edge locations (simple starting point)
  • Performance-awareness
    • GPU, FPGAs, SR-IOV etc.
    • Casablanca - Implementation
      • SR-IOV desired for Data Plane (5G CU-UP)
      • NIC offload desired for tunnel encap/decap e.g. 5G CU-UP GTP tunnel
  • Resource Isolation through fine-grained QoS
    • Support both Latency-sensitive and General purpose applications
    • Support ONAP Management plane components in the same cloud with Workloads
    • Casablanca - Implementation
      • Min/Max resource reservation model desired
  • Security
    • Workloads are often deployed in external (non-dc-type) locations and need HW security (TPM etc.) 
    • 3rd party applications which need additional HW security (VM, Containers in VM etc.) and SW security (Inter-component TLS etc.)
    • Casablanca - Implementation
      • Edge Clouds with private IP addresses, i.e. reachable via private connections 
      • For example, an edge cloud in a public cloud provider reachable via AWS Direct Connect, Azure ExpressRoute, or Google Partner Interconnect
  • Capacity constraints
    • Very small footprint (few nodes per physical location), Medium footprint (10's of nodes per physical location), Large footprint (100's of nodes per physical location)
    • Casablanca - Deployment
      • Need number of cores per server; need storage capacity/pool
  • Cloud Diversity
    • Private and Public Cloud Providers

    • Casablanca - Implementation
      • Note: ONAP currently supports private edge clouds based on VMware VIO, Wind River Titanium Cloud, Upstream OpenStack
      • Desire to have at least one Public Cloud Provider (Azure, AWS, GCE etc.) as an Edge Cloud Provider
        • ONAP central instantiates an Edge Cloud instance (blue cloud provider in gliffy) via an IaaS API to the cloud provider
        • ONAP central instantiates one or more ONAP edge components as needed, e.g. DCAE
        • ONAP central instantiates one or more NFs, e.g. 5G CU-UP/CP
  • Configuration Diversity
    • 5G Factory Automation, 5G General Mobility Services etc. – User Plane components (DU, CU-UP, UPF etc.) 

ONAP Edge Automation 


ONAP Activity Goal #2: Define the hierarchical ONAP Central/Edge architecture and functional interactions (API reference points) to support the aforementioned Application/Infrastructure profiles in any "Cloud" (internal Business Unit or external Partner) at any "Location" (edge, regional or central).

Feedback from OOM team

 (May 9th call / Ramki Krishnan attended OOM call and captured feedback) - Keep it Simple Stupid (KISS)

  • Suggested Approach - Separate ONAP-edge instance per 'edge domain' (i.e., separate from the onap-central instance)
    • Note: Independent of any Edge CP's Orchestration components.
  • SP uses a central-OOM with a 'policy' for deployment of an onap-edge instance, e.g., xyz edge provider with abc components, etc.
    • However, the onap-edge instance can be 'lighter weight', with only the subset of components needed (per MVP discussed below)
    • Desirable to manage it as a separate K8s cluster (i.e., separate from the onap-central instance) used only for onap-edge, i.e., not for other 'workloads' such as network apps or 3rd party apps
  • Use the External API framework to exchange requests/responses between ONAP-central and ONAP-edge instances, e.g., data summarized over longer (such as 60-min) intervals rather than detailed over shorter (such as 1-min) intervals
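
To make the 1-min versus 60-min example concrete, here is a minimal sketch of how an onap-edge instance might roll detailed samples up into longer windows before sending them to onap-central. The function name `summarize` and the record layout are assumptions for illustration; they do not reflect an actual DCAE or External API schema.

```python
from collections import defaultdict
from statistics import mean
from typing import Dict, Iterable, List, Tuple


def summarize(samples: Iterable[Tuple[int, float]],
              window_s: int = 3600) -> List[Dict[str, float]]:
    """Aggregate (epoch_seconds, value) samples into fixed windows.

    Returns one record per window, ordered by window start, holding
    the sample count and the min, max and mean of the values.
    """
    buckets = defaultdict(list)
    for ts, value in samples:
        # Align each sample to the start of its window.
        buckets[ts - ts % window_s].append(value)
    return [
        {
            "window_start": start,
            "count": len(values),
            "min": min(values),
            "max": max(values),
            "mean": mean(values),
        }
        for start, values in sorted(buckets.items())
    ]


if __name__ == "__main__":
    # Two hours of hypothetical 1-min samples collapse into two records.
    one_minute = [(i * 60, float(i)) for i in range(120)]
    for record in summarize(one_minute):
        print(record)
```

With 1-min samples and a 60-min window, the edge sends one record instead of sixty per metric, which is the kind of WAN saving the suggestion is after.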


Details:

  • Optimal Distribution of Intelligence and Control, includes distributed data collection and localized processing of intelligence
    • Support for various edge sizes
    • Scaling needs - Hierarchical federation (over and beyond auto scale-out of ONAP services) - Distribution of orchestration, fabric control, stats/faults/log collection and distributed processing of same (Regional Controllers)
    • Optimal placement of edge applications, for example placing edge applications in the best edge(s) considering various constraints (e.g. proximity to end user, radio/BW availability, cost, accelerator availability (HPA), geo-affinity regulations, trusted edge infrastructure, device characteristics, and resource availability to take up load). Auto-creation of constraints is one requirement.
    • Providing contextual information to application services after gathering information from 5G network functions.
  • Autonomic Control, Management and Operations of distributed service chains
    • Traffic steering to the right edge applications (e.g. programming the UE classifier of the UPF) and dynamic SFC within VNFCs of the edge application.
    • Supporting various workload types (VMs, Containers etc.)
    • Deploying IoT specific infrastructure software in edges such as EdgeXFoundry.
    • Supporting multi-tenancy to place workloads in Edges belonging to various organizations
    • Performance determinism and high throughput edge 
    • Securing confidential information/keys/secrets and detecting any software tampering at edges


A few examples. On scaling: OOM-based scaling may not be sufficient, and some ONAP functionality may need to be offloaded to the regional level, since the target number of edge clouds could be in the tens of thousands. Also, to reduce the amount of data sent to central ONAP services for analytics, DCAE functions need to be offloaded to the regional level, which could involve identifying real-time data sources, collecting and analyzing the data, and disseminating the output to the central ONAP function. Controlling the fabric (L2/L3 switches in edge clouds and WAN links) is another function that may require offloading some ONAP SDNC functions to regional sites.
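
The placement requirement listed under Details (best edge subject to proximity, accelerator availability, geo-affinity, cost, and so on) can be sketched as a filter over hard constraints followed by a cost minimization. This is not the OOF homing algorithm or its API; `EdgeSite`, `PlacementRequest` and `place` are hypothetical names, and only three of the listed constraints are modeled.

```python
from dataclasses import dataclass
from typing import FrozenSet, Iterable, Optional


@dataclass(frozen=True)
class EdgeSite:
    name: str
    rtt_ms: float                 # RTT from the end user to this site
    cost: float                   # relative cost of hosting here
    accelerators: FrozenSet[str]  # e.g. {"gpu", "sriov"}
    region: str                   # used for geo-affinity regulations


@dataclass(frozen=True)
class PlacementRequest:
    max_rtt_ms: float
    accelerators: FrozenSet[str] = frozenset()
    allowed_regions: Optional[FrozenSet[str]] = None  # None means any


def place(req: PlacementRequest,
          sites: Iterable[EdgeSite]) -> Optional[EdgeSite]:
    """Return the cheapest site meeting every hard constraint, or None."""
    feasible = [
        s for s in sites
        if s.rtt_ms <= req.max_rtt_ms
        and req.accelerators <= s.accelerators
        and (req.allowed_regions is None or s.region in req.allowed_regions)
    ]
    return min(feasible, key=lambda s: s.cost, default=None)


if __name__ == "__main__":
    sites = [
        EdgeSite("metro-1", 8.0, 5.0, frozenset({"gpu", "sriov"}), "eu"),
        EdgeSite("metro-2", 12.0, 3.0, frozenset({"sriov"}), "eu"),
        EdgeSite("regional-1", 40.0, 1.0, frozenset({"gpu"}), "us"),
    ]
    request = PlacementRequest(20.0, frozenset({"gpu"}), frozenset({"eu"}))
    print(place(request, sites))
```

Treating latency, accelerators and region as hard filters and cost as the objective is one possible split; a real optimizer could also weigh soft constraints.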

Example Service Model

Note - arch gliffy to be updated after tweaking this example service model

ONAP Hierarchical (Central/Edge) Architecture

Additional Notes on Gliffy

  • Cloud Provider Business Unit: Provides hosting of Workloads, i.e., IaaS/PaaS
    • SP installs and manages ONAP in separate 'Management Cloud' instances
    • SP installs and manages Network Services + 3rd Party Apps in separate 'Services/Apps Cloud' instances
  • Cloud Provider Business Unit: Provides SaaS, e.g., Analytics/Closed Loop as a Service, LCM of Apps, etc.
    • ONAP Edge may not be needed

Sequence Diagram

ONAP Edge MVP  

(warning) Edge Application (refer to app classification) / Infrastructure (refer to infra profile summary) Requirement – ONAP Project impact

Option | ONAP Central (Key Impacted Projects & Enhancements) | ONAP Edge (Key Impacted ONAP Projects & Enhancements) | Edge Cloud Functionality | Related Use Cases & Additional Notes | ONAP Central to ONAP Edge Communication | Release
A

A&AI, Multi-Cloud, Policy, APP-C, VF-C, CLAMP, DCAE, OOF etc.

OOF Enhancements

  • Example: Choosing the Cloud Region for deployment of Network Functions (PNF/VNF) based on various constraints
    • Leverage Infrastructure Events/Alerts besides Metrics for aggregate objects (Tenant, Cluster etc.) from Edge Cloud

Multi-Cloud Enhancements

  • Multi Cloud standardizes "ONSET" and "ABATE" of Infrastructure Alerts received from Edge Cloud

*** None ***
  • Analytics (Infra/App)
    • Value: Summarize data in the edge and avoid WAN bandwidth deluge
      • Generate appropriate events and alarms
    • Edge Infra Analytics 
      • Cloud 
    • Edge App Analytics 
      • VNFs
    • Closed Loop Use Cases which need only ONAP Central intervention
      • VNF Scale in/out - Proactive using app/infra predictive analytics
      • Enhanced Alarm Correlation
  • Closed Loop Use Cases which do not need ONAP intervention
    • Fault Management 
      • Cloud provider can automatically recover from VM/Host going unresponsive (e.g. heartbeat mechanism) 
      • VNF/App vendor can automatically recover from VNF/App going unresponsive (e.g. health check mechanism) 
  • ONAP requires IaaS/PaaS attributes from Cloud providers for Infrastructure profiles that are Distributed, Highly-secure, Config/Cloud-diverse, Capacity-constrained and Performance/Isolation-aware – Key Features
    • Superior Isolation for Tiered Services using Resource Reservation (Aggregate/Atomic Objects)

    • High Performance Networking Enablement (Intra-DC DPDK-based Overlay & SR-IOV)

Related Use Cases:

Notes:

  • This assumes Analytics and Fault Management Policies in Clouds and VNFs are independently configured. 
  • Single pane of glass policy management through ONAP involves managing a multi-vendor distributed policy framework and is out of scope for R3.
*** None *** | Casablanca
B

Same as Option A + ONAP Central Project(s) based on Edge DCAE Apps

  • OOM Enhancements
    • SP uses a central-OOM with a 'policy' for deployment of an onap-edge instance, e.g., xyz edge provider with abc components, etc.
      • However, onap-edge instance can be 'lighter weight' with subset of components needed (per MVP discussed below)
      • Desirable to manage it as a separate K8s cluster (i.e., separate from the onap-central instance) used only for onap-edge, i.e., not for other 'workloads' such as network apps or 3rd party apps
  • Cloudify Enhancements (Lusheng TBD)
  • ONAP Edge DCAE Microservices
    • Support New microservice based Apps –  Centralized SON applications, Optimization (SON Drive Test Minimization etc.), ML methodologies for various apps etc.
Same as Option A

Related Use Cases:

Notes:

  • Choose applications that are independent and which do not impact closed loop operations

ONAP Edge XYZ ↔ ONAP Edge API GW ↔ ONAP Central API GW ↔ ONAP Central XYZ


Casablanca
C

Same as Option B + ONAP Central Project(s) based on ONAP Edge Closed Loop

CLAMP Enhancements

  • Deploy/Manage a separate Closed Loop per ONAP Edge


  • ONAP Edge Closed Loop
    • Edge Policy
      • Static/Dynamic Policy - PDP 
        • Policy may depend on current deployment state and also might need service context for the service component such as VNFs? So, other ONAP components may be involved at the edge? 
    • Edge APP-C, VF-C, Multi-Cloud for Controller Function
    • ...
Same as Option B
Same as Option B | Casablanca+
D | Same as Option C + ONAP Central Project(s) based on ONAP Edge Service Orchestration
  • ONAP Edge Service Orchestration
    • SO for service orchestration
    • OOF for homing
    • A&AI for inventory
    • ...
Same as Option C
Same as Option C | Casablanca+

API GW Options:

  • Option 1 (desired):
    • HTTPS communication across Gateways
    • Session termination of local communication from ONAP instance (DMaaP etc.) and translation to HTTPS session to peer API GW
    • Benefit
      • Hierarchical and Scalable communication across ONAP Central and ONAP Edge instance microservices (avoid full-mesh communication)
  • Option 2:
    • Secure IPSEC communication across Gateways
    • No Session termination of local communication from ONAP instances (DMaaP etc.)
    • Benefit
      • Easy Implementation (full-mesh communication) 
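
The scalability difference between the two gateway options can be approximated with a back-of-the-envelope count of sessions crossing the WAN. The model below is an assumption made for illustration, not a measurement: with session termination (Option 1) each edge gateway keeps one HTTPS session to the central gateway, while without termination (Option 2) every central microservice may end up with its own end-to-end session to every edge microservice inside the IPsec tunnel. The function name `wan_sessions` is hypothetical.

```python
def wan_sessions(edges: int, central_services: int, edge_services: int,
                 terminate_at_gateway: bool) -> int:
    """Worst-case count of end-to-end sessions crossing the WAN.

    terminate_at_gateway=True models Option 1 (one gateway-to-gateway
    session per edge); False models Option 2 (full mesh between the
    central and edge microservices of each edge).
    """
    if min(edges, central_services, edge_services) < 0:
        raise ValueError("counts must be non-negative")
    if terminate_at_gateway:
        return edges
    return edges * central_services * edge_services


if __name__ == "__main__":
    # Hypothetical deployment: 100 edges, 5 central and 3 edge services.
    print("Option 1:", wan_sessions(100, 5, 3, True))
    print("Option 2:", wan_sessions(100, 5, 3, False))
```

Under this model the Option 1 count grows only with the number of edges, which is the hierarchical scaling benefit the option claims.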

(warning) Need to align table with Edge Infrastructure Profile Summary

Edge Requirement | ONAP role | Projects | What can be done in Casablanca?
Support for large number of Edge sites (Hierarchical Scaling)

Support external controllers that take load off ONAP (identify the changes required in ONAP to support external entities that take load off ONAP)

  • Site-specific or regional-level external controller support
    • External controllers
      • VNF LCM Controllers (that bring up, terminate, heal, migrate, configure, and monitor workloads)
      • Fabric Controller (that controls/configures L2/L3 switches at the Edge)
      • CE and PE controllers (that control WAN connectivity of Edge sites)
  • Distribution of ONAP configuration to Edge sites/regional sites
    • Policy configuration (for Closed loop control within the site)
    • Policies that help in local optimization (VNF Placement as part of scaling)
  • Support for site/region-level analytics and retrieval of aggregate data (API and model support for remote sites)
  • API support for sites/regions to send relevant topology information
  • Scaling of ONAP CA (AAF CA or ISTIO CA) to issue Intermediate CA certificates to Edge sites/regional sites

VNF LCM Controller support: SO, APP-C

Fabric and WAN controller support: SDN-C

Distribution of Policies: POLICY

Regional Site Analytics support: DCAE

Topology support: A&AI

High priority:

  • Infrastructure support to bring up regional/edge Analytics containers (with K8S, Service mesh and CA)
  • Regional/Edge level Analytics and sending only aggregated data to ONAP central DCAE

Medium Priority:

  • Regional/Edge level fabric controller
  • VNF LCM Controller at regional/edge level.
  • Closed loop control at regional/edge level, which requires selective policy synchronization from central ONAP to regional/edges.
  • Support for regional/edge level for topology and inventory repository

Low priority:

  • CE/PE controllers
Performance (Determinism, Low jitter, Low latency, high throughput)
  • Support for SRIOV-NIC (Ability for ONAP to take care of unique requirements of VIMs)
  • Support for GPUs and FPGAs (Support on-demand programming. Example: Via Openstack Cyborg)
Multi-Cloud plugins, OOF(?) and SO (question)

High Priority

  • SRIOV NIC support

Medium Priority:

  • FPGA support: Stretch as OS & K8S don't have FPGA support yet. Openstack Rocky is going to support Cyborg
Constrained Edges
  • Support for containerized workloads (Network functions) using K8S
  • Unified networking among VM and Container workloads
Multi-Cloud, SO, Modeling | POC approved
Edge Application Provisioning
  • AF (Application Function - 5G terminology) / User App LCM Proxy (MEC terminology) support
    • APIs for AF registration to ONAP
    • APIs to provide NEF reachability information to external AFs:
      • To allow external AFs to reach 5G NFs to create traffic rules in UPFs on the sites where edge applications are being brought up.
      • To register with the 5G NFs (via NEF) to get hold of contextual information.
    • Acting as proxy to reach 5G NFs from AF.
    • ONAP to provide more constraints in selecting the best region to place edge application (Cost, Latency, Bandwidth etc...)
External API, OOF | Stretch Goal
Security and Regulations
  • Constraints to support VNF placement based on data-placement regulations (such as GDPR)
  • Ensuring that secrets and passwords shared with region/site-level external controllers are well secured (using TPM/SGX)
  • ONAP ensuring that site-specific controllers/software are not tampered with

OOF

AAF (Secret Management Service, CA Service)

New project for SW tampering detection and taking actions

High Priority :

  • VNF placement based on data-placement regulations

Medium Priority:

  • Architecture and design of secret/password management with edge sites and Software tampering.
Site reachability | ONAP to interact with various site services using private IP addresses (via IPSec tunnels?) - Edge sites/Regional sites to connect to the ONAP IPSec Server. | IPSec Server (new project?), which would ensure that private IP space does not overlap across sites/regions. | High Priority
VNF image management (Reduce operational expenses)

ONAP to do centralized VNF image management

  • Ability for VNF vendors to provide a set of images for their VDUs (one per type of remote site - OpenStack based, K8S based, AWS based, Azure based).
  • Ability for VNF vendors to provide artifacts specific to remote site specific controls.
  • ONAP to manage images to sites (Pro-active basis - PUSH , on-demand basis - PULL)
    • Some images to some sites can be pushed on pro-active basis (e.g Hyperclouds) : Push images to sites whenever VNF image is uploaded in ONAP. Also, remove image when the image is removed from the ONAP.
    • Some images or some sites may not take too many images (due to persistent memory limitations or cost) and hence support for pulling images. Support for docker hub based image management for K8S based sites and Support for glance API for Openstack based edge sites : As part of instantiation request, letting the remote sites download the image.
Revive image manager project? | Highly preferred: Being addressed as part of Image Manager project.
Manageability - Dynamic Site registration

Currently each site is expected to be registered manually. Need for dynamic registration of edge-site, regional site

ONAP to provide APIs for:

  • Edge/Regional-site registration/De-registration
  • In case of regional-sites, edge sites that it controls.
  • Site status (reach-ability status, Site capacity, current VNFs, Total number of VNFs brought up so far etc...)
ESR | Stretch goal / Medium priority. Manual registration is good enough for now.
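
The Site reachability requirement in the table above calls for private IP space that does not overlap across sites/regions. A check of that kind could look like the sketch below, which uses Python's standard `ipaddress` module. The function name `overlapping_sites` and the site names and CIDR blocks in the example are hypothetical.

```python
import ipaddress
from itertools import combinations
from typing import Dict, List, Tuple


def overlapping_sites(site_cidrs: Dict[str, str]) -> List[Tuple[str, str]]:
    """Return pairs of site names whose private CIDR blocks overlap.

    site_cidrs maps a site name to one CIDR string, e.g. "10.0.0.0/16".
    """
    networks = {
        site: ipaddress.ip_network(cidr) for site, cidr in site_cidrs.items()
    }
    return [
        (site_a, site_b)
        for (site_a, net_a), (site_b, net_b)
        in combinations(sorted(networks.items()), 2)
        if net_a.overlaps(net_b)
    ]


if __name__ == "__main__":
    # Hypothetical address plan for three edge sites.
    plan = {
        "edge-a": "10.0.0.0/16",
        "edge-b": "10.0.128.0/17",
        "edge-c": "10.1.0.0/16",
    }
    print(overlapping_sites(plan))
```

Running such a check at site registration time would let a conflicting address plan be rejected before any tunnel is set up.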




