Distributed Analytics as a Service (Continuation from R4)

BUSINESS DRIVER

Executive Summary - Data-driven placement and monitoring of workloads and infrastructure require AI-based big data analytics. To support multiple edges and federated learning, analytics packages must be deployed at multiple locations. This project intends to simplify and automate the deployment of the big data framework at multiple locations, thereby reducing analytics deployment time from weeks to hours.

Business Impact - Simplify operations and separate the analytics framework from analytics applications, which allows multiple data science organizations to deploy training and prediction applications without worrying about how the analytics framework is deployed and managed.

Business Markets - Applicable to physical and virtual network functions deployed in large operational networks, such as cellular service (4G/5G), cloud service, and data center networks.

Funding/Financial Impacts - Potential to significantly reduce CAPEX.

Organization Mgmt, Sales Strategies - There are no additional organizational management or sales strategies for this use case beyond a service provider's "normal" ONAP deployment and its attendant organizational resources.

DEVELOPMENT IMPACTS

PROJECT          | PTL                      | User Story / Epic | Requirement
A&AI             | @James Forsyth           |                   |
AAF              | @Jonathan Gathman        |                   |
APPC             | @Takamune Cho            |                   |
CLAMP            | @Gervais-Martial Ngueko  |                   |
CC-SDK           | @Dan Timoney             |                   |
DCAE             | @Vijay Kumar             |                   |
DMaaP            | @Mandar Sawant           |                   |
External API     | @Matthieu Geerebaert     |                   |
MODELING         | @Hui Deng                |                   |
Multi-VIM/Cloud  | @Bin Yang                |                   |
OOF              | @Sarat Puthenpura        |                   |
POLICY           | @Pamela Dragosh          |                   |
PORTAL           | @Manoop Talasila         |                   |
SDN-C            | @Dan Timoney             |                   |
SDC              | @Ofir Sonsino            |                   |
SO               | @Seshu Kumar Mudiganti   |                   |
VID              | @ittay                   |                   |
VNFRQTS          | @Steven wright           |                   |
VNF-SDK          | @victor gao              |                   |
CDS              | @Yuriy Malakov           |                   |

List of PTLs: Approved Projects

Technical Debt 

Features expected to make it into R4

  • Container images and Helm charts for:

    • Collection and distribution stack (CollectD, Node-Exporter, cAdvisor, Prometheus, CollectD operator)

    • Data lake (HDFS and M3DB, M3DB operator, HDFS writer, operator for the HDFS writer, and Prometheus-related remote_read and remote_write services)

    • Training (Spark with scikit-learn, math libraries, MLlib, etc.; Horovod; spark-k8s-operator and possibly a Horovod operator; and mode optimizer & storage service)

    • Inferencing (TensorFlow Serving)

    • Messaging (Kafka broker, ZooKeeper, Kafka operator)

    • Model management (MinIO and MinIO operator)

    • Sample applications

  • Placement of the stack at various locations and configuration of the instances to work together.

  • Features that may be postponed to R5

    • Various operators
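The placement item above could be sketched as a small planning step that maps each location to the package groups it should run. This is only an illustration: the `Site` type, the "edge"/"central" roles, and the role-to-package mapping are all assumptions made for the example, since this page does not prescribe which packages go where.

```python
from dataclasses import dataclass

# Package groups named in the R4 feature list above.
PACKAGES = {
    "collection", "data-lake", "training",
    "inferencing", "messaging", "model-management",
}

@dataclass
class Site:
    name: str
    role: str  # hypothetical role label: "edge" or "central"

# Hypothetical mapping from role to package groups (an assumption,
# not something this page specifies).
ROLE_PACKAGES = {
    "edge": ["collection", "messaging", "inferencing"],
    "central": ["messaging", "data-lake", "training", "model-management"],
}

def plan_placement(sites):
    """Return {site name: [package groups]} for the given sites."""
    plan = {}
    for site in sites:
        if site.role not in ROLE_PACKAGES:
            raise ValueError(f"unknown role {site.role!r} for site {site.name!r}")
        plan[site.name] = list(ROLE_PACKAGES[site.role])
    return plan

def unplaced_packages(plan):
    """Return package groups that no site in the plan runs."""
    placed = {pkg for pkgs in plan.values() for pkg in pkgs}
    return sorted(PACKAGES - placed)
```

Calling `unplaced_packages` on a plan gives a quick check that every package group lands at some location before any Helm chart is installed.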

Release 6 goals:

  • Performance features 

  • Security (ingress, egress, inter-component, RBAC for each package)

  • A way to customize packages

  • Deployment using Multi-Cluster scheduler

  • GUI/CLI based configuration of stack

  • Monitoring of stacks - Ensure that each stack component has a Prometheus target.

  • Some real AI-based analytics
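The monitoring goal above amounts to a coverage check: every stack component should appear among the Prometheus scrape targets. A minimal sketch follows, assuming (hypothetically) that each component is scraped under a job with the same name, ignoring case. The function name and the naming convention are illustrative and not part of this proposal.

```python
def missing_prometheus_targets(components, scrape_jobs):
    """Return the components that have no scrape job of the same name.

    Assumption for this sketch: a component counts as monitored when a
    scrape job name equals the component name (case-insensitive).
    """
    jobs = {job.lower() for job in scrape_jobs}
    return sorted(c for c in components if c.lower() not in jobs)

# Example inputs (illustrative values only).
components = ["collectd", "kafka", "minio", "tensorflow-serving"]
scrape_jobs = ["CollectD", "kafka", "node-exporter"]
uncovered = missing_prometheus_targets(components, scrape_jobs)
```

In practice the list of job names would come from the Prometheus configuration or its targets listing. The sketch takes them as plain strings so that it stays independent of any particular deployment.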

 

Attachment: Distributed_analytics_v5.pptx (Microsoft PowerPoint presentation), modified Aug 15, 2019 by Srinivasa Addepalli