Distributed Analytics as a Service (Continuation from R4)
BUSINESS DRIVER
Executive Summary - Data-driven placement and monitoring of workloads and infrastructure require AI-based big data analytics. Supporting multiple edge sites and federated learning requires deploying analytics packages at multiple locations. This project intends to simplify and automate the deployment of the big data framework at multiple locations, thereby reducing analytics deployment time from weeks to hours.
Business Impact - Simplify operations and separate the analytics framework from analytics applications, which allows multiple data science organizations to deploy training and prediction applications without worrying about how the analytics framework is deployed and managed.
Business Markets - Applicable to physical and virtual network functions deployed in large operational networks - cellular service (4G/5G), cloud service and data center networks.
Funding/Financial Impacts - Potential to significantly reduce CAPEX.
Organization Mgmt, Sales Strategies - There are no additional organizational management or sales strategies for this use case beyond a service provider's "normal" ONAP deployment and its attendant organizational resources.
DEVELOPMENT IMPACTS
PROJECT | PTL | User Story / Epic | Requirement |
A&AI | @James Forsyth | ||
AAF | @Jonathan Gathman | ||
APPC | @Takamune Cho | ||
CLAMP | @Gervais-Martial Ngueko | ||
CC-SDK | @Dan Timoney | ||
DCAE | @Vijay Kumar | ||
DMaaP | @Mandar Sawant | ||
External API | @Matthieu Geerebaert | ||
MODELING | @Hui Deng | ||
Multi-VIM / Cloud | @Bin Yang | ||
OOF | @Sarat Puthenpura | ||
POLICY | @Pamela Dragosh | ||
PORTAL | @Manoop Talasila | ||
SDN-C | @Dan Timoney | ||
SDC | @Ofir Sonsino | ||
SO | @Seshu Kumar Mudiganti | ||
VID | @ittay | ||
VNFRQTS | @Steven wright | ||
VNF-SDK | @victor gao | ||
CDS | @Yuriy Malakov | ||
List of PTLs: Approved Projects
Technical Debt
Features that would make it in for R4
Container images and Helm charts for
Collection and Distribution stack (CollectD, Node-Exporter, cAdvisor, Prometheus, CollectD operator)
Data lake (HDFS and M3DB, M3DB operator, HDFS writer, operator for HDFS writer, and Prometheus-related remote_read and remote_write services)
Training (Spark with scikit-learn, math libraries, MLlib, etc.; Horovod; spark-k8s-operator and possibly a Horovod operator; and mode optimizer & storage service)
Inferencing (TensorFlow Serving)
Messaging (Kafka broker, ZooKeeper, Kafka operator)
Model management (MinIO and MinIO operator)
Sample applications
Placement of the stack in various locations and configuration of its components to work together.
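As a rough illustration of placing the stack at several locations, the sketch below maps package groups to sites and builds one Helm 3 command line per pair. The site names, package names, chart repository alias `analytics`, and namespace are all hypothetical and are not taken from the project. Only the `helm install` form and the `--kube-context` and `--namespace` flags are standard Helm. The commands are built as argument lists and are not executed.

```python
from dataclasses import dataclass
from typing import Dict, List

# Hypothetical sites and package groups, for illustration only.
PLACEMENT: Dict[str, List[str]] = {
    "edge-1": ["collection", "inferencing"],
    "edge-2": ["collection", "inferencing"],
    "central": ["messaging", "datalake", "training", "model-management"],
}


@dataclass(frozen=True)
class Install:
    """One package to be installed at one site."""
    site: str
    package: str

    def helm_args(self, chart_repo: str = "analytics") -> List[str]:
        # Assumes one kube context per site, named after the site.
        return [
            "helm", "install", f"{self.site}-{self.package}",
            f"{chart_repo}/{self.package}",
            "--kube-context", self.site,
            "--namespace", "analytics",
        ]


def plan(placement: Dict[str, List[str]]) -> List[Install]:
    """Flatten the site-to-packages map into an ordered install list."""
    return [
        Install(site, pkg)
        for site, pkgs in sorted(placement.items())
        for pkg in pkgs
    ]


def sites_for(placement: Dict[str, List[str]], package: str) -> List[str]:
    """Return the sites that host a given package."""
    return sorted(s for s, pkgs in placement.items() if package in pkgs)


if __name__ == "__main__":
    for step in plan(PLACEMENT):
        print(" ".join(step.helm_args()))
```

Keeping the placement as plain data separates the decision of where each package runs from how it is installed, so the same map could later drive a multi-cluster scheduler instead of per-site Helm calls.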
Features that may be postponed to R5
Various operators
Release 6 goals:
Performance features
Security (ingress, egress, inter-component, RBAC for each package)
Way to customize packages
Deployment using Multi-Cluster scheduler
GUI/CLI based configuration of stack
Monitoring of stacks - ensure that each stack component has a Prometheus target.
Some real AI-based analytics
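The monitoring goal above (each stack component has a Prometheus target) could be checked with something like the following minimal sketch. It assumes the Prometheus configuration has already been parsed into a dictionary, that each component is scraped by a job whose `job_name` equals the component name, and that targets are listed under `static_configs`. The component names and addresses are hypothetical, and jobs that rely on service discovery are not handled here.

```python
from typing import Dict, Iterable, List, Set


def jobs_with_targets(prom_config: Dict) -> Set[str]:
    """Return the names of scrape jobs that list at least one static target."""
    jobs: Set[str] = set()
    for scrape in prom_config.get("scrape_configs", []):
        targets = [
            target
            for static in scrape.get("static_configs", [])
            for target in static.get("targets", [])
        ]
        if targets:
            jobs.add(scrape["job_name"])
    return jobs


def missing_targets(components: Iterable[str], prom_config: Dict) -> List[str]:
    """Return the components that have no scrape job with a target."""
    covered = jobs_with_targets(prom_config)
    return sorted(c for c in components if c not in covered)


# Hypothetical parsed configuration and component list, for illustration only.
EXAMPLE_CONFIG = {
    "scrape_configs": [
        {"job_name": "kafka", "static_configs": [{"targets": ["kafka:9308"]}]},
        {"job_name": "m3db", "static_configs": [{"targets": ["m3db:9004"]}]},
        {"job_name": "minio", "static_configs": []},  # job exists, no targets
    ]
}
EXAMPLE_COMPONENTS = ["kafka", "m3db", "minio", "spark"]

if __name__ == "__main__":
    print(missing_targets(EXAMPLE_COMPONENTS, EXAMPLE_CONFIG))
```

A check of this kind could run after each deployment so that a component added without a scrape job is reported right away.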