L7 Proxy Service Mesh Controller
- 1 Background
- 2 Design Overview
- 3 JIRA
- 4 API
- 4.1 RESTful North API (with examples)
- 4.1.1 1. Inbound access
- 4.1.2 2. Outbound access
- 4.1.2.1 POST
- 4.1.3 3. Compound Service access
- 4.1.4 Scenarios supported for the current release
- 5 External DNS - Design and intent API
- 6 Internal Design details
- 7 Sequence flows
- 8 Test cases
Background
L7 Proxy Service Mesh Controller intends to provide connectivity, shape traffic, apply policies and RBAC, and provide
mutual TLS for applications/microservices running across clusters (with a service mesh), within a cluster,
and with external applications. The available functionality is subject to the capabilities of the underlying service mesh technology.
Design Overview
Traffic Controller Design Internals
Internal Implementation Details
NOTE - The current implementation supports the Istio service mesh, an SD-WAN load balancer, and ExternalDNS as the DNS provider. The plugin architecture of the controller makes it extensible to any service mesh technology and any external load balancer. It is also designed to configure and communicate with external DNS servers.
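The plugin architecture described in the note can be sketched as a small registry keyed by service-mesh type. This is a minimal illustration, assuming a hypothetical `MeshPlugin` interface; the names here are not the controller's actual API.

```go
package main

import "fmt"

// MeshPlugin is a hypothetical interface that a service-mesh backend
// (e.g. Istio) or an external load-balancer plugin could implement.
type MeshPlugin interface {
	Name() string
	// ApplyIntent converts a traffic intent into mesh-specific resources.
	ApplyIntent(intent map[string]string) error
}

// registry maps a service-mesh type (taken from the cluster record,
// e.g. "istio") to the plugin that handles it.
var registry = map[string]MeshPlugin{}

// Register adds a plugin under its name.
func Register(p MeshPlugin) { registry[p.Name()] = p }

// Lookup returns the plugin for a given service-mesh type.
func Lookup(mesh string) (MeshPlugin, bool) {
	p, ok := registry[mesh]
	return p, ok
}

// istioPlugin is a stub standing in for the real Istio backend.
type istioPlugin struct{}

func (istioPlugin) Name() string { return "istio" }
func (istioPlugin) ApplyIntent(intent map[string]string) error {
	fmt.Println("generating Istio resources for", intent["servicename"])
	return nil
}

func main() {
	Register(istioPlugin{})
	if p, ok := Lookup("istio"); ok {
		p.ApplyIntent(map[string]string{"servicename": "httpbin"})
	}
}
```

A new mesh or load-balancer backend would only need to implement the interface and register itself; the controller core stays unchanged.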
JIRA
Component | JIRA Items |
---|---|
| MULTICLOUD-913: Define APIs exposed by Traffic Controller (Closed) |
2. Controller Interface, Backend Process | MULTICLOUD-914: Design internal architecture for generating ISTIO configuration (Closed) |
3. Developing backend code with interfaces | MULTICLOUD-915: Implement traffic controller as a plugin to k8s plugin (Closed) |
4. Load balancer configuration (Firewall, IPVS, NAT, and other L3 connectivity) | MULTICLOUD-924: Configure Firewall configuration as required by SD-WAN controller (Closed); MULTICLOUD-925: Configure NAT configuration as required by SD-WAN controller for each cluster (Closed); MULTICLOUD-926: Configure IPVS Rules configuration as required by SD-WAN controller (Closed) |
5. External DNS Configuration | MULTICLOUD-909: Manage external DNS configuration for user-facing service traffic (Closed) |
6. Testing | |
7. Documentation | MULTICLOUD-923: Wiki documentation of various scenario testing (Closed) |
Elements of Traffic Controller with ISTIO as the service mesh
Gateways - the inbound/outbound access point for the service mesh; implemented as an Envoy service
VirtualServices - expose a service outside the service mesh
DestinationRule - apply rules to the traffic flow
AuthorizationPolicy - authorization for service access
ServiceEntry - add an external service into the mesh
Authentication Policy - authenticate external communication
These Kubernetes resources are generated per cluster; multiple instances of each may be created depending on the intents.
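The mapping from intent type to generated Istio resources can be sketched as follows. The exact sets are illustrative, inferred from the `{istioobject - ...}` annotations in the API examples; they are not a definitive specification.

```go
package main

import "fmt"

// resourcesForIntent sketches which Istio objects the controller would
// generate for each intent type. The mapping is an assumption derived
// from the element list and the {istioobject - ...} annotations in the
// API examples, not the controller's actual output.
func resourcesForIntent(intentType string) []string {
	switch intentType {
	case "inbound":
		return []string{"Gateway", "VirtualService", "DestinationRule", "AuthorizationPolicy"}
	case "outbound":
		return []string{"ServiceEntry", "DestinationRule"}
	default:
		return nil
	}
}

func main() {
	fmt.Println(resourcesForIntent("inbound"))
}
```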
API
RESTful North API (with examples)
Types | Intent APIs | Functionality |
---|---|---|
1. Outbound service communication | /v2/projects/{project-name}/composite-apps/blue-app/{version}/traffic-intent-set/outbound-intents/ | Define outbound traffic for a service |
2. Inbound service communication | /v2/projects/{project-name}/composite-apps/blue-app/{version}/traffic-intent-set/inbound-intents/ | Define inbound traffic for a service |
3. Compound service communication | /v2/projects/{project-name}/composite-apps/blue-app/{version}/traffic-intent-set/compound-intents/{compound-intent-name}/inbound-intents/ | Define a virtual path for connecting to multiple services |
URL: /v2/projects/{project-name}/composite-apps/{composite-app-name}/{version}/traffic-intent-set
POST BODY:
{
  "name": "john",
  "description": "Traffic intent groups",
  "set": [
    {
      "inbound": "abc"
    },
    {
      "outbound": "abc"
    }
  ]
}
1. Inbound access
POST
URL: /v2/projects/{project-name}/composite-apps/blue-app/{version}/traffic-intent-set/inbound-intents/
POST BODY:
{
  "metadata": {
    "name": "<>", // unique name for each intent
    "description": "connectivity intent for inbound communication",
    "userdata1": <>,
    "userdata2": <>
  },
  "spec": { // update the memory allocation for each field as per OpenAPI standards
    "application": "<app1>",
    "servicename": "httpbin", // actual name of the client service - {istioobject - serviceEntry of client's cluster}
    "externalName": "httpbin.k8s.com",
    "traffic-weight": "", // default is "". Used for redirecting a traffic percentage when the compound API is called
    "protocol": "HTTP",
    "headless": "false", // default is false. "true" ensures all instances of a headless service have access to the client service
    "mutualTLS": "MUTUAL", // default is SIMPLE. MUTUAL enforces mTLS - {istioobject - destinationRule}
    "port": "80", // port on which the service is exposed through the service mesh, not the port it actually runs on
    "serviceMesh": "istio", // taken from the cluster record
    "sidecar-proxy": "yes", // features (mTLS, LB, circuit breaking) are not available to services without istio-proxy; only inbound routing is possible
    // Traffic management fields below are valid only if sidecar-proxy is set to "yes"
    "traffic-management-info": {
      // Traffic configuration - load balancing is applied per service; traffic to this service is distributed among the pods under it
      "loadbalancingType": "ConsistentHash", // "Simple" and "ConsistentHash" are the two modes - {istioobject - destinationRule}
      "loadBalancerMode": "httpCookie", // modes for ConsistentHash: "httpHeaderName", "httpCookie", "useSourceIP", "minimumRingSize"; modes for Simple: "LEAST_CONN", "ROUND_ROBIN", "RANDOM", "PASSTHROUGH". The mode must be chosen explicitly - {istioobject - destinationRule}
      "httpCookie": "user1", // name of the cookie used to maintain sticky sessions - {istioobject - destinationRule}
      // Circuit breaking
      "maxConnections": 10, // connection pool for TCP and HTTP traffic - {istioobject - destinationRule}
      "concurrenthttp2Requests": 1000, // concurrent HTTP/2 requests allowed - {istioobject - destinationRule}
      "httpRequestPerConnection": 100, // number of HTTP requests per connection; valid only for HTTP traffic - {istioobject - destinationRule}
      "consecutiveErrors": 8, // default is 5. Number of consecutive errors before a host is removed - {istioobject - destinationRule}
      "baseEjectionTime": 15, // default is 5 - {istioobject - destinationRule}
      "intervalSweep": "5m" // time before removed hosts are added back to the load-balancing pool - {istioobject - destinationRule}
    },
    // credentials for mTLS
    "Servicecertificate": "", // the actual certificate
    "ServicePrivateKey": "", // the actual private key
    "caCertificate": "" // the trusted certificate used to verify the client connection; required only when the mTLS mode is MUTUAL
  }
}
RETURN STATUS: 201
RETURN BODY:
{
  "name": "<name>",
  "Message": "inbound service created"
}
GET
URL: /v2/projects/{project-name}/composite-apps/blue-app/{version}/traffic-intent-set/inbound-intents/<name>
RETURN STATUS: 200
RETURN BODY:
{
  "metadata": {
    "name": "<>", // unique name for each intent
    "description": "connectivity intent for stateless micro-service to stateless micro-service communication",
    "userdata1": <>,
    "userdata2": <>
  },
  "spec": { // update the memory allocation for each field as per OpenAPI standards
    "application": "<app1>",
    "servicename": "<>", // actual name of the client service - {istioobject - serviceEntry of client's cluster}
    "externalName": "<>", // prefix used to expose this service outside the cluster
    "protocol": "", // supported protocols are HTTP, TCP, UDP and HTTP2
    "headless": "", // default is false. "true" ensures all instances of a headless service have access to the client service
    "mutualTLS": "", // default is SIMPLE. MUTUAL enforces mTLS - {istioobject - destinationRule}
    "port": "80", // port on which the service is exposed through the service mesh, not the port it actually runs on
    "serviceMesh": "istio", // taken from the cluster record
    "sidecar-proxy": "yes", // features (mTLS, LB, circuit breaking) are not available to services without istio-proxy; only inbound routing is possible
    // Traffic management fields below are valid only if sidecar-proxy is set to "yes"
    "traffic-management-info": {
      // Traffic configuration - load balancing is applied per service; traffic to this service is distributed among the pods under it
      "loadbalancingType": "", // "Simple" and "ConsistentHash" are the two modes - {istioobject - destinationRule}
      "loadBalancerMode": "", // modes for ConsistentHash: "httpHeaderName", "httpCookie", "useSourceIP", "minimumRingSize"; modes for Simple: "LEAST_CONN", "ROUND_ROBIN", "RANDOM", "PASSTHROUGH". The mode must be chosen explicitly - {istioobject - destinationRule}
      "httpCookie": "user1", // name of the cookie used to maintain sticky sessions - {istioobject - destinationRule}
      // Circuit breaking
      "maxConnections": "", // connection pool for TCP and HTTP traffic - {istioobject - destinationRule}
      "concurrenthttp2Requests": "", // concurrent HTTP/2 requests allowed - {istioobject - destinationRule}
      "httpRequestPerConnection": "", // number of HTTP requests per connection; valid only for HTTP traffic - {istioobject - destinationRule}
      "consecutiveErrors": "", // default is 5. Number of consecutive errors before a host is removed - {istioobject - destinationRule}
      "baseEjectionTime": "", // default is 5 - {istioobject - destinationRule}
      "intervalSweep": "" // time before removed hosts are added back to the load-balancing pool - {istioobject - destinationRule}
    },
    // credentials for mTLS
    "Servicecertificate": "", // the actual certificate
    "ServicePrivateKey": "", // the actual private key
    "caCertificate": "", // the trusted certificate used to verify the client connection; required only when the mTLS mode is MUTUAL
    // Access control
    "namespaces": [], // workloads from these namespaces can access the inbound service - {istioobject - authorizationPolicy}
    "serviceAccountAccess": { "SaDetails": [ { "ACTION": "URI" } ] } // {istioobject - authorizationPolicy, applied to the inbound service}
  }
}
DELETE
2. Outbound access
POST
3. Compound Service access
POST
Note - after the compound intent is created, call the inbound intents under it and assign a weight to each service, as shown in the example below.
POST
Scenarios supported for the current release
Development
Go API library - https://github.com/gorilla/mux
Backend - MongoDB - https://github.com/onap/multicloud-k8s/tree/master/src/k8splugin/internal/db - reference
Intent-to-config conversion - use Go templates and Admiral? https://github.com/istio-ecosystem/admiral
Writing the config to etcd - WIP
Unit and integration tests - Go tests
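The intent-to-config conversion mentioned above (Go templates) could look like the following sketch, which renders a DestinationRule manifest from a few intent fields. The template, the `Intent` struct, and the chosen fields are illustrative, not the project's actual implementation.

```go
package main

import (
	"bytes"
	"fmt"
	"text/template"
)

// destinationRuleTmpl is a hypothetical Go template producing an Istio
// DestinationRule from intent fields such as loadBalancerMode and
// maxConnections in the API examples.
const destinationRuleTmpl = `apiVersion: networking.istio.io/v1alpha3
kind: DestinationRule
metadata:
  name: {{ .ServiceName }}-dr
spec:
  host: {{ .ServiceName }}
  trafficPolicy:
    loadBalancer:
      simple: {{ .LBMode }}
    connectionPool:
      tcp:
        maxConnections: {{ .MaxConnections }}
`

// Intent holds the subset of intent fields used by the template.
type Intent struct {
	ServiceName    string
	LBMode         string
	MaxConnections int
}

// render executes the template and returns the generated manifest.
func render(in Intent) string {
	var buf bytes.Buffer
	t := template.Must(template.New("dr").Parse(destinationRuleTmpl))
	if err := t.Execute(&buf, in); err != nil {
		panic(err)
	}
	return buf.String()
}

func main() {
	// Values taken from the inbound-intent example above.
	fmt.Print(render(Intent{ServiceName: "httpbin", LBMode: "ROUND_ROBIN", MaxConnections: 10}))
}
```

The generated YAML would then be handed to the workload scheduler/placement helper (or written to etcd, per the WIP item) for instantiation on the target clusters.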
External DNS - Design and intent API
See here: External DNS provider update design and intent API
External application communication intents
Considering DNS resolution, No DNS resolution (IP addresses), Egress proxies of the Service Mesh, Third-party egress proxy
User facing communication intents
Considering Multiple DNS Servers
Considering multiple user-facing entities
Considering RBAC/ABAC
Internal Design details
Guidelines to keep in mind
Support for metrics that can be retrieved by Prometheus
Support for Jaeger distributed tracing by including open tracing libraries around HTTP calls.
Support for logging that is understood by fluentd
Mutual exclusion of database operations (preventing internal modules from accessing database records simultaneously, including across replicated instances of the scheduler micro-service).
Resilience - ensure that information returned by controllers is not lost: synchronizing resources to remote edge clouds can take hours or even days when an edge is not up and running, and the scheduler micro-service may restart in the meantime.
Concurrency - Support multiple operations at a time and even synchronizing resources in various edge clouds in parallel.
Performance - Avoiding file system operations as much as possible.
Modules (Description, internal structures etc..)
Service Mesh Config:
Main Function: the module is invoked by the traffic controller after it receives intents from the external world;
it parses the requests from the traffic controller, extracts key information, and assembles a new YAML file for
creating instances of inbound services and clients based on Istio.
Main Operations:
create/destroy inbound services (API: Add Inbound service)
create/destroy client services (API: Add Clients)
create/remove security details for client services (API: Add Security details for clients)
create/destroy ServiceEntry for inbound services used by clients
create/destroy DestinationRules for both inbound and client services
create/destroy VirtualService for client services
create/destroy AuthorizationPolicy for inbound services used by clients
The key information includes, but is not limited to:
client name
inbound service name
protocol: http/https/tcp
TLS options: no/simple/mutual
port
The interface between SM config and the Traffic Controller: possibly via gRPC; the APIs are TBD
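The key information listed above could be modeled as a struct validated before it crosses the SM config interface. The field names, enum values, and the `Validate` helper are assumptions for illustration; the real gRPC message types are TBD per the note above.

```go
package main

import "fmt"

// KeyInfo sketches the key information SM Config extracts from an intent.
// Field names are illustrative, mirroring the list above.
type KeyInfo struct {
	ClientName  string
	InboundName string
	Protocol    string // http, https, or tcp
	TLSMode     string // no, simple, or mutual
	Port        int
}

// Validate checks the enumerated fields before the info is handed to the
// config generator (e.g. over the gRPC interface mentioned above).
func (k KeyInfo) Validate() error {
	protocols := map[string]bool{"http": true, "https": true, "tcp": true}
	tlsModes := map[string]bool{"no": true, "simple": true, "mutual": true}
	if !protocols[k.Protocol] {
		return fmt.Errorf("unsupported protocol %q", k.Protocol)
	}
	if !tlsModes[k.TLSMode] {
		return fmt.Errorf("unsupported TLS mode %q", k.TLSMode)
	}
	return nil
}

func main() {
	info := KeyInfo{ClientName: "client-a", InboundName: "httpbin", Protocol: "http", TLSMode: "mutual", Port: 80}
	fmt.Println("valid:", info.Validate() == nil)
}
```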
Traffic Controller
Main Function: it acts as the main controlling loop/daemon and receives REST requests from external
modules, e.g. the orchestrator. It parses these requests and determines the exact purpose they express,
e.g. service creation, DNS update, or workload adjustment. It then invokes the corresponding
components, such as SM config and the DNS updater, to fulfill these requirements by creating
and configuring the related uServices based on the various mechanisms of Istio.
Main steps:
0. The traffic controller needs to be registered with the orchestrator by calling the APIs the orchestrator provides.
1. The orchestrator starts to instantiate the traffic controller.
2. The traffic controller finds the config files for the various plugins, such as SM config,
load balancers, and the DNS updater, at certain locations, and then instantiates
these plugins. These plugins may be defined as Istio VirtualServices, and
their associated YAML files should be provided beforehand.
3. The traffic controller needs to health-check the instances of these
plugins and make sure they are up and running well (the health-check criteria
also need to be defined).
4. The traffic controller may need to notify the orchestrator that it, including the plugins,
is ready to serve (which API provided by the orchestrator should be invoked?).
5. At this moment, the orchestrator can start to monitor and manage the lifecycle of the traffic
controller. The APIs for monitoring and management need to be clarified.
(Is HA required for the traffic controller?)
6. Users/admins are allowed to send requests to create uServices, or to access running
uServices directly via REST, e.g. inbound/client service creation. After the traffic controller
converts the intents to a service description, the generated YAML files, which Istio will use
to create the uServices, should be given to the workload scheduler/placement helper to place and
instantiate these uServices on edge cloud clusters. Namely, the traffic controller needs to inform
the workload scheduler/placement helper that there are uServices to be placed and
instantiated on edge cloud clusters.
7. The traffic controller needs to call the DNS module to expose the domain names of services to the external world
once it is aware that these uServices have been instantiated on edge cloud clusters. (How does the traffic controller
learn that uService instantiation has completed?)
8. The traffic controller may need to manage the lifecycle of uServices (or is this done by modules within the orchestrator?),
e.g. by periodically detecting heartbeats from the various uServices, say one check every 10 seconds.
9. Considering HA, the traffic controller should instantiate at least two instances of each plugin and should be
able to monitor their health; when any instance is down, the traffic controller
can restart/recreate it.
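Steps 3 and 9 above (plugin health-checking and restart for HA) can be sketched as follows. The `pluginInstance` type and the probe/restart logic are illustrative assumptions; real health-check criteria are still to be defined, as the steps note.

```go
package main

import "fmt"

// pluginInstance is a hypothetical record the controller keeps per plugin
// instance (SM config, DNS updater, load balancer, ...).
type pluginInstance struct {
	name    string
	healthy bool
}

// ensureHealthy walks the plugin instances and "restarts" any unhealthy
// one (step 9: keep at least two instances and recreate failed ones).
// It returns the names of the instances that were restarted.
func ensureHealthy(plugins []pluginInstance) []string {
	restarted := []string{}
	for i := range plugins {
		if !plugins[i].healthy {
			// A real controller would recreate the instance here.
			plugins[i].healthy = true
			restarted = append(restarted, plugins[i].name)
		}
	}
	return restarted
}

func main() {
	plugins := []pluginInstance{
		{name: "sm-config", healthy: true},
		{name: "dns-updater", healthy: false},
	}
	fmt.Println("restarted:", ensureHealthy(plugins))
}
```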
LoadBalancer (aka GSLB/LB controller?)
GSLB (Geo-replicated Services LB/LB controller) is used to balance incoming load across multiple istio-ingress-gateways.
It must be aware of the run-time load of the various working uService instances distributed across different
edge clouds, or of the general load level of each kind of uService in each edge cloud, by interacting with the actual LB (e.g. MetalLB,
or another module) running on each edge cloud. MetalLB is responsible for load balancing between instances
of a uService within an edge cloud, while the GSLB (controller) is in charge of steering traffic to multiple edge clouds.
The GSLB (LB controller) should be aware of the public IPs of all edge clouds (obtained via MetalLB? or SD-WAN?).
The GSLB (LB controller) receives requests from external users, evaluates the load level of the requested uService
on the various edge clouds, and chooses a certain edge cloud on which the targeted uService instance (or set of instances) is running.
The GSLB then returns the domain name of the chosen uService instance to the user, who uses this new domain name to access
the expected uService. These steps imply that the domain names associated with the different uServices distributed
across edge clouds should first point to the IP address of the central cloud on which the GSLB runs; after the GSLB figures out to which edge cloud
a user's request should be forwarded, the real domain name of the uService is given back to the user.
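The GSLB selection step described above, choosing the least-loaded edge cloud and returning its domain name, can be sketched as follows. The `edge` type, domain names, and load values are illustrative; how load is actually reported by the per-cloud LB is still open.

```go
package main

import "fmt"

// edge is a hypothetical per-edge-cloud record the GSLB maintains: the
// domain name of the uService instance there and the load level reported
// by the local LB (e.g. MetalLB).
type edge struct {
	domain string
	load   float64
}

// pickEdgeCloud returns the domain name of the least-loaded edge cloud.
// The slice must be non-empty. A real GSLB would apply richer policies
// (locality, capacity, RBAC), but the shape of the decision is the same.
func pickEdgeCloud(edges []edge) string {
	best := edges[0]
	for _, e := range edges[1:] {
		if e.load < best.load {
			best = e
		}
	}
	return best.domain
}

func main() {
	fmt.Println(pickEdgeCloud([]edge{
		{"svc.edge-a.example.com", 0.7},
		{"svc.edge-b.example.com", 0.3},
	}))
	// prints "svc.edge-b.example.com"
}
```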
....