External DNS provider update design and intent API

This section covers the design for how external DNS records are updated.

Create DNS Provider Intent

When instantiating a distributed app with user-facing microservices, the Traffic Control Intent Set includes information that identifies the set of external DNS servers to be updated with the appropriate DNS records.  At the Project level, DNS Update Records are supplied to identify the type of DNS Provider to be updated, along with the set of parameters required to perform the DNS record update.  The 'external-dns' package is used to perform the DNS updates against a selected set of DNS providers.  After the DNS Update Records have been created, they can be associated with specific logical clouds or clusters by using the Key Value pair feature of those APIs.

Each instance of a DNS Update Record has a unique name within the Project and is created and managed by the Traffic Controller API.  The record will contain all of the parameters needed to invoke 'external-dns' for that specific provider. 





The following example illustrates the DNS Update Record API. 

DNS Update Record API
URL: /v2/project/{project-name}/dns-update-records/
POST BODY:
{
  "metadata": {
    "name": "name_of_dns_information_record",
    "description": "description of the DNS information",
    "userdata1": "some user data",
    "userdata2": "some different user data"
  },
  "spec": {
    "provider": "coredns",
    "external-dns-parameters": {
      "oneOf": [  ## schema-like (array structure not in body) - one of the following is selected to match the 'provider' - e.g. 'coredns' in this case
        {
          "aws-zone-type": "public",
          "aws-zone-tags": "zone tags",
          "aws-assume-role": "arn:aws:iam::123455567:role/external-dns",
          "aws-batch-change-size": 1000,
          "aws-batch-change-interval": "1s",
          "aws-evaluate-target-health": true,
          "no-aws-evaluate-target-health": true,
          "aws-api-retries": 3,
          "aws-prefer-cname": true
        },
        {
          "azure-config-file": "/etc/kubernetes/azure.json",
          "azure-resource-group": "resource group",
          "azure-subscription-id": "subscription id",
          "azure-user-assigned-identity-client-id": "client id"
        },
        {
          "coredns-prefix": "skydns"
        },
        {
          "rfc2136-host": "host.sample.com",
          "rfc2136-zone": "",
          "rfc2136-insecure": false,
          "rfc2136-tsig-keyname": "tsig key",
          "rfc2136-tsig-secret": "tsig secret",
          "rfc2136-tsig-secret-alg": "tsig secret alg",
          "rfc2136-tsig-axfr": "axfr",
          "request-timeout": "30s"
        }
      ],
      "contour-load-balancer": "heptio-contour/contour",  ## from here on, possible parameters used by external-dns - not all required
      "fqdn-template": "template",
      "combine-fqdn-annotation": true,
      "ignore-hostname-annotation": true,
      "compatibility": "mate",
      "publish-internal-services": true,
      "publish-host-ip": true,
      "service-type-filter": "all",
      "domain-filter": "example.com",
      "exclude-domains": "example.com",
      "zone-id-filter": "zone filter",
      "tls-ca": "tls ca path",
      "tls-client-cert": "tls client cert path",
      "tls-client-cert-key": "tls client cert key path",
      "policy": "sync",
      "registry": "txt",
      "txt-owner-id": "default",
      "txt-prefix": "custom string",
      "txt-cache-interval": "30s",
      "interval": "30s",
      "once": true,
      "dry-run": true,
      "log-format": "text",
      "metrics-address": ":7979",
      "log-level": "info"
    }
  }
}
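The request body above could be assembled programmatically along the following lines. This is only a minimal sketch: the helper names (provider_family, build_dns_update_record) and the prefix table are assumptions made for illustration and are not part of the Traffic Controller API. It selects one provider-specific parameter group, as the 'oneOf' note describes, and rejects parameters that belong to a different provider.

```python
import json

# Provider families and the parameter prefix each one uses in the example
# above. This table is an assumption made for the sketch, not part of the API.
PROVIDER_PREFIX = {
    "aws": "aws-",
    "azure": "azure-",
    "coredns": "coredns-",
    "rfc2136": "rfc2136-",
}


def _base(key):
    """Strip a leading 'no-' so 'no-aws-...' is treated like 'aws-...'."""
    return key[3:] if key.startswith("no-") else key


def provider_family(provider):
    """Map a provider such as 'azure-private-dns' or 'aws-sd' to its family."""
    for family in PROVIDER_PREFIX:
        if provider == family or provider.startswith(family + "-"):
            return family
    raise ValueError(f"unsupported provider: {provider}")


def build_dns_update_record(name, provider, provider_params,
                            general_params=None, description=""):
    """Return a POST body holding exactly one provider parameter group."""
    prefix = PROVIDER_PREFIX[provider_family(provider)]
    wrong = [k for k in provider_params if not _base(k).startswith(prefix)]
    if wrong:
        raise ValueError(f"{wrong} do not belong to provider {provider}")

    general = dict(general_params or {})
    all_prefixes = tuple(PROVIDER_PREFIX.values())
    stray = [k for k in general if _base(k).startswith(all_prefixes)]
    if stray:
        raise ValueError(f"{stray} are provider-specific, not general")

    return {
        "metadata": {"name": name, "description": description},
        "spec": {
            "provider": provider,
            "external-dns-parameters": {**provider_params, **general},
        },
    }


if __name__ == "__main__":
    record = build_dns_update_record(
        "name_of_dns_information_record",
        "coredns",
        {"coredns-prefix": "skydns"},
        {"policy": "sync", "registry": "txt", "interval": "30s"},
    )
    print(json.dumps(record, indent=2))
```

Flattening the selected provider group and the general parameters into one object mirrors the note that the 'oneOf' array structure is not present in the actual body.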

The following shows an OpenAPI component definition of the above DNS Update Record structure (note: in progress).  Parameters with descriptions are shown in more detail.  

(in progress) OpenAPI definition of DNS Update Record
openapi: 3.0.0
info:
  title: DNS update information record
  description: Defines information needed to update external DNS
  version: 2.0.0
paths:
  <...tbd...>
components:
  schemas:
    dnsrecord:
      properties:
        metadata:
          $ref: '#/components/schemas/metadata'
        spec:
          $ref: '#/components/schemas/extdns_info'
    extdns_info:
      type: "object"
      properties:
        provider:
          type: "string"
          enum: ["aws", "aws-sd", "azure", "azure-dns", "azure-private-dns", "coredns", "rfc2136"]
          example: "coredns"
        external-dns-parameters:
          allOf:
            - $ref: '#/components/schemas/extdns_provider'
            - $ref: '#/components/schemas/extdns_general'
    metadata:
      type: "object"
      required:
        - name
      properties:
        name:
          type: "string"
          example: "name_of_dns_information_record"
        description:
          type: "string"
          example: "description of the DNS information"
        userdata1:
          type: "string"
          example: "some user data"
        userdata2:
          type: "string"
          example: "some different user data"
    extdns_general:
      type: "object"
      required:
        - provider
      properties:
        request-timeout:
          type: "string"
          example: "30s"
          description: "Request timeout when calling Kubernetes APIs. 0s means no timeout"
        contour-load-balancer:
          type: "string"
          example: "heptio-contour/contour"
          default: "heptio-contour/contour"
          description: "The fully-qualified name of the Contour load balancer service."
        fqdn-template:
          type: "string"
          example: "template"
          description: "A templated string that's used to generate DNS names from sources that don't define a hostname themselves, or to add a hostname suffix when paired with the fake source (optional). Accepts comma separated list for multiple global FQDN."
        combine-fqdn-annotation:
          type: "boolean"
          example: true
          description: "Combine FQDN template and Annotations instead of overwriting"
        ignore-hostname-annotation:
          type: "boolean"
          example: true
          default: false
          description: "Ignore hostname annotation when generating DNS names, valid only when using fqdn-template is set (optional, default: false)"
        compatibility:
          type: "string"
          example: "mate"
          enum: ["mate", "molecule"]
          description: "Process annotation semantics from legacy implementations (optional, options: mate, molecule)"
        publish-internal-services:
          type: "boolean"
          example: true
          description: "Allow external-dns to publish DNS records for ClusterIP services (optional)"
        publish-host-ip:
          type: "boolean"
          example: true
          description: "Allow external-dns to publish host-ip for headless services (optional)"
        service-type-filter:
          type: "string"
          example: "all"
          default: "all"
          enum: ["all", "ClusterIP", "NodePort", "LoadBalancer", "ExternalName"]
          description: "The service types to take care about (default: all, expected: ClusterIP, NodePort, LoadBalancer or ExternalName)"
        domain-filter:
          type: "string"
          example: "example.com"
          description: "Limit possible target zones by a domain suffix; specify multiple times for multiple domains (optional)"
        exclude-domains:
          type: "string"
          example: "example.com"
          description: "Exclude subdomains (optional)"
        zone-id-filter:
          type: "string"
          example: "zone filter"
          description: "Filter target zones by hosted zone id; specify multiple times for multiple zones (optional)"
        tls-ca:
          type: "string"
          example: "tls ca path"
          description: "When using TLS communication, the path to the certificate authority to verify server communications (optionally specify --tls-client-cert for two-way TLS)"
        tls-client-cert:
          type: "string"
          example: "tls client cert path"
          description: "When using TLS communication, the path to the certificate to present as a client (not required for TLS)"
        tls-client-cert-key:
          type: "string"
          example: "tls client cert key path"
          description: "When using TLS communication, the path to the certificate key to use with the client certificate (not required for TLS)"
        policy:
          type: "string"
          example: "sync"
          default: "sync"
          enum: ["sync", "upsert-only", "create-only"]
          description: "Modify how DNS records are synchronized between sources and providers (default: sync, options: sync, upsert-only, create-only)"
        registry:
          type: "string"
          example: "txt"
          default: "txt"
          enum: ["txt", "noop", "aws-sd"]
          description: "The registry implementation to use to keep track of DNS record ownership (default: txt, options: txt, noop, aws-sd)"
        txt-owner-id:
          type: "string"
          example: "default"
          default: "default"
          description: "When using the TXT registry, a name that identifies this instance of ExternalDNS (default: default)"
        txt-prefix:
          type: "string"
          example: "custom string"
          description: "When using the TXT registry, a custom string that's prefixed to each ownership DNS record (optional)"
        txt-cache-interval:
          type: "string"
          example: "30s"
          description: "The interval between cache synchronizations in duration format (default: disabled)"
        interval:
          type: "string"
          example: "30s"
          default: "1m"
          description: "The interval between two consecutive synchronizations in duration format (default: 1m)"
        once:
          type: "boolean"
          example: true
          description: "When enabled, exits the synchronization loop after the first iteration (default: disabled)"
        dry-run:
          type: "boolean"
          example: true
          description: "When enabled, prints DNS record changes rather than actually performing them (default: disabled)"
        log-format:
          type: "string"
          example: "text"
          default: "text"
          enum: ["text", "json"]
          description: "The format in which log messages are printed (default: text, options: text, json)"
        metrics-address:
          type: "string"
          example: ":7979"
          description: "Specify where to serve the metrics and health check endpoint (default: :7979)"
        log-level:
          type: "string"
          example: "info"
          default: "info"
          enum: ["panic", "debug", "info", "warning", "error", "fatal"]
          description: "Set the level of logging. (default: info, options: panic, debug, info, warning, error, fatal)"
    extdns_provider:
      oneOf:
        - $ref: '#/components/schemas/extdns_provider_aws'
        - $ref: '#/components/schemas/extdns_provider_azure'
        - $ref: '#/components/schemas/extdns_provider_coredns'
        - $ref: '#/components/schemas/extdns_provider_rfc2136'
    extdns_provider_aws:
      type: "object"
      properties:
        aws-zone-type:
          type: "string"
          example: "public"
          enum: ["public", "private"]
          description: "When using the AWS provider, filter for zones of this type (optional, options: public, private)"
        aws-zone-tags:
          type: "string"
          example: "zone tags"
          description: "When using the AWS provider, filter for zones with these tags"
        aws-assume-role:
          type: "string"
          example: "arn:aws:iam::123455567:role/external-dns"
          description: "When using the AWS provider, assume this IAM role. Useful for hosted zones in another AWS account. Specify the full ARN, e.g. `arn:aws:iam::123455567:role/external-dns` (optional)"
        aws-batch-change-size:
          type: "integer"
          example: 1000
          description: "When using the AWS provider, set the maximum number of changes that will be applied in each batch."
        aws-batch-change-interval:
          type: "string"
          example: "1s"
          description: "When using the AWS provider, set the interval between batch changes."
        aws-evaluate-target-health:
          type: "boolean"
          example: true
          description: "When using the AWS provider, set whether to evaluate the health of a DNS target (default: enabled, disable with --no-aws-evaluate-target-health)"
        no-aws-evaluate-target-health:
          type: "boolean"
          example: true
          description: "When using the AWS provider, set whether to evaluate the health of a DNS target (default: enabled, disable with --no-aws-evaluate-target-health)"
        aws-api-retries:
          type: "integer"
          example: 3
          description: "When using the AWS provider, set the maximum number of retries for API calls before giving up."
        aws-prefer-cname:
          type: "boolean"
          example: true
          description: "When using the AWS provider, prefer using CNAME instead of ALIAS (default: disabled)"
    extdns_provider_azure:
      type: "object"
      properties:
        azure-config-file:
          type: "string"
          example: "/etc/kubernetes/azure.json"
          description: "When using the Azure provider, specify the Azure configuration file (required when --provider=azure)"
        azure-resource-group:
          type: "string"
          example: "resource group"
          description: "When using the Azure provider, override the Azure resource group to use (required when --provider=azure-private-dns)"
        azure-subscription-id:
          type: "string"
          example: "subscription id"
          description: "When using the Azure provider, specify the Azure configuration file (required when --provider=azure-private-dns)"
        azure-user-assigned-identity-client-id:
          type: "string"
          example: "client id"
          description: "When using the Azure provider, override the client id of user assigned identity in config file (optional)"
    extdns_provider_coredns:
      type: "object"
      properties:
        coredns-prefix:
          type: "string"
          example: "skydns"
          description: "When using the CoreDNS provider, specify the prefix name"
    extdns_provider_rfc2136:
      type: "object"
      properties:
        rfc2136-host:
          type: "string"
          example: "host.sample.com"
          description: "When using the RFC2136 provider, specify the host of the DNS server"
        rfc2136-zone:
          type: "string"
          example: ""
          description: "When using the RFC2136 provider, specify the zone entry of the DNS server to use"
        rfc2136-insecure:
          type: "boolean"
          example: false
          description: "When using the RFC2136 provider, specify whether to attach TSIG or not (default: false, requires --rfc2136-tsig-keyname and --rfc2136-tsig-secret)"
        rfc2136-tsig-keyname:
          type: "string"
          example: "tsig key"
          description: "When using the RFC2136 provider, specify the TSIG key to attach to DNS messages (required when --rfc2136-insecure=false)"
        rfc2136-tsig-secret:
          type: "string"
          example: "tsig secret"
          description: "When using the RFC2136 provider, specify the TSIG (base64) value to attach to DNS messages (required when --rfc2136-insecure=false)"
        rfc2136-tsig-secret-alg:
          type: "string"
          example: "tsig secret alg"
          description: "When using the RFC2136 provider, specify the TSIG (base64) value to attach to DNS messages (required when --rfc2136-insecure=false)"
        rfc2136-tsig-axfr:
          type: "string"
          example: "axfr"
          description: "When using the RFC2136 provider, specify the TSIG (base64) value to attach to DNS messages (required when --rfc2136-insecure=false)"



The Key Value pair list that is added to the logical-clouds and clusters will look like the following:

DNS Key Value pair
{
  "metadata": {
    "name": "dns-update-record-list",
    "description": "list of the DNS update records associated with this logical-cloud or cluster",
    "userData1": "<user data>",
    "userData2": "<user data>"
  },
  "spec": {
    "kv": [
      { "key1": "dns-update-record-1" },
      { "key2": "dns-update-record-47" }
    ]
  }
}
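As a rough illustration of how a consumer might read this list, the sketch below pulls the DNS Update Record names out of the 'kv' array. The function name dns_update_record_names is hypothetical, and the sketch assumes that the values (not the keys) carry the record names, as in the example above.

```python
def dns_update_record_names(kv_document):
    """Collect the DNS Update Record names stored as values in spec.kv."""
    names = []
    for pair in kv_document.get("spec", {}).get("kv", []):
        # Each entry is a one-item object; the key is arbitrary, the value
        # is assumed to be the name of a DNS Update Record.
        names.extend(pair.values())
    return names


if __name__ == "__main__":
    document = {
        "metadata": {"name": "dns-update-record-list"},
        "spec": {
            "kv": [
                {"key1": "dns-update-record-1"},
                {"key2": "dns-update-record-47"},
            ]
        },
    }
    print(dns_update_record_names(document))
```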



Traffic Control Intent Handling - DNS Update Records

After DNS Update Records have been created, associated with logical clouds and/or clusters, and associated with the Traffic Control intents of the distributed application, a call is made to instantiate a Profile.  Eventually, the multicloud orchestrator invokes the traffic controller to process the Traffic Control intents.  One part of that process is to handle the DNS Update Records associated with the Profile (via the intent).



The following sequence diagram illustrates what happens:



The DNS Update Record handling consists of two key tasks:

  1. Prepare manifests for the external-dns Deployments which will handle the updating of specific DNS Providers.  There is a separate external-dns Deployment for each DNS Provider (based on the current understanding of how external-dns works).

    1. For a given project, there should only need to be one external-dns Deployment per DNS Provider to handle all of the distributed Apps that are deployed, so the App Context may not be the right location for these Deployment manifests.

    2. The 'source' for external-dns will always be the DNSEndpoint CRD for a namespace in the project.  An enhancement to external-dns is expected to be necessary so that DNSEndpoint CRs can be filtered by a label that associates specific CRs with a DNS Provider.

  2. Prepare DNSEndpoint CR manifests for the user-facing services present in the App Context.
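For the first task, the parameters held in a DNS Update Record have to end up as arguments of the external-dns container. The following is a minimal sketch of that translation, assuming every parameter name maps directly to an external-dns command-line flag of the same name and that the CRD source is selected with --source=crd. The function name external_dns_args is hypothetical, and the handling of boolean values (emit the bare flag when true, omit it when false) is a simplification.

```python
def external_dns_args(provider, params, namespace):
    """Build a container argument list for one external-dns Deployment."""
    args = [
        f"--provider={provider}",
        "--source=crd",               # DNSEndpoint CRs are the only source
        f"--namespace={namespace}",   # restrict to the project namespace
    ]
    for key in sorted(params):
        value = params[key]
        if isinstance(value, bool):
            # Assumption: a true boolean becomes a bare flag, false is omitted.
            if value:
                args.append(f"--{key}")
        else:
            args.append(f"--{key}={value}")
    return args


if __name__ == "__main__":
    parameters = {
        "coredns-prefix": "skydns",
        "interval": "30s",
        "once": True,
        "dry-run": False,
    }
    for arg in external_dns_args("coredns", parameters, "project-namespace"):
        print(arg)
```

Sorting the keys keeps the generated manifest stable between runs, which makes it easier to detect whether an existing Deployment already matches a given record.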

The following shows an example of a DNS Endpoint CR instance.

DNS Endpoint CRD

see also:    https://github.com/kubernetes-sigs/external-dns/tree/master/docs/contributing/crd-source

To associate these with specific DNS Providers, the proposal is to add a label which will allow external-dns (after modification) to handle only the DNS Endpoint CRs that match the label (in addition to the namespace) for a given DNS Provider.

The 'name' of the DNS Endpoint CR can be derived from the sub-app (microservice) that is exposing these endpoints.
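A labeled DNS Endpoint CR along these lines could be generated as sketched below. The apiVersion, kind and endpoint fields follow the DNSEndpoint CRD from the external-dns crd-source documentation referenced above. The label key "dns-provider", the function name and the sample values are assumptions made for illustration, since the label-based filtering is only a proposal at this point.

```python
import json


def dns_endpoint_manifest(sub_app, namespace, dns_provider, dns_name, targets,
                          record_type="A", ttl=300):
    """Build a DNSEndpoint CR named after the sub-app that exposes it."""
    return {
        "apiVersion": "externaldns.k8s.io/v1alpha1",
        "kind": "DNSEndpoint",
        "metadata": {
            "name": sub_app,
            "namespace": namespace,
            # Hypothetical label that a modified external-dns would filter on.
            "labels": {"dns-provider": dns_provider},
        },
        "spec": {
            "endpoints": [
                {
                    "dnsName": dns_name,
                    "recordTTL": ttl,
                    "recordType": record_type,
                    "targets": list(targets),
                }
            ]
        },
    }


if __name__ == "__main__":
    manifest = dns_endpoint_manifest(
        sub_app="frontend",
        namespace="project-namespace",
        dns_provider="dns-update-record-1",
        dns_name="frontend.example.com",
        targets=["192.0.2.10"],
    )
    print(json.dumps(manifest, indent=2))
```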



The DNS Update Record handler creates the manifests for these DNS Endpoint CRs using the following algorithm:



Intent processing algorithm

Notes:

  • There will need to be a way to find the appropriate set of IPs to use.  Some IPs will be appropriate for a public scope (e.g. update a DNS Provider associated with the logical cloud) and others may be local to the cluster network (e.g. cluster DNS Providers).

Resource Synchronization

The following illustrates how the external DNS updates are triggered once the Resource Synchronization of the distributed app is performed.



The external-dns deployment for a given DNS Provider may be used by more than one distributed application, so a method is needed for detecting that a deployment is already running.

Proposal:  when the external-dns deployment is created, annotate it with something like "resource bundle/resource bundle version/profile" to identify the distributed app that uses it.  Additional apps reuse the same deployment and add their own annotation.  When a distributed app is removed, the annotation for that app is removed.
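The bookkeeping in this proposal might look like the sketch below. It is one possible variant, not the agreed design: instead of one annotation per app, it keeps a single annotation whose value is a JSON list of app references, because a reference such as "resource bundle/resource bundle version/profile" contains slashes that would not fit in an annotation key. The annotation key and function names are hypothetical.

```python
import json

# Hypothetical annotation key holding the apps that share the deployment.
ANNOTATION_KEY = "example.org/external-dns-users"


def _users(annotations):
    """Return the set of app references recorded on the deployment."""
    return set(json.loads(annotations.get(ANNOTATION_KEY, "[]")))


def register_app(annotations, app_ref):
    """Return a copy of the annotations with app_ref recorded as a user."""
    users = _users(annotations) | {app_ref}
    return {**annotations, ANNOTATION_KEY: json.dumps(sorted(users))}


def unregister_app(annotations, app_ref):
    """Remove app_ref; also report whether any app still uses the deployment."""
    users = _users(annotations) - {app_ref}
    updated = {**annotations, ANNOTATION_KEY: json.dumps(sorted(users))}
    return updated, bool(users)


if __name__ == "__main__":
    notes = register_app({}, "bundle-a/v1/profile-1")
    notes = register_app(notes, "bundle-b/v2/profile-1")
    notes, in_use = unregister_app(notes, "bundle-a/v1/profile-1")
    print(notes, in_use)
```

When unregister_app reports that no app is left, the caller could decide to delete the external-dns deployment; whether that should happen automatically is left open here.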

Note:  The Resource Sync of the App Context applies these external DNS related resources to the central controller cluster, rather than to the edge or other clusters where most of the distributed app resources are applied.