Here's a rough summary of the sequence of operations between DCM, rsync, and the clusters:

  1. Create/Apply the CSR (like other resources)
  2. Approve the CSR (new, via /subresources/approval)
    1. The K8s signer will issue a certificate some time after the CSR is approved
    2. To read about the new /subresources level, check Supporting subresources
  3. Watch/monitor the CSR to see when a .status is created (a client-go sketch of steps 2 and 3 follows this list)
  4. Return the signed certificate obtained from the CSR's .status.certificate field all the way back to etcd
  5. DCM will read the certificate from etcd
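
For concreteness, here is a minimal client-go sketch of steps 2 and 3: approving the CSR through the /approval subresource and then polling until the signer fills in .status.certificate. This is an illustrative sketch, not rsync's actual code; the clientset wiring, CSR name, approval reason/message strings, and polling interval are all assumptions.

package rsyncsketch

import (
	"context"
	"fmt"
	"time"

	certv1 "k8s.io/api/certificates/v1"
	corev1 "k8s.io/api/core/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
)

// ApproveAndWait approves the named CSR and then polls until the signer
// populates .status.certificate, returning the signed certificate bytes.
func ApproveAndWait(ctx context.Context, cs kubernetes.Interface, name string) ([]byte, error) {
	csr, err := cs.CertificatesV1().CertificateSigningRequests().Get(ctx, name, metav1.GetOptions{})
	if err != nil {
		return nil, err
	}
	// Step 2: add an Approved condition; UpdateApproval goes through the
	// /approval subresource rather than updating the CSR object directly.
	csr.Status.Conditions = append(csr.Status.Conditions, certv1.CertificateSigningRequestCondition{
		Type:    certv1.CertificateApproved,
		Status:  corev1.ConditionTrue,
		Reason:  "DCMApproval",                // assumed reason string
		Message: "approved on behalf of DCM",  // assumed message
	})
	if _, err := cs.CertificatesV1().CertificateSigningRequests().UpdateApproval(ctx, name, csr, metav1.UpdateOptions{}); err != nil {
		return nil, err
	}
	// Step 3: the signer issues the certificate asynchronously, so poll the
	// CSR until .status.certificate is non-empty (a watch would also work).
	for {
		csr, err = cs.CertificatesV1().CertificateSigningRequests().Get(ctx, name, metav1.GetOptions{})
		if err != nil {
			return nil, err
		}
		if len(csr.Status.Certificate) > 0 {
			return csr.Status.Certificate, nil
		}
		select {
		case <-ctx.Done():
			return nil, fmt.Errorf("gave up waiting for certificate: %w", ctx.Err())
		case <-time.After(2 * time.Second):
		}
	}
}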


Sequence Diagram


With regard to DCM obtaining the signed user certificate for each cluster (mostly point #5 above), for now this will be based on lazy-loading the certificates from etcd into MongoDB whenever the user requests that a kubeconfig be generated for a logical cloud cluster.

Thus, when attempting to retrieve the kubeconfig for a particular logical cloud cluster (after the user/client requests a kubeconfig via the DCM API), there are 3 possible states, each with its own action (see the handler sketch after this list):

  1. If rsync has already written the signed certificate to etcd, DCM will copy the certificate to MongoDB, then generate the kubeconfig, and reply with the generated kubeconfig and HTTP 200
  2. If rsync has not yet written the signed certificate, DCM will not generate a kubeconfig and will instead respond with an HTTP 202, informing the client that the request was accepted but the data is not ready yet
    • The client should repeat the request until state #1 applies
  3. If the client asks for the kubeconfig again (i.e., the certificate is already in MongoDB), DCM will simply re-generate that kubeconfig using the certificate in MongoDB, skipping etcd altogether (so this is faster than when lazy loading takes place), and return it
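
Below is a minimal Go sketch of that three-state handler logic. The helper functions (certFromMongo, certFromEtcd, saveCertToMongo, buildKubeconfig) and the route parameter are hypothetical stand-ins for DCM's real MongoDB/etcd accessors, not its actual API:

package dcmsketch

import "net/http"

// Hypothetical helpers standing in for DCM's real MongoDB/etcd accessors.
var (
	certFromMongo   func(cluster string) ([]byte, bool) // state 3: already lazy-loaded
	certFromEtcd    func(cluster string) ([]byte, bool) // state 1 vs. state 2
	saveCertToMongo func(cluster string, cert []byte)
	buildKubeconfig func(cluster string, cert []byte) []byte
)

// KubeconfigHandler mirrors the three states described above.
func KubeconfigHandler(w http.ResponseWriter, r *http.Request) {
	cluster := r.URL.Query().Get("cluster") // placeholder for the real route parameter

	// State 3: certificate already in MongoDB, so skip etcd altogether.
	if cert, ok := certFromMongo(cluster); ok {
		w.WriteHeader(http.StatusOK) // 200
		w.Write(buildKubeconfig(cluster, cert))
		return
	}
	// State 1: rsync has written the certificate to etcd; lazy-load it into MongoDB.
	if cert, ok := certFromEtcd(cluster); ok {
		saveCertToMongo(cluster, cert)
		w.WriteHeader(http.StatusOK) // 200
		w.Write(buildKubeconfig(cluster, cert))
		return
	}
	// State 2: certificate not issued yet; the client should retry later.
	w.WriteHeader(http.StatusAccepted) // 202
}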


Currently, DCM expects the user certificate (in base64) to be stored by rsync in the following etcd AppContext path (lc1 is the name of the logical cloud):

/context/<ID>/app/logical-cloud/cluster/<cluster-reference>/resource/lc1+cert/

However, this is likely to change depending on the non-monitor rsync side of the implementation. Also, this conflates K8s resources with non-K8s resources (lc1+cert above is a raw certificate, not a K8s resource).
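
As an illustration of reading that key, here is a minimal sketch using the etcd v3 Go client; the client wiring, the import path (which varies across etcd releases), and the id/cluster-reference values are assumptions:

package etcdsketch

import (
	"context"
	"fmt"

	clientv3 "go.etcd.io/etcd/client/v3"
)

// ReadCert fetches the certificate rsync stored at the AppContext path above.
func ReadCert(ctx context.Context, cli *clientv3.Client, id, clusterRef string) ([]byte, error) {
	key := fmt.Sprintf("/context/%s/app/logical-cloud/cluster/%s/resource/lc1+cert/", id, clusterRef)
	resp, err := cli.Get(ctx, key)
	if err != nil {
		return nil, err
	}
	if len(resp.Kvs) == 0 {
		// rsync hasn't written the certificate yet (the HTTP 202 case above).
		return nil, fmt.Errorf("certificate not yet in etcd at %s", key)
	}
	return resp.Kvs[0].Value, nil // base64 certificate, per the convention above
}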

Additional decisions/clarifications about DCM's current approach to obtaining the certificate:

  • rsync will use client-go to request CSR approval after iterating through /subresources (just /subresources/approval for now), as sketched after the numbered list above
  • New Monitor code watches CSRs and copies the issued certificates to the ResourceBundleState
  • DCM lazy-checks for the certs in etcd starting with the 1st time they are ever needed (when getting a kubeconfig), as explained above
    • Certs not yet in etcd = Logical Cloud is still applying
    • Certs already in etcd = Logical Cloud is applied; everything got stored in MongoDB and it's ready to return kubeconfigs
      • No more lazy-checks are done after this point because all the info needed is now in MongoDB


Security considerations:

  • The user's private key (which was originally used to sign the CSRs for each cluster) was previously generated by DCM locally and is stored in MongoDB, and only in MongoDB
  • The private key does not leave MongoDB at any point until the kubeconfig has been fully generated and returned to the user/client as an HTTP response (the private key sits in the kubeconfig's user's client-key-data field as a base64-encoded string)
  • The certificates issued by the clusters are transported from the cluster to rsync and back to DCM over etcd; DCM then stores them in MongoDB as well (a sketch of the resulting kubeconfig assembly follows this list)
  • A high-security implementation could leverage the Trusted Platform Module (TPM) present in most edge deployments to store the private key and sign/decrypt data without ever exposing the key. However, it's unclear what this would look like when applied to a kubeconfig.
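
As a sketch of the kubeconfig assembly, the following uses client-go's clientcmd API. The map keys and function signature are assumptions, but serializing the config does place the private key in the client-key-data field base64-encoded, as noted above:

package kubeconfigsketch

import (
	"k8s.io/client-go/tools/clientcmd"
	clientcmdapi "k8s.io/client-go/tools/clientcmd/api"
)

// BuildKubeconfig assembles a kubeconfig from the cluster CA, the certificate
// issued via the CSR flow, and the user's private key fetched from MongoDB.
func BuildKubeconfig(server string, caCert, userCert, userKey []byte) ([]byte, error) {
	cfg := clientcmdapi.NewConfig()
	cfg.Clusters["cluster"] = &clientcmdapi.Cluster{
		Server:                   server,
		CertificateAuthorityData: caCert,
	}
	cfg.AuthInfos["user"] = &clientcmdapi.AuthInfo{
		ClientCertificateData: userCert, // certificate issued by the cluster
		ClientKeyData:         userKey,  // the private key leaves MongoDB only here
	}
	cfg.Contexts["default"] = &clientcmdapi.Context{Cluster: "cluster", AuthInfo: "user"}
	cfg.CurrentContext = "default"
	// clientcmd.Write serializes to YAML; []byte fields are emitted as
	// base64-encoded *-data entries (client-key-data, client-certificate-data).
	return clientcmd.Write(*cfg)
}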




Regarding what happens outside of DCM (points #2, #3 and #4 above): since the K8s signer issues a certificate some time after the CSR is approved, the whole process is very much asynchronous (in fact, approval could even be done manually by a human). The Monitor has therefore been chosen as the tool to track what happens to the CSR and trigger further actions.

The reader is referred to the Sequence Diagram above to better understand how the Monitor, the cluster's etcd, and rsync (the cluster watcher) work together to detect that a CSR has been approved and that a certificate has been issued into its .status.certificate subresource field. This certificate is then propagated back to the main etcd instance, from which DCM can read it using the lazy-load method presented above.
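
As an illustration of that Monitor-side watch, here is a minimal client-go informer sketch; the updateResourceBundleState callback is a hypothetical stand-in for Monitor's actual ResourceBundleState CRD update logic:

package monitorsketch

import (
	"time"

	certv1 "k8s.io/api/certificates/v1"
	"k8s.io/client-go/informers"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/tools/cache"
)

// WatchCSRs reacts to CSR updates and, once the signer has populated
// .status.certificate, hands the certificate off for storage in the
// ResourceBundleState (where the rsync cluster watcher can pick it up).
func WatchCSRs(cs kubernetes.Interface, updateResourceBundleState func(csrName string, cert []byte), stop <-chan struct{}) {
	factory := informers.NewSharedInformerFactory(cs, 30*time.Second)
	informer := factory.Certificates().V1().CertificateSigningRequests().Informer()
	informer.AddEventHandler(cache.ResourceEventHandlerFuncs{
		UpdateFunc: func(_, newObj interface{}) {
			csr := newObj.(*certv1.CertificateSigningRequest)
			if len(csr.Status.Certificate) > 0 {
				updateResourceBundleState(csr.Name, csr.Status.Certificate)
			}
		},
	})
	factory.Start(stop)
	factory.WaitForCacheSync(stop)
}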


Point #1 is simply a matter of calling the DCM API to create the resources and apply the logical cloud, after which etcd will be populated with all the needed resources (including the CSR) and rsync will be notified via gRPC.


Regarding all 5 points above, here is what isn't yet implemented:

  • rsync can't yet request that a cluster approve a CSR (but we know how to do this)
  • the cluster watcher can't yet extract an issued certificate from the ResourceBundleState and store it in etcd (to be read by DCM)
  • potential fixes once all of these pieces are actually working together hands-off


Dev reference sheet:

To see what the monitor sees (the pod name below is deployment-specific):

kubectl logs monitor-755db946d8-n2w2m

To check the resourcebundlestate:

kubectl get resourcebundlestate
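
To list the CSRs themselves and check whether they have been approved and issued (standard kubectl):

kubectl get csr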
