Installing cert-manager webhook for OCI and Let's Encrypt ClusterIssuer on an OKE cluster

If you're an operator maintaining a footprint of Kubernetes clusters, then you already know that there's more to the job than provisioning a cluster and granting access to it. For a cluster to be useful to an end consumer, an operator will usually install, enable, and configure several other components. One such component is cert-manager. Its purpose is to issue certificates from a variety of supported sources. Consulting the docs, you will learn about the ACME issuer type and the DNS01 challenge validation method. Then, depending on your IaaS requirements, you will find a limited set of supported providers and webhooks.

And now we come to the reason why I'm writing this post. If you're contemplating installing cert-manager on an OKE cluster on OCI, you'll want to employ a webhook. However, it is not as straightforward as you might think. Uninitiated, you might visit this link and follow the instructions to install it. Unfortunately, that implementation has not kept up with the API changes over successive releases of cert-manager, so you would be forced to install a pre-v1.x version of cert-manager. That's not a great option, as you'd be left open to many security vulnerabilities. Here, I want to share how I built, tested, and published an up-to-date version of the aforementioned webhook.

Upgrading the implementation

I decided to target compatibility with cert-manager v1.10 or later, so that the updated webhook would be functional on Kubernetes clusters running versions 1.20 through 1.26.

As I reviewed the GitLab repository where the original webhook implementation and Helm chart reside, I noticed that another developer had made an effort to upgrade the webhook to be compatible with cert-manager 1.8 in this fork. So, I cloned that fork and began to make updates.

I upgraded the Go version in go.mod to 1.19 and decided to target Kubernetes version 1.24 in the Makefile. I also updated the args in the Dockerfile to be:

ARG GOLANG_VERSION=1.19.4
ARG ALPINE_VERSION=3.17

Next, I upgraded the libraries:

go get -u all
go mod tidy

Then, I attempted to build the container image that would be employed by the Helm chart.

docker build -t pacphi/cert-manager-webhook-oci .
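
Because the Go and Alpine versions are declared as build arguments, they can also be overridden at build time without editing the Dockerfile. The following is a minimal sketch, assuming the Dockerfile declares GOLANG_VERSION and ALPINE_VERSION with ARG as shown earlier. The DRY_RUN switch and the IMAGE variable are illustrative and not part of the repository.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Defaults mirror the ARG values in the Dockerfile; override via the environment.
GOLANG_VERSION="${GOLANG_VERSION:-1.19.4}"
ALPINE_VERSION="${ALPINE_VERSION:-3.17}"
IMAGE="${IMAGE:-pacphi/cert-manager-webhook-oci}"

# Assemble the command as an array so arguments are never re-split.
cmd=(docker build
  --build-arg "GOLANG_VERSION=${GOLANG_VERSION}"
  --build-arg "ALPINE_VERSION=${ALPINE_VERSION}"
  -t "${IMAGE}" .)

# DRY_RUN (an assumption of this sketch) defaults to 1: print instead of build.
if [[ "${DRY_RUN:-1}" == "1" ]]; then
  echo "${cmd[*]}"
else
  "${cmd[@]}"
fi
```

Run it with DRY_RUN=0 from the repository root to perform the actual build.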

Where to host this image? In a public container image registry repository. I chose an OCI registry instead of Docker Hub so that I (and others who might want to consume my pre-built image) won't get rate-limited on image pull requests.

Lastly, I authored a Bash script to help me remember the sequence of steps to create a repo (if it doesn't already exist), build and tag the image, obtain an auth token, authenticate to the registry, and finally push the image.
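
That sequence of steps might be sketched roughly as follows. This is not the script from the repository. Every value below (region key, tenancy namespace, compartment OCID, username) is a placeholder, the run helper and DRY_RUN switch are illustrative, and the auth token is assumed to have been generated beforehand (for example in the OCI console) and exported as OCI_AUTH_TOKEN.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Placeholder values (assumptions); substitute your own.
REGION_KEY="${REGION_KEY:-phx}"
TENANCY_NAMESPACE="${TENANCY_NAMESPACE:-mytenancy}"
REPO_NAME="${REPO_NAME:-cert-manager-webhook-oci}"
TAG="${TAG:-latest}"
COMPARTMENT_OCID="${COMPARTMENT_OCID:-ocid1.compartment.oc1..example}"
OCI_USERNAME="${OCI_USERNAME:-user@example.com}"

registry="${REGION_KEY}.ocir.io"
image_ref="${registry}/${TENANCY_NAMESPACE}/${REPO_NAME}:${TAG}"
login_user="${TENANCY_NAMESPACE}/${OCI_USERNAME}"

# Echo the command in dry-run mode (the default), execute it otherwise.
run() {
  if [[ "${DRY_RUN:-1}" == "1" ]]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# 1. Create a public repository; ignore the error if it already exists.
run oci artifacts container repository create \
  --compartment-id "${COMPARTMENT_OCID}" \
  --display-name "${REPO_NAME}" \
  --is-public true || true

# 2. Build the image and tag it for the registry in one step.
run docker build -t "${image_ref}" .

# 3. Authenticate; the password is the auth token, passed on stdin.
if [[ "${DRY_RUN:-1}" == "1" ]]; then
  echo "+ docker login ${registry} --username ${login_user} --password-stdin"
else
  printf '%s' "${OCI_AUTH_TOKEN:?export OCI_AUTH_TOKEN first}" |
    docker login "${registry}" --username "${login_user}" --password-stdin
fi

# 4. Push the image.
run docker push "${image_ref}"
```

By default the sketch only prints what it would do. Set DRY_RUN=0 once the placeholders have been replaced. Federated users may need a different username format for the login step.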

Inspecting the image

I put myself in the shoes of a potential consumer. Could they successfully pull the image with no authentication credentials from the public repository I had just created? And how might they learn about what's truly inside the image?

To pull the image, they'd execute

docker pull phx.ocir.io/axyd58snjxbf/cert-manager-webhook-oci:latest

To explore the layers of the image, they could use dive

dive phx.ocir.io/axyd58snjxbf/cert-manager-webhook-oci:latest
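
To be sure that the pull really works without credentials, it helps to log out of the registry first. The sketch below does that and then prints some basic facts about the image using only stock docker subcommands. The run helper and DRY_RUN switch are illustrative assumptions.

```shell
#!/usr/bin/env bash
set -euo pipefail

IMAGE="${IMAGE:-phx.ocir.io/axyd58snjxbf/cert-manager-webhook-oci:latest}"
registry="${IMAGE%%/*}"   # everything before the first slash

# Echo the command in dry-run mode (the default), execute it otherwise.
run() {
  if [[ "${DRY_RUN:-1}" == "1" ]]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# Drop any stored credentials so the pull is anonymous.
run docker logout "${registry}"
run docker pull "${IMAGE}"

# Print platform and size, then the layers and the commands that created them.
run docker image inspect --format '{{.Os}}/{{.Architecture}} {{.Size}} bytes' "${IMAGE}"
run docker history --no-trunc "${IMAGE}"
```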

Upgrading the Helm chart

The chart configuration required only one update. In apiservice.yaml, I needed to revert all occurrences of v1 back to v1alpha. Why? The acme.d-n.be apiservice is not discoverable otherwise.
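
One way to check discoverability after installing the chart is to wait for every APIService in the acme.d-n.be group to report the Available condition. This is a minimal sketch. The filter_group helper and the DRY_RUN guard are illustrative, and the 60-second timeout is an arbitrary choice.

```shell
#!/usr/bin/env bash
set -euo pipefail

GROUP_NAME="${GROUP_NAME:-acme.d-n.be}"

# Keep only the lines of `kubectl get apiservice -o name` that belong to
# the webhook's API group; print nothing (without failing) if none match.
filter_group() {
  grep -F ".${GROUP_NAME}" || true
}

# Only talk to a cluster when explicitly asked to (DRY_RUN=0).
if [[ "${DRY_RUN:-1}" == "0" ]]; then
  kubectl get apiservice -o name | filter_group | while read -r svc; do
    kubectl wait --for=condition=Available --timeout=60s "${svc}"
  done
fi
```

If the loop prints nothing at all, no APIService for the group was registered, which matches the symptom described above.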

Installing and uninstalling the Helm chart

Instructions in the README of the original repository left much to be desired. So, I authored another Bash script to help me consistently install the prerequisites and the webhook itself. And what about teardown? There's a script for that too.

Testing it out

Before you can leverage the install and uninstall scripts, you'll need to:

* add a DNS zone in OCI
* add a VCN in OCI
* provision a cluster
* update the install script's environment variables

I might improve on that last step soon so that you don't need to edit the script itself.
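
A common way to avoid editing a script is to give each setting a default that the caller's environment can override. The sketch below shows the pattern. The variable names are hypothetical and are not taken from the actual install script.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical settings; each keeps the caller's value if one is exported,
# and falls back to the default on the right otherwise.
: "${DNS_ZONE:=example.com}"
: "${COMPARTMENT_OCID:=ocid1.compartment.oc1..example}"
: "${ACME_EMAIL:=admin@${DNS_ZONE}}"

echo "zone=${DNS_ZONE} compartment=${COMPARTMENT_OCID} email=${ACME_EMAIL}"
```

With this in place, a caller could run something like DNS_ZONE=my.example.org ./install-cert-manager-webhook-oci.sh and leave the script untouched.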

Here's a fast and inexpensive way to test

#! Create a single-node cluster on your workstation with kind; @see https://kind.sigs.k8s.io/
kind create cluster
#! Set up access to your Oracle Cloud Infrastructure account
oci setup config
#! Clone the repo where the updated webhook implementation resides
git clone https://github.com/pacphi/cert-manager-webhook-oci
#! Change directories into the scripts directory
cd cert-manager-webhook-oci/scripts
#! Edit and save updates to install-cert-manager-webhook-oci.sh
#! Then run this script to install
./install-cert-manager-webhook-oci.sh
#! To validate the installation...
kubectl get pods -A
kubectl get apiservice | grep acme.d-n.be
kubectl get clusterissuer
kubectl get cert -A
kubectl get secret -A
#! To uninstall, run this script
./uninstall-cert-manager-webhook-oci.sh
#! To destroy the cluster
kind delete cluster

But if you want to test this out against an OKE cluster, you could follow this tutorial to spin one up.