Yelling at my laptop - Part 1

A retro on how I installed Tanzu Application Platform on GKE integrated with Artifact Registry

Preface

This post chronicles my attempt at installing a commercial, secure software supply-chain platform on top of a public cloud provider's managed Kubernetes-as-a-service offering, while not strictly following the public documentation.

First, let me share a personal bias. I don't have the best recall. Being asked to execute a complex task that I may have completed a few weeks or months ago heightens my anxiety, particularly if I'm under a time constraint. So, if I know I'm going to have to do something repeatedly, I'm going to automate it. I do this to reduce toil and avoid potential mistakes.

On the choice of a technology stack

Every DIY developer I know forms opinions on the choice of tech stack to aid in value delivery. They may or may not have the authority and autonomy to exercise those opinions in an enterprise setting, where the choice of toolset, frameworks, and automation may be dictated by any number of factors (e.g., feasibility, contractual, community, financial, regulatory). Therefore, when I settle on a technology stack to accomplish a task, I want to think about how I'd position its applicability in that context.

To accomplish the task of provisioning public cloud resources, I chose Terraform. It ticks several boxes, not least that I have several years of experience with it. I need to be able to create and destroy cloud environments hosting a footprint of multiple Kubernetes clusters. At a minimum that consists of a virtual network and subnets, a container registry, one or more clusters, one or more DNS zones, blob storage, and a secrets manager instance.

To work with a cluster, of course, I'll need the kubectl client. To make my life easier I'll install a few krew plugins like ctx, reap, and view-secret. To build, tag, pull and push container images for test purposes, I'll use a combination of docker and pack - in anger. Lastly, I'll rely on a complement of other sharp tools, like yq, kapp, and ytt.
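
If you want to follow along, those krew plugins install in one shot (assuming krew itself is already on your PATH):

kubectl krew install ctx reap view-secret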

Provisioning cloud resources

With time I've acquired and developed skills with HCL, authoring and consuming Terraform modules. I've often found that when I want to stitch something together myself, someone else has already authored a coarse-grained module on GitHub or the Terraform Registry that I can easily leverage and/or extend. (It's particularly gratifying when I pick a module that a large community has evolved to maintain and consume.)

Creating an Artifact Registry instance

With that bit of background, let's focus on the creation of a Google Artifact Registry. Artifact registries allow you to host many types of artifacts; for my purposes, I needed to create one to host container images. The first thing you need to grok (which I failed to understand early on, wasting huge amounts of time) is how artifacts are organized for storage and retrieval. The public documentation speaks to how you create a repository, authenticate, and then push and pull container images.
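
The short version of that organization: every image reference encodes the registry location, the Google project, the repository, and the image. For example:

#! Anatomy of an Artifact Registry image reference
#! <location>-docker.pkg.dev/<project-id>/<repository>/<image>:<tag>
#! e.g., us-west2-docker.pkg.dev/fe-cphillipson/tanzu/busybox:latest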

So, after glancing at that, I set about designing a Terraform module to lifecycle-manage an Artifact Registry instance and a repository within it.

main.tf

data "external" "env" {
  program = ["${path.module}/env.sh"]
}

data "google_kms_key_ring" "terraform" {
  name     = var.keyring
  location = var.location
}

data "google_kms_crypto_key" "terraform" {
  name     = "terraform"
  key_ring = data.google_kms_key_ring.terraform.id
}

resource "google_artifact_registry_repository" "instance" {
  for_each      = toset(var.repository_names)
  location      = var.location
  kms_key_name  = data.google_kms_crypto_key.terraform.id
  repository_id = each.key
  description   = "OCI image repository"
  format        = "DOCKER"
}

env.sh

#!/bin/sh

# env.sh

# Change the contents of this output to get the environment variables
# of interest. The output must be valid JSON, with strings for both
# keys and values.
cat <<EOF
{
  "google_service_account_key": "$GOOGLE_SERVICE_ACCOUNT_KEY"
}
EOF

variables.tf

variable "project" {
  description = "A valid Google project identifier"
  sensitive   = true
}

variable "repository_names" {
  type        = list(string)
  description = "Specifies the names of repositories that will be created within Google Cloud Artifact Registry"
  default     = ["tanzu"]
}

variable "location" {
  description = "Google Cloud Artifact Registry Repository and Google Cloud KMS keyring locations"

  validation {
    condition     = contains(["us-west1", "us-west2", "us-east1", "us-central1", "europe-north1", "europe-west1", "europe-southwest1", "asia-east1", "asia-south1", "asia-northeast3", "australia-southeast2"], var.location)
    error_message = "Valid values for Google Cloud Artifact Registry locations are (us-west1, us-west2, us-east1, us-central1, europe-north1, europe-west1, europe-southwest1, asia-east1, asia-south1, asia-northeast3, australia-southeast2)."
  }

  default = "us-west2"
}

variable "keyring" {}

providers.tf

provider "google" {
  project = var.project
}

versions.tf

terraform {

  required_version = ">= 0.14.0"

  required_providers {
    google = {
      source  = "hashicorp/google"
      version = ">= 4.33.0"
    }
  }
}

outputs.tf

output "admin_username" {
  value = "_json_key_base64"
}

output "admin_password" {
  description = "The base64-encoded password associated with the Container Registry admin account"
  value       = base64encode(data.external.env.result["google_service_account_key"])
  sensitive   = true
}

output "endpoint" {
  value = "${var.location}-docker.pkg.dev"
}

terraform.tfvars

project = "REPLACE_ME"
repository_names = ["tanzu"]
location = "us-west2"
keyring = "REPLACE_ME"

create-artifact-registry.sh

#!/bin/bash

terraform init -upgrade
terraform validate
terraform plan -out terraform.plan
terraform apply terraform.plan
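
One caveat: the external data source in main.tf expects GOOGLE_SERVICE_ACCOUNT_KEY to be set in your shell, so export it before invoking the script (the value is your service account key material, elided here):

#! env.sh surfaces this variable to the Terraform external data source
export GOOGLE_SERVICE_ACCOUNT_KEY="..."
./create-artifact-registry.sh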

destroy-artifact-registry.sh

#!/bin/bash

terraform destroy -auto-approve
rm -Rf .terraform .terraform.lock.hcl terraform.tfstate terraform.tfstate.backup terraform.log terraform.plan

Note the use of scripts. I don't want to have to think too hard about the sequence of Terraform commands to issue to create and destroy this cloud resource.

Also, importantly, we're only creating one repository named tanzu within the registry instance.

The outputs from this module allow us to authenticate to the registry instance and push/pull container images.
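
For instance, a quick smoke test might wire the outputs straight into docker login (a sketch, run from the module directory after a successful apply):

#! terraform output -raw prints values unquoted, including sensitive ones
terraform output -raw admin_password \
  | docker login -u "$(terraform output -raw admin_username)" --password-stdin "$(terraform output -raw endpoint)"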

Working with the Artifact Registry instance repository

If we have an instance available, we want to be able to interact with it, right?

Let's quickly run through the sequence of commands to pull a public image, tag it, and then push it to the registry instance repository.

E.g.,

#! Authenticate
echo "cat $HOME/.ssh/terraform-sa.json" | base64 -w0 | docker login -u _json_key_base64 --password-stdin us-west2-docker.pkg.dev
#! Pull a public image from Dockerhub
docker pull busybox
#! Tag the image
docker tag busybox us-west2-docker.pkg.dev/fe-cphillipson/tanzu/busybox
#! Push the image to the Artifact Registry instance repository
docker push us-west2-docker.pkg.dev/fe-cphillipson/tanzu/busybox
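
To verify the push landed, gcloud can enumerate images in the repository:

#! List images in the tanzu repository (values mirror the example above)
gcloud artifacts docker images list us-west2-docker.pkg.dev/fe-cphillipson/tanzu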

Provisioning a cloud environment

To be able to install Tanzu Application Platform, we'll need multiple resources provisioned. I've curated a set of Terraform modules to do that, here. At a minimum, consider executing these modules: iam, project, kms, registry, virtual-network, cluster, main-dns, and child-dns.
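
For the curious, the loop I run looks roughly like this; treat it as a sketch, since it assumes each module lives in its own directory, is applied in the order listed, and handles its own inter-module wiring:

#! Apply the curated modules in dependency order
for module in iam project kms registry virtual-network cluster main-dns child-dns; do
  (cd "$module" && terraform init -upgrade && terraform apply -auto-approve) || break
done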

Installing Tanzu Application Platform

While I provided a link to install TAP (in online mode), we're not going to follow those instructions. As I mentioned earlier, I wanted to put myself in a position to repeat these steps with ease and minimal toil and friction.

I am setting the stage to share a series of posts with you about how I assembled automation targeting multiple cloud providers.

With that in mind, we're going to stare at some YAML, because who doesn't like to do that, huh?

But before we do that, we need to satisfy some prerequisites.

First, we need to create a namespace and install two controllers into the cluster. (You will already have administrator access if you used the Terraform cluster module above.)

#! Create namespace where TAP will be installed
kubectl create namespace tap-install

#! Install kapp-controller
#! @see https://carvel.dev/kapp-controller/docs/v0.43.2/install/
kubectl apply -f https://github.com/vmware-tanzu/carvel-kapp-controller/releases/latest/download/release.yml

#! Install secretgen-controller
#! @see https://github.com/carvel-dev/secretgen-controller/blob/develop/docs/install.md
kubectl apply -f https://github.com/carvel-dev/secretgen-controller/releases/latest/download/release.yml
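
Before moving on, it's worth confirming that both controllers reconciled; by default their deployments land in namespaces of the same name:

kubectl get deployment -n kapp-controller
kubectl get deployment -n secretgen-controller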

Now, let's create some directories.

mkdir tap-install
cd tap-install
mkdir base
mkdir config
mkdir -p profiles/iterate
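
When we're done laying down files, the tree will look like this:

tap-install
├── base
│   ├── rbac.yml
│   ├── container-image-registry-secret.yml
│   ├── dev-namespace.yml
│   ├── tanzu-network-secret.yml
│   ├── tap-package-repo.yml
│   └── tap-package-install.yml
├── config
│   ├── config.yml
│   ├── sensitive-config.yml
│   └── tap-iterate-values-template.yml
└── profiles
    └── iterate
        ├── tbs-full-dependencies-package-repo.yml
        └── tbs-full-dependencies-package-install.yml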

And here beginneth a wave of templated YAML files.

Place the following files in the tap-install/base folder.

rbac.yml

#@ load("@ytt:data", "data")
---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: tap-default-sa
  namespace: #@ data.values.tap.namespace
  annotations:
    kapp.k14s.io/change-group: tap-install/rbac
    kapp.k14s.io/change-rule: "delete after deleting tap-install/tap"
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: tap-default-role
  annotations:
    kapp.k14s.io/change-group: tap-install/rbac
    kapp.k14s.io/change-rule: "delete after deleting tap-install/tap"
rules:
- apiGroups: ["*"]
  resources: ["*"]
  verbs: ["*"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: tap-default-role-binding
  annotations:
    kapp.k14s.io/change-group: tap-install/rbac
    kapp.k14s.io/change-rule: "delete after deleting tap-install/tap"
subjects:
- kind: ServiceAccount
  name: tap-default-sa
  namespace: #@ data.values.tap.namespace
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: tap-default-role

container-image-registry-secret.yml

#@ load("@ytt:data", "data")
#@ load("@ytt:json", "json")
---
#@ def crc_config():
#@  return {
#@    "auths": {
#@      data.values.tap.credentials.registry.host: {
#@        "email": data.values.tap.credentials.registry.email,
#@        "username": data.values.tap.credentials.registry.username,
#@        "password": data.values.tap.credentials.registry.password
#@      }
#@    }
#@  }
#@ end
---
apiVersion: v1
kind: Secret
metadata:
  name: container-registry-credentials
  namespace: #@ data.values.tap.namespace
  annotations:
    kapp.k14s.io/change-rule: "delete after deleting tap"
type: kubernetes.io/dockerconfigjson
stringData:
  .dockerconfigjson: #@ json.encode(crc_config())
---
apiVersion: secretgen.carvel.dev/v1alpha1
kind: SecretExport
metadata:
  name: container-registry-credentials
  namespace: #@ data.values.tap.namespace
spec:
  toNamespaces:
  - '*'

dev-namespace.yml

#@ load("@ytt:data", "data")
---
#@ if data.values.tap.devNamespace != "default" and data.values.tap.devNamespace != "":
apiVersion: v1
kind: Namespace
metadata:
  name: #@ data.values.tap.devNamespace
#@ end
---
apiVersion: secretgen.carvel.dev/v1alpha1
kind: SecretImport
metadata:
  name: container-registry-credentials
  namespace: #@ data.values.tap.devNamespace
spec:
  fromNamespace: #@ data.values.tap.namespace
---
apiVersion: secretgen.carvel.dev/v1alpha1
kind: SecretImport
metadata:
  name: tanzu-network-credentials
  namespace: #@ data.values.tap.devNamespace
spec:
  fromNamespace: #@ data.values.tap.namespace
---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: default
  namespace: #@ data.values.tap.devNamespace
secrets:
  - name: container-registry-credentials
imagePullSecrets:
  - name: container-registry-credentials
  - name: tanzu-network-credentials
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: #@ "{}-{}".format(data.values.tap.devNamespace, "permit-deliverable")
  namespace: #@ data.values.tap.devNamespace
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: deliverable
subjects:
  - kind: ServiceAccount
    name: default
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: #@ "{}-{}".format(data.values.tap.devNamespace, "permit-workload")
  namespace: #@ data.values.tap.devNamespace
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: workload
subjects:
  - kind: ServiceAccount
    name: default

tanzu-network-secret.yml

#@ load("@ytt:data", "data")
#@ load("@ytt:base64", "base64")
#@ load("@ytt:json", "json")
---
#@ def tn_config():
#@  return {
#@    "auths": {
#@      data.values.tap.credentials.tanzuNet.host: {
#@        "username": data.values.tap.credentials.tanzuNet.username,
#@        "password": data.values.tap.credentials.tanzuNet.password
#@      }
#@    }
#@  }
#@ end
---
apiVersion: v1
kind: Secret
metadata:
  name: tanzu-network-credentials
  namespace: #@ data.values.tap.namespace
  annotations:
    kapp.k14s.io/change-rule: "delete after deleting tap"
type: kubernetes.io/dockerconfigjson
stringData:
  .dockerconfigjson: #@ json.encode(tn_config())
---
apiVersion: secretgen.carvel.dev/v1alpha1
kind: SecretExport
metadata:
  name: tanzu-network-credentials
  namespace: #@ data.values.tap.namespace
spec:
  toNamespaces:
  - '*'

tap-package-repo.yml

#@ load("@ytt:data", "data")
---
apiVersion: packaging.carvel.dev/v1alpha1
kind: PackageRepository
metadata:
  name: tanzu-tap-repository
  namespace: #@ data.values.tap.namespace
  annotations:
    kapp.k14s.io/change-group: tap-install/tap-repo
spec:
  fetch:
    imgpkgBundle:
      image: #@ "{}/tanzu-application-platform/tap-packages:{}".format(data.values.tap.credentials.tanzuNet.host, data.values.tap.version)
      secretRef:
        name: tanzu-network-credentials

tap-package-install.yml

#@ load("@ytt:data", "data")
---
apiVersion: packaging.carvel.dev/v1alpha1
kind: PackageInstall
metadata:
  name: tap
  namespace: #@ data.values.tap.namespace
  annotations:
    kapp.k14s.io/change-group: tap-install/tap
    kapp.k14s.io/change-rule.0: "upsert after upserting tap-install/rbac"
    kapp.k14s.io/change-rule.1: "upsert after upserting tap-install/tap-repo"
    packaging.carvel.dev/downgradable: ""
spec:
  packageRef:
    refName: tap.tanzu.vmware.com
    versionSelection:
      constraints: #@ str(data.values.tap.version)
      prereleases: {}
  serviceAccountName: tap-default-sa
  values:
  - secretRef:
      name: tap-values

Place the following files in the tap-install/profiles/iterate folder.

tbs-full-dependencies-package-repo.yml

#@ load("@ytt:data", "data")
---
apiVersion: packaging.carvel.dev/v1alpha1
kind: PackageRepository
metadata:
  name: tbs-full-deps-repository
  namespace: #@ data.values.tap.namespace
  annotations:
    kapp.k14s.io/change-group: tap-install/tbs-full-deps-repo
spec:
  fetch:
    imgpkgBundle:
      image: #@ "{}/tanzu-application-platform/full-tbs-deps-package-repo:{}".format(data.values.tap.credentials.tanzuNet.host, data.values.buildService.version)
      secretRef:
        name: tanzu-network-credentials

tbs-full-dependencies-package-install.yml

#@ load("@ytt:data", "data")
#@ load("@ytt:yaml", "yaml")
---
#@ def stack_config():
  stack_configuration: #@ data.values.tap.stack_configuration
#@ end
---
apiVersion: packaging.carvel.dev/v1alpha1
kind: PackageInstall
metadata:
  name: tbs-full-dependencies
  namespace: #@ data.values.tap.namespace
  annotations:
    kapp.k14s.io/change-group: tap-install/tbs-full-deps
    kapp.k14s.io/change-rule.0: "upsert after upserting tap-install/tap"
    kapp.k14s.io/change-rule.1: "upsert after upserting tap-install/tbs-full-deps-repo"
    packaging.carvel.dev/downgradable: ""
spec:
  packageRef:
    refName: full-tbs-deps.tanzu.vmware.com
    versionSelection:
      constraints: #@ str(data.values.buildService.version)
      prereleases: {}
  serviceAccountName: tap-default-sa
  values:
  - secretRef:
      name: tbs-full-dependencies
---
apiVersion: v1
kind: Secret
metadata:
  name: tbs-full-dependencies
  namespace: #@ data.values.tap.namespace
stringData:
  tbs-full-dependencies-secrets.yml: #@ yaml.encode(stack_config())

Place the following files in the tap-install/config directory.

tap-iterate-values-template.yml

#@ load("@ytt:data", "data")
#@ load("@ytt:yaml", "yaml")
---
#@ def config():
profile: iterate

ceip_policy_disclosed: true

shared:
  ingress_domain: #@ data.values.tap.domains.main

buildservice:
  kp_default_repository: #@ "{}/{}".format(data.values.tap.credentials.registry.host, data.values.tap.registry.repositories.buildService)
  kp_default_repository_username: #@ data.values.tap.credentials.registry.username
  kp_default_repository_password: #@ data.values.tap.credentials.registry.password
  tanzunet_username: #@ data.values.tap.credentials.tanzuNet.username
  tanzunet_password: #@ data.values.tap.credentials.tanzuNet.password
  exclude_dependencies: true
  stack_configuration: #@ data.values.tap.stack_configuration

#! supply_chain is pinned to basic as we want fast feedback in inner loop development
supply_chain: basic

ootb_supply_chain_basic:
  cluster_builder: full
  registry:
    server: #@ data.values.tap.credentials.registry.host
    repository: #@ data.values.tap.registry.repositories.ootbSupplyChain
  gitops:
    ssh_secret: #@ data.values.tap.supply_chain.gitops.ssh_secret

scanning:
  metadataStore:
    url: ""

metadata_store:
  ns_for_export_app_cert: #@ data.values.tap.devNamespace
  app_service_type: ClusterIP

image_policy_webhook:
  allow_unmatched_tags: true

contour:
  envoy:
    service:
      type: LoadBalancer

cnrs:
  domain_name: #@ data.values.tap.domains.knative
  domain_template: #@ "{}-{}-{}".format("{{.Name}}", data.values.tap.domains.suffix, "{{.Namespace}}.{{.Domain}}")

appliveview_connector:
  backend:
    sslDisabled: false
    ingressEnabled: true
    host: #@ "appliveview.{}".format(data.values.tap.domains.main)

#@ end
---
apiVersion: v1
kind: Secret
metadata:
  name: tap-values
  namespace: #@ data.values.tap.namespace
type: Opaque
stringData:
  values.yml: #@ yaml.encode(config())

config.yml

#@data/values

#@overlay/match-child-defaults missing_ok=True
---
buildService:
  version: "1.9.0"

tap:
  version: "1.4.0"
  namespace: tap-install
  devNamespace: development
  catalogs: []

  registry:
    repositories:
      buildService: REPLACE_WITH_YOUR_GOOGLE_PROJECT_ID/tanzu/build-service
      ootbSupplyChain: REPLACE_WITH_YOUR_GOOGLE_PROJECT_ID/tanzu/supply-chain

  domains:
    main: tap.REPLACE_WITH_YOUR_DOMAIN
    knative: apps.tap.REPLACE_WITH_YOUR_DOMAIN
    tapGui: tap-gui.tap.REPLACE_WITH_YOUR_DOMAIN
    suffix: iterate

  #! Change to "jammy-only" if you want to install Tanzu Application Platform with Ubuntu 22.04 (Jammy) as the only available stack
  stack_configuration: "default"

  supply_chain:
    cluster_builder: full
    #! choices below are: [ go-git, libgit2 ]
    git_implementation: go-git
    gitops:
      ssh_secret: ""

sensitive-config.yml

#@data/values

#@overlay/match-child-defaults missing_ok=True
---
tap:
  credentials:
    registry:
      host: REPLACE_WITH_GOOGLE_REGION-docker.pkg.dev
      username: _json_key_base64
      password: REPLACE_WITH_BASE64_ENCODED_GOOGLE_SERVICE_ACCOUNT_JSON_FILE_CONTENTS
    tanzuNet:
      host: registry.tanzu.vmware.com
      username: REPLACE_WITH_TANZUNET_USERNAME
      password: REPLACE_WITH_TANZUNET_PASSWORD
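
If you're wondering how to produce that base64-encoded password value, one way on Linux (macOS's base64 doesn't wrap lines by default, so drop the flag there):

#! Emit the service account key as a single-line base64 string (path is illustrative)
base64 -w0 < "$HOME/.ssh/terraform-sa.json"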

Phew! That was quite a wave. Now we'll marry the configuration with the templated manifest to generate a concrete manifest named tap-iterate-values.yml using ytt. (Make sure you are still in the tap-install directory).

ytt -f config > tap-iterate-values.yml

Go ahead and inspect the result; it's just a Secret.

cat tap-iterate-values.yml

Let's deploy it.

kubectl apply -f tap-iterate-values.yml
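
If you'd like to confirm what the cluster actually stored, the view-secret plugin from earlier earns its keep here:

kubectl view-secret tap-values values.yml -n tap-install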

To cap it all off, we'll use the combination of ytt and kapp to install Tanzu Application Platform, activating the iterate profile.

kapp deploy -a tap-iterate -f <(ytt -f config/config.yml -f config/sensitive-config.yml -f base -f profiles/iterate) -c -y

Execute the following command to see that all components were installed successfully. You may have to run it multiple times if you're impatient; a typical installation completes within 20 minutes. (You'll know it's done when the DESCRIPTION column shows Reconcile succeeded for all installed components.)

kubectl get app -A
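
If a component sits in a failed or stuck Reconciling state, drill in rather than waiting it out; for example:

#! Watch reconciliation progress instead of polling by hand
kubectl get app -n tap-install -w
#! Describe the top-level PackageInstall for error detail
kubectl -n tap-install describe packageinstall tap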

Final thoughts

We spent a fair amount of time cultivating HCL and YAML to drive the desired outcome of automated provisioning of public cloud resources and installation of TAP. And we were exposed to some sharp tools. However, exercising the building blocks above still leads to toil.

What if we were asked to create multiple environments? No doubt, we'd start crying.

And what about collaboration? I would want the system of record to be a Git repository, not my laptop. I don't want to tie up my laptop compute either (when provisioning cloud resources with Terraform), so I would consider leveraging a self-service continuous integration/delivery engine.

What else might we do to improve upon this approach?

Stay tuned for my next post.