Kubernetes the Not So Hard Way With Ansible - Ingress with Traefik v2 and cert-manager (Part 2) [Updated for Traefik v2.8]
CHANGELOG
2022-02-02
- update cert-manager to v1.6.1
2021-09-12
- update Traefik to v2.5
- update cert-manager to v1.5
2021-05-19
- update Traefik to v2.4
- update cert-manager to v1.3
Install role
In part 1 I installed Traefik proxy, so it's now basically possible to expose Kubernetes services to the Internet. But nowadays traffic should be encrypted whenever possible. And even if you don't think you need it, think about SEO: Google ranks sites with encrypted traffic higher, for example.
So cert-manager can be installed to automatically obtain TLS certificates from Let's Encrypt. The certificates can then be used by Traefik to enable TLS for an Ingress, and cert-manager will also take care of keeping them up to date.
As with Traefik I've also prepared an Ansible role to install cert-manager. It's available at Ansible Galaxy and can be installed via
ansible-galaxy install githubixx.cert_manager_kubernetes
or you can just clone the GitHub repository into your roles directory:
git clone https://github.com/githubixx/ansible-role-cert-manager-kubernetes
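If you prefer to manage your roles via a requirements file, a minimal sketch of a requirements.yml could look like this (the version is just a placeholder, pick an existing tag of the role):
---
roles:
  - name: githubixx.cert_manager_kubernetes
    src: https://github.com/githubixx/ansible-role-cert-manager-kubernetes
    version: vX.Y.Z  # placeholder - use a real tag from the repository
Afterwards the role can be installed with ansible-galaxy install -r requirements.yml.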
Role requirements
Like for the Traefik role you also need Helm 3 and kubectl plus a properly configured KUBECONFIG.
Behind the scenes the role uses the official Helm chart. Currently procedures like installing, updating/upgrading and deleting the cert-manager deployment are supported.
Role configuration
So let's have a look at the available role variables:
# Helm chart version
cert_manager_chart_version: "v1.6.1"
# Helm release name
cert_manager_release_name: "cert-manager"
# Helm repository name
cert_manager_repo_name: "jetstack"
# Helm chart name
cert_manager_chart_name: "{{ cert_manager_repo_name }}/{{ cert_manager_release_name }}"
# Helm chart URL
cert_manager_chart_url: "https://charts.jetstack.io"
# Kubernetes namespace where cert-manager resources should be installed
cert_manager_namespace: "cert-manager"
# The following list contains the configurable parameters of the cert-manager
# Helm chart. For all possible values see:
# https://artifacthub.io/packages/helm/jetstack/cert-manager#configuration
# But for most users "installCRDs=true" should be sufficient.
# If true, CRD resources will be installed as part of the Helm chart.
# If enabled, when uninstalling CRD resources will be deleted causing all
# installed custom resources to be DELETED.
cert_manager_values:
  - installCRDs=true
  - global.leaderElection.namespace="{{ cert_manager_namespace }}"
# To install "ClusterIssuer" for Let's Encrypt (LE) "cert_manager_le_clusterissuer_options"
# needs to be defined. The variable contains a list of hashes and can be defined
# in "group_vars/all.yml" e.g.
#
# name:   Defines the name of the "ClusterIssuer"
# email:  Use a valid e-mail address to be alerted by LE in case a certificate
#         expires
# server: Hostname part of the LE URL
# private_key_secret_ref_name: Name of the secret which stores the private key
# solvers_http01_ingress_class: Value of "kubernetes.io/ingress.class" annotation.
#                               Depends on your ingress controller. Common values
#                               are "traefik" for Traefik or "nginx" for nginx.
#
# Besides "email" the following values can be used as is and will create valid
# "ClusterIssuer" for Let's Encrypt staging and production. Only "email" needs
# to be adjusted if Traefik is used as ingress controller. For other ingress
# controllers "solvers_http01_ingress_class" needs to be adjusted too. Currently
# only "ClusterIssuer" and "http01" solver is implemented. For definition also
# see "tasks/install-issuer.yml".
#
cert_manager_le_clusterissuer_options:
  - name: letsencrypt-prod
    email: insert@your-e-mail-address.here
    server: acme-v02
    private_key_secret_ref_name: letsencrypt-account-key
    solvers_http01_ingress_class: "traefik"
  - name: letsencrypt-staging
    email: insert@your-e-mail-address.here
    server: acme-staging-v02
    private_key_secret_ref_name: letsencrypt-staging-account-key
    solvers_http01_ingress_class: "traefik"
First check if you want to change any of the default values in defaults/main.yml. As usual those values can be overridden in host_vars or group_vars. Normally there is no need to change that much. Besides cert_manager_chart_version you might want to add a few options to cert_manager_values. It contains the configurable parameters of the cert-manager Helm chart. The list is submitted "as is" to the helm binary for the template, install or upgrade commands.
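For example, an override in group_vars/all.yml could look like the following sketch. The resources.requests entries are only there to illustrate how additional chart parameters are appended to the list, they're not required for this setup:
cert_manager_chart_version: "v1.6.1"
cert_manager_values:
  - installCRDs=true
  - global.leaderElection.namespace="{{ cert_manager_namespace }}"
  - resources.requests.cpu=10m     # optional example value
  - resources.requests.memory=64Mi # optional example value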
My cert_manager entry in Ansible's hosts file is just
[cert_manager]
localhost
And in k8s.yml I added
-
  hosts: cert_manager
  roles:
    - role: githubixx.cert_manager_kubernetes
      tags: role-cert-manager-kubernetes
Render and verify YAML resources
The default action is to just render the Kubernetes resources YAML files after replacing all Jinja2 variables and the like (that means not specifying any value via --extra-vars action=... to ansible-playbook).
So to render the YAML files that WOULD be applied (nothing will be installed at this point) execute the following command, assuming the playbook is called k8s.yml (as mentioned in the previous blog post you may set the ANSIBLE_STDOUT_CALLBACK=debug environment variable or stdout_callback = debug in ansible.cfg to get pretty printed output):
ansible-playbook --tags=role-cert-manager-kubernetes k8s.yml
Install cert-manager
If the rendered output contains everything you need, the role can be installed, which finally deploys cert-manager (still assuming the playbook file is called k8s.yml - if not please adjust accordingly):
ansible-playbook --tags=role-cert-manager-kubernetes --extra-vars action=install k8s.yml
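Since the role deploys cert-manager via Helm behind the scenes, you can also check the Helm release itself. The release name and namespace below are the role defaults shown above:
helm -n cert-manager ls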
To check if everything was deployed use the usual kubectl commands like kubectl -n <cert_manager_namespace> get pods -o wide. E.g.
kubectl -n cert-manager get pods -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
cert-manager-76d899dd6c-8q8bx 1/1 Running 0 1m 10.200.3.146 worker01 <none> <none>
cert-manager-cainjector-68c96b7844-wrnr5 1/1 Running 0 1m 10.200.2.32 worker02 <none> <none>
cert-manager-webhook-5bb449596f-5pbqx 1/1 Running 0 1m 10.200.3.249 worker01 <none> <none>
Before the playbook finishes it waits for the first cert-manager-webhook pod to become ready. In general, wait until all cert-manager pods are ready before you try to get the first certificate.
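If you want to script that check instead of watching the pods manually, a kubectl wait call like the following should do (the timeout value is arbitrary, adjust it to your liking):
kubectl -n cert-manager wait --for=condition=Ready pod --all --timeout=300s  # 300s is an arbitrary timeout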
Install ClusterIssuer
The role currently supports deploying a ClusterIssuer for Let's Encrypt (LE) staging and production. The most relevant variable in this case is cert_manager_le_clusterissuer_options. Please see the role variables above for more information.
After the cert_manager_le_clusterissuer_options variable is adjusted accordingly the ClusterIssuer can be installed:
ansible-playbook --tags=role-cert-manager-kubernetes --extra-vars action=install-issuer k8s.yml
After deploying the issuers for the first time it takes a little while until they are ready. To figure out if they are ready kubectl can be used:
kubectl get clusterissuer.cert-manager.io
NAME READY AGE
letsencrypt-prod True 10m
letsencrypt-staging True 11m
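If an issuer does not become ready, kubectl describe usually shows why in the status and events, e.g. for the staging issuer created above:
kubectl describe clusterissuer letsencrypt-staging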
Request Let’s Encrypt certificate
Before a Certificate can be requested make sure that the DNS entry for the domain you want to get a certificate for points to one of the Traefik instances or to the loadbalancer IP that you might have placed "in front" of the Traefik instances.
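A quick way to verify that is dig (or nslookup). www.domain.name is of course just the placeholder domain used throughout this post:
dig +short www.domain.name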
Now a certificate can be issued. This happens outside of this Ansible role. E.g. to get a certificate for the domain www.domain.name from Let's Encrypt's staging server (this one is only for testing and doesn't issue a valid certificate that browsers will accept), create a YAML file (e.g. domain-name.yaml) like this:
---
apiVersion: cert-manager.io/v1
kind: Certificate
metadata:
  name: cert-name
  namespace: namespace-name
spec:
  commonName: www.domain.name
  secretName: secret-name
  dnsNames:
    - www.domain.name
  issuerRef:
    name: letsencrypt-staging
    kind: ClusterIssuer
issuerRef.name: letsencrypt-staging points to the Let's Encrypt staging API. Before switching to the production API (letsencrypt-prod) make sure that staging works fine. The production API has some rate limiting, so if you experiment too much with that issuer Let's Encrypt might block you for a while. After changing the values to your needs, apply this file with kubectl apply -f domain-name.yaml.
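cert-manager creates a few intermediate resources (CertificateRequest, Order and Challenge) while the certificate is being issued. Listing them is often the quickest way to see where a request is stuck (namespace-name is again the placeholder namespace used in the Certificate above):
kubectl -n namespace-name get certificaterequests,orders,challenges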
If you request a (Cluster)Issuer or a Certificate you can watch the cert-manager logs to see what's going on, e.g. (in case you use a different namespace for cert-manager change the namespace accordingly):
kubectl -n cert-manager logs --tail=5 -f $(kubectl -n cert-manager get pods -l app=cert-manager --output=jsonpath='{.items..metadata.name}')
To get information about a Certificate this command can be used:
kubectl -n your-namespace get certificate cert-name -o json
Especially check whether the Certificate is ready, e.g.:
kubectl -n your-namespace get certificate your-certificate -o json | jq '.status.conditions'
[
{
"lastTransitionTime": "2021-01-03T22:05:59Z",
"message": "Certificate is up to date and has not expired",
"reason": "Ready",
"status": "True",
"type": "Ready"
}
]
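Once the Certificate is ready, the actual certificate ends up in the Kubernetes Secret referenced by secretName (secret-name in the example above). If you want to double check issuer and validity dates, the secret can be inspected with openssl. This is just an optional verification step:
kubectl -n namespace-name get secret secret-name -o jsonpath='{.data.tls\.crt}' | base64 -d | openssl x509 -noout -issuer -dates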
For more information also see the README of the role.
Configure IngressRoute
Now that the (staging) certificate is in place we can finally create an IngressRoute. IngressRoute is a Traefik specific custom implementation of Ingress. The IngressRoute will use the certificate which cert-manager fetched from Let's Encrypt and stored as a Kubernetes Secret:
---
apiVersion: traefik.containo.us/v1alpha1
kind: IngressRoute
metadata:
  name: www-domain-name
  namespace: namespace-name
spec:
  entryPoints:
    - web
    - websecure
  routes:
    - kind: Rule
      match: Host(`www.domain.name`)
      services:
        - kind: Service
          name: service-name
          namespace: namespace-name
          passHostHeader: true
          port: 80
  tls:
    secretName: secret-name
This manifest specifies an IngressRoute called www-domain-name in the namespace namespace-name. It's bound to the web and websecure entrypoints. There is also a Rule: it'll trigger if an incoming request wants to fetch a page from www.domain.name and will forward the request to a Service called service-name in the namespace namespace-name. Finally tls.secretName is set to secret-name. That's the reference to the Kubernetes Secret in which cert-manager stored the certificate requested above.
If you now save the manifest to ingressroute.yaml it can be applied with kubectl apply -f ingressroute.yaml. This will create the resource and you should see the IngressRoute in the Traefik dashboard (see above for how to access it).
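You can also verify from the outside that Traefik actually serves the certificate. With the staging issuer the certificate is signed by an untrusted Let's Encrypt staging CA, hence the -k flag to skip verification (www.domain.name is again the placeholder domain):
curl -kv https://www.domain.name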
That’s it basically! :-)
What’s next
You probably already figured out that the whole setup is okay so far but not perfect. If you point your website's DNS record to one of the Traefik instances (which basically means to one of the Traefik DaemonSet members) and that host dies, you're out of business for a while. Also if you use DNS round robin and distribute the requests to all Traefik nodes you still have the problem that if one node fails you lose at least the requests going to that node. One solution to this problem could be a managed loadbalancer as already mentioned further above.
If you can change your DNS records via an API (which is the case for Google Cloud DNS or OVH DNS e.g.) you could deploy a Kubernetes CronJob that monitors all Traefik instances and changes DNS records if one of the nodes fails, or you could implement such a watchdog yourself and deploy the program as a pod in your K8s cluster. It could also be based on Prometheus metrics.
But one of the best options to solve the problem is probably MetalLB. Also see Configuring HA Kubernetes cluster on bare metal servers with GlusterFS & MetalLB and What you need to know about MetalLB.
If you use Hetzner Cloud, hcloud-fip-controller is a possible option that might be sufficient for quite a few use cases. hcloud-fip-controller is a small controller to handle floating IP management in a Kubernetes cluster on Hetzner Cloud virtual machines.
There is also kube-vip. The kube-vip project provides High-Availability and load-balancing for both inside and outside a Kubernetes cluster.
Next up: Kubernetes the Not So Hard Way With Ansible - Upgrading Kubernetes