Kubernetes the not so hard way with Ansible - etcd cluster - (K8s v1.28)

This post is based on Kelsey Hightower’s Kubernetes The Hard Way - Bootstrapping the etcd cluster.

In the previous part, certificate authority, I installed the PKI (public key infrastructure) to secure the communication between the Kubernetes components/infrastructure and also created certificates for authentication of the various K8s components. Now we use the certificate authorities (CA) and the generated keys for the first and very important component: the etcd cluster. etcd is basically a distributed key/value database. The Kubernetes services in general are stateless; all state is stored in etcd. So take good care of your etcd cluster in production: if you lose all etcd nodes you lose the whole Kubernetes state… Making a snapshot/backup from time to time is therefore - at least - recommended 😉
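
For example, once the cluster is up and running (see below), a snapshot can be taken with etcdctl. This is just a minimal sketch: the snapshot path is an arbitrary example, and the certificate paths and endpoint match the role defaults used later in this post. Run it as root (or the etcd user) on one of the etcd nodes so the certificate files are readable:

# Example only: the snapshot path "/var/tmp/etcd-snapshot.db" is arbitrary,
# the certificate paths match the role defaults from this post.
ETCDCTL_API=3 etcdctl snapshot save /var/tmp/etcd-snapshot.db \
  --endpoints=https://127.0.0.1:2379 \
  --cacert=/etc/etcd/ca-etcd.pem \
  --cert=/etc/etcd/cert-etcd-server.pem \
  --key=/etc/etcd/cert-etcd-server-key.pem

The localhost endpoint works because the role also lets etcd listen on 127.0.0.1 (see listen-client-urls below).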

I want to mention that if the etcd cluster nodes won’t join, a possible reason could be the certificates. If it isn’t the firewall blocking traffic between the etcd nodes, the certificates’ host list (the subject alternative names) could be the problem. The error messages aren’t always clear about the actual issue.
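
To inspect which hosts a certificate is actually valid for, you can print its subject alternative names (SANs) with openssl, e.g.:

# Assumes the certificates are located in "/etc/etcd" (the role default used below)
openssl x509 -in /etc/etcd/cert-etcd-peer.pem -noout -text \
  | grep -A 1 "Subject Alternative Name"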

As usual we add the role ansible-role-etcd to the k8s.yml file e.g.:

- hosts: k8s_etcd
  roles:
    - role: githubixx.etcd
      tags: role-etcd

Next install the role via

ansible-galaxy install githubixx.etcd

(or just clone the GitHub repository if you prefer). Basically you don’t need to change a lot of variables, but of course you can if you want:

# The directory from where to copy the etcd certificates. By default this
# will expand to the user's LOCAL $HOME (the home of the user that runs
# "ansible-playbook ...") plus "/etcd-certificates". That means if the user's
# $HOME directory is e.g. "/home/da_user" then "etcd_ca_conf_directory" will
# have a value of "/home/da_user/etcd-certificates".
etcd_ca_conf_directory: "{{ '~/etcd-certificates' | expanduser }}"

# etcd Ansible group
etcd_ansible_group: "k8s_etcd"

# etcd version
etcd_version: "3.5.9"

# Port where etcd is listening for clients
etcd_client_port: "2379"

# Port where etcd is listening for its peers
etcd_peer_port: "2380"

# Interface to bind etcd ports to
etcd_interface: "tap0"

# Run etcd daemon as this user.
#
# Note 1: If you want to use an "etcd_peer_port" < 1024 you most probably need
# to run "etcd" as user "root".
# Note 2: If the user specified in "etcd_user" does not exist then the role
# will create it. Only if the user already exists the role will not create it
# but it will adjust its UID/GID and shell if specified (see settings below).
# Additionally if "etcd_user" is "root" then this role won't touch the user
# at all.
etcd_user: "etcd"

# UID of user specified in "etcd_user". If not specified the next available
# UID from "/etc/login.defs" will be taken (see "SYS_UID_MAX" setting).
# etcd_user_uid: "999"

# Shell for specified user in "etcd_user". For increased security keep
# the default.
etcd_user_shell: "/bin/false"

# Specifies if the user specified in "etcd_user" will be a system user (default)
# or not. If "true" the "etcd_user_home" setting will be ignored. In general
# it makes sense to keep the default as there should be no need to login as
# the user that runs "etcd".
etcd_user_system: true

# Home directory of user specified in "etcd_user". Will be ignored if
# "etcd_user_system" is set to "true". In this case no home directory will
# be created. Normally not needed.
# etcd_user_home: "/home/etcd"

# Run etcd daemon as this group
#
# Note: If the group specified in "etcd_group" does not exist then the role
# will create it. Only if the group already exists the role will not create it
# but will adjust GID if specified in "etcd_group_gid" (see setting below).
etcd_group: "etcd"

# GID of group specified in "etcd_group". If not specified the next available
# GID from "/etc/login.defs" will be taken (see "SYS_GID_MAX" setting).
# etcd_group_gid: "999"

# Specifies if the group specified in "etcd_group" will be a system group (default)
# or not.
etcd_group_system: true

# Directory for etcd configuration
etcd_conf_dir: "/etc/etcd"

# Permissions for directory for etcd configuration
etcd_conf_dir_mode: "0750"

# Owner of directory specified in "etcd_conf_dir"
etcd_conf_dir_user: "root"

# Group owner of directory specified in "etcd_conf_dir"
etcd_conf_dir_group: "{{ etcd_group }}"

# Directory to store downloaded etcd archive
# Should not be deleted to avoid downloading over and over again
etcd_download_dir: "/opt/etcd"

# Permissions for directory to store downloaded etcd archive
etcd_download_dir_mode: "0755"

# Owner of directory specified in "etcd_download_dir"
etcd_download_dir_user: "{{ etcd_user }}"

# Group owner of directory specified in "etcd_download_dir"
etcd_download_dir_group: "{{ etcd_group }}"

# Directory to store etcd binaries
#
# IMPORTANT: If you use the default value for "etcd_bin_dir" which is
# "/usr/local/bin" then the settings specified in "etcd_bin_dir_mode",
# "etcd_bin_dir_user" and "etcd_bin_dir_group" are ignored. This prevents
# the permissions of "/usr/local/bin" from being changed. This directory
# normally already exists on every Linux installation and should not be
# changed. So please be careful if you specify a directory like "/usr/bin"
# or "/bin" as "etcd_bin_dir" as this will change the permissions of
# these directories - something you normally do not want.
etcd_bin_dir: "/usr/local/bin"

# Permissions for directory to store etcd binaries
etcd_bin_dir_mode: "0755"

# Owner of directory specified in "etcd_bin_dir"
etcd_bin_dir_user: "{{ etcd_user }}"

# Group owner of directory specified in "etcd_bin_dir"
etcd_bin_dir_group: "{{ etcd_group }}"

# etcd data directory (the etcd database files, so to speak)
etcd_data_dir: "/var/lib/etcd"

# Permissions for directory to store etcd data
etcd_data_dir_mode: "0700"

# Owner of directory specified in "etcd_data_dir"
etcd_data_dir_user: "{{ etcd_user }}"

# Group owner of directory specified in "etcd_data_dir"
etcd_data_dir_group: "{{ etcd_group }}"

# Architecture to download and install
etcd_architecture: "amd64"

# Only change this if the architecture you are using is unsupported
# For more information, see this:
# https://github.com/etcd-io/website/blob/main/content/en/docs/v3.5/op-guide/supported-platform.md
etcd_allow_unsupported_archs: false

# By default the etcd tarball gets downloaded from the
# official etcd repository. This can be changed to some custom
# URL if needed. For more information which protocols
# can be used see:
# https://docs.ansible.com/ansible/latest/collections/ansible/builtin/get_url_module.html
# It's only important to keep the filename naming schema:
# "etcd-v{{ etcd_version }}-linux-{{ etcd_architecture }}.tar.gz"
etcd_download_url: "https://github.com/etcd-io/etcd/releases/download/v{{ etcd_version }}/etcd-v{{ etcd_version }}-linux-{{ etcd_architecture }}.tar.gz"

# By default the SHA256SUMS file is used to verify the
# checksum of the tarball archive. This can also be
# changed to fit your needs.
etcd_download_url_checksum: "sha256:https://github.com/coreos/etcd/releases/download/v{{ etcd_version }}/SHA256SUMS"

# Options for [Service] section. For more information see:
# https://www.freedesktop.org/software/systemd/man/systemd.service.html#Options
# The options below "Type=notify" are mostly security/sandbox related settings
# and limit the exposure of the system towards the unit's processes.
# https://www.freedesktop.org/software/systemd/man/systemd.exec.html
etcd_service_options:
  - User={{ etcd_user }}
  - Group={{ etcd_group }}
  - Restart=on-failure
  - RestartSec=5
  - Type=notify
  - ProtectHome=true
  - PrivateTmp=true
  - ProtectSystem=full
  - ProtectKernelModules=true
  - ProtectKernelTunables=true
  - ProtectControlGroups=true
  - CapabilityBoundingSet=~CAP_SYS_PTRACE

etcd_settings:
  "name": "{{ ansible_hostname }}"
  "cert-file": "{{ etcd_conf_dir }}/cert-etcd-server.pem"
  "key-file": "{{ etcd_conf_dir }}/cert-etcd-server-key.pem"
  "trusted-ca-file": "{{ etcd_conf_dir }}/ca-etcd.pem"
  "peer-cert-file": "{{ etcd_conf_dir }}/cert-etcd-peer.pem"
  "peer-key-file": "{{ etcd_conf_dir }}/cert-etcd-peer-key.pem"
  "peer-trusted-ca-file": "{{ etcd_conf_dir }}/ca-etcd.pem"
  "advertise-client-urls": "{{ 'https://' + hostvars[inventory_hostname]['ansible_' + etcd_interface].ipv4.address + ':' + etcd_client_port }}"
  "initial-advertise-peer-urls": "{{ 'https://' + hostvars[inventory_hostname]['ansible_' + etcd_interface].ipv4.address + ':' + etcd_peer_port }}"
  "listen-peer-urls": "{{ 'https://' + hostvars[inventory_hostname]['ansible_' + etcd_interface].ipv4.address + ':' + etcd_peer_port }}"
  "listen-client-urls": "{{ 'https://' + hostvars[inventory_hostname]['ansible_' + etcd_interface].ipv4.address + ':' + etcd_client_port + ',https://127.0.0.1:' + etcd_client_port }}"
  "peer-client-cert-auth": "true"            # Enable peer client cert authentication
  "client-cert-auth": "true"                 # Enable client cert authentication
  "initial-cluster-token": "etcd-cluster-0"  # Initial cluster token for the etcd cluster during bootstrap.
  "initial-cluster-state": "new"             # Initial cluster state ('new' or 'existing')
  "data-dir": "{{ etcd_data_dir }}"          # etcd data directory (etcd database files so to say)
  "wal-dir": ""                              # Dedicated wal directory ("" means no separated WAL directory)
  "auto-compaction-retention": "0"           # Auto compaction retention in hour. 0 means disable auto compaction.
  "snapshot-count": "100000"                 # Number of committed transactions to trigger a snapshot to disk
  "heartbeat-interval": "100"                # Time (in milliseconds) of a heartbeat interval
  "election-timeout": "1000"                 # Time (in milliseconds) for an election to timeout. See tuning documentation for details
  "max-snapshots": "5"                       # Maximum number of snapshot files to retain (0 is unlimited)
  "max-wals": "5"                            # Maximum number of wal files to retain (0 is unlimited)
  "quota-backend-bytes": "0"                 # Raise alarms when backend size exceeds the given quota (0 defaults to low space quota)
  "logger": "zap"                            # Specify ‘zap’ for structured logging or ‘capnslog’.
  "log-outputs": "systemd/journal"           # Specify 'stdout' or 'stderr' to skip journald logging even when running under systemd
  "enable-v2": "true"                        # enable v2 API to stay compatible with previous etcd 3.3.x (needed for flannel e.g.)
  "discovery-srv": ""                        # Discovery domain to enable DNS SRV discovery, leave empty to disable. If set, will override initial-cluster.

# Certificate authority and certificate files for etcd
etcd_certificates:
  - ca-etcd.pem               # certificate authority file
  - ca-etcd-key.pem           # certificate authority key file
  - cert-etcd-peer.pem        # peer TLS cert file
  - cert-etcd-peer-key.pem    # peer TLS key file
  - cert-etcd-server.pem      # server TLS cert file
  - cert-etcd-server-key.pem  # server TLS key file

The etcd default flags/settings defined in etcd_settings can be overridden by defining a variable called etcd_settings_user. You can also add additional settings by using this variable. E.g. to override the default value of the log-outputs setting and to add new settings like heartbeat-interval and election-timeout, add the following to group_vars/k8s_etcd.yml:

etcd_settings_user:
  "log-outputs": "stdout"
  "heartbeat-interval": "250"
  "election-timeout": "2500"

The last two settings are definitely something to check out. Their values made a lot of sense in my setup as the defaults were a bit too low. I also added "peer-cert-allowed-cn": "etcd" in my setup. The value of this setting is taken from the etcd_peer_csr_cn variable (see previous blog post). When the peer-cert-allowed-cn flag is specified, an etcd node can only join a cluster if its peer certificate has a matching common name (CN), even with a shared CA (certificate authority). This might be important if you run multiple etcd clusters that all use the same CA, as it prevents an etcd member from accidentally joining the wrong cluster.
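
Putting it together, the complete etcd_settings_user in group_vars/k8s_etcd.yml could then look similar to this (the CN value must of course match the one specified via etcd_peer_csr_cn):

etcd_settings_user:
  "log-outputs": "stdout"
  "heartbeat-interval": "250"
  "election-timeout": "2500"
  "peer-cert-allowed-cn": "etcd"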

And then there are two more default variables which I’ll change:

etcd_interface: "{{ k8s_interface }}"
etcd_ca_conf_directory: "{{ k8s_ca_conf_directory }}"

The role will search for the certificates I created in the certificate authority post in the directory specified in etcd_ca_conf_directory (which in my case is just the value of k8s_ca_conf_directory defined in group_vars/all.yml) on the host where the Ansible Controller is located. Also make sure to change etcd_interface to wg0 (or the variable k8s_interface defined in group_vars/all.yml) instead of tap0 if you followed my blog series so far and use the WireGuard VPN interface wg0. That’s important as the etcd cluster nodes and the kube-apiserver must be able to talk to each other!
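
To quickly verify that the WireGuard interface is actually up on all etcd nodes before deploying, an ad-hoc command like this can be used:

# Assumes "wg0" as WireGuard interface name (see "k8s_interface")
ansible -m command -a "ip -brief addr show wg0" k8s_etcd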

And finally the firewall rules must be adjusted. So far I only allowed port 22222 (for SSH) and 51820 (for WireGuard). For etcd, port 2380 (the default) needs to be open between the etcd cluster members for cluster/peer communication, and normally port 2379 needs to be open for etcd to kube-apiserver communication. But as I’ll also use my etcd cluster for Cilium and Traefik, as mentioned in the previous blog post, I need to allow port 2379 for the whole cluster. So I extend harden_linux_ufw_rules in group_vars/k8s_etcd.yml accordingly and the final list looks like this:

harden_linux_ufw_rules:
  - rule: "allow"
    to_port: "22222"
    protocol: "tcp"
  - rule: "allow"
    to_port: "51820"
    protocol: "udp"
  - rule: "allow"
    to_port: "2379"
    protocol: "tcp"
    from_ip: "10.0.11.0/24" # K8s Controller + Worker
  - rule: "allow"
    to_port: "2380"
    protocol: "tcp"
    from_ip: "10.0.11.2/32" # etcd node1
  - rule: "allow"
    to_port: "2380"
    protocol: "tcp"
    from_ip: "10.0.11.5/32" # etcd node2
  - rule: "allow"
    to_port: "2380"
    protocol: "tcp"
    from_ip: "10.0.11.8/32" # etcd node3

To deploy the firewall changes I re-deploy my harden-linux role:

ansible-playbook --tags=role-harden-linux k8s.yml
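
To verify that the new rules are actually in place, the ufw status can be queried on all etcd nodes:

# ufw needs root privileges, hence "--become"
ansible -m command -a "ufw status numbered" k8s_etcd --become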

Now I can deploy the etcd role:

ansible-playbook --tags=role-etcd k8s.yml

This will install the etcd cluster and start the etcd daemons. Have a look at the logs of your etcd hosts to verify that everything worked and the etcd nodes are connected. Use journalctl --no-pager, journalctl -f or journalctl -t etcd to check the systemd journal. E.g.:

ansible -m command -a "systemctl status etcd" k8s_etcd
ansible -m command -a "journalctl --boot -t etcd" k8s_etcd

Afterwards we can use Ansible to check the cluster status e.g.:

ansible -m shell -e "etcd_conf_dir=/etc/etcd" -a 'ETCDCTL_API=3 etcdctl endpoint health \
--endpoints=https://{{ ansible_wg0.ipv4.address }}:2379 \
--cacert={{ etcd_conf_dir }}/ca-etcd.pem \
--cert={{ etcd_conf_dir }}/cert-etcd-server.pem \
--key={{ etcd_conf_dir }}/cert-etcd-server-key.pem' \
k8s_etcd

I use Ansible’s shell module here. I also set a variable etcd_conf_dir which points to the directory where the etcd certificate files are located. That should be the same value as the etcd_conf_dir variable of the etcd role. Since my etcd processes listen on the WireGuard interface, I use ansible_wg0.ipv4.address here as wg0 is the name of my WireGuard interface. If you use a different port than 2379 then of course you need to change that one too. You should now see output similar to this:

etcd-node1 | CHANGED | rc=0 >>
https://10.8.0.101:2379 is healthy: successfully committed proposal: took = 2.807665ms
etcd-node2 | CHANGED | rc=0 >>
https://10.8.0.103:2379 is healthy: successfully committed proposal: took = 2.682864ms
etcd-node3 | CHANGED | rc=0 >>
https://10.8.0.102:2379 is healthy: successfully committed proposal: took = 10.169332ms

Or:

ansible -m shell -e "etcd_conf_dir=/etc/etcd" -a 'ETCDCTL_API=3 etcdctl --write-out=table endpoint status \
--endpoints=https://{{ ansible_wg0.ipv4.address }}:2379 \
--cacert={{ etcd_conf_dir }}/ca-etcd.pem \
--cert={{ etcd_conf_dir }}/cert-etcd-server.pem \
--key={{ etcd_conf_dir }}/cert-etcd-server-key.pem' \
k8s_etcd

which will show you a table like this:

k8s-010101.i.example.com | CHANGED | rc=0 >>
+------------------------+------------------+---------+---------+-----------+------------+-----------+------------+--------------------+--------+
|        ENDPOINT        |        ID        | VERSION | DB SIZE | IS LEADER | IS LEARNER | RAFT TERM | RAFT INDEX | RAFT APPLIED INDEX | ERRORS |
+------------------------+------------------+---------+---------+-----------+------------+-----------+------------+--------------------+--------+
| https://10.0.11.2:2379 | 296f59da7313dfd7 |   3.5.9 |   20 kB |     false |      false |         2 |         11 |                 11 |        |
+------------------------+------------------+---------+---------+-----------+------------+-----------+------------+--------------------+--------+
k8s-010201.i.example.com | CHANGED | rc=0 >>
+------------------------+------------------+---------+---------+-----------+------------+-----------+------------+--------------------+--------+
|        ENDPOINT        |        ID        | VERSION | DB SIZE | IS LEADER | IS LEARNER | RAFT TERM | RAFT INDEX | RAFT APPLIED INDEX | ERRORS |
+------------------------+------------------+---------+---------+-----------+------------+-----------+------------+--------------------+--------+
| https://10.0.11.5:2379 | 4a37ed13f1667e6b |   3.5.9 |   20 kB |      true |      false |         2 |         11 |                 11 |        |
+------------------------+------------------+---------+---------+-----------+------------+-----------+------------+--------------------+--------+
k8s-010301.i.example.com | CHANGED | rc=0 >>
+------------------------+------------------+---------+---------+-----------+------------+-----------+------------+--------------------+--------+
|        ENDPOINT        |        ID        | VERSION | DB SIZE | IS LEADER | IS LEARNER | RAFT TERM | RAFT INDEX | RAFT APPLIED INDEX | ERRORS |
+------------------------+------------------+---------+---------+-----------+------------+-----------+------------+--------------------+--------+
| https://10.0.11.8:2379 | d20e0decc9083d25 |   3.5.9 |   20 kB |     false |      false |         2 |         11 |                 11 |        |
+------------------------+------------------+---------+---------+-----------+------------+-----------+------------+--------------------+--------+

That cluster is in good shape. All nodes have the same Raft index and etcd node #2 is the leader.
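
To additionally verify the cluster membership, etcdctl member list can be used in the same way (again assuming the WireGuard interface wg0 and the default certificate locations):

# Mirrors the health/status checks above, just with "member list"
ansible -m shell -e "etcd_conf_dir=/etc/etcd" -a 'ETCDCTL_API=3 etcdctl --write-out=table member list \
--endpoints=https://{{ ansible_wg0.ipv4.address }}:2379 \
--cacert={{ etcd_conf_dir }}/ca-etcd.pem \
--cert={{ etcd_conf_dir }}/cert-etcd-server.pem \
--key={{ etcd_conf_dir }}/cert-etcd-server-key.pem' \
k8s_etcd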

Next we’ll install the Kubernetes control plane.