Kubernetes the not so hard way with Ansible - The basics

Ansible introduction and setting up hosts for Kubernetes

September 1, 2018

I created a series of posts about running Kubernetes (K8s for short) managed by Ansible. My hosts run at Hetzner Cloud, but you should be able to use the playbooks with minor or no modifications at other hosters, e.g. Scaleway or DigitalOcean. I’ll only test this with Ubuntu 18.04 LTS, but with minimal modifications it should work with all systemd-based Linux operating systems, including Ubuntu 16.04 LTS.

I used Kelsey Hightower’s wonderful guide Kubernetes the hard way as a starting point. My goal is to install a Kubernetes cluster with Ansible that is maintainable and could be used in production. It’s not H/A at the moment because the Kubernetes components currently communicate with only one kube-apiserver (we’ll have three running, but there is just no load balancing in place yet). So at the moment there are still a few TODOs besides making requests to kube-apiserver H/A. One idea to make the requests from a K8s worker node to kube-apiserver H/A is to install the nginx webserver as a load balancer on each worker node and balance the requests between the three kube-apiservers. In that case kube-proxy and kubelet on the worker nodes can talk to the local nginx instance, and if one kube-apiserver fails nginx would automatically route the requests to the next healthy kube-apiserver. nginx in turn could run as a DaemonSet, which takes care of restarting nginx if needed. But that’s just an idea ATM ;-)
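To make the nginx idea a bit more concrete, here is a rough, untested sketch of what such a DaemonSet could look like. Everything specific in it is an assumption made up for illustration: the object names, the kube-system namespace, the three 10.8.0.x kube-apiserver addresses, port 6443 on the API servers and the local port 16443. It relies on nginx’s stream module (plain TCP proxying), so TLS is passed through to the kube-apiserver untouched.

```yaml
---
# Hypothetical ConfigMap holding the nginx configuration.
apiVersion: v1
kind: ConfigMap
metadata:
  name: apiserver-lb-config        # made-up name
  namespace: kube-system
data:
  nginx.conf: |
    events {}
    stream {
      upstream kube_apiserver {
        # Assumed addresses of the three kube-apiserver instances.
        server 10.8.0.101:6443;
        server 10.8.0.102:6443;
        server 10.8.0.103:6443;
      }
      server {
        # Only reachable from the worker node itself.
        listen 127.0.0.1:16443;
        proxy_pass kube_apiserver;
      }
    }
---
# Hypothetical DaemonSet: one nginx pod per node, using the host network
# so that kubelet and kube-proxy can reach it via 127.0.0.1.
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: apiserver-lb               # made-up name
  namespace: kube-system
spec:
  selector:
    matchLabels:
      app: apiserver-lb
  template:
    metadata:
      labels:
        app: apiserver-lb
    spec:
      hostNetwork: true
      containers:
        - name: nginx
          image: nginx:stable
          volumeMounts:
            - name: config
              mountPath: /etc/nginx/nginx.conf
              subPath: nginx.conf
      volumes:
        - name: config
          configMap:
            name: apiserver-lb-config
```

In this sketch kubelet and kube-proxy would be pointed at https://127.0.0.1:16443 instead of a single kube-apiserver. One open question is bootstrapping: kubelet needs to reach a kube-apiserver before it can start the DaemonSet pod in the first place, so a real setup would have to solve that as well.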

If you need something fast, maybe have a look at projects like Minikube.

IMHO none of these projects solves all the problems you’ll have setting up a K8s cluster, especially when it comes to secure communication between the K8s nodes, H/A or updating the cluster. But that may change over time (or has already changed) of course ;-)

To enable the Kubernetes services to communicate securely between the hosts I’ll use WireGuard. It’s not part of the Linux kernel at the moment but this will hopefully change in the near future (maybe with kernel 4.20). Kelsey Hightower uses Google Cloud or AWS, which support cool networking options, but we don’t have these features. So WireGuard will help us compensate for this a little bit: it lets us create an encrypted private network (at layer 3) between the hosts, and it’s easy to install (I’ve created an Ansible role for this). But more about that in a later blog post.

Start your engines

If you want to do something real with your Kubernetes cluster you’ll need at least 4 or 5 instances: three for the Kubernetes controller nodes and etcd (three nodes for high availability) and one or two worker nodes (the nodes that run the Docker containers and do the actual work). For smaller workloads, Hetzner Online CX11 instances (1 x 64-bit core, 2 GB RAM, 20 GB SSD each) are sufficient for the controller nodes. I try to keep costs low. If you run production load you should distribute the services across more hosts and use bigger hosts for the workers (maybe something like CX31 or bigger).

As a side note: We’ll install the etcd cluster on the controller nodes (i.e. where the API server, K8s scheduler and K8s controller manager run) to save costs. But for production it’s recommended to install etcd on its own hosts. So you may want to set up three additional hosts just for etcd.

Use the Hetzner Cloud Console UI to set up the hosts, or a tool like HashiCorp’s Terraform. Terraform is a tool for building, changing, and versioning infrastructure safely and efficiently. There is a Hetzner Cloud Provider available for Terraform. I won’t go into detail on how to set up hosts here as it depends on your provider.

Prepare Ansible

If you’ve never heard of Ansible: Ansible is a powerful IT automation engine. Instead of managing and handling your instances or deployments by hand, Ansible will do this for you. This is less error-prone and everything is repeatable. To do something like installing a package you create an Ansible task. These tasks are organized in playbooks. The playbooks can be modified via variables for hosts, host groups, and so on. A very useful feature of Ansible is roles. Say you want to install the Apache webserver on ten hosts. In that case you just add an Apache role to those ten hosts, maybe modify some host group variables, and roll out the Apache webserver on all the hosts you specified. Very easy! For more information read Getting started with Ansible. But I’ll add some comments in my blog posts about what’s going on in the roles/playbooks we use.
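To give you a feeling for what a task inside a playbook looks like, here is a minimal sketch for the Apache example. It is not part of this series: the file name apache.yml and the host group webservers are assumptions, and it uses Ansible’s apt module, so it only fits Debian/Ubuntu hosts.

```yaml
---
# apache.yml - hypothetical playbook, only for illustration.
# "webservers" is an assumed host group defined in your hosts file.
- hosts: webservers
  become: true                # run the tasks with root privileges
  tasks:
    - name: Install Apache webserver
      apt:
        name: apache2         # package name on Ubuntu/Debian
        state: present        # install only if it is not there yet
        update_cache: true    # refresh the package index first
```

You would run it with something like ansible-playbook -i hosts apache.yml. Running it a second time changes nothing because the package is already present, and that is exactly the repeatability mentioned above. A role bundles tasks like this one (plus templates, handlers and default variables) so they can be reused across playbooks.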

For beginners: also have a look at ANSIBLE BEST PRACTICES: THE ESSENTIALS

I was also thinking about using ImmutableServer and Immutable infrastructure but decided to go with Ansible for now. These concepts have some real advantages and we’re also using them very successfully at my company together with the Google Cloud. Treating virtual machines like Docker containers and throwing them away at any time is quite cool :-) .

Setup Ansible

If you haven’t already set up an Ansible directory which holds our hosts file, the roles and so on, then do so now. The default directory for Ansible roles is /etc/ansible/roles. To add an additional roles directory, adjust the Ansible configuration /etc/ansible/ansible.cfg and add your roles path to the roles_path setting in the [defaults] section (multiple paths are separated by :). I’m a fan of having everything in one place so I put everything for Ansible in /opt/ansible. So the roles path for me is roles_path = /opt/ansible/roles:/etc/ansible/roles. A tool for installing Ansible roles is ansible-galaxy, which is included if you install Ansible. Also have a look at https://galaxy.ansible.com/ for more information (you can also browse the available roles there).
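If you prefer to install several roles in one go, ansible-galaxy can read them from a requirements file. The following is only a sketch: the file name requirements.yml is the usual convention but still an assumption here, and the two role names are simply copied from the directory listing further down, so check on Ansible Galaxy how the roles are actually published.

```yaml
---
# requirements.yml - hypothetical example.
# Role names are taken from the directory listing in this post and
# may be published under slightly different names on Ansible Galaxy.
- src: githubixx.harden-linux
- src: githubixx.wireguard
```

With that file in place, ansible-galaxy install -r requirements.yml -p /opt/ansible/roles would download the listed roles into the roles directory used in this post.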

If you follow my directory structure, your Ansible directory will look like this when we’re done with the blog series (also see Ansible directory layout best practice):

├── group_vars
│   └── all.yml
├── hosts
├── host_vars
│   ├── controller01.i.domain.tld
│   ├── controller02.i.domain.tld
│   ├── controller03.i.domain.tld
│   ├── worker01.i.domain.tld
│   ├── worker02.i.domain.tld
│   └── workstation
├── k8s.yml
├── playbooks
│   └── kubernetes-misc
│       ├── kubeauthconfig
│       ├── kubectlconfig
│       ├── kubedns
│       ├── kubeencryptionconfig
│       ├── kube-router
│       ├── traefik
│       ├── LICENSE
│       └── README.md
└── roles
    ├── brianshumate.consul
    ├── githubixx.cfssl
    ├── githubixx.docker
    ├── githubixx.etcd
    ├── githubixx.harden-linux
    ├── githubixx.kubectl
    ├── githubixx.kubernetes-ca
    ├── githubixx.kubernetes-controller
    ├── githubixx.kubernetes-flanneld
    ├── githubixx.kubernetes-worker
    └── githubixx.wireguard

Don’t worry if the directories don’t contain all the files yet, we’ll get there. Just make sure that at least the top level directories group_vars, host_vars, playbooks and roles exist. As you can see from the output, group_vars, host_vars, playbooks and roles are directories, while hosts and k8s.yml are files. I’ll explain what these directories and files are good for as you walk through the blog posts, and I’ll also tell you a little bit more about Ansible.
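To illustrate how these pieces fit together, here is a rough sketch of what a top-level playbook like k8s.yml could contain. It is an illustration only, not the final file from this series: the group names k8s_controller and k8s_worker are assumptions and would have to be defined in the hosts file, while the role names come from the listing above.

```yaml
---
# k8s.yml - illustrative sketch only; the group names are assumptions.
# Each play maps one host group from the hosts file to one or more roles.
- hosts: k8s_controller
  roles:
    - role: githubixx.kubernetes-controller

- hosts: k8s_worker
  roles:
    - role: githubixx.kubernetes-worker
```

Ansible looks up the hosts of each group in the hosts file, loads matching variables from group_vars and host_vars, and then applies the listed roles from the roles directory.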

Hint: Quite a few variables are needed by more than one role or playbook. Put these kinds of variables into group_vars/all.yml. Variables needed by the playbooks (not the roles) fit especially well there. But it’s up to you where you place the variables as long as the roles/playbooks find them when they’re needed ;-) Throughout the tutorial I’ll put all common variables into group_vars/all.yml as it makes things more straightforward. You may organize variables differently.
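As a small sketch of what that could look like: the variable names and values below are made up for illustration and are not necessarily the ones used later in the series.

```yaml
---
# group_vars/all.yml - variables defined here are visible to every host,
# and therefore to every role and playbook.
# Both names and values are hypothetical examples.
k8s_interface: "wg0"
k8s_config_directory: "/opt/ansible/k8s-config"
```

A role or playbook can then reference such a value as {{ k8s_interface }}. If a single host needs a different value, a variable with the same name in that host’s host_vars file takes precedence over group_vars/all.yml.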

That’s it for the basics. Continue with hardening the instances.