Kubernetes the not so hard way with Ansible (at Scaleway) - Part 7 - The worker [updated for Kubernetes v1.10.x]
Installing Flannel, Docker, kubelet, kube-proxy and kube-dns
February 20, 2017
CHANGELOG
2018-09-05
- I’ll no longer update this text as I migrated my hosts to Hetzner Online because of constant network issues with Scaleway. I’ve created a new blog series about how to set up a Kubernetes cluster at Hetzner Online, but since my Ansible playbooks are not provider dependent the blog text should work for Scaleway too if you still want to use it. The new blog post is here.
2018-06-25
- fix iptables bug in k8s_worker_kubeproxy_conf_yaml
2018-06-11
- update to Kubernetes v1.10.4
- update k8s_release to 1.10.4
- introduce k8s_worker_kubelet_conf_yaml variable / removed deprecated settings in k8s_worker_kubelet_settings
- moved settings in k8s_worker_kubelet_settings to k8s_worker_kubelet_conf_yaml: see https://kubernetes.io/docs/tasks/administer-cluster/kubelet-config-file/ and https://github.com/kubernetes/kubernetes/blob/release-1.10/pkg/kubelet/apis/kubeletconfig/v1beta1/types.go
- introduce k8s_worker_kubeproxy_conf_yaml variable / removed deprecated settings in k8s_worker_kubeproxy_settings
- moved settings in k8s_worker_kubeproxy_settings to k8s_worker_kubeproxy_conf_yaml: see https://github.com/kubernetes/kubernetes/blob/master/pkg/proxy/apis/kubeproxyconfig/v1alpha1/types.go
2018-04-02
- update to Kubernetes v1.9.3
- refactoring of Docker role
- introduce flexible parameter settings for dockerd via dockerd_settings/dockerd_settings_user variables
- update to Flannel v0.10.0
- refactoring of Flannel role
- introduce flexible parameter settings for flannel via flannel_settings/flannel_settings_user variables
2018-01-06
- update to Kubernetes v1.9.1
- kubedns service template updated for Kubernetes 1.9
- change defaults for k8s_ca_conf_directory and k8s_config_directory
- introduce flexible parameter settings for kubelet via k8s_worker_kubelet_settings/k8s_worker_kubelet_settings_user variables
- introduce flexible parameter settings for kube-proxy via k8s_worker_kubeproxy_settings/k8s_worker_kubeproxy_settings_user
- add kube-proxy healthz-bind-address setting
2018-01-03
- k8s_cluster_dns and k8s_cluster_domain variables are gone. The value for clusterIP (the Kubernetes cluster IP for kube-dns) is now hardcoded to 10.32.0.254. The same is true for k8s_cluster_domain, which is hardcoded to cluster.local. Review the kubedns/templates/kubedns.yaml.j2 template and adjust it accordingly if you don't use the default values.
2017-11-19
- update to flannel 0.9.1
2017-10-10
- update to flannel 0.9.0
- flanneld config now uses VXLAN backend by default
- add --healthz-ip and --healthz-port options to flanneld systemd service file
- removed alsologtostderr option from systemd service file
- use variable for flannel subnet directory
- update CNI plugin to 0.6.0
- variable local_cert_dir changed to k8s_ca_conf_directory / added k8s_ca_conf_directory
- Docker update to 17.03.2-ce
- added --masquerade-all to kube-proxy settings to avoid DNS problems
- added healthz-bind-address and healthz-port option to kube-apiserver
- added task to install several needed network packages
- added missing default variable k8s_controller_manager_cluster_cidr
- changed variable k8s_download_dir to k8s_worker_download_dir
- a few fixes in the role
- rename local_cert_dir -> k8s_ca_conf_directory
- rename k8s_cni_plugins -> k8s_cni_plugin_version
- removed k8s_kubelet_token as we now use RBAC (RBAC everywhere ;-) )
This post is based on Kelsey Hightower’s Kubernetes The Hard Way - Bootstrapping Kubernetes Workers.
To allow easy communication between the hosts and their services (etcd, API server, kubelet, …) we installed PeerVPN. This gives us a unified and secure network for our Kubernetes hosts (similar to an AWS VPC or a Google Cloud VPC). Now we need the same for the pods we want to run in our cluster (let's call it the pod network). For this we use flannel, a network fabric for containers designed for Kubernetes.
First we need a big IP range for the pod network. The default value in my flannel role ansible-role-flanneld is 10.200.0.0/16 (flannel_ip_range). This range is stored in etcd. Out of that big IP range flannel will use a /24 subnet for every host on which flanneld runs. Further, every pod on a worker node will get an IP address out of the /24 subnet which flannel uses for that specific host. On the flannel site you can find a diagram that shows pretty well how this works.
As already mentioned I created a role for installing flannel: ansible-role-flanneld. Install the role via
ansible-galaxy install githubixx.flanneld
The role has the following default settings:
# The interface on which the K8s services should listen on. As all cluster
# communication should use the PeerVPN interface the interface name is
# normally "tap0" or "peervpn0".
k8s_interface: "tap0"
# The directory to store the K8s certificates and other configuration
k8s_conf_dir: "/var/lib/kubernetes"
# CNI network plugin settings
k8s_cni_conf_dir: "/etc/cni/net.d"
# The directory from where to copy the K8s certificates. By default this
# will expand to user's LOCAL $HOME (the user that runs "ansible-playbook ..."
# plus "/k8s/certs". That means if the user's $HOME directory is e.g.
# "/home/da_user" then "k8s_ca_conf_directory" will have a value of
# "/home/da_user/k8s/certs".
k8s_ca_conf_directory: "{{ '~/k8s/certs' | expanduser }}"
etcd_conf_dir: "/etc/etcd"
etcd_bin_dir: "/usr/local/bin"
etcd_client_port: 2379
etcd_certificates:
- ca-etcd.pem
- ca-etcd-key.pem
- cert-etcd.pem
- cert-etcd-key.pem
flannel_version: "v0.10.0"
flannel_etcd_prefix: "/kubernetes-cluster/network"
flannel_ip_range: "10.200.0.0/16"
flannel_backend_type: "vxlan"
flannel_cni_name: "podnet"
flannel_subnet_file_dir: "/run/flannel"
flannel_options_dir: "/etc/flannel"
flannel_bin_dir: "/usr/local/sbin"
flannel_cni_conf_file: "10-flannel"
flannel_systemd_restartsec: "5"
flannel_systemd_limitnofile: "40000"
flannel_systemd_limitnproc: "1048576"
flannel_settings:
  "etcd-cafile": "{{k8s_conf_dir}}/ca-etcd.pem"
  "etcd-certfile": "{{k8s_conf_dir}}/cert-etcd.pem"
  "etcd-keyfile": "{{k8s_conf_dir}}/cert-etcd-key.pem"
  "etcd-prefix": "{{flannel_etcd_prefix}}"
  "iface": "{{k8s_interface}}"
  "public-ip": "{{hostvars[inventory_hostname]['ansible_' + k8s_interface].ipv4.address}}"
  "subnet-file": "{{flannel_subnet_file_dir}}/subnet.env"
  "ip-masq": "true"
  "healthz-ip": "0.0.0.0"
  "healthz-port": "0" # 0 = disable
The settings for the flanneld daemon defined in flannel_settings can be overridden by defining a variable called flannel_settings_user. You can also add additional settings by using this variable. E.g. to override the healthz-ip default value and add the kubeconfig-file setting, add the following settings to group_vars/k8s.yml or wherever it fits best for you:
flannel_settings_user:
  "healthz-ip": "1.2.3.4"
  "kubeconfig-file": "/etc/k8s/k8s.cfg"
Basically there should be no need to change any of the settings if you used mostly the default settings of my other roles so far. There are maybe two settings you may want to change. etcd-prefix is the path in etcd where flannel will store its config object, so with the default above the full path to the flannel config object in etcd would be /kubernetes-cluster/network/config. Next, flannel_ip_range is the big IP range I mentioned above. Don't make it too small! For every host flannel will choose a /24 subnet out of this range.
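Once the role has been applied (we'll do that in a moment) you can double check what ended up in etcd. Here is a minimal sketch, assuming you run it on one of the etcd nodes, that the etcd endpoint https://10.3.0.101:2379 and the certificate paths below match your setup (flannel v0.10.x still talks to the etcd v2 API):
ETCDCTL_API=2 etcdctl \
  --endpoints=https://10.3.0.101:2379 \
  --ca-file=/etc/etcd/ca-etcd.pem \
  --cert-file=/etc/etcd/cert-etcd.pem \
  --key-file=/etc/etcd/cert-etcd-key.pem \
  get /kubernetes-cluster/network/config
# Should print something like: {"Network": "10.200.0.0/16", "Backend": {"Type": "vxlan"}}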
Next we extend our k8s.yml
playbook file and add the role e.g.:
-
  hosts: k8s:children
  roles:
    -
      role: githubixx.flanneld
      tags: role-kubernetes-flanneld
As you can see flanneld will be installed on all nodes (the group k8s:children includes controller, worker and etcd hosts in my case). I decided to do so because I'll have Docker running on every host, so it makes sense to have one unified network setup for all Docker daemons. Be aware that flanneld needs to run BEFORE Docker! Now you can apply the role to all specified hosts:
ansible-playbook --tags=role-kubernetes-flanneld k8s.yml
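To verify that flanneld came up on a node you can check the systemd service and the subnet file it writes. A quick sanity check, assuming the unit is called flanneld and the role defaults from above are used (your per-host /24 will of course differ):
sudo systemctl status flanneld      # should be active (running)
cat /run/flannel/subnet.env         # the /24 lease flanneld picked for this host,
                                    # e.g. FLANNEL_NETWORK=10.200.0.0/16 and FLANNEL_SUBNET=10.200.5.1/24
ip -d link show flannel.1           # the VXLAN interface created by the vxlan backend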
Now we need to install Docker on all of our nodes (you don't need Docker on the etcd hosts if you used separate nodes for etcd and the controllers). You can use whatever Ansible Docker playbook you want to install Docker (you should find quite a few out there ;-) ). I created my own because I wanted to use the official Docker binaries archive, the overlay filesystem storage driver and a custom systemd unit file. Be aware that you need to set a few options to make Docker work with the flannel overlay network. Also, we use Docker 17.03.2-ce; at the time of writing this is the latest Docker version supported by Kubernetes v1.10.x. If you want to use my Docker playbook you can install it via
ansible-galaxy install githubixx.docker
The playbook has the following default variables:
docker_download_dir: "/opt/tmp"
docker_version: "17.03.2-ce"
docker_user: "docker"
docker_group: "docker"
docker_uid: 666
docker_gid: 666
docker_bin_dir: "/usr/local/bin"
dockerd_settings:
  "host": "unix:///var/run/docker.sock"
  "log-level": "error"
  "storage-driver": "overlay"
  "iptables": "false"
  "ip-masq": "false"
  "bip": ""
  "mtu": "1472"
The settings for the dockerd daemon defined in dockerd_settings can be overridden by defining a variable called dockerd_settings_user. You can also add additional settings by using this variable. E.g. to override the mtu default value and add debug, add the following settings to group_vars/k8s.yml or wherever it fits best for you:
dockerd_settings_user:
  "mtu": "1450"
  "debug": ""
There should be no need to change any of these default values besides maybe storage-driver. If you don't use my Docker role pay attention to set at least the last four default settings mentioned above correctly. As usual place the variables in group_vars/k8s.yml if you want to change them. Add the role to our playbook file k8s.yml e.g.:
-
  hosts: k8s:children
  roles:
    -
      role: githubixx.docker
      tags: role-docker
A word about storage-driver: Since we use Ubuntu 16.04 at Scaleway we should already have a fairly recent kernel running (at the time of writing this blog post it was kernel 4.15.x on my VPS instance). It makes sense to use a recent kernel for Docker in general; I recommend a kernel >= 4.14.x if possible. Verify that the overlayfs filesystem is available on your worker instances (execute cat /proc/filesystems | grep overlay; if you see output you should be fine). If the module isn't compiled into the kernel you can normally load it via modprobe -v overlay (-v gives us a little bit more information). We'll configure Docker to use overlayfs by default because it's one of the best choices (Docker 1.13.x started to use overlayfs by default if available). But you can change the storage driver via the storage-driver setting if you like. Again: use a kernel >= 4.14.x if possible!
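To check these prerequisites on a worker node in one go (just a convenience wrapper around the commands mentioned above):
uname -r                        # ideally a kernel >= 4.14.x
grep overlay /proc/filesystems  # "nodev overlay" means overlayfs support is available
sudo modprobe -v overlay        # only needed if the module isn't compiled into the kernel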
Now you can roll out the Docker role on all nodes using
ansible-playbook --tags=role-docker k8s.yml
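Afterwards a quick sanity check on one of the nodes should show the expected Docker version and storage driver (nothing more than a smoke test):
sudo systemctl status docker
sudo docker info 2>/dev/null | grep -E 'Server Version|Storage Driver'
# Expected: Server Version: 17.03.2-ce / Storage Driver: overlay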
In part 6 we installed the Kubernetes API server, Scheduler and Controller Manager on the controller nodes. For the worker nodes I've also prepared an Ansible role which installs the Kubernetes worker components. The Kubernetes part of a worker node needs a kubelet and a kube-proxy daemon. The workers do the "real" work: they run the pods and the Docker containers. So in production, and if you do real work, it won't hurt to choose bigger iron for the worker hosts ;-) The kubelet is responsible for creating a pod/container on a worker node if the scheduler has chosen that node to run a pod on. The kube-proxy takes care of routing: e.g. if a pod or a service is added, kube-proxy updates the iptables rules accordingly.
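Later, once the workers are up and the first services exist, you can actually see the rules kube-proxy maintains. A quick look, run on any worker node (the KUBE-SERVICES chain is what kube-proxy manages in iptables mode):
sudo iptables -t nat -L KUBE-SERVICES -n | head
# Each Kubernetes Service shows up as a rule pointing to a KUBE-SVC-... chain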
The workers depend on the infrastructure we installed in part 6. The role uses the following variables:
# The directory to store the K8s certificates and other configuration
k8s_conf_dir: "/var/lib/kubernetes"
# The directory to store the K8s binaries
k8s_bin_dir: "/usr/local/bin"
# K8s release
k8s_release: "1.10.4"
# The interface on which the K8s services should listen on. As all cluster
# communication should use the PeerVPN interface the interface name is
# normally "tap0" or "peervpn0".
k8s_interface: "tap0"
# The directory from where to copy the K8s certificates. By default this
# will expand to user's LOCAL $HOME (the user that runs "ansible-playbook ..."
# plus "/k8s/certs". That means if the user's $HOME directory is e.g.
# "/home/da_user" then "k8s_ca_conf_directory" will have a value of
# "/home/da_user/k8s/certs".
k8s_ca_conf_directory: "{{ '~/k8s/certs' | expanduser }}"
# Directory where kubeconfig for Kubernetes worker nodes and kube-proxy
# is stored among other configuration files. Same variable expansion
# rule applies as with "k8s_ca_conf_directory"
k8s_config_directory: "{{ '~/k8s/configs' | expanduser }}"
# K8s worker binaries to download
k8s_worker_binaries:
- kube-proxy
- kubelet
- kubectl
# Certificate/CA files for API server and kube-proxy
k8s_worker_certificates:
- ca-k8s-apiserver.pem
- ca-k8s-apiserver-key.pem
- cert-k8s-apiserver.pem
- cert-k8s-apiserver-key.pem
- cert-kube-proxy.pem
- cert-kube-proxy-key.pem
# Download directory for archive files
k8s_worker_download_dir: "/opt/tmp"
# Directory to store kubelet configuration
k8s_worker_kubelet_conf_dir: "/var/lib/kubelet"
# kubelet settings
k8s_worker_kubelet_settings:
  "config": "{{k8s_worker_kubelet_conf_dir}}/kubelet-config.yaml"
  "node-ip": "{{hostvars[inventory_hostname]['ansible_' + k8s_interface].ipv4.address}}"
  "container-runtime": "docker"
  "image-pull-progress-deadline": "2m"
  "kubeconfig": "{{k8s_worker_kubelet_conf_dir}}/kubeconfig"
  "network-plugin": "cni"
  "cni-conf-dir": "{{k8s_cni_conf_dir}}"
  "cni-bin-dir": "{{k8s_cni_bin_dir}}"
  "cloud-provider": ""
  "register-node": "true"
# kubelet config file (KubeletConfiguration)
k8s_worker_kubelet_conf_yaml: |
  kind: KubeletConfiguration
  apiVersion: kubelet.config.k8s.io/v1beta1
  address: {{hostvars[inventory_hostname]['ansible_' + k8s_interface].ipv4.address}}
  authentication:
    anonymous:
      enabled: false
    webhook:
      enabled: true
    x509:
      clientCAFile: "{{k8s_conf_dir}}/ca-k8s-apiserver.pem"
  authorization:
    mode: Webhook
  clusterDomain: "cluster.local"
  clusterDNS:
    - "10.32.0.254"
  failSwapOn: true
  healthzBindAddress: "{{hostvars[inventory_hostname]['ansible_' + k8s_interface].ipv4.address}}"
  healthzPort: "10248"
  runtimeRequestTimeout: "15m"
  serializeImagePulls: false
  tlsCertFile: "{{k8s_conf_dir}}/cert-k8s-apiserver.pem"
  tlsPrivateKeyFile: "{{k8s_conf_dir}}/cert-k8s-apiserver-key.pem"
# Directory to store kube-proxy configuration
k8s_worker_kubeproxy_conf_dir: "/var/lib/kube-proxy"
# kube-proxy settings
k8s_worker_kubeproxy_settings:
  "config": "{{k8s_worker_kubeproxy_conf_dir}}/kubeproxy-config.yaml"
k8s_worker_kubeproxy_conf_yaml: |
  kind: KubeProxyConfiguration
  apiVersion: kubeproxy.config.k8s.io/v1alpha1
  bindAddress: {{hostvars[inventory_hostname]['ansible_' + k8s_interface].ipv4.address}}
  clientConnection:
    kubeconfig: "{{k8s_worker_kubeproxy_conf_dir}}/kubeconfig"
  healthzBindAddress: {{hostvars[inventory_hostname]['ansible_' + k8s_interface].ipv4.address}}:10256
  mode: "iptables"
  iptables:
    masqueradeAll: true
  clusterCIDR: "10.200.0.0/16"
# CNI network plugin settings
k8s_cni_dir: "/opt/cni"
k8s_cni_bin_dir: "{{k8s_cni_dir}}/bin"
k8s_cni_conf_dir: "/etc/cni/net.d"
k8s_cni_plugin_version: "0.6.0"
The role will search for the certificates we created in part 4 in the directory you specify in k8s_ca_conf_directory on the host you run Ansible on. The files used here are listed in k8s_worker_certificates. The Kubernetes worker binaries we need are listed in k8s_worker_binaries. The kubelet can use CNI (the Container Network Interface) to manage machine level networking requirements. The version of the CNI plugin archive we want to download is specified in k8s_cni_plugin_version and the plugins will be placed in k8s_cni_dir.
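After the role has run (see below) you can check on a worker node that the CNI plugins and the flannel CNI configuration are in place. The paths are the role defaults from above; the exact name of the generated config file may differ:
ls /opt/cni/bin/      # bridge, flannel, host-local, loopback, ... from the CNI plugin archive
ls /etc/cni/net.d/    # the flannel CNI config generated by the role, e.g. 10-flannel.conf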
If you created a different PeerVPN interface (e.g. peervpn0) change k8s_interface.
Now add an entry for your worker hosts into Ansible’s hosts
file e.g.:
[k8s_worker]
worker[1:3].your.tld
Install the role via
ansible-galaxy install githubixx.kubernetes-worker
Next add the role to k8s.yml
file e.g.:
-
  hosts: k8s_worker
  roles:
    -
      role: githubixx.kubernetes-worker
      tags: role-kubernetes-worker
Run the playbook via
ansible-playbook --tags=role-kubernetes-worker k8s.yml
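A quick check that the worker services came up (run on one of the worker nodes; the unit names assume my role's defaults):
sudo systemctl status kubelet kube-proxy
sudo journalctl -u kubelet --no-pager | tail -n 20   # look out for errors while the node registers itself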
Now that we’ve installed basically everything needed for running pods, deployments, services, … we should be able to do a sample deployment. On your laptop run:
kubectl run my-nginx --image=nginx --replicas=4 --port=80
This will deploy 4 pods running nginx. To get an overview of what's running, e.g. pods, services, deployments, …, run:
kubectl get all -o wide
NAME DESIRED CURRENT UP-TO-DATE AVAILABLE AGE CONTAINERS IMAGES SELECTOR
deploy/my-nginx 4 4 4 4 1m my-nginx nginx run=my-nginx
NAME DESIRED CURRENT READY AGE CONTAINERS IMAGES SELECTOR
rs/my-nginx-5d69b5ff7 4 4 4 1m my-nginx nginx pod-template-hash=182561993,run=my-nginx
NAME READY STATUS RESTARTS AGE IP NODE
po/my-nginx-5d69b5ff7-66jgk 1/1 Running 0 1m 10.200.25.2 k8s-worker2
po/my-nginx-5d69b5ff7-kphsd 1/1 Running 0 1m 10.200.5.2 k8s-worker1
po/my-nginx-5d69b5ff7-mwcb6 1/1 Running 0 1m 10.200.5.3 k8s-worker1
po/my-nginx-5d69b5ff7-w888j 1/1 Running 0 1m 10.200.25.3 k8s-worker2
You should also be able to run curl on every worker and controller node to get the default page from one of the four nginx webservers. In the case above curl http://10.200.25.2 should work on all nodes in our cluster (flanneld magic ;-) ).
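For example (the pod IP is taken from the output above; substitute one of your own pod IPs):
curl http://10.200.25.2
# Should return the "Welcome to nginx!" default page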
You can output the worker internal IPs and the pod CIDRs that were assigned to those hosts with:
kubectl get nodes --output=jsonpath='{range .items[*]}{.status.addresses[?(@.type=="InternalIP")].address} {.spec.podCIDR} {"\n"}{end}'
10.3.0.211 10.200.0.0/24
10.3.0.212 10.200.1.0/24
The IP addresses 10.3.0.211/212 are the addresses I assigned to the PeerVPN interfaces of worker1/2. That's important since all communication should travel through the PeerVPN interfaces.
If you just want to see if the worker nodes are ready use:
kubectl get nodes -o wide
NAME STATUS ROLES AGE VERSION EXTERNAL-IP OS-IMAGE KERNEL-VERSION CONTAINER-RUNTIME
k8s-worker1 Ready <none> 44d v1.10.4 <none> Ubuntu 16.04.3 LTS 4.14.11-mainline-rev1 docker://17.3.2
k8s-worker2 Ready <none> 44d v1.10.4 <none> Ubuntu 16.04.3 LTS 4.14.11-mainline-rev1 docker://17.3.2
Now finally we install KubeDNS to enable services in our K8s cluster to do DNS lookups of internal services (service discovery) in a predictable way. If you have already cloned the ansible-kubernetes-playbooks repository you'll find a kubedns directory in there with a playbook file called kubedns.yml. So change into the kubedns directory and adjust templates/kubedns.yaml.j2. There are basically only two settings you need to adjust if you haven't used my default settings: search for clusterIP and for cluster.local and change them accordingly. You should find the following entries:
clusterIP: 10.32.0.254
--domain=cluster.local
--probe=kubedns,127.0.0.1:10053,kubernetes.default.svc.cluster.local,5,A
--probe=dnsmasq,127.0.0.1:53,kubernetes.default.svc.cluster.local,5,A
Now run
ansible-playbook kubedns.yml
to roll out the KubeDNS deployment. If you run
kubectl get pods -l k8s-app=kube-dns -n kube-system -o wide
you should see something like this:
NAME READY STATUS RESTARTS AGE IP NODE
kube-dns-6c857864fb-bp2kx 3/3 Running 0 32m 10.200.7.9 k8s-worker2
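To verify that DNS resolution actually works inside the cluster you can start a throwaway pod and resolve the kubernetes service. A quick sketch (busybox 1.28 is used here because its nslookup is known to behave; the addresses in the output assume the defaults from the earlier parts):
kubectl run busybox --image=busybox:1.28 --restart=Never --rm -it -- nslookup kubernetes.default
# Server:    10.32.0.254
# Name:      kubernetes.default
# Address 1: 10.32.0.1 kubernetes.default.svc.cluster.local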
There are a lot more things that could/should be done now, but running Heptio Sonobuoy could be a good next step. Heptio Sonobuoy is a diagnostic tool that makes it easier to understand the state of a Kubernetes cluster by running a set of Kubernetes conformance tests in an accessible and non-destructive manner.
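If you want to give it a try, the rough workflow with the sonobuoy CLI looks something like the sketch below (install the CLI from the Sonobuoy releases first and check the project documentation for the exact commands of the version you use):
sonobuoy run          # starts the conformance tests in the cluster (this takes a while)
sonobuoy status       # check whether the run has finished
sonobuoy retrieve .   # download the results tarball
sonobuoy delete       # clean up everything Sonobuoy created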
Next up: Kubernetes the Not So Hard Way With Ansible (at Scaleway) - Part 8 - Ingress with Traefik