Kubernetes the not so hard way with Ansible (at Scaleway) - Part 6 - Control plane [updated for Kubernetes v1.10.x]

Install and configure the heart of Kubernetes: kube-apiserver, kube-controller-manager and kube-scheduler

February 15, 2017

CHANGELOG

2018-09-05

  • I’ll no longer update this text as I migrated my hosts to Hetzner Online because of constant network issues with Scaleway. I’ve created a new blog series about how to set up a Kubernetes cluster at Hetzner Online, but since my Ansible playbooks are not provider dependent the blog text should work for Scaleway too if you still want to use it. The new blog post is here.

2018-06-04

  • update to Kubernetes v1.10.4
  • removed deprecated kube-apiserver parameter insecure-bind-address (see: #59018)
  • added variable k8s_apiserver_secure_port: "6443"
  • added parameter secure-port to k8s_apiserver_settings parameter list
  • added kube-controller-manager-ca certificate files to k8s_certificates list
  • added variable k8s_controller_manager_conf_dir / added kubeconfig for kube-controller-manager
  • added variable k8s_scheduler_conf_dir / added kubeconfig for kube-scheduler / settings for kube-scheduler now in templates/var/lib/kube-scheduler/kube-scheduler.yaml.j2
  • added kubeconfig for admin user (located by default in k8s_conf_dir)
  • replaced kube-apiserver parameter admission-control with enable-admission-plugins
  • new service-account-key-file value for kube-apiserver
  • changes in k8s_controller_manager_settings: removed master parameter, added kubeconfig, new value for service-account-private-key-file, new parameter use-service-account-credentials

2018-01-06

  • update to Kubernetes v1.9.1
  • introduce flexible parameter settings for API server via k8s_apiserver_settings/k8s_apiserver_settings_user
  • introduce flexible parameter settings for controller manager via k8s_controller_manager_settings/k8s_controller_manager_settings_user
  • introduce flexible parameter settings for kube-scheduler via k8s_scheduler_settings/k8s_scheduler_settings_user
  • change defaults for k8s_ca_conf_directory and k8s_config_directory variables
  • move advertise-address,bind-address,insecure-bind-address settings out of kube-apiserver.service.j2 template
  • move address,master settings out of kube-controller-manager.service.j2 template
  • move address,master settings out of kube-scheduler.service.j2 template
  • update info regarding kubectl get componentstatuses and how to mitigate the problem

2017-11-22

2017-10-09

  • k8s_binaries renamed to k8s_controller_binaries
  • make systemd service template files for API server, controller manager and scheduler more flexible by allowing more parameters to be changed via variables
  • k8s_auth_tokens no longer used as we now use Role-Based Access Control (“RBAC”) everywhere
  • RBAC for kubelet authorization

This post is based on Kelsey Hightower’s Bootstrapping the Kubernetes Control Plane.

This time we install a 3-node Kubernetes controller cluster (that’s the Kubernetes API server, scheduler and controller manager). All components will run on every node. In part 4 we set up our PKI (public key infrastructure) to secure the communication between our Kubernetes components/infrastructure. As with the etcd cluster we use the certificate authority and generated certificates, but for the Kubernetes API server we generated a separate CA and certificate. If you used the default values in the other playbooks so far you most likely don’t need to change any of the default variable settings, which are:

# The directory to store the K8s certificates and other configuration
k8s_conf_dir: "/var/lib/kubernetes"
# The directory to store the K8s binaries
k8s_bin_dir: "/usr/local/bin"
# K8s release
k8s_release: "1.10.4"
# The interface the K8s services should listen on. As all cluster
# communication should use the PeerVPN interface, the interface name is
# normally "tap0" or "peervpn0", but basically you can use any interface
# you want. Keep in mind though that if you use something like "eth0" the
# services could be exposed to the outside world if not secured properly.
k8s_interface: "tap0"

# The directory from where to copy the K8s certificates. By default this
# will expand to the user's LOCAL $HOME (the user that runs "ansible-playbook ...")
# plus "/k8s/certs". That means if the user's $HOME directory is e.g.
# "/home/da_user" then "k8s_ca_conf_directory" will have a value of
# "/home/da_user/k8s/certs".
k8s_ca_conf_directory: "{{ '~/k8s/certs' | expanduser }}"
# Directory where kubeconfig for Kubernetes worker nodes and kube-proxy
# is stored among other configuration files. Same variable expansion
# rule applies as with "k8s_ca_conf_directory"
k8s_config_directory: "{{ '~/k8s/configs' | expanduser }}"

# K8s control plane binaries to download
k8s_controller_binaries:
  - kube-apiserver
  - kube-controller-manager
  - kube-scheduler
  - kubectl

# K8s kube-(apiserver|controller-manager|scheduler) certificates
k8s_certificates:
  - ca-k8s-apiserver.pem
  - ca-k8s-apiserver-key.pem
  - cert-k8s-apiserver.pem
  - cert-k8s-apiserver-key.pem
  - cert-k8s-controller-manager-sa.pem
  - cert-k8s-controller-manager-sa-key.pem

k8s_apiserver_secure_port: "6443"

# kube-apiserver settings (can be overridden or extended by defining
# "k8s_apiserver_settings_user")
k8s_apiserver_settings:
  "advertise-address": "{{hostvars[inventory_hostname]['ansible_' + k8s_interface].ipv4.address}}"
  "bind-address": "{{hostvars[inventory_hostname]['ansible_' + k8s_interface].ipv4.address}}"
  "secure-port": "{{k8s_apiserver_secure_port}}"
  "enable-admission-plugins": "Initializers,NamespaceLifecycle,NodeRestriction,LimitRanger,ServiceAccount,DefaultStorageClass,ResourceQuota"
  "allow-privileged": "true"
  "apiserver-count": "3"
  "authorization-mode": "Node,RBAC"
  "audit-log-maxage": "30"
  "audit-log-maxbackup": "3"
  "audit-log-maxsize": "100"
  "audit-log-path": "/var/log/audit.log"
  "enable-swagger-ui": "true"
  "event-ttl": "1h"
  "kubelet-https": "true"
  "kubelet-preferred-address-types": "InternalIP,Hostname,ExternalIP"
  "runtime-config": "api/all"
  "service-cluster-ip-range": "10.32.0.0/16"
  "service-node-port-range": "30000-32767"
  "client-ca-file": "{{k8s_conf_dir}}/ca-k8s-apiserver.pem"
  "etcd-cafile": "{{k8s_conf_dir}}/ca-etcd.pem"
  "etcd-certfile": "{{k8s_conf_dir}}/cert-etcd.pem"
  "etcd-keyfile": "{{k8s_conf_dir}}/cert-etcd-key.pem"
  "experimental-encryption-provider-config": "{{k8s_conf_dir}}/encryption-config.yaml"
  "kubelet-certificate-authority": "{{k8s_conf_dir}}/ca-k8s-apiserver.pem"
  "kubelet-client-certificate": "{{k8s_conf_dir}}/cert-k8s-apiserver.pem"
  "kubelet-client-key": "{{k8s_conf_dir}}/cert-k8s-apiserver-key.pem"
  "service-account-key-file": "{{k8s_conf_dir}}/cert-k8s-controller-manager-sa.pem"
  "tls-ca-file": "{{k8s_conf_dir}}/ca-k8s-apiserver.pem"
  "tls-cert-file": "{{k8s_conf_dir}}/cert-k8s-apiserver.pem"
  "tls-private-key-file": "{{k8s_conf_dir}}/cert-k8s-apiserver-key.pem"

# The directory to store controller manager configuration.
k8s_controller_manager_conf_dir: "/var/lib/kube-controller-manager"

# kube-controller-manager settings (can be overridden or extended by defining
# "k8s_controller_manager_settings_user")
k8s_controller_manager_settings:
  "address": "{{hostvars[inventory_hostname]['ansible_' + k8s_interface].ipv4.address}}"
  "cluster-cidr": "10.200.0.0/16"
  "cluster-name": "kubernetes"
  "kubeconfig": "{{k8s_controller_manager_conf_dir}}/kube-controller-manager.kubeconfig"
  "leader-elect": "true"
  "service-cluster-ip-range": "10.32.0.0/16"
  "cluster-signing-cert-file": "{{k8s_conf_dir}}/ca-k8s-apiserver.pem"
  "cluster-signing-key-file": "{{k8s_conf_dir}}/cert-k8s-apiserver-key.pem"
  "root-ca-file": "{{k8s_conf_dir}}/ca-k8s-apiserver.pem"
  "service-account-private-key-file": "{{k8s_conf_dir}}/cert-k8s-controller-manager-sa-key.pem"
  "use-service-account-credentials": "true"

# The directory to store scheduler configuration.
k8s_scheduler_conf_dir: "/var/lib/kube-scheduler"
# kube-scheduler settings (only --config is left, the remaining parameters
# are deprecated, see https://github.com/kubernetes/kubernetes/pull/62515)
k8s_scheduler_settings:
  "config": "{{k8s_scheduler_conf_dir}}/kube-scheduler.yaml"

# The port the control plane components use to connect to the etcd cluster
etcd_client_port: "2379"
# The interface the etcd cluster is listening on
etcd_interface: "tap0"

# The etcd certificates needed for the control plane components to be able
# to connect to the etcd cluster.
etcd_certificates:
  - ca-etcd.pem
  - ca-etcd-key.pem
  - cert-etcd.pem
  - cert-etcd-key.pem

The kube-apiserver settings defined in k8s_apiserver_settings can be overridden by defining a variable called k8s_apiserver_settings_user. You can also use this variable to add additional settings for the kube-apiserver daemon. E.g. to override the audit-log-maxage and audit-log-maxbackup default values and to add the watch-cache option, add the following settings to group_vars/k8s.yml:

k8s_apiserver_settings_user:
  "audit-log-maxage": "40"
  "audit-log-maxbackup": "4"
  "watch-cache": "false"

The same is true for the kube-controller-manager: add entries to the k8s_controller_manager_settings_user variable. For the kube-scheduler add entries to the k8s_scheduler_settings_user variable to override settings in the k8s_scheduler_settings dictionary or to add new ones.
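
For example, to override the controller manager’s cluster-cidr you would add something like this to group_vars/k8s.yml (the CIDR below is just a made-up example value):

# group_vars/k8s.yml (example value only)
k8s_controller_manager_settings_user:
  "cluster-cidr": "10.205.0.0/16"

For the kube-scheduler there is not much left to tune this way, since only the config parameter remains (see the comment in the defaults above).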

As you can see we install Kubernetes 1.10.x by default. The role will search for the certificates we created in part 4 in the directory you specify in k8s_ca_conf_directory on the host you run Ansible on. The encryption config file will also be used, which this role should find in k8s_encryption_config_directory (which is the same as k8s_config_directory in my case). The CA and certificate files used here are listed in k8s_certificates. The binaries listed in k8s_controller_binaries will be downloaded and stored in the directory you specify in k8s_bin_dir. If you created a different interface for PeerVPN (e.g. peervpn0) change k8s_interface.
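
If your PeerVPN interface is called peervpn0 instead of tap0, for instance, overriding the two interface variables in group_vars/k8s.yml is all that’s needed (the interface name below is just an example):

# group_vars/k8s.yml - only needed if your PeerVPN interface is not "tap0"
k8s_interface: "peervpn0"
etcd_interface: "peervpn0"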

If you ask yourself “why do we need to specify etcd_certificates here again?”: well, the Kubernetes API server needs to communicate with the Kubernetes components AND the etcd cluster, as you may remember. That’s the reason why it must be aware of both CAs and certificates. But since we store all group variables in group_vars/k8s.yml it’s of course sufficient to specify every variable only once there, even if you see the same variable in different roles (mainly in defaults/main.yml).
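
So defining these variables once in group_vars/k8s.yml, e.g. like this (values taken from the defaults above), is enough for every role whose defaults declare them:

# group_vars/k8s.yml
etcd_client_port: "2379"
etcd_interface: "tap0"
etcd_certificates:
  - ca-etcd.pem
  - ca-etcd-key.pem
  - cert-etcd.pem
  - cert-etcd-key.pem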

Now add an entry for your controller hosts to Ansible’s hosts file, e.g. (of course you need to change controller[1:3].your.tld to your own hostnames):

[k8s_controller]
controller[1:3].your.tld

Install the role via

ansible-galaxy install githubixx.kubernetes-controller

Next add the role ansible-role-kubernetes-controller to the k8s.yml file e.g.:

-
  hosts: k8s_controller
  roles:
    -
      role: githubixx.kubernetes-controller
      tags: role-kubernetes-controller

Apply the role via

ansible-playbook --tags=role-kubernetes-controller k8s.yml

After the role has been applied you can basically check the status of the components with (/path/to should be the value of the variable k8s_config_directory):

kubectl get componentstatuses

BUT first we need to configure kubectl ;-) We already installed kubectl locally in part 2 of my tutorial. I’ve prepared a playbook to do the kubectl configuration. You should already have cloned my ansible-kubernetes-playbooks repository. I recommend placing it at the same directory level as Ansible’s roles directory (git clone https://github.com/githubixx/ansible-kubernetes-playbooks). Switch to the ansible-kubernetes-playbooks/kubectlconfig directory.

There is now one thing you may need to change: https://github.com/githubixx/ansible-kubernetes-playbooks/blob/v3.0.0_r1.10.4/kubectlconfig/kubectlconfig.yml#L11 . This complicated looking line gets the first hostname in our [k8s_controller] host group and uses the IP address of this host’s PeerVPN interface as the API server address for kubectl (kubectl is basically the frontend utility for the API server). My laptop also has PeerVPN installed and it’s part of this Kubernetes PeerVPN mesh network. This allows kubectl on my laptop to contact the API server. But that may not work for you. Either do the same, or set up SSH forwarding to one of the controller nodes’ PeerVPN interface (port 6443 by default) and then use --server=https://localhost:6443, or do something completely different ;-) You could also copy the $HOME/.kube directory (once the configs are generated in a moment) to one of the Kubernetes hosts and work from there.
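
If you go the SSH forwarding route, it could look roughly like this (controller1.your.tld and 10.3.0.1 are placeholders for one of your controller hosts and its PeerVPN IP; depending on the SANs in your API server certificate you may also need to adjust the server name or certificate):

# Forward local port 6443 to the kube-apiserver listening on the
# controller's PeerVPN interface (placeholder IP/host, adjust to your setup):
ssh -L 6443:10.3.0.1:6443 user@controller1.your.tld

# In a second terminal point kubectl at the forwarded port:
kubectl --server=https://localhost:6443 get componentstatuses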

Now generate the kubectl configuration with

ansible-playbook kubectlconfig.yml

If you have your Ansible variables all in place as I suggested in my previous posts it should just work. The playbook will configure kubectl using the admin certificates we created with the Ansible role role-kubernetes-ca.
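
A quick sanity check that the playbook wrote the expected configuration (the output will of course differ for your setup):

kubectl config current-context
kubectl config view --minify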

If you now run kubectl get componentstatuses you would expect to see this output:

kubectl get componentstatuses

NAME                 STATUS    MESSAGE              ERROR
controller-manager   Healthy   ok                   
scheduler            Healthy   ok                   
etcd-0               Healthy   {"health": "true"}   
etcd-1               Healthy   {"health": "true"}   
etcd-2               Healthy   {"health": "true"}  

BUT instead you will probably see this:

NAME                 STATUS      MESSAGE                                                                                        ERROR
scheduler            Unhealthy   Get http://127.0.0.1:10251/healthz: dial tcp 127.0.0.1:10251: getsockopt: connection refused   
controller-manager   Unhealthy   Get http://127.0.0.1:10252/healthz: dial tcp 127.0.0.1:10252: getsockopt: connection refused   
etcd-0               Healthy     {"health": "true"}                                                                             
etcd-1               Healthy     {"health": "true"}                                                                             
etcd-2               Healthy     {"health": "true"}

If you don’t see any errors in systemd’s journal ( journalctl --no-pager ) on the controller nodes, the scheduler and controller-manager are running and listening ( netstat -tlpn ) on ports 10251 and 10252, and you still get the output above, it’s because of this long standing bug “kubectl get cs”: incorrect hard coded master component locations. ATM this can only be avoided if you bind the scheduler and the controller-manager to 0.0.0.0, which means listening on all interfaces. But this is something I don’t want. I configured these services to listen only on the PeerVPN interface because that’s really sufficient. This way all communication is automatically secure because of the encrypted PeerVPN connection and, more importantly, the PeerVPN interfaces shouldn’t be reachable from the Internet by default even if you don’t use any firewall. And just because of this bug I should make the scheduler and controller-manager listen on all interfaces? Doesn’t really make sense to me…

If you don’t care about the issues mentioned above you can use different defaults for scheduler and controller-manager by defining

k8s_controller_manager_settings_user:
  "address": "0.0.0.0"

and

k8s_scheduler_settings_user:
  "address": "0.0.0.0"

in group_vars/k8s.yml.

An alternative would be to set up an iptables rule to forward the traffic accordingly (I haven’t looked at this yet, but see the untested sketch below).
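
Just to sketch the idea (completely untested, as said): on each controller node one could try to DNAT the requests the API server sends to the hard coded 127.0.0.1 addresses to the PeerVPN IP the daemons actually listen on (10.3.0.1 is a placeholder for that node’s PeerVPN IP):

# Untested sketch - redirect the locally generated health check requests
# from 127.0.0.1:10251/10252 to the PeerVPN address (placeholder IP):
iptables -t nat -A OUTPUT -p tcp -d 127.0.0.1 --dport 10251 -j DNAT --to-destination 10.3.0.1:10251
iptables -t nat -A OUTPUT -p tcp -d 127.0.0.1 --dport 10252 -j DNAT --to-destination 10.3.0.1:10252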

Now it’s time to setup the Kubernetes worker.