Kubernetes upgrade notes: 1.17.x to 1.18.x

If you used my Kubernetes the Not So Hard Way With Ansible blog posts to set up a Kubernetes (K8s) cluster, these notes might be helpful for you (and maybe also for others who manage a K8s cluster on their own).

I have a general upgrade guide, Kubernetes the Not So Hard Way With Ansible - Upgrading Kubernetes, that worked quite well for me for the past K8s upgrades. So please read that guide if you want to know HOW the components are updated. This post is specifically about the 1.17.x to 1.18.x upgrade and WHAT I changed.

First: As usual I don’t update a production system before the .2 release of a new minor version is out. In my experience the .0 and .1 releases are just too buggy (and to be honest sometimes it’s even better to wait for the .5 release ;-) ). Of course it is still important to test new releases early in development or integration systems and to report bugs!

Second: I only upgrade from the latest patch version of the previous minor release. In my case I was running 1.17.4 and at the time of writing 1.18.5 was the latest 1.18.x release. I read the 1.17.x changelog to see if any important changes were made between 1.17.4 and 1.17.7. I didn’t see anything that prevented me from updating and I didn’t need to change anything, so I did the 1.17.4 to 1.17.7 upgrade first. If you use my Ansible roles that basically only means changing the k8s_release variable from 1.17.4 to 1.17.7 and rolling out the changes for the control plane and worker nodes as described in my upgrade guide. After that everything still worked as expected so I continued with the next step.
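The two-step approach described above (first the latest patch of the current minor release, then the next minor release) can be sketched in a few lines of Python. This is only an illustration of the reasoning: the function names parse_version and plan_upgrade are made up for this example and are not part of my Ansible roles, and you have to look up the latest patch release yourself.

```python
from typing import List, Tuple


def parse_version(version: str) -> Tuple[int, int, int]:
    """Turn '1.17.4' or 'v1.17.4' into (1, 17, 4)."""
    major, minor, patch = version.lstrip("v").split(".")
    return int(major), int(minor), int(patch)


def plan_upgrade(current: str, latest_patch: str, target: str) -> List[str]:
    """Return the versions to roll out, in order.

    current:      version the cluster runs now
    latest_patch: latest patch release of the current minor version
    target:       release of the next minor version you want to reach
    """
    cur = parse_version(current)
    latest = parse_version(latest_patch)
    tgt = parse_version(target)

    if cur[:2] != latest[:2] or latest < cur:
        raise ValueError("latest_patch must be a same-minor release >= current")
    if tgt[0] != cur[0] or tgt[1] != cur[1] + 1:
        raise ValueError("target must be exactly one minor release ahead")

    steps = []
    if latest > cur:
        # Step 1: catch up to the latest patch of the current minor release.
        steps.append(latest_patch)
    # Step 2: move on to the next minor release.
    steps.append(target)
    return steps


print(plan_upgrade("1.17.4", "1.17.7", "1.18.5"))  # ['1.17.7', '1.18.5']
```

Each entry of the returned list corresponds to one value of k8s_release that gets rolled out completely before moving on to the next one.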

Here are a few links that might be interesting regarding new features in Kubernetes 1.18:

  • What’s new in Kubernetes 1.18 - Twitter thread
  • What’s new in Kubernetes v1.18.0 - SUSE blog
  • What’s new in Kubernetes 1.18 - SysDig blog

Since K8s 1.14 there are also searchable release notes available. You can specify the K8s version and a K8s area/component (e.g. kubelet, apiserver, …) and immediately get an overview of what changed in that regard. Quite nice! :-)

As it is normally no problem to have a kubectl utility that is one minor version ahead of the server version, I also updated kubectl from 1.17.x to 1.18.5 using my kubectl Ansible role.
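If you want to double check the client/server combination before updating kubectl, the rule can be written down as a tiny check. This sketch follows the upstream version skew policy, which supports kubectl within one minor version (older or newer) of kube-apiserver. The function names are hypothetical and the version strings are typed in by hand here, nothing is read from a real cluster.

```python
from typing import Tuple


def parse_version(version: str) -> Tuple[int, int, int]:
    """Turn '1.18.5' or 'v1.18.5' into (1, 18, 5)."""
    major, minor, patch = version.lstrip("v").split(".")
    return int(major), int(minor), int(patch)


def kubectl_skew_ok(client: str, server: str) -> bool:
    """True if kubectl is at most one minor version away from the server."""
    c_major, c_minor, _ = parse_version(client)
    s_major, s_minor, _ = parse_version(server)
    return c_major == s_major and abs(c_minor - s_minor) <= 1


# kubectl 1.18.5 against a cluster that still runs 1.17.7 is fine ...
print(kubectl_skew_ok("1.18.5", "1.17.7"))
# ... while a client two minor versions ahead is outside the supported skew.
print(kubectl_skew_ok("1.19.0", "1.17.7"))
```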

As always before a minor version upgrade, read the Urgent Upgrade Notes! If you used my Ansible roles to install Kubernetes and kept most of the default settings then there should be no need to adjust any settings. But have a look at the deprecations for kube-apiserver:

the following deprecated APIs can no longer be served:

  All resources under apps/v1beta1 and apps/v1beta2 - use apps/v1 instead
  daemonsets, deployments, replicasets resources under extensions/v1beta1 - 
  use apps/v1 instead

  networkpolicies resources under extensions/v1beta1 -
  use networking.k8s.io/v1 instead

  podsecuritypolicies resources under extensions/v1beta1 - use policy/v1beta1
  instead (#85903, @liggitt) [SIG API Machinery, Apps, Cluster Lifecycle,
  Instrumentation and Testing]

So before upgrading make sure that you no longer use any of the deprecated APIs mentioned above, as kube-apiserver 1.18 won’t serve them anymore.
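To get a first impression of whether your manifests are affected, you could scan them for the apiVersion/kind combinations listed above. The following is a rough, text-based sketch in Python: the helper names and the sample manifest are made up for illustration, it only looks at top-level apiVersion: and kind: lines, and it knows nothing about objects that already live in the cluster or about templated manifests (e.g. Helm charts).

```python
import re
from typing import List, Optional, Tuple

# Kinds under extensions/v1beta1 that are no longer served, with replacement.
EXTENSIONS_V1BETA1 = {
    "DaemonSet": "apps/v1",
    "Deployment": "apps/v1",
    "ReplicaSet": "apps/v1",
    "NetworkPolicy": "networking.k8s.io/v1",
    "PodSecurityPolicy": "policy/v1beta1",
}


def replacement_for(api_version: str, kind: str) -> Optional[str]:
    """Return the API version to use instead, or None if nothing to do."""
    if api_version in ("apps/v1beta1", "apps/v1beta2"):
        return "apps/v1"
    if api_version == "extensions/v1beta1":
        return EXTENSIONS_V1BETA1.get(kind)
    return None


def find_removed_apis(manifest_text: str) -> List[Tuple[str, str, str]]:
    """Return (kind, old apiVersion, new apiVersion) per affected document."""
    findings = []
    # Split a multi-document YAML file at the '---' separators.
    for doc in re.split(r"^---[ \t]*$", manifest_text, flags=re.MULTILINE):
        api = re.search(r"^apiVersion:\s*(\S+)", doc, flags=re.MULTILINE)
        kind = re.search(r"^kind:\s*(\S+)", doc, flags=re.MULTILINE)
        if not api or not kind:
            continue
        old = api.group(1).strip("'\"")
        name = kind.group(1).strip("'\"")
        new = replacement_for(old, name)
        if new:
            findings.append((name, old, new))
    return findings


SAMPLE = """\
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: web
---
apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  name: web
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: api
"""

for kind, old, new in find_removed_apis(SAMPLE):
    print(f"{kind}: {old} -> {new}")
```

In the sample only the first Deployment is reported. The Ingress under extensions/v1beta1 is not part of the list above, so the sketch leaves it alone.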

Besides that, CoreDNS was upgraded to 1.6.7. I took the opportunity to adjust my CoreDNS playbook a little bit and replaced deprecated options before they finally get removed in 1.7.0. There is a very handy tool that helps you upgrade CoreDNS’s configuration file (the Corefile). Read more about it at CoreDNS Corefile Migration for Kubernetes.

CNI was also upgraded to v0.8.5, but besides the version bump no configuration changes are needed.

Further interesting notes:

  • NodeLocal DNSCache is an add-on that runs a dnsCache pod as a daemonset to improve clusterDNS performance and reliability. The feature has been in Alpha since 1.13 release. The SIG Network is announcing the GA graduation of Node Local DNSCache
  • SIG Network is moving IPv6 to Beta in Kubernetes 1.18
  • SIG CLI introduces kubectl debug. kubectl now contains a kubectl alpha debug command. This command allows attaching an ephemeral container to a running pod for the purposes of debugging.
  • Server-side Apply was promoted to Beta in 1.16, but is now introducing a second Beta in 1.18. This new version will track and manage changes to fields of all new Kubernetes objects, allowing you to know what changed your resources and when.
  • Kubernetes Topology Manager moves to beta
  • Ingress is extended with the new IngressClass resource, which replaces the deprecated kubernetes.io/ingress.class annotation
  • kubectl and k8s.io/client-go no longer default to a server address of http://localhost:8080
  • kubectl run now only creates pods. See specific kubectl create subcommands to create objects other than pods.
  • The deprecated command kubectl rolling-update has been removed
  • kube-proxy: --healthz-port and --metrics-port flags are deprecated, please use --healthz-bind-address and --metrics-bind-address instead
  • kubectl apply --server-dry-run is deprecated and replaced with --dry-run=server. The kubectl --dry-run flag now accepts the values ‘client’, ‘server’, and ‘none’, to support client-side and server-side dry-run strategies.
  • kube-proxy flags --ipvs-tcp-timeout, --ipvs-tcpfin-timeout, --ipvs-udp-timeout were added to configure IPVS connection timeouts
  • kube-proxy: Added dual-stack IPv4/IPv6 support to the iptables proxier.
  • The kubelet and the default docker runtime now support running ephemeral containers in the Linux process namespace of a target container.
  • kubectl drain node --dry-run will list pods that would be evicted or deleted
  • kubectl apply -f <file> --prune -n <namespace> should prune all resources not defined in the file in the cli specified namespace.
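Some of the notes above affect scripts that call kubectl. As one example, the deprecated kubectl apply --server-dry-run flag could be rewritten to --dry-run=server with a small helper like the following. This is just a sketch: migrate_dry_run is a hypothetical name, the helper only handles this single flag, and it only rewrites the command string without executing anything.

```python
import shlex


def migrate_dry_run(command: str) -> str:
    """Replace the deprecated --server-dry-run flag with --dry-run=server."""
    args = shlex.split(command)
    migrated = [
        "--dry-run=server" if arg == "--server-dry-run" else arg
        for arg in args
    ]
    # Quote every argument again so the result is safe to paste into a shell.
    return " ".join(shlex.quote(arg) for arg in migrated)


print(migrate_dry_run("kubectl apply --server-dry-run -f deploy.yaml"))
# kubectl apply --dry-run=server -f deploy.yaml
```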

In case you use my Ansible roles to install Kubernetes there is one breaking change: I renamed cert-etcd.pem/cert-etcd-key.pem to cert-k8s-apiserver-etcd.pem/cert-k8s-apiserver-etcd-key.pem. This was also adjusted in the etcd_certificates list. The new name makes it more obvious that this is a client certificate which kube-apiserver uses to connect to a TLS secured etcd cluster. In fact kube-apiserver is just a client to etcd like any other client. In my ansible-role-kubernetes-ca this was also changed accordingly (see the etcd_additional_clients list). ansible-role-kubernetes-ca is now able to generate client certificates for other services like Traefik or Cilium, which are often used in a Kubernetes cluster. So the already existing etcd cluster for Kubernetes (esp. for kube-apiserver) can be reused for other components.

If you use CSI then also check the CSI Sidecar Containers documentation. The documentation of every sidecar container contains a matrix that shows which version you need at a minimum, which at a maximum, and which version is recommended for which K8s version. Since this is quite new stuff, basically all CSI sidecar containers work with K8s 1.13 to 1.18. The first releases of these sidecar containers only need K8s 1.10, but I wouldn’t use these old versions. So there is at least no urgent need to upgrade the CSI sidecar containers ATM. Nevertheless, if your K8s update to v1.18 worked fine I would recommend also updating the CSI sidecar containers sooner or later because a) lots of changes happen in this area ATM and b) you might require the newer versions for the next K8s version anyway.

Now I finally updated the K8s controller and worker nodes to version 1.18.5 as described in Kubernetes the Not So Hard Way With Ansible - Upgrading Kubernetes.

If you see errors like

kube-controller-manager[3375]: E0405 18:58:30.109867    3375 leaderelection.go:331] error retrieving resource lock kube-system/kube-controller-manager: leases.coordination.k8s.io "kube-controller-manager" is forbidden: User "system:kube-controller-manager" cannot get resource "leases" in API group "coordination.k8s.io" in the namespace "kube-system"

while upgrading the controller nodes, this seems to be ok. The error should go away once all controller nodes are running the new Kubernetes version (also see https://github.com/gardener/gardener/issues/1879).
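If you watch the logs during the rollout and want to tell this expected message apart from other errors, a small filter might help. The sketch below simply matches the wording of the log line shown above. The regular expression and the function name are my own assumptions and not an official log format, so adjust them if your messages look different.

```python
import re

# Matches the "leases ... is forbidden" message from the log line above.
LEASE_FORBIDDEN = re.compile(
    r'leases\.coordination\.k8s\.io "(?P<lease>[^"]+)" is forbidden: '
    r'User "(?P<user>[^"]+)" cannot get resource "leases"'
)


def is_lease_forbidden_error(line: str) -> bool:
    """True if the log line is the leader election error discussed above."""
    return LEASE_FORBIDDEN.search(line) is not None


LOG_LINE = (
    'kube-controller-manager[3375]: E0405 18:58:30.109867    3375 '
    'leaderelection.go:331] error retrieving resource lock '
    'kube-system/kube-controller-manager: leases.coordination.k8s.io '
    '"kube-controller-manager" is forbidden: User '
    '"system:kube-controller-manager" cannot get resource "leases" in API '
    'group "coordination.k8s.io" in the namespace "kube-system"'
)

print(is_lease_forbidden_error(LOG_LINE))
```

If such lines still show up after all controller nodes run the new version, the error is no longer just an upgrade side effect and deserves a closer look.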

That’s it for today! Happy upgrading! ;-)