Building a Kubernetes 1.18 multi-master cluster - p01

Overview

This setup uses:

  • Kubernetes 1.18
  • eight servers running Ubuntu 20.04 LTS
  • three master nodes
  • five worker nodes
  • all servers have a root drive and a data drive
  • no swap partition (kubelet requires swap to be disabled)

Setting up the servers

All servers are Ubuntu 20.04 LTS.

/etc/netplan/00-installer-config.yaml

kube-01 (master)
network:
  ethernets:
    eth0:
      addresses:
      - 192.168.122.181/24
      gateway4: 192.168.122.1
      nameservers:
        addresses:
        - 192.168.122.153
        - 192.168.122.1
        search:
        - domain.dom
  version: 2
kube-02 (master)
network:
  ethernets:
    eth0:
      addresses:
      - 192.168.122.182/24
      gateway4: 192.168.122.1
      nameservers:
        addresses:
        - 192.168.122.153
        - 192.168.122.1
        search:
        - domain.dom
  version: 2
kube-03 (master)
network:
  ethernets:
    eth0:
      addresses:
      - 192.168.122.183/24
      gateway4: 192.168.122.1
      nameservers:
        addresses:
        - 192.168.122.153
        - 192.168.122.1
        search:
        - domain.dom
  version: 2
kube-04 (worker)
network:
  ethernets:
    eth0:
      addresses:
      - 192.168.122.184/24
      gateway4: 192.168.122.1
      nameservers:
        addresses:
        - 192.168.122.153
        - 192.168.122.1
        search:
        - domain.dom
  version: 2
kube-05 (worker)
network:
  ethernets:
    eth0:
      addresses:
      - 192.168.122.185/24
      gateway4: 192.168.122.1
      nameservers:
        addresses:
        - 192.168.122.153
        - 192.168.122.1
        search:
        - domain.dom
  version: 2
kube-06 (worker)
network:
  ethernets:
    eth0:
      addresses:
      - 192.168.122.186/24
      gateway4: 192.168.122.1
      nameservers:
        addresses:
        - 192.168.122.153
        - 192.168.122.1
        search:
        - domain.dom
  version: 2
kube-07 (worker)
network:
  ethernets:
    eth0:
      addresses:
      - 192.168.122.187/24
      gateway4: 192.168.122.1
      nameservers:
        addresses:
        - 192.168.122.153
        - 192.168.122.1
        search:
        - domain.dom
  version: 2
kube-08 (worker)
network:
  ethernets:
    eth0:
      addresses:
      - 192.168.122.188/24
      gateway4: 192.168.122.1
      nameservers:
        addresses:
        - 192.168.122.153
        - 192.168.122.1
        search:
        - domain.dom
  version: 2
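
Applying the netplan configuration and checking the address on each node:

netplan apply
ip addr show eth0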

Timezone

timedatectl set-timezone America/Toronto

Update and upgrade

apt update && apt upgrade -y

Install net tools and dns utils

apt install net-tools dnsutils

Install and configure chrony

apt install chrony -y

/etc/chrony/chrony.conf

server time2.domain.dom iburst
server time3.domain.dom iburst
server time4.domain.dom iburst
server time1.domain.dom iburst

Enabling and starting chrony:

systemctl enable chrony
systemctl start chrony && systemctl status chrony
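
A quick check that chrony is syncing against the configured servers:

chronyc sources
chronyc tracking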

Enable forwarding and bridge kernel module

modprobe br_netfilter

/etc/sysctl.conf

net.bridge.bridge-nf-call-ip6tables = 1
net.bridge.bridge-nf-call-iptables = 1
net.ipv4.ip_forward = 1

Applying the settings:

sysctl -p
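
The modprobe above only lasts until the next reboot. To have br_netfilter loaded automatically at boot, one option is a modules-load.d entry (the file name k8s.conf is just a choice here):

/etc/modules-load.d/k8s.conf

br_netfilter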

Some more info related to this:

  • Why do net.bridge.bridge-nf-call-{arp,ip,ip6}tables default to 1?
  • What is the net.bridge.bridge-nf-call-iptables kernel parameter?
  • libvirt: The problem

Remove snap (snapd)

snap list
snap remove lxd
snap remove core18
snap remove snapd

apt purge snapd

rm -Rf ~/snap
rm -Rf /snap
rm -Rf /var/snap
rm -Rf /var/lib/snap

apt autoremove

apt-get clean
apt-get install localepurge

Remove cloud-init

apt purge cloud-init
rm -Rf /etc/cloud && rm -Rf /var/lib/cloud
rm -Rf /etc/netplan/50-cloud-init.yaml

Disable motd

/etc/default/motd-news

ENABLED=0

Disabling the update-motd scripts and the motd-news units:

chmod -x /etc/update-motd.d/*
systemctl disable motd-news.service
systemctl disable motd-news.timer

Disable a few other services

/etc/apt/apt.conf.d/20auto-upgrades

APT::Periodic::Update-Package-Lists "0";
APT::Periodic::Unattended-Upgrade "0";

/etc/apt/apt.conf.d/10periodic

APT::Periodic::Update-Package-Lists "0";
APT::Periodic::Download-Upgradeable-Packages "0";
APT::Periodic::AutocleanInterval "0";

Disabling the services and timers:

systemctl disable ubuntu-fan.service
systemctl disable fwupd-refresh.service
systemctl disable apport.service
systemctl stop apt-daily.timer
systemctl disable apt-daily.timer
systemctl disable apt-daily.service
systemctl stop apt-daily-upgrade.timer
systemctl disable apt-daily-upgrade.timer
systemctl disable apt-daily-upgrade.service
systemctl disable lxd-agent.service
systemctl disable lxd-agent-9p.service

Disable systemd-resolved

systemctl disable systemd-resolved
systemctl stop systemd-resolved
rm -Rf /etc/resolv.conf

/etc/resolv.conf

search domain.dom
nameserver 192.168.122.153
nameserver 192.168.122.1
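
A quick resolution test against the two nameservers (the record names are only examples, assuming the kube hosts are registered in domain.dom):

dig kube-01.domain.dom +short
dig @192.168.122.153 kube-02.domain.dom +short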

Data Drive

All servers have a second drive, /dev/sdb, with a standard partition, /dev/sdb1, mounted at /data.
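
The /data filesystem itself is mounted through fstab (a sketch, assuming /dev/sdb1 is already formatted as ext4):

/etc/fstab

/dev/sdb1  /data  ext4  defaults  0  2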

I want all data related to docker, containerd, dockershim, etcd, and kubelet stored on the /data drive. For this, I am creating some symlinks.

Creating directories

mkdir /data/docker
mkdir /data/containerd
mkdir /data/dockershim
mkdir /data/etcd
mkdir /data/kubelet

Creating symlinks

ln -s /data/docker /var/lib/docker
ln -s /data/containerd /var/lib/containerd
ln -s /data/dockershim /var/lib/dockershim
ln -s /data/etcd /var/lib/etcd
ln -s /data/kubelet /var/lib/kubelet

For storing Longhorn data, I will use a bind mount:

mkdir /data/longhorn
mkdir /var/lib/longhorn

/etc/fstab

/data/longhorn /var/lib/longhorn   none  bind  0 0
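
Activating the bind mount without a reboot and verifying it:

mount /var/lib/longhorn
findmnt /var/lib/longhorn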

Installing Docker

Installing docker on each of the nodes.

apt install docker.io -y

/etc/docker/daemon.json

{
  "exec-opts": ["native.cgroupdriver=systemd"],
  "log-driver": "json-file",
  "log-opts": {
    "max-size": "100m"
  },
  "storage-driver": "overlay2"
}

Enabling and restarting docker:

systemctl enable docker
systemctl restart docker && systemctl status docker
docker info
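
In the docker info output, the relevant lines can be filtered directly to confirm the systemd cgroup driver and the overlay2 storage driver:

docker info | grep -Ei 'cgroup driver|storage driver'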

Installing and configuring Keepalived and HAProxy on the master nodes

On each of the three master nodes, installing and configuring keepalived and HAProxy to load balance the control plane. Keepalived provides the virtual IP address for the cluster, and HAProxy balances API server connections across the three master nodes.

apt install haproxy keepalived -y

The script used by keepalived to check the API server (the cluster VIP is 192.168.122.189):

/etc/keepalived/check_apiserver.sh

#!/bin/sh
APISERVER_VIP=192.168.122.189
APISERVER_DEST_PORT=6443

errorExit() {
    echo "*** $*" 1>&2
    exit 1
}

curl --silent --max-time 2 --insecure https://localhost:${APISERVER_DEST_PORT}/ -o /dev/null || errorExit "Error GET https://localhost:${APISERVER_DEST_PORT}/"
if ip addr | grep -q ${APISERVER_VIP}; then
    curl --silent --max-time 2 --insecure https://${APISERVER_VIP}:${APISERVER_DEST_PORT}/ -o /dev/null || errorExit "Error GET https://${APISERVER_VIP}:${APISERVER_DEST_PORT}/"
fi

Making the script executable:

chmod +x /etc/keepalived/check_apiserver.sh
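
The script can also be run by hand; until the API server is up it is expected to exit non-zero with the message from errorExit:

sh /etc/keepalived/check_apiserver.sh; echo $?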

Keepalived configuration

On the first master, kube-01:

/etc/keepalived/keepalived.conf

! keepalived - kube-01
!
global_defs {
    notification_email {
     alerts@domain.dom
   }
   notification_email_from kube-01@domain.dom
   smtp_server 192.168.122.139
   smtp_connect_timeout 30
   router_id kube-01
}
vrrp_script check_apiserver {
  script "/etc/keepalived/check_apiserver.sh"
  interval 3
  weight -2
  fall 10
  rise 2
}

vrrp_instance apiserver {
    state MASTER
    interface eth0
    virtual_router_id 70 
    priority 100

    virtual_ipaddress {
        192.168.122.189/24
    }
    track_script {
        check_apiserver
    }
}

On the second master, kube-02

/etc/keepalived/keepalived.conf

! keepalived - kube-02
!
global_defs {
    notification_email {
     alerts@domain.dom
   }
   notification_email_from kube-02@domain.dom
   smtp_server 192.168.122.139
   smtp_connect_timeout 30
   router_id kube-02
}
vrrp_script check_apiserver {
  script "/etc/keepalived/check_apiserver.sh"
  interval 3
  weight -2
  fall 10
  rise 2
}

vrrp_instance apiserver {
    state BACKUP
    interface eth0
    virtual_router_id 70 
    priority 99

    virtual_ipaddress {
        192.168.122.189/24
    }
    track_script {
        check_apiserver
    }
}

On the third master, kube-03

/etc/keepalived/keepalived.conf

! keepalived - kube-03
!
global_defs {
    notification_email {
     alerts@domain.dom
   }
   notification_email_from kube-03@domain.dom
   smtp_server 192.168.122.139
   smtp_connect_timeout 30
   router_id kube-03
}
vrrp_script check_apiserver {
  script "/etc/keepalived/check_apiserver.sh"
  interval 3
  weight -2
  fall 10
  rise 2
}

vrrp_instance apiserver {
    state BACKUP
    interface eth0
    virtual_router_id 70 
    priority 98

    virtual_ipaddress {
        192.168.122.189/24
    }
    track_script {
        check_apiserver
    }
}

For HAProxy, the same configuration file is used on all three masters:

/etc/haproxy/haproxy.cfg

#---------------------------------------------------------------------
# kubernetes apiserver frontend
#---------------------------------------------------------------------
frontend apiserver
    bind *:8443
    mode tcp
    option tcplog
    default_backend apiserver
#---------------------------------------------------------------------
# kubernetes apiserver backend
#---------------------------------------------------------------------
backend apiserver
    option httpchk GET /healthz
    http-check expect status 200
    mode tcp
    balance roundrobin
        server kube-01 192.168.122.181:6443 check check-ssl verify none
        server kube-02 192.168.122.182:6443 check check-ssl verify none
        server kube-03 192.168.122.183:6443 check check-ssl verify none

Enabling and starting both services:

systemctl enable keepalived --now
systemctl enable haproxy --now
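
At this point kube-01 (highest keepalived priority) should hold the VIP and HAProxy should be listening on port 8443; the backend checks will keep failing until the API server is initialized:

ip addr show eth0 | grep 192.168.122.189
ss -tlnp | grep 8443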

Kubernetes Install

On all nodes

apt install -y apt-transport-https curl
curl -s https://packages.cloud.google.com/apt/doc/apt-key.gpg | apt-key add -
apt-add-repository "deb http://apt.kubernetes.io/ kubernetes-xenial main"
apt update
apt-get install -y kubelet=1.18.8-00 kubeadm=1.18.8-00 kubectl=1.18.8-00

Disable updates for docker and kubernetes

apt-mark hold docker.io kubelet kubeadm kubectl
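
Verifying the holds:

apt-mark showhold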

Kubernetes Cluster Initialization

On the first master (kube-01)

kubeadm init --control-plane-endpoint "192.168.122.189:8443" --upload-certs

where 192.168.122.189 is the cluster VIP managed by keepalived and 8443 is the HAProxy frontend port.

Your Kubernetes control-plane has initialized successfully!

To start using your cluster, you need to run the following as a regular user:

  mkdir -p $HOME/.kube
  sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
  sudo chown $(id -u):$(id -g) $HOME/.kube/config

You should now deploy a pod network to the cluster.
Run "kubectl apply -f [podnetwork].yaml" with one of the options listed at:
  https://kubernetes.io/docs/concepts/cluster-administration/addons/

You can now join any number of the control-plane node running the following command on each as root:

  kubeadm join 192.168.122.189:8443 --token g3hlhz.a63zp68iayo03xe6 \
    --discovery-token-ca-cert-hash sha256:eae5fb3b7c6fb62abd1460c44ae7d199453feee7ef99460f1be12cdbed79df7e \
    --control-plane --certificate-key d6400de0ac991e44f93b68c04709bef716682497ff5978e925a73573ccd1b270

Please note that the certificate-key gives access to cluster sensitive data, keep it secret!
As a safeguard, uploaded-certs will be deleted in two hours; If necessary, you can use
"kubeadm init phase upload-certs --upload-certs" to reload certs afterward.

Then you can join any number of worker nodes by running the following on each as root:

kubeadm join 192.168.122.189:8443 --token g3hlhz.a63zp68iayo03xe6 \
    --discovery-token-ca-cert-hash sha256:eae5fb3b7c6fb62abd1460c44ae7d199453feee7ef99460f1be12cdbed79df7e 

Setting up kubectl on kube-01:

mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config

Installing calico for CNI

kubectl apply -f https://docs.projectcalico.org/v3.16/manifests/calico.yaml
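
Checking that the calico pods start and the first master becomes Ready:

kubectl -n kube-system get pods -o wide
kubectl get nodes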

On the second master (kube-02)

kubeadm join 192.168.122.189:8443 --token g3hlhz.a63zp68iayo03xe6 \
    --discovery-token-ca-cert-hash sha256:eae5fb3b7c6fb62abd1460c44ae7d199453feee7ef99460f1be12cdbed79df7e \
    --control-plane --certificate-key d6400de0ac991e44f93b68c04709bef716682497ff5978e925a73573ccd1b270
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config

On the third master (kube-03)

kubeadm join 192.168.122.189:8443 --token g3hlhz.a63zp68iayo03xe6 \
    --discovery-token-ca-cert-hash sha256:eae5fb3b7c6fb62abd1460c44ae7d199453feee7ef99460f1be12cdbed79df7e \
    --control-plane --certificate-key d6400de0ac991e44f93b68c04709bef716682497ff5978e925a73573ccd1b270
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config

On each of the worker nodes

kubeadm join 192.168.122.189:8443 --token g3hlhz.a63zp68iayo03xe6 \
    --discovery-token-ca-cert-hash sha256:eae5fb3b7c6fb62abd1460c44ae7d199453feee7ef99460f1be12cdbed79df7e 
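
From any master, all eight nodes should eventually report Ready:

kubectl get nodes -o wide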

Adjusting the etcd configuration

After the install, only the third master's etcd manifest lists all three members. Adjust /etc/kubernetes/manifests/etcd.yaml on the other masters so that the initial-cluster configuration includes all three:

- --initial-cluster=kube-01=https://192.168.122.181:2380,kube-02=https://192.168.122.182:2380,kube-03=https://192.168.122.183:2380
- --initial-cluster-state=existing
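
kubelet picks up the manifest change and restarts the etcd static pod on that node. The etcd pods can be watched from any master (assuming the usual kubeadm component=etcd label):

kubectl -n kube-system get pods -l component=etcd -o wide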

Installing etcd-client on all three masters

apt install etcd-client

Checking the endpoint health:

ETCDCTL_API=3 etcdctl \
--cacert /etc/kubernetes/pki/etcd/ca.crt \
--cert /etc/kubernetes/pki/etcd/server.crt \
--key /etc/kubernetes/pki/etcd/server.key \
--endpoints=https://192.168.122.181:2379,\
https://192.168.122.182:2379,\
https://192.168.122.183:2379 \
endpoint health

Getting a list of members

ETCDCTL_API=3 etcdctl \
--cacert /etc/kubernetes/pki/etcd/ca.crt \
--cert /etc/kubernetes/pki/etcd/server.crt \
--key /etc/kubernetes/pki/etcd/server.key \
--endpoints=https://192.168.122.181:2379,\
https://192.168.122.182:2379,\
https://192.168.122.183:2379 \
member list

Saving a snapshot of the etcd database

ETCDCTL_API=3 etcdctl \
--cacert /etc/kubernetes/pki/etcd/ca.crt \
--cert /etc/kubernetes/pki/etcd/server.crt \
--key /etc/kubernetes/pki/etcd/server.key \
--endpoints=https://192.168.122.181:2379,\
https://192.168.122.182:2379,\
https://192.168.122.183:2379 \
snapshot save snapshotdb

Checking the status of the snapshot:

ETCDCTL_API=3 etcdctl \
--cacert /etc/kubernetes/pki/etcd/ca.crt \
--cert /etc/kubernetes/pki/etcd/server.crt \
--key /etc/kubernetes/pki/etcd/server.key \
--endpoints=https://192.168.122.181:2379,\
https://192.168.122.182:2379,\
https://192.168.122.183:2379 \
--write-out=table snapshot status snapshotdb
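
For reference, a snapshot taken this way can later be restored with etcdctl snapshot restore (a sketch only; a real restore on a kubeadm cluster also involves stopping the etcd static pod and pointing its data directory at the restored copy):

ETCDCTL_API=3 etcdctl snapshot restore snapshotdb \
--data-dir /data/etcd-restore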