[K8s] Learn Kubernetes HA Architecture with a Hands-On Example



The following diagram shows a typical high-availability Kubernetes cluster with embedded etcd:

However, HA architecture also brings another problem: which K8s API server should the client connect to? And how do we detect a node failure while forwarding traffic, so that high availability is actually achieved?

Each K8s control plane replica runs an API server, a controller manager, a scheduler, and an etcd member. All API servers and etcd members are active, while the controller manager and scheduler use leader election, so only one instance of each is actively working at a time.

But how do we decide which K8s API server the client should connect to, while also making sure the backend servers are healthy? Simply adding multiple DNS A records does not solve the problem, since a DNS server cannot check server health in real time. A reverse proxy with upstream health checks is still not good enough, because the proxy itself becomes a single point of failure.
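
For illustration, DNS round-robin would look like the following (a hypothetical zone snippet using this tutorial's node IPs). Clients get spread across all three addresses, but a failed node keeps receiving its share of requests:

your.domain.com.    IN  A   172.16.217.171
your.domain.com.    IN  A   172.16.217.172
your.domain.com.    IN  A   172.16.217.173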

The concept of a virtual IP (VIP) comes to the rescue.

To achieve high availability, we will use Keepalived, an LVS (Linux Virtual Server) solution based on the VRRP protocol. An LVS service consists of one master and multiple backup servers, and exposes a single virtual IP to the outside world. The virtual IP points to the master server, which keeps sending heartbeats to the backup servers. If the backups stop receiving heartbeats, one of them takes over the virtual IP, i.e. the virtual IP “floats” to that backup server.
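
These heartbeats are VRRP advertisements (IP protocol 112), so once keepalived is running you can watch them on the wire (assuming the interface is eth0):

tcpdump -n -i eth0 ip proto 112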

To sum up, we have the following architecture:

  1. A floating VIP maps to three keepalived servers (one master and two backups). Each keepalived server runs on a K8s control plane replica.
  2. When the keepalived master fails, the VIP automatically floats to one of the backup servers.
  3. HAProxy load-balances traffic across the three K8s API servers.
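
Clients then talk to the VIP rather than to any single node. For instance, if you bootstrap the cluster with kubeadm, the VIP plus the HAProxy port is what you would pass as the control plane endpoint (a sketch using the addresses from the hands-on section below):

kubeadm init --control-plane-endpoint "172.16.100.100:7443" --upload-certs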

The following section is a hands-on tutorial that implements the architecture above.

Settings

Three K8s control plane replicas (172.16.217.171, 172.16.217.172, 172.16.217.173) each run a kube-apiserver on port 6443 plus an HAProxy instance listening on port 7443. Two virtual IPs, 172.16.100.100 and 172.16.100.101, float between the nodes via keepalived.

Usage

You can find the full source code on my GitHub.

HAProxy configuration for load balancing across the K8s API servers (haproxy.cfg):

global
    log /dev/log  local0 warning
    maxconn     4000
    daemon
  
defaults
  log global
  option  httplog
  option  dontlognull
  timeout connect 5000
  timeout client 50000
  timeout server 50000


frontend kube-apiserver
  bind *:"$HAPROXY_PORT"
  mode tcp
  option tcplog
  default_backend kube-apiserver

backend kube-apiserver
  mode tcp
  option tcp-check
  balance roundrobin
  default-server inter 10s downinter 5s rise 2 fall 2 slowstart 60s maxconn 250 maxqueue 256 weight 100
  server kube-apiserver-1 172.16.217.171:6443 check # TCP health check (option tcp-check)
  server kube-apiserver-2 172.16.217.172:6443 check
  server kube-apiserver-3 172.16.217.173:6443 check
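
Once HAProxy is running, a quick smoke test against any node's IP on the HAProxy port confirms traffic reaches a kube-apiserver (even a 401/403 JSON error proves the TCP proxying works; -k skips certificate verification):

curl -k https://172.16.217.171:7443/version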

Keepalived configuration (keepalived.conf):

global_defs {
  default_interface {{ KEEPALIVED_INTERFACE }}
}

vrrp_script chk_haproxy {
    # check haproxy
    script "/bin/bash -c 'if [[ $(netstat -nlp | grep 7443) ]]; then exit 0; else exit 1; fi'"
    interval 2 # check every two seconds
    weight 11
}

vrrp_instance VI_1 {
  interface {{ KEEPALIVED_INTERFACE }}

  state {{ KEEPALIVED_STATE }}
  virtual_router_id {{ KEEPALIVED_ROUTER_ID }}
  priority {{ KEEPALIVED_PRIORITY }}
  advert_int 2

  unicast_peer {
    {{ KEEPALIVED_UNICAST_PEERS }}
  }

  virtual_ipaddress {
    {{ KEEPALIVED_VIRTUAL_IPS }}
  }

  authentication {
    auth_type PASS
    auth_pass {{ KEEPALIVED_PASSWORD }}
  }

  track_script {
    chk_haproxy
  }

  {{ KEEPALIVED_NOTIFY }}
}

This configures keepalived to check every two seconds whether HAProxy is healthy, by testing whether anything is listening on port 7443. The {{ ... }} placeholders are rendered from environment variables passed to the container; the most important ones are KEEPALIVED_INTERFACE (the interface VRRP binds to), KEEPALIVED_STATE (MASTER or BACKUP), KEEPALIVED_VIRTUAL_IPS (the floating VIPs), and KEEPALIVED_UNICAST_PEERS (the addresses of all keepalived nodes).
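
You can run the same check by hand on a node to confirm HAProxy is listening on the port keepalived watches (netstat requires the net-tools package; ss is an alternative on newer systems):

netstat -nlp | grep 7443    # or: ss -lntp | grep 7443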

We will use Docker to run HAProxy and keepalived on each K8s control plane replica.

On first node:

docker run -d --net=host --volume $(pwd)/config/haproxy:/usr/local/etc/haproxy \
    -e HAPROXY_PORT=7443 \
    haproxy:2.3-alpine
docker run -d --cap-add=NET_ADMIN --cap-add=NET_BROADCAST --cap-add=NET_RAW --net=host \
    --volume $(pwd)/config/keepalived/keepalived.conf:/container/service/keepalived/assets/keepalived.conf \
    -e KEEPALIVED_INTERFACE=eth0 \
    -e KEEPALIVED_PASSWORD=pass \
    -e KEEPALIVED_STATE=MASTER \
    -e KEEPALIVED_VIRTUAL_IPS="#PYTHON2BASH:['172.16.100.100', '172.16.100.101']" \
    -e KEEPALIVED_UNICAST_PEERS="#PYTHON2BASH:['172.16.217.171', '172.16.217.172', '172.16.217.173']" \
    osixia/keepalived:2.0.20 --copy-service
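
After these containers start on the first (MASTER) node, the VIPs should be attached to its interface (assuming eth0 as configured above):

ip addr show eth0    # 172.16.100.100 and 172.16.100.101 should appear among the addresses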

On second node:

docker run -d --net=host --volume $(pwd)/config/haproxy:/usr/local/etc/haproxy \
    -e HAPROXY_PORT=7443 \
    haproxy:2.3-alpine
docker run -d --cap-add=NET_ADMIN --cap-add=NET_BROADCAST --cap-add=NET_RAW --net=host \
    --volume $(pwd)/config/keepalived/keepalived.conf:/container/service/keepalived/assets/keepalived.conf \
    -e KEEPALIVED_INTERFACE=eth0 \
    -e KEEPALIVED_PASSWORD=pass \
    -e KEEPALIVED_STATE=BACKUP \
    -e KEEPALIVED_VIRTUAL_IPS="#PYTHON2BASH:['172.16.100.100', '172.16.100.101']" \
    -e KEEPALIVED_UNICAST_PEERS="#PYTHON2BASH:['172.16.217.171', '172.16.217.172', '172.16.217.173']" \
    osixia/keepalived:2.0.20 --copy-service

On third node:

docker run -d --net=host --volume $(pwd)/config/haproxy:/usr/local/etc/haproxy \
    -e HAPROXY_PORT=7443 \
    haproxy:2.3-alpine
docker run -d --cap-add=NET_ADMIN --cap-add=NET_BROADCAST --cap-add=NET_RAW --net=host \
    --volume $(pwd)/config/keepalived/keepalived.conf:/container/service/keepalived/assets/keepalived.conf \
    -e KEEPALIVED_INTERFACE=eth0 \
    -e KEEPALIVED_PASSWORD=pass \
    -e KEEPALIVED_STATE=BACKUP \
    -e KEEPALIVED_VIRTUAL_IPS="#PYTHON2BASH:['172.16.100.100', '172.16.100.101']" \
    -e KEEPALIVED_UNICAST_PEERS="#PYTHON2BASH:['172.16.217.171', '172.16.217.172', '172.16.217.173']" \
    osixia/keepalived:2.0.20 --copy-service

Finally, point your DNS record at either 172.16.100.100 or 172.16.100.101. From now on, a request to <https://your.domain.com:7443> will resolve to https://<172.16.100.100|172.16.100.101>:7443, be routed to the current keepalived master, and finally be forwarded by HAProxy to one of the three K8s API servers.
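
To verify failover, stop HAProxy on the current master node and watch the VIP move to a backup (a rough test sketch; once nothing listens on port 7443, the chk_haproxy script fails and a backup node should claim the VIPs within a few advertisement intervals):

# on the current master: stop HAProxy so the keepalived health check starts failing
docker stop $(docker ps -q --filter ancestor=haproxy:2.3-alpine)

# on a backup node: the VIPs should show up on eth0 shortly afterwards
ip addr show eth0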
