Kubernetes Ingress using NGINX

2021-12-06 · Computing

There are many possibilities for setting up ingress into Kubernetes. Having spent some time over the last few months evaluating various approaches and their trade-offs, I want to share some notes about one configuration which is working well for me. It uses an external load balancer, NodePort, and NGINX; supports HTTPS redirects and CertManager TLS certificate auto-renewal; exposes source IPs to backend services; and is highly available with reasonably low latency.

The reference cluster is Kubernetes 1.22 (3 control plane and 3 worker nodes), running on CentOS Stream 8, installed with Kubeadm via custom modules for Ansible 2.9, provisioned using custom modules for Terraform 1.0, and hosted on Hetzner Cloud with a Hetzner Cloud Load Balancer. It’s live in production for sites such as the Isoxya web crawler.

NGINX Ingress Controller

Download an up-to-date configuration for NGINX Ingress Controller. At present, I find using a NodePort Service a nice compromise between automation and flexibility. You might prefer to use a LoadBalancer Service instead, which configures the cloud load balancer for you; using a NodePort Service means looking up the automatically allocated node ports and adding them as load balancer targets yourself. However, this is sufficient for me at present, since the ports are allocated on creation and remain fixed thereafter, and the separation from the cloud load balancer allows me to use the same approach for other load balancers, too (such as on-prem HAProxy clusters). Thus, I use the bare-metal, rather than the cloud, approach:

wget -O tp-prd/k8s/ingress-nginx/deploy.yaml \
    https://raw.githubusercontent.com/kubernetes/ingress-nginx/controller-v1.1.0/deploy/static/provider/baremetal/deploy.yaml

I check all Kubernetes manifests into Git version control rather than applying them directly, so it’s clear what’s deployed where and what changes I’ve made. My naming convention is <network>/<cluster>/<namespace>/; as well as being clear, this structure makes it easy to apply a whole namespace even when it’s spread across multiple files.
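
For illustration, the repository layout used throughout this post looks like this (showing only the namespaces that appear below):

tp-prd/                  # network
└── k8s/                 # cluster
    ├── cert-manager/    # namespace
    ├── default/         # namespace
    └── ingress-nginx/   # namespace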

There are three changes I make to the default manifest above. The first is to use the PROXY protocol; for this to work, the load balancer must be configured to use it too. The PROXY protocol is a good solution for exposing the source IP to backend services. It can be enabled by adding a key to the ingress-nginx-controller ConfigMap:

@@ -40,2 +40,3 @@
   allow-snippet-annotations: 'true'
+  use-proxy-protocol: 'true'
 ---
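
After the edit, the relevant part of the ConfigMap looks roughly like this (trimmed to the data section; the upstream manifest also carries labels and other metadata):

apiVersion: v1
kind: ConfigMap
metadata:
  name: ingress-nginx-controller
  namespace: ingress-nginx
data:
  allow-snippet-annotations: 'true'
  use-proxy-protocol: 'true'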

The second change I make is to route to node-local endpoints. Although this is often done as part of exposing the source IP to backend services, that isn’t what I’m using it for here: since we’re using an external load balancer rather than letting traffic reach the worker nodes directly, it would expose only the IPs of the load balancers, not of the clients. Rather, this routing change avoids a second hop, at the risk of an imbalanced traffic spread. Since the nodes are fronted by a load balancer, that is less of a concern for me, and making the change reduces latency. There is a good post with pretty diagrams about Kubernetes external traffic policies here. So, I set the ingress-nginx-controller Service to use the Local policy:

@@ -276,2 +277,3 @@
   type: NodePort
+  externalTrafficPolicy: Local
   ipFamilyPolicy: SingleStack
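
Once the manifests have been applied (below), one way to confirm the policy has taken effect is to query the Service; the command should print Local:

kubectl -n ingress-nginx get svc ingress-nginx-controller \
    -o jsonpath='{.spec.externalTrafficPolicy}'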

The third and final change I make is related to the node-local routing: since traffic will no longer be proxied to other nodes, each node participating as a load balancer target must be able to handle requests itself. Changing the ingress-nginx-controller Deployment to a DaemonSet spins up one ingress pod per worker node, each handling the traffic arriving at its own node:

@@ -297,3 +299,3 @@
 apiVersion: apps/v1
-kind: Deployment
+kind: DaemonSet
 metadata:
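
Similarly, once applied, the controller should show up as a DaemonSet whose desired pod count matches the number of worker nodes:

kubectl -n ingress-nginx get daemonset ingress-nginx-controller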

Finally, I apply the manifests:

kubectl apply -f tp-prd/k8s/ingress-nginx/

To check that the NGINX Ingress Controller pods are running:

kubectl get pod -n ingress-nginx 
NAME                                      READY   STATUS      RESTARTS   AGE
ingress-nginx-admission-create--1-qz8zw   0/1     Completed   0          73m
ingress-nginx-admission-patch--1-kmm7b    0/1     Completed   1          73m
ingress-nginx-controller-5d9qg            1/1     Running     0          73m
ingress-nginx-controller-7lkl5            1/1     Running     0          73m
ingress-nginx-controller-x4wf4            1/1     Running     0          73m

There should be one controller for each worker node.

CertManager

For security and privacy, I only host sites over TLS, and I permanently redirect all unencrypted traffic. This can be accomplished using the NGINX Ingress Controller, coupled with CertManager to handle TLS certificate requests and renewals. Installing CertManager is straightforward:

wget -O tp-prd/k8s/cert-manager/cert-manager.yaml \
    https://github.com/jetstack/cert-manager/releases/download/v1.6.1/cert-manager.yaml

As before, I check this into Git version control, rather than applying it to the Kubernetes cluster directly. To actually manage certificate issuance, it’s necessary to configure an issuer. I’m a fan of Let’s Encrypt, so I define an issuer based on this tutorial:

cat tp-prd/k8s/cert-manager/letsencrypt-prd.yaml
apiVersion: cert-manager.io/v1
kind: ClusterIssuer
metadata:
  name: letsencrypt-prd
spec:
  acme:
    email: webmaster@example.com # REPLACE
    server: https://acme-v02.api.letsencrypt.org/directory
    privateKeySecretRef:
      name: letsencrypt-prd
    solvers:
    - http01:
        ingress:
          class: nginx

Finally, I apply the manifests:

kubectl apply -f tp-prd/k8s/cert-manager/

To check that the CertManager pods are running:

kubectl get pod -n cert-manager 
NAME                                      READY   STATUS    RESTARTS   AGE
cert-manager-55658cdf68-8nf9n             1/1     Running   0          16m
cert-manager-cainjector-967788869-bbgd6   1/1     Running   0          16m
cert-manager-webhook-6668fbb57d-4dffm     1/1     Running   0          16m

To check that the issuer has been installed:

kubectl get clusterissuers -o wide
NAME              READY   STATUS                                                 AGE
letsencrypt-prd   True    The ACME account was registered with the ACME server   37d

Hetzner Cloud Load Balancer

Since the ingress is using NodePort, it’s necessary to find which node ports have been allocated to the Service. These will be used as the destination ports for the load balancer services, running in Hetzner Cloud. To view the ingress Service and its ports:

kubectl get svc -n ingress-nginx 
NAME                                 TYPE        CLUSTER-IP     EXTERNAL-IP   PORT(S)                      AGE
ingress-nginx-controller             NodePort    10.222.14.95   <none>        80:31022/TCP,443:32244/TCP   89m
ingress-nginx-controller-admission   ClusterIP   10.222.14.36   <none>        443/TCP                      89m

This shows that HTTP is on node port 31022, and HTTPS on node port 32244.
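
Since these values need to be copied into the load balancer configuration, it can be handy to extract them programmatically; a small sketch, assuming the Service ports are named http and https as in the upstream manifest:

kubectl -n ingress-nginx get svc ingress-nginx-controller \
    -o jsonpath='{.spec.ports[?(@.name=="http")].nodePort}'
kubectl -n ingress-nginx get svc ingress-nginx-controller \
    -o jsonpath='{.spec.ports[?(@.name=="https")].nodePort}'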

I’m using a custom Terraform module, so I’ll summarise the next steps: Using the Hetzner Cloud provider, I create a hcloud_load_balancer resource in the desired network and location. Using the AWS provider, I create a DNS A record for the load balancer in Route 53. For each worker node, I create a hcloud_load_balancer_target resource. Finally, I define the load balancer services:

resource "hcloud_load_balancer_service" "lb_k8s_lbe_80" {
  destination_port = 31022
  listen_port      = 80
  load_balancer_id = module.lb_k8s_lbe.lb.id
  protocol         = "tcp"
  proxyprotocol    = true
}

resource "hcloud_load_balancer_service" "lb_k8s_lbe_443" {
  destination_port = 32244
  listen_port      = 443
  load_balancer_id = module.lb_k8s_lbe.lb.id
  protocol         = "tcp"
  proxyprotocol    = true
}

Note that destination_port matches the allocated node ports, and that proxyprotocol is set, matching the use-proxy-protocol key in the ingress ConfigMap.
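
For completeness, the per-node target resources mentioned above look roughly like the following sketch (hcloud_server.worker and use_private_ip are assumptions which depend on how the servers and networks are defined elsewhere in the Terraform):

resource "hcloud_load_balancer_target" "lb_k8s_lbe_worker" {
  # one target per worker node, attached over the private network
  count            = length(hcloud_server.worker)
  type             = "server"
  load_balancer_id = module.lb_k8s_lbe.lb.id
  server_id        = hcloud_server.worker[count.index].id
  use_private_ip   = true
}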

After applying this (or after creating the load balancer manually in the Hetzner Cloud console, if you’re not using Terraform), the load balancer should report all targets and services as healthy. Additionally, visiting the load balancer A record should return HTTP 404, since no Ingress resources have been created yet.
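
For example (lb.example.com is a placeholder for the load balancer A record; the 404 comes from the ingress controller’s default backend):

curl -sI http://lb.example.com | head -n 1
HTTP/1.1 404 Not Found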

Backend

All that remains is to install the backend itself (www-example). I create a DNS CNAME record (www.example.com) pointing at the load balancer record, and a DNS A record for the apex (example.com) using the load balancer IP, since the apex can’t be a CNAME. For both of these, I use Terraform. Then, I define the Kubernetes manifest:
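
A quick way to sanity-check the DNS records before moving on (using the placeholder names):

dig +short www.example.com CNAME
dig +short example.com A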

cat tp-prd/k8s/default/www-example.yml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: www-example
  labels:
    app: www-example
  annotations:
    cert-manager.io/cluster-issuer: letsencrypt-prd
    nginx.ingress.kubernetes.io/from-to-www-redirect: 'true'
spec:
  ingressClassName: nginx
  tls:
    - hosts:
        - example.com
        - www.example.com
      secretName: www-example-tls
  rules:
    - host: www.example.com
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: www-example
                port:
                  number: 80

Here, an Ingress is created. A CertManager annotation links it to the letsencrypt-prd issuer, and an NGINX annotation redirects the apex to the www subdomain. The ingressClassName specifies that the NGINX Ingress Controller should be used. TLS is configured for both the apex and the www subdomain, with the certificate and key stored in the www-example-tls Secret; the certificate request and issuance are handled automatically. Permanent redirects to HTTPS, as well as HSTS Strict-Transport-Security headers, are also added automatically. Finally, a rule routes all paths to a backend Service.

---
apiVersion: v1
kind: Service
metadata:
  name: www-example
  labels:
    app: www-example
spec:
  selector:
    app: www-example
  ports:
    - port: 80
  clusterIP: None

A Service is created to route traffic to the Kubernetes pods. Note that allocating a clusterIP is unnecessary, since the Service is only accessed through the Ingress, which routes to the pod endpoints directly.

---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: www-example
  labels:
    app: www-example
spec:
  replicas: 1
  selector:
    matchLabels:
      app: www-example
  template:
    metadata:
      labels:
        app: www-example
    spec:
      containers:
        - name: www
          image: docker.io/library/nginx:latest

Finally, a Deployment is created defining the actual containers. If desired, replicas, affinity, and a PodDisruptionBudget can also be set.
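
As an illustration, a minimal PodDisruptionBudget for this Deployment might look like the following sketch (it only makes sense once replicas has been raised above 1, otherwise the budget can block node drains):

apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: www-example
  labels:
    app: www-example
spec:
  minAvailable: 1
  selector:
    matchLabels:
      app: www-example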

To install everything, I apply the manifests:

kubectl apply -f tp-prd/k8s/default/www-example.yml

To check the certificate has been issued:

kubectl get cert -o wide
NAME                 READY   SECRET               ISSUER            STATUS                                          AGE
www-example-tls      True    www-example-tls      letsencrypt-prd   Certificate is up to date and has not expired   37d

To check the ingress has been created:

kubectl get ing
NAME             CLASS   HOSTS                             ADDRESS                                   PORTS     AGE
www-example      nginx   www.example.com                   10.222.12.2,10.222.13.128,10.222.13.130   80, 443   37d

The Service and Deployment should also be checked. Everything should be healthy, and external traffic should be able to reach the containers!
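
As a final end-to-end check (using the placeholder domain), the redirects and the site itself can be exercised with curl; each request should end up at https://www.example.com/ with a successful response:

curl -sIL http://example.com/ | grep -i -E '^(HTTP|location)'
curl -sI https://www.example.com/ | head -n 1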