
Proxy Dockerhub access with K3S


The Problem

When you are running a Kubernetes cluster, you will often deploy images from Dockerhub.

This can be slow, as the images have to be downloaded from the internet.

This can be a problem if you have a slow internet connection, or if you have to deploy many images at once.

You also waste bandwidth, as the images are downloaded multiple times.

In addition, you might want to be able to scan the images for security vulnerabilities before deploying them. Or you need to authenticate to download the images, as anonymous pulls quickly run into Dockerhub's rate limits.

The Solution

One way to solve this problem is to use a proxy.

This proxy can be used to cache images, so that they do not have to be downloaded from Dockerhub every time.

This is also known as a Pull Through Cache1 2, which is a common solution to this problem.

This can speed up the deployment process, as the images are already available on the local network. It also prevents rate limiting, as the images are only downloaded from Dockerhub once. An additional benefit is that the hosts do not need direct access to Dockerhub, and do not need the credentials, limiting the exposure of those credentials.

The Implementation Plan

So, how do you set this up?

We start with setting up the proxy server. If you are in a public cloud environment, it is best to use the cloud provider's registry service, as it is already optimized for this use case.

In AWS, you can use ECR as a pull through cache. The folks at AWS have documented how to do this in their documentation2.
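
As a rough sketch, creating such a rule with the AWS CLI looks something like this (the repository prefix and the Secrets Manager secret holding your Docker Hub credentials are placeholders):

# Create a pull through cache rule that proxies Docker Hub into ECR
aws ecr create-pull-through-cache-rule \
  --ecr-repository-prefix docker-hub \
  --upstream-registry-url registry-1.docker.io \
  --credential-arn arn:aws:secretsmanager:eu-west-1:123456789012:secret:ecr-pullthroughcache/docker-hub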

In a private cloud environment, you have to run a container image registry yourself. There are various options, such as Docker Registry (v2)3, Sonatype Nexus4, JFrog Artifactory5, or Harbor6.

In this example, we use Harbor. Harbor is a free and open source container image registry, is well maintained, and has a lot of features. It is also the de facto standard registry for the VMware Tanzu7 (now Broadcom) Kubernetes distribution.

Warning

Full disclaimer, I used to work for VMware and assisted various companies with setting up Harbor.

Once you have a registry set up, you can configure it to act as a pull through cache. You register Docker Hub as a registry endpoint with your credentials, and create a project that uses this endpoint as a proxy cache.

Assuming that you run the registry yourself, you need to supply the registry with a certificate so it can be trusted. The Kubernetes nodes need to trust this certificate as well, so that they can pull images from the registry.

In my homelab, I'm using K3S8 as my Kubernetes distribution. So, I will show you how to configure K3S to use the registry as a pull through cache.

Setup a Proxy Server - Harbor

There are many ways to set up a registry; as mentioned before, we use Harbor in this example.

We can install Harbor in various ways, such as using Helm, Docker Compose, or the installer script.

In my homelab, I run two Kubernetes clusters, each consisting of an Asus MiniPC running the Control Plane and several Raspberry Pi 4s running the Worker Nodes.

The MiniPCs use Docker Compose to run several services, such as Keycloak for authentication, outside of the clusters. This way, these services are always available even before the clusters are up and running - or in case of a cluster failure.

I use the same approach for Harbor, running it on one of the MiniPCs using Docker Compose.

Install Harbor Via Docker Compose

Harbor supports Docker Compose out of the box, with an installer9 that generates the necessary files.

The installation process, described here10, is quite straightforward.

  1. Download the latest release
  2. Configure SSL
  3. Configure Harbor
  4. Run the Prepare Script
  5. Run Docker Compose

Download and unpack the installer

You can download the online installer from the GitHub releases page11.

In my case, the latest release is v2.11.1.
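
For example, grabbing that release directly from the command line (assuming the usual asset URL pattern of the releases page):

wget https://github.com/goharbor/harbor/releases/download/v2.11.1/harbor-online-installer-v2.11.1.tgz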

Then we unpack the installer:

tar -xvf harbor-online-installer-v2.11.1.tgz
cd harbor

Configure SSL

In my case, I run more than one service on the MiniPC, so I use Envoy as a reverse proxy. Envoy handles the SSL termination, and the Harbor service runs on plain HTTP behind it.

If you want to run Harbor on HTTPS, you can configure the harbor.yml file accordingly.
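
For reference, that section looks roughly like this (certificate and key paths are examples):

harbor.yml
https:
  port: 443
  certificate: /etc/harbor/certs/harbor.pem
  private_key: /etc/harbor/certs/harbor-key.pem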

Configure Harbor

The default settings are included, so we only need to change the hostname and the admin password.

As we run Harbor on HTTP, we comment out the HTTPS section.

We copy the harbor.yml.tmpl to harbor.yml and edit the file.

We change the following settings (the resulting snippet is shown after the list):

  • hostname: my.harbor.com
  • harbor_admin_password: admin
  • comment out the https section
    • # https:
    • # port: 443
    • # certificate: /your/certificate/path/fullchain.pem
    • # private_key: /your/private/key/path/privkey.pem
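
Put together, the relevant part of harbor.yml looks like this (hostname and password are examples):

harbor.yml
hostname: my.harbor.com
harbor_admin_password: admin

# https related config
# https:
#   port: 443
#   certificate: /your/certificate/path/fullchain.pem
#   private_key: /your/private/key/path/privkey.pem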

Run the Prepare Script

The Prepare script runs a container that generates the necessary configuration files for Docker Compose.

./prepare

The script validates the configuration and generates the necessary files. The output should look like this:

prepare base dir is set to /home/joostvdg/projects/test/harbor-install/harbor
Unable to find image 'goharbor/prepare:v2.11.1' locally
v2.11.1: Pulling from goharbor/prepare
Digest: sha256:35dbf7b4293e901e359dbf065ed91d9e4a0de371898da91a3b92c3594030a88c
Status: Downloaded newer image for goharbor/prepare:v2.11.1
prepare base dir is set to /home/joostvdg/projects/test/harbor-install/harbor
WARNING:root:WARNING: HTTP protocol is insecure. Harbor will deprecate http protocol in the future. Please make sure to upgrade to https
Generated configuration file: /config/portal/nginx.conf
Generated configuration file: /config/log/logrotate.conf
Generated configuration file: /config/log/rsyslog_docker.conf
Generated configuration file: /config/nginx/nginx.conf
Generated configuration file: /config/core/env
Generated configuration file: /config/core/app.conf
Generated configuration file: /config/registry/config.yml
Generated configuration file: /config/registryctl/env
Generated configuration file: /config/registryctl/config.yml
Generated configuration file: /config/db/env
Generated configuration file: /config/jobservice/env
Generated configuration file: /config/jobservice/config.yml
Generated and saved secret to file: /data/secret/keys/secretkey
Successfully called func: create_root_cert
Generated configuration file: /compose_location/docker-compose.yml
Clean up the input dir

In my case, I ran into some issues with permissions and the log files. So I removed all the log configuration (not recommended), and changed the permissions of the common directory.

chown -R joostvdg:joostvdg common/
sudo chmod +r -R common

By default, the proxy cache is disabled. We need to enable it by setting the PERMITTED_REGISTRY_TYPES_FOR_PROXY_CACHE environment variable. This is done for the core service in the docker-compose.yml file.

For each registry type, we need to add a comma-separated value. In order to be flexible, we add all the supported registry types.

services:
  core:
    environment:
      - PERMITTED_REGISTRY_TYPES_FOR_PROXY_CACHE=docker-hub,harbor,azure-acr,aws-ecr,google-gcr,quay,docker-registry,github-ghcr,jfrog-artifactory

Once we have made the changes, we can run Docker Compose.

docker compose up

Envoy Reverse Proxy

As described, I use Envoy as a reverse proxy fronting the services on the MiniPC.

You can use many other reverse proxies, such as Nginx, Traefik, or HAProxy. In my case, I was also working with Knative Serving (which uses Envoy) for work, so I wanted to get more experience with Envoy.

For Envoy to do its job, we need to configure the following:

  1. Docker Compose Service entry
  2. Envoy Configuration
  3. Certificates

Docker Compose Service entry

To run Envoy, we need to add a service entry to the docker-compose.yml file.

In this, we configure the image, the restart policy, the configuration file, the command, the ports, the resources, and the secrets.

The secrets are the certificate, private key, and CA that Envoy uses to terminate TLS for the Harbor hostname.

As my MiniPC has limited resources, I limit the CPU and memory usage of the Envoy service.

docker-compose.yml
services:

  envoy:
    image: envoyproxy/envoy:v1.27.0
    restart: unless-stopped
    configs:
      - source: envoy_proxy
        target: /etc/envoy/envoy-proxy.yaml
        uid: "103"
        gid: "103"
        mode: 0440
    command: /usr/local/bin/envoy -c /etc/envoy/envoy-proxy.yaml -l debug
    ports:
      - 443:443
      - 80:80
      - 8082:8082
    deploy:
      resources:
        limits:
          cpus: '0.50'
          memory: 128M
        reservations:
          cpus: '0.25'
          memory: 64M
    secrets:
      - harbor-cert
      - harbor-cert-key
      - ca

configs:
  envoy_proxy:
    file: ./envoy/envoy.yaml

secrets:
  harbor-cert:
    file: ./certs/harbor.pem
  harbor-cert-key:
    file: ./certs/harbor-key.pem
  ca:
    file: ./certs/ca.pem

Envoy Configuration

The config file mentioned in the Docker Compose service entry is the Envoy configuration file.

In here, we configure the proxy, the listener, the filters, the clusters, and the certificates.

The certificates are the secrets that we added to the Docker Compose service entry, which by default are mounted in the /run/secrets/ directory.

Envoy Listener

Here we configure the default listener on port 443.

envoy.yaml
static_resources:
  listeners:
  - address:
      socket_address:
        address: 0.0.0.0
        port_value: 443
    listener_filters:
    - name: "envoy.filters.listener.tls_inspector"
      typed_config:
        "@type": type.googleapis.com/envoy.extensions.filters.listener.tls_inspector.v3.TlsInspector

Envoy Filter

Here we configure the filter for the Harbor service.

This tells the listener to route all traffic to the Harbor Cluster when the hostname is harbor.home.lab.

At the same time, we configure the Transport Socket to use the certificates that we mounted as secrets.

envoy.yaml
filter_chains:
- filter_chain_match:
    server_names:
    - harbor.home.lab
  filters:
  - name: envoy.filters.network.http_connection_manager
    typed_config:
      "@type": type.googleapis.com/envoy.extensions.filters.network.http_connection_manager.v3.HttpConnectionManager
      codec_type: AUTO
      stat_prefix: ingress_http
      route_config:
        name: local_route
        virtual_hosts:
        - name: app
          domains:
          - "*"
          routes:
          - match:
              prefix: "/"
            route:
              cluster: harbor
      http_filters:
      - name: envoy.filters.http.router
        typed_config:
          "@type": type.googleapis.com/envoy.extensions.filters.http.router.v3.Router
  transport_socket:
    name: envoy.transport_sockets.tls
    typed_config:
      "@type": type.googleapis.com/envoy.extensions.transport_sockets.tls.v3.DownstreamTlsContext
      common_tls_context:
        tls_certificates:
        - certificate_chain:
            filename: /run/secrets/harbor-cert
          private_key:
            filename: /run/secrets/harbor-cert-key

Envoy Cluster

Here we configure the cluster for the Harbor service.

The cluster is the backend service that the listener routes the traffic to. By default Envoy assumes a backend service has more than one running instance, so we configure the Load Balancing Policy.

We only have one service, but Harbor can run in active/active mode if you want to scale it out. So it is possible to have multiple instances of Harbor running, and Envoy will load balance the traffic between them.

envoy.yaml
clusters:
- name: harbor
  connect_timeout: 60s
  type: STRICT_DNS
  lb_policy: ROUND_ROBIN
  load_assignment:
    cluster_name: harbor
    endpoints:
    - lb_endpoints:
      - endpoint:
          address:
            socket_address:
              address: harbor-nginx
              port_value: 8080

Full Example

Here's the full example of the Envoy configuration file.

envoy.yaml
static_resources:
  listeners:
  - address:
      socket_address:
        address: 0.0.0.0
        port_value: 443

    listener_filters:
    - name: "envoy.filters.listener.tls_inspector"
      typed_config:
        "@type": type.googleapis.com/envoy.extensions.filters.listener.tls_inspector.v3.TlsInspector
    filter_chains:
    - filter_chain_match:
        server_names:
        - harbor.home.lab
      filters:
      - name: envoy.filters.network.http_connection_manager
        typed_config:
          "@type": type.googleapis.com/envoy.extensions.filters.network.http_connection_manager.v3.HttpConnectionManager
          codec_type: AUTO
          stat_prefix: ingress_http
          route_config:
            name: local_route
            virtual_hosts:
            - name: app
              domains:
              - "*"
              routes:
              - match:
                  prefix: "/"
                route:
                  cluster: harbor
          http_filters:
          - name: envoy.filters.http.router
            typed_config:
              "@type": type.googleapis.com/envoy.extensions.filters.http.router.v3.Router
      transport_socket:
        name: envoy.transport_sockets.tls
        typed_config:
          "@type": type.googleapis.com/envoy.extensions.transport_sockets.tls.v3.DownstreamTlsContext
          common_tls_context:
            tls_certificates:
            - certificate_chain:
                filename: /run/secrets/harbor-cert
              private_key:
                filename: /run/secrets/harbor-cert-key

  clusters:
  - name: harbor
    connect_timeout: 60s
    type: STRICT_DNS
    lb_policy: ROUND_ROBIN
    load_assignment:
      cluster_name: harbor
      endpoints:
      - lb_endpoints:
        - endpoint:
            address:
              socket_address:
                address: harbor-nginx
                port_value: 8080

Envoy Certificates

For creating the Certificate Authority (CA) and the certificates for the Harbor service, we use the cfssl tool.

I have written a guide on how to use it here12, so we will not go into detail here.
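
For reference, the cfssl flow boils down to roughly the following (the CSR and config JSON files are hypothetical names; see the linked guide12 for their contents):

# Generate the CA certificate and key from ca-csr.json
cfssl gencert -initca ca-csr.json | cfssljson -bare ca

# Generate a certificate for harbor.home.lab, signed by that CA
cfssl gencert -ca=ca.pem -ca-key=ca-key.pem \
  -config=ca-config.json -profile=server \
  harbor-csr.json | cfssljson -bare harbor

This produces ca.pem, harbor.pem, and harbor-key.pem, the files referenced in the Docker Compose secrets section above.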

Configure Proxy Cache in Harbor

Now that we have Harbor running and accessible via the Envoy proxy, we can configure the proxy cache15.

For this to work, we need to:

  1. Register a new Registry Endpoint
  2. Create a new Project as a Proxy Cache

One of the key aspects of the proxy cache is that it needs to be able to authenticate to Docker Hub13. So make sure you have an account at Docker Hub, and generate an access token14.

In Harbor, log in as the admin user and go to the Administration section. There, open the Registries menu and select New Endpoint.

In the New Endpoint form, select Docker Hub as the Provider, and fill in the Name and Description. Fill in the Access ID (your Docker Hub username) and the Access Secret (the access token you generated).

Hit Test Connection to verify the connection, and then Save.

Now, go to the Projects section, and select New Project.

Fill in the Name (we use proxy, which comes back later in the K3S configuration) and Description, and select the Public checkbox. Then flip the Proxy Cache switch to On, and select the Docker Hub endpoint you just created.

Hit OK, and the project is created. This project will now act as a proxy cache for Docker Hub.

Warning

The default namespace for Dockerhub is library, so you need to use this namespace when pulling images from Dockerhub that have no prefix, e.g. nginx.

If your Harbor is accessible via harbor.home.lab and the proxy cache project is named proxy, you pull the official nginx image as harbor.home.lab/proxy/library/nginx.

Don't worry, we can configure K3S in a way that we don't need to handle this manually in Kubernetes.
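
To check that the cache works before involving Kubernetes, you can pull an image through Harbor directly from any machine that trusts its certificate (assuming the proxy cache project is named proxy):

docker pull harbor.home.lab/proxy/library/nginx:latest

The first pull makes Harbor fetch the image from Docker Hub; subsequent pulls are served from the cache, and the image shows up under the proxy project in the Harbor UI.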

Configure K3S

K3S is a lightweight Kubernetes distribution from SUSE (formerly Rancher Labs). It is optimized for edge and IoT use cases, but is also great for homelabs.

Because it is designed for edge use cases, it is very flexible and easy to configure. Especially for cases where it cannot reach the internet, or where you want to use a private registry16.

We can use this capability to configure the Harbor registry as a pull through cache.

K3S Registries Configuration

As per the docs16:

Containerd can be configured to connect to private registries and use them to pull images as needed by the kubelet. Upon startup, K3s will check to see if /etc/rancher/k3s/registries.yaml exists. If so, the registry configuration contained in this file is used when generating the containerd configuration.

We can create a registries.yaml file and place it in the /etc/rancher/k3s/ directory.

We first define the mirrors section, which is a map of registry names to their endpoints, and possibly a rewrite rule. We need to define a rewrite rule, as the images in the Harbor proxy cache are prefixed with the (Harbor) project name.

Then we define the configs section, which is a map of registry names to their configuration. In this case, we need to define the CA certificate that the nodes (i.e., ContainerD) need to trust.

Registries.yaml

mirrors:
  docker.io:
    endpoint:
      - "https://harbor.home.lab"
    rewrite:
      ".*": "proxy/$0/$1"
configs:
  "harbor.home.lab:443":
    tls:
      ca_file: /usr/local/share/ca-certificates/kearos/ca.crt

Configure K3S with Ansible

In my homelab, I use Ansible to configure the MiniPCs (control plane) and Raspberry Pis that run the K3S worker nodes. Below is an example of how to configure the K3S Control Plane nodes to use the Harbor registry as a pull through cache.

Ansible Playbook

---
- hosts: "controlplane"
  become: true
  tasks:
  - name: Create directory kearos in /usr/local/share/ca-certificates/
    file:
      path: /usr/local/share/ca-certificates/kearos
      state: directory
  - name: Copy certificate to /usr/local/share/ca-certificates/kearos
    copy:
      src: ca.crt
      dest: /usr/local/share/ca-certificates/kearos/ca.crt
      owner: root
      group: root
      mode: 0644
  - name: Update CA certificate trust
    shell: /bin/bash -c "update-ca-certificates"

  - name: Create directory k3s in /etc/rancher/
    file:
      path: /etc/rancher/k3s
      state: directory
  - name: Copy registries.yaml to /etc/rancher/k3s/
    ansible.builtin.copy:
      src: registries.yaml
      dest: /etc/rancher/k3s/registries.yaml
      owner: root
      group: root
      mode: 0644
  - name: Restart K3S Server
    shell: /bin/bash -c "sudo systemctl restart k3s"

For the worker nodes, you can use a similar playbook, but you need to change the restart command to restart the K3S agent service.

- name: Restart K3S Agent
  shell: /bin/bash -c "sudo systemctl restart k3s-agent"

Conclusion

In this post, we have shown how to set up a proxy cache for Dockerhub using Harbor and K3S.

This can speed up the deployment process, as the images are already available on the local network.

It also prevents rate limiting, as the images are only downloaded from Dockerhub once, and with authentication. It is also more secure, as the hosts do not need direct access to Dockerhub and never handle the credentials.

References