On August 24, 2020, Docker announced that it would introduce rate limits on Docker Hub. They took effect on November 2, 2020, ending our free ride of unlimited Docker image pulls.
Unless you’re a paying Docker customer or very lucky, you’ve probably started to see errors like this:
ERROR: toomanyrequests: Too Many Requests.
Or
You have reached your pull rate limit. You may increase
the limit by authenticating and upgrading:
This can be very frustrating, especially in Kubernetes, where it might not be apparent why your new Pod is just sitting there in a Pending state. Imagine this happening right as you need to scale your Deployments to serve a sudden increase in traffic.
This is where a troll on Reddit (you know the sort, the kind who will “What you guys are referring to as Linux, is in fact, GNU/Linux” at you) would proclaim, “You own your availability.” He’s not wrong … but also not helpful.
Thankfully the team developing the Harbor Registry have been hard at work to ensure that you can access the images you need without downloading the whole internet to your server.
There are actually two features in Harbor that let us work around the rate limits: Registry Replication and Registry Proxy.
Registry Replication allows you to replicate images between registries, whereas Proxy lets you keep a local copy of images on an as-requested basis.
In a production scenario you would probably look to Replication so that you can be very specific about which images to allow. In a development scenario, however, you might use Proxy-ing, as you don’t necessarily know ahead of time which images you will need. Proxy-ing can also be really useful in a home lab to cut down on internet traffic as you pull images.
We’ll explore both options below.
Prerequisites
Before you get started, you’ll need Harbor (ideally version 2.1.3 or newer) installed somewhere. If you don’t already have it installed, we’ve made it incredibly easy for you with our Getting Started with Harbor Guide.
Once you have a Harbor registry installed, log into its Web UI as an Admin user.
Note: Docker has been rapidly changing both Docker Hub and the Docker CLI, which makes it difficult for integrations such as Harbor’s replication and proxy features to keep pace. For the best chance of success, use the versions stated in this document.
Set up a Registry Endpoint
Whether you’re doing replication or proxying, you need to configure Docker Hub as a registry endpoint.
- Go to Administration -> Registries and click the + New Endpoint button.
- Set the Provider and Name both to Docker Hub.
- You can leave the rest of the settings as default, unless you want access to private images, in which case add your Access ID and Access Secret.
- Press the Test Connection button and, on a successful test, hit OK to save.
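The same endpoint can probably be created without the Web UI. The following is a minimal sketch, assuming that Harbor’s v2.0 REST API is reachable and that POST /api/v2.0/registries accepts a payload shaped like the one below. HARBOR_URL, HARBOR_USER and HARBOR_PASS are hypothetical variable names chosen for this example, so check the API explorer bundled with your Harbor version before relying on it.

```shell
# Sketch: register Docker Hub as a registry endpoint through Harbor's REST API.
# HARBOR_URL, HARBOR_USER and HARBOR_PASS are assumed names, not Harbor settings.

# Print the JSON body describing the Docker Hub endpoint.
registry_payload() {
  cat <<'EOF'
{
  "name": "Docker Hub",
  "type": "docker-hub",
  "url": "https://hub.docker.com",
  "insecure": false
}
EOF
}

# Send the payload to Harbor. Defined only; call it yourself when ready.
create_registry() {
  registry_payload | curl --fail --silent --show-error \
    -u "${HARBOR_USER}:${HARBOR_PASS}" \
    -H "Content-Type: application/json" \
    -X POST --data @- \
    "${HARBOR_URL}/api/v2.0/registries"
}
```

With the three variables exported, running create_registry should have the same effect as the steps above. Inspect the body first with registry_payload if you want to see exactly what would be sent.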
Create a Dockerhub Proxy
For more information about how Proxy Projects work, see the official documentation.
- Go to Projects and click the + New Project button.
- Set Project Name to dockerhub-proxy.
- Set Access Level to Public (unless you intend to make it private and require login).
- Leave Storage Quota at the default -1 GB.
- Set Proxy Cache to Docker Hub (the Endpoint we created earlier).
- Test that the proxy is working with docker pull:
$ docker pull <url-of-registry>/dockerhub-proxy/library/ubuntu:20.04
20.04: Pulling from dockerhub-proxy/library/ubuntu
83ee3a23efb7: Pull complete
db98fc6f11f0: Pull complete
f611acd52c6c: Pull complete
Digest: sha256:703218c0465075f4425e58fac086e09e1de5c340b12976ab9eb8ad26615c3715
Status: Downloaded newer image for harbor.aws.paulczar.wtf/dockerhub-proxy/library/ubuntu:20.04
harbor.aws.paulczar.wtf/dockerhub-proxy/library/ubuntu:20.04
If you receive the error “Error response from daemon: missing or empty Content-Type header”, you’ll need to upgrade Harbor to version 2.1.3 or newer, as some changes in Docker have had downstream ripple effects. Older versions of Docker will still work.
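Pulling through the proxy means prefixing every Docker Hub image with the registry host and project, and adding library/ for official images, as in the ubuntu example above. A small helper can do that rewrite. This is a sketch only: HARBOR_HOST, PROXY_PROJECT and proxy_image are hypothetical names, harbor.example.com is a placeholder host, and the function only handles Docker Hub references.

```shell
# Registry host and project name are assumptions; substitute your own values.
HARBOR_HOST="${HARBOR_HOST:-harbor.example.com}"
PROXY_PROJECT="${PROXY_PROJECT:-dockerhub-proxy}"

# Map a Docker Hub image reference to its path in the proxy project.
proxy_image() {
  image="${1#docker.io/}"          # drop an explicit docker.io/ prefix
  case "$image" in
    */*) ;;                        # already namespaced, e.g. bitnami/nginx
    *)   image="library/$image" ;; # official images live under library/
  esac
  printf '%s/%s/%s\n' "$HARBOR_HOST" "$PROXY_PROJECT" "$image"
}
```

For example, docker pull "$(proxy_image ubuntu:20.04)" would request library/ubuntu:20.04 through the dockerhub-proxy project rather than directly from Docker Hub.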
Configure Docker Hub Replication
Create a Project to replicate to
With Proxy-ing enabled, let’s now turn our eyes to Replication. This is where we can surgically select which images we want to make available.
For more information about how Replication works, see the official documentation.
- Go to Projects and click the + New Project button.
- Set Project Name to dockerhub-replica.
- Leave all other settings as their defaults.
Create a Replication Rule
Next we create a Replication Rule to determine the specific images we want to replicate. In this case we want only the library/python:3.8.2-slim image. We restrict this because Replication can quickly hit the Docker Hub rate limits.
The resource filters support basic pattern matching, so you could use library/** if you wanted to replicate all of the official images; however, this would quickly hit the rate limits.
- Go to Administration -> Replication and click the + New Replication Rule button.
- Set Name to dockerhub-python-slim.
- Set Replication mode to Pull-based.
- Set Source registry to Docker Hub.
- Set Source resource filter -> Name to library/python.
- Set Source resource filter -> Tag to 3.8.2-slim.
- Set Destination namespace to dockerhub-replica/python.
- Leave the rest as their defaults.
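The rule above could also be expressed as an API payload. The sketch below assumes that Harbor’s v2.0 API accepts replication policies at POST /api/v2.0/replication/policies with the field names shown, and that you already know the numeric ID Harbor assigned to the Docker Hub endpoint. HARBOR_URL, HARBOR_USER and HARBOR_PASS are hypothetical variable names, and the exact schema may differ between Harbor versions.

```shell
# Print the JSON body for the replication rule described above.
# $1 is the numeric ID of the Docker Hub endpoint in Harbor (assumed known).
replication_payload() {
  cat <<EOF
{
  "name": "dockerhub-python-slim",
  "src_registry": { "id": $1 },
  "dest_namespace": "dockerhub-replica/python",
  "filters": [
    { "type": "name", "value": "library/python" },
    { "type": "tag", "value": "3.8.2-slim" }
  ],
  "trigger": { "type": "manual" },
  "enabled": true
}
EOF
}

# Send the rule to Harbor. Defined only; call it yourself when ready.
create_replication_rule() {
  replication_payload "$1" | curl --fail --silent --show-error \
    -u "${HARBOR_USER}:${HARBOR_PASS}" \
    -H "Content-Type: application/json" \
    -X POST --data @- \
    "${HARBOR_URL}/api/v2.0/replication/policies"
}
```

Setting only a source registry is how this sketch expresses a pull-based rule: the destination is the local Harbor instance. Keeping the name and tag filters this narrow is what keeps the rule within the Docker Hub rate limits.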
Test Replication
We left the trigger mode at its default of manual replication (so that we don’t overwhelm the rate limits), so we need to actually perform the replication step and then validate that it worked.
- Go to Administration -> Replication, click the dockerhub-python-slim item, then click the Replicate button.
Harbor will kick off the replication and will show the attempt below in the Executions section. You can click on it for more details or logs, but for now we’re just waiting for it to finish.
- Go to Projects and select dockerhub-replica, then click Repositories. You should see dockerhub-replica/python/python with at least one Artifact.
Note: To avoid this accidental redundancy in the name, we should have set Destination namespace to dockerhub-replica rather than dockerhub-replica/python.
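The replicate-and-verify steps can be sketched from the command line as well. This assumes that Harbor’s v2.0 API starts an execution at POST /api/v2.0/replication/executions given a policy ID, which you would need to look up first. HARBOR_URL, HARBOR_HOST, HARBOR_USER and HARBOR_PASS are hypothetical variable names, and harbor.example.com is a placeholder host.

```shell
# Placeholder registry host; substitute your own.
HARBOR_HOST="${HARBOR_HOST:-harbor.example.com}"

# JSON body that asks Harbor to run the replication rule with ID $1.
execution_payload() {
  printf '{ "policy_id": %s }\n' "$1"
}

# Kick off the replication. Defined only; call it yourself when ready.
start_replication() {
  execution_payload "$1" | curl --fail --silent --show-error \
    -u "${HARBOR_USER}:${HARBOR_PASS}" \
    -H "Content-Type: application/json" \
    -X POST --data @- \
    "${HARBOR_URL}/api/v2.0/replication/executions"
}

# Full reference of the replicated image, including the doubled "python".
replica_image() {
  printf '%s/dockerhub-replica/python/python:3.8.2-slim\n' "$HARBOR_HOST"
}
```

Once the execution has finished, docker pull "$(replica_image)" should fetch the image from Harbor without touching Docker Hub.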
Summary
That’s it! We’ve learned how to make Docker Hub images available locally using both Proxy-ing and Replication. The same approach applies to Harbor-to-Harbor replication as well. It’s not uncommon to have one main Harbor registry as the source of truth, with Replication to remote sites and Proxy-ing to edge sites.