Production-Ready Harbor Deployment on Kubernetes

Harbor is an open-source container registry that secures artifacts with policies and role-based access control, ensuring images are scanned for vulnerabilities and signed as trusted. To learn more about Harbor and how to deploy it on a Virtual Machine (VM) and in Kubernetes (K8s), refer to parts 1 and 2 of the series.

While deploying Harbor is straightforward, making it production-ready requires careful consideration of several key aspects. This blog outlines critical factors to ensure your Harbor instance is robust, secure, and scalable for production environments.

For this blog, we focus on upstream Harbor (v2.14) deployed on Kubernetes via Helm, and the suggestions below apply to that specific deployment.

1. High Availability (HA) and Scalability

For a production environment, single points of failure are unacceptable. This is especially true for image registries, which act as a central repository for storing and pulling images and artifacts. Implementing high availability for Harbor is therefore crucial and involves several key considerations:

Deploy with an Ingress: Place an Ingress controller (e.g., Traefik) in front of your Harbor instances to distribute incoming traffic efficiently and provide a unified entry point, and pair it with cert-manager for certificate management. Set the expose type in your values.yaml file:

expose:
  # Available Options: "loadBalancer", "ingress", "clusterIP", "nodePort"
  type: ingress

To locate your values.yaml file, refer to the previous blog.
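
As a minimal sketch of what an Ingress-based exposure might look like beyond the type setting, the fragment below also sets an ingress class, a hostname, and the external URL. The hostname harbor.example.com and the class name traefik are placeholders rather than values from this series, so substitute your own.

```yaml
# values.yaml (fragment) - illustrative values only
expose:
  type: ingress
  ingress:
    # Assumption: an IngressClass named "traefik" exists in the cluster
    className: "traefik"
    hosts:
      # Placeholder hostname; it must resolve to your Ingress controller
      core: harbor.example.com

# Should match the URL that clients use to reach Harbor through the Ingress
externalURL: https://harbor.example.com
```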

Utilize Multiple Harbor Instances: Increase the replica count for critical Harbor components (e.g., core, jobservice, portal, registry, trivy) in your values.yaml to ensure redundancy.

core:
  replicas: 3
jobservice:
  replicas: 3
portal:
  replicas: 3
registry:
  replicas: 3
trivy:
  replicas: 3

# While not strictly for the HA of the registry itself, consider increasing exporter replicas for robust monitoring availability
exporter:
  replicas: 3

# The chart's bundled Nginx proxy is deployed only when expose.type is not "ingress"; in that case, consider increasing its replicas
nginx:
  replicas: 3
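
Extra replicas only add redundancy if they are scheduled onto different nodes. One way to encourage that, sketched here for the core component only, is a soft pod anti-affinity rule passed through the chart's per-component affinity value. The component: core label in the selector is an assumption about the labels the chart applies to its pods; check with kubectl get pods --show-labels before relying on it.

```yaml
# values.yaml (fragment) - a sketch, not a complete HA configuration
core:
  replicas: 3
  affinity:
    podAntiAffinity:
      # "preferred" keeps pods schedulable even if fewer nodes than replicas exist
      preferredDuringSchedulingIgnoredDuringExecution:
        - weight: 100
          podAffinityTerm:
            # Spread across nodes; use topology.kubernetes.io/zone to spread across zones
            topologyKey: kubernetes.io/hostname
            labelSelector:
              matchLabels:
                component: core   # assumed pod label, verify in your cluster
```

The same pattern can be repeated for the other components. Note also that running several registry or jobservice replicas against volume-backed storage generally requires a volume that supports the ReadWriteMany access mode, or object storage instead.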

Enable Database HA (PostgreSQL): Harbor includes a built-in PostgreSQL database, but we do not recommend it for production use. Here’s why:

Lack of High Availability (HA): The default internal PostgreSQL setup within the Harbor Helm chart is typically a single instance. This creates a single point of failure. If that database pod goes down, your entire Harbor instance will be unavailable.

Limited Scalability: An embedded database is not designed for independent scaling. If your Harbor usage grows, you might hit database performance bottlenecks that are difficult to address without disrupting Harbor itself.

Complex Lifecycle Management: Managing backups, point-in-time recovery, patching, and upgrades for a stateful database directly within an application’s Helm chart can be significantly more complex and error-prone than with dedicated database solutions.

We therefore recommend deploying a highly available PostgreSQL cluster within Kubernetes (e.g., using a Helm chart for Patroni or CloudNativePG) or using a managed database service outside the cluster. Configure Harbor to connect to this HA database by updating values.yaml:

database:
  type: "external"
  external:
    host: "192.168.0.1"
    port: "5432"
    username: "user"
    password: "password"
    coreDatabase: "registry"
    # If using an existing secret, the key must be "password"
    existingSecret: ""
    # "disable" - No SSL
    # "require" - Always SSL (skip verification)
    # "verify-ca" - Always SSL (verify that the certificate presented by the
    # server was signed by a trusted CA)
    # "verify-full" - Always SSL (verify that the certificate presented by the
    # server was signed by a trusted CA and the server host name matches the one
    # in the certificate)
    sslmode: "verify-full"
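
To keep the database password out of values.yaml, the existingSecret option mentioned in the comments above can be used instead of the inline password. The following is a sketch under stated assumptions: the Secret name harbor-db-credentials, the harbor namespace, the host name, and the user name are all hypothetical.

```yaml
# harbor-db-secret.yaml (hypothetical file name) - create this before installing Harbor
apiVersion: v1
kind: Secret
metadata:
  name: harbor-db-credentials   # hypothetical name
  namespace: harbor             # assumption: the namespace of the Harbor release
type: Opaque
stringData:
  password: "change-me"         # the key must be "password"
---
# values.yaml (fragment)
database:
  type: "external"
  external:
    host: "postgres.example.internal"   # placeholder host
    port: "5432"
    username: "harbor"                  # placeholder user
    coreDatabase: "registry"
    existingSecret: "harbor-db-credentials"
    sslmode: "verify-full"
```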

Implement Redis HA: Deploy a highly available Redis cluster in Kubernetes (e.g., using a Helm chart for Redis Sentinel or Redis Cluster) or utilize a managed Redis service. Configure Harbor to connect to this HA Redis instance by updating redis.type and connection details in values.yaml.

redis:
  type: external
  external:
    addr: "192.168.0.2:6379"
    sentinelMasterSet: ""
    tlsOptions:
      enable: true
    username: ""
    password: ""
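
When the external Redis runs behind Sentinel, the address field takes a comma-separated list of Sentinel endpoints and the master set name must be filled in. A minimal sketch, assuming three Sentinels on their default port 26379 and a master set called mymaster (both the addresses and the name are placeholders):

```yaml
# values.yaml (fragment) - Redis behind Sentinel
redis:
  type: external
  external:
    # Comma-separated Sentinel endpoints (placeholder addresses)
    addr: "192.168.0.2:26379,192.168.0.3:26379,192.168.0.4:26379"
    # Name of the master set that the Sentinels monitor (placeholder)
    sentinelMasterSet: "mymaster"
    username: ""
    password: ""
```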

2. Security Best Practices

Security is paramount for any production system, especially a container registry.

Enable TLS/SSL: Always enable TLS/SSL for all Harbor components, both for external access and for traffic between the components. For automated certificate management in Kubernetes, integrate with cert-manager and configure it within your Harbor Helm values.yaml:

expose:
  tls:
    enabled: true
    certSource: auto # set to "secret" when cert-manager provisions the certificate into a Secret
    auto:
      commonName: ""
internalTLS:
  enabled: true
  strong_ssl_ciphers: true
  certSource: "auto"
  core:
    secretName: ""
  jobservice:
    secretName: ""
  registry:
    secretName: ""
  portal:
    secretName: ""
  trivy:
    secretName: ""
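
One possible way to wire cert-manager into the external endpoint is to let it issue a certificate into a Secret and point Harbor at that Secret. The sketch below assumes a ClusterIssuer named letsencrypt-prod already exists, that Harbor runs in a namespace called harbor, and that the hostname is harbor.example.com; all three are placeholders.

```yaml
# harbor-certificate.yaml (hypothetical file name) - a cert-manager Certificate
apiVersion: cert-manager.io/v1
kind: Certificate
metadata:
  name: harbor-tls
  namespace: harbor               # assumption: the namespace of the Harbor release
spec:
  secretName: harbor-tls          # Secret that cert-manager creates and renews
  dnsNames:
    - harbor.example.com          # placeholder hostname
  issuerRef:
    name: letsencrypt-prod        # assumption: an existing ClusterIssuer
    kind: ClusterIssuer
---
# values.yaml (fragment) - use the Secret issued above
expose:
  tls:
    enabled: true
    certSource: secret
    secret:
      secretName: harbor-tls
```

The Secret name in values.yaml has to match spec.secretName of the Certificate, otherwise Harbor will not find the issued key pair.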

Configure Role-Based Access Control (RBAC): Use Kubernetes RBAC to control who can access the Harbor resources running in the cluster. After deployment, integrate Harbor with an enterprise identity provider such as LDAP or OIDC. For detailed steps, refer to the Harbor configuration guides "Configure LDAP/Active Directory Authentication" and "Configure OIDC Provider Authentication".

Implement Vulnerability Scanning: Ensure vulnerability scanning is enabled in values.yaml. Harbor uses Trivy by default. Verify its activation and configuration within the Helm chart.

trivy:
  enabled: true

Activate Content Trust: Harbor supports multiple content trust mechanisms to ensure the integrity of your artifacts. For modern OCI artifact signing, Harbor recommends Cosign and Notation. Enforce deployment security at the project level within the Harbor UI or via the Harbor API, so that only verified, cryptographically signed images can be deployed.

Maintain Regular Updates: Regularly update your Harbor Helm chart and underlying Kubernetes components to benefit from the latest security patches and bug fixes. Use helm upgrade for this purpose.

3. Storage Considerations

Efficient and reliable storage is critical for Harbor’s performance and stability.

Configure Shared Storage: For persistent data, configure Kubernetes StorageClasses and PersistentVolumes to use shared storage solutions like vSAN, NFS, S3-compatible object storage (e.g., MinIO deployed in-cluster or external S3), or a distributed file system. Specify these in your values.yaml under:

persistence:
  enabled: true
  resourcePolicy: "keep"
  persistentVolumeClaim:
    registry:
      # If left empty, the Kubernetes cluster default storage class will be used
      storageClass: "your-storage-class"
    jobservice:
      # Recent chart versions nest the jobservice volume settings under jobLog
      jobLog:
        storageClass: "your-storage-class"
    # The database and redis entries apply only to the internal (bundled) instances
    database:
      storageClass: "your-storage-class"
    redis:
      storageClass: "your-storage-class"
    trivy:
      storageClass: "your-storage-class"
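
For the S3-compatible option mentioned above, image data can be stored in object storage instead of a PersistentVolume. The following is a sketch only: the bucket name, region, endpoint, and Secret name are placeholders, and the Secret is assumed to hold the access key and secret key under the key names the chart expects (check the chart's values.yaml for the exact names).

```yaml
# values.yaml (fragment) - registry data in S3-compatible object storage
persistence:
  enabled: true
  imageChartStorage:
    # Serve blobs through Harbor instead of redirecting clients to the object store;
    # often needed when clients cannot reach the storage endpoint directly
    disableredirect: true
    type: s3
    s3:
      region: us-east-1                                      # placeholder
      bucket: harbor-registry                                # placeholder
      regionendpoint: https://minio.example.internal:9000    # placeholder, e.g. a MinIO endpoint
      existingSecret: harbor-s3-credentials                  # hypothetical Secret holding the credentials
```

With object storage in place, the registry replicas no longer depend on a shared volume, which simplifies scaling them out.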
