High Availability with Cloudflare Tunnels and Docker Swarm

A comprehensive guide to setting up and testing high availability (HA) for your web applications using Cloudflare Tunnels distributed across multiple VPS nodes with Docker Swarm.

Overview

This guide shows how to deploy a highly available web application across 3 VPS nodes using:

  • Docker Swarm for container orchestration
  • Cloudflare Tunnels for secure ingress without opening inbound ports
  • Overlay networks for inter-service communication
  • Health checks for automatic failover

The setup ensures your application remains online even if individual VPS nodes fail, with automatic load balancing across all available nodes.

Architecture

           ┌─────────────────┐
           │   Cloudflare    │
           │      Edge       │
           └────────┬────────┘
                    │ Load Balancing
    ┌──────────┬────┴─────┬──────────┐
    │          │          │          │
┌───▼───┐  ┌───▼───┐  ┌───▼───┐  ┌───▼───┐
│ VPS 1 │  │ VPS 2 │  │ VPS 3 │  │  ...  │
│  NC2  │  │  NC3  │  │  NC4  │  │       │
└───┬───┘  └───┬───┘  └───┬───┘  └───┬───┘
    │          │          │          │
    └──────────┴──────────┴──────────┘
               Docker Swarm
             Overlay Network

Each VPS runs:

  • Cloudflared connector (1 replica per node)
  • Frontend service (1 replica per node)
  • Backend service (1 replica per node)

Prerequisites

Infrastructure

  • 3+ VPS nodes with Docker installed
  • Docker Swarm initialized across all nodes
  • Each node can reach the others on the Swarm ports: 2377/tcp (cluster management), 7946/tcp and 7946/udp (node discovery), 4789/udp (overlay network traffic)
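
The port requirement above can be turned into firewall rules. The following is a minimal sketch, assuming ufw is the host firewall (this guide does not say which firewall the nodes use) and that the peer addresses are placeholders. The function only prints the commands, so they can be reviewed before being run.

```shell
# Print (do not execute) ufw rules that open the Swarm ports to each peer node.
# Assumption: ufw is the firewall in use; adapt for nftables or a cloud firewall.
swarm_ufw_rules() {
  local peer
  for peer in "$@"; do
    echo "ufw allow from ${peer} to any port 2377 proto tcp"   # cluster management
    echo "ufw allow from ${peer} to any port 7946 proto tcp"   # node discovery
    echo "ufw allow from ${peer} to any port 7946 proto udp"   # node discovery
    echo "ufw allow from ${peer} to any port 4789 proto udp"   # overlay (VXLAN) traffic
  done
}
```

For example, swarm_ufw_rules 203.0.113.11 203.0.113.12 prints eight rules for two peers. Restricting the rules to peer addresses keeps the Swarm ports closed to the rest of the internet.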

Cloudflare

  • Cloudflare account with Zero Trust enabled
  • Domain configured in Cloudflare DNS
  • Cloudflare Tunnel created

Software

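  • Docker Engine 20.10+ (the start_interval health check setting used in the files below requires Docker Engine 25.0 or newer; drop that line on older engines)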
  • Docker Swarm mode enabled

Setup

1. Initialize Docker Swarm

On your manager node (NC2-net):

docker swarm init --advertise-addr <MANAGER_IP>

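On worker nodes (NC3-org, NC4-de), using the token printed by docker swarm init (or by docker swarm join-token worker on the manager):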

docker swarm join --token <TOKEN> <MANAGER_IP>:2377
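
Before continuing, it helps to confirm that every node actually joined. The sketch below is one way to do that from a manager; the function names are hypothetical helpers, and the parsing step reads "hostname status" pairs from standard input so it can be checked without a cluster.

```shell
# Count nodes whose status is "Ready" from "hostname status" lines on stdin.
count_ready_nodes() {
  awk '$2 == "Ready" { n++ } END { print n + 0 }'
}

# Run on a manager: succeed only if at least $1 nodes report Ready.
swarm_has_nodes() {
  local want="$1" ready
  ready="$(docker node ls --format '{{.Hostname}} {{.Status}}' | count_ready_nodes)"
  [ "$ready" -ge "$want" ]
}
```

For this guide's three-node layout, swarm_has_nodes 3 && echo "all nodes ready" is a reasonable gate before creating the network.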

2. Create Overlay Network

Create an attachable overlay network for inter-service communication:

docker network create --driver overlay --subnet 10.0.1.0/24 --gateway 10.0.1.1 --attachable cloudflared-ha
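
Running the create command twice fails because the name is already taken. A small wrapper, sketched here with a hypothetical function name and the same subnet and gateway as above as defaults, makes the step safe to repeat in a setup script.

```shell
# Create the overlay network only if it does not exist yet (run on a manager).
ensure_overlay_network() {
  local name="${1:-cloudflared-ha}" subnet="${2:-10.0.1.0/24}" gateway="${3:-10.0.1.1}"
  if docker network inspect "$name" >/dev/null 2>&1; then
    echo "network $name already exists"
    return 0
  fi
  docker network create --driver overlay --subnet "$subnet" \
    --gateway "$gateway" --attachable "$name"
}
```

Calling ensure_overlay_network with no arguments reproduces the command from this step.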

3. Get Cloudflare Tunnel Token

  1. Go to Cloudflare Zero Trust → Networks → Tunnels
  2. Create a new tunnel or use existing one
  3. Copy the tunnel token

4. Configure Cloudflared

Create a .env file in your cloudflared directory:

TUNNEL_TOKEN=your_tunnel_token_here
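
A common mistake is deploying with the placeholder still in place. The check below is a minimal sketch (check_tunnel_env is a hypothetical helper, not part of cloudflared) that verifies the file defines a real-looking value before the stack is deployed.

```shell
# Return 0 if the file defines a non-empty TUNNEL_TOKEN that is not the placeholder.
check_tunnel_env() {
  local file="$1" token
  [ -f "$file" ] || { echo "missing: $file" >&2; return 1; }
  token="$(sed -n 's/^TUNNEL_TOKEN=//p' "$file" | head -n 1)"
  [ -n "$token" ] && [ "$token" != "your_tunnel_token_here" ]
}
```

Typical use is chmod 600 .env && check_tunnel_env .env, which also restricts the file to its owner since the token grants access to the tunnel.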

Configuration Files

cloudflared.yaml

services:
  cloudflared-ha:
    image: cloudflare/cloudflared:latest
    command: tunnel run
    env_file:
      - .env
    networks:
      - cloudflared-ha
    healthcheck:
      test: ["CMD", "cloudflared", "version"]
      interval: 60s
      timeout: 20s
      retries: 5
      start_period: 60s
      start_interval: 5s
    deploy:
      mode: replicated
      replicas: 3
      placement:
        max_replicas_per_node: 1
      restart_policy:
        condition: any

networks:
  cloudflared-ha:
    external: true

swarm.yaml

services:
  frontend:
    image: your-registry/frontend:latest
    env_file:
      - ./frontend/.env
    networks:
      - cloudflared-ha
    healthcheck:
      test: ["CMD", "wget", "--quiet", "--tries=1", "--spider", "http://127.0.0.1/health"]
      interval: 60s
      timeout: 20s
      retries: 5
      start_period: 60s
      start_interval: 5s
    deploy:
      mode: replicated
      replicas: 3
      placement:
        max_replicas_per_node: 1
      update_config:
        parallelism: 1
        delay: 10s
        order: start-first
      restart_policy:
        condition: any

  backend:
    image: your-registry/backend:latest
    env_file:
      - ./backend/.env
    networks:
      - cloudflared-ha
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost:8000/health"]
      interval: 60s
      timeout: 20s
      retries: 5
      start_period: 60s
      start_interval: 5s
    deploy:
      mode: replicated
      replicas: 3
      placement:
        max_replicas_per_node: 1
      update_config:
        parallelism: 1
        delay: 10s
        order: start-first
      restart_policy:
        condition: any

networks:
  cloudflared-ha:
    external: true

Deployment

1. Deploy Cloudflared Stack

cd /path/to/cloudflared-ha
docker stack deploy -c cloudflared.yaml cloudflared

Verify deployment:

docker service ps cloudflared_cloudflared-ha

You should see 1 replica running on each node.

2. Deploy Application Stack

cd /path/to/your-app
docker stack deploy -c swarm.yaml blog

Verify deployment:

docker service ps blog_frontend
docker service ps blog_backend
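
Services take a while to converge after a deploy, so a script usually needs to wait rather than check once. The following sketch polls the REPLICAS column of docker service ls; the helper names, the 5 second poll interval and the 120 second default timeout are assumptions chosen for illustration.

```shell
# Succeed when a REPLICAS value such as "3/3" or "3/3 (max 1 per node)" is fully converged.
replicas_converged() {
  local counts="${1%% *}"            # keep only the "running/desired" part
  local running="${counts%%/*}"
  local desired="${counts##*/}"
  [ -n "$desired" ] && [ "$desired" -gt 0 ] 2>/dev/null \
    && [ "$running" -eq "$desired" ] 2>/dev/null
}

# Poll one service until it converges or the timeout (seconds) expires.
wait_for_service() {
  local service="$1" timeout="${2:-120}" waited=0 replicas
  while [ "$waited" -lt "$timeout" ]; do
    replicas="$(docker service ls --filter "name=${service}" --format '{{.Replicas}}' | head -n 1)"
    replicas_converged "$replicas" && return 0
    sleep 5
    waited=$((waited + 5))
  done
  return 1
}
```

For example, wait_for_service blog_frontend && wait_for_service blog_backend blocks until both services report all replicas running. The name filter matches by prefix, so pass the full service name.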

3. Configure Cloudflare Tunnel Routes

In Cloudflare Zero Trust dashboard:

  1. Go to Networks → Tunnels → Your tunnel
  2. Add Public Hostname:
    • Hostname: yourdomain.com
    • Service: http://frontend:80
  3. Add Public Hostname for API:
    • Hostname: api.yourdomain.com
    • Service: http://backend:8000

Testing High Availability

Verify All Connectors Are Registered

Check Cloudflare dashboard → Networks → Connectors:

  • Should show 3 connectors (1 per VPS)
  • All status should be "Connected"
  • Note the Connector IDs for each

Test Traffic Distribution

Method 1: Cloudflare Live Logs

  1. Go to Zero Trust → Networks → Tunnels → Live logs
  2. Access your application multiple times, for example with the loop below
  3. Observe different Connector IDs handling requests

# Make multiple requests
for i in {1..10}; do curl -I https://yourdomain.com; sleep 1; done

You may see requests handled by different connector IDs. Cloudflare chooses the connector for each request, so an uneven split does not by itself indicate a problem.

Method 2: Simulate Node Failure

# Scale down cloudflared to 2 replicas
docker service scale cloudflared_cloudflared-ha=2

# Verify site still works
curl -I https://yourdomain.com

# Scale back to 3
docker service scale cloudflared_cloudflared-ha=3
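
Scaling the service removes one connector but leaves every node running. To get closer to a real node failure, a node can be drained with docker node update --availability drain <NODE> while requests are sent continuously. The sketch below covers the measuring side; probe and summarize_codes are hypothetical helpers, and the one second spacing and 5 second timeout are arbitrary choices.

```shell
# Summarize HTTP status codes (one per line on stdin) as "ok=N fail=M".
# Codes 200-399 count as ok; everything else, including curl's 000, counts as fail.
summarize_codes() {
  awk '{ if ($1 >= 200 && $1 < 400) ok++; else fail++ }
       END { printf "ok=%d fail=%d\n", ok, fail }'
}

# Send N requests one second apart and print each status code on its own line.
probe() {
  local url="$1" count="${2:-30}" i=0
  while [ "$i" -lt "$count" ]; do
    curl -s -o /dev/null -m 5 -w '%{http_code}\n' "$url"
    i=$((i + 1))
    sleep 1
  done
}
```

Run probe https://yourdomain.com 60 | summarize_codes in one terminal, drain a node from a manager in another, and compare the failure count with a run where nothing was drained. Restore the node afterwards with docker node update --availability active <NODE>. Swarm does not move running tasks back on its own, so a docker service update --force on each service may be needed to rebalance.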

Method 3: Check Service Replicas

# Check all services are running
docker service ls

# Check individual service distribution
docker service ps cloudflared_cloudflared-ha
docker service ps blog_frontend
docker service ps blog_backend

Each should show 1 replica per node.
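
Reading the task list by eye gets tedious with several services. As a rough sketch, the placement can be checked by listing the node of every running task and looking for repeats; both function names are hypothetical.

```shell
# Read node names (one per line) on stdin; print any node that appears more than once.
duplicate_nodes() {
  sort | uniq -d
}

# Succeed if the running tasks of a service are all on different nodes.
one_task_per_node() {
  local service="$1" dups
  dups="$(docker service ps "$service" --filter desired-state=running --format '{{.Node}}' | duplicate_nodes)"
  [ -z "$dups" ]
}
```

A loop such as for s in cloudflared_cloudflared-ha blog_frontend blog_backend; do one_task_per_node "$s" || echo "$s is doubled up"; done covers all three services from this guide.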

Test Health Checks

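# Check the state of each task in the service
docker service ps blog_frontend --format "table {{.Name}}\t{{.CurrentState}}"

# Check individual container health (run on the node hosting the container)
docker inspect <container_id> --format '{{.State.Health.Status}}'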

Test Network Connectivity

# Test overlay network connectivity
docker network inspect cloudflared-ha

# Verify containers can communicate
docker exec <frontend_container> wget -qO- http://backend:8000/health

Troubleshooting

Issue: Services Not Starting on All Nodes

Symptoms: Only some replicas running, others rejected

Solution:

# Check network configuration
docker network inspect cloudflared-ha

# Remove the stacks first; a network that is still in use cannot be removed
docker stack rm cloudflared blog

# Remove and recreate network (wait until the stack tasks have stopped)
docker network rm cloudflared-ha
docker network create --driver overlay --subnet 10.0.1.0/24 --gateway 10.0.1.1 --attachable cloudflared-ha

# Redeploy stacks
docker stack deploy -c cloudflared.yaml cloudflared
docker stack deploy -c swarm.yaml blog

Issue: Health Checks Failing

Symptoms: Containers stuck in "health: starting"

Solution:

  • Verify health check command works inside container:
    docker exec <container_id> <health_check_command>
  • Ensure required tools (curl, wget) are installed in Dockerfile
  • Check health endpoint is accessible
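
Docker keeps the output of recent health probes, which often shows the failure directly (for example a missing curl binary). The sketch below prints that output for every container on the current node in a given health state; the function names are hypothetical, and the state argument maps to the health filter of docker ps (starting, healthy, unhealthy).

```shell
# Print the health status and the output of recent health probes for one container.
health_report() {
  docker inspect --format '{{.State.Health.Status}}: {{range .State.Health.Log}}{{.Output}}{{end}}' "$1"
}

# Report on every container on this node in the given health state (default: unhealthy).
report_by_health() {
  local state="${1:-unhealthy}" id
  for id in $(docker ps --filter "health=${state}" --format '{{.ID}}'); do
    health_report "$id"
  done
}
```

Running report_by_health starting on each node lists the containers stuck in the starting state together with whatever their probe printed.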

Issue: Cloudflared Not Connecting

Symptoms: Connectors showing as disconnected in dashboard

Solution:

# Check cloudflared logs
docker service logs cloudflared_cloudflared-ha -f

# Verify tunnel token is correct
# Check network connectivity between nodes
# Ensure overlay network is working

Issue: No Traffic Distribution

Symptoms: All traffic going to single connector

Solution:

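  • Verify all connectors are registered in Cloudflare dashboard
  • Confirm every cloudflared replica runs with the same tunnel token; connectors that share one tunnel act as replicas of it
  • Use Cloudflare Live logs to verify distribution, and test from more than one location, since clients in one region may keep reaching the same connector
  • Ensure placement constraints allow distribution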

Best Practices

1. Resource Limits

Add resource constraints to prevent resource exhaustion:

deploy:
  resources:
    limits:
      cpus: '0.5'
      memory: 512M
    reservations:
      cpus: '0.25'
      memory: 256M

2. Update Strategy

Use rolling updates for zero-downtime deployments:

update_config:
  parallelism: 1
  delay: 10s
  order: start-first
  failure_action: rollback
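
With this update_config in place, an image change rolls through the replicas one at a time. The wrappers below are a small sketch of triggering and reverting such an update from a manager; the function names and the image tag in the example are hypothetical.

```shell
# Roll a service to a new image; Swarm applies the service's update_config.
roll_out() {
  local service="$1" image="$2"
  docker service update --image "$image" "$service"
}

# Revert a service to its previous definition.
roll_back() {
  docker service rollback "$1"
}
```

For example, roll_out blog_frontend your-registry/frontend:1.2.3 starts a rolling update, and roll_back blog_frontend reverts it manually. For stack-managed services, editing the image in swarm.yaml and re-running docker stack deploy keeps the file and the cluster in sync.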

3. Monitoring

  • Enable Cloudflare Analytics for traffic insights
  • Monitor Docker Swarm service health
  • Set up alerts for service failures
  • Use Cloudflare Live logs for real-time debugging

4. Security

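  • Keep sensitive data such as the tunnel token out of images and version control (environment files with restricted permissions, or Docker secrets)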
  • Restrict network access where possible
  • Keep images updated
  • Use private Docker registries
  • Enable Cloudflare WAF rules

5. Backup Strategy

  • Backup Docker Swarm configuration
  • Store tunnel tokens securely (for example in a password manager)
  • Version control configuration files
  • Test disaster recovery procedures
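
Swarm state lives in /var/lib/docker/swarm on manager nodes. The sketch below archives that directory; backup_swarm_state is a hypothetical helper, and it assumes it is run as root on a manager with Docker stopped for the duration of the copy so the state is consistent. If autolock is enabled, the unlock key must be kept as well.

```shell
# Archive a Swarm state directory into a timestamped tarball and print its path.
# Defaults: source /var/lib/docker/swarm, destination the current directory.
backup_swarm_state() {
  local src="${1:-/var/lib/docker/swarm}" dest="${2:-.}" out
  [ -d "$src" ] || { echo "not a directory: $src" >&2; return 1; }
  out="${dest}/swarm-backup-$(date +%Y%m%d-%H%M%S).tar.gz"
  tar -czf "$out" -C "$(dirname "$src")" "$(basename "$src")" && echo "$out"
}
```

With several managers, one can be stopped for the backup while the others keep the cluster running. Copy the resulting archive off the node, since a backup stored on the failed machine is of little use.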

Conclusion

This HA setup provides:

  • Zero-downtime deployments with rolling updates
  • Automatic failover when nodes fail
  • Load balancing across all available nodes
  • Secure ingress without exposed ports
  • Easy scaling by adding more nodes

The combination of Docker Swarm and Cloudflare Tunnels creates a robust, production-ready infrastructure that can handle node failures while maintaining service availability.
