Cornucopia Deployment Guide

Local Development

Prerequisites

  • Go 1.22 or later
  • Python 3.8+ (for testing pip integration)
  • pip (for testing)

Setup

git clone https://github.com/getcornucopia/cornucopia.git
cd cornucopia
go mod download

Running

# Development mode (listen on :8080, cache in ./cache)
go run ./cmd/cornucopia

# Or build first
make build
./bin/cornucopia
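
To try the running instance with pip, point pip's index at it. The sketch below builds the command in Python. It assumes the package index is served under /simple/ on the listen address shown above; pip_install_cmd and install are names made up for this example.

```python
import subprocess
import sys
from typing import List


def pip_install_cmd(base_url: str, package: str) -> List[str]:
    """Build a pip command that installs `package` through the mirror.

    Assumption: the simple index lives at <base_url>/simple/.
    """
    index_url = base_url.rstrip("/") + "/simple/"
    return [sys.executable, "-m", "pip", "install",
            "--index-url", index_url, package]


def install(base_url: str, package: str) -> None:
    """Run the install; raises CalledProcessError if pip fails."""
    subprocess.run(pip_install_cmd(base_url, package), check=True)
```

For example, install("http://localhost:8080", "requests") would fetch requests through the local instance.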

Testing

# Unit tests
make test

# Integration tests (with pip)
make test-integration

# All tests
make test && make test-integration

Docker

Building

make docker
# Image: cornucopia:latest

Running

# Standalone
docker run -p 8080:8080 -v cache:/app/cache cornucopia:latest

# With config
docker run -p 8080:8080 \
  -v cache:/app/cache \
  -v "$(pwd)/config.yml":/app/config.yml:ro \
  -e CORNUCOPIA_CONFIG=/app/config.yml \
  cornucopia:latest

docker-compose

docker-compose up

Adjust docker-compose.yml to mount your config and cache directory.


Kubernetes

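A StatefulSet gives each replica a stable hostname and its own persistent volume for the cache (one claim per pod, created from volumeClaimTemplates), so the replicas below do not share cached files.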

apiVersion: v1
kind: ConfigMap
metadata:
  name: cornucopia-config
data:
  config.yml: |
    server:
      addr: ":8080"
      base_url: "http://pypi.example.com"
      max_upload_size: 104857600
    cache:
      dir: "/data/cache"
      metadata_ttl: 10m
      enable_memory: true
    pypi:
      json_api_url: "https://pypi.org/pypi"
      timeout: 30s
      max_retries: 3
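    auth:
      disabled: false
      # Placeholder tokens: replace before deploying. ConfigMaps are not
      # encrypted, so consider mounting real tokens from a Secret instead.
      tokens:
        "token-1": "user1"
        "token-2": "user2"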

---
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: cornucopia
spec:
  serviceName: cornucopia
  replicas: 3
  selector:
    matchLabels:
      app: cornucopia
  template:
    metadata:
      labels:
        app: cornucopia
    spec:
      containers:
      - name: cornucopia
        image: cornucopia:latest
        imagePullPolicy: IfNotPresent
        ports:
        - containerPort: 8080
          name: http
        env:
        - name: CORNUCOPIA_CONFIG
          value: /etc/cornucopia/config.yml
        volumeMounts:
        - name: cache
          mountPath: /data/cache
        - name: config
          mountPath: /etc/cornucopia
          readOnly: true
        livenessProbe:
          httpGet:
            path: /health
            port: 8080
          initialDelaySeconds: 10
          periodSeconds: 30
        readinessProbe:
          httpGet:
            path: /health
            port: 8080
          initialDelaySeconds: 5
          periodSeconds: 10
        resources:
          requests:
            cpu: 100m
            memory: 128Mi
          limits:
            cpu: 500m
            memory: 512Mi
      volumes:
      - name: config
        configMap:
          name: cornucopia-config
  volumeClaimTemplates:
  - metadata:
      name: cache
    spec:
      accessModes: [ "ReadWriteOnce" ]
      resources:
        requests:
          storage: 100Gi

---
apiVersion: v1
kind: Service
metadata:
  name: cornucopia
spec:
  clusterIP: None
  selector:
    app: cornucopia
  ports:
  - port: 8080
    name: http

---
apiVersion: v1
kind: Service
metadata:
  name: cornucopia-lb
spec:
  type: LoadBalancer
  selector:
    app: cornucopia
  ports:
  - port: 80
    targetPort: 8080
    protocol: TCP
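
The stable hostnames come from the headless Service: each pod is reachable at <pod>.<service>.<namespace>.svc.<cluster-domain>. A small sketch that lists them, assuming the "default" namespace and the usual "cluster.local" cluster domain (both are assumptions; replica_hostnames is a hypothetical helper):

```python
from typing import List


def replica_hostnames(statefulset: str, service: str, replicas: int,
                      namespace: str = "default",
                      cluster_domain: str = "cluster.local") -> List[str]:
    """Return the stable DNS name of each pod in a StatefulSet.

    Pods are named <statefulset>-<ordinal>, with ordinals starting at 0.
    """
    return [
        f"{statefulset}-{i}.{service}.{namespace}.svc.{cluster_domain}"
        for i in range(replicas)
    ]


if __name__ == "__main__":
    for host in replica_hostnames("cornucopia", "cornucopia", 3):
        print(host)
```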

Deployment (Simpler, Shared Cache)

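If using a shared cache volume that supports ReadWriteMany (e.g., NFS or EFS; a standard EBS volume is ReadWriteOnce and attaches to one node at a time):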

apiVersion: apps/v1
kind: Deployment
metadata:
  name: cornucopia
spec:
  replicas: 3
  selector:
    matchLabels:
      app: cornucopia
  template:
    metadata:
      labels:
        app: cornucopia
    spec:
      containers:
      - name: cornucopia
        image: cornucopia:latest
        ports:
        - containerPort: 8080
        env:
        - name: CORNUCOPIA_CONFIG
          value: /etc/cornucopia/config.yml
        volumeMounts:
        - name: cache
          mountPath: /data/cache
        - name: config
          mountPath: /etc/cornucopia
          readOnly: true
        livenessProbe:
          httpGet:
            path: /health
            port: 8080
          initialDelaySeconds: 10
          periodSeconds: 30
      volumes:
      - name: cache
        persistentVolumeClaim:
          claimName: cornucopia-cache
      - name: config
        configMap:
          name: cornucopia-config

---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: cornucopia-cache
spec:
  accessModes: [ "ReadWriteMany" ]
  resources:
    requests:
      storage: 100Gi

Systemd

Create /etc/systemd/system/cornucopia.service:

[Unit]
Description=Cornucopia PyPI Mirror
Documentation=https://github.com/getcornucopia/cornucopia
After=network.target

[Service]
Type=simple
User=cornucopia
Group=cornucopia
WorkingDirectory=/var/lib/cornucopia

ExecStart=/usr/local/bin/cornucopia
EnvironmentFile=/etc/cornucopia/cornucopia.env

StandardOutput=journal
StandardError=journal

# Restart on failure
Restart=on-failure
RestartSec=10

# Graceful shutdown
KillMode=mixed
KillSignal=SIGTERM
TimeoutStopSec=30s

[Install]
WantedBy=multi-user.target

Create /etc/cornucopia/cornucopia.env:

CORNUCOPIA_ADDR=:8080
CORNUCOPIA_CACHE_DIR=/var/lib/cornucopia/cache
CORNUCOPIA_CONFIG=/etc/cornucopia/config.yml

Setup (run as root):

# Create user
useradd -r -s /sbin/nologin cornucopia

# Create directories (including /etc/cornucopia for the config and env file)
mkdir -p /var/lib/cornucopia/cache /etc/cornucopia
chown -R cornucopia:cornucopia /var/lib/cornucopia

# Copy binary
cp bin/cornucopia /usr/local/bin/
chmod +x /usr/local/bin/cornucopia

# Copy config
cp config.example.yml /etc/cornucopia/config.yml
chown cornucopia:cornucopia /etc/cornucopia/config.yml

# Enable and start
systemctl daemon-reload
systemctl enable cornucopia
systemctl start cornucopia

# Check status
systemctl status cornucopia
journalctl -u cornucopia -f

Configuration

See config.example.yml for all options.

Key Settings

Server:

  • addr: Listen address (e.g., :8080)
  • base_url: Public URL (e.g., http://pypi.example.com)
  • max_upload_size: Max file size for uploads, in bytes (default: 100MB, written as 104857600 in the example config)

Cache:

  • dir: Cache directory (default: ./cache)
  • metadata_ttl: Metadata TTL (default: 10m)
  • enable_memory: In-memory cache (default: true)
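
To make the TTL setting concrete: a metadata entry is served from cache while it is younger than metadata_ttl and refetched afterwards. The sketch below illustrates that rule only; it is not Cornucopia's implementation. It handles single-unit durations such as 10m or 30s, and treating the exact expiry instant as stale is an assumption.

```python
import re
import time
from typing import Optional

_UNITS = {"s": 1, "m": 60, "h": 3600}


def parse_duration(text: str) -> int:
    """Convert a simple duration such as '10m' or '30s' to seconds."""
    match = re.fullmatch(r"(\d+)([smh])", text.strip())
    if match is None:
        raise ValueError(f"unsupported duration: {text!r}")
    return int(match.group(1)) * _UNITS[match.group(2)]


def is_fresh(fetched_at: float, ttl: str, now: Optional[float] = None) -> bool:
    """True while an entry fetched at `fetched_at` is still within its TTL."""
    if now is None:
        now = time.time()
    return (now - fetched_at) < parse_duration(ttl)
```

With metadata_ttl: 10m, an entry fetched 9 minutes ago is still served from cache; one fetched 11 minutes ago triggers an upstream request.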

PyPI:

  • json_api_url: PyPI JSON API (default: https://pypi.org/pypi)
  • timeout: Upstream timeout (default: 30s)
  • max_retries: Retry count (default: 3)

Auth:

  • disabled: When true, uploads are accepted without authentication (default: true); set to false in production
  • tokens: Map of token→username
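
Tokens such as "token-1" are placeholders. One way to produce random tokens for the tokens map is sketched below; make_token_map and to_yaml_lines are hypothetical helpers, and the 32-byte token length is an arbitrary choice.

```python
import secrets
from typing import Dict, Iterable


def make_token_map(usernames: Iterable[str], nbytes: int = 32) -> Dict[str, str]:
    """Return {token: username} with one random URL-safe token per user."""
    return {secrets.token_urlsafe(nbytes): name for name in usernames}


def to_yaml_lines(token_map: Dict[str, str]) -> str:
    """Format the map as quoted key/value lines for the `tokens:` key."""
    return "\n".join(f'"{token}": "{user}"' for token, user in token_map.items())


if __name__ == "__main__":
    # Paste the output under auth.tokens, indented to match the config.
    print(to_yaml_lines(make_token_map(["user1", "user2"])))
```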

Environment Variables

  • CORNUCOPIA_ADDR: Override server address
  • CORNUCOPIA_CACHE_DIR: Override cache directory
  • CORNUCOPIA_CONFIG: Path to YAML config
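
The override behaviour can be pictured as a merge in which the environment wins over the YAML file. The sketch below is an illustration under assumptions, not the actual loader: the mapping of CORNUCOPIA_ADDR to server.addr and CORNUCOPIA_CACHE_DIR to cache.dir is inferred from the descriptions above, and empty variables are treated as unset.

```python
import os
from typing import Dict, Mapping, Optional

# Environment variable -> (config section, key). Inferred, not confirmed.
ENV_OVERRIDES = {
    "CORNUCOPIA_ADDR": ("server", "addr"),
    "CORNUCOPIA_CACHE_DIR": ("cache", "dir"),
}


def effective_settings(config: Mapping[str, Mapping[str, str]],
                       env: Optional[Mapping[str, str]] = None) -> Dict[str, Dict[str, str]]:
    """Return a copy of `config` with environment overrides applied."""
    env = os.environ if env is None else env
    merged = {section: dict(values) for section, values in config.items()}
    for var, (section, key) in ENV_OVERRIDES.items():
        if env.get(var):  # skip unset or empty variables
            merged.setdefault(section, {})[key] = env[var]
    return merged
```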

Scaling

Vertical Scaling

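  • Set GOMAXPROCS: Go already defaults to the number of CPUs it can see, so setting the GOMAXPROCS env var is mainly useful when a container CPU limit is lower than the host core count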
  • Increase cache size: Use larger storage volume
  • Tune timeouts: Increase for slow networks

Horizontal Scaling

  1. Shared Cache: All instances must access the same cache volume (for example NFS or Amazon EFS; a standard EBS volume can be mounted by only one node at a time)
  2. Load Balancer: Front all instances with a reverse proxy (nginx, AWS ALB)
  3. Stateless: Each instance is identical; no session affinity needed

Example Nginx config:

upstream cornucopia {
    server cornucopia1:8080;
    server cornucopia2:8080;
    server cornucopia3:8080;
}

server {
    listen 80;
    server_name pypi.example.com;

    location / {
        proxy_pass http://cornucopia;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header X-Forwarded-Proto $scheme;
        
        # For large file uploads
        client_max_body_size 500M;
    }
}

Monitoring

Health Check

curl http://localhost:8080/health
# {"status":"ok"}
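
For deploy scripts, the same check can be polled until the instance is up. A minimal sketch using only the standard library; it assumes the response body shown above, and the attempt count and delay are arbitrary defaults.

```python
import json
import time
import urllib.error
import urllib.request


def is_healthy(body: bytes) -> bool:
    """True if the body is a JSON object whose "status" is "ok"."""
    try:
        return json.loads(body).get("status") == "ok"
    except (ValueError, AttributeError):
        return False  # not JSON, or JSON that is not an object


def wait_until_healthy(url: str, attempts: int = 10, delay: float = 1.0) -> bool:
    """Poll `url` until it reports healthy or the attempts run out."""
    for _ in range(attempts):
        try:
            with urllib.request.urlopen(url, timeout=5) as resp:
                if resp.status == 200 and is_healthy(resp.read()):
                    return True
        except (urllib.error.URLError, OSError):
            pass  # not reachable yet; retry after the delay
        time.sleep(delay)
    return False
```

Usage: wait_until_healthy("http://localhost:8080/health") returns False if the instance never becomes healthy within the given attempts.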

Logging

Logs are written to stdout in JSON format. View them with the commands below, or forward them to a log aggregator:

# systemd journal
journalctl -u cornucopia -f

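# Container logs (use the container name or ID shown by docker ps)
docker logs -f <container>

# Kubernetes (use deployment/cornucopia for the Deployment variant)
kubectl logs -f statefulset/cornucopia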

Metrics

There are no built-in metrics yet (a Prometheus exporter may be added in the future).

For now, monitor via:

  • Request latency (see logs)
  • Cache disk usage: du -sh /var/lib/cornucopia/cache
  • Memory usage: ps -C cornucopia -o pid,rss,cmd (Linux; avoids matching the grep process itself)
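
Until metrics exist, the disk-usage check is easy to script, for example from cron. The sketch below sums apparent file sizes (so the figure can differ slightly from du, which counts disk blocks) and flags the cache when it passes a fraction of the volume size. The 80% threshold is an arbitrary assumption.

```python
import os


def dir_size_bytes(root: str) -> int:
    """Sum the sizes of all regular files under `root`, skipping symlinks."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                if not os.path.islink(path):
                    total += os.path.getsize(path)
            except OSError:
                pass  # file vanished between listing and stat
    return total


def over_threshold(used_bytes: int, capacity_bytes: int, fraction: float = 0.8) -> bool:
    """True once usage reaches `fraction` of the volume capacity."""
    return used_bytes >= capacity_bytes * fraction


if __name__ == "__main__":
    used = dir_size_bytes("/var/lib/cornucopia/cache")
    capacity = 100 * 1024 ** 3  # 100Gi, as in the volume claims above
    print(used, "bytes used; over threshold:", over_threshold(used, capacity))
```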

Backup & Recovery

Backup Cache

The cache can be recreated from PyPI, so backing it up is optional. To avoid re-downloading everything, however:

# Tar entire cache
tar -czf cornucopia-cache-$(date +%s).tar.gz -C /var/lib/cornucopia cache/

# Or sync to S3
aws s3 sync /var/lib/cornucopia/cache s3://my-backup-bucket/cornucopia-cache/

Recovery

# Restore from one specific backup (a * glob breaks if several archives match)
tar -xzf cornucopia-cache-<timestamp>.tar.gz -C /var/lib/cornucopia/

# Or sync from S3
aws s3 sync s3://my-backup-bucket/cornucopia-cache/ /var/lib/cornucopia/cache/
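
When several tar backups exist, the newest one can be picked from the Unix timestamp embedded in the file name (the $(date +%s) part of the backup command). A small sketch; latest_backup is a hypothetical helper.

```python
import re
from typing import Iterable, Optional

# Matches names produced by: cornucopia-cache-$(date +%s).tar.gz
_BACKUP_RE = re.compile(r"^cornucopia-cache-(\d+)\.tar\.gz$")


def latest_backup(names: Iterable[str]) -> Optional[str]:
    """Return the archive with the largest embedded timestamp, or None."""
    best, best_ts = None, -1
    for name in names:
        match = _BACKUP_RE.match(name)
        if match and int(match.group(1)) > best_ts:
            best, best_ts = name, int(match.group(1))
    return best
```

For example, latest_backup(os.listdir("/backups")) (with os imported and /backups standing in for wherever the archives are kept) gives the file name to pass to tar -xzf.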

Troubleshooting

High Memory Usage

Memory use is roughly a 50 MB baseline plus the size of the in-memory cache. If it exceeds your limits:

  • Disable in-memory cache: enable_memory: false
  • Reduce metadata TTL: metadata_ttl: 1m
  • Delete old cache: rm -rf ./cache/meta (adjust the path if your cache dir is not ./cache)
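
The rule of thumb above can be turned into a quick headroom estimate against a memory limit. This is plain arithmetic on the approximate 50 MB baseline, not a measurement; the 512 figure in the usage note is the container memory limit from the Kubernetes example, with the MB/MiB difference ignored.

```python
BASELINE_MB = 50  # approximate baseline from the rule of thumb


def estimated_memory_mb(memory_cache_mb: float) -> float:
    """Estimated resident memory: baseline plus the cache-size term."""
    return BASELINE_MB + memory_cache_mb


def headroom_mb(memory_cache_mb: float, limit_mb: float) -> float:
    """Remaining memory under `limit_mb`; negative means over the limit."""
    return limit_mb - estimated_memory_mb(memory_cache_mb)
```

headroom_mb(300, 512) gives 162, so a 300 MB cache fits under a 512 MB limit, while headroom_mb(500, 512) is negative and the instance would likely be killed.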

Slow Package Indexes

If /simple/ is slow:

  • Ensure enable_memory: true
  • Increase cache TTL: metadata_ttl: 30m
  • Check disk I/O: iostat -x 1

Upload Failures

  • Check auth token is correct
  • Check max_upload_size setting
  • Check disk space: df -h
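
The size and disk-space checks can be run before an upload is attempted. A sketch under assumptions: the default limit is the 104857600 bytes from the example config, and whether a file of exactly that size is accepted by the server is not specified, so it is allowed here.

```python
import os
import shutil
from typing import List


def check_upload(size_bytes: int, max_upload_size: int, free_bytes: int) -> List[str]:
    """Return a list of likely reasons an upload of this size would fail."""
    problems = []
    if size_bytes > max_upload_size:
        problems.append("file is larger than max_upload_size")
    if size_bytes > free_bytes:
        problems.append("not enough free disk space")
    return problems


def preflight(file_path: str, cache_dir: str,
              max_upload_size: int = 104857600) -> List[str]:
    """Check a real file against the limit and the free space in cache_dir."""
    return check_upload(os.path.getsize(file_path), max_upload_size,
                        shutil.disk_usage(cache_dir).free)
```

An empty list from preflight("dist/pkg-1.0.tar.gz", "/var/lib/cornucopia/cache") (both paths are illustrative) points toward the auth token as the next thing to check.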

Upstream Timeouts

  • Increase pypi.timeout: e.g., timeout: 60s
  • Increase max_retries: e.g., max_retries: 5
  • Check network: ping pypi.org, or curl -I https://pypi.org/simple/ if ICMP is blocked
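
Before raising these values, it helps to know how long a client may wait in the worst case and how fast the upstream actually responds. The sketch below assumes retries come on top of the first attempt and that there is no backoff between them; neither is confirmed by this guide. probe_upstream times one request to the PyPI JSON API for a project (pip is used as an arbitrary example).

```python
import time
import urllib.request


def worst_case_wait_seconds(timeout_s: float, max_retries: int) -> float:
    """Upper bound if every attempt times out: one try plus the retries."""
    return timeout_s * (1 + max_retries)


def probe_upstream(json_api_url: str = "https://pypi.org/pypi",
                   project: str = "pip", timeout: float = 30.0) -> float:
    """Return the seconds taken to fetch <json_api_url>/<project>/json."""
    url = f"{json_api_url.rstrip('/')}/{project}/json"
    start = time.monotonic()
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        resp.read()
    return time.monotonic() - start
```

Under those assumptions the defaults (30s, 3 retries) allow up to 120 seconds per upstream call, and timeout: 60s with max_retries: 5 allows up to 360 seconds.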