Convoy depends on Redis for the task queue, rate limiting, circuit breaker state, and caching. Because the task queue and circuit breaker state must survive restarts, Redis persistence and high availability matter for production deployments. This guide covers the recommended approach, a cloud-managed Redis service, and an alternative path for teams that choose to self-host Redis with Sentinel for high availability.
How Convoy Uses Redis
| Function | Description | Data Characteristics |
|---|---|---|
| Task Queue (Asynq) | All event processing jobs are enqueued and dequeued through Redis | High write throughput, must survive restarts |
| Rate Limiting | Per-endpoint rate limit counters | High read/write, short-lived keys |
| Circuit Breaker | Tracks endpoint health for circuit breaking | Moderate read/write, must survive restarts |
| Caching | Project, endpoint, subscription lookups | High read, tolerates loss on failover |
Recommended: Cloud-Managed Redis
This is the recommended approach for production deployments. A managed Redis service handles the hardest parts of running Redis at scale — HA, failover, persistence, security patching, backups, and monitoring — so you can focus on running Convoy.
- Automatic failover with multi-AZ replicas
- Persistence and backups without you managing RDB/AOF directly
- Security patching applied without downtime
- TLS-encrypted endpoints out of the box
- Monitoring and alerting built in
- Memory and CPU autoscaling on most providers
Providers
| Provider | Mode | Notes |
|---|---|---|
| AWS ElastiCache | Cluster mode or replication group with primary endpoint | Replication groups support automatic failover; cluster mode is also supported by Convoy |
| GCP Memorystore | Standalone or HA tier | HA tier provides automatic failover |
| Azure Managed Redis / Azure Cache for Redis | Standalone, Replication, or Cluster | Premium and Enterprise tiers offer HA |
| Aiven for Redis | HA across nodes | Simple connection endpoint |
| Upstash | Serverless Redis | Single endpoint, automatic HA |
| Redis Cloud (Redis Inc.) | Managed by the maintainers | Sentinel and cluster options |
Configuring Convoy for a Managed Service
For most managed services, you connect to a single primary endpoint, and the provider handles failover transparently. Set the connection details through environment variables or convoy.json.
Use the rediss scheme for TLS-enabled endpoints (most managed services require TLS).
If your provider exposes Sentinels (e.g., Bitnami Sentinel on Kubernetes, some managed offerings), use the redis-sentinel scheme instead and point Convoy at the Sentinels, as described under Configuring Convoy to Use Sentinel.
Self-Hosted Redis with Sentinel
Self-hosting Redis in HA mode means you are responsible for Sentinel monitoring and failover, persistence configuration, memory management, network partitions, security, and backups. Misconfigured persistence can cause data loss; an OOM kill can drop the entire task queue. If any of the above is unfamiliar, we strongly recommend using a managed Redis service.
Recommended Architecture: 3-Node Sentinel
For most self-hosted Convoy workloads, a 3-node Redis deployment with Sentinel provides the right balance of HA and operational simplicity. Redis Cluster is unnecessary at this scale.
Why 3 Nodes?
- Sentinel requires a quorum (majority vote) to trigger failover.
- With 3 Sentinels, 1 node can fail and failover still works (2 of 3 agree).
- With 2 nodes, losing 1 means no quorum and no automatic failover.
- With 1 node, there is no HA at all.
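The quorum arithmetic above can be sketched in a few lines. This is an illustrative calculation only: the function names are hypothetical, and it assumes the failover vote needs a strict majority of the deployed Sentinels, as the list describes.

```python
def sentinel_majority(sentinel_count: int) -> int:
    """Smallest number of Sentinels that forms a strict majority."""
    if sentinel_count < 1:
        raise ValueError("sentinel_count must be at least 1")
    return sentinel_count // 2 + 1


def tolerable_failures(sentinel_count: int) -> int:
    """How many Sentinels can be lost while a majority can still vote."""
    return sentinel_count - sentinel_majority(sentinel_count)


for nodes in (1, 2, 3, 5):
    print(
        f"{nodes} Sentinel(s): majority={sentinel_majority(nodes)}, "
        f"can lose {tolerable_failures(nodes)}"
    )
```

With 3 Sentinels the majority is 2, so one node can fail. With 1 or 2 Sentinels no node can be lost, which matches the reasoning above.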
VM Specifications
Redis is single-threaded for command processing, so it does not benefit from many CPU cores. Memory and network are the primary resources.

| Node | Role | vCPU | RAM | Disk | Count |
|---|---|---|---|---|---|
| Redis Primary | Primary + Sentinel | 2-4 | 8 GB | 20 GB SSD | 1 |
| Redis Replica | Replica + Sentinel | 2-4 | 8 GB | 20 GB SSD | 2 |
| Total | | 6-12 | 24 GB | 60 GB SSD | 3 |
- 2-4 vCPU: Redis is single-threaded for commands, but background tasks (persistence, replication) use additional threads. 2 vCPU is the minimum; 4 gives headroom for RDB snapshots and AOF rewrites.
- 8 GB RAM: Redis stores everything in memory. Set maxmemory to ~75% of RAM (6 GB on an 8 GB VM), leaving room for OS buffers, persistence operations, and replication buffers.
- 20 GB SSD: Required for RDB snapshots and AOF files. SSD matters: HDD will cause latency spikes during persistence.
Scaling: For workloads above ~10M requests/day, increase RAM to 16 GB per node. CPU and disk requirements rarely need to change.
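The sizing rules above reduce to simple arithmetic. The sketch below is illustrative: the helper names are hypothetical, the 75% fraction and the ~10M requests/day threshold come from this guide, and the result is printed in the form of a maxmemory directive.

```python
GIB = 1024 ** 3  # bytes in one gibibyte


def recommended_maxmemory_bytes(ram_gb: float, fraction: float = 0.75) -> int:
    """Return the maxmemory value (in bytes) for a VM with ram_gb of RAM."""
    if ram_gb <= 0:
        raise ValueError("ram_gb must be positive")
    if not 0 < fraction < 1:
        raise ValueError("fraction must be between 0 and 1")
    return int(ram_gb * GIB * fraction)


def recommended_ram_gb(requests_per_day: int) -> int:
    """Per-node RAM following the scaling note: 16 GB above ~10M requests/day."""
    return 16 if requests_per_day > 10_000_000 else 8


for load in (2_000_000, 25_000_000):
    ram = recommended_ram_gb(load)
    limit_gb = recommended_maxmemory_bytes(ram) // GIB
    print(f"{load} requests/day -> {ram} GB RAM, maxmemory {limit_gb}gb")
```

For an 8 GB node this yields a 6 GB limit, and for a 16 GB node a 12 GB limit.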
Step 1: Install Redis on All 3 Nodes
Install the Redis server package on each VM.
Step 2: Configure the Primary Node (Node 1)
Edit /etc/redis/redis.conf on Node 1. The settings are grouped by concern:
- Network and Authentication
- Memory
- RDB Snapshots
- AOF (Append Only File)
- Performance
- Logging
Step 3: Configure Replica Nodes (Node 2 and Node 3)
Use the same configuration as the primary, plus the replication settings in /etc/redis/redis.conf (replicas only).
Replace <primary_node_ip> with the private IP address of Node 1.
Step 4: Configure Sentinel on All 3 Nodes
Create /etc/redis/sentinel.conf on every node.
Replace <primary_node_ip> with the private IP of Node 1.
Step 5: Apply Linux Kernel Settings
Apply the kernel settings on all 3 nodes.
Step 6: Start Services
Start order matters: bring up the primary first, then the replicas, then start Sentinels on all nodes.
Step 7: Verify the Setup
Check replication on the primary with INFO replication. The output should include role:master and connected_slaves:2.
Check Sentinel state on any node. SENTINEL ckquorum convoy-redis should return OK.
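The replication check can also be scripted. The sketch below parses text in the key:value format that INFO prints and confirms the two fields named above. The helper names and the sample output are illustrative assumptions, not part of Convoy or Redis tooling.

```python
from typing import Dict


def parse_info(text: str) -> Dict[str, str]:
    """Parse 'key:value' lines from INFO output, skipping '# Section' headers."""
    fields: Dict[str, str] = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition(":")
        fields[key] = value
    return fields


def primary_is_healthy(info_text: str, expected_replicas: int = 2) -> bool:
    """True when the node reports role:master with the expected replica count."""
    fields = parse_info(info_text)
    replicas = fields.get("connected_slaves", "")
    return (
        fields.get("role") == "master"
        and replicas.isdigit()
        and int(replicas) == expected_replicas
    )


# Hypothetical sample of what INFO replication might print on the primary.
sample = "# Replication\nrole:master\nconnected_slaves:2\n"
print(primary_is_healthy(sample))
```

In practice, you would feed the function the captured output of the INFO replication command from Node 1.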
Test failover before going to production:
Configuring Convoy to Use Sentinel
Point Convoy at the Sentinel endpoints, not at the primary directly. You can set this through environment variables or convoy.json.
The master_name value must match the cluster name in sentinel monitor <name> from Step 4. Restart all Convoy services after making this change.
See the Redis Sentinel configuration reference for all available parameters, including separate Sentinel authentication.
Monitoring
Connect to any Redis node and check the following metrics:

| Metric | Source | Healthy | Action |
|---|---|---|---|
| used_memory | INFO memory | < 75% of maxmemory | If approaching the limit, investigate queue backlog or raise maxmemory |
| connected_clients | INFO clients | Stable, not growing unbounded | Investigate connection leaks if growing |
| instantaneous_ops_per_sec | INFO stats | Matches expected workload | Baseline the value; spikes may signal trouble |
| rejected_connections | INFO stats | 0 | Increase maxclients if non-zero |
| rdb_last_bgsave_status | INFO persistence | ok | If err, check disk space and logs |
| aof_last_bgrewrite_status | INFO persistence | ok | If err, check disk space and logs |
| master_link_status (replicas) | INFO replication | up | If down, check network between nodes |
| master_last_io_seconds_ago (replicas) | INFO replication | < 10 | High values indicate replication lag |
Alert on the following conditions:
- Redis down: redis-cli ping does not return PONG
- Memory > 80%: used_memory / maxmemory
- Replication lag > 30s: master_last_io_seconds_ago on replicas
- Sentinel quorum lost: SENTINEL CKQUORUM convoy-redis does not return OK
- Persistence failures: rdb_last_bgsave_status or aof_last_bgrewrite_status is err
- Disk usage > 80% on /var/lib/redis
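The alert conditions above can be expressed as a small rule set. This sketch assumes you already collect the values through your own monitoring. The NodeSnapshot type, its field names, and the alert labels are hypothetical; only the thresholds come from this guide.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class NodeSnapshot:
    ping_ok: bool                               # redis-cli ping returned PONG
    used_memory: int                            # bytes, from INFO memory
    maxmemory: int                              # bytes, configured limit
    master_last_io_seconds_ago: Optional[int]   # None on the primary
    quorum_ok: bool                             # SENTINEL CKQUORUM returned OK
    rdb_last_bgsave_status: str
    aof_last_bgrewrite_status: str
    disk_used_fraction: float                   # usage of /var/lib/redis, 0.0 to 1.0


def evaluate_alerts(s: NodeSnapshot) -> List[str]:
    """Return the names of the alert conditions that currently fire."""
    alerts: List[str] = []
    if not s.ping_ok:
        alerts.append("redis_down")
    if s.maxmemory > 0 and s.used_memory / s.maxmemory > 0.80:
        alerts.append("memory_high")
    lag = s.master_last_io_seconds_ago
    if lag is not None and lag > 30:
        alerts.append("replication_lag")
    if not s.quorum_ok:
        alerts.append("sentinel_quorum_lost")
    if "err" in (s.rdb_last_bgsave_status, s.aof_last_bgrewrite_status):
        alerts.append("persistence_failure")
    if s.disk_used_fraction > 0.80:
        alerts.append("disk_high")
    return alerts


snapshot = NodeSnapshot(True, 5_500_000_000, 6_442_450_944, 45, True, "ok", "err", 0.4)
print(evaluate_alerts(snapshot))
```

The example snapshot trips the memory, replication lag, and persistence rules, because 5.5 GB of a 6 GB limit is above 80%.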
Troubleshooting
LOADING Redis is loading the dataset in memory
Redis is recovering from RDB or AOF files after a restart. This is normal — duration is proportional to dataset size. Convoy will fail to connect during this period. For typical Convoy workloads, loading takes a few seconds.
OOM command not allowed when used memory > maxmemory
Redis is out of memory. New jobs cannot be enqueued, which is critical for Convoy.
- Inspect memory: redis-cli -a your_redis_password INFO memory
- Check the Asynq queue depth: redis-cli -a your_redis_password LLEN "asynq:{EventQueue}:pending"
- If the queue is backed up, investigate why workers are not processing (check Convoy worker logs)
- If memory needs are genuinely higher, raise maxmemory and add RAM to each node
READONLY You can't write against a read only replica
Convoy is connecting to a replica instead of the current primary. Common causes:
- A Sentinel failover happened and Convoy is using a stale primary address. When using the redis-sentinel scheme, the client discovers the current primary via Sentinel; ensure CONVOY_REDIS_SENTINEL_MASTER_NAME matches the cluster name in sentinel monitor <name>.
- Convoy is configured with the standalone scheme (redis) pointed directly at a replica. Use the redis-sentinel scheme for HA setups so the client always finds the primary.
Sentinel reports sdown but no failover triggered
Only one Sentinel detected the outage (subjective down), so the quorum of 2 was not met. Check network connectivity between all 3 Sentinel nodes and verify port 26379 is reachable from each node to the others. Then run SENTINEL CKQUORUM convoy-redis on a Sentinel node; it should return OK. If it does not, fix Sentinel reachability before relying on automatic failover.
Replicas falling behind on replication
Signs: high master_last_io_seconds_ago, replication backlog filling up, or master_link_status: down.
- Check network bandwidth between primary and replicas
- Check disk I/O on the primary, since AOF rewrites and RDB saves compete for I/O
- Increase the replication backlog if reconnections are frequent: repl-backlog-size 256mb
- Confirm replicas are not running in a memory-constrained state
Convoy fails to start with redis-sentinel scheme
Verify the basics:
- CONVOY_REDIS_HOST is a comma-separated list of Sentinel hostnames or IPs (not the Redis primary)
- CONVOY_REDIS_PORT is the Sentinel port, typically 26379
- CONVOY_REDIS_SENTINEL_MASTER_NAME matches the name in sentinel monitor <name> exactly
- CONVOY_REDIS_PASSWORD is the Redis password (used after Sentinel discovers the primary)
- If your Sentinels require authentication, set CONVOY_REDIS_SENTINEL_PASSWORD separately
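The checklist above can be run as a quick pre-flight check before starting Convoy. This is a rough sketch rather than anything Convoy ships: the function name, the example IP addresses, and the heuristic that flags port 6379 are assumptions, while the variable names are the ones listed above.

```python
from typing import List, Mapping


def check_sentinel_env(env: Mapping[str, str], monitor_name: str = "convoy-redis") -> List[str]:
    """Return human-readable problems found in the Sentinel-related settings."""
    problems: List[str] = []

    hosts = [h.strip() for h in env.get("CONVOY_REDIS_HOST", "").split(",")]
    if not all(hosts):
        problems.append("CONVOY_REDIS_HOST must list Sentinel hosts, comma-separated, with no empty entries")

    port = env.get("CONVOY_REDIS_PORT", "")
    if not port.isdigit():
        problems.append("CONVOY_REDIS_PORT must be numeric (Sentinel port, typically 26379)")
    elif port == "6379":
        # Heuristic only: 6379 is usually the Redis data port, not Sentinel.
        problems.append("CONVOY_REDIS_PORT is 6379, which is usually the Redis port, not the Sentinel port")

    if env.get("CONVOY_REDIS_SENTINEL_MASTER_NAME", "") != monitor_name:
        problems.append("CONVOY_REDIS_SENTINEL_MASTER_NAME does not match the sentinel monitor name")

    if not env.get("CONVOY_REDIS_PASSWORD"):
        problems.append("CONVOY_REDIS_PASSWORD is not set")

    return problems


# Hypothetical values for illustration only.
example = {
    "CONVOY_REDIS_HOST": "10.0.0.1,10.0.0.2,10.0.0.3",
    "CONVOY_REDIS_PORT": "26379",
    "CONVOY_REDIS_SENTINEL_MASTER_NAME": "convoy-redis",
    "CONVOY_REDIS_PASSWORD": "your_redis_password",
}
print(check_sentinel_env(example) or "no problems found")
```

To check a live environment, pass os.environ in place of the example dictionary, with the monitor name taken from your sentinel.conf.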