Replication is enabled for all paid Upstash databases. The data is replicated to multiple instances. Replication provides high availability and better scalability.

High Availability

Replication makes your database resilient to failures, because even if one of the replicas is unavailable, your database continues to work.

There are two types of replicas in Upstash Redis: primary replicas and read replicas. Primary replicas handle both reads and writes, while read replicas are used only for reads. In all database types, multiple primary replicas ensure that if one fails, your database continues to function.

In free and pay-as-you-go databases, there is one read replica per region. If it fails, your database remains operational, and you can still read from the primary replicas, though with higher latency. In Pro and Enterprise databases, multiple read replicas ensure that if one fails, your database remains operational, and you can read from the other read replicas without any additional latency.
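Replication and failover are handled on the server side, so your application code does not change. The following is a minimal sketch assuming the @upstash/redis TypeScript client; the REST URL and token are placeholders for your own database credentials.

```typescript
import { Redis } from "@upstash/redis";

// Connect with your database's REST URL and token (placeholders below).
const redis = new Redis({
  url: "https://<your-database>.upstash.io",
  token: "<your-token>",
});

// Writes are handled by a primary replica; reads may be served by any
// available replica. If one replica fails, requests keep succeeding.
await redis.set("greeting", "hello");
const value = await redis.get("greeting");
console.log(value); // "hello"
```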

Better Scalability

In a replicated database, your requests are evenly distributed among the replicas using a round-robin approach. As your throughput requirements grow, additional replicas can be added to the cluster to handle the increased workload and maintain high performance. This scalability feature ensures that your database can effectively meet the demands of high throughput scenarios.
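To illustrate the idea, here is a conceptual round-robin selector in TypeScript. This is not Upstash's internal routing code; the `Replica` type and replica names are made up for the example.

```typescript
// Conceptual sketch of round-robin request distribution (not Upstash internals).
type Replica = { name: string };

class RoundRobinRouter {
  private next = 0;

  constructor(private replicas: Replica[]) {}

  // Each call returns the next replica in circular order,
  // spreading requests evenly across all replicas.
  pick(): Replica {
    const replica = this.replicas[this.next];
    this.next = (this.next + 1) % this.replicas.length;
    return replica;
  }
}

const router = new RoundRobinRouter([
  { name: "replica-1" },
  { name: "replica-2" },
  { name: "replica-3" },
]);
console.log(router.pick().name); // replica-1
console.log(router.pick().name); // replica-2
```

Adding another replica simply extends the rotation, which is how extra replicas absorb increased throughput.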

Architecture

We use the single leader replication model. Each key is owned by a leader replica, and the other replicas act as backups of the leader. Writes on a key are processed by the leader replica first and then propagated to the backup replicas. Reads can be performed from any replica. This model gives better write consistency and read scalability.
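The write and read paths can be sketched as follows. This is a simplified conceptual model in TypeScript, not the actual replication protocol; the class and method names are illustrative.

```typescript
// Simplified model of single-leader replication (illustrative only).
class Replica {
  private data = new Map<string, string>();

  apply(key: string, value: string): void {
    this.data.set(key, value);
  }

  read(key: string): string | undefined {
    return this.data.get(key);
  }
}

class Leader extends Replica {
  constructor(private backups: Replica[]) {
    super();
  }

  // Writes are processed by the leader first, then propagated to the backups.
  write(key: string, value: string): void {
    this.apply(key, value);
    for (const backup of this.backups) {
      backup.apply(key, value);
    }
  }
}

const backups = [new Replica(), new Replica()];
const leader = new Leader(backups);

leader.write("greeting", "hello");        // processed by the leader, then replicated
console.log(backups[0].read("greeting")); // reads can be served from any replica
```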

Consistency

Each replica in the cluster uses a failure detector to monitor the status of the leader replica. If the leader replica fails, the remaining replicas initiate a leader election to select a new leader. This election round is the only unavailability window for the cluster, during which your requests may be briefly blocked.

However, once a new leader is elected, normal operations resume, ensuring the continued availability of the cluster. This mechanism minimizes any unavailability caused by leader failure and allows the cluster to recover quickly and resume processing requests.
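As a rough illustration of the failure-detection idea, the sketch below uses a heartbeat timeout. The actual election protocol is more involved; the names, timeout value, and election placeholder here are purely illustrative.

```typescript
// Illustrative heartbeat-based failure detector (not the actual protocol).
const HEARTBEAT_TIMEOUT_MS = 1000; // assumed value for illustration

class FailureDetector {
  private lastHeartbeat = Date.now();

  // Called whenever a heartbeat arrives from the current leader.
  recordHeartbeat(): void {
    this.lastHeartbeat = Date.now();
  }

  // If no heartbeat has been seen within the timeout, the leader is
  // suspected to have failed and a new leader election is triggered.
  leaderSuspected(): boolean {
    return Date.now() - this.lastHeartbeat > HEARTBEAT_TIMEOUT_MS;
  }
}

const detector = new FailureDetector();
if (detector.leaderSuspected()) {
  // Requests may be briefly blocked here until a new leader is elected,
  // after which normal operations resume.
  startLeaderElection(); // placeholder for the election procedure
}

function startLeaderElection(): void {
  /* elect a new leader among the remaining replicas */
}
```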