# Parallelism and Rate Limit for Workflow And QStash

> **Source:** https://upstash.com/blog/QStash-rateLimit
> **Date:** 2025-02-20
> **Author(s):** Sancar Koyunlu
> **Reading time:** 5 min read
> **Tags:** qstash, workflow, announcement
> **Format:** text/markdown — machine-readable content for agents and LLMs

---

## Parallelism and Rate Limits for Workflow and QStash  

We are excited to announce the release of **Flow-Control**, a new feature that lets you set **Rate** and **Parallelism** limits for QStash Publish and Workflow.  

This blog is divided into four sections for easier reading:  
- [Motivation](#motivation): Why we developed Flow-Control.  
- [How It Works](#how-it-works): Understanding rate limiting and parallelism.  
- [How to Use It](#how-to-use-it): Practical examples of implementing Flow-Control in QStash and Workflow.  
- [What Happened to Queues](#what-happened-to-queues): Differences between Queues and Flow-Control.  

## Motivation  

Our community drives our development. We’ve heard your feedback on two key points:  

- **Workflow**: Our [Workflow](https://upstash.com/docs/workflow/getstarted) product is gaining traction. However, it launched without built-in rate or parallelism limits. The existing parallelism control via Queues was not suitable for Workflow.  
- **Queue Limitations**: Many users relied on Queue Parallelism to prevent bursts on their endpoints. However, queues had per-plan limits due to memory/CPU allocation constraints. The new design removes these restrictions, allowing you to configure limits based on your application’s needs without worrying about plan limits.  

To solve these issues, we developed Flow-Control, which works for both Workflow and QStash.  

## How It Works  

- **Rate Limit**: This defines the maximum number of calls per second. Calls that share the same `flowControl` key count toward the same rate. Instead of rejecting excess calls, QStash delays them and executes them in later seconds, so the limit is respected.  
- **Parallelism Limit**: This controls the number of concurrent executions. Unlike rate limiting, execution duration matters: at no time will there be more than the specified number of active calls. If the limit is reached, further publishes wait until an active call finishes and frees a slot.
- **Using Rate and Parallelism Together**: Both parameters can be combined. For example, with a rate of 10 per second and a parallelism of 20, if each request takes a minute to complete, QStash triggers 10 calls in the first second and another 10 in the next. At that point 20 calls are active and none has finished, so the parallelism limit is reached and QStash waits until one completes before triggering another.  
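
The interaction between the two limits can be sketched with a small simulation. This is only a toy model, not how QStash is implemented: it assumes all calls are published at second 0, every call takes the same time, and limits are whole numbers checked once per second. The names `simulateStartTimes` and `FlowControlLimits` are made up for this illustration.

```ts
// Toy model: all calls are published at second 0 and take `durationSec` to finish.
interface FlowControlLimits {
  parallelism: number; // max calls active at the same time
  ratePerSecond: number; // max calls started within one second
}

// Returns the second at which each call is started.
function simulateStartTimes(
  totalCalls: number,
  durationSec: number,
  limits: FlowControlLimits,
): number[] {
  if (
    limits.parallelism < 1 ||
    limits.ratePerSecond < 1 ||
    durationSec <= 0 ||
    !isFinite(durationSec)
  ) {
    throw new Error("limits and duration must be positive and finite");
  }
  const startTimes: number[] = [];
  for (let second = 0; startTimes.length < totalCalls; second++) {
    // A call started at `s` is still active while s + durationSec > second.
    const active = startTimes.filter((s) => s + durationSec > second).length;
    const freeSlots = limits.parallelism - active;
    const remaining = totalCalls - startTimes.length;
    // Excess calls are not rejected; they simply wait for a later second.
    const toStart = Math.min(limits.ratePerSecond, freeSlots, remaining);
    for (let i = 0; i < toStart; i++) {
      startTimes.push(second);
    }
  }
  return startTimes;
}

// The example from the text: rate 10 per second, parallelism 20, one-minute calls.
const starts = simulateStartTimes(25, 60, { parallelism: 20, ratePerSecond: 10 });
const perSecond: Record<number, number> = {};
starts.forEach((s) => {
  perSecond[s] = (perSecond[s] || 0) + 1;
});
console.log(perSecond); // 10 calls start at second 0, 10 at second 1, 5 at second 60
```

In this model the 21st call has to wait until second 60, when the first batch finishes and frees slots, even though the rate limit alone would have allowed it at second 2.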

## How to Use It  

### QStash  

Here’s an example of using Flow-Control with QStash:  

```js
import { Client } from "@upstash/qstash";

const client = new Client({ token: "<QStash_TOKEN>" });

await client.publishJSON({
  url: "https://my-api...",
  body: { hello: "world" },
  flowControl: { key: "app1", parallelism: 3, ratePerSecond: 10 },
});
```

For more details and other SDKs, see the documentation [here](https://upstash.com/docs/qstash/features/flowcontrol).  

### Workflow  

There are two main use cases for Flow-Control in Workflow:  
1. [Limiting the Workflow Environment](#limiting-the-workflow-environment): Controlling the execution environment.  
2. [Limiting External API Calls](#limiting-external-api-calls): Preventing excessive requests to external services.  

#### Limiting the Workflow Environment  

To limit the execution environment, configure `flowControl` in both the `serve` and `trigger` methods.
Once configured, every step of the workflow respects the limits.
Because of how the Workflow SDK works, QStash calls the `serve` method multiple times during a workflow run. To keep the deployed environment within its limits,
the configured limits therefore apply to all calls going from QStash servers to the `serve` method.

Note that if multiple workflows run in the same environment, their steps can interleave, but the overall rate and parallelism limits are still respected
as long as they share the same `flowControl` key.

- **In the `serve` method**:  

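```ts
import { serve } from "@upstash/workflow/nextjs";

export const { POST } = serve<string>(
  async (context) => {
    await context.run("step-1", async () => {
      // `someWork` is a placeholder for your own logic
      return someWork();
    });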
  },
  {
    flowControl: { key: "app1", parallelism: 3, ratePerSecond: 10 }
  }
);
```

For more details, see the documentation [here](https://upstash.com/docs/workflow/basics/serve#flowcontrol).

- **In the `trigger` method**:  

```js
import { Client } from "@upstash/workflow";

const client = new Client({ token: "<QStash_TOKEN>" });
const { workflowRunId } = await client.trigger({
  url: "https://workflow-endpoint.com",
  body: "hello there!",
  flowControl: { key: "app1", parallelism: 3, ratePerSecond: 10 }
});
```

For more details, see the documentation [here](https://upstash.com/docs/workflow/basics/client#trigger-workflow).  
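
Since the same settings appear in both `serve` and `trigger`, one option is to define them once and reuse the object in both places. The sketch below is only a suggestion, assuming you want identical limits in both calls; the names `appFlowControl`, `serveOptions`, and `triggerOptions` are hypothetical, and the values are copied from the examples above.

```ts
// Hypothetical shared settings, defined once (for example in their own module).
const appFlowControl = Object.freeze({
  key: "app1",
  parallelism: 3,
  ratePerSecond: 10,
});

// Would be passed as the second argument: serve(handler, serveOptions)
const serveOptions = { flowControl: appFlowControl };

// Would be passed as: client.trigger(triggerOptions)
const triggerOptions = {
  url: "https://workflow-endpoint.com",
  body: "hello there!",
  flowControl: appFlowControl,
};
```

Keeping a single definition avoids the two configurations drifting apart when a limit is changed later.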

#### Limiting External API Calls  

To limit requests to an external API, pass `flowControl` to `context.call`:  

```ts
import { serve } from "@upstash/workflow/nextjs";

export const { POST } = serve<{ topic: string }>(async (context) => {
  const request = context.requestPayload;

  const response = await context.call(
    "generate-long-essay", 
    {
      url: "https://api.openai.com/v1/chat/completions",
      method: "POST",
      body: {/*****/},
      flowControl: { key: "app1", parallelism: 3, ratePerSecond: 10 }
    }
  );
});
```

For more details, see the documentation [here](https://upstash.com/docs/workflow/basics/context#context-call).  

## What Happened to Queues?  

Previously, parallelism control was managed through Queues. However, this approach was hard for our users to manage. The main issue was that a single failed message could block the queue.  
The reason is that QStash Queues were initially designed for FIFO use cases with a parallelism of one.
This means that the queue does not proceed to the next message until the current one has either been processed successfully or exhausted all of its configured retries.
While this design remains useful for strict FIFO requirements, it does not work well for rate-limiting and concurrency use cases.
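
The blocking behavior described above can be illustrated with a toy model of a strict FIFO queue with a parallelism of one. It is not QStash's implementation: the fixed retry delay, the fixed attempt duration, and all names (`QueuedMessage`, `fifoFirstAttemptTimes`) are assumptions made for this sketch.

```ts
interface QueuedMessage {
  id: string;
  failuresBeforeSuccess: number; // hypothetical: attempts that fail before one succeeds
}

// Returns the second at which each message gets its first delivery attempt
// in a strict FIFO queue that handles one message at a time.
function fifoFirstAttemptTimes(
  messages: QueuedMessage[],
  maxRetries: number,
  attemptSec: number,
  retryDelaySec: number,
): Record<string, number> {
  const firstAttemptAt: Record<string, number> = {};
  let clock = 0;
  for (const message of messages) {
    firstAttemptAt[message.id] = clock;
    // One initial attempt plus up to `maxRetries` retries.
    const attempts = Math.min(message.failuresBeforeSuccess, maxRetries) + 1;
    // The queue only moves on after the message succeeds or runs out of retries.
    clock += attempts * attemptSec + (attempts - 1) * retryDelaySec;
  }
  return firstAttemptAt;
}

const firstAttempts = fifoFirstAttemptTimes(
  [
    { id: "a", failuresBeforeSuccess: Infinity }, // never succeeds
    { id: "b", failuresBeforeSuccess: 0 },
    { id: "c", failuresBeforeSuccess: 0 },
  ],
  3, // maxRetries
  1, // seconds per attempt
  10, // fixed delay between retries (assumption)
);
console.log(firstAttempts); // { a: 0, b: 34, c: 35 }
```

In this model the healthy messages `b` and `c` are held back for more than half a minute by the single failing message `a`.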

With Flow-Control, rate and parallelism limits are applied without blocking new publishes due to failures.  

Queues are still available for FIFO (first-in, first-out) processing. If strict message ordering is required, use a queue with parallelism set to `1`.  
We plan to phase out the parallelism option of Queues once all users have migrated to Flow-Control.  
If you are using a queue parallelism greater than `1`, we recommend switching to Flow-Control.  

## Conclusion  

We hope this feature improves your experience! If you have any feedback or suggestions, join us on [Discord](https://upstash.com/discord) and let us know. We’re here to build what you need.