Vercel AI SDK Memory, RAG & Chat History with Redis

Upstash AgentKit provides building blocks for AI agents on Upstash Redis: memory, conversation history, caching, and RAG, with no separate vector database. The semantic features run on Upstash Redis Search and its $smart fuzzy operator. @upstash/agentkit-ai-sdk is the Vercel AI SDK adapter: drop-in helpers for generateText / streamText. Each helper's redis option defaults to Redis.fromEnv(), so most setups import only from this package.

Import	Feature
`createChatHistory`	Durable chat history on Redis Search — save, list, and `$smart`-search a user’s transcripts.
`createMemoryTools`	`recall_memory` + `save_memory` tools so the model reads and writes long-term memory.
`createSearchTools`	`search` / `aggregate` / `count` tools over a Redis Search index (this is how you do RAG).
`createRateLimit`	A configured Upstash Ratelimit to call before the model.
`cachedTools`	Memoize a map of AI SDK tools’ results in Redis.

npm install @upstash/agentkit-ai-sdk @upstash/redis ai

AgentKit reads UPSTASH_REDIS_REST_URL / UPSTASH_REDIS_REST_TOKEN from the environment by default. Pass your own @upstash/redis client as redis to any helper to override.

How to store chat history with the AI SDK

createChatHistory returns a Redis-backed ChatHistory<UIMessage>, the durable source of truth for your conversations. userId comes from your auth session; chatId is the useChat id that the client posts. Save the full transcript from your route’s onFinish:

// app/api/chat/route.ts
import { convertToModelMessages, createUIMessageStreamResponse, streamText, toUIMessageStream } from "ai";
import { createChatHistory } from "@upstash/agentkit-ai-sdk";

const history = createChatHistory();

export async function POST(req: Request) {
  const userId = await getSessionUserId(req); // your auth session, never a client-sent id
  const { id: chatId, messages } = await req.json(); // useChat posts its chat id + the full transcript

  const result = streamText({ model, messages: await convertToModelMessages(messages) });

  return createUIMessageStreamResponse({
    stream: toUIMessageStream({
      stream: result.stream,
      originalMessages: messages,
      onFinish: ({ messages }) =>
        history.saveChat({ userId, sessionId: chatId, messages, title: "New chat" }),
    }),
  });
}

To load a chat, take chatId from the page route and userId from the session, then seed useChat:

const chat = await history.getChat({ userId, sessionId: chatId }); // full transcript, or null
const chats = await history.listChats({ userId, limit: 50 }); // summaries, no messages
const hits = await history.searchChats({ userId, query: "headphones", target: "both", limit: 20 });
// client: useChat({ id: chatId, messages: chat?.messages ?? [] })

Config and how it's stored

createChatHistory({
  redis, // optional: defaults to Redis.fromEnv()
  prefix: "agentkit:chat", // optional: base key prefix
  indexName: "agentkit_chat", // optional: index name (defaults to the prefix)
  ttlSeconds: 60 * 60 * 24 * 30, // optional: per-chat TTL (default: no expiry)
});

Each chat is one JSON document at agentkit:chat:<userId>:<sessionId> (keyed per user, so two users can’t collide on a sessionId). It is indexed over userId and sessionId (filters) and over userMessages and modelMessages ($smart fuzzy text); the raw messages array is stored alongside, unindexed. saveChat overwrites the whole array (no delta merge), which fits useChat because the client posts the full conversation. Other methods: getChat / deleteChat ({ userId, sessionId }), listChats / searchChats ({ userId }).
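The storage model described above can be pictured with a small in-memory stand-in. This is a sketch under stated assumptions, not AgentKit's implementation: InMemoryChatStore, chatKey, and the ChatDoc shape are hypothetical names, and a Map replaces Redis. It illustrates only two documented behaviours: keys are scoped per user, and a save replaces the whole messages array.

```typescript
// Hypothetical in-memory model of the documented key layout; a Map stands in for Redis.
type Message = { role: "user" | "assistant"; text: string };

type ChatDoc = {
  userId: string;
  sessionId: string;
  userMessages: string; // text from user turns (the docs describe this field as searchable)
  modelMessages: string; // text from model turns (the docs describe this field as searchable)
  messages: Message[]; // raw transcript, stored but not indexed
};

// Builds "<prefix>:<userId>:<sessionId>", mirroring the documented key shape.
function chatKey(prefix: string, userId: string, sessionId: string): string {
  return `${prefix}:${userId}:${sessionId}`;
}

class InMemoryChatStore {
  private docs = new Map<string, ChatDoc>();
  private prefix: string;

  constructor(prefix = "agentkit:chat") {
    this.prefix = prefix;
  }

  // Replaces the whole document: nothing is merged with an earlier transcript.
  saveChat(userId: string, sessionId: string, messages: Message[]): void {
    const join = (role: Message["role"]) =>
      messages
        .filter((m) => m.role === role)
        .map((m) => m.text)
        .join("\n");
    this.docs.set(chatKey(this.prefix, userId, sessionId), {
      userId,
      sessionId,
      userMessages: join("user"),
      modelMessages: join("assistant"),
      messages: [...messages],
    });
  }

  // Returns the stored document, or null when this user has no such session.
  getChat(userId: string, sessionId: string): ChatDoc | null {
    return this.docs.get(chatKey(this.prefix, userId, sessionId)) ?? null;
  }
}
```

Because the key includes the user, the same sessionId saved by two users lands in two separate documents, and a second save for one user simply replaces that user's earlier transcript.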

Security — userId is the tenant boundary

Every method takes a single object; userId is required, non-empty, and may not contain :. Derive it from a verified server-side auth source — the subject/user id from your auth provider (Clerk, Auth.js/NextAuth, Supabase Auth, Auth0, …) — and never from a client-supplied header, query param, or body (read it from the session in your route). A chat can’t be read or overwritten under a different userId.
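A plausible reason for the no-colon rule is that ":" is also the key separator, so a userId containing it could alias another user's key. The sketch below illustrates that ambiguity; assertValidUserId and naiveKey are hypothetical helpers written for this example, not functions exported by AgentKit.

```typescript
// Hypothetical validator mirroring the documented rules: non-empty and no ":".
function assertValidUserId(userId: string): string {
  if (userId.length === 0) throw new Error("userId must be non-empty");
  if (userId.includes(":")) throw new Error('userId must not contain ":"');
  return userId;
}

// Key builder WITHOUT validation, used only to demonstrate the problem.
function naiveKey(userId: string, sessionId: string): string {
  return `agentkit:chat:${userId}:${sessionId}`;
}

// Two different (userId, sessionId) pairs collapse into one key.
const keyA = naiveKey("alice:admin", "s1");
const keyB = naiveKey("alice", "admin:s1");
console.log(keyA === keyB); // true: both are "agentkit:chat:alice:admin:s1"
```

Validating the id once, right after reading it from the session, keeps every later key unambiguous.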

How to add agent memory with the AI SDK

createMemoryTools returns recall_memory and save_memory tools, so the model can read and write its own long-term memory.

import { createMemoryTools } from "@upstash/agentkit-ai-sdk";
import { generateText, stepCountIs } from "ai";

const tools = createMemoryTools({ userId });

await generateText({ model, tools, stopWhen: stepCountIs(5), prompt: "What do you know about me?" });

Options and the userId tenant boundary

userId (required) — a string, or (input, options) => string.
redis — defaults to Redis.fromEnv().
topK — max memories recall returns.
minScore — BM25 relevance floor.
recallToolName / saveToolName — override the tool names.

userId is the only tenant boundary (required, non-empty, no :). Derive it from a verified server-side auth source (Clerk, Auth.js/NextAuth, Supabase Auth, Auth0, …) — never a client-supplied value. Memories are stored at agentkit:memory:<userId>:<id>.
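To make topK and minScore concrete, here is a minimal sketch of how the two options could combine on a list of scored hits. selectMemories and the ScoredMemory type are hypothetical, and treating the floor as inclusive (score >= minScore) is an assumption; the library's actual ranking happens inside Redis Search.

```typescript
// Hypothetical shape for a recalled memory with its relevance score.
type ScoredMemory = { id: string; text: string; score: number };

// Drops hits below the floor, orders the rest best-first, and keeps at most topK.
function selectMemories(hits: ScoredMemory[], topK: number, minScore: number): ScoredMemory[] {
  return hits
    .filter((hit) => hit.score >= minScore) // assumption: the floor is inclusive
    .sort((x, y) => y.score - x.score) // filter() returned a copy, so the input is untouched
    .slice(0, topK);
}

const recalled = selectMemories(
  [
    { id: "m1", text: "likes tea", score: 0.2 },
    { id: "m2", text: "lives in London", score: 1.5 },
    { id: "m3", text: "name is Ada", score: 3.1 },
    { id: "m4", text: "owns a bike", score: 0.9 },
  ],
  2,
  0.5,
);
console.log(recalled.map((m) => m.id)); // ["m3", "m2"]
```

A higher minScore trades recall for precision, while topK caps how much memory text ends up in the model's context.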

How to add RAG with the AI SDK

createSearchTools returns search / aggregate / count tools over an Upstash Redis Search index; the model-facing descriptions are generated from your schema.

import { s } from "@upstash/redis";
import { createSearchTools } from "@upstash/agentkit-ai-sdk";
import { generateText, stepCountIs } from "ai";

const schema = s.object({ name: s.string(), age: s.number(), city: s.string().noTokenize() });
const tools = createSearchTools({ schema, indexName: "users" });

await generateText({ model, tools, stopWhen: stepCountIs(5), prompt: "How many users named Ada live in London?" });

Options

schema (required) — built with s from @upstash/redis.
redis — defaults to Redis.fromEnv().
indexName — defaults to "agentkit:search".
prefix — key prefix for indexed JSON docs (defaults to "<indexName>:").
defaultLimit — default page size for search (10).

The index is created, and indexing is awaited, lazily on first use, so there is no setup step.
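Creating a resource on first use is a common pattern, and a generic version looks like the sketch below. It is not AgentKit's code: lazyOnce, ensureIndex, and search are hypothetical names, and the "index creation" is a stand-in that touches no Redis instance.

```typescript
// Generic "create on first use" helper: runs `create` once and shares its promise.
function lazyOnce<T>(create: () => Promise<T>): () => Promise<T> {
  let pending: Promise<T> | undefined;
  return () => {
    if (pending === undefined) pending = create();
    return pending;
  };
}

let createCalls = 0;

// Stand-in for "create the index and wait until indexing completes".
const ensureIndex = lazyOnce(async () => {
  createCalls += 1;
  return "users";
});

// Every tool call awaits the shared promise; only the first one pays the setup cost.
async function search(query: string): Promise<string> {
  const indexName = await ensureIndex();
  return `searched ${indexName} for ${query}`;
}
```

Concurrent first calls share one promise, so the setup runs once. One caveat of this simple form is that a failed creation is cached too; a production helper would typically clear the stored promise on rejection.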

How to add rate limiting with the AI SDK

createRateLimit returns a configured Upstash Ratelimit instance. Call .limit(identifier) before the model and short-circuit when over the limit.

import { createRateLimit, Ratelimit } from "@upstash/agentkit-ai-sdk";

const ratelimit = createRateLimit({ limiter: Ratelimit.slidingWindow(20, "1 m") });

const { success } = await ratelimit.limit(userId);
if (!success) throw new Error("rate limited"); // or return a 429 from your route

Options

limiter (required) — e.g. Ratelimit.slidingWindow(20, "1 m") or fixedWindow(...).
redis — defaults to Redis.fromEnv().
prefix — base key prefix; keys are <prefix>:<identifier> (default agentkit:rateLimit).

There’s no model wrapper: you call the limiter yourself before invoking the model. Pass a per-user identifier to .limit() to throttle per user.
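To show what per-identifier throttling means, the sketch below implements a simplified sliding-window log in memory. It is a swapped-in illustration, not the algorithm Upstash Ratelimit uses, and InMemorySlidingWindow is a hypothetical class; the clock is passed in so the behaviour is deterministic.

```typescript
// Simplified sliding-window log, keyed by "<prefix>:<identifier>" like the documented keys.
class InMemorySlidingWindow {
  private hits = new Map<string, number[]>();
  private max: number;
  private windowMs: number;
  private prefix: string;

  constructor(max: number, windowMs: number, prefix = "agentkit:rateLimit") {
    this.max = max;
    this.windowMs = windowMs;
    this.prefix = prefix;
  }

  // `now` is a millisecond timestamp supplied by the caller.
  limit(identifier: string, now: number): { success: boolean; remaining: number } {
    const key = `${this.prefix}:${identifier}`;
    // Keep only the requests that still fall inside the window.
    const recent = (this.hits.get(key) ?? []).filter((t) => now - t < this.windowMs);
    const success = recent.length < this.max;
    if (success) recent.push(now); // rejected requests are not counted
    this.hits.set(key, recent);
    return { success, remaining: this.max - recent.length };
  }
}

const limiter = new InMemorySlidingWindow(2, 1000);
console.log(limiter.limit("user-1", 0).success); // true
console.log(limiter.limit("user-1", 100).success); // true
console.log(limiter.limit("user-1", 200).success); // false: two hits already in the window
console.log(limiter.limit("user-2", 200).success); // true: separate identifier, separate budget
```

Each identifier has its own budget, which is why passing the user id throttles one user without affecting the others.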

How to cache tools with the AI SDK

Memoize a map of AI SDK tools’ results in Redis. Each tool is cached under its map key, scoped to userId.

import { z } from "zod";
import { generateText, tool } from "ai";
import { cachedTools } from "@upstash/agentkit-ai-sdk";

const tools = cachedTools(
  {
    getWeather: tool({
      description: "Get the weather for a city",
      inputSchema: z.object({ city: z.string() }),
      execute: async ({ city }) => fetchWeather(city),
    }),
  },
  { userId },
);

await generateText({ model, tools, prompt: "What's the weather in Paris?" });

Options

Pass tools built with the AI SDK’s tool() (so each keeps full input/output inference). Second arg:

userId (required) — a string, or (input, options) => string; scopes every entry to this user.
redis — defaults to Redis.fromEnv().
ttlSeconds — default per-result TTL for every tool.

Cache keys are agentkit:toolCache:<userId>:<toolName>:<hash-of-input> — the toolName is the map key, so you never pass a name yourself.
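The key shape above depends on hashing the tool input deterministically. The source does not say which hash AgentKit uses, so the sketch below substitutes a sorted-key serialisation plus a 32-bit FNV-1a hash purely for illustration; stableStringify, fnv1a, and toolCacheKey are hypothetical names.

```typescript
// Serialises a value with object keys sorted, so {a:1,b:2} and {b:2,a:1} match.
function stableStringify(value: unknown): string {
  if (value === undefined) return "null";
  if (value === null || typeof value !== "object") return JSON.stringify(value);
  if (Array.isArray(value)) return `[${value.map((item) => stableStringify(item)).join(",")}]`;
  const obj = value as Record<string, unknown>;
  const entries = Object.keys(obj)
    .sort()
    .map((key) => `${JSON.stringify(key)}:${stableStringify(obj[key])}`);
  return `{${entries.join(",")}}`;
}

// 32-bit FNV-1a over UTF-16 code units; fine for a demo, too small for production keys.
function fnv1a(text: string): string {
  let hash = 0x811c9dc5;
  for (let i = 0; i < text.length; i++) {
    hash ^= text.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193);
  }
  return (hash >>> 0).toString(16).padStart(8, "0");
}

// Assembles "agentkit:toolCache:<userId>:<toolName>:<hash-of-input>".
function toolCacheKey(userId: string, toolName: string, input: unknown): string {
  return `agentkit:toolCache:${userId}:${toolName}:${fnv1a(stableStringify(input))}`;
}

const k1 = toolCacheKey("user-1", "getWeather", { city: "Paris", units: "c" });
const k2 = toolCacheKey("user-1", "getWeather", { units: "c", city: "Paris" });
console.log(k1 === k2); // true: property order does not change the key
```

Sorting keys before hashing matters because a model may emit the same arguments in a different property order; without it, equivalent calls would miss the cache.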

How to put it all together with the AI SDK

A single streamText route can wire every feature: rate limit first, then memory, search, and cached tools, persisting the whole conversation in onFinish:

// app/api/chat/route.ts
import { openai } from "@ai-sdk/openai";
import {
  convertToModelMessages,
  createUIMessageStreamResponse,
  streamText,
  stepCountIs,
  toUIMessageStream,
  tool,
  type UIMessage,
} from "ai";
import { z } from "zod";
import { s, Redis } from "@upstash/redis";
import {
  Ratelimit,
  cachedTools,
  createChatHistory,
  createMemoryTools,
  createRateLimit,
  createSearchTools,
} from "@upstash/agentkit-ai-sdk";

export async function POST(req: Request) {
  const { id, messages } = (await req.json()) as { id: string; messages: UIMessage[] };
  const redis = Redis.fromEnv();
  // Derive this from your verified auth session in production (Clerk, Auth.js, …), never a client value.
  const userId = "user-123";

  // 1. Rate limit (by user) before any model work.
  const ratelimit = createRateLimit({ redis, limiter: Ratelimit.slidingWindow(30, "1 m") });
  const { success } = await ratelimit.limit(userId);
  if (!success) return new Response("Rate limited", { status: 429 });

  // 2. Memory, search, and cached tools — all scoped to this user.
  const tools = {
    ...createMemoryTools({ redis, userId }),
    ...createSearchTools({ schema: s.object({ title: s.string(), author: s.string() }), redis, indexName: "books" }),
    ...cachedTools(
      {
        convert_price: tool({
          description: "Convert a USD price to another currency.",
          inputSchema: z.object({ usd: z.number(), currency: z.string() }),
          // Demo only: a fixed 0.92 rate is applied regardless of currency; replace with a real FX lookup.
          execute: async ({ usd, currency }) => ({ currency, amount: usd * 0.92 }),
        }),
      },
      { userId, redis },
    ),
  };

  const result = streamText({
    model: openai("gpt-5.4-mini"),
    messages: await convertToModelMessages(messages),
    tools,
    stopWhen: stepCountIs(5),
  });

  // 3. Persist the whole conversation when the stream finishes.
  const history = createChatHistory({ redis });
  return createUIMessageStreamResponse({
    stream: toUIMessageStream({
      stream: result.stream,
      originalMessages: messages,
      onFinish: ({ messages }) => history.saveChat({ userId, sessionId: id, messages, title: "New chat" }),
    }),
  });
}

A complete, runnable Next.js demo (useChat UI, chat sidebar with fuzzy search, inline tool calls) lives in examples/ai-sdk-demo.

AgentKit on GitHub

Source, packages, and the full example apps.

Vercel AI SDK

The AI SDK this adapter plugs into.
