The Assistants API shuts down August 26, 2026. Here is exactly what changes, what you need to rewrite, and what stays the same.

The Deadline

OpenAI announced that the Assistants API beta will be shut down on August 26, 2026. After that date, any application still calling Assistants API endpoints will receive errors. This affects tens of thousands of production applications built on Assistants, Threads, and the Files API.

The replacement is not a drop-in swap. The architecture is meaningfully different. This article maps every major Assistants concept to its Responses API equivalent so you know exactly what needs to change.

August 26, 2026 is a hard deadline. OpenAI has confirmed no extensions. Start migration at least 8 weeks before the deadline to allow time for testing and unexpected issues.

What Changed and Why

The Assistants API was stateful: OpenAI stored your conversation Threads server-side, managed your tool calls, and maintained context across messages automatically. The new Responses API is stateless -- you manage conversation history yourself, pass it with each request, and get more control and predictability in return.

Each Assistants API concept maps to a Responses API equivalent:

  • Assistant → system prompt (string). No longer a persistent object -- just a string passed with each request.
  • Thread → conversation history array. You store and pass the history, not OpenAI.
  • Message → item in the input array. Same structure, different field names.
  • Run → a single responses.create() call. Synchronous by default, streaming optional.
  • Tool (function) → tool in the tools array. Same JSON schema format.
  • File Search tool → file_search in tools. Similar, with more control over chunking.
  • Code Interpreter → code_interpreter in tools. Same capability.
  • run_steps → output_items in the response. Tool call results are accessed differently.
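
As an example of the built-in tool mapping, file search is enabled by adding an entry to the tools array. This is a sketch: the vector store ID below is hypothetical, and the exact parameter names should be verified against the current API reference for your SDK version.

```python
# Sketch of a file_search tool entry for the Responses API.
# "vs_abc123" is a hypothetical vector store ID.
file_search_tool = {
    "type": "file_search",
    "vector_store_ids": ["vs_abc123"],
    "max_num_results": 8,  # cap on retrieved chunks per query
}

# Passed alongside any function tools, e.g.:
# client.responses.create(model="gpt-4o", input=..., tools=[file_search_tool])
```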

Side-by-Side: Creating a Response

Assistants API (old)

from openai import OpenAI
client = OpenAI()
 
# Step 1: Create an assistant (one-time setup)
assistant = client.beta.assistants.create(
    name="Support Agent",
    instructions="You are a helpful customer support agent.",
    model="gpt-4o",
    tools=[{"type": "function", "function": {...}}],
)
 
# Step 2: Create a thread (per conversation)
thread = client.beta.threads.create()
 
# Step 3: Add a message
client.beta.threads.messages.create(
    thread_id=thread.id,
    role="user",
    content="What is your refund policy?",
)
 
# Step 4: Create a run and poll
run = client.beta.threads.runs.create_and_poll(
    thread_id=thread.id,
    assistant_id=assistant.id,
)
 
# Step 5: Get messages
messages = client.beta.threads.messages.list(thread_id=thread.id)
print(messages.data[0].content[0].text.value)

Responses API (new)

from openai import OpenAI
client = OpenAI()
 
# Conversation history -- you manage this, not OpenAI
conversation_history = []
 
def chat(user_message: str) -> str:
    # Add user message to history
    conversation_history.append({"role": "user", "content": user_message})
 
    # Single API call -- no threads, no runs, no polling
    response = client.responses.create(
        model="gpt-4o",
        instructions="You are a helpful customer support agent.",
        input=conversation_history,
        tools=[{
            "type": "function",
            "name": "get_refund_policy",
            "description": "Get the current refund policy",
            "parameters": {"type": "object", "properties": {}},
        }],
    )
 
    # Add assistant response to history for next turn
    assistant_message = response.output_text
    conversation_history.append({"role": "assistant", "content": assistant_message})
 
    return assistant_message
 
reply = chat("What is your refund policy?")
print(reply)

Handling Tool Calls in the New API

import json

response = client.responses.create(
    model="gpt-4o",
    instructions="You are a support agent.",
    input=conversation_history,
    tools=[get_order_tool],
)

# Check if the model wants to call a tool
for output_item in response.output:
    if output_item.type == "function_call":
        tool_name = output_item.name
        tool_args = json.loads(output_item.arguments)

        # Execute the tool
        tool_result = execute_tool(tool_name, tool_args)

        # Add the function call and its result to history. Note the
        # Responses API format: a function_call_output item keyed by
        # call_id, not a Chat Completions-style "tool" role message.
        conversation_history.append(output_item)
        conversation_history.append({
            "type": "function_call_output",
            "call_id": output_item.call_id,
            "output": json.dumps(tool_result),
        })

# Re-call with the tool results so the model can produce its final answer
response = client.responses.create(
    model="gpt-4o",
    instructions="You are a support agent.",
    input=conversation_history,
    tools=[get_order_tool],
)
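
The execute_tool helper above is left undefined; one simple way to implement it is a registry mapping tool names to local Python functions. The get_refund_policy function here is a hypothetical example tool, not part of the API.

```python
def get_refund_policy() -> dict:
    # Hypothetical example tool
    return {"window_days": 30, "restocking_fee": False}

# Map tool names (as declared in the tools array) to local functions
TOOL_REGISTRY = {
    "get_refund_policy": get_refund_policy,
}

def execute_tool(name: str, args: dict):
    """Dispatch a model-requested tool call to a local function."""
    fn = TOOL_REGISTRY.get(name)
    if fn is None:
        # Return the error as data so the model can recover gracefully
        return {"error": f"unknown tool: {name}"}
    return fn(**args)
```

Returning an error payload for unknown tools, rather than raising, lets the model see the failure and respond instead of crashing your handler.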

Storing Conversation History

With the Assistants API, OpenAI stored your Threads. Now you store your conversation history. For single-user scripts this is trivial. For multi-user production apps, you need a storage layer.

import json
import redis
 
r = redis.Redis(host="localhost", port=6379, decode_responses=True)
 
def get_history(session_id: str) -> list:
    raw = r.get(f"conv:{session_id}")
    return json.loads(raw) if raw else []
 
def save_history(session_id: str, history: list, ttl_seconds: int = 86400):
    r.setex(f"conv:{session_id}", ttl_seconds, json.dumps(history))
 
# In your request handler:
history = get_history(session_id)
history.append({"role": "user", "content": user_message})
 
response = client.responses.create(
    model="gpt-4o",
    instructions=SYSTEM_PROMPT,
    input=history,
)
 
history.append({"role": "assistant", "content": response.output_text})
save_history(session_id, history)
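
Because you now own the history, it grows without bound across turns. A minimal sketch of trimming it before each request -- a token-based budget would be more precise, but message count is a simple stand-in:

```python
def trim_history(history: list, max_messages: int = 40) -> list:
    """Keep only the most recent messages so each request stays
    within the model's context window."""
    if len(history) <= max_messages:
        return history
    # Preserve order, drop the oldest turns
    return history[-max_messages:]
```

In the handler above, call trim_history on the loaded history before passing it to responses.create().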

Using the OpenAI Agents SDK

If your Assistants app was complex (multiple tools, handoffs between assistants, long multi-step tasks), consider migrating to the OpenAI Agents SDK rather than raw Responses API. The SDK provides a higher-level abstraction that is closer to the Assistants experience.

from agents import Agent, Runner, function_tool

# Tools are plain Python functions decorated with @function_tool
@function_tool
def get_order_status(order_id: str) -> str:
    """Look up the current status of an order."""
    ...

support_agent = Agent(
    name="Support Agent",
    instructions="You are a helpful customer support agent.",
    model="gpt-4o",
    tools=[get_order_status, process_refund],
)

# Run the agent -- the SDK handles tool calls and conversation internally.
# (Use `await Runner.run(...)` instead inside async code.)
result = Runner.run_sync(
    support_agent,
    input="What is the status of order ORD-12345?",
)
print(result.final_output)

The OpenAI Agents SDK is the closest experience to the Assistants API. If your team wants to preserve the 'define an agent once, run it many times' pattern, use the SDK rather than raw Responses API calls.

Migration Checklist

  • Audit every Assistants API endpoint in your codebase: /assistants, /threads, /messages, /runs
  • Decide: raw Responses API (more control) or OpenAI Agents SDK (closer to Assistants experience)
  • Build a conversation history store (Redis, PostgreSQL, or in-memory for simple cases)
  • Rewrite tool call handling -- the loop pattern replaces run polling
  • Update your file search configuration if using the built-in retrieval tool
  • Test with your full production prompt and tool set before the deadline
  • Set a migration completion target at least 4 weeks before August 26, 2026
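
To support the first checklist item, here is a rough sketch of a codebase audit script. The regex covers the common Python SDK call sites and REST paths; adjust it for your own wrappers, other languages, or raw HTTP clients.

```python
import re
from pathlib import Path

# Matches common Assistants API call sites and REST paths
ASSISTANTS_PATTERN = re.compile(
    r"client\.beta\.(assistants|threads)|/v1/(assistants|threads)"
)

def find_assistants_usage(root: str) -> list:
    """Return (file, line number, line) for every Assistants API reference."""
    hits = []
    for path in Path(root).rglob("*.py"):
        text = path.read_text(encoding="utf-8", errors="ignore")
        for lineno, line in enumerate(text.splitlines(), 1):
            if ASSISTANTS_PATTERN.search(line):
                hits.append((str(path), lineno, line.strip()))
    return hits
```

Run it against your repository root and treat every hit as a migration work item.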