The Assistants API shuts down August 26, 2026. Here is exactly what changes, what you need to rewrite, and what stays the same.
The Deadline
OpenAI announced that the Assistants API beta will be fully deprecated on August 26, 2026. After that date, any application still calling Assistants API endpoints will receive errors. This affects every production application built on the Assistants, Threads, Messages, and Runs endpoints.
The replacement is not a drop-in swap. The architecture is meaningfully different. This article maps every major Assistants concept to its Responses API equivalent so you know exactly what needs to change.
August 26, 2026 is a hard deadline. OpenAI has confirmed no extensions. Start migration at least 8 weeks before the deadline to allow time for testing and unexpected issues.
What Changed and Why
The Assistants API was stateful: OpenAI stored your conversation Threads server-side, managed your tool calls, and maintained context across messages automatically. The Responses API is stateless by default: you manage conversation history yourself, pass it with each request, and get more control and predictability in return.
| Assistants API concept | Responses API equivalent | Key difference |
|---|---|---|
| Assistant | System prompt (string) | No longer a persistent object -- just a string |
| Thread | Conversation history array | You store and pass history, not OpenAI |
| Message | Item in the input array | Same structure, different field names |
| Run | Single responses.create() call | Synchronous by default, streaming optional |
| Tool (function) | tool in tools array | Same JSON schema format |
| File Search tool | file_search in tools | Similar, more control over chunking |
| Code Interpreter | code_interpreter in tools | Same capability |
| run_steps | Items in response.output | Access tool call results differently |
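The Message-to-item mapping is the easiest to see in code. If you export your existing Threads before the shutdown, a small converter turns them into Responses input. This is a sketch: the helper name is ours, and it assumes the standard Assistants message shape where content is a list of text blocks.

```python
def thread_messages_to_input(messages: list[dict]) -> list[dict]:
    """Convert exported Assistants thread messages (oldest first)
    into items for the Responses API `input` array."""
    items = []
    for msg in messages:
        # Assistants content is a list of blocks; flatten the text ones.
        text = "".join(
            block["text"]["value"]
            for block in msg["content"]
            if block["type"] == "text"
        )
        items.append({"role": msg["role"], "content": text})
    return items
```

The role values ("user" / "assistant") carry over unchanged; only the nested content blocks flatten into a plain string.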
Side-by-Side: Creating a Response
Assistants API (old)
from openai import OpenAI
client = OpenAI()
# Step 1: Create an assistant (one-time setup)
assistant = client.beta.assistants.create(
    name="Support Agent",
    instructions="You are a helpful customer support agent.",
    model="gpt-4o",
    tools=[{"type": "function", "function": {...}}],
)
# Step 2: Create a thread (per conversation)
thread = client.beta.threads.create()
# Step 3: Add a message
client.beta.threads.messages.create(
    thread_id=thread.id,
    role="user",
    content="What is your refund policy?",
)
# Step 4: Create a run and poll
run = client.beta.threads.runs.create_and_poll(
    thread_id=thread.id,
    assistant_id=assistant.id,
)
# Step 5: Get messages
messages = client.beta.threads.messages.list(thread_id=thread.id)
print(messages.data[0].content[0].text.value)
Responses API (new)
from openai import OpenAI
client = OpenAI()
# Conversation history -- you manage this, not OpenAI
conversation_history = []
def chat(user_message: str) -> str:
    # Add user message to history
    conversation_history.append({"role": "user", "content": user_message})

    # Single API call -- no threads, no runs, no polling
    response = client.responses.create(
        model="gpt-4o",
        instructions="You are a helpful customer support agent.",
        input=conversation_history,
        tools=[{
            "type": "function",
            "name": "get_refund_policy",
            "description": "Get the current refund policy",
            "parameters": {"type": "object", "properties": {}},
        }],
    )

    # Add assistant response to history for next turn
    assistant_message = response.output_text
    conversation_history.append({"role": "assistant", "content": assistant_message})
    return assistant_message
reply = chat("What is your refund policy?")
print(reply)
Handling Tool Calls in the New API
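There is no run object to poll. Instead you loop: call the model, execute any function_call items it returns, append the call and its result to history, and call again until the model answers in plain text. A self-contained sketch of that loop follows; the `run_with_tools` helper, the `execute_tool` callback, and the round cap are illustrative, not part of the API.

```python
import json

def run_with_tools(client, history: list, tools: list, execute_tool, max_rounds: int = 5) -> str:
    """Call the model repeatedly until it stops requesting tools."""
    for _ in range(max_rounds):
        response = client.responses.create(
            model="gpt-4o",
            instructions="You are a support agent.",
            input=history,
            tools=tools,
        )
        calls = [item for item in response.output if item.type == "function_call"]
        if not calls:
            return response.output_text  # plain answer -- done
        for call in calls:
            result = execute_tool(call.name, json.loads(call.arguments))
            # Echo the model's call, then attach your result by call_id.
            history.append({
                "type": "function_call",
                "call_id": call.call_id,
                "name": call.name,
                "arguments": call.arguments,
            })
            history.append({
                "type": "function_call_output",
                "call_id": call.call_id,
                "output": json.dumps(result),
            })
    raise RuntimeError("Model kept requesting tools past max_rounds")
```

Step by step, a single pass of that loop looks like this: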
import json

response = client.responses.create(
    model="gpt-4o",
    instructions="You are a support agent.",
    input=conversation_history,
    tools=[get_order_tool],
)

# Check if the model wants to call a tool
for output_item in response.output:
    if output_item.type == "function_call":
        tool_name = output_item.name
        tool_args = json.loads(output_item.arguments)

        # Execute the tool
        tool_result = execute_tool(tool_name, tool_args)

        # Add the model's call and your result to history, matched by call_id
        conversation_history.append({
            "type": "function_call",
            "call_id": output_item.call_id,
            "name": tool_name,
            "arguments": output_item.arguments,
        })
        conversation_history.append({
            "type": "function_call_output",
            "call_id": output_item.call_id,
            "output": json.dumps(tool_result),
        })

# Re-call with the tool results in history
response = client.responses.create(
    model="gpt-4o",
    instructions="You are a support agent.",
    input=conversation_history,
    tools=[get_order_tool],
)
Storing Conversation History
With the Assistants API, OpenAI stored your Threads. Now you store your conversation history. For single-user scripts this is trivial. For multi-user production apps, you need a storage layer.
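One alternative worth knowing before you build that layer: the Responses API can also hold state server-side. Responses are stored by default, and passing `previous_response_id` chains a new request onto an earlier response, so your app only persists the last response ID per session. A sketch of the pattern, where `session_store` and `SYSTEM_PROMPT` are illustrative:

```python
SYSTEM_PROMPT = "You are a helpful customer support agent."
session_store: dict[str, str] = {}  # session_id -> last response ID

def chat(client, session_id: str, user_message: str) -> str:
    # OpenAI reconstructs prior context from the stored response chain.
    response = client.responses.create(
        model="gpt-4o",
        instructions=SYSTEM_PROMPT,
        input=user_message,
        previous_response_id=session_store.get(session_id),
    )
    session_store[session_id] = response.id  # only the ID needs persisting
    return response.output_text
```

The trade-off is less control: you cannot trim, inspect, or edit history that lives on OpenAI's side. If you need that control, store the history yourself; a Redis-backed store looks like this: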
import json
import redis
r = redis.Redis(host="localhost", port=6379, decode_responses=True)
def get_history(session_id: str) -> list:
    raw = r.get(f"conv:{session_id}")
    return json.loads(raw) if raw else []

def save_history(session_id: str, history: list, ttl_seconds: int = 86400):
    r.setex(f"conv:{session_id}", ttl_seconds, json.dumps(history))
# In your request handler:
history = get_history(session_id)
history.append({"role": "user", "content": user_message})
response = client.responses.create(
    model="gpt-4o",
    instructions=SYSTEM_PROMPT,
    input=history,
)
history.append({"role": "assistant", "content": response.output_text})
save_history(session_id, history)
Using the OpenAI Agents SDK
If your Assistants app is complex (multiple tools, handoffs between assistants, long multi-step tasks), consider migrating to the OpenAI Agents SDK rather than to the raw Responses API. The SDK provides a higher-level abstraction that is closer to the Assistants experience.
from agents import Agent, Runner

support_agent = Agent(
    name="Support Agent",
    instructions="You are a helpful customer support agent.",
    model="gpt-4o",
    tools=[get_order_status, process_refund],  # @function_tool-decorated functions
)

# Run the agent -- the SDK handles tool calls and conversation internally
result = Runner.run_sync(
    support_agent,
    input="What is the status of order ORD-12345?",
)
print(result.final_output)
The OpenAI Agents SDK is the closest experience to the Assistants API. If your team wants to preserve the 'define an agent once, run it many times' pattern, use the SDK rather than raw Responses API calls.
Migration Checklist
- Audit every Assistants API endpoint in your codebase: /assistants, /threads, /messages, /runs
- Decide: raw Responses API (more control) or OpenAI Agents SDK (closer to Assistants experience)
- Build a conversation history store (Redis, PostgreSQL, or in-memory for simple cases)
- Rewrite tool call handling -- the loop pattern replaces run polling
- Update your file search configuration if using the built-in retrieval tool
- Test with your full production prompt and tool set before the deadline
- Set a migration completion target at least 4 weeks before August 26, 2026
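For the first checklist item, a quick scan helps scope the work. A throwaway script along these lines lists every legacy call site; the pattern list is a starting point, not exhaustive.

```python
import os
import re

LEGACY = re.compile(r"beta\.(assistants|threads)|/v1/(assistants|threads)")

def audit(root: str) -> list[tuple[str, int, str]]:
    """Return (path, line number, line) for each Assistants-era call site."""
    hits = []
    for dirpath, _, filenames in os.walk(root):
        for name in filenames:
            if not name.endswith(".py"):
                continue
            path = os.path.join(dirpath, name)
            with open(path, encoding="utf-8", errors="ignore") as f:
                for lineno, line in enumerate(f, start=1):
                    if LEGACY.search(line):
                        hits.append((path, lineno, line.strip()))
    return hits
```

Run it against your repository root and you have the migration work list: every file that touches `beta.assistants` or `beta.threads` needs a rewrite before the deadline.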