Running ADK agents locally is straightforward. Getting them into production -- with scaling, auth, and monitoring -- requires a few extra steps.

Two Production Paths

ADK agents can be deployed in two ways: managed (Vertex AI Agent Engine, Google's hosted runtime) or self-hosted (wrap the agent in a FastAPI server and deploy it anywhere). The managed path requires less infrastructure work but ties you to Google Cloud. The self-hosted path is portable but requires more setup.

Path 1: Vertex AI Agent Engine (Managed)

Agent Engine is Google's managed runtime for ADK agents. It handles scaling and session management, and integrates with Google Cloud's monitoring and logging stack. This is the recommended path for teams already using Google Cloud.

# Install the SDK
# pip install google-cloud-aiplatform[adk,agent_engines]
 
import vertexai
from vertexai import agent_engines
from google.adk.agents import LlmAgent
from google.adk.tools import google_search
 
vertexai.init(
    project="your-gcp-project",
    location="us-central1",
    staging_bucket="gs://your-staging-bucket",  # required for Agent Engine deployment
)
 
# Define your agent
my_agent = LlmAgent(
    name="support_agent",
    model="gemini-2.0-flash",
    instruction="You are a helpful customer support agent.",
    tools=[google_search],
)
 
# Deploy to Agent Engine
remote_agent = agent_engines.create(
    my_agent,
    requirements=["google-cloud-aiplatform[adk,agent_engines]"],
    display_name="Support Agent",
    description="Customer support agent for production",
)
 
print(f"Deployed agent resource name: {remote_agent.resource_name}")
# After deployment, interact with the agent via the managed session API
session = remote_agent.create_session(user_id="user-123")
 
response = remote_agent.stream_query(
    user_id="user-123",
    session_id=session["id"],
    message="What is the status of my order?",
)
 
for event in response:
    if event.get("content"):
        print(event["content"]["parts"][0]["text"], end="", flush=True)
Agent Engine manages session state for you -- you do not need to build your own conversation history store when using the managed path. Sessions are stored server-side and referenced by user_id + session_id.
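Conceptually, the managed store behaves like a mapping keyed by (user_id, session_id). A toy stand-in in plain Python (illustrative only -- Agent Engine handles all of this for you on the managed path) makes the composite-key lookup concrete:

```python
# Toy model of a server-side session store keyed by (user_id, session_id).
# Not an ADK API -- just an illustration of how sessions are addressed.

class SessionStore:
    def __init__(self):
        # each (user_id, session_id) pair maps to its own conversation history
        self._sessions = {}

    def create(self, user_id, session_id):
        self._sessions[(user_id, session_id)] = []

    def append_turn(self, user_id, session_id, role, text):
        self._sessions[(user_id, session_id)].append({"role": role, "text": text})

    def history(self, user_id, session_id):
        return self._sessions[(user_id, session_id)]

store = SessionStore()
store.create("user-123", "sess-1")
store.append_turn("user-123", "sess-1", "user", "What is the status of my order?")
print(len(store.history("user-123", "sess-1")))  # 1
```

Because the key is the pair, the same user can hold several independent conversations at once, which is exactly why both identifiers appear in stream_query.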

Path 2: Self-Hosted with FastAPI

For teams that are not on Google Cloud, or that need more control, you can wrap an ADK agent in a FastAPI server and deploy it anywhere.

from fastapi import FastAPI, HTTPException
from pydantic import BaseModel
from google.adk.agents import LlmAgent
from google.adk.runners import Runner
from google.adk.sessions import InMemorySessionService
from google.adk.artifacts import InMemoryArtifactService
import uvicorn
 
app = FastAPI()
 
# Initialise agent and services
agent = LlmAgent(
    name="support_agent",
    model="gemini-2.0-flash",
    instruction="You are a helpful customer support agent.",
)
 
session_service = InMemorySessionService()   # swap for DB-backed in production
artifact_service = InMemoryArtifactService()
 
runner = Runner(
    agent=agent,
    app_name="support_app",
    session_service=session_service,
    artifact_service=artifact_service,
)
 
class ChatRequest(BaseModel):
    session_id: str
    user_id: str
    message: str
 
class ChatResponse(BaseModel):
    response: str
    session_id: str
 
@app.post("/chat", response_model=ChatResponse)
async def chat(request: ChatRequest):
    from google.genai.types import Content, Part

    # The runner requires an existing session -- create one on first use
    session = await session_service.get_session(
        app_name="support_app",
        user_id=request.user_id,
        session_id=request.session_id,
    )
    if session is None:
        await session_service.create_session(
            app_name="support_app",
            user_id=request.user_id,
            session_id=request.session_id,
        )

    final_response = ""
    # run_async returns an async generator of events; iterate it directly
    # rather than awaiting it
    async for event in runner.run_async(
        user_id=request.user_id,
        session_id=request.session_id,
        new_message=Content(parts=[Part(text=request.message)], role="user"),
    ):
        if event.is_final_response() and event.content:
            for part in event.content.parts:
                if part.text:
                    final_response += part.text

    return ChatResponse(response=final_response, session_id=request.session_id)
 
if __name__ == "__main__":
    uvicorn.run(app, host="0.0.0.0", port=8080)

Replacing InMemorySessionService for Production

InMemorySessionService loses all sessions on restart. For production self-hosted deployments, replace it with a database-backed implementation.

# Option 1: Use the built-in DatabaseSessionService (SQLite/PostgreSQL)
from google.adk.sessions import DatabaseSessionService
 
session_service = DatabaseSessionService(
    db_url="postgresql://user:pass@host/dbname"
)
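One practical note on db_url: credentials containing special characters must be percent-encoded before being embedded in the URL, or parsing breaks. A small helper (the function name and values here are illustrative) using only the standard library:

```python
from urllib.parse import quote_plus

def make_db_url(user: str, password: str, host: str, dbname: str) -> str:
    # quote_plus escapes characters like '@' and '/' that would otherwise
    # be misread as URL structure when they appear in a password
    return f"postgresql://{quote_plus(user)}:{quote_plus(password)}@{host}/{dbname}"

print(make_db_url("user", "p@ss/word", "db.internal", "sessions"))
# postgresql://user:p%40ss%2Fword@db.internal/sessions
```

In practice the raw credentials would come from a secret manager or environment variables rather than being hard-coded.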
 
# Option 2: Implement your own SessionService
from google.adk.sessions import BaseSessionService

class RedisSessionService(BaseSessionService):
    """Sketch only -- fill in each method with your Redis logic."""

    def __init__(self, redis_client):
        self.redis = redis_client

    async def create_session(self, *, app_name, user_id, state=None, session_id=None):
        ...  # store a new session record in Redis

    async def get_session(self, *, app_name, user_id, session_id, config=None):
        ...  # fetch and deserialise the session

    async def list_sessions(self, *, app_name, user_id):
        ...  # enumerate this user's sessions

    async def delete_session(self, *, app_name, user_id, session_id):
        ...  # remove the session record

    # BaseSessionService also defines append_event; persistence backends
    # typically override it so each new event is written through to storage.
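Before wiring in Redis, it can help to prototype the same interface shape against a plain dict. The sketch below uses a stand-in Session dataclass (not the ADK type) and no external dependencies, so the keying and lifecycle logic can be exercised in isolation:

```python
import uuid
from dataclasses import dataclass, field

@dataclass
class Session:  # stand-in for the ADK Session type, for prototyping only
    app_name: str
    user_id: str
    id: str
    state: dict = field(default_factory=dict)

class DictSessionService:
    """Dict-backed prototype of a session service; swap the dict for Redis later."""

    def __init__(self):
        self._store = {}  # (app_name, user_id, session_id) -> Session

    async def create_session(self, *, app_name, user_id, state=None, session_id=None):
        sid = session_id or uuid.uuid4().hex
        session = Session(app_name=app_name, user_id=user_id, id=sid, state=state or {})
        self._store[(app_name, user_id, sid)] = session
        return session

    async def get_session(self, *, app_name, user_id, session_id):
        return self._store.get((app_name, user_id, session_id))

    async def list_sessions(self, *, app_name, user_id):
        return [s for (a, u, _), s in self._store.items()
                if a == app_name and u == user_id]

    async def delete_session(self, *, app_name, user_id, session_id):
        self._store.pop((app_name, user_id, session_id), None)
```

Porting this to Redis is then mostly a matter of replacing the dict operations with HSET/HGET/DEL calls and serialising the session to JSON.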

Adding Authentication

The FastAPI server above has no built-in authentication. Add it as a route dependency or middleware before deploying.

from fastapi import FastAPI, HTTPException, Depends, Security
from fastapi.security import APIKeyHeader
 
API_KEY_HEADER = APIKeyHeader(name="X-API-Key", auto_error=False)
VALID_API_KEYS = {"prod-key-abc123", "prod-key-def456"}
 
async def verify_api_key(api_key: str = Security(API_KEY_HEADER)):
    if api_key not in VALID_API_KEYS:
        raise HTTPException(status_code=403, detail="Invalid API key")
    return api_key
 
@app.post("/chat", response_model=ChatResponse, dependencies=[Depends(verify_api_key)])
async def chat(request: ChatRequest):
    ...
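One hardening detail worth noting: the set-membership check above short-circuits on the first differing byte, which can leak timing information about stored keys. hmac.compare_digest from the standard library compares in constant time; a minimal sketch (key values are illustrative):

```python
import hmac

VALID_API_KEYS = ["prod-key-abc123", "prod-key-def456"]  # illustrative keys

def key_is_valid(candidate):
    # candidate may be None when the header is missing (auto_error=False)
    if candidate is None:
        return False
    # Check every stored key so timing does not reveal which one matched
    matched = False
    for key in VALID_API_KEYS:
        if hmac.compare_digest(candidate.encode(), key.encode()):
            matched = True
    return matched

print(key_is_valid("prod-key-abc123"))  # True
print(key_is_valid("wrong-key"))        # False
```

Dropping this into verify_api_key is a one-line change: replace the `in VALID_API_KEYS` test with a call to key_is_valid.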

Quick Reference

  • Managed path (Vertex AI Agent Engine): agent_engines.create() -- handles scaling and sessions automatically
  • Self-hosted path: wrap in FastAPI with Runner, expose as REST endpoint
  • Replace InMemorySessionService with DatabaseSessionService or custom Redis backend
  • Add API key or OAuth middleware before exposing any ADK endpoint publicly
  • Use stream_query / run_async for streaming responses -- avoid blocking calls in web servers
  • TypeScript SDK and Go SDK are in development -- Python is the primary production-ready option as of early 2026