Relevance AI's two-part billing model confuses almost every new user. Here is how it works and how to stay in control.
## The Two-Part Billing Model
Relevance AI charges for two things separately, and the interaction between them is what catches most users off guard:
| Cost type | What it covers | Analogy |
|---|---|---|
| Actions | Every step your agent takes (tool calls, LLM calls, data reads, outputs) | Like CPU credits -- consumed per operation |
| Vendor Credits | The underlying LLM API costs (GPT-4o tokens, Claude tokens, etc.) | Passed-through API costs with no markup |
Since September 2025, Vendor Credits have zero markup -- you pay the same as calling the API directly. Paid plans also let you bring your own API key to bypass Vendor Credits entirely, routing model costs through your own OpenAI/Anthropic account.
## What Consumes Actions
Actions are consumed by every discrete step an agent takes. A single agent run can consume many Actions:
- Each LLM call the agent makes: 1 Action
- Each tool execution (web search, code runner, API call): 1 Action
- Each knowledge base search: 1 Action
- Each data transformation or enrichment step: 1 Action
An agent that does: classify query (1) -> search knowledge base (1) -> call external API (1) -> generate response (1) consumes 4 Actions per user message. A conversation with 5 back-and-forth turns: up to 20 Actions.
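The arithmetic above can be sketched as a quick estimator. This is an illustrative helper, not part of any Relevance AI SDK; the step names are hypothetical and the per-step cost of 1 Action follows the list above:

```python
# Rough per-conversation Action estimator. Each discrete agent step
# costs 1 Action, per the pricing model described above.
STEPS_PER_MESSAGE = {
    "classify_query": 1,
    "knowledge_base_search": 1,
    "external_api_call": 1,
    "generate_response": 1,
}

def actions_per_message(steps: dict = STEPS_PER_MESSAGE) -> int:
    """Sum the Action cost of every step one user message triggers."""
    return sum(steps.values())

def actions_per_conversation(turns: int) -> int:
    """Upper bound: every turn runs the full step pipeline."""
    return turns * actions_per_message()

print(actions_per_message())        # 4 Actions per user message
print(actions_per_conversation(5))  # 20 Actions for a 5-turn conversation
```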
Audit your most-used agents with Relevance AI's execution logs to see how many Actions each run consumes. You will likely find 1-2 agent designs consuming most of your Action budget.

## How to Reduce Action Consumption
### 1. Combine tool calls where possible
Instead of an agent making separate calls for each piece of data (3 Actions), design a single tool that fetches everything needed in one call (1 Action). This requires designing tools with broader scope.
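A minimal sketch of the trade-off, assuming a customer-support agent; all function names and return shapes here are hypothetical:

```python
# Three narrow lookups -- if each is exposed as a separate agent tool,
# each call the agent makes consumes its own Action.
def get_profile(user_id: str) -> dict:
    return {"user_id": user_id, "tier": "pro"}

def get_orders(user_id: str) -> list:
    return [{"order_id": "A-1", "user_id": user_id}]

def get_tickets(user_id: str) -> list:
    return [{"ticket_id": "T-9", "user_id": user_id}]

# One broad tool: the agent fetches everything it needs in a single
# call (1 Action instead of 3), at the cost of sometimes over-fetching.
def get_customer_context(user_id: str) -> dict:
    return {
        "profile": get_profile(user_id),
        "orders": get_orders(user_id),
        "tickets": get_tickets(user_id),
    }
```

The broad tool wins when the agent usually needs all three pieces of data anyway; keep narrow tools for data that is rarely requested.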
### 2. Use simpler agents for simple queries
Not every user message needs a full reasoning chain. Use a classifier to route simple queries (FAQ-style) to a direct knowledge base answer (2 Actions) instead of the full agent pipeline (5-8 Actions).
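A routing sketch under stated assumptions: the keyword check below stands in for a real classifier, and the Action counts mirror the estimates above (2 for a direct KB answer, 5-8 for the full pipeline):

```python
# Route FAQ-style queries down a cheap path instead of the full agent.
# FAQ_KEYWORDS is an illustrative stand-in for a trained classifier.
FAQ_KEYWORDS = ("pricing", "refund", "hours", "password")

def route(query: str) -> tuple:
    """Return (pipeline name, estimated Actions) for a user query."""
    if any(kw in query.lower() for kw in FAQ_KEYWORDS):
        return ("kb_lookup", 2)   # classify + direct knowledge base answer
    return ("full_agent", 8)      # full reasoning pipeline, upper bound

print(route("How do I reset my password?"))  # ('kb_lookup', 2)
```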
### 3. Reduce agent iterations
Agents with vague instructions tend to iterate more -- trying multiple approaches before reaching an answer. Tighter, more specific instructions reduce unnecessary iterations and Action consumption.
## Bring Your Own API Key
On paid plans, you can connect your own OpenAI or Anthropic API key. This routes all model costs through your account, eliminates Vendor Credits, and makes model costs predictable and directly visible in your OpenAI/Anthropic usage dashboard.
1. Go to Relevance AI Settings > Integrations > LLM Providers
2. Add your OpenAI API key (or Anthropic, Cohere, etc.)
3. Select this key as the default for your agent or specific tools
4. Monitor costs in your OpenAI/Anthropic dashboard -- they now appear there directly
When using BYOK (Bring Your Own Key), your costs split across two bills: Actions to Relevance AI, model tokens to OpenAI/Anthropic. Set up billing alerts on both accounts.

## Setting Budget Limits
Relevance AI does not yet have native budget caps per agent (as of early 2026), but you can implement soft limits through monitoring:
- Set billing alerts in your Relevance AI account for Action thresholds
- Use the Relevance AI API to check Action consumption programmatically and disable agents that exceed a threshold
- For always-on agents: set a daily Action budget and monitor it via the dashboard
```python
# Check usage via the Relevance AI API (endpoint path taken from this
# guide; verify it against the current API documentation).
import asyncio

import httpx

async def get_current_usage(api_key: str, region: str) -> dict:
    async with httpx.AsyncClient() as client:
        response = await client.get(
            f"https://api-{region}.stack.tryrelevance.com/latest/usage",
            headers={"Authorization": api_key},
        )
        response.raise_for_status()
        return response.json()

async def check_usage() -> None:
    # Run this from a monitoring cron job to alert when nearing limits.
    usage = await get_current_usage(api_key="your-key", region="us")
    if usage.get("actions_used", 0) > 0.8 * usage.get("actions_limit", 1):
        # send_alert is your own alerting hook (Slack webhook, email, etc.)
        send_alert("Relevance AI Actions at 80% of monthly limit")

asyncio.run(check_usage())
```

## Quick Reference
- Actions = per-step cost; Vendor Credits = model token cost (no markup on paid plans)
- Audit execution logs to find which agents consume the most Actions
- Design tools with broader scope to reduce per-run Action count
- Use BYOK (Bring Your Own Key) to route model costs through your own API account
- Set billing alerts on both Relevance AI (Actions) and OpenAI/Anthropic (tokens) when using BYOK
- Route simple queries to direct KB lookups to avoid full agent pipeline costs