What makes AI features expensive?

The biggest cost drivers are prompt and completion token volume, higher-priced models, multi-step agent workflows, retrieval calls, and the infrastructure required to route and observe requests.

Why is cost per feature hard to measure?

Most teams see provider bills and cloud spend, but not feature-level attribution. That makes it difficult to connect infrastructure cost to a specific workflow, feature, or user action.

What should AI teams track?

Teams should track cost per request, cost per workflow, cost per user, model mix, token usage, and feature-level margin.

How does SpendLens help?

SpendLens connects AI infrastructure spend to product activity so teams can see cost per feature, workflow, and customer interaction instead of just a total bill.

AI Cost Visibility

How much does it cost to run an AI feature?

Running an AI feature can cost fractions of a cent or multiple dollars per interaction depending on tokens, model choice, retrieval, orchestration, and traffic scale. The hard part is not seeing the total bill. It is understanding the cost of each feature, workflow, and customer interaction inside that bill.

$0.01

Simple retrieval + generation request

$0.25+

Complex multi-step agent workflow

Cost swing from model and prompt decisions

1 bill

Usually hides many feature-level economics

At a glance

Feature cost is more than the model invoice

A practical breakdown of how one AI feature should be measured inside a real product.

Feature Revenue: $0.19 (monetized value per interaction)
Feature Cost: $0.07 (includes more than inference alone)
Margin View: Healthy (visible only after attribution)
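The margin view above comes down to simple arithmetic on the two illustrative figures. The sketch below is a minimal example, not SpendLens's method. The function name `feature_margin` is hypothetical, and it assumes revenue and cost are already attributed to the same interaction.

```python
def feature_margin(revenue_per_interaction: float, cost_per_interaction: float) -> tuple[float, float]:
    """Return (absolute margin in USD, margin as a fraction of revenue)."""
    if revenue_per_interaction <= 0:
        raise ValueError("revenue per interaction must be positive")
    margin = revenue_per_interaction - cost_per_interaction
    return margin, margin / revenue_per_interaction


# Illustrative figures from the page: $0.19 revenue, $0.07 cost per interaction.
margin, ratio = feature_margin(0.19, 0.07)
print(f"margin per interaction: ${margin:.2f} ({ratio:.0%} of revenue)")
# prints: margin per interaction: $0.12 (63% of revenue)
```

Whether a margin counts as "healthy" depends on the business; the threshold is a product decision, not something this calculation decides.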

Illustrative cost breakdown per interaction

Prompt + completion tokens

Often the largest direct cost driver

Largest driver

Retrieval + reranking

Vector lookups, search, and ranking layers

Efficient

Workflow orchestration

Tool calls, routing, logic, and retries

Adds overhead

Feature-level margin

Requires full attribution, not just provider bills

Business view

Provider bill

Shows total spend

Not enough

Feature cost

Shows one product surface

Operationally useful

Workflow economics

Shows end-to-end business impact

Decision-ready

The real cost

The total provider invoice is not the same as the cost of a single AI feature.

Most teams know their monthly OpenAI, Anthropic, Bedrock, or cloud bill. Far fewer can answer a simpler and more useful question: what does it cost to run one AI-powered feature inside the product?

That number is what determines pricing power, margin, adoption strategy, and whether the feature gets more investment or gets quietly throttled.

Per request

The first useful operating metric

Per workflow

Captures multi-step complexity

Per feature

Turns spend into product economics

The model call is only the start

A modern AI feature often includes prompt assembly, retrieval, reranking, multiple model invocations, policy checks, observability, retries, and orchestration.

Prompt and completion tokens
Retrieval, embeddings, and vector lookups
Workflow coordination and fallbacks

Cost changes by feature

A support copilot, content tool, and research agent may share a provider but have completely different economics.

Model choice changes cost
Context size changes cost
Workflow design changes margin

Teams need attribution

The most useful metrics are cost per request, cost per workflow, cost per active user, and feature-level margin.

See what is improving
See what is leaking margin
See what deserves more investment

What contributes to cost

The layers behind AI feature cost

Even when each component looks inexpensive on its own, the full interaction can become materially expensive at scale.

Layer 01

Inference and tokens

Prompt and completion token volume, context size, and model pricing create the most visible direct cost.

Long prompts compound spend
Frontier models change economics fast
Verbose completions add cost at scale
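To make the token layer concrete, here is a minimal sketch of how direct inference cost follows from token counts and per-token pricing. The prices are placeholder values chosen for illustration and are not any provider's actual rates. The names `ModelPrice` and `inference_cost` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelPrice:
    """Per-token pricing for one model. All rates used below are hypothetical."""
    input_usd_per_million: float   # USD per 1M prompt tokens
    output_usd_per_million: float  # USD per 1M completion tokens


def inference_cost(price: ModelPrice, prompt_tokens: int, completion_tokens: int) -> float:
    """Direct token cost of a single model call, in USD."""
    return (
        prompt_tokens * price.input_usd_per_million
        + completion_tokens * price.output_usd_per_million
    ) / 1_000_000


# Placeholder rates for illustration only; substitute your provider's price sheet.
SMALL_MODEL = ModelPrice(input_usd_per_million=0.50, output_usd_per_million=1.50)
LARGE_MODEL = ModelPrice(input_usd_per_million=5.00, output_usd_per_million=15.00)

for label, price in (("small", SMALL_MODEL), ("large", LARGE_MODEL)):
    short = inference_cost(price, prompt_tokens=2_000, completion_tokens=400)
    long = inference_cost(price, prompt_tokens=8_000, completion_tokens=400)
    print(f"{label}: short prompt ${short:.4f}, long prompt ${long:.4f}")
```

Under these assumed rates, both the model tier and the prompt length move the per-call cost, which is why the two are worth tracking separately.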

Layer 02

Retrieval and tooling

Embeddings, vector database queries, reranking, tool calls, and validators each add incremental cost.

Retrieval is usually necessary but not free
Agent tools create useful capability and overhead
Supporting systems often hide in cloud spend

Layer 03

Reliability and scale

Retries, fallbacks, latency-driven duplication, and traffic volume turn small inefficiencies into real operating problems.

Failure handling inflates spend quietly
High-traffic features magnify waste
Attribution is required to optimize well
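One way to reason about retry overhead is as an expected-cost multiplier. The sketch below is a simplification under stated assumptions: every attempt costs the full base amount, failures are independent, and the client retries up to a fixed limit. The function names are hypothetical.

```python
def expected_attempts(failure_rate: float, max_retries: int) -> float:
    """Expected number of attempts per request.

    Attempt k+1 only happens if the first k attempts all failed,
    so the expectation is 1 + p + p^2 + ... + p^max_retries.
    """
    if not 0.0 <= failure_rate <= 1.0:
        raise ValueError("failure_rate must be between 0 and 1")
    if max_retries < 0:
        raise ValueError("max_retries must be non-negative")
    return sum(failure_rate ** k for k in range(max_retries + 1))


def cost_with_retries(base_cost_usd: float, failure_rate: float, max_retries: int) -> float:
    """Expected cost per request once retry attempts are included."""
    return base_cost_usd * expected_attempts(failure_rate, max_retries)


# Assumed inputs: $0.013 base cost, 5% failure rate, up to 2 retries.
multiplier = expected_attempts(failure_rate=0.05, max_retries=2)
print(f"expected attempts per request: {multiplier:.4f}")
print(f"expected cost per request: ${cost_with_retries(0.013, 0.05, 2):.6f}")
```

Real systems rarely satisfy the independence assumption (failures tend to cluster during incidents), so treat the multiplier as a baseline rather than a forecast.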

Example workflow

A simple AI feature can still have a multi-layer cost stack

Below is a simplified example of a customer-facing AI assistant that retrieves documentation, generates an answer, and logs the interaction.

Illustrative request cost

Embedding search + retrieval (knowledge lookup and context assembly): $0.001
Prompt tokens (input context and instruction cost): $0.004
Completion tokens (generated response cost): $0.006
Guardrails, logging, orchestration (supporting workflow overhead): $0.002
Retry / fallback overhead (reliability overhead not always visible in product analytics): $0.001

Total estimated cost per request (illustrative blended view): $0.014
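The illustrative line items can be summed and projected to traffic volume with a few lines of code. This is a sketch using the page's example figures; the dictionary keys, the `monthly_cost` helper, and the one-million-request volume are assumptions added for illustration.

```python
from decimal import Decimal

# Illustrative per-request line items from the example (USD).
REQUEST_COST_LAYERS = {
    "embedding_search_retrieval": Decimal("0.001"),
    "prompt_tokens": Decimal("0.004"),
    "completion_tokens": Decimal("0.006"),
    "guardrails_logging_orchestration": Decimal("0.002"),
    "retry_fallback_overhead": Decimal("0.001"),
}

# Decimal avoids floating-point drift when adding small currency amounts.
total_per_request = sum(REQUEST_COST_LAYERS.values(), Decimal("0"))


def monthly_cost(requests_per_month: int) -> Decimal:
    """Project the per-request total to a monthly request volume."""
    return total_per_request * requests_per_month


for layer, cost in REQUEST_COST_LAYERS.items():
    share = cost / total_per_request
    print(f"{layer}: ${cost} ({share:.0%} of request cost)")

print(f"total per request: ${total_per_request}")
# Hypothetical volume of 1,000,000 requests per month.
print(f"monthly cost at 1M requests: ${monthly_cost(1_000_000):,.2f}")
```

At the assumed volume, a $0.014 request becomes $14,000 per month, which is how small per-request amounts turn into material spend at scale.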

What teams should measure

A simple way to operationalize feature economics

Track cost per request

Start with the cost of one user interaction.

Group requests into workflows

Capture multi-step chains, tools, and retries.

Map to features and cohorts

See which surfaces and customer groups consume spend.

Compare against value

Measure margin against pricing, retention, or expansion.
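The four steps above can be sketched as a small roll-up over per-request cost records. This is an illustrative outline, not a description of how SpendLens works. The record fields, the sample data, and the revenue figures are all hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class RequestRecord:
    """One attributed request. Field names are assumptions for this sketch."""
    request_id: str
    workflow_id: str
    feature: str
    user_id: str
    cost_usd: float


def rollup(records: list[RequestRecord], key: str) -> dict[str, float]:
    """Sum request cost by any record attribute (workflow_id, feature, user_id)."""
    totals: dict[str, float] = defaultdict(float)
    for record in records:
        totals[getattr(record, key)] += record.cost_usd
    return dict(totals)


# Step 1: cost per request (made-up sample data).
records = [
    RequestRecord("r1", "wf-1", "support_copilot", "u1", 0.014),
    RequestRecord("r2", "wf-1", "support_copilot", "u1", 0.020),
    RequestRecord("r3", "wf-2", "research_agent", "u2", 0.250),
    RequestRecord("r4", "wf-3", "support_copilot", "u2", 0.014),
]

# Steps 2 and 3: group into workflows, then map to features and users.
cost_by_workflow = rollup(records, "workflow_id")
cost_by_feature = rollup(records, "feature")
cost_by_user = rollup(records, "user_id")

# Step 4: compare against value (hypothetical revenue attributed per feature).
revenue_by_feature = {"support_copilot": 0.38, "research_agent": 0.19}
margin_by_feature = {
    feature: revenue_by_feature[feature] - cost
    for feature, cost in cost_by_feature.items()
}

for feature, margin in margin_by_feature.items():
    print(f"{feature}: cost ${cost_by_feature[feature]:.3f}, margin ${margin:.3f}")
```

In this made-up data the two features share a cost model but end up with opposite margins, which is the pattern a single provider bill cannot show.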

FAQ

Frequently asked questions about AI feature cost
