Devstars
Blog
Date: 05/03/2026
There’s a conversation about AI model costs that most businesses aren’t having yet, but they will be.
It’s not “should we use AI?” That ship has sailed. It’s “why is our AI bill so high, and what on earth are we getting for it?”
When we built our first team of AI agents at Devstars, we made every mistake going. We defaulted to the most powerful model for everything, ran tasks sequentially when they could run in parallel, and wondered why we were burning through £62 before lunch.

A bit of attention, a bit of model-routing logic, and that bill dropped to around £10 a day. Same output quality. Same capabilities. Just smarter decisions about which brain to use for which job.
Here’s what we learned.
Before any pricing makes sense, you need to understand how AI is actually billed.
AI models don’t charge by the word, the query, or the hour. They charge by the token.
A token is roughly 0.75 of a word, or about 4 characters. So “digital marketing” is approximately 3 tokens. A 1,000-word blog post is around 1,300 tokens.
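That back-of-envelope arithmetic is easy to sketch. The function below uses the ~0.75-words-per-token rule of thumb from this article; real tokenisers vary by model, so treat it as a budgeting estimate only.

```python
def estimate_tokens(text: str) -> int:
    """Rough token estimate using the ~0.75-words-per-token heuristic.

    Real tokenisers differ by model; this is only a back-of-envelope
    figure for budgeting, not an exact count.
    """
    words = len(text.split())
    return round(words / 0.75)

# "digital marketing" -> 2 words -> ~3 tokens,
# matching the example in the text above.
print(estimate_tokens("digital marketing"))
```

A 1,000-word blog post comes out at 1,333 tokens with this heuristic, in line with the "around 1,300 tokens" figure above.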
There are two types, and they're priced very differently: input tokens (everything you send to the model, including your prompt and any context) and output tokens (everything the model generates in response). Output tokens cost several times more than input tokens.
Why the gap? Generating output is computationally heavy. The model is creating something new, token by token, using considerable processing power. Reading your input is comparatively light work.
This matters enormously when you’re designing AI workflows. If you can ask for a shorter, structured answer rather than a flowing essay, you save real money. A model that returns a JSON object rather than paragraphs of explanation costs a fraction of the price for the same underlying analysis.
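To see the effect in numbers: here is a minimal comparison of a flowing-essay answer against a terse JSON answer, priced at this article's Sonnet output rate (£12.00 per million output tokens). The token counts are illustrative assumptions, not measurements.

```python
# Output tokens dominate cost, so compact structured answers save money.
# Price is the article's Sonnet output figure; token counts are made up
# for illustration.
SONNET_OUTPUT_PRICE = 12.00 / 1_000_000  # GBP per output token

def output_cost(tokens: int) -> float:
    """Cost in GBP of generating the given number of output tokens."""
    return tokens * SONNET_OUTPUT_PRICE

essay = output_cost(400)      # paragraphs of explanation
structured = output_cost(60)  # the same analysis as a terse JSON object

print(f"essay: £{essay:.6f}, json: £{structured:.6f}")
```

Per call the difference looks trivial, but multiplied across thousands of calls a day it is exactly the gap between a £10 bill and a £60 one.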
All AI model costs below are quoted per million tokens as of early 2026. Think of a million tokens as roughly 750,000 words, or about 750 thousand-word blog posts.
| Model | Input (per 1M tokens) | Output (per 1M tokens) | Best For |
|---|---|---|---|
| Claude Opus | £4.00 | £20.00 | Complex reasoning, nuanced strategy |
| Claude Sonnet | £2.40 | £12.00 | Balanced quality and cost |
| Claude Haiku | £0.80 | £4.00 | Fast, repetitive tasks |
Anthropic’s tiered naming is straightforward: Opus is the flagship, Sonnet is the workhorse, Haiku is the sprinter. The gap between Opus and Haiku is 5x on output. That is the difference between £10/day and £50/day at the same volume. See Anthropic’s official pricing for the latest figures.
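The tier gap is easiest to see as a daily bill. The sketch below prices a hypothetical day's workload against each Anthropic tier using the table above; the 2M-input / 0.5M-output workload is an invented example, not a real measurement.

```python
# Daily cost at a fixed workload, using the article's per-million-token
# prices for Anthropic's tiers. The workload figures are illustrative.
PRICES = {  # model: (input GBP/1M tokens, output GBP/1M tokens)
    "opus":   (4.00, 20.00),
    "sonnet": (2.40, 12.00),
    "haiku":  (0.80, 4.00),
}

def daily_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Total GBP cost for one day's input and output token volume."""
    inp, out = PRICES[model]
    return (input_tokens * inp + output_tokens * out) / 1_000_000

# Hypothetical day: 2M tokens read, 0.5M tokens generated.
for model in PRICES:
    print(model, round(daily_cost(model, 2_000_000, 500_000), 2))
```

On that workload, Opus comes to £18/day against Haiku's £3.60, the same work at a fifth of the price wherever the cheaper tier is good enough.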
| Model | Input (per 1M tokens) | Output (per 1M tokens) | Best For |
|---|---|---|---|
| GPT-5 Mini | £0.20 | £1.60 | High-volume classification, simple tasks |
| GPT-4o Mini | £0.12 | £0.48 | Cheap, fast, surprisingly capable |
| GPT-5 | £1.00 | £8.00 | General reasoning, content |
| GPT-4o | £2.00 | £8.00 | Multimodal, vision, nuanced tasks |
| GPT-5.2 | £1.40 | £11.20 | Coding, agentic tasks |
GPT-4o Mini is extraordinary value for the right tasks. It costs almost nothing and handles classification, routing, summarisation, and structured data extraction reliably well. GPT-4 classic, by contrast, used to cost £24/£48 per million tokens — most people had no idea they were paying that.
| Model | Input (per 1M tokens) | Output (per 1M tokens) | Best For |
|---|---|---|---|
| Gemini 2.0 Flash-Lite | £0.06 | £0.24 | Ultra-cheap volume tasks |
| Gemini 2.5 Flash-Lite | £0.08 | £0.32 | Fast lightweight tasks |
| Gemini 2.5 Flash | £0.28 | £1.12 | Research, long-context analysis |
| Gemini 2.5 Pro | £1.00 | £8.00 | Deep reasoning, complex analysis |
| Gemini 3 Pro | £1.60 | £9.60 | Flagship capability |
Google’s Gemini Flash models have enormous context windows (you can feed in entire documents, websites, even videos) at prices that make Opus look like a luxury car. For research-heavy tasks where you need to process large amounts of text, Gemini Flash is often the right call.
Stop thinking about specific model names and start thinking in tiers.
Tier 1 — The Strategists (Opus, GPT-5.2 Pro, Gemini 3 Pro)
These are your senior partners. Expensive, deliberate, exceptional at nuanced reasoning. Use them sparingly for the decisions that genuinely require deep thinking. Final strategy documents. Complex competitive analysis. High-stakes content where quality is non-negotiable.
Tier 2 — The Workhorses (Sonnet, GPT-5, Gemini 2.5 Pro, GPT-4o)
This is where most of your AI work should live. Balanced capability and cost. Good enough for the vast majority of tasks, excellent for most. First drafts, analysis, client reports, research synthesis, content production at scale.
Tier 3 — The Sprinters (Haiku, GPT-4o Mini, GPT-5 Mini, Gemini 2.5 Flash-Lite)
Fast and cheap. Deceptively capable for the right tasks. Routing decisions, classification, structured data extraction, summarisation, simple Q&A. When you need to process thousands of items, this is your tier.
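Thinking in tiers translates directly into routing code. Here is a minimal sketch of a tier router; the task names and model choices are illustrative assumptions, not Devstars' actual routing table.

```python
# Map task types to tiers, and tiers to a default model. Unknown tasks
# fall back to the workhorse tier rather than the expensive one.
TASK_TIERS = {
    "strategy_doc": 1,          # strategists: high-stakes reasoning
    "competitive_analysis": 1,
    "first_draft": 2,           # workhorses: most day-to-day work
    "client_report": 2,
    "classification": 3,        # sprinters: cheap, high-volume tasks
    "summarisation": 3,
    "keyword_check": 3,
}

TIER_MODELS = {
    1: "claude-opus",
    2: "claude-sonnet",
    3: "claude-haiku",
}

def route(task_type: str) -> str:
    """Pick a model for a task; unknown tasks default to tier 2."""
    tier = TASK_TIERS.get(task_type, 2)
    return TIER_MODELS[tier]
```

The important design choice is the fallback: defaulting unknown tasks to the mid tier, not the flagship, is what stops "safer" from quietly becoming "most expensive".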
When we built our AI agent team at Devstars, we started naively. Every task went to the best model because it felt safer. Why take risks with a cheaper model?
The problem is that “safer” is relative. Yes, Opus produces slightly better output than Haiku for some tasks. But for a task like “does this piece of content contain a keyword?” or “summarise this in three bullet points,” Haiku does the job just as well at a fraction of the cost.
Here’s what we changed: we mapped each workflow to the appropriate model tier instead of defaulting to the flagship, asked for compact structured outputs rather than essays, and cached repeated prompts so we never paid for the same answer twice.
The result: costs dropped from around £32 to roughly £8 a day. Same agents, same tasks, same quality output. Just smarter model selection.
Here’s the reality check.
A startup running AI agents, processing content, analysing customer queries, and generating reports can easily spend £1,000–£3,000 a month on AI model costs if they are not paying attention. Most of that spend comes down to two mistakes: using expensive models for low-value tasks, and generating more output tokens than necessary.
A few hours of audit — mapping each workflow to the appropriate model tier, compressing outputs, implementing caching — can cut that by half or more without any reduction in quality. That’s real money, especially when you’re scaling.
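Of the audit levers above, caching is the simplest to implement: identical prompts should never be paid for twice. Here is a minimal in-memory sketch, where `call_model` is a stand-in for whatever client function you actually use, not a real library API.

```python
import hashlib

# Minimal in-memory response cache, keyed on model + prompt.
# call_model is a placeholder for your real client call.
_cache: dict[str, str] = {}

def cached_call(model: str, prompt: str, call_model) -> str:
    """Return a cached response for an identical model+prompt pair,
    calling the model only on a cache miss."""
    key = hashlib.sha256(f"{model}:{prompt}".encode()).hexdigest()
    if key not in _cache:
        _cache[key] = call_model(model, prompt)
    return _cache[key]
```

In production you would add an expiry policy and persistence, and note that some providers also offer server-side prompt caching at a discount, which stacks with this.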
If you’re just starting with AI tools, don’t default to the most powerful model. Start with a mid-tier model (Sonnet, GPT-4o, Gemini 2.5 Pro), see what quality you’re getting, then decide whether you need to step up or can step down.
If you’re already running AI workflows, audit them. For each task, ask: Does this genuinely require Tier 1, or am I paying premium prices out of habit?
If you’re building an AI-powered product or service, build model routing from the start. It’s far easier to do at the design stage than to retrofit later.
And if your AI bill is creeping up and you’re not sure why, get someone to look at it. The savings are usually sitting in plain sight.
AI model costs have been falling consistently. GPT-4, which cost £24/£48 per million tokens at launch, has effectively been replaced by models that are faster, better, and a fraction of the price. This trend will continue.
The implication: lock in flexible architecture. Don’t hardcode a specific model into your systems. Use an abstraction layer that lets you swap models when better options emerge. Which, in this market, is roughly every six months.
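The abstraction can be as simple as one level of indirection: call sites ask for a role, and a single config maps roles to whatever the current best model is. Role names and model strings below are illustrative assumptions.

```python
# Call sites never name a model directly; they name a role. Swapping a
# model when a better option ships is then a one-line config change.
MODEL_CONFIG = {
    "strategist": "claude-opus",
    "workhorse": "claude-sonnet",
    "sprinter": "claude-haiku",
}

def model_for(role: str) -> str:
    """Resolve a role to the currently configured model."""
    return MODEL_CONFIG[role]

# When a better workhorse ships, change one line of config:
# MODEL_CONFIG["workhorse"] = "some-newer-model"
```

In a larger system the same idea usually lives in an environment variable or a config file rather than code, so model swaps need no redeploy at all.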
Stuart Watkins is the founder of Devstars LWDA, a Jersey-based digital agency specialising in GEO, technical SEO, and AI-powered growth systems. If you want to get your AI model costs under control, our OpenClaw AI agent platform in Jersey is a practical starting point. That is worth a conversation.
Currently scheduling strategic partnerships for Q1-Q2 2026. Limited spaces remain.
Get a free technical consultation and project roadmap. We’ll assess your requirements and provide transparent pricing for your growth-stage development needs.
Call: +44 020 8898 3993