Cloud bills creep up, support tickets spike, payment fees bite, and in 2026 AI compute can turn a good unit model into a nasty surprise. UK SaaS founders and finance leads feel it first in the numbers, then in the decisions they can’t make: hiring slows, product bets get smaller, and fundraising gets harder.
Gross margin is simply revenue minus your cost to deliver the service (COGS), shown as a percentage of revenue (for example, £100,000 of revenue with £25,000 of COGS is a 75 percent gross margin). It matters because it drives runway, shapes valuation, and funds the work that keeps growth going. Healthy SaaS often targets 70 to 80 percent gross margin, top performers push 80 percent plus, and early-stage businesses can be lower while they stabilise infrastructure and support.
This post gives you a practical checklist to improve SaaS gross margin by reducing COGS without wrecking product quality or the customer experience. You’ll see where to look first (cloud, data, support, payments, third-party tools, AI usage), what to measure, and how to make changes that stick.
At Consult EFC, we help UK startups and SMEs grow the proper way, with clear numbers, tight controls, and decisions you can defend when investors start asking questions.
Start with a clean gross margin: what counts as SaaS COGS and what doesn’t
Before you improve gross margin, you need to trust the number. If your SaaS COGS are messy, your margin becomes a foggy dashboard: it might look fine, right up until you try to scale and the cash doesn’t follow.
A clean gross margin starts with one rule: COGS are the costs that rise (or should rise) as you deliver the service to customers. Everything else is operating expense (opex). Get that split right, and you can see what is really driving margin: pricing, usage, vendor sprawl, or process.

A simple SaaS COGS map: cloud, third-party tools, payments, and support
If you want a quick mental model, think of COGS as the “cost to serve” bucket. When a customer signs up, uses the product, raises tickets, and pays, what costs do you incur to deliver that experience?
Here’s a practical checklist of common SaaS COGS lines, plus where teams often miscode them.
- Cloud hosting and runtime compute (COGS): AWS, Azure, GCP usage, container hosting, managed databases, serverless runtime.
- Data transfer and CDN (COGS): egress fees, CDN delivery, bandwidth charges (often missed because it sits in the same cloud bill).
- Third-party infrastructure tied to production (COGS): error monitoring, logging, auth, email/SMS delivery, search, video, mapping, analytics that runs in production and scales with users.
- AI/ML usage to deliver the product (COGS): per-token LLM API calls, embedding generation, image processing, vector database consumption.
- Payment processing fees (COGS): card fees, gateway fees, fraud tools charged per transaction.
- Customer support and success that is part of delivery (usually COGS): support agents, on-call rotation, customer success tied to retention and ongoing delivery (not new logo sales).
- Direct service delivery costs (COGS): if you bundle onboarding, data migration, or managed services as part of what customers buy.
Common misclassifications to watch:
- Support in overhead: putting support under admin or “G&A” can make gross margin look better than reality, which hides a cost-to-serve problem.
- AI compute hidden in R&D: if AI calls are required for the product to work, they belong in COGS, even if the invoice comes from a “dev” team budget.
- Cloud spend buried in engineering: “infra” coded to product development makes margin look artificially high, then surprises you when revenue grows but margin doesn’t.
- Implementation services treated as opex: if onboarding is included in the subscription, it is still part of fulfilment.
- Conversely, dumping true R&D into COGS: this makes margins look worse than reality, and can push you into wrong pricing moves or premature cost cuts.
The point isn’t perfection. It’s consistency, so month-to-month changes mean something.
How to spot misclassified costs in under an hour
You don’t need a full finance project to find the big errors. You need a quick pass that highlights what moves with customer usage.
Use this fast process:
- Pull the last 3 months of spend from your accounting system and card tool (include bills, subscriptions, and cloud invoices).
- Sort by vendor, then total value (largest first). The top 20 vendors usually tell the story.
- Tag each vendor as COGS or opex, using a simple rule: “Does this cost increase when customers use the service?”
- Review “swing vendors” (spend varies with usage). These are the usual misclassification hotspots.
- Sanity check against your product flow: sign-up, usage, storage, messages, tickets, payments. If a vendor sits in the flow, it is likely COGS (a rough script version of this pass is sketched below).
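If your spend lives in a CSV export, the pass above can be scripted. A minimal sketch, assuming columns named vendor, month, and amount (the column names and the 15 percent variability threshold are assumptions, not a rule):

```python
import pandas as pd

# Three months of spend exported from your accounting and card tools.
# Column names are assumptions; adjust them to match your export.
spend = pd.read_csv("spend.csv")  # columns: vendor, month, amount

# Total spend and month-to-month variability per vendor, largest first.
by_vendor = spend.groupby("vendor")["amount"].agg(total="sum", variability="std")
top20 = by_vendor.sort_values("total", ascending=False).head(20)

# Flag "swing vendors": spend that varies a lot relative to its monthly average.
# These are the usual COGS-versus-opex misclassification hotspots.
top20["swing"] = top20["variability"] > 0.15 * (top20["total"] / 3)
print(top20)
```

Anything flagged as a swing vendor gets the “does it increase when customers use the service?” question first.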
Two recurring surprises in SaaS:
- AI usage bills: token spikes, retries, and background jobs can turn into a margin leak fast. If AI is part of the core user journey, treat it like hosting.
- Data transfer (egress) and CDN fees: they often rise quietly with usage, and they are easy to miss when the “headline” compute cost looks stable.
Keep your policy simple and written down. Complexity creates loopholes and reclassifications later.
A quick warning on capitalisation: don’t capitalise costs just to make the P&L prettier. If you capitalise software costs, follow a clear, consistent policy that matches what the spend really is (build versus run). Over-aggressive capitalisation can make gross margin and EBITDA look “better” in the short term, then bite when amortisation and catch-up costs land.
Benchmarks by stage: what ‘good’ looks like in 2026
Benchmarks help you sense-check your numbers, but they should not become a stick to beat yourself with. The real target is a margin that improves over time as you scale, and stays stable as your product gets more complex.
Here’s practical stage-based guidance for SaaS gross margin in 2026:
| Stage | Practical “good” gross margin | What it usually means |
|---|---|---|
| Early-stage | 50%+ can be acceptable | You are still stabilising hosting, support, tooling, and pricing. |
| Growth-stage | 75%+ is strong | Unit costs are under control, and scale is starting to work. |
| Mature-stage | 80%+ is excellent | Efficient delivery, tight infra, and disciplined support model. |
How to interpret the numbers:
- Below 70% often signals a pricing problem, a high cost-to-serve problem, or both. It can also mean your COGS coding is wrong, which is why cleaning up classification comes first.
- Product type matters. High-touch onboarding, compliance-heavy products, and services-heavy models often run lower margins than pure self-serve SaaS.
- Treat benchmarks as a direction, not a verdict. If you moved from 58% to 66% in two quarters, that is progress you can build on (and explain to investors).
The practical COGS reduction checklist: the big levers that move gross margin fast
When SaaS gross margin drops, it’s rarely one dramatic issue. It’s usually lots of small “always on” costs that stack up: idle environments, oversized instances, noisy AI prompts, payment fee creep, and a support model that doesn’t match your pricing.
This checklist focuses on changes that move fast. The goal isn’t to starve the product of resources. The goal is to stop paying for things customers don’t value, and keep spend tightly tied to real usage.

Cloud and infrastructure: cut waste without risking uptime
Cloud cost reduction works best when you treat it like housekeeping plus engineering discipline. You want fewer surprises, smaller baselines, and a clear way to undo changes if something goes wrong.
Rightsize CPU and memory (then verify). Many SaaS services run with headroom that never gets used. Start with the top 10 services by cost, then check actual utilisation (CPU, memory, request rate, and latency). Reduce one step at a time and watch error rates and tail latency (p95 and p99), not just averages.
Turn off idle environments. Dev, staging, preview, and QA environments can quietly become production-sized. Put them on schedules (nights and weekends) and auto-delete old preview branches. If you need persistent test data, separate storage from compute so you can shut down the expensive part.
Use auto-scaling, but set sane floors. Auto-scaling helps when traffic is spiky, but it can also inflate your baseline if your minimums are too high. Set minimum capacity to what you need for uptime and performance, not what “feels safe”. Pair this with load tests so scaling events happen before users notice.
Reserved Instances and Savings Plans (commitments with intent). For steady workloads, commitments often deliver 30 to 70 percent savings depending on provider, region, term, and how predictable the workload is. The trap is committing before you’ve cleaned up waste. A simple order:
- Rightsize and kill idle spend.
- Identify what is truly “always on”.
- Commit only that baseline (a rough way to estimate it is sketched below).
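A rough sketch of that last step, assuming you can export hourly usage from your billing tooling; the 10th percentile is a conservative judgment call, not a provider rule:

```python
# Hourly compute units over a recent period (exported from your cloud
# billing or usage tooling; the numbers here are illustrative).
hourly_usage = [42, 38, 55, 61, 40, 39, 44, 47, 41, 38]

# Commit only to capacity you use in nearly every hour. A low percentile
# is a conservative proxy for the always-on floor.
sorted_usage = sorted(hourly_usage)
baseline = sorted_usage[int(0.10 * (len(sorted_usage) - 1))]
print(f"Candidate commitment baseline: {baseline} units")
```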
Storage lifecycle rules (pay for what you actually need). Storage is cheap until it’s not. Put lifecycle policies in place:
- Move old logs and raw events to colder tiers after a set number of days.
- Expire temporary objects (exports, uploads, build artefacts).
- Enforce retention rules that match customer needs and compliance, not habit (an example configuration is sketched after this list).
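For teams on AWS S3, a minimal sketch of what those rules can look like via boto3; the bucket name, prefixes, and day counts are placeholders that your retention policy should set:

```python
import boto3

s3 = boto3.client("s3")

s3.put_bucket_lifecycle_configuration(
    Bucket="your-app-data",  # placeholder bucket name
    LifecycleConfiguration={
        "Rules": [
            {   # Move old raw logs to a colder tier, then expire them.
                "ID": "cold-then-expire-logs",
                "Filter": {"Prefix": "logs/"},
                "Status": "Enabled",
                "Transitions": [{"Days": 30, "StorageClass": "GLACIER"}],
                "Expiration": {"Days": 365},
            },
            {   # Temporary exports should not live forever.
                "ID": "expire-exports",
                "Filter": {"Prefix": "exports/"},
                "Status": "Enabled",
                "Expiration": {"Days": 7},
            },
        ]
    },
)
```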
Egress controls (bandwidth is a margin leak). Data transfer fees often show up late, after usage grows. Control it by keeping data close to compute, avoiding cross-region chatter, compressing payloads, and reducing “chatty” service calls. If customers pull lots of data, consider paid limits or a tier that reflects the real cost.
Caching and CDN where appropriate. If users keep asking for the same content, don’t regenerate it. Caching (application, database, and edge) and a CDN can cut compute and latency at the same time. Prioritise:
- Public assets and static content at the edge.
- Hot reads in memory caches.
- Pre-computed responses for expensive endpoints (a tiny cache sketch follows).
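A tiny in-process cache sketch for hot reads; in production you would more likely reach for Redis or your framework’s cache layer, so treat this as the shape of the idea (all names are hypothetical):

```python
import time

_cache: dict[str, tuple[float, object]] = {}

def cached(key: str, compute, ttl_seconds: int = 60):
    """Return a cached value if still fresh, otherwise recompute and store it."""
    now = time.monotonic()
    hit = _cache.get(key)
    if hit and now - hit[0] < ttl_seconds:
        return hit[1]  # cache hit: no recompute, no extra load
    value = compute()  # the expensive call you want to stop repeating
    _cache[key] = (now, value)
    return value

# Usage (hypothetical names):
# report = cached(f"report:{account_id}", lambda: build_report(account_id))
```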
Don’t cut blind: stress tests and rollback plans. Every cost cut should have a safety net:
- A performance test before and after (even if it’s a simple load script).
- A clear rollback step (one config change, one deploy, one toggle).
- Monitoring alerts ready before the change goes live.
A mini checklist keeps you honest without turning this into a full-time job.
Weekly checks (30 to 60 minutes):
- Top cost increases by service, region, and environment.
- Idle environments still shutting down on schedule.
- Any unusual egress spikes and their source.
- Error rates and p95 latency on the most-used endpoints.
Monthly checks (half-day deeper review):
- Rightsizing candidates (underutilised instances, oversized DBs, low-throughput clusters).
- Commitment coverage (what percentage of steady spend is covered by RIs or Savings Plans).
- Storage growth and retention compliance (logs, backups, object stores).
- Architecture hot spots that cause repeat compute (missing caching, inefficient queries, cross-zone traffic).
AI and data compute: keep the model bill from eating your margin
If your SaaS includes AI features, treat inference and model API usage like any other variable COGS line. You wouldn’t let card fees drift without a look, so don’t let token spend drift either.
Start by building a simple view of cost per request (or cost per active user). Then work backwards: what’s driving the cost, and what can you trim without harming output quality?
Prompt and context trimming. Many teams ship long system prompts, giant context windows, and verbose tool output because it’s quick. Tighten it:
- Remove duplicated instructions.
- Summarise long histories.
- Cap retrieved context to only what improves answers.
Small reductions in tokens compound fast at scale, and a simple cap is easy to script (see the sketch below).
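A rough capping sketch. It uses a four-characters-per-token heuristic instead of a real tokenizer, so the budget is approximate; swap in your model’s tokenizer if you have one:

```python
def cap_context(chunks: list[str], max_tokens: int = 2000) -> list[str]:
    """Keep retrieved chunks (assumed sorted by relevance) within a token budget."""
    kept, used = [], 0
    for chunk in chunks:
        est_tokens = len(chunk) // 4  # crude heuristic: ~4 characters per token
        if used + est_tokens > max_tokens:
            break  # everything below this relevance line gets dropped
        kept.append(chunk)
        used += est_tokens
    return kept
```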
Cache outputs where it’s safe. If the same inputs produce the same outputs (or near enough), cache them. Common wins include classification, routing decisions, template generation, and “explain this” content. Even partial caching, like caching embeddings for repeated documents, can reduce repeat spend.
Route simple tasks to smaller models. Not every request needs the best model. Use a router rule set:
- Small model for extraction, tagging, formatting, and basic Q&A.
- Larger model only when confidence is low, or for tasks customers pay for.
This keeps quality where it matters, and cost low where it doesn’t (a minimal router rule is sketched below).
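A minimal version of that router rule; the task labels, model names, and 0.8 confidence threshold are all placeholders to tune, not a vendor API:

```python
CHEAP_TASKS = {"extraction", "tagging", "formatting", "basic_qa"}

def choose_model(task: str, confidence: float | None = None) -> str:
    """Route to the small model by default; escalate only when it earns it."""
    if task in CHEAP_TASKS and (confidence is None or confidence >= 0.8):
        return "small-model"  # placeholder model name
    # Low confidence, or a task customers explicitly pay for.
    return "large-model"      # placeholder model name
```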
Rate limits and back-pressure. Put guardrails in front of usage spikes:
- Per-user and per-account request limits.
- Burst limits for automation-heavy endpoints.
- Queues for batch jobs instead of real-time calls.
These controls stop one customer workflow, or one bug, from turning into a five-figure bill. A simple token-bucket limiter is sketched below.
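A simple per-account token-bucket sketch; the refill rate and burst size are illustrative and should come from your plan limits:

```python
import time

class TokenBucket:
    """Per-account rate limiter: steady refill, bounded bursts."""

    def __init__(self, rate_per_sec: float = 5.0, burst: int = 20):
        self.rate = rate_per_sec
        self.capacity = burst
        self.tokens = float(burst)
        self.last = time.monotonic()

    def allow(self) -> bool:
        now = time.monotonic()
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False  # reject or queue instead of calling the model

# Keep one bucket per account (for example, in a dict keyed by account ID)
# and check allow() before every AI call.
```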
Align usage-based pricing with your real costs. If AI spend is variable, price should be too. Options include metered add-ons, included quotas by tier, or hard caps with upgrade prompts. The aim is simple: heavy usage must pay for itself.
Customer-level caps and clear overage rules. Caps protect your downside. They also force product clarity. If you can’t explain why a customer hit a cap, you may have a UX problem or a runaway loop.
Monitor cost per request like you monitor uptime. Track:
- Tokens or compute per request.
- Cache hit rate.
- Model mix (what percentage is routed to larger models).
- Retry rate and failure loops (often hidden cost drivers).
A metric you can use in finance reviews and product stand-ups:
- Gross profit per 1,000 AI requests = Revenue attributable to those requests minus AI compute cost for those requests.
Set a guardrail target tied to pricing. Example: if your pricing implies £10 of revenue per 1,000 requests, and you want 80 percent gross margin on that feature, your AI compute budget is £2 per 1,000 requests (before any other delivery costs). If you’re above that, you either cut unit cost or reprice.
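The same guardrail as a small calculation you could keep in a notebook; the figures are the illustrative ones from the example above:

```python
def gross_profit_per_1k(revenue_per_1k: float, ai_cost_per_1k: float) -> tuple[float, float]:
    """Gross profit per 1,000 AI requests and the implied margin."""
    profit = revenue_per_1k - ai_cost_per_1k
    return profit, profit / revenue_per_1k

profit, margin = gross_profit_per_1k(revenue_per_1k=10.0, ai_cost_per_1k=2.0)
print(f"£{profit:.2f} per 1,000 requests at {margin:.0%} margin")  # £8.00 at 80%
```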
Payment fees: reduce leakage on every transaction
Payment processing can look “fixed” because it’s a percentage. In reality, fees move with your mix: card type, region, refund behaviour, chargebacks, and the deal you’ve negotiated.
Know the basics:
- Blended processing rate: your average percentage fee across all transactions.
- Fixed fees: per-transaction pence fees that hurt more on low-priced plans.
- Cross-border costs: higher fees when the card issuer and merchant are in different countries.
- Chargebacks: fees plus the admin time, plus the revenue risk.
Practical steps that usually pay back quickly:
Negotiate as volume grows. Processors price to risk and scale. If your volume has increased, your pricing should be reviewed. Bring data: monthly volume, average transaction value, refund rate, chargeback rate, and country mix.
Add local payment methods when it fits your market. In some regions, local bank transfer or direct debit methods can be cheaper than cards, and can reduce churn by making payments “stickier”. Only do this where customers will actually use it, and where reconciliation won’t become a mess.
Reduce refunds by fixing billing flows. Refunds don’t just lose revenue, they often don’t fully reverse fees. Common fixes:
- Clearer invoice descriptors so customers recognise the charge.
- Strong dunning (card update flows, reminder emails, in-app prompts).
- Proration rules that customers understand.
- Better cancellation flows that reduce “angry refund” requests.
Batch payouts where relevant. If your model involves paying out to creators, partners, or sellers, batching can reduce per-payout fees and operational work. It also makes cash forecasting easier.
Review monthly fee reports, not just the headline percentage. Look for:
- Increases in cross-border fees.
- Higher fixed fee impact (plan mix shifts to lower ARPA).
- Chargeback spikes tied to a product or billing change.
- “Misc” fees from add-on tools (fraud, dispute management).
A simple example shows why this matters. If you improve your effective processing rate by 0.4 percentage points, the saving flows straight into gross margin.
- Annual revenue processed: £2,000,000
- Fee improvement: 0.4 percentage points
- Annual COGS reduction: £8,000
That £8,000 is gross profit you can spend on product, support, or runway, without selling a single extra seat.
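The same arithmetic as a reusable check; drop in your own processed volume and rate improvement:

```python
def annual_fee_saving(processed_volume: float, rate_improvement: float) -> float:
    """COGS reduction from improving the effective processing rate."""
    return processed_volume * rate_improvement

# 0.4 percentage points on £2m of processed volume.
print(annual_fee_saving(2_000_000, 0.004))  # 8000.0, straight into gross profit
```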
Customer support and onboarding: lower cost to serve while keeping customers happy
Support is a margin lever because it scales with customers, not with your org chart. If you don’t design it, it designs itself, and you end up hiring to keep up with avoidable tickets.
The best SaaS support cost reduction looks like this: fewer tickets, faster resolution, and happier users because the product is clearer.
Build self-serve help that actually gets used. A help centre only works if it answers the questions customers ask. Start with your top 20 ticket topics, then create short articles with screenshots, short videos, and copy-paste steps. Keep every page focused on one outcome.
Improve docs and in-app guidance. If users need to leave the product to understand the product, friction rises. Use:
- Tooltips and first-run checklists for key actions.
- Empty-state guidance (what to do when there’s “no data yet”).
- Contextual links to docs from the exact screen where confusion happens.
Deflect with chat and forms, but keep it honest. Chatbots can help with routing and simple answers, but customers hate loops. Use chat to:
- Collect key details before a human joins (plan, account ID, error message).
- Offer relevant articles before opening a ticket.
- Route urgent issues (billing, outages) to a fast lane.
Triage better so senior people don’t do junior work. Most teams waste money on poor routing. Create a simple triage rule set by category:
- Billing and access issues: fast response, scripted fixes.
- Bugs: capture steps, screenshots, logs, then send to engineering.
- How-to questions: point to docs, offer short guidance, then close.
This protects engineer time and keeps customers from waiting.
Fix the product to remove repeat tickets. Repeat tickets are your goldmine. Every recurring issue is a “tax” you’re paying each month. Create a monthly loop where support tags the top drivers, product prioritises two fixes, and you measure the ticket drop after release.
Segment support by plan (match cost to price). You don’t need the same service level for every user:
- Free tier: docs, community, and automation-first support.
- Entry paid tiers: fast self-serve plus standard human support.
- Higher tiers: priority response, named success support, and proactive check-ins.
This keeps cost to serve aligned with gross margin by tier.
A metric checklist you can track without overcomplicating reporting:
- Cost to serve per account (support payroll and tools, divided by active accounts; see the sketch after this list).
- Tickets per customer (also track by plan and cohort).
- First response time (by channel and tier).
- Churn signals (spike in tickets before cancellation, billing disputes, repeated “how do I” issues, unresolved bugs).
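The first two metrics take minutes to script once you can export monthly support payroll, tool spend, ticket counts, and active accounts (the figures below are illustrative):

```python
def cost_to_serve(support_payroll: float, support_tools: float, active_accounts: int) -> float:
    """Monthly support cost per active account."""
    return (support_payroll + support_tools) / active_accounts

def tickets_per_customer(tickets: int, active_accounts: int) -> float:
    return tickets / active_accounts

print(cost_to_serve(18_000, 2_000, 800))   # £25.00 per account
print(tickets_per_customer(1_200, 800))    # 1.5 tickets per customer
```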
If you treat support as part of product delivery, not just a help desk, it becomes one of the quickest ways to protect margin while improving retention.
Protect growth while you cut costs: guardrails, pricing checks, and product decisions
Gross margin work only helps if it keeps your SaaS healthy. The danger is simple: a “saving” that slows the product, frustrates users, or weakens support often shows up later as churn, lower expansion, and longer sales cycles. Treat cost reduction like tuning an engine, not ripping parts out. Set guardrails first, check pricing against cost-to-serve, then run small tests you can roll out with confidence.
Set ‘do not break’ metrics before you cut anything
Before you touch infrastructure, support coverage, third-party tools, or AI usage, agree on the metrics you won’t sacrifice. These are your do not break lines. If a cost change pushes a guardrail out of range, you pause, roll back, or adjust the plan.
Here are practical guardrails to set, with a plain link to future revenue.
- Uptime (availability): If users can’t access the product, they can’t get value. Outages trigger refund requests, support spikes, angry stakeholders, and sometimes contract clauses. Over time, reliability becomes part of your brand, and weak reliability drags on renewals.
- Latency (p95 and p99 response times): Slow software feels broken. Even small delays reduce usage, increase drop-offs, and create “the tool is annoying” feedback that sales and success teams then have to fight. Latency can also reduce activation, which means fewer customers reach the “aha” moment that leads to retention.
- Error rate (failed requests, job failures, timeouts): Errors create hidden costs fast. Users repeat actions, workflows break, and support tickets multiply. Many SaaS businesses see error rate increases show up in churn with a lag, so treat it as an early warning sign.
- NPS (if you use it): NPS is not a margin metric, but it can be a quick read on whether changes are harming the experience. Watch for sharp drops right after cost actions, especially if they coincide with reliability or support changes.
- Churn (logo churn and revenue churn): Churn is a direct tax on all your margin wins. If you save £10k a month and lose £15k a month in recurring revenue a quarter later, you did not improve the business; you just moved the pain.
- Net revenue retention (NRR): NRR tells you if existing customers expand and stick around. Cost cuts that reduce performance, limit functionality, or weaken support can kill expansion. When NRR falls, your growth engine has to work harder just to stand still.
- Activation rate: Activation is the bridge between acquisition and retention. If cost cutting reduces onboarding help, slows key flows, or limits essential compute, fewer customers will adopt the product properly. That usually turns into higher churn in later months.
- Support satisfaction (CSAT) and first response time: Support is part of delivery for most SaaS. If response times worsen or CSAT drops, you often see renewals become harder and expansion deals stall. Even if customers stay, success teams spend more time “repairing” relationships.
A useful way to operationalise guardrails is to set thresholds and owners. Example: “p95 latency must stay under X ms for these endpoints, and the engineering lead owns weekly review.” Keep it simple, and write it down. The point is to stop random cuts that feel good in a spreadsheet and hurt you in renewals.
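One way to make those lines concrete: a small checker with assumed metric names and placeholder thresholds, run after every cost change:

```python
# Thresholds and owners are illustrative; agree them with the teams who own them.
GUARDRAILS = {
    "p95_latency_ms": {"max": 400,  "owner": "engineering lead"},
    "error_rate_pct": {"max": 0.5,  "owner": "engineering lead"},
    "uptime_pct":     {"min": 99.9, "owner": "engineering lead"},
    "csat_pct":       {"min": 90,   "owner": "support lead"},
}

def check_guardrails(metrics: dict[str, float]) -> list[str]:
    """Return breaches; any breach means pause, roll back, or adjust the plan."""
    breaches = []
    for name, rule in GUARDRAILS.items():
        value = metrics.get(name)
        if value is None:
            continue  # no reading this period; don't guess
        if "max" in rule and value > rule["max"]:
            breaches.append(f"{name}={value} over {rule['max']} ({rule['owner']})")
        if "min" in rule and value < rule["min"]:
            breaches.append(f"{name}={value} under {rule['min']} ({rule['owner']})")
    return breaches
```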
Pricing and packaging quick audit: are you charging enough for high-cost users?
Not all customers cost the same to serve. In SaaS, your “expensive” accounts often look like your “best” accounts at first: they are active, they push limits, and they need help. If your pricing does not reflect that cost-to-serve, gross margin gets squeezed as you grow.
Common high-cost drivers include:
- Heavy API usage: High request volume, frequent polling, large payloads, retry loops, and webhook floods can drive compute and egress.
- Large data volumes: Storage growth, backups, indexing, and analytics queries add real cost, even if storage alone looks cheap.
- Premium compute needs: Real-time processing, AI features, complex reporting, and high-frequency jobs cost more than basic CRUD usage.
- Lots of support and success time: Training, custom workflows, security reviews, and “can you jump on a call” support are valuable, but they must be paid for.
Practical fixes that protect margin without picking fights:
Introduce usage-based components where usage is the cost driver.
If cost rises with requests, runs, tokens, or data processed, you need a pricing line that moves with it. That can be metered billing or prepaid bundles. Keep the unit clear so customers can predict bills.
Add fair use limits to flat plans.
Flat pricing can still work if you define what “normal” usage means. A fair use policy keeps the offer simple while giving you a way to handle edge cases without resentment. The key is transparency, and a friendly upgrade path.
Create add-ons for premium compute or heavy features.
If a feature drives real cost (advanced analytics, AI generation, large exports, high-frequency sync), isolate it as an add-on. Customers who value it will pay; customers who don’t will not subsidise it.
Set minimum contract values for high-touch plans.
If a plan includes onboarding, dedicated support, or faster response times, protect your team. A minimum annual contract value (or onboarding fee) stops low-revenue accounts from consuming high-value time.
Offer annual prepay incentives to fund delivery.
Annual prepay does not change gross margin directly, but it improves cash and reduces billing churn risk. For customers with predictable usage, it can also support better planning for capacity and commitments.
Here’s a simple decision tree you can use when you spot a margin problem in a customer segment:
- Is the customer’s cost-to-serve high because they use more? Choose: change packaging (usage-based component, quotas, add-ons).
- Is the customer’s cost-to-serve high because they need high-touch support or bespoke work? Choose: raise price (minimum contract, paid onboarding, higher tier for high-touch).
- Is the cost-to-serve high because of internal inefficiency or architecture choices? Choose: reduce the cost driver (optimise compute, fix retry loops, improve caching, reduce ticket drivers).
If you can’t explain why a customer is “unprofitable to serve” in one sentence, you probably need better cost attribution by account. Even a rough view helps: revenue per account versus estimated COGS drivers (tickets, API calls, storage, AI usage). That clarity makes pricing changes feel fair, not arbitrary.
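Even a rough attribution script beats none. A sketch with placeholder unit costs you would estimate from your own bills (every number and name here is illustrative):

```python
# Placeholder unit costs, estimated from your own vendor bills.
COST_PER_TICKET = 8.00         # support time and tooling
COST_PER_1K_API_CALLS = 0.40   # compute and egress
COST_PER_GB_STORED = 0.05
COST_PER_1K_AI_REQUESTS = 2.00

def estimated_cogs(tickets: int, api_calls: int, gb_stored: float, ai_requests: int) -> float:
    return (tickets * COST_PER_TICKET
            + api_calls / 1000 * COST_PER_1K_API_CALLS
            + gb_stored * COST_PER_GB_STORED
            + ai_requests / 1000 * COST_PER_1K_AI_REQUESTS)

# One account: £500 of monthly revenue against a rough cost-to-serve.
revenue = 500.0
cogs = estimated_cogs(tickets=12, api_calls=2_000_000, gb_stored=150, ai_requests=40_000)
print(f"Gross profit ≈ £{revenue - cogs:.2f}")  # negative here: unprofitable to serve
```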
Run small tests, then lock in the wins
Cost work tends to fail for one reason: teams change too many things at once, then they can’t tell what helped, what hurt, and what to roll back. A simple experiment rhythm keeps you moving, and protects your SaaS guardrails.
A practical approach:
- Pick one lever (for example, reduce DB instance size, change log retention, tighten AI context, adjust support routing).
- Baseline the metric you expect to move (cost per account, cost per request, egress spend, tickets per customer), and also baseline your guardrails (latency, error rate, CSAT, churn signals).
- Change one thing and keep the rest stable. If you need multiple tweaks, schedule them as separate tests.
- Monitor for 2 to 4 weeks. Shorter can miss billing cycles and usage patterns, longer can slow momentum.
- Scale the change only after it holds up, then move to the next lever.
Once you have a win, make it hard to undo by accident. That is where change management matters, even in small teams.
- Communicate internally: Tell support, sales, engineering, and finance what changed, why it changed, and what customers might notice.
- Document the new policy: Reserved instance rules, support routing rules, vendor approvals, environment schedules, and usage limits should live in one place.
- Avoid one-off cuts: A random “turn it off” decision might save money this month, then create outages, angry tickets, and a sprint of emergency work next month.
Think of it like setting household rules. If you agree them once and stick to them, life stays calm. If everyone makes up their own rules, you get noise, waste, and avoidable fires. In gross margin improvement, consistency is the difference between a temporary dip in spend and a lasting improvement you can forecast and defend.
Monthly gross margin operating rhythm: a repeatable scorecard and owner-based actions
If you only look at gross margin when it drops, you’ll always be reacting. A simple monthly operating rhythm makes margin feel less like a mystery and more like a routine health check. The goal is not perfect forecasting. It’s a repeatable set of numbers and a clear list of actions, owned by real people, that you can run every month.
Think of it like weighing yourself on the same scales at the same time each week. The trend tells the truth, not the one-off reading.
The SaaS gross margin scorecard that fits on one page
A one-page scorecard forces focus. If you can’t fit your margin story on one page, you probably can’t manage it quickly either.
Here’s a practical structure you can copy into a spreadsheet:
| Scorecard line | What to track (monthly) | Plain-language target |
|---|---|---|
| Revenue | Subscription revenue, usage revenue, services (if included in COGS story) | Revenue should grow without COGS growing faster |
| COGS, cloud | Hosting, managed DB, CDN, egress, logging, auth, email/SMS in production | Should fall as a % of revenue as you scale |
| COGS, AI compute | Model/API spend, embeddings, vector DB usage tied to customer features | Cost per request should trend down, or pricing should cover it |
| COGS, payments | Processor fees, fixed per-transaction fees, fraud tools per transaction | Blended rate should be stable or improving |
| COGS, support | Support payroll, tools, outsourced support, on-call allowances (if delivery-related) | Cost to serve per customer should be stable or falling |
| Gross margin % | (Revenue minus total COGS) divided by revenue | Aim for steady improvement, avoid sudden drops |
Under the table, add four “operating” blocks that make it actionable:
- COGS per customer (or COGS per 1,000 requests for API-first SaaS): this is your unit cost of delivery. Track it overall and for your top tier.
- Gross profit per customer: revenue per customer minus estimated COGS to serve that customer (even a rough version beats none).
- Top 10 vendors by spend: include month, prior month, and change %. Big margin leaks tend to sit in the top 10.
- Unit economics basics: ARPA, COGS per customer, gross profit per customer, and a simple note on whether unit economics improved or worsened (a small calculation sketch follows).
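A minimal version of the unit-economics block in pandas; the line names mirror the scorecard above and the figures are illustrative:

```python
import pandas as pd

month = pd.Series(
    {"revenue": 250_000, "cogs_cloud": 30_000, "cogs_ai": 12_000,
     "cogs_payments": 7_500, "cogs_support": 18_000}  # illustrative figures
)
active_customers = 800

total_cogs = month.drop("revenue").sum()
gross_profit = month["revenue"] - total_cogs

print(f"Gross margin: {gross_profit / month['revenue']:.1%}")        # 73.0%
print(f"COGS per customer: £{total_cogs / active_customers:.2f}")    # £84.38
print(f"Gross profit per customer: £{gross_profit / active_customers:.2f}")  # £228.13
```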
Targets need context. Benchmarks (like 70 to 80 percent gross margin) help you sense-check, but your best comparison is your own history. If your SaaS was at 74 percent for six months and is now 68 percent, that drop matters even if 68 percent is “fine” for someone else.
Assign owners for each COGS line, and give them a playbook
Margins improve when someone owns the drivers, not when finance sends a stern email. Each major COGS line needs an owner, a monthly checklist, and a clear definition of done.
A simple ownership model that works in most SaaS teams:
- Engineering owns cloud and AI compute: because architecture choices and guardrails decide the bill.
- Finance owns vendor contracts and classification: because coding, renewals, and terms control what you pay and how you report it.
- Support lead owns cost to serve: because ticket volume, routing, and time-to-resolution drive headcount.
- Product owns deflection and packaging inputs: because UX, quotas, and plan design decide how expensive customers are.
What “done” looks like each month (examples you can adopt):
- Engineering (cloud): publishes the top 5 cost drivers, explains the two biggest changes, and commits 1 to 2 fixes (for example, rightsizing a DB, deleting idle environments, reducing egress on a noisy endpoint). Done means the fix is shipped, monitored, and either kept or rolled back.
- Engineering (AI compute): reports cost per 1,000 requests, cache hit rate, model mix, and retry rate. Done means one cost control moved (prompt trim, routing change, caching, rate limit), with quality checks noted.
- Finance (vendors and classification): keeps the top 10 vendor list clean, flags renewals 60 days out, and corrects any miscoding that would distort gross margin. Done means the month closes with stable categories and no “sweep it into other” lines.
- Support lead (cost to serve): reports tickets per customer, top 5 drivers, and the cost impact (time or £) of repeat issues. Done means at least one repeat driver is closed out with a product fix, doc update, or better triage.
- Product (deflection and packaging): reviews which features drive high usage costs, checks quota pressure, and proposes packaging moves (fair use limits, add-ons, usage-based lines). Done means one packaging experiment is approved or launched, with expected margin impact written down.
This rhythm turns margin from a monthly surprise into a managed process. You still won’t control every variable, but you will always know what changed and who is fixing it.
What Consult EFC helps SaaS teams do when margins stall
When gross margin stops improving, the problem is usually one of two things: the numbers are not telling the truth, or the team can’t translate the numbers into actions. Consult EFC helps SaaS teams get both right, without turning it into a six-month finance project.
Typical work includes clean COGS mapping (so cloud, AI compute, payments, and support sit in the right place), then management reporting that shows margin by month in a format founders and heads of function will actually use. When margins move, a margin bridge makes it obvious what changed: price, volume, mix, usage, or unit cost. From there, teams often need scenario modelling for pricing and usage so they can test quotas, add-ons, and plan changes before shipping them. Finally, a cost optimisation plan ties actions to owners and guardrails, so you protect growth while you lower cost-to-serve.
Conclusion
SaaS gross margin improves when your COGS number is clean, then you work the biggest drivers in order: cloud and egress, AI compute, payment fees, and support cost-to-serve. The wins come from small, repeatable changes, backed by guardrails that protect uptime, latency, and retention. That’s how you cut waste without paying it back later in churn or stalled expansion.
This week, pick one COGS line and treat it like a product metric. Baseline the unit cost (per customer, per 1,000 requests, or per transaction), make one controlled change, then track the impact for two to four weeks. Keep what holds, roll back what harms the experience, and document the new rule so the savings stick.
If you want hands-on help tightening reporting, setting guardrails, and turning savings into a plan the team will follow, Consult EFC supports UK SaaS founders and finance leads with advisory and accounting that keeps growth fundable.
Checklist recap: classify COGS correctly, focus on the biggest cost drivers, set guardrails, run one small test at a time, then lock in the win with owners and a monthly scorecard.



