Engineering · 07 Jun 2026 · 12 min

Microservices aren't a scaling strategy

Microservices solve an organisational problem, not a performance problem. Reach for them when your team structure demands it — not because your traffic does. Here's how we decide, what goes wrong when teams decide wrong, and the boring monolith pattern we ship instead nine times out of ten.

Nneka Okafor

Head of Engineering

#architecture #microservices #monolith #distributed-systems #scalability

Every quarter, somebody walks into a Hexcore discovery sprint with a deck that opens "we need to move to microservices to scale." Every quarter, we ask the same question: scale what, exactly?

What follows is usually a list of things microservices do not, in fact, fix. Throughput on a hot endpoint. A database that's pegged at 90% CPU. A deploy pipeline that takes 40 minutes. An on-call rotation that pages three people every Sunday night. None of those problems are about service topology. All of them get worse the moment you cut the system into fifteen pieces.

Microservices are not a technical scaling strategy. They are an organisational one: a way to let independent teams ship independently without coordinating a release train. They are expensive, they are right for some teams, and they are catastrophically wrong for most.

Here is how we decide, what goes wrong when teams decide wrong, and the boring pattern we ship nine times out of ten instead.

What microservices actually buy you

Strip away the conference talks and there are exactly three things microservices give you that a monolith does not:

Independent deploys. Team A can ship without coordinating with Team B's release.
Independent failure domains. Team A's bad deploy can't take down Team B's service.
Independent technology stacks. Team A can be on Python, Team B on Go, Team C on Rust.

That's the list. Notice what isn't on it: performance, throughput, "scalability" in the way a CTO uses the word in a board deck. A well-written monolith on a single large box outperforms a badly-coordinated mesh of microservices on almost every measure that matters: median latency, p99 latency, dollars per million requests, time to debug a production incident.

The three things on the list are real, and they are valuable. They are also things you only need when you have enough teams to coordinate. If you are one team of eight engineers, you have nothing to coordinate. You are paying the full operational tax of a distributed system to solve a problem you do not have.

What you actually pay for them

The bill arrives in five places, and most teams underestimate every one.

1. The network is not free, and it is not reliable

A function call in a monolith is a few nanoseconds and cannot fail in a way you didn't write. A network call between services is, on a good day, a millisecond — and on a bad day it's a timeout, a partial response, a retry storm, or a Tuesday afternoon you spend reading TCP traces.

Every microservice boundary you create is a place the system can fail in a new way. We have lost more weekends to retry-amplification cascades and circuit-breaker misconfigurations than to any class of bug a monolith ever produced.
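To put a number on retry amplification, here is a back-of-envelope sketch. It assumes a simple chain of services in which every caller makes one attempt plus the same number of retries, and in which every attempt fails so each caller exhausts its retries. The function name and the figures are illustrative, not measurements from a real system.

```typescript
// Worst-case number of attempts that reach the deepest service for ONE
// user request, when each of `hops` network hops makes 1 initial attempt
// plus up to `retriesPerHop` retries and every attempt fails.
function worstCaseAttempts(hops: number, retriesPerHop: number): number {
  if (!Number.isInteger(hops) || hops < 0) {
    throw new RangeError("hops must be a non-negative integer");
  }
  if (!Number.isInteger(retriesPerHop) || retriesPerHop < 0) {
    throw new RangeError("retriesPerHop must be a non-negative integer");
  }
  // Each hop multiplies the attempts arriving below it by (1 + retries).
  return (1 + retriesPerHop) ** hops;
}

console.log(worstCaseAttempts(0, 2)); // 1: an in-process call, nothing to amplify
console.log(worstCaseAttempts(3, 2)); // 27: three hops, two retries each
```

Under these assumptions, three hops with two retries each turn one struggling request into 27 attempts against the service that is already in trouble, which is why retry budgets and backoff have to be designed rather than left at library defaults.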

2. Distributed transactions don't exist (in any sane form)

The moment your business logic spans two services with their own databases, you have lost transactions. You can pretend you haven't — with sagas, with the outbox pattern, with eventual consistency hand-waved past the product manager — but you are now in the business of writing compensating logic for every failure mode of every cross-service call. That is not application code. That is infrastructure code masquerading as application code, and it never makes the roadmap.

If your bug class budget for the year is twenty bugs, distributed transactions will eat fifteen of them and produce nothing the customer can see.
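A minimal sketch of what "compensating logic" means in a saga may help here. It is deliberately simplified: synchronous functions stand in for remote calls, the step names are hypothetical, and it ignores the hard part a real system has to face, which is a compensation that itself fails.

```typescript
// One step of a saga: an action plus the compensation that undoes it.
interface SagaStep {
  label: string;
  run: () => void;        // stands in for a remote call; throws on failure
  compensate: () => void; // undoes the effect of a successful run()
}

// Runs steps in order. If a step fails, runs the compensations of the
// steps that already succeeded, newest first, then rethrows the error.
function runSaga(steps: SagaStep[], events: string[]): void {
  const completed: SagaStep[] = [];
  for (const step of steps) {
    try {
      step.run();
      events.push(`ok:${step.label}`);
      completed.push(step);
    } catch (err) {
      for (const done of completed.slice().reverse()) {
        // A real system must also handle a failing compensation
        // (retry, dead-letter, page a human). This sketch does not.
        done.compensate();
        events.push(`undo:${done.label}`);
      }
      throw err;
    }
  }
}

// Hypothetical order flow in which the third remote call fails.
const demoEvents: string[] = [];
const demoSteps: SagaStep[] = [
  { label: "reserve-stock", run: () => {}, compensate: () => {} },
  { label: "charge-card", run: () => {}, compensate: () => {} },
  {
    label: "book-courier",
    run: () => { throw new Error("courier service timed out"); },
    compensate: () => {},
  },
];
try {
  runSaga(demoSteps, demoEvents);
} catch {
  // The order failed, but stock and payment have been rolled back.
}
console.log(demoEvents);
// ["ok:reserve-stock", "ok:charge-card", "undo:charge-card", "undo:reserve-stock"]
```

In a monolith with one database, all of this collapses into a single transaction that either commits or rolls back.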

3. Observability is now a product, not a feature

In a monolith, you read the log file. In a fifteen-service mesh, you need: distributed tracing, structured logs with correlation IDs, a metrics stack with consistent labelling, an incident process that knows which team owns which service, and someone whose actual job is keeping all of that healthy. None of that ships itself. All of it costs real engineering hours that are not going into the product.

We tell clients: if you cannot draw your trace pipeline on a whiteboard in two minutes, you are not ready to operate microservices. Most teams cannot.
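As a rough illustration of the smallest piece of that list, structured logs with correlation IDs, here is a sketch. The header name, function names, and field names are assumptions made for this example; a real setup would more likely lean on an established tracing library than on hand-rolled helpers.

```typescript
// Header used to carry the ID between services. The name is an assumption
// for this sketch; what matters is that every service uses the same one.
const CORRELATION_HEADER = "x-correlation-id";

type Logger = (
  message: string,
  fields?: Record<string, string | number | boolean>,
) => void;

// Reuse the caller's correlation ID if present, otherwise start a new one.
function correlationIdFrom(
  headers: Record<string, string | undefined>,
  generate: () => string,
): string {
  return headers[CORRELATION_HEADER] ?? generate();
}

// Every line is one JSON object that always carries `service` and
// `correlationId`, so logs from different services can be joined on one field.
function makeLogger(
  service: string,
  correlationId: string,
  sink: (line: string) => void,
): Logger {
  return (message, fields = {}) => {
    // Fixed fields come last so a caller cannot overwrite them by accident.
    sink(JSON.stringify({ ...fields, service, correlationId, message }));
  };
}
```

The edge service generates the ID once, and every downstream service reads it from the incoming header, logs with it, and forwards it on its own outgoing calls. Multiply that discipline by fifteen services and it becomes clear why someone has to own it.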

4. You now run a platform, whether you wanted to or not

Once you have more than three or four services, the question of how a new service starts life becomes a real product problem. Templates. Golden paths. Service catalogues. Local development that doesn't require booting twelve containers. Secrets management that scales past the founder's 1Password. CI pipelines per service. Deployment manifests per service. SLOs per service.

This is platform engineering — and it is a discipline, not a sprint. The teams that ship microservices well have a dedicated platform team. The teams that ship microservices badly have eight backend engineers each maintaining their own slightly-different Dockerfile.

5. Refactoring across boundaries is now a multi-team negotiation

A function in a monolith can be renamed in an IDE, with confidence, in thirty seconds. A field in a service contract can take a quarter to migrate: deprecate, dual-write, dual-read, backfill, cut over, clean up — with coordination across every team that touched the field. If your domain model is still in flux (and in a young product, it always is), this is the single most expensive cost of premature microservices. You have frozen a design you don't yet understand.
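To show why a simple rename turns into a sequence of steps, here is a sketch of the dual-write and dual-read phase for one field. The payload shape and the field names (`name` being replaced by `displayName`) are hypothetical, chosen only to illustrate the pattern.

```typescript
// Hypothetical contract mid-migration: `name` is the old field and
// `displayName` its replacement. Both are optional while producers and
// consumers roll forward at different speeds.
interface CustomerPayload {
  id: string;
  name?: string;
  displayName?: string;
}

// Dual-write: producers emit both fields until every consumer reads the new one.
function encodeCustomer(id: string, displayName: string): CustomerPayload {
  return { id, name: displayName, displayName };
}

// Dual-read: consumers prefer the new field and fall back to the old one,
// so they work against both upgraded and not-yet-upgraded producers.
function readDisplayName(payload: CustomerPayload): string {
  const value = payload.displayName ?? payload.name;
  if (value === undefined) {
    throw new Error(`customer ${payload.id} has neither name field`);
  }
  return value;
}
```

Only once every producer dual-writes, every consumer dual-reads, and stored data has been backfilled can the old field be dropped. Each of those conditions belongs to a different team's backlog, which is where the quarter goes.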

The decision rule we use

In a Hexcore discovery sprint, the architecture conversation usually takes about ninety minutes. The microservices question gets resolved in about ten of them, with one rule:

You need microservices when your team topology demands them. Not before.

Concretely, that means:

Fewer than ~25 engineers, single product: Monolith. Almost no exceptions. Cut into modules by domain (we'll come back to this), deploy as one artefact, and scale on the boring axes (more instances, a bigger box, read replicas, a cache).
25–80 engineers, single product or tightly-coupled suite: Modular monolith with one or two carved-out services where the failure-domain or scaling profile genuinely differs (a video transcoder, a billing engine, an ML inference path).
80+ engineers, multiple product surfaces: Service-oriented architecture is now buying you something. Cut services along team ownership lines, not along technical lines.
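Written down as code, the headcount part of the rule is almost embarrassingly small, which is rather the point. This sketch is a simplification: it looks only at engineer count, treats the approximate thresholds above as hard cut-offs, and assumes that exactly 80 falls in the middle band, since the tiers above overlap at that number. The function and type names are illustrative.

```typescript
type Architecture =
  | "monolith"
  | "modular monolith with one or two carved-out services"
  | "services cut along team ownership lines";

// Headcount-only reading of the rule. Product surfaces and coupling
// matter too and still need a human conversation.
function recommendArchitecture(engineers: number): Architecture {
  if (engineers < 25) return "monolith";
  if (engineers <= 80) return "modular monolith with one or two carved-out services";
  return "services cut along team ownership lines";
}

console.log(recommendArchitecture(12)); // "monolith"
```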

We have built systems at each of these scales. The teams that broke this rule — usually a team of twelve ambitious engineers building "microservices from day one" — universally regretted it within eighteen months. The teams that held the line shipped faster, debugged faster, and had a clearer path to actual decomposition when their headcount eventually justified it.

What "scalability" actually means

When a client says "we need to scale," nine times out of ten they mean one of these:

A specific endpoint is slow under load. Fix the endpoint. Add an index. Cache the read. Move the heavy computation off the request path. Microservices will not make a slow query fast.
The deploy is scary and infrequent. Improve CI, add feature flags, decouple deploy from release. Microservices multiply a scary deploy by the number of services, so solve it for one deployable first.
The codebase is hard to reason about. This is a module-boundary problem, not a deployment-boundary problem. You can solve it inside a single repo, with a single deploy, today.
One team is blocking another. This is the real microservices problem. Fix it with microservices, or with clearer module ownership, or by reorganising the team. All three work.
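For the first case, "cache the read" can be as small as the sketch below: a read-through cache with a time-to-live, living inside the existing process. It is illustrative only. The loader is synchronous to keep the example short, the helper name is made up, and a production version would also need a size bound and a story for invalidation.

```typescript
// Read-through cache with a time-to-live. `load` stands in for the slow
// query; `now` is injectable so expiry can be tested without waiting.
function cached<V>(
  load: (key: string) => V,
  ttlMs: number,
  now: () => number = Date.now,
): (key: string) => V {
  const entries = new Map<string, { value: V; expiresAt: number }>();
  return (key: string): V => {
    const hit = entries.get(key);
    if (hit !== undefined && hit.expiresAt > now()) {
      return hit.value; // fresh entry: skip the slow path entirely
    }
    const value = load(key); // missing or expired: load and remember
    entries.set(key, { value, expiresAt: now() + ttlMs });
    return value;
  };
}
```

Nothing about this needs a service boundary; it is a function wrapped around the slow read.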

The honest version of the conversation, every time, is: which of these are you actually trying to solve? Because the answers don't all point at the same architecture.

The boring pattern we ship instead

When the answer is "a young product with a small team that needs to ship fast," our default architecture is what we call a modular monolith on a managed Postgres. The structure looks like this:

A single deployable Go or TypeScript service, with strict internal module boundaries — one module per bounded context (billing, catalog, ordering, identity).
Modules communicate through typed in-process interfaces, not HTTP. The compiler enforces the contract.
One Postgres, with schemas-per-module, and a rule that no module reads another module's tables directly. Same DB, but the boundaries are real.
A single CI pipeline. A single deploy. A single observability stack. One on-call rotation.
Horizontal scale on the obvious axis: more instances behind a load balancer, read replicas where they help, a cache in front of read-heavy paths.
Background work on a queue (we like Postgres-backed queues for this scale — pg_jobs, river, or a hand-rolled SELECT ... FOR UPDATE SKIP LOCKED).
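To make "typed in-process interfaces" concrete, here is a small sketch in TypeScript using two of the bounded contexts named above, billing and ordering. The interface shape, method names, and receipt format are hypothetical; the point is only that ordering depends on a typed contract and never on billing's internals or tables.

```typescript
// billing module: this contract is the only thing other modules may import.
type ChargeResult =
  | { ok: true; receiptId: string }
  | { ok: false; reason: string };

interface Billing {
  charge(customerId: string, amountCents: number): ChargeResult;
}

// billing's in-process implementation, wired up once at startup.
class InProcessBilling implements Billing {
  private receipts = 0;

  charge(customerId: string, amountCents: number): ChargeResult {
    if (amountCents <= 0) {
      return { ok: false, reason: "amount must be positive" };
    }
    this.receipts += 1;
    return { ok: true, receiptId: `${customerId}-r${this.receipts}` };
  }
}

// ordering module: knows the Billing contract and nothing else about billing.
class Ordering {
  private readonly billing: Billing;

  constructor(billing: Billing) {
    this.billing = billing;
  }

  placeOrder(customerId: string, amountCents: number): string {
    const result = this.billing.charge(customerId, amountCents);
    return result.ok
      ? `confirmed ${result.receiptId}`
      : `rejected: ${result.reason}`;
  }
}

const ordering = new Ordering(new InProcessBilling());
console.log(ordering.placeOrder("c1", 500)); // "confirmed c1-r1"
```

A call across this boundary is an ordinary function call checked by the compiler. If billing is ever carved out, a network-backed class can implement the same `Billing` interface, though it will then have to deal with timeouts and partial failure that the in-process version never sees.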

This pattern has carried products we've built from launch through to seven-figure ARR without a single re-architecture. When growth eventually demands service decomposition, the module boundaries we drew on day one become the service boundaries. The migration is mechanical, not philosophical.

When we do reach for services — and how

There are real cases where carving out a service on day one is correct, even on a small team:

Wildly different scaling profiles. A real-time video pipeline alongside a normal CRUD app. Don't make your CRUD service pay for GPU instances.
Strict failure isolation. A regulated payments path that must not be affected by an outage in the marketing analytics path.
Distinct technology need. A core that has to be in Rust for a hard latency reason, alongside everything else in TypeScript.
External boundary. A service consumed by partners over a stable public API contract that you are committing to long-term.

When we do carve, we carve one service, not five. The monolith stays the centre of gravity. The carved service has a clear, narrow, well-documented contract. We do not split "because we're going to need to anyway." Premature optimisation is premature in architecture too.

The principle behind the rule

This is principle 02 of how we work: we build for the operator, not the demo. Microservices look great in a system-design interview. They look great on an architecture diagram. They look great in a conference talk.

They look much worse at 3am when a junior engineer is paging the on-call lead because a circuit breaker tripped in service B because service A's deploy was eight seconds slow and now the retry queue is backing up on service C — and the customer just wants their order to go through.

The customer does not care how many services your system has. The customer cares that it works, that it's fast, and that when something breaks, someone competent is on it within minutes.

A boring monolith, well-built, ships that experience faster than a clever distributed system, badly built, every single time.

When in doubt: write the monolith. Draw the module boundaries with care. Let the team grow into the architecture. Cut services only when the org chart, not the architecture diagram, says it's time.

That's the rule. We've watched it hold for sixty-plus shipped systems. It will hold for the next sixty too.

Written by

Nneka Okafor

Head of Engineering

Distributed systems · SRE

Distributed systems specialist. Owns Hexcore's engineering standards, hiring bar, and on-call practices.
