Cloud · 08 May 2026 · 10 min

Three boxes, three jobs: API gateway, load balancer, and reverse proxy

Half the architecture reviews we sit in conflate these three components — and that confusion shows up later as over-engineered infrastructure or under-engineered traffic management. Here's the honest distinction, the decision framework we use at Hexcore, and the patterns we ship in production.

Akeem Amusat

Founder & CEO

#api-gateway #load-balancer #reverse-proxy #architecture #infrastructure

Walk into any reasonably-sized engineering organisation and ask a senior engineer the difference between an API gateway, a load balancer, and a reverse proxy. You'll get three different answers from three different engineers, and a fourth answer from the platform team that disagrees with all of them.

This isn't pedantry. These three components are the traffic layer of every modern backend system — and conflating them produces two distinct failure modes:

Over-engineering — buying Kong or Apigee to do what a single Nginx config could have done, and then maintaining a system three orders of magnitude more complex than the problem warranted.
Under-engineering — trying to stretch Nginx into an API gateway and ending up with 4,000-line config files that no one understands, with rate-limiting logic glued together via Lua scripts.

Both failure modes are expensive. The fix is to be precise about which job each box is doing, and to combine them deliberately. Here's the version we install on every Hexcore engagement.

The three boxes — what they actually are

Stripped of vendor marketing, the three components have distinct primary jobs:

Reverse proxy

The original concept. A reverse proxy sits in front of one or more backend servers and forwards client requests to them, returning the response. From the client's perspective, the proxy is the server.

Primary jobs: TLS termination, caching, compression, basic L7 routing (path/host-based), header manipulation. Often handles static-file serving and simple security headers.

Examples: Nginx, Caddy, HAProxy, Apache, Envoy.
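To make "basic L7 routing" concrete, here is a minimal Python sketch of host- and path-based upstream selection, the lookup a reverse proxy performs before forwarding a request. The route table, hostnames, and upstream addresses are hypothetical, and real proxies such as Nginx apply their own matching rules; this only illustrates the longest-prefix idea.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Route:
    host: str         # exact Host header to match ("*" matches any host)
    path_prefix: str  # request path must start with this prefix
    upstream: str     # hypothetical backend address


# Hypothetical route table, for illustration only.
ROUTES = [
    Route("example.test", "/", "http://127.0.0.1:3000"),
    Route("example.test", "/api/", "http://127.0.0.1:8080"),
    Route("*", "/healthz", "http://127.0.0.1:9000"),
]


def pick_upstream(host: str, path: str, routes=ROUTES) -> Optional[str]:
    """Return the upstream whose host matches and whose path prefix is longest."""
    candidates = [
        r for r in routes
        if r.host in (host, "*") and path.startswith(r.path_prefix)
    ]
    if not candidates:
        return None  # a real proxy would answer 404 or use a default server
    return max(candidates, key=lambda r: len(r.path_prefix)).upstream
```

With this table, a request for /api/users on example.test goes to the API backend, while anything else on that host falls through to the catch-all "/" route.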

Load balancer

A specialised reverse proxy whose primary job is distributing traffic across multiple identical backend instances. The conceptual unit it cares about is "the pool of servers that can serve this request."

Primary jobs: Distribution algorithm (round-robin, least-connections, weighted, hash-based), health checks, automatic ejection of unhealthy instances, connection draining during deploys. Can operate at L4 (TCP/UDP — fast, dumb) or L7 (HTTP — slower, smarter).

Examples: AWS ALB/NLB, GCP Cloud Load Balancing, Azure Front Door, HAProxy, F5 BIG-IP, Envoy as LB.
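The pool-centric view can be sketched in a few lines of Python. This is an illustrative model, not how any particular load balancer is implemented: the Backend and Pool classes, the failure threshold of three, and the readmit-on-first-success rule are all assumptions chosen to show round-robin, least-connections, and health-based ejection side by side.

```python
from dataclasses import dataclass, field


@dataclass
class Backend:
    address: str
    healthy: bool = True
    active_connections: int = 0
    consecutive_failures: int = 0


@dataclass
class Pool:
    backends: list
    failure_threshold: int = 3  # assumed value; real LBs make this configurable
    _cursor: int = field(default=0, repr=False)

    def healthy_backends(self):
        return [b for b in self.backends if b.healthy]

    def record_health_check(self, backend, ok):
        """Eject after repeated failures; readmit on the first success."""
        if ok:
            backend.consecutive_failures = 0
            backend.healthy = True
        else:
            backend.consecutive_failures += 1
            if backend.consecutive_failures >= self.failure_threshold:
                backend.healthy = False

    def round_robin(self):
        """Cycle through the currently healthy backends in order."""
        healthy = self.healthy_backends()
        if not healthy:
            raise RuntimeError("no healthy backends in pool")
        backend = healthy[self._cursor % len(healthy)]
        self._cursor += 1
        return backend

    def least_connections(self):
        """Pick the healthy backend with the fewest in-flight connections."""
        healthy = self.healthy_backends()
        if not healthy:
            raise RuntimeError("no healthy backends in pool")
        return min(healthy, key=lambda b: b.active_connections)
```

Both selection methods only ever see healthy backends, which is the point: ejection and distribution are one job, not two.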

API gateway

A reverse proxy with API-specific features layered on top. It treats the traffic as APIs being exposed to consumers, and adds cross-cutting concerns that those APIs need: authentication, rate limiting, key management, plan tiers, request transformation, schema validation, usage analytics.

Primary jobs: API authentication (OAuth, JWT, mTLS, API keys), per-consumer rate limiting and quotas, request/response transformation, API composition (fan-out and aggregation), developer portal, usage metering.

Examples: Kong, Apigee, AWS API Gateway, Tyk, KrakenD, Gravitee.
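Per-consumer rate limiting tied to plan tiers is the feature that most clearly separates a gateway from the other two boxes. The Python sketch below shows one common way to do it, a token bucket per API key. The plan names, limits, and keys are hypothetical, and products such as Kong or Apigee use their own algorithms and storage; treat this as an illustration of the idea only.

```python
import time
from dataclasses import dataclass


@dataclass(frozen=True)
class Plan:
    name: str
    requests_per_second: float  # steady refill rate
    burst: int                  # bucket capacity


# Hypothetical plan tiers and key-to-plan mapping.
PLANS = {"free": Plan("free", 1.0, 5), "pro": Plan("pro", 50.0, 100)}
API_KEYS = {"key-free-123": "free", "key-pro-456": "pro"}


class TokenBucket:
    def __init__(self, plan, now):
        self.plan = plan
        self.tokens = float(plan.burst)  # start with a full bucket
        self.updated_at = now

    def allow(self, now):
        """Refill for the elapsed time, then spend one token if available."""
        elapsed = max(0.0, now - self.updated_at)
        refill = elapsed * self.plan.requests_per_second
        self.tokens = min(float(self.plan.burst), self.tokens + refill)
        self.updated_at = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False


class Gateway:
    def __init__(self, clock=time.monotonic):
        self.clock = clock
        self.buckets = {}  # one bucket per API key

    def check(self, api_key):
        """Return an HTTP-style status: 401 unknown key, 429 over quota, 200 allowed."""
        plan_name = API_KEYS.get(api_key)
        if plan_name is None:
            return 401
        now = self.clock()
        if api_key not in self.buckets:
            self.buckets[api_key] = TokenBucket(PLANS[plan_name], now)
        return 200 if self.buckets[api_key].allow(now) else 429
```

The clock is injectable so the behaviour can be tested deterministically. In a multi-node gateway the buckets would have to live in shared storage, which this sketch leaves out.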

Where the overlap confuses everyone

These boxes overlap because all three sit at the edge and all three do some flavour of L7 HTTP routing. The honest taxonomy looks like this:

Capability	Reverse proxy	Load balancer	API gateway
TLS termination	Yes	Yes (L7)	Yes
Path / host routing	Yes	Yes (L7)	Yes
Health checks	Basic	Core	Yes
Load distribution	Basic	Core	Yes
Caching	Yes	Limited	Yes
Auth (JWT, OAuth, keys)	No (or DIY)	No	Core
Per-consumer rate limiting	No (or DIY)	No	Core
Request transformation	Limited	No	Core
API key / plan management	No	No	Core
Developer portal	No	No	Core

A modern L7 load balancer (AWS ALB, Envoy) can do almost everything a basic reverse proxy can do. An API gateway can do almost everything a load balancer can do — and usually with worse performance and higher cost per request.

The framing that actually helps: each box has a primary job it's optimised for. Use it for that job. Don't ask it to do another box's job unless you've thought hard about why.

The decision framework

Three questions, asked in this order. Each cuts the architecture down to size.

Question 1 — Are you exposing APIs to consumers who aren't you?

If the answer is yes (you're running a developer programme, a partner API, a multi-tenant SaaS where customers consume your API directly, or you sell API access as a product) — you need an API gateway. The features it offers around key management, plan tiers, rate limiting per consumer, and developer portal aren't optional; they're the product surface.

If the answer is no (your APIs are internal between your own services, or only called by your own frontend) — you almost certainly do not need an API gateway. Reach for a load balancer or reverse proxy instead.

This single question eliminates 60% of inappropriate API gateway deployments we encounter on audits.

Question 2 — Do you have multiple identical backend instances behind the same endpoint?

If yes — you need a load balancer. This is what they exist for. Health checks, traffic distribution, connection draining during deploys — none of these are reverse-proxy strengths.

If no (single backend, or the backends serve different purposes routed by path) — a reverse proxy is sufficient.

Question 3 — Do you need TLS termination, caching, header manipulation, or path-based routing for a small system?

If yes and you've answered "no" to questions 1 and 2 — a reverse proxy is enough. Pick Nginx, Caddy, or HAProxy. Configure it well. Move on.
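The three questions can be written down as a small Python function, which is a handy way to check that a proposed architecture follows from the answers. The System fields and component names are illustrative, and returning an empty list when every answer is "no" is an assumption of this sketch rather than something the framework spells out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class System:
    external_api_consumers: bool  # Question 1
    identical_instances: bool     # Question 2
    needs_edge_features: bool     # Question 3: TLS, caching, headers, path routing


def recommend(system: System) -> list:
    """Return the traffic-layer components the three questions call for, edge first."""
    components = []
    if system.external_api_consumers:
        components.append("api gateway")
    if system.identical_instances:
        components.append("load balancer")
    # Question 3 only decides the outcome when questions 1 and 2 were both "no".
    if not components and system.needs_edge_features:
        components.append("reverse proxy")
    return components
```

For example, recommend(System(False, True, True)) yields only a load balancer, and recommend(System(True, True, True)) yields a gateway in front of a load balancer.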

The combinations we ship

Real systems combine these components. Here are the four patterns we install most often at Hexcore, in order of system maturity.

Pattern 1 — The small system

Single VM or container, single app server, one or two domains. Used for early-stage products, internal tools, marketing sites with a CMS backend.

Client → Caddy → app server

Caddy or Nginx as a reverse proxy. Automatic HTTPS via Let's Encrypt (built into Caddy; Nginx needs an ACME client such as Certbot). No load balancer, no API gateway. One config file, fifty lines, done.

Trap to avoid: don't reach for AWS ALB just because "production should have a load balancer." If you have one backend, you don't need an LB. You need TLS termination, which the reverse proxy gives you.

Pattern 2 — The horizontally-scaled service

A handful of identical backend instances behind a single endpoint. Used for the typical SaaS workload.

Client → Cloud LB (L7) → app instances (n)

Cloud LB (AWS ALB, GCP HTTPS LB) handles TLS, path routing if needed, distribution, health checks. The LB is the reverse proxy and load balancer in one — modern cloud LBs do both jobs adequately.

Trap to avoid: don't add Nginx in front of an ALB "for caching." You'll add a hop and a failure mode, and most of the caching you want belongs in CloudFront or another CDN, not Nginx.

Pattern 3 — Microservices with internal callers only

Many services, called by your own frontend or by each other. No external API consumers.

Client (your frontend) → Cloud LB → ingress controller (Envoy/Nginx) → services
                                  ↘ service-mesh sidecars (east-west)

The LB handles north-south (client → cluster) ingress. The ingress controller does L7 routing inside the cluster. East-west traffic between services goes through a service mesh (Istio, Linkerd) which provides mTLS, retries, circuit breaking, and observability — at a per-call cost, not a per-API cost.

No API gateway here. The auth concerns are internal (mTLS via mesh) or handled by your own application code. A gateway would be a layer you're paying for and operating without using its API-specific features.

Pattern 4 — The API-product business

Microservices behind a developer-facing API surface. Used when third parties consume your API (partners, plan-tiered customers, an open developer programme).

External developers → CDN → API Gateway → Cloud LB → services
                                       ↘ Auth, rate limiting, key management, billing

The API gateway is the product surface. It handles OAuth/JWT/key validation, per-plan rate limits, request transformation for backward-compatibility, and feeds the developer portal and usage analytics. Behind it, the cloud LB distributes traffic to identical service instances; behind that, the services do their actual work.

Trap to avoid: pushing business logic into the gateway. Gateway plugins and transformations should be cross-cutting concerns (auth, rate limit, schema check) — not business rules. The moment your gateway is computing prices, you've recreated the monolith in a place that's expensive to debug.

The mistake patterns we see

After auditing hundreds of architectures, the same five mistakes show up:

Mistake 1 — API gateway as the only reverse proxy

Symptom: every internal service-to-service call goes through the API gateway, including health checks and admin endpoints.

Why it's bad: gateways are slow per call relative to L4 LBs or direct service mesh. They're priced and engineered for external traffic. Internal traffic should bypass them entirely.

Fix: keep north-south (external) on the gateway. Move east-west (internal) onto a service mesh or direct LB.
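One rough way to spot this symptom during an audit is to measure how much of the gateway's traffic originates inside your own network. The Python sketch below assumes you can extract source IPs from the gateway's access logs and that internal callers come from known private ranges. The CIDR shown is a placeholder, not a recommendation.

```python
import ipaddress

# Hypothetical internal ranges; substitute your own cluster or VPC CIDRs.
INTERNAL_NETWORKS = [ipaddress.ip_network("10.0.0.0/8")]


def is_internal(source_ip: str) -> bool:
    """True when the caller's address falls inside a known internal range."""
    addr = ipaddress.ip_address(source_ip)
    return any(addr in net for net in INTERNAL_NETWORKS)


def internal_share(source_ips) -> float:
    """Fraction of gateway requests that came from internal (east-west) callers."""
    source_ips = list(source_ips)
    if not source_ips:
        return 0.0
    internal = sum(1 for ip in source_ips if is_internal(ip))
    return internal / len(source_ips)
```

A share well above zero suggests east-west traffic is taking the external path and is a candidate for moving onto the mesh or a direct LB.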

Mistake 2 — Nginx as a half-built API gateway

Symptom: 4,000-line Nginx config with Lua scripts implementing JWT validation, per-client rate limits, and request transformation.

Why it's bad: Nginx is brilliant at being Nginx. It is mediocre at being Kong. The team that maintains the 4,000-line config will eventually leave, and the new team will rewrite it — usually as an actual API gateway, six months too late.

Fix: if you have more than two API-gateway features in your Nginx config, evaluate moving to a real gateway. The migration cost is usually less than the maintenance debt.

Mistake 3 — L4 load balancer for L7 needs

Symptom: NLB (L4) in front of services that need path-based routing or host-based routing. Routing logic ends up in the application.

Why it's bad: L4 is for raw TCP. The moment you need to route based on HTTP path, you need L7. The application is the wrong place to do routing: it couples deployments together and ties every service to knowing about every other service's existence.

Fix: ALB (L7) or an L7 ingress controller for HTTP routing. Reserve L4 for non-HTTP protocols or extreme-performance use cases.

Mistake 4 — Multiple reverse proxies in the request path

Symptom: CDN → Cloud LB → Nginx → app → Nginx (cache) → upstream API. Five hops.

Why it's bad: every hop adds latency, a failure point, and an observability gap. Many of these layers were added piecemeal "to solve X" without considering whether the existing layers could solve X.

Fix: audit the request path. Each layer must earn its place — there should be a specific named job that no other layer can do as well.
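The "each layer must earn its place" test can be approximated mechanically. The Python sketch below flags any layer whose jobs are all performed by other layers in the path. The job labels assigned to each layer are hypothetical, and the check only detects overlap; it cannot judge which layer does a job better, so treat the output as a list of candidates to review.

```python
def redundant_layers(path):
    """Return names of layers whose every job is also done by some other layer.

    `path` is an ordered list of (layer_name, set_of_jobs) pairs.
    A layer with no named jobs at all is flagged too.
    """
    flagged = []
    for i, (name, jobs) in enumerate(path):
        covered_elsewhere = set()
        for j, (_, other_jobs) in enumerate(path):
            if i != j:
                covered_elsewhere |= other_jobs
        if jobs <= covered_elsewhere:  # subset test: nothing unique to this layer
            flagged.append(name)
    return flagged


# Hypothetical job assignments for the proxy layers in the path above.
REQUEST_PATH = [
    ("cdn", {"edge caching", "tls"}),
    ("cloud lb", {"tls", "path routing", "distribution", "health checks"}),
    ("nginx", {"tls", "path routing"}),
    ("nginx cache", {"upstream caching"}),
]
```

With these assignments only the first Nginx layer is flagged, because TLS and path routing are already handled by the CDN and the cloud LB.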

Mistake 5 — Treating the gateway as a deployment unit

Symptom: every API change requires a gateway config deploy. Gateway deploys block service deploys.

Why it's bad: gateways become the bottleneck for shipping. You've recreated the monolith deployment problem at the edge.

Fix: gateway config should be declarative, versioned in code, deployed alongside the service it fronts — not in a separate operational track. Most modern gateways (Kong, Tyk, AWS API Gateway) support this; some teams choose not to use it.

Where this is going

A few quiet trends worth tracking:

Service meshes are absorbing east-west gateway features. mTLS, retries, circuit breakers, observability — once the exclusive domain of API gateways, now baseline service-mesh features. The mesh is becoming the right place for internal traffic concerns.
L7 cloud LBs are absorbing reverse-proxy features. AWS ALB now does WAF integration, built-in authentication (OIDC or Amazon Cognito), header manipulation, and more. For many workloads the cloud LB is sufficient, with no separate Nginx layer required.
API gateways are bifurcating. "Enterprise" gateways (Kong, Apigee) are leaning into the developer portal and monetisation features. Lightweight gateways (KrakenD, Envoy-based) are leaning into pure L7 routing with API features available but optional.
Wasm-based extensions are becoming the standard for custom logic. Envoy and Kong both support Wasm filters now. This is the right place for custom cross-cutting concerns, not Lua scripts or sidecar containers.

The honest summary

These three boxes get conflated because they overlap. The way to keep them apart in your head is to remember each one's primary job:

Reverse proxy — be the public face of one or a few backends. TLS, caching, routing.
Load balancer — distribute traffic across many identical backends. Health, distribution, draining.
API gateway — expose backends as APIs to external consumers. Auth, quotas, plans, portals.

If you can name which primary job each box in your architecture is doing, you probably have the right number of boxes. If you can't — or if a single box is doing two of these jobs and starting to creak — that's where the next architecture review should focus.

The traffic layer is one of the few parts of your system where over-engineering and under-engineering cost about the same amount of money. The teams who get it right are the ones who choose each box for one named job, and resist the temptation to do more with it than that.

If you're untangling a traffic layer that grew organically — too many proxies, an API gateway doing too much, or an LB doing too little — we run targeted architecture reviews that produce a prioritised remediation plan. Drop a line at hello@hexcore.ng, or read more blog posts on the kind of work we do.

Written by

Akeem Amusat

Founder & CEO

Strategy · Platform engineering

15 years across payments, telco, and platform engineering. Founded Hexcore to prove African engineering can ship at world-class standards.
