26 August 2025

Promise.all isn't batching

When parallelising N small queries looks like a fix and is really just self-DoS at a slightly higher concurrency level.

[Image: a single orange filament travels left-to-right across a dark backdrop, then fractures at a point into hundreds of fine divergent threads.]

The shape of the mistake is so common it’s almost a reflex. You’ve got an array of IDs and you need data for each one. Your repository’s getThing method takes a single ID. The obvious move:

const things = await Promise.all(
  ids.map((id) => this.repo.getThing(id))
);

Done. Parallel. Fast. Ship it.

Except Promise.all isn’t batching. It’s parallel execution of separate operations. The two look identical from up close — both turn “do N things” into something that finishes faster than doing them serially — and they’re nothing like each other in cost.

What you actually built

That snippet, for an array of 500 IDs, just fired 500 individual SQL queries at your database from one application instance. They’re concurrent rather than sequential, which means total wall-clock time is closer to “the slowest one” than “the sum of all of them”. That’s the part that feels like a win.

The part that doesn’t feel like a win is everything else.

The database sees up to 500 concurrent connections (or your queries queue against the pool if its limit is lower), parses and plans 500 statements, and runs 500 lookups for what could have been one query with a WHERE id IN (...) clause. Each query carries the same per-statement overhead as if it were the only thing happening: parsing, planning, locking, the round trip itself. The variable cost is small, a single index lookup. The fixed cost is paid 500 times, and for lookups this cheap it is usually most of the bill.

The same pattern against an internal HTTP service is worse. Every “request” is now N requests, each with its own auth check, its own log line, its own trace, and, unless connections are kept alive and reused, its own TCP handshake. If the downstream service runs anywhere near capacity, you’ve just degraded your own dependency in a way that looks to them like a small denial-of-service attack from a system that’s supposed to be a friendly neighbour. They start failing, you start retrying, things get exciting.

The actual answer

Almost always: batch downstream.

If the data source is a database, write the repo method to take an array and use WHERE id IN (...). One round-trip, one query plan, one index scan touching N rows. The previous post argued for this as a default, so I won’t relitigate it.
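A minimal sketch of what that repo method might look like. Everything named here is an assumption for illustration: the Thing shape, the Db interface (a stand-in for whatever SQL client you use), the things table, and the Postgres-style $1, $2 placeholders.

```typescript
// Hypothetical types for illustration; not taken from any real library.
interface Thing {
  id: number;
  name: string;
}

// Stand-in for a SQL client: a parameterised statement plus its values.
interface Db {
  query(sql: string, params: unknown[]): Promise<Thing[]>;
}

class ThingRepo {
  private readonly db: Db;

  constructor(db: Db) {
    this.db = db;
  }

  // One statement for the whole array instead of one statement per ID.
  async getThings(ids: number[]): Promise<Thing[]> {
    // An empty IN () list is a syntax error in most SQL dialects, so skip the query.
    if (ids.length === 0) return [];

    const unique = [...new Set(ids)];
    // Builds "$1, $2, ..." (Postgres-style placeholders; an assumption).
    const placeholders = unique.map((_, i) => `$${i + 1}`).join(", ");
    const rows = await this.db.query(
      `SELECT id, name FROM things WHERE id IN (${placeholders})`,
      unique
    );

    // IN (...) does not guarantee row order, so re-align rows with the input IDs.
    const byId = new Map<number, Thing>(
      rows.map((row): [number, Thing] => [row.id, row])
    );
    // IDs with no matching row are dropped from the result.
    return ids.flatMap((id) => {
      const row = byId.get(id);
      return row ? [row] : [];
    });
  }
}
```

The call site shrinks to a single await on getThings(ids). Whether missing IDs should be dropped, returned as null, or treated as an error is a design choice this sketch makes arbitrarily.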

If the data source is an HTTP service, ask whether it has a bulk endpoint. If it does, use it. If it doesn’t and you own the service, add one. If you don’t own the service and it doesn’t have one, then Promise.all with a concurrency limit is a defensible fallback — but call it what it is, a workaround, not the design.
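For that fallback, here is a rough sketch of a concurrency cap. mapWithLimit is a name I've made up, not a library function, and the right limit depends entirely on what the downstream service can tolerate.

```typescript
// Hypothetical helper: run `fn` over `items` with at most `limit` calls in flight.
async function mapWithLimit<T, R>(
  items: readonly T[],
  limit: number,
  fn: (item: T) => Promise<R>
): Promise<R[]> {
  const results: R[] = new Array(items.length);
  let next = 0; // index of the next unclaimed item

  // Each worker claims the next index, awaits that call, then repeats.
  async function worker(): Promise<void> {
    while (next < items.length) {
      const index = next++;
      results[index] = await fn(items[index]);
    }
  }

  // Never start more workers than there are items; always start at least one.
  const workerCount = Math.max(1, Math.min(limit, items.length));
  await Promise.all(Array.from({ length: workerCount }, () => worker()));
  return results; // same order as `items`
}
```

Something like mapWithLimit(ids, 8, (id) => client.getThing(id)) still makes N requests; it only stops them from all landing at once. If one call rejects, the returned promise rejects, and calls already in flight are not cancelled.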

The middle case — where the data crosses several service calls per item and there’s no single endpoint that takes bulk input — is where dataloader earns its keep. Each individual call becomes a loader.load(key), the loader batches across every call in the same tick of the event loop, and one bulk fetch goes downstream. Same call sites in your code, one round-trip on the wire.
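To make the same-tick batching concrete, here is a hand-rolled stand-in for the idea. It is not the dataloader library, which does more (per-key caching, for one); TinyLoader and its batch function are names invented for this sketch, and it only batches calls made synchronously together.

```typescript
// Minimal illustration of tick-level batching; NOT the dataloader library itself.
// The batch function must return values in the same order as the keys it received.
type BatchFn<K, V> = (keys: K[]) => Promise<V[]>;

interface Pending<K, V> {
  key: K;
  resolve: (value: V) => void;
  reject: (error: unknown) => void;
}

class TinyLoader<K, V> {
  private readonly batchFn: BatchFn<K, V>;
  private queue: Pending<K, V>[] = [];

  constructor(batchFn: BatchFn<K, V>) {
    this.batchFn = batchFn;
  }

  load(key: K): Promise<V> {
    return new Promise<V>((resolve, reject) => {
      // The first call schedules one flush on the microtask queue;
      // later synchronous calls just join the pending list.
      if (this.queue.length === 0) {
        void Promise.resolve().then(() => this.flush());
      }
      this.queue.push({ key, resolve, reject });
    });
  }

  // Sends every queued key downstream in a single bulk call.
  private async flush(): Promise<void> {
    const pending = this.queue;
    this.queue = [];
    try {
      const values = await this.batchFn(pending.map((p) => p.key));
      pending.forEach((p, i) => p.resolve(values[i]));
    } catch (error) {
      // A failed bulk call fails every caller that was part of the batch.
      pending.forEach((p) => p.reject(error));
    }
  }
}
```

The call sites keep their per-item shape, so ids.map((id) => loader.load(id)) wrapped in Promise.all produces one call to the batch function rather than one per ID.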

The mental shift is: parallelism is a property of your runtime; batching is a property of the workload. Parallelism makes your code wait less. Batching makes the system do less. Only one of those scales.

The message-queue variant

A related mistake wears different clothes. You’ve got the same array of IDs, and instead of parallelising the work in-process, you publish N messages to a queue and let some worker pool eat them one at a time.

This gets sold as “scaling horizontally”. It’s the same fixed-cost problem in a fancier coat, with the added bonus of eventual consistency, distributed tracing pain, and a much longer tail when the queue is already busy. The actual workload — processing N items individually — hasn’t shrunk. It’s just been smeared across more machines.

// Looks like scale. Isn't.
for (const item of items) {
  await publisher.publish(new Message({ itemId: item.id }));
}

If the per-item work is genuinely independent, has to be handled one item at a time anyway, and is slow enough to want isolation between items (a webhook fan-out, a long-running per-row computation), queues are fine. If the per-item work could have been one bulk operation against one downstream system, the queue is laundering the problem rather than solving it.
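If the queue has to stay for other reasons, one way to keep the bulk shape is to publish a message per chunk of IDs rather than per item. This is a sketch under assumptions: the BatchPublisher interface and the itemIds payload are hypothetical, since the snippet above doesn't say what publisher and Message actually are.

```typescript
// Hypothetical helper: split `items` into arrays of at most `size` elements.
function chunk<T>(items: readonly T[], size: number): T[][] {
  if (!Number.isInteger(size) || size < 1) {
    throw new RangeError("size must be a positive integer");
  }
  const out: T[][] = [];
  for (let i = 0; i < items.length; i += size) {
    out.push(items.slice(i, i + size));
  }
  return out;
}

// Hypothetical publisher shape, assumed for this sketch only.
interface BatchPublisher {
  publish(message: { itemIds: number[] }): Promise<void>;
}

// Publishes one message per chunk and returns how many messages were sent.
async function publishInBatches(
  publisher: BatchPublisher,
  ids: readonly number[],
  batchSize: number
): Promise<number> {
  const batches = chunk(ids, batchSize);
  for (const itemIds of batches) {
    await publisher.publish({ itemIds });
  }
  return batches.length;
}
```

The worker on the other end then gets an array and can make one bulk call downstream per message, instead of one call per item.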

When Promise.all is right

To be clear: there’s a real use for it. Several unrelated async operations that need to all complete:

const [user, settings, notifications] = await Promise.all([
  loadUser(id),
  loadSettings(id),
  loadNotifications(id),
]);

Three different data fetches, three different downstream paths, no batchable structure between them. Promise.all is exactly the right tool — concurrency with a clean rendezvous. The trap is reaching for the same tool when the operations are identical-shaped calls to the same backend.

The test I run in my head: if I replaced Promise.all with a for loop, would the downstream system still be doing the same total work, just slower? If yes, I haven’t fixed anything — I’ve just made the bill arrive faster.
