Designing Scalable Backend Systems in Go

"Scalable" gets thrown around as if it were a property you sprinkle on at the end. In practice it is the result of a few decisions made early: where the boundaries are, how work flows between them, and how you use the concurrency the language gives you. Go makes the last part deceptively easy, which is exactly why it deserves care.

Concurrency is a tool, not a goal

Goroutines are cheap, but unbounded goroutines are a great way to exhaust memory or hammer a downstream dependency. When fanning out work, bound it:

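import (
    "context"
    "runtime"

    "golang.org/x/sync/errgroup"
)

// screenAll screens every application with at most GOMAXPROCS calls in flight.
// The first error cancels ctx for the remaining calls and is returned by Wait.
func screenAll(ctx context.Context, apps []Application) error {
    g, ctx := errgroup.WithContext(ctx)
    g.SetLimit(runtime.GOMAXPROCS(0)) // g.Go blocks once this many goroutines are running

    for _, app := range apps {
        app := app // per-iteration copy; only needed before Go 1.22
        g.Go(func() error {
            return screen(ctx, app)
        })
    }
    return g.Wait()
}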

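errgroup with a limit gives you parallelism, error propagation, and context cancellation in a few lines. Turning a slow sequential screening loop into a bounded parallel one is often the difference between multi-second and sub-second response times. GOMAXPROCS is a sensible cap for CPU-bound work; for I/O-bound calls, size the limit to what the downstream dependency can absorb.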
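To make the mechanics less magical, here is a minimal standard-library sketch of roughly what a limited group does: a buffered channel acts as a counting semaphore, and the first error cancels the rest. The names forEachBounded and peakConcurrency are hypothetical, made up for this illustration, and the sketch is not a replacement for errgroup.

```go
package main

import (
	"context"
	"sync"
	"sync/atomic"
	"time"
)

// forEachBounded (hypothetical helper) runs fn for every item with at most
// limit calls in flight. The first error cancels the shared context and is
// returned once all started calls have finished.
func forEachBounded(ctx context.Context, items []int, limit int, fn func(context.Context, int) error) error {
	ctx, cancel := context.WithCancel(ctx)
	defer cancel()

	sem := make(chan struct{}, limit) // buffered channel used as a counting semaphore
	var (
		wg       sync.WaitGroup
		once     sync.Once
		firstErr error
	)
loop:
	for _, item := range items {
		select {
		case sem <- struct{}{}: // acquire a slot; blocks while limit calls are running
		case <-ctx.Done():
			break loop // stop launching new work after cancellation
		}
		wg.Add(1)
		go func(item int) {
			defer wg.Done()
			defer func() { <-sem }() // release the slot
			if err := fn(ctx, item); err != nil {
				once.Do(func() { firstErr = err; cancel() })
			}
		}(item)
	}
	wg.Wait()
	if firstErr != nil {
		return firstErr
	}
	return ctx.Err()
}

// peakConcurrency runs 20 short tasks under the given limit and reports the
// highest number that were ever running at the same time.
func peakConcurrency(limit int) (int64, error) {
	var running, peak atomic.Int64
	err := forEachBounded(context.Background(), make([]int, 20), limit, func(ctx context.Context, _ int) error {
		n := running.Add(1)
		defer running.Add(-1)
		for { // record a new peak if this call raised it
			p := peak.Load()
			if n <= p || peak.CompareAndSwap(p, n) {
				break
			}
		}
		time.Sleep(5 * time.Millisecond) // stand-in for real work
		return nil
	})
	return peak.Load(), err
}
```

Calling peakConcurrency(4) should never report a peak above 4, however many tasks are queued. That ceiling is the property that protects memory and downstream services.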

Always respect the context

Every blocking call in a request path should take a context.Context and honor its cancellation. A request that has already been abandoned by the client should not keep holding a database connection:

row := db.QueryRowContext(ctx, query, args...)

This sounds obvious, but a single time.Sleep or context-less call buried in a hot path can pin resources long after the caller has given up.
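As a small sketch of the alternative to a bare time.Sleep, the pause below ends as soon as the context is cancelled. sleepCtx and abandonedWait are hypothetical names for this example, and the durations are arbitrary.

```go
package main

import (
	"context"
	"time"
)

// sleepCtx (hypothetical helper) waits for d or until ctx is cancelled,
// whichever comes first. It returns ctx.Err() when the wait was cut short.
func sleepCtx(ctx context.Context, d time.Duration) error {
	t := time.NewTimer(d)
	defer t.Stop() // release the timer if we return early
	select {
	case <-t.C:
		return nil
	case <-ctx.Done():
		return ctx.Err()
	}
}

// abandonedWait simulates a caller that gives up after 10ms while the code
// wants to pause for a full minute, and reports how the wait ended.
func abandonedWait() (time.Duration, error) {
	ctx, cancel := context.WithTimeout(context.Background(), 10*time.Millisecond)
	defer cancel()

	start := time.Now()
	err := sleepCtx(ctx, time.Minute)
	return time.Since(start), err
}
```

With time.Sleep(time.Minute) in its place, the goroutine and anything it holds would stay pinned for the full minute. Here the wait returns with the context's error shortly after the deadline passes.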

Choose messaging deliberately

Synchronous calls couple the availability of two services. For work that does not need an immediate answer, asynchronous messaging decouples them and absorbs spikes. The two tools I reach for serve different needs:

NATS for low-latency, fire-and-forget signalling and request/reply between services.
Kafka for durable, replayable event streams where consumers may join late or reprocess history.

Picking the wrong one is expensive: Kafka's durability is overkill for a transient notification, and NATS core's at-most-once delivery is wrong for an event you must never lose.
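The difference in delivery semantics is easiest to see with a late-joining consumer. The toy below is an in-memory illustration only: transientBus and eventLog are hypothetical types invented for this sketch, they use neither the NATS nor the Kafka client API, and they are not safe for concurrent use.

```go
package main

// transientBus mimics at-most-once, fire-and-forget delivery: a message
// reaches only the subscribers present at publish time and is then gone.
type transientBus struct {
	subs []func(string)
}

func (b *transientBus) Subscribe(fn func(string)) { b.subs = append(b.subs, fn) }

func (b *transientBus) Publish(msg string) {
	for _, fn := range b.subs {
		fn(msg)
	}
}

// eventLog mimics a durable, replayable stream: messages are appended and
// kept, and each consumer chooses the offset it reads from.
type eventLog struct {
	entries []string
}

func (l *eventLog) Append(msg string) { l.entries = append(l.entries, msg) }

func (l *eventLog) ReadFrom(offset int) []string {
	if offset < 0 || offset >= len(l.entries) {
		return nil
	}
	return append([]string(nil), l.entries[offset:]...) // copy so callers cannot mutate the log
}

// lateJoiner publishes two messages before a consumer exists and one after,
// then reports how many messages that consumer sees under each model.
func lateJoiner() (viaBus, viaLog int) {
	bus, stream := &transientBus{}, &eventLog{}

	for _, m := range []string{"a", "b"} {
		bus.Publish(m) // nobody is listening yet, so these are dropped
		stream.Append(m)
	}

	bus.Subscribe(func(string) { viaBus++ })
	bus.Publish("c")
	stream.Append("c")

	viaLog = len(stream.ReadFrom(0)) // replay from the beginning
	return viaBus, viaLog
}
```

The late consumer sees one message through the bus and all three through the log. If losing "a" and "b" would be a bug, the work belongs on a durable stream.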

Keep boundaries thin and explicit

Microservices are a deployment choice, not an architecture. The architecture is in the interfaces. A few habits keep them maintainable:

Own your data — one service, one database, no shared tables.
Make the contract explicit with gRPC or a versioned REST schema.
Treat every remote call as something that can fail, time out, or be slow.

A monolith with clean internal boundaries is far healthier than a distributed system with leaky ones.
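One way to apply the habit of treating every remote call as fallible is to put the deadline in a thin wrapper around the contract, so no caller can forget it. This is a sketch under assumptions: ProfileStore, withTimeout, slowStore, and the 20ms budget are all hypothetical and stand in for whatever interface and limits your services actually define.

```go
package main

import (
	"context"
	"time"
)

// ProfileStore is a hypothetical contract between two services. Callers
// depend on this interface, not on how or where it is implemented.
type ProfileStore interface {
	Lookup(ctx context.Context, id string) (string, error)
}

// withTimeout wraps any ProfileStore so that every call carries a deadline.
type withTimeout struct {
	next    ProfileStore
	timeout time.Duration
}

func (w withTimeout) Lookup(ctx context.Context, id string) (string, error) {
	ctx, cancel := context.WithTimeout(ctx, w.timeout)
	defer cancel()
	return w.next.Lookup(ctx, id)
}

// slowStore stands in for a remote dependency that takes delay to answer
// but honors cancellation.
type slowStore struct{ delay time.Duration }

func (s slowStore) Lookup(ctx context.Context, id string) (string, error) {
	t := time.NewTimer(s.delay)
	defer t.Stop()
	select {
	case <-t.C:
		return "profile:" + id, nil
	case <-ctx.Done():
		return "", ctx.Err()
	}
}

// lookupWithBudget calls a store that needs delayMs milliseconds to respond,
// through a wrapper that allows it 20ms.
func lookupWithBudget(delayMs int) (string, error) {
	store := withTimeout{
		next:    slowStore{delay: time.Duration(delayMs) * time.Millisecond},
		timeout: 20 * time.Millisecond,
	}
	return store.Lookup(context.Background(), "42")
}
```

A fast dependency returns its value unchanged, while a slow one surfaces as a deadline error the caller has to handle. The same wrapper shape works whether the implementation behind the interface is a gRPC client or a package in the same binary.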

Design for the failure, not the happy path

Scaling is mostly about behaving well when something is overloaded or down. Timeouts, retries with backoff and jitter, circuit breakers, and idempotent handlers are not premature optimization — they are the parts of the system that decide whether a single slow dependency degrades gracefully or takes everything down with it.
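Retries with backoff and jitter take only a few lines to sketch. The version below assumes exponential backoff with full jitter (a uniformly random wait between zero and the current backoff) and requires Go 1.20 or later for errors.Join. The names retry and flakyCall and all the durations are hypothetical.

```go
package main

import (
	"context"
	"errors"
	"math/rand"
	"time"
)

// retry (hypothetical helper) calls op up to attempts times, sleeping between
// failures with exponential backoff and full jitter. It stops early if ctx is
// cancelled. Only use it around operations that are safe to repeat.
func retry(ctx context.Context, attempts int, base, maxWait time.Duration, op func(context.Context) error) error {
	var err error
	for i := 0; i < attempts; i++ {
		if err = op(ctx); err == nil {
			return nil
		}
		if i == attempts-1 {
			break // no point sleeping after the final attempt
		}

		backoff := base << i // base * 2^i
		if backoff <= 0 || backoff > maxWait {
			backoff = maxWait // cap the wait and guard against shift overflow
		}
		// Full jitter: pick a uniform wait in [0, backoff] so that many
		// clients retrying at once do not synchronize.
		wait := time.Duration(rand.Int63n(int64(backoff) + 1))

		t := time.NewTimer(wait)
		select {
		case <-t.C:
		case <-ctx.Done():
			t.Stop()
			return errors.Join(ctx.Err(), err)
		}
	}
	return err
}

// flakyCall fails twice with a made-up transient error, then succeeds, and
// reports how many attempts were needed.
func flakyCall() (int, error) {
	calls := 0
	err := retry(context.Background(), 5, time.Millisecond, 10*time.Millisecond, func(context.Context) error {
		calls++
		if calls < 3 {
			return errors.New("transient failure")
		}
		return nil
	})
	return calls, err
}
```

Here flakyCall succeeds on the third attempt. The jitter is what keeps a fleet of clients from retrying in lockstep against a dependency that is already struggling, and the handler on the other side still needs to be idempotent for any of this to be safe.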

Get the boundaries and the failure modes right, and "scaling" becomes mostly a matter of adding replicas.

Arham Abiyan — Backend / Software Engineer
