Skip to main content

Motivation

Historical context and motivation for Distributed Async Await

Rise of Concurrency and Distribution

The trajectory of software development over the past two decades reveals a fundamental shift: from sequential to concurrent systems, and from concurrent to distributed systems. This evolution was not only inevitable, but necessary. Today, the question is no longer if an application is concurrent or distributed, but to what degree.

This shift began in earnest in 2006, when AWS launched S3 and EC2, catalyzing the modern era of cloud computing. The launch of cloud primitives marked the beginning of the mass adoption of distributed computing. As cloud-native applications emerged as the dominant paradigm, so too did the complexity of their execution environments.

What began as short-lived computations on long-lived servers transitioned to long-lived computations on short-lived servers. The reliability and stability that developers could once assume—the idea that a process would run to completion—was replaced with the reality of near-certain interruptions. Continuity could no longer be presumed; fragmentation became the default.

New Execution Landscape

From the mid-2000s to the mid-2010s, applications operated on long-lived virtual machines or containers.

Short Running Executions on Long Runnning Resourcesshort-running-on-long-running

Computations were short and predictable, often completing within milliseconds. The rare failure was an anomaly, and developers could safely write code assuming successful, uninterrupted completion.

In the post-2010 landscape, however, that assumption collapsed. Compute resources became ephemeral. Virtual machines, containers, and even functions could come and go with little warning. Consider food delivery, ride-hailing, or vacation rentals—tasks that unfold over minutes, hours, or even days. Executions became long-lived, but the infrastructure became short-lived, flipping the model.

Long Running Executions on Short Runnning Resourceslong-running-on-short-running

This shift created an asymmetry: executions now outlived their hosts. To cope, systems needed to coordinate across both space (distribution across machines) and time (distribution across lifetimes).

Programming Models for Concurrency

The initial response to growing complexity was the development of concurrent programming models. Threads offered a low-level way to express concurrency, but introduced unpredictability and complexity. The introduction of async await marked a pivotal advancement: an intuitive, procedure-based syntax that abstracted away the mechanics of concurrency while preserving clarity.

Async await replaced opaque callback chains with clean, readable logic. Rather than writing event-driven code that manually tracked state and control flow, developers could write code that looked sequential but executed asynchronously.

Invoke and Await
function foo() {

//...

// 1
promise = async bar();

//...

// 2
let value = await promise;

//...

return v;

}

function bar() {
//...
return v;
}

Invocation

An async function execution, the caller, invokes an async function and immediately receives a promise instance representing the callee. On invocation (1), (the execution of) foo triggers (the execution of) bar, at which point (the execution of) foo and (the execution of) bar execute concurrently. The invocation returns a promise instance that represents (the execution of) bar.

Await

An async function execution, the caller, awaits a promise instance and eventually receives a value, the return value of the callee. On await (2), (the execution of) foo suspends its execution until (the execution of) bar terminates and completes its associated promise instance. On termination of (the execution of) bar and the completion of its associated promise instance, (the execution of) foo resumes its execution.

But async await, despite its elegance, stopped at the boundary of the process. It was concurrency-focused. It had no concept of remote invocation, no facility for distribution, and no notion of recovery across process or machine failures. As systems became distributed by necessity, async await revealed its limits.

Distribution-Aware Abstractions

Distributed systems are defined by two core properties: partial order and partial failure. In concurrent systems, operations may interleave in unpredictable ways. In distributed systems, operations may also fail in unpredictable ways.

Partial Order and Partial Failure

Consider a simple example, two processes P and Q, each consisting of just three steps.

P = a • b • c

Q = x • y • z

(P | Q | ⚡️) ≡ (P • Q) ∨ (Q • P)

Partial Order and Partial Failurepartial-order-partial-failure

The concurrent composition of 2 processes consisting of 3 steps leads to 20 distinct possible executions. The distributed composition of 2 processes consisting of 3 steps leads to 69 distinct possible executions.

To mitigate partial order, we need coordination. To mitigate partial failure, we need recovery.

Traditional programming models did not address these challenges. HTTP calls and task queues became the ad hoc glue for distributed interactions. These mechanisms, however, introduced new problems:

  • HTTP tightly coupled caller and callee.
  • Task queues loosely coupled execution, but sacrificed visibility, debuggability, and value propagation.

Control flow across machines became fragmented. Distributed executions lost identity. Developers were burdened with choreography, orchestration, compensation, retries, and outbox patterns. Somewhere in the middle, actual application logic was written.

Durable Executions and Logical Continuity

Enter the concept of durable executions. At its core, durable execution means the ability to resume an execution after a failure or voluntary suspension. This doesn’t require magic. If an execution is composed of idempotent steps and maintains externalized state, it can be restarted from the beginning, and will converge to the same result.

This idea has existed in practice through tools like Apache Hadoop, Temporal, AWS Step Functions, and more. But many of these frameworks rely on proprietary abstractions and centralized orchestrators that make composition difficult and developer experience burdensome.

Distributed Async Await: A Formal Model

Distributed Async Await extends the async await model by elevating distribution to a first-class concern. It introduces durable, globally-addressable promises and executions. These promises are not tied to a process; they are tied to an identity and stored durably.

Like traditional async await, there are two primary operations:

  • Invoke: Create a new execution, identified by a durable promise.
  • Await: Suspend until the promise resolves or rejects.

But unlike traditional async await, Distributed Async Await differentiates between:

  • Local Invocation: Execution is scheduled locally, within the current process.
  • Remote Invocation: Execution is scheduled on the global event loop, potentially on a different process or machine.

This design introduces two foundational protocols:

  1. The Distributed Event Loop Protocol ensures coordination across partial order. It implements a consistent mechanism for invoke and resume messages.
  2. The Distributed Task Protocol ensures recovery from partial failure. It introduces heartbeats, task claiming, and message redelivery in case of process crashes.

Unified Semantics and Developer Experience

Distributed Async Await preserves the developer ergonomics of async await while introducing clear, formalized semantics for distribution. By using functions and promises as composable primitives, it provides a unified model for computation and coordination.

Developers gain the ability to:

  • Express remote invocations using the same structure as local calls.
  • Maintain a coherent, traceable call graph across machines.
  • Propagate results and errors through durable promises.
  • Recover executions transparently after failure.

In doing so, Distributed Async Await bridges the gap between the promise of distributed systems—scalability, fault tolerance—and the reality of building them.

Conclusion

Software engineering is shaped by its abstractions. Good abstractions preserve desired properties while offering a delightful developer experience. The shift from sequential to concurrent to distributed programming has exposed the need for better abstractions that don't just mask complexity but embrace and structure it.

Distributed Async Await represents a new foundation. It captures the operational realities of cloud-native systems—distribution across space and time—while offering a programming model grounded in clarity and composability. By doing so, it transforms the fragmented and ad hoc into the structured and coherent.

Not by hiding the cloud, but by making it programmable.