declarative workflow definition via Elixir DSL macros
Defines complex multi-step workflows using Elixir macros (workflow, step, branch, parallel, foreach) that compile to an AST-based execution plan persisted in PostgreSQL. The DSL abstracts control flow, state management, and resumability into composable building blocks, eliminating boilerplate for long-running processes. Workflows are defined as pure Elixir code with compile-time validation of step dependencies and control flow structure.
Unique: Uses Elixir's compile-time macro system to transform workflow definitions into persistent execution plans, so control flow structure and step dependencies are validated statically, before any workflow runs. Unlike Temporal or Cadence, which run workflows through SDKs that talk to a separate server cluster, Durable embeds orchestration directly in the Elixir application, with full access to the language's pattern matching and functional composition.
vs alternatives: Tighter integration with Elixir's macro system and pattern matching than Oban (which treats workflows as job sequences), and simpler deployment than Temporal (no separate server required; it uses the existing PostgreSQL database).
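As a rough illustration of how macros can collect steps and validate them at compile time, here is a minimal sketch. The MiniWorkflow module, its step macro, and the __plan__ function are hypothetical names invented for this example; they are not Durable's actual macros, and the sketch omits persistence, branching, and parallelism.

```elixir
defmodule MiniWorkflow do
  # Hypothetical, simplified DSL for illustration; these are not Durable's macros.

  defmacro __using__(_opts) do
    quote do
      import MiniWorkflow, only: [step: 2]
      # Collect step names at compile time, one entry per `step` call.
      Module.register_attribute(__MODULE__, :mini_steps, accumulate: true)
      @before_compile MiniWorkflow
    end
  end

  # Records the step name and defines a run_step/2 clause for it.
  # The step body can read the current context through the `ctx` variable.
  defmacro step(name, do: block) do
    quote do
      @mini_steps unquote(name)
      def run_step(unquote(name), var!(ctx)) do
        _ = var!(ctx)
        unquote(block)
      end
    end
  end

  # Runs once per module after all steps are known: validate, then emit the plan.
  defmacro __before_compile__(env) do
    steps = env.module |> Module.get_attribute(:mini_steps) |> Enum.reverse()

    if steps != Enum.uniq(steps) do
      raise CompileError, description: "duplicate step names in #{inspect(env.module)}"
    end

    quote do
      def __plan__, do: unquote(steps)
    end
  end
end

defmodule SignupFlow do
  use MiniWorkflow

  step :validate do
    Map.put(ctx, :valid, true)
  end

  step :welcome do
    Map.put(ctx, :welcomed, true)
  end
end
```

In this sketch, SignupFlow.__plan__() returns the step names in definition order, and a duplicated step name fails the build instead of surfacing at runtime.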
PostgreSQL-backed durable state persistence with automatic resumability
Persists complete workflow execution state (step results, context, execution history) to PostgreSQL after each step completes, enabling workflows to resume from the exact point of interruption after crashes, restarts, or arbitrary delays. Uses Ecto schemas (WorkflowExecution, StepExecution) to model workflow state as relational data with transactional consistency guarantees. Resumability is automatic—the execution engine queries persisted state and continues from the last completed step without explicit checkpointing logic.
Unique: Implements durability as a first-class concern via Ecto schemas with automatic transactional persistence after each step, rather than as an optional feature bolted onto a job queue. The execution engine treats the database as the source of truth for workflow state, enabling seamless multi-instance deployments and arbitrary pause/resume cycles without resource leaks.
vs alternatives: More transparent than Oban (whose job table records individual jobs rather than workflow-level state) and simpler than Temporal (which keeps event history in a separate server with its own persistence store). Relies on PostgreSQL's ACID guarantees directly rather than implementing custom consensus protocols.
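The resume behavior described above can be sketched in a few lines. This is a simplified model under stated assumptions: ResumeSketch is a hypothetical module, the completed map stands in for persisted StepExecution rows, and the persist function stands in for the database write. It is not Durable's implementation.

```elixir
defmodule ResumeSketch do
  # Hypothetical stand-in for the engine's resume loop; not Durable's implementation.
  # steps:     ordered list of {name, fun}, where fun receives the results so far
  # completed: map of step name => result (stand-in for persisted StepExecution rows)
  # persist:   2-arity function called after each newly executed step
  def run(steps, completed, persist) do
    Enum.reduce(steps, completed, fn {name, fun}, results ->
      if Map.has_key?(results, name) do
        # Already completed before the interruption: reuse the stored result.
        results
      else
        result = fun.(results)
        persist.(name, result)
        Map.put(results, name, result)
      end
    end)
  end
end
```

Calling ResumeSketch.run/3 with a non-empty completed map skips the steps already recorded there and executes only the remainder, which is the essence of resuming from the last completed step.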
multi-instance deployment with distributed concurrency control
Supports deploying Durable across multiple application instances with automatic concurrency control via database-level locking. When multiple instances attempt to execute the same workflow, the execution engine uses PostgreSQL row-level locks to ensure only one instance executes a given workflow step at a time. This enables horizontal scaling without a central coordinator. The execution engine polls for available work (steps ready to execute) and acquires locks before execution, ensuring distributed safety.
Unique: Implements distributed concurrency control via PostgreSQL row-level locks rather than a separate coordination service, enabling multi-instance deployment without additional infrastructure. Lock acquisition is transparent to workflow logic, and the execution engine automatically handles lock timeouts and retries.
vs alternatives: Simpler than Temporal's multi-worker deployment (which requires a separate server) and more transparent than manual distributed locking in step logic. Leverages PostgreSQL's built-in locking mechanisms rather than implementing custom consensus.
workflow execution observability via log capture and state querying
Provides comprehensive observability into workflow execution via two mechanisms: (1) automatic log capture that records all step execution logs to the database, and (2) queryable workflow state that enables inspection of execution history, step results, and context at any point in time. Logs are captured from Elixir's Logger and associated with specific step executions. Workflow state can be queried via Ecto queries or API endpoints, enabling real-time monitoring and debugging of running workflows.
Unique: Integrates logging and state querying directly into the workflow engine via PostgreSQL, enabling unified observability without external logging infrastructure. Logs are associated with specific step executions and queryable alongside execution state, providing rich context for debugging and monitoring.
vs alternatives: More integrated than external logging systems (which require separate configuration) and simpler than Temporal's event history (which requires custom event emission). Log capture is automatic and transparent to workflow logic.
pluggable queue and message bus adapters for custom integration
Provides extensible queue and message bus adapter interfaces, enabling custom implementations for step execution scheduling and event delivery. The default implementation uses PostgreSQL polling, but adapters can implement push-based scheduling (e.g., via RabbitMQ, Kafka) or custom event delivery mechanisms. Adapters implement a standard interface (enqueue, dequeue, publish, subscribe) and are plugged into the Durable supervision tree via configuration. This enables integration with existing message infrastructure without modifying core workflow logic.
Unique: Provides pluggable adapter interfaces for queue and message bus implementations, enabling custom integration without modifying core workflow logic. Adapters are configured via Elixir configuration and plugged into the supervision tree, enabling runtime selection of queue strategy.
vs alternatives: More flexible than Oban (which schedules jobs through database-backed engines) and simpler than Temporal (which requires separate worker services). The adapter interface is minimal and easy to implement for custom use cases.
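One plausible shape for the queue half of such an adapter is an Elixir behaviour plus an implementing module. The sketch below is illustrative only: QueueAdapterSketch, InMemoryQueue, and the callback signatures are assumptions made for this example, not Durable's actual adapter contract, and the publish/subscribe half is omitted.

```elixir
defmodule QueueAdapterSketch do
  # Hypothetical behaviour; the callback names follow the listing, the signatures are assumed.
  @callback enqueue(queue :: term(), job :: term()) :: :ok
  @callback dequeue(queue :: term()) :: {:ok, term()} | :empty
end

defmodule InMemoryQueue do
  # Toy adapter that keeps FIFO queues in an Agent instead of PostgreSQL or a broker.
  @behaviour QueueAdapterSketch
  use Agent

  def start_link(_opts \\ []) do
    Agent.start_link(fn -> %{} end, name: __MODULE__)
  end

  @impl true
  def enqueue(queue, job) do
    Agent.update(__MODULE__, fn state ->
      Map.update(state, queue, [job], &(&1 ++ [job]))
    end)
  end

  @impl true
  def dequeue(queue) do
    Agent.get_and_update(__MODULE__, fn state ->
      case Map.get(state, queue, []) do
        [] -> {:empty, state}
        [job | rest] -> {{:ok, job}, Map.put(state, queue, rest)}
      end
    end)
  end
end
```

Because callers depend only on the behaviour, the adapter module can be chosen from configuration and swapped without touching workflow code.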
workflow cancellation with cascading cleanup
Cancels running workflows via the cancel API. When a workflow is cancelled, the execution engine stops scheduling new steps, runs compensations for completed steps in reverse order, and marks the workflow as cancelled in the database. Cancellation is asynchronous and resumable: if the application crashes during cancellation, the process resumes from the last completed compensation.
Unique: Implements workflow cancellation as a first-class operation with automatic compensation execution, rather than as a simple state flag. Cancellation is resumable and fully observable, enabling graceful shutdown of workflows with complex resource cleanup.
vs alternatives: More sophisticated than simple workflow termination and simpler than Temporal's cancellation (which requires custom activity implementations). Cancellation automatically triggers compensations without explicit cleanup logic.
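The reverse-order compensation and its resumability can be modeled with a short sketch. CompensationSketch and its arguments are hypothetical: the completed list stands in for persisted step records, and the compensated set stands in for compensations already recorded before a crash.

```elixir
defmodule CompensationSketch do
  # Hypothetical model of cancellation; not Durable's implementation.
  # completed:   list of {step_name, compensate_fun} in completion order
  # compensated: set of step names whose compensation already ran (e.g. before a crash)
  # Returns the names of the steps compensated by this call, in execution order.
  def cancel(completed, compensated \\ MapSet.new()) do
    completed
    |> Enum.reverse()
    |> Enum.reject(fn {name, _fun} -> MapSet.member?(compensated, name) end)
    |> Enum.map(fn {name, compensate} ->
      compensate.()
      name
    end)
  end
end
```

With steps completed in the order reserve, charge, a fresh cancellation compensates charge and then reserve; a resumed cancellation that already undid charge runs only reserve.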
configurable retry logic with multiple backoff strategies
Provides per-step retry configuration with exponential, linear, constant, and custom backoff strategies. When a step fails, the execution engine automatically reschedules it based on the configured backoff function, max retry count, and jitter settings. Retries are persisted to the database, allowing workflows to survive transient failures (network timeouts, rate limits) without manual intervention. Backoff state is tracked in StepExecution records, enabling observability into retry attempts and failure patterns.
Unique: Implements retries as first-class workflow primitives with pluggable backoff strategies, rather than as a generic job queue feature. Retry state is fully observable via database queries, and backoff functions are composable Elixir functions, enabling custom strategies (e.g., retry only on specific error types) without framework modifications.
vs alternatives: Offers more built-in strategies than Oban's default retry (exponential backoff unless a worker overrides its backoff callback) and is simpler than Temporal (where retry behavior is configured through activity retry policies). Retries are transparent to step logic, so steps need no try/rescue boilerplate.
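To make the backoff strategies concrete, the following sketch computes the delay before each retry. BackoffSketch, its function names, and its parameters are assumptions for illustration; Durable's real option names and defaults may differ.

```elixir
defmodule BackoffSketch do
  # Hypothetical helper; names and options are illustrative, not Durable's API.

  # Delay in milliseconds before retry number `attempt` (1-based).
  def delay(:constant, _attempt, base_ms), do: base_ms
  def delay(:linear, attempt, base_ms), do: base_ms * attempt
  def delay(:exponential, attempt, base_ms), do: base_ms * Integer.pow(2, attempt - 1)
  # Custom strategy: any 2-arity function of (attempt, base_ms).
  def delay(fun, attempt, base_ms) when is_function(fun, 2), do: fun.(attempt, base_ms)

  # Adds a random 0..max_jitter_ms offset so retries from many workflows spread out.
  def with_jitter(delay_ms, max_jitter_ms) when max_jitter_ms >= 0 do
    delay_ms + :rand.uniform(max_jitter_ms + 1) - 1
  end

  # A step is retried only while the attempt count is below the configured maximum.
  def retry?(attempt, max_retries), do: attempt < max_retries
end
```

With a 100 ms base, the exponential strategy yields 100, 200, 400 ms for attempts 1 to 3, while the linear strategy yields 100, 200, 300 ms.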
human-in-the-loop workflow pausing with event and input resumption
Enables workflows to pause execution and wait for external events (webhooks, user input, approvals) or time-based delays without holding system resources. Implements three wait primitives: wait_for_event (pause until external event arrives), wait_for_input (pause until user provides data), and wait_for_approval (pause until approval is granted). Paused workflows are stored in PostgreSQL with a WaitState record indicating the resume condition. The execution engine polls or subscribes to resume events and automatically continues the workflow when the condition is met.
Unique: Treats human-in-the-loop as a workflow primitive (wait_for_approval, wait_for_input) rather than as custom step logic, enabling declarative approval workflows without state machine boilerplate. Paused workflows are fully queryable and resumable via API, allowing external systems (web UIs, Slack bots, webhooks) to trigger resumption without coupling to workflow internals.
vs alternatives: Simpler than Temporal (which requires custom activity implementations for approvals) and more explicit than Oban (which lacks built-in pause/resume semantics). Enables long-duration waits (days/months) without resource leaks, unlike in-memory job queues.
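Matching an incoming signal against paused workflows can be pictured as a lookup over wait records. In the sketch below, WaitSketch and the shape of each wait map are assumptions standing in for the persisted WaitState records; a real engine would run this as a database query.

```elixir
defmodule WaitSketch do
  # Hypothetical in-memory stand-in for persisted wait records.
  # Each wait is a map: %{workflow_id: term, type: :event | :input | :approval, key: term}

  # Returns the ids of workflows whose resume condition matches the incoming signal.
  def resumable(waits, type, key) do
    for %{type: ^type, key: ^key, workflow_id: id} <- waits, do: id
  end
end
```

An approval arriving for a given key resumes only the workflows waiting on that exact type and key; everything else stays paused without holding a process.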