Design notes
Design invariants
The following constraints are part of the library's design, not incidental implementation details.
- Hooks are best-effort: observability hooks must not break workloads. Hook failures should never change retry outcomes.
- Behavior must stay bounded: deadlines, attempt limits, unknown caps, budgets, and breaker thresholds exist to prevent unbounded failure handling.
- Classification is coarse and semantic: classifiers should map failures into a small stable set of policy-relevant classes, not mirror every domain taxonomy.
- Structured outcomes are first-class: `execute()` and `RetryOutcome` are part of the intended control-flow surface, not a second-tier escape hatch.
- Sync and async should preserve semantics: the async API should differ in mechanics, not in the meaning of retries, stop reasons, hooks, and outcomes.
- Circuit breakers compose with retries: breaker decisions and retry decisions should use the same classification model, and the breaker should observe final operation outcomes, not per-attempt noise.
- Observability must stay low-cardinality: tags and event fields should stay useful for production systems and avoid payload-style data.
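The best-effort hook and low-cardinality invariants above can be sketched in a few lines. This is an illustrative sketch only, not the library's implementation: `emit_best_effort`, `run_once`, `broken_hook`, and the event names are hypothetical and exist only for this example.

```python
import logging
from typing import Any, Callable, Mapping, Optional

logger = logging.getLogger("hook_sketch")

# Hypothetical hook signature: (event name, low-cardinality fields).
Hook = Callable[[str, Mapping[str, Any]], None]


def emit_best_effort(hook: Optional[Hook], event: str, fields: Mapping[str, Any]) -> None:
    """Call an observability hook without letting it affect the caller."""
    if hook is None:
        return
    try:
        hook(event, fields)
    except Exception:
        # Swallow and log: a broken hook must not change the outcome.
        logger.debug("hook failed for event %r", event, exc_info=True)


def run_once(operation: Callable[[], Any], hook: Optional[Hook] = None) -> Any:
    """Run an operation and report the result through a best-effort hook."""
    try:
        result = operation()
    except Exception as exc:
        # Report the exception type only, not its message or payload,
        # so the field stays low-cardinality.
        emit_best_effort(hook, "attempt_failed", {"error_type": type(exc).__name__})
        raise
    emit_best_effort(hook, "attempt_succeeded", {})
    return result


def broken_hook(event: str, fields: Mapping[str, Any]) -> None:
    """A hook that always fails, standing in for an unavailable backend."""
    raise RuntimeError("metrics backend unavailable")
```

With `broken_hook` installed, `run_once(lambda: 42, hook=broken_hook)` still returns the operation's result, and an operation that raises still surfaces its own exception rather than the hook's.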
- Unified policy model: `Policy`/`AsyncPolicy` are the primary containers. Retry is one optional component (`Retry`/`AsyncRetry`) rather than the only execution model.
- Backward-compatible sugar: `RetryPolicy` and `AsyncRetryPolicy` remain convenience wrappers for retry-only use cases, but they intentionally mirror the unified policy semantics.
- Circuit breakers wrap operations, not attempts: Breakers observe the final operation outcome after retry processing so fail-fast behavior and retry behavior stay coordinated.
- Strict envelopes: Deadline and max-attempts are enforced before each attempt; sleeps are capped to remaining deadline to avoid overruns.
- Classification first: Policies are domain-agnostic; callers map exceptions to `ErrorClass` via classifiers (default or custom).
- Per-class backoff: Strategies are looked up by class, falling back to a default; absence of a strategy is a hard error.
- Result and exception symmetry: Retry decisions can be driven by exceptions or returned results; both feed the same stop reasons and outcome surface.
- Deferred execution is first-class: Sleep handlers can choose `SleepDecision.DEFER`, allowing queue/worker systems to reschedule externally instead of sleeping inline.
- Best-effort hooks: Metric/log hooks are isolated: exceptions are swallowed so retries never break due to observability failures.
- Deterministic jitter bounds: Built-in strategies clamp to configured maxima; property-based tests assert bounds.
- Sync/async symmetry: `Policy`/`AsyncPolicy`, `Retry`/`AsyncRetry`, and the `@retry` decorator share the same mental model.
- Context reuse: Context managers bind hooks/operations once for batches; avoid repeating kwargs on every call.
- Structured terminal outcomes: `execute()` returns `RetryOutcome` with stop reason, attempts, and last failure information so integrations do not need to infer lifecycle state from hook events alone.
- Unknowns: `max_unknown_attempts` prevents unbounded retries on unclassified errors; deadline remains a global guardrail.
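To make the envelope, per-class backoff, and unknown-cap principles concrete, here is a toy retry loop. It is a sketch under stated assumptions, not the library's code: `ToyErrorClass`, `ToyOutcome`, `run_bounded`, and the stop-reason strings are hypothetical, and the unknown cap is interpreted here as the number of unknown-class failures tolerated before stopping.

```python
import enum
import time
from dataclasses import dataclass
from typing import Callable, Dict, Optional


class ToyErrorClass(enum.Enum):  # hypothetical stand-in for a coarse class set
    TRANSIENT = "transient"
    PERMANENT = "permanent"
    UNKNOWN = "unknown"


@dataclass
class ToyOutcome:  # hypothetical structured terminal outcome
    ok: bool
    value: object
    stop_reason: Optional[str]
    attempts: int
    last_exception: Optional[BaseException]


Strategy = Callable[[int], float]  # attempt number -> delay in seconds


def run_bounded(
    operation: Callable[[], object],
    classify: Callable[[BaseException], ToyErrorClass],
    strategies: Dict[ToyErrorClass, Strategy],
    default_strategy: Optional[Strategy] = None,
    max_attempts: int = 3,
    max_unknown_attempts: int = 1,
    deadline_s: float = 5.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> ToyOutcome:
    start = clock()
    attempts = 0
    unknown_failures = 0
    last_exc: Optional[BaseException] = None
    while True:
        # Envelope checks run before every attempt.
        if attempts >= max_attempts:
            return ToyOutcome(False, None, "max_attempts", attempts, last_exc)
        if deadline_s - (clock() - start) <= 0:
            return ToyOutcome(False, None, "deadline", attempts, last_exc)
        attempts += 1
        try:
            value = operation()
        except Exception as exc:
            last_exc = exc
            error_class = classify(exc)
        else:
            return ToyOutcome(True, value, None, attempts, None)
        if error_class is ToyErrorClass.PERMANENT:
            return ToyOutcome(False, None, "non_retryable", attempts, last_exc)
        if error_class is ToyErrorClass.UNKNOWN:
            unknown_failures += 1
            if unknown_failures > max_unknown_attempts:
                return ToyOutcome(False, None, "max_unknown_attempts", attempts, last_exc)
        # Per-class lookup with a default; a missing strategy is a hard error.
        strategy = strategies.get(error_class, default_strategy)
        if strategy is None:
            raise LookupError(f"no backoff strategy for {error_class}")
        # Cap the sleep to the remaining deadline to avoid overruns.
        remaining = deadline_s - (clock() - start)
        sleep(min(strategy(attempts), max(remaining, 0.0)))
```

Injecting `sleep` and `clock` keeps the loop testable without real waiting. A caller passing a handler that reschedules work externally instead of sleeping would be one way to approximate the deferred-execution idea, though this sketch does not model it.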