# m13-domain-error

> Use when designing domain error handling. Keywords: domain error, error categorization, recovery strategy, retry, fallback, domain error hierarchy, user-facing vs internal errors, error code design, circuit breaker, graceful degradation, resilience, error context, backoff, retry with backoff, error recovery, transient vs permanent error, 领域错误, 错误分类, 恢复策略, 重试, 熔断器, 优雅降级

- Author: LeonWang0735
- Repository: huoli4844/rustfs
- Version: 20260124105510
- Last Updated: 2026-02-06
- Source: https://github.com/huoli4844/rustfs
- Web: https://mule.run/skillshub/@@huoli4844/rustfs~m13-domain-error:20260124105510

---

# Domain Error Strategy

> **Layer 2: Design Choices**

## Core Question

**Who needs to handle this error, and how should they recover?**

Before designing error types:
- Is this user-facing or internal?
- Is recovery possible?
- What context is needed for debugging?

---

## Error Categorization

| Error Type | Audience | Recovery | Example |
|------------|----------|----------|---------|
| User-facing | End users | Guide action | `InvalidEmail`, `NotFound` |
| Internal | Developers | Debug info | `DatabaseError`, `ParseError` |
| System | Ops/SRE | Monitor/alert | `ConnectionTimeout`, `RateLimited` |
| Transient | Automation | Retry | `NetworkError`, `ServiceUnavailable` |
| Permanent | Human | Investigate | `ConfigInvalid`, `DataCorrupted` |

---

## Thinking Prompt

Before designing error types:

1. **Who sees this error?**
   - End user → friendly message, actionable
   - Developer → detailed, debuggable
   - Ops → structured, alertable

2. **Can we recover?**
   - Transient → retry with backoff
   - Degradable → fallback value
   - Permanent → fail fast, alert

3. **What context is needed?**
   - Call chain → `anyhow::Context`
   - Request ID → structured logging
   - Input data → error payload

---

## Trace Up ↑

To domain constraints (Layer 3):

```
"How should I handle payment failures?"
    ↑ Ask: What are the business rules for retries?
    ↑ Check: domain-fintech (transaction requirements)
    ↑ Check: SLA (availability requirements)
```

| Question | Trace To | Ask |
|----------|----------|-----|
| Retry policy | domain-* | What's acceptable latency for retry? |
| User experience | domain-* | What message should users see? |
| Compliance | domain-* | What must be logged for audit? |

---

## Trace Down ↓

To implementation (Layer 1):

```
"Need typed errors"
    ↓ m06-error-handling: thiserror for library
    ↓ m04-zero-cost: Error enum design

"Need error context"
    ↓ m06-error-handling: anyhow::Context
    ↓ Logging: tracing with fields

"Need retry logic"
    ↓ m07-concurrency: async retry patterns
    ↓ Crates: tokio-retry, backoff
```

---

## Quick Reference

| Recovery Pattern | When | Implementation |
|------------------|------|----------------|
| Retry | Transient failures | Exponential backoff |
| Fallback | Degraded mode | Cached/default value |
| Circuit Breaker | Cascading failures | failsafe-rs |
| Timeout | Slow operations | `tokio::time::timeout` |
| Bulkhead | Isolation | Separate thread pools |

## Error Hierarchy

```rust
#[derive(thiserror::Error, Debug)]
pub enum AppError {
    // User-facing
    #[error("Invalid input: {0}")]
    Validation(String),

    // Transient (retryable)
    #[error("Service temporarily unavailable")]
    ServiceUnavailable(#[source] reqwest::Error),

    // Internal (log details, show generic)
    #[error("Internal error")]
    Internal(#[source] anyhow::Error),
}

impl AppError {
    pub fn is_retryable(&self) -> bool {
        matches!(self, Self::ServiceUnavailable(_))
    }
}
```

## Retry Pattern

```rust
use std::future::Future;
use std::time::Duration;

use tokio_retry::{strategy::ExponentialBackoff, Retry};

// Retries every error; to skip permanent ones, see `tokio_retry::RetryIf`
// with a predicate such as `AppError::is_retryable`.
async fn with_retry<F, Fut, T, E>(f: F) -> Result<T, E>
where
    F: FnMut() -> Fut,
    Fut: Future<Output = Result<T, E>>,
    E: std::fmt::Debug,
{
    // The delay is multiplied by the base each step, so it hits the cap quickly.
    let strategy = ExponentialBackoff::from_millis(100)
        .max_delay(Duration::from_secs(10))
        .take(5); // at most 5 retries after the first attempt

    Retry::spawn(strategy, f).await
}
```

---

## Common Mistakes

| Mistake | Why Wrong | Better |
|---------|-----------|--------|
| One error type for everything | Callers cannot act on it | Categorize by audience |
| Retry everything | Wasted resources | Retry only transient errors |
| Infinite retry | Self-inflicted DoS | Max attempts + backoff |
| Expose internal errors | Security risk | User-friendly messages |
| No context | Hard to debug | `.context()` everywhere |

---

## Anti-Patterns

| Anti-Pattern | Why Bad | Better |
|--------------|---------|--------|
| String errors | No structure | thiserror types |
| `panic!` for recoverable errors | Bad UX | `Result` with context |
| Ignore errors | Silent failures | Log or propagate |
| `Box<dyn Error>` everywhere | Lost type info | thiserror |
| Error in happy path | Performance | Early validation |

---

## Related Skills

| When | See |
|------|-----|
| Error handling basics | m06-error-handling |
| Retry implementation | m07-concurrency |
| Domain modeling | m09-domain |
| User-facing APIs | domain-* |
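
---

## Sketch: Retry Only Transient Errors

The categorization and the "Retry everything" / "Infinite retry" rows above can be combined into one small helper. The following is a minimal, std-only sketch rather than a reference implementation. `FetchError`, `retry_transient`, and `with_fallback` are hypothetical names introduced for illustration. It uses blocking `std::thread::sleep` instead of an async runtime to stay dependency-free. Transient errors are retried with a doubling delay up to a fixed attempt count, permanent errors fail fast, and a fallback value is served only when the final error is still transient.

```rust
use std::fmt;
use std::thread;
use std::time::Duration;

/// Hypothetical error type for this sketch, mirroring the
/// transient vs permanent split in the categorization table.
#[derive(Debug, PartialEq)]
pub enum FetchError {
    /// Transient: a later attempt may succeed.
    ServiceUnavailable,
    /// Permanent: retrying cannot help.
    ConfigInvalid(String),
}

impl fmt::Display for FetchError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            Self::ServiceUnavailable => write!(f, "service temporarily unavailable"),
            Self::ConfigInvalid(reason) => write!(f, "invalid configuration: {reason}"),
        }
    }
}

impl std::error::Error for FetchError {}

impl FetchError {
    pub fn is_retryable(&self) -> bool {
        matches!(self, Self::ServiceUnavailable)
    }
}

/// Calls `op` at least once and at most `max_attempts` times.
/// Sleeps between attempts, doubling the delay after each transient
/// failure. Permanent errors are returned immediately.
pub fn retry_transient<T>(
    max_attempts: u32,
    base_delay: Duration,
    mut op: impl FnMut() -> Result<T, FetchError>,
) -> Result<T, FetchError> {
    let mut delay = base_delay;
    let mut attempt = 1;
    loop {
        match op() {
            Ok(value) => return Ok(value),
            Err(err) if err.is_retryable() && attempt < max_attempts => {
                thread::sleep(delay);
                delay = delay.saturating_mul(2);
                attempt += 1;
            }
            // Permanent error, or the retry budget is exhausted.
            Err(err) => return Err(err),
        }
    }
}

/// Graceful degradation: substitute `fallback` only when the final
/// error is transient; permanent errors still propagate.
pub fn with_fallback<T>(result: Result<T, FetchError>, fallback: T) -> Result<T, FetchError> {
    match result {
        Err(err) if err.is_retryable() => Ok(fallback),
        other => other,
    }
}
```

A caller might write `with_fallback(retry_transient(5, Duration::from_millis(100), fetch), cached_value)`, where `fetch` and `cached_value` are placeholders for the real operation and its degraded-mode value. Keeping the retry decision in `is_retryable` means the policy lives with the error type, not with each call site.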