## Curation Note
Poor error handling is a leading cause of silent failures and prolonged debugging. This skill draws on production incident post-mortems and resilience engineering patterns. Its structured error hierarchy and its distinction between recoverable and unrecoverable errors help prevent the common anti-pattern of catching exceptions too broadly. Retry with exponential backoff is essential for reliability in distributed systems.
## Error Classification
### Error Hierarchy Design
```typescript
// Base error class with metadata
class AppError extends Error {
  readonly code: string;
  readonly statusCode: number;
  readonly isOperational: boolean;
  readonly context: Record<string, unknown>;

  constructor(
    message: string,
    code: string,
    statusCode: number = 500,
    isOperational: boolean = true,
    context: Record<string, unknown> = {}
  ) {
    super(message);
    // Restore the prototype chain so instanceof still works when compiling to ES5
    Object.setPrototypeOf(this, new.target.prototype);
    this.name = this.constructor.name;
    this.code = code;
    this.statusCode = statusCode;
    this.isOperational = isOperational;
    this.context = context;
    // captureStackTrace is V8-specific (Node.js, Chrome), so guard the call
    if (typeof Error.captureStackTrace === 'function') {
      Error.captureStackTrace(this, this.constructor);
    }
  }
}

// Specific error types
class ValidationError extends AppError {
  constructor(message: string, fields?: Record<string, string>) {
    super(message, 'VALIDATION_ERROR', 400, true, { fields });
  }
}

class NotFoundError extends AppError {
  constructor(resource: string, id: string) {
    super(`${resource} not found`, 'NOT_FOUND', 404, true, { resource, id });
  }
}

class ExternalServiceError extends AppError {
  constructor(service: string, originalError: Error) {
    super(`External service failed: ${service}`, 'EXTERNAL_SERVICE_ERROR', 503, true, {
      service,
      originalError: originalError.message
    });
  }
}
```
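One way the hierarchy can help avoid overly broad catches is to handle only the error type a caller understands and rethrow everything else. The sketch below is a minimal illustration under stated assumptions: the in-memory `users` map and the `getUser` and `displayName` functions are hypothetical, and the error classes are compact copies of the ones above so the block runs on its own.

```typescript
// Compact copies of the error types above so the sketch is self-contained.
class AppError extends Error {
  readonly code: string;
  readonly statusCode: number;

  constructor(message: string, code: string, statusCode: number = 500) {
    super(message);
    Object.setPrototypeOf(this, new.target.prototype);
    this.name = new.target.name;
    this.code = code;
    this.statusCode = statusCode;
  }
}

class NotFoundError extends AppError {
  constructor(resource: string, id: string) {
    super(`${resource} not found (id: ${id})`, 'NOT_FOUND', 404);
  }
}

// Hypothetical in-memory store standing in for a real data source.
const users = new Map<string, { name: string }>([['1', { name: 'Ada' }]]);

function getUser(id: string): { name: string } {
  const user = users.get(id);
  if (!user) {
    throw new NotFoundError('User', id);
  }
  return user;
}

// Handle only the failure this caller knows how to recover from.
function displayName(id: string): string {
  try {
    return getUser(id).name;
  } catch (error) {
    if (error instanceof NotFoundError) {
      return 'Unknown user';
    }
    // Anything else is unexpected here: let it propagate.
    throw error;
  }
}

console.log(displayName('1'));  // Ada
console.log(displayName('42')); // Unknown user
```

Because the `catch` narrows with `instanceof`, a programming error inside `getUser` (for example a `TypeError`) would still surface instead of being silently converted into a fallback value.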
### Operational vs Programming Errors
```typescript
// Operational: expected failures, handle gracefully
// - Network timeouts
// - Invalid user input
// - Resource not found

// Programming: bugs, crash and fix the code
// - Undefined is not a function
// - Type errors
// - Assertion failures

// logger, sendErrorResponse, and alertOps are application-provided helpers.
function handleError(error: Error): void {
  if (error instanceof AppError && error.isOperational) {
    // Log and continue
    logger.warn('Operational error', { error });
    sendErrorResponse(error);
    return;
  }

  // Programming error: log, alert, and exit so a process manager can restart
  logger.error('Programming error', { error });
  alertOps(error);
  // Note: exiting immediately can drop log or alert output that is still buffered
  process.exit(1);
}
```
## Retry Patterns
```typescript
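interface RetryOptions {
  maxAttempts: number;
  baseDelayMs: number;
  maxDelayMs: number;
  shouldRetry: (error: Error) => boolean;
}

// Resolve after the given number of milliseconds
function sleep(ms: number): Promise<void> {
  return new Promise((resolve) => setTimeout(resolve, ms));
}

async function withRetry<T>(fn: () => Promise<T>, options: RetryOptions): Promise<T> {
  let lastError: Error | undefined;

  for (let attempt = 1; attempt <= options.maxAttempts; attempt++) {
    try {
      return await fn();
    } catch (error) {
      // Normalize non-Error throwables so shouldRetry always receives an Error
      lastError = error instanceof Error ? error : new Error(String(error));

      if (!options.shouldRetry(lastError) || attempt === options.maxAttempts) {
        throw lastError;
      }

      // Exponential backoff with jitter, capped at maxDelayMs
      const delay = Math.min(
        options.baseDelayMs * Math.pow(2, attempt - 1) + Math.random() * 100,
        options.maxDelayMs
      );
      await sleep(delay);
    }
  }

  // Only reached when maxAttempts < 1, so fn was never called
  throw lastError ?? new Error('withRetry: maxAttempts must be at least 1');
}

// Usage (externalApi is an application-provided client)
const result = await withRetry(() => externalApi.call(), {
  maxAttempts: 3,
  baseDelayMs: 100,
  maxDelayMs: 5000,
  shouldRetry: (err) => err instanceof ExternalServiceError
});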
```
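To make the backoff arithmetic in `withRetry` concrete, the delay calculation can be pulled into a pure function and evaluated for a few attempts. This is a sketch, not part of the original skill: the `backoffDelay` name and the injectable `random` parameter are assumptions introduced so the schedule is reproducible.

```typescript
// Delay to wait after a failed attempt (attempts are 1-based).
// `random` is injectable (an assumption of this sketch) so tests can pin the jitter.
function backoffDelay(
  attempt: number,
  baseDelayMs: number,
  maxDelayMs: number,
  random: () => number = Math.random
): number {
  const exponential = baseDelayMs * Math.pow(2, attempt - 1);
  const jitter = random() * 100; // up to 100 ms, same as in withRetry
  return Math.min(exponential + jitter, maxDelayMs);
}

// Disable jitter to make the doubling and the cap visible.
const noJitter = (): number => 0;

const schedule = [1, 2, 3, 4, 5, 6, 7].map((attempt) =>
  backoffDelay(attempt, 100, 5000, noJitter)
);

console.log(schedule); // [100, 200, 400, 800, 1600, 3200, 5000]
```

With `baseDelayMs: 100` the delay doubles on each attempt until the seventh, where the uncapped value of 6400 ms is clamped to `maxDelayMs`. The jitter term spreads out retries from clients that failed at the same moment.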
## Error Boundaries (React)
```tsx
import React from 'react';

// logger is an application-provided structured logger.
class ErrorBoundary extends React.Component<
  { children: React.ReactNode; fallback: React.ReactNode },
  { hasError: boolean; error?: Error }
> {
  // Annotate the type explicitly so `error` stays part of the state shape
  state: { hasError: boolean; error?: Error } = { hasError: false };

  static getDerivedStateFromError(error: Error) {
    return { hasError: true, error };
  }

  componentDidCatch(error: Error, errorInfo: React.ErrorInfo) {
    logger.error('React error boundary caught', {
      error,
      componentStack: errorInfo.componentStack
    });
  }

  render() {
    if (this.state.hasError) {
      return this.props.fallback;
    }
    return this.props.children;
  }
}
```
## Logging Best Practices
```typescript
// Structured logging with context
// getRequestId and getCurrentUserId are application-provided helpers.
function logError(error: Error, context: Record<string, unknown> = {}): void {
  const logEntry = {
    timestamp: new Date().toISOString(),
    level: 'error',
    message: error.message,
    name: error.name,
    stack: error.stack,
    // Include AppError metadata only when it is available
    ...(error instanceof AppError && {
      code: error.code,
      statusCode: error.statusCode,
      isOperational: error.isOperational,
      errorContext: error.context
    }),
    context,
    requestId: getRequestId(),
    userId: getCurrentUserId()
  };

  console.error(JSON.stringify(logEntry));
}
```
## Best Practices
1. **Fail fast** - Validate early, fail with clear messages
2. **Error hierarchy** - Create specific error types
3. **Include context** - Add debugging information
4. **Separate concerns** - Don't mix business logic with error handling
5. **Log appropriately** - Structured logs with request IDs
6. **User-friendly messages** - Technical details for logs, clarity for users
7. **Retry wisely** - Only retry transient failures
8. **Monitor errors** - Track error rates and patterns
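
Practices 1 and 6 can be combined in a small sketch: validate input at the boundary, then translate whatever was thrown into a response body that is safe to show to users. The `parseAge` and `toClientError` functions, the `ClientErrorBody` shape, and the `INTERNAL_ERROR` code are hypothetical names chosen for illustration, and `AppError` is a compact copy of the class defined earlier.

```typescript
// Compact copy of AppError so the sketch is self-contained.
class AppError extends Error {
  readonly code: string;
  readonly statusCode: number;
  readonly isOperational: boolean;

  constructor(message: string, code: string, statusCode = 500, isOperational = true) {
    super(message);
    Object.setPrototypeOf(this, new.target.prototype);
    this.name = new.target.name;
    this.code = code;
    this.statusCode = statusCode;
    this.isOperational = isOperational;
  }
}

// Fail fast: reject bad input immediately, with a clear message.
function parseAge(input: string): number {
  const age = Number(input);
  if (input.trim() === '' || !Number.isInteger(age) || age < 0) {
    throw new AppError(
      `Age must be a non-negative integer, got "${input}"`,
      'VALIDATION_ERROR',
      400
    );
  }
  return age;
}

// Hypothetical shape of the body sent back to the client.
interface ClientErrorBody {
  status: number;
  code: string;
  message: string;
}

// Operational errors carry messages written for users; everything else
// is replaced by a generic message so internals stay in the logs only.
function toClientError(error: unknown): ClientErrorBody {
  if (error instanceof AppError && error.isOperational) {
    return { status: error.statusCode, code: error.code, message: error.message };
  }
  return {
    status: 500,
    code: 'INTERNAL_ERROR',
    message: 'Something went wrong. Please try again later.'
  };
}

function respond(input: string): ClientErrorBody | { status: number; age: number } {
  try {
    return { status: 200, age: parseAge(input) };
  } catch (error) {
    return toClientError(error);
  }
}

console.log(respond('42'));
console.log(respond('abc'));
console.log(toClientError(new TypeError('cannot read properties of undefined')));
```

The full error, including its stack, would still go to the structured logger; only the client-facing body is reduced.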