Skip to content

Dead Letter Queue

The Dead Letter Queue stores failed jobs with full metadata for debugging and recovery.

Why Jobs End Up in DLQ

ReasonDescription
explicit_failJob explicitly failed via error throw
max_attempts_exceededJob exceeded retry attempts
timeoutJob timed out during processing
stalledJob stalled (no heartbeat)
ttl_expiredJob TTL expired before processing
worker_lostWorker disconnected during processing

Configuration

const queue = new Queue('my-queue');
queue.setDlqConfig({
autoRetry: true, // Enable automatic retry from DLQ
autoRetryInterval: 3600000, // Retry every hour
maxAutoRetries: 3, // Max 3 auto-retries
maxAge: 604800000, // Purge after 7 days (null = never)
maxEntries: 10000, // Max entries per queue
});

Options

OptionDefaultDescription
autoRetryfalseEnable automatic retry
autoRetryInterval3600000Time between auto-retries (1 hour)
maxAutoRetries3Maximum auto-retry attempts
maxAge604800000Auto-purge age (7 days, null = never)
maxEntries10000Maximum DLQ entries per queue

Viewing DLQ Entries

// Get all DLQ entries
const entries = queue.getDlq();
entries.forEach(entry => {
console.log('Job ID:', entry.job.id);
console.log('Reason:', entry.reason);
console.log('Error:', entry.error);
console.log('Entered DLQ:', new Date(entry.enteredAt));
console.log('Attempts:', entry.attempts.length);
console.log('Retry count:', entry.retryCount);
console.log('Next retry:', entry.nextRetryAt);
console.log('Expires:', entry.expiresAt);
});

Filtering

// Filter by reason
const stalledJobs = queue.getDlq({ reason: 'stalled' });
const timeoutJobs = queue.getDlq({ reason: 'timeout' });
// Filter by age
const oldJobs = queue.getDlq({
olderThan: Date.now() - 86400000 // Older than 24 hours
});
const recentJobs = queue.getDlq({
newerThan: Date.now() - 3600000 // Last hour
});
// Filter retriable entries
const retriable = queue.getDlq({ retriable: true });
// Pagination
const page = queue.getDlq({ limit: 10, offset: 20 });

Statistics

const stats = queue.getDlqStats();
console.log('Total entries:', stats.total);
console.log('By reason:', stats.byReason);
// { explicit_fail: 5, timeout: 2, stalled: 1, ... }
console.log('Pending retry:', stats.pendingRetry);
console.log('Expired:', stats.expired);
console.log('Oldest entry:', new Date(stats.oldestEntry));
console.log('Newest entry:', new Date(stats.newestEntry));

Retrying Jobs

// Retry all jobs
const count = queue.retryDlq();
// Retry specific job
queue.retryDlq('job-123');
// Retry by filter
queue.retryDlqByFilter({ reason: 'timeout' });
queue.retryDlqByFilter({ olderThan: Date.now() - 86400000 });

Purging

// Purge all DLQ entries
const purged = queue.purgeDlq();
console.log(`Purged ${purged} entries`);

DLQ Entry Structure

interface DlqEntry<T> {
job: Job<T>; // The failed job
enteredAt: number; // When first moved to DLQ
reason: FailureReason; // Why it failed
error: string | null; // Error message
attempts: AttemptRecord[]; // Full attempt history
retryCount: number; // Times retried from DLQ
lastRetryAt: number | null; // Last DLQ retry time
nextRetryAt: number | null; // Next scheduled auto-retry
expiresAt: number | null; // When entry expires
}
interface AttemptRecord {
attempt: number; // Attempt number (1-based)
startedAt: number; // When attempt started
failedAt: number; // When attempt failed
reason: FailureReason; // Failure reason
error: string | null; // Error message
duration: number; // Attempt duration (ms)
}

Auto-Retry Behavior

When autoRetry is enabled:

  1. Failed jobs are added to DLQ with nextRetryAt set
  2. Background task checks for due retries every minute
  3. Jobs are re-queued with reset attempt count
  4. Uses exponential backoff: interval * 2^(retryCount-1)
  5. After maxAutoRetries, job stays in DLQ permanently
queue.setDlqConfig({
autoRetry: true,
autoRetryInterval: 60000, // Base: 1 minute
maxAutoRetries: 3,
});
// Retry schedule:
// 1st retry: 1 minute after failure
// 2nd retry: 2 minutes after 1st retry
// 3rd retry: 4 minutes after 2nd retry
// After that: no more auto-retries

Example: Monitoring Dashboard

// Poll DLQ stats every 30 seconds
setInterval(() => {
const stats = queue.getDlqStats();
// Alert if too many failures
if (stats.total > 100) {
alertOps('High DLQ count', stats);
}
// Check for stall issues
if (stats.byReason.stalled > 10) {
alertOps('Many stalled jobs - check workers', stats);
}
}, 30000);