Beyond “Try-Catch”: Building Self-Healing Apex with Transaction Finalizers

We’ve all been there as developers. You build a complex Queueable job. You bulk-test it in the sandbox. Everything looks perfect. Then, production reality hits. A row lock here, a CPU timeout there, and suddenly your process dies a silent death.

As an Architect, I dread the “silent failure.” In the past, we tried to wrap everything in try-catch blocks, but let’s be honest: you can’t catch a System.LimitException. When a Queueable exceeds its 60 seconds of CPU time (10 seconds in a synchronous transaction), the transaction just… ends.

That’s why I’ve become an advocate for the System.Finalizer interface. It’s the closest thing we have to a “safety net” for the asynchronous world.

The Architecture: A “Manager-Worker” Relationship

Think of a Finalizer as a supervisor who stands outside the factory floor. Even if the factory (your Queueable) collapses, the supervisor is still standing there with a clipboard, ready to log the incident and call for help.

The Glue: The `IRetryable` Interface

To ensure our Finalizer can talk to any Queueable job without knowing its specific business logic, we define an interface. This allows the Finalizer to ask the job, “Are you allowed to try again?” and “What is your current retry count?”

The Implementation

Here is how I structure this pattern to ensure resiliency. We are going to build a Self-Healing Worker that can detect its own failure and attempt a retry.

Architect’s Warning: Salesforce limits successive re-queuing from a Finalizer to 5 consecutive attempts. If the job fails 5 times in a row, the chain stops to prevent infinite loops.

1. The Interface

/**
 * @description Interface to enable self-healing capabilities.
 */
public interface IRetryable {
    Boolean canRetry();
    void incrementRetryCount();
    Integer getRetryCount();
}

2. The Supervisor (The Finalizer)

/**
 * @description Architect Pattern: Transactional Safety Net
 */
public class QueueableSafetyNet implements System.Finalizer {
    private Object parentJob; 

    public QueueableSafetyNet(Object job) {
        this.parentJob = job;
    }

    public void execute(System.FinalizerContext ctx) {
        if (ctx.getResult() != ParentJobResult.SUCCESS) {
            handleFailure(ctx);
        }
    }

    private void handleFailure(System.FinalizerContext ctx) {
        Exception ex = ctx.getException();
        System.debug('Async failure detected: ' + ex?.getMessage());
        // 1. Log to your custom error framework
        // insert new Error_Log__c(...);

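        // 2. Only jobs that opt in via IRetryable (and really are Queueable) get re-enqueued
        if (parentJob instanceof IRetryable && parentJob instanceof Queueable) {
            IRetryable retryableJob = (IRetryable) parentJob;

            if (retryableJob.canRetry()) {
                retryableJob.incrementRetryCount();
                System.debug('Self-healing: Retry #' + retryableJob.getRetryCount());
                // System.enqueueJob requires a Queueable, so the Object reference must be cast
                System.enqueueJob((Queueable) parentJob);
            }
        }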
    }
}
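The "log to your custom error framework" step is left as a comment above. A minimal sketch of what it might look like follows, assuming a hypothetical helper class named AsyncErrorLogger. The Error_Log__c fields in the commented-out DML are placeholders rather than a real schema, so adapt them to whatever your logging object actually exposes.

```apex
/**
 * @description Hypothetical logging helper for use inside a Finalizer.
 */
public class AsyncErrorLogger {

    // Pure formatting step, kept separate so it can be unit tested without DML
    public static String describe(String jobId, String typeName, String message) {
        return 'Job ' + jobId + ' failed with ' + typeName + ': ' + message;
    }

    public static void log(System.FinalizerContext ctx) {
        Exception ex = ctx.getException();
        String summary = describe(
            String.valueOf(ctx.getAsyncApexJobId()),
            ex?.getTypeName(),
            ex?.getMessage()
        );

        // Assumption: Error_Log__c and these field names are illustrative only.
        // insert new Error_Log__c(
        //     Job_Id__c = ctx.getAsyncApexJobId(),
        //     Message__c = summary,
        //     Stack_Trace__c = ex?.getStackTraceString()
        // );
        System.debug(LoggingLevel.ERROR, summary);
    }
}
```

Calling AsyncErrorLogger.log(ctx) from handleFailure would keep the Finalizer itself focused on the retry decision.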

3. The Worker (The Queueable)

public class DataSyncJob implements Queueable, IRetryable {
    private List<Id> recordIds;
    private Integer retryCount = 0;
    private static final Integer MAX_RETRIES = 3;

    public DataSyncJob(List<Id> ids) { this.recordIds = ids; }

    public void execute(QueueableContext qbc) {
        // ATTACH FIRST: Ensure the net is under you before you start walking the wire
        System.attachFinalizer(new QueueableSafetyNet(this));

        // Business Logic: High-risk processing goes here
    }

    public Boolean canRetry() { return retryCount < MAX_RETRIES; }
    public void incrementRetryCount() { this.retryCount++; }
    public Integer getRetryCount() { return this.retryCount; }
}
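For completeness, here is one way the worker might be started from Anonymous Apex. This is only a usage sketch: the Account query is a stand-in for whatever selects your records, not part of the pattern itself.

```apex
// Placeholder selection: replace with the records your sync actually needs
List<Id> recordIds = new List<Id>(
    new Map<Id, Account>([SELECT Id FROM Account LIMIT 200]).keySet()
);

DataSyncJob job = new DataSyncJob(recordIds);

// The job tracks its own retry budget, starting at zero
System.debug('Retries used so far: ' + job.getRetryCount());

// The Finalizer is attached inside execute(), so the caller only enqueues
Id jobId = System.enqueueJob(job);
System.debug('Enqueued DataSyncJob as AsyncApexJob ' + jobId);
```

Because the retry count lives in the job instance, each re-enqueue from the Finalizer carries the updated count forward.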

Comparison: Traditional Try-Catch vs. Finalizers

Scenario	Try-Catch Block	Transaction Finalizer
Logic Errors (Null Pointer, etc.)	✅ Can catch	✅ Can detect after the job ends
Governor Limits (CPU/Heap)	❌ Cannot catch	✅ Can detect after the job ends
Assertion Failures	❌ Cannot catch	✅ Can detect after the job ends
Scope	Only the code inside the block	Everything the Queueable runs after the Finalizer is attached

Why this changes your “Architectural DNA”

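Resiliency over Rigidity: Instead of just failing on a row lock, your code now says, “I’ll try again.” The sample above re-enqueues immediately; System.enqueueJob also accepts a delay in minutes (0 to 10) if you would rather back off first.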
True Error Visibility: You can finally report on why things failed in the background without digging through raw debug logs.
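Governance: You’re respecting the platform. The failed transaction is still rolled back, but the Finalizer runs afterward in its own context, so you can log, notify, or retry instead of leaving the work silently abandoned in a “zombie” state.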
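To make a retry wait rather than fire immediately, the two-argument form of System.enqueueJob can be used, which takes a delay in minutes from 0 to 10. The sketch below assumes a hypothetical helper class, RetryScheduler, and a simple linear backoff. Neither comes from the pattern above, and the right backoff curve depends on your workload.

```apex
/**
 * @description Hypothetical helper: re-enqueue a job with a growing delay.
 */
public class RetryScheduler {
    // The platform accepts an enqueue delay between 0 and 10 minutes
    private static final Integer MAX_DELAY_MINUTES = 10;

    // Linear backoff: retry 1 waits 1 minute, retry 2 waits 2 minutes, capped at 10
    public static Integer delayFor(Integer retryCount) {
        return Math.min(Math.max(retryCount, 0), MAX_DELAY_MINUTES);
    }

    // Enqueue the job again after the computed delay and return the new job Id
    public static Id retryLater(Queueable job, Integer retryCount) {
        return System.enqueueJob(job, delayFor(retryCount));
    }
}
```

Inside the Finalizer, the plain System.enqueueJob call could be swapped for RetryScheduler.retryLater, passing the job cast to Queueable and the value of retryableJob.getRetryCount().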

The Trade-offs (Architect’s Reality Check)

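Chain Limits: A Finalizer can re-enqueue a failed job at most 5 times in a row, and the counter resets only when the job completes without an unhandled exception. If your job is fundamentally broken (logic error), retrying won’t help. Use your retry count wisely.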
State Management: Ensure your Queueable class is serializable. Everything you need to “restart” the job must be stored in the class variables.
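Since retrying a logic error only burns through the retry budget, it can help to classify the failure before re-enqueuing. The following is a rough heuristic, not an official API: RetryPolicy is a hypothetical class, and matching on the UNABLE_TO_LOCK_ROW status text in the exception message is an assumption about how row-lock failures surface in your org.

```apex
/**
 * @description Hypothetical policy: decide whether a failure is worth retrying.
 */
public class RetryPolicy {

    public static Boolean isTransient(Exception ex) {
        if (ex == null) {
            return false;
        }

        // Logic errors fail the same way every time, so never retry them
        if (ex instanceof NullPointerException || ex instanceof TypeException) {
            return false;
        }

        // Row-lock contention usually clears once the competing transaction finishes
        String message = ex.getMessage();
        return message != null && message.contains('UNABLE_TO_LOCK_ROW');
    }
}
```

A Finalizer could check RetryPolicy.isTransient(ctx.getException()) alongside canRetry() so that only failures likely to succeed on a second attempt consume a retry.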

Final Thought

We’re moving toward a world of “Autonomous Salesforce.” Our systems should be smart enough to detect a hiccup. They should correct it without an admin having to manually click a button. Transaction Finalizers are the foundation of that autonomy.