Handling Errors and Poison Blobs in Azure Functions With Azure Blob Storage Triggers

(This article applies to Azure Functions V2)

An Azure Function can be triggered when a blob is written (or updated). If an unhandled exception occurs in the function, by default Azure Functions will try the same blob a total of 5 times. If all 5 attempts fail, no further attempts will be made and the processing of the blob will be “lost”.

Understanding Blob Processing Errors in Azure Functions

When a new (or updated) blob triggers a function, the Azure Functions runtime makes sure that the same blob is not processed twice (provided no error occurs in the function execution). To do this, the runtime makes use of “blob receipts”. These are stored in the Azure storage account associated with the function app (as defined in the AzureWebJobsStorage Function App setting).

As an example, suppose a new blob (called “followupletterrequest.data”) triggered the following function:

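using Microsoft.Azure.WebJobs;
using Microsoft.Extensions.Logging;
using Newtonsoft.Json;
using WaffleGenerator; // assumed namespace of WaffleEngine (WaffleGenerator NuGet package)

class FollowupLetterRequest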
{
    public string FirstName { get; set; }
    public string LastName { get; set; }
}

public static class PoisonBlobExampleFunctions
{
    [FunctionName("PoisonBlobExampleFunctions")]
    public static void Run(
        [BlobTrigger("followup-letters/{blobname}.data")]string blobData, 
        string blobname,
        [Blob("followup-letters/{blobname}.txt")] out string letter,
        ILogger log)
    {
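        // MissingMemberHandling.Error makes deserialization throw when the JSON
        // contains a property that FollowupLetterRequest does not define
        var settings = new JsonSerializerSettings
        {
            MissingMemberHandling = MissingMemberHandling.Error
        };

        // This code assumes the blob contains valid JSON; if it does not, an exception is thrown
        var request = JsonConvert.DeserializeObject<FollowupLetterRequest>(blobData, settings);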

        string firstName = request.FirstName;
        string lastName = request.LastName;

        letter = RenderFollowUpLetterText(firstName, lastName);
    }
    
    private static string RenderFollowUpLetterText(string firstName, string lastName)
    {
        string simulateLetterText = WaffleEngine.Text(paragraphs: 3, includeHeading: false);

        return $"Dear {firstName} {lastName}\r\n \r\n{simulateLetterText}";
    }
}

After the function runs, the blob receipt can be seen in the storage account under a path like “azure-webjobs-hosts/blobreceipts”. On a development machine using the local storage emulator the full path would be something like: “blobreceipts/desktop/DontCodeTiredDemosV2.PoisonBlobExampleFunctions.Run/"0x8D69224161F4590"/followup-letters/followupletterrequest.data”.

This full path to the blob receipt blob represents:

  • Function Id that the blob triggered (DontCodeTiredDemosV2.PoisonBlobExampleFunctions.Run)
  • Blob Container Name (followup-letters)
  • Name of triggering blob (followupletterrequest.data)
  • Triggering blob version ETag (“0x8D69224161F4590”)
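
To make the path layout concrete, here is a minimal sketch that splits a receipt path into those parts. BlobReceiptPath is a hypothetical helper, not part of the Azure Functions SDK, and the layout is inferred only from the example path above. In particular, treating the second segment (“desktop” in the example) as a host identifier is an assumption.

```csharp
using System;

// Hypothetical helper (not part of the Azure Functions SDK) that splits a
// blob receipt path into its parts.
public sealed class BlobReceiptPath
{
    public string HostId { get; private set; }        // assumed meaning of the 2nd segment
    public string FunctionId { get; private set; }
    public string ETag { get; private set; }
    public string ContainerName { get; private set; }
    public string BlobName { get; private set; }

    public static BlobReceiptPath Parse(string path)
    {
        // Layout assumed from the example path:
        // blobreceipts/{hostId}/{functionId}/{etag}/{container}/{blobName}
        // Limiting the split to 6 parts keeps any '/' inside the blob name intact.
        string[] parts = path.Split(new[] { '/' }, 6);

        if (parts.Length != 6 || parts[0] != "blobreceipts")
        {
            throw new FormatException($"Not a blob receipt path: {path}");
        }

        return new BlobReceiptPath
        {
            HostId = parts[1],
            FunctionId = parts[2],
            ETag = parts[3],
            ContainerName = parts[4],
            BlobName = parts[5]
        };
    }
}
```

Note that the ETag segment keeps its surrounding double quotes, exactly as it appears in the path.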

 

If we now add another new blob called “followupletterrequest_bad.data” that contains bad data (e.g. malformed JSON, or a JSON property that FollowupLetterRequest does not define), so that an exception is thrown, a second blob receipt will be generated: “blobreceipts/desktop/DontCodeTiredDemosV2.PoisonBlobExampleFunctions.Run/"0x8D692245985E910"/followup-letters/followupletterrequest_bad.data”.

Because this blob generated an error, after the default number of retries (5) there will be no more attempts to process it.
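
The number of attempts can be changed. As far as I know, the blob trigger shares the queue maxDequeueCount setting, so a host.json along the following lines should lower the limit. This is a sketch assuming the V2 host.json schema, and the value 3 is only an example.

```json
{
  "version": "2.0",
  "extensions": {
    "queues": {
      "maxDequeueCount": 3
    }
  }
}
```

Because the setting is shared, changing it also affects any queue-triggered functions in the same function app.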

Manually Forcing a Blob to Be Reprocessed

The documentation states that if the blob receipt is manually deleted, this will force the blob to be reprocessed. This may be suitable to force reprocessing of a set of blobs that failed processing due to some transient error such as a database or network being temporarily offline.

You should obviously take care that reprocessing blobs won't cause problems such as duplicate orders, emails, etc. or other errors in the system. You may also need to consider what would happen if blobs are retried in a different order and/or interleaved with new blobs being added.

Also, blobs may not be reprocessed immediately. Using the local function runtime development environment, once the blob receipt has been deleted, it seems that the function app needs restarting to cause the blob to be reprocessed (either that or I didn’t wait long enough…). Once deployed to Azure, there can be a delay between the blob receipt being deleted and the blob being retried. The following timeline shows the initial five attempts, the receipt deletion, and the retry attempts that eventually followed:

2019-02-14 03:40:24.374 <attempt 1 - failure>
2019-02-14 03:40:24.763 <attempt 2 - failure>
2019-02-14 03:40:24.891 <attempt 3 - failure>
2019-02-14 03:40:25.007 <attempt 4 - failure>
2019-02-14 03:40:25.117 <attempt 5 - failure>
<blob receipt deleted>
2019-02-14 04:24:24.327 <retry attempt 1 - failure>
2019-02-14 04:24:25.155 <retry attempt 2 - failure>
2019-02-14 04:24:25.288 <retry attempt 3 - failure>
2019-02-14 04:24:25.455 <retry attempt 4 - failure>
2019-02-14 04:24:25.592 <retry attempt 5 - failure>

Automatically Responding to Blob Failures in Azure Functions

When a blob fails for the last time, information about the failure will be written as a message to a Storage queue called “webjobs-blobtrigger-poison”. The message contains a JSON payload describing the triggering blob that didn’t complete processing successfully, for example:

{
  "Type": "BlobTrigger",
  "FunctionId": "DontCodeTiredDemosV2.PoisonBlobExampleFunctions.Run",
  "BlobType": "BlockBlob",
  "ContainerName": "followup-letters",
  "BlobName": "followupletterrequest_bad.data",
  "ETag": "\"0x8D692245985E910\""
}
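
If you prefer typed access to this payload, a class along the following lines could be used. This is a minimal sketch: PoisonBlobMessage and FromJson are hypothetical names of my own, and the properties simply mirror the fields in the example message above.

```csharp
using Newtonsoft.Json;

// Hypothetical class; property names mirror the example JSON payload.
public class PoisonBlobMessage
{
    public string Type { get; set; }
    public string FunctionId { get; set; }
    public string BlobType { get; set; }
    public string ContainerName { get; set; }
    public string BlobName { get; set; }
    public string ETag { get; set; }

    // Deserializes the raw queue message text into a typed object.
    public static PoisonBlobMessage FromJson(string json)
    {
        return JsonConvert.DeserializeObject<PoisonBlobMessage>(json);
    }
}
```

After deserialization the ETag value still includes its surrounding double quotes, because they are part of the JSON string itself.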

The information contained in the JSON can be used to alert support people about the error and take appropriate action as required, such as writing to a support ticket database or sending an email. You could also implement logic to automatically delete the blob receipt to force reprocessing, but you would probably want to cap the number of retries, otherwise bad data could cause an infinite processing loop. Exactly how you handle failed blob processing will depend on the business scenario.
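
The idea of capping automatic reprocessing could look something like the sketch below. Everything here is hypothetical: ReprocessPolicy is my own name, the counter is kept in memory purely for illustration (a real system would need durable storage so counts survive restarts and are shared between instances), and the receipt path layout and host id segment are assumed from the example paths earlier in this article. Actually deleting the receipt blob is left to the caller.

```csharp
using System.Collections.Generic;

// Hypothetical helper illustrating a cap on automatic reprocessing.
public class ReprocessPolicy
{
    private readonly int _maxReprocessAttempts;
    private readonly object _sync = new object();

    // In-memory only, for illustration: counts are lost when the host restarts.
    private readonly Dictionary<string, int> _attempts = new Dictionary<string, int>();

    public ReprocessPolicy(int maxReprocessAttempts)
    {
        _maxReprocessAttempts = maxReprocessAttempts;
    }

    // Returns true (and records the attempt) if this blob version may be
    // reprocessed again; returns false once the cap has been reached.
    public bool TryRegisterReprocess(string containerName, string blobName, string eTag)
    {
        // The ETag is part of the key so that an updated blob starts from zero.
        string key = $"{containerName}/{blobName}/{eTag}";

        lock (_sync)
        {
            _attempts.TryGetValue(key, out int count);
            if (count >= _maxReprocessAttempts)
            {
                return false;
            }

            _attempts[key] = count + 1;
            return true;
        }
    }

    // Builds the receipt blob path to delete, using the layout assumed from
    // the example paths: blobreceipts/{hostId}/{functionId}/{etag}/{container}/{blob}
    public static string BuildReceiptPath(
        string hostId, string functionId, string eTag, string containerName, string blobName)
    {
        return $"blobreceipts/{hostId}/{functionId}/{eTag}/{containerName}/{blobName}";
    }
}
```

A poison queue handler could call TryRegisterReprocess with the values from the message and only delete the receipt when it returns true, falling back to alerting a person otherwise.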

As an example, the following function monitors the “webjobs-blobtrigger-poison” queue and grabs the information about the failed blob:

[FunctionName("PoisonBlobQueueProcessor")]
public static void PoisonBlobQueueProcessor(
    [QueueTrigger("webjobs-blobtrigger-poison")] string message,
    ILogger log)
{
    var poisonBlobDetails = JsonConvert.DeserializeObject<dynamic>(message);

    log.LogInformation($"Found an unprocessed blob {poisonBlobDetails.ContainerName}/{poisonBlobDetails.BlobName}\r\n");

    // Send an email, log a ticket in a fault system, log a CRM issue, etc.
}
