Understanding Azure Durable Functions - Part 3: What Is Durability?

This is the third part in a series of articles.

Durable Functions make it easier to organize (orchestrate) multiple individual Azure Functions working together.

In addition to simplifying this orchestration, durable functions (as its name suggests) provide a level of “durability”. So what is this durability and how does it work?

Durable Functions provides “reliable execution”. What this means is that the framework takes care of a number of things for us, one of these things is the management of orchestration history.

"Orchestrator functions and activity functions may be running on different VMs within a data center, and those VMs or the underlying networking infrastructure is not 100% reliable. In spite of this, Durable Functions ensures reliable execution of orchestrations." [Microsoft]

One important thing to note is that this “durability” is not meant to automatically retry operations or execute compensation logic, later in the series we’ll look at error handling.

Behind the Scenes

To better explain how this works, lets take the following functions:

using System.Collections.Generic;
using System.Net.Http;
using System.Threading;
using System.Threading.Tasks;
using Microsoft.Azure.WebJobs;
using Microsoft.Azure.WebJobs.Extensions.Http;
using Microsoft.Extensions.Logging;

namespace DurableDemos
{
    public static class Function1
    {
        [FunctionName("OrchestratorFunction")]
        public static async Task<List<string>> RunOrchestrator(
            [OrchestrationTrigger] DurableOrchestrationContext context)
        {
            var outputs = new List<string>();

            // Replace "hello" with the name of your Durable Activity Function.
            outputs.Add(await context.CallActivityAsync<string>("ActivityFunction", "Tokyo"));
            outputs.Add(await context.CallActivityAsync<string>("ActivityFunction", "Seattle"));
            outputs.Add(await context.CallActivityAsync<string>("ActivityFunction", "London"));

            // returns ["Hello Tokyo!", "Hello Seattle!", "Hello London!"]
            return outputs;
        }

        [FunctionName("ActivityFunction")]
        public static string SayHello([ActivityTrigger] string name, ILogger log)
        {
            Thread.Sleep(5000); // simulate longer processing delay

            log.LogInformation($"Saying hello to {name}.");
            return $"Hello {name}!";
        }

        [FunctionName("ClientFunction")]
        public static async Task<HttpResponseMessage> HttpStart(
            [HttpTrigger(AuthorizationLevel.Anonymous, "get", "post")]HttpRequestMessage req,
            [OrchestrationClient]DurableOrchestrationClient starter,
            ILogger log)
        {
            // Function input comes from the request content.
            string instanceId = await starter.StartNewAsync("OrchestratorFunction", null);

            log.LogInformation($"Started orchestration with ID = '{instanceId}'.");

            return starter.CreateCheckStatusResponse(req, instanceId);
        }
    }
}

If we execute the orchestration (as we did in part two of this series) with some PowerShell: $R = Invoke-WebRequest 'http://localhost:7071/api/ReplayExample_HttpStart' -Method 'POST'

And take a look in the Azure storage account associated with the function app (in this example the local storage emulator) there are a number of storage queues and tables created as the following screenshot shows:

Durable Functions tables and queues

Whilst it is not necessary to understand how Durable Functions works behind the scenes, it is useful to know that these queues and tables exist and are used.

For example if we open the DurableFunctionsHubInstances table, we see a single row representing the execution of the orchestration we just initiates with PowerShell:

Azure Durable Functions behind the scenes table

Notice in the preceding screenshot, the Name field contains the name of the orchestration we just executed: [FunctionName("ReplayExample")]

If we execution the orchestration again, we get a second row, etc.

The DurableFunctionsHubHistory table is more complex and has more data per invocation:

DurableFunctionsHubHistory Table

Notice in the previous screenshot that the data in this table keeps track of where the orchestrator function is at, for example what activity functions have been executed so far.

The storage queues behind the scenes are used by the framework to drive function execution.

How Durable Functions Work – Checkpoints

The orchestrator function instance may be removed from memory while it is waiting for activity functions to execute, namely the await context.CallActivityAsync calls. Because these calls are awaited, the orchestrator function may temporarily be removed from memory/paused until the activity function completes.

Because orchestrator functions may be removed from memory, there needs to be some way of the framework to keep track of which activity functions have completed and which ones have yet to be executed. The framework does this by creating “checkpoints” and using the DurableFunctionsHubHistory table to store these checkpoints.

“Azure Storage does not provide any transactional guarantees between saving data into table storage and queues. To handle failures, the Durable Functions storage provider uses eventual consistency patterns. These patterns ensure that no data is lost if there is a crash or loss of connectivity in the middle of a checkpoint.” [Microsoft]

Essentially what this means is the the code in the orchestrator function may execute multiple times per invocation (be replayed) and the checkpoint data ensures that a given activity call is not called multiple times.

At a high level, at each checkpoint (each await) the execution history is saved into table storage and  messages are added to storage queues to trigger other functions.

Replay Considerations

Because the code in an orchestrator function can be replayed, there are a few restrictions on the code you write inside them.

Firstly, code in orchestrator function should be deterministic, that is if the code is replayed multiple times it should produce the same result – some examples of non-determinist code include random number generation, GUID generation, current date/time calls and calls to non-deterministic code/APIs/endpoints.

Code inside orchestrator functions should not block, so for example do not do IO operations or call Thread.Sleep.

Code inside orchestrator functions should not call into async code except by way of the DurableOrchestrationContext object that is passed into the orchestrator function, e.g. public static async Task RunOrchestrator( [OrchestrationTrigger] DurableOrchestrationContext context). For example initiating an activity function via: context.CallActivityAsync<string>("ReplayExample_ActivityFunction", "Tokyo"); So for example do not use Task.Run, Task.Delay, HttpClient.SendAsync, etc. inside an orchestrator function.

Inside an orchestrator function, if you want the following logic use the substitute APIs as recommended in the Microsoft docs:

Note: these restrictions apply to the orchestrator function, not to the activity function(s) or client function.

Seeing Replay in Action

As a quick example, we can modify the code in the orchestrator function to add a log message:

[FunctionName("ReplayExample")]
public static async Task RunOrchestrator(
    [OrchestrationTrigger] DurableOrchestrationContext context, ILogger log)
{
    log.LogInformation($"************** RunOrchestrator method executing ********************");
    
    await context.CallActivityAsync<string>("ReplayExample_ActivityFunction", "Tokyo");            
    await context.CallActivityAsync<string>("ReplayExample_ActivityFunction", "Seattle");
    await context.CallActivityAsync<string>("ReplayExample_ActivityFunction", "London");
}

Now when the client function is called and the orchestration started, the following simplified log output can be seen:

Executing HTTP request: {
  "requestId": "c7edb6b0-e947-4347-9f0e-fc46a2bdeefe",
  "method": "POST",
  "uri": "/api/ReplayExample_HttpStart"
}
Executing 'ReplayExample_HttpStart' (Reason='This function was programmatically called via the host APIs.', Id=ca981987-5ea3-4c76-8934-09fef757cd6a)
75dce58e3be440e98dcd3c99112c9bbf: Function 'ReplayExample (Orchestrator)' scheduled. Reason: NewInstance. IsReplay: False. State: Scheduled. HubName: DurableFunctionsHub. AppName: . SlotName: . ExtensionVersion: 1.8.2. SequenceNumber: 2.
Started orchestration with ID = '75dce58e3be440e98dcd3c99112c9bbf'.
Executed 'ReplayExample_HttpStart' (Succeeded, Id=ca981987-5ea3-4c76-8934-09fef757cd6a)
Executing 'ReplayExample' (Reason='', Id=fb5e9945-3f9e-4fc6-bda3-9028fa6d8f04)
75dce58e3be440e98dcd3c99112c9bbf: Function 'ReplayExample (Orchestrator)' started. IsReplay: False. Input: (16 bytes). State: Started. HubName: DurableFunctionsHub. AppName: . SlotName: . ExtensionVersion: 1.8.2. SequenceNumber: 3.
************** RunOrchestrator method executing ********************
75dce58e3be440e98dcd3c99112c9bbf: Function 'ReplayExample_ActivityFunction (Activity)' scheduled. Reason: ReplayExample. IsReplay: False. State: Scheduled. HubName: DurableFunctionsHub. AppName: . SlotName: . ExtensionVersion: 1.8.2. SequenceNumber: 4.
Executed 'ReplayExample' (Succeeded, Id=fb5e9945-3f9e-4fc6-bda3-9028fa6d8f04)
75dce58e3be440e98dcd3c99112c9bbf: Function 'ReplayExample (Orchestrator)' awaited. IsReplay: False. State: Awaited. HubName: DurableFunctionsHub. AppName: . SlotName: . ExtensionVersion: 1.8.2. SequenceNumber: 5.
75dce58e3be440e98dcd3c99112c9bbf: Function 'ReplayExample_ActivityFunction (Activity)' started. IsReplay: False. Input: (36 bytes). State: Started. HubName: DurableFunctionsHub. AppName: . SlotName: . ExtensionVersion: 1.8.2. SequenceNumber: 6.
Executing 'ReplayExample_ActivityFunction' (Reason='', Id=e93e770c-a15b-47b5-b8e3-e0db081cf44b)
Saying hello to Tokyo.
Executed 'ReplayExample_ActivityFunction' (Succeeded, Id=e93e770c-a15b-47b5-b8e3-e0db081cf44b)
75dce58e3be440e98dcd3c99112c9bbf: Function 'ReplayExample_ActivityFunction (Activity)' completed. ContinuedAsNew: False. IsReplay: False. Output: (56 bytes). State: Completed. HubName: DurableFunctionsHub. AppName: . SlotName: . ExtensionVersion: 1.8.2. SequenceNumber: 7.
Executing 'ReplayExample' (Reason='', Id=0542a8de-d9e5-46d0-a5a5-d94ea5040202)
************** RunOrchestrator method executing ********************
75dce58e3be440e98dcd3c99112c9bbf: Function 'ReplayExample_ActivityFunction (Activity)' scheduled. Reason: ReplayExample. IsReplay: False. State: Scheduled. HubName: DurableFunctionsHub. AppName: . SlotName: . ExtensionVersion: 1.8.2. SequenceNumber: 8.
Executed 'ReplayExample' (Succeeded, Id=0542a8de-d9e5-46d0-a5a5-d94ea5040202)
75dce58e3be440e98dcd3c99112c9bbf: Function 'ReplayExample (Orchestrator)' awaited. IsReplay: False. State: Awaited. HubName: DurableFunctionsHub. AppName: . SlotName: . ExtensionVersion: 1.8.2. SequenceNumber: 9.
75dce58e3be440e98dcd3c99112c9bbf: Function 'ReplayExample_ActivityFunction (Activity)' started. IsReplay: False. Input: (44 bytes). State: Started. HubName: DurableFunctionsHub. AppName: . SlotName: . ExtensionVersion: 1.8.2. SequenceNumber: 10.
Executing 'ReplayExample_ActivityFunction' (Reason='', Id=9b408473-3224-4968-8d7b-1b1ec56d9359)
Saying hello to Seattle.
Executed 'ReplayExample_ActivityFunction' (Succeeded, Id=9b408473-3224-4968-8d7b-1b1ec56d9359)
75dce58e3be440e98dcd3c99112c9bbf: Function 'ReplayExample_ActivityFunction (Activity)' completed. ContinuedAsNew: False. IsReplay: False. Output: (64 bytes). State: Completed. HubName: DurableFunctionsHub. AppName: . SlotName: . ExtensionVersion: 1.8.2. SequenceNumber: 11.
Executing 'ReplayExample' (Reason='', Id=040e7026-724c-4b07-9e4f-46ed52278785)
************** RunOrchestrator method executing ********************
75dce58e3be440e98dcd3c99112c9bbf: Function 'ReplayExample_ActivityFunction (Activity)' scheduled. Reason: ReplayExample. IsReplay: False. State: Scheduled. HubName: DurableFunctionsHub. AppName: . SlotName: . ExtensionVersion: 1.8.2. SequenceNumber: 12.
Executed 'ReplayExample' (Succeeded, Id=040e7026-724c-4b07-9e4f-46ed52278785)
75dce58e3be440e98dcd3c99112c9bbf: Function 'ReplayExample (Orchestrator)' awaited. IsReplay: False. State: Awaited. HubName: DurableFunctionsHub. AppName: . SlotName: . ExtensionVersion: 1.8.2. SequenceNumber: 13.
75dce58e3be440e98dcd3c99112c9bbf: Function 'ReplayExample_ActivityFunction (Activity)' started. IsReplay: False. Input: (40 bytes). State: Started. HubName: DurableFunctionsHub. AppName: . SlotName: . ExtensionVersion: 1.8.2. SequenceNumber: 14.
Executing 'ReplayExample_ActivityFunction' (Reason='', Id=03a0b729-d8e5-4e42-8b7a-28ea974bb1a6)
Saying hello to London.
Executed 'ReplayExample_ActivityFunction' (Succeeded, Id=03a0b729-d8e5-4e42-8b7a-28ea974bb1a6)
75dce58e3be440e98dcd3c99112c9bbf: Function 'ReplayExample_ActivityFunction (Activity)' completed. ContinuedAsNew: False. IsReplay: False. Output: (60 bytes). State: Completed. HubName: DurableFunctionsHub. AppName: . SlotName: . ExtensionVersion: 1.8.2. SequenceNumber: 15.
Executing 'ReplayExample' (Reason='', Id=02f307bc-c63c-4803-80f8-acfae72c3577)
************** RunOrchestrator method executing ********************
75dce58e3be440e98dcd3c99112c9bbf: Function 'ReplayExample (Orchestrator)' completed. ContinuedAsNew: False. IsReplay: False. Output: (null). State: Completed. HubName: DurableFunctionsHub. AppName: . SlotName: . ExtensionVersion: 1.8.2. SequenceNumber: 16.
Executed 'ReplayExample' (Succeeded, Id=02f307bc-c63c-4803-80f8-acfae72c3577)

Notice in the preceding log output, there are 4 instances of the log messages ************** RunOrchestrator method executing ********************

The final instance of the message occurs before the Function 'ReplayExample (Orchestrator)' completed message so even though we only have 3 activity calls, the orchestrator function method itself was executed 4 times.

SHARE:

Add comment

Loading