What Would Easy Be Like?

As another (Gregorian) year edges ever closer, it’s a period that offers a natural reflection time on what happened in the past year and the forging, often weakly, of “New Years Resolutions”.

One interesting question to ask for your work (or personal) life is: “what would easy be like?”.

For example, looking back to the past year where was the pain, sadness, frustration, long hours, constant overtime, feelings of falling behind in new technologies, failed deployments, the same bugs re-occurring over and over again, unsupportive management, mean co-workers, feelings of inadequacy, disappointed customers, etc.

you should feel like you’re doing your best work

Take some time. Make a list. Write them all down.

Now imagine what it would feel like if all of the things on the list did not exist in the upcoming year – what would easy be like?

Fixing some of the things may mean radical decisions such as quitting your job and moving employer.

Some things may not be quite as radical: better automated testing to catch bugs earlier.

You probably won’t be able to fix everything on your list. Starting from the point of view that things should be easy, that you should feel like you’re doing your best work, can be a powerful way to frame where you’re currently at and where you want to be by this time next year.

Previewing the Generated PostgreSQL SQL for a Query in Marten

Marten is a .NET document database library that uses an underlying PostgreSQL database to store objects as JSON. The library has a variety of features that allow the logging of SQL statements issued to the underlying PostgreSQL database in addition to being able to do things such as get the PostgreSQL query plan for a given LINQ query.

One simple way to get the generated SQL for a Marten LINQ query is to use the ToCommand() extension method.

As an example, suppose we are developing some query code as follows (this code uses the Include method to include the related documents in a single database round-trip):

Customer customer = null;

List<Order> orders = session.Query<Order>()
                            .Include<Customer>(joinOnOrder => joinOnOrder.CustomerId, includedCustomer => customer = includedCustomer)
                            .Where(x => x.CustomerId == 4001).ToList();

If we want to get an idea of what SQL Marten will generate for this LINQ query, we can change the code as shown in the following:

Customer customer = null;

IQueryable<Order> orders = session.Query<Order>()
                                  .Include<Customer>(joinOnOrder => joinOnOrder.CustomerId, includedCustomer => customer = includedCustomer)
                                  .Where(x => x.CustomerId == 4001);

// Get the SQL command that will be issued when the query executes
NpgsqlCommand cmd = orders.ToCommand();

// Output some selected command info
Console.WriteLine(cmd.CommandText);

foreach (NpgsqlParameter parameter in cmd.Parameters)
{
    Console.WriteLine($"Parameter {parameter.ParameterName} = {parameter.Value}");
}

// Ensure included customer variable is populated
List<Order> orderResults = orders.ToList();

Console.WriteLine(customer.Name);
foreach (Order order in orderResults)
{
    Console.WriteLine($" Order {order.Id} for {order.Quantity} items");
}

Running this preceding code  results in the following console output:

select d.data, d.id, d.mt_version, customer_id.data, customer_id.id, customer_id.mt_version from public.mt_doc_order as d INNER JOIN public.mt_doc_customer as customer_id ON d.customer_id = customer_id.id where d.customer_id = :arg0
Parameter arg0 = 4001
Sarah
 Order 3001 for 42 items
 Order 4001 for 477 items
 Order 5001 for 9 items

To learn more about the document database features of Marten check out my Pluralsight courses: Getting Started with .NET Document Databases Using Marten and Working with Data and Schemas in Marten.

Retrieving Raw JSON Data in Web API with Marten

Marten is a .NET document database library that uses an underlying PostgreSQL database to store objects as JSON.

Ordinarily, Marten takes care of retrieving the JSON from the database and deserializing it into an object. We can however instruct Marten to perform a query to retrieve document(s) and not perform the deserialization, instead giving us the JSON as it appears in the underlying PostgreSQL record. If we are exposing documents via a Web API, this feature can be taken advantage of to reduce some processing overhead on the server.

As an example, we could have the following Customer document defined:

public class Customer
{
    public int Id { get; set; }
    public string Name { get; set; }

    // etc.
}

And in a CustomersController we could start with a method as follows to add Customers to the database:

public void Post(Customer customer)
{
    // no validation

    // DocumentStore would normally be created only once in app, e.g. via IOC singleton 
    using (var store = DocumentStore.For(ConnectionString))
    {
        using (var session = store.LightweightSession())
        {
            session.Store(customer);
            session.SaveChanges();
        }
    }
}

To get all Customers, the following method could be written:

// GET api/customers
public IEnumerable<Customer> Get()
{
    // DocumentStore would normally be created only once in app, e.g. via IOC singleton 
    using (var store = DocumentStore.For(ConnectionString))
    {
        using (var session = store.QuerySession())
        {
            return session.Query<Customer>();
        }
    }
}

The preceding method however incurs the additional overhead of Marten deserializing the database JSON into Customer objects, only to be serializing it again back into JSON on the way out of the API.

When creating the LINQ query, the Marten ToJsonArray() method can be added to instruct Marten to simply return the JSON directly from the database.

We can then modify the Get method as follows:

public HttpResponseMessage Get()
{
    // DocumentStore would normally be created only once in app, e.g. via IOC singleton 
    using (var store = DocumentStore.For(ConnectionString))
    {
        using (var session = store.QuerySession())
        {
            string rawJsonFromDb = session.Query<Customer>().ToJsonArray();

            var response = Request.CreateResponse(HttpStatusCode.OK);
            response.Content = new StringContent(rawJsonFromDb, Encoding.UTF8, "application/json");
            return response;
        }
    }
}

We could also write the parameterized Get method and use Marten’s AsJson() method to get the JSON string for the individual Customer document as  in the following code:

public HttpResponseMessage Get(int id)
{
    // DocumentStore would normally be created only once in app, e.g. via IOC singleton 
    using (var store = DocumentStore.For(ConnectionString))
    {
        using (var session = store.LightweightSession())
        {
            var rawJsonFromDb = session.Query<Customer>().Where(x => x.Id == id).AsJson().FirstOrDefault();

            if (string.IsNullOrEmpty(rawJsonFromDb))
            {
                throw new HttpResponseException(HttpStatusCode.NotFound);
            }

            var response = Request.CreateResponse(HttpStatusCode.OK);
            response.Content = new StringContent(rawJsonFromDb, Encoding.UTF8, "application/json");
            return response;
        }
    }
}

To learn more about the document database features of Marten check out my Pluralsight courses: Getting Started with .NET Document Databases Using Marten and Working with Data and Schemas in Marten.

New Pluralsight Course: Working with Data and Schemas in Marten

Marten is a .NET document database library to allows objects to be stored, retrieved, and queried as documents stored as JSON in an underlying PostgreSQL database. This new course is a follow-on from the previous Getting Started with .NET Document Databases Using Marten course, if you’re new to Marten I’d recommend checking out the previous course first before continuing with this new one.

Among other topics, this new course covers how to log/diagnose the SQL that is being issued to PostgreSQL; how to enable offline optimistic concurrency; bulk document inserts; a number of ways to improve query performance; and the customization of database schema objects.

You can check out the new course on the Pluralsight site.

Screen Scraping As A Service with Azure Functions in 5 Mins

If you have some data in a web page but there is no API to get the same data, it’s possible to use (the often brittle and error prone) technique of screen scraping to read the values out of the HTML.

By leveraging Azure Functions, it’s trivial (depending on how horrendous the HTML is) to create a HTTP Azure Function that loads a web page, parses the HTML, and returns the data as JSON.

The fist step is to create an HTTP-triggered Azure Function:

Creating a new Azure Function with a HTTP Trigger

To help with the parsing, we can use the HTML Agility Pack NuGet package. To add this to the function, create a project.json file and add the NuGet reference:

Using NuGet Packages in Azure Functions

{
  "frameworks": {
    "net46":{
      "dependencies": {
        "HtmlAgilityPack": "1.4.9.5"
      }
    }
   }
}

Once this file is saved, the NuGet package will be installed and a using directive can be added: using HtmlAgilityPack;

Now we can download the required HTML page as a string, in the example below the archive page from Don’t Code Tired, use some LINQ to get the post titles, and return this as the HTTP response. Correctly selecting/parsing the required data from the HTML page is likely to be the most time-consuming part of the function creation.

The full function source code is as follows. Notice that there’s no error checking code to simplify the demo code. Screen scraping should usually be a last resort because of its very nature it can often break if the UI changes or unexpected data exists in the HTML. It also requires the whole page of HTML be downloaded which may raise performance concerns.

using System.Net;
using HtmlAgilityPack;

public static async Task<HttpResponseMessage> Run(HttpRequestMessage req, TraceWriter log)
{
    HttpClient client = new HttpClient();

    string html = await client.GetStringAsync("http://dontcodetired.com/blog/archive");    

    HtmlDocument doc = new HtmlDocument();
    doc.LoadHtml(html); 
    
    var postTitles = doc.DocumentNode
                        .Descendants("td")
                        .Where(x => x.Attributes.Contains("class") && x.Attributes["class"].Value.Contains("title"))
                        .Select(x => x.InnerText);

    return req.CreateResponse(HttpStatusCode.OK, postTitles);
}

We can now call the function via HTTP and get back a list of all Don’t Code Tired article titles as the following Postman screenshot shows:

Using Postman to call Azure Functions

Azure HTTP Function Authorization with Function Keys

When creating an Azure Function triggered via HTTP, one way to authorize use of the function is to configure the HTTP function trigger to require the caller to provide a function key.

Azure Function HTTP Trigger Authorization Modes

With the authorization set to Anonymous, as expected anyone can call it.

When set to Function Authorization, the caller needs to provide the function key either as a URL query string parameter or in a header.

The function key can be found by navigating to Manage tab as the following screenshot shows:

Finding the Azure Function Key

Once Function Authorization is enabled, if the client does not provide it correctly the function will return a 401 Unauthorized.

To supply the function key in the URL, the “code” query string parameter can be used, e.g. “https://myazurecloudfunctions.azurewebsites.net/api/SayHi?code=udXhf3pviSICFMtViW/pqmV/1Q5vLH5aMcRWXfD/q6NXk2VVxRlfYw==”.

Alternatively an “x-functions-key” header can be added containing the key as the following Postman screenshot shows:

Calling Azure Function with Postman and Function Key

Reducing Database Round Trips With Included Documents in Marten

Marten is a .NET document database library that uses an underlying PostgreSQL database to store objects as JSON.

In the  previous article (Enforcing Referential Integrity Between Documents with Marten and PostgreSQL) we saw how we can enable referential integrity between documents and the example showed a Customer document may have many related Order documents.

When creating LINQ queries with Marten, one feature is the idea of included documents. So for example we might want to get a Customer document and all the Order documents for that customer.

We could write some code as follows to get Customer 4001 and a list of their Orders:

// 2 trips to database, 2 queries
using (var session = store.QuerySession())
{
    Customer customer = session.Query<Customer>()
                               .Single(x => x.Id == 4001);

    Console.WriteLine(customer.Name);

    IEnumerable<Order> orders = session.Query<Order>()
                                       .Where(x => x.CustomerId == 4001);

    foreach (var order in orders)
    {
        Console.WriteLine($" Order {order.Id} for {order.Quantity} items");
    }
}

The preceding code however will result in two round trips to the database which may result in a performance problem:

select d.data, d.id, d.mt_version 
from public.mt_doc_customer as d 
where d.id = 4001 LIMIT 2
select d.data, d.id, d.mt_version
from public.mt_doc_order as d 
where d.customer_id = 4001

 

The .Include() extension method of Marten allows a single round trip to be made to the database that executes a SQL join and returns both the Customer data and Orders.

The following code shows how to accomplish this, notice that we need to declare an “output” variable to hold the “included” customer.

// 1 trip to database, 1 query
using (var session = store.QuerySession())
{                
    Customer customer = null;

    List<Order> orders = 
         session.Query<Order>()
                .Include<Customer>(joinOnOrder => joinOnOrder.CustomerId, includedCustomer => customer = includedCustomer)
                .Where(x => x.CustomerId == 4001)
                .ToList();


    Console.WriteLine(customer.Name);

    foreach (var order in orders)
    {
        Console.WriteLine($" Order {order.Id} for {order.Quantity} items");
    }  
}

The preceding code now only makes a single request to the database with the following SQL:

select d.data, d.id, d.mt_version, customer_id.data, customer_id.id, customer_id.mt_version 
from public.mt_doc_order as d 
INNER JOIN public.mt_doc_customer as customer_id ON d.customer_id = customer_id.id 
where d.customer_id = 4001

To learn more about the document database features of Marten check out my Pluralsight courses: Getting Started with .NET Document Databases Using Marten and Working with Data and Schemas in Marten.

Enforcing Referential Integrity Between Documents with Marten and PostgreSQL

Even though Marten is a library to enable document database storage semantics, because it’s built on top of (the advanced JSON features) of PostgreSQL, and PostgreSQL itself is relational, we can add constraints between document instances just as we would do in relational normalised databases.

For example, the following two simplified classes represent an order that points to a customer document:

class Customer
{
    public int Id { get; set; }
    public string Name { get; set; }
    
    // etc.
}

class Order
{
    public int Id { get; set; }
    public int CustomerId { get; set; }
    public int Quantity { get; set; }
    
    // etc.
}

By default there is no referential integrity enforcing the idea that the CustomerId in the Order class must point to an existing Customer document in the database. This means that the following code succeeds with no errors:

const string connectionString = "host = localhost; database = OrderDb; password = g7qo84nck22i; username = postgres";

var store = DocumentStore.For(connectionString);

using (var session = store.OpenSession())
{
    var order = new Order
    {
        Quantity=42,
        CustomerId = 34567 // non-existent
    };

    session.Store(order);

    session.SaveChanges(); // no error
}

The preceding code will result in an order document being stored with an invalid CustomerId.

Marten can be configured to enforce this by either adding the [ForeignKey(typeof(Customer))] attribute to the CustomerId property or by configuring the document store as the following code shows:

var store = DocumentStore.For(configure =>
{
    configure.Connection(connectionString);
    configure.Schema.For<Order>().ForeignKey<Customer>(x => x.CustomerId);
});

Now running the code again will result in an exception: insert or update on table "mt_doc_order" violates foreign key constraint "mt_doc_order_customer_id_fkey”.

If the code is modified to create a valid customer in the same session, Marten will generate a customer Id that can then be used in the order. When SaveChanges() is called, Marten will save the dependent customer document first and then the order document that points to it. The following code executes successfully:

using (var session = store.OpenSession())
{
    var customer = new Customer {Name = "Sarah"};

    session.Store(customer);
    
    // customer now has an Id auto-generated 

    var order = new Order
    {
        Quantity = 42,
        CustomerId = customer.Id // valid existing customer ID
    };

    session.Store(order);

    session.SaveChanges();
}

To learn more about the document database features of Marten check out my Pluralsight courses: Getting Started with .NET Document Databases Using Marten and Working with Data and Schemas in Marten.

Creating a Tweet Buffer with Azure Queues and Microsoft Flow

There are apps and services that allow the scheduling or buffering of the sending of Tweets. Using the features of Microsoft Flow, it’s possible to create a solution that allows Tweets to be quickly created as simple text files in a OneDrive folder and these will then be buffered to be sent every 15 minutes (or whatever schedule you fancy). The actual buffering mechanism used below is an Azure Queue.

There are two Flows as part of this solution: Flow 1 to pick up text files from OneDrive, extract the content and write a new message to an Azure Queue. Flow 2 runs on a schedule, picks a message off the queue, grabs the message content and sends it a s a Tweet.

Flow 1: Queuing Tweets

The first step is to create an Azure Storage account and create an Azure Queue. The easiest way to create a new queue is to use the Azure Storage Explorer. Once installed and connected, creating a queue is a simple right-click operation:

Using Azure Storage Explorer to create a new Azure Queue

We’ll call the queue “tweet-queue”.

We’ll also create OneDrive folders: OneDrive\FlowDemo\TweetQ\In

Now we can create a new Flow that grabs files from this path and adds them to tweet-queue as the following screenshot shows (notice we're also deleting the file after adding to the queue):

Microsoft Flow reading a file from OneDrive and adding to Azure Queue

Now if we create a .txt file (for example with the content “Testing - this Tweet came from Microsoft Flow via OneDrive and an Azure Queue” in the OneDrive\FlowDemo\TweetQ\In directory, wait for the Flow to run and check out the queue in Storage Explorer we can see a new message as the following screenshot shows:

Azure Storage Explorer showing Azure Queue message content

Now we have a way of queuing Tweets we can create a second flow to send them on a timer.

Flow 2: Sending Tweets

The second Flow will be triggered every 15 minutes, grab a message from the queue, use the message body as the Tweet content, then delete the message from the queue.

The following screenshot shows the first 2 phases:

Getting Azure Queue messages on a timer

Even though we’ve specified 1 message, when we add the next action in the Flow, we’ll automatically get an “Apply to each” added as the following screenshot shows:

Posting Tweet from Azure Queue

Notice in the preceding screenshot that we also need to add an action to delete the message from the queue.

Now once we save this Flow, every 15 minutes a message will be retrieved and posted as a Tweet:

Triggering a Microsoft Flow from an HTTP Post

Microsoft Flow allows the building of workflows in the cloud. One way to trigger a Flow is to set up a HTTP endpoint that can be posted to.

For example, a Flow can be created that takes some JSON data and writes it out to OneDrive or Dropbox.

The first step is to create a new Flow and add a Request trigger. When the Flow is saved, a URL will be generated that can be posted to.

As part of this Request trigger, a JSON schema can be specified that allows individual JSON properties to be surfaced and referenced by name in later actions.

For example the following JSON schema could be specified:

{
  "$schema": "http://json-schema.org/draft-04/schema#",
  "title": "Customer",
  "description": "A generic customer",
  "type": "object",
  "properties": {
    "id": {
      "description": "The unique identifier for a customer",
      "type": "integer"
    },
    "name": {
      "description": "The full name of the customer excluding title",
      "type": "string"
    }
  },
  "required": [
    "id",
    "name"
  ]
}

Now in later steps, the “id” and the “name” properties from the incoming JSON can be used as dynamic content.

Next action(s) can be added that make use of the the data in these properties when an HTTP post occurs. For example we could create a file in OneDrive where the filename is {id}.txt that contains the customer name. This is a simple example but serves to demonstrate the flexibility.

The following screenshot shows the full flow and the JSON schema properties in use:

Microsoft Flow using request trigger and OneDrive

We can now post to the generated URL. For example the following screenshot shows a test post using Postman and the resulting file that was created in OneDrive:

HTTP post to trigger a Microsoft Flow