Adventures with Cosmos DB: High Ingestion Architecture

Let’s say you need to receive a large number of transactions from a third-party software system.

The third-party software system would like to send the requests via an API call.

Each transaction needs to be validated (check that the data is good), enriched (add metadata) and transformed (convert to the final format).
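To make the three steps concrete, here is a minimal Python sketch. The field names (id, amount, receivedAt, amountCents) and the rules inside each function are assumptions made up for illustration; the real schema depends on what the third-party system sends. The steps are chained in one process only to show what each one does.

```python
from datetime import datetime, timezone

# Hypothetical required fields; the real list depends on the third-party payload.
REQUIRED_FIELDS = ("id", "amount")


def validate(txn: dict) -> dict:
    """Reject a transaction that is missing any required field."""
    missing = [field for field in REQUIRED_FIELDS if field not in txn]
    if missing:
        raise ValueError(f"missing fields: {missing}")
    return txn


def enrich(txn: dict) -> dict:
    """Add metadata, here a UTC timestamp recording when we received it."""
    return {**txn, "receivedAt": datetime.now(timezone.utc).isoformat()}


def transform(txn: dict) -> dict:
    """Convert to an assumed final format: string id, amount in whole cents."""
    return {
        "id": str(txn["id"]),
        "amountCents": round(float(txn["amount"]) * 100),
        "receivedAt": txn["receivedAt"],
    }


def process(txn: dict) -> dict:
    """Run the three steps in order on a single transaction."""
    return transform(enrich(validate(txn)))
```

Calling process({"id": 7, "amount": "12.34"}) returns a dictionary in the assumed final format, while a payload without an amount raises ValueError before any enrichment happens.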

My first attempt at solving this problem leveraged Azure Functions and Azure Cosmos DB.

I created the following Azure resources:

  • Azure Function – HTTP Trigger, to receive the data from the third-party software system
  • Cosmos DB to store the received data
  • Multiple Azure Functions – Cosmos DB Change Feed Trigger, acting as Microservices, to process the data

For the purposes of this article, assume the Microservice Azure Functions can all run in parallel. If sequential processing were needed, a better solution might be Durable Functions or Logic Apps to handle the orchestration; my preference is Logic Apps.

Looks good?

Some things to keep in mind:

  • The change feed is polled every 5 seconds by default. The interval can be made shorter or longer through the trigger's feedPollDelay setting, which is given in milliseconds.
  • The code running in the Microservice Azure Functions is synchronous: documents are processed one at a time. If the work on each document takes 3 minutes, a batch of 100 documents keeps that Azure Function busy for 300 minutes.
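The arithmetic in the last point can be sketched in a few lines of Python. The function names are made up for illustration, and the parallel figure is an idealized estimate that assumes documents can be spread evenly across independent workers with no overhead.

```python
import math


def sequential_minutes(doc_count: int, minutes_per_doc: float) -> float:
    """Total time when one function instance handles documents one at a time."""
    return doc_count * minutes_per_doc


def parallel_minutes(doc_count: int, minutes_per_doc: float, workers: int) -> float:
    """Idealized time when documents are spread evenly over several workers."""
    rounds = math.ceil(doc_count / workers)  # each worker handles this many docs
    return rounds * minutes_per_doc
```

With 100 documents at 3 minutes each, sequential_minutes(100, 3) gives 300, while parallel_minutes(100, 3, 10) gives 30. This is only a back-of-the-envelope model, but it shows why processing one document at a time becomes the bottleneck.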

My second attempt addresses the performance problem by inserting a messaging layer between the change feed and the Microservice Azure Functions.

In addition to the Azure resources already created, I added the following:

  • Azure Event Grid to receive events
  • Azure Function – Cosmos DB Change Feed Trigger, to read from the change feed and forward messages to Event Grid
  • Re-purposed the Microservice Azure Functions to read from Event Grid

Now when a document is received, the new Azure Function reads it from the change feed and forwards it to Azure Event Grid, which reduces the performance bottleneck and maximizes the scaling of our Azure Functions.
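The forwarding step might look roughly like the Python sketch below. It is deliberately free of any Azure SDK calls: the publish argument is a placeholder for whatever actually sends events to Event Grid, and the subject and eventType values are invented for illustration. The dictionary keys follow the Event Grid event schema (id, subject, eventType, eventTime, dataVersion, data).

```python
import uuid
from datetime import datetime, timezone
from typing import Callable, Iterable


def to_event(doc: dict) -> dict:
    """Wrap one Cosmos DB document in an Event Grid style event."""
    return {
        "id": str(uuid.uuid4()),
        "subject": f"transactions/{doc['id']}",  # hypothetical subject format
        "eventType": "Transaction.Received",     # hypothetical event type name
        "eventTime": datetime.now(timezone.utc).isoformat(),
        "dataVersion": "1.0",
        "data": doc,
    }


def forward(batch: Iterable[dict], publish: Callable[[list], None]) -> int:
    """Turn a change feed batch into one event per document and publish them.

    `publish` is an assumption standing in for the real Event Grid send call.
    """
    events = [to_event(doc) for doc in batch]
    publish(events)
    return len(events)
```

Because each document becomes its own event, the Microservice Azure Functions no longer have to work through a whole change feed batch one document at a time; the forwarding function itself does very little work per document.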

If I overlooked something, misspoke, or if there is just a better way, probably all of the above, please share your thoughts.
