C# snippets in Azure APIM policies

This was an interesting finding this week: I was not aware that it is possible to use multi-line C# code snippets inside Azure API Management policies.

For example, if you’d like to calculate a value (e.g. a backend service query parameter) based on a request header, you could use a snippet similar to this one:

  /* here you can place multi-line code */
  var dict = new Dictionary<string, string>() {
    {"a", "1"},
    {"b", "2"}
  };
  var val = context.Request.Headers.GetValueOrDefault("test", "a");
  return dict.ContainsKey(val) ? dict[val] : "1";

Details about the expression syntax can be found here: https://docs.microsoft.com/en-us/azure/api-management/api-management-policy-expressions

One gotcha was with moving that kind of policy definition to Terraform. It is necessary to replace the characters “<” and “>” with the entities &#60; and &#62; respectively. Otherwise Terraform could not apply the changes, although the same definition worked directly in the Azure portal.
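For illustration, the escaped policy embedded in a Terraform resource might look roughly like this (the resource, API and header names here are made-up assumptions, not from my actual setup):

```hcl
# Hypothetical example: note the &#60; / &#62; entities replacing < and >
# inside the C# expression, which Terraform requires to apply the policy.
resource "azurerm_api_management_api_policy" "example" {
  api_name            = "example-api"
  api_management_name = "example-apim"
  resource_group_name = "example-rg"

  xml_content = <<XML
<policies>
  <inbound>
    <set-variable name="mapped" value="@{
      var dict = new Dictionary&#60;string, string&#62;() {
        {"a", "1"},
        {"b", "2"}
      };
      var val = context.Request.Headers.GetValueOrDefault("test", "a");
      return dict.ContainsKey(val) ? dict[val] : "1";
    }" />
  </inbound>
</policies>
XML
}
```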

It’s worth noting that you could achieve the same with control flow policies, but this example is only an illustration – you can have more complex snippets, e.g. for request verification or for composing/manipulating the response body.

Lambda architecture λ

I’ve been doing some research recently about architectures for large scale data analysis systems. An idea that appears quite often when discussing this problem is lambda architecture.

Data aggregation

The idea is quite simple. The intuitive approach to analytics is to gather data as it comes in and then aggregate it for better performance of analytical queries. E.g. when users run reports over a date range, pre-aggregate all sales/usage numbers per day and then produce the result for a given date range by summing the aggregates for each day in the range. If you have, let’s say, 10k transactions per day, this approach will create only 1 record per day. Of course, in reality you’d probably need many aggregates across different dimensions to enable data filtering, but you will still likely have far fewer aggregate records than raw rows.
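A minimal sketch of that pre-aggregation idea in Python (the data shape and numbers are made up purely for illustration):

```python
from collections import defaultdict
from datetime import date

# Raw transactions: (day, amount) pairs -- imagine ~10k of these per day.
transactions = [
    (date(2021, 3, 1), 100),
    (date(2021, 3, 1), 50),
    (date(2021, 3, 2), 75),
]

# Batch step: pre-aggregate sales per day -- one record per day remains.
daily_totals = defaultdict(int)
for day, amount in transactions:
    daily_totals[day] += amount

# Query step: a date-range report is just a sum over the daily aggregates,
# never touching the raw transactions.
def report(start, end):
    return sum(total for day, total in daily_totals.items()
               if start <= day <= end)

print(report(date(2021, 3, 1), date(2021, 3, 2)))  # 225
```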

Aggregation is not the only way to increase query performance. It could be any kind of pre-computing: batch processing, indexing, caching etc. This layer in lambda architecture is called the “serving layer”, as its goal is to serve analytical queries as a fast source of information.

Speed layer

This approach has a significant downside – aggregated results are available only after a delay. In the example above, the aggregated data required to perform analytical queries will be available on the next day. Lambda architecture mitigates that problem by introducing the so-called speed layer. In our example that would be the layer keeping data for the current day. The amount of data for a single day is relatively small and probably does not require aggregates to be queried, or it can fit into a relatively small number of fast, more expensive machines (e.g. using in-memory computing).


Analytical queries combine results from 2 sources: the aggregates and the speed layer. The speed layer can also be used to create the aggregates for the next day. Once data is aggregated, it can be removed from the speed layer to free the resources.
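Continuing the daily-sales example, the combined query could be sketched like this (again, the data is invented for illustration):

```python
from datetime import date

# Serving layer: pre-aggregated daily totals, batch-computed, one day behind.
serving_layer = {
    date(2021, 3, 1): 150,
    date(2021, 3, 2): 75,
}

# Speed layer: raw transactions for the current day, not yet aggregated.
speed_layer = [
    (date(2021, 3, 3), 20),
    (date(2021, 3, 3), 30),
]

def report(start, end):
    # Fast part: sum the pre-computed aggregates for past days...
    batch_part = sum(v for d, v in serving_layer.items() if start <= d <= end)
    # ...and add the raw records from the speed layer for the current day.
    speed_part = sum(v for d, v in speed_layer if start <= d <= end)
    return batch_part + speed_part

print(report(date(2021, 3, 1), date(2021, 3, 3)))  # 275
```

Once the nightly batch runs, the 2021-03-03 records would move into the serving layer as a single aggregate and could be dropped from the speed layer.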

Master data

Let’s not forget that besides the speed layer and the aggregates, there is also the so-called master dataset, which contains all raw, non-aggregated records. In lambda architecture this dataset should be append-only.


This architecture is technology-agnostic. For example, you could build all the layers on top of SQL servers. Typically, though, a distributed file system like HDFS would be used for the master data, the MapReduce pattern would be used for batch processing the master data, technologies like Apache HBase or ElephantDB would be used to query the serving layer, and Apache Storm would be used for the speed layer. Those choices are quite common in the industry, but the technology stack can vary a lot from project to project or company to company.