Once you’ve decided to instrument your ASP.NET Core application with Application Insights, you may be looking for how to anonymize or customize the data that is being sent to Application Insights. For details on why you should be using Application Insights and how to get started, please reference my previous post in this blog series.

Two things to consider with telemetry:

  1. Adding additional telemetry that is not recorded by Application Insights.
  2. Removing or anonymizing data that is already being sent to Application Insights.

There will be a later post in this series discussing how to add new telemetry; this post focuses on anonymizing or removing data.

Personally Identifiable Information (PII) Already in Application Insights Account

We’ll start with a likely scenario: during an audit or during testing, you discover that you are logging PII to your Application Insights account. How can you go about fixing that? The short answer is to delete the entire Application Insights resource. That means you’ll lose access to all historical telemetry that was in the account, and your production system will no longer be logging telemetry anywhere unless you create a new resource and update your production system with the new instrumentation key. However, this does solve your immediate problem of retaining PII. See the Microsoft documentation for details on what is captured by Application Insights and how it’s transmitted and stored.

Application Insights does provide a PURGE endpoint, but requests are not processed in a timely manner, the endpoint isn’t well documented, and it will not properly update metrics to account for the data that was purged. In general, if you have a compliance concern, delete the Application Insights resource. Remember, Application Insights is designed to be a highly available, high-performance telemetry platform, which means it is designed around being an append-only system. The best solution is simply not to send data to the platform that you shouldn’t.

API Use Case

Think of an API your business may have built. This API allows us to search for customers by e-mail to find their customer id. Once we have the customer id, we can make updates to their record, such as their first name, birthday, etc. By default, Application Insights records the request URL and the response code. By default, it does NOT record any HTTP headers, the request body or the response body. First, let’s think about how we might design the search endpoint. We have two approaches:

  1. GET /api/customer/search?emailAddress=test@appliedis.com
  2. POST /api/customer/search
    a. In the request body, send JSON:
    { "emailAddress": "test@appliedis.com" }

If we design the API using approach #1, we will be sending PII to Application Insights by default, because the e-mail address is part of the recorded URL. However, if we design the API using approach #2, no PII is sent to Application Insights by default, because the request body is not recorded. Always try to keep your URLs free of any PII.
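To make approach #2 concrete, here is a minimal sketch of what such an endpoint could look like in ASP.NET Core. The names CustomerController, CustomerSearchRequest and the in-memory dictionary are assumptions made up for illustration; a real API would query its own data store.

```csharp
using System;
using System.Collections.Generic;
using Microsoft.AspNetCore.Mvc;

// Hypothetical request body for the search endpoint.
public class CustomerSearchRequest
{
    public string EmailAddress { get; set; }
}

[ApiController]
[Route("api/customer")]
public class CustomerController : ControllerBase
{
    // Stand-in for a real data store (assumption for illustration only).
    private static readonly Dictionary<string, Guid> CustomerIdsByEmail =
        new Dictionary<string, Guid>(StringComparer.OrdinalIgnoreCase)
        {
            ["test@appliedis.com"] = Guid.Parse("9b02dd9d-0afd-4d06-aaf1-c34d3c051ec6")
        };

    // POST /api/customer/search
    // The e-mail address travels in the request body, so it never
    // appears in the URL that is recorded with the request telemetry.
    [HttpPost("search")]
    public IActionResult Search([FromBody] CustomerSearchRequest request)
    {
        if (request?.EmailAddress != null &&
            CustomerIdsByEmail.TryGetValue(request.EmailAddress, out var customerId))
        {
            return Ok(new { customerId });
        }

        return NotFound();
    }
}
```

With this shape, the URL recorded for every search is simply /api/customer/search, regardless of which e-mail address was looked up.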

That may not always be possible, so let’s look at another use case. Let’s say we have the primary key of a customer record and we want to view and make edits to that record. What would the endpoints look like?

  1. GET /api/customer/9b02dd9d-0afd-4d06-aaf1-c34d3c051ec6
  2. PUT /api/customer/9b02dd9d-0afd-4d06-aaf1-c34d3c051ec6

Now, depending on your regulatory environment, logging these URLs to Application Insights might present a problem. Notice we are not logging e-mail addresses, phone numbers or names; we are logging behavior about an individual, such as when they accessed the site and when their profile was updated. To avoid this, we would like to anonymize the URL data that is being sent to Application Insights.
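As a rough illustration of the goal, the transformation we want is a plain string replacement: anything in the URL that looks like a GUID becomes a fixed placeholder. The sketch below assumes customer ids are always GUIDs in the standard hyphenated form; UrlScrubber and Scrub are hypothetical names, and the placeholder text is a free choice.

```csharp
using System;
using System.Text.RegularExpressions;

public static class UrlScrubber
{
    // Matches a GUID in the 8-4-4-4-12 hexadecimal form used in the URLs above.
    private static readonly Regex GuidPattern = new Regex(
        "[0-9a-fA-F]{8}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{12}",
        RegexOptions.Compiled);

    // Replaces every GUID-like segment with a fixed placeholder.
    public static string Scrub(string url)
    {
        return GuidPattern.Replace(url, "customerid");
    }
}
```

For example, UrlScrubber.Scrub("/api/customer/9b02dd9d-0afd-4d06-aaf1-c34d3c051ec6") returns "/api/customer/customerid", while a URL without a GUID comes back unchanged.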

Anonymize Data Sent to Application Insights

This section assumes you are using ASP.NET Core and have already configured Application Insights; see my previous blog post for details. Also, if you need to troubleshoot your configuration or verify that it’s working as expected, please see my other blog post for details.

The Application Insights NuGet package provides an interface for exactly this purpose called ITelemetryProcessor. You simply need to implement it and provide the Process method. Telemetry processors act much like ASP.NET Core middleware, in that there is a chain of them. Your implementation must provide a constructor that accepts an ITelemetryProcessor, which is the next processor in the chain. The last processors in the chain are the ones provided by the NuGet package; they implement the same interface and send the telemetry over the wire to the actual Application Insights service in Azure.

The Process method receives a single argument, ITelemetry. You can cast that to one of the concrete telemetry types, e.g. DependencyTelemetry, RequestTelemetry, etc. You can then mutate the telemetry in whatever way you need to, e.g. to anonymize data. After that, you are responsible for calling the Process method on the next telemetry processor in the chain, i.e. the one that was provided to the constructor of your class. If you want the given telemetry item to never be sent to Application Insights, simply omit the call to the Process method of the next telemetry processor in the chain.
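To illustrate the last point, here is a minimal sketch of a processor that drops telemetry entirely by not calling the next processor. The class name SearchRequestFilterProcessor and the path /api/customer/search are assumptions chosen for illustration; the interface, the RequestTelemetry type and its Url property come from the Application Insights SDK.

```csharp
using System;
using Microsoft.ApplicationInsights.Channel;
using Microsoft.ApplicationInsights.DataContracts;
using Microsoft.ApplicationInsights.Extensibility;

// Hypothetical processor: drops request telemetry for one sensitive path.
public class SearchRequestFilterProcessor : ITelemetryProcessor
{
    private readonly ITelemetryProcessor _next;

    // The SDK supplies the next processor in the chain.
    public SearchRequestFilterProcessor(ITelemetryProcessor next)
    {
        _next = next;
    }

    public void Process(ITelemetry item)
    {
        if (item is RequestTelemetry request
            && request.Url != null
            && request.Url.IsAbsoluteUri
            && request.Url.AbsolutePath.StartsWith(
                "/api/customer/search", StringComparison.OrdinalIgnoreCase))
        {
            // Returning without calling _next.Process means this item
            // never reaches the processors that send it to Azure.
            return;
        }

        // Everything else continues down the chain untouched.
        _next.Process(item);
    }
}
```

Dropping whole items is a blunt tool: you also lose the timing and response code for those requests, so mutating the item is usually preferable when only part of it is sensitive.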

Now we will look at the source code for a telemetry processor that does what we are proposing. It looks for anything that resembles a customer id in the URL of a RequestTelemetry item and replaces it with the word “customerid”.
RequestTelemetry
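A processor along those lines might look like the following minimal sketch. It assumes customer ids are GUIDs in the standard hyphenated form; the class name CustomerIdTelemetryProcessor and the exact regular expression are assumptions for illustration and not necessarily what the original listing used.

```csharp
using System;
using System.Text.RegularExpressions;
using Microsoft.ApplicationInsights.Channel;
using Microsoft.ApplicationInsights.DataContracts;
using Microsoft.ApplicationInsights.Extensibility;

public class CustomerIdTelemetryProcessor : ITelemetryProcessor
{
    // Assumption: a customer id is a GUID in 8-4-4-4-12 hexadecimal form.
    private static readonly Regex CustomerIdPattern = new Regex(
        "[0-9a-fA-F]{8}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{12}",
        RegexOptions.Compiled);

    private const string Placeholder = "customerid";

    private readonly ITelemetryProcessor _next;

    // The next processor in the chain is supplied by the SDK.
    public CustomerIdTelemetryProcessor(ITelemetryProcessor next)
    {
        _next = next;
    }

    public void Process(ITelemetry item)
    {
        // Only request telemetry is touched; other types pass through as-is.
        if (item is RequestTelemetry request)
        {
            if (!string.IsNullOrEmpty(request.Name))
            {
                request.Name = CustomerIdPattern.Replace(request.Name, Placeholder);
            }

            if (request.Url != null)
            {
                var scrubbed = CustomerIdPattern.Replace(request.Url.OriginalString, Placeholder);
                request.Url = new Uri(scrubbed, UriKind.RelativeOrAbsolute);
            }
        }

        // Always hand the (possibly modified) item to the next processor.
        _next.Process(item);
    }
}
```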

As seen above, in the constructor we receive the next telemetry processor in the chain. In the Process method, we check to see if we have RequestTelemetry, i.e. we ignore all other telemetry types, like TraceTelemetry or DependencyTelemetry. RequestTelemetry has a .Name and a .Url property, both of which might contain details about the URL that holds our PII. We use a regular expression to see if either contains a customer id; if so, we replace it with the word “customerid”. Then we always call the next telemetry processor in the chain so that the modified telemetry is sent to Application Insights.

Remember, just having this ITelemetryProcessor coded won’t do anything. We still need to register it with the Application Insights library. In Startup.cs, you will need to add a line to the ConfigureServices method, see below:

Add a Line to the ConfigureServices Method
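The registration could look like the sketch below. AddApplicationInsightsTelemetryProcessor is the extension method the Application Insights ASP.NET Core package provides for this; CustomerIdTelemetryProcessor is a hypothetical name standing in for your own ITelemetryProcessor implementation, and the rest of your ConfigureServices method is omitted.

```csharp
using Microsoft.Extensions.DependencyInjection;

public class Startup
{
    public void ConfigureServices(IServiceCollection services)
    {
        // Existing Application Insights setup.
        services.AddApplicationInsightsTelemetry();

        // The added line: inserts the processor into the telemetry chain.
        // CustomerIdTelemetryProcessor is a placeholder for your own class.
        services.AddApplicationInsightsTelemetryProcessor<CustomerIdTelemetryProcessor>();
    }
}
```

The SDK constructs the processor itself and passes the next processor in the chain to its constructor, so you do not create instances by hand.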

Now that you’ve registered the processor, it’s just a matter of deploying the code. If you wanted to debug this locally while still sending telemetry live to Application Insights in Azure, please see my blog post with details on how to do that.

Summary

You learned about the ITelemetryProcessor extension point in the Application Insights NuGet package and how to use that extension point to prevent PII from being logged to your Application Insights account. We’ve discussed how to design your API endpoints so that they are free of PII by default and, hopefully, you don’t need to customize your telemetry configuration. Lastly, I have shown you how to delete PII that may have been logged accidentally to your production Application Insights account. You may also want to take advantage of a relatively new feature of Application Insights to set the retention period of logs. Previously it was fixed at 90 days, but now you can configure it to as low as 30 days. This may be useful for handling “right to be forgotten” requests if your logging system is storing PII and you need to ensure it is removed within 30 days of being requested. Next in this blog series, we will discuss logging additional telemetry to create an overall better picture of your system’s health and performance.