Databricks provides a robust notebook environment that is excellent for ad-hoc and interactive access to data. However, it lacks robust software development tooling. Databricks Connect and Visual Studio (VS) Code can help bridge the gap. Once configured, you can use VS Code tooling like source control, linting, and your other favorite extensions while harnessing the power of your Databricks Spark clusters.

Configure Databricks Cluster

Your Databricks cluster must be configured to allow connections.

  1. In the Databricks UI, edit your cluster and add these lines to the Spark config:
    spark.databricks.service.server.enabled true
    spark.databricks.service.port 8787
  2. Restart Cluster

Configure Local Development Environment

The following instructions are for Windows, but the tooling is cross-platform and will work wherever Java, Python, and VSCode will run.

  • Install Java JDK 8 (enable option to set JAVA_HOME) – https://adoptopenjdk.net/
  • Install Miniconda (3.7, default options) – https://docs.conda.io/en/latest/miniconda.html
  • From the Miniconda prompt run the following (follow the prompts). Note: the Python and databricks-connect library versions must match the cluster version. Replace {version-number} with the matching version, e.g. Python 3.7 for Databricks Runtime 7.3.
    ```cmd
    conda create --name dbconnect python={version-number}
    conda activate dbconnect
    pip install -U databricks-connect=={version-number}
    databricks-connect configure
    ```
  • Download http://public-repo-1.hortonworks.com/hdp-win-alpha/winutils.exe to C:\Hadoop
    From the command prompt run:

    ```cmd
    setx HADOOP_HOME "C:\Hadoop" /M
    ```
  • Test Databricks Connect. In the Miniconda prompt run:
    ```cmd
    databricks-connect test
    ```

You should see “* All tests passed.” if everything is configured correctly.

  • Install VSCode and Python Extension
    •  https://code.visualstudio.com/docs/python/python-tutorial
    • Open a Python file and select the “dbconnect” interpreter in the lower toolbar of VSCode
  • Activate the Conda environment in the VSCode cmd terminal
    From the VSCode Command Prompt, run the following. This only needs to be run once (replace {username} with your username):
    ```cmd
    C:\Users\{username}\Miniconda3\Scripts\conda init cmd.exe
    ```
    Open a new Cmd terminal and run:
    ```cmd
    conda activate dbconnect
    ```
    Optional: You can run the command `databricks-connect test` from the earlier test step to ensure the Databricks Connect library is configured and working within VSCode.

Run Spark commands on Databricks cluster

You now have VS Code configured with Databricks Connect running in a Python conda environment. You can use the below code to get a Spark session and any dependent code will be submitted to the Spark cluster for execution.

```python
from pyspark.sql import SparkSession

# Get (or create) the Spark session that Databricks Connect routes to the cluster
spark = SparkSession.builder.getOrCreate()
```
Once a context is established, you can interactively send commands to the cluster by selecting them and right-clicking “Run Selection/Line in Python Interactive Window” or by pressing Shift+Enter.

Context established to send commands

The results of the command executed on the cluster will display in the Visual Studio Code Terminal. Commands can also be executed from the command line window.

Executed Command Cluster

Summary

To recap, we set up a Python virtual environment with Miniconda and installed the dependencies required to run Databricks Connect. We configured Databricks Connect to talk to our hosted Azure Databricks cluster and set up Visual Studio Code to use the conda command prompt to execute code remotely. Now that you can develop locally in VS Code, you can use all of its developer tooling to build a more robust, developer-centric solution.

DevOps implements a Continuous Integration/Continuous Delivery (CI/CD) process. When multiple team members work in the same codebase, anyone’s update could break the integrated code. Continuous Integration therefore triggers a build pipeline whenever a code update is pushed. The build pipeline fails if the newly updated code conflicts with or is incompatible with the existing codebase. Code might work well within a single developer environment but fail in a build pipeline, where all configurations and dependencies are expected to be in place. Continuous Delivery speeds up the deployment process. The release pipeline deploys the same codebase to multiple environments based on configuration, so code can be deployed to all environments without many manual changes.

Having an approval process supports peer code reviews and identifies potential issues and security flaws ahead of time. Current production applications are highly distributed and complex. Whether the solution is on-premises or cloud-based, a missing dependency or improper configuration poses significant risk in deployments. DevOps helps maintain the same codebase for repeatable deployment in many environments with just configuration changes. DevOps avoids manually building deployment packages and handing them over to an operations team that has no insight into what is being deployed. If an error occurs during or after deployment, the development team has to jump in at that point, which is time-consuming. This costs production time and can leave you with some unhappy customers!

DevOps Image

Picture credit: DoD DevOps

Popular DevOps Tools

Follow here to learn more about DevOps practices from other AIS bloggers!

Why not just “DevOps”?

DevOps is fundamental for any organization’s build and deployment process with seamless CI/CD integration. Then what is ‘DevSecOps’, and why is ‘Sec’ added between Dev and Ops? The ‘Sec’ in DevSecOps is ‘Security.’ Though it’s added in between, security implementation should start in Development and continue in Operations. As development and deployment packages add many internal and external dependencies, they can introduce vulnerabilities, which could cause severe issues in production if not identified early in the build pipeline. Code scans help identify possible weaknesses in code implementations. For cybersecurity-related vulnerabilities, however, specific tools must be used at different stages of the pipeline to identify them as early as possible. Adding security scanning earlier in the pipeline and automating it are essential for DevSecOps.

DevSecOps Software Lifecycle

Picture Credit: DoD DevSecOps

DevSecOps is not a tool or pattern but a practice, and it can be enhanced by adding appropriate tools. It is a process for securing the build and deployment with several security tools by shifting security to the left. These security tools help identify vulnerabilities that the code could have introduced, recommend possible fixes, and in some instances mitigate some of those issues as well. The aim is to use the ‘fail fast’ method to identify vulnerabilities earlier in the build pipeline. As more applications move into the cloud, it is imperative to use Cloud Native Computing Foundation (CNCF) certified tools and implement security benchmarks such as the CIS benchmarks. DevSecOps avoids manual changes once the code is in the pipeline and deployed. The codebase will be a single source of truth and should not be manipulated at any point.

Adding scanning tools for security and vulnerabilities helps mitigate flaws introduced in code and operations. Many open-source tools provide these functionalities. Enabling logging, continuous monitoring, alerting processes, and self-healing for faster remediation is key for ongoing business operations. Containerizing with hardened container images from DoD Iron Bank helps protect application container images. Hardened images can be kept up to date from reliable providers. Containers provide cloud-agnostic solutions with no vendor lock-in.

All the security tools in the DevSecOps pipeline must be deployed and running in the customer environment for pipeline scanning. The pipeline code sends requests to those tools through API calls or command-line interface (CLI) commands. The tools respond with their findings and statistics and provide pass/fail criteria. If a tool identifies any vulnerability findings in the scan, the pipeline fails.
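The pass/fail gate described above can be sketched in a few lines of Python. This is a minimal illustration and not any specific tool's API: the `Finding` record, the severity names, and the `evaluate_scan` function are hypothetical, and a real pipeline would obtain the findings from the tool's API response or CLI output.

```python
from dataclasses import dataclass
from typing import List

# Hypothetical severity scale; real tools define their own levels.
SEVERITY_ORDER = ["low", "medium", "high", "critical"]


@dataclass
class Finding:
    """One vulnerability reported by a (hypothetical) security tool."""
    identifier: str
    severity: str


def evaluate_scan(findings: List[Finding], fail_on: str = "low") -> bool:
    """Return True when the scan passes, False when the pipeline should fail.

    The scan fails if any finding is at or above the `fail_on` severity.
    With the default of "low", any finding at all fails the pipeline,
    which matches the behaviour described in the text.
    """
    threshold = SEVERITY_ORDER.index(fail_on)
    return all(SEVERITY_ORDER.index(f.severity) < threshold for f in findings)


clean_scan: List[Finding] = []
dirty_scan = [Finding("EXAMPLE-0001", "high")]  # made-up identifier
print(evaluate_scan(clean_scan))   # passes: no findings
print(evaluate_scan(dirty_scan))   # fails: one finding at or above "low"
```

Raising `fail_on` to a higher severity is one way a team might let low-risk findings through while still failing fast on serious ones; whether that is acceptable is a policy decision, not something the sketch decides.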

Deploying the security tools as SaaS services will require permission from the security team, and not all are approved to run in highly secured cloud environments. Those tools all need an Authority to Operate (ATO) before they can be deployed and configured. Getting the hardened container images for those tools, by contrast, is a safer and more secure approach to deploying them in the cloud. Because the containers are already hardened (scanned, secured, and ready to go with all dependencies), they provide continuous ATO. The hardened container images can be downloaded from DoD Iron Bank, and almost all tool providers provide container images. Many of these providers offer different downloads, either as a software download or as a container image. When downloading as software, additional tasks are needed to ensure all the dependencies are appropriately configured or already exist. Hardened container images, on the other hand, come with their dependencies and are pre-scanned. The tools can be deployed into Kubernetes in your cloud environment to provide scalable functionality.

A sample DevSecOps pipeline implementation with recommended security tools is depicted in the picture below:

  • Source code pull request is approved by reviewers
  • The build pipeline kicks off and code scan is run after a successful initial build
    • If any code vulnerabilities are identified, then the pipeline fails
  • Build pipeline continues with DAST and PEN testing
    • If any vulnerabilities are identified, then the pipeline fails
  • Build artifacts are added to private repository either as packages or container
    • Repository scan is performed using repository scanning tools and vulnerabilities are reported
  • Release pipeline picks up artifacts from private repositories and deploys to Azure (or cloud of your choice)
    • Kubernetes is a highly recommended deployment for orchestration, but deployment can be an application of your choice such as Function App, App Service, Azure Container Instances, etc.
  • Security has been applied throughout the pipeline process and will continue once the application is deployed. Both native security tools such as Azure Monitor, Azure Security Center, Azure Policies, etc., and third-party tools such as Twistlock, Qualys, etc., can be used to monitor the health of your production environment.

DevSecOps Diagram
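The ordering of the stages above, and the way a failed scan stops everything after it, can be sketched as follows. This is only an illustration of the fail-fast flow: the `run_pipeline` function and the stage callables are hypothetical placeholders standing in for real build and scanning tools.

```python
from typing import Callable, List, Tuple

# Each stage is a name plus a callable that returns True on success.
Stage = Tuple[str, Callable[[], bool]]


def run_pipeline(stages: List[Stage]) -> List[str]:
    """Run stages in order and stop at the first failure (fail fast).

    Returns the names of the stages that were attempted, so the caller can
    see where the pipeline stopped.
    """
    attempted = []
    for name, check in stages:
        attempted.append(name)
        if not check():
            break
    return attempted


# Placeholder checks standing in for real tool invocations.
stages = [
    ("build", lambda: True),
    ("code scan", lambda: True),
    ("DAST and pen testing", lambda: False),  # simulated vulnerability
    ("publish artifacts", lambda: True),
    ("release", lambda: True),
]
print(run_pipeline(stages))
```

In this run the simulated DAST failure means the artifacts are never published or released, which is the behaviour the bullet list describes.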

Let’s look at a few of the recommended tools to support the security validations in the DevSecOps process.

Build tools/CLI

A developer can write code in their favorite editor, such as Visual Studio or VS Code, and run it to test their applications. The code editor also generates debug/release packages and binaries using the build tool that comes with the editor. The application works seamlessly in the developer environment because the dependencies and correct configurations exist. For the build to work in the pipeline, the build tool must be available there. The build tool varies by programming language.

Some of the build tools are:

  • DotNet Build
  • MSBuild
  • Maven
  • Gradle

Static Application Security Testing (SAST)

A code scan is one of the essential steps in securing the codebase. Automated testing helps identify failures, but these specific code scan tools help identify security flaws and vulnerabilities. The application does not need to be running for code scan tools as it scans only the codebase and not any dependencies.

Some of the Code scanning tools are:

  • SonarQube
  • Fortify
  • Anchore
  • JFrog Xray
  • OpenSCAP
  • HBSS
  • OWASP dependency check

Dynamic Application Security Testing (DAST)

DAST scans the application while it’s running, or a container image that is hosted in private repositories. Container scanning before deploying helps resolve many security vulnerabilities.

Some of the DAST scanning tools are:

Penetration (Pen) Testing

Pen testing uses a web application scanner to help find security vulnerabilities. Read here to learn about the “Top 10 Web Application Security Risks.”

PEN testing tools:

  • OWASP ZAP

Deploy Code & IaC (Infrastructure as Code)

IaC is paramount in DevOps to avoid any manual work in customer environments and help with immutable infrastructure.

Popular IaC tools are:

  • Azure ARM Templates
  • Terraform
  • HELM

Private Repositories

In DevSecOps, a private repository is recommended to host the build dependencies, reference container images, container images for tools, and the built packages or application container images. This is to keep all the artifacts together in one centralized location, and the release pipeline can continue with deployments from there.

Some of the private repositories are:

  • JFrog
  • Docker Hub
  • Azure Container Registry (ACR)

Private Repository Scanning

As the pipeline requires security scanning, the repositories require scanning also. These tools scan for vulnerabilities in all packages and container artifacts stored in the repository. A scan report is sent to notify you of any issues.

Some artifact scanning tools are:

  • XRay
  • SonaType
  • Azure Monitor
  • Azure Security Center

Deploy

The recommendation to deploy the security tools with container orchestration applies to deployed applications as well. Containers provide high security with limited ways to be affected by attackers. Sidecar containers protect applications by continually monitoring them with a built-in container security stack. Applications are scalable on demand using Kubernetes, and tools such as kubectl and HELM packages are used to deploy and manage K8S clusters. ArgoCD is a declarative tool specifically for Kubernetes deployment in a CI/CD pipeline.

Deployments to Azure could be:

  • Azure function app
  • Azure App Service
  • Azure Container Instance
  • Azure Kubernetes Service (AKS)
  • Open Shift in Azure

Monitoring/Alerting

As applications are deployed and running in a cloud environment, they must be continuously monitored for attacks and security vulnerabilities. For containers, these tools act as sidecar containers that continually protect the main containers from attacks, and some mitigate the issue. All these tools have built-in alerting to notify the operations team for immediate action.

Monitoring/alerting tools:

  • Azure Monitor
  • Azure Security Center
  • Twistlock
  • Qualys
  • Aqua Security

So, you’re all powered up with DevSecOps knowledge! Check back here for the next blog post on container-based deployments and container scanning in the DevSecOps pipeline!


Azure Kubernetes Service is a Microsoft Azure-hosted offering that makes it easy to deploy and manage your Kubernetes clusters. There is much to be said about AKS and its abilities, but I will discuss another crucial aspect of AKS and containers: security. Having a secure Kubernetes infrastructure is a must, and it can be challenging to find out where to start. I’ll break down best practices, including baseline security for clusters and pods and network hardening practices, that you can apply to your own AKS environment to lay the foundation for a more secure container environment, including how to maintain updates.

Cluster and Pod Security

Let’s first look at some best practices for securing your cluster and pods using policies and initiatives. To get started, Azure has pre-defined policies that are AKS specific. These policies help improve the security posture of your cluster and pods, and they allow additional control over things such as root privileges. A best practice Microsoft recommends is limiting the actions that containers can perform and avoiding root/privileged escalation. When the Azure Policy Add-on for AKS is enabled, it installs a managed instance of Gatekeeper. This instance handles enforcement and validation through a controller, which inspects each request when a resource is created or updated and validates it against your policies. Features such as these are ever-growing and can make creating a baseline easier.

Azure Policy also includes a feature called initiatives. Initiatives are collections of policies that align with organizational compliance goals. Currently, there are two built-in AKS initiatives: baseline and restricted. Both come with many policies that lock down items such as the host filesystem, networking, and ports. By combining initiatives and policies, you can tighten security and meet compliance goals in a more managed fashion.
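To make the idea of a controller validating each request more concrete, here is a small Python sketch of the kind of check such a policy performs. It is not Gatekeeper or Azure Policy code: the `find_violations` function is hypothetical, and the pod is represented as a plain dictionary. The `securityContext` fields `privileged` and `allowPrivilegeEscalation` are standard Kubernetes container settings.

```python
from typing import Dict, List


def find_violations(pod_spec: Dict) -> List[str]:
    """Return a list of policy violations for a pod spec given as a dict.

    This sketch only flags containers where `privileged` or
    `allowPrivilegeEscalation` is explicitly set to True.
    """
    violations = []
    for container in pod_spec.get("containers", []):
        name = container.get("name", "<unnamed>")
        context = container.get("securityContext", {})
        if context.get("privileged") is True:
            violations.append(f"{name}: privileged container")
        if context.get("allowPrivilegeEscalation") is True:
            violations.append(f"{name}: privilege escalation allowed")
    return violations


# Hypothetical pod spec with one compliant and one non-compliant container.
pod = {
    "containers": [
        {"name": "web", "securityContext": {"allowPrivilegeEscalation": False}},
        {"name": "debug", "securityContext": {"privileged": True}},
    ]
}
print(find_violations(pod))
```

An admission controller would reject the request when the list is non-empty; here the result is simply printed.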

Another way to secure your cluster is to protect access to the Kubernetes API server. This is accomplished by integrating RBAC with AD or other identity providers. This feature allows for granular access, similar to how you control access to your Azure resources. The Kubernetes API is the single connection point for performing actions on a cluster. For this reason, it’s imperative to deploy logging/auditing and to enforce least-privilege access. The below diagram depicts this process:

Cluster and Pod Security

Reference: https://docs.microsoft.com/en-us/azure/aks/operator-best-practices-cluster-security#secure-access-to-the-api-server-and-cluster-nodes

Network Security

Next, let’s look at network security and how it pertains to securing your environment. A first step is to apply network policies. Much like above, Azure has many built-in policies that assist with network hardening, such as a policy that only allows network traffic from authorized networks based on IP addresses or namespaces. It’s important to note that network policy can only be enabled when the cluster is first created. You also have the option of ingress controllers that use internal IP addresses, which ensures they can only be accessed from that internal network. These small steps can narrow the attack surface of your cluster and tighten traffic flows.

The below diagram demonstrates using a Web Application Firewall (WAF) and an egress firewall to manage defined routing in and out of your AKS environment. Even more granular control is possible using network security groups (NSGs), which allow only specific ports and protocols based on source and destination. By default, AKS creates subnet-level NSGs for your cluster. As you add services such as load balancers, port mappings, and ingress routes, AKS automatically modifies the NSGs. This ensures the correct traffic flow and makes it easier to manage change. Overall, these features and policies help you achieve a secure network posture.

Network Security Graphic

Reference: Microsoft Documentation
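The idea of allowing traffic only from authorized networks based on IP addresses can be illustrated with Python's standard `ipaddress` module. This is a conceptual sketch, not AKS or Azure configuration: the `is_authorized` function and the example address ranges are assumptions made for illustration.

```python
from ipaddress import ip_address, ip_network
from typing import Iterable


def is_authorized(source_ip: str, authorized_ranges: Iterable[str]) -> bool:
    """Return True if source_ip falls inside any authorized CIDR range."""
    address = ip_address(source_ip)
    return any(address in ip_network(cidr) for cidr in authorized_ranges)


# Example ranges chosen for illustration only.
AUTHORIZED = ["10.0.0.0/16", "192.168.1.0/24"]
print(is_authorized("10.0.4.7", AUTHORIZED))     # inside 10.0.0.0/16
print(is_authorized("203.0.113.9", AUTHORIZED))  # outside both ranges
```

In a real cluster this decision is made by the network policy engine or the API server's authorized IP ranges, not by application code; the sketch only shows the matching logic.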

The Final Piece

The final piece of securing your AKS environment is staying current on new AKS features and bug fixes, specifically by upgrading the Kubernetes version in your cluster. These upgrades can also include security fixes, which are paramount for staying protected against vulnerabilities that could leave you exposed. I won’t go too deep on best practices for Linux node updates or managing reboots, but Kured (the Kubernetes reboot daemon) can be leveraged to process updates safely. There are many ways to foundationally secure your AKS clusters. I hope this article helps with future implementations and the maintainability of your deployment.

Introduction

As enterprises start to utilize Azure resources, even a reasonably small footprint can begin to accumulate thousands of individual resources. This means that the resource count for much larger enterprises could quickly grow to hundreds of thousands of resources.

Establishing a naming convention during the early stages of establishing Azure architecture for your enterprise is vital for automation, maintenance, and operational efficiency. For most enterprises, these aspects involve both humans and machines, and hence the naming should cater to both of them.

It would be too naive and arrogant to propose a one-size-fits-all naming convention. Each enterprise has its own unique culture, tools, and processes. So, here are seven rules for scalable and flexible Azure resource naming conventions. To emphasize, these are rules for establishing naming conventions and not the actual naming convention itself.

Rule #1: Break them Up

  • Break up resource names into segments. Each segment may contain one or more characters to indicate a specific attribute of the resource.
  • For example: the resource name cte2-sales-prod-rgp has four segments. The first segment combines the Contoso enterprise [ct] and the East US 2 region [e2]; the remaining segments identify the Sales application [sales], the production environment [prod], and the Resource Group resource type [rgp].
    Why This Rule: Logically partitioning resource names into segments allows for the comprehension of resource information by both machines and humans.
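Because the name is built from segments, a machine can take it apart again. The sketch below parses the Contoso example in Python. The `parse_resource_name` function and the assumption that the first segment is always a two-character enterprise code followed by a region code are illustrative only; your own convention will differ.

```python
from typing import Dict


def parse_resource_name(name: str) -> Dict[str, str]:
    """Split a name like 'cte2-sales-prod-rgp' into its attributes.

    Assumes exactly four dash-separated segments and a two-character
    enterprise code at the start of the first segment.
    """
    segments = name.split("-")
    if len(segments) != 4:
        raise ValueError(f"expected 4 segments, got {len(segments)}: {name!r}")
    scope, application, environment, resource_type = segments
    return {
        "enterprise": scope[:2],
        "region": scope[2:],
        "application": application,
        "environment": environment,
        "resource_type": resource_type,
    }


print(parse_resource_name("cte2-sales-prod-rgp"))
```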

Rule #2: Make them Uniquely Identifiable

  • Every resource should have a unique name. Meaning, a name should only belong to a singular resource. Do not hesitate to add additional segments to make the name unique.
  • Ideally, the name should be unique globally across Azure, but if that is too hard to achieve, then at a minimum, it must be unique across all Azure Subscriptions under your Azure AD Tenant.
  • For our Contoso example from Rule # 1, using a couple of characters that identify the enterprise increases the chances of Azure-wide uniqueness. For cte2-sales-prod-rgp, [ct] represents the Contoso enterprise. The other segments, as explained in Rule # 1, also increase uniqueness.
    Why This Rule: Following this rule will eliminate misidentification and resource name conflicts.

Rule #3: Make them Easily Recognizable

  • Names must convey ordinary but critical pieces of information about the resource. This rule also serves as a backstop to Rule # 2: taking Rule # 2 to an extreme, one might be tempted to use something like a GUID to name the resource.
  • The information may include Azure Region, Environment or Environment Category, Resource Type, etc. to name a few.
  • For our Contoso example, each segment helps with the identification of Azure Region, Application, Environment, and Resource type. All good things for recognizing the resource.
    Why This Rule: Following this rule will eliminate needing a lookup to get information, as the information is embedded in the name itself. Do not use random name generation such as GUIDs, as it might generate a unique name, but it would serve no other purpose.

Rule #4: Make Exceptions Obedient

  • Some resources may not want to fit into the convention. For those resources, establish a convention for exceptions. Don’t let exceptions dictate the overall convention.
  • For example: Storage account names cannot contain non-alphanumeric characters. So, if your convention happens to use a dash to separate segments, drop the dash for Storage account names only. Don’t drop the dash for all other resource names.
    Why This Rule: Following this rule prevents a Convention that is too rigid and draconian, leading to convoluted and confusing names.
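One way to keep exceptions obedient is to encode them next to the general rule. The Python sketch below builds names with dashes by default and applies a separate, explicit rule for storage accounts, whose names must be 3 to 24 lowercase letters and digits. The `build_resource_name` function and the `sta` resource-type abbreviation are hypothetical.

```python
import re

# Hypothetical abbreviation for the storage account resource type.
STORAGE_ACCOUNT_TYPE = "sta"


def build_resource_name(enterprise: str, region: str, application: str,
                        environment: str, resource_type: str) -> str:
    """Build a resource name, applying the storage account exception."""
    segments = [enterprise + region, application, environment, resource_type]
    if resource_type == STORAGE_ACCOUNT_TYPE:
        # Exception: no separators, lowercase letters and digits only.
        name = "".join(segments).lower()
        if not re.fullmatch(r"[a-z0-9]{3,24}", name):
            raise ValueError(f"invalid storage account name: {name!r}")
        return name
    # General convention: dash-separated segments.
    return "-".join(segments)


print(build_resource_name("ct", "e2", "sales", "prod", "rgp"))
print(build_resource_name("ct", "e2", "sales", "prod", "sta"))
```

The general convention stays untouched, and the exception lives in one clearly marked branch rather than leaking into every other resource name.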

Rule #5: Know When To Stop

  • Initially, establish a naming convention for high-level resources and maybe one level deeper. Do not try to establish a naming convention for resources that are three, four, or five levels deep within a resource.
  • If there is a need, let the convention for those lower levels be established by folks who have the expertise and happen to work with them daily.
  • For example, establish a convention for Storage accounts, do not go too deep into naming Container, Blob, and Table.
    Why This Rule: It is impossible to know everything about every resource type used by an enterprise. Leaving room for future extensions is essential for resilient and scalable naming conventions. Your future self and your colleagues will thank you for it.

Rule #6: Keep Them Handsome, Pretty & Melodic

  • Names created by convention should be pleasing to the eye and melodic to the ears. This means that you should pay special attention to the following:
    • Acronyms
    • Segment sizes
    • Juxtaposition of segments
    • Sequencing of segments
    • Separators
  • Go back to our Contoso example and see how you can improve so that it lives up to Rule # 6.
    Why This Rule: You will live with the names for a long time and spend a lot of time with them. So you might as well make them a pleasure to work with.

Rule #7: Toot Your Horn, But With Open Ears

  • Document your convention where it is easily accessible and searchable, such as a central Wiki. Present your convention at every opportunity. Demo real-life good and bad examples. Write blogs and make videos to explain.
  • But, always keep an open mind. Listen to feedback and be open to refining when needed.
    Why This Rule: Your established naming pattern is only as good as its last use. So practice, preach, persuade, push, peddle, propagate, pander but never be pedantic.

These rules have been battle-tested in several large enterprises over the past decade, so follow these rules for a flawless Azure naming convention.

It would not be unfair to say that Azure Storage is one of Azure’s most essential services. Almost all the other services in Azure use Azure Storage in some shape or form.

AIS has been involved with Azure since its beta days, under the code name Red Dog. We’ve seen Azure Storage grow from a service with a limited set of features and capabilities to a service with an extensive collection of features, supporting the storage requirements of small organizations and large enterprises equally.

Given the extensive feature set, we have seen our customers sometimes struggle to choose the right kind of storage for their needs. Furthermore, at the time of this blog, it is not possible to change the type of storage account once it is created.

We intend to clear up some of the confusion through this blog post by providing a matrix of features available based on the kind of storage account.

Storage Account Kind

When you create a storage account (in a portal or by other means), you are asked to make many selections (like resource group, location, etc.). Among them, three vital selections that you’re asked to make are:

  • Desired Performance Level
  • Account Type
  • Replication/Data Redundancy

Desired Performance Level

When it comes to performance, Azure Storage offers you two options: Premium and Standard. In Premium Storage, the data is stored on Solid State Drives (SSD), whereas Standard Storage uses standard Hard Disk Drives (HDD). Premium Storage provides better performance in terms of IOPS (Input/Output Operations Per Second) and throughput.

Choosing the right performance level at the time of account creation is essential. Once a storage account is created with a performance level, it can’t be changed; i.e., you can’t change a “Standard” storage account to a “Premium” storage account or vice versa. Furthermore, not all services are supported for all performance levels. For example, if your application makes heavy use of Storage Queues or Tables, you cannot choose “Premium” as these services are not supported.

You can learn more about the storage account performance levels here.

Account Type

Next is the account type. At the time of writing of this blog, Azure Storage supports the following types of accounts:

  • General-purpose v2 accounts
  • General-purpose v1 accounts
  • BlockBlobStorage accounts
  • FileStorage accounts
  • BlobStorage accounts

Choosing the right type of account at the time of account creation is vital because you can’t convert the type of an account once it’s created. The only exception is that you can do a one-time upgrade of a general-purpose v1 account to a general-purpose v2 account.

Also, as with performance level, not all features are supported in all account types. For example, “FileStorage” accounts only support the file storage service and not the blob, queue, or table services. Another example is that you can only host a static website in general-purpose v2 and BlockBlobStorage accounts.

Another important consideration in choosing the right type of storage account is pricing. In our experience, general-purpose v2 accounts are more expensive than general-purpose v1 accounts but offer more features.

You can learn more about the storage account types here.

Replication/Data Redundancy

Azure Storage is a strongly consistent service, and multiple copies of your data are stored to protect that data from planned and unplanned events like data center failures, transient errors, etc. At the time of writing of this blog, Azure Storage provides the following types of redundancy options:

  • Locally redundant storage (LRS)
  • Zone-redundant storage (ZRS)
  • Geo-redundant storage (GRS)
  • Geo-zone-redundant storage (GZRS)
  • Read-access geo-redundant storage (RAGRS)
  • Read-access geo-zone-redundant storage (RAGZRS)

Choosing the right redundancy kind becomes essential as it enables your application to be fault-tolerant and more available.

Again, not all redundancy kinds are supported for all storage account types. While it is true that you can change redundancy kind of a storage account on the fly, it is only possible between certain redundancy kinds. For example, you can’t change the redundancy kind of an account from Geo-zone-redundant storage (GZRS) to Geo-redundant storage (GRS).

Another good example is that you can convert Geo-zone-redundant storage (GZRS)/Read-access geo-zone-redundant storage (RAGZRS) to a Zone-redundant storage (ZRS) but not the other way around.

You can read more about the data redundancy options available in Azure Storage here.

Storage Feature Matrix

Based on the account type, performance level, and redundancy kind, we have come up with the following feature matrix.

Storage Feature

Using this matrix, you should choose the right kind of storage account to meet your needs.

Here are some examples:

  • If you need to host static websites in Azure Storage, you can either use “Premium BlockBlobStorage (LRS/ZRS)” or “Standard General-purpose v2 (LRS/GRS/RAGRS/GZRS/RAGZRS/ZRS)” type of storage account.
  • Suppose you need to archive the data for compliance or another regulatory purpose. In that case, you can either use “BlobStorage (LRS/GRS/RAGRS)” or “General-purpose v2 (LRS/GRS/RAGRS)” type of storage account.
  • If you need premium performance with your page blobs, you can either use “General-purpose v2 (LRS)” or “General-purpose v1 (LRS)” type of storage account.
  • If you need a premium performance of your file shares, the only option you have is the “Premium FileStorage (LRS/ZRS)” type of storage account.
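The examples above can be expressed as a small lookup table, which is handy if you want to validate a planned storage account in a script. The Python sketch below encodes only the four examples listed here, not the full matrix, and the `FEATURE_MATRIX` structure and `is_supported` function are illustrative names rather than part of any Azure SDK.

```python
from typing import Dict, List

# Only the four examples listed above are encoded; the full matrix has more rows.
FEATURE_MATRIX: Dict[str, Dict[str, List[str]]] = {
    "static website": {
        "Premium BlockBlobStorage": ["LRS", "ZRS"],
        "Standard General-purpose v2": ["LRS", "GRS", "RAGRS", "GZRS", "RAGZRS", "ZRS"],
    },
    "archive": {
        "BlobStorage": ["LRS", "GRS", "RAGRS"],
        "General-purpose v2": ["LRS", "GRS", "RAGRS"],
    },
    "premium page blobs": {
        "General-purpose v2": ["LRS"],
        "General-purpose v1": ["LRS"],
    },
    "premium file shares": {
        "Premium FileStorage": ["LRS", "ZRS"],
    },
}


def is_supported(requirement: str, account_kind: str, redundancy: str) -> bool:
    """Check whether an account kind and redundancy satisfy a requirement."""
    options = FEATURE_MATRIX.get(requirement, {})
    return redundancy in options.get(account_kind, [])


print(is_supported("premium file shares", "Premium FileStorage", "ZRS"))
print(is_supported("archive", "BlobStorage", "GZRS"))
```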

Summary

We hope that you find this blog post useful and will use the feature matrix the next time you have a need to create a storage account.

Feel free to reach out to us if we can be of any assistance with your Azure projects. You can contact us online here!

Introduction

Unfortunately, Azure DevOps does not have a SaaS offering running in Azure Government. The only options are to spin up Azure DevOps server in your Azure Government tenant or connect the Azure DevOps commercial PaaS offering (specifically Azure Pipelines) to Azure Government. Your customer may object to the latter approach; the purpose of this post is to provide you with additional ammunition in making a case that you can securely use commercial Azure DevOps with Azure Government.

Throughout this blog post, the biggest question you should always keep in mind is where is my code running?

Scenario

Take a simple example: a pipeline that calls a PowerShell script to create a key vault and randomly generate a secret to be used later (such as a password during the creation of a VM).

add-type -AssemblyName System.Web
$rgName = "AisDwlBlog-rg"
$kvName = "AisDwlBlog-kv"
$pw = '"' + [System.Web.Security.Membership]::GeneratePassword(16, 4) + '"'
az group create --name $rgName --location eastus --output none
az keyvault create --name $kvName --resource-group $rgName --output none
az keyvault set-policy --name $kvName --secret-permissions list get --object-id 56155951-2544-4008-9c9a-e53c8e8a1ab2 --output none
az keyvault secret set --vault-name $kvName --name "VMPassword" --value $pw

The easiest way to execute this script is to create a pipeline using a Microsoft-hosted agent with an Azure CLI task that calls our PowerShell deployment script:

pool:
  name: 'AisDwl'

steps:
- task: AzureCLI@2
  inputs:
    azureSubscription: 'DwlAzure'
    scriptType: 'ps'
    scriptLocation: 'scriptPath'
    scriptPath: '.\Deploy.ps1'

When we execute this, note that output from that PowerShell script flows back into the Pipeline:

Deployment Output

To circle back to the question that should still be in your mind: where is my code running? In this case, it is running in a virtual machine or container provided by Microsoft. If you have a customer that requires all interactions with potentially sensitive data to be executed in a more secure environment (such as IL-4 in Azure Government), you are out of luck, as the VM/container for the hosted build agent is not certified at any DoD Impact Level. Thus, we have to look at other options where our deployment scripts can run in a more secure environment.

I’ll throw a second wrench into things…did you see the bug in my Deploy.ps1 script above? I forgot to add --output none to the last command (setting the password in the key vault). When I run the pipeline, this is shown in the output:

Secret Visible

Not good! In an ideal world, everyone writing these scripts would be properly handling output, but we need to code defensively to handle unintended situations. Also, think about any error messages that might bubble back to the output of the pipeline.

Option 1

Azure Pipelines provides the capability to run pipelines on self-hosted agents, which could be a VM or container managed by you or your organization. If you set up this VM in a USGov or DoD region of Azure, your code is running in either an IL-4 or IL-5 compliant environment. However, we can’t simply spin up a build agent and call it a day. As with the Microsoft-hosted build agent, the default behavior of the pipeline still returns output to Azure DevOps. If there is ever an issue like the one I just demonstrated, an inadvertent Write-Output or Write-Error, or an unhandled exception containing sensitive information, it will be displayed in the output of the pipeline. We need to prevent that information from flowing back to Azure Pipelines. Fortunately, there is a relatively simple fix for this: instead of having a task execute your PowerShell scripts directly, create a wrapper/bootstrapper PowerShell script.

The key feature of the bootstrapper is that it executes the actual deployment script as a child process and captures the output from that child process, preventing any output or errors from flowing back into your Pipeline. In this case, I am simply writing output to a file on the build agent, but a more real-world scenario would be to upload that file to a storage account.

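try
{
	# Redirect every output stream (*>&1), not just success output, into the log file
	# so that errors and warnings from Deploy.ps1 cannot flow back to the pipeline
	& "$PSScriptRoot\Deploy.ps1" *>&1 | Out-File "$PSScriptRoot\log.txt" -Append
	Write-Output "Deployment complete"
}
catch
{
	# Report only a generic message; the exception itself may contain sensitive details
	Write-Error "There was an error"
}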
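
The same capture-everything idea can be sketched outside PowerShell. The following is a minimal Python sketch, not the bootstrapper used in this post: `run_quietly`, the `log.txt` path, and the stand-in child command are hypothetical names chosen for illustration. It launches the deployment step as a separate child process and sends both standard output and standard error to a log file, so only a generic status line is printed back to the pipeline.

```python
import subprocess
import sys
from pathlib import Path


def run_quietly(command, log_path):
    """Run `command` as a child process, appending all of its output to `log_path`.

    Returns True when the child exits with code 0. Nothing the child prints
    reaches our own stdout/stderr, so it cannot leak into the pipeline log.
    """
    with open(log_path, "a", encoding="utf-8") as log:
        # stderr=subprocess.STDOUT merges error output into the same log file.
        result = subprocess.run(command, stdout=log, stderr=subprocess.STDOUT)
    return result.returncode == 0


if __name__ == "__main__":
    # Hypothetical stand-in for the deployment script: a child that prints a "secret".
    child = [sys.executable, "-c", "print('VMPassword=example-only')"]
    ok = run_quietly(child, Path("log.txt"))
    # Only a generic status line is written back to the caller.
    print("Deployment complete" if ok else "There was an error")
```

Returning only a success flag keeps the decision about what to print in one place; in a real setup the log file would then be uploaded to a location inside the secure boundary.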

The biggest disadvantage of this approach is the additional administrative burden of setting up and maintaining one (or more) VMs/containers to use as self-hosted build agents.

Option 2

If you would prefer to avoid managing infrastructure, another option is to run your deployment scripts in an Azure Automation Account. Your Pipeline (back to running in a Microsoft-hosted agent) starts an Azure Automation Runbook to kick off the deployment. The disadvantage of this approach is that all of your deployment scripts must either be staged to the Automation Account as modules or converted into “child” runbooks to be executed by the “bootstrapper” runbook. Also, keep in mind that the bootstrapper runbook must take the same preventative action of capturing output from any child scripts or runbooks to prevent potentially sensitive information from flowing back to the Pipeline.

Sample code of calling a runbook:

$resourceGroupName = "automation"
$automationAccountName = "dwl-aaa"
$runbookName = "Deployment-Bootstrapper"
                    
$job = Start-AzAutomationRunbook -AutomationAccountName $automationAccountName -ResourceGroupName $resourceGroupName -Name $runbookName -MaxWaitSeconds 120 -ErrorAction Stop
                    
$doLoop = $true
While ($doLoop) {
    Start-Sleep -s 5
    $job = Get-AzAutomationJob -ResourceGroupName $resourceGroupName -AutomationAccountName $automationAccountName -Id $job.JobId
    $status = $job.Status
    $doLoop = (($status -ne "Completed") -and ($status -ne "Failed") -and ($status -ne "Suspended") -and ($status -ne "Stopped"))
}
                    
if ($status -eq "Failed")
{
    Write-Error "Job Failed"
}
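
The polling pattern in the loop above can be written generically. This is a minimal Python sketch under stated assumptions: `wait_for_job` and its parameters are hypothetical, and `get_status` stands in for whatever call returns the job status. It uses the same four terminal statuses as the PowerShell loop and adds a timeout, which the original loop does not have, so a stuck job cannot hold the pipeline forever.

```python
import time

# The same terminal statuses the PowerShell loop checks for.
TERMINAL_STATUSES = {"Completed", "Failed", "Suspended", "Stopped"}


def wait_for_job(get_status, poll_seconds=5.0, timeout_seconds=600.0, sleep=time.sleep):
    """Poll `get_status()` until it returns a terminal status or the timeout elapses.

    `sleep` is injectable so the loop can be exercised without real delays.
    """
    waited = 0.0
    while True:
        status = get_status()
        if status in TERMINAL_STATUSES:
            return status
        if waited >= timeout_seconds:
            raise TimeoutError(f"job still '{status}' after {waited:.0f}s")
        sleep(poll_seconds)
        waited += poll_seconds


# Usage with a fake status source instead of a real Automation job.
fake_statuses = iter(["New", "Running", "Running", "Completed"])
final_status = wait_for_job(lambda: next(fake_statuses), sleep=lambda seconds: None)
```

Note that the caller only ever sees the status string, never the job output, which matches the goal of keeping deployment details out of the pipeline.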

The Deployment script code running as an Azure Automation Runbook (Note that this has been converted to Azure PowerShell as the AzureCLI isn’t supported in an Automation Account Runbook):

$Conn = Get-AutomationConnection -Name AzureRunAsConnection
Connect-AzAccount -ServicePrincipal -Tenant $Conn.TenantID -ApplicationId $Conn.ApplicationID -CertificateThumbprint $Conn.CertificateThumbprint

add-type -AssemblyName System.Web
$rgName = "AisDwlBlog-rg"
$kvName = "AisDwlBlog-kv"
$pw = [System.Web.Security.Membership]::GeneratePassword(16, 4)

$rg = Get-AzResourceGroup -Name $rgName
if ($rg -eq $null)
{
	$rg = New-AzResourceGroup -Name $rgName -Location EastUs
}

$kv = Get-AzKeyVault -VaultName $kvName -ResourceGroupName $rgName
if ($kv -eq $null)
{
	$kv = New-AzKeyVault -Name $kvName -ResourceGroupName $rgName -location EastUs
}
Set-AzKeyVaultAccessPolicy -VaultName $kvName -PermissionsToSecrets list,get,set -ServicePrincipalName $Conn.ApplicationID

$securePw = ConvertTo-SecureString -String $pw -AsPlainText -Force
Set-AzKeyVaultSecret -VaultName $kvName -Name "VMPassword" -SecretValue $securePw

Leveraging the Power Platform and Microsoft Azure to connect housing agencies and intermediaries with the Department of Housing and Urban Development (HUD)

Intro

The US Department of Housing and Urban Development runs a program known as the Housing Counseling Program that aids housing agencies around the nation through things like grants and training. These agencies provide tremendous benefit to Americans in many forms including credit counseling, foreclosure prevention, predatory lending awareness, homelessness, and more. One of the requirements for approval into the program is that the agency uses a software system that is also approved by HUD and interfaces with HUD’s Agency Reporting Module SOAP web service.

Original System

The original system is built on Dynamics 365 Online using custom entities, web resources, and plug-ins. Complex batching functionality was implemented to deal with the two-minute timeout period that plug-ins have, web resource files (JavaScript, HTML, and CSS) were stuck in a solution with no real structure, and application lifecycle management (ALM) was extremely painful. The original system design also doesn’t account for the intermediary-to-agency relationship, specifically the roll-up reporting and auditing that intermediaries desperately need for providing their compiled agency data to HUD, as well as their own internal reporting. Finally, because the solution was built using Dynamics 365 Online Customer Engagement, licensing costs were more than double what we knew they could be with Microsoft’s new Power Apps licenses and Azure pay-as-you-go subscriptions.

The Original Diagram

Figure 1 – current CMS system built on Dynamics 365 Online

Definition

Intermediaries – organizations that operate a network of housing counseling agencies, providing critical support services to said agencies including training, pass-through funding, and technical assistance.

Modern System

Whereas the data schema for the solution remains mostly unchanged, the architecture of the system has changed profoundly. Let’s check out some of the bigger changes.

Power Apps Component Framework (PCF)

The Power Apps Component Framework enables developers to create code components that can run on model-driven and canvas apps and can be used on forms, dashboards, and views. Unlike traditional web resources, PCF controls are rendered in the same context and at the same time as any other component on a form. A major draw for developers is that PCF controls are created using TypeScript and tools like Visual Studio and Visual Studio Code. In the modern solution, web resources are replaced with PCF controls that make calls directly to the Power Apps Web API.

Azure Resources (API Management, Function Apps, Logic Apps, and a Custom Connector)

A Custom Connector and Azure API Management are used to manage and secure various APIs that are used in the system. The connector can be used in any Azure Logic App to make connecting to an API much easier than having to configure HTTP actions. All of the business logic that once lived in plug-ins has been replaced with a combination of Azure Logic Apps and Azure Function Apps. This has provided incredible gains where development and testing are concerned, but it also provides a more secure and scalable model to support these agencies as they grow. Lastly, it removes the burden experienced with the two-minute time-out limitation that plug-ins have.

ALM with Azure DevOps Services and Power Apps Build Tools

Azure DevOps Services, coupled with the Power Apps Build Tools, are being used to ease the pain we experienced with ALM prior to these tools being available. Now we can easily move our Power App solutions across environments (e.g. dev, test, production) and ensure our latest solution code and configurations are always captured in source control.

ALM with Azure

Figure 2 – modern solution using Power Apps and Azure resources for extreme scalability and maintainability

Next Steps

I hope by demonstrating this use case you’re inspired to contact your Microsoft partner, or better yet contact us and let us work with you to identify your organization’s workloads and technology modernization opportunities.

Ready to dig a little deeper into the technologies used in this blog post? Below we have some hands-on labs for you. Stay tuned for more updates!

  1. Power Apps Component Framework Controls
    For developers, PCF controls have been one of the most exciting things to grace the Power Platform landscape. In this video, I’ll show you how to go from zero to a simple PCF control that calls the Power Platform Web API. I also provide a Pro Tip or two on how I organize my PCF controls in my development environment.

  2. Using an Azure Logic App to Query Data in an On-Premises SQL Server Database
    Companies are embracing modernization efforts across the organization, quickly replacing legacy apps with Power Apps and Azure resources. This can’t all happen at once though, and often times companies either can’t or won’t move certain data to the cloud. In this hands-on lab, I’ll walk you through using an Azure Logic App and an On-Premises Data Gateway to connect to a local SQL Server 2017 database to query customer data using an HTTP request and response.
  3. Azure API Management and Custom Connectors
    APIs give you the ability to better secure and manage your application interfaces. Couple an API with a Custom Connector and you not only better secure and manage, but also make it super easy for other app developers across the organization to connect to your back-end resources through the API, without having to worry about all of the API configurations.
  4. Power Apps DevOps Using Best Practices
    There’s more to DevOps than just backing up your Power App solutions in source control and automating your build and release pipeline. Good solution management and best practices around publishers, entity metadata, components, and subcomponents are also key to manageability and scalability with your ALM.
When building a web API or web application, it is critically important to know that the application is functioning as intended, whether from a performance perspective or simply knowing that external clients are using the application correctly. Historically, for an on-premises solution, that involved installing agent monitoring software and configuring a logging solution with associated storage management. With Azure, that becomes a turn-key solution using Application Insights, which can be used whether your application is deployed on-premises or in the cloud. In this post, I’d like to talk about configuring Application Insights for an ASP.NET Core application and about structured logging.

Enable Application Insights for ASP.NET Core

The way to enable Application Insights for your ASP.NET Core application is to install the NuGet package into your .csproj file, as shown below.

Enable Application Insights for ASP.NET Core

The rest of this article assumes you are using version 2.7.1 or later of the NuGet package. There have been several changes to the library in the last 6 months.
Please add the following code to your Startup.cs:

Add Code to Your Startup.cs

Allocate your Application Insights resource in Azure, whichever way you prefer. This could be Azure Portal, Azure CLI, etc. See Azure Docs for more details.

In your appsettings.json, add the following:

appsettings.json

By now you’ve enabled Application Insights for your ASP.NET Core application. You’ll now get the following features:

  • Request logging
  • Automatic dependency logging for SQL requests and HTTP requests
  • A 90-day retention period
  • Live metrics, which permit you to view and filter the above telemetry while viewing CPU and memory usage statistics live. For example, see the screenshots below.

Live Metrics Stream

Telemetry Types and Waterfall View

One of the interesting features that Application Insights provides compared to other logging systems is that it has different kinds of telemetry. This includes RequestTelemetry, DependencyTelemetry, ExceptionTelemetry, and TraceTelemetry. Application Insights also provides the ability to have a parent operation that other telemetry operations belong to and you can view a waterfall view of a given request. For an example see the screenshot below:

End to end transaction

Structured Logging

Any of the telemetry types will provide the ability to add arbitrary key-value pairs. Those values will then be logged as key-value pairs to Application Insights. Then using the Log Analytics feature of Application Insights, one can then query on those custom key-value pairs. Effectively, you are getting a schema-less ability to attach custom properties to any telemetry in real-time. This is commonly referred to as Structured Logging with other frameworks. This is so you are not creating one long message string, then trying to parse the message string. Instead, you get custom key-value pairs and can simply query for a given key having a given value. The screenshot below provides an example of a Log analytics query on a custom property:

Customer API App Insights
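
To make the contrast with string parsing concrete, here is a minimal, framework-neutral sketch in Python using only the standard `logging` module. It is not the Application Insights SDK: `KeyValueFormatter`, `ListHandler`, `find`, the `customer_api` logger name, and the `properties` key are all hypothetical names chosen for illustration. Each log entry carries its custom properties as separate fields, so a query can match on a key and value directly.

```python
import json
import logging


class KeyValueFormatter(logging.Formatter):
    """Render each record as JSON: the message plus any custom properties."""

    def format(self, record):
        payload = {"level": record.levelname, "message": record.getMessage()}
        # Properties passed via extra={"properties": {...}} become record attributes.
        payload.update(getattr(record, "properties", {}))
        return json.dumps(payload, sort_keys=True)


class ListHandler(logging.Handler):
    """Collect formatted entries in memory; a stand-in for a real log sink."""

    def __init__(self):
        super().__init__()
        self.lines = []

    def emit(self, record):
        self.lines.append(self.format(record))


def find(lines, key, value):
    """Query entries by key and value instead of parsing message strings."""
    return [entry for entry in map(json.loads, lines) if entry.get(key) == value]


handler = ListHandler()
handler.setFormatter(KeyValueFormatter())
logger = logging.getLogger("customer_api")
logger.setLevel(logging.INFO)  # allow informational messages through
logger.propagate = False
logger.addHandler(handler)

logger.info("Fetching customer", extra={"properties": {"id": 42}})
logger.info("Fetching customer", extra={"properties": {"id": 7}})

matches = find(handler.lines, "id", 42)
```

Both entries share the same message text, yet the query isolates the one whose `id` property is 42 without any string parsing.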

Log Your Own Custom Messages and Telemetry

We now ask the question of how do you go about logging custom telemetry to Application Insights from within your ASP.NET Core application? The Application Insights NuGet package automatically registers the TelemetryClient class provided by the library into the Dependency Injection container. You could add that as a constructor argument to your Controller for instance and then directly call methods on the TelemetryClient. However, at this point, you are coupling more parts of your application to ApplicationInsights. It’s not necessary that you do that. With the latest versions of the ApplicationInsights NuGet for ASP.NET Core, they register an ILogger implementation with ASP.NET Core. So, you could then update your controller as follows:

Log Custom Messages

In the above example, we have logged a message and a custom key-value pair. The key will be “id” and the value will be the value of the argument passed into the Get function. Notice that we have done this with only a dependency on ILogger, which is a generic abstraction provided by Microsoft. ILogger can log to multiple outputs at once, such as the console and Application Insights, and many other implementations are available. ILogger natively supports structured logging and passes the information down to the actual log implementation, which in this example is Application Insights.
Currently, by default, Application Insights only captures ILogger messages at the Warning level or higher, so my above example would not work. To disable the built-in filter, you would need to add the following to Startup.cs in ConfigureServices.

ConfigureServices

Summary

With Application Insights, we can provision monitoring within minutes in Azure. You’ll receive 5 GB of data ingestion free per month and free data retention for 90 days. It is trivial to instrument your application. Some of the benefits you’ll receive are:

  • Automatic logging of requests/responses
  • Ability to drill into recent failures/exceptions in Azure portal
  • Waterfall mapping of a request
  • Automatic dependency logging of out-bound SQL and HTTP requests
  • Arbitrarily query your data using Log Analytics
  • Ability to drill into recent performance metrics in Azure portal
  • Live metrics view as your application is running in production with filtering.
  • Custom graphs and charts with Notebooks
  • Ability to create an Azure Portal Dashboard
  • Application map that will show the topology of your application with any external resources it uses.

Application Insights is a very powerful tool to ensure your application is functioning as intended, and it is very easy to get started. You spend your time instrumenting your application and checking application health, not time provisioning log storage solutions and picking log query tools.

Look for future blog posts covering additional topics like keeping Personally Identifiable Information (PII) out of your logs and troubleshooting your Application Insights configuration.

This blog is a follow on about Azure Cognitive Services, Microsoft’s offering for enabling artificial intelligence (AI) applications in daily life. The offering is a collection of AI services with capabilities around speech, vision, search, language, and decision.

Azure Personalizer is one of the services in the suite of Azure Cognitive Services, a cloud-based API service that allows you to choose the best experience to show to your users by learning from their real-time behavior. Azure Personalizer is based on cutting-edge technology and research in the area of Reinforcement Learning; it uses a machine learning model that is different from traditional supervised and unsupervised learning models.

In Azure Cognitive Services Personalizer: Part One, we discussed the core concepts and architecture of Azure Personalizer Service, Feature Engineering, its relevance, and its importance.

In Part Two, we covered a couple of use cases in which Azure Personalizer Service is implemented. We looked at the features used, the reward calculation, and the test run results.

In this blog, Part Three, we list out recommendations and capacities for implementing solutions using Azure Personalizer Service.

Recommendations, Current Capacities, and Limits

This section describes some essential recommendations while implementing Personalizer, and current capacity factors for its use.

Recommendations

  • Personalizer starts with a default learning policy, which can yield moderate performance. As part of optimization, evaluations are run that allow Personalizer to create new learning policies specifically optimized for a given use case. Optimized learning policies, generated during evaluation, perform significantly better for each specific loop.
  • The reward score calculation should consider only relevant factors with appropriate weight. Experiment duration (Rank to Reward cycle) should be low enough that the reward score can be computed while it’s still relevant. How well the ranked results worked can be computed by business logic, by measuring a related aspect of the user behavior, and is expressed as a value between -1 and 1.
  • The context for the ranking items (actions) can be expressed as a dictionary of at least 5 features that you think would help make the right choice, and that doesn’t include personally identifiable information. Similarly, each item (action) should be expressed as a dictionary of at least 5 attributes or features that you think will help Personalizer make the right choice. There should be fewer than 50 actions (items) to rank per call.
  • Personalizer will adapt to continuous change in the real world, but results won’t be optimal if there are not enough events and data to learn from to discover and settle on new patterns. Data should be retained long enough to accumulate a history of at least 100,000 interactions.
  • You should choose a use case that happens often enough. Consider looking for use cases that happen at least 500 times per day. Context and actions have enough features defined to facilitate learning.
  • Your data retention settings allow Personalizer to collect enough data to perform offline evaluations and policy optimization. This is typically at least 50,000 data points.
  • Don’t use Personalizer where the personalized behavior isn’t something that can be discovered across all users but rather something that should be remembered for specific users or comes from a user-specific list of alternatives.
  • To prevent actions from being ranked, either remove them from the list when making the Rank API call or use Inactive Events. To disable automatic learning, call the Rank API with learningEnabled = False. Learning for an inactive event is implicitly activated if you send a reward for the Rank results.
  • Personalizer exploration setting of zero will negate many of the benefits of Personalizer. With this setting, Personalizer uses no user interactions to discover better user behavior. This leads to model stagnation, drift, and ultimately lower performance.
  • A setting that is too high will negate the benefits of learning from user behavior. Setting it to 100% implies constant randomization, and any learned behavior from users would not influence the outcome.
  • To realize the full potential of AI offerings, design and implementation should gain the full trust of end-users, aspects to consider include ethics, privacy, security, safety, inclusion, transparency, and accountability.
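
The reward guidance above (only relevant factors, appropriate weights, a final value between -1 and 1) can be sketched as a small function. This is an illustrative Python sketch, not part of the Personalizer SDK: `reward_score`, the signal names, and the weights are hypothetical, and the weighted-average formula is just one reasonable way to combine signals.

```python
def reward_score(signals, weights):
    """Combine weighted behavior signals into a single reward in [-1, 1].

    `signals` maps a signal name to an observed value; `weights` maps the
    same names to non-negative weights. Signals without a weight are ignored.
    """
    total_weight = sum(weights.get(name, 0.0) for name in signals)
    if total_weight == 0:
        return 0.0  # nothing relevant was observed
    weighted_sum = sum(value * weights.get(name, 0.0) for name, value in signals.items())
    # Clamp so that out-of-range inputs can never produce an invalid reward.
    return max(-1.0, min(1.0, weighted_sum / total_weight))


# Hypothetical example: a purchase matters three times as much as a click.
weights = {"clicked": 1.0, "purchased": 3.0}
score = reward_score({"clicked": 1.0, "purchased": 0.0}, weights)  # 0.25
```

Keeping the weights in one dictionary makes it easy to review which user behaviors drive the reward and to adjust them as the business logic evolves.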

Capacity & Limits

  • How well the ranked choice worked needs to be measured with relevant user behavior and scored between -1 and 1, with single or multiple calls to the Reward API.
  • Context and actions (items) should each have enough features (at least 5) to facilitate learning, with fewer than 50 items (actions) to rank per single Rank call.
  • Data should be retained long enough to accumulate a history of at least 100,000 interactions; effective offline evaluations and policy optimizations typically need at least 50,000 data points.
  • Personalizer supports features of data type string, numeric, and Boolean. An empty context is not supported; the context should have at least one feature.
  • For categorical features, pre-defining of the possible values or ranges is not required.
  • Features that are not available at the time of Rank call should be omitted instead of sent with a null value.
  • There can be hundreds of features defined for a use case, but they must be evaluated (using principles of Feature Engineering and Personalizer Evaluation option) for effectiveness, and less effective ones should be removed.
  • The features in the actions may or may not have a correlation with the features in the context used in Personalizer.
  • If the ‘Reward Wait Time’ expires, and there has been no reward information, a default reward is applied to that event for training. The maximum wait duration supported currently is 6 days.
  • Personalizer Service can return a rank very rapidly, and Azure will auto-scale as needed to maintain the rapid generation of ranking results. Throughput is calculated by adding the sizes of the action and context JSON documents and factoring in a rate of 20 MB/sec.
  • Context and Actions (items) are expressed as a JSON object that is sent with the Rank API call. JSON objects can include nested JSON objects and simple property/values. Arrays can be included if the items are numbers.
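
Several of the limits above can be checked before a Rank call is made. The following Python sketch is illustrative only: the function names are hypothetical, the context and actions are simplified to plain dictionaries rather than the exact Rank API request shape, and the throughput estimate is one reading of the 20 MB/sec guidance (rate divided by combined document size), with MB taken as 1024 × 1024 bytes.

```python
import json

MIN_FEATURES = 5                        # at least 5 features for context and for each action
MAX_ACTIONS = 50                        # fewer than 50 actions per Rank call
RATE_BYTES_PER_SEC = 20 * 1024 * 1024   # assumed interpretation of "20 MB/sec"


def drop_missing(features):
    """Omit features with no value rather than sending them as null."""
    return {name: value for name, value in features.items() if value is not None}


def check_rank_request(context, actions):
    """Return a list of guideline violations (empty when the request looks fine)."""
    problems = []
    if len(context) == 0:
        problems.append("context is empty")
    elif len(context) < MIN_FEATURES:
        problems.append("context has fewer than 5 features")
    if len(actions) >= MAX_ACTIONS:
        problems.append("50 or more actions in one Rank call")
    for action_id, features in actions.items():
        if len(features) < MIN_FEATURES:
            problems.append(f"action '{action_id}' has fewer than 5 features")
    return problems


def estimated_calls_per_second(context, actions):
    """Rough ceiling: the assumed rate divided by the combined JSON size."""
    size = len(json.dumps(context).encode("utf-8")) + len(json.dumps(actions).encode("utf-8"))
    return RATE_BYTES_PER_SEC / size


# Hypothetical request: "weather" is unknown at Rank time, so it is omitted.
context = drop_missing({"timeOfDay": "morning", "device": "mobile", "weather": None,
                        "dayOfWeek": "Mon", "region": "US", "loggedIn": True})
actions = {"article-1": {"topic": "sports", "length": 800, "hasVideo": False,
                         "ageDays": 2, "author": "staff"}}
problems = check_rank_request(context, actions)
```

Running these checks in the calling application gives early feedback on requests that are unlikely to help the model learn, before they count against throughput.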

Conclusion

The Azure Cognitive Services suite facilitates a broad range of AI implementations. It enables applying the benefits of AI technology to the little things we do in our daily lives. The Personalizer service is a simple-to-use yet powerful AI service that can be applied in any scenario where the ranking of options is meaningful once it is expressed with a rich set of features. I hope this blog post is helpful in explaining the high-potential uses of the Azure Cognitive Services Personalizer Service. I also want to thank my colleague Kesav Chenna at AIS for his contribution in implementing Personalizer in the use cases discussed in this blog.

If you’re looking for an intelligent cloud-native Security Information and Event Management (SIEM) solution that manages all incidents in one place, Azure Sentinel may be a good fit for you.

Not only does Azure Sentinel provide intelligent security analytics and threat intelligence, but it’s also considered a Security Orchestration, Automation, and Response (SOAR) solution, meaning it will collect data about security threats and you can automate responses to lower-level security events without the traditionally manual efforts required. You can extend this solution across data sources by integrating Azure Sentinel with enterprise tools, like ServiceNow. There are also services offered at no additional cost, such as User Behavior Analysis (UBA), petabyte-scale daily ingestion, and Office 365 data ingestion, to make Azure Sentinel even more valuable.

First Impression

After opening Azure Sentinel from the Azure portal, you will be presented with the below items:

Azure sentinel first view

Conceptually, Azure Sentinel has four core areas.

Azure Sentinel Four Core Areas

  • Collect – By using connections from multiple vendors or operating systems, Azure Sentinel collects security events and data and keeps them for 31 days by default. This is extendable up to 730 days.
  • Detect – Azure Sentinel has suggested queries, you can find samples, or build your own. Another option is Azure Notebook, which is more interactive and has the potential to use your data science analysis.
  • Investigate – Triage using the same detection methodology in conjunction with event investigation. Later, a case is created for the incident.
  • Respond –  Finally, responding can be manual or automated with the help of Azure Sentinel playbooks. Also, you can use graphs, dashboards, or workbooks for presentation.

For a better understanding, the following behind-the-scenes flow is helpful.

Steps in Azure Sentinel

How do I enable Azure Sentinel?

If you already have an Azure Log Analytics Workspace, you are one click away from Azure Sentinel. You need to have Contributor RBAC permission on the subscription that contains the Azure Log Analytics Workspace to which Azure Sentinel will bind itself.

Azure Sentinel has some prebuilt dashboards, and you are able to share them with your team members.

You can also enable the integration of security data from Security Center > Threat Detection > Enable integration with other Microsoft security services

Azure Sentinel has a variety of built-in connectors that collect data and process it with its artificial intelligence empowered processing engine. Azure Sentinel can relate your events to well-known or unknown anomalies (with the help of ML)!

Below is a sample connection which offers two out-of-the-box dashboards:

sample connection in Azure Sentinel

All connections have a fair amount of instructions, which usually allows for a fast integration. A sample of an AWS connector can be found here.

Azure Sentinel has thirty out-of-the-box dashboards that make it easy to create an eloquent dashboard. However, built-in dashboards only work if you have configured the related connection.

Built-In Ready to Use Dashboards:

  • AWS Network Activities
  • AWS User Activities
  • Azure Activity
  • Azure AD Audit logs
  • Azure AD Sign-in logs
  • Azure Firewall
  • Azure Information Protection
  • Azure Network Watcher
  • Check Point Software Technologies
  • Cisco
  • CyberArk Privileged Access Security
  • DNS
  • Exchange Online
  • F5 BIG-IP ASM F5
  • FortiGate
  • Identity & Access
  • Insecure Protocols
  • Juniper
  • Linux machines
  • Microsoft Web Application Firewall (WAF)
  • Office 365
  • Palo Alto Networks
  • Palo Alto Networks Threat
  • SharePoint & OneDrive
  • Symantec File Threats
  • Symantec Security
  • Symantec Threats
  • Symantec URL Threats
  • Threat Intelligence
  • VM insights

A Sample Dashboard:

One of the most useful IaaS monitoring services that Azure provides is VMInsights, or Azure Monitor for VMs. Azure Sentinel has a prebuilt VMInsight Dashboard. You can connect your VM to your Azure Log Analytics Workspace, then enable VMInsights from VM > Monitoring > Insights. Make sure the Azure Log Analytics Workspace is the same one that has Azure Sentinel enabled on it.

Sample Dashboard VMInsights or Azure Monitor for VMs

Creating an alert is important. Alerts are the first step toward having a case, or ‘incident’. After a case is created based on the alert, you can do your investigation. To create an alert, you need to use the Kusto Query Language (KQL), which you have probably already used in Azure Log Analytics.

Azure Sentinel has a feature named entity mapping, which lets you relate the query to values like IP address and hostname. These values make the investigation much more meaningful. Instead of going back and forth between multiple queries to relate them, you can use entities to make your life easier. At the time of writing this article, Azure Sentinel has four entities (Account, Host, IP address, and Timestamp) that you can bind to your query. You can easily enable or disable an alert, or run it manually, from Configuration > Analytics. Naming might be a little bit confusing since you also need to create your alerts from Analytics.

The Azure Sentinel investigation map of entities became publicly available in September 2019, and you no longer need to fill out a form to request access.

Let’s Go Hunting

You can use Azure Sentinel’s built-in hunting queries. If you know where to find the anomalies, you can also go straight to them with KQL queries and create an alert, or use Azure Notebooks for AI- and ML-based hunting. You can bring your own ML model to Azure Sentinel. Azure Sentinel Notebooks are for your tier 4 SOC analysis.

Azure Sentinel built-in hunting query

Azure Sentinel uses MITRE ATT&CK-based queries and introduced eight types of queries, also known as bookmarks, for hunting.

After you become skilled in detection, you can start creating your playbook built on Logic Apps workflows. You can also build your automated responses to threats or craft custom actions after an incident has happened. Later, you can enable Azure Sentinel Fusion to associate lower-fidelity anomalous activities with high-fidelity cases.

Azure Sentinel Detection Playbook

A sample playbook:

Azure Sentinel Sample Playbook

Image Source: Microsoft

Azure Notebooks is a Jupyter notebook service (an interactive computational tool) that facilitates your investigation by letting you apply your data science skills. Azure Notebooks supports languages and packages from Python 2 and 3; you can also use R and F#.

We all love community-backed solutions. You can share your findings and designs with others and use their insights by using the Azure Sentinel Community on GitHub.

Azure Sentinel Fusion

Fusion helps reduce noise and prevent alert fatigue. Azure Sentinel Fusion builds on this insight, and Microsoft's documentation explains how to enable it.

Traditionally, we assume that an attacker follows a static kill chain as the attack path, or that all information about an attack is present in the logs. Fusion helps here by applying a probabilistic kill chain, which also lets it find novel attacks. Microsoft's documentation has more information on this topic. Formerly, you had to run a PowerShell command to enable Fusion, but Fusion is now enabled by default.
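To make the idea of combining several low-fidelity signals into one higher-fidelity case more concrete, here is a toy sketch. It is not Fusion's algorithm, which this article does not describe. It uses a simple noisy-OR rule that assumes the signals are independent, and the signal names and probabilities are invented for illustration.

```python
def combined_confidence(signal_probabilities):
    """Noisy-OR: probability that at least one signal is a true positive,
    assuming (unrealistically) that the signals are independent."""
    all_false = 1.0
    for p in signal_probabilities:
        if not 0.0 <= p <= 1.0:
            raise ValueError(f"probability out of range: {p}")
        all_false *= 1.0 - p  # chance that this signal is also a false alarm
    return 1.0 - all_false


# Hypothetical low-fidelity signals for one account; values are made up.
signals = {
    "sign-in from an unfamiliar location": 0.3,
    "mass file download": 0.4,
    "new inbox forwarding rule": 0.5,
}

score = combined_confidence(signals.values())
print(f"Combined confidence: {score:.2f}")
```

Each signal on its own would be too weak to raise a case, but together they clear a much higher bar. That is the intuition behind correlating anomalies instead of alerting on each one separately.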

What Data Sources Are Supported?

Azure Sentinel has three types of connectors. First, Microsoft services are connected natively and can be configured with a few clicks. Second, external solutions can be connected via API. Third, external solutions can be connected via an agent. The connectors are not limited to those in the table below. For example, IoT and Azure DevOps sources can also communicate with Azure Sentinel.

| Microsoft services | External solutions via API | External solutions via an agent |
| --- | --- | --- |
| Office 365 | Barracuda | F5 |
| Azure AD audit logs and sign-ins | Symantec | Check Point |
| Azure Activity | Amazon Web Services | Cisco ASA |
| Azure AD Identity Protection | | Fortinet |
| Azure Security Center | | Palo Alto |
| Azure Information Protection | | Common Event Format (CEF) appliances |
| Azure Advanced Threat Protection | | Other Syslog appliances |
| Cloud App Security | | DLP solutions |
| Windows security events | | Threat intelligence providers |
| Windows firewall | | DNS machines |
| DNS | | Linux servers |
| Microsoft web application firewall (WAF) | | Other clouds |
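Several of the agent-based sources above send events in Common Event Format (CEF). As a minimal sketch of what such a record looks like, the Python function below splits a CEF line into its seven pipe-separated header fields and its `key=value` extension. The function name `parse_cef` and the sample record are illustrative only. The parser is deliberately simplified: it handles escaped pipes in the header but not escaped `=` characters in extension values, and it is not how the Azure Sentinel agent itself processes events.

```python
import re

CEF_HEADER_FIELDS = [
    "version", "device_vendor", "device_product", "device_version",
    "signature_id", "name", "severity",
]


def parse_cef(line: str) -> dict:
    """Parse one CEF record into a dict of header fields plus extensions."""
    start = line.find("CEF:")
    if start == -1:
        raise ValueError("not a CEF record")
    # Split on pipes that are not escaped with a backslash. At most 7 splits,
    # so the extension part stays in one piece even if it contains a pipe.
    parts = re.split(r"(?<!\\)\|", line[start + 4:], maxsplit=7)
    if len(parts) < 7:
        raise ValueError("CEF header needs 7 pipe-separated fields")
    header = [part.replace("\\|", "|") for part in parts[:7]]
    record = dict(zip(CEF_HEADER_FIELDS, header))
    extension = parts[7] if len(parts) > 7 else ""
    # A value runs until the next " key=" token or the end of the line,
    # which lets values contain spaces.
    pairs = re.findall(r"(\w+)=(.*?)(?=\s+\w+=|\s*$)", extension)
    record["extension"] = dict(pairs)
    return record


# Sample record for illustration; any syslog prefix before "CEF:" is ignored.
sample = (
    "CEF:0|Security|threatmanager|1.0|100|"
    "worm successfully stopped|10|src=10.0.0.1 dst=2.1.2.2 spt=1232"
)
event = parse_cef(sample)
print(event["name"], event["extension"]["src"])
```

Seeing the fields split out this way can help when you later write KQL queries against the ingested events, because the header and extension keys typically surface as separate columns.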

Where Does Azure Sentinel Sit in the Azure Security Picture?

Azure Sentinel in the Azure Security Big Picture

Azure Sentinel can be used before an attack, for example to detect Azure Active Directory sign-ins from new locations; during an attack, for example when malware is on a machine; or post-attack, to investigate an incident and perform triage. Azure Sentinel has a service graph that can show you the events related to an incident.

If you work in a security role or are part of a SOC team and you prefer a cloud-native solution, Azure Sentinel is a good option.

Security Providers or Why Azure Sentinel?

Azure Sentinel uses the Microsoft Intelligent Security Graph, which is backed by the Microsoft Intelligent Security Association. This association consists of almost 60 companies that work hand in hand to find vulnerabilities more efficiently.

Microsoft brings in findings from 3,500+ security professionals, 18B+ Bing page scans per month, 470B emails analyzed per month, 1B+ Azure accounts, 1.2B devices updated each month, 630B authentications per month, and 5B threats blocked per month.

Microsoft Intelligent Security Graph Overview

Image Source: Microsoft

Microsoft has more solutions that add value to its Microsoft Graph Security API: Windows antimalware platform, Windows Defender ATP, Azure Active Directory, Azure Information Protection, DMARC reporting for Office 365, Microsoft Cloud App Security, and Microsoft Intune.

Microsoft Intelligent Security Association (MISA)

Microsoft builds a vast range of threat intelligence solutions. It has collaborated with other companies to create a product named the Microsoft Intelligent Security Graph API. Microsoft calls this collaboration the Microsoft Intelligent Security Association (MISA), an association of almost 60 companies that share their security insights from trillions of signals.

  • Microsoft products: Azure Active Directory, Azure Information Protection, Windows Defender ATP, Microsoft Intune, Microsoft Graph Security API, Microsoft Cloud App Security, DMARC reporting for Office 365, Windows antimalware platform, Microsoft Azure Sentinel
  • Identity and access management: Axonius, CyberArk, Duo, Entrust Datacard, Feitian, Omada, Ping Identity, Saviynt, Swimlane, Symantec, Trusona, Yubico, Zscaler
  • Information protection: Adobe, Better Mobile, Box, Citrix, Checkpoint, Digital Guardian, Entrust Datacard, EverTrust, Forcepoint, GlobalSign, Imperva, Informatica, Ionic Security, Lookout, Palo Alto Networks, Pradeo, Sectigo, Sophos, Symantec, Wandera, Zimperium, Zscaler
  • Threat protection: AttackIQ, Agari, Anomali, Asavie, Bay Dynamics, Better Mobile, Bitdefender, Citrix, Contrast Security, Corrata, Cymulate, DF Labs, dmarcian, Duo Security, FireEye, Illumio, Lookout, Minerva Labs, Morphisec, Palo Alto Networks, Red Canary, ThreatConnect, SafeBreach, SentinelOne, Swimlane, ValiMail, Wandera, Ziften
  • Security management: Aujas, Barracuda, Carbon Black, Checkpoint, Fortinet, F5, Imperva, Symantec, Verodin

MISA and Security Graph API

MISA is a combined security effort. It continuously monitors cyberthreats and strengthens its defenses. This enriched knowledge is accessible through the Microsoft Intelligent Security Graph API. Azure Sentinel Fusion is the engine that uses graph-powered machine learning algorithms to associate activities with patterns of anomalies.

Microsoft Intelligent Security Association (MISA) and Security Graph API

Below you can see the Azure Sentinel Big Picture:

Azure Sentinel Big Picture

I hope you found this blog helpful! As you can see, Azure Sentinel is just the tip of the Microsoft Security ‘iceberg’.

Azure Sentinel Microsoft Security Iceberg
