This past Thanksgiving marked the first anniversary of going live with a SharePoint environment that AIS migrated from on-prem to Microsoft Azure IL5. Since then, our client has experienced 100% uptime during business hours and reduced deployment timelines, from weeks to minutes.

Challenge: Improve Performance, Speed Up Deployments

AIS set out to help a DoD agency that had experienced ongoing service issues with their existing provider while operating their on-prem SharePoint farm. During the year before the migration, the DoD customer experienced three service outages during business hours, which halted the ability to perform mission-critical activities. Additionally, their existing enterprise service provider required a lead-in time of 1-2 weeks to deploy any code changes or new capabilities into the environment. AIS was tasked with building a cloud solution to maximize uptime and accommodate rapid deployments to better serve the fast tempo required by our DoD customer.

Solution: Hybrid IaaS/Paas in Azure IL5

To provide a solution tailored to maximize uptime and accommodate rapid deployments, AIS architected a DoD first: a hybrid IaaS/PaaS environment in Azure Government IL5 that utilized the DISA Cloud Access Point to integrate with NIPRNet. We leveraged a suite of technologies to employ DevSecOps methodologies, allowing the solution to remain scalable while adhering to industry best practices. By implementing an automated code scanning solution, we reduced deployment lead-in time from weeks to minutes. Our infrastructure as code (IaC) development also drastically reduced the time required to build a new environment from several days to under one hour.

Looking Ahead: Cost-Sharing, Scale Across the DoD

AIS has worked with our DoD customers to offer these cloud services to neighboring agencies to benefit from cost-sharing. In doing so, we have passed on lessons learned and processes that we have developed to share our success across the DoD enterprise. As we grow, we continue to integrate evolving best practices to remain at the DoD DevSecOps initiative’s forefront.

AIS FIRST TO DEPLOY AZURE IL6 EVNVIRONMENT AND ACHIEVE ATO!

Have you spent a lot of time getting comfortable using Team Foundation Server (TFS) or Azure Pipelines only to be switched to GitLab CI/CD for a project? That’s what happened to me about four months ago. Since then, I’ve been learning the differences between the tools. There are several differences between Azure Pipelines and GitLab CI/CD that I’ve experienced. This post aims to be a primer for you to understand these differences and save you a lot of research time.

Classic Pipelines vs YAML

I worked with TFS’/Azure Pipelines’ Classic Editor the most before this project. While it’s a great visual designer for automating builds, they aren’t very portable from project-to-project or environment-to-environment because the exported JSON files reference instances of other resources like Service Connections by ID, not name. When importing these JSON files into another TFS or Azure Pipelines instance, changes need to be made to update affected references. This is where YAML, which stands for “YAML Ain’t Markup Language,” comes in. YAML pipelines are the future for both Azure Pipelines and GitLab CI/CD. YAML references the Service Connection by name, making it easier to use the pipeline in other environments if the connection name remains the same.

I didn’t realize before diving into YAML for GitLab because it’s not a language but more of a file format. I had expected the little Azure Pipelines YAML I had written would translate almost perfectly to GitLab. However, GitLab doesn’t use a lot of the syntax that Azure Pipelines does. I’ve listed some of the differences I’ve become aware of, but this is likely just a partial list.

Azure Pipeline and Description

For more details, here is the YAML reference for GitLab and Azure.

Integration with Azure

Azure Pipelines offers Service Connections to connect to remote services like Azure, Docker, or GitHub.
GitLab doesn’t have an equivalent offering. Making those connections must be done in Jobs. One of the concerns is that you would have to keep the connection credentials in environment variables. When doing this, it’s important to remember to mask sensitive variables like passwords to not appear in job logs. This also means you can’t use Azure Key Vault Secrets as pipeline variables, either, which is something my previous projects used heavily.

Runners

While project-specific, the GitLab runners that were made available don’t have all the tools needed for my project. Another team that is developing the runners has provided some basic images that don’t include tools like the Azure or JFrog CLIs, which are required for the environment. This is likely to be remedied, but it’s caused a need to learn how to install it. This isn’t an issue with more open solutions. Still, my project will be isolated from the web in the staging and production environments, making it challenging to use tools like Chocolatey, Nuget, or NPM to install packages from the web. This is still something the other team is working to resolve. Still, the only resolution I see is maintaining their own repository of packages in something like Artifactory and having images or scripts pull from there.

I’m far more comfortable with YAML because of this project. While GitLab is powerful and has received a few excellent features since I began the project, the creature comforts I’m used to with Azure Pipelines (notably the Service Connections) require additional work that seems unnecessary.

When building and managing an Azure environment, Microsoft maintains control of the network traffic as a core operations responsibility. The primary Azure platform resource to implement network traffic control is the Network Security Group (NSG). A Network Security Group allows you to define security rules, like firewall rules, that control traffic by specifying allowed and denied sources, destinations, ports, and protocols. Like all Azure resources, there are multiple options to manage NSGs, including the standard Azure Management tools: The Azure Portal, Scripts (PowerShell and CLI), APIs, and Azure Resource Manager (ARM) templates.

Managing NSG security rules using an ARM template can be challenging. Each security rule is defined using a large chunk of JSON, and many security rules may be required. The verbose JSON structure makes it difficult to see many rules at once, visualize changes from version to version, and encourage team members to revert to the Azure Portal to view and edit rules. Why use the Azure Portal? It turns out the portal’s grid format for NSG Security Rules was comfortable for quickly viewing multiple rules and for making minor edits to individual rules.

Since the portal’s grid view was comfortable the CSV file format seemed like the right idea based on its similarity to a grid. CSV files have a few pros:

  • Good viewers and editors including Excel and VS Code.
  • One vertically compact line for each security rule.
  • A vertically compact view that makes it easier to visually scan rules and to see the changes that are made from version to version when viewing differences.
  • Anyone who can edit a CSV can edit the NSG Security Rules allowing a larger group of security rule editors.

NSG in JSON format

This is a simple example of the NSG Security Rule JSON. A rule like this can get much larger vertically when numerous ports and address prefixes are defined:

{
          "name": "I-L-All_HttpHttps-UI_Layer-T",
          "description": "Allow HTTP + HTTPS traffic inbound.",
          "priority": 1110,
          "access": "Allow",
          "direction": "Inbound",
          "protocol": "Tcp",
          "sourceAddressPrefix": "",
          "sourceAddressPrefixes": [
            "AzureLoadBalancer"
          ],
          "sourceApplicationSecurityGroups": null,
          "sourcePortRange": "*",
          "sourcePortRanges": null,
          "destinationAddressPrefix": "*",
          "destinationAddressPrefixes": null,
          "destinationApplicationSecurityGroups": null,
          "destinationPortRange": "",
          "destinationPortRanges": [
            "80",
            "443"
          ]
        }

NSG in CSV Format

Excel

Example CSV Excel

Example CSV

Converting Between CSV and JSON

The transition from CSV to JSON and from JSON back to CSV must be repeatable and simple. In this scenario, PowerShell scripts manage this process: Convert-NsgCsvToJson.ps1 and Convert-NsgJsonToCsv.ps1.

The Convert-NsgCsvToJson.ps1 script is straightforward and does the following:

  1. Read the source CSV file.
  2. Read the destination JSON file.
  3. Split multi-value fields into an array based on the parameter: CsvArraySeparator. The default is the pipe character ‘|’. For fields like source and destination port ranges, this collapses multiple values into a single CSV field.
  4. Structure of the CSV format data into objects that match the ARM Template NSG Security Rule JSON structure.
  5. Use a JsonFileType parameter to determine where in the destination JSON structure to place the security rules array. This allows placement of the security rules array into a parameter file, template file, or into an empty JSON file.

A New Workflow

With PowerShell scripts, the new workflow for NSGs is:

  1. Create and edit NSG Security Rules in a CSV file – usually using Excel.
  2. Visually scan the CSV looking for obvious anomalies (Excel makes it easy to see when one rule stands out from the others and as an example, a value is in the wrong column).
  3. Execute the script: Convert-NsgCsvToJson.ps1 to convert the rules to the Json Structure and update the destination JSON file.
  4. Deploy the ARM Template and updated parameters file to a dev/test environment using standard deployment approaches such as the Azure CLI. This will fully validate the NSG Json prior to production deployment.
  5. Deploy to Production during a planned change window.

From JSONback to CSV

At times, a team member may change the portal, for example, during troubleshooting. Once an update is made in the portal, transfer Azure changes back to the code that defines this infrastructure. The CSV files are the canonical source, so there needs to be a process to return to CSV from JSON.

  1. To retrieve the NSG Security Rules from the portal execute a CLI command to retrieve NSG security rules and export them to a JSON File.
    az network nsg rule list --nsg-name subnet-01-nsg --resource-group net-rgp-01 | set-content subnet-01-export.json
  2. Execute the Convert-NsgJsonToCsv.ps1 script using the generated file as the input and the corresponding CSV file as the output.

Constraints

The environment these scripts were built for may not match your own. This environment includes several constraints:

  • Azure Resource Manager Templates are the language for Azure Infrastructure as Code.
  • Manual steps are required: automated build and release pipelines are not yet available.
  • There is no guarantee that NSG security rules will not be modified in the Azure Portal, so a mechanism is required to synchronize the code with the environment.

Future Improvements

This solution represented a significant improvement for this team instead of managing NSG security rules directly in the JSON format. As with every answer, there are ideas on how to improve. Here are a few that have come to mind:

  • Use CI/CD tools such as GitHub Actions to automatically execute the Convert-NsgJsonToCsv.ps1 script when an NSG CSV file is committed.
  • Implement a release pipeline so that modified NSG Csv files trigger the conversion script, wait for approval to deploy, and deploy the ARM Template to the dev/test environment.
  • Add Pester tests to the PowerShell scripts.
  • Try this approach with other IaC languages such as Terraform.

Additional Notes

  • The example template has been dramatically simplified.
    • The production template also configures NSG Diagnostic Settings and NSG Flow Logs.
    • The production template builds all resource names based on several segments defined in a naming convention.
  • There are NSG Security Rules that are considered baseline rules that should be applied to every NSG. These rules are managed in a CSV file and placed in an array in the base template and not repeated in each parameter file. An example of this is a rule that allows all servers to contact the organization’s DNS servers.
  • Application Security Groups are used to group servers in the local VNET so that NSG Security Rules do not need to include IP addresses for servers contained in the VNET. The only IP Address Prefixes specified directly in our rules are from outside the current VNET. As with the NSGs, this template defines ASGs in the template (baseline) and parameters file (local) combined and created during template deployment. Only the unique portion of the name is used to define the group, and to specify rules. The remainder of the term is built during deployment. ASGs in Azure are currently only valid for the VNET where they are created, and only one ASG may be specified per security rule. This script creates all the ASGs defined in the template and parameters file.

Code

The code for these scripts including the conversion scripts and a sample ARM Template, ARM Template Parameters files, and matching NSG Security Rule CSV files is available on GitHub: https://github.com/matthew-dupre/azure-arm-nsg-via-csv

Code Scripts

Databricks provides a robust notebook environment that is excellent for ad-hoc and interactive access to data. However, it lacks robust software development tooling. Databricks Connect and Visual Studio (VS) Code can help bridge the gap. Once configured, you use the VS Code tooling like source control, linting, and your other favorite extensions and, at the same time, harness the power of your Databricks Spark Clusters.

Configure Databricks Cluster

Your Databricks cluster must be configured to allow connections.

  1. In the Databricks UI edit your cluster and add this/these lines to the spark.conf:
    spark.databricks.service.server.enabled true
    spark.databricks.service.port 8787
  2. Restart Cluster

Configure Local Development Environment

The following instructions are for Windows, but the tooling is cross-platform and will work wherever Java, Python, and VSCode will run.

  • Install Java JDK 8 (enable option to set JAVA_HOME) – https://adoptopenjdk.net/
  • Install Miniconda (3.7, default options) – https://docs.conda.io/en/latest/miniconda.html
  • From the Miniconda prompt run (follow prompts):Note: Python and databricks-connect library must match the cluster version. Replace {version-number} with version i.e. Python 3.7, Databricks Runtime 7.3
    “` cmd
    conda create –name dbconnect python={version-number}
    conda activate dbconnect
    pip install -U databricks-connect=={version-number}
    databricks-connect configure
    “`
  • Download http://public-repo-1.hortonworks.com/hdp-win-alpha/winutils.exe to C:\Hadoop
    From command prompt run:

    “` cmd
    setx HADOOP_HOME “C:\Hadoop\” /M
    “`
  • Test Databricks connect. In the Miniconda prompt run:
    “` cmd
    databricks-connect test
    “`

You should see an “* All tests passed.” if everything is configured correctly.

  • Install VSCode and Python Extension
    •  https://code.visualstudio.com/docs/python/python-tutorial
    • Open Python file and select “dbconnect” interpreter in lower toolbar of VSCode
  • Activate Conda environment in VSCode cmd terminal
    From VSCode Command Prompt:
    This only needs to be run once (replace username with your username):
    “` cmd
    C:\Users\{username}\Miniconda3\Scripts\conda init cmd.exe
    “`
    Open new Cmd terminal
    “` cmd
    conda activate dbconnect
    “`
    Optional: You can run the command ` databricks-connect test` from Step 5 to insure the Databricks connect library is configured and working within VSCode.

Run Spark commands on Databricks cluster

You now have VS Code configured with Databricks Connect running in a Python conda environment. You can use the below code to get a Spark session and any dependent code will be submitted to the Spark cluster for execution.

“` python
from pyspark.sql import SparkSession
spark = SparkSession.builder.getOrCreate()
“`
Once a context is established, you can interactively send commands to the cluster by selecting them and right-click “Run Select/Line in Python Interactive Window” or by pressing Shift+Enter.

Context established to send commandsThe results of the command executed on the cluster will display in the Visual Studio Code Terminal. Commands can also be executed from the command line window.

Executed Command Cluster

Summary

To recap, we set up a Python virtual environment with Miniconda and installed the dependencies required to run Databricks Connect. We configured Databricks Connect to talk to our hosted Azure Databricks Cluster and setup Visual Studio code to use the conda command prompt to execute code remotely. Now that you can develop locally in VS Code, all its robust developer tooling can be utilized to build a more robust and developer-centric solution.

DevOps implements a Continuous Integration/Continuous Delivery (CI/CD) process. When multiple team members work in the same codebase, anyone’s update could break the integrated code. So, Continuous Integration is to trigger a build pipeline whenever a code update is pushed. The build pipeline will fail if the newly updated code is incompatible with the existing codebase if there are any conflicts. The codebase might work well within a single developer environment, but in a build pipeline where all configurations and dependencies are expected to be in place can fail. Continuous Delivery speeds up the deployment process. The release pipeline helps to deploy the same code base to multiple environments based on configurations. This helps to support code to be deployed in all environments without many manual changes.

Having an approval process helps peer code reviews, identifies potential issues, and any security flaws ahead of time. The current production applications are very distributed and complex. Whether it is an on-premise or cloud-based solution, missing a dependency or proper configurations could cost significant risk in deployments. DevOps helps to maintain the same code base for repeatable deployment in many environments with just configuration changes. DevOps avoids manually building the deployment packages and handing over to the operations team who would not have insights on what is being deployed. If an error occurs during deployment or post-deployment, then the development team jumps in at that time, which is time-consuming. This will cost in production timeline and end up with some unhappy customers also!
DevOps ImagePicture credit: DoD DevOps

Popular DevOps Tools

Follow here to learn more about DevOps practices from other AIS bloggers!

Why not just “DevOps”?

DevOps is fundamental for any organization’s build and deployment process with seamless CI/CD integration. Then, what is ‘DevSecOps’ and why is ‘Sec’ added between Dev and Ops. The ‘Sec’ in DevSecOps is ‘Security.‘ Though it’s added in between, security implementation should start from Development and continue in Operations. As development and deployment packages add many dependencies from both internal and external, this could introduce vulnerabilities. It could cost severe issues in production if not identified earlier in the build pipeline. Code scans help identify possible weaknesses in code implementations. But for any cybersecurity-related vulnerabilities, only specific tools at different stages of the pipeline must be used to identify as early as possible. Adding security scanning earlier in the pipeline and automating are essential for DevSecOps.

DevSecOps Software Lifecycle

Picture Credit: DoD DevSecOps

DevSecOps is not a tool or pattern but a practice and can be enhanced by adding appropriate tools. It is a process in securing the build and deployment by using several security tools by shifting security to the left. These security tools help to identify vulnerabilities that the code could have introduced, recommend possible solutions to fix those issues, and in some instances, the tools can mitigate some of those issues as well. This is to use the ‘fail fast’ method to identify vulnerabilities earlier in the build pipeline. As more applications moved into the cloud, it is highly imperative to follow Cloud Native Computing Foundation (CNCF) certified tools and implement security benchmarks that provided CIS benchmarks. DevSecOps avoids manual changes once the code is in the pipeline, deployed, and deployed. The codebase will be a single source of truth and should not be manipulated at any point.

Adding scanning tools for security and vulnerabilities helps to mitigate any flaws introduced in code and operations. Many open-source tools provide these functionalities. Enabling logging, continuous monitoring, alerting processes, and any self-fix for faster remediation are key for ongoing business operations. Containerizing with hardened container images from DoD Iron Bank helps to protect application container images. Hardened images can be kept up to date from reliable providers. Containers provide cloud-agnostic and no vendor lock-in solutions.

All the security tools in the DevSecOps pipeline must be deployed and running for pipeline scanning in the customer environment. A request will be sent to those security tools from the pipeline code via API request or trigger command-line interface (CLI) commands. Those tools then respond with their findings, statistics, and provide pass/fail criteria. If a tool identifies any vulnerability findings in the scan, then the pipeline will fail.

Deploying the security tools as SaaS services will require permission from the security team. Not all are approved to run in highly secured cloud environments. Those tools all need to be Authority to Operate (ATO) to deploy and configure. Whereas getting the hardened container images for those tools is a safer and secure approach to deploy those tools in the cloud. As the containers are already hardened, which means scanned, secured, and ready to go with all dependencies, they will provide continuous ATO. The hardened container images can be downloaded from DoD Iron Bank, and almost all tool providers provide container images. Many of these providers have different downloads, whether as a software download or a container image. When downloading as a software image, additional tasks to be done to ensure all the dependencies are appropriately configured or should pre-exist. Simultaneously, downloading as hardened container images comes with dependencies and are pre-scanned. The tools can be deployed into Kubernetes in your cloud environment to provide scalable functionality.

Below is a sample DevSecOps pipeline implementation with recommended security tools, as depicted in the picture below:

  • Source code pull request is approved by reviewers
  • The build pipeline kicks off and code scan is run after a successful initial build
    • If any code vulnerabilities are identified, then the pipeline fails
  • Build pipeline continues with DAST and PEN testing
    • If any vulnerabilities are identified, then the pipeline fails
  • Build artifacts are added to private repository either as packages or container
    • Repository scan is performed using repository scanning tools and vulnerabilities are reported
  • Release pipeline picks up artifacts from private repositories and deploys to Azure (or cloud of your choice)
    • Kubernetes is a highly recommended deployment for orchestration, but deployment can be an application of your choice such as Function App, App Service, Azure Container Instances, etc.
  • Security has been applied throughout the pipeline process and will continue once the application is deployed. Both native security tools such as Azure Monitor, Azure Security Center, Azure Policies, etc., and third-party tools such as Twistlock, Qualys, etc. Can be used to monitor the health of your production environment.DevSecOps Diagram

Let’s look at a few of the recommended tools to support the security validations in the DevSecOps process.

Build tools/CLI

A developer can write their code in their favorite editor such as Visual Studio, VS Code, and run/execute to test their applications. The code editor also generates debug/release packages generating binaries using the build tool that comes with the editor. The application works seamlessly from the developer environment as the dependencies and correct configurations exist. For the build to work in the pipeline, the build tool must be available to build the code. Based on the code language, the build tool varies, and they must be available in the pipeline.

Some of the build tools are:

  • DotNet Build
  • MSBuild
  • Maven
  • Gradle

Static Application Security Testing (SAST)

A code scan is one of the essential steps in securing the codebase. Automated testing helps identify failures, but these specific code scan tools help identify security flaws and vulnerabilities. The application does not need to be running for code scan tools as it scans only the codebase and not any dependencies.

Some of the Code scanning tools are:

  • SonarQube
  • Fortify
  • Anchore
  • JFrog Xray
  • OpenSCAP
  • HBSS
  • OWASP dependency check

Dynamic Application Security Testing (DAST)

DAST scans the application while its running or a container image that is hosted in private repositories. Container scanning before deploying helps resolve many security vulnerabilities.

Some of the DAST scanning tools are:

Penetration (Pen) Testing

Provides Web Applications scanner to help to find security vulnerabilities. Read here to learn about, “Top 10 Web Application Security Risks”

PEN testing tools:

  • OWASP ZAP

Deploy Code & IaC (Infrastructure as Code)

IaC is paramount in DevOps to avoid any manual work in customer environments and help with immutable infrastructure.

Popular IaC tools are:

  • Azure ARM Templates
  • Terraform
  • HELM
  • Private Repositories

In DevSecOps, a private repository is recommended to host the build dependencies, reference container images, container images for tools, and the built packages or application container images. This is to keep all the artifacts together in one centralized location, and the release pipeline can continue with deployments from there.
Some of the private repositories are:
JFrog
Docker Hub
Azure Container Registry (ACR)

Private Repository Scanning

As the pipeline requires security scanning, the repositories require scanning also. These tools scan for vulnerabilities in all packages and container artifacts stored in the repository. A scan report is being sent/notified for any issues.

Some artifact scanning tools are:

  • XRay
  • SonaType
  • Azure Monitor
  • Azure Security Center

Deploy

As the recommendation to deploy the security tools with container orchestration, the same recommendation goes to deployed applications. Containers provide high security with limited ways to be affected by attackers. Sidecar containers protect by continually monitoring applications with a container security stack built-in. Applications are scalable on a demand basis using Kubernetes and tools such as Kubectl; HELM packages are used to deploy and manage K8S clusters. ArgoCD is a declarative tool specifically for Kubernetes deployment in CI/CD pipeline.

Deployments to Azure could be:

  • Azure function app
  • Azure App Service
  • Azure Container Instance
  • Azure Kubernetes Service (AKS)
  • Open Shift in Azure
  • Monitoring/Alerting

Monitoring/Alerting

As the applications deployed and running in a cloud environment, it must be continuously monitored for attacks and identify any security vulnerabilities. For containers, these tools act as sidecar containers to regularly protect main containers from attacks, and some mitigate the issue. All these tools have built-in alert/notify operations team for immediate actions.

Monitoring/alerting tools:

  • Azure Monitor
  • Azure Security Center
  • Twistlock
  • Qualys
  • Aqua Security

So, all powered up with learning DevSecOps! Follow up back here for the next blog post in container-based deployments and containers scanning in the DevSecOps pipeline!

References for continuing your DevSecOps Journey

Azure Kubernetes Service is a Microsoft Azure-hosted offering that allows for the ease of deploying and managing your Kubernetes clusters. There is much to be said about AKS and its abilities, but I will discuss another crucial role of AKS and containers, security. Having a secure Kubernetes infrastructure is a must, and it can be challenging to find out where to start. I’ll break down best practices, including baseline security for clusters and pods, and implement network hardening practices that you can apply to your own AKS environment that will lay the foundation for a more secure container environment, including how to maintain updates.

Cluster and Pod Security

Let’s first look at some best practices for securing your cluster and pods using policies and initiatives. To get started, Azure has pre-defined policies that are AKS specific. These policies help to improve the posture of your cluster and pods. These policies also allow for additional control over things such as root privileges. A best practice Microsoft recommends is limiting access to the actions that containers can provide and avoiding root/privileged escalation. When the Azure Policy Add-on for AKS is enabled, it will install a managed instance of Gatekeeper. This instance handles enforcement and validation through a controller. The controller inspects each request when a resource is created or updated. You’ll then need to validate (based on your policies). Features such as these are ever-growing and can make creating a baseline easier. Azure Policy also includes a feature called initiatives. Initiatives are collections of policies that align with organizational compliance goals. Currently, there are two built-in AKS initiatives which are baseline and restricted. Both come with many policies that lockdown items, such as limiting the host filesystem, networking, and ports. By combining both initiatives and policies, you can tighten security and meet compliance goals in a more managed fashion.

Another way to secure your cluster is to protect the access to the Kubernetes API-Server. This is accomplished by integrating RBAC with AD or other identity providers. This feature allows for granular access, similar to how you control access to your Azure resources. The Kubernetes API is the single connection point to perform actions on a cluster. For this reason, it’s imperative to deploy logging\auditing and to enforce the least privileged access. The below diagram depicts this process:

Cluster and Pod Security

Reference:https://docs.microsoft.com/en-us/azure/aks/operator-best-practices-cluster-security#secure-access-to-the-api-server-and-cluster-nodes

Network Security

Next, let’s look at network security and how it pertains to securing your environment. A first step would be to apply network policies. Much like above, Azure has many built-in policies that assist with network hardenings, such as using a policy that only allows for specific network traffic from authorized networks based on IP addresses or namespaces. It’s also important to note this can only occur when the cluster is first created. You also have the option for ingress controllers that access internal IP addresses. This ensures they can only get accessed from that internal network. These small steps can narrow the attack surface of your cluster and tighten traffic flows. The below diagram demonstrates using a Web Application Firewall (WAF) and an egress firewall to manage defined routing in/out of your AKS environment. Even more granular control is possible using network security groups. These allow only specific ports and protocols based on source/destination. By default, AKS creates subnet level NSGs for your cluster. As you add services such as load balancers, port mappings, and ingress routes, it will automatically modify the NSGs. This ensures the correct traffic flow and makes it easier to manage change. Overall these effortless features and policies can allow for a secure network posture.

Network Security Graphic

Reference: Microsoft Documentation

The Final Piece

The final piece of securing your AKS environment is staying current on new AKS features and bug fixes. Specifically, upgrading the Kubernetes version in your cluster. These upgrades can also include security fixes. These fixes are paramount to remain up to date on vulnerabilities that could leave you exposed. I won’t go too deep on best practices for Linux node updates or managing reboot. This link dives deeper into what Kured is and how it can be leveraged to process updates safely. There are many ways to foundationally secure your AKS clusters. I hope this article helps future implementations and maintainability of your deployment.

Introduction

As enterprises start to utilize Azure resources, even a reasonably small footprint can begin to accumulate thousands of individual resources. This means that the resource count for much larger enterprises could quickly grow to hundreds of thousands of resources.

Establishing a naming convention during the early stages of establishing Azure architecture for your enterprise is vital for automation, maintenance, and operational efficiency. For most enterprises, these aspects involve both humans and machines, and hence the naming should cater to both of them.

It would be too naive and arrogant to propose a one-size-fits-all naming convention. Each enterprise has its own unique culture, tools, and processes. So, here are seven rules for scalable and flexible Azure resource naming conventions. To emphasize, these are rules for establishing naming conventions and not the actual naming convention itself.

Rule #1: Break them Up

  • Breakup resource names into segments. Each segment may contain one or more characters to indicate a specific attribute of the resource.
  • For example: Resource name cte2-sales-prod-rgp has four segments. First segment represents Contoso’s [ct] Enterprise’s, in East US 2 [e2], production [prod] Resource Group [rgp] for Sales application [sales]
    Why This Rule: Logically partitioning resource names into segments allows for the comprehension of resource information by both machines and humans.

Rule #2: Make them Uniquely Identifiable

  • Every resource should have a unique name. Meaning, a name should only belong to a singular resource. Do not hesitate to add additional segments to make the name unique.
  • Ideally, the name should be unique globally across Azure, but if that is too hard to achieve, then at a minimum, it must be unique across all Azure Subscriptions under your Azure AD Tenant.
  • For our Contoso example form Rule # 1, using a couple of characters that identify enterprise would increase chances for Azure wide uniqueness. For cte2-sales-prod-rgp, [ct] represents Contoso enterprise. Other segments, as explained in Rule # 1, also increases uniqueness.
    Why This Rule: Following this rule will eliminate misidentification and resource name conflicts.

Rule #3: Make them Easily Recognizable

  • Names must convey ordinary, but some critical pieces of information about the resource. This rule also serves as a backstop to Rule # 1, whereby taking Rule # 1 to an extreme, one might be tempted to use something like GUID to name the resource.
  • The information may include Azure Region, Environment or Environment Category, Resource Type, etc. to name a few.
  • For our Contoso example, each segment helps with the identification of Azure Region, Application, Environment, and Resource type. All good things for recognizing the resource.
    Why This Rule: Following this rule will eliminate needing a lookup to get information, as the information is embedded in the name itself. Do not use random name generation such as GUIDs, as it might generate a unique name, but it would serve no other purpose.

Rule #4: Make Exceptions Obedient

  • Some resources may not want to fit into the convention. For those resources, establish a convention for exceptions. Don’t let exceptions dictate the overall convention.
  • For example: Storage account names cannot have non-alphanumeric characters. So, if your convention happens to use a dash to separate segments, don’t use the Storage account name and have a dash. Don’t drop a dash for all other resource names.
    Why This Rule: Following this rule prevents a Convention that is too rigid and draconian, leading to convoluted and confusing names.

Rule # 5: Know When To Stop

  • Initially, establish a naming convention for high-level resources and maybe one level deeper. Do not try to establish a naming convention for resources that are three, four, or five levels deep within a resource.
  • If there is a need, let the convention for those lower levels be established by folks who have the expertise and happen to work with them daily.
  • For example, establish a convention for Storage accounts, do not go too deep into naming Container, Blob, and Table.
    Why This Rule: It is impossible to know everything about every resource type used by an enterprise. Leaving room for future extensions is essential for resilient and scalable naming conventions. Your future self and your colleagues will thank you for it.

Rule # 6. Keep Them Handsome, Pretty & Melodic

  • Names created by convention should be pleasing to the eye and melodic to the ears. This means that you should pay special attention to following
    • Acronyms
    • Segment sizes
    • Juxtaposition of segments
    • Sequencing of segments
    • Separators
  • Go back to our Contoso example and see how you can improve so that it lives up to Rule # 6.
    Why This Rule: You will live with the names for a long time and spend a lot of time with them. So you might as well make them a pleasure to work with.

Rule # 7: Toot Your Horn, But With Open Ears

  • Document your convention where it is easily accessible and searchable such as a central Wiki. Present your convention at every opportunity. Demo real-life excellent and bad examples. Write blogs and make videos to explain.
  • But, always keep an open mind. Listen to feedback and be open to refining when needed.
    Why This Rule: Your established naming pattern is only as good as it’s last use. So practice, preach, persuade, push, peddle, profligate, pander but never be pedantic.

These rules have been battle-tested in several large enterprises over the past decade, so following these rules for flawless Azure Naming Convention.

It would not be unfair to say that Azure Storage is one of Azure’s most essential services. Almost all the other services in Azure use Azure Storage in some shape or form.

AIS has been involved with Azure since it’s beta days, under the code name Red Dog. We’ve seen Azure Storage grow from a service with a limited set of features and capabilities to a service with an extensive collection of features, supporting storage requirements of small organizations and large enterprises equally.

Given the extensive feature set, we have seen our customers sometimes struggle to choose the right kind of storage for their needs. Furthermore, at the time of this blog, it is not possible to change the type of storage account once it is created.

We intend to clear up some of the confusion through this blog post by providing a matrix of features available based on the kind of storage account.

Storage Account Kind

When you create a storage account (in a portal or by other means), you are asked to make many selections (like resource group, location, etc.). Among them, three vital selections that you’re asked to make are:

  • Desired Performance Level
  • Account Type
  • Replication/Data Redundancy

Desired Performance Level

When it comes to performance, Azure Storage offers you two options – Premium and Standard. In Premium Storage, the data is stored on Solid State Drives (SSD) versus standard Hard Disk Drives (HDD) instead of Standard Storage. Premium Storage provides you better performance in terms of IOPS (Input/Output Operations Per Second) and throughput.

Choosing the right performance level at the time of account creation becomes essential. Once a storage account is created with a performance level, it can’t be changed i.e.; you can’t change a “Standard” storage account to a “Premium” storage account and vice-versa. Furthermore, not all services are supported for all performance levels. For example, if your application makes heavy use of Storage Queues or Tables, you cannot choose “Premium” as these services are not supported.

You can learn more about the storage account performance levels here.

Account Type

Next is the account type. At the time of writing of this blog, Azure Storage supports the following types of accounts:

  • General-purpose v2 accounts
  • General-purpose v1 accounts
  • BlockBlobStorage accounts
  • FileStorage accounts
  • BlobStorage accounts

Choosing the right type of account at the time of account creation is vital because you can’t convert the type of an account once it’s created. The only exception is that you can do a one-time upgrade a general-purpose v1 account to a general-purpose v2 account.

Also, like with performance level, not all features are supported in all account types. For example, the “FileStorage” kind of accounts only supports file storage service and not a blob, queue, and table service. Another example is that you can only host a static website in general-purpose v2 and BlockBlobStorage type of accounts.

Another important consideration in choosing the right type of storage account is pricing. In our experience, general-purpose v2 accounts are more expensive than general-purpose v1 accounts, offering more features.

You can learn more about the storage account types here.

Replication/Data Redundancy

Azure Storage is strongly a consistent service and multiple copies of your data are stored to protect that data from planned and unplanned events like data center failures, transient errors, etc. At the time of writing of this blog, Azure Storage provides the following types of redundancy options:

  • Locally redundant storage (LRS)
  • Zone-redundant storage (ZRS)
  • Geo-redundant storage (GRS)
  • Geo-zone-redundant storage (GZRS)
  • Read-access geo-redundant storage (RAGRS)
  • Read-access geo-zone-redundant storage (RAGZRS)

Choosing the right redundancy kind becomes essential as it enables your application to be fault-tolerant and more available.

Again, not all redundancy kinds are supported for all storage account types. While it is true that you can change redundancy kind of a storage account on the fly, it is only possible between certain redundancy kinds. For example, you can’t change the redundancy kind of an account from Geo-zone-redundant storage (GZRS) to Geo-redundant storage (GRS).

Another good example is that you can convert Geo-zone-redundant storage (GZRS)/Read-access geo-zone-redundant storage (RAGZRS) to a Zone-redundant storage (ZRS) but not the other way around.

You can read more about the data redundancy options available in Azure Storage here.

Storage Feature Matrix

Based on the account type, performance level, and redundancy kind, we have come up with the following feature matrix.

Storage Feature

Using this matrix, you should choose the right kind of storage account to meet your needs.

Here are some examples:

  • If you need to host static websites in Azure Storage, you can either use “Premium BlockBlobStorage (LRS/ZRS)” or “Standard General-purpose v2 (LRS/GRS/RAGRS/GZRS/RAGZRS/ZRS)” type of storage account.
  • Suppose you need to archive the data for compliance or another regulatory purpose. In that case, you can either use “BlobStorage (LRS/GRS/RAGRS)” or “General-purpose v2 (LRS/GRS/RAGRS)” type of storage account.
  • If you need premium performance with your page blobs, you can either use “General-purpose v2 (LRS) or “General-purpose v1 (LRS)” type of storage account.
  • If you need a premium performance of your file shares, the only option you have is the “Premium FileStorage (LRS/ZRS)” type of storage account.

Summary

We hope that you find this blog post useful and will use the feature matrix the next time you have a need to create a storage account.

Feel free to reach out to us if we can be of any assistance with your Azure projects. You can contact us online here!

Introduction

Unfortunately, Azure DevOps does not have a SaaS offering running in Azure Government. The only options are to spin up Azure DevOps server in your Azure Government tenant or connect the Azure DevOps commercial PaaS offering (specifically Azure Pipelines) to Azure Government. Your customer may object to the latter approach; the purpose of this post is to provide you with additional ammunition in making a case that you can securely use commercial Azure DevOps with Azure Government.

Throughout this blog post, the biggest question you should always keep in mind is where is my code running?

Scenario

Take a simple example; a pipeline that calls a PowerShell script to create a key vault and randomly generates a secret to be used later (such as for a password during the creation of a VM)

add-type -AssemblyName System.Web
$rgName = "AisDwlBlog-rg"
$kvName = "AisDwlBlog-kv"
$pw = '"' + [System.Web.Security.Membership]::GeneratePassword(16, 4) + '"'
az group create --name $rgName --location eastus --output none
az keyvault create --name $kvName --resource-group $rgName --output none
az keyvault set-policy --name $kvName --secret-permissions list get --object-id 56155951-2544-4008-9c9a-e53c8e8a1ab2 --output none
az keyvault secret set --vault-name $kvName --name "VMPassword" --value $pw

The easiest way we can execute this script is to create a pipeline using a Microsoft-Hosted agent with an Azure PowerShell task that calls our deployment script:

pool:
  name: 'AisDwl'

steps:
- task: AzureCLI@2
  inputs:
    azureSubscription: 'DwlAzure'
    scriptType: 'ps'
    scriptLocation: 'scriptPath'
    scriptPath: '.\Deploy.ps1'

When we execute this, note that output from that PowerShell script flows back into the Pipeline:

Deployment Output

To circle back to that question that should still be in your mind….where is my code running? In this case, it is running in a Virtual Machine or container provided by Microsoft. If you have a customer that requires all interactions with potentially sensitive data to be executed in a more secure environment (such as IL-4 in Azure Government), you are out of luck as that VM/Container for the hosted build agent is not certified at any DoD Impact Level. Thus, we have to look at other options, where our deployment scripts can run in a more secure environment.

I’ll throw a second wrench into things…did you see the bug in my Deploy.ps1 script above? I forgot to add --output none to the last command (setting the password in the key vault). When I run the pipeline, this is shown in the output:

Secret Visible

Not good! In an ideal world, everyone writing these scripts would be properly handling output, but we need to code defensively to handle unintended situations. Also, think about any error messages that might bubble back to the output of the pipeline.

Option 1

Azure Pipelines provide the capability to run pipelines in self-hosted agents, which could be a VM or container-managed by you/your organization. If you set up this VM in a USGov or DoD region of Azure, your code is running in either an IL-4 or IL-5 compliant environment. However, we can’t simply spin up a build agent and call it a day. As with the Microsoft-hosted build agent, the default behavior of the pipeline still returns output to Azure DevOps. If there is ever an issue like I just demonstrated, or an inadvertent Write-Output or Write-Error, or an unhandled exception containing sensitive information, that will be displayed in the output of the pipeline. We need to prevent that information from flowing back to Azure Pipelines. Fortunately, there is a relatively simple fix for this: instead of having a task to execute your PowerShell scripts directly, create a wrapper/bootstrapper PowerShell script.

The key feature of the bootstrapper is that it executes the actual deployment script as a child process and captures the output from that child process, preventing any output or errors from flowing back into your Pipeline. In this case, I am simply writing output to a file on the build agent, but a more real-world scenario would be to upload that file to a storage account.

try
{
	& "$PSScriptRoot\Deploy.ps1" | Out-File "$PSScriptRoot\log.txt" -append
	Write-Output "Deployment complete"
}
catch
{
	Write-Error "there was an error"
}

The biggest disadvantage of this approach is the additional administrative burden of setting up and maintaining one (or more) VMs/containers to use as self-hosted build agents.

Option 2

If you would prefer to avoid managing infrastructure, another option is to run your deployment scripts in an Azure Automation Account. Your Pipeline (back to running in a Microsoft-hosted agent) starts an Azure Automation Runbook to kick off the deployment. The disadvantage of this approach is that all of your deployment scripts must either be staged to the Automation Account as modules or converted into “child” runbooks to be executed by the “bootstrapper” runbook. Also, keep in mind that the bootstrapper runbook must take the same preventative action of capturing output from any child scripts or runbooks to prevent potentially sensitive information from flowing back to the Pipeline.

Sample code of calling a runbook:

$resourceGroupName = "automation"
$automationAccountName = "dwl-aaa"
$runbookName = "Deployment-Bootstrapper"
                    
$job = Start-AzAutomationRunbook -AutomationAccountName $automationAccountName -ResourceGroupName $resourceGroupName -Name $runbookName -MaxWaitSeconds 120 -ErrorAction Stop
                    
$doLoop = $true
While ($doLoop) {
    Start-Sleep -s 5
    $job = Get-AzAutomationJob -ResourceGroupName $resourceGroupName –AutomationAccountName $automationAccountName -Id $job.JobId
    $status = $job.Status
    $doLoop = (($status -ne "Completed") -and ($status -ne "Failed") -and ($status -ne "Suspended") -and ($status -ne "Stopped"))
}
                    
if ($status -eq "Failed")
{
    Write-Error "Job Failed"
}

The Deployment script code running as an Azure Automation Runbook (Note that this has been converted to Azure PowerShell as the AzureCLI isn’t supported in an Automation Account Runbook):

$Conn = Get-AutomationConnection -Name AzureRunAsConnection
Connect-AzAccount -ServicePrincipal -Tenant $Conn.TenantID -ApplicationId $Conn.ApplicationID -CertificateThumbprint $Conn.CertificateThumbprint

add-type -AssemblyName System.Web
$rgName = "AisDwlBlog-rg"
$kvName = "AisDwlBlog-kv"
$pw = [System.Web.Security.Membership]::GeneratePassword(16, 4)

$rg = Get-AzResourceGroup -Name $rgName
if ($rg -eq $null)
{
	$rg = New-AzResourceGroup -Name $rgName -Location EastUs
}

$kv = Get-AzKeyVault -VaultName $kvName -ResourceGroupName $rgName
if ($kv -eq $null)
{
	$kv = New-AzKeyVault -Name $kvName -ResourceGroupName $rgName -location EastUs
}
Set-AzKeyVaultAccessPolicy -VaultName $kvName -PermissionsToSecrets list,get,set -ServicePrincipalName $Conn.ApplicationID

$securePw = ConvertTo-SecureString -String $pw -AsPlainText -Force
Set-AzKeyVaultSecret -VaultName $kvName -Name "VMPassword" -SecretValue $securePw

Leveraging the Power Platform and Microsoft Azure to connect housing agencies and intermediaries with the Department of Housing and Urban Development (HUD)

Intro

The US Department of Housing and Urban Development runs a program known as the Housing Counseling Program that aids housing agencies around the nation through things like grants and training. These agencies provide tremendous benefit to Americans in many forms including credit counseling, foreclosure prevention, predatory lending awareness, homelessness, and more. One of the requirements for approval into the program is that the agency uses a software system that is also approved by HUD and interfaces with HUD’s Agency Reporting Module SOAP web service.

Original System

The original system is built on Dynamics 365 Online using custom entities, web resources, and plug-ins. Complex batching functionality was implemented to deal with the two-minute timeout period that plug-ins have, web resource files (JavaScript, HTML, and CSS) were stuck in a solution with no real structure, and application lifecycle management (ALM) was extremely painful. The original system design also doesn’t account for the intermediary-to-agency relationship, specifically the roll-up reporting and auditing that intermediaries desperately need for providing their compiled agency data to HUD, as well as their own internal reporting. Finally, because the solution was built using Dynamics 365 Online Customer Engagement, licensing costs were more than double what we knew they could be with Microsoft’s new Power Apps licenses and Azure pay-as-you-go subscriptions.

The Original Diagram

Figure 1 – current CMS system build on Dynamics 365 Online

Definition

Intermediaries – organizations that operate a network of housing counseling agencies, providing critical support services to said agencies including training, pass-through funding, and technical assistance.

Modern System

Whereas the data schema for the solution remains mostly unchanged, the architecture of the system has changed profoundly. Let’s check out some of the bigger changes.

Power Apps Component Framework (PCF)

The Power Apps Component Framework enables developers to create code components that can run on model-driven and canvas apps and can be used on forms, dashboards, and views. Unlike traditional web resources, PCF controls are rendered in the same context and at the same time as any other component on a form. A major draw for developers is that PCF controls are created using TypeScript and tools like Visual Studio, and Visual Studio Code. In the modern solution, web resources are replaced with PCF controls that make calls directly to the Power Apps Web API.

Azure Resources (API Management, Function Apps, Logic Apps, and a Custom Connector)

A Custom Connector and Azure API Management are used to manage and secure various APIs that are used in the system. The connector can be used in any Azure Logic App to make connecting to an API much easier than having to configure HTTP actions. All of the business logic that once lived in plug-ins has been replaced with a combination of Azure Logic Apps and Azure Function Apps. This has provided incredible gains where development and testing are concerned, but it also provides a more secure and scalable model to support these agencies as they grow. Lastly, it removes the burden experienced with the two-minute time-out limitation that plug-ins have.

ALM with Azure DevOps Services and Power Apps Build Tools

Azure DevOps Services, coupled with the Power Apps Build Tools, are being used to ease the pain we experienced with ALM prior to these tools being available. Now we can easily move our Power App solutions across environments (e.g. dev, test, production) and ensure our latest solution code and configurations are always captured in source control.

ALM with Azure

Figure 2 – modern solution using Power Apps and Azure resources for extreme scalability and maintainability

Next Steps

I hope by demonstrating this use case you’re inspired to contact your Microsoft partner, or better yet contact us and let us work with you to identify your organization’s workloads and technology modernization opportunities.

Ready to dig a little deeper into the technologies used in this blog post? Below we have some hands-on labs for you. Stay tuned for more updates!

  1. Power Apps Component Framework Controls
    As a developer, PCF Controls has been one of the most exciting things to grace the Power Platform landscape. In this video, I’ll show you how to go from zero to a simple PCF control that calls the Power Platform Web API. I also provide a Pro Tip or two on how I organize my PCF controls in my development environment.

  2. Using an Azure Logic App to Query Data in an On-Premises SQL Server Database
    Companies are embracing modernization efforts across the organization, quickly replacing legacy apps with Power Apps and Azure resources. This can’t all happen at once though, and often times companies either can’t or won’t move certain data to the cloud. In this hands-on lab, I’ll walk you through using an Azure Logic App and an On-Premises Data Gateway to connect to a local SQL Server 2017 database to query customer data using an HTTP request and response.
  3. Azure API Management and Custom Connectors
    API’s give you the ability to better secure and manage your application interfaces. Couple an API with a Custom Connector and you not only better secure and manage, but also make it super easy for other app developers across the organization to connect to your back-end resources through the API, without having to worry about all of the API configurations.
  4. Power Apps DevOps Using Best Practices
    There’s more to DevOps than just backing up your Power App solutions in source control and automating your build and release pipeline. Good solution management and best practices around publishers, entity metadata, components, and subcomponents are also key to manageability and scalability with your ALM.