Azure Kubernetes Service is a Microsoft Azure-hosted offering that allows for the ease of deploying and managing your Kubernetes clusters. There is much to be said about AKS and its abilities, but I will discuss another crucial role of AKS and containers, security. Having a secure Kubernetes infrastructure is a must, and it can be challenging to find out where to start. I’ll break down best practices, including baseline security for clusters and pods, and implement network hardening practices that you can apply to your own AKS environment that will lay the foundation for a more secure container environment, including how to maintain updates.

Cluster and Pod Security

Let’s first look at some best practices for securing your cluster and pods using policies and initiatives. To get started, Azure has pre-defined policies that are AKS specific. These policies help to improve the posture of your cluster and pods. These policies also allow for additional control over things such as root privileges. A best practice Microsoft recommends is limiting access to the actions that containers can provide and avoiding root/privileged escalation. When the Azure Policy Add-on for AKS is enabled, it will install a managed instance of Gatekeeper. This instance handles enforcement and validation through a controller. The controller inspects each request when a resource is created or updated. You’ll then need to validate (based on your policies). Features such as these are ever-growing and can make creating a baseline easier. Azure Policy also includes a feature called initiatives. Initiatives are collections of policies that align with organizational compliance goals. Currently, there are two built-in AKS initiatives which are baseline and restricted. Both come with many policies that lockdown items, such as limiting the host filesystem, networking, and ports. By combining both initiatives and policies, you can tighten security and meet compliance goals in a more managed fashion.

Another way to secure your cluster is to protect the access to the Kubernetes API-Server. This is accomplished by integrating RBAC with AD or other identity providers. This feature allows for granular access, similar to how you control access to your Azure resources. The Kubernetes API is the single connection point to perform actions on a cluster. For this reason, it’s imperative to deploy logging\auditing and to enforce the least privileged access. The below diagram depicts this process:

Cluster and Pod Security

Reference:https://docs.microsoft.com/en-us/azure/aks/operator-best-practices-cluster-security#secure-access-to-the-api-server-and-cluster-nodes

Network Security

Next, let’s look at network security and how it pertains to securing your environment. A first step would be to apply network policies. Much like above, Azure has many built-in policies that assist with network hardenings, such as using a policy that only allows for specific network traffic from authorized networks based on IP addresses or namespaces. It’s also important to note this can only occur when the cluster is first created. You also have the option for ingress controllers that access internal IP addresses. This ensures they can only get accessed from that internal network. These small steps can narrow the attack surface of your cluster and tighten traffic flows. The below diagram demonstrates using a Web Application Firewall (WAF) and an egress firewall to manage defined routing in/out of your AKS environment. Even more granular control is possible using network security groups. These allow only specific ports and protocols based on source/destination. By default, AKS creates subnet level NSGs for your cluster. As you add services such as load balancers, port mappings, and ingress routes, it will automatically modify the NSGs. This ensures the correct traffic flow and makes it easier to manage change. Overall these effortless features and policies can allow for a secure network posture.

Network Security Graphic

Reference: Microsoft Documentation

The Final Piece

The final piece of securing your AKS environment is staying current on new AKS features and bug fixes. Specifically, upgrading the Kubernetes version in your cluster. These upgrades can also include security fixes. These fixes are paramount to remain up to date on vulnerabilities that could leave you exposed. I won’t go too deep on best practices for Linux node updates or managing reboot. This link dives deeper into what Kured is and how it can be leveraged to process updates safely. There are many ways to foundationally secure your AKS clusters. I hope this article helps future implementations and maintainability of your deployment.

In this blog post, I would like to introduce and explain Azure’s new service offering for Application Gateway Ingress Controller (AGIC) for Azure Kubernetes Services (AKS). AKS is gaining momentum within the enterprise and Microsoft has been refining the offering to drive adoption. This feature piques my interest because of its capability to improve latency and allow for better application performance within AKS Clusters. I will first discuss the basics of the application gateway Ingress controller. Then, I will dive into the underlying architecture and network components that can provide performance benefits. In addition, I will also explain the improvements and differences this offers over the already existing in-cluster ingress controller for AKS. Finally, I will discuss the new application gateway features that Microsoft is developing to refine the service even further.

In definition, the AGIC is a Kubernetes application that is like Azure’s L7 Application Gateway load balancer by leveraging features such as:

  • URL routing
  • Cookie-based affinity
  • SSL termination or end-to-end SSL
  • Support for public, private, hybrid web sites
  • Integrated Web Application Firewall (WAF)

The controller runs in its own pod on your AKS. It monitors and communicates routing changes to your AKS through the Azure Resource Manager to the Application Gateway which allows for uninterrupted service to the AKS no matter the changes. Below is a high-level design of how the AGIC is deployed along with the application gateway and AKS, including the Kubernetes API server.

Azure Resource Manager

Next, let’s take a closer look at the AGIC and breakdown how it differs from that of the already existing In-Cluster Ingress Controller. The image below shows some distinct differences between the two. The in-cluster load balancer performs all the data path operations leveraging the AKS clusters compute resources, because of this it is competing for the same resources as the application in the cluster.

In-Cluster Ingress Controller

In terms of networking features, there are 2 different models you can leverage when configuring you AKS. They are Kubenet and Azure CNI:

Kubenet

This is the default configuration for AKS cluster creation, each node receives an IP address from the Azure virtual network subnet. Pods then get an IP address from a logically different address space to that of the Azure virtual network subnet. They also use Network Address Translation (NAT) configuration to allow the Pods to the communication of other resources in the virtual network. If you want to dive deeper into Kubenet this document is very helpful. Overall these are the high-level features of Kubenet:

  • Conserves IP address space.
  • Uses Kubernetes internal or external load balancer to reach pods from outside of the cluster.
  • You must manually manage and maintain user-defined routes (UDRs).
  • Maximum of 400 nodes per cluster.
  • Azure Container Networking Interface (ACNI)

The other networking model is utilizing Azure Container Networking Interface (ACNI). This is how the AGIC works, every pod gets an IP address from their own private subnet and can be accessed directly, because of this these IP addresses need to be unique across your network space and should be planned in advance. Each node has a configuration parameter for the maximum number of pods that it supports. The equivalent number of IP addresses per node is then reserved upfront for that node. This approach requires more planning, as can otherwise lead to IP address exhaustion or the need to rebuild clusters in a larger subnet as your application demands grow.

Azure Virtual Network

From the Azure documentation, they also outline the advantages of using ACNI.

Azure CNI

Pods get full virtual network connectivity and can be directly reached via their private IP address from connected networks.
Requires more IP address space.

Performance

Based on the network model of utilizing the ACNI, the Application Gateway can have direct access to the AKS pods. Based on the Microsoft documentation because of this, the AGIC can achieve up to a 50 percent lower network latency versus that of in-cluster ingress controllers. Since Application Gateway is also a managed service it backed by Azure Virtual machine scale sets. This means instead of utilizing the AKS compute resources for data processing like that of the in-cluster ingress controller does the Application Gateway can leverage the Azure backbone instead for things like autoscaling at peak times and will not impact the compute load of you ASK cluster which could impact the performance of your application. Based on Microsoft, they compared the performance between the two services. They set up a web app running 3 nodes with 22 pods per node for each service and created a traffic request to hit the web apps. In their findings, under heavy load, the In-cluster ingress controller had approximately 48 percent higher network latency per request compared to the AGIC.

Conclusion

Overall, the AGIC offers a lot of benefits for businesses and organizations leveraging AKS from improving network performance to handling the on the go changes to your cluster configuration when they are made. Microsoft is also building from this offering by adding more features to this service, such as using certificates stored on Application Gateway, mutual TLS authentication, gRPC, and HTTP/2. I hope you found this brief post on Application Gateway Ingress-Controller helpful and useful for future clients. If you want to see Microsoft’s whole article on AGIC click here. You can also check out the Microsoft Ignite video explaining more on the advancements of Application Gateway.

I recently had the privilege and opportunity to attend this year’s DEF CON conference, one of the world’s largest and most notable hacker conventions, held annually in Las Vegas. Deciding what talks and sessions to attend can be a logistics nightmare for a conference that has anywhere between 20,000 – 30,000 people in attendance, but I pinpointed the ones that I felt would be beneficial for myself and AIS.

During the conference, Tanya Janca, a cloud advocate for Microsoft, and Teri Radichel from 2nd Sight Lab did a presentation on “DIY Azure Security Assessment” that dove into how to verify the security of your Azure environments. More specifically they went into detail on using Azure Security Center, and setting scope, policies, and threat protection. With this post, I want to share what I took away from the talk I found most helpful.

Security in Azure

Security is a huge part of deploying any implementation in Azure and ensuring fail-safes are in place to stop attacks before they occur. I will break down the topics I took away that can help you better understand and perform your own security assessment in Azure along with looking for vulnerabilities and gaps.

The first step in securing your Azure environment is to find the scope at which you are trying to assess and protect. This could also include things external to Azure, such as hybrid solutions with on-premises. These items include the following:

  • Data Protection
  • Application Security
  • Network Security
  • Access Controls
  • Cloud Security Controls
  • Cloud Provider Security
  • Governance
  • Architecture

Second, is using the tools and features within Azure in order to accomplish this objective. Tanya and Teri started out by listing a few key features that every Azure implementation should use. This includes:

  • Turning on Multi-Factor Authentication (MFA)
  • Identity and Access Management (IAM)
    • Roles in Azure AD
    • Policies for access
    • Service accounts
      • Least privilege
    • Account Structure and Governance
      • Management Groups
      • Subscriptions
      • Resource Groups

A key item I took away from this section was allowing access at the “least privileged” level using service accounts, meaning only the required permissions should be granted when needed using accounts that are not for administrative use. Along with tightening access, it’s also important to understand at what level to manage this governance. Granting access at a management group level will cast a wider and more manageable net. A more defined level, such as a subscription level, could help with segregation of duties but this is heavily based on the current landscape of your groups and subscription model.

The Center for Internet Security (CIS)

So maybe now you have an understanding of what scope you want to assess the security of your Azure environment at, but do not know where to start. This is where The Center for Internet Security (CIS) can come into play. CIS is crowd-sourced security for best practices and threat prevention which includes members such as corporations, governments, and academic institutions. It was initially intended for on-premises use. However, as the cloud has grown so has the need for increased security. CIS can help you decide what best practices you should follow based on known threat vectors; these include 20 critical controls broken down into the following 3 sections:

Basic Center for Internet Security Controls

Examples of these CIS control practices could be:

  • Inventory and Control of Hardware Assets by utilizing a software inventory tool
  • Controlled Use of Administrative Privileges by setting up alerts and logs

An additional feature is the CIS Benchmark which has recommendations for best practices in various platforms and services, such as Microsoft SQL or IIS. Plus it’s free! Another cool feature that CIS offers is within the Azure Marketplace. They have pre-defined system images that are already hardened for these best practices.

CIS Offers in Azure Marketplace

The figure below shows an example benchmark for control practice that gives you the recommendation to “Restrict access to Azure AD administration portal.” This will then output audits that show what steps need to be taken to be within the scope of that best practice.

Control Practice to Restrict access to Azure AD administration portal

Azure Security Center (ASC)

In this next section, I detail the features of Azure Security Center (ASC) that I took away from this presentation and how to get started using ASC. The figure below is of the dashboard. As you can see, there are a lot of options inside the ASC dashboard, including sections such as Policy & Compliance and Resource Security Hygiene. The settings inside of those can dive deeper into resources all the way down to the VM or application level.

Azure Security Center Dashboard

Making sure you have ASC turned on should be your first step when implementing the features within it. The visuals you get in ASC are very helpful, including things like subscription coverage and your security score. Policy management is also a feature with ASC to use pre-defined and custom rules to keep your environment within the desired compliance levels.

Cloud Networking

Your network design in Azure plays a crucial role in securing against incoming attacks, including more than just closing ports. When you build a network with security in mind you not only limit your attack surface but also make spotting vulnerabilities easier; all while making it harder for attackers to infiltrate your systems. Using Network Security Groups (NSGs) and routes can also help by allowing only the required ports. You can also utilize Network Watcher to test these effective security rules. Other best practices include not making RDP, SSH, and SQL accessible from the internet. At a higher-level, below are some more networking features and options to secure Azure including:

  • Azure Firewall
    • Protecting storage accounts
    • Using logging
    • Monitored
  • VPN/Express Route
    • Encryption between on-premises and Azure
  • Bastion Host
    • Access to host using jump box feature
    • Heavy logging
  • Advanced Threat Protection
    • Alerts of threats in low, medium and high severity
    • Unusually activities such as large amounts of storage files copied
  • Just in Time (JIT)
    • Access host only when needed in a configured time frame.
    • Select IP Ranges and ports
  • Azure WAF (Web Application Firewall)
    • Layer 7 firewall for applications
    • Utilize logging and monitoring

An additional design factor to consider is the layout of your network architecture. Keeping all your resources divided into tiers can be a great security practice to minimize risk to each component. An example would be utilizing a three-tier design. This design divides a web application into three-tier (VNets). In the figure below you can see a separate web tier, app tier, and data tier. This is much more secure because the front-end web tier can still access the app tier but cannot directly talk to the data tier which helps to minimize risk to your data.

Three Tier Network Architecture: web tier, app tier, and data tier

Logging and Monitoring

Getting the best data and analytics to properly monitor and log your data is an important part of assessing your Azure environment. For those in security roles, liability is an important factor in the ‘chain of custody’. When handling security incidents, extensive logging is required to ensure you understand the full scope of the incident. This includes having logging and monitoring turned on for the following recommended items:

  • IDS/IPS
  • DLP
  • DSN
  • Firewall/WAF
  • Load Balancers
  • CDN

The next possible way to gather even more analytics is the use of a SEIM (Security Information and Event Management) like Azure Sentinel. This just adds another layer of protection to collect, detect, investigate, and respond to threats from on-premises to multi-cloud vendors. An important note of this is to make sure you tune your SEIM, so you are detecting the threats accurately and not diluting the alerts with false positives.

Advanced Data Security

The final point I want to dive into is Advanced Data Security. The protection of data in any organization should be at the top of their list of priorities. Beginning by classifying your data is an important first step to know the sensitivity of your data. This is where Data Discovery & Classification can help in labeling the sensitivity of your data. Next is utilizing the vulnerability assessment scanning which helps assess the risk level of your databases and minimize leaks. Overall, these cloud-native tools are just another great way to help secure your Azure environment.

Conclusion

In closing, Azure has a plethora of tools at your disposal within the Azure Security Center to do your own security assessment and protect yourself, your company, and your clients from future attacks. The ASC can become your hub to define and maintain a compliant security posture for your enterprise. Tanya and Teri go into great detail the steps to take and even supply a checklist you can follow yourself to assess an Azure environment.

Checklist

  1. Set scope & only test what’s in scope
  2. Verify account structure, identity, and access control
  3. Set Azure policies
  4. Turn on Azure Security Center for all subs
  5. Use cloud-native security features – threat protection and adaptive controls, file integrity monitoring, JIT, etc.
  6. Follow networking best practices, NSGs, routes, access to compute and storage, network watcher, Azure Firewall, Express Route and Bastion host
  7. Always be on top of alerts and logs for Azure WAF and Sentinel
  8. VA everything, especially SQL databases
  9. Encryption, for your disk and data (in transit and rest)
  10. Monitor all that can be monitored
  11. Follow the Azure Security Center recommendations
  12. Then call a Penetration Tester

I hope you found this post to be helpful and make you, your company, and your clients’ experience on Azure more secure. For the full presentation, including a demo on Azure Security Center, check out this link.