Motivation
Big Compute refers to running large-scale applications that use large amounts of CPU and/or memory. These resources are provided by a cluster of computers, and the application is distributed across the cluster so that computations execute simultaneously in parallel. Problems in the financial, scientific, and engineering fields often require computations that would take days or longer on a single computer. Big Compute solutions dramatically reduce solution time, from days to hours or less, depending on how many machines are added to the compute cluster.

Big Compute differs subtly from “Big Data”: the latter is primarily about using the disk capacity and I/O performance of a cluster to analyze large volumes of data, whereas Big Compute is primarily about using the CPU power of a cluster to perform computations. In order to harness the resources of multiple machines, a Big Compute solution also requires components to handle the configuration and scheduling of the individual computations – this is usually the role of a ‘head node’ in the compute cluster.

Microsoft’s HPC (High Performance Computing) platform is a key part of its Big Compute offerings. HPC provides all the components necessary to configure, schedule, and execute computations in a distributed cluster. Microsoft’s HPC solution is supported on-premises as well as in the Azure cloud, both in an IaaS configuration and via the Azure HPC Scheduler. Since the publication of the Pluralsight course, Microsoft has continued to develop its Big Compute offerings in Azure, in particular the new Azure Batch service, which is currently in preview.
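To make the fan-out idea concrete, here is a minimal local sketch using Python’s `concurrent.futures`. In a real Big Compute cluster the head node or a scheduler (such as HPC Pack or Azure Batch) would distribute these work items across machines rather than a local process pool, and the `price_scenario` workload is purely a hypothetical stand-in; treat this only as an illustration of splitting independent computations across workers.

```python
from concurrent.futures import ProcessPoolExecutor
import math


def price_scenario(seed: int) -> float:
    """Stand-in for one independent, CPU-heavy computation
    (hypothetical workload, e.g. a financial simulation)."""
    return sum(math.sin(seed * i) for i in range(100_000))


if __name__ == "__main__":
    scenarios = range(16)
    # Fan the independent work items out across worker processes.
    # On a cluster, the head node/scheduler plays this role across machines,
    # so wall-clock time shrinks roughly with the number of workers added.
    with ProcessPoolExecutor() as pool:
        results = list(pool.map(price_scenario, scenarios))
    print(f"Completed {len(results)} scenarios")
```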