The Obama Administration is betting big on Big Data: $200 million, in fact. Big Data refers to the…well, BIG and unwieldy sets of data that exceed the capabilities of conventional database systems. This data can be incredibly useful, even game-changing, but first you have to figure out how to process it properly. In the past, this simply cost too much money. Today, thanks to cloud architecture, Big Data is finally within reach for even the smallest enterprise.
The Big Data Initiative (announced on March 29, 2012) aims to improve the government’s ability to “extract knowledge and insights from large and complex collections of digital data,” especially in the pursuit of science and engineering breakthroughs and discoveries. Several federal agencies are participating, including DARPA, the National Institutes of Health, and the National Science Foundation.
AIS’ own CTO Vishwas Lele wrote a great post for ZDNet about the impact Obama’s big-money Big Data Initiative will have on future Federal IT projects.
The advent of Big Data brings the tools for arbitrarily large data collection and analysis within the reach of Federal agencies, even resource-bound ones. This is possible through the adoption of open source frameworks such as Hadoop or Storm, commodity hardware, and the familiar SQL-like query constructs provided by tools such as Hive. Using an ODBC driver for Hive to import the results of a Hadoop query into Excel for further analysis extends the life and usefulness of the data collected, and can be done affordably.
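To make the Hive workflow concrete, here is a minimal sketch of what those SQL-like query constructs look like. The table name, columns, and HDFS path below are hypothetical, chosen purely for illustration — they are not from any actual agency system:

```sql
-- Hypothetical example: table, columns, and location are illustrative only.
-- Define an external Hive table over raw tab-delimited log files in HDFS;
-- Hive reads the files in place, so no separate data load is required.
CREATE EXTERNAL TABLE agency_logs (
  event_time STRING,
  agency     STRING,
  bytes      BIGINT
)
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t'
LOCATION '/data/agency_logs';

-- A familiar SQL-style aggregation; Hive compiles this into
-- MapReduce jobs that run across the Hadoop cluster.
SELECT agency, COUNT(*) AS events, SUM(bytes) AS total_bytes
FROM agency_logs
GROUP BY agency;
```

A result set like this is exactly what an ODBC driver for Hive can hand off to Excel, where analysts can pivot and chart it with tools they already know.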