
The Big Data Initiative (announced on March 29, 2012) aims to improve the government’s ability to “extract knowledge and insights from large and complex collections of digital data,” especially in pursuit of science and engineering breakthroughs and discoveries. Several federal agencies are involved, including DARPA, the National Institutes of Health, and the National Science Foundation.
AIS’ own CTO, Vishwas Lele, wrote a great post for ZDNet about the impact that Obama’s big-money Big Data Initiative will have on all future federal IT projects.
The advent of Big Data can bring the tools for arbitrarily large data collection and analysis within the reach of federal agencies, even resource-bound ones. This is possible through the adoption of open source frameworks such as Hadoop or Storm, commodity hardware, and familiar SQL-like query constructs provided by tools such as Hive. For example, an ODBC database driver for Hive can import the results of a Hadoop query into Excel for further analysis, which extends the life and usefulness of the collected data and can be done affordably.
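To make the Hive-to-Excel idea concrete, here is a minimal Python sketch of the same workflow: build a SQL-like Hive query, run it over ODBC, and save the result as a CSV file that Excel can open. It is an illustration under stated assumptions, not the setup described in the ZDNet post. The table name `web_logs`, the column `agency`, and the DSN `HiveDSN` are hypothetical, and the ODBC step assumes the third-party `pyodbc` package plus a Hive ODBC driver already configured on the machine.

```python
import csv
import io


def build_hive_query(table, group_col):
    """Build a simple HiveQL aggregate: row counts per value of group_col.

    The table and column names are interpolated directly, so they must come
    from trusted code, never from user input.
    """
    return f"SELECT {group_col}, COUNT(*) AS n FROM {table} GROUP BY {group_col}"


def rows_to_csv(header, rows):
    """Serialize a header and result rows to CSV text that Excel can open."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(header)
    writer.writerows(rows)
    return buf.getvalue()


def fetch_rows(dsn, query):
    """Run the query through an ODBC data source and return rows as tuples.

    Assumption: pyodbc is installed and `dsn` names a configured Hive ODBC
    data source. The import is local so the helpers above work without it.
    """
    import pyodbc  # third-party package, not part of the standard library

    # Hive does not support transactions, so autocommit is enabled.
    conn = pyodbc.connect(f"DSN={dsn}", autocommit=True)
    try:
        cursor = conn.cursor()
        cursor.execute(query)
        return [tuple(row) for row in cursor.fetchall()]
    finally:
        conn.close()


if __name__ == "__main__":
    # Hypothetical table, column, and DSN names, for illustration only.
    query = build_hive_query("web_logs", "agency")
    rows = fetch_rows("HiveDSN", query)
    with open("agency_counts.csv", "w", newline="") as f:
        f.write(rows_to_csv(["agency", "n"], rows))
```

Excel can also connect to the same ODBC data source directly; the CSV step here simply keeps the sketch short and easy to inspect.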