The National Science Foundation has announced an $11.9 million grant to a new high-tech computing network that will set up one of its first computing nodes at Johns Hopkins.
The node will be part of GriPhyN, a 16-university initiative led by the University of Florida. GriPhyN, which stands for Grid Physics Network, will develop and implement a new approach to storing and using the huge amounts of data generated by major research projects in physics and astronomy.
Physically, the new node will consist of several high-powered, high-tech computers in the Bloomberg Center. Conceptually, though, it will be quite a bit more revolutionary.
"We plan to eventually have a network of over a hundred heavy-duty computing nodes," says Hopkins Alumni Centennial Professor of Physics and Astronomy Alex Szalay, the principal Hopkins scientist involved in GriPhyN. "It will be set up a bit like a power grid--data will be able to flow from one node to another, and the nodes will be able to share processing power based on changing needs for resources."
Individual nodes will "deconstruct" physical or astronomical research data, reducing the data to its most basic form and storing it. A common software environment shared by all the sites will allow scientists anywhere to write programs that will use the information on the grid for research.
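The power-grid analogy Szalay draws can be illustrated with a toy scheduler. The sketch below is not GriPhyN software; the node names, capacity units and the "most spare capacity" rule are assumptions chosen only to show how work might shift between nodes as their loads change.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Node:
    """A hypothetical computing node with a fixed processing capacity."""
    name: str
    capacity: int  # assumed, arbitrary units of processing power
    load: int = 0  # units currently in use

    @property
    def spare(self) -> int:
        return self.capacity - self.load


def dispatch(nodes: List[Node], cost: int) -> str:
    """Send a job to whichever node currently has the most spare capacity."""
    target = max(nodes, key=lambda n: n.spare)
    if target.spare < cost:
        raise RuntimeError("no node has enough spare capacity")
    target.load += cost
    return target.name


def demo() -> List[str]:
    # Two illustrative nodes; the names and sizes are made up for the example.
    nodes = [Node("jhu", 10), Node("fermilab", 8)]
    # Four equal jobs alternate between the nodes as spare capacity shifts.
    return [dispatch(nodes, 4) for _ in range(4)]
```

Calling `demo()` shows the jobs alternating between the two nodes, since each placement changes which node has more room. A real grid would also have to weigh where the data lives and how fast it can move.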
Because of the huge volume of information stored on the grid, the research will be done with what Szalay calls "virtual data."
"Many data sets can be created from the data that will be stored on these nodes," Szalay explains, "but why keep them around? In a process that will be invisible to the user, data sets will be created and analyzed on the fly, and then discarded."
The new NSF grant will fund development of the software for the GriPhyN system, work that is expected to take several years. To ensure that the computer hardware for GriPhyN is as up-to-date as possible, scientists will seek new funding for hardware when the software is ready.
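As a rough illustration of the "virtual data" idea Szalay describes, a derived data set can be built from the stored basic data only when a question is asked, analyzed, and then thrown away. The sketch below is a minimal example under stated assumptions: the tiny catalog, the brightness values and every function name are hypothetical and are not part of any GriPhyN code.

```python
from typing import Callable, Dict, TypeVar

T = TypeVar("T")

# Hypothetical stand-in for the basic data a node stores:
# object id -> brightness (smaller numbers mean brighter objects).
BASE_DATA: Dict[int, float] = {1: 14.2, 2: 19.8, 3: 16.5, 4: 21.1}


def derive(predicate: Callable[[float], bool]) -> Dict[int, float]:
    """Build a derived data set on demand from the stored basic data."""
    return {obj: mag for obj, mag in BASE_DATA.items() if predicate(mag)}


def analyze_virtual(
    predicate: Callable[[float], bool],
    analysis: Callable[[Dict[int, float]], T],
) -> T:
    """Create a data set on the fly, analyze it, and keep only the result."""
    derived = derive(predicate)  # created when requested, never stored
    result = analysis(derived)   # analyzed immediately
    del derived                  # discarded; only the answer survives
    return result


def count_bright() -> int:
    # Example question: how many objects are brighter than 17.0?
    return analyze_virtual(lambda mag: mag < 17.0, len)
```

The researcher calling `count_bright()` sees only the answer. The intermediate data set never needs to be kept, because it can always be rebuilt from the basic data.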
Hopkins will receive an initial GriPhyN node because Szalay and other Hopkins faculty members are participants in the Sloan Digital Sky Survey, a cooperative effort among scientists at several laboratories and universities in the United States and Japan. The SDSS team is assembling a comprehensive atlas of a quarter of the night sky, an effort that will eventually amass about 40 terabytes of data. A terabyte is approximately 1 trillion bytes, roughly equivalent to the volume of information in the entire Library of Congress.
Unlike the two other major research projects envisioned for future GriPhyN nodes, SDSS is already gathering data. The other SDSS node will be set up at the Department of Energy's Fermilab, which heads the SDSS coalition.
In addition to his work in SDSS, Szalay is the leader of a research team pioneering new ways to cope with "information overload" from the huge quantities of information generated by modern science. The team, which includes physical and computer scientists at Hopkins, Caltech, Fermilab and the Microsoft Corporation, last year received a three-year, $2.5 million "knowledge and distributed intelligence" grant from NSF. Szalay says they are working on "clever ways" to store and create data so that it can be "easily pulled out" for use by researchers.
"Our current ways of doing science are very much based on the concept that our data sets are so small that we can sort of 'eyeball' the whole thing and locate the interesting data," Szalay says. "With the data sets we are getting in an increasing number of areas of science, this is just not going to be feasible. So we have to do something drastically different."