A University of North Florida chemistry professor wants scientists to work smarter, not harder.
He says it’s a matter of collecting data that already exists and organizing it in a way that’s easy to find, search through and understand.
To help his vision materialize, Stuart Chalk has been awarded a $600,000 grant from the National Science Foundation, to be spread over three years.
The framework he’s started creating, called SciData, is focused on chemical data, which is used in many industries like pharmaceuticals, agriculture, toxicology and materials.
The project aims to create a digital infrastructure that will allow data to be integrated, so humans and machines can pose complicated questions and extract new knowledge automatically.
“What this is about really is leveraging that data,” he said. “Make it into something that you can really use and hopefully in the end solve scientific problems.”
Chalk said there’s a similar movement in the healthcare industry, to try to get patients’ medical records collected in a uniform manner so they can be stored in a single system. “And doctors from wherever can actually look at that information and view it from a number of different perspectives,” Chalk said.
He wants to do that with chemistry. He said the problem is there’s too much data.
Chemical research produces chemical property data, spectral data and material characteristics. Sometimes it’s hard to keep up with what information is already out there.
“If we don’t take advantage of all the research that’s been done then people are likely to repeat it again even if it’s not necessary,” Chalk said.
He said having it easily accessible will not only provide a snapshot of what’s known about a particular area of chemistry, but could even accelerate research. Chalk suggested some scientific questions may be closer to being answered than scientists realize.
“If maybe we have all the data in one system we could potentially answer a more complicated question that scientists right now are like ‘we’re not there yet to be able to do that,’” Chalk said.
Creating the system isn’t as easy as just uploading the data, because it exists in all kinds of different formats. That’s something he and his students will be working out over the next three years.
Undergraduate students will take a chemical information science class and then be paid to work on the project over their summers. The students will be working with organizations doing data aggregation of their own to incorporate the information in Chalk’s SciData system.
The grant will also fund students’ travel to those organizations’ locations, the salary of a new post-doctoral UNF professor in the chemistry data field and a Chemical Informatics and Data Science Research Center on campus.
But Chalk said there is one other obstacle to his open-data dream: Scientists don’t always want to share their research. Companies making a product like detergent, for example, have gathered a lot of information about compounds and formulas, all of it part of their intellectual property.
He said there’s got to be a push to encourage companies that want to use the tool to also contribute to it.
Reporter Lindsey Kilbride can be reached at lkilbride@wjct.org, 904-358-6359 or on Twitter at @lindskilbride.