Analytical Chemistry

The Incredible Vastness of Data

In the hands of CAS, a morass of data points ends up telling epic research stories, page by page

by Ivan Amato
June 11, 2007 | A version of this story appeared in Volume 85, Issue 24

From an airplane, you can get the big picture, an entire landscape. But if it is the trees you really need to see, and not the forest, then you have to get your feet on the ground.

In 2005, CAS unveiled a new data analysis and visualization product—STN AnaVist—with which users can navigate the vast landscape of chemical research through the eyes of CAS scientists. It's a perspective that reveals all of the scientific hot spots and regional interconnections, as well as the local detail, all the way down to the full text of the patents, abstracts, papers, and other types of information that make up the global view.

The interactive portal at the right leads to two sets of visualizations—one derived from the full set of about 12,000 records that were on hand in 1907, and one derived from 19,000, or roughly 10%, of the 2007 research records that CAS scientists had created as of the first few months of the year. In each case, we show a global view of the data in which the density of points corresponds to the hottest areas of research at the time. The different colors of the points denote some of the most intense research arenas. These global views serve as snapshots of chemical history. In 1907, for example, some of the most notable locations of the research landscape denote work on the chemistry of air and other gases, and on dye chemistry. A century later, some of the hottest areas include the chemistry of biological molecules such as proteins and antibodies.

Shown along with each global portrait of research accessible through the portal are several more detailed local views that reveal subareas. Follow the colored circles to drill deeper and then deeper into the data. Individual points correspond to specific papers, patents, conference proceedings, or other relevant and retrievable records in the CAS databases. The relative proximity of points is a measure of the conceptual relatedness of the documents.

These views from STN AnaVist provide the merest of glimpses of what 100 years' worth of data gathering, categorizing, analyzing, processing, and searching have made possible, notes CAS's Anthony Trippe, one of the tool's designers.
