Editorials

 

Image Informatics:  The Key to Using Image Data 

by Suzanne Z. Mattingly, Ph.D.
Scimagix Inc.
3 Lagoon Drive, Suite 180
Redwood Shores, CA 94065
www.scimagix.com
(650) 508-2270

The explosion of data in the form of images brought about by developments in functional genomics, combinatorial chemistry and high-throughput screening poses a growing obstacle to drug discovery.  As this exponential data growth in bioscience research and development (R&D) drives the need for automated storage and database mining, analysis and mining software has quickly become a mainstay across R&D.  Computer-based analysis of images is now possible; for example, in proteomics-based research, software exists to perform automated protein spot detection, spot outlining and quantification.  Such image analysis programs can compare one image to another, but are limited to small sets of images and may still require user intervention.  Programs that enable image quantification and analysis without also providing automated search and data mining capabilities have limited value for the drug discovery and development process.  What good does it do to analyze more gels quickly, if you can’t also search the data to understand its meaning?

Image informatics is the solution.   Encompassing far more than computer-based image analysis, image informatics enables image quantification and analysis along with powerful and complete image search technology.  With a rapid search and retrieval capability, researchers can ask, “Where have I seen this image before?” or “Under what conditions have I seen this protein pattern expressed?”  Image informatics supports databasing and searching of all types of images and links image to non-image data, such as a set of pathology images, related numeric toxicology results and associated chemical entity structures, providing a multidisciplinary display of all the relevant data associated with the set of images under study. 

This mining process enables researchers to gain valuable insights into mechanism of action, especially in early discovery, lead optimization, preclinical or clinical trials.  In proteomics-based research, the ability to rapidly analyze data and then perform searches is instrumental in the process of finding and validating useful protein expression patterns and in using these patterns to eliminate problem compounds or to advance compounds with desired properties.  Image informatics supports new research approaches that integrate image information throughout the R&D cycle, representing a significant advance over analytical programs designed only to compare one image with another. 

Image informatics is a new area of data management that allows researchers to mine scientific images of all types using advanced image data storage, retrieval, mining and analysis capabilities.  Already proven in use by major pharmaceutical companies, this technology is of particular value in proteomics research where the efficient management and analysis of image data is critical in identifying patterns of protein expression and then correlating these with specific outcomes.  For example, searching on a specific pattern of protein expression that denotes a biomarker returns all instances where this pattern has been seen, such as in specimens from breast cancer patients.  Post-treatment samples can then be retrieved to identify changes in this pattern, such as down-regulation of a protein, that correspond to treatment response.

Using Image Data to Accelerate Proteomics Research

Proteomics research is amassing volumes of image data.  Researchers are studying protein expression to track protein changes associated with various diseases and as an indication of therapeutic response.  The objective is often to identify potential drug targets and to develop biomarkers for certain disease states.  The most common research tool for protein isolation and studies of expression remains two-dimensional (2D) electrophoresis gels, a procedure that separates proteins along the two dimensions of charge and mass.  Protein characterization is typically performed by mass spectrometry.  A useful method of determining changes in protein expression resulting from disease or in response to treatment, 2D gels have proven their utility since their introduction 25 years ago. Advances in computer-aided analysis tools have automated the process whereby researchers can quantify and compare one 2D image to another in about one hour. 

Lacking until recently, however, have been both technology to automate the search for protein expression patterns and gel analysis methods sensitive and accurate enough to enable searches of this type.  Image informatics technology accurately analyzes and extracts information from protein spots on a 2D gel image, quantifying and storing visual content based on various parameters including location, shape, size and intensity.  Researchers can now conduct searches for a specific pattern of protein expression across hundreds or thousands of gels contained in a single Oracle® database. 
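As a rough illustration of what such a pattern search involves, the Python sketch below stores each spot as a small record of location, size and intensity and then looks for gels that contain every spot in a query pattern.  It is a minimal sketch under stated assumptions, not a description of any vendor's method: the Spot record, the function names, the tolerance values and the sample gels are all hypothetical, and a production system would also need gel alignment and shape descriptors.

```python
from dataclasses import dataclass
from math import hypot


@dataclass(frozen=True)
class Spot:
    """One quantified protein spot (hypothetical record layout)."""
    x: float          # position along the charge axis, normalized to 0..1
    y: float          # position along the mass axis, normalized to 0..1
    area: float       # spot size in pixels
    intensity: float  # integrated intensity, arbitrary units


def spot_matches(query, candidate, max_dist=0.02, max_ratio=1.5):
    """True if two spots are close in position and similar in intensity."""
    close = hypot(query.x - candidate.x, query.y - candidate.y) <= max_dist
    low, high = sorted((query.intensity, candidate.intensity))
    similar = low > 0 and high / low <= max_ratio
    return close and similar


def gel_contains_pattern(gel_spots, pattern, **tolerances):
    """True if every spot in the pattern has a matching spot on the gel."""
    return all(
        any(spot_matches(q, s, **tolerances) for s in gel_spots)
        for q in pattern
    )


def search_gels(gels, pattern, **tolerances):
    """Return the sorted IDs of all gels that contain the query pattern."""
    return sorted(
        gel_id
        for gel_id, spots in gels.items()
        if gel_contains_pattern(spots, pattern, **tolerances)
    )


# Made-up example data: three gels and a two-spot query pattern.
gels = {
    "gel_A": [Spot(0.30, 0.40, 120, 900), Spot(0.70, 0.20, 80, 300)],
    "gel_B": [Spot(0.31, 0.41, 110, 1000)],
    "gel_C": [Spot(0.30, 0.40, 120, 100), Spot(0.70, 0.21, 75, 280)],
}
pattern = [Spot(0.30, 0.40, 115, 950), Spot(0.70, 0.20, 80, 310)]

print(search_gels(gels, pattern))
```

In this toy data only gel_A contains both query spots at comparable intensity; gel_B lacks the second spot, and the first spot on gel_C is far weaker than the query.  Loosening or tightening the two tolerances trades sensitivity against false matches.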

The ability to store, retrieve, and mine protein expression patterns found in 2D gels allows researchers to verify or disprove hypotheses and to quickly determine the desirability of a compound, thereby speeding the validation process.  Image informatics applications may be used to compare gels in order to identify differentially expressed proteins, to gain insight into mechanism of action, or to rapidly identify biological mechanisms or pathways without characterizing individual proteins. 

Protein expression data stored in one database can be readily correlated with other experimental data including related image or non-image data such as histology results, chemical entity structures, LD50 and ED50 information.  Researchers can create Image-driven Structure/Activity Relationships (ISAR) tables that extract and present image data corresponding to biological, chemical and protein activity.  For example, by starting with a set of 2D gel images that exhibit a pattern associated with toxicity, related image and non-image data can be pulled together and viewed on one screen.  At a glance, researchers can view a more complete set of data than was previously possible and gain valuable insights into cause/effect relationships.
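The linking of image and non-image data described above amounts, at its simplest, to a database join on a shared key.  The Python sketch below shows the idea using the standard-library sqlite3 module as a stand-in for a production database.  The table names, column names, compound IDs and LD50 values are all invented for illustration and do not reflect any actual ISAR schema.

```python
import sqlite3

# In-memory database standing in for a shared research database.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE gel_image (
        gel_id        TEXT PRIMARY KEY,
        compound_id   TEXT,
        pattern_found INTEGER   -- 1 if the pattern of interest was seen
    );
    CREATE TABLE toxicology (
        compound_id    TEXT PRIMARY KEY,
        ld50_mg_per_kg REAL
    );
""")

# Made-up rows for illustration only.
conn.executemany(
    "INSERT INTO gel_image VALUES (?, ?, ?)",
    [("gel_A", "cmpd_1", 1), ("gel_B", "cmpd_2", 0), ("gel_C", "cmpd_3", 1)],
)
conn.executemany(
    "INSERT INTO toxicology VALUES (?, ?)",
    [("cmpd_1", 50.0), ("cmpd_2", 2000.0), ("cmpd_3", 75.0)],
)


def isar_rows(connection):
    """Join gels showing the pattern to their compounds' toxicology data."""
    return connection.execute("""
        SELECT g.gel_id, g.compound_id, t.ld50_mg_per_kg
        FROM gel_image AS g
        JOIN toxicology AS t ON t.compound_id = g.compound_id
        WHERE g.pattern_found = 1
        ORDER BY g.gel_id
    """).fetchall()


for row in isar_rows(conn):
    print(row)
```

Each returned row pairs a gel that shows the pattern with the numeric result for the compound tested on it, which is the kind of side-by-side view the paragraph above describes.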

In addition to boosting productivity by means of new insights and more informed decision-making, image informatics provides an open infrastructure for researchers to share data and collaborate across experiments and laboratories.  Being able to access images along with the expert’s opinion facilitates the advance of appropriate compounds, no matter where in the organization they have been tested or created.  As pharmaceutical companies partner with biotechnology companies in greater numbers, new database management systems must prove sufficiently robust to accommodate multidisciplinary collaborations with selective viewing privileges for proprietary data. 

Use of image informatics can advance proteomics research at every stage – from hypothesis, experiment design and gel characterization to identifying protein expression patterns, understanding and validating mechanism of action and therapeutic response.  This relatively new technology helps researchers exploit image data throughout the R&D process in ways that were not previously possible.  Researchers can enlist image informatics to design and conduct retrospective and prospective evaluations, to identify mechanism of action and pathways, and to make better use of enterprise legacy data.  Image data can now be integrated much more fully into the decision-making process on a routine basis, serving as a valuable source of insight and significantly accelerating the drug discovery and development process.

The impact of proteomics research is expanding rapidly and affects nearly every stage of the pharmaceutical and biotechnology R&D process.  As proteomics research advances, research techniques and methods used to make decisions about the data it provides must also advance, becoming faster, more accurate and more integrated.  The explosion in image data in particular has created a pressing need for improved technologies for capturing, storing, mining and analyzing image data.  Analysis tools designed to compare one gel to another no longer suffice; proteomic researchers need technology capable of performing rapid searches for protein expression patterns, of enabling sharing of data between labs and company departments, and of integrating images with other, related data.  Image informatics offers researchers, research teams and project teams a mechanism to use image data throughout the entire discovery and development cycle in a way that is far more efficient than was previously possible – opening up new possibilities for insight.  

The articles and opinions expressed in the Editorial Corner are solely those of the author and do not necessarily reflect the opinions of the Proteome Society management and/or membership.

THE Proteome Society
Post Office Box 197
Ross, CA 94957-0197 

Telephone: (415) 860-5998, Fax: (415) 459-2266, E-mail: info@proteome.org
Copyright © 2001-2002, The Proteome Society