
Cheminformatics/ Chemoinformatics Glossary & taxonomy
Evolving terminologies for emerging technologies

Mary Chitty MSLS
Last revised January 09, 2020


3D-QSAR Three-Dimensional Quantitative Structure-Activity Relationships: Involves the analysis of the quantitative relationship between the biological activity of a set of compounds and their three-dimensional properties using statistical correlation methods. IUPAC Computational  Broader terms: QSAR; SAR Structure Activity Relationship Cheminformatics
Narrower term: CoMFA Comparative Molecular Field Analysis Related term: drug design

ab initio: From the Latin: from the beginning. In modeling seems to refer to models devised without experimental data.

ab initio calculations: Quantum chemical calculations using exact equations with no  approximations which involve the whole electronic population of the molecule. [IUPAC Computational]

ab initio molecular dynamics: The Parrinello group has applied ab initio Molecular Dynamics (MD) in which all forces were computed quantum-chemically to chemical reactions in general and to biological systems in particular, with results that compared favorably with experiment and older force field methods. The ab initio method was found to be of "useful accuracy" for simulations of biomolecules ... With a 1000 times faster computer (relative to 32 processors on a Cray T3E) the dynamics of a quantum-chemical system consisting of up to 10 atoms could be simulated for 10 s. Opportunities in Molecular Biomedicine in the Era of Teraflop Computing: Report on a Meeting Held March 3 & 4, 1999 in Rockville, MD, Organized by the NIH Resource for Macromolecular Modeling and Bioinformatics, Beckman Institute for Advanced Science and Technology, University of Illinois at Urbana-Champaign

chemical informatics:  To a certain extent, chemical informatics is the chemical counterpart of bioinformatics, but the term is considerably broader in its scope of activities in chemistry. Chemical informatics is the application to chemistry of computer technology in all of its manifestations. Variously known as chemoinformatics, cheminformatics, or even chemiinformatics, the field is just beginning to be recognized as a subdiscipline of chemistry. Gary Wiggins, Indiana University  

chemical information: Many people view chemoinformatics as an extension of chemical information, which is a well established concept covering many areas that employ chemical structures, data storage and computational methods, such as compound registration databases, on-line chemical literature, SAR analysis and molecule-property calculation. Timothy Ritchie "Chemoinformatics; manipulating chemical information to facilitate decision-making in drug discovery" Drug Discovery Today 6(16): 813-814, Aug. 2001

chemical information system: Must include registration, computed and measured properties, chemical descriptors and inventory. The primary purpose is to be able to identify a chemical substance, find compounds similar to the target compound and determine the location of the compound. To effectively build it, an object definition of the chemical sample is paramount…The hub [central database] of the chemical information system is the inventory system. Frank Brown "Chemoinformatics: What is it and How does it Impact Drug Discovery" Annual Reports in Medicinal Chemistry 33: 375-384, 1998
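As an illustration of the "object definition of the chemical sample" described above, the following is a minimal sketch, not Brown's design. It groups registration, measured and computed properties, descriptors and inventory location in one record. The class name `ChemicalSample`, its field names, the helper `locate`, and all example values are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class ChemicalSample:
    """Hypothetical object definition of a chemical sample."""
    registry_id: str                                           # registration: unique identifier
    structure: str                                             # structure, e.g. a SMILES string
    measured: Dict[str, float] = field(default_factory=dict)   # measured properties
    computed: Dict[str, float] = field(default_factory=dict)   # computed properties
    descriptors: List[int] = field(default_factory=list)       # chemical descriptors
    location: str = "unknown"                                  # inventory: where the sample is stored


def locate(samples: List[ChemicalSample], registry_id: str) -> str:
    """Return the storage location of a registered sample."""
    for sample in samples:
        if sample.registry_id == registry_id:
            return sample.location
    raise KeyError(registry_id)


# Hypothetical inventory with two registered samples.
inventory = [
    ChemicalSample("CMPD-0001", "CCO", computed={"mol_weight": 46.07},
                   location="Freezer A, rack 3"),
    ChemicalSample("CMPD-0002", "c1ccccc1", location="Shelf B"),
]
```

Keeping the location on the sample record reflects the statement that the inventory system is the hub: identifying a substance and finding where it is stored become lookups on the same object.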

cheminformatics: Use of mathematical and statistical methods to extract information from chemical data.  Note: For screening purposes, cheminformatics approaches are used in order to design compound libraries for general screening or for screening against a specific target class. Cheminformatics is also used to cluster active molecules based on chemical similarity, which aids in the analysis of screening data. IUPAC Biomolecular Screening 

The practice of finding the "best-fitting" compounds to address particular targets. The field encompasses diversity analysis and library design, virtual screening, rational drug design, and tools and approaches for predicting activity and other properties from structure.

Going by the numbers, cheminformatics now seems to be the most widely used form of this word, having overtaken chemoinformatics. See the Glossary FAQ question #3 for details and methodology.  Related terms: Drug discovery & development, Drug discovery informatics
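Clustering molecules by chemical similarity, as mentioned in the IUPAC note above, requires a similarity measure. One common choice, assumed here rather than taken from the sources quoted, is the Tanimoto coefficient on binary fingerprints. The sketch below treats each fingerprint as a set of "on" bit positions; the compound names and bit values are hypothetical.

```python
def tanimoto(a: set, b: set) -> float:
    """Tanimoto (Jaccard) coefficient between two sets of 'on' fingerprint bits."""
    if not a and not b:
        return 1.0  # convention chosen here: two empty fingerprints count as identical
    return len(a & b) / len(a | b)


def most_similar(query: set, library: dict) -> str:
    """Return the name of the library compound most similar to the query."""
    return max(library, key=lambda name: tanimoto(query, library[name]))


# Hypothetical fingerprints: each integer stands for a structural feature that is present.
library = {
    "cmpd_A": {1, 2, 3, 4},
    "cmpd_B": {3, 4, 5, 6, 7},
    "cmpd_C": {10, 11},
}
query = {1, 2, 3, 5}
nearest = most_similar(query, library)
```

With these values the query shares three of five distinct bits with cmpd_A, so cmpd_A is returned as the nearest neighbour.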

chemodescriptors: Hawkins DM, Basak SC, Kraker J, Geiss KT, Witzmann FA, Combining Chemodescriptors and Biodescriptors in Quantitative Structure-Activity Relationship Modeling, J Chem Inf Model. 46(1): 9-16, Jan 23, 2006

chemoinformatics: The focus [of chemoinformatics] is placed on four traditional research areas: chemical database systems, computer-assisted structure elucidation systems, computer-assisted synthesis design systems, and 3D structure builders. WL Chen, Chemoinformatics: past, present, and future. Journal of Chemical Information and Modeling, 46(6): 2230-2255, Nov 2006

Chemoinformatics is an integral part of the discipline of knowledge management. Nicholas J. Hrib, Norton P. Peet "Chemoinformatics: are we exploiting these new science tools?" Drug Discovery Today 5 (11): 483-485, Nov. 2000

Increasingly incorporates "compound registration into databases, including library enumeration; access to primary and secondary scientific literature; QSAR (Quantitative Structure Activity Relationships) and similar tools for relating activity to structure; physical and chemical property calculations; chemical structure and property databases, chemical library design and analysis; structure-based design and statistical methods. Because these techniques have traditionally been considered the realms of scientists from different disciplines, differences in computer systems and terminology provide a barrier to effective communication. This is probably the single most challenging problem that chemoinformatics must solve." M Hann and R Green "Chemoinformatics – a new name for an old problem?" Current Opinion in Chemical Biology 3: 379-383, 1999

An emerging area, which annotates small molecules and also libraries with structure-function, synthesis, and all other relevant data used to design and develop better drugs. "Combinatorial Chemistry" Nature Biotechnology 18: Supplement Oct. 2000, from Nature Biotechnology 16: 691-693, 1998

Mixing of information technology and management to transform data into information and information into knowledge for the intended purpose of making better decisions faster in the arena of drug lead identification and optimization. ... In chemoinformatics there are really only two [primary] questions: 1.) what to test next and 2.) what to make next. The main processes within drug discovery are lead identification, where a lead is something that has activity in the low micromolar range, and lead optimization, which is the process of transforming a lead into a drug candidate. Frank Brown "Chemoinformatics: What is it and How does it Impact Drug Discovery" Annual Reports in Medicinal Chemistry 33: 375-384, 1998  Related terms: cheminformatics, chemi-informatics, chemometrics, computational chemistry.

chemometrics: The application of statistics to the analysis of chemical data (from organic, analytical or medicinal chemistry) and design of chemical experiments and simulations. IUPAC Computational

The science of relating measurements made on a chemical system or process to the state of the system via application of mathematical or statistical methods. International Society of Chemometrics "ISC symbol and definition of chemometrics" 1997  Related terms: Drug discovery informatics; 3D-QSAR; comparative molecular field analysis (CoMFA); QSAR; ClogP values

ClogP values: Calculated 1-octanol/water partition coefficients, frequently used in Structure-Property Correlation (SPC) or quantitative structure-activity relationship (QSAR) studies (Leo, 1993). IUPAC Computational

Logarithm of the partition coefficient.
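As a worked illustration of the quantity that ClogP estimates: log P is the base-10 logarithm of the ratio of a solute's equilibrium concentrations in 1-octanol and in water. The sketch below computes it from concentrations; ClogP itself is calculated from structure, which this example does not attempt. The function name and the concentrations are hypothetical.

```python
import math


def log_p(conc_octanol: float, conc_water: float) -> float:
    """log10 of the partition coefficient P = [solute in octanol] / [solute in water]."""
    if conc_octanol <= 0 or conc_water <= 0:
        raise ValueError("concentrations must be positive")
    return math.log10(conc_octanol / conc_water)


# Hypothetical equilibrium concentrations in the same units (e.g. mM):
# a hundredfold preference for octanol gives log P of about 2.
example = log_p(conc_octanol=2.0, conc_water=0.02)
```

A positive value indicates a lipophilic solute that favours the octanol phase; a negative value indicates a preference for water.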

CML Chemical Markup Language:  Wikipedia 

Comparative Molecular Field Analysis CoMFA: A 3D-QSAR method that uses statistical correlation techniques for the analysis of the quantitative relationship between the biological activity of a set of compounds with a specified alignment, and their three-dimensional electronic and steric properties. Other properties such as hydrophobicity and hydrogen bonding can also be incorporated into the analysis. (See also Three-dimensional Quantitative Structure-Activity Relationship [3D-QSAR]). IUPAC Medicinal Chemistry

Uses statistical correlation techniques for the analysis of the quantitative relationship between the biological activity of a set of compounds with a specified alignment, and their three-dimensional electronic and steric properties. Other properties, such as hydrophobicity and H-bonding can also be incorporated into the analysis (Cramer et al., 1988; Kubinyi, 1993b). IUPAC Computational  Narrower term: topomeric CoMFA

computational chemistry:  A discipline using mathematical methods for the calculation of molecular properties or for the simulation of molecular behavior.  It also includes, e.g., synthesis planning, database searching, combinatorial library manipulation (Hopfinger, 1981; Ugi et al., 1990). IUPAC Computational

Computational chemistry seeks to predict quantitatively molecular and biomolecular structures, properties, and reactivity by computational methods alone. It uses modern chemical theory to predict the speed of unknown reactions and the synthetic sequences by which complex new molecules can be made most efficiently. Computational chemistry allows chemists to explore how things work at the atomic and molecular levels and to draw conclusions that are impossible to reach by experimentation alone. Thus, computational chemistry supplements experimentally derived data. Gary D. Wiggins, "What is Chemical Informatics?"  Indiana Univ., US, 2006   Related terms: Drug discovery informatics Computer Aided Molecular Design CAMD, molecular graphics

computational modeling: See ab initio modeling, homology modeling, molecular modeling.

computational quantum chemistry: Primarily concerned with the numerical computation of molecular electronic structures by ab initio and semi-empirical techniques. Overview of computational chemistry, Shodor Educational Foundation 1999-2000

Computer Aided Molecular Design (CAMD): Involves all computer-assisted techniques used to discover, design and optimize compounds with desired structure and properties.  IUPAC Combinatorial

Also known as molecular modeling or computational chemistry, uses computers to analyze and model the physicochemical properties of a molecule. CAMD programs allow integrated molecular design to take drug discovery to a new level by using a more cross-functional team approach to drug research and development.  Oxford Molecular

Computer-Assisted Molecular Design CAMD: Involves all computer-assisted techniques used to discover, design and optimize compounds with desired structure and properties.  IUPAC Computational

Computer-Assisted Molecular Modeling CAMM:  The investigation of molecular structures and properties using computational chemistry and graphical visualization techniques.  IUPAC Computational

conformational analysis: Consists of the exploration of energetically favorable spatial arrangements (shapes) of a molecule (conformations) using molecular mechanics, molecular dynamics, quantum chemical calculations or analysis of experimentally determined structural data, e.g., NMR or crystal structures.

Molecular mechanics and quantum chemical methods are employed to compute conformational energies, whereas systematic and random searches, Monte Carlo, molecular dynamics, and distance geometry are methods (often combined with energy minimization procedures) used to explore the conformational space. IUPAC Computational
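A systematic search of conformational space, one of the methods listed above, can be sketched for a single rotatable bond as follows. This is a toy example under stated assumptions: the torsional energy function, its parameters `v1` and `v3`, the arbitrary energy units and the 10-degree grid are all hypothetical, and no energy minimization is applied after the scan.

```python
import math


def torsion_energy(phi_deg: float, v1: float = 2.0, v3: float = 3.0) -> float:
    """Toy torsional energy (arbitrary units) with one-fold and three-fold cosine terms."""
    phi = math.radians(phi_deg)
    return 0.5 * v1 * (1 + math.cos(phi)) + 0.5 * v3 * (1 + math.cos(3 * phi))


def systematic_search(step_deg: int = 10):
    """Scan the torsion angle on a regular grid and return the grid-local minima."""
    angles = list(range(0, 360, step_deg))
    energies = [torsion_energy(a) for a in angles]
    n = len(angles)
    minima = []
    for i in range(n):
        # The angle is periodic, so the neighbours wrap around at 0/360 degrees.
        if energies[i] < energies[i - 1] and energies[i] < energies[(i + 1) % n]:
            minima.append((angles[i], energies[i]))
    return minima


conformers = systematic_search()
lowest = min(conformers, key=lambda c: c[1])
```

On this grid the scan finds three low-energy arrangements, at 60, 180 and 300 degrees, with the 180-degree one lowest. A finer grid or a subsequent minimization would locate the two outer minima more precisely.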

drug design: Drug discovery informatics

factorial design FD: An experimental design technique in which each variable (factor or descriptor) is investigated at fixed levels. In a two-level FD, each variable can take two values, e.g., high and low lipophilicity. IUPAC Computational
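A full two-level factorial design can be enumerated as every combination of factor levels. The sketch below shows one way to do this; the full (rather than fractional) design is an assumption, and the factor names and levels are hypothetical.

```python
from itertools import product


def full_factorial(factors: dict) -> list:
    """Enumerate every combination of factor levels (a full factorial design)."""
    names = list(factors)
    return [dict(zip(names, levels))
            for levels in product(*(factors[name] for name in names))]


# Hypothetical two-level design with three descriptors: 2 * 2 * 2 = 8 runs.
design = full_factorial({
    "lipophilicity": ["low", "high"],
    "size": ["small", "large"],
    "h_bond_donor": ["absent", "present"],
})
```

The number of runs is the product of the level counts, so it doubles with each added two-level factor.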

force field: A set of functions and parametrization used in molecular mechanics  calculations.  IUPAC Computational

Long-time simulations will pose a challenging benchmark for the force fields employed in molecular modeling. One question is, how will proteins and DNA that were described by the available force fields (and remained stable over nanosecond periods) behave in microsecond simulations? The high cost of long-time simulations will require that the issue is addressed in a systematic way by providing standard cases against which simulations can be tested. Opportunities in Molecular Biomedicine in the Era of Teraflop Computing: March 3 & 4, 1999, Rockville, MD, NIH Resource for Macromolecular Modeling and Bioinformatics, Beckman Institute for Advanced Science and Technology, University of Illinois at Urbana-Champaign  Related term: van der Waals

Immersive Virtual Reality IVR: New futuristic technique [which] enables the user to literally become a part of his or her data and to use additional senses. Although IVR has not yet enjoyed widespread use in scientific disciplines, it has been cost-effective in architectural design. Nicholas J. Hrib, Norton P. Peet "Chemoinformatics: are we exploiting these new science tools?" Drug Discovery Today 5 (11): 483-485, Nov. 2000

InChI: IUPAC International Chemical Identifier 

Lipinski’s rules of five: See rules of five

lipophilicity: Represents the affinity of a molecule or a moiety for a lipophilic environment. It is commonly measured by its distribution behaviour in a biphasic system, either liquid-liquid (e.g., partition coefficient in octan-1-ol/water) or solid/liquid (retention on reversed-phase high performance liquid chromatography (RP-HPLC) or thin-layer chromatography (TLC) system). (See also Hydrophobicity). IUPAC Medicinal Chem

molecular design: The application of all techniques leading to the discovery of new chemical entities with specific properties required for the intended application. IUPAC Compendium Related terms: drug design, ligand design, rational drug design.

molecular dynamics: A simulation procedure consisting of the computation of the motion of  atoms in a molecule or of individual atoms or molecules in solids, liquids and gases, according to Newton's laws of motion. The forces acting on the atoms, required to simulate their motions, are generally calculated using molecular mechanics force fields.  IUPAC Computational   Narrower term: ab initio molecular dynamics

molecular dynamics simulation: A computer simulation developed to study the motion of molecules over a period of time. MeSH 2010
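The core of a molecular dynamics simulation is a numerical integrator that advances positions and velocities according to Newton's laws of motion. The sketch below uses the velocity Verlet scheme, a common choice that is assumed here and not named in the definitions above, on a single particle in a one-dimensional harmonic potential. The force constant, mass, time step and reduced units are hypothetical.

```python
def velocity_verlet(x, v, force, mass, dt, n_steps):
    """Integrate Newton's equations of motion for one particle in one dimension."""
    trajectory = [(x, v)]
    f = force(x)
    for _ in range(n_steps):
        v_half = v + 0.5 * dt * f / mass      # half-step velocity update
        x = x + dt * v_half                   # full-step position update
        f = force(x)                          # force at the new position
        v = v_half + 0.5 * dt * f / mass      # complete the velocity update
        trajectory.append((x, v))
    return trajectory


K = 1.0   # hypothetical harmonic force constant (reduced units)
M = 1.0   # hypothetical particle mass (reduced units)


def harmonic_force(x):
    """Restoring force of a harmonic 'bond' with its minimum at x = 0."""
    return -K * x


def total_energy(x, v):
    """Kinetic plus potential energy; it should stay nearly constant along the trajectory."""
    return 0.5 * M * v * v + 0.5 * K * x * x


traj = velocity_verlet(x=1.0, v=0.0, force=harmonic_force, mass=M, dt=0.01, n_steps=1000)
```

In a real simulation the force function would come from a molecular mechanics force field evaluated over all atoms; near-conservation of the total energy is a routine check that the time step is small enough.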

molecular geometry:

molecular graphics: A technique for the visualization and manipulation of molecules on a graphical display device. IUPAC Computational

molecular mechanics: The calculation of  molecular conformational geometries and energies using a combination of empirical force fields (Burkert and Allinger, 1982).

Method of calculation of geometrical and energy characteristics of molecular entities on the basis of empirical potential functions (see force field) the form of which is taken from classical mechanics. The method implies transferability of the potential functions within a network of similar molecules. An assumption is made on "natural" bond lengths and angles, deviations from which result in bond and angle strain respectively. Repulsive or attractive van der Waals and electrostatic forces between nonbonded atoms are also taken into account. Synonymous with force field method. IUPAC Computational  Related terms: energy function, force fields
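The energy terms named above (bond strain, angle strain, van der Waals and electrostatic interactions) can be sketched with simple functional forms. The harmonic and 12-6 Lennard-Jones forms below are common textbook choices assumed for illustration, not the form of any particular force field; all parameter values are hypothetical, and 332.06 is the approximate factor converting e^2/angstrom to kcal/mol.

```python
def bond_energy(r, r0, k):
    """Harmonic bond strain: grows as the length r deviates from the 'natural' length r0."""
    return 0.5 * k * (r - r0) ** 2


def angle_energy(theta, theta0, k):
    """Harmonic angle strain: grows as the angle deviates from its 'natural' value."""
    return 0.5 * k * (theta - theta0) ** 2


def lennard_jones(r, epsilon, sigma):
    """12-6 Lennard-Jones term for van der Waals repulsion and attraction."""
    sr6 = (sigma / r) ** 6
    return 4.0 * epsilon * (sr6 * sr6 - sr6)


def coulomb(r, q1, q2, constant=332.06):
    """Electrostatic term between two point charges (in units of e) at distance r (angstrom)."""
    return constant * q1 * q2 / r


def total_energy(bonds, angles, nonbonded):
    """Sum all terms; each argument is a list of parameter tuples for one term type."""
    energy = sum(bond_energy(r, r0, k) for r, r0, k in bonds)
    energy += sum(angle_energy(t, t0, k) for t, t0, k in angles)
    for r, epsilon, sigma, q1, q2 in nonbonded:
        energy += lennard_jones(r, epsilon, sigma) + coulomb(r, q1, q2)
    return energy


# Hypothetical system: one stretched bond, one unstrained angle, one neutral nonbonded pair.
e_total = total_energy(
    bonds=[(1.19, 1.09, 340.0)],
    angles=[(1.91, 1.91, 50.0)],
    nonbonded=[(3.4, 0.24, 3.4, 0.0, 0.0)],
)
```

Because the same parameters are reused for chemically similar atoms and bonds, the sum can be evaluated for molecules that were not part of the parametrization, which is the transferability the definition refers to.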

molecular modeling, molecular modelling: A technique for the investigation of molecular structures and properties using computational chemistry and graphical visualization techniques in order to provide a plausible three-dimensional representation under a given set of circumstances. IUPAC Medicinal Chemistry, IUPAC Computational

The Journal of Molecular Modeling focuses on 'hardcore' modeling, publishing high-quality research and reports. Founded in 1995 as a purely electronic journal, it has adapted its format to include a full-color print edition, and adjusted its aims and scope to fit the fast-changing field of molecular modeling, with a particular focus on three-dimensional modeling. Today, the journal covers all aspects of molecular modeling including life science modeling; materials modeling; new methods; and computational chemistry. Topics include computer-aided molecular design; rational drug design, de novo ligand design, receptor modeling and docking; cheminformatics, data analysis, visualization and mining; computational medicinal chemistry; homology modeling; simulation of peptides, DNA and other biopolymers; quantitative structure-activity relationships (QSAR) and ADME-modeling; modeling of biological reaction mechanisms; and combined experimental and computational studies in which calculations play a major role. Scope note, Journal of Molecular Modeling

Molecular modeling applications use falls into two broad categories: interactive visualization and computational analyses. ... Three of the most prominent uses of modern molecular modeling applications are structure analysis, homology modeling, and docking ... in essence, objective modeling revolves around three different approaches (each based on different underlying physical and chemical theories): molecular dynamics, molecular mechanics, and quantum mechanics. All of these are concerned with developing a unique solution to what is referred to as the "protein folding" problem: designing and testing algorithms and applications that will reliably predict 3-D structure from primary sequence. Christopher Smith "Molecular Modeling - Seeing the Whole Picture with Modeling Software Packages" Scientist 12[17]:0, Aug. 31, 1998

Molecular modeling software includes AMBER, DOCK, MODELLER, RasMol and many other programs. Related terms: computational chemistry, Computer Assisted Drug Design, molecular graphics, molecular dynamics, molecular mechanics.

molecular models: Models used experimentally or theoretically to study molecular shape, electronic properties, or interactions; includes analogous molecules, computer generated graphics, and mechanical structures. MeSH, 1984

molecular recognition: Refers to the specific interaction between two or more molecules through noncovalent bonding such as hydrogen bonding, metal coordination, hydrophobic forces, van der Waals forces, π-π interactions, halogen bonding, electrostatic and/or electromagnetic effects. ... Molecular recognition plays an important role in biological systems and is observed between receptor-ligand, antigen-antibody, DNA-protein, sugar-lectin, RNA-ribosome, etc. Wikipedia accessed 2018 Sept 14  Related terms: molecular mimicry, peptidomimetic, recognition site

Partial Least Squares PLS: Projection to latent structures (PLS) is a robust multivariate generalized regression method using projections to summarize multitudes of potentially collinear variables (Wold et al., 1993).  IUPAC Computational

plug and play systems: Required for effective chemoinformatics systems. Must be designed backward from the answer to the data to be captured and systems should be in components where each component has one simple task … modular systems that can "plug and play" into other systems. Frank Brown "Chemoinformatics: What is it and How does it Impact Drug Discovery" Annual Reports in Medicinal Chemistry 33: 375-384, 1998

predictive data mining: Used in structure-function correlations. See Algorithms & data analysis.

Quantitative Structure-Activity Relationships QSAR: Mathematical relationships linking chemical structure and pharmacological activity in a quantitative manner for a series of compounds. Methods which can be used in QSAR include various regression and pattern recognition techniques. QSAR is often taken to be equivalent to chemometrics or multivariate statistical data analysis. It is sometimes used in a more limited sense as equivalent to Hansch analysis. QSAR is a subset of the more general term SPC. IUPAC Computational

The building of structure – biological activity models by using regression analysis with physicochemical constants, indicator variables or theoretical calculations. The term has been extended by some authors to include chemical reactivity, i.e. activity is regarded as synonymous with reactivity. This extension is, however, discouraged. Related term: correlation analysis. IUPAC Compendium

A quantitative prediction of the biological, ecotoxicological or pharmaceutical activity of a molecule. It is based upon structure and activity information gathered from a series of similar compounds. MeSH, 2001

QSARs attempt to correlate chemical structure with activity using statistical approaches. The QSAR models are useful for various purposes including the prediction of activities of untested chemicals. Quantitative structure-activity relationships and other related approaches have attracted broad scientific interest, particularly in the pharmaceutical industry for drug discovery and in toxicology and environmental science for risk assessment. An assortment of new QSAR methods have been developed during the past decade, most of them focused on drug discovery. Besides advancing our fundamental knowledge of QSARs, these scientific efforts have stimulated their application in a wider range of disciplines, such as toxicology, where QSARs have not yet gained full appreciation. Synonymous? term: QSPR: Quantitative Structure Property Relationship  Related terms: Algorithms glossary SAR Structure Activity Relationship; Hansch analysis; Drug discovery & development drug design; Pharmacogenomics toxicogenomics
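The simplest regression-based QSAR relates activity to a single descriptor with a straight line fitted by least squares, then uses the line to predict the activity of an untested compound. The sketch below illustrates that idea only; the descriptor (a calculated log P), the activity values and the function names are hypothetical, and real QSAR models typically use several descriptors and are validated on held-out compounds.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(x, slope, intercept):
    """Predicted activity for a compound with descriptor value x."""
    return slope * x + intercept


# Hypothetical training series: descriptor value and measured activity (e.g. pIC50).
clogp_values = [1.0, 2.0, 3.0, 4.0]
activities = [5.1, 5.9, 7.1, 7.9]

slope, intercept = fit_line(clogp_values, activities)
untested = predict(2.5, slope, intercept)   # prediction for a compound not in the series
```

Predictions are only meaningful for compounds similar to the training series, which is why the MeSH definition above stresses that the model is built from a series of similar compounds.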

quantum chemical calculations: Molecular property calculations based on the Schrödinger equation, which take into account the interactions between electrons in the molecule. IUPAC Computational

quantum mechanics: The laws of physics that apply on very small scales. The essential feature is that energy, momentum and angular momentum as well as charge come in discrete amounts called quanta. SLAC Glossary, Stanford Linear Accelerator Center, Stanford Univ. US  Narrower terms: ab initio quantum mechanical methods, ab initio quantum mechanical modeling, semi-empirical quantum mechanical methods

rules of five: Lipinski's rules. Set of criteria for predicting the oral bioavailability of a compound on the basis of simple molecular features (molecular weight, CLogP, numbers of hydrogen-bond donors and acceptors). Often used to profile a library or virtual library with respect to the proportion of drug-like members which it contains. IUPAC Combinatorial

An algorithm, developed by Christopher A. Lipinski (of Pfizer) and colleagues, in which many of the cutoff numbers are five or multiples of five. ... Pfizer has developed a number of additional criteria for adoption of lead candidates. Advanced Drug Delivery Reviews 23: 3-25, 1997.
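Profiling a library against these criteria amounts to counting violations per compound. The sketch below uses the cutoffs as they are commonly cited from Lipinski and colleagues (molecular weight over 500, CLogP over 5, more than 5 hydrogen-bond donors, more than 10 hydrogen-bond acceptors); the cutoffs are not stated in this glossary, and the tolerance of one violation, the function names and the example compounds are assumptions for illustration.

```python
def rule_of_five_violations(mol_weight, clogp, h_donors, h_acceptors):
    """Return the list of rule-of-five criteria that a compound violates."""
    violations = []
    if mol_weight > 500:
        violations.append("molecular weight > 500")
    if clogp > 5:
        violations.append("CLogP > 5")
    if h_donors > 5:
        violations.append("H-bond donors > 5")
    if h_acceptors > 10:
        violations.append("H-bond acceptors > 10")
    return violations


def is_drug_like(mol_weight, clogp, h_donors, h_acceptors, max_violations=1):
    """Assumed convention: tolerate at most one violation."""
    found = rule_of_five_violations(mol_weight, clogp, h_donors, h_acceptors)
    return len(found) <= max_violations


# Hypothetical library: (molecular weight, CLogP, H-bond donors, H-bond acceptors).
virtual_hits = {
    "cmpd_X": (350.4, 2.1, 2, 5),
    "cmpd_Y": (620.7, 5.6, 2, 8),
}
drug_like_fraction = sum(is_drug_like(*props) for props in virtual_hits.values()) / len(virtual_hits)
```

The final line gives the proportion of drug-like members, which is the library-profiling use mentioned in the IUPAC definition.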

semi-empirical methods: Molecular orbital calculations using various degrees of  approximation and using only valence electrons. IUPAC Computational

semi-empirical quantum mechanical methods: Use parameters derived from experimental data to simplify computations. The simplification may occur at various levels: simplification of the Hamiltonian (e.g. as in the Extended Hückel method), approximate evaluation of certain molecular integrals (see, for example, zero differential overlap), simplification of the wave function (for example, use of π electron approximation as in Pariser-Parr-Pople). IUPAC Computational

silo systems: Legacy method for many information systems, a system built to collect, store and report one laboratory's data. Each "silo system" holds the data differently and may be in a different technology … the results of the systems cannot easily be interchanged … This is as much a corporate structure and resource problem as it is a technical problem. Contrast with "plug and play". Frank Brown "Chemoinformatics: What is it and How does it Impact Drug Discovery" Annual Reports in Medicinal Chemistry 33: 375-384, 1998  Related terms: information silos, interoperability

simulated annealing SA: A procedure used in molecular dynamics simulations, in which the system is allowed to equilibrate at high temperatures, and then cooled down slowly to remove kinetic energy and to permit trajectories to settle into local minimum energy conformations.  IUPAC Computational
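The heat-then-cool-slowly idea can be sketched on a toy problem. Note the substitution: the example below uses Metropolis Monte Carlo moves rather than a molecular dynamics trajectory, and a one-dimensional double-well function stands in for a conformational energy surface. The energy function, temperatures, cooling rate, step size and seed are all hypothetical.

```python
import math
import random


def energy(x):
    """Toy double-well 'conformational' energy in arbitrary units."""
    return x ** 4 - 4 * x ** 2 + x


def simulated_annealing(x0, t_start=5.0, t_end=0.01, cooling=0.995, step=0.5, seed=0):
    """Start hot, cool geometrically, and return the lowest-energy point visited."""
    rng = random.Random(seed)          # fixed seed so the run is reproducible
    x, e = x0, energy(x0)
    best_x, best_e = x, e
    t = t_start
    while t > t_end:
        x_new = x + rng.uniform(-step, step)
        e_new = energy(x_new)
        # Metropolis criterion: always accept downhill moves, sometimes accept uphill ones.
        if e_new <= e or rng.random() < math.exp(-(e_new - e) / t):
            x, e = x_new, e_new
            if e < best_e:
                best_x, best_e = x, e
        t *= cooling                   # slow cooling removes the ability to climb barriers
    return best_x, best_e


start = 2.0
best_x, best_e = simulated_annealing(start)
```

At high temperature the walker crosses the barrier between the two wells freely; as the temperature falls it becomes trapped in one of them, which mirrors the settling into minimum energy conformations described above.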

stereochemical formula (stereoformula): A three-dimensional view of a molecule either as such or in a projection. IUPAC Compendium

stereochemistry: See stereochemical formula (stereoformula)

Structure Activity Relationship SAR: The relationship between chemical structure and pharmacological activity for a series of compounds. IUPAC Medicinal Chemistry

Compounds are often classed together because they have structural characteristics in common including shape, size, stereochemical arrangement, and distribution of functional groups. Other factors contributing to structure-activity relationship include chemical reactivity, electronic effects, resonance, and inductive effects. MeSH, 1972  Narrower terms: 3D-QSAR, QSAR; Related terms: NMR SAR by NMR; Algorithms & data management cluster analysis, Principal Components Analysis PCA, recursive partitioning

Structure-Property Correlation SPC: All statistical mathematical methods used to correlate any molecular property (intrinsic, chemical or biological) to any other property, using statistical regression or pattern recognition techniques (Van de Waterbeemd, 1992). QSAR is a subset of the more general term SPC.  IUPAC Computational  Narrower terms: 3D QSAR, QSAR

van der Waals forces:  The attractive or repulsive forces between molecular entities (or between groups within the same molecular entity) other than those due to bond formation or to the electrostatic interaction of ions or of ionic groups with one another or with neutral molecules. ... The term is sometimes used loosely for the totality of nonspecific attractive or repulsive forces. IUPAC Compendium

virtual database assembly: A crucial activity as it enables access to the large number of drug-like molecules that could theoretically be made ... can serve several purposes: for example, to generate a maximally diverse virtual library for lead generation, a biased library aimed at a specific target or target family, or a lead optimization library. Nicholas J. Hrib, Norton P. Peet "Chemoinformatics: are we exploiting these new science tools?" Drug Discovery Today 5 (11): 483-485, Nov. 2000

virtual library: A library which has no physical existence, being constructed solely in electronic form or on paper. The building blocks required for such a library may not exist, and the chemical steps for such a library may not have been tested. These libraries are used in the design and evaluation of possible libraries. IUPAC Combinatorial Chemistry Related terms:  Combinatorial libraries & synthesis
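Constructing a virtual library in electronic form can be as simple as enumerating every combination of building blocks on a common scaffold. The sketch below does this with string templates; the scaffold, the position names `r1` and `r2`, and the building blocks are hypothetical, and the resulting SMILES-like strings are not checked for chemical validity or synthetic feasibility.

```python
from itertools import product


def enumerate_library(scaffold: str, substituents: dict) -> list:
    """Fill each named position of a scaffold template with every listed building block."""
    positions = list(substituents)
    library = []
    for combo in product(*(substituents[p] for p in positions)):
        library.append(scaffold.format(**dict(zip(positions, combo))))
    return library


# Hypothetical scaffold with two variable positions: 3 * 4 = 12 virtual members.
virtual_library = enumerate_library(
    "c1ccc(cc1){r1}C(=O)N{r2}",
    {"r1": ["", "C", "CC"], "r2": ["C", "CC", "C(C)C", "c1ccccc1"]},
)
```

The library size is the product of the number of building blocks at each position, which is why virtual libraries grow far faster than anything that could be synthesized.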

virtual molecules: It has also become clear that even the most efficient combinatorial chemistry approaches can generate only a minute fraction of the 1 x 10^40 virtual drug molecules that could be prepared. Timothy Ritchie "Chemoinformatics; manipulating chemical information to facilitate decision-making in drug discovery" Drug Discovery Today 6(16): 813-814, 16 Aug. 2001

Cheminformatics/Chemoinformatics resources
Chemical Informatics Letters glossary, Jonathan Goodman 
IUPAC International Union of Pure and Applied Chemistry, Glossary of Terms used in Computational Drug Design, H. van de Waterbeemd, R.E. Carter, G. Grassy, H. Kubinyi, Y. C. Martin, M.S. Tute, P. Willett, 1997. 125+ definitions
IUPAC International Union of Pure and Applied Chemistry Glossary of terms  used in Biomolecular Screening 2011
IUPAC International Union of Pure and Applied Chemistry, Glossary of terms used in Computational Drug Design, Part I, 1997
IUPAC  International Union of Pure and Applied Chemistry, Glossary of Terms used in Computational Drug Design, Part II,  2015
Open Babel, Avogadro & Molecular Modelling blog 
Virtual Computational Chemistry Laboratory


IUPAC definitions are reprinted with the permission of the International Union of Pure and Applied Chemistry. 
