Biopharmaceutical Protein structures glossary
Evolving terminology for emerging technologies

Comments? Suggestions? Questions?  mchitty@healthtech.com
Last revised November 16, 2009 


Sequencing of the human genome has increased the number of candidate proteins for clinical development and therapeutic use.  Efforts are under way to identify and understand biological mechanisms that exist between proteins and to get information on the structure of proteins as they exist within biological complexes.  A major challenge is to understand how proteins fold and how protein structure relates to protein function.

Related glossaries include Applications: Proteomics, Structural Genomics;
Technologies: Mass Spectrometry, NMR & X-ray Crystallography, Sequencing;
Biology: Biomolecules, Expression, Protein categories, Proteins, Sequences, DNA & beyond

3D protein structures: The conformation into which a protein “folds.” For proteins consisting of only one polypeptide chain, it is the tertiary structure that is usually referred to by the term “the 3D structure of a protein.”  Related term: protein structures

aggregation: Hopelessly tangled and completely amorphous masses of protein fibers. W. Thomasson “Unraveling the mystery of protein folding” FASEB 1997 http://opa.faseb.org/pdf/protfold.pdf 

Related term: misfolding

alpha-helix, alpha-helices: See secondary protein structure, tertiary protein structure.

amino acid motifs: Commonly observed structural components of proteins formed by simple combinations of adjacent secondary structures. A commonly observed structure may be composed of a CONSERVED SEQUENCE which can be represented by a CONSENSUS SEQUENCE. MeSH, 2000 

Related term: consensus sequence See also motifs.
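
The idea that a conserved sequence can be represented by a consensus sequence can be illustrated with a minimal sketch. It assumes the sequences are already aligned to equal length and simply takes the most common residue in each column; the function name `consensus` and the toy alignment are hypothetical, and real tools use more careful scoring than a plain majority vote.

```python
from collections import Counter


def consensus(aligned, gap="-"):
    """Return the most common residue in each column of pre-aligned sequences."""
    if not aligned:
        raise ValueError("need at least one sequence")
    length = len(aligned[0])
    if any(len(seq) != length for seq in aligned):
        raise ValueError("sequences must be aligned to the same length")
    result = []
    for column in zip(*aligned):
        residues = [r for r in column if r != gap]
        if not residues:
            # A column made only of gaps stays a gap in the consensus.
            result.append(gap)
            continue
        # On ties, most_common keeps the residue that was seen first.
        result.append(Counter(residues).most_common(1)[0][0])
    return "".join(result)


# Hypothetical toy alignment, not taken from any real protein family.
toy_alignment = ["GKSTL", "GKTTL", "GRSTL", "GKS-L"]
print(consensus(toy_alignment))
```

A majority vote discards information about how variable each column is, which is why profile-based representations are often preferred for real families.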

basement membranes: Complex extracellular structures that play crucial roles in the organization and function of most tissues and organs, including muscle, skin, blood vessels, brain, heart, lung, eye, kidney, and peripheral nerves. Basement Membranes, Gordon Conference, 2008 http://www.grc.org/programs.aspx?year=2008&program=basement 

Related terms: Cell biology: extracellular matrix; Protein categories: membrane proteins

beta-sheets: See secondary protein structure, tertiary protein structure.

Blue Gene Project: IBM, Blue Gene Project   http://www.research.ibm.com/bluegene/sciapp.html

comparative protein structure modeling: Structural genomics glossary

conformation: See protein conformation.

crystallomics: Omes & omics

disordered proteins: See intrinsically disordered proteins

domain: An independently folded unit within a protein, often joined by a flexible segment of the polypeptide chain. IUPAC Bioinorganic

A discrete portion of a protein assumed to fold independently of the rest of  the protein and possessing its own function. [NCBI Bioinformatics]

A region of a protein’s amino acid sequence that has evolutionary, structural, or functional significance. The amino acid sequence of a domain determines a protein’s 3D structure. ... The stated goal of structural genomics, as a field, involves generating a set of structures representative of most of the possible folds for specific protein domains and then solving the structures for new proteins based on known fold-structure relationships. Pharmaceutical researchers are most interested in domains because these determine the “active” or “binding” sites of molecules. 

The focus of the group is the understanding of protein function and evolution using genomic, structural and proteomic data. Central to this question is the concept of the domain: a structurally conserved, genetically mobile unit. When viewed at the three-dimensional level of protein structure, a domain is a compact arrangement of secondary structures connected by linker polypeptides. It usually folds independently and possesses a relatively hydrophobic core. The importance of domains is that they cannot be divided into smaller units; they represent a fundamental building block that can be used to understand the evolution and function of proteins...  The advent of complete genomic sequences, including more and more eukaryotes, is leading to a fundamental change in protein domain analysis. Having characterised most of the domain families and having developed tools to predict them, we can now start to analyse their function and evolution on a higher level. Protein Function Analysis Group, Max Planck Institute for Molecular Genetics, Germany  http://protfunc.molgen.mpg.de/

Related terms: mosaic proteins, multi-domain proteins, protein families, target selection.

Protein domain databases see Databases & software directory.

domain shuffling: Creating new proteins by bringing domains together. It is thought that this is a major way that new proteins have arisen during evolution. Thus, mining of databases for homology by domains, rather than by whole proteins (which are not as evolutionarily conserved), is important in obtaining clues to functionality.  

A protein sequence can have more than one domain. 

Related term: multi-domain proteins.

fold alignment: A critical step in homology modeling, because it provides the key structures for the model. If suitably matched folds cannot be identified, a type of fold assignment known as protein threading can be used. 

fold recognition: See threading

folding: See protein folding

foldome: Omes & omics

homeomorphic superfamilies: Protein families are clustered into "homeomorphic superfamilies". Sequences are homeomorphic if they can be aligned from end-to-end. In practice, we allow the amino and carboxyl ends to be ragged and moderate internal length variations (represented as gaps in the sequences). However, all members of the superfamily should have the same overall domain architecture, i.e., the same domains in the same order (except for domains missing due to alternative splicing or very recent genetic events). It is assumed, although in most cases this has not been investigated in detail, that the molecules in a homeomorphic superfamily share a common evolutionary history since the acquisition of their constituent domains. Thus, it should be valid to construct an evolutionary tree from the members of a homeomorphic superfamily. If two groups of proteins with the same architecture are shown to have come to that structure independently, they are appropriately separated into two homeomorphic superfamilies. PIR Classification Terminology, Georgetown Univ, revised 1998 http://pir.georgetown.edu/pirwww/aboutpir/doc/short_sf_def.html 

homology domains: Many types of domains have been found in diverse proteins. In common use, the term "immunoglobulin superfamily" refers to the collection of all proteins that contain an immunoglobulin-like domain. We call such a group a "homology domain superfamily". Any given protein sequence will be assigned to only one homeomorphic superfamily, but it may contain sequence segments belonging to several homology domain superfamilies.  PIR Classification Terminology, Georgetown Univ, revised 1998 http://pir.georgetown.edu/pirwww/aboutpir/doc/short_sf_def.html 
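
The difference between the two PIR groupings can be sketched in a few lines. This is an illustration only, assuming each protein has already been annotated with an ordered list of domain labels: grouping by the full ordered architecture approximates a homeomorphic superfamily, while indexing by individual domain approximates a homology domain superfamily. The protein names, domain labels and function names are hypothetical, and the sketch ignores the exceptions the definition allows (ragged ends, domains missing through alternative splicing) as well as the requirement of shared evolutionary history.

```python
from collections import defaultdict

# Hypothetical toy annotations: protein name -> ordered list of domain labels.
proteins = {
    "protA": ["Ig", "FN3", "kinase"],
    "protB": ["Ig", "FN3", "kinase"],
    "protC": ["FN3", "Ig", "kinase"],  # same domains, different order
    "protD": ["Ig"],
}


def group_by_architecture(domain_map):
    """Group proteins that share the same domains in the same order."""
    groups = defaultdict(list)
    for name, domains in domain_map.items():
        groups[tuple(domains)].append(name)
    return dict(groups)


def group_by_domain(domain_map):
    """Index proteins by each domain they contain, regardless of order."""
    groups = defaultdict(set)
    for name, domains in domain_map.items():
        for domain in domains:
            groups[domain].add(name)
    return dict(groups)


for architecture, members in group_by_architecture(proteins).items():
    print(architecture, members)
for domain, members in group_by_domain(proteins).items():
    print(domain, sorted(members))
```

Each protein lands in exactly one architecture group but can appear under several domains, which mirrors the one-versus-several assignment described in the definitions.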

homology modelling: Structural genomics

integral membrane proteins: http://en.wikipedia.org/wiki/Integral_membrane_protein 

See also under membrane proteins

intrinsically disordered proteins IDP:  The newly formed IDP Subgroup provides a forum for the discussion of intrinsically disordered proteins, with topics including but not limited to experimental and theoretical studies of i) their intrinsically flexible state, ii) the mechanisms of their interactions with each other and with diverse partners including but not limited to structured proteins and nucleic acids, iii) their broadly defined functional roles in biological systems, and iv) their potential involvement in the pathogenesis of conformational and other diseases. The Subgroup invites participants from all scientific disciplines with an interest in broadening our understanding of IDPs, ranging from biophysical studies of individual proteins to genomic and proteomic studies in whole organisms. Biophysical Society, IDP Subgroup, 2007 http://www.biophysics.org/subgroups/idp.htm 

membrane proteins: Bioprocessing

misfolding: Protein misfolding and protein aggregation have been shown to be involved in a number of diseases, particularly neurodegenerative ones. Related terms: fold alignment, fold recognition, protein folding; Structural genomics: foldedness

molecular chaperones: A family of cellular proteins that mediate the correct assembly or disassembly of other polypeptides, and in some cases their assembly into oligomeric structures, but which are not components of those final structures. It is believed that chaperone proteins assist polypeptides to self-assemble by inhibiting alternative assembly pathways that produce nonfunctional structures. Some classes of molecular chaperones are the nucleoplasmins, the CHAPERONINS and HEAT-SHOCK PROTEINS. MeSH, 1995

mosaic proteins: Proteins with many (often repeated) domains are termed mosaic proteins. These domains or modules may be considered to be connected units which are independent in terms of their structure, function and folding behaviour. Principles of protein structure using the Internet, Dept. of Crystallography, Birkbeck College, Univ. of London, UK 1997-98

Mosaic proteins, Birkbeck College, Univ. of London http://www.cryst.bbk.ac.uk/PPS2/course/section10/mosaic.html

motif: A short conserved region in a protein sequence. Motifs are frequently highly conserved parts of domains. [NCBI Bioinformatics] 

See also amino acid motifs

Motif databases see Databases & software directory.
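
Because a motif is a short conserved sequence region, it can often be written as a pattern and searched for directly. The sketch below is an illustration under stated assumptions: it uses the well-known PROSITE N-glycosylation pattern N-{P}-[ST]-{P} translated by hand into a regular expression, and the function name `find_motif` and the toy sequence are hypothetical. A pattern match only flags a candidate site; it does not show that the site is functional.

```python
import re

# PROSITE-style pattern N-{P}-[ST]-{P} written as a regular expression.
# The lookahead lets overlapping occurrences be reported as well.
N_GLYC = re.compile(r"(?=(N[^P][ST][^P]))")


def find_motif(sequence, pattern=N_GLYC):
    """Return (1-based start position, matched text) for each occurrence."""
    return [(m.start() + 1, m.group(1)) for m in pattern.finditer(sequence.upper())]


# Hypothetical toy sequence, not a real protein.
toy_sequence = "MKNGSAPLNPSTNATQ"
for position, match in find_motif(toy_sequence):
    print(position, match)
```

The asparagine at position 9 of the toy sequence is skipped because it is followed by proline, which the pattern excludes.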

multi-domain proteins: Most proteins are multi-domain.  Structure determination is easiest for single-domain proteins (and these are many of the ones that have been solved). The interactions between a protein's domains can be complex and can be very significant for protein function and for drug discovery. 

multimeric: See under protein conformation.

native state: For proteins or nucleic acids, the conformation found in the intact cell; the final configuration.

oligomeric proteins: Proteins composed of two or more polypeptide chains.  

peptide library: Used by Mario Geysen (1985 +) to map peptide epitopes or antigenic sites on proteins.  Numerous strategies have developed over the past 20 years to synthesize mixtures of thousands to millions of peptides and allow selection of those with the desired activities.

peptide receptors: Cell surface receptors that bind peptide messengers with high affinity and regulate intracellular signals which influence the behavior of cells. MeSH, 1994

peripheral proteins:  Peripheral proteins associated at the lipid surface are one of the major components of biological membranes. They may function in situ as electron carriers (e.g. cytochrome c), as enzymes (e.g. protein kinase C), as signal transduction proteins (e.g. G-proteins), or primarily as structural elements (e.g. spectrin and myelin basic protein). The protein density at the membrane surface can be relatively high and the peripheral proteins may also interact with the exposed portions of integral proteins embedded within the membrane (e.g. with redox enzymes of the respiratory chain, or with receptors such as those to which G-proteins are coupled). T.Heimburg, and D.Marsh. 1996. Thermodynamics of the interaction of proteins with lipid membranes. in "Biological membranes - A molecular perspective from computation and experiment", B. Roux and K.M.Merz, eds., Birkhäuser, Boston, Basel, Berlin,1996 http://wwwuser.gwdg.de/~theimbu/abstracts/abstract20.html 

See also under membrane proteins

protein conformation: The characteristic 3-dimensional shape of a protein, including the secondary, supersecondary (motifs), tertiary (domains) and quaternary structure of the peptide chain. Quaternary protein structure describes the conformation assumed by multimeric proteins (aggregates of more than one polypeptide chain). MeSH, 1972 

The spatial arrangement of the atoms affording distinction between stereoisomers which can be interconverted by rotations about formally single bonds. Some authorities extend the term to include inversion at trigonal pyramidal centres and other polytopal rearrangements. [IUPAC Stereo]

protein domains:  Wikipedia http://en.wikipedia.org/wiki/Structural_domain 

See also domain

protein family: http://en.wikipedia.org/wiki/Protein_family 

Related terms: protein superfamily, protein subfamilies

Protein family databases see Databases & software directory.

protein folding: A rapid biochemical reaction involved in the formation of proteins. It begins even before a protein has been completely synthesized and proceeds through discrete intermediates (primary, secondary, and tertiary structures) before the final structure (quaternary structure) is developed. MeSH, 1993

Protein folding is a particularly good target for the application of single molecule methods because its complexity and stochastic nature make it difficult to study using ensemble methods. A population of unfolded protein molecules consists of a large number of nearly degenerate and rapidly interconverting protein conformations. Different folding pathways and transition states for the folding reaction cannot be singled out in a heterogeneous ensemble of molecules. NIGMS, Single Molecule Detection and Manipulation Workshop, "Single Molecule Fluorescence of Biomolecules and Complexes: Protein Folding," April 17-18, 2000  http://www.nigms.nih.gov/news/reports/single_molecules.html#examples 

Related terms: misfolding, protein folds, protein folding problem, refolding; Molecular medicine: protein folding disorders

Narrower term: high-throughput protein refolding

Folding@home: From Genome to structure, Stanford University http://www.stanford.edu/group/pandegroup/Cosm/  A new approach to solving the protein folding problem. Background information on the biology.

Protein fold databases see Databases & software directory

protein folding disorders: Molecular Medicine

protein folding problem: Lies at the heart of a huge amount of modern biomedical research: the fact that thousands of different sequences can all form the same three-dimensional structure.  Vijay Pande, Pande Group Projects, Stanford Univ. US   http://www.stanford.edu/group/pandegroup/projects.html#design

Related terms: protein folding, Folding@home; In silico & Molecular modeling: virtual genomes

protein folds: The core 3D structure of a domain is called a fold. It is estimated that only a few thousand distinct folds exist.

Related terms: misfolding, refolding

protein informatics: Proteomics

protein motif: See motif

protein sequence: Whether a protein's structure can be reliably predicted from its sequence is a long-standing question that many researchers have worked on. 

Related terms: protein folding, sequence homology.

protein structure: The 3D structure of a protein determines how the chemical groups that make up the binding site of a ligand, the active site of an enzyme, or the binding site for another protein come together. These binding sites or active sites are key to understanding the function of a protein in the cell, or to understanding how particular molecular targets (which are, in most cases, proteins) interact with drugs. Furthermore, knowledge of the 3D structure of a protein is also key to understanding how binding of a ligand (including drugs) changes the behavior of that protein. This knowledge can also aid the understanding of how particular mutations or variations in the gene that encodes a particular protein lead to changes in the protein’s behavior that can result in disease or in differences in drug interactions among different individuals. ... The 3D conformation of a target will be critical in determining whether the target is even druggable, and, if it is, which compounds will have the best fit based on this conformation. 

A greater ability to work with three-dimensional structures and to look for similarities in these structures (between the products of different genes) is expected to yield improved functional information. 

Related terms: high-throughput protein structure determination, protein structure prediction, protein structure technologies, structural genomics; Narrower terms: quaternary protein structure, secondary protein structure, tertiary protein structure.

protein structure data: Protein Data Bank (PDB)

Protein structure databases see Databases & software directory.

Protein Structure Factory: Structural genomics

Protein Structure Initiative: Structural genomics

protein structure prediction: Structural genomics

protein structure technologies:  NMR & X-ray crystallography

protein subfamilies: Many proteins belong to large families, as suggested by Dayhoff [1]. Such families are often composed of subfamilies related to each other by gene duplication events. ... subfamilies often differ in their biological functionality yet still exhibit a high degree of sequence similarity.  Christian M. Zmasek, Sean R. Eddy, RIO: Analyzing proteomes by automated phylogenomics using resampled inference of orthologs, BMC Bioinformatics 3 (1): 14, 2002

Related terms: protein family, protein superfamilies

protein superfamily: Margaret O. Dayhoff introduced the term protein superfamily in 1974. Since that time, the sequences in the PIR - International Protein Sequence Database have been classified into protein superfamilies. Prior to about 1990, the superfamily classification permitted a sequence to be assigned to a single superfamily only. The recognition of mosaic, multidomain proteins, whose component domains appear to have had separate evolutionary histories, has made this approach no longer effective. Moreover, the term superfamily has come into common usage and its meaning is no longer well defined. Although originally defined as a group of evolutionarily related proteins, it also has been used in the published literature to refer to a group of structurally or functionally related proteins not necessarily of common evolutionary origin. [David George, "Proposal for the Definition of 'Protein Superfamily'", Aug. 18, 1993, PIR database] http://www-nbrf.georgetown.edu/pirwww/otherinfo/sfdef.html  

The organization of proteins into superfamilies based primarily on their sequences is introduced: examples are given of the methods used to cluster the related sequences and to elucidate the evolutionary history of the corresponding genes within each superfamily. MO Dayhoff, The origin and evolution of protein superfamilies, Federation Proceedings 35(10): 2132-2138, Aug. 1976

Related terms: protein family, protein subfamilies

protein taxonomy: A Protein Taxonomy Based on Secondary Structure T. Przytycka, R. Aurora, GD Rose, Nature Structural Biology 6 (7): 1999.

quaternary protein structure: The defined organization of two or more macromolecules with tertiary structure such as a protein that are held together by hydrogen bonds and van der Waals and coulombic forces.  [IUPAC Compendium]

The characteristic 3-dimensional shape and arrangement of  multimeric proteins (aggregates of more than one polypeptide chain). MeSH, 2000

secondary protein structure: The conformational arrangement (α-helix, β-pleated sheet, etc.) of the backbone segments of a macromolecule such as a polypeptide chain of a protein without regard to the conformation of the side chains or the relationship to other segments. [IUPAC Compendium]

The level of protein structure in which regular hydrogen-bond interactions within contiguous stretches of polypeptide chain give rise to alpha helices, beta strands (which align to form beta sheets) or other types of coils. This is the first folding level of protein conformation. [MeSH, 1993] 

Related term: motif.

sequence homology: Sequencing

solving protein structures: See high-throughput protein structure determination

structural proteomics: Proteomics

structure-function of proteins: Proteins are responsible in part for maintaining functional stability and homeostasis of cells and tissues ... Accumulation of altered proteins may be correlated with a loss of function or, in some cases, a gain of inappropriate or toxic function... The ability of a protein to perform its function in the cell depends in part upon its ability to assume and retain its proper functional conformation.  The proper conformation is achieved by regulated folding during synthesis, aided by chaperone proteins.  Mutations and other changes that divert proteins from their normal folding pathways or that destabilize their native state may underlie several human diseases ... Cellular quality control machinery must then recognize misfolded and/or partially folded products and either refold them or mark them for recycling.  Off-pathway traps can be caused by aggregation, mis-targeting into an inappropriate cellular location, or proteolysis.  Proteins and peptides that are aggregated (for example, into amyloid plaques) or cross-linked are often resistant to degradation.  The formation of these deposits, rather than the lack of native protein, may be responsible for, or contribute significantly to, cellular pathology. [National Institute on Aging "Protein Structure and function in Aging & Late-life Disease"  April 14, 1999 RFA:  AG-99-005] http://grants.nih.gov/grants/guide/rfa-files/RFA-AG-99-005.html  Related terms: Structural Genomics

superfamily: See protein superfamily

tertiary protein structure: The spatial organization (including conformation) of an entire protein molecule or other macromolecule consisting of a single chain. [IUPAC Compendium]

The level of protein structure in which combinations of secondary protein structures (alpha helices, beta sheets, loop regions, and motifs) pack together to form folded shapes called domains. … Small proteins usually consist of only one domain but larger proteins may contain a number of domains connected by segments of  polypeptide chain which lack regular secondary structure. MeSH, 1993

Bibliography
Folding@home glossary, Stanford Univ. Tug Sezen, Vijay Pande, 2002, 200+ definitions http://www.stanford.edu/group/pandegroup/folding/education/glossary.html
UNI-PROT KnowledgeBase keywords http://www.expasy.org/cgi-bin/keywlist.pl   Swiss Institute of Bioinformatics, Geneva Switzerland, European Bioinformatics Institute, Hinxton, UK, PIR Protein Information Resource, 2004, 800 + definitions.


IUPAC definitions are reprinted with the permission of the International Union of Pure and Applied Chemistry.
