
Biopharmaceutical information management & interpretation glossary & taxonomy
Evolving Terminology for Emerging Technologies
Comments? Questions? Revisions? Mary Chitty 
mchitty@healthtech.com
Last revised June 02, 2008



The dividing line between this glossary and Algorithms & data analysis is very fuzzy. In general this one focuses on unstructured data (or a combination of structured and unstructured), while Algorithms centers on structured data.
Informatics includes: Bioinformatics, Computers & computing, In silico & Molecular Modeling, Ontologies
Technologies: Microarrays & protein chips, Sequencing

Advances in biology and new high-throughput technologies are generating massive amounts of data that overwhelm the current information technology infrastructure. The challenge is to build a common capability that enables a more efficient translation of data into knowledge that leads to new and effective treatments. caBIG™ and Molecular Medicine, NCI, NIH http://cabig.cancer.gov/molecular/overview.asp

Google = "data analysis" about 1,420,000 as of July 23, 2002; about 4,480,000 as of Sept. 23, 2004; "data interpretation" about 58,200 as of July 23, 2002; about 147,000 as of Sept. 23, 2004

3D technologies: Visual communications are pervasive in information technology and are a key enabler of most new emerging media. In this context, the NRC Institute for Information Technology (NRC-IIT) performs research, development and technology transfer activities to enable access to 3D information of the real world. Research in the 3D Technologies program focuses on three main areas: Virtualizing Reality and Visualization, Collaborative Virtual Environments, 3D Data Mining and Management [Institute for Information Technology, National Research Council, Canada, 3D Technologies] 

artificial intelligence: Algorithms & data analysis glossary

Google = about 1,120,000 July 19, 2002; about 3,040,000 Oct. 22, 2004

BIRN Biomedical Informatics Research Network: http://www.nbirn.net/ 

bias: One of the two components of measurement error (the other one being variance). Bias is a systematic error that causes the measurement to differ from the correct value. Since bias is systematic, it affects all experiment replicas the same way. 
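A minimal sketch of the distinction, assuming a hypothetical instrument and a quantity whose true value is known. The readings and the names `true_value` and `replicas` are invented for illustration. It separates the systematic offset (bias) from the spread (variance) and shows that removing the bias shifts every replica by the same amount without changing the spread.

```python
from statistics import mean, pvariance

# Hypothetical replicate measurements of a quantity whose true value is known.
true_value = 10.0
replicas = [10.4, 10.6, 10.5, 10.3, 10.7]

# Bias: systematic offset of the average reading from the true value.
bias = mean(replicas) - true_value

# Variance: spread of the replicas around their own mean.
variance = pvariance(replicas)

# Bias affects every replica the same way, so subtracting it shifts
# all readings equally and leaves the spread untouched.
corrected = [r - bias for r in replicas]
```

In practice the true value is usually unknown, so bias has to be estimated against a reference standard or control; averaging more replicas reduces the effect of variance but not of bias.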

bibliomining:  The combination of data mining, bibliometrics, statistics, and reporting tools used to extract patterns of behavior-based artifacts from library systems. Scott Nicholson, Bibliomining: Data Mining for Libraries, Syracuse Univ. US http://www.bibliomining.com/ 

bioinformatics visualization: Bioinformatics glossary

biomedical computing: Computers & computing glossary    

Google = about 11,800 July 19, 2002; about 20,900 Oct. 22, 2004

biomedical informatics: 

Google about 66,600 Oct. 22, 2004

biomedical ontologies: Open Biomedical Ontologies is an umbrella web address for well-structured controlled vocabularies for shared use across different biological and medical domains.  http://obo.sourceforge.net/ 

Google = about 102, Jan. 8, 2003; about 294 Oct. 1, 2003; about 490 Oct 22, 2004; about 488 May 2, 2005

Biomedical Ontologies: Overview

BIONLP.org: Bioinformatics glossary

biopharmaceutical informatics: Drug companies go through a very arduous and regulated discovery, applied research, and development process - typically spanning five years of laboratory research and ten years of clinical studies ... multinational clinical studies, which need to be done with tremendous precision over a very long period of time. The study parameters must be identical for every patient (many times numbering 10,000 patients, followed for five or more years), and all the participating hospitals essentially have to behave in exactly the same way for the trial to be valid. ... The life science industry is conservative by nature, and therefore it is a late-adopting industry. It is very sensitive to standards because of the legacy according to which these companies have to maintain data and information. Major pharmaceutical companies typically adopt a 100-year minimum document retention policy, ... each of the industry's four industrial sectors - the pharmaceutical, the biotech, the medical device, and the diagnostics sector - has a different set of needs and desires, as well as its own requirements for unique IT solutions. ...

Life science companies are dealing with very large computational data sets. Some are now approaching half terabyte sizes and upward. Life science companies also immensely concern themselves with security, because their data represent their crown jewels. Other major concerns expressed by this industry include the stability, scalability, and security of an operating environment. Life science companies and regulatory bodies such as the FDA are more concerned than ever with operating environments that decay with use: When under computational stress, these fragile operating systems have a habit of crashing, and when these systems crash, they tend to corrupt data. ...

Post-genomic, proteomic, chemical information, and other data sets have created a major appetite for solutions to deal with this tremendous amount of data. Scientists are now asking their IT professionals for the ability to better conceptualize and interpret the meaning of this vast information. To do this, scientists need tools for 3D visualization with a tremendous degree of high definition and accuracy. The next step is to take disparate data sets, render them into 3D values, see the DNA and RNA interface, watch protein folds, and then put a therapeutic small molecule in there and see how it relates within a virus that environmentally influences a different process. Scientists Are Demanding Solutions for Dealing with the Post-Genomic, Proteomic, and Chemical Data Deluge: An Interview with Howard Asher, Director, Global Life Sciences Group, Sun Microsystems, CHI GenomeLink 30 http://www.healthtech.com/newsarticles/issue30_1.asp 

Biosemantics Group: http://www.biosemantics.org/  Addresses concept identification and disambiguation algorithms, meta-analysis and visualization techniques, and biological applications [interconnect genes and proteins, semi-automated annotations of protein functions.] Medical Informatics department of the ErasmusMC University Medical Center of Rotterdam and the Center for Human and Clinical Genetics of the Leiden University Medical Center

blog: Wikipedia http://en.wikipedia.org/wiki/Blog 

Related terms: blogging, blogosphere, microcontent, nanopublishing, weblog

blogging:  In the beginning - say 1994 - the phenomenon now called blogging was little more than the sometimes nutty, sometimes inspired writing of online diaries. These days, there are tech blogs and sex blogs and drug blogs and onanistic teenage blogs. But there are also news blogs and commentary blogs, sites packed with links and quips and ideas and arguments that only months ago were the near-monopoly of established news outlets. Poised between media, blogs can be as nuanced and well-sourced as traditional journalism, but they have the immediacy of talk radio.  Andrew Sullivan, "The blogging revolution" Wired Magazine, May 2002 http://www.wired.com/wired/archive/10.05/mustread.html?pg=2

bottom-up ontologies: Are flexible through the use of implicit and, hence, parsimonious part-whole and subconcept-superconcept relations. The bottom-up method complements current practice, where, as a rule, ontologies are built top-down. The design method is illustrated by an example involving ontologies of pure substances at several levels of detail. It is not claimed that bottom-up construction is a generally valid recipe; indeed, such recipes are deemed uninformative or impossible. Rather, the approach is intended to enrich the ontology developer's toolkit. [Paul E. van der Vet, Nicolaas J.I. Mars, Bottom-Up Construction of Ontologies, IEEE Transactions on Knowledge Engineering, July-Aug, 1998 10(4): 513-526] http://www.computer.org/tkde/tk1998/k0513abs.htm

Google = "bottom-up ontologies" about 10; bottom-up ontologies about 2,250 July 19, 2002

bottom-up taxonomies: Faceted classification is a hallmark of the bottom-up approach and suggests yet another reason why the phrase "build the taxonomy" is ill-conceived. ... The bottom-up approach suggests a very different way to classify content. When populating a top-down taxonomy, the central question is "where do I put this?" but at the heart of the bottom-up approach is the question "how do I describe this?" By asking this subtly different question, you'll wind up in a dramatically different destination.  Peter Morville, "Bottoms up: Designing complex, adaptive systems", Faceted Classification, New Architect, 2002  http://www.newarchitectmag.com/documents/s=7733/na1202b/index3.html 

Can mean from specific to general, but it can also mean content-oriented. [Jean Graef "Top down or bottom up" Montague Institute Review, 2001] http://www.montague.com/review/topdown.html

CML Chemical Markup Language: Chemoinformatics glossary

classification: Involves the development and use of a scheme for the systematic organization of knowledge. (Taylor p 576) Arlene Taylor identified three approaches to classification: enumerative, hierarchical, and analytico-synthetic. Enumerative classification attempts to assign headings for every subject and alphabetically enumerates them. Hierarchical classification uses a more philosophical approach based on the inherent organization of the subject being classified, and establishes logical rules for dividing topics into classes, divisions, and subdivisions. Analytico-synthetic classification assigns terms to individual concepts and provides rules for the local cataloger to use in constructing headings for composite subjects. Traditional classification systems in this country are basically enumerative, though many contain some elements of hierarchy and faceting. (Taylor pp 319-321) Amanda Maple, "FACETED ACCESS: A REVIEW OF THE LITERATURE" Working Group on Faceted Access to Music, Music Library Association Annual Meeting, 10 February 1995 http://theme.music.indiana.edu/tech_s/mla/facacc.rev  

Indexing in the library and information management sense, but also see Algorithms & data analysis glossary classification, classifiers

collaborative filtering: Tools that leverage user preferences, patterns, and purchasing behavior to customize organization and navigation systems. [Peter Morville "Software for Information Architects" Argus Center for Information Architecture, 2000]  http://argus-acia.com/strange_connections/current_article.html 

Amazon's recommendations, based on what other buyers of a specific title are buying, are a familiar example of collaborative filtering.
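The "buyers of this title also bought" idea can be sketched as a simple co-purchase count. This is a toy illustration under stated assumptions, not how any real retailer computes recommendations; the purchase data and the function name `also_bought` are hypothetical.

```python
from collections import Counter

# Hypothetical purchase histories: buyer -> set of titles bought.
purchases = {
    "ann": {"Genomes", "Proteomics Primer", "Bioinformatics"},
    "bob": {"Genomes", "Bioinformatics"},
    "cat": {"Genomes", "Statistics"},
    "dan": {"Statistics", "Proteomics Primer"},
}

def also_bought(title, purchases, top_n=3):
    """Rank other titles by how many buyers of `title` also bought them."""
    counts = Counter()
    for basket in purchases.values():
        if title in basket:
            counts.update(basket - {title})
    # Sort by descending count, then alphabetically so ties are deterministic.
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [t for t, _ in ranked[:top_n]]

recommendations = also_bought("Genomes", purchases)
```

Production systems typically normalize these counts (for example by item popularity) so that best-sellers do not dominate every recommendation list.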

Google = about  21,600 July 19, 2002; about 49,300 Oct. 22, 2004 

collaborative metadata: A robust increase in both the amount and quality of metadata is integral to realizing the Semantic Web. The research reported on in this article addresses this topic of inquiry by investigating the most effective means for harnessing resource authors' and metadata experts' knowledge and skills for generating metadata. Jane Greenberg, W. Davenport Robertson, Semantic web construction: An Inquiry of Authors' Views on Collaborative Metadata Generation, International Conference DC 2002, Metadata for e-Communities, Oct. 13-17, 2002, Florence Italy http://dois.mimas.ac.uk/DoIS/data/Papers/dcmdcflorp:5.html
http://www.bncf.net/dc2002/program/ft/paper5.pdf

Google = about 116 Apr. 24, 2003; about 377 Oct. 22, 2004

common ontology: Defines the vocabulary with which queries and assertions are exchanged among agents. ... The agents sharing a vocabulary need not share a knowledge base; each knows things the other does not, and an agent that commits to an ontology is not required to answer all queries that can be formulated in the shared vocabulary. In short, a commitment to a common ontology is a guarantee of consistency, but not completeness, with respect to queries and assertions using the vocabulary defined in the ontology. [Tom Gruber, "What is an ontology?" Knowledge Systems Lab, Stanford Univ. 2001] http://www-ksl.stanford.edu/kst/what-is-an-ontology.html

Google = about  1,190 July 19, 2002, about 4,130 Oct. 22, 2004 

Related terms: ontological commitment, reusable ontologies, shared ontologies 

communications standards: Pharmacogenomics glossary

communities of practice:  Alliances glossary  

competitive intelligence: Business of biopharmaceuticals glossary

computational linguistics:  Computational Linguistics, or Natural Language Processing (NLP), is not a new field. As early as 1946, attempts have been undertaken to use computers to process natural language. These attempts concentrated mainly on Machine Translation ... the limited performance of these systems made it clear that the underlying theoretical difficulties of the task had been grossly underestimated, and in the following years and decades much effort was spent on basic research in formal linguistics. Today, a number of Machine Translation systems are available commercially although there still is no system that produces fully automatic high-quality translations (and probably there will not be for some time). Human intervention in the form of pre- and/or post-editing is still required in all cases.  Another application that has become commercially viable in the last years is the analysis and synthesis of spoken language, i.e. speech understanding and speech generation. ... An application that will become at least as important as those already mentioned is the creation, administration, and presentation of texts by computer. Even reliable access to written texts is a major bottleneck in science and commerce. The amount of textual information is enormous (and growing incessantly), and the traditional, word-based, information retrieval methods are getting increasingly insufficient as either precision or recall is always low (i.e. you get either a large number of irrelevant documents together with the relevant ones, or else you fail to get a large number of the relevant ones in the collection). Linguistically based retrieval methods, taking into account the meaning of sentences as encoded in the syntactic structure of natural language, promise to be a way out of this quandary. [Computational Linguistics FAQ, Univ. of Zurich, Switzerland, 2001] http://www.ifi.unizh.ch/groups/CL/CL_FAQ.html
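The precision/recall trade-off mentioned in the definition above can be made concrete with a small calculation. This is a sketch using the standard definitions (precision = relevant retrieved / retrieved; recall = relevant retrieved / relevant); the document identifiers and the two example result lists are hypothetical.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for one query's result list."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = retrieved & relevant
    precision = len(hits) / len(retrieved) if retrieved else 0.0
    recall = len(hits) / len(relevant) if relevant else 0.0
    return precision, recall

# Hypothetical collection: four documents are truly relevant to the query.
relevant = {"d1", "d2", "d3", "d4"}

# A broad query finds everything relevant but drags in irrelevant documents.
broad = ["d1", "d2", "d3", "d4", "d5", "d6", "d7", "d8"]

# A narrow query returns only relevant material but misses most of it.
narrow = ["d1"]

broad_scores = precision_recall(broad, relevant)    # low precision, full recall
narrow_scores = precision_recall(narrow, relevant)  # full precision, low recall
```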

Google = about  97,100 July 19, 2002, about 283,000 Oct. 22, 2004 

Linguistics, natural language, and computational linguistics Meta-Index, Stanford Univ. US  http://www-nlp.stanford.edu/links/linguistics.html

configurable: Many out-of-the-box solutions claim to be easy to "customize," when in fact they are referring to configuration options, not true customizability.  Manufacturers have distinct challenges, some which can be addressed out of the box, but many of which cannot. Manufacturers also need the ability to capitalize on changing dynamics in the marketplace before their competitors do. That's why it's imperative to understand the differences between configuration and customization and the value of selecting a CRM system that offers the flexibility to adapt and model specific manufacturing business processes.  Why you need to know the difference between Customizable and Configurable CRM, CDC Software podcast, Intelligent Enterprise,  2006 http://whitepaper.intelligententerprise.com/cmpintelligententerprise/search/viewabstract/86931/index.jsp 

contextual data: While proteomic studies initially focused largely on expression and protein identification, progress in these areas drove the demand for more detailed types of proteomic data. Now researchers want information about where specific proteins are expressed, both in terms of tissues and localization within the cell. Information relating proteins to function requires additional details of post-translational modification, and studies of protein interactions have moved beyond just looking at binary interactions to studies of protein complexes.

For both genomics and proteomics, this shift can be characterized as an interest in more contextual data. Enhanced insight into biological context is essential for obtaining a better understanding of how biology actually works, and thus there is now an emphasis to move from genomic and proteomic snapshots to time series data of expression. Such context is of particular value if biological studies are to be translated into medical advances, because of the importance of being able to predict the impact of potential treatments. The integration of genomic and proteomic data with medical conditions, treatment and outcomes becomes another critical type of contextual information. Christina Lingham, Beyond Genome: Thinking Globally, Cambridge Healthtech http://www.beyondgenome.com/download/editorial.pdf

controlled vocabulary: Robin Cover's XML Cover Pages is described as "a collection of references on matters of Subject Classification, Taxonomies, Ontologies, Indexing, Metadata, Metadata Registries, Controlled Vocabularies, Terminology, Thesauri, Business Semantics", 2003 http://xml.coverpages.org/classification.html

A limited number of words or phrases used in an indexing system (subject headings) or database, to ensure reliable, consistent retrieval. Controlled vocabularies have long been used to enhance retrievability and consistency. Ontologies and/or taxonomies certainly sound sexier than "controlled vocabularies" but continue to have a good deal in common with them. Taxonomies add hierarchies, while ontologies make information "machine-understandable" as well as machine-readable.

Google = about 39,700 July 19, 2002; about 85,300 Oct. 22, 2004 

Broader terms: ontology, taxonomy Related terms: RDF, semantic web 

Thesauri and controlled vocabulary definitions, National Library of Canada, 2002, http://www.tbs-sct.gc.ca/its-nit/standards/tbits39/crit392_e.asp 

customizable: Quite labor intensive and can be very expensive.  Compare configurable.

DAML DARPA Agent Markup Language: The goal of the DAML effort is to develop a language and tools to facilitate the concept of the semantic web. http://www.daml.org/  Related term: OIL

DAML + OIL http://www.w3.org/TR/daml+oil-walkthru/

data cleaning, data integration: Algorithms & data analysis glossary

Google = "data cleaning" about 12,200; about 22,500 July 3, 2003
"data integration" about 175,000 July 19, 2002; about 306,000 July 3, 2003; about 817,000 Mar. 22, 2004; about 2,940,000 June 22, 2007

data conversion:   Originally data conversion was primarily a matter of moving text and database files from one medium to another, one hardware platform to another, one operating system environment to another. But as text and database representations became more sophisticated it became apparent that application interoperability was going to be the overriding issue of concern. Company History, Data Conversion Lab  http://www.dclab.com/company_history.asp 

Glossary, DCL Labs http://www.dclab.com/glossary.asp 30+ definitions

data management methods: The Algorithms & data analysis glossary covers automated methods; methods in this glossary generally combine human and automated methods.

data management vocabulary: A third type of taxonomy that is valuable in a business setting is the data management vocabulary. This taxonomy is a short list of authorized terms without any hierarchical structure that is used to support business transactions. For example, with a large sales force, it is most efficient if salespeople report their work using the same list of activities. They may count their contacts with companies according to a simple list of contact types (managers, decision-makers, and so on), and they may categorize the businesses they work with according to different controlled descriptors that have to do with the business's size or market. In this case, a shared taxonomy will help to support reporting needs of management and other salespeople trying to mine the information in the future. Without a shared taxonomy, a company risks developing islands of data that cannot be shared or easily utilized by the rest of the organization. Susan Conway and Char Sligar, "What is a taxonomy" Unlocking Knowledge Assets, Chapter 6, Building Taxonomies, Microsoft Press, 2002   http://www.microsoft.com/mspress/books/sampchap/5516a.asp

Google = about 49 July 9, 2007

Related terms: descriptive taxonomies, navigational taxonomies

data mart, data mining, data pipelining, data reduction methods, data warehouse: Algorithms & data analysis glossary

data visualization:  The classical definition of visualization is as follows: the formation of mental visual images, the act or process of interpreting in visual terms or of putting into visual form. A new definition is a tool or method for interpreting image data fed into a computer and for generating images from complex multi-dimensional data sets (1987). Definitions and Rationale for Visualisation, D. Scott Brown, SIGGRAPH, 1999 http://www.siggraph.org/education/materials/HyperVis/visgoals/visgoal2.htm   includes information on data visualization.

Related term: information visualization; Broader term: visualization

databases: Bioinformatics glossary; Databases & software directory

deep web:  Most of the Web's information is buried far down on dynamically generated sites, and standard search engines never find it.  The deep Web is qualitatively different from the surface Web. Deep Web sources store their content in searchable databases that only produce results dynamically in response to a direct request. But a direct query is a "one at a time" laborious way to search.  [Michael K. Bergman "The deep web: surfacing hidden value" White Paper, BrightPlanet, 2000-2002] http://www.brightplanet.com/deepcontent/tutorials/DeepWeb/index.asp  Another version at  http://www.press.umich.edu/jep/07-01/bergman.html

Google = about 10,200 Aug. 17, 2002; about 42,900 Oct. 22, 2004

Related term:  invisible web

description logic: Has existed as a field for a few decades yet only somewhat recently has appeared to transform from an area of academic interest to an area of broad interest. This paper provides a brief historical perspective of description logic developments that have impacted DL usability to include communities beyond universities and research labs.  Deborah L. McGuinness. "Description Logics Emerge from Ivory Towers". Stanford Knowledge Systems Laboratory Technical Report KSL-01-08 2001. In the Proceedings of the International Workshop on Description Logics. Stanford, CA, August 2001. http://www.ksl.stanford.edu/people/dlm/papers/dls-emerge-abstract.html

The main effort of the research in knowledge representation is providing theories and systems for expressing structured knowledge and for accessing and reasoning with it in a principled way. Description Logics are considered the most important knowledge representation formalism unifying and giving a logical basis to the well known traditions of Frame-based systems, Semantic Networks and KL-ONE-like languages, Object-Oriented representations, Semantic data models, and Type systems. [Description Logic Knowledge Representation] http://dl.kr.org/

Description Logics Home Page, Patrick Lambrix, Linkoping Univ. Sweden http://www.ida.liu.se/labs/iislab/people/patla/DL/index.html

descriptive ontology: A descriptive ontology would try to explain how things are, whereas a normative ontology would try to tell us how things ought to be. [Robert Kent "Ballot comment", Standard Upper Ontology [SUO] E-mail archive,  IEEE, 2001] http://suo.ieee.org/email/msg05921.html

Google = about 121 July 19, 2002; about 343 Oct. 22, 2004 

descriptive taxonomies: Supports information retrieval through searching. By developing and maintaining a core set of controlled vocabularies, a company can consistently label or tag its content with descriptive metadata selected from these authorized vocabularies. In addition, vocabularies can capture knowledge worker terminology and map it to a company’s preferred terms. ... Active mining of new terms and phrases from emerging content and from search query logs will help keep a descriptive taxonomy relevant to the users of that information. A taxonomy built on the thesaurus model (designating a preferred or authorized term with entry terms or variants) helps to link these different terms together. At search time, the term that the knowledge worker uses is associated with the preferred (or key) term for more precise searching, or the knowledge worker’s term is expanded to include the variant forms of the term as well as the authorized term for a broader search. Taxonomies built on the thesaurus model do not force all work groups to use a common set of terminology. Susan Conway and Char Sligar, "What is a taxonomy" Unlocking Knowledge Assets, Chapter 6, Building Taxonomies, Microsoft Press, 2002   http://www.microsoft.com/mspress/books/sampchap/5516a.asp
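The thesaurus model described in this entry, with a preferred term linked to its entry terms, can be sketched with two small lookups. The vocabulary below and the function names `precise_query` and `broad_query` are hypothetical, chosen only to illustrate the two search behaviors the passage describes.

```python
# Hypothetical thesaurus: preferred (authorized) term -> entry terms (variants).
thesaurus = {
    "neoplasms": ["cancer", "tumors", "tumours"],
    "myocardial infarction": ["heart attack", "MI"],
}

# Reverse index: any preferred term or variant (lowercased) -> preferred term.
to_preferred = {}
for preferred, variants in thesaurus.items():
    to_preferred[preferred.lower()] = preferred
    for variant in variants:
        to_preferred[variant.lower()] = preferred

def precise_query(term):
    """Map the user's term to the preferred term for a more precise search."""
    return to_preferred.get(term.lower(), term)

def broad_query(term):
    """Expand the user's term to the preferred term plus all of its variants."""
    preferred = to_preferred.get(term.lower())
    if preferred is None:
        return [term]  # Unknown terms pass through unchanged.
    return [preferred] + thesaurus[preferred]
```

Because users' own terms are mapped rather than rejected, work groups can keep their local terminology, which matches the passage's point that the thesaurus model does not force a common vocabulary on everyone.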

Google = about  119 July 19, 2002; about  201 Oct. 22, 2004; about 456 July 9, 2007

Related terms: bottom-up taxonomies, data management vocabulary, navigational taxonomies, shared taxonomies

digital libraries: International digital libraries research is intended to contribute to the fundamental knowledge required to create information systems that can operate in multiple languages, formats, media, and social and organizational contexts. International collaborative research can bring complementary approaches, resources and perspectives to bear on common needs and information technology research challenges. International digital libraries applications testbeds are intended to build operational prototypes for globally distributed, internet-based resources, and to implement these in a variety of applications contexts. The testbeds are expected to advance technologies across the digital libraries lifecycle, focus collective work on organizing domain-specific content, and engage researchers, scholars, students and teachers in enhancing research and knowledge resources in a variety of subject domains. [National Science Foundation, International Digital Libraries Collaborative Research & Applications Testbeds program solicitation, 2002] http://www.nsf.gov/pubs/2002/nsf02085/nsf02085.html

Google = about 197,000 July 19, 2002; about 1,480,000 Oct. 22, 2004 

Directed Acyclic Graph DAG: A directed graph where no path starts and ends at the same vertex. See also directed graph, acyclic graph, cycle. Note: Also called a DAG or acyclic digraph. Also called an oriented acyclic graph. [Paul E. Black, NIST, Dictionary of Algorithms, Data Structures and Problems, 2001] http://www.nist.gov/dads/HTML/directAcycGraph.html

The difference between a DAG and a hierarchy is that in the latter each child can only have one parent; a DAG allows a child to have more than one parent. A child term may be an "instance" of its parent term (is-a relationship) or a component of its parent term (part-of relationship). A child term may have more than one parent term and may have a different class of relationship with its different parents. [Gene Ontology Consortium, "General Documentation" 2001] http://www.geneontology.org/doc/GO.doc.html
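A small sketch of the DAG-versus-hierarchy distinction, assuming a simplified, hypothetical set of terms loosely modeled on the is-a and part-of relationships described above (these are not actual Gene Ontology records). Each child lists its parents together with the class of relationship.

```python
# Hypothetical terms: child -> list of (parent, relationship) pairs.
parents = {
    "nucleus": [("organelle", "is_a"), ("cell", "part_of")],
    "organelle": [("cellular component", "is_a")],
    "cell": [("cellular component", "is_a")],
    "cellular component": [],
}

def ancestors(term, parents):
    """All terms reachable by following parent links upward from `term`."""
    seen = set()
    stack = [term]
    while stack:
        for parent, _relationship in parents.get(stack.pop(), []):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen

def is_acyclic(parents):
    """True if no term is its own ancestor (no path starts and ends at the same vertex)."""
    return all(term not in ancestors(term, parents) for term in parents)

def is_hierarchy(parents):
    """True only if every term has at most one parent."""
    return all(len(links) <= 1 for links in parents.values())
```

Here `nucleus` has two parents with different relationship classes, so the structure is a valid DAG but not a strict hierarchy.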

Google = about  18,300 July 19, 2002; about 35,000 Oct. 2, 2004 

disambiguate: Make less ambiguous, clarify, elucidate. 

Google = about  33,100 July 19, 2002; about 65,300 Oct. 22, 2004 

domain expertise: 

Google = about 25,500 Dec. 18, 2002; about 68,500 Oct. 22, 2004; about 785,000 June 22, 2007

domain ontology: Ontologies glossary

domain taxonomies: The first step is to define the taxonomy of entities in the domain. This consists of firstly defining the basic classes, then defining the sub-types of these classes.  [Mick O'Donnell, "Defining domain taxonomies" Domain Acquisition in Ilex 3.0, 1993-1996] http://www.hcrc.ed.ac.uk/ilex/Manual/extending/Domain-Acquisition/domacq/node4.html#S0....

Google = about 166 July 19, 2002; about 276 Oct. 22, 2004 

drug discovery informatics:

drug ontology: Drug discovery & Development

Dublin Core Metadata Initiative: An open forum engaged in the development of interoperable online metadata standards that support a broad range of purposes and business models. The original workshop for the Initiative was held in Dublin, Ohio [OCLC] in 1995. http://dublincore.org/

dynamic ontology: Ontologies glossary

dynamic taxonomies: Developed as a way of sifting through large amounts of data. At its base it uses a domain specific taxonomic hierarchy, consisting of concepts connected by is-a relationships. Examples from the medical domain include UMLS and SNOMED. Concepts from the hierarchy are used to classify chunks of guidelines text. The hierarchy is then used as an augmented index for guidelines chunk retrieval. Navigation is done via the operations of browsing and zooming. [Dennis Wollersheim, Implementation of dynamic taxonomies for clinical guidelines retrieval, La Trobe Univ., Australia, c. 2001]  http://homepage.cs.latrobe.edu.au/lewisba/SPIRT/dw2001c.pdf
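A loose sketch of retrieval over an is-a hierarchy, assuming a tiny hypothetical concept tree and invented guideline chunks. It is not the cited implementation; the `zoom` function here is only one plausible reading of the zooming operation: keep the chunks classified under a chosen concept or any of its descendants, optionally narrowing an earlier result.

```python
# Hypothetical is-a hierarchy: concept -> parent concept (None for the root).
is_a = {
    "disease": None,
    "heart disease": "disease",
    "myocardial infarction": "heart disease",
    "angina": "heart disease",
    "diabetes": "disease",
}

# Hypothetical guideline chunks, each classified under one or more concepts.
chunks = {
    "chunk-1": {"myocardial infarction"},
    "chunk-2": {"angina", "diabetes"},
    "chunk-3": {"diabetes"},
}

def subsumed_by(concept, ancestor):
    """True if `concept` equals `ancestor` or lies below it via is-a links."""
    while concept is not None:
        if concept == ancestor:
            return True
        concept = is_a[concept]
    return False

def zoom(concept, candidates=None):
    """Keep chunks classified under `concept` or any of its descendants."""
    pool = chunks if candidates is None else {c: chunks[c] for c in candidates}
    return sorted(
        c for c, tags in pool.items()
        if any(subsumed_by(tag, concept) for tag in tags)
    )

heart = zoom("heart disease")                   # chunks about any heart disease
heart_and_diabetes = zoom("diabetes", heart)    # narrow the previous result
```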

Google = about 119 July 19, 2002; about 369 Oct. 22, 2004 

evolvability:   Tim Berners-Lee defines    http://www.w3.org/Talks/1998/0415-Evolvability/slide3-1.htm 

Google = evolvability  about 8,210  July 19, 2002; about 21,400 Oct. 22, 2004

See also under interoperability

facet:  Ranganathan was the first to introduce the word "facet" into library and information science, and the first to consistently develop the theory of facet analysis. A facet is, simply put, a category. Taylor defines facets as "clearly defined, mutually exclusive, and collectively exhaustive aspects, properties, or characteristics of a class or specific subject." Ranganathan demonstrated that analysis, which is the process of breaking down subjects into their elemental concepts, and synthesis, the process of recombining those concepts into subject strings, could be applied to all subjects, and demonstrated that this process could be systematized. (Taylor pp 320-321; Foskett p 390). The phrase "analytico-synthetic classification" derives from these two processes: analysis and synthesis.  Amanda Maple, "FACETED ACCESS: A REVIEW OF THE LITERATURE" Working Group on Faceted Access to Music, Music Library Association Annual Meeting, 10 February 1995 http://www.musiclibraryassoc.org/BCC/BCC-Historical/BCC95/95WGFAM2.html 

faceted classification: One of the most powerful, yet least understood, methods of organizing information. Most folks, when thinking about organizing objects or information, immediately think of a hierarchical, or taxonomic, organization; a top-down structure, where you start with a number of broad categories that get ever more detailed, until you arrive at the object. In such structures, each object has a single home, and typically, one path to get there -- this is how things are organized in "the real world", where each item can only be in one place. Oftentimes, when thinking of organizing information, a hierarchy is where people begin (think Yahoo!).  Faceted classification, on the other hand, is a bottom-up scheme. Here, each object is tagged with a certain set of attributes and values (these are the facets), and the organization of these objects emerges from this classification, and how a user chooses to access them. ... Faceted classification allows for exploration directed by the user, where a large dataset is progressively filtered through the user's various choices, until arriving at a manageable set that meets the user's basic criteria. Instead of sifting through a pre-determined hierarchy, the items are organized on-the-fly, based on their inherent qualities. [Peter Merholz "Innovation in classification" Sept. 23, 2001] http://www.peterme.com/archives/00000063.html

The use of facets in information retrieval did not originate with Ranganathan. In the 18th century, a Frenchman named Condorcet devised what we would now call a faceted classification scheme for organizing information about objects or facts. (Whitrow) The Dewey Decimal Classification, first published in 1876, contained elements of facet analysis. Dewey recognized four facets common to all basic classes: bibliographic form, time, place, and general subjects (such as statistics or research) that at times are related to other subjects. (Foskett pp 176-7) Dewey provided for "number building" to combine two or more facets to express a complex subject. (Taylor p 320) The Universal Decimal Classification, based on the Dewey Decimal Classification and first published in 1905, was intended to be an international classification scheme. It also had elements of a faceted structure, and partly influenced Ranganathan's thinking. (Foskett p 349; Vickery pp 12- 14)  Amanda Maple, "FACETED ACCESS: A REVIEW OF THE LITERATURE" Working Group on Faceted Access to Music, Music Library Association Annual Meeting, 10 February 1995  http://www.musiclibraryassoc.org/BCC/BCC-Historical/BCC95/95WGFAM2.html 

faceted metadata: Composed of orthogonal [mutually independent] sets of categories. For example, in the domain of architectural images, some possible facets might be Materials (concrete, brick, wood, etc.), Styles (Baroque, Gothic, Ming, etc.), and so on. [Jennifer English et al. "Flexible search and navigation using faceted metadata" 2002] http://bailando.sims.berkeley.edu/papers/chi02_short_paper.pdf
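As a rough illustration of how faceted metadata supports filtering directed by the user, the Python sketch below tags a few hypothetical architectural images with the Materials and Styles facets named above and narrows the set one facet at a time. The image ids, the facet values assigned to them, and the function names are assumptions made for illustration; none of them come from the cited paper.

```python
from collections import Counter

# Hypothetical image records: each item carries one value per facet.
IMAGES = [
    {"id": "img1", "Materials": "brick", "Styles": "Gothic"},
    {"id": "img2", "Materials": "wood", "Styles": "Ming"},
    {"id": "img3", "Materials": "concrete", "Styles": "Baroque"},
    {"id": "img4", "Materials": "brick", "Styles": "Baroque"},
]


def filter_by_facets(items, **selected):
    """Keep only the items whose values match every selected facet."""
    return [
        item for item in items
        if all(item.get(facet) == value for facet, value in selected.items())
    ]


def facet_counts(items, facet):
    """Count how many items carry each value of a facet (useful for navigation)."""
    return Counter(item[facet] for item in items if facet in item)


# Progressive filtering: first by material, then by style.
brick = filter_by_facets(IMAGES, Materials="brick")
brick_baroque = filter_by_facets(brick, Styles="Baroque")

print([item["id"] for item in brick])
print([item["id"] for item in brick_baroque])
print(facet_counts(brick, "Styles"))
```

Because the facets are independent, the same result is reached whichever facet the user picks first; no fixed hierarchy is imposed on the items.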

Google = about 360 July 19, 2002; about 2,530 Oct. 22, 2004

fractal nature of the web: http://www.w3.org/DesignIssues/Fractal.html Tim Berners- Lee, Commentary on architecture, Fractal nature of the web, first draft  

Society has to be fractal - people want to be involved on a lot of different levels. The need for things that are local and special will create enclaves. And those will give us the diversity of ideas we need to survive. Tim Berners Lee, in "The father of the web", Evan Schwartz, Wired Mar. 1997 http://www.wired.com/wired/archive/5.03/ff_father_pr.html

GIS Geographic Information Systems: Maps have traditionally been used to explore the Earth and to exploit its resources. GIS technology is an expansion of cartographic science. Geographic information systems (GIS) technology can be used for scientific investigations, resource management, and development planning. It has enhanced the efficiency and analytic power of traditional mapping. GIS technology is becoming an essential tool in the effort to understand the process of global change.  [Is GIS in your future?  Boston Chapter, Special Libraries Association meeting, Mar. 12, 2002] http://www.sla.org/chapter/cbos/meetings/fy02/sci_tech.htm

Good Informatics Practices Guidance Document (GIP): A newly drafted comprehensive body of information of regulatory requirements in the form of existing (GLP, GMP, GCP and Part 11) and currently used standards compiled in one reference guide for an IT system of a life science or healthcare environment. http://www.lsit.org/initiatives/gip.php

GUI Graphical User Interface: Computers & computing glossary

granularity: <jargon, parallel> The size of the units of code under consideration in some context. The term generally refers to the level of detail at which code is considered, e.g. "You can specify the granularity for this profiling tool". The most common computing use is in parallelism where "fine grain parallelism" means individual tasks are relatively small in terms of code size and execution time, "coarse grain" is the opposite. You talk about the "granularity" of the parallelism. The smaller the granularity, the greater the potential for parallelism and hence speed- up but the greater the overheads of synchronisation and communication. [FOLDOC 1997] 
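The difference between fine and coarse grain can be sketched in a few lines of Python. This is only an illustration under simple assumptions: the function name, the twelve work items, and the grain sizes are all hypothetical, and no actual parallel execution is performed.

```python
def split_into_tasks(work_items, grain_size):
    """Split a list of work items into tasks of at most grain_size items."""
    if grain_size < 1:
        raise ValueError("grain_size must be at least 1")
    return [
        work_items[start:start + grain_size]
        for start in range(0, len(work_items), grain_size)
    ]


work = list(range(12))
fine = split_into_tasks(work, 1)    # fine grain: many small tasks
coarse = split_into_tasks(work, 6)  # coarse grain: few large tasks

print(len(fine), "fine-grain tasks")
print(len(coarse), "coarse-grain tasks")
```

More tasks mean more units that could run at the same time, but each task also has to be scheduled and its result collected, which is the overhead the definition refers to.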

The extent to which a system contains separate components (like granules). The more components in a system - or the greater the granularity - the more flexible it is. [Webopedia] http://www.webopedia.com/TERM/g/granularity.html

Choosing different levels of granularity, i.e., imposing different quality criteria on models built by homology from representative, experimentally determined [protein] structures, leads to different numbers of family representatives as targets. [NIGMS Structural Genomics Targets Workshop February 11-12, 1999] http://www.nigms.nih.gov/news/meetings/structural_genomics_targets.html

Concept of granularity, ISWorld Mailing List, Michael Chilton, 2001 http://www.isworld.org/isworldarchives/research.asp#  

Level of detail seems to be the essence of granularity.

Google = about  250,000 July 19, 2002; about 454,000 Oct. 22, 2004

health information data: Includes Clinical data captured during the process of diagnosis and treatment. Epidemiological databases that aggregate data about a population. Demographic data used to identify and communicate with and about an individual. Financial data derived from the care process or aggregated for an organization or population. Research data gathered as a part of care and used for research or gathered for specific research purposes in clinical trials. Reference data that interacts with the care of the individual or with the healthcare delivery systems, like a formulary, protocol, care plan, clinical alerts or reminders, etc. Coded data that is translated into a standard nomenclature or classification so that it may be aggregated, analyzed, and compared.  [Health Information Management; Professional definitions, Committees on Professional Development, American Health Information Management Association, 1999, 2000] http://www.ahima.org/infocenter/definitions/HIMprofessionaldefinition.htm

health information management:  Health information management improves the quality of healthcare by insuring that the best information is available to make any healthcare decision. Health information management professionals manage healthcare data and information resources. The profession encompasses services in planning, collecting, aggregating, analyzing, and disseminating individual patient and aggregate clinical data. It serves the healthcare industry including: patient care organizations, payers, research and policy agencies, and other healthcare- related industries.  [Health Information Management; Professional definitions, Committees on Professional Development, American Health Information Management Association, 1999, 2000] http://www.ahima.org/infocenter/definitions/HIMprofessionaldefinition.htm

Google = about 56,700  Jan. 2, 2003; about 145,000 Oct. 22, 2004

heavyweight ontologies: Heavyweight ontologies, by contrast [to lightweight], contain class hierarchies, constraints, and inference rules. It takes a long time and many resources to develop and maintain them and it is uncertain if there will be a benefit from this extra effort. Resource Description Framework (RDF) and Web Ontology Language (OWL) of the World-Wide Web Consortium (W3C) are technologies designed to model heavyweight ontologies. Topic Maps are Emerging: Why Should I Care?  H. Holger Rath,  http://www.idealliance.org/papers/dx_xmle04/papers/03-01-03/03-01-03.html 

Google = about 21 July 19, 2002; about 60 Oct. 22, 2004; about 70 May 2, 2005
heavyweight taxonomies, heavyweight taxonomy = 0 [except for this glossary]

heterogeneous data:

informatics: The study of the application of computer and statistical techniques to the management of information. In genome projects, informatics includes the development of methods to search databases quickly, to analyse DNA sequence information, and to predict protein sequence and structure from DNA sequence data. ORD Office of Rare Diseases, NIH glossary http://ord.aspensys.com/asp/resources/glossary_a-e.asp#A 

Narrower terms: bioinformatics; cheminformatics; Computers & computing glossary clinical informatics, molecular informatics,  Biomaterials matinformatics research informatics; Drug discovery & development life sciences informatics, Intellectual property & legal glossary;  patinformatics; Molecular imaging image informatics;  pharmacoinformatics, pharmainformatics Proteomics protein informatics 

information -- how much?  How Much Information 2003, School of Information Science and Systems, Univ. of California, Berkeley, 2003 http://www.sims.berkeley.edu/research/projects/how-much-info-2003/index.htm 

information architecture: "Involves the design of organization, labeling, navigation, and searching systems to help people find and manage information more successfully."  [Lou Rosenfeld, Peter Morville interview quoted in Mark Hurst "About Information Architecture", Apr. 3, 2000] http://www.goodexperience.com/columns/040300infoarch.html

Google = about 132,000 July 19, 2002; about 258,000 July 3, 2003; about 622,000 Oct. 22, 2004

Information architecture glossary, Kat Hagedorn, Argus Associates, 2000, 60 + definitions http://argus-acia.com/white_papers/iaglossary.html

information ecology: CSTB is contemplating a major initiative that would examine the rise of new forms of content, changes in media use patterns and their implications, changes in the supply of different kinds of content or media and their implications (e.g., for access, use, and the evolution of specific industries or institutions), and such ramifications as growing potential for manipulation of digital information, coping with data overload (data mining, visualization, and other data-intensive applications), and the internationalization of content production, ownership, and use. "Under Development" Computer Science and Telecommunications Board, US National Academies, http://www7.nationalacademies.org/cstb/projects_under_development.html

Google = about 11,100 Oct. 22, 2004

information extraction: Computers & computing glossary

information harvesting: See under Knowledge Discovery in Databases KDD

Google = about 871 July 19, 2002; about 1,230 July 3, 2003; about 1,730 Oct. 22, 2004; about 1,140,000 June 22, 2007

information integration: Our research group is developing intelligent techniques to enable rapid and efficient information integration. The focus of our research has been on the technologies required for constructing distributed, integrated applications from online sources. This research includes: Information Extraction: Machine learning techniques for extracting information from online sources; Source Modeling: Constructing a semantic model of wrapped sources so that they can be automatically integrated with other sources; Record Linkage: Learning how to align records across sources; Data Integration: Generating plans to automatically integrate data across sources; Plan Execution: Representing, defining, and efficiently executing integration plans in the Web environment; Constraint-based Integration: Interactive constraint-based planning and integration for the Web environment. Information Integration Research Group, Intelligent Systems Division, Information Sciences Institute (ISI), University of Southern California http://www.isi.edu/integration/
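Of the activities listed above, record linkage is perhaps the easiest to illustrate in code. The Python sketch below aligns records from two hypothetical sources by comparing normalized names. It uses a fixed, hand-written matching rule, not the learned alignment the ISI group describes, and the record fields, sample data, and function names are all assumptions made for illustration.

```python
def normalize(name):
    """Lowercase a name, replace punctuation with spaces, and sort its tokens."""
    cleaned = "".join(
        ch if ch.isalnum() or ch.isspace() else " " for ch in name.lower()
    )
    return " ".join(sorted(cleaned.split()))


def link_records(source_a, source_b, key="name"):
    """Pair records from two sources whose normalized key values are equal."""
    index = {}
    for record in source_b:
        index.setdefault(normalize(record[key]), []).append(record)
    pairs = []
    for record in source_a:
        for match in index.get(normalize(record[key]), []):
            pairs.append((record, match))
    return pairs


# Hypothetical records from two online sources that format names differently.
SOURCE_A = [
    {"name": "Smith, Jane", "dept": "Chemistry"},
    {"name": "Lee, Sam", "dept": "Biology"},
]
SOURCE_B = [
    {"name": "jane smith", "email": "jsmith@example.org"},
    {"name": "Pat Jones", "email": "pjones@example.org"},
]

for left, right in link_records(SOURCE_A, SOURCE_B):
    # Merge the two views of the same entity into one integrated record.
    print({**left, **right})
```

Exact matching on normalized strings misses misspellings and abbreviations, which is one reason research systems learn the alignment from examples instead.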

Google = about 4,430,000 July 3, 2003; about 1,080,000 June 22, 2007

information management:  Information services of various kinds are fundamental to the discovery, development and use of medicines. Within the pharmaceutical industry, often regarded as the epitome of the 'information intensive' industry, research information units provide both external and internal information provision and management to discovery and development programmes, while medical information units provide in- depth information on the company's products to external doctors, pharmacists, etc., and commercial information units handle information on competitors, marketing data, etc. Additionally, information personnel are involved in activities such as records management and archiving, regulatory affairs, data administration, IT support, and many more. Within the NHS [National Health Service, UK], Drug Information Pharmacists provide information services on effective use of medicines to all healthcare professions, and are also involved in database compilation, records management, current awareness etc. The move towards evidence- based medicine, with consequent need for evaluation and presentation of information, is of obvious importance to this group. Other sectors with a heavy reliance on the handling of pharmaceutical information and knowledge include publishing, database production, software services, and consultancy of varied kinds.  [MSc in Pharmaceutical Information Management, City Univ. London, UK, Dept of Information Science,  Introduction, 2002] http://www.soi.city.ac.uk/organisation/is/teaching/pim/

Narrower term: health information management

Google = about 1,470,000 Jan. 2, 2003; about 4,200,000 Oct. 22, 2004

information overload: Biomedicine is in the middle of revolutionary advances. Genome projects, microassay methods like DNA chips, advanced radiation sources for crystallography and other instrumentation, as well as new imaging methods, have exceeded all expectations, and in the process have generated a dramatic information overload that requires new resources for handling, analyzing and interpreting data. Delays in the exploitation of the discoveries will be costly in terms of health benefits for individuals and will adversely affect the economic edge of the country. [Opportunities in Molecular Biomedicine in the Era of Teraflop Computing: March 3 & 4, 1999, Rockville, MD, NIH Resource for Macromolecular Modeling and Bioinformatics Beckman Institute for Advanced Science and Technology, University of Illinois at Urbana- Champaign] http://www.ks.uiuc.edu/Publications/Reports/teraflop/node4.html

Many of today's problems stem from information overload and there is a desperate need for innovative software that can wade through the morass of information and present visually what we know.  The development of such tools will depend critically on further interactions between the computer scientists and the biologists so that the tools address the right questions, but are designed in a flexible and computationally efficient manner.  It is my hope that we will see these solutions published in the biological or computational literature.  Richard J. Roberts, The early days of bioinformatics publishing, Bioinformatics 16 (1): 2-4, 2000

"Information overload" is not an overstatement these days. One of the biggest challenges is to deal with the tidal wave of data, filter out extraneous noise and poor quality data, and assimilate and integrate information on a previously unimagined scale

Google = about  118,000 July 19, 2002; about 249,000 Oct. 22, 2004

Where's my stuff? Ways to help with information overload, Mary Chitty, SLA presentation June 10, 2002, Los Angeles CA

information retrieval: 

information theory: Algorithms & data analysis glossary

information visualization: The direct visualization of a representation of selected features or elements of complex multi- dimensional data. Data that can be used to create a visualization includes text, image data, sound, voice, video - and of course, all kinds of numerical data. Our visual analysis systems also provide the tools to interact with the data that has been visualized so that users can explore, discover and learn. Users do not look at static images, but can subset the data, run queries, do time sequence studies and create categories and correlations of data type. [Pacific Northwest National Lab, About Visualization at PNNL, 1999] http://www.pnl.gov/infoviz/

Google = about 28,100 July 19, 2002; about 94,200 Oct. 22, 2004

Information visualization resources on the web, 2002 http://graphics.stanford.edu/courses/cs348c-96-fall/resources.html

Related term: data visualization; Broader term: visualization

institutional repositories: A new strategy that allows universities to apply serious, systematic leverage to accelerate changes taking place in scholarship and scholarly communication, both moving beyond their historic relatively passive role of supporting established publishers in modernizing scholarly publishing through the licensing of digital content, and also scaling up beyond ad-hoc alliances, partnerships, and support arrangements with a few select faculty pioneers exploring more transformative new uses of the digital medium. Clifford Lynch, Institutional Repositories: Essential Infrastructure for Scholarship in the Digital Age, ARL Bimonthly Report 226, Feb. 2003 http://www.arl.org/newsltr/226/ir.html

DSpace, MIT http://www.dspace.org/

integrated taxonomy: We developed a comprehensive help taxonomy by combining both user interface and help system attributes, ranging from help access interface, presentation, and supporting knowledge structure, to implementation. The taxonomy systematically identifies independent axes along which help can be categorized which in turn encloses a space of help categories in which to place currently existing help research, and identifies distinct help software architectural features which contrast pros and cons in different approaches to implement help systems. The taxonomy projects a vision of what help can be like if it is on a par with advances in user interface technology, and desirable design features of help system architectures which are in the progressive direction along with the user interface software tools.  [Piyawadee "Noi" Sukaviriya, An Integrated Taxonomy of Online Help Based on User Interface View, GVU, Georgia Institute of Technology, GIT-GVU-91-20] http://www.cc.gatech.edu/gvu/reports/1991/abstracts/91-20.html

Google = about 85 July 19, 2002; about 353 Oct. 22, 2004

integrated view definitions:

Related terms: data mediation, knowledge based mediation

integration: Bioinformatics glossary

interoperability: The ability of two or more systems or components to exchange information and to use the information that has been exchanged. [Institute of Electrical and Electronics Engineers. IEEE Standard Computer Dictionary: A Compilation of IEEE Standard Computer Glossaries. New York, NY: 1990] http://www.sei.cmu.edu/str/indexes/glossary/interoperability.html

Enabling heterogeneous databases to function in an integrated way. Sometimes refers to cross-platform functionality and operability across relational, object-oriented, and non-standard types of databases.

Google = about 1,080,000 July 19, 2002; about 2,380,000 Oct. 22, 2004

Related terms: metadata, ontology, taxonomies ; Narrower terms: ontology interoperability,  semantic interoperability, software interoperability

invisible web:  For this study, we have avoided the term "invisible Web" because it is inaccurate. The only thing "invisible" about searchable databases is that they are not indexable nor able to be queried by conventional search engines. http://www.brightplanet.com/deepcontent/tutorials/DeepWeb/index.asp

Those parts of the web which are inaccessible to current search engines. A straightforward example was PubMed/Medline (until Google started indexing it). You still can't usually access proprietary (fee-based) databases such as Thomson Dialog or Lexis-Nexis except directly. Until recently PDF documents and PowerPoint slides were inaccessible to search engines.

Google = about 17,300 July 19, 2002; about 278,000 Oct. 22, 2004

Direct Search, Gary Price, George Washington Univ. US gary@freepint.com
Invisible Web: Database contents rarely found in Search Engines, Univ. of California- Berkeley, Spring 2001 http://www.lib.berkeley.edu/TeachingLib/Guides/Internet/InvisibleWeb.html

Related terms: deep web, semantic web

just in time information: 90,200 websites were found with this phrase by Google on May 23, 2007. An increasing need as we are deluged with information and data -- and still need time to reflect, discuss and think about what all these mean.

Google = about 2,900 March 14, 2002, about 3,400 July 19, 2002; about 51,600 Feb. 21, 2006; about 88,400 May 7, 2007

Just-In-Time Information Retrieval. Bradley J. Rhodes. Ph.D. Dissertation, MIT Media Lab, May 2000. Just in time retrieval agents Bradley J. Rhodes http://www.research.ibm.com/journal/sj/393/part2/rhodes.html

Related terms: information overload, remembrance agents; Bioinformatics modularity

Knowledge Discovery in Databases (KDD): Algorithms & data analysis glossary

knowledge integration:

Related terms: ontologies, semantics

knowledge management:  An organization's collective knowledge - and the ability to access it - comprises a key corporate asset. Smart organizations know that to maintain competitive advantage, they need to manage their data, information, and knowledge effectively and systematically. Knowledge management involves much more than compiling data and retrieving information. It should be seen as an overarching concept that combines a management philosophy with data warehousing, workflow strategies, database management, and knowledge distribution in a network computing environment. [William A. Woods "Knowledge Management Needs Effective Search Technology" Sun Journal] http://www.sun.com/dot-com/sunjournal/V2N1/03_feat2a.html

Google = about 826,000 July 19, 2002; about 3,520,000 Oct. 22, 2004

Knowledge Management, FDA, 2004 http://www.fda.gov/cdrh/strategic/km.html 

Virtual Library: Knowledge Management, May 2000   http://www.brint.com/km/ Definition, articles, white papers, interviews, business and technology library, periodicals and publications, “out of box thinking”, “movers and shakers”, “think tank”, calendar of events, emerging topics. 
Knowledge Management definitions, Charlie Matthews, VisualInterconnections, 2002 http://www.visualinterconnections.com/CEM/definitions.htm
KM Glossary, GOTCHA, Univ. of California Berkeley, 1999. About 50 terms. http://sims.berkeley.edu/courses/is213/s99/Projects/P9/web_site/glossary.htm

Related terms: ontologies, paraphrase problem, taxonomies

knowledge risk: Business of biopharmaceuticals glossary

laboratory informatics:  

The specialized application of information technology to maximize laboratory operations. Laboratory informatics encompasses data acquisition, data processing, laboratory information management system (LIMS), laboratory automation, scientific data management (including data analysis and long- term archiving), and electronic laboratory notebooks. Focus is on the application of this technology in analytical, production, and R&D laboratories.  Graduate Programs: Laboratory Informatics, Indiana Univ. School of Informatics, US  http://www.informatics.iupui.edu/Academics/graduate/laboratory_informatics/index.php

Related term: Drug discovery & development  LIMS

Laboratory Informatics Primer, Waters Corp http://www.waters.com/WatersDivision/ContentD.asp?watersit=EGOO-6M3TVN 

Google = about 1,250 Dec. 31, 2002; about 3,000 Oct. 22, 2004

lexical semantics: http://en.wikipedia.org/wiki/Lexical_semantics 

lexicon: A machine-readable dictionary that may contain a good deal of additional information about the properties of the words, notated in a form that parsers can utilize. [Bob Futrelle, A brief introduction to NLP, BIONLP.org, Computer Science, Northeastern Univ., US, 2002]  http://www.ccs.neu.edu/home/futrelle/bionlp/intro.html

A linguistics term (words and their definitions), an artificial intelligence term.  Sometimes a synonym for glossary or dictionary.

Google = about 768,000 July 19, 2002; about 1,960,000 Oct. 22, 2004

life sciences informatics: Informatics are essential at every step of genomics- based drug discovery and development. The commercial landscape of life sciences information technology has changed dramatically in the last few years. Bioinformatics, in particular, has gone through a dramatic boom/bust. While IT companies are looking to the drug discovery and development arena as a new market opportunity, pharmaceutical companies  are faced with rising pressure to reduce (or at least control) costs, and have a growing need for new informatics tools to help manage the influx of data from genomics, and turn that data into tomorrow's drugs. Key IT tools, such as high- performance computing, Web services, and grids, are being used to improve the speed and efficiency of drug discovery and development. True breakthroughs are still lacking, particularly in key areas such as gene prediction, data mining, protein structure modeling and prediction, and modeling of complex biological systems. However, most experts agree that IT and bioinformatics are essential to reaching the improved productivity the pharmaceutical industry craves.  

lightweight ontologies: Topic maps are seen as lightweight ontologies because they are able to model knowledge in a very ‘shallow’ way (e.g. just topics, their classes, occurrences, and associations, but no class hierarchies, constraints, or inference rules). Even ‘shallow’ topic maps are already very useful without having put large investments in their creation. Topic Maps are Emerging: Why Should I Care?  H. Holger Rath,  http://www.idealliance.org/papers/dx_xmle04/papers/03-01-03/03-01-03.html 

Google = about 154 July 19, 2002; about 287 Oct. 22, 2004; about 274 May 2, 2005

Compare: heavyweight ontologies 

lightweight taxonomies: Existing ontologies vary in a continuum from lightweight taxonomies (thesauri or conceptual vocabularies) to rigorous formalizations. [Manuela Viezzer, Ontologies and conceptual modeling, 2000-08-31] http://www.cs.bham.ac.uk/~mxv/publications/onto_engineering/node1.html

Google = about 5 July 19, 2002; about 4 Oct. 22, 2004

logic based ontologies: Very expressive, model is a set of theories, well defined semantics,  Automatic derived classification taxonomies, Concepts are defined and primitive. [Robert Stevens' slides, Univ. of Manchester, UK at Synopsis of the Bio- Ontologies Workshop at the EBI for MGED, Dec. 5, 2001] http://www.cbil.upenn.edu/Ontology/EBI_Bioontologies_Workshop.html Some powerpoints still on web.

Google = about 23 July 19, 2002; about 71 July 14, 2004

lower ontologies: See under middle ontologies

Google = "lower ontologies" about 62 "lower level ontologies" about 134 Aug. 8, 2002

machine-readable: See under metadata

Google = about 303,000 July 19, 2002; about 535,000 Oct. 22, 2004

machine-understandable: See under metadata

Google = about 3,730 July 19, 2002; about 8,950 July 14, 2004

markup languages: Computers & computing glossary 

Google = about 639,000 Aug. 9, 2002; about 170,000 Oct. 22, 2004

mash-up: http://en.wikipedia.org/wiki/Mashup_(web_application_hybrid)

Google = about 22,100,000 Oct. 27, 2006

Medbiquitous Consortium: Technology standards based on XML and web services.  http://www.medbiq.org/index.html 

medical informatics: The field of information science concerned with the analysis and dissemination of medical data through the application of computers to various aspects of health care and medicine. [MeSH, 1987] 

Medical informatics has to do with all aspects of understanding and promoting the effective organization, analysis, management, and use of information in health care. While the field of medical informatics shares the general scope of these interests with some other health care specialties and disciplines, medical informatics has developed its own areas of emphasis and approaches that have set it apart from other disciplines and specialties. For one, a common thread through medical informatics has been the emphasis on technology as an integral tool to help organize, analyze, manage, and use information. In addition, as professionals involved at the intersection of information and technology and health care, those in medical informatics have historically tended to be engaged in the research, development, and evaluation side of things, and in studying and teaching the theoretical and methodological underpinnings of data applications in health care. However, today medical informatics also counts among its profession many whose activities are focused on dimensions that include the administration and everyday collection and use of information in health care. What is Medical Informatics? History of Medical Informatics, AMIA American Medical Informatics Association http://www.amia.org/history/what.html 

medical Informatics: Consisting of required course work concerning computer applications in medicine, computer- assisted medical decision making, biomedical imaging, and bioinformatics. Mark Musen, Design and Use of Clinical Ontologies: Curricular Goals for the Education of Health Telematics Professionals, Stanford Medical Informatics, 1999 http://smi-web.stanford.edu/pubs/SMI_Reports/SMI-1999-0767.pdf

Google = about 163,000 July 19, 2002; about 479,000 Oct. 22, 2004, about 6,960,000 Oct. 3, 2005

metadata: Could elevate the status of the web from machine- readable to something we might call machine- understandable. Metadata is "data about data" or specifically in our current context "data describing web resources." The distinction between "data" and "metadata" is not an absolute one; it is a distinction created primarily by a particular application ("one application's metadata is another application's data"). [W3C, "Introduction to RDF Metadata" 1997] http://www.w3.org/TR/NOTE-rdf-simple-intro
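The remark that "one application's metadata is another application's data" can be made concrete with a small Python sketch. Everything in it is hypothetical: the URLs, the field names, and the catalog structure are assumptions chosen for illustration and do not follow RDF or any other W3C format.

```python
# Hypothetical web resources: "content" is the data, "metadata" describes it.
RESOURCES = [
    {
        "url": "http://example.org/a",
        "content": "Full text of page A",
        "metadata": {"title": "Page A", "creator": "Author One"},
    },
    {
        "url": "http://example.org/b",
        "content": "Full text of page B",
        "metadata": {"title": "Page B", "creator": "Author Two"},
    },
]


def build_catalog(resources):
    """Collect the metadata records; for the catalog they are ordinary data."""
    entries = [dict(r["metadata"], url=r["url"]) for r in resources]
    # The catalog in turn has metadata of its own, describing its entries.
    return {"entries": entries, "metadata": {"entry_count": len(entries)}}


def find_by_creator(catalog, creator):
    """Query the catalog's data (the resources' metadata) by creator."""
    return [e["url"] for e in catalog["entries"] if e["creator"] == creator]


catalog = build_catalog(RESOURCES)
print(find_by_creator(catalog, "Author Two"))
print(catalog["metadata"])
```

The catalog never reads the page content; it works only with descriptions of the pages, which is what allows a machine to select resources without processing them in full.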

Metadata is machine understandable information for the web. The W3C Metadata Activity addressed the combined needs of several groups for a common framework to express assertions about information on the Web, and was superseded by the W3C Semantic Web Activity.  [W3C, Metadata and Resource Description, W3C Technology and Society Domain, 2001] http://www.w3.org/Metadata/

Information about data that enables intelligent, efficient access and management of data. … metadata is always less than the data. [Robyne M. Sumpter “Whitepaper on Data Management” Lawrence Livermore National Laboratory, February 10, 1994] http://www.llnl.gov/liv_comp/metadata/papers/whitepaper-draft.html  more on metadata Ontologies glossary

Google = about  1,640,000 July 19, 2002; about 4,850,000 Oct. 22, 2004; about 25,600,000 May 9, 2005;  about 62,700,000 May 7, 2007

Narrower terms: Dublin Core Metadata Initiative,  faceted metadata Related terms: interoperability, RDF, semantic web 

micro-theories: An ontology about a specific domain, that fits within, and for the most part is consistent with, an ontology with a broader scope. For example, structural biology fits within the larger context of biology. Structural biology will have its own terminology and specific algorithms that apply within the specific domain, but may not be useful or identical to, for example, the genome community. [Lawrence Berkeley Lab "Advanced Computational Structural Genomics" Glossary]

Google = about 953 July 19, 2002; about 8,670 Oct. 22, 2004

modularity: Bioinformatics glossary

molecular informatics: The effective use of information derived from genomics and proteomics is of central importance and the ability to identify the most important data, to assess its accuracy and to be aware of any assumptions and limitations of hypotheses and predictive models is absolutely essential. Whereas the development of predictive models based on analogy has been very successful in chemistry and cheminformatics, the complex nature of biomolecular systems limits similar transference within bioinformatics. Without a critical analysis, in- silico discovery will be unable to be effectively integrated in the field of molecular informatics. The following themes will be covered: knowledge discovery and data mining, rational drug design, prediction of small molecule bioavailability (ADME Tox) properties, protein structure and function determination, new methods of drug- target modeling, cellular metabolism, and the use of high- throughput methods (biochips) for acquiring gene expression and protein binding information. [Beilstein- Institut, Molecular Informatics: Confronting Complexity, International Workshop May 13- 16 2002]  http://www.beilstein-institut.de/pdf_files/bozen_02_scientific_program.pdf

Unilever is investing over £13M to establish a new world-leading research group within the Department of Chemistry [Univ. of Cambridge, UK] in the emerging field of Molecular Informatics. ... New methods will be devised for creating, manipulating and storing molecular data to deepen our understanding of molecules and their properties and to allow novel in-silico experimentation. Inter-disciplinary research is a fundamental goal of the centre, integrating chemical, biological and materials sciences through molecular informatics. [Cambridge Univ. Chemical Laboratory, UK, 2000-2001] http://www-ucc.ch.cam.ac.uk/

Google = about 2,580 July 19, 2002; about 4,410 Oct. 22, 2004

molecular information theory: Algorithms & data analysis glossary

molecular taxonomy: Cancer genomics glossary

"molecular taxonomy" Google = about 1,650 July 19, 2002; about 5,260 Oct. 22, 2004
"molecular taxonomies" Google = about 11 July 19, 2002; about 106 Oct. 22, 2004

Broader term: taxonomy

nanopublishing: A term coined by Jeff Jarvis, head of content, technology, and strategic development for Advance. This is part of the Newhouse media group that owns Conde Nast, among other things. In the past, Jarvis started Entertainment Weekly. Now, he's a committed blogger and his company has put its money where his mouth is, that is, in Pyra, the company behind Blogger. [Jim McClellan, New biz on the blog, Guardian, Jan. 30, 2003] http://www.guardian.co.uk/online/story/0,3605,884658,00.html

National Center for Biomedical Ontology: http://www.bioontology.org/index.html 

natural language ontologies: Hand-crafted and flexible, but difficult to evolve, maintain and keep consistent, with weak semantics. Example: Gene Ontology. [Robert Stevens' slides, Univ. of Manchester, UK, at Synopsis of the Bio-Ontologies Workshop at the EBI for MGED, Dec. 5, 2001] http://www.cbil.upenn.edu/Ontology/EBI_Bioontologies_Workshop.html

Google = about 69 July 19, 2002; about 96 Oct. 22, 2004

Natural Language Processing NLP: <artificial intelligence> (NLP) Computer understanding, analysis,