Sequences – DNA &
beyond
Evolving terminology for emerging
technologies
Comments? Suggestions? Revisions? Mary Chitty MSLS mchitty@healthtech.com
Last revised
July 03, 2019
Biology & Chemistry term index:
Gene definitions, DNA
Proteins,
Protein Structures
and RNA are sub-categories linked to this glossary.
Related glossaries include: Genomics, Proteomics
Informatics Algorithms,
In silico & Molecular
Modeling,
Technologies Microarrays & protein chips,
Sequencing
Biology: Biomolecules, Expression, Glycosciences
3' [three prime] flanking region:
The region of DNA which borders the 3' end of a transcription unit and where a variety of regulatory sequences are located.
MeSH, 2002
3' UTR (three prime):
The sequence at the 3' end of messenger RNA
that does not code for product. This region contains transcription and
translation regulating sequences. MeSH, 1999
Region at the 3' end of a mature transcript
(following the stop codon) that is not translated into a protein. DDBJ/
EMBL/ GenBank Feature Table http://www.ebi.ac.uk/embl/Documentation/FT_definitions/feature_table.html
A term that identifies one end of a single- stranded nucleic acid molecule. The 3' end is that end of the molecule which terminates in a 3' hydroxyl group. The 3' direction is the direction toward the 3' end. Nucleic acid sequences are written with the 5' end to the left and the 3' end to the right, in reference to the direction of DNA synthesis during replication (from 5' to 3'), RNA synthesis during
transcription (from 5' to 3'), and the reading of mRNA sequence (from 5' to 3') during
translation.
Broader term: UTR Related terms: 5' (5-prime)
PCR primer extension.
5' (5-prime):
The sequence at the 5' end of the messenger RNA
that does not code for product. This sequence contains the ribosome binding site
and other transcription and translation regulating sequences. MeSH, 1999
A term that identifies one end of a single- stranded nucleic acid molecule. The 5' end is that end of the molecule which terminates in a 5' phosphate group. The 5' direction is the direction toward the 5' end. Nucleic acid sequences are written with the 5' end to the left and the 3' end to the right, in reference to the direction of DNA synthesis during replication (from 5' to 3'), RNA synthesis during
transcription (from 5' to 3'), and the reading of mRNA sequence (from 5' to 3') during
translation. [Mouse Genome
Informatics] Related term: 3' (3-prime).
5' Flanking Region: The region of DNA which borders the 5' end of a transcription unit and where a variety of regulatory sequences are located.
MeSH 2002
5' UTR (five prime):
Region at the 5' end of a mature transcript
(preceding the initiation codon) that is not translated into a protein. DDBJ/
EMBL/ GenBank Feature Table http://www.ebi.ac.uk/embl/Documentation/FT_definitions/feature_table.html
5' Untranslated Region: That portion of an mRNA from the 5' end to the position of the first codon used in translation. Related terms: 3' UTR, 3' (3-prime);
PCR primer extension. Broader term: UTR
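The UTR definitions above can be illustrated with a minimal sketch that cuts a mature mRNA into its 5' UTR, coding region and 3' UTR. It assumes, for illustration only, that translation starts at the first AUG and ends at the first in-frame stop codon; the function name split_mrna and the example sequence are hypothetical and not taken from any of the cited sources.

```python
STOP_CODONS = {"UAA", "UAG", "UGA"}

def split_mrna(mrna):
    """Split a mature mRNA (written 5' to 3') into (5' UTR, coding region, 3' UTR).

    Assumption: translation starts at the first AUG and stops at the first
    in-frame stop codon. Returns None if no such region is found.
    """
    start = mrna.find("AUG")
    if start == -1:
        return None
    # Walk codon by codon from the start codon, staying in its reading frame.
    for i in range(start, len(mrna) - 2, 3):
        if mrna[i:i + 3] in STOP_CODONS:
            end = i + 3  # the stop codon is kept with the coding region
            return mrna[:start], mrna[start:end], mrna[end:]
    return None

# Hypothetical example sequence.
five_utr, coding, three_utr = split_mrna("GGCACCAUGGCUUGAUUCAAA")
print(five_utr, coding, three_utr)
```

Real transcripts can use a later AUG or a non-AUG start, so the first-AUG rule here is a simplification.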
adenine (A): A nitrogenous base, one member of the base pair
AT (adenine/ thymine). DOE
amino acid sequence:
The order of amino acids as they occur
in a polypeptide chain. This is referred to as the primary structure of
proteins. It is of fundamental importance in determining protein conformation.
MeSH, 1966
ATCG: See adenine, base, base pair, thymine, cytosine, guanine
attenuator: In prokaryotes. 1) region of DNA at which regulation
of termination of transcription occurs, which controls the expression
of some bacterial operons; 2) sequence segment located between the
promoter and the first structural gene that causes partial termination
of transcription. DDBJ/ EMBL/ GenBank Feature Table http://www.ebi.ac.uk/embl/Documentation/FT_definitions/feature_table.html
base: Adenine, cytosine, guanine, thymine, and (only in RNA) uracil.
Related terms: base pair, nucleotide [DOE]
Called bases because they are alkaline (basic) in the acidic DNA structure.
Base and base pair used "fairly indiscriminately" by molecular
biologists [Bains]
base pair (bp): Two bases
which form a "rung of the DNA ladder." A DNA nucleotide is made of a molecule
of sugar, a molecule of phosphoric acid, and a molecule called a base.
The bases are the "letters" that spell out the genetic code. In DNA, the
code letters are A, T, G, and C, which stand for the chemicals adenine,
thymine, guanine, and cytosine, respectively. In base pairing,
adenine always pairs with thymine, and guanine always pairs with cytosine. [NHGRI] Narrower
terms: adenine, cytosine, guanine, thymine, uracil
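The pairing rule in this entry (A with T, G with C) can be sketched in a few lines. The names DNA_PAIRS, complement and reverse_complement are illustrative choices, not part of any cited source. Because sequences are written 5' to 3', the partner strand is obtained by complementing each base and then reversing the result.

```python
# Watson-Crick pairing for DNA as stated in the entry: A-T and G-C.
DNA_PAIRS = {"A": "T", "T": "A", "G": "C", "C": "G"}

def complement(strand):
    """Base-by-base pairing partner, read in the same left-to-right order."""
    return "".join(DNA_PAIRS[base] for base in strand)

def reverse_complement(strand):
    """Partner strand rewritten in the conventional 5' to 3' direction."""
    return complement(strand)[::-1]

print(complement("ATGC"))
print(reverse_complement("ATGC"))
```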
central dogma: Horace
Freeland Judson quotes Francis Crick talking about the central dogma "Nobody tried
to go from protein sequence back to nucleic acid, because that just wasn't on.
You see. But I don't think it was ever discussed. ... Jim, [Watson] you
might say, had it first. DNA makes RNA makes protein. That became then the
general idea. ... what are all the possible information flows?" [Judson
asked why he had called it the central dogma.] "It was because, I think, of
my curious religious upbringing. Because Jacques [Monod] has since told me that
a dogma is something which a true believer cannot doubt!" Crick laughed.
... "But that wasn't what was in my mind. My mind was, that a dogma was an
idea for which there was no reasonable evidence. You see?!" And Crick gave
a roar of delight. "I just didn't know what dogma meant. And I could just
as well have called it the 'Central Hypothesis' - you know. Which is
what I meant to say. Dogma was just a catch phrase. ... And it's a negative
hypothesis, so it's very very difficult to prove.... The central dogma is much
more powerful [than Crick's sequence hypothesis], and therefore in principle you
might have to say it could never be proved. But its utility - there was
no doubt about that. Because if you didn't believe that, you could invent
theories, unlimited theories, whereas if you just put in that one assumption,
... then, essentially you were on the right track you see." ... "In
looking back I am struck not only by the brashness which allowed us to venture
powerful statements of a very general nature, but also by the rather delicate
discrimination used in selecting what statements to make. Time has shown that
not everybody appreciated our restraint" HF Judson, Eighth Day of
Creation Cold Spring Harbor Laboratory Press 1996 pp. 333-334
Francis Crick "Central dogma of molecular biology" Nature 227 (5258): 561-563, Aug. 8,
1970 [historical article clarifying original explanation]
The Oxford
English Dictionary makes clear the duality of dogma,
particularly in the context of dogmatic, defined as "accepted as
true instead of being based upon experience, particularly if done in an
imperious, arrogant manner". Dogma is defined as
"systematised beliefs" (sometimes deprecating). Dogmatic physicians
are cited as "an ancient sect" which "endeavoured to discover by
reasoning the essence and occult causes" of disease. Related terms:
transcription, translation
central dogma exceptions:
Reverse transcription, prions,
retroviruses?
1. Reverse transcriptase and RNA genomes. DNA is not the only molecule of
heredity in nature and, as David Baltimore and Howard Temin showed, the flow of
information from DNA to RNA is not the only pathway possible. 2. Catalytic RNAs
(ribozymes). Proteins are not the only structures capable
of catalyzing a reaction. Tom Cech demonstrated the catalytic nature of certain
classes of introns (intervening sequences) that are able to
"self-splice." In addition Harry Noller has shown that the synthesis
of the peptide bond during protein synthesis is catalyzed by the 23S rRNA of the
ribosome. 3. Heritable proteins. Stanley Prusiner has given us the novel name
"prion" (proteinaceous infectious particle) to describe the agent
responsible for a number of slow, neurological infectious diseases, including
scrapie, bovine spongiform encephalopathy (mad cow disease) and Creutzfeldt-Jakob
disease. [Martinez Hewlett, Molecular Biology 411, Univ. of Arizona, Tucson US]
http://www.blc.arizona.edu/marty/411/Modules/mod4.html
cis-acting
sequences: The sequences just 5' of the start site of transcription are
the most important for the initiation of transcription. This is where the
transcription complex is built. In general, this region is called the promoter.
For eukaryotes, several sequences seem to be conserved among many genes. One
such sequence is the TATA box. The sequence is located about 30 bases
upstream (-30) from the transcription start site and is the one sequence
required for any significant transcription to occur. Other sequences aid in
transcription but are not always part of the promoter. The two most commonly found are the CCAAT
box (called the CAT box) and the GC box. Because mutants of these
three sequences only express mRNAs at low levels, these are considered the most
important sequences of the basic transcription complex. Phillip McClean,
"Control of gene expression in eukaryotes", North Dakota State Univ.
https://www.ndsu.edu/pubweb/~mcclean/plsc731/cis-trans/cis-trans6.htm
Compare trans-acting factors
Does not usually code for proteins. Compare trans-acting. Expression
glossary
cytosine (C):
A nitrogenous base, one member of the base pair
GC (guanine and cytosine). [DOE]
DNA - RNA - protein: See central dogma Related term: transposons
How are these two terms different?
ds:
Double-stranded (DNA or RNA).
downstream:
Identifies sequences proceeding farther in the
direction of expression; for example, the
coding region is downstream
from the initiation codon, toward the 3' end of an mRNA molecule.
Sometimes used to refer to a position within a protein sequence, in which case
downstream is toward the carboxyl end which is synthesized after the
amino
end during translation. [Lemon]
enhancer: A cis- acting sequence that increases the utilization
of (some) eukaryotic promoters, and can function in either orientation
and in any location (upstream or downstream) relative to the promoter.
Eukaryotes and eukaryotic viruses. [DDBJ/ EMBL/ GenBank Feature Table] http://www.ebi.ac.uk/embl/Documentation/FT_definitions/feature_table.html
At
the 5' and 3' end of the gene, enhancers are located, which respond to the
signals mediated by the proteins regulating the function of the gene. Enhancers
can also be located within the introns. The regulative effect of the enhancers
is either positive or negative. In the latter case they are often called silencers
[for reviews concerning enhancers and silencers, see for example 141, 142]. ...
In the cis-trans test, the E- g-/E+ g+ cis-heterozygote is phenotypically
wild, whereas the E- g+/E+ g- trans-heterozygote is phenotypically mutant. Thus the cis-trans
test gives a positive result. This means that we cannot on the basis of a
genetic test alone distinguish between an enhancer and the transcription unit
regulated by it; biochemical evidence is needed. Thus, by definition, the
regulatory elements of a transcription unit, such as enhancers, have to be
included in the gene itself. Petter Portin in "The Origin,
Development and Present Status of the Concept of the Gene: A Short Historical
Account of the Discoveries" Current Genomics, 2000
https://pdfs.semanticscholar.org/a61a/4e1a2c28e517d6e4ca9a43fd63bbb65379e4.pdf
enhancer elements (genetics):
Cis- acting DNA sequences which can increase transcription of genes. Enhancers can usually function in either orientation and at various distances from a promoter.
[MeSH, 1988] Related
term: promoter
genetic code: The sequence
of nucleotides, coded in triplets (codons) along the
mRNA,
that determines the sequence of amino acids in protein synthesis. The DNA
sequence of a gene can be used to predict the
mRNA sequence, and the genetic
code can in turn be used to predict the amino acid sequence. [DOE]
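As a rough illustration of the prediction step described above (mRNA sequence to amino acid sequence), the sketch below builds the standard genetic code table and reads an mRNA codon by codon. The one-letter amino acid codes, the "*" symbol for stop codons, and the names CODON_TABLE and translate are conventions chosen for this example. It assumes the sequence is already in frame, beginning at the start codon.

```python
from itertools import product

# Standard genetic code, listed with bases in the order U, C, A, G
# for the first, second and third codon position ("*" marks a stop codon).
BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

def translate(mrna):
    """Translate an in-frame mRNA into one-letter amino acids, stopping at a stop codon."""
    protein = []
    for i in range(0, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)

print(translate("AUGGCCUAA"))
```

Each codon has a fixed length of three bases, which is the property the "cipher" remark in this entry refers to.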
The notion of a “code” as the key to information transfer was not articulated publicly until late 1954, when
[George] Gamow, Martynas Ycas, and Alexander Rich published an article that defined the code idiom for the first time since Watson and Crick casually mentioned it in a 1953 article. Yet the concept of coding applied to genetic specificity was somewhat misleading, as translation between the 4
nucleic acid bases and the 20 amino acids would obey the rules of a cipher instead of a code. As Crick acknowledged years later, in linguistic analysis, ciphers generally operate on units of regular length (as in the triplet DNA scheme), whereas codes operate on units of variable length (e.g., words, phrases). But the code metaphor worked well, even though it was literally inaccurate, and in Crick’s words, “‘Genetic code’ sounds a lot more intriguing than ‘genetic cipher’.” Codes and the information transfer metaphor were
extraordinarily powerful, and heredity was often described as a biological form of electronic communication.
[Richard A. Pizzi "Genetic ciphering" Modern Drug Discovery 4
(3): 65- 66 Mar. 2001] http://pubs.acs.org/subscribe/journals/mdd/v04/i03/html/03timeline.html
Who wrote the book of life: A history of the genetic code. Lily
E. Kay, Stanford University Press, 2000. Related term: central dogma
genomic
sequence:
In April 2003, the
sequence of the human genome will be essentially complete. For the scientific
community now to make the best use of that fundamental information resource, the
identity and precise location of all sequence-based functional elements in the
genome must be determined. While many of the protein-coding genes are already
known, many others remain to be identified. Beyond open reading frames, non-
protein- coding genes, transcriptional regulatory elements and determinants of
chromosome structure and function remain largely unknown. A comprehensive
encyclopedia of all of these features is needed to utilize fully the sequence of
the human genome to understand human biology better, to predict potential
disease risks, and to stimulate the development of new therapies and other
interventions to prevent and treat disease.
The sequence- based functional elements
that will be targeted include, but are not limited to: Transcribed sequences,
including both protein- coding and non- protein- coding. A description of the
gene structure with transcriptional start sites, polyadenylation sites, along
with all alternative transcripts, is an example. Conserved non- coding sequences
that may represent functional elements. Cis- acting elements that regulate
transcription and/ or chromatin structure. These elements include promoters,
enhancers, and insulators. Sequence features that affect/ control chromosome
biology. Examples include origins of replication and hot spots for
recombination. Epigenetic changes, such as DNA methylation and chromatin
modifications. Workshop on the Comprehensive Extraction of Biological
Information from Genomic Sequence, Bethesda, Md. July 23-24, 2002, http://www.genome.gov/10005568 http://grants1.nih.gov/grants/guide/rfa-files/RFA-HG-03-003.html
guanine (G):
A nitrogenous base, one member of the base pair
GC (guanine and cytosine). [DOE]
human sequence: See Sequencing
draft sequence, finished sequence, published sequence, working draft
intein-mediated protein splicing:
Intein-mediated protein splicing has become an essential tool in modern biotechnology. Fundamental progress
in the structure and catalytic strategies of cis- and trans-splicing
inteins has led to the development of modified inteins that promote
efficient protein purification, ligation, modification and cyclization.
Recent work has extended these in vitro applications to the cell or to
whole organisms. We review recent advances in intein-mediated protein
expression and modification, post-translational processing and labeling,
protein regulation by conditional protein splicing, biosensors, and
expression of trans-genes. Topilina NI, Mills KV. Recent advances in in
vivo applications of intein-mediated protein splicing. Mob DNA.
2014;5(1):5. Published 2014 Feb 4. doi:10.1186/1759-8753-5-5
https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3922620/
inteins:
Wikipedia http://en.wikipedia.org/wiki/Intein
Internal protein sequences. Related terms: exteins, protein splicing.
interspersed repetitive sequences: Copies of transposable elements interspersed throughout the genome, some of which are still active and often referred to as
"jumping genes". There are two classes of interspersed repetitive elements. Class I elements (or RETROELEMENTS - such as retrotransposons, retroviruses, LONG INTERSPERSED NUCLEOTIDE ELEMENTS and SHORT INTERSPERSED NUCLEOTIDE ELEMENTS) transpose via reverse transcription of an RNA intermediate. Class II elements (or DNA TRANSPOSABLE ELEMENTS - such as transposons, Tn elements, insertion sequence elements and mobile gene cassettes of bacterial integrons) transpose directly from one site in the DNA to another.
MeSH, 1999 Narrower terms: LINEs, SINEs
LINEs Long Interspersed Nuclear Elements or Long INterspersed Elements:
Families of long (average length = 6,500 bp), moderately repetitive (about
10,000 copies) elements. LINEs are cDNA copies of functional genes present in the
same genome; also known as processed pseudogenes. FAO Glossary
Highly repeated sequences, 6K- 8K base pairs in length, which contain RNA polymerase II promoters. They also have an
open reading frame that is related to the reverse transcriptase of retroviruses but they do not contain
LTRs (long terminal repeats). Copies of the LINE 1 (L1) family form
about 15% of the human genome. The jockey elements of Drosophila are LINEs.
MeSH, 1999
Related terms: non-coding, retrotransposons.
LTR Long Terminal Repeat:
A sequence directly repeated at both ends of a
defined sequence, of the sort typically found in retroviruses. DDBJ/ EMBL/
GenBank Feature Table http://www.ebi.ac.uk/embl/Documentation/FT_definitions/feature_table.html
Broader term: terminal repeat sequences
locus control region:
A regulatory region first identified in the human
beta- globin locus but subsequently found in other loci. The region is believed to regulate
transcription by opening and remodeling chromatin structure. It may also have enhancer activity.
MeSH, 1998
Open Reading Frame
ORF:
In molecular genetics, an open reading frame (ORF) is the part of a reading
frame that has the ability to be translated. An ORF is a continuous
stretch of codons that contains a start codon (usually AUG) and a stop codon
(usually UAA, UAG or UGA). An
ATG codon within the ORF (not necessarily the first) may indicate where
translation starts. The transcription
termination site is located after the ORF, beyond the translation stop
codon. If transcription were to cease before the stop codon, an
incomplete protein would be made during translation. In
eukaryotic genes with multiple exons, ORFs span intron/exon regions, which
may be spliced together after transcription of the ORF to yield the final
mRNA for protein translation. Wikipedia accessed 2018 Nov 8
https://en.wikipedia.org/wiki/Open_reading_frame
'ORF' refers to a
stretch of DNA that could potentially be translated into a polypeptide or
RNA: i.e., it begins with an ATG "start" codon and terminates with one of
the 3 "stop" codons. For an ORF to be considered as a good candidate for
coding a bona fide cellular protein, a minimum size requirement has often
been set, e.g., during the yeast genome sequencing project an ORF was
defined as a stretch of DNA that would encode a protein of 100 amino acids
or more. An ORF is not usually considered equivalent to a gene or locus
until there has been shown to be a phenotype associated with a mutation in
the ORF, and/or an mRNA transcript or a gene product generated from the
ORF's DNA has been detected. See ORF naming conventions for how ORFs are
named in Saccharomyces cerevisiae. The usage of the term ORF within SGD
and typically by the Saccharomyces community is generally called a Coding
Sequence (CDS). SGD Glossary
https://sites.google.com/view/yeastgenome-help/sgd-general-help/glossary
Reading frames where successive nucleotide triplets can be read as codons specifying
amino acids
and where the sequence of these triplets is not interrupted by stop
codons. MeSH, 1991
Because they contain no internal stop codons, ORFs can be read continuously by the ribosome.
Broader term: reading frame, Narrower term: URF Related term:
Omes
& omics glossary ORFeome
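The ORF definitions above (an ATG start, an in-frame stop codon, and a minimum size such as the 100 amino acids used in the yeast genome project) can be sketched as a simple scan. This is an illustrative sketch, not the method of any cited database. The function name find_orfs and its min_codons parameter are hypothetical, only the given strand is scanned, and the first ATG after each stop is taken as the start.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}

def find_orfs(dna, min_codons=100):
    """Return (start, end) index pairs of ORFs in the three frames of one strand.

    An ORF here runs from an ATG to the next in-frame stop codon (stop included
    in the returned span). min_codons counts the codons before the stop, i.e.
    the length of the encoded protein.
    """
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(dna) - 2, 3):
            codon = dna[i:i + 3]
            if start is None and codon == "ATG":
                start = i
            elif start is not None and codon in STOP_CODONS:
                if (i - start) // 3 >= min_codons:
                    orfs.append((start, i + 3))
                start = None
    return orfs

# Toy sequence with a 3-codon ORF; the default threshold of 100 would reject it.
toy = "CCATGAAATTTTAGGG"
print(find_orfs(toy, min_codons=3))
```

A fuller scan would also examine the reverse-complement strand, giving six reading frames in total.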
operator regions (genetics):
Regulatory elements of an operon to which activators or repressors bind to effect the transcription of genes in the operon.
MeSH, 1986
primary (initial, unprocessed) transcript:
Includes 5' clipped
region (5' clip), 5' untranslated region (5' UTR), coding sequences (CDS,
exon), intervening sequences (intron), 3' untranslated region (3'
UTR),
and 3' clipped region (3' clip). DDBJ/ EMBL/ GenBank Feature Table http://www.ebi.ac.uk/embl/Documentation/FT_definitions/feature_table.html
promoter:
Region on a DNA molecule involved in
RNA polymerase
binding to initiate transcription. DDBJ/ EMBL/ GenBank Feature Table
http://www.ebi.ac.uk/embl/Documentation/FT_definitions/feature_table.html
Promoters are DNA sequences on the 5' side of the gene on which the RNA
polymerase fastens when transcription begins. In all groups of organisms alternative
promoters have been shown for many genes. These alternative promoters have
been classified into six classes by Ueli Schibler and Filipe Sierra [121] (Fig.
3). Certain types of alternative promoters make it possible for transcription
to start from different points of the gene in different cases, and for the
transcripts to have initiation codons at different positions of the chromosome.
Thus it is possible for a single gene in this case too to produce more than one
type of messenger RNA molecules, encoding more than one polypeptide. This is
again against the basic conceptual framework of the neoclassical view of the
gene. ... According to whether the unit of transcription is controlled by one or
several promoters, simple and complex transcription units are
distinguished. Petter Portin in "The Origin, Development and
Present Status of the Concept of the Gene: A Short Historical Account of the
Discoveries" Current Genomics, 2000
https://pdfs.semanticscholar.org/a61a/4e1a2c28e517d6e4ca9a43fd63bbb65379e4.pdf
Related terms: cis- acting, enhancer, promoter regions; Omes
& omics : promoterome
promoter regions: The DNA region, usually upstream to the
coding sequence of a gene or operon, which binds and directs
RNA polymerase
to the correct transcriptional start site and thus permits the initiation
of transcription. IUPAC Biotech
DNA sequences which are recognized (directly or indirectly) and bound
by a DNA- dependent RNA polymerase during the initiation of transcription.
Highly conserved sequences within the promoter include the Pribnow box
in bacteria and the TATA BOX in eukaryotes. MeSH, 1985 Related term: enhancer.
protein splicing: Excision of in- frame internal protein sequences (inteins) of a precursor protein, coupled with ligation of the flanking
sequences (exteins). Protein splicing is an autocatalytic reaction and
results in the production of two proteins from a single primary translation
product: the intein and the mature protein. MeSH, 1997
Protein splicing is defined as
the excision of an intervening protein sequence (the INTEIN) from a protein
precursor and the concomitant ligation of the flanking protein fragments
(the EXTEINS) to form a mature extein host protein and the free intein (Perler
1994). Protein splicing results in a native peptide bond between the ligated
exteins (Cooper
1993). Extein ligation differentiates protein splicing from other forms of
autoproteolysis. Conserved intein motifs differentiate inteins from other types
of in-frame sequences present in one homolog and absent in another homolog or
from other types of protein rearrangements. Please Note: The term 'Protein
Splicing' has been associated with inteins since 1994 (Perler
1994). Recent papers have described protein rearrangements that are not
intein-mediated. The mechanism of these rearrangements is currently unknown, but
preliminary evidence suggests that they are mediated by various cellular
enzymes. For clarity, we suggest calling these non-intein mediated events
either protein rearrangements or Protein Editing. InBase, The Intein Database
and Registry, hosted by Hideo
Iwai lab 2010 http://www.inteins.com/
Related terms: exteins, inteins
Narrower term: intein mediated protein splicing
reading frames: The sequence of codons by which
translation
may occur. A segment of mRNA 5' AUCCGA3' could be translated in three reading
frames, 5' AUC.. or 5' UCC.. or 5' CCG.., depending on the location of
the start codon. MeSH, 1991 Narrower term: ORF Open Reading Frames
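The MeSH example can be reproduced with a short sketch that lists the complete codons in each of the three frames of a sequence. The function name reading_frames is an illustrative choice. Trailing bases that do not fill a whole codon are dropped.

```python
def reading_frames(mrna):
    """Return the complete codons of the three forward reading frames."""
    return [
        [mrna[i:i + 3] for i in range(offset, len(mrna) - 2, 3)]
        for offset in range(3)
    ]

for offset, codons in enumerate(reading_frames("AUCCGA")):
    print(offset, codons)
```

For 5' AUCCGA 3' the frames begin with AUC, UCC and CCG respectively, as in the definition above.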
reference sequences:
The Reference Sequence (RefSeq) collection aims to provide a
comprehensive, integrated, non-redundant set of sequences, including genomic
DNA, transcript (RNA), and protein products, for major research organisms.
RefSeq standards serve
as the basis for medical, functional, and diversity studies; they provide a
stable reference for gene identification and characterization, mutation
analysis, expression studies, polymorphism discovery, and comparative analyses.
RefSeqs are used as a reagent for the functional annotation of some genome
sequencing projects, including those of human and mouse. NCBI Reference
Sequences database http://www.ncbi.nlm.nih.gov/RefSeq/
response elements: Nucleotide sequences, usually upstream, which are recognized by specific regulatory
transcription factors, thereby causing gene response to various regulatory agents. These elements may be found in both
promoter and enhancer regions.
MeSH, 1998
retroelements: Elements that are transcribed into RNA,
reverse- transcribed into DNA and then inserted into a new site in the genome.
Long terminal repeats (LTRs) similar to those from retroviruses are contained in retrotransposons and
retrovirus- like elements. Retroposons, such as LONG INTERSPERSED NUCLEOTIDE ELEMENTS and SHORT INTERSPERSED NUCLEOTIDE ELEMENTS do not contain
LTRs. MeSH, 1999
retrotransposon:
DNA
fragments copied from viral transcriptase
that insert in the host chromosomes ..Life Sciences
reverse transcriptases: Gene
amplification & PCR Related terms: non- coding, retrotransposons.
reverse transcription:
The synthesis of DNA from an RNA template, via reverse transcription, produces
complementary DNA (cDNA). Reverse transcriptases (RTs) use an RNA template and a
short primer complementary to the 3' end of the RNA to direct the synthesis of
the first strand cDNA, which can be used directly as a template for the
Polymerase Chain Reaction (PCR). This combination of reverse transcription and
PCR (RT-PCR) allows the detection of low abundance RNAs in a sample, and
production of the corresponding cDNA, thereby facilitating the cloning of low
copy genes. Alternatively, the first-strand cDNA can be made double-stranded
using DNA Polymerase I and DNA Ligase. These reaction products can be used for
direct cloning without amplification. New England Biolabs
https://www.neb.com/applications/cloning-and-synthetic-biology/dna-preparation/reverse-transcription-cdna-synthesis
Related terms reverse transcriptases; Gene
definitions cDNA
SINEs Short Interspersed Nuclear Elements or Short INterspersed Elements:
Families of short (150 to 300 bp),
moderately repetitive elements of eukaryotes, occurring about 100,000 times in a
genome. SINEs appear to be DNA copies of certain tRNA molecules, created
presumably by the unintended action of reverse transcriptase during
retroviral infection. FAO Glossary
Highly repeated sequences, 100- 300 bases long, which contain RNA polymerase III promoters. The primate Alu (ALU ELEMENTS) and the rodent B1
SINEs are derived from 7SL RNA, the RNA component of the signal recognition particle. Most other SINEs are derived from tRNAs including the MIRs
(mammalian- wide interspersed repeats).
MeSH, 1999
sequence:
The order of neighbouring amino acids in a protein
or the purine and pyrimidine bases [A,C,T,G, uracil] in RNA
and DNA. IUPAC Bioinorganic Narrower
terms: sequence data-
molecular; Proteins
amino acid sequence Related terms:
Sequencing
draft sequence - human, published sequence - human, working draft sequence -
human Glycosciences glossary carbohydrate
sequence
sequence data- molecular:
Descriptions of specific amino
acid, carbohydrate or nucleotide sequences which have appeared in the published
literature and/or are deposited in and maintained by databanks such as GenBank,
EMBL, NBRF or other sequence repositories [databases] MeSH, 1988
silencer elements
transcriptional:
Nucleic acid sequences that are involved in the negative
regulation of TRANSCRIPTION
by CHROMATIN SILENCING. MeSH 2003
splice sites: In 1993, Richard J. Roberts and Phillip Allen Sharp received
the Nobel Prize in Physiology or Medicine for
their discovery of "split genes". Using the model adenovirus in
their research, they were able to discover splicing—the fact that pre-mRNA is
processed into mRNA once introns were removed from the RNA segment. These two
scientists discovered the existence of splice sites, thereby changing the face
of genomics research. They also discovered that the splicing of the messenger
RNA can occur in different ways, opening up the possibility for a mutation to
occur. Wikipedia accessed 2018 Aug 26
https://en.wikipedia.org/wiki/Splice_site_mutation
Location in the DNA sequence where RNA removes the
noncoding areas to form a continuous gene transcript for translation into a
protein. DOE
splice junctions:
Junctions between exons and introns.
splice variants:
The HGNC [Human Genome Nomenclature
Committee] has no authority over protein nomenclature; however, we are
frequently asked how to designate splice variants so we suggest the following: Proteins
should be designated using the same symbol as the gene, printed in non-
italicized letters. When referring to splice variants, the symbol can be
followed by an underscore and the lower case letter
"v" then a consecutive number to denote which variant is which. Human Genome Nomenclature Committee "Guidelines for Human Gene
Nomenclature" Genomics
79(4):464-470 (2002) http://www.genenames.org/guidelines.html
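The suggested designation (gene symbol, underscore, lower case "v", consecutive number) can be written as a small formatting helper. This is only a sketch of the convention as quoted above. The function name and the symbol GENE1 are hypothetical, and numbering from 1 is an assumption.

```python
def splice_variant_symbol(gene_symbol, variant_number):
    """Format a splice variant designation as <symbol>_v<number>."""
    if variant_number < 1:
        raise ValueError("variant numbers are assumed to start at 1")
    return f"{gene_symbol}_v{variant_number}"

# GENE1 is a placeholder symbol used only for illustration.
print(splice_variant_symbol("GENE1", 2))
```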
splicing: 1. Of RNA: the procedure by which introns are removed
from eukaryotic precursor mRNA molecules and adjacent exon sequences are
joined together (spliced). 2. Of DNA: manipulation for joining together
double stranded DNA fragments with protruding single stranded "sticky
ends" by means of ligases. [IUPAC Biotech, IUPAC Compendium] Narrower terms:
cis-
splicing, protein splicing,
pre- mRNA splicing, RNA splicing, trans- splicing; Gene
Definitions alternative splicing, cDNA; Related terms Cell
biology spliceosomes
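RNA splicing in sense 1 above (introns removed, adjacent exons joined) can be sketched as string slicing. The sketch assumes the intron coordinates are already known and given as non-overlapping, half-open (start, end) index pairs. The function name splice and the toy sequence are illustrative only.

```python
def splice(pre_mrna, introns):
    """Remove the given introns and join the remaining exon sequences.

    introns: non-overlapping (start, end) index pairs, end exclusive.
    """
    exons = []
    position = 0
    for start, end in sorted(introns):
        exons.append(pre_mrna[position:start])  # exon preceding this intron
        position = end                          # resume after the intron
    exons.append(pre_mrna[position:])           # final exon
    return "".join(exons)

# Toy precursor: exon, intron, exon.
pre_mrna = "AUGGCC" + "GUAAGUUUUCAG" + "UUUUAA"
print(splice(pre_mrna, [(6, 18)]))
```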
start codon, stop codon: RNA
template: Gene amplification
& PCR
Template appears in many biological and biochemical contexts. Do meanings
vary?
terminal repeat sequences:
Nucleotide sequences repeated on both the 5' and 3' ends of a sequence under consideration. For example, the hallmarks of a transposon are that it is flanked by inverted repeats on each end and the inverted repeats are flanked by direct repeats. The Delta element of Ty retrotransposons and
LTRs (long terminal repeats) are examples of this concept.
MeSH, 1999
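One hallmark mentioned above, inverted repeats at the two ends of an element, can be checked with a short sketch. It assumes that "inverted repeat" means the last k bases are the reverse complement of the first k bases on the same strand. The function name and example sequences are hypothetical.

```python
DNA_PAIRS = {"A": "T", "T": "A", "G": "C", "C": "G"}

def has_inverted_terminal_repeats(seq, k):
    """True if the last k bases are the reverse complement of the first k bases."""
    if k < 1 or 2 * k > len(seq):
        raise ValueError("k must be at least 1 and at most half the sequence length")
    left = seq[:k]
    right = seq[-k:]
    right_reverse_complement = "".join(DNA_PAIRS[base] for base in reversed(right))
    return left == right_reverse_complement

# Ends GGCCAG ... CTGGCC are inverted repeats of each other.
print(has_inverted_terminal_repeats("GGCCAGTTTTTTCTGGCC", 6))
# Ends GGCCAG ... GGCCAG are direct repeats, not inverted ones.
print(has_inverted_terminal_repeats("GGCCAGTTTTTTGGCCAG", 6))
```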
terminator:
A sequence of DNA lying beyond the 3’ end of the
coding segment of a gene which is recognized by
RNA polymerase as a signal
to stop synthesizing mRNA. IUPAC Biotech
Sequence of DNA located at the end of the transcript that
causes RNA polymerase to terminate transcription [DDBJ/ EMBL/ GenBank
Feature Table] http://www.ebi.ac.uk/embl/Documentation/FT_definitions/feature_table.html
terminator regions (genetics): DNA sequences which signal the termination of
transcription. MeSH, 1991
transcript: Expression Related terms 3' UTR, 5' UTR, primary transcript,
terminator
trans-acting factors:
Trans- acting factors
functionally have two domains. One domain is required for the factor to bind to
DNA, and the second domain is required for the activation of transcription. This
was discovered by studying deletion mutants of the factors. Mutant factors were
found that could bind DNA but could not activate transcription. Other
experiments, in which a hybrid protein consisting of the non- DNA binding segment
of one trans- acting factor fused to the DNA- binding region of a second trans-
acting factor activated transcription, defined the second function of trans- acting
factors. Phil McClean "Control of gene expression in eukaryotes" North
Dakota State Univ.
https://www.ndsu.edu/pubweb/~mcclean/plsc731/cis-trans/cis-trans6.htm
Compare cis-acting factors
transcription: The process by which the genetic information encoded
in a linear sequence of nucleotides in one strand of DNA is copied into
an exactly complementary sequence of RNA. IUPAC Biotech
The synthesis of an RNA copy from a sequence of DNA (a gene); the first
step in gene expression. Compare translation (the process
in which the genetic code carried by mRNA directs the synthesis of proteins
from amino acids). [DOE]
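The "exactly complementary sequence" in the IUPAC definition can be shown with a short sketch. It assumes the template strand is written 5' to 3' (the usual convention for written sequences), so the RNA is its reverse complement with U in place of T; the function name `transcribe` is illustrative only, and promoters, terminators and RNA processing are ignored.

```python
# DNA template base -> complementary RNA base
DNA_TO_RNA_COMPLEMENT = str.maketrans("ACGT", "UGCA")


def transcribe(template_strand):
    """Return the RNA (5' to 3') complementary to a DNA template strand (5' to 3')."""
    # Complement each base, then reverse so the RNA also reads 5' to 3'.
    return template_strand.translate(DNA_TO_RNA_COMPLEMENT)[::-1]


# Coding strand 5'-ATGGCC-3' pairs with template strand 5'-GGCCAT-3'
print(transcribe("GGCCAT"))
```

The result matches the coding strand with U substituted for T, which is why the coding strand is often used as shorthand for the transcript.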
transcription, genetic:
The transfer of genetic information from DNA to messenger RNA by
DNA- directed RNA polymerase. It includes reverse transcription and transcription of early and late genes expressed early in an organism's life cycle or during later development.
MeSH, 1973 Related terms: translation, attenuator, reverse
transcriptases, transcription machinery; Narrower terms: Gene
amplification & PCR reverse transcription; Microarrays
Northern blotting
translation: The unidirectional process that takes place on the
ribosomes whereby the genetic information present in an mRNA is converted
into a corresponding sequence of amino acids in a protein. IUPAC Bioinorganic
The conversion of the genetic instructions for a protein from nucleotides
of messenger RNA into amino acids. NIGMS
translation, genetic: Formation of peptides on ribosomes, directed by messenger RNA.
MeSH, 1973
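A minimal sketch of translation as defined above: codons of an mRNA are read 5' to 3' and mapped to amino acids until a stop codon is reached. It assumes the standard genetic code, one-letter amino acid symbols, and that translation begins at the first AUG found; the function name `translate` is illustrative, and ribosome binding, tRNAs and alternative start sites are not modeled.

```python
from itertools import product

# Standard genetic code, codons ordered with bases U, C, A, G at each position;
# "*" marks the three stop codons (UAA, UAG, UGA).
BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): residue
    for codon, residue in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(mrna):
    """Translate from the first AUG to the first in-frame stop codon."""
    start = mrna.find("AUG")
    if start == -1:
        return ""  # no start codon found
    peptide = []
    for i in range(start, len(mrna) - 2, 3):
        residue = CODON_TABLE[mrna[i:i + 3]]
        if residue == "*":
            break  # stop codon: release the peptide
        peptide.append(residue)
    return "".join(peptide)


print(translate("GGAUGGCCUUUUAAGG"))
```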
transposons:
A mobile genetic element that can replicate itself
and insert itself into the genome, where it can interrupt genes and disrupt
their function; an insertional mutagen.
One of a class of genes that are capable of moving spontaneously from
one chromosome to another, or from one position to another in the same
chromosome; also known as jumping genes or transposable elements.
[Glick]
DNA elements carrying genes for transposition and other genetic functions.
In many cases the latter genes enable bacteria to live in extreme environments.
Transposons are much longer than IS (Insertion) elements. Abbreviated Tn.
Schlindwein
First recognized in the 1940s by Dr. Barbara McClintock in studies of peculiar
inheritance patterns found in the colors of Indian corn. Also known as
“jumping DNA”, referring to the fact that some stretches of DNA are unstable and
“transposable”, i.e. they can move around – on and between chromosomes. Related term: DNA transposable elements
How are these two terms
different?
URF:
Unidentified Reading Frame
UTR:
The parts of the messenger RNA sequence that do not code
for product, i.e. the 5' UNTRANSLATED REGIONS and 3' UNTRANSLATED REGIONS.
MeSH, 1999
UnTranslated Region:
Critical for many aspects of gene regulation
and expression. Narrower terms 3' UTR, 5' UTR.
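The relationship between the two UTRs and the coding sequence can be sketched as a simple partition of a mature mRNA. This assumes the coding sequence coordinates are known (0-based, end-exclusive) and that the stop codon is counted as part of the coding sequence, so the 3' UTR begins immediately after it; the function name `split_utrs` and the example sequence are hypothetical.

```python
def split_utrs(mrna, cds_start, cds_end):
    """Split an mRNA (5' to 3') into (5' UTR, coding sequence, 3' UTR)."""
    if not 0 <= cds_start <= cds_end <= len(mrna):
        raise ValueError("coding sequence coordinates must lie within the mRNA")
    five_prime_utr = mrna[:cds_start]    # everything before the start codon
    coding = mrna[cds_start:cds_end]     # start codon through stop codon
    three_prime_utr = mrna[cds_end:]     # everything after the stop codon
    return five_prime_utr, coding, three_prime_utr


print(split_utrs("GGAUGGCCUUUUAAGG", 2, 14))
```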
upstream: Identifies sequences located in a direction opposite to
that of expression; for example, the bacterial promoter is upstream of
the initiation codon. In an mRNA molecule, upstream means toward
the 5' end of the molecule. Occasionally used to refer to a region of a
polypeptide chain which is located toward the amino terminus of the molecule.
Lemon
Sequences DNA resources
DDBJ/ EMBL/ GenBank
Feature Table, 2017
http://www.ebi.ac.uk/embl/Documentation/FT_definitions/feature_table.html
Ensembl Glossary
https://www.ensembl.org/info/website/glossary.html
Mouse Genome Informatics Glossary, Jackson Lab,
US, 2006
http://www.informatics.jax.org/mgihome/other/glossary.shtml
How to look for other unfamiliar terms
IUPAC definitions are reprinted with the permission of the International
Union of Pure and Applied Chemistry.