Analysis of genomes reveals the tremendous impact of retronuons. This is apparent not only from the 42% of the human genome that consists of discernible retronuons but also from their propensity to generate new genes, new coding domains of genes, and regulatory elements, including those that can modulate spatial and temporal expression patterns. This recruitment of novel domains and generation of alternate expression patterns is a major driving force of evolution.

We see examples of vertebrate

  1. regulatory elements or parts of coding regions generated by retroelements (Table 1)
  2. regulatory elements or parts of coding regions generated by retronuons (Table 2)
  3. genes (protein and RNA encoding) generated by retronuons (Table 3)
  4. genes probably generated by retronuons - as evidenced by the lack of introns, while a probable corresponding paralogue does contain introns (Table 4)
  5. intronless vertebrate genes - no further evidence of retronuon origin (Table 5)
  6. intronless vertebrate genes likely of retronuon origin - no proven activity (Table 6)
  7. intron-containing vertebrate genes featuring large exons - probably of retrosequence origin (Table 7)

Most likely, the tables are incomplete. We'd appreciate your input. Please forward omissions (preferably with a reprint of the publication in PDF format) to

An efficient way to get future work included in this compilation is to cite one of our papers on the subject. I can then trace the respective paper through the Citation Index.

The article that may be cited for intronless genes generated by retroposition (Tables 3-7) is: Brosius, J. (1991) Retroposons - seeds of evolution. Science 251, 753

The article that may be cited for control elements generated by SINEs or any other retronuon (Tables 1, 2) is: Brosius, J., Gould, S.J. (1992) On Genomenclature: A comprehensive (and respectful) taxonomy for pseudogenes and other 'junk DNA'. Proc. Natl. Acad. Sci. U.S.A. 89, 10706-10710

Table 1.
Vertebrate regulatory elements or parts of coding regions generated by retroelements*
retronuon | gene under its influence | species | ancestor of source nuon | serves as | references
LTR cDNA 7, cDNAg human THE-1 polyadenylation signal Paulson et al. (1987)
LTR sex-limited protein (slp) mouse 5' LTR of C-type retrovirus (imp1) promoter Stavenhagen and Robins (1988);
Robins and Samuelson (1992);
Ramakrishnan and Robins (1997)
LTR oncomodulin rat (but not mouse) IAP promoter and first exon Banville and Boie (1989)
LTR MIPP mouse IAP promoter Chang-Yeh et al. (1991)
LTR AF-3 human RTVL-H promoter Feuchter et al. (1992)
LTR AF-4 (CDC4L homology) human RTVL-H promoter Feuchter et al. (1992)
LTR PLT human RTVL-H polyadenylation signal Goodchild et al. (1992)
LTR cH-6 human RTVL-H polyadenylation signal Mager (1989)
LTR cH-7 human RTVL-H polyadenylation signal Mager (1989)
LTR cPB-3 human RTVL-H polyadenylation signal Mager (1989)
LTR PLA2L (phospholipase A2 homology) human RTVL-H promoter Feuchter-Murthy et al. (1993)
LTR calbindin D28K human RTVL-H promoter Liu and Abraham (1991)
LTR ZNF80 zinc finger gene human ERV9 promoter Di Cristofano et al. (1995)
LTR growth factor pleiotrophin (PTN) human HERV-E (RTVL-1) trophoblast-specific promoter Schulte et al. (1996, 1998)
LTR leptin receptor (OBRa) human HERV-K alternative splicing, inclusion of 67 LTR-derived aa into C-terminus of OBRa protein Kapitonov and Jurka (1999)
LTR-IS A1 mouse MuRRS polyadenylation signal Baumruker et al. (1988)
LTR-IS A3 mouse MuRRS polyadenylation signal Baumruker et al. (1988)
LTR aromatase chicken retrovirus promoter and 5' exon Matsumine et al. (1991)
CR1 lysozyme chicken retrovirus transcriptional silencer Baniahmad et al. (1987)
L1 thymidylate synthase mouse LINE polyadenylation signal Harendza and Johnson (1990)
L1 insulin I gene rat LINE transcriptional silencer Laimins et al. (1986)
LINE αs1-casein E goat LINE mRNA stability Pérez et al. (1994)
L1 apolipoprotein(a) human LINE transcriptional enhancer Yang et al. (1998)
L1 proteasome activator PA28β (PMSE2b) mouse LINE promoter Zaiß and Kloetzel (1999)
HERV-E salivary amylase gene human retrovirus promoter Samuelson et al. (1990);
Emi et al. (1988);
Ting et al. (1992)
HRES-1 transaldolase human retrovirus part of the coding sequence Banki et al. (1994)
The-1, IAP immunoglobulin heavy chain human retrotransposon protein sequence building blocks Hakim et al. (1994)
LTR leptin human MER11 placental enhancer Bi et al. (1997)
ALF annexin VI, interleukin-4, protein kinase C-β human LINE-2 potent T-cell-specific silencer Donnelly et al. (1999)
Bov-B LINE Bucentaur (bbcnt) ruminantia LINE large part of protein coding sequence Takahashi et al. (1998)
LTR HHLA2 human HERV-H polyadenylation signal Mager et al. (1999)
LTR HHLA3 human HERV-H polyadenylation signal Mager et al. (1999)
LTR apolipoprotein CI human HERV-E promoter Medstrand et al. (2001)
LTR endothelin B receptor (EBR) human HERV-E promoter Medstrand et al. (2001)

* updated version from: Brosius, J. (1999) RNAs from all categories generate retrosequences that may be exapted as novel genes or regulatory elements. Gene 238, 115-134

For a definition of the differences among retronuons e.g. between retroelements [this table] and retrosequences [Tables 2-7] see text (above article) under section 2.

Not all examples are proven exaptations. Especially events that date back not much more than a few million years could only be potential exaptations (potaptations according to Brosius and Gould, 1992, 1993).

NGF-inducible cAMP-extinguishable retrovirus-like (NICER) elements have been described (Cho et al., 1990). No association with a gene under their control has been reported.

Table 2.
Vertebrate regulatory elements or parts of coding regions generated by retronuons*
retronuon | gene/retronuon that is under its influence | species | ancestor of source nuon | serves as | references
B2 MHC class I genes mouse tRNALys polyadenylation signal Kress et al. (1984)
B2 B2+mRNAx mouse tRNALys polyadenylation signal Ryskov et al. (1984)
B2 glutathione S-transferase mouse tRNALys polyadenylation signal Rothkopf et al. (1986)
B2 various rodents tRNALys mRNA stability Clemens (1987)
B2 fourth component of complement (C4) in H-2k haplotype mouse tRNALys located in intron 10; reduces expression rate to 1/10 of non-H-2k mice Zheng et al. (1992)
B2 muscle γ-phosphorylase kinase mouse tRNALys polyadenylation signal Maichele et al. (1993)
B2 MOK-2 zinc-finger protein mouse tRNALys exerts a negative cis-acting effect on MOK-2 promoter activity Arranz et al. (1994)
B2 leukemia inhibitory factor receptor (LIFR) mouse tRNALys generating new splice variant that leads to a soluble form of LIFR Owczarek et al. (1996); Michel et al. (1997)
B2 laminin α3-chain (Lama3) mouse tRNALys RNA polymerase II promoter Ferrigno et al. (2001)
C repeats MHC rabbit tRNAGly polyadenylation signal Rebiere et al. (1987) 
Krane and Hardison (1990)
C repeats major apoprotein of pulmonary surfactant rabbit tRNAGly polyadenylation signal Boggaram et al. (1988) 
Krane and Hardison (1990)
C repeats cytochrome P-450 isozyme 4 rabbit tRNAGly polyadenylation signal Okino et al. (1985) 
Krane and Hardison (1990)
CHR-1 repeats EP3B and EP3C prostaglandin E2 receptors bovine tRNAGlu protein domain Shimamura et al. (1998)
ID   rat tRNAAla via neural BC1 RNA enhancer McKinnon et al. (1986)
ID pIL2 rat tRNAAla via neural BC1 RNA mRNA stability Glaichenhaus and Cuzin (1987)
B1 pIL2, pIL8 mouse SRP RNA mRNA stability Vidal et al. (1993)
B1 immunoglobulin κ light chain mouse SRP RNA negative regulation of transcription Saksela and Baltimore (1993)
Alu haptoglobin related gene human SRP RNA transcriptional enhancer Oliviero and Monaci (1988)
Alu θ1 globin higher primates SRP RNA CCAAT box of promoter Kim et al. (1989)
Alu ε-globin human SRP RNA transcriptional modulation Wu et al. (1990)
Alu 7.02 bidirectional promoter monkey SRP RNA transcriptional reducer Saffer and Thurston (1989)
Alu c-myc human SRP RNA transcriptional modulation Tomilin et al. (1990)
Alu adenosine deaminase human SRP RNA transcriptional enhancer Aronow et al. (1992)
Alu proliferating cell nuclear antigen (PCNA) human SRP RNA transcriptional silencer Sell et al. (1992)
Alu mitochondrial hinge protein human SRP RNA transcriptional enhancer Liu and Bradner (1993)
Alu SV40 origin human SRP RNA transcriptional enhancer Saegusa et al. (1993)
Alu FcεRI-γ human SRP RNA transcriptional regulation (positive and negative) Brini et al. (1993)
Alu keratin 18 (human) mouse (transgenic) SRP RNA transcriptional insulation; Alus provide retinoic acid receptor binding sites Thorey et al. (1993) 
Neznanov and Oshima (1993) 
Vansant and Reynolds (1995)
Alu CD8α human SRP RNA transcriptional enhancer (located in last intron) Hambor et al. (1993)
Alu α-3 acetylcholine receptor subunit human SRP RNA alternative splicing Mihovilovic et al. (1993)
Alu in several protein coding regions primates SRP RNA generating new splice variants, potentially contributing new protein domains reviewed in Makalowski et al. (1994)
Alu interferon receptor, IFNRIR-2 human SRP RNA alt. splicing, part of protein cod. region Mullersman and Pfeffer (1995)
Alu-J double-stranded RNA-specific editase (RED1/ADAR2) human SRP RNA 40 Alu-derived aa are added via alternative splicing; protein product has enzymatic activity Gerber et al. (1997)
Alu-J cathepsin B human SRP RNA alt. splicing of exon 2 in 5' UTR Berquin et al. (1997)
Alu β1C-2 integrin subunit human SRP RNA alt. splicing, part of protein cod. region Svineng et al. (1998)
Alu DNA (cytosine-5) methyltransferase (CpG MTase) higher primates SRP RNA alt. splicing, part of protein cod. region Hsu et al. (1999)
Alu 7.8 kb RNA human SRP RNA induction of expression of a ST receptor in trans Almenoff et al. (1994)
Alu Wilms' tumor gene (WT1) human SRP RNA intronic transcriptional silencer Hewitt et al. (1995)
Alu BRCA-1 gene, ERF-3 human SRP RNA estrogen-dependent transcriptional enhancers Norris et al. (1995)
Alu parathyroid hormone gene human SRP RNA negative calcium response element McHaffie and Ralston (1995)
Alu poly(ADP-ribosyl) transferase (ADPRT) gene human SRP RNA transcription regulation Schweiger et al. (1995) 
Oei et al. (1997)
Alu potentially many genes human SRP RNA transcriptional modulation via binding of YY1 protein Humphrey et al. (1996)
Alu myeloperoxidase gene promoter human SRP RNA composite SP1-thyroid hormone-retinoic acid response element Piedrafita et al. (1996)
Alu α3 nicotinic receptor subunit human SRP RNA transcription modulation Fornasari et al. (1997)
RRE1 erythropoietin recept. prom. mouse ? transcription inhibitor Youssoufian and Lodish (1993)
highly repet. element1 c-Ha-ras human ? blocks transcriptional readthrough Lowndes et al. (1990)
MIR nicotinic acetylcholine receptor α subunit human tRNA generating new splice variant, contributes to protein coding region Murnane and Morales (1995)
MIR β-tubulin human tRNA polyadenylation signal Murnane and Morales (1995)
MIR follitropin receptor sheep tRNA polyadenylation signal Murnane and Morales (1995)
MIR clone c-zrg02 human tRNA polyadenylation signal Murnane and Morales (1995)
MIR clone NIB1273 human tRNA polyadenylation signal Murnane and Morales (1995)
γ-actin salivary amylase gene human mRNA promoter Samuelson et al. (1990; 1996) 
Emi et al. (1988)

* updated version from: Brosius, J. (1999) RNAs from all categories generate retrosequences that may be exapted as novel genes or regulatory elements. Gene 238, 115-134

For a definition of the differences between retroelements [Table 1] and retrosequences, see the text of the above reference under section 2.

Not all examples are proven exaptations. Especially events that date back not much more than a few million years could only be potential exaptations (potaptations according to Brosius and Gould, 1992, 1993).

1 Resemblance to a known repetitive element not yet established.

Table 3.
Vertebrate genes generated by retronuons*
pattern of expression | species | source gene; pattern of expression; (# of introns); chromosome | template for reverse transcription | hallmarks of retrosequences (intron loss; A-stretch at 3' end of founder RNA; direct repeats) | references
insulin I; Langerhans islets; (1) murids insulin II; Langerhans islets; (2) part. proc. hnRNA (+) + (+) Soares et al. (1985)
phosphoglycerate kinase (Pgk-2); 
mammals Pgk-1; constitutive; (10); 
mature mRNA + (+) (+) McCarrey and Thomas (1987) 
Boer et al. (1987) 
Adra et al. (1988)
Zfa; testes; chr 10 mouse Zfx; ubiquitous; (310); X-linked mature mRNA + (+) + Ashworth et al. (1990)
pyruvate dehydrogenase (Pdha2); 
Pdha1; constitutive; (10); 
mature mRNA + (+) (+) Dahl et al. (1990) 
Fitzgerald et al. (1992)
N-myc2; brain and liver tumours1 Sciuridae rodents, e.g. woodchucks N-myc1; in development and various adult tissues; (2) mature mRNA + (+) + Fourel et al. (1990, 1992) 
Sugiyama et al. (1989, 1999) 
Robertson et al. (1991) 
Quignon et al. (1996)
NB-1 or CLP; epithelial tissue; chr 10 human calmodulin CaMIII; ubiquitous; (5) mature mRNA +   (+) Yaswen et al. (1992) 
Rhyner et al. (1992) 
Berchtold et al. (1993)
carcinoma associated antigen, GA733-1 human GA733-2; placenta, carcinoma; (8) mRNA +     Linnenbach et al. (1989, 1993)
glutamate dehydrogenase (GLUD2); 
retina, testes, brain; X-linked
human GLUD1; ubiquitous; (13); chr 10 mature mRNA + + + Shashidharan et al. (1994)
S-adenosylmethionine decarboxylase (AMD2); liver and other tiss.; chr 12 mouse (AMD1); ubiquitous; (9) mature mRNA + + + Persson et al. (1995, 1999); Nishimura et al. (1998)
glucose-6-phosphate dehydrogenase (G6PD-2); testes mouse G6PD-1; constitutive (10); 
mature mRNA + + + Hendriksen et al. (1997)
hypoxanthine phosphoribosyltransferase, HPRT-2; liver kangaroo HPRT-1; ubiquitous; (8); 
X chr
mature mRNA +   (+) Noyce et al. (1997) 
Noyce and Piper (1994)
poly(A) binding protein 2 (Pabp2); spermatogenic cells mouse Pabp1; spermatogenic and somatic cells (several) mature mRNA +     Kleene et al. (1998)
Cγ catalytic subunit of cAMP-dependent protein kinase2; testes; chr 9 catarrhini primates Cα catalytic subunit of cAMP-dependent protein kinase; ubiquitous; (~9); chrom. 19 mature mRNA + (+) (+) Reinton et al. (1998)
H430 encoding a splicing factor; pancreas, spleen, prostate etc.; chr 11 human PR264/SC35; thymus, spleen, kidney, lung etc.; (2); chr 17 mature mRNA + (+) + Soret et al. (1998)
CDY genes (at least one family member); testes; Y chr Anthropoidea CDYL; ubiquitous; (9); chr 13 (human) mature mRNA +     Lahn and Page (1999)
proteasome activator PA28, β-subunit (PMSE2b); constitutive; chr 14 mouse PMSE2, gamma interferon inducible; (10); chr 11 mature mRNA + + + Zaiß and Kloetzel (1999)
centrin, Cetn1; testes; chr 18 mammals Cetn2; neonatal testes, oviduct; (4); X chr mature mRNA +   + Hart et al. (1999)
XAP-5-like (X5L)3; chr 6 human 
XAP-5; (12); X-linked mature mRNA       Sedlacek et al. (1999)
PMCHL1 catarrhini primates melanin-concentrating hormone (MCH) gene unspliced antisense hnRNA Courseaux and Nahon (2001)
BC1 RNA; neurons; chr 7 (mouse) rodents tRNAAla; ubiquitous non-mRNA n.a. + (+) DeChiara and Brosius (1987) 
Martignetti and Brosius (1993a)
BC200 RNA; neurons; 
chr 2 (human)
Anthropoidea free Alu monomer non-mRNA n.a. + (+) Martignetti and Brosius (1993b)

* updated version from: Brosius, J. (1999) RNAs from all categories generate retrosequences that may be exapted as novel genes or regulatory elements. Gene 238, 115-134

Not all examples are proven exaptations. Especially events that date back not much more than a few million years could only be potential exaptations (potaptations according to Brosius and Gould, 1992, 1993).

1 See also s-myc, sm-myc in rodents and mycL2 in primates; role in apoptosis.

2 Protein product not confirmed yet.

3 Intron in 5' UT.

Table 4.
Vertebrate genes probably generated by retronuons*
pattern of expression | species | presumable source gene; pattern of expression; (# of introns); chromosome | template for reverse transcription | age of retrogene | references
replication-dependent histone genes1 various 'replacement' variant histone genes mRNA metazoans and plants reviewed in: Kedes et al. (1979); 
Hentschel and Birnstiel (1981)
G protein α subunit, Gi class, Gnaz mammals Gnai; (8) part. proc. hnRNA   Wilkie et al. (1992)
G-protein coupled receptors2 various G-protein coupled receptors mRNA   reviewed in: 
Gentles and Karlin (1999); 
Brosius (1999)
potassium channels various potassium channel mRNA   reviewed in: Strong et al. (1993)
class III POU domain proteins 
e.g. SCIP (or Tst-1, Oct-6); 
or: Brn-3b; brain; X chr; 
early development and brain
various POU domain transcription factor; (multiple) 
Brn-3a or 3c; brain; chr 14 and chr 18
mRNA   Kuhn et al. (1991); 
Hara et al. (1992); 
Theil et al. (1994); 
Alvarez-Bolado et al. (1995); 
Atanasoski et al. (1995); 
Levavasseur et al. (1998)
forkhead transcription factors, e.g. brain factor-2 (HBF2)3; fetal brain; chr 14; MFH-1 various brain factor-1 (HBF1); fetal brain; (1) chr 14 mRNA   Wiese et al. (1995);
Ernstsson et al. (1996);
Frank and Zoll (1998);
Miura et al. (1997)
inducible heat shock genes4 various constitutive heat shock genes mRNA ancient Hunt and Morimoto (1985);
Mues et al. (1986);
Zakeri et al. (1988);
Milner and Campbell (1990);
Lim and Brenner (1999)
genes encoded by herpesviruses various var. cellular intron-containing genes mRNA   reviewed by Brunovskis and Kung (1996) 
Martin (1999)
protamines vertebrates protamines     States et al. (1992);
Jankowski et al. (1986);
Moir and Dixon (1988);
Oliva and Dixon (1989);
Retief et al. (1993);
Schlüter and Engel (1995)
non-histone chromosomal protein HMG-1 mammals HMG-2; (4) mRNA ? mamm. radiation Stros and Dixon (1993); Stros et al. (1995)
glycerol kinase (GyK)5; testes; chr 4 human 
GyK; constitutive; (18); 
mRNA   Sargent et al. (1994b) 
Pan et al. (1999)
antioxidant protein 2 related seq. (Aop2-rs1 and Aop2-rs26); var. 
tissues; chr 2 and chr 4, respectively
mouse antioxidant protein 2 (Aop2); 
var. include. heart, liver, kidney; (4); chr 1
mRNA   Phelan et al. (1998)
DNA ligase IV (LIG4) mouse   mRNA   Barnes et al. (1998)
actin-like-7A and actin-like-7B (ACTL7A, ACTL7B); chr. 9 mammals actin or an actin-related protein (ARP) mRNA   Chadwick et al. (1999)
sterol 12α-hydroxylase (CYP8B1); 3p21.3 (Hsa), 9qF4 (Mmu) human CYP8A1 (or CYP7A1/7B1) mRNA ? mamm. radiation Gåfvels et al. (1999)
metalloproteinase-disintegrins (ADAM20, ADAM21) human intron-containing family members mRNA   Poindexter et al. (1999)
αCP-1 RNA binding protein mammals αCP-2; (? 12) mRNA   Makeyev et al. (1999)
germ cell-specific actin capping protein α (Gsg3 clone); chr 6 rodents somatic cell type actin capping protein α (ACPα) mRNA   Yoshimura et al. (1999)
1-Cys peroxiredoxin, 1-Cys Prx (CP-2 and CP-5) mouse CP-3; (4)     Lee et al. (1999)
Makorin RING and C3H zinc-finger protein mammals Makorin ring finger protein 1 gene (MKRN1) mRNA Gray et al. (2000)

* Intron loss in comparison to an intron-containing paralogue candidate is the only (remaining) hallmark. The decision whether a sequence belongs in this table or in Table 3 is arbitrary in some cases. Likewise, inclusion in Table 4 versus Tables 5 or 6 is somewhat arbitrary. In situations where only one intron is present in the putative founder gene, it may have been acquired in the founder (e.g. protamine genes). Clearly, not all examples can be proven exaptations (potaptations according to Brosius and Gould, 1992, 1993).

1 We cannot rule out that the ancestral histone gene was intronless and some histone genes acquired introns in the "intron late" scenario.

2 Many genes encoding G-protein coupled receptors that lack introns in their coding regions feature an intron in the 5' UT; presumably generated by acquisition of splice sites (Brosius and Gould, 1992).

3 Intronless HBF-2 is clustered with HBF-1 (one intron in coding region) on chromosome 14q11-13 (Wiese et al., 1995).

4 A gene for a 71 kDa heat shock protein that contains 8 introns has been described in the human genome (Dworniczak and Mirault, 1987).

5 In addition to the split gene there are at least six additional loci in humans; two are pseudogenes (Xq and chr. 1), two are active retrogenes (both chr. 4; protein product not confirmed yet); the status of the remaining two genes needs to be established.

6 Aop2-rs2 potentially encodes only a truncated polypeptide of 114 aa.

Table 5.
Intronless* vertebrate genes (no further evidence of retronuon origin)
gene | species | pattern of expression | references
interferons vertebrates   reviewed in: Nagata et al. (1980);
Lawn et al. (1981);
Watkins et al. (1991); 
Roberts et al. (1998)
ribonucleases1 various   Carsana et al. (1988);
Hamann et al. (1990);
Samuelson et al. (1991);
Sasso et al. (1991);
Tiffany et al. (1996)
mos mammals   Watson et al. (1982);
Newman and Dai (1996)
insulinoma associated, IA-1 (zinc finger); chr 20 human neuroendocrine tumours Lan et al. (1994);
Li et al. (1997)
transcription elongation factor SII or A (TCEA) human   Park et al. (1994);
DiMarco et al. (1996)
modifier of Na+-D-glucose co-transport (hRS1) human   Lambotte et al. (1996)
profilaggrin mouse, rat   Haydock and Dale (1986)
thrombomodulin mammals RA inducible Jackman et al. (1987);
Niforas et al. (1993)
HS, HGT-C2, HGT-B2, BIIIB4, HGT-F, cKer1 keratins; 
keratin-associated protein, Krtap12-1; KAP6; 
B2E and B2F (high sulfur protein genes, in hair follicles); 
keratin-associated proteins pmg-1 and pmg-2
vertebrates hair, skin Powell and Rogers (1986); 
Kuczek and Rogers (1987); 
Frenkel et al. (1989); 
Whitbread et al. (1991); 
Fratini et al. (1993); 
Mitsui et al. (1998); 
Cole and Reeves (1998); 
Kuhn et al. (1999)
blood platelet membrane glycoprotein Ibα, glycoprotein V (GPV); glycoproteins Ibβ, IX2 mammals platelets Wenger et al. (1988); Lanza et al. (1993); Ravanat et al. (1997); Yagi et al. (1995)
olfactory marker protein (OMP) rat   Danciger et al. (1989)
melanin-concentrating hormone fish   Takayama et al. (1989)
cerebellar degeneration-related antigen, CDR343 human   Chen et al. (1990)
leukosialin CD43 mammals   Cyster et al. (1990); 
Shelley et al. (1990)
nuclear pore glycoprotein p62 rat   D'Onofrio et al. (1991)
N-acetyltransferases Nat1 and Nat21 vertebrates   Grant et al. (1989); 
Blum et al. (1990a-c); 
Martell et al. (1991)
centromere protein, CENP-B mammals   Sullivan and Glass (1991); 
Bejarano and Valdivia (1996)
JUN protooncogene vertebrates   Hattori et al. (1988); 
Hartl et al. (1991)
factor VIII-associated gene (F8A)4 mammals ubiquitous Levinson et al. (1992)
A-kinase anchor protein, AKAP 75 bovine   Hirsch et al. (1992)
LAP, C/EBPα, β, δ; CRP2 or NF-IL6β
CAAT/enhancer-binding proteins; basic region-leucine zipper class (bZIP)
human   Landschulz et al. (1988); 
Akira et al. (1990); 
Chang et al. (1990); 
Descombes et al. (1990); 
Cao et al. (1991); 
Williams et al. (1991); 
Kinoshita et al. (1992)
cytochrome b5 rabbit   Takematsu et al. (1992)
gap junction genes connexin 31.1 and 30.3; chr 4 mouse skin Hennemann et al. (1992)
Na+-MI cotransporter (SMIT/SLC5A3) human kidney and other tissues Berry et al. (1995); 
Porcellati et al. (1999)
myeloid zinc finger gene (MZF-1) human bone marrow Hui et al. (1995)
U2 auxiliary factor binding protein related sequence U2AFBPL5 chr 5 / U2afbp-rs (imprinted in mouse); chr 11 human   Pearsall et al. (1996)
acetyltransferases AT1 and AT2 rat various Land et al. (1996)
choriolysin H (HCE) teleost fish   Yasumasu et al. (1996)
Pw1 zinc-finger protein     Relaix et al. (1996)
defensin (HNP-1) human   Takemura et al. (1996)
Rho/Rac-like RhoG GTPase6 (ARHG) human   Le Gallic and Fort (1997)
ventral prostate protein C7orf1 human   Peacock et al. (1997)
antiproliferative proteins Tob, ANA mammals ubiquitous Yoshida et al. (1997, 1998); 
Guéhenneux et al. (1987)
serine-threonine kinase genes Tsk1, Tsk2 mouse   Galili et al. (1997)
glycosylphosphatidylinositol synthesis gene PIGC     Hong et al. (1997)
growth arrest-specific C16orf3 human   Whitmore et al. (1998)
sex determining gene SRY 
and SOX-3
mammals   Foster and Graves (1994); 
Tucker and Lundrigan (1995); 
O'Neill et al. (1998)
2',5'-oligoadenylate-dependent RNAse (interferon inducible) human   Tnani and Bayard (1998)
necdin7 mammals neurons Uetsuki et al. (1996); Jay et al. (1997)
Nakada et al. (1998)
chondroitin 6-sulfotransferase (C6ST); chr 11 human   Mazany et al. (1998)
testes-specific protein Y-encoded-like, (human TSPYL chr 6; rodent Tspyl chr 10) mammals ubiquitous Vogel et al. (1998)
citrate synthase (CS); chr 12 human   Goldenthal et al. (1998)
CXorf1 human hippocampus Redolfi et al. (1998)
cholesterol 25-hydroxylase human, mouse   Lund et al. (1998)
prion protein8 mammals, chicken Lee et al. (1998)
insulin receptor substrate 4 (IRS-4)9 mouse   Fantin et al. (1999)
110 kDa high molecular wt. basic nuclear protein (HMrBNP) flounder sperm Watson and Davies (1999)
cded/lior mouse   Mishra et al. (1999)
Rab-like protein (Rlp-2) human   Peng et al. (1999)
slow-kinetics immediate early gene Ier5 mouse   Williams et al. (1999)
transport modifier RS1 rabbit   Reinhardt et al. (1999)
sperizin, RING zinc-finger protein mouse haploid sperm cells Fujii et al. (1999)
ZNF127 RING zinc-finger protein human   Jong et al. (1999)
malaria-inducible gene mouse spleen Krücken et al. (1999)
KRML (MAFB); chr 20 human hematopoietic tissue Wang et al. (1999)
α-endosulfine (ENSA); chr 14 human   Heron et al. (1999)
MAGEL210 human, mouse brain, placenta Boccaccio et al. (1999)
MAGE superfamily of genes including necdin (see above) mammals various Stone et al. (2001)

* May have intron(s) upstream from coding region; a single intron in the coding region may also have been generated subsequent to retroposition (e.g. the monocyte-specific Dif-2 gene (Pietzsch et al., 1998) or the acidic 80 kDa protein kinase C substrate/MARCKS (Erusalimsky et al., 1991; Blackshear et al., 1992)).

1 Several genes contain single introns in 5' UT.

2 No intron in entire ORF, but 5' UT.

3 Coding region contains a tandem hexapeptide repetitive structure; could have been exapted from a non-coding repeat region.

4 One of the human genes may be located in intron of factor VIII gene.

5 One of the human genes located on the X chromosome contains introns (Kitagawa et al. 1995).

6 Contains a large exon in the 5' UT.

7 Mouse and sheep have two exons, humans one, in the 5' UTR.

8 Located on chr 15q11-13 (Prader-Willi Syndrome, PWS, region); maternally expressed.

9 Related gene IRS-3 contains one intron in the coding region. IRS-1 and IRS-2 are thought to be intronless as well. However, Vassen et al. (1999) described an intron at the C-terminus of the IRS-2 ORF.

10 Located on chr 15q11-13 (Prader-Willi Syndrome, PWS, region); paternally expressed.

Table 6.
Intronless vertebrate genes likely of retronuon origin - no proven activity of gene product (this does not exclude transcriptional or even translational activity)*
pattern of expression | species | source gene; pattern of expression; (# of introns); chromosome | age of retrogene | references
non-muscle tropomyosin (hTMNM-1) human     MacLeod et al. (1983)
metallothionein (MT-1ψb)1 rat     Andersen et al. (1986)
sarcomeric actin a2 frog actin   Stutz and Spohr (1987)
glutamine synthetase (GSr) 
glutamine synthetase (ψGS)2
glutamine synthetase (GSi) 
glutamine synthetase
  Bhandari et al. (1991) 
Chakrabarti et al. (1995)
heat stable antigen (2 ORFs) mouse heat stable antigen   Wenger et al. (1991)
adenylate kinase 3 (AK3)3; chr 17 human AK3; chr 9   Xu et al. (1992)
ferritin L subunit Lg mouse ferritin L subunit; (3)   Renaudie et al. (1992)
processed CD-MPR gene4; chr 3 mouse cation-dependent mannose 6-phosphate receptor (CD-MPR); (7); chr 6   Ludwig et al. (1992)
Id2B5 human helix-loop-helix protein Id2   Kurabayashi et al. (1993)
casein kinase IIα human casein kinase IIα   Devilat and Carvallo (1993)
ψEFIA#16 bovine CCAAT transcription factor subunit EF1A   Ozer et al. (1993)
ψ5HT1D7 human serotonin receptor 5HT1D   Bard et al. (1995)
protein kinase C (ψPKCζ8) rat PKCζ   Andrea and Walsh (1995)
FAU1P9; chr 18 human FAU1   Kas et al. (1995)
dbpB pseudogene10 human DNA binding protein dbpB   Kudo et al. (1995)
mif rp-1 mouse macrophage migration inhibitory factor (MIF)   Bozza et al. (1995)
ferritin H subunit pseudogene human ferritin H subunit   Zheng et al. (1995, 1997)
prothymosin α intronless mammals prothymosin α   Varghese and Kronenberg (1991); Manrow et al. (1992); Rubtsov and Vartapetyan (1995)
laminin receptor (37LRP/p40), intronless human laminin receptor (37LRP/p40)   Jackers et al. (1996)
LAMRL511 human 67-kDa laminin receptor (LAMR1)   Richardson et al. (1998)
ubiquitin-conjugating enzyme UBE2L1; chr 14 human ubiquitin-conjugating enzyme UBE2L3; (1); chr 22   Moynihan et al. (1996)
ψAdh-212 mouse class III alcohol dehydrogenase (Adh-2); (8)   Foglio and Duester (1996)
MSSP-1 (transcriptional enhancer of c-myc) human MSSP-2; (15)   Haigermoser et al. (1996)
olfactory receptor pseudogenes human olfactory receptor genes   Crowe et al. (1996)
Hp53int113 human     Reisman et al. (1996)
phosphoglycerate mutase brain isoform pseudogene (ψPGAM1)14 human phosphoglycerate mutase brain isoform (PGAM1)   Dierick et al. (1997)
α tubulin-related sequence; chr 1115 human keratinocyte α tubulin   Devon et al. (1997)
r.pem2 homeobox gene16; epididymis; X-linked rat r.Pem, orphan homeobox gene; testes, ovary, placenta, epididymis; (5); chr 4   Nhim et al. (1997)
leukocyte antigen C1pg-2617 dog leukocyte antigen DLA class I; (7)   Burnett et al. (1997)
Tdgf1-ps1; chr 16 
TDGF3; X chr
teratocarcinoma-derived growth factor-1; Tdgf1; (5); chr 9 
TDGF1; chr 3
  Liguori et al. (1996, 1997) 
Dono et al. (1991)
(FABP3-ps); chr 13  human fatty acid binding protein FABP3, chr 1   Prinsen et al. (1997) 
serotonin-7 receptor (5-HT7ψ)18 human 5-HT7   Quian et al. (1998)
mannose-binding protein-A; chr 10 human     Guo et al. (1998)
ψFGFR-3 (partial, antisense); fetal development; chr 1 mouse fibroblast growth factor receptor (FGFR-3); chr 5   Weil et al. (1997)
ψ ribosomal protein L7 (antisense) human ribosomal prot. L7   Hohlbaum et al. (1998)
Supt4h2; chr 10 mouse Supt4h; (4); chr 11   Chiang et al. (1998)
ubiquitin conjugating E2 enzyme ubc9-psi1 and ubc9-psi2 mouse ubc-9   Tsytsykova et al. (1998)
SMT3A and 3 SMT3B proc. pseudogenes mouse ubiquitin-like proteins   Chen et al. (1998)
ψPTEN19; chr 9 human PTEN/MMAC1/TEP1 phosphatase; chr 10   Dahia et al. (1998)
EIF4E2 translational initiation factor human EIF4E1; (6) recent Gao et al. (1998)
EIF2γA; testes; chr 12 human euk. translation initiation factor eIF-2γ (EIF2γX); (x); X-linked   Ehrmann et al. (1998)
ψhGABPα20 human ets related GABPα   Luo et al. (1999)
proto-oncogene hPTTG2 human hPTTG1   Prezant et al. (1999)
CDC42-like; chr 4 human CDC42; chr 1   Nicole et al. (1999)
spondyloepiphyseal dysplasia tarda gene (SEDLP)5; many tissues; chr 19 human spondyloepiphyseal dysplasia tarda gene (SEDL or GPM6B); many tissues; (3); X-linked recent Gedeon et al. (1999)
CK2α; chr 11 human CK2α; (12); chr 11   Wirkner and Pyerin (1999)

*There are probably numerous additional retrogenes whose ORFs are not severely compromised or could yield a truncated polypeptide, partially in a different reading frame (e.g. Chen et al., 1982; Varshney and Gedamu, 1984; Dudov and Perry, 1984; Nojima et al., 1987; Kuzumaki et al., 1987; Srikantha et al., 1987; Seelan and Padmanaban, 1988; Nielsen and Trachsel, 1988; Kawaichi et al., 1992; Jun et al., 1997; Palmer et al., 1998). However, transcription and/or translation are not documented.

1 This retrosequence is transcribed. Due to an insertion after codon 28 the ORF is shifted, yielding a different hypothetical C-terminus of an additional 35 amino acids instead of 33 aa in the correct MT-1 frame.

2 This retrosequence is transcribed. The ORF is truncated but retains ~2/3 of the coding sequence; probably no protein product.

3 Embedded in intron 10 of NF1 gene (located on human chr 17).

4 This retrosequence is transcribed. The ORF is truncated after 141 codons (out of 278 possible) in murine CD-MPR. A soluble truncated form of CD-MPR encoding only the N-terminal extracytoplasmic region including codon 154 was functional in ligand binding and acid-dependent dissociation (Marron-Terada et al., 1998).

5 This retrosequence is transcribed. Stop codon at aa 37, however.

6 This retrosequence is transcribed. The ORF is truncated; probably no protein product.

7 This retrosequence hypothetically encodes a 140 aa polypeptide most of which (aa 31-140) are similar to bovine EFIA (324 aa total).

8 This retrosequence is transcribed specifically in the brain. The ORF is truncated and no protein product could be identified by Western blots.

9 This retrosequence is not transcribed but contains an intact ORF.

10 One of 16 pseudogenes contains an intact ORF.

11 Potentially active retrogenes may also exist in the mouse (Bignon et al., 1991).

12 25 point mutations relative to Adh-2 cDNA, nevertheless ORF is intact, but no evidence for transcription, thus far.

13 Located in the 10 kb first intron of p53 tumour suppressor gene; no or short ORF.

14 Located in intron 1 of Menkes disease gene (ATP7A, MNK).

15 The ORF is truncated but retains 80% of the coding sequence.

16 Although a processed mRNA was the founder of this retrogene, it acquired new splice sites that remove three premature stop codons yielding again an open reading frame - protein product not confirmed yet.

17 Contains single ORF of 332 codons but no potential start codon in the N-terminal 2/3 of ORF; not likely to be functional.

18 Transcribed but not translatable.

19 ORF intact, hypothetical polypeptide somewhat smaller due to loss of first start codon; no evidence for transcription as of yet.

20 This retrogene is transcribed in human myeloid cells, but a mutation at the site that corresponds to the ATG start methionine codon may prevent its translation.

Table 7.
Intron-containing vertebrate genes featuring large exons (probably of retronuon origin)
gene | species | pattern of expression | references
developmentally regulated type X collagen chicken   Ninomiya et al. (1986)
C1r and C1s complement human   Tosi et al. (1989)
follicle-stimulating hormone receptor (FSHR); LH, TSH receptors human   Gromoll et al. (1996); Misrahi et al. (1996)
islet homeobox gene (isl1)1 mammals   Bozzi et al. (1996)

1 Features an intronless homeobox domain.

