Institut de Génétique et
Microbiologie, Université Paris Sud, 91405 Orsay,
France
title: A genomic approach of
molecular evolution
Comparison of the whole set of
protein sequences encoded by prokaryotic genomes has shown a high
level of gene duplication, suggesting that a simple way to create
new protein functions is to duplicate genes and to allow the resulting
copies (coding for paralogous proteins) to progressively
differentiate. We have further shown that homology is limited to
structural segments that we have called modules. Thus, studying
present-day proteins allows us to detect ancient events of gene
duplication and of gene fusion, giving us some insight into the
mechanisms of protein history and of genome evolution. Exhaustive
intragenomic and intergenomic analyses allow us to distinguish between
proteins which are unique (uni) to a proteome and those which are
paralogues (para). Each of these categories may then be split into
those which have an orthologue in at least one other proteome
(uni-ortho and para-ortho) and those which are unique to their
species (uni-sp and para-sp). An automatic approach is used to
establish the modular structure of each homologous protein and then
its phylogenetic profile. Such a study gives us essential
information about the way proteins specific to one organism have
been progressively built from elementary bricks. The grouping into
families and the deduction of distant modules help to build a
catalogue of ancestral modules and to trace back the history of the
whole set of proteins under study. Moreover, it becomes possible to
count and to date the events of gene duplication and gene fusion
with respect to the speciation events and to approximate the gene
composition of the genome of the putative ancestor of all living
species. Furthermore, studying very ancient gene duplication and gene
fusion events may help us to go beyond this last universal common
ancestor, the classical limit of molecular phylogeny. Thus, our
experimental approach may benefit both molecular phylogeny
and protein history. The families of modules may also be used to
indirectly estimate the phylogenetic distance between species.
Accordingly, a new method has been designed to reconstruct a distance
tree of all analysed species (about 55 genomes). This tree is very
helpful for checking the topology of the tree of life and for understanding
how groups of organisms have progressively adapted to environmental,
physiological or biological conditions.
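
The four-way classification described above (uni-ortho, uni-sp, para-ortho, para-sp) can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the function name `classify_proteins`, the input layout and the toy protein identifiers are hypothetical, the homology pairs are assumed to come from an upstream sequence comparison that is not shown, and any homologue found in another proteome is treated as an orthologue, which is a simplification.

```python
def classify_proteins(proteome_of, homolog_pairs):
    """Label each protein as uni-ortho, uni-sp, para-ortho or para-sp.

    proteome_of: dict mapping protein id -> proteome id (hypothetical layout).
    homolog_pairs: iterable of (protein_a, protein_b) pairs judged homologous
        by an upstream comparison step that is not part of this sketch.
    """
    has_paralogue = set()   # proteins with a homologue in their own proteome
    has_orthologue = set()  # proteins with a homologue in another proteome
    for a, b in homolog_pairs:
        if a == b:
            continue  # ignore self-hits
        if proteome_of[a] == proteome_of[b]:
            has_paralogue.update((a, b))
        else:
            # Simplifying assumption: intergenomic homologue = orthologue.
            has_orthologue.update((a, b))

    categories = {}
    for protein in proteome_of:
        prefix = "para" if protein in has_paralogue else "uni"
        suffix = "ortho" if protein in has_orthologue else "sp"
        categories[protein] = f"{prefix}-{suffix}"
    return categories


# Toy data: two proteomes with invented protein identifiers.
proteome_of = {
    "ecoA1": "eco", "ecoA2": "eco", "ecoB": "eco",
    "ecoC": "eco", "ecoD": "eco", "ecoE": "eco",
    "bsuA": "bsu", "bsuB": "bsu",
}
homolog_pairs = [
    ("ecoA1", "ecoA2"),  # duplicated within eco
    ("ecoA1", "bsuA"),
    ("ecoA2", "bsuA"),
    ("ecoB", "bsuB"),
    ("ecoC", "ecoD"),    # duplicated within eco, no counterpart elsewhere
]
categories = classify_proteins(proteome_of, homolog_pairs)
```

In this toy data set, `ecoE` has no homologue at all, so it falls into uni-sp, whereas `ecoC` and `ecoD` are homologous only to each other and fall into para-sp.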