International Summer School

   From Genome to Life:

    Structural, Functional and Evolutionary approaches

 


LABEDAN Bernard

Institut de Génétique et Microbiologie, Université Paris Sud, 91405 Orsay, France

title: A genomic approach to molecular evolution

Comparison of the whole set of protein sequences encoded by procaryotic genomes has shown a high level of gene duplication, suggesting that a simple way to create new protein functions is to duplicate genes and to allow the resulting copies (coding for paralogous proteins) to differentiate progressively. We have further shown that homology is limited to structural segments that we have called modules. Thus, studying present-day proteins makes it possible to detect ancient events of gene duplication and gene fusion, giving us some insight into the mechanisms of protein history and genome evolution. Exhaustive intragenomic and intergenomic analyses make it possible to distinguish between proteins that are unique (uni) to a proteome and those that are paralogues (para). Each of these categories may then be split into those that have an orthologue in at least one other proteome (uni-ortho and para-ortho) and those that are unique to their species (uni-sp and para-sp). An automatic approach is used to establish the modular structure of each homologous protein and then its phylogenetic profile. Such a study gives us essential information about the way proteins specific to one organism have been progressively built from elementary bricks. Grouping modules into families and deducing distant modules help to build a catalogue of ancestral modules and to trace back the history of the whole set of proteins under study. Moreover, it becomes possible to count and to date the events of gene duplication and gene fusion with respect to the speciation events, and to approach the gene composition of the genome of the putative ancestor of all living species. Furthermore, the study of very ancient gene duplication and gene fusion events may help us to go beyond this last universal common ancestor, the classical limit of molecular phylogeny. Thus, our experimental approach may benefit both molecular phylogeny and protein history.
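The four categories named above (uni-sp, uni-ortho, para-sp, para-ortho) can be illustrated with a minimal Python sketch. It is not the automatic approach used in this work: the function name, the input format and the toy identifiers are hypothetical, and it assumes that pairwise homology has already been established. It also treats any homologue found in another proteome as an orthologue, which is a simplification.

```python
def classify_proteins(species_of, homolog_pairs):
    """Label each protein as uni-sp, uni-ortho, para-sp or para-ortho.

    species_of: dict mapping protein id -> species (proteome) name.
    homolog_pairs: iterable of (protein_a, protein_b) homologous pairs.
    Assumption: any cross-species homologue counts as an orthologue.
    """
    has_paralogue = set()   # proteins with a homologue in their own proteome
    has_orthologue = set()  # proteins with a homologue in another proteome

    for a, b in homolog_pairs:
        if a == b:
            continue  # ignore self-matches
        if species_of[a] == species_of[b]:
            has_paralogue.update((a, b))
        else:
            has_orthologue.update((a, b))

    labels = {}
    for protein in species_of:
        prefix = "para" if protein in has_paralogue else "uni"
        suffix = "ortho" if protein in has_orthologue else "sp"
        labels[protein] = f"{prefix}-{suffix}"
    return labels


# Toy data: two hypothetical proteomes, A and B.
species_of = {"a1": "A", "a2": "A", "a3": "A", "a4": "A", "b1": "B", "b2": "B"}
homolog_pairs = [("a1", "a2"), ("a1", "b1"), ("a3", "b2")]
labels = classify_proteins(species_of, homolog_pairs)
```

In this toy example, a1 has both a paralogue (a2) and a homologue in proteome B, whereas a4 has no homologue at all.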
The families of modules may also be used to estimate indirectly the phylogenetic distance between species. Accordingly, a new method has been designed to reconstruct a distance tree of all analysed species (about 55 genomes). This tree is very helpful for checking the topology of the tree of life and for understanding how groups of organisms have progressively adapted to environmental, physiological or biological conditions.
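The abstract does not specify how distances are derived from the families of modules. As a rough illustration only, the sketch below uses a Jaccard distance on the presence or absence of module families in each species; this is a stand-in, not the new method mentioned above. The function names, family labels and species names are hypothetical.

```python
from itertools import combinations


def jaccard_distance(families_a, families_b):
    """Return 1 - |shared families| / |all families| for two species."""
    union = families_a | families_b
    if not union:
        return 0.0  # two empty sets are treated as identical
    return 1.0 - len(families_a & families_b) / len(union)


def distance_matrix(families_by_species):
    """Build a symmetric pairwise distance table keyed by species pairs."""
    dist = {}
    for s, t in combinations(sorted(families_by_species), 2):
        d = jaccard_distance(families_by_species[s], families_by_species[t])
        dist[(s, t)] = d
        dist[(t, s)] = d
    return dist


# Toy data: module families detected in three hypothetical species.
families_by_species = {
    "sp1": {"F1", "F2", "F3"},
    "sp2": {"F1", "F2", "F4"},
    "sp3": {"F5"},
}
dist = distance_matrix(families_by_species)
```

A table of this kind could then be passed to a standard distance-based tree-building method such as neighbour joining or UPGMA, which is not shown here.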