International Summer School

   From Genome to Life:

    Structural, Functional and Evolutionary approaches

 


MARCK Christian

Commissariat à l'Energie Atomique, Service de Biochimie et de Génétique Moléculaire, CEA/Saclay, Gif-Sur-Yvette 91191, France

title: Comparative analysis of tRNA genes from complete genomes of Eukarya, Archaea and Bacteria reveals domain-specific features.

Christian MARCK and Henri GROSJEAN

The recent availability of complete genome sequences from many different organisms offers a unique opportunity to begin a systematic and thorough sequence comparison of tRNA genes (tDNA), both within a given genome and between genomes of closely and distantly related organisms. From 50 fully sequenced genomes, selected as the most representative of the three domains of life (8 Eukarya, 12 Archaea and 30 Bacteria), we have extracted and analysed over 4000 sequences corresponding to cytoplasmic, non-organellar tRNA. The search for tRNA genes was based solely on the requirement for a standard cloverleaf structure, followed by semi-automatic procedures that sort and compare the tDNA sequences and compute relevant statistics according to different criteria. In this work, we have verified, compiled and commented on various known as well as newly discovered features and sequence peculiarities of tDNA within the three domains of life. Among these are: (i) characteristic consensus sequences for elongator and initiator tDNA, (ii) frequencies of bases at each sequence position, (iii) type and frequencies of conserved 2D and 3D base-pairs in tRNA, (iv) copy number, anticodon usage and predicted codon/anticodon wobble rules, (v) length of the variable arm, (vi) occurrence, location and size of introns, (vii) 3’-CCA and 5’-extra G occurrences at the tDNA level and (viii) distribution of the tRNA genes in genomes and their mode of transcription. A number of interesting exceptions to known rules regarding sequence, anticodon usage and variable arm length are also presented. Among all tRNA isoacceptors, initiator tDNA-iMet are the most conserved across the three domains, yet domain-specific signatures exist. Depending on the tRNA feature considered, Archaea group with either Eukarya or Bacteria.
These data provide a benchmark for checking the nuclear tDNA sequences of genomes yet to be sequenced, as well as the corresponding mature cytoplasmic tRNA, except of course for the presence of modified nucleotides.
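
As an illustration of feature (ii), the frequency of bases at each sequence position, the following minimal Python sketch computes per-column base frequencies and a thresholded consensus from a set of pre-aligned sequences. It is not the semi-automatic procedure used in this work: the function names `position_frequencies` and `consensus`, the 0.9 threshold and the toy sequences are all assumptions made for the example, and real tDNA would first need to be aligned on the cloverleaf numbering.

```python
from collections import Counter


def position_frequencies(aligned):
    """Return, for each alignment column, a dict mapping base -> fraction.

    `aligned` is a list of equal-length strings (hypothetical pre-aligned
    tDNA sequences; gap characters, if any, are counted like bases).
    """
    if not aligned:
        return []
    length = len(aligned[0])
    if any(len(seq) != length for seq in aligned):
        raise ValueError("all sequences must have the same aligned length")
    freqs = []
    for column in zip(*aligned):  # iterate over alignment columns
        counts = Counter(column)
        freqs.append({base: n / len(column) for base, n in counts.items()})
    return freqs


def consensus(freqs, threshold=0.9):
    """Most frequent base per column, or 'N' when it falls below `threshold`.

    The threshold value is an illustrative assumption, not a value
    taken from the abstract.
    """
    out = []
    for col in freqs:
        base, frac = max(col.items(), key=lambda kv: kv[1])
        out.append(base if frac >= threshold else "N")
    return "".join(out)


# Toy sequences for illustration only; these are not real tDNA.
toy = ["GCGGA", "GCGCA", "GCGGA", "GGGGA"]
freqs = position_frequencies(toy)
print(consensus(freqs))        # strict consensus at the 0.9 threshold
print(consensus(freqs, 0.7))   # relaxed consensus at a 0.7 threshold
```

Lowering the threshold trades strictness for coverage: positions that are conserved in most, but not all, sequences then appear in the consensus instead of being masked as `N`.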