MRC Laboratory of
Molecular Biology, Structural Studies, Hills Road,
Cambridge CB2 2QH, United Kingdom
title: The
Evolution of Protein Families – A
Comparative Study on The Immunoglobulin Superfamily
Repertoire in Drosophila melanogaster and
Caenorhabditis elegans
One of the major
questions in evolutionary structural biology is how
the differences in physiological complexity between
organisms are reflected at the level of proteins
and protein families. Proteins consist of domains,
which are evolutionary and structural units. We
characterised one of the largest protein families,
the immunoglobulin superfamily (IgSF), in the two
metazoan organisms Drosophila melanogaster
(fruitfly) and Caenorhabditis elegans (worm).
Despite proteomes of comparable size (Rubin et al.,
2000), these two organisms differ greatly in
physiological complexity, for example, of the
nervous system and the number of different cells.
The IgSF domains in Drosophila and worm were
identified using a library of Hidden Markov Models
(SUPERFAMILY; Gough and Chothia, 2002) based on the sequences
of domains in the Structural Classification of
Proteins (SCOP) Database (Murzin et al., 1995). We
also assigned other structural and sequence
domains, signal sequences, low complexity regions
and transmembrane helices. We identified homology
amongst the IgSF proteins within and between the
two organisms according to similarity in sequence,
domain architecture and function. This approach
allowed us to characterise the extent to which
genes are conserved or specific across the two
organisms. We also found cases in each of the two
organisms where an expansion of gene types occurred
by gene duplication, and cases where genes were
modified through changes in function and the acquisition
of new domains. The set of IgSF proteins in the
fruitfly is about one-and-a-half times the size of
the set in the worm (95 and 67 proteins,
respectively, excluding 59 and 23 short fragments).
The proteins belong to the same broad functional
categories: two thirds of the fly proteins and
about half of the worm proteins are involved in
neural development, another fifth are involved in
other developmental processes, and about one tenth
are involved in muscle function in both organisms.
The majority of proteins are common to both
organisms: two thirds of the fly proteins are
shared with four fifths of the worm proteins,
most of which are involved in
development. Our findings demonstrate how the
higher complexity of, for example, the fly nervous
system has evolved: among the shared proteins, the
number of fly proteins per functional group expanded
most in neural development and in the
development and function of muscles; furthermore,
two thirds of the proteins specific to the fly, but
only one tenth of the proteins specific to the worm
are involved in neural development. However, both
organisms also developed a set of proteins with
specific and novel domain combinations and
functions showing an independent evolution of the
IgSF after the separation of the worm and fly
progenitors. We provide clues as to the function of
71 uncharacterised proteins in total based on our
study of domain architecture and hence double the
number of IgSF proteins with some degree of
functional annotation. Thus our case study of this
particular superfamily contributes to our
understanding of the conservation and divergence
amongst cell adhesion, development and muscle
proteins of metazoan genomes.
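
The comparison of domain architectures described above can be illustrated with a minimal sketch. It is not the authors' pipeline: the study combined similarity in sequence, domain architecture and function, whereas this sketch only matches proteins whose ordered list of domains is identical. The protein names, the domain labels and the functions `group_by_architecture` and `compare_repertoires` are all hypothetical and chosen for illustration.

```python
from collections import defaultdict


def group_by_architecture(proteins):
    """Map each domain architecture (ordered tuple of domain names)
    to the list of proteins that have exactly that architecture."""
    groups = defaultdict(list)
    for name, domains in proteins.items():
        groups[tuple(domains)].append(name)
    return dict(groups)


def compare_repertoires(fly, worm):
    """Split architectures into those found in both organisms and
    those found in only one. Assumption: exact architecture identity
    stands in for homology, which is a strong simplification."""
    fly_groups = group_by_architecture(fly)
    worm_groups = group_by_architecture(worm)
    shared = set(fly_groups) & set(worm_groups)
    return {
        "shared": {a: (fly_groups[a], worm_groups[a]) for a in shared},
        "fly_specific": {a: p for a, p in fly_groups.items() if a not in shared},
        "worm_specific": {a: p for a, p in worm_groups.items() if a not in shared},
    }


# Invented toy input, not data from the study.
fly = {
    "fly_A": ["Ig", "Ig", "FnIII"],
    "fly_B": ["Ig", "Ig", "FnIII"],  # same architecture as fly_A
    "fly_C": ["Ig", "kinase"],
}
worm = {
    "worm_A": ["Ig", "Ig", "FnIII"],
    "worm_B": ["Ig", "LRR"],
}

result = compare_repertoires(fly, worm)
```

In this toy input, two fly proteins and one worm protein share a single architecture, which is the pattern expected when a shared gene type has expanded in one lineage; architectures found in only one organism correspond to the organism-specific domain combinations.
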
References:
- Bateman, A., Birney, E., Cerruti, L., Durbin, R.,
Etwiller, L., Eddy, S.R., Griffiths-Jones, S.,
Howe, K.L., Marshall, M., Sonnhammer, E.L.
(2002). The Pfam protein families database.
Nucleic Acids Res 30(1), 276-280.
- Gough, J., Chothia, C. (2002). SUPERFAMILY: HMMs
representing all proteins of known structure.
SCOP sequence searches, alignments and genome
assignments. Nucleic Acids Res 30(1),
268-272.
- Murzin, A.G., Brenner, S.E., Hubbard, T., Chothia, C.
(1995). SCOP: a structural classification of
proteins database for the investigation of
sequences and structures. J Mol Biol
247(4), 536-540.