EMBL, EBI, Wellcome Trust Genome Campus,
Hinxton, Cambridge CB10 1SD, United Kingdom
title: Probabilistic Inference of
/ with Protein-Protein Interaction Networks
Michael Lappe*, Sabine Dietmann,
Liisa Holm
We present a framework to generate
comprehensive overviews of protein-protein interactions. In the
post-genomic view of cellular function, each biological entity is
seen in the context of a complex network of interactions.
Accordingly, we model functional space by representing
protein-protein interaction data as undirected graphs. We suggest a
general approach to generate interaction maps of cellular networks in
the presence of huge amounts of fragmented and incomplete data, and
to derive representations of large networks which hide clutter while
keeping the essential architecture of the interaction space. This is
achieved by contracting the graphs according to domain-specific
hierarchical classifications. We generate protein-protein interaction
maps by "pooling" interactions using a hierarchical classification
of structures (e.g. the Structural Classification of Proteins, SCOP, or
the Dali Domain Dictionary, DDD) or a functional classification such as
the Gene Ontology (GO). This trades a loss in specificity for a gain in
generality and allows us to make predictions in the absence of any
characterised homologous protein. The key concept here is the notion
of induced interaction, which allows the integration, comparison and
analysis of interaction data from different sources and different
organisms at a given level of abstraction.
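The pooling step can be illustrated with a minimal sketch. It assumes that an induced interaction between two classes exists whenever at least one pair of their member proteins interacts, and that the number of such pairs is kept as a support count. The function name, the data layout, and the handling of unclassified proteins are illustrative assumptions, not details taken from this work.

```python
from collections import Counter


def induce_interactions(edges, classification):
    """Contract a protein-level graph to the level of a classification.

    edges: iterable of (protein_a, protein_b) pairs, treated as undirected.
    classification: dict mapping a protein to its class label at the
        chosen level of the hierarchy (hypothetical layout).
    Returns a Counter mapping frozenset({class_a, class_b}) to the number
    of protein-level interactions that support the induced interaction.
    """
    induced = Counter()
    for a, b in edges:
        # Assumption: proteins without a class label cannot be pooled.
        if a not in classification or b not in classification:
            continue
        # A frozenset makes the class pair order-independent; two proteins
        # of the same class give a one-element set (a self-interaction).
        induced[frozenset((classification[a], classification[b]))] += 1
    return induced
```

Applying the same function with labels from a coarser level of the hierarchy yields the next, more abstract map, which is one way to compare data sets from different sources at a common level.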
While the classical paradigm is
based on the idea that similar sequence features imply similar
structural and functional features, we base our predictions on the
concept that similar interaction patterns imply similar structure and
function. We apply this approach to compute the higher-level networks
from several sets of interaction data. The architecture of this
network is scale-free, as frequently seen in biological networks, and
this property persists through many levels of abstraction.
Connections in the network can be projected from higher
levels of abstraction down to the level of individual proteins. As an
example, we describe an algorithm for fold assignment by network
context. This method currently predicts protein folds at 55-90%
accuracy without any requirement of detectable sequence similarity of
the query protein to a protein of known structure. We used this
algorithm to compile a list of structural assignments for previously
unassigned genes from yeast. Finally, we discuss ways of using
interaction networks for the prediction of novel protein-protein
interactions as well as the validation of high-throughput interaction
data.
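The abstract does not spell out the fold assignment algorithm, so the following is only a sketch of the general idea of prediction by network context, assuming a simple voting scheme: each interaction partner of known fold votes for the folds that its own fold interacts with in a fold-level map, weighted by the support of that induced interaction. The names `predict_fold`, `fold_of` and `fold_map`, and the scoring rule itself, are assumptions made for illustration.

```python
from collections import Counter


def predict_fold(query, edges, fold_of, fold_map):
    """Suggest a fold for `query` from the folds of its partners.

    edges: iterable of undirected (protein_a, protein_b) pairs.
    fold_of: dict mapping proteins of known structure to a fold label.
    fold_map: mapping of frozenset({fold_a, fold_b}) to a support count,
        i.e. a fold-level interaction map (hypothetical layout).
    Returns the highest-scoring fold, or None if there is no evidence.
    """
    # Collect the interaction partners of the query protein.
    partners = {b if a == query else a for a, b in edges if query in (a, b)}
    partner_folds = [fold_of[p] for p in partners if p in fold_of]

    scores = Counter()
    for pair, support in fold_map.items():
        for pf in partner_folds:
            if pf in pair:
                # The candidate is the other fold in the pair; for a
                # fold that interacts with itself it is the same fold.
                (candidate,) = (pair - {pf}) or pair
                scores[candidate] += support
    return scores.most_common(1)[0][0] if scores else None
```

Note that the query needs no detectable sequence similarity to any protein of known structure; only its interaction partners need fold labels.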