International Summer School

   From Genome to Life:

    Structural, Functional and Evolutionary approaches

 


LAPPE Michael

EMBL, EBI, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, United Kingdom

title: Probabilistic Inference of / with Protein-Protein Interaction Networks

Michael Lappe*, Sabine Dietmann, Liisa Holm

We present a framework to generate comprehensive overviews of protein-protein interactions. In the post-genomic view of cellular function, each biological entity is seen in the context of a complex network of interactions. Accordingly, we model functional space by representing protein-protein interaction data as undirected graphs. We suggest a general approach to generate interaction maps of cellular networks in the presence of huge amounts of fragmented and incomplete data, and to derive representations of large networks that hide clutter while keeping the essential architecture of the interaction space. This is achieved by contracting the graphs according to domain-specific hierarchical classifications. We generate protein-protein interaction maps by "pooling" interactions using a hierarchical classification of structures (e.g. the Structural Classification of Proteins, SCOP, or the Dali Domain Dictionary, DDD) or a functional classification such as the Gene Ontology (GO). This trades specificity for generality and allows us to make predictions in the absence of any characterised homologous protein. The key concept here is the notion of induced interaction, which allows the integration, comparison and analysis of interaction data from different sources and different organisms at a given level of abstraction.
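
The pooling step described above can be illustrated with a minimal sketch. It assumes that interactions are given as unordered protein pairs and that the classification is a simple mapping from protein to class at one chosen level of the hierarchy. The function name contract_interactions, the protein identifiers and the fold labels are hypothetical and serve only as illustration; this is not the authors' implementation.

```python
from collections import defaultdict

def contract_interactions(interactions, classification):
    """Pool protein-level interactions into class-level (induced) interactions.

    interactions   -- iterable of (protein_a, protein_b) pairs (undirected)
    classification -- dict mapping protein -> class label at one level of
                      a hierarchy (hypothetical labels in this example)
    Returns a dict mapping a sorted (class, class) pair to the number of
    protein-level interactions that support it.
    """
    induced = defaultdict(int)
    for a, b in interactions:
        ca, cb = classification.get(a), classification.get(b)
        if ca is None or cb is None:
            continue  # proteins without a class cannot be pooled
        edge = tuple(sorted((ca, cb)))  # undirected: order is irrelevant
        induced[edge] += 1
    return dict(induced)

# Toy data, invented for illustration only.
interactions = [("P1", "P2"), ("P3", "P4"), ("P1", "P4"), ("P5", "P2")]
fold_of = {"P1": "foldA", "P2": "foldB", "P3": "foldA", "P4": "foldB"}

induced = contract_interactions(interactions, fold_of)
print(induced)
```

In this toy input, three protein-level interactions collapse into a single induced interaction between the two classes, and the pair involving the unclassified protein P5 is skipped. Keeping the support count per induced edge is one possible design choice; it preserves how much evidence lies behind each pooled connection.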

While the classical paradigm is based on the idea that similar sequence features imply similar structural and functional features, we base our predictions on the concept that similar interaction patterns imply similar structure and function. We apply this approach to compute higher-level networks from several sets of interaction data. The architecture of these networks is scale-free, as frequently seen in biological networks, and this property persists through many levels of abstraction. Connections in the network can be projected from higher levels of abstraction down to the level of individual proteins. As an example, we describe an algorithm for fold assignment by network context. This method currently predicts protein folds at 55-90% accuracy without requiring any detectable sequence similarity between the query protein and a protein of known structure. We used this algorithm to compile a list of structural assignments for previously unassigned genes from yeast. Finally, we discuss ways forward to use interaction networks for the prediction of novel protein-protein interactions as well as for the validation of high-throughput interaction data.
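
The idea of projecting class-level connections down to an individual protein can be sketched as follows. The abstract does not specify the scoring scheme, so the simple vote counting used here is an assumption, as are the function name predict_fold, the identifiers and the fold labels. The sketch ranks candidate folds for a query protein by how often those folds are seen interacting with the folds of the query's known partners.

```python
from collections import Counter

def predict_fold(query, interactions, fold_of, induced):
    """Rank candidate folds for `query` from its network context.

    Assumed scoring (not taken from the source): every partner of the
    query with a known fold votes for all folds that its own fold is
    known to interact with, weighted by the induced edge's support.
    """
    # Folds of the query's classified interaction partners.
    partner_folds = []
    for a, b in interactions:
        if a == query and b in fold_of:
            partner_folds.append(fold_of[b])
        elif b == query and a in fold_of:
            partner_folds.append(fold_of[a])

    scores = Counter()
    for (f1, f2), support in induced.items():
        for pf in partner_folds:
            if f1 == pf:
                scores[f2] += support
            if f2 == pf and f1 != f2:  # avoid double-counting self-edges
                scores[f1] += support
    return scores.most_common()  # best-supported fold first

# Toy data, invented for illustration only.
induced = {("foldA", "foldB"): 3, ("foldB", "foldC"): 1}
fold_of = {"P2": "foldB", "P4": "foldB"}
interactions = [("Q", "P2"), ("Q", "P4")]

ranked = predict_fold("Q", interactions, fold_of, induced)
print(ranked)
```

Here the query Q interacts with two proteins of the same fold, and the candidate that most often pairs with that fold at the class level is ranked first. No sequence comparison enters the score, which mirrors the point made above that the prediction rests on interaction patterns alone.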