The Effect of Sequence Variation on the Essential Protein Interactions of Pathogens - a Computational Analysis
Creator
Goodacre, Norman F.
Advisor
Wu, Cathy
Uetz, Peter
Abstract
Sequence variation is investigated in two different contexts: protein domains of unknown function (DUFs) and virus-host protein-protein interactions. In Chapter I, an integrative bioinformatics approach is used to elaborate the phylogenetic distribution, extent of structural knowledge, and size characteristics of DUFs, as well as to infer essentiality and possible functions. DUFs are often found to be conserved across all kingdoms of life, although they are particularly common among bacteria, where we report 238 essential DUFs (eDUFs). In contrast to non-DUFs, essentiality of DUFs does not appear to be related to conservation. Chapter II presents a methodology for investigating essential interactions of eDUFs using systematic mutation of interface residues, which we propose will aid in solving the function of eDUFs. In Chapter III, a computational pipeline is developed for extracting essential protein-protein interactions between a virus and its host (HIV-1 and human) and for identifying sequence variants in host proteins that alter interaction (and hence potentially susceptibility). Forty-five human proteins were predicted to lose HIV-1 binding as a result of one or more variants at the predicted interaction interface. Chapter IV presents a general computational model for predicting loss of binding due to protein mutation. This more sophisticated model uses machine learning with features from comparisons of docking simulations of wild-type and mutant complexes for which binding affinity (Kd) has been experimentally determined. We show that this model has a remarkably low false-positive rate compared with commonly used predictors. We apply the model to a set of well-characterized HIV-1-human protein interactions with known structures, finding 12 novel sequence variants that are likely to abolish interaction. We speculate that these sequence variants may provide carriers with some degree of resistance to HIV-1.
The computational models described can be used together to iteratively refine a high-confidence set of host sequence variants with a role in susceptibility to viral disease, or indeed any disease with an altered landscape of protein interactions arising from mutations (such as cancer).
Description
Ph.D.
Permanent Link
http://hdl.handle.net/10822/712428
Date Published
2014
Subject
Type
Publisher
Georgetown University
Extent
171 leaves
Related items
Showing items related by title, author, creator and subject.
-
TWO APPROACHES TO THE STUDY OF PROTEIN INTERACTIONS WITH SMALL MOLECULES: (A) STRUCTURAL ANALYSIS OF PYRIDOXAL L-PHOSPHATE BINDING ENZYMES (B) PURIFICATION, RECONSTITUTION, AND DRUG-BINDING CAPABILITIES OF THE PLASMODIUM FALCIPARUM MULTIDRUG RESISTANCE PROTEIN (PfMDR1)
Pleeter, Perri Gail (Georgetown University, 2012)
Structural genomics initiatives are producing new protein structures at a rate that will soon