Computational Biology and Bioinformatics
Research summary:
The Computational Biology and Bioinformatics group of the CBMSO descends from the Bioinformatics Unit founded by Ángel Ramírez Ortiz, which comprised the group of Antonio Morreale. It is currently directed by Ugo Bastolla, a physicist with a long-standing interest in biology, who is convinced that only a multidisciplinary approach allows understanding the complexity of living beings, and that biology will benefit strongly from new quantitative methods and from a mathematical formalization resting on statistical mechanics on one side and on evolution on the other. In this framework, proteins are particularly interesting as a bridge between the two disciplines. Our main research lines concern the computational structural biology of proteins, molecular evolution and theoretical ecology. In particular:
- The computational study of protein dynamics through our elastic network model in torsion angle space, the torsional network model (TNM), in order to characterize and predict, among other things, the dynamical couplings between protein residues that play a role in ligand binding and allostery, conformational changes, and the structural changes due to mutations.
- The relationship between protein stability and evolution, through the study of how evolution acts on the structure and the stability of proteins. We are developing more realistic substitution models for phylogenetic inference that take into account folding stability and structure conservation.
- The computational study of the properties and the evolution of disordered proteins that lack a stable three-dimensional structure.
- The bioinformatic study of the epigenetic regulation of replication and transcription in complex cells. In collaboration with Crisanto Gutierrez’s group, we characterized nine chromatin states of the model plant Arabidopsis thaliana and their relationship with genome replication, and in collaboration with Maria Gomez’s group, we studied the relationship between the replication of the mouse genome and the properties of nucleosomes.
- Theoretical ecology, a field in which we helped to quantify the concept of structural stability of ecosystems, which addresses the search for the ecosystem properties that favour the maintenance of biodiversity against perturbations of the environment. In this context, we studied the properties of ecological networks, compared mutualism, predation and competition, and applied this framework to characterize bacterial communities.
In the following, we describe these lines in more detail.
Structural dynamics of proteins through the torsional network model.
Proteins perform the biological function for which they have been selected through very finely tuned motions. The elastic network model predicts the collective motions (normal modes) of protein regions that move in a coordinated fashion with respect to each other, using the information embodied in the native structure of the protein and very few parameters. Our method adopts torsion angles, which are the most relevant degrees of freedom of proteins, and sets the parameters according to the fluctuations observed in NMR ensembles and protein crystals. It allows computing large systems rapidly and precisely, and it predicts large and physically realistic functional movements. Although normal modes only describe small harmonic fluctuations, we are working to include in the computation terms that go beyond the harmonic approximation.
Our main objective is the quantitative understanding and, if possible, the prediction of how proteins change conformation during their biological activity and their evolution, in order to rationalize protein activity, improve the homology-based prediction of protein structure and improve the protein-ligand docking used for drug design. Presently, we are working to develop a new force field that allows predicting these conformational changes.
Another application consists in predicting protein regions that move in a coordinated manner and are involved in ligand binding and in allosteric couplings between a functional site and an allosteric site.
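To make the idea of normal modes and residue couplings concrete, the following is a minimal sketch of a generic elastic network model in Cartesian coordinates (often called an anisotropic network model). It is not the TNM, which works in torsion angle space and fits its parameters to experimental fluctuations. The helix-like toy coordinates, the 10 Å cutoff, the unit spring constant and all function names are assumptions made for illustration.

```python
import numpy as np

def anm_hessian(coords, cutoff=10.0, k=1.0):
    """Hessian of a Cartesian elastic network: a spring of strength k
    joins every pair of nodes closer than `cutoff` (illustrative values)."""
    n = len(coords)
    hess = np.zeros((3 * n, 3 * n))
    for i in range(n):
        for j in range(i + 1, n):
            d = coords[j] - coords[i]
            r2 = d @ d
            if r2 > cutoff ** 2:
                continue
            block = -k * np.outer(d, d) / r2  # off-diagonal 3x3 block
            hess[3*i:3*i+3, 3*j:3*j+3] = block
            hess[3*j:3*j+3, 3*i:3*i+3] = block
            hess[3*i:3*i+3, 3*i:3*i+3] -= block
            hess[3*j:3*j+3, 3*j:3*j+3] -= block
    return hess

def normal_modes(hess, n_rigid=6):
    """Eigen-decomposition of the Hessian, dropping the six rigid-body modes."""
    vals, vecs = np.linalg.eigh(hess)  # eigenvalues in ascending order
    return vals[n_rigid:], vecs[:, n_rigid:]

def residue_couplings(vals, vecs):
    """Normalized correlation between the displacements of every node pair."""
    n = vecs.shape[0] // 3
    cov = (vecs / vals) @ vecs.T                 # pseudo-inverse of the Hessian
    dots = np.einsum("iaja->ij", cov.reshape(n, 3, n, 3))  # trace of 3x3 blocks
    norm = np.sqrt(np.diag(dots))
    return dots / np.outer(norm, norm)

# Hypothetical toy structure: 12 nodes on an alpha-helix-like curve (in Å).
t = np.arange(12)
coords = np.column_stack([2.3 * np.cos(np.radians(100 * t)),
                          2.3 * np.sin(np.radians(100 * t)),
                          1.5 * t])
hess = anm_hessian(coords)
vals, vecs = normal_modes(hess)
couplings = residue_couplings(vals, vecs)
print("softest non-rigid eigenvalues:", vals[:3])
print("coupling between the two chain ends:", couplings[0, -1])
```

The low-eigenvalue modes are the collective motions, and the coupling matrix shows which nodes tend to move together. A torsion-angle formulation would replace the 3N Cartesian coordinates with the backbone torsion angles as degrees of freedom.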
Protein folding stability and molecular evolution.
Since the native structure of a protein is crucial for its dynamics and function, natural selection acts very strongly on the native structure and its stability. Nevertheless, standard molecular evolution models do not take this selective pressure into account. Long ago, our group developed a mathematical model of protein folding stability that is simple enough to characterize not only the selective pressure that favours stability against unfolding (positive design) but also the selective pressure that destabilizes non-native structures (negative design). With this model, we found an interesting relationship between protein folding stability, population size and mutation bias, which may explain why intracellular bacteria with reduced effective population size tend to evolve with a mutation bias that favours nucleotides A and T, resulting in more hydrophobic proteins.
Our model of protein folding stability can be applied to predict the thermodynamic effect of mutations, to protein structure prediction through threading, and to phylogenetic inference and the reconstruction of phylogenetic trees. In this context, our method produces more realistic evolutionary models, and it may explain why different protein sites evolve at different rates. Nevertheless, it is too tolerant to mutations. We are working to improve the model to take into account the structural changes predicted through our torsional network model.
In the same context, we worked to obtain an automatic classification of protein domain structures through structure similarity measures, and to analyse how changes of structure and function are related in protein evolution. This work led us to observe that protein domains can be classified on top of a phylogenetic tree only for very large structure similarity, while for lower but still significant similarity their relationship is better represented as a network, due to their evolutionary origin through recombination of sub-domain fragments and due to evolutionary accelerations mostly related to changes of function.
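The link between folding stability and population size can be illustrated with a textbook calculation. The sketch below is not the group's model: it assumes a two-state folding fitness (the folded fraction) and the standard fixation probability of a single mutant in a Moran process. The free energies, the value of kT and the function names are hypothetical choices for illustration.

```python
import math

KT = 0.6  # kcal/mol, roughly room temperature (assumed units)

def folded_fraction(delta_g, kt=KT):
    """Two-state folding: fraction of folded molecules for a folding
    free energy delta_g (negative means a stable native state)."""
    return 1.0 / (1.0 + math.exp(delta_g / kt))

def fixation_probability(f_wt, f_mut, pop_size):
    """Fixation probability of one mutant of fitness f_mut in a Moran
    population of pop_size individuals with wild-type fitness f_wt."""
    s = math.log(f_mut / f_wt)
    if abs(s) < 1e-12:          # neutral mutation
        return 1.0 / pop_size
    x = -pop_size * s
    if x > 700.0:               # strongly deleterious: avoid float overflow
        return 0.0
    # Equivalent to (1 - f_wt/f_mut) / (1 - (f_wt/f_mut)**pop_size)
    return math.expm1(-s) / math.expm1(x)

# Hypothetical mutation that destabilizes the protein by 1 kcal/mol.
wild_type = folded_fraction(-5.0)
mutant = folded_fraction(-4.0)
for pop_size in (100, 10_000):
    p_fix = fixation_probability(wild_type, mutant, pop_size)
    # Ratio to the neutral expectation 1/pop_size
    print(pop_size, p_fix * pop_size)
```

Under these assumptions the destabilizing mutation behaves almost neutrally in the small population and is efficiently purged in the large one, which is the kind of population-size effect described above.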
Intrinsically disordered proteins.
Studying the proteins of the centrosome, we observed that they tend to be much more disordered, richer in coiled coils and more phosphorylated than control proteins of the same organism. These properties confer evolutionary and regulatory plasticity to the centrosome. We found that they are favoured in organisms with a large number of cell types, and that they were shaped by evolution mainly through the insertion of long disordered fragments, which tends to happen more frequently in evolutionary branches where the number of cell types increased significantly.
Epigenetics and chromatin structure.
The genome of complex cells is packed by the nucleosomes and other protein complexes that constitute the chromatin. The structure of chromatin, which is being very actively investigated, regulates gene expression epigenetically, i.e. the regulation is rooted not in the genome sequence but in the chromatin structure. Our group, in collaboration with the group directed by Crisanto Gutierrez, developed a method based on Hidden Markov Models for classifying chromatin regions of the model plant Arabidopsis thaliana based on histone and nucleotide modification data and on the genomic sequence. We characterized nine chromatin states that are strongly related to the transcription process (active elements, PolyComb repressed regions and heterochromatin) and are linearly organized along the genome. Within the same collaboration, we determined the replication origins of the genome in two different developmental stages and we observed small but relevant differences that depend on the chromatin states.
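As a rough illustration of how a Hidden Markov Model assigns states along a genome, the sketch below decodes the most probable state sequence with the Viterbi algorithm. It is a toy, not the published method: it uses two hypothetical states and a single binary mark per genomic bin instead of nine states and many marks, and the probabilities are made up. Parameter estimation from data is not shown.

```python
import numpy as np

def viterbi(obs, start, trans, emit):
    """Most probable hidden-state path for a discrete-emission HMM.
    start[i]: initial probability, trans[i, j]: i -> j transition,
    emit[i, o]: probability that state i emits symbol o."""
    obs = np.asarray(obs)
    log_start, log_trans, log_emit = np.log(start), np.log(trans), np.log(emit)
    n_steps, n_states = len(obs), len(start)
    score = np.empty((n_steps, n_states))
    back = np.zeros((n_steps, n_states), dtype=int)
    score[0] = log_start + log_emit[:, obs[0]]
    for t in range(1, n_steps):
        cand = score[t - 1][:, None] + log_trans  # cand[i, j]: arrive in j from i
        back[t] = cand.argmax(axis=0)
        score[t] = cand.max(axis=0) + log_emit[:, obs[t]]
    path = [int(score[-1].argmax())]
    for t in range(n_steps - 1, 0, -1):           # trace the best path backwards
        path.append(int(back[t, path[-1]]))
    return path[::-1]

# Hypothetical parameters: state 0 rarely carries the mark, state 1 usually does.
start = np.array([0.5, 0.5])
trans = np.array([[0.9, 0.1],
                  [0.1, 0.9]])
emit = np.array([[0.9, 0.1],
                 [0.1, 0.9]])

marks = [0, 0, 0, 1, 1, 1]   # mark absent/present in consecutive genomic bins
print(viterbi(marks, start, trans, emit))
```

Because transitions between states are penalized, an isolated discordant bin does not change the assigned state, which is one reason HMM segmentations tend to produce contiguous domains along the genome.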
Structural stability in theoretical ecology and bacterial communities.
Flowering plants and insects are groups of organisms with very high biodiversity, characterized by mutualistic interactions that are advantageous for both interacting species. In the ecological literature, there has been heated discussion on the consequences of mutualistic interactions, since some mathematical models suggest that mutualism hinders the stability of ecosystems. Adopting the concept of structural stability that we helped to quantify several years ago, we showed that mutualism favours structural stability and therefore biodiversity when ecological networks are completely connected. Thereafter, we showed that the structural stability of mutualistic ecosystems increases with the connectance and the overlap (sometimes referred to as nestedness) of the mutualistic network and is inversely related to the interspecific competition. Presently, we are exhaustively comparing mutualism, predation and competition in real and simulated ecological networks.
These results led us to study the ecological relationships between bacterial taxa, which we predict from co-occurrence data obtained in metagenomics experiments after taking into account, as far as possible, the effect of the environment. We observed that aggregations between taxa (“mutualism”) are more frequent than exclusions (“competition”) and favour the cosmopolitanism of bacteria, i.e. their capacity to live in many different environments. We developed an algorithm to reconstruct bacterial communities from taxa that present significant aggregations.
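The inverse relation between structural stability and interspecific competition can be illustrated with a deliberately simplified model. The sketch below is not the group's mutualistic model: it assumes Lotka-Volterra competition with a mean-field competition matrix, and it uses the fraction of randomly perturbed growth-rate vectors that still give a positive equilibrium for all species as a crude proxy for structural stability. The parameter values and function names are assumptions.

```python
import numpy as np

def competition_matrix(n_species, rho):
    """Mean-field competition: 1 on the diagonal (intraspecific),
    rho between every pair of distinct species (interspecific)."""
    return (1.0 - rho) * np.eye(n_species) + rho * np.ones((n_species, n_species))

def feasible_fraction(matrix, spread=0.3, n_trials=2000, seed=0):
    """Fraction of perturbed growth-rate vectors whose equilibrium
    abundances (solution of matrix @ N = alpha) are all positive."""
    rng = np.random.default_rng(seed)
    n = matrix.shape[0]
    # Growth rates alpha_i drawn uniformly in [1 - spread, 1 + spread]
    alphas = 1.0 + spread * rng.uniform(-1.0, 1.0, size=(n_trials, n))
    abundances = np.linalg.solve(matrix, alphas.T).T
    return float(np.mean(np.all(abundances > 0, axis=1)))

for rho in (0.1, 0.4, 0.8):
    frac = feasible_fraction(competition_matrix(10, rho))
    print(f"interspecific competition {rho}: feasible fraction {frac:.3f}")
```

In this toy setting, stronger interspecific competition shrinks the range of growth rates compatible with the coexistence of all species. A mutualistic network would enter through additional interaction terms that are not modelled here.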
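A simple way to flag aggregations and exclusions from presence/absence data is a hypergeometric test on the number of samples in which two taxa co-occur. The sketch below shows only this basic null model, which ignores the effect of the environment that the analysis described above takes into account. The function name and the example counts are hypothetical.

```python
from math import comb

def cooccurrence_pvalues(n_samples, n_a, n_b, n_both):
    """Hypergeometric p-values for the co-occurrence of two taxa.
    n_a, n_b: samples containing taxon A and taxon B; n_both: samples with both.
    Returns (P[X >= n_both], P[X <= n_both]) under random placement."""
    total = comb(n_samples, n_b)

    def prob(k):
        return comb(n_a, k) * comb(n_samples - n_a, n_b - k) / total

    lowest = max(0, n_a + n_b - n_samples)   # smallest possible overlap
    highest = min(n_a, n_b)                  # largest possible overlap
    p_aggregation = sum(prob(k) for k in range(n_both, highest + 1))
    p_exclusion = sum(prob(k) for k in range(lowest, n_both + 1))
    return p_aggregation, p_exclusion

# Hypothetical counts: 10 samples, each taxon present in 5 of them.
print(cooccurrence_pvalues(10, 5, 5, 5))  # always together: small first value
print(cooccurrence_pvalues(10, 5, 5, 0))  # never together: small second value
```

A small first value suggests aggregation and a small second value suggests exclusion. With many taxon pairs, a correction for multiple testing would also be needed.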
Last name | Name | Laboratory | Ext.* | E-mail | Professional category |
---|---|---|---|---|---|
Bastolla Bufalini | Ugo | 312 | 4633 | ubastolla(at)cbm.csic.es | Tenured Scientist of Public Research Organizations (E. Científicos Titulares de OPIs) |
Campos Bermejo | Carlos | 313.3 | 4633 | | Master's thesis (TFM) student |
Sinioraki | Georgia Emmanouela | 313.3 | 4633 | | Erasmus fellow |
Relevant publications:
Covid-19
- Mathematical Model of SARS-Cov-2 Propagation Versus ACE2 Fits COVID-19 Lethality Across Age and Sex and Predicts That of SARS. Bastolla U. Front Mol Biosci. 2021 Jul 12;8:706122. doi: 10.3389/fmolb.2021.706122.
- Is Covid-19 severity associated with ACE2 degradation? Ugo Bastolla, Patrick Chambers, David Abia, Maria-Laura García-Bermejo and Manuel Fresno. http://arxiv.org/abs/2102.13210
- How lethal is the novel coronavirus, and how many undetected cases there are? The importance of being tested. Ugo Bastolla. https://medrxiv.org/cgi/content/short/2020.03.27.20045062v1
Structural dynamics of proteins through the torsional network model.
- Hu H, Howard RJ, Bastolla U, Lindahl E, Delarue M (2020) Structural basis for allosteric transitions of a multidomain pentameric ligand-gated ion channel. Proc Natl Acad Sci U S A. 117:13437-13446. doi: 10.1073/pnas.1922701117.
- Bastolla U, Dehouck Y (2019) Can conformational changes of proteins be represented in torsion angle space? A study with Rescaled Ridge Regression. J Chem Inf Model 59:4929-4941. doi: 10.1021/acs.jcim.9b00627.
- Alfayate A, Rodriguez Caceres C, Gomes Dos Santos H, Bastolla U (2019) Predicted dynamical couplings of protein residues characterize catalysis, transport and allostery. Bioinformatics.
- Mendez R, Bastolla U (2010) Torsional network model: normal modes in torsion angle space better correlate with conformation changes in proteins. Phys Rev Lett. 104:228103.
Protein folding stability and molecular evolution.
- Pascual-García A, Arenas M, Bastolla U (2019) The Molecular Clock in the Evolution of Protein Structures. Syst Biol. 68:987-1002. doi: 10.1093/sysbio/syz022.
- Bastolla U, Dehouck Y, Echave J (2017) What evolution tells us about protein physics, and protein physics tells us about evolution. Curr Opin Struct Biol. 42:59-66. doi: 10.1016/j.sbi.2016.10.020. Review
- Arenas M, Sánchez-Cobos A, Bastolla U (2015) Maximum-Likelihood Phylogenetic Inference with Selection on Protein Folding Stability. Mol Biol Evol. 32:2195-207. doi: 10.1093/molbev/msv085.
Intrinsically disordered proteins.
- Nido GS, Méndez R, Pascual-García A, Abia D, Bastolla U (2012) Protein disorder in the centrosome correlates with complexity in cell types number. Mol Biosyst. 8:353-67. doi: 10.1039/c1mb05199g.
Epigenetics and chromatin structure.
- Sequeira-Mendes J, Vergara Z, Peiró R, Morata J, Aragüez I, Costas C, Mendez-Giraldez R, Casacuberta JM, Bastolla U, Gutierrez C (2019) Differences in firing efficiency, chromatin, and transcription underlie the developmental plasticity of the Arabidopsis DNA replication origins. Genome Res. 29:784-797. doi: 10.1101/gr.240986.118.
- Sequeira-Mendes J, Aragüez I, Peiró R, Mendez-Giraldez R, Zhang X, Jacobsen SE, Bastolla U, Gutierrez C (2014) The Functional Topography of the Arabidopsis Genome Is Organized in a Reduced Number of Linear Motifs of Chromatin States. Plant Cell. 26:2351-2366.
Structural stability in theoretical ecology and bacterial communities.
- Pascual-García A, Bastolla U (2017) Mutualism supports biodiversity when the direct competition is weak. Nat Commun. 8:14326. doi: 10.1038/ncomms14326.
- Bastolla U, Fortuna MA, Pascual-García A, Ferrera A, Luque B, Bascompte J (2009) The architecture of mutualistic networks minimizes competition and increases biodiversity. Nature. 458:1018-20. doi: 10.1038/nature07950.