Computational Biology and Bioinformatics

Research summary:

The Computational Biology and Bioinformatics group of the CBMSO descends from the Bioinformatics Unit founded by Ángel Ramírez Ortiz and joined by the group of Antonio Morreale. It is currently directed by Ugo Bastolla, a physicist with a long-standing interest in biology, who is convinced that only a multidisciplinary approach allows us to understand the complexity of living beings, and that biology will benefit strongly from new quantitative methods and from a mathematical formalization resting on statistical mechanics on one side and on evolution on the other. In this framework, proteins are particularly interesting as a bridge between the two disciplines. Our main research lines concern the computational structural biology of proteins, molecular evolution and theoretical ecology. In particular:

  1. The computational study of protein dynamics through our elastic network model in torsion angle space, the torsional network model (TNM), in order to characterize and predict, among other things, the dynamical couplings between protein residues that play a role in ligand binding and allostery, conformational changes, and the structural changes due to mutations.
  2. The relationship between protein stability and evolution, through the study of how evolution acts on the structure and the stability of proteins. We are developing more realistic substitution models for phylogenetic inference that take into account folding stability and structure conservation.
  3. The computational study of the properties and the evolution of disordered proteins that lack a stable three-dimensional structure.
  4. The bioinformatic study of the epigenetic regulation of replication and transcription in complex cells. In collaboration with Crisanto Gutierrez’s group, we characterized nine chromatin states of the model plant Arabidopsis thaliana and their relationship with genome replication, and in collaboration with Maria Gomez’s group, we studied the relationship between the replication of the mouse genome and the properties of nucleosomes.
  5. Theoretical ecology, a field in which we contributed to quantifying the concept of structural stability of ecosystems, which addresses the search for the ecosystem properties that favour the maintenance of biodiversity against perturbations of the environment. In this context, we addressed the properties of ecological networks and the comparison between mutualism, predation and competition, and we applied this framework to characterize bacterial communities.

In the following, we describe these lines in more detail.

Structural dynamics of proteins through the torsional network model.

Proteins perform the biological function for which they have been selected through very finely tuned motions. The elastic network model predicts the collective motions (normal modes) of protein regions that move in a coordinated fashion with respect to each other, using the information embodied in the native structure of the protein and very few parameters. Our method adopts torsion angles, which are the most relevant degrees of freedom of proteins, and sets the parameters according to the fluctuations observed in NMR ensembles and protein crystals. It handles large systems rapidly and precisely, and predicts large and physically realistic functional movements very efficiently. Although normal modes only describe small harmonic fluctuations, we are working to include in the computation terms that go beyond the harmonic approximation.
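As a rough illustration of how an elastic network turns a native structure into harmonic motions, the sketch below builds the Hessian (the matrix of second derivatives of the elastic energy) for a network of springs between nearby residues. It is written in Cartesian coordinates, in the style of the widely used anisotropic network model, and is not the torsion-angle formulation of the TNM. The function names, the 7 Å cutoff, the uniform spring constant and the toy coordinates are all assumptions made for the example.

```python
def anm_hessian(coords, cutoff=7.0, k=1.0):
    """Build the 3N x 3N Hessian of a Cartesian elastic network.

    coords: list of (x, y, z) tuples, e.g. C-alpha positions (assumed input).
    cutoff, k: hypothetical contact distance (angstrom) and spring constant.
    """
    n = len(coords)
    hess = [[0.0] * (3 * n) for _ in range(3 * n)]
    for i in range(n):
        for j in range(i + 1, n):
            d = [coords[j][a] - coords[i][a] for a in range(3)]
            r2 = sum(x * x for x in d)
            if r2 > cutoff * cutoff:
                continue  # no spring between distant residues
            for a in range(3):
                for b in range(3):
                    g = k * d[a] * d[b] / r2
                    # Off-diagonal blocks couple residues i and j.
                    hess[3 * i + a][3 * j + b] -= g
                    hess[3 * j + a][3 * i + b] -= g
                    # Diagonal blocks balance the off-diagonal ones.
                    hess[3 * i + a][3 * i + b] += g
                    hess[3 * j + a][3 * j + b] += g
    return hess


def matvec(matrix, vector):
    """Multiply a dense matrix (list of rows) by a vector."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]


# Toy four-residue "structure" (made-up coordinates, in angstrom).
toy_coords = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (3.8, 3.8, 0.0), (0.0, 3.8, 1.0)]
toy_hessian = anm_hessian(toy_coords)

# A rigid translation along x costs no elastic energy, so it is a zero mode.
translation_x = [1.0, 0.0, 0.0] * len(toy_coords)
restoring_force = matvec(toy_hessian, translation_x)
```

The normal modes are the eigenvectors of this Hessian, and the low-frequency ones describe the collective motions. Any standard symmetric eigensolver can be applied to the matrix returned here.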

Our main objective is the quantitative understanding and, if possible, the prediction of how proteins change conformation during their biological activity and their evolution, in order to rationalize protein activity, improve the homology-based prediction of protein structure and improve the protein-ligand docking used for drug design. Presently, we are working to develop a new force field that allows predicting these conformational changes.

Another application consists in predicting protein regions that move in a coordinated manner and are involved in ligand binding and in allosteric couplings between a functional site and an allosteric site.

Protein folding stability and molecular evolution.

Since the native structure of a protein is crucial for its dynamics and function, natural selection targets the native structure and its stability very strongly. Nevertheless, standard molecular evolution models do not take this selective pressure into account. Our group has long worked on a mathematical model of protein folding stability that is simple enough to characterize not only the selective pressure that favours stability against unfolding (positive design) but also the selective pressure that destabilizes non-native structures (negative design). With this model, we found an interesting relationship between protein folding stability, population size and mutation bias, which may explain why intracellular bacteria with reduced effective population size tend to evolve with a mutation bias that favours nucleotides A and T, resulting in more hydrophobic proteins.
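The link between stability and population size can be illustrated with a minimal sketch. It assumes, purely for illustration, that fitness equals the fraction of folded protein in a two-state model and that mutations fix according to the Moran process. Neither assumption is stated in the text above, and the free energy values and function names are hypothetical.

```python
import math

KT = 0.593  # kcal/mol, approximate thermal energy near room temperature


def folded_fraction(delta_g, kt=KT):
    """Two-state folded fraction; delta_g < 0 means a stable native state."""
    return 1.0 / (1.0 + math.exp(delta_g / kt))


def moran_fixation(f_wt, f_mut, n):
    """Fixation probability of a single mutant in a Moran population of size n."""
    ratio = f_wt / f_mut
    if abs(ratio - 1.0) < 1e-12:
        return 1.0 / n  # neutral limit
    return (1.0 - ratio) / (1.0 - ratio ** n)


# Hypothetical destabilizing mutation: folding free energy goes from -5 to -4 kcal/mol.
f_wild = folded_fraction(-5.0)
f_mutant = folded_fraction(-4.0)

p_small_population = moran_fixation(f_wild, f_mutant, 100)
p_large_population = moran_fixation(f_wild, f_mutant, 10000)
```

Under these assumptions the same destabilizing mutation fixes far more easily in the small population, which is the kind of effect that makes population size relevant for the stability that evolution can maintain.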

Our model of protein folding stability can be applied to predict the thermodynamic effect of mutations, to protein structure prediction through threading, and to phylogenetic inference and the reconstruction of phylogenetic trees. In this context, our method produces more realistic evolutionary models, and it may explain why different protein sites evolve at different rates. Nevertheless, it is too tolerant of mutations. We are working to improve the model to take into account the structural changes predicted through our torsional network model.

In the same context, we worked to obtain an automatic classification of protein domain structures through structure similarity measures, and to analyse how changes of structure and function are related in protein evolution. This work led us to observe that protein domains can be placed on a phylogenetic tree only when their structure similarity is very high, while for lower but still significant similarity their relationship is better represented as a network, due to their evolutionary origin through recombination of sub-domain fragments and due to evolutionary accelerations mostly related to changes of function.

Intrinsically disordered proteins.

Studying the proteins of the centrosome, we observed that they tend to be much more disordered, richer in coiled coils and more phosphorylated than control proteins of the same organism. These properties confer evolutionary and regulatory plasticity to the centrosome. We found that they are favoured in organisms with a large number of cell types, and that they were shaped by evolution mainly through the insertion of long disordered fragments, which tends to happen more frequently in evolutionary branches where the number of cell types increased significantly.

Epigenetics and chromatin structure.

The genome of complex cells is packed by the nucleosomes and other protein complexes that constitute chromatin. Chromatin structure, which is being very actively investigated, regulates gene expression epigenetically, i.e. the regulation is rooted not in the genome sequence but in the chromatin structure. Our group, in collaboration with the group directed by Crisanto Gutierrez, developed a method based on Hidden Markov Models for classifying chromatin regions of the model plant Arabidopsis thaliana using histone and nucleotide modification data and the genomic sequence. We characterized nine chromatin states that are strongly related to the transcription process (active elements, Polycomb-repressed regions and heterochromatin) and are linearly organized along the genome. Within the same collaboration, we determined the replication origins of the genome in two different developmental stages and we observed small but relevant differences that depend on the chromatin states.
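To give an idea of how a Hidden Markov Model assigns states along a genome, the sketch below runs Viterbi decoding on a toy model with two states and a single binary mark per genomic bin. It is a generic textbook procedure, not the published method: the two states, the probabilities and the observation track are invented for the example, whereas the real analysis used nine states and many data tracks.

```python
import math


def viterbi(obs, start, trans, emit):
    """Most probable state path for a discrete HMM (all probabilities > 0).

    obs:   list of observation indices, one per genomic bin
    start: start[s]    = probability of starting in state s
    trans: trans[r][s] = probability of moving from state r to state s
    emit:  emit[s][o]  = probability of observing o in state s
    """
    n_states = len(start)
    logp = [math.log(start[s]) + math.log(emit[s][obs[0]]) for s in range(n_states)]
    backpointers = []
    for o in obs[1:]:
        prev = logp
        logp, ptr = [], []
        for s in range(n_states):
            # Best predecessor of state s at this position.
            best = max(range(n_states), key=lambda r: prev[r] + math.log(trans[r][s]))
            ptr.append(best)
            logp.append(prev[best] + math.log(trans[best][s]) + math.log(emit[s][o]))
        backpointers.append(ptr)
    # Trace back from the best final state.
    state = max(range(n_states), key=lambda s: logp[s])
    path = [state]
    for ptr in reversed(backpointers):
        state = ptr[state]
        path.append(state)
    return path[::-1]


# Toy model: state 0 rarely carries the mark, state 1 usually does.
toy_start = [0.5, 0.5]
toy_trans = [[0.9, 0.1], [0.1, 0.9]]
toy_emit = [[0.9, 0.1], [0.2, 0.8]]
toy_marks = [0, 0, 0, 1, 1, 1]  # mark absent (0) or present (1) in each bin

toy_path = viterbi(toy_marks, toy_start, toy_trans, toy_emit)
```

Because transitions between states are penalized, the decoded path forms contiguous segments, which is what makes this family of models suited to describing states that are linearly organized along the genome.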

Structural stability in theoretical ecology and bacterial communities.

Flowering plants and insects are groups of organisms with very high biodiversity, characterized by mutualistic interactions that are advantageous for both interacting species. In the ecological literature, there has been heated discussion on the consequences of mutualistic interactions, since some mathematical models suggest that mutualism hinders the stability of ecosystems. Adopting the concept of structural stability that we helped to quantify several years ago, we showed that mutualism favours structural stability and therefore biodiversity when ecological networks are completely connected. Thereafter, we showed that the structural stability of mutualistic ecosystems increases with the connectance and the overlap (sometimes referred to as nestedness) of the mutualistic network and is inversely related to the interspecific competition. Presently, we are exhaustively comparing mutualism, predation and competition in real and simulated ecological networks.
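The two network properties mentioned above can be computed from a binary plant-by-animal interaction matrix, as in the following sketch. Connectance is the fraction of possible links that are realized. For the overlap we assume one simple definition, the number of partners shared by each pair of species divided by the maximum they could share, which may differ in detail from the measure used in the group's papers. The function names and the toy matrices are hypothetical.

```python
def connectance(matrix):
    """Fraction of realized links in a binary bipartite interaction matrix."""
    rows, cols = len(matrix), len(matrix[0])
    links = sum(1 for row in matrix for x in row if x)
    return links / (rows * cols)


def overlap_nestedness(matrix):
    """Shared partners over maximum possible shared partners, over all row pairs."""
    degrees = [sum(1 for x in row if x) for row in matrix]
    shared_total = 0
    min_total = 0
    for i in range(len(matrix)):
        for j in range(i + 1, len(matrix)):
            shared_total += sum(1 for a, b in zip(matrix[i], matrix[j]) if a and b)
            min_total += min(degrees[i], degrees[j])
    return shared_total / min_total if min_total else 0.0


# Toy matrices: rows are plant species, columns are animal species.
nested_network = [[1, 1, 1],
                  [1, 1, 0],
                  [1, 0, 0]]
specialist_network = [[1, 0, 0],
                      [0, 1, 0],
                      [0, 0, 1]]

nested_overlap = overlap_nestedness(nested_network)
specialist_overlap = overlap_nestedness(specialist_network)
```

In the first toy network every specialist interacts with a subset of the partners of the generalists, so the overlap is maximal. In the second network no two plants share a pollinator, so it vanishes.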

These results led us to study the ecological relationships between bacterial taxa, which we predict from co-occurrence data obtained in metagenomics experiments after taking into account the effect of the environment as far as possible. We observed that aggregations between taxa (“mutualism”) are more frequent than exclusions (“competition”) and favour the cosmopolitanism of bacteria, i.e. their capacity to live in many different environments. We developed an algorithm to reconstruct bacterial communities from taxa that present significant aggregations.
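A minimal way to test whether two taxa aggregate or exclude each other is to compare the number of samples in which both occur with a hypergeometric null model, as sketched below. This is a simplified stand-in chosen for illustration: unlike the analysis described above, it does not correct for the effect of the environment, and the function name and the counts are hypothetical.

```python
from math import comb


def cooccurrence_pvalues(n_samples, n_a, n_b, n_both):
    """One-sided hypergeometric p-values for the co-occurrence of two taxa.

    n_samples: total number of samples
    n_a, n_b:  number of samples containing taxon A and taxon B
    n_both:    number of samples containing both
    Returns (p_aggregation, p_exclusion).
    """
    total = comb(n_samples, n_b)

    def pmf(k):
        # Probability that exactly k of the n_b samples of B also contain A.
        return comb(n_a, k) * comb(n_samples - n_a, n_b - k) / total

    lowest = max(0, n_a + n_b - n_samples)
    highest = min(n_a, n_b)
    p_aggregation = sum(pmf(k) for k in range(n_both, highest + 1))
    p_exclusion = sum(pmf(k) for k in range(lowest, n_both + 1))
    return p_aggregation, p_exclusion


# Two taxa each present in 10 of 20 samples.
p_agg_together, p_exc_together = cooccurrence_pvalues(20, 10, 10, 9)  # found together 9 times
p_agg_apart, p_exc_apart = cooccurrence_pvalues(20, 10, 10, 0)        # never found together
```

A small first value flags a candidate aggregation and a small second value a candidate exclusion. With many pairs of taxa, a correction for multiple testing would also be needed.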


* For external calls please dial 34 91196 followed by the extension number

Last name         | Name           | Laboratory | Ext.* | e-mail                   | Professional category
Bastolla Bufalini | Ugo            | 312        | 4633  | ubastolla(at)cbm.csic.es | E. Científicos Titulares de Organismos Públicos de Investigación
Dehouck           | Yves André C.  | 312        | 4633  | ydehouck(at)cbm.csic.es  | Titulado Superior FC2
Maza Moreno       | Mª Carmen      | 226        | 4572  | mcmaza(at)cbm.csic.es    | Técnico Sup. Actividades Tec. y Profes. GP3

Relevant publications:

Structural dynamics of proteins through the torsional network model.

  • Bastolla U, Dehouck Y (2019) Can conformational changes of proteins be represented in torsion angle space? A study with Rescaled Ridge Regression. J Chem Inf Model 59:4929-4941. doi: 10.1021/acs.jcim.9b00627.
  • Alfayate A, Rodriguez Caceres C, Gomes Dos Santos H, Bastolla U (2019) Predicted dynamical couplings of protein residues characterize catalysis, transport and allostery. Bioinformatics.
  • Mendez R, Bastolla U (2010) Torsional network model: normal modes in torsion angle space better correlate with conformation changes in proteins. Phys Rev Lett. 104:228103.

Protein folding stability and molecular evolution.

  • Pascual-García A, Arenas M, Bastolla U (2019) The Molecular Clock in the Evolution of Protein Structures. Syst Biol. 68:987-1002. doi: 10.1093/sysbio/syz022.
  • Bastolla U, Dehouck Y, Echave J (2017) What evolution tells us about protein physics, and protein physics tells us about evolution. Curr Opin Struct Biol. 42:59-66. doi: 10.1016/ Review
  • Arenas M, Sánchez-Cobos A, Bastolla U (2015) Maximum-Likelihood Phylogenetic Inference with Selection on Protein Folding Stability. Mol Biol Evol. 32:2195-207. doi: 10.1093/molbev/msv085.

Intrinsically disordered proteins.

  • Nido GS, Méndez R, Pascual-García A, Abia D, Bastolla U (2012) Protein disorder in the centrosome correlates with complexity in cell types number. Mol Biosyst. 8:353-67. doi: 10.1039/c1mb05199g.

Epigenetics and chromatin structure.

  • Sequeira-Mendes J, Vergara Z, Peiró R, Morata J, Aragüez I, Costas C, Mendez-Giraldez R, Casacuberta JM, Bastolla U, Gutierrez C (2019) Differences in firing efficiency, chromatin, and transcription underlie the developmental plasticity of the Arabidopsis DNA replication origins. Genome Res. 29:784-797. doi: 10.1101/gr.240986.118.
  • Sequeira-Mendes J, Aragüez I, Peiró R, Mendez-Giraldez R, Zhang X, Jacobsen SE, Bastolla U, Gutierrez C (2014) The Functional Topography of the Arabidopsis Genome Is Organized in a Reduced Number of Linear Motifs of Chromatin States. Plant Cell. 26:2351-2366.

Structural stability in theoretical ecology and bacterial communities.

  • Pascual-García A, Bastolla U (2017) Mutualism supports biodiversity when the direct competition is weak. Nat Commun. 8:14326. doi: 10.1038/ncomms14326.
  • Bastolla U, Fortuna MA, Pascual-García A, Ferrera A, Luque B, Bascompte J (2009) The architecture of mutualistic networks minimizes competition and increases biodiversity. Nature. 458:1018-20. doi: 10.1038/nature07950.
