
AI-driven Automated Discovery Tools Reveal Diverse Behavioral Competencies of Biological Networks

Authors Affiliation Published
Mayalen Etcheverry INRIA, Flowers team, Poietis September, 2023
Clément Moulin-Frier INRIA, Flowers team
Pierre-Yves Oudeyer INRIA, Flowers team
Michael Levin The Levin Lab, Tufts University


Many applications in biomedicine and synthetic bioengineering depend on the ability to understand, map, predict, and control the complex, context-sensitive behavior of chemical and genetic networks. The emerging field of diverse intelligence has offered frameworks with which to investigate and exploit surprising problem-solving capacities of unconventional agents. However, for systems that are not conventional animals used in behavior science, there are few quantitative tools that facilitate exploration of their competencies, especially when their complexity makes it infeasible to use unguided exploration. Here, we formalize and investigate a view of gene regulatory networks (GRNs) as agents navigating a problem space. We develop automated tools to efficiently map the repertoire of robust goal states that GRNs can reach despite perturbations. These tools rely on two main contributions that we make in this paper: (1) curiosity-driven exploration algorithms, which originated in the AI community for exploring the range of behavioral abilities of a given system and which we adapt and leverage to automatically discover the range of reachable goal states of GRNs, and (2) a battery of empirical tests, inspired by implementation-agnostic behaviorist approaches, to assess their navigation competencies. Our data reveal that models inferred from real biological data can reach a surprisingly wide spectrum of steady states, while showcasing various competencies that living agents often exhibit in physiological network dynamics and that do not require structural changes of network properties or connectivity. Furthermore, we investigate the applicability of the discovered "behavioral catalogs" for comparing the evolved competencies across classes of evolved biological networks, as well as for the design of drug interventions in biomedical contexts or for the design of synthetic gene networks in bioengineering.
Altogether, these automated tools and the resulting emphasis on behavior-shaping and exploitation of innate competencies open the path to better interrogation platforms for exploring the complex behavior of biological networks in an efficient and cost-effective manner. To read the interactive version of this paper, please visit https://developmentalsystems.org/curious-exploration-of-grn-competencies/paper.html.


Developing methods to recognize, map, predict, and control the complex, context-sensitive behavior of chemical and genetic networks is an essential frontier of research in science and engineering. These systems, such as gene regulatory networks and protein pathways, are known to be instructive drivers of embryogenesis, cell behavior, and complex physiology [91][77][116]. Understanding the control properties of these systems is critical not only for the study of evolutionary developmental biology [88][85][84][71][122], but also for comprehending and intervening in various disease states, including cancer [95][11][144], and for the construction of novel synthetic biologicals in bioengineering contexts [17][35][86][45][18].

Thus, much work has gone into mathematical modeling and computational inference of both protein pathways and gene regulatory network models [50][98][47][104], which has resulted in the development of large collections of publicly available models such as the Biomodels database [120][121]. Yet, despite the wealth of available models, scientists still largely lack an effective understanding of the range of possible behaviors that these models can exhibit under different initial conditions and environmental stimuli, and are in search of systematic methods to reveal and optimize those behaviors via external interventions. The full extent of the computational and control properties of such networks is not yet well understood; while dynamical systems theory has been extensively used to characterize their behavior [9][129], it is not known what other sets of tools might reveal and exploit interesting properties of this ubiquitous biological substrate. The field of diverse intelligence (also known as basal cognition) has suggested that strong functional symmetries between pathway networks and neural networks could imply the existence of learning and other kinds of behavior in this unconventional substrate [124][43][109][2][89]. Specifically, it has been hypothesized that gene regulatory networks (GRNs) and other molecular networks could be endowed with surprising navigation competencies allowing them to robustly reach diverse homeostatic or allostatic states despite a wide range of perturbations [62][134][105][130], and that exploiting these innate competencies could provide a promising roadmap for the design of interventions in regenerative medicine and bioengineering contexts [111][78].

However, significant challenges remain in practice for the exploration and behavior-shaping of these innate competencies, which presents a barrier to the use of these ideas in regenerative medicine and bioengineering. Because of the non-linearity and redundancy in pathway dynamics, passive exploration strategies such as random screening are likely to either fail to uncover the full range of potential behaviors or require time and energy beyond the available resources. Here, we formalize and investigate a view of gene regulatory networks as agents navigating a problem space. We propose a framework and automated tools, leveraging (1) curiosity-driven goal-directed exploration algorithms coming from recent advances in machine learning and (2) a battery of empirical tests inspired by behaviorist approaches, for mapping the repertoire of robust goal states that GRNs can reach within this problem space despite various perturbations. A key novelty of this work is the use of AI-based exploration tools to map the space of possible behaviors in biological networks, which opens interesting avenues for efficient mapping of unfamiliar system behaviors, yielding transferable insights for diverse problem-solving once such a map is discovered.

The challenge of exploring and mapping spaces of complex and self-organized behaviors appears in many fields such as diverse intelligence in biological systems, minimal active matter or robotics: many systems in these areas provide a rich space of evolved, engineered, and hybrid systems that offer many of the same fundamental problems of behavior and control regardless of specific composition or provenance [87]. These span many orders of spatio-temporal scale, from molecular assemblies to swarms of complex organisms [2][139][94][49]. One set of approaches seeks to develop tools to identify the optimal level of control, ranging from physical rewiring to various methods from cybernetics and behavioral sciences, to reveal and exploit the native competencies and computational capacities of these systems [17]. Specifically, it is increasingly realized that the level of competency (and thus the appropriate level of control) often cannot be guessed by inspection of a system's components, and that its position on a spectrum ranging from passive matter to complex metacognition must be determined empirically [87][126][58][16]. This is critical not only for fundamental understanding of evolution of bodies and minds [43][14][42][52][135][97], but also for the design of interventions in biomedicine and synthetic morphology contexts [32][5]. Yet, a common property in many of these systems is that it is expensive in time and energy to conduct experiments: empirical exploration needs to be made under limited resources. Thus, methods for automating efficient exploration and discovery of a diversity of behaviors in these spaces may be widely useful. As explained below, we will here leverage methods from developmental artificial intelligence initially designed for the specific purpose of exploring a diversity of behaviors using a limited budget of experiments.

One especially fascinating set of systems concerns cellular molecular pathways, or gene regulatory networks (GRNs). In the lab or clinic, these pathways are usually treated as simple machines, with intervention strategies focusing on rewiring their structure to achieve a desired outcome: adding or removing nodes (gene therapy), or changing connection weights (by targeting promoter sequences or protein structures) [31][100][75][68]. However, the emergent, generative nature of development and physiology ensures that it is often very hard to know which genes/proteins to modify, and how, in order to reach a complex desired system-level outcome [142]. Moreover, the responses of cells and tissues to drugs change over time, making it even more difficult to infer specific interventions (e.g. drugs) that will induce a stable improvement in pathway state in vivo. Indeed, with the exception of antibiotics and surgery, most available treatment modalities do not solve the underlying problem: they seek to mitigate symptoms, which recur (or expand) once the drug is withdrawn. This is because current therapeutics function bottom-up, attempting to force specific molecular states, as it has been challenging to develop methods for shifting complex tissues and organs towards a stable health profile. Next-generation solutions, which would offer true healing (stable correction), require an understanding of the homeostatic and allostatic properties of networks with respect to how they traverse the space of transcriptional, physiological and anatomical states. An understanding of the behavior policies of networks as they dynamically navigate these problem spaces is essential for predicting what stimuli can be used to re-set their setpoints and guide them to autonomously maintain a healthy state. In the language of behavioral neuroscience, this strategy corresponds to exploiting their native robustness, decision-making, and navigational competencies to induce predictable, long-lasting changes in functionality.

Significant challenges remain in revealing and controlling the range of behaviors that can self-organize in these cellular and molecular pathways. To characterize steady-state concentrations and responses to small perturbations, conventional methods rely on piecewise-linear approximation of the system behavior [123][145][25][40][103], but struggle with higher-dimensional systems or wider parameter ranges, which limits their applicability [29]. Other works have proposed porting tools from network control theory to identify sets of control nodes that can drive the network behavior toward target steady states [60]. These methods typically exploit the network topology [60][106][117][19][101] or regulatory structure [70][8][51] to identify control strategies based either on permanent knockout/activation of genes or on temporary perturbations, the latter being preferable in biomedical contexts.

However, these approaches often require prior knowledge of target attractor states or are limited to Boolean network models. Other works have explored the use of machine learning tools, such as evolutionary search [69][83][82] and gradient-descent optimization [133][79], for controlling continuous ODE biomolecular networks with high-dimensional parameter spaces, mainly in the context of synthetic circuit engineering [46][119]. While providing powerful optimization tools, these approaches tend to focus on rewiring network structure and connectivity. Moreover, the choice of a predefined fitness function and parameter range initialization is not only critical to the success of optimization [83] but largely restricts exploration of the behavior space [79].

In contrast, an alternative line of research proposes exploring and leveraging the inherent molecular mechanisms of adaptivity and robustness in cellular pathways as a promising approach for drug interventions that do not rely on genomic editing or gene therapy [62][140]. Recently, a broad, substrate-independent behavior science perspective suggests novel properties of gene regulatory networks (GRNs) and other biological networks [12][124]. This perspective views GRNs as agents that convert activation levels of specific genes (inputs) to those of effector genes (outputs), with intermediate nodes in between, leading to strategies for controlling network behavior based on a specific history of inputs (experience) rather than through network rewiring. Notably, the concept of training a chemical pathway using pulsed input stimuli (node activation or suppression drugs) has been formalized, and several networks have been analyzed to establish a taxonomy of memory types found in biological GRNs and pathways [76][63].

Here, building upon recent research [105][76][63], we take the next step and investigate a view of gene regulatory networks as agents navigating a problem space toward target goal states with varying degrees of competency (Figure 1-a). We seek to implement a definition of goal that abstracts it from conventional associations with human or other advanced brains and facilitates the use of tools from cybernetics, behavior science, and control theory to understand broader aspects of biological regulation. Here we use the term “goal” state to refer to a system’s steady state, which it expends effort to reach despite interventions or barriers, a definition appropriate to the study of basal (or minimal) proto-cognitive regulatory systems. Our definition of goal does not imply “purpose” (high-level goals where an agent has the meta-cognition to think about having goals and what they might be), and we do not attribute high-level competencies (such as re-setting one’s own goals) to GRNs.

Our particular focus lies in investigating two types of navigation competencies: versatility, which refers to the capacity to reach diverse goal states under different interventions, and robustness, which refers to the ability to reach a goal state despite various perturbations. The primary scientific question we aim to address is: What is the repertoire of robust goal states that a GRN can actively reach through minimal and non-genetic interventions within a navigation task context, and can we develop systematic methods and automated tools to aid scientists in discovering this repertoire?

Figure 1

Figure 1: Overview of the proposed framework. (a) MOTIVATION: We often focus on studying the navigation and behavior of organisms in conventional three-dimensional environments, neglecting the intelligence underlying competencies at sub-organismal scales [105]. To better understand navigation competencies in unconventional organisms solving problems in unconventional spaces (e.g., embryos in morphological space), it is essential to construct comprehensive "behavioral catalogs" for these novel entities, which in turn requires sophisticated exploration methods to discover the extent of possible behaviors. Images are taken and adapted from [21][55][125][53][41][6]. (b) EXPERIMENTAL DESIGNS: We formalize GRN behavior as a navigation task and propose to investigate it by defining abstract and observer-dependent "problem spaces" that we use to organize the observed biological behaviors and their exploration in practice. (c) AUTOMATED EXPERIMENTATION: Pseudo-code of the curiosity-driven goal exploration process we use to automate the discovery of behavioral abilities that the GRN can exhibit in behavior space. (d) EMPIRICAL TESTS: We use a battery of empirical tests to identify the robust goal states of the systems, i.e. the ones that can be attained under a wide variety of perturbations (including noise in gene expression, and pushes or walls during traversal of transcription space). (e) PERSPECTIVES: We explore several potential reuses of the discovered "behavioral catalog" and proposed framework across evolutionary biology, biomedicine and bioengineering contexts.

To address this question in practice, our experimental framework revolves around the definition of "problem spaces", which we use as tractable components of the GRN's overall state space (Figure 1-b), and on a set of methodological contributions which we organize around three sub-questions:

  1. Automated discovery of diverse behavioral abilities with autotelic curiosity search (Figure 1-c): What is the range of possible goal states that GRNs can exhibit and how can we devise efficient exploration strategies to automatically identify these goal states? Defining goal states as attractor states of the underlying gene regulatory network, we show that traditional screening methods can be very inefficient in discovering the range of possible goal states. To address this, we propose to use intrinsically-motivated goal exploration processes (IMGEP) [136][65], a recent family of diversity-driven machine learning approaches also known as autotelic curiosity search, which was recently shown to form a useful discovery assistant for revealing the behavioral diversity of unfamiliar systems such as chemical oil-droplet systems [146], physical non-equilibrium systems [99] and models of continuous cellular automata [66][72][61].

  2. Evaluation of the navigation competencies (Figure 1-d): How competent is the GRN, in terms of robustness to perturbations, in attaining the diverse previously-identified goal states? Prior studies have offered definitions of robustness in biological networks, characterized as the degree of variation in functionality [3] or phenotypic trait [38] under specific environmental or genetic changes. However, these studies often consider a predefined functionality and random perturbations in network parameters [4][28][82] or specific gene knockouts [48]. Environmental perturbations, on the other hand, are often limited to random variations in initial conditions within a predefined range [29][7]. Here, inspired by behaviorist approaches, we test hypotheses about non-genetic resistance with respect to various navigation competencies that living agents often exhibit, and that do not require structural changes of network properties or connectivity. Those tests assess the system's ability to maintain robustness despite various perturbations encountered during traversal, including developmental noise in gene expression levels, sudden "pushes" within transcriptional space, and the presence of energy barriers or "walls" acting as force fields in the environment.

  3. Potential reuses of the discovered "behavioral catalog" and framework (Figure 1-e): Can the constructed behavioral catalogs be useful for fundamental research and practical therapeutic applications, and can the framework be easily applied to other systems and problem spaces? We propose that the discovered competencies may provide valuable insights for understanding evolvability and developmental robustness, and provide a fertile source for the design of interventions in biomedicine and synthetic morphology contexts. We also suggest that the framework and automated tools, which are observer-focused and substrate-independent, could be transposed to other systems and problem spaces.

The overall framework is summarized in Figure 1. Applying it to a database of 30 continuous (ODE) models from the Biomodels website, consisting of a total of 432 systems defined as GRN model-behavior space tuples, revealed several interesting insights. First, results suggested that most of the surveyed systems are capable of reaching a surprisingly wide spectrum of steady states depending on their initial state. Interestingly, random screening strategies were not able to reveal this diversity of reachable states (or at least not in a sample-efficient way), confirming the need for more advanced exploration strategies like curiosity search. Secondly, among the discovered steady states, we were able to identify several robust goal states, i.e. ones that the system consistently reaches despite various perturbations during traversal of transcriptional space. Altogether, these findings seem to suggest that cell phenotype and functionality could be the result of a multi-step program [106] that could be flexibly and robustly reprogrammed by appropriate stimuli [16]. Finally, we demonstrate possible reuses of this "behavioral catalog" for comparing the network's competencies across different classes of organisms, as well as for the design of non-genetic drug interventions. We also demonstrate an alternative reuse of the framework to reveal new kinds of reachable "goals" in synthetic gene networks, suggesting alternative strategies for the design of gene networks in a bioengineering context.

An interactive executable version of the paper, as well as step-by-step tutorials and notebooks, can be found online at https://developmentalsystems.org/curious-exploration-of-grn-competencies. The full codebase of the proposed automated experimentation pipeline is written end-to-end in JAX, a high-performance numerical computing library that we leverage for parallel experimentation and computational speedups of the ODE models' time-course simulations.


Generalizing GRN behavior as a navigation task

| Dynamical Systems Terminology | Behavioral Science Terminology | Proposed Isomorphism | Navigation Task Terminology |
|---|---|---|---|
| system: a set of interconnected elements that interact to produce emergent behavior | organism: a living being that responds to stimuli and adapts to its environment | Both are collections of lower-level elements that interact to produce emergent behavior and can adapt at the system level | agent or GRN |
| phase-space trajectory: set of states taken by the system when starting from one particular initial condition | behavioral trajectory: the sequence of states that an organism exhibits in response to stimuli | Both represent the sequence of states or behaviors that a system or individual experiences over time | trajectory |
| initial condition: initial state of a system's variables and parameters that condition its dynamics | stimuli: events that might (or might not) trigger a response in an organism | Both represent incoming variations that set a system or organism in motion | intervention or perturbation |
| critical parameter: a parameter or condition that, if changed, can cause a system to undergo a qualitative change or phase transition | salient stimuli: stimuli that are particularly relevant or meaningful to an organism, either because they are associated with reward or punishment or because they are novel or unexpected | Both represent the incoming variations that have a significant impact on a system's steady-state or organism's response | effective intervention |
| steady-state (or attractor): a stable state (or set of states), towards which the system tends to evolve over time | observed response: outcome or endpoint of a behavioral trajectory towards which an organism converges | Both represent the endpoint that a system or organism is moving towards | reached endpoint or goal |
| robust attractor: stable attractor toward which the system tends to evolve under various initial conditions and perturbations | target goal: it is assumed that an organism engages in a goal-directed manner when it exhibits new ways or actions to achieve a similar outcome when faced with novel circumstances | Both represent a stable endpoint or goal that the system successfully attains under various perturbations | robust goal |
| controllability: degree to which the system's dynamics (and resulting steady states) can be controlled or manipulated | trainability: degree to which an organism's behavior can be modified or shaped by experience or conditioning | Both represent the capacity of a system or individual to be influenced or changed by controlled interventions | versatility |

Table 1: Glossary of terms used in this paper, with the proposed isomorphism which generalizes concepts from dynamical complex systems and behavioral sciences under a common navigation task perspective.

The GRNs analyzed in this study are biological pathway networks taken from the BioModels repository [120][121]. The term "GRN" is used broadly to include protein interaction, gene regulatory, and metabolic networks. In these mathematical models, the dynamic interactions between nodes of the network (molecular species) are modeled with a system of ordinary differential equations, which makes it possible to quantitatively simulate time-course behavior (model rollouts) and observe the dynamics of node activities over time (Figure 2a). Here, following a terminology which aims to integrate concepts from dynamical complex systems with concepts from behavioral sciences, we propose to conceptualize GRN behavior as a navigation task (Table 1). Model rollouts are viewed as "trajectories" in transcriptional space where network steady states are "goal states" (endpoints) that the "agent" (GRN) can reach with varying levels of competency. As for living agents, these competencies may range from unstable locomotion patterns to more advanced forms of goal-directed behavior like path following, obstacle avoidance, or even forms of spatial memory and foresight. In this paper, we are particularly interested in investigating two forms of navigation competencies that we refer to as versatility, the capacity to reach diverse goal states under various interventions, and robustness, the capacity to reach a goal state despite various perturbations. Note that versatility and robustness are studied with respect to different sources of incoming environmental variation, respectively interventions and perturbations.

| Problem Space | Generic definition | Specific definition in this study |
|---|---|---|
| Observation Space (O) | Space of raw observations made during the GRN model rollout to measure its state or behavior | Records node activities over time as $o = \left( y(0),\ldots,y(T) \right)$, where $y(t)$ is an n-dimensional vector (n = number of nodes) and T is the measured reaction time |
| Behavior Space (Z) | A projection of the observation space used by the experimenter to encode the "goal states" of a model rollout into a tractable (lower-dimensional) space | Encodes the trajectory endpoint of a model rollout. Represents a cell phenotype defined by the state values of some nodes (relevant biological markers), such that $z = \left( y_{i_{1}}(T),\ldots,y_{i_{m}}(T) \right)$ (we use m=2 in this study for simplicity and visualization) |
| Intervention Space (I) | A space where interventions represent controlled sources of incoming variation that the experimenter can exert on the GRN model rollout to drive it toward novel or targeted states | Sets the initial state $i = \left( y_{1}(0),\ldots,y_{n}(0) \right)$ of a model rollout. Defined as a hyper-rectangle $I \subseteq \mathbb{R}^{n}$ where the boundaries are proportional to the min and max values taken by the respective nodes from default initial conditions |
| Perturbation Space (U) | A space where perturbations represent external sources of incoming variation, used by the experimenter to characterize the robustness of a given goal state | Includes three classes of (stochastic) perturbations: noise perturbation $U_{n}$, push perturbation $U_{p}$, and wall perturbation $U_{w}$ |

Table 2: Problem spaces used in this study

To investigate these competencies in practice, our experimental framework is based on the definition of "problem spaces", which include the observation space (O), behavior space (Z), intervention space (I) and perturbation space (U) as defined in Table 2. To be consistent with our navigation task terminology introduced in Table 1, we refer to a behavior z as the reached "goal state" of a GRN trajectory. However, these "goals" may lie on a continuum between complete robustness and high sensitivity, and our primary interest lies in identifying robust goals of the system. Whereas several choices could be made for the intervention space I and perturbation space U, we intentionally consider minimal and non-genetic interventions to investigate the "native" goal states of the GRN, and environmental obstacles to probe navigation competencies classically observed in other living agents. Examples of simulations, interventions, and perturbations are illustrated in Figure 2.
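To make the problem spaces of Table 2 concrete, here is a minimal sketch in plain Python. It is not the paper's JAX pipeline: the two-node mutual-repression model, its parameters, the Euler integrator, the intervention bounds, and every function name below are assumptions chosen for illustration only. It shows how an intervention i (an initial state drawn from a hyper-rectangle) yields an observation o (the full trajectory) and a behavior z (the endpoint of the marker nodes).

```python
import random

def toy_rhs(y, alpha=3.0, hill=4):
    """Right-hand side of a hypothetical 2-node mutual-repression network."""
    a, b = y
    return [alpha / (1.0 + b ** hill) - a,
            alpha / (1.0 + a ** hill) - b]

def rollout(i, T=20.0, dt=0.01):
    """Euler-integrate from initial state i; return o = (y(0), ..., y(T))."""
    y = list(i)
    o = [tuple(y)]
    for _ in range(int(round(T / dt))):
        dy = toy_rhs(y)
        # Concentrations are kept non-negative.
        y = [max(0.0, yj + dt * dyj) for yj, dyj in zip(y, dy)]
        o.append(tuple(y))
    return o

def behavior(o, marker_nodes=(0, 1)):
    """Project an observation onto Z: endpoint values of the marker nodes."""
    y_T = o[-1]
    return tuple(y_T[j] for j in marker_nodes)

def sample_intervention(bounds, rng):
    """Draw an initial state uniformly from the hyper-rectangle I."""
    return tuple(rng.uniform(lo, hi) for lo, hi in bounds)

rng = random.Random(0)
I_bounds = [(0.0, 3.0), (0.0, 3.0)]  # assumed box: one (min, max) pair per node
i = sample_intervention(I_bounds, rng)
o = rollout(i)
z = behavior(o)
```

In this toy model, initial states on either side of the diagonal end in different steady states, so the intervention alone selects which goal state is reached.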

Then, a typical analysis using our framework relies on a 2-step procedure, detailed in the subsequent sections. First, to assess the versatility of the GRN, we define an exploration strategy which organizes the sequence of interventions $i_{1},\ldots,i_{N}$ used to drive the system toward a maximally diverse set of reachable endpoints $\{ z_{k} \in Z \}_{k = 1,\ldots,N}$, while being given a limited budget of experiments N. Secondly, to assess the robustness of the discovered goal states $\{ z_{k} \in Z \}$, we conduct a battery of empirical tests to characterize their degree of sensitivity to novel perturbations, with a fixed experimental budget of P perturbations per selected behavior z. At the end of this 2-step procedure, we obtain the "behavioral catalog" (H) of the studied GRN, which includes the history of experiments $H = \{ ( i_{k},o_{k},z_{k},\{ ( u_{p},o_{p},z_{p} ) \}_{p = 1,\ldots,P} ) \}_{k = 1,\ldots,N}$.
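The 2-step procedure and the structure of the catalog H can be sketched as follows. This is an illustrative sketch under stated assumptions, not the authors' implementation. The toy two-node model and all names and parameters are hypothetical. The exploration step is a basic goal-exploration loop (uniform goal sampling, nearest-neighbour selection, Gaussian mutation of the stored intervention), and the only perturbation used is additive Gaussian noise.

```python
import math
import random

def rollout(i, noise_std=0.0, rng=None, T=20.0, dt=0.01):
    """Euler rollout of a toy 2-node mutual-repression model (a stand-in).
    If noise_std > 0, Gaussian noise is added to both nodes once per time
    unit during the first half of the rollout."""
    a, b = i
    o = [(a, b)]
    n_steps = int(round(T / dt))
    every = int(round(1.0 / dt))
    for step in range(1, n_steps + 1):
        da = 3.0 / (1.0 + b ** 4) - a
        db = 3.0 / (1.0 + a ** 4) - b
        a, b = max(0.0, a + dt * da), max(0.0, b + dt * db)
        if noise_std > 0.0 and step % every == 0 and step <= n_steps // 2:
            a = max(0.0, a + rng.gauss(0.0, noise_std))
            b = max(0.0, b + rng.gauss(0.0, noise_std))
        o.append((a, b))
    return o

def explore(N, bounds, rng, n_init=5, sigma=0.3):
    """Step 1: goal-directed exploration under a budget of N rollouts."""
    history = []
    for k in range(N):
        if k < n_init:
            # Bootstrap with random interventions.
            i = tuple(rng.uniform(lo, hi) for lo, hi in bounds)
        else:
            # Assumption: goals are sampled in the same box as interventions.
            goal = tuple(rng.uniform(lo, hi) for lo, hi in bounds)
            nearest = min(history, key=lambda h: math.dist(h["z"], goal))
            # Mutate the intervention whose outcome was closest to the goal.
            i = tuple(min(hi, max(lo, x + rng.gauss(0.0, sigma)))
                      for x, (lo, hi) in zip(nearest["i"], bounds))
        o = rollout(i)
        history.append({"i": i, "o": o, "z": o[-1], "tests": []})
    return history

def run_robustness_tests(history, P, rng, noise_std=0.1):
    """Step 2: re-run each experiment under P perturbations, record outcomes."""
    for h in history:
        for _ in range(P):
            u = {"noise_std": noise_std, "seed": rng.randrange(2 ** 32)}
            o_p = rollout(h["i"], noise_std=u["noise_std"],
                          rng=random.Random(u["seed"]))
            h["tests"].append({"u": u, "o": o_p, "z": o_p[-1]})
    return history

rng = random.Random(0)
H = run_robustness_tests(
    explore(N=30, bounds=[(0.0, 3.0), (0.0, 3.0)], rng=rng), P=3, rng=rng)
```

Each entry of `H` mirrors one tuple of the catalog: the intervention, its observation and endpoint, and the list of perturbed reruns. Comparing each perturbed endpoint with the unperturbed one (for example by Euclidean distance) gives a simple sensitivity score per goal state.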

Following this framework, the behavioral catalog is constructed for a database of 30 biological networks consisting of a total of 432 systems, where a system is defined as a (GRN model, intervention space (I), behavior space (Z)) tuple, as described in Materials and Methods and Table S1. These catalogs provide valuable empirical observations and insights into the navigation competencies of the studied GRNs, particularly in their ability to consistently achieve diverse goal states under various tested perturbations. Statistical analyses of the results are presented in Figures 3, 5, and 7, and specific results for the RKIP-ERK signaling pathway [56] are shown in Figures 2, 4, 6, and 8.

Figure 2: Illustration of the experimental setup and chosen problem spaces on an example GRN model which has 10 nodes and models the influence of RKIP on the ERK Signaling Pathway [56]. (a) Time-course evolution of the different nodes y1, ..., y10 (one color per node) when starting from the default initial conditions (as provided in [56]). The observation captures the states taken through time $o=[y(t=0), ..., y(t=T)]$ where $y=[y_1, ..., y_{10}]$. (b) Corresponding trajectory in transcriptional space (phase space), for two target nodes (ERK, RKIPP_RP), from t=0 (A, in red) to T=1000 seconds (B, in cyan). We can see that the trajectory converges to endpoint B in less than 100 seconds, and then stays there. The behavior (or reached goal state) is the endpoint $B = \left\lbrack y_{ERK}(T),y_{RKIPP\_RP}(T) \right\rbrack$, where T is chosen large enough to ensure convergence. (c) The intervention sets the initial state of the system trajectory (for all nodes): $i = [y_1(t=0), ..., y_{10}(t=0)]$. (d-f) Examples of perturbations used in this paper. (d) Noise perturbation, here applied to all 10 nodes every 5 seconds until t=80 seconds. (e) Push perturbation, here applied to the two target nodes (ERK, RKIPP_RP) at t=3 seconds. (f) Wall perturbation, also applied to the two target nodes (ERK, RKIPP_RP), here at 10% and 90% of the total distance traveled. Supplementary Figure S1 shows examples of other possible "drug" or "genome" interventions that can be implemented in the accompanying software, as well as the possibility to perform interventions (or perturbations) in parallel using batched computations.
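The noise and push perturbations illustrated in panels (d) and (e) can be sketched as small hooks applied during a rollout. This is a hedged illustration only. The two-node model, the hook interface `perturb(step, y)`, and the schedules (scaled to a toy horizon of T=20 rather than the 1000 seconds of the figure) are assumptions. The wall perturbation is omitted because its force-field implementation is not specified in this section.

```python
import random

def rollout(i, perturb=None, T=20.0, dt=0.01):
    """Euler rollout of a toy 2-node mutual-repression model (a stand-in).
    `perturb(step, y)` may modify the state after each integration step."""
    y = list(i)
    o = [tuple(y)]
    for step in range(1, int(round(T / dt)) + 1):
        a, b = y
        y = [max(0.0, a + dt * (3.0 / (1.0 + b ** 4) - a)),
             max(0.0, b + dt * (3.0 / (1.0 + a ** 4) - b))]
        if perturb is not None:
            y = [max(0.0, v) for v in perturb(step, y)]
        o.append(tuple(y))
    return o

def noise_perturbation(std, period_steps, until_step, rng):
    """Add Gaussian noise to every node each `period_steps`, up to `until_step`."""
    def perturb(step, y):
        if step <= until_step and step % period_steps == 0:
            return [v + rng.gauss(0.0, std) for v in y]
        return y
    return perturb

def push_perturbation(at_step, displacement):
    """Displace the state once, at a single time step."""
    def perturb(step, y):
        if step == at_step:
            return [v + d for v, d in zip(y, displacement)]
        return y
    return perturb

i = (2.0, 0.5)
z_ref = rollout(i)[-1]
# Noise every 0.5 time units until t=8 (steps of dt=0.01).
z_noise = rollout(i, noise_perturbation(0.05, 50, 800, random.Random(0)))[-1]
# Pushes at t=3: a small one and a large one across the diagonal.
z_push = rollout(i, push_perturbation(300, (-0.5, 0.5)))[-1]
z_big_push = rollout(i, push_perturbation(300, (-2.5, 2.5)))[-1]
```

In this toy setting the reference endpoint survives the noise and the small push, while the large push carries the trajectory into the basin of the other steady state. That contrast is the kind of difference the empirical tests are meant to expose between robust and sensitive goal states.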

Curiosity Search Uncovers a Diversity of Reachable Goal States