Stochastic Approaches for Systems Biology

Stochastic differential equations (SDEs) will be informally described, and the chemical Langevin equation will be derived as a diffusion approximation to a discrete stochastic chemical kinetic model. Bayesian inference for SDEs will be reviewed briefly, including efficient algorithms which depart from the likelihood-free paradigm. Linear noise and moment closure approximations will be discussed if time permits. Sessions devoted to model building, simulation, and inference will supplement the lectures, and code for illustrative examples will be provided. The course will make use of lectures, practical sessions, and software demonstrations.

The slides and background reading material will be distributed to the students before the start of the course. The prerequisites for this course include basic knowledge of algebra and calculus, probability, statistical modelling, and data analysis.

Semester: Spring Semester. Lecturers: M. Khammash, A.

Using these tools, the course explores the richness of stochastic phenomena, how it arises from the interactions of dynamics and noise, and its biological implications.

Objective: To understand the origins and implications of stochastic noise in living cells, and to learn the computational tools for the modeling, simulation, analysis, and identification of stochastic biochemical reaction networks.

The benefit of this resampling process is that it facilitates the efficient, exact sampling of the system along the binned progress coordinates. Brute-force sampling, by definition, concentrates sampling power on the most probable events. By contrast, weighted ensemble samples a distribution more evenly, and compared to brute-force it applies more resources to hard-to-sample regions of interest. By definition, the peak contains the most probable events. Thus, certain parts of the configuration space are destined to be poorly sampled; if the true probability of a state being occupied is less than the inverse of the number of trajectories simulated, it is unlikely to be sampled even once.

On the other hand, weighted ensemble decouples the number of trajectories in a region of configuration space from the probability of a trajectory being there, and allows for a more even coverage of all regions of the underlying distribution. It should be noted that an even coverage of configuration space lends itself to efficient sampling only if the coordinate(s) along which this coverage is distributed are useful in characterizing the observables of interest [ 17 ]. The efficient sampling of low-probability regions of the time-dependent probability distribution of a stochastic system can be leveraged to extract unbiased estimates of long-timescale information about the system.
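The decoupling of trajectory count from probability can be illustrated with a minimal split/merge sketch for a single bin. This is a simplified illustration, not the WESTPA implementation: the function name `resample_bin`, the representation of walkers as `(state, weight)` pairs, and the specific choice of which walkers to split or merge are all assumptions made for the example.

```python
import random


def resample_bin(walkers, target, rng=random):
    """Split or merge (state, weight) walkers in one bin until there are
    `target` of them, conserving the bin's total weight.

    Hypothetical helper for illustration; weights are assumed positive.
    """
    walkers = [list(w) for w in walkers]  # mutable copies
    if not walkers:
        return []  # an empty bin stays empty

    # Split: duplicate the heaviest walker, giving each copy half the weight.
    while len(walkers) < target:
        walkers.sort(key=lambda w: w[1])
        state, weight = walkers.pop()
        walkers.append([state, weight / 2])
        walkers.append([state, weight / 2])

    # Merge: combine the two lightest walkers; the survivor is chosen with
    # probability proportional to its weight and carries the summed weight.
    while len(walkers) > target:
        walkers.sort(key=lambda w: w[1])
        (s1, w1), (s2, w2) = walkers[0], walkers[1]
        total = w1 + w2
        keep = s1 if rng.random() < w1 / total else s2
        walkers = walkers[2:] + [[keep, total]]

    return [(s, w) for s, w in walkers]
```

Because splitting and merging only redistribute weight, the probability assigned to the bin is unchanged while the number of trajectories in it is set by `target` alone.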

This relationship is exact up to statistical noise when the system exhibits a steady-state flow of probability from A to B.

These two states A and B can be single micro-states, or large states composed of many smaller sub-states e. A steady-state is achieved when the probability distribution of the system is constant in time. That is, for each sub-state i in the system, in a given time period the total flow of probability into i from the other sub-states is equal to the flow of probability out of i into other sub-states.

In terms of the steady-state probabilities $p_i$ of each sub-state, and the transition probabilities $k_{ij}$ between sub-states in an arbitrary, but fixed, time interval, the steady-state condition is given by

$$\sum_{j \neq i} p_j \, k_{ji} = \sum_{j \neq i} p_i \, k_{ij} \qquad (2)$$

The conditional probabilities $k_{ij}$ can be estimated from WE, and the $p_i$ values can then be inferred by solving the linear system in Eq 2. To accommodate a steady state, the boundary conditions of the weighted ensemble simulation described above must be slightly adjusted to induce a steady-state flow of probability from the initial state to the target state.
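As a rough illustration of inferring the steady-state probabilities from estimated transition probabilities, the sketch below uses plain power iteration rather than a direct linear solve. It assumes that `k[i][j]` is the probability of moving from sub-state i to sub-state j in one interval, that each row sums to 1 (self-transitions included), and that the chain has a unique stationary distribution. The function name and tolerances are hypothetical.

```python
def steady_state(k, tol=1e-12, max_iter=100_000):
    """Estimate p satisfying p = p K for a row-stochastic matrix `k`
    (list of lists) by repeated application of the transition matrix."""
    n = len(k)
    p = [1.0 / n] * n  # start from the uniform distribution
    for _ in range(max_iter):
        # One step of probability flow: new_p[j] = sum_i p[i] * k[i][j]
        new_p = [sum(p[i] * k[i][j] for i in range(n)) for j in range(n)]
        if max(abs(a - b) for a, b in zip(new_p, p)) < tol:
            return new_p
        p = new_p
    raise RuntimeError("power iteration did not converge")
```

At the fixed point the inflow into each sub-state balances the outflow, which is the balance condition stated in the text.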

This is accomplished by removing a trajectory from the ensemble whenever it enters the target state, and re-starting a new trajectory from the initial state with the same probabilistic weight as the one just removed. After a sufficient amount of time the system will relax into a steady flow of probability from one state to another, with probabilities in each bin maintained at a steady value. Although not used in this report, we note that since the transition rates between bins can often be estimated accurately even before the probability distribution has relaxed to a steady state, a Markov-like transition matrix can be constructed and solved to infer long-timescale properties of the system, including the mean first passage time [ 28 ].
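The recycling boundary condition described above can be sketched as follows. This is an assumed, minimal form: walkers are `(state, weight)` pairs, `in_target` is a user-supplied predicate, and the function name is hypothetical. The weight recycled in each interval is returned because it is the quantity commonly accumulated to estimate the probability flux into the target state.

```python
def recycle(walkers, in_target, initial_state):
    """Restart every walker that has reached the target from the initial
    state, keeping its weight. Returns (new_walkers, recycled_weight)."""
    recycled_weight = 0.0
    new_walkers = []
    for state, weight in walkers:
        if in_target(state):
            recycled_weight += weight  # probability arriving at the target
            new_walkers.append((initial_state, weight))
        else:
            new_walkers.append((state, weight))
    return new_walkers, recycled_weight
```

Total weight in the ensemble is conserved, so after relaxation the recycled weight per interval settles to a steady value.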

This approach is more efficient than waiting for the system to relax into a steady state when the probability mass itself is slow to relax, so long as there are sufficient transitions between bins, and the degrees of freedom orthogonal to the bins are either well-sampled or unimportant. Throughout this work, we use the WESTPA [ 37 ] implementation of the weighted ensemble algorithm, which is freely available and open source github. This implementation is flexible and adaptable for use with any stochastic dynamics engine, and supports plugins for extended methods such as the steady-state approach noted above.

In order to simplify the process of using weighted ensemble sampling techniques with systems biology models, we have constructed an automated service to convert MCell models into ready-to-go WESTPA simulations, available at weightedensemblizer.

There are different ways to characterize the gain in efficiency from using weighted ensemble instead of brute-force sampling. We find that a useful approach to evaluating efficiency, which is independent of specific computational architecture, is to take the sum of the simulated dynamics time in the weighted ensemble approach, and compare those results to simulating the same amount of dynamics in brute-force simulations.

Thus, always rounding up to give brute-force the benefit, we compare the weighted ensemble results to the results of running brute-force simulations. The statistical precision exhibited by each method can then be compared on the basis of equal time spent simulating dynamics. As mentioned above, the overhead imposed by weighted ensemble resampling is very small compared to the time spent simulating dynamics for most systems of interest, so for models of even moderate complexity, we find this to be a fair comparison of efficiency.
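The equal-cost comparison can be written as a short calculation. The sketch below assumes that every weighted ensemble segment has the same duration and that all times are given in the same units; the function and parameter names are hypothetical.

```python
import math


def equivalent_brute_force_runs(segments_per_iteration, segment_length,
                                trajectory_length):
    """Number of brute-force trajectories of `trajectory_length` whose total
    simulated time matches a weighted ensemble run, rounded up.

    segments_per_iteration: number of trajectory segments run at each
    weighted ensemble iteration.
    """
    total_we_time = sum(segments_per_iteration) * segment_length
    # Rounding up gives brute force the benefit of any fractional trajectory.
    return math.ceil(total_we_time / trajectory_length)
```

For example, a run that simulated 16, 32 and 48 segments of 10 time units each amounts to 960 time units of dynamics, which is matched by a single brute-force trajectory of 1000 time units after rounding up.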

Because WE trajectories, and hence observables, exhibit correlation within a single simulation, it can be important to perform multiple, independent weighted ensemble runs to ensure uncorrelated estimates of observables. When comparing the performance of brute-force sampling to multiple independent weighted ensemble runs, for each WE run we construct a brute-force ensemble of equivalent cost to each independent WE run, as described above. We can then compare the results of the multiple brute-force ensembles to the multiple independent WE runs on equal footing.

All simulations in this report employ spatially resolved particle-based kinetic Monte Carlo dynamics, implemented in the MCell software package. MCell has been used to study a wide range of neuroscience questions such as neurotransmitter diffusion in the brain [ 39 ], the structure and function of synapses in the central [ 40 ] and peripheral [ 29 ] nervous system, and the effect of drugs on nervous system function [ 41 ].

MCell has also been employed to investigate general cellular phenomena such as calcium signaling [ 42 ] and the role of diffusion in cellular transport [ 43 ]. MCell combines rigorously validated and highly optimized stochastic Monte Carlo algorithms, particle-based random walk diffusion of point particle molecules in space and on surfaces, and stochastic biochemical state transitions. MCell models can contain arbitrarily complex 3D mesh geometries representing the biological system under consideration.

These geometries are typically derived from reconstructions of biological tissue, usually from electron microscopy data [ 44 ], or created in silico based on average geometries [ 29 ], e. MCell features a flexible model description language and can checkpoint simulation trajectories at arbitrary output intervals or times.

MCell is a kinetic Monte Carlo scheme, in the sense that the time evolution of the system is explicitly modeled. The Monte Carlo moves that the system makes are not arbitrary trial moves, but are rather chosen according to the reaction and diffusion rates of the molecules being simulated. A constant time step is employed in these simulations, during which the likelihoods of reaction and diffusion processes are computed and stochastically sampled; with appropriate time steps, the dynamics of the underlying processes are faithfully recapitulated (for further details, see [ 38 , 46 , 47 ]).
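To make the idea of a fixed-time-step kinetic Monte Carlo move concrete, here is a generic textbook-style sketch for a single molecule. It is not MCell's algorithm: the first-order unbinding probability 1 - exp(-k dt), the Gaussian displacement with standard deviation sqrt(2 D dt) per axis, and all names are assumptions chosen for illustration.

```python
import math
import random


def step(position, bound, k_unbind, diff_coeff, dt, rng):
    """Advance one molecule by one time step `dt`.

    A bound molecule unbinds with probability 1 - exp(-k_unbind * dt);
    a free molecule takes a Gaussian random-walk step in each dimension.
    Returns the new (position, bound) pair.
    """
    if bound:
        p_react = 1.0 - math.exp(-k_unbind * dt)
        if rng.random() < p_react:
            bound = False  # reaction sampled from its rate, not a trial move
        return position, bound
    sigma = math.sqrt(2.0 * diff_coeff * dt)
    position = tuple(x + rng.gauss(0.0, sigma) for x in position)
    return position, bound
```

The time step must be small relative to the inverse rates for such a scheme to reproduce the underlying dynamics, which is the sense of "appropriate time steps" in the text.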

The construction of large, complex spatial models is facilitated by a combination of software that specializes in separate aspects of this task. One of the limiting factors in performing spatially realistic cell simulations is the difficulty of obtaining cell geometries. This limitation can be addressed by learning generative models of cell organization directly from microscope images; these can be used to synthesize an unlimited number of realistic geometries.

Stochastic Approaches for Systems Biology : Mukhtar Ullah :

For instance, in the complex model in a realistic cellular geometry studied below, biochemical reaction networks, with corresponding compartments for organelles, are constructed using BioNetGen software [ 48 , 49 ] and combined with cell geometry models generated by CellOrganizer software [ 50–58 ]; CellBlender [ 45 ] is then used to create the MCell spatial simulations [ 59 ].

CellOrganizer CellOrganizer. Currently CellOrganizer supports models for cell shape, nuclear shape, vesicle frequency, location and size, and microtubule length, number and distribution. Biochemical reaction networks in our model of signaling in a realistic cellular geometry are built with the BioNetGen software package BioNetGen. The rule-based approach allows combinatorially large chemical reaction networks to be compactly described using a small set of rules that define the underlying molecular interactions [ 49 ].

Indirect simulation of rule-based models requires automated generation of the reaction network implied by the rule set. The generated reaction network can then be simulated using a variety of approaches including ordinary differential equations and stochastic simulation. BioNetGen has previously been used to model a wide range of processes including signal transduction, metabolic pathways, and genetic regulatory networks [ 49 ].

BioNetGen enables the cellular topology to be defined via compartments [ 61 ], but it does not provide for the specification of more detailed geometric information about these compartments or molecule locations. An automated process converts these rules to an exhaustive network of chemical reactions representing the chemical kinetics of the system (see Fig 3).

Geometries are learned from images by CellOrganizer. Chemical reaction networks are generated from rule-based models in BioNetGen. Geometries and reaction networks are imported to MCell via the CellBlender visual editor. The reaction network from BioNetGen is fed into CellOrganizer to obtain an appropriate cellular geometry, and the network and geometry are combined using the CellBlender package. In CellBlender, the reactions and geometry are merged, and exported to MCell. The system is then simulated as usual in MCell, either using weighted ensemble to manage the trajectories, or via brute-force.

We investigate three spatial models of cellular function: (1) a toy model of diffusive binding, (2) an idealized model of cellular signaling, and (3) a realistic model of a neuromuscular junction. All three particle-based kinetic Monte Carlo models are simulated in MCell version 3. A highly simplified model of diffusive binding was constructed as an initial test of the utility of weighted ensemble sampling in a spatial system.

636-0016-00L Computational Systems Biology: Stochastic Approaches

The model geometry is depicted in Fig 4. The receptors at the top are initially bound by ligands, which are free to unbind, diffuse around the cell, and bind to receptors at the bottom. The volume also contains receptors at the bottom of the cube that are initially unbound. We examine the probability density for the number of receptors at the bottom of the volume bound by ligands after simulating 10 milliseconds of dynamics.


The toy model has an internal time step of 10 microseconds, and we perform weighted ensemble resampling at an interval that exactly coincides with the internal time step, or every 10 microseconds. We simulate the model for 10 milliseconds, or weighted ensemble iterations. The progress coordinate we use is the number of receptors bound at the bottom of the cube, with bins on this coordinate at integers on [0, ], and we simulate 16 trajectory segments in each bin.

The system models protein production in response to an extracellular signal and highlights interesting aspects of signal transduction through different subcellular components, such as transport across membranes and feedback between molecules in different subcellular locations [ 59 ].

The model contains on the order of 10^5 reactive molecules, situated in a realistic cellular geometry. Because creating robust, high-quality complex models of cells is itself a challenging endeavor, we employ the model generation pipeline through BioNetGen and CellOrganizer described in the Methods section and Sullivan et al.

We use the geometry shown in Fig 5 , which is derived from three-dimensional images of HeLa cells using CellOrganizer. This geometry contains topologically distinct partitions: the extracellular region, the cytoplasm, the nucleus, and approximately endosomes. The geometry also includes the membranes that partition these compartments, through which molecules must be transported when appropriate. Further details are included in the Supporting Information. Realistic cell geometry generated from microscopy images by CellOrganizer.

The geometry explicitly models the compartmentalization of the cell, by forcing molecules to diffuse through membranes to transition from, for example, the cytoplasm (grey) to the nucleus (blue). Also modeled are endosomes (green) and the extracellular environment (transparent).

We use the reaction schema illustrated in Fig 6 to describe the reaction kinetics of the model. The BioNetGen rules for this model are included in the Supporting Information, and they produce a network of chemical reactions between 78 species [ 61 ]. Briefly, the signaling network functions as follows. The system is initialized in a state of unbound receptors, and free extracellular ligands. The extracellular ligand binds to receptors on the cell membrane, facilitating receptor dimerization, which can be internalized to the endosomes.

In the endosomes, receptor dimers can become phosphorylated and recruit a transcription factor, which upon phosphorylation can also dimerize and migrate to the nucleus. In the nucleus, the transcription factor initiates the transcription of mRNA1, which, when it migrates to the cytoplasm, produces protein P1. P1 can then migrate to the nucleus and act as a transcription factor for mRNA2, which, when it migrates to the cytoplasm, produces the final species in the cascade, protein P2.

Although this reaction network is idealized, it embodies key aspects of the complexity expected in real signaling processes. This rule-based model is translated into a system of chemical kinetics reactions by BioNetGen, and then simulated in a spatially realistic geometry by MCell. Figure adapted from [ 61 ]. Rate constants and further details are given in the SI. The weighted ensemble simulation of the spatial signaling model has an internal time step microseconds, and we perform weighted ensemble resampling once every second, i.

We simulate the model for seconds, or 5 million internal MCell time steps, i. We use a single progress coordinate for this system, the total number of P2 molecules in the system.


The bins on this coordinate are integers on [0, 25] and one bin from 25 to infinity. We simulate 48 trajectories in the bin containing 0, and 16 trajectory segments in each other bin. Note that many coordinates e.

The third model we study represents a single active zone of a frog neuromuscular junction (NMJ). Synapses are of crucial physiological importance in neural function, yet their detailed molecular behavior, particularly the way in which calcium triggers synaptic vesicle fusion, still lacks a complete molecular-level characterization.
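The binning of the P2 progress coordinate described above for the signaling model can be sketched as a simple mapping. The exact bin edges are an assumption here: counts 0 through 24 are each given their own bin, and every count of 25 or more falls into a single overflow bin. The function names are hypothetical.

```python
def bin_index(p2_count, last_edge=25):
    """Map the total number of P2 molecules to a bin index.

    Assumed layout: one bin per integer count below `last_edge`, plus one
    overflow bin (index `last_edge`) for all counts at or above it.
    """
    if p2_count < 0:
        raise ValueError("molecule counts cannot be negative")
    return min(p2_count, last_edge)


def target_segments(index):
    """Trajectory segments per bin as stated in the text: 48 in the bin
    containing 0, and 16 in every other bin."""
    return 48 if index == 0 else 16
```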

This is mainly due to the lack of experimental approaches that can probe synapses at the required spatial and temporal resolution. Computational models can provide critical microscopic insight into how calcium binding triggers vesicle fusion and release [ 29 ]. The geometry of the frog NMJ active zone model is detailed in Fig 7 and has been described previously [ 29 ]. The active zone model consists of a double row of 26 synaptic vesicles and two rows of 26 voltage-gated calcium channels (VGCCs) in the space between vesicles (see Fig 7).

Thus each synaptic vesicle is associated with a single VGCC. On the left is an example snapshot from a simulation, and on the right is a zoomed-in view of the model. Calcium is released into the presynaptic space and is free to diffuse around the geometry and bind to the synaptic vesicles at the bottom of the active zone. The system is initialized from a state of no free calcium in the active zone. During a simulation, VGCCs open stochastically, driven by a time-dependent action potential waveform [ 29 ].

Once open, VGCCs conduct calcium ions into the presynaptic space. Since each synaptotagmin protein has five calcium binding sites, each synaptic vesicle contains a total of 40 calcium binding sites. A synaptotagmin protein is activated after binding at least two calcium ions, and vesicle fusion is triggered once three out of its eight synaptotagmin proteins have been activated. For each simulation we track the calcium binding events to synaptotagmin sites on synaptic vesicles and can thus determine the number of released vesicles and the time of release. Specifically, the rates for the opening of and calcium conduction through VGCCs in the model are time dependent and are parameterized according to an experimentally measured action potential waveform.
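The activation and fusion criteria stated above translate directly into a small check. The sketch assumes each vesicle is represented by a list of eight integers, one per synaptotagmin protein, giving the number of calcium ions bound to it (0 to 5); the constant and function names are hypothetical.

```python
SITES_PER_SYNAPTOTAGMIN = 5       # calcium binding sites per protein
SYNAPTOTAGMINS_PER_VESICLE = 8    # 8 * 5 = 40 sites per vesicle
MIN_BOUND_TO_ACTIVATE = 2         # ions needed to activate one protein
MIN_ACTIVE_TO_FUSE = 3            # active proteins needed for fusion


def count_active(bound_per_protein):
    """Number of synaptotagmin proteins with at least two bound ions."""
    if len(bound_per_protein) != SYNAPTOTAGMINS_PER_VESICLE:
        raise ValueError("expected one entry per synaptotagmin protein")
    if any(not 0 <= b <= SITES_PER_SYNAPTOTAGMIN for b in bound_per_protein):
        raise ValueError("bound ions must be between 0 and 5")
    return sum(1 for b in bound_per_protein if b >= MIN_BOUND_TO_ACTIVATE)


def vesicle_fused(bound_per_protein):
    """True once at least three synaptotagmin proteins are activated."""
    return count_active(bound_per_protein) >= MIN_ACTIVE_TO_FUSE
```

Note that fusion depends on how the bound ions are distributed: six ions spread as two per protein over three proteins trigger fusion, whereas five ions on a single protein do not.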

This time-dependent nature of vesicle release in synapses is critical for their physiological function [ 29 ]. Thus, the model, with its time-varying kinetics, cannot be treated using steady-state or equilibrium approaches and is only usefully simulated, even with a weighted ensemble, out of equilibrium and for a predetermined period of time. Weighted ensemble simulation of the NMJ model used an internal time step of 10 nanoseconds, and we performed weighted ensemble resampling at an interval of 6 microseconds for the low calcium conditions.

In total, we simulate the model for 3 ms, i. The progress-coordinate space for the NMJ system was two dimensional: one dimension was the total number of calcium ions bound to all synaptotagmin molecules on a vesicle, and the other was the number of activated synaptotagmin molecules on that vesicle. Since a vesicle fuses once three synaptotagmin molecules are active, the latter coordinate had integer bins from zero to three. For the coordinate tracking the number of bound calcium ions per synaptotagmin, the bins were integers on the interval [0, 20], and one bin from 20 to The NMJ progress coordinate was chosen to facilitate the observation of fusion events, in a manner that is somewhat complicated but also serves to illustrate the flexibility in the type of progress coordinates that WE accepts.

Of the 26 vesicles in the simulation, the one that was closest to fusion was chosen at every WE iteration. That is, the vesicles were sorted in descending order by number of activated synaptotagmin proteins, and then by number of total calcium ions bound; the vesicle at the top of the list was chosen. This ranking was performed at every weighted ensemble resampling event, so in principle the vesicle in question could change during the simulation, but always in favor of progress towards a fusion event.
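The ranking described above amounts to a lexicographic maximum. In this sketch each vesicle is summarized as a hypothetical `(n_active, n_bound)` pair, where `n_active` is the number of activated synaptotagmin proteins and `n_bound` is the total number of calcium ions bound on that vesicle.

```python
def closest_to_fusion(vesicles):
    """Pick the vesicle ranked first when sorting in descending order by
    activated synaptotagmin count, then by total bound calcium ions.

    vesicles: non-empty list of (n_active, n_bound) tuples.
    """
    return max(vesicles, key=lambda v: (v[0], v[1]))


def progress_coordinate(vesicles):
    """Two-dimensional progress coordinate of the chosen vesicle:
    (total bound calcium ions, activated synaptotagmin count)."""
    n_active, n_bound = closest_to_fusion(vesicles)
    return n_bound, n_active
```

Re-evaluating this choice at every resampling event means the tracked vesicle can change between iterations, as noted in the text.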

Due to the time-dependent VGCC rate constants in the NMJ model, even weighted ensemble sampling can have difficulty efficiently filling up bins of state space. This is because some regions that are initially difficult to sample become easy to reach, and time spent populating intermediate bins is in some sense ill-spent; the model is still sampled, but the efficiency can be less than ideal if one attempts to always have all bins full of trajectories. To address this issue, instead of performing a single weighted ensemble run with a large number of trajectories, we perform many less intensive weighted ensemble runs with fewer trajectories and average the results.
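Averaging over independent runs also yields a straightforward error estimate. The sketch below assumes each run produces one scalar estimate of the observable and that the runs are statistically independent; the function name is hypothetical, and the standard error of the mean is one common choice rather than necessarily the one used in the original study.

```python
import math
import statistics


def combine_runs(estimates):
    """Return (mean, standard error of the mean) for per-run estimates
    from independent weighted ensemble runs. Requires at least two runs."""
    if len(estimates) < 2:
        raise ValueError("need at least two independent runs")
    mean = statistics.fmean(estimates)
    sem = statistics.stdev(estimates) / math.sqrt(len(estimates))
    return mean, sem
```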

Specifically, for the low calcium regimes of 0. The 0. As noted above, multiple independent weighted ensemble runs facilitate error estimation. We sampled the three spatially resolved cell-scale models of varying complexity using the weighted ensemble approach. The results from all three models demonstrate the ability of WE to sample rare events in models of varying spatial and biochemical complexity.

The application of WE sampling to the NMJ model generated novel data about vesicle release in regimes of calcium concentration too difficult to sample well with conventional methods. Our studies of rare event sampling in spatial stochastic systems start with the toy model shown in Fig 4 and described in detail in the Models section.

Briefly, we simulate diffusing ligands unbinding from the top of a cubical volume and binding to the bottom for a short amount of time.