Publications about 'identification' |
Articles in journal or book chapters |
This paper considers the following learning problem: given sample pairs of input and output signals generated by an unknown nonlinear system (which is not assumed to be causal or time-invariant), one wishes to find a continuous-time recurrent neural net, with activation function tanh, that approximately reproduces the underlying i/o behavior with high confidence. Leveraging earlier work concerned with matching derivatives of the input and output signals up to a finite order, the problem is reformulated in familiar system-theoretic language, and quantitative guarantees on the sup-norm risk of the learned model are derived in terms of the number of neurons, the sample size, the number of derivatives being matched, and the regularity properties of the inputs, the outputs, and the unknown i/o map. |
The development of resistance to chemotherapy is a major cause of treatment failure in cancer. Intratumoral heterogeneity and phenotypic plasticity play a significant role in therapeutic resistance. Individual cell measurements such as flow and mass cytometry and single cell RNA sequencing (scRNA-seq) have been used to capture and analyze this cell variability. In parallel, longitudinal treatment-response data is routinely employed in order to calibrate mechanistic mathematical models of heterogeneous subpopulations of cancer cells viewed as compartments with differential growth rates and drug sensitivities. This work combines both approaches: single cell clonally-resolved transcriptome datasets (scRNA-seq, tagging individual cells with unique barcodes that are integrated into the genome and expressed as sgRNAs) and longitudinal treatment response data, to fit a mechanistic mathematical model of drug resistance dynamics for an MDA-MB-231 breast cancer cell line. The explicit inclusion of the transcriptomic information in the parameter estimation is critical for identification of the model parameters and enables accurate prediction of new treatment regimens. |
A recent paper by Karin et al. introduced a mathematical notion called dynamical compensation (DC) of biological circuits. DC was shown to play an important role in glucose homeostasis as well as other key physiological regulatory mechanisms. Karin et al. went on to provide a sufficient condition to test whether a given system has the DC property. Here, we show how DC is a reformulation of a well-known concept in systems biology, statistics, and control theory: that of parameter structural non-identifiability. Viewing DC as a parameter identification problem enables one to take advantage of powerful theoretical and computational tools to test a system for DC. We obtain as a special case the sufficient criterion discussed by Karin et al. We also draw connections to system equivalence and to the fold-change detection property. |
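As an illustration of what structural non-identifiability means, here is a standard textbook-style example. It is not taken from the paper, and the symbols a, b, c and λ are chosen here purely for illustration.

```latex
\dot{x}(t) = -a\,x(t) + b\,u(t), \qquad y(t) = c\,x(t), \qquad x(0) = 0
\quad\Longrightarrow\quad
y(t) = bc \int_0^t e^{-a(t-s)}\,u(s)\,ds .
```

The output depends on b and c only through the product bc. The rescaling (b, c) -> (λb, c/λ) with λ ≠ 0 therefore leaves every input/output experiment unchanged: a and bc can be determined from data, but b and c individually cannot.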
This letter discusses a paper in the same journal which reported a method for reconstructing network topologies. Here we show that the method is a variant of a previously published method, modular response analysis. We also demonstrate that the implementation of the algorithm in that paper using statistical similarity measures as a proxy for global network responses to perturbations is erroneous and its performance is overestimated. |
This paper describes a potential pitfall of perturbation-based approaches to network inference. It is shown experimentally, and then explained mathematically, how even in the simplest signaling systems, perturbation methods may lead to paradoxical conclusions: for any given pair of components X and Y, and depending upon the specific intervention on Y, either an activation or a repression of X could be inferred. The experiments are performed in an in vitro minimal system, thus isolating the effect and showing that it cannot be explained by feedbacks due to unknown intermediates; this system utilizes proteins from a pathway in mammalian (and other eukaryotic) cells that plays a central role in proliferation, gene expression, differentiation, mitosis, cell survival, and apoptosis and is a perturbation target of contemporary therapies for various types of cancers. The results show that the simplistic view of intracellular signaling networks being made up of activation and repression links is seriously misleading, and call for a fundamental rethinking of signaling network analysis and inference methods. |
Many reverse-engineering techniques in systems biology rely upon data on steady-state (or dynamic) perturbations (obtained from siRNA, gene knock-down or overexpression, kinase and phosphatase inhibitors, or other interventions) in order to understand the interactions between different "modules" in a network. This paper first reviews one such popular technique, introduced by the author and collaborators, and focuses on why conclusions drawn from its use may be misleading due to "retroactivity" (impedance or load) effects. A theoretical result characterizing stoichiometric-induced steady-state retroactivity effects is given for a class of biochemical networks. |
This paper asks what classes of input signals are sufficient in order to completely identify the input/output behavior of generic bilinear systems. The main results are that step inputs are not sufficient, nor are single pulses, but the family of all pulses (of a fixed amplitude but varying widths) does suffice for identification. |
The "reverse engineering problem" in systems biology is that of unraveling the web of interactions among the components of protein and gene regulatory networks, so as to map out the direct or local interactions among components. These direct interactions capture the topology of the functional network. An intrinsic difficulty in capturing these direct interactions, at least in intact cells, is that any perturbation to a particular gene or signaling component may rapidly propagate throughout the network, thus causing global changes which cannot be easily distinguished from direct effects. Thus, a major goal in reverse engineering is to use these observed global responses - such as steady-state changes in concentrations of active proteins, mRNA levels, or transcription rates - in order to infer the local interactions between individual nodes. One approach to solving this global-to-local problem is the "Modular Response Analysis" (MRA) method proposed in work by the author with Kholodenko et al. (PNAS, 2002) and further elaborated in other papers. The basic method deals only with steady-state data. However, recently, quasi-steady state MRA has been used by Santos et al. (Nature Cell Biology, 2007) for quantifying positive and negative feedback effects in the Raf/Mek/Erk MAPK network in rat adrenal pheochromocytoma (PC-12) cells. This paper presents an overview of the MRA technique, as well as a generalization of the algorithm to that quasi-steady state case. |
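The global-to-local step of steady-state MRA can be sketched numerically. This is a minimal sketch, assuming the relation usually stated for MRA, r = -[dg(R^{-1})]^{-1} R^{-1}, where R collects global responses (column j is the response of all modules to a perturbation acting directly on module j only) and r is the local response matrix with diagonal entries normalized to -1. The function name, the 3-module network, and the `direct` sensitivities are made up for illustration.

```python
import numpy as np

def local_response_matrix(R):
    """Recover local response coefficients r from global responses R.

    Assumes column j of R is the steady-state response of all modules to a
    perturbation that directly affects module j only, and that R is invertible.
    """
    R = np.asarray(R, dtype=float)
    R_inv = np.linalg.inv(R)
    # Divide row i by -(R^{-1})_{ii}; this normalizes r_ii to -1.
    return -R_inv / np.diag(R_inv)[:, None]

# Synthetic check on a hypothetical 3-module network.
r_true = np.array([[-1.0, 0.0, -0.5],
                   [0.8, -1.0, 0.0],
                   [0.0, 0.6, -1.0]])
# Hypothetical direct sensitivity of each module to its own perturbation.
direct = np.diag([0.3, -0.7, 1.2])
# Steady-state relation r @ R + direct = 0 yields the global responses.
R = -np.linalg.solve(r_true, direct)
r_est = local_response_matrix(R)
```

Note that the unknown magnitudes in `direct` cancel out: only the assumption that each perturbation acts on a single module is needed to recover `r_true`.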
A result is presented showing the existence of inputs universal for observability, uniformly with respect to the class of all continuous-time analytic systems. This represents an ultimate generalization of a 1977 theorem, for bilinear systems, due to Alberto Isidori and Osvaldo Grasselli. |
This paper studies a computational problem motivated by the modular response analysis method for reverse engineering of protein and gene networks. This set-cover problem is hard to solve exactly for large networks, but efficient approximation algorithms are given and their complexity is analyzed. |
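For readers unfamiliar with set cover, the classical greedy heuristic gives a feel for what an efficient approximation algorithm looks like. The sketch below is that textbook heuristic, not necessarily the algorithm analyzed in the paper; the function name and the toy instance are invented for illustration.

```python
def greedy_set_cover(universe, subsets):
    """Return indices of subsets chosen by the classical greedy heuristic."""
    uncovered = set(universe)
    chosen = []
    while uncovered:
        # Pick the subset covering the most still-uncovered elements.
        best = max(range(len(subsets)),
                   key=lambda i: len(uncovered & subsets[i]))
        if not uncovered & subsets[best]:
            raise ValueError("subsets do not cover the universe")
        chosen.append(best)
        uncovered -= subsets[best]
    return chosen

# Toy instance: cover the elements 1..7 with as few subsets as possible.
universe = range(1, 8)
subsets = [{1, 2, 3, 4}, {3, 4, 5}, {5, 6, 7}, {1, 7}]
cover = greedy_set_cover(universe, subsets)
```

The greedy choice runs in polynomial time and is known to return a cover within a logarithmic factor of the optimum size.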
This paper investigates computational complexity aspects of a combinatorial problem that arises in the reverse engineering of protein and gene networks, showing relations to an appropriate set multicover problem with large "coverage" factor, and providing a non-trivial analysis of a simple randomized polynomial-time approximation algorithm for the problem. |
Biological complexity and limited quantitative measurements pose severe challenges to standard engineering methodologies for systems identification. This paper presents an approach, justified by the theory of universal inputs for distinguishability, based on replacing unmodeled dynamics by fictitious "dependent inputs". The approach is particularly useful in validation experiments, because it allows one to fit model parameters to experimental data generated by a reference (wild-type) organism and then test this model on data generated by a variation (mutant), so long as the mutations only affect the unmodeled dynamics that produce the dependent inputs. As a case study, this paper addresses the pathways that control the nitrogen uptake fluxes in baker's yeast, Saccharomyces cerevisiae, enabling it to optimally respond to changes in nitrogen availability. Well-defined perturbation experiments were performed on cells growing in steady-state. Time-series data of extracellular and intracellular metabolites were obtained, as well as mRNA levels. A nonlinear model was proposed, and shown to be structurally identifiable given input/output data. The identified model correctly predicted the responses of different yeast strains and different perturbations. |
One of the fundamental problems of cell biology is the understanding of complex regulatory networks. Such networks are ubiquitous in cells, and knowledge of their properties is essential for the understanding of cellular behavior. This paper studies the effect of experimental uncertainty on the accuracy of the inferred structure of the networks determined using the method in "Untangling the wires: a novel strategy to trace functional interactions in signaling and gene networks". |
This paper takes a computational learning theory approach to a problem of linear systems identification. It is assumed that input signals have only a finite number k of frequency components, and systems to be identified have dimension no greater than n. The main result establishes that the sample complexity needed for identification scales polynomially with n and logarithmically with k. |
High-throughput technologies have facilitated the acquisition of large genomics and proteomics data sets. However, these data provide snapshots of cellular behavior, rather than help us reveal causal relations. Here, we propose how these technologies can be utilized to infer the topology and strengths of connections among genes, proteins, and metabolites by monitoring time-dependent responses of cellular networks to experimental interventions. We show that all connections leading to a given network node, e.g., to a particular gene, can be deduced from responses to perturbations none of which directly influences that node, e.g., using strains with knock-outs to other genes. To infer all interactions from stationary data, each node should be perturbed separately or in combination with other nodes. Monitoring time series provides richer information and does not require perturbations to all nodes. |
Emerging technologies have enabled the acquisition of large genomics and proteomics data sets. This paper proposes a novel quantitative method for determining functional interactions in cellular signaling and gene networks. It can be used to explore cell systems at a mechanistic level, or applied within a modular framework, which dramatically decreases the number of variables to be assayed. The topology and strength of network connections are retrieved from experimentally measured network responses to successive perturbations of all modules. In addition, the method can reveal functional interactions even when the components of the system are not all known, in which case some connections retrieved by the analysis will not be direct but correspond to the interaction routes through unidentified elements. The method is tested and illustrated using computer-generated responses of a modeled MAPK cascade and gene network. |
Given a set of differential equations whose description involves unknown parameters, such as reaction constants in chemical kinetics, and supposing that one may at any time measure the values of some of the variables and possibly apply external inputs to help excite the system, how many experiments are sufficient in order to obtain all the information that is potentially available about the parameters? This paper shows that the best possible answer (assuming exact measurements) is 2r+1 experiments, where r is the number of parameters. |
The area of hybrid systems concerns issues of modeling, computation, and control for systems which combine discrete and continuous components. The subclass of piecewise linear (PL) systems provides one systematic approach to discrete-time hybrid systems, naturally blending switching mechanisms with classical linear components. PL systems model arbitrary interconnections of finite automata and linear systems. Tools from automata theory, logic, and related areas of computer science and finite mathematics are used in the study of PL systems, in conjunction with linear algebra techniques, all in the context of a "PL algebra" formalism. PL systems are of interest as controllers as well as identification models. Basic questions for any class of systems are those of equivalence, and, in particular, whether state spaces are equivalent under a change of variables. This paper studies this state-space equivalence problem for PL systems. The problem was known to be decidable, but its computational complexity was potentially exponential; here it is shown to be solvable in polynomial time. |
We consider the problem of characterizing possible supply functions for a given dissipative nonlinear system, and provide a result that allows some freedom in the modification of such functions. |
For continuous time analytic input/output maps, the existence of a singular differential equation relating derivatives of controls and outputs is shown to be equivalent to bilinear realizability. A similar result holds for the problem of immersion into bilinear systems. The proof is closely analogous to that of the corresponding, and previously known, result for discrete time. |
The family of m-input, n-dimensional linear systems can be globally identified with a generic input sequence of length 2mn. This bound is the best possible. A best bound is also proved for a corresponding local identification problem. |
Conference articles |
Combining in-vivo experiments with system identification methods, we determine a simple model of aerotaxis in B. subtilis, and we subsequently employ this model in order to compute the sequence of oxygen gradients needed in order to achieve set-point regulation with respect to a signal tracking the center of mass of the bacterial population. We then successfully validate both the model and the control scheme, by showing that in-vivo positioning control can be achieved via the application of the precomputed inputs in-vivo in an open-loop configuration. |
This paper studies model-based methods for estimating the rate of a nonhomogeneous Poisson process that describes events arising in models of biological phenomena in which discrete events are measured. We describe an approach based on observers and Kalman filters as well as preliminary simulation results, and compare these to other methods (not model-based) in the literature. The problem is motivated by the question of identification of internal states from neural spikes and bacterial tumbling behavior. |
Summarized conference version of "Modularity, retroactivity, and structural identification". |
Preliminary version of paper published in Automatica in 1995. |
This material is presented to ensure timely dissemination of scholarly and technical work. Copyright and all rights therein are retained by authors or by other copyright holders.
This document was translated from BibTeX by bibtex2html