Research Papers

Causation Entropy Identifies Sparsity Structure for Parameter Estimation of Dynamic Systems

Pileun Kim

George W. Woodruff School
of Mechanical Engineering,
Georgia Institute of Technology,
Atlanta, GA 30332

Jonathan Rogers

Assistant Professor
George W. Woodruff School of
Mechanical Engineering,
Georgia Institute of Technology,
Atlanta, GA 30332

Jie Sun

Assistant Professor
Department of Mathematics,
Clarkson University,
Potsdam, NY 13699

Erik Bollt

Department of Mathematics,
Clarkson University,
Potsdam, NY 13699

Contributed by the Design Engineering Division of ASME for publication in the JOURNAL OF COMPUTATIONAL AND NONLINEAR DYNAMICS. Manuscript received January 19, 2016; final manuscript received June 29, 2016; published online September 1, 2016. Assoc. Editor: Bogdan I. Epureanu.

J. Comput. Nonlinear Dynam 12(1), 011008 (Sep 01, 2016) (14 pages) Paper No: CND-16-1026; doi: 10.1115/1.4034126 History: Received January 19, 2016; Revised June 29, 2016

Parameter estimation is an important topic in the field of system identification. This paper explores the role of a new information-theoretic measure of data dependency in parameter estimation problems. Causation entropy is a recently proposed information-theoretic measure of influence between components of multivariate time series data. Because causation entropy measures the influence of one dataset upon another, it is naturally related to the parameters of a dynamical system. In this paper, it is shown that by numerically estimating causation entropy from the outputs of a dynamic system, it is possible to uncover the internal parametric structure of the system and thus establish the relative magnitude of system parameters. In the simple case of linear systems subject to Gaussian uncertainty, it is first shown that causation entropy can be represented in closed form as the logarithm of a rational function of system parameters. For more general systems, a causation entropy estimator is proposed, which allows causation entropy to be numerically estimated from measurement data. Results are provided for discrete linear and nonlinear systems, showing that numerical estimates of causation entropy can be used to identify the dependencies between system states directly from output data. Causation entropy estimates can therefore be used to inform parameter estimation, either by reducing the size of the parameter set or by generating a more accurate initial guess for subsequent parameter optimization.
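The idea in the abstract can be illustrated with a minimal sketch, under assumptions that are not taken from the paper. The abstract gives no formulas, so the sketch uses the usual definition of causation entropy from state j to state i as H(x_i(t+1) | other states at t) minus H(x_i(t+1) | all states at t). For jointly Gaussian variables this reduces to half the logarithm of a ratio of conditional variances, which can be computed from sample covariances. The matrix `A`, the noise level, the sample length, and the function names are hypothetical. This is a Gaussian plug-in estimate, not the estimator proposed in the paper.

```python
import numpy as np

def conditional_variance(cov, target, given):
    """Variance of a scalar Gaussian variable `target` conditioned on the
    variables in `given`, computed from the joint covariance `cov`."""
    if len(given) == 0:
        return cov[target, target]
    s_tg = cov[target, given]               # cross-covariance (1-D array)
    s_gg = cov[np.ix_(given, given)]        # covariance of the conditioning set
    return cov[target, target] - s_tg @ np.linalg.solve(s_gg, s_tg)

def causation_entropy_matrix(x):
    """x has shape (T, n). Entry [i, j] of the result estimates the causation
    entropy from state j to state i, conditioned on all other states,
    assuming the data are jointly Gaussian (an assumption of this sketch)."""
    n = x.shape[1]
    past, future = x[:-1], x[1:]
    cem = np.zeros((n, n))
    for i in range(n):
        # Column 0 is x_i(t+1); columns 1..n are the states at time t.
        joint = np.cov(np.column_stack([future[:, i], past]), rowvar=False)
        all_past = list(range(1, n + 1))
        var_with = conditional_variance(joint, 0, all_past)
        for j in range(n):
            without_j = [k for k in all_past if k != j + 1]
            var_without = conditional_variance(joint, 0, without_j)
            cem[i, j] = 0.5 * np.log(var_without / var_with)
    return cem

# Hypothetical 2 x 2 linear system: state 2 does not drive state 1 (A[0, 1] = 0).
rng = np.random.default_rng(0)
A = np.array([[0.5, 0.0],
              [0.4, 0.3]])
T = 20000
x = np.zeros((T, 2))
for t in range(T - 1):
    x[t + 1] = A @ x[t] + 0.1 * rng.standard_normal(2)

C = causation_entropy_matrix(x)
print(np.round(C, 4))
```

In this sketch the entry of `C` that corresponds to the zero entry of `A` comes out close to zero, while the entries for nonzero couplings are clearly positive. This is the sense in which such estimates can suggest which parameters to drop from the estimation problem.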

Copyright © 2017 by ASME




Fig. 1: CEM table values versus a11 for example 2 × 2 linear system

Fig. 2: CEM table values versus a12 for example 2 × 2 linear system: (o) case σu=0.1, σv=0.15; (+) case σu=0.1, σv=0.7

Fig. 3: CEM table values versus a21 for example 2 × 2 linear system (analytical and estimated with KDE)

Fig. 4: Magnitude plot of A matrix components for example 5 × 5 linear system

Fig. 5: Magnitude plot of theoretical CEM components for example 5 × 5 linear system

Fig. 6: Magnitude plot of estimated CEM components for example 5 × 5 linear system

Fig. 7: Example coupled linear harmonic oscillators

Fig. 8: Magnitude plot of A matrix components for 10 × 10 coupled linear oscillator example

Fig. 9: Magnitude plot of theoretical CEM components for coupled linear oscillator example

Fig. 10: Magnitude plot of estimated CEM components for coupled linear oscillator example

Fig. 11: State time history for Hénon map example

Fig. 12: State time history for Duffing map example (traces for the two states look similar but are not the same)

Fig. 13: State time history for 5 × 5 nonlinear system example

Fig. 14: State time history for 6 × 6 nonlinear system example


