
Publications about 'VC dimension'
Articles in journal or book chapters
  1. P. Kuusela, D. Ocone, and E.D. Sontag. Learning complexity dimensions for a continuous-time control system. SIAM J. Control Optim., 43(3):872-898, 2004. [PDF] [doi:http://dx.doi.org/10.1137/S0363012901384302] Keyword(s): machine learning, theory of computing and complexity, VC dimension, neural networks.
    Abstract:
    This paper takes a computational learning theory approach to a problem of linear systems identification. It is assumed that input signals have only a finite number k of frequency components, and systems to be identified have dimension no greater than n. The main result establishes that the sample complexity needed for identification scales polynomially with n and logarithmically with k.
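
    Read schematically (a paraphrase, not the paper's statement; constants and the dependence on the accuracy and confidence parameters of the learning problem are suppressed), the main result says that the number m(n, k) of samples needed for identification satisfies

        \[
          m(n,k) \;=\; O\!\bigl(\mathrm{poly}(n)\cdot\log k\bigr),
        \]

    where n bounds the dimension of the systems to be identified and k is the number of frequency components allowed in the input signals.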


  2. E.D. Sontag. VC dimension of neural networks. In C.M. Bishop, editor, Neural Networks and Machine Learning, pages 69-95. Springer, Berlin, 1998. [PDF] Keyword(s): machine learning, VC dimension, learning, neural networks, shattering.
    Abstract:
    The Vapnik-Chervonenkis (VC) dimension is an integer which helps to characterize distribution-independent learning of binary concepts from positive and negative samples. This paper, based on lectures delivered at the Isaac Newton Institute in August of 1997, presents a brief introduction, establishes various elementary results, and discusses how to estimate the VC dimension in several examples of interest in neural network theory. (It does not address the learning and estimation-theoretic applications of VC dimension, and the applications to uniform convergence theorems for empirical probabilities, for which many suitable references are available.)
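
    As a concrete illustration of shattering, the combinatorial notion behind the VC dimension, the following sketch brute-forces the labelings realizable on the real line by indicator functions of closed intervals, a standard class of VC dimension 2. It is an illustrative aside, not taken from the chapter; the function names are ad hoc.

        from itertools import combinations

        def interval_labelings(points):
            """Labelings of `points` realizable by indicators of closed intervals [a, b]."""
            pts = sorted(points)
            labelings = {tuple(0 for _ in pts)}                      # empty concept
            for i, j in combinations(range(len(pts)), 2):
                a, b = pts[i], pts[j]
                labelings.add(tuple(1 if a <= x <= b else 0 for x in pts))
            for p in pts:                                            # degenerate intervals [p, p]
                labelings.add(tuple(1 if x == p else 0 for x in pts))
            return labelings

        def is_shattered(points):
            """True iff all 2^n labelings of `points` are realized by some interval."""
            return len(interval_labelings(points)) == 2 ** len(points)

        print(is_shattered([1.0, 2.0]))       # True: any two distinct points are shattered
        print(is_shattered([1.0, 2.0, 3.0]))  # False: the labeling (1, 0, 1) is unrealizable

    The VC dimension of this class is therefore exactly 2: some two-point set is shattered, but no three-point set is.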


  3. P. Koiran and E.D. Sontag. Vapnik-Chervonenkis dimension of recurrent neural networks. Discrete Appl. Math., 86(1):63-79, 1998. [PDF] [doi:http://dx.doi.org/10.1016/S0166-218X(98)00014-6] Keyword(s): machine learning, neural networks, recurrent neural networks.
    Abstract:
    This paper provides lower and upper bounds for the VC dimension of recurrent networks. Several types of activation functions are discussed, including threshold, polynomial, piecewise-polynomial and sigmoidal functions. The bounds depend on two independent parameters: the number w of weights in the network, and the length k of the input sequence. Ignoring multiplicative constants, the main results say roughly the following: 1. For architectures whose activation is any fixed nonlinear polynomial, the VC dimension is proportional to wk. 2. For architectures whose activation is any fixed piecewise polynomial, the VC dimension is between wk and w^2 k. 3. For architectures with threshold activations, the VC dimension is between w log(k/w) and the smallest of wk log(wk) and w^2 + w log(wk). 4. For the standard sigmoid tanh(x), the VC dimension is between wk and w^4 k^2.
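
    Restating those four bounds in display form (here \lesssim and \asymp hide the multiplicative constants mentioned in the abstract; w is the number of weights and k the length of the input sequence):

        \begin{align*}
        \text{fixed nonlinear polynomial activation:}\quad & \mathrm{VCdim} \;\asymp\; wk \\
        \text{fixed piecewise-polynomial activation:}\quad & wk \;\lesssim\; \mathrm{VCdim} \;\lesssim\; w^{2}k \\
        \text{threshold activation:}\quad & w\log(k/w) \;\lesssim\; \mathrm{VCdim} \;\lesssim\; \min\bigl\{wk\log(wk),\; w^{2}+w\log(wk)\bigr\} \\
        \text{standard sigmoid } \tanh(x):\quad & wk \;\lesssim\; \mathrm{VCdim} \;\lesssim\; w^{4}k^{2}
        \end{align*}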


  4. E.D. Sontag. A learning result for continuous-time recurrent neural networks. Systems Control Lett., 34(3):151-158, 1998. [PDF] [doi:http://dx.doi.org/10.1016/S0167-6911(98)00006-1] Keyword(s): machine learning, neural networks, VC dimension, recurrent neural networks.
    Abstract:
    The following learning problem is considered, for continuous-time recurrent neural networks having sigmoidal activation functions. Given a "black box" representing an unknown system, measurements of output derivatives are collected, for a set of randomly generated inputs, and a network is used to approximate the observed behavior. It is shown that the number of inputs needed for reliable generalization (the sample complexity of the learning problem) is upper bounded by an expression that grows polynomially with the dimension of the network and logarithmically with the number of output derivatives being matched.
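
    In the same schematic spirit as the bound in entry 1 above (again a paraphrase, with constants and the accuracy/confidence dependence suppressed):

        \[
          \text{sample complexity} \;\le\; p(\text{network dimension}) \cdot O(\log r),
        \]

    where p is some fixed polynomial and r is the number of output derivatives being matched.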


  5. P. Koiran and E.D. Sontag. Vapnik-Chervonenkis dimension of recurrent neural networks. In Computational learning theory (Jerusalem, 1997), volume 1208 of Lecture Notes in Comput. Sci., pages 223-237. Springer-Verlag, London, UK, 1997. Keyword(s): machine learning, neural networks, VC dimension, recurrent neural networks.


  6. E.D. Sontag. Recurrent neural networks: Some systems-theoretic aspects. In M. Karny, K. Warwick, and V. Kurkova, editors, Dealing with Complexity: a Neural Network Approach, pages 1-12. Springer-Verlag, London, 1997. [PDF] Keyword(s): machine learning, neural networks, recurrent neural networks, learning, VC dimension.
    Abstract:
    This paper provides an exposition of some recent results regarding system-theoretic aspects of continuous-time recurrent (dynamic) neural networks with sigmoidal activation functions. The class of systems is introduced and discussed, and a result is cited regarding their universal approximation properties. Known characterizations of controllability, observability, and parameter identifiability are reviewed, as well as a result on minimality. Facts regarding the computational power of recurrent nets are also mentioned.


  7. P. Koiran and E.D. Sontag. Neural networks with quadratic VC dimension. J. Comput. System Sci., 54(1, part 2):190-198, 1997. Note: (1st Annual Dagstuhl Seminar on Neural Computing, 1994). [PDF] [doi:http://dx.doi.org/10.1006/jcss.1997.1479] Keyword(s): machine learning, neural networks, VC dimension.
    Abstract:
    This paper shows that neural networks which use continuous activation functions have VC dimension at least as large as the square of the number of weights w. This result settles the open question of whether the well-known O(w log w) bound, known for hard-threshold nets, also holds for more general sigmoidal nets. Implications for the number of samples needed for valid generalization are discussed.
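
    In symbols, the contrast established by this result is (with w the number of weights, and up to the constants implicit in the abstract's phrasing):

        \[
          \mathrm{VCdim}\bigl(\text{nets with suitable continuous activations}\bigr) \;=\; \Omega(w^{2})
          \qquad\text{versus}\qquad
          \mathrm{VCdim}\bigl(\text{hard-threshold nets}\bigr) \;=\; O(w\log w).
        \]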


  8. E.D. Sontag. Shattering all sets of k points in "general position" requires (k-1)/2 parameters. Neural Comput., 9(2):337-348, 1997. [PDF] Keyword(s): machine learning, neural networks, VC dimension, real-analytic functions.
    Abstract:
    For classes of concepts defined by certain classes of analytic functions depending on k parameters, there are nonempty open sets of samples of length 2k+2 which cannot be shattered. A slightly weaker result is also proved for piecewise-analytic functions. The special case of neural networks is discussed.
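
    A one-line bridge between the abstract and the title: every nonempty open set of samples contains samples in general position, so a class with k parameters fails to shatter some general-position sample of length 2k+2; hence a class that shatters all general-position sets of n points must satisfy n \le 2k+1, that is,

        \[
          k \;\ge\; \frac{n-1}{2}.
        \]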


  9. B. DasGupta and E.D. Sontag. Sample complexity for learning recurrent perceptron mappings. IEEE Trans. Inform. Theory, 42(5):1479-1487, 1996. [PDF] Keyword(s): machine learning, neural networks, VC dimension, recurrent neural networks.
    Abstract:
    Recurrent perceptron classifiers generalize the usual perceptron model. They correspond to linear transformations of input vectors obtained by means of "autoregressive moving-average schemes", or infinite impulse response filters, and allow taking into account those correlations and dependencies among input coordinates which arise from linear digital filtering. This paper provides tight bounds on the sample complexity associated with fitting such models to experimental data. The results are expressed in the context of the theory of probably approximately correct (PAC) learning.
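
    A recurrent perceptron in the sense sketched above can be viewed as the sign of the output of a linear IIR (autoregressive moving-average) filter driven by the input sequence. The following minimal sketch is only an illustration of that idea; the first-order recursion and the parameter names are assumptions made here, not the paper's notation.

        def recurrent_perceptron(u, a, b, bias=0.0):
            """Classify an input sequence u with a first-order recurrent perceptron.

            The internal state follows the linear recursion x[t+1] = a*x[t] + b*u[t]
            (an IIR / ARMA-type filter); the label is the sign of the final state
            shifted by a bias term.
            """
            x = 0.0
            for u_t in u:
                x = a * x + b * u_t
            return 1 if x + bias >= 0 else -1

        # Example: classify a short input sequence.
        print(recurrent_perceptron([0.5, -1.0, 2.0], a=0.8, b=1.0))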


  10. E.D. Sontag. Feedforward nets for interpolation and classification. J. Comput. System Sci., 45(1):20-48, 1992. [PDF] [doi:http://dx.doi.org/10.1016/0022-0000(92)90039-L] Keyword(s): machine learning, neural networks, VC dimension, boolean systems.
    Abstract:
    This paper deals with single-hidden-layer feedforward nets, studying various aspects of classification power and interpolation capability. In particular, a worst-case analysis shows that direct input to output connections in threshold nets double the recognition but not the interpolation power, while using sigmoids rather than thresholds allows doubling both. For other measures of classification, including the Vapnik-Chervonenkis dimension, the effect of direct connections or sigmoidal activations is studied in the special case of two-dimensional inputs.
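
    As a sketch of the architecture class being compared (threshold versus sigmoidal hidden units, with or without direct input-to-output connections), the following is an illustrative implementation; the parameter names and the particular output convention are assumptions made here, not taken from the paper.

        import numpy as np

        def one_hidden_layer_classifier(x, W, b, c, v, d=None, activation="threshold"):
            """Single-hidden-layer net with optional direct input-to-output connections.

            Computes c . act(W x + b) + v, adds d . x when direct connections d are
            supplied, and thresholds the result to a {0, 1} label.
            """
            act = (lambda z: (z >= 0).astype(float)) if activation == "threshold" else np.tanh
            h = act(W @ x + b)          # hidden layer (threshold or sigmoidal units)
            y = c @ h + v
            if d is not None:           # direct input-to-output connections
                y = y + d @ x
            return int(y >= 0)

        # Example: a net with 2 inputs, 2 hidden units, and direct connections.
        x = np.array([0.3, -1.2])
        W = np.array([[1.0, -1.0], [0.5, 2.0]])
        b = np.array([0.1, -0.3])
        c = np.array([1.0, -2.0])
        d = np.array([0.7, 0.7])
        print(one_hidden_layer_classifier(x, W, b, c, v=0.0, d=d))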


Conference articles
  1. P. Kuusela, D. Ocone, and E.D. Sontag. On the VC dimension of continuous-time linear control systems. In Proc. 32nd Annual Conf. on Information Sciences and Systems (CISS 98), Princeton, NJ, pages 795-800, 1998.


  2. E.D. Sontag. Some learning and systems-theoretic questions regarding recurrent neural networks. In Proc. Conf. on Information Sciences and Systems (CISS 97), Johns Hopkins, Baltimore, MD, March 1997, pages 630-635, 1997. Keyword(s): machine learning, neural networks, VC dimension, recurrent neural networks.


  3. B. Dasgupta and E.D. Sontag. Sample complexity for learning recurrent perceptron mappings. In D.S. Touretzky, M.C. Moser, and M.E. Hasselmo, editors, Advances in Neural Information Processing Systems 8, pages 204-210, 1996. MIT Press, Cambridge, MA. Keyword(s): machine learning, neural networks, VC dimension, recurrent neural networks.


  4. P. Koiran and E.D. Sontag. Neural networks with quadratic VC dimension. In D.S. Touretzky, M.C. Moser, and M.E. Hasselmo, editors, Advances in Neural Information Processing Systems 8, pages 197-203, 1996. MIT Press, Cambridge, MA. Keyword(s): machine learning, neural networks, VC dimension.


  5. A. Macintyre and E.D. Sontag. Finiteness results for sigmoidal neural networks. In STOC '93: Proceedings of the twenty-fifth annual ACM symposium on Theory of computing, New York, NY, USA, pages 325-334, 1993. ACM Press. [PDF] [doi:http://doi.acm.org/10.1145/167088.167192] Keyword(s): machine learning, neural networks, theory of computing and complexity, real-analytic functions.
    Abstract:
    This paper deals with analog circuits. It establishes the finiteness of VC dimension, teaching dimension, and several other measures of sample complexity which arise in learning theory. It also shows that the equivalence of behaviors, and the loading problem, are effectively decidable, modulo a widely believed conjecture in number theory. The results, the first ones that are independent of weight size, apply when the gate function is the "standard sigmoid" commonly used in neural networks research. The proofs rely on very recent developments in the elementary theory of real numbers with exponentiation. (Some weaker conclusions are also given for more general analytic gate functions.) Applications to learnability of sparse polynomials are also mentioned.


