[1] L. B. Almeida. A learning rule for asynchronous perceptrons with feedback in a combinatorial environment. In IEEE 1st International Conference on Neural Networks, San Diego, volume 2, pages 609-618, 1987.
[2] C. W. Anderson. Learning and Problem Solving with Multilayer Connectionist Systems. PhD thesis, University of Massachusetts, Dept. of Comp. and Inf. Sci., 1986.
[3] A. G. Barto and P. Anandan. Pattern recognizing stochastic learning automata. IEEE Transactions on Systems, Man, and Cybernetics, 15:360-375, 1985.
[4] A. G. Barto and M. I. Jordan. Gradient following without back propagation in layered networks. In IEEE 1st International Conference on Neural Networks, San Diego, volume 2, pages 629-636, 1987.
[5] A. G. Barto, R. S. Sutton, and C. W. Anderson. Neuronlike adaptive elements that can solve difficult learning control problems. IEEE Transactions on Systems, Man, and Cybernetics, SMC-13:834-846, 1983.
[6] R. Bellman. Adaptive Control Processes. Princeton University Press, 1961.
[7] M. Compiani, D. Montanari, R. Serra, and G. Valastro. Classifier systems and neural networks. In E. R. Caianiello, editor, 1st Workshop on Parallel Architectures and Neural Nets, 1989.
[8] J. L. Elman. Finding structure in time. CRL Technical Report 8801, Center for Research in Language, University of California, San Diego, 1988.
[9] S. E. Fahlman. An empirical study of learning speed in back-propagation networks. Technical Report CMU-CS-88-162, Carnegie-Mellon Univ., 1988.
[10] M. Gherrity. A learning algorithm for analog fully recurrent neural networks. In IEEE/INNS International Joint Conference on Neural Networks, San Diego, volume 1, pages 643-644, 1989.
[11] S. Grossberg. Adaptive pattern classification and universal recoding, 1: Parallel development and coding of neural feature detectors. Biological Cybernetics, 23:187-202, 1976.
[12] D. O. Hebb. The Organization of Behavior. Wiley, New York, 1949.
[13] A. Herz, B. Sulzer, R. Kühn, and J. L. van Hemmen. Hebbian learning reconsidered: Representation of static and dynamic objects in associative neural nets. Technical Report 463, Sonderforschungsbereich 123, Universität Heidelberg, 1988.
[14] G. E. Hinton and T. J. Sejnowski. Learning and relearning in Boltzmann machines. In Rumelhart and McClelland [45], pages 282-317.
[15] J. Hochreiter. Implementierung und Anwendung eines `neuronalen' Echtzeit-Lernalgorithmus für reaktive Umgebungen. Fortgeschrittenenpraktikum, Institut für Informatik, Technische Universität München, 1990.
[16] J. H. Holland. Adaptation in Natural and Artificial Systems. University of Michigan Press, Ann Arbor, 1975.
[17] J. J. Hopfield. Neural networks and physical systems with emergent collective computational abilities. Proc. of the National Academy of Sciences, 79:2554-2558, 1982.
[18] R. Huber. Selektive visuelle Aufmerksamkeit: Untersuchungen zum Erlernen von Fokustrajektorien durch neuronale Netze. Diplomarbeit, Institut für Informatik, Technische Universität München, 1990.
[19] J. Jameson. A neurocontroller based on model feedback and the adaptive heuristic critic. In Proc. IEEE/INNS International Joint Conference on Neural Networks, San Diego, volume 2, pages 37-43, 1990.
[20] M. I. Jordan. Serial order: A parallel distributed processing approach. ICS Report 8604, Institute for Cognitive Science, University of California, San Diego, 1986.
[21] M. I. Jordan. Supervised learning and systems with excess degrees of freedom. Technical Report COINS TR 88-27, Massachusetts Institute of Technology, 1988.
[22] M. I. Jordan and R. A. Jacobs. Learning to control an unstable system with forward modeling. In Proc. of the 1990 Connectionist Models Summer School, in press. San Mateo, CA: Morgan Kaufmann, 1990.
[23] T. Kohonen. Self-Organization and Associative Memory. Springer, second edition, 1988.
[24] A. Lapedes and R. Farber. How neural nets work. In D. Z. Anderson, editor, Neural Information Processing Systems: Natural and Synthetic (NIPS). American Institute of Physics, New York, 1987.
[25] Y. LeCun. Une procédure d'apprentissage pour réseau à seuil asymétrique. Proceedings of Cognitiva 85, Paris, pages 599-604, 1985.
[26] G. Lukes. Review of Schmidhuber's paper `Recurrent networks adjusted by adaptive critics'. Neural Network Reviews, 4(1):41-42, 1990.
[27] G. Lukes, B. Thompson, and P. Werbos. Expectation driven learning with an associative memory. In Proc. IEEE International Joint Conference on Neural Networks, Washington, D. C., volume 1, pages 521-524, 1990.
[28] M. Minsky. Steps toward artificial intelligence. In E. Feigenbaum and J. Feldman, editors, Computers and Thought, pages 406-450. McGraw-Hill, New York, 1963.
[29] M. Minsky and S. Papert. Perceptrons. Cambridge, MA: MIT Press, 1969.
[30] D. J. Montana and L. Davis. Training feedforward neural networks using genetic algorithms. Technical report, BBN Systems and Technologies, Inc., Cambridge, MA, 1989.
[31] P. W. Munro. A dual back-propagation scheme for scalar reinforcement learning. Proceedings of the Ninth Annual Conference of the Cognitive Science Society, Seattle, WA, pages 165-176, 1987.
[32] F. Nake. Ästhetik als Informationsverarbeitung. Springer, 1974.
[33] K. S. Narendra and M. A. L. Thathachar. Learning automata - a survey. IEEE Transactions on Systems, Man, and Cybernetics, 4:323-334, 1974.
[34] D. Nguyen and B. Widrow. The truck backer-upper: An example of self learning in neural networks. In IEEE/INNS International Joint Conference on Neural Networks, Washington, D.C., volume 1, pages 357-364, 1989.
[35] D. B. Parker. Learning-logic. Technical Report TR-47, Center for Comp. Research in Economics and Management Sci., MIT, 1985.
[36] D. B. Parker. Optimal algorithms for adaptive networks: Second order back propagation, second order direct propagation, and second order Hebbian learning. In IEEE 1st International Conference on Neural Networks, San Diego, volume 2, pages 593-600, 1987.
[37] B. A. Pearlmutter. Learning state space trajectories in recurrent neural networks. Neural Computation, 1:263-269, 1989.
[38] F. J. Pineda. Dynamics and architecture for neural computation. Journal of Complexity, 4:216-245, 1988.
[39] A. J. Robinson. Dynamic Error Propagation Networks. PhD thesis, Trinity Hall and Cambridge University Engineering Department, 1989.
[40] A. J. Robinson and F. Fallside. The utility driven dynamic error propagation network. Technical Report CUED/F-INFENG/TR.1, Cambridge University Engineering Department, 1987.
[41] T. Robinson and F. Fallside. Dynamic reinforcement driven error propagation networks with application to game playing. In Proceedings of the 11th Conference of the Cognitive Science Society, Ann Arbor, pages 836-843, 1989.
[42] R. Rohwer. The `moving targets' training method. In J. Kindermann and A. Linden, editors, Proceedings of `Distributed Adaptive Neural Information Processing', St. Augustin, 24.-25.5. Oldenbourg, 1989.
[43] R. Rohwer and B. Forrest. Training time-dependence in neural networks. In IEEE 1st International Conference on Neural Networks, San Diego, volume 2, pages 701-708, 1987.
[44] D. E. Rumelhart, G. E. Hinton, and R. J. Williams. Learning internal representations by error propagation. In Rumelhart and McClelland [45], pages 318-362.
[45] D. E. Rumelhart and J. L. McClelland, editors. Parallel Distributed Processing, volume 1. MIT Press, 1986.
[46] D. E. Rumelhart and D. Zipser. Feature discovery by competitive learning. In Rumelhart and McClelland [45], pages 151-193.
[47] A. L. Samuel. Some studies in machine learning using the game of checkers. IBM Journal of Research and Development, 3:210-229, 1959.
[48] J. H. Schmidhuber. Evolutionary principles in self-referential learning, or on learning how to learn: The meta-meta-... hook. Report, Institut für Informatik, Technische Universität München, 1987.
[49] J. H. Schmidhuber. Accelerated learning in back-propagation nets. In R. Pfeifer, Z. Schreter, F. Fogelman-Soulié, and L. Steels, editors, Connectionism in Perspective, pages 429-438. Amsterdam: Elsevier, North-Holland, 1989.
[50] J. H. Schmidhuber. The neural bucket brigade. In R. Pfeifer, Z. Schreter, F. Fogelman-Soulié, and L. Steels, editors, Connectionism in Perspective, pages 439-446. Amsterdam: Elsevier, North-Holland, 1989.
[51] J. H. Schmidhuber. Applying temporal difference methods to fully recurrent reinforcement learning networks. In preparation, 1990.
[52] J. H. Schmidhuber. Learning algorithms for networks with internal and external feedback. In D. S. Touretzky, J. L. Elman, T. J. Sejnowski, and G. E. Hinton, editors, Proc. of the 1990 Connectionist Models Summer School, pages 52-61. San Mateo, CA: Morgan Kaufmann, 1990.
[53] J. H. Schmidhuber. A local learning algorithm for dynamic feedforward and recurrent networks. Connection Science, 1(4):403-412, 1990.
[54] J. H. Schmidhuber. Making the world differentiable: On using fully recurrent self-supervised neural networks for dynamic reinforcement learning and planning in non-stationary environments. Technical Report FKI-126-90 (revised), Institut für Informatik, Technische Universität München, November 1990. (Revised and extended version of an earlier report from February.)
[55] J. H. Schmidhuber. Networks adjusting networks. In J. Kindermann and A. Linden, editors, Proceedings of `Distributed Adaptive Neural Information Processing', St. Augustin, 24.-25.5.1989, pages 197-208. Oldenbourg, 1990. In November 1990 a revised and extended version appeared as FKI-Report FKI-125-90 (revised) at the Institut für Informatik, Technische Universität München.
[56] J. H. Schmidhuber. An on-line algorithm for dynamic reinforcement learning and planning in reactive environments. In Proc. IEEE/INNS International Joint Conference on Neural Networks, San Diego, volume 2, pages 253-258, 1990.
[57] J. H. Schmidhuber. Recurrent networks adjusted by adaptive critics. In Proc. IEEE/INNS International Joint Conference on Neural Networks, Washington, D. C., volume 1, pages 719-722, 1990.
[58] J. H. Schmidhuber. Reinforcement learning with interacting continually running fully recurrent networks. In Proc. INNC International Neural Network Conference, Paris, volume 2, pages 817-820, 1990.
[59] J. H. Schmidhuber. Reinforcement-Lernen und adaptive Steuerung. Nachrichten Neuronale Netze, 2:1-3, 1990.
[60] J. H. Schmidhuber. Response to G. Lukes' review of `Recurrent networks adjusted by adaptive critics'. Neural Network Reviews, 4(1), 1990.
[61] J. H. Schmidhuber. Temporal-difference-driven learning in recurrent networks. In R. Eckmiller, G. Hartmann, and G. Hauske, editors, Parallel Processing in Neural Systems and Computers, pages 209-212. North-Holland, 1990.
[62] J. H. Schmidhuber. Towards compositional learning with dynamic neural networks. Technical Report FKI-129-90, Institut für Informatik, Technische Universität München, 1990.
[63] J. H. Schmidhuber. A possibility for implementing curiosity and boredom in model-building neural controllers. In J. A. Meyer and S. W. Wilson, editors, Proc. of the International Conference on Simulation of Adaptive Behavior: From Animals to Animats, pages 222-227. MIT Press/Bradford Books, 1991.
[64] J. H. Schmidhuber and R. Huber. Learning to generate focus trajectories for attentive vision. Technical Report FKI-128-90, Institut für Informatik, Technische Universität München, 1990.
[65] B. Schürmann. Stability and adaptation in artificial neural systems. Physical Review A, 40:2681-2688, 1989.
[66] R. S. Sutton. Temporal Credit Assignment in Reinforcement Learning. PhD thesis, University of Massachusetts, Dept. of Comp. and Inf. Sci., 1984.
[67] R. S. Sutton. Learning to predict by the methods of temporal differences. Machine Learning, 3:9-44, 1988.
[68] R. S. Sutton. First results with DYNA, an integrated architecture for learning, planning and reacting. In Proceedings of the AAAI Spring Symposium on Planning in Uncertain, Unpredictable, or Changing Environments, 1990.
[69] R. S. Sutton and B. Pinette. The learning of world models by connectionist networks. Proceedings of the 7th Annual Conference of the Cognitive Science Society, pages 54-64, 1985.
[70] C. v.d. Malsburg. The correlation theory of brain function. Internal Report 81-2, Abteilung für Neurobiologie, Max-Planck-Institut für biophysikalische Chemie, Göttingen, 1981.
[71] C. Watkins. Learning from Delayed Rewards. PhD thesis, King's College, 1989.
[72] P. J. Werbos. Beyond Regression: New Tools for Prediction and Analysis in the Behavioral Sciences. PhD thesis, Harvard University, 1974.
[73] P. J. Werbos. Advanced forecasting methods for global crisis warning and models of intelligence. In General Systems, volume XXII, pages 25-38, 1977.
[74] P. J. Werbos. Backpropagation and neurocontrol: A review and prospectus. In IEEE/INNS International Joint Conference on Neural Networks, Washington, D.C., volume 1, pages 209-216, 1989.
[75] P. J. Werbos. Consistency of HDP applied to a simple reinforcement learning problem. Neural Networks, 2:179-189, 1990.
[76] S. D. Whitehead and D. H. Ballard. Active perception and reinforcement learning. Technical Report 331, University of Rochester, Dept. of Comp. Sci., 1990.
[77] R. J. Williams. On the use of backpropagation in associative reinforcement learning. In IEEE International Conference on Neural Networks, San Diego, volume 2, pages 263-270, 1988.
[78] R. J. Williams. Toward a theory of reinforcement-learning connectionist systems. Technical Report NU-CCS-88-3, College of Comp. Sci., Northeastern University, Boston, MA, 1988.
[79] R. J. Williams and L. C. Baird. Draft: A mathematical analysis of actor-critic architectures for learning optimal controls through incremental dynamic programming. Technical report, College of Comp. Sci., Northeastern University, Boston, MA, 1990.
[80] R. J. Williams and D. Zipser. Experimental analysis of the real-time recurrent learning algorithm. Connection Science, 1(1):87-111, 1989.