References

1
L. B. Almeida.
A learning rule for asynchronous perceptrons with feedback in a combinatorial environment.
In IEEE 1st International Conference on Neural Networks, San Diego, volume 2, pages 609-618, 1987.

2
C. W. Anderson.
Learning and Problem Solving with Multilayer Connectionist Systems.
PhD thesis, University of Massachusetts, Dept. of Comp. and Inf. Sci., 1986.

3
A. G. Barto and P. Anandan.
Pattern recognizing stochastic learning automata.
IEEE Transactions on Systems, Man, and Cybernetics, 15:360-375, 1985.

4
A. G. Barto and M. I. Jordan.
Gradient following without back propagation in layered networks.
In IEEE 1st International Conference on Neural Networks, San Diego, volume 2, pages 629-636, 1987.

5
A. G. Barto, R. S. Sutton, and C. W. Anderson.
Neuronlike adaptive elements that can solve difficult learning control problems.
IEEE Transactions on Systems, Man, and Cybernetics, SMC-13:834-846, 1983.

6
R. Bellman.
Adaptive Control Processes.
Princeton University Press, 1961.

7
M. Compiani, D. Montanari, R. Serra, and G. Valastro.
Classifier systems and neural networks.
In E. R. Caianiello, editor, 1st Workshop on Parallel Architectures and Neural Nets, 1989.

8
J. L. Elman.
Finding structure in time.
CRL Technical Report 8801, Center for Research in Language, University of California, San Diego, 1988.

9
S. E. Fahlman.
An empirical study of learning speed in back-propagation networks.
Technical Report CMU-CS-88-162, Carnegie-Mellon Univ., 1988.

10
M. Gherrity.
A learning algorithm for analog fully recurrent neural networks.
In IEEE/INNS International Joint Conference on Neural Networks, San Diego, volume 1, pages 643-644, 1989.

11
S. Grossberg.
Adaptive pattern classification and universal recoding, 1: Parallel development and coding of neural feature detectors.
Biological Cybernetics, 23:187-202, 1976.

12
D. O. Hebb.
The Organization of Behavior.
Wiley, New York, 1949.

13
A. Herz, B. Sulzer, R. Kühn, and J. L. van Hemmen.
Hebbian learning reconsidered: Representation of static and dynamic objects in associative neural nets, 1988.
Technical Report 463, Sonderforschungsbereich 123, Universität Heidelberg.

14
G. E. Hinton and T. J. Sejnowski.
Learning and relearning in Boltzmann machines.
In Rumelhart and McClelland [45], pages 282-317.

15
Josef Hochreiter.
Implementierung und Anwendung eines 'neuronalen' Echtzeit-Lernalgorithmus für reaktive Umgebungen, 1990.
Fortgeschrittenenpraktikum, Institut für Informatik, Technische Universität München.

16
J. H. Holland.
Adaptation in Natural and Artificial Systems.
University of Michigan Press, Ann Arbor, 1975.

17
J. J. Hopfield.
Neural networks and physical systems with emergent collective computational abilities.
Proc. of the National Academy of Sciences, 79:2554-2558, 1982.

18
R. Huber.
Selektive visuelle Aufmerksamkeit: Untersuchungen zum Erlernen von Fokustrajektorien durch neuronale Netze, 1990.
Diplomarbeit, Institut für Informatik, Technische Universität München.

19
J. Jameson.
A neurocontroller based on model feedback and the adaptive heuristic critic.
In Proc. IEEE/INNS International Joint Conference on Neural Networks, San Diego, volume 2, pages 37-43, 1990.

20
M. I. Jordan.
Serial order: A parallel distributed processing approach.
ICS Report 8604, Institute for Cognitive Science, University of California, San Diego, 1986.

21
M. I. Jordan.
Supervised learning and systems with excess degrees of freedom.
Technical Report COINS TR 88-27, Massachusetts Institute of Technology, 1988.

22
M. I. Jordan and R. A. Jacobs.
Learning to control an unstable system with forward modeling.
In Proc. of the 1990 Connectionist Models Summer School. San Mateo, CA: Morgan Kaufmann, 1990. In press.

23
T. Kohonen.
Self-Organization and Associative Memory.
Springer, second edition, 1988.

24
A. Lapedes and R. Farber.
How neural nets work.
In D. Z. Anderson, editor, Neural Information Processing Systems: Natural and Synthetic (NIPS). New York: American Institute of Physics, 1987.

25
Y. LeCun.
Une procédure d'apprentissage pour réseau à seuil asymétrique.
Proceedings of Cognitiva 85, Paris, pages 599-604, 1985.

26
G. Lukes.
Review of Schmidhuber's paper 'Recurrent networks adjusted by adaptive critics'.
Neural Network Reviews, 4(1):41-42, 1990.

27
G. Lukes, B. Thompson, and P. Werbos.
Expectation driven learning with an associative memory.
In Proc. IEEE International Joint Conference on Neural Networks, Washington, D. C., volume 1, pages 521-524, 1990.

28
M. Minsky.
Steps toward artificial intelligence.
In E. Feigenbaum and J. Feldman, editors, Computers and Thought, pages 406-450. McGraw-Hill, New York, 1963.

29
M. Minsky and S. Papert.
Perceptrons.
Cambridge, MA: MIT Press, 1969.

30
D. J. Montana and L. Davis.
Training feedforward neural networks using genetic algorithms.
Technical report, BBN Systems and Technologies, Inc., Cambridge, MA, 1989.

31
P. W. Munro.
A dual back-propagation scheme for scalar reinforcement learning.
Proceedings of the Ninth Annual Conference of the Cognitive Science Society, Seattle, WA, pages 165-176, 1987.

32
F. Nake.
Ästhetik als Informationsverarbeitung.
Springer, 1974.

33
K. S. Narendra and M. A. L. Thathachar.
Learning automata - a survey.
IEEE Transactions on Systems, Man, and Cybernetics, 4:323-334, 1974.

34
D. Nguyen and B. Widrow.
The truck backer-upper: An example of self learning in neural networks.
In IEEE/INNS International Joint Conference on Neural Networks, Washington, D.C., volume 1, pages 357-364, 1989.

35
D. B. Parker.
Learning-logic.
Technical Report TR-47, Center for Comp. Research in Economics and Management Sci., MIT, 1985.

36
D. B. Parker.
Optimal algorithms for adaptive networks: Second order back propagation, second order direct propagation, and second order Hebbian learning.
In IEEE 1st International Conference on Neural Networks, San Diego, volume 2, pages 593-600, 1987.

37
B. A. Pearlmutter.
Learning state space trajectories in recurrent neural networks.
Neural Computation, 1:263-269, 1989.

38
F. J. Pineda.
Dynamics and architecture for neural computation.
Journal of Complexity, 4:216-245, 1988.

39
A. J. Robinson.
Dynamic Error Propagation Networks.
PhD thesis, Trinity Hall and Cambridge University Engineering Department, 1989.

40
A. J. Robinson and F. Fallside.
The utility driven dynamic error propagation network.
Technical Report CUED/F-INFENG/TR.1, Cambridge University Engineering Department, 1987.

41
T. Robinson and F. Fallside.
Dynamic reinforcement driven error propagation networks with application to game playing.
In Proceedings of the 11th Conference of the Cognitive Science Society, Ann Arbor, pages 836-843, 1989.

42
R. Rohwer.
The 'moving targets' training method.
In J. Kindermann and A. Linden, editors, Proceedings of 'Distributed Adaptive Neural Information Processing', St. Augustin, 24.-25.5. 1989. Oldenbourg, 1989.

43
R. Rohwer and B. Forrest.
Training time-dependence in neural networks.
In IEEE 1st International Conference on Neural Networks, San Diego, volume 2, pages 701-708, 1987.

44
D. E. Rumelhart, G. E. Hinton, and R. J. Williams.
Learning internal representations by error propagation.
In Rumelhart and McClelland [45], pages 318-362.

45
D. E. Rumelhart and J. L. McClelland, editors.
Parallel Distributed Processing, volume 1.
MIT Press, 1986.

46
D. E. Rumelhart and D. Zipser.
Feature discovery by competitive learning.
In Rumelhart and McClelland [45], pages 151-193.

47
A. L. Samuel.
Some studies in machine learning using the game of checkers.
IBM Journal of Research and Development, 3:210-229, 1959.

48
J. H. Schmidhuber.
Evolutionary principles in self-referential learning, or on learning how to learn: The meta-meta-... hook, 1987.
Report, Institut für Informatik, Technische Universität München.

49
J. H. Schmidhuber.
Accelerated learning in back-propagation nets.
In R. Pfeifer, Z. Schreter, F. Fogelman-Soulié, and L. Steels, editors, Connectionism in Perspective, pages 429-438. Amsterdam: Elsevier, North-Holland, 1989.

50
J. H. Schmidhuber.
The neural bucket brigade.
In R. Pfeifer, Z. Schreter, F. Fogelman-Soulié, and L. Steels, editors, Connectionism in Perspective, pages 439-446. Amsterdam: Elsevier, North-Holland, 1989.

51
J. H. Schmidhuber.
Applying temporal difference methods to fully recurrent reinforcement learning networks.
In preparation, 1990.

52
J. H. Schmidhuber.
Learning algorithms for networks with internal and external feedback.
In D. S. Touretzky, J. L. Elman, T. J. Sejnowski, and G. E. Hinton, editors, Proc. of the 1990 Connectionist Models Summer School, pages 52-61. San Mateo, CA: Morgan Kaufmann, 1990.

53
J. H. Schmidhuber.
A local learning algorithm for dynamic feedforward and recurrent networks.
Connection Science, 1(4):403-412, 1990.

54
J. H. Schmidhuber.
Making the world differentiable: On using fully recurrent self-supervised neural networks for dynamic reinforcement learning and planning in non-stationary environments.
Technical Report FKI-126-90 (revised), Institut für Informatik, Technische Universität München, November 1990.
(Revised and extended version of an earlier report from February.)

55
J. H. Schmidhuber.
Networks adjusting networks.
In J. Kindermann and A. Linden, editors, Proceedings of 'Distributed Adaptive Neural Information Processing', St. Augustin, 24.-25.5. 1989, pages 197-208. Oldenbourg, 1990.
In November 1990 a revised and extended version appeared as FKI-Report FKI-125-90 (revised) at the Institut für Informatik, Technische Universität München.

56
J. H. Schmidhuber.
An on-line algorithm for dynamic reinforcement learning and planning in reactive environments.
In Proc. IEEE/INNS International Joint Conference on Neural Networks, San Diego, volume 2, pages 253-258, 1990.

57
J. H. Schmidhuber.
Recurrent networks adjusted by adaptive critics.
In Proc. IEEE/INNS International Joint Conference on Neural Networks, Washington, D. C., volume 1, pages 719-722, 1990.

58
J. H. Schmidhuber.
Reinforcement learning with interacting continually running fully recurrent networks.
In Proc. INNC International Neural Network Conference, Paris, volume 2, pages 817-820, 1990.

59
J. H. Schmidhuber.
Reinforcement-Lernen und adaptive Steuerung.
Nachrichten Neuronale Netze, 2:1-3, 1990.

60
J. H. Schmidhuber.
Response to G. Lukes' review of 'Recurrent networks adjusted by adaptive critics'.
Neural Network Reviews, 4(1), 1990.

61
J. H. Schmidhuber.
Temporal-difference-driven learning in recurrent networks.
In R. Eckmiller, G. Hartmann, and G. Hauske, editors, Parallel Processing in Neural Systems and Computers, pages 209-212. North-Holland, 1990.

62
J. H. Schmidhuber.
Towards compositional learning with dynamic neural networks.
Technical Report FKI-129-90, Institut für Informatik, Technische Universität München, 1990.

63
J. H. Schmidhuber.
A possibility for implementing curiosity and boredom in model-building neural controllers.
In J. A. Meyer and S. W. Wilson, editors, Proc. of the International Conference on Simulation of Adaptive Behavior: From Animals to Animats, pages 222-227. MIT Press/Bradford Books, 1991.

64
J. H. Schmidhuber and R. Huber.
Learning to generate focus trajectories for attentive vision.
Technical Report FKI-128-90, Institut für Informatik, Technische Universität München, 1990.

65
B. Schürmann.
Stability and adaptation in artificial neural systems.
Physical Review A, 40(5):2681-2688, 1989.

66
R. S. Sutton.
Temporal Credit Assignment in Reinforcement Learning.
PhD thesis, University of Massachusetts, Dept. of Comp. and Inf. Sci., 1984.

67
R. S. Sutton.
Learning to predict by the methods of temporal differences.
Machine Learning, 3:9-44, 1988.

68
R. S. Sutton.
First results with DYNA, an integrated architecture for learning, planning and reacting.
In Proceedings of the AAAI Spring Symposium on Planning in Uncertain, Unpredictable, or Changing Environments, 1990.

69
R. S. Sutton and B. Pinette.
The learning of world models by connectionist networks.
Proceedings of the 7th Annual Conference of the Cognitive Science Society, pages 54-64, 1985.

70
C. v.d. Malsburg.
The correlation theory of brain function.
Internal Report 81-2, Abteilung für Neurobiologie, Max-Planck-Institut für Biophysik und Chemie, Göttingen, 1981.

71
C. Watkins.
Learning from Delayed Rewards.
PhD thesis, King's College, 1989.

72
P. J. Werbos.
Beyond Regression: New Tools for Prediction and Analysis in the Behavioral Sciences.
PhD thesis, Harvard University, 1974.

73
P. J. Werbos.
Advanced forecasting methods for global crisis warning and models of intelligence.
In General Systems, volume XXII, pages 25-38, 1977.

74
P. J. Werbos.
Backpropagation and neurocontrol: A review and prospectus.
In IEEE/INNS International Joint Conference on Neural Networks, Washington, D.C., volume 1, pages 209-216, 1989.

75
P. J. Werbos.
Consistency of HDP applied to a simple reinforcement learning problem.
Neural Networks, 3:179-189, 1990.

76
S.D. Whitehead and D. H. Ballard.
Active perception and reinforcement learning.
Technical Report 331, University of Rochester, Dept. of Comp. Sci., 1990.

77
R. J. Williams.
On the use of backpropagation in associative reinforcement learning.
In IEEE International Conference on Neural Networks, San Diego, volume 2, pages 263-270, 1988.

78
R. J. Williams.
Toward a theory of reinforcement-learning connectionist systems.
Technical Report NU-CCS-88-3, College of Comp. Sci., Northeastern University, Boston, MA, 1988.

79
R. J. Williams and L. C. Baird.
Draft: A mathematical analysis of actor-critic architectures for learning optimal controls through incremental dynamic programming.
Technical report, College of Comp. Sci., Northeastern University, Boston, MA, 1990.

80
R. J. Williams and D. Zipser.
Experimental analysis of the real-time recurrent learning algorithm.
Connection Science, 1(1):87-111, 1989.


