Acquisition of Active Perception and Recognition by Reinforcement Learning using Neural Network

Summary
An active perception learning system based on reinforcement learning is proposed. It employs a novel reinforcement learning architecture, called Actor-Q, which combines Q-learning and Actor-Critic. The system chooses its action according to Q-values. One action is to move the sensor; each of the others is to answer with a recognition result, one action per pattern. When sensor motion is selected, the sensor moves according to the actor's output signals. The Q-value for sensor motion is trained by Q-learning, and the actor is trained using that Q-value in place of a critic. When one of the other actions is selected, the system outputs the corresponding recognition result. If the answer is correct, its Q-value is trained toward the upper limit of the Q-value; if the answer is incorrect, it is trained toward 0.0. The Q-value module and the actor module each consist of a neural network and are trained by error backpropagation, with training signals generated by the reinforcement learning described above.
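The decision rule and training targets described above can be sketched roughly as follows. This is only an illustrative sketch, not the authors' implementation: the function names, the epsilon-greedy exploration, the discount factor `gamma`, the absence of an immediate reward for sensor motion, and the noise-based actor update are all assumptions made for the example. Only the targets for recognition answers (upper limit if correct, 0.0 if not) are taken directly from the summary.

```python
import random

MOVE = 0  # assumed indexing: action 0 moves the sensor, action k >= 1 answers "pattern k"


def select_action(q_values, epsilon=0.1, rng=random):
    """Choose the action with the largest Q-value.

    Epsilon-greedy exploration is an assumption of this sketch.
    """
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))
    return max(range(len(q_values)), key=lambda a: q_values[a])


def answer_target(action, true_pattern, q_max):
    """Training target for a recognition action.

    Upper limit of the Q-value if the answer is correct, 0.0 otherwise.
    """
    return q_max if action == true_pattern else 0.0


def motion_target(next_q_values, gamma=0.9):
    """Q-learning target for the sensor-motion action.

    Assumes no immediate reward for moving, so the target is the
    discounted maximum Q-value at the next sensor location.
    """
    return gamma * max(next_q_values)


def actor_target(actor_output, noise, q_move, next_q_values, gamma=0.9):
    """Training target for the actor (assumed actor-critic style update).

    The TD error of the motion Q-value stands in for the critic's TD error:
    the output is shifted along the exploration noise that was added to it,
    in proportion to that error.
    """
    td_error = motion_target(next_q_values, gamma) - q_move
    return [o + td_error * n for o, n in zip(actor_output, noise)]
```

In this sketch the targets would then be used as supervised training signals for the two networks via backpropagation; the value of `q_max` is left as a parameter because the summary does not state the upper limit.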

Simulations using a visual sensor with non-uniform visual cells confirmed that the system moves its sensor to a place where it can recognize the presented pattern correctly. Even though the Q-value surface, as a function of the sensor location, has some local peaks, the sensor was not trapped at them and moved in the appropriate direction, because the Q-value for the sensor motion becomes larger.

References
7. Katsunari Shibata, Tetsuo Nishino & Yoichi Okabe:
Active Perception and Recognition Learning System Based on Actor-Q Architecture
Systems and Computers in Japan, Vol. 33, No. 14, pp. 12-22, 2002. 12

6. Katsunari Shibata, Tetsuo Nishino and Yoichi Okabe:
Active Perception Learning System Based on Actor-Q Architecture,
Trans. of IEICE (The Institute of Electronics, Information and Communication Engineers), Vol. J84-D-II, No. 9, pp.2121-2130, 2001. 9 (in Japanese)
柴田克成, 西野哲生, 岡部洋一:
Actor-Qアーキテクチャに基づく能動認識学習システム,
Trans. of IEICE (電子情報通信学会論文誌), Vol. J84-D-II, No. 9, pp.2121-2130, 2001. 9
(in Japanese)
pdf File (10 pages, 894kB)

5. Katsunari Shibata, Tetsuo Nishino, and Yoichi Okabe,
Actor-Q Based Active Perception Learning System,
Proc. of ICRA (Int'l Conf. on Robotics and Automation) 2001, pp.1000-1005, 2001. 5
[Actor-Q architecture, reinforcement learning, active perception, neural network]
pdf File (6 pages, 833kB)

4. Katsunari Shibata, Tetsuo Nishino & Yoichi Okabe:
Learning of Active Perception Based on Reinforcement Learning,
J. of JNNS (Japan Neural Network Society), Vol. 3, No. 4, pp.126-134, 1996.12 (in Japanese)
柴田克成、西野哲生、岡部洋一:
"強化学習による能動認識能力の学習", 日本神経回路学会論文誌, Vol. 3, No. 4, pp.126-134, 1996.12

3. T. Nishino, K. Shibata and Y. Okabe (西野哲生、柴田克成、岡部洋一):
Learning of Viewpoint Movement by Delayed Reinforcement Signals (遅延強化信号による視点移動の学習) (in Japanese),
Technical Report of IEICE (電子情報通信学会技術研究報告),
NC-96-135, pp. 171-178, 1997.3

2. [PS file (4 pages, 155kB)] K. Shibata, T. Nishino and Y. Okabe:
"Active Perception Based on Reinforcement Learning"
Proc. of WCNN'95 Washington D.C., Vol. 2, pp.170-173, 1995.7

1. [PS file (2 pages, 38kB in Japanese)] K. Shibata, T. Nishino and Y. Okabe (柴田克成、西野哲生、岡部洋一):
"Active Perception Based on Reinforcement Learning" (in Japanese)
"強化学習に基づく能動認識"
Proc. of JNNS'94 Tsukuba (日本神経回路学会第5回全国大会講演論文集 つくば),
pp.82-83, 1994.11


