Learning of Reward Distribution

Summary
A novel approach to reward distribution in multi-agent reinforcement learning is proposed. An agent that receives a reward gives part of it to the other agents. If an agent shares part of its own reward, the other agents may help it obtain more reward in return. In some cases, the extra reward the agent gains this way exceeds the amount it gave away, and in those cases sharing is the better choice for the agent. Based on this principle, each agent autonomously learns what ratio of its reward to distribute to the other agents, using its own selfish value function. Simulations demonstrate that each agent acquires a rational reward distribution ratio that depends on the given task.
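The sharing principle described above can be illustrated with a minimal sketch. This is not the method from the papers listed below: the function names, the `ratios` matrix layout, and the simple comparison rule are assumptions made for illustration, and the learning of the ratios themselves is not shown.

```python
def distribute(rewards, ratios):
    """Return each agent's reward after sharing (hypothetical helper).

    rewards[i]   : reward agent i earned from the environment
    ratios[i][j] : fraction of agent i's reward handed to agent j (i != j)
    """
    n = len(rewards)
    after = [0.0] * n
    for i in range(n):
        given = 0.0
        for j in range(n):
            if i == j:
                continue
            share = ratios[i][j] * rewards[i]
            after[j] += share          # agent j receives its share
            given += share
        after[i] += rewards[i] - given  # agent i keeps the remainder
    return after


def sharing_pays_off(reward_if_keeping_all, reward_if_sharing, ratio):
    """True if the net reward after giving `ratio` away beats keeping everything."""
    return reward_if_sharing * (1.0 - ratio) > reward_if_keeping_all


# Agent 0 earns 1.0 and gives a quarter of it to agent 1.
shared = distribute([1.0, 0.0], [[0.0, 0.25], [0.0, 0.0]])
```

Here `shared` is `[0.75, 0.25]`. In the same spirit, `sharing_pays_off(1.0, 2.0, 0.25)` is true, because keeping 75% of a reward of 2.0 is worth more than keeping all of a reward of 1.0.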
References
4. Katsunari Shibata, Tsutomu Masaki and Masanori Sugisaka:
Autonomous Learning of Reward Distribution in Not100 Game,
Proc. of AROB (Int'l Symp. on Artificial Life and Robotics) 8th, pp.78-81, 2003.1
pdf File (4 pages, 116kB)

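3. Katsunari Shibata and Tsutomu Masaki:
Learning of Reward Distribution in Multi-Player Games,
Proc. of the SICE (Society of Instrument and Control Engineers) Systems and Information Division Annual Conference 2002, pp. 15-20, 2002.11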
(in Japanese)
pdf File (6 pages, 292kB)

2. Katsunari Shibata and Koji Ito:
Autonomous Learning of Reward Distribution for Each Agent in Multi-Agent Reinforcement Learning,
Intelligent Autonomous Systems, Vol. 6, pp. 495-502, 2000.7
[Multi-Agent Systems, Reinforcement Learning, Reward Distribution]
pdf File (8 pages, 930kB)

1. Katsunari Shibata and Koji Ito:
Autonomous Learning of Reward Distribution in Two-Agent Reinforcement Learning,
Robotics and Mechatronics Conference (ROBOMEC'99), 1999.6
(in Japanese)

