Unsupervised Predictive Memory in a Goal-Directed Agent

Editor: 映维 (Nweon) | Category: XR | July 30, 2020

Note: We are not able to review papers.

PubDate: Mar 2018

Teams: DeepMind

Writers: Greg Wayne, Chia-Chun Hung, David Amos, Mehdi Mirza, Arun Ahuja, Agnieszka Grabska-Barwinska, Jack Rae, Piotr Mirowski, Joel Z. Leibo, Adam Santoro, Mevlana Gemici, Malcolm Reynolds, Tim Harley, Josh Abramson, Shakir Mohamed, Danilo Rezende, David Saxton, Adam Cain, Chloe Hillier, David Silver, Koray Kavukcuoglu, Matt Botvinick, Demis Hassabis, Timothy Lillicrap

PDF: Unsupervised Predictive Memory in a Goal-Directed Agent

Abstract

Animals execute goal-directed behaviours despite the limited range and scope of their sensors. To cope, they explore environments and store memories maintaining estimates of important information that is not presently available. Recently, progress has been made with artificial intelligence (AI) agents that learn to perform tasks from sensory input, even at a human level, by merging reinforcement learning (RL) algorithms with deep neural networks, and the excitement surrounding these results has led to the pursuit of related ideas as explanations of non-human animal learning. However, we demonstrate that contemporary RL algorithms struggle to solve simple tasks when enough information is concealed from the sensors of the agent, a property called “partial observability”. An obvious requirement for handling partially observed tasks is access to extensive memory, but we show memory is not enough; it is critical that the right information be stored in the right format. We develop a model, the Memory, RL, and Inference Network (MERLIN), in which memory formation is guided by a process of predictive modeling. MERLIN facilitates the solution of tasks in 3D virtual reality environments for which partial observability is severe and memories must be maintained over long durations. Our model demonstrates a single learning agent architecture that can solve canonical behavioural tasks in psychology and neurobiology without strong simplifying assumptions about the dimensionality of sensory input or the duration of experiences.

Link to this article: https://paper.nweon.com/4420
