Teacher Algorithms for Deep RL Agents that Generalize in Procedurally Generated Environments

In this blog post we explore how one can leverage Automatic Curriculum Learning procedures to scaffold Deep Reinforcement Learning agents within complex continuous (procedurally generated) task spaces.

Language as a Cognitive Tool: Dall-E, Humans and Vygotskian RL Agents

This blog post presents a supra-communicative view of language and advocates for the use of language as a cognitive tool to organize the cognitive development of intrinsically motivated artificial agents. We go over studies revealing the cognitive functions of language in humans, cover similar uses of language in the design of artificial agents and advocate for the pursuit of Vygotskian embodied agents - artificial agents that leverage language as a cognitive tool to structure their continuous experience, form abstract representations, reason, imagine creative goals, plan towards them and simulate future possibilities.

Intrinsically Motivated Discovery of Diverse Patterns in Self-Organizing Systems

Self-organisation occurs in many physical, chemical and biological systems, as well as in artificial systems like the Game of Life. Yet, these systems are still full of mysteries and we are far from fully grasping what structures can self-organize, how to represent and classify them, and how to predict their evolution. In this blog post, we present our recent paper which formulates the problem of automated discovery of diverse self-organized patterns in such systems. Using a continuous Game of Life as a testbed, we show how intrinsically-motivated goal exploration processes, initially developed for learning of inverse models in robotics, can efficiently be transposed to this novel application area.

Intrinsically Motivated Modular Multi-Goal RL

In multi-goal RL, agents usually learn multiple variations of a single task (e.g. reaching different positions). CURIOUS extends that setting so that agents can target multiple types of goals with a single controller (e.g. reach, pick and place, stack). Intrinsic motivations based on learning progress (LP) automatically organize learning trajectories, guiding agents to focus on easier goals first and to move towards harder ones as the easier ones are solved. Guided by learning progress, agents do not waste time on impossible tasks (which yield no learning progress) or on tasks that are already solved. In case of sensory failure, agents detect a drop in performance (negative LP) and refocus on the corresponding goals to regain control.
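
To make the learning-progress idea concrete, here is a minimal sketch of how goal types could be sampled in proportion to absolute learning progress. It is an illustration under stated assumptions, not the CURIOUS implementation: the class name `LearningProgressSelector`, the `window` and `epsilon` parameters, the difference-of-success-rates estimate of LP, and the toy success histories are all hypothetical choices made for this example.

```python
import random
from collections import deque


class LearningProgressSelector:
    """Hypothetical helper: samples goal modules in proportion to |LP|."""

    def __init__(self, modules, window=10, epsilon=0.2):
        self.modules = list(modules)
        self.window = window      # number of outcomes per half of the history
        self.epsilon = epsilon    # share of probability spread uniformly
        # Keep the last 2 * window success flags (0.0 or 1.0) per module.
        self.history = {m: deque(maxlen=2 * window) for m in self.modules}

    def record(self, module, success):
        """Store the outcome of one evaluation episode for a module."""
        self.history[module].append(1.0 if success else 0.0)

    def learning_progress(self, module):
        """Recent success rate minus older success rate (can be negative)."""
        outcomes = list(self.history[module])
        if len(outcomes) < 2 * self.window:
            return 0.0  # assumption: no LP estimate until the history is full
        older = sum(outcomes[:self.window]) / self.window
        recent = sum(outcomes[self.window:]) / self.window
        return recent - older

    def probabilities(self):
        """Mix a uniform distribution with one proportional to |LP|."""
        abs_lp = [abs(self.learning_progress(m)) for m in self.modules]
        total = sum(abs_lp)
        n = len(self.modules)
        if total == 0.0:
            return [1.0 / n] * n  # no signal anywhere: explore uniformly
        return [self.epsilon / n + (1.0 - self.epsilon) * lp / total
                for lp in abs_lp]

    def sample(self, rng=random):
        """Draw the next module to practice."""
        return rng.choices(self.modules, weights=self.probabilities(), k=1)[0]


# Toy histories (made up for illustration, not experimental data).
selector = LearningProgressSelector(["reach", "stack", "impossible"],
                                    window=5, epsilon=0.1)
for i in range(10):
    selector.record("reach", True)        # already solved: flat competence
    selector.record("stack", i >= 5)      # improving: positive LP
    selector.record("impossible", False)  # never solved: flat competence
probs = selector.probabilities()

# A module whose competence collapses (e.g. after a sensor failure).
broken = LearningProgressSelector(["reach", "stack"], window=5, epsilon=0.1)
for i in range(10):
    broken.record("reach", i < 5)         # success rate drops: negative LP
    broken.record("stack", True)          # unchanged
broken_probs = broken.probabilities()
```

In the first toy run only "stack" shows non-zero learning progress, so it receives most of the probability mass while the solved and impossible modules keep only the uniform `epsilon` share. In the second run the absolute value of a negative LP pulls attention back to the module whose performance dropped.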

Discovery of independently controllable features through autonomous goal setting

Despite recent breakthroughs in artificial intelligence, machine learning agents remain limited to tasks predefined by human engineers. The autonomous and simultaneous discovery and learning of many tasks in an open world remains very challenging for reinforcement learning algorithms. In this blog post we explore recent advances in developmental learning to tackle the problem of how an agent can learn to represent, imagine and select its own goals.

How Many Random Seeds?

Reproducibility in Machine Learning, and in Deep Reinforcement Learning in particular, has become a serious issue in recent years. In this blog post, we present a statistical guide for performing rigorous comparisons of RL algorithms.