Friday, July 29, 2022

Brief statement of research vision

I thought I would try to very briefly state the research vision that has in some incarnation animated me since I started doing research almost twenty years ago. Obviously, this could take forever and hundreds of pages. But I had some good wine and need to go to bed soon, so I'll try to finish this and post before I fall asleep, thus keeping it short. No editing, just the raw thoughts. Max one page.

The objective is to create more general artificial intelligence. I'm not saying general intelligence, because I don't think truly general intelligence - the ability to solve any solvable task - could exist. I'm just saying considerably more general artificial intelligence than what we have now, in the sense that the same artificial system could do a large variety of different cognitive-seeming things.

The way to get there is to train sets of diverse-but-related agents in persistent generative virtual worlds. Training agents to play particular video games is all good, but we need more than one game, we need lots of different games with lots of different versions of each. Therefore, we need to generate these worlds, complete with rules and environments. This generative process needs to be sensitive to the capabilities and needs/interests of the agents, in the sense that it generates the content that will best help the agents to develop.
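To make that last point a bit more concrete, here is a minimal Python sketch of what "sensitive to the capabilities of the agents" could mean. Everything in it is a stand-in: worlds are reduced to a single difficulty number, estimate_success_rate is a toy placeholder for actually rolling the agent out in a candidate world, and the aim-for-roughly-half-success heuristic is just one possible choice, not something I'm committed to.

    import random

    def estimate_success_rate(agent_skill, world_difficulty):
        # Toy placeholder for rolling the agent out in a candidate world:
        # success is more likely when the agent's skill exceeds the difficulty.
        return 1.0 / (1.0 + 2.718 ** (world_difficulty - agent_skill))

    def generate_world_for(agent_skill, n_candidates=32):
        # Sample candidate worlds (here just difficulty scalars) and keep the
        # one the agent is expected to solve about half the time, so generated
        # content tracks the agent's current capabilities.
        candidates = [random.uniform(0.0, 10.0) for _ in range(n_candidates)]
        return min(candidates,
                   key=lambda d: abs(estimate_success_rate(agent_skill, d) - 0.5))

    print(generate_world_for(agent_skill=3.0))  # picks a world of difficulty around 3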

The agents will need to be trained over multiple timescales, both a faster "individual" timescale and a slower "evolutionary" one; perhaps we will need many more timescales than that. Different learning algorithms might be deployed at different timescales, perhaps gradient descent for lifetime learning and evolution for the longer timescales. The agents need to be diverse - without diversity we will collapse to learning a single thing - but they will also need to build on shared capabilities. A quality-diversity evolutionary process might provide the right framework for this.
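As a toy illustration of the two-timescale idea, here is a MAP-Elites-style quality-diversity loop with a fast inner "lifetime" learning step inside a slow outer "evolutionary" one. Again, everything is a stand-in: hill-climbing instead of gradient descent, a one-line fitness function, a crude behaviour descriptor. The point is only the structure, an archive that keeps the best solution per behavioural niche so the population doesn't collapse to a single thing.

    import random

    def fitness(params):
        # Toy task: get all parameters close to 0.7.
        return -sum((p - 0.7) ** 2 for p in params)

    def lifetime_learning(genome, steps=10):
        # Fast "individual" timescale: a toy hill-climb standing in for
        # gradient-descent learning within one agent's lifetime.
        params = list(genome)
        for _ in range(steps):
            trial = [p + random.gauss(0, 0.05) for p in params]
            if fitness(trial) > fitness(params):
                params = trial
        return fitness(params)

    def behaviour(genome):
        # Crude behaviour descriptor: which of ten "niches" the genome falls into.
        return max(0, min(int(sum(genome) / len(genome) * 10), 9))

    # Slow "evolutionary" timescale: a MAP-Elites-style quality-diversity loop
    # that keeps the best genome found in each behavioural niche.
    archive = {}  # niche -> (score, genome)
    for generation in range(200):
        if archive:
            parent = random.choice(list(archive.values()))[1]
            child = [p + random.gauss(0, 0.1) for p in parent]
        else:
            child = [random.uniform(0, 1) for _ in range(4)]
        score = lifetime_learning(child)   # inner, faster timescale
        niche = behaviour(child)
        if niche not in archive or score > archive[niche][0]:
            archive[niche] = (score, child)

    print(len(archive), "niches filled out of 10")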

Of course, drawing a sharp line between agents and environments is arbitrary and probably a dead end at some point. In the natural world, the environment largely consists of other agents, or is created by other agents, of the same species or of others. Therefore, the environment and rule generation processes should also be agential, and subject to the same constraints and rewards; ideally, there is no difference between "playing" agents and "generating" agents.
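In code, that could be as simple as having playing and generating agents implement the same interface, something like the sketch below; the class names and action formats are made up purely for illustration.

    class Agent:
        # One shared interface: "playing" agents and "generating" agents are
        # the same kind of thing, acting on the world and receiving rewards.
        def act(self, observation):
            raise NotImplementedError

    class PlayerAgent(Agent):
        def act(self, observation):
            # Action within the current world, e.g. a movement command.
            return "move_forward"

    class GeneratorAgent(Agent):
        def act(self, observation):
            # Action on the world itself: an edit such as adding an obstacle.
            return {"add_obstacle": (3, 5)}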

Human involvement could and probably should happen at any stage. The system should be able to identify challenges and deliver them to humans, for example asking them to navigate around a particular obstacle, or to devise a problem that a particular agent can't solve, and things like that. These challenges could be delivered to humans at a massively distributed scale in a way that provides a game-like experience for human participants, allowing them to inject new ideas into the process where the process needs it most and "anchoring" the developing intelligence in human capabilities. The system might model humans' interests and skills to select the most appropriate human participants to present certain challenges to.
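Here is a very rough sketch of what "model humans' interests and skills" might look like, with made-up Human and Challenge records and a scoring rule that favours topics the participant cares about, at a difficulty just above their modelled skill so it stays game-like rather than trivial or hopeless.

    from dataclasses import dataclass

    @dataclass
    class Challenge:
        topic: str
        difficulty: float   # 0 (trivial) to 1 (very hard)

    @dataclass
    class Human:
        name: str
        interests: set      # topics this person cares about
        skill: float        # rough model of their ability, 0 to 1

    def match_score(human, challenge):
        # Prefer topics the participant is interested in, at a difficulty
        # slightly above their modelled skill.
        interest = 1.0 if challenge.topic in human.interests else 0.2
        stretch = max(0.0, 1.0 - abs(human.skill + 0.1 - challenge.difficulty))
        return interest * stretch

    def route(challenge, humans):
        return max(humans, key=lambda h: match_score(h, challenge))

    players = [Human("A", {"navigation"}, 0.4), Human("B", {"level design"}, 0.8)]
    print(route(Challenge("navigation", 0.55), players).name)  # prints A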

Basically, we are talking about a giant, extremely diverse video game-like virtual world with enormous agent diversity constantly creating itself in a process where algorithms collaborate with humans, creating the ferment from which more general intelligence can evolve. This is important because current agential AI is held back by the tasks and environments we present it with far more than by architectures and learning algorithms.

Of course, I phrase this as a project where the objective is to develop artificial intelligence. But you could just as well turn it around and see it as a system that creates interesting experiences for humans. AI for games rather than games for AI. Two sides of the same coin etc. Often, the "scientific objective" of a project is a convenient lie; you develop interesting technology and see where it leads.

I find it fascinating to think about how much of this plan has been there for almost twenty years. Obviously, I've been influenced by what other people think and do research-wise, or at least I really hope so. But I do think the general ideas have more or less been there since the start. And many (most?) of the 300 or so papers that have my name on them (usually with the hard work done by my students and/or colleagues) are in some way related to this overall vision.

The research vision I'm presenting here is certainly way more mainstream now than it was a decade or two ago; many of the ideas now fall under the moniker "open-ended learning". I believe that almost any idea worth exploring is more or less independently rediscovered by many people, and that there comes a time for every good idea when the idea is "in the air" and becomes obvious to everyone in the field. I hope this happens to the vision laid out above, because it means that more of this vision gets realized. But while I'm excited for this, it would also mean that I would have to actively go out and look for a new research vision. This might mean freedom and/or stagnation.

Anyway, I'm falling asleep. Time to hit publish and go to bed.
