The Future of Life Institute (FLI) announced last week that it will use a $10 million (£6 million) grant from PayPal co-founder Elon Musk to fund 37 research projects that aim to ensure artificial intelligence (AI) remains “beneficial”. 

Six of those projects are being conducted by academics in the UK, all of them at Oxford and Cambridge - long considered the country's leading universities and two of the best in the world. 

[Image: Oxford University's Bridge of Sighs. © Flickr/Long Road Photography]

Here the researchers behind the projects describe in a little more detail what they'll be doing as they aim to assess the possible dangers and benefits linked to AI. 

1. Owen Cotton-Barratt (Oxford)

Project Summary: What are the most important projects for reducing the risk of harm from superintelligent artificial intelligence? We will probably not have to deal with such systems for many years - and we do not expect they will be developed with the same architectures we use today. That may make us want to focus on developing long-term capabilities in AI safety research. On the other hand, there are forces pushing us towards working on near-term problems. We suffer from ‘near-sightedness’ and are better at finding the answer to questions that are close at hand. Just as important, work on long-term problems can happen in the future and get extra people attending to it, while work on near-term problems has to happen now if it is to happen at all.

This project models the trade-offs we make when carrying out AI safety projects that aim at various horizons and focus on specific architectures. It estimates crucial parameters - like the time-horizon probability distribution and how near-sighted we tend to be. It uses that model to work out what the AI safety community should be funding, and what it should call on policymakers to do.
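
To make that concrete, here is a minimal toy sketch in Python of the kind of model the summary describes. The timeline distribution, the 15-year cut-off for "near-term", and the near-sightedness discount are invented purely for illustration; they are not parameters or results from the project itself.

```python
# A toy illustration (not the project's actual model) of one trade-off the
# summary describes: effort on near-term problems only pays off if powerful
# AI arrives soon, while effort aimed at the long term is discounted by our
# 'near-sightedness' about distant architectures. Every number and functional
# form below is an assumption chosen for illustration.

import numpy as np

# Assumed probability distribution over the year powerful AI arrives.
years = np.arange(2016, 2101)
timeline_pdf = np.exp(-(years - 2060) ** 2 / (2 * 15.0 ** 2))
timeline_pdf /= timeline_pdf.sum()


def expected_value(effort_near, effort_long, nearsightedness=0.05):
    """Expected value of splitting one unit of today's effort, under the toy
    assumptions that near-term work only matters if AI arrives within 15
    years, and that long-term work loses relevance with distance."""
    value = 0.0
    for year, p in zip(years, timeline_pdf):
        horizon = year - 2016
        if horizon <= 15:
            value += p * effort_near
        value += p * effort_long * np.exp(-nearsightedness * horizon)
    return value


# Compare a few ways of allocating one unit of effort.
for near_share in (0.0, 0.5, 1.0):
    v = expected_value(near_share, 1.0 - near_share)
    print(f"near-term share {near_share:.1f}: expected value {v:.3f}")
```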

Technical Abstract: here.

2. Owain Evans (Oxford)

Project Summary: Previous work in economics and AI has developed mathematical models of preferences or values, along with computer algorithms for inferring preferences from observed human choices. We would like to use such algorithms to enable AI systems to learn human preferences by observing humans make real-world choices. However, these algorithms rely on an assumption that humans make optimal plans and take optimal actions in all circumstances. This is typically false for humans. For example, people's route planning is often worse than Google Maps', because we can't number-crunch as many possible paths. Humans can also be inconsistent over time, as we see in procrastination and impulsive behavior. Our project seeks to develop algorithms that learn human preferences from data despite humans not being Homo economicus and despite the influence of non-rational impulses. We will test our algorithms on real-world data and compare their inferences to people's own judgments about their preferences. We will also investigate the theoretical question of whether this approach could enable an AI to learn the entirety of human values.

Technical Abstract: "Previous work in economics and AI has developed mathematical models of preferences, along with algorithms for inferring preferences from observed actions [1, 2]. We would like to use such algorithms to enable AI systems to learn human preferences from observed actions. However, these algorithms typically assume that agents take actions that maximize expected utility given their preferences. This assumption of optimality is false for humans in real-world domains. Optimal sequential planning is intractable in complex environments and humans perform very rough approximations. Humans often don't know the causal structure of their environment (in contrast to MDP models). Humans are also subject to dynamic inconsistencies, as observed in procrastination, addiction and in impulsive behavior [3]. Our project seeks to develop algorithms that learn human preferences from data despite the suboptimality of humans and the behavioral biases that influence human choice. We will test our algorithms on real-world data and compare their inferences to people's own judgments about their preferences. We will also investigate the theoretical question of whether this approach could enable an AI to learn the entirety of human values.

[1] Luce and Raiffa, "Games and Decisions"

[2] Ng and Russell, "Algorithms for Inverse Reinforcement Learning"

[3] Ainslie, "Breakdown of Will"
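
As a concrete illustration of the kind of inference problem described above, the short Python sketch below infers a single preference parameter (a commuter's value of time) from a handful of observed route choices, assuming a simple "noisily rational" softmax choice model instead of perfect optimality. The scenario, the numbers and the noise model are assumptions made for this example; the project itself targets richer models of human bias and real-world data.

```python
# A minimal sketch of preference inference from imperfect choices: rather
# than assuming the person always picks the utility-maximising option, we
# assume a softmax ('noisily rational') choice model and infer the preference
# parameter by Bayesian updating over a grid. All values are illustrative.

import numpy as np

# Two route options: (minutes of travel, pounds of cost).
options = np.array([[30.0, 2.0],   # slow but cheap
                    [18.0, 5.0]])  # fast but expensive

# Observed choices: index of the option the person actually picked each day.
observed_choices = [1, 1, 0, 1, 1]

# Candidate values for 'value of time' in pounds per minute.
theta_grid = np.linspace(0.01, 1.0, 100)
log_posterior = np.zeros_like(theta_grid)

beta = 1.0  # softmax 'rationality' parameter (assumed known here)

for i, theta in enumerate(theta_grid):
    # Utility = -(cost + value_of_time * minutes); higher is better.
    utilities = -(options[:, 1] + theta * options[:, 0])
    log_probs = beta * utilities - np.log(np.sum(np.exp(beta * utilities)))
    log_posterior[i] = sum(log_probs[c] for c in observed_choices)

posterior = np.exp(log_posterior - log_posterior.max())
posterior /= posterior.sum()
print("Most probable value of time (GBP/min):",
      round(theta_grid[np.argmax(posterior)], 3))
```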

3. Ramana Kumar (Cambridge)

Project Summary: One path to significantly smarter-than-human artificial agents involves self-improvement, i.e., agents doing artificial intelligence research to make themselves even more capable. If such an agent is designed to be robust and beneficial, it should only execute self-modifying actions if it knows they are improvements, which, at a minimum, means being able to trust that the modified agent only takes safe actions. However, trusting the actions of a similar or smarter agent can lead to problems of self-reference, which can be seen as sophisticated versions of the liar paradox (which shows that the self-referential sentence "this sentence is false" cannot be consistently true or false). Several partial solutions to these problems have recently been proposed. However, current software for formal reasoning does not have sufficient support for self-referential reasoning to make these partial solutions easy to implement and study. In this project, we will implement a toy model of agents using these partial solutions to reason about self-modifications, in order to improve our understanding of the challenges of implementing self-referential reasoning, and to stimulate work on tools suitable for it.

Technical Abstract: "Artificially intelligent agents designed to be highly reliable are likely to include a capacity for formal deductive reasoning to be applied in appropriate situations, such as when reasoning about computer programs including other agents and future versions of the same agent. However, it will not always be possible to model other agents precisely: considering more capable agents, only abstract reasoning about their architecture is possible. Abstract reasoning about the behavior of agents that justify their actions with proofs lead to problems of self-reference and reflection: Godel's second incompleteness theorem shows that no sufficiently strong proof system can prove its own consistency, making it difficult for agents to show that actions their successors have proven to be safe are in fact safe (since an inconsistent proof system would be able to prove any action "safe"). Recently, some potential approaches to circumventing this obstacle have been proposed in the form of pen-and-paper proofs.

We propose building and studying implementations of agents using these approaches, to better understand the challenges of implementing tools that are able to support this type of reasoning, and to stimulate work in the interactive theorem proving community on tools of this kind.
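
The toy Python sketch below caricatures the criterion described above: an agent adopts a self-modification only if a safety argument for the successor passes a checker. The "checker" here is a trivial whitelist test standing in for a genuine theorem prover, so it sidesteps rather than exhibits the self-reference problems the project studies; all names and the safety criterion are hypothetical.

```python
# A toy caricature of 'only execute a self-modification if the successor can
# be verified to take only safe actions'. The whitelist check below stands in
# for a genuine proof checker and is an assumption for illustration only.

SAFE_ACTIONS = {"observe", "report", "wait"}


def proof_checks_out(claimed_actions, successor):
    """Mock verifier: the 'proof' is just a claimed action set, and the check
    is that the successor can only emit whitelisted actions it claimed."""
    claimed = set(claimed_actions)
    return claimed <= SAFE_ACTIONS and set(successor.possible_actions) <= claimed


class Agent:
    def __init__(self, possible_actions):
        self.possible_actions = list(possible_actions)

    def consider_self_modification(self, successor, claimed_actions):
        # Adopt the modified agent only if the safety 'proof' verifies;
        # otherwise keep the current version.
        if proof_checks_out(claimed_actions, successor):
            return successor
        return self


current = Agent(["observe", "report"])
proposed = Agent(["observe", "report", "wait"])
current = current.consider_self_modification(proposed, ["observe", "report", "wait"])
print(current.possible_actions)  # the modification is adopted
```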

4. Michael Wooldridge (Oxford)

Project Summary: Codes of ethics play an important role in many sciences. Such codes aim to provide a framework within which researchers can understand and anticipate the possible ethical issues that their research might raise, and to provide guidelines about what is, and is not, regarded as ethical behaviour. In the medical sciences, for example, codes of ethics are fundamentally embedded within the research culture of the discipline, and explicit consideration of ethical issues is a standard expectation when research projects are planned and undertaken. In this project, we aim to start developing a code of ethics for AI research by learning from this interdisciplinary experience and extending its lessons into new areas. The project will bring together three Oxford researchers with expertise in artificial intelligence, philosophy, and applied ethics.

Technical Abstract: Codes of ethics play an important role in many sciences. Such codes aim to provide a framework within which researchers can understand and anticipate the possible ethical issues that their research might raise, and to provide guidelines about what is, and is not, regarded as ethical behaviour. In the medical sciences, especially, codes of ethics are fundamentally embedded within the research culture, and explicit consideration of ethical issues is a standard expectation when research projects are planned and undertaken. The aim of this project is to develop a solid basis for a code of artificial intelligence (AI) research ethics, learning from the scientific and medical community's experience with existing ethical codes, and extending its lessons into three important and representative areas where artificial intelligence comes into contact with ethical concerns: AI in medicine and biomedical technology, autonomous vehicles, and automated trading agents. We will also explore whether the design of ethical research codes might usefully anticipate, and potentially ameliorate, the risks of future research into superintelligence. The project brings together three Oxford researchers with highly relevant expertise in artificial intelligence, philosophy, and applied ethics, and will also draw strongly on other research activity within the University of Oxford.

5. Adrian Weller (Cambridge)

Project Summary: We are unsure about what moral system is best for humans, let alone for potentially super-intelligent machines. It is likely that we shall need to create artificially intelligent agents to provide moral guidance and police issues of appropriate ethical values and best practice, yet this poses significant challenges. Here we propose an initial evaluation of the strengths and weaknesses of one avenue by investigating self-policing intelligent agents. We shall explore two themes: (i) adding a layer of AI agents whose express purpose is to police other AI agents and report unusual or undesirable activity (potentially this might involve setting traps to catch misbehaving agents, and may consider if it is wise to allow policing agents to take corrective action against offending agents); and (ii) analyzing simple models of evolving adaptive agents to see if robust conclusions can be learned. We aim to survey related literature, identify key areas of hope and concern for future investigation, and obtain preliminary results for possible guarantees. The proposal is for a one year term to explore the ideas and build initial models, which will be made publicly available, ideally in journals or at conferences or workshops, with extensions likely if progress is promising.
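
As a rough illustration of theme (i), the Python sketch below shows a policing agent that watches an action stream from other agents and reports anything outside an agreed repertoire. The agents, the action names and the simple whitelist rule are assumptions made for this example, not anything proposed by the project.

```python
# A minimal sketch of a 'policing' layer: one agent observes the actions of
# other AI agents and reports unusual or undesirable activity. The agreed
# repertoire and the example action stream are illustrative assumptions.

from collections import Counter


class PolicingAgent:
    def __init__(self, expected_actions):
        self.expected_actions = set(expected_actions)
        self.reports = []

    def observe(self, agent_id, action):
        # Report any action outside the agreed repertoire of behaviours.
        if action not in self.expected_actions:
            self.reports.append((agent_id, action))

    def summary(self):
        # Count reports per offending agent.
        return Counter(agent_id for agent_id, _ in self.reports)


police = PolicingAgent(expected_actions={"trade", "query", "log"})
for agent_id, action in [("A", "trade"), ("B", "query"),
                         ("B", "exfiltrate_data"), ("A", "log")]:
    police.observe(agent_id, action)

print(police.reports)    # [('B', 'exfiltrate_data')]
print(police.summary())  # Counter({'B': 1})
```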

6. Nick Bostrom (Oxford)

Project Summary: We propose the creation of a joint Oxford-Cambridge research center, which will develop policies to be enacted by governments, industry leaders, and others in order to minimize risks and maximize benefit from artificial intelligence (AI) development in the longer term. The center will focus explicitly on the long-term impacts of AI, the strategic implications of powerful AI systems as they come to exceed human capabilities in most domains of interest, and the policy responses that could best be used to mitigate the potential risks of this technology.

There are reasons to believe that unregulated and unconstrained development could incur significant dangers, both from "bad actors" like irresponsible governments, and from the unprecedented capability of the technology itself. For past high-impact technologies (e.g. nuclear fission), policy has often followed implementation, giving rise to catastrophic risks. It is important to avoid this with superintelligence: safety strategies, which may require decades to implement, must be developed before broadly superhuman, general-purpose AI becomes feasible.

This center represents a step change in technology policy: a comprehensive initiative to formulate, analyze, and test policy and regulatory approaches for a transformative technology in advance of its creation.

Technical Abstract: here 
