CoPS SIGNED

Coevolutionary Policy Search

Project "CoPS" data sheet

The following table provides information about the project.

Coordinator	THE CHANCELLOR, MASTERS AND SCHOLARS OF THE UNIVERSITY OF OXFORD
Coordinator address	WELLINGTON SQUARE, UNIVERSITY OFFICES, OXFORD OX1 2JD
Coordinator website	www.ox.ac.uk
Coordinator Country	United Kingdom [UK]
Total cost	1,480,632 €
EC max contribution	1,480,632 € (100%)
Programme	1. H2020-EU.1.1. (EXCELLENT SCIENCE - European Research Council (ERC))
Code Call	ERC-2014-STG
Funding Scheme	ERC-STG
Starting year	2015
Duration (year-month-day)	from 2015-10-01 to 2021-09-30

Partnership

Take a look at the project's partnership.

#	participants	country	role	EC contrib. [€]
1	THE CHANCELLOR, MASTERS AND SCHOLARS OF THE UNIVERSITY OF OXFORD	UK (OXFORD)	coordinator	1,480,632.00

Project objective

I propose to develop a new class of decision-theoretic planning methods that overcome fundamental obstacles to the efficient optimization of autonomous agents. Creating agents that are effective in diverse settings is a key goal of artificial intelligence with enormous potential implications: robotic agents would be invaluable in homes, factories, and high-risk settings; software agents could revolutionize e-commerce, information retrieval, and traffic control. The main challenge lies in specifying an agent's policy: the behavioral strategy that determines its actions. Since the complexity of realistic tasks makes manual policy construction hopeless, there is great demand for decision-theoretic planning methods that automatically discover good policies. Despite enormous progress, the grand challenge of efficiently discovering effective policies for complex tasks remains unmet. A fundamental obstacle is the cost of policy evaluation: estimating a policy's quality by averaging performance over multiple trials. This cost grows quickly with increases in task complexity (making trials more expensive) or stochasticity (necessitating more trials). To address this difficulty, I propose a new approach that simultaneously optimizes both policies and the manner in which those policies are evaluated. The key insight is that, in many tasks, many trials are wasted because they do not elicit the controllable rare events critical for distinguishing between policies. Thus, I will develop methods that leverage coevolution to automatically discover the best events, instead of sampling them randomly. If successful, this project will greatly improve the efficiency of decision-theoretic planning and, in turn, help realize the potential of autonomous agents. In addition, by automatically identifying the most useful events, the resulting methods will help isolate critical factors in performance and thus yield new insights into what makes decision-theoretic problems hard.
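
The evaluation cost described in the objective can be illustrated with a minimal sketch. It is not the project's method: the toy environment, the `policy_caution` parameter, the reward values, and the function names are all hypothetical and chosen only for illustration. The sketch estimates a policy's quality by averaging reward over randomly sampled trials, in a task where policies differ only when a rare event occurs.

```python
import random


def run_trial(policy_caution, rng, rare_event_prob):
    """Simulate one trial of a toy task and return its reward.

    Hypothetical environment: a rare event occurs with probability
    rare_event_prob, and only then does the policy's behaviour matter.
    """
    if rng.random() < rare_event_prob:
        # A fully cautious policy (1.0) loses nothing; a reckless one (0.0) loses 100.
        return -100.0 * (1.0 - policy_caution)
    # Without the rare event, every policy earns the same reward.
    return 1.0


def evaluate_policy(policy_caution, n_trials, rare_event_prob=0.01, seed=0):
    """Estimate policy quality by averaging reward over n_trials random trials."""
    rng = random.Random(seed)
    total = sum(
        run_trial(policy_caution, rng, rare_event_prob) for _ in range(n_trials)
    )
    return total / n_trials


if __name__ == "__main__":
    # Compare a cautious and a reckless policy at increasing trial budgets.
    for n_trials in (10, 100, 10_000):
        cautious = evaluate_policy(1.0, n_trials)
        reckless = evaluate_policy(0.0, n_trials)
        print(n_trials, cautious, reckless)
```

Under these assumptions the true expected rewards are 0.99 for the cautious policy and -0.01 for the reckless one, yet any trial in which the rare event does not occur returns the same reward for both. Such trials carry no information for telling the policies apart, which is the waste the objective proposes to avoid by choosing events deliberately rather than sampling them at random.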

Publications

List of publications.
year	authors and title	journal	last update
2018	Jakob Foerster, Gregory Farquhar, Triantafyllos Afouras, Nantas Nardelli and Shimon Whiteson. Counterfactual Multi-Agent Policy Gradients		2019-08-30
2018	Jakob Foerster, Richard Chen, Maruan Al-Shedivat, Shimon Whiteson, Pieter Abbeel and Igor Mordatch. Learning with Opponent-Learning Awareness		2019-08-30
2017	Jakob Foerster, Nantas Nardelli, Greg Farquhar, Phil Torr, Pushmeet Kohli and Shimon Whiteson. Stabilising Experience Replay for Deep Multi-Agent Reinforcement Learning		2019-08-30
2018	Gregory Farquhar, Tim Rocktäschel, Maximilian Igl and Shimon Whiteson. TreeQN and ATreeC: Differentiable Tree-Structured Models for Deep Reinforcement Learning		2019-08-30
2018	Kamil Ciosek and Shimon Whiteson. Expected Policy Gradients		2019-08-30
2016	Jakob Foerster, Yannis Assael, Nando de Freitas and Shimon Whiteson. Learning to Communicate with Deep Multi-Agent Reinforcement Learning		2019-08-30
2017	Kamil Ciosek and Shimon Whiteson. OFFER: Off-Environment Reinforcement Learning		2019-08-30
2018	Supratik Paul, Konstantinos Chatzilygeroudis, Kamil Ciosek, Jean-Baptiste Mouret, Michael Osborne and Shimon Whiteson. Alternating Optimisation and Quadrature for Robust Control		2019-08-30
2018	Kyriacos Shiarlis, Markus Wulfmeier, Sasha Salter, Shimon Whiteson and Ingmar Posner. TACO: Learning Task Decomposition via Temporal Alignment for Control		2019-08-30
2018	Kamil Ciosek and Shimon Whiteson. Expected Policy Gradients for Reinforcement Learning	2	2019-08-30
2018	Tabish Rashid, Mikayel Samvelyan, Christian Schroeder de Witt, Gregory Farquhar, Jakob Foerster and Shimon Whiteson. QMIX: Monotonic Value Function Factorisation for Deep Multi-Agent Reinforcement Learning	2	2019-08-30
2018	Matthew Fellows, Kamil Ciosek and Shimon Whiteson. Fourier Policy Gradients		2019-08-30
2018	Jakob Foerster, Gregory Farquhar, Maruan Al-Shedivat, Tim Rocktäschel, Eric P. Xing and Shimon Whiteson. DiCE: The Infinitely Differentiable Monte-Carlo Estimator	2	2019-08-30
2018	Maximilian Igl, Luisa Zintgraf, Tuan Anh Le, Frank Wood and Shimon Whiteson. Deep Variational Reinforcement Learning for POMDPs	1	2019-08-30

The information about "CoPS" is provided by the European Opendata Portal: CORDIS opendata.