Overview

The aim of this workshop is to bring the expertise of a diverse set of researchers to bear on the evaluation of general-purpose AI systems. To date, most AI systems have been tested on specific tasks. However, to be considered truly intelligent, a system should be flexible enough to learn to perform a wide variety of tasks, some of which may not be known until after the system is deployed. This workshop will examine formalisations, methodologies and test benches for evaluating the many aspects of such general AI systems. More specifically, we are interested in theoretical or experimental research focused on developing concepts, tools and clear metrics to characterise and measure the intelligence, and other cognitive abilities, of general AI agents. We are interested in questions such as: Can the various tasks and benchmarks in AI provide a general basis for evaluating and comparing a broad range of such systems? Can there be a theory of tasks, or of cognitive abilities, that enables a more direct comparison and characterisation of AI systems? How does the specificity of an AI agent relate to how quickly it can approach optimal performance?
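One concrete illustration of what such a metric can look like (a well-known proposal from this literature, not a position of the workshop) is Legg and Hutter's universal intelligence measure, which scores an agent by its expected reward across all computable environments, weighting each environment by its simplicity:

$$\Upsilon(\pi) = \sum_{\mu \in E} 2^{-K(\mu)} \, V^{\pi}_{\mu}$$

where $E$ is the class of computable reward-bounded environments, $K(\mu)$ is the Kolmogorov complexity of environment $\mu$, and $V^{\pi}_{\mu}$ is the expected cumulative reward of agent $\pi$ in $\mu$. Under such a definition, an agent specialised to a narrow band of environments scores lower than an equally performing but more flexible one, which is one way to formalise the specificity-versus-generality question above.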

Call for Papers

Topics of Interest

We welcome regular papers, demo papers about benchmarks or tools, and position papers, and encourage discussion of a broad (non-exhaustive) list of topics:

  • Analysis and comparisons of AI benchmarks and competitions. Lessons learnt.

  • Proposals for new general tasks, evaluation environments, workbenches and general AI development platforms.

  • Theoretical or experimental accounts of the space of tasks, abilities and their dependencies.

  • Evaluation of development in robotics and other autonomous agents, and cumulative learning in general learning systems.

  • Tasks and methods for evaluating: transfer learning, cognitive growth, structural self-modification and self-programming.

  • Evaluation of social, verbal and other general abilities in multi-agent systems, video games and artificial social ecosystems.

  • Evaluation of autonomous systems: cognitive architectures and multi-agent systems versus general components: machine learning techniques, SAT solvers, planners, etc.

  • Unified theories for evaluating intelligence and other cognitive abilities, independently of the kind of subject (humans, animals or machines): universal psychometrics.

  • Analysis of reward aggregation and utility functions, and of environment properties (Markov, ergodic, etc.), in the characterisation of reinforcement learning tasks (a minimal sketch follows this list).

  • Methods supporting automatic generation of tasks and problems with systematically introduced variations.

  • Better understanding and characterisation of task requirements and difficulty (energy, time, trials needed, etc.), beyond algorithmic complexity.

  • Evaluation of AI systems using generalised cognitive tests for humans. Computer models taking IQ tests. Psychometric AI.

  • Application of (algorithmic) information theory, game theory, theoretical cognition and theoretical evolution for the definition of metrics of cognitive abilities.

  • Adaptation of evaluation tools from comparative psychology and psychometrics to AI: item response theory, adaptive testing, hierarchical factor analysis (a second sketch follows this list).

  • Characterisation and evaluation of artificial personalities.

  • Evaluation methods for multi-resolutional perception in AI systems and agents.
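
To make the reward-aggregation topic concrete, here is a minimal sketch (all behaviours and numbers are hypothetical, not from the workshop) showing how two standard aggregators, discounted return and average reward, can rank the same two behaviours differently, so the choice of aggregator is itself part of the task definition:

```python
def discounted_return(rewards, gamma=0.95):
    """Geometrically discounted aggregation: early rewards dominate."""
    return sum(r * gamma ** t for t, r in enumerate(rewards))

def average_reward(rewards):
    """Time-average aggregation: only the long-run reward rate matters."""
    return sum(rewards) / len(rewards)

# Two hypothetical behaviours over the same 100-step episode: one earns
# all its reward early and then stops, the other earns a steady trickle.
early = [1.0] * 10 + [0.0] * 90
steady = [0.2] * 100

for name, rewards in [("early", early), ("steady", steady)]:
    print(name,
          "discounted:", round(discounted_return(rewards), 2),
          "average:", round(average_reward(rewards), 3))
```

The "early" behaviour wins under discounting (about 8.0 vs 4.0) while the "steady" behaviour wins on average reward (0.2 vs 0.1), which is exactly the kind of sensitivity to aggregation and environment properties the topic refers to.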
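For the psychometrics topic, here is a minimal sketch of adaptive testing with a two-parameter logistic (2PL) item response model. The item bank, the simulated agent and all parameter values are hypothetical, and a real adaptive test would use a proper ability estimator rather than this coarse grid search:

```python
import math
import random

def p_correct(theta, a, b):
    """2PL item response function: probability that a subject of ability
    theta answers correctly an item with discrimination a and difficulty b."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))

def item_information(theta, a, b):
    """Fisher information of a 2PL item at ability level theta."""
    p = p_correct(theta, a, b)
    return a * a * p * (1.0 - p)

def adaptive_test(subject, item_bank, n_items=20):
    """Administer the unused item that is most informative at the current
    ability estimate, then re-estimate ability by grid-search maximum
    likelihood over the responses so far."""
    responses, theta_hat = [], 0.0
    unused = list(item_bank)
    grid = [t / 10.0 for t in range(-40, 41)]  # candidate ability levels
    for _ in range(n_items):
        item = max(unused, key=lambda ab: item_information(theta_hat, *ab))
        unused.remove(item)
        a, b = item
        responses.append((a, b, subject(a, b)))

        def log_lik(theta):
            ll = 0.0
            for ai, bi, correct in responses:
                p = p_correct(theta, ai, bi)
                ll += math.log(p if correct else 1.0 - p)
            return ll

        theta_hat = max(grid, key=log_lik)
    return theta_hat

if __name__ == "__main__":
    random.seed(0)
    true_theta = 1.2  # latent ability of the simulated agent (hypothetical)
    bank = [(random.uniform(0.5, 2.0), random.uniform(-3.0, 3.0))
            for _ in range(200)]  # (discrimination, difficulty) pairs
    agent = lambda a, b: random.random() < p_correct(true_theta, a, b)
    print("estimated ability:", adaptive_test(agent, bank))
```

Selecting the most informative remaining item at each step is what makes the test adaptive: easy items are skipped once the subject's estimated ability is high, so far fewer items are needed than in a fixed test.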

Important Dates

  • Conference dates: August 29–30, 2016

  • Registration deadline: August 30, 2016
