About the Event

The third edition of the Paris-Saclay Junior Conference on Data Science and Engineering is aimed at first-year PhD students, M2 students, and third-year engineering school students at Paris-Saclay. It will offer these students the opportunity to present scientific work carried out during internships or in the first year of their thesis, and to sharpen their critical thinking at a professional conference hosting prestigious invited speakers, academics, and industry scientists.

The conference aims to bring together a large audience of master's, engineering school, and PhD students, and is an excellent way to discover the research world in Data Science and Engineering.

PhD students are also involved in organizing the conference as reviewers, session chairs, and organizers of networking events.

Machine Learning -- What’s next?

For many Machine Learning (ML) problems, labeled data is readily available. When this is the case, algorithms and training time are the performance bottleneck. This is the ML researcher’s paradise! Vision and Speech are good examples of such problems because they have a stable distribution and additional human labels can be collected each year. Problems that extract their labels from history, such as click prediction, data analytics, and forecasting are also blessed with large numbers of labels. Unfortunately, there are only a few problems for which we can rely on such an endless supply of free labels. They receive a disproportionally large amount of attention from the media.

We are interested in tackling the much larger class of ML problems where labeled data is sparse. For example, consider a dialog system for a specific app to recognize specific commands such as “lights on first floor off”, “increase spacing between 2nd and 3rd paragraph”, “make doctor appointment after Hawaii vacation”. Anyone who has attempted building such a system has soon discovered that generalizing to new instances from a small custom set of labeled instances is far more difficult than they originally thought. Each domain has its own generalization challenges, data exploration and discovery, custom features, and decomposition structure. Creating labeled data to communicate custom knowledge is inefficient. It also leads to embarrassing errors resulting from over-training on small sets. ML algorithms and processing power are not a bottleneck when labeled data is scarce. The bottleneck is the teacher and the teaching language.

To address this problem, we change our focus from the learning algorithm to teachers. We define “Machine Teaching” as improving the human productivity given a learning algorithm. If ML is the science and engineering of extracting knowledge from data, Machine Teaching is the science and engineering of extracting knowledge from teachers. A similar shift of focus has happened in computer science. While computing is revolutionizing our lives, systems sciences (e.g., programming languages, operating systems, networking) have shifted their foci to human productivity. We expect a similar trend will shift science from Machine Learning to Machine Teaching.

The aim of this talk is to convince the audience that we are asking the right questions. We provide some answers and some spectacular results. The most exciting part, however, is the research opportunities that come with the emergence of a new field.

Patrice Simard is a Distinguished Engineer in the Microsoft Research AI Lab in Redmond. He is passionate about finding new ways to combine engineering and science in the field of machine learning. Simard’s research is currently focused on human teachers. His goal is to extend the teaching language, science, and engineering, beyond the traditional (input, label) pairs. Simard completed his PhD thesis in Computer Science at the University of Rochester in 1991. He then spent 8 years at AT&T Bell Laboratories working on neural networks. He joined Microsoft Research in 1998. In 2002, he started MSR’s Document Processing and Understanding research group. In 2006, he left MSR to become the Chief Scientist and General Manager of Microsoft’s Live Labs Research. In 2009, he became the Chief Scientist of Microsoft’s AdCenter (the organization that monetizes Bing search). In 2012, he returned to Microsoft Research to work on his passion, Machine Learning research. Specifically, he founded the Computer-Human Interactive Learning (CHIL) group to study Machine Teaching and to make machine learning accessible to everyone.

ORGANIZING COMMITTEE

Fatiha Saïs 
Chair, Université Paris-Sud, UPsay

Olivier Fercoq 
Co-chair, Télécom ParisTech, UPsay

Isabelle Huteau 
Conference Communication Leader, Digicosme

Antoine Naulet 
M2 DataScience

STEERING COMMITTEE

Florence d'Alché-Buc 
Télécom ParisTech, UPsay

Sylvain Arlot 
Université Paris-Sud, UPsay

Albert Bifet 
Télécom ParisTech, UPsay

Sarah Cohen-Boulakia 
Université Paris-Sud, UPsay

Flora Jay 
CNRS, UPsay

Joseph Salmon 
Télécom ParisTech, UPsay

Karine Zeitouni 
Université de Versailles-Saint-Quentin-en-Yvelines, UPSay

SCIENTIFIC COMMITTEE

Florence d'Alché-Buc 
Télécom ParisTech, UPsay

Zacharie Ales 
ENSTA ParisTech, UPsay

Sarah Cohen-Boulakia 
Université Paris-Sud, UPsay

Marco Cuturi 
ENSAE ParisTech, UPSay

Olivier Fercoq 
Télécom ParisTech, UPsay

Alexandre Gramfort 
Inria Saclay, UPSay

Flora Jay 
CNRS, UPsay

Cristina Manfredotti 
AgroParisTech, UPSay

Nicoleta Preda 
Université de Versailles-Saint-Quentin-en-Yvelines, UPsay

Gianluca Quercini 
Centrale-Supélec, UPSay

Fatiha Saïs 
Université Paris-Sud, UPsay

Michaël Thomazo 
Inria Saclay, UPSay

Paola Tubaro 
CNRS, UPSay

Call for Papers

Important Dates

  • Abstract submission deadline: 2018-05-25
  • Draft submission deadline: 2018-05-25
  • Draft acceptance date: 2018-06-29

The topics of the conference are listed below: 

  • data mining
  • databases
  • big data analytics
  • machine learning
  • statistics
  • semantic web
  • scientific workflows
  • distributed data and computing
  • applications of data science (biomedical and biological data, physics, chemistry, smart cities, image, documents, audio, video, on-line advertisement, ...)

The extended abstracts will be reviewed by the scientific program committee, possibly including one junior PC member (PhD student or postdoc in data science). They will be selected for oral (15 min) or poster presentation (flash talks, poster and poster-demo sessions) according to their originality and relevance to the conference topics. All the presentations should be in English. Electronic versions of the extended abstracts will be accessible on the conference web site. The book of abstracts will not be published and the extended abstracts will not constitute a formal publication.

Note: Only Master M2 and PhD students from Université Paris-Saclay are invited to contribute.

Author Guidelines

1ST CALL FOR CONTRIBUTIONS:

We invite Master M2 and PhD students from Université Paris-Saclay to submit an extended abstract of up to 3 pages describing new or preliminary results of their work in one of the three forms: oral talk, poster or poster-demo (all in English). Master students are encouraged to submit posters even if they do not have substantial results at the time of submission. Submissions should be formatted according to the Springer Lecture Notes in Computer Science style.

We will award a prize for the best presentation.

Extended Abstract Instructions:

  • Please follow this simplified template: PDF, LaTeX source, or Word template

  • Language = English

  • Number of pages = 1 to 3

  • Maximum number of tables or figures = 1

  • Mandatory paragraphs = Abstract, Keywords, Motivation, References (max 10 ref.)

Each extended abstract must be submitted online via the EasyChair submission system.

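Important Dates

  • Conference dates: 2018-09-13 to 2018-09-14
  • Abstract submission deadline: 2018-05-25
  • Draft submission deadline: 2018-05-25
  • Draft acceptance notification: 2018-06-29
  • Registration deadline: 2018-09-14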
