The third edition of the Paris-Saclay Junior Conference on Data Science and Engineering is aimed at first-year PhD students, M2 students, and third-year students at engineering schools at Paris-Saclay. It offers these students the opportunity to present scientific work carried out during internships or in the first year of their thesis, and to sharpen their critical thinking at a professional conference hosting prestigious invited speakers, academics, and industry scientists.
The conference aims to bring together a large audience of master's, engineering school, and PhD students, and is an excellent way to discover the research world in Data Science and Engineering.
PhD students are also involved in the conference organization as reviewers, session chairs, and organizers of networking events.
Machine Learning -- What’s next?
For many Machine Learning (ML) problems, labeled data is readily available. When this is the case, algorithms and training time are the performance bottleneck. This is the ML researcher’s paradise! Vision and Speech are good examples of such problems because they have a stable distribution and additional human labels can be collected each year. Problems that extract their labels from history, such as click prediction, data analytics, and forecasting are also blessed with large numbers of labels. Unfortunately, there are only a few problems for which we can rely on such an endless supply of free labels. They receive a disproportionally large amount of attention from the media.
We are interested in tackling the much larger class of ML problems where labeled data is sparse. For example, consider a dialog system for a specific app to recognize specific commands such as “lights on first floor off”, “increase spacing between 2nd and 3rd paragraph”, “make doctor appointment after Hawaii vacation”. Anyone who has attempted building such a system has soon discovered that generalizing to new instances from a small custom set of labeled instances is far more difficult than they originally thought. Each domain has its own generalization challenges, data exploration and discovery, custom features, and decomposition structure. Creating labeled data to communicate custom knowledge is inefficient. It also leads to embarrassing errors resulting from over-training on small sets. ML algorithms and processing power are not a bottleneck when labeled data is scarce. The bottleneck is the teacher and the teaching language.
To address this problem, we change our focus from the learning algorithm to teachers. We define “Machine Teaching” as improving the human productivity given a learning algorithm. If ML is the science and engineering of extracting knowledge from data, Machine Teaching is the science and engineering of extracting knowledge from teachers. A similar shift of focus has happened in computer science. While computing is revolutionizing our lives, systems sciences (e.g., programming languages, operating systems, networking) have shifted their foci to human productivity. We expect a similar trend will shift science from Machine Learning to Machine Teaching.
The aim of this talk is to convince the audience that we are asking the right questions. We provide some answers and some spectacular results. The most exciting part, however, is the research opportunities that come with the emergence of a new field.
Patrice Simard is a Distinguished Engineer in the Microsoft Research AI Lab in Redmond. He is passionate about finding new ways to combine engineering and science in the field of machine learning. Simard’s research is currently focused on human teachers. His goal is to extend the teaching language, science, and engineering, beyond the traditional (input, label) pairs. Simard completed his PhD thesis in Computer Science at the University of Rochester in 1991. He then spent 8 years at AT&T Bell Laboratories working on neural networks. He joined Microsoft Research in 1998. In 2002, he started MSR’s Document Processing and Understanding research group. In 2006, he left MSR to become the Chief Scientist and General Manager of Microsoft’s Live Labs Research. In 2009, he became the Chief Scientist of Microsoft’s AdCenter (the organization that monetizes Bing search). In 2012, he returned to Microsoft Research to work on his passion, Machine Learning research. Specifically, he founded the Computer-Human Interactive Learning (CHIL) group to study Machine Teaching and to make machine learning accessible to everyone.
Fatiha Saïs
Chair, Université Paris-Sud, UPsay
Olivier Fercoq
Co-chair, Télécom ParisTech, UPsay
Isabelle Huteau
Conference Communication Leader, Digicosme
Antoine Naulet
M2 DataScience
STEERING COMMITTEE
Florence d'Alché-Buc
Télécom ParisTech, UPsay
Sylvain Arlot
Université Paris-Sud, UPsay
Albert Bifet
Télécom ParisTech, UPsay
Sarah Cohen-Boulakia
Université Paris-Sud, UPsay
Flora Jay
CNRS, UPsay
Joseph Salmon
Télécom ParisTech, UPsay
Karine Zeitouni
Université de Versailles-Saint-Quentin-en-Yvelines, UPsay
SCIENTIFIC COMMITTEE
Florence d'Alché-Buc
Télécom ParisTech, UPsay
Zacharie Ales
ENSTA ParisTech, UPsay
Sarah Cohen-Boulakia
Université Paris-Sud, UPsay
Marco Cuturi
ENSAE ParisTech, UPsay
Olivier Fercoq
Télécom ParisTech, UPsay
Alexandre Gramfort
Inria Saclay, UPsay
Flora Jay
CNRS, UPsay
Cristina Manfredotti
AgroParisTech, UPsay
Nicoleta Preda
Université de Versailles-Saint-Quentin-en-Yvelines, UPsay
Gianluca Quercini
CentraleSupélec, UPsay
Fatiha Saïs
Université Paris-Sud, UPsay
Michaël Thomazo
Inria Saclay, UPsay
Paola Tubaro
CNRS, UPsay
The topics of the conference are listed below:
The extended abstracts will be reviewed by the scientific program committee, possibly including one junior PC member (PhD student or postdoc in data science). They will be selected for oral (15 min) or poster presentation (flash talks, poster and poster-demo sessions) according to their originality and relevance to the conference topics. All the presentations should be in English. Electronic versions of the extended abstracts will be accessible on the conference web site. The book of abstracts will not be published and the extended abstracts will not constitute a formal publication.
Note: Only Master M2 and PhD students from Université Paris-Saclay are invited to contribute.
1ST CALL FOR CONTRIBUTIONS:
We invite Master M2 and PhD students from Université Paris-Saclay to submit an extended abstract of up to 3 pages describing new or preliminary results of their work in one of three forms: oral talk, poster, or poster-demo (all in English). Master students are encouraged to submit posters even if they do not have substantial results at the time of submission. Submissions should be formatted according to the Springer Lecture Notes in Computer Science style.
A prize will be awarded for the best presentation.
Extended Abstract Instructions:
Please follow this simplified template: PDF, LaTeX source, or Word template
Language = English
Number of pages = 1 to 3
Maximum number of tables or figures = 1
Mandatory paragraphs = Abstract, Keywords, Motivation, References (max. 10 references)
Each extended abstract must be submitted online via the EasyChair submission system.
September 13, 2018
September 14, 2018