2014 Workshop on Web as Corpus

Event Overview

The World Wide Web has become increasingly popular as a source of linguistic data, not only within the NLP communities but also among theoretical linguists facing problems of data sparseness or data diversity. Accordingly, web corpora continue to gain importance, given their size and their diversity of genres and text types. However, the field is still new, and a number of issues in web corpus construction still need much research, both fundamental and applied. These range from questions of corpus design (e.g., corpus composition assessment, sampling strategies and their relation to crawling algorithms, handling of duplicated material) to more technical aspects (e.g., efficient implementation of individual post-processing steps in document cleansing and linguistic annotation, or large-scale parallelization to achieve web-scale corpus construction). Similarly, the systematic evaluation of web corpora, for example in the form of task-based comparisons to traditional corpora, has only lately shifted into focus.

For almost a decade, the ACL SIGWAC, and especially the highly successful Web as Corpus (WaC) workshops, have served as a platform for researchers interested in building and working with web-derived corpora. Past workshops have been co-located with major conferences on computational linguistics and/or corpus linguistics (such as EACL, LREC, WWW, Corpus Linguistics).

As part of the workshop, we will have a panel discussion dedicated to planning a shared task for WaC-10 (2015), including the nomination of organizers of the shared task. The tracks of the shared task will focus on the quality of web corpus creation tools, tools for linguistic annotation (at least lemmatization, possibly also POS tagging, etc.), and the quality of web corpora themselves.

Call for Papers

Important Dates

Abstract submission deadline: 2014-01-30

Important Dates

Conference date: 2014-04-26
Abstract submission deadline: 2014-01-30
Registration deadline: 2014-04-26

Organizer

European Chapter of the Association for Computational Linguistics (EACL)
