Users of big data are often not computer scientists, and even experts find it nontrivial to optimize the performance of big data applications because there are so many decisions to make. For example, users must first choose among many big data systems, including those for structured data (e.g., Apache HBase, MongoDB, Apache Hive, Apache Accumulo, Presto, Spark SQL), graph data (e.g., Pregel, Giraph, GraphX, GraphLab), and streaming data (e.g., Apache Storm, Apache Heron, Apache Flink, Samza). In addition, each system has numerous parameters that must be tuned to optimize its performance. To complicate matters further, users may care not only about response time or throughput, but also about quality of results, monetary cost, security and privacy, and energy efficiency.
In more traditional relational databases, these complexities are handled by the query optimizer and other automatic tuning tools (e.g., index selection tools), and there are benchmarks for comparing the performance of different products. Such tools are not available for big data environments, where the problem is probably more complicated than it is for traditional relational databases. The aim of this workshop is to bring researchers and practitioners together to better understand the problems of optimization and performance tuning in big data environments, to propose new approaches to address such problems, and to develop related benchmarks, tools, and best practices.
Topics of interest include, but are not limited to:
Theoretical and empirical performance models for big data applications
Benchmarks and comparative studies for big data processing and analytic platforms
Monitoring, analysis, and visualization of performance in big data environments
Workflow/process management and optimization in big data environments
Performance tuning and optimization for specific big data platforms or applications (e.g., NoSQL databases, graph processing systems, stream systems, SQL-on-Hadoop databases)
Performance tuning and optimization for specific data sets (e.g., scientific data, spatial data, temporal data, text data, images, videos, mixed datasets)
Case studies and best practices for performance tuning for big data
Cost models and performance prediction in big data environments
Impact of security/privacy settings on performance of big data systems
Self-adaptive or automatic tuning tools for big data applications
Big data application optimization on High Performance Computing (HPC) and Cloud environments
December 11, 2017
December 14, 2017
Registration deadline