Designing Good Algorithms for Map-Reduce and Beyond (Full Day)

Foto N. Afrati (NTUA and Google), Magdalena Balazinska (University of Washington), Anish Das Sarma (Google), Bill Howe (University of Washington), Semih Salihoglu (Stanford University), and Jeffrey D. Ullman (Stanford University)

The goal of this tutorial is to present techniques and algorithms for programming systems such as Hadoop (map-reduce). Two important issues to be covered are:
  1. Balancing computation time against communication time to pick the best within a family of similar algorithms.
  2. Managing skew, i.e., the need to wait for the last of many parallel tasks to finish.
In addition, the tutorial will examine classes of problems for which a small number of map-reduce jobs is not adequate. Options, including extensions of map-reduce and some entirely different systems, will be explored.
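
As a rough illustration of the skew issue named above, the following Python sketch (not taken from the tutorial) assigns records to reducers by key and treats the time of the reduce phase as the load of the busiest reducer. The function names, the modulo partitioner, and the unit cost per record are all assumptions made for the example.

```python
def reducer_loads(keys, num_reducers):
    """Count how many records each reducer receives.

    Assumption: integer keys are partitioned by key modulo the number
    of reducers, a stand-in for a real hash partitioner.
    """
    loads = [0] * num_reducers
    for key in keys:
        loads[key % num_reducers] += 1
    return loads


def phase_time(loads, cost_per_record=1.0):
    """Assumed cost model: the phase ends when the busiest reducer ends."""
    return max(loads) * cost_per_record


# 1000 records with distinct keys: the work spreads evenly.
uniform_keys = list(range(1000))

# 1000 records where one key (0) accounts for 700 of them.
skewed_keys = [0] * 700 + list(range(1, 301))

uniform_loads = reducer_loads(uniform_keys, 10)
skewed_loads = reducer_loads(skewed_keys, 10)

print("uniform:", uniform_loads, "time:", phase_time(uniform_loads))
print("skewed: ", skewed_loads, "time:", phase_time(skewed_loads))
```

Both inputs contain the same number of records, but under this cost model the skewed input takes far longer because one reducer receives most of the work while the others sit idle.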

Open Source Cloud Technologies (Half Day)

Salman A. Baset (IBM Research)

Open source cloud technologies such as OpenStack, CloudStack, OpenNebula, Eucalyptus, OpenShift, and Cloud Foundry have gained significant momentum in the last few years. The first part of this tutorial will provide an overview of OpenStack and CloudStack, two open source infrastructure-as-a-service (IaaS) cloud platforms. The second part of the tutorial will present a detailed analysis of different OpenStack components, namely, glance (image service), nova (compute service), keystone (identity service), quantum (network service), and swift (object storage service). In particular, the tutorial will describe the scheduling and provisioning process in OpenStack, and how different configuration options lead to widely varying provisioning performance. Further, the tutorial will describe how OpenStack has evolved over releases. Finally, the tutorial will describe weaknesses of OpenStack and highlight important areas where researchers can contribute.