### Sunday, May 22

5:00pm Registration and reception

6:00 – 9:00pm Introduction and playback with Omer Reingold (Samsung Research)

### Monday, May 23

8:00 – 8:30am Breakfast in the dorms

8:45 – 9:45am Anna Karlin (U. Washington)

*Two pricing problems, *video

10:00 – 11:00 am Leanne Meyer (CMU Heinz College)

*The Art of Asking for It! Negotiation Strategies for Women, *video

11:00 – 11:30am Break

11:30 – 12:30 pm Elette Boyle (IDC)

*Oblivious RAM, *video

12:30 – 2:00pm Lunch

2:00 – 3:00pm Virginia Vassilevska Williams (Stanford)

*A Fine-Grained Approach to Algorithms and Complexity, *video

3:30 – 6:30pm Student rump sessions, 3 min per presentation

3:30 – 4:10pm Rump session 1

4:10 – 4:40pm Break

4:40 – 5:20pm Rump session 2

5:20 – 5:50pm Break

5:50 – 6:30pm Rump session 3

7:15 – 8:00pm Dinner in the dorms

**Tuesday, May 24**

8:00 – 8:30am Breakfast in the dorms

8:45 – 9:45am Edith Cohen (Google Research and Tel Aviv University)

*Weighted sampling for scalable analytics of large data sets, *video

10:00 –11:00am Mor Harchol-Balter (CMU)

*Smarter Task Assignment for Server Farms via Replication, *video

11:00 – 11:30am Break

11:30 –12:30 pm Dahlia Malkhi (VMWare)

*Reliable Distributed Systems Foundations in Practice, *video

12:30 – 2:00 pm Lunch

2:00 – 3:00pm Cynthia Dwork (MSR)

*Theory for Society, *video

3:15 – 5:15pm Panel

Banquet

**Wednesday, May 25**

8:00 – 8:30am Breakfast in dorms

8:30 – 9:30am Mary Wootters (CMU)

*Locality in error correcting codes, *video

10:00 am Leave for Google

Google Talks

Liadan O’Callaghan: Challenges in Search Ad Auctions

Maya Gupta: GlassBox Machine Learning

Gagan Aggarwal: Ad Auction Design

Zoya Svitkina: Optimization in Resource Planning

Anelia Angelova: Applied Deep Learning for Computer Vision

**Titles and Abstracts**

**Title**: Two pricing problems

**Speaker**: Anna Karlin, U. Washington

**Abstract**: In the first part of the talk, I’ll describe work with Chawla, Devanur and Sivan on a pricing problem where a buyer is interested in purchasing/using a good, such as an app or music or software, repeatedly over time. A common assumption in auction theory and mechanism design is that a consumer knows how much he values an item at the moment he is considering a purchase and that he has an accurate prior over how much he will value that item in the future (if not precise knowledge). In this work, we explore scenarios where the consumer discovers his valuation for a good as he uses it, and his value per usage evolves as a martingale. We show how to construct simple pricing mechanisms that yield approximately optimal seller revenue regardless of the buyer’s risk profile.

In the second part of the talk, I’ll describe results with Fiat, Goldner and Koutsoupias on how to maximize auctioneer revenue when he is selling a service for which buyers have both a value and a deadline. In this setting, we show how to use linear programming duality to design the optimal auction.

**Title:** The Art of Asking for It! Negotiation Strategies for Women

**Speaker: **Leanne Meyer, CMU Heinz College

**Abstract: **Do you * ask for what you want* or are you

*that will advance your career and/or compensation? Learn how to better navigate the barriers we know women face as they negotiate on behalf of their teams, the organization, and themselves.*

**waiting to be offered the opportunity**During this presentation, Carnegie Mellon Leadership and Negotiation Academy Program Director, Leanne Meyer, will help participants recognize opportunities to negotiate, eliminate anxiety, feel entitled to get what they want and avoid social consequences that inhibit good outcomes for you and your organization.

**Title:** Oblivious RAM

**Speaker**: Elette Boyle, IDC

**Abstract:** An Oblivious RAM (ORAM), introduced by Goldreich and Ostrovsky (JACM 1996), is a (probabilistic) RAM that hides its access pattern: i.e, for every input the memory locations accessed are similarly distributed. Since their inception, ORAMs have become an invaluable tool in designing cryptographic systems, where observable memory access patterns crucially must not leak sensitive information.

We present a survey of cryptographic ORAM research, from the original constructions to the most recent discoveries, and the mysteries that lie ahead.

**Title**: A Fine-Grained Approach to Algorithms and Complexity

**Speaker**: Virginia Vassilevska Williams, Stanford

**Abstract**: A central goal of algorithmic research is to determine how fast computational problems can be solved in the worst case. Theorems from complexity theory state that there are problems that, on inputs of size n, can be solved in t(n) time but not in t(n)^{1-ε} time for ε>0. The main challenge is to determine where in this hierarchy various natural and important problems lie. Throughout the years, many ingenious algorithmic techniques have been developed and applied to obtain blazingly fast algorithms for many problems. Nevertheless, for many other central problems, the best known running times are essentially those of the classical algorithms devised for them in the 1950s and 1960s.

Unconditional lower bounds seem very difficult to obtain, and so practically all known time lower bounds are conditional. For years, the main tool for proving hardness of computational problems have been NP-hardness reductions, basing hardness on P≠NP. However, when we care about the exact running time (as opposed to merely polynomial vs non-polynomial), NP-hardness is not applicable, especially if the running time is already polynomial. In recent years, a new theory has been developed, based on “fine-grained reductions” that focus on exact running times. The goal of these reductions is as follows. Suppose problem A is solvable in a(n) time and problem B in b(n) time, and no a(n)^{1-ε} and b(n)^{1-ε} algorithms are known for A and B respectively (for constant ε>0). Then, if A is fine-grained reducible to problem B (for a(n) and b(n)), a b(n)^{1-ε} time algorithm for B (for any ε>0) implies an a(n)^{1-ε’} algorithm for A (for some ε’>0). Now, mimicking NP-hardness, the approach is to (1) select a key problem X that is conjectured to require t(n)^{1-o(1)} time for some t, and (2) reduce X in a fine-grained way to many important problems. This approach has led to the discovery of many meaningful relationships between problems, and even sometimes to equivalence classes.

In this talk I will give an overview or the current progress in this area of study, and will highlight some new exciting developments.

**Title**: Weighted sampling for scalable analytics of large data sets

**Speaker**: Edith Cohen, Google Research and Tel Aviv University

**Abstract**: Random sampling is a classic statistical tool that is used to summarize and perform analytics on data that is too large to store or scalably process. Suppose that our dataset consists of *keys* (cookies, users, queries) with numeric positive weights. We want to estimate the sum of some function *f* of the weights over keys in a specified segment; we call such a sum an *f*-statistic. For example, the number of users from a certain zipcode can be expressed in this way. Weighted sampling can be applied to estimate such *f*-statistics. Very common statistics are the number of active keys in the segment, the sum of their weights, and capped sums, where weights are capped by a parameter. Classic sampling schemes achieve optimal tradeoffs of sample size and estimation quality for a particular $f$-statistics, roughly, by including keys with probability proportional to $f$ applied to their weight.

This talk will focus on extensions of this classic tool to two important regimes. The first, *multi-objective* sampling schemes, obtain optimal samples that provide estimates with specified statistical guarantees for a set of functions. We will also show that the sample-size overhead (with respect to the optimum sample for a particular *f*) is surprisingly small for important classes of functions. The second addresses scalability on streamed or distributed data with *unaggregated*presentation: Each key has multiple elements and its weight is the sum of these element weights, but we are interested in sampling without costly aggregation. We will present a simple sampling framework with novel estimators, which in particular, provided the first solution to capping statistics, used in online advertising.

**Title**: Smarter Task Assignment for Server Farms via Replication

**Speaker**: Mor Harchol-Balter, CMU

**Abstract: **Every time you go to the check-out counter in a supermarket, you have a decision to make: which queue do you join? Do you pick the queue with the fewest customers, the one with the least total number of items, pick a random queue, or do something entirely different? This question of picking the right queue so as to minimize mean response time is known as the “task assignment problem” and is a very old and classic problem in the performance modeling community.

In this talk we examine task assignment in a new light, where there is server-side variability in addition to job-size variability. We ask how replication of jobs (sending a job to multiple queues) can help reduce response time in light of these two types of variability. We present several new results on the performance analysis of server farms under replication, including power-of-d replication and Join-Idle-Queue replication.

Bottom Line: By the end of this talk, you will have a much better sense of what you should do at the supermarket

**Title**: Foundations of Reliable Distributed Systems in Practice

**Speaker**: Dahlia Malkhi, VMware Research

**Abstract: **Distributed systems are difficult to build, algorithms are subtle and bug-prone, and performance is often at odds with consistency. It is therefore crucial to have solid algorithmic foundations that are succinct and simple to follow.

This talk examines how some of the most well known foundations in distributed computing evolved through the drive of practical impact. For example, the past decade exposed interesting anomalies in fundamental replication schemes such as Paxos. It saw the emergence of a new algorithm called Vertical Paxos for managing dynamic systems, that bridges between the Paxos theory and engineering practices.

Yet, despite increasing insight into distributed systems, there is still a strong deficiency in the set of tools available for building scale-out data platforms. State-of-the-art tools for scaling systems, for example NOSQL systems, typically partition state to achieve scalability. With such systems, obtaining a consistent view of global state often requires complicated and expensive distributed protocols.

This led a team of researchers at Microsoft to design a new method, called Corfu, that provides high-throughput scale-out replication with strong consistency guarantees. Corfu spreads updates over a large cluster, each replicated over a small group of nodes. All groups are “stitched” into a global *log* using a soft-state sequencer, which is neither an IO bottleneck, nor a single point of failures.

Fast forward to today, we describe how Corfu is being used by a team of researchers at VMware Research to build a distributed control plane for VMware’s software defined networking (SDN) technology.

**Title**: Theory for Society

**Speaker**: Cynthia Dwork, MSR

**Abstract**: As technology reaches increasingly deeply into our everyday lives, changing the fabric of society woven by our interactions – with one another, with government, with the corporate world – design decisions have increasing impact on basic social values such as privacy, fairness, and protection from physical and emotional harm. Complexity of this type requires mathematically rigorous notions that allow us to quantify these goods and their loss, to explore fundamental tradeoffs and limitations, and to lay the theoretical groundwork for what can be achieved. This talk describes efforts of this type in two areas: privacy-preserving data analysis and fairness in classification, and touches on an agenda for future directions.

**Title:** Locality in error correcting codes**
Speaker: ** Mary Wootters, CMU

**Abstract:**In the traditional set-up for error-correcting codes, a sender (Alice) wants to send a message to a receiver (Bob); the catch is that some errors may get introduced along the way. One solution to their problem is an error-correcting code. Alice will encode her message, introducing some redundancy, and send the resulting codeword to Bob; even if errors are introduced, the extra redundancy will allow Bob to recover the original message.

While originally introduced for the purpose of communication, error-correcting codes have found many uses beyond Alice and Bob. One feature of an error-correcting code which has been especially useful is that of *locality:* If Bob only wants part of Alice’s message, must he look at the entire codeword? It turns out that, no, he needn’t, and moreover, this fact has been very useful in many application domains, including complexity theory, cryptography, and storage.

In this talk, I’ll discuss different notions of locality in coding theory, and their many applications. I’ll also talk about some recent work on one notion of locality—repair bandwidth—which is useful for distributed storage.