Most activities take place at the Maxwell Dworkin Laboratory (33 Oxford Street, Cambridge, MA 02138), Harvard University. Please note that a “G” before a room number denotes the ground floor. The banquet Thursday night takes place at the Harvard Faculty Club (20 Quincy St), a short walk across campus from Maxwell Dworkin.

An interactive Harvard campus map is here, and directions to the campus are here.

**Tuesday, June 19**

**Location**: Maxwell Dworkin rooms G115 & 119 & ground floor lobby

5:30pm Registration and reception

6:30pm Introduction and playback with Omer Reingold (Stanford) and Nir Shavit (MIT).

**Wednesday, June 20**

**Location**: Maxwell Dworkin rooms G115 & 119 & ground floor lobby

8:30am Breakfast

9:30am – 10:30am Barna Saha (U. Mass)

*Space & Time Efficient Algorithms for Lipschitz Problems*

10:30am – 11:00am Break

11:00am – 12:00 pm Nancy Lynch (MIT)

*Robust Ant Colony Algorithms: Density Estimation and House-Hunting*

12:00pm – 2:00pm Lunch with local faculty (Maxwell Dworkin room 119 & 1st floor lobby)

2:00pm – 3:00pm Anisha Asundi (Harvard)

*Gender and Technology: How to Debias our Organizations and Workplaces*

3:00pm – 3:30pm Break

3:30pm Student rump sessions, 3 min per presentation

3:30pm – 4:10pm Rump session 1

4:10pm – 4:40pm Break

4:40pm – 5:20pm Rump session 2

Dinner on your own

**Thursday, June 21**

**Location**: Maxwell Dworkin rooms G115 & 119 & ground floor lobby

8:30am Breakfast

9:30am – 10:30am Julia Kempe

*Computer Science and Finance*

10:30am – 11:00am Break

11:00am – 12:00pm Bonnie Berger (MIT)

*Compressive Algorithms for Biomedical Data Analysis*

12:00pm – 2:00pm Lunch (Maxwell Dworkin room 119 & 1st floor lobby)

2:00pm – 3:00pm Gillat Kol (Princeton)

*Interactive Information Theory*

3:00pm – 3:15pm Break

3:15pm – 5:15pm Panel

Banquet: Harvard Faculty Club, 20 Quincy St, Cambridge

**Friday, June 22**

**Location**: Maxwell Dworkin rooms G115 & 119 & ground floor lobby

8:30am Breakfast

9:00am – 10:00am Yael Kalai (Microsoft)

*A Survey on Delegating Computations Under Standard Assumptions*

10:00am – 10:30am Break

10:30am – 12:00pm Industry talks and panel

Gwen Beaven (Google): *Google for Jobs*

Lucy Hadden (Google): *Featured Snippets*

Rania Khalaf (IBM)

Emily Shen (MIT Lincoln Lab): *Secure Multi-Party Computation Applications and Techniques*

12:00pm – 2:00pm Lunch (Maxwell Dworkin room 119 & 1st floor lobby)

2:00pm – 3:00pm Alina Ene (Boston U.)

*Faster and distributed submodular function optimization*

Adjourn

### Abstracts

**Space & Time Efficient Algorithms for Lipschitz Problems**

*Barna Saha (U. Mass)*

Many fundamental sequence problems exhibit the Lipschitz property: whenever the input changes slightly, the change in the output is also bounded. Examples include the longest increasing subsequence problem (studied since Schensted, 1961), string edit distance computation (studied since Levenshtein, 1965), the language edit distance problem (studied since Aho and Peterson, 1972), and RNA folding (studied since Jacobson and Nussinov, 1980). For all these problems, there exist simple polynomial-time algorithms based on dynamic programming.
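As an illustration of the kind of dynamic program the abstract refers to, here is the classic textbook Levenshtein edit-distance recurrence (a standard sketch, not the speaker's improved algorithm). Note the Lipschitz property in action: changing one character of an input changes the answer by at most 1.

```python
def edit_distance(a: str, b: str) -> int:
    """Classic O(n*m)-time, O(n*m)-space Levenshtein DP."""
    n, m = len(a), len(b)
    # dp[i][j] = edit distance between prefixes a[:i] and b[:j]
    dp = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(n + 1):
        dp[i][0] = i          # delete all of a[:i]
    for j in range(m + 1):
        dp[0][j] = j          # insert all of b[:j]
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            dp[i][j] = min(dp[i - 1][j] + 1,          # deletion
                           dp[i][j - 1] + 1,          # insertion
                           dp[i - 1][j - 1] + cost)   # match/substitution
    return dp[n][m]
```

The quadratic table is exactly the space bottleneck that space-efficient techniques such as those in the talk aim to avoid.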

Dynamic programming is one of the most systematic ways of developing polynomial-time algorithms, but it often suffers from high space and time complexity. In this talk, I will show a simple technique of amnesic dynamic programming with which we can improve either the time or the space complexity of Lipschitz problems, allowing for additive approximations. For some of these problems, it is also possible to design an exact algorithm that runs faster than the corresponding dynamic program via a reduction to Lipschitz (min,+)-matrix multiplication. Time permitting, I will discuss how this casts serious doubt on the All Pairs Shortest Paths conjecture, which is the cornerstone for showing cubic hardness of a large collection of fundamental graph and matrix problems.
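The (min,+) matrix product mentioned above is a standard operation; a naive cubic-time sketch (illustrative only, not the faster reduction from the talk) makes its connection to shortest paths concrete:

```python
def min_plus_product(A, B):
    """Naive O(n^3) (min,+) product: C[i][j] = min_k (A[i][k] + B[k][j]).

    If A is a graph's edge-weight matrix (0 on the diagonal), then
    repeatedly (min,+)-squaring A yields all-pairs shortest path
    distances, which is why faster (min,+) products would break the
    APSP conjecture's cubic barrier.
    """
    n = len(A)
    return [[min(A[i][k] + B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]
```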

**Robust Ant Colony Algorithms: Density Estimation and House-Hunting**

*Nancy Lynch (MIT)*

My research group has been studying Biological Distributed Algorithms for around four years, mainly algorithms for insect colonies but also for some other biological systems such as brain networks. These algorithms have many interesting characteristics:

They tend to be simple to describe, but hard to analyze. They are typically probabilistic, and solve problems only approximately. They are flexible (work in different environments), robust (to failures), and adaptive (to changes during operation). These are interesting features for distributed algorithms—not just biological algorithms but also engineered distributed algorithms.

We are studying these algorithms for two reasons: in order to understand the behavior of biological systems, and in order to extract ideas from biological systems that may help in designing and analyzing algorithms for wireless networks and other engineered systems.

Another issue of interest is composition of algorithms. We would like to understand how one can combine (probabilistic, approximate) biological distributed algorithms for simple problems to obtain algorithms for more complex problems.

This talk is mostly about a particular example: an ant colony density estimation algorithm recently developed by [Lynch, Musco, and Su], together with an ant colony house-hunting algorithm from Radeva’s thesis. I will describe these algorithms and their guarantees separately, and then show how they can be combined. This suggests many new directions for further research.

**Gender and Technology: How to Debias our Organizations and Workplaces**

*Anisha Asundi*

Research Fellow: Gender Specialist

Women and Public Policy Program, Harvard Kennedy School

The gender gap in technology produces major disparities in economic opportunity, reproduces biases in new technology, and impoverishes the talent pool for innovation. At the same time, the tech sector offers the tools and capacity for data collection and analytics that in turn power innovative and evidence-based solutions, potentially with broad implications for other areas of gender inequality. This talk will explore the interventions that have been developed to address the organizational challenges that women in tech face over the course of their careers, and identify technological tools that exist to bring these interventions to scale across sectors.

**Compressive Algorithms for Biomedical Data Analysis**

*Bonnie Berger (MIT)*

The last two decades have seen an exponential increase in genomic and biomedical data, which are outstripping advances in computing power. Extracting new science from these massive datasets will require not only faster computers; it will require algorithms that scale sublinearly in the size of the datasets. We introduce a novel class of algorithms that are able to scale with the entropy and low fractal dimension of the dataset by taking advantage of the unique structure of massive biological data to operate directly on compressed data. These algorithms can be used to address large-scale challenges in genomics, metagenomics and chemogenomics.

**Computer Science and Finance**

*Julia Kempe*

Computer Science, and the hard sciences in general, have more to do with finance than one might think. For example, the theory of random walks was independently invented (at least) twice: to describe Brownian motion, and by Bachelier to describe the time dependence of stock prices. Many more connections have emerged over the years, in areas such as time series analysis, optimization, and machine learning. These connections have become even more important in recent years, as almost every part of finance has been revolutionized by computational, machine learning, and “big data” techniques, to the point where a PhD in the hard sciences has often become better preparation for a job in finance than an MBA. Here, we survey some history, give some key ideas, and sketch what a job in finance is like.

**Interactive Information Theory**

*Gillat Kol (Princeton)*

In a profoundly influential 1948 paper, Claude Shannon introduced information theory and used it to study one-way data transmission problems over different channels, both noisy and noiseless. That paper initiated the study of error correcting codes and data compression, two concepts that are especially relevant today with the rise of the internet and data-intensive applications.

In recent decades, interactive communication protocols have been used and studied extensively, raising a fundamental question: to what extent can Shannon’s results be generalized to the interactive setting, where parties engage in an interactive communication protocol? In this talk we will focus on the interactive analogs of data compression and coding in an attempt to answer this question.

**A Survey on Delegating Computations Under Standard Assumptions**

*Yael Kalai (Microsoft)*

In the past few years there have been several new constructions of non-interactive delegation schemes based on standard assumptions (such as the Learning with Error assumption). We have seen such constructions for all deterministic computations, for batch NP computations, for bounded-space non-deterministic computations, and more. In this talk, I will survey these results, and present some new results, all with a single unifying approach.

**Google for Jobs**

*Gwen Beaven (Google)*

People experience a lot of ‘friction’ when searching for a job, having to navigate and search through many different sites to get a comprehensive list of all opportunities. Google has made it easier to find jobs by collating listings from all across the web with a unified and powerful search experience, helping people find employment and opportunity. In this talk, I will share some of the work my team did to build job search on Google. https://www.blog.google/products/search/connecting-more-americans-jobs/

**Featured Snippets**

*Lucy Hadden (Google)*

Sometimes when you do a search, you’ll find that there’s a descriptive box at the top of Google’s results. We call this a “featured snippet.” Featured snippets present interesting challenges, because they involve both query understanding (what is the user looking for) and natural language processing over the (highly unstructured) open web. In this talk, I’ll describe some of what makes this hard.

**Secure Multi-Party Computation Applications and Techniques**

*Emily Shen (MIT Lincoln Lab)*

Often organizations would like to collaborate to compute results derived from their collective data; however, security and privacy concerns can pose barriers to sharing data to perform these computations. Secure multi-party computation (MPC) is a type of cryptographic protocol for performing joint computations while keeping each party’s input data private. While MPC techniques that can be composed to securely compute any function have existed for several decades, developing and optimizing MPC protocols for complex applications can be challenging. In this talk, I will describe some MPC applications, techniques, and a framework we are developing for easily implementing MPC applications.
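To give a flavor of how MPC keeps inputs private, here is a toy sketch of one of its simplest building blocks, additive secret sharing: each party splits its input into random shares that individually reveal nothing, yet the shares can be summed to recover the joint total. This is a classroom illustration, not the speaker's framework and not a production protocol (real deployments also need secure channels and malicious-security machinery).

```python
import random

P = 2**31 - 1  # a prime modulus; each share is uniform mod P

def share(secret: int, n_parties: int):
    """Split `secret` into n additive shares summing to it mod P."""
    shares = [random.randrange(P) for _ in range(n_parties - 1)]
    shares.append((secret - sum(shares)) % P)
    return shares

def secure_sum(secrets, n_parties=3):
    """Each input is split into shares, one per party; each party
    locally adds the shares it receives.  Publishing only the partial
    sums reveals the total but no individual input."""
    partials = [0] * n_parties
    for s in secrets:
        for i, sh in enumerate(share(s, n_parties)):
            partials[i] = (partials[i] + sh) % P
    return sum(partials) % P
```

Any strict subset of the shares of a secret is uniformly random, which is exactly the privacy guarantee MPC generalizes to arbitrary functions.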

**Faster and distributed submodular function optimization**

*Alina Ene (Boston U.)*

Many applications in machine learning involve huge amounts of data with a very rich structure. The scale and complexity of the data makes learning very challenging from a computational and modeling perspective; to learn the most from our data, we need both powerful models and efficient algorithms for those models, and it is not a priori clear that these two competing goals can be reconciled.

In this talk, we explore some of the modeling and algorithmic challenges arising in applications. We consider learning tasks with a rich combinatorial structure that can be modeled as submodular optimization problems, and we design faster and distributed algorithms for these problems with provable guarantees.
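As a concrete instance of the submodular problems described above, here is the classic greedy algorithm for maximum coverage, a canonical monotone submodular maximization task (a standard textbook sketch, not the faster or distributed algorithms from the talk). Greedy repeatedly picks the set with the largest marginal gain and achieves a (1 - 1/e) approximation.

```python
def greedy_max_coverage(sets, k):
    """Pick up to k sets covering as many elements as possible.

    Coverage is monotone submodular: each set's marginal gain only
    shrinks as more sets are chosen, which is what makes the greedy
    (1 - 1/e)-approximation guarantee go through.
    """
    covered, chosen = set(), []
    for _ in range(k):
        # index of the set with the largest marginal gain (ties -> lowest index)
        best = max(range(len(sets)), key=lambda i: len(sets[i] - covered))
        if not (sets[best] - covered):
            break  # no remaining marginal gain
        chosen.append(best)
        covered |= sets[best]
    return chosen, covered
```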