Lullaby of Cape Cod

Beyond John’s ellipsoid

Here is the video of my talk from the Learning and Optimization workshop, which was a part of Microsoft TechFest.

This covers (parts of) our very fresh work (joint with Alex Andoni, Assaf Naor, Sasho Nikolov and Erik Waingarten), which is currently under submission. In short, we show the first Approximate Nearest Neighbor (ANN) search[1] algorithm for general $d$-dimensional normed spaces with approximation $d^{o(1)}$. The only previously known bound is the trivial $O(\sqrt{d})$, which immediately follows from John's theorem and any good ANN data structure for the Euclidean distance.
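
For context, here is the one-line reduction behind the trivial bound (this is standard, not part of the new work). John's theorem says that every origin-symmetric convex body $K \subseteq \mathbb{R}^d$ (the unit ball of the norm) contains an ellipsoid $E$ with $E \subseteq K \subseteq \sqrt{d} \cdot E$, which in terms of the corresponding norms reads

$$\|x\|_K \le \|x\|_E \le \sqrt{d} \cdot \|x\|_K \qquad \text{for all } x \in \mathbb{R}^d.$$

Thus, after a linear change of coordinates mapping $E$ to the Euclidean unit ball, any $c$-approximate Euclidean ANN data structure answers queries with approximation $c \cdot \sqrt{d}$ in the original norm.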

A couple of minor glitches:

  1. The Wasserstein-$p$ distance is not a norm for $p > 1$ (see the duality note after this list);
  2. In the last couple of slides, I started writing $D(\varepsilon)$ instead of $R(\varepsilon)$, but it is meant to be the same.
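
To see why $p = 1$ is special (a standard fact, added here as a gloss rather than taken from the talk): Kantorovich–Rubinstein duality expresses

$$W_1(\mu, \nu) = \sup_{\mathrm{Lip}(f) \le 1} \left( \int f \, d\mu - \int f \, d\nu \right),$$

so $W_1(\mu, \nu) = \|\mu - \nu\|$ for the norm $\|\sigma\| = \sup_{\mathrm{Lip}(f) \le 1} \int f \, d\sigma$ on signed measures of total mass zero; for $p > 1$, no such norm representation exists.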

[1] For some flavor of the problem, see this classic or, in case you are more adventurous, my PhD thesis.

Apr 12  

Test your intuition

A new job is a good reason to start a (new) blog. The first post is in the spirit of the series by Gil Kalai.

Let $K$ be a convex body in $\mathbb{R}^d$ symmetric around the origin. Let $\mathcal{D}_1$ be a distribution supported on $K$, and let $\mathcal{D}_2$ be a distribution supported outside of $2K$. Can one always find a direction $v$ such that, given the projections onto $v$ of $\mathrm{poly}(d)$ i.i.d. samples from $\mathcal{D}_i$, one can recover $i$ with high probability?

For example, suppose that $K = [-1, 1]^d$ is a cube. Then each point in the support of $\mathcal{D}_2$ has at least one coordinate whose absolute value exceeds $2$. By pigeonhole, there exists a coordinate that, with probability at least $1/d$ over $\mathcal{D}_2$, falls outside $[-2, 2]$. Projecting on this coordinate, one can then distinguish $\mathcal{D}_1$ from $\mathcal{D}_2$ using only $O(d)$ samples: projections of samples from $\mathcal{D}_1$ never leave $[-1, 1]$, while among $O(d)$ samples from $\mathcal{D}_2$ at least one projection escapes $[-2, 2]$ with constant probability.
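
To play with the cube example, here is a minimal simulation sketch. Everything in it is my illustration, not part of the post: `sample_D1` and `sample_D2` are hypothetical distributions chosen only to satisfy the support conditions, and the test simply thresholds one coordinate.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 50

def sample_D1(n):
    # Hypothetical D_1: uniform on the cube K = [-1, 1]^d.
    return rng.uniform(-1.0, 1.0, size=(n, d))

def sample_D2(n):
    # Hypothetical D_2, supported outside 2K: take a point of the cube
    # and push one uniformly random coordinate to absolute value 3 > 2.
    x = rng.uniform(-1.0, 1.0, size=(n, d))
    rows = np.arange(n)
    cols = rng.integers(0, d, size=n)
    x[rows, cols] = rng.choice([-3.0, 3.0], size=n)
    return x

def identify(samples, j):
    # Project onto coordinate j. Projections of D_1 stay in [-1, 1];
    # for a coordinate that D_2 escapes with probability >= 1/d,
    # some projection among O(d) samples leaves [-2, 2].
    return 2 if np.any(np.abs(samples[:, j]) > 2.0) else 1

n = 10 * d  # O(d) samples
print(identify(sample_D1(n), 0))  # always 1
print(identify(sample_D2(n), 0))  # 2 with probability 1 - (1 - 1/d)^n
```

With $n = 10d$ samples, the probability that coordinate $0$ never escapes is $(1 - 1/d)^{10d} \approx e^{-10}$, matching the $O(d)$ sample bound above.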

Feb 26