Abstract:

In
this talk I will give an overview of some exciting
recent developments in learning Boolean functions
with noise. In ordinary noise-free learning, the
goal is to learn a function, given samples of
the function at randomly selected points. In
noisy learning, the problem is made more difficult
by allowing a fraction of the examples to be
wrong. If the wrong examples are chosen at random,
then the noise is called "random", while if
the examples are corrupted by a malicious adversary,
the noise is called "agnostic". The subject
of computation with noisy input has been ubiquitous
in computer science, in topics such as coding
theory and approximation algorithms, as well
as practical topics such as computational biology
and vision. Learning with noise has been considered
extremely difficult (especially in the agnostic
setting), but in recent years we've seen a large
influx of positive results (algorithms), as
well as some negative results (hardness results).
I will describe some of these. I will also outline
connections to coding theory and computational
complexity. Here is an example of one of the
problems that I will discuss in the talk, the
problem of learning parity functions with noise:
In the problem of learning parity (without noise),
one gets samples of the form (x,f(x)), where
x is an n-bit string chosen uniformly at random,
and f is the XOR function on a subset of the
bits. The function f is not known to the algorithm.
The algorithm can request as many of these samples
as it wants (but cannot specify the x's; they
are picked at random). The goal is to return
f with high probability. Learning parity without
noise is easy to do in time O(n^3), using O(n)
samples, by using Gaussian elimination. Now,
suppose that 1% noise is introduced, so that
in each sample, f(x) is returned with probability
0.99, and 1 - f(x) is returned with probability
0.01. In this case, the problem becomes extremely
difficult. The fastest known algorithm is due
to Blum, Kalai and Wasserman, and runs in time
$2^{O(n/\log n)}$. The problem is believed to
be hard (but even defining what hardness means
here is nontrivial). In discussing this problem,
I will describe some possible consequences of
parity with noise being easy or hard. Each way
would be good: Easiness would give us a holy
hammer for use in learning theory, while hardness
would imply simple and efficient cryptographic
primitives, among other consequences. Some parts
of this talk are based on joint work with Adam
Kalai and Yishay Mansour. In the talk I will
not assume any prior knowledge of computational
learning theory. 
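To make the noiseless case concrete, here is a minimal Python sketch (the names and structure are mine, not from the talk) of recovering a parity function by Gaussian elimination over GF(2): keep requesting random samples (x, f(x)) until n linearly independent equations have been collected, then back-substitute to solve for the secret subset.

```python
import random

def parity(x, s):
    # f(x) = XOR of the bits of x indexed by the secret subset s
    return sum(xi & si for xi, si in zip(x, s)) % 2

def learn_parity_no_noise(oracle, n):
    """Recover the secret parity by Gaussian elimination over GF(2).

    `oracle()` returns one sample (x, f(x)) with x uniform in {0,1}^n.
    We request samples until we hold n linearly independent equations.
    """
    rows = []    # stored equations (x bits, label), in echelon form
    pivots = []  # pivot column (index of leading 1) of each stored row
    while len(rows) < n:
        x, y = oracle()
        x = list(x)
        # Reduce the new equation against the rows we already hold.
        for (r, ry), p in zip(rows, pivots):
            if x[p]:
                x = [a ^ b for a, b in zip(x, r)]
                y ^= ry
        if any(x):  # linearly independent of the stored rows: keep it
            rows.append((x, y))
            pivots.append(x.index(1))
    # Back-substitution, solving for pivot columns right to left.
    s = [0] * n
    for (r, ry), p in sorted(zip(rows, pivots), key=lambda t: -t[1]):
        s[p] = (ry ^ sum(r[j] & s[j] for j in range(n) if j != p)) % 2
    return s

# Demo: pick a random secret and recover it from random samples.
random.seed(0)
n = 16
secret = [random.randint(0, 1) for _ in range(n)]

def oracle():
    x = [random.randint(0, 1) for _ in range(n)]
    return x, parity(x, secret)

recovered = learn_parity_no_noise(oracle, n)
```

With the 1% noise described above, this approach breaks down: a single corrupted label propagates through the elimination steps, which is exactly why the noisy version of the problem is so much harder.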