Bounds on Generalized Huffman Codes
Dr. Michael Baer
Member of Technical Staff at VMware, Inc.
June 10, 2009 (Wednesday)
3:30pm - 4:30pm
Rm. 121, Ho Sin Hang Engineering Building, CUHK
Ever since Shannon introduced information theory, we have had
entropy bounds for the expected codeword length of
optimal lossless fixed-to-variable-length ("Huffman")
binary codes. The lower bound is the entropy, while the
upper bound is one bit greater, resulting in unit-sized
bounds. Since then, tighter bounds have been developed,
bounds which use both entropy and the probability
of the most probable codeword. I will present new
lower and upper bounds for codes optimal according
to various nonlinear codeword length objectives, also
in terms of a form of entropy and/or the probability
of the most probable input symbol. These bounds improve
on known unit-sized bounds for these nonlinear objectives.
The objectives I explore include exponential-average
length, maximum pointwise redundancy, and exponential-average
pointwise redundancy (also called dth exponential
redundancy). These relate to queueing and single-shot
communications, Shannon coding and universal modeling
(worst-case minimax redundancy), and bridging the
maximum pointwise redundancy problem with Huffman
coding, respectively. A generalized form of Huffman
coding known to find optimal codes for these objectives
helps yield these bounds, some of which are tight.
Related to these bounds are the necessary
and sufficient conditions for the shortest codeword
to have a specific length.
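The classical unit-sized bounds mentioned at the start of the abstract can be checked numerically with ordinary binary Huffman coding. The sketch below (standard pairwise Huffman merging, not the generalized algorithm the talk discusses, and with an illustrative distribution and helper name) computes optimal codeword lengths and verifies that the entropy H lower-bounds the expected length, which in turn is less than H + 1.

```python
import heapq
import itertools
import math

def huffman_lengths(probs):
    """Return optimal binary codeword lengths, one per symbol, via Huffman merging."""
    counter = itertools.count()  # tiebreaker so the heap never compares lists
    # Each heap entry: (subtree probability, tiebreak, list of (symbol, depth))
    heap = [(p, next(counter), [(i, 0)]) for i, p in enumerate(probs)]
    heapq.heapify(heap)
    while len(heap) > 1:
        p1, _, a = heapq.heappop(heap)  # two least-probable subtrees...
        p2, _, b = heapq.heappop(heap)
        merged = [(s, d + 1) for s, d in a + b]  # ...merge, deepening every leaf
        heapq.heappush(heap, (p1 + p2, next(counter), merged))
    return dict(heap[0][2])

probs = [0.4, 0.2, 0.2, 0.1, 0.1]  # illustrative source distribution
lengths = huffman_lengths(probs)
L = sum(p * lengths[i] for i, p in enumerate(probs))  # expected codeword length
H = -sum(p * math.log2(p) for p in probs)             # Shannon entropy in bits
assert H <= L < H + 1  # the unit-sized bounds from the abstract
```

For this distribution the expected length is 2.2 bits against an entropy of about 2.12 bits, so the code sits well inside the unit-sized interval; the talk's tighter bounds narrow such intervals further using the probability of the most probable symbol.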
Michael Baer received his Bachelor's degree in electrical
engineering and computer science from the University
of California, Berkeley, and his M.S. and Ph.D. in
electrical engineering from Stanford University. He
is currently a member of technical staff at VMware.
His research interests include source coding trees;
optimizing prefix codes for nontraditional objectives;
virtualization; and image, video, and data signal
processing for compression and other applications.