Huffman coding with unequal letter costs

Mordecai J. Golin*, Claire Kenyon, Neal E. Young

*Corresponding author for this work

Research output: Contribution to journalConference article published in journalpeer-review

37 Citations (Scopus)

Abstract

In the standard Huffman coding problem, one is given a set of words and for each word a positive frequency. The goal is to encode each word w as a codeword c(w) over a given alphabet. The encoding must be prefix free (no codeword is a prefix of any other) and should minimize the weighted average codeword size Σw freq(w) |c(w)|. The problem has a well-known polynomial-time algorithm due to Huffman [15]. Here we consider the generalization in which the letters of the encoding alphabet may have non-uniform lengths. The goal is to minimize the weighted average codeword length Σw freq(w) cost(c(w)), where cost(s) is the sum of the (possibly non-uniform) lengths of the letters in s. Despite much previous work, the problem is not known to be NP-hard, nor was it previously known to have a polynomial-time approximation algorithm. Here we describe a polynomial-time approximation scheme (PTAS) for the problem.

Original languageEnglish
Pages (from-to)785-791
Number of pages7
JournalConference Proceedings of the Annual ACM Symposium on Theory of Computing
Publication statusPublished - 2002
EventProceedings of the 34th Annual ACM Symposium on Theory of Computing - Montreal, Que., Canada
Duration: 19 May 200221 May 2002

Fingerprint

Dive into the research topics of 'Huffman coding with unequal letter costs'. Together they form a unique fingerprint.

Cite this