
$$\sum_{j=1}^{L} h_j R_{i-j} = -R_i, \qquad i = 1, \ldots, L,$$

where the correlation coefficients $R_i$, for $i = -L, \ldots, L$, are known and $R_{-i}$ equals $R_i$. An efficient algorithm for solving this set of equations is the Levinson algorithm.
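The Levinson algorithm exploits the Toeplitz structure of these equations to solve them in $O(L^2)$ operations instead of the $O(L^3)$ of general elimination. Below is a minimal Python sketch of the Levinson-Durbin form of the recursion, assuming the correlations are supplied as an array R with R[i] $= R_i$ for $i = 0, \ldots, L$ (the symmetry $R_{-i} = R_i$ makes negative indices redundant); the function name is ours.

```python
import numpy as np

def levinson(R, L):
    """Solve sum_{j=1..L} h_j R_{i-j} = -R_i for i = 1, ..., L,
    where R[i] = R_i for i = 0..L and R_{-i} = R_i.
    Returns the coefficients h_1, ..., h_L."""
    h = np.zeros(L)   # order-m solution lives in h[:m]
    E = R[0]          # prediction-error energy of the order-m solution
    for m in range(1, L + 1):
        # Reflection coefficient computed from the order-(m-1) solution.
        acc = R[m] + sum(h[j] * R[m - 1 - j] for j in range(m - 1))
        k = -acc / E
        # Order update: h_j <- h_j + k * h_{m-j}, and the new h_m <- k.
        h_prev = h.copy()
        h[m - 1] = k
        for j in range(m - 1):
            h[j] = h_prev[j] + k * h_prev[m - 2 - j]
        E *= 1.0 - k * k
    return h

# Quick check against a direct solve of the same Toeplitz system.
R = np.array([4.0, 2.0, 1.0])
T = np.array([[R[abs(i - j)] for j in range(2)] for i in range(2)])
print(levinson(R, 2), np.linalg.solve(T, -R[1:]))  # should agree
```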
SOURCE COMPRESSION
The average amount of information required to describe a source output symbol is equal to the entropy of the source. Sometimes it is not convenient or practical to retain all of this information, and it is then no longer possible to maintain an exact reproduction of the source. An analog source has infinite entropy, so distortion must always be present when the source output is passed through a channel of finite capacity. Data compression is the practice of intentionally reducing the information content of a data record. This should be done in such a way that the least-distorted reproduction is obtained. Information theory determines the performance of the optimum compression of a random data source; a naive encoder will incur greater distortion.
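For the first statement, the entropy of a discrete memoryless source with letter probabilities $p_j$ is $H = -\sum_j p_j \log_2 p_j$ bits per symbol. A short sketch (the probability vector is an arbitrary example of ours):

```python
import math

def entropy(p):
    """Entropy in bits per symbol of a memoryless source with
    strictly positive letter probabilities p summing to one."""
    return -sum(pj * math.log2(pj) for pj in p)

# This four-letter source needs on average 1.75 bits per symbol.
print(entropy([0.5, 0.25, 0.125, 0.125]))
```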
The output of a discrete information source is a random sequence of symbols from a finite alphabet containing $J$ symbols given by $\{a_0, a_1, \ldots, a_{J-1}\}$. A memoryless source produces the $j$th letter with probability $p_j$, where $p_j$ is strictly positive. The source output is to be reproduced in terms of a second alphabet called the reproducing alphabet, often identical to the source alphabet, but not always so. For example, the reproducing alphabet might consist of the union of the source alphabet and a single new element denoting “data erased.”
A distortion matrix is a $J$ by $K$ matrix with nonnegative elements $\rho_{jk}$ for $j = 0, \ldots, J-1$ and $k = 0, \ldots, K-1$ that specifies the distortion associated with reproducing the $j$th source letter by the $k$th reproducing letter. Without loss of generality, it can be assumed that for each source letter $a_j$ there is at least one reproducing letter $b_k$ such that the resulting distortion $\rho_{jk}$ equals zero. Usually the distortion in a block is defined as the arithmetic average of the distortions of each letter of the block. This is called a per-letter fidelity criterion.
An important distortion matrix is the probability-of-error distortion matrix. For this case the alphabets are identical. For example, take $J = K = 4$ and

$$\rho = \begin{bmatrix} 0 & 1 & 1 & 1 \\ 1 & 0 & 1 & 1 \\ 1 & 1 & 0 & 1 \\ 1 & 1 & 1 & 0 \end{bmatrix}.$$

This distortion matrix says that each error is counted as one unit of distortion.
A different data-compression problem with the same source alphabet and reproducing alphabet is obtained if one takes the distortion matrix

$$\rho = \begin{bmatrix} 0 & 1 & 2 & 1 \\ 1 & 0 & 1 & 2 \\ 2 & 1 & 0 & 1 \\ 1 & 2 & 1 & 0 \end{bmatrix}.$$

This distortion matrix says that, modulo four, an error of two units is counted as two units of distortion.
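Both matrices are easy to generate programmatically. A brief sketch, assuming the second matrix measures circular (modulo-four) distance on the alphabet $\{0, 1, 2, 3\}$:

```python
import numpy as np

J = K = 4

# Probability-of-error distortion: every error costs one unit.
rho_error = 1 - np.eye(J, K)

# Modulo-four distortion: circular distance on {0, 1, 2, 3}, so an
# error of two units costs two units of distortion.
rho_mod4 = np.array([[min(abs(j - k), 4 - abs(j - k)) for k in range(K)]
                     for j in range(J)])
```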
A source compression block code of blocklength $n$ and size $M$ is a set consisting of $M$ sequences of reproducing letters, each sequence of length $n$. The source compression code is used as follows. Each source output block of length $n$ is mapped to that one of the $M$ codewords that results in the least distortion. The output of the data compressor has lower entropy than the original source and therefore can be encoded into a smaller number of bits.
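A minimal sketch of this encoding rule under a per-letter fidelity criterion; the function and variable names are ours, and the two-codeword codebook at the end is a made-up example:

```python
import numpy as np

def encode_block(source_block, codebook, rho):
    """Map a length-n block of source letters to the index of the
    codeword whose per-letter average distortion is smallest."""
    distortions = [rho[source_block, codeword].mean() for codeword in codebook]
    return int(np.argmin(distortions))

# M = 2 codewords of blocklength n = 4, probability-of-error distortion.
codebook = np.array([[0, 0, 1, 1], [2, 2, 3, 3]])
rho = 1 - np.eye(4)
print(encode_block(np.array([0, 1, 1, 1]), codebook, rho))  # -> 0
```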
The Distortion-Rate Function
Data compression is a deterministic process. The same block of source symbols always produces the same block of reproducing letters. Nevertheless, if attention is restricted to a single source output symbol without knowledge of the previous or subsequent symbols, then the reproducing letter is not predetermined. The letter $b_k$ into which source letter $a_j$ is encoded becomes a random variable even though the block encoding is deterministic. This random variable can be described by a transition matrix, $Q_{k|j}$. Heuristically, we think of $Q_{k|j}$ as describing an artificial channel that approximates the data compression. Each time the source produces letter $a_j$, it is reproduced by letter $b_k$ with probability $Q_{k|j}$.
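One way to make this artificial channel concrete is to estimate $Q_{k|j}$ empirically: run the deterministic block encoder over many random source blocks and count, for each source letter $j$, how often it is reproduced as letter $k$. A sketch under the same assumptions as the encoder above (names are ours):

```python
import numpy as np

def empirical_Q(p, codebook, rho, trials=2000, n=4, seed=0):
    """Estimate Q_{k|j}: the fraction of times source letter j is
    reproduced as letter k by minimum-distortion block encoding."""
    rng = np.random.default_rng(seed)
    J, K = rho.shape
    counts = np.zeros((J, K))
    for _ in range(trials):
        block = rng.choice(J, size=n, p=p)
        codeword = codebook[np.argmin([rho[block, c].mean() for c in codebook])]
        for j, k in zip(block, codeword):
            counts[j, k] += 1
    return counts / counts.sum(axis=1, keepdims=True)  # rows sum to one
```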
To obtain the greatest possible compression, it seems that this conditional probability should, on average, result in the smallest possible mutual information between the source and the reproduction, provided that the average distortion is less than the allowable average distortion. This heuristic discussion motivates the following definition.
The distortion-rate function, $D(R)$, is given by

$$D(R) = \min_{Q} \sum_{j=0}^{J-1} \sum_{k=0}^{K-1} p_j Q_{k|j} \rho_{jk},$$

where the minimum is over all probability transition matrices $Q$ connecting the source alphabet and the reproducing alphabet that satisfy $I(p; Q) \le R$.
The definition is justified by the source compression theorems of information theory. Intuitively, if rate $R$ bits per source letter is specified, then any compression must incur a distortion of at least $D(R)$. Conversely, compression to a level arbitrarily close to $D(R)$ is possible by appropriate selection of the compression scheme.
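The two quantities in the definition, the mutual information $I(p; Q)$ and the average distortion $\sum_{j,k} p_j Q_{k|j} \rho_{jk}$, are straightforward to evaluate for any candidate $Q$, after which the constrained minimization can be carried out numerically. A sketch (the candidate channel at the end is illustrative only, not the minimizer):

```python
import numpy as np

def mutual_information(p, Q):
    """I(p;Q) in bits, with p the source distribution over j and
    Q[j, k] the transition probability Q_{k|j}."""
    q = p @ Q                # marginal distribution of the reproduction
    joint = p[:, None] * Q   # joint distribution P(j, k)
    mask = joint > 0
    ratio = joint[mask] / (p[:, None] * q[None, :])[mask]
    return float((joint[mask] * np.log2(ratio)).sum())

def average_distortion(p, Q, rho):
    """Expected per-letter distortion sum_{j,k} p_j Q_{k|j} rho_{jk}."""
    return float((p[:, None] * Q * rho).sum())

p = np.full(4, 0.25)
rho = 1 - np.eye(4)
Q = 0.9 * np.eye(4) + 0.1 / 4    # a candidate channel
print(mutual_information(p, Q), average_distortion(p, Q, rho))
```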
The distortion-rate function can be evaluated analytically for simple sources and distortion matrices. Fig. 24