Now consider the channel with memory. Before transmission begins, $Z$ is randomly chosen and fixed for all time. Thus, $Y_i = Z X_i$.
(b) What is the capacity if
$$Z = \begin{cases} 1, & p = 0.5 \\ -1, & p = 0.5\,? \end{cases} \tag{7.169}$$
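A hedged sketch of one standard approach, assuming (this excerpt does not state it) the binary input alphabet $X_i \in \{-1, +1\}$: because $Z$ is fixed for all time, products of consecutive outputs do not depend on its sign,
$$Y_i Y_{i+1} = (Z X_i)(Z X_{i+1}) = Z^2 X_i X_{i+1} = X_i X_{i+1},$$
so the receiver recovers each product $X_i X_{i+1}$ noiselessly. Encoding $n-1$ information bits in these products over $n$ channel uses achieves rate $(n-1)/n \to 1$, while $C \le \log|\mathcal{X}| = 1$ bounds the capacity from above.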
7.37 Joint typicality. Let $(X_i, Y_i, Z_i)$ be i.i.d. according to $p(x, y, z)$. We will say that $(x^n, y^n, z^n)$ is jointly typical [written $(x^n, y^n, z^n) \in A_\epsilon^{(n)}$] if
• $p(x^n) \in 2^{-n(H(X) \pm \epsilon)}$,
• $p(y^n) \in 2^{-n(H(Y) \pm \epsilon)}$,
• $p(z^n) \in 2^{-n(H(Z) \pm \epsilon)}$,
• $p(x^n, y^n) \in 2^{-n(H(X,Y) \pm \epsilon)}$,
• $p(x^n, z^n) \in 2^{-n(H(X,Z) \pm \epsilon)}$,
• $p(y^n, z^n) \in 2^{-n(H(Y,Z) \pm \epsilon)}$,
• $p(x^n, y^n, z^n) \in 2^{-n(H(X,Y,Z) \pm \epsilon)}$.
Now suppose that $(\tilde{X}^n, \tilde{Y}^n, \tilde{Z}^n)$ is drawn according to $p(x^n)\,p(y^n)\,p(z^n)$. Thus, $\tilde{X}^n, \tilde{Y}^n, \tilde{Z}^n$ have the same marginals as $p(x^n, y^n, z^n)$ but are independent. Find (bounds on) $\Pr\{(\tilde{X}^n, \tilde{Y}^n, \tilde{Z}^n) \in A_\epsilon^{(n)}\}$ in terms of the entropies $H(X)$, $H(Y)$, $H(Z)$, $H(X,Y)$, $H(X,Z)$, $H(Y,Z)$, and $H(X,Y,Z)$.
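As a numerical sanity check, the following minimal Python sketch enumerates all triples of binary length-$n$ sequences, computes the exact probability that independently drawn $(\tilde{X}^n, \tilde{Y}^n, \tilde{Z}^n)$ land in $A_\epsilon^{(n)}$, and compares it with the upper bound $2^{-n(H(X)+H(Y)+H(Z)-H(X,Y,Z)-4\epsilon)}$ that the standard typicality argument yields. The toy pmf `p`, the small `n`, and the window `eps` are illustrative assumptions, not part of the problem.

```python
import itertools
import numpy as np

# Toy joint pmf p[x, y, z] over binary alphabets (illustrative assumption).
p = np.array([[[0.20, 0.05],
               [0.10, 0.15]],
              [[0.05, 0.20],
               [0.15, 0.10]]])

def H(q):
    """Entropy in bits of a pmf given as an array of probabilities."""
    q = np.ravel(q)
    return float(-np.sum(q * np.log2(q)))

# Marginal and pairwise pmfs, and the seven entropies in the definition.
pX, pY, pZ = p.sum(axis=(1, 2)), p.sum(axis=(0, 2)), p.sum(axis=(0, 1))
pXY, pXZ, pYZ = p.sum(axis=2), p.sum(axis=1), p.sum(axis=0)
HX, HY, HZ = H(pX), H(pY), H(pZ)
HXY, HXZ, HYZ, HXYZ = H(pXY), H(pXZ), H(pYZ), H(p)

n, eps = 6, 0.5                     # small enough for exhaustive enumeration
seqs = list(itertools.product((0, 1), repeat=n))

def logp(seq, pmf):
    """log2 probability of an i.i.d. sequence of symbols (or symbol tuples)."""
    return sum(float(np.log2(pmf[s])) for s in seq)

def in_window(lp, Hval):
    """Check 2^{-n(H+eps)} <= p(seq) <= 2^{-n(H-eps)} via log probabilities."""
    return -n * (Hval + eps) <= lp <= -n * (Hval - eps)

prob = 0.0
for xs in seqs:
    lpx = logp(xs, pX)
    if not in_window(lpx, HX):
        continue
    for ys in seqs:
        lpy = logp(ys, pY)
        if not (in_window(lpy, HY)
                and in_window(logp(list(zip(xs, ys)), pXY), HXY)):
            continue
        for zs in seqs:
            lpz = logp(zs, pZ)
            if (in_window(lpz, HZ)
                    and in_window(logp(list(zip(xs, zs)), pXZ), HXZ)
                    and in_window(logp(list(zip(ys, zs)), pYZ), HYZ)
                    and in_window(logp(list(zip(xs, ys, zs)), p), HXYZ)):
                # Independent drawing: the triple's probability is the
                # product of the three marginal sequence probabilities.
                prob += 2.0 ** (lpx + lpy + lpz)

bound = 2.0 ** (-n * (HX + HY + HZ - HXYZ - 4 * eps))
print(f"exact Pr(jointly typical) = {prob:.3e}")
print(f"bound 2^(-n(H(X)+H(Y)+H(Z)-H(X,Y,Z)-4*eps)) = {bound:.3e}")
```

The bound printed at the end follows the usual counting argument: $|A_\epsilon^{(n)}| \le 2^{n(H(X,Y,Z)+\epsilon)}$, and each jointly typical triple has product probability at most $2^{-n(H(X)-\epsilon)} 2^{-n(H(Y)-\epsilon)} 2^{-n(H(Z)-\epsilon)}$.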
HISTORICAL NOTES
The idea of mutual information and its relationship to channel capacity
was developed by Shannon in his original paper [472]. In this paper, he
stated the channel capacity theorem and outlined the proof using typical
sequences in an argument similar to the one described here. The first
rigorous proof was due to Feinstein [205], who used a painstaking “cookie-
cutting” argument to find the number of codewords that can be sent with a
low probability of error. A simpler proof using a random coding exponent
was developed by Gallager [224]. Our proof is based on Cover [121] and
on Forney’s unpublished course notes [216].
The converse was proved by Fano [201], who used the inequality bear-
ing his name. The strong converse was first proved by Wolfowitz [565],
using techniques that are closely related to typical sequences. An iterative
algorithm to calculate the channel capacity was developed independently
by Arimoto [25] and Blahut [65].