By Arto Salomaa

ISBN-10: 0521302455

ISBN-13: 9780521302456

However, the existence of the distribution $p(x_1, x_2, x_3, x_4)$ constructed immediately after Proposition 2.12 simply says that it is not always possible to find such a sequence $\{p_k\}$. Therefore, probability distributions which are not strictly positive can be very delicate. [...] 5 that their conditional independence structures are closely related to the factorization problem of such distributions, which has been investigated by Chan and Yeung [43].

2 SHANNON'S INFORMATION MEASURES

We begin this section by introducing the entropy of a random variable.

Let $X$ be a function of $Y$. Prove that $H(X) \le H(Y)$. Interpret this result.

12. Prove that for any $n \ge 2$, $H(X_1, X_2, \ldots, X_n) \ge \sum_{i=1}^{n} H(X_i \mid X_j, j \ne i)$.

13. Prove that [...] Hint: Sum the identities for $i = 1, 2, 3$ and apply the result in Problem 12.

14. Let $\mathcal{N}_n = \{1, 2, \ldots, n\}$ and denote $H(X_i, i \in \alpha)$ by $H(X_\alpha)$ for any subset $\alpha$ of $\mathcal{N}_n$. For $1 \le k \le n$, let [...] Prove that $H_1 \ge H_2 \ge$ [...] 39). See Problem 4 in Chapter 15 for an application of these inequalities.

15. Prove the divergence inequality by using the log-sum inequality.
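A numerical check is no substitute for the proofs requested above, but it can help build intuition for the first problem and for Problem 12. The following Python sketch evaluates both inequalities on small examples. The helper names (`entropy`, `marginal`), the function `f`, and the specific distributions are illustrative assumptions, not taken from the text.

```python
import itertools
import math
import random


def entropy(pmf):
    """Shannon entropy in bits of a dict mapping outcomes to probabilities."""
    return -sum(p * math.log2(p) for p in pmf.values() if p > 0)


def marginal(joint, keep):
    """Marginalize a joint pmf (keys are tuples) onto the coordinates in `keep`."""
    out = {}
    for outcome, p in joint.items():
        key = tuple(outcome[k] for k in keep)
        out[key] = out.get(key, 0.0) + p
    return out


def f(y):
    """Hypothetical deterministic function of Y, chosen only for illustration."""
    return y % 2


# Part 1: if X = f(Y), then H(X) <= H(Y).
p_y = {0: 0.1, 1: 0.2, 2: 0.3, 3: 0.4}
p_x = {}
for y, p in p_y.items():
    p_x[f(y)] = p_x.get(f(y), 0.0) + p
h_x, h_y = entropy(p_x), entropy(p_y)

# Part 2: the inequality of Problem 12 for n = 3, on a random joint pmf over {0,1}^3.
random.seed(0)
weights = [random.random() for _ in range(8)]
total = sum(weights)
joint = {o: w / total
         for o, w in zip(itertools.product((0, 1), repeat=3), weights)}

n = 3
h_joint = entropy(joint)
# Uses H(X_i | X_j, j != i) = H(X_1, ..., X_n) - H(X_j, j != i).
cond_sum = sum(
    h_joint - entropy(marginal(joint, [j for j in range(n) if j != i]))
    for i in range(n)
)

print(f"H(X) = {h_x:.4f} <= H(Y) = {h_y:.4f}")
print(f"H(X1,X2,X3) = {h_joint:.4f} >= sum of conditionals = {cond_sum:.4f}")
```

Mapping several values of Y to the same value of X merges probability mass, which is one way to read the first inequality: processing Y deterministically cannot create uncertainty.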

70) Thus $p(x, y, z) \log p(z) \to 0$ as $p(x, y, z) \to 0$. Similarly, we can show that both $p(x, y, z) \log p(x, z)$ and $p(x, y, z) \log p(y, z) \to 0$ as $p(x, y, z) \to 0$. [...] 71) as $p(x, y, z) \to 0$. Hence, $I(X; Y \mid Z)$ varies continuously with $p$ even when $p(x, y, z) \to 0$ for some $x$, $y$, and $z$.

4 CHAIN RULES

In this section, we present a collection of information identities known as the chain rules which are often used in information theory.

23 (CHAIN RULE FOR ENTROPY)

$$H(X_1, X_2, \ldots, X_n) = \sum_{i=1}^{n} H(X_i \mid X_1, \ldots, X_{i-1}).$$
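The chain rule for entropy can be checked numerically on a small example. The Python sketch below computes each conditional entropy directly from its definition and compares the sum with the joint entropy. The joint distribution and the helper names are assumptions made for illustration. The distribution deliberately contains zero-probability outcomes, which are skipped under the usual convention that such terms contribute nothing to the sum; the last lines illustrate that $p \log p$ shrinks toward 0 as $p$ decreases, in line with the continuity argument above.

```python
import itertools
import math


def entropy(pmf):
    """Shannon entropy in bits; zero-probability outcomes contribute nothing."""
    return -sum(p * math.log2(p) for p in pmf.values() if p > 0)


def marginal(joint, keep):
    """Marginalize a joint pmf (keys are tuples) onto the coordinates in `keep`."""
    out = {}
    for outcome, p in joint.items():
        key = tuple(outcome[k] for k in keep)
        out[key] = out.get(key, 0.0) + p
    return out


def conditional_entropy(joint, i):
    """H(X_i | X_1, ..., X_{i-1}) in bits, from the definition (i is 0-based)."""
    p_upto_i = marginal(joint, range(i + 1))  # pmf of (X_1, ..., X_i)
    p_prefix = marginal(joint, range(i))      # pmf of (X_1, ..., X_{i-1})
    h = 0.0
    for outcome, p in p_upto_i.items():
        if p > 0:
            h -= p * math.log2(p / p_prefix[outcome[:i]])
    return h


# Illustrative joint pmf over {0,1}^3 with two zero-probability outcomes.
probs = [0.0, 0.1, 0.2, 0.0, 0.15, 0.25, 0.05, 0.25]
joint = dict(zip(itertools.product((0, 1), repeat=3), probs))

lhs = entropy(joint)                                       # H(X1, X2, X3)
rhs = sum(conditional_entropy(joint, i) for i in range(3))  # sum of conditionals
print(f"joint entropy = {lhs:.6f}, chain-rule sum = {rhs:.6f}")

# p * log2(p) shrinks in magnitude as p decreases toward 0.
tiny = [p * math.log2(p) for p in (1e-3, 1e-6, 1e-12)]
print(tiny)
```

Computing each term from the conditional probabilities, rather than as a difference of joint entropies, keeps the check from being a trivial telescoping sum.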

