Is Shannon's “A Mathematical Theory of Communication” worth reading for a beginner in information theory?














I'm studying information theory for the first time, chiefly through Cover & Thomas, where entropy is introduced at the beginning of the first chapter with its mathematical definition, but without any motivation for why it is defined the way it is.
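
For concreteness, the definition in question is the usual entropy of a discrete random variable $X$ with probability mass function $p$:
$$H(X) = -\sum_{x} p(x)\log p(x),$$
measured in bits when the logarithm is taken to base 2.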



Searching for some further explanation, I ran into Shannon's original paper, "A Mathematical Theory of Communication", where he gives a decent motivation for defining entropy as it is defined.



Question: Is Shannon's original paper a good introduction to information theory? Or are some of its ideas and notation outdated or incompatible with modern treatments, and likely to confuse a beginner like me?










Tags: self-learning, information-theory, entropy

asked Jan 29 at 9:21 by D.M. (edited Jan 29 at 11:30 by MJD)


2 Answers






          Nothing in Shannon's paper is incompatible with modern treatments, though much has since been cleaned up and streamlined* - C&T is perhaps one of the best at this. Keep in mind, though, that Shannon wrote a paper, and papers are never as easy to read as a book if one is not used to them. That said, the paper is wonderful. Each time I read it I feel as if I've understood things better, and see them a little differently. Definitely pay close attention to the exposition whenever you do read it.



          Be warned that the following is necessarily speculative, and a little unrelated to your direct question.



          The reason C&T don't go into why entropy is defined the way it is in Ch. 2 is philosophical. Usually (and this is the 'incentive' of Shannon's that you mention), the justification is that there are a few natural properties one wants from a measure of information - key among them are continuity, and that the 'information' of two independent sources is the sum of their individual 'information's - and once one posits these axioms, it is a simple theorem that entropy is the unique functional (up to scalar multiplication) satisfying them.
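
          (To make the above concrete - this is a paraphrase of Shannon's Theorem 2, not a quote from either source: ask for a function $H(p_1,\dots,p_n)$ that is continuous in the $p_i$, increasing in $n$ when all the $p_i$ are equal, and consistent when a choice is broken down into successive sub-choices. The only functions satisfying these are
          $$H = -K\sum_{i=1}^{n} p_i \log p_i, \qquad K > 0,$$
          i.e., entropy up to a choice of scale, or equivalently of logarithm base.)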



          However, there is a (large**) school in information theory that rejects the centrality of the above. It argues that the utility of any information measure lies in its operational consequences. (This, I think, arises from the fact that the origins - and practice! - of information theory are very much in engineering, not mathematics, and we engineers, even fairly mathematical ones, are ultimately interested in what one can do with wonderful maths, not content with how wonderful it is.***) According to this view, the basic reason we define entropy, and the reason it is such a natural object, is the Asymptotic Equipartition Property (and other nice properties). So you'll find that much of Chapter 2 is a (relatively) dry development of facts about various information measures (except maybe the material on Fano's inequality, which is more directly applicable), and that the subject really comes alive in Chapter 3. I'd suggest reading that before you give up on the book - maybe even skip ahead and then go back to Ch. 2.
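
          (A quick numerical sketch of the operational flavour - this is mine, not from either source, and the four-letter alphabet and its distribution are made up for illustration. The AEP says that for an i.i.d. source, $-\tfrac{1}{n}\log_2 p(X_1,\dots,X_n)$ concentrates around $H(X)$ as $n$ grows:)

              import numpy as np

              # Empirical check of the AEP for a small i.i.d. source (illustrative only).
              rng = np.random.default_rng(0)
              p = np.array([0.5, 0.25, 0.125, 0.125])   # assumed source distribution
              H = -np.sum(p * np.log2(p))               # entropy in bits (= 1.75 here)

              n, trials = 1000, 5000
              samples = rng.choice(len(p), size=(trials, n), p=p)
              # normalized negative log-probability of each sampled sequence
              norm_neg_logp = -np.log2(p[samples]).sum(axis=1) / n

              print(f"H(X) = {H:.3f} bits")
              print(f"mean of -(1/n) log2 p(x^n) = {norm_neg_logp.mean():.3f}")
              print(f"fraction within 0.05 bits of H = {np.mean(np.abs(norm_neg_logp - H) < 0.05):.3f}")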



          I'd argue that Cover and Thomas subscribe to the above view. See, for instance, the concluding sentences of the introduction to Ch. 2:




          In later chapters we show how these quantities arise as natural answers
          to a number of questions in communication, statistics, complexity, and
          gambling. That will be the ultimate test of the value of these definitions.




          and the following from the bottom of page 14 (in the 2nd edition), a little after entropy is defined (following (2.3)):




          It is possible to derive the definition of entropy axiomatically by defining certain properties that the entropy of a random variable must satisfy. This approach is illustrated in Problem 2.46. We do not use the axiomatic approach to justify the definition of entropy; instead, we show that it arises as the answer to a number of natural questions, such as “What is the average length of the shortest description of the random variable?”






          *: and some of the proofs have been brought into question from time to time - I think unfairly



          **: I have little ability to make estimates about this, but in my experience this has been the dominant school.



          ***: The practice of this view is also something that Shannon exemplified. His 1948 paper, for instance, sets out to study communication, not to establish what a notion of information should be like - that's 'just' something he had to come up with on the way.






          answered Jan 30 at 3:30 by stochasticboy321 (edited Jan 30 at 4:12)

            Thank you for the comprehensive answer. I was by no means about to "give up" on C&T, it's just that when I encounter a new mathematical idea I like to feel as though I could have, in principle, discovered/developed it myself, rather than simply understanding how it works. I have read the chapter on AEP, and seeing how entropy arises naturally in analysis of typical sets put my mind to rest. However, while obviously not an expert, I'd say these two views are complementary rather than opposing.
            – D.M., Jan 30 at 10:39










            You're welcome! I agree that they're complementary, really the issue is with primacy. For instance, even C&T mention the axiomatic derivation, and explore it in an exercise.
            – stochasticboy321, Jan 31 at 0:36



















          Shannon's paper is certainly brilliant, relevant today and readable. It's worth reading, but I would not recommend it as an introduction to information theory. For learning, I would stick with Cover & Thomas or any other modern textbook.






          answered Jan 29 at 12:46 by leonbloy

            When I read Shannon's paper I also read, and profited from, Khinchin's little book Mathematical Foundations of Information Theory, which is, roughly, to Shannon's paper as Shannon's paper is to C&T.
            – kimchi lover, Jan 29 at 13:07











