A Mathematical Theory of Communication [pdf] (people.math.harvard.edu)
ImageXav 3 days ago [-]
If anyone is on the fence about reading this, or worried about their ability to comprehend the content, I would tell you to go ahead and give it a chance. Shannon's writing is remarkably lucid and transparent. The jargon is minimal, and his exposition is fantastic.

As many other commenters have mentioned, it is impressive that such an approachable paper would lay the foundations for a whole field. I actually find that many subsequent textbooks seem to obfuscate the simplicity of the idea of entropy.

Two examples from the paper really stuck with me. In one, he discusses the importance of spaces for encoding language, something which I had never really considered before. In the second, he discusses how it is the redundancy of language that allows for crosswords, and that a less redundant language would make it harder to design these (unless we started making them 3D!). It made me think more deeply about communication as a whole.
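
(To make the crossword point concrete, here's a toy sketch of my own, not from the paper: estimate the entropy of English letters from rough single-letter frequencies and compare it to the log2(27) maximum for a 26-letter-plus-space alphabet. The frequency numbers are illustrative assumptions, and single-letter statistics capture only part of the redundancy; Shannon's famous ~50% figure comes from much longer-range statistics.)

    from math import log2

    # rough single-letter frequencies for English, space included;
    # these values are illustrative assumptions, not measured data
    freq = {' ': 0.19, 'e': 0.10, 't': 0.07, 'a': 0.06, 'o': 0.06,
            'i': 0.06, 'n': 0.06, 's': 0.05, 'h': 0.05, 'r': 0.05}
    rest = 1.0 - sum(freq.values())       # probability mass left over
    probs = list(freq.values()) + [rest / 17] * 17  # spread over the other 17 symbols

    H = -sum(p * log2(p) for p in probs)  # entropy in bits per symbol
    H_max = log2(27)                      # a uniformly random 27-symbol source
    print(f"H ~ {H:.2f} bits/symbol, redundancy ~ {1 - H / H_max:.0%}")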

teleforce 3 days ago [-]
Shannon is the closest equivalent to Einstein in terms of contribution, but for the engineering fields.

He pioneered foundational research across engineering fields, including information entropy, cryptography, chess engines, robotics, digital Boolean logic, and the precursors of LLMs and modern AI in general. An outstanding engineer, an engineer's engineer in the true sense of the word.

kragen 3 days ago [-]
my experience is that papers that lay the foundations for a whole field are usually very approachable. i'm not sure why this is:

- maybe being better at breaking new intellectual ground requires some kind of ability that can also be applied to explaining things? like maybe some people are just smarter than others, either inherently or as a result of their training and experience, in a way that generalizes to both tasks

- maybe the things that most strongly impede people from breaking new intellectual ground also impede them from explaining them clearly? candidates might include emotional insecurity, unthinking devotion to tradition, and intellectual vanity (wanting to look right rather than be right)

- maybe the people who suck at explaining their own ideas don't get access to the cutting-edge developments that they would need to break new intellectual ground? shannon had the great good fortune, for example, to spend a lot of the war at bell labs conducting cryptanalysis, rather than sleeping under a dunghill on the battlefield or on the assembly line making artillery shells

jamesrcole 2 days ago [-]
> papers that lay the foundations for a whole field are usually very approachable. i'm not sure why this is

Kuhn talks about this in his works[1]. If I recall correctly, his argument is that when someone is creating a new field (new paradigm) there isn't pre-existing jargon to describe it in terms of, so it has to be described in accessible language.

It's once people start doing work inside the field that they start developing jargon and assuming things.

[1] I think it would have been The Structure of Scientific Revolutions, and/or possibly The Copernican Revolution

raincom 2 days ago [-]
New theories come up with new concepts, but jargon is different from new technical terms. The best example of jargon is the postmodernism stuff in the humanities.

Can you help find the page number where Kuhn talks about jargon?

samatman 2 days ago [-]
I would say that for our purposes, the best example of jargon is the Jargon File: http://catb.org/jargon/html/

A useful quote from the above:

> Linguists usually refer to informal language as ‘slang’ and reserve the term ‘jargon’ for the technical vocabularies of various occupations. However, the ancestor of this collection was called the ‘Jargon File’, and hacker slang is traditionally ‘the jargon’. When talking about the jargon there is therefore no convenient way to distinguish it from what a linguist would call hackers' jargon — the formal vocabulary they learn from textbooks, technical papers, and manuals.

What you call new technical terms is the jargon of technical pursuits. The postmodernism stuff is the jargon of "studies" departments. There is also military jargon, for another example. The term is not inherently derogatory.

jdewerd 3 days ago [-]
Jargon and shared context are barriers for newbies. In a new field they simply don't exist yet, so the avenues for accidentally excluding people (or intentionally, but I like to be charitable) don't exist either.
kragen 3 days ago [-]
this is an excellent point, and in retrospect obvious
082349872349872 2 days ago [-]
> ...papers that lay the foundations for a whole field are usually very approachable

Might that be one of the motivations for NHA's "By studying the masters and not their pupils" strategy?

(very sorry: on top of other things I need to re-set up email but I still have a bit set for you!)

kragen 2 days ago [-]
plausibly! i think there are some other important benefits too

i'm not dead yet! look forward to hearing from you

akvadrako 3 days ago [-]
I think it's because they are older, from before the modern publication pressures.

You didn't need to publish if you didn't have something interesting to say, and you could just make your point without worrying about drowning in a sea of mediocrity.

kragen 3 days ago [-]
i thought about that, and certainly i have read a lot of terribly written papers (twps) in recent years, but i think this is partly survival bias; there were lots of twps in the older literature (ol) too, but they don't get cited, so you have to do things like find an entire ol journal issue (maybe one containing a single well-known paper) to read through and find twps. but the well-written papers from the ol seem to be much, much better written than the current well-written papers. i think something about the current publishing pipeline acts as a filter against good writing, something that didn't use to be there
unnah 2 days ago [-]
Yet another possibility is that there actually are papers laying out foundations of potential new fields in impenetrable prose... and then no one understands those papers and they are promptly forgotten. One can only hope that someone else reinvents the ideas and explains them better.
keepamovin 2 days ago [-]
I think it's worth saying: thank you for such a great description of why Shannon's writing is good, and of how approachable his foundational papers are. I could've just upvoted, but it's nice to know that what you're doing really resonates with other people and that they really appreciate your way of describing it. Thank you. Haha! :)
jessriedel 3 days ago [-]
Agreed. This is one of the all time great papers in that it both launched an entire field (information theory) and remains very accessible and pedagogical. A true gem.
Anon84 3 days ago [-]
He also created the first instance (at least that I could find) of an auto-regressive (Markovian) language model, as a clarifying example, in the first 10 pages :)
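
(For fun, a minimal sketch of that idea in modern terms, my toy code rather than anything from the paper: build a table of which character follows which in a small corpus, then generate text by repeatedly sampling a successor, i.e. a first-order Markov approximation like the ones Shannon prints in Section 3.)

    import random
    from collections import defaultdict

    corpus = ("the fundamental problem of communication is that of "
              "reproducing at one point either exactly or approximately "
              "a message selected at another point")

    succ = defaultdict(list)              # char -> list of observed successors
    for a, b in zip(corpus, corpus[1:]):
        succ[a].append(b)

    c, out = random.choice(corpus), []
    for _ in range(80):                   # emit 80 characters
        out.append(c)
        c = random.choice(succ[c]) if succ[c] else random.choice(corpus)
    print("".join(out))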
jp42 3 days ago [-]
I am one of the few who is on the fence. This comment motivates me to give this paper a try. Thanks ImageXav!
baq 2 days ago [-]
> The jargon is minimal

Just pointing out the obvious here - it’s impossible to have any jargon in this paper since it literally created the field. Any jargon is necessarily invented later.

On the whole I agree: just go read the paper. While you're at it, queue up Lamport's "The Part-Time Parliament" (the Paxos paper).

a_wild_dandan 2 days ago [-]
new jargon != any jargon

Plenty of STEM jargon already existed which Shannon chose not to use.

contingencies 3 days ago [-]
> Two examples from the paper really stuck with me. In one, he discusses the importance of spaces for encoding language, something which I had never really considered before.

As a westerner who has studied quite a few writing systems, I find this kind of hard to interpret.

Verbally, however, the timing of pauses is important in all the languages I've learned. That would be a more coherent argument to make at the pan-lingual level than one tied to written representation, which is pretty arbitrary (many languages have migrated scripts over the years; see for example the dual-script Devanagari/Arabic divide between Hindi and Urdu, many other languages migrating to Arabic or 'Phags-pa, Vietnamese moving from Chinese characters to a Latin alphabet with diacritics, etc.).

> In the second, he discusses how it is the redundancy of language that allows for crosswords, and that a less redundant language would make it harder to design these (unless we started making them 3D!). It made me think more deeply about communication as a whole.

Yeah, good luck making a Chinese crossword. Not sure "redundancy" is the right term, however. Perhaps "frequent [even tediously repetitive?] glyph reuse".

shalabhc 3 days ago [-]
While Shannon is well known for this paper and "information theory", his master's thesis* is worth checking out as well. It demonstrated an equivalence between relay circuits and Boolean algebra, and was one of the key ideas that enabled digital computers.

* https://en.wikipedia.org/wiki/A_Symbolic_Analysis_of_Relay_a...
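
(The core observation fits in a few lines; here's a toy illustration of mine, not Shannon's notation: contacts in series behave like AND, contacts in parallel like OR, so circuits can be simplified with ordinary Boolean identities.)

    from itertools import product

    series   = lambda a, b: a and b   # current flows only if both contacts close
    parallel = lambda a, b: a or b    # current flows if either contact closes

    # Absorption law: x AND (x OR y) == x, so the y-contact is redundant
    # and the physical circuit collapses to a single contact.
    for x, y in product([False, True], repeat=2):
        assert series(x, parallel(x, y)) == x
    print("the two-contact circuit simplifies to the single contact x")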

B1FF_PSUVM 3 days ago [-]
The funny thing is that, at the time, digital logic circuits were made with relays. For most of the XX century you could hear relays clacking away at street junctions, inside metal boxes controlling traffic lights.

Then you got bipolar junction transistors (BJTs), and most digital logic, such as ECL and TTL, was based on a different paradigm for a few decades.

Then came the MOS revolution, allowing for large scale integration. And it worked like relays used to, but Shannon's work was mostly forgotten by then.

mturmon 3 days ago [-]
> Then you got bipolar junction transistors (BJTs), and most digital logic, such as ECL and TTL, was based on a different paradigm for a few decades.

I think the emphasis is misplaced here. It is true that a single BJT, when considered as a three-terminal device, does not operate in the same "gated" way as a relay or a CMOS gate does.

But the BJT components still were integrated into chips, or assembled into standard design blocks that implemented recognizable Boolean operations, and synthesis of desired logical functions would use tools like Karnaugh maps that were (as I understand it) outgrowths of Shannon’s approach.

kragen 3 days ago [-]
i don't think traffic light control systems were designed with boolean logic before shannon, nor were they described as 'logic' (or for that matter 'digital')
egl2021 3 days ago [-]
This is my candidate for the most influential master's thesis ever.
dbcurtis 3 days ago [-]
> The fundamental problem of communication is that of reproducing at one point either exactly or approximately a message selected at another point. Frequently the messages have meaning...

This is my candidate for the sickest burn in a mathematical journal paper...

zhangsen 3 days ago [-]
“Information theory” might be a misnomer.
keepamovin 2 days ago [-]
Why?
nsajko 2 days ago [-]
Hint: the title of this post, and of the paper, uses communication instead of information.
keepamovin 2 days ago [-]
Aaah. That's a good point. Hehe :) But I mean, he does really delve into information in his exposition. Very clear.

Tho I’m intrigued by the idea that there’s more to information than Shannon. Any additional hints??

I've often thought about a dual metric to entropy called 'organization' that doesn't measure disorder/surprise but measures structure, coherence. But I don't think it's exactly a dual.

groovimus 2 days ago [-]
Shannon's original paper on the topic was written during WWII; I believe it was classified, and it is much more concise as an introduction. After that came the famous and much more comprehensive 1948 paper, which expanded into the noisy coding theorem (Weaver later contributed an expository introduction when it was republished in book form). Meanwhile the original paper ("Communication in the Presence of Noise") was published in 1949, possibly after declassification. I highly recommend reading it first; it takes maybe an hour. Another terrific intro is a chapter of a book by Bruce Carlson, "Communication Systems: An Introduction to Signals and Noise...". I have a scan of the chapter linked here: https://drive.google.com/file/d/0B9oyGOnmkS7GTFlmQ2F1RWNFd28...
ShaneCurran 3 days ago [-]
Not many know about it, but this paper (written in 1948) stemmed from a lesser-known paper Shannon wrote in 1945 called "A Mathematical Theory of Cryptography"[0].

[0]: https://evervault.com/papers/shannon

pid-1 3 days ago [-]
As an undergrad I struggled to understand why log was used to measure information. Could not find a reason in any textbook.

Took a deep breath and decided to download and read this paper. Surprise, surprise: it's super approachable and the reasoning for using log is explained on the first page.
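
(The gist of that first-page argument, in a toy sketch of my own: with N equally likely messages the natural measure is log N, because independent choices multiply the number of possibilities while intuition says information should add, and log is exactly what turns multiplication into addition.)

    from math import log2

    coin, die = 2, 6                  # equally likely outcomes per source
    joint = coin * die                # independent sources: possibilities multiply
    print(log2(coin) + log2(die))     # 1.000 + 2.585 = 3.585 bits
    print(log2(joint))                # 3.585 bits: log turns * into +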

SatvikBeri 3 days ago [-]
Among other things, this paper is surprisingly accessible. You can give it to a beginner without much math background and they'll be able to understand it. I actually find it better than most modern books on information theory.
loph 3 days ago [-]
Shannon did a lot of interesting things beyond this paper.

If you become more interested in Claude Shannon, I recommend the biography "A Mind At Play"

https://en.wikipedia.org/wiki/A_Mind_at_Play

A very interesting person.

JoeDaDude 2 days ago [-]
Agreed. One niche interest of mine is his treatment of board games and the machines he built to play them.

https://boardgamegeek.com/geeklist/143233/claude-shannon-the...

mehulashah 3 days ago [-]
When you read this and think about the world he was in — it’s even more remarkable. How did he come up with it?
082349872349872 3 days ago [-]
From playing 20 Questions and attempting to formalise it?

EDIT: actually the cryptography connection is more likely: Leibniz was XVII; who was it that was already using binary alternatives for steganography a few centuries earlier?

EDIT2: did entropy in p-chem come before or after Shannon?

EDIT3: well before; S = k_B ln Ω was 1877.
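
EDIT4: and the connection is direct: for a uniform distribution p_i = 1/Ω, Gibbs's S = -k_B Σ p_i ln p_i reduces to k_B ln Ω, so Boltzmann's 1877 formula is the equiprobable special case of Shannon's H = -Σ p_i log p_i, up to the constant k_B and the base of the logarithm.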

duped 3 days ago [-]
This is apocryphal, but it probably had something to do with dropping shells on Nazis - he was developing fire control systems for the US Navy around the time he developed the theory, and he only published several years after the war.

Allegedly he also derived Mason's Gain Formula around the same time but that was classified until Mason published it.

ziofill 3 days ago [-]
I use this paper whenever I teach information theory. If you are mathematically inclined, I'd recommend reading the proofs of his two main theorems; they're illuminating.
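
(For reference, the two theorems: the source coding theorem, that a source of entropy H can be encoded at an average of H bits per symbol but no fewer; and the noisy-channel coding theorem, that any rate below the channel capacity C = max I(X;Y) is achievable with arbitrarily small error probability.)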
dilawar 3 days ago [-]
I find it incredible how "simple" his theories were and what an enormous impact they had. Is there anyone else who developed such seemingly "simple" theories?
robrenaud 3 days ago [-]
Newton.
aragonite 2 days ago [-]
The LaTeX code can be found at [1] (.tar.gz) or by clicking the 'directory' link towards the bottom of page [2].

[1] https://web.archive.org/web/20080516051043/http://cm.bell-la...

[2] https://web.archive.org/web/20080516051043/http://cm.bell-la...

the_panopticon 3 days ago [-]
Another great read from Shannon: https://archive.org/details/bstj28-4-656
kouru225 3 days ago [-]
Always upvote Shannon
FarhadG 3 days ago [-]
I recently went through two books: (1) Fortune's Formula and (2) A Man for All Markets. They both impressed upon me a deep appreciation for Shannon's brilliant mind.

Curious if there are any great resources/books you'd recommend on Information Theory.

chrispeel 2 days ago [-]
* The Idea Factory (https://en.wikipedia.org/wiki/The_Idea_Factory) includes info on Kelly (https://en.wikipedia.org/wiki/Kelly_criterion) as well as Shannon.

* A Mind at Play (https://en.wikipedia.org/wiki/A_Mind_at_Play) is a bio of Shannon. I think a much better bio could be written, but this is all we have.

* "Information Theory" by Cover and Thomas is a mathematical introduction to information theory. This is a technical book, and is very different than the books above.

If you haven't yet done so, read Shannon's paper as linked above.

JoeDaDude 2 days ago [-]
An Introduction to Information Theory: Symbols, Signals & Noise, by John Robinson Pierce

https://www.google.com/books/edition/An_Introduction_to_Info...

fat_cantor 2 days ago [-]
Cover & Thomas
intalentive 2 days ago [-]
Shannon:Kolmogorov::LLMs:minds
whereismyacc 3 days ago [-]
my holy book