# Is the Universe made of information?
February 15, 2007 7:05 AM

Can you help a reasonably smart layperson (who is not very good at math) understand how the word "information" is used in physics?

I'm talking about when "information" is NOT used to describe data exchanged between sentient beings. I keep hearing about how "the universe is information" or "the universe is a quantum computer."

I get how some natural systems, like DNA, contain data which nature uses to control some sort of process. Is this the core idea?

And I know that each age has described the universe in terms of its dominant technology (steam or whatever), and so we use information, because this is the "information age," but I also know that -- metaphor or not -- seeing the universe as information is yielding some profound ideas.

I've even read a book or two on the subject, but these books always seem to take off from the point-of-view that the universe IS information. I feel like I'm not getting to the heart of what this means.
posted by grumblebee to Science & Nature (27 answers total) 14 users marked this as a favorite

In physical systems, information is a measure of how much you can predict that an event will have a particular outcome.

For example, let's say you're reading a single point in a string of DNA.

You have four possibilities: A, C, T and G.

If you have no information about the string, then the probability that this single point will take on the value A, C, T or G is 1/4 for each. Anywhere you read the string, you could get an A, you could get a C, or a G or T. You don't know ahead of time whether you're more likely to get one or another base.

If you have some information about the string, then you can assign non-uniform probabilities to these letters. You might know that A is much more likely than the rest of the letters, for example, so its probability is, say, 1/2, while the rest have smaller probabilities.

Genomics can measure this information, which differs from organism to organism. Some organisms are more likely to have say, a T, at a certain point along the string, while other organisms are more likely to have a C. This is what information provides.

Deviation from uniform probabilities is a measure of information. The further away you get from a uniform probability distribution, the more information you have.
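
Here is a rough sketch of how you might put a number on "deviation from uniform." It assumes Shannon entropy as the yardstick. The 1/2 for A comes from the example above, but the even split of the other three bases is made up for illustration, and the function names are hypothetical.

```python
import math

def entropy_bits(probs):
    """Shannon entropy in bits: the average uncertainty about one symbol."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

def information_bits(probs):
    """Shortfall from the maximum (uniform) entropy, in bits per symbol."""
    return math.log2(len(probs)) - entropy_bits(probs)

# No knowledge about the base: A, C, G, T are equally likely.
uniform = [0.25, 0.25, 0.25, 0.25]

# Illustrative assumption: A has probability 1/2, the rest share the remainder.
skewed = [0.5, 1 / 6, 1 / 6, 1 / 6]

print(information_bits(uniform))  # zero bits: nothing is known in advance
print(information_bits(skewed))   # a positive number of bits
```

The further the probabilities drift from 1/4 each, the larger the second number gets, up to 2 bits when one base is certain.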

Another example is π. Its unending digits are statistically random: knowing a thousand million billion digits in the series tells you nothing — imparts no specific information — about what the next digit will be. Essentially the digits of π are a random string.

A system with information is an ordered system. The second law of thermodynamics says that, left to themselves, systems with higher order progress to become systems with lower order.

When the universe is described as information, essentially this means that the universe is not uniformly distributed, or "mixed". There are parts of the universe that have higher order — we can predict with higher levels of certainty that said parts can be described a certain way — which will eventually become parts with lower order. Once the universe is evenly mixed, any one part of the universe is just like another.
posted by Blazecock Pileon at 7:30 AM on February 15, 2007 [2 favorites]

Sounds like you've been reading the wrong books.

You need to read an Introduction to Information Theory. I highly recommend that specific book. Information theory underlies all modern communications, computation and is an essential idea in physics, as you've noted.
posted by vacapinta at 7:34 AM on February 15, 2007

i can't suggest elementary books, but some terms to look up are: information theory, digital physics, statistical mechanics, and natural computing.

the "core idea" is that information is physical (it is encoded into the configurations of energy and matter), and that physics is information (the behavior of energy and matter is determined by their information content).
posted by paradroid at 7:44 AM on February 15, 2007

Response by poster: What are quarks and leptons made out of? Modern physics doesn't have a good answer. Practically, they are just probability density functions that have certain parameters. For a mathematically inclined person, it is reasonable to think of them as collections of numbers that get plugged into the wave function. Numbers are information, so one way to think about the "fundamental" nature of the universe is information flow.

See, I can follow all that (or at least I think I can), but it sounds like you're saying "we don't know the ultimate makeup of matter, but we can create a mathematical model to help us predict how matter will behave." Which sounds sort of like what Blazecock Pileon was saying. We can make predictions.

But isn't that an old, old idea? Isn't that the foundation of science? Making models and using the models to make predictions. I still feel like I'm missing something. How does the word "information" add anything?
posted by grumblebee at 8:50 AM on February 15, 2007

Making models and using the models to make predictions. I still feel like I'm missing something. How does the word "information" add anything?

Information is useful for making useful predictions. For example:

A. Based on seeing how this coin flipped for the last ten tosses, I predict that this coin will next land either heads or tails

B. Based on seeing how this coin flipped for the last ten tosses, I predict that this coin will next land a head

If you're looking for a fair coin, which set of prior experience would you expect to point you to the fair coin?

If you want to get into the mathematics of the above experiments, this is a statistical measurement based upon conditional probability, which is a measurement of the amount of information imparted by prior conditions ("prior knowledge").

You can extrapolate coin flipping to modeling other physical phenomena on all scales, like particle decay, genetic mutation, etc.
posted by Blazecock Pileon at 9:03 AM on February 15, 2007

Response by poster: I started to think about all this, recently, when I read an interview with Seth Lloyd, author of Programming the Universe (which I haven't read), in the July/August issue of "Technology Review."

The interviewer mentions Lloyd's claim that "the universe is indistinguishable from a quantum computer." To which Lloyd responds...

"I know it sounds crazy ... but it's factually the case. We couldn't build quantum computers unless the universe was quantum and computing. We can build such machines because the universe is storing and processing INFORMATION in the quantum realm." [emphasis added]

"How can [an electron] have information associated with it? The electron can either be here or there. So it registers a bit of information, one of two possibilities: on or off."

Again, I understand this -- on a surface level -- but I feel like my brain has not flipped over into REALLY understanding it. What does he mean by "it registers"? Is he personifying to make a point, or is there some sense in which it really DOES register? In other words, is this just a case of, "hey, an electron can have these two states, and if we find it useful, we can LOOK AT these two states as if they were binary operations in a computer"? Or is there something more to it?

More Lloyd...

"If you're looking for places where the laws of physics allow for information to be injected into the universe, then you must look to quantum mechanics."

and...

"This notion of the universe as a giant quantum computer gets you something new and important that you don't get from the ordinary laws of physics. If you look back 13.8 billion years to the beginning of the universe, the Initial State was extremely simple, only requiring a few bits to describe. But I see on your table an intricate, very beautiful orchid -- where the heck did all that complex information come from? The laws of physics are silent on that issue. They have no explanation. They do not encode some yearning for complexity.

Could the universe have arisen from total randomness? No. If we imagine that every elementary particle was a monkey typing since time began at the maximum speed allowed by the laws of physics, the longest stretch of 'Hamlet' that could have been generated is something like, 'to be or not to be, that is the--.' But imagine monkeys typing at computers that recognize the random gibberish as a program. Algorithmic Information theory shows that there are short, random-looking programs that can cause a computer to write down all the laws of physics. So for the universe to be complex, you need random generation, you need something to process all that information according to a few simple rules: in other words, a quantum computer."

So, again, I get all that, but "information" still slips from my grasp. I tried http://en.wikipedia.org/wiki/Information, and it's eloquent about information in many other fields, but as soon as it gets to physics, it trickles to just this:

"Information has a well defined meaning in physics. Examples of this include the phenomenon of quantum entanglement where particles can interact without reference to their separation or the speed of light. Information itself cannot travel faster than light even if the information is transmitted indirectly. This could lead to the fact that all attempts at physically observing a particle with an "entangled" relationship to another are slowed down, even though the particles are not connected in any other way other than by the information they carry."
posted by grumblebee at 9:14 AM on February 15, 2007

Response by poster: Okay, to me, information is

(a) something transmitted from a broadcaster and taken in by a receiver. Something is spitting out X and X is being pulled into something else. (It's fair for those two somethings to be two parts of the same system, like a computer's CPU and memory.)

(b) the transmitted data must be more useful to the receiver than random noise. Often, this means the receiver can make something (a new idea, a building, a biological creature) that it couldn't have made if it had received nothing or just random noise.

Is this the point?

posted by grumblebee at 9:20 AM on February 15, 2007

Think of information as a measure of an object's non-randomness, or certainty of an outcome.

If you took the example of quantum entanglement, the information is the certainty you have that particles have a particular state relative to each other.

If you did an experiment on one object to figure out its state, you'd have certainty (information) that the other object has a particular state, as in the Einstein-Podolsky-Rosen paradox.
posted by Blazecock Pileon at 9:28 AM on February 15, 2007

Often, this means the receiver can make something (a new idea, a building, a biological creature) that it couldn't have made if it had received nothing or just random noise.

Yes.

Let's say you had a living cell, with all the transcriptional and translational machinery to make new proteins.

Without getting into the details, transcriptional machinery works by looking for specific information, namely a start-and-stop pattern made up of "start" and "stop" signals in the genome.

Those patterns have to be in the genome in specific locations. The right signals have to be between these start and stop signals. These signals add information (or order) to the genome.

Combinatorially speaking, a random genome is very unlikely to have the start and stop signals in the right place, and even less likely to have the right signals in between to make all the working proteins needed to make a living cell.

Random or damaged genomes will be selected against, leaving genomes behind with measurable information. Information helps the genome machinery work, making the cell work.
posted by Blazecock Pileon at 9:42 AM on February 15, 2007

Response by poster: Thanks for all the time you've taken, BP, b1tr0t and others. If anyone else would like to chime in, the more analogies, the better!
posted by grumblebee at 9:50 AM on February 15, 2007

grumblebee, it sounds like you want people to teach you the basics of information theory in this thread.

Although the word "information" is a pretty large concept, the term "information theory" is very well defined.

The seminal paper is one by Claude Shannon which revealed some remarkable theorems about information as akin to a physical quantity which can be measured and which places constraints on all types of interactions. Seriously, Shannon should have won the equivalent of the Nobel prize for his work and it's a shame that more people don't know who he is. The book I cited above shows examples of information theory in Biology (as BP has illustrated too), examples in Language (language is highly redundant in its information content and you can show this mathematically) and most importantly in communication - in data and signals and noise.

What folks have been discovering is that many of the principles which apply to abstract data in the realm of information theory also apply, at the fundamental level, to our own universe. Particle interaction is a form of information propagation and each particle is essentially a bundle of information (all of its state information).

Information theory, as revealed by Shannon, is a beautiful thing and will increase your understanding of the world around you in so many ways. I beg you: pick up a good book on the subject and try to understand the basic principles then come back here if you have questions or even feel free to email me.

But, it sounds like you're looking for a shortcut, a quick fix. As b1tr0t says above, there is none. You're going to actually have to do some work here.
posted by vacapinta at 10:05 AM on February 15, 2007

Here's a quote from a paper I wrote on DNA substring entropy a while back:

"If the sequence of a genome were entirely random, for example, transcriptional machinery would be unable to locate genes or transcribe correct genes. This machinery appears to rely on low entropy regions of the genome to make the various proteins needed for a cell to function [1].

"More generally, the probability distribution of bases within a genomic sequence signifies its information content. The level of entropy in a sequence is maximized when the expected probability of any one base or sequence of bases is uniform, which is to say entropy is a measure of the randomness of a sequence. Likewise, entropy is minimized when there is probabilistic certainty, or a high level of information, about the expected base or bases in a sequence.

"Given that larger genotypic (coding) sequences within a genome contain higher levels of information than stretches of "junk" DNA, aside from the noise of insertion, deletion or replacement mutations that collect over time, we might expect that different species will carry similar distributions of substrings of sequences derived from a common ancestor."

[1] Gatlin, L. L. (1965). The information content of DNA. Journal of Theoretical Biology 10:2 281-300.

That last paragraph didn't turn out to be as true as I hoped, but the paper should make an interesting read for you, since it uses information in the quantitative sense you might want to begin to think about.
posted by Blazecock Pileon at 10:05 AM on February 15, 2007

Response by poster: grumblebee, it sounds like you want people to teach you the basics of information theory in this thread.

No, I don't expect that. I just want a definition of a word (BP and others have provided that). I'm not expecting a definition to explain everything to me. But it's a start.

I defined it, above, and I'm hoping people in the know will explain what's right, wrong or missing about my definition (BP has done that, to some extent). When I ask what's missing, I don't mean "what my definition fails to capture about information theory." I mean "what my definition fails to capture about the way Information theorists define the word 'information.'"

Most laypeople have used the word "information" all their lives. If I say, "I read some information on a sign," most people will assume I mean "some symbolically-encoded data, left there by a person who intended another person to decode it."

If you have this definition in your mind, it's hard to gel that with the Information Theory use of the word.

I'm not naive. I know that (most) Info Theorists aren't implying that some guy created the laws of the universe and wrote them down in a big book. But I also assume that the lay definition shares SOMETHING with the Info Theory one.

I'm trying to pull the two together (or, if necessary, push them apart).
posted by grumblebee at 10:25 AM on February 15, 2007

Response by poster: Thanks for that last bit, BP. It's getting clearer:

If a "listener" "hears" a stream of content, and he finds he can make predictions, based on what he hears (besides the prediction that "what follows will be completely random"), we call this content information.

If he can't make predictions, it's not information, it's just noise.
posted by grumblebee at 10:31 AM on February 15, 2007

Response by poster: So with DNA, the "hearer" is -- are -- the mechanisms in the body that, by interpreting the DNA sequences, build proteins. If DNA were random, these mechanisms wouldn't be able to build proteins.

Moving on to the universe, Lloyd (and others) claim that there are content-holding structures (or patterns) in the universe, e.g. electrons, which hold the "content" of whether they are "here or there."

There must be other structures that "read" that content, discover it to be non-random, and use it to build with.

What are these "listener" structures? We know what they are in computers. We know what they are in the human body. What are they in the cosmos?

Is this known, or is it merely that -- without knowing what they are -- we know (or assume) they must exist, because otherwise (as Lloyd suggests) we can't account for the complexity of, say, an orchid?
posted by grumblebee at 10:37 AM on February 15, 2007

If a "listener" "hears" a stream of content, and he finds he can make predictions, based on what he hears (besides the prediction that "what follows will be completely random"), we call this content information.

You're describing a method of determining information entropy.
posted by vacapinta at 10:43 AM on February 15, 2007

If he can't make predictions, it's not information, it's just noise.

Complete uncertainty is statistical randomness, which is "maximally entropic".

A concrete example is to flip a fair coin and ask someone to tell you if it is heads or tails. That person knows the coin is fair because you told her it was fair, or perhaps you flipped it a hundred times and she heard you say that the number of heads is roughly equal to the number of tails.

Since you both know you are flipping a fair coin, your friend has zero information that her guess is correct. She could guess heads, or tails, and have no certainty that her guess is correct until observing the outcome of the trial. This state is maximally entropic.

In the second case, you flip a coin fifty times in front of your friend, and the coin always comes up heads. Your friend hears you tell her that the coin came up heads on every toss.

You flip the coin once more and hide the result. You'd expect that your friend would guess that the coin came up heads again. There is information gained from those fifty coin tosses (trials) which conditions the outcome she would expect from the next trial. This coin has less entropy (or more information) than the fair coin.
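
To make the two coin cases concrete, here is a minimal sketch. It assumes your friend turns the fifty observed heads into a probability using Laplace's rule of succession; that rule is my assumption, not the only reasonable choice, and the function names are hypothetical.

```python
import math

def binary_entropy_bits(p_heads):
    """Uncertainty (in bits) about one flip of a coin with P(heads) = p_heads."""
    if p_heads in (0.0, 1.0):
        return 0.0
    q = 1.0 - p_heads
    return -(p_heads * math.log2(p_heads) + q * math.log2(q))

def estimate_p_heads(heads, flips):
    """Laplace's rule of succession: an assumed way to turn counts into a probability."""
    return (heads + 1) / (flips + 2)

# First case: your friend is told the coin is fair.
fair = binary_entropy_bits(0.5)

# Second case: fifty flips, fifty heads.
lopsided = binary_entropy_bits(estimate_p_heads(50, 50))

print(fair)      # one full bit: maximally uncertain
print(lopsided)  # well under one bit: the past flips carry information
```

The gap between the two numbers is one way to express how much those fifty tosses told her about the next one.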
posted by Blazecock Pileon at 10:54 AM on February 15, 2007

Response by poster: Thanks for that link, vacapinta. Here's the relevant section:

An intuitive understanding of information entropy relates to the amount of uncertainty about an event associated with a given probability distribution. As an example, consider a box containing many coloured balls. If the balls are all of different colours and no colour predominates, then our uncertainty about the colour of a randomly drawn ball is maximal. On the other hand, if the box contains more red balls than any other colour, then there is slightly less uncertainty about the result: the ball drawn from the box has more chances of being red (if we were forced to place a bet, we would bet on a red ball). Telling someone the colour of every new drawn ball provides them with more information in the first case than it does in the second case, because there is more uncertainty about what might happen in the first case than there is in the second. Intuitively, if we know the number of balls remaining, and they are all of one color, then there is no uncertainty about what the next ball drawn will be, and therefore there is no information content from drawing the ball. As a result, the entropy of the "signal" (the sequence of balls drawn, as calculated from the probability distribution) is higher in the first case than in the second.

If I count 1,2,3,4,5,6..., you can stop listening for a while, because you "get it."

If I say, "87,5,12,3,192,43,6,9,9,9,32..." you have to keep paying attention, because you can't just say, "oh, I see the pattern."

???
posted by grumblebee at 10:56 AM on February 15, 2007

Response by poster: Am I confusing "more information" with "a different kind of information" (more/less entropic)?
posted by grumblebee at 10:58 AM on February 15, 2007

Response by poster: Do TV white noise and an episode of "The Honeymooners", displayed for the same length of time, have the same amount of information (but different amounts of entropy)?
posted by grumblebee at 11:00 AM on February 15, 2007

The word is used in a few different ways. One way concerns things like entanglement, and it seems like you get that idea. Basically, the wavefunction of a particle can be thought of as a bunch of information about the particle. The spatial part of the wavefunction has information about the probability of finding the particle in a given spot. The spin part has information about the probability of finding it pointing one way or another, and so on. This type of information you can actually use to do things like quantum computing.

There's another related meaning, touched on above, which concerns the states of systems of particles. For a simple illustration, you can consider a collection of electrons, all aligned so that they're pointing up. You really only need two pieces of information to describe such a collection: the number of electrons and which way they're pointing.

Now flip one the opposite way. You need more information to describe the system: the number of electrons, which way most of them point, and the location of the one pointing the other way.

The reason you need more information is that there are more possible states for the system to occupy--that is, there are many possible ways to just flip one electron (N ways if you have N electrons), but there's really only one way to have none flipped.
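
A quick sketch of that counting argument, for anyone who wants to see the numbers. The choice of 1024 electrons is arbitrary and the function name is hypothetical; the only idea used is that picking one arrangement out of W equally likely ones takes log2(W) bits.

```python
import math

def bits_to_locate_flips(n_electrons, n_flipped):
    """Bits needed to say which electrons are flipped, out of all possible choices."""
    ways = math.comb(n_electrons, n_flipped)  # number of distinct arrangements
    return math.log2(ways)

N = 1024  # arbitrary illustrative system size

print(bits_to_locate_flips(N, 0))  # one arrangement only, so zero bits
print(bits_to_locate_flips(N, 1))  # N arrangements, so log2(N) bits
print(bits_to_locate_flips(N, 2))  # more arrangements, more bits
```

With none flipped there is nothing to specify; each extra flipped electron adds to the description you would need.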

The Universe itself started out as a very highly ordered state--like the electrons all pointing the same way. All you really need to describe it is the temperature, plus a map telling you where things were slightly hotter or cooler. As things evolved, those hot and cool spots turned into stars, galaxies, planets, and people. Suddenly you need a whole lot more information to describe the universe than just its temperature! The more information it takes to describe something, the less ordered it is. And since the second law of thermodynamics tells us that every system should become less ordered as time goes on, the history of the Universe is one of increasing amounts of information. In this sense, and over large time scales, entropy and information are interchangeable terms.

Finally, there's a theorem about black holes that says that the entropy of a black hole is proportional to its surface area. In this way, black holes tell you how much information can possibly be stored in any given volume.

On preview: the white noise vs. Honeymooners example is a good one. And I would say that yes, the amount of information in any given amount of time is the same from the point of view of the TV (which only cares about the color of each individual pixel), but The Honeymooners is much more highly ordered, so there's less entropy. Think about running the images through a compression algorithm. You won't lose much detail from the Honeymooners stills, even at significant compression, but the white noise will turn into large white and black blocks for large compression ratios.
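
If you want to try the compression idea yourself, here is a small sketch. Note the swap: it uses lossless zlib compression rather than the lossy kind described above, and the two "frames" are stand-ins I made up (random bytes for white noise, a half-black half-white picture for an ordered image).

```python
import random
import zlib

WIDTH, HEIGHT = 64, 64  # illustrative frame size, one byte per pixel

def noise_frame(seed=0):
    """A stand-in for TV static: every pixel is an independent random byte."""
    rng = random.Random(seed)
    return bytes(rng.randrange(256) for _ in range(WIDTH * HEIGHT))

def ordered_frame():
    """A stand-in for a structured picture: left half black, right half white."""
    row = bytes([0] * (WIDTH // 2) + [255] * (WIDTH // 2))
    return row * HEIGHT

noise = noise_frame()
ordered = ordered_frame()

noise_size = len(zlib.compress(noise))
ordered_size = len(zlib.compress(ordered))

print(len(noise), noise_size)      # raw size vs. compressed size for static
print(len(ordered), ordered_size)  # raw size vs. compressed size for the ordered frame
```

Both frames take the same number of bytes to display, but the ordered one squeezes down to a small fraction of its size while the static barely shrinks at all.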
posted by dsword at 11:36 AM on February 15, 2007

http://en.wikipedia.org/wiki/Kolmogorov_Complexity
http://en.wikipedia.org/wiki/Algorithmic_information_theory

You will want to read works on Information Theory, proper, before reading those, but they do address -- from a different angle -- the difference between your two sequences:

(a) 1,2,3,4,5,6...
(b) 87,5,12,3,192,43,6,9,9,9,32...

and offer a nice framework for understanding why it makes sense to think of (a) as containing less information than (b).
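
A rough illustration of the idea, with heavy caveats: true Kolmogorov complexity cannot be computed, so this sketch only compares the length of one short recipe for sequence (a) against the length of writing the sequence out. The recipe string and function names are hypothetical, and treating the literal listing as the best description of (b) is an assumption.

```python
# A short recipe (a Python expression) that regenerates sequence (a) up to any n.
RECIPE_A = '",".join(str(i) for i in range(1, n + 1))'

def literal_a(n):
    """Sequence (a) written out in full, as comma-separated text."""
    return ",".join(str(i) for i in range(1, n + 1))

def description_lengths(n):
    """Return (characters in recipe plus n, characters in the written-out sequence)."""
    written_out = literal_a(n)
    # Check that the recipe really does reproduce the sequence.
    assert eval(RECIPE_A, {"n": n}) == written_out
    return len(RECIPE_A) + len(str(n)), len(written_out)

# Sequence (b) as given in the thread; assumed to have no recipe shorter than itself.
LITERAL_B = "87,5,12,3,192,43,6,9,9,9,32"

for n in (10, 1000, 100000):
    print(n, description_lengths(n))
print(len(LITERAL_B))
```

For very short stretches the recipe is longer than the sequence, but as (a) grows the recipe stays almost the same size while the written-out version keeps getting longer. For (b), as far as anyone can tell from the digits given, the description grows right along with the sequence.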
posted by little miss manners at 12:50 PM on February 15, 2007

Response by poster: Having read through all the responses here so far, I have a much greater understanding than I did when I started. Thank you all for that.

But -- from the responses (whether or not this corresponds to the reality of Info Theory) -- I get the impression that "information" (as used by theorists) is a fuzzy term.

Sort of like the word "love." Which isn't necessarily a problem. One can give someone advice as to what to do when they fall in love, without precisely defining a word. We all have a general idea of what "love" means, and that general idea is usually good enough.

It seems a little strange, though, for a fuzzy concept like this to be used in a scientific discipline, but I realize that it sometimes does happen. For instance, many mathematicians don't hold a crystal clear definition of "number" in their heads, do they? I'm not really sure what a number is, but I still know how to take two of them and knock them together in useful ways.

I still suspect I'm wrong about this, but I'm struck by the lack of many responses that use strong "definitional" language like "information IS..." Instead, (very helpful) posters like dsword write things like "The word ... concerns things like entanglement."

This, again, reminds me of "love." I have a hard time saying, "love IS...". I'm much more comfortable with "love concerns things like attraction, sacrifice, etc."

So is it a fuzzy term or is it actually very precisely defined -- but hard to put into lay terms?
posted by grumblebee at 1:23 PM on February 15, 2007

Response by poster: Thanks. I guess vacapinta is right -- I was (unknowingly) trying to worm out an entire course in Info Theory.

Thanks for the course suggestions, b1tr0t. Learning more math is high on my list of things to do when I can carve out the time.

So I'll consider this question answered -- as much as it can be in a casual, layperson thread. (Though, of course, I'm always happy to hear more.)

I have learned that "information" is clearly defined within the info theory field, but that I don't (yet) have the math to understand this definition.

Meanwhile, from some of your analogies, I have a better gut feeling about it.
posted by grumblebee at 1:53 PM on February 15, 2007

Best answer: So is it a fuzzy term or is it actually very precisely defined -- but hard to put into lay terms?

It is defined very precisely. Information is measured in units called 'bits' which quantify how much information is in a particular channel.

You see "information" does not exist by itself, floating out in space. It has meaning because it depends on an external system which includes terms such as "message","encoding/decoding" and most importantly "sender" and "receiver" and "noise."

All of the above terms must be defined before you can start to discuss information. The Honeymooners has more meaning to me than to some alien from another planet. Random digits may be overwhelming noise hiding a signal, or they may be exactly the message I intended to send.

Information theory defines all these terms precisely and then proceeds to provide exact equations for how much information flow is possible in a given channel, in a given apparatus setup.

What information theory does not do is define information semantically. It's a purely physical definition and relies on a "sender" of that information.

Likewise, the entropy of a series is based on the conditional probability of each occurrence. But the thing about conditional probability is you can't assign such a thing unless you already had an associated expectation which can be defined inversely as an Uncertainty.

That's what the marble example is showing above. Entropy is related to uncertainty because it is related to conditional probability. Information in this sense is bits of data which help in lowering our uncertainty.
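
A small sketch of "bits which lower our uncertainty," using a box of coloured balls. The box contents and the hint ("the drawn ball is not red") are made up for illustration, and the function name is hypothetical.

```python
import math

def entropy_bits(counts):
    """Uncertainty (in bits) about the colour of one ball drawn at random."""
    total = sum(counts.values())
    return -sum((n / total) * math.log2(n / total) for n in counts.values() if n > 0)

# Illustrative box: four colours, equally common.
box = {"red": 2, "blue": 2, "green": 2, "yellow": 2}
before = entropy_bits(box)

# A message arrives: "the drawn ball is not red." Red is ruled out.
remaining = {colour: n for colour, n in box.items() if colour != "red"}
after = entropy_bits(remaining)

gained = before - after
print(before)  # uncertainty before the message
print(after)   # uncertainty after the message
print(gained)  # bits of information the message carried
```

The message did not tell us the colour, but it shaved a fraction of a bit off our uncertainty, and that fraction is what gets counted as the information it carried.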

The only reason this sounds fuzzy is probably the same reason quantum mechanics sounds fuzzy to people. It's fantastically well-defined but people are allergic to mathematical equations.
posted by vacapinta at 2:00 PM on February 15, 2007

Oh, one more thing: I hope I didn't sound snarky above. It's great that you're interested in this and this is the sort of stuff I love discussing in person. Unfortunately, it can be frustrating to express it in print.
posted by vacapinta at 2:37 PM on February 15, 2007

Best answer: Once you have a good handle on number theory, take a solid statistics course. Then take a basic calculus course (at this point, it will make much more sense to you than it did in high school or college), and finally move on to information theory. Lots of people skip right to information theory, but a solid foundation will help your understanding.

I would actually recommend that you take a good introductory probability theory class before you take statistics. Probability theory is used for modeling, while statistics teaches you how to test those models.

You'll learn discrete probability and distribution concepts that are very important for learning information theory.

Most statistics classes these days are of the "plug-and-chug" variety and don't spend much time on the underlying probability theory. So you learn how to plug numbers into Excel or R etc. and get some number out that you turn in for a good grade, but it's hard to really know what that number means...

While you'd learn how to test if a coin is fair, you probably wouldn't learn why the math works the way it does.
posted by Blazecock Pileon at 8:56 PM on February 15, 2007 [1 favorite]
