# What the heck is coinduction?
July 24, 2006 3:08 PM

Can someone explain coinduction (mathematics, computer science) in English?

I'm studying a textbook for a computer science class and I'm struggling with the concept as explained in the text. I'm perfectly comfortable with normal induction, but I don't know what a co-algebra or coinduction is.

The book attempts to explain it in the form of abstract statements, with no concrete examples, and thus, I'm beginning to flounder. The web is even less helpful. Can anyone here make it clear?
posted by knave to Science & Nature (7 answers total) 3 users marked this as a favorite

I'm not quite sure how to approach this. What's your background like? Which text are you using? Coinduction is not an exceedingly difficult topic, but it requires some unsightly machinery.
posted by j.edwards at 7:58 PM on July 24, 2006

Response by poster: Background: Bachelors in CS, minor in physics. Familiar with algebra, calculus, differential equations, linear algebra, discrete math, etc. etc.

I should be able to get the material just fine, but I'm studying from a textbook without actually taking a class, so there's no professor or lecturer to expand on what the book is saying.

I just need someone to explain coinduction in English.
posted by knave at 8:32 PM on July 24, 2006

Response by poster: If it helps, my two insights thus far have been:

1. Induction works on sets with a base case, or starting point, whereas coinduction works on streams, or sets with no beginning or end point.

2. Induction works by starting with a base case and constructing a set from that point. Coinduction works by starting with the entire solution space, and stripping away the non-members of the solution set.

If either or both are correct, it'd be nice just to hear that. If you can add to my current understanding, I'd be very happy. :)
posted by knave at 8:35 PM on July 24, 2006

Sort of (bear in mind this isn't my field, but I'll say what I know). Coinduction as I learned it was a method of proving the equivalence of opaque structures by exploiting bisimilarity (simply put, given two (not necessarily finite) state automata with outputs in the same set, they are bisimilar if there exists a binary relation that "matches up" their state transitions and outputs). "The" bisimilarity relation on a given automaton is the union of all such binary relations from that automaton to itself. Coinduction is then the principle that given a map from states to output, that map distributes over the bisimilarity relation. The effective statement is that two bisimilar start states will arrive at the same output.

As a method of proof, one would show that two states in a given automaton had the same output by demonstrating the existence of a bisimilarity relation between them.
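To make the automaton picture a bit more concrete, here's a small Python sketch (the encoding and all names are my own, and it only handles deterministic automata): two machines with different state spaces whose outputs "match up" step for step. Since true bisimilarity quantifies over all transitions, the check below is only a bounded stand-in.

```python
# A deterministic automaton: an output function and a transition function.
def run(output, step, state):
    """The stream of outputs observed from a start state."""
    while True:
        yield output(state)
        state = step(state)

def agree_up_to(n, stream_a, stream_b):
    """Bounded stand-in for bisimilarity: outputs match for n steps.
    (Real bisimilarity requires a relation matching *all* transitions.)"""
    return all(next(stream_a) == next(stream_b) for _ in range(n))

# Automaton A: two states {0, 1}, outputs alternate 0, 1, 0, 1, ...
a_out, a_step = (lambda s: s), (lambda s: 1 - s)

# Automaton B: four states {0..3}, same observable behavior.
b_out, b_step = (lambda s: s % 2), (lambda s: (s + 1) % 4)
```

The two start states are internally quite different (two states vs. four), but no amount of observation can tell them apart, which is exactly the situation bisimilarity captures.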

I'm hardly a pedagogue, so let me know if any of this is unclear.

For the co-algebra connection, think of induction as a principle on minimal algebras -- the canonical example would be the algebra formed by zero and the successor function (the minimal algebra therefrom being the natural numbers). A "minimal" co-algebra likewise has no quotients under an equivalence operation -- so in this instance, one could perform coinduction on a coalgebra (say, of an automaton) that had no distinct elements (states) separated by the equivalence relation (the bisimilarity relation). There's more content to this from a category-theoretic perspective, but I'm even more at a loss there and it sounds like that's not the angle you're going for.
posted by j.edwards at 9:16 PM on July 24, 2006 [1 favorite]

Best answer: I've got a similar background to you — BA in computer science and sociology, MS in computer science, working on PhD. I've recently tried to figure out coinduction as well; my understanding is pretty shaky, but I'll try to explain what I know. I should preface all of this by saying that you really should just ignore me and read the best tutorial paper I've found on the subject, which is "A Tutorial on (Co)Algebras and (Co)Induction" by Bart Jacobs and Jan Rutten. (But skip sections 4 through 6 at first, then come back and read them later.)

My explanation in three parts (just like they do it): coinductive data, coinductive functions, and coinductive proofs. First part, coinductive data.

So Jacobs and Rutten focus on infinite streams — like Haskell's lazy lists, if that means anything to you. An infinite stream can be broken into its head (first element) and tail (another infinite stream), but it doesn't necessarily ever bottom out with empty; it might just go on forever. Think of the list [1,2,3,4,5, ...] which keeps ascending without bound.
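In a language without built-in laziness you can model such a stream with a generator; a minimal sketch (the names `nats` and `take` are my own):

```python
def nats():
    """An infinite ascending stream like [1, 2, 3, 4, 5, ...].
    A generator computes elements only as they are demanded,
    which is what makes an infinite stream representable at all."""
    n = 1
    while True:
        yield n
        n += 1

def take(n, stream):
    """Observe a finite prefix of an infinite stream."""
    return [next(stream) for _ in range(n)]
```

`take(5, nats())` yields the prefix [1, 2, 3, 4, 5]; the rest of the stream simply never gets computed.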

These data structures can't be defined inductively — in order for an inductive definition to be well-founded it's got to have a base case and these guys don't have one. But consider the inductive definition of lists: usually it's given as
1. Nil is a list; and
2. If xs is a list then so is cons(x,xs) [where x is anything]
But if you think about it, there are lots of sets of values that satisfy those constraints; the set of all possible values in the universe is one of them. Really when we give an inductive definition like that, we mean the smallest set that satisfies the given constraints; everything that's in the set has some justification, no random junk is thrown in there for fun.
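Those two rules are exactly an inductive datatype; a rough sketch in Python (class and function names are my own):

```python
class Nil:
    """Rule 1: nil is a list."""

class Cons:
    """Rule 2: if xs is a list, then so is cons(x, xs)."""
    def __init__(self, head, tail):
        self.head = head
        self.tail = tail

def length(xs):
    """Structural recursion, justified by the 'smallest set' reading:
    every list was built by finitely many rule applications, so this
    recursion is guaranteed to bottom out at Nil."""
    if isinstance(xs, Nil):
        return 0
    return 1 + length(xs.tail)
```

The "smallest set" interpretation is what licenses functions like `length`: since every member of the set was built in finitely many steps, recursing toward `Nil` always terminates.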

Coinductive definitions are the complement of that. Instead of taking the smallest set that satisfies the given constraints, we can take the largest set that is consistent with them, meaning that there is nothing in the set that couldn't possibly be produced by some number of applications (perhaps infinitely many) of the rules.

Okay, it's probably not clear why those two things are different from each other. Let's have an example. Consider if we dropped the base case from our inductive definition of lists, and we just had "if xs is a list then so is cons(x,xs)". If it were an inductive definition, we'd be forced to say that there are no lists at all, because we have to interpret the statement as saying that the set of lists is the smallest set such that if xs is in it then so is cons(x,xs), and the empty set satisfies that trivially. But as a coinductive definition, it gives you lots of lists: just start with everything you can imagine and carve away the things that couldn't possibly be produced by applications of the rule. Infinitely long lists will never get carved away like that, so they're all a part of the new definition.
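A concrete instance of a list that survives the carving: the stream of all ones, built by applying cons(1, -) forever with no base case in sight. A sketch (the name `ones` is my own):

```python
from itertools import islice

def ones():
    """cons(1, -) applied forever: there is no base case, yet under
    the coinductive reading this is a perfectly good list."""
    while True:
        yield 1

# The whole structure never bottoms out, but any finite prefix
# is still observable:
prefix = list(islice(ones(), 4))
```

Inductively this definition would be empty; coinductively it denotes the infinite stream 1, 1, 1, ... of which `prefix` is just a window.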

Alright, so that's what the data look like. Now how do we define functions on them? Just like we can define functions that process inductively-defined data by giving a case for every one of the constructors of the data (you know, like writing a case for the empty list and a case for cons that uses recursion), we can define functions that produce coinductively defined data by specifying what happens when the result is broken down (in the case of lists by saying what happens when someone tries to take the first or rest of the list). Jacobs and Rutten put it like this:
• In an inductive definition of a function f, one defines the value of f on all constructors.
• In a coinductive definition of a function f, one defines the values of all destructors on each outcome f(x).
Jacobs and Rutten give a couple examples of coinductively-defined functions. Here's one: say you want to write a function odds that takes an infinite stream and returns another stream that consists of every other element of the original, starting with the first. You can define that as follows:
• first(odds(xs)) = first(xs)
• rest(odds(xs)) = odds(rest(rest(xs)))
Similarly, say you want to define merge(xs,ys) that consists of the first element of xs, the first element of ys, the second element of xs, the second element of ys, and so on. That's also easy:
• first(merge(xs,ys)) = first(xs)
• rest(merge(xs,ys)) = merge(ys,rest(xs))
These definitions are a little funny; at first glance (at least to programmers in non-lazy languages) it might look like they can't possibly work because they would never end. There are two answers to that objection: the first is that you can think of this as pure mathematical specification, so it doesn't matter if these things can actually be computed out on a machine as long as you can write down a specification. But another answer is that you can represent these lists in a computer just fine — all you have to do is realize that there's no need to actually compute the next element of any given infinite stream until somebody actually asks for it. So read the definitions as, "If someone takes the rest of odds(xs), then compute and return odds(rest(rest(xs)))." Of course that computation is another stream, which again is allowed to wait for somebody to actually ask for the rest before it actually computes anything, and so on.
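Read as demand-driven computations, the destructor equations for odds and merge translate almost directly into generator code; a sketch (the function names follow the answer, the implementation is my own, and both assume infinite input streams):

```python
def odds(xs):
    # first(odds(xs)) = first(xs); rest(odds(xs)) = odds(rest(rest(xs)))
    while True:
        yield next(xs)   # keep this element...
        next(xs)         # ...and silently drop the one after it

def merge(xs, ys):
    # first(merge(xs,ys)) = first(xs); rest(merge(xs,ys)) = merge(ys, rest(xs))
    while True:
        yield next(xs)
        xs, ys = ys, xs  # swap roles, exactly as in the equation

def nats():
    """The infinite stream 1, 2, 3, ..."""
    n = 1
    while True:
        yield n
        n += 1

def take(n, stream):
    return [next(stream) for _ in range(n)]
```

For example, `take(5, odds(nats()))` produces the prefix [1, 3, 5, 7, 9]; nothing beyond the demanded prefix is ever computed, which is exactly the "wait until somebody asks" reading described above.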

Alright, that's all I'm going to say about functions. The big question is, how do we prove things by coinduction? Well, again it's kind of analogous to proving things by regular induction. In regular induction, we say that some property holds for all elements in some inductively defined set if (a) it is true of all the base cases, and (b) for every inductively-defined case, the property being true of the smaller subpieces implies that the property is true for the larger construction. With coinduction you again get the complement: we can conclude that some property holds of some coinductively-defined structure if it holds for the observable "fringe" (the first element of the list, in our example) and if the property holding for the entire structure implies that it holds for the substructures (the property holding for the list implies that it holds for the rest of the list, in our example).

That's abstract, so we'll have to actually do a proof to see how it works. First of all, let's define another function evens(xs) = odds(rest(xs)); it defines the stream consisting of every other element of the input stream xs, starting with the second (i.e., it contains all the elements that odds(xs) would drop and drops the ones odds(xs) would keep). Now it seems clear that for any stream xs, xs = merge(odds(xs),evens(xs)). We can prove this claim by coinduction.

First (sort of our "base case"): we have to establish that first(xs) = first(merge(odds(xs),evens(xs))). By definition first(merge(odds(xs),evens(xs))) = first(odds(xs)) = first(xs), so that case is done.

Second (our "inductive case"): assuming that xs = merge(odds(xs),evens(xs)) (our "coinductive hypothesis") we have to establish that rest(xs) = merge(odds(rest(xs)),evens(rest(xs))). By coinductive hypothesis we have that rest(xs) = rest(merge(odds(xs),evens(xs))). By definition of merge this is equal to merge(evens(xs),rest(odds(xs))). Using the definitions of evens and odds we get that this is equal to merge(odds(rest(xs)),odds(rest(rest(xs)))); by the definition of evens again we have that this is equal to merge(odds(rest(xs)),evens(rest(xs))) as required. That completes the proof.
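The proof itself is about whole infinite streams, but the claimed equation xs = merge(odds(xs), evens(xs)) can at least be spot-checked on finite prefixes. A self-contained sketch with my own generator encodings of the functions:

```python
def nats():
    """A sample infinite stream: 1, 2, 3, ..."""
    n = 1
    while True:
        yield n
        n += 1

def odds(xs):
    while True:
        yield next(xs)   # keep one element
        next(xs)         # drop the next

def merge(xs, ys):
    while True:
        yield next(xs)
        xs, ys = ys, xs  # alternate between the two streams

def evens(xs):
    # evens(xs) = odds(rest(xs)): drop the first element, then take odds
    next(xs)
    return odds(xs)

def take(n, stream):
    return [next(stream) for _ in range(n)]

# The two sides of the theorem should agree on any finite prefix.
lhs = take(10, nats())
rhs = take(10, merge(odds(nats()), evens(nats())))
```

Of course, agreement on prefixes is evidence, not proof; the coinduction argument above is what covers the whole infinite stream at once.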

So there you have it, a quick intro to coinduction by somebody who really doesn't understand it too well himself. I do recommend you look at Jacobs and Rutten's paper, and also Andy Gordon's "A Tutorial on Co-induction and Functional Programming". There is a lot more to say on this subject, and in particular on why all of this stuff works and what its deep connections to regular induction (and regular algebras) are; but from a pragmatic standpoint it seems like this is a subject where you need to understand it on its own terms before you can appreciate the analogy.

Of course if you look into those you'll probably find that everything I've written here is seriously misguided and overly simplistic. If you do, please let me know what I'm getting wrong!
posted by jacobm at 3:38 PM on July 25, 2006 [1 favorite]

Response by poster: Thanks to both of you, I'm obviously still fuzzy, but you both helped a great deal. The papers you point to look good, and in fact, I had found one of them in previous web searches. I'll have to spend more time with both of them before I am really comfortable with it, but I do have a better general sense of what coinduction means now.