Bathrooms and Statistics
April 13, 2012 9:00 AM

My workplace has 3 small bathrooms. It seems to me that if I go to use one of them, and it's occupied, there is a statistically higher chance the 2nd and even 3rd bathroom will be occupied (vs the chance that they would be occupied if the 1st one was not). Is it true?

We have about 25 people at my office. I'm wondering if my instincts are correct. Could this be shown mathematically?
posted by lohmannn to Grab Bag (30 answers total)
 
Not without more information about patterns of bathroom usage.
posted by empath at 9:03 AM on April 13, 2012


No.

Unless you all had lunch together at a roadside taco stand or something.
posted by celtalitha at 9:03 AM on April 13, 2012 [3 favorites]


If there's some other event that synchronizes the time people need to use the bathroom, that could explain it. For example, if someone makes a pot of coffee in the morning and everyone goes and gets a cup at the same time, and the delay between drinking coffee and peeing is similar for the majority of people, it will result in bathroom collisions.
posted by XMLicious at 9:03 AM on April 13, 2012 [1 favorite]


If instances of going to the bathroom are purely independent (seems to me like arrivals at the bathroom would be a Poisson process), then my instinct is no: if one bathroom is occupied, it should not affect the odds that other bathrooms are occupied.

However, my guess is that bathroom arrivals are not independent events. Most people in your workplace wake up at the same time and eat meals at the same time. Thus, their urges to go to the bathroom will be more or less synchronized.
posted by deanc at 9:05 AM on April 13, 2012


I've observed a clear time-of-day effect. Stalls are typically booked solid at 8:30 AM (after first cup of coffee) and 1:00 PM (after lunch).
posted by ZenMasterThis at 9:06 AM on April 13, 2012 [2 favorites]


You're assuming people fill in the bathrooms rationally (i.e. 1, 2, 3) rather than having a preferred stall.

In my experience, most people have preferred stalls/toilets.
posted by phunniemee at 9:10 AM on April 13, 2012


It also depends on what strategies people are using, for instance if one of the bathrooms is farther away it might get used less often; except for the person who regularly seeks that bathroom out because it is the least used and therefore the cleanest.
posted by RonButNotStupid at 9:11 AM on April 13, 2012 [1 favorite]


If you took detailed notes on bathroom occupation over a long period of time you could graph it out and see if it shows a pattern.
posted by bleep at 9:33 AM on April 13, 2012


Because your office is small, if people went to the bathroom entirely randomly and independently, the chances of a second stall being occupied (given one was) should be slightly less than the chance of any stall being occupied on a given visit. When a stall is occupied, that's one less person who could randomly decide to go to the bathroom and occupy the second stall.

But in reality it's not likely that people go randomly. People might go more often just before or just after meetings, or just before or just after lunch for example. So if people have their meetings starting on the hour as usually happens, you'd maybe see a marked increase in visits just before and just after the hour and half hour.

The same kind of things can happen even if people interact informally a lot rather than have meetings. No-one is likely to go to the bathroom in the middle of an interaction. (cf Eating with people in a restaurant. You typically don't go in the middle of a lively conversation. But when one person goes, several might because it's a break in proceedings.)

So the answer to your question could well be "Yes, it is statistically more likely because the very presence of one person indicates an increased likelihood of the recent occurrence of some kind of event (Starbucks run, end of meeting etc) that increases the probability of bathroom visits happening."
posted by philipy at 9:36 AM on April 13, 2012 [2 favorites]


If there's a time-of-day effect, yes.

Otherwise, maybe. (Everybody's just saying "I don't think so" but nobody's actually stepped up and done the calculations. I'd do them but I should be working, and I'm not teaching the sort of class this semester where I could use this example.)
posted by madcaptenor at 10:06 AM on April 13, 2012 [1 favorite]


It's like the Monty Hall effect, isn't it? When you find #1 empty, do you go look at #2 and #3, or do you just use #1?
posted by thilmony at 10:16 AM on April 13, 2012


What do people do if #1 is occupied? Do they go check out #2 or #3, or wait for #1? The OP might be thinking that if #1 is occupied, there's a greater chance #2 and/or #3 are occupied because there is now 1 less bathroom available. Hence instead of there being 1 bathroom for 8.3 people, while #1 is occupied there is now only 1 bathroom for 12 people? (2 remaining bathrooms, 24 remaining people).
posted by cgg at 10:27 AM on April 13, 2012 [1 favorite]


Response by poster: What do people do if #1 is occupied? Do they go check out #2 or #3, or wait for #1? The OP might be thinking that if #1 is occupied, there's a greater chance #2 and/or #3 are occupied because there is now 1 less bathroom available. Hence instead of there being 1 bathroom for 8.3 people, while #1 is occupied there is now only 1 bathroom for 12 people? (2 remaining bathrooms, 24 remaining people).

I was thinking along those lines, yes.

For the purpose of the question, assume bathroom use is completely random throughout the day, and that no bathroom is preferred over the others.
posted by lohmannn at 10:43 AM on April 13, 2012


nobody's actually stepped up and done the calculations

Heh, I did but they kind of confirmed what everyone else is saying. If you assume that the probability of each stall being occupied at a given time is independent of the others (call this probability p), and you try the stalls in sequence, then:
  1. The probability of the first bathroom you try being unoccupied is (1-p).
  2. The probability of the first one being occupied and the second being unoccupied is p(1-p)
  3. The probability of the first two being occupied and the third being unoccupied is p^2(1-p).
  4. The probability of all three being occupied is p^3.
As long as p is less than 1/2 (i.e., a given stall is in use less than half the time), these four possibilities are listed in decreasing order of likelihood. If p is greater than 1/2, the probability of scenario 3 OR 4 happening (not being able to use the second stall) is also greater than the probability of scenario 2 (finding the second stall unoccupied). However, I suspect that it's not likely that a given stall is in use more than half the time.
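
As a quick check of the ordering claim, here is a minimal Python sketch. The function name scenario_probs and the sample values of p are illustrative assumptions, not part of the original comment; it only evaluates the four expressions listed above.

```python
def scenario_probs(p):
    """Probabilities of the four outcomes when trying three stalls in sequence,
    assuming each stall is independently occupied with probability p."""
    return [
        1 - p,             # 1. first stall free
        p * (1 - p),       # 2. first occupied, second free
        p**2 * (1 - p),    # 3. first two occupied, third free
        p**3,              # 4. all three occupied
    ]

# Sample occupancy probabilities (hypothetical values chosen for illustration).
for p in (0.1, 0.3, 0.6):
    probs = scenario_probs(p)
    decreasing = all(a > b for a, b in zip(probs, probs[1:]))
    label = "decreasing" if decreasing else "not decreasing"
    print(p, [round(x, 4) for x in probs], label)
```

The four values always sum to 1, and the only comparison that depends on p is scenario 3 versus scenario 4, which flips at p = 1/2.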

There's also the scenario where everybody uses the first bathroom stall they come to. In this case, the number of people in the bathroom at a given time is modeled by a Poisson distribution (as alluded to by deanc above), and the probability of finding n stalls in use at a given time is

P(n) = λ^n * e^(-λ) / n!

λ here is the average number of people in the bathroom at any given time. You're asking whether it's possible for it to be more likely that two or more people are in the bathroom than the probability that just one person will be in the bathroom; in other words, for what value of λ is it true that

P(1) < P(2) + P(3) + ...

(We'll assume that people just wait in line for a bathroom to become available if they don't find a stall when they get there.) Numerically, this works out to be λ > 1.256; in other words, if the average number of occupied stalls is greater than this number, then your observation would be correct. But I suspect, again, that this is higher than the actual use rate. Most likely, it's probably a combination of time-of-day effects and confirmation bias.
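
For anyone who wants to reproduce that threshold, here is a minimal Python sketch. The helper names (poisson_pmf, gap, threshold) and the bisection bracket of 0.5 to 3.0 are assumptions made for illustration; it solves P(1) = P(2) + P(3) + ... using the fact that the right-hand side equals 1 - P(0) - P(1).

```python
import math

def poisson_pmf(n, lam):
    """Poisson probability of exactly n people, with mean lam."""
    return lam**n * math.exp(-lam) / math.factorial(n)

def gap(lam):
    """P(two or more) minus P(exactly one); positive when 2+ is more likely."""
    p0 = poisson_pmf(0, lam)
    p1 = poisson_pmf(1, lam)
    return (1 - p0 - p1) - p1

def threshold(lo=0.5, hi=3.0, tol=1e-9):
    """Bisection for the lam where gap changes sign (gap is increasing on this bracket)."""
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if gap(mid) < 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

print(round(threshold(), 3))
```

Bisection is used here only because it needs nothing beyond the standard library; any root finder would do.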
posted by Johnny Assay at 10:53 AM on April 13, 2012 [1 favorite]


Response by poster: Because your office is small, if people went to the bathroom entirely randomly and independently, the chances of a second stall being occupied (given one was) should be slightly less than the chance of any stall being occupied on a given visit. When a stall is occupied, that's one less person who could randomly decide to go to the bathroom and occupy the second stall.

But doesn't this also imply there are less bathrooms available for the remaining people in the office, so it's more likely that a random specific one of them will be occupied?

This problem seems to be exhibiting some monty-hallness.
posted by lohmannn at 11:58 AM on April 13, 2012


But doesn't this also imply there are less bathrooms available for the remaining people in the office, so it's more likely that a random specific one of them will be occupied?

Clearly it does imply that.

I didn't understand what you were trying to ask before. I thought you wanted to discuss a real life observation about how often n stalls seem to be occupied.
posted by philipy at 12:29 PM on April 13, 2012


If stall A is occupied, is there an increased chance that stall B or C is occupied?

Well, with stall A occupied, there's one less employee who might want to use stall B or C. However, stall B or C presumably wouldn't be occupied until stall A was occupied. So the larger the number of employees in the company (beyond three), the bigger the chance that stall B or C would be occupied if stall A is occupied, versus stall A not being occupied.

I think. Maybe.
posted by davejay at 1:26 PM on April 13, 2012


Poisson-distribution-based answers assume there's a sort of order or ranking to the bathrooms, which may or may not be true (is bathroom A preferable to or closer than bathroom B?)
Also takes into effect realities of bathroom usage versus time of day.


Let's suppose there's no ranking, and no time of day correlation, because the math is much simpler. Then given 25 people and 3 bathrooms they'd choose among randomly, the probability of any given stall being occupied is:

25 * Pbn * 1/3 (for which bathroom they randomly chose.) = 8.33 Pbn

Pbn would be the probability an average person needs to use the bathroom now. So let's say everyone does that at work once a day, and it takes about 10 minutes. 8 hour work day has 480 minutes, Pbn = 1/48. But whatever the value we'll keep working in units of Pbn and it won't matter.

But when one stall is already taken, the probability of another being taken is:

24 * Pbn * 1/2 = 12Pbn, which is higher as long as Pbn is non-zero.

What if two are taken? Probability for the third is:

23 * Pbn


So with two assumptions, that bathroom choice is random evenly distributed, and that bathroom usage versus time of day is evenly distributed, your instincts are provably correct.
posted by oblio_one at 3:44 PM on April 13, 2012 [1 favorite]


So with two assumptions, that bathroom choice is random evenly distributed, and that bathroom usage versus time of day is evenly distributed, your instincts are provably correct.

You screwed up somewhere. Imagine there are 25 employees and 100 stalls, or hell imagine there's just 1 -- what are the odds that if there's one person when you walk in, that there is somebody already there waiting to use the bathroom?
posted by empath at 5:22 PM on April 13, 2012


I would say yes, for the reasons that oblio_one says. Because the distribution of urges doesn't change when someone is using a bathroom (except by that one person), and the time spent in the bathroom is non-zero, you are going to get stack-ups.

I think this is pretty much the same problem as the birthday distribution paradox. Except there are only three possible "birthdays".
posted by gjc at 6:15 PM on April 13, 2012


Of course you get stack ups, the question is whether a second person in the bathroom is more likely if there is one person in there, and that's wrong on the face of it.

oblio's math doesn't work if there are 100 stalls, because the percentage is

25 * Pbn * 1/100 for the first person, and 24 * Pbn * 1/99 for the next

25/100ths is more than 24/99ths.

And it shouldn't make any difference if there are 3 stalls or 100 stalls, if you're not considering people waiting in line.
posted by empath at 6:27 PM on April 13, 2012


You're right, an accurate accounting for the chances of some person being in a given stall would be:

[chances that only 1 worker is using the bathroom and chose this stall] + [chances that 2 people are using and one of them chose this stall] + [chances that all three stalls are occupied]

In my earlier post I considered only the first component, so I was way oversimplifying and underestimating the probability.
-
But same conclusion; if the first stall lohmannn checks is occupied, then all three components above are possible for the next stall checked. But if the first stall was empty, then the 3rd component is not possible, and thus the total probability for that 2nd stall to be occupied must be less. This holds no matter what the number of stalls, as that final component is always non-negative, and always removed once one stall is known empty. I think the final component is also the largest chance in every case, in other words:

(25/3 Pbn + 25*24/(2*3) Pbn + 25*24*23/(3*2) Pbn) > (25/3 Pbn + 25*24/(3*2) Pbn), by 2300 Pbn.
posted by oblio_one at 11:17 PM on April 13, 2012


Response by poster: Of course you get stack ups, the question is whether a second person in the bathroom is more likely if there is one person in there, and that's wrong on the face of it.

I may not have explained this clearly. We have 3 separate bathrooms, each only holds 1 person. So if I go to use bathroom 2, and it's occupied, I have to go to bathroom 3.

It seems to me that the odds would depend on the relative likelihood of two people needing to go to the bathroom at (relatively) the same time, vs the proportion of bathrooms to people.
posted by lohmannn at 6:51 AM on April 14, 2012


I think the simplest way to see this is that if bathroom one is in use, bathroom two could contain anyone who usually goes first to bathroom two plus anyone who first tries bathroom one and then, if that's occupied, on to number two.

By contrast, if number one is empty, bathroom two is only used by the folks who go there first. This is a smaller group, so bathroom two is less likely to be occupied if number one is free.

(This assumes there are at least three people, though.)
posted by wyzewoman at 1:22 PM on April 14, 2012


Or even simpler: imagine bathroom one is not merely in use, but is closed for repairs. (And one employee is tied up doing said repairs.). Now the other bathrooms will obviously get more use...
posted by wyzewoman at 1:27 PM on April 14, 2012


I think everyone here is answering somewhat different questions because it's not clear what exactly you want to know, and perhaps you're not quite clear on that yourself.

But if you could formulate your question in precise enough math-speak for us to answer, you'd probably also be able to answer it yourself.

If you still care about the answer, you might want to explain some of the following:

Is this a logic exercise or are you trying to understand real-life?

Are you trying to prove or disprove some hypothesis? If so, what? Examples of what you might be trying to prove, disprove, or asking to be taken as a given:

- People visit randomly and independently
- People in general have preferences for some stalls or others
- Individual people have individual, possibly different, preferences

If this is a logic puzzle where people are assumed to arrive randomly and choose stalls randomly, the following would seem to clearly follow...

Case 1) You arrive, no one else is there. This case is irrelevant to your question, as I understand the question. (But maybe not to what you really intended.)

Case 2) You arrive, exactly one stall is occupied. By the assumption that people choose stalls randomly, the prob of any given stall being occupied is 1/3.

Case 3) You arrive, exactly two stalls are occupied. By the assumption that people choose stalls randomly, the prob of any given stall being occupied is 2/3.

If you are asking us on what proportion of visits you would encounter cases 1, 2 and 3 respectively, that is a different question and has nothing to do with which stalls are occupied.

If you want to know whether people on the whole prefer some stalls over others, then you would want to look at whether the observed patterns differ "significantly" from the distribution mentioned above. i.e. If you have a theory that people prefer stall A, you would expect to find stall A occupied significantly more often than 1/3rd of the time in case 2 and significantly more than 2/3rds of the time in case 3.

What would count as "significant" is a lot harder of a question to answer, and would depend on the strengths of the preferences and the number of observations you have made.

If you have some specific hypothesis here that you want to talk about, you'd have to tell us exactly what that hypothesis is. The statement of the hypothesis would be in something like this form:

- Given unrestricted choice, the prob of choosing A=a%, B=b% and C=c%. (Where a+b+c = 100%)
- Given A-is-occupied, the prob of choosing B=b1%, C=c1% (b1 + c1 = 100%)
- Given B-is-occupied, ..... etc etc

In the extreme example where everyone always prefers A to B or C, and everyone always prefers B to C when A is not available, we would predict that in Case 2) you'd see A occupied 100% of the time, and in case 3) you'd see A and B occupied 100% of the time.

If in 10 visits where exactly two stalls are occupied, stall B was occupied 6 or 7 times, it looks like people's choices (about B) are pretty much random. If it's 9 or 10 times, maybe not. But whatever you observed it's not strong evidence either way.

If however you keep count for 100 visits where two stalls were occupied, you can have much more confidence in your theory, because "90/100" versus "67/100" is much less likely to come about by accidentally running into uncommon behavior than "9/10" versus "7/10".
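
To put rough numbers on that comparison, here is a minimal Python sketch. It assumes each observation is an independent trial in which stall B is occupied with probability 2/3 (the "random choice" case above); the function name tail_prob is a made-up helper.

```python
from math import comb

def tail_prob(n, k, p):
    """P(X >= k) for X ~ Binomial(n, p): the chance of seeing k or more hits by luck."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

p_random = 2 / 3  # chance stall B is occupied if choices are purely random (case 3)

small = tail_prob(10, 9, p_random)     # 9 or more out of 10 visits
large = tail_prob(100, 90, p_random)   # 90 or more out of 100 visits

print(f"P(>= 9 of 10)   = {small:.4f}")
print(f"P(>= 90 of 100) = {large:.2e}")
```

The first probability works out to exactly 2048/19683, about 10%, so 9 out of 10 is unremarkable under random choice, while 90 out of 100 is far less likely to happen by chance.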

Hopefully this answers your question, or if not, at least explains how to clarify exactly what you want to know.
posted by philipy at 10:07 AM on April 15, 2012


Response by poster: philipy: thank you for taking time to answer my question. I'm glad someone is still around! Now, on to your questions:

Is this a logic exercise or are you trying to understand real-life?

Logic question. Obviously bathroom use is affected in real-life ways that are mathematically uninteresting (to me).

If this is a logic puzzle where people are assumed to arrive randomly and choose stalls randomly

Yes.

Are you trying to prove or disprove some hypothesis?

Yes.

As a reminder: There are no "stalls." There are separate bathrooms, each holding only 1 person. The bathrooms are geographically distant, so it's impossible to know if a given bathroom is occupied or not without first walking over to it. Also, assume no bathroom is preferred over any other.

My hypothesis is that if I attempt to use bathroom 1, and find it's occupied (and thus unavailable for my use), the odds of bathroom 2 being similarly occupied are higher than they would have been if bathroom 1 was not occupied. (In other words, if the odds of me getting up to use a bathroom and finding it occupied are 1/4, the odds of finding a bathroom occupied after finding my first choice occupied are 1/3.)

My reasoning:

1. The fact that bathroom 1 is occupied could mean that someone attempted to use either bathroom 2 or bathroom 3 (or both) and found them occupied.

2. The fact that bathroom 1 is occupied means there is 1 less bathroom available, so if someone had gotten up just before me to use the bathroom, they had a 50% chance of taking bathroom 2 instead of just 33%.

I hope that makes sense. I can't seem to wrap my brain around how one might go about proving this. A few answers above seem to have come close...
posted by lohmannn at 7:17 PM on April 15, 2012


Best answer: Ok, here's my analysis.

But first some notation. It probably doesn't need explaining, but just in case...

- Cn means "exactly n rooms are occupied", where n = 0,1,2 or 3.
- p(something | whatever) means the probability of "something" being true given "whatever" is true.
- x,y means x and y are both true
- R means "Room R is occupied", where R is A, B or C.

So for example p(B | A, C2) means "the probability of B being occupied given it is true that A is occupied and exactly two rooms are occupied".

So, first let's ask what is p(B) if we don't know anything else. (i.e. If we just go look at room B, how likely is it to be occupied?)

p(B) = p(B|C1) p(C1) + p(B|C2)p(C2) + p(B|C3)p(C3)

Similar to my last comment, p(B|C1) = 1/3 and p(B|C2) = 2/3, while p(B|C3) is 1.

So [Eq 1]:

p(B) = 1/3*p(C1) + 2/3*p(C2) + p(C3)

What the Ci actually are will depend on things like how many people there are, how often they go to the bathroom, and how long they stay there. But for the purpose of what you want to know, it turns out we don't care too much.

Now, what is p(B|A)? (i.e. How likely is B to be occupied if A certainly is?)

Similar equation to before:

p(B|A) = p(B|A,C1) p(A,C1) + p(B|A,C2)p(A,C2) + p(B|A,C3)p(A,C3)

The first term is zero. If exactly one stall is occupied and A is, then B can't be.

We'll skip the second term and look at the third one next. p(A,C3) is actually just p(C3), and p(B|A,C3) is 1, because in both cases C3 implies all rooms are occupied. This is the same as the third term in [Eq 1].

So the interesting part is the second term: p(B|A,C2)p(A,C2)

p(A,C2) = p(C2|A)p(A)

But since we are taking as given that A is occupied p(A) is 1, so p(A,C2) = p(C2|A)

And p(B|A,C2) is 1/2, because as you reasoned, if exactly two stalls are occupied and one of them is certainly A, there's a 50-50 chance the other is B.

So [Eq 2]:

p(B|A) = 1/2*p(C2|A) + p(C3)

Comparing Eq 1 and 2, we can see that your hypothesis will be true if and only if:

1/2*p(C2|A) > 1/3*p(C1) + 2/3*p(C2)

Now what can we say about the p(Cn) and p(C2|A)?

Well... p(C1) > p(C2), and if people don't go to the bathroom very often and there aren't huge numbers of them in the office then p(C1) is likely much greater than p(C2). We could say that in a "typical" scenario, the term 2/3*p(C2) will be quite small in comparison to the others.

The exact formula for p(C2|A) would be complicated, depending on prob of a person going, the number of people etc. But we can again make a simplifying approximation like so...

If the number of people in the office is large enough that one person being for sure in A doesn't have much effect on the chances of the remaining people wanting to use the bathroom, then p(C2|A) will be roughly the same as p(C1), although "a bit" lower. That is because, assuming randomness and independence, the chance of exactly 1 of the remaining 24 people wanting to go to the bathroom in the relevant window is pretty much the same as the chance of exactly 1 of 25 people wanting to go to the bathroom in the first place.

So with these reasonable simplifying assumptions for the "typical" scenario, it turns out your hypothesis is true if and only if:

1/2*p(C1) > 1/3*p(C1)

Which is certainly true. So, congratulations!

But, alas, there can be "atypical" scenarios in which your hypothesis would be false.

For example, if there were only 5 or 6 or 7 people in your office, it is probably not often that two will want to go to the bathroom at the same time. Then the fact of one of them being in bathroom A would indicate that the other bathrooms are *more* likely to be empty. i.e. p(C2|A) would be much lower than p(C1).

All that is complicated enough that I might have slipped up. So I also made a little simulation, and it's coming out roughly the same. If you want a copy and have Python, you can get it here.

If anyone finds problems in the analysis or simulation, please do let me know.
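
For readers without the linked script, here is a minimal sketch of what such a simulation could look like. It is not the script mentioned above, and every modeling choice in it is an assumption: time advances in one-minute ticks, p_go is the per-minute chance that someone decides to go, stay is the visit length, people try the rooms in a random order and take the first free one, and anyone who finds all rooms occupied gives up and may try again later.

```python
import random

def simulate(n_people=25, n_rooms=3, p_go=0.005, stay=5, steps=100_000, seed=1):
    """Estimate p(B) and p(B|A) by sampling room occupancy once per minute.

    All parameter values are illustrative assumptions, not measurements.
    Room 0 plays the role of A and room 1 the role of B.
    """
    rng = random.Random(seed)
    room_free_at = [0] * n_rooms    # minute at which each room becomes free
    busy_until = [0] * n_people     # minute at which each person leaves the bathroom
    count_a = count_b = count_ab = 0

    for t in range(steps):
        for person in range(n_people):
            # Skip people already in a bathroom, and people who don't need to go now.
            if busy_until[person] > t or rng.random() >= p_go:
                continue
            # Try the rooms in a random order and take the first free one.
            for room in rng.sample(range(n_rooms), n_rooms):
                if room_free_at[room] <= t:
                    room_free_at[room] = busy_until[person] = t + stay
                    break
            # If every room was occupied, the person gives up for now.

        # Snapshot: what would someone walking up at minute t find?
        a = room_free_at[0] > t
        b = room_free_at[1] > t
        count_a += a
        count_b += b
        count_ab += a and b

    p_b = count_b / steps
    p_b_given_a = count_ab / count_a
    return p_b, p_b_given_a

p_b, p_b_given_a = simulate()
print(f"p(B)   ~ {p_b:.3f}")
print(f"p(B|A) ~ {p_b_given_a:.3f}")
```

Comparing the two printed estimates shows whether, under these particular assumptions, finding A occupied makes B more likely to be occupied; changing n_people, p_go, or stay lets you explore the "atypical" scenarios too.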
posted by philipy at 12:42 PM on April 16, 2012 [1 favorite]


Response by poster: philipy: Fantastic!!! Thank you!
posted by lohmannn at 5:59 PM on April 16, 2012


Small correction....

On reflection instead of:

p(B|A) = p(B|A,C1) p(A,C1) + p(B|A,C2)p(A,C2) + p(B|A,C3)p(A,C3)

I think it should be:

p(B|A) = p(B|A,C1) p(C1|A) + p(B|A,C2)p(C2|A) + p(B|A,C3)p(C3|A)

That doesn't make much difference to the analysis except that p(C3|A) is bigger than p(C3), and in typical scenarios "almost as big" as p(C2), so your hypothesis is true in even more scenarios with the correct equation.

Also I tweaked the simulation to be a little bit easier to use and print more info. New version here.
posted by philipy at 8:02 AM on April 17, 2012


This thread is closed to new comments.