How to calculate probability three things will be on same day of week?
August 20, 2018 10:36 PM

If I know Event 1 will happen three times in a week, Event 2 will happen four times, and Event 3 twice, how can I tell how likely they will be to happen on the same day (assuming they're random)?

I am trying to figure out how often I'm doing laundry for multiple apartments. When multiple "dirty laundry days" happen together, that should only count as doing laundry once.

There must be a formula for this, but I am too far away from probability to even know where to begin.

Thank you math people.
posted by tummy_rub to Education (12 answers total) 2 users marked this as a favorite

If they're truly random:
On each day, the probability of Event 1 happening is 3/7. Event 2 is 4/7. Event 3 is 2/7.

The probability that all three events happen on a given day is 3/7 * 4/7 * 2/7 = 24/343, or about 0.07. So there's a 7% chance of this on any particular day.

Some more:

Event 1 and Event 2 happening, but not Event 3: (3/7 * 4/7 * 5/7) = 0.175, or 17.5%
Event 1 and Event 3 happening, but not Event 2: (3/7 * 3/7 * 2/7) = 0.052, or 5.2%
Event 2 and Event 3 happening, but not Event 1: (4/7 * 4/7 * 2/7) = 0.093, or 9.3%
posted by suedehead at 10:49 PM on August 20, 2018 [2 favorites]

3/7*4/7*2/7

= 24/343

=7%
posted by mikek at 10:49 PM on August 20, 2018

Let A = event 1, B = event 2, C = event 3.

The probability of 0 events happening on a particular day is 60/343, since we want (NOT A) and (NOT B) and (NOT C). When we have AND we want to multiply the probabilities: (4/7) * (3/7) * (5/7) = 60/343.

The probability of exactly 1 event happening on a particular day is (A and (not B and not C)) OR (B and (not A and not C)) OR (C and (not A and not B)). When we have OR we want to add the probabilities, 45/343 + 80/343 + 24/343 = 149/343.

The probability of exactly 2 events happening on one day is ((A and B) and not C) OR ((A and C) and not B) OR ((B and C) and not A). With the same type of calculation this is 110/343.

The probability of exactly 3 events happening on one day is (A and B and C) or simply (3/7) * (4/7) * (2/7) = 24/343.
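
As a check on the four figures above, here is a minimal Python sketch (not part of the original answer) that enumerates all eight happens/doesn't-happen combinations for a single day. It assumes the three events are independent with per-day probabilities 3/7, 4/7 and 2/7; the names `P` and `count_distribution` are made up for illustration.

```python
from fractions import Fraction
from itertools import product

# Per-day probabilities from the thread; independence is an assumption.
P = {"A": Fraction(3, 7), "B": Fraction(4, 7), "C": Fraction(2, 7)}

def count_distribution(probs):
    """Return {k: probability that exactly k of the events happen on a given day}."""
    dist = {k: Fraction(0) for k in range(len(probs) + 1)}
    # Each outcome is a tuple of True/False flags, one per event.
    for outcome in product([True, False], repeat=len(probs)):
        p = Fraction(1)
        for happens, q in zip(outcome, probs.values()):
            p *= q if happens else 1 - q
        dist[sum(outcome)] += p
    return dist

dist = count_distribution(P)
for k, p in dist.items():
    print(k, p, float(p))
```

Using `Fraction` keeps the results exact, so they can be compared directly with 60/343, 149/343, 110/343 and 24/343, and the four values should sum to 1.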
posted by dilaudid at 10:58 PM on August 20, 2018 [1 favorite]

Thank you for all of these answers. I will take your clear explanations and start plugging in the numbers. Thanks again!
posted by tummy_rub at 11:06 PM on August 20, 2018

"I am trying to figure out how often I'm doing laundry for multiple apartments. When multiple 'dirty laundry days' happen together, that should only count as doing laundry once."

The fast way to deal with problems of this kind is to flip them around.

You want to know how many days per week, on average, you're doing any laundry. This is kind of a pain in the arse to calculate, because it involves adding the probabilities of each of the many possible combinations of laundry activity, and each of those needs a separate sub-calculation.

Flipping it around asks instead how many days per week you're doing no laundry. This is easy to calculate, because doing no laundry on any given day means you're doing no laundry for apartment A and none for apartment B and none for apartment C; and if A, B and C have independent and randomly distributed laundry days you can work that out just by multiplying the associated probabilities.

So if A has 3 laundry days per week on average, on unpredictable days, then the probability of any given day not being an A laundry day is (7 - 3) / 7 = 4/7.

If B has 4 laundry days per week on average, on unpredictable days, then the probability of any given day not being a B laundry day is (7 - 4) / 7 = 3/7.

And if C has 2 laundry days per week on average, on unpredictable days, then the probability of any given day not being a C laundry day is (7 - 2) / 7 = 5/7.

So the probability of any given day not being a laundry day at all is the product of those probabilities: 4/7 × 3/7 × 5/7 = 60/343.

Which means that the probability of any given day being a laundry day for at least one of them is certainty (probability 1) minus that: 1 - 60/343 = 283/343.

Which means that the average number of days on which you should expect to be doing laundry for at least one apartment in any given week is that probability times the number of days in a week: 283/343 × 7 ≅ 5.8.
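
The flip-it-around calculation can be sketched in a few lines of Python. This is only an illustration under the same assumption as above (each apartment's laundry days are independent of the others'); the function name `expected_laundry_days` and its parameters are invented for this sketch.

```python
from fractions import Fraction
from math import prod

def expected_laundry_days(days_per_week, week_length=7):
    """Expected days per week with at least one laundry job.

    days_per_week: laundry days per week for each apartment, e.g. [3, 4, 2].
    Assumes the apartments' schedules are independent of each other.
    """
    # Probability that a given day is a no-laundry day for every apartment.
    p_none = prod(Fraction(week_length - k, week_length) for k in days_per_week)
    # Complement gives "at least one"; scale by the number of days in a week.
    return week_length * (1 - p_none)

result = expected_laundry_days([3, 4, 2])
print(result, float(result))
```

For the numbers in the question this gives 283/49, which is the same 283/343 × 7 worked out above. Changing the list lets you plug in other apartments.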
posted by flabdablet at 2:33 AM on August 21, 2018 [2 favorites]

Although flabdablet comes closest to what you want, I don't feel the probability representation is correct. For example, you didn't say 'B has 4 laundry days per week on average', you said that B has 4 laundry days and they occur on random days of the week. And, presumably two of those laundry days will not occur on the same day.

So with the above calculations you have the small non-zero chance that every day is a laundry day for one customer - not true - or that all laundry days occur on the same day.

At this point, I'd probably just do a Monte Carlo simulation: a small program can generate random schedules, and you can compute the average number of days you are doing laundry. It will never be fewer than 4 days a week, since 'Event 2' happens on that many days, and it can be as high as 7: a whole week of doing laundry.
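
A rough sketch of what such a simulation might look like in Python. It assumes each apartment picks its laundry days uniformly at random, with no repeats within a week and no coordination between apartments; the function names, the seed and the number of simulated weeks are arbitrary choices for illustration.

```python
import random

def simulate_week(days_per_apartment, rng):
    """Pick distinct random days for each apartment; return the set of days with any laundry."""
    busy = set()
    for k in days_per_apartment:
        # sample() draws k different days out of 0..6 for this apartment.
        busy.update(rng.sample(range(7), k))
    return busy

def average_laundry_days(days_per_apartment=(3, 4, 2), weeks=100_000, seed=1):
    """Average number of distinct laundry days per week over many simulated weeks."""
    rng = random.Random(seed)
    total = sum(len(simulate_week(days_per_apartment, rng)) for _ in range(weeks))
    return total / weeks

print(average_laundry_days())
```

The printed average varies a little with the seed and the number of weeks, but it should land between 4 and 7.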
posted by vacapinta at 3:15 AM on August 21, 2018 [3 favorites]

If I understand the question as being how many days you end up doing more than one apartment's laundry, and I assume that you do not do the same apartment's laundry more than once a day, then I get the answer that you will end up doing multiple loads on approximately 39% of days.

Forgive my quick and dirty Python, but this is what it looks like: pastebin code

This is what the output looks like:

```
>>> test()
0.39140714285714284
>>> test()
0.39073285714285716
>>> test()
0.3903871428571429
>>>
```

Note that this is probably not the answer in the real world. The day to do a given apartment's laundry is not randomly chosen. It probably correlates with how long it's been since an apartment's laundry has been done.
posted by rdr at 3:55 AM on August 21, 2018 [1 favorite]

I wanted to see the multiple baskets from the same user on a single day case so...
```
#! perl6
my %y;                      # tally: baskets on a day => how often that happened
for ^100_000 {              # simulate 100,000 weeks
    my %x;                  # this week: day => number of baskets
    #| change 'pick' to 'roll' for multiple baskets per day
    %x{$_}++ for flat (3,4,2).map( { (^7).pick($^c) } );  # play this week
    %y{$_}++ for %x.values;
};
my $t = [+] %y.values;
$_ /= $t for %y.values;  # normalize counts to fractions of the total
say %y;
=finish
# only one basket per user per day
{1 => 0.5261714, 2 => 0.3883893, 3 => 0.0854392}

# any number of baskets (up to their limit) per day
{1 => 0.4991155, 2 => 0.3331747, 3 => 0.1295678, 4 => 0.0318850,
5 => 0.0055773, 6 => 0.0006208, 7 => 0.0000571, 8 => 0.0000019}
```
Played one week at a time... so like you're keeping the baskets and returning them sunday morning or something.

(yes, this was a one-liner at one point...)
posted by zengargoyle at 6:11 AM on August 21, 2018 [1 favorite]

And I forgot to do the days with NO baskets... duh.
`my %x = ^7 X=> 0 xx 7;`

Only one basket per user per day:
{0 => 0.174503, 1 => 0.435271, 2 => 0.3202343, 3 => 0.0699914}

Any number of baskets (up to their limit) per day:
{0 => 0.249703, 1 => 0.3748543, 2 => 0.2494757, 3 => 0.09709, 4 => 0.0242986, 5 => 0.0040586, 6 => 0.0004886, 7 => 0.0000314}
posted by zengargoyle at 6:33 AM on August 21, 2018 [1 favorite]

"If I understand the question as being how many days do you end up doing more than one apartments laundry"

I ruled that reading out on the basis of the qualifier: "When multiple "dirty laundry days" happen together, that should only count as doing laundry once." To me, that makes it clear that what's being asked for is a way to calculate the number of days on which any laundry is being done.

"I don't feel the probability representation is correct. For example, you didn't say 'B has 4 laundry days per week on average', you said that B has 4 laundry days and they occur on random days of the week. And, presumably two of those laundry days will not occur on the same day."

But they might well occur on the same day of the week across multiple weeks, which is really all we're interested in when playing with probability questions of this kind.

It was indeed a lazy assumption on my part that if an apartment has exactly K laundry days per week but they could be any K days, then the probability of any given day being a laundry day for that apartment is K/7. But I think it's justifiable all the same.

To see why, imagine you had indeed written a small program to generate random weekly laundry schedules for apartment B over a huge number of weeks, say a billion. Then, the number of days will be seven billion, the number of times that apartment B got laundry done will be four billion, and the probability that any randomly selected one of those seven billion days will be an apartment B laundry day is four billion in seven billion = 4/7.
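
Rather than simulating a billion weeks, the same fraction can be checked exactly by listing every possible week for apartment B. The following Python sketch is only an illustration and assumes all 35 ways of choosing 4 distinct days out of 7 are equally likely; the variable names are made up.

```python
from fractions import Fraction
from itertools import combinations

# Every way apartment B could choose 4 distinct laundry days out of 7.
schedules = list(combinations(range(7), 4))

# For each day of the week, the fraction of schedules that include it.
per_day = [
    Fraction(sum(day in s for s in schedules), len(schedules))
    for day in range(7)
]

print(len(schedules))
print(per_day)
```

Each day shows up in 20 of the 35 schedules, which reduces to 4/7 for every day of the week.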

To translate that probability back to an expected number of days per week on which apartment B gets laundry done, we just multiply it by the number of days in a week and get 4, which is clearly correct. The fact that we are actually guaranteed to achieve that expected outcome in any given week, because that's the way the problem was originally set up, doesn't make the expectation wrong.

In fact we would not be guaranteed to see apartment B get four days of laundry done in any given week if we just slipped the week boundaries, running the weeks say Wednesday to Tuesday instead of the Sunday to Saturday we used to generate the original schedules. If we did that, we might well see the occasional 7-day period in which it gets laundry done every day, or see it go as many as six days without getting any laundry done at all. But the fact that exactly four days must be laundry days for some choice of week boundaries can't make the expected rate anything other than four days per week, regardless of where the week boundaries fall.

The same reasoning applies to the probabilities that any given day will be a laundry day for apartments A and C.

Now, in order to justify calculating the probability of X, Y and Z occurring simultaneously by multiplying their individual occurrence probabilities, all we need to guarantee is that X, Y and Z are independent of each other. And if the three apartments do indeed generate their laundry schedules at random and don't take each other into account when doing so, then that condition is satisfied.

What you could not safely use these probabilities for is working out, for example, how likely it would be for no apartment to need laundry done on Friday in a week where you already knew what had happened from Sunday to Thursday. Because in that case you're absolutely not dealing with independent events. But in the question as posed, I believe you are.

So I stand by my answer.

zengargoyle, using sparsely commented Perl code to explain anything is just perverse. However, I note with satisfaction that if I'm reading it right, your result gives the chance of a day with no laundry, under the one basket per user per day assumption, as 0.174503; which makes the chance of a day with some laundry 0.825497; which matches my theoretical prediction of 283/343 to three significant figures.
posted by flabdablet at 7:31 AM on August 21, 2018

I agree with rdr, under the interpretation that we're looking for how many days have laundry in more than one apartment, without having looked at the code. To simulate this we can take three samples of size 4, 3, and 2 from the set of seven days, and then see how many days occur more than once when we put those together. A quick simulation in R:

```
f = function(x){length(x[x>=2])}
set.seed(1)
mean(replicate(100000, f(table(c(sample(7, 4), sample(7, 3), sample(7, 2))))))
```

returns 2.73734 multiple-laundry days per week, or a probability of 0.391 of doing multiple laundries on any given day.

If we want to know how many days are laundry days in a given week, we can simulate that too:

```
set.seed(1)
mean(replicate(100000, length(unique(c(sample(7, 4), sample(7, 3), sample(7, 2))))))
```

returns an average of 5.77551 laundry days per week.

But all this is assuming that the days get picked uniformly at random, and they really don't. The "right answer" depends on how the laundry days are actually getting generated. Presumably you're looking for a schedule where the laundry days are "evenly" spread through the week and you have to do double-laundry only on two days. Something like:
A: Monday, Wednesday, Friday
B: Tuesday, Thursday, Saturday, Sunday
C: (any two days)
would work - A and B are on opposite days and the days with C are the double-laundry days.

(Note to nerds: there are actually fewer than 100,000 different results of the sampling part, in fact choose(7, 4) * choose(7, 3) * choose(7, 2) = 25,725. It wouldn't be difficult to enumerate them explicitly, but I have already wasted enough of my employer's time.)
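
For the curious, the explicit enumeration might look something like this in Python. It is a sketch rather than anything from the answer above, and it assumes every combination of schedules is equally likely; the variable names are invented for illustration.

```python
from collections import Counter
from fractions import Fraction
from itertools import combinations, product

days = range(7)
total_weeks = 0
laundry_days = 0   # days with at least one apartment's laundry
multi_days = 0     # days with two or more apartments' laundry

# Every combination of a 3-day, a 4-day and a 2-day schedule.
for a, b, c in product(combinations(days, 3),
                       combinations(days, 4),
                       combinations(days, 2)):
    counts = Counter(a + b + c)   # day -> number of apartments that day
    total_weeks += 1
    laundry_days += len(counts)
    multi_days += sum(1 for n in counts.values() if n >= 2)

print(total_weeks)
print(Fraction(laundry_days, total_weeks))
print(Fraction(multi_days, total_weeks))
```

The two averages come out as exact fractions, 283/49 laundry days and 134/49 multiple-laundry days per week, which are close to the simulated 5.77551 and 2.73734.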
posted by madcaptenor at 8:19 AM on August 21, 2018

flabdablet, in the 'pick' case it's pulling 0-6 from a bag, all lottery style (N per week, random different days). In the 'roll' case it's, well, rolling a seven-sided die (N per week, random days, may be the same). The rest is just counting and presentation.
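
For readers who don't speak Perl 6, the pick/roll distinction can be sketched in Python, where `random.sample` draws without repeats and `random.choices` draws with repeats. This is an illustrative translation only, not the original code; the seed and variable names are arbitrary.

```python
import random

rng = random.Random(0)   # arbitrary seed, just to make the draw repeatable

# 'pick': 4 different days out of 0..6, like pulling numbers from a bag.
pick = rng.sample(range(7), 4)

# 'roll': 4 rolls of a seven-sided die, so the same day can come up again.
roll = rng.choices(range(7), k=4)

print(pick, roll)

# Chance that a given day has no baskets at all under each model.
p_none_pick = (4 / 7) * (3 / 7) * (5 / 7)   # each apartment misses the day
p_none_roll = (6 / 7) ** (3 + 4 + 2)        # all nine rolls miss the day

print(round(p_none_pick, 4), round(p_none_roll, 4))
```

The two no-basket probabilities work out to roughly 0.175 and 0.250, in line with the simulated 0.174503 and 0.249703 reported earlier in the thread.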

Average... long term... should be the same within simulation vs calculation.... hopefully. :P

I went more strictly enforcing the average along the lines of bitrate buckets and tokens. But was way too lazy to do some full-blown dining philosophers simulation because I'd have to look up maths to figure out how to randomly pick while ensuring eventual meeting of the stated average.

Plus I went all "it's a hotel". The 2 are sheets and towels, the 3 are those who want laundry service, the 4 are the ones who want 2 laundry services and it's a long enough stay for the weekly average to be a maximum boundary condition.

lol, the OP will have to be as specific as some of the answers to get anything much better.... and they all should converge close enough for all practical purposes for making a good guess.
posted by zengargoyle at 9:49 AM on August 21, 2018
