Math/Stats: Most common middleing factor
August 4, 2019 2:05 AM   Subscribe

Mathematics / Statistics Filter: I have a set of numbers, but I don't necessarily want the lowest common factor or greatest common denominator. I want a number somewhere in between.

My company sells lots of small parts in differing quantities. Short of sales settling on a minimum quantity, i am trying to determine a most common factor that I should store the material in to facilitate picking. My Excel skills are failing me. This post is on the right track though.

Here are a few examples:
Part 1 - 8, 8, 8, 20, 40, 40
Part 2 - 35, 75, 100
Part 3 - 20, 32, 56, 60

I excluded smaller quantities under 8 because they can just be pulled from a grab bag of loose stuff. Eyeballing it, the factors I would do are probably 20, 25 and 10, respectully The problem is that I have thousands of these examples and my eyes start to cross.

Guidance is appreciated.
posted by BobtheThief to Technology (19 answers total) 6 users marked this as a favorite
Offhand I'd do it in this series (assuming that you're willing to count 10 items manually):
10, 20, 40, 80
Repeat X10:
100, 200, 400, 800
and so on..
posted by Thug at 3:14 AM on August 4, 2019 [1 favorite]

Let me see if I'm understanding your desire correctly. You want to store these items in boxes so that for some large fraction of these orders, you don't have to split a box. However, to ensure that you aren't storing something like two parts per box, you're OK with the items in each box not being an exact divisor of *all* of the order sizes, just so long as it divides a good fraction (say, half) of them. Is that basically right?

I'm not immediately sure how to program this in Excel, at least not in an elegant way. But I figured I'd see if I was thinking about it the right way before I started mucking around with spreadsheets.
posted by Johnny Assay at 6:01 AM on August 4, 2019

I would probably just simulate it. For each item type, try packaging amounts ranging from the greatest common divisor to the max order size. Count how many orders have nonzero remainders, or remainders greater than [some amount you’re willing to grab from the grab bag]. My guess is that this is roughly what you’re implicitly doing mentally.
posted by supercres at 6:16 AM on August 4, 2019

The mode is the single most common quantity - is that what you mean? You could find the mode, either for all parts or for each part and then decide on an increment of it for storage.
posted by Kriesa at 6:17 AM on August 4, 2019

Here's a very kludgey way of doing the calculation I mentioned in Google Sheets; something similar could be done in Excel. The number you'd put in each box is the largest row marked "Box" in the right-hand column. However, you'd need a separate sheet for every type of part you sell; I don't know if there's a way to do this more compactly.
posted by Johnny Assay at 6:24 AM on August 4, 2019

Just to be sure I understand your example, I would guess you would an algorithm that would return either 8 or 10 for part 1, 25 for part 2, and 20 for part 3.

Agreed that this is tricky in Excel. Is it feasible to use the Visible Basic facility?
posted by SemiSalt at 6:28 AM on August 4, 2019

I think what you want actually is greatest common factor. For example#1 that would be sets of 4. For #2, it would be sets of 5 though you could speed things up with sets of 10 as well.

I don’t know anything about the mechanics of programming but it seems like you should have some sort of test where if a certain number of the orders are above a certain multiple of the base number, you add a second base number as well. So in example 2, since 100 is 20x5 it makes sense to add 10’s.
posted by mai at 7:35 AM on August 4, 2019

The more I think about it, the more it seems like the objective needs to be defined— what you’re optimizing for.

In my mind, a reasonable goal would be to minimize the total number of “lots” in each order. Like, for an order of 20, if your pack size was 8, you’d need 2 packs (=16) and 4 singles to get you to a total of 20 with 6 lots. By that measure, a pack size of 2 would be worse (10 packs of 2) but a pack size of 6 would be better (3 packs plus two singles for 5 total lots). You’d average that across all the historical orders for that item.

Something like Johnny Assay’s spreadsheet would work with a slightly different scoring function in each cell. But, as they mention, you’d need to duplicate that sheet for each item type, or do it in a scripting language over your original spreadsheet.
posted by supercres at 7:56 AM on August 4, 2019 [2 favorites]

Agree with supercres; this is hard not because it's hard, but because you don't have a clear idea of what it is you want. If you can define precisely what you want, then I suspect this will become trivial.
posted by dmd at 8:35 AM on August 4, 2019 [4 favorites]

This seems like a genuinely hard problem.

As a non-mathematician, my temptation would be to brute-force the solution with simulations. If you've got significant amounts of real data for the number of each item ordered, you could say, "let's group items into three or four groups of specific quantities (presumably including singles) and see which groups would include the fewest number of picks if you used them to fullfull the actual orders." You'll have to specify the maximum number of different groups and think carefully about how significant the results are. Making all possible sets of all possible numbers and counting the total number of picks required to fill every order sounds easier than doing this analytically. (Doing this in Excel sounds like a nightmare, at least for me. Spending 10 hours learning python sounds a lot easier than trying to do this in a spreadsheet.)

Figuring out the ideal distribution of sets for an individual order is not entirely simple either. Calculating the number of picks for every possible combination of groups, discarding the ones that don't add up, and choosing the one with the fewest picks is a the dumb and easy way to do it. It's wasteful, but probably faster than more sophisticated solutions.

If ordering electronic parts as a consumer is any indication, setting price breaks at 10, 25, 100 will probably dramatically influence your orders and packaging choices. As a customer, I'm used to be told these things, rather than being asked about them.
posted by eotvos at 11:33 AM on August 4, 2019

Thanks for all the good answers. To clarify, I have a variable quantity order demand. Our receiving department gets a lot of 100. I'm looking for quantity that I should break down that 100 into so the picking is easy for big orders and also doesn't cause a headache come cycle count time. I could do that 100 in bags of 4, but that would be annoying for all parties.
posted by BobtheThief at 4:11 PM on August 4, 2019

If stuff comes in as lots of 100, that simplifies a little since you want a number that divides 100. That would be 2, 4, 5, 10, 20, 25, or 50.
posted by SemiSalt at 4:27 PM on August 4, 2019 [1 favorite]

This is kind of the same problem that denominations of money is designed to solve: what values of coins/bills should you make so that transactions don't require massive numbers of them? So I think it wouldn't be terrible if you used those, or at least looked at them for inspiration. If you had a part that was commonly ordered in multiples of 5 (example 2), then lots of 5, and lots of 20 or 25 would be good. Examples 1 and 3 are all multiples of 4, so you should do those with lots of 4 and 20.

I think ultimately you might have to look at each part and see if multiples of a specific factor are most common. You're right that you don't want the greatest common divisor exactly, as a single order of 7 units among hundreds of orders that are all multiples of 5 gives you a GCD of 1. You're sort of looking for a number which is a GCD of many of your order sizes, but not all of them. That could be done, but it's probably more complex than a simple Excel function. You'd have to look at each possible divisor and ask, are a disproportionately high number of orders of this part for multiples of 4? Of 5? Of 7?
posted by DevilsAdvocate at 5:42 PM on August 4, 2019

Half in jest, but if you had the space and organization and let the computer do the work and this is outlandish...

Goldbach's conjecture

Every even number is the sum of two odd primes. An odd number you'd just add 1 extra piece. And it only holds for >4. So, if you had a bag of extras for the 4s and the odds, you could make bags of prime numbers below 100 or whatnot and almost never have to pack more than two things and maybe an extra bit.

This is sorta like the 'this is what change does' in that you partition your things not for optimal optimal, but just a sort of max-bad case, 50 + 25 + 10 + 10 + 4 = 99 sort of way.
posted by zengargoyle at 7:15 PM on August 4, 2019

Doing this either way, you'd do something like pick your change denominations and run them against historic sales to total up the number of each denomination that you used to get a fraction for that particular item.

The only other better way I can think of would be a Monte Carlo method simulation where you just throw random denominations at a historic case and pick the best one at the end. Pick five random pack sizes, fill 10,000 orders, count number of boxes used, repeat a million times, pick the one with the lowest count.
posted by zengargoyle at 7:32 PM on August 4, 2019

This isn't totally well thought through, but what I might try is taking your list of order quantities, and using the modulo operator to see what combination of block sizes works best.

So, you say you suspect lots of 25, 20, and 10 will be good. So, if you have a list of order sizes, just do:

(($ORDERSIZE % 25) % 20) % 10

Then, sum up the result of that operation and that tells you the remainder of singletons you'd need to add to the order. Loop over a range of values where I put 25, 20, and 10 above and pick the one that has the smallest overall remainder. I think this will minimize singletons, but variations on this approach could also minimize other things (e.g. only having two bundles rather than three). I would also only search the space for things that humans can easily add in their heads -- no one ever remembers that 51 is 3 x 17.
posted by Betelgeuse at 8:37 PM on August 4, 2019

Depending on your product and processes, another consideration is to select your quantities to make each one obviously different than the next smaller/larger size. Even if 20 and 25 are both useful, you may be better served having only one or the other, to reduce the chance a 25-pack gets picked instead of a 20-pack. If your shipping system can detect this sort of error, than this is less of a concern.
posted by yuwtze at 7:20 AM on August 5, 2019

By picking, do you mean a guy in a warehouse is grabbing some quantity of parts and boxing them up?

What are the most important constraints? Do you want a small number of denominations to minimize mistakes? Speed of execution? (If the sizes were all prime numbers it would be confusing to add.) If the cost of each piece is marginal, you might not care if the quantity is rounded up to the next five or something.
posted by tracer at 8:07 AM on August 5, 2019

I'm really curious and want numbers to play with. It's an interesting problem that is suitable for playing with the data and seeing what falls out. The problem is ripe for brute-force and/or simulation and/or human factor guesswork and/or pruning of solutions.

Pick a Part. Generate a historic list of order sizes for that part, shuffle that list to hide any trends. Provide that list of order numbers and the receive-size for that part. Provide a range of number of partitions that are suitable (like no more than 5 different packing sizes).

I want to brute-force and simulate that and see what pops up as optimal and put some logic in there to make it human-easy. Or just satisfy curiosity. The same for a range of multiple parts would be handy. Can you figure out whether they're sold in 5s or 12s. It would be interesting to see if some code could pick out a few categories of solutions that are human-friendly and nearly optimal.
posted by zengargoyle at 1:13 PM on August 5, 2019

« Older Your favourite new music   |   Frontend programming project/lesson for backend... Newer »

You are not logged in, either login or create an account to post comments