Algorithm for finding the cheapest widget shopping list?
August 3, 2007 1:05 PM   Subscribe

What's a good algorithm for finding the lowest total price if I want to buy several items from several different sellers, where each seller charges a flat shipping cost?

Here's the situation: suppose there's some particular web site where sellers list different kinds of widgets for sale at a price of their choosing. If you want to buy a widget from a particular seller, you pay whatever the seller is asking plus a fixed shipping cost (say, $1). The shipping cost is charged per order, not per widget, so if you bought two widgets from the same seller you'd pay $1 total for shipping, not $2. However, if you buy the same two widgets, one apiece from two different sellers, you pay $2 in shipping, because you've made two different orders.

Now suppose you want to buy lots of different widgets --- say, 10 each of widget A, widget B, and widget C. You want to buy for the lowest total price. Now, you could just take the 10 cheapest widget A's, widget B's, and widget C's, but because of shipping costs that might not be the lowest total price for the order: for instance, suppose that 10 different sellers each have 1 A widget for sale for 1 cent, 10 other sellers are each selling 1 B widget for 1 cent, and 10 more are selling 1 C widget for 1 cent, and there's also a single seller selling 10 A's, 10 B's, and 10 C's for 10 cents each. If you buy the cheapest widgets, you end up paying 30 cents for the widgets, and $30 for shipping. On the other hand, if you buy all the widgets from the higher-priced seller, you pay $3 for widgets and $1 for shipping, clearly better.

My question is, can anyone think of an algorithm that will (a) run quickly and (b) always produce the best possible shopping list? The "running quickly" bit is important --- for the real-world task I'm interested in, people will want to buy around 50-60 widgets at a time and for each distinct kind of widget there will be at least hundreds of offers, so enumerate-all-possibilities-style solutions will be far too slow to be practical. It seems like there should be an easy linear-time algorithm for doing this, too, but somehow I'm not seeing it.
posted by jacobm to Computers & Internet (5 answers total) 1 user marked this as a favorite
 
isn't this a classic instance of the knapsack problem? Google for that and you'll find lots of solutions.
posted by GuyZero at 1:11 PM on August 3, 2007


Response by poster: GuyZero: I agree that it's related to the knapsack problem, but it seems different because items don't have fixed values --- choosing to buy widget A from seller 1 has an impact on how much it's going to cost me to buy widget B from seller 2.
posted by jacobm at 1:15 PM on August 3, 2007


I'm pretty sure the math linear programming is what you're looking for. The GNU people have an implementation
that should do what you want.
posted by cschneid at 1:59 PM on August 3, 2007


At first glance, it doesn't seem that the function you're trying to optimize is linear, so I don't think you'll have much luck with ILP.

You might be more interested in "linearly constrained optimization," which is like LP, but the objective function can be non-linear. Though, I guess it might be tricky to find an off-the-shelf library that will give you integer solutions.

If it were me, I'd try to make some simplifying assumption that would allow me to shoehorn it into ILP, but at 5pm on a Friday, I have insufficient brains.
posted by Zach! at 5:29 PM on August 3, 2007


Best answer: It's a standard optimization problem, and your variables (how many widgets of each type to buy from each store) are integers. I think the objective function (your cost) can be made linear with the following formulation, so it is solvable by any integer programming solver, including the GLPK linked above. But it is because your variables are integer that I doubt it can be solved in polynomial time, let alone linear.


Minimize:
Sum_{store,type} [NumWidgets_{store,type} * Cost_{store,type}]
+
Sum_{store} [BoughtFrom_{store} * Shipping_{store}]

Subject To:
ForAll{type}: Sum_{store} [NumWidgets_{store,type}] = NumNeeded_{type}
ForAll{store}: Sum_{type} [NumWidgets_{store,type}] <= [big number] * BoughtFrom_{store}
ForAll{store,type}: NumWidgets_{store,type} <= InStock_{store,type}


Sum_{sub1,sub2} means "Sum over all combinations of sub1 and sub2," and the variables are all subscripted over the different stores and widget types: NumWidgets_{store,type} is the number of widgets of type "type" bought from store "store."

So you're minimizing your cost (sum of all widget costs plus sum of all shipping costs) subject to three types of constraints. All variables are non-negative integers. The first set of constraints just says that for each type of widget, you buy as many as you need. The second set of constraints makes BoughtFrom_{store} equal to 1 if you buy any widgets from that store and 0 otherwise. [big number] needs to be larger than the number of widgets available from any one store for it to work. It will likely be more efficient if you can tell a solver that the BoughtFrom_{store} variables are binary. The final constraint just ensures that you don't try to purchase more widgets than a store has (it wasn't clear if you had stock limits or not - if not, remove these constraints).

This is probably not the best formulation of the problem. IP solvers are very sensitive to the formulation used, and different formulations can have very different performance. But this shows one way to do it with an IP solver. If the learning curve isn't too steep, you could try out a few sample problems with GLPK to see how it performs. If you need better performance and you have a budget, a commercial tool like CPLEX will be faster, and a graduate student in Operations Research at a nearby university could probably come up with a better formulation.
posted by whatnotever at 3:45 PM on August 4, 2007


« Older Name That Movie Class!   |   How to store text files on my cellphone? Newer »
This thread is closed to new comments.