Returning a needle to the haystack
June 19, 2024 5:39 PM

I'm looking for a fresh take on a problem I've been trying to solve for a while. I have a) 150 jigsaw puzzles of all sizes and piece counts and b) a collection of 30 stray pieces. I would like to return each stray piece to whichever of the 150 puzzles was its original home.

I'm only trying to solve the problem for standard rectangular puzzles.

I have obtained passable image files of the art for most of the puzzles. I also have the puzzle dimensions and piece counts.

My current approach is to take a picture of a piece and do histogram matching across the whole collection of puzzle images. This involves a lot of finicky stuff in determining the exact grid of each puzzle so I can roll through them piece by piece. It's not going well.
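For anyone unfamiliar with the technique, here is a minimal sketch of comparing color histograms. It is not the poster's actual code: the function names, the bin count, and the assumption that pixels are already loaded as (r, g, b) tuples are all illustrative.

```python
from collections import Counter

def color_histogram(pixels, bins=4):
    """Normalized histogram of (r, g, b) pixels, each channel quantized into `bins` levels."""
    step = 256 // bins
    counts = Counter(
        tuple(min(channel // step, bins - 1) for channel in pixel) for pixel in pixels
    )
    total = sum(counts.values())
    return {key: n / total for key, n in counts.items()}

def histogram_intersection(h1, h2):
    """Overlap between two normalized histograms: 1.0 is identical, 0.0 is disjoint."""
    return sum(min(value, h2.get(key, 0.0)) for key, value in h1.items())
```

A stray piece would be scored against every grid cell of every puzzle image, which is exactly where the finicky grid detection comes in.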

Anyone have any clever (or better yet blindingly obvious) ideas?
posted by Tell Me No Lies to Computers & Internet (28 answers total) 11 users marked this as a favorite
 
Step 1: count how many pieces you have for each puzzle. With only thirty stray pieces, you will immediately eliminate at least 120 puzzles (unless you think some pieces have also been lost entirely, but even then you will drastically reduce the number of puzzles you need to go through).
posted by metahawk at 5:47 PM on June 19 [8 favorites]


Find a stay-at-home person who likes puzzles and ship them a puzzle plus the whole bag of spares. I’m actually not kidding. I have a friend who’s basically been locked in since Covid started bc she has a kidney transplant that’s already failing. She does beaucoups puzzles and frequently finishes one with a piece or two missing. I think she gets them used or in swaps; they’re new to her.

Anyway, throw in a prize for the puzzler and they’ll probably give it back to you with all the pieces. I know my friend just passes on all the puzzles she completes.
posted by toodleydoodley at 5:49 PM on June 19 [6 favorites]


you're comparing color histograms? puzzles also have other qualities: dimensionality, texture &c
posted by HearHere at 6:15 PM on June 19


I would take photographs of the cardboard of the reverse sides of a few of the 150 puzzles at as high a magnification as I could get, maybe using one of those clip on lenses for your phone, and if there’s enough variation — as I suspect there might be — I’d photograph them all, photograph all the strays as well, and narrow down the candidates for each stray.

Then I’d try matching by color, etc.
posted by jamjam at 6:32 PM on June 19 [7 favorites]


You could also look at the cut edges for variations such as numbers and thickness of layers and narrow the matches that way.
posted by jamjam at 6:39 PM on June 19 [2 favorites]


You can rule some combos out by taking a mean piece weight of some 30-50 pieces of each puzzle.
posted by SaltySalticid at 6:41 PM on June 19 [2 favorites]


Count pieces for each puzzle, and set aside the puzzles that have the correct count.
Set aside the puzzles that have too many pieces.
In good light:
Lay out the stray pieces.
For each of the remaining puzzles, lay them out with all pieces visible, and just eyeball the heck out of the stray pieces for clues, and use more than just the color of the piece: size, condition of the piece, backing tint, moist odor, tongue adhesion, etc, all will matter. Use this examination to select the most obvious candidate for matches between stray pieces and puzzles with missing pieces.

For puzzles with good matches, attempt to assemble puzzle around the stray piece. Any success here is a win. Puzzles that will not assemble around your stray pieces become puzzles with no matches.

I think doing much better with the remaining strays starts to require full assembly of puzzles with missing or extra pieces, and if you still have strays after this, then you have some more complex permutations on your hands, like swapped or cycled pieces between puzzles that require assembly of more puzzles.
posted by the Real Dan at 6:47 PM on June 19


Response by poster: I don’t want to interfere with brainstorming but I wanted to comment that these puzzles average about 1000 pieces each. Counting 150,000 pieces would be a last resort for me.
posted by Tell Me No Lies at 6:51 PM on June 19 [3 favorites]


I believe many 1000 piece puzzles have more than 1000 pieces, so anything trying to work with piece count is likely to miss.

I like the idea of image processing, but it's been years since I used such libraries. If you are using box pictures, you also run the risk that a missing piece sits under some of the marketing printed on the box.
posted by advicepig at 6:54 PM on June 19 [4 favorites]




I'd start by sorting the 30 stray pieces both by the picture on the front so that all pattern types were together and all common colours were together, but also by the backs, as some may have different hues than other ones do. I'd lay them all out in a tray as some people do when they want to be able to see each piece while they work on putting the puzzle together. I'd give them each a code name such as Ps3 (pink, small, third stray piece that is small and pink)

And then I would open each puzzle box and glance at the pieces inside to see if any of the pieces on my tray could be part of the puzzle in the box. Depending on how much variety you like in the puzzles you buy, you should be able to eliminate a lot of the puzzles as definitely not being the source of your missing pieces. I'd put all the boxes that I'd eliminated away and I'd label each box that was a possible or probable match with a post it note on which I'd note the relevant codes for the matching pieces.

Now if all the puzzles come from the same 1970's generation of tourist scenic images, cut by the same die and made by the same manufacturer, this is not going to work. Or if you have 150 Thomas Kinkade the Painter of Light jigsaw puzzles. You'll get too many matches. But if your images and manufacturer and date of manufacture are varied enough you'll be able to narrow down each piece to probably only being from one, or from a small handful of boxes.

After that I am afraid you're going to have to compare the stray pieces with the contents of the boxes really closely, and then do the puzzles that are the most probable matches to try to prove it.

Whenever I do a puzzle that has pieces missing when I have finished it, I trace the missing pieces on the bottom of the box. That's something we used to do when I was a kid. It made matching missing pieces easier. "Three pieces missing, September 8, 1971" on the bottom of the box meant that when a stray piece turned up, the box with the note would be the suspected source - and it also meant that if three pieces were missing when we did the puzzle again on April 15, 1974, we wouldn't be crawling around under the furniture on hands and knees looking for them. Tracing the hole in the puzzle gave the shape and size, so we knew that this piece with two tabs and two sockets couldn't possibly be one of the ones missing if they all had one tab and three sockets. We could match size and shape without even opening the box.

I am going to guess that if you do jigsaw puzzles you have trained yourself to be pretty good at comparing shapes and fragments of images to find matches, so you'll find this more efficient than getting a scale and weighing pieces, or taking photographs and comparing those.
posted by Jane the Brown at 7:06 PM on June 19 [13 favorites]


I would just eyeball this. The idea of creating and running a program seems like another, much worse, type of jigsaw puzzle!

I would stack all the puzzles so the side image shows, and then take about 5 random pieces.

Look at each piece and try to figure out which box it’s from based only on colour, and make a little stack of the likely boxes for each lost piece.

Then open those likely boxes and see if the piece actually matches, looking at the back colour, puzzle cutout shapes, etc.

When you think you have the right box, number a post it note, take a good photo of the piece, front and back, with the post it showing, put the odd piece and post it note in a ziploc so you can find it again in the box, photograph which puzzle you threw it into, close it up and move on.

If you can’t find the right box, keep the puzzle piece in its numbered baggie and just put it aside.

Put all the random piece photos into an album so you now have a document showing what the random pieces look like and what box they now live in. That way if you need to find them again, you can! The ones that you couldn’t re-home, just keep separately.

Now by doing this you might make a mistake! But you might make a mistake any other way as well and this is way easier!!
posted by nouvelle-personne at 7:24 PM on June 19


Instead of rolling through the known puzzles piece by piece, could you mock up "puzzles" that are just each missing piece tiled across 83x12 or whatever and have a program look for near-exact matches in any appropriately-sized area? (I am hoping that all 150 don't have wholly unique grids...)
posted by teremala at 7:50 PM on June 19


If the puzzles are from different manufacturers and time periods and have different amounts of use, I would first sort the 30 pieces to cluster together any that are similar in texture, thickness, backing, edge beveling, and print quality, and begin to determine categories for the larger group of puzzles. Then, I’d look at the larger group of puzzles to sort which ones align with the qualities of the spare pieces I just analyzed, creating smaller groups. That should eliminate a large chunk of them, and some might only have one match.

Then I would look at the images of the puzzles in each small group and see if the spare pieces were obviously not from one or more of them. That could be tricky but it could also find obvious things you missed thus far, like if there is any visible writing on the spare pieces, or a clear chunk of an object, or an eye or whole bird in a blue sky or whatever.

Then I would look at each group’s puzzle piece shapes. A fair few manufacturers use very similar piece patterns for different puzzles, with kind of distinctive details, like weird knobs or bendy curves or deceptively not-edge-piece pieces. If any of the spare pieces have these shape qualities I would match those up. And then I would sift through the puzzles that those match with to try and find pieces that have similar colors/patterns/details as the spare piece to narrow it down even further.

At this point I feel like I would only have a handful of spare pieces with typical shapes and a few different puzzles with matching backings, print qualities, and amounts of wear for each piece. That’s when I would recruit others to build the puzzles with me, having the spare pieces shared among us. Once a spare piece is found to fit, the remaining possible puzzles for that piece can be assumed to have all their pieces.

Now, if these are all puzzles from the same manufacturer and the same time and they’ve each only been built once, a lot of this process won’t help narrow it down at all. But if they are gathered donations or something I bet you can get it narrowed down way more than you think you can just by closely looking at each spare and checking the puzzles for detail matches.
posted by Mizu at 10:44 PM on June 19 [1 favorite]


Strategies:

1. defer. if the future value of these puzzles is based on completing each puzzle, one at a time: avoid solving it as one big problem. instead, address it incrementally each time you do a puzzle. consider your collection of 30 stray pieces when solving that one puzzle. if you figure out where any strays go, or determine that the puzzle does not involve any of the strays, then you have incrementally simplified the problem into a slightly simpler one. over time the problem gets easier and easier. if the full problem never gets solved because no one ever does all of the jigsaw puzzles, is anything truly lost?

2. decouple. if the future value of these puzzles is based on completing multiple puzzles at the same time, it's hard to use the above "defer" strategy due to contention over access to the single collection of strays. simplify this by making 149 copies of the collection of 30 strays, so each of your 150 jigsaw puzzles has its own accompanying collection of 30 strays. it is now possible to complete each puzzle without any missing pieces.

3. the competition. plan, advertise and host a puzzle solving competition. rent a big hall for a day, with a bench for each puzzle. offer a small token prize for each solved puzzle and larger prizes for identification of a missing piece that can be filled by one of the strays.

4. the reformulation. get a heavy-duty blender. blend the strays. blend all the other pieces from the 150 jigsaw puzzles as well. mix well inside a drum until combined into homogeneous puzzle-slurry. mix puzzle-slurry with adhesive. spread the adhesive-puzzle-slurry over a large thick cardboard sheet. let sit until dry. seal the surface. cut result into puzzle-shaped pieces. declare the result to be a single puzzle, complete with all pieces. (with apologies to Perec)
posted by are-coral-made at 1:55 AM on June 20 [19 favorites]


I would start by checking the reverse of the missing pieces and comparing to the puzzles - manufacturers use different colours and grade of backing paper and this should allow you to at least narrow down which are a possible match.
posted by london explorer girl at 3:12 AM on June 20


advicepig: I believe many 1000 piece puzzles have more than 1000 pieces,
Matt Parker has things to say about jigsaw piece count not matching the number on the box.
posted by BobTheScientist at 5:10 AM on June 20 [2 favorites]


Buy digital calipers. Measure the thickness of each of the 30. Measure the puzzles (probably take 3 pieces at random and average them).
This should allow you to reduce the problem's difficulty a bit.
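A rough sketch of how the caliper readings could be used to shortlist puzzles. The puzzle names, the measurements, and the 0.05 mm tolerance below are made up for illustration; a sensible tolerance depends on your calipers and on how worn the pieces are.

```python
from statistics import mean

def candidate_puzzles(stray_mm, puzzle_samples_mm, tolerance_mm=0.05):
    """Names of puzzles whose mean sampled thickness is within tolerance of the stray."""
    return sorted(
        name
        for name, samples in puzzle_samples_mm.items()
        if abs(mean(samples) - stray_mm) <= tolerance_mm
    )

# Hypothetical readings: three random pieces measured per puzzle, in millimetres.
samples = {
    "lighthouse": [1.90, 1.92, 1.88],
    "kittens": [2.10, 2.12, 2.08],
    "harbor": [1.93, 1.95, 1.94],
}
shortlist = candidate_puzzles(1.91, samples)  # "kittens" is ruled out
```

This only narrows the field; puzzles from the same manufacturer will likely survive the filter together.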
posted by signal at 6:47 AM on June 20 [2 favorites]


Emptying the puzzle pieces from a box into an accurate scale and weighing it will let you eliminate a number of puzzles, if you assume that every puzzle of dimensions X and Y from manufacturer Z weighs the same.
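As a sketch of the arithmetic, assuming you have the weight of a same-spec puzzle known to be complete (the function name and the gram figures are hypothetical, and the scale needs to resolve a fraction of a single piece's weight for this to mean anything):

```python
def estimated_missing(box_weight_g, complete_weight_g, piece_count):
    """Estimate how many pieces are missing from the weight shortfall."""
    mean_piece_g = complete_weight_g / piece_count
    return round((complete_weight_g - box_weight_g) / mean_piece_g)

# Hypothetical: a complete 1000-piece puzzle weighs 650.0 g, this box weighs 648.7 g.
shortfall = estimated_missing(648.7, 650.0, 1000)
```

A result of zero means the box is probably complete and can be set aside; anything else marks it as a candidate home for a stray.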
posted by Hogshead at 8:26 AM on June 20


Honestly I don't think this is solvable in a reasonable amount of time. Like you would be ahead by huge amounts of time and/or money by just burning all of the existing puzzles for heat (or donating them to a thrift store as "possibly missing a piece or two", or some other similar option) and just buying all new ones.

And it depends on your objective for future use of the puzzle collection.

What I would be inclined to do, combining several of the ideas above, is tracing or scanning all the pieces - ideally on just one or a couple sheets of paper. So you have exact shape, size, and color/design of ALL the missing pieces on just one or a few sheets of paper. Number the pieces with matching numbers on the sheet and on the back of the piece.

For bonus points (see #3 below) you can actually make a good quality color copy of each of the random bag of pieces & include the full set of copies with each puzzle. For example, arrange all the missing pieces on a good quality color scanner and scan. Then color print the resulting page(s).

Then, if you want or need to, you duplicate that paper & put a copy of the paper(s) into each of your existing puzzle boxes. Which gives you two possible solutions to the problem:

#1. If you give away or sell the puzzles, add your name & contact info to the paper & offer to send them the missing piece if they contact you with the required number(s).

This way all pieces are eventually reunited with the puzzle - but with no extra work or effort required, just through the natural work people are going to do anyway in assembling the puzzles.

#2. If you're going to keep the puzzles yourself you can do the same but you'll only need one master copy of the missing pieces. (Or I guess you can just keep the missing pieces separate and use them as an "extra resource" whenever you're assembling one of the puzzles.) If doing this, you can also mark puzzles as "complete" in the case you complete them and nothing is missing. This will gradually narrow down which puzzles are actually missing pieces.

#3. If all else fails - say, you're not available to send them the missing piece any longer, or they can't reach you - the person can use the nice color copy you have made of the whole set of random pieces to make a replacement for the missing piece. Just cut out the color copy of the missing piece carefully, mount it on a bit of cardboard, etc, and put it in place.

This is clever in the sense that you can put a copy of ALL the missing pieces with EVERY puzzle to which they might belong - through the magic of color scanning & printing. The user can thus complete every puzzle in reasonable fashion - yet you have never had to go through the horrendous manual labor needed to actually figure out exactly which missing piece goes with exactly which puzzle.
posted by flug at 2:35 PM on June 20 [2 favorites]


I gather from the fact that you are doing histogram matching that you're willing to approach this as a computational problem. If coding is not an option, please disregard the rest of my answer.

Try using the cross-correlation between the image of the puzzle piece and the image of the completed puzzle. If the piece is in the puzzle, the cross-correlation function will have a peak; if not, it will not have much of a peak. Here is an example from the Matlab documentation of how it is done there; similar solutions exist in free numerical computing systems like Python+SciPy. This may only work well if you can match the actual resolution between the puzzle piece and the completed puzzle (i.e., so a pixel in both images corresponds to the same physical size). Also I think you will need to have the puzzle piece oriented correctly, which may just be a matter of trying all four orientations relative to the puzzle grid.
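To make that concrete, here is a minimal sketch of normalized cross-correlation in plain Python. It assumes grayscale images stored as 2D lists at matching resolution, and all the function names are illustrative. On real photos you would want a library routine instead, such as `scipy.signal.correlate2d` or OpenCV's `cv2.matchTemplate`, since this loop is far too slow for full-size images.

```python
from math import sqrt

def ncc_at(image, patch, top, left):
    """Normalized cross-correlation between `patch` and the same-sized window of `image`."""
    h, w = len(patch), len(patch[0])
    window = [image[top + r][left + c] for r in range(h) for c in range(w)]
    flat = [v for row in patch for v in row]
    mean_w, mean_p = sum(window) / len(window), sum(flat) / len(flat)
    num = sum((a - mean_w) * (b - mean_p) for a, b in zip(window, flat))
    den = sqrt(
        sum((a - mean_w) ** 2 for a in window) * sum((b - mean_p) ** 2 for b in flat)
    )
    return num / den if den else 0.0  # a featureless window carries no signal

def best_match(image, patch):
    """Slide the patch over the image; return (peak score, top, left)."""
    h, w = len(patch), len(patch[0])
    return max(
        (ncc_at(image, patch, top, left), top, left)
        for top in range(len(image) - h + 1)
        for left in range(len(image[0]) - w + 1)
    )

def rotate90(patch):
    """Rotate a 2D list 90 degrees clockwise."""
    return [list(row) for row in zip(*patch[::-1])]

def best_over_orientations(image, patch):
    """Try all four orientations of the piece and keep the strongest peak."""
    results = []
    for _ in range(4):
        results.append(best_match(image, patch))
        patch = rotate90(patch)
    return max(results)
```

A peak score near 1.0 in one puzzle and low scores everywhere else is the signature of a match. Where to draw the line between "peak" and "no peak" is something you would have to tune on pieces whose home you already know.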

If this technique doesn't work well on its own for you and you are really determined to solve this problem, you could look into "image registration," which describes the general problem of finding a way of transforming (moving, rotating, and possibly stretching or more extreme deformations) one image until it matches another, or part of another. This is done in medical imaging all the time, and your problem would probably not be too hard to solve in an afternoon or so using standard techniques there. If this is a rabbit hole you might enjoy going down, or just feel really committed to, you could try ImageJ. For a very deep rabbit hole that will definitely have rabbits at the bottom (i.e., I am very sure it will work but the journey to get there may not be worth it), you could try ANTs.
posted by biogeo at 9:43 PM on June 20


I should amend the above to say that those techniques will probably work well only if you have puzzle pieces that contain some amount of contrast. That is, if you have pieces that are just "sky" or whatever, without much going on within the piece, they may not work well.
posted by biogeo at 9:49 PM on June 20


+1 to social solving. Especially if you’re more worried about the personal effort vs the overall time it takes to solve it.

Find your local jigsaw community—even better if you can meet local speed puzzlers who a) solve 1000 piece puzzles in under 2 hours and b) love to practice on new puzzles. This will help you rule out puzzles not missing pieces and identify puzzles missing pieces, along with the shape, color, location and back-type of each gap. (Or you can just take the 30 pieces to them at this stage and see which loner fits.) This might not always be enough information to 100% put the right piece back in the right box, but can drastically narrow down how many puzzles you have to solve to confirm final fit.

I’ve done some computational image processing and I’m an introvert, and I’d still rather go to the social effort of finding people/organizing the project than try to work through recalibrating and tweaking a programming script 150 times to “automate” the problem.

Reddit might have more ideas.
posted by itesser at 7:05 AM on June 21 [3 favorites]


Mod note: [Hello, problem-solvers! Just a note to say this post has been sorted into the sidebar and Best Of blog boxes!]
posted by taz (staff) at 1:41 AM on June 22


IIRC, the way factories do quality control on this stuff is weight. But the above mentioned Matt Parker video notes that "1500" pieces is not usually precise. And unfortunately, the first hit for "puzzle database" doesn't really track that either -- most puzzles I see are tagged with the number on the box, and don't have the other useful metadata. So you're unlikely to know the right details to QC the "bad" puzzles.

Which means looking at the puzzles and doing computational wizardry. This sounds pretty close to the DARPA Shredder Challenge, but internet entropy has taken hold and some of the stuff I'm finding is only accessible via the Wayback Machine. There's probably a lot more here if you want to mine that angle, but photographing 150k pieces sounds about as tedious as counting them. Best get a Lego Mindstorms kit and some microcontrollers to continue down that path.

I think your intended challenge is a bit simpler, in that you only have to identify which puzzle a piece belongs to, given a photo of the "solution." What I would do here* is build a database of hashes using locality sensitive hashing. Divide each puzzle into a grid with approximate dimensions that matches the size ratio and piece count, then for each square in that grid, calculate a number of different hashes. There's a lot of artistry in how you design your hashes, but you probably want to go for color ratios, maybe also use HSL, but definitely also corner/edge/middle status. This isn't that different than your current approach, so the main improvements are to a) automate the grid comparison, and b) use more than one heuristic. FWIW, this is also how Shazam works, and it's not hard to see the parallels IMO.
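Here is a minimal sketch of that index, under heavy assumptions: images are 2D lists of (r, g, b) tuples, the grid dimensions are already known, and the only feature is mean cell color hashed with random hyperplanes (one common locality sensitive hashing scheme). All names are illustrative, and a real version would add several more hashes, including the corner/edge/middle status.

```python
import random

def cell_features(image, rows, cols):
    """Mean (r, g, b) of each grid cell; `image` is a 2D list of (r, g, b) tuples."""
    cell_h, cell_w = len(image) // rows, len(image[0]) // cols
    feats = {}
    for gr in range(rows):
        for gc in range(cols):
            px = [
                image[y][x]
                for y in range(gr * cell_h, (gr + 1) * cell_h)
                for x in range(gc * cell_w, (gc + 1) * cell_w)
            ]
            feats[(gr, gc)] = tuple(sum(p[i] for p in px) / len(px) for i in range(3))
    return feats

def make_hasher(n_bits=8, seed=0):
    """Random-hyperplane LSH: nearby feature vectors tend to share sign patterns."""
    rng = random.Random(seed)
    planes = [[rng.gauss(0, 1) for _ in range(3)] for _ in range(n_bits)]

    def hasher(vec):
        centered = [v - 127.5 for v in vec]  # center so planes can split the color cube
        return tuple(
            sum(p * v for p, v in zip(plane, centered)) >= 0 for plane in planes
        )

    return hasher

def build_index(puzzles, hasher):
    """Map hash -> [(puzzle name, grid cell)]; `puzzles` maps name -> (image, rows, cols)."""
    index = {}
    for name, (image, rows, cols) in puzzles.items():
        for cell, feat in cell_features(image, rows, cols).items():
            index.setdefault(hasher(feat), []).append((name, cell))
    return index
```

Looking up a stray is then `index.get(hasher(stray_mean_color), [])`, which returns candidate cells without scanning every puzzle. Fewer bits give bigger buckets and more false candidates; more bits risk missing a true match when the photo's colors drift from the box art.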

*actually what I would do is throw the puzzles out, or donate them to Goodwill to throw them out for me. I doubt Ravensburger gives much thought to throwing defects in the recycle bin.
posted by pwnguin at 7:18 PM on June 22


Response by poster: Note: if anyone is curious about the 1000 piece puzzles not having 1000 pieces, the reason falls right out when you go to create a puzzle.

Standard rectangular puzzles are basically a grid that is X by Y, so the total number of pieces is X times Y. Choosing an X and Y so that they multiply to exactly 1000 is very limiting.
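A quick sketch of that arithmetic, with a hypothetical helper that lists grids whose piece count is near the number on the box and whose shape is near a given width-to-height ratio. The 5% tolerances and the landscape assumption (ratio of at least 1) are arbitrary choices for illustration.

```python
def candidate_grids(target, aspect, aspect_tol=0.05, count_tol=0.05):
    """Grids (cols, rows, pieces) with cols/rows near `aspect` and pieces near `target`."""
    lo, hi = target * (1 - count_tol), target * (1 + count_tol)
    grids = []
    for rows in range(1, int(hi ** 0.5) + 1):  # landscape assumed, so rows <= cols
        for cols in range(rows, int(hi // rows) + 1):
            pieces = cols * rows
            close_shape = abs(cols / rows - aspect) <= aspect_tol * aspect
            if lo <= pieces <= hi and close_shape:
                grids.append((cols, rows, pieces))
    return grids

wide = candidate_grids(1000, 1.6)      # includes 40 x 25 = 1000 exactly
standard = candidate_grids(1000, 1.4)  # no grid here has exactly 1000 pieces
```

At a ratio of 1.6 an exact 1000 is available as 40 by 25, but at 1.4 every candidate misses, so the box number can only be a rounding.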
posted by Tell Me No Lies at 8:35 PM on June 22


+1 to a community event, especially if you have puzzle clubs in your area. Corral your spare pieces in trays on a folding table, maybe sorted by color or size or irregularity, beside snack-sized, zipper plastic bags; display puzzle boxes facing out. Participants can match a piece to its likely puzzle, then bag the piece, pick up the box, and go to another table to start.

Small prizes (for correct matching, of course, but also by speed of completion? solo solvers vs. same, with separate team category? the local club will have ideas) are great. Libraries, schools, community centers, and religious-organization meeting rooms may have the (air-conditioned) space, with tables and chairs, for your contest.
posted by Iris Gambol at 11:02 AM on June 24


Response by poster: Thank you everyone!

Since this was brainstorming I can't really mark any answer as best, but for my particular use having it pointed out that the physical piece has a lot more to it than just the image is a game changer.
posted by Tell Me No Lies at 5:32 PM on June 26

