# How do I sort folks into groups based on their preferences?

February 9, 2016 2:59 PM Subscribe

What kind of algorithm can I use to create optimal group assignments for people, given each person's ranked preferences and limitations on group size? I'm having a hard time coming up with good search terms for this kind of thing.
If an example helps: I have 115 people and 16 groups. Each person needs to go in a group. Each person will submit a list of their preferred groups in their preferred order. Some groups can hold up to 10 people, while others can hold up to 5. For the purposes of this exercise, let's pretend that no one cares who else is in their group with them.

These are difficult problems to solve with any exactitude. Actually, to be exact, you have to define an order of some sort so you would know if for 2 people, if both in their second choice group is better or worse than one in a fist choce and the other in a third choice.

I would probably make one pass to put people in 1st choice groups, unless the are full. Then a second pass to put people in 2nd choice groups, and so forth. When done with that, I'd try to improve the position of anyone in a group they really don't want.

posted by SemiSalt at 3:33 PM on February 9, 2016 [1 favorite]

I would probably make one pass to put people in 1st choice groups, unless the are full. Then a second pass to put people in 2nd choice groups, and so forth. When done with that, I'd try to improve the position of anyone in a group they really don't want.

posted by SemiSalt at 3:33 PM on February 9, 2016 [1 favorite]

I really like this approach. it is the difference between 'sufficient' and 'optimal'.

posted by j_curiouser at 4:06 PM on February 9, 2016

posted by j_curiouser at 4:06 PM on February 9, 2016

After further reading, this

posted by It's Never Lurgi at 4:47 PM on February 9, 2016

*is*the Hospitals/Residents problem, which is a variation of the stable marriage problem (and seems very similar to SM with indifference).posted by It's Never Lurgi at 4:47 PM on February 9, 2016

It's the residency match program, yes. The bane of every 4th year medical student right about ... now, actually.

posted by Ms Vegetable at 5:28 PM on February 9, 2016

posted by Ms Vegetable at 5:28 PM on February 9, 2016

Best answer: Unlike stable marriage and the match, the groups don't care what members they get, so you can't really go with that. Data nerd married to a current resident.

posted by advicepig at 7:00 PM on February 9, 2016 [1 favorite]

posted by advicepig at 7:00 PM on February 9, 2016 [1 favorite]

Best answer: I've dealt with this for my job (I had roughly 140 people and 9 groups.) Here's how I did it:

If you can get everyone to submit their preferences in an excel sheet or something that can easily be converted to excel then that's ideal. If not, you can always just manually put it in excel. If I didn't want to mess around with software too much, I might just ask everyone to individually fill out their preferences in an excel spreadsheet and submit it, and then copy and paste everyone's preference into one big excel sheet.

Once everything is in excel, use the "sort ascending" function so that you have everything sorted by preferences. Cut and paste all the cells that contain names + preferred group + the ranking (which should be 1) into a new excel spreadsheet. This new sheet should just be first preferences.

Then, sort the "first preference" sheet by group. Personally, I found it easiest to just cut and paste each group into a new sheet so that I could easily see how many people were in each group, but you could also just keep everything in the first preference sheet.

So, at this point, you have to adjust based on how everything worked out. If there isn't a huge disparity (where basically everyone wants one or two particular groups) then you can start assigning people to their second choices based on what groups need to be filled and what groups are too full.

(If you have groups that have been filled exactly with first choice people, then you might find it helpful to delete those people from the original spreadsheet since they're pretty much good as is.)

Go back to your original spreadsheet, and copy and paste the "second choice section" into a new spreadsheet and again, sort by group.

Now, let's say group 2 is four people short, and groups 3 and 4 have too many people who selected it as their first choice. See if anyone from groups 3 and 4 listed group 2 as their second choice. If they did, try to fill up group 2 with these people from groups 3 and 4.

If you still don't have the groups evened out after you do this with second choice, move on to third choice, etc.

This becomes trickier if you have a couple of groups that almost everyone wants as their first choice; in that case, you'll have to go further down the choice list or adjust in some other way. This is also assuming that you are solely interested in assigning people to their preferred groups (as opposed to Residency matches where you have residencies with their preferences and medical students with their own preferences).

This may sound confusing or too involved, but I was able to sort everything out this way in about 1.5 hours. I mean, I'm sure there's some sort of program out there that could do this for you (in fact I did have one such program but it was so glitchy and non intuitive that i did it this way instead.)

posted by litera scripta manet at 9:15 PM on February 9, 2016 [1 favorite]

If you can get everyone to submit their preferences in an excel sheet or something that can easily be converted to excel then that's ideal. If not, you can always just manually put it in excel. If I didn't want to mess around with software too much, I might just ask everyone to individually fill out their preferences in an excel spreadsheet and submit it, and then copy and paste everyone's preference into one big excel sheet.

Once everything is in excel, use the "sort ascending" function so that you have everything sorted by preferences. Cut and paste all the cells that contain names + preferred group + the ranking (which should be 1) into a new excel spreadsheet. This new sheet should just be first preferences.

Then, sort the "first preference" sheet by group. Personally, I found it easiest to just cut and paste each group into a new sheet so that I could easily see how many people were in each group, but you could also just keep everything in the first preference sheet.

So, at this point, you have to adjust based on how everything worked out. If there isn't a huge disparity (where basically everyone wants one or two particular groups) then you can start assigning people to their second choices based on what groups need to be filled and what groups are too full.

(If you have groups that have been filled exactly with first choice people, then you might find it helpful to delete those people from the original spreadsheet since they're pretty much good as is.)

Go back to your original spreadsheet, and copy and paste the "second choice section" into a new spreadsheet and again, sort by group.

Now, let's say group 2 is four people short, and groups 3 and 4 have too many people who selected it as their first choice. See if anyone from groups 3 and 4 listed group 2 as their second choice. If they did, try to fill up group 2 with these people from groups 3 and 4.

If you still don't have the groups evened out after you do this with second choice, move on to third choice, etc.

This becomes trickier if you have a couple of groups that almost everyone wants as their first choice; in that case, you'll have to go further down the choice list or adjust in some other way. This is also assuming that you are solely interested in assigning people to their preferred groups (as opposed to Residency matches where you have residencies with their preferences and medical students with their own preferences).

This may sound confusing or too involved, but I was able to sort everything out this way in about 1.5 hours. I mean, I'm sure there's some sort of program out there that could do this for you (in fact I did have one such program but it was so glitchy and non intuitive that i did it this way instead.)

posted by litera scripta manet at 9:15 PM on February 9, 2016 [1 favorite]

Best answer: You could get what litera scripta manet is describing by formatting the columns as

Result should be a table with how many people want each group. You can double click on a cell in the pivot table to see the people who make up a particular data point.

posted by toomanycurls at 9:56 PM on February 9, 2016 [2 favorites]

*name - pref 1 - pref 2 - pref 3 - pref 4*and fill in the group names according. Do a pivot table with this data, have the group name as a row and pref as a colum. Use group name as the data with it in count mode.Result should be a table with how many people want each group. You can double click on a cell in the pivot table to see the people who make up a particular data point.

posted by toomanycurls at 9:56 PM on February 9, 2016 [2 favorites]

Response by poster: Thanks, everyone! The Gale Shapley algorithm for the Stable Marriage problem is exactly the search term that I needed. It seems adaptable enough to work with the idea that the groups don't care who's in each group, which, as advicepig pointed out, does distinguish this from the NRMP match.

As it so happens, the example I'm working with is actually my 2nd year medical school class - Having submitted our preferences, we're waiting to hear back from admin about our 3rd year clerkship groups... There's a bit of delay, and for funsies, one of my classmates challenged me to find out if I could come up with a more optimized/fair distribution of students, given everyone's preferences. Turns out, it's even more non-trivial than I expected.

I was hoping to hear that a company (e.g., SurveyMonkey) had the ability to sort this out and present the user with 4-5 mathematically equivalent sets of groups. But the advice on how to largely work it out using Excel is super helpful, from a practical standpoint, even if it doesn't result in mathematically optimal solutions.

The way I'd define the order would be "fairness" - At the individual level, it's more fair to have two people receive their second choice than one person receive their first choice and another their third choice.

posted by lizzicide at 4:54 AM on February 10, 2016

As it so happens, the example I'm working with is actually my 2nd year medical school class - Having submitted our preferences, we're waiting to hear back from admin about our 3rd year clerkship groups... There's a bit of delay, and for funsies, one of my classmates challenged me to find out if I could come up with a more optimized/fair distribution of students, given everyone's preferences. Turns out, it's even more non-trivial than I expected.

I was hoping to hear that a company (e.g., SurveyMonkey) had the ability to sort this out and present the user with 4-5 mathematically equivalent sets of groups. But the advice on how to largely work it out using Excel is super helpful, from a practical standpoint, even if it doesn't result in mathematically optimal solutions.

*Actually, to be exact, you have to define an order of some sort so you would know if for 2 people, if both in their second choice group is better or worse than one in a first choice and the other in a third choice.*The way I'd define the order would be "fairness" - At the individual level, it's more fair to have two people receive their second choice than one person receive their first choice and another their third choice.

posted by lizzicide at 4:54 AM on February 10, 2016

This thread is closed to new comments.

This does look a lot like the Stable Marriage problem. The partners, in this case, are a person and group opening. The complication is that all the group openings in a group are equally preferred by each person, but I don't think that this changes matters.

posted by It's Never Lurgi at 3:27 PM on February 9, 2016 [2 favorites]