Comments on: Please help me sort this out
http://ask.metafilter.com/93458/Please-help-me-sort-this-out/
Comments on Ask MetaFilter post Please help me sort this out
Sat, 07 Jun 2008 06:12:49 -0800
Question: Please help me sort this out
http://ask.metafilter.com/93458/Please-help-me-sort-this-out
Sorting algorithm - how would you do this? <br /><br /> I have a list of Type objects, somewhere around 100 - 120 objects in the list. This list determines the persistence order in which any of these objects will be saved to the database. Because of foreign key constraints, it is important that some objects be saved before others, and therefore be higher up in the list than others. So I have a shorter list, perhaps a dozen, of types in the correct order and I need to ensure that they appear in that relative order within the larger list. What is the most stable and efficient way to do this? I currently have a HashTable with a key of the required relative order and value of the current position within the Type array, and a simple bubble sort as follows:<br>
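<code><br>
int hashTypesLength = hashTypes.Count - 1;<br>
int i;<br>
int j;<br>
Type tempType;<br>
int tempIndex;<br>
<br>
// hashTypes[k] holds the current index in typeOrder of the k-th required type.<br>
for (i = hashTypesLength; i >= 0; i--)<br>
{<br>
    // Standard bubble-sort pass (inner loop bound and comparison assumed; the posted line was truncated).<br>
    for (j = 1; j <= i; j++)<br>
    {<br>
        if ((int)hashTypes[j - 1] > (int)hashTypes[j])<br>
        {<br>
            tempIndex = (int)hashTypes[j - 1];<br>
            tempType = typeOrder[tempIndex];<br>
<br>
            // Swap the two types first, while hashTypes still holds their old positions.<br>
            typeOrder[tempIndex] = typeOrder[(int)hashTypes[j]];<br>
            typeOrder[(int)hashTypes[j]] = tempType;<br>
<br>
            // Then swap the recorded positions to match.<br>
            hashTypes[j - 1] = hashTypes[j];<br>
            hashTypes[j] = tempIndex;<br>
        }<br>
    }<br>
}<br>
</code><br>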
<br>
This works, but is there a better way to approach this?
post:ask.metafilter.com,2008:site.93458 Sat, 07 Jun 2008 05:33:31 -0800 Lokheed
code, sort, algorithm
By: pharm
http://ask.metafilter.com/93458/Please-help-me-sort-this-out#1367617
Is this a homework question? You shouldn't need to do any sorting at all to solve the problem as stated, unless I've misunderstood the question of course.<br>
<br>
NB. Has your code lost its formatting? It's almost unreadable.
comment:ask.metafilter.com,2008:site.93458-1367617 Sat, 07 Jun 2008 06:12:49 -0800 pharm
By: mpls2
http://ask.metafilter.com/93458/Please-help-me-sort-this-out#1367619
If you know about all of the types in advance, couldn't you pre-compute (i.e., hard-code) the order?
comment:ask.metafilter.com,2008:site.93458-1367619 Sat, 07 Jun 2008 06:15:32 -0800 mpls2
By: Zarkonnen
http://ask.metafilter.com/93458/Please-help-me-sort-this-out#1367620
If the elements in the smaller list have or could have some identifying mark, you could first persist the ones in the smaller list, and then the ones in the larger list, skipping the ones you've already done.<br>
<br>
So:<br>
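<pre><br>
// Persist the constrained types first, in their required order.<br>
// (Loop bounds are assumed; they were truncated in the post.)<br>
for (int i = 0; i < smallList.length; i++) {<br>
    persist(smallList[i]);<br>
}<br>
<br>
// Then persist everything else, skipping what was already saved.<br>
for (int i = 0; i < largeList.length; i++) {<br>
    if (largeList[i].wasInSmallList()) {<br>
        continue;<br>
    }<br>
    persist(largeList[i]);<br>
}<br>
</pre>
comment:ask.metafilter.com,2008:site.93458-1367620 Sat, 07 Jun 2008 06:15:47 -0800 Zarkonnen
By: Lokheed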
http://ask.metafilter.com/93458/Please-help-me-sort-this-out#1367624
No, it's not a homework problem. The last time I attended school was nearly 20 years ago.<br>
<br>
The problem is that the large list of types, which is auto generated by a tool which I have no control over, and which varies in length and content depending upon the circumstance, often has some things out of order. Take two example objects, Foo and Bar. Bar has a property of Bar.FooID, which is a foreign key in the database to Foo.FooID. Because of that constraint, when persisting new records, Foo has to be saved to the database before Bar. Failing to do so will cause a database exception because of the foreign key constraint. But in that long list of types, Bar is sitting at index 37 while Foo is sitting at index 82. So Foo needs to be moved up in the array ahead of Bar. Now add another dozen or so types that also have foreign key constraints, perhaps about 10% of the total items in the longer list.<br>
<br>
The real solution is to get the third party tool to reliably take foreign key constraints into consideration when creating the large list, but I do not have control over that code. All I can do is look for the subset of objects within that list that matter to me, and sort them appropriately.<br>
<br>
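The re-sorting described above can be pictured with a minimal Python sketch (not the actual C# code from the question): find the slots in the long list that hold constrained types, then refill those same slots in the required order, leaving every other type where it was. The names `reorder_constrained`, `type_order`, and `required_order` are hypothetical, and plain strings stand in for the real Type objects.

```python
def reorder_constrained(type_order, required_order):
    """Return a copy of type_order in which the types named in
    required_order appear in that relative order; all other entries
    keep their original positions."""
    rank = {t: i for i, t in enumerate(required_order)}
    # Slots currently occupied by constrained types, in ascending order.
    slots = [i for i, t in enumerate(type_order) if t in rank]
    # The constrained types actually present, sorted into required order.
    present = sorted((type_order[i] for i in slots), key=rank.__getitem__)
    result = list(type_order)
    for slot, t in zip(slots, present):
        result[slot] = t
    return result


long_list = ["Audit", "Bar", "Note", "Foo", "Tag"]
print(reorder_constrained(long_list, ["Foo", "Bar"]))
# prints ['Audit', 'Foo', 'Note', 'Bar', 'Tag']
```

This only guarantees the relative order of the constrained types among themselves, which is what the question asks for; it does one pass over the long list plus a sort of the dozen or so constrained entries.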
Does that clarify the question, or am I still not making sense?
comment:ask.metafilter.com,2008:site.93458-1367624 Sat, 07 Jun 2008 06:23:45 -0800 Lokheed
By: TheophileEscargot
http://ask.metafilter.com/93458/Please-help-me-sort-this-out#1367668
Wikipedia has a <a href="http://en.wikipedia.org/wiki/Sorting_algorithm#List_of_sorting_algorithms">list of sorting algorithms</a> with links. <a href="http://en.wikipedia.org/wiki/Quicksort">Quicksort</a> is about the best, <a href="http://en.wikipedia.org/wiki/Insertion_sort">insertion sort</a> is easy to implement, though not as efficient. Bubble sort is very inefficient and generally just used for teaching.<br>
<br>
Depending on what language and libraries you're using, you may well have a sort algorithm built in. See if you've got a HashTable.Sort() method or a Sort class or something. It's pretty rare to have to write your own sort algorithms from scratch these days.
comment:ask.metafilter.com,2008:site.93458-1367668 Sat, 07 Jun 2008 07:57:18 -0800 TheophileEscargot
By: demiurge
http://ask.metafilter.com/93458/Please-help-me-sort-this-out#1367669
I would just search for the problematic objects and put them in order at the front of the list. This would work if you only cared about their order relative to each other, not relative to the other objects. The running time would be M*N where M is the length of the big list and N is the length of the subset list. You could decrease the running time if you could do binary search on the big list, but since you probably don't know the ids that would be problematic, you can't do a binary search.
comment:ask.metafilter.com,2008:site.93458-1367669 Sat, 07 Jun 2008 08:09:10 -0800 demiurge
By: demiurge
http://ask.metafilter.com/93458/Please-help-me-sort-this-out#1367670
TheophileEscargot: I don't think a built-in sorting algorithm will help here, because the list has to be sorted to match an example sorted list. A better general sorting algorithm isn't required; a specialized sorting algorithm is.
comment:ask.metafilter.com,2008:site.93458-1367670 Sat, 07 Jun 2008 08:12:38 -0800 demiurge
By: teraflop
http://ask.metafilter.com/93458/Please-help-me-sort-this-out#1367687
If I'm understanding the problem correctly, you have a partial ordering on Types (the database constraints) and you want a total ordering (an ordered list of Types). Sounds like you're looking for a <a href="http://en.wikipedia.org/wiki/Topological_sort">topological sort</a>.
comment:ask.metafilter.com,2008:site.93458-1367687 Sat, 07 Jun 2008 08:41:12 -0800 teraflop
By: SPrintF
http://ask.metafilter.com/93458/Please-help-me-sort-this-out#1367690
This doesn't seem that difficult. Assign a <em>priority</em> to each of your types. Sort on a compound key of (<em>priority</em>, <em>value</em>), where <em>value</em> is whatever distinguishes the items in your list. The higher <em>priority</em> items will "float" to the top of your list.<br>
<br>
When you insert the sorted list into your database, all of your <em>Foos</em> will precede your <em>Bars</em>, so you should have no problem with integrity constraints.<br>
<br>
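A minimal Python sketch of the priority idea, assuming that types missing from the short list carry no constraints and can all share the lowest priority. The names `PRIORITY`, `UNCONSTRAINED`, and `persistence_order`, and the sample type names, are hypothetical. Instead of a compound key it relies on Python's `sorted` being stable, so types with equal priority keep their original relative order.

```python
# Hypothetical priorities: a lower number is saved earlier.
PRIORITY = {"Foo": 0, "Bar": 1, "Baz": 2}
UNCONSTRAINED = len(PRIORITY)  # shared by every type not listed above


def persistence_order(types):
    """Sort types by priority; the stable sort leaves ties untouched."""
    return sorted(types, key=lambda t: PRIORITY.get(t, UNCONSTRAINED))


print(persistence_order(["Audit", "Bar", "Note", "Foo", "Tag"]))
# prints ['Foo', 'Bar', 'Audit', 'Note', 'Tag']
```

Unlike reordering in place, this moves every constrained type to the front of the list, which is only acceptable if the unconstrained types really do not depend on anything.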
If you elect to write your own sorting algorithm, a bubble sort is probably good enough, given that you have a short list of objects.<br>
<br>
(This is just another way of implementing <strong>demiurge</strong>'s solution. He's suggesting, "search for the Foos, then search for the Bars". Algorithmically, a bubble sort is much the same as an iterative search.)
comment:ask.metafilter.com,2008:site.93458-1367690 Sat, 07 Jun 2008 08:44:38 -0800 SPrintF
By: driveler
http://ask.metafilter.com/93458/Please-help-me-sort-this-out#1367721
teraflop is right. Do a topological sort. Create a class that has a type and a list of its properties and their types. Make an unsorted list of all instances of this class. Pick an element out of this list and search its children recursively in depth-first search order. When you get to a type with no properties or whose properties have all been checked, then you know it's safe to remove it from your unsorted list and append it to your sorted list, since by definition you've already searched all of its children and added them to your sorted list. Keep doing this until your original list of types is empty.
comment:ask.metafilter.com,2008:site.93458-1367721 Sat, 07 Jun 2008 09:19:07 -0800 driveler
By: treeshade
http://ask.metafilter.com/93458/Please-help-me-sort-this-out#1367741
Topological sort? Am I misunderstanding the problem? Generate a directed acyclic graph in linear time, then write out the objects in the constrained order. This gives O(n) time overall. If you are not concerned about scaling and you have a simple way to compare items (maybe not the case here), then using a sorting mechanism built into the language (like sort or set in C++) would be simpler, and for small numbers faster (in practice; in theory it is O(n log n)).
comment:ask.metafilter.com,2008:site.93458-1367741 Sat, 07 Jun 2008 09:47:43 -0800 treeshade
By: amery
http://ask.metafilter.com/93458/Please-help-me-sort-this-out#1367823
Are you sure you need to optimize your sorting algorithm? You've got a collection of about 100 things you need to write to a backend. That likely means you're making about 100 remote procedure calls. The time it takes to do those is probably so much bigger than the time it takes to sort about 100 items that the particulars of your sorting algorithm don't matter too much (beyond putting your items in the right order, that is).<br>
<br>
So I'd say: write the simplest program you can that's correct and not pathologically slow (that is, don't use bogosort). My first play would be to use my language's built-in sort() along with a custom comparator. If that's fast enough, declare victory and move on. If it's too slow, profile it to find out where it's slow. My guess is your write() calls will gobble up a lot more time than your slightlyLessThanIdealSort() call.
comment:ask.metafilter.com,2008:site.93458-1367823 Sat, 07 Jun 2008 11:44:17 -0800 amery
By: hattifattener
http://ask.metafilter.com/93458/Please-help-me-sort-this-out#1367876
Yeah, you want a topological sort. I'm sure there are well-known very efficient algorithms to do that, but for only a few hundred items, the ease of writing correct code is more important than shaving a millisecond off of your save time.<br>
<br>
One dead-simple algorithm is this:<br>
<br>
1. Pick an object to be written and remove it from the list. <br>
2. Scan the list for any objects which need to be written before this one. Recursively invoke this algorithm for each one you find.<br>
3. Write out the object you picked in step (1) (or which you were recursively invoked with).<br>
4. Repeat until there's nothing left to write.<br>
<br>
I think it's O(n<sup>2</sup>).<br>
<br>
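The four steps above can be sketched in Python roughly as follows. This is an illustration under assumptions, not a tested persistence routine: `write_in_dependency_order`, `needs`, and `write` are hypothetical names, `needs` maps each object to the objects that must be saved before it, and the constraints are assumed to contain no cycles.

```python
def write_in_dependency_order(pending, needs, write):
    """Call write(obj) for every object in pending so that each object's
    prerequisites are written first. Assumes the constraints are acyclic."""
    remaining = list(pending)

    def visit(obj):
        # Step 2: write any still-unwritten prerequisites first.
        for dep in needs.get(obj, ()):
            if dep in remaining:
                remaining.remove(dep)
                visit(dep)
        # Step 3: everything this object depends on has now been written.
        write(obj)

    # Steps 1 and 4: keep picking objects until nothing is left.
    while remaining:
        visit(remaining.pop(0))


needs = {"Bar": ["Foo"], "Baz": ["Bar"]}
written = []
write_in_dependency_order(["Baz", "Audit", "Bar", "Foo"], needs, written.append)
print(written)
# prints ['Foo', 'Bar', 'Baz', 'Audit']
```

The membership test and removal on a plain list are linear scans, so the quadratic estimate is plausible for this version; at a few hundred items that cost is negligible next to the database writes.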
Alternately, just iterate over your list of Types in order, and write out all objects of each type. O(n * t), but it requires that all objects to be written are of a type that's in your ordered list of types (may or may not be true).
comment:ask.metafilter.com,2008:site.93458-1367876 Sat, 07 Jun 2008 12:52:52 -0800 hattifattener
By: Civil_Disobedient
http://ask.metafilter.com/93458/Please-help-me-sort-this-out#1368165
It sounds like you're trying to roll your own persistence model. I would <i>strongly</i> advise against this.
comment:ask.metafilter.com,2008:site.93458-1368165 Sat, 07 Jun 2008 20:44:16 -0800 Civil_Disobedient
By: Lokheed
http://ask.metafilter.com/93458/Please-help-me-sort-this-out#1368273
I'm really not. The third party product (IdeaBlade) is handling all of the object mapping and persistence, it's just coming back with the wrong persistence order. I actually have a post out on their forum to see if there is something I am doing wrong with the configuration or ORM generation on that side that is causing the persistence order to come out wrong. But in the meantime, I need new records to save properly and that means taking their persistence order list and re-sorting it.
comment:ask.metafilter.com,2008:site.93458-1368273 Sun, 08 Jun 2008 04:22:10 -0800 Lokheed
By: Civil_Disobedient
http://ask.metafilter.com/93458/Please-help-me-sort-this-out#1368654
<i>The third party product (IdeaBlade) is handling all of the object mapping and persistence, it's just coming back with the wrong persistence order.</i><br>
<br>
Aah, ok. I misunderstood the problem. I'm not familiar with IdeaBlade, though I know there are usually configuration settings that tell the ORM the persistence order (Hibernate's <i>cascade</i> attribute, for instance).
comment:ask.metafilter.com,2008:site.93458-1368654 Sun, 08 Jun 2008 14:07:12 -0800 Civil_Disobedient