<?xml version="1.0" encoding="utf-8"?>
<rss version="2.0"
    xmlns:dc="http://purl.org/dc/elements/1.1/"
     xmlns:admin="http://webns.net/mvcb/"
     xmlns:content="http://purl.org/rss/1.0/modules/content/"
     xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
	<channel> 

      <title>Comments on: Tricky SQL for a simple function</title>
      <link>http://ask.metafilter.com/76902/Tricky-SQL-for-a-simple-function/</link>
      <description>Comments on Ask MetaFilter post Tricky SQL for a simple function</description>
	  	  <pubDate>Fri, 23 Nov 2007 02:12:37 -0800</pubDate>
      <lastBuildDate>Fri, 23 Nov 2007 02:12:37 -0800</lastBuildDate>
      <language>en-us</language>
	  <docs>http://blogs.law.harvard.edu/tech/rss</docs>
	  <ttl>60</ttl>

<item>
  	<title>Question: Tricky SQL for a simple function</title>
  	<link>http://ask.metafilter.com/76902/Tricky-SQL-for-a-simple-function</link>	
  	<description>Been banging my head against the wall on this the whole day. I need a way to select &amp;amp; sort data from a single MySQL table, but it&apos;s a little bit more complicated than it seems. &lt;br /&gt;&lt;br /&gt; Assume a table called &quot;activity&quot; with 4 fields - activity_id, activity_description, activity_datetime and project_id, which is a foreign key to the &quot;project&quot; table. This &quot;activity&quot; table stores all activity updates related to a project.&lt;br&gt;
&lt;br&gt;
The &quot;project&quot; table is just project_id and project_name.&lt;br&gt;
&lt;br&gt;
In a single SQL query, I want to display a table that shows each project, ordered by its latest activity_datetime, like this:&lt;br&gt;
&lt;br&gt;
Project X | &quot;Client meeting&quot; | 2007-11-23&lt;br&gt;
Project Z | &quot;Web design&quot; | 2007-10-22&lt;br&gt;
Project A | &quot;Programming&quot; | 2007-10-10&lt;br&gt;
&lt;br&gt;
&lt;br&gt;
My current SQL does not work because MAX(activity_datetime) gives me the most recent activity datetime, but activity_description still shows the first matching record.&lt;br&gt;
&lt;br&gt;
SELECT *, MAX(activity_datetime) FROM activity a, project p WHERE a.project_id = p.project_id GROUP BY p.project_id</description>
  	<guid isPermaLink="false">post:ask.metafilter.com,2008:site.76902</guid>
  	<pubDate>Fri, 23 Nov 2007 01:38:06 -0800</pubDate>
  	<dc:creator>arrowhead</dc:creator>
	
	<category>mysql</category>
	
	<category>sql</category>
	
</item>
<item>
  	<title>By: evariste</title>
  	<link>http://ask.metafilter.com/76902/Tricky-SQL-for-a-simple-function#1142607</link>	
  	<description>Is there any reason why this wouldn&apos;t work?&lt;br&gt;
&lt;br&gt;
SELECT * FROM activity a, project p WHERE a.project_id = p.project_id GROUP BY p.project_id ORDER BY activity_datetime desc;</description>
  	<guid isPermaLink="false">comment:ask.metafilter.com,2008:site.76902-1142607</guid>
  	<pubDate>Fri, 23 Nov 2007 02:12:37 -0800</pubDate>
  	<dc:creator>evariste</dc:creator>
</item>
<item>
  	<title>By: arrowhead</title>
  	<link>http://ask.metafilter.com/76902/Tricky-SQL-for-a-simple-function#1142619</link>	
  	<description>Hi evariste, that SQL doesn&apos;t work because the GROUP BY clause sort of messes up the results... the results show data that does not correspond to the latest activity_datetime. So that&apos;s the problem - I still need GROUP BY in order for each project to show up only once.</description>
  	<guid isPermaLink="false">comment:ask.metafilter.com,2008:site.76902-1142619</guid>
  	<pubDate>Fri, 23 Nov 2007 02:42:22 -0800</pubDate>
  	<dc:creator>arrowhead</dc:creator>
</item>
<item>
  	<title>By: Leon</title>
  	<link>http://ask.metafilter.com/76902/Tricky-SQL-for-a-simple-function#1142620</link>	
  	<description>The join on project is superfluous to your problem, which can be reduced to:&lt;br&gt;
&lt;br&gt;
select description, max(dt) from activity group by projectid&lt;br&gt;
&lt;br&gt;
try this:&lt;br&gt;
&lt;br&gt;
select * from activity where dt in (select max(dt) from activity group by projectid)&lt;br&gt;
&lt;br&gt;
rewriting as a single query is left as an exercise for the reader :)</description>
  	<guid isPermaLink="false">comment:ask.metafilter.com,2008:site.76902-1142620</guid>
  	<pubDate>Fri, 23 Nov 2007 02:44:59 -0800</pubDate>
  	<dc:creator>Leon</dc:creator>
</item>
<item>
  	<title>By: evariste</title>
  	<link>http://ask.metafilter.com/76902/Tricky-SQL-for-a-simple-function#1142626</link>	
  	<description>I&apos;m just messing around here, but would this work?&lt;br&gt;
&lt;br&gt;
SELECT * from activity a, project p where p.project_id=a.project_id and a.activity_datetime in (SELECT MAX(activity_datetime) FROM activity a, project p WHERE a.project_id = p.project_id GROUP BY p.project_id)&lt;br&gt;
&lt;br&gt;
Or possibly:&lt;br&gt;
&lt;br&gt;
SELECT * from activity a, project p where p.project_id=a.project_id and a.activity_datetime in (SELECT MAX(activity_datetime) FROM activity a, project p WHERE a.project_id = p.project_id GROUP BY p.project_id) limit (select count(*) from project)&lt;br&gt;
&lt;br&gt;
if the first one gives you too many rows.</description>
  	<guid isPermaLink="false">comment:ask.metafilter.com,2008:site.76902-1142626</guid>
  	<pubDate>Fri, 23 Nov 2007 03:03:50 -0800</pubDate>
  	<dc:creator>evariste</dc:creator>
</item>
<item>
  	<title>By: evariste</title>
  	<link>http://ask.metafilter.com/76902/Tricky-SQL-for-a-simple-function#1142627</link>	
  	<description>Too slow! Looks like I was thinking along the same general track as Leon.</description>
  	<guid isPermaLink="false">comment:ask.metafilter.com,2008:site.76902-1142627</guid>
  	<pubDate>Fri, 23 Nov 2007 03:04:49 -0800</pubDate>
  	<dc:creator>evariste</dc:creator>
</item>
<item>
  	<title>By: Leon</title>
  	<link>http://ask.metafilter.com/76902/Tricky-SQL-for-a-simple-function#1142628</link>	
  	<description>BTW, that inner select assumes that the dt column is unique. With a 1-second resolution, it is guaranteed to break at some point. In the real world, I&apos;d solve this outside SQL.</description>
  	<guid isPermaLink="false">comment:ask.metafilter.com,2008:site.76902-1142628</guid>
  	<pubDate>Fri, 23 Nov 2007 03:05:52 -0800</pubDate>
  	<dc:creator>Leon</dc:creator>
</item>
<item>
  	<title>By: evariste</title>
  	<link>http://ask.metafilter.com/76902/Tricky-SQL-for-a-simple-function#1142629</link>	
  	<description>&lt;i&gt;BTW, that inner select assumes that the dt column is unique. With a 1-second resolution, it is guaranteed to break at some point.&lt;/i&gt;&lt;br&gt;
&lt;br&gt;
We&apos;re talking about activity rows with the same datetime, but different project_id, right? I tried adding an activity row with the exact same datetime as another, but a different project, to the toy sqlite database I&apos;m using to help me try to answer this question. It still works fine and outputs something that seems correct, because I&apos;m including the p.project_id=a.project_id clause in the inner select. Does this address what you said, or am I missing your point?</description>
  	<guid isPermaLink="false">comment:ask.metafilter.com,2008:site.76902-1142629</guid>
  	<pubDate>Fri, 23 Nov 2007 03:10:42 -0800</pubDate>
  	<dc:creator>evariste</dc:creator>
</item>
<item>
  	<title>By: Leon</title>
  	<link>http://ask.metafilter.com/76902/Tricky-SQL-for-a-simple-function#1142631</link>	
  	<description>Can you show the current SQL?</description>
  	<guid isPermaLink="false">comment:ask.metafilter.com,2008:site.76902-1142631</guid>
  	<pubDate>Fri, 23 Nov 2007 03:14:03 -0800</pubDate>
  	<dc:creator>Leon</dc:creator>
</item>
<item>
  	<title>By: evariste</title>
  	<link>http://ask.metafilter.com/76902/Tricky-SQL-for-a-simple-function#1142632</link>	
  	<description>&lt;b&gt;Leon&lt;/b&gt;, my best query attempt so far is this one:&lt;br&gt;
&lt;br&gt;
SELECT * from activity a, project p where p.project_id=a.project_id and a.activity_datetime in (SELECT MAX(activity_datetime) FROM activity a, project p WHERE a.project_id = p.project_id GROUP BY p.project_id) limit (select count(*) from project)&lt;br&gt;
&lt;br&gt;
I&apos;m sure it could be improved. I uploaded my test sqlite database &lt;a href=&quot;http://discardedlies.com/media/mefi-arrowhead.sqlite&quot;&gt;here&lt;/a&gt; if you want some test data. (and &lt;b&gt;arrowhead&lt;/b&gt;, feel free to improve on my probably-crummy test database, I just needed something to play with.)</description>
  	<guid isPermaLink="false">comment:ask.metafilter.com,2008:site.76902-1142632</guid>
  	<pubDate>Fri, 23 Nov 2007 03:16:58 -0800</pubDate>
  	<dc:creator>evariste</dc:creator>
</item>
<item>
  	<title>By: Leon</title>
  	<link>http://ask.metafilter.com/76902/Tricky-SQL-for-a-simple-function#1142636</link>	
  	<description>*sigh*... I&apos;m half awake this morning. You guys should just ignore me while I mumble to myself. I think evariste&apos;s solution will work, but he should be dragged to the village ducking stool for using &lt;em&gt;two&lt;/em&gt; nested selects. (Seriously - what&apos;s the limit for? I&apos;m not seeing it).</description>
  	<guid isPermaLink="false">comment:ask.metafilter.com,2008:site.76902-1142636</guid>
  	<pubDate>Fri, 23 Nov 2007 03:29:34 -0800</pubDate>
  	<dc:creator>Leon</dc:creator>
</item>
<item>
  	<title>By: evariste</title>
  	<link>http://ask.metafilter.com/76902/Tricky-SQL-for-a-simple-function#1142639</link>	
  	<description>Leon, the limit clause is because I was getting too many rows! It was returning &lt;i&gt;every activity record,&lt;/i&gt; grouped by project and then ordered by datetime, desc. I&apos;m assuming arrowhead doesn&apos;t want every activity record, but rather only wants the single latest activity from each project, so putting on a limit equal to the number of projects seemed like the way to accomplish it, and you clearly don&apos;t want to pick a number and hardcode it, because you might add or delete projects. While two nested subqueries is pretty ugly, it does the job...I hope.</description>
  	<guid isPermaLink="false">comment:ask.metafilter.com,2008:site.76902-1142639</guid>
  	<pubDate>Fri, 23 Nov 2007 03:33:43 -0800</pubDate>
  	<dc:creator>evariste</dc:creator>
</item>
<item>
  	<title>By: Leon</title>
  	<link>http://ask.metafilter.com/76902/Tricky-SQL-for-a-simple-function#1142647</link>	
  	<description>You know, if we assume that activity rows are always added in date-order (and so the highest activity_id for a project is always the most recent one) we can do this:&lt;br&gt;
&lt;br&gt;
SELECT * FROM activity WHERE activity_id IN (SELECT MAX(activity_id) FROM activity GROUP BY project_id)&lt;br&gt;
&lt;br&gt;
That can be a reasonable assumption, sometimes (logging tables).</description>
  	<guid isPermaLink="false">comment:ask.metafilter.com,2008:site.76902-1142647</guid>
  	<pubDate>Fri, 23 Nov 2007 04:07:44 -0800</pubDate>
  	<dc:creator>Leon</dc:creator>
</item>
<item>
  	<title>By: arrowhead</title>
  	<link>http://ask.metafilter.com/76902/Tricky-SQL-for-a-simple-function#1146950</link>	
  	<description>Thank you Leon &amp;amp; evariste - you&apos;ve both been extremely helpful. A derivation of both suggested queries did the job - managed to solve the problem in less than 10 minutes!</description>
  	<guid isPermaLink="false">comment:ask.metafilter.com,2008:site.76902-1146950</guid>
  	<pubDate>Tue, 27 Nov 2007 09:06:50 -0800</pubDate>
  	<dc:creator>arrowhead</dc:creator>
</item>
<item>
  	<title>By: evariste</title>
  	<link>http://ask.metafilter.com/76902/Tricky-SQL-for-a-simple-function#1147057</link>	
  	<description>Excellent!</description>
  	<guid isPermaLink="false">comment:ask.metafilter.com,2008:site.76902-1147057</guid>
  	<pubDate>Tue, 27 Nov 2007 09:57:43 -0800</pubDate>
  	<dc:creator>evariste</dc:creator>
</item>

    </channel>
</rss>
