interpolating dates in SQL query
April 20, 2008 2:42 AM   Subscribe

In a SQL query that returns totals by date, how to interpolate dates where there are no results?

I have a query something like:

select date(timestamp_col), count(foo)
from things
group by date(timestamp_col).

but I want to show dates where there were no foo records as "0", so I can produce results suitable for graphing.

What is the best way to do this? (The database in question is sqlite, if it matters).
posted by i_am_joe's_spleen to Computers & Internet (7 answers total) 1 user marked this as a favorite
 
Best answer: You'll probably need a second table (days) which has one column (date) with one record for each day for the period concerned and then do an outer join.

So you have table things with

Monday 8
Wednesday 10
Saturday 5

and another table with 6 rows Monday - Sunday

Your query will look something like this

select days.date, nvl(count(things.thing), 0)
from days
left outer join things on things.date = days.date
group by days.date

that way you'll get all the days of the week and the count of things where the days match.

Monday 8
Tuesday 0
Wednesday 10
Thursday 0
Friday 0
Saturday 5
Sunday 0

This is the simplest way if I've understood what you want - things can get a bit more complicated though if this isn't exactly what you're after, MeFi mail me if you have any problems...

If you've not used outer joins before, good explanation here. And i may have got the table order the wrong way round in the SQL up there, I'm always doing that, just switch the table round if i have
posted by jontyjago at 3:04 AM on April 20, 2008


Maybe something like:

select
case count(foo)
when 0 then 0
else date(timestamp_col)
end,
count(foo)
from things
group by date(timestamp_col)
posted by gregor-e at 4:16 AM on April 20, 2008


Gregor-e's query is basically the same as the query in the question, but just trapping 0 counts. However there won't be any 0 counts as there are only records where foo exists. If there nothing happened on tuesday, the count won't be 0, it simply won't exist and the results will jump straight from monday to wednesday. I had a very similar case at work this week and the extra days table was the simplest way I found of fixing it. Basically you need to tell the query that there are more days available than in the things table. No disrespect to gregor-e intended, I just do this shit all day long.
posted by jontyjago at 4:36 AM on April 20, 2008


Or, the other way of doing it and keeping it all in one table is to add one row to things for each day in your period with one record for the day with foo as 0. As you are doing a sum it won't affect the totals for the days you do have a value in foo and for the others it will return 0. The original query (and gregor-e's) will then work.

That's me for the day, too much SQL for a sunday afternoon :o)
posted by jontyjago at 4:52 AM on April 20, 2008


Yeah, you need your signal to put a new point /not/ come from your list of values in the table.

I'd SELECT from the database, ordering by date.

Then, make a for loop over the periods you want to plot: { If the row on the front of the selected list is in that period, then pop it off and plot it. Else, plot zero. }
posted by cmiller at 5:09 AM on April 20, 2008


Here is the function you want, if you're using MS SQL. It's a function to create a temp table of dates, as jontyjago suggests. Do a LEFT JOIN against that.
posted by mkultra at 6:57 AM on April 20, 2008


Response by poster: I have happily used jontyjago's approach, except I didn't need a "days" table - I just used "select distinct day from thingies" as a subselect. Also, yes the table order was the wrong way around, but the outer join suggestion put me on the right track.
posted by i_am_joe's_spleen at 1:19 AM on May 3, 2008


« Older FLAC play back   |   Boston.com's most popular article? Newer »
This thread is closed to new comments.