No, this isn't about EVE Online.
February 28, 2013 5:59 AM   Subscribe

I need help taming a very long list of numbers into useful tabulated data, in Excel.

I have an Excel spreadsheet. It consists of a single column with 8,784 rows. Each cell contains a number. That number represents a data value corresponding to an hour from 2012. The data values are in order, starting at midnight Jan 1, and going to midnight Dec 31.

I need to turn this mess into a 12x24 table representing average hourly values for each month. My Excel-fu is weak.

The best way I currently know to do this is to just type a series of very long functions like: =Average(A1, A25, A49,A73 ...)

This is obviously not the right way. So, what's the right way?
posted by 256 to Computers & Internet (10 answers total) 1 user marked this as a favorite
What exactly do you want in the 12x24 table? Is it the count of date values per hour of the day per month?
posted by Aizkolari at 6:07 AM on February 28, 2013

Here's what you do. Make two more columns to the right of your data. Call one Hour and one Month. Then use the =hour() and =month() functions in excel to extract the hour and month from your data. Then use a pivot table to pull it all into a 12x24 table with the months on the rows, the hours on the columns, and the average of your data as the value.

If you can post a link with sample data for one month I can demo this for you.
posted by Aizkolari at 6:17 AM on February 28, 2013

What Aizkolari says. This is what pivot tables are for. Insert a column. Add the month. Insert another column. Add the hour. You can use a formula for the hours because you know they come in blocks of 24. Then pivot it so you have the months down the vertical axis and hours on the horizontal. Or vice versa. Then choose "average" rather than sum or count for the calculation of your data.
posted by MuffinMan at 6:41 AM on February 28, 2013

So here's January.

I have googled "excel pivot table" and am reading the tutorial now. But any more specific help would be super useful.

And to clear, what I need is a table where the first column contains values like "Average hour 1 value in January," "Average hour 2 value in January" and then the second column contains values like "Average hour 1 value in February," "Average hour 2 value in February"
posted by 256 at 6:52 AM on February 28, 2013

Okay.... So I see how pivot tables are what I need, but it seems that, in order to use them, I would need to create one column that counted from 1 to 24 over and over again, and then another column that said "January" 744 times, followed by "February" 696 times, etc. Correct?

This is what you are getting at with the hour() and month() suggestion, I think, but I'm having trouble making that work.
posted by 256 at 7:04 AM on February 28, 2013

What kind of trouble?
posted by flabdablet at 7:08 AM on February 28, 2013

Well, my googling suggests that the hour() function is used for turning times into hour integers, like =HOUR(TIMEVALUE("1:00 PM")) returns 13. I just don't see how I use that function to generate the 1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,1,2... that I need to populate column B with.

Sorry, I know this is very entry level Excel stuff.
posted by 256 at 7:13 AM on February 28, 2013

Ok, so you need to add a third extra column with datetime info autofilled in for the whole year. Here is a mockup I did with some fake data for a few extra months. To get the datetime autofill started, I typed in 1/1/2012 00:00, 1/1/2012 01:00, 1/1/2012 02:00 in the first three cells of column B and then it autofilled the rest of the way down. Same for the Hour and Month columns. The pivot table is easy too; just insert it, put the months on rows and hours on columns with Value in the Values, and then go to Value Field Settings on your Value and set it to Average rather than sum.
posted by Aizkolari at 7:29 AM on February 28, 2013 [1 favorite]

Woo! It worked!

These pivot table things seem useful....
posted by 256 at 7:49 AM on February 28, 2013 [1 favorite]

Excel isn't very hot on the 1..24 hour format used in most data logging applications, so your first/last hours of every month with belong to the wrong month. This might matter in your application.

Personally, I build two hash tables - the sum for each month, and the count for each month - then calculate the mean as sum[month]/count[month] for each month. I use whatever scripting language is handy, and I very rarely use any standard time library as they force 0..23 hour notation.
posted by scruss at 7:52 AM on February 28, 2013

« Older I'm turning 25 tomorrow. 5 years away from 30.   |   Lose a good friend or try to fix it? Newer »
This thread is closed to new comments.