Baby, we were born to add
November 10, 2010 11:56 AM   Subscribe

SPSS/SAS filter: How can I convert a series of dates into a series of sequential numbers? Of course the data is messy and some of it's missing.

I've got a dataset of measures taken for people over the course of several months. I'd like to create some variables for some of these measures by month (month in the study, not month of the year). I have the date each measure was taken.

The only way I can see to do this is to assign each date for each person a uniform "study date" -- i.e., Day 1, Day 2, Day 3, etc.

Some issues:
1. People started in the study at different times. So July 2nd could be Day 1 for one person but Day 35 for someone else.
2. There is no connection between calendar date and when someone started -- i.e., Day 1s are scattered through different parts of each month and through different months of the year.
3. There are missing measures. So whatever method I use to convert the dates to numbers will need to account for this and skip over a number associated with a missing date.

Right now the data is in SPSS, and I've also got a copy of it in Excel. I've got access to SAS. I'm not nearly as proficient with it as the other two but I'm willing to muck around a bit if it keeps me from having to do this by hand.

Thanks, Hive!
posted by unannihilated to Technology (5 answers total) 1 user marked this as a favorite
 
I've never used SPSS, but I've done this exact thing with SAS programming. SAS has predefined functions for converting dates between different units. In this instance, I'd load in the date data, matching the format it's currently in. Then I'd change the date data type to a Julian date. I'd figure out the earliest date of each person, and subtract this date, again in Julian units.

SAS programming is a little cumbersome if you're not that familiar with it. But the same methodology may work in SPSS.
posted by Tooty McTootsalot at 12:18 PM on November 10, 2010


SPSS has exactly this function too.

First pick a date format, e.g. dd/mm/yy.

Clean up the dataset, so all dates are in this same format.

In the "Variable View" set the "Type" of all date variables to that date format.

Under the "Transform" menu, choose the top option "Compute variable".

The "Target Variable" should be set as STUDYDATE (or something similar), and the "Numeric expression" is =DATEDIFF(Variable that indicates date of study observation, variable that is date people entered study, "days")
posted by roofus at 2:14 PM on November 10, 2010


You can work with dates in SPSS, and even split the file by person so that you can assign ranks by date, but I am unsure how to generate these ranks without a variable that has the start date. In excel you can do this with a bit of manual effort.

1. Extract all the months (=month(cell)) in another column, say column C
2. Sort the data by person, then by month
3. Create a column for the month of the study by subtracting the minimum value in the range of months for each person from each "month" value: it would look something like this, for example "=C2-min(C2:C20)" (this is the hard part since you will have to specify the range manually) This will give you a column with a set of values that has "0" as the first month of the study.

I am not sure what you mean by missing dates, do you mean you have the data but not the date it was measured?
posted by prenominal at 7:08 PM on November 10, 2010


Response by poster: Thanks for the answers, all. The data was self-recorded by participants. By missing dates, I mean a participant missed doing their recording that day, so there is no data at all for the day. Any data I have is automatically labeled with the date by the system that participants use to enter it.
posted by unannihilated at 5:28 AM on November 11, 2010


Okay, thanks. Missing dates/data are okay then, if you are just going to rank the "extracted" months.
posted by prenominal at 6:10 AM on November 11, 2010


« Older What is the best telescope I can purchase new for...   |   Payments Along the Bell Curve Newer »
This thread is closed to new comments.