How to Convert Dates to a Better Format?
March 4, 2009 2:09 AM

PyFilter: I need a one-liner (?) in Python to convert 9/2/09 style dates to 9/2/2009, while also understanding 9/2/71 to be 9/2/1971. Help!

Inspired by an answer to this question I've started poking around learning Python in my spare time. Yes, that's only been an hour or two every few months, but hey: I don't have much spare time, all right? I'm trying, mom!

In one corner of my current hacky project, I'm being fed tables of data by a serial controller. I'm getting arrays out of the data just fine, but one of the columns contains dates in a very annoying format, so I need to convert:
8/17/80  ---> 8/17/1980
10/19/82 ---> 10/19/1982
10/9/01  ---> 10/9/2001
I realize there's a problem there in determining what, for example, "year 40" is. I'm happy with setting a bound, so that any year under 50 should be interpreted as 20xx and any year over 50 as 19xx. This hacky little exercise of mine doesn't need to be Y2.05K compatible! :)

The dates are strings, so one at a time is fine whether with math or just fancy-ass pattern matching. This strikes me as the kind of programming puzzle that probably has fifteen different ridiculously-clever approaches, so I thought of AskMe, my friendly neighborhood source of clever approaches!
posted by rokusan to Computers & Internet (10 answers total)
 
tokens = string.split("/")
if tokens[2] > 50:
    print("19" + str(tokens[2]))
else:
    print("20" + str(tokens[2]))

I think you should be able to piece it together from there. There are shorter options: you don't really need to use split, since you know those digits are the last two of the string, so you can use string[-2:] to get the last two characters. However you get them, just cast to an int, compare, and create a new string.
posted by magikker at 2:23 AM on March 4, 2009


I left out the casting to an int, in case you aren't familiar...

if int(tokens[2]) > 50:

Sorry, I took that for granted. If you know they are ints, just wrap them in int() to do math on them and str() to use them like strings.
posted by magikker at 2:26 AM on March 4, 2009
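
For reference, the split-and-cast approach in the two posts above can be put together as one runnable function. This is only a sketch: the name expand_year and the pivot parameter are made up for illustration, and it reads 50 itself as 1950, which the question leaves open.

```python
def expand_year(date_string, pivot=50):
    """Turn 'm/d/yy' into 'm/d/yyyy' using a fixed pivot year.

    Hypothetical helper: two-digit years below `pivot` become 20yy,
    everything else becomes 19yy (so 50 itself is read as 1950).
    """
    month, day, year = date_string.split("/")
    # The pieces are strings, so cast before comparing with the pivot.
    century = "20" if int(year) < pivot else "19"
    return "/".join([month, day, century + year])


print(expand_year("8/17/80"))   # 8/17/1980
print(expand_year("10/9/01"))   # 10/9/2001
```

Keeping the year as a string and only casting it for the comparison preserves the leading zero in years like 01.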


Best answer: You can make it a bit shorter using the
x if c else y
form:
def convert(date):
    m, d, y = date.split('/')
    return "/".join([m, d, '%d%s' % (19 if int(y) >= 50 else 20, y)])
I don't think there's any straightforward way to do it without splitting the string on a separate line.
posted by jordanlewis at 2:37 AM on March 4, 2009


Best answer: Well, jordanlewis beat me on size, so why not take it all the way down to one line?

def convert(date):
return "".join([date[:-2], "%d%s" % (19 if int(date[-2:]) >= 50 else 20, date[-2:])])

Pretend I indented that... I can't seem to make Mefi indent right now.... Apparently I need to learn some more coding myself.
posted by magikker at 2:54 AM on March 4, 2009


I came in here to give you almost exactly what magikker did.
posted by Netzapper at 3:43 AM on March 4, 2009


If we're going for shortest one liner, the best I'm coming up with is:
def convert(d): return d[:-2]+str(19+(int(d[-2:])<50))+d[-2:]

posted by JiBB at 4:12 AM on March 4, 2009


I can think of one (and only one) obvious way to do this:
import datetime
def reformat_date(date_string):
  return datetime.datetime.strptime(date_string, '%m/%d/%y').strftime('%m/%d/%Y')
My suggestion, though, would be to parse the data when you read it into a datetime object (leaving off the strftime call here) and only change it back into a string when you output it. That way, you'll be able to sort, compare, etc. with your parsed dates and get proper results. You may find (say, if you're inserting into a database) that you can just use the datetime object and not worry about formatting it at all.

You really are dealing with date data, so parsing it properly will make your program readable (and Y2.05K compliant) as well as making it easy to change the formatting in the future should you need to.
posted by pocams at 5:47 AM on March 4, 2009 [3 favorites]
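
Two details of the strptime route are worth checking against the original question. As far as I know, Python's %y follows the POSIX convention (00-68 become 2000-2068, 69-99 become 1969-1999) rather than the cutoff of 50 asked for, and strftime zero-pads %m and %d, so 8/17/80 would come back as 08/17/1980. A sketch that works around both, with a hypothetical function name and pivot argument:

```python
from datetime import datetime


def reformat_with_pivot(date_string, pivot=50):
    """Parse 'm/d/yy' with strptime, then apply a custom century cutoff."""
    parsed = datetime.strptime(date_string, "%m/%d/%y")
    two_digit = parsed.year % 100
    # Override strptime's own century guess with the rule from the question:
    # below the pivot means 20yy, otherwise 19yy.
    year = (2000 if two_digit < pivot else 1900) + two_digit
    # Build the string by hand to avoid strftime's zero padding.
    return "%d/%d/%d" % (parsed.month, parsed.day, year)


print(reformat_with_pivot("8/17/80"))   # 8/17/1980
print(reformat_with_pivot("10/9/55"))   # 10/9/1955 (plain %y would give 2055)
```

The parse step still validates the month and day, which the pure string-slicing answers do not.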


Sorry, should have mentioned that datetime.strptime is only available in Python 2.5 and up. Before that you have to do a nasty little trick to get a datetime object from a string. (see also)
posted by pocams at 5:51 AM on March 4, 2009
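
The trick alluded to is presumably the one given in the Python documentation, which describes datetime.strptime as equivalent to building a datetime from the first six fields of time.strptime's result. A minimal sketch, with a made-up function name:

```python
import time
from datetime import datetime


def parse_date_old_style(date_string, fmt="%m/%d/%y"):
    """Build a datetime via time.strptime, for Pythons without datetime.strptime."""
    # time.strptime returns a struct_time; its first six fields are
    # year, month, day, hour, minute, second.
    return datetime(*(time.strptime(date_string, fmt)[0:6]))


print(parse_date_old_style("10/19/82"))  # 1982-10-19 00:00:00
```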


Do it the right way, which is to say mostly like the way pocams posted. This goes double if part of the point of the exercise is to learn Python. Python style actually eschews one-liners and long lines, so I would instead write it like this:
# import the datetime class from the datetime module
from datetime import datetime

def reformat_date(date_string):
    timepoint = datetime.strptime(date_string, "%m/%d/%y")

    return timepoint.strftime("%m/%d/%Y")
I think it's easier to see what is going on this way, too.

Also, since you are already reformatting the date, I would consider using the ISO 8601 date format instead. Not only is it an international standard, but it has nice properties like being sortable as a string without having to parse it again, and being unambiguous about the order of month and day.
from datetime import datetime

def reformat_date(date_string):
    timepoint = datetime.strptime(date_string, "%m/%d/%y")    
    day = timepoint.date()

    return day.isoformat()

posted by grouse at 7:30 AM on March 4, 2009 [1 favorite]
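
To illustrate the sortability point, here is a small sketch using the three sample dates from the question (already expanded to four-digit years). The variable names are just for illustration.

```python
from datetime import datetime

us_style = ["10/9/2001", "8/17/1980", "10/19/1982"]

# Sorting the m/d/yyyy strings directly compares characters, not dates.
print(sorted(us_style))
# ['10/19/1982', '10/9/2001', '8/17/1980']

iso_style = [
    datetime.strptime(s, "%m/%d/%Y").date().isoformat() for s in us_style
]

# ISO 8601 strings sort chronologically with a plain string sort.
print(sorted(iso_style))
# ['1980-08-17', '1982-10-19', '2001-10-09']
```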


Response by poster: You're all amazing. Jordanlewis and magikker get max karma for doing it quick and dirty, but you should all get a cookie, really.

I can't choose the formats, just work with the ones supported before and after, and pocams' and grouse's ideas make sense but don't have value in this specific case. I see the theoretical value in general use, so maybe I'll save those snips for future battles, but this string is not 'really' a date on the way in anyway, since all I have are those two digits for a year and I have to guess as to what was meant all along.

Too, since m/d/yyyy is the one and only format I need to work with at every later point, when the processing will happen, there's no real value to storing it as a 'true' date internally in this application. And ISO 8601 is lovely in a UNIFON way, but nothing else I need to use later supports it, so m/d/yyyy is the winner, imperfect and all.

Thanks, geek cabal!
posted by rokusan at 7:50 PM on March 4, 2009

