Dealing with Archives
February 5, 2004 11:56 PM

I'm considering buying a domain, and setting up a personal site. This wouldn't be a problem, really, except that I've got over five years of content on my current personal site in various formats, and I don't want to repeat the mistakes of the past. [Very long, very geeky details inside. Not for the faint of heart.]

I started my personal site sometime around 1997/1998. I built all the pages in notepad, manually inserting links, content, next/previous links, archives, etc. After a while, I got tired of the work, and then put together a perl script to take care of putting together the static pages for me. Shortly after that, my host went under, and I purchased my domain (Obviously a self-link, and of limited functionality at the moment due to a recent format). I had to re-work all the links by hand to reflect the new domain/directory structure, and found a nice script that did everything I wanted it to, as far as a journal CMS goes.

THEN! I decided to add a second journal to my site to keep track of the far-too-surreal dreams I was having, but the script I was using didn't have any functionality to handle that, so I set up a second version of the script, in another directory. When Blogger opened, I created an account and had a Blog on my page, also entirely separate from the dream/waking journals. I didn't mind this demarcation so much, as it allowed me to separate the more personal thoughts from the external-linking kind of writing that's prevalent on weblogs.

I eventually outgrew Blogger, and after trying out several different options, I settled on MovableType as the CMS of choice for the weblog portion of my site. I found a blogger-to-mt script that was entirely useless, and after spending a good week playing with the code (in the process learning that my coding skills have degraded shamefully), I transferred all the content over manually. Somewhere around this time I also stopped using external hosting (due to a series of major-media links that drove my site to over 2TB of traffic a day for three days, before my host shut my account down and asked for enough money to buy a car with), and set up Apache/PHP/Perl/etc on my XP box, along with a mail server to handle my domain. I transferred everything over to my home PC, and continued.

After a critical hard drive failure, I went about restoring my content from backups, only to find that the current Perl-DB version was incompatible with the older MT archives. Unable to locate a working copy of the older version, I again worked up a hack to salvage most of my MT content, losing all the metadata in the process, forcing me to re-enter it all by hand using a static copy of my old site for reference. I quickly switched to SQL for MT's data system.

Now, I'm using a variety of mt/php/perl hacks on my site to get the functionality I want. Winamp now-playing lists, inline movie/book reviews, a BlogRoll, etc., etc... All the while, I've still got these two older CGI scripts with my waking/dream journals in them, and a fair amount of static content.

It's a big mess.

Now, I'm seriously considering selling my domain name to finance a move to Amsterdam (Haarlem, really, but...), and if I'm going to be moving all of my content to a new domain, I'd like to make sure that whatever tools I use don't cause me the same problems I've had in the past. Also, I'd like to be able to import all of my older MT/other entries into this CMS, whatever it might be. My waking/dream journals are stored in individual files named after the date of the entry (Ex: 20020321), and almost entirely plaintext, with a small amount of metadata at the end of the file (Ex: {title=there is an exit here} {template=main} {datestamp=200203211611} ).
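For anyone who wants to see what that format means in practice, here's a rough Python sketch of pulling one of those journal files apart. It assumes the {key=value} tags always sit together at the very end of the file and never contain braces themselves, and that the datestamp is YYYYMMDDhhmm, as in the example above. The function name parse_entry and the sample body text are made up for illustration.

```python
import re
from datetime import datetime

# One metadata tag, e.g. {title=there is an exit here}
TAG = re.compile(r"\{(\w+)=([^{}]*)\}")
# A run of such tags at the very end of the file (assumed layout)
TRAILER = re.compile(r"(?:\s*\{\w+=[^{}]*\})+\s*$")

def parse_entry(text):
    """Split a journal file into (body, metadata dict)."""
    match = TRAILER.search(text)
    if match is None:
        # No trailing tags: treat the whole file as body text.
        return text.strip(), {}
    meta = dict(TAG.findall(match.group(0)))
    body = text[:match.start()].strip()
    if "datestamp" in meta:
        # Assumed format: YYYYMMDDhhmm, as in {datestamp=200203211611}
        meta["datestamp"] = datetime.strptime(meta["datestamp"], "%Y%m%d%H%M")
    return body, meta

# Hypothetical file contents; only the tag line mirrors the real example.
sample = ("I dreamt of stairs.\n\n"
          "{title=there is an exit here} {template=main} {datestamp=200203211611} \n")
body, meta = parse_entry(sample)
print(meta["title"], meta["datestamp"].isoformat())
```

Keeping the body and the metadata separate like this means the same parser could feed whatever CMS I end up with.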

MovableType seems like the obvious answer to my CMS needs, but there's the matter of hosting. Do I continue running a mail/web server from my home, or do I go with external hosting? Do I go with a generic host that has PHP/CGI/SQL functionality, or do I go with TypePad, and hope that they never go belly-up? And, of course, how do I get these five years of content into TypePad/MT or whatever CMS I go with?

It all seems very overwhelming to me. I'm hoping that someone here has gone through a similar experience, and can offer some advice on the situation. I don't want to continue maintaining a number of different journaling systems, and I don't want to lose all of my writing. I'd also rather not have the old content archived somewhere, still in a separate format. Integration, and all that.

Thoughts? Comments? Suggestions?
posted by Jairus to Computers & Internet (6 answers total)
Best answer: My thought is that, based on what's readily available, MT is the way to go. The database should mean that the information is going to be safe and portable. Even if you move away from MT down the road, it shouldn't be too difficult to manipulate the existing database. I expect that as MT develops, hacks to do just this will become more common.

I share your reservations about TypePad (or any other remotely hosted service). I'm a control freak and am far more comfortable with the scripting under my control.

As far as hosting goes, with prices as low as they are now I can't see doing it myself for a personal site. Add in the downtime due to moving and connectivity issues once you're moved, and suddenly $15-20 a month is sounding like a bargain. What's nice for you is that you have a local server available, hence you can mirror your site and maintain constant backups. If the host doesn't work out, you're down 36 hours while DNS propagates. No big deal.

I'm also convinced that in today's climate anyone who is not a professional and runs a mail server is out of their friggin' tree. Unless you're getting paid for it, keeping on top of security issues strikes me as a bottomless pit of aggravation. But who knows, some people like being whipped with barbed wire.
posted by cedar at 12:38 AM on February 6, 2004


MetaFilter: some people like being whipped with barbed wire
posted by gen at 12:51 AM on February 6, 2004

I did something similar over the last year. I had my site hosted on a server in a friend's living room and had a bunch of files which had been piling up since 1995.

The options I went with were coding my own CMS and finding a cheap commercial host. I've been extremely happy.

The custom CMS really wasn't that hard, and it sounds like you have the skill to write something like that. It was a pain to create tools to import all the old stuff, but it worked well, and I designed the new system to be much more flexible, so I shouldn't have to do that ever again.

Going with a commercial host was good and bad. The site is much faster. But it's frustrating to have to wait for someone to fix problems. Fortunately that happens very rarely.
posted by y6y6y6 at 4:54 AM on February 6, 2004

If you decide to stick with Movable Type, it would be fairly simple to write a quick Perl script using the MT API to import the journal files you've described. People on the Plugin Development section of the support forums and/or the mt-dev mailing list (myself included) would be happy to give you guidance.

Then you'd at least have all your data in one place, so even if you decide not to use MT in the long run it would be easier to convert it to whatever you do use.

As for hosting, these links may be useful:
Moving your MT Blogs to a New Server or Web Host
Movable Type Friendly Web Hosts
posted by staggernation at 5:58 AM on February 6, 2004

Remember that MT has a plain-text import/export format. I've helped friends move from Blogger when Blogger's normal export feature was dead, and managed to massage their existing files through GREP into an MT-importable form. You've already got the SQL database, but for the belts-and-suspenders approach, you can also occasionally export to text and archive that.
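To give a feel for what "MT-importable form" looks like, here's a minimal Python sketch that writes one entry in the import layout as I remember it: header fields, a line of five hyphens, a BODY: section closed by five hyphens, and eight hyphens to end the entry. Treat the exact field names and the date format as assumptions to check against the MT import documentation. The function name to_mt_entry and the author and body values are made up for the example.

```python
from datetime import datetime

SECTION_END = "-----"    # five hyphens close the header and each section
ENTRY_END = "--------"   # eight hyphens close the whole entry

def to_mt_entry(title, body, when, author):
    """Render one entry as text in the (assumed) MT import layout."""
    header = [
        f"AUTHOR: {author}",
        f"TITLE: {title}",
        # Assumed date layout: MM/DD/YYYY hh:mm:ss AM|PM
        f"DATE: {when.strftime('%m/%d/%Y %I:%M:%S %p')}",
    ]
    lines = header + [SECTION_END, "BODY:", body, SECTION_END, ENTRY_END]
    return "\n".join(lines) + "\n"

# Hypothetical entry; the title and timestamp echo the example in the question.
print(to_mt_entry("there is an exit here", "Entry text goes here.",
                  datetime(2002, 3, 21, 16, 11), "Jairus"))
```

One thing this sketch doesn't handle: a body line that consists of nothing but hyphens would look like a separator, so that's worth checking for before importing.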
posted by adamrice at 10:05 AM on February 6, 2004

I would definitely spend the time trying to get old Blogger or journal posts formatted correctly so you can import them properly into an MT/MySQL setup. Once it's in a database, that's the best way to preserve the flexibility of what you can do with the data. Hell, you might even consider paying an adept friend $20 to do it for you or something.
posted by Hackworth at 10:19 AM on February 6, 2004
