Converting A Plain Text List Of URLs Into An HTML List of Clickable Links
February 21, 2006 10:11 AM

What's the easiest way to convert a plain text list of URLs with descriptions into an HTML page of clickable links?

I have a long list of URLs and descriptions in a plain text file, formatted like so
### This is a page about monkeys
http://www.monkeypage.com
### This is a page about weasels
http://www.weaselpage.com
and would like to convert them into an XHTML list formatted like so
  • This is a page about monkeys
  • This is a page about weasels
What would be the best way to do a conversion like this? I'm running OS X 10.4.5, and I'm guessing a good method would involve command line stuff I'm aware of but don't really understand - regular expressions, grep, and the like. Alternatively, since the resulting page is destined for the web, I suppose another option would be a PHP or Perl script to convert the text file on the fly (I only mention PHP and Perl because I have some, very limited, experience with them, and would be lost if someone suggested, e.g., a Python solution). I also assume some of the more advanced text editors would be up to the job (though from what I've seen of them, I'm a bit intimidated by the likes of vim or emacs!)

    If this is a simple proposition, I'd be very grateful if someone could give examples of commands or possible scripts. If not, pointers to resources about manipulating text files in this way would be great too (I keep everything in plain text files, so am keen to learn about manipulating and repurposing them in general).
    posted by jack_mo to Computers & Internet (9 answers total)
     
    Response by poster: Oops, posting ate the formatting of the HTML I want to convert to. I'll try again:

    <li><a title="This is a page about monkeys" href="http://www.monkeypage.com">This is a page about monkeys</a></li>
    <li><a title="This is a page about weasels" href="http://www.weaselpage.com">This is a page about weasels</a></li>

    posted by jack_mo at 10:12 AM on February 21, 2006


    I would use a wiki. Most feature markup that lets you turn text into a link by doing the following:

    "This is a page about weasels":http://www.weaselpage.com

    I'm sure with a little batch file or Excel processing you could convert your text file into the markup above, upload it into a wiki, then export the HTML file. Notably, Instiki (site not responding for me right now) would make this pretty easy. Stikipad is also a great hosted wiki app that will do this for you.
    posted by ajr at 10:30 AM on February 21, 2006
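
A rough sketch of the "little batch file" step, for anyone who wants to try the wiki route: it turns each description/URL pair into the "description":url markup shown above. The function name to_wiki_links is made up for illustration, and the sketch assumes every description line starts with "### " and every URL line starts with "http".

```shell
# Hypothetical helper: reads "### description" / URL pairs on stdin and
# writes one "description":url line per pair.
to_wiki_links() {
  awk '
    /^### / { desc = substr($0, 5); next }       # remember the description
    /^http/ { printf "\"%s\":%s\n", desc, $0 }   # emit the wiki link line
  '
}

# Usage: to_wiki_links < input.txt > links.txt
```

Lines that match neither pattern (blank lines, stray notes) are silently skipped, which may or may not be what you want.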


    I've done this in Excel using simple formulas and find and replace, then putting the link together using concatenate (&). It took about 5 minutes to set up.
    posted by maxpower at 10:32 AM on February 21, 2006


    Best answer: A BBEdit find and replace will do this no problem, if the input is as regular as it seems. I think the syntax would be something along the lines of:


    Find: ### (.*)\r(http.*)
    Replace with: <li><a title="\1" href="\2">\1</a></li>


    Basically, The First Thing You Put In Parentheses can be referenced in the replace as \1 and The Second Thing You Put In Parentheses is \2, etc etc.

    \r is the code for a hard return. \t is the code for a tab, in case you needed that.

    BBEdit's manual has a great section on find and replace and how to use regular expressions that has great real world examples. It's written for semi-nerdy ordinary people who have no experience with this kind of stuff.
    posted by bcwinters at 10:32 AM on February 21, 2006
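
For anyone without BBEdit, here is a minimal sketch of the same parenthesised-group idea as a sed filter in Terminal. The function name pairs_to_li is hypothetical, and the sketch assumes the input is exactly as regular as in the question (a "### " line immediately followed by its URL line). sed writes the groups as \( \) rather than ( ), but \1 and \2 in the replacement mean the same thing as in the BBEdit dialog.

```shell
# Hypothetical helper: the find-and-replace above as a sed filter.
# N appends the next input line to the current one, so a single pattern can
# span the "### description" line and the URL line (the job \r does in BBEdit).
pairs_to_li() {
  sed -n '/^### /{
    N
    s|^### \(.*\)\n\(http.*\)$|<li><a title="\1" href="\2">\1</a></li>|p
  }'
}

# Usage: pairs_to_li < input.txt > output.txt
```

Pairs that do not fit the pattern (for example a description with no URL under it) produce no output rather than an error.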


    Thread from two weeks ago on exactly the same thing.

    You can use Textwrangler in place of BBEdit. It's a free download.
    posted by cillit bang at 10:46 AM on February 21, 2006


    Best answer: perl -e 'local $/; $_ = <>; print "<li><a title=\"$1\" href=\"$2\">$1</a></li>\n" while(m,^###\s+(.*?)\n(.*?)$,mg);' < input.txt > output.txt
    posted by Rhomboid at 11:24 AM on February 21, 2006


    Response by poster: Damn, no idea how I managed to miss that earlier thread, which would have given me enough to go on.

    Thanks everyone for the examples and tips (I just used Rhomboid's, and it worked a treat, but will be investigating TextWrangler further too).
    posted by jack_mo at 11:50 AM on February 21, 2006


    Best answer: This is what awk is for:
    awk '/^###/ {sub(/^### /, ""); text=$0;} 
    /^http/ {print "<li><a href=\""$0"\">" text "</a></li>"}' input.txt > output.txt
    (ignore the line break)
    posted by scruss at 4:17 PM on February 21, 2006
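
The awk answer above leaves out the title attribute the question asked for. A possible variation, offered as a sketch rather than a finished tool: it adds the title, escapes the characters that would otherwise break the HTML (an & in a URL, for instance), and wraps the items in a ul element. The names urls_to_ul and esc are invented for this example, and the same "### " / "http" line format is assumed.

```shell
# Hypothetical helper: awk conversion with a title attribute, basic HTML
# escaping, and a surrounding <ul>.
urls_to_ul() {
  awk '
    # Escape the characters that are special in HTML text and attributes.
    function esc(s) {
      gsub(/&/, "\\&amp;", s)
      gsub(/</, "\\&lt;", s)
      gsub(/>/, "\\&gt;", s)
      gsub(/"/, "\\&quot;", s)
      return s
    }
    BEGIN   { print "<ul>" }
    /^### / { text = esc(substr($0, 5)); next }
    /^http/ { printf "<li><a title=\"%s\" href=\"%s\">%s</a></li>\n", text, esc($0), text }
    END     { print "</ul>" }
  '
}

# Usage: urls_to_ul < input.txt > output.html
```

The ampersand is escaped first so that the entities added by the later substitutions are not escaped a second time.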


    You could use MS Word (or other word processors I'm assuming).
    The only downside is you have to manually hit the ENTER key after each link.

    You could also use Excel... Make a list of the text you have and a list of the text you want. Using the & operator you can merge the individual cells. You may have to fool around with it a bit... but it seems like it may be quicker and easier than BBEdit or equivalents, if you're not familiar with them.
    posted by SupaDave at 6:31 AM on February 22, 2006

