Help me with a tricky (for me) Python problem
August 23, 2009 1:00 AM

I am trying to make a python script that takes a CSV file and turns it into an HTML table. I need it to combine a column containing a website name and another column containing a URL into an HTML link.

I'm working on a python script that takes a CSV file and turns it into an HTML table.

The CSV has, among numerous other columns, a column containing a website name and another column containing a URL. I would like to combine the two into an HTML link.

Here is what I have so far that converts the rows & columns into tables:

for row in reader:
    for column in row:
        html.write('<td>' + column + '</td>')

I know the part that makes the HTML link would need to say:
html.write('<td><a href="' + URLcolumn + '">' + sitecolumn + '</a></td>')

But I am stumped on how to finish implementing this. In particular, I can't figure out how to "refer" to the URLcolumn and sitecolumn.
posted by Mjolnir to Computers & Internet (7 answers total) 1 user marked this as a favorite
You don't need to use the second for loop. Either refer to the items in row by index (assuming, in this case, that row[0] contains the URL and row[1] contains the site name):

for row in reader:
    html.write('<td><a href="' + row[0] + '">' + row[1] + '</a></td>')

Or assign them to names:

for row in reader:
    # Unpacking like this works only if each row has exactly two fields.
    urlcolumn, sitecolumn = row
    html.write('<td><a href="' + urlcolumn + '">' + sitecolumn + '</a></td>')
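
Since the CSV in the question has many columns, the two-name unpacking above would fail on it. A minimal sketch of one way around that is to pick the two fields by position before assigning them to names. The column positions (0 and 1) and the sample data are assumptions made up for illustration, and the sketch uses Python 3 syntax with an in-memory file standing in for the real CSV.

```python
import csv
import io

# Hypothetical sample data; the real file's column order is unknown.
sample = io.StringIO(
    "http://example.com,Example,extra\n"
    "http://example.org,Another,more\n"
)

links = []
for row in csv.reader(sample):
    # Pick only the two fields we need; "urlcolumn, sitecolumn = row" would
    # raise ValueError here because each row has three fields.
    urlcolumn, sitecolumn = row[0], row[1]
    links.append('<td><a href="' + urlcolumn + '">' + sitecolumn + '</a></td>')

print("\n".join(links))
```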
posted by joshu at 1:09 AM on August 23, 2009

It sounds like you want to replace the "for column in row: html.write(...)" inner loop with the more complex html output line you describe? And 'row' is some sequence type already (a list or tuple)?

In which case, you can either refer to the elements in row by their index, e.g. row[0], row[1], etc., or you can assign them to variables and then use those variables.

for row in reader:
    html.write('<td><a href="' + row[2] + '">' + row[5] + '</a></td>')

(assuming that columns 2 and 5 have the strings you want).
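
If the other columns should still appear in the table, one possible approach is to keep a loop over the columns but treat the two special positions differently. The following is only a sketch under assumptions: the positions 2 and 5 are taken from the example above, while the helper name row_to_html, the constants, and the sample row are hypothetical. It uses Python 3 syntax.

```python
import csv
import io

URL_INDEX = 2   # assumed position of the URL column
NAME_INDEX = 5  # assumed position of the website-name column

def row_to_html(row):
    """Render one CSV row as a <tr>, merging the name and URL cells into a link."""
    cells = []
    for index, column in enumerate(row):
        if index == URL_INDEX:
            continue  # the URL is emitted as part of the link cell instead
        if index == NAME_INDEX:
            cells.append('<td><a href="' + row[URL_INDEX] + '">' + column + '</a></td>')
        else:
            cells.append('<td>' + column + '</td>')
    return '<tr>' + ''.join(cells) + '</tr>'

# Hypothetical six-column sample standing in for the real CSV file.
sample = io.StringIO("a,b,http://example.com,c,d,Example\n")
rows = [row_to_html(row) for row in csv.reader(sample)]
print('<table>\n' + '\n'.join(rows) + '\n</table>')
```

The link replaces the plain name cell, and the raw URL cell is skipped so it does not show up twice.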
posted by hattifattener at 1:15 AM on August 23, 2009

What was said above about using one loop. But also...

If you are using Python 2.5 or newer, you could use the DictReader class. The DictReader turns each row into a dictionary so you can refer to fields by name, which is much nicer than trying to remember array index numbers.

My test file contains this:


Here's my Python session:

>>> import csv
>>> f = open('test.csv')
>>> dr = csv.DictReader(f)
>>> for row in dr:
...     print('the baz value is %s and the foo value is %s' % (row['baz'], row['foo']))
...
the baz value is 3 and the foo value is 1
the baz value is 6 and the foo value is 4

If your file doesn't have a header row, you can pass one in:

>>> dr = csv.DictReader(f, ['good', 'bad', 'ugly'])
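
As a rough illustration of passing field names in, here is a small sketch in Python 3 syntax. The data is made up, and an in-memory file stands in for a real headerless CSV. When field names are supplied this way, the first line of the file is treated as data rather than as a header.

```python
import csv
import io

# Hypothetical headerless data; the field names below are supplied by hand.
f = io.StringIO("1,2,3\n4,5,6\n")
dr = csv.DictReader(f, fieldnames=['good', 'bad', 'ugly'])

# Each row is a dictionary keyed by the supplied names; values are strings.
values = [(row['good'], row['ugly']) for row in dr]
print(values)
```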

I think you'll find that format strings are more concise and easier to read than gluing strings together with +, since there are fewer quote marks to keep track of (they're a bit quicker too).


html.write('<td><a href="%s">%s</a></td>' % (row['url'], row['link']))
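
Putting DictReader and a format string together might look like the sketch below, written in Python 3 syntax. The field names 'url' and 'link' follow the line above; the sample data is made up, and the output goes to an in-memory buffer named out (an assumption, standing in for the open file called html in this thread). The escape() call is an addition not discussed above: it guards against field values containing characters such as & or < that would otherwise break the markup.

```python
import csv
import io
from html import escape  # standard library, Python 3.2+

# Hypothetical CSV with a header row; 'url' and 'link' match the names used above.
sample = io.StringIO(
    "link,url\n"
    "Q&A site,http://example.com/?a=1&b=2\n"
)

out = io.StringIO()  # stands in for the open output file called html in the thread
for row in csv.DictReader(sample):
    # escape() converts &, <, > and quotes so field values cannot break the markup.
    out.write('<td><a href="%s">%s</a></td>\n' % (escape(row['url']), escape(row['link'])))

print(out.getvalue())
```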
posted by i_am_joe's_spleen at 2:41 AM on August 23, 2009 [3 favorites]

If you want to be funky with list comprehensions, you could do something like:

html.write('\n'.join(['<tr><td&gt<a href="%s">%s&lt/a></td&gt</td></tr>' % (row['url'], row['link']) for row in dr]))

which gets rid of your loop and is pretty compact. It's a matter of taste.
posted by i_am_joe's_spleen at 2:46 AM on August 23, 2009

Let's try that again.

html.write('\n'.join(['<tr><td><a href="%s">%s</a></td></tr>' % (row['url'], row['link']) for row in dr]))

Sorry, escaping is hard on Askme.
posted by i_am_joe's_spleen at 2:50 AM on August 23, 2009

I love metafilter! Thanks everyone!
posted by Mjolnir at 3:03 AM on August 23, 2009

I think the big problem is that you don't yet know that sequences like lists are indexable. You've got iteration down, but when you need "random access" instead of "sequential access" on an object that supports it (like "list" or "tuple" or even "str"), use square brackets to indicate which position you want the value of, with an integer (starting with zero!) to refer to it. (It's kind of funny and bizarre. You have the exact opposite problem from people who come to Python from C-like languages; they index but never iterate.)

>>> l = ['a', 's', 'd', 'f', 'g', 'h']
>>> l[0]
'a'
>>> l[5]
'h'
>>> # You can also slice it into pieces of lists!
>>> l[2:4]
['d', 'f']
>>> l[:3]
['a', 's', 'd']
>>> l[3:]
['f', 'g', 'h']
>>> # Negative numbers count backward from the end.
... # Perfect when you don't know how long it is.
>>> l[-2]
'g'
>>> l[1:-1]
['s', 'd', 'f', 'g']
>>> # You can programmatically access the indexes.
>>> i = 3
>>> l[i]
'f'

Oh, you know the Python interpreter can be interactive, yeah? Just run it in a terminal, and type what you want to try.

Overall, I suggest you spend 30 minutes in the tutorial and try things out.
posted by cmiller at 4:45 AM on August 23, 2009 [1 favorite]
