How do I compare these lists?
September 28, 2009 6:30 PM
I have two lists of numbers. I would like assistance creating some kind of program or method to output a list of all numbers that are present in one of these lists but not the other.
This is not a homework question. No, really. I promise. So I have two very long lists of numbers. Currently they are sitting on my computer as Excel spreadsheets, but I can export them to whatever format is necessary. Both lists are made up of integers, ascending, though never actually consecutive numbers, if it matters.
List A has approximately 4000 unique entries, and is the list I'm starting from.
List B is a list of about 8000 entries I made up by squishing together a bunch of smaller lists I had made from List A. There are a lot of duplicate entries in List B, which I haven't removed (partially because I'm not really sure how to). So there are many numbers that are present multiple times in List B.
I am absolutely positive that there are numbers in List A that are nowhere in List B, and I want to know what they are. It would be extra awesome if they were printed out in such a way that I or it could easily add some HTML links surrounding them, blahblah.com/printnumberhere.html -- the numbers are actually parts of URLs I want to access.
I was thinking Perl would be a great way to solve this task, and I googled and got several solutions involving grep and/or comparing arrays. My Perl skills, alas, have atrophied to the point where I can no longer tell how to make these given solutions mesh with the really basic parts of coding that I have forgotten. For example, I wouldn't know how to get this data into a usable form. Comma-separated, perhaps? And then I somehow stuff it in an array? I don't even remember how to read from a file. My Perl skills fail me. And with the amount of data I have, I'd hate to do it wrong and, say, end up with an incomplete list of the unique numbers.
Or if there's an easier solution that totally hasn't occurred to me (some kind of Excel macro? I have Office '04), that would be excellent too. I'm running OS X 10.3.9, which will probably limit my nifty programming language options somewhat. Thanks for any help you can provide!
posted by sineala to computers & internet (16 comments total)
listc = []
for a in lista:
    if a not in listb:  # keep only the numbers missing from List B
        listc.append(a)
PS in no way is this properly formatted.
posted by Rubbstone at 6:41 PM on September 28
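For what it's worth, here is a rough sketch of the whole job in Python, in case a fuller example helps. It assumes each list has been exported from Excel as a plain text file with one number per line, and that a reasonably recent Python is installed, which may not be true on OS X 10.3.9. The function names and the "http://" prefix on the link are made up for illustration; blahblah.com is just the placeholder from the question. It looks for numbers in List A that never appear in List B, and the duplicates in List B do no harm because they collapse when the list is turned into a set.

```python
def read_numbers(path):
    """Read one integer per line from a text file, skipping blank lines."""
    with open(path) as handle:
        return [int(line) for line in handle if line.strip()]


def missing_from(list_a, list_b):
    """Return the numbers in list_a that never appear in list_b, in order."""
    seen = set(list_b)  # a set drops List B's duplicates and is fast to search
    return [n for n in list_a if n not in seen]


def as_link(number, base="http://blahblah.com/"):
    """Wrap a number in an HTML link (base URL is a placeholder assumption)."""
    url = "%s%d.html" % (base, number)
    return '<a href="%s">%d</a>' % (url, number)


if __name__ == "__main__":
    # Tiny made-up lists standing in for the real data.
    demo_a = [3, 8, 15, 21, 40]
    demo_b = [3, 3, 15, 40, 40, 15]
    for number in missing_from(demo_a, demo_b):
        print(as_link(number))
```

With real data you would swap the demo lists for something like read_numbers("list_a.txt") and read_numbers("list_b.txt"), where those file names are whatever you chose when exporting.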