how do i do the switcheroo on a text list?
February 17, 2009 4:45 PM   Subscribe

I did a dumb thing. I put together a book list in this format: first-name last-name (tab) book-title Note that there *isn't* a comma between the author's first and last name. Is there any "computer magic" I can perform to change it to this format: last-name first-name (tab) book-title The list is just in text format, for what it's worth. I'm on a Mac but also have access to a PC. I'd rather not have to cut-n-paste the last names, it's a pretty long list.
posted by edjusted to Computers & Internet (16 answers total) 4 users marked this as a favorite
Best answer: Do you have Excel/Word? If you do, this is simple.

1. copy data into Excel where first-name and last-name share a cell
2. select that column
3. under the Data menu, select "text to columns"
4. select "delimited" and click next
5. click the box next to the "space" option
6. click next
7. switch the first-name and last name columns

This should put the first name and last name into separate columns. From there you can export into word and then convert the table to text with tab delimiters. There may be simpler ways, but this definitely works.
posted by jessamyn at 4:52 PM on February 17, 2009 [2 favorites]

You could also you something like notepad and do a fine for the space and replace it with a comma.
posted by bleucube at 4:55 PM on February 17, 2009

I think jessamyn's solution almost works, but you will end up with a (tab) between the last and first name unless you first merge back those two columns (with a space delimiter).
posted by EatenByAGrue at 4:55 PM on February 17, 2009

Wow. Biffed that one. I shouldn't have been watching Diego.

1. Open file in notepad
2. Do a find for a space (highlight the space prior to going into the find dialog box)
3. Replace with a comma

sorry about the above comment. Shoot me now.
posted by bleucube at 4:56 PM on February 17, 2009

Using Notepad and Excel, this is what I do:

If the "firstname lastname" combo is the only instance in which you've used the space bar, you can do a Replace in the text file to replace a single space with a single comma. You'll end up with "firstname,lastname."

Then save the text file with a .csv extention.

When you open the .csv file in Excel, the words on either side of the comma will be in separate columns.
posted by peggynature at 4:56 PM on February 17, 2009

Be careful if there are last names with spaces in them, like Betty Smith Jones. Then you'll end up with three columns, using the above methods.
posted by desjardins at 4:59 PM on February 17, 2009

Response by poster: Oh geez, that's one of those answers that makes me feel like an idiot. Haha. Thanks jessamyn!

Also for those of you following along, I had some problems at first because Excel kept replacing the *book titles* column after it was done converting. I worked around this by making the "destination" something further along. That is, column A contained both the first and last name, and column B contained the book title. After doing jessamyn's conversion, column A became the first name, column B became the last name, and the book titles got wiped out. I forced Excel to make the "destination" of the contents column E instead, and just copied and pasted everything back in the order I wanted. Thanks for that ultra quick reply!
posted by edjusted at 5:00 PM on February 17, 2009

Response by poster: Thanks for the other suggestions. Unfortunately, since there are spaces in the book titles, I can't just do a search and replace. And there *are* a few names that jessamyn's technique messes up (e.g. John Doe Jr. or J.J. Smith) but I can just manually fix those.
posted by edjusted at 5:01 PM on February 17, 2009

I'm going to assume that this is a simple text file, or that you can paste it into a simple text file.

Open up the terminal and cd to the directory where you keep the file.

Then do the following:

sed "s/^\(\w*\) \(\w*\)/\2 \1/" YOUR_FILE_NAME > A_NEW_FILE_NAME

The program is sed, and the incantation is a regular expression. One of the more useful bits of computer magic we've worked out.
posted by Netzapper at 5:12 PM on February 17, 2009 [1 favorite]

Also, if you have some graphical program that supports regular expressions (with back references), you can use the regex out of that command. You'll need to take out the backslashes in front of the open and close parentheses though.

I also realized that I didn't take into account people with spaces in their names. The new regex for that would be:
"s/^\(\w*\) \([[:alnum:] ]*\)\t/\2 \1\t/"

Note that this only handles last names with spaces... first names with spaces cannot be distinguished from a first name with no spaces plus a last name with spaces.
posted by Netzapper at 5:21 PM on February 17, 2009

Response by poster: Thanks, Netzapper! I've always known regex is powerful as heck but it's latin to me.

I gave your commands a shot anyway, and it didn't seem to work. Any ideas?

Here's what's in my test file:

posted by edjusted at 7:18 PM on February 17, 2009

Did it do nothing? Or, did it give you some sort of error?

If you're willing to wait, I'll haul out the Mac and try it there. It's probably some sort of shell-escape issue. But, I have some (paying!) work I need to get done first. It'll probably be an hour or two.
posted by Netzapper at 7:41 PM on February 17, 2009

Netzapper's suggestion is likely running afoul of the version of the sed program that is installed on OS X which doesn't deal with "\t" properly. An alternate version of his regex would be:
sed -E -e 's/^([^ ]*) ([^    ]*)    /\2, \1    /' YOUR_FILE_NAME > NEW_FILE_NAME
Unfortunately, most browser/terminal combos don't paste the tab characters properly (In fact, above, I've replaced them with four spaces each for illustration). So, start by pasting the following:

sed -E -e 's/^([^ ]*) ([^TAB]*)TAB/\2, \1TAB/' YOUR_FILE_NAME > NEW_FILE_NAME

and then move the cursor (arrow keys) to each place where it says "TAB" and delete the letters T, A, and B (using backspace). Then, type the tab by typing Ctrl-V, Ctrl-I.

The above version is true to Netzapper's intent and assumes that multi-part names are always of the form "FIRST LAST LAST". If "FIRST FIRST LAST" or "FIRST MIDDLE LAST" occur more frequently, then you may try:

sed -E -e 's/^([^TAB]*) ([^ ]*)TAB/\2, \1TAB/' YOUR_FILE_NAME > NEW_FILE_NAME

posted by Mr Stickfigure at 7:57 PM on February 17, 2009

Yeah, I just fought with the OSX version of sed for about twenty minutes, and couldn't get it to do *anything* for me. It's acting as if atoms aren't sufficiently greedy--e.g. "^([:alnum:]*)" was only matching the start of line most of the time (taking the zero option for *), matching part of the word the rest of the time, but never the whole sequence from line start to the end of the first token.

I don't know if OSX sed's broken, or not standards compliant, or if I'm just stupid. But, I can't help you... regex is frustrating enough when I'm working in an environment where I'm comfortable.

This is what comes of having 50 different goddamn definitions and implementaions of regex.

I'm very sorry for my failing. But Mr. Stickfigure looks like he came to the rescue.
posted by Netzapper at 8:40 PM on February 17, 2009

Response by poster: Mr Stickfigure: that worked perfectly.

Netzapper: thanks for going through the trouble!
posted by edjusted at 10:54 PM on February 17, 2009

Or rather than going and typing in the tab characters manually, you can use bash's $'ANSI C like' quotes, which expand escape sequences.
sed -E -e $'s/^([^ ]*) ([^\t]*)/\\2 \\1/'
posted by JiBB at 10:08 AM on February 18, 2009

« Older Where's the best place in China to see the...   |   Replacement for American Crew Classic Wax Newer »
This thread is closed to new comments.