Is it possible to do a global change-style in Excel?
November 24, 2009 2:41 PM   Subscribe

Is it possible to automate a citation style change on an entire column in Excel, or do I need to do this by hand?

I have a list of approximately 600 articles in an Excel document. There is a column for each field in the cite (author, title, journal, etc). I need to change the citation style. Currently, most cites are Smith, A. B., Jones, C. D., and a few are Smith Ann B, Jones Cathy D. I need to covert them to Smith AB, Jones CD. Most titles (but not all) are in the format Where the wild things are: a great book, but they need to be formatted as Where the Wild Things are: A great book. (Think PubMed style, with proper nouns and the word after a colon capitalized.) Is it possible to do some sort of global change citation style, or do I need to do all this by hand? Is this the kind of thing you can do with a Macro? I haven't ever set one up before. Unfortunately, importing into EndNote or a similar program that will automatically change format style isn't an option.

I'm using Excel 2001 on a Mac. I'm not a programmer. I'm looking for something that will do this relatively quickly, because I could do it by hand in a (very boring) few hours--if the solution involves hours of learning it's probably not that practical.
posted by min to Computers & Internet (11 answers total)
Is there any reason it has to be in Excel?
posted by Mr. Gunn at 3:19 PM on November 24, 2009

Unfortunately, yes. It's a project I'm working on for someone else and they've requested it be in Excel. I could probably take it out temporarily, but I would need to be able to put it back in exactly in order, since there are other things happening in the document that need to stay matched up with their particular row. Otherwise I'd just import it into EndNote.
posted by min at 4:02 PM on November 24, 2009

put it back in exactly in order

You should be able to set up a custom list before you take the data out then use that to sort the new columns back into the exact same order after fixing. It might be a bit tricky in that you'd need to the list to still match what's in the cells (so having a list with full names then changing the data to just initials might screw up the sorting function a little) but if you're just changing capitalisation in the titles and there are no replicates then that field would be a good candidate. Then you can throw it all through Endnote which actually gets what a citation is.

Otherwise Excel doesn't really know anything about citations so it won't do anything intelligently. You could use search and replace to do quite a lot of the fixing, e.g. remove all the fullstops between initials, but it would be a case of working out groups of things to fix and trying to do as many at once as possible. I'm not sure how good the change case thing is in Excel (I only have a newer version available on this computer) but that should at least help get the titles into sentance case, and you could just transfer those columns into Word and back if the change case thing works better there. There'd still be a fair bit of clean up by hand but it might be faster.
posted by shelleycat at 4:44 PM on November 24, 2009

To put them back in the same order, create an extra column that is just a number sequence (1,2,3,4,5, etc.). Then whenever you need to put them back in sequence you can simply sort on that column.

When you get to the end you can just delete that column.

It's easy to make a column with the sequence 1,2,3,4, ... (or other simple sequences). Just fill in the first couple of numbers, then select both of them with the mouse & use Excel's fill handle to fill in the rest.
posted by flug at 6:07 PM on November 24, 2009

Excel has functions upper() lower() and proper() that convert things to upper/lower case.

None of them do exactly what you want. For instance, if you have your "wild things" text in a1 and put the function "=proper(a1)" in cell b1, then the result is this:

Where The Wild Things Are: A Great Book

Once you have converted a column using that formula, change it back to a plain text column by selecting the whole column, copying it (ctrl-c) and then "Paste as values". (In Excel 97 that's under the Home tab & "paste".)

That isn't what you want--but still it might be useful or easier to start off with that than what you have now.

Then you could search/replace for most common words (The to the, And to and, Are to are, To to to, etc.).

There would still be some editing left but a lot less than changing every single word.
posted by flug at 6:22 PM on November 24, 2009

Also first thing I would do is split the first/last names into separate columns.

It is VERY easy to combine columns if you need to, but VERY hard to split them.

Also I would go ahead & keep different versions of the names etc. in different columns.

So in the end you might have

Smith, C. J. Smith C. J. Charles J. CJ

Again it's easy to delete the columns you don't want at the end.

And it's easy to combine columns 2, 3, 4, and/or 5 to create "C. J. Smith", "CJ Smith", "Smith, Charles J.", "Smith CJ", "Smith, CJ" or whatever precisely you need.

BTW, how you combine columns is using a formula like this:

=b2 & ", " & b5

This would combine whatever is in the 2nd column of row b ("Smith") with whatever is in the 5th column of row b ("CJ) with a comma and a space in between, like "Smith, CJ".

When you are doing combining a column with a formula like that, you will want to select the entire column, copy (ctrl-c) and then paste as values.

More about paste special in Excel here.
posted by flug at 6:35 PM on November 24, 2009

Sorry, what I meant here is:

So in the end you might have five different columns (I'll indicate the columns with the "|" symbol):

Smith, C. J. | Smith | C. J. | Charles J. | CJ

posted by flug at 6:38 PM on November 24, 2009

Also you can convert names similar "C. J." to names similar to "CJ" by simply doing a couple of search/replace operations.

How I would do it:

1. First separate first names out into a separate column

2. Search/replace on that column, searching for ". " and replacing with nothing

3. Search/replace on that column, searching for "." and replacing with nothing.
posted by flug at 6:41 PM on November 24, 2009

Also a lot of the advice above is assuming you have only one name in the author field. If you have academic citations with several authors for most articles, it becomes quite a lot more difficult to parse.

You're more likely than not stuck with doing a few simple search & replaces (choosing some strategically that will do more good than harm) and then editing by hand from there.

For instance, if you use search/replace to eliminate all periods, commas, and periods-followed-by-spaces in the author field that wouldn't be perfect (you'll have to go in & hand-edit to finish the job) but still might save you a lot of typing.
posted by flug at 6:46 PM on November 24, 2009

Having an extra column with just 1,2,3 for sorting does work rather nicely. The problem is keeping that information with the correct record when you move it to endnote and back again. If you can have it as a userdefined field when importing into endnote then you'd be set (you just need to include it in the export filters). Also if you can get endnote to export in the same order you imported it then it would become moot, order would be preserved. But if you can't do either of those things then setting up a custom list from one of the information fields itself should get things back into the exact same order regardless of what that order was. So there are options there. Endnote would whip your citations into shape pretty fast.

I really like the idea of splitting out the first and last names too though. This gives you more flexibility to mess around with the first names making them into initials or whatever without interference from the last name. If you end up with a last name with a le or von at the start then things can get tricky, much better to have them separate.
posted by shelleycat at 6:53 PM on November 24, 2009

Wow, thank you so much for these answers, Flug and Shellycat. Apparently you can do a lot more in Excel than I realized!

Unfortunately, these are your typical science articles with 3-7 authors. So while I really like the idea of splitting them out first/last name it's probably more time consuming than just doing it by hand in the first place! I think the search/replace function should work pretty well for fixing the names column so I'll try that to eliminate the periods and extra commas, at least.

I'll also try sending the titles through EndNote and formatting them there. It sounds like I can preserve the list in order and that was my main concern about doing it that way.

Thanks again, these ideas should save me a lot of time.
posted by min at 7:34 PM on November 24, 2009

« Older eBay for discrete electronic components?   |   How to Initiate "Operation: Disambiguate" Newer »
This thread is closed to new comments.