How to sorting column with 10,000 rows?
August 13, 2016 8:46 AM   Subscribe

I have a spreadsheet column with 10,000 rows, and I need to sort it. Is there a way to do this automatically, where certain rows with specific attributes are sent to a new column?

Basically, it will take a while to sort manually, then copy the rows I want, then paste into new columns. If I can use a formula to do this it would be great.

There are 3 kinds of rows:
1) Rows with no information in the cells (this is an email list, the column has recipient names; some recipients haven't left a name)

2) Rows with just one word (a first name)

3) Rows with two words (first name last name)

The end result is to isolate #3 so I can separate the text (we want to send personalized email that uses first name).

I of course know how to use the basic sort function, etc in Excel to get these three groups of rows organized.

Since it's a relatively large list of 10,000 and Excel is my tool, I was hoping to avoid manually copying and pasting the three kinds of rows. Too big of a spreadsheet, too much chance for error.
posted by My Dad to Computers & Internet (9 answers total)
 
Best answer: This sounds like the sort of thing that the text to columns feature is good for. Would it be enough to split the range on spaces and sort the result to get rid of the no-name cases?
posted by gds at 9:04 AM on August 13, 2016 [3 favorites]


I'm not an Excel wizard, but there a couple things come to mind, though they may not be as automatic or programmed as you would like.

A simplifying step would be to add an additional column that has a logical formula indicating which, if any, of the groups you would like to copy are in. Using that column, could use Autofilter (just called Filter in the new versions?) to make just the rows of one group visible, then select/cut/paste that group.

Or

You could fill a new tab with formulas that would copy the values from the first tab if, and only if, they were in a particular group. On the new tab, you can remove the blanks either by sorting, or perhaps using the "remove duplicates" feature. (This may require a step to change from formulas to values.)
posted by SemiSalt at 9:07 AM on August 13, 2016 [1 favorite]


Clearly you have TWO columns here -- one with the emails and one with the name(s) or a blank. That's quite important.

Step one -- Save as a new file so your old data is preserved.

Step two -- select the name column and do a text to column with spaces (as GDS says)

Step three sort --
the whole table by the fourth column, which will have content only for the three name people (your Mary Anne's and your Mary MaidenName Married Names). You will need to manually create the appropriate name in the fifth column.

Step four -- Copy all of the other column 2 content (first names) into column 5.

Delete columns 2-4.

Now you have a list with emails and first names.

If you have time, for the no-name records you can copy them to column 2 and you can do another text to column using . and _ as separators on the no-name email addresses and that will isolate in column 1 maybe 15%-25% more first names.
posted by MattD at 9:51 AM on August 13, 2016 [1 favorite]


Response by poster: Thanks all! I will give these suggestions a try, mostly likely on Monday AM, and shall report back!
posted by My Dad at 10:06 AM on August 13, 2016


For a quick and dirty pass, assuming data in column A;

The formula

=LEFT(A1,FIND(" ",A1)-1)

in column B will return everything up to the first occurence of a space in the data.

But if you want this to be as accurate as possible, MattD's process with text to columns would be a better bet.
posted by protorp at 11:03 AM on August 13, 2016 [1 favorite]


Just one note (and it's minor)- you're not going to be able to automatically differentiate between someone who leaves their two word first name "Mary Ann" and someone named "Mary Smith". You'll most likely be calling both of them Mary.

The only solution is either:
- looking at all of the two word names or
- trying to identify names in the last name column that are typical first names and then only looking at those two word names. it could potentially be done using another column and a vlookup on a list of common first names stored in a separate sheet.
posted by noloveforned at 11:12 AM on August 13, 2016 [1 favorite]


If there are irregular spaces in the cells to get rid of before doing a text to columns, the trim function is your friend.
posted by Pryde at 11:23 AM on August 13, 2016


Wait - your data is in rows?
ie
person1 name
person1 name-surname
person1 email address
person2 name
person2 name-surname
person2 email address
person3 name
person3 name-surname
person3 email address
etc
etc ??
but all in one column?

I'm confused. If this is the layout, some of the other suggested answers aren't going to help.
posted by Xhris at 1:51 PM on August 13, 2016


Response by poster: Thanks, all... "Text to columns" did the trick, where text (firstname lastname) was delimited by a space in the column. Very simple.
posted by My Dad at 4:18 PM on September 14, 2016


« Older Conan! What is good in Shreveport?   |   Fantastic movie dance scenes Newer »
This thread is closed to new comments.