Excel 2013 - Using data from two columns to identify a row in a third
March 5, 2015 12:05 PM   Subscribe

I have data on one sheet of an Excel document with three columns. I want to extract this for use on another sheet as follows: If a row in Column A contains X (text string) and a row in column B contains Y (also a text string) then what does the row in column C (always a value) which matches these two criteria contain? With the data I have, this will always return a single result.

So the three columns are as follows: the product we've been contacted about, the reason for contact, and the number of contacts that match both those criteria.


It looks a little like the following:

Thing1 Reason1 32
Thing1 Reason2 38
Thing1 Reason3 14
Thing1 Reason4 8

Thing2 Reason1 6
Thing2 Reason2 18
Thing2 Reason4 22

Thing3 Reason1 15
Thing3 Reason3 26
Thing3 Reason5 4

I'd like to have a second sheet which looks like this:

Thing Reason Contacts

Thing 1 Reason 1 32
Thing 1 Reason 2 38
Thing 1 Reason 3 14
Thing 1 Reason 4 8
Thing 1 Reason 5 0

Thing 2 Reason 1 6
Thing 2 Reason 2 18
Thing 2 Reason 3 0
Thing 2 Reason 4 22
Thing 2 Reason 5 0

Thing 3 Reason 1 15
Thing 3 Reason 2 0
Thing 3 Reason 3 26
Thing 3 Reason 4 0
Thing 3 Reason 5 4


The data that I have to work from is exported from a call management platform as a CSV file, though is also available to export as an XML file - the use of which is entirely beyond my capabilities.


One of the issues I'm having is that the data I have only contains actual contacts. So, in the above example, we didn't get any contact about Thing 2 for Reason 3, but I need the second sheet to contain this information.

For each Thing, I can easily use VLOOKUP to get the data from column B (reason) to spit out the relevant number in column C (contacts) but everything that I've tried to use for "If A=Thing 1 and B=Reason 2 Then C=Contacts" is giving me errors. Given that this data is for 6 different Things, with 25 Reasons, I'm not inclined to input the data manually.

My Excel knowledge is spotty to say the least but despite lots of searching for a solution to this problem - which from the point of view of a simple logic problem isn't a difficult one - I'm coming up short. I've tried using combinations of index and match formulae, as well as lookup, but I can't get anything to work.

Any help is much appreciated.
posted by Len to Computers & Internet (13 answers total) 4 users marked this as a favorite
 
One hacky solution is to add an additional column where you concatenate the Thing and Reason values... then you can VLOOKUP against that easily enough.
posted by Perplexity at 12:08 PM on March 5, 2015


If I understand your question correctly you should be able to use a SUMIFS formula. I'll assume your data from the CSV is in Sheet1 in columns A, B and C, and you set up your Sheet2 with all possible outcomes for your Things and Reasons in columns A and B. Sheet2 column C will be a formula for your Contacts. In Sheet2 cell C2 the formula should be:

=SUMIFS(Sheet1!C:C,Sheet1!A:A,Sheet2!A2,Sheet1!B:B,Sheet2!B2)

You can then fill down that formula in your table of all possible combinations. If one of you combinations, like Thing2 Reason3, doesn't exist in Sheet1, it will return 0. If your CSV file has multiple lines for the same combination, the number of contacts will be summed.
posted by noneuclidean at 12:20 PM on March 5, 2015 [4 favorites]


So, do you definitely need it to look like that, or would it work to have Things as column labels and Reasons as row labels because - as usual - a pivot table is the nicest solution here. Also, personally, I think it would be easier to read that way instead of entirely tabular (so you could go down and see all the contacts about ThingA, or go across and see all the contacts about ReasonA, and it would be good.

That said, this should be totally doable using sumifs, i think (even if you lay it out as above -- this general technique would still work):

Assuming the first cell you're trying to fill is cell C2 in Sheet2, and your original data is in Sheet 1:

=sumifs(Sheet1!C:C,Sheet1!A:A,A2,Sheet1!B:B,B2)
posted by brainmouse at 12:20 PM on March 5, 2015 [1 favorite]


One hacky solution is to add an additional column where you concatenate the Thing and Reason values... then you can VLOOKUP against that easily enough

I did actually try that, but for reasons beyond me it's giving entirely the wrong numbers; likewise, using a simple A3&B3 (i.e. "Thing 1"& "Reason 3") is just giving me an error - I thought it may be to do with the fact that some Things and most Reasons contain spaces but honestly, I've no idea.

On preview:

Thanks for the (amazingly quick!) advice, noneuclidean and brainmouse. The destination sheet (as it were) definitely needs to have Things in one column and Reasons in another - though I will then use that data for a pivot table on a third sheet. (Some of the other data I have is exported from Sharepoint and thus is already in Pivot Table format and Good God, it makes my life so much easier.)

I'll have a muck about and report back, though probably not until tomorrow. Again, thanks for the advice.
posted by Len at 12:45 PM on March 5, 2015


Actually, I described that wrongly - the second sheet needs a layout for each thing which lists all the Reasons per Thing, e.g.

Thing 1

Reason1 32
Reason2 26
Reason3 0
Reason4 18
Reason5 0

Thing 2

Reason1 26
Reason2 23
Reason3 1
Reason4 0
Reason5 0

And so on and so forth. Hope that makes sense.
posted by Len at 1:09 PM on March 5, 2015


That's harder but not impossible, but the easiest thing to do is a workaround: can you just make it like you had in your original example and hide the first column? Or you can put them off in column G or something and then hide column G. Unless you really need them not to be there, just put them there. and save yourself the grief.
posted by brainmouse at 1:12 PM on March 5, 2015


brainmouse -

Unfortunately hiding/sectioning off the first column scuppers me - if, on Sheet 1, for each Thing, all Reasons were reported (in other words, Reasons which returned a 0 value were included), then that would be perfectly usable, as I could automatically populate the Reasons per Thing on Sheet 2 by reference to absolute cells on Sheet 1. But given that only actual contacts which both match (e.g.) Thing 1 and Reason 5 appear on the data I have, one week, the match of Thing 1 and Reason 5 is going to be A3&B3, while the week after it might be A4&B4, so absolutes are out.

I hope that this makes sense. As mentioned before, I'll see what I can get out of this tomorrow, but even if I can't get any of it to work, I've got more to go on than I did before, so thanks to everyone for that.
posted by Len at 1:32 PM on March 5, 2015


Simple brute-force method if there's not a great whopping shitload of data:

(1) Do a pivot table with row labels as Things and column labels as Reasons
(2) Select pivot table, paste as values somewhere else
(3) Search and replace '' with 0 in the new table
posted by ROU_Xenophobe at 1:33 PM on March 5, 2015


(that is, replace blank cells with 0)
posted by ROU_Xenophobe at 1:34 PM on March 5, 2015


I implemented the solutions suggested by noneuclidian and brainmouse in a spreadsheet you can use as an example. It's a Google Doc but if you download it as an Excel doc I believe the formulas will be preserved. I also created a Calculations2 tab that is formatted based on your update.

The Reasons and Things come from the Reasons and Things tab. You'll need to update their text in that sheet, and both Calculations sheets will update themselves based on the text you enter in Reasons and Things.

Here's how I would use it: each time you get a new extract of data, copy your Excel file to create a new one and name it with the appropriate date. Clear all of the old data out of the Raw Data tab. Then paste or import your data extract into the Raw Data tab. The Calculations tab will update automatically. Or you could create an Excel file every time you get your data extract and copy the Calculations page into the new file, but I think creating new tabs and copying formulas is more error-prone.

Yes, this is somewhat manual and requires cutting and pasting each time. But I think trying to automate this any more is overdoing it. Just get the sumif formula right, get the values of the Things and Reasons right, then paste in your data every month. If new things or reasons are added, do a little copying and pasting to create space for those things and reasons and then you're all set.
posted by Tehhund at 1:35 PM on March 5, 2015 [1 favorite]


With the qualification that I haven't tried any of the above solutions yet because it's 9:30pm and this can wait until I'm actually at work tomorrow and therefore being paid to use AskMe at work for legitimate reasons, thanks everyone.
posted by Len at 1:39 PM on March 5, 2015


Everyone - but especially noneuclidean, brainmouse and Tehhund - thanks so much.

Tehhund's example spreadsheet is exactly what I needed, and I've now adapted it for what I need. And yes, it'll be a case of manually pasting in the data once every week to then get it into a usable format. Extracting, and then copying and pasting the actual data from call logs and into Excel is a 30 seconds job and I don't need it any more automated than that. All I need to do now is the usual task of making it look pretty and colourful and because every manager in existence loves a bar chart, stick that in too :)

Once again, thanks. I've spent more hours than I wanted mucking about with this, and you all solved it in 90 minutes.
posted by Len at 4:31 AM on March 6, 2015


Make sure you document your process of how to copy/paste the data from the source into the target spreadsheet. Put in in another tab of the spreadsheet and call it Help or CheatSheat or HowTo or something. Just in case you get hit by a bus (or get promoted!!) and someone else needs to perpetuate this task.
posted by CathyG at 8:47 AM on March 6, 2015 [1 favorite]


« Older NYC eats for an Aussie   |   Lost/stolen safe deposit box; now what? Newer »
This thread is closed to new comments.