Help with an excel formula
August 25, 2020 7:14 AM   Subscribe

Need a little help inserting text in a cell based on the condition of another cell. Don't want to do giant nested IFs. Full question and visual aid inside.

Here's a mock up. As you see in cell B3 the ingredients have been autofilled based on the yes-no of the coming fields. I might want to add tomato between one of those columns in the future, who knows. It should survive that. If necessary I can use a spare column to create a lookup list.

Were I to "psudocode it" I would describe it as... for C2:F2 =yes insert C1:F1 (and then I would put commas in the ingredients headlines). Or Match from I:I where C2:F2 = yes. If I am far too unclear please let me know.

Actual use case is user puts number of hours per column of activity codes, receives a plain-text B:B saying "ran, jumped, swam" or whatever.
posted by J.R. Hartley to Computers & Internet (18 answers total) 6 users marked this as a favorite
The array formula (note: use Control-shift-enter to end an array formula) in B2 of =concat(if(C2:F2="y",C$1:F$1,"")) seems like it does very nearly the right thing --- it doesn't include commas, though. Trying to modify it to include commas, I built =concat(if(C2:F2="y",concat(C$1:F$1,","),"")), but annoyingly that doesn't work, I think because concat actually takes a list as an argument and so it doesn't treat that C$1:F$1 as something to iterate over the array formula, but rather as a list in its own right. This may be fixable --- I don't know how to tell Excel "treat this particular element of the array formula as something to iterate over rather than as a list".

You might need a little bit of extra magic also to handle suppressing the final comma, and producing some sort of intelligent output when none of the cells contain "y", but this is at least a start.
posted by jackbishop at 7:42 AM on August 25, 2020

I've solved this problem a variety of ways. I teach excel and problems like this are a little tricky to "give" to someone else. In the end, they should find their own path.

However, what I would do given this specific information is to replace the "Y's" and "N's" with "lettuce, " (With the space and comma) or "" (no information). Then, you could just concatonate or use & to combine all of them regardless. You could do this in helper columns if you need the y and n inputs. =if(A5="Y", A1&", ","") basically.

Another way to solve this issue is to do the above in terms of putting lettuce instead of a Y, then concatonate the row so the result is maybe "Lettucebunketchup". Then, instead of a nested if/then, you could do =if(iferror(search(B4,"lettuce"),"ERROR"="ERROR","Lettuce","")&", "&if(iferror(search(B4,"mustard"),"ERROR"="ERROR","mustard","") etc. This is nice because it's more discrete than a nested if.

Another more obscure way I've solved this problem in the past in a more complex situation, is by using numbers to indicate each ingredient. So, you can assign lettuce 100000, ketchup 10000, mustard 1000, bun 100, then add them together or use sumifs. So then lettuce, mustrard, and bun would be 1001100. I haven't cracked the code to divide it out yet but it helps with huge datasets.
posted by bbqturtle at 7:42 AM on August 25, 2020

I would do additional IF columns (hidden) after each separate ingredient column (visible) so that e.g. if "lettuce column" = "Y", then "lettuce marker column" = "lettuce," (adjust spaces and commas to suit), and conversely if "lettuce column" = "N", then "lettuce marker column" = " ". Then you can just concatenate all the marker columns, though I don't know if the output might get a bit ugly if there's not many ingredients out of a long list.
posted by london explorer girl at 8:00 AM on August 25, 2020 [1 favorite]

If you end up with a list that ends with an extra ", " at the end, you can use the LEFT alignment function to strip that off, for example:

=LEFT(A4, LEN(A4)-2)

would give you all but the rightmost two characters, in your case the extra ", " at the end.
posted by Zalzidrax at 8:14 AM on August 25, 2020 [1 favorite]

Taking notions from what other people have said above, I've found that using the "&" operator instead of concat fixes my list-interpretation error. I should probably mention that both this comment and my previous were actually developed in Gnumeric, not Excel, although they should be compatible.

Here's something which does everything except removing the last two characters (which Zalzidrax gives a way to do, if you're willing to either write the same formula twice or use an additional column)
=if(or(C2:F2="y"),concat(if(C2:F2="y",C$1:F$1&", ","")),"none, ")
Note that I added a comma and space after "none" specifically so that whatever strips off the last two characters could strip that, too.
posted by jackbishop at 9:04 AM on August 25, 2020

You want TEXTJOIN, not CONCAT:

=TEXTJOIN(", ", True, IF(C2:F2="y", C$1:F$1, ""))

Sample sheet
posted by flabdablet at 9:22 AM on August 25, 2020 [10 favorites]

You can make this kind of data wrangling much much easier by getting your source data into 3rd normal form. Once your data is well structured you will not need any complicated formulas to make sense of it.
posted by Lanark at 12:21 PM on August 25, 2020

I would do additional IF columns (hidden) after each separate ingredient column (visible)

This would be my approach also, for the benefit of myself later on trying to figure out what I was thinking and of anyone who was unfortunate enough to inherit it.
posted by A Terrible Llama at 2:47 PM on August 25, 2020 [2 favorites]

Lots of good stuff here, am getting my teeth into it now. As I was saying though I ideally want to avoid a lot of nested and repeated stuff, especially since “if” is wanting to finish its mental loop as soon as if=true. Should I be looking at a macro or something instead? Want it light and as unfuckupable as possible by others. Re changing data etc, I should be clear, I have to work in the file I’ve been given, they want just the name and ingredients and this is to stop us needing to type out the words every time for 400 sandwiches, so to speak.
posted by J.R. Hartley at 1:57 AM on August 26, 2020

I ideally want to avoid a lot of nested and repeated stuff

and this is exactly what the array-formula version of IF(), nested just one level inside a TEXTJOIN(), will do for you. Yes, it's nested but it's in no way giant, and is even simple and straightforward enough to be easier to parse the design intent from than the same requirement written out in prose. Did you open my sample spreadsheet?
posted by flabdablet at 8:35 AM on August 26, 2020 [1 favorite]

The magic thing about excel is that there are 100 ways to solve every problem. There's the manual way (just have them do it), the broad way (add columns until problem is solved), The programmer way (use visual basic), and there's the elegant way (there's a formula that does exactly what you need without any additional work).

Flabdablet has given you the elegant way, which is to use the =textjoin formula, which given his setup does EXACTLY what you need it to, in a super simple way! It feels like when I first learned about vlookup.

I run into this kind of problem all the time. I'm going to use textjoin SO MUCH. Thank you so much flabdablet!
posted by bbqturtle at 10:45 AM on August 26, 2020 [1 favorite]

Ok finally got my ducks in a row here and flabdablet that really is a gorgeous solution! Exactly what I need, thank you, I will be experimenting with it a lot! I need to get it working on the version of excel our accursed machines run and see what the equivalent term is in our language version, but fingers crossed this is the one! If any macro-wizards happen by, feel free to have a think as the TEXTJOIN func seems to not be supported in earlier excel versions than 2019.
posted by J.R. Hartley at 11:20 AM on August 26, 2020

As I understand it, TEXTJOIN first appeared in the subscription (Office 365) edition of Excel 2016, and was later included in all editions of Excel 2019. If it's not there in your version there are VBA polyfills available. Here's one. I haven't tested it but it looks like the business.

If it's unacceptable for this functionality to become unavailable when macros are turned off, then you can hack around a lack of TEXTJOIN using CONCAT. Here's how I'd do that:

=REPLACE(CONCAT(IF(C2:F2="y", ", " & C$1:F$1, "")), 1, 2, "")

Sample sheet

It's longer and nestier than the TEXTJOIN version but still not too bad. No repeated functions at any rate, and still based on the array IF so it will still cope with modifications to your columns of checkboxes without needing to be tweaked by hand.

Unlike jackbishop I've stuck the delimiting ", " onto the front of each of the concatenated items rather than the end. That puts the redundant one at the start of the concatenation result rather than the end, which makes it easy to use REPLACE to get rid of it without needing to take string length into account. REPLACE also copes politely and quietly if asked to replace something that isn't there, so there's no need to handle that as a special case (see the new "nothingburger" sammige in the sample sheet).
posted by flabdablet at 1:25 PM on August 26, 2020 [1 favorite]

If REPLACE is too wordy for you then my second choice would be MID:

=MID(CONCAT(IF(C2:F2="y", ", " & C$1:F$1, "")), 3, 999)

I don't like it as much because it forces me to think about whether or not 999 means "as many as exist" quite clearly enough. For this use case I think it probably does.

The Right Thing instead of 999 is an ugly calculation using LEN that would either require the CONCAT to be done twice or shifted to an intermediate result cell, both of which ewww.

On balance I think the REPLACE version is tidier.
posted by flabdablet at 1:46 PM on August 26, 2020

flabdabet thank you, your explainations are going to help me implement this in many other places I didn’t even know I needed them. Also thanks to all other contributors, I got lots of useful stuff there too!
posted by J.R. Hartley at 11:14 PM on August 26, 2020

Am trying the VB but getting errors, part of the issue I suspect is I am using Swedish excel which also includes different delimiters. Don't know if affects VB but absolutely does formulas, so I've switched true for SANT, commas for semicolons in the functions, may need to do.

With regards to the REPLACE I get a "missing" result for this variant.
=ERSÄTT(SAMMANFOGA(OM(E6:P6>0; ", " & E$1:P$1; "")); 1; 2; "")

With regards to MID, I get a value error, possibly because I am using >0 rather than "y"?
=EXTEXT(SAMMANFOGA(OM(E7:P7>0; ", " & E$1:P$1; "")); 3; 999)

Edit: Actually they seem to both be saying "value error" now, that the data used is the wrong type. Even when I left it as "y" and changed the data available to Y instead of a number.
posted by J.R. Hartley at 12:30 AM on August 31, 2020

I can't help with the swedish, but typically when you use a >0 or something like that, it needs to be in quotes for some reason. ">0" could help with your errors. I know that helps in many formulas, not sure about this one.
posted by bbqturtle at 9:33 AM on August 31, 2020

I don't have access to a Swedish version of Excel. But if your troubles began when you switched from Y/N to numbers then you might want to check this followup question to see if anything in its answers turns out to be relevant and/or illuminating.

One thing I learned from that thread is that Evaluate Formula is a thing. It might help you narrow down exactly where your value errors are coming from.

Something else you could try is temporarily switching out the array formula for one that just processes single cells, and step through the cells concerned one at a time to see if the value error always happens or whether it's related to some particular subset of source cells.
posted by flabdablet at 10:43 AM on August 31, 2020

« Older App, tool, extension or site for organizing a...   |   Help a dummy out... prepaid phone question Newer »

You are not logged in, either login or create an account to post comments