Pivot Tables the Hard Way -- Filling A Row With Column Data in Excel
August 4, 2021 7:21 AM   Subscribe

I have a workbook with an input sheet and then sheets showing different views of the data pulled from the input sheet. My primary views are fine, but for secondary views I need to fill output rows by reference to input columns. Due to Reasons, I think I need to do this with either INDEX, OFFSET or INDIRECT(ADDRESS...) rather than Transpose, or Pivot Tables. I can make INDEX work to fill across, but it can't be copied down. Can you help me work around having to manually rewrite a formula for each ouput row?

I have a set of numbered Documents, and each of those Documents is related to multiple numbered Topics.

I need to generate correspondence tables; i.e. one table that is indexed by Document number in the leftmost column, and shows all of the Topics a Document is related to on each row (the primary view), and another table that is indexed by Topic number in the leftmost column and shows all of the Documents related to that Topic on each row (the secondary view).

I have entry set up so that there is a grid indexed (in rows) by Document number and with columns for the Topic number. Data entry is accomplished by filling the appropriate cells with "X"es and I have data validation set to display a dropdown of "X" and a blank.

The primary output views are all working fine, as the input sheet is already indexed by Document number, allowing me to just concatenate and format the rows on the input sheet.

For the secondary output, I need a formula that I can copy and paste to the right but that will read down along a column corresponding to a particular Topic number. i.e., I need the row number to increment when filling horizontally rather than the column letter.

I understand that this is what Pivot Tables do (or part of what they do) but am under the impression that Pivot Tables are dynamic views that you switch between, which is not what I want to do. I want sheets that auto-update with all of the various views I will need and I never want to touch anything but the input sheet.

I also know there is a Transpose function to cutting and pasting that seems like it's intended to do this, but it says it requires absolute references in the source range to preserve them and that seems like a problem here (although maybe I'm wrong about that).

If the above is correct it seems to me my options are abusing some combination of the COLUMN function with INDEX, INDIRECT or OFFSET which so far as I can tell subverts Excel's normal behavior by using the inbuilt Column index numbers to forcibly increment relative references to rows when filling horizontally. In a way that makes my brains want to leak out of my ears.

Here's how it's set up:

The raw input data is pulled into the working sheet's range of D7:D56 (vertically as in the source).

We are trying to copy that range horizontally into D64:BA64 for further concatenation and formatting.

Currently I have one line of output working using INDEX.

The first (upper left) output cell (D64) contains this formula:


This works fine for the first row if I copy it across.

However, due to the absolute cell references in the source, I can't just take D64:BA64 and fill down with it -- Row 65 needs to be filled with data from Column E, Row 66 from Column F, etc.

The sheet is set up to support 250 Topic Numbers, and I do not want to hand-code 250 versions of this formula and fill each row manually if I can all avoid it.

I understand there are ways to do this using OFFSET or INDIRECT(ADDRESS...) but trying to get my mind around that is really hurting my brain more than INDEX did.

Is there some way I can hack my way around this by more programmatically assembling the INDEX function? Or can someone explain how I would do this with OFFSET or INDIRECT in a way that I can do this with two copy/paste operations (i.e. one to fill the top row, then one to copy the top row down).

(Or am I mistaken about being unable to use Pivot Tables or Transpose for this?)
posted by snuffleupagus to Computers & Internet (13 answers total) 4 users marked this as a favorite

Best answer: Hi Snuffleupagus!

I do this kind of work all the time. I know you said you are limited to those functions, but would it be possible to accomplish your goals with Sumifs and xlookups? Sumifs show values based on multiple criteria, and xlookup is the same as vlookup but can be used in either direction. Sumifs also seamlessly transitions from rows to columns without hesitating.

I also am a power user that creates sheets similar to yours, and I'd say 90% of the data is filled with sumifs. Text data is usually filled with xlookup, though with multiple criteria I need to get fancy.

With sumifs, you just fill in your reference rows and columns at the top and left of the data, and write the formula so it dynamically fills in the results. That seems to be the tool missing from your toolbox to solve this problem. I'd be happy to elaborate if helpful.

Though, if you exceed, say, 10,000 rows, sumifs can crash your excel. Always test with a small selection first.
posted by bbqturtle at 7:42 AM on August 4, 2021 [1 favorite]

Best answer: On Transposing formulas:

One useful thing you can do is transpose references. So if you take a selection, and copy and "paste links" it is a link to all of those cells. But, you can't do that and transpose at the same time.

So, the hack I use (I'd like to say I discovered it, but I'm sure others have a way as well) is you can "paste links" normally, then find and replace "=" with "#", which makes all of the cells no longer formulas. Then, copy the "#"-ated cells, and transpose. Then, find and replace # with "=".
posted by bbqturtle at 7:44 AM on August 4, 2021 [2 favorites]

A few other notes on further inspection:

Sumifs could reference the document number as well. Lock that reference!

Use 1's or 0's instead of X's. That was it will work in sumifs/pivot tables much better.
posted by bbqturtle at 7:47 AM on August 4, 2021

Response by poster: Looking into sumifs/xlookup as well as the paste links thing.

I suspect it would be as fast for me to use a temp sheet to build another 249 versions of the INDEX formula I can cut and paste in and then manually fill each row as to learn vlookup/xlookup, if a lot more tedious, but we'll see. It's worth learning if I can puzzle it out.

I want to keep the input sheet as Xs and blanks for clarity and ease of use, by the time the data hits the secondary view sheets it's working with numbers. (Like so: =IF(Input!D10="X", $A8, "") with the numbers statically populated in the A column).

I'll see what I can work out. Will report back. Thanks!
posted by snuffleupagus at 8:02 AM on August 4, 2021

The amount of effort you are putting into this sheet is pretty irregular from a normal business perspective. There's probably a different, faster way to accomplish this goal. I deal with these types of workbooks a lot, and I'm having trouble wrapping my head around the application of it - unless you need to upload it into some ancient system. Having 250 sheets in a workbook is not how sheets are intended to be used typically.

A few possibilities that I've used before:

Could all of your data live on one sheet, and you just use a filter?
Could you use a dynamic sumifs, with references to dropdown-menus at the top, to allow the end user to select their index, then it auto-populates? Like, typically I would have one sheet full of data, and then your "secondary view sheet" is the second sheet, just populated through sumifs. Sometimes I have an intermediary sheet.

It would be really helpful if you had a version of the sheet you could share to explain - with what the input and output looks like.
posted by bbqturtle at 9:45 AM on August 4, 2021 [1 favorite]

Response by poster: I'm sure you're right. (But that's 50x250 columns/rows depending on the orientation, not 250 sheets.)

Like a lot of things, the better way is faster if you know what you're doing. However, learning the better way can be slower. Like, weeks, months, years.

I spent about thirty minutes reading about XLOOKUP (and its forerunners) and realized it's probably the Right Way to go about this, but also that it would take me a couple days to work out how to do it properly. Figuring out how to get OFFSET or the INDIRECT(ADDRESS....) hacks to work would likely be quicker, but is just scrambling my brains at the moment too.

Meanwhile, the cut and paste trick got me thinking that I could probably work out a quick and dirty way to programmatically build all the formulas I needed (like the way you would munge text in a Bash pipeline) in another workbook but with the output using a leading # instead of =@, then copy and paste the resulting cells as values into the sheet doing the column-->row conversion, and find and replace the # with the =@.....et voila.

So now I have CorrespondenceTableColumnMunger.xls hanging out alongside my main workbook. This is of course Ugly and Wrong but it's Good Enough and Working Right Now, and the horrible parts can be replaced with something more elegant later.

The main use case is responses to Requests for Production under the California Code of Civil Procedure. There is a revised requirement that the materials produced must be correlated with the requests they are responsive to (and any one document/exhibit is going to be responsive to more than one request). You can't just do an undifferentiated doucment dump anymore. And if I have to generate a table indexed by Exhibit Number, then I'd also like to be able to generate a table indexed by Request Number.

I also have call to produce correlation tables of numbered documents and their topics for other work purposes.

I can upload the files (or some illustrative screenshots) if anyone really wants to see my hackery, or if the Munger might be of some use to others -- but it might give you the Brain Worms.
posted by snuffleupagus at 11:26 AM on August 4, 2021

Best answer: This is a many to many relationship and is one of the times that using a database is much faster than using Excel, but as you said, that has a learning curve and if this is a one time thing, not worth the learning time. But it sounds like this request may recur, so here goes:

In a correctly designed database, you would not have a table with every document as rows and every column as a topic and then only put an X in the cells that are "Yes". You would have one table with two columns: Document and Topic. Then you populate the rows with only the Yes values. So, instead of a 50x250 cell table, you have a table that can be thousands of rows but is still just two columns, which is much easier to query.
posted by soelo at 12:52 PM on August 4, 2021 [1 favorite]

Response by poster: Yes. This is an area that I'm relatively weak. I know what databases are for, and I know some basics as required for sysadmin tasks I've peformed, but I have no actual DBA skills. Which is why I'm using Excel instead of Access, and why I fell back onto what I do know, from text pipelining, cascading references, basic array manipulation and so forth.

I suppose what I've created is something like a kludged set of linked lists, which I suppose isn't so horrible a structure for this in the abstract.

But it's pretty horrible as executed (some snips as a single image)

Nonetheless, done is done.
posted by snuffleupagus at 1:11 PM on August 4, 2021

Response by poster: Corrected image (it was showing a value instead of its formula).
posted by snuffleupagus at 1:28 PM on August 4, 2021

Best answer: Well, I'm glad you found a solution! Your images don't work, but that's okay. I hope you learned a few nuggets, and next time things will be even easier!

I agree with Soelo that a database would be easier, but the crucial part of what he said was trying to have only one column with the data in it. You can do this in excel too. So instead of a workbook having:

xxxxxxxxxx Date1 date2 date3 date4

With X's location for data, it's much for efficient to have

location 1 date 1 X
location 1 date 2 X
location 1 date 3
Location1 date 4
location2 date1
location2 date2
location2 date3X


If you start from this position it's much easier to reference and work with. Sumifs / xlookups / nested lookups / concats can reference the combination of columns much easier than searching within a range.
posted by bbqturtle at 9:46 AM on August 5, 2021

Response by poster: Weird. I must have the one I deleted cached. How bout this?

Thank you very much for this:

So instead of a workbook having:
xxxxxxxxxx Date1 date2 date3 date4
it's much for efficient to have
location 1 date 1 X
location 1 date 1 X
location 1 date 2 X
location 1 date 3

That crystalizes the thing about having two columns rather than intersecting rows and columns in a way that the DB basics hovered in the hobbyist-level coding tutorials I've done never have.
posted by snuffleupagus at 10:43 AM on August 5, 2021 [1 favorite]

Oh jeez you're really in the weeds on this one. If I was your coworker and you could share screens, I'd have you scrap this thing entirely and try to remake it from scratch, skipping like 4 steps hopefully. But! If it works, it works!

I'm still not totally understanding things, but I'm happy I can help. Honestly, sumifs/xlookups work so well that I've never had to go down the rabbit holes you have. If you ever need more excel help, I'll always be around!
posted by bbqturtle at 5:49 PM on August 5, 2021 [1 favorite]

« Older Simple programmable cell phone for a confused...   |   Proper etiquette for switching doctors within a... Newer »
This thread is closed to new comments.