August 28, 2009 8:41 AM   Subscribe

MathFilter, NumericSimplificationFilter, RecursionFilter, ExcelFilter: Where X_1,X_2,...,X_n can be any number, is there a name, simplified series, or equation to describe: (X_1)+(X_1+X_2)+(X_1+X_2+X_3)+...+(X_1+X_2+...+X_n) ? Seriously seriously insane exel implementation question inside.

The prior question probably gives me enough information that I can do some research on the implementation of what I'm asking below. Effectively I'm doing an accumulation of data in Excel.

For each data set, I have a row containing distinct quantities
We'll say that says{A1:D1} contain the values {100 | 500 | 200 | 300 }
What I need {A2:D2} to say is the cumulative value of the prior cells, plus the cumulative value of the prior cumulative values

This means {A2:D2} = {100 | 700 | 1500 | 2600 } is my desired result.

{A2} =Sum($A1:A1)
{B2} =Sum($A1:B1,A2)
{C2} =Sum($A1:C1,B2)
{D2} =Sum($A1:D1,C2)

I could also represent each of these as
{A2} =Sum((A1*1))
{B2} =Sum((2*A1),(B1*1))
{C2} =Sum((3*A1),(B1*2),(C1*1))
{D2} =Sum((4*A1),{B1*3),(C1*2),(D1*1))

Now here's the thing I'm not doing this to D2, I'm doing this more likely to {BB2}. I need a way (preferably without macros) to get a cleaner table. Is there a relatively simple expression so that I could generate {BB2} without generating {A2:BA2}? With the amount of data I'm performing this on, METHOD 1 requires a lot of cell real estate, and METHOD 2 results in an insanely long, quasi recursive forumula that doesn't lend itself to copying and re-use.

Understand the portability of copying and re-use are the big needs, as I don't do this once, I do this a couple hundred times on data sets of varying length.... generally for 63+ columns and 200-300 entry rows...

I'm hoping that by getting a better name and notation for the series that I can figure out a better way to implement this. Also, eventually this gets ported to SQL, so while recursive realities are nice, the DB admin hates 5 hour tablebuilds...
posted by Nanukthedog to Computers & Internet (16 answers total)
I confess I didn't read the inside portion of your question very closely, but in answer to the outside portion:

Wouldn't that just be n*X_1 + (n-1)*X_2 + (n-2)*X_3 + ... + X_n ?
posted by Commander Rachek at 8:45 AM on August 28, 2009 [2 favorites]

In B2:

Copy along.
posted by bfranklin at 8:48 AM on August 28, 2009

Oh, nevermind. I see that I misunderstood you and this is basically your method 1.
posted by bfranklin at 8:51 AM on August 28, 2009

Slightly less processor intensive: B2 is =2*A2+B1
posted by bfranklin at 8:53 AM on August 28, 2009

yes, Commander Rachek has it. The simplest equation for your {BB2} is =n*X_1 + (n-1)*X_2 + (n-2)*X_3 + ... + X_n.
posted by milestogo at 8:57 AM on August 28, 2009

Upon closer inspection I see that what I outlined is your Method 2. Unfortunately for you, I believe that is the best explicit (non-recursive) representation you're going to get. Surely Excel has some way to automate that?
posted by Commander Rachek at 9:03 AM on August 28, 2009

Response by poster: Not that I've found yet, Commander Rachek. Its bloody hellish for huge datasets ....
posted by Nanukthedog at 9:04 AM on August 28, 2009

Can't you use an extra row for intermediate sums? I would use something like this:
 | A          | B          | C         | D           | ...
1| X1         |  X2        | X3        | X4          | ... 
2| SUM(A1:A1) | SUM(A1:B1) | SUM(A1:C1)| SUM (A1:D1) | ...
3| SUM(A2:A2) | SUM(A2:B2) | SUM(A2:C2)| SUM (A2:D2) | ...

posted by Dr Dracator at 9:13 AM on August 28, 2009

Oh, as for terminology, if X1,X2... is a sequence, X1+X2+... is an infinite series and X1, (X1+x2),... is called the sequence of partial sums but I doubt you will find this useful.
posted by Dr Dracator at 9:20 AM on August 28, 2009 [1 favorite]

Best answer: How about this: Create another row with the values A2 = n, A3 = n-1, A4 = n-2, etc. Put a zero in the cell A(n+1). Then use the SUMPRODUCT function:


I think this will work in general.
posted by Johnny Assay at 9:27 AM on August 28, 2009 [3 favorites]

(This is essentially an implementation of the n*X_1 + (n-1)*X_2 + ... formula mentioned above, BTW.)
posted by Johnny Assay at 9:30 AM on August 28, 2009

Response by poster: Dr. Dracator: Thanks a ton - terminology helps - you are reminding me of what to look up in 15 year old calculus and algorithm textbooks.

Johnny Assay: I have to test that formula (head somewhat exploding from wrapping my head around a meeting), but I'm thinking that may be the best implementation for my use because I can do it with fewer intermediates than you think I need using some COUNTIFS.
posted by Nanukthedog at 10:02 AM on August 28, 2009

Response by poster: Just checked it... Wow - you just made my life much easier Johnny Assay.
posted by Nanukthedog at 10:33 AM on August 28, 2009

Glad to be of help. If you don't want that last zero to be in the second row (to maximize elegance or make a square array or whatever), you can also use =SUMPRODUCT($A1:A1, $A2:A2 - A2 + 1).
posted by Johnny Assay at 11:43 AM on August 28, 2009

Dr. Dracator: Thanks a ton - terminology helps...

It's great to be helpful even when one's brain is disconnected. You have probably realized by now that my suggestion above is highly suboptimal, though it can be rewritten to be non-horrendous:
 | A   | B      | C     | D      | ...
1| X1  | X2     | X3    | X4     | ... 
2| =A1 | =A2+B1 | =B2+C1| =C2+D1 | ...
3| =A2 | =A3+B2 | =B3+C2| =C3+D2 | ...
Reasonably clean, and amenable to copying-pasting and range extension.
posted by Dr Dracator at 1:09 AM on August 30, 2009

Response by poster: So the reason why the SUMPRODUCT works the best for my needs:
By shortening or lengthening the range, I can autocalculate one cell's result independent of all the others, and actually independent of the reverse index (I need not finish on 1).

This means that if I want to calc on A:BB I can, as well as on F:AI with relative ease.
This means that I can easily compare 1 datapoint from multiple starting locations with minimal effort - meaning that the versatility, re-usability, and reapplication of the result is extremely high.
posted by Nanukthedog at 9:48 AM on August 30, 2009

« Older Are dereferenced open streams implicitly closed?   |   Help me avoid having the Dallas club experience... Newer »
This thread is closed to new comments.