Conditional summation notation
July 31, 2020 3:00 PM

I'm trying to show a summation for observations i to n, starting with i = 1, that is only applied when another variable (treatment) is equal to 0. What's the correct notation for that?

The idea that I'm trying to turn into mathematical notation is "Each control unit in a given stratum has an uncorrected weight equal to (the unit's survey weight divided by the sum of all survey weights of control units in that stratum) multiplied by (the sum of all survey weights of treatment units in that stratum)," if that helps for context. I feel very self-conscious asking this question; please be kind.
posted by flipmodemedian to Science & Nature (8 answers total) 2 users marked this as a favorite
 
Best answer: On the Wikipedia page on summation look at the example of a sum over the elements of a set. One of your sets is [Stratum I] \cap [Control] (where \cap is the intersection symbol; I tried hard not to resort to LaTeX) and the other is the intersection with Treatment instead.

There are lots of other ways of writing it, too. You can just define something like S_i^C to be the set of control elements in Stratum i and S_i^T to be the treatment elements. Basically, anything that communicates your point is "allowed".
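
For instance, with those sets the sentence in the question could be typeset roughly as below. This is only a sketch: w_j for a unit's survey weight and \tilde{w}_j for its uncorrected weight are placeholder symbols, not anything from the question, and i indexes the stratum while j and k index units.

```latex
\tilde{w}_j
  = \frac{w_j}{\sum_{k \in S_i^C} w_k} \cdot \sum_{k \in S_i^T} w_k
  \qquad \text{for each control unit } j \in S_i^C
```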
posted by hoyland at 3:07 PM on July 31, 2020


You can also do something like multiply terms by an indicator function (note that I_control = 1 - I_treatment), if that would let you write something more likely to make sense to your readers. (What you wrote above the fold was confusing to me and below the fold wasn't, so maybe I think of things opposite to you, in which case, maybe the indicator function route is clearer).
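
A minimal sketch of the indicator route in Python, for a single stratum. The function name, the argument names, and the 0/1 coding of treatment are assumptions made for illustration, and the stratum is assumed to contain at least one control unit (otherwise the division fails).

```python
def uncorrected_control_weights(survey_weights, treated):
    """Uncorrected weights for the units of one stratum.

    survey_weights: survey weight of each unit (hypothetical name)
    treated: 0/1 flag per unit, 1 for treatment units (hypothetical name)
    """
    pairs = list(zip(survey_weights, treated))
    # I_control = 1 - I_treatment picks out the control units.
    control_total = sum(w * (1 - t) for w, t in pairs)
    treatment_total = sum(w * t for w, t in pairs)
    # Control units get (w / control_total) * treatment_total.
    # Treatment units are multiplied by a zero indicator, so they get 0.
    return [(1 - t) * w / control_total * treatment_total for w, t in pairs]


weights = uncorrected_control_weights([2.0, 3.0, 5.0, 4.0], [0, 0, 1, 1])
```

With these weights the control units' results add up to the treatment total for the stratum, which is what the verbal description implies.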
posted by hoyland at 3:12 PM on July 31, 2020 [3 favorites]


Response by poster: That's perfect; I really appreciate it!
posted by flipmodemedian at 3:13 PM on July 31, 2020


Yeah, I was going to suggest an indicator function as well. If each unit has weight w_i and "type" t_i, you could write the sum of all control units as something like:

\sum_i (w_i \cdot I[t_i = \text{"control"}])

A more concise option is to just write a summation over elements of a particular subset, instead of over all indexes from 1 to n. For example, if U is the set of all units {1, 2, ..., n}, and C is the subset of only control units, you could write the same thing as:

\sum_{i \in C} w_i
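
As a quick sanity check that the two notations describe the same number, here is a small Python sketch. The data values and variable names are made up for illustration, and Python indexes from 0 rather than 1.

```python
weights = [2.0, 3.0, 5.0, 4.0]
types = ["control", "control", "treatment", "control"]

# Indicator form: every unit appears in the sum, but the comparison
# evaluates to True (1) or False (0) and zeroes out non-control units.
indicator_sum = sum(w * (t == "control") for w, t in zip(weights, types))

# Subset form: build the index set C first, then sum only over it.
C = {i for i, t in enumerate(types) if t == "control"}
subset_sum = sum(weights[i] for i in C)
```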
posted by teraflop at 3:14 PM on July 31, 2020 [1 favorite]


The Iverson bracket is an elegant and unfussy notation for indicators, though it isn't fully mainstream even in math, and I don't know how it would be received by (say) a statistics journal. Defining it in-paper would likely still be less overhead than some other solutions.
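
An in-paper definition could be as short as the following sketch (the `cases` environment needs amsmath). Here w_i is a placeholder for the summand, and t_i = 0 marks a control unit as in the question.

```latex
[P] =
\begin{cases}
  1 & \text{if } P \text{ is true,} \\
  0 & \text{otherwise,}
\end{cases}
\qquad \text{so that the conditional sum is} \qquad
\sum_{i=1}^{n} w_i \, [t_i = 0]
```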

Another thing that would not be unusual in math is a summation with one or more logical conditions under the summation sign. For instance, the operator for a sum to be taken over all prime numbers less than 100 might be written as
\sum_{\substack{1 \le p < 100 \\ p \text{ prime}}}
(well, that's a poor attempt to typeset it in basic HTML).
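
In code, conditions under the summation sign map naturally onto a filter. A small Python sketch, assuming the summand is p itself (the original only describes the operator) and using a simple trial-division helper written for this example:

```python
import math


def is_prime(p):
    """Trial-division primality test; fine for small p."""
    if p < 2:
        return False
    return all(p % d != 0 for d in range(2, math.isqrt(p) + 1))


# Both conditions under the summation sign become parts of the filter:
# the range enforces 1 <= p < 100, the if-clause enforces "p prime".
primes_below_100 = [p for p in range(1, 100) if is_prime(p)]
total = sum(primes_below_100)
```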
posted by aws17576 at 3:22 PM on July 31, 2020 [1 favorite]


Response by poster: You all are wonderful. I sincerely appreciate it.
posted by flipmodemedian at 3:24 PM on July 31, 2020


(I hope this is not too off-topic, but since the main question seems to have been answered, I'll just share something interesting about indicator functions. Some years ago, I had a student ask how to graph piecewise functions on their TI-xx calculator. Looking into it, I found that the leading suggestion was to plot equations like this:
y = x^2 / (-1 < x < 1)
which, for instance, would plot y = x^2 from x = –1 to x = 1. This works because the calculator internally represents the boolean values TRUE and FALSE as 1 and 0. Thus, when an expression is divided by a boolean, the result equals the original expression if the boolean is true, but is undefined if the boolean is false. The TI doesn't throw an error when it encounters an undefined value while graphing; it just skips that x.

I passed this on to my student, with more commentary than they bargained for about the uses and abuses of implementation-dependent junk.)
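
A rough Python analogue of the same trick, for the curious. This is not how the calculator works internally; it only relies on Python also treating True and False as 1 and 0. The function name is made up, and the explicit try/except stands in for the grapher silently skipping undefined points.

```python
def masked_square(x):
    """Return x**2 where -1 < x < 1, otherwise None (point is skipped)."""
    try:
        # Dividing by True leaves the value unchanged; dividing by False
        # is a division by zero, which Python reports as an error.
        return x**2 / (-1 < x < 1)
    except ZeroDivisionError:
        return None


xs = [-1.5, -0.5, 0.0, 0.5, 1.5]
# Keep only the x values where the expression is defined.
points = [(x, masked_square(x)) for x in xs if masked_square(x) is not None]
```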
posted by aws17576 at 3:45 PM on July 31, 2020 [2 favorites]


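Dirac deltas, also an option. They're generalized functions we define as (a) being zero everywhere except at 0; and (b) integrating cleanly so that intervals containing 0 integrate to 1 (else zero). For a discrete sum like yours, the analogue that actually takes the value 1 at 0 and zero elsewhere is the Kronecker delta.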

So your strata can each be activated by multiplying each layer by \delta(x-offset) with a layer-appropriate offset.
posted by k3ninho at 4:41 PM on August 1, 2020

