Help! Can I normalize statistical coefficients?
August 13, 2014 8:13 AM
With deadline looming, stats consultant has bailed. Simple queries need resolution. Help?
I am working on a data graphic that involves statistical calculations about survival rates for startup businesses, correlated with certain tangible and intangible factors. The raw data (about survival/closure/merger outcomes) has already been investigated, and the original researchers (who are awesome) have generated some interesting correlations using univariate regressions and Cox regressions. For my output I am relying on their statistically significant findings, wanting to create comparisons among the univariate coefficients. Not sure my methods are kosher and would appreciate consultation. Avalanche inside.
posted by GrammarMoses to Science & Nature (2 answers total)
To make the information clearer to a lay audience, editors and I want to present the figures in a comparative framing. Ideally, we're aiming to say "firms with factor [foo] have a XX% greater likelihood of survival than firms with factor [not-foo1] and a YY% greater likelihood of survival than firms with factor [not-foo2]."
The factors are characteristics of the owner and/or the firm that survives/closes/merges. Specifically, I have univariate coefficients for owner age (binned categorical), race (categorical), owner's years of experience (continuous), owner's experience in same industry (dummy: yes/no), owner's college degree (dummy: yes/no), firm diversification (dummy: yes/no), possession of IP (dummy: yes/no), and startup capital (binned categorical). I have firms categorized as high-, medium- and non-tech.
What I am hoping to do is normalize the coefficients. Simple example:
female-owned: survive 8.5, close 17.81, merger or sale 12.09
male-owned: survive 91.5, close 82.19, merger or sale 87.91
(If it matters, t-test was used to determine statistical significance; all of the above are stat-sig.)
Can I say that female-owned businesses in this category are more than twice as likely to close as they are to survive [(17.81/8.5) = 209%]? And, similarly, 42.2% [(12.09 - 8.5)/8.5] more likely to merge or be sold than to survive?
If I calculate similarly for male-owned businesses (89.8% as likely to close as to survive), can I compare the 209%-as-likely women's closure likelihood to the 89.8%-as-likely men's closure likelihood?
And does the 89.8% comparative likelihood of closure mean that men's firms are actually 11% more likely to survive? Or do I need to use a different denominator in calculating that?
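For reference, the arithmetic behind these three questions can be sketched in a few lines of Python. This is only a check of the divisions, not a ruling on whether the comparisons are statistically valid. The dictionary layout and the `ratio` helper are made up for illustration, and what the figures are percentages *of* is an assumption. The last step adds each outcome across the two groups; a total of 100 would suggest, without proving it, that the figures describe the makeup of each outcome group rather than each group's own chance of that outcome.

```python
# Figures quoted above. What they are percentages *of* is an assumption;
# only the original researchers' tables can settle that.
shares = {
    "female-owned": {"survive": 8.5, "close": 17.81, "merger or sale": 12.09},
    "male-owned": {"survive": 91.5, "close": 82.19, "merger or sale": 87.91},
}


def ratio(group, numerator, denominator):
    """Divide one quoted figure by another within the same ownership group."""
    row = shares[group]
    return row[numerator] / row[denominator]


female_close_vs_survive = ratio("female-owned", "close", "survive")
female_merge_vs_survive = ratio("female-owned", "merger or sale", "survive")
male_close_vs_survive = ratio("male-owned", "close", "survive")
# Same two male-owned figures with the denominator swapped.
male_survive_vs_close = ratio("male-owned", "survive", "close")

print(f"female close / survive:  {female_close_vs_survive:.3f}")
print(f"female merge / survive:  {female_merge_vs_survive:.3f}")
print(f"male close / survive:    {male_close_vs_survive:.3f}")
print(f"male survive / close:    {male_survive_vs_close:.3f}")

# Add each outcome across the two groups to see what the figures sum to.
column_totals = {
    outcome: sum(shares[group][outcome] for group in shares)
    for outcome in shares["female-owned"]
}
print(column_totals)
```

The last two ratios are reciprocals of each other, which is why "89.8% as likely to close" and "11% more likely to survive" are not the same size of gap: the first divides by the survive figure and the second by the close figure.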
Cox - competing risks:
With Cox regression tables, the researchers calculated the comparative likelihood of competing risks (closure vs. M&A -- both are considered business exits).
It seems clear that these figures are meant to be compared in pairs - closure vs. M&A. As I understand it, the coefficients represent positive/negative correlation (with each type of exit) and the absolute value (intensity) of the influence...?
Duration regression analysis - Cox regression (competing risks)
White-owned high-tech firms:
closure 0.56/m&a 0.45
White-owned medium-tech firms:
closure 0.69/m&a 0.8
White-owned non-tech firms:
closure 0.67/m&a 0.64
I suppose I can compare the closure figures to one another, but how? The scale is not necessarily -1 to 1, as some of these coefficients are upwards of 3 (for factors other than race, such as whether the business is a franchise).
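One general fact may help frame this: in a Cox model, exponentiating a coefficient gives a hazard ratio relative to that factor's reference category, and raw coefficients are not bounded by -1 and 1. The sketch below applies that to the six figures above under two readings. Whether the tables report raw coefficients (reading A) or already-exponentiated hazard ratios (reading B) is an assumption either way, and the reference category is not stated here. The function names are hypothetical.

```python
import math

# Figures quoted above for white-owned firms, keyed by tech level.
cox_figures = {
    "high-tech": {"closure": 0.56, "m&a": 0.45},
    "medium-tech": {"closure": 0.69, "m&a": 0.8},
    "non-tech": {"closure": 0.67, "m&a": 0.64},
}


def hazard_ratio_from_coef(coef):
    """Reading A (assumption): the figure is a log-hazard coefficient."""
    return math.exp(coef)


def pct_change_in_hazard(hazard_ratio):
    """Percent change in the exit hazard relative to the reference category."""
    return (hazard_ratio - 1.0) * 100.0


for tech, exits in cox_figures.items():
    for exit_type, value in exits.items():
        # Reading A: exponentiate first, then convert to a percent change.
        pct_a = pct_change_in_hazard(hazard_ratio_from_coef(value))
        # Reading B (assumption): the figure is already a hazard ratio.
        pct_b = pct_change_in_hazard(value)
        print(f"{tech:12s} {exit_type:8s} "
              f"reading A: {pct_a:+6.1f}%   reading B: {pct_b:+6.1f}%")
```

The two readings point in opposite directions for every figure here (a higher versus a lower hazard of exit), so it matters which one the tables use. Either way, a hazard ratio compares rates of exit over time; it is not itself a survival probability.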
And given this type of info, can any information about survival likelihoods possibly be derived (via...umm...subtraction or something)?
I feel a bit sheepish about these simple-minded queries, but for that reason I was sure they would be no-brainers for someone here. Sincere thanks to anyone who can shed some light.