So I'm beginning a statistics PhD program this fall and I'm concerned that my math skills have gotten rusty since I haven't done anything related for the past two years. I've been working as an actuary since I graduated college but I don't do that much math--mostly a lot of programming. Has anybody been in a similar situation to me? How was the adjustment for you? I'm considering retaking advanced calculus and linear algebra during my first year (probably next summer before I take 2nd year advanced courses) just to refresh myself again. I'm aware some people may think this is kind of pathetic but I'd rather be safe than sorry. Besides, it's only my first year. Is this frowned upon? [more inside]
posted by molamola
on Jun 17, 2013 -
I'm running into trouble with my statistics course. I'm just getting up to t statistics for independent measures research design. My problems are:
1. I'm going through a lot of paper
2. I need to keep all my calculations better organized as I do them
3. I'm flipping back and forth between my book, an online version of the book, and another screen so that I can reference as much material as possible at once. I'm thinking some kind of basic statistics calculator spreadsheet (or any other format) would be in order. Can anybody direct me towards one? [more inside]
posted by Che boludo!
on Jun 8, 2013 -
What are the limits to bedbugs? Why isn't every hotel room infested given how tough they are claimed to be? Is there any evidence on the chances of taking bed bugs home from a hotel with you? Will the bedbug infestation rates go ever upwards? Why or why not? Interested in aggregated, rather than anecdotal evidence here. [more inside]
posted by mister_kaupungister
on May 31, 2013 -
I come from an engineering background rather than a research background, and I find myself lacking in vocabulary when it comes to understanding research papers, particularly when they start talking about ANOVA analyses, F(x) effect sizes and p values. I can skim through the results of a study and see that certain numbers are bigger than other numbers, but I don't really know how to tell whether what I'm seeing is significant. I'm guessing that I'm missing basic education in statistics. Can I fix this in a simple way?
posted by sdis
on May 28, 2013 -
I was recently sitting down to tea with a friend of mine (we're both in our early 20s), and the topic wandered over to hard drug use (e.g. stuff like cocaine and crystal meth, not marijuana or alcohol). When comparing our perceptions of how common hard drug use was, we were completely surprised when our answers were polar opposites: I saw it as an extremely rare thing, but she said it was something virtually everyone did but no one talked about. What's the truth here? How prevalent is hard drug use anyway? And why do our experiences differ so much? [more inside]
posted by Conspire
on May 24, 2013 -
I have a list of paired numbers that span multiple orders of magnitude, and I need to find a method to a) compare within each pair in a way that does not disproportionately bias the comparison at the high or low end of the list, and b) define which pairs are dissimilar enough to be excluded from further analysis. The dataset itself follows a rough sigmoid curve, with a few pairs in the 1000s, more in the 100s, a lot in the upper 10's, some in the low 10's, and a few in the single digits. I have tried a few different comparison methods so far, including percent difference and relative percent difference of both the raw and log-transformed data. [more inside]
posted by nekton
on May 24, 2013 -
I'm about to begin a new project that looks at the outcomes of specific events, and would like to query the hivemind to see what kinds of approaches I can take to it. I'm always impressed with the wide variety of approaches to statistical problems I see on here. [more inside]
posted by Tooty McTootsalot
on May 24, 2013 -
Looking for statistical information on how many students study history at the B.A., M.A. and Ph.D. level in various non-US countries. [more inside]
posted by agent99
on May 9, 2013 -
So I'm going to Kenya in 5 weeks' time for some work, and I'm meant to be briefing some colleagues (emphasis on brief) about some aspects of our work tomorrow. Something has leapt out at me, and I don't have the time to research it myself before presenting. [more inside]
posted by smoke
on May 7, 2013 -
In English, scientists customarily use the word "significant" or "statistically significant" to refer to an effect that is distinguished from zero at the p < .05 significance level. On the other hand, the word "significant" in non-technical English carries a connotation of being meaningful, important, or substantial; this creates confusion when researchers write about "a significant effect," since the effect might be significant in the statistical sense while being so small as to be insignificant in the common-English sense.
In your native language, what word is used for "significance" in the statistical context? Is the same word used outside the technical context, and if so, is it a word whose common meaning is something more like "detectable," more like "important," or something else entirely? In particular, does the confusion that arises in English also take place in your language?
posted by escabeche
on Apr 24, 2013 -
I have five structural equation models that are identical except for the final outcome variable. Should I expect the model fit statistics to vary more than negligibly? [more inside]
posted by aaronetc
on Apr 11, 2013 -
I'm curious what the most frequently purchased colors are for regular, non-jean pants for men. I was discussing this with a colleague today, and guessed black and navy. I'm looking specifically for industry numbers or anecdotal evidence from those working in clothing manufacturing or sales. (I already guessed!) Thanks in advance.
posted by whitebird
on Apr 5, 2013 -
Feeling a little stuck in my current job; unsure if I should go for the PhD or cross it off my list and change jobs. [more inside]
posted by un petit cadeau
on Apr 5, 2013 -
I've been thinking about product ratings online. Product A and B both have an average rating of 4 stars. Product A is universally liked: every reviewer gave it 4 stars. Product B is the Twilight Series: lots of people love it (5 stars), but many 1 star reviews drag down the average to 4 stars.
Other than displaying the rating distribution (# of 1 through 5 star reviews), are there well-known formulas that would give Product A a higher rating?
I think what I'm asking about are weighted means, or some sort of formula that takes into account variance or skew.
But rather than re-invent the statistical wheel, I was hoping some of you may be able to point out well-known examples of good weighted formulas, or research related to this question.
Hope this is clear! Thank you!
posted by User7
on Apr 1, 2013 -
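One possible way to make the ratings question above concrete, as a minimal sketch only: subtract a multiple of the standard deviation from the mean, so that a polarizing product scores lower than a uniformly liked one with the same average. The function name `penalized_rating` and the `penalty` weight of 0.5 are assumptions made up for illustration, not an established standard.

```python
from statistics import mean, pstdev


def penalized_rating(stars, penalty=0.5):
    """Mean star rating minus a penalty proportional to the spread.

    `penalty` is an arbitrary illustrative weight: 0 gives the plain mean,
    larger values punish disagreement among reviewers more heavily.
    """
    return mean(stars) - penalty * pstdev(stars)


# Product A: every reviewer gave 4 stars.
product_a = [4, 4, 4, 4]
# Product B: three 5-star reviews for every 1-star review, so the mean is also 4.
product_b = [5, 5, 5, 1]

print(penalized_rating(product_a))
print(penalized_rating(product_b))
```

With any positive `penalty`, Product A keeps its score of 4 while Product B drops below it, because only B has any spread in its reviews.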
I never use SPSS (I hate it like I hate nothing else, except perhaps Excel) but I must use SPSS for this problem. Normally I'd just use R and be done with it, but SPSS is necessary for this problem. How can I get a simple error bar plot from two proportions? [more inside]
posted by Philosopher Dirtbike
on Mar 18, 2013 -
The fallacy is assuming that statistical information about a thing is more relevant in dealing with a particular instance of that thing than available first-hand data. [more inside]
posted by CustooFintel
on Mar 12, 2013 -
I'm trying to analyse a set of biological data for a research project and I'm having trouble finding the appropriate statistical tests to use. [more inside]
posted by snoogles
on Feb 12, 2013 -
StatFilter: Would anybody be able to recommend a good introduction to the statistical computing language "R" that a reasonably quantitatively-adept psychologist might be able to work through on his own? Something like a step-by-step book or textbook with exercises would be great to help me become more fluent in R. (My colleagues at work who use R are primarily computer scientists who either first learned MatLab or are brilliant autodidacts when it comes to learning different scripting languages, and thus don't have any suggestions; Googling has mostly proffered a somewhat obscurely structured guide from the R authors and lots of invocations to just learn on my own, somehow...). I've become familiar with how to do many individually useful tasks in data structuring and analysis, but I feel a bit like a very high-functioning tourist who has learned a lot of phrases to get around but who would be lost and mugged in an alleyway if I strayed off the beaten path.
posted by Keter
on Feb 7, 2013 -
I have a Soundcloud account. I think it's among the better sound file sharing sites, but it can be a little pricey. I am regularly adding new content so I am always butting up against the file size limitations of my account. So, I want to have a way to look at the files in my account and decide which of them need to go, keeping the most productive ones. I want to be able to compare how long a file has been posted to the number of times it has been viewed. I think that deleting a file based just on duration or views could be deceiving because I could delete an old file that is delivering consistent viewers while a new file that starts out fast could drag as time goes on. I can't tell that by just looking, so I need some help. Can I create something in Excel? I also have SPSS. Would creating crosstabs help?
posted by CollectiveMind
on Jan 28, 2013 -
I have recently been introduced to the concept of pseudoreplication as a mistake that people often make when using inferential statistics to evaluate treatment outcomes. My field (evolutionary and conservation biology) makes heavy use of inferential statistics, including techniques that are vulnerable to pseudoreplication, yet nowhere in my formal education have I been taught about how poor experimental design and lack of statistical rigor can lead to fallacies like this. My personal statistical proficiency is poor, but I am working to remedy that. To that end, could folks help me by identifying and ideally explaining whatever other potential pitfalls you can think of, and explaining how they can be avoided through careful experimental design and data-analysis?
posted by Scientist
on Jan 26, 2013 -
I'd like to estimate the number of days a year when the high temperature is likely to be below a particular threshold, e.g. below freezing. This turns out to be harder than expected. [more inside]
posted by jon1270
on Jan 24, 2013 -
The Wikipedia page on statistics about rape shows a very high crime rate for countries like UK, US and Australia in stark contrast to, say, India. The gap can't be explained simply by under-reporting, as that exists in all these countries (even assuming different rates of under-reporting). Is it because these countries have different definitions of rape? Or something else? [more inside]
posted by vidur
on Jan 13, 2013 -
Is scientific research ever organized to search for evidence of absence by reversing the null hypothesis? If not, why not? [more inside]
posted by nathan v
on Dec 23, 2012 -
Can you think of a method that allows an individual to pseudo randomly create a sequence of numbers (at the very least the randomness is opaque to the minds of other people) assuming said individual may only use his mind and body (no physical tools are allowed)? [more inside]
posted by Foci for Analysis
on Dec 21, 2012 -
What statistical test should I use to determine if there is a significant difference in the percent change in the presence of bacterial species observed among five groups before and after treatment? [more inside]
posted by waving
on Dec 18, 2012 -
[StatisticsForTheFeebleMinded] My medical office sees patients who must have monthly blood draws for a condition they have. The samples have the same two tests performed on them at an outside laboratory and by our in-house laboratory. Within the last year, these values have begun to differ wildly. Need recommendations for software/programs/equations/thoughts for analyzing any trend within the numbers that might offer an explanation as to why this is now happening. [more inside]
posted by kuanes
on Dec 11, 2012 -
Is there any kind of listing online that gives average or median apartment sizes by country or, ideally, by city? [more inside]
posted by frimble
on Dec 6, 2012 -
I have a bunch of scores for sites that are the sum of the individual scores of the samples that they contain. The number of samples in each site varies from 1 to several hundred. I would like to normalize the overall site scores to account for the varying number of samples, so that a site with 200 samples doesn't overwhelm a site that has 10 where the site may be just as significant. However, I'm at a complete loss as to how to accomplish this. Any thoughts?
posted by buttercup
on Dec 3, 2012 -
Statistics question about a rare event, and the expected distribution of sightings among witnesses. [more inside]
posted by 517
on Dec 1, 2012 -
In this game, you roll a number of six-sided dice to get a total. The total is either the highest single die result, or the sum of any multiples rolled, whichever is higher.
For example: If I roll three dice and get a 3, 4, and 6, my total is 6. But if I roll a 4, 4, and 6, my total is 8, the sum of the two 4s.
What I want to find out is the mean, median, mode, and standard deviation of the possible totals given N dice. How might I create a simple script to compute this? [more inside]
posted by j0hnpaul
on Nov 30, 2012 -
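For the dice question above, a brute-force script is one plausible approach: enumerate every possible roll of N dice, score each one, and summarize the resulting totals. This is a minimal sketch, not a definitive answer. The function names `roll_total` and `total_stats` are made up for illustration, and the code assumes that only the single best group of matching dice counts (two pairs are not added together), which the question does not spell out.

```python
from collections import Counter
from itertools import product
from statistics import mean, median, multimode, pstdev


def roll_total(dice):
    """Score one roll: the highest die, or the best set of matching dice."""
    counts = Counter(dice)
    # Assumption: only one group of matching faces counts, e.g. two 4s -> 8.
    best_multiple = max(
        (face * n for face, n in counts.items() if n >= 2), default=0
    )
    return max(max(dice), best_multiple)


def total_stats(n_dice):
    """Enumerate all 6**n_dice equally likely rolls and summarize the totals."""
    totals = [roll_total(roll) for roll in product(range(1, 7), repeat=n_dice)]
    return {
        "mean": mean(totals),
        "median": median(totals),
        "mode": multimode(totals),  # a list, since ties are possible
        "stdev": pstdev(totals),    # population SD over all outcomes
    }


print(total_stats(3))
```

Full enumeration grows as 6**N, so this is only practical for a modest number of dice; beyond that, random simulation would be the usual fallback.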
My stats notes are getting too long to distribute using the university printers. A publisher wants to turn them into a printed book, but I want to keep control of the electronic distribution of my work. How should I approach this situation? [more inside]
posted by mixing
on Nov 26, 2012 -
Economics Mathematics: I have a Maths degree but lately I've become interested in Economics (Microeconomics and Macroeconomics) and have been reading some textbooks and classic texts and doing some online lecture courses on Economics. But I find that many of the "handwaving" graphical "proofs" of economic theories lack a sense of mathematical robustness.
Do more thorough mathematics for these ideas exist? Where can I find them? [more inside]
posted by mary8nne
on Nov 16, 2012 -
What kind of statistical analysis would I use to compare the outcomes of a prospective cohort study, one with an intervention and one as the control? [more inside]
posted by legospaceman
on Nov 15, 2012 -
Say that I have a bag which contains 100 balls and every ball in the bag should be red, but it's possible that one or more of these balls is the wrong colour. How many balls should I look at to be 90% sure that all the balls are red? Or 95%? Or 99.9%? Talk me through how to work this out, please?
posted by xchmp
on Nov 12, 2012 -
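One way to work through the bag question above, as a sketch under stated assumptions: being "90% sure all the balls are red" strictly requires a prior belief about how many wrong balls there might be, so the code below answers a closely related question instead. If the bag really contains `bad` wrong-coloured balls, how many must be inspected (without replacement) to have at least the requested chance of catching one? The function names are hypothetical.

```python
from fractions import Fraction
from math import comb


def miss_probability(total, bad, inspected):
    """Chance that `inspected` balls drawn without replacement are all red,
    given that `bad` of the `total` balls are the wrong colour."""
    return Fraction(comb(total - bad, inspected), comb(total, inspected))


def balls_needed(total, bad, confidence):
    """Smallest sample size that catches at least one bad ball with the
    requested probability. Pass `confidence` as a string such as "0.9"
    so the comparison is exact rather than subject to float rounding."""
    target = Fraction(confidence)
    for n in range(total + 1):
        if 1 - miss_probability(total, bad, n) >= target:
            return n
    return None  # unreachable target, e.g. when bad == 0


for level in ("0.9", "0.95", "0.999"):
    print(level, balls_needed(100, 1, level))
```

With a single wrong ball among 100, the chance of missing it after inspecting n balls is (100 - n)/100, so the worst case needs 90 balls for 90% and 95 balls for 95%. The required sample shrinks quickly if more than one wrong ball is assumed.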
What great books or resources are there for practicing probability word problems such as for standardized tests like the GRE? [more inside]
posted by Mr. Papagiorgio
on Nov 8, 2012 -
Statisticsfilter: Given available information about the distribution of self-selected 4-digit passwords (specifically banking PINs), is it possible to calculate the probability of two randomly selected individuals having the same PIN? If so, what're the odds? [more inside]
posted by myrrh
on Oct 27, 2012 -
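For the PIN question above, the core calculation is short: if each PIN i is chosen with probability p_i and two people choose independently, the chance they match is the sum of p_i squared. The sketch below assumes independence and uses placeholder numbers only. The 10% share for "1234" is invented purely to show the effect of skew and is not real data.

```python
def collision_probability(pin_shares):
    """Probability that two independent picks from the same distribution match.

    `pin_shares` maps each PIN to its share of users; shares should sum to 1.
    """
    return sum(share * share for share in pin_shares.values())


all_pins = [f"{n:04d}" for n in range(10_000)]

# Baseline: every 4-digit PIN equally likely.
uniform = {pin: 1 / 10_000 for pin in all_pins}

# Hypothetical skew (made-up figure): 10% of people pick "1234",
# everyone else spreads evenly over the remaining 9,999 PINs.
skewed = {pin: 0.90 / 9_999 for pin in all_pins}
skewed["1234"] = 0.10

print(collision_probability(uniform))
print(collision_probability(skewed))
```

The uniform case gives 1 in 10,000, which is the lowest the collision chance can be. Any concentration of users on popular PINs raises it, so the answer depends on plugging in the actual observed frequencies.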
I'm looking for the percentage of out-of-wedlock births per capita in the United States in 1880 (or, failing that, 1890). So far I'm coming up empty-handed.
Specific stats for the District of Columbia will be even better, but I'm fairly certain that they're not available online.
posted by ryanshepard
on Oct 16, 2012 -
Statistics filter: How can I categorize time series curves into pattern categories? [more inside]
posted by lord_yo
on Oct 11, 2012 -
Please help me figure out potential careers based on my interests and the best paths to obtain them. Psychology, economics, statistics? Market research? Psychometrics? [more inside]
posted by Malleable
on Oct 2, 2012 -
I'm looking to learn how to calculate probabilities for a multi-round dice game. I've researched this question some, and it looks like I might need to know how to use the multinomial distribution, but I can't find any good introductions. Please point me to the most layman-accessible educational material on this subject, and help me to help myself. [more inside]
posted by Richard Daly
on Sep 28, 2012 -
Grad programs--I've just heard (for the first time) that conditional admissions are A Thing. Would I have a snowball's chance with a good GRE score alone or will I have to take pre-req undergrad classes first? [more inside]
posted by wires
on Sep 23, 2012 -
What bio-statistics software experience is most attractive to future employers? [more inside]
posted by waving
on Sep 6, 2012 -
Putting on the math signal: calling the statistics-literate. Trying to change my weight/body composition and track progress in a useful way, but I'm having trouble separating normal daily variation from actual real change. [more inside]
posted by ctmf
on Sep 1, 2012 -
How do I elegantly present tabular, statistical data online and automatically?
I'd love some examples of beautifully presented tabular data online - something that works natively in a browser, ideally also on a tablet and mobile as well. Some interactivity (sorting, filtering) also OK but priority is usability and elegance like you'd find in printed statistical abstracts. Bonus points for open source web tools / frameworks that could help automate this from a database! [more inside]
posted by tkbarbarian
on Aug 30, 2012 -