I'm looking for every box score my favorite NBA team played in for this year and last. Where can I go to download this data? [more inside]
posted by antonymous
on Dec 5, 2013 -
I have 50+ responses from large companies for a survey that I've written which has approximately 100 questions. There is no other data that can be linked to this survey. I need to know what I can do with these results and how to do it. [more inside]
posted by JiffyQ
on Dec 1, 2013 -
I work in a University managing the broad based direct mail, email and calling programs. I have zero undergrad or graduate experience with math, business or the social sciences. (Aka, I can write a really nice essay...) I would like to chart a path to being recognized as an expert in predictive analytics. [more inside]
posted by meta x zen
on Nov 26, 2013 -
How do I use SPSS to analyze a range of approval ratings which vary by participant and correlate the skew to one demographic variable? [more inside]
posted by lettuchi
on Nov 25, 2013 -
I like math. Programming is OK, but I don't want to make it my thing. What careers should I be looking at? (Special snowflake details inside.) [more inside]
posted by sqrtofpi
on Nov 25, 2013 -
I'm trying to reconcile two numbers from the same national statistical agency. I'm looking for a dummy's guide for what defines the difference between the two numbers and how to use one to estimate the other. [more inside]
posted by MuffinMan
on Oct 22, 2013 -
I'm thinking about setting up a tongue in cheek project for Halloween that involves showing numbers that are scary. What numbers (with the context of a description) make you anxious, or at least spook you a bit on first glance? [more inside]
posted by mccarty.tim
on Oct 5, 2013 -
How flexible is a master's degree in biostatistics compared to one in applied statistics? Is this even what I want to do? [more inside]
posted by Comet Bug
on Oct 4, 2013 -
I have a list of 15 people. Each person has between 1 and 3 entries in a lottery, for a total of 35 entries. I need to select 9 people from the list of 15--nobody can win more than once.
What is the most transparent, most random, most low-tech way I can do this? [more inside]
posted by tchemgrrl
on Sep 18, 2013 -
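A minimal sketch of the lottery described above, assuming the "names in a hat" procedure: one slip per entry, draw a slip, then remove all of that person's remaining slips before drawing again. The names and the per-person entry counts are hypothetical; they were chosen only to match the stated totals of 15 people and 35 entries.

```python
import random

# Hypothetical entry counts: 15 people, 1 to 3 entries each, 35 entries total.
entries = {f"person_{i:02d}": 3 for i in range(1, 7)}         # 6 people x 3 = 18
entries.update({f"person_{i:02d}": 2 for i in range(7, 15)})  # 8 people x 2 = 16
entries["person_15"] = 1                                      # 1 person  x 1 = 1


def draw_winners(entries, k, rng=None):
    """Draw k distinct winners; each draw is weighted by the slips still in the hat.

    Assumes every person has at least one entry.
    """
    if k > len(entries):
        raise ValueError("cannot pick more winners than there are people")
    rng = rng or random.Random()
    # One slip per entry, like names written on paper.
    hat = [name for name, count in entries.items() for _ in range(count)]
    winners = []
    while len(winners) < k:
        pick = rng.choice(hat)
        winners.append(pick)
        # Remove all of the winner's slips so nobody can win twice.
        hat = [name for name in hat if name != pick]
    return winners


print(draw_winners(entries, 9))
```

The same steps can be done by hand with paper slips, which may be the more transparent option for a small group.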
I post a lot of URLs to social media sites (and since one of those is Twitter I often use url shorteners) that point to my own publishing company's website, and also directly to where I sell my books on Amazon, Barnes & Noble, etc. I know services like ow.ly will track how many clickthroughs a url will get, and I think they can give me multiple shortened urls for the same target url. I'm wondering if any url shortening sites will also let me keep track of all of my shortened urls and give them nicknames or make notes (so I can note where I've used them) and give me a chart or spreadsheet or something that shows me which urls are getting the most traffic. Or if there's an app or separate website where I can enter the info that will then collect the tracking data. I'm trying to avoid having to manually check every url's visitors data.
posted by joannemerriam
on Sep 15, 2013 -
I currently work for a growing company doing various social media marketing for small businesses. I have been finding that I receive a lot of satisfaction doing activities related to what I learned in library school. I enjoy collecting, organizing, and providing data and information for our internal staff and making things approachable. One weakness I see is that we are especially data rich and insight poor with social media. I would like to know if there are any recommended programs for data mining or statistical analysis? [more inside]
posted by andendau
on Sep 8, 2013 -
I'm looking for a word-count tool that will allow me to: set a goal for words written by a specific date, enter in the words I have written each day, see how many words I have remaining toward my goal, and how many words I will need to average each day to reach my goal. [more inside]
posted by Tevin
on Aug 26, 2013 -
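The arithmetic behind the word-count question above is small enough to sketch directly. This is only an illustration, not a recommendation of a particular tool; the function name `progress` and the choice to count days as `deadline - today` are assumptions.

```python
from datetime import date


def progress(goal, daily_counts, today, deadline):
    """Return (words remaining, average words per day needed to hit the goal)."""
    written = sum(daily_counts)
    remaining = max(goal - written, 0)
    # Assumption: the days left are counted from today up to the deadline date.
    days_left = (deadline - today).days
    if days_left <= 0:
        raise ValueError("deadline must be after today")
    return remaining, remaining / days_left


# Hypothetical numbers: a 50,000-word goal with 3,000 words written so far.
remaining, per_day = progress(50000, [1000, 2000], date(2013, 8, 26), date(2013, 9, 5))
print(remaining, per_day)
```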
My office has recently had some funds open up and we are looking into investing in some statistical software to make our lives easier. We do a lot of work with distribution fitting, Monte Carlo analysis, and regression analysis with data sets that may contain left or right censored data. Unfortunately, we only have a few days to identify the best software package for our buck. Alternatively, the idea has been floated to download the free R software and spend the money on some training to get over the steep learning curve. What program or approach would be the best use of our money?
posted by C'est la D.C.
on Aug 12, 2013 -
I'm struggling to understand likelihood ratios (LR) in the context of diagnostic tests, and why a positive LR is influenced by the sensitivity of the test. [more inside]
posted by cacofonie
on Aug 1, 2013 -
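For the likelihood-ratio question above, the standard definitions show where sensitivity enters: the positive LR is the probability of a positive test in people with the condition (the sensitivity) divided by the probability of a positive test in people without it (one minus the specificity). A minimal sketch, with hypothetical function names and example values:

```python
def positive_lr(sensitivity, specificity):
    """LR+ = P(test positive | disease) / P(test positive | no disease)."""
    return sensitivity / (1.0 - specificity)


def negative_lr(sensitivity, specificity):
    """LR- = P(test negative | disease) / P(test negative | no disease)."""
    return (1.0 - sensitivity) / specificity


# Hypothetical test: 90% sensitivity, 80% specificity.
print(positive_lr(0.9, 0.8))
print(negative_lr(0.9, 0.8))
```

With specificity held fixed, raising the sensitivity raises the numerator of LR+, which is why the positive LR depends on both quantities.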
We have a group of six people with 55 different options. Each member of the party has to vote for each option under 8 different analyses i.e. appearance, distinctness, etc. The options are quality weighted. [more inside]
posted by trashcan
on Jul 10, 2013 -
So I take it that the OkTrends blog was killed off after Match bought OkCupid. Where can I now get my regular fix of really interesting statistics presented at a level that the lay person can understand? (I already know about Nate Silver and xkcd's What If.)
posted by capricorn
on Jul 7, 2013 -
I'm working on a problem for "Inferences about the difference between two population means for independent samples: sigma 1 & 2 unknown and unequal."
The final value of "test statistic t" falls in the rejection region for 95% confidence interval, but falls in the nonrejection region for 99% confidence interval.
Should I perform additional calculations before rejecting my null hypothesis?
posted by iamcharity
on Jul 1, 2013 -
I have over 10 years of sent email sitting in a folder. Are there any tools (preferably for OS X or *nix, but anything interesting is welcome) that I can use to generate interesting statistics, or draw pretty graphs, word clouds... basically anything interesting that works on a huge number of emails.
posted by Mwongozi
on Jun 28, 2013 -
Help me with statistics and Excel. Especially help me if you know any labor saving methods. I want the median, mean and standard deviation for the average price of all items sold, but my spreadsheet-full-of-data doesn't tell me the price of each sale -- just the average price per store, and the number sold at that store. Something like this: [more inside]
posted by croutonsupafreak
on Jun 21, 2013 -
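For the Excel question above, the overall mean price can be recovered from per-store averages by weighting each store's average by the number sold there. The median and standard deviation of individual sale prices cannot be recovered exactly from store averages alone, so this sketch covers only the mean. The store figures are hypothetical.

```python
def weighted_mean(stores):
    """Overall average price from (average price, units sold) pairs, one per store."""
    total_units = sum(units for _, units in stores)
    total_revenue = sum(price * units for price, units in stores)
    return total_revenue / total_units


# Hypothetical data: (average price at the store, number sold at the store).
stores = [(10.0, 2), (20.0, 6)]
print(weighted_mean(stores))
```

In a spreadsheet the same calculation is the sum of price times quantity divided by the sum of quantity.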
So I'm beginning a statistics PhD program this fall and I'm concerned that my math skills have gotten rusty since I haven't done anything related for the past two years. I've been working as an actuary since I graduated college but I don't do that much math--mostly a lot of programming. Has anybody been in a similar situation to me? How was the adjustment for you? I'm considering retaking advanced calculus and linear algebra during my first year (probably next summer before I take 2nd year advanced courses) just to refresh myself again. I'm aware some people may think this is kind of pathetic but I'd rather be safe than sorry. Besides, it's only my first year. Is this frowned upon? [more inside]
posted by molamola
on Jun 17, 2013 -
I'm running into trouble with my statistics course. I'm just getting up to t statistics for independent measures research design. My problems are:
1. I'm going through a lot of paper
2. I need to keep all my calculations better organized as I do them
3. I'm flipping back and forth between my book, an online version of the book, and another screen so that I can reference as much material as possible at once. I'm thinking some kind of basic statistics calculator spreadsheet (or any other format) would be in order. Can anybody direct me towards one? [more inside]
posted by Che boludo!
on Jun 8, 2013 -
What are the limits to bedbugs? Why isn't every hotel room infested given how tough they are claimed to be? Is there any evidence on the chances of taking bed bugs home from a hotel with you? Will the bedbug infestation rates go ever upwards? Why or why not? Interested in aggregated, rather than anecdotal evidence here. [more inside]
posted by mister_kaupungister
on May 31, 2013 -
I come from an engineering background rather than a research background, and I find myself lacking in vocabulary when it comes to understanding research papers, particularly when they start talking about ANOVA analyses, F(x) effect sizes and p values. I can skim through the results of a study and see that certain numbers are bigger than other numbers, but I don't really know how to tell whether what I'm seeing is significant. I'm guessing that I'm missing basic education in statistics. Can I fix this in a simple way?
posted by sdis
on May 28, 2013 -
I was recently sitting down to tea with a friend of mine (we're both in our early 20's), and the topic wandered over to hard drug use (e.g. stuff like cocaine and crystal meth, not marijuana or alcohol.) When comparing our perceptions of how common hard drug use was, we were completely surprised when our answers were polar opposites: I saw it as an extremely rare thing, but she said it was something virtually everyone did but no one talked about. What's the truth here? How prevalent is hard drug use anyway? And why do our experiences differ so much? [more inside]
posted by Conspire
on May 24, 2013 -
I have a list of paired numbers that span multiple orders of magnitude, and I need to find a method to a) compare within each pair in a way that does not disproportionately bias the comparison at the high or low end of the list, and b) define which pairs are dissimilar enough to be excluded from further analysis. The dataset itself follows a rough sigmoid curve, with a few pairs in the 1000s, more in the 100s, a lot in the upper 10's, some in the low 10's, and a few in the single digits. I have tried a few different comparison methods so far, including percent difference and relative percent difference of both the raw and log-transformed data. [more inside]
posted by nekton
on May 24, 2013 -
I'm about to begin a new project that looks at the outcomes of specific events, and would like to query the hivemind to see what kinds of approaches I can take to it. I'm always impressed with the wide variety of approaches to statistical problems I see on here. [more inside]
posted by Tooty McTootsalot
on May 24, 2013 -
Looking for statistical information on how many students study history at the B.A., M.A. and Ph.D. level in various non-US countries. [more inside]
posted by agent99
on May 9, 2013 -
So I'm going to Kenya in 5 weeks time for some work, and I'm meant to be briefing some colleagues (emphasis on brief) about some aspects of our work tomorrow. Something has leapt out at me, and I don't have the time to research it myself before presenting. [more inside]
posted by smoke
on May 7, 2013 -
In English, scientists customarily use the word "significant" or "statistically significant" to refer to an effect that is distinguished from zero at a p < .05 confidence level. On the other hand, the word "significant" in non-technical English carries a connotation of being meaningful, important, or substantial; this creates confusion when researchers write about "a significant effect," since the effect might be significant in the statistical sense while being so small as to be insignificant in the common-English sense.
In your native language, what word is used for "significance" in the statistical context? Is the same word used outside the technical context, and if so, is it a word whose common meaning is something more like "detectable," more like "important," or something else entirely? In particular, does the confusion that arises in English also take place in your language?
posted by escabeche
on Apr 24, 2013 -
I have five structural equation models that are identical except for the final outcome variable. Should I expect the model fit statistics to vary more than negligibly? [more inside]
posted by aaronetc
on Apr 11, 2013 -
I'm curious what the most frequently purchased colors are for regular, non-jean pants for men. I was discussing this with a colleague today, and guessed black and navy. I'm looking specifically for industry numbers or anecdotal evidence from those working in clothing manufacturing or sales. (I already guessed!) Thanks in advance.
posted by whitebird
on Apr 5, 2013 -
Feeling a little stuck in my current job; unsure if I should go for the PhD or cross it off my list and change jobs. [more inside]
posted by un petit cadeau
on Apr 5, 2013 -
I've been thinking about product ratings online. Product A and B both have an average rating of 4 stars. Product A is universally liked: every reviewer gave it 4 stars. Product B is the Twilight Series: lots of people love it (5 stars), but many 1 star reviews drag down the average to 4 stars.
Other than displaying the rating distribution (# of 1 through 5 star reviews), are there well-known formulas that would give Product A a higher rating?
I think what I'm asking about are weighted means, or some sort of formula that takes into account variance or skew.
But rather than re-invent the statistical wheel, I was hoping some of you may be able to point out well-known examples of good weighted formulas, or research related to this question.
Hope this is clear! Thank you!
posted by User7
on Apr 1, 2013 -
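One simple way to illustrate the idea in the ratings question above is to subtract a multiple of the standard deviation from the mean, so that a polarizing product scores lower than a consistent one with the same average. This is only an illustrative sketch, not an established ranking formula; the name `penalized_rating`, the weight `k`, and the review counts are all assumptions.

```python
from statistics import mean, pstdev


def penalized_rating(stars, k=0.5):
    """Mean rating minus k population standard deviations (k is an arbitrary weight)."""
    return mean(stars) - k * pstdev(stars)


# Hypothetical review sets that both average exactly 4 stars.
product_a = [4] * 100             # every reviewer gave 4 stars
product_b = [5] * 75 + [1] * 25   # mostly 5 stars, with many 1-star reviews

print(penalized_rating(product_a))
print(penalized_rating(product_b))
```

Product A keeps its score because its ratings have no spread, while Product B is marked down in proportion to how divided its reviewers are.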
I never use SPSS (I hate it like I hate nothing else, except perhaps Excel) but I must use SPSS for this problem. Normally I'd just use R and be done with it, but SPSS is necessary for this problem. How can I get a simple error bar plot from two proportions? [more inside]
posted by Philosopher Dirtbike
on Mar 18, 2013 -
The fallacy is assuming that statistical information about a thing is more relevant in dealing with a particular instance of that thing than available first-hand data. [more inside]
posted by CustooFintel
on Mar 12, 2013 -
I'm trying to analyse a set of biological data for a research project and I'm having trouble finding the appropriate statistical tests to use. [more inside]
posted by snoogles
on Feb 12, 2013 -
StatFilter: Would anybody be able to recommend a good introduction to the statistical computing language "R" that a reasonably quantitatively-adept psychologist might be able to work through on his own? Something like a step-by-step book or textbook with exercises would be great to help me become more fluent in R. (My colleagues at work who use R are primarily computer scientists who either first learned MatLab or are brilliant autodidacts when it comes to learning different scripting languages, and thus don't have any suggestions; Googling has mostly proffered a somewhat obscurely structured guide from the R authors and lots of invocations to just learn on my own, somehow...). I've become familiar with how to do many individually useful tasks in data structuring and analysis, but I feel a bit like a very high-functioning tourist who has learned a lot of phrases to get around but who would be lost and mugged in an alleyway if I strayed off the beaten path.
posted by Keter
on Feb 7, 2013 -
I have a Soundcloud account. I think it's among the better sound file sharing sites, but it can be a little pricey. I am regularly adding new content so I am always butting up against the file size limitations of my account. So, I want to have a way to look at the files in my account and decide which of them needs to go and keep the most productive ones. I want to be able to compare how long a file has been posted to number of times it has been viewed. I think that deleting a file based just on duration or views could be deceiving because I could delete an old account that is delivering consistent viewers while a new account that starts out fast could drag as time goes on. I can't tell that by just looking, so I need some help. Can I create something in Excel? I also have SPSS. Would creating crosstabs help?
posted by CollectiveMind
on Jan 28, 2013 -
I have recently been introduced to the concept of pseudoreplication as a mistake that people often make when using inferential statistics to evaluate treatment outcomes. My field (evolutionary and conservation biology) makes heavy use of inferential statistics, including techniques that are vulnerable to pseudoreplication, yet nowhere in my formal education have I been taught about how poor experimental design and lack of statistical rigor can lead to fallacies like this. My personal statistical proficiency is poor, but I am working to remedy that. To that end, could folks help me by identifying and ideally explaining whatever other potential pitfalls you can think of, and explaining how they can be avoided through careful experimental design and data-analysis?
posted by Scientist
on Jan 26, 2013 -
I'd like to estimate the number of days a year when the high temperature is likely to be below a particular threshold, e.g. below freezing. This turns out to be harder than expected. [more inside]
posted by jon1270
on Jan 24, 2013 -
The Wikipedia page on statistics about rape shows a very high crime rate for countries like UK, US and Australia in stark contrast to, say, India. The gap can't be explained simply by under-reporting, as that exists in all these countries (even assuming different rates of under-reporting). Is it because these countries have different definitions of rape? Or something else? [more inside]
posted by vidur
on Jan 13, 2013 -
Is scientific research ever organized to search for evidence of absence by reversing the null hypothesis? If not, why not? [more inside]
posted by nathan v
on Dec 23, 2012 -
Can you think of a method that allows an individual to pseudo randomly create a sequence of numbers (at the very least the randomness is opaque to the minds of other people) assuming said individual may only use his mind and body (no physical tools are allowed)? [more inside]
posted by Foci for Analysis
on Dec 21, 2012 -