60 posts tagged with data *and* statistics.

Displaying 1 through 50 of 60.

## Quick Data Collection Problem

I have a CSV with one column that contains User IDs. I need to count how many times each User ID appears in the document and output the counts to another CSV that has something like UID1 >> 3; UID4 >> 1; UID20 >> 0. There are thousands of User IDs, so I need to be able to set up a range for it to scan for. Is this possible with free software and not much programming expertise? [more inside]
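For what it's worth, here is a minimal sketch of one way to do this with free software (Python's standard library). The sample IDs, the `ids_to_scan` range, and the function name `count_user_ids` are all made up for illustration; a real run would read from and write to actual files instead of in-memory strings.

```python
import csv
import io
from collections import Counter


def count_user_ids(rows, id_range):
    """Count how often each ID in id_range appears in the first column of rows.

    IDs in id_range that never appear are reported with a count of 0.
    """
    counts = Counter(row[0].strip() for row in rows if row)
    return [(uid, counts.get(uid, 0)) for uid in id_range]


# Hypothetical one-column input; in practice use open("input.csv", newline="").
sample = io.StringIO("UID1\nUID4\nUID1\nUID1\n")

# Hypothetical range of IDs to scan for (assumed to look like "UID<number>").
ids_to_scan = [f"UID{n}" for n in (1, 4, 20)]

result = count_user_ids(csv.reader(sample), ids_to_scan)

# Write the (ID, count) pairs out as a second CSV (here to an in-memory buffer).
out = io.StringIO()
csv.writer(out).writerows(result)
print(out.getvalue())
```

Building the range up front is what lets IDs with zero occurrences (like UID20 above) show up in the output at all; a plain tally would silently skip them.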

## Reliable demographic information on tertiary students globally

I need to find out accurate information showing the quantity of university/tertiary/higher education students and lecturers, by age, for Australia and internationally. [more inside]

## What Do I Do With All This Data?

I have a research project involving about 90 subjects in two (self-selected) groups. I have collected a number of variables about these subjects (from public sources), and now I would like to do some statistical tests to say whether these variables are significantly correlated with membership in one group or the other. How do I do this? I have a basic knowledge of R and access to Stata 14. [more inside]

## Cubicle Denizen Dilemma: the name for an “average” if the data are weird

Our company has decided to measure the jobs we do by assigning points to tasks. The problem here is that departments don’t do the same thing, or the same amount of things. What concerns me is that the company wants to publish a company-wide average, against which members in each department will be compared. [more inside]

## Looking for an online traffic counter I can query with curl

I am searching for a live traffic flow counter which is on the net and reachable by http, or some webpage that displays hourly or cumulative vehicle statistics. [more inside]

## Statistics for approval correlating demographic variable?

How do I use SPSS to analyze a range of approval ratings which vary by participant and correlate the skew to one demographic variable? [more inside]

## Career Development Suggestions for making sense of and displaying data

I currently work for a growing company doing various social media marketing for small businesses. I have been finding that I receive a lot of satisfaction doing activities related to what I learned in library school. I enjoy collecting, organizing, and providing data and information for our internal staff and making things approachable. One weakness I see is that we are especially data rich and insight poor with social media. I would like to know if there are any recommended programs for data mining or statistical analysis? [more inside]

## Scaling issue calculating similarity between paired numbers

I have a list of paired numbers that span multiple orders of magnitude, and I need to find a method to a) compare within each pair in a way that does not disproportionately bias the comparison at the high or low end of the list, and b) define which pairs are dissimilar enough to be excluded from further analysis. The dataset itself follows a rough sigmoid curve, with a few pairs in the 1000s, more in the 100s, a lot in the upper 10's, some in the low 10's, and a few in the single digits. I have tried a few different comparison methods so far, including percent difference and relative percent difference of both the raw and log-transformed data. [more inside]
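One option sometimes used for data spanning several orders of magnitude is the absolute log ratio, which treats 10 vs. 20 the same as 1000 vs. 2000. The sketch below is only an illustration under assumptions: the pairs are invented, all values are assumed to be positive, and the 2-fold cutoff (`max_fold`) is an arbitrary placeholder, not a recommendation for this dataset.

```python
import math


def log_ratio(a, b):
    """Absolute natural-log ratio of a pair; symmetric in a and b, needs a, b > 0."""
    return abs(math.log(a / b))


def split_pairs(pairs, max_fold=2.0):
    """Split pairs into (kept, dropped) by whether they differ by more than max_fold."""
    cutoff = math.log(max_fold)
    kept = [p for p in pairs if log_ratio(*p) <= cutoff]
    dropped = [p for p in pairs if log_ratio(*p) > cutoff]
    return kept, dropped


# Invented example pairs covering several orders of magnitude.
pairs = [(1200, 1500), (150, 90), (40, 95), (3, 4), (2, 7)]
kept, dropped = split_pairs(pairs)

for a, b in pairs:
    print(f"{a:>5} vs {b:>5}: log ratio = {log_ratio(a, b):.3f}")
print("kept:", kept)
print("dropped:", dropped)
```

Because the measure depends only on the ratio within a pair, the same cutoff applies at the high and low ends of the list. Whether that is appropriate for very small counts, where noise is relatively larger, is a separate question.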

## Weather, damn cold weather, and statistics

I'd like to estimate the number of days a year when the high temperature is likely to be below a particular threshold, e.g. below freezing. This turns out to be harder than expected. [more inside]

## How do I elegantly present tabular, statistical data online?

How do I elegantly present tabular, statistical data online and automatically?
I'd love some examples of beautifully presented tabular data online - something that works natively in a browser, ideally on tablet and mobile as well. Some interactivity (sorting, filtering) is also OK, but the priority is usability and elegance like you'd find in printed statistical abstracts. Bonus points for open source web tools / frameworks that could help automate this from a database! [more inside]

## What is the best way to learn R?

What is the best way for me to learn R? In particular, what are the best websites or online tutorials for learning to deal with large datasets? [more inside]

## learn data science

I'd like to learn about data science. Things like predictive modelling, regression and classification and so on. What would be good books or online courses to start with?

## It's hard to Google infographics

I'm looking for an infographic (?) about how lucky it is to have been born in the developed world. I remember a format similar to "If you are also literate, then you are already in the top x% of the world," with various characteristics substituted in for "literate." [more inside]

## I'm a data scientist! What does that mean?

Can you point me to the best resources to learn about these new-fangled things they call *data science* and *big data*? I just started a new job as a *data scientist* and need to get up to speed. [more inside]

## Looking for cancer survival statistics by individual hospitals in England.

Looking for cancer survival statistics by individual hospitals in England. [more inside]

## Help me help her.

Nutshell: I need a reputable, authoritative source of data about how many times the average victim of abuse/domestic violence/intimate partner violence returns to the relationship. [more inside]

## help finding patterns in data

Given a set of columnar data, some of which are categorical and others that are numerical, how can I identify which category columns are responsible for significant changes in one or more of the numerical columns? [more inside]

## 30 Vegans Agree...."Better to be overprepared than underprepared."

How do I go about gathering statistics regarding vegan/vegetarians? [more inside]

## Value Over Replacement Database

Sports nerds of MeFi: where do you go for your downloadable sports statistics database needs? [more inside]

## In search of SAS-fu

What are your beginner/intermediate SAS handbook recommendations? [more inside]

## Numbers in the news

What's that quote? Something about each statistic in a news article reducing readership by x%. I can't for the life of me remember the exact words and who said it.

## So how do people make those nice graphs?

What are some super simple graph making programs? [more inside]

## Software for beautiful, interactive data visualization

Do you have recommendations for good data visualization software for the Hans Rosling fan? [more inside]

## What are some things I can do with a metric ton of housing data?

I'm trying to figure out interesting ways to slice/combine/aggregate a ton of data into useful or interesting statistics. Where do I begin? [more inside]

## Where is data on taxes by congressional district?

Where can I find data on US federal income tax brackets by congressional district? My goal is making a fairly accurate statement of the form "Congressperson X represents a district where Y people are in tax bracket Z."

## R/S-Plus: How can I merge two dataframes of different lengths?

Merging datasets in R. I seem to have forgotten how to do some things. [more inside]

## Likert nightmares

Likert scale survey with 3 matrices, each examining a different factor by asking 8 questions to be ranked. I have the 110 responses I wanted, broken down by some demographics. Now what? [more inside]

## Fun Statistical Data Sets?

I'm looking for data to use in a statistics class. The data set needs to have fewer than 25 variables and at least 200 individual records. Any fun/unique/silly ideas? I need to write a paper analyzing this data - I'd like it to be interesting. I need access to raw data - it is okay if it has already been analyzed but I need to show that I can analyze it.

## Textbooks on data mining techniques / statistical analysis on large data sets?

Textbooks on data mining techniques / statistical analysis on large data sets? [more inside]

## Reconstructing data from statistics

## These numbers, they vibrate?

How do I become a stats and data whiz? [more inside]

## Something something quantified self something something

Hello MeFites. Lately, I've become interested in the idea of "Quantified Self", essentially, self tracking. A diary of daily events, such as when I woke up, how much coffee I've had, my productivity level and so forth. I'd like to collect some of this data to see if I can deduce some patterns. Maybe when I drink coffee, I go to bed later, and I'm not as productive for example. To achieve this, I think I need a piece of software. [more inside]

## What is normal?

Statistics-filter: Two relatively (I think) simple statistics questions. [more inside]

## Seeking advice/help about statistical tests of significance.

I have a wonderfully large dataset that I'm working with for a long-term project. I am analyzing a small section of the dataset for my master's thesis. In meeting with my thesis advisor last week, she suggested I run some statistical tests of significance on the 4 tables I'm working with. She knows that I am yet to be versed in quantitative analysis methods (I've done solely qualitative work thus far) and that I'm under a massive time crunch to get this done. She suggested I seek help from others, as she doesn't want me to get bogged down with figuring out this step, and would rather I concentrate on analyzing the other aspects of this data. To this end, I'm wondering if somebody might be able to suggest the best type of test of significance to run, the easiest way to run it, and a good, simple resource for what the resultant values mean? [more inside]

## You Are Not My Slum Statistician

Help me find useful statistics and information on the Kibera slum in Nairobi. [more inside]

## Where does this data about Social Media come from?

Does anyone know the source of this bit of Social Media data? [more inside]

## What interesting statistical information can I dig out of this medical billing data?

What interesting statistical information can I dig out of this medical billing data? [more inside]

## How does the CDC measure the spread of H1N1?

There's no special place to turn up if you think you've got the swine flu to be tested or otherwise counted--hospitals and clinics tell people to just stay home unless they are having actual health complications. How is the CDC able to say that 22 million people have been infected with H1N1 when nobody will even test you for it unless you have to be hospitalized?

## Rates of success?

Statistics question: is it possible to test sets of cumulative data for significant differences in rate? [more inside]

## What method or type of software is best for collecting complex information for future analysis?

Lots of interrelated data, little idea of how to analyze it. What method or type of software is best for collecting complex information for future analysis? [more inside]

## Infogasm

I have some friends collecting movie & music data for the month of August - what are some innovative/interesting things I can do with the results? [more inside]

## ISO data

Google-fu masters: I need some stats, STAT. [more inside]

## How can I convert a SAS dataset into something readable by R?

How to import a SAS dataset into R (with, unfortunately, one extra degree of difficulty...)? [more inside]

## Need advice about statistics...

Am I distorting my data, and not showing the true picture? [more inside]

## Rape records from early US history

Where can I find antebellum white-on-black rape statistics? [more inside]

## Bimodal or am I biased?

StatisticsFilter: how can I find out whether my data is bimodal? [more inside]

## Mashing up government data

I'm trying to lead a crusade for government to publish its statistics and data in a way that is mashable. Is it possible to define a standard digital format that could apply to a diverse array of data sets? [more inside]

## Visually exploring and representing survey data.

Best ways to visually explore a large survey data set? [more inside]

## Past performance, future results, &c

Statistics-filter: I need to establish to what extent student performance on a particular standardized test is predicted by each of the following: GPA, standardized test scores and a couple of other miscellaneous numerical factors. How do I go about this? [more inside]

## Statistics Question

Statistics filter: Interval or Ordinal data? [more inside]
