Learning SQL, Python, R
August 19, 2022 5:25 AM
Please recommend resources for learning SQL, Python or R. I would be using them for data science.
You can assume, or not, basic programming skill and rusty Statistics 101 knowledge.
I am leaning toward something self-directed, with low-to-moderate cost. But I am open to any suggestions.
Official Python tutorial: Python includes a tutorial in the official docs, and I think it's reasonable for learning the language, so you might consider working through that. It's kind of a blend of language reference and tutorial. It may assume some programming knowledge.
Python library docs: I also use these daily to check how library calls work.
posted by TheAdamist at 5:48 AM on August 19, 2022
Data Carpentry offers (usually free) workshops, mostly online these days, but also has their curriculum available (open-access) for self-study.
posted by phlox at 5:50 AM on August 19, 2022
Check out the learning resources list from the r/datascience subreddit:
https://www.reddit.com/r/datascience/wiki/resources/
Also take a look at the r/dataengineering learning resources:
https://dataengineering.wiki/Learning+Resources
posted by needled at 6:00 AM on August 19, 2022
I learned SQL with SQL Tutorial and Python with Learn Python the Hard Way, which, at the time, was a website, but is now apparently a book. I've used Codecademy and Treehouse for several other things and found them helpful, although I will point out that all of the above were for leisure and I've only recently started using SQL at work, so I wasn't super-motivated. I thought they worked well for leisurely learning, at least.
posted by kevinbelt at 6:11 AM on August 19, 2022 [1 favorite]
Response by poster: I forgot to mention that I prefer text-based over video-based, but still open to all suggestions. Thank you for the input so far!
posted by NotLost at 6:46 AM on August 19, 2022
DataCamp was where I cut my teeth before getting my uni certs and on-the-job training. It’s not completely free, but the interactive labs were very helpful. They have a specific track for data science with Python.
My recommendation with both for the real world: find some unnormalized, messy datasets to work with because finding a distinct data point in a table full of similar but slightly different duplicates is hell on earth when you’re used to the clean data in most labs.
I also recommend PySpark if you’re wanting to cut your teeth on ETL work. A lot of companies are shifting to Azure Synapse Analytics and it is one of the main modules you’d work with.
posted by Cyber666 at 6:47 AM on August 19, 2022 [1 favorite]
I agree with TheAdamist that, for learning Python itself (the language and the standard library), it’s hard to beat the official tutorial. It’s well-organized, easy to read, assumes very little about your knowledge, and doesn’t teach obsolete practices.
posted by musicinmybrain at 8:20 AM on August 19, 2022
I watched this incredible free series of courses last summer when it aired live, and it was phenomenal. They met for three info-packed hours on consecutive Saturdays, and by the end, folks who kept up were fully prepared to work on portfolio projects in each subject area. I've used other resources as well, but I think the quality of the instruction and the pedagogical style are really, really excellent.
Data Analysis with Python: zero to PANDAS: https://www.youtube.com/playlist?list=PLyMom0n-MBrpzC91Uo560S4VbsiLYtCwo
Machine Learning with Python: zero to GBMs: https://www.youtube.com/playlist?list=PLyMom0n-MBrq-KvGy4TSEa3PQnZ03OoM6
Deep Learning with PyTorch: Zero to GANS: https://www.youtube.com/playlist?list=PLyMom0n-MBroupZiLfVSZqK5asX8KfoHL
If you want to get a sense of how participants experienced it (testimonials), there's community stories by zero to pandas participants here
Other resources I've found helpful are DataCamp, Data Umbrella, and some Python user groups.
hope this helps : ) if a study buddy would help, say hi (seriously)
posted by elgee at 8:23 AM on August 19, 2022 [1 favorite]
Swirl is a great tutorial for getting started with R.
posted by aws17576 at 8:55 AM on August 19, 2022
Practical SQL was an astoundingly helpful book, and got me started on my SQL journey. It's written by a journalist, so it's more engaging than a lot of IT how-to books.
SQL Murder Mystery is a super fun way to practice your skills (I feel like I heard about this here on the green, but I'm not positive).
posted by missrachael at 9:03 AM on August 19, 2022 [1 favorite]
Python Data Science Handbook was essential for me.
posted by supercres at 9:45 AM on August 19, 2022 [1 favorite]
I used to teach in an information science program at a university. Here are a few of the (free) resources we would use in our courses. They lean toward textbook-style things (so I'm glad you prefer text-based, heh) but all have at least some ancillary materials that are interactive.
For starting from zero with Python: How to Think Like a Computer Scientist. Basically a textbook with interactive programming exercises. Not focused on data science specifically, but covers the basic mechanics of programming in Python.
Think Bayes is a good next step into probabilistic reasoning and estimating from data using the NumPy/SciPy/Pandas stack. More conceptual than applied, in that it won't teach the standard survey of ML models. But most of those models, at heart, use this type of probabilistic reasoning, so if you have the time to spend on this foundation I think it's worthwhile.
If you need to shake the rust off that Statistics 101: OpenIntro Statistics has a book and a number of programming "lab assignments" presented in a variety of languages, including of course R and Python.
posted by egregious theorem at 10:25 AM on August 19, 2022 [2 favorites]
R for Data Science.
The free Kaggle courses are also pretty decent for getting your feet wet.
posted by thebots at 1:26 PM on August 19, 2022
SQL, and text based:
Sams Teach Yourself SQL in 10 Minutes, by Ben Forta
https://forta.com/books/0135182794/
I used one of the first editions nearly two decades ago, and it has continued to be a best seller for very good reason.
It's structured almost like a cookbook: the "10 minutes" means that learning how to do any given task will take less than 10 minutes.
Skim through so that you understand *what* you can do with SQL, then when you need to write an actual query, open it up and review. Super quick.
posted by Elysum at 5:13 PM on August 19, 2022
I’m a beginner myself, but I’m getting a lot out of Head First SQL by Lynn Beighley. It’s very readable even when you’re not on the computer.
posted by Kriesa at 5:30 PM on August 19, 2022
For SQL, I can't recommend Execute Program's course highly enough. It's not cheap at $40/month, but it is fully text-based and does offer 16 free lessons so you can try out the lesson format and see how the spaced repetition exercises work.
(Disclaimer: I know some of the folks behind the site, but I also thought highly enough of EP that I paid for the subscription myself.)
posted by kejadlen at 2:02 PM on August 20, 2022
Response by poster: You've given me a lot to try out. Thank you!
posted by NotLost at 6:53 PM on August 22, 2022
For SQL, there's A Curious Moon and Select Star SQL.
For general Python, I really liked Colt Steele's bootcamp.
posted by taltalim at 9:17 AM on August 29, 2022
This thread is closed to new comments.
posted by entropone at 5:32 AM on August 19, 2022 [1 favorite]