Course: Data Science with R
Author: Matthew Renze
The course consists of the following modules:
Introduction to Data Science. It’s about what data science is and the skills that a data scientist should have.
Introduction to R. The basics of R, including the RStudio UI and the various data types in R.
Working with Data. It highlights some common data cleaning methods used when exploring data, including the arrange, filter, mutate, and summarize functions.
Creating Descriptive Statistics. This module requires some basic statistics knowledge. It demonstrates how to get numeric statistics and correlation coefficients from a dataset.
Creating Data Visualizations. It introduces the ggplot2 package and demonstrates how to use it to draw plots such as histograms and scatter plots.
Creating Statistical Models. This module requires some basic statistics knowledge. It introduces types of statistical models and how to create a simple linear regression model in R using
Handling Big Data. This module explains what is considered big data and when to handle it. It introduces the biglm package and showcases how to create a linear regression model in a big data situation.
Predicting with Machine Learning. It briefly describes what machine learning could do, types of machine learning and the overall process. In the demo, it shows a simple example of training a decision tree model and predicting with it using
Deploying to Production. This module introduces Shiny and demonstrates how to use it to create a Shiny UI and a Shiny app.
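To give a flavor of what the descriptive statistics and statistical model modules cover, here is a minimal sketch of my own. It is not code from the course. It assumes base R only, uses the built-in mtcars dataset rather than the course's data, and picks summary(), cor(), lm() and predict() as illustrative functions; the course demos may well use different ones.

```r
# Illustrative only: mtcars ships with R and is NOT the course's dataset.
data(mtcars)

# Descriptive statistics: numeric summary (min, quartiles, mean, max) of mpg.
mpg_summary <- summary(mtcars$mpg)
print(mpg_summary)

# Correlation coefficient between car weight and fuel efficiency.
wt_mpg_cor <- cor(mtcars$wt, mtcars$mpg)
print(wt_mpg_cor)

# Simple linear regression: model mpg as a linear function of weight.
model <- lm(mpg ~ wt, data = mtcars)
print(coef(model))

# Predict mpg for a hypothetical car (wt is measured in units of 1000 lbs).
new_car <- data.frame(wt = 3)
predicted_mpg <- predict(model, newdata = new_car)
print(predicted_mpg)
```

The correlation comes out negative, and the fitted slope has the same sign, which is the kind of link between a descriptive statistic and a model that the course builds up to.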
This course is clearly aimed at beginners who know little about data science. It helps the audience understand what data scientists do, so it is useful for anyone who wants to advance in or switch to this career.
The prerequisites say “basic knowledge of programming and statistics”, which is accurate because some of the modules do require that background. However, even without it you can still get through the entire course, though you might not fully understand some of the theoretical concepts.
Overall, I think it’s a great course as a high-level introduction. It touches on everything about data science, on both the programming side and the statistics side, without going too deep. The course is well structured and gives the audience enough guidance on next steps if they want to dig deeper.
I really like the demo in each module, which shows how to achieve the goal in R and helps me connect theory with practice. If you already know R, as I do, it’s fine to skip some modules and only watch the ones that are new to you.