Data Science with R
This section centers on the R programming language within the tidy data framework, and so draws on recent advances in data analysis coding. The chapters provide a sophisticated first introduction to the field of data science and a balanced mix of practical skills and generalizable principles.
Chapter
1 Programming Basics
Master the basics of data analysis by manipulating common data structures such as vectors, matrices, and data frames.
Discover conditional statements, loops, and functions to power your own R scripts, and learn to make your R code more efficient using the apply functions.
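As a minimal sketch of what "more efficient using the apply functions" can mean in practice, the example below computes the same per-group means twice, once with an explicit loop and once with vapply(). The scores list and its values are hypothetical, made up purely for illustration.

```r
# Hypothetical sample data: three numeric vectors of test scores.
scores <- list(
  math  = c(90, 80, 70),
  art   = c(60, 75),
  music = c(88, 92, 96, 100)
)

# Loop version: preallocate a named result, then fill it element by element.
means_loop <- numeric(length(scores))
names(means_loop) <- names(scores)
for (subject in names(scores)) {
  means_loop[subject] <- mean(scores[[subject]])
}

# Apply version: vapply() calls mean() on each list element and checks
# that every result is a single numeric value.
means_apply <- vapply(scores, mean, FUN.VALUE = numeric(1))

print(means_apply)
```

Both versions return the same named numeric vector. The vapply() form is shorter and states the expected result type up front, which is often the main practical gain over a hand-written loop.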
1.3 Introduction to the tidyverse
Get started on the path to exploring and visualizing your own data with the tidyverse, a powerful and popular collection of data science tools within R. Discover the fundamentals of the tidyverse and learn all about renaming and reordering variables, while becoming familiar with the binomial distribution.
2 Importing Data
2.1 Introduction to Importing Data
Learn to read .xls, .csv, and text files in R using readxl and gdata, then learn how to use the readr and data.table packages to import flat file data.
2.2 Intermediate Importing Data
Parse data in any format, whether it comes from flat files, statistical software, databases, or the web.
Learn how to efficiently import data from the web into R. Discover how to work with APIs, build your own API client, and access data from Wikipedia and other sources by using R to scrape information from web pages.
3 Data Wrangling
3.1 Data Manipulation with dplyr
Delve further into the tidyverse by learning to transform and aggregate data with dplyr, then add, remove, or change variables. You’ll then apply your skills to a real-world case study.
Learn to combine data across multiple tables to answer more complex questions with dplyr. You’ll learn six different joins, including inner, full, and anti joins.
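To make the difference between inner, full, and anti joins concrete, here is a small sketch that assumes the dplyr package is installed. The customers and orders tables, their columns, and their values are hypothetical.

```r
library(dplyr)

# Hypothetical tables that share an "id" key column.
customers <- data.frame(id = c(1, 2, 3), name = c("Ana", "Ben", "Cy"))
orders    <- data.frame(id = c(2, 3, 4), total = c(20, 35, 50))

# Inner join: keeps only ids present in both tables.
inner <- inner_join(customers, orders, by = "id")

# Full join: keeps every id from either table, filling gaps with NA.
full <- full_join(customers, orders, by = "id")

# Anti join: keeps customers that have no matching order.
anti <- anti_join(customers, orders, by = "id")

print(inner)
print(full)
print(anti)
```

Inner and full joins add columns from the second table, while the anti join only filters rows of the first, so it is typically used to find unmatched records.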
Learn how to use graphical and numerical techniques for exploratory data analysis while generating insightful and beautiful graphics in R.
Use data manipulation and visualization skills to explore the historical voting of the United Nations General Assembly.
Develop the skills you need to go from raw data to awesome insights as quickly and accurately as possible.
3.6 Data Manipulation with data.table
Master core concepts about data manipulation such as filtering, selecting and calculating groupwise statistics using data.table.
3.7 Joining Data with data.table
This course will show you how to combine and merge datasets with data.table.
4 Data Visualization
4.1 Introduction to Data Visualization with ggplot2
Learn to produce meaningful and beautiful data visualizations with ggplot2 by understanding the grammar of graphics.
4.2 Intermediate Data Visualization with ggplot2
Learn to use facets, coordinate systems and statistics in ggplot2 to create meaningful explanatory plots.
5 Statistics
5.1 Introduction to Statistics
Grow your statistical skills and learn how to collect, analyze, and draw accurate conclusions from data. Learn how to work with variables, plotting, and standard deviation in R. The course covers histograms, distributions, and more.
5.2 Foundations of Probability
In this course, you’ll learn about the concepts of random variables, distributions, and conditioning, while gaining intuition for how to solve probability problems through random simulation.
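One way to picture "solving probability problems through random simulation" is to estimate a probability by repeated random draws and compare it with the exact answer. The sketch below uses a made-up question (the chance of at least two heads in three fair coin flips); the seed and the number of trials are arbitrary choices.

```r
set.seed(42)  # arbitrary seed so the run is reproducible

# Simulate 100,000 experiments of three fair coin flips each;
# every draw is the number of heads in one experiment.
heads <- rbinom(100000, size = 3, prob = 0.5)

# Estimated probability of at least two heads.
simulated <- mean(heads >= 2)

# Exact probability from the binomial distribution: 1 - P(X <= 1).
exact <- 1 - pbinom(1, size = 3, prob = 0.5)

print(c(simulated = simulated, exact = exact))
```

The exact value is 0.5, and the simulated estimate should land close to it. Raising the number of trials tightens the estimate.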
5.3 Introduction to Regression
Learn how you can predict housing prices and ad click-through rate by implementing, analyzing, and interpreting linear and logistic regressions using R.
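As a rough illustration of fitting both model types in base R, the sketch below uses the built-in mtcars data set rather than housing or click-through data, which are not available here. The choice of variables is an assumption made only for the example.

```r
# Linear regression: model fuel economy (mpg) as a function of weight (wt).
linear_fit <- lm(mpg ~ wt, data = mtcars)

# Logistic regression: model the probability of a manual transmission
# (am = 1) as a function of weight.
logistic_fit <- glm(am ~ wt, data = mtcars, family = binomial)

# Predicted mpg and predicted probability of a manual transmission
# for a hypothetical car with wt = 3 (weight in units of 1000 lbs).
new_car <- data.frame(wt = 3)
predicted_mpg  <- predict(linear_fit, newdata = new_car)
predicted_prob <- predict(logistic_fit, newdata = new_car, type = "response")

print(coef(linear_fit))
print(coef(logistic_fit))
```

In predict(), type = "response" returns a probability between 0 and 1 for the logistic model instead of a value on the log-odds scale.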
Learn to perform linear and logistic regression with multiple explanatory variables in R. Discover how to include several explanatory variables in a model and how interactions affect predictions.
5.5 Modeling with Data in the Tidyverse
Explore linear regression in a tidy framework. Discover different types of data modeling, including modeling for prediction, and learn how to conduct linear regression and apply model assessment measures in the tidyverse.
In this course you’ll learn about basic experimental design, a crucial part of any data analysis, including block and factorial designs and commonly used statistical tests such as t-tests and ANOVAs in R.
Learn A/B testing, including hypothesis testing, experimental design, and confounding variables.
5.8 Fundamentals of Bayesian Data Analysis
Learn what Bayesian data analysis is, how it works, and why it is a useful tool to have in your data science toolbox.
This four-hour course covers exploratory factor analysis and confirmatory factor analysis in R, used to explore latent variables such as personality.
7 Machine Learning
7.1 Supervised Learning: Classification
A beginner-level introduction to the basics of machine learning for classification, covering four of the most common classification algorithms. You will come away with a basic understanding of how each algorithm approaches a learning task and learn the R functions needed to apply these tools to your own work.
7.2 Supervised Learning: Regression
In this course you will learn how to predict future events using linear regression, generalized additive models, random forests, and xgboost.
This course provides an intro to clustering and dimensionality reduction in R from a machine learning perspective.
7.4 Machine Learning in the tidyverse
Leverage the tools in the tidyverse to generate, explore and evaluate machine learning models.
Develop a strong intuition for how hierarchical and k-means clustering work and learn how to apply them to extract insights from your data.
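For a feel of how the two clustering methods compare, here is a small sketch using only base R functions. The data are hypothetical: two well-separated groups of simulated 2-D points, chosen so that both methods should recover the same grouping.

```r
set.seed(1)  # arbitrary seed so the run is reproducible

# Hypothetical data: 20 points near (0, 0) and 20 points near (5, 5).
pts <- rbind(
  matrix(rnorm(40, mean = 0, sd = 0.3), ncol = 2),
  matrix(rnorm(40, mean = 5, sd = 0.3), ncol = 2)
)

# k-means: partition the points into 2 clusters, trying 10 random starts.
km <- kmeans(pts, centers = 2, nstart = 10)

# Hierarchical clustering: build a tree from pairwise distances,
# then cut it into 2 groups.
hc <- hclust(dist(pts), method = "complete")
hc_labels <- cutree(hc, k = 2)

# Cross-tabulate the two labelings to see where they agree.
print(table(kmeans = km$cluster, hclust = hc_labels))
```

The cluster numbers themselves are arbitrary, so agreement shows up as each k-means cluster mapping onto exactly one hierarchical cluster in the table.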
This course teaches the big ideas in machine learning: how to build and evaluate predictive models, how to tune them for performance, and how to preprocess data.
Learn to streamline your machine learning workflows with tidymodels.
7.8 Machine Learning with Tree-Based Models
Learn how to use tree-based models and ensembles to make classification and regression predictions with tidymodels.
This course will introduce the support vector machine (SVM) using an intuitive, visual approach.
Learn how to fit topic models using the Latent Dirichlet Allocation algorithm.
Use the caret, mlr and h2o packages to find optimal hyperparameters using grid search, random search, adaptive resampling and automatic machine learning.
7.12 Bayesian Regression Modeling
Learn how to leverage Bayesian estimation methods to make better inferences about linear regression models.
Learn how to analyze huge datasets with Apache Spark and R using the sparklyr package.