A Whole World of Data
Statistics and data science training needs to include a range of realistic data to prepare students for the real world. R provides easy access to a rich collection of data sets.
This activity introduces ways you can access real data to use in R-Instat.
First watch the video:
Video script prepared by Rachel Kirk-Gushowaty and Roger Stern. Video constructed and voiced by Beryl Waswa.
- 00:00 Introduction
- 01:17 Introduction to R-Instat
- 01:34 General Datasets
- 02:00 Diamonds data
- 02:55 Graphs
- 04:08 mydata data
- 05:28 efc data
- 06:24 happy data
- 06:57 Datasets for Specific points
- 07:22 Anscombe data
- 07:53 Datasaurusdozen dataset
- 08:09 Graphs
- 09:25 Simpsons paradox dataset
- 11:34 Wikipedia explanation of Simpsons paradox
- 12:18 UCBAdmissions dataset
- 12:47 Datasets from books
- 12:53 Introduction to Data science book
- 13:41 Movie ratings data
- 14:04 dslabs datasets
- 14:47 Datasets from book references
- 15:39 Agriculture data
- 16:35 Gomezsplitssplit dataset
- 16:51 Graphs
- 17:45 Data from lists
- 18:47 Data from outside R packages: The MICS data
- 20:19 Reflections
Then use this practice document to follow along with part or all of the activity.
All of this data is easily accessible through the Import from Library dialog:
Simply use the dropdown menu to select the package and then choose the dataset you want to explore. Click on the R Help button to learn more about the data or click OK to open it.
Package | Datasets
---|---
Agricolae |
Agridat | split.split
Agritutorial |
datasauRus | datasaurus_dozen, simpsons_paradox
datasets | anscombe, UCBAdmissions
dslabs | movielens, historic_co2, divorce_margarine, murders, trump_tweets
ggplot2 | diamonds
openair | mydata
questionr | happy
sjlabelled | efc
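For readers curious about what this looks like in plain R, the sketch below loads two of the datasets from the table using the built-in datasets package. It is only an illustration and assumes nothing about the commands R-Instat itself runs behind the dialog. The ggplot2 step is guarded, since that package may not be installed.

```r
# Load two datasets from the 'datasets' package, which ships with R.
data("anscombe", package = "datasets")
data("UCBAdmissions", package = "datasets")

# Inspect their shape and structure.
dim(anscombe)
str(anscombe)
dim(UCBAdmissions)

# The same pattern applies to other packages in the table,
# provided they are installed (ggplot2 is used here as an example).
if (requireNamespace("ggplot2", quietly = TRUE)) {
  data("diamonds", package = "ggplot2")
  print(head(diamonds))
}

# In an interactive session, open the documentation for a dataset,
# similar in spirit to the R Help button in the dialog.
if (interactive()) {
  help("anscombe", package = "datasets")
}
```

The `package` argument tells `data()` where to look, so the package does not need to be attached with `library()` first.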
The data from lists, from the rcorpora package, is accessed through the New Data Frame dialog.
From here you can use the dropdown lists to browse the available categories and the lists within each one.
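The same browsing can be sketched directly in R. The example below assumes the rcorpora package is installed and that it exposes `categories()` and `corpora()` as described in its documentation. It is a rough illustration, not a description of what the New Data Frame dialog does internally.

```r
# Guarded so the script still runs when rcorpora is not installed.
if (requireNamespace("rcorpora", quietly = TRUE)) {
  # Names of the available categories.
  cats <- rcorpora::categories()
  print(head(cats))

  # Names of all available lists across categories.
  all_lists <- rcorpora::corpora()
  print(head(all_lists))

  # Load the first list by name and look at its structure.
  first_list <- rcorpora::corpora(all_lists[1])
  str(first_list, max.level = 1)
}
```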