Welcome back to another semester of our UseR.
Weekly challenge
The data
Work with the ecology dataset from Data Carpentry. An explanation of the dataset can be found here
library(tidyverse)
mydata <- read_csv("https://ndownloader.figshare.com/files/2292169")
glimpse(mydata)
## Observations: 34,786
## Variables: 13
## $ record_id       <int> 1, 72, 224, 266, 349, 363, 435, 506, 588, 661,...
## $ month           <int> 7, 8, 9, 10, 11, 11, 12, 1, 2, 3, 4, 5, 6, 8, ...
## $ day             <int> 16, 19, 13, 16, 12, 12, 10, 8, 18, 11, 8, 6, 9...
## $ year            <int> 1977, 1977, 1977, 1977, 1977, 1977, 1977, 1978...
## $ plot_id         <int> 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2...
## $ species_id      <chr> "NL", "NL", "NL", "NL", "NL", "NL", "NL", "NL"...
## $ sex             <chr> "M", "M", NA, NA, NA, NA, NA, NA, "M", NA, NA,...
## $ hindfoot_length <int> 32, 31, NA, NA, NA, NA, NA, NA, NA, NA, NA, 32...
## $ weight          <int> NA, NA, NA, NA, NA, NA, NA, NA, 218, NA, NA, 2...
## $ genus           <chr> "Neotoma", "Neotoma", "Neotoma", "Neotoma", "N...
## $ species         <chr> "albigula", "albigula", "albigula", "albigula"...
## $ taxa            <chr> "Rodent", "Rodent", "Rodent", "Rodent", "Roden...
## $ plot_type       <chr> "Control", "Control", "Control", "Control", "C...
In-class challenges
1. Find the average hindfoot_length
mean(mydata$hindfoot_length)
## [1] NA
# Notice the answer is NA. This is because the data contain NAs, which must be removed before averaging. This can be accomplished in two ways:
# 1. Remove the NAs from the data, store the result in a new object, and then take the mean
hindfoot_length.rev <- mydata$hindfoot_length[!is.na(mydata$hindfoot_length)]
avg_foot <- mean(hindfoot_length.rev)
avg_foot
## [1] 29.28793
# 2. Use the na.rm option within the mean function
?mean
avg_foot <- mean(mydata$hindfoot_length, na.rm = TRUE)
avg_foot
## [1] 29.28793
2. How many are above and below average
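The steps for this challenge rely on two base R behaviours: a comparison against NA returns NA, and sum() counts TRUE as 1 and FALSE as 0. Here is a minimal sketch of the idea on a made-up vector. The name `toy` and its values are illustrative assumptions, not taken from the ecology data:

```r
# Toy vector (hypothetical values, not from the dataset)
toy <- c(10, 20, NA, 40, 50)

# The mean is NA unless missing values are dropped
toy_avg <- mean(toy, na.rm = TRUE)

# Comparing against NA yields NA, so the logical vectors keep the gaps
below <- toy < toy_avg
above <- toy > toy_avg

# TRUE counts as 1 and FALSE as 0; na.rm = TRUE skips the NA entries
n_below <- sum(below, na.rm = TRUE)
n_above <- sum(above, na.rm = TRUE)

# Number of missing values that were left out of both counts
n_missing <- sum(is.na(toy))
```

Note that a value exactly equal to the average would fall into neither group, so the two counts plus the number of NAs need not add up to the length of the vector.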
# Step 1 - find the average
avg <- mean(mydata$hindfoot_length, na.rm = TRUE)
# Step 2 - build a logical vector flagging the values below the average
lessthan <- mydata$hindfoot_length < avg
head(lessthan, 25) # display the first 25 elements
##  [1] FALSE FALSE    NA    NA    NA    NA    NA    NA    NA    NA    NA
## [12] FALSE    NA FALSE FALSE    NA    NA FALSE FALSE FALSE FALSE FALSE
## [23] FALSE    NA FALSE
greaterthan <- mydata$hindfoot_length > avg
# Step 3 - count the TRUE values in each logical vector
sum(lessthan, na.rm = TRUE)
## [1] 15371
sum(greaterthan, na.rm = TRUE)
## [1] 16067
Take-home challenges
1. What are the names of the plot types (treatments) in the experiment?
2. How many species were caught?
3. How many species of birds? Rodents?
4. Average weight of Male Rodents?
5. Average weight of Female Rodents?