Welcome back to another semester of our UseR.

R script from class

PowerPoint from class

Weekly challenge

The data

Work with the ecology dataset from Data Carpentry. An explanation of the dataset can be found here.

library(tidyverse)

mydata <- read_csv("https://ndownloader.figshare.com/files/2292169")

glimpse(mydata)
## Observations: 34,786
## Variables: 13
## $ record_id <int> 1, 72, 224, 266, 349, 363, 435, 506, 588, 661,...
## $ month <int> 7, 8, 9, 10, 11, 11, 12, 1, 2, 3, 4, 5, 6, 8, ...
## $ day <int> 16, 19, 13, 16, 12, 12, 10, 8, 18, 11, 8, 6, 9...
## $ year <int> 1977, 1977, 1977, 1977, 1977, 1977, 1977, 1978...
## $ plot_id <int> 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2...
## $ species_id <chr> "NL", "NL", "NL", "NL", "NL", "NL", "NL", "NL"...
## $ sex <chr> "M", "M", NA, NA, NA, NA, NA, NA, "M", NA, NA,...
## $ hindfoot_length <int> 32, 31, NA, NA, NA, NA, NA, NA, NA, NA, NA, 32...
## $ weight <int> NA, NA, NA, NA, NA, NA, NA, NA, 218, NA, NA, 2...
## $ genus <chr> "Neotoma", "Neotoma", "Neotoma", "Neotoma", "N...
## $ species <chr> "albigula", "albigula", "albigula", "albigula"...
## $ taxa <chr> "Rodent", "Rodent", "Rodent", "Rodent", "Roden...
## $ plot_type <chr> "Control", "Control", "Control", "Control", "C...

In-class challenges

1. Find the average hindfoot_length

mean(mydata$hindfoot_length)
## [1] NA
# Notice that the answer is NA. mean() returns NA whenever the data contain missing values, so the NAs must be excluded first. There are two ways to do this:

# 1. Remove the NAs from the data, store the result in a new object, and then take the mean

hindfoot_length.rev <- mydata$hindfoot_length[!is.na(mydata$hindfoot_length)]
avg_foot <- mean(hindfoot_length.rev)
avg_foot
## [1] 29.28793
# 2. Use the na.rm argument of the mean() function
?mean  # open the help page to read about na.rm
avg_foot <- mean(mydata$hindfoot_length, na.rm = TRUE)
avg_foot
## [1] 29.28793
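
To see why both approaches agree, here is a minimal sketch on a made-up vector. The values in `toy_lengths` are hypothetical and are not taken from the survey data; only base R is used.

```r
# Hypothetical toy vector with two missing values (not from the survey data)
toy_lengths <- c(32, 31, NA, 36, NA, 33)

# mean() returns NA as soon as any element is missing
mean(toy_lengths)

# Approach 1: drop the NAs first, then average what is left
toy_clean <- toy_lengths[!is.na(toy_lengths)]
mean_dropped <- mean(toy_clean)

# Approach 2: let mean() skip the NAs itself
mean_narm <- mean(toy_lengths, na.rm = TRUE)

# Both give (32 + 31 + 36 + 33) / 4 = 33
mean_dropped
mean_narm
```

The second approach is shorter and leaves the original vector untouched, which is usually what you want when the same column is reused later.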

2. How many hindfoot lengths are above and below the average?

# step 1 - find the average
avg <- mean(mydata$hindfoot_length, na.rm = TRUE)
# step 2 - compare each value with the average to get logical vectors (TRUE, FALSE, or NA)
lessthan <- mydata$hindfoot_length < avg
head(lessthan, 25) # display the first 25 elements
##  [1] FALSE FALSE    NA    NA    NA    NA    NA    NA    NA    NA    NA
## [12] FALSE    NA FALSE FALSE    NA    NA FALSE FALSE FALSE FALSE FALSE
## [23] FALSE    NA FALSE
greaterthan <- mydata$hindfoot_length > avg

# step 3 - count the TRUE values (sum() treats TRUE as 1 and FALSE as 0)
sum(lessthan, na.rm = TRUE)
## [1] 15371
sum(greaterthan, na.rm = TRUE)
## [1] 16067
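
The counting steps rely on two behaviours of R: a comparison involving NA returns NA, and `sum()` on a logical vector counts the TRUE values. The sketch below illustrates both on a made-up vector (`toy_lengths` and its values are hypothetical, not from the survey data).

```r
# Hypothetical toy vector (not from the survey data)
toy_lengths <- c(10, 20, NA, 30, 40)
toy_avg <- mean(toy_lengths, na.rm = TRUE)   # (10 + 20 + 30 + 40) / 4 = 25

# Comparing against NA gives NA, so the missing value stays missing
below <- toy_lengths < toy_avg    # TRUE TRUE NA FALSE FALSE
above <- toy_lengths > toy_avg    # FALSE FALSE NA TRUE TRUE

# TRUE counts as 1 and FALSE as 0, so sum() counts the TRUE values
n_below <- sum(below, na.rm = TRUE)   # 2
n_above <- sum(above, na.rm = TRUE)   # 2

# table() with useNA = "ifany" shows the NA group as well
table(below, useNA = "ifany")
```

Values exactly equal to the average fall into neither group, and neither do rows with a missing measurement, so the two counts need not add up to the number of rows.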

Take-home challenges

1. What are the names of the plot types (treatments) in the experiment?

2. How many species were caught?

3. How many species of birds? Rodents?

4. Average weight of Male Rodents?

5. Average weight of Female Rodents?
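
As a starting point for these challenges, the sketch below shows some base R tools that may help, applied to a small made-up data frame. The object `toy` and all of its values are hypothetical; only the column names mirror `mydata`. It is one possible approach, not the class solution.

```r
# Hypothetical toy data; only the column names mirror mydata
toy <- data.frame(
  species_id = c("NL", "NL", "DM", "AB", "DM"),
  sex        = c("M", "F", "M", NA, "F"),
  weight     = c(200, 180, 40, NA, 44),
  taxa       = c("Rodent", "Rodent", "Rodent", "Bird", "Rodent"),
  plot_type  = c("Control", "Control", "Treatment A", "Control", "Treatment A"),
  stringsAsFactors = FALSE
)

# Distinct values of a column
unique(toy$plot_type)

# Number of distinct species overall and within one taxon
n_species <- length(unique(toy$species_id))
n_rodent_species <- length(unique(toy$species_id[toy$taxa == "Rodent"]))

# Mean weight for one subgroup; which() drops rows where the condition is NA
male_rodents <- which(toy$taxa == "Rodent" & toy$sex == "M")
avg_male_weight <- mean(toy$weight[male_rodents], na.rm = TRUE)
avg_male_weight
```

The same pattern should carry over to `mydata` by swapping in the real columns and conditions.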