We continue to cover identifying groups in multivariate data. This class will focus on cluster analysis. This is a broad topic that could easily fill most of a semester; if you want more depth, start by looking at:
Notebook files
R notebook Rmarkdown file-Cluster part 1
Challenge
- Complete an agglomerative cluster analysis on the USArrests data
- Identify the appropriate number of clusters
- Create a dendrogram of the data using ggplot2
- Format the coloring of the dendrogram so that it matches the groups identified in step two. It should look something like this
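
If you are unsure where to begin, the steps above can be sketched roughly as follows. This is one possible starting point, not the reference solution or the pictured result. Several choices here are assumptions rather than part of the challenge: standardizing the variables with scale(), Ward linkage ("ward.D2"), an illustrative k = 4 (you still need to justify your own number of clusters), and the ggdendro package as a helper for getting dendrogram coordinates into ggplot2. The sketch colors the leaf labels by cluster; coloring the branches themselves takes extra work.

```r
# Agglomerative clustering of the built-in USArrests data (50 states, 4 variables)
arrests_scaled <- scale(USArrests)            # assumption: standardize so no variable dominates
d  <- dist(arrests_scaled, method = "euclidean")
hc <- hclust(d, method = "ward.D2")           # assumption: Ward linkage

k <- 4                                        # assumption: illustrative value only
clusters <- cutree(hc, k = k)                 # named vector: state -> cluster id
print(table(clusters))

# Plotting is skipped if ggplot2 or ggdendro (an assumed helper package) is missing
if (requireNamespace("ggplot2", quietly = TRUE) &&
    requireNamespace("ggdendro", quietly = TRUE)) {
  library(ggplot2)
  dd   <- ggdendro::dendro_data(hc, type = "rectangle")
  labs <- ggdendro::label(dd)                 # leaf positions and state names
  labs$cluster <- factor(clusters[as.character(labs$label)])

  p <- ggplot() +
    geom_segment(data = ggdendro::segment(dd),
                 aes(x = x, y = y, xend = xend, yend = yend)) +
    geom_text(data = labs,
              aes(x = x, y = y, label = label, colour = cluster),
              hjust = 1, angle = 90, size = 3) +
    expand_limits(y = -3) +                   # leave room for the rotated labels
    theme_minimal()
  print(p)
}
```

Try changing the linkage method or k and check whether the group memberships stay stable before settling on an answer for step two.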