Simply apply the function plot() to the object returned by hclust().
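A minimal sketch of this step follows. The exercise data are not shown here, so the built-in USArrests data set (standardized, with Euclidean distances) is used purely as a stand-in; the object names are illustrative as well.

```r
# Assumption: USArrests stands in for the exercise data, which are not shown here.
d <- dist(scale(USArrests))   # Euclidean distances on standardized variables

# Three agglomerative clusterings of the same distance matrix
hc_single   <- hclust(d, method = "single")
hc_complete <- hclust(d, method = "complete")
hc_upgma    <- hclust(d, method = "average")   # "average" is UPGMA

# plot() on an hclust object draws the dendrogram
op <- par(mfrow = c(1, 3))
plot(hc_single,   main = "Single linkage",   cex = 0.6)
plot(hc_complete, main = "Complete linkage", cex = 0.6)
plot(hc_upgma,    main = "UPGMA",            cex = 0.6)
par(op)
```

With your own data, replace the first line by the distance matrix computed for your sites.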
To check how well the clusterings preserve the original distances, calculate the correlation between the cophenetic and the original distances and plot them against each other. Which clustering best preserves the original distances?
Hint: use the function cophenetic() to calculate cophenetic distances. The function cor() calculates the linear correlation.
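The comparison could be sketched as follows, again assuming USArrests as stand-in data. The names `trees` and `coph_cor` are illustrative only.

```r
# Assumption: USArrests stands in for the exercise data.
d <- dist(scale(USArrests))

# Fit the three clusterings and keep them in a named list
methods <- c(single = "single", complete = "complete", UPGMA = "average")
trees <- lapply(methods, function(m) hclust(d, method = m))

# Cophenetic correlation: linear correlation between the original distances
# and the distances implied by the dendrogram
coph_cor <- sapply(trees, function(hc) cor(as.vector(d), as.vector(cophenetic(hc))))
print(round(coph_cor, 3))

# Scatter plots of cophenetic against original distances
op <- par(mfrow = c(1, 3))
for (nm in names(trees)) {
  plot(as.vector(d), as.vector(cophenetic(trees[[nm]])),
       xlab = "Original distance", ylab = "Cophenetic distance", main = nm)
  abline(0, 1, lty = 2)   # reference line: cophenetic equals original
}
par(op)
```

The higher the correlation, the more faithfully the dendrogram reproduces the original distance matrix.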
Since single linkage clustering leads to strong chaining of sites, which prevents an easy interpretation of the clusters, we will stick to the results of the other two clustering algorithms: complete linkage and UPGMA. For each clustering, find the number of clusters that yields approximately 4 reasonable clusters, i.e. find the distance level at which about 4 clusters have formed. Ignore trivial clusters containing just one sample.
Identify which samples belong to which clusters using cutree().
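One possible way to automate the search is sketched below. The helper `cut_nontrivial()` is hypothetical (it is not part of R), and USArrests is again only a stand-in for the exercise data. The helper increases the number of clusters until at least 4 of them contain more than one sample, then reports a distance level that produces the same cut.

```r
# Assumption: USArrests stands in for the exercise data.
d <- dist(scale(USArrests))
hc_complete <- hclust(d, method = "complete")
hc_upgma    <- hclust(d, method = "average")

# Hypothetical helper: smallest k for which at least `target` clusters
# have more than one member (single-sample clusters are ignored).
cut_nontrivial <- function(hc, target = 4) {
  n <- length(hc$order)
  for (k in target:(n - 1)) {
    groups <- cutree(hc, k = k)
    if (sum(table(groups) > 1) >= target) break
  }
  # Any height between the two merge levels bracketing k clusters gives this cut
  h <- sort(hc$height, decreasing = TRUE)
  list(k = k, level = mean(h[c(k - 1, k)]), groups = groups)
}

res_complete <- cut_nontrivial(hc_complete)
res_upgma    <- cut_nontrivial(hc_upgma)

# Cluster sizes, and the dendrogram with the chosen distance level
print(table(res_complete$groups))
print(table(res_upgma$groups))
plot(hc_upgma, main = "UPGMA", cex = 0.6)
abline(h = res_upgma$level, lty = 2)
```

Cutting by height, `cutree(hc_upgma, h = res_upgma$level)`, returns the same membership as cutting by `k`. Reading the level off the dendrogram by eye is an equally valid approach.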
Use table() to compare the groupings of sites.
Which groupings are the most similar and which are the most different?
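A cross-tabulation of two groupings might look like the sketch below. It assumes USArrests as stand-in data and a fixed cut into 4 clusters for every method, which is a simplification of the step above.

```r
# Assumption: USArrests stands in for the exercise data; k = 4 is fixed for simplicity.
d <- dist(scale(USArrests))
gr_single   <- cutree(hclust(d, method = "single"),   k = 4)
gr_complete <- cutree(hclust(d, method = "complete"), k = 4)
gr_upgma    <- cutree(hclust(d, method = "average"),  k = 4)

# Rows: clusters of the first grouping; columns: clusters of the second
tab_cu <- table(complete = gr_complete, UPGMA = gr_upgma)
tab_sc <- table(single = gr_single, complete = gr_complete)
tab_su <- table(single = gr_single, UPGMA = gr_upgma)
print(tab_cu)
print(tab_sc)
print(tab_su)
```

Cluster numbers are arbitrary labels, so two groupings agree when each row of the table has its counts concentrated in a single column, not necessarily on the diagonal.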
anadat/cs/exercises/cv2.txt · Last modified: 2017/04/15 12:04 by vitek