Hierarchical Cluster Analysis with Dendrogram
The hierarchical cluster analysis presented in this document is a statistical method for grouping similar observations into clusters based on their characteristics. The data are first standardized to remove scale effects, and a Euclidean distance matrix between observations is then computed. The Ward.D2 method builds the dendrogram by merging, at each step, the pair of clusters that gives the smallest increase in within-cluster variance. The number of clusters is chosen with NbClust, which compares indices such as the silhouette and the gap statistic to identify a robust partition (here, 3 clusters). A principal component analysis (PCA) is then performed to reduce dimensionality, followed by hierarchical clustering on principal components (HCPC) to refine the results. Visualizations, particularly via fviz_dend, aid interpretation of the groupings, with colored rectangles marking the clusters in the dendrogram. The results are exported as tables and files for further analysis.
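To make the first steps concrete (standardization, Euclidean distances, Ward merging), here is a minimal sketch in plain Python using only the standard library. It is an illustration of the idea, not the workflow the document itself uses, which relies on R tools such as NbClust and fviz_dend. The function names `standardize` and `ward_clusters` and the toy `data` table are hypothetical, and the number of clusters is simply fixed at 3 rather than selected by an index.

```python
import math
from statistics import mean, stdev


def standardize(rows):
    """Scale each column to mean 0 and sample standard deviation 1."""
    cols = list(zip(*rows))
    mus = [mean(c) for c in cols]
    sds = [stdev(c) for c in cols]
    return [[(x - m) / s for x, m, s in zip(row, mus, sds)] for row in rows]


def ward_clusters(points, k):
    """Merge clusters with Ward's criterion until k clusters remain (k >= 1).

    Returns (clusters, heights): clusters is a sorted list of sorted lists of
    row indices; heights holds the merge height of each step performed.
    """
    members = {i: [i] for i in range(len(points))}
    # Squared Euclidean distances between the initial singleton clusters.
    d2 = {
        (i, j): sum((a - b) ** 2 for a, b in zip(points[i], points[j]))
        for i in members for j in members if i < j
    }
    heights = []
    next_id = len(points)
    while len(members) > k:
        # Pick the pair of clusters with the smallest Ward dissimilarity.
        (i, j), dij = min(d2.items(), key=lambda kv: kv[1])
        ni, nj = len(members[i]), len(members[j])
        # Lance-Williams update for Ward's method on squared distances.
        for c in members:
            if c in (i, j):
                continue
            nc = len(members[c])
            dic = d2[(min(i, c), max(i, c))]
            djc = d2[(min(j, c), max(j, c))]
            d2[(c, next_id)] = (
                (ni + nc) * dic + (nj + nc) * djc - nc * dij
            ) / (ni + nj + nc)
        members[next_id] = members.pop(i) + members.pop(j)
        # Drop dissimilarities that involve the two merged clusters.
        d2 = {p: v for p, v in d2.items() if i not in p and j not in p}
        # Heights are reported on the unsquared distance scale.
        heights.append(math.sqrt(dij))
        next_id += 1
    return sorted(sorted(m) for m in members.values()), heights


# Hypothetical toy data: two variables on very different scales.
data = [
    [1.0, 100.0], [1.2, 110.0], [0.9, 95.0],
    [5.0, 500.0], [5.1, 520.0], [4.8, 480.0],
    [9.0, 100.0], [9.2, 120.0], [8.9, 90.0],
]
clusters, heights = ward_clusters(standardize(data), k=3)
print(clusters)
print([round(h, 3) for h in heights])
```

The merge heights are what a dendrogram draws on its vertical axis. A large jump between successive heights is the informal cue for where to cut the tree; indices such as the silhouette formalize that choice.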
Have fun!
The Abdi-Basid Courses Institute