3 types of wine; 13 measured variables; 178 samples in total
Dataset download link: https://acadgildsite.s3.amazonaws.com/wordpress_images/r/wineDataset_Kmeans/Wine.csv
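# Read the data; Customer_Segment holds the wine class label
df <- read.csv("Wine.csv", header = TRUE)
head(df)
df$Customer_Segment <- as.factor(df$Customer_Segment)
summary(df)
dim(df)
# PCA on the 13 measured variables, each scaled to unit variance
winepca <- prcomp(df[, 1:13], scale. = TRUE)
library(factoextra)
# Scree plot: percentage of variance explained by each component
fviz_eig(winepca, addlabels = TRUE)
# Samples on the first two components, colored by wine class
fviz_pca_ind(winepca, col.ind = df$Customer_Segment,
             addEllipses = TRUE, geom = "point", legend.title = "")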
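The percentages that fviz_eig prints on the scree plot can also be computed directly from the prcomp result. The following is a minimal sketch, not the original author's code; it uses the built-in iris data as a stand-in (an assumption, so that it runs without downloading Wine.csv), and the variable names pca, eigenvalues and var_explained are illustrative.

```r
# Stand-in data: the four numeric columns of the built-in iris dataset
# (an assumption so the sketch runs without Wine.csv).
pca <- prcomp(iris[, 1:4], scale. = TRUE)

# Each component's variance is its squared standard deviation.
eigenvalues <- pca$sdev^2

# Share of the total variance carried by each component.
var_explained <- eigenvalues / sum(eigenvalues)

# Percentage per component, then the running total.
print(round(100 * var_explained, 1))
print(round(100 * cumsum(var_explained), 1))
```

Replacing iris[, 1:4] with df[, 1:13] should give the same kind of summary for the wine data.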
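Original article: Analyzing Wine dataset using K-means Clustering
K-means clustering is one of the simplest and most commonly used clustering algorithms. It tries to find cluster centers that represent particular regions of the data. The algorithm alternates between two steps: assigning each data point to the nearest cluster center, then setting each cluster center to the mean of the data points assigned to it. The algorithm ends when the cluster assignments no longer change.
-- Introduction to Machine Learning with Python (《Python机器学习基础教程》)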
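To make the two alternating steps concrete, here is a minimal sketch in R. The function simple_kmeans is a hypothetical helper written only for illustration (it is not part of any package and is not how the built-in kmeans is implemented), and the toy data are synthetic.

```r
# Minimal sketch of the two alternating k-means steps.
# simple_kmeans is a hypothetical helper, not a library function.
simple_kmeans <- function(x, k, max_iter = 100) {
  x <- as.matrix(x)
  # Start from k distinct rows chosen at random as the initial centers.
  centers <- x[sample(nrow(x), k), , drop = FALSE]
  cluster <- rep(0L, nrow(x))
  for (iter in seq_len(max_iter)) {
    # Step 1: squared Euclidean distance from every point to every center,
    # then assign each point to the nearest center.
    d <- sapply(seq_len(k), function(j) rowSums(sweep(x, 2, centers[j, ])^2))
    new_cluster <- max.col(-d, ties.method = "first")
    # Stop once no assignment changes.
    if (all(new_cluster == cluster)) break
    cluster <- new_cluster
    # Step 2: move each center to the mean of the points assigned to it.
    for (j in seq_len(k)) {
      members <- x[cluster == j, , drop = FALSE]
      if (nrow(members) > 0) centers[j, ] <- colMeans(members)
    }
  }
  list(cluster = cluster, centers = centers, iterations = iter)
}

# Synthetic toy data: two groups of 50 points in two dimensions.
set.seed(42)
toy <- rbind(matrix(rnorm(100, mean = 0), ncol = 2),
             matrix(rnorm(100, mean = 6), ncol = 2))
fit <- simple_kmeans(toy, k = 2)
print(table(fit$cluster))
print(fit$centers)
```

Unlike kmeans(..., nstart = 25), this sketch uses a single random start, so it can settle on a poor local solution; in practice the built-in function is the better choice.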
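library(factoextra)
df <- read.csv("Wine.csv", header = TRUE)
# Standardize the 13 variables so that no single one dominates the distances
winescale <- scale(df[, 1:13])
head(winescale)
# Elbow plot of the total within-cluster sum of squares; the vertical line marks k = 3
fviz_nbclust(winescale, kmeans, method = "wss") +
  geom_vline(xintercept = 3, linetype = 5, col = "darkred")
# Fix the seed so the 25 random starts are reproducible
set.seed(123)
winekmeans <- kmeans(winescale, 3, nstart = 25)
winekmeans
winekmeans$centers
winekmeans$size
# Plot the clusters on the first two principal components
fviz_cluster(object = winekmeans, data = winescale, ellipse.type = "norm",
             geom = "point", palette = "jco", main = "",
             ggtheme = theme_minimal())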
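The elbow curve drawn by fviz_nbclust with method 'wss' can be approximated by hand: run kmeans for a range of k and record the total within-cluster sum of squares each time. The sketch below assumes synthetic toy data with three groups as a stand-in for the scaled wine data; the names toy, ks and wss are illustrative.

```r
set.seed(123)
# Synthetic stand-in data: three groups of 30 points in two dimensions.
toy <- rbind(matrix(rnorm(60, mean = 0), ncol = 2),
             matrix(rnorm(60, mean = 5), ncol = 2),
             matrix(rnorm(60, mean = 10), ncol = 2))

# Total within-cluster sum of squares for k = 1..8.
ks <- 1:8
wss <- sapply(ks, function(k) kmeans(toy, centers = k, nstart = 25)$tot.withinss)

# The "elbow" is the k after which adding clusters stops paying off.
plot(ks, wss, type = "b",
     xlab = "Number of clusters k",
     ylab = "Total within-cluster sum of squares")
```

Replacing toy with winescale should give a curve comparable to the fviz_nbclust plot.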