Classifying Iris Flowers with Machine Learning in Python

Author: 用户6719124
Published 2019-11-17 15:41:17

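The iris (scientific name: Iris tectorum Maxim.) belongs to the order Liliales, family Iridaceae. It is grown as an ornamental; its delicately scented flowers can be used in perfume, and its rhizome, which can be harvested year-round, is used in traditional Chinese medicine for its anti-inflammatory effect.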

The dataset covers three iris species: setosa, versicolor, and virginica. They are told apart mainly by petal length, petal width, sepal length, and sepal width, all measured in centimeters.

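This article builds a basic machine learning model that predicts the species of an iris from these four measurements, which makes it a three-class classification problem. The dataset comes from: http://archive.ics.uci.edu/ml/datasets/Iris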
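To get a feel for the four measurements and three classes before touching the downloaded file, the copy of the Iris data bundled with scikit-learn can be inspected. This is only a sketch: it assumes scikit-learn is installed and uses its `load_iris` loader rather than the UCI file used in the next steps.

```python
from sklearn.datasets import load_iris

iris = load_iris()

# Number of samples and number of features per sample
print(iris.data.shape)
# Feature names, in column order
print(iris.feature_names)
# The class names; iris.target encodes them as the integers 0, 1, 2
print(iris.target_names)
```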

First, take a look at the dataset and do some simple preprocessing.

Start by importing pandas:

import pandas as pd

Place the downloaded iris.data file in the same folder as the code.

Read the data and print the first few rows:

data = pd.read_csv('iris.data')
print(data.head())

Output:

   5.1  3.5  1.4  0.2  Iris-setosa
0  4.9  3.0  1.4  0.2  Iris-setosa
1  4.7  3.2  1.3  0.2  Iris-setosa
2  4.6  3.1  1.5  0.2  Iris-setosa
3  5.0  3.6  1.4  0.2  Iris-setosa
4  5.4  3.9  1.7  0.4  Iris-setosa

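Note that the file has no header row, so pandas has used the first data row (5.1, 3.5, 1.4, 0.2, Iris-setosa) as the column names. We assign proper column names below. Be aware that renaming the columns does not bring that first sample back: the DataFrame keeps 149 of the file's 150 rows. Reading the file with header=None would avoid the loss.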

data.columns = ['sepal_len', 'sepal_width', 'petal_len', 'petal_width', 'class']
print(data.head())

Output:

   sepal_len  sepal_width  petal_len  petal_width        class
0        4.9          3.0        1.4          0.2  Iris-setosa
1        4.7          3.2        1.3          0.2  Iris-setosa
2        4.6          3.1        1.5          0.2  Iris-setosa
3        5.0          3.6        1.4          0.2  Iris-setosa
4        5.4          3.9        1.7          0.4  Iris-setosa
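Because iris.data has no header row, assigning column names after reading still leaves the first sample consumed as a header. A possible alternative is to tell `read_csv` about the missing header up front. The sketch below uses a small in-memory stand-in (`sample`, a hypothetical name) for the first lines of the file so that it runs on its own; with the real file, `'iris.data'` would be passed instead. The results reported later in this article come from the 149-row version loaded above.

```python
import io

import pandas as pd

# Stand-in for the first three lines of iris.data
sample = io.StringIO(
    "5.1,3.5,1.4,0.2,Iris-setosa\n"
    "4.9,3.0,1.4,0.2,Iris-setosa\n"
    "4.7,3.2,1.3,0.2,Iris-setosa\n"
)
columns = ['sepal_len', 'sepal_width', 'petal_len', 'petal_width', 'class']

# header=None: the file has no header row; names=...: supply the column names directly
df = pd.read_csv(sample, header=None, names=columns)
print(df.shape)
print(df.head())
```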

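Before building the model, a brief introduction to Pipeline:

A Pipeline chains several estimators together, for example feature extraction, normalization, and classification, into a typical machine learning workflow. It brings two main benefits:

  1. Calling fit and predict on the pipeline trains and runs every step in it.
  2. It can be combined with grid search to select parameters.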
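Both points can be sketched in a few lines. The example below is an illustration rather than the setup used in this article: it assumes scikit-learn's `GridSearchCV`, the bundled `load_iris` data with all four features, and an arbitrary parameter grid chosen only for demonstration.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import PolynomialFeatures, StandardScaler

X, y = load_iris(return_X_y=True)

pipe = Pipeline([('sc', StandardScaler()),
                 ('poly', PolynomialFeatures()),
                 ('clf', LogisticRegression(max_iter=1000))])

# Parameters of a step are addressed as <step name>__<parameter name>
param_grid = {'poly__degree': [1, 2, 3],
              'clf__C': [0.1, 1.0, 10.0]}

# A single fit call trains every step of the pipeline for each parameter combination
search = GridSearchCV(pipe, param_grid, cv=5)
search.fit(X, y)

print(search.best_params_)
print(round(search.best_score_, 3))
# predict runs the scaler, the polynomial expansion, and the classifier in order
print(search.predict(X[:3]))
```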

Now build the model:

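import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import PolynomialFeatures, StandardScaler

# Assumed step (not shown in the post): split the DataFrame into features and labels
x = data[['sepal_len', 'sepal_width', 'petal_len', 'petal_width']].values
y = data['class'].values

# Keep only the first two features (sepal length and sepal width)
x = x[:, :2]
lr = Pipeline([('sc', StandardScaler()),
               ('poly', PolynomialFeatures(degree=3)),
               ('clf', LogisticRegression())])

# StandardScaler: learns each feature's mean and standard deviation in fit,
#   so that later data can be transformed in the same way
# PolynomialFeatures: expands the features into polynomial terms; degree sets the maximum degree
# LogisticRegression: the logistic regression classifier
lr.fit(x, y.ravel())  # ravel() flattens y into a 1-D array
y_hat = lr.predict(x)
y_hat_prob = lr.predict_proba(x)
np.set_printoptions(suppress=True)  # print floats without scientific notation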
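To make the two preprocessing steps concrete, the sketch below applies them to three made-up samples with two features each (the values are arbitrary and not taken from the Iris data). It shows that standardization gives each column zero mean and unit standard deviation, and that a degree-3 expansion of two features yields ten columns.

```python
import numpy as np
from sklearn.preprocessing import PolynomialFeatures, StandardScaler

# Three made-up samples with two features, a and b
X = np.array([[1.0, 2.0],
              [3.0, 4.0],
              [5.0, 6.0]])

X_scaled = StandardScaler().fit_transform(X)
print(X_scaled.mean(axis=0))  # column means after scaling
print(X_scaled.std(axis=0))   # column standard deviations after scaling

poly = PolynomialFeatures(degree=3)
X_poly = poly.fit_transform(X_scaled)
# Columns: 1, a, b, a^2, ab, b^2, a^3, a^2 b, a b^2, b^3
print(X_poly.shape)
```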

Print the predicted labels:

print('y_hat = \n', y_hat)

Output:

y_hat = 
 ['Iris-setosa' 'Iris-setosa' 'Iris-setosa' 'Iris-setosa' 'Iris-setosa'
 'Iris-setosa' 'Iris-setosa' 'Iris-setosa' 'Iris-setosa' 'Iris-setosa'
 'Iris-setosa' 'Iris-setosa' 'Iris-setosa' 'Iris-setosa' 'Iris-setosa'
 'Iris-setosa' 'Iris-setosa' 'Iris-setosa' 'Iris-setosa' 'Iris-setosa'
 'Iris-setosa' 'Iris-setosa' 'Iris-setosa' 'Iris-setosa' 'Iris-setosa'
 'Iris-setosa' 'Iris-setosa' 'Iris-setosa' 'Iris-setosa' 'Iris-setosa'
 'Iris-setosa' 'Iris-setosa' 'Iris-setosa' 'Iris-setosa' 'Iris-setosa'
 'Iris-setosa' 'Iris-setosa' 'Iris-setosa' 'Iris-setosa' 'Iris-setosa'
 'Iris-setosa' 'Iris-setosa' 'Iris-setosa' 'Iris-setosa' 'Iris-setosa'
 'Iris-setosa' 'Iris-setosa' 'Iris-setosa' 'Iris-setosa' 'Iris-virginica'
 'Iris-virginica' 'Iris-virginica' 'Iris-versicolor' 'Iris-virginica'
 'Iris-versicolor' 'Iris-virginica' 'Iris-versicolor' 'Iris-virginica'
 'Iris-versicolor' 'Iris-versicolor' 'Iris-versicolor' 'Iris-versicolor'
 'Iris-versicolor' 'Iris-versicolor' 'Iris-virginica' 'Iris-versicolor'
 'Iris-versicolor' 'Iris-versicolor' 'Iris-versicolor' 'Iris-versicolor'
 'Iris-versicolor' 'Iris-versicolor' 'Iris-versicolor' 'Iris-virginica'
 'Iris-virginica' 'Iris-virginica' 'Iris-virginica' 'Iris-versicolor'
 'Iris-versicolor' 'Iris-versicolor' 'Iris-versicolor' 'Iris-versicolor'
 'Iris-versicolor' 'Iris-versicolor' 'Iris-virginica' 'Iris-virginica'
 'Iris-versicolor' 'Iris-versicolor' 'Iris-versicolor' 'Iris-versicolor'
 'Iris-versicolor' 'Iris-versicolor' 'Iris-versicolor' 'Iris-versicolor'
 'Iris-versicolor' 'Iris-versicolor' 'Iris-versicolor' 'Iris-versicolor'
 'Iris-versicolor' 'Iris-virginica' 'Iris-versicolor' 'Iris-virginica'
 'Iris-versicolor' 'Iris-virginica' 'Iris-virginica' 'Iris-versicolor'
 'Iris-virginica' 'Iris-virginica' 'Iris-virginica' 'Iris-virginica'
 'Iris-virginica' 'Iris-virginica' 'Iris-versicolor' 'Iris-versicolor'
 'Iris-virginica' 'Iris-virginica' 'Iris-virginica' 'Iris-virginica'
 'Iris-versicolor' 'Iris-virginica' 'Iris-versicolor' 'Iris-virginica'
 'Iris-versicolor' 'Iris-virginica' 'Iris-virginica' 'Iris-versicolor'
 'Iris-versicolor' 'Iris-virginica' 'Iris-virginica' 'Iris-virginica'
 'Iris-virginica' 'Iris-virginica' 'Iris-versicolor' 'Iris-versicolor'
 'Iris-virginica' 'Iris-virginica' 'Iris-virginica' 'Iris-versicolor'
 'Iris-virginica' 'Iris-virginica' 'Iris-virginica' 'Iris-versicolor'
 'Iris-virginica' 'Iris-virginica' 'Iris-virginica' 'Iris-versicolor'
 'Iris-virginica' 'Iris-virginica' 'Iris-versicolor']

Print the predicted class probabilities:

print('y_hat_prob = \n', y_hat_prob)

Output:

y_hat_prob = 
[[0.85534909 0.09533414 0.04931677]
 [0.98474307 0.00832482 0.00693211]
 [0.98625508 0.00560102 0.00814391]
 [0.99481669 0.00337505 0.00180826]
 [0.99555309 0.00029521 0.0041517 ]
 [0.99869804 0.00071571 0.00058625]
 [0.96800558 0.02312483 0.00886959]
 [0.98822691 0.00186241 0.00991067]
 [0.90246766 0.0651196  0.03241274]
 [0.97733905 0.00824859 0.01441237]
 [0.99239647 0.00505615 0.00254739]
 [0.91642349 0.04917927 0.03439724]
 [0.99676732 0.00030162 0.00293106]
 [0.9398823  0.00009638 0.06002132]
 [0.99461788 0.         0.00538212]
 [0.99555309 0.00029521 0.0041517 ]
 [0.97350852 0.01856996 0.00792152]
 [0.93843364 0.00421448 0.05735187]
 [0.99895347 0.00038381 0.00066272]
 [0.76891197 0.16398317 0.06710486]
 [0.99654569 0.00181396 0.00164035]
 [0.99985893 0.00008626 0.0000548 ]
 [0.89079074 0.08149018 0.02771908]
 [0.99239647 0.00505615 0.00254739]
 [0.76081779 0.16883512 0.07034708]
 [0.96800558 0.02312483 0.00886959]
 [0.95362866 0.03209173 0.01427961]
 [0.90242719 0.07174113 0.02583168]
 [0.98474307 0.00832482 0.00693211]
 [0.94634023 0.03227143 0.02138834]
 [0.76891197 0.16398317 0.06710486]
 [0.99990061 0.00000102 0.00009837]
 [0.99854342 0.00000022 0.00145636]
 [0.90246766 0.0651196  0.03241274]
 [0.89203188 0.07793728 0.03003084]
 [0.82742392 0.10540162 0.06717445]
 [0.90246766 0.0651196  0.03241274]
 [0.99357802 0.00107196 0.00535003]
 [0.94205304 0.04253986 0.01540711]
 [0.98600147 0.00980546 0.00419307]
 [0.73275657 0.24653389 0.02070954]
 [0.99868435 0.00031417 0.00100147]
 [0.98600147 0.00980546 0.00419307]
 [0.99895347 0.00038381 0.00066272]
 [0.91642349 0.04917927 0.03439724]
 [0.99895347 0.00038381 0.00066272]
 [0.99300324 0.00314587 0.0038509 ]
 [0.98722324 0.00541553 0.00736122]
 [0.9372624  0.04564148 0.01709611]
 [0.00066831 0.19151727 0.80781442]
 [0.03574259 0.42213636 0.54212105]
 [0.00172174 0.2380128  0.76026547]
 [0.00029696 0.8681783  0.13152475]
 [0.00950958 0.39994199 0.59054842]
 [0.04283799 0.66526153 0.29190048]
 [0.06414466 0.4034332  0.53242214]
 [0.09112085 0.81279954 0.09607961]
 [0.00919148 0.3627435  0.62806502]
 [0.14866386 0.65381848 0.19751766]
 [0.00000411 0.98713147 0.01286442]
 [0.06444261 0.60066389 0.3348935 ]
 [0.00000726 0.77291594 0.2270768 ]
 [0.03528951 0.56903905 0.39567144]
 [0.08306604 0.64863356 0.2683004 ]
 [0.00778275 0.32411726 0.66809999]
 [0.12020666 0.62210179 0.25769155]
 [0.02028952 0.67175405 0.30795643]
 [0.00000391 0.65907266 0.34092342]
 [0.00481714 0.77187864 0.22330423]
 [0.11985284 0.5470987  0.33304846]
 [0.02379884 0.57797637 0.39822479]
 [0.00126724 0.49987542 0.49885734]
 [0.02379884 0.57797637 0.39822479]
 [0.01930587 0.46089705 0.51979708]
 [0.01163503 0.37280608 0.61555889]
 [0.00183391 0.23007537 0.76809072]
 [0.00679824 0.32461748 0.66858428]
 [0.04065629 0.59470414 0.36463958]
 [0.01100368 0.71867499 0.27032133]
 [0.00166588 0.82860137 0.16973275]
 [0.00166588 0.82860137 0.16973275]
 [0.02028952 0.67175405 0.30795643]
 [0.01541111 0.62137558 0.36321331]
 [0.22790609 0.5687734  0.20332051]
 [0.22911292 0.37297796 0.39790913]
 [0.00778275 0.32411726 0.66809999]
 [0.00003469 0.54029537 0.45966993]
 [0.12020666 0.62210179 0.25769155]
 [0.00647063 0.78828408 0.20524529]
 [0.01829676 0.74880171 0.23290153]
 [0.047037   0.55814244 0.39482055]
 [0.00920271 0.69735007 0.29344721]
 [0.00660544 0.92151972 0.07187483]
 [0.03004904 0.70327522 0.26667574]
 [0.09397292 0.62309996 0.28292712]
 [0.06590712 0.64403573 0.29005715]
 [0.03009288 0.53848234 0.43142478]
 [0.0533754  0.79584651 0.15077809]
 [0.04283799 0.66526153 0.29190048]
 [0.06414466 0.4034332  0.53242214]
 [0.02028952 0.67175405 0.30795643]
 [0.00017942 0.15797581 0.84184477]
 [0.02476733 0.5025339  0.47269877]
 [0.01778844 0.41878697 0.56342459]
 [0.00000002 0.07518897 0.92481101]
 [0.26845465 0.61461222 0.11693313]
 [0.00000746 0.09075442 0.90923812]
 [0.00020758 0.20054784 0.79924458]
 [0.00007868 0.02245862 0.97746269]
 [0.02450888 0.38498824 0.59050288]
 [0.00733268 0.44265197 0.55001535]
 [0.00348126 0.27669229 0.71982645]
 [0.00380647 0.75199042 0.24420311]
 [0.0359175  0.65054005 0.31354245]
 [0.03574259 0.42213636 0.54212105]
 [0.01778844 0.41878697 0.56342459]
 [0.         0.00115857 0.99884143]
 [0.         0.01728022 0.98271978]
 [0.00000726 0.77291594 0.2270768 ]
 [0.00183015 0.22552467 0.77264519]
 [0.05348703 0.67448608 0.27202689]
 [0.         0.0390756  0.9609244 ]
 [0.00946976 0.4985239  0.49200634]
 [0.00987797 0.26085566 0.72926638]
 [0.00005193 0.13855405 0.86139402]
 [0.02041318 0.54303587 0.43655094]
 [0.047037   0.55814244 0.39482055]
 [0.01316304 0.45370611 0.53313084]
 [0.00004617 0.13077655 0.86917728]
 [0.00000075 0.05736916 0.94263009]
 [0.         0.0012725  0.9987275 ]
 [0.01316304 0.45370611 0.53313084]
 [0.01686445 0.50166674 0.48146882]
 [0.00596213 0.60032111 0.39371676]
 [0.         0.07108026 0.92891974]
 [0.0949242  0.31744331 0.58763249]
 [0.0297933  0.44965611 0.5205506 ]
 [0.05504791 0.58154689 0.3634052 ]
 [0.00172174 0.2380128  0.76026547]
 [0.00778275 0.32411726 0.66809999]
 [0.00172174 0.2380128  0.76026547]
 [0.02028952 0.67175405 0.30795643]
 [0.00427181 0.26354829 0.7321799 ]
 [0.00987797 0.26085566 0.72926638]
 [0.00679824 0.32461748 0.66858428]
 [0.00126724 0.49987542 0.49885734]
 [0.01778844 0.41878697 0.56342459]
 [0.13172405 0.34233852 0.52593742]
 [0.06444261 0.60066389 0.3348935 ]]
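Each row of the probability matrix holds one probability per class and sums to 1, and the predicted label is the class with the largest probability. The sketch below checks both properties. It is an illustration under assumptions: it uses scikit-learn's bundled `load_iris` data (integer labels, 150 rows) rather than the file loaded above, and sets `max_iter=1000`, which the original code does not.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import PolynomialFeatures, StandardScaler

X, y = load_iris(return_X_y=True)
X = X[:, :2]  # sepal length and sepal width only

model = Pipeline([('sc', StandardScaler()),
                  ('poly', PolynomialFeatures(degree=3)),
                  ('clf', LogisticRegression(max_iter=1000))])
model.fit(X, y)

prob = model.predict_proba(X)  # shape (n_samples, n_classes)
labels = model.predict(X)

# The columns of prob follow the order of the fitted classifier's classes_ attribute
classes = model.named_steps['clf'].classes_
rows_sum_to_one = np.allclose(prob.sum(axis=1), 1.0)
argmax_matches = np.array_equal(classes[prob.argmax(axis=1)], labels)
print(rows_sum_to_one, argmax_matches)
```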

Print the accuracy. Note that it is measured on the same data the model was trained on:

print('Accuracy: %.2f%%' % (100*np.mean(y_hat == y.ravel())))

Output:

Accuracy: 80.54%
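Accuracy computed on the training data tends to be optimistic. One common way to estimate performance on unseen samples is k-fold cross-validation. The sketch below is not part of the original analysis: it assumes scikit-learn's `cross_val_score`, the bundled `load_iris` data restricted to the same two sepal features, and `max_iter=1000` for the classifier.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import PolynomialFeatures, StandardScaler

X, y = load_iris(return_X_y=True)
X = X[:, :2]  # sepal length and sepal width only

model = Pipeline([('sc', StandardScaler()),
                  ('poly', PolynomialFeatures(degree=3)),
                  ('clf', LogisticRegression(max_iter=1000))])

# 5-fold cross-validation: each fold is scored on samples the model was not fitted on
scores = cross_val_score(model, X, y, cv=5)
print(scores)
print('Mean held-out accuracy: %.2f%%' % (100 * scores.mean()))
```

Because the scaler sits inside the pipeline, its mean and standard deviation are recomputed from the training part of each fold, so no information from the held-out part leaks into preprocessing.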

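Next, we analyze the data again with a k-nearest neighbors classifier, this time loading the dataset from scikit-learn: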

import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split

iris_dataset = load_iris()
# print(iris_dataset)

# train_test_split returns x_train, x_test, y_train, y_test, in that order.
# The bare 0.8 in the original call is read here as train_size (an assumption).
x_train, x_test, y_train, y_test = train_test_split(
    iris_dataset['data'], iris_dataset['target'], train_size=0.8, random_state=0)

# Scatter-plot matrix of the training features, colored by class
iris_dataframe = pd.DataFrame(x_train, columns=iris_dataset.feature_names)
grr = pd.plotting.scatter_matrix(iris_dataframe, c=y_train, marker='o', figsize=(10, 10),
                                 hist_kwds={'bins': 20}, s=60, alpha=0.8, cmap='viridis')
plt.show()

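The code above draws a scatter-plot matrix of the training data: one scatter plot for each pair of features, colored by class.

Import the k-nearest neighbors classifier and fit it: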

from sklearn.neighbors import KNeighborsClassifier
Knn = KNeighborsClassifier(n_neighbors=1)
Knn.fit(x_train, y_train)

Predict on the test set and print the score:

y_pred = Knn.predict(x_test)
print("Test set score: {:.2f}".format(np.mean(y_pred == y_test)))

Result:

Test set score: 0.97

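This score is higher than the logistic regression result above, although the two are not directly comparable: the logistic regression model used only two features and was scored on its own training data, whereas the k-nearest neighbors model uses all four features and is scored on a held-out test set. Further hyperparameter tuning may improve the accuracy.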
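One way such tuning could look is a cross-validated search over the number of neighbors. The sketch below is an assumption-laden illustration, not the author's procedure: the candidate values for `n_neighbors`, the 80/20 split, and the use of `GridSearchCV` are all choices made here for demonstration.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.neighbors import KNeighborsClassifier

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, train_size=0.8, random_state=0)

# Try several neighbor counts, using cross-validation on the training set only
search = GridSearchCV(KNeighborsClassifier(),
                      {'n_neighbors': [1, 3, 5, 7, 9]}, cv=5)
search.fit(X_train, y_train)

print(search.best_params_)
print('Cross-validated accuracy: {:.2f}'.format(search.best_score_))
# The test set is used once, at the end, to score the selected model
test_score = search.score(X_test, y_test)
print('Test set score: {:.2f}'.format(test_score))
```

Selecting the parameter on the training folds and touching the test set only once keeps the final score an honest estimate.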

Originally published 2019-09-27 on the WeChat official account python pytorch AI机器学习实践.
