In deep learning, convolutional neural networks (CNNs) are among the key technologies for image classification, object detection, and related problems. As the field has advanced in recent years, new network architectures have kept emerging. Among them, EfficientNet and ResNet stand out, widely adopted for their efficiency and depth respectively. This article introduces EfficientNet in detail, compares it with the classic ResNet, and analyzes its architecture, use cases, suitable problems, and a worked example.
EfficientNet is a deep neural network architecture proposed by Google in 2019. Its goal is to balance computational efficiency and accuracy by jointly optimizing a network's depth, width, and input resolution. Its core idea is compound scaling: scaling the network's depth, width, and input image resolution together so that the model reaches higher performance under a given compute budget.
EfficientNet's core innovation is the compound scaling method. Traditional deep networks are usually scaled along a single dimension (depth, width, or resolution); EfficientNet combines all three, improving both computational efficiency and accuracy.
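The scaling rule can be made concrete in a few lines of Python. The α, β, γ constants below are the grid-searched values reported in the EfficientNet paper; the helper function itself is just an illustrative sketch:

```python
# Compound scaling (Tan & Le, 2019): depth ~ alpha^phi, width ~ beta^phi,
# resolution ~ gamma^phi, with alpha * beta^2 * gamma^2 ≈ 2 so that
# total FLOPs grow roughly 2^phi.
ALPHA, BETA, GAMMA = 1.2, 1.1, 1.15  # constants from the paper's grid search

def compound_scale(phi, base_resolution=224):
    """Return (depth multiplier, width multiplier, input resolution) for a given phi."""
    depth = ALPHA ** phi
    width = BETA ** phi
    resolution = round(base_resolution * GAMMA ** phi)
    return depth, width, resolution

for phi in range(4):
    d, w, r = compound_scale(phi)
    print(f"phi={phi}: depth x{d:.2f}, width x{w:.2f}, input {r}x{r}")
```

Setting phi=0 recovers the B0 baseline; each larger EfficientNet variant roughly corresponds to a larger phi.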
Specifically, the EfficientNet architecture includes the following key points:
- A compound scaling coefficient φ that scales depth, width, and resolution together.
- MBConv (mobile inverted bottleneck) blocks built on depthwise separable convolutions.
- Squeeze-and-excitation (SE) modules that adaptively reweight channels.
EfficientNet can be applied to a wide range of computer vision tasks and shines especially where compute is constrained. Main application scenarios include image classification, serving as a backbone for object detection and segmentation, and deployment on mobile and embedded devices.
EfficientNet and ResNet are both important architectures in modern deep learning, each with its own design philosophy, strengths, and weaknesses. We can compare them along several dimensions, as summarized below.
Below is a detailed explanation and comparison of the two architectures: first each model's structure in turn, then a side-by-side comparison presented in table form.
EfficientNetB0 is the lightweight baseline of the EfficientNet family, built on depthwise separable convolutions and several optimization strategies. Its main components are:
- Input: (32, 32, 3), i.e. a 32x32 RGB image.
- ZeroPadding2D: pads the image edges so that subsequent convolutions do not lose border information.
- Conv2D: a 3x3 stride-2 convolution that transforms the input into a 16x16x32 feature map.
- BatchNormalization + Activation: normalizes the convolution output and applies the activation function (Keras' EfficientNet uses Swish/SiLU rather than plain ReLU).
- DepthwiseConv2D: depthwise separable convolution that cuts computation and parameter count, producing a 32-channel 16x16 feature map.
- Subsequent blocks stack further DepthwiseConv2D and Conv2D layers, extracting progressively higher-level features.

ResNet (Residual Networks) is another deep convolutional neural network; its residual (skip) connections overcome the vanishing- and exploding-gradient problems of very deep networks. Its main components are:
- Input: (32, 32, 3), likewise a 32x32 RGB image.
- ZeroPadding2D: pads the input image so that subsequent convolutions stay aligned.
- Conv2D: a 7x7 stride-2 convolution (Conv1; its 9,472 parameters confirm the 7x7 kernel) that transforms the input into a 16x16x64 feature map.
- BatchNormalization + Activation: normalizes the convolution output and applies ReLU.
- MaxPooling2D: max-pools the convolution output, reducing spatial dimensions and computation.

| Feature | EfficientNetB0 | ResNet |
| --- | --- | --- |
| Network structure | Compound scaling; depthwise separable convolutions | Residual connections; standard convolutional layers |
| Input size | (32, 32, 3) | (32, 32, 3) |
| Convolutions | Depthwise separable (fewer FLOPs and parameters) | Standard convolutions plus residual connections |
| Pooling | Depthwise and stride-2 convolutions in place of pooling layers | Max pooling (MaxPooling2D) |
| Characteristics | Lightweight design; efficient compute and memory use | Deep structure; residual connections counter vanishing gradients |
| Use cases | Compute-constrained settings such as mobile and embedded devices | Scenarios where very deep networks must be trained reliably |
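The residual connection credited to ResNet above can be sketched as a minimal Keras identity block (illustrative only; ResNet50's actual blocks are three-layer bottlenecks):

```python
import tensorflow as tf
from tensorflow.keras import layers

def identity_block(x, filters):
    """Minimal residual block: output = ReLU(F(x) + x)."""
    shortcut = x  # the skip connection carries the input unchanged
    y = layers.Conv2D(filters, 3, padding='same')(x)
    y = layers.BatchNormalization()(y)
    y = layers.Activation('relu')(y)
    y = layers.Conv2D(filters, 3, padding='same')(y)
    y = layers.BatchNormalization()(y)
    y = layers.Add()([y, shortcut])  # gradients flow directly through this add
    return layers.Activation('relu')(y)

inputs = tf.keras.Input(shape=(32, 32, 64))
outputs = identity_block(inputs, 64)
block = tf.keras.Model(inputs, outputs)
```

Because the shortcut bypasses the convolutions, gradients can reach early layers through the Add node even when the convolutional path saturates.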
EfficientNetB0 model summary (first layers):

| Layer (type) | Output Shape | Param # | Connected to |
| --- | --- | --- | --- |
| Input Layer (InputLayer) | (None, 32, 32, 3) | 0 | - |
| Rescaling (Rescaling) | (None, 32, 32, 3) | 0 | Input Layer |
| Normalization (Normalization) | (None, 32, 32, 3) | 7 | Rescaling |
| Stem Conv Pad (ZeroPadding2D) | (None, 33, 33, 3) | 0 | Normalization |
| Stem Conv (Conv2D) | (None, 16, 16, 32) | 864 | Stem Conv Pad |
| Stem BN (BatchNormalization) | (None, 16, 16, 32) | 128 | Stem Conv |
| Stem Activation (Activation) | (None, 16, 16, 32) | 0 | Stem BN |
| Block1a DwConv (DepthwiseConv2D) | (None, 16, 16, 32) | 288 | Stem Activation |
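The parameter counts in this summary can be checked by hand, which also shows why depthwise convolutions are cheap: a k×k depthwise convolution over C channels needs only k·k·C weights, versus k·k·C_in·C_out for a standard convolution:

```python
def depthwise_params(k, channels):
    """Parameters of a k x k depthwise conv (one filter per channel, no bias)."""
    return k * k * channels

def standard_conv_params(k, c_in, c_out, bias=True):
    """Parameters of a standard k x k convolution."""
    return k * k * c_in * c_out + (c_out if bias else 0)

# Block1a DwConv in the summary: 3x3 depthwise over 32 channels
print(depthwise_params(3, 32))                      # 288, matching the table
# A standard 3x3, 32-in/32-out conv would need 32x as many weights:
print(standard_conv_params(3, 32, 32, bias=False))  # 9216
```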
ResNet50 model summary (first layers):

| Layer (type) | Output Shape | Param # | Connected to |
| --- | --- | --- | --- |
| Input Layer (InputLayer) | (None, 32, 32, 3) | 0 | - |
| Conv1 Pad (ZeroPadding2D) | (None, 38, 38, 3) | 0 | Input Layer |
| Conv1 Conv (Conv2D) | (None, 16, 16, 64) | 9,472 | Conv1 Pad |
| Conv1 BN (BatchNormalization) | (None, 16, 16, 64) | 256 | Conv1 Conv |
| Conv1 Relu (Activation) | (None, 16, 16, 64) | 0 | Conv1 BN |
| Pool1 Pad (ZeroPadding2D) | (None, 18, 18, 64) | 0 | Conv1 Relu |
| Pool1 Pool (MaxPooling2D) | (None, 8, 8, 64) | 0 | Pool1 Pad |
| Conv2 Block1 1 Conv (Conv2D) | (None, 8, 8, 64) | 4,160 | Pool1 Pool |
| Conv2 Block1 1 BN (BatchNormalization) | (None, 8, 8, 64) | 256 | Conv2 Block1 1 Conv |
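The same arithmetic applied to the ResNet50 summary confirms that Conv1 uses a 7x7 kernel: 7·7·3·64 weights plus 64 biases account for exactly the 9,472 parameters listed:

```python
k, c_in, c_out = 7, 3, 64
conv1_params = k * k * c_in * c_out + c_out  # kernel weights + biases
print(conv1_params)  # 9472, matching the Conv1 Conv row above
```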
Let's work through a concrete example: implementing EfficientNet with the TensorFlow Keras API and comparing it against ResNet on an image-classification task, using the CIFAR-10 dataset that ships with Keras.
1. First, load the dataset; here we use CIFAR-10.
import tensorflow as tf
from tensorflow.keras import layers, models
from tensorflow.keras.datasets import cifar10

# Load the CIFAR-10 dataset
(x_train, y_train), (x_test, y_test) = cifar10.load_data()
x_train.shape
(50000, 32, 32, 3)
The training set contains 50,000 images, each 32x32 with 3 color channels.
from matplotlib import pyplot as plt

# Display a few sample images
plt.subplot(131)
plt.imshow(x_train[1])
plt.subplot(132)
plt.imshow(x_train[2])
plt.subplot(133)
plt.imshow(x_train[3])
plt.show()
A few images are displayed for inspection.
2. To compare ResNet50 and EfficientNet fairly, I study two settings. First, train both architectures from scratch (no pretrained weights) and compare their accuracy and loss. Second, initialize both models with pretrained weights and compare again. All other settings, e.g. epochs=10 and batch_size=64, are kept identical.
# Normalize pixel values to [0, 1]
x_train, x_test = x_train / 255.0, x_test / 255.0

# Train an EfficientNet model
efficientnet_model = tf.keras.applications.EfficientNetB0(
    include_top=True,
    weights=None,  # None = train from scratch; 'imagenet' would load pretrained weights
    input_shape=(32, 32, 3),
    classes=10
)

# Compile the model
efficientnet_model.compile(optimizer='adam',
                           loss='sparse_categorical_crossentropy',
                           metrics=['accuracy'])

# Train the model and record history
history = efficientnet_model.fit(x_train, y_train, epochs=10, batch_size=64, validation_data=(x_test, y_test))

# Evaluate on the test set
test_loss, test_acc = efficientnet_model.evaluate(x_test, y_test, verbose=2)
print(f"EfficientNet Test accuracy: {test_acc}")

# Plot the training and validation loss curves
plt.figure(figsize=(10, 6))
plt.plot(history.history['loss'], label='Training Loss')
plt.plot(history.history['val_loss'], label='Validation Loss')
plt.title('Loss during Training and Validation')
plt.xlabel('Epochs')
plt.ylabel('Loss')
plt.legend()
plt.show()
Epoch 1/10
782/782 ━━━━━━━━━━━━━━━━━━━━ 119s 73ms/step - accuracy: 0.1605 - loss: 3.4596 - val_accuracy: 0.1718 - val_loss: 2.4781
Epoch 2/10
782/782 ━━━━━━━━━━━━━━━━━━━━ 21s 26ms/step - accuracy: 0.2614 - loss: 2.3754 - val_accuracy: 0.2990 - val_loss: 3.5932
Epoch 3/10
782/782 ━━━━━━━━━━━━━━━━━━━━ 21s 27ms/step - accuracy: 0.3080 - loss: 2.1670 - val_accuracy: 0.2642 - val_loss: 2.0965
Epoch 4/10
782/782 ━━━━━━━━━━━━━━━━━━━━ 21s 27ms/step - accuracy: 0.2998 - loss: 2.2079 - val_accuracy: 0.3511 - val_loss: 1.7923
Epoch 5/10
782/782 ━━━━━━━━━━━━━━━━━━━━ 21s 27ms/step - accuracy: 0.3160 - loss: 2.1311 - val_accuracy: 0.2875 - val_loss: 2.4609
Epoch 6/10
782/782 ━━━━━━━━━━━━━━━━━━━━ 21s 27ms/step - accuracy: 0.2823 - loss: 2.2442 - val_accuracy: 0.3555 - val_loss: 1.8949
Epoch 7/10
782/782 ━━━━━━━━━━━━━━━━━━━━ 21s 27ms/step - accuracy: 0.3466 - loss: 2.0003 - val_accuracy: 0.3637 - val_loss: 1.9848
Epoch 8/10
782/782 ━━━━━━━━━━━━━━━━━━━━ 21s 27ms/step - accuracy: 0.3813 - loss: 1.8895 - val_accuracy: 0.3848 - val_loss: 1.7935
Epoch 9/10
782/782 ━━━━━━━━━━━━━━━━━━━━ 21s 27ms/step - accuracy: 0.3739 - loss: 1.9279 - val_accuracy: 0.3653 - val_loss: 1.7812
Epoch 10/10
782/782 ━━━━━━━━━━━━━━━━━━━━ 21s 27ms/step - accuracy: 0.3726 - loss: 1.9205 - val_accuracy: 0.4238 - val_loss: 1.6427
313/313 - 5s - 17ms/step - accuracy: 0.4238 - loss: 1.6427
EfficientNet Test accuracy: 0.423799991607666
# Train a ResNet-50 model
resnet_model = tf.keras.applications.ResNet50(
    include_top=True,
    weights=None,  # None = train from scratch
    input_shape=(32, 32, 3),
    classes=10
)

# Compile the model
resnet_model.compile(optimizer='adam',
                     loss='sparse_categorical_crossentropy',
                     metrics=['accuracy'])

# Train the model and record history
history = resnet_model.fit(x_train, y_train, epochs=10, batch_size=64, validation_data=(x_test, y_test))

# Evaluate on the test set
test_loss, test_acc = resnet_model.evaluate(x_test, y_test, verbose=2)
print(f"ResNet Test accuracy: {test_acc}")

# Plot the training and validation loss curves
plt.figure(figsize=(10, 6))
plt.plot(history.history['loss'], label='Training Loss')
plt.plot(history.history['val_loss'], label='Validation Loss')
plt.title('Loss during Training and Validation')
plt.xlabel('Epochs')
plt.ylabel('Loss')
plt.legend()
plt.show()
Epoch 1/10
782/782 ━━━━━━━━━━━━━━━━━━━━ 98s 63ms/step - accuracy: 0.3055 - loss: 2.2639 - val_accuracy: 0.4116 - val_loss: 1.6579
Epoch 2/10
782/782 ━━━━━━━━━━━━━━━━━━━━ 27s 35ms/step - accuracy: 0.3894 - loss: 1.8813 - val_accuracy: 0.1687 - val_loss: 2.5237
Epoch 3/10
782/782 ━━━━━━━━━━━━━━━━━━━━ 27s 35ms/step - accuracy: 0.4234 - loss: 1.8272 - val_accuracy: 0.1247 - val_loss: 73.8607
Epoch 4/10
782/782 ━━━━━━━━━━━━━━━━━━━━ 27s 35ms/step - accuracy: 0.3188 - loss: 2.1395 - val_accuracy: 0.2297 - val_loss: 2.5845
Epoch 5/10
782/782 ━━━━━━━━━━━━━━━━━━━━ 27s 34ms/step - accuracy: 0.4081 - loss: 1.7452 - val_accuracy: 0.4130 - val_loss: 1.6245
Epoch 6/10
782/782 ━━━━━━━━━━━━━━━━━━━━ 27s 34ms/step - accuracy: 0.4723 - loss: 1.6311 - val_accuracy: 0.2371 - val_loss: 2.5472
Epoch 7/10
782/782 ━━━━━━━━━━━━━━━━━━━━ 27s 34ms/step - accuracy: 0.4987 - loss: 1.5423 - val_accuracy: 0.4830 - val_loss: 3.2968
Epoch 8/10
782/782 ━━━━━━━━━━━━━━━━━━━━ 27s 35ms/step - accuracy: 0.5589 - loss: 1.3475 - val_accuracy: 0.5539 - val_loss: 1.7728
Epoch 9/10
782/782 ━━━━━━━━━━━━━━━━━━━━ 27s 34ms/step - accuracy: 0.6003 - loss: 1.2565 - val_accuracy: 0.3240 - val_loss: 2.1211
Epoch 10/10
782/782 ━━━━━━━━━━━━━━━━━━━━ 27s 35ms/step - accuracy: 0.5625 - loss: 1.4461 - val_accuracy: 0.5895 - val_loss: 1.1765
313/313 - 4s - 14ms/step - accuracy: 0.5895 - loss: 1.1765
ResNet Test accuracy: 0.5895000100135803
# Use EfficientNet as a frozen feature extractor (without the top fully connected layer)
efficientnet_model = tf.keras.applications.EfficientNetB0(
    include_top=False,   # drop the top classification layer
    weights='imagenet',  # ImageNet pretrained weights
    input_shape=(32, 32, 3)
)

# Freeze the pretrained layers so they are not updated during training
efficientnet_model.trainable = False

# Add a custom classification head on top of EfficientNet
model = tf.keras.Sequential([
    efficientnet_model,
    tf.keras.layers.GlobalAveragePooling2D(),        # global average pooling
    tf.keras.layers.Dense(10, activation='softmax')  # CIFAR-10 has 10 classes
])

# Compile the model
model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])

# Train the model and record history
history = model.fit(x_train, y_train, epochs=10, batch_size=64, validation_data=(x_test, y_test))

# Evaluate on the test set
test_loss, test_acc = model.evaluate(x_test, y_test, verbose=2)
print(f"EfficientNet Test accuracy: {test_acc}")

# Plot the training and validation loss curves
plt.figure(figsize=(10, 6))
plt.plot(history.history['loss'], label='Training Loss')
plt.plot(history.history['val_loss'], label='Validation Loss')
plt.title('Loss during Training and Validation')
plt.xlabel('Epochs')
plt.ylabel('Loss')
plt.legend()
plt.show()
Epoch 1/10
782/782 ━━━━━━━━━━━━━━━━━━━━ 39s 28ms/step - accuracy: 0.1013 - loss: 2.3282 - val_accuracy: 0.1000 - val_loss: 2.3202
Epoch 2/10
782/782 ━━━━━━━━━━━━━━━━━━━━ 8s 10ms/step - accuracy: 0.0990 - loss: 2.3311 - val_accuracy: 0.1000 - val_loss: 2.3488
Epoch 3/10
782/782 ━━━━━━━━━━━━━━━━━━━━ 8s 10ms/step - accuracy: 0.0971 - loss: 2.3280 - val_accuracy: 0.1000 - val_loss: 2.3273
Epoch 4/10
782/782 ━━━━━━━━━━━━━━━━━━━━ 8s 10ms/step - accuracy: 0.0984 - loss: 2.3272 - val_accuracy: 0.1000 - val_loss: 2.3191
Epoch 5/10
782/782 ━━━━━━━━━━━━━━━━━━━━ 8s 10ms/step - accuracy: 0.1045 - loss: 2.3287 - val_accuracy: 0.1000 - val_loss: 2.3206
Epoch 6/10
782/782 ━━━━━━━━━━━━━━━━━━━━ 8s 10ms/step - accuracy: 0.1032 - loss: 2.3272 - val_accuracy: 0.1000 - val_loss: 2.3103
Epoch 7/10
782/782 ━━━━━━━━━━━━━━━━━━━━ 8s 10ms/step - accuracy: 0.0990 - loss: 2.3287 - val_accuracy: 0.1000 - val_loss: 2.3229
Epoch 8/10
782/782 ━━━━━━━━━━━━━━━━━━━━ 8s 10ms/step - accuracy: 0.1001 - loss: 2.3273 - val_accuracy: 0.1000 - val_loss: 2.3333
Epoch 9/10
782/782 ━━━━━━━━━━━━━━━━━━━━ 8s 10ms/step - accuracy: 0.1002 - loss: 2.3312 - val_accuracy: 0.1000 - val_loss: 2.3290
Epoch 10/10
782/782 ━━━━━━━━━━━━━━━━━━━━ 8s 10ms/step - accuracy: 0.0994 - loss: 2.3294 - val_accuracy: 0.1000 - val_loss: 2.3264
313/313 - 5s - 15ms/step - accuracy: 0.1000 - loss: 2.3264
EfficientNet Test accuracy: 0.10000000149011612
# Use ResNet-50 as a frozen feature extractor (without the top fully connected layer)
resnet_model = tf.keras.applications.ResNet50(
    include_top=False,   # drop the top classification layer
    weights='imagenet',  # ImageNet pretrained weights
    input_shape=(32, 32, 3)
)

# Freeze the pretrained layers so they are not updated during training
resnet_model.trainable = False

# Add a custom classification head on top of ResNet-50
model = tf.keras.Sequential([
    resnet_model,
    tf.keras.layers.GlobalAveragePooling2D(),        # global average pooling
    tf.keras.layers.Dense(10, activation='softmax')  # CIFAR-10 has 10 classes
])

# Compile the model
model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])

# Train the model and record history
history = model.fit(x_train, y_train, epochs=10, batch_size=64, validation_data=(x_test, y_test))

# Evaluate on the test set
test_loss, test_acc = model.evaluate(x_test, y_test, verbose=2)
print(f"ResNet Test accuracy: {test_acc}")

# Plot the training and validation loss curves
plt.figure(figsize=(10, 6))
plt.plot(history.history['loss'], label='Training Loss')
plt.plot(history.history['val_loss'], label='Validation Loss')
plt.title('Loss during Training and Validation')
plt.xlabel('Epochs')
plt.ylabel('Loss')
plt.legend()
plt.show()
Epoch 1/10
782/782 ━━━━━━━━━━━━━━━━━━━━ 24s 20ms/step - accuracy: 0.0980 - loss: 2.4080 - val_accuracy: 0.1000 - val_loss: 2.3360
Epoch 2/10
782/782 ━━━━━━━━━━━━━━━━━━━━ 9s 12ms/step - accuracy: 0.0980 - loss: 2.3526 - val_accuracy: 0.1000 - val_loss: 2.3432
Epoch 3/10
782/782 ━━━━━━━━━━━━━━━━━━━━ 9s 12ms/step - accuracy: 0.1007 - loss: 2.3432 - val_accuracy: 0.1000 - val_loss: 2.3263
Epoch 4/10
782/782 ━━━━━━━━━━━━━━━━━━━━ 9s 11ms/step - accuracy: 0.1027 - loss: 2.3470 - val_accuracy: 0.1000 - val_loss: 2.3581
Epoch 5/10
782/782 ━━━━━━━━━━━━━━━━━━━━ 9s 11ms/step - accuracy: 0.0996 - loss: 2.3540 - val_accuracy: 0.1000 - val_loss: 2.3316
Epoch 6/10
782/782 ━━━━━━━━━━━━━━━━━━━━ 9s 11ms/step - accuracy: 0.0993 - loss: 2.3496 - val_accuracy: 0.1000 - val_loss: 2.4083
Epoch 7/10
782/782 ━━━━━━━━━━━━━━━━━━━━ 9s 12ms/step - accuracy: 0.1008 - loss: 2.3473 - val_accuracy: 0.1000 - val_loss: 2.4054
Epoch 8/10
782/782 ━━━━━━━━━━━━━━━━━━━━ 9s 11ms/step - accuracy: 0.1002 - loss: 2.3534 - val_accuracy: 0.1000 - val_loss: 2.3475
Epoch 9/10
782/782 ━━━━━━━━━━━━━━━━━━━━ 9s 11ms/step - accuracy: 0.0993 - loss: 2.3522 - val_accuracy: 0.1000 - val_loss: 2.3304
Epoch 10/10
782/782 ━━━━━━━━━━━━━━━━━━━━ 9s 11ms/step - accuracy: 0.1008 - loss: 2.3523 - val_accuracy: 0.1000 - val_loss: 2.3446
313/313 - 4s - 12ms/step - accuracy: 0.1000 - loss: 2.3446
ResNet Test accuracy: 0.10000000149011612
The accuracy here is indeed poor. Partly this is because no careful tuning was done; resizing the images to 224x224, training for more epochs, and using a smaller batch size should all help. But the exactly-10% (random-chance) accuracy with frozen pretrained weights also points to a preprocessing mismatch: Keras' EfficientNetB0 expects raw [0, 255] pixels (it rescales internally), while ResNet50 expects inputs passed through resnet50.preprocess_input, yet here both received images already divided by 255, and at 32x32 the frozen ImageNet features have little signal to work with.
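A sketch of such a fix, resizing CIFAR-10 inputs to 224x224 and applying each backbone's own preprocessing (the helper function name is my own; the Keras APIs used are real):

```python
import tensorflow as tf

def make_transfer_model(backbone='resnet50', image_size=224, num_classes=10,
                        weights='imagenet'):
    """Frozen-backbone classifier with per-model preprocessing (illustrative sketch)."""
    inputs = tf.keras.Input(shape=(32, 32, 3))  # raw [0, 255] CIFAR-10 pixels
    x = tf.keras.layers.Resizing(image_size, image_size)(inputs)
    if backbone == 'resnet50':
        # ResNet50 expects its own (caffe-style) preprocessing on [0, 255] inputs
        x = tf.keras.applications.resnet50.preprocess_input(x)
        base = tf.keras.applications.ResNet50(include_top=False, weights=weights)
    else:
        # EfficientNet rescales internally, so raw [0, 255] pixels go straight in
        base = tf.keras.applications.EfficientNetB0(include_top=False, weights=weights)
    base.trainable = False       # freeze the pretrained layers
    x = base(x, training=False)  # keep BatchNormalization in inference mode
    x = tf.keras.layers.GlobalAveragePooling2D()(x)
    outputs = tf.keras.layers.Dense(num_classes, activation='softmax')(x)
    return tf.keras.Model(inputs, outputs)
```

Note that a model built this way should be fed the original [0, 255] images, not the /255-normalized arrays used earlier.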
Finally, this content takes real effort to produce; if it helped you, please give it a follow. Many thanks.