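Although YoloV7 shares a lineage with YoloV5 and YoloV8, its configuration file is extremely complex, which makes it hard to modify.
The yolov7.yaml configuration file is as follows: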
# parameters
nc: 80 # number of classes
depth_multiple: 1.0 # model depth multiple
width_multiple: 1.0 # layer channel multiple
# anchors
anchors:
- [12,16, 19,36, 40,28] # P3/8
- [36,75, 76,55, 72,146] # P4/16
- [142,110, 192,243, 459,401] # P5/32
# yolov7 backbone
backbone:
# [from, number, module, args]
[[-1, 1, Conv, [32, 3, 1]], # 0
[-1, 1, Conv, [64, 3, 2]], # 1-P1/2
[-1, 1, Conv, [64, 3, 1]],
[-1, 1, Conv, [128, 3, 2]], # 3-P2/4
[-1, 1, Conv, [64, 1, 1]],
[-2, 1, Conv, [64, 1, 1]],
[-1, 1, Conv, [64, 3, 1]],
[-1, 1, Conv, [64, 3, 1]],
[-1, 1, Conv, [64, 3, 1]],
[-1, 1, Conv, [64, 3, 1]],
[[-1, -3, -5, -6], 1, Concat, [1]],
[-1, 1, Conv, [256, 1, 1]], # 11
[-1, 1, MP, []],
[-1, 1, Conv, [128, 1, 1]],
[-3, 1, Conv, [128, 1, 1]],
[-1, 1, Conv, [128, 3, 2]],
[[-1, -3], 1, Concat, [1]], # 16-P3/8
[-1, 1, Conv, [128, 1, 1]],
[-2, 1, Conv, [128, 1, 1]],
[-1, 1, Conv, [128, 3, 1]],
[-1, 1, Conv, [128, 3, 1]],
[-1, 1, Conv, [128, 3, 1]],
[-1, 1, Conv, [128, 3, 1]],
[[-1, -3, -5, -6], 1, Concat, [1]],
[-1, 1, Conv, [512, 1, 1]], # 24
[-1, 1, MP, []],
[-1, 1, Conv, [256, 1, 1]],
[-3, 1, Conv, [256, 1, 1]],
[-1, 1, Conv, [256, 3, 2]],
[[-1, -3], 1, Concat, [1]], # 29-P4/16
[-1, 1, Conv, [256, 1, 1]],
[-2, 1, Conv, [256, 1, 1]],
[-1, 1, Conv, [256, 3, 1]],
[-1, 1, Conv, [256, 3, 1]],
[-1, 1, Conv, [256, 3, 1]],
[-1, 1, Conv, [256, 3, 1]],
[[-1, -3, -5, -6], 1, Concat, [1]],
[-1, 1, Conv, [1024, 1, 1]], # 37
[-1, 1, MP, []],
[-1, 1, Conv, [512, 1, 1]],
[-3, 1, Conv, [512, 1, 1]],
[-1, 1, Conv, [512, 3, 2]],
[[-1, -3], 1, Concat, [1]], # 42-P5/32
[-1, 1, Conv, [256, 1, 1]],
[-2, 1, Conv, [256, 1, 1]],
[-1, 1, Conv, [256, 3, 1]],
[-1, 1, Conv, [256, 3, 1]],
[-1, 1, Conv, [256, 3, 1]],
[-1, 1, Conv, [256, 3, 1]],
[[-1, -3, -5, -6], 1, Concat, [1]],
[-1, 1, Conv, [1024, 1, 1]], # 50
]
# yolov7 head
head:
[[-1, 1, SPPCSPC, [512]], # 51
[-1, 1, Conv, [256, 1, 1]],
[-1, 1, nn.Upsample, [None, 2, 'nearest']],
[37, 1, Conv, [256, 1, 1]], # route backbone P4
[[-1, -2], 1, Concat, [1]],
[-1, 1, Conv, [256, 1, 1]],
[-2, 1, Conv, [256, 1, 1]],
[-1, 1, Conv, [128, 3, 1]],
[-1, 1, Conv, [128, 3, 1]],
[-1, 1, Conv, [128, 3, 1]],
[-1, 1, Conv, [128, 3, 1]],
[[-1, -2, -3, -4, -5, -6], 1, Concat, [1]],
[-1, 1, Conv, [256, 1, 1]], # 63
[-1, 1, Conv, [128, 1, 1]],
[-1, 1, nn.Upsample, [None, 2, 'nearest']],
[24, 1, Conv, [128, 1, 1]], # route backbone P3
[[-1, -2], 1, Concat, [1]],
[-1, 1, Conv, [128, 1, 1]],
[-2, 1, Conv, [128, 1, 1]],
[-1, 1, Conv, [64, 3, 1]],
[-1, 1, Conv, [64, 3, 1]],
[-1, 1, Conv, [64, 3, 1]],
[-1, 1, Conv, [64, 3, 1]],
[[-1, -2, -3, -4, -5, -6], 1, Concat, [1]],
[-1, 1, Conv, [128, 1, 1]], # 75
[-1, 1, MP, []],
[-1, 1, Conv, [128, 1, 1]],
[-3, 1, Conv, [128, 1, 1]],
[-1, 1, Conv, [128, 3, 2]],
[[-1, -3, 63], 1, Concat, [1]],
[-1, 1, Conv, [256, 1, 1]],
[-2, 1, Conv, [256, 1, 1]],
[-1, 1, Conv, [128, 3, 1]],
[-1, 1, Conv, [128, 3, 1]],
[-1, 1, Conv, [128, 3, 1]],
[-1, 1, Conv, [128, 3, 1]],
[[-1, -2, -3, -4, -5, -6], 1, Concat, [1]],
[-1, 1, Conv, [256, 1, 1]], # 88
[-1, 1, MP, []],
[-1, 1, Conv, [256, 1, 1]],
[-3, 1, Conv, [256, 1, 1]],
[-1, 1, Conv, [256, 3, 2]],
[[-1, -3, 51], 1, Concat, [1]],
[-1, 1, Conv, [512, 1, 1]],
[-2, 1, Conv, [512, 1, 1]],
[-1, 1, Conv, [256, 3, 1]],
[-1, 1, Conv, [256, 3, 1]],
[-1, 1, Conv, [256, 3, 1]],
[-1, 1, Conv, [256, 3, 1]],
[[-1, -2, -3, -4, -5, -6], 1, Concat, [1]],
[-1, 1, Conv, [512, 1, 1]], # 101
[75, 1, RepConv, [256, 3, 1]],
[88, 1, RepConv, [512, 3, 1]],
[101, 1, RepConv, [1024, 3, 1]],
[[102,103,104], 1, Detect, [nc, anchors]], # Detect(P3, P4, P5)
]
That is far too many layers, so the first step is to wrap YoloV7's repeated structures into modules.
Class Images Labels P R mAP@.5 mAP@.5:.95
all 229 1407 0.966 0.99 0.993 0.734
c17 229 131 0.977 0.992 0.991 0.828
c5 229 68 0.941 1 0.99 0.837
helicopter 229 43 0.949 1 0.98 0.628
c130 229 85 0.994 1 0.997 0.691
f16 229 57 0.99 0.965 0.994 0.694
b2 229 2 0.904 1 0.995 0.796
other 229 86 0.988 0.955 0.992 0.565
b52 229 65 0.98 0.969 0.985 0.819
kc10 229 62 0.995 0.984 0.986 0.83
command 229 40 0.991 1 0.996 0.835
f15 229 123 0.992 0.992 0.997 0.652
kc135 229 91 0.986 0.989 0.987 0.707
a10 229 27 1 0.889 0.997 0.454
b1 229 20 0.989 1 0.996 0.74
aew 229 25 0.949 1 0.981 0.751
f22 229 17 0.977 1 0.996 0.754
p3 229 105 0.998 1 0.998 0.797
p8 229 1 0.853 1 0.995 0.597
f35 229 32 0.994 1 0.996 0.58
f18 229 125 0.991 0.992 0.993 0.822
v22 229 41 0.995 1 0.996 0.696
su-27 229 31 0.992 1 0.996 0.829
il-38 229 27 0.962 1 0.996 0.857
tu-134 229 1 0.846 1 0.995 0.896
su-33 229 2 0.939 1 0.995 0.498
an-70 229 2 0.904 1 0.995 0.846
tu-22 229 98 0.998 1 0.998 0.81
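The table above records the official YoloV7 results, so that later results can be compared against them!
As the architecture diagram shows, the blocks that need to be wrapped are ELAN, MP1, MP2, and ELAN-H.
ELAN and ELAN-H are quite similar and differ only in their internal channel widths, so they are compared side by side to make them easier to study, as shown in the figure below:
First, wrap E_ELAN. The code is as follows: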
import torch
import torch.nn as nn

# Conv is the Conv block defined in YoloV7's models/common.py.


class E_ELAN(nn.Module):
    def __init__(self, c1, c2, e=0.5):
        '''
        :param c1: input channels
        :param c2: output channels of the final 1x1 convolution
        :param e: ratio of the per-branch channels to c1
        '''
        super(E_ELAN, self).__init__()
        c_ = int(c1 * e)
        self.conv1 = Conv(c1, c_, k=1, s=1)
        self.conv2 = Conv(c1, c_, k=1, s=1)
        self.conv3 = Conv(c_, c_, k=3, s=1)
        self.conv4 = Conv(c_, c_, k=3, s=1)
        self.conv5 = Conv(c_, c_, k=3, s=1)
        self.conv6 = Conv(c_, c_, k=3, s=1)
        self.conv7 = Conv(4 * c_, c2, k=1, s=1)

    def forward(self, x):
        # Branch 1 output
        output1 = self.conv1(x)
        # Branch 2 outputs
        output2_1 = self.conv2(x)
        output2_2 = self.conv3(output2_1)
        output2_3 = self.conv4(output2_2)
        output2_4 = self.conv5(output2_3)
        output2_5 = self.conv6(output2_4)
        # Concatenate four feature maps of c_ channels each
        output_cat = torch.cat((output1, output2_1, output2_3, output2_5), dim=1)
        return self.conv7(output_cat)
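First, the parameters: c1 is the number of input channels, c2 is the number of output channels, and e is a ratio. The input x passes through two branches, each with half the input channels, so e is set to 0.5. Four feature maps are concatenated, giving 4 * c_ channels, which is twice the input channels when e is 0.5. A final convolution then adjusts the channel count, and c2 is the output width of that convolution.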
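The channel bookkeeping described above can be checked with plain arithmetic. The following is a minimal sketch rather than part of the original code: the helper name `e_elan_channels` is hypothetical, and it reproduces only the channel counts of `E_ELAN`, not the convolutions themselves.

```python
def e_elan_channels(c1, c2, e=0.5):
    """Return (branch width, concat width, output width) for E_ELAN.

    Hypothetical helper: pure channel arithmetic, no tensors involved.
    """
    c_ = int(c1 * e)      # width of each 1x1 branch and of every 3x3 conv
    concat = 4 * c_       # output1, output2_1, output2_3, output2_5
    return c_, concat, c2  # conv7 maps `concat` channels to c2


# First backbone ELAN in yolov7_new.yaml: 128 input channels, args [256, 0.5]
print(e_elan_channels(128, 256, 0.5))     # (64, 256, 256)
# Last backbone ELAN: 1024 input channels, args [1024, 0.25]
print(e_elan_channels(1024, 1024, 0.25))  # (256, 1024, 1024)
```

With e=0.25 the concatenated width equals the input width instead of doubling it, which is why the last backbone block uses a different ratio.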
Next is the ELAN_H module. The code is as follows:
class E_ELAN_H(nn.Module):
    def __init__(self, c1, c2, e1=0.5, e2=0.25):
        '''
        :param c1: input channels
        :param c2: output channels of the final 1x1 convolution
        :param e1: ratio of the two 1x1 branch widths to c1
        :param e2: ratio of the 3x3 convolution widths to c1
        '''
        super(E_ELAN_H, self).__init__()
        c_ = int(c1 * e1)
        c_hidden = int(c1 * e2)
        self.conv1 = Conv(c1, c_, k=1, s=1)
        self.conv2 = Conv(c1, c_, k=1, s=1)
        self.conv3 = Conv(c_, c_hidden, k=3, s=1)
        self.conv4 = Conv(c_hidden, c_hidden, k=3, s=1)
        self.conv5 = Conv(c_hidden, c_hidden, k=3, s=1)
        self.conv6 = Conv(c_hidden, c_hidden, k=3, s=1)
        # Two maps of c_ channels plus four of c_hidden channels are concatenated;
        # this equals 2 * c1 for the default ratios e1=0.5, e2=0.25.
        self.conv7 = Conv(2 * c_ + 4 * c_hidden, c2, k=1, s=1)

    def forward(self, x):
        '''
        :param x: input feature map
        :return: output feature map with c2 channels
        '''
        # Branch 1 output
        output1 = self.conv1(x)
        # Branch 2 outputs
        output2_1 = self.conv2(x)
        output2_2 = self.conv3(output2_1)
        output2_3 = self.conv4(output2_2)
        output2_4 = self.conv5(output2_3)
        output2_5 = self.conv6(output2_4)
        output_cat = torch.cat((output1, output2_1, output2_2, output2_3, output2_4, output2_5), dim=1)
        return self.conv7(output_cat)
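First, the parameters: c1 is the number of input channels, c2 is the number of output channels, and e1 is a ratio. The input x passes through two branches, each with half the input channels, so e1 is set to 0.5. e2 is the ratio for the 3x3 convolutions in the second branch; their width is a quarter of c1, so e2 is set to 0.25. Six feature maps are concatenated, giving twice the input channels, and a final convolution then adjusts the channel count; c2 is the output width of that convolution.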
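The same arithmetic can be written out for `E_ELAN_H`. This is a small sketch under the assumption that only channel counts matter; `e_elan_h_channels` is a hypothetical helper, not a function from the YoloV7 code base. It also shows that the concatenated width equals `2 * c1` only for the default ratios.

```python
def e_elan_h_channels(c1, c2, e1=0.5, e2=0.25):
    """Return (c_, c_hidden, concat width, output width) for E_ELAN_H.

    Hypothetical helper: pure channel arithmetic, no tensors involved.
    """
    c_ = int(c1 * e1)               # the two 1x1 branches
    c_hidden = int(c1 * e2)         # the four 3x3 convolutions
    concat = 2 * c_ + 4 * c_hidden  # six feature maps are concatenated
    return c_, c_hidden, concat, c2


# Head block fed by a 512-channel Concat, args [256, 0.5, 0.25]
print(e_elan_h_channels(512, 256))                 # (256, 128, 1024, 256)
# A non-default e2 no longer gives 2 * c1 after concatenation
print(e_elan_h_channels(512, 256, e1=0.5, e2=0.125))  # (256, 64, 768, 256)
```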
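The MP module comes in two variants: in one the output has the same number of channels as the input, and in the other the output has twice as many. The structure is identical and only the channel counts differ, so both can share one class and be switched by a ratio. The code is as follows: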
class MPConv(nn.Module):
    def __init__(self, c1, e=0.5):
        '''
        :param c1: input channels
        :param e: ratio of each branch's output channels to c1
        '''
        super(MPConv, self).__init__()
        c_ = int(c1 * e)
        # Branch 1: max pooling followed by a 1x1 convolution
        self.conv1 = nn.Sequential(
            nn.MaxPool2d(2, 2),
            Conv(c1, c_, 1, 1),
        )
        # Branch 2: 1x1 convolution followed by a stride-2 3x3 convolution
        self.conv2 = nn.Sequential(
            Conv(c1, c_, 1, 1),
            Conv(c_, c_, 3, 2),
        )
        self.cat = Concat()

    def forward(self, x):
        # Branch 1
        output1 = self.conv1(x)
        # Branch 2
        output2 = self.conv2(x)
        # Concatenate along the channel dimension: 2 * c_ output channels
        output = self.cat((output1, output2))
        return output
e is the ratio; setting e controls the number of output channels.
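To illustrate how e selects between the two variants, here is a minimal sketch of the output shape of `MPConv`. The helper `mpconv_output_shape` is hypothetical, and it assumes even input height and width so that both branches produce the same spatial size.

```python
def mpconv_output_shape(c1, h, w, e=0.5):
    """Return (channels, height, width) produced by MPConv.

    Hypothetical helper; assumes even h and w so the pooled branch and the
    stride-2 convolution branch agree on the spatial size.
    """
    if h % 2 or w % 2:
        raise ValueError("this sketch assumes even height and width")
    c_ = int(c1 * e)              # channels of each branch
    return 2 * c_, h // 2, w // 2  # two branches concatenated, stride 2


# e=0.5: channel count is preserved (backbone usage in yolov7_new.yaml)
print(mpconv_output_shape(256, 160, 160, e=0.5))  # (256, 80, 80)
# e=1: channel count doubles (head usage in yolov7_new.yaml)
print(mpconv_output_shape(128, 80, 80, e=1))      # (256, 40, 40)
```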
With the modules above wrapped, their argument-handling logic can be added to the parse_model function in yolo.py. The code is as follows:
if m in [nn.Conv2d, Conv, RobustConv, RobustConv2, E_ELAN, E_ELAN_H, DWConv, GhostConv, RepConv, RepConv_OREPA,
         DownC,
         SPP, SPPF, SPPCSPC, GhostSPPCSPC, MixConv2d, Focus, Stem, GhostStem, CrossConv,
         Bottleneck, BottleneckCSPA, BottleneckCSPB, BottleneckCSPC,
         RepBottleneck, RepBottleneckCSPA, RepBottleneckCSPB, RepBottleneckCSPC,
         Res, ResCSPA, ResCSPB, ResCSPC,
         RepRes, RepResCSPA, RepResCSPB, RepResCSPC,
         ResX, ResXCSPA, ResXCSPB, ResXCSPC,
         RepResX, RepResXCSPA, RepResXCSPB, RepResXCSPC,
         Ghost, GhostCSPA, GhostCSPB, GhostCSPC,
         SwinTransformerBlock, STCSPA, STCSPB, STCSPC,
         SwinTransformer2Block, ST2CSPA, ST2CSPB, ST2CSPC]:
    c1, c2 = ch[f], args[0]
    # (the rest of this branch stays as in the original parse_model)
elif m is MPConv:
    # MPConv outputs 2 * c1 channels when e == 1 and c1 channels when e == 0.5;
    # set c2 explicitly in both cases instead of relying on the previous layer's value.
    c2 = ch[f] * 2 if args[0] == 1 else ch[f]
    args = [ch[f], *args]
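Note that when the MPConv ratio is 1, the input and output channel counts differ, so the output channel count must be assigned to c2; otherwise the c1 of the next layer will be wrong. The location of the change is shown in the figure below:
Next, modify the configuration file. The key point is that the layer ids must line up. I named the new configuration file yolov7_new.yaml, with the following content: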
# parameters
nc: 80 # number of classes
depth_multiple: 1.0 # model depth multiple
width_multiple: 1.0 # layer channel multiple
# anchors
anchors:
- [12,16, 19,36, 40,28] # P3/8
- [36,75, 76,55, 72,146] # P4/16
- [142,110, 192,243, 459,401] # P5/32
# yolov7 backbone
backbone:
# [from, number, module, args]
[[-1, 1, Conv, [32, 3, 1]], # 0
[-1, 1, Conv, [64, 3, 2]], # 1-P1/2
[-1, 1, Conv, [64, 3, 1]],
[-1, 1, Conv, [128, 3, 2]], # 3-P2/4
[-1, 1, E_ELAN, [256,0.5]], # 4
[-1, 1, MPConv, [0.5]],
[-1, 1, E_ELAN, [512, 0.5]], # 6
[-1, 1, MPConv, [0.5]],
[-1, 1, E_ELAN, [1024,0.5]], # 8
[-1, 1, MPConv, [0.5]],
[-1, 1, E_ELAN, [1024,0.25]], # 10
]
# yolov7 head
head:
[[-1, 1, SPPCSPC, [512]], # 11
[-1, 1, Conv, [256, 1, 1]],#12
[-1, 1, nn.Upsample, [None, 2, 'nearest']],
[8, 1, Conv, [256, 1, 1]], # 14 route backbone P4
[[-1, -2], 1, Concat, [1]], # 15
[-1, 1, E_ELAN_H, [256,0.5,0.25]], #16
[-1, 1, Conv, [128, 1, 1]], #17
[-1, 1, nn.Upsample, [None, 2, 'nearest']], #18
[6, 1, Conv, [128, 1, 1]], # route backbone P3 19
[[-1, -2], 1, Concat, [1]], # 20
[-1, 1, E_ELAN_H, [128,0.5,0.25]], #21
[-1, 1, MPConv, [1]], # 22
[[-1, 16], 1, Concat, [1]],# 23
[-1, 1, E_ELAN_H, [256,0.5,0.25]], # 24
[-1, 1, MPConv, [1]], # 25
[[-1, 11], 1, Concat, [1]], # 26
[-1, 1, E_ELAN_H, [512,0.5,0.25]], #27
[21, 1, RepConv, [256, 3, 1]],
[24, 1, RepConv, [512, 3, 1]],
[27, 1, RepConv, [1024, 3, 1]],
[[28,29,30], 1, IDetect, [nc, anchors]], # Detect(P3, P4, P5)
]
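Because the layer ids have to line up, it can help to trace the output channels of every layer before training. The sketch below is an illustration only: `LAYERS` is a hand-copied transcription of yolov7_new.yaml (the detection layer is left out), `trace_channels` is a hypothetical helper that mimics the channel bookkeeping of parse_model, and it assumes width_multiple is 1.0 so no channel scaling applies.

```python
# (from, module, args) rows transcribed from yolov7_new.yaml; IDetect is omitted.
LAYERS = [
    (-1, "Conv", [32, 3, 1]),            # 0
    (-1, "Conv", [64, 3, 2]),            # 1
    (-1, "Conv", [64, 3, 1]),            # 2
    (-1, "Conv", [128, 3, 2]),           # 3
    (-1, "E_ELAN", [256, 0.5]),          # 4
    (-1, "MPConv", [0.5]),               # 5
    (-1, "E_ELAN", [512, 0.5]),          # 6
    (-1, "MPConv", [0.5]),               # 7
    (-1, "E_ELAN", [1024, 0.5]),         # 8
    (-1, "MPConv", [0.5]),               # 9
    (-1, "E_ELAN", [1024, 0.25]),        # 10
    (-1, "SPPCSPC", [512]),              # 11
    (-1, "Conv", [256, 1, 1]),           # 12
    (-1, "Upsample", []),                # 13
    (8, "Conv", [256, 1, 1]),            # 14
    ([-1, -2], "Concat", [1]),           # 15
    (-1, "E_ELAN_H", [256, 0.5, 0.25]),  # 16
    (-1, "Conv", [128, 1, 1]),           # 17
    (-1, "Upsample", []),                # 18
    (6, "Conv", [128, 1, 1]),            # 19
    ([-1, -2], "Concat", [1]),           # 20
    (-1, "E_ELAN_H", [128, 0.5, 0.25]),  # 21
    (-1, "MPConv", [1]),                 # 22
    ([-1, 16], "Concat", [1]),           # 23
    (-1, "E_ELAN_H", [256, 0.5, 0.25]),  # 24
    (-1, "MPConv", [1]),                 # 25
    ([-1, 11], "Concat", [1]),           # 26
    (-1, "E_ELAN_H", [512, 0.5, 0.25]),  # 27
    (21, "RepConv", [256, 3, 1]),        # 28
    (24, "RepConv", [512, 3, 1]),        # 29
    (27, "RepConv", [1024, 3, 1]),       # 30
]


def trace_channels(layers, in_ch=3):
    """Return a list whose i-th entry is the output channel count of layer i."""
    out = []
    for i, (f, m, args) in enumerate(layers):
        def src(j):
            # Negative indices are relative to the current layer;
            # -1 on layer 0 refers to the input image.
            k = j if j >= 0 else i + j
            return in_ch if k < 0 else out[k]

        if m == "Concat":
            c2 = sum(src(j) for j in f)      # channels add up
        elif m == "MPConv":
            c2 = 2 * int(src(f) * args[0])   # two branches of int(c1 * e)
        elif m == "Upsample":
            c2 = src(f)                      # channels unchanged
        else:
            c2 = args[0]                     # first arg is the output width
        out.append(c2)
    return out


channels = trace_channels(LAYERS)
for idx in (15, 20, 23, 26, 28, 29, 30):
    print(idx, channels[idx])
```

If a `from` index pointed at the wrong layer, the Concat widths at ids 15, 20, 23, and 26 would no longer match the inputs expected by the following blocks, which makes this kind of trace a cheap sanity check.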
Class Images Labels P R mAP@.5 mAP@.5:.95: 100%|██████████| 15/15 [00:02<00:00, 5.42it/s]
all 229 1407 0.979 0.977 0.993 0.736
c17 229 131 0.99 1 0.996 0.841
c5 229 68 0.962 1 0.996 0.839
helicopter 229 43 0.964 1 0.988 0.636
c130 229 85 1 0.997 0.997 0.671
f16 229 57 1 0.966 0.996 0.697
b2 229 2 0.948 1 0.995 0.647
other 229 86 1 0.914 0.994 0.562
b52 229 65 0.985 0.969 0.98 0.834
kc10 229 62 1 0.974 0.986 0.823
command 229 40 0.998 1 0.996 0.798
f15 229 123 1 0.979 0.997 0.659
kc135 229 91 0.988 0.989 0.987 0.679
a10 229 27 1 0.599 0.989 0.425
b1 229 20 0.994 1 0.997 0.741
aew 229 25 0.951 1 0.97 0.769
f22 229 17 0.993 1 0.996 0.742
p3 229 105 1 1 0.998 0.788
p8 229 1 0.921 1 0.995 0.498
f35 229 32 1 0.998 0.996 0.62
f18 229 125 0.987 0.992 0.995 0.816
v22 229 41 0.998 1 0.996 0.719
su-27 229 31 0.996 1 0.996 0.84
il-38 229 27 0.994 1 0.995 0.878
tu-134 229 1 0.866 1 0.995 0.995
su-33 229 2 0.965 1 0.995 0.647
an-70 229 2 0.94 1 0.995 0.896
tu-22 229 98 0.999 1 0.997 0.817
300 epochs completed in 4.911 hours.
Optimizer stripped from runs\train\exp2\weights\last.pt, 75.1MB
Optimizer stripped from runs\train\exp2\weights\best.pt, 75.1MB
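The test results are on par with, and slightly better than, the official results (overall mAP@.5 of 0.993 in both runs, mAP@.5:.95 of 0.736 versus 0.734), which shows that the wrapped model works correctly!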
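To compare the two runs metric by metric, the `all` rows of the two result tables can be parsed and subtracted. This is a small sketch; `parse_row` and `metric_deltas` are hypothetical helpers, and the two input strings are copied from the tables in this article.

```python
METRICS = ("P", "R", "mAP@.5", "mAP@.5:.95")


def parse_row(line):
    """Split one result row into (class name, {metric: value})."""
    fields = line.split()
    name = fields[0]
    values = [float(v) for v in fields[3:7]]  # skip the Images and Labels columns
    return name, dict(zip(METRICS, values))


def metric_deltas(before, after):
    """Return after - before for every metric."""
    return {key: after[key] - before[key] for key in METRICS}


_, official = parse_row("all 229 1407 0.966 0.99 0.993 0.734")
_, wrapped = parse_row("all 229 1407 0.979 0.977 0.993 0.736")
deltas = metric_deltas(official, wrapped)

for key in METRICS:
    print(f"{key:>11}: {official[key]:.3f} -> {wrapped[key]:.3f} ({deltas[key]:+.3f})")
```

The same two helpers can be applied to any per-class row to see where the wrapped model gains or loses.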
https://blog.csdn.net/m0_47867638/article/details/134128245?spm=1001.2014.3001.5502