YoloV7 Improvement Strategy: Repackaging YoloV7 to Make Later Changes Easier

AI浩
Published 2024-10-22 12:17:02
Collected in the column: AI智韵

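Abstract

YoloV7 shares its lineage with YoloV5 and YoloV8, but its configuration file is extremely complex, which makes the model hard to modify.

The yolov7.yaml configuration file is as follows: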

# parameters
nc: 80  # number of classes
depth_multiple: 1.0  # model depth multiple
width_multiple: 1.0  # layer channel multiple

# anchors
anchors:
  - [12,16, 19,36, 40,28]  # P3/8
  - [36,75, 76,55, 72,146]  # P4/16
  - [142,110, 192,243, 459,401]  # P5/32

# yolov7 backbone
backbone:
  # [from, number, module, args]
  [[-1, 1, Conv, [32, 3, 1]],  # 0
  
   [-1, 1, Conv, [64, 3, 2]],  # 1-P1/2      
   [-1, 1, Conv, [64, 3, 1]],
   
   [-1, 1, Conv, [128, 3, 2]],  # 3-P2/4  
   [-1, 1, Conv, [64, 1, 1]],
   [-2, 1, Conv, [64, 1, 1]],
   [-1, 1, Conv, [64, 3, 1]],
   [-1, 1, Conv, [64, 3, 1]],
   [-1, 1, Conv, [64, 3, 1]],
   [-1, 1, Conv, [64, 3, 1]],
   [[-1, -3, -5, -6], 1, Concat, [1]],
   [-1, 1, Conv, [256, 1, 1]],  # 11
         
   [-1, 1, MP, []],
   [-1, 1, Conv, [128, 1, 1]],
   [-3, 1, Conv, [128, 1, 1]],
   [-1, 1, Conv, [128, 3, 2]],
   [[-1, -3], 1, Concat, [1]],  # 16-P3/8  
   [-1, 1, Conv, [128, 1, 1]],
   [-2, 1, Conv, [128, 1, 1]],
   [-1, 1, Conv, [128, 3, 1]],
   [-1, 1, Conv, [128, 3, 1]],
   [-1, 1, Conv, [128, 3, 1]],
   [-1, 1, Conv, [128, 3, 1]],
   [[-1, -3, -5, -6], 1, Concat, [1]],
   [-1, 1, Conv, [512, 1, 1]],  # 24
         
   [-1, 1, MP, []],
   [-1, 1, Conv, [256, 1, 1]],
   [-3, 1, Conv, [256, 1, 1]],
   [-1, 1, Conv, [256, 3, 2]],
   [[-1, -3], 1, Concat, [1]],  # 29-P4/16  
   [-1, 1, Conv, [256, 1, 1]],
   [-2, 1, Conv, [256, 1, 1]],
   [-1, 1, Conv, [256, 3, 1]],
   [-1, 1, Conv, [256, 3, 1]],
   [-1, 1, Conv, [256, 3, 1]],
   [-1, 1, Conv, [256, 3, 1]],
   [[-1, -3, -5, -6], 1, Concat, [1]],
   [-1, 1, Conv, [1024, 1, 1]],  # 37
         
   [-1, 1, MP, []],
   [-1, 1, Conv, [512, 1, 1]],
   [-3, 1, Conv, [512, 1, 1]],
   [-1, 1, Conv, [512, 3, 2]],
   [[-1, -3], 1, Concat, [1]],  # 42-P5/32  
   [-1, 1, Conv, [256, 1, 1]],
   [-2, 1, Conv, [256, 1, 1]],
   [-1, 1, Conv, [256, 3, 1]],
   [-1, 1, Conv, [256, 3, 1]],
   [-1, 1, Conv, [256, 3, 1]],
   [-1, 1, Conv, [256, 3, 1]],
   [[-1, -3, -5, -6], 1, Concat, [1]],
   [-1, 1, Conv, [1024, 1, 1]],  # 50
  ]

# yolov7 head
head:
  [[-1, 1, SPPCSPC, [512]], # 51
  
   [-1, 1, Conv, [256, 1, 1]],
   [-1, 1, nn.Upsample, [None, 2, 'nearest']],
   [37, 1, Conv, [256, 1, 1]], # route backbone P4
   [[-1, -2], 1, Concat, [1]],
   
   [-1, 1, Conv, [256, 1, 1]],
   [-2, 1, Conv, [256, 1, 1]],
   [-1, 1, Conv, [128, 3, 1]],
   [-1, 1, Conv, [128, 3, 1]],
   [-1, 1, Conv, [128, 3, 1]],
   [-1, 1, Conv, [128, 3, 1]],
   [[-1, -2, -3, -4, -5, -6], 1, Concat, [1]],
   [-1, 1, Conv, [256, 1, 1]], # 63
   
   [-1, 1, Conv, [128, 1, 1]],
   [-1, 1, nn.Upsample, [None, 2, 'nearest']],
   [24, 1, Conv, [128, 1, 1]], # route backbone P3
   [[-1, -2], 1, Concat, [1]],
   
   [-1, 1, Conv, [128, 1, 1]],
   [-2, 1, Conv, [128, 1, 1]],
   [-1, 1, Conv, [64, 3, 1]],
   [-1, 1, Conv, [64, 3, 1]],
   [-1, 1, Conv, [64, 3, 1]],
   [-1, 1, Conv, [64, 3, 1]],
   [[-1, -2, -3, -4, -5, -6], 1, Concat, [1]],
   [-1, 1, Conv, [128, 1, 1]], # 75
      
   [-1, 1, MP, []],
   [-1, 1, Conv, [128, 1, 1]],
   [-3, 1, Conv, [128, 1, 1]],
   [-1, 1, Conv, [128, 3, 2]],
   [[-1, -3, 63], 1, Concat, [1]],
   
   [-1, 1, Conv, [256, 1, 1]],
   [-2, 1, Conv, [256, 1, 1]],
   [-1, 1, Conv, [128, 3, 1]],
   [-1, 1, Conv, [128, 3, 1]],
   [-1, 1, Conv, [128, 3, 1]],
   [-1, 1, Conv, [128, 3, 1]],
   [[-1, -2, -3, -4, -5, -6], 1, Concat, [1]],
   [-1, 1, Conv, [256, 1, 1]], # 88
      
   [-1, 1, MP, []],
   [-1, 1, Conv, [256, 1, 1]],
   [-3, 1, Conv, [256, 1, 1]],
   [-1, 1, Conv, [256, 3, 2]],
   [[-1, -3, 51], 1, Concat, [1]],
   
   [-1, 1, Conv, [512, 1, 1]],
   [-2, 1, Conv, [512, 1, 1]],
   [-1, 1, Conv, [256, 3, 1]],
   [-1, 1, Conv, [256, 3, 1]],
   [-1, 1, Conv, [256, 3, 1]],
   [-1, 1, Conv, [256, 3, 1]],
   [[-1, -2, -3, -4, -5, -6], 1, Concat, [1]],
   [-1, 1, Conv, [512, 1, 1]], # 101
   
   [75, 1, RepConv, [256, 3, 1]],
   [88, 1, RepConv, [512, 3, 1]],
   [101, 1, RepConv, [1024, 3, 1]],

   [[102,103,104], 1, Detect, [nc, anchors]],   # Detect(P3, P4, P5)
  ]

That is far too many layers. So the first step is to wrap the repeated structures of YoloV7 into modules.
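To see why the flat file is hard to edit, it helps to recall how the from field is interpreted in YOLO-style configs: negative values are relative to the current layer, and non-negative values are absolute layer indices. The helper below, resolve_from, is a hypothetical name used only for illustration and is not part of the YoloV7 code base; it is a minimal sketch of that rule.

```python
def resolve_from(layer_index, frm):
    """Map a `from` entry to absolute layer indices.

    Negative values count back from the current layer; other values are
    already absolute. (Illustrative helper, not a YoloV7 function.)
    """
    entries = [frm] if isinstance(frm, int) else list(frm)
    return [layer_index + f if f < 0 else f for f in entries]


# Layer 10 of the original backbone: [[-1, -3, -5, -6], 1, Concat, [1]]
print(resolve_from(10, [-1, -3, -5, -6]))  # [9, 7, 5, 4]

# Layer 80 of the original head: [[-1, -3, 63], 1, Concat, [1]]
print(resolve_from(80, [-1, -3, 63]))  # [79, 77, 63]
```

Inserting or deleting a single layer shifts every absolute index after it (37, 24, 63 and 51 in the file above), so fewer layers means fewer indices to keep in sync.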

Test results of the official YoloV7 code

               Class      Images      Labels           P           R      mAP@.5  mAP@.5:.95
                 all         229        1407       0.966        0.99       0.993       0.734
                 c17         229         131       0.977       0.992       0.991       0.828
                  c5         229          68       0.941           1        0.99       0.837
          helicopter         229          43       0.949           1        0.98       0.628
                c130         229          85       0.994           1       0.997       0.691
                 f16         229          57        0.99       0.965       0.994       0.694
                  b2         229           2       0.904           1       0.995       0.796
               other         229          86       0.988       0.955       0.992       0.565
                 b52         229          65        0.98       0.969       0.985       0.819
                kc10         229          62       0.995       0.984       0.986        0.83
             command         229          40       0.991           1       0.996       0.835
                 f15         229         123       0.992       0.992       0.997       0.652
               kc135         229          91       0.986       0.989       0.987       0.707
                 a10         229          27           1       0.889       0.997       0.454
                  b1         229          20       0.989           1       0.996        0.74
                 aew         229          25       0.949           1       0.981       0.751
                 f22         229          17       0.977           1       0.996       0.754
                  p3         229         105       0.998           1       0.998       0.797
                  p8         229           1       0.853           1       0.995       0.597
                 f35         229          32       0.994           1       0.996        0.58
                 f18         229         125       0.991       0.992       0.993       0.822
                 v22         229          41       0.995           1       0.996       0.696
               su-27         229          31       0.992           1       0.996       0.829
               il-38         229          27       0.962           1       0.996       0.857
              tu-134         229           1       0.846           1       0.995       0.896
               su-33         229           2       0.939           1       0.995       0.498
               an-70         229           2       0.904           1       0.995       0.846
               tu-22         229          98       0.998           1       0.998        0.81

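The official YoloV7 results are recorded here as a baseline for comparison with later results.

YoloV7 architecture diagram

From the architecture diagram, the modules that need to be wrapped are ELAN, MP1, MP2, and ELAN-H.

Wrapping the modules

ELAN and ELAN-H

ELAN and ELAN-H are quite similar; only the internal channel counts differ, so they are presented together for comparison, as shown in the figure below:

First, wrap E_ELAN. The code is as follows: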

import torch
import torch.nn as nn

# Conv is YoloV7's Conv block (Conv2d + BatchNorm2d + SiLU) from models/common.py.


class E_ELAN(nn.Module):
    def __init__(self, c1, c2, e=0.5):
        '''
        :param c1: input channels
        :param c2: output channels of the final 1x1 conv
        :param e: ratio of each branch's channels to c1
        '''
        super(E_ELAN, self).__init__()
        c_ = int(c1 * e)
        self.conv1 = Conv(c1, c_, k=1, s=1)

        self.conv2 = Conv(c1, c_, k=1, s=1)

        self.conv3 = Conv(c_, c_, k=3, s=1)
        self.conv4 = Conv(c_, c_, k=3, s=1)
        self.conv5 = Conv(c_, c_, k=3, s=1)
        self.conv6 = Conv(c_, c_, k=3, s=1)
        self.conv7 = Conv(4 * c_, c2, k=1, s=1)

    def forward(self, x):
        # Branch 1 output
        output1 = self.conv1(x)
        # Branch 2 outputs
        output2_1 = self.conv2(x)
        output2_2 = self.conv3(output2_1)
        output2_3 = self.conv4(output2_2)
        output2_4 = self.conv5(output2_3)
        output2_5 = self.conv6(output2_4)
        # Concatenate four feature maps along the channel dimension
        output_cat = torch.cat((output1, output2_1, output2_3, output2_5), dim=1)
        return self.conv7(output_cat)

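First, the parameters: c1 is the number of input channels, c2 is the number of output channels, and e is a ratio. The input x feeds two branches, each with half of the input channels, so e defaults to 0.5. Four feature maps are concatenated, giving 4 * c_ channels (twice the input channels when e = 0.5). A final 1x1 convolution then adjusts the channel count, and c2 is the output width of that convolution.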
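The channel arithmetic described above can be checked without building the network. The function e_elan_channels below is a hypothetical helper written for this illustration; it only repeats the integer arithmetic of the E_ELAN constructor, using values taken from the configuration file.

```python
def e_elan_channels(c1, c2, e=0.5):
    """Channel counts inside E_ELAN, mirroring the constructor arithmetic."""
    c_ = int(c1 * e)   # width of each branch
    concat = 4 * c_    # four feature maps are concatenated
    return {"branch": c_, "concat": concat, "out": c2}


# First ELAN block of the backbone: 128 channels in, 256 out
print(e_elan_channels(128, 256))
# {'branch': 64, 'concat': 256, 'out': 256}

# Last ELAN block of the backbone: 1024 in, 1024 out, e = 0.25
print(e_elan_channels(1024, 1024, e=0.25))
# {'branch': 256, 'concat': 1024, 'out': 1024}
```

The 64-channel and 256-channel branches match the Conv widths listed for the corresponding blocks in the original yolov7.yaml.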

Next is the ELAN_H module. The code is as follows:


# Requires torch, torch.nn as nn, and YoloV7's Conv from models/common.py.
class E_ELAN_H(nn.Module):
    def __init__(self, c1, c2, e1=0.5, e2=0.25):
        '''
        :param c1: input channels
        :param c2: output channels of the final 1x1 conv
        :param e1: ratio of the two 1x1 branch convs to c1
        :param e2: ratio of the 3x3 convs in the second branch to c1
        '''
        super(E_ELAN_H, self).__init__()
        c_ = int(c1 * e1)
        c_hidden = int(c1 * e2)
        self.conv1 = Conv(c1, c_, k=1, s=1)
        self.conv2 = Conv(c1, c_, k=1, s=1)
        self.conv3 = Conv(c_, c_hidden, k=3, s=1)
        self.conv4 = Conv(c_hidden, c_hidden, k=3, s=1)
        self.conv5 = Conv(c_hidden, c_hidden, k=3, s=1)
        self.conv6 = Conv(c_hidden, c_hidden, k=3, s=1)
        # Concat width; equals 2 * c1 with the default ratios
        self.conv7 = Conv(2 * c_ + 4 * c_hidden, c2, k=1, s=1)

    def forward(self, x):
        '''
        :param x: input feature map
        :return: output feature map with c2 channels
        '''
        # Branch 1 output
        output1 = self.conv1(x)
        # Branch 2 outputs
        output2_1 = self.conv2(x)
        output2_2 = self.conv3(output2_1)
        output2_3 = self.conv4(output2_2)
        output2_4 = self.conv5(output2_3)
        output2_5 = self.conv6(output2_4)
        # Concatenate all six feature maps along the channel dimension
        output_cat = torch.cat((output1, output2_1, output2_2, output2_3, output2_4, output2_5), dim=1)
        return self.conv7(output_cat)

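Again, the parameters: c1 is the number of input channels and c2 is the number of output channels. e1 is the ratio for the two 1x1 branch convolutions; each produces half of the input channels, so e1 defaults to 0.5. e2 is the ratio for the 3x3 convolutions in the second branch, which have a quarter of c1 channels, so e2 defaults to 0.25. Six feature maps are concatenated, giving 2 * c_ + 4 * c_hidden channels (twice the input channels with the default ratios). A final 1x1 convolution then adjusts the channel count, and c2 is the output width of that convolution.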
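The same kind of check works for ELAN-H. The function e_elan_h_channels is again a hypothetical helper that only mirrors the constructor arithmetic; it also shows that the concatenated width equals 2 * c1 only for ratios such as the defaults, which matters if e1 or e2 is changed later.

```python
def e_elan_h_channels(c1, c2, e1=0.5, e2=0.25):
    """Channel counts inside E_ELAN_H, mirroring the constructor arithmetic."""
    c_ = int(c1 * e1)               # the two 1x1 branch convs
    c_hidden = int(c1 * e2)         # the four 3x3 convs
    concat = 2 * c_ + 4 * c_hidden  # six feature maps are concatenated
    return {"branch": c_, "hidden": c_hidden, "concat": concat, "out": c2}


# First ELAN-H block of the head: 512 channels in, 256 out
print(e_elan_h_channels(512, 256))
# {'branch': 256, 'hidden': 128, 'concat': 1024, 'out': 256}

# With a non-default e2 the concat width is no longer 2 * c1
print(e_elan_h_channels(512, 256, e2=0.125)["concat"])  # 768
```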

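MPConv module

This module comes in two variants: one keeps the output channel count equal to the input, and the other doubles it. The structure is identical and only the channel counts differ, so both are handled by one class with a ratio parameter. The code is as follows: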

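# Requires torch.nn as nn, plus YoloV7's Conv and Concat from models/common.py.
class MPConv(nn.Module):
    def __init__(self, c1, e=0.5):
        '''
        :param c1: input channels
        :param e: ratio of each branch's channels to c1
        '''
        super(MPConv, self).__init__()
        c_ = int(c1 * e)
        # Branch 1: max pooling, then a 1x1 conv
        self.conv1 = nn.Sequential(
            nn.MaxPool2d(2, 2),
            Conv(c1, c_, 1, 1),
        )
        # Branch 2: 1x1 conv, then a 3x3 conv with stride 2
        self.conv2 = nn.Sequential(
            Conv(c1, c_, 1, 1),
            Conv(c_, c_, 3, 2),
        )
        self.cat = Concat()

    def forward(self, x):
        # Branch 1
        output1 = self.conv1(x)
        # Branch 2
        output2 = self.conv2(x)
        # Concatenate both branches: 2 * c_ output channels
        output = self.cat((output1, output2))
        return output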

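e is the ratio, and setting it controls the output channel count: each branch produces int(c1 * e) channels, so e = 0.5 keeps the channel count and e = 1 doubles it.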
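A small shape calculation makes the effect of e concrete. The function mpconv_shape below is a hypothetical helper, not part of the module; it assumes the 3x3 stride-2 convolution uses padding 1 (the usual automatic padding for a 3x3 kernel) and applies the standard output-size formulas for pooling and convolution.

```python
def mpconv_shape(c1, h, w, e=0.5):
    """Output (channels, height, width) of MPConv for a c1 x h x w input."""
    c_ = int(c1 * e)
    # Branch 1: MaxPool2d(2, 2) halves the size, rounding down
    pool_hw = (h // 2, w // 2)
    # Branch 2: 3x3 conv, stride 2, padding 1 (assumed)
    conv_hw = ((h - 1) // 2 + 1, (w - 1) // 2 + 1)
    if pool_hw != conv_hw:
        # Odd sizes make the two branches disagree, so they cannot be concatenated
        raise ValueError("height and width must be even")
    return (2 * c_, *pool_hw)


print(mpconv_shape(256, 80, 80))       # (256, 40, 40): channels unchanged
print(mpconv_shape(128, 80, 80, e=1))  # (256, 40, 40): channels doubled
```

The first call corresponds to the backbone usage and the second to the head usage in the new configuration file.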

Parameter handling and configuration file

With the modules above wrapped, add their argument-handling logic to the parse_model function in yolo.py, as follows:

        if m in [nn.Conv2d, Conv, RobustConv, RobustConv2, E_ELAN, E_ELAN_H, DWConv, GhostConv, RepConv, RepConv_OREPA,
                 DownC,
                 SPP, SPPF, SPPCSPC, GhostSPPCSPC, MixConv2d, Focus, Stem, GhostStem, CrossConv,
                 Bottleneck, BottleneckCSPA, BottleneckCSPB, BottleneckCSPC,
                 RepBottleneck, RepBottleneckCSPA, RepBottleneckCSPB, RepBottleneckCSPC,
                 Res, ResCSPA, ResCSPB, ResCSPC,
                 RepRes, RepResCSPA, RepResCSPB, RepResCSPC,
                 ResX, ResXCSPA, ResXCSPB, ResXCSPC,
                 RepResX, RepResXCSPA, RepResXCSPB, RepResXCSPC,
                 Ghost, GhostCSPA, GhostCSPB, GhostCSPC,
                 SwinTransformerBlock, STCSPA, STCSPB, STCSPC,
                 SwinTransformer2Block, ST2CSPA, ST2CSPB, ST2CSPC]:
            c1, c2 = ch[f], args[0]
        elif m is MPConv:
            # MPConv outputs 2 * int(c1 * e) channels: unchanged for e = 0.5, doubled for e = 1.
            c2 = ch[f]
            if args[0] == 1:
                c2 = ch[f] * 2
            args = [ch[f], *args]

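Note that when the MPConv ratio is 1, the input and output channel counts differ, so the output channel count must be assigned to c2; otherwise the c1 of the next layer will be wrong. The figure below shows where to make the change:

Next, modify the configuration file. The key point is that the layer ids must line up. I named the new configuration file yolov7_new.yaml; its contents are as follows: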

# parameters
nc: 80  # number of classes
depth_multiple: 1.0  # model depth multiple
width_multiple: 1.0  # layer channel multiple

# anchors
anchors:
  - [12,16, 19,36, 40,28]  # P3/8
  - [36,75, 76,55, 72,146]  # P4/16
  - [142,110, 192,243, 459,401]  # P5/32

# yolov7 backbone
backbone:
  # [from, number, module, args]
  [[-1, 1, Conv, [32, 3, 1]],  # 0
   [-1, 1, Conv, [64, 3, 2]],  # 1-P1/2      
   [-1, 1, Conv, [64, 3, 1]],
   [-1, 1, Conv, [128, 3, 2]],  # 3-P2/4
   [-1, 1, E_ELAN, [256,0.5]],  # 4

   [-1, 1, MPConv, [0.5]],
   [-1, 1, E_ELAN, [512, 0.5]],  # 6

   [-1, 1, MPConv, [0.5]],
   [-1, 1, E_ELAN, [1024,0.5]],  # 8

   [-1, 1, MPConv, [0.5]],
   [-1, 1, E_ELAN, [1024,0.25]],  # 10
  ]

# yolov7 head
head:
  [[-1, 1, SPPCSPC, [512]], # 11 54
  
   [-1, 1, Conv, [256, 1, 1]],#12
   [-1, 1, nn.Upsample, [None, 2, 'nearest']],
   [8, 1, Conv, [256, 1, 1]], # 14 route backbone P4
   [[-1, -2], 1, Concat, [1]], # 15
   [-1, 1, E_ELAN_H, [256,0.5,0.25]], #16

   
   [-1, 1, Conv, [128, 1, 1]], #17
   [-1, 1, nn.Upsample, [None, 2, 'nearest']], #18
   [6, 1, Conv, [128, 1, 1]], # route backbone P3 19
   [[-1, -2], 1, Concat, [1]], # 20
   [-1, 1, E_ELAN_H, [128,0.5,0.25]], #21
      
   [-1, 1, MPConv, [1]], # 22
   [[-1, 16], 1, Concat, [1]],# 23
   [-1, 1, E_ELAN_H, [256,0.5,0.25]], # 24
      
   [-1, 1, MPConv, [1]], # 25
   [[-1, 11], 1, Concat, [1]], # 26
   [-1, 1, E_ELAN_H, [512,0.5,0.25]], #27
   
   [21, 1, RepConv, [256, 3, 1]],
   [24, 1, RepConv, [512, 3, 1]],
   [27, 1, RepConv, [1024, 3, 1]],

   [[28,29,30], 1, IDetect, [nc, anchors]],   # Detect(P3, P4, P5)
  ]
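Whether the ids line up can be checked with simple channel bookkeeping. The sketch below is not the real parse_model; out_channels and trace_channels are hypothetical helpers that only mimic how an output channel count is recorded for each layer. The layer entries are transcribed from yolov7_new.yaml up to layer 27, with the number field and non-channel arguments dropped and width_multiple assumed to be 1.0.

```python
def out_channels(module, c_in, args):
    """Output channels of one layer, given its input channels."""
    if module == "MPConv":
        return 2 * int(c_in * args[0])  # two branches of int(c1 * e) channels
    if module == "Upsample":
        return c_in                     # upsampling keeps the channel count
    return args[0]                      # Conv, E_ELAN, E_ELAN_H, SPPCSPC


def trace_channels(layers, c_input=3):
    """Return the list of output channels, one entry per layer."""
    ch = []
    for frm, module, args in layers:
        if module == "Concat":
            c2 = sum(ch[j] for j in frm)       # -1 is the previous layer
        else:
            c1 = ch[frm] if ch else c_input    # first layer sees the image
            c2 = out_channels(module, c1, args)
        ch.append(c2)
    return ch


layers = [
    (-1, "Conv", [32]), (-1, "Conv", [64]), (-1, "Conv", [64]),
    (-1, "Conv", [128]), (-1, "E_ELAN", [256]),             # 0-4
    (-1, "MPConv", [0.5]), (-1, "E_ELAN", [512]),            # 5-6
    (-1, "MPConv", [0.5]), (-1, "E_ELAN", [1024]),           # 7-8
    (-1, "MPConv", [0.5]), (-1, "E_ELAN", [1024]),           # 9-10
    (-1, "SPPCSPC", [512]),                                  # 11
    (-1, "Conv", [256]), (-1, "Upsample", []), (8, "Conv", [256]),
    ([-1, -2], "Concat", []), (-1, "E_ELAN_H", [256]),       # 12-16
    (-1, "Conv", [128]), (-1, "Upsample", []), (6, "Conv", [128]),
    ([-1, -2], "Concat", []), (-1, "E_ELAN_H", [128]),       # 17-21
    (-1, "MPConv", [1]), ([-1, 16], "Concat", []),
    (-1, "E_ELAN_H", [256]),                                 # 22-24
    (-1, "MPConv", [1]), ([-1, 11], "Concat", []),
    (-1, "E_ELAN_H", [512]),                                 # 25-27
]

ch = trace_channels(layers)
print(ch[21], ch[24], ch[27])  # 128 256 512: inputs to the three RepConv layers
```

The three printed values match the outputs of layers 75, 88 and 101 in the original file, which is what the RepConv layers expect.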

Test results

    Class      Images      Labels           P           R      mAP@.5  mAP@.5:.95: 100%|██████████| 15/15 [00:02<00:00,  5.42it/s]
                 all         229        1407       0.979       0.977       0.993       0.736
                 c17         229         131        0.99           1       0.996       0.841
                  c5         229          68       0.962           1       0.996       0.839
          helicopter         229          43       0.964           1       0.988       0.636
                c130         229          85           1       0.997       0.997       0.671
                 f16         229          57           1       0.966       0.996       0.697
                  b2         229           2       0.948           1       0.995       0.647
               other         229          86           1       0.914       0.994       0.562
                 b52         229          65       0.985       0.969        0.98       0.834
                kc10         229          62           1       0.974       0.986       0.823
             command         229          40       0.998           1       0.996       0.798
                 f15         229         123           1       0.979       0.997       0.659
               kc135         229          91       0.988       0.989       0.987       0.679
                 a10         229          27           1       0.599       0.989       0.425
                  b1         229          20       0.994           1       0.997       0.741
                 aew         229          25       0.951           1        0.97       0.769
                 f22         229          17       0.993           1       0.996       0.742
                  p3         229         105           1           1       0.998       0.788
                  p8         229           1       0.921           1       0.995       0.498
                 f35         229          32           1       0.998       0.996        0.62
                 f18         229         125       0.987       0.992       0.995       0.816
                 v22         229          41       0.998           1       0.996       0.719
               su-27         229          31       0.996           1       0.996        0.84
               il-38         229          27       0.994           1       0.995       0.878
              tu-134         229           1       0.866           1       0.995       0.995
               su-33         229           2       0.965           1       0.995       0.647
               an-70         229           2        0.94           1       0.995       0.896
               tu-22         229          98       0.999           1       0.997       0.817
300 epochs completed in 4.911 hours.

Optimizer stripped from runs\train\exp2\weights\last.pt, 75.1MB
Optimizer stripped from runs\train\exp2\weights\best.pt, 75.1MB

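The results are on par with, and slightly better than, the official ones: overall mAP@.5 is identical (0.993) and mAP@.5:.95 is marginally higher (0.736 vs. 0.734), with precision up (0.979 vs. 0.966) and recall down (0.977 vs. 0.99). This shows that the wrapped model works correctly!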

Link to the complete code

https://blog.csdn.net/m0_47867638/article/details/134128245?spm=1001.2014.3001.5502
Originally published 2023-11-28; shared from the WeChat official account AI智韵.
