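Recommended column for autumn-recruitment interviews: Summary of Interview Questions for Deep Learning Algorithm Engineers [100 Questions for Algorithm Engineers]

All programs in this column have been tested and run successfully.

The Contextual Transformer (CoT) block is a Transformer-style module for visual recognition. It exploits the contextual information among input keys to guide the learning of a dynamic attention matrix, which strengthens the visual representation. The CoT block first contextually encodes the input keys with a 3×3 convolution, producing a static contextual representation of the input. The encoded keys are then concatenated with the input queries, and two consecutive 1×1 convolutions learn the dynamic multi-head attention matrix. Finally, the static and dynamic contextual representations are fused to form the output.

After introducing the main principle, this article walks step by step through adding and modifying the module code, so that beginners can follow along as well. Section 4 at the end of the article is reserved for the complete modified code. The goal is to help you get to grips with the YOLO family of deep-learning object detectors.

Column: YOLO11 Getting Started + Improvements. Subscriptions are welcome.

Contents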
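1. Paper
2. Adding CoTAttention to YOLO11
2.1 CoTAttention code implementation
2.2 Modifying the __init__.py file
2.3 Adding the yaml file
2.4 Registering the module in tasks.py
2.5 Running the program
3. Modified network structure diagram
4. Complete code
5. GFLOPs
6. Going further
7. Summary

1. Paper
Paper: Contextual Transformer Networks for Visual Recognition
Official code: official code repository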
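As a compact restatement of the mechanism described in the introduction, the block can be written as below. This is a sketch that follows the simplified implementation given in section 2.1 rather than the paper's exact formulation. The symbols ($X$, $K^{1}$, $K^{2}$, $V$, $A$, $W_v$, $W_\theta$, $W_\delta$) are notation chosen here for illustration, keys and queries are both taken to be the input $X$, and the normalization and activation between the two 1×1 convolutions are omitted for brevity.

```latex
\begin{aligned}
K^{1} &= \mathrm{SiLU}\big(\mathrm{BN}(\mathrm{Conv}_{3\times 3}(X))\big)
      && \text{static context of the keys} \\
V     &= \mathrm{BN}(X W_v)
      && \text{values from a } 1\times 1 \text{ convolution} \\
A     &= [K^{1},\, X]\, W_\theta W_\delta
      && \text{two consecutive } 1\times 1 \text{ convolutions} \\
K^{2} &= \mathrm{softmax}_{hw}\big(\bar{A}\big) \odot V
      && \text{dynamic context} \\
Y     &= K^{1} + K^{2}
      && \text{fused output}
\end{aligned}
```

Here $[\cdot,\cdot]$ is channel-wise concatenation, $\bar{A}$ is $A$ averaged over its $k \times k$ kernel positions, the softmax runs over the $h \cdot w$ spatial positions of each channel, and $\odot$ is element-wise multiplication.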
2. Adding CoTAttention to YOLO11

2.1 CoTAttention code implementation

Key step 1: paste the code below into /ultralytics/ultralytics/nn/modules/block.py

# block.py already imports torch, torch.nn as nn and torch.nn.functional as F
class CoTAttention(nn.Module):
    def __init__(self, dim=512, kernel_size=3):
        super().__init__()
        self.dim = dim
        self.kernel_size = kernel_size

        # Static context: grouped k x k convolution over the keys
        self.key_embed = nn.Sequential(
            nn.Conv2d(dim, dim, kernel_size=kernel_size, padding=kernel_size // 2, groups=4, bias=False),
            nn.BatchNorm2d(dim),
            nn.SiLU(),
        )
        # Values: 1x1 convolution
        self.value_embed = nn.Sequential(
            nn.Conv2d(dim, dim, 1, bias=False),
            nn.BatchNorm2d(dim),
        )

        # Dynamic attention: two consecutive 1x1 convolutions
        factor = 4
        self.attention_embed = nn.Sequential(
            nn.Conv2d(2 * dim, 2 * dim // factor, 1, bias=False),
            nn.BatchNorm2d(2 * dim // factor),
            nn.SiLU(),
            nn.Conv2d(2 * dim // factor, kernel_size * kernel_size * dim, 1),
        )

    def forward(self, x):
        bs, c, h, w = x.shape
        k1 = self.key_embed(x)  # bs,c,h,w
        v = self.value_embed(x).view(bs, c, -1)  # bs,c,h*w

        y = torch.cat([k1, x], dim=1)  # bs,2c,h,w
        att = self.attention_embed(y)  # bs,c*k*k,h,w
        att = att.reshape(bs, c, self.kernel_size * self.kernel_size, h, w)
        att = att.mean(2, keepdim=False).view(bs, c, -1)  # bs,c,h*w
        k2 = F.softmax(att, dim=-1) * v
        k2 = k2.view(bs, c, h, w)

        return k1 + k2  # fuse static and dynamic context

2.2 Modifying the __init__.py file

Key step 2: edit the __init__.py file in the modules folder. First import CoTAttention, then declare it in the __all__ list further down.

2.3 Adding the yaml file

Key step 3: under /ultralytics/ultralytics/cfg/models/11, create a new file named yolo11_CoTA.yaml and paste in the content below.

Object detection

# Ultralytics YOLO, AGPL-3.0 license
# YOLO11 object detection model with P3-P5 outputs. For Usage examples see https://docs.ultralytics.com/tasks/detect

# Parameters
nc: 80 # number of classes
scales: # model compound scaling constants, i.e. 'model=yolo11n.yaml' will call yolo11.yaml with scale 'n'
  # [depth, width, max_channels]
  n: [0.50, 0.25, 1024] # summary: 319 layers, 2624080 parameters, 2624064 gradients, 6.6 GFLOPs
  s: [0.50, 0.50, 1024] # summary: 319 layers, 9458752 parameters, 9458736 gradients, 21.7 GFLOPs
  m: [0.50, 1.00, 512] # summary: 409 layers, 20114688 parameters, 20114672 gradients, 68.5 GFLOPs
  l: [1.00, 1.00, 512] # summary: 631 layers, 25372160 parameters, 25372144 gradients, 87.6 GFLOPs
  x: [1.00, 1.50, 512] # summary: 631 layers, 56966176 parameters, 56966160 gradients, 196.0 GFLOPs

# YOLO11n backbone
backbone:
  # [from, repeats, module, args]
  - [-1, 1, Conv, [64, 3, 2]] # 0-P1/2
  - [-1, 1, Conv, [128, 3, 2]] # 1-P2/4
  - [-1, 2, C3k2, [256, False, 0.25]]
  - [-1, 1, Conv, [256, 3, 2]] # 3-P3/8
  - [-1, 2, C3k2, [512, False, 0.25]]
  - [-1, 1, Conv, [512, 3, 2]] # 5-P4/16
  - [-1, 2, C3k2, [512, True]]
  - [-1, 1, Conv, [1024, 3, 2]] # 7-P5/32
  - [-1, 2, C3k2, [1024, True]]
  - [-1, 1, SPPF, [1024, 5]] # 9
  - [-1, 2, C2PSA, [1024]] # 10

# YOLO11n head
head:
  - [-1, 1, nn.Upsample, [None, 2, "nearest"]]
  - [[-1, 6], 1, Concat, [1]] # cat backbone P4
  - [-1, 2, C3k2, [512, False]] # 13

  - [-1, 1, nn.Upsample, [None, 2, "nearest"]]
  - [[-1, 4], 1, Concat, [1]] # cat backbone P3
  - [-1, 2, C3k2, [256, False]] # 16 (P3/8-small)

  - [-1, 1, Conv, [256, 3, 2]]
  - [[-1, 13], 1, Concat, [1]] # cat head P4
  - [-1, 2, C3k2, [512, False]] # 19 (P4/16-medium)

  - [-1, 1, Conv, [512, 3, 2]]
  - [[-1, 10], 1, Concat, [1]] # cat head P5
  - [-1, 2, C3k2, [1024, True]] # 22 (P5/32-large)
  - [-1, 1, CoTAttention, [1024]] # 23

  - [[16, 19, 23], 1, Detect, [nc]] # Detect(P3, P4, P5)

Instance segmentation

# Ultralytics YOLO, AGPL-3.0 license
# YOLO11 object detection model with P3-P5 outputs. For Usage examples see https://docs.ultralytics.com/tasks/detect

# Parameters
nc: 80 # number of classes
scales: # model compound scaling constants, i.e. 'model=yolo11n.yaml' will call yolo11.yaml with scale 'n'
  # [depth, width, max_channels]
  n: [0.50, 0.25, 1024] # summary: 319 layers, 2624080 parameters, 2624064 gradients, 6.6 GFLOPs
  s: [0.50, 0.50, 1024] # summary: 319 layers, 9458752 parameters, 9458736 gradients, 21.7 GFLOPs
  m: [0.50, 1.00, 512] # summary: 409 layers, 20114688 parameters, 20114672 gradients, 68.5 GFLOPs
  l: [1.00, 1.00, 512] # summary: 631 layers, 25372160 parameters, 25372144 gradients, 87.6 GFLOPs
  x: [1.00, 1.50, 512] # summary: 631 layers, 56966176 parameters, 56966160 gradients, 196.0 GFLOPs

# YOLO11n backbone
backbone:
  # [from, repeats, module, args]
  - [-1, 1, Conv, [64, 3, 2]] # 0-P1/2
  - [-1, 1, Conv, [128, 3, 2]] # 1-P2/4
  - [-1, 2, C3k2, [256, False, 0.25]]
  - [-1, 1, Conv, [256, 3, 2]] # 3-P3/8
  - [-1, 2, C3k2, [512, False, 0.25]]
  - [-1, 1, Conv, [512, 3, 2]] # 5-P4/16
  - [-1, 2, C3k2, [512, True]]
  - [-1, 1, Conv, [1024, 3, 2]] # 7-P5/32
  - [-1, 2, C3k2, [1024, True]]
  - [-1, 1, SPPF, [1024, 5]] # 9
  - [-1, 2, C2PSA, [1024]] # 10

# YOLO11n head
head:
  - [-1, 1, nn.Upsample, [None, 2, "nearest"]]
  - [[-1, 6], 1, Concat, [1]] # cat backbone P4
  - [-1, 2, C3k2, [512, False]] # 13

  - [-1, 1, nn.Upsample, [None, 2, "nearest"]]
  - [[-1, 4], 1, Concat, [1]] # cat backbone P3
  - [-1, 2, C3k2, [256, False]] # 16 (P3/8-small)

  - [-1, 1, Conv, [256, 3, 2]]
  - [[-1, 13], 1, Concat, [1]] # cat head P4
  - [-1, 2, C3k2, [512, False]] # 19 (P4/16-medium)

  - [-1, 1, Conv, [512, 3, 2]]
  - [[-1, 10], 1, Concat, [1]] # cat head P5
  - [-1, 2, C3k2, [1024, True]] # 22 (P5/32-large)
  - [-1, 1, CoTAttention, [1024]] # 23

  - [[16, 19, 23], 1, Segment, [nc, 32, 256]] # Segment(P3, P4, P5)

Rotated object detection (OBB)

# Ultralytics YOLO, AGPL-3.0 license
# YOLO11 object detection model with P3-P5 outputs. For Usage examples see https://docs.ultralytics.com/tasks/detect

# Parameters
nc: 80 # number of classes
scales: # model compound scaling constants, i.e. 'model=yolo11n.yaml' will call yolo11.yaml with scale 'n'
  # [depth, width, max_channels]
  n: [0.50, 0.25, 1024] # summary: 319 layers, 2624080 parameters, 2624064 gradients, 6.6 GFLOPs
  s: [0.50, 0.50, 1024] # summary: 319 layers, 9458752 parameters, 9458736 gradients, 21.7 GFLOPs
  m: [0.50, 1.00, 512] # summary: 409 layers, 20114688 parameters, 20114672 gradients, 68.5 GFLOPs
  l: [1.00, 1.00, 512] # summary: 631 layers, 25372160 parameters, 25372144 gradients, 87.6 GFLOPs
  x: [1.00, 1.50, 512] # summary: 631 layers, 56966176 parameters, 56966160 gradients, 196.0 GFLOPs

# YOLO11n backbone
backbone:
  # [from, repeats, module, args]
  - [-1, 1, Conv, [64, 3, 2]] # 0-P1/2
  - [-1, 1, Conv, [128, 3, 2]] # 1-P2/4
  - [-1, 2, C3k2, [256, False, 0.25]]
  - [-1, 1, Conv, [256, 3, 2]] # 3-P3/8
  - [-1, 2, C3k2, [512, False, 0.25]]
  - [-1, 1, Conv, [512, 3, 2]] # 5-P4/16
  - [-1, 2, C3k2, [512, True]]
  - [-1, 1, Conv, [1024, 3, 2]] # 7-P5/32
  - [-1, 2, C3k2, [1024, True]]
  - [-1, 1, SPPF, [1024, 5]] # 9
  - [-1, 2, C2PSA, [1024]] # 10

# YOLO11n head
head:
  - [-1, 1, nn.Upsample, [None, 2, "nearest"]]
  - [[-1, 6], 1, Concat, [1]] # cat backbone P4
  - [-1, 2, C3k2, [512, False]] # 13

  - [-1, 1, nn.Upsample, [None, 2, "nearest"]]
  - [[-1, 4], 1, Concat, [1]] # cat backbone P3
  - [-1, 2, C3k2, [256, False]] # 16 (P3/8-small)

  - [-1, 1, Conv, [256, 3, 2]]
  - [[-1, 13], 1, Concat, [1]] # cat head P4
  - [-1, 2, C3k2, [512, False]] # 19 (P4/16-medium)

  - [-1, 1, Conv, [512, 3, 2]]
  - [[-1, 10], 1, Concat, [1]] # cat head P5
  - [-1, 2, C3k2, [1024, True]] # 22 (P5/32-large)
  - [-1, 1, CoTAttention, [1024]] # 23

  - [[16, 19, 23], 1, OBB, [nc, 1]] # OBB(P3, P4, P5)

Tip: this article adds the module on top of the base yolo11 configuration. To add it to yolo11n/s/m/l/x, you only need to specify the corresponding depth_multiple and width_multiple.

# YOLO11n
depth_multiple: 0.50 # model depth multiple
width_multiple: 0.25 # layer channel multiple
max_channels: 1024

# YOLO11s
depth_multiple: 0.50 # model depth multiple
width_multiple: 0.50 # layer channel multiple
max_channels: 1024

# YOLO11m
depth_multiple: 0.50 # model depth multiple
width_multiple: 1.00 # layer channel multiple
max_channels: 512

# YOLO11l
depth_multiple: 1.00 # model depth multiple
width_multiple: 1.00 # layer channel multiple
max_channels: 512

# YOLO11x
depth_multiple: 1.00 # model depth multiple
width_multiple: 1.50 # layer channel multiple
max_channels: 512

2.4 Registering the module in tasks.py

Key step 4: register the module in the parse_model function in tasks.py.

First import CoTAttention in tasks.py. Then find the parse_model function in the same file and add a CoTAttention branch to its if/elif chain:

elif m is CoTAttention:
    c1, c2 = ch[f], args[0]
    if c2 != nc:
        # scale the requested channels by the width multiplier, rounded up to a multiple of 8
        c2 = make_divisible(min(c2, max_channels) * width, 8)
    args = [c1, *args[1:]]

2.5 Running the program

Key step 5: create train.py in the ultralytics folder and set the model path to the path of yolo11_CoTA.yaml.

from ultralytics import YOLO
import warnings
from pathlib import Path

warnings.filterwarnings("ignore")

if __name__ == "__main__":
    # Load the model from the yaml created in step 2.3
    model = YOLO("ultralytics/cfg/models/11/yolo11_CoTA.yaml")  # path to the model yaml you want to use
    # Train the model
    results = model.train(
        data=r"path/to/your/dataset.yaml",  # replace with the path to your dataset yaml
        epochs=100,
        batch=16,
        imgsz=640,
        workers=4,
        name=Path(model.cfg).stem,
    )

Run the program. If output like the following appears, the module was added successfully:

                   from  n    params  module                                       arguments
  0                  -1  1       464  ultralytics.nn.modules.conv.Conv             [3, 16, 3, 2]
  1                  -1  1      4672  ultralytics.nn.modules.conv.Conv             [16, 32, 3, 2]
  2                  -1  1      6640  ultralytics.nn.modules.block.C3k2            [32, 64, 1, False, 0.25]
  3                  -1  1     36992  ultralytics.nn.modules.conv.Conv             [64, 64, 3, 2]
  4                  -1  1     26080  ultralytics.nn.modules.block.C3k2            [64, 128, 1, False, 0.25]
  5                  -1  1    147712  ultralytics.nn.modules.conv.Conv             [128, 128, 3, 2]
  6                  -1  1     87040  ultralytics.nn.modules.block.C3k2            [128, 128, 1, True]
  7                  -1  1    295424  ultralytics.nn.modules.conv.Conv             [128, 256, 3, 2]
  8                  -1  1    346112  ultralytics.nn.modules.block.C3k2            [256, 256, 1, True]
  9                  -1  1    164608  ultralytics.nn.modules.block.SPPF            [256, 256, 5]
 10                  -1  1    249728  ultralytics.nn.modules.block.C2PSA           [256, 256, 1]
 11                  -1  1         0  torch.nn.modules.upsampling.Upsample         [None, 2, 'nearest']
 12             [-1, 6]  1         0  ultralytics.nn.modules.conv.Concat           [1]
 13                  -1  1    111296  ultralytics.nn.modules.block.C3k2            [384, 128, 1, False]
 14                  -1  1         0  torch.nn.modules.upsampling.Upsample         [None, 2, 'nearest']
 15             [-1, 4]  1         0  ultralytics.nn.modules.conv.Concat           [1]
 16                  -1  1     32096  ultralytics.nn.modules.block.C3k2            [256, 64, 1, False]
 17                  -1  1     36992  ultralytics.nn.modules.conv.Conv             [64, 64, 3, 2]
 18            [-1, 13]  1         0  ultralytics.nn.modules.conv.Concat           [1]
 19                  -1  1     86720  ultralytics.nn.modules.block.C3k2            [192, 128, 1, False]
 20                  -1  1    147712  ultralytics.nn.modules.conv.Conv             [128, 128, 3, 2]
 21            [-1, 10]  1         0  ultralytics.nn.modules.conv.Concat           [1]
 22                  -1  1    378880  ultralytics.nn.modules.block.C3k2            [384, 256, 1, True]
 23                  -1  1    577024  ultralytics.nn.modules.block.CoTAttention    [256]
 24        [16, 19, 23]  1    464912  ultralytics.nn.modules.head.Detect           [80, [64, 128, 256]]
YOLO11_CoTAttention summary: 332 layers, 3,201,104 parameters, 3,201,088 gradients, 7.1 GFLOPs

3. Modified network structure diagram

[Figure: modified network structure]

4. Complete code
This will be added later. For now, just follow the steps above.
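While following the steps, the numbers in the step 2.5 log can be checked by hand. The sketch below is not part of Ultralytics. `make_divisible` is re-implemented here on the assumption that it rounds up to a multiple of the divisor, and `scaled_channels` and `cot_attention_params` are hypothetical helper names introduced only for this check. It reproduces the channel width that the parse_model branch from step 2.4 would produce for each scale, and counts the CoTAttention parameters from the layer definitions in step 2.1.

```python
import math


def make_divisible(x, divisor):
    """Round x up to a multiple of divisor (assumed behaviour of the Ultralytics helper)."""
    return math.ceil(x / divisor) * divisor


def scaled_channels(c2, width, max_channels):
    """Channel count after the width scaling used in the parse_model branch."""
    return make_divisible(min(c2, max_channels) * width, 8)


def cot_attention_params(dim, kernel_size=3, groups=4, factor=4):
    """Count the learnable parameters of CoTAttention, layer by layer."""
    # key_embed: grouped k x k conv without bias, plus BatchNorm (weight + bias)
    key_embed = dim * (dim // groups) * kernel_size**2 + 2 * dim
    # value_embed: 1x1 conv without bias, plus BatchNorm
    value_embed = dim * dim + 2 * dim
    # attention_embed: 1x1 conv without bias, BatchNorm, then 1x1 conv with bias
    mid = 2 * dim // factor
    out = kernel_size**2 * dim
    attention_embed = 2 * dim * mid + 2 * mid + (mid * out + out)
    return key_embed + value_embed + attention_embed


# [depth, width, max_channels] per scale, copied from the yaml in step 2.3
scales = {
    "n": (0.50, 0.25, 1024),
    "s": (0.50, 0.50, 1024),
    "m": (0.50, 1.00, 512),
    "l": (1.00, 1.00, 512),
    "x": (1.00, 1.50, 512),
}

for name, (_, width, max_channels) in scales.items():
    dim = scaled_channels(1024, width, max_channels)
    print(name, dim, cot_attention_params(dim))
```

For the n scale this gives 256 channels and 577,024 parameters, which matches the CoTAttention row (layer 23) in the log above.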
5. GFLOPs
For how GFLOPs are calculated, see 100 Questions for Algorithm Engineers | Convolution Basics: Convolution.
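As a rough illustration of how the cost of a single convolution is counted, the sketch below uses the common convention of two floating-point operations per multiply-accumulate and ignores bias, normalization and activation. `conv2d_flops` is a hypothetical helper, not an Ultralytics function. Profiler-reported totals, such as the GFLOPs figure in the training log, may follow a different convention, so treat the result as an intuition aid only.

```python
def conv2d_flops(c_in, c_out, kernel_size, h_out, w_out, groups=1):
    """FLOPs of one Conv2d, counting each multiply-accumulate as 2 operations."""
    macs_per_output = (c_in // groups) * kernel_size * kernel_size
    return 2 * macs_per_output * c_out * h_out * w_out


# Example: the grouped 3x3 key_embed convolution of CoTAttention with dim=256,
# applied to a 20x20 feature map (the P5 level of a 640x640 input, stride 32).
flops = conv2d_flops(256, 256, 3, 20, 20, groups=4)
print(f"{flops / 1e9:.3f} GFLOPs")
```

Because the key convolution uses groups=4, each output channel only sees a quarter of the input channels, so its cost is a quarter of that of an ungrouped 3×3 convolution with the same shape.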
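[Figure: GFLOPs of the unmodified YOLO11n; the scales comment in the yaml lists 6.6 GFLOPs for the n scale]
[Figure: GFLOPs after the modification; the log in step 2.5 reports 7.1 GFLOPs]

6. Going further
CoTAttention can be combined with other attention mechanisms, loss functions and similar modifications to further improve detection performance.

7. Summary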
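With the steps above, CoTAttention has been added to YOLO11 with the goal of improving detection performance. This is only a starting point, and there is plenty of room for further optimization and deeper technical work. I would like to recommend my column, "YOLO11 Improvements: Effective Accuracy Gains". It focuses on current deep learning techniques, especially recent progress in object detection. Besides in-depth analysis of YOLO11 and strategies for improving it, it regularly adds paper reproductions and hands-on write-ups from top conferences such as CVPR and NeurIPS.

Why subscribe to "YOLO11 Improvements: Effective Accuracy Gains"?
- Coverage of current techniques: the column is not limited to improvements to the YOLO series. It also covers recent research on mainstream and emerging networks, to help you keep up with the field.
- Detailed hands-on material: every update comes with code and concrete modification steps, so readers can get started quickly.
- Questions and answers: after subscribing, you can ask me questions at any time and get a timely reply.
- Regular updates: new research directions and reproduction reports from top international conferences are published from time to time.

Who the column is for
- Readers with a strong interest in object detection and the YOLO family of networks
- Readers who want to write papers based on YOLO algorithms
- Anyone curious about YOLO algorithms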