Contents

1. HAttention Attention Mechanism
  1.1 Introduction to HAttention
  1.2 HAT core code
2. Adding the HAT Attention Mechanism
  2.1 STEP1
  2.2 STEP2
  2.3 STEP3
  2.4 STEP4
3. yaml File and Running
  3.1 yaml file
  3.2 Screenshot of a successful run

1. HAttention Attention Mechanism

1.1 Introduction to HAttention

The HAT model combines convolutional feature extraction with multi-scale attention, which gives it strong image reconstruction capability. Its strength is that it integrates local and global information effectively and uses residual connections and channel attention to improve the network's representational power and reconstruction quality. It is suited to image super-resolution and image reconstruction tasks.

HAT's workflow and the role of each main module are as follows.

Shallow Feature Extraction
The input image first passes through a convolution that extracts low-level features. This step captures basic image information such as edges and color and produces the initial feature map.

Deep Feature Extraction
The shallow features pass through several RHAG modules for deep feature extraction. Each RHAG consists of several HABs (Hybrid Attention Blocks) followed by an OCAB (Overlapping Cross-Attention Block):
- HAB contains a CAB (Channel Attention Block) and an (S)W-MSA (Shifted Window Multi-Head Self-Attention) structure.
  - CAB uses global pooling and channel attention to model dependencies between channels, strengthening the feature representation of specific channels.
  - (S)W-MSA is a window-partitioned self-attention mechanism. Computing attention inside windows reduces the computational cost, and shifting the windows lets information flow between neighboring windows.
- OCAB combines local and global features through cross-attention, and its overlapping regions keep information coherent across window boundaries.
Advantage: by stacking several attention modules that combine local and global information, the deep feature extraction stage captures complex features efficiently while keeping the computational cost low.

Image Reconstruction
After the RHAG modules, the deep features are upsampled to reconstruct the high-resolution image. The model fuses the extracted deep features with the shallow features through a residual connection, which produces a higher-quality reconstruction.

Module advantages
- RHAG (Residual Hybrid Attention Group): residual connections improve gradient flow and mitigate vanishing gradients in deep networks, while the combination of several attention mechanisms improves the accuracy and efficiency of feature extraction.
- HAB (Hybrid Attention Block): combines channel attention with window self-attention to capture image features at different scales. Channel attention strengthens the representation of each feature channel, and window self-attention improves overall feature perception through the interaction of local and global context.
- OCAB (Overlapping Cross-Attention Block): cross-attention over overlapping regions lets the model capture local features while keeping track of global ones, which avoids fragmenting the information.

1.2 HAT core code

import math
import torch
import torch.nn as nn
from basicsr.utils.registry import ARCH_REGISTRY
from basicsr.archs.arch_util import to_2tuple, trunc_normal_
from einops import rearrange


def drop_path(x, drop_prob: float = 0., training: bool = False):
    """Drop paths (Stochastic Depth) per sample (when applied in main path of residual blocks).

    From: https://github.com/rwightman/pytorch-image-models/blob/master/timm/models/layers/drop.py
    """
    if drop_prob == 0. or not training:
        return x
    keep_prob = 1 - drop_prob
    shape = (x.shape[0], ) + (1, ) * (x.ndim - 1)  # work with diff dim tensors, not just 2D ConvNets
    random_tensor = keep_prob + torch.rand(shape, dtype=x.dtype, device=x.device)
    random_tensor.floor_()  # binarize
    output = x.div(keep_prob) * random_tensor
    return output


class DropPath(nn.Module):
    """Drop paths (Stochastic Depth) per sample (when applied in main path of residual blocks).

    From: https://github.com/rwightman/pytorch-image-models/blob/master/timm/models/layers/drop.py
    """

    def __init__(self, drop_prob=None):
        super(DropPath, self).__init__()
        self.drop_prob = drop_prob

    def forward(self, x):
        return drop_path(x, self.drop_prob, self.training)


class ChannelAttention(nn.Module):
    """Channel attention used in RCAN.

    Args:
        num_feat (int): Channel number of intermediate features.
        squeeze_factor (int): Channel squeeze factor. Default: 16.
    """

    def __init__(self, num_feat, squeeze_factor=16):
        super(ChannelAttention, self).__init__()
        self.attention = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),
            nn.Conv2d(num_feat, num_feat // squeeze_factor, 1, padding=0),
            nn.ReLU(inplace=True),
            nn.Conv2d(num_feat // squeeze_factor, num_feat, 1, padding=0),
            nn.Sigmoid())

    def forward(self, x):
        y = self.attention(x)
        return x * y


class CAB(nn.Module):

    def __init__(self, num_feat, compress_ratio=3, squeeze_factor=30):
        super(CAB, self).__init__()
        self.cab = nn.Sequential(
            nn.Conv2d(num_feat, num_feat // compress_ratio, 3, 1, 1),
            nn.GELU(),
            nn.Conv2d(num_feat // compress_ratio, num_feat, 3, 1, 1),
            ChannelAttention(num_feat, squeeze_factor))

    def forward(self, x):
        return self.cab(x)


class Mlp(nn.Module):

    def __init__(self, in_features, hidden_features=None, out_features=None, act_layer=nn.GELU, drop=0.):
        super().__init__()
        out_features = out_features or in_features
        hidden_features = hidden_features or in_features
        self.fc1 = nn.Linear(in_features, hidden_features)
        self.act = act_layer()
        self.fc2 = nn.Linear(hidden_features, out_features)
        self.drop = nn.Dropout(drop)

    def forward(self, x):
        x = self.fc1(x)
        x = self.act(x)
        x = self.drop(x)
        x = self.fc2(x)
        x = self.drop(x)
        return x


def window_partition(x, window_size):
    """
    Args:
        x: (b, h, w, c)
        window_size (int): window size

    Returns:
        windows: (num_windows*b, window_size, window_size, c)
    """
    b, h, w, c = x.shape
    x = x.view(b, h // window_size, window_size, w // window_size, window_size, c)
    windows = x.permute(0, 1, 3, 2, 4, 5).contiguous().view(-1, window_size, window_size, c)
    return windows


def window_reverse(windows, window_size, h, w):
    """
    Args:
        windows: (num_windows*b, window_size, window_size, c)
        window_size (int): Window size
        h (int): Height of image
        w (int): Width of image

    Returns:
        x: (b, h, w, c)
    """
    b = int(windows.shape[0] / (h * w / window_size / window_size))
    x = windows.view(b, h // window_size, w // window_size, window_size, window_size, -1)
    x = x.permute(0, 1, 3, 2, 4, 5).contiguous().view(b, h, w, -1)
    return x


class WindowAttention(nn.Module):
    r""" Window based multi-head self attention (W-MSA) module with relative position bias.
    It supports both of shifted and non-shifted window.

    Args:
        dim (int): Number of input channels.
        window_size (tuple[int]): The height and width of the window.
        num_heads (int): Number of attention heads.
        qkv_bias (bool, optional): If True, add a learnable bias to query, key, value. Default: True
        qk_scale (float | None, optional): Override default qk scale of head_dim ** -0.5 if set
        attn_drop (float, optional): Dropout ratio of attention weight. Default: 0.0
        proj_drop (float, optional): Dropout ratio of output. Default: 0.0
    """

    def __init__(self, dim, window_size, num_heads, qkv_bias=True, qk_scale=None, attn_drop=0., proj_drop=0.):
        super().__init__()
        self.dim = dim
        self.window_size = window_size  # Wh, Ww
        self.num_heads = num_heads
        head_dim = dim // num_heads
        self.scale = qk_scale or head_dim**-0.5

        # define a parameter table of relative position bias
        self.relative_position_bias_table = nn.Parameter(
            torch.zeros((2 * window_size[0] - 1) * (2 * window_size[1] - 1), num_heads))  # 2*Wh-1 * 2*Ww-1, nH

        self.qkv = nn.Linear(dim, dim * 3, bias=qkv_bias)
        self.attn_drop = nn.Dropout(attn_drop)
        self.proj = nn.Linear(dim, dim)
        self.proj_drop = nn.Dropout(proj_drop)

        trunc_normal_(self.relative_position_bias_table, std=.02)
        self.softmax = nn.Softmax(dim=-1)

    def forward(self, x, rpi, mask=None):
        """
        Args:
            x: input features with shape of (num_windows*b, n, c)
            mask: (0/-inf) mask with shape of (num_windows, Wh*Ww, Wh*Ww) or None
        """
        b_, n, c = x.shape
        qkv = self.qkv(x).reshape(b_, n, 3, self.num_heads, c // self.num_heads).permute(2, 0, 3, 1, 4)
        q, k, v = qkv[0], qkv[1], qkv[2]  # make torchscript happy (cannot use tensor as tuple)

        q = q * self.scale
        attn = (q @ k.transpose(-2, -1))

        relative_position_bias = self.relative_position_bias_table[rpi.view(-1)].view(
            self.window_size[0] * self.window_size[1], self.window_size[0] * self.window_size[1], -1)  # Wh*Ww,Wh*Ww,nH
        relative_position_bias = relative_position_bias.permute(2, 0, 1).contiguous()  # nH, Wh*Ww, Wh*Ww
        attn = attn + relative_position_bias.unsqueeze(0)

        if mask is not None:
            nw = mask.shape[0]
            attn = attn.view(b_ // nw, nw, self.num_heads, n, n) + mask.unsqueeze(1).unsqueeze(0)
            attn = attn.view(-1, self.num_heads, n, n)
            attn = self.softmax(attn)
        else:
            attn = self.softmax(attn)

        attn = self.attn_drop(attn)

        x = (attn @ v).transpose(1, 2).reshape(b_, n, c)
        x = self.proj(x)
        x = self.proj_drop(x)
        return x


class HAB(nn.Module):
    r""" Hybrid Attention Block.

    Args:
        dim (int): Number of input channels.
        input_resolution (tuple[int]): Input resolution.
        num_heads (int): Number of attention heads.
        window_size (int): Window size.
        shift_size (int): Shift size for SW-MSA.
        mlp_ratio (float): Ratio of mlp hidden dim to embedding dim.
        qkv_bias (bool, optional): If True, add a learnable bias to query, key, value. Default: True
        qk_scale (float | None, optional): Override default qk scale of head_dim ** -0.5 if set.
        drop (float, optional): Dropout rate. Default: 0.0
        attn_drop (float, optional): Attention dropout rate. Default: 0.0
        drop_path (float, optional): Stochastic depth rate. Default: 0.0
        act_layer (nn.Module, optional): Activation layer. Default: nn.GELU
        norm_layer (nn.Module, optional): Normalization layer. Default: nn.LayerNorm
    """

    def __init__(self,
                 dim,
                 input_resolution,
                 num_heads,
                 window_size=7,
                 shift_size=0,
                 compress_ratio=3,
                 squeeze_factor=30,
                 conv_scale=0.01,
                 mlp_ratio=4.,
                 qkv_bias=True,
                 qk_scale=None,
                 drop=0.,
                 attn_drop=0.,
                 drop_path=0.,
                 act_layer=nn.GELU,
                 norm_layer=nn.LayerNorm):
        super().__init__()
        self.dim = dim
        self.input_resolution = input_resolution
        self.num_heads = num_heads
        self.window_size = window_size
        self.shift_size = shift_size
        self.mlp_ratio = mlp_ratio
        if min(self.input_resolution) <= self.window_size:
            # if window size is larger than input resolution, we don't partition windows
            self.shift_size = 0
            self.window_size = min(self.input_resolution)
        assert 0 <= self.shift_size < self.window_size, 'shift_size must in 0-window_size'

        self.norm1 = norm_layer(dim)
        self.attn = WindowAttention(
            dim,
            window_size=to_2tuple(self.window_size),
            num_heads=num_heads,
            qkv_bias=qkv_bias,
            qk_scale=qk_scale,
            attn_drop=attn_drop,
            proj_drop=drop)

        self.conv_scale = conv_scale
        self.conv_block = CAB(num_feat=dim, compress_ratio=compress_ratio, squeeze_factor=squeeze_factor)

        self.drop_path = DropPath(drop_path) if drop_path > 0. else nn.Identity()
        self.norm2 = norm_layer(dim)
        mlp_hidden_dim = int(dim * mlp_ratio)
        self.mlp = Mlp(in_features=dim, hidden_features=mlp_hidden_dim, act_layer=act_layer, drop=drop)

    def forward(self, x, x_size, rpi_sa, attn_mask):
        h, w = x_size
        b, _, c = x.shape
        # assert seq_len == h * w, "input feature has wrong size"

        shortcut = x
        x = self.norm1(x)
        x = x.view(b, h, w, c)

        # Conv_X
        conv_x = self.conv_block(x.permute(0, 3, 1, 2))
        conv_x = conv_x.permute(0, 2, 3, 1).contiguous().view(b, h * w, c)

        # cyclic shift
        if self.shift_size > 0:
            shifted_x = torch.roll(x, shifts=(-self.shift_size, -self.shift_size), dims=(1, 2))
            attn_mask = attn_mask
        else:
            shifted_x = x
            attn_mask = None

        # partition windows
        x_windows = window_partition(shifted_x, self.window_size)  # nw*b, window_size, window_size, c
        x_windows = x_windows.view(-1, self.window_size * self.window_size, c)  # nw*b, window_size*window_size, c

        # W-MSA/SW-MSA (to be compatible for testing on images whose shapes are the multiple of window size)
        attn_windows = self.attn(x_windows, rpi=rpi_sa, mask=attn_mask)

        # merge windows
        attn_windows = attn_windows.view(-1, self.window_size, self.window_size, c)
        shifted_x = window_reverse(attn_windows, self.window_size, h, w)  # b h w c

        # reverse cyclic shift
        if self.shift_size > 0:
            attn_x = torch.roll(shifted_x, shifts=(self.shift_size, self.shift_size), dims=(1, 2))
        else:
            attn_x = shifted_x
        attn_x = attn_x.view(b, h * w, c)

        # FFN
        x = shortcut + self.drop_path(attn_x) + conv_x * self.conv_scale
        x = x + self.drop_path(self.mlp(self.norm2(x)))

        return x


class PatchMerging(nn.Module):
    r""" Patch Merging Layer.

    Args:
        input_resolution (tuple[int]): Resolution of input feature.
        dim (int): Number of input channels.
        norm_layer (nn.Module, optional): Normalization layer. Default: nn.LayerNorm
    """

    def __init__(self, input_resolution, dim, norm_layer=nn.LayerNorm):
        super().__init__()
        self.input_resolution = input_resolution
        self.dim = dim
        self.reduction = nn.Linear(4 * dim, 2 * dim, bias=False)
        self.norm = norm_layer(4 * dim)

    def forward(self, x):
        """
        x: b, h*w, c
        """
        h, w = self.input_resolution
        b, seq_len, c = x.shape
        assert seq_len == h * w, 'input feature has wrong size'
        assert h % 2 == 0 and w % 2 == 0, f'x size ({h}*{w}) are not even.'

        x = x.view(b, h, w, c)

        x0 = x[:, 0::2, 0::2, :]  # b h/2 w/2 c
        x1 = x[:, 1::2, 0::2, :]  # b h/2 w/2 c
        x2 = x[:, 0::2, 1::2, :]  # b h/2 w/2 c
        x3 = x[:, 1::2, 1::2, :]  # b h/2 w/2 c
        x = torch.cat([x0, x1, x2, x3], -1)  # b h/2 w/2 4*c
        x = x.view(b, -1, 4 * c)  # b h/2*w/2 4*c

        x = self.norm(x)
        x = self.reduction(x)

        return x


class OCAB(nn.Module):
    # overlapping cross-attention block

    def __init__(self, dim,
                 input_resolution,
                 window_size,
                 overlap_ratio,
                 num_heads,
                 qkv_bias=True,
                 qk_scale=None,
                 mlp_ratio=2,
                 norm_layer=nn.LayerNorm):
        super().__init__()
        self.dim = dim
        self.input_resolution = input_resolution
        self.window_size = window_size
        self.num_heads = num_heads
        head_dim = dim // num_heads
        self.scale = qk_scale or head_dim**-0.5
        self.overlap_win_size = int(window_size * overlap_ratio) + window_size

        self.norm1 = norm_layer(dim)
        self.qkv = nn.Linear(dim, dim * 3, bias=qkv_bias)
        self.unfold = nn.Unfold(kernel_size=(self.overlap_win_size, self.overlap_win_size),
                                stride=window_size,
                                padding=(self.overlap_win_size - window_size) // 2)

        # define a parameter table of relative position bias
        self.relative_position_bias_table = nn.Parameter(
            torch.zeros((window_size + self.overlap_win_size - 1) * (window_size + self.overlap_win_size - 1),
                        num_heads))  # 2*Wh-1 * 2*Ww-1, nH

        trunc_normal_(self.relative_position_bias_table, std=.02)
        self.softmax = nn.Softmax(dim=-1)

        self.proj = nn.Linear(dim, dim)

        self.norm2 = norm_layer(dim)
        mlp_hidden_dim = int(dim * mlp_ratio)
        self.mlp = Mlp(in_features=dim, hidden_features=mlp_hidden_dim, act_layer=nn.GELU)

    def forward(self, x, x_size, rpi):
        h, w = x_size
        b, _, c = x.shape

        shortcut = x
        x = self.norm1(x)
        x = x.view(b, h, w, c)

        qkv = self.qkv(x).reshape(b, h, w, 3, c).permute(3, 0, 4, 1, 2)  # 3, b, c, h, w
        q = qkv[0].permute(0, 2, 3, 1)  # b, h, w, c
        kv = torch.cat((qkv[1], qkv[2]), dim=1)  # b, 2*c, h, w

        # partition windows
        q_windows = window_partition(q, self.window_size)  # nw*b, window_size, window_size, c
        q_windows = q_windows.view(-1, self.window_size * self.window_size, c)  # nw*b, window_size*window_size, c

        kv_windows = self.unfold(kv)  # b, c*w*w, nw
        kv_windows = rearrange(kv_windows, 'b (nc ch owh oww) nw -> nc (b nw) (owh oww) ch',
                               nc=2, ch=c, owh=self.overlap_win_size,
                               oww=self.overlap_win_size).contiguous()  # 2, nw*b, ow*ow, c
        k_windows, v_windows = kv_windows[0], kv_windows[1]  # nw*b, ow*ow, c

        b_, nq, _ = q_windows.shape
        _, n, _ = k_windows.shape
        d = self.dim // self.num_heads
        q = q_windows.reshape(b_, nq, self.num_heads, d).permute(0, 2, 1, 3)  # nw*b, nH, nq, d
        k = k_windows.reshape(b_, n, self.num_heads, d).permute(0, 2, 1, 3)  # nw*b, nH, n, d
        v = v_windows.reshape(b_, n, self.num_heads, d).permute(0, 2, 1, 3)  # nw*b, nH, n, d

        q = q * self.scale
        attn = (q @ k.transpose(-2, -1))

        relative_position_bias = self.relative_position_bias_table[rpi.view(-1)].view(
            self.window_size * self.window_size, self.overlap_win_size * self.overlap_win_size,
            -1)  # ws*ws, wse*wse, nH
        relative_position_bias = relative_position_bias.permute(2, 0, 1).contiguous()  # nH, ws*ws, wse*wse
        attn = attn + relative_position_bias.unsqueeze(0)

        attn = self.softmax(attn)
        attn_windows = (attn @ v).transpose(1, 2).reshape(b_, nq, self.dim)

        # merge windows
        attn_windows = attn_windows.view(-1, self.window_size, self.window_size, self.dim)
        x = window_reverse(attn_windows, self.window_size, h, w)  # b h w c
        x = x.view(b, h * w, self.dim)

        x = self.proj(x) + shortcut

        x = x + self.mlp(self.norm2(x))
        return x


class AttenBlocks(nn.Module):
    """ A series of attention blocks for one RHAG.

    Args:
        dim (int): Number of input channels.
        input_resolution (tuple[int]): Input resolution.
        depth (int): Number of blocks.
        num_heads (int): Number of attention heads.
        window_size (int): Local window size.
        mlp_ratio (float): Ratio of mlp hidden dim to embedding dim.
        qkv_bias (bool, optional): If True, add a learnable bias to query, key, value. Default: True
        qk_scale (float | None, optional): Override default qk scale of head_dim ** -0.5 if set.
        drop (float, optional): Dropout rate. Default: 0.0
        attn_drop (float, optional): Attention dropout rate. Default: 0.0
        drop_path (float | tuple[float], optional): Stochastic depth rate. Default: 0.0
        norm_layer (nn.Module, optional): Normalization layer. Default: nn.LayerNorm
        downsample (nn.Module | None, optional): Downsample layer at the end of the layer. Default: None
        use_checkpoint (bool): Whether to use checkpointing to save memory. Default: False.
    """

    def __init__(self,
                 dim,
                 input_resolution,
                 depth,
                 num_heads,
                 window_size,
                 compress_ratio,
                 squeeze_factor,
                 conv_scale,
                 overlap_ratio,
                 mlp_ratio=4.,
                 qkv_bias=True,
                 qk_scale=None,
                 drop=0.,
                 attn_drop=0.,
                 drop_path=0.,
                 norm_layer=nn.LayerNorm,
                 downsample=None,
                 use_checkpoint=False):

        super().__init__()
        self.dim = dim
        self.input_resolution = input_resolution
        self.depth = depth
        self.use_checkpoint = use_checkpoint

        # build blocks
        self.blocks = nn.ModuleList([
            HAB(
                dim=dim,
                input_resolution=input_resolution,
                num_heads=num_heads,
                window_size=window_size,
                shift_size=0 if (i % 2 == 0) else window_size // 2,
                compress_ratio=compress_ratio,
                squeeze_factor=squeeze_factor,
                conv_scale=conv_scale,
                mlp_ratio=mlp_ratio,
                qkv_bias=qkv_bias,
                qk_scale=qk_scale,
                drop=drop,
                attn_drop=attn_drop,
                drop_path=drop_path[i] if isinstance(drop_path, list) else drop_path,
                norm_layer=norm_layer) for i in range(depth)
        ])

        # OCAB
        self.overlap_attn = OCAB(
            dim=dim,
            input_resolution=input_resolution,
            window_size=window_size,
            overlap_ratio=overlap_ratio,
            num_heads=num_heads,
            qkv_bias=qkv_bias,
            qk_scale=qk_scale,
            mlp_ratio=mlp_ratio,
            norm_layer=norm_layer)

        # patch merging layer
        if downsample is not None:
            self.downsample = downsample(input_resolution, dim=dim, norm_layer=norm_layer)
        else:
            self.downsample = None

    def forward(self, x, x_size, params):
        for blk in self.blocks:
            x = blk(x, x_size, params['rpi_sa'], params['attn_mask'])

        x = self.overlap_attn(x, x_size, params['rpi_oca'])

        if self.downsample is not None:
            x = self.downsample(x)
        return x


class RHAG(nn.Module):
    """Residual Hybrid Attention Group (RHAG).

    Args:
        dim (int): Number of input channels.
        input_resolution (tuple[int]): Input resolution.
        depth (int): Number of blocks.
        num_heads (int): Number of attention heads.
        window_size (int): Local window size.
        mlp_ratio (float): Ratio of mlp hidden dim to embedding dim.
        qkv_bias (bool, optional): If True, add a learnable bias to query, key, value. Default: True
        qk_scale (float | None, optional): Override default qk scale of head_dim ** -0.5 if set.
        drop (float, optional): Dropout rate. Default: 0.0
        attn_drop (float, optional): Attention dropout rate. Default: 0.0
        drop_path (float | tuple[float], optional): Stochastic depth rate. Default: 0.0
        norm_layer (nn.Module, optional): Normalization layer. Default: nn.LayerNorm
        downsample (nn.Module | None, optional): Downsample layer at the end of the layer. Default: None
        use_checkpoint (bool): Whether to use checkpointing to save memory. Default: False.
        img_size: Input image size.
        patch_size: Patch size.
        resi_connection: The convolutional block before residual connection.
    """

    def __init__(self,
                 dim,
                 input_resolution,
                 depth,
                 num_heads,
                 window_size,
                 compress_ratio,
                 squeeze_factor,
                 conv_scale,
                 overlap_ratio,
                 mlp_ratio=4.,
                 qkv_bias=True,
                 qk_scale=None,
                 drop=0.,
                 attn_drop=0.,
                 drop_path=0.,
                 norm_layer=nn.LayerNorm,
                 downsample=None,
                 use_checkpoint=False,
                 img_size=224,
                 patch_size=4,
                 resi_connection='1conv'):
        super(RHAG, self).__init__()

        self.dim = dim
        self.input_resolution = input_resolution

        self.residual_group = AttenBlocks(
            dim=dim,
            input_resolution=input_resolution,
            depth=depth,
            num_heads=num_heads,
            window_size=window_size,
            compress_ratio=compress_ratio,
            squeeze_factor=squeeze_factor,
            conv_scale=conv_scale,
            overlap_ratio=overlap_ratio,
            mlp_ratio=mlp_ratio,
            qkv_bias=qkv_bias,
            qk_scale=qk_scale,
            drop=drop,
            attn_drop=attn_drop,
            drop_path=drop_path,
            norm_layer=norm_layer,
            downsample=downsample,
            use_checkpoint=use_checkpoint)

        if resi_connection == '1conv':
            self.conv = nn.Conv2d(dim, dim, 3, 1, 1)
        elif resi_connection == 'identity':
            self.conv = nn.Identity()

        self.patch_embed = PatchEmbed(
            img_size=img_size, patch_size=patch_size, in_chans=0, embed_dim=dim, norm_layer=None)

        self.patch_unembed = PatchUnEmbed(
            img_size=img_size, patch_size=patch_size, in_chans=0, embed_dim=dim, norm_layer=None)

    def forward(self, x, x_size, params):
        return self.patch_embed(self.conv(self.patch_unembed(self.residual_group(x, x_size, params), x_size))) + x


class PatchEmbed(nn.Module):
    r""" Image to Patch Embedding

    Args:
        img_size (int): Image size. Default: 224.
        patch_size (int): Patch token size. Default: 4.
        in_chans (int): Number of input image channels. Default: 3.
        embed_dim (int): Number of linear projection output channels. Default: 96.
        norm_layer (nn.Module, optional): Normalization layer. Default: None
    """

    def __init__(self, img_size=224, patch_size=4, in_chans=3, embed_dim=96, norm_layer=None):
        super().__init__()
        img_size = to_2tuple(img_size)
        patch_size = to_2tuple(patch_size)
        patches_resolution = [img_size[0] // patch_size[0], img_size[1] // patch_size[1]]
        self.img_size = img_size
        self.patch_size = patch_size
        self.patches_resolution = patches_resolution
        self.num_patches = patches_resolution[0] * patches_resolution[1]

        self.in_chans = in_chans
        self.embed_dim = embed_dim

        if norm_layer is not None:
            self.norm = norm_layer(embed_dim)
        else:
            self.norm = None

    def forward(self, x):
        x = x.flatten(2).transpose(1, 2)  # b Ph*Pw c
        if self.norm is not None:
            x = self.norm(x)
        return x


class PatchUnEmbed(nn.Module):
    r""" Image to Patch Unembedding

    Args:
        img_size (int): Image size. Default: 224.
        patch_size (int): Patch token size. Default: 4.
        in_chans (int): Number of input image channels. Default: 3.
        embed_dim (int): Number of linear projection output channels. Default: 96.
        norm_layer (nn.Module, optional): Normalization layer. Default: None
    """

    def __init__(self, img_size=224, patch_size=4, in_chans=3, embed_dim=96, norm_layer=None):
        super().__init__()
        img_size = to_2tuple(img_size)
        patch_size = to_2tuple(patch_size)
        patches_resolution = [img_size[0] // patch_size[0], img_size[1] // patch_size[1]]
        self.img_size = img_size
        self.patch_size = patch_size
        self.patches_resolution = patches_resolution
        self.num_patches = patches_resolution[0] * patches_resolution[1]

        self.in_chans = in_chans
        self.embed_dim = embed_dim

    def forward(self, x, x_size):
        x = x.transpose(1, 2).contiguous().view(x.shape[0], self.embed_dim, x_size[0], x_size[1])  # b c Ph Pw
        return x


class Upsample(nn.Sequential):
    """Upsample module.

    Args:
        scale (int): Scale factor. Supported scales: 2^n and 3.
        num_feat (int): Channel number of intermediate features.
    """

    def __init__(self, scale, num_feat):
        m = []
        if (scale & (scale - 1)) == 0:  # scale = 2^n
            for _ in range(int(math.log(scale, 2))):
                m.append(nn.Conv2d(num_feat, 4 * num_feat, 3, 1, 1))
                m.append(nn.PixelShuffle(2))
        elif scale == 3:
            m.append(nn.Conv2d(num_feat, 9 * num_feat, 3, 1, 1))
            m.append(nn.PixelShuffle(3))
        else:
            raise ValueError(f'scale {scale} is not supported. Supported scales: 2^n and 3.')
        super(Upsample, self).__init__(*m)


@ARCH_REGISTRY.register()
class HAT(nn.Module):
    r""" Hybrid Attention Transformer
        A PyTorch implementation of : `Activating More Pixels in Image Super-Resolution Transformer`.
        Some codes are based on SwinIR.

    Args:
        img_size (int | tuple(int)): Input image size. Default 64
        patch_size (int | tuple(int)): Patch size. Default: 1
        in_chans (int): Number of input image channels. Default: 3
        embed_dim (int): Patch embedding dimension. Default: 96
        depths (tuple(int)): Depth of each Swin Transformer layer.
        num_heads (tuple(int)): Number of attention heads in different layers.
        window_size (int): Window size. Default: 7
        mlp_ratio (float): Ratio of mlp hidden dim to embedding dim. Default: 4
        qkv_bias (bool): If True, add a learnable bias to query, key, value. Default: True
        qk_scale (float): Override default qk scale of head_dim ** -0.5 if set. Default: None
        drop_rate (float): Dropout rate. Default: 0
        attn_drop_rate (float): Attention dropout rate. Default: 0
        drop_path_rate (float): Stochastic depth rate. Default: 0.1
        norm_layer (nn.Module): Normalization layer. Default: nn.LayerNorm.
        ape (bool): If True, add absolute position embedding to the patch embedding. Default: False
        patch_norm (bool): If True, add normalization after patch embedding. Default: True
        use_checkpoint (bool): Whether to use checkpointing to save memory. Default: False
        upscale: Upscale factor. 2/3/4/8 for image SR, 1 for denoising and compress artifact reduction
        img_range: Image range. 1. or 255.
        upsampler: The reconstruction module. 'pixelshuffle'/'pixelshuffledirect'/'nearest+conv'/None
        resi_connection: The convolutional block before residual connection. '1conv'/'3conv'
    """

    def __init__(self,
                 in_chans=3,
                 img_size=64,
                 patch_size=1,
                 embed_dim=96,
                 depths=(6, 6, 6, 6),
                 num_heads=(6, 6, 6, 6),
                 window_size=7,
                 compress_ratio=3,
                 squeeze_factor=30,
                 conv_scale=0.01,
                 overlap_ratio=0.5,
                 mlp_ratio=4.,
                 qkv_bias=True,
                 qk_scale=None,
                 drop_rate=0.,
                 attn_drop_rate=0.,
                 drop_path_rate=0.1,
                 norm_layer=nn.LayerNorm,
                 ape=False,
                 patch_norm=True,
                 use_checkpoint=False,
                 upscale=2,
                 img_range=1.,
                 upsampler='',
                 resi_connection='1conv',
                 **kwargs):
        super(HAT, self).__init__()

        self.window_size = window_size
        self.shift_size = window_size // 2
        self.overlap_ratio = overlap_ratio

        num_in_ch = in_chans
        num_out_ch = in_chans
        num_feat = 64
        self.img_range = img_range
        if in_chans == 3:
            rgb_mean = (0.4488, 0.4371, 0.4040)
            self.mean = torch.Tensor(rgb_mean).view(1, 3, 1, 1)
        else:
            self.mean = torch.zeros(1, 1, 1, 1)
        self.upscale = upscale
        self.upsampler = upsampler

        # relative position index
        relative_position_index_SA = self.calculate_rpi_sa()
        relative_position_index_OCA = self.calculate_rpi_oca()
        self.register_buffer('relative_position_index_SA', relative_position_index_SA)
        self.register_buffer('relative_position_index_OCA', relative_position_index_OCA)

        # ------------------------- 1, shallow feature extraction ------------------------- #
        self.conv_first = nn.Conv2d(num_in_ch, embed_dim, 3, 1, 1)

        # ------------------------- 2, deep feature extraction ------------------------- #
        self.num_layers = len(depths)
        self.embed_dim = embed_dim
        self.ape = ape
        self.patch_norm = patch_norm
        self.num_features = embed_dim
        self.mlp_ratio = mlp_ratio

        # split image into non-overlapping patches
        self.patch_embed = PatchEmbed(
            img_size=img_size,
            patch_size=patch_size,
            in_chans=embed_dim,
            embed_dim=embed_dim,
            norm_layer=norm_layer if self.patch_norm else None)
        num_patches = self.patch_embed.num_patches
        patches_resolution = self.patch_embed.patches_resolution
        self.patches_resolution = patches_resolution

        # merge non-overlapping patches into image
        self.patch_unembed = PatchUnEmbed(
            img_size=img_size,
            patch_size=patch_size,
            in_chans=embed_dim,
            embed_dim=embed_dim,
            norm_layer=norm_layer if self.patch_norm else None)

        # absolute position embedding
        if self.ape:
            self.absolute_pos_embed = nn.Parameter(torch.zeros(1, num_patches, embed_dim))
            trunc_normal_(self.absolute_pos_embed, std=.02)

        self.pos_drop = nn.Dropout(p=drop_rate)

        # stochastic depth
        dpr = [x.item() for x in torch.linspace(0, drop_path_rate, sum(depths))]  # stochastic depth decay rule

        # build Residual Hybrid Attention Groups (RHAG)
        self.layers = nn.ModuleList()
        for i_layer in range(self.num_layers):
            layer = RHAG(
                dim=embed_dim,
                input_resolution=(patches_resolution[0], patches_resolution[1]),
                depth=depths[i_layer],
                num_heads=num_heads[i_layer],
                window_size=window_size,
                compress_ratio=compress_ratio,
                squeeze_factor=squeeze_factor,
                conv_scale=conv_scale,
                overlap_ratio=overlap_ratio,
                mlp_ratio=self.mlp_ratio,
                qkv_bias=qkv_bias,
                qk_scale=qk_scale,
                drop=drop_rate,
                attn_drop=attn_drop_rate,
                drop_path=dpr[sum(depths[:i_layer]):sum(depths[:i_layer + 1])],  # no impact on SR results
                norm_layer=norm_layer,
                downsample=None,
                use_checkpoint=use_checkpoint,
                img_size=img_size,
                patch_size=patch_size,
                resi_connection=resi_connection)
            self.layers.append(layer)
        self.norm = norm_layer(self.num_features)

        # build the last conv layer in deep feature extraction
        if resi_connection == '1conv':
            self.conv_after_body = nn.Conv2d(embed_dim, embed_dim, 3, 1, 1)
        elif resi_connection == 'identity':
            self.conv_after_body = nn.Identity()

        # ------------------------- 3, high quality image reconstruction ------------------------- #
        if self.upsampler == 'pixelshuffle':
            # for classical SR
            self.conv_before_upsample = nn.Sequential(
                nn.Conv2d(embed_dim, num_feat, 3, 1, 1), nn.LeakyReLU(inplace=True))
            self.upsample = Upsample(upscale, num_feat)
            self.conv_last = nn.Conv2d(num_feat, num_out_ch, 3, 1, 1)

        self.apply(self._init_weights)

    def _init_weights(self, m):
        if isinstance(m, nn.Linear):
            trunc_normal_(m.weight, std=.02)
            if isinstance(m, nn.Linear) and m.bias is not None:
                nn.init.constant_(m.bias, 0)
        elif isinstance(m, nn.LayerNorm):
            nn.init.constant_(m.bias, 0)
            nn.init.constant_(m.weight, 1.0)

    def calculate_rpi_sa(self):
        # calculate relative position index for SA
        coords_h = torch.arange(self.window_size)
        coords_w = torch.arange(self.window_size)
        coords = torch.stack(torch.meshgrid([coords_h, coords_w]))  # 2, Wh, Ww
        coords_flatten = torch.flatten(coords, 1)  # 2, Wh*Ww
        relative_coords = coords_flatten[:, :, None] - coords_flatten[:, None, :]  # 2, Wh*Ww, Wh*Ww
        relative_coords = relative_coords.permute(1, 2, 0).contiguous()  # Wh*Ww, Wh*Ww, 2
        relative_coords[:, :, 0] += self.window_size - 1  # shift to start from 0
        relative_coords[:, :, 1] += self.window_size - 1
        relative_coords[:, :, 0] *= 2 * self.window_size - 1
        relative_position_index = relative_coords.sum(-1)  # Wh*Ww, Wh*Ww
        return relative_position_index

    def calculate_rpi_oca(self):
        # calculate relative position index for OCA
        window_size_ori = self.window_size
        window_size_ext = self.window_size + int(self.overlap_ratio * self.window_size)

        coords_h = torch.arange(window_size_ori)
        coords_w = torch.arange(window_size_ori)
        coords_ori = torch.stack(torch.meshgrid([coords_h, coords_w]))  # 2, ws, ws
        coords_ori_flatten = torch.flatten(coords_ori, 1)  # 2, ws*ws

        coords_h = torch.arange(window_size_ext)
        coords_w = torch.arange(window_size_ext)
        coords_ext = torch.stack(torch.meshgrid([coords_h, coords_w]))  # 2, wse, wse
        coords_ext_flatten = torch.flatten(coords_ext, 1)  # 2, wse*wse

        relative_coords = coords_ext_flatten[:, None, :] - coords_ori_flatten[:, :, None]  # 2, ws*ws, wse*wse

        relative_coords = relative_coords.permute(1, 2, 0).contiguous()  # ws*ws, wse*wse, 2
        relative_coords[:, :, 0] += window_size_ori - window_size_ext + 1  # shift to start from 0
        relative_coords[:, :, 1] += window_size_ori - window_size_ext + 1

        relative_coords[:, :, 0] *= window_size_ori + window_size_ext - 1
        relative_position_index = relative_coords.sum(-1)
        return relative_position_index

    def calculate_mask(self, x_size):
        # calculate attention mask for SW-MSA
        h, w = x_size
        img_mask = torch.zeros((1, h, w, 1))  # 1 h w 1
        h_slices = (slice(0, -self.window_size), slice(-self.window_size,
                                                       -self.shift_size), slice(-self.shift_size, None))
        w_slices = (slice(0, -self.window_size), slice(-self.window_size,
                                                       -self.shift_size), slice(-self.shift_size, None))
        cnt = 0
        for h in h_slices:
            for w in w_slices:
                img_mask[:, h, w, :] = cnt
                cnt += 1

        mask_windows = window_partition(img_mask, self.window_size)  # nw, window_size, window_size, 1
        mask_windows = mask_windows.view(-1, self.window_size * self.window_size)
        attn_mask = mask_windows.unsqueeze(1) - mask_windows.unsqueeze(2)
        attn_mask = attn_mask.masked_fill(attn_mask != 0, float(-100.0)).masked_fill(attn_mask == 0, float(0.0))

        return attn_mask

    @torch.jit.ignore
    def no_weight_decay(self):
        return {'absolute_pos_embed'}

    @torch.jit.ignore
    def no_weight_decay_keywords(self):
        return {'relative_position_bias_table'}

    def forward_features(self, x):
        x_size = (x.shape[2], x.shape[3])

        # Calculate attention mask and relative position index in advance to speed up inference.
        # The original code is very time-consuming for large window size.
        attn_mask = self.calculate_mask(x_size).to(x.device)
        params = {'attn_mask': attn_mask, 'rpi_sa': self.relative_position_index_SA,
                  'rpi_oca': self.relative_position_index_OCA}

        x = self.patch_embed(x)
        if self.ape:
            x = x + self.absolute_pos_embed
        x = self.pos_drop(x)

        for layer in self.layers:
            x = layer(x, x_size, params)

        x = self.norm(x)  # b seq_len c
        x = self.patch_unembed(x, x_size)

        return x

    def forward(self, x):
        self.mean = self.mean.type_as(x)
        x = (x - self.mean) * self.img_range

        if self.upsampler == 'pixelshuffle':
            # for classical SR
            x = self.conv_first(x)
            x = self.conv_after_body(self.forward_features(x)) + x
            x = self.conv_before_upsample(x)
            x = self.conv_last(self.upsample(x))

        x = x / self.img_range + self.mean

        return x

2. Adding the HAT Attention Mechanism

2.1 STEP1

First, go to the ultralytics/nn directory and create a Python package named Add-module. (Note that it must be a Python package; __init__.py is generated automatically when the package is created.) If you have already created it while following one of my earlier tutorials, you can skip this step. Then create a file named HAT.py and paste all of the attention code from section 1.2 into it, as shown in the figure below.

2.2 STEP2

In the __init__.py file created in STEP1, import the newly added module, as shown in the figure below.

2.3 STEP3

Open tasks.py in the ultralytics/nn folder and add the code shown in the figure below.

2.4 STEP4

In tasks.py in the ultralytics/nn folder, locate the function def parse_model(d, ch, verbose=True):  # model_dict, input_channels(3) and add the code shown in the figure. (If the function is hard to find, search for it with Ctrl+F.)

3. yaml File and Running

3.1 yaml file

Below is the yaml file with the HAT attention mechanism added to the backbone. You can comment layers in or out to tune it yourself; judge the effect by the results on your own dataset.

# Ultralytics YOLO, AGPL-3.0 license
# YOLO11 object detection model with P3-P5 outputs. For Usage examples see https://docs.ultralytics.com/tasks/detect

# Parameters
nc: 80 # number of classes
scales: # model compound scaling constants, i.e. 'model=yolo11n.yaml' will call yolo11.yaml with scale 'n'
  # [depth, width, max_channels]
  n: [0.50, 0.25, 1024] # summary: 319 layers, 2624080 parameters, 2624064 gradients, 6.6 GFLOPs
  s: [0.50, 0.50, 1024] # summary: 319 layers, 9458752 parameters, 9458736 gradients, 21.7 GFLOPs
  m: [0.50, 1.00, 512] # summary: 409 layers, 20114688 parameters, 20114672 gradients, 68.5 GFLOPs
  l: [1.00, 1.00, 512] # summary: 631 layers, 25372160 parameters, 25372144 gradients, 87.6 GFLOPs
  x: [1.00, 1.50, 512] # summary: 631 layers, 56966176 parameters, 56966160 gradients, 196.0 GFLOPs

# YOLO11n backbone
backbone:
  # [from, repeats, module, args]
  - [-1, 1, Conv, [64, 3, 2]] # 0-P1/2
  - [-1, 1, Conv, [128, 3, 2]] # 1-P2/4
  - [-1, 2, C3k2, [256, False, 0.25]]
  - [-1, 1, Conv, [256, 3, 2]] # 3-P3/8
  - [-1, 2, C3k2, [512, False, 0.25]]
  - [-1, 1, Conv, [512, 3, 2]] # 5-P4/16
  - [-1, 2, C3k2, [512, True]]
  - [-1, 1, Conv, [1024, 3, 2]] # 7-P5/32
  - [-1, 2, C3k2, [1024, True]]
  - [-1, 1, HAT, []] # 9
  - [-1, 1, SPPF, [1024, 5]] # 10
  - [-1, 2, C2PSA, [1024]] # 11

# YOLO11n head
head:
  - [-1, 1, nn.Upsample, [None, 2, "nearest"]]
  - [[-1, 6], 1, Concat, [1]] # cat backbone P4
  - [-1, 2, C3k2, [512, False]] # 14

  - [-1, 1, nn.Upsample, [None, 2, "nearest"]]
  - [[-1, 4], 1, Concat, [1]] # cat backbone P3
  - [-1, 2, C3k2, [256, False]] # 17 (P3/8-small)

  - [-1, 1, Conv, [256, 3, 2]]
  - [[-1, 14], 1, Concat, [1]] # cat head P4
  - [-1, 2, C3k2, [512, False]] # 20 (P4/16-medium)

  - [-1, 1, Conv, [512, 3, 2]]
  - [[-1, 11], 1, Concat, [1]] # cat head P5
  - [-1, 2, C3k2, [1024, True]] # 23 (P5/32-large)

  - [[17, 20, 23], 1, Detect, [nc]] # Detect(P3, P4, P5)

The insertion position above is for reference only. The best position and the module's effect depend on the results on your own dataset.

3.2 Screenshot of a successful run

OK, that is the whole process of adding the HAT attention mechanism. More updates will follow, so stay tuned.
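A note on the yaml file in section 3.1: inserting HAT as a new backbone layer at index 9 pushes every later layer down by one, so any absolute layer reference in the head must be increased by one, while relative references such as -1 and references to earlier layers (4 and 6 here) stay as they are. The sketch below illustrates that bookkeeping in plain Python. The helper name shift_from_index is hypothetical and not part of Ultralytics, and the "before" values are assumed to be those of the stock YOLO11 head.

```python
def shift_from_index(from_field, insert_at):
    """Shift absolute layer references at or after insert_at by one.

    Hypothetical helper for illustration only (not an Ultralytics API).
    Negative (relative) references such as -1 are left untouched.
    """
    def shift(i):
        return i + 1 if i >= insert_at else i

    if isinstance(from_field, int):
        return shift(from_field)
    return [shift(i) for i in from_field]


# 'from' fields of the head layers that use absolute references,
# assumed to match the stock YOLO11 head before HAT is inserted.
original_refs = [[-1, 6], [-1, 4], [-1, 13], [-1, 10], [16, 19, 22]]

# HAT is inserted as layer 9, right after the last backbone C3k2.
shifted_refs = [shift_from_index(ref, insert_at=9) for ref in original_refs]
print(shifted_refs)
```

The shifted values agree with the head in section 3.1: the Concat layers read from 14 and 11, and Detect reads from 17, 20, and 23. If you move HAT to a different position, the same rule applies with a different insert_at.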
