Deep learning paper: LEDnet: A lightweight encoder-decoder network for real-time semantic segmentation, and its PyTorch implementation (from mingo_敏's blog)

LEDnet: A lightweight encoder-decoder network for real-time semantic segmentation
PDF:https://arxiv.org/pdf/1905.02423.pdf
PyTorch: https://github.com/shanglianlm0525/PyTorch-Networks

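1 Overview

LEDNet's asymmetrical architecture (shown in the architecture figure) greatly reduces the number of network parameters and speeds up inference.

Channel split and shuffle in the residual blocks provide strong feature representation.

On the decoder side, an attention pyramid network (APN) designed with a feature-pyramid attention mechanism further reduces the complexity of the whole network.

The model has fewer than 1M parameters and runs at more than 71 FPS on a single GTX 1080Ti GPU.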

2 LEDnet

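LEDNet consists of two parts: an encoder network and a decoder network.

Encoder module:
LEDNet's asymmetric design reduces the number of parameters and speeds up inference.

The channel split and shuffle operations in the residual module reduce the network size and improve feature representation.

The skip connection lets the convolutions learn a residual function, which helps training.
The split and shuffle operations strengthen information exchange between channels while keeping a computational cost similar to that of one-dimensional factorized convolutions.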
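
To make the shuffle step concrete, here is a minimal pure-Python sketch that applies the same reshape, transpose and flatten permutation to channel indices instead of tensors. The function name `shuffle_channel_order` is made up for this illustration, and it assumes the channel count is divisible by the number of groups.

```python
def shuffle_channel_order(num_channels, groups):
    """Return the channel order after a channel shuffle.

    Mirrors [C] -> [g, C/g] -> transpose -> [C/g, g] -> [C] on plain indices.
    """
    if num_channels % groups != 0:
        raise ValueError("num_channels must be divisible by groups")
    per_group = num_channels // groups
    # Split the channel indices into `groups` consecutive blocks.
    blocks = [list(range(g * per_group, (g + 1) * per_group)) for g in range(groups)]
    # Take the i-th channel of every block in turn (the transpose), then flatten.
    return [blocks[g][i] for i in range(per_group) for g in range(groups)]


if __name__ == "__main__":
    print(shuffle_channel_order(8, 4))
```

With 8 channels and 4 groups, the first half of the shuffled order contains channels that came from both halves of the input, so the split in the next unit sees features from both branches.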

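Decoder module:

An attention pyramid network (APN), designed with a feature-pyramid attention mechanism, extracts rich features and uses attention to estimate the semantic label of each pixel.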
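
The way the APN branches are combined can be sketched on toy numbers. This assumes, as in the `forward` of the APN implementation in section 2-2, that the pyramid output acts as a per-pixel attention map multiplied with the 1x1-convolved features, and that a globally pooled value is added on top. The function `fuse_apn_branches` and the sample values are illustrative only.

```python
def fuse_apn_branches(attention, features, global_context):
    """Combine one channel of the three APN branches: attention * features + context.

    attention, features: 2-D lists of the same size (one channel each).
    global_context: a single number, broadcast to every pixel.
    """
    return [
        [a * f + global_context for a, f in zip(att_row, feat_row)]
        for att_row, feat_row in zip(attention, features)
    ]


if __name__ == "__main__":
    attention = [[0.0, 1.0], [2.0, 0.5]]
    features = [[3.0, 3.0], [1.0, 4.0]]
    print(fuse_apn_branches(attention, features, global_context=0.5))
```

Where the attention value is 0 only the global context remains, which is why the pooled branch is added rather than multiplied.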

2-1 SS-nbt module

The SS-nbt module follows a split-transform-merge strategy.

class SS_nbt(nn.Module):
    def __init__(self, channels, dilation=1, groups=4):
        super(SS_nbt, self).__init__()

        mid_channels = channels // 2
        self.half_split = HalfSplit(dim=1)

        self.first_bottleneck = nn.Sequential(
            ConvReLU(in_channels=mid_channels, out_channels=mid_channels, kernel_size=[3, 1], stride=1, padding=[1, 0]),
            ConvBNReLU(in_channels=mid_channels, out_channels=mid_channels, kernel_size=[1, 3], stride=1, padding=[0, 1]),
            ConvReLU(in_channels=mid_channels, out_channels=mid_channels, kernel_size=[3, 1], stride=1, dilation=[dilation,1], padding=[dilation, 0]),
            ConvBNReLU(in_channels=mid_channels, out_channels=mid_channels, kernel_size=[1, 3], stride=1, dilation=[1,dilation], padding=[0, dilation]),
        )

        self.second_bottleneck = nn.Sequential(
            ConvReLU(in_channels=mid_channels, out_channels=mid_channels, kernel_size=[1, 3], stride=1, padding=[0, 1]),
            ConvBNReLU(in_channels=mid_channels, out_channels=mid_channels, kernel_size=[3, 1], stride=1, padding=[1, 0]),
            ConvReLU(in_channels=mid_channels, out_channels=mid_channels, kernel_size=[1, 3], stride=1, dilation=[1,dilation], padding=[0, dilation]),
            ConvBNReLU(in_channels=mid_channels, out_channels=mid_channels, kernel_size=[3, 1], stride=1, dilation=[dilation,1], padding=[dilation, 0]),
        )

        self.channelShuffle = ChannelShuffle(groups)

    def forward(self, x):
        x1, x2 = self.half_split(x)
        x1 = self.first_bottleneck(x1)
        x2 = self.second_bottleneck(x2)
        out = torch.cat([x1, x2], dim=1)
        return self.channelShuffle(out+x)
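
As a rough check on why splitting the channels keeps the unit cheap, the sketch below counts convolution weights only (BatchNorm parameters are ignored). The two comparison baselines, the same factorized convolutions without the split and a pair of plain 3x3 convolutions, are assumptions chosen for illustration and are not taken from the post.

```python
def conv_weights(in_channels, out_channels, kernel_h, kernel_w):
    """Number of weights in a bias-free 2-D convolution."""
    return in_channels * out_channels * kernel_h * kernel_w


def ss_nbt_conv_weights(channels):
    """Conv weights of one SS-nbt unit: two branches on half the channels,
    each with four factorized (3x1 or 1x3) convolutions."""
    mid = channels // 2
    per_branch = 2 * conv_weights(mid, mid, 3, 1) + 2 * conv_weights(mid, mid, 1, 3)
    return 2 * per_branch


def unsplit_factorized_weights(channels):
    """The same four factorized convolutions applied to all channels (no split)."""
    return 2 * conv_weights(channels, channels, 3, 1) + 2 * conv_weights(channels, channels, 1, 3)


def full_3x3_weights(channels):
    """Two plain 3x3 convolutions on all channels, for comparison."""
    return 2 * conv_weights(channels, channels, 3, 3)


if __name__ == "__main__":
    for c in (32, 64, 128):
        print(c, ss_nbt_conv_weights(c), unsplit_factorized_weights(c), full_3x3_weights(c))
```

Under these assumptions, splitting halves the weight count relative to the unsplit factorized unit, and the split unit uses one third of the weights of the two plain 3x3 convolutions.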

2-2 APN


class APN(nn.Module):
    def __init__(self, in_channels, out_channels):
        super(APN, self).__init__()

        # Pyramid branch: stride-2 convolutions with 3x3, 5x5 and 7x7 kernels
        self.conv1_1 = ConvBNReLU(in_channels=in_channels, out_channels=in_channels, kernel_size=3, stride=2, padding=1)
        self.conv1_2 = Conv1x1BNReLU(in_channels=in_channels, out_channels=out_channels)

        self.conv2_1 = ConvBNReLU(in_channels=in_channels, out_channels=in_channels, kernel_size=5, stride=2, padding=2)
        self.conv2_2 = Conv1x1BNReLU(in_channels=in_channels, out_channels=out_channels)

        self.conv3 = nn.Sequential(
            ConvBNReLU(in_channels=in_channels, out_channels=in_channels, kernel_size=7, stride=2, padding=3),
            Conv1x1BNReLU(in_channels=in_channels, out_channels=out_channels),
        )

        # Feature branch: 1x1 convolution at full resolution
        self.branch2 = Conv1x1BNReLU(in_channels=in_channels, out_channels=out_channels)

        # Global context branch: global average pooling followed by a 1x1 convolution
        self.branch3 = nn.Sequential(
            nn.AdaptiveAvgPool2d(output_size=1),
            nn.Conv2d(in_channels=in_channels, out_channels=out_channels,kernel_size=1, stride=1,padding=0),
        )

    def forward(self, x):
        _, _, h, w = x.shape
        x1 = self.conv1_1(x)
        x2 = self.conv2_1(x1)
        x3 = self.conv3(x2)
        # Merge the pyramid levels from coarse to fine. Upsampling to the actual
        # size of the next level also works when h or w is not divisible by 8.
        x3 = F.interpolate(x3, size=x2.shape[2:], mode='bilinear', align_corners=True)
        x2 = self.conv2_2(x2) + x3
        x2 = F.interpolate(x2, size=x1.shape[2:], mode='bilinear', align_corners=True)
        x1 = self.conv1_2(x1) + x2
        out1 = F.interpolate(x1, size=(h, w), mode='bilinear', align_corners=True)

        out2 = self.branch2(x)

        out3 = self.branch3(x)
        out3 = F.interpolate(out3, size=(h, w), mode='bilinear', align_corners=True)
        # Attention-weighted features plus global context
        return out1 * out2 + out3
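
The spatial sizes of the three pyramid levels follow from the standard convolution output-size formula. The sketch below is plain arithmetic with no PyTorch dependency; the helper names are hypothetical, and the kernel sizes and paddings are copied from the `APN` class above.

```python
def conv_output_size(size, kernel_size, stride, padding):
    """Output length of a convolution along one axis (dilation 1)."""
    return (size + 2 * padding - kernel_size) // stride + 1


def apn_pyramid_sizes(height, width):
    """Spatial sizes after the 3x3, 5x5 and 7x7 stride-2 convolutions in APN."""
    sizes = []
    for kernel_size, padding in ((3, 1), (5, 2), (7, 3)):
        height = conv_output_size(height, kernel_size, stride=2, padding=padding)
        width = conv_output_size(width, kernel_size, stride=2, padding=padding)
        sizes.append((height, width))
    return sizes


if __name__ == "__main__":
    # Encoder output size for a 1024x512 input
    print(apn_pyramid_sizes(128, 64))
    # An odd-sized feature map
    print(apn_pyramid_sizes(65, 65))
```

For odd sizes each level is rounded up rather than down (65 becomes 33, then 17, then 9), so a level's size is not always equal to the floor division of the input size by 2, 4 or 8.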

2-3 LEDNet Architecture


import torch
import torch.nn as nn
import torch.nn.functional as F

def ConvBNReLU(in_channels,out_channels,kernel_size,stride,padding,dilation=[1,1],groups=1):
    return nn.Sequential(
            nn.Conv2d(in_channels=in_channels, out_channels=out_channels, kernel_size=kernel_size, stride=stride, padding=padding,dilation=dilation,groups=groups, bias=False),
            nn.BatchNorm2d(out_channels),
            nn.ReLU6(inplace=True)
        )


def ConvBN(in_channels,out_channels,kernel_size,stride,padding,dilation=[1,1],groups=1):
    return nn.Sequential(
            nn.Conv2d(in_channels=in_channels, out_channels=out_channels, kernel_size=kernel_size, stride=stride, padding=padding,dilation=dilation,groups=groups, bias=False),
            nn.BatchNorm2d(out_channels)
        )

def ConvReLU(in_channels,out_channels,kernel_size,stride,padding,dilation=[1,1],groups=1):
    return nn.Sequential(
            nn.Conv2d(in_channels=in_channels, out_channels=out_channels, kernel_size=kernel_size, stride=stride, padding=padding,dilation=dilation,groups=groups, bias=False),
            nn.ReLU6(inplace=True)
        )

def Conv1x1BNReLU(in_channels,out_channels):
    return nn.Sequential(
            nn.Conv2d(in_channels=in_channels, out_channels=out_channels, kernel_size=1, stride=1, bias=False),
            nn.BatchNorm2d(out_channels),
            nn.ReLU6(inplace=True)
        )


def Conv1x1BN(in_channels,out_channels):
    return nn.Sequential(
            nn.Conv2d(in_channels=in_channels, out_channels=out_channels, kernel_size=1, stride=1, bias=False),
            nn.BatchNorm2d(out_channels)
        )

class HalfSplit(nn.Module):
    def __init__(self, dim=1):
        super(HalfSplit, self).__init__()
        self.dim = dim

    def forward(self, input):
        splits = torch.chunk(input, 2, dim=self.dim)
        return splits[0], splits[1]

class ChannelShuffle(nn.Module):
    def __init__(self, groups):
        super(ChannelShuffle, self).__init__()
        self.groups = groups

    def forward(self, x):
        '''Channel shuffle: [N,C,H,W] -> [N,g,C/g,H,W] -> [N,C/g,g,H,W] -> [N,C,H,W]'''
        N, C, H, W = x.size()
        g = self.groups
        # C must be divisible by g; integer division avoids a float round trip
        return x.view(N, g, C // g, H, W).permute(0, 2, 1, 3, 4).contiguous().view(N, C, H, W)


class SS_nbt(nn.Module):
    def __init__(self, channels, dilation=1, groups=4):
        super(SS_nbt, self).__init__()

        mid_channels = channels // 2
        self.half_split = HalfSplit(dim=1)

        self.first_bottleneck = nn.Sequential(
            ConvReLU(in_channels=mid_channels, out_channels=mid_channels, kernel_size=[3, 1], stride=1, padding=[1, 0]),
            ConvBNReLU(in_channels=mid_channels, out_channels=mid_channels, kernel_size=[1, 3], stride=1, padding=[0, 1]),
            ConvReLU(in_channels=mid_channels, out_channels=mid_channels, kernel_size=[3, 1], stride=1, dilation=[dilation,1], padding=[dilation, 0]),
            ConvBNReLU(in_channels=mid_channels, out_channels=mid_channels, kernel_size=[1, 3], stride=1, dilation=[1,dilation], padding=[0, dilation]),
        )

        self.second_bottleneck = nn.Sequential(
            ConvReLU(in_channels=mid_channels, out_channels=mid_channels, kernel_size=[1, 3], stride=1, padding=[0, 1]),
            ConvBNReLU(in_channels=mid_channels, out_channels=mid_channels, kernel_size=[3, 1], stride=1, padding=[1, 0]),
            ConvReLU(in_channels=mid_channels, out_channels=mid_channels, kernel_size=[1, 3], stride=1, dilation=[1,dilation], padding=[0, dilation]),
            ConvBNReLU(in_channels=mid_channels, out_channels=mid_channels, kernel_size=[3, 1], stride=1, dilation=[dilation,1], padding=[dilation, 0]),
        )

        self.channelShuffle = ChannelShuffle(groups)

    def forward(self, x):
        x1, x2 = self.half_split(x)
        x1 = self.first_bottleneck(x1)
        x2 = self.second_bottleneck(x2)
        out = torch.cat([x1, x2], dim=1)
        return self.channelShuffle(out+x)


class DownSampling(nn.Module):
    def __init__(self, in_channels, out_channels):
        super(DownSampling, self).__init__()
        mid_channels = out_channels - in_channels

        self.conv = nn.Conv2d(in_channels=in_channels,out_channels=mid_channels,kernel_size=3,stride=2,padding=1)
        self.maxpool = nn.MaxPool2d(kernel_size=3,stride=2, padding=1)

        self.bn = nn.BatchNorm2d(out_channels)
        self.relu = nn.ReLU(inplace=True)

    def forward(self, x):
        x1 = self.conv(x)
        x2 = self.maxpool(x)
        output = torch.cat([x1, x2], 1)
        return self.relu(self.bn(output))

class Encoder(nn.Module):
    def __init__(self, groups = 4):
        super(Encoder, self).__init__()
        planes = [32, 64, 128]

        self.downSampling1 = DownSampling(in_channels=3, out_channels=planes[0])
        self.ssBlock1 = self._make_layer(channels=planes[0], dilation=1, groups=groups, block_num=3)
        self.downSampling2 = DownSampling(in_channels=planes[0], out_channels=planes[1])
        self.ssBlock2 = self._make_layer(channels=planes[1], dilation=1, groups=groups, block_num=2)
        self.downSampling3 = DownSampling(in_channels=planes[1], out_channels=planes[2])
        self.ssBlock3 = nn.Sequential(
            SS_nbt(channels=planes[2], dilation=1, groups=groups),
            SS_nbt(channels=planes[2], dilation=2, groups=groups),
            SS_nbt(channels=planes[2], dilation=5, groups=groups),
            SS_nbt(channels=planes[2], dilation=9, groups=groups),
            SS_nbt(channels=planes[2], dilation=2, groups=groups),
            SS_nbt(channels=planes[2], dilation=5, groups=groups),
            SS_nbt(channels=planes[2], dilation=9, groups=groups),
            SS_nbt(channels=planes[2], dilation=17, groups=groups),
        )

    def _make_layer(self, channels, dilation, groups, block_num):
        layers = []
        for idx in range(block_num):
            layers.append(SS_nbt(channels, dilation=dilation, groups=groups))
        return nn.Sequential(*layers)

    def forward(self, x):
        x = self.downSampling1(x)
        x = self.ssBlock1(x)
        x = self.downSampling2(x)
        x = self.ssBlock2(x)
        x = self.downSampling3(x)
        out = self.ssBlock3(x)
        return out


class APN(nn.Module):
    def __init__(self, in_channels, out_channels):
        super(APN, self).__init__()

        # Pyramid branch: stride-2 convolutions with 3x3, 5x5 and 7x7 kernels
        self.conv1_1 = ConvBNReLU(in_channels=in_channels, out_channels=in_channels, kernel_size=3, stride=2, padding=1)
        self.conv1_2 = Conv1x1BNReLU(in_channels=in_channels, out_channels=out_channels)

        self.conv2_1 = ConvBNReLU(in_channels=in_channels, out_channels=in_channels, kernel_size=5, stride=2, padding=2)
        self.conv2_2 = Conv1x1BNReLU(in_channels=in_channels, out_channels=out_channels)

        self.conv3 = nn.Sequential(
            ConvBNReLU(in_channels=in_channels, out_channels=in_channels, kernel_size=7, stride=2, padding=3),
            Conv1x1BNReLU(in_channels=in_channels, out_channels=out_channels),
        )

        # Feature branch: 1x1 convolution at full resolution
        self.branch2 = Conv1x1BNReLU(in_channels=in_channels, out_channels=out_channels)

        # Global context branch: global average pooling followed by a 1x1 convolution
        self.branch3 = nn.Sequential(
            nn.AdaptiveAvgPool2d(output_size=1),
            nn.Conv2d(in_channels=in_channels, out_channels=out_channels,kernel_size=1, stride=1,padding=0),
        )

    def forward(self, x):
        _, _, h, w = x.shape
        x1 = self.conv1_1(x)
        x2 = self.conv2_1(x1)
        x3 = self.conv3(x2)
        # Merge the pyramid levels from coarse to fine. Upsampling to the actual
        # size of the next level also works when h or w is not divisible by 8.
        x3 = F.interpolate(x3, size=x2.shape[2:], mode='bilinear', align_corners=True)
        x2 = self.conv2_2(x2) + x3
        x2 = F.interpolate(x2, size=x1.shape[2:], mode='bilinear', align_corners=True)
        x1 = self.conv1_2(x1) + x2
        out1 = F.interpolate(x1, size=(h, w), mode='bilinear', align_corners=True)

        out2 = self.branch2(x)

        out3 = self.branch3(x)
        out3 = F.interpolate(out3, size=(h, w), mode='bilinear', align_corners=True)
        # Attention-weighted features plus global context
        return out1 * out2 + out3


class Decoder(nn.Module):
    def __init__(self, in_channels,num_classes):
        super(Decoder, self).__init__()
        self.apn = APN(in_channels=in_channels, out_channels=num_classes)

    def forward(self, x):
        _, _, h, w = x.shape
        apn_x = self.apn(x)
        out = F.interpolate(apn_x, size=(h*8, w*8), mode='bilinear', align_corners=True)
        return out


class LEDnet(nn.Module):
    def __init__(self, num_classes=20):
        super(LEDnet, self).__init__()
        self.encoder = Encoder()
        self.decoder = Decoder(in_channels=128,num_classes=num_classes)

    def forward(self, x):
        x = self.encoder(x)
        out = self.decoder(x)
        return out


if __name__ == '__main__':
    model = LEDnet(num_classes=20)
    print(model)

    input = torch.randn(1,3,1024,512)
    output = model(input)
    print(output.shape)
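
The shape printed by the script above can be anticipated with a small trace of channels and resolution through the network. This is a sketch in plain Python; `lednet_shape_trace` is a hypothetical helper, and the stage widths (32, 64, 128), the stride-2 downsampling and the final 8x upsampling are read off the `Encoder` and `Decoder` classes in the listing.

```python
def lednet_shape_trace(height, width, num_classes=20, planes=(32, 64, 128)):
    """Trace (name, channels, height, width) through the encoder stages and decoder.

    Each DownSampling block halves the resolution (kernel 3, stride 2, padding 1);
    the SS-nbt units and the APN keep the spatial size unchanged.
    """
    trace = [("input", 3, height, width)]
    h, w = height, width
    for idx, channels in enumerate(planes, start=1):
        h = (h + 2 * 1 - 3) // 2 + 1
        w = (w + 2 * 1 - 3) // 2 + 1
        trace.append((f"stage{idx}", channels, h, w))
    # The decoder outputs num_classes maps and upsamples the encoder output by 8.
    trace.append(("output", num_classes, h * 8, w * 8))
    return trace


if __name__ == "__main__":
    for row in lednet_shape_trace(1024, 512):
        print(row)
```

For the 1024x512 test input, the encoder output is 128 channels at 128x64 and the final prediction is back at 1024x512 with one map per class.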

3 Experiments

[Figure: experimental results]

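Copyright notice: this is an original article by the blogger, released under the CC 4.0 BY-SA license. When reposting, please include a link to the original source and this notice.
Article link: https://blog.csdn.net/shanglianlm/article/details/107061715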
