Tags: algorithms, python, deep learning, model training, pytorch
Only a few files need to be modified.
The config file under configs/swin; the one I use is mask_rcnn_swin_tiny_patch4_window7_mstrain_480-800_adamw_1x_coco.py.
img_scale mainly has to be reduced here because GPU memory is not enough.
_base_ = [
'../_base_/models/mask_rcnn_swin_fpn.py',
# '../_base_/datasets/coco_instance.py',
'../_base_/datasets/coco_detection.py',
'../_base_/schedules/schedule_1x.py', '../_base_/default_runtime.py'
]
model = dict(
backbone=dict(
embed_dim=96,
depths=[2, 2, 6, 2],
num_heads=[3, 6, 12, 24],
window_size=7,
ape=False,
drop_path_rate=0.1,
patch_norm=True,
use_checkpoint=False
),
neck=dict(in_channels=[96, 192, 384, 768]))
img_norm_cfg = dict(
mean=[123.675, 116.28, 103.53], std=[58.395, 57.12, 57.375], to_rgb=True)
# augmentation strategy originates from DETR / Sparse RCNN
train_pipeline = [
dict(type='LoadImageFromFile'),
dict(type='LoadAnnotations', with_bbox=True, with_mask=False),
# dict(type='LoadAnnotations', with_bbox=True, with_mask=True),
dict(type='RandomFlip', flip_ratio=0.5),
dict(type='AutoAugment',
policies=[
[
dict(type='Resize',
# img_scale=[(480, 1333), (512, 1333), (544, 1333), (576, 1333),
# (608, 1333), (640, 1333), (672, 1333), (704, 1333),
# (736, 1333), (768, 1333), (800, 1333)],
img_scale=[(224, 224)],
multiscale_mode='value',
keep_ratio=True)
],
[
dict(type='Resize',
# img_scale=[(400, 1333), (500, 1333), (600, 1333)],
img_scale=[(224, 224)],
multiscale_mode='value',
keep_ratio=True),
dict(type='RandomCrop',
crop_type='absolute_range',
crop_size=(384, 600),
allow_negative_crop=True),
dict(type='Resize',
# img_scale=[(480, 1333), (512, 1333), (544, 1333),
# (576, 1333), (608, 1333), (640, 1333),
# (672, 1333), (704, 1333), (736, 1333),
# (768, 1333), (800, 1333)],
img_scale=[(224, 224)],
multiscale_mode='value',
override=True,
keep_ratio=True)
]
]),
dict(type='Normalize', **img_norm_cfg),
dict(type='Pad', size_divisor=32),
dict(type='DefaultFormatBundle'),
dict(type='Collect', keys=['img', 'gt_bboxes', 'gt_labels']),
# dict(type='Collect', keys=['img', 'gt_bboxes', 'gt_labels', 'gt_masks']),
]
data = dict(train=dict(pipeline=train_pipeline))
optimizer = dict(_delete_=True, type='AdamW', lr=0.0001, betas=(0.9, 0.999), weight_decay=0.05,
paramwise_cfg=dict(custom_keys={'absolute_pos_embed': dict(decay_mult=0.),
'relative_position_bias_table': dict(decay_mult=0.),
'norm': dict(decay_mult=0.)}))
lr_config = dict(step=[8, 11])
# runner = dict(type='EpochBasedRunnerAmp', max_epochs=12)
runner = dict(type='EpochBasedRunner', max_epochs=12)
# do not use mmdet version fp16
fp16 = None
# optimizer_config = dict(
# type="DistOptimizerHook",
# update_interval=1,
# grad_clip=None,
# coalesce=True,
# bucket_size_mb=-1,
# use_fp16=True,
# )
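Once the config above is saved, a quick sanity check is to load the merged config with mmcv before launching training and confirm that the overrides took effect. This is only a sketch; it assumes mmcv is installed and that it is run from the repository root.
from mmcv import Config

# Load the config; mmcv merges it with the _base_ files listed at the top.
cfg = Config.fromfile(
    'configs/swin/mask_rcnn_swin_tiny_patch4_window7_mstrain_480-800_adamw_1x_coco.py')
print(cfg.optimizer)            # should show the AdamW settings defined above
print(cfg.data.train.pipeline)  # should contain the img_scale=[(224, 224)] resizes
print(cfg.model.roi_head.bbox_head.num_classes)  # 20 once mask_rcnn_swin_fpn.py is edited below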
configs\_base_\default_runtime.py (if you downloaded the official pretrained model)
checkpoint_config = dict(interval=1)
# yapf:disable
log_config = dict(
interval=50,
hooks=[
dict(type='TextLoggerHook'),
# dict(type='TensorboardLoggerHook')
])
# yapf:enable
custom_hooks = [dict(type='NumClassCheckHook')]
dist_params = dict(backend='nccl')
log_level = 'INFO'
load_from = None
# adjust the path to the pretrained model as needed; training without a pretrained model gives much worse results
#load_from = 'D:/Users/Downloads/Swin-Transformer-Object-Detection/model/mask_rcnn_swin_tiny_patch4_window7_1x.pth'
resume_from = None
workflow = [('train', 1)]
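Because load_from points at a COCO-pretrained checkpoint while the heads are changed to 20 classes, it is worth checking that the downloaded file is readable before training; a minimal sketch, assuming PyTorch and the example checkpoint path from the commented-out line above:
import torch

# Load the checkpoint on CPU just to confirm the file is intact.
ckpt = torch.load(
    'D:/Users/Downloads/Swin-Transformer-Object-Detection/model/'
    'mask_rcnn_swin_tiny_patch4_window7_1x.pth',
    map_location='cpu')
state = ckpt.get('state_dict', ckpt)  # mmdet checkpoints wrap the weights in 'state_dict'
print(len(state), 'parameter tensors')
When training starts, expect warnings about size mismatches for the COCO classification and regression weights; those layers are reinitialized while the rest of the weights still load.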
configs\_base_\datasets\coco_detection.py: dataset paths; keep the dataset format consistent with COCO
dataset_type = 'CocoDataset'
# data_root = 'data/coco/'
data_root = 'F:/myproject/dataset/COCO_/'
img_norm_cfg = dict(
mean=[123.675, 116.28, 103.53], std=[58.395, 57.12, 57.375], to_rgb=True)
train_pipeline = [
dict(type='LoadImageFromFile'),
# dict(type='LoadAnnotations', with_bbox=True),
dict(type='LoadAnnotations', with_bbox=True, with_mask=False, with_seg=False, poly2mask=False),
dict(type='Resize', img_scale=(224, 224), keep_ratio=True),  # img_scale=(1333, 800),
dict(type='RandomFlip', flip_ratio=0.5),
dict(type='Normalize', **img_norm_cfg),
dict(type='Pad', size_divisor=32),
dict(type='DefaultFormatBundle'),
dict(type='Collect', keys=['img', 'gt_bboxes', 'gt_labels']),
]
test_pipeline = [
dict(type='LoadImageFromFile'),
dict(
type='MultiScaleFlipAug',
img_scale=(224, 224), #img_scale=(1333, 800),
flip=False,
transforms=[
dict(type='Resize', keep_ratio=True),
dict(type='RandomFlip'),
dict(type='Normalize', **img_norm_cfg),
dict(type='Pad', size_divisor=32),
dict(type='ImageToTensor', keys=['img']),
dict(type='Collect', keys=['img']),
])
]
data = dict(
samples_per_gpu=2,
workers_per_gpu=2,
train=dict(
type=dataset_type,
ann_file=data_root + 'annotations/instances_train2017.json',
img_prefix=data_root + 'train2017/',
pipeline=train_pipeline),
val=dict(
type=dataset_type,
ann_file=data_root + 'annotations/instances_val2017.json',
img_prefix=data_root + 'val2017/',
pipeline=test_pipeline),
test=dict(
type=dataset_type,
ann_file=data_root + 'annotations/instances_val2017.json',
img_prefix=data_root + 'val2017/',
pipeline=test_pipeline))
evaluation = dict(interval=1, metric='bbox')
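Before training it is also worth confirming that the converted annotation files parse and that the category names match the CLASSES tuple edited further below; a short sketch, assuming pycocotools is installed:
from pycocotools.coco import COCO

data_root = 'F:/myproject/dataset/COCO_/'
coco = COCO(data_root + 'annotations/instances_train2017.json')
print('images:', len(coco.getImgIds()))
print('categories:', [cat['name'] for cat in coco.loadCats(coco.getCatIds())])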
configs\_base_\models\mask_rcnn_swin_fpn.py: remember to change the number of classes and the mask settings
# model settings
model = dict(
type='MaskRCNN',
pretrained=None,
backbone=dict(
type='SwinTransformer',
embed_dim=96,
depths=[2, 2, 6, 2],
num_heads=[3, 6, 12, 24],
window_size=7,
mlp_ratio=4.,
qkv_bias=True,
qk_scale=None,
drop_rate=0.,
attn_drop_rate=0.,
drop_path_rate=0.2,
ape=False,
patch_norm=True,
out_indices=(0, 1, 2, 3),
use_checkpoint=False),
neck=dict(
type='FPN',
in_channels=[96, 192, 384, 768],
out_channels=256,
num_outs=5),
rpn_head=dict(
type='RPNHead',
in_channels=256,
feat_channels=256,
anchor_generator=dict(
type='AnchorGenerator',
scales=[8],
ratios=[0.5, 1.0, 2.0],
strides=[4, 8, 16, 32, 64]),
bbox_coder=dict(
type='DeltaXYWHBBoxCoder',
target_means=[.0, .0, .0, .0],
target_stds=[1.0, 1.0, 1.0, 1.0]),
loss_cls=dict(
type='CrossEntropyLoss', use_sigmoid=True, loss_weight=1.0),
loss_bbox=dict(type='L1Loss', loss_weight=1.0)),
roi_head=dict(
type='StandardRoIHead',
bbox_roi_extractor=dict(
type='SingleRoIExtractor',
roi_layer=dict(type='RoIAlign', output_size=7, sampling_ratio=0),
out_channels=256,
featmap_strides=[4, 8, 16, 32]),
bbox_head=dict(
type='Shared2FCBBoxHead',
in_channels=256,
fc_out_channels=1024,
roi_feat_size=7,
# change this to the number of classes in your own dataset
# num_classes=80,
num_classes=20,
bbox_coder=dict(
type='DeltaXYWHBBoxCoder',
target_means=[0., 0., 0., 0.],
target_stds=[0.1, 0.1, 0.2, 0.2]),
reg_class_agnostic=False,
loss_cls=dict(
type='CrossEntropyLoss', use_sigmoid=False, loss_weight=1.0),
loss_bbox=dict(type='L1Loss', loss_weight=1.0)),
# mask_roi_extractor=dict(
# type='SingleRoIExtractor',
# roi_layer=dict(type='RoIAlign', output_size=14, sampling_ratio=0),
# out_channels=256,
# featmap_strides=[4, 8, 16, 32]),
# mask_head=dict(
# type='FCNMaskHead',
# num_convs=4,
# in_channels=256,
# conv_out_channels=256,
# change this to the number of classes in your own dataset
# num_classes=80,
# loss_mask=dict(
# type='CrossEntropyLoss', use_mask=True, loss_weight=1.0
# )
# )
),
# model training and testing settings
train_cfg=dict(
rpn=dict(
assigner=dict(
type='MaxIoUAssigner',
pos_iou_thr=0.7,
neg_iou_thr=0.3,
min_pos_iou=0.3,
match_low_quality=True,
ignore_iof_thr=-1),
sampler=dict(
type='RandomSampler',
num=256,
pos_fraction=0.5,
neg_pos_ub=-1,
add_gt_as_proposals=False),
allowed_border=-1,
pos_weight=-1,
debug=False),
rpn_proposal=dict(
nms_pre=2000,
max_per_img=1000,
nms=dict(type='nms', iou_threshold=0.7),
min_bbox_size=0),
rcnn=dict(
assigner=dict(
type='MaxIoUAssigner',
pos_iou_thr=0.5,
neg_iou_thr=0.5,
min_pos_iou=0.5,
match_low_quality=True,
ignore_iof_thr=-1),
sampler=dict(
type='RandomSampler',
num=512,
pos_fraction=0.25,
neg_pos_ub=-1,
add_gt_as_proposals=True),
# mask_size=28,
pos_weight=-1,
debug=False)),
test_cfg=dict(
rpn=dict(
nms_pre=1000,
max_per_img=1000,
nms=dict(type='nms', iou_threshold=0.7),
min_bbox_size=0),
rcnn=dict(
score_thr=0.05,
nms=dict(type='nms', iou_threshold=0.5),
max_per_img=100,
# mask_thr_binary=0.5
)))
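To verify that the class-count change propagated, the detector can be built directly from the merged config. A sketch, assuming the mmdet version bundled with this repository (here train_cfg/test_cfg already sit inside the model dict):
from mmcv import Config
from mmdet.models import build_detector

cfg = Config.fromfile(
    'configs/swin/mask_rcnn_swin_tiny_patch4_window7_mstrain_480-800_adamw_1x_coco.py')
model = build_detector(cfg.model)
# Shared2FCBBoxHead adds one background class, so out_features should be num_classes + 1 = 21.
print(model.roi_head.bbox_head.fc_cls)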
mmdet\datasets\coco.py: replace the classes with those of your custom dataset
class CocoDataset(CustomDataset):
# CLASSES = ('person', 'bicycle', 'car', 'motorcycle', 'airplane', 'bus',
# 'train', 'truck', 'boat', 'traffic light', 'fire hydrant',
# 'stop sign', 'parking meter', 'bench', 'bird', 'cat', 'dog',
# 'horse', 'sheep', 'cow', 'elephant', 'bear', 'zebra', 'giraffe',
# 'backpack', 'umbrella', 'handbag', 'tie', 'suitcase', 'frisbee',
# 'skis', 'snowboard', 'sports ball', 'kite', 'baseball bat',
# 'baseball glove', 'skateboard', 'surfboard', 'tennis racket',
# 'bottle', 'wine glass', 'cup', 'fork', 'knife', 'spoon', 'bowl',
# 'banana', 'apple', 'sandwich', 'orange', 'broccoli', 'carrot',
# 'hot dog', 'pizza', 'donut', 'cake', 'chair', 'couch',
# 'potted plant', 'bed', 'dining table', 'toilet', 'tv', 'laptop',
# 'mouse', 'remote', 'keyboard', 'cell phone', 'microwave',
# 'oven', 'toaster', 'sink', 'refrigerator', 'book', 'clock',
# 'vase', 'scissors', 'teddy bear', 'hair drier', 'toothbrush')
CLASSES = (
'aeroplane', 'bicycle', 'bird', 'boat', 'bottle', 'bus', 'car', 'cat',
'chair', 'cow', 'diningtable', 'dog', 'horse', 'motorbike', 'person',
'pottedplant', 'sheep', 'sofa', 'train', 'tvmonitor'
)
The class names around line 67 of mmdet\core\evaluation\class_names.py also need to be replaced with your custom dataset's classes.
# def coco_classes():
# return [
# 'person', 'bicycle', 'car', 'motorcycle', 'airplane', 'bus', 'train',
# 'truck', 'boat', 'traffic_light', 'fire_hydrant', 'stop_sign',
# 'parking_meter', 'bench', 'bird', 'cat', 'dog', 'horse', 'sheep',
# 'cow', 'elephant', 'bear', 'zebra', 'giraffe', 'backpack', 'umbrella',
# 'handbag', 'tie', 'suitcase', 'frisbee', 'skis', 'snowboard',
# 'sports_ball', 'kite', 'baseball_bat', 'baseball_glove', 'skateboard',
# 'surfboard', 'tennis_racket', 'bottle', 'wine_glass', 'cup', 'fork',
# 'knife', 'spoon', 'bowl', 'banana', 'apple', 'sandwich', 'orange',
# 'broccoli', 'carrot', 'hot_dog', 'pizza', 'donut', 'cake', 'chair',
# 'couch', 'potted_plant', 'bed', 'dining_table', 'toilet', 'tv',
# 'laptop', 'mouse', 'remote', 'keyboard', 'cell_phone', 'microwave',
# 'oven', 'toaster', 'sink', 'refrigerator', 'book', 'clock', 'vase',
# 'scissors', 'teddy_bear', 'hair_drier', 'toothbrush'
# ]
def coco_classes():
return [
'aeroplane', 'bicycle', 'bird', 'boat', 'bottle', 'bus', 'car', 'cat',
'chair', 'cow', 'diningtable', 'dog', 'horse', 'motorbike', 'person',
'pottedplant', 'sheep', 'sofa', 'train', 'tvmonitor'
]
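Depending on the mmdetection version this repository is based on, editing the source may not be strictly necessary: 2.x releases generally accept a per-dataset classes override in the config. A hedged sketch of that alternative (verify it against the bundled version before relying on it):
# Assumption: the bundled mmdet supports the `classes` dataset option, so the
# VOC-style names can be declared once in the config instead of patching
# mmdet/datasets/coco.py and mmdet/core/evaluation/class_names.py.
classes = ('aeroplane', 'bicycle', 'bird', 'boat', 'bottle', 'bus', 'car', 'cat',
           'chair', 'cow', 'diningtable', 'dog', 'horse', 'motorbike', 'person',
           'pottedplant', 'sheep', 'sofa', 'train', 'tvmonitor')
data = dict(
    train=dict(classes=classes),
    val=dict(classes=classes),
    test=dict(classes=classes))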