Nvidia Jetson Nano | TensorFlow-GPU | TensorFlow Object Detection API | MobileNet-SSD | Training on Your Own Dataset

Tags: tensorflow nano object-detection-api mobilnet-ssd training-your-own-data jetson

References:

https://www.cnblogs.com/leviatan/p/10740105.html

https://www.cnblogs.com/gezhuangzhuang/p/10613468.html

For how to install tensorflow-gpu, see my earlier blog post:

https://blog.csdn.net/ourkix/article/details/103577082

Download the files

Download URL: https://github.com/tensorflow/models

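You can also use the archive I uploaded, which contains the dataset, the pretrained model files, and the test images. The archive is large, so it is split into volumes; download all of them and extract them together.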

Download URL: https://download.csdn.net/download/ourkix/12068490

Download URL: https://download.csdn.net/download/ourkix/12068504

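The GitHub download is a models-master.zip file. Extract it, rename the extracted folder to models, and move it into the following folder (press Ctrl + H in the file manager to show hidden folders such as .local):

/home/nvidia/.local/lib/python3.6/site-packages/tensorflow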

If you use my upload instead, extracting it gives a models folder that contains a second models folder. Copy the inner models folder into:

/home/nvidia/.local/lib/python3.6/site-packages/tensorflow

Install the dependencies (already installed if you followed the earlier blog post)

python3 -m pip install pillow --user
python3 -m pip install lxml --user
python3 -m pip install matplotlib --user
python3 -m pip install pandas --user

Check whether protobuf is installed:

protoc --version

If the output looks like

libprotoc 3.0.0

then it is installed.

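If it is not installed:

# protobuf-compiler provides the protoc command; python3-protobuf provides the Python bindings
sudo apt-get install -y protobuf-compiler python3-protobuf
# The Python bindings can also be installed with pip
python3 -m pip install protobuf --user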
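The same check can be scripted. The following is a minimal sketch, not part of the original setup: the helper names parse_protoc_version and installed_protoc_version are made up for illustration, and the only assumption about protoc is that `protoc --version` prints a line of the form `libprotoc X.Y.Z`, as shown above.

```python
import re
import shutil
import subprocess


def parse_protoc_version(output):
    """Extract (major, minor, patch) from text such as 'libprotoc 3.0.0'.

    Returns None when the text does not look like protoc version output.
    """
    match = re.search(r"libprotoc\s+(\d+)\.(\d+)(?:\.(\d+))?", output)
    if match is None:
        return None
    # A missing patch component (e.g. 'libprotoc 24.4') is treated as 0.
    return tuple(int(part) if part else 0 for part in match.groups())


def installed_protoc_version():
    """Return the version of the protoc found on PATH, or None if absent."""
    if shutil.which("protoc") is None:
        return None
    result = subprocess.run(
        ["protoc", "--version"],
        stdout=subprocess.PIPE,
        universal_newlines=True,  # Python 3.6 compatible spelling of text=True
    )
    return parse_protoc_version(result.stdout)


if __name__ == "__main__":
    version = installed_protoc_version()
    print("protoc not found" if version is None else "protoc version: %s" % (version,))
```

Returning None for both "not on PATH" and "unrecognized output" keeps the caller's check to a single comparison.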

Enter the models/research/ directory and compile the protobuf definitions (if this step reports that the pandas library is missing, just install it):

cd /home/nvidia/.local/lib/python3.6/site-packages/tensorflow/models/research
protoc object_detection/protos/*.proto --python_out=.

Install the object_detection library

python3 setup.py build
python3 setup.py install

Set PYTHONPATH

Edit the .bashrc file (it belongs to your own user, so sudo is not needed):

gedit ~/.bashrc

Append the following lines at the end:

export PYTHONPATH=$PYTHONPATH:/home/nvidia/.local/lib/python3.6/site-packages/tensorflow/models/research
export PYTHONPATH=$PYTHONPATH:/home/nvidia/.local/lib/python3.6/site-packages/tensorflow/models/research/slim

Save the file, then reload it so the change takes effect:

source ~/.bashrc

Test whether the object_detection library installed correctly

cd /home/nvidia/.local/lib/python3.6/site-packages/tensorflow/models/research
python3 object_detection/builders/model_builder_test.py
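If the test script fails with an import error, it can help to check which modules Python cannot locate. The sketch below is an illustration only: find_missing_modules is a hypothetical helper, and the module names in REQUIRED are my assumption about what a working setup exposes (object_detection.protos.pipeline_pb2 is produced by the protoc step, and nets comes from the slim directory added to PYTHONPATH).

```python
import importlib.util


def find_missing_modules(names):
    """Return the subset of module names that Python cannot locate."""
    missing = []
    for name in names:
        try:
            spec = importlib.util.find_spec(name)
        except (ImportError, ValueError):
            # find_spec raises when a parent package of a dotted name is absent.
            spec = None
        if spec is None:
            missing.append(name)
    return missing


# Assumed module names for this setup; adjust them to your installation.
REQUIRED = [
    "object_detection",
    "object_detection.protos.pipeline_pb2",
    "nets",
]

if __name__ == "__main__":
    missing = find_missing_modules(REQUIRED)
    if missing:
        print("Not importable:", ", ".join(missing))
    else:
        print("All required modules were found.")
```

A missing pipeline_pb2 usually points at the protoc step, while a missing nets points at the slim entry in PYTHONPATH.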

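Next, run an object detection test script. The object_detection directory contains the notebook object_detection_tutorial.ipynb; here we use a plain Python script instead of Jupyter Notebook, which is more convenient.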

Create a new file named object-detection-turorial.py:

touch object-detection-turorial.py

Open it in an editor and enter the following:

import numpy as np
import os
import six.moves.urllib as urllib
import sys
import tarfile
import tensorflow as tf
import zipfile
import matplotlib

from distutils.version import StrictVersion
from collections import defaultdict
from io import StringIO
from matplotlib import pyplot as plt
from PIL import Image

# This is needed since the notebook is stored in the object_detection folder.
sys.path.append("..")
from object_detection.utils import ops as utils_ops

if StrictVersion(tf.__version__) < StrictVersion('1.9.0'):
  raise ImportError('Please upgrade your TensorFlow installation to v1.9.* or later!')




from utils import label_map_util

from utils import visualization_utils as vis_util


# Use the Tk backend so that matplotlib can open a window on the desktop.
matplotlib.use('TkAgg')

# What model to download.
MODEL_NAME = 'ssd_mobilenet_v1_coco_2017_11_17'
MODEL_FILE = MODEL_NAME + '.tar.gz'
DOWNLOAD_BASE = 'http://download.tensorflow.org/models/object_detection/'

# Path to frozen detection graph. This is the actual model that is used for the object detection.
PATH_TO_FROZEN_GRAPH = MODEL_NAME + '/frozen_inference_graph.pb'

# List of the strings that is used to add correct label for each box.
PATH_TO_LABELS = os.path.join('data', 'mscoco_label_map.pbtxt')

print(PATH_TO_LABELS)


# For the sake of simplicity we will use only 2 images:
# image1.jpg
# image2.jpg
# If you want to test the code with your images, just add path to the images to the TEST_IMAGE_PATHS.
PATH_TO_TEST_IMAGES_DIR = 'test_images'
TEST_IMAGE_PATHS = [ os.path.join(PATH_TO_TEST_IMAGES_DIR, 'image{}.jpg'.format(i)) for i in range(1, 3) ]

# Size, in inches, of the output images.
IMAGE_SIZE = (12, 8)

output_num = 1
# Folder (relative to the current directory) where result images are saved.
output_img_dic = 'output_images'










opener = urllib.request.URLopener()
print("--\n")
opener.retrieve(DOWNLOAD_BASE + MODEL_FILE, MODEL_FILE)
print("--\n")
tar_file = tarfile.open(MODEL_FILE)
for file in tar_file.getmembers():
  file_name = os.path.basename(file.name)
  if 'frozen_inference_graph.pb' in file_name:
    tar_file.extract(file, os.getcwd())

print("--\n")


detection_graph = tf.Graph()
with detection_graph.as_default():
  od_graph_def = tf.compat.v1.GraphDef()
  with tf.io.gfile.GFile(PATH_TO_FROZEN_GRAPH, 'rb') as fid:
    serialized_graph = fid.read()
    od_graph_def.ParseFromString(serialized_graph)
    tf.import_graph_def(od_graph_def, name='')

print("--\n")

category_index = label_map_util.create_category_index_from_labelmap(PATH_TO_LABELS, use_display_name=True)

print("--\n")

def load_image_into_numpy_array(image):
  (im_width, im_height) = image.size
  return np.array(image.getdata()).reshape(
      (im_height, im_width, 3)).astype(np.uint8)






def run_inference_for_single_image(image, graph):
  with graph.as_default():
    with tf.compat.v1.Session() as sess:
      # Get handles to input and output tensors
      ops = tf.compat.v1.get_default_graph().get_operations()
      all_tensor_names = {output.name for op in ops for output in op.outputs}
      tensor_dict = {}
      for key in [
          'num_detections', 'detection_boxes', 'detection_scores',
          'detection_classes', 'detection_masks'
      ]:
        tensor_name = key + ':0'
        if tensor_name in all_tensor_names:
          tensor_dict[key] = tf.compat.v1.get_default_graph().get_tensor_by_name(
              tensor_name)
      if 'detection_masks' in tensor_dict:
        # The following processing is only for single image
        detection_boxes = tf.squeeze(tensor_dict['detection_boxes'], [0])
        detection_masks = tf.squeeze(tensor_dict['detection_masks'], [0])
        # Reframe is required to translate mask from box coordinates to image coordinates and fit the image size.
        real_num_detection = tf.cast(tensor_dict['num_detections'][0], tf.int32)
        detection_boxes = tf.slice(detection_boxes, [0, 0], [real_num_detection, -1])
        detection_masks = tf.slice(detection_masks, [0, 0, 0], [real_num_detection, -1, -1])
        detection_masks_reframed = utils_ops.reframe_box_masks_to_image_masks(
            detection_masks, detection_boxes, image.shape[0], image.shape[1])
        detection_masks_reframed = tf.cast(
            tf.greater(detection_masks_reframed, 0.5), tf.uint8)
        # Follow the convention by adding back the batch dimension
        tensor_dict['detection_masks'] = tf.expand_dims(
            detection_masks_reframed, 0)
      image_tensor = tf.compat.v1.get_default_graph().get_tensor_by_name('image_tensor:0')

      # Run inference
      output_dict = sess.run(tensor_dict,
                             feed_dict={image_tensor: np.expand_dims(image, 0)})

      # all outputs are float32 numpy arrays, so convert types as appropriate
      output_dict['num_detections'] = int(output_dict['num_detections'][0])
      output_dict['detection_classes'] = output_dict[
          'detection_classes'][0].astype(np.uint8)
      output_dict['detection_boxes'] = output_dict['detection_boxes'][0]
      output_dict['detection_scores'] = output_dict['detection_scores'][0]
      if 'detection_masks' in output_dict:
        output_dict['detection_masks'] = output_dict['detection_masks'][0]
  return output_dict




for image_path in TEST_IMAGE_PATHS:
  image = Image.open(image_path)
  # the array based representation of the image will be used later in order to prepare the
  # result image with boxes and labels on it.
  image_np = load_image_into_numpy_array(image)
  # Expand dimensions since the model expects images to have shape: [1, None, None, 3]
  image_np_expanded = np.expand_dims(image_np, axis=0)
  # Actual detection.
  output_dict = run_inference_for_single_image(image_np, detection_graph)
  # Visualization of the results of a detection.
  vis_util.visualize_boxes_and_labels_on_image_array(
      image_np,
      output_dict['detection_boxes'],
      output_dict['detection_classes'],
      output_dict['detection_scores'],
      category_index,
      instance_masks=output_dict.get('detection_masks'),
      use_normalized_coordinates=True,
      line_thickness=8)
  plt.figure(figsize=IMAGE_SIZE)
  plt.imshow(image_np)

  # Save before plt.show(): after the window is closed the figure is empty.
  if not os.path.exists(output_img_dic):
      os.mkdir(output_img_dic)
  output_img_path = os.path.join(output_img_dic,str(output_num)+".png")
  plt.savefig(output_img_path)
  output_num += 1  # give each image its own output file
  plt.show()

Save the file and run it:

python3 object-detection-turorial.py

Wait for it to finish. The Nano is slow and the script also has to download the model, so expect it to take 2-3 minutes.
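Since the script downloads the model archive on every run, repeated runs can be shortened by skipping the download when the file is already present. This is a minimal sketch of that idea and not part of the original script; fetch_once is a hypothetical helper, and the downloader is passed in as a parameter so that the function can be tried without network access.

```python
import os


def fetch_once(url, dest, download):
    """Call download(url, dest) only when dest does not exist yet.

    Returns True if a download was performed, False if it was skipped.
    """
    if os.path.exists(dest):
        return False
    download(url, dest)
    return True


if __name__ == "__main__":
    # Example wiring with the standard library downloader.
    from urllib.request import urlretrieve

    MODEL_FILE = "ssd_mobilenet_v1_coco_2017_11_17.tar.gz"
    DOWNLOAD_BASE = "http://download.tensorflow.org/models/object_detection/"
    fetched = fetch_once(DOWNLOAD_BASE + MODEL_FILE, MODEL_FILE, urlretrieve)
    print("downloaded" if fetched else "already present, skipped")
```

In the tutorial script, this call could stand in for the opener.retrieve(...) line; the extraction loop that follows it stays the same.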

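Train on your own dataset

Generate the TFRecord files

The VOC dataset directory structure looks like this:

I created an ssd_model directory under object_detection and put VOCdevkit inside it. I provide the entire models folder (including the pretrained model, the seahorse dataset, and the test data), so you can follow my layout.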

|--VOCdevkit
    |--VOC2007
        |--Annotations
        |--ImageSets
            |--Layout
            |--Main
            |--Segmentation
        |--JPEGImages

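1. Inside your VOC-format dataset folder, create train_test_split.py. It splits the XML annotation files into train, validation, and test sets and stores them in subfolders of Annotations. The train+validation portion is 80% of the data and the test set is 20%; the training set is 80% of the train+validation portion. The code is as follows: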

import os  
import random  
import time  
import shutil

xmlfilepath=r'./Annotations'  
saveBasePath=r"./Annotations"

trainval_percent=0.8  
train_percent=0.8  
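# Keep only .xml files so that the train/validation/test subfolders
# created by an earlier run are not treated as annotations.
total_xml = [f for f in os.listdir(xmlfilepath) if f.endswith('.xml')]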
num=len(total_xml)  
list=range(num)  
tv=int(num*trainval_percent)  
tr=int(tv*train_percent)  
trainval= random.sample(list,tv)  
train=random.sample(trainval,tr)  
print("train and val size",tv)  
print("train size",tr) 

start = time.time()

test_num=0  
val_num=0  
train_num=0  

for i in list:  
    name=total_xml[i]
    if i in trainval:  #train and val set 
        if i in train: 
            directory="train"  
            train_num += 1  
            xml_path = os.path.join(os.getcwd(), 'Annotations/{}'.format(directory))  
            if(not os.path.exists(xml_path)):  
                os.mkdir(xml_path)  
            filePath=os.path.join(xmlfilepath,name)  
            newfile=os.path.join(saveBasePath,os.path.join(directory,name))  
            shutil.copyfile(filePath, newfile)
        else:
            directory="validation"  
            xml_path = os.path.join(os.getcwd(), 'Annotations/{}'.format(directory))  
            if(not os.path.exists(xml_path)):  
                os.mkdir(xml_path)  
            val_num += 1  
            filePath=os.path.join(xmlfilepath,name)   
            newfile=os.path.join(saveBasePath,os.path.join(directory,name))  
            shutil.copyfile(filePath, newfile)

    else:
        directory="test"  
        xml_path = os.path.join(os.getcwd(), 'Annotations/{}'.format(directory))  
        if(not os.path.exists(xml_path)):  
                os.mkdir(xml_path)  
        test_num += 1  
        filePath=os.path.join(xmlfilepath,name)  
        newfile=os.path.join(saveBasePath,os.path.join(directory,name))  
        shutil.copyfile(filePath, newfile)

end = time.time()  
seconds=end-start  
print("train total : "+str(train_num))  
print("validation total : "+str(val_num))  
print("test total : "+str(test_num))  
total_num=train_num+val_num+test_num  
print("total number : "+str(total_num))  
print( "Time taken : {0} seconds".format(seconds))
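To make the split arithmetic concrete, here is a small sketch that reproduces the two `int(...)` truncations from the script and shows a seeded variant. It is an illustration under my own assumptions, not the script's method: split_counts and split_names are hypothetical helpers, and the seeded shuffle is a swapped-in alternative to the unseeded random.sample calls, chosen so that repeated runs give the same split.

```python
import random


def split_counts(num, trainval_percent=0.8, train_percent=0.8):
    """Return the size of each subset using the same truncation as the script."""
    tv = int(num * trainval_percent)   # train + validation
    tr = int(tv * train_percent)       # train only
    return {"train": tr, "validation": tv - tr, "test": num - tv}


def split_names(names, seed=0, trainval_percent=0.8, train_percent=0.8):
    """Deterministically assign file names to train/validation/test."""
    counts = split_counts(len(names), trainval_percent, train_percent)
    shuffled = sorted(names)                # fixed starting order
    random.Random(seed).shuffle(shuffled)   # reproducible shuffle
    n_train = counts["train"]
    n_val = counts["validation"]
    return {
        "train": shuffled[:n_train],
        "validation": shuffled[n_train:n_train + n_val],
        "test": shuffled[n_train + n_val:],
    }


if __name__ == "__main__":
    # With 100 annotations: 64 train, 16 validation, 20 test.
    print(split_counts(100))
```

Because of the truncation, the effective training share is 64% of all files, not 80%.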

2. Convert the XML files to CSV with xml_to_csv.py, which writes the generated CSV files to object_detection/data/:

import os  
import glob  
import pandas as pd  
import xml.etree.ElementTree as ET 

def xml_to_csv(path):  
    xml_list = []  
    for xml_file in glob.glob(path + '/*.xml'):  
        tree = ET.parse(xml_file)  
        root = tree.getroot()
        
        print(root.find('filename').text)  
        for member in root.findall('object'): 
            value = (root.find('filename').text,  
                int(root.find('size')[0].text),   #width  
                int(root.find('size')[1].text),   #height  
                member[0].text,  
                int(member[4][0].text),  
                int(float(member[4][1].text)),  
                int(member[4][2].text),  
                int(member[4][3].text)  
                )  
            xml_list.append(value)
    column_name = ['filename', 'width', 'height', 'class', 'xmin', 'ymin', 'xmax', 'ymax']
    xml_df = pd.DataFrame(xml_list, columns=column_name)  
    return xml_df      

def main():  
    for directory in ['train','test','validation']:  
        xml_path = os.path.join(os.getcwd(), 'Annotations/{}'.format(directory))  

        xml_df = xml_to_csv(xml_path)  
        # xml_df.to_csv('whsyxt.csv', index=None)  
        xml_df.to_csv('/home/nvidia/.local/lib/python3.6/site-packages/tensorflow/models/research/object_detection/data/seahorse_{}_labels.csv'.format(directory), index=None)  
        print('Successfully converted xml to csv.')

main()
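The script above reads fields by position (member[0], member[4][0], and so on), which only works when every annotation lists its elements in exactly that order. As a hedged alternative, the sketch below looks fields up by tag name instead. rows_from_annotation and SAMPLE_XML are made up for illustration; the tag names follow the usual PASCAL VOC layout written by labelImg, which I am assuming matches your files.

```python
import xml.etree.ElementTree as ET

# A made-up annotation in the usual PASCAL VOC layout.
SAMPLE_XML = """<annotation>
  <filename>seahorse_001.jpg</filename>
  <size><width>640</width><height>480</height><depth>3</depth></size>
  <object>
    <name>seahorse</name>
    <pose>Unspecified</pose>
    <truncated>0</truncated>
    <difficult>0</difficult>
    <bndbox><xmin>48</xmin><ymin>240.0</ymin><xmax>195</xmax><ymax>371</ymax></bndbox>
  </object>
</annotation>"""


def rows_from_annotation(root):
    """Return one CSV row tuple per <object>, looked up by tag name."""
    filename = root.findtext("filename")
    width = int(root.findtext("size/width"))
    height = int(root.findtext("size/height"))
    rows = []
    for obj in root.findall("object"):
        box = obj.find("bndbox")
        # int(float(...)) tolerates coordinates stored as '240.0'.
        coords = [int(float(box.findtext(tag)))
                  for tag in ("xmin", "ymin", "xmax", "ymax")]
        rows.append((filename, width, height, obj.findtext("name"), *coords))
    return rows


rows = rows_from_annotation(ET.fromstring(SAMPLE_XML))
print(rows)
```

The tuples use the same column order as column_name in xml_to_csv.py, so they could be passed to pd.DataFrame in the same way.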

3. Generate the TFRecord files: create generate_tfrecord.py in the research directory:

#!/usr/bin/env python3
# -*- coding: utf-8 -*-

#Usage:
  # From tensorflow/models/research/
  # Create train data:
  #python3 generate_tfrecord.py --csv_input=object_detection/data/seahorse_train_labels.csv  --output_path=object_detection/data/seahorse_train.tfrecord
  # Create test data:
  #python3 generate_tfrecord.py --csv_input=object_detection/data/seahorse_test_labels.csv  --output_path=object_detection/data/seahorse_test.tfrecord



import os
import io
import pandas as pd
import tensorflow as tf

from PIL import Image
from object_detection.utils import dataset_util
from collections import namedtuple, OrderedDict

os.chdir('/home/nvidia/.local/lib/python3.6/site-packages/tensorflow/models/research/')

flags = tf.app.flags
flags.DEFINE_string('csv_input', '', 'Path to the CSV input')
flags.DEFINE_string('output_path', '', 'Path to output TFRecord')
FLAGS = flags.FLAGS


# TO-DO replace this with label map
def class_text_to_int(row_label):
    # List all of your classes here; ids must match labelmap.pbtxt.
    if row_label == 'seahorse':
        return 1
    else:
        return None

def split(df, group):
    data = namedtuple('data', ['filename', 'object'])
    gb = df.groupby(group)
    return [data(filename, gb.get_group(x)) for filename, x in zip(gb.groups.keys(), gb.groups)]


def create_tf_example(group, path):
    with tf.gfile.GFile(os.path.join(path, '{}'.format(group.filename)), 'rb') as fid:
        encoded_jpg = fid.read()
    encoded_jpg_io = io.BytesIO(encoded_jpg)
    image = Image.open(encoded_jpg_io)
    width, height = image.size

    filename = group.filename.encode('utf8')
    image_format = b'jpg'
    xmins = []
    xmaxs = []
    ymins = []
    ymaxs = []
    classes_text = []
    classes = []

    for index, row in group.object.iterrows():
        xmins.append(row['xmin'] / width)
        xmaxs.append(row['xmax'] / width)
        ymins.append(row['ymin'] / height)
        ymaxs.append(row['ymax'] / height)
        classes_text.append(row['class'].encode('utf8'))
        classes.append(class_text_to_int(row['class']))

    tf_example = tf.train.Example(features=tf.train.Features(feature={
        'image/height': dataset_util.int64_feature(height),
        'image/width': dataset_util.int64_feature(width),
        'image/filename': dataset_util.bytes_feature(filename),
        'image/source_id': dataset_util.bytes_feature(filename),
        'image/encoded': dataset_util.bytes_feature(encoded_jpg),
        'image/format': dataset_util.bytes_feature(image_format),
        'image/object/bbox/xmin': dataset_util.float_list_feature(xmins),
        'image/object/bbox/xmax': dataset_util.float_list_feature(xmaxs),
        'image/object/bbox/ymin': dataset_util.float_list_feature(ymins),
        'image/object/bbox/ymax': dataset_util.float_list_feature(ymaxs),
        'image/object/class/text': dataset_util.bytes_list_feature(classes_text),
        'image/object/class/label': dataset_util.int64_list_feature(classes),
    }))
    return tf_example


def main(_):
    writer = tf.python_io.TFRecordWriter(FLAGS.output_path)
    path = os.path.join(os.getcwd(), 'object_detection/ssd_model/VOCdevkit/VOC2007/JPEGImages/')
    examples = pd.read_csv(FLAGS.csv_input)
    grouped = split(examples, 'filename')
    num = 0
    for group in grouped:
        num += 1
        tf_example = create_tf_example(group, path)
        writer.write(tf_example.SerializeToString())
        if (num % 100 == 0):    # print progress every 100 converted examples
            print(num)

    writer.close()
    output_path = os.path.join(os.getcwd(), FLAGS.output_path)
    print('Successfully created the TFRecords: {}'.format(output_path))


if __name__ == '__main__':
    tf.app.run()

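The main change is in class_text_to_int: add the classes you annotated there. The row_label string must match the name you used when labeling in labelImg. Likewise, path must point to the image directory.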

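Before generating the records, edit the CSV files and change jpeg to jpg in all three of them. This is specific to my image file extensions; without the change the script raises an error.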
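The extension fix can also be done with a short script instead of by hand. This is a sketch under assumptions: normalize_extension and rewrite_csv are hypothetical helpers, and it assumes the CSV has a filename column as produced by xml_to_csv.py. It uses the standard csv module so it does not depend on pandas.

```python
import csv
import os


def normalize_extension(filename, old=".jpeg", new=".jpg"):
    """Replace a trailing .jpeg (any letter case) with .jpg."""
    stem, ext = os.path.splitext(filename)
    return stem + new if ext.lower() == old else filename


def rewrite_csv(src, dst):
    """Copy CSV rows from file object src to dst, fixing the filename column."""
    reader = csv.DictReader(src)
    writer = csv.DictWriter(dst, fieldnames=reader.fieldnames,
                            lineterminator="\n")
    writer.writeheader()
    for row in reader:
        row["filename"] = normalize_extension(row["filename"])
        writer.writerow(row)


if __name__ == "__main__":
    # Hypothetical file names; write to a new file and compare before replacing.
    with open("seahorse_train_labels.csv", newline="") as src, \
         open("seahorse_train_labels_fixed.csv", "w", newline="") as dst:
        rewrite_csv(src, dst)
```

Writing to a separate output file keeps the original CSV intact until the result has been checked.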

cd /home/nvidia/.local/lib/python3.6/site-packages/tensorflow/models/research

python3 generate_tfrecord.py --csv_input=object_detection/data/seahorse_train_labels.csv --output_path=object_detection/data/seahorse_train.tfrecord

generate_tfrecord.py must be located in the research directory, the parent of object_detection, because the script imports object_detection.utils. Running the command from inside object_detection fails with an error (No module named object_detection).

Similarly, the following commands convert the validation and test sets to TFRecord format:

python3 generate_tfrecord.py --csv_input=object_detection/data/seahorse_validation_labels.csv --output_path=object_detection/data/seahorse_validation.tfrecord
python3 generate_tfrecord.py --csv_input=object_detection/data/seahorse_test_labels.csv --output_path=object_detection/data/seahorse_test.tfrecord
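The three commands differ only in the split name, so they can be generated in a loop. The sketch below builds the argument lists; tfrecord_commands is a hypothetical helper, and the file naming pattern is taken from the commands above.

```python
def tfrecord_commands(prefix="seahorse",
                      data_dir="object_detection/data",
                      splits=("train", "validation", "test")):
    """Build one generate_tfrecord.py argument list per dataset split."""
    commands = []
    for split in splits:
        commands.append([
            "python3", "generate_tfrecord.py",
            "--csv_input={}/{}_{}_labels.csv".format(data_dir, prefix, split),
            "--output_path={}/{}_{}.tfrecord".format(data_dir, prefix, split),
        ])
    return commands


if __name__ == "__main__":
    for command in tfrecord_commands():
        print(" ".join(command))
```

Each list can be handed to subprocess.run(command, check=True) from the research directory, which stops at the first failing conversion.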

Training

1. In the object_detection/data folder, create the label map file (labelmap.pbtxt). Create one id for each object class you want to detect. The contents are as follows:

item {
  id: 1    # ids are numbered starting from 1
  name: 'seahorse'
}
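With more than one class, the ids in labelmap.pbtxt and the ids returned by class_text_to_int must agree. One way to keep them in sync is to derive both from a single list of class names. The following is a sketch only; label_ids and labelmap_text are hypothetical helpers, not part of the Object Detection API.

```python
def label_ids(class_names):
    """Map each class name to an id, numbering from 1 in list order."""
    return {name: index for index, name in enumerate(class_names, start=1)}


def labelmap_text(class_names):
    """Render the classes as labelmap.pbtxt item blocks."""
    items = []
    for index, name in enumerate(class_names, start=1):
        items.append("item {{\n  id: {}\n  name: '{}'\n}}\n".format(index, name))
    return "\n".join(items)


if __name__ == "__main__":
    classes = ["seahorse"]
    print(labelmap_text(classes))
    # label_ids(classes).get(row_label) plays the role of class_text_to_int:
    # it returns None for a name that is not in the list.
    print(label_ids(classes).get("seahorse"))
```

The order of the list fixes the ids, so new classes should be appended at the end once training data has been generated.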

2. Set up the pipeline config file. Find object_detection/samples/configs/ssd_mobilenet_v1_coco.config and copy it into the data folder. The modified file is as follows:

# SSD with Mobilenet v1 configuration for MSCOCO Dataset.
# Users should configure the fine_tune_checkpoint field in the train config as
# well as the label_map_path and input_path fields in the train_input_reader and
# eval_input_reader. Search for "PATH_TO_BE_CONFIGURED" to find the fields that
# should be configured.

model {
  ssd {
# Changed: total number of classes; it should match the number of items in labelmap.pbtxt
    num_classes: 1
    box_coder {
      faster_rcnn_box_coder {
        y_scale: 10.0
        x_scale: 10.0
        height_scale: 5.0
        width_scale: 5.0
      }
    }
    matcher {
      argmax_matcher {
        matched_threshold: 0.5
        unmatched_threshold: 0.5
        ignore_thresholds: false
        negatives_lower_than_unmatched: true
        force_match_for_each_row: true
      }
    }
    similarity_calculator {
      iou_similarity {
      }
    }
    anchor_generator {
      ssd_anchor_generator {
        num_layers: 6
        min_scale: 0.2
        max_scale: 0.95
        aspect_ratios: 1.0
        aspect_ratios: 2.0
        aspect_ratios: 0.5
        aspect_ratios: 3.0
        aspect_ratios: 0.3333
      }
    }
    image_resizer {
      fixed_shape_resizer {
        height: 300
        width: 300
      }
    }
    box_predictor {
      convolutional_box_predictor {
        min_depth: 0
        max_depth: 0
        num_layers_before_predictor: 0
        use_dropout: false
        dropout_keep_probability: 0.8
        kernel_size: 1
        box_code_size: 4
        apply_sigmoid_to_scores: false
        conv_hyperparams {
          activation: RELU_6,
          regularizer {
            l2_regularizer {
              weight: 0.00004
            }
          }
          initializer {
            truncated_normal_initializer {
              stddev: 0.03
              mean: 0.0
            }
          }
          batch_norm {
            train: true,
            scale: true,
            center: true,
            decay: 0.9997,
            epsilon: 0.001,
          }
        }
      }
    }
    feature_extractor {
      type: 'ssd_mobilenet_v1'
      min_depth: 16
      depth_multiplier: 1.0
      conv_hyperparams {
        activation: RELU_6,
        regularizer {
          l2_regularizer {
            weight: 0.00004
          }
        }
        initializer {
          truncated_normal_initializer {
            stddev: 0.03
            mean: 0.0
          }
        }
        batch_norm {
          train: true,
          scale: true,
          center: true,
          decay: 0.9997,
          epsilon: 0.001,
        }
      }
    }
    loss {
      classification_loss {
        weighted_sigmoid {
        }
      }
      localization_loss {
        weighted_smooth_l1 {
        }
      }
      hard_example_miner {
        num_hard_examples: 3000
        iou_threshold: 0.99
        loss_type: CLASSIFICATION
        max_negatives_per_positive: 3
        min_negatives_per_image: 0
      }
      classification_weight: 1.0
      localization_weight: 1.0
    }
    normalize_loss_by_num_matches: true
    post_processing {
      batch_non_max_suppression {
        score_threshold: 1e-8
        iou_threshold: 0.6
        max_detections_per_class: 100
        max_total_detections: 100
      }
      score_converter: SIGMOID
    }
  }
}

train_config: {
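# Changed: batch size. On the Nano, a batch size of 4 under the desktop GUI stalls and runs out of memory (OOM); 2 just barely works. Running without the GUI helps.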
  batch_size: 2
  optimizer {
    rms_prop_optimizer: {
      learning_rate: {
        exponential_decay_learning_rate {
# Changed: initial learning rate
          initial_learning_rate: 0.0001
          decay_steps: 800720
          decay_factor: 0.95
        }
      }
      momentum_optimizer_value: 0.9
      decay: 0.9
      epsilon: 1.0
    }
  }
# Changed: pretrained model checkpoint
  fine_tune_checkpoint: "ssd_model/ssd_mobilenet/model.ckpt"
  from_detection_checkpoint: true
  # Note: The below line limits the training process to 200K steps, which we
  # empirically found to be sufficient enough to train the pets dataset. This
  # effectively bypasses the learning rate schedule (the learning rate will
  # never decay). Remove the below line to train indefinitely.
# Changed: total number of training steps
  num_steps: 5000
  data_augmentation_options {
    random_horizontal_flip {
    }
  }
  data_augmentation_options {
    ssd_random_crop {
    }
  }
}

train_input_reader: {
  tf_record_input_reader {
# Changed: training data (the seahorse_train.tfrecord generated earlier)
    input_path: "data/seahorse_train.tfrecord"
  }
# Changed: label map path
  label_map_path: "data/labelmap.pbtxt"
}

eval_config: {
  num_examples: 8000
  # Note: The below line limits the evaluation process to 10 evaluations.
  # Remove the below line to evaluate indefinitely.
  max_evals: 10
}

eval_input_reader: {
  tf_record_input_reader {
# Changed: validation data
    input_path: "data/seahorse_validation.tfrecord"
  }
# Changed: label map path
  label_map_path: "data/labelmap.pbtxt"
  shuffle: false
  num_readers: 1
}
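A quick calculation shows what the learning rate settings above mean for a 5000-step run. The sketch uses the standard exponential decay formula, lr = initial * decay_factor ^ (step / decay_steps); whether the API truncates the exponent to a whole number (staircase) is an assumption here, so decayed_learning_rate (a hypothetical helper) supports both variants.

```python
def decayed_learning_rate(step, initial=0.0001, decay_steps=800720,
                          decay_factor=0.95, staircase=True):
    """Standard exponential decay; staircase truncates the exponent."""
    exponent = step // decay_steps if staircase else step / decay_steps
    return initial * decay_factor ** exponent


if __name__ == "__main__":
    # Values from the config above: 5000 steps, decay every 800720 steps.
    print(decayed_learning_rate(5000, staircase=True))
    print(decayed_learning_rate(5000, staircase=False))
    print(decayed_learning_rate(800720))
```

With decay_steps at 800720 and only 5000 training steps, the rate stays at or very near 0.0001 for the whole run in both variants, so the initial learning rate is the setting that matters here.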

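3. Download the pretrained model (if you use my uploaded files, it is already in object_detection/ssd_model/ssd_mobilenet)

Download ssd_mobilenet into the ssd_model/ directory, extract it, and rename the extracted folder to ssd_mobilenet: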

ssd_mobilenet: http://download.tensorflow.org/models/object_detection/ssd_mobilenet_v1_coco_11_06_2017.tar.gz

tar zxvf ssd_mobilenet_v1_coco_11_06_2017.tar.gz
mv ssd_mobilenet_v1_coco_11_06_2017 ssd_mobilenet

In ssd_mobilenet_v1_coco.config, set fine_tune_checkpoint to a path of the following form (already done in the config above):

fine_tune_checkpoint: "ssd_model/ssd_mobilenet/model.ckpt"

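Disable the desktop GUI only if training requires it (this depends on your platform; disable it if training will not run otherwise). PS: my Nano can just barely train with the GUI running.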

# Disable the graphical user interface on Ubuntu
sudo systemctl set-default multi-user.target
sudo reboot

# Re-enable the graphical user interface on Ubuntu
sudo systemctl set-default graphical.target
sudo reboot

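4. Start training. Depending on the version, train.py is either directly in the object_detection directory or in object_detection/legacy. Run the command from the object_detection directory, because the paths in the config are relative to it: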

python3 legacy/train.py --logtostderr --train_dir=training/ --pipeline_config_path=data/ssd_mobilenet_v1_coco.config

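5. After training finishes, run the export_inference_graph.py script (from the research directory) to freeze the trained model into a TensorFlow .pb model. Set trained_checkpoint_prefix to model.ckpt-[step], where step matches the number of training steps: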

python3 ./object_detection/export_inference_graph.py --input_type image_tensor --pipeline_config_path ./object_detection/data/ssd_mobilenet_v1_coco.config --trained_checkpoint_prefix ./object_detection/training/model.ckpt-5000 --output_directory ./object_detection/ssd_model/model/

The exported .pb model is written to the object_detection/ssd_model/model/ directory.
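If you are unsure which step number to put after model.ckpt-, it can be read from the file names in the training directory. This sketch assumes the usual TensorFlow 1.x checkpoint naming, where each saved step has a file such as model.ckpt-5000.index; latest_checkpoint_step is a hypothetical helper.

```python
import re


def latest_checkpoint_step(filenames):
    """Return the highest step among names like 'model.ckpt-5000.index'."""
    steps = []
    for name in filenames:
        match = re.match(r"model\.ckpt-(\d+)\.index$", name)
        if match:
            steps.append(int(match.group(1)))
    return max(steps) if steps else None


if __name__ == "__main__":
    import os

    training_dir = "object_detection/training"  # as used in the command above
    if os.path.isdir(training_dir):
        step = latest_checkpoint_step(os.listdir(training_dir))
        print("latest checkpoint step:", step)
```

The returned number is what goes after model.ckpt- in --trained_checkpoint_prefix; None means no checkpoint has been written yet.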

6. Test the model (create the file seahorse_ssd_detect.py in the object_detection directory):

import numpy as np
import os
import six.moves.urllib as urllib
import sys
import tarfile
import tensorflow as tf
import zipfile

from distutils.version import StrictVersion
from collections import defaultdict
from io import StringIO
from matplotlib import pyplot as plt
from PIL import Image

# This is needed since the notebook is stored in the object_detection folder.
sys.path.append("..")
from object_detection.utils import ops as utils_ops

import cv2

if StrictVersion(tf.__version__) < StrictVersion('1.9.0'):
  raise ImportError('Please upgrade your TensorFlow installation to v1.9.* or later!')



from utils import label_map_util

from utils import visualization_utils as vis_util


# matplotlib must be imported before its backend can be selected.
import matplotlib
matplotlib.use('TkAgg')



# Path to frozen detection graph. This is the actual model that is used for the object detection.
PATH_TO_FROZEN_GRAPH =  'ssd_model/model/frozen_inference_graph.pb'

# List of the strings that is used to add correct label for each box.
PATH_TO_LABELS = os.path.join('data', 'labelmap.pbtxt')

print(PATH_TO_LABELS)


# For the sake of simplicity we will use only 2 images:
# image1.jpg
# image2.jpg
# If you want to test the code with your images, just add path to the images to the TEST_IMAGE_PATHS.
PATH_TO_TEST_IMAGES_DIR = 'test_images'
TEST_IMAGE_PATHS = [ os.path.join(PATH_TO_TEST_IMAGES_DIR, 'image{}.jpg'.format(i)) for i in range(3, 7) ]

# Size, in inches, of the output images.
IMAGE_SIZE = (12, 8)

output_num = 1
# Folder (relative to the current directory) for saved result images.
output_img_dic = 'output_images'












detection_graph = tf.Graph()
with detection_graph.as_default():
  od_graph_def = tf.compat.v1.GraphDef()
  with tf.io.gfile.GFile(PATH_TO_FROZEN_GRAPH, 'rb') as fid:
    serialized_graph = fid.read()
    od_graph_def.ParseFromString(serialized_graph)
    tf.import_graph_def(od_graph_def, name='')

print("--\n")

category_index = label_map_util.create_category_index_from_labelmap(PATH_TO_LABELS, use_display_name=True)

print("--\n")

def load_image_into_numpy_array(image):
  (im_width, im_height) = image.size
  return np.array(image.getdata()).reshape(
      (im_height, im_width, 3)).astype(np.uint8)






def run_inference_for_single_image(image, graph):
  with graph.as_default():
    with tf.compat.v1.Session() as sess:
      # Get handles to input and output tensors
      ops = tf.compat.v1.get_default_graph().get_operations()
      all_tensor_names = {output.name for op in ops for output in op.outputs}
      tensor_dict = {}
      for key in [
          'num_detections', 'detection_boxes', 'detection_scores',
          'detection_classes', 'detection_masks'
      ]:
        tensor_name = key + ':0'
        if tensor_name in all_tensor_names:
          tensor_dict[key] = tf.compat.v1.get_default_graph().get_tensor_by_name(
              tensor_name)
      if 'detection_masks' in tensor_dict:
        # The following processing is only for single image
        detection_boxes = tf.squeeze(tensor_dict['detection_boxes'], [0])
        detection_masks = tf.squeeze(tensor_dict['detection_masks'], [0])
        # Reframe is required to translate mask from box coordinates to image coordinates and fit the image size.
        real_num_detection = tf.cast(tensor_dict['num_detections'][0], tf.int32)
        detection_boxes = tf.slice(detection_boxes, [0, 0], [real_num_detection, -1])
        detection_masks = tf.slice(detection_masks, [0, 0, 0], [real_num_detection, -1, -1])
        detection_masks_reframed = utils_ops.reframe_box_masks_to_image_masks(
            detection_masks, detection_boxes, image.shape[0], image.shape[1])
        detection_masks_reframed = tf.cast(
            tf.greater(detection_masks_reframed, 0.5), tf.uint8)
        # Follow the convention by adding back the batch dimension
        tensor_dict['detection_masks'] = tf.expand_dims(
            detection_masks_reframed, 0)
      image_tensor = tf.compat.v1.get_default_graph().get_tensor_by_name('image_tensor:0')

      # Run inference
      output_dict = sess.run(tensor_dict,
                             feed_dict={image_tensor: np.expand_dims(image, 0)})

      # all outputs are float32 numpy arrays, so convert types as appropriate
      output_dict['num_detections'] = int(output_dict['num_detections'][0])
      output_dict['detection_classes'] = output_dict[
          'detection_classes'][0].astype(np.uint8)
      output_dict['detection_boxes'] = output_dict['detection_boxes'][0]
      output_dict['detection_scores'] = output_dict['detection_scores'][0]
      if 'detection_masks' in output_dict:
        output_dict['detection_masks'] = output_dict['detection_masks'][0]
  return output_dict






def detect(imgfile):
    #origimg = cv2.imread(imgfile)
    image = Image.open(imgfile)

    image_np = load_image_into_numpy_array(image)
    # Expand dimensions since the model expects images to have shape: [1, None, None, 3]
    image_np_expanded = np.expand_dims(image_np, axis=0)
    # Actual detection.
    output_dict = run_inference_for_single_image(image_np, detection_graph)
    # Visualization of the results of a detection.
    vis_util.visualize_boxes_and_labels_on_image_array(
        image_np,
        output_dict['detection_boxes'],
        output_dict['detection_classes'],
        output_dict['detection_scores'],
        category_index,
        instance_masks=output_dict.get('detection_masks'),
        use_normalized_coordinates=True,
        line_thickness=8)
    # OpenCV expects BGR channel order, while PIL loads images as RGB.
    cv2.imshow("SSD", cv2.cvtColor(image_np, cv2.COLOR_RGB2BGR))

    k = cv2.waitKey(0) & 0xff
    # Exit if ESC pressed
    if k == 27 : return False
    return True

test_dir = "/home/nvidia/.local/lib/python3.6/site-packages/tensorflow/models/research/object_detection/seahorseImages"

for f in os.listdir(test_dir):
    if detect(test_dir + "/" + f) == False:
       break

  
#  if not os.path.exists(output_img_dic):
#      os.mkdir(output_img_dic)
#  output_img_path = os.path.join(output_img_dic,str(output_num)+".png")
#  plt.savefig(output_img_path)

Run the test (press any key for the next image, ESC to quit):

python3 seahorse_ssd_detect.py
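Besides drawing the boxes, you may want the detections as plain numbers, for example to log them. The sketch below converts the normalized boxes in output_dict to pixel coordinates and drops low-confidence detections. It relies on the Object Detection API returning each box as [ymin, xmin, ymax, xmax] in the 0-1 range; boxes_in_pixels is a hypothetical helper and the 0.5 score threshold is an assumed value, not one taken from the script.

```python
def boxes_in_pixels(boxes, scores, classes, image_width, image_height,
                    min_score=0.5):
    """Convert normalized [ymin, xmin, ymax, xmax] boxes to pixel rectangles.

    Detections scoring below min_score are skipped.
    """
    results = []
    for (ymin, xmin, ymax, xmax), score, class_id in zip(boxes, scores, classes):
        if score < min_score:
            continue
        results.append({
            "class_id": int(class_id),
            "score": float(score),
            "left": int(round(xmin * image_width)),
            "top": int(round(ymin * image_height)),
            "right": int(round(xmax * image_width)),
            "bottom": int(round(ymax * image_height)),
        })
    return results


if __name__ == "__main__":
    # Made-up values standing in for output_dict entries.
    demo = boxes_in_pixels(
        boxes=[[0.1, 0.2, 0.5, 0.6], [0.0, 0.0, 1.0, 1.0]],
        scores=[0.9, 0.3],
        classes=[1, 1],
        image_width=300,
        image_height=200,
    )
    print(demo)
```

Inside detect(), the arguments would be output_dict['detection_boxes'], output_dict['detection_scores'], and output_dict['detection_classes'], with the width and height taken from image.size.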

Original article: https://blog.csdn.net/ourkix/article/details/103778044
