In this article, I'll only cover the inference part, which takes advantage of Google's already-trained models. In a later article, I'll discuss how to train a new model on your own images.
Google releases many models implemented with TensorFlow under the Apache license. Among them, the Object Detection API is an open-source framework that makes it easy to build, train, and distribute models that recognize objects in photos. Object detection is a very active and rapidly evolving field; at the time of writing, Google has released 19 pre-trained object detection models, and more will be implemented and released over time.
Using this API, you can easily implement real-time object recognition using a webcam as shown below. The picture below is taken from here.
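To give a rough feel for this, here is a minimal sketch (my own illustration, not code from the API or from this article's script) of what such a webcam loop could look like with OpenCV. It assumes detection_graph, category_index, vis_util, and run_inference_for_single_image as defined in the object_detection.py script shown later in this article, and that OpenCV (cv2) is installed.

import cv2
import numpy as np

# Sketch of a webcam detection loop; assumes detection_graph, category_index,
# vis_util and run_inference_for_single_image from the script later in this article.
cap = cv2.VideoCapture(0)  # open the default webcam
while cap.isOpened():
    ret, frame = cap.read()  # frame is a BGR numpy array
    if not ret:
        break
    rgb = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)  # the model expects RGB
    # add the batch dimension the model expects: [1, height, width, 3]
    output_dict = run_inference_for_single_image(np.expand_dims(rgb, 0), detection_graph)
    vis_util.visualize_boxes_and_labels_on_image_array(
        rgb,
        output_dict['detection_boxes'],
        output_dict['detection_classes'],
        output_dict['detection_scores'],
        category_index,
        use_normalized_coordinates=True,
        line_thickness=8)
    cv2.imshow('detections', cv2.cvtColor(rgb, cv2.COLOR_RGB2BGR))  # back to BGR for display
    if cv2.waitKey(1) & 0xFF == ord('q'):  # press q to quit
        break
cap.release()
cv2.destroyAllWindows()

Note that run_inference_for_single_image as written opens a new tf.Session per call; for true real-time speed you would create the session once, outside the loop.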
Prerequisites
You must install libfreetype6 before installing Pillow.
- apt-get install python3-dev libfreetype6-dev
- pip3 install Cython contextlib2 pillow lxml matplotlib
Install Object Detection Models from GitHub
Clone the TensorFlow object detection models repository from GitHub.

cd /usr/local/src
git clone https://github.com/tensorflow/models
cd /usr/local/src/models/research
python3 setup.py build
python3 setup.py install
protoc object_detection/protos/*.proto --python_out=.
pip3 install .
#Add Python Path
#if you want to add this to ~/.bash_profile, use the line below instead (with the full path instead of `pwd`)
# export PYTHONPATH=$PYTHONPATH:/usr/local/src/models/research:/usr/local/src/models/research/slim
export PYTHONPATH=$PYTHONPATH:`pwd`:`pwd`/slim
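A quick way to verify the path is set correctly is to try the imports the API needs. This is a minimal sanity check, assuming the clone location used above:

# Sanity check: these imports only succeed if PYTHONPATH (or sys.path)
# covers both the research and research/slim directories cloned above.
import sys
sys.path.append('/usr/local/src/models/research')
sys.path.append('/usr/local/src/models/research/slim')

from object_detection.utils import label_map_util  # object detection API
import nets  # slim networks package
print('object_detection and slim are importable')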
COCO API? You can skip ahead. The official documentation describes the installation of the COCO API, but you can omit it here: it is only needed later, in the training process, when you build your own model.
Check the installation.
If you see the final OK message, the installation was successful.

root@JetsonNano:/usr/local/src/models/research# python3 object_detection/builders/model_builder_test.py
2019-12-01 23:14:53.708855: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcudart.so.10.0
WARNING:tensorflow:
The TensorFlow contrib module will not be included in TensorFlow 2.0.
For more information, please see:
* https://github.com/tensorflow/community/blob/master/rfcs/20180907-contrib-sunset.md
* https://github.com/tensorflow/addons
* https://github.com/tensorflow/io (for I/O related ops)
If you depend on functionality not listed there, please file an issue.
WARNING:tensorflow:From /usr/local/src/models/research/slim/nets/inception_resnet_v2.py:374: The name tf.GraphKeys is deprecated. Please use tf.compat.v1.GraphKeys instead.
WARNING:tensorflow:From /usr/local/src/models/research/slim/nets/mobilenet/mobilenet.py:397: The name tf.nn.avg_pool is deprecated. Please use tf.nn.avg_pool2d instead.
Running tests under Python 3.6.9: /usr/bin/python3
[ RUN ] ModelBuilderTest.test_create_experimental_model
[ OK ] ModelBuilderTest.test_create_experimental_model
[ RUN ] ModelBuilderTest.test_create_faster_rcnn_model_from_config_with_example_miner
[ OK ] ModelBuilderTest.test_create_faster_rcnn_model_from_config_with_example_miner
[ RUN ] ModelBuilderTest.test_create_faster_rcnn_models_from_config_faster_rcnn_with_matmul
[ OK ] ModelBuilderTest.test_create_faster_rcnn_models_from_config_faster_rcnn_with_matmul
[ RUN ] ModelBuilderTest.test_create_faster_rcnn_models_from_config_faster_rcnn_without_matmul
[ OK ] ModelBuilderTest.test_create_faster_rcnn_models_from_config_faster_rcnn_without_matmul
[ RUN ] ModelBuilderTest.test_create_faster_rcnn_models_from_config_mask_rcnn_with_matmul
[ OK ] ModelBuilderTest.test_create_faster_rcnn_models_from_config_mask_rcnn_with_matmul
[ RUN ] ModelBuilderTest.test_create_faster_rcnn_models_from_config_mask_rcnn_without_matmul
[ OK ] ModelBuilderTest.test_create_faster_rcnn_models_from_config_mask_rcnn_without_matmul
[ RUN ] ModelBuilderTest.test_create_rfcn_model_from_config
[ OK ] ModelBuilderTest.test_create_rfcn_model_from_config
[ RUN ] ModelBuilderTest.test_create_ssd_fpn_model_from_config
[ OK ] ModelBuilderTest.test_create_ssd_fpn_model_from_config
[ RUN ] ModelBuilderTest.test_create_ssd_models_from_config
[ OK ] ModelBuilderTest.test_create_ssd_models_from_config
[ RUN ] ModelBuilderTest.test_invalid_faster_rcnn_batchnorm_update
[ OK ] ModelBuilderTest.test_invalid_faster_rcnn_batchnorm_update
[ RUN ] ModelBuilderTest.test_invalid_first_stage_nms_iou_threshold
[ OK ] ModelBuilderTest.test_invalid_first_stage_nms_iou_threshold
[ RUN ] ModelBuilderTest.test_invalid_model_config_proto
[ OK ] ModelBuilderTest.test_invalid_model_config_proto
[ RUN ] ModelBuilderTest.test_invalid_second_stage_batch_size
[ OK ] ModelBuilderTest.test_invalid_second_stage_batch_size
[ RUN ] ModelBuilderTest.test_session
[ SKIPPED ] ModelBuilderTest.test_session
[ RUN ] ModelBuilderTest.test_unknown_faster_rcnn_feature_extractor
[ OK ] ModelBuilderTest.test_unknown_faster_rcnn_feature_extractor
[ RUN ] ModelBuilderTest.test_unknown_meta_architecture
[ OK ] ModelBuilderTest.test_unknown_meta_architecture
[ RUN ] ModelBuilderTest.test_unknown_ssd_feature_extractor
[ OK ] ModelBuilderTest.test_unknown_ssd_feature_extractor
----------------------------------------------------------------------
Ran 17 tests in 0.821s

OK (skipped=1)
Pretrained Object Detection Models We Can Use
You can see these charts at https://github.com/tensorflow/models/blob/master/research/object_detection/g3doc/detection_model_zoo.md

COCO-trained models
Note: If you download the tar.gz file of a quantized model and un-tar it, you will get a different set of files: a checkpoint, a config file, and tflite frozen graphs (txt/binary).
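If you want to see what a downloaded archive actually contains before extracting it, a short sketch like this works with the same tarfile module used in the script later in this article (the file name here is just an example):

import tarfile

# List the members of a downloaded model archive; quantized models ship a
# checkpoint, a config file and tflite graphs, while non-quantized ones
# include frozen_inference_graph.pb among other files.
with tarfile.open('ssd_mobilenet_v1_coco_2018_01_28.tar.gz') as tar:
    for member in tar.getmembers():
        print(member.name)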
Kitti-trained models
Model name | Speed (ms) | Pascal mAP@0.5 | Outputs |
---|---|---|---|
faster_rcnn_resnet101_kitti | | | |
Open Images-trained models
Model name | Speed (ms) | Open Images mAP@0.5[^2] | Outputs |
---|---|---|---|
faster_rcnn_inception_resnet_v2_atrous_oidv2 | 727 | 37 | Boxes |
faster_rcnn_inception_resnet_v2_atrous_lowproposals_oidv2 | 347 | | Boxes |
facessd_mobilenet_v2_quantized_open_image_v4 [^3] | 20 | 73 (faces) | Boxes |
Model name | Speed (ms) | Open Images mAP@0.5[^4] | Outputs |
---|---|---|---|
faster_rcnn_inception_resnet_v2_atrous_oidv4 | 425 | 54 | Boxes |
ssd_mobilenetv2_oidv4 | 89 | 36 | Boxes |
ssd_resnet_101_fpn_oidv4 | | | |
iNaturalist Species-trained models
Model name | Speed (ms) | Pascal mAP@0.5 | Outputs |
---|---|---|---|
faster_rcnn_resnet101_fgvc | 395 | 58 | Boxes |
faster_rcnn_resnet50_fgvc | | | |
AVA v2.1 trained models
Model name | Speed (ms) | Pascal mAP@0.5 | Outputs |
---|---|---|---|
faster_rcnn_resnet101_ava_v2.1 | | | |
Mobile models
Model name | Pixel 1 Latency (ms) | COCO mAP | Outputs |
---|---|---|---|
ssd_mobilenet_v3_large_coco | 119 | 22.3 | Boxes |
ssd_mobilenet_v3_small_coco | | | |
Let's Try
The repo you just downloaded contains an example, object_detection_tutorial.ipynb, that you can test with a Jupyter notebook. However, this code has been adapted for TensorFlow 2.0 and does not currently work on the Jetson Nano with TensorFlow 1.X. So I modified the object_detection_tutorial.ipynb code into object_detection.py, which works with TensorFlow 1.14.
import numpy as np
import os
import six.moves.urllib as urllib
import sys
import time
import tarfile
import tensorflow as tf
import zipfile

from distutils.version import StrictVersion
from collections import defaultdict
from io import StringIO
from matplotlib import pyplot as plt
from PIL import Image

from object_detection.utils import ops as utils_ops

if StrictVersion(tf.__version__) < StrictVersion('1.12.0'):
    raise ImportError('Please upgrade your TensorFlow installation to v1.12.*.')

from object_detection.utils import label_map_util
from object_detection.utils import visualization_utils as vis_util

# What model to download.
MODEL_NAME = 'ssd_mobilenet_v1_coco_2018_01_28'
#MODEL_NAME = 'ssd_inception_v2_coco_2018_01_28'
#MODEL_NAME = 'ssd_resnet50_v1_fpn_shared_box_predictor_640x640_coco14_sync_2018_07_03'
#MODEL_NAME = 'ssdlite_mobilenet_v2_coco_2018_05_09'
MODEL_FILE = MODEL_NAME + '.tar.gz'
DOWNLOAD_BASE = 'http://download.tensorflow.org/models/object_detection/'

# Path to frozen detection graph. This is the actual model that is used for the object detection.
PATH_TO_FROZEN_GRAPH = './object_detection/model/' + MODEL_NAME + '/frozen_inference_graph.pb'
PATH_TO_MODEL = './object_detection/model/' + MODEL_FILE

# List of the strings that is used to add correct label for each box.
#PATH_TO_LABELS = os.path.join('data', 'mscoco_label_map.pbtxt')
PATH_TO_LABELS = './object_detection/data/mscoco_label_map.pbtxt'

# Download the model tarball and extract only the frozen inference graph.
opener = urllib.request.URLopener()
#opener.retrieve(DOWNLOAD_BASE + MODEL_FILE, './object_detection/models/saved_mode/' + MODEL_FILE)
opener.retrieve(DOWNLOAD_BASE + MODEL_FILE, PATH_TO_MODEL)
tar_file = tarfile.open(PATH_TO_MODEL)
for file in tar_file.getmembers():
    file_name = os.path.basename(file.name)
    if 'frozen_inference_graph.pb' in file_name:
        tar_file.extract(file, './object_detection/model/')

# Load the frozen TensorFlow graph into memory.
detection_graph = tf.Graph()
with detection_graph.as_default():
    od_graph_def = tf.GraphDef()
    with tf.gfile.GFile(PATH_TO_FROZEN_GRAPH, 'rb') as fid:
        serialized_graph = fid.read()
        od_graph_def.ParseFromString(serialized_graph)
        tf.import_graph_def(od_graph_def, name='')

# Map the model's numeric class ids to human-readable labels.
category_index = label_map_util.create_category_index_from_labelmap(PATH_TO_LABELS, use_display_name=True)

def load_image_into_numpy_array(image):
    (im_width, im_height) = image.size
    return np.array(image.getdata()).reshape(
        (im_height, im_width, 3)).astype(np.uint8)

# For the sake of simplicity we will use only 3 images:
#   image1.jpg, image2.jpg, image3.jpg
# If you want to test the code with your images, just add paths to the images to TEST_IMAGE_PATHS.
PATH_TO_TEST_IMAGES_DIR = './object_detection/test_images'
TEST_IMAGE_PATHS = [os.path.join(PATH_TO_TEST_IMAGES_DIR, 'image{}.jpg'.format(i)) for i in range(1, 4)]

# Size, in inches, of the output images.
IMAGE_SIZE = (12, 8)

def run_inference_for_single_image(image, graph):
    with graph.as_default():
        with tf.Session() as sess:
            # Get handles to input and output tensors
            ops = tf.get_default_graph().get_operations()
            all_tensor_names = {output.name for op in ops for output in op.outputs}
            tensor_dict = {}
            for key in [
                'num_detections', 'detection_boxes', 'detection_scores',
                'detection_classes', 'detection_masks'
            ]:
                tensor_name = key + ':0'
                if tensor_name in all_tensor_names:
                    tensor_dict[key] = tf.get_default_graph().get_tensor_by_name(tensor_name)
            if 'detection_masks' in tensor_dict:
                # The following processing is only for single image
                detection_boxes = tf.squeeze(tensor_dict['detection_boxes'], [0])
                detection_masks = tf.squeeze(tensor_dict['detection_masks'], [0])
                # Reframe is required to translate mask from box coordinates to image coordinates and fit the image size.
                real_num_detection = tf.cast(tensor_dict['num_detections'][0], tf.int32)
                detection_boxes = tf.slice(detection_boxes, [0, 0], [real_num_detection, -1])
                detection_masks = tf.slice(detection_masks, [0, 0, 0], [real_num_detection, -1, -1])
                detection_masks_reframed = utils_ops.reframe_box_masks_to_image_masks(
                    detection_masks, detection_boxes, image.shape[1], image.shape[2])
                detection_masks_reframed = tf.cast(
                    tf.greater(detection_masks_reframed, 0.5), tf.uint8)
                # Follow the convention by adding back the batch dimension
                tensor_dict['detection_masks'] = tf.expand_dims(detection_masks_reframed, 0)
            image_tensor = tf.get_default_graph().get_tensor_by_name('image_tensor:0')

            # Run inference
            output_dict = sess.run(tensor_dict, feed_dict={image_tensor: image})

            # all outputs are float32 numpy arrays, so convert types as appropriate
            output_dict['num_detections'] = int(output_dict['num_detections'][0])
            output_dict['detection_classes'] = output_dict['detection_classes'][0].astype(np.int64)
            output_dict['detection_boxes'] = output_dict['detection_boxes'][0]
            output_dict['detection_scores'] = output_dict['detection_scores'][0]
            if 'detection_masks' in output_dict:
                output_dict['detection_masks'] = output_dict['detection_masks'][0]
    return output_dict

for image_path in TEST_IMAGE_PATHS:
    print('===== Image open:%s =====' % (image_path))
    im = Image.open(image_path)
    width, height = im.size
    #image = im.resize((int(width / 2), int(height / 2)))
    image = im.copy()
    t = time.time()
    # the array based representation of the image will be used later in order to prepare the
    # result image with boxes and labels on it.
    image_np = load_image_into_numpy_array(image)
    # Expand dimensions since the model expects images to have shape: [1, None, None, 3]
    image_np_expanded = np.expand_dims(image_np, axis=0)
    # Actual detection.
    output_dict = run_inference_for_single_image(image_np_expanded, detection_graph)
    elapsed = time.time() - t
    # Visualization of the results of a detection.
    vis_util.visualize_boxes_and_labels_on_image_array(
        image_np,
        output_dict['detection_boxes'],
        output_dict['detection_classes'],
        output_dict['detection_scores'],
        category_index,
        instance_masks=output_dict.get('detection_masks'),
        use_normalized_coordinates=True,
        line_thickness=8)
    fig = plt.figure(figsize=IMAGE_SIZE)
    txt = 'FPS:%f' % (1.0 / elapsed)
    plt.text(10, 10, txt, fontsize=12)
    plt.imshow(image_np)
    name = os.path.splitext(image_path)[0]
    name = name + '_result.png'
    plt.savefig(name)
<research/object_detection.py>
Run the above code as follows:
cd /usr/local/src/models/research
mkdir object_detection/model
python3 object_detection.py
Be careful: run the code from the research directory, or modify the directory paths in the Python code.
You will find the <image name>_result.png files at "/usr/local/src/models/research/object_detection/test_images/".
The detection scores for the images above are low compared to the picture on the project page. This is because a different network model is used: the source code above uses ssd_mobilenet_v1_coco_2018_01_28 as the network model. If you change the model name, the detection scores will change as well.
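For example, to try one of the other models already listed (commented out) in the source, only the MODEL_NAME line needs to change; the download URL and file paths are all derived from it:

# Swap in a different pretrained model from the zoo; MODEL_FILE and the
# paths below are all built from MODEL_NAME, so nothing else changes.
MODEL_NAME = 'ssd_inception_v2_coco_2018_01_28'
MODEL_FILE = MODEL_NAME + '.tar.gz'
PATH_TO_FROZEN_GRAPH = './object_detection/model/' + MODEL_NAME + '/frozen_inference_graph.pb'
PATH_TO_MODEL = './object_detection/model/' + MODEL_FILE

Generally, heavier models (for example the Faster R-CNN variants in the tables above) produce higher detection scores, at the cost of much slower inference on the Jetson Nano.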