Thursday, 7 February 2019

Training a custom object detector using TensorFlow object detection API

After failing to train an object detector on custom data with DetectNet on the NVIDIA DIGITS platform, I tried my luck with the TensorFlow Object Detection API. I think I successfully trained a MobileNet model with it.

In this post I will try to explain what I did and which errors I ran into along the way.

Data Collection : Instead of taking photos, I wrote a script to grab frames from a video.

It takes the video file path and the number of frames to generate. The frames grabbed from the video look like this:
And yes, I used brinjal (is that what it is called?) images for testing.
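
A minimal sketch of what such a frame-grabbing script could look like, assuming OpenCV (the cv2 package) is installed. The function names, the output folder name and the file naming pattern are made up for illustration; the idea is simply to pick evenly spaced frame positions and save each one as a JPEG.

```python
import os


def frame_indices(total_frames, num_frames):
    """Return up to num_frames evenly spaced frame indices in [0, total_frames)."""
    if total_frames <= 0 or num_frames <= 0:
        return []
    num_frames = min(num_frames, total_frames)
    step = total_frames / num_frames
    return [int(i * step) for i in range(num_frames)]


def grab_frames(video_path, num_frames, out_dir="frames"):
    """Save num_frames evenly spaced frames from video_path as JPEG files."""
    import cv2  # OpenCV; imported here so frame_indices has no dependencies

    os.makedirs(out_dir, exist_ok=True)
    cap = cv2.VideoCapture(video_path)
    total = int(cap.get(cv2.CAP_PROP_FRAME_COUNT))
    saved = []
    for n, idx in enumerate(frame_indices(total, num_frames)):
        cap.set(cv2.CAP_PROP_POS_FRAMES, idx)  # seek to the chosen frame
        ok, frame = cap.read()
        if not ok:
            continue  # skip frames that could not be decoded
        path = os.path.join(out_dir, "frame_{:04d}.jpg".format(n))
        cv2.imwrite(path, frame)
        saved.append(path)
    cap.release()
    return saved
```

Calling grab_frames("brinjal.mp4", 50) (a hypothetical file name) would write up to 50 images into the frames folder and return their paths.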

labelImg is used to label the images. I installed it from the zip instead of pip (labelImg installed with pip never worked for me).

Installation : You can follow the official docs to install both TensorFlow and the models repository.
I found it much easier than installing DIGITS.

Once you have downloaded the models, open Jupyter and run the object detection example. It will take a bit of time to run, since it has to download the MobileNet model trained on the COCO dataset.

Note :
For some weird reason I have to enter

# From tensorflow/models/research/
export PYTHONPATH=$PYTHONPATH:`pwd`:`pwd`/slim
in the terminal, even though I added the research and slim paths to my .bashrc file.
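
If the export keeps getting lost, one possible workaround is to add the same two folders from inside Python (for example in the first notebook cell) before importing object_detection. This is only a sketch: add_research_paths is a made-up helper, and the path in the usage comment is a hypothetical location for the cloned models repository.

```python
import os
import sys


def add_research_paths(research_dir):
    """Append research_dir and research_dir/slim to sys.path if they are missing."""
    added = []
    for path in (research_dir, os.path.join(research_dir, "slim")):
        path = os.path.abspath(path)
        if path not in sys.path:
            sys.path.append(path)
            added.append(path)
    return added


# Hypothetical location of the cloned repository; adjust to your setup.
# add_research_paths(os.path.expanduser("~/tensorflow/models/research"))
```

Unlike the export line, this only affects the running Python process, so it has to be executed in every script or notebook that needs it.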


Custom model Installation :
  1. Convert the XML annotation files to CSV files
  2. Generate .record files from the CSV files
  3. Make the .pbtxt label map file, and don't put a comma ',' between fields
  4. Download the model and its .config file
  5. Edit the .config file to modify the following
    • number of classes
    • pretrained model path
    • test labels path, with images
    • train labels path, with images
  6. Then copy the train.py file to train your model
     When I tried to train on my custom data using the model specified in the docs (ssd_inception_v2), I kept getting the following warning : WARNING:root:Variable [FeatureExtractor/InceptionV2/Conv2d_1a_7x7/BatchNorm/beta/ExponentialMovingAverage] is not available in checkpoint .

I tried to find the problem using the following :
But I didn't get it working. I then tried the model and config file from this tutorial :
https://pythonprogramming.net/training-custom-objects-tensorflow-object-detection-api-tutorial/ with ssd_mobilenet and its config file, and for some unknown reason, IT WORKED.
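
Steps 1 and 3 above can be sketched roughly as follows. labelImg saves annotations as Pascal VOC XML, so the first two functions read the box coordinates from each file and write them to one CSV. The column names are an assumption and must match whatever the script that generates the .record files expects. label_map_text is a made-up helper that shows the .pbtxt layout: ids start at 1 and there is no comma between fields.

```python
import csv
import glob
import os
import xml.etree.ElementTree as ET

# Assumed column layout; keep it in sync with your .record generator.
COLUMNS = ["filename", "width", "height", "class", "xmin", "ymin", "xmax", "ymax"]


def xml_to_rows(xml_path):
    """Return one row per labelled box in a Pascal VOC XML file."""
    root = ET.parse(xml_path).getroot()
    filename = root.findtext("filename")
    width = int(root.findtext("size/width"))
    height = int(root.findtext("size/height"))
    rows = []
    for obj in root.findall("object"):
        box = obj.find("bndbox")
        rows.append([
            filename, width, height, obj.findtext("name"),
            int(box.findtext("xmin")), int(box.findtext("ymin")),
            int(box.findtext("xmax")), int(box.findtext("ymax")),
        ])
    return rows


def xml_dir_to_csv(xml_dir, csv_path):
    """Write all boxes found in xml_dir/*.xml to a single CSV file."""
    rows = []
    for xml_path in sorted(glob.glob(os.path.join(xml_dir, "*.xml"))):
        rows.extend(xml_to_rows(xml_path))
    with open(csv_path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(COLUMNS)
        writer.writerows(rows)
    return len(rows)


def label_map_text(class_names):
    """Build label map text: ids start at 1, no commas between fields."""
    items = []
    for i, name in enumerate(class_names, start=1):
        items.append("item {\n  id: %d\n  name: '%s'\n}\n" % (i, name))
    return "\n".join(items)
```

For a single class, label_map_text(["brinjal"]) gives one item block with id 1, which can be saved as the .pbtxt file.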

7. Evaluating the model
     Copy the eval.py file from the legacy folder and run the following command

python3 eval.py --logtostderr --pipeline_config_path=/home/ic/Documents/objectExtraction/workspace/training_demo/training/inception_v2.config --checkpoint_dir=/home/ic/Documents/objectExtraction/workspace/training_demo/training --eval_dir=/home/ic/Documents/objectExtraction/workspace/training_demo/eval

You need to specify the number of test images in the .config file

eval_config: {
  num_examples: 22
}
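
To avoid counting by hand, the value for num_examples can be read off the test image folder. A small sketch, assuming all test images sit directly in one folder and use the listed extensions; count_images and the folder name in the usage note are made up for illustration.

```python
import os

IMAGE_EXTENSIONS = (".jpg", ".jpeg", ".png")


def count_images(image_dir):
    """Count image files directly inside image_dir (no recursion)."""
    return sum(
        1
        for name in os.listdir(image_dir)
        if name.lower().endswith(IMAGE_EXTENSIONS)
        and os.path.isfile(os.path.join(image_dir, name))
    )
```

For example, count_images("images/test") would return the number to put after num_examples.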


You can check the eval output with


tensorboard --logdir=eval

Check the Images tab, where you can find the output.


If you get the following error :
NameError: name 'unicode' is not defined in object_detection/utils/object_detection_evaluation.py

try replacing unicode with str in object_detection_evaluation.py, as suggested in https://github.com/tensorflow/models/issues/5203
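
The reason behind this error is that unicode is a Python 2 built-in that no longer exists in Python 3, where str is already the text type. The snippet below is not the code from object_detection_evaluation.py; it is just a generic illustration of a check that works on both versions, with text_type and is_text as made-up names.

```python
try:
    text_type = unicode  # Python 2: a separate built-in type for text
except NameError:
    text_type = str  # Python 3: str is already the text type


def is_text(value):
    """Return True if value is a text string on the running Python version."""
    return isinstance(value, text_type)
```

Since I run everything with python3, simply swapping unicode for str in the file has the same effect.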


8. Exporting the model
     I used the following command to export the inference graph
python3 export_inference_graph.py --input_type image_tensor --pipeline_config_path training/inception_v2.config --trained_checkpoint_prefix training/model.ckpt-688 --output_directory trained-inference-graphs/output_inference_graph_v1

When I ran it, I got the warning

114 ops no flops stats due to incomplete shapes. Parsing Inputs... Incomplete shape.


But as stated here, we can ignore it and use the model.


9. Using the trained model
   To use the trained model, modify the following lines :
Specify the label map .pbtxt file used
PATH_TO_LABELS = os.path.join('data', 'label_map.pbtxt')
and the exported model
MODEL_NAME = 'output_inference_graph_v1'

Then add a few more images to the test_images folder and change the for loop range
accordingly
TEST_IMAGE_PATHS = [ os.path.join(PATH_TO_TEST_IMAGES_DIR, 'image{}.jpg'.format(i)) for i in range(1, 6) ]
Then run the file.
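
As an alternative to editing the range every time the number of test images changes, the paths can be collected with glob. This is a sketch, not what the example notebook does; list_test_images is a made-up helper and it assumes the test images are .jpg files.

```python
import glob
import os


def list_test_images(image_dir, pattern="*.jpg"):
    """Return the sorted paths of all files in image_dir matching pattern."""
    return sorted(glob.glob(os.path.join(image_dir, pattern)))
```

In the notebook this would become TEST_IMAGE_PATHS = list_test_images(PATH_TO_TEST_IMAGES_DIR), and the images no longer need to be named image1.jpg, image2.jpg and so on.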

And the results I get are :


And for some weird reason, this:
And like always, I don't even know why the third image came out like that; maybe I have to train with more varied images.

Are you trying the same and stuck? Or do you have any suggestions / solutions for the errors I faced (using the Inception model as specified in the official docs)? Let me know in the comments.


