Thursday, 7 February 2019

Training a custom object detector using the TensorFlow Object Detection API

After I failed to train an object detector on custom data using DetectNet on the NVIDIA DIGITS platform, I tried my luck with the TensorFlow Object Detection API. I think I successfully trained a MobileNet model with it.

In this post I will try to explain what I did and which errors I ran into along the way.

 Data Collection : Instead of taking photos, I wrote a script to grab frames from a video.

It takes a video file path and the number of frames to generate. The frames grabbed from the video look like this
And yes, I used brinjal (is that what it is called?) images for testing.
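A frame grabber of this kind could look roughly like the sketch below. It is not the script used for this post; it assumes OpenCV (cv2) is installed, and the function names, the "frames" output folder and the file naming are made up for illustration.

```python
import os


def frame_indices(total_frames, num_frames):
    """Return evenly spaced frame indices to sample from a video."""
    if total_frames <= 0 or num_frames <= 0:
        return []
    num_frames = min(num_frames, total_frames)
    step = total_frames / num_frames
    return [int(i * step) for i in range(num_frames)]


def grab_frames(video_path, num_frames, out_dir="frames"):
    """Save num_frames evenly spaced frames from video_path as JPEG files."""
    import cv2  # imported here so frame_indices() works without OpenCV

    os.makedirs(out_dir, exist_ok=True)
    cap = cv2.VideoCapture(video_path)
    total = int(cap.get(cv2.CAP_PROP_FRAME_COUNT))
    saved = 0
    for idx in frame_indices(total, num_frames):
        cap.set(cv2.CAP_PROP_POS_FRAMES, idx)  # jump to the wanted frame
        ok, frame = cap.read()
        if not ok:
            continue  # skip frames that could not be decoded
        name = "frame_{:04d}.jpg".format(saved)
        cv2.imwrite(os.path.join(out_dir, name), frame)
        saved += 1
    cap.release()
    return saved
```

Calling grab_frames with a video path and a frame count should fill the output folder with images that are ready for labelling.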

labelImg is used to label the images. I installed it from the zip instead of with pip (labelImg with pip never worked for me).

Installation : You can follow the official docs to install both TensorFlow and the models repository.
I found it much easier than the DIGITS installation.

Once you have downloaded models, open Jupyter and run the object detection example. It will take a bit of time to run, since it has to download the MobileNet model trained on the COCO dataset.

Note :
For some weird reason I have to enter

# From tensorflow/models/research/
export PYTHONPATH=$PYTHONPATH:`pwd`:`pwd`/slim
 in the terminal, even though I added the research and slim paths to my .bashrc file.
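If the export keeps getting lost, one possible workaround is to add the two folders from inside Python before importing object_detection. This is only a sketch: add_research_paths is a made-up helper, the path you pass in is whatever your own models/research folder is, and it only affects the current Python process, not other terminals.

```python
import os
import sys


def add_research_paths(research_dir):
    """Put research_dir and research_dir/slim on sys.path if they are missing."""
    research_dir = os.path.abspath(research_dir)
    added = []
    for path in (research_dir, os.path.join(research_dir, "slim")):
        if path not in sys.path:
            sys.path.insert(0, path)
            added.append(path)
    return added
```

Call it at the top of a notebook or script, before any object_detection import, with the location of tensorflow/models/research on your machine.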


Custom model Installation :
  1. Convert the XML annotations to CSV files
  2. Generate .record files from the CSV files
  3. Make the .pbtxt file, and don't include a comma ','
  4. Download the model and its .config file
  5. Edit the .config file to modify the following:
    • number of classes
    • pretrained model path
    • test labels path, with images
    • train labels path, with images
  6. Then copy the train.py file to train your model
              When I tried to train on my custom data using the model specified in the docs (ssd_inception_v2), I kept getting the following message: WARNING:root:Variable [FeatureExtractor/InceptionV2/Conv2d_1a_7x7/BatchNorm/beta/ExponentialMovingAverage] is not available in checkpoint.

I tried to find the problem using the following:
 but couldn't get it working. I then tried the model and config file from this tutorial:
https://pythonprogramming.net/training-custom-objects-tensorflow-object-detection-api-tutorial/ with ssd_mobilenet and its config file and, for some unknown reason, IT WORKED.
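Step 1 (XML to CSV) can be sketched with only the standard library. This is not the exact script from the tutorial; the column names below are an assumption based on the layout that common CSV-to-TFRecord scripts expect, so adjust them to whatever your .record generator reads.

```python
import csv
import glob
import os
import xml.etree.ElementTree as ET

# Assumed column layout; change it to match your TFRecord script.
COLUMNS = ["filename", "width", "height", "class",
           "xmin", "ymin", "xmax", "ymax"]


def xml_to_rows(xml_path):
    """Return one row per <object> in a labelImg (PASCAL VOC) XML file."""
    root = ET.parse(xml_path).getroot()
    filename = root.findtext("filename")
    width = int(root.findtext("size/width"))
    height = int(root.findtext("size/height"))
    rows = []
    for obj in root.findall("object"):
        box = obj.find("bndbox")
        rows.append([
            filename, width, height, obj.findtext("name"),
            int(box.findtext("xmin")), int(box.findtext("ymin")),
            int(box.findtext("xmax")), int(box.findtext("ymax")),
        ])
    return rows


def xml_dir_to_csv(xml_dir, csv_path):
    """Write all annotations found in xml_dir to a single CSV file."""
    rows = []
    for xml_path in sorted(glob.glob(os.path.join(xml_dir, "*.xml"))):
        rows.extend(xml_to_rows(xml_path))
    with open(csv_path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(COLUMNS)
        writer.writerows(rows)
    return len(rows)
```

Run xml_dir_to_csv once for the train folder and once for the test folder to get the two CSV files that step 2 needs.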

7. Evaluating the model
              Copy the eval.py file from the legacy folder and run the following command

python3 eval.py --logtostderr --pipeline_config_path=/home/ic/Documents/objectExtraction/workspace/training_demo/training/inception_v2.config --checkpoint_dir=/home/ic/Documents/objectExtraction/workspace/training_demo/training --eval_dir=/home/ic/Documents/objectExtraction/workspace/training_demo/eval

You need to specify the number of test images in the .config file

eval_config: {
  num_examples: 22
}
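To avoid typing that number by hand, you could count the test images first. A small sketch, assuming all test images sit in one folder and use the extensions listed below (both are assumptions, not something the API requires):

```python
import os

# Assumed image extensions; extend the tuple if you use other formats.
IMAGE_EXTENSIONS = (".jpg", ".jpeg", ".png")


def count_images(image_dir):
    """Count the image files directly inside image_dir."""
    return sum(
        1 for name in os.listdir(image_dir)
        if name.lower().endswith(IMAGE_EXTENSIONS)
    )
```

The returned count is the value to put in num_examples.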


You can check the eval output with


                             tensorboard --logdir=eval

 Check the Images tab, where you can find the output


If you get the following error:
NameError: name 'unicode' is not defined in object_detection/utils/object_detection_evaluation.py 

Try replacing unicode with str in object_detection_evaluation.py, as described in https://github.com/tensorflow/models/issues/5203
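The reason is that Python 3 has no unicode built-in; its str type already holds Unicode text. The snippet below is not the code from object_detection_evaluation.py, just a generic sketch of a shim that lets the same check run on both Python 2 and Python 3 (is_text is a made-up helper).

```python
try:
    unicode  # exists as a built-in on Python 2 only
except NameError:
    unicode = str  # Python 3: str is already Unicode text


def is_text(value):
    """Return True if value is a text string on Python 2 or Python 3."""
    return isinstance(value, (str, unicode))
```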


8. Exporting the model 
           I used the following command to export the inference graph 
python3 export_inference_graph.py --input_type image_tensor --pipeline_config_path training/inception_v2.config --trained_checkpoint_prefix training/model.ckpt-688 --output_directory trained-inference-graphs/output_inference_graph_v1

When I ran it, I got the warning 

114 ops no flops stats due to incomplete shapes. Parsing Inputs... Incomplete shape.


But as stated here, we can ignore it and use the model.


9. Using the trained model
   To use the trained model, modify the following lines:
Specify the label map .pbtxt file used 
PATH_TO_LABELS = os.path.join('data', 'label_map.pbtxt')
 and the exported model
MODEL_NAME = 'output_inference_graph_v1'
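For reference, the label map that PATH_TO_LABELS points to is a plain text file with one item block per class and no commas. A small sketch that builds that text; the helper name is made up, and it assumes ids should start at 1, which is how the label maps shipped with the API are numbered.

```python
def label_map_text(class_names):
    """Build label map text with one item block per class, ids from 1."""
    entries = []
    for idx, name in enumerate(class_names, start=1):
        entries.append(
            "item {{\n  id: {}\n  name: '{}'\n}}\n".format(idx, name)
        )
    return "\n".join(entries)
```

Writing the result of label_map_text(['brinjal']) to a file gives a one-class label map.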

Then add a few more images to the test_images folder and change the for loop range
accordingly
TEST_IMAGE_PATHS = [ os.path.join(PATH_TO_TEST_IMAGES_DIR, 'image{}.jpg'.format(i)) for i in range(1, 6) ]
 then run the file.
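Instead of editing the range every time, the list could be built from whatever is in the folder. A sketch using glob, where list_test_images is a made-up helper and the "*.jpg" pattern is an assumption about how the images are named:

```python
import glob
import os


def list_test_images(image_dir, pattern="*.jpg"):
    """Return the sorted paths of all files in image_dir matching pattern."""
    return sorted(glob.glob(os.path.join(image_dir, pattern)))
```

With this, TEST_IMAGE_PATHS = list_test_images(PATH_TO_TEST_IMAGES_DIR) picks up new images without touching the loop.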

And the results I get are:


And for some weird reason, this
And like always, I don't even know why the third image came out like that; maybe I have to train with more varied images.

Are you trying the same and stuck? Or do you have any suggestions / solutions for the errors I faced (using the Inception model as specified in the official docs)? Let me know in the comments.



Sunday, 3 February 2019

NVIDIA DIGITS (failed) object detection: DetectNet with custom data

Let me start with this:
I know nothing about ML, DL, AI, or those big buzzwords you hear a couple of times a day. I have barely been scratching at those huge mountains for the past 2 months and haven't managed even that so far. So I just know the names and nothing else, but with a lot of optimism I'm trying to use a few software and programming tools to train a pre-trained network on custom data using transfer learning.

In this blog post I will try to explain how I tried, and miserably failed, to train an object detection model on custom data with NVIDIA DIGITS using DetectNet.

 The whole process is divided into three steps:
  1. Installing and setting up DIGITS on the system
  2. Collecting and preparing the data
  3. Training the model
 1. Installing and setting up DIGITS on the system
        There is an exhaustive guide on how to set up NVIDIA DIGITS on the host system or in the cloud: https://github.com/dusty-nv/jetson-inference

Follow it line by line. If you are lucky you can set it up and test it in two days, as stated in the documents, but it took me almost an entire week to set it up.

Possible Pitfalls :

 2. Collecting and preparing the data
     I'm trying to detect the (lemon) leaves in the following image

So I went to a nearby field and collected around 100 images like it using my phone. I labelled them using labelImg. That's quite laborious work for 100 images, so think about what it takes when you need a couple of thousand training images.

labelImg gives us annotations as XML files in PASCAL VOC format, like this


<object>
        <name>leaf</name>
        <pose>Unspecified</pose>
        <truncated>0</truncated>
        <difficult>0</difficult>
        <bndbox>
            <xmin>2451</xmin>
            <ymin>142</ymin>
            <xmax>2798</xmax>
            <ymax>986</ymax>
        </bndbox>
 </object>
 <object>
        <name>leaf</name>
        <pose>Unspecified</pose>
        <truncated>0</truncated>
        <difficult>0</difficult>
        <bndbox>
            <xmin>637</xmin>
            <ymin>1025</ymin>



But DIGITS needs data in KITTI format, which is a TXT file with specific fields, so I wrote Python code for converting labelImg XML files to KITTI format. You can find it in my git repo.
To use my code, copy all your images into a directory and specify it in SRC_DIR, and it will do the rest. The code creates a directory named 'labels' and saves the generated files there.
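The core of such a conversion can be sketched as below. This is not the script from the repo. It assumes the usual 15-field KITTI label line (type, truncated, occluded, alpha, four bounding-box values, then 3D dimensions, location and rotation) and writes zeros for everything except the class name and the box, since as far as I understand DetectNet only reads those. The function names are made up.

```python
import xml.etree.ElementTree as ET


def kitti_line(name, xmin, ymin, xmax, ymax):
    """Format one object as a 15-field KITTI label line."""
    # type truncated occluded alpha left top right bottom h w l x y z ry
    return (
        "{} 0.0 0 0.0 {:.2f} {:.2f} {:.2f} {:.2f} "
        "0.0 0.0 0.0 0.0 0.0 0.0 0.0"
    ).format(name, float(xmin), float(ymin), float(xmax), float(ymax))


def voc_xml_to_kitti(xml_text):
    """Convert the text of one labelImg XML file to KITTI label text."""
    root = ET.fromstring(xml_text)
    lines = []
    for obj in root.iter("object"):
        box = obj.find("bndbox")
        coords = [box.findtext(tag)
                  for tag in ("xmin", "ymin", "xmax", "ymax")]
        lines.append(kitti_line(obj.findtext("name"), *coords))
    return "\n".join(lines)
```

Each image gets one .txt file with the same base name, holding one such line per labelled object.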

After that you need to divide the data into train and validation sets as specified in the doc. You can use another script of mine to do that.
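The split itself is simple to sketch. The helper below is not the script mentioned above; the 20% validation share and the fixed seed are arbitrary choices for illustration.

```python
import random


def split_train_val(names, val_fraction=0.2, seed=0):
    """Shuffle names reproducibly and split them into (train, val) lists."""
    names = sorted(names)
    random.Random(seed).shuffle(names)
    n_val = int(len(names) * val_fraction)
    return names[n_val:], names[:n_val]
```

Split on the base file names and then move each image together with its label file, so the pairs stay in the same set.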