In my last posts ([here](http://flovv.github.io/Logo_detection_deep_learning/) and here), I described how to detect logos in images with R. The first results were promising, with a classification accuracy of ~50%. In this post I will detail how to use transfer learning (building on a pre-trained network) to further improve the classification accuracy.
The dataset is a combination of the Flickr27 dataset, with 270 images across 27 classes, and self-scraped images from Google image search. In case you want to reproduce the analysis, you can download the set here.
Unlike in the previous post, this time I wanted to use pre-trained image models to see how they perform on the task of identifying brand logos in images.
First, I tried to adapt the official example on the Keras-rstudio website.
Here is a copy of the instructions:
It is basically a three-step process: 1) load an existing model and add some layers, 2) train the extended model on your own data, 3) set more layers trainable and fine-tune the model on your own data.
I have to admit that I struggled to make it work on my data. That’s why I will try to detail my approach. Hopefully it helps you to get better started with transfer learning.
Here is the full example that I got to work:
The key change to the RStudio sample code is to use a different pre-trained model. I use VGG16 instead of Inception_v3, as the latter would not work. (The Inception model would not pick up any information, and accuracy remained around the base rate.)
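For reference, the base rate is the accuracy you get by always guessing the most frequent class. Here is a minimal sketch in base R, assuming a hypothetical balanced label vector with 10 images for each of the 27 classes. The real combined dataset is probably less balanced, so its base rate may differ.

```r
# Base rate: accuracy obtained by always predicting the most frequent label.
base_rate <- function(labels) {
  max(table(labels)) / length(labels)
}

# Hypothetical balanced label vector: 27 classes with 10 images each.
labels <- rep(seq_len(27), each = 10)
base_rate(labels)  # 1/27, i.e. roughly 0.037
```

A model that stays at this level has learned nothing beyond the class frequencies.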
But let’s see what the code does step by step.
In section 1 the image data is prepared and loaded. One important setting when loading the images is class_mode = "categorical", as we have 27 different image classes/labels to assign. In section 2 we load the pre-trained image model.
In section 3 we add custom layers. (In this case I copied the last layers from the previous post.) It is important to set the number of units in the last layer to the number of labels (27) and its activation function to softmax.
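Sections 1 and 2 could look roughly like the sketch below. This is not the original code from this post. The helper names `make_generators` and `load_base`, the folder layout (one sub-folder per class), the 224x224 image size, the rescaling to [0, 1] and the 20% validation split are all assumptions for illustration.

```r
library(keras)

# Hypothetical layout: one sub-folder per logo class inside data_dir.
make_generators <- function(data_dir, img_size = c(224, 224), batch_size = 32) {
  # Rescale pixel values to [0, 1] and hold out 20% of the images for validation.
  datagen <- image_data_generator(rescale = 1 / 255, validation_split = 0.2)

  train <- flow_images_from_directory(
    data_dir, generator = datagen,
    target_size = img_size, batch_size = batch_size,
    class_mode = "categorical",  # one-hot labels, one column per class
    subset = "training"
  )
  validation <- flow_images_from_directory(
    data_dir, generator = datagen,
    target_size = img_size, batch_size = batch_size,
    class_mode = "categorical",
    subset = "validation"
  )
  list(train = train, validation = validation)
}

# VGG16 convolutional base with ImageNet weights, without the 1000-class top.
load_base <- function(img_size = c(224, 224)) {
  application_vgg16(
    weights = "imagenet", include_top = FALSE,
    input_shape = c(img_size, 3)
  )
}

# Usage (the path is a placeholder):
# gens <- make_generators("path/to/logos")
# base_model <- load_base()
```

Setting `include_top = FALSE` drops the original ImageNet classifier so that a new output layer for the 27 logo classes can be attached.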
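A minimal sketch of such a head, following the pattern of the RStudio example. The helper name `add_head`, the pooling layer, the hidden layer size (256) and the dropout rate are placeholders, not the layers from the previous post.

```r
library(keras)

# Attach a small classification head to a pre-trained convolutional base.
add_head <- function(base_model, n_classes = 27) {
  predictions <- base_model$output %>%
    layer_global_average_pooling_2d() %>%
    layer_dense(units = 256, activation = "relu") %>%
    layer_dropout(rate = 0.5) %>%
    # One unit per label; softmax turns the outputs into class probabilities.
    layer_dense(units = n_classes, activation = "softmax")

  keras_model(inputs = base_model$input, outputs = predictions)
}

# Usage, assuming base_model is a model loaded with include_top = FALSE:
# model <- add_head(base_model, n_classes = 27)
```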
In section 4 we set the layers of the loaded image model to non-trainable.
Section 5 sets the optimiser and finally trains the model. If the training process does not show improvements in terms of decreasing loss, try to increase the learning rate.
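Sections 4 and 5 can be sketched as one helper. The assumptions here are the function name `train_head`, RMSprop as optimiser, the default learning rate, and `gens` being a list with a `train` and a `validation` image generator. None of these are confirmed by the original code.

```r
library(keras)

train_head <- function(model, base_model, gens, epochs = 50, learning_rate = 1e-4) {
  # Freeze the pre-trained layers so that only the new head is updated.
  # This has to happen before compile() to take effect.
  freeze_weights(base_model)

  model %>% compile(
    # Older keras releases name this argument `lr`.
    optimizer = optimizer_rmsprop(learning_rate = learning_rate),
    loss = "categorical_crossentropy",
    metrics = "accuracy"
  )

  # One pass over each generator per epoch.
  train_steps <- as.integer(ceiling(gens$train$n / gens$train$batch_size))
  val_steps <- as.integer(ceiling(gens$validation$n / gens$validation$batch_size))

  # Returns the training history. Newer keras releases also accept
  # generators in fit() directly.
  model %>% fit_generator(
    gens$train,
    steps_per_epoch = train_steps,
    epochs = epochs,
    validation_data = gens$validation,
    validation_steps = val_steps
  )
}

# Usage, assuming model, base_model and gens already exist:
# hist <- train_head(model, base_model, gens, epochs = 50)
```

Exposing `learning_rate` as an argument makes it easy to rerun the training with a different value when the loss does not move.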
The learning process is documented in the hist-object, which can be easily plotted.
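A small sketch of how the history can be inspected, assuming `hist` is the object returned by the training call. The helper name `final_metrics` is made up for this example.

```r
library(keras)

# Final-epoch metrics from a training history, i.e. the object returned by
# fit() or fit_generator().
final_metrics <- function(hist) {
  df <- as.data.frame(hist)  # long format with columns epoch, value, metric, data
  df[df$epoch == max(df$epoch), ]
}

# Usage, assuming `hist` exists from a previous training run:
# plot(hist)           # loss and accuracy curves for training and validation
# final_metrics(hist)
```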
After 50 training epochs the accuracy is at 55% on the training set and 35% on the validation set.
I assume that the accuracy can be further improved by training the full model, or at least by setting more layers trainable and fine-tuning the model, as detailed in the RStudio example. However, so far I have tried various parameter settings without success. There is basically no improvement on the established 50% yet.
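For completeness, here is what the fine-tuning step could look like in the style of the RStudio example. It is a sketch only and not a configuration that has been shown to improve the results on this dataset. The helper name `fine_tune`, the choice to unfreeze from `block5_conv1` onwards, SGD with momentum and the small learning rate are all assumptions, as is `gens` being a list with a `train` and a `validation` generator.

```r
library(keras)

fine_tune <- function(model, base_model, gens, epochs = 20, learning_rate = 1e-5) {
  # Make the last convolutional block of VGG16 trainable again;
  # "block5_conv1" is a layer name used by application_vgg16().
  unfreeze_weights(base_model, from = "block5_conv1")

  # Recompile so that the change in trainable layers takes effect.
  model %>% compile(
    # Older keras releases name this argument `lr`.
    optimizer = optimizer_sgd(learning_rate = learning_rate, momentum = 0.9),
    loss = "categorical_crossentropy",
    metrics = "accuracy"
  )

  train_steps <- as.integer(ceiling(gens$train$n / gens$train$batch_size))
  val_steps <- as.integer(ceiling(gens$validation$n / gens$validation$batch_size))

  model %>% fit_generator(
    gens$train,
    steps_per_epoch = train_steps,
    epochs = epochs,
    validation_data = gens$validation,
    validation_steps = val_steps
  )
}

# Usage, after the head has been trained with the base frozen:
# hist_ft <- fine_tune(model, base_model, gens)
```

The small learning rate is a common choice for this step, since large updates can destroy the pre-trained filters.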
I will update the post, as soon as I have figured it out.