From Image Recognition to Brand Logo Detection

I previously did a short review on Microsoft’s image recognition and face detection API. A couple of weeks ago Google announced their vision API providing some similar features. Even though there is no R package or code to dive into this API and their API documentation is rather sparse, I thought it could be fun and inspiring to give it a try.

In general, it works as Microsoft’s API, provide an image, select which kind of analysis you like and receive a (well) coded response.

To give you an idea how that looks like for “face detection”. Let’s use the same Arnold Schwarzenegger photo.

Face detection example

For the provided image, one receives a table with the following columns: “boundingPoly” “fdBoundingPoly” “landmarks” “rollAngle” “panAngle” “tiltAngle” “detectionConfidence” “landmarkingConfidence” “joyLikelihood” “sorrowLikelihood” “angerLikelihood” “surpriseLikelihood” “underExposedLikelihood” “blurredLikelihood” “headwearLikelihood”.

with 2 rows. One for Arnold, one for his wife. The following sub-setted table shows the results:

tiltAngle detectionConfidence landmarkingConfidence joyLikelihood sorrowLikelihood
-12,861,863 0,99996805 0,73490918 VERY_LIKELY VERY_UNLIKELY
-0,25818413 0,99998611 0,76625621 VERY_UNLIKELY VERY_UNLIKELY

In comparison to Microsoft’s API: Not very impressive. So let’s try something else: The API also provides access to a functionality called logo detection.

brand recognition example Providing the image above, with parameter of 40 results yields the following response:

description score
Walmart 0.50977039
Coca Cola Shoes 0.48768377
Sainsburys 0.47962409
IKEA 0.45845419
Kellogg’s 0.454154
Disney 0.44845602
Guardian Co Uk 0.42800492
Nintendo 0.41539443
Heinz 0.41503713

Interesting! The results show that some brand logos are correctly detected. However most logos go unrecognized, even their own brands Google and the YouTube are not returned. I tried some other other images with different brands and the results are mixed at best. As a quick note: Google’s Vision API is not on par with human recognition.

Let’s finally test their OCR capabilities by providing the same image of brand logos.

As a result the API returns:

“Tube, Sainsbury’s, Royal Mail, Colgate 4, You, HEINZ BBC, VISA, PEPSI, MARKS, SPENCER, Vodafone, Dove, amazon YAHOO!, twitter, Nintendo, WIKIPEDIA, ISNEp r BlackBerry, Google IKKEA, C2, facebook, Oxfam, BTe, ER the, market, com, dyson, Microsoft, compare, TESCO John Lewis, Walmart, Save money. Live better., AMSUN, orange, CHANEL, SONY, guardian, SkV, MasterCard, BARCLAYS, “

That looks pretty impressive to me. Even though all brands use their own typo and colors, most brand names are well returned.

In case you want to try it yourself, please see the commented R-code. Compared to Microsoft, you need to provide billing information even though the first 1000 API calls are supposed to be free…

Written on March 28, 2016