Members of our developer community often ask us to share useful tips and tricks. Some of the most interesting questions we regularly encounter are related to image recognition and AR performance.
For example, "How can I improve recognition by eliminating elements that are not there in real life, or that change over time?" Think of plants that grow or shed their leaves, people in the background, or the liquid inside a bottle whose label you want recognized.
Why is it important for image recognition to eliminate unnecessary elements?
As we recently discussed in our “7 Tips to Choose the Best Images for Image Recognition” blog post, Catchoom's recognition is proven to be more accurate than the market average, even in difficult conditions. However, using good reference images is crucial to boost your image recognition performance and keep your users happy.
One of the most important steps is to make sure that your reference images don’t contain unnecessary background elements that won’t be there, or won’t look the same, in real life.
Imagine yourself looking at this scene with the painted door on the street. Your eyes are focusing on the painting, but you cannot help but look at the wall, the plants and flowers, too. Your brain gets slightly distracted by those elements.
Then you come back in a few months. The plants may have lost their leaves by then, and the flowers may have grown or been removed completely. Your brain has to work to realize that it’s the same object, the same door that you are seeing, even if you are not aware of this mental process.
Mobile image recognition works in a slightly similar way.
If you want to enable your mobile app to recognize that same painted door in different conditions, you can make its ‘job’ easier instead of putting extra burden on its capacity.
By eliminating distracting elements that are not important for the experience – in this case, for instance, a street art tour app – from your reference images, you make the recognition faster and more accurate by focusing only on what’s important.
But how to give your reference images the right ‘treatment’?
Why is it better to crop the image instead of using ‘masks’?
You might be tempted to start ‘masking’ or photoshopping your heart out to get rid of those unnecessary details in the background.
However, as a rule of thumb, cropping the image is actually a better solution. See what we mean by using the example of the same painted door.
Masking the unnecessary or ‘changing’ elements of the background is not a very good idea, because a large part of the image area is ‘wasted’ and the white squares may still make matching with the real scene difficult.
In general, the more image area is devoted to the unique pattern, the better for the recognition.
Cropping the photo so that the reference image only contains the part that is relevant for the recognition is a better option.
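As a minimal sketch of the idea, the snippet below represents an image as a plain Python list of pixel rows and crops it down to the region containing the unique pattern. (This toy representation and the `crop` helper are illustrative assumptions; a real pipeline would use an image library such as Pillow, but the principle – keep only the distinctive region as your reference image – is the same.)

```python
def crop(image, left, top, right, bottom):
    """Return the rectangular region [left, right) x [top, bottom)
    of an image stored as a list of pixel rows."""
    return [row[left:right] for row in image[top:bottom]]

# A toy 4x6 "scene": 0 = changeable background (plants, people),
# 1 = the unique pattern you want recognized (e.g. the painted door).
scene = [
    [0, 0, 0, 0, 0, 0],
    [0, 1, 1, 1, 0, 0],
    [0, 1, 1, 1, 0, 0],
    [0, 0, 0, 0, 0, 0],
]

# Crop the reference image to just the pattern, discarding the background.
reference = crop(scene, 1, 1, 4, 3)
print(reference)  # [[1, 1, 1], [1, 1, 1]]
```

In the cropped reference, every pixel belongs to the unique pattern, so the recognition engine spends no capacity on background that may look different when your users point their camera at the real scene.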
What about recognition of other types of objects, such as a photo in a magazine?
In this article, we mainly addressed the recognition of everyday objects and scenes, such as a painting in a museum, a street art work, or a unique storefront.
Similarly, if you want to recognize an object such as a bottle, cropping the reference image so that it only includes the area with the unique pattern - typically, the label - is a better option than using a photo of the whole bottle.
Pretty much the same logic applies to the recognition of photos or print images, too. If you want to scan a photo in a magazine or a graphic illustration in a textbook, it is better to use a cropped photo file or a selected part of the page than the whole page as the reference image.
Remember, great recognition performance leads to happy users and happy developers.
See for yourself by trying Image Recognition with CraftAR:
Photo credit: Travellingclaus.com