Google opens up AI tech behind “Portrait Mode” to developers as open source

Portrait Mode has been simultaneously one of the biggest jokes and coolest advancements in smartphone camera technology. Google’s version of it can be found on the Pixel 2 and Pixel 2 XL smartphones. And the company has just released the latest version of the underlying technology as open source, available to any developer who can make use of it.

It’s detailed in “Semantic Image Segmentation with DeepLab in TensorFlow” on the Google Research blog. And reading how it works is quite interesting, even if you have no idea how to actually do it. Semantic image segmentation is basically the process of assigning every pixel in an image a label, such as “road”, “sky”, “person” or “dog”. It allows apps to figure out what to keep sharp and what to blur.
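To make the “keep sharp versus blur” idea a little more concrete, here is a minimal sketch in Python with NumPy. It assumes a segmentation model has already produced a label for every pixel. The label ids, the function names and the toy image are all made up for illustration, and a simple box blur stands in for whatever lens-style blur a real camera app applies. It is not Google’s code, just one way the masking step could look.

```python
import numpy as np

# Hypothetical label ids for this sketch; a real model defines its own class list.
BACKGROUND, PERSON = 0, 1


def box_blur(image: np.ndarray, radius: int = 1) -> np.ndarray:
    """Blur an H x W x C image by averaging each pixel with its neighbours."""
    size = 2 * radius + 1
    # Repeat the edge pixels so the output keeps the input's height and width.
    padded = np.pad(
        image.astype(np.float64),
        ((radius, radius), (radius, radius), (0, 0)),
        mode="edge",
    )
    height, width = image.shape[:2]
    total = np.zeros(image.shape, dtype=np.float64)
    # Sum every shifted copy of the image inside the blur window.
    for dy in range(size):
        for dx in range(size):
            total += padded[dy:dy + height, dx:dx + width]
    return total / (size * size)


def portrait_effect(image, label_map, keep_label=PERSON, radius=1):
    """Keep pixels labelled keep_label sharp and blur everything else."""
    blurred = box_blur(image, radius)
    # H x W x 1 boolean mask that broadcasts across the colour channels.
    keep = (label_map == keep_label)[..., np.newaxis]
    return np.where(keep, image.astype(np.float64), blurred)


# Toy 4 x 4 RGB "photo" with a 2 x 2 "person" in the middle.
image = np.arange(48, dtype=np.float64).reshape(4, 4, 3)
label_map = np.zeros((4, 4), dtype=int)
label_map[1:3, 1:3] = PERSON

result = portrait_effect(image, label_map)
```

In this toy case the four “person” pixels come through untouched, while every other pixel is replaced by the average of its neighbourhood. A real pipeline would get `label_map` from a model such as DeepLab rather than writing it by hand.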

DeepLab-v3+ is the new version, and it’s implemented in the TensorFlow machine learning library. Its models are built on top of a powerful convolutional neural network (CNN) for accurate results, and are intended for server-side deployment. Google is also sharing its TensorFlow training and evaluation code, along with pre-trained models.

They say that the software has come a long way since the first incarnation of DeepLab three years ago. It features improved CNN feature extractors, better object scale modelling, assimilation of contextual information, and improved training procedures.

“Modern semantic image segmentation systems built on top of convolutional neural networks (CNNs) have reached accuracy levels that were hard to imagine even five years ago, thanks to advances in methods, hardware, and datasets. We hope that publicly sharing our system with the community will make it easier for other groups in academia and industry to reproduce and further improve upon state-of-art systems, train models on new datasets, and envision new applications for this technology.”

While I do find the topic absolutely fascinating, I’ve no desire to actually play with the code myself. I just have no practical use for it. Not even the curiosity to experiment with it. And I know if I did, it would become a huge time suck figuring it all out.

But if you do, or you want to find out more, head on over to the Google Research Blog and check it out for yourself.

[via DPReview]

from -Hacking Photography, One Picture At A Time