You know how, in movies, the detective standing next to The Guy In The Chair looks at an image on the computer and says “can you zoom in and enhance that license plate?” That technology is now real, thanks to Google’s newest AI engines.
The process is extremely complex and tough to master: the engines are based on diffusion models (and some seriously advanced math) and work by adding details to an image that weren’t originally there. This is done by guesswork using similar images, a technique Google dubbed natural image synthesis or, in this instance, image super-resolution.
Obviously, you start off with a small and pixelated image (like the images on the left side of each of the image sets above) and end up with a much higher resolution picture that not only looks sharper but appears real to the human eye, even if it isn’t a 100% exact match to the original. To get the job done, Google used two new AI tools: Super-Resolution via Repeated Refinement (SR3) and Cascaded Diffusion Models (CDM).
The first, SR3, adds noise to an image (this looks similar to the static or snow you see on a TV screen when the signal is weak), then reverses the process. It uses a large database of images and a series of probability calculations to work out what a higher-resolution version of a low-resolution image should look like, a process Google researcher Chitwan Saharia has described in more depth.
“Diffusion models work by corrupting the training data by progressively adding Gaussian noise, slowly wiping out details in the data until it becomes pure noise, and then training a neural network to reverse this corruption process,” explained Saharia.
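The "corruption" half of what Saharia describes can be sketched in a few lines of Python. This is only a toy illustration, not Google's code: the four-pixel "image", the linear noise schedule in `BETAS`, and the function names are all assumptions made for the example, and the hard part (training a neural network to run the process backward) is left out entirely.

```python
import math
import random


def forward_diffusion(pixels, betas, rng):
    """Return every intermediate image as noise is added step by step.

    Each step applies x_t = sqrt(1 - beta) * x_{t-1} + sqrt(beta) * noise,
    where noise is drawn from a standard Gaussian.
    """
    current = list(pixels)
    trajectory = [list(current)]
    for beta in betas:
        keep = math.sqrt(1.0 - beta)   # how much of the old image survives
        spread = math.sqrt(beta)       # how much fresh noise is mixed in
        current = [keep * x + spread * rng.gauss(0.0, 1.0) for x in current]
        trajectory.append(current)
    return trajectory


def signal_fraction(betas):
    """Fraction of the original pixel amplitude left after all steps."""
    return math.sqrt(math.prod(1.0 - b for b in betas))


# Hypothetical linear noise schedule, chosen for illustration only.
STEPS = 1000
BETAS = [1e-4 + (0.02 - 1e-4) * t / (STEPS - 1) for t in range(STEPS)]

if __name__ == "__main__":
    rng = random.Random(0)
    image = [0.9, -0.5, 0.1, 0.7]  # toy "image": four pixel values in [-1, 1]
    noisy = forward_diffusion(image, BETAS, rng)
    print("signal left after all steps:", signal_fraction(BETAS))
    print("final pixels:", noisy[-1])
```

With this schedule, less than 1% of the original signal is left at the end, which is the "pure noise" Saharia refers to. A real model would then be trained to undo those steps one at a time.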
The second tool, CDM, uses “pipelines” that the various diffusion models (including SR3) can be chained through to produce the high-res upgrades. This tool builds progressively larger images by passing the output of one model to the next, using carefully calculated simulations based on advanced probabilities, which Google published a research paper on.
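To picture the "pipeline" idea, here is a rough Python sketch of a cascade where each stage hands a bigger image to the next. It is an assumption-heavy stand-in: `upsample_nearest` just repeats pixels and is not a diffusion model, and the stage factors in `STAGES` (2x, then 4x) are made up for the example rather than taken from Google's paper.

```python
def upsample_nearest(image, factor):
    """Placeholder for a super-resolution stage.

    Repeats every pixel `factor` times horizontally and vertically.
    A real cascade would run a trained diffusion model here instead.
    """
    out = []
    for row in image:
        wide = [pixel for pixel in row for _ in range(factor)]
        out.extend(list(wide) for _ in range(factor))
    return out


def run_cascade(image, stages):
    """Pass the image through each stage in order and record the sizes."""
    sizes = [(len(image), len(image[0]))]
    for stage in stages:
        image = stage(image)  # output of one stage is input to the next
        sizes.append((len(image), len(image[0])))
    return image, sizes


# Hypothetical cascade: a 2x stage followed by a 4x stage.
STAGES = [
    lambda img: upsample_nearest(img, 2),
    lambda img: upsample_nearest(img, 4),
]

if __name__ == "__main__":
    tiny = [[1, 2], [3, 4]]
    result, sizes = run_cascade(tiny, STAGES)
    print(sizes)
```

The point of the structure is that no single model has to jump from tiny to huge in one go; each stage only handles one step up in resolution.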
The end result? When researchers presented the finalized images to people in a test, the generated faces were mistaken for real faces roughly half of the time. While a 50% rate may not sound successful, it’s in line with what we could expect from a perfect algorithm, since people who can’t tell real from generated are effectively guessing. Google says this method produces better results than other image enhancement options, including generative adversarial networks that use competing neural networks to refine an image.
Google says it intends to do more with these AI engines and their related technologies beyond image upscaling, in other areas of probability modeling. And while this “zoom and enhance” technology will make it easy to do things like upscale old photos, it has undeniably concerning potential, too, like, well, zooming in and enhancing a photo or a license plate or anything else.
via Science Alert