Using neural networks to “upscale” old films
The famous short silent film L’arrivée d’un train en gare de La Ciotat, produced by Auguste and Louis Lumière in 1896, hit the news this week. AI developer Denis Shiryaev used a combination of Gigapixel AI and depth-aware video frame interpolation (DAIN) to “upscale” the film to 4K resolution at 60 frames per second.
You can watch the upscaled video here:
Here is the original version for comparison:
There were two parts to creating this upscaled video. First, the enhancement to 4K resolution. The algorithm used for this is based on neural networks and was trained on millions of photographs. Through this training, the network learned how best to enlarge images while enhancing them and synthesizing natural-looking detail.
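To see what the learned approach improves on, it helps to look at the naive baseline: classical upscaling simply copies or blends existing pixels, so it cannot invent detail the way a trained network can. Below is a minimal nearest-neighbour upscaler in plain Python (a simplified illustration, not Gigapixel AI's actual model, which is proprietary):

```python
def upscale_nearest(img, factor):
    """Enlarge a 2D image (list of rows) by an integer factor.

    Each output pixel simply copies the nearest source pixel, so the
    result is blocky. A learned super-resolution network instead
    predicts plausible high-frequency detail for the new pixels.
    """
    rows, cols = len(img), len(img[0])
    return [
        [img[r // factor][c // factor] for c in range(cols * factor)]
        for r in range(rows * factor)
    ]


# A 2x2 "image" doubled to 4x4: every pixel becomes a 2x2 block.
print(upscale_nearest([[1, 2], [3, 4]], 2))
```

Each source pixel becomes a `factor`×`factor` block of identical values, which is exactly the blockiness that neural super-resolution replaces with synthesized texture.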
In addition to the enhanced resolution, Shiryaev utilised DAIN to increase the frame rate. This video frame interpolation method was developed by Wenbo Bao and colleagues (Shanghai Jiao Tong University; University of California, Merced; and Google) and synthesizes new frames between the original ones. In their arXiv article from April 2019, they propose a novel depth-aware video frame interpolation algorithm that explicitly detects occlusion (when one object in a 3D scene blocks another from view) using depth information. They developed a depth-aware flow projection layer that synthesizes intermediate flows which preferentially sample closer objects rather than those further away. Their algorithm also learns hierarchical features to gather contextual information from neighbouring pixels. The model then warps the input frames, depth maps, and contextual features within an adaptive warping layer. Finally, a frame synthesis network generates the output frame using residual learning.
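The core idea of the depth-aware flow projection step can be sketched in a few lines. The toy function below works on a 1D row of pixels and uses a hard "keep the closest pixel" rule where DAIN actually computes a depth-weighted average; the function name, the hard-minimum rule, and the 1D setting are all simplifications for illustration, not the paper's implementation:

```python
def project_flow(flow, depth, t=0.5):
    """Project per-pixel flow (frame 0 -> frame 1) to intermediate time t.

    Each source pixel x lands near x + t*flow[x] in the intermediate
    frame. When several source pixels project onto the same target
    pixel (an occlusion), keep the flow of the one with the smallest
    depth, i.e. the object closest to the camera wins.
    """
    n = len(flow)
    inter_flow = [None] * n                 # None = no flow projected here
    inter_depth = [float("inf")] * n        # depth of current winner
    for x in range(n):
        target = round(x + t * flow[x])
        if 0 <= target < n and depth[x] < inter_depth[target]:
            inter_depth[target] = depth[x]
            inter_flow[target] = flow[x]
    return inter_flow


# Pixel 0 (depth 1.0, moving 2px) and pixel 1 (depth 5.0, static) both
# project onto pixel 1 at t=0.5; the closer pixel's flow is kept.
print(project_flow([2, 0, 0, 0], [1.0, 5.0, 5.0, 5.0]))
```

The `None` entries mark intermediate pixels no source pixel reached; in the full method those holes are filled from neighbouring flows, and the projected flows then drive the adaptive warping layer described above.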