IBC 2018: Convergence and deep learning

By David Cox

In the 20 years I’ve been traveling to IBC, I’ve tried to seek out new technology, work practices and trends that could benefit my clients and help them be more competitive. One thing that is perennially exciting about this industry is the rapid pace of change. Certainly, from a post production point of view, there is a mini revolution every three years or so. In the past, those revolutions have increased image quality or the efficiency of making those images. The current revolution is to leverage the power and flexibility of cloud computing. But those revolutions haven’t fundamentally changed what we do. The images might have gotten sharper, brighter and easier to produce, but TV is still TV. This year though, there are some fascinating undercurrents that could herald a fundamental shift in the sort of content we create and how we create it.

Games and Media Collide
There is a new convergence on the horizon in our industry. A few years ago, all the talk was about the merging of telecommunications companies and broadcasters, as well as the joining of creative hardware and software for broadcast and film, as both moved to digital.

The new convergence is between media content creation as we know it and the games industry. It was subtle, but technology from gaming was present in many applications around the halls of IBC 2018.

One of the drivers for this is a giant leap forward in the quality of realtime rendering by the two main game engine providers: Unreal and Unity. I program with Unity for interactive applications, and their new High Definition Render Pipeline (HDRP) allows for incredible realism, even when rendering fast enough for 60+ frames per second. In order to create such high-quality images, those game engines must start with reasonably detailed models. This is a departure from the past, where games used less detailed models than film CGI shots did, to protect realtime performance. So, the first clear advantage created by the new realtime renderers is that a film and its inevitable related game can use the same or similar model data.

Being able to use the same scene data between final CGI and a realtime game engine allows for some interesting applications. Habib Zargarpour from Digital Monarch Media showed a system based on Unity that allows a camera operator to control a virtual camera in realtime within a complex CGI scene. The resulting camera moves feel significantly more real than if they had been keyframed by an animator. The camera operator chases high-speed action, jumps at surprises and reacts to unfolding scenes. The subtleties that these human reactions deliver via minor deviations in the movement of the camera can convey the mood of a scene as much as the design of the scene itself.

NCam was showing the possibilities of augmenting scenes with digital assets, using their system based on the Unreal game engine. The NCam system provides realtime tracking data to specify the position and angle of a freely moving physical camera. This data was being fed to an Unreal game engine, which was then adding in animated digital objects. They were also using an additional ultra-wide-angle camera to capture realtime lighting information from the scene, which was then being passed back to Unreal to be used as a dynamic reflection and lighting map. This ensured that digitally added objects were lit by the physical lights in the real-world scene.

Even a seemingly unrelated (but very enlightening) chat with StreamGuys president Kiriki Delany about all things related to content streaming still referenced gaming technology. Delany talked about their tests to build applications with Unity to provide streaming services in VR headsets.

Unity itself has further aspirations to move into storytelling rather than just gaming. The latest version of Unity features an editing timeline and color grading. This allows scenes to be built and animated, then played out through various virtual cameras to create a linear story. Since those scenes are being rendered in realtime, tweaks to scenes such as positions of objects, lights and material properties are instantly updated.

Game engines not only offer us new ways to create our content, but they are a pathway to create a new type of hybrid entertainment, which sits between a game and a film.

Deep Learning
Other undercurrents at IBC 2018 were the possibilities offered by machine learning and deep learning software. Essentially, a normal computer program is hardwired to give a particular output for a given input. Machine learning allows an algorithm to compare its output to a set of data and adjust itself if the output is not correct. Deep learning extends that principle by using neural network structures to make a vast number of assessments of input data, then draw conclusions and predictions from that data.
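To make the "compare and adjust" idea concrete, here is a toy sketch in Python. It is not taken from any product shown at IBC; the data, the `train` function and its parameters are all made up for illustration. The "model" is a single number (a weight) that gets nudged every time its output disagrees with the training data.

```python
# Toy illustration: a one-parameter "model" that adjusts itself
# whenever its output disagrees with the training data.

def train(samples, learning_rate=0.01, epochs=200):
    """Fit output = weight * input by repeated small corrections."""
    weight = 0.0  # start with no knowledge of the rule
    for _ in range(epochs):
        for x, target in samples:
            prediction = weight * x
            error = prediction - target
            # Nudge the weight in the direction that reduces the error.
            weight -= learning_rate * error * x
    return weight

# Hypothetical training data that follows the rule target = 3 * x.
samples = [(1.0, 3.0), (2.0, 6.0), (3.0, 9.0), (4.0, 12.0)]
learned_weight = train(samples)
print(round(learned_weight, 3))  # prints 3.0
```

Nobody told the program that the rule was "multiply by three"; it found that by correcting its own mistakes. A deep learning system does broadly the same thing, but with millions of weights arranged in layers rather than one.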

Real-world applications are already prevalent and are largely related in our industry to processing viewing metrics. For example, Netflix suggests what we might want to watch next by comparing our viewing habits to others with a similar viewing pattern.
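As a rough illustration of "comparing viewing habits", the sketch below picks the other viewer whose history overlaps most with yours and suggests what they watched that you haven't. This is an assumption-laden toy, not a description of how Netflix actually works; the viewer names, titles and the `jaccard` and `recommend` functions are hypothetical.

```python
def jaccard(a, b):
    """Overlap between two sets of watched titles, from 0 to 1."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def recommend(viewer, histories):
    """Suggest titles watched by the most similar other viewer."""
    watched = histories[viewer]
    others = [name for name in histories if name != viewer]
    nearest = max(others, key=lambda name: jaccard(watched, histories[name]))
    # Offer only titles the viewer has not already seen.
    return sorted(histories[nearest] - watched)

# Hypothetical viewing histories.
histories = {
    "ana": {"Drama A", "Thriller B", "Documentary C"},
    "ben": {"Drama A", "Thriller B", "Comedy D"},
    "cat": {"Cartoon E", "Comedy D"},
}
suggestions = recommend("ana", histories)
print(suggestions)  # prints ['Comedy D']
```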

But deep learning offers — indeed threatens — much more. Of course, it is understandable to think that, say, delivery drivers might be redundant in a world where autonomous vehicles rule, but surely creative jobs are safe, right? Think again!

IBM was showing how its Watson Studio has used deep learning to provide automatically edited highlights packages for sporting events. The process is relatively simple to comprehend, although considerably more complicated in practice. A DL algorithm is trained to scan a video file and “listen” for a cheering crowd. This finds the highlight moment. Another algorithm rewinds from that point to find the logical beginning of that moment, such as the pass forward, the beginning of the volley, etc. Taking the score into account helps decide whether that highlight was pivotal to the outcome of the game. Joining all that up creates a highlights package without the services of an editor. This isn’t future stuff. This has been happening over the last year.
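The logic of those steps can be sketched in a few lines of Python. To be clear, this is not IBM's code: it assumes the hard parts (measuring crowd loudness and detecting where each play begins) have already been done by trained models, and it leaves out the score-based weighting entirely. The function name, the threshold and the sample data are all hypothetical.

```python
from bisect import bisect_right

def find_highlights(loudness, play_starts, threshold=0.8, tail=3):
    """Return (start, end) second ranges for moments where the crowd cheers.

    loudness    -- hypothetical per-second crowd loudness, scaled 0..1
    play_starts -- sorted seconds where a play begins (assumed to come
                   from a separate detector)
    """
    highlights = []
    second = 0
    while second < len(loudness):
        if loudness[second] >= threshold:
            # Step 1: the cheer marks the highlight moment.
            # Step 2: rewind to the most recent play start before it.
            index = bisect_right(play_starts, second) - 1
            start = play_starts[index] if index >= 0 else 0
            # Keep a few seconds of reaction after the cheer begins.
            end = min(second + tail, len(loudness) - 1)
            highlights.append((start, end))
            second = end + 1  # skip past this clip
        else:
            second += 1
    return highlights

# Twelve seconds of made-up loudness data with two cheers.
loudness = [0.1, 0.2, 0.1, 0.3, 0.9, 0.95, 0.4, 0.2, 0.1, 0.2, 0.85, 0.3]
play_starts = [1, 8]
clips = find_highlights(loudness, play_starts)
print(clips)  # prints [(1, 7), (8, 11)]
```

Each pair is a clip's in and out point in seconds. Cutting those ranges together, in order, gives a crude highlights reel.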

BBC R&D was talking about their trials to have DL systems control cameras at sporting events, as they could be trained to follow the “rule of thirds” for framing and to spot moments of excitement that justified close-ups.

In post production, manual tasks such as rotoscoping and color matching in color grading could be automated. Even styles for graphics, color and compositing could be “learned” from other projects.

It’s certainly possible to see that deep learning systems could provide a great deal of assistance in the creation of day-to-day media. Tasks that are based on repetitiveness or formula would be the obvious targets. The truth is, much of our industry is repetitive and formulaic. Investors prefer content that is more likely to be a hit, and this leads to replication over innovation.

So, are we heading for “Skynet” and need Arnold to save us? I thought it was very telling that IBM occupied the central stand position in Hall 7, traditionally the home of the tech companies that have driven creativity in post. Clearly, IBM and its peers are staking their claim. I have no doubt that DL and ML will make massive changes to this industry in the years ahead. Creativity is probably, but not necessarily, the only defense for mere humans to keep a hand in.

That said, at IBC 2018 the most popular place for us mere humans to visit was a bar area called The Beach, where we largely drank Heineken. If the ultimate deep learning system is tasked with emulating media people, surely it would create digital alcohol and spend hours talking nonsense, rather than try to take over the media world? So perhaps we have a few years left yet.


David Cox is a VFX compositor and colorist with 20-plus years of experience. He started his career with MPC and The Mill before forming his own London-based post facility. Cox recently created interactive projects with full body motion sensors and 4D/AR experiences.

