Tag Archives: GTC

GTC: GPUs power The Mandalorian’s in-camera VFX, realtime workflows

By Mike McCarthy

Each year, Nvidia hosts a series of conferences that focus on new developments in GPU-based computing. Originally, these were about graphics and visualization, which were the most advanced things being done with GPUs. Now they focus on everything from supercomputing and AI to self-driving cars and VR. The first GTC conference I attended was in 2016, when Nvidia announced its Pascal architecture. While that was targeted to supercomputer users, there was still a lot of graphics-based content to explore, especially with VR.

Over time, the focus has shifted from visual applications to AI applications that aren’t necessarily graphics-based; they just have similar parallel computing requirements to graphics processing and are optimal tasks to be accelerated on GPU hardware. This has made GTC more relevant to programmers and similar users, but the hardware developments that enable those capabilities also accelerate the more traditional graphics workflows — and new ways of using that power are constantly being developed.

I was looking forward to going to March’s GTC to hear the details on what was expected to be an announcement about Nvidia’s next generation of hardware architecture and to see all of the other presentations about how others have been using current GPU technology. Then came the coronavirus, and the world changed. Nvidia canceled the online keynote, and a few SDK updates were released, but all major product announcements have been deferred for the time being. What Nvidia did offer was a selection of talks and seminars that were remotely recorded and hosted as videos to watch. These are available to anyone who registers for the free online version of GTC, instead of paying the hundreds of dollars it would cost to attend in person.

One that really stood out to me was “Creating In-Camera VFX with Realtime Workflows.” It highlighted the Unreal Engine and what that technology allowed on The Mandalorian — it was amazing. The basic premise is to replace greenscreen composites with VFX projections behind the elements being photographed. This was done years ago for exteriors of in-car scenes using flat prerecorded footage, but technology has progressed dramatically since then. The main advances are in motion capture, 3D rendering and LED walls.

From the physical standpoint, LED video walls have greater brightness, allowing them not only to match the lit foreground subjects, but to light those subjects for accurate shadows and reflections without post compositing. And if that background imagery can be generated in real time — instead of recordings or renders — it can respond to the movement of the camera as well. That is where Unreal comes in — as a 3D game rendering engine that is repurposed to generate images corrected for the camera’s perspective in order to project on the background. This allows live-action actors to be recorded in complex CGI environments as if they were real locations. Actors can see the CGI elements they are interacting with, and the crew can see it all working together in real time without having to imagine how it’s going to look after VFX. We looked at using this technology on the last film I worked on, but it wasn’t quite there yet at the scale we needed; we used greenscreens instead, but it looks like this use of the technology has arrived. And Nvidia should be happy, because it takes a lot more GPU power to render the whole environment in real time than it does to render just what the camera sees after filming. But the power is clearly available, and even more is coming.
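To make "corrected for the camera’s perspective" a little more concrete, here is a minimal sketch of an off-axis (asymmetric) view frustum computed from a tracked camera position. This is not how Unreal or the production implements it; the flat wall lying in the plane z = 0, the coordinate layout and the function name are all assumptions made for illustration.

```python
def off_axis_frustum(cam, wall_left, wall_right, wall_bottom, wall_top, near):
    """Return (left, right, bottom, top) frustum extents at the near plane.

    Assumes a flat wall lying in the plane z = 0 and a camera at
    cam = (x, y, z) with z > 0, facing the wall head-on.
    """
    cx, cy, cz = cam
    if cz <= 0:
        raise ValueError("camera must be in front of the wall (z > 0)")
    scale = near / cz  # similar triangles: wall plane -> near plane
    return (
        (wall_left - cx) * scale,
        (wall_right - cx) * scale,
        (wall_bottom - cy) * scale,
        (wall_top - cy) * scale,
    )


# A 4 x 2 unit wall viewed from 4 units away, first centered, then shifted right.
centered = off_axis_frustum((0.0, 0.0, 4.0), -2.0, 2.0, -1.0, 1.0, near=1.0)
shifted = off_axis_frustum((1.0, 0.0, 4.0), -2.0, 2.0, -1.0, 1.0, near=1.0)
```

When the camera moves to the right, the frustum becomes lopsided to the left, which is what keeps the background imagery lined up with the physical wall from the camera’s point of view.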

While no new Nvidia technology has been announced, something is always right around the corner. The current Turing generation of GPUs, which has been available for over 18 months, brought dedicated RT cores for realtime raytracing. The coming generation is expected to scale up the number of CUDA cores and amount of memory by using smaller transistors than Turing’s 12nm process. This should offer more processing power for less money, which is always a welcome development.


Mike McCarthy is an online editor/workflow consultant with over 10 years of experience on feature films and commercials. He has been involved in pioneering new solutions for tapeless workflows, DSLR filmmaking and multi-screen and surround video experiences. Check out his site.

Nvidia’s GTC 2016: VR, A.I. and self driving cars, oh my!

By Mike McCarthy

Last week, I had the opportunity to attend Nvidia’s GPU Technology Conference, GTC 2016. Five thousand people filled the San Jose Convention Center for nearly a week to learn about GPU technology and how to use it to change our world. GPUs were originally designed to process graphics (hence the name), but are now used to accelerate all sorts of other computational tasks.

The current focus of GPU computing is in three areas:

Virtual reality is a logical extension of the original graphics processing design. VR requires high frame rates with low latency to keep up with the user’s head movements; otherwise, the lag results in motion sickness. This requires lots of processing power, and the imminent release of the Oculus Rift and HTC Vive head-mounted displays is sure to sell many high-end graphics cards. The new Quadro M6000 24GB PCIe card and M5500 mobile GPU have been released to meet this need.

Autonomous vehicles are being developed that will slowly replace many or all of the driver’s current roles in operating a vehicle. This requires processing lots of sensor input data and making decisions in realtime based on inferences made from that information. Nvidia has developed a number of hardware solutions to meet these needs, with the Drive PX and Drive PX2 expected to be the hardware platform that many car manufacturers rely on to meet those processing needs.

This author calls the Tesla P100 “a monster of a chip.”

Artificial intelligence has made significant leaps recently, and the need to process large data sets has grown exponentially. To that end, Nvidia has focused its newest chip development, at least initially, not on graphics but on a deep-learning supercomputer chip. The first Pascal-generation GPU, the Tesla P100, is a monster of a chip, with 15 billion 16nm transistors on a 600mm² die. It should be twice as fast as current options for most tasks, and even faster for double-precision work and/or large data sets. The chip is initially available in the new DGX-1 supercomputer for $129K, which includes eight of the new GPUs connected via NVLink. I am looking forward to seeing the same graphics processing technology on a PCIe-based Quadro card at some point in the future.

While those three applications for GPU computing all had dedicated hardware released for them, Nvidia has also been working to make sure that software will be developed that uses the level of processing power it can now offer users. To that end, the company has been releasing all sorts of SDKs and libraries to help developers harness the power of the hardware that is now available. For VR, there is Iray VR, which is a raytracing toolset for creating photorealistic VR experiences, and Iray VR Lite, which allows users to create still renderings to be previewed with HMDs. There is also a broader VRWorks collection of tools for helping software developers adapt their work for VR experiences. For autonomous vehicles, Nvidia has developed libraries of tools for mapping and sensor image analysis, as well as a deep-learning decision-making neural net for driving called DaveNet. For A.I. computing, cuDNN accelerates emerging deep-learning neural networks running on GPU clusters and supercomputing systems like the new DGX-1.

What Does This Mean for Post Production?
So from a post perspective (ha!), what does this all mean for the future of post production? First, newer and faster GPUs are coming, even if they are not here yet. Much farther off, deep-learning networks may someday log and index all of your footage for you. But the biggest change coming down the pipeline is virtual reality, led by the upcoming commercially available head-mounted displays (HMDs). Gaming will drive HMDs into the hands of consumers, and HMDs in the hands of consumers will drive demand for a new type of experience for storytelling, advertising and expression.

As I see it, VR can be created in a series of progressively more immersive steps. The starting point is the HMD, placing the viewer into an isolated and large-feeling environment. Existing flat video or stereoscopic content can be viewed without large screens, requiring only minimal processing to format the image for the HMD. The next step is a big jump: supporting head tracking, which allows the viewer to control the direction in which they are looking. This is where we begin to see changes required at all stages of the content production and post pipeline. Scenes need to be created and filmed in 360 degrees.

At the conference, this high-fidelity VR simulation that uses scientifically accurate satellite imagery and data from NASA was shown.

The cameras required to capture 360 degrees of imagery produce a series of video streams that need to be stitched together into a single image, and that image needs to be edited and processed. Then the entire image is made available to the viewer, who chooses which angle they want to view as it is played. This can be done as a flat image sphere or, with more source data and processing, as a stereoscopic experience. The user can control the angle they view the scene from, but not the location they are viewing from, which was dictated by the physical placement of the 360-camera system. Video-Stitch just released a new all-in-one package for capturing, recording and streaming 360 video called the Orah 4i, which may make that format more accessible to consumers.
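As a rough illustration of how a player picks out the viewer’s chosen angle from a stitched frame, here is a minimal sketch that maps a viewing direction to a pixel in an equirectangular image. The equirectangular layout, the angle conventions and the function name are assumptions for illustration; the article does not say which projection any particular camera or player uses.

```python
def direction_to_equirect_pixel(yaw_deg, pitch_deg, width, height):
    """Map a viewing direction to (x, y) in an equirectangular frame.

    yaw_deg:   rotation around the vertical axis, 0 = center of the frame.
    pitch_deg: elevation, +90 = straight up, -90 = straight down.
    """
    u = ((yaw_deg + 180.0) % 360.0) / 360.0  # 0..1, left to right (wraps around)
    v = (90.0 - pitch_deg) / 180.0           # 0..1, top to bottom
    x = min(int(u * width), width - 1)       # clamp to the last column
    y = min(int(v * height), height - 1)     # clamp to the last row
    return x, y


# Looking straight ahead lands in the middle of a 3840 x 1920 frame.
center = direction_to_equirect_pixel(0.0, 0.0, 3840, 1920)
# Looking straight up and fully to the left lands in the top-left corner.
top_left = direction_to_equirect_pixel(-180.0, 90.0, 3840, 1920)
```

A real player would sample a whole field of view around that direction and correct for the distortion near the poles, but the lookup above is the core of why head rotation needs no new rendering, only a different crop of the same image.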

Allowing the user to fully control their perspective and move around within a scene is what makes true VR so unique, but is also much more challenging to create content for. All viewed images must be rendered on the fly, based on input from the user’s motion and position. These renders require all content to exist in 3D space, for the perspective to be generated correctly. While this is nearly impossible for traditional camera footage, it is purely a render challenge for animated content — rendering that used to take weeks must be done in realtime, and at much higher frame rates to keep up with user movement.

For any camera image, depth information is required, which is possible to estimate with calculations based on motion, but not with the level of accuracy required. Instead, if many angles are recorded simultaneously, a 3D analysis of the combination can generate a 3D version of the scene. This is already being done in limited cases for advanced VFX work, but VR would require taking it to a whole new level. For static content, a 3D model can be created by processing lots of still images, but storytelling will require 3D motion within this environment. This all seems pretty far out there for a traditional post workflow, but there is one case that will lend itself to this format.

Motion capture-based productions already have the 3D data required to render VR perspectives, because VR is the same basic concept as motion tracking cinematography, except that the viewer controls the “camera” instead of the director. We are already seeing photorealistic motion capture movies showing up in theaters, so these are probably the first types of productions that will make the shift to producing full VR content.

The Maxwell Kepler family of cards.

Viewing this content is still a challenge, where again Nvidia GPUs are used on the consumer end. Any VR viewing requires sensor input to track the viewer, which must be processed, and the resulting image must be rendered, usually twice for stereo viewing. This requires a significant level of processing power, so Nvidia has created two tiers of hardware recommendations to ensure that users can get a quality VR experience. For consumers, the VR-Ready program includes complete systems based on the GeForce GTX 970 or higher GPUs, which meet the requirements for comfortable VR viewing. VR-Ready for Professionals is a similar program for the Quadro line, including the M5000 and higher GPUs, included in complete systems from partner ISVs. Currently, MSI’s new WT72 laptop with the new M5500 GPU is the only mobile platform certified VR Ready for Pros. The new mobile Quadro M5500 has the same system architecture as the desktop workstation Quadro M5000, with all 2048 CUDA cores and 8GB RAM.
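To put a rough number on that processing load, here is a small back-of-the-envelope sketch. The 90Hz refresh rate and 1080x1200 per-eye resolution are illustrative assumptions matching commonly cited specs for the first consumer headsets, not figures from Nvidia’s VR-Ready requirements, and the function names are hypothetical.

```python
def frame_budget_ms(refresh_hz):
    """Time available to track, render and display one frame, in milliseconds."""
    return 1000.0 / refresh_hz


def pixels_per_second(width_per_eye, height_per_eye, refresh_hz, eyes=2):
    """Raw pixel throughput when every frame is rendered once per eye."""
    return width_per_eye * height_per_eye * eyes * refresh_hz


budget = frame_budget_ms(90)                     # about 11.1 ms per frame
throughput = pixels_per_second(1080, 1200, 90)   # both eyes, every frame
```

Under these assumptions the GPU has roughly 11 milliseconds per frame and must fill over 230 million pixels per second, before any of the oversampling or lens correction a real headset pipeline adds.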

While the new top-end Maxwell-based Quadro GPUs are exciting, I am really looking forward to seeing Nvidia’s Pascal technology used for graphics processing in the near future. In the meantime, we have enough performance with existing systems to start processing 360-degree videos and VR experiences.

Mike McCarthy is a freelance post engineer and media workflow consultant based in Northern California. He shares his 10 years of technology experience on www.hd4pc.com, and he can be reached at mike@hd4pc.com.

Nvidia’s GPU Technology Conference: Part III

Entrepreneurs, self-driving cars and more

By Fred Ruckel

Welcome to the final installment of my Nvidia GPU Technology Conference experience. If you have read Part I and Part II, I’m confident you will enjoy this wrap-up — from a one-on-one meeting with one of Nvidia’s top dogs to a “shark tank” full of entrepreneurs to my take on the status of self-driving cars. Thanks for following along and feel free to email if you have any questions about my story.

Going One on One
I had the pleasure of sitting down with Nvidia marketing manager Greg Estes, along with Gail Laguna, their PR expert in media and entertainment. They allowed me to pick their brains about Continue reading

Nvidia GPU Technology Conference 2015: Part I

By Fred Ruckel

Recently, I had the pleasure of attending the Nvidia GPU Technology Conference 2015 in San Jose, California, a.k.a. Silicon Valley. This was not a conference for the faint of heart; it was an in-depth look at where the development of GPU technology is heading and what strides it has made over the last year. In short, it was the biggest geek fest I have ever known, and I mean that as a compliment. The cast of The Big Bang Theory would have fit right in.

While some look at “geek” as having a negative connotation, in the world of technology geeks Continue reading

Quick Chat: Nvidia’s Greg Estes

By Randi Altman

Greg Estes, Nvidia’s VP of marketing, recently took a few minutes out of his schedule to discuss the industry, trends and how the company goes about creating new products that target the needs of users.

The short answer is listening to what studios and broadcasters need. The long answer is… well give it a read and see for yourself.
