By Mike McCarthy
I, once again, had the opportunity to attend Nvidia’s GPU Technology Conference (GTC) in San Jose last week. The event has become much more focused on AI supercomputing and deep learning as those industries mature, but there was also a concentration on VR for those of us from the visual world.
The big news was that Nvidia released the details of its next-generation GPU architecture, code named Volta. The flagship chip will be the Tesla V100 with 5,120 CUDA cores and 15 Teraflops of computing power. It is a huge 815mm chip, created with a 12nm manufacturing process for better energy efficiency. Most of its unique architectural improvements are focused on AI and deep learning with specialized execution units for Tensor calculations, which are foundational to those processes.
Similar to last year’s GP100, the new Volta chip will initially be available in Nvidia’s SXM2 form factor for dedicated GPU servers like their DGX1, which uses the NVLink bus, now running at 300GB/s. The new GPUs will be a direct swap-in replacement for the current Pascal based GP100 chips. There will also be a 150W version of the chip on a PCIe card similar to their existing Tesla lineup, but only requiring a single half-length slot.
Assuming that Nvidia puts similar processing cores into their next generation of graphics cards, we should be looking at a 33% increase in maximum performance at the top end. The intermediate stages are more difficult to predict, since that depends on how they choose to tier their cards. But the increased efficiency should allow more significant increases in performance for laptops, within existing thermal limitations.
Nvidia is continuing its pursuit of GPU-enabled autonomous cars with its DrivePX2 and Xavier systems for vehicles. The newest version will have a 512 Core Volta GPU and a dedicated deep learning accelerator chip that they are going to open source for other devices. They are targeting larger vehicles now, specifically in the trucking industry this year, with an AI-enabled semi-truck in their booth.
They also had a tractor showing off Blue River’s AI-enabled spraying rig, targeting individual plants for fertilizer or herbicide. It seems like farm equipment would be an optimal place to implement autonomous driving, allowing perfectly straight rows and smooth grades, all in a flat controlled environment with few pedestrians or other dynamic obstructions to be concerned about (think Interstellar). But I didn’t see any reference to them looking in that direction, even with a giant tractor in their AI booth.
On the software and application front, software company SAP showed an interesting implementation of deep learning that analyzes broadcast footage and other content looking to identify logos and branding, in order to provide quantifiable measurements of the effectiveness of various forms of brand advertising. I expect we will continue to see more machine learning implementations of video analysis, for things like automated captioning and descriptive video tracks, as AI becomes more mature.
Nvidia also released an “AI-enabled” version of I-Ray to use image prediction to increase the speed of interactive ray tracing renders. I am hopeful that similar technology could be used to effectively increase the resolution of video footage as well. Basically, a computer sees a low-res image of a car and says, “I know what that car should look like,” and fills in the rest of the visual data. The possibilities are pretty incredible, especially in regard to VFX.
On the VR front, Nvidia announced a new SDK that allows live GPU-accelerated image stitching for stereoscopic VR processing and streaming. It scales from HD to 5K output, splitting the workload across one to four GPUs. The stereoscopic version is doing much more than basic stitching, processing for depth information and using that to filter the output to remove visual anomalies and improve the perception of depth. The output was much cleaner than any other live solution I have seen.
I also got to try my first VR experience recorded with a Light Field camera. This not only gives the user a 360 stereo look around capability, but also the ability to move their head around to shift their perspective within a limited range (based on the size the recording array). The project they were using to demo the technology didn’t highlight the amazing results until the very end of the piece, but when it did that was the most impressive VR implementation I have had the opportunity to experience yet.
Mike McCarthy is an online editor/workflow consultant with 10 years of experience on feature films and commercials. He has been working on new solutions for tapeless workflows, DSLR filmmaking and multi-screen and surround video experiences. Check out his site.