
Epic Games launches Unreal Engine 4.20

Epic Games has introduced Unreal Engine 4.20, which allows developers to build even more realistic characters and immersive environments across games, film and TV, VR/AR/MR and enterprise applications. The Unreal Engine 4.20 release combines the latest realtime rendering advancements with improved creative tools, making it even easier to ship games across all platforms. With hundreds of optimizations, especially for iOS, Android and Nintendo Switch — which have been built for Fortnite and are now rolled into Unreal Engine 4.20 and released to all users — Epic is providing developers with the scalable tools they need for these types of projects.

Artists working in visual effects, animation, broadcast and virtual production will find enhancements for digital humans, VFX and cinematic depth of field, allowing them to create realistic images across all forms of media and entertainment. In the enterprise space, Unreal Studio 4.20 includes upgrades to the UE4 Datasmith plugin suite, such as SketchUp support, which make it easier to get CAD data prepped, imported and working in Unreal Engine.

Here are some key features of Unreal Engine 4.20:

A new proxy LOD system: Users can handle sprawling worlds via UE4’s production-ready Proxy LOD system, which reduces the rendering cost of high poly counts, draw calls and material complexity. Proxy LOD offers big gains when developing for mobile and console platforms.

A smoother mobile experience: Over 100 mobile optimizations developed for Fortnite come to all 4.20 users, marking a major shift for easy “shippability” and seamless gameplay optimization across platforms. Major enhancements include improved Android debugging, mobile Landscape improvements, RHI thread on Android and occlusion queries on mobile.

Works better with Switch: Epic has improved Nintendo Switch development, rolling the many performance and memory improvements built for Fortnite on Nintendo Switch into 4.20 for all users.

Niagara VFX (early access): Unreal Engine’s new programmable VFX editor, Niagara, is now available in early access and will help developers take their VFX to the next level. This new suite of tools is built from the ground up to give artists unprecedented control over particle simulation, rendering and performance for more sophisticated visuals. This tool will eventually replace the Unreal Cascade particle editor.

Cinematic depth of field: Unreal Engine 4.20 delivers tools for achieving depth of field at true cinematic quality in any scene. This brand-new implementation replaces the Circle DOF method. It’s faster, cleaner and provides a cinematic appearance through the use of a procedural bokeh simulation. Cinematic DOF also supports alpha channel and dynamic resolution stability, and has multiple settings for scaling up or down on console platforms based on project requirements. This feature debuted at GDC this year as part of the Star Wars “Reflections” demo by Epic, ILMxLAB and Nvidia.
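Epic hasn’t published the internals of the new DOF pass here, but any procedural bokeh simulation starts from the same thin-lens optics. Below is a minimal Python sketch of the standard circle-of-confusion calculation that drives blur size; the function name and defaults are my own illustration, not UE4 code.

```python
def coc_diameter_mm(focal_mm, f_number, focus_m, subject_m):
    """Thin-lens circle of confusion (blur-circle diameter on the sensor, in mm)
    for a subject at subject_m when the lens is focused at focus_m."""
    f = focal_mm / 1000.0        # focal length in meters
    aperture = f / f_number      # entrance-pupil diameter in meters
    s, d = focus_m, subject_m
    return aperture * abs(d - s) / d * (f / (s - f)) * 1000.0

# A 50mm lens at f/1.8 focused at 2m: a subject 5m away blurs to ~0.43mm
# on a 36mm-wide sensor, i.e. roughly 1.2% of the frame width.
print(coc_diameter_mm(50, 1.8, 2.0, 5.0))
```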

Digital human improvements: In-engine tools now include dual-lobe specular/double Beckman specular models, backscatter transmission in lights, boundary bleed color subsurface scattering, iris normal slot for eyes and screen space irradiance to build the most cutting-edge digital humans in games and beyond.

Live record and replay: All developers now have access to code from Epic’s Fortnite Replay system. Content creators can easily use footage of recorded gameplay sessions to create incredible replay videos.

Sequencer cinematic updates: New features include frame accuracy, media tracking, curve editor/evaluation and Final Cut Pro 7 XML import/export.

Shotgun integration: Shotgun, a production management and asset tracking solution, is now supported. This will streamline workflows for Shotgun users in game development who are leveraging Unreal’s realtime performance. Shotgun users can assign tasks to specific assets within Unreal Engine.

Mixed reality capture support (early access): Users with virtual production workflows will now have mixed reality capture support that includes video input, calibration and in-game compositing. Supported webcams and HDMI capture devices allow users to pull real-world greenscreened video into the engine, and supported tracking devices can match your camera location to the in-game camera for more dynamic shots.

AR support: Unreal Engine 4.20 ships with native support for ARKit 2, which includes features for creating shared, collaborative AR experiences. Also included is the latest support for Magic Leap One and Google ARCore 1.2.

Metadata control: Import metadata from 3ds Max, SketchUp and other common CAD tools to batch process objects by property, or expose metadata via scripts. Metadata enables more creative uses of Unreal Studio, such as Python script commands for updating all meshes of a certain type, or displaying relevant information in interactive experiences.
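As a rough illustration of what that scripting can look like, here is a short Python sketch using UE4’s editor scripting (assuming the Python and Editor Scripting Utilities plugins are enabled). The metadata-lookup call is an assumption about the Datasmith API rather than a verified 4.20 signature, and the key/value names are hypothetical.

```python
import unreal

def get_metadata_value(actor, key):
    # Assumption: Datasmith exposes imported metadata through a lookup like this;
    # the exact function name may differ in your engine version.
    return unreal.DatasmithContentLibrary.get_datasmith_user_data_value_for_key(actor, key)

def hide_actors_by_metadata(key, value):
    """Batch-process every actor in the level whose imported metadata matches key=value."""
    for actor in unreal.EditorLevelLibrary.get_all_level_actors():
        if get_metadata_value(actor, key) == value:
            actor.set_actor_hidden_in_game(True)
            unreal.log("Hid {}".format(actor.get_actor_label()))

# Hypothetical example: hide everything the CAD file tagged as scaffolding.
hide_actors_by_metadata("Category", "Scaffolding")
```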

Mesh editing tools: Unreal Engine now includes a basic mesh editing toolset for quick, simple fixes to imported geometry, so there is no need to repair it in the source package and re-import, or to jump to another application for simple touch-ups. Datasmith also now includes a base Python script that can generate Level of Detail (LOD) meshes automatically.
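Epic’s bundled script isn’t reproduced here, but the same kind of LOD generation can be driven from a few lines of editor Python. The reduction classes below are named from memory of the editor-scripting API and should be treated as assumptions that may vary by engine version.

```python
import unreal

def add_auto_lods(static_mesh, triangle_percents=(1.0, 0.5, 0.25)):
    """Rebuild a StaticMesh's LOD chain with progressively reduced triangle counts."""
    options = unreal.EditorScriptingMeshReductionOptions()
    settings = []
    for pct in triangle_percents:
        s = unreal.EditorScriptingMeshReductionSettings()
        s.percent_triangles = pct   # fraction of source triangles kept at this LOD
        settings.append(s)
    options.reduction_settings = settings
    unreal.EditorStaticMeshLibrary.set_lods(static_mesh, options)
```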

Non-destructive re-import: Achieve faster iteration through the new parameter tracking system, which monitors updates in both the source data and Unreal Editor, and only imports changed elements. Previous changes to the scene within Unreal Editor are retained and reapplied when source data updates.

GTC embraces machine learning and AI

By Mike McCarthy

I had the opportunity to attend GTC 2018, Nvidia’s 9th annual technology conference in San Jose this week. GTC stands for GPU Technology Conference, and GPU stands for graphics processing unit, but graphics makes up a relatively small portion of the show at this point. The majority of the sessions and exhibitors are focused on machine learning and artificial intelligence.

And the majority of the graphics developments are centered around analyzing imagery, not generating it. Whether that is classifying photos on Pinterest or giving autonomous vehicles machine vision, it is based on the capability of computers to understand the content of an image. Now DriveSim, Nvidia’s new simulator for virtually testing autonomous drive software, dynamically creates imagery for the other system in the Constellation pair of servers to analyze and respond to, but that is entirely machine-to-machine imagery communication.

The main exception to this non-visual usage trend is Nvidia RTX, which allows raytraced images to be rendered in realtime on GPUs. RTX can be used through Nvidia’s OptiX API, as well as Microsoft’s DirectX Raytracing API, and eventually through the open-source, cross-platform Vulkan graphics API. It integrates with Nvidia’s AI Denoiser, using predictive rendering to further accelerate performance, and can be used in VR applications as well.

Nvidia RTX was first announced at the Game Developers Conference last week, but the first hardware to run it was just announced here at GTC, in the form of the new Quadro GV100. This $9,000 card replaces the existing Pascal-based GP100 with a Volta-based solution. It retains the same PCIe form factor, the quad DisplayPort 1.4 outputs and the NV-Link bridge to pair two cards at 200GB/s, but it jumps the GPU RAM per card from 16GB to 32GB of HBM2 memory. The GP100 was the first Quadro offering since the K6000 to support double-precision compute processing at full speed, and the increase from 3,584 to 5,120 CUDA cores should provide a 40% increase in performance, before you even look at the benefits of the 640 Tensor Cores.

Hopefully, we will see simpler versions of the Volta chip making their way into a broader array of more budget-conscious GPU options in the near future. The fact that the new Nvidia RTX technology is stated to require Volta architecture GPUs leads me to believe that they must be right on the horizon.

Nvidia also announced a new all-in-one GPU supercomputer — the DGX-2 supports twice as many Tesla V100 GPUs (16) with twice as much RAM each (32GB) compared to the existing DGX-1. This provides 81,920 CUDA cores addressing 512GB of HBM2 memory over a fabric of new NVLink switches, as well as dual Xeon CPUs, InfiniBand or 100GbE connectivity, and 32TB of SSD storage. This $400K supercomputer is marketed as the world’s largest GPU.

Nvidia and their partners had a number of cars and trucks on display throughout the show, showcasing various pieces of technology that are being developed to aid in the pursuit of autonomous vehicles.

Also on display in the category of “actually graphics related” was the new Max-Q version of the mobile Quadro P4000, which is integrated into PNY’s first mobile workstation, the Prevail Pro. Besides supporting professional VR applications, the HDMI and dual DisplayPort outputs allow a total of three external displays up to 4K each. It isn’t the smallest or lightest 15-inch laptop, but it is the only system under 17 inches I am aware of that supports the P4000, which is considered the minimum spec for professional VR implementation.

There are, of course, lots of other vendors exhibiting their products at GTC. I had the opportunity to watch 8K stereo 360 video playing off of a laptop with an external GPU. I also tried out the VRHero 5K Plus enterprise-level HMD, which brings the VR experience to a whole other level. Much more affordable is TP-Cast’s $300 wireless upgrade for Vive and Rift HMDs, the first of many untethered VR solutions. HTC has also recently announced the Vive Pro, which will be available in April for $800. It increases the resolution by 1/3 in both dimensions to 2880×1600 total, and moves from HDMI to DisplayPort 1.2 and USB-C. Besides VR products, there were also all sorts of robots in various forms on display.

Clearly the world of GPUs has extended far beyond the scope of accelerating computer graphics generation, and Nvidia is leading the way in bringing massive information processing to a variety of new and innovative applications. And if that leads us to hardware that can someday raytrace in realtime at 8K in VR, then I suppose everyone wins.


Mike McCarthy is an online editor/workflow consultant with 10 years of experience on feature films and commercials. He has been involved in pioneering new solutions for tapeless workflows, DSLR filmmaking and multi-screen and surround video experiences. Check out his site.


Axle Video rebrands as Axle AI

Media management company Axle Video has rebranded as Axle AI. The company has also launched their new Axle AI software, allowing users to automatically index and search large amounts of video, image and audio content.

Axle AI is available either as software, which runs on standard Mac hardware, or as a self-contained software/hardware appliance. Both options provide integrations with leading cloud AI engines. The appliance also includes embedded processing power that supports direct visual search for thousands of hours of footage with no cloud connectivity required. Axle AI has an open architecture, so new third-party capabilities can be added at any time.

Axle has also launched Axle Media Cloud with Wasabi, a 100% cloud-based option for simple media management. The offering is available now and is priced at $400 per month for 10 terabytes of managed storage, 10 user accounts and up to 10 terabytes of downloaded media per month.

In addition, Axle Embedded is a new version of Axle software that can be run directly on storage solutions from a range of industry partners, including G-Technology and Panasas. As with Axle Media Cloud, all of Axle AI’s automated tagging and search capabilities are simple add-ons to the system.


What was new at GTC 2017

By Mike McCarthy

I, once again, had the opportunity to attend Nvidia’s GPU Technology Conference (GTC) in San Jose last week. The event has become much more focused on AI supercomputing and deep learning as those industries mature, but there was also a concentration on VR for those of us from the visual world.

The big news was that Nvidia released the details of its next-generation GPU architecture, code-named Volta. The flagship chip will be the Tesla V100 with 5,120 CUDA cores and 15 teraflops of computing power. It is a huge 815mm² chip, created with a 12nm manufacturing process for better energy efficiency. Most of its unique architectural improvements are focused on AI and deep learning, with specialized execution units for Tensor calculations, which are foundational to those processes.

Tesla V100

Similar to last year’s GP100, the new Volta chip will initially be available in Nvidia’s SXM2 form factor for dedicated GPU servers like their DGX-1, which uses the NVLink bus, now running at 300GB/s. The new GPUs will be a direct swap-in replacement for the current Pascal-based GP100 chips. There will also be a 150W version of the chip on a PCIe card similar to their existing Tesla lineup, but only requiring a single half-length slot.

Assuming that Nvidia puts similar processing cores into their next generation of graphics cards, we should be looking at a 33% increase in maximum performance at the top end. The intermediate stages are more difficult to predict, since that depends on how they choose to tier their cards. But the increased efficiency should allow more significant increases in performance for laptops, within existing thermal limitations.

Nvidia is continuing its pursuit of GPU-enabled autonomous cars with its Drive PX2 and Xavier systems for vehicles. The newest version will have a 512-core Volta GPU and a dedicated deep learning accelerator chip that they are going to open source for other devices. They are now targeting larger vehicles, specifically the trucking industry this year, with an AI-enabled semi-truck in their booth.

They also had a tractor showing off Blue River’s AI-enabled spraying rig, targeting individual plants for fertilizer or herbicide. It seems like farm equipment would be an optimal place to implement autonomous driving, allowing perfectly straight rows and smooth grades, all in a flat controlled environment with few pedestrians or other dynamic obstructions to be concerned about (think Interstellar). But I didn’t see any reference to them looking in that direction, even with a giant tractor in their AI booth.

On the software and application front, software company SAP showed an interesting implementation of deep learning that analyzes broadcast footage and other content looking to identify logos and branding, in order to provide quantifiable measurements of the effectiveness of various forms of brand advertising. I expect we will continue to see more machine learning implementations of video analysis, for things like automated captioning and descriptive video tracks, as AI becomes more mature.

Nvidia also released an “AI-enabled” version of Iray that uses image prediction to increase the speed of interactive raytracing renders. I am hopeful that similar technology could be used to effectively increase the resolution of video footage as well. Basically, a computer sees a low-res image of a car and says, “I know what that car should look like,” and fills in the rest of the visual data. The possibilities are pretty incredible, especially in regard to VFX.

Iray AI

On the VR front, Nvidia announced a new SDK that allows live GPU-accelerated image stitching for stereoscopic VR processing and streaming. It scales from HD to 5K output, splitting the workload across one to four GPUs. The stereoscopic version is doing much more than basic stitching, processing for depth information and using that to filter the output to remove visual anomalies and improve the perception of depth. The output was much cleaner than any other live solution I have seen.

I also got to try my first VR experience recorded with a Light Field camera. This not only gives the user a 360 stereo look-around capability, but also the ability to move their head to shift their perspective within a limited range (based on the size of the recording array). The project they were using to demo the technology didn’t highlight the amazing results until the very end of the piece, but when it did, it was the most impressive VR implementation I have had the opportunity to experience yet.
———-
Mike McCarthy is an online editor/workflow consultant with 10 years of experience on feature films and commercials. He has been working on new solutions for tapeless workflows, DSLR filmmaking and multi-screen and surround video experiences. Check out his site.


Nvidia’s GTC 2016: VR, A.I. and self-driving cars, oh my!

By Mike McCarthy

Last week, I had the opportunity to attend Nvidia’s GPU Technology Conference, GTC 2016. Five thousand people filled the San Jose Convention Center for nearly a week to learn about GPU technology and how to use it to change our world. GPUs were originally designed to process graphics (hence the name), but are now used to accelerate all sorts of other computational tasks.

The current focus of GPU computing is in three areas:

Virtual reality is a logical extension of the original graphics processing design. VR requires high frame rates with low latency to keep up with the user’s head movements; otherwise the lag results in motion sickness. This requires lots of processing power, and the imminent release of the Oculus Rift and HTC Vive head-mounted displays is sure to sell many high-end graphics cards. The new Quadro M6000 24GB PCIe card and M5500 mobile GPU have been released to meet this need.

Autonomous vehicles are being developed that will slowly replace many or all of the driver’s current roles in operating a vehicle. This requires processing lots of sensor input data and making decisions in realtime based on inferences made from that information. Nvidia has developed a number of hardware solutions to meet these needs, with the Drive PX and Drive PX2 expected to be the hardware platform that many car manufacturers rely on to meet those processing needs.

This author calls the Tesla P100 “a monster of a chip.”

Artificial Intelligence has made significant leaps recently, and the need to process large data sets has grown exponentially. To that end, Nvidia has focused their newest chip development — not on graphics, at least initially — on a deep learning supercomputer chip. The first Pascal-generation GPU, the Tesla P100, is a monster of a chip, with 15 billion 16nm transistors on a 600mm² die. It should be twice as fast as current options for most tasks, and even more for double-precision work and/or large data sets. The chip is initially available in the new DGX-1 supercomputer for $129K, which includes eight of the new GPUs connected via NVLink. I am looking forward to seeing the same graphics processing technology on a PCIe-based Quadro card at some point in the future.

While those three applications for GPU computing all had dedicated hardware released for them, Nvidia has also been working to make sure that software will be developed that uses the level of processing power they can now offer users. To that end, they have been releasing all sorts of SDKs and libraries to help developers harness the power of the hardware that is now available. For VR, they have Iray VR, which is a raytracing toolset for creating photorealistic VR experiences, and Iray VR Lite, which allows users to create still renderings to be previewed with HMD displays. They also have a broader VRWorks collection of tools for helping software developers adapt their work for VR experiences. For autonomous vehicles, they have developed libraries of tools for mapping, sensor image analysis and a deep-learning decision-making neural net for driving called DaveNet. For A.I. computing, cuDNN accelerates emerging deep-learning neural networks running on GPU clusters and supercomputing systems like the new DGX-1.

What Does This Mean for Post Production?
So from a post perspective (ha!), what does this all mean for the future of post production? First, newer and faster GPUs are coming, even if they are not here yet. Much farther off, deep-learning networks may someday log and index all of your footage for you. But the biggest change coming down the pipeline is virtual reality, led by the upcoming commercially available head-mounted displays (HMDs). Gaming will drive HMDs into the hands of consumers, and HMDs in the hands of consumers will drive demand for a new type of experience for storytelling, advertising and expression.

As I see it, VR can be created in a variety of continually more immersive steps. The starting point is the HMD, placing the viewer into an isolated, large-feeling environment. Existing flat video or stereoscopic content can be viewed without large screens, requiring only minimal processing to format the image for the HMD. The next step is a big jump — when we begin to support head tracking — allowing the viewer to control the direction they are viewing. This is where we begin to see changes required at all stages of the content production and post pipeline. Scenes need to be created and filmed at 360 degrees.

Shown at the conference: a high-fidelity VR simulation using scientifically accurate satellite imagery and data from NASA.

The cameras required to capture 360 degrees of imagery produce a series of video streams that need to be stitched together into a single image, and that image needs to be edited and processed. Then the entire image is made available to the viewer, who chooses which angle they want to view as it is played. This can be done as a flattened image sphere or, with more source data and processing, as a stereoscopic experience. The user can control the angle they view the scene from, but not the location they are viewing from, which is dictated by the physical placement of the 360-camera system. Video-Stitch just released a new all-in-one package for capturing, recording and streaming 360 video called the Orah 4i, which may make that format more accessible to consumers.
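To make that last step concrete, here is a small numpy sketch (my own illustration, not any vendor’s player code) of the per-frame lookup a 360 player performs: given the viewer’s yaw and pitch, it samples a rectilinear view out of the stitched equirectangular image.

```python
import numpy as np

def equirect_to_view(frame, yaw, pitch, fov_deg=90.0, out_size=(512, 512)):
    """Sample a rectilinear view (the angle the viewer chose) from an
    equirectangular 360 frame. frame is an (H, W, 3) image; yaw/pitch in radians."""
    H, W = frame.shape[:2]
    out_h, out_w = out_size
    f = 0.5 * out_w / np.tan(np.radians(fov_deg) / 2.0)  # pinhole focal length in pixels

    # Ray direction for every output pixel (camera looks down +z before rotation).
    xs, ys = np.meshgrid(np.arange(out_w) - out_w / 2.0,
                         np.arange(out_h) - out_h / 2.0)
    dirs = np.stack([xs, ys, np.full_like(xs, f)], axis=-1)
    dirs /= np.linalg.norm(dirs, axis=-1, keepdims=True)

    # Rotate rays by the viewer's head orientation (pitch about x, then yaw about y).
    cp, sp, cy, sy = np.cos(pitch), np.sin(pitch), np.cos(yaw), np.sin(yaw)
    rot_x = np.array([[1, 0, 0], [0, cp, -sp], [0, sp, cp]])
    rot_y = np.array([[cy, 0, sy], [0, 1, 0], [-sy, 0, cy]])
    dirs = dirs @ (rot_y @ rot_x).T

    # Convert ray directions to longitude/latitude, then to source pixel coordinates.
    lon = np.arctan2(dirs[..., 0], dirs[..., 2])          # -pi..pi across the frame width
    lat = np.arcsin(np.clip(dirs[..., 1], -1.0, 1.0))     # -pi/2..pi/2 across the height
    u = ((lon / np.pi + 1.0) * 0.5 * (W - 1)).astype(int)
    v = ((lat / (np.pi / 2) + 1.0) * 0.5 * (H - 1)).astype(int)
    return frame[v, u]
```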

Allowing the user to fully control their perspective and move around within a scene is what makes true VR so unique, but is also much more challenging to create content for. All viewed images must be rendered on the fly, based on input from the user’s motion and position. These renders require all content to exist in 3D space, for the perspective to be generated correctly. While this is nearly impossible for traditional camera footage, it is purely a render challenge for animated content — rendering that used to take weeks must be done in realtime, and at much higher frame rates to keep up with user movement.

For any camera image, depth information is required, which is possible to estimate with calculations based on motion, but not with the level of accuracy required. Instead, if many angles are recorded simultaneously, a 3D analysis of the combination can generate a 3D version of the scene. This is already being done in limited cases for advanced VFX work, but it would require taking it to a whole new level. For static content, a 3D model can be created by processing lots of still images, but storytelling will require 3D motion within this environment. This all seems pretty far out there for a traditional post workflow, but there is one case that will lend itself to this format.

Motion capture-based productions already have the 3D data required to render VR perspectives, because VR is the same basic concept as motion tracking cinematography, except that the viewer controls the “camera” instead of the director. We are already seeing photorealistic motion capture movies showing up in theaters, so these are probably the first types of productions that will make the shift to producing full VR content.

The Maxwell Kepler family of cards.

Viewing this content is still a challenge, where again Nvidia GPUs are used on the consumer end. Any VR viewing requires sensor input to track the viewer, which must be processed, and the resulting image must be rendered, usually twice for stereo viewing. This requires a significant level of processing power, so Nvidia has created two tiers of hardware recommendations to ensure that users can get a quality VR experience. For consumers, the VR-Ready program includes complete systems based on the GeForce 970 or higher GPUs, which meet the requirements for comfortable VR viewing. VR-Ready for Professionals is a similar program for the Quadro line, including the M5000 and higher GPUs, included in complete systems from partner ISVs. Currently, MSI’s new WT72 laptop with the new M5500 GPU is the only mobile platform certified VR Ready for Pros. The new mobile Quadro M5500 has the same system architecture as the desktop workstation Quadro M5000, with all 2048 CUDA cores and 8GB RAM.
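A quick back-of-the-envelope calculation shows why that hardware floor exists. The numbers below are for a Rift/Vive-class headset (2160×1200 combined across both eyes at 90Hz) and ignore lens-distortion passes and supersampling, so real-world requirements are higher.

```python
refresh_hz = 90                     # Rift and Vive both refresh at 90Hz
width, height = 2160, 1200          # combined panel resolution across both eyes

frame_budget_ms = 1000.0 / refresh_hz            # ~11.1ms to sense, simulate and render
pixels_per_second = width * height * refresh_hz  # ~233 million shaded pixels every second

print(f"{frame_budget_ms:.1f}ms per frame, {pixels_per_second / 1e6:.0f}M pixels/s minimum")
```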

While the new top-end Maxwell-based Quadro GPUs are exciting, I am really looking forward to seeing Nvidia’s Pascal technology used for graphics processing in the near future. In the meantime, we have enough performance with existing systems to start processing 360-degree videos and VR experiences.

Mike McCarthy is a freelance post engineer and media workflow consultant based in Northern California. He shares his 10 years of technology experience on www.hd4pc.com, and he can be reached at mike@hd4pc.com.