
Category Archives: A.I.

AI and deep learning at NAB 2019

By Tim Nagle

If you’ve been there, you know. Attending NAB can be both exciting and a chore. The vast show floor spreads across three massive halls and several hotels, and it will challenge even the most comfortable shoes. With an engineering background and my daily position as a Flame artist, I am definitely a gear-head, but I feel I can hardly claim that title at these events.

Here are some of my takeaways from the show this year…

Tim Nagle

8K
Having listened to the rumor mill, this year’s event promised to be exciting. And for me, it did not disappoint. First impressions: 8K infrastructure is clearly the goal of the manufacturers. Massive data rates and more Ks are becoming the norm. Everybody seemed to have an 8K workflow announcement. As a Flame artist, I’m not exactly looking forward to working on 8K plates. Sure, it is a glorious number of pixels, but the challenges are very real. While this may be the hot topic of the show, the fact that it is on the horizon further solidifies the need for the industry at large to have a solid 4K infrastructure. Hey, maybe we can even stop delivering SD content soon? All kidding aside, the systems and infrastructure elements being designed are quite impressive. Seeing storage solutions that can read and write at these astronomical speeds is just jaw dropping.

Young Attendees
Attendance remained relatively stable this year, but what I did notice was a lot of young faces making their way around the halls. It seemed like high school and university students were able to take advantage of interfacing with manufacturers, as well as some great educational sessions. This is exciting, as I really enjoy watching young creatives get the opportunity to express themselves in their work and make the rest of us think a little differently.

Blackmagic Resolve 16

AI/Deep Learning
Speaking of the future, AI and deep learning algorithms are being implemented into many parts of our industry, and this is definitely something to watch for. The possibilities to increase productivity are real, but these technologies are still relatively new and need time to mature. Some of the post apps taking advantage of these algorithms come from Blackmagic, Autodesk and Adobe.

At the show, Blackmagic announced their Neural Engine AI processing, which is integrated into DaVinci Resolve 16 for facial recognition, speed warp estimation and object removal, to name just a few. These features will add to the productivity of this software, further claiming its place among the usual suspects for more than just color correction.

Flame 2020

The Autodesk Flame team has implemented deep learning into their app as well, with really impressive uses for retouching and relighting, as well as for creating depth maps of scenes. Autodesk demoed a shot of a woman on the beach, with no real key light possibility and very flat, diffused lighting in general. With a few nodes, they were able to relight her face to create a sense of depth and lighting direction. This same technique can be used for skin retouching as well, which is very useful in my everyday work.

Adobe has also been working on their implementation of AI with the integration of Sensei. In After Effects, the content-aware algorithms will help to re-texture surfaces, remove objects and edge blend when there isn’t a lot of texture to pull from. Watching a demo artist move through a few shots, removing cars and people from plates with relative ease and decent results, was impressive.

These demos have all made their way online, and I encourage everyone to watch. Seeing where we are headed is quite exciting. We are on our way to these tools being very accurate and useful in everyday situations, but they are all very much a work in progress. Good news, we still have jobs. The robots haven’t replaced us yet.


Tim Nagle is a Flame artist at Dallas-based Lucky Post.

NAB 2019: First impressions

By Mike McCarthy

There are always a slew of new product announcements during the week of NAB, and this year was no different. As a Premiere editor, the developments from Adobe are usually the ones most relevant to my work and life. Similar to last year, Adobe released its software updates a week before NAB instead of announcing them at the show for eventual release months later.

The biggest new feature in the Adobe Creative Cloud apps is After Effects’ new “Content Aware Fill” for video. This will use AI to generate image data to automatically replace a masked area of video, based on surrounding pixels and surrounding frames. This functionality has been available in Photoshop for a while, but the challenge of bringing that to video is not just processing lots of frames but keeping the replaced area looking consistent across the changing frames so it doesn’t stand out over time.
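To make the temporal-consistency challenge concrete, here is a minimal sketch of the naive approach: inpainting each frame independently with OpenCV. The file names and static mask are placeholders, and this is not Adobe's algorithm; because every frame is filled without reference to its neighbors, the patched region tends to shimmer from frame to frame, which is exactly the problem the video version has to solve (a real workflow would also track the mask per frame rather than reuse one still mask).

```python
# Naive frame-by-frame fill: works per frame, but flickers over time.
import cv2

cap = cv2.VideoCapture("shot.mp4")                          # hypothetical source clip
mask = cv2.imread("object_mask.png", cv2.IMREAD_GRAYSCALE)  # white = region to remove

filled_frames = []
while True:
    ok, frame = cap.read()
    if not ok:
        break
    # Fill the masked region using only this frame's surrounding pixels.
    filled = cv2.inpaint(frame, mask, 5, cv2.INPAINT_TELEA)
    filled_frames.append(filled)
cap.release()
```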

The other key part to this process is mask tracking, since masking the desired area is the first step in that process. Certain advances have been made here, but based on tech demos I saw at Adobe Max, more is still to come, and that is what will truly unlock the power of AI that they are trying to tap here. To be honest, I have been a bit skeptical of how much AI will impact film production workflows, since AI-powered editing has been terrible, but AI-powered VFX work seems much more promising.

Adobe’s other apps got new features as well, with Premiere Pro adding Free-Form bins for visually sorting through assets in the project panel. This affects me less, as I do more polishing than initial assembly when I’m using Premiere. They also improved playback performance for Red files, acceleration with multiple GPUs and certain 10-bit codecs. Character Animator got a better puppet rigging system, and Audition got AI-powered auto-ducking tools for automated track mixing.

Blackmagic
Elsewhere, Blackmagic announced a new version of Resolve, as expected. Blackmagic RAW is supported on a number of new products, but I am not holding my breath to use it in Adobe apps anytime soon, similar to ProRes RAW. (I am just happy to have regular ProRes output available on my PC now.) They also announced a new 8K Hyperdeck product that records quad 12G-SDI to HEVC files. While I don’t think that 8K will replace 4K television or cinema delivery anytime soon, there are legitimate markets that need 8K resolution assets. Surround video and VR are one; another is using live background screens instead of greenscreen for composite shots. There is no image replacement in post, since everything is captured in-camera, and your foreground objects are accurately “lit” by the screens. I expect my next major feature will be produced with that method, but the resolution wasn’t there for the director to use that technology on the one I am working on now (enter 8K…).

AJA
AJA was showing off the new Ki Pro Go, which records up to four separate HD inputs to H.264 on USB drives. I assume this is intended for dedicated ISO recording of every channel of a live-switched event or any other multicam shoot. Each channel can record up to 1080p60 at 10-bit color to H.264 files in MP4 or MOV, at data rates up to 25Mb/s.

HP
HP had one of their existing Z8 workstations on display, demonstrating the possibilities that will be available once Intel releases their upcoming DIMM-based Optane persistent memory technology to the market. I have loosely followed the Optane story for quite a while, but had not envisioned this impacting my workflow at all in the near future due to software limitations. But HP claims that there will be options to treat Optane just like system memory (increasing capacity at the expense of speed) or as SSD drive space (with DIMM slots having much lower latency to the CPU than any other option). So I will be looking forward to testing it out once it becomes available.

Dell
Dell was showing off their relatively new 49-inch double-wide curved display. The 4919DW has a resolution of 5120×1440, making it equivalent to two 27-inch QHD displays side by side. I find that 32:9 aspect ratio to be a bit much for my tastes, with 21:9 being my preference, but I am sure there are many users who will want the extra width.

Digital Anarchy
I also had a chat with the people at Digital Anarchy about their Premiere Pro-integrated Transcriptive audio transcription engine. Having spent the last three months editing a movie that is split between English and Mandarin dialogue, needing to be fully subtitled in both directions, I can see the value in their tool-set. It harnesses the power of AI-powered transcription engines online and integrates the results back into your Premiere sequence, creating an accurate script as you edit the processed clips. In my case, I would still have to handle the translations separately once I had the Mandarin text, but this would allow our non-Mandarin speaking team members to edit the Mandarin assets in the movie. And it will be even more useful when it comes to creating explicit closed captioning and subtitles, which we have been doing manually on our current project. I may post further info on that product once I have had a chance to test it out myself.
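As a simple illustration of the captioning side of this, once any transcription engine returns timed segments, turning them into an SRT subtitle file is straightforward to automate. The sketch below uses made-up segment data and is not Transcriptive's actual output format:

```python
# Convert timed transcript segments into a basic .srt subtitle file.
def to_timestamp(seconds):
    ms = int(round(seconds * 1000))
    h, ms = divmod(ms, 3600000)
    m, ms = divmod(ms, 60000)
    s, ms = divmod(ms, 1000)
    return f"{h:02}:{m:02}:{s:02},{ms:03}"

segments = [  # hypothetical engine output: (start_sec, end_sec, text)
    (0.0, 2.4, "Welcome back to the cutting room."),
    (2.4, 5.1, "Let's look at today's selects."),
]

with open("captions.srt", "w", encoding="utf-8") as f:
    for i, (start, end, text) in enumerate(segments, 1):
        f.write(f"{i}\n{to_timestamp(start)} --> {to_timestamp(end)}\n{text}\n\n")
```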

Summing Up
There were three halls of other products to look through and check out, but overall, I was a bit underwhelmed at the lack of true innovation I found at the show this year.

Full disclosure, I was only able to attend for the first two days of the exhibition, so I may have overlooked something significant. But based on what I did see, there isn’t much else that I am excited to try out or that I expect to have much of a serious impact on how I do my various jobs.

It feels like most of the new things we are seeing are merely commoditized versions of products that were truly innovative when they were first released, and have now only been fleshed out a bit further over time.

There seems to be much less pioneering of truly new technology and more repackaging of existing technologies into other products. I used to come to NAB to see all the flashy new technologies and products, but now it feels like the main thing I am doing there is a series of annual face-to-face meetings, and that’s not necessarily a bad thing.

Until next year…


Mike McCarthy is an online editor/workflow consultant with over 10 years of experience on feature films and commercials. He has been involved in pioneering new solutions for tapeless workflows, DSLR filmmaking and multi-screen and surround video experiences. Check out his site.


Dell updates Precision 7000 Series workstation line

Dell has updated its Precision 7920 and 7820 towers and Precision 7920 rack workstations to target the media and entertainment industry. Enhancements include processing of large data workloads, AI capabilities, hot-swappable drives, a tool-less external power supply and a flexible 2U rack form factor that boosts cooling, noise reduction and space savings.

Both the Dell Precision 7920 and 7820 towers will be available with the new 2nd Gen Intel Xeon Scalable processors and Nvidia Quadro RTX graphics options to deliver enhanced performance for applications with large datasets, including artificial intelligence and machine learning workloads. All Precision workstations come equipped with the Dell Precision Optimizer, while the Dell Precision Optimizer Premium is available at an additional cost and uses AI-based technology to tune the workstation based on how it is being used.

In addition, the Precision workstations now feature a multichannel thermal design for advanced cooling and acoustics. An externally accessible tool-less power supply and FlexBays for lockable, hot-swappable drives are also included.

For users needing high-security, remotely accessible 1:1 workstation performance, the updated Dell Precision 7920 rack workstation delivers the same performance and scalability of the Dell Precision 7920 tower in a 2U rack form factor. This rack workstation is targeted to OEMs and users who need to locate their compute resources and valuable data in central environments. This option can save space and help reduce noise and heat, while providing secure remote access to external employees and contractors.

Configuration options will include the recently announced 2nd Gen Intel Xeon Scalable processors, built for advanced workstation professionals, with up to 28 cores, 56 threads and 3TB DDR4 RDIMM per socket. The workstations will also support Intel Deep Learning Boost, a new set of Intel AVX-512 instructions.
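For context, Intel Deep Learning Boost (its AVX-512 VNNI instructions) accelerates the kind of low-precision arithmetic sketched below: 8-bit multiplies accumulated into 32-bit integers, the core operation of quantized neural-network inference. This NumPy snippet only illustrates the math being hardware-accelerated; it is not Intel's API, and the array sizes are arbitrary.

```python
# The operation DL Boost/VNNI fuses: int8 multiplies accumulated into int32.
import numpy as np

weights = np.random.randint(-128, 127, size=1024, dtype=np.int8)
activations = np.random.randint(0, 255, size=1024, dtype=np.uint8)

# Widen to int32 before multiplying so the accumulation doesn't overflow.
dot = np.dot(weights.astype(np.int32), activations.astype(np.int32))
print(dot)
```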

The Precision 7000 Series workstations will be available in May with high-performance storage capacity options, including up to 120TB/96TB of Enterprise SATA HDD and up to 16TB of PCIe NVMe SSDs.


Video: Machine learning with Digital Domain’s Doug Roble

Just prior to NAB, postPerspective’s Randi Altman caught up with Digital Domain’s senior director of software R&D, Doug Roble, to talk machine learning.

Roble is on a panel on the Monday of NAB 2019 called “Influencers in AI: Companies Accelerating the Future.” It’s being moderated by Google’s technical director for media, Jeff Kember, and features Roble along with Autodesk’s Evan Atherton, Nvidia’s Rick Champagne, Warner Bros.’ Greg Gewickey and Story Tech/Television Academy’s Lori Schwartz.

In our conversation with Roble, he talks about how Digital Domain has been using machine learning in visual effects for a couple of years. He points to the Avengers films and the character Thanos, which the studio worked on.

A lot of that character’s facial motion was done with a variety of machine learning techniques. Since then, Digital Domain has pushed that technology further, taking the machine learning aspect and putting it on realtime digital humans — including Doug Roble.

Watch our conversation and find out more…


Nvidia intros Turing-powered Titan RTX

Nvidia has introduced the Titan RTX, a desktop GPU that provides the kind of massive performance needed for creative applications, AI research and data science. Driven by the new Nvidia Turing architecture, Titan RTX, dubbed T-Rex, delivers 130 teraflops of deep learning performance and 11 GigaRays per second of raytracing performance.

Turing features new RT Cores to accelerate raytracing, plus new multi-precision Tensor Cores for AI training and inferencing. These two engines — along with more powerful compute and enhanced rasterization — will help speed the work of developers, designers and artists across multiple industries.

Designed for computationally demanding applications, Titan RTX combines AI, realtime raytraced graphics, next-gen virtual reality and high-performance computing. It offers the following features and capabilities:
• 576 multi-precision Turing Tensor Cores, providing up to 130 Teraflops of deep learning performance
• 72 Turing RT Cores, delivering up to 11 GigaRays per second of realtime raytracing performance
• 24GB of high-speed GDDR6 memory with 672GB/s of bandwidth — two times the memory of previous-generation Titan GPUs — to fit larger models and datasets
• 100GB/s Nvidia NVLink, which can pair two Titan RTX GPUs to scale memory and compute
• Performance and memory bandwidth sufficient for realtime 8K video editing
• VirtualLink port, which provides the performance and connectivity required by next-gen VR headsets

Titan RTX provides multi-precision Turing Tensor Cores for breakthrough performance from FP32, FP16, INT8 and INT4, allowing faster training and inference of neural networks. It offers twice the memory capacity of previous-generation Titan GPUs, along with NVLink to allow researchers to experiment with larger neural networks and datasets.

Titan RTX accelerates data analytics with RAPIDS. RAPIDS open-source libraries integrate seamlessly with the world’s most popular data science workflows to speed up machine learning.
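To give a sense of what that integration looks like, here is a minimal, hypothetical cuDF snippet; the file and column names are made up. cuDF exposes a pandas-like API, so a CSV load and a groupby run on the GPU with essentially the same code a data scientist would already write.

```python
# GPU-accelerated dataframe work with RAPIDS cuDF (pandas-like API).
import cudf

df = cudf.read_csv("render_logs.csv")                  # hypothetical dataset
per_shot = df.groupby("shot_id")["render_ms"].mean()   # computed on the GPU
print(per_shot.head())
```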

Titan RTX will be available later in December in the US and Europe for $2,499.


Panasas’ new ActiveStor Ultra targets emerging apps: AI, VR

Panasas has introduced ActiveStor Ultra, the next generation of its high-performance computing storage solution, featuring PanFS 8, a plug-and-play, portable, parallel file system. ActiveStor Ultra offers up to 75GB/s per rack on industry-standard commodity hardware.

ActiveStor Ultra comes as a fully integrated plug-and-play appliance running PanFS 8 on industry-standard hardware. PanFS 8 is the completely re-engineered Panasas parallel file system, which now runs on Linux and features intelligent data placement across three tiers of media — metadata on non-volatile memory express (NVMe), small files on SSDs and large files on HDDs — resulting in optimized performance for all data types.
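Conceptually, that placement policy amounts to routing each object to the medium that suits its access pattern. The sketch below is only an illustration of the idea, not PanFS 8's actual logic, and the size cutoff is an assumption:

```python
# Toy illustration of tiered placement: metadata -> NVMe, small files -> SSD, large -> HDD.
SMALL_FILE_LIMIT = 64 * 1024  # hypothetical cutoff in bytes

def choose_tier(is_metadata: bool, size_bytes: int) -> str:
    if is_metadata:
        return "NVMe"   # lowest latency for metadata operations
    if size_bytes <= SMALL_FILE_LIMIT:
        return "SSD"    # small files benefit from fast random access
    return "HDD"        # large files stream efficiently from spinning disk

print(choose_tier(False, 4 * 1024 ** 3))  # -> "HDD"
```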

ActiveStor Ultra is designed to support the complex and varied data sets associated with traditional HPC workloads and emerging applications, such as artificial intelligence (AI), autonomous driving and virtual reality (VR). ActiveStor Ultra’s modular architecture and building-block design enables enterprises to start small and scale linearly. With dock-to-data in one hour, ActiveStor Ultra offers fast data access and virtually eliminates manual intervention to deliver the lowest total cost of ownership (TCO).

ActiveStor Ultra will be available early in the second half of 2019.


Video Coverage: postPerspective Live from SMPTE 2018

The yearly SMPTE Technical Conference and Exhibition was held late last month in Downtown Los Angeles at the Westin Bonaventure Hotel, a new venue for the event.

The conference included presentations that touched on all three of the organization’s “pillars,” which are Standards, Education and Membership.

One of the highlights was a session on autonomous vehicles and how AI and machine learning are making that happen. You might wonder, “What will everyone do with that extra non-driving time?” Well, companies are already thinking of ways to entertain you while you’re on your way to where you need to go. The schedule of sessions and presentations can be found here.

Another highlight at this year’s SMPTE Conference was the Women in Technology lunch, which featured a conversation between Disney’s Kari Grubin and Fem Inc.’s Rachel Payne. Payne is a tech entrepreneur and technology executive who has worked at companies like Google. She was also a Democratic candidate for the 48th Congressional District of California. It was truly inspiring to hear about her path.

Feeling like you might have missed some cool stuff? Well don’t worry, postPerspective’s production crews were capturing interviews with manufacturers in the exhibit hall and with speakers, SMPTE members and so many others throughout the Conference.

A big thank you to AlphaDogs, who shot and posted our videos this year, as well as our other sponsors: Blackmagic Design, The Studio – B&H, LitePanels and Lenovo.

Watch Here!


Quick Chat: AI-based audio mastering

Antoine Rotondo is an audio engineer by trade who has been in the business for the past 17 years. Throughout his career he’s worked in audio across music, film and broadcast, focusing on sound reproduction. After completing college studies in sound design, undergraduate studies in music and music technology, as well as graduate studies in sound recording at McGill University in Montreal, Rotondo went on to work in recording, mixing, producing and mastering.

He is currently an audio engineer at Landr.com, which has released Landr Audio Mastering for Video, a tool that provides professional video editors with AI-based audio mastering capabilities in Adobe Premiere Pro CC.

As an audio engineer how do you feel about AI tools to shortcut the mastering process?
Well, first, there’s a myth that AI and machines can’t possibly make valid decisions in the creative process in a consistent way. There’s actually a huge intersection between artistic intentions and technical solutions where we find many patterns: places where people tend to agree and go about things very similarly, often unknowingly. We’ve been building technology around that.

Truth be told, there are many tasks in audio mastering that are repetitive and that people don’t necessarily like spending a lot of time on, such as leveling dialogue, music and background elements across multiple segments, or dealing with noise. Everyone’s job gets easier when those tasks become automated.
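To make “leveling” concrete, here is a rough example of the kind of step that can be automated: measuring a clip’s integrated loudness and normalizing it to a target, using the open-source pyloudnorm and soundfile libraries. This is a generic EBU R128-style pass, not Landr’s engine, and the file names and -23 LUFS target are assumptions.

```python
# Measure integrated loudness and normalize a clip to a target level.
import soundfile as sf
import pyloudnorm as pyln

data, rate = sf.read("dialogue_stem.wav")        # hypothetical stem
meter = pyln.Meter(rate)                         # EBU R128 / ITU-R BS.1770 meter
loudness = meter.integrated_loudness(data)       # measured in LUFS
leveled = pyln.normalize.loudness(data, loudness, -23.0)  # bring to -23 LUFS
sf.write("dialogue_stem_leveled.wav", leveled, rate)
```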

I see innovation in AI-driven audio mastering as a way to make creators more productive and efficient — not to replace them. It’s now more accessible than ever for amateur and aspiring producers and musicians to learn about mastering and have the resources to professionally polish their work. I think the same will apply to videographers.

What’s the key to making video content sound great?
Great sound quality is effortless and sounds as natural as possible. It’s about creating an experience that keeps the viewer engaged and entertained. It’s also about great communication — delivering a message to your audience and even conveying your artistic vision — all this to impact your audience in the way you intended.

More specifically, audio shouldn’t unintentionally sound muffled, distorted, noisy or erratic. Dialogue and music should shine through. Viewers should never need to change the volume or rewind the content to play something back during the program.

When are the times you’d want to hire an audio mastering engineer and when are the times that projects could solely use an AI-engine for audio mastering?
Mastering engineers are especially important for extremely intricate artistic projects that require direct communication with a producer or artist, including long-form narrative, feature films, television series and also TV commercials. Any project with conceptual sound design will almost always require an engineer to perfect the final master.

Users can truly benefit from AI-driven mastering in short-form, non-fiction projects that require clean dialogue, reduced background noise and overall leveling. Quick-turnaround projects can also use AI mastering to elevate the audio to a more professional level, even when deadlines are tight. AI mastering can now insert itself into the offline creation process, where multiple revisions of a project are sent back and forth, making great sound accessible throughout the entire production cycle.

The other thing to consider is that AI mastering is a great option for video editors who don’t have technical audio expertise themselves, and where lower budgets translate into them having to work on their own. These editors could purchase purpose-built mastering plugins, but they don’t necessarily have the time to learn how to really take advantage of these tools. And even if they did have the time, some would prefer to focus more on all the other aspects of the work that they have to juggle.


AI for M&E: Should you take the leap?

By Nick Gold

In Hollywood, the promise of artificial intelligence is all the rage. Who wouldn’t want a technology that adds the magic of AI to smarter computers for an instant solution to tedious, time-intensive problems? With artificial intelligence, anyone with abundant rich media assets can easily churn out more revenue or cut costs, while simplifying operations … or so we’re told.

If you attended IBC, you probably already heard the pitch: “It’s an ‘easy’ button that’s simple to add to the workflow and foolproof to operate, turning your massive amounts of uncategorized footage into metadata.”

But should you take the leap? Before you sign on the dotted line, take a closer look at the technology behind AI and what it can — and can’t — do for you.

First, it’s important to understand the bigger picture of artificial intelligence in today’s marketplace. Taking unstructured data and generating relevant metadata from it is something that other industries have been doing for some time. In fact, many of the tools we embrace today started off in other industries. But unlike banking, finance or healthcare, our industry prioritizes creativity, which is why we have always shied away from tools that automate. The idea that we can rely on the same technology as a hedge fund manager just doesn’t sit well with many people in our industry, and for good reason.

Nick Gold talks AI for a UCLA Annex panel.

In the media and entertainment industry, we’re looking for various types of metadata that could include a transcript of spoken words, important events within a period of time or information about the production (e.g., people, location, props), and currently there’s no single machine-learning algorithm that will solve for all these types of metadata parameters. For that reason, the best starting point is to define your problems and identify which machine learning tools may be able to solve them. Expecting to parse reams of untagged, uncategorized and unstructured media data is unrealistic until you know what you’re looking for.

What works for M&E?
AI has become pretty good at solving some specific problems for our industry. Speech-to-text is one of them. With AI, generating a generally accurate transcription is an automated step that saves real time. However, it’s important to note that AI tools still have limitations. An AI tool known as “sentiment analysis” could theoretically look for the emotional undertones in spoken words, but it first requires another tool to generate a transcript for analysis.
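As a toy example of that chaining, the sketch below transcribes a clip with the open-source SpeechRecognition library and then scores the text with NLTK’s VADER sentiment analyzer. It is only meant to show the dependency described here: the sentiment step has nothing to work with until a transcript exists. The file name is a placeholder, and these are stand-in tools, not the specific engines a vendor would sell.

```python
# Step 1: speech-to-text. Step 2: sentiment analysis on the resulting transcript.
import speech_recognition as sr
from nltk.sentiment.vader import SentimentIntensityAnalyzer  # needs nltk.download("vader_lexicon")

recognizer = sr.Recognizer()
with sr.AudioFile("interview_clip.wav") as source:   # hypothetical audio file
    audio = recognizer.record(source)
transcript = recognizer.recognize_google(audio)      # cloud engine; others plug in here

scores = SentimentIntensityAnalyzer().polarity_scores(transcript)
print(scores)  # e.g. {'neg': ..., 'neu': ..., 'pos': ..., 'compound': ...}
```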

But no matter how good the algorithms are, they won’t give you the qualitative data that a human observer would provide, such as the emotions expressed through body language. They won’t tell you the facial expressions of the people being spoken to, or the tone of voice, pacing and volume level of the speaker, or what is conveyed by a sarcastic tone or a wry expression. There are sentiment analysis engines that try to do this, but breaking down the components ensures the parameters you need will be addressed and solved.

Another task at which machine learning has progressed significantly is logo recognition. Certain engines are good at finding, for example, all the images with a Coke logo in 10,000 hours of video. That’s impressive and quite useful, but it’s another story if you want to also find footage of two people drinking what are clearly Coke-shaped bottles where the logo is obscured. That’s because machine-learning engines tend to have a narrow focus, which goes back to the need to define very specifically what you hope to get from it.

There are a bevy of algorithms and engines out there. If you license a service that will find a specific logo, then you haven’t solved your problem for finding objects that represent the product as well. Even with the right engine, you’ve got to think about how this information fits in your pipeline, and there are a lot of workflow questions to be explored.

Let’s say you’ve generated speech-to-text with audio media, but have you figured out how someone can search the results? There are several options. Sometimes vendors have their own front end for searching. Others may offer an export option from one engine into a MAM that you either already have on-premise or plan to purchase. There are also vendors that don’t provide machine learning themselves but act as a third-party service organizing the engines.

It’s important to remember that none of these AI solutions are accurate all the time. You might get a nudity detection filter, for example, but these vendors rely on probabilistic results. If having one nude image slip through is a huge problem for your company, then machine learning alone isn’t the right solution for you. It’s important to understand whether occasional inaccuracies will be acceptable or deal breakers for your company. Testing samples of your core content in different scenarios for which you need to solve becomes another crucial step. And many vendors are happy to test footage in their systems.

Although machine learning is still in its nascent stages, there is a lot of interest in learning how to make it work in the media workflow. It can do some magical things, but it’s not a magic “easy” button (yet, anyway). Exploring the options and understanding in detail what you need goes hand-in-hand with finding the right solution to integrate with your workflow.


Nick Gold is lead technologist for Baltimore’s Chesapeake Systems, which specializes in M&E workflows and solutions for the creation, distribution and preservation of content. Active in both SMPTE and the Association of Moving Image Archivists (AMIA), Gold speaks on a range of topics. He also co-hosts the Workflow Show Podcast.
 

Adobe updates Creative Cloud

By Brady Betzel

You know it’s almost fall when pumpkin spice lattes are back and Adobe announces its annual updates. At this year’s IBC, Adobe had a variety of updates to its Creative Cloud line of apps. From more info on their new editing platform Project Rush to the addition of Characterizer to Character Animator, there are a lot of updates, so I’m going to focus on a select few that I think really stand out.

Project Rush

I use Adobe Premiere quite a lot these days; it’s quick and relatively easy to use and will work with pretty much every codec in the universe. In addition, the Dynamic Link between Adobe Premiere Pro and Adobe After Effects is an indispensable feature in my world.

With the 2018 fall updates, Adobe Premiere moves closer to a color tool like Blackmagic’s Resolve thanks to the addition of new hue saturation curves in the Lumetri Color toolset. In Resolve, these are some of the most important aspects of the color corrector, and I think the same will be true in Premiere. From Hue vs. Sat, which can help isolate a specific color and desaturate it, to Hue vs. Luma, which can help add or subtract brightness values from specific hues and hue ranges, these new color correcting tools further Premiere’s venture into true professional color correction. These new curves will also be available inside of After Effects.
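To show what a Hue vs. Sat curve is doing under the hood, here is a rough OpenCV sketch that isolates one hue band and pulls its saturation down. Lumetri does this with smooth curves and far more control, so treat this as a conceptual illustration only; the file name and the hue band around red are assumptions.

```python
# Hue-selective desaturation: the idea behind a Hue vs. Sat curve.
import cv2
import numpy as np

frame = cv2.imread("frame.png")                                   # hypothetical still
hsv = cv2.cvtColor(frame, cv2.COLOR_BGR2HSV).astype(np.float32)
h, s, v = cv2.split(hsv)

# OpenCV hue runs 0-179; select a band around red and cut its saturation by 80%.
red_band = ((h < 10) | (h > 170)).astype(np.float32)
s = s * (1.0 - 0.8 * red_band)

out = cv2.cvtColor(cv2.merge([h, s, v]).astype(np.uint8), cv2.COLOR_HSV2BGR)
cv2.imwrite("frame_hue_desat.png", out)
```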

After Effects features many updates, but my favorites are the ability to access depth matte data of 3D elements and the addition of the new JavaScript engine for building expressions.

There is one update that runs across both Premiere and After Effects that seems to be a sleeper update. The improvements to motion graphics templates, if implemented correctly, could be a time and creativity saver for both artists and editors.

AI
Adobe, like many other companies, seems to be diving heavily into the “AI” pool, which is amazing, but… with great power comes great responsibility. I realize others might not feel this way, but sometimes I don’t want all the work done for me. With new features like Auto Lip Sync and Color Match, editors and creators of all kinds shouldn’t lose sight of the forest for the trees. I’m not telling people to ignore these features; I’m asking that they put a few minutes into discovering how the color of a shot was matched, so they can fix it if something goes wrong. You don’t want to be the editor who says, “Premiere did it,” and then has no good solution when something does go wrong.

What Else?
I would love to see Adobe take a stab at digging up the bones of SpeedGrade and integrating that into the Premiere Pro world as a new tab. Call it Lumetri Grade, or whatever? A page with a more traditional colorist layout and clip organization would go a long way.

In the end, there are plenty of other updates to Adobe’s 2018 Creative Cloud apps, and you can read their blog to find out about other updates.