
NVIDIA Research at ICLR - Pioneering the Next Wave of Multimodal Generative AI

24/04/2025

Advancing AI requires a full-stack approach, with a powerful foundation of computing infrastructure - including accelerated processors and networking technologies - connected to optimized compilers, algorithms and applications.

NVIDIA Research is innovating across this spectrum, supporting virtually every industry in the process. At this week's International Conference on Learning Representations (ICLR), taking place April 24-28 in Singapore, more than 70 NVIDIA-authored papers introduce AI developments with applications in autonomous vehicles, healthcare, multimodal content creation, robotics and more.

"ICLR is one of the world's most impactful AI conferences, where researchers introduce important technical innovations that move every industry forward," said Bryan Catanzaro, vice president of applied deep learning research at NVIDIA. "The research we're contributing this year aims to accelerate every level of the computing stack to amplify the impact and utility of AI across industries."

Research That Tackles Real-World Challenges

Several NVIDIA-authored papers at ICLR cover groundbreaking work in multimodal generative AI and novel methods for AI training and synthetic data generation, including:

Fugatto: The world's most flexible audio generative AI model, Fugatto generates or transforms any mix of music, voices and sounds described with prompts using any combination of text and audio files. Other NVIDIA models at ICLR improve audio large language models (LLMs) to better understand speech.

HAMSTER: This paper demonstrates that a hierarchical design for vision-language-action models can improve their ability to transfer knowledge from off-domain fine-tuning data - inexpensive data that doesn't need to be collected on actual robot hardware - to improve a robot's skills in testing scenarios.

Hymba: This family of small language models uses a hybrid model architecture to create LLMs that blend the benefits of transformer models and state space models, enabling high-resolution recall, efficient context summarization and common-sense reasoning tasks. With its hybrid approach, Hymba improves throughput by 3x and reduces cache by almost 4x without sacrificing performance.

LongVILA: This training pipeline enables efficient visual language model training and inference for long video understanding. Training AI models on long videos is compute- and memory-intensive, so this paper introduces a system that efficiently parallelizes long video training and inference, with training scalability up to 2 million tokens on 256 GPUs. LongVILA achieves state-of-the-art performance across nine popular video benchmarks.

LLaMaFlex: This paper introduces a new zero-shot generation technique to create a family of compressed LLMs based on one large model. The researchers found that LLaMaFlex can generate compressed models that are as accurate as or better than state-of-the-art pruned, flexible and trained-from-scratch models - a capability that could be applied to significantly reduce the cost of training model families compared to techniques like pruning and knowledge distillation.

Proteina: This model can generate diverse and designable protein backbones, the framework that holds a protein together. It uses a transformer model architecture with up to 5x as many parameters as previous models.

SRSA: This framework addresses the challenge of teaching robots new tasks using a preexisting skill library - so instead of learning from scratch, a robot can apply and adapt its existing skills to the new task. By developing a framework to predict which preexisting skill would be most relevant to a new task, the researchers were able to improve zero-shot success rates on unseen tasks by 19%.

STORM: This model can reconstruct dynamic outdoor scenes - like cars driving or trees swaying in the wind - with a precise 3D representation inferred from just a few snapshots. The model, which can reconstruct large-scale outdoor scenes in 200 milliseconds, has potential applications in autonomous vehicle development.

Discover the latest work from NVIDIA Research, a global team of around 400 experts in fields including computer architecture, generative AI, graphics, self-driving cars and robotics.
LINK: https://blogs.nvidia.com/blog/ai-research-iclr-2025/...
