NVIDIA Blackwell Sets New Standard for Generative AI in MLPerf Inference Debut

28/08/2024

As enterprises race to adopt generative AI and bring new services to market, the demands on data center infrastructure have never been greater. Training large language models is one challenge, but delivering LLM-powered real-time services is another.

In the latest round of MLPerf industry benchmarks, Inference v4.1, NVIDIA platforms delivered leading performance across all data center tests. The first-ever submission of the upcoming NVIDIA Blackwell platform revealed up to 4x more performance than the NVIDIA H100 Tensor Core GPU on MLPerf's biggest LLM workload, Llama 2 70B, thanks to its use of a second-generation Transformer Engine and FP4 Tensor Cores.

The NVIDIA H200 Tensor Core GPU delivered outstanding results on every benchmark in the data center category - including the latest addition to the benchmark, the Mixtral 8x7B mixture of experts (MoE) LLM, which features a total of 46.7 billion parameters, with 12.9 billion parameters active per token.

MoE models have gained popularity as a way to bring more versatility to LLM deployments, as they're capable of answering a wide variety of questions and performing more diverse tasks in a single deployment. They're also more efficient since they only activate a few experts per inference - meaning they deliver results much faster than dense models of a similar size.
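The routing idea behind that efficiency can be illustrated with a minimal sketch. It is not Mixtral's actual implementation: the toy scalar "experts", the router scores, and the choice of eight experts with two selected per token are assumptions made purely for illustration. Only the 46.7 billion and 12.9 billion parameter counts come from the figures quoted above.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def route_top_k(router_scores, k=2):
    """Return (expert_index, weight) pairs for the k highest-scoring experts."""
    top = sorted(range(len(router_scores)),
                 key=lambda i: router_scores[i], reverse=True)[:k]
    # Renormalize so the selected experts' weights sum to 1.
    weights = softmax([router_scores[i] for i in top])
    return list(zip(top, weights))


def moe_forward(x, experts, router_scores, k=2):
    """Run only the selected experts and mix their outputs by router weight."""
    return sum(w * experts[i](x) for i, w in route_top_k(router_scores, k))


# Hypothetical toy setup: 8 experts, each a simple scalar function.
experts = [lambda x, s=s: s * x for s in range(1, 9)]
# Hypothetical router scores for one token (one score per expert).
router_scores = [0.1, 2.0, -1.0, 0.5, 3.0, 0.0, -0.5, 1.0]

selected = route_top_k(router_scores, k=2)
output = moe_forward(1.0, experts, router_scores, k=2)

# Parameter counts quoted in the article, in billions.
TOTAL_PARAMS_B = 46.7
ACTIVE_PARAMS_B = 12.9
active_fraction = ACTIVE_PARAMS_B / TOTAL_PARAMS_B

print("selected experts:", [i for i, _ in selected])
print(f"active parameter fraction: {active_fraction:.1%}")
```

In this sketch six of the eight experts are never evaluated for the token. That is the source of the savings: by the article's figures, roughly 28% of the model's parameters take part in computing each token.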

The continued growth of LLMs is driving the need for more compute to process inference requests. To meet real-time latency requirements for serving today's LLMs, and to do so for as many users as possible, multi-GPU compute is a must. NVIDIA NVLink and NVSwitch provide high-bandwidth communication between GPUs based on the NVIDIA Hopper architecture and provide significant benefits for real-time, cost-effective large model inference. The Blackwell platform will further extend NVLink Switch's capabilities by supporting larger NVLink domains of 72 GPUs.

In addition to the NVIDIA submissions, 10 NVIDIA partners - ASUSTek, Cisco, Dell Technologies, Fujitsu, Giga Computing, Hewlett Packard Enterprise (HPE), Juniper Networks, Lenovo, Quanta Cloud Technology and Supermicro - all made solid MLPerf Inference submissions, underscoring the wide availability of NVIDIA platforms.

Relentless Software Innovation

NVIDIA platforms undergo continuous software development, racking up performance and feature improvements on a monthly basis.

In the latest inference round, NVIDIA offerings, including the NVIDIA Hopper architecture, NVIDIA Jetson platform and NVIDIA Triton Inference Server, saw significant performance gains.

The NVIDIA H200 GPU delivered up to 27% more generative AI inference performance over the previous round, underscoring the added value customers get over time from their investment in the NVIDIA platform.

Triton Inference Server, part of the NVIDIA AI platform and available with NVIDIA AI Enterprise software, is a fully featured open-source inference server that helps organizations consolidate framework-specific inference servers into a single, unified platform. This helps lower the total cost of ownership of serving AI models in production and cuts model deployment times from months to minutes.

In this round of MLPerf, Triton Inference Server delivered near-equal performance to NVIDIA's bare-metal submissions, showing that organizations no longer have to choose between using a feature-rich production-grade AI inference server and achieving peak throughput performance.

Going to the Edge

Deployed at the edge, generative AI models can transform sensor data, such as images and videos, into real-time, actionable insights with strong contextual awareness. The NVIDIA Jetson platform for edge AI and robotics is uniquely capable of running any kind of model locally, including LLMs, vision transformers and Stable Diffusion.

In this round of MLPerf benchmarks, NVIDIA Jetson AGX Orin system-on-modules achieved more than a 6.2x throughput improvement and 2.4x latency improvement over the previous round on the GPT-J LLM workload. Rather than developing for a specific use case, developers can now use this general-purpose 6-billion-parameter model to seamlessly interface with human language, transforming generative AI at the edge.

Performance Leadership All Around

This round of MLPerf Inference showed the versatility and leading performance of NVIDIA platforms - extending from the data center to the edge - on all of the benchmark's workloads, supercharging the most innovative AI-powered applications and services. To learn more about these results, see our technical blog.

H200 GPU-powered systems are available today from CoreWeave - the first cloud service provider to announce general availability - and server makers ASUS, Dell Technologies, HPE, QCT and Supermicro.

See notice regarding software product information.
LINK: https://blogs.nvidia.com/blog/mlperf-inference-benchmark-blackwell/...
