NVIDIA Blackwell Sets New Standard for Generative AI in MLPerf Inference Debut
28/08/2024
In the latest round of MLPerf industry benchmarks, Inference v4.1, NVIDIA platforms delivered leading performance across all data center tests. The first-ever submission of the upcoming NVIDIA Blackwell platform revealed up to 4x more performance than the NVIDIA H100 Tensor Core GPU on MLPerf's biggest LLM workload, Llama 2 70B, thanks to its use of a second-generation Transformer Engine and FP4 Tensor Cores.
The NVIDIA H200 Tensor Core GPU delivered outstanding results on every benchmark in the data center category - including the latest addition to the benchmark, the Mixtral 8x7B mixture of experts (MoE) LLM, which features a total of 46.7 billion parameters, with 12.9 billion parameters active per token.
MoE models have gained popularity as a way to bring more versatility to LLM deployments, as they're capable of answering a wide variety of questions and performing more diverse tasks in a single deployment. They're also more efficient since they only activate a few experts per inference - meaning they deliver results much faster than dense models of a similar size.
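To make the "few experts per inference" idea concrete, here is a minimal sketch of top-k expert routing in plain Python. It is an illustration only, not NVIDIA's or Mixtral's implementation. The eight toy experts, the choice of two experts per token, and the router scores are assumptions made for the example. Only the 46.7 billion and 12.9 billion parameter counts come from the article.

```python
import math

def softmax(scores):
    """Convert raw router scores into probabilities."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def route_top_k(router_logits, k=2):
    """Pick the k highest-scoring experts and renormalize their weights."""
    probs = softmax(router_logits)
    top = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]
    norm = sum(probs[i] for i in top)
    return [(i, probs[i] / norm) for i in top]

def moe_layer(x, experts, router_logits, k=2):
    """Run only the selected experts and blend their outputs by weight."""
    return sum(w * experts[i](x) for i, w in route_top_k(router_logits, k))

# Hypothetical toy experts: expert i simply multiplies its input by (i + 1).
experts = [lambda x, scale=scale: scale * x for scale in range(1, 9)]

# Made-up router scores for a single token (assumption, not real model data).
router_logits = [0.1, 2.0, -1.0, 0.3, 1.5, 0.0, -0.5, 0.2]

selected = route_top_k(router_logits)            # two (expert index, weight) pairs
output = moe_layer(3.0, experts, router_logits)  # the other six experts never run

# Share of parameters active per token, using the figures quoted in the article.
active_fraction = 12.9 / 46.7
print(selected, output, round(active_fraction, 3))
```

The six unselected experts are never evaluated, which is where the efficiency gain comes from: by the article's figures, only a bit more than a quarter of the model's parameters are active for any one token.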
The continued growth of LLMs is driving the need for more compute to process inference requests. To meet real-time latency requirements for serving today's LLMs, and to do so for as many users as possible, multi-GPU compute is a must. NVIDIA NVLink and NVSwitch provide high-bandwidth communication between GPUs based on the NVIDIA Hopper architecture and deliver significant benefits for real-time, cost-effective large model inference. The Blackwell platform will further extend NVLink Switch's capabilities with larger NVLink domains of 72 GPUs.
In addition to the NVIDIA submissions, 10 NVIDIA partners - ASUSTek, Cisco, Dell Technologies, Fujitsu, Giga Computing, Hewlett Packard Enterprise (HPE), Juniper Networks, Lenovo, Quanta Cloud Technology and Supermicro - all made solid MLPerf Inference submissions, underscoring the wide availability of NVIDIA platforms.
Relentless Software Innovation
NVIDIA platforms undergo continuous software development, racking up performance and feature improvements on a monthly basis.
In the latest inference round, NVIDIA offerings, including the NVIDIA Hopper architecture, NVIDIA Jetson platform and NVIDIA Triton Inference Server, posted substantial performance gains.
The NVIDIA H200 GPU delivered up to 27% more generative AI inference performance over the previous round, underscoring the added value customers get over time from their investment in the NVIDIA platform.
Triton Inference Server, part of the NVIDIA AI platform and available with NVIDIA AI Enterprise software, is a fully featured open-source inference server that helps organizations consolidate framework-specific inference servers into a single, unified platform. This helps lower the total cost of ownership of serving AI models in production and cuts model deployment times from months to minutes.
In this round of MLPerf, Triton Inference Server delivered near-equal performance to NVIDIA's bare-metal submissions, showing that organizations no longer have to choose between using a feature-rich production-grade AI inference server and achieving peak throughput performance.
Going to the Edge
Deployed at the edge, generative AI models can transform sensor data, such as images and videos, into real-time, actionable insights with strong contextual awareness. The NVIDIA Jetson platform for edge AI and robotics is uniquely capable of running any kind of model locally, including LLMs, vision transformers and Stable Diffusion.
In this round of MLPerf benchmarks, NVIDIA Jetson AGX Orin system-on-modules achieved more than a 6.2x throughput improvement and 2.4x latency improvement over the previous round on the GPT-J LLM workload. Rather than developing for a specific use case, developers can now use this general-purpose 6-billion-parameter model to seamlessly interface with human language, transforming generative AI at the edge.
Performance Leadership All Around
This round of MLPerf Inference showed the versatility and leading performance of NVIDIA platforms - extending from the data center to the edge - on all of the benchmark's workloads, supercharging the most innovative AI-powered applications and services. To learn more about these results, see our technical blog.
H200 GPU-powered systems are available today from CoreWeave - the first cloud service provider to announce general availability - and server makers ASUS, Dell Technologies, HPE, QCT and Supermicro.
LINK: https://blogs.nvidia.com/blog/mlperf-inference-benchmark-blackwell/...