How GeForce RTX 50 Series GPUs Are Built to Supercharge Generative AI on PCs
05/02/2025
These GPUs were built to accelerate the latest generative AI workloads, delivering up to 3,352 AI trillion operations per second (TOPS), enabling incredible experiences for AI enthusiasts, gamers, creators and developers.
To help AI developers and enthusiasts harness these capabilities, NVIDIA at the CES trade show last month unveiled NVIDIA NIM and AI Blueprints for RTX. NVIDIA NIM microservices are prepackaged generative AI models that let developers and enthusiasts easily get started with generative AI, iterate quickly and harness the power of RTX for accelerating AI on Windows PCs. NVIDIA AI Blueprints are reference projects that show developers how to use NIM microservices to build the next generation of AI experiences.
NIM and AI Blueprints are optimized for GeForce RTX 50 Series GPUs. These technologies work together seamlessly to help developers and enthusiasts build, iterate and deliver cutting-edge AI experiences on AI PCs.
NVIDIA NIM Accelerates Generative AI on PCs While AI model development is rapidly advancing, bringing these innovations to PCs remains a challenge for many people. Models posted on platforms like Hugging Face must be curated, adapted and quantized to run on PC. They also need to be integrated into new AI application programming interfaces (APIs) to ensure compatibility with existing tools, and converted to optimized inference backends for peak performance.
NVIDIA NIM microservices for RTX AI PCs and workstations can ease the complexity of this process by providing access to community-driven and NVIDIA-developed AI models. These microservices are easy to download and connect to via industry-standard APIs and span the key modalities essential for AI PCs. They are also compatible with a wide range of AI tools and offer flexible deployment options, whether on PCs, in data centers, or in the cloud.
NIM microservices include everything needed to run optimized models on PCs with RTX GPUs, including prebuilt engines for specific GPUs, the NVIDIA TensorRT software development kit (SDK), the open-source NVIDIA TensorRT-LLM library for accelerated inference using Tensor Cores, and more.
Microsoft and NVIDIA worked together to enable NIM microservices and AI Blueprints for RTX in Windows Subsystem for Linux (WSL2). With WSL2, the same AI containers that run on data center GPUs can now run efficiently on RTX PCs, making it easier for developers to build, test and deploy AI models across platforms.
In addition, NIM and AI Blueprints harness key innovations of the Blackwell architecture that the GeForce RTX 50 series is built on, including fifth-generation Tensor Cores and support for FP4 precision.
Tensor Cores Drive Next-Gen AI Performance AI calculations are incredibly demanding and require vast amounts of processing power. Whether generating images and videos or understanding language and making real-time decisions, AI models rely on hundreds of trillions of mathematical operations to be completed every second. To keep up, computers need specialized hardware built specifically for AI.
NVIDIA GeForce RTX desktop GPUs deliver up to 3,352 AI TOPS for unmatched speed and efficiency in AI-powered workflows. In 2018, NVIDIA GeForce RTX GPUs changed the game by introducing Tensor Cores - dedicated AI processors designed to handle these intensive workloads. Unlike traditional computing cores, Tensor Cores are built to accelerate AI by performing calculations faster and more efficiently. This breakthrough helped bring AI-powered gaming, creative tools and productivity applications into the mainstream.
Blackwell architecture takes AI acceleration to the next level. The fifth-generation Tensor Cores in Blackwell GPUs deliver up to 3,352 AI TOPS to handle even more demanding AI tasks and simultaneously run multiple AI models. This means faster AI-driven experiences, from real-time rendering to intelligent assistants, that pave the way for greater innovation in gaming, content creation and beyond.
FP4 - Smaller Models, Bigger Performance Another way to optimize AI performance is through quantization, a technique that reduces model sizes, enabling the models to run faster while reducing the memory requirements.
Enter FP4 - an advanced quantization format that allows AI models to run faster and leaner without compromising output quality. Compared with FP16, it reduces model size by up to 60% and more than doubles performance, with minimal degradation.
For example, Black Forest Labs' FLUX.1 [dev] model at FP16 requires over 23GB of VRAM, meaning it can only be supported by the GeForce RTX 4090 and professional GPUs. With FP4, FLUX.1 [dev] requires less than 10GB, so it can run locally on more GeForce RTX GPUs.
On a GeForce RTX 4090 with FP16, the FLUX.1 [dev] model can generate images in 15 seconds with just 30 steps. With a GeForce RTX 5090 with FP4, images can be generated in just over five seconds.
FP4 is natively supported by the Blackwell architecture, making it easier than ever to deploy high-performance AI on local PCs. It's also integrated into NIM microservices, effectively optimizing models that were previously difficult to quantize. By enabling more efficient AI processing, FP4 helps to bring faster, smarter AI experiences for content creation.
AI Blueprints Power Advanced AI Workflows on RTX PCs NVIDIA AI Blueprints, built on NIM microservices, provide prepackaged, optimized reference implementations that make it easier to develop advanced AI-powered projects - whether for digital humans, podcast generators or application assistants.
At CES, NVIDIA demonstrated PDF to Podcast
LINK: | https://blogs.nvidia.com/blog/rtx-ai-garage-blackwell-nim-blueprints-p... |
See more stories from nvidia |
More from Nvidia
05/02/2025
AI Pays Off: Survey Reveals Financial Industry's Latest Technological Trends
The financial services industry is reaching an important milestone with AI, as organizations move beyond testing and experimentation to successful AI implementa...
05/02/2025
How GeForce RTX 50 Series GPUs Are Built to Supercharge Generative AI on PCs
NVIDIA's GeForce RTX 5090 and 5080 GPUs - which are based on the groundbreaking NVIDIA Blackwell architecture -offer up to 8x faster frame rates with NVIDIA...
04/02/2025
NVIDIA Blackwell Now Generally Available in the Cloud
AI reasoning models and agents are set to transform industries, but delivering their full potential at scale requires massive compute and optimized software. Th...
31/01/2025
Accelerate DeepSeek Reasoning Models With NVIDIA GeForce RTX 50 Series AI PCs
The recently released DeepSeek-R1 model family has brought a new wave of excitement to the AI community, allowing enthusiasts and developers to run state-of-the...
30/01/2025
DeepSeek-R1 Now Live With NVIDIA NIM
DeepSeek-R1 is an open model with state-of-the-art reasoning capabilities. Instead of offering direct responses, reasoning models like DeepSeek-R1 perform multi...
30/01/2025
GeForce NOW Celebrates Five Years of Cloud Gaming With AAA Blockbusters
GeForce NOW turns five this February. Five incredible years of high-performance gaming have been made possible thanks to the members who've joined the cloud...
30/01/2025
Lights, Camera, Action: New NVIDIA Broadcast AI Features Now Streaming With GeForce RTX 50 Series GPUs
New GeForce RTX 5090 and RTX 5080 GPUs - built on the NVIDIA Blackwell architect...
29/01/2025
Leveling Up User Experiences With Agentic AI, From Bots to Autonomous Agents
AI agents with advanced perception and cognition capabilities are making digital experiences more dynamic and personalized across retail, finance, entertainment...
27/01/2025
Amphitrite Rides AI Wave to Boost Maritime Shipping, Ocean Cleanup With Real-Time Weather Prediction and Simulation
Named after Greek mythology's goddess of the sea, France-based startup Amphi...
23/01/2025
Fast, Low-Cost Inference Offers Key to Profitable AI
Businesses across every industry are rolling out AI services this year. For Microsoft, Oracle, Perplexity, Snap and hundreds of other leading companies, using t...
23/01/2025
Baldur's Gate 3' Mod Support Launches in the Cloud
GeForce NOW is expanding mod support for hit game Baldur's Gate 3 in collaboration with Larian Studios and mod.io for Ultimate and Performance members. Thi...
22/01/2025
How AI Helps Fight Fraud in Financial Services, Healthcare, Government and More
Companies and organizations are increasingly using AI to protect their customers and thwart the efforts of fraudsters around the world. Voice security company ...
22/01/2025
Into the Omniverse: OpenUSD Workflows Advance Physical AI for Robotics, Autonomous Vehicles
Editor's note: This post is part of Into the Omniverse, a series focused on ...
22/01/2025
The Future of Marketing: How AI Agents Can Enhance Customer Journeys in Retail
AI agents - which can understand, adapt to and support each user's unique journey - are making online shopping and digital marketing more efficient and pers...
21/01/2025
NoTraffic Reduces Road Delays, Carbon Emissions With NVIDIA AI and Accelerated Computing
More than 90 million new vehicles are introduced to roads across the globe every...
16/01/2025
Fantastic Four-ce Awakens: Season One of Marvel Rivals' Joins GeForce NOW
Time to suit up, members. The multiverse is about to get a whole lot cloudier as GeForce NOW opens a portal to the first season of hit game Marvel Rivals from N...
16/01/2025
NVIDIA Releases NIM Microservices to Safeguard Applications for Agentic AI
AI agents are poised to transform productivity for the world's billion knowledge workers with knowledge robots that can accomplish a variety of tasks. To ...
15/01/2025
How AI Is Enhancing Surgical Safety and Education
Troves of unwatched surgical video footage are finding new life, fueling AI tools that help make surgery safer and enhance surgical education. The Surgical Data...
14/01/2025
Healthcare Leaders, NVIDIA CEO Share AI Innovation Across the Industry
AI is making inroads across the entire healthcare industry - from genomic research to drug discovery, clinical trial workflows and patient care. In a fireside ...
14/01/2025
NVIDIA GTC 2025: Quantum Day to Illuminate the Future of Quantum Computing
Quantum computing is one of the most exciting areas in computer science, promising progress in accelerated computing beyond what's considered possible today...
13/01/2025
NVIDIA Statement on the Biden Administration's Misguided AI Diffusion' Rule
For decades, leadership in computing and software ecosystems has been a cornerst...
13/01/2025
NVIDIA Statement on the Biden Administration's Misguided ‘AI Diffusion’ Rule
For decades, leadership in computing and software ecosystems has been a cornerst...
13/01/2025
NVIDIA and IQVIA Build Domain-Expert Agentic AI for Healthcare and Life Sciences
IQVIA, the world's leading provider of clinical research services, commercial insights and healthcare intelligence, is working with NVIDIA to build custom f...
10/01/2025
AI Gets Real for Retailers: 9 Out of 10 Retailers Now Adopting or Piloting AI, Latest NVIDIA Survey Finds
Artificial intelligence is rapidly becoming the cornerstone of innovation in the...
09/01/2025
Hyundai Motor Group Embraces NVIDIA AI and Omniverse for Next-Gen Mobility
Driving the future of smart mobility, Hyundai Motor Group (the Group) is partnering with NVIDIA to develop the next generation of safe, secure mobility with AI ...
09/01/2025
GeForce NOW at CES: Bring PC RTX Gaming Everywhere With the Power of GeForce NOW
This GFN Thursday recaps the latest cloud announcements from the CES trade show, including GeForce RTX gaming expansion across popular devices such as Steam Dec...
08/01/2025
Unveiling a New Era of Local AI With NVIDIA NIM Microservices and AI Blueprints
Over the past year, generative AI has transformed the way people live, work and play, enhancing everything from writing and content creation to gaming, learning...
07/01/2025
Why Enterprises Need AI Query Engines to Fuel Agentic AI
Data is the fuel of AI applications, but the magnitude and scale of enterprise data often make it too expensive and time-consuming to use effectively. Accordin...
07/01/2025
Why World Foundation Models Will Be Key to Advancing Physical AI
In the fast-evolving landscape of AI, it's becoming increasingly important to develop models that can accurately simulate and predict outcomes in physical, ...
06/01/2025
Now See This: NVIDIA Launches Blueprint for AI Agents That Can Analyze Video
The next big moment in AI is in sight - literally. Today, more than 1.5 billion enterprise level cameras deployed worldwide are generating roughly 7 trillion h...
06/01/2025
Building Smarter Autonomous Machines: NVIDIA Announces Early Access for Omniverse Sensor RTX
Generative AI and foundation models let autonomous machines generalize beyond th...
06/01/2025
NVIDIA Unveils Mega' Omniverse Blueprint for Building Industrial Robot Fleet Digital Twins
According to Gartner, the worldwide end-user spending on all IT products for 202...
02/01/2025
How AI Is Helping Us Do Better-for the Planet and for Each Other
Artificial intelligence and accelerated computing are being used to help solve the world's greatest challenges. NVIDIA has reinvented the computing stack -...
02/01/2025
GeForce NOW Rings in the New Year With 14 New Games
GeForce NOW is kicking off 2025 by delivering 14 games to the cloud this month, with two available to stream this week so members can get started on their New Y...
30/12/2024
Research Galore From 2024: Recapping AI Advancements in 3D Simulation, Climate Science and Audio Engineering
The pace of technology innovation has accelerated in the past year, most dramati...
27/12/2024
Have You Heard? 5 AI Podcast Episodes Listeners Loved in 2024
NVIDIA's AI Podcast gives listeners the inside scoop on the ways AI is transforming nearly every industry. Since the show's debut in 2016, it's gar...
26/12/2024
Cheers to 2024: GeForce NOW Recaps Year of Ultimate Cloud Gaming
This GFN Thursday wraps up another incredible year for cloud gaming. Take a look back at the top games and new features that made 2024 a standout for GeForce NO...
24/12/2024
From Generative to Agentic AI, Wrapping the Year's AI Advancements
Editor's note: This post is part of the AI Decoded series, which demystifies AI by making the technology more accessible, and showcases new hardware, softwa...
19/12/2024
AI's in Style: Ulta Beauty Helps Shoppers Virtually Try New Hairstyles
Shoppers pondering a new hairstyle can now try styles before committing to curls or a new color. An AI app by Ulta Beauty, the largest specialty beauty retailer...
19/12/2024
NieR Perfect: GeForce NOW Loops Square Enix's NieR:Automata' and NieR Replicant ver.1.22474487139' Into the Cloud
Stuck in a gaming rut? Get out of the loop this GFN Thursday with four new games...
18/12/2024
AI at Your Service: Digital Avatars With Speech Capabilities Offer Interactive Customer Experiences
Editor's note: This post is part of the AI On blog series, which explores th...
18/12/2024
Imbue's Kanjun Qiu Shares Insights on How to Build Smarter AI Agents
Imagine a future in which everyone is empowered to build and use their own AI agents. That future may not be far off, as new software is infused with intelligen...
18/12/2024
NVIDIA Awards up to $60,000 Research Fellowships to PhD Students
For more than two decades, the NVIDIA Graduate Fellowship Program has supported graduate students doing outstanding work relevant to NVIDIA technologies. Today,...
17/12/2024
AI in Your Own Words: NVIDIA Debuts NeMo Retriever Microservices for Multilingual Generative AI Fueled by Data
In enterprise AI, understanding and working across multiple languages is no long...
17/12/2024
NVIDIA Unveils Its Most Affordable Generative AI Supercomputer
NVIDIA is taking the wraps off a new compact generative AI supercomputer, offering increased performance at a lower price with a software upgrade. The new NVID...
16/12/2024
Tech Leader, AI Visionary, Endlessly Curious Jensen Huang to Keynote CES 2025
On Jan. 6 at 6:30 p.m. PT, NVIDIA founder and CEO Jensen Huang - with his trademark leather jacket and an unwavering vision - will step onto the CES 2025 stage....
12/12/2024
Ready Player Fun: GFN Thursday Brings Six New Adventures to the Cloud
From heart-pounding action games to remastered classics, there's something for everyone this GFN Thursday. Six new titles join the cloud this week, startin...
11/12/2024
Driving Mobility Forward, Vay Brings Advanced Automotive Solutions to Roads With NVIDIA DRIVE AGX
Vay, a Berlin-based provider of automotive-grade remote driving (teledriving) te...
11/12/2024
Built for the Era of AI, NVIDIA RTX AI PCs Enhance Content Creation, Gaming, Entertainment and More
Editor's note: This post is part of the AI Decoded series, which demystifies...
11/12/2024
Into the Omniverse: How OpenUSD-Based Simulation and Synthetic Data Generation Advance Robot Learning
Editor's note: This post is part of Into the Omniverse, a series focused on ...