
Editor's note: This post is part of the AI Decoded series, which demystifies AI by making the technology more accessible and showcases new hardware, software, tools and accelerations for NVIDIA RTX PC and workstation users.
The demand for tools to simplify and optimize generative AI development is skyrocketing. Applications based on retrieval-augmented generation (RAG) - a technique for enhancing the accuracy and reliability of generative AI models with facts fetched from specified external sources - and customized models are enabling developers to tune AI models to their specific needs.
While such work may have required a complex setup in the past, new tools are making it easier than ever.
NVIDIA AI Workbench simplifies AI developer workflows by helping users build their own RAG projects, customize models and more. It's part of the RTX AI Toolkit - a suite of tools and software development kits for customizing, optimizing and deploying AI capabilities - launched at COMPUTEX earlier this month. AI Workbench removes the complexity of technical tasks that can derail experts and halt beginners.
What Is NVIDIA AI Workbench?
Available for free, NVIDIA AI Workbench enables users to develop, experiment with, test and prototype AI applications across GPU systems of their choice - from laptops and workstations to data center and cloud. It offers a new approach for creating, using and sharing GPU-enabled development environments across people and systems.
A simple installation gets users up and running with AI Workbench on a local or remote machine in just minutes. Users can then start a new project or replicate one from the examples on GitHub. Everything works through GitHub or GitLab, so users can easily collaborate and distribute work. Learn more about getting started with AI Workbench.
How AI Workbench Helps Address AI Project Challenges
Developing AI workloads can require manual, often complex processes, right from the start.
Setting up GPUs, updating drivers and managing versioning incompatibilities can be cumbersome. Reproducing projects across different systems can require replicating manual processes over and over. Inconsistencies when replicating projects, like issues with data fragmentation and version control, can hinder collaboration. Varied setup processes, moving credentials and secrets, and changes in the environment, data, models and file locations can all limit the portability of projects.
AI Workbench makes it easier for data scientists and developers to manage their work and collaborate across heterogeneous platforms. It integrates and automates various aspects of the development process, offering:
Ease of setup: AI Workbench streamlines the process of setting up a developer environment that's GPU-accelerated, even for users with limited technical knowledge.
Seamless collaboration: AI Workbench integrates with version-control and project-management tools like GitHub and GitLab, reducing friction when collaborating.
Consistency when scaling from local to cloud: AI Workbench ensures consistency across multiple environments, supporting scaling up or down from local workstations or PCs to data centers or the cloud.
RAG for Documents, Easier Than Ever
NVIDIA offers sample development Workbench Projects to help users get started with AI Workbench. The hybrid RAG Workbench Project is one example: It runs a custom, text-based RAG web application with a user's documents on their local workstation, PC or remote system.
Every Workbench Project runs in a container - software that includes all the necessary components to run the AI application. The hybrid RAG sample pairs a Gradio chat interface frontend on the host machine with a containerized RAG server - the backend that services a user's request and routes queries to and from the vector database and the selected large language model.
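The request flow described above (retrieve relevant snippets from a vector database, then pass them to the selected LLM) can be sketched in a few lines. This is a minimal illustration, not the hybrid RAG project's actual code: the `ToyVectorStore` class, the bag-of-words `embed` function and the `llm` callable are all hypothetical stand-ins for a real vector database, embedding model and language model.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy embedding: a bag-of-words count vector (stand-in for a real embedding model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


class ToyVectorStore:
    """Hypothetical in-memory stand-in for the vector database."""

    def __init__(self):
        self.entries = []  # list of (text, vector) pairs

    def add(self, text):
        self.entries.append((text, embed(text)))

    def search(self, query, k=2):
        """Return the k stored texts most similar to the query."""
        q = embed(query)
        ranked = sorted(self.entries, key=lambda e: cosine(q, e[1]), reverse=True)
        return [text for text, _ in ranked[:k]]


def build_prompt(question, snippets):
    """Place the retrieved snippets in front of the user's question."""
    context = "\n".join(f"- {s}" for s in snippets)
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"


def answer(question, store, llm, k=2):
    """Retrieve snippets, build the prompt, and call the (hypothetical) LLM callable."""
    snippets = store.search(question, k)
    return llm(build_prompt(question, snippets)), snippets
```

Returning the snippets alongside the reply mirrors the idea of showing users exactly which text was fed to the model; a real deployment would swap in a proper embedding model and database.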
This Workbench Project supports a wide variety of LLMs available on NVIDIA's GitHub page. Plus, the hybrid nature of the project lets users select where to run inference.
Workbench Projects let users version the development environment and code. Developers can run the embedding model on the host machine and run inference locally on a Hugging Face Text Generation Inference server, on target cloud resources using NVIDIA inference endpoints like the NVIDIA API catalog, or with self-hosting microservices such as NVIDIA NIM or third-party services.
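One way to picture the "select where to run inference" choice is as a small lookup from a mode name to a target endpoint. The sketch below is an assumption-laden illustration: the mode names, the `InferenceTarget` fields and every URL are hypothetical placeholders, not values used by AI Workbench or any NVIDIA service.

```python
from dataclasses import dataclass
from enum import Enum


class InferenceMode(Enum):
    LOCAL = "local"                # e.g. a locally hosted inference server
    CLOUD = "cloud"                # e.g. a hosted inference endpoint
    MICROSERVICE = "microservice"  # e.g. a self-hosted microservice


@dataclass(frozen=True)
class InferenceTarget:
    mode: InferenceMode
    base_url: str
    needs_api_key: bool


# Placeholder endpoints for illustration only; real URLs depend on the deployment.
TARGETS = {
    InferenceMode.LOCAL: InferenceTarget(
        InferenceMode.LOCAL, "http://localhost:8080", False),
    InferenceMode.CLOUD: InferenceTarget(
        InferenceMode.CLOUD, "https://cloud.example.invalid/v1", True),
    InferenceMode.MICROSERVICE: InferenceTarget(
        InferenceMode.MICROSERVICE, "http://microservice.example.invalid:8000/v1", False),
}


def select_target(mode_name, api_key=None):
    """Resolve a mode name to its target, checking credentials where required."""
    mode = InferenceMode(mode_name)  # raises ValueError for an unknown mode
    target = TARGETS[mode]
    if target.needs_api_key and not api_key:
        raise ValueError(f"{mode.value} inference requires an API key")
    return target
```

Keeping the mode-to-endpoint mapping in one table means the rest of the application can stay unchanged when a user switches between local and remote inference.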
The hybrid RAG Workbench Project also includes:
Performance metrics: Users can evaluate how RAG- and non-RAG-based user queries perform across each inference mode. Tracked metrics include Retrieval Time, Time to First Token (TTFT) and Token Velocity.
Retrieval transparency: A panel shows the exact snippets of text - retrieved from the most contextually relevant content in the vector database - that are being fed into the LLM and improving the response's relevance to a user's query.
Response customization: Responses can be tweaked with a variety of parameters, such as maximum tokens to generate, temperature and frequency penalty.
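To make the tracked metrics concrete, here is a minimal sketch of how Time to First Token and token throughput could be measured over a stream of generated tokens. It is an illustration under stated assumptions: `measure_stream` is a hypothetical helper, and "Token Velocity" is assumed here to mean tokens divided by total elapsed time, which the source does not define.

```python
import time


def measure_stream(token_stream, clock=time.perf_counter):
    """Consume a token iterator and report simple latency and throughput numbers.

    `clock` is injectable so the function can be exercised with a fake timer.
    """
    start = clock()
    first_token_at = None
    count = 0
    for _ in token_stream:
        if first_token_at is None:
            first_token_at = clock()  # moment the first token arrived
        count += 1
    end = clock()

    if count == 0:
        return {"ttft_s": None, "tokens": 0, "tokens_per_s": 0.0}

    total = end - start
    return {
        "ttft_s": first_token_at - start,  # Time to First Token, in seconds
        "tokens": count,
        # Assumed definition of token velocity: tokens per second overall.
        "tokens_per_s": count / total if total > 0 else float("inf"),
    }
```

Running the same measurement with and without retrieval, and across each inference mode, would give the kind of side-by-side comparison the project's metrics panel describes.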
To get started with this project, simply install AI Workbench on a local system. The hybrid RAG Workbench Project can be brought from GitHub into the user's account and duplicated to the local system.
More resources are available in the AI Decoded user guide. In addition, community members provide helpful video tutorials, like the one from Joe Freeman.
Customize, Optimize, Deploy
Developers often seek to customize AI models for specific use cases. Fine-tuning, a technique that changes the model by training it with additional data, can be useful for style transfer or changing model behavior. AI Workbench helps with fine-tuning, as well.
The Llama-factory AI Workbench Project enables QLoRA, a fine-tuning method that minimizes memory requirements, for a variety of models, as well as