Sony Pixel Power calrec Sony

NVIDIA NIM on AWS Supercharges AI Inference

04/12/2024

Generative AI is rapidly transforming industries, driving demand for secure, high-performance inference solutions to scale increasingly complex models efficiently and cost-effectively.

Expanding its collaboration with NVIDIA, Amazon Web Services (AWS) revealed today at its annual AWS re:Invent conference that it has extended NVIDIA NIM microservices across key AWS AI services to support faster AI inference and lower latency for generative AI applications.

NVIDIA NIM microservices are now available directly from the AWS Marketplace, as well as Amazon Bedrock Marketplace and Amazon SageMaker JumpStart, making it even easier for developers to deploy NVIDIA-optimized inference for commonly used models at scale.

NVIDIA NIM, part of the NVIDIA AI Enterprise software platform available in the AWS Marketplace, provides developers with a set of easy-to-use microservices designed for secure, reliable deployment of high-performance, enterprise-grade AI model inference across clouds, data centers and workstations.

These prebuilt containers are built on robust inference engines, such as NVIDIA Triton Inference Server, NVIDIA TensorRT, NVIDIA TensorRT-LLM and PyTorch, and support a broad spectrum of AI models - from open-source community ones to NVIDIA AI Foundation models and custom ones.

NIM microservices can be deployed across various AWS services, including Amazon Elastic Compute Cloud (EC2), Amazon Elastic Kubernetes Service (EKS) and Amazon SageMaker.

Developers can preview over 100 NIM microservices built from commonly used models and model families, including Meta's Llama 3, Mistral AI's Mistral and Mixtral, NVIDIA's Nemotron, Stability AI's SDXL and many more on the NVIDIA API catalog. The most commonly used ones are available for self-hosting to deploy on AWS services and are optimized to run on NVIDIA accelerated computing instances on AWS.

NIM microservices now available directly from AWS include:

NVIDIA Nemotron-4, available in Amazon Bedrock Marketplace, Amazon SageMaker Jumpstart and AWS Marketplace. This is a cutting-edge LLM designed to generate diverse synthetic data that closely mimics real-world data, enhancing the performance and robustness of custom LLMs across various domains.

Llama 3.1 8B-Instruct, available on AWS Marketplace. This 8-billion-parameter multilingual large language model is pretrained and instruction-tuned for language understanding, reasoning and text-generation use cases.

Llama 3.1 70B-Instruct, available on AWS Marketplace. This 70-billion-parameter pretrained, instruction-tuned model is optimized for multilingual dialogue.

Mixtral 8x7B Instruct v0.1, available on AWS Marketplace. This high-quality sparse mixture of experts model with open weights can follow instructions, complete requests and generate creative text formats.

NIM on AWS for Everyone Customers and partners across industries are tapping NIM on AWS to get to market faster, maintain security and control of their generative AI applications and data, and lower costs.

SoftServe, an IT consulting and digital services provider, has developed six generative AI solutions fully deployed on AWS and accelerated by NVIDIA NIM and AWS services. The solutions, available on AWS Marketplace, include SoftServe Gen AI Drug Discovery, SoftServe Gen AI Industrial Assistant, Digital Concierge, Multimodal RAG System, Content Creator and Speech Recognition Platform.

They're all based on NVIDIA AI Blueprints, comprehensive reference workflows that accelerate AI application development and deployment and feature NVIDIA acceleration libraries, software development kits and NIM microservices for AI agents, digital twins and more.

Start Now With NIM on AWS Developers can deploy NVIDIA NIM microservices on AWS according to their unique needs and requirements. By doing so, developers and enterprises can achieve high-performance AI with NVIDIA-optimized inference containers across various AWS services.

Visit the NVIDIA API catalog to try out over 100 different NIM-optimized models, and request either a developer license or 90-day NVIDIA AI Enterprise trial license to get started deploying the microservices on AWS services. Developers can also explore NIM microservices in the AWS Marketplace, Amazon Bedrock Marketplace or Amazon SageMaker JumpStart.

See notice regarding software product information.
LINK: https://blogs.nvidia.com/blog/nim-microservices-aws-inference/...
See more stories from nvidia

More from Nvidia

16/01/2025

Fantastic Four-ce Awakens: Season One of Marvel Rivals' Joins GeForce NOW

Time to suit up, members. The multiverse is about to get a whole lot cloudier as GeForce NOW opens a portal to the first season of hit game Marvel Rivals from N...

16/01/2025

NVIDIA Releases NIM Microservices to Safeguard Applications for Agentic AI

AI agents are poised to transform productivity for the world's billion knowledge workers with knowledge robots that can accomplish a variety of tasks. To ...

15/01/2025

How AI Is Enhancing Surgical Safety and Education

Troves of unwatched surgical video footage are finding new life, fueling AI tools that help make surgery safer and enhance surgical education. The Surgical Data...

14/01/2025

Healthcare Leaders, NVIDIA CEO Share AI Innovation Across the Industry

AI is making inroads across the entire healthcare industry - from genomic research to drug discovery, clinical trial workflows and patient care. In a fireside ...

14/01/2025

NVIDIA GTC 2025: Quantum Day to Illuminate the Future of Quantum Computing

Quantum computing is one of the most exciting areas in computer science, promising progress in accelerated computing beyond what's considered possible today...

13/01/2025

NVIDIA Statement on the Biden Administration's Misguided AI Diffusion' Rule

For decades, leadership in computing and software ecosystems has been a cornerst...

13/01/2025

NVIDIA Statement on the Biden Administration's Misguided ‘AI Diffusion’ Rule

For decades, leadership in computing and software ecosystems has been a cornerst...

13/01/2025

NVIDIA and IQVIA Build Domain-Expert Agentic AI for Healthcare and Life Sciences

IQVIA, the world's leading provider of clinical research services, commercial insights and healthcare intelligence, is working with NVIDIA to build custom f...

10/01/2025

AI Gets Real for Retailers: 9 Out of 10 Retailers Now Adopting or Piloting AI, Latest NVIDIA Survey Finds

Artificial intelligence is rapidly becoming the cornerstone of innovation in the...

09/01/2025

Hyundai Motor Group Embraces NVIDIA AI and Omniverse for Next-Gen Mobility

Driving the future of smart mobility, Hyundai Motor Group (the Group) is partnering with NVIDIA to develop the next generation of safe, secure mobility with AI ...

09/01/2025

GeForce NOW at CES: Bring PC RTX Gaming Everywhere With the Power of GeForce NOW

This GFN Thursday recaps the latest cloud announcements from the CES trade show, including GeForce RTX gaming expansion across popular devices such as Steam Dec...

08/01/2025

Unveiling a New Era of Local AI With NVIDIA NIM Microservices and AI Blueprints

Over the past year, generative AI has transformed the way people live, work and play, enhancing everything from writing and content creation to gaming, learning...

07/01/2025

Why Enterprises Need AI Query Engines to Fuel Agentic AI

Data is the fuel of AI applications, but the magnitude and scale of enterprise data often make it too expensive and time-consuming to use effectively. Accordin...

07/01/2025

Why World Foundation Models Will Be Key to Advancing Physical AI

In the fast-evolving landscape of AI, it's becoming increasingly important to develop models that can accurately simulate and predict outcomes in physical, ...

06/01/2025

Now See This: NVIDIA Launches Blueprint for AI Agents That Can Analyze Video

The next big moment in AI is in sight - literally. Today, more than 1.5 billion enterprise level cameras deployed worldwide are generating roughly 7 trillion h...

06/01/2025

Building Smarter Autonomous Machines: NVIDIA Announces Early Access for Omniverse Sensor RTX

Generative AI and foundation models let autonomous machines generalize beyond th...

06/01/2025

NVIDIA Unveils Mega' Omniverse Blueprint for Building Industrial Robot Fleet Digital Twins

According to Gartner, the worldwide end-user spending on all IT products for 202...

02/01/2025

How AI Is Helping Us Do Better-for the Planet and for Each Other

Artificial intelligence and accelerated computing are being used to help solve the world's greatest challenges. NVIDIA has reinvented the computing stack -...

02/01/2025

GeForce NOW Rings in the New Year With 14 New Games

GeForce NOW is kicking off 2025 by delivering 14 games to the cloud this month, with two available to stream this week so members can get started on their New Y...

30/12/2024

Research Galore From 2024: Recapping AI Advancements in 3D Simulation, Climate Science and Audio Engineering

The pace of technology innovation has accelerated in the past year, most dramati...

27/12/2024

Have You Heard? 5 AI Podcast Episodes Listeners Loved in 2024

NVIDIA's AI Podcast gives listeners the inside scoop on the ways AI is transforming nearly every industry. Since the show's debut in 2016, it's gar...

26/12/2024

Cheers to 2024: GeForce NOW Recaps Year of Ultimate Cloud Gaming

This GFN Thursday wraps up another incredible year for cloud gaming. Take a look back at the top games and new features that made 2024 a standout for GeForce NO...

24/12/2024

From Generative to Agentic AI, Wrapping the Year's AI Advancements

Editor's note: This post is part of the AI Decoded series, which demystifies AI by making the technology more accessible, and showcases new hardware, softwa...

19/12/2024

AI's in Style: Ulta Beauty Helps Shoppers Virtually Try New Hairstyles

Shoppers pondering a new hairstyle can now try styles before committing to curls or a new color. An AI app by Ulta Beauty, the largest specialty beauty retailer...

19/12/2024

NieR Perfect: GeForce NOW Loops Square Enix's NieR:Automata' and NieR Replicant ver.1.22474487139' Into the Cloud

Stuck in a gaming rut? Get out of the loop this GFN Thursday with four new games...

18/12/2024

AI at Your Service: Digital Avatars With Speech Capabilities Offer Interactive Customer Experiences

Editor's note: This post is part of the AI On blog series, which explores th...

18/12/2024

Imbue's Kanjun Qiu Shares Insights on How to Build Smarter AI Agents

Imagine a future in which everyone is empowered to build and use their own AI agents. That future may not be far off, as new software is infused with intelligen...

18/12/2024

NVIDIA Awards up to $60,000 Research Fellowships to PhD Students

For more than two decades, the NVIDIA Graduate Fellowship Program has supported graduate students doing outstanding work relevant to NVIDIA technologies. Today,...

17/12/2024

AI in Your Own Words: NVIDIA Debuts NeMo Retriever Microservices for Multilingual Generative AI Fueled by Data

In enterprise AI, understanding and working across multiple languages is no long...

17/12/2024

NVIDIA Unveils Its Most Affordable Generative AI Supercomputer

NVIDIA is taking the wraps off a new compact generative AI supercomputer, offering increased performance at a lower price with a software upgrade. The new NVID...

16/12/2024

Tech Leader, AI Visionary, Endlessly Curious Jensen Huang to Keynote CES 2025

On Jan. 6 at 6:30 p.m. PT, NVIDIA founder and CEO Jensen Huang - with his trademark leather jacket and an unwavering vision - will step onto the CES 2025 stage....

12/12/2024

Ready Player Fun: GFN Thursday Brings Six New Adventures to the Cloud

From heart-pounding action games to remastered classics, there's something for everyone this GFN Thursday. Six new titles join the cloud this week, startin...

11/12/2024

Driving Mobility Forward, Vay Brings Advanced Automotive Solutions to Roads With NVIDIA DRIVE AGX

Vay, a Berlin-based provider of automotive-grade remote driving (teledriving) te...

11/12/2024

Built for the Era of AI, NVIDIA RTX AI PCs Enhance Content Creation, Gaming, Entertainment and More

Editor's note: This post is part of the AI Decoded series, which demystifies...

11/12/2024

Into the Omniverse: How OpenUSD-Based Simulation and Synthetic Data Generation Advance Robot Learning

Editor's note: This post is part of Into the Omniverse, a series focused on ...

10/12/2024

AI Pioneers Win Nobel Prizes for Physics and Chemistry

Artificial intelligence, once the realm of science fiction, claimed its place at the pinnacle of scientific achievement Monday in Sweden. In a historic ceremon...

10/12/2024

Turn Down the Noise: CUDA-Q Enables Industry-First Quantum Computing Demo With Logical Qubits

Quantum computing has the potential to transform industries ranging from drug di...

09/12/2024

Crowning Achievement: NVIDIA Research Model Enables Fast, Efficient Dynamic Scene Reconstruction

Content streaming and engagement are entering a new dimension with QUEEN, an AI ...

06/12/2024

Thailand and Vietnam Embrace Sovereign AI to Drive Economic Growth

Southeast Asia is embracing sovereign AI. The prime ministers of Thailand and Vietnam this week met with NVIDIA founder and CEO Jensen Huang to discuss initiat...

05/12/2024

2025 Predictions: Enterprises, Researchers and Startups Home In on Humanoids, AI Agents as Generative AI Crosses the Chasm

From boardroom to break room, generative AI took this year by storm, stirring di...

05/12/2024

Stream Indiana Jones and the Great Circle' at Launch With RTX Power in the Cloud at up to 50% Off

GeForce NOW is wrapping a sleigh-full of gaming gifts this month, stuffing membe...

04/12/2024

NVIDIA NIM on AWS Supercharges AI Inference

Generative AI is rapidly transforming industries, driving demand for secure, high-performance inference solutions to scale increasingly complex models efficient...

03/12/2024

NVIDIA Advances Physical AI With Accelerated Robotics Simulation on AWS

Field AI is building robot brains that enable robots to autonomously manage a wide range of industrial processes. Vention creates pretrained skills to ease deve...

03/12/2024

Latest NVIDIA AI, Robotics and Quantum Computing Software Comes to AWS

Expanding what's possible for developers and enterprises in the cloud, NVIDIA and Amazon Web Services are converging at AWS re:Invent in Las Vegas this week...

03/12/2024

How AI Can Enhance Disability Inclusion, Special Education

A recent survey from the Special Olympics Global Center for Inclusion in Education shows that while a majority of students with an intellectual and developmenta...

03/12/2024

New NVIDIA Certifications Expand Professionals' Credentials in AI Infrastructure and Operations

As generative AI continues to grow, implementing and managing the right infrastr...

02/12/2024

Siemens Healthineers Adopts MONAI Deploy for Medical Imaging AI

3.6 billion. That's about how many medical imaging tests are performed annually worldwide to diagnose, monitor and treat various conditions. Speeding up th...

28/11/2024

Get the Power of GeForce-Powered Gaming in the Cloud Half Off With Black Friday Deal

Turn Black Friday into Green Thursday with a new deal on GeForce NOW Ultimate an...

27/11/2024

How RTX AI PCs Unlock AI Agents That Solve Complex Problems Autonomously With Generative AI

Editor's note: This post is part of the AI Decoded series, which demystifies...

26/11/2024

Taste of Success: Zordi Plants AI and Robotics to Grow Flavorful Strawberries Indoors

With startup Zordi, founder Gilwoo Lee's enthusiasm for robotics, healthy ea...