
Speak Like a Native: NVIDIA Parlays Win in Voice Challenge

14/02/2024

Thanks to their work driving AI forward, Akshit Arora and Rafael Valle could someday speak to their spouses' families in their native languages.

Arora and Valle - along with colleagues Sungwon Kim and Rohan Badlani - won the LIMMITS '24 challenge, which asks contestants to recreate a speaker's voice in real time in English or any of six languages spoken in India, with the appropriate accent. Their novel AI model required only a three-second speech sample.

The NVIDIA team advanced the state of the art in an emerging field of personalized voice interfaces for more than a billion native speakers of Bengali, Chhattisgarhi, Hindi, Kannada, Marathi and Telugu.

Making Voice Interfaces Realistic

The technology for personalized text-to-speech translation is a work in progress. Existing services sometimes fail to accurately reflect the accents of the target language or nuances of the speaker's voice.

The challenge judged entries by listening for the naturalness of models' resulting speech and its similarity to the original speaker's voice.

The latest improvements promise personalized, realistic conversations and experiences that break language barriers. Broadcasters, telcos, universities, as well as e-commerce and online gaming services are eager to deploy such technology to create multilingual movies, lectures and virtual agents.

"We demonstrated we can do this at a scale not previously seen," said Arora, who has two uses close to his heart.

Breaking Down Linguistic Barriers

A senior data scientist who supports one of NVIDIA's biggest customers, Arora speaks Punjabi, while his wife and her family are native Tamil speakers.

It's a gulf he's long wanted to bridge for himself and others. "I had classmates who knew their native languages much better than the Hindi and English used in school, so they struggled to understand class material," he said.

The gulf crosses continents for Valle, a native of Brazil whose wife and family speak Gujarati, a language popular in west India.

"It's a problem I face every day," said Valle, an AI researcher with degrees in computer music and machine listening and improvisation. "We've tried many products to help us have clearer conversations."

Badlani, an AI researcher, said living in seven different Indian states, each with its own popular language, inspired him to work in the field.

A Race to the Finish Line

The initiative started nearly two years ago, when Arora and Badlani formed the four-person team to work on a very different version of the challenge that would be held in 2023.

Their efforts generated a working code base for the so-called Indic languages. But getting to the win announced in January required a full-on sprint because the 2024 challenge didn't get on the team's radar until 15 days before the deadline.

Luckily, Kim, a deep learning researcher in NVIDIA's Seoul office, had been working for some time on an AI model well suited to the challenge.

A specialist in text-to-speech voice synthesis, Kim was designing a so-called P-Flow model prior to starting his second internship at NVIDIA in 2023. P-Flow models borrow the technique large language models employ of using short voice samples as prompts so they can respond to new inputs without retraining.

"I created the model for English, but we were able to generalize it for any language," he said.

"We were talking and texting about this model even before he started at NVIDIA," said Valle, who mentored Kim in two internships before he joined full time in January.

Giving Others a Voice

P-Flow will soon be part of NVIDIA Riva, a framework for building multilingual speech and translation AI software, included in the NVIDIA AI Enterprise software platform.

The new capability will let users deploy the technology inside their data centers, on personal systems or in public or private cloud services. Today, voice translation services typically run on public cloud services.

"I hope our customers are inspired to try this technology," Arora said. "I enjoy being able to showcase in challenges like this one the work we do every day."

The contest is part of an initiative to develop open-source datasets and AI models for the nine languages most widely spoken in India.

Hear Arora and Badlani share their experiences in a session at GTC next month.

And listen to the results of the team's model below, starting with a three-second sample of a native Kannada speaker:

https://blogs.nvidia.com/wp-content/uploads/2024/02/pr_kannada_f_indictts_prompt_3s-1.mp3

Here's a similar-sounding synthesized voice reading the first sentence of this blog in Hindi:

https://blogs.nvidia.com/wp-content/uploads/2024/02/pr_kannada_f_indictts_speaking_hindi_3-2.mp3

And then in English:

https://blogs.nvidia.com/wp-content/uploads/2024/02/pr_kannada_f_indictts_speaking_english-1.mp3
LINK: https://blogs.nvidia.com/blog/generative-voice-challenge/...