Gladia raises $16M for AI transcription and analytics

By admin



Gladia, an AI transcription and audio intelligence provider, has raised $16 million in funding.

The Paris-based company will use the funding to develop an end-to-end audio infrastructure, starting with a new real-time audio transcription and analytics engine, so that voice-first platforms can deliver more value to their users across borders with cutting-edge AI.

It’s a challenge to rivals such as Otter.ai and Fireflies.ai, as well as other AI-based companies that transcribe voice conversations to text. In an interview with VentureBeat, CEO Jean-Louis Quéguiner explained to me why he started the company.

“As you can hear from a beautiful French accent, I’m not an English speaker and I was extremely frustrated with the accents,” Quéguiner said. “That’s why I founded the company.”

I got a demo of the AI transcription, and it worked in real time as Quéguiner spoke English with his heavy French accent. I’m used to services like Otter getting a lot of words wrong in a transcription, but in the first page of results from Gladia, I saw no errors. He also showed how he could speak two different languages and the system could shift from one language to another as needed.

XAnge led the round, with participation by Illuminate Financial, XTX Ventures, Athletico Ventures, Gaingels, Mana Ventures, Motier Ventures, Roosh Ventures, and Soma Capital.

Gladia uses AI for audio transcription.

Founded in 2022, Gladia has now raised a total of $20.3 million, with previous seed investments led by New Wave, Sequoia Capital (as part of the First Sequoia Arc program), Cocoa, and GFC. Gladia was recently selected to participate in the AWS generative AI accelerator program.

“Gladia represents the qualities we like to champion at XAnge: a bold, global tech team at the forefront of AI innovation, with a proven business model to unlock new opportunities across industries,” said Alexis du Peloux, partner at XAnge, in a statement. “In a fast-paced AI environment, Jean-Louis Quéguiner and his team have executed extremely well, and we are proud to back Gladia for the Series A.”

Because most speech recognition models today are trained predominantly on English audio data and are therefore inherently biased, Gladia prioritized building the first real-time product that is truly multilingual.

The new fine-tuned engine delivers superior real-time transcription in over 100 languages, along with enhanced support for accents and the unique ability to adapt to different languages on the fly.

Gladia’s new engine is unique in its ability to extract insights from a call, such as the caller’s sentiment, key information, and a conversation summary, in real time. This means it takes less than a second to generate both the transcript and insights from a call or meeting using Gladia.

New real-time AI transcription

Gladia founders Jonathan Soto (left) and Jean-Louis Quéguiner.

Building an accurate, low-latency, and multilingual engine in-house is a complex and resource-intensive task. It requires extensive expertise in language understanding and real-time data handling, along with continuous optimization and maintenance. Real-time models require more computing power and may struggle to produce accurate output immediately because of limited context.

Gladia’s new product allows companies to bypass these challenges. The real-time speech-to-text engine boasts an industry-leading latency of under 300 milliseconds without compromising accuracy, regardless of the language, geography, or tech stack used.

“Companies are spending valuable time and resources trying to incorporate multiple AI functions into their existing platforms,” said Jonathan Soto, CTO of Gladia, in a statement. “Our single API is compatible with all existing tech stacks and protocols, including SIP, VoIP, FreeSwitch, and Asterisk. This allows us to easily integrate real-time transcription and analysis into our customers’ AI platforms, so they can focus on delivering the best services to their end users.”

What’s ahead

The company’s first async transcription and audio intelligence API launched in June 2023 and was based on a proprietary version of Whisper ASR.

It quickly gained traction in the enterprise market, particularly with meeting recorders and note-taking assistants. The API is now used by over 600 customers around the world, including Attention, Circleback, Method Financial, Recall, Sana, and VEED.IO, and has more than 70,000 users.

“Gladia’s technology allows companies in vertical markets that need cutting-edge real-time transcription, including sales enablement and contact center platform, to shift seamlessly from manual post-call processing to proactive, low-latency workflows,” Quéguiner said. “Whether it’s automated CRM enrichment or real-time guidance for support agents, Gladia is designed to help businesses operate smarter and more efficiently in record time, without requiring AI expertise in-house.”

Gladia will use the new capital to advance its R&D efforts, soon bring to market a one-stop AI toolkit for audio, and expand its product offering with more à la carte models, including large language models (LLMs) and retrieval-augmented generation (RAG). With several design partners in the contact-center-as-a-service (CCaaS) segment, the company is currently piloting an agent-assist solution powered by Gladia’s real-time AI engine. Additionally, Gladia will continue to expand its talent base as it prepares for international expansion.

“We are multilingual, and we have something that is called ‘code switching,’ which makes it unique,” Quéguiner said. “You can start with the language and switch to another.”

He went on to show me that he could start a call in English and initiate the transcription. Then he spoke French words, and the model correctly transcribed them in French.

“Keep in mind that [others] are not real time right now, and this one is real time,” he said. “Usually, real time is a little bit less accurate. You can also have your own custom vocabulary in real time, which is pretty unusual, with us. We have the capability to extract some real-time insights.”

The service has an AI summarizer, and it will have new optional features in the coming months. Quéguiner said that his service can also get acronyms right and detect the switch to another language.

“The model we use is very similar to LLMs (large language models). It has no code decoder architecture, which is not the case for most of the models that you’ve seen with Fireflies, for instance,” Quéguiner said.

The market includes “meeting recorders,” Quéguiner said. The results can be passed on to real-time insights, which can help people like sales leads close deals faster.

The company also works with call centers, giving them a 30% faster time to completion on calls thanks to better accuracy. The company will charge a flat fee, such as per-hour pricing.
