Verbit.ai raises $23 million to automate transcription and captioning

There’s a lot of value inherent in services that can capture and automatically transcribe phone calls, keynotes, and recorded content — if nothing else, they save an enormous amount of manual labor. According to a recent report published by Grand View Research, the speech and voice recognition market is estimated to reach $31.82 billion by 2025, driven by new and “rising” applications in banking, health care, and automotive sectors.

Tom Livne, who cofounded Verbit.ai with Eric Shellef and Kobi Ben Tzvi in 2017, has high hopes that the Tel Aviv and New York-based startup will contribute substantially to the industry’s growth in the years ahead. Verbit’s adaptive speech recognition tech, which it claims can generate “detailed” transcriptions with over 99 percent accuracy at “record” speed, recently attracted the attention of VCs at the likes of Vertex Ventures and Oryzn Capital, which both participated in the startup’s Series A round.

Verbit today announced that it has raised $23 million in a round led by Viola Ventures, with the aforementioned investors and HV Ventures, Vintage Venture Partners, and ClalTech chipping in. This comes less than a year after the firm’s $11 million seed round and follows a fiscal year in which total revenue grew by 300 percent. The new funding brings Verbit’s total capital raised to $34 million.

As part of the round, Viola Ventures’ Ronen Nir will join the board of directors, and Livne said the capital will be used to jumpstart global growth of Verbit’s sales, marketing, and product teams, with a particular emphasis on stateside expansion.

“I am lucky to work with such a talented team that is devoted to customer experience, company growth, and product innovation,” he said. “It’s been only eight months since our last round of funding, and this latest infusion of capital is a testament to the strong demand for an AI solution in such a manual and traditional space.”

Voice transcription and captioning aren’t exactly novel. It’s a decades-old industry with well-established players, like Nuance and Google. Enterprise platforms like Microsoft 365 offer AI-powered speech-to-text, as do Cisco and startups such as Otter and Voicera.

But, according to Livne, what sets Verbit apart is its reliance on “cutting-edge” advances in deep learning, neural networks, and natural language understanding.

Three models (an acoustic model, a linguistic model, and a contextual events model) inform Verbit’s captioning, first by filtering out background noise and echo and identifying speakers, and next by detecting domain-specific terms, recognizing accents and dialects, and incorporating current events and updates. In practice, clients upload an audio or video file to the cloud for processing. A team of thousands of human freelancers in over 20 countries then edits and reviews the resulting transcript, taking into account any customer-supplied notes and guidelines, before the finished transcription is made available for export to platforms like Blackboard, Vimeo, YouTube, Canvas, and Brightcove.
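For readers who think in code, the workflow described above can be pictured as a chain of stages applied to one job. The following is only an illustrative sketch: the `Job` class, the stage functions, and `run_pipeline` are hypothetical names invented here, the stage bodies are placeholders, and none of it reflects Verbit’s actual software.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Sequence


@dataclass
class Job:
    """Hypothetical container for one uploaded file (not Verbit's data model)."""
    media_path: str
    customer_notes: str = ""
    transcript: str = ""
    stages_run: List[str] = field(default_factory=list)


def acoustic_stage(job: Job) -> Job:
    # Placeholder for noise/echo filtering and speaker identification.
    job.transcript = f"<draft transcript of {job.media_path}>"
    job.stages_run.append("acoustic")
    return job


def linguistic_stage(job: Job) -> Job:
    # Placeholder for domain-specific terms, accents, and dialects.
    job.stages_run.append("linguistic")
    return job


def contextual_stage(job: Job) -> Job:
    # Placeholder for incorporating current events and updates.
    job.stages_run.append("contextual")
    return job


def human_review_stage(job: Job) -> Job:
    # Placeholder for freelancer editing guided by customer-supplied notes.
    job.transcript = "[reviewed] " + job.transcript
    job.stages_run.append("human_review")
    return job


# Assumed ordering, taken from the article's description of the process.
STAGES: Sequence[Callable[[Job], Job]] = (
    acoustic_stage,
    linguistic_stage,
    contextual_stage,
    human_review_stage,
)


def run_pipeline(job: Job, stages: Sequence[Callable[[Job], Job]] = STAGES) -> Job:
    """Apply each stage in order and return the processed job."""
    for stage in stages:
        job = stage(job)
    return job
```

Calling `run_pipeline(Job(media_path="lecture.mp4"))` returns a job whose `stages_run` list records the four steps in order. The point of the sketch is the ordering (machine passes first, human review last), not the internals of any stage.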

Verbit’s cloud dashboard shows progress throughout each job and lets users edit and share files or add inline comments, request reviews, and update files every step of the way, as needed. A forthcoming feature — Verbit Express — will allow clients to drag files in need of transcription to a folder on a desktop PC, where they’ll be automatically uploaded and processed.

Livne claims the platform can reduce operating costs by up to 50 percent and deliver results 10 times faster than the competition. In any case, it was enough to woo a healthy client base of educational institutions and commercial customers, including the London Business School, Fashion Institute of Technology, Utah State University, University of Utah, Southern Utah University, University of Vermont, Auburn University, Western Governors University, University of California Santa Barbara, Oakland University, Stanford, Coursera, Panopto, Kaltura, and close to 100 others (up from 50 in May 2018).

Customers have to make a minimum commitment of $10,000 worth of work, a pricing structure that has apparently paid dividends. Verbit.ai isn’t disclosing exact revenue but says it’s in the “millions” and that the company is cash flow positive.

“We have been closely following Verbit for the past two years. The disruption it brings to the market, both in its technological superiority, as well as market traction, are really exceptional,” Nir said. “We are excited to partner with the Verbit team to accelerate this journey.”

Verbit currently employs a team of over 30 across its Tel Aviv and New York offices, and it hopes to bump that number to around 60 this year.