Verbit.ai raises $23 million to automate transcription and captioning


There’s a lot of value inherent in services that can capture and automatically transcribe phone calls, keynotes, and recorded content — if nothing else, they save an enormous amount of manual labor. According to a recent report published by Grand View Research, the speech and voice recognition market is estimated to reach $31.82 billion by 2025, driven by new and “rising” applications in banking, health care, and automotive sectors.

Tom Livne, who cofounded Verbit.ai with Eric Shellef and Kobi Ben Tzvi in 2017, has high hopes that the Tel Aviv and New York-based startup will contribute substantially to the industry’s growth in the years ahead. Verbit’s adaptive speech recognition tech, which it claims can generate “detailed” transcriptions with over 99 percent accuracy at “record” speed, recently attracted the attention of VCs at the likes of Vertex Ventures and Oryzn Capital, which both participated in the startup’s Series A round.

Verbit today announced that it has raised $23 million in a round led by Viola Ventures, with the aforementioned investors and HV Ventures, Vintage Venture Partners, and ClalTech chipping in. This comes less than a year after the firm’s $11 million seed round and follows a fiscal year in which total revenue grew by 300 percent. The new funding brings Verbit’s total capital raised to $34 million.

As part of the round, Viola Ventures’ Ronen Nir will join the board of directors, and Livne said the capital will be used to jumpstart global growth of Verbit’s sales, marketing, and product teams, with a particular emphasis on stateside expansion.

“I am lucky to work with such a talented team that is devoted to customer experience, company growth, and product innovation,” he said. “It’s been only eight months since our last round of funding, and this latest infusion of capital is a testament to the strong demand for an AI solution in such a manual and traditional space.”

Voice transcription and captioning aren’t exactly novel. The industry is decades old, with well-established players like Nuance and Google. Enterprise platforms like Microsoft 365 offer AI-powered speech-to-text, as do Cisco and startups such as Otter and Voicera.

But, according to Livne, what sets Verbit apart is its reliance on “cutting-edge” advances in deep learning, neural networks, and natural language understanding.

Three models (an acoustic model, a linguistic model, and a contextual events model) inform Verbit’s captioning, first by filtering out background noise and echo and identifying speakers, and next by detecting domain-specific terms, recognizing accents and dialects, and incorporating current events and updates. In practice, clients upload an audio or video file to the cloud for processing. A team of thousands of human freelancers in over 20 countries then edits and reviews the machine-generated transcript, taking into account any customer-supplied notes and guidelines, before the finished transcription is made available for export to platforms like Blackboard, Vimeo, YouTube, Canvas, and Brightcove.
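The workflow described above can be pictured as a chain of stages. The following is only an illustrative sketch: every name in it (`Job`, the stage functions, `run_pipeline`) is hypothetical, the stage bodies are placeholders rather than real speech processing, and the split of tasks between the three models is one plausible reading of the description, not something Verbit has confirmed.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Job:
    """A hypothetical transcription job moving through the workflow."""
    media_name: str
    customer_notes: str = ""
    history: List[str] = field(default_factory=list)


def acoustic_stage(job: Job) -> Job:
    # Placeholder for noise/echo filtering and speaker identification.
    job.history.append("acoustic")
    return job


def linguistic_stage(job: Job) -> Job:
    # Placeholder for domain-specific terms, accents, and dialects.
    job.history.append("linguistic")
    return job


def contextual_stage(job: Job) -> Job:
    # Placeholder for incorporating current events and updates.
    job.history.append("contextual")
    return job


def human_review_stage(job: Job) -> Job:
    # Placeholder for freelancers editing the draft using customer notes.
    job.history.append("human_review")
    return job


# Assumed ordering: automated stages first, human review last.
STAGES: List[Callable[[Job], Job]] = [
    acoustic_stage,
    linguistic_stage,
    contextual_stage,
    human_review_stage,
]


def run_pipeline(job: Job) -> Job:
    """Pass the job through each stage in order and return it."""
    for stage in STAGES:
        job = stage(job)
    return job


finished = run_pipeline(Job("lecture.mp4", customer_notes="Use UK spelling"))
print(finished.history)
```

Keeping each stage as a plain function over a shared job record makes the hand-off from automated processing to human review explicit, which is the distinguishing feature of the workflow as described.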

Verbit’s cloud dashboard shows progress throughout each job and lets users edit and share files or add inline comments, request reviews, and update files every step of the way, as needed. A forthcoming feature — Verbit Express — will allow clients to drag files in need of transcription to a folder on a desktop PC, where they’ll be automatically uploaded and processed.

Livne claims the platform can reduce operating costs by up to 50 percent and deliver results 10 times faster than the competition. In any case, it was enough to woo a healthy client base of educational institutions and commercial customers, including the London Business School, Fashion Institute of Technology, Utah State University, University of Utah, Southern Utah University, University of Vermont, Auburn University, Western Governors University, University of California, Santa Barbara, Oakland University, Stanford, Coursera, Panopto, Kaltura, and close to 100 others (up from 50 in May 2018).

Customers have to make a minimum commitment of $10,000 worth of work, a pricing structure that has apparently paid dividends. Verbit.ai isn’t disclosing exact revenue but says it’s in the “millions” and that the company is cash flow positive.

“We have been closely following Verbit for the past two years. The disruption it brings to the market, both in its technological superiority, as well as market traction, are really exceptional,” Nir said. “We are excited to partner with the Verbit team to accelerate this journey.”

Verbit currently employs a team of over 30 across its Tel Aviv and New York offices, and it hopes to bump that number to around 60 this year.

Automated Transcription Provider Verbit Raises USD 23m in Series A

Tel Aviv-based transcription platform Verbit raised USD 23m in a Series A round led by Viola Ventures toward the end of January 2019. HV Ventures, Oryzn Capital, Vintage Venture Partners, and ClalTech also participated in the round, which brings Verbit’s total funding raised to USD 34m.

Verbit offers video and audio transcription services primarily through AI-powered automated speech-to-text technology refined through a network of human transcribers and post-editors.

In an article on the funding round, company CEO Tom Livne claimed the platform can transcribe an hour of audio in five minutes. He also said the vast amounts of transcribed text can be analyzed for patterns and insights through natural language processing (NLP) technology, noting that Verbit’s model has attracted “more than 100 customers in legal and higher education.”
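The speed claim reduces to simple arithmetic: an hour of audio in five minutes is 60 / 5 = 12 times faster than real time. The short sketch below uses hypothetical function names and assumes that processing time scales linearly with audio length, which the article does not state. It also covers only the quoted processing figure, not any additional time for human review.

```python
def realtime_factor(audio_minutes: float, processing_minutes: float) -> float:
    """Minutes of audio handled per minute of processing."""
    if audio_minutes < 0 or processing_minutes <= 0:
        raise ValueError("audio must be >= 0 and processing time must be > 0")
    return audio_minutes / processing_minutes


def estimated_processing_minutes(audio_minutes: float, factor: float) -> float:
    """Estimated processing time, assuming linear scaling (an assumption)."""
    if factor <= 0:
        raise ValueError("factor must be positive")
    return audio_minutes / factor


# Claimed figure: one hour of audio transcribed in five minutes.
claimed_factor = realtime_factor(60, 5)

# Under the linear-scaling assumption, a 90-minute lecture would take:
lecture_estimate = estimated_processing_minutes(90, claimed_factor)

print(claimed_factor)    # 12.0
print(lecture_estimate)  # 7.5
```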

Slator reached out to Livne for more information on the company and its latest funding round.

Court Reporter Shortage

Livne declined to comment on the company’s valuation but shared a few other interesting details including some of the most common client use cases.

In the education sector, Livne said Verbit fulfills the requirement for higher education institutions to provide captioned videos and full lecture transcripts for deaf or hearing-impaired students. He said some education sector clients include the London Business School, Stanford, Harvard, FIT, Brigham Young University-Idaho, and the University of California, Santa Barbara, as well as online course providers Coursera and Udacity.

“In the legal industry, Verbit’s solution is able to help tackle the court reporter shortage and provide companies with the technology to make reporters more efficient to cover all jobs,” Livne said.

Livne appears to have come full circle from his original inspiration for the company. “Before I found myself in the world of tech startups, I began my career in law, where transcripts are essential. I was often unsatisfied with the slow turnaround time for a finished transcript,” Livne said, adding, “Drawing on my prior experience with legal transcription, I felt that this was an important need in the market and that a viable solution could be found with the right technology.”

In the same article on the latest funding round, Livne explained that the money will go into accelerating expansion in the US as well as developing new capabilities for the platform.

For now, Verbit is focused mostly on the US, its key geographic market and the region with the most clients. The platform currently supports English and Spanish, Livne said, adding that they intend to “expand to support other languages according to the market demand.”

Transcription vis-à-vis Translation

Transcription shares a number of similarities with translation in terms of the supply chain and, now, the underlying NLP technology. In fact, companies like France-based Ubiqus employ tech stacks that enable them to develop their automatic transcription and translation solutions at the same time.

If Slator’s own experience with a different AI-supported transcription platform is any indication, the technology has developed far enough to be quite helpful for very clear audio files, although it still struggles with background noise, accents, and differentiating speakers. This parallels developments in neural machine translation, which is now more useful than ever, albeit with limitations.
