Verbit raises $60 million to improve enterprise-focused transcription software


Verbit today announced the close of a $60 million series C round ($10 million of which is debt) that the company says will bolster its product R&D efforts. Verbit CEO Tom Livne, speaking to VentureBeat via email, said the infusion will also lay the groundwork for merger and acquisition opportunities as Verbit pursues new verticals, increases the number of languages its platform supports, and hires employees to expand its international reach.

The voice and speech recognition tech market is anticipated to be worth $31.82 billion by 2025, driven by new applications in the banking, health care, and automotive industries. In fact, it’s estimated that one in five people in the U.S. interact with a smart speaker on a daily basis and that the share of Google searches conducted by voice in the country recently surpassed 30%.

Livne, who cofounded Verbit.ai with Eric Shellef and Kobi Ben Tzvi in 2017, asserts the Tel Aviv- and New York-based startup (which also has offices in Kyiv, Ukraine and Palo Alto, California) will contribute substantially to the voice transcription segment’s rise. Verbit’s voice transcription and captioning services aren’t novel: well-established players like Nuance, Cisco, Otter, Voicera, Microsoft, Amazon, and Google have offered rival products for years, including enterprise-focused platforms like Microsoft 365. But Verbit claims its adaptive speech recognition tech can generate detailed transcriptions with over 99.9% accuracy.

What sets Verbit apart is its reliance on “cutting-edge” advances in machine learning and natural language understanding, according to Livne. Three algorithms — acoustic, linguistic, and contextual — power Verbit’s captioning. They filter out background noise and echoes and identify speakers regardless of accent, detecting domain-specific terms while incorporating current events and updates. Clients first upload audio or video files to a cloud dashboard for processing. Then a team of over 22,000 human freelancers in over 120 countries edits and reviews the material, taking into account customer-supplied notes and guidelines.

“Verbit stays up to date with competitors’ rates to ensure that its transcribers are compensated fairly. Currently, the company’s transcribers can choose if they wish to work according to time spent or a flat pay-per-AM,” a spokesperson told VentureBeat via email. “Verbit frequently conducts roundtable discussions to hear from its transcribers first-hand to get their feedback. The company’s transcribers have a support system that constantly relays feedback to Verbit management, and it has a bonus program to ensure proper compensation for its top performers.”

Finished transcriptions from Verbit are available for export to services like Blackboard, Vimeo, YouTube, Canvas, and Brightcove. A web frontend shows the progress of jobs and lets users edit and share files or define the access permissions for each, as well as add inline comments, request reviews, or view usage reports. A feature called Verbit Express lets users drag files in need of transcription to a folder on a desktop PC, where they’re automatically uploaded and processed.

The transcriber side of the equation is self-serve and on-demand. Verbit transcribers can choose the files they’d like to work on (the platform doesn’t assign them manually) and take advantage of built-in dictionary and research tools, keyboard shortcuts, speed control, a highlighter, and spell check. Those who consistently produce exceptional work and achieve high quality scores are offered the chance to become reviewers, responsible for proofreading — and editing, if necessary — transcribers’ work.

Livne claims Verbit’s suite can reduce operating costs by up to 50% and deliver results 10 times faster than the competition. In any case, it was enough to woo a healthy client base of over 400 educational institutions and commercial customers (up from 70 as of January 2019), including Harvard, the NCAA, London Business School, Fashion Institute of Technology, Stanford, Coursera, and Udacity. Revenue has grown fivefold since 2017.

Customers have to make a minimum commitment of $10,000, a pricing structure that has apparently paid dividends. Verbit.ai isn’t disclosing exact revenue but says it’s in the “millions” and that the company is cash flow positive. Despite pandemic-related headwinds, revenue run-rate has grown fivefold since 2019, according to Livne.

Verbit plans to explore verticals in the insurance and financial sectors, as well as media and medical use cases. To this end, it recently launched a human-in-the-loop transcription service for media firms with a delay of only a few seconds. It also launched Live Room, a desktop app for live, interactive transcripts and captions within Zoom. Live Room features highlighting and note-taking, options to download the transcript, the ability to delay or speed up the transcript, and direct sharing of notes with peers and clients. And Verbit inked an agreement with the nonprofit Speech to Text Institute to invest in court reporting and legal transcription technologies.

Sapphire Ventures led the 110-employee Verbit’s series C round with participation from existing investors Vertex Ventures, Stripes, HV Ventures, and ClalTech, along with new investor Vertex Growth. It brings the 3.5-year-old, 120-plus-employee company’s total capital raised to more than $100 million.

The original article can be viewed on VentureBeat here.

Verbit Closes $60 Million Funding Round for Expansion of AI-Based Transcription Service, Surpasses $100 Million in Investment

Verbit has just announced another big funding round, pushing total investment in the AI-based transcription service past $100 million since its founding. Just 10 months after closing a $31 million Series B round, the company has finalized terms on a new Series C round of $60 million. Sapphire Ventures led the most recent financing, which also saw participation from existing investors Vertex Ventures, Stripes, HV Ventures, and ClalTech along with new investor Vertex Growth.

“We are still not at the billion-dollar valuation but we are not far from it,” said Tom Livne, CEO and founder of Verbit, during a podcast recording with Voicebot earlier today. “COVID has been an accelerator to our business. We actually grew our revenue more than five times this year. Usually, the valuation is a reflection of revenue growth so definitely, we are ahead of where we planned. Our goal is still to become a billion-dollar company and an independent public company. I do believe that can happen in the next two-to-three years.”


Most of Verbit’s current business comes from academia and legal services. Universities and other educational institutions are often obligated to provide transcriptions of classroom lectures to comply with laws that protect disabled students, such as the Americans with Disabilities Act. This segment alone is a large market of over $1 billion in potential annual revenue, according to Livne.

The legal market is three to four times larger, he says. Sworn depositions require high accuracy and represent a rapidly growing segment for Verbit. Media, distance learning, and enterprise are markets the company has recently entered, and it expects to use some of the new funding to build awareness and win customers in these industries.


Livne says the global transcription market is worth more than $30 billion annually but has historically been fragmented and highly manual. Attempts at automated transcription solutions have delivered 80-85% accuracy at best and fallen short of many organizations’ need for near 100% accuracy. Verbit has built an AI-based speech recognition solution for automated transcription whose output is then reviewed by human transcribers to guarantee 99.9% accuracy.

Verbit applies a general AI-based speech model to all of its customer projects. The transcriptions are then refined using domain- or industry-specific speech models and refined again with models tailored to each customer, potentially down to the level of an individual professor. Only after these automated transcription processes are applied is a human engaged to bring the final product to 99.9% accuracy. That corresponds to a word error rate (WER) of 0.1%, or no more than one error per 1,000 words.
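To make the accuracy figure concrete, here is a minimal sketch of how word error rate is commonly computed: the word-level edit distance between a reference transcript and a hypothesis transcript, divided by the number of reference words. The function name and the sample sentences are illustrative assumptions, and the article does not say how Verbit itself measures accuracy; treating accuracy as 1 minus WER is simply the usual convention.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the reference length.

    Hypothetical helper for illustration; not Verbit's scoring method.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference transcript must contain at least one word")

    # prev[j] holds the edit distance between the reference words processed
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # reference word missing from hypothesis
                curr[j - 1] + 1,     # extra word in hypothesis
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1] / len(ref)


# One substitution ("the" -> "a") and one dropped word ("on")
# against a seven-word reference.
wer = word_error_rate(
    "the witness signed the contract on monday",
    "the witness signed a contract monday",
)
print(f"WER: {wer:.3f}, accuracy: {1 - wer:.3f}")
```

Under this convention, a 99.9% accuracy guarantee means a 1,000-word transcript may contain at most one substituted, dropped, or inserted word.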

A network of 22,000 human transcribers is a key differentiator for Verbit. The company onboards transcribers with a test that assesses their competency and identifies whether they have special skills in languages, accents, or subject matter. The system then updates that information based on each transcriber’s performance and experience. This data is used to route new projects to the transcribers who are most likely to deliver high accuracy efficiently.
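The routing idea described above can be sketched in a few lines. This is an illustrative assumption rather than Verbit's actual system: the `Transcriber` record, the skill tags, and the ranking rule (most matching skills first, then highest quality score) are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Transcriber:
    """Hypothetical profile built from the onboarding test and past work."""
    name: str
    skills: set = field(default_factory=set)  # e.g. {"legal", "spanish"}
    quality: float = 0.0                      # rolling quality score in [0, 1]


def rank_transcribers(job_tags: set, pool: list) -> list:
    """Order the pool by skill overlap with the job, then by quality score."""
    def sort_key(t: Transcriber):
        return (len(job_tags & t.skills), t.quality)
    return sorted(pool, key=sort_key, reverse=True)


pool = [
    Transcriber("A", {"legal"}, quality=0.90),
    Transcriber("B", {"legal", "spanish"}, quality=0.80),
    Transcriber("C", set(), quality=0.99),
]
ranked = rank_transcribers({"legal", "spanish"}, pool)
print([t.name for t in ranked])
```

Ranking by overlap before quality is one possible design choice; a production system would presumably also weigh availability and turnaround time, which the article does not detail.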


While the transcribers are editing the transcripts, the speech recognition system automatically updates its model based on their input. It is a de facto supervised learning approach, but one based on actual edits instead of data labeling. This process works in real time, so an edit to one word early in a transcript can lead to updates in the text later in the recording even before the transcriber gets to it. Any edit is a signal the AI can use to update and improve the speech models. The approach is also applied to Verbit’s new real-time transcription service, which Livne said has been a significant growth driver recently.
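As a rough illustration of an early edit changing text the transcriber has not reached yet, the toy function below applies a correction at one position and to every later occurrence of the same misrecognized word. This is a deliberately simple stand-in: the article describes the speech models themselves being updated, which this sketch does not attempt, and the function name and sample words are hypothetical.

```python
def propagate_edit(words: list, position: int, corrected: str) -> list:
    """Apply a correction at `position` and to later identical words.

    Toy stand-in for model adaptation: earlier words are left untouched
    because the transcriber has already reviewed them.
    """
    original = words[position]
    updated = list(words)  # copy so the caller's draft is not mutated
    updated[position] = corrected
    for i in range(position + 1, len(updated)):
        if updated[i] == original:
            updated[i] = corrected
    return updated


draft = ["lived", "said", "revenue", "grew", "lived", "added"]
print(propagate_edit(draft, 0, "Livne"))
```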


Verbit has plans for its new funding that go beyond expanding into new vertical industries. Livne told Voicebot that the company would invest in natural language understanding (NLU) technology in order to provide more services to customers. For example, he mentioned the ability to identify conflicting statements from a witness based solely on their transcribed depositions. In addition, he said that the resources could be applied to acquisitions, which could help Verbit expand more quickly into new verticals.
