Verbit Raises $23 Million from Viola Ventures to Expand Automated Transcription and Captioning Solution

Verbit announced yesterday that it has completed a $23 million funding round led by Viola Ventures. The company had previously raised $11 million from HV Holtzbrinck Ventures, Vertex Ventures Israel and Oryzn Capital in March 2018. That round set the post-funding company valuation at $20 million. Given the size of the new round, it appears the valuation has risen quickly over the past ten months. The announcement also says that revenue grew more than 300%. This type of growth rate is not unusual for startups in early stages because the starting revenue base is very low. The more telling number is the size of the funding round.

Vertex Ventures, HV Holtzbrinck Ventures, Oryzn Capital, Vintage Venture Partners, and Cal-Tech all participated in the latest funding round. The company says it will use the funds for solution development and marketing, and to accelerate its expansion into the U.S. market, including a new office in New York City. A Verbit spokesperson says the company expects to grow 300% again in 2019 due to these investments. The company claims to have over 100 customers in the academic and legal verticals today.

AI Plus Humans to Offer Speed and Accuracy

The company offers a speech-to-text solution that transforms spoken content into text. It is applied to traditional transcription services as well as to captioning for video content. While many companies are now focusing on transforming textual content into audio so it can be accessed through voice assistants, there is a growing market for transcription and captioning services due in part to the rapid rise of video.

Verbit uses both AI for transcription and an on-demand network of human transcribers to improve accuracy. A company spokesperson commented:

“Verbit’s technology combines AI with humans to transcribe voice content with targeted 99% accuracy in nearly half the time (1 hour of video content can be transcribed in 5 minutes). They’re seeing a lot of success working with universities to level the education experience for students who have hearing impairments. They’re also working in the legal space to help address the court reporter talent shortage.”

Tom Livne (Verbit CEO) Interview

Voicebot caught up with Verbit CEO Tom Livne to learn more about the company and the timing of the funding round.

Voicebot: You show a focus on horizontal services such as transcription and captioning as well as two verticals in legal and academic. Will the new funding be used to deepen your focus in these areas or do you intend to use it to invest in new vertical and/or horizontal markets?

Tom Livne: Because of the high demand from academia and law, we’re looking to use this funding to help support these sectors on a larger scale. The funding will also be used to add support for our platform in other languages, invest in recruiting talent for our New York office, and focus on sales, marketing, and customer success.

Voicebot: There is a lot of innovation in speech-to-text (STT) technology today to feed NLUs for voice assistants. Yet, your business model takes just the front end of that process and refines it for transcription services. Given all of the STT innovation, why do you think Verbit has grown so quickly over the past year? Was it simply an underserved market because of the rise of video and podcasting?

Livne: Virtually all industries today produce massive amounts of audio and video content. In fact, Gartner estimates this to be upwards of 70% of all organizational data. In this context, it’s clear that transcription and captioning is more critical than ever. The market itself is a traditionally manual space, so Verbit is capitalizing on the huge opportunity there for an AI solution to automate the task with greater speed and accuracy.

Voicebot: You employ a combination of AI for transcription and augment it with human editing to drive higher accuracy. Why is the human element necessary at this point?

Livne: Our voice recognition technology is about 90% accurate on its own, but in order to push it to 99% accuracy, we incorporate a dual review process of human transcribers to correct any mistakes. All of these edits then flow back into the model, which works to improve Verbit’s machine learning algorithms over time.

Voicebot: Facebook M famously tried to offer a combination of AI and human curation for its assistant experiment, but it ultimately didn’t work out. Why is your mixed model of AI and humans different and succeeding?

Livne: Verbit leverages an on-demand platform of thousands of highly skilled freelancers who edit the automatically transcribed files. Our combination of top artificial and human intelligence is our advantage, and we prioritize linking the two by investing in our transcriber community through dedicated support and introducing initiatives to engage them and improve their skills. This ensures that both essential components of our model are successful and produce the best results.