What is Speech Recognition Used For?


Automatic speech recognition impacts everything from our personal lives to how justice is served in courts to improved student learning. So what exactly is speech recognition? How does speech recognition work across industries that impact our day to day lives, health, education, careers, and chances to thrive?


How Did Speech Recognition Begin?

It was 1952 when Bell Laboratories presented Audrey, the first speech recognition system, to the world. Audrey was able to recognize digits when spoken by a person. Ten years later, IBM’s Shoebox could already understand 16 spoken words in English, reports PCWorld. The US Department of Defense’s interest in the technology in the ’70s propelled significant growth.

Other important milestones included Carnegie Mellon’s introduction of the Harpy system, which was able to understand 1,011 words, and Bell Laboratories’ development of technology with the ability to recognize multiple people’s voices.

What is Speech Recognition Software and How Does it Work?

Automatic speech recognition is the ability of software to understand spoken human language. Today’s tools can understand complex sentences and the jargon of various industries. Uses range from dictating notes to an app that saves them as text to handling larger tasks like booking a car or ordering your groceries.

Modern speech recognition works by combining several statistical models. Early systems were rule-based instead: if the spoken words matched a fixed set of rules, the program could determine what the words were.

It’s important to note that human language is full of nuance. Accents, dialects and mannerisms can vastly change the way certain words or phrases are spoken or meant. Today’s speech recognition systems perform many mathematical calculations to arrive at the most likely interpretation of how a word is pronounced, or where one word ends and another begins. They also look at how different words are combined, and much more, across more than 200 trillion possibilities. The more the software is trained, the more accurate it gets.
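To make the idea of scoring word combinations concrete, here is a minimal sketch of one such statistical model: a bigram language model that prefers the candidate transcript whose word pairs it has seen more often. The tiny corpus, the function names and the add-one smoothing are all illustrative assumptions, not a description of any particular product. Real systems combine a score like this with an acoustic model and train on vastly more data.

```python
import math
from collections import Counter

def train_bigrams(sentences):
    """Count single words and adjacent word pairs in a (tiny, made-up) corpus."""
    unigrams, bigrams = Counter(), Counter()
    for sentence in sentences:
        words = ["<s>"] + sentence.lower().split()  # <s> marks the sentence start
        unigrams.update(words)
        bigrams.update(zip(words, words[1:]))
    return unigrams, bigrams

def log_score(sentence, unigrams, bigrams):
    """Log-probability of a word sequence under the bigram model (add-one smoothing)."""
    vocab_size = len(unigrams)
    words = ["<s>"] + sentence.lower().split()
    total = 0.0
    for prev, cur in zip(words, words[1:]):
        # Smoothing keeps unseen word pairs from getting probability zero.
        total += math.log((bigrams[(prev, cur)] + 1) / (unigrams[prev] + vocab_size))
    return total

def pick_transcript(candidates, unigrams, bigrams):
    """Return the candidate word sequence the model considers most likely."""
    return max(candidates, key=lambda c: log_score(c, unigrams, bigrams))

# Hypothetical training text; a real model would use far more.
corpus = [
    "we ate ice cream on the beach",
    "ice cream melts in the sun",
    "i like ice cream",
]
unigrams, bigrams = train_bigrams(corpus)

# Two ways of splitting the same sounds into words.
candidates = ["ice cream", "i scream"]
best = pick_transcript(candidates, unigrams, bigrams)
print(best)
```

Both candidates sound nearly identical, so the choice of where one word ends and the next begins falls to the word-pair statistics. Here the model has seen "ice cream" three times and "i scream" never, so it favors the former.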


Speech Recognition Affects Our Lives

From the legal system to healthcare to customer service, speech recognition software is being applied to create ease for businesses and affect our lives in a variety of ways.


What is Speech Recognition Used for in the Legal System?

The legal industry has long been transcribing its procedures to guarantee all parties have quick access to everything that was said on the record. A transcription ensures no party tries to change what was said later on, and it makes it easier to find and pull necessary quotes from a proceeding quickly.

When legal professionals record depositions or hearings with transcription software that includes AI speech recognition, they get a more accurate transcript and receive it sooner. The software recognizes complex legal terminology in the recording, and the more it is trained, the better it becomes at putting those terms into context. Transcripts are often ready either in real time or within a few days, and some providers are already able to guarantee 99% accuracy.
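The article does not say how providers measure accuracy. One common yardstick is word error rate (WER): the number of word substitutions, deletions and insertions needed to turn the machine transcript into a trusted reference, divided by the length of the reference. The sketch below assumes that definition and treats accuracy as 1 minus WER. The sample sentences are invented for illustration.

```python
def word_error_rate(reference, hypothesis):
    """Word-level edit distance between two transcripts, divided by reference length."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference transcript must contain at least one word")

    # previous[j] = edits needed to turn the reference words seen so far into hyp[:j]
    previous = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        current = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = previous[j - 1] + (ref_word != hyp_word)
            deletion = previous[j] + 1      # a reference word is missing from the hypothesis
            insertion = current[j - 1] + 1  # the hypothesis has an extra word
            current.append(min(substitution, deletion, insertion))
        previous = current
    return previous[-1] / len(ref)

# Invented example: a human-verified line and a machine transcript of it.
reference = "the witness saw the defendant leave the building"
hypothesis = "the witness saw a defendant leave building"

wer = word_error_rate(reference, hypothesis)
accuracy = 1 - wer  # assumption: accuracy is reported as 1 - WER
print(f"WER: {wer:.2%}, accuracy: {accuracy:.2%}")
```

Under this definition, a 99% accurate transcript has roughly one word-level error per hundred reference words.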


What is Speech Recognition Used for in Healthcare?

As in the legal system, accuracy matters significantly in healthcare. A growing number of health-focused companies are developing speech recognition algorithms. There are apps that take doctors’ notes for them during an exam or consultation, allowing doctors to focus on their patients. In operating rooms, a surgeon’s concentration can make the difference in saving a life, and keeping hands sterile is critical as well. Instead of touching devices, surgeons can simply talk to them.

Other apps focus on empowering patients. People with speech impairments, which are often associated with stroke, traumatic brain injury (TBI) and autism, can communicate better with those around them, including voicing medical concerns to doctors who previously might not have understood them.


What is Speech Recognition Used for in Customer Service?

Speech recognition technology is often used to offer self service. Contact center software, also known as IVR or interactive voice response technology, is often used for call routing. A customer tells the machine what he or she needs, and the machine provides the extension of the professional most equipped to help.
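The routing step described above can be sketched as a keyword match on the text the recognizer produces. Everything in this example is hypothetical: the departments, keywords and extension numbers are made up, and production IVR systems typically use trained intent classifiers rather than fixed keyword lists.

```python
import re

# Hypothetical departments: (keywords that suggest the department, extension to dial).
ROUTES = {
    "billing": ({"bill", "invoice", "payment", "refund"}, "101"),
    "schedules": ({"bus", "route", "schedule", "departure"}, "102"),
    "support": ({"broken", "error", "password", "login"}, "103"),
}

def route_call(transcribed_request, default_extension="0"):
    """Pick the extension whose keywords overlap most with the caller's words."""
    words = set(re.findall(r"[a-z']+", transcribed_request.lower()))
    best_extension, best_hits = default_extension, 0
    for keywords, extension in ROUTES.values():
        hits = len(words & keywords)
        if hits > best_hits:
            best_extension, best_hits = extension, hits
    # With no keyword match, fall back to the default (for example, a human operator).
    return best_extension

print(route_call("I have a question about a refund on my last bill"))
print(route_call("When is the next bus departure"))
```

The fallback extension matters in practice: a caller whose request matches nothing should reach a person rather than be routed at random.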

This use of speech recognition can be helpful when businesses are handling multiple calls. For example, when a customer calls to check bus schedules, she can state her departure location and destination out loud, and the contact center software will reply with the best bus route. When a customer prefers to talk to a human agent, she may be asked to state identifying information, such as her name, username, and phone number. She is then referred to the customer agent, who already has this identifying information on hand. As technology advances, it’s possible that customers will be recognized based on the unique characteristics of their voice, without the need to give this identifying information.


What is Speech Recognition Used for in Our Everyday Lives?

Speech recognition algorithms impact our everyday personal lives. At home, we can ask smart devices to find us a playlist or change the lighting without touching a physical computer or phone.

You can use speech recognition to get GPS directions to your office or to put on your favorite show on a smart TV or on YouTube. This technology helps people with disabilities, the elderly and even kids search for and enjoy entertainment.


Where Speech Recognition Makes Some of the Biggest Impact

Technology has always been about empowering human abilities. Speech recognition technology is making a huge impact by opening the door to voice-to-text and transcription possibilities. These technologies give deaf and hard of hearing students access to higher education that they previously lacked. That accessibility took years to develop, but now it happens instantaneously.

Verbit’s AI speech recognition engine, for example, is trained to automatically recognize countless words, professional terms, thought leaders’ names and rare book titles. Along with a range of other engines, the software is able to create real time transcription, so students can follow the class just like their peers.

When transcription is not done in real time, it’s reviewed by two human professionals, and their edits are used to train the software further. The more an organization uses the software, the faster students gain access to accurate educational content, reducing the barriers to education.

Visual learners, commuters and those who study in a second language are also positively impacted by speech recognition as an assistive technology in classrooms. Alternative learning paths provided by assistive technologies personalize the experience and set more students up for success when compared to traditional learning methods.

Alexa, Tell the Truth & Nothing but the Truth: How Audio is Solving Crimes

In July 2019, a Florida man was charged with the murder of his girlfriend, Silvia Crespo. He claimed it was an accident and that he’d even tried to save her life, reported The Washington Post. Back in November 2015, an Arkansas man was accused of murdering a friend at home. The man claimed it was an accident that happened while he slept off a night of football and drinking, reported CNN.

In both cases, the police decided to investigate a key witness: Alexa, Amazon’s voice-activated virtual assistant.

Police sought evidence of the July crime through a warrant for the audio recordings held by Amazon. They believed a recording of the attack on Crespo might be stored on Amazon’s servers, but obtaining such recordings opens the door to significant debate.

Audio and Voice Forensics Uncover The Facts

Sometimes, receiving a recording is all it takes to prove guilt or innocence. Yet not all recordings are made equal, and they’re certainly not all equally trustworthy. That’s where voice forensic and audio forensic experts come in. There’s also a lot of math involved.

“By measuring the time it takes the sound to reach each microphone,” software tools “can estimate the original location,” explained CBC. Further investigations of audio submitted as evidence often occur to ensure a direct recording was provided and that it caught the sound or events authentically.
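The arrival-time idea in the CBC quote can be illustrated with a small sketch. It assumes four microphones at known positions in a 10 m by 10 m area and a fixed speed of sound, then searches a grid for the point whose predicted arrival-time differences best match the measured ones. The microphone layout, grid search and function names are assumptions made for illustration. Forensic tools use more refined methods and must also cope with noise and echoes.

```python
import math

SPEED_OF_SOUND = 343.0  # metres per second in air at about 20 degrees Celsius

def arrival_times(source, mics):
    """Time in seconds for sound from `source` to reach each microphone."""
    return [math.dist(source, mic) / SPEED_OF_SOUND for mic in mics]

def locate(mics, times, area=10.0, step=0.1):
    """Grid-search the point whose arrival-time differences best match `times`.

    Only differences relative to the first microphone are used, because the
    moment the sound was actually emitted is unknown.
    """
    measured = [t - times[0] for t in times]
    best_point, best_error = None, float("inf")
    steps = int(round(area / step))
    for ix in range(steps + 1):
        for iy in range(steps + 1):
            point = (ix * step, iy * step)
            predicted = arrival_times(point, mics)
            error = sum(
                ((p - predicted[0]) - m) ** 2 for p, m in zip(predicted, measured)
            )
            if error < best_error:
                best_point, best_error = point, error
    return best_point

# Hypothetical setup: one microphone in each corner of the area.
mics = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0), (10.0, 10.0)]
true_source = (3.0, 7.0)

# Simulate recordings that all started at an unknown offset of 1.5 seconds.
times = [t + 1.5 for t in arrival_times(true_source, mics)]

estimate = locate(mics, times)
print(f"Estimated source: ({estimate[0]:.1f}, {estimate[1]:.1f})")
```

Because only time differences are compared, the unknown emission time cancels out, which is why several microphones are needed rather than one.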

Accuracy and authenticity are critical. Audio forensics experts are often tasked with figuring out whether an audio file matches the device it was allegedly recorded on. They must consider whether the recording was delivered as is, or some elements were edited out or added in.

Voice biometrics also come into play. Like fingerprints, voiceprints are unique to each person. They are composed of countless elements, and cannot be replicated or confused with another person’s voiceprint. This fact is crucial when considering recordings that contain multiple voices, especially similar ones that could sound the same to the untrained ear. As voice recognition technology grows smarter, ensuring the accuracy of an investigation and its corresponding evidence will become easier.


Transcribing Audio Files Accelerates Investigations

Another way technology is making law enforcement professionals’ lives easier is by allowing them to search through audio files, so they can easily find the piece of the recording needed to understand where to follow up.

Artificial intelligence has made it possible to get audio files transcribed faster than ever, with a high level of accuracy and at a reasonable cost. AI-based tools are trained on thousands of hours of content to understand industry terms, past cases and current events. Some providers have human professionals review each transcript in order to deliver 99% accurate results. Each time the software is used and its transcript is corrected, it becomes smarter.

Law enforcement professionals use transcription software on the go to record notes instead of stopping to write them down while chasing a suspect, which reduces the need to sacrifice accurate documentation. At the office, they can transcribe investigations, testimonies, 911 calls and surveillance materials, among others.

Some professionals process information better when reading it, while others might notice a detail they missed initially while reviewing materials in different formats. Text-based content is also easier to share with colleagues who are collaborating on a case. Finally, audio search is often helpful for law enforcement who must search for and find specific sentences in a sea of evidence.
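Once audio has been transcribed with timestamps, searching it becomes an ordinary text search. The sketch below assumes a hypothetical transcript format, a list of segments that each carry a start time, a speaker label and text, and returns every segment containing a phrase so an investigator can jump to that moment in the recording. The sample segments are invented.

```python
from dataclasses import dataclass

@dataclass
class Segment:
    """One timestamped piece of a transcript (hypothetical format)."""
    start_seconds: float
    speaker: str
    text: str

def search_transcript(segments, phrase):
    """Return the segments whose text contains the phrase, ignoring case and spacing."""
    needle = " ".join(phrase.lower().split())
    hits = []
    for segment in segments:
        haystack = " ".join(segment.text.lower().split())
        if needle in haystack:
            hits.append(segment)
    return hits

def format_timestamp(seconds):
    """Convert seconds into HH:MM:SS for locating the moment in the audio file."""
    minutes, secs = divmod(int(seconds), 60)
    hours, minutes = divmod(minutes, 60)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}"

# Invented sample transcript.
transcript = [
    Segment(12.0, "Caller", "I heard shouting from the apartment next door"),
    Segment(754.2, "Witness", "A blue  sedan was parked outside all night"),
    Segment(3725.9, "Officer", "Did anyone else see the vehicle"),
]

for hit in search_transcript(transcript, "Blue sedan"):
    print(format_timestamp(hit.start_seconds), hit.speaker, "-", hit.text)
```

Normalizing case and whitespace before matching keeps small transcription quirks, such as a doubled space, from hiding a relevant passage.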


Could Increased Audio Usage in Law Enforcement Lead to the End of Privacy?

Despite the many benefits of using today’s audio technologies to solve crimes efficiently and accurately, Amazon, which introduced Alexa to the world, was not quick to collaborate with law enforcement.

Amazon’s reasoning? Privacy.

“‘Given the important First Amendment and privacy implications at stake, the warrant should be quashed unless the Court finds that the State has met its heightened burden for compelled production of such materials,’ Amazon’s lawyers wrote in a February memo,” reported CNN in 2017, referring to the 2015 Arkansas murder case.

In July 2019, when law enforcement officials asked for the Amazon Echo files related to the Florida murder case, Amazon pushed back again. According to The Washington Post, which is owned by Amazon CEO Jeff Bezos, Amazon spokeswoman Faith Eischen said in a statement that the company “does not disclose customer information in response to government demands unless we’re required to do so to comply with a legally valid and binding order.” She added that the company “objects to overbroad or otherwise inappropriate demands as a matter of course.”


While Amazon’s fight for privacy is likely, at least partially, influenced by its commercial interest and the need to retain customer trust, it brings up a point that is valuable regardless of such interests. Many of us reveal intimate information to the devices that surround us every day. Our privacy, and therefore our safety, becomes more and more vulnerable with each passing year. Tapping into this information without our authorization cannot be an easy decision or process.


Moreover, organizations that partner up with technology companies to accelerate investigations and deliver justice need to take these facts into consideration. It is critical to ensure sensitive information is only shared with providers who have taken real steps to secure privacy, both from cybercriminals and from colleagues who don’t absolutely need the information for their work.

Despite the clear argument for protecting privacy, technology is empowering law enforcement to solve crimes like never before.

When Used Correctly, Technology Provides Greater Justice

Using audio in investigations is nothing new, yet as technology advances, it becomes easier to determine the specifics of a committed crime – the how, when, why, and who. As law enforcement professionals tap into the opportunities that the market’s tech tools provide, it is important to keep citizens’ rights in mind. Privacy and security concerns remain of the utmost importance.

When implemented effectively, technological advancements can help make the world a much safer place.

