Wondering what TTS SDK or NLP means? Growing exposure to modern technology is bound to send you searching for acronyms and glossaries of technical terms. Text-to-speech software has gained traction with the rising demand for hands-free operation, be it voice biometrics, e-learning content, aircraft voice commands or home automation. It is impressive how the software interprets language to generate audio output that resembles the natural cadence and intonation of human speech. Because the synthesized audio has to match the language of the text, the process involves complex computational techniques, along with a vocabulary of terms describing particular tools and uses.

For those who are new to the field, or just curious about the complicated phrases and acronyms related to text-to-speech software, below is a list of some common terms.

Speech Technology

All tools and techniques capable of interpreting, duplicating and responding to the human voice can be termed speech technology. It is the field of study and development concerned with processing human speech (e.g. speech to text) and mirroring it (e.g. text to speech).

Speech Recognition

It is also known as computer speech recognition, STT (speech to text) or ASR (automatic speech recognition). It refers to the process of interpreting spoken words and translating them into text. Examples include voice search on the internet and voice-activated assistants on smartphones.


TTS

This is an acronym for text to speech, and by now you should already know what it does: it translates text into speech. Another term for it is voice or speech synthesis. If you are looking for text-to-speech software, you can explore more at http://www.ttssoft.org/text-to-speech-software/ .

TTS Engine

It stands for Text to Speech Engine, also known as a speech synthesizer or voice synthesizer, and is the core technology that converts text into speech. It accepts the user's text and breaks it into phrases and syllables, or more precisely into linguistic segments that can be matched against a database of speech units. The engine analyzes the input text, searches the database for the speech units that most closely match it, strings them together and generates the human-sounding voice you hear.
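The segment, look up and string-together flow described above can be sketched in a few lines of Python. This is only a toy illustration: a real engine stores recorded audio and works with phonetic units, whereas here the hypothetical `UNIT_DATABASE` maps text fragments to placeholder strings, and the function names `segment`, `find_units` and `synthesize` are invented for the example.

```python
# Toy sketch only: a real engine stores recorded audio units, not strings.
# UNIT_DATABASE and its contents are hypothetical placeholders.
UNIT_DATABASE = {
    "hello": "<audio:hello>",
    "world": "<audio:world>",
    "hel": "<audio:hel>",
    "lo": "<audio:lo>",
}


def segment(text):
    """Split input text into lowercase word segments, dropping punctuation."""
    cleaned = "".join(ch if ch.isalnum() or ch.isspace() else " " for ch in text)
    return cleaned.lower().split()


def find_units(word):
    """Cover a word with stored units, always taking the longest match first."""
    units = []
    pos = 0
    while pos < len(word):
        for end in range(len(word), pos, -1):
            piece = word[pos:end]
            if piece in UNIT_DATABASE:
                units.append(UNIT_DATABASE[piece])
                pos = end
                break
        else:
            # No stored unit starts here: record the gap and skip one character.
            units.append("<missing:" + word[pos] + ">")
            pos += 1
    return units


def synthesize(text):
    """Return the ordered list of units that would be played back for the text."""
    units = []
    for word in segment(text):
        units.extend(find_units(word))
    return units


print(synthesize("Hello, world!"))
```

Calling `synthesize("Hello, world!")` returns the two whole-word units in order. A word that is only partly covered by the database, such as "helo", falls back to the shorter "hel" unit plus a marker for the missing piece.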


TTS API

A text-to-speech application programming interface, or TTS API, is a set of tools that aid in building TTS applications.

HMM or Hidden Markov Model

It refers to a statistical model of a system whose states are hidden, or unobserved, and have to be inferred from the outputs that can be observed. It is a mathematical modeling framework used in applications related to speech, handwriting and gestures.
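To make the idea of hidden states concrete, here is a minimal sketch of a two-state HMM in Python. The state names, observation labels and every probability are invented purely for illustration. The `forward` function computes how likely a sequence of observations is under the model, and `viterbi` recovers the most probable sequence of hidden states behind it.

```python
# All names and probabilities below are made up for illustration.
STATES = ("silence", "speech")
START = {"silence": 0.6, "speech": 0.4}
TRANS = {
    "silence": {"silence": 0.7, "speech": 0.3},
    "speech": {"silence": 0.4, "speech": 0.6},
}
EMIT = {
    "silence": {"quiet": 0.9, "loud": 0.1},
    "speech": {"quiet": 0.2, "loud": 0.8},
}


def forward(observations):
    """Probability of the observations, summed over every hidden-state path."""
    alpha = {s: START[s] * EMIT[s][observations[0]] for s in STATES}
    for obs in observations[1:]:
        alpha = {
            s: EMIT[s][obs] * sum(alpha[p] * TRANS[p][s] for p in STATES)
            for s in STATES
        }
    return sum(alpha.values())


def viterbi(observations):
    """Most probable sequence of hidden states for the observations."""
    best = {s: START[s] * EMIT[s][observations[0]] for s in STATES}
    paths = {s: [s] for s in STATES}
    for obs in observations[1:]:
        new_best, new_paths = {}, {}
        for s in STATES:
            # Pick the predecessor state that makes reaching s most likely.
            prev = max(STATES, key=lambda p: best[p] * TRANS[p][s])
            new_best[s] = best[prev] * TRANS[prev][s] * EMIT[s][obs]
            new_paths[s] = paths[prev] + [s]
        best, paths = new_best, new_paths
    final = max(STATES, key=lambda s: best[s])
    return paths[final]


print(viterbi(["quiet", "loud", "loud"]))
```

With these made-up numbers, the sequence quiet, loud, loud is best explained by one step of silence followed by two steps of speech, even though the states themselves are never observed directly.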

HTS a.k.a. HMM-based Speech Synthesis System

It is an acronym for "H Triple S" and refers to the HMM-based Speech Synthesis System. HTS is a speech synthesis technique based on a statistical model that produces speech units closely matching the input text. Its main drawback is slower performance, even with a smaller database, as it requires more computing power than USS. This brings us to the next term, USS.

USS or Unit Selection Synthesis

It is a well-known and widely used method of converting text into speech. Among the many methods for turning text into speech, this one is known for preserving the original voice of the recorded voice actors and for generating some of the most natural, human-like voices.


SSML

Most of us know HTML, a markup language used to write web pages. SSML stands for Speech Synthesis Markup Language and is an XML-based markup language designed specifically for speech synthesis applications.
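To give a feel for what such markup looks like, the sketch below builds a small SSML document using Python's standard `xml.etree.ElementTree` module. The `speak`, `break` and `emphasis` elements and the namespace URI come from the W3C SSML specification. The helper name `build_ssml`, its parameters and the sample sentence are assumptions made up for this example, and which tags a given TTS engine actually honors varies from product to product.

```python
import xml.etree.ElementTree as ET

# Namespace URI defined by the W3C SSML specification.
SSML_NS = "http://www.w3.org/2001/10/synthesis"


def build_ssml(sentence, emphasized_word, pause="500ms"):
    """Return an SSML string: a sentence, a pause, then an emphasized word.

    build_ssml is a hypothetical helper written for this illustration.
    """
    speak = ET.Element(
        "speak",
        {"version": "1.1", "xmlns": SSML_NS, "xml:lang": "en-US"},
    )
    speak.text = sentence + " "
    # <break> inserts a pause of the given duration.
    ET.SubElement(speak, "break", {"time": pause})
    # <emphasis> asks the synthesizer to stress the enclosed text.
    emphasis = ET.SubElement(speak, "emphasis", {"level": "strong"})
    emphasis.text = emphasized_word
    return ET.tostring(speak, encoding="unicode")


print(build_ssml("Welcome back.", "Listen"))
```

The printed result is a single `speak` element containing the sentence, a half-second `break`, and the word wrapped in `emphasis`, much like an HTML page wraps text in tags that tell the browser how to display it.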


TTS SDK

This acronym seems like a mouthful but is really simple. TTS SDK stands for Text to Speech Software Development Kit. Computer scientists and software professionals probably know about SDKs already; for everyone else, an SDK is the tool kit that allows developers to incorporate text-to-speech functionality into their applications.

NLP/Natural Language Processing

It is a field of study that spans computer science, linguistics and artificial intelligence. Concerned with the interactions between computers and natural (human) languages, NLP is critical in text-to-speech production because the TTS engine has to understand and interpret natural language.
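One small, concrete example of this kind of language processing is text normalization: rewriting abbreviations and digits as the words a person would say before any audio is generated. The sketch below is a deliberately simplified illustration, not how any particular engine does it. The `ABBREVIATIONS` table, the function names and the digit-by-digit reading of numbers are all assumptions chosen to keep the example short.

```python
import re

# Hypothetical lookup table; real systems use far larger, context-aware rules.
ABBREVIATIONS = {"dr.": "doctor", "st.": "street", "e.g.": "for example"}
DIGITS = ["zero", "one", "two", "three", "four",
          "five", "six", "seven", "eight", "nine"]


def spell_number(token):
    """Read a number digit by digit, e.g. '42' becomes 'four two'."""
    return " ".join(DIGITS[int(d)] for d in token)


def normalize(text):
    """Turn raw text into the plain words a synthesizer would speak."""
    words = []
    for token in text.lower().split():
        if token in ABBREVIATIONS:
            words.append(ABBREVIATIONS[token])
        elif token.isascii() and token.isdigit():
            words.append(spell_number(token))
        else:
            # Strip punctuation but keep apostrophes inside words.
            words.append(re.sub(r"[^\w']", "", token))
    return " ".join(w for w in words if w)


print(normalize("Dr. Smith lives at 42 Elm St."))
```

Even this tiny example hints at why real language understanding is needed: "St." can mean "street" or "saint", and "42" might be read as "forty-two" or "four two" depending on context, which a fixed lookup table cannot decide.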

NLU/Natural Language Understanding

It is a subfield of NLP that enables machine reading comprehension by helping computers derive meaning from human interactions and natural language input. In simple words, it is the aspect of language processing that allows the text-to-speech engine to understand the natural language text typed in.

If you are not a computer professional or do not have a computer science degree, the technical terms of text-to-speech technology can be overwhelming, but with a little effort you can overcome this mental barrier.