Cloud Speech Recognition API

Speech-to-Text Features

Fast, accurate, multilingual speech recognition and speech summarization software

High-Quality Transcription

Get human-quality transcriptions with state-of-the-art speech recognition technology

Audio and Video Summarization

Auto-summarize your recordings and highlight key moments in the discussion

Ease of Integration

Integrate speech recognition technology into your apps in minutes

Multi-Language Support

We support 27 languages, plus regional variants, for speech-to-text transcription

All File Formats Accepted

The speech-to-text API accepts almost all audio and video file formats

Affordable Price

Accurate, multi-language speech recognition API from 1.2¢ per minute

Save money and get unique speech recognition features

	SpeechText.AI	Watson Speech to Text	Google Speech API	Rev.AI
Price per Hour	$0.7-$1.0	$1.2	$1.44-$2.16	$1.4
Languages Supported	Multiple	Multiple	Multiple	Multiple
Punctuation/Casing
Keyword Highlights
Audio/Video Summarization
Integration Time	Up to 1 hour	1-2 days	1-2 days	2-4 hours
All File Formats Accepted
Process Data From	Anywhere	Binary Data	Cloud Storage Bucket	Anywhere
Export as SRT/VTT
Free Technical Support

Quick and Simple Integration

Build accurate speech recognition applications in minutes. We handle the underlying complexity and wrap it in a few lines of code.

Documentation


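import requests

secret_key = "SECRET_KEY"

# load the audio file into memory
with open("/path/to/your/file.mp3", mode="rb") as file:
    post_body = file.read()

API_URL = "https://api.speechtext.ai/recognize"
header = {"Content-Type": "application/octet-stream"}

# query-string options for the transcription task
options = {
    "key": secret_key,
    "language": "en-US",
    # sent as the literal string "true", matching the curl example
    # (requests would serialize a Python True as "True")
    "punctuation": "true",
    "format": "mp3"
}

# send the audio file to SpeechText.AI
r = requests.post(API_URL, headers=header, params=options, data=post_body)
# stop here if the server returned an HTTP error status
r.raise_for_status()
# print the raw response body
print(r.text)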
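Once a task has been created, its results are fetched from the `results` endpoint with a GET request. The following is a minimal Python sketch of assembling those requests, using only the query parameters that appear in the curl examples on this page. The helper names `build_results_url` and `fetch` are illustrative, not part of any SDK, and because the response format is not shown here the body is returned as raw text.

```python
from urllib.parse import urlencode
from urllib.request import urlopen

RESULTS_URL = "https://api.speechtext.ai/results"


def build_results_url(secret_key, task_id, **options):
    """Assemble a GET URL for the results endpoint (illustrative helper)."""
    params = {"key": secret_key, "task": task_id}
    for name, value in options.items():
        # booleans are written as lowercase true/false, as in the curl examples
        params[name] = str(value).lower() if isinstance(value, bool) else value
    return RESULTS_URL + "?" + urlencode(params)


def fetch(url):
    """Send the GET request and return the raw response body as text."""
    with urlopen(url) as response:
        return response.read().decode("utf-8")


# transcript with a summary and keyword highlights
summary_url = build_results_url("SECRET_KEY", "TASK_ID", summary=True,
                                summary_size=15, highlights=True, max_keywords=10)

# captions in SRT format
captions_url = build_results_url("SECRET_KEY", "TASK_ID", output="srt",
                                 max_caption_words=10)
```

Calling `fetch(summary_url)` with a real key and task ID would perform the request; nothing is sent when the URLs are only built.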


# create transcription task
curl -H "Content-Type:application/octet-stream" --data-binary @/path/to/your/file.m4a "https://api.speechtext.ai/recognize?key=SECRET_KEY&language=en-US&punctuation=true&format=m4a"

# retrieve transcription results
curl -X GET "https://api.speechtext.ai/results?key=SECRET_KEY&task=TASK_ID&summary=true&summary_size=15&highlights=true&max_keywords=10"

# get captions
curl -X GET "https://api.speechtext.ai/results?key=SECRET_KEY&task=TASK_ID&output=srt&max_caption_words=10"

# process public URL
curl -X GET "https://api.speechtext.ai/recognize?key=SECRET_KEY&url=PUBLIC_URL&language=en-US&punctuation=true&format=mp3"
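When the audio is passed through the `url` parameter, the file address itself becomes a query-string value, so any `?`, `&`, or `=` inside it should be percent-encoded before the request is sent. Below is a small Python sketch of building that request under this assumption. `build_recognize_url` is an illustrative helper name and the example.com link is a made-up placeholder.

```python
from urllib.parse import urlencode

RECOGNIZE_URL = "https://api.speechtext.ai/recognize"


def build_recognize_url(secret_key, public_url, language="en-US", audio_format="mp3"):
    """Build the GET URL that starts a transcription task for a public file."""
    params = {
        "key": secret_key,
        "url": public_url,  # urlencode percent-encodes reserved characters
        "language": language,
        "punctuation": "true",
        "format": audio_format,
    }
    return RECOGNIZE_URL + "?" + urlencode(params)


# placeholder link, not a real file
request_url = build_recognize_url(
    "SECRET_KEY", "https://example.com/audio/file.mp3?dl=1&raw=1"
)
print(request_url)
```

Without the encoding step, the `&raw=1` part of the placeholder link would be read as a separate parameter of the API request rather than as part of the file address.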


<?php

$secret_key = "SECRET_KEY";

# load the audio
$filesize = filesize('/path/to/your/file.m4a');
$fp = fopen('/path/to/your/file.m4a', 'rb');
# read the entire file into a binary string
$binary = fread($fp, $filesize);
# release the file handle once the data is in memory
fclose($fp);

# endpoint and options to start a transcription task
$endpoint = "https://api.speechtext.ai/recognize?key=".$secret_key."&language=en-US&punctuation=true&format=m4a";
$header = array('Content-type: application/octet-stream');

# curl connection initialization
$ch = curl_init();

# curl options
curl_setopt_array($ch, array(
    CURLOPT_URL => $endpoint,
    CURLOPT_RETURNTRANSFER => true,
    CURLOPT_POST => true,
    CURLOPT_HEADER => false,
    CURLOPT_HTTPHEADER => $header,
    CURLOPT_POSTFIELDS => $binary,
    CURLOPT_FOLLOWLOCATION => true
));

# send an audio transcription request
$body = curl_exec($ch);

curl_close($ch);


import java.net.*;
import java.io.*;


public class Transcriber {

    public static void main(String[] args) throws Exception {
        String secret_key = "SECRET_KEY";
        HttpURLConnection conn;
        
        // endpoint and options to start a transcription task
        URL endpoint = new URL("https://api.speechtext.ai/recognize?key=" + secret_key +"&language=en-US&punctuation=true&format=m4a");
        
        // loads the audio into memory
        File file = new File("/path/to/your/file.m4a");
        RandomAccessFile f = new RandomAccessFile(file, "r");
        long sz = f.length();
        byte[] post_body = new byte[(int) sz];
        f.readFully(post_body);
        f.close();
        
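        // send an audio transcription request
        conn = (HttpURLConnection) endpoint.openConnection();
        conn.setRequestMethod("POST");
        conn.setRequestProperty("Content-Type", "application/octet-stream");

        conn.setDoOutput(true);
        conn.connect();
        OutputStream os = conn.getOutputStream();
        os.write(post_body);
        os.flush();
        os.close();

        // read the server's response so the outcome of the request is not lost
        int status = conn.getResponseCode();
        InputStream is = status < 400 ? conn.getInputStream() : conn.getErrorStream();
        if (is != null) {
            BufferedReader reader = new BufferedReader(new InputStreamReader(is, "UTF-8"));
            String line;
            while ((line = reader.readLine()) != null) {
                System.out.println(line);
            }
            reader.close();
        }
        conn.disconnect();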
        
    }
}

Pricing

Monthly subscription packages to suit any budget and use case

LITE

$49

2,700 recognition minutes
$0.018 price per minute
Multiple languages
Speech Recognition

STARTUP

$99

5,800 recognition minutes
$0.017 price per minute
Multiple languages
Speech Recognition
Summarization features

PRODUCTION (popular)

$199

13,250 recognition minutes
$0.015 price per minute
Multiple languages
Speech Recognition
Summarization features

ENTERPRISE

$399

33,250 recognition minutes
$0.012 price per minute
Multiple languages
Speech Recognition
Summarization features
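
The per-minute figures follow from dividing each monthly price by the included minutes. Here is a short Python sketch of that arithmetic, using the plan numbers listed on this page. The `PLANS` table and the `smallest_plan_for` helper are illustrative names, and the helper assumes a plan must cover the whole monthly need, since overage terms are not described here.

```python
# monthly price in USD and included recognition minutes, taken from the plan list
PLANS = {
    "LITE": (49, 2700),
    "STARTUP": (99, 5800),
    "PRODUCTION": (199, 13250),
    "ENTERPRISE": (399, 33250),
}


def price_per_minute(plan):
    """Monthly price divided by the included minutes."""
    price, minutes = PLANS[plan]
    return price / minutes


def price_per_hour(plan):
    """Effective price for 60 minutes of audio on the given plan."""
    return price_per_minute(plan) * 60


def smallest_plan_for(minutes_needed):
    """Return the cheapest plan whose included minutes cover the need, or None."""
    candidates = [
        (price, name)
        for name, (price, minutes) in PLANS.items()
        if minutes >= minutes_needed
    ]
    return min(candidates)[1] if candidates else None


for name in PLANS:
    print(f"{name}: ${price_per_minute(name):.3f}/min, ${price_per_hour(name):.2f}/hour")
```

For example, `smallest_plan_for(5000)` selects STARTUP, because LITE includes only 2,700 minutes.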

Frequently Asked Questions

Is SpeechText.AI secure?

We guarantee SpeechText.AI subscribers that all files are deleted immediately after transcription finishes and that the connection to our servers is always encrypted. This means that your audio/video files are not used for any purpose other than automatic speech recognition, nor can they be accessed by third parties. All our physical servers are located in Europe, and all our operations comply with European Union data protection laws.

How can I manage/cancel my subscription?

The order receipt email sent to each customer includes a link to the customer's Account Management site. The site has separate tabs for Subscriptions, Account Details, and Payment Methods. The Subscriptions tab lists all active and inactive subscriptions, and the Manage command for each subscription lets you update the payment method or cancel the subscription. If you cancel a subscription, you can also reverse the cancellation here until the deactivation date.

Can I use the speech-to-text API without programming skills?

Yes. The speech recognition service supports HTTP GET requests and can transcribe audio/video data from public URLs (e.g., Google Drive or Dropbox). You can execute GET requests by using curl in your terminal window or even call the API directly from your web browser.

Which languages are supported by the speech recognition API?

Our speech-to-text service currently supports English, German, French, Spanish, Dutch, Italian, Portuguese, Russian, Chinese, Japanese, Korean, Arabic, Hindi, Polish, Swedish, Norwegian, Danish, Finnish, Turkish, Romanian, Czech, Ukrainian, Greek, Thai, Indonesian, Vietnamese, Filipino, English (Global model, multiple accents), English (India), German (Austria), Portuguese (Brazil), Spanish (Mexico), and French (Canada).

What are the audio summarization and keyword highlighting features?

The audio summarization feature automatically extracts the most important ideas from audio or video content and generates a summary of the transcription text. The speech recognition service can also automatically detect and highlight key phrases in the transcription results.

What is SpeechText.AI's accuracy?

The speech-to-text API is powered by deep learning technologies to help you transcribe speech quickly and accurately. Our state-of-the-art speech recognition algorithm achieves an average word error rate of 3.8% across several open datasets (~1000 hours of speech). However, many factors can affect recognition accuracy, including audio quality, background noise, and multiple speakers talking at the same time.
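
Word error rate is conventionally defined as the word-level edit distance between a reference transcript and the recognizer output, divided by the number of reference words. Below is a minimal Python sketch of that standard definition. It is not SpeechText.AI's evaluation code, and the sample sentences are invented for illustration.

```python
def word_error_rate(reference, hypothesis):
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference transcript is empty")

    # previous[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words
    previous = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        current = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = previous[j - 1] + (ref_word != hyp_word)
            deletion = previous[j] + 1
            insertion = current[j - 1] + 1
            current.append(min(substitution, deletion, insertion))
        previous = current
    return previous[-1] / len(ref)


# one substituted word ("jumps" -> "jumped") and one dropped word ("the")
wer = word_error_rate(
    "the quick brown fox jumps over the lazy dog",
    "the quick brown fox jumped over lazy dog",
)
print(f"WER: {wer:.1%}")
```

Under this definition, a 3.8% word error rate corresponds to roughly four word-level errors per hundred reference words.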
