Cloud Speech Recognition API

Transform speech to text with high accuracy in multiple languages

Speech-to-Text Features

Fast, accurate, multilingual speech recognition and speech summarization software

High-Quality Transcription

Get human-quality transcriptions with state-of-the-art speech recognition technology

Audio and Video Summarization

Auto-summarize your recordings and highlight key moments in the discussion

Ease of Integration

Integrate speech recognition technology into your apps in minutes

Multi-Language Support

We support all major languages for use in speech-to-text transcription

All File Formats Accepted

The speech-to-text API supports almost all audio and video file formats

Affordable Price

Accurate, multi-language speech recognition API from 1.2¢ per minute

Save money and get unique speech recognition features

Feature                   | SpeechText.AI | Watson Speech to Text | Google Speech API    | Rev.AI
Price per Hour            | $0.7-$1.0     | $1.2                  | $1.44-$2.16          | $2.16
Languages Supported       | Multiple      | Multiple              | Multiple             | English Only
Punctuation/Casing        |               |                       |                      |
Speaker Identification    |               |                       |                      |
Keyword Highlights        |               |                       |                      |
Audio/Video Summarization |               |                       |                      |
Integration Time          | Up to 1 hour  | 1-2 days              | 1-2 days             | 2-4 hours
All File Formats Accepted |               |                       |                      |
Process Data From         | Anywhere      | Binary Data           | Cloud Storage Bucket | Anywhere
Export as SRT/VTT         |               |                       |                      |
Free Technical Support    |               |                       |                      |

Quick and Simple Integration

Build accurate speech recognition applications in minutes. We handle the underlying complexity and wrap it in a few lines of code.


import requests

secret_key = "SECRET_KEY"

# load the audio file into memory as raw bytes
with open("/path/to/your/file.mp3", mode="rb") as file:
    post_body = file.read()

API_URL = "https://api.speechtext.ai/recognize"
header = {"Content-Type": "application/octet-stream"}

# query-string options; booleans are sent as lowercase strings, as in the cURL example
options = {
    "key": secret_key,
    "language": "en-US",
    "punctuation": "true",
    "speaker_detection": "true",
    "format": "mp3",
}

# send the audio file to SpeechText.AI to create a transcription task
r = requests.post(API_URL, headers=header, params=options, data=post_body)
r.raise_for_status()  # stop on an HTTP error status
print(r.text)  # raw response body

# create transcription task
curl -H "Content-Type:application/octet-stream" --data-binary @/path/to/your/file.m4a "https://api.speechtext.ai/recognize?key=SECRET_KEY&language=en-US&punctuation=true&speaker_detection=true&format=m4a"

# retrieve transcription results
curl -X GET "https://api.speechtext.ai/results?key=SECRET_KEY&task=TASK_ID&summary=true&summary_size=15&highlights=true&max_keywords=10"

# get captions
curl -X GET "https://api.speechtext.ai/results?key=SECRET_KEY&task=TASK_ID&output=srt&max_caption_words=10"

# process public URL
curl -X GET "https://api.speechtext.ai/recognize?key=SECRET_KEY&url=PUBLIC_URL&language=en-US&punctuation=true&speaker_detection=true&format=mp3"
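
The cURL calls above differ only in their query strings, so the URLs can be assembled programmatically. The following is a minimal Python sketch, assuming only the endpoints and parameter names shown above. The helper names `build_url`, `results_url` and `captions_url` are hypothetical and not part of any SpeechText.AI client library; the code builds URLs and does not send a request.

```python
from urllib.parse import urlencode

BASE_URL = "https://api.speechtext.ai"  # host taken from the examples above


def build_url(endpoint, **params):
    """Return BASE_URL/endpoint with params encoded as a query string.

    Booleans are written as lowercase "true"/"false", matching the cURL examples.
    """
    encoded = {
        name: str(value).lower() if isinstance(value, bool) else value
        for name, value in params.items()
    }
    return f"{BASE_URL}/{endpoint}?{urlencode(encoded)}"


def results_url(key, task_id, **options):
    """URL for retrieving the results of a transcription task (hypothetical helper)."""
    return build_url("results", key=key, task=task_id, **options)


def captions_url(key, task_id, max_caption_words=10):
    """URL for retrieving captions in SRT format (hypothetical helper)."""
    return results_url(key, task_id, output="srt", max_caption_words=max_caption_words)


if __name__ == "__main__":
    print(results_url("SECRET_KEY", "TASK_ID", summary=True, summary_size=15,
                      highlights=True, max_keywords=10))
    print(captions_url("SECRET_KEY", "TASK_ID"))
```

Keeping URL construction separate from the HTTP call makes the query parameters easy to inspect before any audio is uploaded.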
                    

<?php

$secret_key = "SECRET_KEY";

# load the entire audio file into a binary string
$binary = file_get_contents('/path/to/your/file.m4a');
if ($binary === false) {
    exit("Could not read the audio file\n");
}

# endpoint and options to start a transcription task
$endpoint = "https://api.speechtext.ai/recognize?key=".$secret_key."&language=en-US&punctuation=true&speaker_detection=true&format=m4a";
$header = array('Content-type: application/octet-stream');

# curl connection initialization
$ch = curl_init();

# curl options
curl_setopt_array($ch, array(
    CURLOPT_URL => $endpoint,
    CURLOPT_RETURNTRANSFER => true,
    CURLOPT_POST => true,
    CURLOPT_HEADER => false,
    CURLOPT_HTTPHEADER => $header,
    CURLOPT_POSTFIELDS => $binary,
    CURLOPT_FOLLOWLOCATION => true
));

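# send an audio transcription request
$body = curl_exec($ch);

# report transport-level failures, otherwise print the raw response
echo ($body === false) ? 'cURL error: ' . curl_error($ch) : $body;

curl_close($ch);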
                    

import java.net.*;
import java.io.*;


public class Transcriber {

    public static void main(String[] args) throws Exception {
        String secret_key = "SECRET_KEY";
        HttpURLConnection conn;
        
        // endpoint and options to start a transcription task
        URL endpoint = new URL("https://api.speechtext.ai/recognize?key=" + secret_key +"&language=en-US&punctuation=true&speaker_detection=true&format=m4a");
        
        // loads the audio into memory
        File file = new File("/path/to/your/file.m4a");
        RandomAccessFile f = new RandomAccessFile(file, "r");
        long sz = f.length();
        byte[] post_body = new byte[(int) sz];
        f.readFully(post_body);
        f.close();
        
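        // send an audio transcription request
        conn = (HttpURLConnection) endpoint.openConnection();
        conn.setRequestMethod("POST");
        conn.setRequestProperty("Content-Type", "application/octet-stream");
        conn.setDoOutput(true);

        // write the audio bytes as the request body
        try (OutputStream os = conn.getOutputStream()) {
            os.write(post_body);
        }

        // reading the status code completes the request
        int status = conn.getResponseCode();
        System.out.println("HTTP status: " + status);
        conn.disconnect();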
    }
}
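
The samples above only submit audio; as the cURL examples show, results are fetched in a separate call to the results endpoint with a task ID. The response format is not described here, so the Python sketch below stays generic: `poll_until` is a hypothetical helper that repeatedly calls a caller-supplied `fetch` function until a caller-supplied `is_done` predicate accepts the response. Both callables, and the `"status"` field in the usage example, are assumptions for illustration only.

```python
import time


def poll_until(fetch, is_done, interval=5.0, max_attempts=60, sleep=time.sleep):
    """Call fetch() until is_done(response) is true, then return the response.

    Raises TimeoutError if no response is accepted within max_attempts calls.
    """
    for attempt in range(max_attempts):
        response = fetch()
        if is_done(response):
            return response
        if attempt < max_attempts - 1:
            sleep(interval)  # wait before asking again
    raise TimeoutError(f"task not finished after {max_attempts} attempts")


if __name__ == "__main__":
    # Stand-in for real HTTP calls: the "status" field is an assumed
    # response shape, not taken from the API documentation.
    fake_responses = iter([{"status": "processing"}, {"status": "finished"}])
    result = poll_until(
        fetch=lambda: next(fake_responses),
        is_done=lambda r: r["status"] == "finished",
        interval=0,
    )
    print(result)
```

In real use, `fetch` would wrap a GET request to the results URL, and `is_done` would inspect whatever completion indicator the service actually returns.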
                    

Pricing

Monthly subscription packages to suit any budget and use case

LITE

$49

  • 2,700 recognition minutes
  • $0.018 per minute
  • Multiple languages
  • Speech Recognition

STARTUP

$99

  • 5,800 recognition minutes
  • $0.017 per minute
  • Multiple languages
  • Speech Recognition
  • Summarization features
popular

PRODUCTION

$199

  • 13,250 recognition minutes
  • $0.015 per minute
  • Multiple languages
  • Speech Recognition
  • Summarization features

ENTERPRISE

$399

  • 33,250 recognition minutes
  • $0.012 per minute
  • Multiple languages
  • Speech Recognition
  • Summarization features
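
The per-minute price listed for each plan follows from dividing the monthly price by the included minutes. Below is a short Python check of that arithmetic, using only the figures listed above; the `PLANS` table and `per_minute` helper are names introduced here for illustration.

```python
# Monthly price in USD and included recognition minutes, as listed above
PLANS = {
    "LITE": (49, 2700),
    "STARTUP": (99, 5800),
    "PRODUCTION": (199, 13250),
    "ENTERPRISE": (399, 33250),
}


def per_minute(plan):
    """Effective price per recognition minute, rounded to three decimals."""
    price, minutes = PLANS[plan]
    return round(price / minutes, 3)


if __name__ == "__main__":
    for name in PLANS:
        rate = per_minute(name)
        print(f"{name}: ${rate:.3f} per minute, about ${rate * 60:.2f} per hour")
```

At these rates the plans work out to roughly $0.72 to $1.08 per hour of audio.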

Frequently Asked Questions