Speech-to-Text (Streaming) (Beta)
Returns recognition results in real time from a stream of audio data. All data is exchanged over WebSocket.
Endpoint
Request details
wss://translate.rozetta-api.io/api/v1/translate/stt-streaming
Header
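Header | Description
accessKey, nonce, signature | See the "Encrypted Signature" section.
Note: the sample code on this page sends these values, together with contractId and remoteurl, as a Base64-encoded JSON object in the auth query parameter of the WebSocket URL.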
Command
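Command type | Description
SET_LANGUAGE | Sets the language of the speech. Currently supported: "en" (English), "ja" (Japanese), "zh-CN" (Mandarin), "zh-HK" (Cantonese), and "zh-TW" (Taiwanese Mandarin).
SET_SAMPLING_RATE | Sets the sampling rate of the audio stream. The recommended value is 16000.
END_STREAM | Marks the end of the audio stream. Takes no other parameters.
END_SESSION | Marks the end of a single speech recognition session. Takes no other parameters. Once the session ends, the WebSocket connection is closed.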
Command example (set speech language)
{
  "command": "SET_LANGUAGE",
  "value": "ja"
}
Command example (signify end of stream)
{
  "command": "END_STREAM"
}
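Because every command is a small JSON text message, a helper can build the messages and catch typos before anything is sent. The following is a minimal sketch, not part of the API: `buildCommand`, `SUPPORTED_LANGUAGES`, and the two command lists are hypothetical names, and the allowed values are simply copied from the command table above.

```javascript
// Hypothetical helper (not part of the API): builds the JSON text for a command.
// Command names and language codes are copied from the command table above.
const SUPPORTED_LANGUAGES = ['en', 'ja', 'zh-CN', 'zh-HK', 'zh-TW'];
const COMMANDS_WITH_VALUE = ['SET_LANGUAGE', 'SET_SAMPLING_RATE'];
const COMMANDS_WITHOUT_VALUE = ['END_STREAM', 'END_SESSION'];

function buildCommand(command, value) {
  // END_STREAM and END_SESSION take no other parameters.
  if (COMMANDS_WITHOUT_VALUE.includes(command)) {
    return JSON.stringify({ command });
  }
  if (!COMMANDS_WITH_VALUE.includes(command)) {
    throw new Error(`Unknown command: ${command}`);
  }
  if (command === 'SET_LANGUAGE' && !SUPPORTED_LANGUAGES.includes(value)) {
    throw new Error(`Unsupported language: ${value}`);
  }
  if (command === 'SET_SAMPLING_RATE' && !(Number.isInteger(value) && value > 0)) {
    throw new Error(`Invalid sampling rate: ${value}`);
  }
  return JSON.stringify({ command, value });
}

console.log(buildCommand('SET_LANGUAGE', 'ja'));
// prints {"command":"SET_LANGUAGE","value":"ja"}
console.log(buildCommand('END_STREAM'));
// prints {"command":"END_STREAM"}
```

The returned string can be passed directly to the `send` method of the WebSocket connection.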
Audio stream
Audio data must be sent over the WebSocket connection. The audio must be single-channel (mono), 16 bits per sample, and in WAV format.
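Checking a file against these requirements before sending it can save a round trip. The sketch below reads the `fmt ` chunk of a WAV file and lists any mismatches. It is illustrative only: `readWavFormat`, `findFormatProblems`, and `makeWavHeader` are hypothetical names, the expected sampling rate is whatever you pass to SET_SAMPLING_RATE, and the bit-depth check assumes the requirement means 16 bits per sample.

```javascript
// Reads channel count, sample rate, and bit depth from the "fmt " chunk
// of a WAV file held in a Buffer.
function readWavFormat(buffer) {
  if (buffer.length < 12
      || buffer.toString('ascii', 0, 4) !== 'RIFF'
      || buffer.toString('ascii', 8, 12) !== 'WAVE') {
    throw new Error('Not a RIFF/WAVE file');
  }
  let offset = 12;
  while (offset + 8 <= buffer.length) {
    const chunkId = buffer.toString('ascii', offset, offset + 4);
    const chunkSize = buffer.readUInt32LE(offset + 4);
    if (chunkId === 'fmt ') {
      if (offset + 24 > buffer.length) {
        throw new Error('Truncated "fmt " chunk');
      }
      return {
        channels: buffer.readUInt16LE(offset + 10),
        sampleRate: buffer.readUInt32LE(offset + 12),
        bitsPerSample: buffer.readUInt16LE(offset + 22),
      };
    }
    // Skip this chunk; chunks are padded to an even number of bytes.
    offset += 8 + chunkSize + (chunkSize % 2);
  }
  throw new Error('No "fmt " chunk found');
}

// Assumption: the audio requirement means 1 channel and 16 bits per sample.
function findFormatProblems(format, expectedSampleRate) {
  const problems = [];
  if (format.channels !== 1) {
    problems.push(`expected 1 channel, got ${format.channels}`);
  }
  if (format.bitsPerSample !== 16) {
    problems.push(`expected 16 bits per sample, got ${format.bitsPerSample}`);
  }
  if (format.sampleRate !== expectedSampleRate) {
    problems.push(`expected ${expectedSampleRate} Hz, got ${format.sampleRate} Hz`);
  }
  return problems;
}

// Demo helper: builds a minimal 44-byte PCM header so the example runs without a file.
function makeWavHeader(channels, sampleRate, bitsPerSample) {
  const header = Buffer.alloc(44);
  header.write('RIFF', 0, 'ascii');
  header.writeUInt32LE(36, 4);
  header.write('WAVE', 8, 'ascii');
  header.write('fmt ', 12, 'ascii');
  header.writeUInt32LE(16, 16); // size of the fmt chunk
  header.writeUInt16LE(1, 20); // audio format 1 = PCM
  header.writeUInt16LE(channels, 22);
  header.writeUInt32LE(sampleRate, 24);
  header.writeUInt32LE((sampleRate * channels * bitsPerSample) / 8, 28);
  header.writeUInt16LE((channels * bitsPerSample) / 8, 32);
  header.writeUInt16LE(bitsPerSample, 34);
  header.write('data', 36, 'ascii');
  header.writeUInt32LE(0, 40);
  return header;
}

console.log(findFormatProblems(readWavFormat(makeWavHeader(1, 16000, 16)), 16000));
// prints []
console.log(findFormatProblems(readWavFormat(makeWavHeader(2, 44100, 16)), 16000));
// prints [ 'expected 1 channel, got 2', 'expected 16000 Hz, got 44100 Hz' ]
```

With a real file, pass the Buffer returned by `fs.promises.readFile` to `readWavFormat` instead of the synthetic header.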
Response
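Response type | Description
LANGUAGE_READY | The speech language was set successfully.
SAMPLING_RATE_READY | The sampling rate of the audio stream was set successfully.
RECOGNITION_RESULT | The text recognized from the audio stream. As it processes the audio it receives, the server may revise the result and send the revised result multiple times.
RECOGNITION_ERROR | An error returned by the speech recognition engine. It may occur multiple times.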
Response example (language is set)
{
  "type": "LANGUAGE_READY"
}
Response example (recognition result)
{
  "type": "RECOGNITION_RESULT",
  "value": "そこに着いたらもう一度誰かに尋ねてください"
}
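Since RECOGNITION_RESULT may arrive several times with revised text, a client usually needs to decide which result to keep. The sketch below keeps only the most recent one and collects errors separately. It rests on an assumption the documentation does not confirm, namely that each new result supersedes the previous one. `createResponseTracker` is a hypothetical name, and the partial transcript in the demo is made up for illustration.

```javascript
// Hypothetical helper: tracks server responses for one recognition session.
// Assumption: each RECOGNITION_RESULT replaces the previous one.
function createResponseTracker() {
  const state = { latestTranscript: null, revisions: 0, errors: [] };
  return {
    state,
    handle(rawMessage) {
      // The ws library delivers messages as Buffers; String() also accepts strings.
      const message = JSON.parse(String(rawMessage));
      switch (message.type) {
        case 'RECOGNITION_RESULT':
          state.latestTranscript = message.value;
          state.revisions += 1;
          break;
        case 'RECOGNITION_ERROR':
          state.errors.push(message.value);
          break;
        case 'LANGUAGE_READY':
        case 'SAMPLING_RATE_READY':
          // Acknowledgements carry no transcript data.
          break;
        default:
          throw new Error(`Unexpected response type: ${message.type}`);
      }
      return message.type;
    },
  };
}

// Demo with hand-written messages; the first result is an invented partial transcript.
const tracker = createResponseTracker();
[
  '{"type":"LANGUAGE_READY"}',
  '{"type":"RECOGNITION_RESULT","value":"そこに着いたら"}',
  '{"type":"RECOGNITION_RESULT","value":"そこに着いたらもう一度誰かに尋ねてください"}',
].forEach((raw) => tracker.handle(raw));

console.log(tracker.state.latestTranscript);
// prints そこに着いたらもう一度誰かに尋ねてください
console.log(tracker.state.revisions);
// prints 2
```

In a real client, `tracker.handle` would be called from the `message` event handler of the WebSocket connection.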

api/v1/translate/stt-streaming
Send an audio data stream and receive recognition results in real time.
const fs = require('fs');
const WebSocket = require('ws');

// Local helper from the sample project; assumed to export generateSignature().
const authUtils = require('./utils/auth-utils');

const fsPromise = fs.promises;

const apiPath = '/api/v1/translate/stt-streaming';
const apiEndpoint = `wss://translate.rozetta-api.io${apiPath}`;
const authConfig = {
  accessKey: 'ACCESS_KEY',
  secretKey: 'SECRET_KEY',
  nonce: Date.now().toString(),
  contractId: 'CONTRACT_ID',
};
const speechData = {
  language: 'ja',
  samplingRate: 16000,
  audioFile: 'speech.wav',
  audioBuffer: null,
};

/**
* Command type sent from the client.
*/
const commandType = {
  setLanguage: 'SET_LANGUAGE',
  setSamplingRate: 'SET_SAMPLING_RATE',
  endStream: 'END_STREAM',
  endSession: 'END_SESSION',
};

/**
* Response types received from API endpoint.
*/
const responseType = {
  languageReady: 'LANGUAGE_READY',
  samplingRateReady: 'SAMPLING_RATE_READY',
  recognitionResult: 'RECOGNITION_RESULT',
  recognitionError: 'RECOGNITION_ERROR',
};

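// Builds the authentication parameters for the given API path.
// generateSignature is assumed to be exported by ./utils/auth-utils.
const getAuth = (url) => {
  const nonce = Date.now().toString();
  return {
    accessKey: authConfig.accessKey,
    nonce,
    signature: authUtils.generateSignature(url, authConfig.secretKey, nonce),
    remoteurl: url,
    contractId: authConfig.contractId,
  };
};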

const handleSessionMessage = (connection, message) => {
  const messageJSON = JSON.parse(message);
  switch (messageJSON.type) {
    case responseType.languageReady:
      // The language is set. Set the sampling rate.
      console.log('Language is set. Set sampling rate.');
      connection.send(JSON.stringify({
        command: commandType.setSamplingRate,
        value: speechData.samplingRate,
      }));
      break;
    case responseType.samplingRateReady:
      // The sampling rate is set. Send the audio data stream.
      console.log('Sampling rate is set. Send audio data stream.');
      connection.send(speechData.audioBuffer);
      connection.send(JSON.stringify({
        command: commandType.endStream,
      }));
      break;
    case responseType.recognitionResult:
      console.log('Recognized transcript:');
      console.log(messageJSON.value);
      break;
    case responseType.recognitionError:
      console.error('Recognition error:');
      console.error(messageJSON.value);
      // In case of error, we close the connection immediately.
      connection.send(JSON.stringify({
        command: commandType.endSession,
      }));
      break;
    default:
      console.log('Unexpected response type:');
      console.log(messageJSON.type);
  }
};

const main = async () => {
  speechData.audioBuffer = await fsPromise.readFile(speechData.audioFile);
  const auth = getAuth(apiPath);
  console.log(apiPath);
  console.log(auth);
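  // Buffer is used instead of the legacy btoa() global for Base64 encoding.
  const auth64 = Buffer.from(JSON.stringify(auth)).toString('base64');
  const url = `${apiEndpoint}?auth=${auth64}`;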
  console.log(url);
  const connection = new WebSocket(url);
  connection.on('open', () => {
    console.log('Connected to streaming STT API.');
    // Once connected, set the speech language.
    connection.send(JSON.stringify({
      command: commandType.setLanguage,
      value: speechData.language,
    }));
  });
  connection.on('message', (message) => {
    handleSessionMessage(connection, message);
  });
  connection.on('error', (error) => {
    console.error(error.message);
    connection.close();
  });
  connection.on('close', () => {
    console.log('Connection closed.');
  });
};

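// Report failures such as a missing audio file instead of an unhandled rejection.
main().catch((error) => {
  console.error(error.message);
  process.exitCode = 1;
});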
For the authentication method, see the "Encrypted Signature" section.
For complete sample code in each language, see here.
©️ 2019 Rozetta API  ・  Powered by Rozetta

Rozetta Corp.
