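Paraformer Speech Recognition: Hot Word Customization and Management
Supported domain / task: audio / asr (automatic speech recognition)
If the speech recognition service does not recognize some terms from your business domain well by default, you can use the hot word feature: add those terms to a hot word list to improve the recognition results.
Prerequisites
The service is activated and an API-KEY has been obtained: see Obtaining and configuring the API-KEY.
The latest SDK is installed: see Installing the DashScope SDK.
Hot words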
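Hot words are passed to the SDK as a hot word list: a dictionary whose keys are the hot word texts and whose values are the hot word weights. A hot word list can contain at most 500 hot words.
Hot word text rules: a pure-Chinese hot word can contain at most 10 Chinese characters; a pure-English or mixed Chinese-English hot word can contain at most 5 words after splitting on spaces.
Hot word weight rules: a valid weight is an integer in [1, 5] or [-6, -1]. To raise the recognition probability of a hot word, set a weight in [1, 5]; the larger the weight, the higher the probability. To lower the recognition probability, set a weight in [-6, -1]; the smaller the weight, the lower the probability.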
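The rules above can be checked locally before a hot word list is submitted. The following is a minimal sketch, not part of the SDK: validate_phrases is a hypothetical helper, and treating "Chinese character" as the CJK Unified Ideographs block U+4E00 to U+9FFF is an assumption.

```python
import re

MAX_PHRASES = 500        # maximum number of hot words in one list
MAX_CHINESE_CHARS = 10   # limit for pure-Chinese hot words
MAX_WORDS = 5            # limit for English or mixed hot words, split on whitespace

# Assumption: "Chinese character" means the basic CJK Unified Ideographs block.
_PURE_CHINESE = re.compile(r'[\u4e00-\u9fff]+')


def validate_phrases(phrases):
    """Return a list of problems found; an empty list means the dict passes."""
    problems = []
    if len(phrases) > MAX_PHRASES:
        problems.append('too many hot words: %d > %d' % (len(phrases), MAX_PHRASES))
    for text, weight in phrases.items():
        if not text.strip():
            problems.append('%r: empty hot word text' % (text,))
        elif _PURE_CHINESE.fullmatch(text):
            if len(text) > MAX_CHINESE_CHARS:
                problems.append('%r: more than %d Chinese characters'
                                % (text, MAX_CHINESE_CHARS))
        elif len(text.split()) > MAX_WORDS:
            problems.append('%r: more than %d words' % (text, MAX_WORDS))
        # bool is a subclass of int in Python, so it is rejected explicitly.
        is_int = isinstance(weight, int) and not isinstance(weight, bool)
        if not (is_int and (1 <= weight <= 5 or -6 <= weight <= -1)):
            problems.append('%r: weight %r is not an integer in [1, 5] or [-6, -1]'
                            % (text, weight))
    return problems


print(validate_phrases({'通義千問': 5, 'speech recognition': -2}))
print(validate_phrases({'通義千問': 0}))
```

Returning a list of problems instead of raising on the first one lets a caller report every invalid entry in a single pass.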
Hot word management
In both Java and Python, the AsrPhraseManager class manages hot words: creating, updating, deleting, and querying them.
Import
from dashscope.audio.asr import AsrPhraseManager
import com.alibaba.dashscope.audio.asr.phrase.AsrPhraseManager;
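Creating hot words
This interface submits a request to create hot words as a synchronous call.
Interface
def create_phrases(cls,
                   model: str,
                   phrases: Dict[str, Any],
                   training_type: str = 'compile_asr_phrase',
                   **kwargs)
AsrPhraseStatusResult CreatePhrases(AsrPhraseParam param)
    throws ApiException, NoApiKeyException, InputRequiredException
Parameters
The Java SDK takes an AsrPhraseParam object as its argument:
Parameter | Type | Description |
param | AsrPhraseParam | Configuration parameters for creating hot words; see the description of the AsrPhraseParam type above. A CreatePhrases call does not need the pageNo and pageSize fields, but it requires the model and phraseList fields. |
The Python SDK parameters are as follows:
Parameter | Type | Description |
model | str | Name of the Paraformer model to use. For guidance on choosing a model, see Model overview. |
phrases | Dict[str, Any] | Hot word list. |
training_type | str | Fixed to compile_asr_phrase. |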
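Model overview
Model name | Description |
paraformer-realtime-v1 | Paraformer Chinese real-time speech recognition model for real-time scenarios such as live video streaming and meetings. Supports only audio with a 16 kHz sample rate. |
paraformer-realtime-8k-v1 | Paraformer Chinese real-time speech recognition model for real-time scenarios such as 8 kHz telephone customer service. |
 | Paraformer Chinese-English speech recognition model for audio or video with a sample rate of 16 kHz or higher. |
paraformer-8k-v1 | Paraformer Chinese speech recognition model for 8 kHz telephone speech. |
paraformer-mtl-v1 | Paraformer multilingual speech recognition model for audio or video with a sample rate of 16 kHz or higher. Supported languages and dialects: Mandarin Chinese, Chinese dialects (Cantonese, Wu, Minnan, Northeastern, Gansu, Guizhou, Henan, Hubei, Hunan, Ningxia, Shanxi, Shaanxi, Shandong, Sichuan, Tianjin), English, Japanese, Korean, Spanish, Indonesian, French, German, Italian, and Malay. |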
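Sample response
The Java SDK returns an AsrPhraseStatusResult object and the Python SDK returns a Dict. Members of AsrPhraseStatusResult are read through the corresponding get methods; the member names are essentially the same as in the Python SDK and differ only in naming style (camelCase in Java).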
{
"status_code": 200,
"request_id": "2b815cfe-793f-9f3c-b528-5ade0a2d498e",
"code": null,
"message": "",
"output": {
"job_id": "ft-202309261539-2af1",
"status": "SUCCEEDED",
"finetuned_output": "paraformer-realtime-v1-ft-202309261539-2af1",
"training_type": "compile_asr_phrase",
"create_time": "2023-09-26 15:39:07"
},
"usage": null,
"job_id": "ft-202309261539-2af1",
"finetuned_output": "paraformer-realtime-v1-ft-202309261539-2af1",
"finetuned_outputs": null,
"training_type": null,
"create_time": "2023-09-26 15:39:07",
"output_type": null,
"model": null
}
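The finetuned_output value inside output is the hot word ID that the query, update, and delete calls take. A minimal sketch of reading it defensively follows, assuming the response is available as a plain dict shaped like the sample above; extract_phrase_id is a hypothetical helper, not an SDK function.

```python
def extract_phrase_id(response):
    """Return output.finetuned_output, or None if the call failed or the field is absent."""
    if response.get('status_code') != 200:
        return None
    # "output" may be missing or null on failure, so fall back to an empty dict.
    output = response.get('output') or {}
    return output.get('finetuned_output')


# Trimmed copy of the sample response above.
sample = {
    'status_code': 200,
    'output': {
        'job_id': 'ft-202309261539-2af1',
        'status': 'SUCCEEDED',
        'finetuned_output': 'paraformer-realtime-v1-ft-202309261539-2af1',
    },
}
print(extract_phrase_id(sample))
```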
Example call
# coding=utf-8
import dashscope
from dashscope.audio.asr import AsrPhraseManager
dashscope.api_key = 'your-dashscope-api-key'
# Hot word list: hot word text -> weight.
phrases = {'通義千問': 5}
result = AsrPhraseManager.create_phrases(model='paraformer-realtime-v1',
                                         phrases=phrases)
# finetuned_output is the hot word ID used by the query, update, and delete calls.
if result.output is not None and result.output['finetuned_output'] is not None:
    print('job_id:%s, finetuned_output:%s' %
          (result.output['job_id'], result.output['finetuned_output']))
else:
    print('Error: ', str(result))
package com.alibaba.test;
import com.alibaba.dashscope.audio.asr.phrase.AsrPhraseManager;
import com.alibaba.dashscope.audio.asr.phrase.AsrPhraseParam;
import com.alibaba.dashscope.audio.asr.phrase.AsrPhraseStatusResult;
import java.util.Collections;
class Test {
    public static void main(String[] args) {
        // model and phraseList are required for CreatePhrases.
        AsrPhraseParam param = AsrPhraseParam.builder()
                .model("your-model")
                .phraseList(Collections.singletonMap("通義千問", 5))
                .apiKey("your-dashscope-api-key")
                .build();
        AsrPhraseStatusResult createResult = null;
        try {
            createResult = AsrPhraseManager.CreatePhrases(param);
            // getFineTunedOutput() returns the hot word ID used by later calls.
            if (createResult.getOutput() != null && createResult.getOutput().getFineTunedOutput() != null) {
                System.out.println("job_id: " + createResult.getOutput().getJobId()
                        + ", finetuned_output: " + createResult.getOutput().getFineTunedOutput());
            } else {
                System.out.println("Error: " + createResult);
            }
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }
}
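Querying hot words
This interface submits a request to query hot words as a synchronous call.
Interface
def query_phrases(cls, phrase_id: str, **kwargs)
AsrPhraseStatusResult QueryPhrase(AsrPhraseParam param, String phraseId)
    throws ApiException, NoApiKeyException, InputRequiredException
Parameters
The Java SDK takes an AsrPhraseParam object and a hot word ID as its arguments:
Parameter | Type | Description |
param | AsrPhraseParam | Configuration parameters for the request; see the description of the AsrPhraseParam type above. A QueryPhrase call does not need the phraseList, pageNo, and pageSize fields. |
phraseId | String | The hot word ID (String): the value returned by getFineTunedOutput on the AsrPhraseStatusOutput object obtained from a CreatePhrases, UpdatePhrases, or QueryPhrase call. After a ListPhrases call, getFinetunedOutputs on the AsrPhraseStatusOutput object returns a list with the information for all hot words, and getFineTunedOutput on each AsrPhraseInfo returns the corresponding hot word ID. |
The Python SDK parameters are as follows:
Parameter | Type | Description |
phrase_id | str | The hot word ID: the finetuned_output value in the Dict returned by create_phrases, update_phrases, query_phrases, and similar calls. |
Sample response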
{
"status_code": 200,
"request_id": "19ee9c5f-173b-9fed-8e61-40bc53f1eea7",
"code": null,
"message": "",
"output": {
"create_time": "2023-09-26 15:39:08",
"finetuned_output": "paraformer-realtime-v1-ft-202309261539-2af1",
"job_id": "ft-202309261539-2af1",
"model": "paraformer-realtime-v1",
"output_type": "custom_resource"
},
"usage": null,
"job_id": "ft-202309261539-2af1",
"finetuned_output": "paraformer-realtime-v1-ft-202309261539-2af1",
"finetuned_outputs": null,
"training_type": null,
"create_time": "2023-09-26 15:39:08",
"output_type": "custom_resource",
"model": "paraformer-realtime-v1"
}
Example call
# coding=utf-8
import dashscope
from dashscope.audio.asr import AsrPhraseManager
dashscope.api_key = 'your-dashscope-api-key'
# phrase_id is the finetuned_output value returned when the hot words were created.
result = AsrPhraseManager.query_phrases(phrase_id='phrase-id')
if result.output is not None:
    print('query phrases: ', result.output)
else:
    print('Error: ', str(result))
package com.alibaba.test;
import com.alibaba.dashscope.audio.asr.phrase.AsrPhraseManager;
import com.alibaba.dashscope.audio.asr.phrase.AsrPhraseParam;
import com.alibaba.dashscope.audio.asr.phrase.AsrPhraseStatusResult;
class Test {
    public static void main(String[] args) {
        // phraseList, pageNo, and pageSize are not needed for QueryPhrase.
        AsrPhraseParam param = AsrPhraseParam.builder()
                .model("your-model")
                .apiKey("your-dashscope-api-key")
                .build();
        AsrPhraseStatusResult result = null;
        try {
            // The second argument is the hot word ID.
            result = AsrPhraseManager.QueryPhrase(param, "phrase-id");
            if (result.getOutput() != null) {
                System.out.println("query phrases: " + result.getOutput());
            } else {
                System.out.println("Error: " + result);
            }
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }
}
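Updating hot words
This interface submits a request to update hot words as a synchronous call.
Interface
def update_phrases(cls,
                   model: str,
                   phrase_id: str,
                   phrases: Dict[str, Any],
                   training_type: str = 'compile_asr_phrase',
                   **kwargs)
AsrPhraseStatusResult UpdatePhrases(AsrPhraseParam param, String phraseId)
    throws ApiException, NoApiKeyException, InputRequiredException
Parameters
The Java SDK takes an AsrPhraseParam object and a hot word ID as its arguments:
Parameter | Type | Description |
param | AsrPhraseParam | Configuration parameters for the request; see the description of the AsrPhraseParam type above. An UpdatePhrases call does not need the pageNo and pageSize fields. |
phraseId | String | The hot word ID (String): the value returned by getFineTunedOutput on the AsrPhraseStatusOutput object obtained from a CreatePhrases, UpdatePhrases, or QueryPhrase call. After a ListPhrases call, getFinetunedOutputs on the AsrPhraseStatusOutput object returns a list with the information for all hot words, and getFineTunedOutput on each AsrPhraseInfo returns the corresponding hot word ID. |
The Python SDK parameters are as follows:
Parameter | Type | Default | Description |
model | str | - | Name of the Paraformer model used for audio and video file transcription. For guidance on choosing a model, see Model overview. |
phrase_id | str | - | The hot word ID: the finetuned_output value in the Dict returned by create_phrases, update_phrases, query_phrases, and similar calls. |
phrases | Dict[str, Any] | - | Hot word list: a Dict whose keys are the hot word texts and whose values are the corresponding weights. For the hot word requirements, see the Hot words section above. |
training_type | str | compile_asr_phrase | Fixed to compile_asr_phrase. |
Sample response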
{
"status_code": 200,
"request_id": "8c8d64e3-5198-9624-99cd-e9dcf7eb22f6",
"code": null,
"message": "",
"output": {
"job_id": "ft-202309261543-b0ae",
"status": "SUCCEEDED",
"finetuned_output": "paraformer-realtime-v1-ft-202309261539-2af1",
"training_type": "compile_asr_phrase",
"create_time": "2023-09-26 15:43:09"
},
"usage": null,
"job_id": "ft-202309261543-b0ae",
"finetuned_output": "paraformer-realtime-v1-ft-202309261539-2af1",
"finetuned_outputs": null,
"training_type": null,
"create_time": "2023-09-26 15:43:09",
"output_type": null,
"model": null
}
Example call
# coding=utf-8
import dashscope
from dashscope.audio.asr import AsrPhraseManager
dashscope.api_key = 'your-dashscope-api-key'
# Hot word list to apply to the existing hot word ID.
phrases = {'通義千問': 2}
result = AsrPhraseManager.update_phrases(model='paraformer-realtime-v1',
                                         phrase_id='phrase-id',
                                         phrases=phrases)
if result.output is not None:
    print('update phrases: ', result.output)
else:
    print('Error: ', str(result))
package com.alibaba.test;
import com.alibaba.dashscope.audio.asr.phrase.AsrPhraseManager;
import com.alibaba.dashscope.audio.asr.phrase.AsrPhraseParam;
import com.alibaba.dashscope.audio.asr.phrase.AsrPhraseStatusResult;
import java.util.Collections;
class Test {
    public static void main(String[] args) {
        // pageNo and pageSize are not needed for UpdatePhrases.
        AsrPhraseParam param = AsrPhraseParam.builder()
                .model("your-model")
                .phraseList(Collections.singletonMap("通義千問", 2))
                .apiKey("your-dashscope-api-key")
                .build();
        AsrPhraseStatusResult result = null;
        try {
            // The second argument is the hot word ID to update.
            result = AsrPhraseManager.UpdatePhrases(param, "phrase-id");
            if (result.getOutput() != null) {
                System.out.println("update phrases: " + result.getOutput());
            } else {
                System.out.println("Error: " + result);
            }
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }
}
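Deleting hot words
This interface submits a request to delete hot words as a synchronous call.
Interface
def delete_phrases(cls, phrase_id: str,
                   **kwargs)
AsrPhraseStatusResult DeletePhrase(AsrPhraseParam param, String phraseId)
    throws ApiException, NoApiKeyException, InputRequiredException
Parameters
The Java SDK takes an AsrPhraseParam object and a hot word ID as its arguments:
Parameter | Type | Description |
param | AsrPhraseParam | Configuration parameters for the request; see the description of the AsrPhraseParam type above. A DeletePhrase call does not need the phraseList, pageNo, and pageSize fields. |
phraseId | String | The hot word ID (String): the value returned by getFineTunedOutput on the AsrPhraseStatusOutput object obtained from a CreatePhrases, UpdatePhrases, or QueryPhrase call. After a ListPhrases call, getFinetunedOutputs on the AsrPhraseStatusOutput object returns a list with the information for all hot words, and getFineTunedOutput on each AsrPhraseInfo returns the corresponding hot word ID. |
The Python SDK parameters are as follows:
Parameter | Type | Default | Description |
phrase_id | str | - | The hot word ID: the finetuned_output value in the Dict returned by create_phrases, update_phrases, query_phrases, and similar calls. |
Sample response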
{
"status_code": 200,
"request_id": "00bb0287-2593-94a3-8e21-93c90f5e9dd8",
"code": null,
"message": "",
"output": {
"finetuned_output": "paraformer-realtime-v1-ft-202309261539-2af1"
},
"usage": null,
"job_id": null,
"finetuned_output": "paraformer-realtime-v1-ft-202309261539-2af1",
"finetuned_outputs": null,
"training_type": null,
"create_time": null,
"output_type": null,
"model": null
}
Example call
# coding=utf-8
import dashscope
from dashscope.audio.asr import AsrPhraseManager
dashscope.api_key = 'your-dashscope-api-key'
# phrase_id is the finetuned_output value of the hot words to delete.
result = AsrPhraseManager.delete_phrases(phrase_id='phrase-id')
if result.output is not None:
    print('delete phrases: ', result.output)
else:
    print('Error: ', str(result))
package com.alibaba.test;
import com.alibaba.dashscope.audio.asr.phrase.AsrPhraseManager;
import com.alibaba.dashscope.audio.asr.phrase.AsrPhraseParam;
import com.alibaba.dashscope.audio.asr.phrase.AsrPhraseStatusResult;
class Test {
    public static void main(String[] args) {
        // phraseList, pageNo, and pageSize are not needed for DeletePhrase.
        AsrPhraseParam param = AsrPhraseParam.builder()
                .model("your-model")
                .apiKey("your-dashscope-api-key")
                .build();
        AsrPhraseStatusResult result = null;
        try {
            // The second argument is the hot word ID to delete.
            result = AsrPhraseManager.DeletePhrase(param, "phrase-id");
            if (result.getOutput() != null) {
                System.out.println("delete phrases: " + result.getOutput());
            } else {
                System.out.println("Error: " + result);
            }
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }
}
Listing all hot words
This interface submits a request to return all hot words as a synchronous call.
Interface
def list_phrases(cls,
                 page: int = 1,
                 page_size: int = 10,
                 **kwargs)
AsrPhraseStatusResult ListPhrases(AsrPhraseParam param)
    throws ApiException, NoApiKeyException, InputRequiredException
Parameters
The Java SDK takes an AsrPhraseParam object as its argument:
Parameter | Type | Description |
param | AsrPhraseParam | Configuration parameters for the request; see the description of the AsrPhraseParam type above. A ListPhrases call does not need the phraseList field. |
ListPhrases takes no phraseId argument. To read hot word IDs from its result, call getFinetunedOutputs on the AsrPhraseStatusOutput object to get a list with the information for all hot words, then call getFineTunedOutput on each AsrPhraseInfo to get the corresponding hot word ID.
The Python SDK parameters are as follows:
Parameter | Type | Default | Description |
page | int | 1 | Applies only to list_phrases requests. Page number to query; defaults to 1. |
page_size | int | 10 | Applies only to list_phrases requests. Number of entries per page; defaults to 10. |
Sample response
{
"status_code": 200,
"request_id": "95f969ef-bcbc-9bb5-b05d-4caed9326409",
"code": null,
"message": "",
"output": {
"page_no": 1,
"page_size": 5,
"total": 61,
"finetuned_outputs": [{
"create_time": "2023-09-26 15:32:20",
"finetuned_output": "paraformer-realtime-v1-ft-202309261532-93bf",
"job_id": "ft-202309261532-93bf",
"model": "paraformer-realtime-v1",
"output_type": "custom_resource"
}, {
"create_time": "2023-09-26 15:32:18",
"finetuned_output": "paraformer-realtime-v1-ft-202309261532-7b51",
"job_id": "ft-202309261532-7b51",
"model": "paraformer-realtime-v1",
"output_type": "custom_resource"
}, {
"create_time": "2023-09-26 15:32:17",
"finetuned_output": "paraformer-realtime-v1-ft-202309261532-8bbc",
"job_id": "ft-202309261532-cef0",
"model": "paraformer-realtime-v1",
"output_type": "custom_resource"
}, {
"create_time": "2023-09-26 15:32:16",
"finetuned_output": "paraformer-realtime-v1-ft-202309261532-fc6a",
"job_id": "ft-202309261532-fc6a",
"model": "paraformer-realtime-v1",
"output_type": "custom_resource"
}, {
"create_time": "2023-09-26 15:31:56",
"finetuned_output": "paraformer-realtime-v1-ft-202309261531-e92d",
"job_id": "ft-202309261531-e92d",
"model": "paraformer-realtime-v1",
"output_type": "custom_resource"
}]
},
"usage": null,
"job_id": null,
"finetuned_output": null,
"finetuned_outputs": [{
"create_time": "2023-09-26 15:32:20",
"finetuned_output": "paraformer-realtime-v1-ft-202309261532-93bf",
"job_id": "ft-202309261532-93bf",
"model": "paraformer-realtime-v1",
"output_type": "custom_resource"
}, {
"create_time": "2023-09-26 15:32:18",
"finetuned_output": "paraformer-realtime-v1-ft-202309261532-7b51",
"job_id": "ft-202309261532-7b51",
"model": "paraformer-realtime-v1",
"output_type": "custom_resource"
}, {
"create_time": "2023-09-26 15:32:17",
"finetuned_output": "paraformer-realtime-v1-ft-202309261532-8bbc",
"job_id": "ft-202309261532-cef0",
"model": "paraformer-realtime-v1",
"output_type": "custom_resource"
}, {
"create_time": "2023-09-26 15:32:16",
"finetuned_output": "paraformer-realtime-v1-ft-202309261532-fc6a",
"job_id": "ft-202309261532-fc6a",
"model": "paraformer-realtime-v1",
"output_type": "custom_resource"
}, {
"create_time": "2023-09-26 15:31:56",
"finetuned_output": "paraformer-realtime-v1-ft-202309261531-e92d",
"job_id": "ft-202309261531-e92d",
"model": "paraformer-realtime-v1",
"output_type": "custom_resource"
}],
"training_type": null,
"create_time": null,
"output_type": null,
"model": null
}
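Because the response is paginated, collecting every hot word ID means walking the pages until total is exhausted. With the sample values above (total 61, page_size 5) that is 13 pages. A minimal sketch of that loop follows; page_count and iter_phrase_ids are hypothetical helpers, and fetch_page is a caller-supplied function assumed to return a dict shaped like the output field above.

```python
import math


def page_count(total, page_size):
    """Number of pages needed to cover `total` entries."""
    return math.ceil(total / page_size) if total > 0 else 0


def iter_phrase_ids(fetch_page, page_size=10):
    """Yield every finetuned_output, requesting pages 1, 2, ... in order.

    fetch_page(page, page_size) is an assumed callable that returns a dict
    with the keys "total" and "finetuned_outputs".
    """
    page = 1
    while True:
        output = fetch_page(page, page_size)
        entries = output.get('finetuned_outputs') or []
        for entry in entries:
            yield entry['finetuned_output']
        # Stop on an empty page or once the last page has been read.
        if not entries or page >= page_count(output.get('total', 0), page_size):
            break
        page += 1


print(page_count(61, 5))  # 13
```

With the SDK, fetch_page could plausibly be written as `lambda page, page_size: AsrPhraseManager.list_phrases(page=page, page_size=page_size).output`, based on the list_phrases signature in this section; that wiring is not verified here.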
Example call
# coding=utf-8
import dashscope
from dashscope.audio.asr import AsrPhraseManager
dashscope.api_key = 'your-dashscope-api-key'
# Uses the defaults page=1 and page_size=10.
result = AsrPhraseManager.list_phrases()
if result.output is not None:
    print('list phrases: ', result.output)
else:
    print('Error: ', str(result))
package com.alibaba.test;
import com.alibaba.dashscope.audio.asr.phrase.AsrPhraseManager;
import com.alibaba.dashscope.audio.asr.phrase.AsrPhraseParam;
import com.alibaba.dashscope.audio.asr.phrase.AsrPhraseStatusResult;
class Test {
    public static void main(String[] args) {
        // phraseList is not needed for ListPhrases.
        AsrPhraseParam param = AsrPhraseParam.builder()
                .model("your-model")
                .apiKey("your-dashscope-api-key")
                .build();
        AsrPhraseStatusResult result = null;
        try {
            result = AsrPhraseManager.ListPhrases(param);
            if (result.getOutput() != null) {
                System.out.println("list phrases: " + result.getOutput());
            } else {
                System.out.println("Error: " + result);
            }
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }
}