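Alibaba Cloud Elasticsearch supports configuring synonyms through a filter. The filter accepts two synonym formats: Solr and WordNet. This topic describes the configuration rules for synonym dictionaries.
Configuration example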
PUT /test_index
{
  "settings": {
    "index" : {
      "analysis" : {
        "analyzer" : {
          "synonym" : {
            "tokenizer" : "whitespace",
            "filter" : ["synonym"]
          }
        },
        "filter" : {
          "synonym" : {
            "type" : "synonym",
            "synonyms_path" : "analysis/synonym.txt",
            "tokenizer" : "whitespace"
          }
        }
      }
    }
  }
}
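The filter section defines a synonym filter named synonym. Its synonyms_path parameter points to the synonym dictionary file analysis/synonym.txt (the path is relative to the config directory). The analyzer section defines an analyzer, also named synonym, that applies this filter after the whitespace tokenizer. For more information about the parameters, see the official Synonym Token Filter documentation.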
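The request body above can also be assembled in code, which helps when the dictionary path or tokenizer varies between indexes. The following is a minimal Python sketch under stated assumptions: the function name synonym_index_settings is hypothetical, and the sketch only builds and prints the JSON body. It does not send the request to a cluster.

```python
import json


def synonym_index_settings(synonyms_path, tokenizer="whitespace"):
    """Build a settings body shaped like the configuration example (hypothetical helper)."""
    return {
        "settings": {
            "index": {
                "analysis": {
                    # Analyzer that runs the tokenizer, then the synonym filter.
                    "analyzer": {
                        "synonym": {"tokenizer": tokenizer, "filter": ["synonym"]}
                    },
                    # Filter that loads its rules from a file under the config directory.
                    "filter": {
                        "synonym": {
                            "type": "synonym",
                            "synonyms_path": synonyms_path,
                            "tokenizer": tokenizer,
                        }
                    },
                }
            }
        }
    }


body = synonym_index_settings("analysis/synonym.txt")
print(json.dumps(body, indent=2))
```

The printed JSON can be used as the body of PUT /test_index.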
Solr synonyms
The following example shows the configuration rules.
# Blank lines and lines starting with pound are comments.
# Explicit mappings match any token sequence on the LHS of "=>"
# and replace with all alternatives on the RHS. These types of mappings
# ignore the expand parameter in the schema.
# Examples:
i-pod, i pod => ipod
sea biscuit, sea biscit => seabiscuit
# Equivalent synonyms may be separated with commas and give
# no explicit mapping. In this case the mapping behavior will
# be taken from the expand parameter in the schema. This allows
# the same synonym file to be used in different synonym handling strategies.
# Examples:
ipod, i-pod, i pod
foozball , foosball
universe , cosmos
lol, laughing out loud
# If expand==true, "ipod, i-pod, i pod" is equivalent
# to the explicit mapping:
ipod, i-pod, i pod => ipod, i-pod, i pod
# If expand==false, "ipod, i-pod, i pod" is equivalent
# to the explicit mapping:
ipod, i-pod, i pod => ipod
# Multiple synonym mapping entries are merged.
foo => foo bar
foo => baz
# is equivalent to
foo => foo bar, baz
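The rules described in the comments above (explicit mappings, equivalent synonyms governed by expand, and merging of repeated entries) can be illustrated with a small Python sketch. This is not the parser that Elasticsearch or Lucene uses; the function names are hypothetical, and the sketch ignores escaping and does not run terms through a tokenizer.

```python
def split_terms(text):
    """Split a comma-separated list and normalize inner whitespace."""
    return [" ".join(term.split()) for term in text.split(",") if term.strip()]


def parse_solr_synonyms(lines, expand=True):
    """Return {input phrase: [replacement phrases]} for Solr-format rules (simplified)."""
    mapping = {}
    for raw in lines:
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # blank lines and comment lines are skipped
        if "=>" in line:
            # Explicit mapping: every LHS phrase maps to all RHS phrases.
            lhs, rhs = line.split("=>", 1)
            sources, targets = split_terms(lhs), split_terms(rhs)
        else:
            # Equivalent synonyms: behavior depends on expand.
            sources = split_terms(line)
            targets = sources if expand else sources[:1]
        for source in sources:
            merged = mapping.setdefault(source, [])
            for target in targets:
                if target not in merged:
                    merged.append(target)  # repeated entries are merged
    return mapping


print(parse_solr_synonyms(["ipod, i-pod, i pod"], expand=False))
print(parse_solr_synonyms(["foo => foo bar", "foo => baz"]))
```

With expand set to False, every phrase in the group maps to the first phrase only; the two foo rules collapse into a single entry with both replacements.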
You can also define synonyms directly in the filter. Note that this uses the synonyms parameter instead of synonyms_path. Example:
PUT /test_index
{
  "settings": {
    "index" : {
      "analysis" : {
        "filter" : {
          "synonym" : {
            "type" : "synonym",
            "synonyms" : [
              "i-pod, i pod => ipod",
              "begin, start"
            ]
          }
        }
      }
    }
  }
}
Note: We recommend that you use synonyms_path and define large synonym sets in a file.
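When an inline synonyms list grows large, its rules can be moved into a dictionary file with one rule per line. The following Python sketch shows one way to produce such a file; the function name write_synonym_file is hypothetical, UTF-8 encoding is assumed, and how the file reaches the cluster's config directory is outside the scope of this sketch.

```python
import tempfile
from pathlib import Path


def write_synonym_file(rules, path):
    """Write one synonym rule per line and return the number of rules written."""
    cleaned = [rule.strip() for rule in rules if rule.strip()]
    Path(path).write_text("\n".join(cleaned) + "\n", encoding="utf-8")
    return len(cleaned)


# Demonstration in a temporary directory so that no real file is touched.
with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp) / "synonym.txt"
    count = write_synonym_file(["i-pod, i pod => ipod", "begin, start"], target)
    content = target.read_text(encoding="utf-8")

print(count)
print(content)
```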
WordNet synonyms
The following example shows the configuration rules.
PUT /test_index
{
  "settings": {
    "index" : {
      "analysis" : {
        "filter" : {
          "synonym" : {
            "type" : "synonym",
            "format" : "wordnet",
            "synonyms" : [
              "s(100000001,1,'abstain',v,1,0).",
              "s(100000001,2,'refrain',v,1,0).",
              "s(100000001,3,'desist',v,1,0)."
            ]
          }
        }
      }
    }
  }
}
The preceding example uses synonyms to define WordNet synonyms. You can also use synonyms_path to define WordNet synonyms in a text file.
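In the WordNet entries above, the first field is shared by all three lines, which is what groups abstain, refrain, and desist into one synonym set. The following Python sketch makes that grouping explicit. It assumes the field order of the WordNet Prolog s() format (synset ID, word number, word, part of speech, sense number, tag count); the function name is hypothetical, and this is not the parser that Elasticsearch uses.

```python
import re

# s(synset_id,w_num,'word',ss_type,sense_number,tag_count).
_S_LINE = re.compile(r"^s\((\d+),(\d+),'(.*)',([a-z]),(\d+),(\d+)\)\.$")


def group_wordnet_synsets(lines):
    """Group WordNet s() lines by synset ID, ordering words by word number."""
    synsets = {}
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        match = _S_LINE.match(line)
        if match is None:
            raise ValueError(f"not a WordNet s() line: {line!r}")
        synset_id, w_num, word = match.group(1), int(match.group(2)), match.group(3)
        # A doubled single quote inside the word stands for one literal quote.
        synsets.setdefault(synset_id, []).append((w_num, word.replace("''", "'")))
    return {sid: [word for _, word in sorted(words)] for sid, words in synsets.items()}


groups = group_wordnet_synsets([
    "s(100000001,1,'abstain',v,1,0).",
    "s(100000001,2,'refrain',v,1,0).",
    "s(100000001,3,'desist',v,1,0).",
])
print(groups)
```

Joining each group with commas gives the corresponding Solr-format line for equivalent synonyms, here abstain, refrain, desist.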