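Multiclass Evaluation
The Multiclass Evaluation algorithm measures how well a model performs on classification problems with more than two classes. It computes metrics such as accuracy, recall, F1 score, and the confusion matrix to quantify how accurately the model classifies each class. The confusion matrix shows the relationship between the predicted classes and the true classes, while the other metrics give per-class detail on correct and incorrect classifications. Together, these measures show how the model performs on each class and guide subsequent model optimization.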
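To make the relationship between the confusion matrix and the per-class metrics concrete, here is a minimal Python sketch. It is not the component's implementation: the function names `confusion_matrix` and `per_class_metrics` and the toy labels are assumptions for illustration, and rows are assumed to hold true classes while columns hold predicted classes.

```python
from collections import Counter


def confusion_matrix(y_true, y_pred, labels):
    """Rows are true classes, columns are predicted classes (assumed layout)."""
    counts = Counter(zip(y_true, y_pred))
    return [[counts[(t, p)] for p in labels] for t in labels]


def per_class_metrics(matrix, labels):
    """Derive one-vs-rest precision, recall, and F1 for every class."""
    n = len(labels)
    metrics = {}
    for i, label in enumerate(labels):
        tp = matrix[i][i]
        fn = sum(matrix[i]) - tp                         # true class i, predicted otherwise
        fp = sum(matrix[r][i] for r in range(n)) - tp    # predicted i, true class differs
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        metrics[label] = {"precision": precision, "recall": recall, "f1": f1}
    return metrics


# Hypothetical three-class toy data.
y_true = ["cat", "cat", "dog", "dog", "bird", "bird"]
y_pred = ["cat", "dog", "dog", "dog", "bird", "cat"]
labels = ["bird", "cat", "dog"]

matrix = confusion_matrix(y_true, y_pred, labels)
metrics = per_class_metrics(matrix, labels)
```

Each class is scored one-vs-rest, so a single misclassified sample counts as a false negative for its true class and a false positive for the predicted class.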
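Configure the component
Method 1: Visual configuration
On the Designer workflow page, add the Multiclass Evaluation component and configure its parameters in the pane on the right: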
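Parameter type | Parameter | Description |
Field settings | Original label column | Select the column that holds the original labels. The number of classes cannot exceed 1,000. |
Field settings | Predicted label column | The column that holds the predicted classes. This column is typically named prediction_result. |
Field settings | Advanced options | If you select the Advanced options checkbox, the Prediction probability column parameter takes effect. |
Field settings | Prediction probability column | Used to compute the logloss of the model. It is valid only for random forest models; setting it for other models may cause an error. This column is typically named prediction_detail. |
Tuning | Number of cores | Used together with Memory per core. By default, the system allocates the value automatically. |
Tuning | Memory per core | The memory for each core, in MB. By default, the system allocates the value automatically. |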
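The table states that the prediction probability column is used to compute logloss but does not give the formula. As a rough sketch, assuming each value in that column is a JSON object mapping class names to probabilities (the shape used by the `detail` column in the example on this page) and assuming the usual definition of logloss as the mean negative log-probability assigned to the true class, the calculation could look like this. The function name `multiclass_logloss` and the `eps` clipping value are assumptions, not part of the component.

```python
import json
import math


def multiclass_logloss(labels, details, eps=1e-15):
    """Mean negative log-probability of the true class (assumed definition)."""
    total = 0.0
    for label, detail in zip(labels, details):
        probs = json.loads(detail)          # e.g. {"A": 0.6, "B": 0.4}
        p = probs.get(label, 0.0)
        p = min(max(p, eps), 1.0)           # clip to avoid log(0)
        total += -math.log(p)
    return total / len(labels)


# Three illustrative rows: true label plus the JSON probability string.
labels = ["A", "A", "B"]
details = [
    '{"A": 0.6, "B": 0.4}',
    '{"A": 0.45, "B": 0.55}',
    '{"A": 0.2, "B": 0.8}',
]
loss = multiclass_logloss(labels, details)
```

A lower value means the model assigned higher probability to the true classes; a confident wrong prediction is penalized much more heavily than a hesitant one.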
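Method 2: PAI command
Use a PAI command to configure the parameters of the Multiclass Evaluation component. You can call PAI commands from the SQL Script component. For details, see Scenario 4: Execute PAI commands in the SQL Script component.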
PAI -name MultiClassEvaluation -project algo_public
-DinputTableName="test_input"
-DoutputTableName="test_output"
-DlabelColName="label"
-DpredictionColName="prediction_result"
-Dlifecycle=30;
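Parameter | Required | Default | Description |
inputTableName | Yes | None | The name of the input table. |
inputTablePartitions | No | Entire table | The partitions of the input table. |
outputTableName | Yes | None | The name of the output table. |
labelColName | Yes | None | The name of the original label column in the input table. |
predictionColName | Yes | None | The name of the column that holds the predicted labels. |
predictionDetailColName | No | Empty | The probability column of the prediction results, for example |
lifecycle | No | None | The lifecycle of the output table. |
coreNum | No | Determined by the system | The number of cores. |
memSizePerCore | No | Determined by the system | The memory for each core, in MB. |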
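If you generate the command from a script, the required and optional parameters in the table above can be checked before submission. The following Python sketch is only an illustration: `build_pai_command` is a hypothetical helper, not part of any PAI SDK, and it only assembles a command string in the same shape as the example above without executing anything.

```python
# Parameter names are taken from the table above.
REQUIRED = ("inputTableName", "outputTableName", "labelColName", "predictionColName")
OPTIONAL = ("inputTablePartitions", "predictionDetailColName",
            "lifecycle", "coreNum", "memSizePerCore")


def build_pai_command(params):
    """Assemble a MultiClassEvaluation command string (hypothetical helper)."""
    missing = [key for key in REQUIRED if key not in params]
    if missing:
        raise ValueError(f"missing required parameters: {missing}")
    unknown = [key for key in params if key not in REQUIRED + OPTIONAL]
    if unknown:
        raise ValueError(f"unknown parameters: {unknown}")

    lines = ["PAI -name MultiClassEvaluation -project algo_public"]
    for key in REQUIRED + OPTIONAL:
        if key in params:
            value = params[key]
            # Integers are written bare, strings are double-quoted,
            # mirroring the example command on this page.
            rendered = str(value) if isinstance(value, int) else f'"{value}"'
            lines.append(f"-D{key}={rendered}")
    return "\n".join(lines) + ";"


command = build_pai_command({
    "inputTableName": "test_input",
    "outputTableName": "test_output",
    "labelColName": "label",
    "predictionColName": "prediction_result",
    "lifecycle": 30,
})
```

Failing early on a missing `labelColName` or a misspelled option is cheaper than waiting for the job to be rejected.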
Example
Add an SQL Script component and enter the following SQL statements to generate the input data.
drop table if exists multi_esti_test;
create table multi_esti_test as
select * from
(
  select '0' as id,'A' as label,'A' as prediction,'{"A": 0.6, "B": 0.4}' as detail
  union all select '1' as id,'A' as label,'B' as prediction,'{"A": 0.45, "B": 0.55}' as detail
  union all select '2' as id,'A' as label,'A' as prediction,'{"A": 0.7, "B": 0.3}' as detail
  union all select '3' as id,'A' as label,'A' as prediction,'{"A": 0.9, "B": 0.1}' as detail
  union all select '4' as id,'B' as label,'B' as prediction,'{"A": 0.2, "B": 0.8}' as detail
  union all select '5' as id,'B' as label,'B' as prediction,'{"A": 0.1, "B": 0.9}' as detail
  union all select '6' as id,'B' as label,'A' as prediction,'{"A": 0.52, "B": 0.48}' as detail
  union all select '7' as id,'B' as label,'B' as prediction,'{"A": 0.4, "B": 0.6}' as detail
  union all select '8' as id,'B' as label,'A' as prediction,'{"A": 0.6, "B": 0.4}' as detail
  union all select '9' as id,'A' as label,'A' as prediction,'{"A": 0.75, "B": 0.25}' as detail
)tmp;
Add an SQL Script component and enter the following PAI command to run the evaluation.
drop table if exists ${o1};
PAI -name MultiClassEvaluation -project algo_public
-DinputTableName="multi_esti_test"
-DoutputTableName=${o1}
-DlabelColName="label"
-DpredictionColName="prediction"
-Dlifecycle=30;
Right-click the component from the previous step and choose View Data > SQL Script Output to view the evaluation result.
| result |
| --- |
| { "ActualLabelFrequencyList": [5, 5], "ActualLabelProportionList": [0.5, 0.5], "ConfusionMatrix": [[4, 1], [2, 3]], "LabelList": ["A", "B"], "LabelMeasureList": [{ "Accuracy": 0.7, "F1": 0.7272727272727273, "FalseDiscoveryRate": 0.3333333333333333, "FalseNegative": 1, "FalseNegativeRate": 0.2, "FalsePositive": 2, "FalsePositiveRate": 0.4, "Kappa": 0.3999999999999999, "NegativePredictiveValue": 0.75, "Precision": 0.6666666666666666, "Sensitivity": 0.8, "Specificity": 0.6, "TrueNegative": 3, "TruePositive": 4}, { "Accuracy": 0.7, "F1": 0.6666666666666666, "FalseDiscoveryRate": 0.25, "FalseNegative": 2, "FalseNegativeRate": 0.4, "FalsePositive": 1, "FalsePositiveRate": 0.2, "Kappa": 0.3999999999999999, "NegativePredictiveValue": 0.6666666666666666, "Precision": 0.75, "Sensitivity": 0.6, "Specificity": 0.8, "TrueNegative": 4, "TruePositive": 3}], "LabelNumber": 2, "OverallMeasures": { "Accuracy": 0.7, "Kappa": 0.3999999999999999, "LabelFrequencyBasedMicro": { "Accuracy": 0.7, "F1": 0.696969696969697, "FalseDiscoveryRate": 0.2916666666666666, "FalseNegative": 1.5, "FalseNegativeRate": 0.3, "FalsePositive": 1.5, "FalsePositiveRate": 0.3, "Kappa": 0.3999999999999999, "NegativePredictiveValue": 0.7083333333333333, "Precision": 0.7083333333333333, "Sensitivity": 0.7, "Specificity": 0.7, "TrueNegative": 3.5, "TruePositive": 3.5}, "MacroAveraged": { "Accuracy": 0.7, "F1": 0.696969696969697, "FalseDiscoveryRate": 0.2916666666666666, "FalseNegative": 1.5, "FalseNegativeRate": 0.3, "FalsePositive": 1.5, "FalsePositiveRate": 0.3, "Kappa": 0.3999999999999999, "NegativePredictiveValue": 0.7083333333333333, "Precision": 0.7083333333333333, "Sensitivity": 0.7, "Specificity": 0.7, "TrueNegative": 3.5, "TruePositive": 3.5}, "MicroAveraged": { "Accuracy": 0.7, "F1": 0.7, "FalseDiscoveryRate": 0.3, "FalseNegative": 3, "FalseNegativeRate": 0.3, "FalsePositive": 3, "FalsePositiveRate": 0.3, "Kappa": 0.3999999999999999, "NegativePredictiveValue": 0.7, "Precision": 0.7, "Sensitivity": 0.7, "Specificity": 0.7, "TrueNegative": 7, "TruePositive": 7}}, "PredictedLabelFrequencyList": [6, 4], "PredictedLabelProportionList": [0.6, 0.4], "ProportionMatrix": [[0.8, 0.2], [0.4, 0.6]]} |
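Several overall values in this output can be cross-checked by hand from ConfusionMatrix alone. The Python sketch below does so under stated assumptions: rows are true classes, Kappa is Cohen's kappa, the macro F1 is the unweighted mean of the per-class F1 scores, the micro F1 pools the counts across classes, and ProportionMatrix is the confusion matrix normalized by row. The function name `overall_measures` is hypothetical, and the component's internal implementation may differ.

```python
def overall_measures(matrix):
    """Recompute a few overall measures from a square confusion matrix."""
    n = len(matrix)
    total = sum(sum(row) for row in matrix)
    row_sums = [sum(row) for row in matrix]                               # actual counts
    col_sums = [sum(matrix[r][c] for r in range(n)) for c in range(n)]    # predicted counts
    diag = sum(matrix[i][i] for i in range(n))

    accuracy = diag / total
    # Cohen's kappa: agreement beyond what the class frequencies alone would give.
    expected = sum(row_sums[i] * col_sums[i] for i in range(n)) / total ** 2
    kappa = (accuracy - expected) / (1 - expected) if expected != 1 else 0.0

    # Per-class F1 via the identity F1 = 2TP / (2TP + FP + FN).
    f1_scores = []
    for i in range(n):
        tp = matrix[i][i]
        fp = col_sums[i] - tp
        fn = row_sums[i] - tp
        f1_scores.append(2 * tp / (2 * tp + fp + fn) if tp else 0.0)
    macro_f1 = sum(f1_scores) / n

    # Micro average: every off-diagonal sample is one FP and one FN.
    errors = total - diag
    micro_f1 = 2 * diag / (2 * diag + 2 * errors) if total else 0.0

    proportion = [
        [value / row_sums[i] if row_sums[i] else 0.0 for value in row]
        for i, row in enumerate(matrix)
    ]
    return {
        "accuracy": accuracy,
        "kappa": kappa,
        "macro_f1": macro_f1,
        "micro_f1": micro_f1,
        "proportion_matrix": proportion,
    }


result = overall_measures([[4, 1], [2, 3]])
```

With the matrix from this example, the recomputed accuracy, kappa, macro F1, and micro F1 agree with the reported values up to floating-point rounding. Because both classes have five samples, the frequency-weighted average (LabelFrequencyBasedMicro) coincides with the macro average here; with imbalanced classes the two would generally differ.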
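Appendix
If you run the Multiclass Evaluation algorithm in visual mode, you can right-click the component and choose Visual Analysis to view the detailed results.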