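Generate second-level metric data
This topic describes how to use the log management feature to generate second-level metric data.
Background information
The charts that CloudMonitor currently provides show averages of minute-level statistics, so they cannot display second-level TPS statistics. In ApsaraMQ for RabbitMQ, TPS is the number of AMQP protocol method requests that clients initiate per second.
The following AMQP request methods are counted toward TPS: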
ConnectionOpen, ChannelOpen
QueueDeclare, QueueDelete, QueueBind, QueueUnbind
ExchangeDeclare, ExchangeDelete
ExchangeBind, ExchangeUnBind
SendMessage, BasicConsume, BasicGet, BasicAck, BasicReject, BasicNack, BasicRecover
For details about the request methods, see Request methods.
Procedure
Create a Metricstore to store the cleansed metric data.
On the Project details page of the Simple Log Service console, choose the option for creating a Metricstore.
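In the Create MetricStore panel, configure the basic information of the Metricstore.
Create a cleansing task.
In the Logstore, enter a query statement. The following statement uses instance error codes as an example.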
* | SELECT Code, count(*) as num, microtime / 1000 / 1000 as timeSecond group by Code, timeSecond limit 1000000
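The preceding statement uses the format search statement | analytic statement. The search statement filters logs by condition, and the analytic statement uses standard SQL syntax. To write data to the Metricstore, cleanse the following three items from the query result: the labels that you need, the metric value under each label, and the time. In this statement, Code is the label and indicates the response code of each request, num is the value for each Code, and timeSecond is the time, in seconds.
The query result is as follows: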
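As a rough illustration of what the statement computes, the following Python sketch groups a few made-up log entries by Code and by second and emits one (label, value, time) row per group. The sample entries, the error code 530, and the function names are hypothetical, and the sketch assumes that microtime is an integer epoch timestamp in microseconds, which is what dividing by 1000 twice to obtain seconds implies.

```python
from collections import Counter

# Hypothetical sample log entries. Only the Code and microtime fields used
# by the statement are modelled; microtime is assumed to be an integer epoch
# timestamp in microseconds.
SAMPLE_LOGS = [
    {"Code": 200, "microtime": 1_700_000_000_120_000},
    {"Code": 200, "microtime": 1_700_000_000_870_000},
    {"Code": 530, "microtime": 1_700_000_000_990_000},
    {"Code": 200, "microtime": 1_700_000_001_050_000},
]


def count_by_code_and_second(logs):
    """Count entries per (Code, second), like GROUP BY Code, timeSecond."""
    counts = Counter()
    for entry in logs:
        # microseconds -> milliseconds -> seconds, using integer division
        time_second = entry["microtime"] // 1000 // 1000
        counts[(entry["Code"], time_second)] += 1
    return counts


def to_metric_points(counts):
    """Turn the counts into label/value/time rows, sorted by time, then Code."""
    points = [
        {"label": {"Code": code}, "value": num, "time": second}
        for (code, second), num in counts.items()
    ]
    return sorted(points, key=lambda p: (p["time"], p["label"]["Code"]))


if __name__ == "__main__":
    for point in to_metric_points(count_by_code_and_second(SAMPLE_LOGS)):
        print(point)
```

Each printed row carries the three items that a Metricstore needs: the label (Code), the value (num), and the time (timeSecond).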
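In the query result, click Save as Scheduled SQL on the Graph tab. On the Compute Settings tab, configure the following parameters, and then click Next.
Note: For the target store, select the Metricstore that you created earlier.
On the Scheduling Settings tab, set the scheduling interval, and then click OK.
Query the distribution of metric values in the Metricstore.
The query result is as follows:
Optional: Use the data in the Metricstore as a data source for a visualization dashboard. For the dashboard, you can use Grafana or the visualization capabilities of Simple Log Service.
To connect the data to a Grafana dashboard, see Connect time series data to Grafana.
To use the visualization dashboards of Simple Log Service, see Visualization.
The preceding tutorial cleanses instance error code data as an example. You can cleanse other data in the same way, such as the message sending and receiving rate of each channel of each RemoteAddress, the per-second activity of each queue, the total numbers of messages sent and received per second, and the number of calls to each API per second.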
Common statements
Query second-level TPS metric data of an instance
* | select microtime/1000/1000 as time, sum(count) as tps
from
(SELECT microtime, if(Action!='SendMessage', 1, tps) as count
from log
Where InstanceId='amqp-xx-xxx'
and Action in ('SendMessage', 'ConnectionOpen', 'ChannelOpen', 'ExchangeDeclare', 'QueueBind', 'QueueDeclare', 'QueueDelete', 'ExchangeDelete', 'QueueUnBind', 'ExchangeBind', 'ExchangeUnBind', 'BasicConsume', 'BasicReject', 'BasicRecover', 'BasicAck', 'BasicNAck', 'PullMessage')
limit 90000000)
GROUP by time ORDER by time limit 90000000
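The query result is as follows:
Before you run the query, replace the instance ID amqp-xx-xxx in the preceding statement with the ID of the instance that you want to query. BasicNack(multiple=false) is counted as TPS=1, and BasicNack(multiple=true) is counted as TPS=N. The preceding statement counts every log entry other than SendMessage as 1, so the TPS value calculated from SLS logs is smaller than the number of requests actually initiated. When you query the TPS traffic chart for a client with heavy traffic, we recommend that you limit the query time range to 1 hour or less and append limit 90000000 to the SQL statement, or set limit to as large a value as possible.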
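The counting rule in the TPS statement can be sketched in Python as follows. This is only an approximation of the SQL under stated assumptions: the sample entries and their values are made up, microtime is assumed to be an integer microsecond timestamp, and tps is assumed to be an integer field on SendMessage entries. The action list is copied from the statement.

```python
from collections import defaultdict

# Actions that the statement keeps, copied from its IN (...) list.
COUNTED_ACTIONS = {
    "SendMessage", "ConnectionOpen", "ChannelOpen", "ExchangeDeclare",
    "QueueBind", "QueueDeclare", "QueueDelete", "ExchangeDelete",
    "QueueUnBind", "ExchangeBind", "ExchangeUnBind", "BasicConsume",
    "BasicReject", "BasicRecover", "BasicAck", "BasicNAck", "PullMessage",
}

# Hypothetical sample entries; the values are invented for illustration.
SAMPLE_LOGS = [
    {"Action": "SendMessage", "microtime": 1_700_000_000_100_000, "tps": 3},
    {"Action": "BasicAck", "microtime": 1_700_000_000_400_000, "tps": 0},
    {"Action": "QueueDeclare", "microtime": 1_700_000_000_900_000, "tps": 0},
    {"Action": "PushMessage", "microtime": 1_700_000_000_950_000, "tps": 0},
    {"Action": "SendMessage", "microtime": 1_700_000_001_200_000, "tps": 2},
]


def tps_per_second(logs, counted_actions=COUNTED_ACTIONS):
    """Sum TPS per second: SendMessage uses its tps field, other actions count 1."""
    totals = defaultdict(int)
    for entry in logs:
        if entry["Action"] not in counted_actions:
            continue  # filtered out, like the WHERE ... IN (...) clause
        count = entry["tps"] if entry["Action"] == "SendMessage" else 1
        totals[entry["microtime"] // 1000 // 1000] += count
    return dict(sorted(totals.items()))


if __name__ == "__main__":
    for second, tps in tps_per_second(SAMPLE_LOGS).items():
        print(second, tps)
```

In the sample, the PushMessage entry is ignored because it is not in the action list, and each non-SendMessage entry adds exactly 1 regardless of how many messages it covers.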
Query the total number of messages sent per exchange and routing key
* and Action : SendMessage and Code : 200 |
select
InstanceId as instance_id,
VHost as virtual_host,
split_part(ResourceName,',',2) as exchange_name,
split_part(ResourceName,',',3) as routing_key,
count(*) as send_total_num
group by
instance_id,
virtual_host,
exchange_name,
routing_key
order by
send_total_num
limit 10000000
The query result is as follows:
Query the per-second message sending rate per exchange and routing key
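Several statements in this topic use split_part to pull a field out of ResourceName. The following Python sketch mimics the usual SQL semantics of split_part (1-based index, null when the index exceeds the number of fields) so that the extraction can be tried offline. The sample ResourceName value is invented purely to exercise the function; the real field layout is not described in this topic.

```python
def split_part(text, delimiter, index):
    """Return field `index` (1-based) of `text` split on `delimiter`.

    Returns None when `index` is larger than the number of fields, which
    corresponds to NULL in SQL.
    """
    if index < 1:
        raise ValueError("index must be 1 or greater")
    parts = text.split(delimiter)
    return parts[index - 1] if index <= len(parts) else None


if __name__ == "__main__":
    # Hypothetical value; only the comma-separated shape matters here.
    resource_name = "first-field,my-exchange,my-routing-key"
    print(split_part(resource_name, ",", 2))  # second field: exchange_name
    print(split_part(resource_name, ",", 3))  # third field: routing_key
    print(split_part(resource_name, ",", 4))  # no fourth field
```

With this sample, the second and third fields play the roles of exchange_name and routing_key in the statement, and a missing field yields None instead of raising an error.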
* and Action : SendMessage and Code : 200 |
select
InstanceId as instance_id,
VHost as virtual_host,
split_part(ResourceName,',',2) as exchange_name,
split_part(ResourceName,',',3) as routing_key,
microtime / 1000 / 1000 as time_second,
count(*) as send_qps
group by
instance_id,
virtual_host,
exchange_name,
routing_key,
time_second
order by
time_second,
send_qps
limit 10000000
The query result is as follows:
Query the number of messages consumed per queue
* and Action : PushMessage and Code : 200 |
select
InstanceId as instance_id,
VHost as virtual_host,
Queue as queue_name,
count(*) as push_total_num
group by
instance_id,
virtual_host,
queue_name
order by
push_total_num
limit 10000000
The query result is as follows:
Query the per-second message consumption rate per queue
* and Action : PushMessage and Code : 200 |
select
InstanceId as instance_id,
VHost as virtual_host,
Queue as queue_name,
microtime / 1000 / 1000 as time_second,
count(*) as push_qps
group by
instance_id,
virtual_host,
queue_name,
time_second
order by
time_second,
push_qps
limit 10000000
The query result is as follows:
Query the number of messages sent per second by each client
* and Action : SendMessage and Code : 200 |
select
InstanceId as instance_id,
VHost as virtual_host,
RemoteAddress as client_ip_port,
microtime / 1000 / 1000 as time_second,
count(*) as send_qps
group by
instance_id,
virtual_host,
client_ip_port,
time_second
order by
time_second,
send_qps
limit 10000000
The query result is as follows:
Query the number of messages consumed per second by each client
* and Action : PushMessage and Code : 200 |
select
InstanceId as instance_id,
VHost as virtual_host,
RemoteAddress as client_ip_port,
microtime / 1000 / 1000 as time_second,
count(*) as push_qps
group by
instance_id,
virtual_host,
client_ip_port,
time_second
order by
time_second,
push_qps
limit 10000000
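The query result is as follows:
Query the per-second rate of a specific action for each client
To query the QPS of a specific action for a client, copy the statement below and replace {action_name} with the name of the action that you want to query. The action names include:
ConnectionOpen, ChannelOpen
QueueDeclare, QueueDelete, QueueBind, QueueUnbind
ExchangeDeclare, ExchangeDelete
ExchangeBind, ExchangeUnBind
SendMessage, BasicConsume, BasicGet, BasicAck, BasicReject, BasicNack, BasicRecover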
* and Action : {action_name} and Code : 200 |
select
InstanceId as instance_id,
VHost as virtual_host,
RemoteAddress as client_ip_port,
microtime / 1000 / 1000 as time_second,
count(*) as {action_name}_qps
group by
instance_id,
virtual_host,
client_ip_port,
time_second
order by
time_second,
{action_name}_qps
limit 10000000
For example, to query the QPS at which a client opens connections, use the following statement:
* and Action : ConnectionOpen and Code : 200 |
select
InstanceId as instance_id,
VHost as virtual_host,
RemoteAddress as client_ip_port,
microtime / 1000 / 1000 as time_second,
count(*) as connection_open_qps
group by
instance_id,
virtual_host,
client_ip_port,
time_second
order by
time_second,
connection_open_qps
limit 10000000
The query result is as follows:
Query the QPS of each action
This statement collects the QPS of all actions for each client in a single query.
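If the per-action statement is needed for several actions, filling in the template by hand is error-prone. The following Python sketch shows one possible way to generate it. The function names are hypothetical, and the snake_case alias (for example connection_open_qps) is an assumption modelled on the ConnectionOpen example rather than the literal {action_name}_qps form in the template.

```python
import re

# Action names listed in this topic.
KNOWN_ACTIONS = {
    "ConnectionOpen", "ChannelOpen",
    "QueueDeclare", "QueueDelete", "QueueBind", "QueueUnbind",
    "ExchangeDeclare", "ExchangeDelete",
    "ExchangeBind", "ExchangeUnBind",
    "SendMessage", "BasicConsume", "BasicGet", "BasicAck",
    "BasicReject", "BasicNack", "BasicRecover",
}

QUERY_TEMPLATE = """\
* and Action : {action_name} and Code : 200 |
select
  InstanceId as instance_id,
  VHost as virtual_host,
  RemoteAddress as client_ip_port,
  microtime / 1000 / 1000 as time_second,
  count(*) as {alias}
group by
  instance_id,
  virtual_host,
  client_ip_port,
  time_second
order by
  time_second,
  {alias}
limit 10000000
"""


def to_snake_case(name):
    """Convert a CamelCase name such as ConnectionOpen to connection_open."""
    return re.sub(r"(?<!^)(?=[A-Z])", "_", name).lower()


def build_action_qps_query(action_name):
    """Fill the template for one action; reject names not listed in this topic."""
    if action_name not in KNOWN_ACTIONS:
        raise ValueError(f"unknown action: {action_name}")
    alias = to_snake_case(action_name) + "_qps"
    return QUERY_TEMPLATE.format(action_name=action_name, alias=alias)


if __name__ == "__main__":
    print(build_action_qps_query("ConnectionOpen"))
```

Restricting the input to the listed action names keeps arbitrary text out of the generated statement.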
* and Code : 200 |
select
InstanceId as instance_id,
VHost as virtual_host,
Action as action_type,
RemoteAddress as client_ip_port,
microtime / 1000 / 1000 as time_second,
count(*) as action_qps
group by
instance_id,
virtual_host,
client_ip_port,
action_type,
time_second
order by
time_second,
action_qps
limit 10000000
The query result is as follows:
Query the frequency of each error
* and not Code = 200 |
select
Code as error_code,
VHost as virtual_host,
split_part(split_part(Info, '[', 1), 'Req', 1) as error_info,
microtime / 1000 / 1000 as time_second,
count(*) as error_num
group by
virtual_host,
error_code,
time_second,
error_info
order by
time_second,
error_num
limit 10000000
The query result is as follows:
Query the average message body size
* and Action : SendMessage and Code: 200 |
select
InstanceId as instance_id,
VHost as virtual_host,
split_part(Queue, ';', 1) as queue_name,
microtime / 1000 / 1000 as time_second,
avg(cast(split_part(ResourceName, 'bodySize=', 2) as bigint)) as avg_body_size
group by
instance_id,
virtual_host,
queue_name,
time_second
order by
time_second,
avg_body_size
limit 10000000
The query result is as follows:
Query the number of pushes per message ID
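The body size extraction in the statement relies on the text after bodySize= being a plain integer. The following Python sketch reproduces that step under stated assumptions: the sample ResourceName values are invented, and the helper names are hypothetical. It also suggests why the expression appears to depend on bodySize being the last item in ResourceName, because any trailing text would make the integer conversion fail.

```python
def split_part(text, delimiter, index):
    """Return field `index` (1-based) of `text` split on `delimiter`, or None."""
    parts = text.split(delimiter)
    return parts[index - 1] if 1 <= index <= len(parts) else None


def body_size(resource_name):
    """Mimic cast(split_part(ResourceName, 'bodySize=', 2) as bigint)."""
    tail = split_part(resource_name, "bodySize=", 2)
    if tail is None:
        return None  # no bodySize= marker, comparable to NULL in SQL
    return int(tail)  # raises ValueError if anything follows the number


def average_body_size(resource_names):
    """Average the body sizes, skipping entries without one, like avg() skips NULL."""
    sizes = [s for s in map(body_size, resource_names) if s is not None]
    return sum(sizes) / len(sizes) if sizes else None


if __name__ == "__main__":
    # Hypothetical values; only the trailing bodySize=<number> shape matters.
    samples = ["a,b,c,bodySize=100", "a,b,c,bodySize=300", "a,b,c"]
    print(average_body_size(samples))
```

If bodySize is not the last item in your logs, the extraction expression would need to cut off the trailing text before the conversion.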
* and Action : PushMessage and Code : 200 |
select
InstanceId as instance_id,
VHost as virtual_host,
split_part(split_part(ResourceName, ',', 1), '=', 2) as msg_id,
count(*) as push_times
group by
instance_id,
virtual_host,
msg_id
order by
push_times desc
limit 1000000
The query result is as follows: