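Disk space analysis
Disk space usage is one of the key metrics to monitor when you operate an ApsaraDB for ClickHouse cluster. Running out of storage can have serious consequences: data can no longer be written or backed up, and storage expansion tasks take much longer to complete. This topic describes how to use SQL statements to check the disk space usage of an ApsaraDB for ClickHouse cluster.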
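Example environment
The examples in this topic use the s-2-r-0 node. Change the parameters to match your own environment. If you do not know the node name, you can obtain it in either of the following ways.
Through the console: the node name is shown on the cluster monitoring page. For how to open that page, see View cluster monitoring information.
Through SQL: run the following statement to list the names of all nodes in the cluster.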
SELECT * FROM system.clusters;
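The full output of system.clusters is wide, so a narrower query may be easier to read. The following is a minimal sketch, assuming the cluster is named default as in the other examples in this topic. It selects only the standard cluster, shard_num, replica_num, and host_name columns.

```sql
-- List one row per node of the cluster, ordered by shard and replica.
-- Assumption: the cluster name is 'default', as in the examples in this topic.
SELECT
    cluster,
    shard_num,
    replica_num,
    host_name
FROM system.clusters
WHERE cluster = 'default'
ORDER BY shard_num, replica_num;
```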
View the disk space used by tables
Checking table data sizes shows how much disk space each table occupies, which gives you a basis for tuning database performance and planning storage resources.
View table data details
View the active data of each table on the s-2-r-0 node.
SELECT
`database`,
table,
formatReadableSize(sum(data_compressed_bytes) AS size) AS compressed, -- compressed data size
formatReadableSize(sum(data_uncompressed_bytes) AS usize) AS uncompressed, -- uncompressed data size
round(usize / size, 2) AS compr_rate, -- compression ratio (uncompressed / compressed)
sum(rows) AS rows, -- total rows
count() AS part_count -- number of data parts
FROM clusterAllReplicas('default', system.parts)
WHERE (active = 1)
AND (table LIKE '%')
AND (`database` LIKE '%')
AND substring(hostname(),38,8) = 's-2-r-0'
GROUP BY
`database`,
table
ORDER BY size DESC;
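The filter substring(hostname(),38,8) depends on the exact layout of the hostname, so it may be worth checking what the expression returns on each node before relying on it. The following sketch assumes the cluster is named default. It reads system.one through clusterAllReplicas so that every node returns one row. The aliases host, host_length, and node_part are illustrative.

```sql
-- Show each node's hostname and the fragment that the examples compare with 's-2-r-0'.
-- Assumption: the cluster name is 'default'; the offsets 38 and 8 come from the examples in this topic.
SELECT
    hostname() AS host,
    length(host) AS host_length,
    substring(host, 38, 8) AS node_part
FROM clusterAllReplicas('default', system.one)
ORDER BY host;
```

If node_part does not match the node names in your cluster, a pattern match such as hostname() LIKE '%s-2-r-0' is one possible alternative, assuming the hostname ends with the node name.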
View replica table data details
View the table data on each replica.
SELECT
hostname() AS h,
`database`,
table,
count(*) AS data_part_cnt, -- number of data parts
sum(rows) AS total_rows, -- total rows
formatReadableSize(sum(bytes_on_disk)) AS total_compressed_bytes, -- compressed data size
sum(data_uncompressed_bytes) AS total_uncompressed_bytes -- uncompressed data size, in bytes
FROM clusterAllReplicas('default', system.parts)
WHERE active = 1
GROUP BY
h,
`database`,
table
ORDER BY total_rows DESC;
View the tables that use the most disk space
View the ten tables that use the most disk space in the cluster.
SELECT
`database`,
table,
sum(bytes_on_disk) AS bytes_on_disk
FROM clusterAllReplicas('default', system.parts)
WHERE active AND (`database` != 'system')
GROUP BY
`database`,
table
ORDER BY bytes_on_disk DESC
LIMIT 10;
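Table sizes are easier to judge next to the capacity of the disks that hold them. The following sketch reads system.disks on every node, assuming the cluster is named default. The aliases disk_name, total, free, and used_pct are illustrative.

```sql
-- Show total, free, and used disk space for every disk on every node.
-- Assumption: the cluster name is 'default', as in the examples in this topic.
SELECT
    hostname() AS host,
    name AS disk_name,
    formatReadableSize(total_space) AS total,
    formatReadableSize(free_space) AS free,
    round((1 - free_space / total_space) * 100, 2) AS used_pct -- share of the disk in use, in percent
FROM clusterAllReplicas('default', system.disks)
ORDER BY host, disk_name;
```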
View storage information for each column of a table
Query template
SELECT
`database`,
table,
column,
formatReadableSize(sum(column_data_compressed_bytes) AS size) AS compressed,
formatReadableSize(sum(column_data_uncompressed_bytes) AS usize) AS uncompressed,
round(usize / size, 2) AS compr_rate,
sum(rows) rows_cnt,
round(sum(column_data_uncompressed_bytes)/sum(rows) ,2) avg_row_size
FROM clusterAllReplicas('default', system.parts_columns)
WHERE (active = <active_type>) AND (table LIKE '<table_name>')
AND substring(hostname(),38,8) = '<node_name>'
GROUP BY
`database`,
table,
column
ORDER BY size DESC;
Parameters
Parameter | Description |
table_name | The name of the target table. The value % matches all tables. |
node_name | The name of the target cluster node. |
active_type | Whether the data is active. 1: active data. 0: inactive data. |
Example
View the columns that store active data in the query_log table on the s-2-r-0 node.
SELECT
`database`,
table,
column,
formatReadableSize(sum(column_data_compressed_bytes) AS size) AS compressed,
formatReadableSize(sum(column_data_uncompressed_bytes) AS usize) AS uncompressed,
round(usize / size, 2) AS compr_rate,
sum(rows) rows_cnt,
round(sum(column_data_uncompressed_bytes)/sum(rows) ,2) avg_row_size
FROM clusterAllReplicas('default', system.parts_columns)
WHERE (active = 1)
AND (table LIKE 'query_log')
AND substring(hostname(),38,8) = 's-2-r-0'
GROUP BY
`database`,
table,
column
ORDER BY size DESC;
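When one column dominates the result, its compression codec is often the next thing to check. As a rough sketch, system.columns exposes the codec next to per-column sizes. The query below runs only on the node you are connected to and assumes that query_log lives in the system database. The alias column_name is illustrative.

```sql
-- Show the type, codec, and size of each column of system.query_log on the current node.
-- Assumption: the target table is system.query_log; change the filter for other tables.
SELECT
    name AS column_name,
    type,
    compression_codec, -- empty when the column uses the default codec
    formatReadableSize(data_compressed_bytes) AS compressed,
    formatReadableSize(data_uncompressed_bytes) AS uncompressed
FROM system.columns
WHERE (`database` = 'system') AND (table = 'query_log')
ORDER BY data_compressed_bytes DESC;
```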
View table partition information
Query template
SELECT
partition AS partition_value, -- partition
sum(rows) AS total_rows, -- total rows
formatReadableSize(sum(data_uncompressed_bytes)) AS uncompressed_size, -- original (uncompressed) size
formatReadableSize(sum(data_compressed_bytes)) AS compressed_size, -- compressed size
round((sum(data_compressed_bytes) / sum(data_uncompressed_bytes)) * 100, 0) AS compression_pct -- compressed size as a percentage of the original size
FROM clusterAllReplicas('default', system.parts)
WHERE (`database` IN ('<database_name>'))
AND (table IN ('<table_name>'))
AND (partition LIKE '<partition_prefix>')
GROUP BY partition
ORDER BY partition ASC;
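Parameters
Parameter | Description |
database_name | The database name. |
table_name | The table name. |
partition_prefix | The partition prefix. Append % to match every partition that starts with the prefix, for example 2019-12-%. |
Example
View the partitions whose names start with 2019-12- in the test table of the default database.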
SELECT
partition AS partition_value, -- partition
sum(rows) AS total_rows, -- total rows
formatReadableSize(sum(data_uncompressed_bytes)) AS uncompressed_size, -- original (uncompressed) size
formatReadableSize(sum(data_compressed_bytes)) AS compressed_size, -- compressed size
round((sum(data_compressed_bytes) / sum(data_uncompressed_bytes)) * 100, 0) AS compression_pct -- compressed size as a percentage of the original size
FROM clusterAllReplicas('default', system.parts)
WHERE (`database` IN ('default'))
AND (table IN ('test'))
AND (partition LIKE '2019-12-%')
GROUP BY partition
ORDER BY partition ASC;
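The partition query reads system.parts from every replica and does not filter on active, so replicated data is counted once per replica and inactive parts are included. If per-node figures for active data are what you need, a variant along the following lines may help. It is a sketch that reuses the default database, the test table, and the 2019-12- prefix from the example. The aliases host, total_rows, and compressed_size are illustrative.

```sql
-- Per-node partition sizes, counting only active data parts.
-- Assumption: cluster 'default', database 'default', table 'test', as in the example in this topic.
SELECT
    hostname() AS host,
    partition,
    sum(rows) AS total_rows,
    formatReadableSize(sum(data_compressed_bytes)) AS compressed_size
FROM clusterAllReplicas('default', system.parts)
WHERE (`database` = 'default')
  AND (table = 'test')
  AND (partition LIKE '2019-12-%')
  AND (active = 1)
GROUP BY host, partition
ORDER BY host, partition;
```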
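View the size of data parts
In ClickHouse, the system.parts table stores the state, size, creation time, and other information about each data part. You can query this table to get details about data parts.
View the size of active data parts
Active data parts hold the data that a table currently uses. The volume of active data shows the real data size of a table, and therefore how much disk space the table occupies.
View information about the active data parts of all non-system tables on the s-2-r-0 node.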
SELECT
`database`,
table,
count(*) AS data_part_cnt, -- number of active data parts
sum(rows) AS total_rows, -- total rows
formatReadableSize(sum(bytes_on_disk)) AS total_compressed_bytes, -- total compressed size on disk
sum(data_uncompressed_bytes) AS total_uncompressed_bytes -- total uncompressed size, in bytes
FROM clusterAllReplicas('default', system.parts )
WHERE active = 1 AND `database` != 'system'
AND substring(hostname(),38,8) = 's-2-r-0'
GROUP BY `database`, table
ORDER BY total_rows DESC;
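View the size of inactive data parts
Inactive data parts may contain outdated data or data that is marked for deletion. If this data is no longer needed, clean it up promptly to reduce disk usage.
View information about the inactive data parts of all non-system tables on the s-2-r-0 node.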
SELECT
`database`,
table,
count(*) AS data_part_cnt, -- number of inactive data parts
sum(rows) AS total_rows, -- total rows
formatReadableSize(sum(bytes_on_disk)) AS total_compressed_bytes, -- total compressed size on disk
sum(data_uncompressed_bytes) AS total_uncompressed_bytes -- total uncompressed size, in bytes
FROM clusterAllReplicas('default', system.parts )
WHERE active = 0
AND `database` != 'system'
AND substring(hostname(),38,8) = 's-2-r-0'
GROUP BY `database`, table;
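Before cleaning anything up, it can help to see which individual parts are inactive and since when. ClickHouse usually removes inactive parts on its own after a delay, so a long-lived inactive part is the more interesting case. The following sketch lists the largest inactive parts on the s-2-r-0 node, using the remove_time column of system.parts. The aliases part_name and size_on_disk and the limit of 20 are illustrative.

```sql
-- List the 20 largest inactive data parts on the s-2-r-0 node.
-- Assumption: cluster 'default' and the hostname offsets used elsewhere in this topic.
SELECT
    `database`,
    table,
    name AS part_name,
    formatReadableSize(bytes_on_disk) AS size_on_disk,
    modification_time, -- when the part directory was last modified
    remove_time        -- when the part became inactive
FROM clusterAllReplicas('default', system.parts)
WHERE (active = 0)
  AND (`database` != 'system')
  AND (substring(hostname(), 38, 8) = 's-2-r-0')
ORDER BY bytes_on_disk DESC
LIMIT 20;
```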
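Disk space used by projections
A projection is a ClickHouse data structure for optimizing query performance. Similar to a materialized view, it stores precomputed aggregates or transformed data. Knowing the size of your projections helps you assess their impact on disk space.
View the disk space used by the projections of the test table on the s-2-r-0 node.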
SELECT
database,
table,
name,
formatReadableSize(sum(data_compressed_bytes) AS size) AS compressed, -- compressed data size
formatReadableSize(sum(data_uncompressed_bytes) AS usize) AS uncompressed, -- uncompressed data size
round(usize / size, 2) AS compr_rate, -- compression ratio (uncompressed / compressed)
sum(rows) AS rows, -- total rows
count() AS part_count -- number of parts
FROM clusterAllReplicas('default', system.projection_parts )
WHERE (table = 'test')
AND (active = 1) AND substring(hostname(),38,8) = 's-2-r-0'
GROUP BY
database,
table,
name
ORDER BY size DESC;
View the disk space used by each projection column of the test table on the s-2-r-0 node.
SELECT
database,
table,
column,
formatReadableSize(sum(column_data_compressed_bytes) AS size) AS compressed, -- compressed data size
formatReadableSize(sum(column_data_uncompressed_bytes) AS usize) AS uncompressed, -- uncompressed data size
round(usize / size, 2) AS compr_rate -- compression ratio (uncompressed / compressed)
FROM clusterAllReplicas('default', system.projection_parts_columns )
WHERE (active = 1) AND (table LIKE 'test')
AND substring(hostname(),38,8) = 's-2-r-0'
GROUP BY
database,
table,
column
ORDER BY size DESC;