小游戏网站建设公司,信誉好的低价网站建设,携创网中国工商注册网年审,深圳app开发公司一、 测试数据构造
1. 数据样例
官方文档有给出一批数据样例。优点是比较真实#xff0c;缺点是太大了#xff0c;动辄上百G不适合简单小测试
Anonymized Yandex.Metrica DatasetStar Schema BenchmarkWikiStatTerabyte of Click Logs from CriteoAMPLab Big Data Benchma…一、 测试数据构造
1. 数据样例
官方文档有给出一批数据样例。优点是比较真实缺点是太大了动辄上百G不适合简单小测试
Anonymized Yandex.Metrica DatasetStar Schema BenchmarkWikiStatTerabyte of Click Logs from CriteoAMPLab Big Data BenchmarkNew York Taxi DataOnTime
相对来说 ssb-dbgen工具 生成的表比较简单数据量也可以自己控制更加方便。 2. ssb-dbgen下载安装
下载
https://github.com/vadimtk/ssb-dbgen
安装依赖包
yum -y install gcc gcc-c make cmake
ssb-dbgen安装解压进入目录执行 make 即可 3. ssb-dbgen生成测试数据
ssb-dbgen工具指定参数可以生成如下表的数据其中lineorder是最大的
c–customer.tbld–date.tblp–part.tbls–supplier.tbll–lineorder.tbla-all
./dbgen -s 10 -T a -s 100 lineorder表会生成6亿行数据约67G-s 1000则会为其生成60亿行数据约670G需要大量空间和时间注意控制。 4. 创建表结构
测试表可以都用也可以挑一些官方文档只建了4个
CREATE TABLE customer
(C_CUSTKEY UInt32,C_NAME String,C_ADDRESS String,C_CITY LowCardinality(String),C_NATION LowCardinality(String),C_REGION LowCardinality(String),C_PHONE String,C_MKTSEGMENT LowCardinality(String)
)
ENGINE MergeTree ORDER BY (C_CUSTKEY);CREATE TABLE lineorder
(LO_ORDERKEY UInt32,LO_LINENUMBER UInt8,LO_CUSTKEY UInt32,LO_PARTKEY UInt32,LO_SUPPKEY UInt32,LO_ORDERDATE Date,LO_ORDERPRIORITY LowCardinality(String),LO_SHIPPRIORITY UInt8,LO_QUANTITY UInt8,LO_EXTENDEDPRICE UInt32,LO_ORDTOTALPRICE UInt32,LO_DISCOUNT UInt8,LO_REVENUE UInt32,LO_SUPPLYCOST UInt32,LO_TAX UInt8,LO_COMMITDATE Date,LO_SHIPMODE LowCardinality(String)
)
ENGINE MergeTree PARTITION BY toYear(LO_ORDERDATE) ORDER BY (LO_ORDERDATE, LO_ORDERKEY);CREATE TABLE part
(P_PARTKEY UInt32,P_NAME String,P_MFGR LowCardinality(String),P_CATEGORY LowCardinality(String),P_BRAND LowCardinality(String),P_COLOR LowCardinality(String),P_TYPE LowCardinality(String),P_SIZE UInt8,P_CONTAINER LowCardinality(String)
)
ENGINE MergeTree ORDER BY P_PARTKEY;CREATE TABLE supplier
(S_SUPPKEY UInt32,S_NAME String,S_ADDRESS String,S_CITY LowCardinality(String),S_NATION LowCardinality(String),S_REGION LowCardinality(String),S_PHONE String
)
ENGINE MergeTree ORDER BY S_SUPPKEY; 将star schema转换为flat schema表关联转为大宽表
SET max_memory_usage 20000000000;CREATE TABLE lineorder_flat
ENGINE MergeTree ORDER BY (LO_ORDERDATE, LO_ORDERKEY)
AS SELECTl.LO_ORDERKEY AS LO_ORDERKEY,l.LO_LINENUMBER AS LO_LINENUMBER,l.LO_CUSTKEY AS LO_CUSTKEY,l.LO_PARTKEY AS LO_PARTKEY,l.LO_SUPPKEY AS LO_SUPPKEY,l.LO_ORDERDATE AS LO_ORDERDATE,l.LO_ORDERPRIORITY AS LO_ORDERPRIORITY,l.LO_SHIPPRIORITY AS LO_SHIPPRIORITY,l.LO_QUANTITY AS LO_QUANTITY,l.LO_EXTENDEDPRICE AS LO_EXTENDEDPRICE,l.LO_ORDTOTALPRICE AS LO_ORDTOTALPRICE,l.LO_DISCOUNT AS LO_DISCOUNT,l.LO_REVENUE AS LO_REVENUE,l.LO_SUPPLYCOST AS LO_SUPPLYCOST,l.LO_TAX AS LO_TAX,l.LO_COMMITDATE AS LO_COMMITDATE,l.LO_SHIPMODE AS LO_SHIPMODE,c.C_NAME AS C_NAME,c.C_ADDRESS AS C_ADDRESS,c.C_CITY AS C_CITY,c.C_NATION AS C_NATION,c.C_REGION AS C_REGION,c.C_PHONE AS C_PHONE,c.C_MKTSEGMENT AS C_MKTSEGMENT,s.S_NAME AS S_NAME,s.S_ADDRESS AS S_ADDRESS,s.S_CITY AS S_CITY,s.S_NATION AS S_NATION,s.S_REGION AS S_REGION,s.S_PHONE AS S_PHONE,p.P_NAME AS P_NAME,p.P_MFGR AS P_MFGR,p.P_CATEGORY AS P_CATEGORY,p.P_BRAND AS P_BRAND,p.P_COLOR AS P_COLOR,p.P_TYPE AS P_TYPE,p.P_SIZE AS P_SIZE,p.P_CONTAINER AS P_CONTAINER
FROM lineorder AS l
INNER JOIN customer AS c ON c.C_CUSTKEY l.LO_CUSTKEY
INNER JOIN supplier AS s ON s.S_SUPPKEY l.LO_SUPPKEY
INNER JOIN part AS p ON p.P_PARTKEY l.LO_PARTKEY; 5. 导入数据
cd ssb-dbgen-masterclickhouse-client --password --query INSERT INTO hydb.customer FORMAT CSV customer.tbl
clickhouse-client --password --query INSERT INTO hydb.part FORMAT CSV part.tbl
clickhouse-client --password --query INSERT INTO hydb.supplier FORMAT CSV supplier.tbl
clickhouse-client --password --query INSERT INTO hydb.lineorder FORMAT CSV lineorder.tbl
如果还不够也可以多次执行以下语句至满意数据量
insert into hydb.lineorder select * from hydb.lineorder; 二、 clickhouse-benchmark简单压测
clickhouse-benchmark是自带的一个简单压测工具可以控制执行SQL的次数、并发度等。 1. 常用参数
-c 并发度例如10个并发同时执行指定SQL-d 间隔几秒执行SQL默认为10表示禁用-h 指定连接的db ip可以同时指定多个-h 连接多个库进行对比-i SQL执行总次数-r 有多个SQL时以随机顺序执行-t 指定压测时间到达指定时间后停止发送压测SQL。默认为0表示无限制 2. 两种用法
直接执行适合简单SQL
echo SELECT toYear(LO_ORDERDATE),count(*) FROM hydb.lineorder group by toYear(LO_ORDERDATE) order by 2 desc | clickhouse-benchmark --passwordxxxx -i 10
执行SQL文件适合复杂、批量语句
vi queries_file#查询语句
SELECT sum(LO_EXTENDEDPRICE * LO_DISCOUNT) AS revenue FROM hydb.lineorder WHERE toYear(LO_ORDERDATE) 1993 AND LO_DISCOUNT BETWEEN 1 AND 3 AND LO_QUANTITY 25;
执行压测
clickhouse-benchmark --passwordxxxx -i 10 queries_file
压测的语句可以简单写点也可以参考Star Schema Benchmark | ClickHouse Docs 3. 结果分析 Queries executed: 10. localhost:9000, queries 10, QPS: 6.772, RPS: 67904487.440, MiB/s: 518.070, result RPS: 67721584.984, result MiB/s: 516.675. 0.000% 0.145 sec. 10.000% 0.146 sec. 20.000% 0.146 sec. 30.000% 0.146 sec. 40.000% 0.147 sec. 50.000% 0.148 sec. 60.000% 0.148 sec. 70.000% 0.148 sec. 80.000% 0.149 sec. 90.000% 0.150 sec. 95.000% 0.150 sec. 99.000% 0.150 sec. 99.900% 0.150 sec. 99.990% 0.150 sec. 在结果报告中您可以找到: 查询数量参见Queries executed:字段。 状态码按顺序给出: ClickHouse服务器的连接信息。已处理的查询数。QPS服务端每秒处理的查询数量RPS服务器每秒读取多少行MiB/s服务器每秒读取多少字节的数据结果RPS服务端每秒生成多少行的结果集数据结果MiB/s.服务端每秒生成多少字节的结果集数据 查询执行时间的百分比。 参考
https://github.com/vadimtk/ssb-dbgen
Star Schema Benchmark | ClickHouse Docs
性能测试 | ClickHouse Docs
使用 ssb-dbgen 对 ClickHouse 压测_数据库人生的博客-CSDN博客