HiBench 7
官方:https://github.com/intel-hadoop/HiBench
一 简介
HiBench is a big data benchmark suite that helps evaluate different big data frameworks in terms of speed, throughput and system resource utilizations. It contains a set of Hadoop, Spark and streaming workloads, including Sort, WordCount, teraSort, Sleep, sql, PageRank, Nutch indexing, Bayes, Kmeans, NWeight and enhanced DFSIO, etc. It also contains several streaming workloads for Spark Streaming, Flink, Storm and Gearpump.
There are totally 19 workloads in HiBench.
Supported Hadoop/Spark/Flink/Storm/Gearpump releases:
Hadoop: Apache Hadoop 2.x, CDH5, HDP
Spark: Spark 1.6.x, Spark 2.0.x, Spark 2.1.x, Spark 2.2.x
Flink: 1.0.3
Storm: 1.0.1
Gearpump: 0.8.1
Kafka: 0.8.2.2
二 spark sql测试
1 download
$ wget https://github.com/intel-hadoop/HiBench/archive/HiBench-7.0.tar.gz
$ tar xvf HiBench-7.0.tar.gz
$ cd HiBench-HiBench-7.0
2 build
1)build all
$ mvn -Dspark=2.1 -Dscala=2.11 clean package
2)build hadoopbench and sparkbench
$ mvn -Phadoopbench -Psparkbench -Dspark=2.1 -Dscala=2.11 clean package
3)only build spark sql
$ mvn -Psparkbench -Dmodules -Psql -Dspark=2.1 -Dscala=2.11 clean package
3 prepare
$ cp conf/hadoop.conf.template conf/hadoop.conf
$ vi conf/hadoop.conf$ cp conf/spark.conf.template conf/spark.conf
$ vi conf/spark.conf$ vi conf/hibench.conf
# Data scale profile. Available value is tiny, small, large, huge, gigantic and bigdata.
# The deFinition of these profiles can be found in the workload's conf file i.e. conf/workloads/micro/wordcount.conf
hibench.scale.profile bigdata
4 run
sql测试分为3种:scan/aggregation/join
$ bin/workloads/sql/scan/prepare/prepare.sh
$ bin/workloads/sql/scan/spark/run.sh
prepare之后会在hdfs的/HiBench/Scan/Input下生成测试数据,在report/scan/prepare/下生成报告
run之后会在report/scan/spark/下生成报告,比如monitor.html,在hive的default库下可以看到测试数据表
$ bin/workloads/sql/join/prepare/prepare.sh
$ bin/workloads/sql/join/spark/run.sh$ bin/workloads/sql/aggregation/prepare/prepare.sh
$ bin/workloads/sql/aggregation/spark/run.sh
依此类推
参考:
https://github.com/intel-hadoop/HiBench/blob/master/docs/build-hibench.md
https://github.com/intel-hadoop/HiBench/blob/master/docs/run-sparkbench.md
版权声明:本文内容由互联网用户自发贡献,该文观点与技术仅代表作者本人。本站仅提供信息存储空间服务,不拥有所有权,不承担相关法律责任。如发现本站有涉嫌侵权/违法违规的内容, 请发送邮件至 [email protected] 举报,一经查实,本站将立刻删除。