kyuubi/dev/kyuubi-tpcds
Fei Wang 885ace06c8 [KYUUBI #1769][FOLLOWUP] Some cleanups for log4j properties

### _Why are the changes needed?_

This is a follow-up to #1769. It cleans up the log4j properties files and uses log4j2.properties for unit tests.

### _How was this patch tested?_

Passed UT.

Closes #1842 from turboFei/remove_log4j1.

Closes #1769

b6757555 [Fei Wang] unused change
365fbec7 [Fei Wang] revert license
14c64ec9 [Fei Wang] exclude log4j 1.2.17 in slf4j-log4j12
034b04db [Fei Wang] recover jcl-over-slf4j and slf4j-api for k8s it
2beae75e [Fei Wang] remove log4j 1.2.17
b00f07b5 [Fei Wang] remove from license
e55fd2ec [Fei Wang] remove unused dependencies
ab86f023 [Fei Wang] [KYUUBI #1769][FOLLOWUP] Some cleanups for log4j properties

Authored-by: Fei Wang <fwang12@ebay.com>
Signed-off-by: Fei Wang <fwang12@ebay.com>
2022-01-28 00:58:25 +08:00

Introduction

This module includes a TPC-DS data generator and a benchmark tool.

How to use

Package the jar with the following command: ./build/mvn clean package -Ptpcds -pl dev/kyuubi-tpcds -am

Data Generator

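Supported options:

| key         | default         | description                          |
|-------------|-----------------|--------------------------------------|
| db          | default         | the database to write data to        |
| scaleFactor | 1               | the scale factor of TPC-DS           |
| format      | parquet         | the storage format of the tables     |
| parallel    | scaleFactor * 2 | the parallelism of the Spark job     |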

Example: the following command generates 10 GB of data into a new database, tpcds_sf10.

$SPARK_HOME/bin/spark-submit \
  --class org.apache.kyuubi.tpcds.DataGenerator \
  kyuubi-tpcds_*.jar \
  --db tpcds_sf10 --scaleFactor 10 --format parquet --parallel 20
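To generate several scale factors in one go, the command above can be wrapped in a small helper. The following is a minimal sketch and not part of the module: the function name `datagen_cmd` is hypothetical, and it only prints the spark-submit command (a dry run) so the command can be inspected first. It passes `--parallel` explicitly as `scaleFactor * 2`, which mirrors the default listed in the options table.

```shell
#!/bin/sh
# Hypothetical helper: print the DataGenerator command for one scale factor.
# Nothing is executed here; pipe the output to `sh` to actually run it.
datagen_cmd() {
  sf="$1"
  parallel=$((sf * 2))   # same value as the documented default (scaleFactor * 2)
  printf '%s\n' "\$SPARK_HOME/bin/spark-submit --class org.apache.kyuubi.tpcds.DataGenerator kyuubi-tpcds_*.jar --db tpcds_sf${sf} --scaleFactor ${sf} --format parquet --parallel ${parallel}"
}

# Print the commands for scale factors 1, 10 and 100.
for sf in 1 10 100; do
  datagen_cmd "$sf"
done
```

Because `$SPARK_HOME` and the jar glob are printed literally, they are only expanded when the printed line is handed to a shell, for example with `datagen_cmd 10 | sh` from the directory that contains the jar.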

Benchmark Tool

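Supported options:

| key        | default              | description                                                  |
|------------|----------------------|--------------------------------------------------------------|
| db         | none (required)      | the TPC-DS database                                          |
| benchmark  | tpcds-v2.4-benchmark | the name of the application                                  |
| iterations | 3                    | the number of iterations to run                              |
| filter     |                      | a filter on the name of the queries to run, e.g. q1-v2.4     |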

Example: the following command benchmarks TPC-DS at scale factor 10 against the existing database tpcds_sf10.

$SPARK_HOME/bin/spark-submit \
  --class org.apache.kyuubi.tpcds.benchmark.RunBenchmark \
  kyuubi-tpcds_*.jar --db tpcds_sf10

You can also run a single TPC-DS query:

$SPARK_HOME/bin/spark-submit \
  --class org.apache.kyuubi.tpcds.benchmark.RunBenchmark \
  kyuubi-tpcds_*.jar --db tpcds_sf10 --filter q1-v2.4
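The filter can be combined with the other options in the table. The sketch below prints (without executing) one benchmark command per query, with a custom iteration count. It rests on assumptions: the helper name `bench_cmd` is hypothetical, the `--iterations` flag is inferred from the `iterations` key in the options table (this README only shows `--db` and `--filter` on the command line), and the query names `q2-v2.4` and `q3-v2.4` are assumed to follow the same pattern as `q1-v2.4`.

```shell
#!/bin/sh
# Hypothetical helper: print the RunBenchmark command for one query.
# Arguments: database name, query filter, number of iterations.
bench_cmd() {
  db="$1"
  query="$2"
  iterations="$3"
  # --iterations is an assumed flag name, taken from the options table key.
  printf '%s\n' "\$SPARK_HOME/bin/spark-submit --class org.apache.kyuubi.tpcds.benchmark.RunBenchmark kyuubi-tpcds_*.jar --db ${db} --filter ${query} --iterations ${iterations}"
}

# Dry run: one command per query, five iterations each.
for q in q1-v2.4 q2-v2.4 q3-v2.4; do
  bench_cmd tpcds_sf10 "$q" 5
done
```

Printing the commands first makes it easy to check the flags before starting a long benchmark run.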

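The benchmark result looks like:

| name    | minTimeMs | maxTimeMs  | avgTimeMs  | stdDev   | stdDevPercent  |
|---------|-----------|------------|------------|----------|----------------|
| q1-v2.4 | 50.522384 | 868.010383 | 323.398267 | 471.6482 | 145.8413108576 |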
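The sample row is consistent with reading stdDevPercent as the standard deviation expressed as a percentage of the average time (stdDev / avgTimeMs * 100). This is an interpretation inferred from the numbers above, not a statement about how the tool computes it. A minimal check, using a hypothetical helper named `stddev_percent`:

```shell
#!/bin/sh
# Hypothetical helper: standard deviation as a percentage of the mean.
# Arguments: stdDev, avgTimeMs.
stddev_percent() {
  awk -v sd="$1" -v avg="$2" 'BEGIN { printf "%.2f\n", sd / avg * 100 }'
}

# Values taken from the q1-v2.4 sample row.
stddev_percent 471.6482 323.398267   # prints 145.84
```

A value above 100 means the run times vary by more than their mean, as in the sample row, where maxTimeMs (about 868 ms) is far above minTimeMs (about 51 ms).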