124 Commits

Author SHA1 Message Date
Davies Liu
c087b68a5c make number of partitions configurable 2016-05-24 10:40:51 -07:00
Sameer Agarwal
375e116b1a bump to 0.4.6 2016-05-23 14:08:02 -07:00
Sameer Agarwal
1840fd9f21 Fix/rewrite some TPC-DS 1.4 queries
This patch ports upstream query modifications from apache/spark#13188
2016-05-23 14:02:47 -07:00
Sameer Agarwal
0355fc4ee7 Fix build and switch to jdk8
* Fix Build

* more memory

* switch to jdk8

* old memory settings
2016-05-23 12:54:07 -07:00
Sameer Agarwal
10b90c0d2b Fix q8 in ImpalaKit 2016-04-29 14:07:31 -07:00
Davies Liu
b8a90621cf bump to 0.4.5 2016-03-30 11:57:35 -07:00
Davies Liu
656f1bdb17 fix writing results 2016-03-30 11:56:55 -07:00
Michael Armbrust
2cd9b89388 Setting version to 0.4.4-SNAPSHOT 2016-03-25 11:52:37 -07:00
Michael Armbrust
598a242aa2 Setting version to 0.4.3 2016-03-25 11:49:53 -07:00
Michael Armbrust
d39f79a5cc Setting version to 0.4.2 2016-03-25 11:47:45 -07:00
Michael Armbrust
5912673b0d Fix JoinPerformance compilation
Author: Michael Armbrust <michael@databricks.com>

Closes #55 from marmbrus/fixJoinPerf.
2016-03-25 11:46:36 -07:00
Michael Armbrust
4450982f8e Setting version to 0.4.2-SNAPSHOT 2016-03-16 13:08:56 -07:00
Michael Armbrust
da08d8a66b Setting version to 0.4.1 2016-03-16 13:07:33 -07:00
Michael Armbrust
a013afcf29 Setting version to 0.4.1-SNAPSHOT 2016-03-15 14:47:44 -07:00
Michael Armbrust
1cfce2f9ad Setting version to 0.4.0 2016-03-15 14:45:50 -07:00
Josh Rosen
42a415e8d4 Extract Query class from Benchmark into its own top-level class and make SparkContext field transient
This patch extracts `Query` into its own top-level class and makes its `sparkContext` field transient in order to fix `NotSerializableException`s.

Author: Josh Rosen <rosenville@gmail.com>

Closes #53 from JoshRosen/make-query-into-top-level-class.
2016-02-22 18:23:06 -08:00
Josh Rosen
7e38b77c50 Update to compile against Spark 2.0.0-SNAPSHOT and bump version to 0.4.0-SNAPSHOT
Author: Josh Rosen <rosenville@gmail.com>

Closes #51 from JoshRosen/spark-2.0.0.
2016-02-19 13:02:29 -08:00
Josh Rosen
685ed9e488 Add TPCDS(sqlContext) constructor for backwards-compatibility
This patch adds additional constructors to `TPCDS` to maintain backwards-compatibility with code which calls `new TPCDS(anExistingSqlContext)`. This constructor was removed in #47.

The motivation for backwards-compatibility here is to simplify the gradual roll-out of an updated spark-sql-perf library to some existing jobs which share the same notebook.

Author: Josh Rosen <rosenville@gmail.com>

Closes #52 from JoshRosen/backwards-compatible-tpcds-constructor.
2016-02-19 13:01:23 -08:00
Michael Armbrust
7a3d9ce5b9 Setting version to 0.3.3-SNAPSHOT 2016-01-24 20:26:47 -08:00
Michael Armbrust
cb0347bb9d Setting version to 0.3.2 2016-01-24 20:26:00 -08:00
Michael Armbrust
9d3347e949 Improvements to running the benchmark
- Scripts for running the benchmark either while working on spark-sql-perf (bin/run) or while working on Spark (bin/spark-perf). The latter uses Spark's sbt build to compile Spark and downloads the most recent published version of spark-sql-perf.
- Adds a `--compare` option that can be used to compare the results with a baseline run.

Author: Michael Armbrust <michael@databricks.com>

Closes #49 from marmbrus/runner.
2016-01-24 20:24:54 -08:00
Michael Armbrust
41ee700bf4 Setting version to 0.3.2-SNAPSHOT 2016-01-19 13:34:25 -08:00
Michael Armbrust
b65b9286a8 Setting version to 0.3.1 2016-01-19 13:33:53 -08:00
Michael Armbrust
43f7457d03 Add required developer info to pom 2016-01-19 13:03:31 -08:00
Michael Armbrust
24d1b3f6e3 Setting version to 0.3.1-SNAPSHOT 2016-01-19 12:54:00 -08:00
Michael Armbrust
ed246c945f Setting version to 0.3.0 2016-01-19 12:53:06 -08:00
Michael Armbrust
9afabf249a remove sql dependency 2016-01-19 12:52:03 -08:00
Michael Armbrust
d52b4c398c add results to git ignore 2016-01-19 12:39:21 -08:00
Michael Armbrust
663ca7560e Main Class for running Benchmarks from the command line
This PR adds the ability to run performance tests locally as a standalone program that reports the results to the console:

```
$ bin/run --help
spark-sql-perf 0.2.0
Usage: spark-sql-perf [options]

  -b <value> | --benchmark <value>
        the name of the benchmark to run
  -f <value> | --filter <value>
        a filter on the name of the queries to run
  -i <value> | --iterations <value>
        the number of iterations to run
  --help
        prints this usage text

$ bin/run --benchmark DatasetPerformance
```
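
For readers who want to see how the three options in the usage text fit together, here is a minimal Python sketch of an equivalent argument parser. It is an illustration only, not the project's Scala entry point: the `build_parser` helper is hypothetical, and treating `--benchmark` as required and defaulting `--iterations` to 1 are assumptions, since the usage text does not state either.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical analog of the options listed in the usage text."""
    parser = argparse.ArgumentParser(prog="spark-sql-perf")
    # Assumption: a benchmark name must always be supplied.
    parser.add_argument("-b", "--benchmark", required=True,
                        help="the name of the benchmark to run")
    parser.add_argument("-f", "--filter", default=None,
                        help="a filter on the name of the queries to run")
    # Assumption: a single iteration when the option is omitted.
    parser.add_argument("-i", "--iterations", type=int, default=1,
                        help="the number of iterations to run")
    return parser


# Same invocation as the example above, parsed from an explicit list.
args = build_parser().parse_args(["--benchmark", "DatasetPerformance"])
```

With this sketch, `args.benchmark` holds the benchmark name, while `args.filter` and `args.iterations` fall back to the assumed defaults.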

Author: Michael Armbrust <michael@databricks.com>

Closes #47 from marmbrus/MainClass.
2016-01-19 12:37:51 -08:00
Michael Armbrust
5c93fff323 Upgrade to 1.6
Author: Michael Armbrust <michael@databricks.com>

Closes #48 from marmbrus/upgrade.
2016-01-18 09:11:35 -08:00
Davies Liu
cec648ac0f try to run all TPCDS queries in benchmark (even those that can't be parsed) 2016-01-08 15:03:44 -08:00
Davies Liu
3105219fb0 Merge commit '11d1f9dd7237ea2a09ecfa61f09d7623ad52fd47' 2016-01-08 11:29:07 -08:00
Davies Liu
11d1f9dd72 update some queries:
- replace " with `
- fill in some values
2016-01-08 11:27:50 -08:00
Michael Armbrust
9269f8f594 Capture BuildInfo when available
Author: Michael Armbrust <michael@databricks.com>

Closes #45 from marmbrus/buildInfo.
2015-12-23 11:03:06 -08:00
Michael Armbrust
4ba3802f95 Setting version to 0.2.4-SNAPSHOT 2015-12-23 00:11:53 -08:00
Michael Armbrust
61e6bd1897 Setting version to 0.2.3 2015-12-23 00:11:15 -08:00
Michael Armbrust
7825449eef Include publishing to BinTray in release process
After this you should be able to use the library in the shell as follows:

```
bin/spark-shell --packages com.databricks:spark-sql-perf:0.2.3
```

Author: Michael Armbrust <michael@databricks.com>

Closes #46 from marmbrus/publishToMaven.
2015-12-23 00:09:35 -08:00
Michael Armbrust
b2e4896efc Setting version to 0.2.3-SNAPSHOT 2015-12-08 16:07:15 -08:00
Michael Armbrust
c764be3e00 Setting version to 0.2.2 2015-12-08 16:06:53 -08:00
Michael Armbrust
f8aa93d968 Initial set of tests for Datasets
Author: Michael Armbrust <michael@databricks.com>

Closes #42 from marmbrus/dataset-tests.
2015-12-08 16:04:42 -08:00
Michael Armbrust
0aa2569a18 Write only one file per run
Author: Michael Armbrust <michael@databricks.com>

Closes #35 from marmbrus/oneResultFile.
2015-12-08 15:46:20 -08:00
Michael Armbrust
12b7537181 Update databricks plugin
Author: Michael Armbrust <michael@databricks.com>

Closes #43 from marmbrus/updatePlugin.
2015-12-08 15:29:47 -08:00
Yin Huai
3af656defa Make ExecutionMode.HashResults handle null value
In Spark 1.6, `getLong` throws an exception if the value is null; before 1.6, it returned 0. This PR checks whether the result is null and, if so, returns null instead of 0.
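
The null check described here can be sketched in Python as follows. This is an analog under stated assumptions, not the project's Scala code: the `Row` class is a hypothetical stand-in whose `get_long` raises on null (mirroring the 1.6 behavior described above), and `read_hash` is an illustrative name.

```python
from typing import Optional, Sequence


class Row:
    """Hypothetical stand-in for a result row; not Spark's Row class."""

    def __init__(self, values: Sequence[Optional[int]]):
        self._values = list(values)

    def is_null_at(self, i: int) -> bool:
        return self._values[i] is None

    def get_long(self, i: int) -> int:
        # Like the 1.6 behavior described above: a null raises
        # instead of silently becoming 0.
        value = self._values[i]
        if value is None:
            raise ValueError(f"value at index {i} is null")
        return value


def read_hash(row: Row) -> Optional[int]:
    """Return the value in column 0, or None when it is null."""
    return None if row.is_null_at(0) else row.get_long(0)
```

Checking for null before reading keeps an empty or all-null result distinguishable from a genuine hash of 0.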

Author: Yin Huai <yhuai@databricks.com>

Closes #41 from yhuai/fixSumHash.
2015-12-08 15:28:48 -08:00
Nong Li
43c2f23bb9 Fixes for Q34 and Q73 to return results deterministically.
Author: Nong Li <nong@databricks.com>

Closes #38 from nongli/tpcds.
2015-11-25 15:03:33 -08:00
Nong
70e0dbe656 Add official TPCDS 1.4 queries.
Author: Nong <nong@cloudera.com>

Closes #36 from nongli/tpcds.
2015-11-24 13:12:46 -08:00
Nong Li
1aa5bfc838 Add remaining tpcds tables.
Author: Nong Li <nongli@gmail.com>

Closes #34 from nongli/tpcds.
2015-11-19 13:50:00 -08:00
Andrew Or
e2073129cf Setting version to 0.2.2-SNAPSHOT 2015-11-18 13:17:56 -08:00
Andrew Or
180003e4f9 Setting version to 0.2.1 2015-11-18 13:17:48 -08:00
Nong Li
8d9e8ce9a3 Add another fact table and updates to load a single table at a time.
Author: Nong Li <nongli@gmail.com>

Closes #31 from nongli/more_tables.
2015-11-18 11:12:01 -08:00
Andrew Or
426ae30a2e Increase integration surface area with Spark perf
The changes in this PR are centered around making `Benchmark#runExperiment` accept things other than `Query`s. In particular, in spark-perf we don't always have a DataFrame or an RDD to work with and may want to run arbitrary code (e.g. ALS.train). This PR makes it possible to use the same code in `Benchmark` to do this.
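
The idea of timing arbitrary code rather than only queries can be sketched in Python as below. This is an illustration of the general design, not the `Benchmark#runExperiment` implementation: the `run_experiment` signature, the result dictionary keys, and the example workloads are all hypothetical.

```python
import time
from typing import Callable, Dict, List


def run_experiment(benchmarkables: Dict[str, Callable[[], object]],
                   iterations: int = 1) -> List[dict]:
    """Time each named zero-argument callable once per iteration."""
    results = []
    for iteration in range(1, iterations + 1):
        for name, body in benchmarkables.items():
            start = time.perf_counter()
            body()  # any code at all, not necessarily a query
            elapsed_ms = (time.perf_counter() - start) * 1000.0
            results.append({"name": name,
                            "iteration": iteration,
                            "ms": elapsed_ms})
    return results


# Hypothetical workloads standing in for "arbitrary code".
results = run_experiment(
    {"sum": lambda: sum(range(1000)), "sort": lambda: sorted([3, 1, 2])},
    iterations=2,
)
```

Because the runner depends only on a name and a callable, query-based and non-query workloads can share the same timing and reporting path.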

I tested this on dogfood and it works well there.

Author: Andrew Or <andrew@databricks.com>

Closes #33 from andrewor14/spark-perf.
2015-11-18 10:50:46 -08:00