Michael Armbrust
a49938903d
Add Merge Script
...
Shamelessly copied from spark... still prompts for options that don't make sense (JIRA) but it's better than messy merge trees :)
Author: Michael Armbrust <michael@databricks.com>
Closes #16 from marmbrus/mergeScript.
2015-09-09 20:03:52 -07:00
Yin Huai
88fa2f5af2
Merge pull request #9 from yhuai/genData
...
Add data generation support for TPC-DS
2015-09-04 15:48:10 -07:00
Yin Huai
34f66a0a10
Add an option to filter rows with null partition column values.
2015-08-26 11:14:19 -07:00
Yin Huai
f4e20af107
fix typo
2015-08-25 23:31:50 -07:00
Yin Huai
06eb11f326
Fix the seed to 100 and use distribute by instead of order by.
2015-08-25 20:44:14 -07:00
Yin Huai
9936d49239
Add an option to orderBy partition columns.
2015-08-25 20:44:14 -07:00
Yin Huai
58188c6711
Allow users to use double instead of decimal for generated tables.
2015-08-25 20:44:14 -07:00
Yin Huai
88aadb45a4
Update README.
2015-08-25 20:44:14 -07:00
Yin Huai
77fbe22b7b
address comments.
2015-08-25 20:44:13 -07:00
Yin Huai
97093a45cd
Update readme and register temp tables.
2015-08-25 20:44:13 -07:00
Yin Huai
edb4daba80
Bug fix.
2015-08-25 20:44:13 -07:00
Yin Huai
544adce70f
Add methods to genData.
2015-08-25 20:44:13 -07:00
Michael Armbrust
e046705e7f
update version
2015-08-24 16:14:17 -07:00
Michael Armbrust
98dd76befd
Release 0.1.1
2015-08-24 16:13:51 -07:00
Michael Armbrust
32215e05ee
Block completion of cpu collection
2015-08-24 16:13:26 -07:00
Michael Armbrust
e5ac7f6b4a
update version 0.1.1-SNAPSHOT
2015-08-23 13:45:01 -07:00
Michael Armbrust
cabbf7291c
release 0.1
2015-08-23 13:44:23 -07:00
Yin Huai
8e46fbdb6c
Merge pull request #11 from marmbrus/cpuProfile
...
Add support for CPU Profiling
2015-08-21 17:15:23 -07:00
Yin Huai
8674c153b7
Merge pull request #10 from marmbrus/updateSBT
...
Update SBT
2015-08-20 16:52:29 -07:00
Michael Armbrust
3f33db31c0
Update SBT
2015-08-20 16:46:45 -07:00
Michael Armbrust
00aa49e8e4
Add support for CPU Profiling.
2015-08-20 16:46:12 -07:00
Yin Huai
249157f6a6
Fix typo.
2015-08-17 12:56:35 -07:00
Michael Armbrust
d6f89c862d
Merge pull request #8 from yhuai/hashSum
...
Add a new ExecutionMode to calculate the sum of hash values of result rows.
2015-08-14 12:39:51 -07:00
Yin Huai
d5c3104ec6
address comments.
2015-08-14 11:39:06 -07:00
Yin Huai
51546868f4
You can specify the perf result location.
2015-08-13 18:43:50 -07:00
Yin Huai
11bfdc7c5a
Add an ExecutionMode to check query results.
2015-08-13 18:43:49 -07:00
Yin Huai
1fe1729331
Merge pull request #7 from marmbrus/fixBreakdown
...
Fixes to breakdown calculation and table creation.
2015-08-13 18:39:53 -07:00
Michael Armbrust
ed8ddfedcd
Address Yin's comments
2015-08-13 17:54:00 -07:00
Michael Armbrust
4101a1e968
Fixes to breakdown calculation and table creation.
2015-08-13 15:47:01 -07:00
Yin Huai
ff19051b0e
Merge pull request #6 from marmbrus/refactor
...
Remove deprecated parquet writing code and add some micro benchmarks
2015-08-11 16:10:11 -07:00
Michael Armbrust
a239da90a2
more cleanup, update readme
2015-08-11 15:51:34 -07:00
Michael Armbrust
51b9dcb5b5
Merge remote-tracking branch 'origin/master' into refactor
...
Conflicts:
src/main/scala/com/databricks/spark/sql/perf/bigdata/Queries.scala
src/main/scala/com/databricks/spark/sql/perf/query.scala
src/main/scala/com/databricks/spark/sql/perf/runBenchmarks.scala
src/main/scala/com/databricks/spark/sql/perf/table.scala
src/main/scala/com/databricks/spark/sql/perf/tpcds/queries/ImpalaKitQueries.scala
src/main/scala/com/databricks/spark/sql/perf/tpcds/queries/SimpleQueries.scala
2015-08-07 15:31:32 -07:00
Yin Huai
e650da3533
Merge pull request #3 from jystephan/master
...
Closing bracket typo
2015-07-22 15:06:15 -07:00
Jean-Yves Stephan
9421522820
Closing bracket
2015-07-22 15:03:43 -07:00
Yin Huai
a50fedd5bc
Merge pull request #2 from jystephan/master
...
Allow saving benchmark queries results as parquet files
2015-07-22 13:40:39 -07:00
Jean-Yves Stephan
653d82134d
No collect before saveAsParquet
2015-07-22 13:30:40 -07:00
Yin Huai
b10fa582ea
Merge pull request #1 from Nosfe/patch-1
...
Reading hadoopConfiguration directly from SparkContext.
2015-07-22 10:27:24 -07:00
Michael Armbrust
f00ad77985
with data generation
2015-07-22 00:29:58 -07:00
Jean-Yves Stephan
a4a53b8a73
Took Aaron's comments
2015-07-21 20:05:53 -07:00
Jean-Yves Stephan
d866cce1a1
Format
2015-07-21 13:27:50 -07:00
Jean-Yves Stephan
933f3f0bb5
Removed queryOutputLocation parameter
2015-07-21 13:26:50 -07:00
Jean-Yves Stephan
9640cd8c1e
The execution mode (collect results / foreach results / writeparquet) is now specified as an argument to Query.
2015-07-21 13:23:11 -07:00
Jean-Yves Stephan
8e62e4fdbd
Added optional parameters to runBenchmark to specify a location to save queries outputs as parquet files.
...
+ Removed the hardcoded baseDir/parquet/ structure
2015-07-20 17:09:20 -07:00
Michael Armbrust
eba8cea93c
Basic join performance tests
2015-07-13 16:20:36 -07:00
Michael Armbrust
eb3dd30c35
Refactor to work in notebooks
2015-07-03 11:26:06 -07:00
Pace Francesco
4f4b08a122
Reading hadoopConfiguration from Spark.
...
Read hadoopConfiguration from SparkContext instead of creating a new Configuration directly from Hadoop config files.
This allows us to use Hadoop parameters inserted or modified in one of Spark's config files (e.g., Swift credentials).
2015-06-19 15:01:57 +02:00
Yin Huai
3eca8d2947
Add a method (waitForFinish) to wait for the experiment to finish.
2015-05-22 12:41:55 -07:00
Yin Huai
70da4f490e
Move dataframe into benchmark.
2015-05-16 19:31:55 -07:00
Yin Huai
9156e14f4b
Provide userSpecifiedBaseDir to access a dataset that is not in the path with the default format.
2015-05-07 11:01:38 -07:00
Yin Huai
fb9939b136
includeBreakdown is a parameter of runExperiment.
2015-04-20 10:03:41 -07:00