Commit Graph

53 Commits

Author SHA1 Message Date
Michael Armbrust
a49938903d Add Merge Script
Shamelessly copied from spark... still prompts for options that don't make sense (JIRA) but it's better than messy merge trees :)

Author: Michael Armbrust <michael@databricks.com>

Closes #16 from marmbrus/mergeScript.
2015-09-09 20:03:52 -07:00
Yin Huai
88fa2f5af2 Merge pull request #9 from yhuai/genData
Add data generation support for TPC-DS
2015-09-04 15:48:10 -07:00
Yin Huai
34f66a0a10 Add an option to filter rows with null partition column values. 2015-08-26 11:14:19 -07:00
Yin Huai
f4e20af107 fix typo 2015-08-25 23:31:50 -07:00
Yin Huai
06eb11f326 Fix the seed to 100 and use distribute by instead of order by. 2015-08-25 20:44:14 -07:00
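The commit above pins the data generator's random seed to 100 so that generated tables are identical across benchmark runs. A minimal sketch of why a fixed seed gives reproducibility (the object and method names here are illustrative, not the project's actual API):

```scala
import scala.util.Random

// Illustrative sketch: a fixed seed (100, as in the commit) makes the
// pseudo-random sequence, and hence any data derived from it, identical
// on every run.
object SeedSketch {
  def sample(seed: Int, n: Int): Seq[Int] = {
    val rng = new Random(seed)          // deterministic for a given seed
    Seq.fill(n)(rng.nextInt(1000))      // same n values every time
  }

  def main(args: Array[String]): Unit = {
    // Two independent generators with the same seed produce the same data.
    assert(sample(100, 5) == sample(100, 5))
    println(sample(100, 5) == sample(100, 5))
  }
}
```

The same commit switches from `ORDER BY` to `DISTRIBUTE BY` on the partition columns, which shuffles rows to the right partitions without the extra cost of a global sort.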
Yin Huai
9936d49239 Add an option to orderBy partition columns. 2015-08-25 20:44:14 -07:00
Yin Huai
58188c6711 Allow users to use double instead of decimal for generated tables. 2015-08-25 20:44:14 -07:00
Yin Huai
88aadb45a4 Update README. 2015-08-25 20:44:14 -07:00
Yin Huai
77fbe22b7b address comments. 2015-08-25 20:44:13 -07:00
Yin Huai
97093a45cd Update readme and register temp tables. 2015-08-25 20:44:13 -07:00
Yin Huai
edb4daba80 Bug fix. 2015-08-25 20:44:13 -07:00
Yin Huai
544adce70f Add methods to genData. 2015-08-25 20:44:13 -07:00
Michael Armbrust
e046705e7f update version 2015-08-24 16:14:17 -07:00
Michael Armbrust
98dd76befd Release 0.1.1 2015-08-24 16:13:51 -07:00
Michael Armbrust
32215e05ee Block completion of cpu collection 2015-08-24 16:13:26 -07:00
Michael Armbrust
e5ac7f6b4a update version 0.1.1-SNAPSHOT 2015-08-23 13:45:01 -07:00
Michael Armbrust
cabbf7291c release 0.1 2015-08-23 13:44:23 -07:00
Yin Huai
8e46fbdb6c Merge pull request #11 from marmbrus/cpuProfile
Add support for CPU Profiling
2015-08-21 17:15:23 -07:00
Yin Huai
8674c153b7 Merge pull request #10 from marmbrus/updateSBT
Update SBT
2015-08-20 16:52:29 -07:00
Michael Armbrust
3f33db31c0 Update SBT 2015-08-20 16:46:45 -07:00
Michael Armbrust
00aa49e8e4 Add support for CPU Profiling. 2015-08-20 16:46:12 -07:00
Yin Huai
249157f6a6 Fix typo. 2015-08-17 12:56:35 -07:00
Michael Armbrust
d6f89c862d Merge pull request #8 from yhuai/hashSum
Add a new ExecutionMode to calculate the sum of hash values of result rows.
2015-08-14 12:39:51 -07:00
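The merged change above adds an execution mode that reduces each result row to a hash value and sums the hashes, giving a cheap, order-independent checksum of a query's output instead of storing the full result set. A minimal sketch of the idea over plain Scala collections (the names here are illustrative assumptions, not the mode's actual implementation):

```scala
// Illustrative sketch of a hash-sum check: reduce each result row to a
// hash and sum the hashes. Because addition is commutative, the checksum
// does not depend on row order, so two runs that return the same rows in
// different orders still compare equal.
object HashSumSketch {
  def hashSum(rows: Seq[Seq[Any]]): Long =
    rows.map(_.hashCode.toLong).sum

  def main(args: Array[String]): Unit = {
    val run1 = Seq(Seq(1, "x"), Seq(2, "y"))
    val run2 = Seq(Seq(2, "y"), Seq(1, "x")) // same rows, different order
    assert(hashSum(run1) == hashSum(run2))
    println(hashSum(run1) == hashSum(run2))
  }
}
```

The trade-off is that a checksum only detects mismatches; it cannot say which rows differ, and distinct result sets can in principle collide.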
Yin Huai
d5c3104ec6 address comments. 2015-08-14 11:39:06 -07:00
Yin Huai
51546868f4 You can specify the perf result location. 2015-08-13 18:43:50 -07:00
Yin Huai
11bfdc7c5a Add an ExecutionMode to check query results. 2015-08-13 18:43:49 -07:00
Yin Huai
1fe1729331 Merge pull request #7 from marmbrus/fixBreakdown
Fixes to breakdown calculation and table creation.
2015-08-13 18:39:53 -07:00
Michael Armbrust
ed8ddfedcd Address Yin's comments 2015-08-13 17:54:00 -07:00
Michael Armbrust
4101a1e968 Fixes to breakdown calculation and table creation. 2015-08-13 15:47:01 -07:00
Yin Huai
ff19051b0e Merge pull request #6 from marmbrus/refactor
Remove deprecated parquet writing code and add some micro benchmarks
2015-08-11 16:10:11 -07:00
Michael Armbrust
a239da90a2 more cleanup, update readme 2015-08-11 15:51:34 -07:00
Michael Armbrust
51b9dcb5b5 Merge remote-tracking branch 'origin/master' into refactor
Conflicts:
	src/main/scala/com/databricks/spark/sql/perf/bigdata/Queries.scala
	src/main/scala/com/databricks/spark/sql/perf/query.scala
	src/main/scala/com/databricks/spark/sql/perf/runBenchmarks.scala
	src/main/scala/com/databricks/spark/sql/perf/table.scala
	src/main/scala/com/databricks/spark/sql/perf/tpcds/queries/ImpalaKitQueries.scala
	src/main/scala/com/databricks/spark/sql/perf/tpcds/queries/SimpleQueries.scala
2015-08-07 15:31:32 -07:00
Yin Huai
e650da3533 Merge pull request #3 from jystephan/master
Closing bracket typo
2015-07-22 15:06:15 -07:00
Jean-Yves Stephan
9421522820 Closing bracket 2015-07-22 15:03:43 -07:00
Yin Huai
a50fedd5bc Merge pull request #2 from jystephan/master
Allow saving benchmark queries results as parquet files
2015-07-22 13:40:39 -07:00
Jean-Yves Stephan
653d82134d No collect before saveAsParquet 2015-07-22 13:30:40 -07:00
Yin Huai
b10fa582ea Merge pull request #1 from Nosfe/patch-1
Reading hadoopConfiguration directly from SparkContext.
2015-07-22 10:27:24 -07:00
Michael Armbrust
f00ad77985 with data generation 2015-07-22 00:29:58 -07:00
Jean-Yves Stephan
a4a53b8a73 Took Aaron's comments 2015-07-21 20:05:53 -07:00
Jean-Yves Stephan
d866cce1a1 Format 2015-07-21 13:27:50 -07:00
Jean-Yves Stephan
933f3f0bb5 Removed queryOutputLocation parameter 2015-07-21 13:26:50 -07:00
Jean-Yves Stephan
9640cd8c1e The execution mode (collect results / foreach results / writeparquet) is now specified as an argument to Query. 2015-07-21 13:23:11 -07:00
Jean-Yves Stephan
8e62e4fdbd Added optional parameters to runBenchmark to specify a location to save queries outputs as parquet files.
+ Removed the hardcoded baseDir/parquet/ structure
2015-07-20 17:09:20 -07:00
Michael Armbrust
eba8cea93c Basic join performance tests 2015-07-13 16:20:36 -07:00
Michael Armbrust
eb3dd30c35 Refactor to work in notebooks 2015-07-03 11:26:06 -07:00
Pace Francesco
4f4b08a122 Reading hadoopConfiguration from Spark.
Read hadoopConfiguration from SparkContext instead of creating a new Configuration directly from Hadoop config files.
This allows us to use Hadoop parameters inserted or modified in one of Spark's config files (e.g., Swift credentials).
2015-06-19 15:01:57 +02:00
Yin Huai
3eca8d2947 Add a method to wait for the experiment to finish (waitForFinish). 2015-05-22 12:41:55 -07:00
Yin Huai
70da4f490e Move dataframe into benchmark. 2015-05-16 19:31:55 -07:00
Yin Huai
9156e14f4b Provide userSpecifiedBaseDir to access a dataset that is not in the path with the default format. 2015-05-07 11:01:38 -07:00
Yin Huai
fb9939b136 includeBreakdown is a parameter of runExperiment. 2015-04-20 10:03:41 -07:00