Michael Armbrust
a49938903d
Add Merge Script
...
Shamelessly copied from spark... still prompts for options that don't make sense (JIRA) but it's better than messy merge trees :)
Author: Michael Armbrust <michael@databricks.com>
Closes #16 from marmbrus/mergeScript.
2015-09-09 20:03:52 -07:00
Yin Huai
88fa2f5af2
Merge pull request #9 from yhuai/genData
...
Add data generation support for TPC-DS
2015-09-04 15:48:10 -07:00
Yin Huai
34f66a0a10
Add an option to filter rows with null partition column values.
2015-08-26 11:14:19 -07:00
Yin Huai
f4e20af107
fix typo
2015-08-25 23:31:50 -07:00
Yin Huai
06eb11f326
Fix the seed to 100 and use distribute by instead of order by.
2015-08-25 20:44:14 -07:00
Yin Huai
9936d49239
Add an option to orderBy partition columns.
2015-08-25 20:44:14 -07:00
Yin Huai
58188c6711
Allow users to use double instead of decimal for generated tables.
2015-08-25 20:44:14 -07:00
Yin Huai
88aadb45a4
Update README.
2015-08-25 20:44:14 -07:00
Yin Huai
77fbe22b7b
address comments.
2015-08-25 20:44:13 -07:00
Yin Huai
97093a45cd
Update readme and register temp tables.
2015-08-25 20:44:13 -07:00
Yin Huai
edb4daba80
Bug fix.
2015-08-25 20:44:13 -07:00
Yin Huai
544adce70f
Add methods to genData.
2015-08-25 20:44:13 -07:00
Michael Armbrust
e046705e7f
update version
2015-08-24 16:14:17 -07:00
Michael Armbrust
98dd76befd
Release 0.1.1
2015-08-24 16:13:51 -07:00
Michael Armbrust
32215e05ee
Block completion of cpu collection
2015-08-24 16:13:26 -07:00
Michael Armbrust
e5ac7f6b4a
update version 0.1.1-SNAPSHOT
2015-08-23 13:45:01 -07:00
Michael Armbrust
cabbf7291c
release 0.1
2015-08-23 13:44:23 -07:00
Yin Huai
8e46fbdb6c
Merge pull request #11 from marmbrus/cpuProfile
...
Add support for CPU Profiling
2015-08-21 17:15:23 -07:00
Yin Huai
8674c153b7
Merge pull request #10 from marmbrus/updateSBT
...
Update SBT
2015-08-20 16:52:29 -07:00
Michael Armbrust
3f33db31c0
Update SBT
2015-08-20 16:46:45 -07:00
Michael Armbrust
00aa49e8e4
Add support for CPU Profiling.
2015-08-20 16:46:12 -07:00
Yin Huai
249157f6a6
Fix typo.
2015-08-17 12:56:35 -07:00
Michael Armbrust
d6f89c862d
Merge pull request #8 from yhuai/hashSum
...
Add a new ExecutionMode to calculate the sum of hash values of result rows.
2015-08-14 12:39:51 -07:00
Yin Huai
d5c3104ec6
address comments.
2015-08-14 11:39:06 -07:00
Yin Huai
51546868f4
You can specify the perf result location.
2015-08-13 18:43:50 -07:00
Yin Huai
11bfdc7c5a
Add an ExecutionMode to check query results.
2015-08-13 18:43:49 -07:00
Yin Huai
1fe1729331
Merge pull request #7 from marmbrus/fixBreakdown
...
Fixes to breakdown calculation and table creation.
2015-08-13 18:39:53 -07:00
Michael Armbrust
ed8ddfedcd
Address Yin's comments
2015-08-13 17:54:00 -07:00
Michael Armbrust
4101a1e968
Fixes to breakdown calculation and table creation.
2015-08-13 15:47:01 -07:00
Yin Huai
ff19051b0e
Merge pull request #6 from marmbrus/refactor
...
Remove deprecated parquet writing code and add some micro benchmarks
2015-08-11 16:10:11 -07:00
Michael Armbrust
a239da90a2
more cleanup, update readme
2015-08-11 15:51:34 -07:00
Michael Armbrust
51b9dcb5b5
Merge remote-tracking branch 'origin/master' into refactor
...
Conflicts:
src/main/scala/com/databricks/spark/sql/perf/bigdata/Queries.scala
src/main/scala/com/databricks/spark/sql/perf/query.scala
src/main/scala/com/databricks/spark/sql/perf/runBenchmarks.scala
src/main/scala/com/databricks/spark/sql/perf/table.scala
src/main/scala/com/databricks/spark/sql/perf/tpcds/queries/ImpalaKitQueries.scala
src/main/scala/com/databricks/spark/sql/perf/tpcds/queries/SimpleQueries.scala
2015-08-07 15:31:32 -07:00
Yin Huai
e650da3533
Merge pull request #3 from jystephan/master
...
Closing bracket typo
2015-07-22 15:06:15 -07:00
Jean-Yves Stephan
9421522820
Closing bracket
2015-07-22 15:03:43 -07:00
Yin Huai
a50fedd5bc
Merge pull request #2 from jystephan/master
...
Allow saving benchmark queries results as parquet files
2015-07-22 13:40:39 -07:00
Jean-Yves Stephan
653d82134d
No collect before saveAsParquet
2015-07-22 13:30:40 -07:00
Yin Huai
b10fa582ea
Merge pull request #1 from Nosfe/patch-1
...
Reading hadoopConfiguration directly from SparkContext.
2015-07-22 10:27:24 -07:00
Michael Armbrust
f00ad77985
with data generation
2015-07-22 00:29:58 -07:00
Jean-Yves Stephan
a4a53b8a73
Took Aaron's comments
2015-07-21 20:05:53 -07:00
Jean-Yves Stephan
d866cce1a1
Format
2015-07-21 13:27:50 -07:00
Jean-Yves Stephan
933f3f0bb5
Removed queryOutputLocation parameter
2015-07-21 13:26:50 -07:00
Jean-Yves Stephan
9640cd8c1e
The execution mode (collect results / foreach results / writeparquet) is now specified as an argument to Query.
2015-07-21 13:23:11 -07:00
Jean-Yves Stephan
8e62e4fdbd
Added optional parameters to runBenchmark to specify a location to save queries outputs as parquet files.
...
+ Removed the hardcoded baseDir/parquet/ structure
2015-07-20 17:09:20 -07:00
Michael Armbrust
eba8cea93c
Basic join performance tests
2015-07-13 16:20:36 -07:00
Michael Armbrust
eb3dd30c35
Refactor to work in notebooks
2015-07-03 11:26:06 -07:00
Pace Francesco
4f4b08a122
Reading hadoopConfiguration from Spark.
...
Read hadoopConfiguration from SparkContext instead of creating a new Configuration directly from Hadoop config files.
This allows us to use Hadoop parameters inserted or modified in one of Spark's config files (e.g., Swift credentials).
2015-06-19 15:01:57 +02:00
Yin Huai
3eca8d2947
Add a method (waitForFinish) to wait for the experiment to finish.
2015-05-22 12:41:55 -07:00
Yin Huai
70da4f490e
Move dataframe into benchmark.
2015-05-16 19:31:55 -07:00
Yin Huai
9156e14f4b
Provide userSpecifiedBaseDir to access a dataset that is not in the path with the default format.
2015-05-07 11:01:38 -07:00
Yin Huai
fb9939b136
includeBreakdown is a parameter of runExperiment.
2015-04-20 10:03:41 -07:00