-
304fdaf81a
Add single-table parameter
master
Your Name
2026-01-07 12:47:26 +0800
-
28d88190f6
Update Spark repository for sbt (#206)
Frank Luan
2021-12-13 02:20:13 -0800
-
ca4ccea3dd
Add a convenient class to generate TPC-DS data (#196)
Yuming Wang
2021-03-30 20:19:36 +0800
-
65785a8a04
Fix Travis CI JDK installation (#195)
Yuming Wang
2021-01-29 00:28:46 +0800
-
d85f75bb38
Update for Spark 3.0.0 compatibility (#191)
Nico Poggi
2020-11-03 15:27:34 +0100
-
6b2bf9f9ad
Fix files truncating according to maxRecordPerFile (#180)
Guo Chenzhao
2019-05-29 23:20:01 +0800
-
3f92a094cc
Bumping version to 0.5.1-SNAPSHOT (spark 3, scala 2.12, log4j ) (#168)
Nico Poggi
2019-01-29 10:00:54 +0100
-
e1e1365a87
Updates for Spark 3.0 and Scala 2.12 compatibility (#176)
Luca Canali
2019-01-29 09:58:52 +0100
-
85bbfd4ca2
[ML-5437] Build with spark-2.4.0 and resolve build issues (#174)
Bago Amirbekian
2018-11-09 16:21:22 -0800
-
d44caec277
Revert "Update Scala Logging to officially supported one " (#172)
Nico Poggi
2018-10-19 17:33:34 +0200
-
0367ff65a6
Coalesce(n) instead of hardcoded (1) for large tables/partitions
Nico Poggi
2018-10-16 11:21:05 +0200
-
3c1c9e9070
Rebase for PR 87: Add -m for custom master, use SBT_HOME if set (#169)
Nico Poggi
2018-09-17 15:18:16 +0200
-
d9a41a1204
Fix 3 local benchmark classes (#165)
Phil
2018-09-17 13:08:56 +0100
-
aac7eb54c1
Fixing TPCH DDL datatype of customer.c_nationkey from string to long (#167)
Nico Poggi
2018-09-13 12:00:29 +0200
-
56f73482d7
Update Scala Logging to officially supported one
Piotr Mrówczyński
2018-09-11 12:17:06 +0200
-
6136ecea6e
TPC-H datagenerator and instructions (#136)
Nico Poggi
2018-09-10 23:18:33 +0200
-
8bbeae664d
Adds an optional version of the SS_MAX query (#137)
Nico Poggi
2018-09-10 22:54:02 +0200
-
bf55bdb987
Make queryNames public so it can be accessed from notebooks. (#166)
Nico Poggi
2018-09-10 22:53:20 +0200
-
bb12958874
Fix compile for Spark 2.4 SNAPSHOT and only catch NonFatal (#164)
Xiangrui Meng
2018-09-10 08:49:31 -0700
-
0ab6bf606b
Benchmark for SparkR UDF *apply() APIs
Liang Zhang
2018-07-12 17:12:35 -0700
-
8e8c08d75b
[ML-4154] Added testing for before/after of ml benchmarks. (#162)
Bago Amirbekian
2018-07-12 16:43:54 -0700
-
107495afe2
[ML-4069] Improve timing of estimators (#161)
Joseph Bradley
2018-07-09 17:41:44 -0700
-
30c50dddbb
[ML-2918] Call count() in default score() to improve timing of transform() (#159)
Joseph Bradley
2018-07-08 16:09:24 -0700
-
1798b12077
change large test timeout to 12 hours (#160)
Xiangrui Meng
2018-07-04 15:32:00 -0700
-
2895ae1139
update VectorAssembler test such that the dataset size is numExamples * numFeatures (#158)
Xiangrui Meng
2018-07-03 17:16:36 -0700
-
e9ef9788c2
[ML-3844] Add GBTRegression benchmark (#156)
ludatabricks
2018-06-27 09:17:38 -0700
-
e8aa132bb8
[ML-3870] Make spark-sql-perf master compiled with spark 2.3 and scala 2.11 (#155)
ludatabricks
2018-06-15 06:40:14 -0700
-
49717a72dd
put additionalTests to mlmetrics (#153)
ludatabricks
2018-06-13 15:21:50 -0700
-
a4e1c790ba
[ML-3869] Make Quantilediscretizer work with spark-2.3 (#154)
ludatabricks
2018-06-13 15:19:52 -0700
-
51786921a6
[ML-3583] Add benchmarks to mllib-large.yaml for featurization (#152)
ludatabricks
2018-06-12 17:31:30 -0700
-
aa1587fec5
[ML-3824] Add benchmarks to mllib-large.yaml for FPGrowth (#151)
ludatabricks
2018-06-12 13:10:12 -0700
-
6a45dc8a2d
[ML-3581] Add benchmarks to mllib-large.yaml for regression (#150)
ludatabricks
2018-06-12 10:32:02 -0700
-
9ab2a8bb14
[ML-3585] Added benchmarks to mllib-large.yaml for clustering (#149)
ludatabricks
2018-06-08 12:06:52 -0700
-
62b173d779
Output Training Time as metrics (#148)
ludatabricks
2018-06-07 13:21:32 -0700
-
d9984e1c0a
[ML-3584] Added benchmarks to mllib-large.yaml for ALS (#147)
ludatabricks
2018-06-07 08:11:37 -0700
-
93626c11b4
[ML-3775] Add "benchmarkId" to BenchmarkResult (#146)
ludatabricks
2018-06-04 14:13:45 -0700
-
f1139fc742
[ML-3753] Log "value" instead of "Some(value)" for ML params in results (#145)
ludatabricks
2018-06-04 11:09:41 -0700
-
1768d376f9
[ML-3749] Log metric name and isLargerBetter in BenchmarkResult (#144)
ludatabricks
2018-06-01 15:49:16 -0700
-
789a0f5b8b
Added benchmarks to mllib-large.yaml for classification Estimators. (#143)
Bago Amirbekian
2018-05-30 08:18:49 -0700
-
3786a8391e
Quantile discretizer benchmark (#135)
WeichenXu
2018-05-18 02:55:00 +0800
-
15d9283473
Run mllib small in unit tests (#141)
Bago Amirbekian
2018-05-09 16:24:30 -0700
-
9ece11ff20
Add decision tree benchmark (#140)
Bago Amirbekian
2018-05-08 21:44:11 -0700
-
ed9bbb01a5
fix bug with ML additional method tests (#142)
Joseph Bradley
2018-05-08 13:23:22 -0700
-
be4459fe41
Additional method test for some ML algos (#139)
WeichenXu
2018-05-03 04:45:58 +0800
-
5af9f6dfc2
Word2Vec benchmark (#127)
WeichenXu
2018-03-16 04:10:04 +0800
-
a8acd53fdd
Use DECIMAL and DATE in the default TPCDS notebooks. (#130)
Juliusz Sompolski
2018-03-07 21:44:42 +0100
-
b7ac7e55ae
Remove VACUUM from tpcds_datagen notebook. (#129)
Juliusz Sompolski
2018-03-07 15:36:27 +0100
-
93a34553f0
MinHashLSH and BucketedRandomProjectionLSH benchmark #128
WeichenXu
2018-03-03 07:21:37 +0800
-
6d01ac94a1
[ML-3342] Bug fixes to make mllib benchmarks work with dbr-4.0. (#125)
Bago Amirbekian
2018-03-02 09:12:38 -0800
-
91604a3ab0
Update README to specify that TPCDS kit needs to be installed on all nodes.
Juliusz Sompolski
2018-02-27 12:06:12 +0100
-
31f34beee5
Update README to do sql("use database") (#123)
Juliusz Sompolski
2017-11-07 20:38:26 +0100
-
7bf2d45b0f
Don't clean blocks after every run in Benchmarkable (#119)
Juliusz Sompolski
2017-09-18 11:51:12 +0200
-
fdd0e38717
TPCDS notebooks in source, not binary format (#121)
Juliusz Sompolski
2017-09-13 14:57:59 +0200
-
006f096562
Merge pull request #120 from juliuszsompolski/tpcds_notebooks
Nico Poggi
2017-09-12 17:22:38 +0200
-
-
5ebb9cfb12
add some more comments
Juliusz Sompolski
2017-09-12 16:51:26 +0200
-
c78f2b3a9b
update readme
Juliusz Sompolski
2017-09-12 16:40:23 +0200
-
ae8bcdb292
add notebooks
Juliusz Sompolski
2017-09-12 15:43:08 +0200
-
f08bf31d18
add benchmark for FPGrowth (#113)
WeichenXu
2017-09-05 01:48:05 +0800
-
-
bcda8fc1e5
Coalesce non-partitioned tables. (#118)
Juliusz Sompolski
2017-09-04 18:05:42 +0200
-
3e1bbd00ed
[ML-2847] Add new tests for (DecisionTree, RandomForest)Regression, GMM, HashingTF (#116)
Siddharth Murching
2017-09-03 22:26:20 -0700
-
19c41464c7
fix df.drop in VectorAssembler (#117)
WeichenXu
2017-09-02 04:51:05 +0800
-
6ec83fd0f7
Add benchmark for LinearSVC/OnehotEncoder/VectorSlicer/VectorAssembler/StringIndexer/Tokenizer (#112)
WeichenXu
2017-09-01 04:56:43 +0800
-
737a1bc355
BlockingLineStream (#115)
Juliusz Sompolski
2017-08-31 15:16:22 +0200
-
9febc34f66
Refactor MLParams for spark-sql-perf (#114)
Siddharth Murching
2017-08-28 13:23:59 -0700
-
d0de5ae8aa
Update tests to run with Spark 2.2, add NaiveBayes & Bucketizer ML tests (#110)
Siddharth Murching
2017-08-21 15:07:46 -0700
-
b3a6ed79b3
Start the development 0.5.0-SNAPSHOT
Yin Huai
2017-08-21 14:21:19 -0700
-
4e7a2363b9
Support for TPC-H benchmark
Bogdan Raducanu
2017-08-09 12:26:32 +0200
-
fdcde7595c
Update README (#107)
Kevin
2017-07-13 10:45:24 +0200
-
6488d74d23
tpcds_2_4: Add alias names to subqueries in FROM clause.
Juliusz Sompolski
2017-06-29 02:59:34 +0200
-
bff6b34f62
Tweaks and improvements (#106)
Juliusz Sompolski
2017-06-13 11:42:14 +0200
-
75f3876e59
Merge pull request #103 from juliuszsompolski/fixtypes
Juliusz Sompolski
2017-05-26 11:53:19 +0200
-
-
2ddd521ab5
ok, make it long only where really needed.
Juliusz Sompolski
2017-05-26 10:36:40 +0200
-
1bca964a3d
Correct types of keys
Juliusz Sompolski
2017-05-25 17:12:47 +0200
-
-
beec62844d
Merge pull request #101 from vlyubin/master
Volodymyr Lyubinets
2017-05-16 10:35:35 +0200
-
-
c0bd21c2ec
Add ss_max
vlyubin
2017-05-16 10:29:00 +0200
-
e5dc6f338f
Updated queries 23
vlyubin
2017-05-15 17:30:20 +0200
-
e8f85b0b0e
Moved queries into a separate folder
vlyubin
2017-05-15 14:22:37 +0200
-
96bf10bffc
Add tpcds 2.4 queries
vlyubin
2017-05-12 11:54:32 +0200
-
-
c12b14b013
Merge pull request #98 from databricks/parallel-runs
Eric Liang
2017-03-15 13:50:41 -0700
-
-
64728c7cff
Add option to avoid cleaning after each run, to enable parallel runs
Eric Liang
2017-03-14 19:45:27 -0700
-
-
53091a1935
Removes labels from tree data generation (#82)
Timothy Hunter
2016-12-13 16:47:31 -0800
-
685c50d9dc
Cross build with Scala 2.11 (#91)
srinathshankar
2016-10-03 17:01:17 -0700
-
0eaa4b1d57
[SC-4409] Correct query 41 in TPCDS kit (#90)
srinathshankar
2016-09-30 18:02:39 -0700
-
c2224f37e5
Depend on non-snapshot Spark now that 2.0.0 is released
Josh Rosen
2016-08-17 17:53:30 -0700
-
948c8369e7
Fixes issues with scala 2.11
Timothy Hunter
2016-07-19 11:19:52 -0700
-
8830bffd46
Merge pull request #79 from jkbradley/tree-test-fix
Timothy Hunter
2016-07-11 10:42:19 -0700
-
-
51469a34d6
Fixed tree, forest, GBT tests by adding metadata to DataFrames
Joseph K. Bradley
2016-07-11 10:33:19 -0700
-
-
1fcc366cec
Merge pull request #78 from thunterdb/1607-fixes
Timothy Hunter
2016-07-06 11:34:05 -0700
-
-
c7d42d3626
adding parameters
Timothy Hunter
2016-07-06 11:23:07 -0700
-
-
2672bcd5b7
ALS algorithm for spark-sql-perf
Timothy Hunter
2016-07-05 15:54:08 -0700
-
93c0407bbe
Merge pull request #77 from thunterdb/1607-linear
Timothy Hunter
2016-07-05 15:41:35 -0700
-
-
40e97ca3c0
comment
Timothy Hunter
2016-07-05 15:01:50 -0700
-
ce7e20ae6d
set the solver
Timothy Hunter
2016-07-05 13:46:19 -0700
-
def20479a1
linear regression
Timothy Hunter
2016-07-05 13:42:56 -0700
-
-
979ebd5d0f
Merge pull request #75 from jkbradley/kmeans
Timothy Hunter
2016-07-05 10:14:11 -0700
-
-
9d11a601c3
added kmeans test
Joseph K. Bradley
2016-07-01 18:00:49 -0700
-
-
3d3443791c
Merge pull request #74 from jkbradley/dt-tests
jkbradley
2016-07-01 17:40:16 -0700
-
-
495e2716c4
updated per code review. works in local tests
Joseph K. Bradley
2016-07-01 17:39:28 -0700
-
c2f0a35db4
Merge pull request #1 from thunterdb/1606-trees
jkbradley
2016-07-01 11:46:41 -0700
-
-
813bd8ad59
adding more experiments
Timothy Hunter
2016-07-01 10:34:42 -0700
-