spark-sql-perf/src
Joseph Bradley 30c50dddbb [ML-2918] Call count() in default score() to improve timing of transform() (#159)
For Models and Transformers that are not tested with Evaluators, I think transform() is not being timed correctly here:

spark-sql-perf/src/main/scala/com/databricks/spark/sql/perf/mllib/MLPipelineStageBenchmarkable.scala, line 65 in aa1587f:

    transformer.transform(trainingData)

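Since transform() is lazy, that call only builds a query plan, so the timed region does almost no real work. The result has to be materialized while the timer is running. This PR currently just calls count() in the default implementation of score().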

* call count() in score()
* changed count to UDF
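
A minimal sketch of the timing issue described above, not the code from this PR. It assumes a local SparkSession and uses Binarizer as a stand-in Transformer. The object name, the timeSeconds helper, and the touchRow UDF are hypothetical names introduced for illustration. The third measurement reflects one plausible reading of "changed count to UDF": count() lets the optimizer prune columns it does not need, so a UDF applied to every output column is a stricter way to force the full transform to run.

```scala
import org.apache.spark.ml.feature.Binarizer
import org.apache.spark.sql.{Row, SparkSession}
import org.apache.spark.sql.functions.{col, struct, sum, udf}

object TransformTimingSketch {
  // Hypothetical helper: evaluates `body` and returns elapsed wall-clock seconds.
  def timeSeconds[A](body: => A): Double = {
    val start = System.nanoTime()
    body
    (System.nanoTime() - start) / 1e9
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("transform-timing-sketch")
      .getOrCreate()

    // Synthetic input: one double column named "feature".
    val data = spark.range(0L, 1000000L)
      .select(col("id").cast("double").as("feature"))

    // Stand-in Transformer (an assumption; any Transformer would do).
    val transformer = new Binarizer()
      .setInputCol("feature")
      .setOutputCol("binarized")
      .setThreshold(500000.0)

    // 1. transform() alone only builds a query plan; no rows are processed.
    val planOnly = timeSeconds { transformer.transform(data) }

    // 2. count() is an action, so a job runs, but the optimizer may skip
    //    computing columns that the count does not depend on.
    val withCount = timeSeconds { transformer.transform(data).count() }

    // 3. A UDF that receives every output column as a struct, aggregated
    //    with sum, so each column of each row has to be produced.
    val touchRow = udf { (_: Row) => 0 }
    val withUdf = timeSeconds {
      val output = transformer.transform(data)
      output.select(sum(touchRow(struct(output.columns.map(col): _*)))).first()
    }

    println(f"plan only: $planOnly%.3f s")
    println(f"with count(): $withCount%.3f s")
    println(f"with UDF over all columns: $withUdf%.3f s")

    spark.stop()
  }
}
```

The absolute numbers depend on the machine and on JVM warm-up, so only the relative gap between the first measurement and the other two is meaningful.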
2018-07-08 16:09:24 -07:00
main [ML-2918] Call count() in default score() to improve timing of transform() (#159) 2018-07-08 16:09:24 -07:00
test/scala/com/databricks/spark/sql/perf Run mllib small in unit tests (#141) 2018-05-09 16:24:30 -07:00