Well, if you need full support for linear algebra operators you have to implement them yourself or use an external library. In the latter case the obvious choice is Breeze.

It is already used behind the scenes, so it doesn't introduce additional dependencies, and you can easily adapt existing Spark code for the conversions:
import org.apache.spark.mllib.linalg.{Vector, DenseVector, SparseVector}
import breeze.linalg.{DenseVector => BDV, SparseVector => BSV, Vector => BV}

def toBreeze(v: Vector): BV[Double] = v match {
  case DenseVector(values) => new BDV[Double](values)
  case SparseVector(size, indices, values) =>
    new BSV[Double](indices, values, size)
}

def toSpark(v: BV[Double]): Vector = v match {
  case v: BDV[Double] => new DenseVector(v.toArray)
  case v: BSV[Double] => new SparseVector(v.length, v.index, v.data)
}
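With these converters in place, any operation Breeze supports can be applied to MLlib vectors and the result mapped back. A minimal sketch (element-wise addition, which MLlib vectors don't expose directly; the converters are repeated so the snippet is self-contained, and the `add` helper is just an illustrative name):

```scala
import org.apache.spark.mllib.linalg.{Vector, Vectors, DenseVector, SparseVector}
import breeze.linalg.{DenseVector => BDV, SparseVector => BSV, Vector => BV}

def toBreeze(v: Vector): BV[Double] = v match {
  case DenseVector(values) => new BDV[Double](values)
  case SparseVector(size, indices, values) => new BSV[Double](indices, values, size)
}

def toSpark(v: BV[Double]): Vector = v match {
  case v: BDV[Double] => new DenseVector(v.toArray)
  case v: BSV[Double] => new SparseVector(v.length, v.index, v.data)
}

// Add two MLlib vectors by round-tripping through Breeze.
def add(x: Vector, y: Vector): Vector = toSpark(toBreeze(x) + toBreeze(y))

val sum = add(Vectors.dense(1.0, 2.0), Vectors.dense(3.0, 4.0))
// sum.toArray == Array(4.0, 6.0)
```

The same pattern works for dot products, norms, or any other Breeze operation; you only pay the cost of the conversion at the boundaries.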
Mahout also provides Spark and Scala bindings you may find interesting.
For simple matrix-vector multiplications it can be easier to leverage existing matrix methods. For example, IndexedRowMatrix and RowMatrix provide a multiply method which takes a local matrix. You can check Matrix Multiplication in Apache Spark for an example usage.
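As a sketch of that approach, the vector can be wrapped in a local one-column matrix, so multiply performs a distributed matrix-vector product (a local SparkContext is created here only to make the example runnable; the app name is arbitrary):

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.mllib.linalg.{Matrices, Vectors}
import org.apache.spark.mllib.linalg.distributed.{IndexedRow, IndexedRowMatrix}

val sc = new SparkContext(new SparkConf().setMaster("local[2]").setAppName("matvec"))

// 2x2 distributed matrix [[1, 2], [3, 4]], one IndexedRow per matrix row.
val mat = new IndexedRowMatrix(sc.parallelize(Seq(
  IndexedRow(0L, Vectors.dense(1.0, 2.0)),
  IndexedRow(1L, Vectors.dense(3.0, 4.0)))))

// The vector (1, 1) as a local 2x1 matrix; MLlib matrices are column-major.
val v = Matrices.dense(2, 1, Array(1.0, 1.0))

// Distributed matrix times local matrix; each result row is a dot product.
val product = mat.multiply(v)
val result = product.rows.collect().sortBy(_.index).map(_.vector(0))
// result: Array(3.0, 7.0)

sc.stop()
```

Note that multiply only accepts a local matrix on the right-hand side, so this works when the vector fits in driver memory.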