I am trying to access an accumulator's value from inside a task running on a cluster, but when I do so it throws an exception:

can't read the accumulator's value

I tried row.localValue instead, but it returns the same numbers. Is there a workaround?
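The localValue attempt was just swapping the read inside the function shown below, roughly:

val a = row.localValue   // instead of row.value; still gives the same numbers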
private def modifyDataset(
    data: String, row: org.apache.spark.Accumulator[Int]): Array[Int] = {
  val line = data.split(",")
  val lineSize = line.size
  val pairArray = new Array[Int](lineSize - 1)
  val a = row.value      // this read is what throws the exception inside the task
  pairArray(0) = a
  row += 1
  pairArray
}
import org.apache.spark.storage.StorageLevel

val sc = Spark_Context.InitializeSpark
val row = sc.accumulator(1, "Rows")
val dataset = sc.textFile("path")
val pairInfoFile = dataset
  .flatMap { data => modifyDataset(data, row) }
  .persist(StorageLevel.MEMORY_AND_DISK)
pairInfoFile.count()
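For context, reading the accumulator on the driver after the action has run does not throw; it is only the read inside the task that fails:

// on the driver, after pairInfoFile.count() has finished:
println(row.value)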