I am using the example from:
http://aseigneurin.github.io/2016/03/04/kafka-spark-avro-producing-and-consuming-avro-messages.html
and receiving an error with the foreachRDD call. I have updated the Java code to use lambda expressions.
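For reference, the call looks roughly like this (a simplified sketch following the tutorial's naming; directKafkaStream is the JavaPairInputDStream<String, byte[]> built with KafkaUtils.createDirectStream, so the exact types may differ):

directKafkaStream.foreachRDD(rdd -> {
    rdd.foreach(record -> {
        byte[] byteData = record._2(); // Avro-encoded payload from Kafka
        // deserialization would happen here
    });
});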
I was wondering if you had any insight into why this would occur.
I already have a program that deserializes the Avro data with:
// Wrap the raw bytes and read them back as an Avro data file stream
InputStream inputStream = new ByteArrayInputStream(byteData);
DatumReader<GenericRecord> datumReader = new SpecificDatumReader<GenericRecord>(schema);
DataFileStream<GenericRecord> dataFileReader = new DataFileStream<GenericRecord>(inputStream, datumReader);
as per my previous question, Kafka Avro Consumer with Decoder issues.
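To pull the records back out, I iterate the stream, roughly like this (a sketch; schema and byteData come from earlier in the program):

while (dataFileReader.hasNext()) {
    GenericRecord record = dataFileReader.next();
    // process each decoded record here
}
dataFileReader.close();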
Would it be better to just use Spark's parallelize for this data? Is that efficient after reading it from Kafka?
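Something like this is what I have in mind (a rough sketch; jsc is a JavaSparkContext and records would be the List<GenericRecord> collected from the reader above):

// Distribute the already-deserialized records as an RDD
JavaRDD<GenericRecord> rdd = jsc.parallelize(records);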