
I am using the example from:

http://aseigneurin.github.io/2016/03/04/kafka-spark-avro-producing-and-consuming-avro-messages.html

and am receiving an error with `foreachRDD`. I have updated the Java code to use lambda expressions.

Kafka foreachRDD Error

I was wondering if you had any insight into why this would occur.
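For context, the streaming setup looks roughly like this. This is a minimal sketch against the Spark 1.x direct-stream API used in the article; the broker address, topic name, and class name are placeholders. One Java API wrinkle: in Spark 1.x, `foreachRDD` takes a `Function<JavaPairRDD<K, V>, Void>`, so a lambda passed to it has to end with `return null;` (newer Spark releases switch to a `VoidFunction`, which removes that need):

import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.Set;

import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaPairRDD;
import org.apache.spark.streaming.Durations;
import org.apache.spark.streaming.api.java.JavaPairInputDStream;
import org.apache.spark.streaming.api.java.JavaStreamingContext;
import org.apache.spark.streaming.kafka.KafkaUtils;

import kafka.serializer.DefaultDecoder;
import kafka.serializer.StringDecoder;

public class AvroStreamingConsumer {
    public static void main(String[] args) {
        SparkConf conf = new SparkConf().setAppName("avro-consumer").setMaster("local[2]");
        JavaStreamingContext ssc = new JavaStreamingContext(conf, Durations.seconds(5));

        Map<String, String> kafkaParams = new HashMap<>();
        kafkaParams.put("metadata.broker.list", "localhost:9092");
        Set<String> topics = Collections.singleton("mytopic");

        // Values arrive as raw Avro bytes, so the value decoder is DefaultDecoder (byte[]).
        JavaPairInputDStream<String, byte[]> stream = KafkaUtils.createDirectStream(
                ssc, String.class, byte[].class,
                StringDecoder.class, DefaultDecoder.class,
                kafkaParams, topics);

        // Spark 1.x: foreachRDD takes Function<JavaPairRDD<K, V>, Void>,
        // so the lambda must end with "return null;".
        stream.foreachRDD((JavaPairRDD<String, byte[]> rdd) -> {
            rdd.foreach(record -> System.out.println("received " + record._2().length + " bytes"));
            return null;
        });

        ssc.start();
        ssc.awaitTermination();
    }
}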

I already have a program that deserializes the Avro with:

InputStream inputStream = new ByteArrayInputStream(byteData);
DatumReader<GenericRecord> datumReader = new SpecificDatumReader<GenericRecord>(schema);
DataFileStream<GenericRecord> dataFileReader = new DataFileStream<GenericRecord>(inputStream, datumReader);
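Fleshed out, that decoder looks roughly like this. A sketch, assuming the bytes are in the Avro container-file format (which is what `DataFileStream` expects); I use `GenericDatumReader` here since there are no generated classes, though the `SpecificDatumReader` above also compiles:

import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;

import org.apache.avro.Schema;
import org.apache.avro.file.DataFileStream;
import org.apache.avro.generic.GenericDatumReader;
import org.apache.avro.generic.GenericRecord;
import org.apache.avro.io.DatumReader;

public class AvroDecoder {

    // Decodes Avro container-file bytes into GenericRecords and prints them.
    public static void decode(byte[] byteData, Schema schema) throws IOException {
        InputStream inputStream = new ByteArrayInputStream(byteData);
        DatumReader<GenericRecord> datumReader = new GenericDatumReader<>(schema);
        try (DataFileStream<GenericRecord> dataFileReader =
                     new DataFileStream<>(inputStream, datumReader)) {
            while (dataFileReader.hasNext()) {
                GenericRecord record = dataFileReader.next();
                System.out.println(record);
            }
        }
    }
}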

as per my previous question Kafka Avro Consumer with Decoder issues

Would it be better to just use Spark's `parallelize` for this data? Is that efficient after reading it from Kafka?
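By "use parallelize" I mean something along these lines; a sketch where `consumeFromKafka` is a hypothetical helper that fetches raw messages with a standalone Kafka consumer, outside of Spark Streaming:

import java.util.Collections;
import java.util.List;

import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.api.java.JavaSparkContext;

public class ParallelizeExample {
    public static void main(String[] args) {
        SparkConf conf = new SparkConf().setAppName("parallelize-avro").setMaster("local[2]");
        try (JavaSparkContext sc = new JavaSparkContext(conf)) {
            // Hypothetical helper: pulls raw byte[] messages with a plain
            // Kafka consumer on the driver, outside of Spark Streaming.
            List<byte[]> messages = consumeFromKafka("mytopic");

            // Distribute the already-fetched messages across the cluster.
            JavaRDD<byte[]> rdd = sc.parallelize(messages);
            System.out.println("message count: " + rdd.count());
        }
    }

    private static List<byte[]> consumeFromKafka(String topic) {
        // Placeholder body; a real implementation would poll Kafka here.
        return Collections.emptyList();
    }
}

My concern with this approach is that the driver becomes the single point that pulls every message before the cluster sees any of them.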

  • @miguno I let the leadership here know about the benefits of Confluent's Schema Registry, and they are going to incorporate it into our configuration. – SparkleGoat Mar 17 '16 at 14:01
  • Not sure this solves your problem, but I had to ship my schemas as Strings to the executor, then parse them all there before doing any decoding in the executor. I couldn't figure out how to serialize the Parser, so instead I create a local Parser for each executor, and if it is `null` I parse all my schemas before trying to decode anything (a sketch of this pattern follows these comments). – David Griffin Mar 17 '16 at 15:48
  • Thanks. I added a `return null` in the lambda expression and it allowed me to compile; it's not a good solution, but I will let you know how it goes. – SparkleGoat Mar 18 '16 at 13:13
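A rough sketch of the pattern David Griffin describes (my reading of his comment; class and field names are illustrative): ship the schemas as Strings, keep the parsed `Schema` objects in a `transient` field, and parse them lazily on first use inside each executor:

import java.io.Serializable;
import java.util.HashMap;
import java.util.Map;

import org.apache.avro.Schema;

// Schemas travel to executors as plain Strings (Serializable); the parsed
// Schema objects are rebuilt lazily in each executor JVM rather than being
// serialized with the closure.
public class ExecutorSchemas implements Serializable {

    private final Map<String, String> schemaStrings;      // shipped with the closure
    private transient Map<String, Schema> parsedSchemas;  // null on arrival at an executor

    public ExecutorSchemas(Map<String, String> schemaStrings) {
        this.schemaStrings = schemaStrings;
    }

    // Called on the executor before decoding; parses all schemas on first use.
    public synchronized Schema get(String name) {
        if (parsedSchemas == null) {
            Schema.Parser parser = new Schema.Parser();
            parsedSchemas = new HashMap<>();
            for (Map.Entry<String, String> entry : schemaStrings.entrySet()) {
                parsedSchemas.put(entry.getKey(), parser.parse(entry.getValue()));
            }
        }
        return parsedSchemas.get(name);
    }
}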

0 Answers