Using Spark Streaming, I am watching for new files in HDFS with the textFileStream/fileStream methods. How can I get the names of the files that these methods read?

I used textFileStream, which gives me the file contents in a JavaDStream, but I had no success with fileStream because it fails with a compilation error on Spark version 1.3.1.

Can someone please tell me whether there is an API function, or any other way, to get the names of the files that these streaming methods read?

Thanks, Lokesh

Ashrith
Lokesh Kumar P
  • What is the compilation error? I guess you did not do it right with the `InputFormat`. From what I learned from the functions in `DStream`, I really don't think you can get the file name :( – David S. Apr 29 '15 at 11:40
  • Hi, thanks for the reply. Regarding the compilation error, I have posted it on the Spark user list here: http://apache-spark-user-list.1001560.n3.nabble.com/Spark-1-3-1-JavaStreamingContext-fileStream-compile-error-td22683.html – Lokesh Kumar P Apr 30 '15 at 05:27
  • Hi DavidShen, you were correct: I was using TextInputFormat from a different package. When I used the right one, I was able to get this method working. – Lokesh Kumar P May 15 '15 at 07:50
  • Hi Lokesh, can you please highlight the solution and the correct package name? Refer to this question: http://stackoverflow.com/questions/33076339/sparkstreaming-error-in-filestream – Mata Oct 15 '15 at 10:59
  • Hi, I have replied in the answer section of the question linked above: http://stackoverflow.com/questions/33076339/sparkstreaming-error-in-filestream – Lokesh Kumar P Oct 16 '15 at 07:03

0 Answers