
I have a use case that involves pulling/streaming data from a large number of HTTP endpoints, in excess of 100.

I have a standalone Java app that can manage 30+ concurrent client requests using the Ning async HTTP client library, but I'm looking for ideas on how to scale this up to handle many more.
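
Roughly what the app does today (the endpoint URLs are just placeholders, and the Ning client version may affect exact method names):

```java
import com.ning.http.client.AsyncCompletionHandler;
import com.ning.http.client.AsyncHttpClient;
import com.ning.http.client.Response;

import java.util.Arrays;
import java.util.List;

public class EndpointPoller {

    public static void main(String[] args) throws Exception {
        // Placeholder list; in reality this is 100+ endpoint URLs.
        List<String> endpoints = Arrays.asList(
                "http://host1/feed", "http://host2/feed", "http://host3/feed");

        // One shared client; it multiplexes requests over an NIO event loop,
        // so the thread count stays small even with many endpoints.
        AsyncHttpClient client = new AsyncHttpClient();

        for (final String url : endpoints) {
            client.prepareGet(url).execute(new AsyncCompletionHandler<Response>() {
                @Override
                public Response onCompleted(Response response) throws Exception {
                    // Hand the body off for processing (see the Kafka sketch below).
                    System.out.println(url + " -> " + response.getStatusCode());
                    return response;
                }

                @Override
                public void onThrowable(Throwable t) {
                    System.err.println("Request to " + url + " failed: " + t);
                }
            });
        }
    }
}
```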

The use case is to pull the data from the endpoints and push it into a Kafka queue (similar to a JMS queue) for processing by a Storm topology. The bit I'm stuck on is how to efficiently get the HTTP endpoint data into the Kafka queues in the first place.
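
This is the kind of bridge I had in mind, using the Kafka 0.8 producer API (the broker address, topic name, and endpoint URL are made up): each HTTP response gets pushed straight into a topic and the Storm Kafka spout consumes from there.

```java
import com.ning.http.client.AsyncCompletionHandler;
import com.ning.http.client.AsyncHttpClient;
import com.ning.http.client.Response;
import kafka.javaapi.producer.Producer;
import kafka.producer.KeyedMessage;
import kafka.producer.ProducerConfig;

import java.util.Properties;

public class HttpToKafkaBridge {

    public static void main(String[] args) {
        // Kafka 0.8 producer configuration; broker list is a placeholder.
        Properties props = new Properties();
        props.put("metadata.broker.list", "kafka-broker:9092");
        props.put("serializer.class", "kafka.serializer.StringEncoder");
        props.put("request.required.acks", "1");
        final Producer<String, String> producer =
                new Producer<String, String>(new ProducerConfig(props));

        AsyncHttpClient client = new AsyncHttpClient();
        final String url = "http://some-endpoint/feed";   // placeholder endpoint

        client.prepareGet(url).execute(new AsyncCompletionHandler<Response>() {
            @Override
            public Response onCompleted(Response response) throws Exception {
                // Key by endpoint URL so one endpoint's data lands on one partition.
                producer.send(new KeyedMessage<String, String>(
                        "http-feed", url, response.getResponseBody()));
                return response;
            }

            @Override
            public void onThrowable(Throwable t) {
                System.err.println("Request to " + url + " failed: " + t);
            }
        });
    }
}
```

The open question is whether this kind of standalone bridge is the right way to scale to 100+ endpoints, or whether there is a better pattern for feeding Kafka from many HTTP sources.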

thanks

  • Is this really a Storm question? Can you use some web server? For example, `nginx` can handle thousands of concurrent clients – Vor Oct 05 '13 at 00:53
  • Hmm, yes, we could use a Storm topology with spouts pulling from the endpoints, but I was under the assumption that spouts should connect to message emitters (in my case Kafka) rather than the other way around. So I was looking at a way of pushing to the queue and letting Storm pull from the queue rather than from the HTTP endpoints directly. Or am I missing the point here? – user983022 Oct 06 '13 at 09:17

0 Answers