
I have multiple files on HDFS that I want to be queryable through Spark SQL over JDBC. I can start a spark-shell and use the SQLContext, etc. What do I do if I want to keep that SQLContext open so that a separate application can connect to it via JDBC and issue queries against it?

Note: I know I can run spark-shell to open a local instance of Spark and import the SQLContext, but my files are large (100GB) and I have at most 16GB of memory on a single machine, so I want to take advantage of my 50-node cluster (one master and 49 slaves) for performance. Or is Spark SQL only possible on a single node?
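
For reference, here is roughly what I have in mind for the cluster side (a sketch only: the master hostname, HDFS path, and Parquet format below are placeholders for my actual setup). The driver runs the shell, but the 100GB stays partitioned across the 49 workers, so no single machine needs to hold it all:

    # launch the shell against the standalone master instead of local mode
    spark-shell --master spark://<master-host>:7077

    scala> // the DataFrame is distributed across the executors on the workers
    scala> val df = sqlContext.read.parquet("hdfs:///path/to/files")
    scala> df.registerTempTable("mytable")
    scala> sqlContext.sql("SELECT count(*) FROM mytable").show()

What I still don't see is how to keep that context alive as a JDBC endpoint for a second application.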

Rolando
  • I'm not sure I understand what you want here. If you work on files, why do you want a JDBC connection? The SQLContext, same as the SparkContext, is handled by a driver. There is no need, not to mention it is not possible, to create a SQLContext per worker node. That doesn't mean your data will be handled on the driver, though. – zero323 Sep 23 '15 at 10:08
  • I want a separate application to be able to issue database queries against the SQLContext. How do you start up that driver so it uses/takes advantage of the entire cluster? I am used to writing jobs and spark-submitting them to a local cluster or master. – Rolando Sep 23 '15 at 12:55
  • Something like this: http://stackoverflow.com/q/27108863/1560062 (see the sketch below this thread) – zero323 Sep 23 '15 at 12:57
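
Following that link, the relevant piece appears to be Spark's bundled Thrift JDBC/ODBC server, which keeps one long-running HiveContext on the cluster that external HiveServer2-compatible clients can query over JDBC. A minimal sketch, assuming a standalone master at spark://<master-host>:7077 (the hostname is a placeholder) and the default Thrift port 10000:

    # start the long-running JDBC server; it holds a single HiveContext for the cluster
    ./sbin/start-thriftserver.sh --master spark://<master-host>:7077

    # any HiveServer2-compatible JDBC client can now connect, e.g. the bundled beeline
    ./bin/beeline -u jdbc:hive2://<master-host>:10000

Note that tables saved to the Hive metastore (e.g. with saveAsTable) are visible to JDBC clients of the Thrift server, while plain temp tables only exist inside the context that registered them.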

0 Answers