
We have tried various combinations of settings, but mpstat shows that all or most CPUs are always being used (on a single 8-core system).

The following have been tried:

Setting the master to:

local[2]

Passing

conf.set("spark.cores.max", "2")

in the Spark configuration.

Also using

--total-executor-cores 2

and

--executor-cores 2

In all cases,

mpstat -A

shows that all of the CPUs are being used, and not just by the master.

So I am at a loss at present. We do need to limit the usage to a specified number of CPUs.
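
For concreteness, the programmatic side of the attempts above looks roughly like this (a sketch only; the app name is arbitrary and both settings are shown together just for illustration):

from pyspark import SparkConf, SparkContext

# Sketch of the attempts described above (app name is arbitrary)
conf = (SparkConf()
        .setAppName("cpu-limit-test")
        .setMaster("local[2]")           # attempt: limit via the master URL
        .set("spark.cores.max", "2"))    # attempt: limit via spark.cores.max

sc = SparkContext(conf=conf)

The --total-executor-cores and --executor-cores flags are command-line options (e.g. for spark-submit) and so are not shown here.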

WestCoastProjects

2 Answers


I had the same problem with memory size, which I wanted to increase, and none of the above worked for me either. Based on this user post I was able to resolve my problem, and I think the same approach should also work for the number of cores:

from pyspark import SparkConf, SparkContext

# In Jupyter you have to stop the current context first
sc.stop()

# Create new config
conf = (SparkConf().set("spark.cores.max", "2"))

# Create new context
sc = SparkContext(conf=conf)
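
As a quick sanity check, the value can be read back from the new context to confirm it was picked up (a small addition, assuming the context above was created successfully):

# Read the setting back from the running context
print(sc.getConf().get("spark.cores.max"))  # should print '2'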

Hope this helps you. And please, if you have resolved your problem, post your solution as an answer here so we can all benefit from it :)

Cheers

ahajib
  • +1 for spark.cores.max – FYI, if using spark-submit you should be able to specify it from the command line using: --conf spark.cores.max=2 – taylorcressy Jun 19 '17 at 21:45
  • As shown in the original post, those were not having any effect in `standalone` mode. I think you are not running in standalone. This does not answer the original question and in fact duplicates information already contained there. – WestCoastProjects Apr 27 '18 at 10:02
  • @javadba It is not an answer, but it could be added as an update to your question or posted as a comment. – ahajib Apr 27 '18 at 16:42
  • `conf.set("spark.cores.max","2")` was already in the original question. In addition, you don't downvote another answer just because your own answer was downvoted. – WestCoastProjects Apr 27 '18 at 16:44
  • That was not my intention whatsoever, and I have already flagged it as not an answer. If you think my answer is not acceptable and does not provide any useful information, you are more than welcome to do so as well. After all, as I mentioned, it helped me with other configurations, and that's why it was posted here. Good luck – ahajib Apr 27 '18 at 16:50

Apparently Spark standalone ignores the spark.cores.max setting. That setting does work on YARN.
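
For completeness, a minimal sketch of the same setting when submitting against YARN (the master string and values below are illustrative assumptions, not taken from the original post):

from pyspark import SparkConf, SparkContext

# Sketch only: values below are illustrative assumptions
conf = (SparkConf()
        .setMaster("yarn")                  # Spark 2.x-style YARN master
        .set("spark.cores.max", "2")        # honored on YARN per the answer above
        .set("spark.executor.cores", "2"))  # per-executor cap, added for illustration

sc = SparkContext(conf=conf)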

WestCoastProjects