I'm trying to define a system property for my log file, e.g. 'logger.File'='/tmp/some_name.log', in order to dynamically assign a different log file to each Spark job. I've tried to reduce the test case to the bare minimum. It fails when the file name comes from the property; everything works if I hardcode the value in log4j.properties.
Can someone please enlighten me?
Thanks!
Python code test_log4j.py:
from pyspark.context import SparkContext
from pyspark import SparkConf
from pyspark.sql import SparkSession

# Set the property that log4j.properties references as ${logger.File}
SparkContext.setSystemProperty('logger.File', '/tmp/some_name.log')

spark = SparkSession.builder.config(conf=SparkConf()).getOrCreate()
logger = spark._jvm.org.apache.log4j.LogManager.getLogger('default')
logger.warn('hey')
log4j.properties:
log4j.rootLogger=INFO, spark_jobs
log4j.logger.spark_jobs=INFO
log4j.appender.spark_jobs=org.apache.log4j.RollingFileAppender
log4j.appender.spark_jobs.layout=org.apache.log4j.PatternLayout
log4j.appender.spark_jobs.layout.ConversionPattern=%d{yy/MM/dd HH:mm:ss} %p %c{1}: %m%n
log4j.appender.spark_jobs.Threshold=INFO
log4j.appender.spark_jobs.File=${logger.File}
Test:
/usr/hdp/current/spark2-client/bin/spark-submit --master local --conf spark.driver.extraJavaOptions="-Dlog4j.debug -Dlog4j.configuration=file:/tmp/log4j.properties" --conf spark.executor.extraJavaOptions="-Dlog4j.debug -Dlog4j.configuration=file:/tmp/log4j.properties" test_log4j.py
Result:
log4j: Using URL [file:/tmp/log4j.properties] for automatic log4j configuration.
log4j: Reading configuration from URL file:/tmp/log4j.properties
log4j: Parsing for [root] with value=[INFO, spark_jobs].
log4j: Level token is [INFO].
log4j: Category root set to INFO
log4j: Parsing appender named "spark_jobs".
log4j: Parsing layout options for "spark_jobs".
log4j: Setting property [conversionPattern] to [%d{yy/MM/dd HH:mm:ss} %p %c{1}: %m%n].
log4j: End of parsing for "spark_jobs".
log4j: Setting property [file] to [].
log4j: Setting property [threshold] to [INFO].
log4j: setFile called: , true
log4j:ERROR setFile(null,true) call failed.
java.io.FileNotFoundException: (No such file or directory)
at java.io.FileOutputStream.open0(Native Method)
at java.io.FileOutputStream.open(FileOutputStream.java:270)
at java.io.FileOutputStream.<init>(FileOutputStream.java:213)
at java.io.FileOutputStream.<init>(FileOutputStream.java:133)
at org.apache.log4j.FileAppender.setFile(FileAppender.java:294)
at org.apache.log4j.RollingFileAppender.setFile(RollingFileAppender.java:207)
at org.apache.log4j.FileAppender.activateOptions(FileAppender.java:165)
at org.apache.log4j.config.PropertySetter.activate(PropertySetter.java:307)
at org.apache.log4j.config.PropertySetter.setProperties(PropertySetter.java:172)
at org.apache.log4j.config.PropertySetter.setProperties(PropertySetter.java:104)
at org.apache.log4j.PropertyConfigurator.parseAppender(PropertyConfigurator.java:842)
at org.apache.log4j.PropertyConfigurator.parseCategory(PropertyConfigurator.java:768)
at org.apache.log4j.PropertyConfigurator.configureRootCategory(PropertyConfigurator.java:648)
at org.apache.log4j.PropertyConfigurator.doConfigure(PropertyConfigurator.java:514)
at org.apache.log4j.PropertyConfigurator.doConfigure(PropertyConfigurator.java:580)
at org.apache.log4j.helpers.OptionConverter.selectAndConfigure(OptionConverter.java:526)
at org.apache.log4j.LogManager.<clinit>(LogManager.java:127)
at org.apache.spark.internal.Logging$class.initializeLogging(Logging.scala:120)
at org.apache.spark.internal.Logging$class.initializeLogIfNecessary(Logging.scala:108)
at org.apache.spark.deploy.SparkSubmit$.initializeLogIfNecessary(SparkSubmit.scala:71)
at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:128)
at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
log4j: Parsed "spark_jobs" options.
log4j: Parsing for [spark_jobs] with value=[INFO].
log4j: Level token is [INFO].
log4j: Category spark_jobs set to INFO
log4j: Handling log4j.additivity.spark_jobs=[null]
log4j: Finished configuring.
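My current suspicion, based on the `LogManager.<clinit>` frame appearing under `SparkSubmit.main` in the stack trace above: log4j parses the config during spark-submit's own startup, before my Python script (and hence `setSystemProperty`) ever runs, so an undefined `${logger.File}` is substituted with an empty string, which would match the `Setting property [file] to []` line. A toy Python sketch of that ordering (not Spark or log4j code, just my mental model of the substitution):

```python
import re

# Stand-in for JVM system properties (a plain dict).
jvm_props = {}

def substitute(value, props):
    """Mimic log4j-style ${var} substitution: undefined vars become ''."""
    return re.sub(r'\$\{([^}]+)\}', lambda m: props.get(m.group(1), ''), value)

# "JVM startup": the config is parsed before any user code runs.
appender_file = substitute('${logger.File}', jvm_props)
print(repr(appender_file))  # '' -- like "Setting property [file] to []"

# My Python code runs only afterwards -- the property arrives too late.
jvm_props['logger.File'] = '/tmp/some_name.log'
print(repr(substitute('${logger.File}', jvm_props)))  # resolves now, but config is done
```

If that's right, the property would need to be set before the JVM configures log4j, not from the Python side afterwards.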