Running some Pig jobs, I noticed the following line in the logs:

[main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - mapred.job.reduce.markreset.buffer.percent is not set, set to default 0.3

However, a Google search does not reveal anything about the meaning of the parameter mapred.job.reduce.markreset.buffer.percent. Does anybody know what it's for?

Giovanni Botta

1 Answer

From the mapred-default.xml documentation:

The percentage of memory -relative to the maximum heap size- to be used for caching values when using the mark-reset functionality.

Note that this refers to a property named mapreduce.reduce.markreset.buffer.percent. There are two APIs within Hadoop, mapred and mapreduce. See this question for information about their differences.

I am not sure about this particular property, but my guess would be either that you are using an older version of Hadoop that has not updated the property's name, or the Pig developers made a mistake and typed "mapred" instead of "mapreduce" (and that's why you are finding that the property is not set). In either case, I think you can feel confident that it means what I have quoted from the docs.
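To make the "is not set, set to default" behaviour concrete, here is a minimal sketch of a lookup that accepts either spelling of the property and falls back to 0.3. It is only an illustration: `java.util.Properties` stands in for Hadoop's configuration object, the class and method names are hypothetical, and the lookup order (new name first, then old name) is my assumption rather than what Pig or Hadoop actually does. The two key names and the default of 0.3 are taken from the log line and the docs quoted above.

```java
import java.util.Properties;

// Hypothetical helper; not part of Pig or Hadoop.
public class MarkResetBufferPercent {
    // Name that appears in the Pig log line.
    static final String OLD_KEY = "mapred.job.reduce.markreset.buffer.percent";
    // Name documented in mapred-default.xml.
    static final String NEW_KEY = "mapreduce.reduce.markreset.buffer.percent";
    // Default reported by the log line.
    static final float DEFAULT_PERCENT = 0.3f;

    // Returns the configured value under either name, or the default if neither is set.
    // Assumption: the newer name wins when both are present.
    static float resolve(Properties conf) {
        String value = conf.getProperty(NEW_KEY);
        if (value == null) {
            value = conf.getProperty(OLD_KEY);
        }
        return value == null ? DEFAULT_PERCENT : Float.parseFloat(value);
    }

    public static void main(String[] args) {
        Properties conf = new Properties();
        // Nothing set yet, so the default is used.
        System.out.println(resolve(conf));

        // Setting the older name is picked up by the fallback lookup.
        conf.setProperty(OLD_KEY, "0.5");
        System.out.println(resolve(conf));
    }
}
```

If you want the message to go away, setting the property explicitly under the name your Hadoop version expects should be enough; which name that is depends on the version.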

reo katoa
  • Very interesting. I am using DSE 4, so I am not sure which version of Hadoop it uses. – Giovanni Botta May 20 '14 at 19:13
  • Actually, here it is, Hadoop 1.0.4.9 and Pig 0.10.1. They both seem quite old actually. – Giovanni Botta May 20 '14 at 19:14
  • It's not excessively old (Oct. 2012), but as you can see from [this list of Hadoop releases](http://hadoop.apache.org/releases.html), the branching and versioning of Hadoop is totally convoluted and it's hard to be sure just what is applicable to your version or not. – reo katoa May 20 '14 at 19:19