To create MapReduce jobs you can use either the old `org.apache.hadoop.mapred` package or the newer `org.apache.hadoop.mapreduce` package for Mappers, Reducers, Jobs and so on. The old one had been marked as deprecated, but this has since been reverted. Now I wonder whether it is better to use the old `mapred` package or the new `mapreduce` package to create a job, and why. Or does it just depend on whether you need classes like `MultipleTextOutputFormat`, which is only available in the old `mapred` package?
-
`but this got reverted meanwhile` are you sure? – Thomas Jungblut Sep 29 '11 at 14:36
-
E.g. interface `Mapper` in package `org.apache.hadoop.mapred.lib` in r0.21.0 is not marked as deprecated, while it is marked as deprecated in r0.20.2. – momo13 Sep 29 '11 at 16:02
3 Answers
Functionality-wise there is not much difference between the old (`o.a.h.mapred`) and the new (`o.a.h.mapreduce`) API. The only significant difference is that records are pushed to the mapper/reducer in the old API, while the new API supports both the push and pull mechanisms. You can get more information about the pull mechanism here.

Also, the old API has been un-deprecated since 0.21. You can find more information about the new API here.

As you mentioned, some classes (like `MultipleTextOutputFormat`) have not been migrated to the new API; for this and the above-mentioned reason it's better to stick to the old API (although a translation is usually quite simple).
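The translation the answer calls "usually quite simple" is mostly mechanical: the old API emits records through an `OutputCollector` passed into `map()`, while the new API writes through a `Context` object. A minimal sketch of the two styles, using hand-rolled stand-in types (not the real Hadoop classes, so it compiles without Hadoop on the classpath):

```java
import java.util.ArrayList;
import java.util.List;

public class MapperPortSketch {
    // Old-style: output goes through a collector argument (mirrors OutputCollector.collect).
    interface OutputCollector { void collect(String key, int value); }

    static void oldStyleMap(String line, OutputCollector out) {
        for (String word : line.split("\\s+")) {
            out.collect(word, 1);
        }
    }

    // New-style: output goes through a context object (mirrors Mapper.Context.write).
    static class Context {
        final List<String> written = new ArrayList<>();
        void write(String key, int value) { written.add(key + "=" + value); }
    }

    static void newStyleMap(String line, Context ctx) {
        for (String word : line.split("\\s+")) {
            ctx.write(word, 1);
        }
    }

    public static void main(String[] args) {
        List<String> oldOut = new ArrayList<>();
        oldStyleMap("a b a", (k, v) -> oldOut.add(k + "=" + v));

        Context ctx = new Context();
        newStyleMap("a b a", ctx);

        // Same records either way; only the emit mechanism differs.
        System.out.println(oldOut.equals(ctx.written)); // prints "true"
    }
}
```

The map logic itself is unchanged between the two styles; porting is largely a matter of swapping the method signature and the emit call.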

[Hadoop - The Definitive Guide](http://shop.oreilly.com/product/0636920021773.do) has most of the code in the new API. – Praveen Sripati Jul 14 '12 at 04:14
-
As a side note - MRUnit uses the new API, `.mapreduce`. So if you're using `.mapred` in your code, it's gonna throw errors. And you're not gonna be happy. – wmute Feb 21 '13 at 16:51
-
As a small update to the current version of Hadoop (2.1.0 beta), the old API is declared stable and wasn't deprecated: http://hadoop.apache.org/docs/current/api/index.html?org/apache/hadoop/mapreduce/Mapper.html Maybe we can turn this into a community wiki @PraveenSripati – Thomas Jungblut Oct 02 '13 at 08:54
-
Due to the lack of `@deprecated` annotations, I've been using the old API for 2 years now without even knowing that a new API exists (which was 2 years old when I started: I should have been using it all along). I only found out yesterday because an `OutputFormat` that I want to use is written for the new API, and now I have to change everything. I've gotten used to the compiler telling me if something is deprecated, rather than community folklore. – Jim Pivarski Nov 20 '13 at 17:47
Both the old and new APIs are good. The new API is cleaner, though. Use the new API wherever you can, and use the old one wherever you need specific classes that are not present in the new API (like `MultipleTextOutputFormat`).

But take care not to mix the old and new APIs in the same MapReduce job. That leads to weird problems.

Old API (mapred)

- Exists in package `org.apache.hadoop.mapred`
- Job configuration is done by a separate class, called `JobConf`, which is an extension of `Configuration`
- A reducer receives the values for a given key as an `Iterator`

New API (mapreduce)

- Exists in package `org.apache.hadoop.mapreduce`
- Job configuration is done through the `Job` class, backed by a plain `Configuration`
- A reducer receives the values for a given key as an `Iterable`
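The `Iterator` versus `Iterable` difference on the reduce side can be sketched in plain Java. These are simplified stand-in signatures (not the real `org.apache.hadoop.mapred.Reducer` / `org.apache.hadoop.mapreduce.Reducer`, which return void and emit through a collector or context; here the sum is returned directly to keep the sketch self-contained):

```java
import java.util.Iterator;
import java.util.List;

public class ReducerSignatureSketch {
    // Old-style reduce: values arrive as an Iterator and must be walked manually.
    static int oldStyleReduce(String key, Iterator<Integer> values) {
        int sum = 0;
        while (values.hasNext()) {
            sum += values.next();
        }
        return sum;
    }

    // New-style reduce: values arrive as an Iterable, enabling the for-each loop.
    static int newStyleReduce(String key, Iterable<Integer> values) {
        int sum = 0;
        for (int v : values) {
            sum += v;
        }
        return sum;
    }

    public static void main(String[] args) {
        List<Integer> counts = List.of(1, 2, 3);
        System.out.println(oldStyleReduce("word", counts.iterator())); // prints 6
        System.out.println(newStyleReduce("word", counts));            // prints 6
    }
}
```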
-
org.apache.hadoop.mapred is the older API and org.apache.hadoop.mapreduce is the new one. You might want to change your answer – Harinder Sep 27 '14 at 15:55