HDPCD Free Dumps Study Materials
Question 1: You have just executed a MapReduce job.
Where is intermediate data written to after being emitted from the Mapper's map method?
A. Intermediate data in streamed across the network from Mapper to the Reduce and is never
written to disk.
B. Into in-memory buffers on the TaskTracker node running the Mapper that spill over and are
written into HDFS.
C. Into in-memory buffers that spill over to the local file system of the TaskTracker node running the
Mapper.
D. Into in-memory buffers that spill over to the local file system (outside HDFS) of the TaskTracker
node running the Reducer
E. Into in-memory buffers on the TaskTracker node running the Reducer that spill over and are
written into HDFS.
Correct Answer: C
Explanation:
The mapper output (intermediate data) is stored on the Local file system (NOT HDFS) of each
individual mapper nodes. This is typically a temporary directory location which can be setup in config
by the hadoop administrator. The intermediate data is cleaned up after the Hadoop Job completes.
Reference: 24 Interview Questions & Answers for Hadoop MapReduce developers, Where is the
Mapper Output (intermediate kay-value data) stored ?