Wednesday, February 16, 2011

Hadoop vs Hadoop Streaming

What is the difference between Hadoop and Hadoop streaming?

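With plain Hadoop, the mapper and reducer are written in Java against the MapReduce API and compiled and packaged together as one job (normally a single jar). The job configuration is usually set in that same code, so everything ships as one unit.
With Hadoop Streaming, the mapper and the reducer can each be any executable or script that reads records from standard input and writes records to standard output. The other details (input and output paths, configuration, and so on) are specified externally, as options on the streaming command line.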
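
To make the streaming side concrete, here is a minimal word-count sketch in Python. It is only an illustration, not something from a real job: the function names `map_lines` and `reduce_lines`, the file name `wc.py`, and the convention of passing `map` or `reduce` as the first argument are all made up for this example. It assumes the usual streaming convention of tab-separated key/value lines, with the reducer receiving its input sorted by key.

```python
import sys
from itertools import groupby


def map_lines(lines):
    """Mapper: emit one "word<TAB>1" record for every word seen."""
    for line in lines:
        for word in line.split():
            yield f"{word}\t1"


def reduce_lines(lines):
    """Reducer: sum the counts for each word.

    Assumes the records arrive sorted by key, so all records for
    one word are adjacent and groupby can collect them.
    """
    pairs = (line.rstrip("\n").split("\t", 1) for line in lines)
    for word, group in groupby(pairs, key=lambda kv: kv[0]):
        yield f"{word}\t{sum(int(count) for _, count in group)}"


# Hypothetical convention for this sketch: the first argument picks the step.
if __name__ == "__main__" and sys.argv[1:] in (["map"], ["reduce"]):
    step = map_lines if sys.argv[1] == "map" else reduce_lines
    for record in step(sys.stdin):
        print(record)
```

Because both steps only talk to stdin and stdout, the same script can be tried without a cluster, for example with `cat input.txt | python wc.py map | sort | python wc.py reduce`. On a cluster, the two commands would be handed to the streaming jar through its `-mapper` and `-reducer` options, next to `-input` and `-output`; the location of that jar depends on the installation.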
