MapReduce and Scale

Lately, I have become extremely interested in MapReduce, specifically its open source implementation in Hadoop.

From Wikipedia (MapReduce):

MapReduce is a software framework implemented by Google to support parallel computations over large (greater than 100 terabyte) data sets on unreliable clusters of computers. This framework is largely taken from map and reduce functions commonly used in functional programming.

That description, though quite accurate, does not do justice to MapReduce.
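
To make the functional-programming connection in the quote concrete, here is a minimal single-process sketch of the map, shuffle and reduce steps applied to a word count. It is only an illustration: it does not use the Hadoop API, it runs on one machine, and the function names (map_phase, shuffle, reduce_phase, word_count) are made up for this example.

```python
from collections import defaultdict
from functools import reduce


def map_phase(document):
    """Map step: emit a (word, 1) pair for every word in one document."""
    return [(word.lower(), 1) for word in document.split()]


def shuffle(pairs):
    """Group emitted values by key, the job a framework does between phases."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce step: fold all values for one key into a single total."""
    return key, reduce(lambda a, b: a + b, values)


def word_count(documents):
    """Run map over every document, group by word, then reduce each group."""
    pairs = [pair for doc in documents for pair in map_phase(doc)]
    return dict(reduce_phase(key, values) for key, values in shuffle(pairs).items())


counts = word_count(["the quick fox", "the lazy dog", "The end"])
print(counts)
```

Each call to map_phase depends only on its own document, and each call to reduce_phase depends only on its own key, so in principle both steps could be spread across many machines. That independence is what a framework such as Hadoop builds on.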



2 Comments

  1. Naseer
    Posted February 5, 2008 at 12:47 pm | Permalink

    Nice summary, Sid. Looking forward to reading more on this subject from you.

  2. Posted February 5, 2008 at 1:16 pm | Permalink

    Thank you Naseer. I will try to report on my experiences as I go about using this.

  • About grok.in

    This is a blog primarily focussed on the subjects of Information Engineering—Retrieval, Extraction & Management, Machine Learning, Scalability and Cloud Computing.