Lately, I have become extremely interested in MapReduce, specifically its open source implementation in Hadoop.
From Wikipedia (MapReduce):
MapReduce is a software framework implemented by Google to support parallel computations over large (greater than 100 terabyte) data sets on unreliable clusters of computers. This framework is largely taken from map and reduce functions commonly used [...]
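To make the map and reduce idea from the definition above concrete, here is a minimal single-machine sketch of a word count in plain Python. It does not use Hadoop or Google's framework; the function names (`map_words`, `shuffle`, `reduce_counts`, `word_count`) are hypothetical and chosen only for illustration, and a real cluster would run the map and reduce steps in parallel across many machines.

```python
from collections import defaultdict
from functools import reduce


def map_words(line):
    """Map step: emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs):
    """Group the emitted values by key, as a framework would between map and reduce."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_counts(key, values):
    """Reduce step: fold all values for one key into a single total."""
    return key, reduce(lambda a, b: a + b, values)


def word_count(lines):
    """Run map, shuffle, and reduce sequentially on a list of lines (illustrative only)."""
    mapped = [pair for line in lines for pair in map_words(line)]
    grouped = shuffle(mapped)
    return dict(reduce_counts(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    print(word_count(["the quick brown fox", "the lazy dog"]))
```

Because each line is mapped independently and each key is reduced independently, both steps could in principle be spread over many workers, which is the property the framework exploits.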
About grok.in
This is a blog primarily focussed on the subjects of Information Engineering—Retrieval, Extraction & Management, Machine Learning, Scalability and Cloud Computing.