Hadoop hands-on guide for beginner and intermediate users
1. Apache Hadoop - Overview
Apache Hadoop is an open-source framework for distributed storage and processing of large datasets across a cluster of machines. It provides a distributed filesystem (HDFS) and an implementation of the MapReduce programming model.
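A dedicated machine acts as the "name node" (NameNode). It keeps the filesystem metadata: which files exist, which blocks each file is split into, and which machines hold those blocks. The other machines in the cluster, which store the blocks themselves, are called data nodes (DataNodes).
In Hadoop 1.x the name node is a single point of failure: if it goes down, the filesystem is unavailable until it is restored. Hadoop 2.x and later address this with an HDFS High Availability option that keeps a standby name node ready to take over.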
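To make the name node's bookkeeping role concrete, here is a minimal sketch in plain Python. It is not Hadoop code and does not use any Hadoop API: the class ToyNameNode, its methods, the tiny BLOCK_SIZE, the REPLICATION value, and the round-robin placement are all assumptions made up for illustration. It only shows the idea that the name node holds metadata (file to blocks, block to machines) while the data itself lives elsewhere.

```python
from collections import defaultdict

BLOCK_SIZE = 4   # bytes per block; tiny on purpose (real HDFS blocks are far larger)
REPLICATION = 2  # copies kept of each block (hypothetical value for this toy)


class ToyNameNode:
    """Keeps metadata only: which blocks make up a file and where they live."""

    def __init__(self, data_nodes):
        self.data_nodes = list(data_nodes)
        self.file_blocks = {}                     # path -> [block_id, ...]
        self.block_locations = defaultdict(list)  # block_id -> [node, ...]
        self._next_block = 0

    def add_file(self, path, size):
        """Register a file of `size` bytes and assign its blocks to nodes."""
        n_blocks = max(1, -(-size // BLOCK_SIZE))  # ceiling division
        blocks = []
        for _ in range(n_blocks):
            block_id = self._next_block
            self._next_block += 1
            # Simple round-robin placement, an assumption of this sketch;
            # real HDFS applies its own rack-aware placement policy.
            for r in range(REPLICATION):
                node = self.data_nodes[(block_id + r) % len(self.data_nodes)]
                self.block_locations[block_id].append(node)
            blocks.append(block_id)
        self.file_blocks[path] = blocks
        return blocks

    def locate(self, path):
        """Return (block_id, [nodes]) pairs for every block of the file."""
        return [(b, self.block_locations[b]) for b in self.file_blocks[path]]
```

For example, with three data nodes, ToyNameNode(["node1", "node2", "node3"]).add_file("/logs/a.txt", 10) splits the 10-byte file into three blocks, and locate("/logs/a.txt") then lists two machines per block. The sketch also shows why losing the name node is so serious: the blocks still exist on the data nodes, but nothing else knows how they fit together into files.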
2. MapReduce by example
Please refer to my blog: Hadoop tutorial - MapReduce by example
Direct link: http://ihadoop.blogspot.in/2012/11/hadoop-tutorial-mapreduce-by-example.html
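Independently of the linked post, the MapReduce model itself can be sketched in a few lines of plain Python. This is not the example from the blog and it does not use the Hadoop API: the function names map_phase, shuffle, reduce_phase, and word_count are hypothetical, and everything runs in a single process. It only illustrates the three steps that a MapReduce framework such as Hadoop distributes across many machines.

```python
from collections import defaultdict


def map_phase(line):
    """Map: emit a (word, 1) pair for every word in one input line."""
    for word in line.lower().split():
        yield word, 1


def shuffle(pairs):
    """Shuffle: group all emitted values by their key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce: collapse the list of values for one key into a single result."""
    return key, sum(values)


def word_count(lines):
    """Run map, shuffle, and reduce in sequence over the input lines."""
    pairs = (pair for line in lines for pair in map_phase(line))
    return dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())
```

Calling word_count(["Hadoop stores data", "Hadoop processes data"]) counts "hadoop" and "data" twice and the other words once. In a real cluster, the map calls would run in parallel near the data blocks and the shuffle would move the pairs across the network to the reducers.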