Hadoop
Hadoop Course Contents

Introduction to Big Data and Hadoop

  • What is Big Data?
  • What are the challenges of processing big data?
  • What technologies support big data?
  • Distributed systems
  • What is Hadoop?
  • Why Hadoop?
  • History of Hadoop
  • Use Cases of Hadoop
  • Hadoop ecosystem
  • HDFS
  • MapReduce
  • Statistics

Understanding the Cluster

  • Typical workflow
  • Writing files to HDFS
  • Reading files from HDFS
  • Rack Awareness
  • The 5 Hadoop daemons (NameNode, Secondary NameNode, DataNode, JobTracker, TaskTracker)

Developing a MapReduce Application

  • Configuring development environment - Eclipse
  • Writing unit tests
  • Running locally
  • Running on a cluster
  • MapReduce workflows

How MapReduce Works

  • Anatomy of a MapReduce job run
  • Failures
  • Job Scheduling
  • Shuffle and Sort
  • Task Execution

MapReduce Types and Formats

  • MapReduce Types
  • Input Formats - Input splits & records, text input, binary input, multiple inputs & database input
  • Output Formats - text output, binary output, multiple outputs, lazy output & database output

MapReduce Features

  • Counters
  • Sorting
  • Joins - Map Side and Reduce Side
  • Side Data Distribution
  • MapReduce Combiner
  • MapReduce Partitioner
  • MapReduce Distributed Cache

Hive and Pig

  • Fundamentals
  • When to use Pig and Hive
  • Concepts

HBase

  • CAP Theorem
  • HBase architecture and concepts
  • Programming
 
 
 
Copyright © 2012-14 www.ifocusitolutions.com All rights reserved.