  1. Big Data Overview

    • What is Big Data?

    • Benefits of Big Data

    • Big Data Technologies

  2. Big Data Solutions

    • Traditional Approach

    • Hadoop Architecture

    • MapReduce

    • Hadoop Distributed File System

    • How Does Hadoop Work?

  3. Environment Setup

    • Pre-installation Setup

    • SSH Setup and Key Generation

    • Installing Java

    • Downloading Hadoop

    • Hadoop Operation Modes

    • Installing Hadoop in Standalone Mode

    • Installing Hadoop in Pseudo-Distributed Mode

    • Verifying Hadoop Installation

  4. HDFS Overview

    • Features of HDFS

    • HDFS Architecture

    • Goals of HDFS

  5. HDFS Operations

    • Starting HDFS

    • Listing Files in HDFS

    • Inserting Data into HDFS

    • Retrieving Data from HDFS

    • Shutting Down the HDFS

    • Hadoop Commands

  6. MapReduce

    • What is MapReduce?

    • Inputs and Outputs (Java Perspective)

    • Compilation and Execution of Process Units Program

    • Important Commands

    • How to Interact with MapReduce Jobs

  7. Streaming

    • Mapper and Reducer

    • Examples Using Python

    • How Streaming Works

  8. Multi Node Cluster

    • Creating User Account

    • Mapping the Nodes

    • Configuring Key Based Login

    • Installing Hadoop

    • Configuring Hadoop

    • Installing Hadoop on Slave Servers

    • Configuring Hadoop on Master Server

    • Starting Hadoop Services

    • Adding a New DataNode in the Hadoop Cluster

    • Adding User and SSH Access

    • Set Hostname of New Node

    • Start the DataNode on New Node

Download Presentations

01. Overview of BigData
02. BigData Solution
03. Hadoop Installation
04. HDFS Overview
05. HDFS Operations
06. MapReduce
07. Multinode Clusters
08. Hive Installation on Ubuntu

Sample Programs:
The downloadable zip file contains: ProcessUnits Java Program, Mapper Python Program, Reducer Python Program, Stream Program
Statistical Text File