Menu
or
All Courses
INR USD

Hadoop

Apache Hadoop. The core of Apache Hadoop consists of a storage part, known as Hadoop Distributed File System (HDFS), and a processing part which is a MapReduce programming model. Hadoop splits files into large blocks and distributes them across nodes in a cluster.

Why this course ?

Hadoop is an open-source software framework for storing data and running applications on clusters of commodity hardware. It provides massive storage for any kind of data, enormous processing power and the ability to handle virtually limitless concurrent tasks or jobs.

Course Features

  • Instructor Live Sessions

    30hrs of Online Live Instructor-led Classes. Weekend class:10 sessions of 3 hours each and Weekday class:15 sessions of 2 hours each.
  • Real-life Case Studies

    Live project based on any of the selected use cases on the above selected Domain.
  • Assignments

    Each class will be followed by practical assignments which can be completed before the next class.
  • 24 x 7 Expert Support

    We have 24x7 online support team available to help you with any technical queries you may have during the course.
  • Certification

    Towards the end of the course, you will be working on a project. Covalent certifies you as an course Expert based on the project.

Course Curriculum

Download
  • Hadoop Course Content

    Introduction to Big Data
    What is Big data
    Big Data opportunities
    Big Data Challenges
    Characteristics of Big data


    Hadoop Architecture
    Introduction to Hadoop
    Parallel Computer vs. Distributed Computing
    Comparing Hadoop & SQL.
    Hadoop and Datawarehouse - When to use which?
    Industries using Hadoop.
    HDFS Design & Concepts
    Blocks, Name nodes and Data nodes
    HDFS High-Availability and HDFS Federation.
    Hadoop DFS The Command-Line Interface
    Basic File System Operations
    Anatomy of File Read
    Anatomy of File Write
    Block Placement Policy and Modes
    More detailed explanation about Configuration files.
    Metadata, FS image, Edit log, Secondary Name Node and Safe Mode.
    How to add New Data Node dynamically.
    How to decommission a Data Node dynamically (Without stopping cluster).
    FSCK Utility. (Block report).
    How to override default configuration at system level and Programming level.
    ZOOKEEPER Leader Election Algorithm.
    How to install Hadoop on your system
    How to install Hadoop cluster on multiple machines
    Hadoop Daemons introduction: NameNode, DataNode, JobTracker, TaskTracker
    Exploring HDFS (Hadoop Distributed File System)
    Exploring the HDFS Apache Web UI
    NameNode architecture (EditLog, FsImage, location of replicas)
    Secondary NameNode architecture
    DataNode architecture


    MapReduce Architecture
    Exploring JobTracker/TaskTracker
    How to run a Map-Reduce job
    Exploring Mapper/Reducer/Combiner
    Shuffle: Sort & Partition
    Input/output formats
    Exploring the Apache MapReduce Web UI
    Distributed Cache and Hadoop Streaming (Python, Ruby and R).
    YARN.

    Hadoop Developer Tasks
    Writting a Map-Reduce programme
    Reading and writing data using Java

    Hadoop Eclipse integration
    Mapper in details
    Reducer in details
    Using Combiners
    Reducing Intermediate Data with Combiners
    Writing Partitioners for Better Load Balancing
    Sorting in HDFS
    Searching in HDFS
    Hands-On Exercise

    Hadoop Administrative Tasks
    Routine Administrative Procedures
    Understanding dfsadmin and mradmin
    Block Scanner, Balancer
    Health Check & Safe mode
    Monitoring and Debugging on a production cluster
    NameNode Back up and Recovery
    DataNode commissioning/decommissioning
    ACL (Access control list)
    Upgrading Hadoop


    NOSQL
    ACID in RDBMS and BASE in NoSQL.
    CAP Theorem and Types of Consistency.
    Types of NoSQL Databases in detail.
    Columnar Databases in Detail (HBASE and CASSANDRA).


    HBase Architecture
    Introduction to HBase
    Installation of HBase on your system
    Exploring HBase Master & Region server
    Exploring Zookeeper
    Column Families and Regions
    Basic HBase shell commands.
    HBase Data Model and Comparison between RDBMS and NOSQL.
    HBase Operations (DDL and DML) through Shell and Programming and HBase
    Architecture.
    Catalog Tables.
    Block Cache and sharding.
    SPLITS.
    DATA Modeling (Sequential, Salted, Promoted and Random Keys).
    JAVA API’s and Rest Interface.
    Client Side Buffering and Process 1 million records using Client side Buffering.
    HBASE Counters.
    Enabling Replication and HBASE RAW Scans.
    HBASE Filters.
    Hands-On Exercise


    Hive Architecture

    Introduction to Hive
    HBase vs Hive
    Installation of Hive on your system
    HQL (Hive query language )
    Basic Hive commands
    Hive Services, Hive Shell, Hive Server and Hive Web Interface (HWI)
    Meta store
    Working with Tables.
    Primitive data types and complex data types.
    Working with Partitions.
    User Defined Functions
    Hive Bucketed Tables and Sampling.
    External partitioned tables, Map the data to the partition in the table, Writing the output of
    one query to another table, Multiple inserts
    Dynamic Partition
    Differences between ORDER BY, DISTRIBUTE BY and SORT BY.
    Bucketing and Sorted Bucketing with Dynamic partition.
    RC File.
    INDEXES and VIEWS.
    MAPSIDE JOINS.
    Compression on hive tables and Migrating Hive tables.
    Dynamic substation of Hive and Different ways of running Hive
    How to enable Update in HIVE.
    Log Analysis on Hive.
    Access HBASE tables using Hive
    Hands-On Exercise


    Pig Architecture
    Introduction to Pig
    Installation of Pig on your system
    Basic Pig commands
    Execution Types
    Grunt Shell
    Pig Latin
    Data Processing
    Schema on read
    Primitive data types and complex data types.
    Tuple schema, BAG Schema and MAP Schema.
    Loading and Storing
    Filtering
    Grouping & Joining
    Debugging commands (Illustrate and Explain).
    Validations in PIG.
    Type casting in PIG.
    Working with Functions
    User Defined Functions
    Types of JOINS in pig and Replicated Join in detail.
    SPLITS and Multiquery execution.
    Error Handling, FLATTEN and ORDER BY.
    Parameter Substitution.
    Nested For Each.
    User Defined Functions, Dynamic Invokers and Macros.
    How to access HBASE using PIG.
    How to Load and Write JSON DATA using PIG.
    Piggy Bank.
    Hands-On Exercise


    Sqoop Architecture
    Introduction to Sqoop
    Installation of Sqoop on your system
    Import/Export data from RDBMS to HDFS
    Import/Export data from RDBMS to HBase
    Import/Export data from RDBMS to Hive
    Hands-On Exercise
    Incremental Import(Import only New data, Last Imported data, storing Password in
    Metastore, Sharing Metastore between Sqoop Clients)
    Free Form Query Import


    FLUME
    Installation
    Introduction to Flume
    Flume Agents: Sources, Channels and Sinks
    Log User information using Java program in to HDFS using LOG4J and Avro Source
    Log User information using Java program in to HDFS using Tail Source
    Log User information using Java program in to HBASE using LOG4J and Avro Source
    Log User information using Java program in to HBASE using Tail Source
    Flume Commands


    MongoDB
    Introduction
    CRUD
    MongoDB Shell
    Indexing and Schema design
    Replication
    Sharding
    GridFS
    Aggressions


    Oozie
    Workflow (Action, Start, Action, End, Kill, Join and Fork), Schedulers, Coordinators and
    Bundles.
    Workflow to show how to schedule Sqoop Job, Hive, MR and PIG.
    Real world Use case which will find the top websites used by users of certain ages and will
    be scheduled to run for every one hour.
    Zoo Keeper
    HBASE Integration with HIVE and PIG.
    Phoenix

     



     

Projects

  • Hadoop Projects

    We will give Real Time projects on Hadoop.

FAQ's

  • Can I attend a demo session before enrollment?

    Yes

  • What if I miss a class ?

    If you miss a class we can provide recording video for particular session and same session you have to attend another batch also

  • Will I get placement Assistance ?

    Yes

  • Do I receive a certificate for training ?

    • Once you are successfully through the course you will be awarded with Covalent's Training certificate.
    • Covalent certification has industry recognition and we are the preferred training partner for many MNCs.
  • what support is available after the training?

    Doubts clarification up to getting a job

    Resume preparation

    Malk interviews

    Placement Assistance

  • What Features do you provide?

    • Classroom & Online sessions with corporate trainers
    • Course Material
    • Real time projects with industry experts
    • Regular Assignments (Tasks)
    • Placement assistance
    • Resume preparation
    • Doubts clarifications

Certifications

  • Course Completion Certificate

    • Once you are successfully through the course you will be awarded with Covalent's Training certificate.
    • Covalent certification has industry recognition and we are the preferred training partner for many MNCs.

Reviews

Kalpana

Fresher

This is an outstanding real-time based training session on Hadoop. Very flexible course timings and the best thing is that they have handled backup classes for the missed out students very well without affecting the ongoing classes.

readmore...

PRIYA RANI

Fresher

This coaching centre is really amazing and it helps me a lot by providing the best services with their real time faculty. It is best Institute for learning hadoop Technology. mentor teaching is very good.I would recommend this institute if anyone.

readmore...

Enroll Now

© 2020 Covalent Technologies All rights reserved

Drop us a Query

Drop us a Query

Call us on

IN: 91-9848733309
IN: +91-9676828080