We provide the best Hadoop training in Bangalore. Take Hadoop training online or in person at RelQSoft. Our instructors are industry experts with extensive experience in Hadoop and Big Data.
Master Your Big Data With Hadoop
1. Introduction to Big Data
- What is Big Data?
- Why Big Data?
- The Three V’s of Big Data
2. Understanding Hadoop
- What is Hadoop?
- Structured Data Vs Unstructured Data
- Relational Databases Vs. Hadoop
3. The Hadoop Distributed File System (HDFS)
- What is HDFS?
- HDFS components
- Understanding Block Storage
- Reading and Writing Files in HDFS
- Hadoop Daemons (NameNode, DataNode, ResourceManager, NodeManager, etc.)
- HDFS Commands
- HDFS File Permissions
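To make the block storage topic concrete, here is a minimal sketch of how a file is divided into HDFS blocks and how much raw disk its replicas consume. It assumes the Hadoop 2.x default block size of 128 MB and the default replication factor of 3; both are configurable per cluster, and the function names `block_layout` and `raw_storage` are hypothetical, chosen only for this illustration.

```python
# Assumed defaults (Hadoop 2.x): dfs.blocksize = 128 MB, dfs.replication = 3.
MB = 1024 * 1024
BLOCK_SIZE = 128 * MB
REPLICATION = 3


def block_layout(file_size_bytes, block_size=BLOCK_SIZE):
    """Return the sizes of the blocks a file of this size is split into.

    Every block is full except possibly the last one, which only
    occupies as many bytes as remain in the file.
    """
    full_blocks, remainder = divmod(file_size_bytes, block_size)
    sizes = [block_size] * full_blocks
    if remainder:
        sizes.append(remainder)
    return sizes


def raw_storage(file_size_bytes, replication=REPLICATION):
    """Total bytes stored across the cluster once every block is replicated."""
    return file_size_bytes * replication


if __name__ == "__main__":
    size = 300 * MB
    blocks = block_layout(size)
    print("blocks:", [b // MB for b in blocks], "MB")
    print("raw storage:", raw_storage(size) // MB, "MB")
```

A 300 MB file therefore occupies two full 128 MB blocks plus one 44 MB block, and about 900 MB of raw disk across the cluster with three replicas.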
4. The MapReduce Framework
- Overview of MapReduce
- Understanding MapReduce
- The Map Phase
- The Reduce Phase
- WordCount in MapReduce
- Running MapReduce Job
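The map, shuffle and reduce steps of WordCount can be sketched in plain Python. This is only an in-memory illustration of the data flow, not the Hadoop Java API; the function names are hypothetical, and a real job distributes each phase across the cluster.

```python
from collections import defaultdict


def map_phase(line):
    """Map: emit a (word, 1) pair for every word in one input line."""
    for word in line.lower().split():
        yield (word, 1)


def shuffle(pairs):
    """Shuffle: group all values by key, as the framework does between phases."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(word, counts):
    """Reduce: sum the counts collected for a single word."""
    return (word, sum(counts))


def word_count(lines):
    """Run the three phases in sequence over an iterable of text lines."""
    pairs = (pair for line in lines for pair in map_phase(line))
    grouped = shuffle(pairs)
    return dict(reduce_phase(word, counts) for word, counts in sorted(grouped.items()))


if __name__ == "__main__":
    print(word_count(["big data big", "data hadoop"]))
```

For the two sample lines, the result maps "big" to 2, "data" to 2 and "hadoop" to 1.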
5. Planning Your Hadoop Cluster
- Single Node Cluster Configuration
- Multi-Node Cluster Configuration
- Setup Cluster in High Availability
6. Cluster Maintenance
- Checking HDFS Status
- Copying Data Between Clusters
- Adding and Removing Cluster Nodes
- Rebalancing the cluster
- NameNode Metadata Backup
- Cluster Upgrade
7. Installing and Managing Hadoop Ecosystem Projects
8. Understanding and Installing Different Flavors of Hadoop
- Cloudera Distribution
- Hortonworks Distribution
- MapR Distribution
9. Managing and Scheduling Jobs
- Managing Jobs
- The FIFO Scheduler
- The Fair Scheduler
- Capacity Scheduler
10. Cluster Monitoring, Troubleshooting and Optimization
- General system conditions to monitor
- NameNode and ResourceManager Web UIs
- View and manage Hadoop’s log files
- Ganglia Monitoring Tool
- Common cluster issues and their resolutions
- Benchmarking your cluster’s performance
11. Populating HDFS from External Sources
- How to use Sqoop to import data from RDBMS to HDFS
- How to gather logs from multiple systems using Flume
- Features of Hive, Presto and Spark
- How to populate HDFS from external Sources
Hadoop Developer & Admin With Cassandra & Impala
- Understanding Big Data and Hadoop: 4hrs
Learning Objectives – In this module, you will understand Big Data, the limitations of existing solutions to the Big Data problem, how Hadoop solves that problem, the common Hadoop ecosystem components, Hadoop architecture, HDFS, the anatomy of a file write and read, and how the MapReduce framework works.
Topics – Big Data, Limitations and Solutions of Existing Data Analytics Architecture, Hadoop, Hadoop Features, Hadoop Ecosystem, Hadoop 2.x Core Components, Hadoop Storage: HDFS, Hadoop Processing: MapReduce Framework, Different Hadoop Distributions.
- Hadoop Architecture and HDFS: 6hrs, Hands-On Cluster Setup
Learning Objectives – In this module, you will learn the Hadoop cluster architecture, the important configuration files in a Hadoop cluster, data loading techniques, and how to set up single-node and multi-node Hadoop clusters.
Topics – Hadoop 2.x Cluster Architecture – Federation and High Availability, A Typical Production Hadoop Cluster, Hadoop Cluster Modes, Common Hadoop Shell Commands, Hadoop 2.x Configuration Files, Single-Node Cluster Setup, Hadoop Administration.
- Hadoop MapReduce Framework: 6hrs Lab
Learning Objectives – In this module, you will understand the Hadoop MapReduce framework and how MapReduce works on data stored in HDFS. You will also learn concepts such as Input Splits, Combiner and Partitioner, and see MapReduce demos on different data sets.
Topics – MapReduce Use Cases, Traditional Way Vs MapReduce Way, Why MapReduce, Hadoop 2.x MapReduce Architecture, Hadoop 2.x MapReduce Components, YARN MR Application Execution Flow, YARN Workflow, Anatomy of a MapReduce Program, Demo on MapReduce, Input Splits, Relation between Input Splits and HDFS Blocks, MapReduce: Combiner & Partitioner.
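As a small illustration of the Partitioner topic, the sketch below mimics the rule used by Hadoop's default HashPartitioner: the key's hash code, masked to a non-negative value, modulo the number of reduce tasks. The hash function here reproduces Java's String.hashCode for ASCII strings purely for illustration; Hadoop's own key types such as Text compute their hash codes differently, so the partition numbers will not match a real job. The function names are hypothetical.

```python
def java_string_hash(key):
    """Mimic Java's String.hashCode for ASCII strings (32-bit wraparound)."""
    h = 0
    for ch in key:
        h = (31 * h + ord(ch)) & 0xFFFFFFFF
    return h


def partition(key, num_reduce_tasks):
    """Pick a reducer the way HashPartitioner does:
    (hashCode & Integer.MAX_VALUE) % numReduceTasks.
    """
    return (java_string_hash(key) & 0x7FFFFFFF) % num_reduce_tasks


if __name__ == "__main__":
    for word in ["big", "data", "hadoop"]:
        print(word, "-> reducer", partition(word, 3))
```

Because the partition depends only on the key, every occurrence of the same key reaches the same reducer, which is what lets each reducer produce a complete total for its keys.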
- Pig: 8hrs (6hrs) Lab
Learning Objectives – In this module, you will learn Pig, the types of use cases where Pig fits, the tight coupling between Pig and MapReduce, Pig Latin scripting, Pig running modes, Pig UDFs, Pig Streaming, and testing Pig scripts. Includes a demo on a healthcare dataset.
Topics – About Pig, MapReduce Vs Pig, Pig Use Cases, Programming Structure in Pig, Pig Running Modes, Pig Components, Pig Execution, Pig Latin Program, Data Models in Pig, Pig Data Types, Shell and Utility Commands, Pig Latin: Relational Operators, File Loaders, Group Operator, COGROUP Operator, Joins and COGROUP, Union, Diagnostic Operators, Specialized Joins in Pig, Built-In Functions (Eval Functions, Load and Store Functions, Math Functions, String Functions, Date Functions), Pig UDF, Piggybank.
- Hive: 8hrs Lab
Learning Objectives – This module will help you understand Hive concepts, Hive data types, loading and querying data in Hive, running Hive scripts, and Hive UDFs.
Topics – Hive Background, Hive Use Case, About Hive, Hive Vs Pig, Hive Architecture and Components, Metastore in Hive, Limitations of Hive, Comparison with Traditional Database, Hive Data Types and Data Models, Partitions and Buckets, Hive Tables (Managed Tables and External Tables), Importing Data, Querying Data, Managing Outputs, Hive Script, Hive UDF, Retail Use Case in Hive, Hive Demo on Healthcare Data Set, Advanced Hive Concepts such as UDF and Dynamic Partitioning.
Apache Sqoop: 2hrs
- Introduction to Sqoop
- MySQL client and server installation
- Sqoop installation
- How to connect to a relational database using Sqoop
- Sqoop commands and examples of import and export commands
Apache Flume: 2hrs
- Introduction to Flume
- Flume installation
- Flume agent usage and Flume examples execution
- Real-time example with Twitter streaming
Apache Oozie: 1hr
- Introduction to Oozie
- Oozie installation
- Executing Oozie workflow jobs
- Monitoring Oozie workflow jobs
Apache ZooKeeper: 1hr
- Introduction to ZooKeeper
- Configuring ZooKeeper
- The role of ZooKeeper
- One use case with ZooKeeper
NoSQL: HBase: 10hrs