Workshop on Big Data & Hadoop
Big data is the term used for the exponential growth and availability of structured and unstructured data, usually related to business and society.
Hadoop is a free, open-source framework for storing and processing this kind of big data. It makes it easy to process huge amounts of data across inexpensive, industry-standard servers that both store and process the data and scale out as more servers are added. In today’s hyper-connected world, where huge amounts of data are being created every day, Hadoop’s breakthrough advantages mean that businesses and organizations can now find value in data that was until recently considered useless.
In this workshop, students get the opportunity to work on a live Big Data Analytics project and gain hands-on experience. The workshop is conducted by industry professionals in a corporate environment.
- Gaining a comprehensive knowledge of HDFS and the MapReduce framework.
- Understanding the architecture of Hadoop.
- Learning to set up a Hadoop cluster and write complex MapReduce programs with Hadoop.
- Utilization of Sqoop and Flume to learn data loading techniques.
- Integration of MapReduce.
- Implementation of HBase.
- Indexing with the help of big data.
- Utilizing Oozie to schedule jobs.
- Developing with Hadoop using best practices.
- Working on a real life project.
- Utilization of Pig, Hive and YARN to perform data analytics.
Topics Covered in our Workshop:
- How big is this Big Data?
- Definition with Real Time Examples
- How Big Data is generated in real time
- Use of Big Data: how industry is utilizing Big Data
- Traditional Data Processing Technologies
- Future of Big Data
- Why Hadoop?
- What is Hadoop?
- Hadoop vs. RDBMS, Hadoop vs. Big Data
- Brief history of Hadoop
- Apache Hadoop Architecture
- Problems with traditional large-scale systems
- Requirements for a new approach
- Anatomy of a Hadoop cluster
- Hadoop Setup and Installation
- Concepts & Architecture
- Data Flow (File Read, File Write)
- Fault Tolerance
- Shell Commands
- Java-based API
- Data Flow Archives
- Data Integrity
- Role of Secondary NameNode
- HDFS Programming Basics
- MapReduce Architecture
- Data Flow (Map, Shuffle, Reduce)
- MapRed vs. MapReduce APIs
- MapReduce Programming Basics
- Programming [ Mapper, Reducer, Combiner, Partitioner ]
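To give a flavour of the Map, Shuffle and Reduce data flow listed above, here is a minimal word-count sketch in plain Python. It is not Hadoop code and does not use any Hadoop API; the function names (`map_phase`, `shuffle_phase`, `reduce_phase`, `word_count`) are made up for this illustration, and everything runs in memory on a single machine rather than across a cluster.

```python
from collections import defaultdict


def map_phase(lines):
    """Map: emit a (word, 1) pair for every word in every input line."""
    for line in lines:
        for word in line.lower().split():
            yield word, 1


def shuffle_phase(pairs):
    """Shuffle: group all values that share the same key."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(grouped):
    """Reduce: collapse each key's list of values into a single count."""
    return {key: sum(values) for key, values in grouped.items()}


def word_count(lines):
    """Chain the three phases in the order a MapReduce job applies them."""
    return reduce_phase(shuffle_phase(map_phase(lines)))


if __name__ == "__main__":
    sample = ["big data needs hadoop", "hadoop stores big data"]
    print(word_count(sample))
```

In a real Hadoop job the framework performs the shuffle itself and distributes the map and reduce work across the cluster; only the mapper and reducer logic is written by the programmer.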
HIVE & PIG
- Hive vs. RDBMS
- DDL & DML
- Partitioning & Bucketing
- Hive Web Interface
- Why Pig
- Use case of Pig
- Brief Introduction to the Hadoop Ecosystem (MapReduce, HDFS, Hive, Pig, HBase).
- RDBMS vs. NoSQL
- HBase Introduction
Eligibility Criteria:
As this is a basic-level workshop, no specific eligibility criteria are defined. Anyone who wants to build a career in Big Data, or who has an interest in it, is welcome to attend.
The workshop runs for two consecutive days, with an eight-hour session each day. Each day is divided into theory and hands-on practical sessions.