Hadoop Platform and Application Framework




  • Category: Computer Science
  • Platform: Coursera
  • Language: English


After completing this course, you will be comfortable explaining the specific components and basic processes of the Hadoop architecture, software stack, and execution environment. You will walk through simple examples of the more common tools in the environment. The Map/Reduce and Spark execution frameworks will be presented in detail with hands-on examples. In the assignments using Map/Reduce and Spark, you will apply the important concepts that are used to solve fundamental problems in big data. These approaches scale directly to situations that require a full distributed/parallel processing Hadoop platform.

Get ready to be empowered to manipulate and process big data!

Hadoop Platform and Application Framework is course 2 of 6 in the Big Data Specialization.

In this Specialization, you will develop a robust set of skills that will allow you to process, analyze, and extract meaningful information from large amounts of complex data. You will develop both conversational and practical working knowledge of the Hadoop platform, its architecture, and the major elements of its ecosystem. Through hands-on instruction and assignments, you will develop working knowledge of tools such as Spark, Pig, and Hive, and of strategies for processing massive datasets using the map/reduce framework. You will be exposed to these tools and strategies as they apply in particular to analyzing big data. You will become proficient in carrying out scalable basic analysis and comfortable enough to apply advanced analytics, predictive modeling, or graph analysis to problems in your domain. In the final Capstone Project, developed in partnership with data software company Splunk, you’ll apply the skills you learned by tuning and scaling your own analysis, building your own model, applying tools in new ways, or some similar effort, to analyze big data in whatever area you choose.

This class is designed for non-programmers who are familiar with SQL and want big-data skills, and for programmers who are new to big data or to big data analytics.


Natasha Balac
Director, Predictive Analytics Center of Excellence (PACE)
San Diego Supercomputer Center

Paul Rodriguez
Research Programmer
San Diego Supercomputer Center (SDSC)

Andrea Zonca
HPC Applications Specialist
San Diego Supercomputer Center (SDSC)


Week 1: Hadoop Basics

Lesson 1: Big Data Hadoop Stack
Lesson 2: Hands-On Exploration of the Cloudera VM
Quiz: Basic Hadoop Stack

Week 2: Introduction to the Hadoop Stack

Lesson 1: Overview of the Hadoop Stack
Lesson 2: The Hadoop Execution Environment
Lesson 3: Overview of Hadoop-based Applications and Services
Quiz: Overview of Hadoop Stack
Quiz: Hadoop Execution Environment
Quiz: Hadoop Applications

Week 3: Introduction to Hadoop Distributed File System (HDFS)

Lesson 1: HDFS Architecture and Configuration
Lesson 2: HDFS Performance and Tuning
Lesson 3: HDFS Access, Commands, APIs, and Applications
Quiz: HDFS Architecture
Quiz: HDFS performance, tuning, and robustness
Quiz: Accessing HDFS

Week 4: Introduction to Map/Reduce

Lesson 1: Introduction to Map/Reduce
Lesson 2: Map/Reduce Examples and Principles
Assignment: Running Wordcount with Hadoop streaming, using Python code
Quiz: Lesson 1 Review
Assignment: Joining Data
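
For readers unfamiliar with the Wordcount exercise named in this week's assignments, the following is a minimal sketch of the idea, not the course's own solution. It assumes the usual Hadoop streaming convention of tab-separated key/value text records, and it simulates the shuffle-and-sort step locally with sorted(). The function names mapper, reducer, and run_local are hypothetical and chosen only for illustration.

```python
from itertools import groupby


def mapper(lines):
    """Emit one 'word<TAB>1' record per word, as a streaming mapper would print."""
    for line in lines:
        for word in line.split():
            yield f"{word.lower()}\t1"


def reducer(sorted_records):
    """Sum the counts for each word.

    The input must already be sorted by key, which is what the framework
    provides between the map and reduce phases.
    """
    pairs = (record.split("\t", 1) for record in sorted_records)
    for word, group in groupby(pairs, key=lambda kv: kv[0]):
        yield word, sum(int(count) for _, count in group)


def run_local(lines):
    """Run map, a simulated shuffle/sort, and reduce in one process."""
    # sorted() stands in for Hadoop's shuffle-and-sort phase (an assumption
    # made so the sketch runs without a cluster).
    return dict(reducer(sorted(mapper(lines))))


if __name__ == "__main__":
    sample = ["Hadoop stores data", "Spark and Hadoop process data"]
    for word, count in sorted(run_local(sample).items()):
        print(f"{word}\t{count}")
```

On a real cluster, the mapper and reducer would typically be two separate scripts that read from standard input and write to standard output, passed to the Hadoop streaming jar through its -mapper and -reducer options.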

Week 5: Spark

Lesson 1: Introduction to Apache Spark
Lesson 2: Resilient Distributed Datasets and Transformations
Lesson 3: Job scheduling, Actions, Caching and Shared Variables
Quiz: Spark Lesson 1
Quiz: Spark Lesson 2
Assignment: Simple Join in Spark
Quiz: Spark Lesson 3
Assignment: Advanced Join in Spark

