
Big Data Architecture & Technology Concepts

Intermediate


Duration: 4 days

Certification: Certificate of participation

Who is this course for?
  • Architects coming from other domains who are interested in understanding what architecting a big data solution means

OR

  • Solution owners/engineers knowledgeable in various Big Data technologies (Hadoop, Spark, NoSQL solutions, ETL components, ML) who would like to understand "the big picture" of what architecting a Big Data solution means.
Prerequisite knowledge and skills
Course overview

The course is designed to ensure participants understand the usage and applicability of big data technologies such as Hadoop, Spark, Cassandra, HBase, and Kafka, and which aspects to consider when starting to build a Big Data architecture.

What topics does the course cover
  1. Big Data Architecture overview: components and their role in an architecture
  2. Specific technologies overview and details:
    • Storage: NoSQL databases (random reads on data) - 2 days: 
      • Overview of different NoSQL solutions
      • Cassandra detailed overview
        • Main concepts: data partitioning, distribution, replication, consistency, how to write and read data, compaction
        • How data is inserted / updated / deleted
        • Understand the main patterns and anti-patterns
        • Basic data modeling rules for Cassandra best performance
      • Use case based on Cassandra - hands-on session
      • HBase
        • Main concepts recap
        • Differences from Cassandra
        • Main data modeling aspects: how data is best read
      • How to choose a NoSQL solution - the considerations
      • Recap of storage options: long-term storage of immutable data (HDFS) & random writes/reads of data (HBase/Cassandra/..). Apache Kudu as a newer option for real-time access & storage of structured data.
    • Distributed data processing frameworks: - 0.5 days
      • Overview of different solutions: Flink, Storm, Spark, …
      • Distributed computations and stream processing with Spark
      • Spark as an ETL tool - hands-on session, examples and demo
      • Examples of using Spark and Spark Streaming + demo session
    • Data Analysis - 0.5 days:
      • SQL-on-everything options: Hive, Impala, Spark SQL, Apache Drill
      • Integrating Cassandra with Spark SQL for data analytics; what kind of analytics can be performed
      • Hands-on exercises for analytics with Spark SQL and Cassandra
    • Messaging bus: Kafka - 0.75 days
      • Why a messaging bus in a big data architecture? Kafka Streams intro.
      • Hands-on session with the Kafka producer and consumer console
      • Kafka demo in combination with Spark streaming
    • Clustering and Resource Management: Mesos - 0.25 days
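To give a flavour of the storage concepts in the syllabus above (data partitioning, distribution, replication), here is a minimal illustrative sketch in plain Python — not Cassandra code — of how a partition key is hashed onto a token ring and replicated to the next distinct nodes. Cassandra actually uses the Murmur3 partitioner; MD5 is used here only for illustration, and the node names and ring layout are made up:

```python
import hashlib

def token(partition_key: str) -> int:
    """Hash a partition key to a position on a 0..2**64 token ring
    (illustrative only: Cassandra uses Murmur3, not MD5)."""
    return int(hashlib.md5(partition_key.encode()).hexdigest(), 16) % 2**64

def replicas(partition_key: str, ring: list, rf: int) -> list:
    """Walk the sorted token ring clockwise from the key's token and
    pick the next `rf` distinct nodes (SimpleStrategy-style placement)."""
    t = token(partition_key)
    ordered = sorted(ring)
    # first node whose token is >= t, wrapping around the ring
    start = next((i for i, (tok, _) in enumerate(ordered) if tok >= t), 0)
    picked = []
    for i in range(len(ordered)):
        node = ordered[(start + i) % len(ordered)][1]
        if node not in picked:
            picked.append(node)
        if len(picked) == rf:
            break
    return picked

# four hypothetical nodes at evenly spaced ring positions
ring = [(0, "node-a"), (2**62, "node-b"), (2**63, "node-c"), (3 * 2**62, "node-d")]
owners = replicas("user:42", ring, rf=3)
print(owners)  # three distinct nodes own this partition
```

With a replication factor of 3, every partition key deterministically maps to the same three nodes, which is what makes tunable read/write consistency possible.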
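The role of a messaging bus like Kafka in the architecture above comes down to an append-only log that consumers read by offset, at their own pace. A minimal in-memory sketch of that model (plain Python, not the Kafka client API; the class and event names are made up for illustration):

```python
class TopicLog:
    """Append-only log for one topic partition, Kafka-style."""

    def __init__(self):
        self.messages = []

    def produce(self, value: str) -> int:
        """Append a message; its offset is its position in the log."""
        self.messages.append(value)
        return len(self.messages) - 1

    def consume(self, offset: int, max_records: int = 10) -> list:
        """Read from a given offset. The log is never mutated on read,
        so many consumers can read independently at different offsets."""
        return self.messages[offset:offset + max_records]

log = TopicLog()
for event in ["click", "view", "purchase"]:
    log.produce(event)

# a consumer starting from offset 0 and another already at offset 2
print(log.consume(0))  # ['click', 'view', 'purchase']
print(log.consume(2))  # ['purchase']
```

This decoupling of producers from consumers is why a bus sits between ingestion and processing (e.g. Spark Streaming) in a Big Data architecture: slow consumers never block fast producers.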
What skills are gained from the course
  • The usage and applicability of big data technologies such as Hadoop, Spark, Cassandra, HBase, and Kafka
  • Which aspects to consider when starting to build a Big Data architecture

Hardware/system requirements for this course:

  • An open Internet connection is needed throughout the course;
  • Each participant needs their own computer to run the hands-on exercises; the computer settings must also allow access to Google Docs and GitHub for getting the presenter's slides, documents, data, and exercises;
  • We will work in the cloud for this course, so an open and reliable Internet connection is mandatory for running the exercises;
  • Local computers will need an SSH client to connect to the cloud environment.
